class: center middle main-title section-title-7 # Instrumental<br>variables I .class-info[ **Session 11** .light[PMAP 8521: Program evaluation<br> Andrew Young School of Policy Studies ] ] --- name: outline class: title title-inv-8 # Plan for today -- .box-5.medium.sp-after-half[Endogeneity and exogeneity] -- .box-6.medium.sp-after-half[Instruments] -- .box-3.medium.sp-after-half[Using instruments] --- name: endo-exo class: center middle section-title section-title-5 animated fadeIn # Endogeneity<br>and exogeneity --- layout: true class: title title-5 --- # Does education cause higher earnings? <img src="11-slides_files/figure-html/iv-dag-simple-1.png" width="70%" style="display: block; margin: auto;" /> -- .medium[ `$$\color{#FF851B}{\text{Earnings}_i} = \beta_0 + \beta_1 \color{#0074D9}{\text{Education}_i} + \varepsilon_i$$` ] --- layout: false .box-inv-5.medium[If we ran this regression, would β<sub>1</sub><br>give us the causal effect of education?] .medium[ `$$\text{Earnings}_i = \beta_0 + \beta_1 \text{Education}_i + \varepsilon_i$$` ] -- .box-5.medium[No!] -- .float-left.center[.box-inv-5[Omitted variable bias!] 
.box-inv-5[Unclosed backdoors!]] -- .box-inv-5[**Endogeneity!**] --- layout: true class: title title-5 --- # Exogeneity and endogeneity .box-inv-5.medium[**Exogenous** variables] -- .box-5[Value is not determined by<br>anything else in the model] -- .box-5[In a DAG, a node that doesn't<br>have arrows coming into it] --- # Exogeneity .box-inv-5.medium[Education is exogenous: no arrows *into* it] <img src="11-slides_files/figure-html/iv-dag-simple-1.png" width="100%" style="display: block; margin: auto;" /> --- # Exogeneity and endogeneity .box-inv-5.medium[**Endogenous** variables] -- .box-5[Value is determined by<br>something else in the model] -- .box-5[In a DAG, a node that<br>has arrows coming into it] --- # Endogeneity .box-inv-5.medium[Education is endogenous: Ability → Education] <img src="11-slides_files/figure-html/iv-dag-endogenous-1.png" width="60%" style="display: block; margin: auto;" /> --- # Exogeneity .box-inv-5.medium[What would exogenous variation<br>in education look like?] -- .box-5[Choices to get more education that are essentially random<br>(or at least uncorrelated with omitted variables)] --- layout: false .box-5.medium[We'd like education to be exogenous<br>.smaller[(an outside decision or intervention)], but it's not!] 
<img src="11-slides_files/figure-html/iv-dag-endogenous-1.png" width="45%" style="display: block; margin: auto;" /> -- .box-inv-5[Part of it is exogenous, but part of it is<br>caused by ability, which is in the DAG] --- class: title title-5 # Fixing endogeneity with DAGs <img src="11-slides_files/figure-html/iv-dag-endogenous-1.png" width="45%" style="display: block; margin: auto;" /> -- .box-5[Close backdoor and adjust for ability] -- .box-inv-5.smaller[Adjustment filters out the endogenous part of education and leaves us with just the exogenous part] `$$\text{Earnings}_i = \beta_0 + \beta_1 \text{Education}_i + \beta_2 \text{Ability}_i + \varepsilon_i$$` --- .pull-left-wide[ .small[ <table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table"> <thead> <tr> <th style="empty-cells: hide;border-bottom:hidden;" colspan="1"></th> <th style="border-bottom:hidden;padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px; ">Outcome = wage</div></th> </tr> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> Unadjusted </th> <th style="text-align:center;"> Adjusted </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:center;"> −59.378*** </td> <td style="text-align:center;"> −85.571*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (10.376) </td> <td style="text-align:center;"> (7.198) </td> </tr> <tr> <td style="text-align:left;background-color: #FFC6C6 !important;"> educ </td> <td style="text-align:center;background-color: #FFC6C6 !important;"> 13.124*** </td> <td style="text-align:center;background-color: #FFC6C6 !important;"> 7.767*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.618) </td> <td style="text-align:center;"> (0.456) </td> </tr> <tr> <td style="text-align:left;"> 
ability </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> 0.344*** </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> (0.010) </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. </td> <td style="text-align:center;"> 1000 </td> <td style="text-align:center;"> 1000 </td> </tr> <tr> <td style="text-align:left;"> R2 </td> <td style="text-align:center;"> 0.311 </td> <td style="text-align:center;"> 0.673 </td> </tr> <tr> <td style="text-align:left;border-bottom: 1px solid"> RMSE </td> <td style="text-align:center;border-bottom: 1px solid"> 39.13 </td> <td style="text-align:center;border-bottom: 1px solid"> 26.97 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr></tfoot> </table> ] ] .pull-right-narrow[ .box-2[Unadjusted<br>is wrong!] .box-5[Adjusted<br>is right!] .box-inv-5.small[One year of education causes hourly wage to increase by $7.77] .box-inv-5.smaller[(FAKE DATA)] ] --- layout: true class: title title-5 --- # But we can't measure ability! <img src="11-slides_files/figure-html/iv-dag-endogenous-confounding-1.png" width="40%" style="display: block; margin: auto;" /> `$$\color{#FF851B}{\text{Earnings}_i} = \beta_0 + \beta_1 \color{#0074D9}{\text{Education}_i} + \beta_2 \color{#FF4136}{\text{Ability}_i} + \varepsilon_i$$` -- .box-inv-5.small[Unmeasurable ability node is in the error term (ε)] `$$\color{#FF851B}{\text{Earnings}_i} = \beta_0 + \beta_1 \color{#0074D9}{\text{Education}_i} + \color{#FF4136}{\varepsilon_i}$$` --- # Split exogeneity and endogeneity .box-inv-5[What if we could somehow separate education<br>into its endogenous and exogenous parts?] 
-- .SMALL[ $$ `\begin{aligned} \color{#FF851B}{\text{Earnings}_i} =& \beta_0 + \beta_1 \color{#B10DC9}{\text{Education}_i} + \varepsilon_i \\ =& \beta_0 + \beta_1 (\color{#0074D9}{\text{Education}_i^\text{exog.}} + \color{#FF4136}{\text{Education}_i^\text{endog.}}) + \varepsilon_i \\ =& \beta_0 + \beta_1 \color{#0074D9}{\text{Education}_i^\text{exog.}} + \underbrace{\beta_1 \color{#FF4136}{\text{Education}_i^{\text{endog.}}} + \varepsilon_i}_{\color{#AAAAAA}{\omega_i}} \\ =& \beta_0 + \beta_1 \color{#0074D9}{\text{Education}_i^\text{exog.}} + \color{#AAAAAA}{\omega_i} \end{aligned}` $$ ] --- # Find exogeneity with One Weird Trick™ .medium[ $$ \color{#FF851B}{\text{Earnings}_i} = \beta_0 + \beta_1 \color{#0074D9}{\text{Education}_i^\text{exog.}} + \color{#AAAAAA}{\omega_i} $$ ] .box-inv-5.medium.sp-after[How do we find only Education<sup>exog.</sup>?] -- .box-5.large[Use an instrument!] --- layout: false name: instruments class: center middle section-title section-title-6 animated fadeIn # Instruments --- layout: true class: title title-6 --- # What is an instrument? 
-- .box-inv-6[Something that is correlated with the policy variable] .box-6.small.sp-after[(Relevance)] -- .box-inv-6[Something that does not directly cause the outcome] .box-6.small.sp-after[(Exclusion)] -- .box-inv-6[Something that is not correlated with the omitted variables] .box-6.small[(Exogeneity)] --- layout: false <img src="11-slides_files/figure-html/iv-dag-general-1.png" width="100%" style="display: block; margin: auto;" /> --- <img src="11-slides_files/figure-html/iv-dag-example-1.png" width="100%" style="display: block; margin: auto;" /> --- <img src="11-slides_files/figure-html/iv-dag-letters-1.png" width="100%" style="display: block; margin: auto;" /> --- .pull-left[ .box-6.SMALL[**Relevance**<br>Correlated with policy] .box-inv-6.smaller[Z → X   Cor(Z, X) ≠ 0] .box-6.SMALL[**Excludability**<br>Correlated with outcome<br>*only through* policy] .box-inv-6.smaller[Z → X → Y   Z ↛ Y   Cor(Z, Y | X) = 0] .box-6.SMALL[**Exogeneity**<br>*Not* correlated<br>with omitted variables] .box-inv-6.smaller[U ↛ Z   Cor(Z, U) = 0] ] .pull-right[ ![](11-slides_files/figure-html/iv-dag-letters-1.png) .box-inv-6.smaller[**Relevance** testable with stats] .box-inv-6.smaller[**Excludability** testable with stats + story] .box-inv-6.smaller[**Exogeneity** requires story, no stats] ] ??? 
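Speaker note: a minimal sketch, on fake data, of what "testable with stats" means for the three conditions on this slide. Everything here is an assumption made for illustration: the `simulate` helper, the coefficients, and the sample size are hypothetical and do not come from the course data. Because ability (U) is simulated, we can peek at Cor(Z, U); with real data only the relevance check is computable. The last two lines use the standard ratio (Wald) form of the IV estimate, Cov(Z, Y) / Cov(Z, X), to show why the conditions matter.

```python
import random


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (len(a) - 1)


def cor(a, b):
    """Pearson correlation."""
    return cov(a, b) / (cov(a, a) * cov(b, b)) ** 0.5


def simulate(n=5000, true_effect=8.0, seed=8521):
    """Fake data only; every coefficient below is an arbitrary assumption."""
    rng = random.Random(seed)
    z, x, y, u = [], [], [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)     # U: unobserved in real data
        instrument = rng.gauss(0, 1)  # Z: drawn independently of U (exogenous)
        # X: education depends on both the instrument and ability (endogenous)
        educ = 12 + 1.0 * instrument + 1.5 * ability + rng.gauss(0, 1)
        # Y: no direct Z term, so exclusion holds by construction
        wage = 20 + true_effect * educ + 10 * ability + rng.gauss(0, 5)
        z.append(instrument)
        x.append(educ)
        y.append(wage)
        u.append(ability)
    return z, x, y, u


z, x, y, u = simulate()

relevance = cor(z, x)    # Cor(Z, X): the only check available with real data
exogeneity = cor(z, u)   # Cor(Z, U): visible here only because U is simulated

naive_slope = cov(x, y) / cov(x, x)  # slope of Y on X, confounded by ability
iv_slope = cov(z, y) / cov(z, x)     # ratio (Wald) IV estimate

print(f"Cor(Z, X)   = {relevance:.3f}")
print(f"Cor(Z, U)   = {exogeneity:.3f}")
print(f"Naive slope = {naive_slope:.2f}")
print(f"IV slope    = {iv_slope:.2f}")
```

With these made-up coefficients the naive slope should overshoot the true effect of 8 while the IV slope should land near it. In real data, U is never observed, so exogeneity and exclusion still rest on the story rather than on a statistic.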
https://dlm-econometrics.blogspot.com/2020/08/horseshoes-and-hand-grenades.html --- layout: true class: title title-6 --- # Relevance .box-6[Instrument causes change in policy] .box-inv-6.smaller.sp-after[Z → X   Cor(Z, X) ≠ 0] <hr> -- .center.float-left.sp-after[.box-inv-6.sp-before[Social security number] .box-2.smaller[Probably not relevant (uncorrelated with education)]] -- .center.float-left.sp-after[.box-inv-6.sp-before[3rd grade test scores] .box-5.smaller[Potentially relevant (early grades cause more education)]] -- .center.float-left[.box-inv-6.sp-before[Father's education] .box-5.smaller[Relevant (Educated parents cause more education)]] --- # Excludability .box-6[Instrument causes outcome *only through* policy] .box-inv-6.smaller.sp-after[Z → X → Y   Z ↛ Y   Cor(Z, Y | X) = 0] <hr> -- .center.float-left.sp-after[.box-inv-6.sp-before[Social security number] .box-5.smaller[Exclusive (SSN isn't correlated with hourly wages)]] -- .center.float-left.sp-after[.box-inv-6.sp-before[3rd grade test scores] .box-5.smaller[Potentially exclusive (early grades probably don't cause wages)]] -- .center.float-left[.box-inv-6.sp-before[Father's education] .box-5.smaller[Exclusive (Parent's education doesn't cause your wages (lol))]] --- # Exogeneity .box-6[Instrument not correlated with omitted variables] .box-inv-6.smaller.sp-after[U ↛ Z   Cor(Z, U) = 0] <hr> -- .center.float-left.sp-after[.box-inv-6.sp-before[Social security number] .box-5.smaller[Exogenous (Unrelated to anything related to education)]] -- .center.float-left.sp-after[.box-inv-6.sp-before[3rd grade test scores] .box-2.smaller[Not exogenous (Grades correlated with other education factors)]] -- .center.float-left[.box-inv-6.sp-before[Father's education] .box-5.smaller[Exogenous (Birth to parents is random)]] --- # The huh? 
factor .box-inv-6.medium["A necessary but not a sufficient condition<br>for having an instrument that can satisfy<br>the exclusion restriction is <span style="color: #A52C60;">if people are<br>confused when you tell them about the<br>instrument's relationship to the outcome.</span>"] .box-6.small[Scott Cunningham, *Causal Inference: The Mixtape*, p. 123] --- layout: false .small[ <table> <thead> <tr> <th style="text-align:left;"> Outcome </th> <th style="text-align:left;"> Policy </th> <th style="text-align:left;"> Unobserved stuff </th> <th style="text-align:left;"> Instrument </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Father's education </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Income </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Education </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Ability </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Distance to college </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Income </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Education </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Ability </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Military draft </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Health </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Smoking cigarettes </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Other negative health behaviors </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Tobacco taxes </td> </tr> <tr> <td style="text-align:left;color: #ffffff; 
background-color: #ffffff;"> Crime rate </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Patrol hours </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> # of criminals </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Election cycles </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Crime </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Incarceration rate </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Overcrowding litigations </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Labor market success </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Americanization </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Ability </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Scrabble score of name </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Conflicts </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Economic growth </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Rainfall </td> </tr> </tbody> </table> ] --- .small[ <table> <thead> <tr> <th style="text-align:left;"> Outcome </th> <th style="text-align:left;"> Policy </th> <th style="text-align:left;"> Unobserved stuff </th> <th style="text-align:left;"> Instrument </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Father's education </td> </tr> <tr> <td style="text-align:left;"> 
Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Distance to college </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Income </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Education </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Ability </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Military draft </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Health </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Smoking cigarettes </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Other negative health behaviors </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Tobacco taxes </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Crime rate </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Patrol hours </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> # of criminals </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Election cycles </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Crime </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Incarceration rate </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Overcrowding litigations </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Labor market success </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Americanization </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Ability </td> <td 
style="text-align:left;color: #ffffff; background-color: #ffffff;"> Scrabble score of name </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Conflicts </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Economic growth </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Rainfall </td> </tr> </tbody> </table> ] --- .small[ <table> <thead> <tr> <th style="text-align:left;"> Outcome </th> <th style="text-align:left;"> Policy </th> <th style="text-align:left;"> Unobserved stuff </th> <th style="text-align:left;"> Instrument </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Father's education </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Distance to college </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Military draft </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Health </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Smoking cigarettes </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Other negative health behaviors </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Tobacco taxes </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Crime rate </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Patrol hours </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> # of 
criminals </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Election cycles </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Crime </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Incarceration rate </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Overcrowding litigations </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Labor market success </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Americanization </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Ability </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Scrabble score of name </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Conflicts </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Economic growth </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Rainfall </td> </tr> </tbody> </table> ] --- .small[ <table> <thead> <tr> <th style="text-align:left;"> Outcome </th> <th style="text-align:left;"> Policy </th> <th style="text-align:left;"> Unobserved stuff </th> <th style="text-align:left;"> Instrument </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Father's education </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Distance to college </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td 
style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Military draft </td> </tr> <tr> <td style="text-align:left;"> Health </td> <td style="text-align:left;"> Smoking cigarettes </td> <td style="text-align:left;"> Other negative health behaviors </td> <td style="text-align:left;"> Tobacco taxes </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Crime rate </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Patrol hours </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> # of criminals </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Election cycles </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Crime </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Incarceration rate </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Overcrowding litigations </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Labor market success </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Americanization </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Ability </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Scrabble score of name </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Conflicts </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Economic growth </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Rainfall </td> </tr> </tbody> </table> ] --- .small[ <table> <thead> <tr> <th style="text-align:left;"> 
Outcome </th> <th style="text-align:left;"> Policy </th> <th style="text-align:left;"> Unobserved stuff </th> <th style="text-align:left;"> Instrument </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Father's education </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Distance to college </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Military draft </td> </tr> <tr> <td style="text-align:left;"> Health </td> <td style="text-align:left;"> Smoking cigarettes </td> <td style="text-align:left;"> Other negative health behaviors </td> <td style="text-align:left;"> Tobacco taxes </td> </tr> <tr> <td style="text-align:left;"> Crime rate </td> <td style="text-align:left;"> Patrol hours </td> <td style="text-align:left;"> # of criminals </td> <td style="text-align:left;"> Election cycles </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Crime </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Incarceration rate </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Overcrowding litigations </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Labor market success </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Americanization </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Ability </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Scrabble score of name </td> </tr> 
<tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Conflicts </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Economic growth </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Rainfall </td> </tr> </tbody> </table> ] --- .small[ <table> <thead> <tr> <th style="text-align:left;"> Outcome </th> <th style="text-align:left;"> Policy </th> <th style="text-align:left;"> Unobserved stuff </th> <th style="text-align:left;"> Instrument </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Father's education </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Distance to college </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Military draft </td> </tr> <tr> <td style="text-align:left;"> Health </td> <td style="text-align:left;"> Smoking cigarettes </td> <td style="text-align:left;"> Other negative health behaviors </td> <td style="text-align:left;"> Tobacco taxes </td> </tr> <tr> <td style="text-align:left;"> Crime rate </td> <td style="text-align:left;"> Patrol hours </td> <td style="text-align:left;"> # of criminals </td> <td style="text-align:left;"> Election cycles </td> </tr> <tr> <td style="text-align:left;"> Crime </td> <td style="text-align:left;"> Incarceration rate </td> <td style="text-align:left;"> Simultaneous causality </td> <td style="text-align:left;"> Overcrowding litigations </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Labor market 
success </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Americanization </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Ability </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Scrabble score of name </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Conflicts </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Economic growth </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Rainfall </td> </tr> </tbody> </table> ] --- .small[ <table> <thead> <tr> <th style="text-align:left;"> Outcome </th> <th style="text-align:left;"> Policy </th> <th style="text-align:left;"> Unobserved stuff </th> <th style="text-align:left;"> Instrument </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Father's education </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Distance to college </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Military draft </td> </tr> <tr> <td style="text-align:left;"> Health </td> <td style="text-align:left;"> Smoking cigarettes </td> <td style="text-align:left;"> Other negative health behaviors </td> <td style="text-align:left;"> Tobacco taxes </td> </tr> <tr> <td style="text-align:left;"> Crime rate </td> <td style="text-align:left;"> Patrol hours </td> <td style="text-align:left;"> # of criminals </td> <td style="text-align:left;"> Election cycles </td> </tr> <tr> <td 
style="text-align:left;"> Crime </td> <td style="text-align:left;"> Incarceration rate </td> <td style="text-align:left;"> Simultaneous causality </td> <td style="text-align:left;"> Overcrowding litigations </td> </tr> <tr> <td style="text-align:left;"> Labor market success </td> <td style="text-align:left;"> Americanization </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Scrabble score of name </td> </tr> <tr> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Conflicts </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Economic growth </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Simultaneous causality </td> <td style="text-align:left;color: #ffffff; background-color: #ffffff;"> Rainfall </td> </tr> </tbody> </table> ] --- .small[ <table> <thead> <tr> <th style="text-align:left;"> Outcome </th> <th style="text-align:left;"> Policy </th> <th style="text-align:left;"> Unobserved stuff </th> <th style="text-align:left;"> Instrument </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Father's education </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Distance to college </td> </tr> <tr> <td style="text-align:left;"> Income </td> <td style="text-align:left;"> Education </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Military draft </td> </tr> <tr> <td style="text-align:left;"> Health </td> <td style="text-align:left;"> Smoking cigarettes </td> <td style="text-align:left;"> Other negative health behaviors </td> <td style="text-align:left;"> Tobacco taxes </td> </tr> <tr> <td style="text-align:left;"> Crime rate </td> <td style="text-align:left;"> Patrol 
hours </td> <td style="text-align:left;"> # of criminals </td> <td style="text-align:left;"> Election cycles </td> </tr> <tr> <td style="text-align:left;"> Crime </td> <td style="text-align:left;"> Incarceration rate </td> <td style="text-align:left;"> Simultaneous causality </td> <td style="text-align:left;"> Overcrowding litigations </td> </tr> <tr> <td style="text-align:left;"> Labor market success </td> <td style="text-align:left;"> Americanization </td> <td style="text-align:left;"> Ability </td> <td style="text-align:left;"> Scrabble score of name </td> </tr> <tr> <td style="text-align:left;"> Conflicts </td> <td style="text-align:left;"> Economic growth </td> <td style="text-align:left;"> Simultaneous causality </td> <td style="text-align:left;"> Rainfall </td> </tr> </tbody> </table> ] --- layout: true class: title title-6 --- # Instruments are hard to find! .box-inv-6.medium[The trickiest thing to prove is<br>the exclusion restriction] .box-6.sp-after[Instrument causes the outcome *only through* the policy] -- .box-inv-6.medium[Most proposed instruments fail this!] --- # Rainfall as an instrument .box-inv-6[People love using weather as an instrument… buuuuut…] -- .center[ <figure> <img src="img/11/weather-paper.png" alt="Rainfall exclusion restrictions paper" title="Rainfall exclusion restrictions paper" width="85%"> </figure> ] ??? <https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3715610> --- layout: false .center[ <figure> <img src="img/11/weather-violations.png" alt="Rainfall exclusion restrictions paper" title="Rainfall exclusion restrictions paper" width="60%"> </figure> ] --- layout: true class: title title-6 --- # COVID-19 as an instrument .box-inv-6.medium[A global pandemic is a huge<br>exogenous shock to<br>social systems everywhere] .box-6[Maybe we can use it as an instrument!] --- # COVID-19 as an instrument .box-inv-6[What effect does closing schools have on<br>student performance or lifetime earnings?] 
<img src="11-slides_files/figure-html/covid-dag-1-1.png" width="70%" style="display: block; margin: auto;" />

---

# lolnope

<img src="11-slides_files/figure-html/covid-dag-2-1.png" width="100%" style="display: block; margin: auto;" />

???

https://twitter.com/joshuasgoodman/status/1238517897829310464

---

# Falsifying exclusion assumptions

.box-inv-6[Can you think of some other way that the instrument<br>can cause the outcome outside of the policy?]

--

.box-inv-6[If so, the instrument doesn't meet the exclusion restriction]

--

.pull-left[
![](11-slides_files/figure-html/iv-dag-general-1.png)
]

.pull-right[
.box-6.small[Instrument → ?? → outcome?]

.box-6.small[Rainfall → ?? → civil war?]

.box-6.small[Tobacco taxes → ?? → health?]

.box-6.small[Scrabble score → ?? →<br>Labor market success?]
]

---
layout: false
name: using-instruments
class: center middle section-title section-title-3 animated fadeIn

# Using instruments

---

`$$\text{Earnings}_i = \beta_0 + \beta_1 \text{Education}_i + \varepsilon_i$$`

.smaller[
<table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> Unadjusted </th> <th style="text-align:center;"> Forbidden </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:center;"> −59.378*** </td> <td style="text-align:center;"> −85.571*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (10.376) </td> <td style="text-align:center;"> (7.198) </td> </tr> <tr> <td style="text-align:left;background-color: #F5ABEA !important;"> educ </td> <td style="text-align:center;background-color: #F5ABEA !important;"> 13.124*** </td> <td style="text-align:center;background-color: #F5ABEA !important;"> 7.767*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.618) </td> <td style="text-align:center;"> (0.456) </td> </tr> <tr>
<td style="text-align:left;"> ability </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> 0.344*** </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> (0.010) </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. </td> <td style="text-align:center;"> 1000 </td> <td style="text-align:center;"> 1000 </td> </tr> <tr> <td style="text-align:left;"> R2 </td> <td style="text-align:center;"> 0.311 </td> <td style="text-align:center;"> 0.673 </td> </tr> <tr> <td style="text-align:left;border-bottom: 1px solid"> RMSE </td> <td style="text-align:center;border-bottom: 1px solid"> 39.13 </td> <td style="text-align:center;border-bottom: 1px solid"> 26.97 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr></tfoot> </table>
]

---

.SMALL[
$$
`\begin{aligned}
\color{#FF851B}{\text{Earnings}_i} =& \beta_0 + \beta_1 \color{#B10DC9}{\text{Education}_i} + \varepsilon_i \\
=& \beta_0 + \beta_1 (\color{#0074D9}{\text{Education}_i^\text{exog.}} + \color{#FF4136}{\text{Education}_i^\text{endog.}}) + \varepsilon_i \\
=& \beta_0 + \beta_1 \color{#0074D9}{\text{Education}_i^\text{exog.}} + \underbrace{\beta_1 \color{#FF4136}{\text{Education}_i^{\text{endog.}}} + \varepsilon_i}_{\color{#AAAAAA}{\omega_i}} \\
=& \beta_0 + \beta_1 \color{#0074D9}{\text{Education}_i^\text{exog.}} + \color{#AAAAAA}{\omega_i}
\end{aligned}`
$$
]

---

<img src="11-slides_files/figure-html/iv-dag-example-1.png" width="85%" style="display: block; margin: auto;" />

.center.float-left[.box-3[Relevancy] .box-3[Excludability] .box-3[Exogeneity]]

---
layout: true
class: title title-3

---

# Relevancy

.box-inv-3[Program ~ instrument]

.pull-left-narrow[
<img
src="11-slides_files/figure-html/plot-first-stage-1.png" width="504" style="display: block; margin: auto;" />
]

.pull-right-wide[
.small-code.smaller[
.box-3.small[Clear, significant effect = relevant!]

```r
library(broom)  # provides tidy() and glance()

first_stage <- lm(educ ~ fathereduc, data = father_education)
tidy(first_stage)
```

```
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 2.25 0.172 13.1 3.67e-36
## 2 fathereduc 0.916 0.0108 84.5 0
```
]

.small-code.smaller[
.box-3.small[First-stage model F-statistic (`statistic` here) > 104 = strong instrument]

```r
glance(first_stage)
```

```
## # A tibble: 1 × 12
## r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.877 0.877 0.703 7136. 0 1 -1066. 2137. 2152.
## # ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
```
]
]

???

<https://arxiv.org/abs/2010.05058>

---

# Exclusion

.box-inv-3[Does it meet the exclusion assumption?]

.box-3.smaller[Does father's education cause your wages *only through* your education?]

.box-3.smaller[Any other plausible node between father's education and earnings?]

.pull-left-narrow[
<img src="11-slides_files/figure-html/plot-exclusion-1.png" width="80%" style="display: block; margin: auto;" />
]

.pull-right-wide[
<img src="11-slides_files/figure-html/iv-dag-example-1.png" width="80%" style="display: block; margin: auto;" />
]

???

The most obvious violation is that father's education causes father's earnings, which in turn cause your earnings

---

# Exogeneity

.box-inv-3.medium[Is assignment to your parents random?]

--

.box-3.sp-after[Sure.]

--

.box-inv-3.medium[Is your parents' choice to<br>gain education random?]

--

.box-3[lolz.]
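---

# The three checks in symbols

.box-inv-3.small[A rough sketch of how the three checks are often written. The symbol Z (the instrument, here father's education) is an assumption added for illustration; ε is the error term from the earnings equation.]

.small[
$$
`\begin{aligned}
\text{Relevancy:}&\quad \operatorname{Cov}(Z_i, \text{Education}_i) \neq 0 \\
\text{Exclusion and exogeneity:}&\quad \operatorname{Cov}(Z_i, \varepsilon_i) = 0 \\
\text{IV estimate:}&\quad \beta_1^{\text{IV}} = \frac{\operatorname{Cov}(Z_i, \text{Earnings}_i)}{\operatorname{Cov}(Z_i, \text{Education}_i)}
\end{aligned}`
$$
]

.box-3.smaller[With one instrument and no controls, this ratio is the same number that 2SLS produces]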
---

# Two-stage least squares (2SLS)

.box-inv-3.medium[Find the exogenous part of the policy variable based on the instrument; use *that* to predict the outcome]

--

.pull-left[
.box-3.small[First stage]

.small[
$$
`\begin{aligned}
&\widehat{\text{Education}}_i = \\
&\quad \gamma_0 + \gamma_1 \text{Father's education}_i
\end{aligned}`
$$
]

.box-inv-3.smaller["Education hat": fitted/predicted values; exogenous part of education]
]

--

.pull-right[
.box-3.small[Second stage]

.small[
$$
`\begin{aligned}
&\text{Earnings}_i = \\
&\quad \beta_0 + \beta_1 \widehat{\text{Education}}_i + \varepsilon_i
\end{aligned}`
$$
]
]

---

# Stage 1: Policy ~ instrument

```r
first_stage <- lm(educ ~ fathereduc, data = father_education)
tidy(first_stage)
```

```
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 2.25 0.172 13.1 3.67e-36
## 2 fathereduc 0.916 0.0108 84.5 0
```

---

# Stage 1: Check instrument strength

.box-inv-3[Model's F-statistic .smaller[(`statistic` here)] should be > 104<br>(though most books say > 10)]

```r
glance(first_stage)
```

```
## # A tibble: 1 × 5
## r.squared adj.r.squared sigma statistic p.value
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 0.877 0.877 0.703 7136. 0
```

---

# Stage 1: Use first stage to predict policy

.small[
$$
\widehat{\text{Education}}_i = 2.251 + (0.916 \times \text{Father's education}_i)
$$
]

.small-code[
```r
data_with_predictions <- augment_columns(first_stage, data = father_education) |>
  rename(educ_hat = .fitted)
head(data_with_predictions)
```
]

.pull-left.small-code[
```
## # A tibble: 6 × 5
## wage educ ability fathereduc educ_hat
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 180. 18.5 408. 17.2 18.0
## 2 100. 16.2 310. 15.5 16.4
## 3 125. 18.2 303. 17.7 18.4
## 4 178. 16.6 342. 15.6 16.5
## 5 265. 17.3 534. 14.7 15.8
## 6 187. 17.5 409.
16.0 16.9 ``` ] .pull-right[ .box-3.smaller[educ_hat = 2.251 + (0.916 × **17.2**) = **18.0**] .box-3.smaller[educ_hat = 2.251 + (0.916 × **15.5**) = **16.4**] ] --- # Stage 2: Outcome ~ predicted policy ```r second_stage <- lm(wage ~ educ_hat, data = data_with_predictions) tidy(second_stage) ``` ``` ## # A tibble: 2 × 5 ## term estimate std.error statistic p.value ## <chr> <dbl> <dbl> <dbl> <dbl> ## 1 (Intercept) 28.8 12.7 2.27 2.32e- 2 ## 2 educ_hat 7.83 0.755 10.4 5.10e-24 ``` --- layout: false .pull-left-wide.smaller[ <table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> Unadjusted </th> <th style="text-align:center;"> Forbidden </th> <th style="text-align:center;"> 2SLS IV </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:center;"> −59.378*** </td> <td style="text-align:center;"> −85.571*** </td> <td style="text-align:center;"> 28.819* </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (10.376) </td> <td style="text-align:center;"> (7.198) </td> <td style="text-align:center;"> (12.672) </td> </tr> <tr> <td style="text-align:left;background-color: #F5ABEA !important;"> educ </td> <td style="text-align:center;background-color: #F5ABEA !important;"> 13.124*** </td> <td style="text-align:center;background-color: #F5ABEA !important;"> 7.767*** </td> <td style="text-align:center;background-color: #F5ABEA !important;"> </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.618) </td> <td style="text-align:center;"> (0.456) </td> <td style="text-align:center;"> </td> </tr> <tr> <td style="text-align:left;"> ability </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> 0.344*** </td> <td style="text-align:center;"> </td> </tr> <tr> <td style="text-align:left;"> </td> <td 
style="text-align:center;"> </td> <td style="text-align:center;"> (0.010) </td> <td style="text-align:center;"> </td> </tr> <tr> <td style="text-align:left;background-color: #F5ABEA !important;"> educ_hat </td> <td style="text-align:center;background-color: #F5ABEA !important;"> </td> <td style="text-align:center;background-color: #F5ABEA !important;"> </td> <td style="text-align:center;background-color: #F5ABEA !important;"> 7.835*** </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> (0.755) </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. </td> <td style="text-align:center;"> 1000 </td> <td style="text-align:center;"> 1000 </td> <td style="text-align:center;"> 1000 </td> </tr> <tr> <td style="text-align:left;"> R2 </td> <td style="text-align:center;"> 0.311 </td> <td style="text-align:center;"> 0.673 </td> <td style="text-align:center;"> 0.097 </td> </tr> <tr> <td style="text-align:left;border-bottom: 1px solid"> RMSE </td> <td style="text-align:center;border-bottom: 1px solid"> 39.13 </td> <td style="text-align:center;border-bottom: 1px solid"> 26.97 </td> <td style="text-align:center;border-bottom: 1px solid"> 44.80 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr></tfoot> </table> ] .pull-right-narrow[ .box-2[Unadjusted<br>is wrong!] .box-5[Forbidden is right,<br>but not actually<br>measurable!] .box-5[2SLS is close<br>*and* measurable!] 
.box-inv-5.small[One additional year of education causes hourly wages to increase by $7.83]
]

---
layout: true
class: title title-3

---

# Multiple instruments

.box-inv-3[You can use multiple instruments to<br>isolate more of the exogenous variation in the policy node]

<img src="11-slides_files/figure-html/multiple-ivs-dag-1.png" width="70%" style="display: block; margin: auto;" />

---

# Multiple instruments

$$
`\begin{aligned}
\widehat{\text{Education}}_i =&\ \gamma_0 + \gamma_1 \text{Father's education}_i +\\
&\ \gamma_2 \text{Mother's education}_i\\
\\
\text{Earnings}_i =&\ \beta_0 + \beta_1 \widehat{\text{Education}}_i + \varepsilon_i
\end{aligned}`
$$

---

# Other control variables

.box-inv-3[You can use control variables too!]

.box-inv-3[For mathy reasons,<br>all exogenous controls need to go in both stages]

.small[
$$
`\begin{aligned}
\widehat{\text{Education}}_i =&\ \gamma_0 + \gamma_1 \text{Father's education}_i + \gamma_2 \text{Mother's education}_i +\\
&\ \gamma_3 \text{SES}_i + \gamma_4 \text{State}_i + \gamma_5 \text{Year}_i\\
\\
\text{Earnings}_i =&\ \beta_0 + \beta_1 \widehat{\text{Education}}_i +\\
&\ \beta_2 \text{SES}_i + \beta_3 \text{State}_i + \beta_4 \text{Year}_i + \varepsilon_i
\end{aligned}`
$$
]

???
Control variables and other backdoors: exogenous controls go in both stages, even if they're not used as instruments in the first stage. The first stage is supposed to take out all the endogenous variation in all variables, so if you leave the second-stage controls out of the first stage, they'll still have some mathematical endogeneity.

- https://raw.githack.com/edrubin/EC421W19/master/LectureNotes/11InstrumentalVariables/11_instrumental_variables.html#147
- https://twitter.com/grant_mcdermott/status/1194348199743377408
- https://stats.stackexchange.com/questions/177747/why-do-you-put-all-the-exogenous-variables-into-the-first-and-second-stage-of-2s

---

# Faster, more accurate ways to run 2SLS

.box-inv-3[Running the first stage, calculating policy-hat,<br>then running the second stage is neat, but time-consuming!]

.code-small.smaller[
```r
first_stage <- lm(educ ~ fathereduc, data = father_education)
data_with_predictions <- augment_columns(first_stage, data = father_education) |>
  rename(educ_hat = .fitted)
second_stage <- lm(wage ~ educ_hat, data = data_with_predictions)
```
]

--

.box-inv-3[Your standard errors will be wrong unless<br>you adjust them with fancy math by hand]

--

.box-3[Use R packages that do all that work for you instead!]

???

https://raw.githack.com/uo-ec607/lectures/master/08-regression/08-regression.html#Instrumental_variables

---

# Faster, more accurate ways to run 2SLS

.box-inv-3[`ivreg()` from the **ivreg** package]

.box-3[Outcome ~ 2nd stage stuff | 1st stage stuff]

.pull-left.code-small.tiny[
```r
library(ivreg)

model_ivreg <- ivreg(wage ~ educ | fathereduc, data = father_education)
tidy(model_ivreg)
```

```
## # A tibble: 2 × 5
## term estimate std.error statistic p.value
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 (Intercept) 28.8 11.5 2.51 1.21e- 2
## 2 educ 7.83 0.683 11.5 1.13e-28
```
]

.pull-right.code-small.tiny[
```r
summary(model_ivreg)
```

```
## Coefficients:
## Estimate Std.
Error t value Pr(>|t|)
## (Intercept) 28.8187 11.4679 2.513 0.0121 *
## educ 7.8349 0.6834 11.465 <2e-16 ***
##
## Diagnostic tests:
## df1 df2 statistic p-value
## Weak instruments 1 998 7136 <2e-16 ***
## Wu-Hausman 1 997 1102 <2e-16 ***
```
]

???

https://raw.githack.com/uo-ec607/lectures/master/08-regression/08-regression.html#Option_1:_AER::ivreg()

---

# Faster, more accurate ways to run 2SLS

.box-inv-3[`iv_robust()` from the **estimatr** package]

.box-3[Outcome ~ 2nd stage stuff | 1st stage stuff]

.code-small.smaller[
```r
library(estimatr)

model_iv_robust <- iv_robust(wage ~ educ | fathereduc, data = father_education)
tidy(model_iv_robust)
```

```
## term estimate std.error statistic p.value conf.low conf.high
## 1 (Intercept) 28.818695 11.1645893 2.581259 9.985789e-03 6.909932 50.727459
## 2 educ 7.834935 0.6635423 11.807739 3.281862e-30 6.532837 9.137033
## df outcome
## 1 998 wage
## 2 998 wage
```
]

.box-inv-3.small[(See also `felm()` from the **lfe** package for IV with fancy fixed effects)]

???
https://raw.githack.com/uo-ec607/lectures/master/08-regression/08-regression.html#Option_3:_felm::lfe() --- layout: false .smaller[ <table style="NAborder-bottom: 0; width: auto !important; margin-left: auto; margin-right: auto;" class="table"> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:center;"> Unadjusted </th> <th style="text-align:center;"> Forbidden </th> <th style="text-align:center;"> 2SLS IV (by hand) </th> <th style="text-align:center;"> 2SLS IV (ivreg()) </th> <th style="text-align:center;"> 2SLS IV (iv_robust()) </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> (Intercept) </td> <td style="text-align:center;"> −59.378*** </td> <td style="text-align:center;"> −85.571*** </td> <td style="text-align:center;"> 28.819* </td> <td style="text-align:center;"> 28.819* </td> <td style="text-align:center;"> 28.819** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (10.376) </td> <td style="text-align:center;"> (7.198) </td> <td style="text-align:center;"> (12.672) </td> <td style="text-align:center;"> (11.468) </td> <td style="text-align:center;"> (11.165) </td> </tr> <tr> <td style="text-align:left;background-color: #F5ABEA !important;"> educ </td> <td style="text-align:center;background-color: #F5ABEA !important;"> 13.124*** </td> <td style="text-align:center;background-color: #F5ABEA !important;"> 7.767*** </td> <td style="text-align:center;background-color: #F5ABEA !important;"> </td> <td style="text-align:center;background-color: #F5ABEA !important;"> 7.835*** </td> <td style="text-align:center;background-color: #F5ABEA !important;"> 7.835*** </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> (0.618) </td> <td style="text-align:center;"> (0.456) </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> (0.683) </td> <td style="text-align:center;"> (0.664) </td> </tr> <tr> <td style="text-align:left;"> ability </td> <td 
style="text-align:center;"> </td> <td style="text-align:center;"> 0.344*** </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> </td> </tr> <tr> <td style="text-align:left;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> (0.010) </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> </td> <td style="text-align:center;"> </td> </tr> <tr> <td style="text-align:left;background-color: #F5ABEA !important;"> educ_hat </td> <td style="text-align:center;background-color: #F5ABEA !important;"> </td> <td style="text-align:center;background-color: #F5ABEA !important;"> </td> <td style="text-align:center;background-color: #F5ABEA !important;"> 7.835*** </td> <td style="text-align:center;background-color: #F5ABEA !important;"> </td> <td style="text-align:center;background-color: #F5ABEA !important;"> </td> </tr> <tr> <td style="text-align:left;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> (0.755) </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> <td style="text-align:center;box-shadow: 0px 1.5pxborder-bottom: 1px solid"> </td> </tr> <tr> <td style="text-align:left;"> Num.Obs. 
</td> <td style="text-align:center;"> 1000 </td> <td style="text-align:center;"> 1000 </td> <td style="text-align:center;"> 1000 </td> <td style="text-align:center;"> 1000 </td> <td style="text-align:center;"> 1000 </td> </tr> <tr> <td style="text-align:left;background-color: #FFE2D0 !important;"> R2 </td> <td style="text-align:center;background-color: #FFE2D0 !important;"> 0.311 </td> <td style="text-align:center;background-color: #FFE2D0 !important;"> 0.673 </td> <td style="text-align:center;background-color: #FFE2D0 !important;"> 0.097 </td> <td style="text-align:center;background-color: #FFE2D0 !important;"> 0.261 </td> <td style="text-align:center;background-color: #FFE2D0 !important;"> 0.261 </td> </tr> <tr> <td style="text-align:left;border-bottom: 1px solid"> R2 Adj. </td> <td style="text-align:center;border-bottom: 1px solid"> 0.311 </td> <td style="text-align:center;border-bottom: 1px solid"> 0.672 </td> <td style="text-align:center;border-bottom: 1px solid"> 0.096 </td> <td style="text-align:center;border-bottom: 1px solid"> 0.260 </td> <td style="text-align:center;border-bottom: 1px solid"> 0.260 </td> </tr> </tbody> <tfoot><tr><td style="padding: 0; " colspan="100%"> <sup></sup> + p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001</td></tr></tfoot> </table>
]

---
class: title title-3

# General IV process

--

.box-inv-3[**1: Is the instrument relevant?**<br>.smaller[Instrument correlated with policy/program; F-statistic in 1st stage > 104]]

--

.box-inv-3[**2: Does the instrument meet the exclusion assumption?**<br>.smaller[Instrument causes outcome *only through* policy/program. **Good luck.**]]

--

.box-inv-3[**3: Is the instrument exogenous?**<br>.smaller[No arrows going into instrument node in DAG]]

--

.box-inv-3[**4: 2-stage least squares (2SLS)**<br>.smaller[program ~ instrument; outcome ~ program_hat **OR** `iv_robust()`]]
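---
class: title title-3

# General IV process: a simulated sketch

.box-inv-3.small[A minimal sketch of steps 1 and 4 on made-up data, using base R only. The data frame `sim`, its variable names, and every coefficient are assumptions for illustration; this is not the `father_education` data. The true effect of `educ` is set to 2 so the estimates can be compared against it.]

```r
# Hypothetical simulated data; NOT the father_education data from the slides.
# Variable names and coefficients are made up for illustration.
set.seed(1234)
n <- 5000

ability    <- rnorm(n)  # unobserved confounder
instrument <- rnorm(n)  # exogenous by construction; affects wage only via educ
educ <- 12 + 1.0 * instrument + 1.5 * ability + rnorm(n)
wage <- 10 + 2.0 * educ + 5.0 * ability + rnorm(n)  # true effect of educ = 2

sim <- data.frame(wage, educ, instrument)

# Naive OLS leaves the ability backdoor open, so the estimate is biased
naive <- lm(wage ~ educ, data = sim)

# Step 1 (relevancy): first stage, policy ~ instrument, and its F-statistic
first_stage <- lm(educ ~ instrument, data = sim)
first_stage_f <- summary(first_stage)$fstatistic[["value"]]

# Step 4 (2SLS by hand): second stage, outcome ~ predicted policy
sim$educ_hat <- fitted(first_stage)
second_stage <- lm(wage ~ educ_hat, data = sim)

# With a single instrument, the 2SLS slope equals cov(Z, Y) / cov(Z, X)
ratio <- cov(sim$instrument, sim$wage) / cov(sim$instrument, sim$educ)

estimates <- c(
  naive         = coef(naive)[["educ"]],
  two_stage     = coef(second_stage)[["educ_hat"]],
  ratio         = ratio,
  first_stage_F = first_stage_f
)
round(estimates, 2)
```

.box-3.small[The by-hand second stage gets the point estimate right but not the standard errors, so treat this as intuition only and use `ivreg()` or `iv_robust()` for real work. Steps 2 and 3 can't be checked with code here: exclusion and exogeneity hold only because the simulation was built that way.]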