Instrumental Variables Saralyn J Miller EDU 7314.


Instrumental Variables

Saralyn J Miller

EDU 7314


Overview of Presentation

• Understanding IV
  – History
  – Defined
  – Assumptions
  – Endogeneity
  – Exogenous Variable - Instrument
  – Angrist example paralleled with an education example

• Statistical Understanding of IV
  – Present 2 equations

• Card Example
  – Overview of article
  – Replicate his study in R

• In-class Example

• Other Examples of IV in Education


History of IV

• Historically, IV has mostly been used by economists and statisticians (Angrist & Krueger, 2001).

• Philip G. Wright (econometrician) vs. Sewall Wright (biologist) (Wright, 1928).
  – Philip had written about the problem of endogenous variation in previous papers.
  – Sewall had discovered the use of an instrument, but the variables were already exogenous, so the analysis was unnecessary.
  – Stylometric analysis of their writing (Stock & Trebbi, 2003).
    • The authors found Philip to be the writer and founder of IV.

• 1940s: IV was rediscovered.

• 1953: Theil introduced the two-stage least squares method for computing IV.


Instrumental Variables Defined

• Causality is difficult to prove, even in experimental research.

• In education research, random assignment is the standard way to establish causality.

• However, we can’t always randomize or create a true experiment.

• The IV method is a quasi-experimental research method used to estimate causal relationships.


Regression Assumption

• One of the assumptions about the error term in a regression analysis is that the error must be independent and identically distributed.
  – Error variance is the same for all values.
  – Error is not related to other error values.
  – Error is normally distributed.

• Regression also assumes that the independent variables are uncorrelated with the error. Use IV when an independent variable is correlated with the unobservable error.

• 3 reasons why this assumption might be violated:
  – Omitted variable bias: An unobserved variable explains part of the dependent variable but is not in your model. The variables you did include pick up some of its effect, so the unobserved variable needs to be accounted for on its own.
  – Measurement error: The data were collected with error, so causation cannot be determined.
  – Reverse causality: The direction of causality is not determined.

http://www.unescap.org/tid/artnet/mtg/gravity_d4s1_shepherd.pdf
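Omitted variable bias can be illustrated with a small simulation. This is only a sketch on made-up data: the variable names (ability, educ, wage) and all coefficients are hypothetical and are not taken from any study cited here.

```r
# Simulated illustration of omitted variable bias (all values are hypothetical).
set.seed(1)
n <- 5000
ability <- rnorm(n)                                     # unobserved variable
educ    <- 12 + 2 * ability + rnorm(n)                  # regressor depends on ability
wage    <- 1 + 0.10 * educ + 0.50 * ability + rnorm(n)  # true effect of educ is 0.10

short <- lm(wage ~ educ)            # ability omitted, so it sits in the error term
long  <- lm(wage ~ educ + ability)  # ability included (only possible in a simulation)

b_short <- unname(coef(short)["educ"])
b_long  <- unname(coef(long)["educ"])
round(c(omitted = b_short, included = b_long), 3)
```

When ability is left out, the coefficient on educ absorbs part of the effect of ability and lands well above 0.10. With ability included it is close to the true value.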


Endogeneity

• When an independent variable correlates with the unobservable error, we call this endogeneity.
  – Endogenous variables: variables that are correlated with the error term. You can’t say that the independent variables cause the dependent variable.
  – Often the factors that affect an outcome depend on that outcome (reverse causality).
  – Example
    • The more shots Kobe Bryant takes, the lower the percentage of wins for the Lakers. Does an increase in shots that Kobe takes cause the Lakers to lose? Or does the loss of the game and the fact that teammates are not making shots cause Kobe to take more shots? (http://drbseconomicblog.blogspot.com/2009/01/kobe-and-reverse-causality.html)


Endogeneity

• Sometimes in a linear model some of the variables are endogenous, meaning the regressors are correlated with the error term.
  – Ex: Effect of military service on future earnings (Angrist, 1990).
    • Military service is endogenous.
      – Does the military cause a soldier’s future earnings to be a certain amount when he or she leaves the service? Or are there certain characteristics of those who join the military that influence future earnings?
        » An individual’s choice to enter the service might be indicative of the individual’s expected future earnings. Some individuals choose to go into the military because their expected future earnings are low. Enlistment is therefore related to expected earnings: those who join the service might, on average, have had lower future earnings anyway.
        » Also, veterans have certain observed and unobserved characteristics that affect their decision to enroll, and these could be related to earnings.

http://financialaccess.org/node/2042


What do we do when we have an endogenous variable?

• An exogenous variable or instrument can “fix” endogeneity.
  – These variables are correlated with the regressors, but are uncorrelated with the error term.
  – We call these exogenous variables instruments.
  – Ex: Since earnings depend on other things such as expected earnings, Angrist (1990) used the Vietnam draft as an instrument. It is correlated with entering the service, but is not correlated with the other determinants of earnings. The draft system is exogenous.


Qualities of an Instrument – Exogenous Variable

• It must be correlated with the independent variable.

• It must be uncorrelated with the error of the dependent variable.

• Assumption of IV: Instrument must be exogenous.
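The two qualities above can be checked directly in a simulation, where the error term is visible. The sketch below uses made-up data; z, x, and u are hypothetical names for the instrument, the endogenous independent variable, and the error. With real data the error is unobserved, so the second check cannot be run and exogeneity has to be argued from theory.

```r
# Simulated check of the two instrument qualities (hypothetical data).
set.seed(2)
n <- 5000
u <- rnorm(n)                      # error in the outcome equation (unobserved in practice)
z <- rbinom(n, 1, 0.5)             # candidate instrument, assigned like a lottery
x <- 0.8 * z + 0.6 * u + rnorm(n)  # endogenous variable: depends on z and on u

cor_zx <- cor(z, x)  # relevance: should be clearly nonzero
cor_zu <- cor(z, u)  # exogeneity: near zero here by construction
cor_xu <- cor(x, u)  # endogeneity of x: clearly nonzero
round(c(cor_zx = cor_zx, cor_zu = cor_zu, cor_xu = cor_xu), 3)
```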


Example

• Joshua Angrist’s 1990 work.

• He analyzed the difference in earnings between veterans and non-veterans.

• But analyzing this difference does not tell us the causal impact of military service on future earnings.

• In education, we “fix” this problem by randomly placing students into treatment and control conditions.

• We can’t always randomize. What if we gave students a choice on whether they wanted to attend tutoring sessions (Reardon, 2010) because we could not randomly assign students to a condition?


Example Continued

• A young person’s decision to enter the military could be affected by his/her expectations of future earnings. This is an endogeneity problem: does military service affect future earnings or does the prospect of future earnings affect the decision to enter the military?

• Veterans have observed and unobserved characteristics that affect their reason for entering the military. We cannot control for the unobserved characteristics.

• Tutoring session example (Reardon, 2010): A student’s decision to attend tutoring could be affected by his/her expectations of how it will affect academic achievement. Does tutoring affect achievement or does the prospect of future grades affect the decision to go to tutoring?


What did Angrist do?

• He used the Vietnam draft lottery as an instrument (exogenous variable).
  – The draft lottery is correlated with serving in the military.
  – The draft lottery is only correlated with future earnings of military personnel through enrollment in the military.

• The tutoring sessions could use a lottery system too.
  – The lottery would be correlated with attending tutoring.
  – The lottery would be correlated with future grades only through attendance at the tutoring program.


Problem

• What about those who were drafted but avoided serving?

• Or those who were not drafted, but felt compelled to fight anyway?

• What about the students who were picked for the lottery, but chose not to go because they didn’t think it would help?

• Or those that were not picked, but really felt like they needed the help?


Answer

• The IV estimate does not describe those described previously, who did not follow their draft or lottery assignment. It is not an average treatment effect for the whole sample, but a local average treatment effect (LATE).

• Military earnings example only tells you the treatment effect on those who pulled a “bad” number and served and those who pulled a “good” number and did not serve.

• Tutoring example: only tells you the treatment effect on those who were picked for tutoring and attended and those who were not picked for tutoring and did not attend.

• Therefore we are only measuring a treatment effect for compliers, which makes this method less generalizable.
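The idea that IV recovers the effect for compliers only can be sketched with a simulated tutoring lottery. Everything here is hypothetical: the three group labels, their shares, and the effect sizes are invented for illustration. The estimate is the simple ratio of the lottery's effect on scores to its effect on attendance.

```r
# Simulated tutoring lottery with compliers, always-takers, and never-takers
# (all shares and effect sizes are hypothetical).
set.seed(3)
n <- 20000
type <- sample(c("complier", "always", "never"), n, replace = TRUE,
               prob = c(0.6, 0.2, 0.2))
z <- rbinom(n, 1, 0.5)  # lottery: 1 = picked for tutoring

# Attendance: always-takers attend regardless, never-takers never attend,
# compliers do what the lottery says.
d <- ifelse(type == "always", 1, ifelse(type == "never", 0, z))

# Effect of tutoring on the score differs by group.
effect <- c(complier = 5, always = 10, never = 1)[type]
y <- 50 + effect * d + rnorm(n, sd = 5)

# IV (ratio) estimate: effect of the lottery on scores / effect on attendance.
wald <- (mean(y[z == 1]) - mean(y[z == 0])) /
        (mean(d[z == 1]) - mean(d[z == 0]))

# Simple comparison of attenders and non-attenders, for contrast.
naive <- mean(y[d == 1]) - mean(y[d == 0])
round(c(IV = wald, naive = naive), 2)
```

The IV estimate lands near 5, the effect assigned to compliers. It says nothing about the always-takers (10) or never-takers (1), while the simple comparison mixes compliers with always-takers.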


IV Limitations & Advantages

• Limitations
  – LATE
  – Estimates can be biased when not a binary choice, but an ordered choice (use LIV to correct).
  – There is not usually a theoretical model that the relationships are based on except when a natural experiment is created.
  – Only generalizable to those that benefit from the instrument.

• Advantages
  – Can be used to estimate a causal relationship when randomization is not applicable.


Statistical Understanding of IV

• Think of IV models as 2 separate equations.

– y is the outcome variable
– K is the variable related to the instrument
– IV is the instrument related to K
– x_1 is the exogenous predictor(s)
– e is the error

$$y_i = x_1' B_1 + B_2 K + e_1$$

$$K = x_1' B_3 + B_4 IV + e_2$$
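The two equations can be estimated one after the other, which is the idea behind two-stage least squares. The sketch below does this by hand on simulated data, using the slide's symbols (y, x, K, IV) as variable names. The coefficients and the unobserved factor u are hypothetical. Doing the two stages manually gives the right coefficient but not the right standard errors, so a dedicated routine such as ivreg from the AER package is preferable in practice.

```r
# Manual two-stage least squares on simulated data (hypothetical values).
set.seed(4)
n  <- 5000
x  <- rnorm(n)                                  # exogenous predictor
IV <- rnorm(n)                                  # instrument
u  <- rnorm(n)                                  # unobserved factor that makes K endogenous
K  <- 1 + 0.5 * x + 0.8 * IV + u + rnorm(n)
y  <- 2 + 0.3 * x + 1.0 * K + 2 * u + rnorm(n)  # true coefficient on K is 1.0

ols <- lm(y ~ x + K)         # biased: K is correlated with the error (2 * u + noise)

stage1 <- lm(K ~ x + IV)     # equation for K: exogenous predictor plus instrument
K_hat  <- fitted(stage1)     # the part of K explained by exogenous variables only
stage2 <- lm(y ~ x + K_hat)  # equation for y, with K replaced by its prediction

b_ols  <- unname(coef(ols)["K"])
b_tsls <- unname(coef(stage2)["K_hat"])
round(c(OLS = b_ols, TSLS = b_tsls), 3)
```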


Typical Regression

[Diagram: Exogenous predictor (X1), Endogenous predictor (X2), DV, and error term e1]


Instrumental Variable Regression

[Diagram: Instrumental Variable (exogenous), Exogenous predictor (X1), Endogenous predictor (X2), and error term e1]


How do we find a good instrument and test the instrument’s validity?

• You can use theory and past research to provide evidence for an instrument.

• Hausman test

• Check correlation between independent variable and instrument.
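Both checks can be sketched on simulated data. The first-stage F-statistic measures how strongly the instrument predicts the independent variable. The endogeneity check shown here is the regression-based (Durbin-Wu-Hausman) form of the Hausman test, which adds the first-stage residuals to the outcome regression. The data and the names z, x, u, y are hypothetical, and this is one common variant of the test rather than the only one. For fitted ivreg models, the AER summary method also offers a diagnostics = TRUE option that reports similar tests.

```r
# Instrument checks on simulated data (hypothetical values).
set.seed(5)
n <- 5000
z <- rnorm(n)                        # instrument
u <- rnorm(n)                        # unobserved factor
x <- 0.5 * z + u + rnorm(n)          # endogenous independent variable
y <- 1 + 1.0 * x + 2 * u + rnorm(n)  # outcome

# Check 1: is the instrument correlated with the independent variable?
first  <- lm(x ~ z)
f_stat <- summary(first)$fstatistic[["value"]]  # first-stage F-statistic

# Check 2: regression-based Hausman test. If the first-stage residuals
# matter in the outcome regression, x is endogenous and OLS is inconsistent.
v_hat   <- resid(first)
dwh     <- lm(y ~ x + v_hat)
p_endog <- summary(dwh)$coefficients["v_hat", "Pr(>|t|)"]

c(first_stage_F = f_stat, hausman_p = p_endog)
```

A large first-stage F (a common rule of thumb is above 10) supports relevance, and a small p-value on the residual term indicates that the independent variable is endogenous. Neither check shows that the instrument is exogenous.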


Example in R – Card data

• Explanation of Card (1993) study

• Replicate study using Card data (Card, 1993; Hamersma, 2009).


Using Geographic Variation in College Proximity to Estimate the Return to Schooling (Card, 1993)

• Does level of education or number of years of schooling affect wages or earnings?
  – You would think yes!
  – BUT, the studies that show earnings gains are controversial because educational levels are NOT randomly assigned. Individuals choose their level of education. Education is endogenous.
  – The effect of schooling is difficult to determine and you cannot randomly assign some children to school.
  – The author needs an exogenous variable. Card uses geographic differences in the proximity to a college.

• Overall finding: When college proximity is used as an instrument for education, the author finds that the return to education is approximately 50% higher than the OLS estimate.


Why is Education Endogenous to Earnings?

• Ability bias: if some individuals have an ability (e.g., IQ) that raises earnings regardless of education, and those individuals also get more schooling, then the estimated effect of schooling on earnings will be biased upward.

• Measurement error: All of the data were self-reported. We could argue that there is a negative correlation between earnings error and observed schooling.


Is College Proximity Exogenous?

• Card proposes college proximity as an exogenous variable. College proximity needs to be related to wages, but only through education.

• If you are poor, the likelihood of attending college increases if you live near one, so proximity is related to education.

• He checked this by looking at the effect of college proximity on predicted education given other demographic variables. The biggest effect was for men with a low chance of continuing their education. (If you live near a college, the cost of higher education is lower, so proximity has a bigger effect on the education outcomes of poorer children.)


Recap

• We’re trying to estimate the effect of schooling on wages.

• Education is our key independent variable that is endogenous.

• Wage (log of wages) is our dependent variable.

• College proximity is our exogenous instrument.


Variables Used in Card analysis

• lwage = log(wages)
• educ = years of schooling, 1976
• exper = age – educ – 6
• expersq = exper squared
• black = 1 if black
• south = 1 if in south, 1976
• smsa = 1 if in metropolitan area, 1976
• reg661-reg668 = 1 for region lived in, 1966
• smsa66 = 1 if in metropolitan area, 1966
• nearc4 = 1 if near 4 year college, 1966


3 Step Process for Replicating Card’s Findings (Card, 1993; Hamersma, 2009)

### Load Stata file ###
library(foreign)                     # provides read.dta() for Stata files
card.data <- read.dta("card.dta")
attach(card.data)                    # lets the models below refer to columns by name
head(card.data)

  id nearc2 nearc4 educ age fatheduc motheduc weight momdad14 sinmom14 step14
1  2      0      0    7  29       NA       NA 158413        1        0      0
2  3      0      0   12  27        8        8 380166        1        0      0
3  4      0      0   12  34       14       12 367470        1        0      0
4  5      1      1   11  27       11       12 380166        1        0      0
5  6      1      1   12  34        8        7 367470        1        0      0
6  7      1      1   12  26        9       12 380166        1        0      0
  reg661 reg662 reg663 reg664 reg665 reg666 reg667 reg668 reg669 south66 black
1      1      0      0      0      0      0      0      0      0       0     1
2      1      0      0      0      0      0      0      0      0       0     0
3      1      0      0      0      0      0      0      0      0       0     0
4      0      1      0      0      0      0      0      0      0       0     0
5      0      1      0      0      0      0      0      0      0       0     0
6      0      1      0      0      0      0      0      0      0       0     0
  smsa south smsa66 wage enroll kww  iq married libcrd14 exper    lwage expersq
1    1     0      1  548      0  15  NA       1        0    16 6.306275     256
2    1     0      1  481      0  35  93       1        1     9 6.175867      81
3    1     0      1  721      0  42 103       1        1    16 6.580639     256
4    1     0      1  250      0  25  88       1        1    10 5.521461     100
5    1     0      1  729      0  34 108       1        0    16 6.591674     256
6    1     0      1  500      0  38  85       1        1     8 6.214608      64
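The derived columns exper, expersq, and lwage can be recomputed from age, educ, and wage. The short check below types in the six rows printed by head(card.data) rather than reading card.dta, so it runs without the data file. The data frame name first_six is made up for this example.

```r
# First six rows of the Card data, copied from the head(card.data) listing.
first_six <- data.frame(
  age  = c(29, 27, 34, 27, 34, 26),
  educ = c(7, 12, 12, 11, 12, 12),
  wage = c(548, 481, 721, 250, 729, 500)
)

# Potential experience: years since leaving school, assuming school starts at age 6.
first_six$exper   <- first_six$age - first_six$educ - 6
first_six$expersq <- first_six$exper^2
first_six$lwage   <- log(first_six$wage)  # natural log of wage

first_six
```

The recomputed exper, expersq, and lwage columns agree with the values in the listing.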


Step 1: OLS Estimate without Instrument

We find education is statistically significant, but we can make the case that it is endogenous.

m1 <- lm(lwage ~ educ + exper + expersq + black + south + smsa + reg661 + reg662 + reg663 + reg664 + reg665 + reg666 + reg667 + reg668 + smsa66)
summary(m1)

Call:
lm(formula = lwage ~ educ + exper + expersq + black + south +
    smsa + reg661 + reg662 + reg663 + reg664 + reg665 + reg666 +
    reg667 + reg668 + smsa66)

Residuals:
     Min       1Q   Median       3Q      Max
-1.62326 -0.22141  0.02001  0.23932  1.33340

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.7393766  0.0715282  66.259  < 2e-16 ***
educ         0.0746933  0.0034983  21.351  < 2e-16 ***
exper        0.0848320  0.0066242  12.806  < 2e-16 ***
expersq     -0.0022870  0.0003166  -7.223 6.41e-13 ***
black       -0.1990123  0.0182483 -10.906  < 2e-16 ***
south       -0.1479550  0.0259799  -5.695 1.35e-08 ***
smsa         0.1363845  0.0201005   6.785 1.39e-11 ***
reg661      -0.1185698  0.0388301  -3.054 0.002281 **
reg662      -0.0222026  0.0282575  -0.786 0.432092
reg663       0.0259703  0.0273644   0.949 0.342670
reg664      -0.0634942  0.0356803  -1.780 0.075254 .
reg665       0.0094551  0.0361174   0.262 0.793503
reg666       0.0219476  0.0400984   0.547 0.584182
reg667      -0.0005887  0.0393793  -0.015 0.988073
reg668      -0.1750058  0.0463394  -3.777 0.000162 ***
smsa66       0.0262417  0.0194477   1.349 0.177327
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3723 on 2994 degrees of freedom
Multiple R-squared: 0.2998, Adjusted R-squared: 0.2963
F-statistic: 85.48 on 15 and 2994 DF, p-value: < 2.2e-16


What do we know so far?

• Education is the key variable and is statistically significant, but education is endogenous: the model does not account for individual ability.

• Card uses college proximity as an instrument to correct for the endogeneity. College proximity is correlated with wages, but only through education.

• We want to check whether college proximity is correlated with education.


Step 2: Is college proximity an exogenous determinant of wages?

First stage: regress education on college proximity (nearc4) and the controls.

m2 <- lm(educ ~ exper + expersq + black + south + smsa + reg661 + reg662 + reg663 + reg664 + reg665 + reg666 + reg667 + reg668 + smsa66 + nearc4)
summary(m2)

Call:
lm(formula = educ ~ exper + expersq + black + south + smsa +
    reg661 + reg662 + reg663 + reg664 + reg665 + reg666 + reg667 +
    reg668 + smsa66 + nearc4)

Residuals:
     Min       1Q   Median       3Q      Max
-7.54513 -1.36996 -0.09103  1.27836  6.23847

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 16.8485239  0.2111222  79.805  < 2e-16 ***
exper       -0.4125334  0.0336996 -12.241  < 2e-16 ***
expersq      0.0008686  0.0016504   0.526 0.598728
black       -0.9355287  0.0937348  -9.981  < 2e-16 ***
south       -0.0516126  0.1354284  -0.381 0.703152
smsa         0.4021825  0.1048112   3.837 0.000127 ***
reg661      -0.2102710  0.2024568  -1.039 0.299076
reg662      -0.2889073  0.1473395  -1.961 0.049992 *
reg663      -0.2382099  0.1426357  -1.670 0.095012 .
reg664      -0.0930890  0.1859827  -0.501 0.616742
reg665      -0.4828875  0.1881872  -2.566 0.010336 *
reg666      -0.5130857  0.2096352  -2.448 0.014442 *
reg667      -0.4270887  0.2056208  -2.077 0.037880 *
reg668       0.3136204  0.2416739   1.298 0.194490
smsa66       0.0254805  0.1057692   0.241 0.809644
nearc4       0.3198989  0.0878638   3.641 0.000276 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.941 on 2994 degrees of freedom
Multiple R-squared: 0.4771, Adjusted R-squared: 0.4745
F-statistic: 182.1 on 15 and 2994 DF, p-value: < 2.2e-16


Step 2: Is college proximity an exogenous determinant of wages?

Reduced form: regress log wages on college proximity (nearc4) and the controls.

m3 <- lm(lwage ~ exper + expersq + black + south + smsa + reg661 + reg662 + reg663 + reg664 + reg665 + reg666 + reg667 + reg668 + smsa66 + nearc4)
summary(m3)

Call:
lm(formula = lwage ~ exper + expersq + black + south + smsa +
    reg661 + reg662 + reg663 + reg664 + reg665 + reg666 + reg667 +
    reg668 + smsa66 + nearc4)

Residuals:
     Min       1Q   Median       3Q      Max
-1.57387 -0.25161  0.01483  0.27229  1.38522

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.9896107  0.0434375 137.890  < 2e-16 ***
exper        0.0540214  0.0069336   7.791 9.07e-15 ***
expersq     -0.0022207  0.0003396  -6.540 7.21e-11 ***
black       -0.2698014  0.0192855 -13.990  < 2e-16 ***
south       -0.1514588  0.0278638  -5.436 5.90e-08 ***
smsa         0.1646968  0.0215645   7.637 2.96e-14 ***
reg661      -0.1354657  0.0416546  -3.252  0.00116 **
reg662      -0.0450389  0.0303145  -1.486  0.13746
reg663       0.0091190  0.0293467   0.311  0.75602
reg664      -0.0701587  0.0382651  -1.833  0.06683 .
reg665      -0.0250439  0.0387187  -0.647  0.51780
reg666      -0.0123840  0.0431315  -0.287  0.77404
reg667      -0.0294058  0.0423056  -0.695  0.48706
reg668      -0.1496489  0.0497234  -3.010  0.00264 **
smsa66       0.0218819  0.0217616   1.006  0.31472
nearc4       0.0420679  0.0180776   2.327  0.02003 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3993 on 2994 degrees of freedom
Multiple R-squared: 0.1947, Adjusted R-squared: 0.1907
F-statistic: 48.25 on 15 and 2994 DF, p-value: < 2.2e-16
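With a single instrument, the two nearc4 coefficients from Step 2 already determine the IV estimate: divide the effect of proximity on log wages (m3) by the effect of proximity on education (m2). The numbers below are copied from the printed outputs, and the object names are made up for this check.

```r
# nearc4 coefficients copied from the m2 and m3 summaries.
first_stage  <- 0.3198989  # effect of college proximity on years of education (m2)
reduced_form <- 0.0420679  # effect of college proximity on log wages (m3)

# Implied return to a year of education: wage effect per unit of education effect.
iv_ratio <- reduced_form / first_stage
round(iv_ratio, 4)  # about 0.1315
```

This ratio agrees with the educ coefficient that ivreg reports in Step 3 (0.1315038).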


Step 3: Does education affect wages when college proximity is used as the instrument?

library(AER)
m4 <- ivreg(lwage ~ educ + exper + expersq + black + south + smsa + reg661 + reg662 + reg663 + reg664 + reg665 + reg666 + reg667 + reg668 + smsa66 |
            nearc4 + exper + expersq + black + south + smsa + reg661 + reg662 + reg663 + reg664 + reg665 + reg666 + reg667 + reg668 + smsa66)
summary(m4)

Call:
ivreg(formula = lwage ~ educ + exper + expersq + black + south +
    smsa + reg661 + reg662 + reg663 + reg664 + reg665 + reg666 +
    reg667 + reg668 + smsa66 | nearc4 + exper + expersq + black +
    south + smsa + reg661 + reg662 + reg663 + reg664 + reg665 +
    reg666 + reg667 + reg668 + smsa66)

Residuals:
     Min       1Q   Median       3Q      Max
-1.83164 -0.24075  0.02428  0.25208  1.42760

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.7739651  0.9349470   4.037 5.56e-05 ***
educ         0.1315038  0.0549637   2.393 0.016793 *
exper        0.1082711  0.0236586   4.576 4.92e-06 ***
expersq     -0.0023349  0.0003335  -7.001 3.12e-12 ***
black       -0.1467757  0.0538999  -2.723 0.006504 **
south       -0.1446715  0.0272846  -5.302 1.23e-07 ***
smsa         0.1118083  0.0316620   3.531 0.000420 ***
reg661      -0.1078142  0.0418137  -2.578 0.009972 **
reg662      -0.0070465  0.0329073  -0.214 0.830460
reg663       0.0404445  0.0317806   1.273 0.203252
reg664      -0.0579172  0.0376059  -1.540 0.123640
reg665       0.0384577  0.0469387   0.819 0.412671
reg666       0.0550887  0.0526597   1.046 0.295587
reg667       0.0267580  0.0488287   0.548 0.583735
reg668      -0.1908912  0.0507113  -3.764 0.000170 ***
smsa66       0.0185311  0.0216086   0.858 0.391193
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3883 on 2994 degrees of freedom
Multiple R-Squared: 0.2382, Adjusted R-squared: 0.2343
Wald test: 51.01 on 15 and 2994 DF, p-value: < 2.2e-16


Compare OLS to IV Estimator

                          OLS (lm)     IV (ivreg, instrument: nearc4)
educ estimate             0.0746933    0.1315038
educ std. error           0.0034983    0.0549637
educ t value              21.351       2.393
educ Pr(>|t|)             < 2e-16      0.016793
Residual standard error   0.3723       0.3883
Multiple R-squared        0.2998       0.2382

The full outputs are the ones printed under Step 1 (OLS) and Step 3 (IV).

Effect of education increased from 0.075 to 0.131. Card (1993): “The implied instrumental variables estimates of the earnings gain per year of additional schooling at 10-14% are substantially above the earnings gains estimated by a conventional ordinary least squares procedure (7.3%)”
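Because the dependent variable is log wages, each coefficient is a change in log points. Converting to percent with exp(b) - 1 makes the OLS and IV estimates easier to compare. The sketch below only uses the two educ coefficients printed in the outputs. The object names are made up, and the percentages describe this replication, not the figures quoted from Card.

```r
# educ coefficients copied from the OLS (m1) and IV (m4) summaries.
ols_educ <- 0.0746933
iv_educ  <- 0.1315038

# Percent change in wages per additional year of schooling implied by each model.
pct <- 100 * (exp(c(OLS = ols_educ, IV = iv_educ)) - 1)
round(pct, 1)

# How much larger the IV estimate is relative to OLS.
ratio <- iv_educ / ols_educ
round(ratio, 2)
```

In this replication the implied gains are roughly 7.8% (OLS) and 14.1% (IV) per year of schooling.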


Example 2

• Does cigarette smoking have an effect on child birth weight (Wooldridge, 2002)?
  – What is the dependent variable?
  – What is the independent variable?
  – Do we have an endogeneity problem?
  – This example uses cigarette prices as the exogenous variable, or instrument, in the analysis.


Insert Data into R

bwght <- read.dta("bwght.dta")   # read.dta() comes from the foreign package loaded earlier
head(bwght)

  faminc cigtax cigprice bwght fatheduc motheduc parity male white cigs
1   13.5   16.5    122.3   109       12       12      1    1     1    0
2    7.5   16.5    122.3   133        6       12      2    1     0    0
3    0.5   16.5    122.3   129       NA       12      2    0     0    0
4   15.5   16.5    122.3   126       12       12      2    1     0    0
5   27.5   16.5    122.3   134       14       12      2    1     1    0
6    7.5   16.5    122.3   118       12       14      6    1     0    0
    lbwght bwghtlbs packs    lfaminc
1 4.691348   6.8125     0  2.6026897
2 4.890349   8.3125     0  2.0149031
3 4.859812   8.0625     0 -0.6931472
4 4.836282   7.8750     0  2.7408400
5 4.897840   8.3750     0  3.3141861
6 4.770685   7.3750     0  2.0149031

attach(bwght)


Step 1: What is the first regression analysis we should calculate?


Step 2: Check the instrument

Are cigarette prices correlated with number of cigarettes smoked per day while pregnant?


What did we find?


Other Examples of IV (Angrist & Krueger, 2001)


IV in Educational Research

• Tutoring voucher system
• Remediation programs
• Schooling effects
• Effects of absences on achievement
• Effects of attendance on earnings
• Effects of class size on achievement
• Effects of hours spent in algebra on math achievement


References

Angrist, J. (1990). Lifetime earnings and the Vietnam era draft lottery: Evidence from Social Security administrative records. American Economic Review, 80(3), 313-336.

Angrist, J. D. & Krueger, A. B. (2001). Instrumental variables and the search for identification: From supply and demand to natural experiments. Journal of Economic Perspectives, 15(4), 69-85.

Card, D. (1993). Using geographic variation in college proximity to estimate the return to schooling. NBER Working Paper Series, 4483, 1-37 Retrieved from ??.

Bauchet, J. (2009). Of instrumental variables and sample definition. Financial Access Initiative. Retrieved November 1, 2010, from http://financialaccess.org/node/2042.

Hamersma, S. (2009). Homework # 2: ECO 7427 answer key. Retrieved from http://bear.warrington.ufl.edu/hamersma/Teaching/ECO7427/Homework/Homework2-AK.pdf

Reardon, S. (2010, March). Using instrumental variables in educational research. Presentation at Society for Research on Educational Effectiveness. Retrieved from http://www.sree.org/conferences/2010/program/

Shepherd, B. (2008). Session 1: Dealing with endogeneity. Retrieved from http://www.unescap.org/tid/artnet/mtg/gravity09_tues3.pdf

Stock, J. H. & Trebbi, F. (2003). Retrospective: Who invented instrumental variable regression? Journal of Economic Perspectives, 17(3), 177-194.

Wilson, B. (2009). Kobe and reverse causality. Brooks Wilson’s Economics Blog. Retrieved November 1, 2010, from http://drbseconomicblog.blogspot.com/2009/01/kobe-and-reverse-causality.html.

Wooldridge, J. (2002). Introductory econometrics: A modern approach. (2nd Ed?) South-Western College Pub, City?.

Wright, P. G. (1928). The tariff on animal and vegetable oils. New York: Macmillan.