03.Econ427_2016_SW6
Transcript of 03.Econ427_2016_SW6
-
8/19/2019 03.Econ427_2016_SW6
1/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
UBC Okanagan
Dr. Mohsen JavdaniWinter 1, 2015/2016
Eon!2"#
Eono$etris
%&ro$ 'tok and Watson(s Introductionto Econometrics)
Cha*ter 6
6+1
-
8/19/2019 03.Econ427_2016_SW6
2/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Ot-ine
1. Omitted variable bias
2. Causality and regression analysis
3. ultiple regression and O!"
#. easures o$ $it
%. "ampling distribution o$ the O!" estimator
6+2
-
8/19/2019 03.Econ427_2016_SW6
3/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
O$itted aria-e Bias%'W 'etion 6.1)
&he error u arises be'ause o$ $a'tors( orvariables( that in$luen'e Y but are notin'luded in the regression $un'tion. &here
are al)ays omitted variables.
"ometimes( the omission o$ those variables'an lead to bias in the O!" estimator.
6+
-
8/19/2019 03.Econ427_2016_SW6
4/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Omitted variable bias, ctd.
&he bias in the O!" estimator that o''urs as a resulto$ an omitted $a'tor( or variable( is 'alled omittedvariable bias. *or omitted variable bias to o''ur( theomitted variable +Z , must satis$y t)o 'onditions
&he t)o 'onditions $or omitted variable bias
1. Z is a determinant o$ Y i.e. Z is part o$ u/ and
2. Z is 'orrelated )ith the regressor X i.e. 'orrZ ( X / 0/
Both conditions must hold for the omission of Z toresult in omitted variable bias.
6+!
-
8/19/2019 03.Econ427_2016_SW6
5/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Omitted variable bias, ctd.
n the test s'ore eample
1. 4nglish language ability )hether the student has4nglish as a se'ond language/ plausibly a$$e'tsstandardi5ed test s'ores Z is a determinant o$Y .
2. mmigrant 'ommunities tend to be less a$$luentand thus have smaller s'hool budgets and higherSTR Z is 'orrelated )ith X .
A''ordingly( is biased. What is the dire'tion o$this bias6
– What does common sense suggest?
– $ 'ommon sense $ails you( there is a $ormula7
6+5
-
8/19/2019 03.Econ427_2016_SW6
6/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Omitted variable bias, ctd.
A $ormula $or omitted variable bias re'all thee8uation(
9 β1 : :
)here v i : X i 9 /ui ; X i 9 μ X /ui .
-
8/19/2019 03.Econ427_2016_SW6
7/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Omitted variable bias, ctd.
-
8/19/2019 03.Econ427_2016_SW6
8/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
he o$itted varia-e ias or$-a#
• $ an omitted variable Z is both
1. a determinant o$ Y that is( it is 'ontained in u/ and
2. 'orrelated )ith X (then ρ Xu 0 and the O!" estimator is biased and is not
'onsistent.
.
. *or eample( distri'ts )ith $e) 4"! students 1/ do better on
standardi5ed tests and 2/ have smaller 'lasses biggerbudgets/( so ignoring the e$$e't o$ having many 4"!students $a'tor )ould result in overstating the 'lass si5ee$$e't. s this is actuall! going on in the "# data6
6+3
β 1 +→ p
σ u
σ X
ρ Xu
-
8/19/2019 03.Econ427_2016_SW6
9/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
• Bistri'ts )ith $e)er 4nglish !earners have higher test s'ores
• Bistri'ts )ith lo)er per'ent E$ %ctE$/ have smaller 'lasses
• Among distri'ts )ith 'omparable %ctE$( the e$$e't o$ 'lasssi5e is small re'all overall +test s'ore gap, : .#/
6+4
-
8/19/2019 03.Econ427_2016_SW6
10/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Casa-it and regression ana-sis
• &he test s'oreDSTR D$ra'tion 4nglish !earnerseample sho)s that( i$ an omitted variablesatis$ies the t)o 'onditions $or omitted variablebias( then the O!" estimator in the regression
omitting that variable is biased and in'onsistent."o( even i$ n is large( )ill not be 'lose to &1.
• &his raises a deeper 8uestion ho) do )e de$ine
&16 &hat is( )hat pre'isely do )e )ant to estimate
)hen )e run a regression6
6+10
1β̂
-
8/19/2019 03.Econ427_2016_SW6
11/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
What *reise- do e ant to esti$atehen e rn a regression7
&here are at least/ three possible ans)ers to this8uestion
1. We )ant to estimate the slope o$ a line through a
s'atterplot as a simple summary o$ the data to)hi'h )e atta'h no substantive meaning.
This can be useful at times' but isn(t ver! interestingintellectuall! and isn(t )hat this course is about.
6+11
-
8/19/2019 03.Econ427_2016_SW6
12/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
2. We )ant to maEe $ore'asts( or predi'tions( o$ thevalue o$ Y $or an entity not in the data set( $or)hi'h )e Eno) the value o$ X .
*orecasting is an im+ortant ,ob for economists'and e-cellent forecasts are +ossible usingregression methods )ithout needing to no)causal effects. We )ill return to forecasting laterin the course.
6+12
-
8/19/2019 03.Econ427_2016_SW6
13/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
3. We )ant to estimate the 'ausal e$$e't on Y o$ a'hange in X .
This is )h! )e are interested in the class si/eeffect. Su++ose the school board decided to cutclass si/e b! 2 students +er class. What )ould bethe effect on test scores? This is a causal 0uestion)hat is the causal effect on test scores of STR?so )e need to estimate this causal effect. E-ce+t)hen )e discuss forecasting' the aim of thiscourse is the estimation of causal effects using
regression methods.
6+1
-
8/19/2019 03.Econ427_2016_SW6
14/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
What, *reise-, is a asa- eet7
• +Causality, is a 'omple 'on'eptF
• n this 'ourse( )e taEe a pra'ti'al approa'h
to de$ining 'ausality
8 asa- eet is deined to e the
eet $easred in an idea-rando$i9ed ontro--ed e:*eri$ent.
5+1!
-
8/19/2019 03.Econ427_2016_SW6
15/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
;dea-
-
8/19/2019 03.Econ427_2016_SW6
16/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Bak to -ass si9e#
magine an ideal randomi5ed 'ontrolled eperiment $ormeasuring the e$$e't on Test Score o$ redu'ing STR7
• n that eperiment( students )ould be randomly assigned to'lasses( )hi'h )ould have di$$erent si5es.
• @e'ause they are randomly assigned( all student'hara'teristi's and thus ui / )ould be distributed
independently o$ STRi .
• &hus( E ui HSTRi / : 0 9 that is( !"A =1 holds in a randomi5ed
'ontrolled eperiment.
6+16
-
8/19/2019 03.Econ427_2016_SW6
17/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
=o does or oservationa- data dierro$ this idea-7
• &he treatment is not randomly assigned
• Consider %ctE$ 9 per'ent 4nglish learners 9 in thedistri't. t plausibly satis$ies the t)o 'riteria $oromitted variable bias Z : %ctE$ is1. a determinant o$ Y and
2. 'orrelated )ith the regressor X .
• &hus( the +'ontrol, and +treatment, groups di$$erin a systemati' )ay( so 'orrSTR(%ctE$/ 0
6+1"
-
8/19/2019 03.Econ427_2016_SW6
18/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
• Iandomi5ation J 'ontrol group means that anydi$$eren'es bet)een the treatment and 'ontrolgroups are random 9 not systemati'ally related to
the treatment
• We 'an eliminate the di$$eren'e in %ctE$ bet)eenthe large 'ontrol/ and small treatment/ groupsby eamining the e$$e't o$ 'lass si5e amongdistri'ts )ith the same %ctE$.
– $ the only systemati' di$$eren'e bet)een the large andsmall 'lass si5e groups is in %ctE$( then )e are ba'E tothe randomi5ed 'ontrolled eperiment 9 )ithin ea'h %ctE$
group.
– &his is one )ay to +'ontrol, $or the e$$e't o$ %ctE$ )henestimating the e$$e't o$ STR.
6+13
-
8/19/2019 03.Econ427_2016_SW6
19/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Return to omitted variable bias
hree as to overo$e o$itted varia-e ias
1. Iun a randomi5ed 'ontrolled eperiment in )hi'h treatmentSTR/ is randomly assigned then %ctE$ is still adeterminant o$ TestScore( but %ctE$ is un'orrelated )ithSTR. This solution to 34 bias is rarel! feasible./
2. Adopt the +'ross tabulation, approa'h( )ith $iner gradationso$ STR and %ctE$ 9 )ithin ea'h group( all 'lasses have thesame %ctE$( so )e 'ontrol $or %ctE$ 5ut soon !ou )ill runout of data' and )hat about other determinants lie famil!
income and +arental education6/3.
-
8/19/2019 03.Econ427_2016_SW6
20/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
he >o*-ation M-ti*-e
-
8/19/2019 03.Econ427_2016_SW6
21/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
;nter*retation o oeiients in $-ti*-eregression
Y i : β0 J β1 X 1i J β2 X 2i J ui ( i : 1(7(n
Consider 'hanging X 1 by Δ X 1 )hile holding X 2
'onstant
Population regression line before the 'hange
Y : β0 J β1 X 1 J β2 X 2
Population regression line( after the 'hange
Y J ΔY : β0 J β1 X 1 J Δ X 1/ J β2 X 2
6+21
-
8/19/2019 03.Econ427_2016_SW6
22/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Before Y : β0 J β1 X 1 J Δ X 1/ J β 2 X 2
After Y J ΔY : β0 J β1 X 1 J Δ X 1/ J β2 X 2
Difference ΔY : β1Δ X 1
So:
β1 : ( ho-ding X
2 onstant
β2 : ( ho-ding X 1 onstant
β0 : predi'ted value o$ Y )hen X 1 : X 2 : 0.
6+22
∆Y
∆ X 1
∆Y ∆ X
2
-
8/19/2019 03.Econ427_2016_SW6
23/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
he O?' Esti$ator in M-ti*-e
-
8/19/2019 03.Econ427_2016_SW6
24/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
E:a$*-e# the Ca-iornia test sore data
Iegression o$ TestScore against STR
: KLM.L 9 2.2MNSTR
o) in'lude per'ent 4nglish !earners in the distri't
%ctE$/
: KMK.0 9 1.10NSTR 9 0.K%%ctE$
• What happens to the 'oe$$i'ient on STR6• What STR( %ctE$/ : 0.1L/
6+2!
-
8/19/2019 03.Econ427_2016_SW6
25/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
M-ti*-e regression in '88
reg testscr str pctel, robust;
Regression with robust standard errors Number of obs = 420
F( 2, 417) = 223.82
Prob > F = 0.0000
R-squared = 0.4264
Root MSE = 14.464
------------------------------------------------------------------------------
| Robust
testscr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
str | -1.101296 .4328472 -2.54 0.011 -1.95213 -.2504616
pctel | -.6497768 .0310318 -20.94 0.000 -.710775 -.5887786
_cons | 686.0322 8.728224 78.60 0.000 668.8754 703.189
------------------------------------------------------------------------------
: KMK.0 9 1.10NSTR 9 0.K%%ctE$
6ore on this +rintout later7
6+25
-
8/19/2019 03.Econ427_2016_SW6
26/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Measres o &it or M-ti*-e
-
8/19/2019 03.Econ427_2016_SW6
27/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
SER and RMSE
As in regression )ith a single regressor( theSER and the R6SE are measures o$ thespread o$ the Y s around the regression line
SER :
R6SE :
6+2"
2
1
1ˆ
1
n
i
i
un k =− −
∑
2
1
1 ˆn
i
i
un =∑
-
8/19/2019 03.Econ427_2016_SW6
28/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
R2 and %ad@sted R2)
&he R2 is the $ra'tion o$ the varian'e eplained 9same de$inition as in regression )ith a singleregressor
R2 : : (
)here ESS : ( SSR : ( TSS : .
&he R2 al)ays in'reases )hen you add another regressor)h!? 9 a bit o$ a problem $or a measure o$ +$it,
6+23
R2
ESS TSS
1− SSRTSS 2
1
ˆ ˆ( )n
i
i
Y Y =
−∑ 21
ˆn
i
i
u=∑
(Y
i − Y )2
i=1
n
∑
-
8/19/2019 03.Econ427_2016_SW6
29/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
R2 and ctd.
&he the +adGusted R2,/ 'orre'ts this problem by +penali5ing, you $or in'luding another regressor 9the does not ne'essarily in'rease )hen you addanother regressor.
#d,usted R2 A
ote that R2( ho)ever i$ n is large the t)o )illbe very 'lose.
6+24
R2
1−n−1
n − k −1
SSR
TSS R2
R2
R2
R2
-
8/19/2019 03.Econ427_2016_SW6
30/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Measures of fit, ctd.
&est s'ore eample
1/ : KLM.L 9 2.2MNSTR(
R2 : .0%( SER : 1M.K
2/ : KMK.0 9 1.10NSTR 9 0.K%%ctE$(
R2 : .#2K( A .#2#, SER : 1#.%
• What 8 +recisel! 8 does this tell !ou about the fit of regression
2 com+ared )ith regression 1?
• Wh! are the R2 and the so close in 2?
6+0
R2
R2
-
8/19/2019 03.Econ427_2016_SW6
31/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
he ?east 'ares 8ss$*tions orM-ti*-e
-
8/19/2019 03.Econ427_2016_SW6
32/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
8ss$*tion 1# the onditiona- $ean o u given the in-ded X s is 9ero.
E uH X 1 : - 1(7( X : - / : 0
• &his has the same interpretation as in regression )ith asingle regressor.
•
*ailure o$ this 'ondition leads to omitted variable bias(spe'i$i'ally( i$ an omitted variable
1. belongs in the e8uation so is in u/ and
2. is 'orrelated )ith an in'luded X
. then this 'ondition $ails and there is OS bias.
. &he best solution( i$ possible( is to in'lude the omittedvariable in the regression.
. A se'ond( related solution is to in'lude a variable that'ontrols $or the omitted variable dis'ussed in Ch. /
6+2
-
8/19/2019 03.Econ427_2016_SW6
33/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
8ss$*tion 2# % X 1i ,, X i ,! i ), i A1,,n, are i.i.d.
&his is satis$ied automati'ally i$ the data are 'olle'ted
by simple random sampling.
8ss$*tion # -arge ot-iers are rare %initeorth $o$ents)
&his is the same assumption as )e had be$ore $or asingle regressor. As in the 'ase o$ a single regressor(O!" 'an be sensitive to large outliers( so you need to'he'E your data s'atterplotsF/ to maEe sure there areno 'ra5y values typos or 'oding errors/.
6+
-
8/19/2019 03.Econ427_2016_SW6
34/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
8ss$*tion !# here is no *eret $-tio--inearit"erfect multicollinearit# is )hen one o$ the regressors is anea't linear $un'tion o$ the other regressors.
E$am%le# "uppose you a''identally in'lude STR t)i'eregress testscr str str, robust
Regression with robust standard errors Number of obs = 420
F( 1, 418) = 192!
"rob # F = 00000
R$s%uared = 00&12
Root ' = 18&81
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
* Robust
testscr * +oef td rr t "#*t* 9&- +onf .nter/a
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
str * $2239808 &194892 $49 0000 $0094& $12&8!31 str * (dro55ed)
6cons * !989 10!4! !344 0000 !38&!02 3190&3
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
6+!
-
8/19/2019 03.Econ427_2016_SW6
35/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
"erfect multicollinearit# is )hen one o$ theregressors is an ea't linear $un'tion o$ theother regressors.
• n the previous regression( β1 is the e$$e't on
TestScore o$ a unit 'hange in STR( holding STR 'onstant 666/
• We )ill return to per$e't and imper$e't/multi'ollinearity shortly( )ith more eamples7
•
• With these least s0uares assum+tions in hand'
)e no) can derive the sam+ling distribution of ( (7( .
6+5
1β̂ ˆ
k β 2β̂
-
8/19/2019 03.Econ427_2016_SW6
36/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
he 'a$*-ing Distrition o the O?'Esti$ator %'W 'etion 6.6)
-
8/19/2019 03.Econ427_2016_SW6
37/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
M-tio--inearit, >eret and ;$*eret%'W 'etion 6.")
"erfect multicollinearit# is )hen one o$ the regressors is anea't linear $un'tion o$ the other regressors.
"ome more eamples o$ per$e't multi'ollinearity
1. &he eample $rom be$ore you in'lude STR t)i'e(2. Iegress TestScore on a 'onstant( ;( and 5( )here ;i : 1
i$ STR T 20( : 0 other)ise 5i : 1 i$ STR U20( : 0
other)ise( so 5i : 1 9 ;i and there is per$e't
multi'ollinearity.
3. Would there be per$e't multi'ollinearity i$ the inter'ept'onstant/ )ere e'luded $rom this regression6 &hiseample is a spe'ial 'ase o$7
6+"
-
8/19/2019 03.Econ427_2016_SW6
38/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
he d$$ varia-e tra*
"uppose you have a set o$ multiple binary dummy/ variables( )hi'hare mutually e'lusive and ehaustive 9 that is( there are multiple'ategories and every observation $alls in one and only one 'ategory*reshmen( "ophomores( Vuniors( "eniors( Other/. $ you in'lude allthese dummy variables and a 'onstant( you )ill have per$e't
multi'ollinearity 9 this is sometimes 'alled the dumm# variabletra%.
• Wh! is there +erfect multicollinearit! here6
• Solutions to the dumm! variable tra+
1. Omit one o$ the groups e.g. "enior/( or
2. Omit the inter'ept. What are the im+lications of 1 or 2 for the inter+retation of the
coefficients?
6+3
-
8/19/2019 03.Econ427_2016_SW6
39/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
"erfect multicollinearit#, ctd.
• Per$e't multi'ollinearity usually re$le'ts a mistaEein the de$initions o$ the regressors( or an oddity inthe data
•
$ you have per$e't multi'ollinearity( yourstatisti'al so$t)are )ill let you Eno) 9 either by'rashing or giving an error message or by +dropping, one o$ the variables arbitrarily
•&he solution to per$e't multi'ollinearity is tomodi$y your list o$ regressors so that you nolonger have per$e't multi'ollinearity.
6+4
-
8/19/2019 03.Econ427_2016_SW6
40/41
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Im%erfect multicollinearit#
mper$e't and per$e't multi'ollinearity are 8uite di$$erent despite thesimilarity o$ the names.
Im%erfect multicollinearit# o''urs )hen t)o or more regressors arevery highly 'orrelated.
• Why the term +multi'ollinearity,6 $ t)o regressors are very highly'orrelated( then their s'atterplot )ill pretty mu'h looE liEe astraight line 9 they are +'o-linear, 9 but unless the 'orrelation isea'tly 1( that 'ollinearity is imper$e't.
6+!0
-
8/19/2019 03.Econ427_2016_SW6
41/41
Im%erfect multicollinearit#, ctd.
mper$e't multi'ollinearity implies that one or more o$ theregression 'oe$$i'ients )ill be impre'isely estimated.
• &he idea the 'oe$$i'ient on X 1 is the e$$e't o$ X 1 holding X 2
'onstant but i$ X 1 and X 2 are highly 'orrelated( there is very
little variation in X 1 on'e X 2 is held 'onstant 9 so the datadonQt 'ontain mu'h in$ormation about )hat happens )hen X 1
'hanges but X 2 doesnQt. $ so( the varian'e o$ the O!"
estimator o$ the 'oe$$i'ient on X 1 )ill be large.
•
mper$e't multi'ollinearity 'orre'tly/ results in largestandard errors $or one or more o$ the O!" 'oe$$i'ients.
• &he math6 "ee "W( App. K.2
&e$t to%ic: h#%othesis tests and confidence intervals'
6