Christopher Dougherty EC220 - Introduction to econometrics (chapter 6) Slideshow: variable misspecification ii: inclusion of an irrelevant variable Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 6). [Teaching Resource] © 2012 The Author This version available at: http://learningresources.lse.ac.uk/132/ Available in LSE Learning Resources Online: May 2012 This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms. http://creativecommons.org/licenses/by-sa/3.0/ http://learningresources.lse.ac.uk/


VARIABLE MISSPECIFICATION II: INCLUSION OF AN IRRELEVANT VARIABLE

In this sequence we will investigate the consequences of including an irrelevant variable in a regression model.

Consequences of variable misspecification

True model $Y = \beta_1 + \beta_2 X_2 + u$, fitted model $\hat{Y} = b_1 + b_2 X_2$: correct specification, no problems.

True model $Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$, fitted model $\hat{Y} = b_1 + b_2 X_2$: coefficients are biased (in general). Standard errors are invalid.

True model $Y = \beta_1 + \beta_2 X_2 + u$, fitted model $\hat{Y} = b_1 + b_2 X_2 + b_3 X_3$: coefficients are unbiased (in general), but inefficient. Standard errors are valid (in general).

True model $Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$, fitted model $\hat{Y} = b_1 + b_2 X_2 + b_3 X_3$: correct specification, no problems.

The effects are different from those of omitted variable misspecification. In this case the coefficients in general remain unbiased, but they are inefficient. The standard errors remain valid, but are needlessly large.

These results can be demonstrated quickly. Suppose that the true model is

$$Y = \beta_1 + \beta_2 X_2 + u$$

and that the fitted model is

$$\hat{Y} = b_1 + b_2 X_2 + b_3 X_3$$

Rewrite the true model adding X3 as an explanatory variable, with a coefficient of 0:

$$Y = \beta_1 + \beta_2 X_2 + 0 X_3 + u$$

Now the true model and the fitted model coincide. Hence b2 will be an unbiased estimator of β2 and b3 will be an unbiased estimator of 0.

However, the variance of b2 will be larger than it would have been if the correct simple regression had been run because it includes the factor $1 / (1 - r^2)$, where r is the correlation between X2 and X3:

$$\sigma_{b_2}^2 = \frac{\sigma_u^2}{\sum_i (X_{2i} - \bar{X}_2)^2} \times \frac{1}{1 - r_{X_2, X_3}^2}$$


The estimator b2 using the multiple regression model will therefore be less efficient than the alternative using the simple regression model.
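The two claims so far (b2 stays unbiased, but its variance is inflated) can be checked with a small Monte Carlo experiment. The following is only a sketch under stated assumptions: the sample size, parameter values, correlation of about 0.8 between the regressors, and all function names are illustrative choices, not taken from the slides. It uses only the Python standard library and the textbook closed-form expressions for the slope in a simple regression and in a regression with two explanatory variables.

```python
import random
import statistics


def cross(a, b):
    """Sum of cross-products of deviations from the sample means."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))


def slope_simple(y, x2):
    """b2 from the simple regression of Y on X2."""
    return cross(x2, y) / cross(x2, x2)


def slope_multiple(y, x2, x3):
    """b2 from the multiple regression of Y on X2 and X3."""
    s22, s33, s23 = cross(x2, x2), cross(x3, x3), cross(x2, x3)
    s2y, s3y = cross(x2, y), cross(x3, y)
    return (s2y * s33 - s3y * s23) / (s22 * s33 - s23 ** 2)


def simulate(n=50, reps=2000, beta1=1.0, beta2=2.0, rho=0.8, seed=12345):
    """True model: Y = beta1 + beta2*X2 + u, so X3 is irrelevant by construction."""
    rng = random.Random(seed)
    # Regressors are drawn once and held fixed across replications.
    x2 = [rng.gauss(0, 1) for _ in range(n)]
    x3 = [rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for a in x2]
    r = cross(x2, x3) / (cross(x2, x2) * cross(x3, x3)) ** 0.5
    simple, multiple = [], []
    for _ in range(reps):
        y = [beta1 + beta2 * a + rng.gauss(0, 1) for a in x2]
        simple.append(slope_simple(y, x2))
        multiple.append(slope_multiple(y, x2, x3))
    return {
        "r": r,
        "mean_simple": statistics.fmean(simple),
        "mean_multiple": statistics.fmean(multiple),
        "var_ratio": statistics.variance(multiple) / statistics.variance(simple),
        "theory_ratio": 1 / (1 - r ** 2),
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Both means should be close to the assumed true value of 2, and the ratio of the simulated variances should be close to 1 / (1 - r^2) evaluated at the sample correlation, up to simulation noise.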


The intuitive reason for this is that the simple regression model exploits the information that X3 should not be in the regression, while with the multiple regression model you find this out from the regression results.


The standard errors remain valid, because the model is formally correctly specified, but they will tend to be larger than those obtained in a simple regression, reflecting the loss of efficiency.


These are the results in general. Note that if X2 and X3 happen to be uncorrelated, there will be no loss of efficiency after all.
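To give a sense of scale, the efficiency loss can be written as the ratio of the variance of b2 in the multiple regression to its variance in the correctly specified simple regression. This is a sketch that relies on the standard simple-regression variance $\sigma_u^2 / \sum_i (X_{2i} - \bar{X}_2)^2$; the values of r used below are illustrative assumptions, not figures from the slides.

$$\frac{\sigma_{b_2}^2 \ \text{(multiple regression)}}{\sigma_{b_2}^2 \ \text{(simple regression)}} = \frac{1}{1 - r_{X_2, X_3}^2}$$

$$r = 0: \ \frac{1}{1 - 0} = 1, \qquad r = 0.5: \ \frac{1}{1 - 0.25} \approx 1.33, \qquad r = 0.9: \ \frac{1}{1 - 0.81} \approx 5.26$$

With r = 0 the ratio is exactly 1, which is the no-loss case described above. The loss grows slowly for moderate correlations and rapidly as r approaches 1 in absolute value.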


. reg LGFDHO LGEXP LGSIZE

  Source |       SS       df       MS                  Number of obs =     868
---------+------------------------------               F(  2,   865) =  460.92
   Model |  138.776549     2  69.3882747               Prob > F      =  0.0000
Residual |  130.219231   865  .150542464               R-squared     =  0.5159
---------+------------------------------               Adj R-squared =  0.5148
   Total |  268.995781   867  .310260416               Root MSE      =    .388

------------------------------------------------------------------------------
  LGFDHO |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   LGEXP |   .2866813   .0226824     12.639   0.000       .2421622    .3312003
  LGSIZE |   .4854698   .0255476     19.003   0.000       .4353272    .5356124
   _cons |   4.720269   .2209996     21.359   0.000       4.286511    5.154027
------------------------------------------------------------------------------

The analysis will be illustrated using a regression of LGFDHO, the logarithm of annual household expenditure on food eaten at home, on LGEXP, the logarithm of total annual household expenditure, and LGSIZE, the logarithm of the number of persons in the household.


The source of the data was the 1995 US Consumer Expenditure Survey. The sample size was 868.


. reg LGFDHO LGEXP LGSIZE LGHOUS

  Source |       SS       df       MS                  Number of obs =     868
---------+------------------------------               F(  3,   864) =  307.22
   Model |  138.841976     3  46.2806586               Prob > F      =  0.0000
Residual |  130.153805   864  .150640978               R-squared     =  0.5161
---------+------------------------------               Adj R-squared =  0.5145
   Total |  268.995781   867  .310260416               Root MSE      =  .38812

------------------------------------------------------------------------------
  LGFDHO |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   LGEXP |   .2673552   .0370782      7.211   0.000       .1945813     .340129
  LGSIZE |   .4868228   .0256383     18.988   0.000       .4365021    .5371434
  LGHOUS |   .0229611   .0348408      0.659   0.510      -.0454214    .0913436
   _cons |   4.708772   .2217592     21.234   0.000       4.273522    5.144022
------------------------------------------------------------------------------

Now add LGHOUS, the logarithm of annual expenditure on housing services. It is safe to assume that LGHOUS is an irrelevant variable and, not surprisingly, its coefficient is not significantly different from zero.
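The statement that the LGHOUS coefficient is not significantly different from zero can be verified by hand from the reported coefficient and standard error. The sketch below simply redoes that arithmetic. The critical value 1.963 is an assumed approximation to the two-sided 5 percent point of the t distribution with 864 degrees of freedom, and the variable names are illustrative.

```python
# Values copied from the LGHOUS row of the regression output.
coef_lghous = 0.0229611
se_lghous = 0.0348408

# Degrees of freedom: 868 observations minus 4 estimated parameters.
n_obs = 868
n_params = 4
df = n_obs - n_params

# t statistic for the null hypothesis that the true coefficient is zero.
t_stat = coef_lghous / se_lghous

# Assumption: approximate 5 percent two-sided critical value of t with 864 df.
t_crit = 1.963

# 95 percent confidence interval built from the same critical value.
ci_low = coef_lghous - t_crit * se_lghous
ci_high = coef_lghous + t_crit * se_lghous

significant = abs(t_stat) > t_crit

print(f"df = {df}")
print(f"t = {t_stat:.3f}")
print(f"95% CI = [{ci_low:.4f}, {ci_high:.4f}]")
print(f"significant at the 5 percent level: {significant}")
```

The t statistic and the interval should agree with the Stata output up to rounding, and the interval includes zero.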


It is however highly correlated with LGEXP (correlation coefficient 0.81), and also, to a lesser extent, with LGSIZE (correlation coefficient 0.33).

. cor LGHOUS LGEXP LGSIZE
(obs=869)

        |   LGHOUS    LGEXP   LGSIZE
--------+---------------------------
  LGHOUS|   1.0000
   LGEXP|   0.8137   1.0000
  LGSIZE|   0.3256   0.4491   1.0000


Its inclusion does not cause the coefficients of those variables to be biased.


But it does increase their standard errors, particularly that of LGEXP, as you would expect, reflecting the loss of efficiency.
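The size of the increase can be compared with what the reported correlations would predict. The sketch below is an extension, not part of the slides: it assumes the standard generalisation of the variance factor to several regressors, in which $1 / (1 - r^2)$ becomes $1 / (1 - R_j^2)$, with $R_j^2$ taken from a regression of the variable in question on the other explanatory variables. It also ignores the small change in the estimated disturbance variance between the two regressions, and the correlations were computed on 869 observations rather than 868, so only approximate agreement should be expected. Function and variable names are illustrative.

```python
import math

# Standard errors copied from the two regression outputs.
se_short = {"LGEXP": 0.0226824, "LGSIZE": 0.0255476}   # without LGHOUS
se_long = {"LGEXP": 0.0370782, "LGSIZE": 0.0256383}    # with LGHOUS

# Pairwise correlations copied from the correlation matrix.
r_exp_hous = 0.8137
r_size_hous = 0.3256
r_exp_size = 0.4491


def r2_on_two(r_a, r_b, r_ab):
    """R-squared from regressing a variable on two others, given its
    correlations r_a and r_b with them and their mutual correlation r_ab."""
    return (r_a ** 2 + r_b ** 2 - 2 * r_a * r_b * r_ab) / (1 - r_ab ** 2)


def predicted_inflation(r_other, r_hous, r_other_hous):
    """Predicted ratio of standard errors (long model over short model)."""
    r2_short = r_other ** 2                       # one other regressor
    r2_long = r2_on_two(r_other, r_hous, r_other_hous)  # two other regressors
    return math.sqrt((1 - r2_short) / (1 - r2_long))


observed = {name: se_long[name] / se_short[name] for name in se_short}
predicted = {
    "LGEXP": predicted_inflation(r_exp_size, r_exp_hous, r_size_hous),
    "LGSIZE": predicted_inflation(r_exp_size, r_size_hous, r_exp_hous),
}

for name in se_short:
    print(f"{name}: observed {observed[name]:.3f}, predicted {predicted[name]:.3f}")
```

With these inputs the observed and predicted ratios for LGEXP are both about 1.63, while those for LGSIZE are barely above 1, consistent with LGHOUS being strongly correlated with LGEXP and only weakly with LGSIZE.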


Copyright Christopher Dougherty 2011.

These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section 6.3 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/.

Individuals studying econometrics on their own and who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course 20 Elements of Econometrics www.londoninternational.ac.uk/lse.

11.07.25