Post on 16-Mar-2020

Chapter 6: Endogeneity and Instrumental Variables (IV) estimator

Advanced Econometrics - HEC Lausanne

Christophe Hurlin

University of Orléans

December 15, 2013

Christophe Hurlin (University of Orléans) Advanced Econometrics - HEC Lausanne December 15, 2013 1 / 68

Section 1

Introduction

1. Introduction

The outline of this chapter is the following:

Section 2. Endogeneity

Section 3. Instrumental Variables (IV) estimator

Section 4. Two-Stage Least Squares (2SLS)


References

Amemiya T. (1985), Advanced Econometrics, Harvard University Press.

Greene W. (2007), Econometric Analysis, sixth edition, Pearson - Prentice Hall (recommended)

Pelgrin F. (2010), Lecture notes Advanced Econometrics, HEC Lausanne (special thanks)

Ruud P. (2000), An Introduction to Classical Econometric Theory, Oxford University Press.


Notations: In this chapter, I will (try to...) follow some conventions of notation.

$f_Y(y)$ probability density or mass function

$F_Y(y)$ cumulative distribution function

$\Pr(\cdot)$ probability

$y$ vector

$Y$ matrix

Be careful: in this chapter, I don't distinguish between a random vector (matrix) and a vector (matrix) of deterministic elements (except in section 2). For more appropriate notations, see:

Abadir and Magnus (2002), Notation in econometrics: a proposal for a standard, Econometrics Journal.

Section 2

Endogeneity

2. Endogeneity

Objectives

The objectives of this section are the following:

1 To define the endogeneity issue

2 To study the sources of endogeneity

3 To show the inconsistency of the OLS estimator (endogeneity bias)


Objectives: In this chapter, we assume that assumption A3 (exogeneity) is violated:

$$E(\varepsilon \mid X) \neq 0_{N \times 1}$$

but the disturbances are spherical:

$$V(\varepsilon \mid X) = \sigma^2 I_N$$


The reasons for suspecting $E(\varepsilon \mid X) \neq 0$ are varied:

1 Errors-in-variables

2 Jointly endogenous variables: the usual example is regressing quantities on prices to estimate a demand equation (supply also affects the determination of equilibrium).

3 Omitted variables: one or more columns in $X$ cannot be included in the regression because no data on those variables are available. Estimation will be altered to the extent that the missing variables and the included ones are correlated.


1. Errors-in-variables

1 Consider the regression model:

$$y_i^* = x_i^{*\top} \beta + \varepsilon_i$$

where $E(\varepsilon_i \mid x_i^*) = 0$.

2 One does not observe $(y^*, x^*)$ but $(y, x)$:

$$y_i = y_i^* + v_i \qquad x_i = x_i^* + w_i$$

with

$$E(v_i) = E(v_i \varepsilon_i) = E(v_i y_i^*) = E(w_i^\top x_i^*) = 0$$

$$E(w_i) = E(v_i w_i) = E(w_i \varepsilon_i) = E(w_i y_i^*) = E(v_i x_i^*) = 0$$


1. Errors-in-variables (cont'd)

1 Substituting $y_i^* = y_i - v_i$ and $x_i^* = x_i - w_i$, the mismeasured regression equation is given by:

$$y_i^* = x_i^{*\top} \beta + \varepsilon_i$$

$$\iff y_i = x_i^\top \beta + \varepsilon_i + v_i - w_i^\top \beta$$

$$\iff y_i = x_i^\top \beta + \eta_i$$

with $\eta_i = \varepsilon_i + v_i - w_i^\top \beta$.

2 The composite error term $\eta_i$ is not orthogonal to the mismeasured independent variable $x_i$:

$$E(\eta_i x_i) \neq 0$$

Indeed, since $x_i = x_i^* + w_i$ and $w_i$ is uncorrelated with $\varepsilon_i$, $v_i$ and $x_i^*$, we have:

$$E(\eta_i x_i) = E(\varepsilon_i x_i) + E(v_i x_i) - E(x_i w_i^\top) \beta = -E(w_i w_i^\top) \beta \neq 0$$
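To get a feel for the size of this effect, here is a sketch of the textbook scalar case. The simplifying assumptions are made here for illustration and are not in the slides: a single mean-zero regressor ($K = 1$), no measurement error in $y$ ($v_i = 0$), $x_i^*$ uncorrelated with $w_i$, and the hypothetical symbols $\sigma^2_{x^*} = V(x_i^*)$ and $\sigma^2_w = V(w_i)$.

```latex
\operatorname{plim} \hat{\beta}_{OLS}
  = \frac{E(x_i y_i)}{E(x_i^2)}
  = \frac{E\big( (x_i^* + w_i)(x_i^* \beta + \varepsilon_i) \big)}{E\big( (x_i^* + w_i)^2 \big)}
  = \beta \, \frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + \sigma^2_w}
```

Under these assumptions the OLS slope is shrunk toward zero (attenuation), and the shrinkage grows with the measurement-error variance $\sigma^2_w$.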


2. Simultaneous equation bias

Consider the demand equation

$$q_d = \alpha_1 p + \alpha_2 y + u_d$$

where $q_d$, $p$ and $y$ denote respectively the quantity, the price and income.

Unfortunately, the price $p$ is not exogenous: the orthogonality condition $E(u_d p) = 0$ is not satisfied!

$$E(u_d p) \neq 0$$


2. Simultaneous equation bias (cont'd)

Indeed, the supply/demand system can be written as:

$$q_d = \alpha_1 p + \alpha_2 y + u_d$$

$$q_s = \beta_1 p + u_s$$

$$q_d = q_s$$

where $E(u_d) = E(u_s) = E(u_s u_d) = E(u_s y) = E(u_d y) = 0$.


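2. Simultaneous equation bias (cont'd)

Solving $q_d = q_s$, the reduced-form equations, which express the endogenous variables in terms of the exogenous variables, are:

$$p = \frac{\alpha_2 y}{\beta_1 - \alpha_1} + \frac{u_d - u_s}{\beta_1 - \alpha_1} = \pi_1 y + w_1$$

$$q = \frac{\beta_1 \alpha_2 y}{\beta_1 - \alpha_1} + \frac{\beta_1 u_d - \alpha_1 u_s}{\beta_1 - \alpha_1} = \pi_2 y + w_2$$

Therefore

$$E(u_d p) = \frac{\sigma^2_{u_d}}{\beta_1 - \alpha_1} \neq 0$$

With an upward-sloping supply curve and a downward-sloping demand curve ($\beta_1 > 0 > \alpha_1$), this covariance is positive, which leads to an overestimated (upward biased) price coefficient.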
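The covariance follows in one line from the reduced form for $p$ and the moment assumptions on $u_d$, $u_s$ and $y$ stated with the supply/demand system. As a sketch, writing $\sigma^2_{u_d} = E(u_d^2)$:

```latex
E(u_d p)
  = \pi_1 E(u_d y) + E(u_d w_1)
  = 0 + \frac{E(u_d^2) - E(u_d u_s)}{\beta_1 - \alpha_1}
  = \frac{\sigma^2_{u_d}}{\beta_1 - \alpha_1}
```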


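3. Omitted variables

Consider the true model:

$$y_i = \beta_1 + \beta_2 x_{1i} + \beta_3 x_{2i} + \varepsilon_i$$

with $E(\varepsilon_i) = E(\varepsilon_i x_{1i}) = E(\varepsilon_i x_{2i}) = 0$.

If we regress $y$ on a constant and $x_1$ (omitted variable $x_2$):

$$y_i = \beta_1 + \beta_2 x_{1i} + \mu_i$$

$$\mu_i = \beta_3 x_{2i} + \varepsilon_i$$

If $\operatorname{Cov}(x_{1i}, x_{2i}) \neq 0$, then

$$E(\mu_i x_{1i}) \neq 0$$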
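A sketch of the resulting bias, using the standard simple-regression formula for the OLS slope (the hat notation $\hat{\beta}_2$ for the slope of the short regression is introduced here for illustration):

```latex
\operatorname{plim} \hat{\beta}_2
  = \frac{\operatorname{Cov}(x_{1i}, y_i)}{V(x_{1i})}
  = \beta_2 + \frac{\operatorname{Cov}(x_{1i}, \mu_i)}{V(x_{1i})}
```

Because $\mu_i$ contains the omitted regressor, $\operatorname{Cov}(x_{1i}, \mu_i)$ is proportional to $\operatorname{Cov}(x_{1i}, x_{2i})$: the slope is consistent only if $x_{2i}$ is uncorrelated with $x_{1i}$ or does not enter the true model.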


Question

What is the consequence of endogeneity for the OLS estimator?


Consider the (population) multiple linear regression model:

$$y = X \beta + \varepsilon$$

where (cf. chapter 3):

$y$ is an $N \times 1$ vector of observations $y_i$ for $i = 1, \dots, N$

$X$ is an $N \times K$ matrix of $K$ explanatory variables $x_{ik}$ for $k = 1, \dots, K$ and $i = 1, \dots, N$

$\varepsilon$ is an $N \times 1$ vector of error terms $\varepsilon_i$.

$\beta = (\beta_1, \dots, \beta_K)^\top$ is a $K \times 1$ vector of parameters


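The OLS estimator is defined as:

$$\hat{\beta}_{OLS} = (X^\top X)^{-1} X^\top y$$

If we assume that

$$E(\varepsilon \mid X) \neq 0$$

then we have:

$$E(\hat{\beta}_{OLS} \mid X) = \beta_0 + (X^\top X)^{-1} X^\top E(\varepsilon \mid X) \neq \beta_0$$

$$E(\hat{\beta}_{OLS}) = E_X\big( E(\hat{\beta}_{OLS} \mid X) \big) \neq \beta_0$$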
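The conditional expectation above follows from substituting the model into the estimator. A short sketch of that step, with $\beta_0$ the true parameter value as in the surrounding slides:

```latex
\hat{\beta}_{OLS}
  = (X^\top X)^{-1} X^\top (X \beta_0 + \varepsilon)
  = \beta_0 + (X^\top X)^{-1} X^\top \varepsilon
\quad \Longrightarrow \quad
E(\hat{\beta}_{OLS} \mid X) = \beta_0 + (X^\top X)^{-1} X^\top E(\varepsilon \mid X)
```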


Theorem (Bias of the OLS estimator)

If the regressors are endogenous, i.e. $E(\varepsilon \mid X) \neq 0$, the OLS estimator of $\beta$ is biased

$$E(\hat{\beta}_{OLS}) \neq \beta_0$$

where $\beta_0$ denotes the true value of the parameters. This bias is called the endogeneity bias.


Remark

1 We saw in Chapter 1 that an estimator may be biased (finite sample properties) but asymptotically consistent (ex: uncorrected sample variance).

2 But in the presence of endogeneity, the OLS estimator is also inconsistent.


Objectives: We assume that:

$$\operatorname{plim} \frac{1}{N} X^\top \varepsilon = \gamma \neq 0_{K \times 1}$$

where

$$\gamma = E(x_i \varepsilon_i) \neq 0_{K \times 1}$$


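Given the definition of the OLS estimator:

$$\hat{\beta}_{OLS} = \beta_0 + (X^\top X)^{-1} (X^\top \varepsilon)$$

We have:

$$\operatorname{plim} \hat{\beta}_{OLS} = \beta_0 + \operatorname{plim} \left( \frac{1}{N} X^\top X \right)^{-1} \times \operatorname{plim} \left( \frac{1}{N} X^\top \varepsilon \right)$$

Or equivalently, with $Q = \operatorname{plim} N^{-1} X^\top X$:

$$\operatorname{plim} \hat{\beta}_{OLS} = \beta_0 + Q^{-1} \gamma \neq \beta_0$$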


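Theorem (Inconsistency of the OLS estimator)

If the regressors are endogenous with $\operatorname{plim} N^{-1} X^\top \varepsilon = \gamma$, the OLS estimator of $\beta$ is inconsistent

$$\operatorname{plim} \hat{\beta}_{OLS} = \beta_0 + Q^{-1} \gamma$$

where $Q = \operatorname{plim} N^{-1} X^\top X$.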


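Remark

The bias and the inconsistency are not confined to the coefficients on the endogenous variables.

Consider a case where all but the last variable in $X$ are uncorrelated with $\varepsilon$:

$$\operatorname{plim} \frac{1}{N} X^\top \varepsilon = \gamma = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ \gamma_K \end{pmatrix}$$

where $\gamma_K \neq 0$ is the only nonzero element. Then we have:

$$\operatorname{plim} \hat{\beta}_{OLS} = \beta_0 + Q^{-1} \gamma$$

There is no reason to expect that any of the elements of the last column of $Q^{-1}$ will be equal to zero.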


Remark (cont'd)

$$\operatorname{plim} \hat{\beta}_{OLS} = \beta_0 + Q^{-1} \gamma$$

1 The implication is that even though only one of the variables in $X$ is correlated with $\varepsilon$, all of the elements of $\hat{\beta}_{OLS}$ are inconsistent, not just the estimator of the coefficient on the endogenous variable.

2 This effect is called smearing; the inconsistency due to the endogeneity of the one variable is smeared across all of the least squares estimators.


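Example (Endogeneity, OLS estimator and smearing)

Consider the multiple linear regression model

$$y_i = 0.4 + 0.5 x_{i1} - 0.8 x_{i2} + \varepsilon_i$$

where $\varepsilon_i$ is i.i.d. with $E(\varepsilon_i) = 0$. We assume that the vector of variables defined by $w_i = (x_{i1} : x_{i2} : \varepsilon_i)$ has a multivariate normal distribution with

$$w_i \sim N(0_{3 \times 1}, \Delta)$$

with

$$\Delta = \begin{pmatrix} 1 & 0.3 & 0 \\ 0.3 & 1 & 0.5 \\ 0 & 0.5 & 1 \end{pmatrix}$$

It means that $\operatorname{Cov}(\varepsilon_i, x_{i1}) = 0$ ($x_1$ is exogenous) but $\operatorname{Cov}(\varepsilon_i, x_{i2}) = 0.5$ ($x_2$ is endogenous) and $\operatorname{Cov}(x_{i1}, x_{i2}) = 0.3$ ($x_1$ is correlated with $x_2$).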
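For this design, the probability limit $\beta_0 + Q^{-1} \gamma$ of the OLS estimator can be worked out by hand. A sketch, ordering the regressors as (constant, $x_{i1}$, $x_{i2}$), using the zero means and the entries of $\Delta$, and rounding to three decimals:

```latex
Q = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0.3 \\ 0 & 0.3 & 1 \end{pmatrix},
\qquad
\gamma = \begin{pmatrix} 0 \\ 0 \\ 0.5 \end{pmatrix},
\qquad
Q^{-1} \gamma = \frac{1}{0.91} \begin{pmatrix} 0 \\ -0.15 \\ 0.5 \end{pmatrix}
\approx \begin{pmatrix} 0 \\ -0.165 \\ 0.549 \end{pmatrix}

\operatorname{plim} \hat{\beta}_{OLS}
\approx \begin{pmatrix} 0.4 \\ 0.5 - 0.165 \\ -0.8 + 0.549 \end{pmatrix}
= \begin{pmatrix} 0.400 \\ 0.335 \\ -0.251 \end{pmatrix}
```

The coefficient on the exogenous regressor $x_{i1}$ is also inconsistent, because $x_{i1}$ is correlated with the endogenous $x_{i2}$.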


Example (Endogeneity, OLS estimator and smearing (cont'd))

Write a Matlab code to (1) generate $S = 1{,}000$ samples $\{y_i, x_{i1}, x_{i2}\}_{i=1}^N$ of size $N = 10{,}000$. (2) For each simulated sample, determine the OLS estimators of the model

$$y_i = \beta_1 + \beta_2 x_{i1} + \beta_3 x_{i2} + \varepsilon_i$$

Denote $\hat{\beta}_s = (\hat{\beta}_{1s}, \hat{\beta}_{2s}, \hat{\beta}_{3s})^\top$ the OLS estimates obtained from the simulation $s \in \{1, \dots, S\}$. (3) Compare the true value of the parameters in the population (DGP) to the average OLS estimates obtained for the $S$ simulations.
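The exercise asks for Matlab; as a hedged illustration only, here is a minimal sketch of the same experiment in Python, using nothing beyond the standard library. The helper names (`cholesky`, `solve`, `ols`, `simulate`) are hypothetical, and the sketch assumes smaller values of S and N than the exercise so that it runs quickly in pure Python.

```python
import random

def cholesky(a):
    """Lower-triangular L with L L' = a, for a symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = (a[i][i] - s) ** 0.5
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """OLS estimate (X'X)^{-1} X'y, with X given as a list of rows."""
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve(xtx, xty)

def simulate(n, rng, chol):
    """Draw one sample of size n from the DGP of the example."""
    rows, y = [], []
    for _ in range(n):
        e = [rng.gauss(0.0, 1.0) for _ in range(3)]
        # (x1, x2, eps) = chol @ e has covariance matrix DELTA
        x1, x2, eps = (sum(chol[i][j] * e[j] for j in range(i + 1)) for i in range(3))
        rows.append([1.0, x1, x2])
        y.append(0.4 + 0.5 * x1 - 0.8 * x2 + eps)
    return rows, y

DELTA = [[1.0, 0.3, 0.0], [0.3, 1.0, 0.5], [0.0, 0.5, 1.0]]
S, N = 100, 2000  # assumption: reduced from S = 1,000 and N = 10,000
rng = random.Random(0)
chol = cholesky(DELTA)
draws = [ols(*simulate(N, rng, chol)) for _ in range(S)]
mean_beta = [sum(d[k] for d in draws) / S for k in range(3)]

print("true parameters:      ", [0.4, 0.5, -0.8])
print("average OLS estimates:", [round(b, 3) for b in mean_beta])
```

The averages for the two slope coefficients should both sit away from 0.5 and -0.8, including the one on the exogenous regressor, which is the smearing effect described above.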


Question: What is the solution to the endogeneity issue?

The use of instruments.


Key Concepts

1 Endogeneity issue

2 Main sources of endogeneity: omitted variables, errors-in-variables,and jointly endogenous regressors.

3 Endogeneity bias of the OLS estimator

4 Inconsistency of the OLS estimator

5 Smearing effect

Section 3

Instrumental Variables (IV) estimator

3. Instrumental Variables (IV) estimator

Objectives

The objectives of this section are the following:

1 To define the notion of instrument or instrumental variable

2 To introduce the Instrumental Variables (IV) estimator

3 To study the asymptotic properties of the IV estimator

4 To define the notion of weak instrument


Definition (Instruments)

Consider a set of $H$ variables $z_h \in \mathbb{R}^N$ for $h = 1, \dots, H$. Denote $Z$ the $N \times H$ matrix $(z_1 : \dots : z_H)$. These variables are called instruments or instrumental variables if they satisfy two properties:

(1) Exogeneity: They are uncorrelated with the disturbance.

$$E(\varepsilon \mid Z) = 0_{N \times 1}$$

(2) Relevance: They are correlated with the independent variables, $X$.

$$E(x_{ik} z_{ih}) \neq 0$$

for $h \in \{1, \dots, H\}$ and $k \in \{1, \dots, K\}$.


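Assumptions: The instrumental variables satisfy the following properties.

Well behaved data:

$$\operatorname{plim} \frac{1}{N} Z^\top Z = Q_{ZZ} \quad \text{a finite } H \times H \text{ positive definite matrix}$$

Relevance:

$$\operatorname{plim} \frac{1}{N} Z^\top X = Q_{ZX} \quad \text{a finite } H \times K \text{ matrix with full column rank } K$$

Exogeneity:

$$\operatorname{plim} \frac{1}{N} Z^\top \varepsilon = 0_{H \times 1}$$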


Definition (Instrument properties)

We assume that the $H$ instruments are linearly independent:

$$E(Z^\top Z) \text{ is nonsingular}$$

or equivalently

$$\operatorname{rank}\big( E(Z^\top Z) \big) = H$$


Remark

The exogeneity condition

$$E(\varepsilon_i \mid z_i) = 0 \implies E(\varepsilon_i z_i) = 0$$

with $z_i = (z_{i1}, \dots, z_{iH})^\top$ can be expressed as an orthogonality condition or moment condition

$$E\big( z_i (y_i - x_i^\top \beta) \big) = 0$$

The sample analog is

$$\frac{1}{N} \sum_{i=1}^N z_i (y_i - x_i^\top \beta) = 0$$


Definition (Identification)

The system is identified if there exists a unique $\beta = \beta_0$ such that:

$$E\big( z_i (y_i - x_i^\top \beta) \big) = 0$$

where $z_i = (z_{i1}, \dots, z_{iH})^\top$. For that, we have the following conditions:

(1) If $H < K$ the model is not identified.

(2) If $H = K$ the model is just-identified.

(3) If $H > K$ the model is over-identified.


Remark

1 Under-identification: fewer equations ($H$) than unknowns ($K$).

2 Just-identification: the number of equations equals the number of unknowns (unique solution) => IV estimator

3 Over-identification: more equations than unknowns. Two equivalent solutions:

1 Select $K$ linear combinations of the instruments to have a unique solution => Two-Stage Least Squares

2 Set the sample analog of the moment conditions as close as possible to zero, i.e. minimize the distance between the sample analog and zero given a metric (optimal metric or optimal weighting matrix?) => Generalized Method of Moments (GMM).


Assumption: Consider a just-identified model

$$H = K$$


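Motivation of the IV estimator

By definition of the instruments:

$$\operatorname{plim} \frac{1}{N} Z^\top \varepsilon = \operatorname{plim} \frac{1}{N} Z^\top (y - X \beta) = 0_{K \times 1}$$

So, we have:

$$\operatorname{plim} \frac{1}{N} Z^\top y = \left( \operatorname{plim} \frac{1}{N} Z^\top X \right) \beta$$

or equivalently

$$\beta = \left( \operatorname{plim} \frac{1}{N} Z^\top X \right)^{-1} \operatorname{plim} \frac{1}{N} Z^\top y$$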


Definition (Instrumental Variable (IV) estimator)

If $H = K$, the Instrumental Variable (IV) estimator $\hat{\beta}_{IV}$ of the parameters $\beta$ is defined as:

$$\hat{\beta}_{IV} = (Z^\top X)^{-1} Z^\top y$$
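To make the definition concrete, here is a minimal Python sketch (standard library only) that compares the IV and OLS estimators on simulated data. Everything about the design is an assumption made for illustration: one endogenous regressor, one instrument plus a constant ($H = K = 2$), true parameters (1, 2), and the helper names `cross`, `solve2` and `first_col`.

```python
import random

def cross(a, b):
    """Return a'b for two matrices stored as lists of rows."""
    return [[sum(ra[i] * rb[j] for ra, rb in zip(a, b))
             for j in range(len(b[0]))] for i in range(len(a[0]))]

def solve2(m, v):
    """Solve the 2 x 2 system m x = v by Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(v[0] * m[1][1] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det]

def first_col(m):
    """Flatten a one-column matrix into a plain list."""
    return [row[0] for row in m]

rng = random.Random(1)
N = 20000
Z, X, y = [], [], []
for _ in range(N):
    z, eps, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    x = 0.8 * z + 0.5 * eps + u      # x is correlated with eps: endogenous
    Z.append([1.0, z])               # instruments: constant and z
    X.append([1.0, x])               # regressors: constant and x
    y.append([1.0 + 2.0 * x + eps])  # assumed true beta_0 = (1, 2)

beta_ols = solve2(cross(X, X), first_col(cross(X, y)))  # (X'X)^{-1} X'y
beta_iv = solve2(cross(Z, X), first_col(cross(Z, y)))   # (Z'X)^{-1} Z'y

# Sample moment conditions (1/N) Z'(y - X beta_iv): zero up to rounding error
resid = [[yi[0] - beta_iv[0] * xi[0] - beta_iv[1] * xi[1]] for xi, yi in zip(X, y)]
moments = [g / N for g in first_col(cross(Z, resid))]

print("OLS:", [round(b, 3) for b in beta_ols])
print("IV: ", [round(b, 3) for b in beta_iv])
print("sample moments at beta_iv:", moments)
```

In this design the OLS slope should settle well above 2, while the IV slope should be close to it; the last line checks that the IV estimate solves the sample moment conditions exactly in the just-identified case.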


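Definition (Consistency)

Under the assumption that $\operatorname{plim} N^{-1} Z^\top \varepsilon = 0_{K \times 1}$, the IV estimator $\hat{\beta}_{IV}$ is consistent:

$$\hat{\beta}_{IV} \xrightarrow{p} \beta_0$$

where $\beta_0$ denotes the true value of the parameters.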


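Proof

By definition:

$$\hat{\beta}_{IV} = \beta_0 + \left( \frac{1}{N} Z^\top X \right)^{-1} \left( \frac{1}{N} Z^\top \varepsilon \right)$$

So, we have:

$$\operatorname{plim} \hat{\beta}_{IV} = \beta_0 + \left( \operatorname{plim} \frac{1}{N} Z^\top X \right)^{-1} \left( \operatorname{plim} \frac{1}{N} Z^\top \varepsilon \right)$$

Under the assumption of exogeneity of the instruments

$$\operatorname{plim} \frac{1}{N} Z^\top \varepsilon = 0_{K \times 1}$$

So, we have

$$\operatorname{plim} \hat{\beta}_{IV} = \beta_0 \qquad \square$$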
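The first line of the proof comes from substituting the model $y = X \beta_0 + \varepsilon$ into the definition of the estimator. Spelled out as a sketch:

```latex
\hat{\beta}_{IV}
  = (Z^\top X)^{-1} Z^\top (X \beta_0 + \varepsilon)
  = \beta_0 + (Z^\top X)^{-1} Z^\top \varepsilon
  = \beta_0 + \left( \frac{1}{N} Z^\top X \right)^{-1} \left( \frac{1}{N} Z^\top \varepsilon \right)
```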


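Definition (Asymptotic distribution)

Under some regularity conditions, the IV estimator $\hat{\beta}_{IV}$ is asymptotically normally distributed:

$$\sqrt{N} (\hat{\beta}_{IV} - \beta_0) \xrightarrow{d} N\big( 0_{K \times 1}, \sigma^2 Q_{ZX}^{-1} Q_{ZZ} Q_{XZ}^{-1} \big)$$

where

$$\underset{K \times K}{Q_{ZZ}} = \operatorname{plim} \frac{1}{N} Z^\top Z \qquad \underset{K \times K}{Q_{ZX}} = \operatorname{plim} \frac{1}{N} Z^\top X \qquad Q_{XZ} = Q_{ZX}^\top$$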


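Definition (Asymptotic variance covariance matrix)

The asymptotic variance covariance matrix of the IV estimator $\hat{\beta}_{IV}$ is defined as:

$$V_{asy}(\hat{\beta}_{IV}) = \frac{\sigma^2}{N} Q_{ZX}^{-1} Q_{ZZ} Q_{XZ}^{-1}$$

with $Q_{XZ} = Q_{ZX}^\top$. A consistent estimator is given by

$$\hat{V}_{asy}(\hat{\beta}_{IV}) = \hat{\sigma}^2 (Z^\top X)^{-1} (Z^\top Z) (X^\top Z)^{-1}$$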

Remarks

1 If the system is just identified, $H = K$, the matrix $Z^\top X$ is square and

$$(X^\top Z)^{-1} = \big( (Z^\top X)^{-1} \big)^\top \qquad Q_{XZ} = Q_{ZX}^\top$$

so the estimator can also be written as

$$\hat{V}_{asy}(\hat{\beta}_{IV}) = \hat{\sigma}^2 (Z^\top X)^{-1} (Z^\top Z) \big( (Z^\top X)^{-1} \big)^\top$$

2 As usual, the estimator of the variance of the error terms is:

$$\hat{\sigma}^2 = \frac{\hat{\varepsilon}^\top \hat{\varepsilon}}{N - K} = \frac{1}{N - K} \sum_{i=1}^N (y_i - x_i^\top \hat{\beta}_{IV})^2$$
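As an illustration of the variance estimator, consider a sketch of the simplest just-identified case, which is an assumption made here and not part of the slides: a constant plus one scalar regressor $x$, instrumented by a constant plus one scalar instrument $z$. Writing $S_{zx} = \sum_i (z_i - \bar{z})(x_i - \bar{x})$, and $S_{zy}$, $S_{zz}$ analogously (hypothetical notation), the slope element of $\hat{\beta}_{IV}$ and its estimated variance reduce to:

```latex
\hat{\beta}_{IV,\,slope} = \frac{S_{zy}}{S_{zx}},
\qquad
\hat{V}_{asy}\big( \hat{\beta}_{IV,\,slope} \big) = \hat{\sigma}^2 \, \frac{S_{zz}}{S_{zx}^2}
```

The estimated variance is inversely proportional to the squared sample covariance between the instrument and the regressor.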


Relevant instruments

1 Our analysis thus far has focused on the "identification" condition for IV estimation, that is, the "exogeneity assumption," which produces

$$\operatorname{plim} \frac{1}{N} Z^\top \varepsilon = 0_{K \times 1}$$

2 A growing literature has argued that greater attention needs to be given to the relevance condition

$$\operatorname{plim} \frac{1}{N} Z^\top X = Q_{ZX} \quad \text{a finite } H \times K \text{ matrix with full column rank } K$$

with $H = K$ in the case of a just-identified model.


Relevant instruments (cont'd)

$$\operatorname{plim} \frac{1}{N} Z^\top X = Q_{ZX} \quad \text{a finite } H \times K \text{ matrix with full column rank } K$$

1 Strictly speaking, this condition is sufficient to determine the asymptotic properties of the IV estimator.

2 However, the common case of "weak instruments," in which the condition is only barely true, has attracted considerable scrutiny.


Definition (Weak instrument)

A weak instrument is an instrumental variable which is only slightly correlated with the right-hand-side variables $X$. In the presence of weak instruments, the quantity $Q_{ZX}$ is close to zero and we have

$$\frac{1}{N} Z^\top X \simeq 0_{H \times K}$$


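Fact (IV estimator and weak instruments)

In the presence of weak instruments, the IV estimator $\hat{\beta}_{IV}$ has poor precision (large variance). For $Q_{ZX} \simeq 0_{H \times K}$, the asymptotic variance tends to be very large, since:

$$V_{asy}(\hat{\beta}_{IV}) = \frac{\sigma^2}{N} Q_{ZX}^{-1} Q_{ZZ} Q_{XZ}^{-1}$$

with $Q_{XZ} = Q_{ZX}^\top$. As soon as $N^{-1} Z^\top X \simeq 0_{H \times K}$, the estimated asymptotic variance covariance matrix is also very large since

$$\hat{V}_{asy}(\hat{\beta}_{IV}) = \hat{\sigma}^2 (Z^\top X)^{-1} (Z^\top Z) (X^\top Z)^{-1}$$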
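The loss of precision can be illustrated numerically. The following Python sketch (standard library only) is a hypothetical design, not taken from the slides: one regressor, one instrument and a constant, with a first-stage coefficient `pi` that is either 0.8 (strong) or 0.02 (weak). It uses the scalar form of the estimated variance, $\hat{\sigma}^2 S_{zz} / S_{zx}^2$, which is what $\hat{\sigma}^2 (Z^\top X)^{-1} (Z^\top Z) (X^\top Z)^{-1}$ gives for the slope in this special case.

```python
import random

def iv_slope_and_se(z, x, y):
    """IV slope and its estimated standard error, with a constant in Z and X."""
    n = len(y)
    zb, xb, yb = sum(z) / n, sum(x) / n, sum(y) / n
    szx = sum((a - zb) * (b - xb) for a, b in zip(z, x))
    szy = sum((a - zb) * (b - yb) for a, b in zip(z, y))
    szz = sum((a - zb) ** 2 for a in z)
    slope = szy / szx
    intercept = yb - slope * xb
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)                   # sigma^2 hat with K = 2
    return slope, (sigma2 * szz / szx ** 2) ** 0.5

def draw(pi, n, rng):
    """Simulate y = 1 + 2 x + eps with x = pi z + 0.5 eps + u (assumed DGP)."""
    z, x, y = [], [], []
    for _ in range(n):
        zi, eps, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        xi = pi * zi + 0.5 * eps + u
        z.append(zi)
        x.append(xi)
        y.append(1.0 + 2.0 * xi + eps)
    return z, x, y

rng = random.Random(3)
N = 20000
slope_strong, se_strong = iv_slope_and_se(*draw(0.8, N, rng))
slope_weak, se_weak = iv_slope_and_se(*draw(0.02, N, rng))

print("strong instrument: slope, s.e. =", slope_strong, se_strong)
print("weak instrument:   slope, s.e. =", slope_weak, se_weak)
```

With the same sample size and the same error variance, the standard error under the weak instrument should come out many times larger than under the strong one.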


Key Concepts

1 Instrument or instrumental variable

2 Orthogonality or moment condition

3 Identification: just-identified or over-identified model

4 Instrumental Variables (IV) estimator

5 Statistical properties of the IV estimator

6 Weak instrument

Section 4

Two-Stage Least Squares (2SLS) estimator

4. Two-Stage Least Squares (2SLS) estimator

Assumption: Consider an over-identified model

$$H > K$$


Introduction

If $Z$ contains more variables than $X$, then much of the preceding derivation is unusable, because $Z^\top X$ will be $H \times K$ with

$$\operatorname{rank}(Z^\top X) = K < H$$

So, the matrix $Z^\top X$ is not square, has no inverse, and we cannot compute the IV estimator as:

$$\hat{\beta}_{IV} = (Z^\top X)^{-1} Z^\top y$$


Introduction (cont'd)

The crucial assumption in the previous section was the exogeneity assumption

$$\operatorname{plim} \frac{1}{N} Z^\top \varepsilon = 0_{H \times 1}$$

1 That is, every column of $Z$ is asymptotically uncorrelated with $\varepsilon$.

2 That also means that every linear combination of the columns of $Z$ is also uncorrelated with $\varepsilon$, which suggests that one approach would be to choose $K$ linear combinations of the columns of $Z$.


Introduction (cont'd)

Which linear combination to choose?

One choice is the projection of the columns of $X$ onto the column space of $Z$:

$$\hat{X} = Z (Z^\top Z)^{-1} Z^\top X$$

With this choice of instrumental variables, $\hat{X}$ for $Z$, we have

$$\hat{\beta}_{2SLS} = (\hat{X}^\top X)^{-1} \hat{X}^\top y = \big( X^\top Z (Z^\top Z)^{-1} Z^\top X \big)^{-1} X^\top Z (Z^\top Z)^{-1} Z^\top y$$


Definition (Two-stage Least Squares (2SLS) estimator)

The Two-stage Least Squares (2SLS) estimator of the parameters $\beta$ is defined as:

$$\hat{\beta}_{2SLS} = (\hat{X}^\top X)^{-1} \hat{X}^\top y$$

where $\hat{X} = Z (Z^\top Z)^{-1} Z^\top X$ corresponds to the projection of the columns of $X$ onto the column space of $Z$, or equivalently by

$$\hat{\beta}_{2SLS} = \big( X^\top Z (Z^\top Z)^{-1} Z^\top X \big)^{-1} X^\top Z (Z^\top Z)^{-1} Z^\top y$$
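As a consistency check (a sketch, not part of the original slides): when $H = K$ and $Z^\top X$ is invertible, the inverse of the product can be expanded factor by factor, and the 2SLS formula collapses to the IV estimator of the previous section:

```latex
\hat{\beta}_{2SLS}
  = (Z^\top X)^{-1} (Z^\top Z) (X^\top Z)^{-1} \, X^\top Z (Z^\top Z)^{-1} Z^\top y
  = (Z^\top X)^{-1} Z^\top y
  = \hat{\beta}_{IV}
```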

Remark

By definition

$$\hat{\beta}_{2SLS} = (\hat{X}^\top X)^{-1} \hat{X}^\top y$$

Since

$$\hat{X} = Z (Z^\top Z)^{-1} Z^\top X = P_Z X$$

where $P_Z$ denotes the projection matrix on the columns of $Z$. Reminder: $P_Z$ is symmetric and $P_Z P_Z^\top = P_Z$. So, we have

$$\hat{\beta}_{2SLS} = (X^\top P_Z^\top X)^{-1} \hat{X}^\top y = (X^\top P_Z^\top P_Z X)^{-1} \hat{X}^\top y = (\hat{X}^\top \hat{X})^{-1} \hat{X}^\top y$$

Definition (Two-stage Least Squares (2SLS) estimator)

The Two-stage Least Squares (2SLS) estimator of the parameters $\beta$ can also be defined as:

$$\hat{\beta}_{2SLS} = (\hat{X}^\top \hat{X})^{-1} \hat{X}^\top y$$

It corresponds to the OLS estimator obtained in the regression of $y$ on $\hat{X}$. Then, the 2SLS can be computed in two steps, first by computing $\hat{X}$, then by the least squares regression. That is why it is called the two-stage LS estimator.


A procedure to get the 2SLS estimator is the following:

Step 1: Regress each explanatory variable $x_k$ (for $k = 1, \dots, K$) on the $H$ instruments.

$$x_{ki} = \alpha_1 z_{1i} + \alpha_2 z_{2i} + \dots + \alpha_H z_{Hi} + v_i$$

Step 2: Compute the OLS estimators $\hat{\alpha}_h$ and the fitted values $\hat{x}_{ki}$

$$\hat{x}_{ki} = \hat{\alpha}_1 z_{1i} + \hat{\alpha}_2 z_{2i} + \dots + \hat{\alpha}_H z_{Hi}$$

Step 3: Regress the dependent variable $y$ on the fitted values $\hat{x}_{ki}$:

$$y_i = \beta_1 \hat{x}_{1i} + \beta_2 \hat{x}_{2i} + \dots + \beta_K \hat{x}_{Ki} + \varepsilon_i$$

The 2SLS estimator $\hat{\beta}_{2SLS}$ then corresponds to the OLS estimator obtained in this model.
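The three steps can be sketched in Python with the standard library only. The design below is hypothetical and chosen for illustration: a constant and one endogenous regressor ($K = 2$), a constant and two instruments ($H = 3$), true parameters (1, 2), and helper names `matmul`, `transpose`, `solve` and `ols` that are not from the slides. The sketch also compares the two-step result with the closed form $(\hat{X}^\top X)^{-1} \hat{X}^\top y$.

```python
import random

def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def solve(a, b):
    """Solve a X = b for a square a and a matrix b (Gauss-Jordan, partial pivoting)."""
    n = len(a)
    m = [ra[:] + rb[:] for ra, rb in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [v / piv for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [row[n:] for row in m]

def ols(x, y):
    """OLS coefficients (x'x)^{-1} x'y, one column per column of y."""
    xt = transpose(x)
    return solve(matmul(xt, x), matmul(xt, y))

rng = random.Random(2)
N = 5000
Z, X, y = [], [], []
for _ in range(N):
    z1, z2, eps, u = (rng.gauss(0, 1) for _ in range(4))
    x = 0.6 * z1 + 0.4 * z2 + 0.5 * eps + u   # endogenous regressor
    Z.append([1.0, z1, z2])                   # H = 3 instruments
    X.append([1.0, x])                        # K = 2 regressors
    y.append([1.0 + 2.0 * x + eps])           # assumed true beta_0 = (1, 2)

# Steps 1 and 2: regress each column of X on Z and form the fitted values
A = ols(Z, X)              # H x K matrix of first-stage coefficients
X_hat = matmul(Z, A)       # N x K matrix of fitted values

# Step 3: regress y on the fitted values
beta_two_stage = ols(X_hat, y)

# Closed form (X_hat' X)^{-1} X_hat' y for comparison, and plain OLS
Xh_t = transpose(X_hat)
beta_closed = solve(matmul(Xh_t, X), matmul(Xh_t, y))
beta_ols = ols(X, y)

print("2SLS (two steps): ", [round(b[0], 3) for b in beta_two_stage])
print("2SLS (closed form):", [round(b[0], 3) for b in beta_closed])
print("OLS:               ", [round(b[0], 3) for b in beta_ols])
```

One caveat on the design choice: the second-stage regression reproduces the 2SLS point estimates, but the residuals used for $\hat{\sigma}^2$ should be computed with $X$, not $\hat{X}$, so standard errors taken directly from the second-stage OLS output would not be the right ones.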


Theorem

If any column of $X$ also appears in $Z$, i.e. if one or more explanatory (exogenous) variables are used as instruments, then that column of $X$ is reproduced exactly in $\hat{X}$.
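A one-line justification, as a sketch: suppose the column $x_k$ of $X$ is the $j$-th column of $Z$, so that $x_k = Z e_j$ with $e_j$ the $j$-th unit vector of $\mathbb{R}^H$ (notation introduced here for illustration). Then:

```latex
\hat{x}_k
  = Z (Z^\top Z)^{-1} Z^\top x_k
  = Z (Z^\top Z)^{-1} Z^\top Z \, e_j
  = Z e_j
  = x_k
```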


Example (Explanatory variables used as instruments)

Suppose that the regression contains $K$ variables, only one of which, say, the $K$th, is correlated with the disturbances, i.e. $E(x_{Ki} \varepsilon_i) \neq 0$. We can use a set of instrumental variables $z_1, \dots, z_J$ plus the other $K - 1$ variables that certainly qualify as instrumental variables in their own right. So,

$$Z = (z_1 : \dots : z_J : x_1 : \dots : x_{K-1})$$

Then

$$\hat{X} = (x_1 : \dots : x_{K-1} : \hat{x}_K)$$

where $\hat{x}_K$ denotes the projection of $x_K$ on the columns of $Z$.


Key Concepts

1 Over-identified model

2 Two-Stage Least Squares (2SLS) estimator

End of Chapter 6
