Lecture 6 LBS Slides

Econometrics Part I Lecture 6 6 November 2009 J. James Reade

Transcript of Lecture 6 LBS Slides

Page 1: Lecture 6 LBS Slides

Econometrics Part I, Lecture 6

6 November 2009

J. James Reade

Page 2: Lecture 6 LBS Slides

Admin

• Lectures:

– Today: Simultaneous equations modelling and VARs.
– Next week: Limited Dependent Variable Modelling.
– Final lecture: Treatment effects and recap.

• Lecture notes:

– Handouts for each lecture: Slides.
– Please point out typos!

• Classes: This week, next week.

• Exam: Two weeks on Tuesday.

– Prep: Assignments best guide for exam.

• Happy to chat via email or after lecture.

[email protected] 2

Page 3: Lecture 6 LBS Slides

Today: Multiple-Equation Modelling

• Many reasons to estimate more than a single-equation model.

• Panel: Time series observed over multiple units.

– View as system of time series?

• We may want to consider several models jointly:

– Helpful if know disturbances likely correlated.
– E.g. CAPM residuals (excess returns) over different firms correlated.

• May want to consider determination of several variables jointly:

– Endogeneity/simultaneity.


Page 4: Lecture 6 LBS Slides

Today: Textbook Coverage

• Greene covers all material:

– SURE: Chapter 14.
– Simultaneous equations: Chapter 15.
– VARs, etc: Chapter 19.

• Another textbook gives good intro to simultaneous equations material:

– Gujarati: Basic Econometrics, ed. 4, Chs 18–21.

• Gujarati somewhat more shaky on time-series grounds though.

• Stock and Watson does not cover simultaneous equations.


Page 5: Lecture 6 LBS Slides

The Seemingly Unrelated Regression Model

• Seemingly very similar to panel data.

• Set of data for number of units, say firms.

– May have cross equation dependencies, e.g. firms in similar industry.
– May be more efficient to exploit error structure in estimation.

• E.g. Consumer demand for N goods. Could estimate N demands.

– But constraints hold over all consumers: Budget constraints etc.

• These kinds of models motivate the Seemingly Unrelated Regression approach.

– Proposed by Zellner (1962).


Page 6: Lecture 6 LBS Slides

The Seemingly Unrelated Regression Model

• Take bivariate case: E.g. two goods, N observations for each:

$$y_1 = X_1\beta_1 + u_1, \qquad u_1 \sim (0, \sigma_1^2), \tag{1}$$

$$y_2 = X_2\beta_2 + u_2, \qquad u_2 \sim (0, \sigma_2^2). \tag{2}$$

• Write as a system:

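$$\underbrace{\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}}_{2N \times 1} = \underbrace{\begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix}}_{2N \times (K_1+K_2)} \underbrace{\begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}}_{(K_1+K_2) \times 1} + \underbrace{\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}}_{2N \times 1}. \tag{3}$$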

• Stacked form:

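$$y = X\beta + u, \qquad E(u \mid X) = 0, \qquad \operatorname{Cov}(u \mid X) = \Omega = \begin{pmatrix} \sigma_{11} I_n & \sigma_{12} I_n \\ \sigma_{21} I_n & \sigma_{22} I_n \end{pmatrix}. \tag{4}$$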

• Big regression problem: Use OLS? $\hat\beta_{OLS} = (X'X)^{-1}X'y$. (5)


Page 7: Lecture 6 LBS Slides

The Seemingly Unrelated Regression Model

• OLS estimator has properties:

$$E(\hat\beta_{OLS} \mid X) = \beta, \qquad \operatorname{Cov}(\hat\beta_{OLS} \mid X) = (X'X)^{-1}X'\Omega X(X'X)^{-1}. \tag{6}$$

– Latter follows from assumption that Cov (u |X ) = Ω.

• This is just a big OLS estimation: OLS on each equation separately.

• But is Ω diagonal?

– In many cases unlikely: CAPM example, demand expenditures.
– If Ω non-diagonal then OLS inefficient.
– Can gain over OLS by using GLS and exploiting cross-equation correlations.

• Using GLS is called Seemingly Unrelated Regression (SURE) analysis.


Page 8: Lecture 6 LBS Slides

The Seemingly Unrelated Regression Model

• GLS estimator:

$$\hat\beta_{SURE} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y. \tag{7}$$

• We’ve assumed Ω non-diagonal:

$$\Omega = \begin{pmatrix} \sigma_{11} I_n & \sigma_{12} I_n \\ \sigma_{21} I_n & \sigma_{22} I_n \end{pmatrix}. \tag{8}$$

• So σ12 and σ21 non-zero.

• SURE estimation weights this information, in effect using parts of $\hat\beta_1$ in $\hat\beta_{2,SURE}$.

– Problematic if one equation in system misspecified, as effects transmitted across all equations.
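
The GLS step in (7) can be sketched numerically. The following is a minimal illustration only, assuming simulated data for two equations, hypothetical coefficient values, and an error covariance estimated from equation-by-equation OLS residuals (so it is feasible GLS rather than GLS with known Ω). All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 400  # observations per equation (hypothetical)

# Two equations, each with a constant and one regressor (assumed design).
X1 = np.column_stack([np.ones(N), rng.normal(size=N)])
X2 = np.column_stack([np.ones(N), rng.normal(size=N)])
beta1 = np.array([1.0, 2.0])
beta2 = np.array([-1.0, 0.5])

# Cross-equation error correlation: sigma_12 = 0.8, so Omega is non-diagonal.
Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])
U = rng.multivariate_normal(np.zeros(2), Sigma, size=N)
y1 = X1 @ beta1 + U[:, 0]
y2 = X2 @ beta2 + U[:, 1]

# Stacked system as in (3): y = X beta + u with block-diagonal X.
y = np.concatenate([y1, y2])
X = np.block([[X1, np.zeros_like(X2)], [np.zeros_like(X1), X2]])

# Step 1: OLS on the stacked system (same as OLS equation by equation).
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Step 2: estimate the 2x2 error covariance from the OLS residuals.
resid = (y - X @ beta_ols).reshape(2, N)
Sigma_hat = resid @ resid.T / N

# Step 3: GLS with Omega_hat = Sigma_hat kron I_N, as in (7).
Omega_inv = np.kron(np.linalg.inv(Sigma_hat), np.eye(N))
beta_sure = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

print("OLS :", beta_ols.round(3))
print("SURE:", beta_sure.round(3))
```

Both estimators should land near the assumed coefficients; the efficiency gain from SURE shows up in sampling variability over repeated draws, not in a single run.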


Page 9: Lecture 6 LBS Slides

Alternative Systems of Equations

• Panel data from last week seemingly a generalisation of SURE.

– Although would appear SURE accounts for cross-section dependence.¹

• May be interested not in the same data series over many different units of observation.

• What if the variables for a given unit are endogenously determined?

– In macroeconomics, it’s hard to see how data are anything but endogenous.
– Demand and supply also.

• Much of today: Simultaneous equation models.

– Extending into VAR modelling.

¹ Don’t quote me on this.


Page 10: Lecture 6 LBS Slides

A Necessity for Simultaneous Equations

• Endogeneity pervasive in econometric modelling.

– Regression of yi on xi: yi = βxi + εi. (9)

– But xi also depends on yi: xi = θyi + ei. (10)

– Estimator:

$$\hat\beta = \frac{\sum_{i=1}^{N} x_i y_i}{\sum_{i=1}^{N} x_i^2} = \beta + \frac{\sum_{i=1}^{N} x_i \varepsilon_i}{\sum_{i=1}^{N} x_i^2}. \tag{11}$$

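– But $E(x_i\varepsilon_i) = E((\theta y_i + e_i)\varepsilon_i) = E((\theta(\beta x_i + \varepsilon_i) + e_i)\varepsilon_i) = \theta\beta E(x_i\varepsilon_i) + \theta E(\varepsilon_i^2)$ when $E(e_i\varepsilon_i) = 0$, so $E(x_i\varepsilon_i) = \theta E(\varepsilon_i^2)/(1 - \theta\beta) \neq 0$.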

• Endogeneity or simultaneity manifested in correlation between errors and variables.

– Already considered many strategies for this ailment.

• Today we consider estimating each possible equation.
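
The bias in (11) is easy to see in a simulation. This is a sketch under assumed values only (β = 0.5, θ = 0.5, unit-variance independent errors, all hypothetical): the two equations (9) and (10) are solved jointly for x and y, and the OLS slope is compared with its large-sample limit.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200_000
beta, theta = 0.5, 0.5          # hypothetical structural values

eps = rng.normal(size=N)        # error in the y equation
e = rng.normal(size=N)          # error in the x equation

# Solve y = beta*x + eps and x = theta*y + e jointly (reduced form).
x = (theta * eps + e) / (1.0 - theta * beta)
y = beta * x + eps

# OLS without intercept, as in (11).
beta_hat = (x @ y) / (x @ x)

# Large-sample limit: beta + Cov(x, eps) / Var(x).
cov_x_eps = theta / (1.0 - theta * beta)
var_x = (theta**2 + 1.0) / (1.0 - theta * beta) ** 2
plim = beta + cov_x_eps / var_x

print(f"true beta = {beta}, OLS estimate = {beta_hat:.3f}, plim = {plim:.3f}")
```

With these assumed values the OLS slope settles near 0.8 rather than 0.5, however large the sample.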


Page 11: Lecture 6 LBS Slides

Multiple-Equation Modelling: Supply and Demand

• E.g. Market for PhDs: Demand, supply, equilibrium wages (w) and employment (e).

– Market clearing w, e caused by demand and supply.
– But demand and supply affected by what w and e are.
– Joint determination of equilibrium quantities.

• Then quantity demanded of PhDs qdt is: qdt = β11 + β12wt + ε1t. (12)

• But the quantity supplied qst is: qst = β21 + β22wt + ε2t. (13)

• OLS estimation of either β12 or β22 problematic: Setting qdt = qst gives:

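$$w_t = \frac{\beta_{11} - \beta_{21}}{\beta_{22} - \beta_{12}} + \frac{\varepsilon_{1t} - \varepsilon_{2t}}{\beta_{22} - \beta_{12}}, \qquad E(w_t\varepsilon_{1t}) \neq 0, \qquad E(w_t\varepsilon_{2t}) \neq 0. \tag{14}$$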


Page 12: Lecture 6 LBS Slides

Multiple-Equation Modelling: Macroeconomics

• Macroeconomic, or general-equilibrium, modelling:

– Interested in joint determination of many variables.
– E.g. Interest rates, output gap, inflation.
– Macro models specify equations for each of these variables.

• All such systems have two types of variable:

– Endogenous Variables: Determined within the model/system.
– Exogenous Variables: Determined outside the model/system.
∗ Also known as predetermined variables.


Page 13: Lecture 6 LBS Slides

Multiple-Equation Modelling

• Stylised model: M endogenous variables Y1,t, . . . , YM,t, K exogenous variables X1,t, . . . , XK,t.

• Implies M equations, one for each endogenous variable:

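$$\begin{aligned}
Y_{1,t} &= \beta_{12}Y_{2,t} + \beta_{13}Y_{3,t} + \cdots + \beta_{1M}Y_{M,t} + \gamma_{11}X_{1,t} + \cdots + \gamma_{1K}X_{K,t} + \varepsilon_{1,t}, \\
Y_{2,t} &= \beta_{21}Y_{1,t} + \beta_{23}Y_{3,t} + \cdots + \beta_{2M}Y_{M,t} + \gamma_{21}X_{1,t} + \cdots + \gamma_{2K}X_{K,t} + \varepsilon_{2,t}, \\
&\;\;\vdots \\
Y_{M,t} &= \beta_{M1}Y_{1,t} + \beta_{M2}Y_{2,t} + \cdots + \beta_{M,M-1}Y_{M-1,t} + \gamma_{M1}X_{1,t} + \cdots + \gamma_{MK}X_{K,t} + \varepsilon_{M,t}.
\end{aligned}$$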

• This is the structural form of an economic, or econometric model.

– Endogenous variables in terms of other endogenous and exogenous variables.

• Some or many coefficients may be restricted to zero by theory a priori.


Page 14: Lecture 6 LBS Slides

Structural and Reduced Form Modelling

• Could write structural form instead as: BYt = GXt + εt. (15)

• Where:

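$$B = \begin{pmatrix} 1 & -\beta_{12} & -\beta_{13} & \cdots & -\beta_{1M} \\ -\beta_{21} & 1 & -\beta_{23} & \cdots & -\beta_{2M} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -\beta_{M1} & -\beta_{M2} & -\beta_{M3} & \cdots & 1 \end{pmatrix}, \qquad G = \begin{pmatrix} \gamma_{11} & \cdots & \gamma_{1K} \\ \vdots & \ddots & \vdots \\ \gamma_{M1} & \cdots & \gamma_{MK} \end{pmatrix},$$

$$\varepsilon_t = \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \\ \vdots \\ \varepsilon_{Mt} \end{pmatrix}, \qquad Y_t = \begin{pmatrix} Y_{1t} \\ Y_{2t} \\ \vdots \\ Y_{Mt} \end{pmatrix}, \qquad X_t = \begin{pmatrix} X_{1t} \\ X_{2t} \\ \vdots \\ X_{Kt} \end{pmatrix}.$$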

• Clearly modelling structural form has endogeneity problems.

– Solve for endogenous variables in terms of exogenous variables. . .


Page 15: Lecture 6 LBS Slides

The Reduced Form

• Reduced form: Represent endogenous variables in terms of exogenous variables.

$$Y_t = B^{-1}GX_t + B^{-1}\varepsilon_t. \tag{16}$$

• E.g. PhDs job market: qdt = β11 + β12wt + ε1t, (17)

qst = β21 + β22wt + ε2t. (18)

• Reduced form here comes from equilibrium: qst = qdt so:

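$$w_t = \frac{\beta_{11} - \beta_{21}}{\beta_{22} - \beta_{12}} + \frac{\varepsilon_{1t} - \varepsilon_{2t}}{\beta_{22} - \beta_{12}} = \Pi_0 + v_t. \tag{19}$$

• Where:

$$\Pi_0 = \frac{\beta_{11} - \beta_{21}}{\beta_{22} - \beta_{12}}, \qquad v_t = \frac{\varepsilon_{1t} - \varepsilon_{2t}}{\beta_{22} - \beta_{12}}. \tag{20}$$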


Page 16: Lecture 6 LBS Slides

Identification Difficulties

• Can now find equilibrium quantity from either original equation:

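\begin{align}
q_t &= \beta_{11} + \beta_{12}\left(\frac{\beta_{11} - \beta_{21}}{\beta_{22} - \beta_{12}} + \frac{\varepsilon_{1t} - \varepsilon_{2t}}{\beta_{22} - \beta_{12}}\right) + \varepsilon_{1t} \tag{21} \\
&= \frac{\beta_{11}\beta_{22} - \beta_{12}\beta_{21}}{\beta_{22} - \beta_{12}} + \frac{\beta_{22}\varepsilon_{1t} - \beta_{12}\varepsilon_{2t}}{\beta_{22} - \beta_{12}} = \Pi_1 + e_t. \tag{22}
\end{align}

• Where:

$$\Pi_1 = \frac{\beta_{11}\beta_{22} - \beta_{12}\beta_{21}}{\beta_{22} - \beta_{12}}, \qquad e_t = \frac{\beta_{22}\varepsilon_{1t} - \beta_{12}\varepsilon_{2t}}{\beta_{22} - \beta_{12}}. \tag{23}$$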

• These are reduced-form equations for wage wt and quantity employed qt.

• Can estimate Π0, Π1: No endogeneity problem remains.

• But Π0, Π1 are non-linear functions of β11, β12, β21, β22, the structural parameters.

– Cannot uncover structural parameters from reduced form estimation: Unidentified.
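
A small numerical check of the point above, as a sketch with made-up parameter values: two different sets of structural parameters (β11, β12, β21, β22) produce exactly the same reduced-form constants Π0 and Π1 from (20) and (23), so the reduced form cannot tell them apart. The function name and the numbers are hypothetical.

```python
def reduced_form(b11, b12, b21, b22):
    """Map structural parameters to (Pi0, Pi1) using (20) and (23)."""
    denom = b22 - b12
    pi0 = (b11 - b21) / denom               # equilibrium wage constant
    pi1 = (b11 * b22 - b12 * b21) / denom   # equilibrium quantity constant
    return pi0, pi1

# Structure A: demand q = 10 - w, supply q = 2 + w.
set_a = reduced_form(10.0, -1.0, 2.0, 1.0)

# Structure B: demand q = 14 - 2w, supply q = -2 + 2w.
set_b = reduced_form(14.0, -2.0, -2.0, 2.0)

print(set_a, set_b)  # both structures imply the same (Pi0, Pi1)
```

Two reduced-form numbers cannot pin down four structural parameters, which is the counting argument behind "unidentified".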


Page 17: Lecture 6 LBS Slides

Identification

• Identification achieved if unique values of structural parameters can be found.

• Main interest in PhD model is the structural parameters β11, β12, β21, β22:

qdt = β11 + β12wt + ε1t, (24)

qst = β21 + β22wt + ε2t. (25)

• But OLS estimation biased and inconsistent due to E(wtεit) ≠ 0, i = 1, 2.

• Reduced-form parameters Π0, Π1 are consistently estimated by OLS.

• But cannot recover β11, β12, β21, β22 from Π0, Π1.

• Our parameters of interest are unidentified, or underidentified.


Page 18: Lecture 6 LBS Slides

Identification Conceptually

• Recall from microeconomic theory: Market price satisfies demand and supply.

• (qt, wt) are set of equilibrium points (quantity, wage):

– Intersections of different demand and supply curves.

[Figure: scatter of observed equilibrium points, axes q and w.]


Page 19: Lecture 6 LBS Slides

Identification Conceptually


[Figure: the same points, axes q and w, each drawn as the intersection of a demand curve (D) and a supply curve (S).]


Page 20: Lecture 6 LBS Slides

Identification Difficulties

• Identification: Many observationally equivalent representations.

• Could multiply demand equation by λ, supply equation by 1 − λ:

λqdt = λβ11 + λβ12wt + λε1t, (26)

(1− λ)qst = (1− λ)β21 + (1− λ)β22wt + (1− λ)ε2t. (27)

• Adding the two equations together yields:

qt = γ1 + γ2wt + ηt, (28)

– Where, using qdt = qst = qt in equilibrium: γ1 = λβ11 + (1 − λ)β21, γ2 = λβ12 + (1 − λ)β22, ηt = λε1t + (1 − λ)ε2t.

• New equation (28) indistinguishable from demand or supply equations.

– Cannot tell which is which from data alone.


Page 21: Lecture 6 LBS Slides

Seeking Identification

• Method for identification: Additional information.

• E.g. Include extra ‘shift’ variable in demand equation.

– Then can conceptually move that variable to shift demand curve.
– Trace out supply curve.

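[Figure: axes q and w, one supply curve S1 crossed by demand curves D1 to D5; the shifting demand curves trace out the supply curve.]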


Page 22: Lecture 6 LBS Slides

Seeking Identification

• For PhDs, add extra variable to demand equation to find supply equation.

– Add student enrolments st to demand equation. Need lecturers to lecture.

qdt = β11 + β12wt + γ11st + ε1t, (29)

qst = β21 + β22wt + ε2t. (30)

• Then reduced form is: wt = Π0 + Π1st + vt, qt = Π2 + Π3st + ut. (31)

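• Where:

$$\Pi_0 = \frac{\beta_{21} - \beta_{11}}{\beta_{12} - \beta_{22}}, \qquad \Pi_1 = \frac{\gamma_{11}}{\beta_{22} - \beta_{12}}, \qquad v_t = \frac{\varepsilon_{2t} - \varepsilon_{1t}}{\beta_{12} - \beta_{22}},$$

$$\Pi_2 = \frac{\beta_{12}\beta_{21} - \beta_{11}\beta_{22}}{\beta_{12} - \beta_{22}}, \qquad \Pi_3 = \frac{\gamma_{11}\beta_{22}}{\beta_{22} - \beta_{12}}, \qquad u_t = \frac{\beta_{12}\varepsilon_{2t} - \beta_{22}\varepsilon_{1t}}{\beta_{12} - \beta_{22}}.$$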

• Five unknowns, four equations. But: $\beta_{21} = \Pi_2 - \beta_{22}\Pi_0$, $\beta_{22} = \Pi_3/\Pi_1$. (32)

– Hence supply equation identified.


Page 23: Lecture 6 LBS Slides

Checking Identification

• As earlier, we can weight the two equations by λ and (1 − λ) and add them:

qt = γ0 + γ1wt + γ2st + εt. (33)

– γk and εt defined accordingly.

• Hence: This new representation indistinguishable from demand equation:

– Demand unidentified: Can take any linear combination of equations.

• But supply identified: Can distinguish supply from (33) by st term.

• We identify supply equation by adding term to demand equation:

– Extra demand term allows us to trace out supply curve.
– Regression means can hold all else fixed, vary st.
– Shifts demand curve while supply fixed:
∗ What we get must be supply curve.


Page 24: Lecture 6 LBS Slides

Seeking Identification

• Can repeat ‘trick’ to identify demand: Add term to supply equation.

– Add consultancy wage rate ct to supply equation. Outside option.
– Add student enrolments st to demand equation. Need lecturers to lecture.

qdt = β11 + β12wt + γ11st + ε1t, (34)

qst = β21 + β22wt + γ21ct + ε2t. (35)

• Will yield reduced forms: wt = Π0 + Π1st + Π2ct + vt, (36)

qt = Π3 + Π4st + Π5ct + ut. (37)

• 6 unknowns: (β11, β12, β21, β22, γ11, γ21), 6 equations: (Π0,Π1,Π2,Π3,Π4,Π5).

– All parameters identified, both equations identified.


Page 25: Lecture 6 LBS Slides

Overidentification

• Other factors influence demand and supply. E.g. for PhDs, previous period wage.

qdt = β11 + β12wt + γ11st + ε1t, (38)

qst = β21 + β22wt + γ21ct + γ22wt−1 + ε2t. (39)

• Usual problem of simultaneity bias means we look for reduced form:

wt = Π0 + Π1st + Π2ct + Π3wt−1 + vt, (40)

qt = Π4 + Π5st + Π6ct + Π7wt−1 + ut. (41)

• Got eight equations (Πs) for only seven structural parameters. E.g.:

β12 = Π6/Π2, β12 = Π7/Π3. (42)


Page 26: Lecture 6 LBS Slides

Overidentification

• Got eight equations (Πis) for only seven structural parameters. E.g.:

β12 = Π6/Π2, β12 = Π7/Π3. (43)

• Model overidentified: Too much information.

– We exclude two variables (ct, wt−1) for just one endogenous variable in demand function.

• Multiple expressions for parameters such as β12 may give different answers.

– Ambiguity transmitted to other parameters: β12 in denominators of other Πis.

• But TMI not necessarily a bad thing:

– Estimation methods exist to handle extra information.


Page 27: Lecture 6 LBS Slides

Identification More Formally

• Identification very difficult to grasp and to find.

• Rank and order conditions exist to check for identification:

– Facilitate finding identification via automation in computer packages.

• Recap and extension of notation:

– M: Number of endogenous variables in model/system.
– m: Number of endogenous variables in equation of model.
– K: Number of exogenous, predetermined variables in model/system.
– k: Number of exogenous, predetermined variables in equation of model/system.


Page 28: Lecture 6 LBS Slides

The Order Condition

• Order condition is necessary but not sufficient for identification.

• Can be stated in two ways:

1. In model of M simultaneous equations, equation identified if:

• It excludes at least M − 1 variables (endogenous or otherwise).
– If fewer than M − 1 excluded, unidentified.
– If exactly M − 1 excluded, just identified.
– If more than M − 1 excluded, overidentified.

2. In M-equation system, equation identified if:

• Number of exogenous variables excluded at least as large as number of endogenous variables in equation minus 1:

K − k ≥ m− 1. (44)


Page 29: Lecture 6 LBS Slides

The Order Condition: Examples

• Simple demand example: qdt = β11 + β12wt + ε1t, (45)

qst = β21 + β22wt + ε2t. (46)

– Two endogenous variables, M = 2, each equation excludes zero variables.
∗ Unidentified.

• Add student enrolment: qdt = β11 + β12wt + γ11st + ε1t, (47)

qst = β21 + β22wt + ε2t. (48)

– Two endogenous variables, M = 2, K = 1.
– Demand equation: Excludes zero variables hence unidentified.
– Supply equation: Excludes one variable (st) hence identified.


Page 30: Lecture 6 LBS Slides

The Order Condition: Examples

• Add student enrolment, consultancy wages and lagged wages:

qdt = β11 + β12wt + γ11st + ε1t, (49)

qst = β21 + β22wt + γ21ct + γ22wt−1 + ε2t. (50)

– Two endogenous variables, M = 2, three exogenous K = 3.
– Demand equation: Excludes two variables (ct, wt−1) hence overidentified.
– Supply equation: Excludes one variable (st) hence just identified.

• This agrees with the reduced form, where the demand slope β12 is pinned down twice: once by the ct terms and once by the wt−1 terms.
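
The counting in the first statement of the order condition is mechanical, so it can be sketched as a small helper. This is an illustrative function only (the name and return strings are assumptions), and it checks the necessary order condition, not the rank condition.

```python
def order_condition(M, n_excluded):
    """Classify one equation by the order condition.

    M          -- number of endogenous variables (equations) in the system
    n_excluded -- number of system variables absent from this equation
    """
    if n_excluded < M - 1:
        return "unidentified"
    if n_excluded == M - 1:
        return "just identified"
    return "overidentified"

M = 2  # endogenous variables: q_t and w_t

# Simple model: neither equation excludes anything.
print(order_condition(M, 0))   # demand and supply alike

# Enrolments s_t added to demand only.
print("demand:", order_condition(M, 0), "| supply:", order_condition(M, 1))

# s_t in demand; c_t and w_{t-1} in supply.
print("demand:", order_condition(M, 2), "| supply:", order_condition(M, 1))
```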


Page 31: Lecture 6 LBS Slides

The Order Condition

• Order condition probably most commonly used identification strategy.

• Use theory to argue why particular variables excluded.

• E.g. Rainfall affects supply and not demand for wheat.

• Rainfall is stark example: Not always so in economics.

– E.g. Ricardian equivalence: Debt has no real effect?!

• Implications of wrong identification strategy not innocuous:

– But very hard to test!


Page 32: Lecture 6 LBS Slides

The Rank Condition

• Order condition necessary but not sufficient for identification.

• Even if satisfied equation may not be identified.

• E.g. If st insignificant in demand equation, γ11 = 0, supply unidentified.

• Identification also violated if exogenous variables excluded not independent:

– If linear combination exists, mapping from βs and γs to Πs non-unique.

Rank Condition:

• An equation in an M-equation system is identified iff at least one non-zero determinant of order (M − 1) × (M − 1) can be constructed using the coefficients (endogenous or exogenous) of variables excluded from that equation but included in the other equations.

– Necessary and sufficient condition for identification.


Page 33: Lecture 6 LBS Slides

Rank Condition: An Example

• System of 4 endogenous Y variables and 3 exogenous X variables:

Y1t − β10 − β12Y2t − β13Y3t − γ11X1t = u1t, (51)

Y2t − β20 − β23Y3t − γ21X1t − γ22X2t = u2t, (52)

Y3t − β30 − β31Y1t − γ31X1t − γ32X2t = u3t, (53)

Y4t − β40 − β41Y1t − β42Y2t − γ43X3t = u4t. (54)

• Identified?

Eq. No   K − k   m − 1   Identified?
(51)     2       2       Exactly
(52)     1       1       Exactly
(53)     1       1       Exactly
(54)     2       2       Exactly


Page 34: Lecture 6 LBS Slides

Rank Condition: An Example

• To help with rank condition write equations in table:

Coefficients of the variables

Eq No.   1      Y1     Y2     Y3     Y4   X1     X2     X3
(51)     −β10   1      −β12   −β13   0    −γ11   0      0
(52)     −β20   0      1      −β23   0    −γ21   −γ22   0
(53)     −β30   −β31   0      1      0    −γ31   −γ32   0
(54)     −β40   −β41   −β42   0      1    0      0      −γ43

• To check equation (51): Form matrix of coefficients on Y4, X2, X3:

$$A = \begin{pmatrix} 0 & -\gamma_{22} & 0 \\ 0 & -\gamma_{32} & 0 \\ 1 & 0 & -\gamma_{43} \end{pmatrix}, \qquad \det A = 0. \tag{55}$$

• Hence A not full rank: Rows/columns not linearly independent.

– Relationships exist between variables hence unidentified.
– Cannot tell (52) and (53) apart hence can’t tell (51) from either.
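
The rank check on A in (55) can be reproduced numerically. The sketch below plugs in arbitrary non-zero values for γ22, γ32 and γ43 (hypothetical numbers, chosen only for illustration); the determinant is zero whatever values are used, because the first two rows are proportional.

```python
import numpy as np

# Hypothetical non-zero values for the structural coefficients.
g22, g32, g43 = 0.7, -1.3, 2.0

# Rows: equations (52), (53), (54); columns: Y4, X2, X3,
# the variables excluded from equation (51).
A = np.array([
    [0.0, -g22, 0.0],
    [0.0, -g32, 0.0],
    [1.0, 0.0, -g43],
])

rank = np.linalg.matrix_rank(A)
det = np.linalg.det(A)
M = 4  # equations in the system

print(f"rank(A) = {rank}, required M - 1 = {M - 1}, det(A) = {det:.1f}")
```

Since rank(A) falls short of M − 1, equation (51) fails the rank condition even though it passes the order condition.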


Page 35: Lecture 6 LBS Slides

More on Identification

• Order condition necessary but not sufficient, rank condition necessary and sufficient.

• Rank tells us whether identified or not, order whether exact- or over-identification.

• Four cases:

1. If K − k > m − 1 and rank (A) = M − 1 equation overidentified.
2. If K − k = m − 1 and rank (A) = M − 1 equation exactly identified.
3. If K − k ≥ m − 1 and rank (A) < M − 1 equation under identified.
4. If K − k < m − 1 equation unidentified.

• Rank condition can get difficult with large dimension systems:

– Often just order condition used if software cannot calculate.


Page 36: Lecture 6 LBS Slides

Testing for Simultaneity

• If we have no simultaneity problem then OLS consistent.

– If we do have simultaneity then need 2SLS/IV — to come.

• Thus testing for simultaneity helpful. Hausman (1976) provided a test.

• Consider model: Qdt = α0 + α1Pt + α2Xt + ε1t, (56)

Qst = β0 + β1Pt + ε2t. (57)

• Reduced form: Pt = Π0 + Π1Xt + vt, (58)

Qt = Π2 + Π3Xt + et. (59)

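• OLS gives fitted values $\hat P_t$ and residuals $\hat v_t$, with $P_t = \hat P_t + \hat v_t$.

• Sub back into supply: $Q_t = \beta_0 + \beta_1\hat P_t + \beta_1\hat v_t + \varepsilon_{2t}$.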


Page 37: Lecture 6 LBS Slides

Testing for Simultaneity

• Test equation: $Q_t = \beta_0 + \beta_1\hat P_t + \beta_1\hat v_t + \varepsilon_{2t} = \beta_0 + \beta_1\hat P_t + \theta\hat v_t + \varepsilon_{2t}$.

• If simultaneity then vt correlated with ε2t:

– vt is variation remaining in Pt controlling for exogenous variables.
– We have split Pt into potentially simultaneous component via instrumenting.

• If simultaneity then vt significant if run OLS on test equation. Since:

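$$\hat\theta = \frac{\sum_{t=1}^{T} Q_t\hat v_t}{\sum_{t=1}^{T} \hat v_t^2} = \frac{\sum_{t=1}^{T} Q_t(P_t - \hat P_t)}{\sum_{t=1}^{T} \hat v_t^2}. \tag{60}$$

• Hence Hausman test for simultaneity is t-test on the OLS coefficient on $\hat v_t$: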

– If test rejected, conclude simultaneity.
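
A simulated version of the test may help fix ideas. This sketch makes several assumptions: hypothetical parameter values, independent unit-variance errors, and a commonly used variant in which Qt is regressed on the actual Pt (rather than the fitted value) plus the first-stage residual, so that the coefficient on the residual is zero when there is no simultaneity. The helper `ols` is illustrative, not a library function.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and conventional standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    e = y - X @ b
    s2 = (e @ e) / (len(y) - X.shape[1])
    return b, np.sqrt(s2 * np.diag(XtX_inv))

rng = np.random.default_rng(1)
T = 5000
x = rng.normal(scale=2.0, size=T)   # exogenous demand shifter X_t
e1 = rng.normal(size=T)             # demand error
e2 = rng.normal(size=T)             # supply error

a0, a1, a2 = 10.0, -1.0, 1.0        # demand (56), hypothetical values
b0, b1 = 2.0, 1.0                   # supply (57), hypothetical values

# Market clearing: solve demand = supply for P, then read Q off supply.
P = (a0 - b0 + a2 * x + e1 - e2) / (b1 - a1)
Q = b0 + b1 * P + e2

# Reduced form (58): regress P on the exogenous variable, keep residuals.
const = np.ones(T)
Z = np.column_stack([const, x])
pi_hat, _ = ols(Z, P)
v_hat = P - Z @ pi_hat

# Test regression: supply equation augmented with the residual.
coef, se = ols(np.column_stack([const, P, v_hat]), Q)
t_stat = coef[2] / se[2]

print(f"coefficient on v_hat = {coef[2]:.3f}, t = {t_stat:.1f}")
print(f"coefficient on P     = {coef[1]:.3f} (assumed supply slope {b1})")
```

In this design prices are endogenous by construction, so the t-statistic on the residual is far from zero and the test points to simultaneity.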


Page 38: Lecture 6 LBS Slides

Estimation

• Estimating system: BYt = GXt + εt (61)

– M endogenous variables in M × 1 vector Yt, M × M coefficient matrix B.
– K exogenous variables in K × 1 vector Xt, M × K coefficient matrix G.

• Recall reduced form:

$$Y_t = B^{-1}GX_t + B^{-1}\varepsilon_t = \Pi X_t + u_t. \tag{62}$$

• OLS estimation of Π consistent.

– But Π rarely of interest: These are reduced-form parameters.
– B, G more of interest but not necessarily identified.


Page 39: Lecture 6 LBS Slides

Estimation

• Two estimation methods:

1. Limited Information Methods:

• Estimate each equation of system separately.
• Take into account restrictions on that equation, not others.

2. Full Information Methods:

• Estimate all equations jointly, or simultaneously.
• Impose all restrictions on all equations (required for identification).


Page 40: Lecture 6 LBS Slides

Other Special Cases

• If the model is recursive, also known as triangular or causal, then:

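$$B = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ -\beta_{21} & 1 & 0 & \cdots & 0 \\ -\beta_{31} & -\beta_{32} & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -\beta_{M1} & -\beta_{M2} & -\beta_{M3} & \cdots & 1 \end{pmatrix}. \tag{63}$$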

• I.e. Y1 causes Y2 causes Y3 etc. hence no endogeneity problems.

• Also require that Cov(εt) = Ω is diagonal, i.e. Cov(εit, εjt) = 0 for all j ≠ i.

• Here OLS is consistent.


Page 41: Lecture 6 LBS Slides

Other Special Cases

• Vector autoregression: E.g. VAR(2):

Xt = Π1Xt−1 + Π2Xt−2 + εt. (64)

• Here:

$$X_t = \begin{pmatrix} X_{1t} \\ X_{2t} \\ \vdots \\ X_{pt} \end{pmatrix}. \tag{65}$$

• Model has no contemporaneous values of X1, . . . , Xp on RHS hence OLS consistent.

• Usually economic theory imposes structure on contemporaneous variables:

BXt = Π1Xt−1 + Π2Xt−2 + εt (66)

– Then again got issue of endogeneity.
– More on VARs later. . .


Page 42: Lecture 6 LBS Slides

Limited Information Methods

• In just-identified case, we can use indirect least squares (ILS). Model:

qdt = α0 + α1pt + α2st + ε1t, (67)

qst = β0 + β1pt + β2ct + ε2t. (68)

• Proceed in three steps:

1. Find reduced-form equations:
– Endogenous variables in terms of only exogenous variables.

pt = Π0 + Π1st + Π2ct + vt,

qt = Π3 + Π4st + Π5ct + ut.

2. Estimate reduced-form equations by OLS:
– Yields consistent estimates of reduced-form parameters as no endogeneity.
– Produces $\hat\Pi_0, \ldots, \hat\Pi_5$.

3. Obtain estimates of structural coefficients by one-to-one correspondence:
– Requires just-identification to get $\hat\alpha_0, \ldots, \hat\beta_2$.
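
The three ILS steps can be sketched on simulated data. Everything numerical here is an assumption made for illustration (parameter values, normal errors, sample size). The mapping in step 3 follows from substituting the reduced form for pt into each structural equation: Π4 = β1Π1 and Π5 = α1Π2, with the remaining parameters recovered by back-substitution.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 50_000
s = rng.normal(scale=2.0, size=T)   # enrolments (demand shifter)
c = rng.normal(scale=2.0, size=T)   # consultancy wage (supply shifter)
e1 = rng.normal(size=T)
e2 = rng.normal(size=T)

a0, a1, a2 = 10.0, -1.0, 0.5        # demand (67), hypothetical values
b0, b1, b2 = 2.0, 1.0, -0.5         # supply (68), hypothetical values

# Step 1: market clearing gives the reduced form for p; q follows from supply.
p = (b0 - a0 - a2 * s + b2 * c + e2 - e1) / (a1 - b1)
q = b0 + b1 * p + b2 * c + e2

# Step 2: OLS on each reduced-form equation.
Z = np.column_stack([np.ones(T), s, c])
pi_p, *_ = np.linalg.lstsq(Z, p, rcond=None)   # Pi0, Pi1, Pi2
pi_q, *_ = np.linalg.lstsq(Z, q, rcond=None)   # Pi3, Pi4, Pi5

# Step 3: one-to-one mapping back to structural parameters.
b1_hat = pi_q[1] / pi_p[1]            # s_t is excluded from supply
a1_hat = pi_q[2] / pi_p[2]            # c_t is excluded from demand
a2_hat = pi_q[1] - a1_hat * pi_p[1]
b2_hat = pi_q[2] - b1_hat * pi_p[2]
a0_hat = pi_q[0] - a1_hat * pi_p[0]
b0_hat = pi_q[0] - b1_hat * pi_p[0]

print("demand:", round(a0_hat, 2), round(a1_hat, 2), round(a2_hat, 2))
print("supply:", round(b0_hat, 2), round(b1_hat, 2), round(b2_hat, 2))
```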


Page 43: Lecture 6 LBS Slides

Limited Information Methods

• ILS breaks down if equation overidentified:

– More than one possibility for each parameter, standard errors dubious.

• Say model is: qdt = α0 + α1pt + α2st + α3pt−1 + ε1t,

qst = β0 + β1pt + β2ct + ε2t.

• Problem remains that pt endogenous hence E(ptε1t) ≠ 0.

• Require method to isolate component of pt correlated with ε1t.

• Can use instrumental variable estimation:

– Exogenous variables st, pt−1, ct satisfy one instrumenting condition.
∗ Namely uncorrelatedness with the error term.
– Expect relevance condition to hold, otherwise unidentified.
– Method: Two-stage least squares.


Page 44: Lecture 6 LBS Slides

Two-Stage Least Squares

1. Regress endogenous variable pt on all exogenous variables in system:

pt = π0 + π1st + π2pt−1 + π3ct + ut. (69)

• Yields estimates $\hat\pi_0, \ldots, \hat\pi_3$ to use to get fitted values $\hat p_t$.
• $\hat p_t$ is variation in pt explained by exogenous variables.
– Hence uncorrelated with error component.
• $\hat u_t = p_t - \hat p_t$ is remaining component correlated with error.

2. Run original system equation using $\hat p_t$ in place of pt:

$$q^d_t = \alpha_0 + \alpha_1\hat p_t + \alpha_2 s_t + \alpha_3 p_{t-1} + \varepsilon_{1t}. \tag{70}$$

• Resulting estimate $\hat\alpha_1$ consistent provided exogenous variables valid instruments.
• Isolate and remove component of pt correlated with error term by instrumenting.
• Can estimate even if model overidentified as here.
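
The two stages can be sketched with NumPy on simulated data. The design below is an assumption made for illustration: the demand equation contains st, and two exogenous supply shifters (ct and a hypothetical second shifter zt standing in for any further excluded exogenous variable) serve as instruments, so the demand equation is overidentified. Parameter values are made up.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 20_000
s, c, z = (rng.normal(scale=2.0, size=T) for _ in range(3))
e1 = rng.normal(size=T)
e2 = rng.normal(size=T)

a0, a1, a2 = 10.0, -1.0, 0.5             # demand, hypothetical values
b0, b1, b2, b3 = 2.0, 1.0, -0.5, -0.5    # supply, hypothetical values

# Market clearing price and quantity.
p = (b0 - a0 - a2 * s + b2 * c + b3 * z + e2 - e1) / (a1 - b1)
q = a0 + a1 * p + a2 * s + e1

const = np.ones(T)

# Stage 1: regress p on ALL exogenous variables, keep fitted values.
Zmat = np.column_stack([const, s, c, z])
pi_hat, *_ = np.linalg.lstsq(Zmat, p, rcond=None)
p_hat = Zmat @ pi_hat

# Stage 2: demand equation with p_hat in place of p.
X_hat = np.column_stack([const, p_hat, s])
alpha_2sls, *_ = np.linalg.lstsq(X_hat, q, rcond=None)

# Plain OLS on the demand equation, for comparison.
X = np.column_stack([const, p, s])
alpha_ols, *_ = np.linalg.lstsq(X, q, rcond=None)

# Residuals for standard errors use the actual p, not p_hat.
resid = q - X @ alpha_2sls
sigma2 = (resid @ resid) / (T - X.shape[1])

print(f"true a1 = {a1}, OLS = {alpha_ols[1]:.3f}, 2SLS = {alpha_2sls[1]:.3f}")
print(f"error variance estimate from 2SLS residuals = {sigma2:.3f}")
```

OLS is pulled well away from the assumed demand slope, while 2SLS stays close to it.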


Page 45: Lecture 6 LBS Slides

More on 2SLS

• 2SLS estimator: $\hat\beta_{2SLS} = (Z_i'X_i)^{-1}(Z_i'y_i) = \beta + (Z_i'X_i)^{-1}(Z_i'\varepsilon_i)$. (71)

– Where Zi is vector of exogenous variables, Xi endogenous and yi dependent.

• As before, need E(Z′iXi) ≠ 0 and E(Z′iεi) = 0 for consistency:

– I.e. Need exogenous variables be exogenous and to identify system.

• First stage provides test of relevance of instruments.

• Need to ensure standard errors correctly calculated on second stage:

– Use $p_t$, not $\hat p_t$, to calculate residuals, hence standard errors etc.


Page 46: Lecture 6 LBS Slides

Full Information Methods

• Write as stacked system: y = Zδ + ε, ε ∼ (0,Σ). (72)

• Or:

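$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{pmatrix} = \begin{pmatrix} Z_1 & 0 & \cdots & 0 \\ 0 & Z_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Z_M \end{pmatrix} \begin{pmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_M \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_M \end{pmatrix}. \tag{73}$$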

• Here: Zm = (Ym, Xm), hence both endogenous and exogenous variables.

• OLS estimator:

$$\hat\delta_{OLS} = (Z'Z)^{-1}Z'y. \tag{74}$$

– OLS on system equivalent to equation-by-equation OLS hence inconsistent.
∗ Also by SUR inefficient: Does not exploit information in Σ.


Page 47: Lecture 6 LBS Slides

Full Information: 3SLS

• Method to solve inconsistency:

– Instrumental Variables estimation.

• Method to solve inefficiency:

– Feasible GLS estimation.

• Hence find estimator in three stages:

– Three-stage least squares:


Page 48: Lecture 6 LBS Slides

Full Information: 3SLS

1. Instrumenting equation estimation:

• Use all exogenous variables X as instruments.
• Yielding $\hat\Pi$ to create $\hat Y_i = \hat\Pi X_i$, where i = 2, . . . , M for equation 1, etc.

– Gives matrix $\hat Z_i$ formed of $\hat Y_i$ and $X_i$.

2. Instrumental variables estimation:

• Use $\hat Z_i$ in place of $Z_i$.
• Yields 2SLS estimators $\hat\delta_{2SLS}$:

$$\hat\delta_{2SLS} = (\hat Z'\hat Z)^{-1}(\hat Z'y) = (W'Z)^{-1}W'y. \tag{75}$$

• Estimator consistent only if E(W′ε) = 0, E(W′Z) ≠ 0.
• Also get variance-covariance matrix estimator $\hat\Sigma$:

$$\hat\sigma_{ij} = T^{-1}(y_i - Z_i\hat\delta_i)'(y_j - Z_j\hat\delta_j). \tag{76}$$


Page 49: Lecture 6 LBS Slides

Full Information: 3SLS

3. Feasible GLS estimation:

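• Use $\hat Z$ and $\hat\Sigma$ from 2SLS estimates.

$$\hat\delta_{IV,GLS} = \hat\delta_{3SLS} = (\hat Z'\hat\Sigma^{-1}\hat Z)^{-1}\hat Z'\hat\Sigma^{-1}y = (W'\hat\Sigma^{-1}Z)^{-1}W'\hat\Sigma^{-1}y. \tag{77}$$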

• 3SLS consistent provided instruments are valid.

• Asymptotic efficiency amongst system IV estimators.


Page 50: Lecture 6 LBS Slides

Full-Information Maximum Likelihood (FIML)

• Likelihood framework can be applied to system.

• Begin with reduced-form: Y = XΠ + V. (78)

– Each row of V assumed multivariate Normal: vi |X ∼ N (0,Ω).

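• Log-likelihood function:

$$\ln L = -\frac{T}{2}\left[M\ln(2\pi) + \ln|\Omega| + \operatorname{tr}\left(\Omega^{-1}W\right)\right]. \tag{79}$$

• Where, with $y_i$ the ith column of Y:

$$W_{ij} = T^{-1}(y_i - X\pi_i)'(y_j - X\pi_j). \tag{80}$$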

– Here, πi is ith column of Π not the number.

• Maximise likelihood subject to all restrictions placed on system: B matrix.


Page 51: Lecture 6 LBS Slides

Full-Information Maximum Likelihood (FIML)

• Reduced form Y = XΠ + V found from structural form:

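$$Y\Gamma = XB + U, \qquad U \sim (0, \Sigma),$$

$$\Rightarrow\; Y = XB\Gamma^{-1} + U\Gamma^{-1}, \qquad U\Gamma^{-1} \sim (0, \Gamma^{-1\prime}\Sigma\Gamma^{-1}).$$

• Interested in structural form not reduced form so use substitutions:

$$\Pi = B\Gamma^{-1}, \qquad \Omega = \Gamma^{-1\prime}\Sigma\Gamma^{-1}, \qquad \Omega^{-1} = \Gamma\Sigma^{-1}\Gamma'. \tag{81}$$

• Hence:

$$\ln L = -\frac{T}{2}\left[M\ln(2\pi) + \ln\left|\Gamma^{-1\prime}\Sigma\Gamma^{-1}\right| + \operatorname{tr}\left(\Gamma\Sigma^{-1}\Gamma'\,T^{-1}(Y - XB\Gamma^{-1})'(Y - XB\Gamma^{-1})\right)\right].$$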


Page 52: Lecture 6 LBS Slides

Full-Information Maximum Likelihood (FIML)

• Again:

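$$\ln L = -\frac{T}{2}\left[M\ln(2\pi) + \ln\left|\Gamma^{-1\prime}\Sigma\Gamma^{-1}\right| + \operatorname{tr}\left(\Gamma\Sigma^{-1}\Gamma'\,T^{-1}(Y - XB\Gamma^{-1})'(Y - XB\Gamma^{-1})\right)\right].$$

• Simplified:

$$\ln L = -\frac{T}{2}\left[M\ln(2\pi) - 2\ln|\Gamma| + \ln|\Sigma| + \operatorname{tr}\left(\Sigma^{-1}S\right)\right].$$

• Where, with $\Gamma_i$ and $B_i$ the ith columns of Γ and B:

$$s_{ij} = T^{-1}(Y\Gamma_i - XB_i)'(Y\Gamma_j - XB_j). \tag{82}$$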

• Maximise likelihood to yield Γ, B matrices:


Page 53: Lecture 6 LBS Slides

Full-Information Methods

• FIML:

– Coherent estimation framework.
– Testing feasible via LR, LM, Wald tests.
– But: Normality assumption may not be valid.
– But: Numerical optimisation.

• 3SLS:

– Vastly easier to compute: No numerical methods.
– If Normal errors assumed, 3SLS and FIML same asymptotic properties.
∗ 3SLS thus much more popular in usage.

– Small samples: Because many parameters to estimate, 3SLS and FIML may diverge.


Page 54: Lecture 6 LBS Slides

Simultaneous Equation Methods Condensed

• Often need to estimate more than one equation:

– Similar regression for different firms, goods, etc.
– Equations for all endogenous variables in system.

• Estimation:

– Instrumental variables to counter endogeneity.
∗ Exogenous variables are instruments.

– Full- or limited-information (IV) methods:
∗ Full can be computationally cumbersome.
∗ Limited information methods inefficient.


Page 55: Lecture 6 LBS Slides

Break time?


Page 56: Lecture 6 LBS Slides

Adding Lagged Variables: Vector Autoregressions

• Simultaneous equations models often used in time-series context:

• Huge macroeconomic models of 1960s and 1970s:

– Klein-Goldberger model of US economy: 20 equations.
– Brookings-Social Science Research Council model: 150 equations.

• Models need not be time series but generally are.

• What if many lags included? Stability?

• Alternative simultaneous equations model:

– The Vector Autoregression (VAR, not VaR).
– Only endogenous variables, but no contemporaneous terms.
– Often used for ‘theory-free’ estimation or forecasting.


Page 57: Lecture 6 LBS Slides

Vector Autoregressive Models

• Already noted second-order VAR: Two lags:

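$$X_t = \Pi_0 + \Pi_1X_{t-1} + \Pi_2X_{t-2} + \varepsilon_t.$$

$$\begin{pmatrix} X_{1,t} \\ \vdots \\ X_{p,t} \end{pmatrix} = \begin{pmatrix} \pi^0_1 \\ \vdots \\ \pi^0_p \end{pmatrix} + \begin{pmatrix} \pi^1_{11} & \cdots & \pi^1_{1p} \\ \vdots & \ddots & \vdots \\ \pi^1_{p1} & \cdots & \pi^1_{pp} \end{pmatrix} \begin{pmatrix} X_{1,t-1} \\ \vdots \\ X_{p,t-1} \end{pmatrix} + \begin{pmatrix} \pi^2_{11} & \cdots & \pi^2_{1p} \\ \vdots & \ddots & \vdots \\ \pi^2_{p1} & \cdots & \pi^2_{pp} \end{pmatrix} \begin{pmatrix} X_{1,t-2} \\ \vdots \\ X_{p,t-2} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \vdots \\ \varepsilon_{p,t} \end{pmatrix}.$$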

• All variables X1, X2, . . . , Xp determined within system. Endogenous.

• But: No variables dated t enter the right-hand side of any equation:

– Hence can apply time-series methods to estimate.
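
Because only lags appear on the right-hand side, each equation of the VAR can be estimated by OLS. A minimal sketch, assuming a simulated bivariate VAR(2) with made-up, stable coefficient matrices and standard normal errors:

```python
import numpy as np

rng = np.random.default_rng(0)
T, p = 20_000, 2

# Hypothetical VAR(2) parameters (chosen so the process is stable).
Pi0 = np.array([0.1, -0.2])
Pi1 = np.array([[0.5, 0.1], [0.0, 0.4]])
Pi2 = np.array([[0.2, 0.05], [0.0, 0.1]])

# Simulate X_t = Pi0 + Pi1 X_{t-1} + Pi2 X_{t-2} + eps_t.
X = np.zeros((T, p))
for t in range(2, T):
    X[t] = Pi0 + Pi1 @ X[t - 1] + Pi2 @ X[t - 2] + rng.normal(size=p)

# Regress X_t on a constant and the two lags; one OLS fit per column.
Y = X[2:]
Z = np.column_stack([np.ones(T - 2), X[1:-1], X[:-2]])
coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)   # shape (1 + 2p, p)

Pi0_hat = coef[0]
Pi1_hat = coef[1:1 + p].T      # transpose: rows of coef are regressors
Pi2_hat = coef[1 + p:].T

print("Pi1_hat =\n", Pi1_hat.round(2))
print("Pi2_hat =\n", Pi2_hat.round(2))
```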


Page 58: Lecture 6 LBS Slides

The Vector Autoregressive Model

• Object of interest: set of variables over time:

• Xt p-dimensional data vector at time t:

$$X_t = \begin{pmatrix} X_{1,t} \\ X_{2,t} \\ \vdots \\ X_{p,t} \end{pmatrix}. \tag{83}$$

• p variables relating to particular problem of interest:

– Xt is snapshot at point in time:

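$$X_t = \begin{pmatrix} (tax/Y)_t \\ (G/Y)_t \\ (D/Y)_t \\ \pi_t \\ Y^{gap}_t \end{pmatrix}, \qquad X_{2008:3} = \begin{pmatrix} 0.180 \\ 0.218 \\ 0.695 \\ 0.053 \\ -0.022 \end{pmatrix}. \tag{84}$$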

• X is p × T data matrix: (85)

          1966:1   1966:2   1966:3   1966:4   . . .   2008:3
(tax/Y)   0.172    0.176    0.176    0.176    . . .   0.180
(G/Y)     0.165    0.172    0.174    0.177    . . .   0.218
(D/Y)     0.416    0.405    0.409    0.408    . . .   0.695
π         0.023    0.028    0.033    0.037    . . .   0.053
Y gap     0.058    0.050    0.046    0.042    . . .   −0.022


Page 59: Lecture 6 LBS Slides

The Vector Autoregressive Model

• The p variables X1,t, . . . , Xp,t could be:

– Same variable, different countries/regions/firms/people: e.g. exchange rates.
– Different variables, same country/region/firm/person: e.g. consumption, income.

• Can view as reduced-form modelling:

– Without assuming exogeneity.

• Can test exogeneity assumptions easily.

– Variable Xp,t exogenous if determined outside system.
– Hence if all coefficients on Xs,t−k insignificant, p ≠ s, k > 0, Xp,t exogenous.
– Can then omit equation in Xp,t from system.
∗ Similar to Granger causality test: More later. . .


Page 60: Lecture 6 LBS Slides

What are VARs Used For?

• Forecasting:

– All RHS variables are lagged: Can forecast tomorrow given today’s data.

• Theory-free modelling:

– Sims: No need for ‘incredible restrictions’ implied by theory.

• Theory-full modelling:

– VARs correspond well to reduced form of many macroeconomic theory models.
– Theory interested in effect of impulse to variables through time:
∗ E.g. Effect of government spending on GDP.


Page 61: Lecture 6 LBS Slides

What are VARs Used For?

• VARs give extremely rich characterisation of dynamics in data.

– Autoregressive and distributed lag components.
– Can cope with unit roots as in AR(k) model: Cointegration.
– Deterministic terms can be incorporated (with care).
– No conditioning or exogeneity assumed.

• Many theories in economics and more widely postulate steady states.

– Yet data are non-stationary and endogenous.
– Concept of cointegration means can estimate steady state.
– VAR model allows endogeneity and testing of exogeneity.


Page 62: Lecture 6 LBS Slides

Cointegration

• If Yt, Xt both I(d) then in general any linear combination will also be I(d).

• If there exists linear combination et = Yt − βXt s.t. et ∼ I(d− b), b > 0 then:

– Yt, Xt are cointegrated of order (d, b).

• In reality rarely find anything other than cointegration of order (1, 1): ‘Cointegration’.

• Cointegration very powerful concept in time-series econometrics.

– Recall spurious regression? Brazilian rainfall causes UK GDP?
– Cointegration allows estimation of levels relationships in data.
∗ Even if data non-stationary.

• Same spuriousness possible in VAR, simultaneous equation modelling.


Going back to our ADL Model

• ADL model: Yt = δ + α1Yt−1 + β0Xt + β1Xt−1 + εt. (86)

• Error/equilibrium-correction mechanism (ECM) with long-run solution nested is:

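$$\Delta Y_t = \beta_0 \Delta X_t + (\alpha_1 - 1)\left[ Y_{t-1} - \frac{\delta}{1-\alpha_1} - \frac{\beta_0+\beta_1}{1-\alpha_1} X_{t-1} \right] + \varepsilon_t. \quad (87)$$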

• Define $\mathrm{ecm}_t$ as long-run solution:

$$\mathrm{ecm}_t = Y_t - \frac{\delta}{1-\alpha_1} - \frac{\beta_0+\beta_1}{1-\alpha_1} X_t. \quad (88)$$

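• Subtract $[\delta/(1-\alpha_1) + (\beta_0+\beta_1)X_t/(1-\alpha_1)]$ from both sides of ADL:

$$\mathrm{ecm}_t = \delta - \frac{\delta}{1-\alpha_1} + \alpha_1 Y_{t-1} + \left(\beta_0 - \frac{\beta_0+\beta_1}{1-\alpha_1}\right)X_t + \beta_1 X_{t-1} + \varepsilon_t. \quad (89)$$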


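• Collecting terms, and adding and subtracting a term in $X_{t-1}$ to create $\mathrm{ecm}_{t-1}$:

$$\begin{aligned}
\mathrm{ecm}_t &= \alpha_1\left[Y_{t-1} - \frac{\delta}{1-\alpha_1} - \frac{\beta_0+\beta_1}{1-\alpha_1}X_{t-1}\right] + \left[\frac{\alpha_1(\beta_0+\beta_1)}{1-\alpha_1} + \beta_1\right](X_{t-1} - X_t) + \varepsilon_t \\
&= \alpha_1\,\mathrm{ecm}_{t-1} + \left(\frac{\beta_1+\alpha_1\beta_0}{1-\alpha_1}\right)(X_{t-1}-X_t) + \varepsilon_t \\
&= \alpha_1\,\mathrm{ecm}_{t-1} + \xi_t.
\end{aligned}$$

– Where: $\xi_t = \varepsilon_t - \left(\frac{\beta_1+\alpha_1\beta_0}{1-\alpha_1}\right)\nu_t$.
– $\nu_t$ is error term on random walk for $X_t$: $\nu_t = X_t - X_{t-1}$.

• Hence: $\mathrm{ecm}_t = \alpha_1\,\mathrm{ecm}_{t-1} + \xi_t$, $\xi_t \sim I(0)$. (90)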
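The algebra linking the ADL (86), the ECM (87) and the AR(1) form (90) can be checked numerically. The following is a minimal sketch, assuming illustrative parameter values and a random-walk $X_t$ (neither is taken from the lecture); it confirms that the three representations coincide observation by observation.

```python
import numpy as np

# Illustrative (assumed) parameter values; any |alpha1| < 1 works.
delta, alpha1, beta0, beta1 = 0.5, 0.7, 0.4, 0.2
T = 200
rng = np.random.default_rng(0)

# X_t is a random walk with increments nu_t; Y_t follows the ADL equation (86).
nu = rng.normal(size=T)
eps = rng.normal(size=T)
X = np.cumsum(nu)
Y = np.zeros(T)
for t in range(1, T):
    Y[t] = delta + alpha1 * Y[t - 1] + beta0 * X[t] + beta1 * X[t - 1] + eps[t]

# Long-run coefficients and the equilibrium-correction term (88).
kappa0 = delta / (1 - alpha1)
kappa1 = (beta0 + beta1) / (1 - alpha1)
ecm = Y - kappa0 - kappa1 * X

# ECM form (87): Delta Y_t rebuilt from Delta X_t and ecm_{t-1}.
dY_ecm = beta0 * np.diff(X) + (alpha1 - 1) * ecm[:-1] + eps[1:]

# AR(1) form (90): ecm_t = alpha1 * ecm_{t-1} + xi_t.
xi = eps[1:] - (beta1 + alpha1 * beta0) / (1 - alpha1) * nu[1:]
ecm_ar1 = alpha1 * ecm[:-1] + xi

print(np.allclose(np.diff(Y), dY_ecm), np.allclose(ecm[1:], ecm_ar1))
```

Both comparisons are exact identities, so they hold for any seed and any parameter values with $\alpha_1 \neq 1$.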


What was that for?

• We just found that: ecmt = α1ecmt−1 + ξt, ξ ∼ I(0). (91)

• Thus provided |α1| < 1, then ecmt ∼ I(0):

– Recall: $\mathrm{ecm}_t = Y_t - \frac{\delta}{1-\alpha_1} - \frac{\beta_0+\beta_1}{1-\alpha_1}X_t$.
– $\mathrm{ecm}_t$ is long-run solution for $Y_t$ and $X_t$, steady state.
– Steady-state relationships can exist in context of non-stationary model.
∗ Cointegration: $X_t \sim I(1)$, $Y_t \sim I(1)$, but $\mathrm{ecm}_t = f(Y_t, X_t) \sim I(0)$.

• Fundamental concept in econometrics:

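– Even though data non-stationary can still estimate static steady-state conditions.
– Appeals to common sense and to economic theory: Allows estimation.

$$\Delta Y_t = \beta_0\Delta X_t + (\alpha_1-1)\left[Y_{t-1} - \kappa_0 - \kappa_1 X_{t-1}\right] + \varepsilon_t, \quad \kappa_0 = \frac{\delta}{1-\alpha_1},\ \kappa_1 = \frac{\beta_0+\beta_1}{1-\alpha_1}. \quad (92)$$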


All Good, but What About Estimation?

• Cointegrating relationship is static: $Y_t = \beta X_t + \mathrm{ecm}_t$, $\mathrm{ecm}_t \sim N(0, \sigma^2_{\mathrm{ecm}})$.

– Thus no problems with estimating: no residual autocorrelation.
– Despite dynamic nature of relationship, can estimate static equation.

• Furthermore: it's 'super consistent': estimate converges to true value very fast, at rate $T$ as opposed to $\sqrt{T}$ for a 'normal' estimator.

• Suggests a procedure: Estimate $\mathrm{ecm}_t$, insert that into ECM and estimate.

– Engle-Granger 2-step procedure.
– Intuitive and encompasses test for cointegration: is $\mathrm{ecm}_t \sim I(0)$?
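The two-step procedure can be sketched in a few lines. This is only an illustration under assumed settings: the simulated data, the true coefficients (intercept 1, slope 2, AR(1) errors with coefficient 0.5) and the helper name `ols` are all hypothetical and not from the lecture.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients and residuals for y = X b + e."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b, y - X @ b

# Simulated (assumed) cointegrated pair: X_t is a random walk,
# Y_t = 1 + 2 X_t + u_t with stationary AR(1) errors u_t.
rng = np.random.default_rng(1)
T = 2000
X = np.cumsum(rng.normal(size=T))
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.normal()
Y = 1.0 + 2.0 * X + u

# Step 1: static levels regression; the residuals estimate ecm_t.
b_static, ecm_hat = ols(Y, np.column_stack([np.ones(T), X]))

# Step 2: ECM regression of Delta Y_t on Delta X_t and ecm_{t-1}.
dY, dX = np.diff(Y), np.diff(X)
b_ecm, _ = ols(dY, np.column_stack([dX, ecm_hat[:-1]]))
beta0_hat, adj_hat = b_ecm  # adj_hat estimates (alpha1 - 1)

print(b_static[1], beta0_hat, adj_hat)
```

In this design the adjustment coefficient $(\alpha_1 - 1)$ is $-0.5$, so the step-2 estimate should land near that value while the step-1 slope should be very close to 2.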


That Cointegration Test

• Form of ADL and ECM suggest cointegration test of ADF form:

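$$\Delta \mathrm{ecm}_t = \phi\,\mathrm{ecm}_{t-1} + \sum_{i=1}^{p-1}\phi_i\,\Delta \mathrm{ecm}_{t-i} + \mu + \delta t + \omega_t, \quad \omega_t \sim N(0, \sigma^2_\omega). \quad (93)$$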

• Include trend and constant either in (93) or in $\mathrm{ecm}_t$ equation.

– Since (93) is run on just the residuals from $\mathrm{ecm}_t$ equation.

• Cointegration test: H0 : φ = 0. (94)

• However, cannot use Dickey-Fuller distribution.

– Must use MacKinnon (1991) critical values: See Table 4.1, Harris and Sollis.
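A minimal sketch of the test regression (93) follows, assuming a constant but no trend and one lagged difference. The function name `adf_tstat` and the simulated series are hypothetical. The code only produces the t-ratio on $\phi$; it does not supply critical values, which still have to come from the appropriate tables.

```python
import numpy as np

def adf_tstat(e, lags=1):
    """t-ratio on phi in: d e_t = phi e_{t-1} + sum_i phi_i d e_{t-i} + mu + w_t."""
    e = np.asarray(e, dtype=float)
    de = np.diff(e)                       # de[t] = e[t+1] - e[t]
    rows = range(lags, len(de))
    y = de[lags:]
    # Regressors: lagged level, lagged differences, constant.
    X = np.array([[e[t]] + [de[t - i] for i in range(1, lags + 1)] + [1.0]
                  for t in rows])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    s2 = resid @ resid / (len(y) - X.shape[1])
    se_phi = np.sqrt(s2 * np.linalg.inv(X.T @ X)[0, 0])
    return b[0] / se_phi

rng = np.random.default_rng(6)
T = 1000
shocks = rng.normal(size=T)
random_walk = np.cumsum(shocks)           # I(1): behaves like a no-cointegration residual
stationary = np.zeros(T)                  # I(0): behaves like a cointegration residual
for t in range(1, T):
    stationary[t] = 0.5 * stationary[t - 1] + shocks[t]

t_rw = adf_tstat(random_walk)
t_stat = adf_tstat(stationary)
print(t_rw, t_stat)
```

The stationary series gives a strongly negative t-ratio, while the random walk gives one much closer to zero, which is the contrast the test exploits.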


Sounds Too Good to be True?

• Unfortunately, it is.

• Banerjee et al (1993): β biased in small samples.

• β has very complicated distribution: cannot draw standard inference.

• The test of cointegration has low power:

– Rejects null (no cointegration) too infrequently when null false.
– Thus we conclude in favour of cointegration too little.

• Endogeneity issues: often in macro systems there is feedback from $Y_t$ to $X_t$.

• More than one steady-state relationship?

– Fiscal and monetary policy: why not two cointegrating relationships?
– Cannot estimate more than one here.


Solutions?

• Numerous other estimation strategies suggested for single-equation framework.

• But: all suffer from endogeneity problem and can’t estimate > 1 relationships.

• Solution: Simultaneous equations, or vector autoregressive model (VAR).

– Johansen (1996) proposed VAR approach.
– Workhorse of cointegration analysis.


A first order autoregressive model

• We can build up to the VAR(k) in several steps. . .

• First order autoregressive (AR(1)) process: xt = ρxt−1 + µ+ εt. (95)

• Model solution: Recursive substitution:

$$x_t = \rho^t x_0 + \sum_{i=0}^{t-1}\rho^i(\mu + \varepsilon_{t-i}). \quad (96)$$

– Moving average representation.

• Cases:

1. Stationarity: $z_1, \dots, z_t$ (strongly) stationary if: $(z_1, \dots, z_t) \overset{D}{=} (z_{1+s}, \dots, z_{t+s})\ \forall s$.

– Weak, or covariance, stationary if: E (zt) = µ, Cov (zt, zt−s) = γ(s), ∀t.

– With εt ∼ N (0,Ω), weak ⇒ strong.


Stationary Case

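• If $|\rho| < 1$, characterise model as:

$$\begin{aligned}
E(x_t \mid x_0) &= \rho^t x_0 + \sum_{i=0}^{t-1}\rho^i\mu \longrightarrow \frac{\mu}{1-\rho}, \\
\mathrm{Var}(x_t \mid x_0) &= E\left[\left(\sum_{i=0}^{t-1}\rho^i\varepsilon_{t-i}\right)^2\right] = \sum_{i=0}^{t-1}\rho^{2i}\sigma^2 \longrightarrow \frac{\sigma^2}{1-\rho^2}, \\
\mathrm{Cov}(x_t, x_{t-k} \mid x_0) &= \rho^k\sum_{i=0}^{t-k-1}\rho^{2i}\sigma^2 \longrightarrow \frac{\rho^k\sigma^2}{1-\rho^2}.
\end{aligned}$$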

• Process stationary asymptotically but not in small samples.
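The convergence of the conditional moments can be traced numerically. The sketch below assumes illustrative values for $\rho$, $\mu$, $\sigma^2$ and $x_0$ (not from the lecture) and compares the one-step recursions with the closed forms from the moving-average solution.

```python
import numpy as np

rho, mu, sigma2, x0, t_max = 0.6, 1.0, 1.0, 5.0, 60  # assumed illustrative values

# Recursions implied by x_t = rho x_{t-1} + mu + eps_t, conditional on x_0.
mean = np.empty(t_max + 1)
var = np.empty(t_max + 1)
mean[0], var[0] = x0, 0.0
for t in range(1, t_max + 1):
    mean[t] = rho * mean[t - 1] + mu
    var[t] = rho**2 * var[t - 1] + sigma2

# Closed forms: geometric sums of the moving-average solution.
t = np.arange(t_max + 1)
mean_closed = rho**t * x0 + mu * (1 - rho**t) / (1 - rho)
var_closed = sigma2 * (1 - rho**(2 * t)) / (1 - rho**2)

print(mean[-1], mu / (1 - rho))          # both close to 2.5
print(var[-1], sigma2 / (1 - rho**2))    # both close to 1.5625
```

The dependence on $x_0$ fades at rate $\rho^t$, which is why the process is stationary only asymptotically when started from a fixed initial value.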


Case 2: Unit-root

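• If $\rho = 1$:

$$x_t = x_0 + \sum_{i=1}^{t}(\mu + \varepsilon_i), \quad (97)$$

• So:

$$\begin{aligned}
E(x_t \mid x_0) &= x_0 + t\mu \longrightarrow \infty \quad (\text{for } \mu > 0), \\
\mathrm{Var}(x_t \mid x_0) &= E\left[\left(\sum_{i=1}^{t}\varepsilon_i\right)^2\right] = \sum_{i=1}^{t}\sigma^2 = t\sigma^2 \longrightarrow \infty.
\end{aligned}$$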

• Mean, variance functions of t: non-stationary.

• Unit root case corresponds to many economic data series.

3. Explosive case: $|\rho| > 1$ not considered: infinity and beyond...


Moving-average Representations

• MA representation: easy to characterise the data process under consideration.

– Principle same for more complicated models.

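• Also recall lag operator $L$ s.t. $L^k x_t = x_{t-k}$. If $|\rho| < 1$:

$$\begin{aligned}
x_t - \rho x_{t-1} &= \varepsilon_t & (98)\\
(1 - \rho L)x_t &= \varepsilon_t & (99)\\
x_t &= (1-\rho L)^{-1}\varepsilon_t = \varepsilon_t + \rho\varepsilon_{t-1} + \rho^2\varepsilon_{t-2} + \dots & (100)
\end{aligned}$$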

• Alternative derivation that assumed stationarity.


Impulse response analysis

• MA representation also facilitates impulse response analysis.

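– If economy shocked (impulsed) now, where will it be in $h$ periods?
– Formally written:

$$x_{t+h} = \rho^h x_t + \sum_{i=0}^{h-1}\rho^i(\mu + \varepsilon_{t+h-i}). \quad (101)$$

– Taking expectations:

$$E(x_{t+h} \mid x_t) = \rho^h x_t + \sum_{i=0}^{h-1}\rho^i\mu. \quad (102)$$

– Impulse response defined as:

$$\mathrm{IR}(h) = \frac{\partial E(x_{t+h} \mid x_t)}{\partial x_t} = \rho^h. \quad (103)$$

– If stationary, $|\rho| < 1$, then $\mathrm{IR}(h) \to 0$: impulse dies away.
– If unit root, $\rho = 1$, then $\mathrm{IR}(h) = 1\ \forall h$: shock cumulates, never dies away.
– If explosive, $|\rho| > 1$, $|\mathrm{IR}(h)| \to \infty$.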
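The three cases can be reproduced by propagating a unit impulse through the AR(1) recursion. This is a small sketch; the function name `impulse_response` is hypothetical, and the three values of $\rho$ are chosen to match the stationary, random-walk and explosive processes discussed in this lecture.

```python
import numpy as np

def impulse_response(rho, horizon):
    """IR(h) for an AR(1): difference between a shocked and an unshocked path."""
    base = np.zeros(horizon + 1)
    shocked = np.zeros(horizon + 1)
    shocked[0] = 1.0  # unit impulse to x_t at h = 0
    for h in range(1, horizon + 1):
        base[h] = rho * base[h - 1]
        shocked[h] = rho * shocked[h - 1]
    return shocked - base

horizon = 100
ir = {rho: impulse_response(rho, horizon) for rho in (0.6, 1.0, 1.03)}

for rho, path in ir.items():
    print(rho, path[[1, 10, 100]])
```

Each path equals $\rho^h$: it decays to zero for 0.6, stays at one for 1.0, and grows without bound for 1.03.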


Three Impulse Responses

[Figure: three impulse-response panels, horizontal axis 0 to 100. Stationary process: $x_t = 0.6x_{t-1} + \varepsilon_t$. Random walk process: $x_t = 1x_{t-1} + \varepsilon_t$. Explosive process: $x_t = 1.03x_{t-1} + \varepsilon_t$.]


Some Caution on Impulse Responses. . .

• IR analysis phenomenally popular in empirical studies.

• Impulse to residual of statistical model ≠ economic shock.

– Even if model is identified.

• ‘Retail Energy Prices and Consumer Expenditures’ by Paul Edelstein and Lutz Kilian:

– IR but no formal checks on model: Confidence in output?

• Can impose restrictions to identify structure so it accords to theory:

– But in VARs, IRs heavily dependent on particular restrictions.
– Identification restrictions generally not testable.
– Causality very difficult to achieve in macroeconomics.
∗ (Identification is on contemporaneous terms in VAR)

• Impulse response analysis intuitively great but fraught with difficulties.


The AR(2) Model

• Model: xt = π1xt−1 + π2xt−2 + εt. (104)

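• Lag operator:

$$(1 - \pi_1 L - \pi_2 L^2)x_t = \varepsilon_t \Rightarrow \Pi(L)x_t = \varepsilon_t \quad (105)$$

– Characteristic (lag) polynomial defined as:

$$\Pi(z) = 1 - \pi_1 z - \pi_2 z^2 = (1-\rho_1 z)(1-\rho_2 z), \quad (106)$$

$$\frac{1}{\Pi(z)} = \frac{1}{(1-\rho_1 z)(1-\rho_2 z)} = \sum_{i=0}^{\infty}\rho_1^i z^i \sum_{j=0}^{\infty}\rho_2^j z^j = \sum_{n=0}^{\infty} c_n z^n, \quad (107)$$

– $c_n \to 0$ as $n \to \infty$ if $|\rho_1| < 1$, $|\rho_2| < 1$, so MA(∞) exists: $x_t = \sum_{n=0}^{\infty} c_n(\mu + \varepsilon_{t-n})$.

• Taking expectations:

$$E(x_t \mid x_0) = \sum_{n=0}^{\infty} c_n\mu = \frac{\mu}{(1-\rho_1)(1-\rho_2)} = \frac{\mu}{1-\pi_1-\pi_2}.$$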
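The MA coefficients $c_n$ in (107) can be computed two ways and compared. The sketch below assumes two illustrative inverse roots inside the unit circle (0.5 and 0.3, not from the lecture), backs out $\pi_1$ and $\pi_2$ from (106), and checks the long-run sum against $1/(1-\pi_1-\pi_2)$.

```python
import numpy as np

rho1, rho2 = 0.5, 0.3                 # assumed inverse roots, both inside the unit circle
pi1, pi2 = rho1 + rho2, -rho1 * rho2  # from (1 - rho1 z)(1 - rho2 z) = 1 - pi1 z - pi2 z^2
N = 60

# c_n from the recursion implied by Pi(L) x_t = eps_t.
c = np.zeros(N)
c[0] = 1.0
c[1] = pi1
for n in range(2, N):
    c[n] = pi1 * c[n - 1] + pi2 * c[n - 2]

# c_n as the product (convolution) of the two geometric series in (107).
i = np.arange(N)
c_conv = np.convolve(rho1**i, rho2**i)[:N]

print(c[:4])                           # first few MA coefficients
print(c.sum(), 1 / (1 - pi1 - pi2))    # truncated sum vs long-run multiplier
```

Since $c_h$ is also the impulse response at horizon $h$, the same array gives the AR(2) impulse response path directly.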


Impulse Response Analysis Again

• Want to know impact at t+ h of impulse at t:

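$$\begin{aligned}
x_{t+h} &= \sum_{n=0}^{\infty} c_n(\mu + \varepsilon_{t+h-n}) & (108)\\
&= c_0(\mu + \varepsilon_{t+h}) + c_1(\mu + \varepsilon_{t+h-1}) + \dots + c_{h-1}(\mu + \varepsilon_{t+1}) + c_h(\mu + \varepsilon_t) + c_{h+1}(\mu + \varepsilon_{t-1}) + \dots & (109)\\
&= \dots + c_h(x_t - \pi_1 x_{t-1} - \pi_2 x_{t-2}) + \dots & (110)
\end{aligned}$$

– Only residual $\varepsilon_t$ matters: rest set to zero.

• Hence:

$$\mathrm{IR}(h) = \frac{\partial}{\partial x_t}E(x_{t+h} \mid x_0, \dots, x_t) = c_h \longrightarrow 0. \quad (111)$$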

– Same implications as before for unit root, explosive cases.


Bi-variate VAR(2) model with deterministic terms

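• Model:

$$X_t = \Pi_1 X_{t-1} + \Pi_2 X_{t-2} + \varepsilon_t \quad (112)$$

$$\begin{pmatrix} X_{1,t}\\ X_{2,t}\end{pmatrix} = \begin{pmatrix}\pi_{1,11} & \pi_{1,12}\\ \pi_{1,21} & \pi_{1,22}\end{pmatrix}\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\end{pmatrix} + \begin{pmatrix}\pi_{2,11} & \pi_{2,12}\\ \pi_{2,21} & \pi_{2,22}\end{pmatrix}\begin{pmatrix}X_{1,t-2}\\ X_{2,t-2}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix}. \quad (113)$$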

• Characteristic polynomial defined as:

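$$\Pi(z) = I_2 - \Pi_1 z - \Pi_2 z^2 = \begin{pmatrix} 1 - \pi_{1,11}z - \pi_{2,11}z^2 & -\pi_{1,12}z - \pi_{2,12}z^2\\ -\pi_{1,21}z - \pi_{2,21}z^2 & 1 - \pi_{1,22}z - \pi_{2,22}z^2\end{pmatrix}. \quad (114)$$

– $z$ scalar, $\pi_{k,ij}$ is $ij$th element of $\Pi_k$.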


• As in univariate system, use Π(z) to characterise properties of model.

• Multivariate equivalent to solving for roots is to solve det(Π(z)) = 0:

det(Π(z)) = (1− ρ1z)(1− ρ2z)(1− ρ3z)(1− ρ4z) = 0, (115)

– ρi functions of Π1, Π2.

• Linear algebra:

$$\Pi(z)^{-1} = \frac{\mathrm{adj}(\Pi(z))}{\det(\Pi(z))}, \quad (116)$$

– $\mathrm{adj}(\Pi(z))$ adjoint/adjugate matrix: each element at most order 2 as matrix $2 \times 2$.
– So convergence of $\Pi(z)^{-1}$ depends on $\det(\Pi(z))$.


• We already have det(Π(z)) so:

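$$\begin{aligned}
\Pi(z)^{-1} = \frac{\mathrm{adj}(\Pi(z))}{\det(\Pi(z))} &= \frac{P(z)}{(1-\rho_1 z)(1-\rho_2 z)(1-\rho_3 z)(1-\rho_4 z)} \\
&= P(z)\sum_{i=0}^{\infty}\rho_1^i z^i\sum_{j=0}^{\infty}\rho_2^j z^j\sum_{k=0}^{\infty}\rho_3^k z^k\sum_{m=0}^{\infty}\rho_4^m z^m \\
&= \sum_{n=0}^{\infty}P^*_n z^n,
\end{aligned}$$

– $P(z)$ second order function of $z$ incorporated into $P^*_n$.
– $P^*_n$ exponentially convergent if $|\rho_i| < 1$.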


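• If $|\rho_i| < 1$ MA(∞) representation:

$$X_t = \sum_{i=0}^{\infty}P^*_i(\Phi D_{t-i} + \varepsilon_{t-i}) = \Pi(L)^{-1}(\Phi D_t + \varepsilon_t),$$

• $\Pi^{-1}(z) = \sum_{i=0}^{\infty}P^*_i z^i$.

• Hence $E(X_t) = \sum_{i=0}^{\infty}P^*_i\Phi D_{t-i}$, $\mathrm{Var}(X_t) = \sum_{i=0}^{\infty}P^*_i\,\Omega\,P^{*\prime}_i$.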

• Xt not stationary as Dt depends on t, but Xt − E(Xt) is stationary.


The companion form of a vector autoregressive model

• Carrying on with VAR(2), useful expression is companion form:

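$$\begin{pmatrix}X_t\\ X_{t-1}\end{pmatrix} = \begin{pmatrix}\Pi_1 & \Pi_2\\ I_2 & 0\end{pmatrix}\begin{pmatrix}X_{t-1}\\ X_{t-2}\end{pmatrix} + \begin{pmatrix}\Phi D_t + \varepsilon_t\\ 0\end{pmatrix} \quad (117)$$

$$\tilde X_t = \Xi\tilde X_{t-1} + v_t, \quad (118)$$

– $\Xi$, $\tilde X_t = (X_t', X_{t-1}')'$, $v_t$ suitably defined.

• $\Xi$ is companion matrix:

– VAR(p) reduced to VAR(1) representation.
– Useful for characterising model via MA representation.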


• Roots of companion matrix = roots of system, found by solving eigenvalue problem:

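$$\det\left(\begin{pmatrix}\Pi_1 & \Pi_2\\ I_2 & 0\end{pmatrix} - \rho\begin{pmatrix}I_2 & 0\\ 0 & I_2\end{pmatrix}\right) = 0. \quad (119)$$

• Equivalently:

$$\begin{pmatrix}\Pi_1 & \Pi_2\\ I_2 & 0\end{pmatrix}\begin{pmatrix}v_1\\ v_2\end{pmatrix} = \rho\begin{pmatrix}I_2 & 0\\ 0 & I_2\end{pmatrix}\begin{pmatrix}v_1\\ v_2\end{pmatrix}, \quad (120)$$

• Implying:

$$\Pi_1 v_1 + \Pi_2 v_2 = \rho v_1, \quad v_1 = \rho v_2, \quad \Rightarrow \quad \Pi_1 v_1 + \Pi_2\rho^{-1}v_1 = \rho v_1. \quad (121)$$

• $\det(A - \rho I) = 0 \iff Av = \rho Iv$, then (121) $\Rightarrow \det(\rho I_2 - \Pi_1 - \Pi_2\rho^{-1}) = 0$, or:

$$\rho^{-1}\Pi_1 v_1 + \Pi_2\rho^{-2}v_1 = v_1 \iff \det(I_2 - \Pi_1\rho^{-1} - \Pi_2\rho^{-2}) = 0. \quad (122)$$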


• If roots of characteristic polynomial ($z = \rho^{-1}$) outside the unit circle, then eigenvalues of companion matrix ($\rho$) inside unit circle, system stationary.

• Intuition as in AR(1): stationarity conditions enable MA(∞) representation, allow characterisation of model.
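The link between companion-matrix eigenvalues and the characteristic polynomial can be checked directly. The sketch below uses assumed, purely illustrative coefficient matrices for a bivariate VAR(2); it builds the companion matrix from (117), computes its eigenvalues, and verifies that each one satisfies the determinant condition (122).

```python
import numpy as np

# Assumed illustrative coefficient matrices for a bivariate VAR(2).
Pi1 = np.array([[0.5, 0.1], [0.2, 0.3]])
Pi2 = np.array([[0.2, 0.0], [0.1, 0.1]])
p = 2

# Companion matrix from (117).
Xi = np.block([[Pi1, Pi2], [np.eye(p), np.zeros((p, p))]])
eigvals = np.linalg.eigvals(Xi)

# Each eigenvalue rho should satisfy det(I - Pi1/rho - Pi2/rho^2) = 0, as in (122).
dets = [np.linalg.det(np.eye(p) - Pi1 / r - Pi2 / r**2) for r in eigvals]
stationary = bool(np.all(np.abs(eigvals) < 1))

print(np.abs(eigvals), stationary)
```

With these matrices all four eigenvalues lie inside the unit circle, so the system is stationary; replacing `Pi1` with a matrix that pushes an eigenvalue to one would give the unit-root case.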


The unrestricted vector autoregressive model

• The unrestricted VAR model with two lags is:

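$$X_t = \Pi_1 X_{t-1} + \Pi_2 X_{t-2} + \Phi D_t + \varepsilon_t$$

• Define:

$$B = \begin{pmatrix}\Pi_1'\\ \Pi_2'\\ \Phi'\end{pmatrix}, \quad W_t = \begin{pmatrix}X_{t-1}\\ X_{t-2}\\ D_t\end{pmatrix}, \quad (123)$$

• Can simplify:

$$X_t = B'W_t + \varepsilon_t. \quad (124)$$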


The Assumptions of the VAR Model

• The VAR(p) depends on a number of assumptions:

1. $(X_t \mid X_{t-1}, X_{t-2}, \dots, X_{t-p})$ mutually independent.
2. $(X_t \mid X_{t-1}, X_{t-2}, \dots, X_{t-p}) \sim N(\Pi_1 X_{t-1} + \dots + \Pi_p X_{t-p} + \Phi D_t, \Omega)$.
– Conditional Normality.
3. Parameter space exists.

• Vital that assumptions hold.

• Likelihood framework gives powerful tool for economic analysis.

– But ‘price’ is distributional assumption.


Maximum likelihood estimation of the unrestricted VAR

• First define likelihood function:

– Joint density of Xt given parameter set θ.

• Autoregressive structure requires sequential factorisation: no independence assumption.

$$f(X_T, X_{T-1}, \dots, X_1, X_0) = \prod_{t=k}^{T} f(X_t \mid X_{t-1}, \dots, X_{t-k}). \quad (125)$$

• Likelihood defined as:

$$L(\theta; X_t) = \prod_{t=k}^{T} f(X_t \mid X_{t-1}, \dots, X_{t-k}; \theta). \quad (126)$$


• Maximum likelihood estimator of $\theta$ given data $X_t$ defined as:

$$\hat\theta = \arg\max_\theta L(\theta; X_t), \quad (127)$$

• Value of $\theta$ that, given assumed distribution, maximises likelihood function.

– Measure of plausibility: how plausible is particular parameter value?

• Logarithms often used to make likelihood function tractable:

$$\hat\theta = \arg\max_\theta \log L(\theta; X_t) = \arg\max_\theta \ell(\theta; X_t). \quad (128)$$

• For VAR, Normality assumption implies:

$$\ell(\theta; X_t) = -\frac{Tp}{2}\ln(2\pi) - \frac{T}{2}\ln|\Omega| - \frac{1}{2}\sum_{t=1}^{T}(X_t - B'W_t)'\Omega^{-1}(X_t - B'W_t).$$


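• Likelihood maximisation implies, for $B'$:

$$\min_B \sum_{t=1}^{T}(X_t - B'W_t)^2 \quad (129)$$

$$0 = \sum_{t=1}^{T}(X_t - \hat B'W_t)W_t' \quad (130)$$

$$\hat B' = \sum_{t=1}^{T}X_t W_t'\left(\sum_{t=1}^{T}W_t W_t'\right)^{-1} = M_{XW}M_{WW}^{-1}. \quad (131)$$

– ML Estimators: $\hat B' = (\hat\Pi_1, \hat\Pi_2, \hat\Phi)$.

• Product moment matrices are generically defined as $M_{XW} = T^{-1}\sum_{t=1}^{T}X_t W_t'$.

• Furthermore:

$$\hat\varepsilon_t = X_t - \hat B'W_t \quad (132)$$

$$\hat\Omega = T^{-1}\sum_{t=1}^{T}\hat\varepsilon_t\hat\varepsilon_t' = M_{XX} - M_{XW}M_{WW}^{-1}M_{WX}, \quad (133)$$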
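The estimators in (131) and (133) reduce to a few matrix products. The sketch below simulates a bivariate VAR(2) with assumed coefficient matrices and no deterministic terms (so $W_t$ contains only the two lags), computes the product moment matrices scaled by $1/T$, and forms $\hat B'$ and $\hat\Omega$. All numerical values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
Pi1 = np.array([[0.5, 0.1], [0.2, 0.3]])   # assumed DGP, for illustration only
Pi2 = np.array([[0.2, 0.0], [0.1, 0.1]])
T, p = 500, 2

# Simulate X_t = Pi1 X_{t-1} + Pi2 X_{t-2} + eps_t (no deterministic terms).
X = np.zeros((T + 2, p))
for t in range(2, T + 2):
    X[t] = Pi1 @ X[t - 1] + Pi2 @ X[t - 2] + rng.normal(size=p)

Xt = X[2:]                              # T x p, rows are X_t'
W = np.hstack([X[1:-1], X[:-2]])        # T x 2p, rows are W_t' = (X_{t-1}', X_{t-2}')

# Product moment matrices (scaled by 1/T) and the estimators (131), (133).
M_XW = Xt.T @ W / T
M_WW = W.T @ W / T
M_XX = Xt.T @ Xt / T
B_hat_T = M_XW @ np.linalg.inv(M_WW)    # B-hat' = (Pi1-hat, Pi2-hat)
resid = Xt - W @ B_hat_T.T
Omega_hat = resid.T @ resid / T

print(B_hat_T.round(2))
print(Omega_hat.round(2))
```

The first two columns of `B_hat_T` estimate $\Pi_1$ and the last two estimate $\Pi_2$; the result is the same as equation-by-equation OLS on the common regressor set.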


Maximised Likelihood

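• Use estimators in likelihood:

$$L_{\max} = L(\hat B, \hat\Omega) = (2\pi)^{-Tp/2}\,|\hat\Omega|^{-T/2}\exp\left(-\frac{1}{2}\mathrm{tr}\left[\hat\Omega^{-1}\sum_{t=1}^{T}\hat\varepsilon_t\hat\varepsilon_t'\right]\right) \quad (134)$$

$$L_{\max}^{-2/T} = (2\pi e)^p\,|\hat\Omega|. \quad (135)$$

• Very powerful result for testing:

– Regardless of model estimated with Normal distribution, get this result.
– Can impose restrictions, estimate, get $\hat\Omega_R$ and $\hat B_R$ and get:

$$L_R^{-2/T} = (2\pi e)^p\,|\hat\Omega_R|. \quad (136)$$

– Likelihood ratio test has easy form: ratio of residual variances.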


Testing with the Maximum Likelihood Framework

• Likelihood framework allows easy testing: likelihood ratio test.

• Test the hypothesis: H0 : θ = θ0, (137)

• Using test statistic:

$$LR = -2\left(\log L(\theta_0; X_t) - \log L(\hat\theta; X_t)\right) \sim \chi^2_{\dim\theta}. \quad (138)$$

• Test assesses plausibility of restrictions.

– If restrictions move likelihood too far from $\hat\theta$, reject restrictions.

• Restrictions on B′ formed by constructing matrices R or H — Lecture 3.


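• Using H form, restrictions imposed by $\psi = HB$:

$$X_t = HB'Z_t + \varepsilon_t = \psi Z_t + \varepsilon_t. \quad (139)$$

• Estimating gives restricted estimators, denoted by checks:

$$\check\psi = M_{XZ}H(H'M_{ZZ}H)^{-1} \quad (140)$$

$$\check\Omega = M_{XX} - M_{XZ}H(H'M_{ZZ}H)^{-1}H'M_{ZX} \quad (141)$$

• Likelihood ratio test:

$$-2\ln(LR) = T\ln\left(|\check\Omega| \,/\, |\hat\Omega|\right) \to \chi^2_r. \quad (142)$$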

– Test statistic simple and intuitive.
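As an illustration of (135) and (142), the sketch below tests the restriction $\Pi_2 = 0$ in a bivariate VAR(2). The data are simulated from an assumed VAR(1), so the restriction is true by construction; the helper names `fit` and `max_loglik` are hypothetical. The statistic is computed from the two residual covariance determinants and cross-checked against the maximised log-likelihoods.

```python
import numpy as np

def fit(Xt, W):
    """Least-squares (Gaussian ML) fit of X_t = B'W_t + eps_t; returns residual covariance."""
    B, *_ = np.linalg.lstsq(W, Xt, rcond=None)
    resid = Xt - W @ B
    return resid.T @ resid / len(Xt)

def max_loglik(Omega, T):
    """Maximised Gaussian log-likelihood implied by (135)."""
    p = Omega.shape[0]
    return -0.5 * T * (p * np.log(2 * np.pi * np.e) + np.log(np.linalg.det(Omega)))

rng = np.random.default_rng(3)
T, p = 400, 2
Pi1 = np.array([[0.5, 0.1], [0.2, 0.3]])   # assumed VAR(1) DGP: Pi2 = 0 holds
X = np.zeros((T + 2, p))
for t in range(2, T + 2):
    X[t] = Pi1 @ X[t - 1] + rng.normal(size=p)

Xt = X[2:]
W_u = np.hstack([X[1:-1], X[:-2]])   # unrestricted: two lags
W_r = X[1:-1]                        # restricted: Pi2 = 0
Omega_u, Omega_r = fit(Xt, W_u), fit(Xt, W_r)

LR = T * (np.log(np.linalg.det(Omega_r)) - np.log(np.linalg.det(Omega_u)))
df = p * p  # number of zero restrictions on Pi2

print(LR, df)
```

The statistic would be compared with a $\chi^2$ distribution with 4 degrees of freedom (5% critical value approximately 9.49).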


The VAR Likelihood Framework

• VAR is simultaneous equations autoregressive model.

• Allows rich characterisation of dynamics of data.

• Equivalent to reduced form of economic theory models.

• Likelihood estimation consistent as no endogeneity.

– Also efficient as Ω estimated.

• Provided VAR well specified, powerful tool for exploring data:

– Forecasting.
– Impulse response analysis.
– Investigating steady-state relationships: Cointegration.
– Issues of causality and exogeneity.


Checking the VAR

• VAR provides much information on modelled data.

• Johansen (2004): “which statistical model describes the data?”

– Statistical models rely on assumptions, properties proved based on these.
– Must test assumptions hold.
– Check on unrestricted VAR before proceeding to cointegration analysis.
∗ Choice of number of cointegrating vectors vital.
∗ Akin to deciding whether data I(1) or I(0).
∗ Choice affected by model misspecification.


The Assumptions of the VAR model

• VAR model assumes:

1. Linear conditional mean explained by past observations and deterministic terms:
– Testing: Un-modelled systematic variation in residuals:
∗ Informal: plots of residuals.
∗ Formal: test for autocorrelated errors, heteroskedasticity and ARCH.

– Remedy by:
∗ Choice of lag length.
∗ Choice of information set: composition of $X_t$.
∗ Incorporate outliers.
∗ Data transformations: Non-linearity.
∗ Structural breaks: Non-constant parameters, deterministic terms.


Assumptions (continued...)

2. Time-invariant conditional variance:
– Heteroskedasticity and ARCH effects:
∗ Informal: plots of residuals.
∗ Formal: White test, ARCH test.
– Remedy:
∗ Add potentially causal regressors?
∗ Regime shifts in the variance: deterministic terms.

3. Independent Normal errors, mean zero, variance $\Omega$:
– Informal testing: histogram of residuals.
– Formal testing: Autocorrelation test on residuals.

4. Parameter space:
– All model outcomes plausible?
– Remedy: data transformation, e.g. logs for % change.


Cointegration in the VAR

• Data generally non-stationary: assume Xt ∼ I(1).

• As with AR(1), reformulate: Error correction form of VAR:

∆Xt = ΠXt−1 + Γ1∆Xt−1 + · · ·+ Γk−1∆Xt−k+1 + ΦDt + εt. (143)

– $\Pi = \left(\sum_{i=1}^{k}\Pi_i\right) - I_p$, $\Gamma_j = -\sum_{i=j+1}^{k}\Pi_i$.

• ∆Xt ∼ I(0), εt ∼ I(0), but Xt ∼ I(1) still. (143) unbalanced.

• Solution: Π reduced rank. Then ∃ p× r matrices α, β s.t. Π = αβ′:

∆Xt = αβ′Xt−1 + Γ1∆Xt−1 + · · ·+ Γk−1∆Xt−k+1 + ΦDt + εt. (144)

– β′Xt−1 ∼ I(0): I(0) combinations of I(1) variables: cointegrating vectors.
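The mapping between the levels VAR and its error-correction form can be verified on simulated data. The sketch below assumes an illustrative rank-one $\Pi = \alpha\beta'$ and a $\Gamma_1$ matrix (none of these values come from the lecture), converts them to $\Pi_1$ and $\Pi_2$ for $k = 2$, simulates in levels, and checks that (143) reproduces the same first differences.

```python
import numpy as np

# Assumed illustrative bivariate VAR(2) with one unit root: Pi = alpha beta' has rank 1.
alpha = np.array([[-0.4], [0.1]])
beta = np.array([[1.0], [-1.0]])
Gamma1 = np.array([[0.2, 0.0], [0.0, 0.1]])
Pi = alpha @ beta.T

# Map back to levels form: Pi = Pi1 + Pi2 - I and Gamma1 = -Pi2.
Pi2 = -Gamma1
Pi1 = np.eye(2) + Pi - Pi2

rng = np.random.default_rng(4)
T = 300
X = np.zeros((T, 2))
eps = rng.normal(size=(T, 2))
for t in range(2, T):
    X[t] = Pi1 @ X[t - 1] + Pi2 @ X[t - 2] + eps[t]

# Error-correction form (143) evaluated on the same data.
dX = np.diff(X, axis=0)                       # dX[t-1] = X_t - X_{t-1}
dX_vecm = X[1:-1] @ Pi.T + dX[:-1] @ Gamma1.T + eps[2:]

print(np.allclose(dX[1:], dX_vecm), np.linalg.matrix_rank(Pi))
```

Here $\beta'X_t = X_{1,t} - X_{2,t}$ is the stationary combination, and the first element of $\alpha$ being larger in magnitude means $X_{1,t}$ does most of the adjusting.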


A Bivariate Example

• Example: r = 1, p = 2, β 2× 1 so

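$$\alpha\beta'X_{t-1} = \begin{pmatrix}\alpha_1\\ \alpha_2\end{pmatrix}\begin{pmatrix}\beta_1 & \beta_2\end{pmatrix}\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\end{pmatrix} = \begin{pmatrix}\alpha_1\\ \alpha_2\end{pmatrix}\left(\beta_1 X_{1,t-1} + \beta_2 X_{2,t-1}\right).$$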

• β′Xt: Stationary Linear combination of I(1) variables.

• α1, α2: speed of adjustment of variables in Xt to disequilibrium.

– αi = 0 implies Xi,t weakly exogenous.

• X1,t, X2,t: consumption and income, home and foreign interest rate. . .

• Very powerful framework for analysis of steady-state relationships.

– Can check if more than one variable adjusts to steady-state.
– No empirical examples today: If interested I can provide more slides.


Granger Causality

• Causality central to economics and other fields:

– Does money cause GDP, or GDP cause money?
– Does advertising cause sales? Or sales cause advertising?

• The VAR framework allows us to answer these questions.

• A variable Xt is Granger non-causal for Yt if:

E (Yt |Yt−1, Xt−1, . . . ) = E (Yt |Yt−1) . (145)

– I.e. Previous values of $X_t$ do not provide information on $Y_t$.
– Same as strong exogeneity.

• If (145) does not hold, implication is Xt Granger causal for Yt.

– But need also that $Y_t$ Granger non-causal for $X_t$.
– Rule out feedback, establish causality.


Granger Causality

• Can easily test Granger causality in VAR model. E.g. bivariate VAR(2):

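$$X_t = \Pi_1 X_{t-1} + \Pi_2 X_{t-2} + \varepsilon_t \quad (146)$$

$$\begin{pmatrix} X_{1,t}\\ X_{2,t}\end{pmatrix} = \begin{pmatrix}\pi_{1,11} & \pi_{1,12}\\ \pi_{1,21} & \pi_{1,22}\end{pmatrix}\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\end{pmatrix} + \begin{pmatrix}\pi_{2,11} & \pi_{2,12}\\ \pi_{2,21} & \pi_{2,22}\end{pmatrix}\begin{pmatrix}X_{1,t-2}\\ X_{2,t-2}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\end{pmatrix}. \quad (147)$$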

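• If $\pi_{1,21} = \pi_{2,21} = 0$ then lags of $X_{1,t}$ Granger non-causal for $X_{2,t}$.

• If $\pi_{1,12} \neq 0$, $\pi_{2,12} \neq 0$ then $X_{2,t}$ Granger causal for $X_{1,t}$.

– Hence causality runs one way: $X_{2,t}$ Granger causal for $X_{1,t}$, with no feedback.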

• Powerful test of causality often used in literature.
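The zero restrictions can be tested equation by equation. The sketch below is a single-equation likelihood-ratio version, not a system test. It assumes a simulated bivariate VAR(2) in which lags of $X_2$ enter the $X_1$ equation but not the reverse; the coefficient values and the helper `rss` are hypothetical.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return float(e @ e)

rng = np.random.default_rng(5)
T = 500
# Assumed DGP: lags of X2 enter the X1 equation; lags of X1 do not enter the X2 equation.
Pi1 = np.array([[0.4, 0.5], [0.0, 0.5]])
Pi2 = np.array([[0.1, 0.2], [0.0, 0.1]])
X = np.zeros((T + 2, 2))
for t in range(2, T + 2):
    X[t] = Pi1 @ X[t - 1] + Pi2 @ X[t - 2] + rng.normal(size=2)

y1, y2 = X[2:, 0], X[2:, 1]
lags1 = np.column_stack([X[1:-1, 0], X[:-2, 0]])   # X1_{t-1}, X1_{t-2}
lags2 = np.column_stack([X[1:-1, 1], X[:-2, 1]])   # X2_{t-1}, X2_{t-2}
full = np.hstack([lags1, lags2])

# Single-equation LR statistics, two zero restrictions each.
LR_2_causes_1 = T * np.log(rss(y1, lags1) / rss(y1, full))   # H0: pi_{1,12} = pi_{2,12} = 0
LR_1_causes_2 = T * np.log(rss(y2, lags2) / rss(y2, full))   # H0: pi_{1,21} = pi_{2,21} = 0

print(LR_2_causes_1, LR_1_causes_2)
```

Each statistic would be compared with a $\chi^2$ distribution with 2 degrees of freedom. In this design the first should be very large (rejecting non-causality from $X_2$ to $X_1$), while the second tests a restriction that is true.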


Granger Causality

• But general case: tri-variate VAR(2):

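$$\begin{pmatrix} X_{1,t}\\ X_{2,t}\\ X_{3,t}\end{pmatrix} = \begin{pmatrix}\pi_{1,11} & \pi_{1,12} & \pi_{1,13}\\ \pi_{1,21} & \pi_{1,22} & \pi_{1,23}\\ \pi_{1,31} & \pi_{1,32} & \pi_{1,33}\end{pmatrix}\begin{pmatrix}X_{1,t-1}\\ X_{2,t-1}\\ X_{3,t-1}\end{pmatrix} + \begin{pmatrix}\pi_{2,11} & \pi_{2,12} & \pi_{2,13}\\ \pi_{2,21} & \pi_{2,22} & \pi_{2,23}\\ \pi_{2,31} & \pi_{2,32} & \pi_{2,33}\end{pmatrix}\begin{pmatrix}X_{1,t-2}\\ X_{2,t-2}\\ X_{3,t-2}\end{pmatrix} + \begin{pmatrix}\varepsilon_{1,t}\\ \varepsilon_{2,t}\\ \varepsilon_{3,t}\end{pmatrix}. \quad (148)$$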

• If $\pi_{1,21} = \pi_{2,21} = 0$ then lags of $X_{1,t}$ Granger non-causal for $X_{2,t}$.

• If $\pi_{1,12} \neq 0$, $\pi_{2,12} \neq 0$ then $X_{2,t}$ Granger causal for $X_{1,t}$.

– But what about causality from $X_{1,t-2}$ to $X_{3,t-1}$ to $X_{2,t}$?
– Need also $\pi_{2,31} = \pi_{1,23} = 0$.


Granger Causality

• Granger causality extensively used in empirical work.

– Powerful and intuitive test of causality.
– Reliant on VAR framework.
– Could be run as series of single equation estimations.
∗ Though inefficient.

• But it is severely limited:

– With many lags, complicated structure of zero restrictions required.
– Test is conditional on information set included:
∗ Hence unmodelled lags or variables may provide causality.
∗ Hence Granger non-causality conclusion may be invalidated.


Advertising: Copenhagen Cointegration Summer School

• Learn from the Masters!

• Three week course in August each year:

– August 3–23 2009: register your interest!
– http://www.econ.ku.dk/summerschool/

• Mornings: Cointegration theory from Juselius, Johansen, Rahbek and Nielsen.

• Afternoons: Computer labs to work on your own dataset.

• Hugely useful course:

– Submit paper at end of course for feedback.
– Potential PhD chapter.


Concluding

• In-depth look at multiple-equation modelling.

• Seemingly Unrelated Regression:

– E.g. Demand systems.
– Exploit information between regressions.

• Simultaneous-equation modelling:

– E.g. Demand and supply systems.
– Endogeneity problem.
– IV estimation: Instruments are exogenous variables.

• VAR modelling:

– Extending the time series dimension.
– Forecasting, theory-free/full modelling, impulse responses, Granger causality.

• Next week: Limited-Dependent-Variable Modelling.
