Post on 22-Oct-2014
Econometrics Part I: Lecture 6
6 November 2009
J. James Reade
Admin
• Lectures:
– Today: Simultaneous equations modelling and VARs.
– Next week: Limited Dependent Variable Modelling.
– Final lecture: Treatment effects and recap.
• Lecture notes:
– Handouts for each lecture: Slides.
– Please point out typos!
• Classes: This week, next week.
• Exam: Two weeks on Tuesday.
– Prep: Assignments best guide for exam.
• Happy to chat via email or after lecture.
jjreade@gmail.com 2
Today: Multiple-Equation Modelling
• Many reasons to estimate more than a single-equation model.
• Panel: Time series observed over multiple units.
– View as system of time series?
• We may want to consider several models jointly:
– Helpful if know disturbances likely correlated.
– E.g. CAPM residuals (excess returns) over different firms correlated.
• May want to consider determination of several variables jointly:
– Endogeneity/simultaneity.
Today: Textbook Coverage
• Greene covers all material:
– SURE: Chapter 14.
– Simultaneous equations: Chapter 15.
– VARs, etc: Chapter 19.
• Another textbook gives good intro to simultaneous equations material:
– Gujarati: Basic Econometrics, ed. 4, Chs 18–21.
• Gujarati somewhat more shaky on time-series grounds though.
• Stock and Watson does not cover simultaneous equations.
The Seemingly Unrelated Regression Model
• Seemingly very similar to panel data.
• Set of data for number of units, say firms.
– May have cross equation dependencies, e.g. firms in similar industry.
– May be more efficient to exploit error structure in estimation.
• E.g. Consumer demand for N goods. Could estimate N demands.
– But constraints hold across all consumers: Budget constraints etc.
• These kinds of models motivate the Seemingly Unrelated Regression approach.
– Proposed by Zellner (1962).
The Seemingly Unrelated Regression Model
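• Take bivariate case: E.g. two goods, N observations for each:
$$y_1 = X_1\beta_1 + u_1, \quad u_1 \sim (0, \sigma_1^2), \tag{1}$$
$$y_2 = X_2\beta_2 + u_2, \quad u_2 \sim (0, \sigma_2^2). \tag{2}$$
• Write as a system:
$$\underbrace{\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}}_{2N\times 1} = \underbrace{\begin{pmatrix} X_1 & 0 \\ 0 & X_2 \end{pmatrix}}_{2N\times(K_1+K_2)} \underbrace{\begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}}_{(K_1+K_2)\times 1} + \underbrace{\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}}_{2N\times 1}. \tag{3}$$
• Stacked form:
$$y = X\beta + u, \quad E(u \mid X) = 0, \quad Cov(u \mid X) = \Omega = \begin{pmatrix} \sigma_{11}I_N & \sigma_{12}I_N \\ \sigma_{21}I_N & \sigma_{22}I_N \end{pmatrix}. \tag{4}$$
• Big regression problem: Use OLS?
$$\hat\beta_{OLS} = (X'X)^{-1}X'y. \tag{5}$$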
The Seemingly Unrelated Regression Model
• OLS estimator has properties:
$$E(\hat\beta_{OLS} \mid X) = \beta, \quad Cov(\hat\beta_{OLS} \mid X) = (X'X)^{-1}X'\Omega X(X'X)^{-1}. \tag{6}$$
– Latter follows from assumption that Cov (u |X ) = Ω.
• This is just a big OLS estimation: OLS on each equation separately.
• But is Ω diagonal?
– In many cases unlikely: CAPM example, demand expenditures.
– If Ω non-diagonal then OLS inefficient.
– Can gain over OLS by using GLS and exploiting cross-equation correlations.
• Using GLS is called Seemingly Unrelated Regression (SURE) analysis.
The Seemingly Unrelated Regression Model
• GLS estimator:
$$\hat\beta_{SURE} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y. \tag{7}$$
• We’ve assumed Ω non-diagonal:
$$\Omega = \begin{pmatrix} \sigma_{11}I_N & \sigma_{12}I_N \\ \sigma_{21}I_N & \sigma_{22}I_N \end{pmatrix}. \tag{8}$$
• So σ12 and σ21 non-zero.
• SURE estimation weights this information, in effect using parts of β1 in β2,SURE.
– Problematic if one equation in system misspecified as effects transmitted across all equations.
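A minimal sketch of the SURE idea in Python, assuming NumPy is available. It stacks a two-equation system with a block-diagonal regressor matrix, runs OLS, and then runs feasible GLS with the error covariance estimated from the OLS residuals. The data, parameter values and the function name `sur_fgls` are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def sur_fgls(y1, X1, y2, X2):
    """Two-equation SUR by feasible GLS; returns (beta_ols, beta_sur)."""
    N = len(y1)
    # Stack the system: block-diagonal X, stacked y.
    X = np.block([[X1, np.zeros((N, X2.shape[1]))],
                  [np.zeros((N, X1.shape[1])), X2]])
    y = np.concatenate([y1, y2])
    beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
    # Estimate the 2x2 cross-equation error covariance from OLS residuals.
    resid = (y - X @ beta_ols).reshape(2, N)
    sigma = resid @ resid.T / N
    # Omega^{-1} = Sigma^{-1} kron I_N for the stacked system.
    omega_inv = np.kron(np.linalg.inv(sigma), np.eye(N))
    beta_sur = np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ y)
    return beta_ols, beta_sur

# Simulated data (assumed values): errors correlated across equations.
rng = np.random.default_rng(0)
N = 500
u = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=N)
X1 = np.column_stack([np.ones(N), rng.normal(size=N)])
X2 = np.column_stack([np.ones(N), rng.normal(size=N)])
y1 = X1 @ np.array([1.0, 2.0]) + u[:, 0]
y2 = X2 @ np.array([-1.0, 0.5]) + u[:, 1]

beta_ols, beta_sur = sur_fgls(y1, X1, y2, X2)
print("OLS :", beta_ols)
print("SURE:", beta_sur)
```

Both estimators are consistent here; the gain from SURE is in efficiency, which shows up in sampling variability across repeated simulations rather than in a single draw.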
Alternative Systems of Equations
• Panel data from last week seemingly a generalisation of SURE.
– Although would appear SURE accounts for cross-section dependence.¹
• May be interested not in the same data series over many different units of observation.
• What if the variables for a given unit are endogenously determined?
– In macroeconomics, it’s hard to see how data are anything but endogenous.
– Demand and supply also.
• Much of today: Simultaneous equation models.
– Extending into VAR modelling.
¹ Don’t quote me on this.
A Necessity for Simultaneous Equations
• Endogeneity pervasive in econometric modelling.
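– Regression of $y_i$ on $x_i$:
$$y_i = \beta x_i + \varepsilon_i. \tag{9}$$
– But $x_i$ also depends on $y_i$:
$$x_i = \theta y_i + e_i. \tag{10}$$
– Estimator:
$$\hat\beta = \frac{\sum_{i=1}^N x_i y_i}{\sum_{i=1}^N x_i^2} = \beta + \frac{\sum_{i=1}^N x_i\varepsilon_i}{\sum_{i=1}^N x_i^2}. \tag{11}$$
– But, taking $E(e_i\varepsilon_i) = 0$:
$$E(x_i\varepsilon_i) = E((\theta y_i + e_i)\varepsilon_i) = E((\theta(\beta x_i + \varepsilon_i) + e_i)\varepsilon_i) = \theta\beta E(x_i\varepsilon_i) + \theta E(\varepsilon_i^2),$$
$$\Rightarrow\quad E(x_i\varepsilon_i) = \frac{\theta E(\varepsilon_i^2)}{1 - \theta\beta} \neq 0.$$
• Endogeneity or simultaneity manifested in correlation between errors and variables.
– Already considered many strategies for this ailment.
• Today we consider estimating each possible equation.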
Multiple-Equation Modelling: Supply and Demand
• E.g. Market for PhDs: Demand, supply, equilibrium wages (w) and employment (e).
– Market clearing w, e caused by demand and supply.
– But demand and supply affected by what w and e are.
– Joint determination of equilibrium quantities.
• Then quantity demanded of PhDs $q^d_t$ is:
$$q^d_t = \beta_{11} + \beta_{12}w_t + \varepsilon_{1t}. \tag{12}$$
• But the quantity supplied $q^s_t$ is:
$$q^s_t = \beta_{21} + \beta_{22}w_t + \varepsilon_{2t}. \tag{13}$$
• OLS estimation of either β12 or β22 problematic: Setting $q^d_t = q^s_t$ gives:
$$w_t = \frac{\beta_{11} - \beta_{21}}{\beta_{22} - \beta_{12}} + \frac{\varepsilon_{1t} - \varepsilon_{2t}}{\beta_{22} - \beta_{12}}, \quad E(w_t\varepsilon_{1t}) \neq 0, \quad E(w_t\varepsilon_{2t}) \neq 0. \tag{14}$$
Multiple-Equation Modelling: Macroeconomics
• Macroeconomic, or general-equilibrium, modelling:
– Interested in joint determination of many variables.
– E.g. Interest rates, output gap, inflation.
– Macro models specify equations for each of these variables.
• All such systems have two types of variable:
– Endogenous Variables: Determined within the model/system.
– Exogenous Variables: Determined outside the model/system.
∗ Also known as predetermined variables.
Multiple-Equation Modelling
• Stylised model: M endogenous variables Y1,t, . . . , YM,t, K exogenous variables X1,t, . . . , XK,t.
• Implies M equations, one for each endogenous variable:
$$\begin{aligned}
Y_{1,t} &= \beta_{12}Y_{2,t} + \beta_{13}Y_{3,t} + \cdots + \beta_{1M}Y_{M,t} + \gamma_{11}X_{1,t} + \cdots + \gamma_{1K}X_{K,t} + \varepsilon_{1,t},\\
Y_{2,t} &= \beta_{21}Y_{1,t} + \beta_{23}Y_{3,t} + \cdots + \beta_{2M}Y_{M,t} + \gamma_{21}X_{1,t} + \cdots + \gamma_{2K}X_{K,t} + \varepsilon_{2,t},\\
&\;\;\vdots\\
Y_{M,t} &= \beta_{M1}Y_{1,t} + \beta_{M2}Y_{2,t} + \cdots + \beta_{M,M-1}Y_{M-1,t} + \gamma_{M1}X_{1,t} + \cdots + \gamma_{MK}X_{K,t} + \varepsilon_{M,t}.
\end{aligned}$$
• This is the structural form of an economic, or econometric model.
– Endogenous variables in terms of other endogenous and exogenous variables.
• Some or many coefficients may be restricted to zero by theory a priori.
Structural and Reduced Form Modelling
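• Could write structural form instead as:
$$BY_t = GX_t + \varepsilon_t. \tag{15}$$
• Where:
$$B = \begin{pmatrix} 1 & -\beta_{12} & -\beta_{13} & \cdots & -\beta_{1M} \\ -\beta_{21} & 1 & -\beta_{23} & \cdots & -\beta_{2M} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -\beta_{M1} & -\beta_{M2} & -\beta_{M3} & \cdots & 1 \end{pmatrix}, \quad G = \begin{pmatrix} \gamma_{11} & \cdots & \gamma_{1K} \\ \vdots & \ddots & \vdots \\ \gamma_{M1} & \cdots & \gamma_{MK} \end{pmatrix},$$
$$\varepsilon_t = \begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \\ \vdots \\ \varepsilon_{Mt} \end{pmatrix}, \quad Y_t = \begin{pmatrix} Y_{1t} \\ Y_{2t} \\ \vdots \\ Y_{Mt} \end{pmatrix}, \quad X_t = \begin{pmatrix} X_{1t} \\ X_{2t} \\ \vdots \\ X_{Kt} \end{pmatrix}.$$
• Clearly modelling structural form has endogeneity problems.
– Solve for endogenous variables in terms of exogenous variables. . .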
The Reduced Form
• Reduced form: Represent endogenous variables in terms of exogenous variables.
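$$Y_t = B^{-1}GX_t + B^{-1}\varepsilon_t. \tag{16}$$
• E.g. PhDs job market:
$$q^d_t = \beta_{11} + \beta_{12}w_t + \varepsilon_{1t}, \tag{17}$$
$$q^s_t = \beta_{21} + \beta_{22}w_t + \varepsilon_{2t}. \tag{18}$$
• Reduced form here comes from equilibrium: $q^s_t = q^d_t$ so:
$$w_t = \frac{\beta_{11} - \beta_{21}}{\beta_{22} - \beta_{12}} + \frac{\varepsilon_{1t} - \varepsilon_{2t}}{\beta_{22} - \beta_{12}} = \Pi_0 + v_t. \tag{19}$$
• Where:
$$\Pi_0 = \frac{\beta_{11} - \beta_{21}}{\beta_{22} - \beta_{12}}, \quad v_t = \frac{\varepsilon_{1t} - \varepsilon_{2t}}{\beta_{22} - \beta_{12}}. \tag{20}$$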
Identification Difficulties
• Can now find equilibrium quantity from either original equation:
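$$q_t = \beta_{11} + \beta_{12}\left(\frac{\beta_{11} - \beta_{21}}{\beta_{22} - \beta_{12}} + \frac{\varepsilon_{1t} - \varepsilon_{2t}}{\beta_{22} - \beta_{12}}\right) + \varepsilon_{1t}, \tag{21}$$
$$\phantom{q_t} = \frac{\beta_{11}\beta_{22} - \beta_{12}\beta_{21}}{\beta_{22} - \beta_{12}} + \frac{\beta_{22}\varepsilon_{1t} - \beta_{12}\varepsilon_{2t}}{\beta_{22} - \beta_{12}} = \Pi_1 + e_t. \tag{22}$$
• Where:
$$\Pi_1 = \frac{\beta_{11}\beta_{22} - \beta_{12}\beta_{21}}{\beta_{22} - \beta_{12}}, \quad e_t = \frac{\beta_{22}\varepsilon_{1t} - \beta_{12}\varepsilon_{2t}}{\beta_{22} - \beta_{12}}. \tag{23}$$
• These are reduced-form equations for wage wt and quantity employed qt.
• Can estimate Π0, Π1: No endogeneity problem remains.
• But Π0, Π1 are non-linear functions of β11, β12, β21, β22, the structural parameters.
– Cannot uncover structural parameters from reduced form estimation: Unidentified.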
Identification
• Identification achieved if unique values of structural parameters can be found.
• Main interest in PhD model are structural parameters β11, β12, β21, β22:
$$q^d_t = \beta_{11} + \beta_{12}w_t + \varepsilon_{1t}, \tag{24}$$
$$q^s_t = \beta_{21} + \beta_{22}w_t + \varepsilon_{2t}. \tag{25}$$
• But OLS estimation biased and inconsistent due to $E(w_t\varepsilon_{it}) \neq 0$, i = 1, 2.
• Reduced-form parameters Π0, Π1 are consistently estimated by OLS.
• But cannot recover β11, β12, β21, β22 from Π0, Π1.
• Our parameters of interest are unidentified, or underidentified.
Identification Conceptually
• Recall from microeconomic theory: Market price satisfies demand and supply.
• (qt, wt) are set of equilibrium points (quantity, wage):
– Intersections of different demand and supply curves.
[Figure: scatter of observed equilibrium points; axes q and w.]
[Figure: same scatter, with a supply curve (S) and a demand curve (D) drawn through each equilibrium point; axes q and w.]
Identification Difficulties
• Identification: Many observationally equivalent representations.
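• Could multiply demand equation by λ, supply equation by 1 − λ:
$$\lambda q^d_t = \lambda\beta_{11} + \lambda\beta_{12}w_t + \lambda\varepsilon_{1t}, \tag{26}$$
$$(1-\lambda)q^s_t = (1-\lambda)\beta_{21} + (1-\lambda)\beta_{22}w_t + (1-\lambda)\varepsilon_{2t}. \tag{27}$$
• Adding the two equations together (with $q^d_t = q^s_t = q_t$ in equilibrium) yields:
$$q_t = \gamma_1 + \gamma_2 w_t + \eta_t, \tag{28}$$
– Where: $\gamma_1 = \lambda\beta_{11} + (1-\lambda)\beta_{21}$, $\gamma_2 = \lambda\beta_{12} + (1-\lambda)\beta_{22}$, $\eta_t = \lambda\varepsilon_{1t} + (1-\lambda)\varepsilon_{2t}$.
• New equation (28) indistinguishable from demand or supply equations.
– Cannot tell which is which from data alone.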
Seeking Identification
• Method for identification: Additional information.
• E.g. Include extra ‘shift’ variable in demand equation.
– Then can conceptually move that variable to shift demand curve.– Trace out supply curve.
[Figure: supply curve S1 crossed by shifting demand curves D1 to D5; the intersections trace out the supply curve; axes q and w.]
Seeking Identification
• For PhDs, add extra variable to demand equation to find supply equation.
– Add student enrolments st to demand equation. Need lecturers to lecture.
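$$q^d_t = \beta_{11} + \beta_{12}w_t + \gamma_{11}s_t + \varepsilon_{1t}, \tag{29}$$
$$q^s_t = \beta_{21} + \beta_{22}w_t + \varepsilon_{2t}. \tag{30}$$
• Then reduced form is:
$$w_t = \Pi_0 + \Pi_1 s_t + v_t, \quad q_t = \Pi_2 + \Pi_3 s_t + u_t. \tag{31}$$
• Where:
$$\Pi_0 = \frac{\beta_{21} - \beta_{11}}{\beta_{12} - \beta_{22}}, \quad \Pi_1 = \frac{\gamma_{11}}{\beta_{22} - \beta_{12}}, \quad v_t = \frac{\varepsilon_{2t} - \varepsilon_{1t}}{\beta_{12} - \beta_{22}},$$
$$\Pi_2 = \frac{\beta_{12}\beta_{21} - \beta_{11}\beta_{22}}{\beta_{12} - \beta_{22}}, \quad \Pi_3 = \frac{\gamma_{11}\beta_{22}}{\beta_{22} - \beta_{12}}, \quad u_t = \frac{\beta_{12}\varepsilon_{2t} - \beta_{22}\varepsilon_{1t}}{\beta_{12} - \beta_{22}}.$$
• Five unknowns, four equations. But:
$$\beta_{21} = \Pi_2 - \beta_{22}\Pi_0, \quad \beta_{22} = \frac{\Pi_3}{\Pi_1}. \tag{32}$$
– Hence supply equation identified.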
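The recovery of the supply parameters from the reduced form can be sketched numerically. The simulation below (Python with NumPy assumed; all parameter values are hypothetical) generates equilibrium wages and quantities from a demand equation with an enrolment shifter and a supply equation without one, estimates the two reduced-form regressions by OLS, and backs out the supply slope and intercept as Π3/Π1 and Π2 − β22Π0. Direct OLS of q on w is shown for comparison.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 20_000

# Hypothetical structural parameters: demand slopes down, supply slopes up.
b11, b12, g11 = 10.0, -1.0, 0.5   # demand: q = b11 + b12*w + g11*s + e1
b21, b22 = 2.0, 1.5               # supply: q = b21 + b22*w + e2

s = rng.normal(5.0, 2.0, size=T)  # exogenous demand shifter (enrolments)
e1 = rng.normal(size=T)
e2 = rng.normal(size=T)

# Equilibrium wage and quantity implied by setting demand equal to supply.
w = (b21 - b11 - g11 * s + e2 - e1) / (b12 - b22)
q = b21 + b22 * w + e2

def ols(y, x):
    """Intercept and slope from a simple OLS regression of y on x."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

pi0, pi1 = ols(w, s)              # reduced form for the wage
pi2, pi3 = ols(q, s)              # reduced form for the quantity

b22_ils = pi3 / pi1               # indirect least squares supply slope
b21_ils = pi2 - b22_ils * pi0     # indirect least squares supply intercept
b22_ols = ols(q, w)[1]            # naive OLS slope, subject to simultaneity bias

print("ILS supply slope:", b22_ils, "intercept:", b21_ils)
print("OLS slope of q on w:", b22_ols)
```

In this set-up the naive OLS slope is a mixture of demand and supply responses, while the ratio of reduced-form coefficients isolates the supply slope because s shifts demand only.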
Checking Identification
• As earlier, we can weight the two equations by λ and (1 − λ) and add them:
$$q_t = \gamma_0 + \gamma_1 w_t + \gamma_2 s_t + \varepsilon_t. \tag{33}$$
– γk and εt defined accordingly.
• Hence: This new representation indistinguishable from demand equation:
– Demand unidentified: Can take any linear combination of equations.
• But supply identified: Can distinguish supply from (33) by st term.
• We identify supply equation by adding term to demand equation:
– Extra demand term allows us to trace out supply curve.
– Regression means can hold all else fixed, vary st.
– Shifts demand curve while supply fixed:
∗ What we get must be supply curve.
Seeking Identification
• Can repeat ‘trick’ to identify demand: Add term to supply equation.
– Add consultancy wage rate ct to supply equation. Outside option.
– Add student enrolments st to demand equation. Need lecturers to lecture.
$$q^d_t = \beta_{11} + \beta_{12}w_t + \gamma_{11}s_t + \varepsilon_{1t}, \tag{34}$$
$$q^s_t = \beta_{21} + \beta_{22}w_t + \gamma_{21}c_t + \varepsilon_{2t}. \tag{35}$$
• Will yield reduced forms:
$$w_t = \Pi_0 + \Pi_1 s_t + \Pi_2 c_t + v_t, \tag{36}$$
$$q_t = \Pi_3 + \Pi_4 s_t + \Pi_5 c_t + u_t. \tag{37}$$
• 6 unknowns: (β11, β12, β21, β22, γ11, γ21), 6 equations: (Π0,Π1,Π2,Π3,Π4,Π5).
– All parameters identified, both equations identified.
Overidentification
• Other factors influence demand and supply. E.g. for PhDs, previous period wage.
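$$q^d_t = \beta_{11} + \beta_{12}w_t + \gamma_{11}s_t + \varepsilon_{1t}, \tag{38}$$
$$q^s_t = \beta_{21} + \beta_{22}w_t + \gamma_{21}c_t + \gamma_{22}w_{t-1} + \varepsilon_{2t}. \tag{39}$$
• Usual problem of simultaneity bias means we look for reduced form:
$$w_t = \Pi_0 + \Pi_1 s_t + \Pi_2 c_t + \Pi_3 w_{t-1} + v_t, \tag{40}$$
$$q_t = \Pi_4 + \Pi_5 s_t + \Pi_6 c_t + \Pi_7 w_{t-1} + u_t. \tag{41}$$
• Got eight equations (Πs) for only seven structural parameters. E.g. two expressions for the demand slope, one from each variable excluded from demand:
$$\beta_{12} = \Pi_6/\Pi_2, \quad \beta_{12} = \Pi_7/\Pi_3. \tag{42}$$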
• Model overidentified: Too much information.
– Demand function excludes two exogenous variables (ct, wt−1) for just one endogenous regressor (wt).
• Multiple expressions for a parameter such as β12 may give different answers in sample.
– Ambiguity transmitted to other parameters: β12 appears in denominators of the Πi.
• But TMI not necessarily a bad thing:
– Estimation methods exist to handle extra information.
Identification More Formally
• Identification very difficult to grasp and to find.
• Rank and order conditions exist to check for identification:
– Facilitate finding identification via automation in computer packages.
• Recap and extension of notation:
– M: Number of endogenous variables in model/system.
– m: Number of endogenous variables in equation of model.
– K: Number of exogenous, predetermined variables in model/system.
– k: Number of exogenous, predetermined variables in equation of model/system.
The Order Condition
• Order condition is necessary but not sufficient for identification.
• Can be stated in two ways:
1. In model of M simultaneous equations, equation identified if:
• It excludes at least M − 1 variables (endogenous or otherwise).
– If fewer than M − 1 excluded, unidentified.
– If exactly M − 1 excluded, just identified.
– If more than M − 1 excluded, overidentified.
2. In M-equation system, equation identified if:
• Number of exogenous variables excluded at least as large as number of endogenous variables in equation minus 1:
K − k ≥ m − 1. (44)
The Order Condition: Examples
• Simple demand example:
$$q^d_t = \beta_{11} + \beta_{12}w_t + \varepsilon_{1t}, \tag{45}$$
$$q^s_t = \beta_{21} + \beta_{22}w_t + \varepsilon_{2t}. \tag{46}$$
– Two endogenous variables, M = 2, each equation excludes zero variables.
∗ Unidentified.
• Add student enrolment:
$$q^d_t = \beta_{11} + \beta_{12}w_t + \gamma_{11}s_t + \varepsilon_{1t}, \tag{47}$$
$$q^s_t = \beta_{21} + \beta_{22}w_t + \varepsilon_{2t}. \tag{48}$$
– Two endogenous variables, M = 2, K = 1.
– Demand equation: Excludes zero variables hence unidentified.
– Supply equation: Excludes one variable hence identified.
The Order Condition: Examples
• Add student enrolment, consultancy wages and lagged wages:
$$q^d_t = \beta_{11} + \beta_{12}w_t + \gamma_{11}s_t + \varepsilon_{1t}, \tag{49}$$
$$q^s_t = \beta_{21} + \beta_{22}w_t + \gamma_{21}c_t + \gamma_{22}w_{t-1} + \varepsilon_{2t}. \tag{50}$$
– Two endogenous variables, M = 2, three exogenous K = 3.
– Demand equation: Excludes two variables (ct, wt−1) hence overidentified.
– Supply equation: Excludes one variable (st) hence exactly identified.
• Agrees with the reduced form of this system: two expressions for demand slope β12, one for supply slope β22.
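The counting in these examples can be written as a small helper. This is only a sketch of the second statement of the order condition, K − k ≥ m − 1; the function name and return labels are illustrative, and since the order condition is necessary but not sufficient, "identified" here means only that the count is satisfied.

```python
def order_condition(K, k, m):
    """Classify one equation by the order condition K - k >= m - 1.

    K: exogenous variables in the system,
    k: exogenous variables in the equation,
    m: endogenous variables in the equation.
    """
    excluded, needed = K - k, m - 1
    if excluded < needed:
        return "unidentified"
    if excluded == needed:
        return "exactly identified"
    return "overidentified"

# Simple demand/supply system: no exogenous variables anywhere.
simple = order_condition(K=0, k=0, m=2)
# System with s_t in demand, c_t and w_{t-1} in supply (K = 3).
demand = order_condition(K=3, k=1, m=2)   # demand includes s_t only
supply = order_condition(K=3, k=2, m=2)   # supply includes c_t and w_{t-1}

print(simple, demand, supply, sep=" | ")
```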
The Order Condition
• Order condition probably most commonly used identification strategy.
• Use theory to argue why particular variables excluded.
• E.g. Rainfall affects supply and not demand for wheat.
• Rainfall is stark example: Not always so in economics.
– E.g. Ricardian equivalence: Debt has no real effect?!
• Implications of wrong identification strategy not innocuous:
– But very hard to test!
The Rank Condition
• Order condition necessary but not sufficient for identification.
• Even if satisfied equation may not be identified.
• E.g. If st insignificant in demand equation, γ11 = 0, supply unidentified.
• Identification also violated if exogenous variables excluded not independent:
– If linear combination exists, mapping from βs and γs to Πs non-unique.
Rank Condition:
• Equation in M-equation system identified iff at least one non-zero determinant of order (M − 1) × (M − 1) can be constructed using coefficients (endogenous or exogenous) of variables excluded from that equation but included in the other equations.
– Necessary and sufficient condition for identification.
Rank Condition: An Example
• System of 4 endogenous Y variables and 3 exogenous X variables:
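$$Y_{1t} - \beta_{10} - \beta_{12}Y_{2t} - \beta_{13}Y_{3t} - \gamma_{11}X_{1t} = u_{1t}, \tag{51}$$
$$Y_{2t} - \beta_{20} - \beta_{23}Y_{3t} - \gamma_{21}X_{1t} - \gamma_{22}X_{2t} = u_{2t}, \tag{52}$$
$$Y_{3t} - \beta_{30} - \beta_{31}Y_{1t} - \gamma_{31}X_{1t} - \gamma_{32}X_{2t} = u_{3t}, \tag{53}$$
$$Y_{4t} - \beta_{40} - \beta_{41}Y_{1t} - \beta_{42}Y_{2t} - \gamma_{43}X_{3t} = u_{4t}. \tag{54}$$
• Identified? By the order condition:
Eq. No   K − k   m − 1   Identified?
(51)     2       2       Exactly
(52)     1       1       Exactly
(53)     1       1       Exactly
(54)     2       2       Exactly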
Rank Condition: An Example
• To help with rank condition write equations in table:
Coefficients of the variables
Eq No.   1      Y1     Y2     Y3     Y4   X1     X2     X3
(51)     −β10   1      −β12   −β13   0    −γ11   0      0
(52)     −β20   0      1      −β23   0    −γ21   −γ22   0
(53)     −β30   −β31   0      1      0    −γ31   −γ32   0
(54)     −β40   −β41   −β42   0      1    0      0      −γ43
• To check equation (51): Form matrix of coefficients on Y4, X2, X3 in the other equations:
$$A = \begin{pmatrix} 0 & -\gamma_{22} & 0 \\ 0 & -\gamma_{32} & 0 \\ 1 & 0 & -\gamma_{43} \end{pmatrix}, \quad \det A = 0. \tag{55}$$
• Hence A not full rank: Rows/columns not linearly independent.
– Relationships exist between variables hence unidentified.
– Cannot tell (52) and (53) apart hence can’t tell (51) from either.
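The rank check above can be automated along the following lines. This is a sketch in Python with NumPy: the non-zero entries of the coefficient table are replaced by arbitrary placeholder numbers (an assumption made purely for illustration, since only the pattern of zeros matters for generic parameter values), the constant column is dropped because it appears in every equation, and the helper name `rank_condition` is hypothetical.

```python
import numpy as np

# Coefficient table for the four-equation system; columns: Y1 Y2 Y3 Y4 X1 X2 X3.
# Non-zero values are arbitrary placeholders, not estimates.
A = np.array([
    [1.0,   -0.5,  -0.3, 0.0, -0.7,  0.0,  0.0],    # first equation
    [0.0,    1.0,  -0.4, 0.0, -0.2, -0.6,  0.0],    # second equation
    [-0.8,   0.0,   1.0, 0.0, -0.9, -0.1,  0.0],    # third equation
    [-0.35, -0.45,  0.0, 1.0,  0.0,  0.0, -0.55],   # fourth equation
])

def rank_condition(A, eq):
    """True if equation `eq` (0-based row index) satisfies the rank condition."""
    M = A.shape[0]
    excluded = np.flatnonzero(A[eq] == 0)            # variables excluded from eq
    others = np.delete(A, eq, axis=0)[:, excluded]   # their coefficients elsewhere
    return bool(np.linalg.matrix_rank(others) == M - 1)

results = [rank_condition(A, eq) for eq in range(A.shape[0])]
print(results)
```

For the first equation the sub-matrix built here has the same zero pattern as the matrix A in the text, so it has rank below M − 1 and the check fails, even though the order condition is met.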
More on Identification
• Order condition necessary but not sufficient, rank condition necessary and sufficient.
• Rank tells us whether identified or not, order whether exact- or over-identification.
• Four cases:
1. If K − k > m − 1 and rank (A) = M − 1 equation overidentified.
2. If K − k = m − 1 and rank (A) = M − 1 equation exactly identified.
3. If K − k ≥ m − 1 and rank (A) < M − 1 equation under identified.
4. If K − k < m − 1 equation unidentified.
• Rank condition can get difficult with large dimension systems:
– Often just order condition used if software cannot calculate.
Testing for Simultaneity
• If we have no simultaneity problem then OLS consistent.
– If we do have simultaneity then need 2SLS/IV — to come.
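• Thus testing for simultaneity helpful. Hausman (1978) provided a test.
• Consider model:
$$Q^d_t = \alpha_0 + \alpha_1 P_t + \alpha_2 X_t + \varepsilon_{1t}, \tag{56}$$
$$Q^s_t = \beta_0 + \beta_1 P_t + \varepsilon_{2t}. \tag{57}$$
• Reduced form:
$$P_t = \Pi_0 + \Pi_1 X_t + v_t, \tag{58}$$
$$Q_t = \Pi_2 + \Pi_3 X_t + e_t. \tag{59}$$
• OLS on (58) gives fitted values $\hat P_t$ and residuals $\hat v_t$, with $P_t = \hat P_t + \hat v_t$.
• Sub back into supply: $Q_t = \beta_0 + \beta_1\hat P_t + \beta_1\hat v_t + \varepsilon_{2t}$.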
Testing for Simultaneity
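• Test equation: $Q_t = \beta_0 + \beta_1\hat P_t + \beta_1\hat v_t + \varepsilon_{2t} = \beta_0 + \beta_1\hat P_t + \theta\hat v_t + \varepsilon_{2t}$.
• If simultaneity then $\hat v_t$ correlated with ε2t:
– $\hat v_t$ is variation remaining in Pt after controlling for exogenous variables.
– Via instrumenting, we have split Pt into an exogenous component $\hat P_t$ and a potentially simultaneous component $\hat v_t$.
• If simultaneity then $\hat v_t$ significant if run OLS on test equation. Since:
$$\hat\theta = \frac{\sum_{t=1}^T Q_t\hat v_t}{\sum_{t=1}^T \hat v_t^2} = \frac{\sum_{t=1}^T Q_t(P_t - \hat P_t)}{\sum_{t=1}^T \hat v_t^2}. \tag{60}$$
• Hence Hausman test for simultaneity is t-test on OLS coefficient on $\hat v_t$:
– If null of a zero coefficient rejected, conclude simultaneity.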
Estimation
• Estimating system:
$$BY_t = GX_t + \varepsilon_t \tag{61}$$
– M endogenous variables in M × 1 vector Yt, M × M coefficient matrix B.
– K exogenous variables in K × 1 vector Xt, M × K coefficient matrix G.
• Recall reduced form:
$$Y_t = B^{-1}GX_t + B^{-1}\varepsilon_t = \Pi X_t + u_t. \tag{62}$$
• OLS estimation of Π consistent.
– But Π rarely of interest: These are reduced-form parameters.
– B, G more of interest but not necessarily identified.
Estimation
• Two estimation methods:
1. Limited Information Methods:
• Estimate each equation of system separately.
• Take into account restrictions on that equation, not others.
2. Full Information Methods:
• Estimate all equations jointly, or simultaneously.
• Impose all restrictions on all equations (required for identification).
Other Special Cases
• If the model is recursive, also known as triangular or causal, then:
$$B = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ -\beta_{21} & 1 & 0 & \cdots & 0 \\ -\beta_{31} & -\beta_{32} & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -\beta_{M1} & -\beta_{M2} & -\beta_{M3} & \cdots & 1 \end{pmatrix}. \tag{63}$$
• I.e. Y1 causes Y2 causes Y3 etc. hence no endogeneity problems.
• Also require that Cov (εt) = Ω is diagonal, i.e. Cov (εit, εjt) = 0 for all j ≠ i.
• Here OLS is consistent.
Other Special Cases
• Vector autoregression: E.g. VAR(2):
$$X_t = \Pi_1 X_{t-1} + \Pi_2 X_{t-2} + \varepsilon_t. \tag{64}$$
• Here:
$$X_t = \begin{pmatrix} X_{1t} \\ X_{2t} \\ \vdots \\ X_{pt} \end{pmatrix}. \tag{65}$$
• Model has no contemporaneous values of X1, . . . , Xp on RHS hence OLS consistent.
• Usually economic theory imposes structure on contemporaneous variables:
$$BX_t = \Pi_1 X_{t-1} + \Pi_2 X_{t-2} + \varepsilon_t \tag{66}$$
– Then again got issue of endogeneity.
– More on VARs later. . .
Limited Information Methods
• In just-identified case, we can use indirect least squares (ILS). Model:
$$q^d_t = \alpha_0 + \alpha_1 p_t + \alpha_2 s_t + \varepsilon_{1t}, \tag{67}$$
$$q^s_t = \beta_0 + \beta_1 p_t + \beta_2 c_t + \varepsilon_{2t}. \tag{68}$$
• Proceed in three steps:
1. Find reduced-form equations:
– Endogenous variables in terms of only exogenous variables.
$$p_t = \Pi_0 + \Pi_1 s_t + \Pi_2 c_t + u_t,$$
$$q_t = \Pi_3 + \Pi_4 s_t + \Pi_5 c_t + v_t.$$
2. Estimate reduced-form equations by OLS:
– Yields consistent estimates of reduced-form parameters as no endogeneity.
– Produces $\hat\Pi_0, \dots, \hat\Pi_5$.
3. Obtain estimates of structural coefficients by one-to-one correspondence:
– Requires just-identification to get $\hat\alpha_0, \dots, \hat\beta_2$.
Limited Information Methods
• ILS breaks down if equation overidentified:
– More than one possibility for each parameter, standard errors dubious.
• Say model is:
$$q^d_t = \alpha_0 + \alpha_1 p_t + \alpha_2 s_t + \alpha_3 p_{t-1} + \varepsilon_{1t},$$
$$q^s_t = \beta_0 + \beta_1 p_t + \beta_2 c_t + \varepsilon_{2t}.$$
• Problem remains that pt endogenous hence $E(p_t\varepsilon_{1t}) \neq 0$.
• Require method to isolate component of pt correlated with ε1t.
• Can use instrumental variable estimation:
– Exogenous variables st, pt−1, ct satisfy one instrumenting condition.
∗ Namely uncorrelatedness with the error term.
– Expect relevance condition to hold: otherwise unidentified.
– Method: Two-stage least squares.
Two-Stage Least Squares
1. Regress endogenous variable pt on all exogenous variables in system:
$$p_t = \pi_0 + \pi_1 s_t + \pi_2 p_{t-1} + \pi_3 c_t + u_t. \tag{69}$$
• Yields estimates $\hat\pi_0, \dots, \hat\pi_3$ to use to get fitted values $\hat p_t$.
• $\hat p_t$ is variation in pt explained by exogenous variables.
– Hence uncorrelated with error component.
• $\hat u_t = p_t - \hat p_t$ is remaining component correlated with error.
2. Run original system equation using $\hat p_t$ in place of pt:
$$q^d_t = \alpha_0 + \alpha_1\hat p_t + \alpha_2 s_t + \alpha_3 p_{t-1} + \varepsilon_{1t}. \tag{70}$$
• Resulting estimate $\hat\alpha_1$ consistent provided exogenous variables valid instruments.
• Isolate and remove component of pt correlated with error term by instrumenting.
• Can estimate even if model overidentified as here.
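The two stages can be sketched on simulated data. The example below uses Python with NumPy; every parameter value is a hypothetical choice, and the recursion simply solves demand equal to supply for the price each period so that lagged price is predetermined. It compares the 2SLS estimate of the demand slope with the naive OLS estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 20_000

# Hypothetical structural parameters.
a0, a1, a2, a3 = 10.0, -1.0, 0.5, 0.3   # demand: q = a0 + a1*p + a2*s + a3*p_lag + e1
b0, b1, b2 = 2.0, 1.5, -0.8             # supply: q = b0 + b1*p + b2*c + e2

s = rng.normal(5.0, 2.0, size=T)        # demand shifter
c = rng.normal(3.0, 1.0, size=T)        # supply shifter
e1 = rng.normal(size=T)
e2 = rng.normal(size=T)

# Solve demand = supply for the equilibrium price, period by period.
p = np.empty(T)
p_lag = np.empty(T)
prev = 0.0
for t in range(T):
    p_lag[t] = prev
    p[t] = (a0 - b0 + a2 * s[t] + a3 * prev - b2 * c[t] + e1[t] - e2[t]) / (b1 - a1)
    prev = p[t]
q = a0 + a1 * p + a2 * s + a3 * p_lag + e1

def ols(y, X):
    """OLS coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(T)
# Stage 1: regress p on all exogenous/predetermined variables in the system.
Z = np.column_stack([ones, s, p_lag, c])
p_hat = Z @ ols(p, Z)
# Stage 2: demand equation with fitted price in place of actual price.
alpha_2sls = ols(q, np.column_stack([ones, p_hat, s, p_lag]))
# Naive OLS on the demand equation for comparison.
alpha_ols = ols(q, np.column_stack([ones, p, s, p_lag]))

print("2SLS demand slope:", alpha_2sls[1])
print("OLS  demand slope:", alpha_ols[1])
```

The second-stage coefficients are the 2SLS estimates, but second-stage residuals computed with the fitted price would give the wrong standard errors; residuals for inference should use the actual price.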
More on 2SLS
• 2SLS estimator:
$$\hat\beta_{2SLS} = (Z_i'X_i)^{-1}(Z_i'y_i) = \beta + (Z_i'X_i)^{-1}(Z_i'\varepsilon_i). \tag{71}$$
– Where Zi is matrix of exogenous variables (instruments), Xi regressors including endogenous variables, and yi dependent variable.
• As before, need $E(Z_i'X_i) \neq 0$ and $E(Z_i'\varepsilon_i) = 0$ for consistency:
– I.e. Need exogenous variables to be exogenous and to identify system.
• First stage provides test of relevance of instruments.
• Need to ensure standard errors correctly calculated on second stage:
– Use $p_t$ not $\hat p_t$ to calculate residuals hence standard errors etc.
Full Information Methods
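• Write as stacked system:
$$y = Z\delta + \varepsilon, \quad \varepsilon \sim (0, \Sigma). \tag{72}$$
• Or:
$$\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{pmatrix} = \begin{pmatrix} Z_1 & 0 & \cdots & 0 \\ 0 & Z_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Z_M \end{pmatrix} \begin{pmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_M \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_M \end{pmatrix}. \tag{73}$$
• Here: Zm = (Ym, Xm), hence both endogenous and exogenous variables.
• OLS estimator:
$$\hat\delta_{OLS} = (Z'Z)^{-1}Z'y. \tag{74}$$
– OLS on system equivalent to equation-by-equation OLS hence inconsistent.
∗ Also by SUR inefficient: Does not exploit information in Σ.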
Full Information: 3SLS
• Method to solve inconsistency:
– Instrumental Variables estimation.
• Method to solve inefficiency:
– Feasible GLS estimation.
• Hence find estimator in three stages:
– Three-stage least squares:
Full Information: 3SLS
1. Instrumenting equation estimation:
• Use all exogenous variables X as instruments.
• Yielding $\hat\Pi$ to create $\hat Y_i = \hat\Pi X_i$, where i = 2, . . . , M for equation 1, etc.
– Gives matrix $\hat Z_i$ formed of $\hat Y_i$ and $X_i$.
2. Instrumental variables estimation:
• Use $\hat Z_i$ in place of $Z_i$.
• Yields 2SLS estimators $\hat\delta_{2SLS}$:
$$\hat\delta_{2SLS} = (\hat Z'Z)^{-1}(\hat Z'y) = (W'Z)^{-1}W'y. \tag{75}$$
• Estimator consistent only if $E(W'\varepsilon) = 0$, $E(W'Z) \neq 0$.
• Also get variance-covariance matrix estimator $\hat\Sigma$:
$$\hat\sigma_{ij} = T^{-1}(y_i - Z_i\hat\delta_i)'(y_j - Z_j\hat\delta_j). \tag{76}$$
Full Information: 3SLS
3. Feasible GLS estimation:
• Use $\hat Z$ and $\hat\Sigma$ from 2SLS estimates.
$$\hat\delta_{IV,GLS} = \hat\delta_{3SLS} = (\hat Z'\hat\Sigma^{-1}Z)^{-1}\hat Z'\hat\Sigma^{-1}y = (W'\hat\Sigma^{-1}Z)^{-1}W'\hat\Sigma^{-1}y. \tag{77}$$
• 3SLS consistent provided instruments are valid.
• Asymptotic efficiency amongst system IV estimators.
Full-Information Maximum Likelihood (FIML)
• Likelihood framework can be applied to system.
• Begin with reduced-form:
$$Y = X\Pi + V. \tag{78}$$
– Each row of V assumed multivariate Normal: vi |X ∼ N (0,Ω).
• Log-likelihood function:
$$\ln L = -\frac{T}{2}\left[M\ln(2\pi) + \ln|\Omega| + \mathrm{tr}\left(\Omega^{-1}W\right)\right]. \tag{79}$$
• Where:
$$W_{ij} = T^{-1}(y_i - X\pi_i)'(y_j - X\pi_j). \tag{80}$$
– Here, $y_i$ is ith column of Y and $\pi_i$ is ith column of Π, not the number π.
• Maximise likelihood subject to all restrictions placed on system: B matrix.
Full-Information Maximum Likelihood (FIML)
• Reduced form Y = XΠ + V found from structural form:
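$$Y\Gamma = XB + U, \quad U \sim (0, \Sigma),$$
$$\Rightarrow\; Y = XB\Gamma^{-1} + U\Gamma^{-1}, \quad U\Gamma^{-1} \sim (0, \Gamma^{-1\prime}\Sigma\Gamma^{-1}).$$
• Interested in structural form not reduced form so use substitutions:
$$\Pi = B\Gamma^{-1}, \quad \Omega = \Gamma^{-1\prime}\Sigma\Gamma^{-1}, \quad \Omega^{-1} = \Gamma\Sigma^{-1}\Gamma'. \tag{81}$$
• Hence:
$$\ln L = -\frac{T}{2}\left[M\ln(2\pi) + \ln\left|\Gamma^{-1\prime}\Sigma\Gamma^{-1}\right| + \mathrm{tr}\left(T^{-1}\,\Gamma\Sigma^{-1}\Gamma'(Y - XB\Gamma^{-1})'(Y - XB\Gamma^{-1})\right)\right].$$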
Full-Information Maximum Likelihood (FIML)
• Again:
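$$\ln L = -\frac{T}{2}\left[M\ln(2\pi) + \ln\left|\Gamma^{-1\prime}\Sigma\Gamma^{-1}\right| + \mathrm{tr}\left(T^{-1}\,\Gamma\Sigma^{-1}\Gamma'(Y - XB\Gamma^{-1})'(Y - XB\Gamma^{-1})\right)\right].$$
• Simplified:
$$\ln L = -\frac{T}{2}\left[M\ln(2\pi) - 2\ln|\Gamma| + \ln|\Sigma| + \mathrm{tr}\left(\Sigma^{-1}S\right)\right].$$
• Where:
$$s_{ij} = T^{-1}(Y\Gamma_i - XB_i)'(Y\Gamma_j - XB_j). \tag{82}$$
• Maximise likelihood to yield estimates of Γ, B matrices.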
Full-Information Methods
• FIML:
– Coherent estimation framework.
– Testing feasible via LR, LM, Wald tests.
– But: Normality assumption may not be valid.
– But: Numerical optimisation.
• 3SLS:
– Vastly easier to compute: No numerical methods.
– If Normal errors assumed, 3SLS and FIML same asymptotic properties.
∗ 3SLS thus much more popular in usage.
– Small samples: Because many parameters to estimate, 3SLS and FIML may diverge.
Simultaneous Equation Methods Condensed
• Often need to estimate more than one equation:
– Similar regression for different firms, goods, etc.
– Equations for all endogenous variables in system.
• Estimation:
– Instrumental variables to counter endogeneity.
∗ Exogenous variables are instruments.
– Full- or limited-information (IV) methods:
∗ Full can be computationally cumbersome.
∗ Limited information methods inefficient.
Break time?
Adding Lagged Variables: Vector Autoregressions
• Simultaneous equations models often used in time-series context:
• Huge macroeconomic models of 1960s and 1970s:
– Klein-Goldberger model of US economy: 20 equations.
– Brookings-Social Science Research Council model: 150 equations.
• Models need not be time series but generally are.
• What if many lags included? Stability?
• Alternative simultaneous equations model:
– The Vector Autoregression (VAR, not VaR).
– Only endogenous variables, but no contemporaneous terms.
– Often used for ‘theory-free’ estimation or forecasting.
Vector Autoregressive Models
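• Already noted second-order VAR: Two lags:
$$X_t = \Pi_0 + \Pi_1 X_{t-1} + \Pi_2 X_{t-2} + \varepsilon_t.$$
$$\begin{pmatrix} X_{1,t} \\ \vdots \\ X_{p,t} \end{pmatrix} = \begin{pmatrix} \pi_{01} \\ \vdots \\ \pi_{0p} \end{pmatrix} + \begin{pmatrix} \pi^1_{11} & \cdots & \pi^1_{1p} \\ \vdots & \ddots & \vdots \\ \pi^1_{p1} & \cdots & \pi^1_{pp} \end{pmatrix} \begin{pmatrix} X_{1,t-1} \\ \vdots \\ X_{p,t-1} \end{pmatrix} + \begin{pmatrix} \pi^2_{11} & \cdots & \pi^2_{1p} \\ \vdots & \ddots & \vdots \\ \pi^2_{p1} & \cdots & \pi^2_{pp} \end{pmatrix} \begin{pmatrix} X_{1,t-2} \\ \vdots \\ X_{p,t-2} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \vdots \\ \varepsilon_{p,t} \end{pmatrix}.$$
• All variables X1, X2, . . . , Xp determined within system. Endogenous.
• But: No variables dated t enter RHS of each equation:
– Hence can apply time-series methods to estimate.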
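Because every right-hand-side variable is lagged, each equation of the VAR can be estimated by OLS on the same set of regressors. A minimal sketch in Python with NumPy follows; the bivariate VAR(2) coefficient matrices are hypothetical values chosen to give a stable system.

```python
import numpy as np

rng = np.random.default_rng(3)
T, p = 5000, 2

# Hypothetical VAR(2) parameters for a bivariate system.
Pi0 = np.array([0.5, -0.2])
Pi1 = np.array([[0.5, 0.1],
                [0.0, 0.3]])
Pi2 = np.array([[-0.2, 0.0],
                [0.1, 0.2]])

# Simulate X_t = Pi0 + Pi1 X_{t-1} + Pi2 X_{t-2} + eps_t.
X = np.zeros((T, p))
for t in range(2, T):
    X[t] = Pi0 + Pi1 @ X[t - 1] + Pi2 @ X[t - 2] + rng.normal(size=p)

# Regressors: constant, first lag, second lag (same for every equation).
R = np.column_stack([np.ones(T - 2), X[1:-1], X[:-2]])
Y = X[2:]

# One least-squares call estimates all equations at once (column by column).
coef = np.linalg.lstsq(R, Y, rcond=None)[0]   # shape (1 + 2p, p)
Pi0_hat = coef[0]
Pi1_hat = coef[1:1 + p].T
Pi2_hat = coef[1 + p:].T

print("Pi1 estimate:\n", Pi1_hat)
print("Pi2 estimate:\n", Pi2_hat)
```

The transposes are needed because each column of `coef` holds one equation's coefficients, whereas each row of Π1 and Π2 corresponds to one equation.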
The Vector Autoregressive Model
• Object of interest: set of variables over time:
• Xt p-dimensional data vector at time t:
$$X_t = \begin{pmatrix} X_{1,t} \\ X_{2,t} \\ \vdots \\ X_{p,t} \end{pmatrix}. \tag{83}$$
• p variables relating to particular problem of interest:
– Xt is snapshot at point in time, e.g. for t = 2008:3:
$$X_t = \begin{pmatrix} (tax/Y)_t \\ (G/Y)_t \\ (D/Y)_t \\ \pi_t \\ Y^{gap}_t \end{pmatrix} = \begin{pmatrix} 0.180 \\ 0.218 \\ 0.695 \\ 0.053 \\ -0.022 \end{pmatrix}. \tag{84}$$
• X is p × T data matrix: (85)
           1966:1   1966:2   1966:3   1966:4   . . .   2008:3
(tax/Y)    0.172    0.176    0.176    0.176    . . .   0.180
(G/Y)      0.165    0.172    0.174    0.177    . . .   0.218
(D/Y)      0.416    0.405    0.409    0.408    . . .   0.695
π          0.023    0.028    0.033    0.037    . . .   0.053
Y gap      0.058    0.050    0.046    0.042    . . .   −0.022
The Vector Autoregressive Model
• The p variables X1,t, . . . , Xp,t may be:
– Same variable, different countries/regions/firms/people: e.g. exchange rates.
– Different variables, same country/region/firm/person: e.g. consumption, income.
• Can view as reduced-form modelling:
– Without assuming exogeneity.
• Can test exogeneity assumptions easily.
– Variable Xp,t exogenous if determined outside system.
– Hence if all coefficients on Xs,t−k insignificant in equation for Xp,t, s ≠ p, k > 0, Xp,t exogenous.
– Can then omit equation in Xp,t from system.
∗ Similar to Granger causality test: More later. . .
What are VARs Used For?
• Forecasting:
– All RHS variables are lagged: Can forecast tomorrow given today’s data.
• Theory-free modelling:
– Sims: No need for ‘incredible restrictions’ implied by theory.
• Theory-full modelling:
– VARs correspond well to reduced form of many macroeconomic theory models.
– Theory interested in effect of impulse to variables through time:
∗ E.g. Effect of government spending on GDP.
What are VARs Used For?
• VARs give extremely rich characterisation of dynamics in data.
– Autoregressive and distributed lag components.
– Can cope with unit roots as in AR(k) model: Cointegration.
– Deterministic terms can be incorporated (with care).
– No conditioning or exogeneity assumed.
• Many theories in economics and beyond postulate steady states.
– Yet data are non-stationary and endogenous.
– Concept of cointegration means can estimate steady state.
– VAR model allows endogeneity and testing of exogeneity.
Cointegration
• If Yt, Xt both I(d) then in general any linear combination will also be I(d).
• If there exists linear combination et = Yt − βXt s.t. et ∼ I(d− b), b > 0 then:
– Yt, Xt are cointegrated of order (d, b).
• In reality rarely find anything other than cointegration of order (1, 1): ‘Cointegration’.
• Cointegration very powerful concept in time-series econometrics.
– Recall spurious regression? Brazilian rainfall causes UK GDP?
– Cointegration allows estimation of levels relationships in data.
∗ Even if data non-stationary.
• Same spuriousness possible in VAR, simultaneous equation modelling.
Going back to our ADL Model
• ADL model: Yt = δ + α1Yt−1 + β0Xt + β1Xt−1 + εt. (86)
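• Error/equilibrium-correction mechanism (ECM) with long-run solution nested is:
$$\Delta Y_t = \beta_0\Delta X_t + (\alpha_1 - 1)\left[Y_{t-1} - \frac{\delta}{1-\alpha_1} - \frac{\beta_0+\beta_1}{1-\alpha_1}X_{t-1}\right] + \varepsilon_t. \tag{87}$$
• Define ecmt as long-run solution:
$$ecm_t = Y_t - \frac{\delta}{1-\alpha_1} - \frac{\beta_0+\beta_1}{1-\alpha_1}X_t. \tag{88}$$
• Subtract [δ/(1 − α1) + (β0 + β1)Xt/(1 − α1)] from both sides of ADL:
$$ecm_t = \delta - \frac{\delta}{1-\alpha_1} + \alpha_1 Y_{t-1} + \left(\beta_0 - \frac{\beta_0+\beta_1}{1-\alpha_1}\right)X_t + \beta_1 X_{t-1} + \varepsilon_t. \tag{89}$$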
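• Collecting terms, and adding and subtracting a term in Xt−1 to create ecmt−1:
$$\begin{aligned}
ecm_t &= \alpha_1\left[Y_{t-1} - \frac{\delta}{1-\alpha_1} - \frac{\beta_0+\beta_1}{1-\alpha_1}X_{t-1}\right] + \left[\frac{\alpha_1(\beta_0+\beta_1)}{1-\alpha_1} + \beta_1\right](X_{t-1} - X_t) + \varepsilon_t \\
&= \alpha_1 ecm_{t-1} + \left(\frac{\beta_1 + \alpha_1\beta_0}{1-\alpha_1}\right)(X_{t-1} - X_t) + \varepsilon_t \\
&= \alpha_1 ecm_{t-1} + \xi_t.
\end{aligned}$$
– Where: $\xi_t = \varepsilon_t - \left(\frac{\beta_1 + \alpha_1\beta_0}{1-\alpha_1}\right)\nu_t$.
– νt is error term on random walk for Xt: νt = Xt − Xt−1.
• Hence:
$$ecm_t = \alpha_1 ecm_{t-1} + \xi_t, \quad \xi_t \sim I(0). \tag{90}$$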
What was that for?
• We just found that: ecmt = α1ecmt−1 + ξt, ξ ∼ I(0). (91)
• Thus provided |α1| < 1, then ecmt ∼ I(0):
– Recall: $ecm_t = Y_t - \frac{\delta}{1-\alpha_1} - \frac{\beta_0+\beta_1}{1-\alpha_1}X_t$.
– ecmt is long-run solution for Yt and Xt, steady state.
– Steady-state relationships can exist in context of non-stationary model.
∗ Cointegration: Xt ∼ I(1), Yt ∼ I(1), but ecmt = f (Yt, Xt) ∼ I(0).
• Fundamental concept in econometrics:
– Even though data non-stationary can still estimate static steady-state conditions.
– Appeals to common sense and to economic theory: Allows estimation.
$$\Delta Y_t = \beta_0\Delta X_t + (\alpha_1 - 1)\left[Y_{t-1} - \kappa_0 - \kappa_1 X_{t-1}\right] + \varepsilon_t. \tag{92}$$
All Good, but What About Estimation?
• Cointegrating relationship is static: $Y_t = \beta X_t + ecm_t$, $ecm_t \sim N(0, \sigma^2_{ecm})$.
– Thus no problems with estimating: no residual autocorrelation.
– Despite dynamic nature of relationship, can estimate static equation.
• Furthermore: it’s ‘super consistent’: estimate converges to true value very fast.²
• Suggests a procedure: Estimate ecmt, insert that into ECM and estimate.
– Engle-Granger 2-step procedure.
– Intuitive and encompasses test for cointegration: is ecmt ∼ I(0)?
² It converges at rate T as opposed to √T for ‘normal’ estimator.
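The two-step procedure can be sketched on simulated data. The example below (Python with NumPy; the data-generating process and all parameter values are assumptions made for illustration) builds a random-walk X, a Y that is cointegrated with it, estimates the static levels regression, and then uses the lagged residual as the equilibrium-correction term in a regression in differences.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(4)
T = 2000

# X is a random walk; the equilibrium error is a stationary AR(1).
X = np.cumsum(rng.normal(size=T))
shocks = rng.normal(size=T)
ecm = np.zeros(T)
for t in range(1, T):
    ecm[t] = 0.5 * ecm[t - 1] + shocks[t]
Y = 1.0 + 2.0 * X + ecm        # hypothetical long-run relation Y = 1 + 2X

# Step 1: static levels regression, keep the residual as the estimated ecm.
k0_hat, k1_hat = ols(Y, np.column_stack([np.ones(T), X]))
ecm_hat = Y - k0_hat - k1_hat * X

# Step 2: ECM in differences with the lagged residual from step 1.
dY, dX = np.diff(Y), np.diff(X)
b0_hat, adj_hat = ols(dY, np.column_stack([dX, ecm_hat[:-1]]))

print("long-run slope:", k1_hat)
print("short-run impact:", b0_hat, "adjustment coefficient:", adj_hat)
```

In this design the implied adjustment coefficient is 0.5 − 1 = −0.5, so a negative estimate on the lagged residual is what equilibrium correction looks like in the second step. Testing whether the residual is I(0) needs the non-standard critical values discussed next, which this sketch does not compute.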
That Cointegration Test
• Form of ADL and ECM suggest cointegration test of ADF form:
∆ecmt = φecmt−1 +p−1∑i=1
φi∆ecmt−i + µ+ δt+ ωt, ωt ∼ N(0, σ2
ω
). (93)
• Include trend and constant either to (93) or to ecmt equation.
– Since (93) just residuals from ecmt equation.
• Cointegration test: H0 : φ = 0. (94)
• However, cannot use Dickey-Fuller distribution.
– Must use MacKinnon (1991) critical values: See Table 4.1, Harris and Sollis.
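The two-step logic can be sketched on simulated data. This is a minimal illustration assuming NumPy is available; the data-generating process, the seed and the helper name `ols` are hypothetical choices, not part of the lecture. The resulting t-ratio on φ must be compared with MacKinnon-type critical values, not with standard t tables.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients, residuals and coefficient standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, resid, se

rng = np.random.default_rng(0)
T = 500
x = np.cumsum(rng.normal(size=T))        # X_t: random walk, I(1)
y = 1.0 + 2.0 * x + rng.normal(size=T)   # Y_t cointegrated with X_t (assumed slope 2)

# Step 1: static regression; its residuals estimate ecm_t.
beta, ecm, _ = ols(y, np.column_stack([np.ones(T), x]))

# Cointegration test: Dickey-Fuller-type regression on the residuals
# (no lagged differences and no deterministic terms, for brevity).
phi, _, se_phi = ols(np.diff(ecm), ecm[:-1].reshape(-1, 1))
t_phi = phi[0] / se_phi[0]

# Step 2: insert ecm_{t-1} into the ECM and estimate by OLS.
ecm_coefs, _, _ = ols(np.diff(y), np.column_stack([np.diff(x), ecm[:-1]]))

print(f"static slope: {beta[1]:.3f}, t-ratio on phi: {t_phi:.2f}")
print(f"ECM coefficients (dX_t, ecm_(t-1)): {ecm_coefs.round(3)}")
```

In this simulated design the adjustment coefficient on the lagged ecm term should be close to −1, because Y returns to its steady state within one period.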
Sounds Too Good to be True?
• Unfortunately, it is.
• Banerjee et al (1993): β biased in small samples.
• β has very complicated distribution: cannot draw standard inference.
• The test of cointegration has low power:
– Rejects null (no cointegration) too infrequently when null false.
– Thus we conclude in favour of cointegration too little.
• Endogeneity issues: in macro systems there is often feedback from Yt to Xt.
• More than one steady-state relationship?
– Fiscal and monetary policy: why not two cointegrating relationships?
– Cannot estimate more than one here.
Solutions?
• Numerous other estimation strategies suggested for single-equation framework.
• But: all suffer from endogeneity problem and can’t estimate more than one relationship.
• Solution: Simultaneous equations, or vector autoregressive model (VAR).
– Johansen (1996) proposed VAR approach.
– Workhorse of cointegration analysis.
A first order autoregressive model
• We can build up to the VAR(k) in several steps. . .
• First order autoregressive (AR(1)) process: $x_t = \rho x_{t-1} + \mu + \varepsilon_t$. (95)
• Model solution: Recursive substitution:
$$
x_t = \rho^t x_0 + \sum_{i=0}^{t-1}\rho^i\left(\mu + \varepsilon_{t-i}\right). \tag{96}
$$
– Moving average representation.
• Cases:
1. Stationarity: $z_1, \dots, z_t$ (strongly) stationary if: $(z_1, \dots, z_t) \overset{D}{=} (z_{1+s}, \dots, z_{t+s})\ \forall s$.
– Weak, or covariance, stationary if: $E(z_t) = \mu$, $\mathrm{Cov}(z_t, z_{t-s}) = \gamma(s)$, $\forall t$.
– With $\varepsilon_t \sim N(0, \Omega)$, weak ⇒ strong.
Stationary Case
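• If |ρ| < 1, characterise model as:
$$
\begin{aligned}
E(x_t \mid x_0) &= \rho^t x_0 + \sum_{i=0}^{t-1}\rho^i\mu \longrightarrow \frac{\mu}{1-\rho}, \\
\mathrm{Var}(x_t \mid x_0) &= E\left[\Big(\sum_{i=0}^{t-1}\rho^{i}\varepsilon_{t-i}\Big)^2\right] = \sum_{i=0}^{t-1}\rho^{2i}\sigma^2 \longrightarrow \frac{\sigma^2}{1-\rho^2}, \\
\mathrm{Cov}(x_t, x_{t-k} \mid x_0) &= \rho^k\sum_{i=0}^{t-k-1}\rho^{2i}\sigma^2 \longrightarrow \frac{\rho^k\sigma^2}{1-\rho^2}.
\end{aligned}
$$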
• Process stationary asymptotically but not in small samples.
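The limiting moments can be checked by simulation. A minimal sketch assuming NumPy; the parameter values (ρ = 0.6, µ = 1, σ = 1), the seed and the number of paths are illustrative assumptions only.

```python
import numpy as np

rho, mu, sigma = 0.6, 1.0, 1.0
n_paths, T = 20000, 200
rng = np.random.default_rng(1)

x = np.zeros(n_paths)                 # every path starts at x_0 = 0
prev = x
for _ in range(T):
    prev = x                          # keep x_{T-1} for the lag-1 covariance
    x = rho * x + mu + sigma * rng.normal(size=n_paths)

mean_theory = mu / (1 - rho)              # mu / (1 - rho) = 2.5
var_theory = sigma**2 / (1 - rho**2)      # sigma^2 / (1 - rho^2) = 1.5625
cov1_theory = rho * var_theory            # lag-1 autocovariance

mean_sim = x.mean()
var_sim = x.var()
cov1_sim = np.cov(x, prev)[0, 1]

print(f"mean: simulated {mean_sim:.3f}, theory {mean_theory:.3f}")
print(f"variance: simulated {var_sim:.3f}, theory {var_theory:.3f}")
print(f"lag-1 covariance: simulated {cov1_sim:.3f}, theory {cov1_theory:.3f}")
```

By t = 200 the influence of x_0 (which enters with weight ρ^t) is negligible, so the cross-section of paths behaves like draws from the stationary distribution.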
Case 2: Unit-root
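• If ρ = 1:
$$
x_t = x_0 + \sum_{i=1}^{t}\left(\mu + \varepsilon_i\right), \tag{97}
$$
• So:
$$
\begin{aligned}
E(x_t \mid x_0) &= x_0 + t\mu \longrightarrow \infty, \\
\mathrm{Var}(x_t \mid x_0) &= E\left[\Big(\sum_{i=1}^{t}\varepsilon_i\Big)^2\right] = \sum_{i=1}^{t}\sigma^2 = t\sigma^2 \longrightarrow \infty.
\end{aligned}
$$
• Mean, variance functions of t: non-stationary.
• Unit root case corresponds to many economic data series.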
3. Explosive case: ρ > 1 not considered: infinity and beyond. . .
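The linear growth of the variance in the unit-root case can be seen in a short simulation. A sketch assuming NumPy, with no drift (µ = 0), σ = 1 and x_0 = 0 chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, T, sigma = 20000, 200, 1.0

# Each row is one random walk x_t = x_{t-1} + eps_t with x_0 = 0 and mu = 0.
paths = np.cumsum(sigma * rng.normal(size=(n_paths, T)), axis=1)

var_50 = paths[:, 49].var()      # cross-path variance of x_50, theory: 50 * sigma^2
var_200 = paths[:, 199].var()    # cross-path variance of x_200, theory: 200 * sigma^2

print(f"Var(x_50)  ~ {var_50:.1f}")
print(f"Var(x_200) ~ {var_200:.1f}")
```

Unlike the stationary case, the simulated variance keeps growing with t instead of settling at a finite limit.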
Moving-average Representations
• MA representation: easy to characterise the data process under consideration.
– Principle same for more complicated models.
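• Also recall lag operator $L$ s.t. $L^k x_t = x_{t-k}$. If |ρ| < 1:
\begin{align}
x_t - \rho x_{t-1} &= \varepsilon_t \tag{98} \\
(1 - \rho L)\,x_t &= \varepsilon_t \tag{99} \\
x_t &= (1 - \rho L)^{-1}\varepsilon_t = \varepsilon_t + \rho\varepsilon_{t-1} + \rho^2\varepsilon_{t-2} + \dots \tag{100}
\end{align}
• Alternative derivation, which assumes stationarity.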
Impulse response analysis
• MA representation also facilitates impulse response analysis.
– If economy shocked (impulsed) now, where will it be in h periods?
– Formally written:
$$
x_{t+h} = \rho^h x_t + \sum_{i=0}^{h-1}\rho^i\left(\mu + \varepsilon_{t+h-i}\right). \tag{101}
$$
– Taking expectations:
$$
E(x_{t+h} \mid x_t) = \rho^h x_t + \sum_{i=0}^{h-1}\rho^i\mu. \tag{102}
$$
– Impulse response defined as:
$$
\mathrm{IR}(h) = \frac{\partial E(x_{t+h} \mid x_t)}{\partial x_t} = \rho^h. \tag{103}
$$
– If stationary, |ρ| < 1, then IR(h) → 0: impulse dies away.
– If unit root, ρ = 1, then IR(h) = 1 ∀h: shock cumulates, never dies away.
– If explosive, |ρ| > 1, IR(h) → ∞.
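The three cases can be tabulated directly from IR(h) = ρ^h. A small sketch in plain Python; the function name `impulse_response` is a hypothetical helper, and the ρ values 0.6, 1 and 1.03 are chosen to match one stationary, one unit-root and one explosive process.

```python
def impulse_response(rho, horizon):
    """Return [IR(0), ..., IR(horizon)] for an AR(1), where IR(h) = rho**h."""
    return [rho**h for h in range(horizon + 1)]

for label, rho in [("stationary", 0.6), ("unit root", 1.0), ("explosive", 1.03)]:
    ir = impulse_response(rho, 100)
    print(f"{label:>10}: IR(1)={ir[1]:.3f}  IR(10)={ir[10]:.3f}  IR(100)={ir[100]:.3g}")
```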
Three Impulse Responses
[Figure: three impulse-response panels, horizontal axis 0 to 100 periods. Stationary process: xt = 0.6xt−1 + εt. Random walk process: xt = 1xt−1 + εt. Explosive process: xt = 1.03xt−1 + εt.]
Some Caution on Impulse Responses. . .
• IR analysis phenomenally popular in empirical studies.
• Impulse to residual of statistical model ≠ economic shock.
– Even if model is identified.
• ‘Retail Energy Prices and Consumer Expenditures’ by Paul Edelstein and Lutz Kilian:
– IR but no formal checks on model: Confidence in output?
• Can impose restrictions to identify structure so it accords with theory:
– But in VARs, IRs heavily dependent on particular restrictions.
– Identification restrictions generally not testable.
– Causality very difficult to achieve in macroeconomics.
∗ (Identification is on contemporaneous terms in VAR)
• Impulse response analysis intuitively great but fraught with difficulties.
The AR(2) Model
• Model: xt = π1xt−1 + π2xt−2 + εt. (104)
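• Lag operator: $(1 - \pi_1 L - \pi_2 L^2)\,x_t = \varepsilon_t \Rightarrow \Pi(L)x_t = \varepsilon_t$ (105)
– Characteristic (lag) polynomial defined as:
$$
\Pi(z) = 1 - \pi_1 z - \pi_2 z^2 = (1 - \rho_1 z)(1 - \rho_2 z), \tag{106}
$$
$$
\frac{1}{\Pi(z)} = \frac{1}{(1 - \rho_1 z)(1 - \rho_2 z)} = \sum_{i=0}^{\infty}\rho_1^i z^i \sum_{j=0}^{\infty}\rho_2^j z^j = \sum_{n=0}^{\infty} c_n z^n, \tag{107}
$$
– $c_n \to 0$ as $n \to \infty$ if $|\rho_1| < 1$, $|\rho_2| < 1$, so MA(∞) exists: $x_t = \sum_{n=0}^{\infty} c_n(\mu + \varepsilon_{t-n})$.
• Taking expectations: $E(x_t \mid x_0) = \sum_{n=0}^{\infty} c_n\mu = \frac{\mu}{(1-\rho_1)(1-\rho_2)} = \frac{\mu}{1 - \pi_1 - \pi_2}$.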
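A numerical check of (106) and (107) for one parameter choice. This sketch uses only the standard library; π1 = 0.5, π2 = 0.3 and µ = 1 are illustrative assumptions, and the helper names are hypothetical. The MA coefficients follow the recursion c_n = π1 c_{n−1} + π2 c_{n−2} with c_0 = 1, which is what multiplying out Π(z) Σ c_n z^n = 1 implies.

```python
import cmath

def ar2_inverse_roots(pi1, pi2):
    """rho_1, rho_2 such that 1 - pi1*z - pi2*z**2 = (1 - rho_1*z)(1 - rho_2*z)."""
    disc = cmath.sqrt(pi1**2 + 4 * pi2)
    return (pi1 + disc) / 2, (pi1 - disc) / 2

def ma_coefficients(pi1, pi2, n_terms):
    """c_0, ..., c_{n_terms-1} from c_n = pi1*c_{n-1} + pi2*c_{n-2}, c_0 = 1."""
    c = [1.0, pi1]
    for _ in range(2, n_terms):
        c.append(pi1 * c[-1] + pi2 * c[-2])
    return c[:n_terms]

pi1, pi2, mu = 0.5, 0.3, 1.0
rho1, rho2 = ar2_inverse_roots(pi1, pi2)
c = ma_coefficients(pi1, pi2, 200)

long_run_mean = mu * sum(c)              # truncated sum of c_n * mu
closed_form = mu / (1 - pi1 - pi2)       # mu / (1 - pi1 - pi2)

print(f"|rho_1| = {abs(rho1):.3f}, |rho_2| = {abs(rho2):.3f}")
print(f"sum of c_n * mu = {long_run_mean:.6f}, closed form = {closed_form:.6f}")
```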
Impulse Response Analysis Again
• Want to know impact at t+ h of impulse at t:
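\begin{align}
x_{t+h} &= \sum_{n=0}^{\infty} c_n\left(\mu + \varepsilon_{t+h-n}\right) \tag{108} \\
&= c_0(\mu + \varepsilon_{t+h}) + c_1(\mu + \varepsilon_{t+h-1}) + \dots + c_{h-1}(\mu + \varepsilon_{t+1}) + c_h(\mu + \varepsilon_t) + c_{h+1}(\mu + \varepsilon_{t-1}) + \dots \tag{109} \\
&= \dots + c_h\left(x_t - \pi_1 x_{t-1} - \pi_2 x_{t-2}\right) + \dots \tag{110}
\end{align}
– Only residual εt matters: rest set to zero.
• Hence: $\mathrm{IR}(h) = \frac{\partial}{\partial x_t}E(x_{t+h} \mid x_0, \dots, x_t) = c_h \longrightarrow 0$. (111)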
– Same implications as before for unit root, explosive cases.
Bi-variate VAR(2) model with deterministic terms
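• Model:
\begin{align}
X_t &= \Pi_1 X_{t-1} + \Pi_2 X_{t-2} + \varepsilon_t \tag{112} \\
\begin{pmatrix} X_{1,t} \\ X_{2,t} \end{pmatrix} &= \begin{pmatrix} \pi_{1,11} & \pi_{1,12} \\ \pi_{1,21} & \pi_{1,22} \end{pmatrix}\begin{pmatrix} X_{1,t-1} \\ X_{2,t-1} \end{pmatrix} + \begin{pmatrix} \pi_{2,11} & \pi_{2,12} \\ \pi_{2,21} & \pi_{2,22} \end{pmatrix}\begin{pmatrix} X_{1,t-2} \\ X_{2,t-2} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{pmatrix}. \tag{113}
\end{align}
• Characteristic polynomial defined as:
$$
\Pi(z) = I_2 - \Pi_1 z - \Pi_2 z^2 = \begin{pmatrix} 1 - \pi_{1,11}z - \pi_{2,11}z^2 & -\pi_{1,12}z - \pi_{2,12}z^2 \\ -\pi_{1,21}z - \pi_{2,21}z^2 & 1 - \pi_{1,22}z - \pi_{2,22}z^2 \end{pmatrix}. \tag{114}
$$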
– z scalar, πk,ij is ijth element of Πk.
• As in univariate system, use Π(z) to characterise properties of model.
• Multivariate equivalent to solving for roots is to solve det(Π(z)) = 0:
det(Π(z)) = (1− ρ1z)(1− ρ2z)(1− ρ3z)(1− ρ4z) = 0, (115)
– ρi functions of Π1, Π2.
• Linear algebra:
$$
\Pi(z)^{-1} = \frac{\mathrm{adj}(\Pi(z))}{\det(\Pi(z))}, \tag{116}
$$
– adj(Π(z)) adjoint/adjugate matrix: each element at most order 2 as matrix 2 × 2.
– So convergence of Π(z)−1 depends on det(Π(z)).
• We already have det(Π(z)) so:
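$$
\begin{aligned}
\Pi(z)^{-1} = \frac{\mathrm{adj}(\Pi(z))}{\det(\Pi(z))} &= \frac{P(z)}{(1 - \rho_1 z)(1 - \rho_2 z)(1 - \rho_3 z)(1 - \rho_4 z)} \\
&= P(z)\sum_{i=0}^{\infty}\rho_1^i z^i \sum_{j=0}^{\infty}\rho_2^j z^j \sum_{k=0}^{\infty}\rho_3^k z^k \sum_{m=0}^{\infty}\rho_4^m z^m \\
&= \sum_{n=0}^{\infty} P^*_n z^n,
\end{aligned}
$$
– $P(z)$ second order function of $z$ incorporated into $P^*_n$.
– $P^*_n$ exponentially convergent if $|\rho_i| < 1$.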
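• If $|\rho_i| < 1$ MA(∞) representation:
$$
X_t = \sum_{i=0}^{\infty} P^*_i\left(\Phi D_{t-i} + \varepsilon_{t-i}\right) = \Pi(L)^{-1}\left(\Phi D_t + \varepsilon_t\right),
$$
• $\Pi^{-1}(z) = \sum_{i=0}^{\infty} P^*_i z^i$.
• Hence $E(X_t) = \sum_{i=0}^{\infty} P^*_i \Phi D_{t-i}$, $\mathrm{Var}(X_t) = \sum_{i=0}^{\infty} P^*_i\,\Omega\,P^{*\prime}_i$.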
• Xt not stationary as Dt depends on t, but Xt − E(Xt) is stationary.
The companion form of a vector autoregressive model
• Carrying on with VAR(2), useful expression is companion form:
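\begin{align}
\begin{pmatrix} X_t \\ X_{t-1} \end{pmatrix} &= \begin{pmatrix} \Pi_1 & \Pi_2 \\ I_2 & 0 \end{pmatrix}\begin{pmatrix} X_{t-1} \\ X_{t-2} \end{pmatrix} + \begin{pmatrix} \Phi D_t + \varepsilon_t \\ 0 \end{pmatrix} \tag{117} \\
&= \Xi\tilde{X}_{t-1} + v_t, \tag{118}
\end{align}
– $\Xi$, $\tilde{X}_{t-1} = (X_{t-1}', X_{t-2}')'$, $v_t$ suitably defined.
• Ξ is companion matrix:
– VAR(p) reduced to VAR(1) representation.
– Useful for characterising model via MA representation.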
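• Roots of companion matrix = roots of system, found by solving eigenvalue problem:
$$
\det\left(\begin{pmatrix} \Pi_1 & \Pi_2 \\ I_2 & 0 \end{pmatrix} - \rho\begin{pmatrix} I_2 & 0 \\ 0 & I_2 \end{pmatrix}\right) = 0. \tag{119}
$$
• Equivalently:
$$
\begin{pmatrix} \Pi_1 & \Pi_2 \\ I_2 & 0 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \rho\begin{pmatrix} I_2 & 0 \\ 0 & I_2 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}, \tag{120}
$$
• Implying:
$$
\Pi_1 v_1 + \Pi_2 v_2 = \rho v_1, \qquad v_1 = \rho v_2, \qquad\Rightarrow\qquad \Pi_1 v_1 + \Pi_2\rho^{-1}v_1 = \rho v_1. \tag{121}
$$
• $\det(A - \rho I) = 0 \iff Av = \rho Iv$, then (121) ⇒ $\det(\rho I_2 - \Pi_1 - \Pi_2\rho^{-1}) = 0$, or:
$$
\rho^{-1}\Pi_1 v_1 + \Pi_2\rho^{-2}v_1 = v_1 \iff \det\left(I_2 - \Pi_1\rho^{-1} - \Pi_2\rho^{-2}\right) = 0. \tag{122}
$$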
• If roots of characteristic polynomial ($z = \rho^{-1}$) lie outside the unit circle, then eigenvalues of companion matrix ($\rho$) lie inside unit circle, system stationary.
• Intuition as in AR(1): stationarity conditions enable MA(∞) representation, allow characterisation of model.
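The stationarity check via the companion matrix takes a few lines with NumPy. A sketch under assumed coefficient matrices; `Pi1` and `Pi2` are illustrative values only, not estimates from any data.

```python
import numpy as np

# Hypothetical bivariate VAR(2) coefficient matrices.
Pi1 = np.array([[0.5, 0.1],
                [0.2, 0.4]])
Pi2 = np.array([[0.2, 0.0],
                [0.0, 0.1]])
p = Pi1.shape[0]

# Companion matrix of the VAR(2), as in (117).
companion = np.block([[Pi1, Pi2],
                      [np.eye(p), np.zeros((p, p))]])

eigvals = np.linalg.eigvals(companion)
is_stationary = bool(np.all(np.abs(eigvals) < 1))

# Each eigenvalue rho should satisfy det(I - Pi1/rho - Pi2/rho^2) = 0, as in (122).
dets = [np.linalg.det(np.eye(p) - Pi1 / rho - Pi2 / rho**2) for rho in eigvals]

print("eigenvalue moduli:", np.round(np.abs(eigvals), 3))
print("stationary:", is_stationary)
print("largest |det| at the eigenvalues:", max(abs(d) for d in dets))
```

Working with the eigenvalues of the companion matrix avoids solving the fourth-order polynomial det(Π(z)) = 0 by hand.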
The unrestricted vector autoregressive model
• The unrestricted VAR model with two lags is:
Xt = Π1Xt−1 + Π2Xt−2 + ΦDt + εt
• Define:
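$$
B = \begin{pmatrix} \Pi_1' \\ \Pi_2' \\ \Phi' \end{pmatrix}, \qquad W_t = \begin{pmatrix} X_{t-1} \\ X_{t-2} \\ D_t \end{pmatrix}, \tag{123}
$$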
• Can simplify:
Xt = B′Wt + εt. (124)
The Assumptions of the VAR Model
• The VAR(p) depends on a number of assumptions:
1. $(X_t \mid X_{t-1}, X_{t-2}, \dots, X_{t-p})$ mutually independent.
2. $(X_t \mid X_{t-1}, X_{t-2}, \dots, X_{t-p}) \sim N(\Pi_1 X_{t-1} + \dots + \Pi_p X_{t-p} + \Phi D_t, \Omega)$.
– Conditional Normality.
3. Parameter space exists.
• Vital that assumptions hold.
• Likelihood framework gives powerful tool for economic analysis.
– But ‘price’ is distributional assumption.
Maximum likelihood estimation of the unrestricted VAR
• First define likelihood function:
– Joint density of Xt given parameter set θ.
• Autoregressive structure requires sequential factorisation: no independence assumption.
$$
f(X_T, X_{T-1}, \dots, X_1, X_0) = \prod_{t=k}^{T} f(X_t \mid X_{t-1}, \dots, X_{t-k}). \tag{125}
$$
• Likelihood defined as:
$$
L(\theta; X_t) = \prod_{t=k}^{T} f(X_t \mid X_{t-1}, \dots, X_{t-k}; \theta). \tag{126}
$$
• Maximum likelihood estimator of θ given data Xt defined as:
$$
\hat{\theta} = \arg\max_{\theta} L(\theta; X_t), \tag{127}
$$
• Value of θ that, given assumed distribution, maximises likelihood function.
– Measure of plausibility: how plausible is particular parameter value?
• Logarithms often used to make likelihood function tractable:
$$
\hat{\theta} = \arg\max_{\theta} \log L(\theta; X_t) = \arg\max_{\theta} \ell(\theta; X_t). \tag{128}
$$
• For VAR, Normality assumption implies:
$$
\ell(\theta; X_t) = -\frac{Tp}{2}\ln(2\pi) - \frac{T}{2}\ln|\Omega| - \frac{1}{2}\sum_{t=1}^{T}\left(X_t - B'W_t\right)'\Omega^{-1}\left(X_t - B'W_t\right).
$$
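• Likelihood maximisation implies, for B′:
\begin{align}
&\min_{B} \sum_{t=1}^{T}\left(X_t - B'W_t\right)^2 \tag{129} \\
0 &= \sum_{t=1}^{T}\left(X_t - \hat{B}'W_t\right)W_t' \tag{130} \\
\hat{B}' &= \sum_{t=1}^{T} X_t W_t'\left(\sum_{t=1}^{T} W_t W_t'\right)^{-1} = M_{XW}M_{WW}^{-1}. \tag{131}
\end{align}
– ML Estimators: $\hat{B}' = (\hat{\Pi}_1, \hat{\Pi}_2, \hat{\Phi})$.
• Product moment matrices are generically defined as $M_{XW} = T^{-1}\sum_{t=1}^{T} X_t W_t'$.
• Furthermore:
\begin{align}
\hat{\varepsilon}_t &= X_t - \hat{B}'W_t \tag{132} \\
\hat{\Omega} &= T^{-1}\sum_{t=1}^{T}\hat{\varepsilon}_t\hat{\varepsilon}_t' = M_{XX} - M_{XW}M_{WW}^{-1}M_{WX}, \tag{133}
\end{align}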
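The product-moment formulas translate almost line by line into NumPy. A sketch on simulated data, assuming no deterministic terms (so W_t contains only the two lags); the coefficient matrices, seed and sample size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
Pi1 = np.array([[0.5, 0.1], [0.2, 0.4]])   # assumed true lag-1 coefficients
Pi2 = np.array([[0.2, 0.0], [0.0, 0.1]])   # assumed true lag-2 coefficients
T, p = 5000, 2

# Simulate a bivariate VAR(2) with identity error covariance.
X = np.zeros((T + 2, p))
for t in range(2, T + 2):
    X[t] = Pi1 @ X[t - 1] + Pi2 @ X[t - 2] + rng.normal(size=p)

Y = X[2:]                            # rows are X_t'
W = np.hstack([X[1:-1], X[:-2]])     # rows are W_t' = (X_{t-1}', X_{t-2}')

# Product moment matrices M_AB = T^{-1} * sum of A_t B_t'.
M_XW = Y.T @ W / T
M_WW = W.T @ W / T
M_XX = Y.T @ Y / T

B_hat_T = M_XW @ np.linalg.inv(M_WW)                      # B-hat' = (Pi1-hat, Pi2-hat)
Omega_hat = M_XX - M_XW @ np.linalg.inv(M_WW) @ M_XW.T    # residual covariance, as in (133)

resid = Y - W @ B_hat_T.T
print("Pi1-hat:\n", B_hat_T[:, :p].round(3))
print("Pi2-hat:\n", B_hat_T[:, p:].round(3))
print("Omega-hat:\n", Omega_hat.round(3))
```

The two expressions for the residual covariance in (133) agree numerically: averaging the outer products of `resid` gives the same matrix as the product-moment formula.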
Maximised Likelihood
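• Use estimators in likelihood:
$$
L_{\max} = L(\hat{B}, \hat{\Omega}) = (2\pi)^{-Tp/2}\,|\hat{\Omega}|^{-T/2}\exp\left(-\frac{1}{2}\,\mathrm{tr}\left[\hat{\Omega}^{-1}\sum_{t=1}^{T}\hat{\varepsilon}_t\hat{\varepsilon}_t'\right]\right) \tag{134}
$$
$$
L_{\max}^{-2/T} = (2\pi e)^p\,|\hat{\Omega}|. \tag{135}
$$
• Very powerful result for testing:
– Regardless of model estimated with Normal distribution, get this result.
– Can impose restrictions, estimate, get $\hat{\Omega}_R$ and $\hat{B}_R$ and get:
$$
L_{R}^{-2/T} = (2\pi e)^p\,|\hat{\Omega}_R|. \tag{136}
$$
– Likelihood ratio test has easy form: ratio of (generalised) residual variances, $|\hat{\Omega}_R| / |\hat{\Omega}|$.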
Testing with the Maximum Likelihood Framework
• Likelihood framework allows easy testing: likelihood ratio test.
• Test the hypothesis: H0 : θ = θ0, (137)
• Using test statistic:
$$
LR = -2\left(\log L(\theta_0; X_t) - \log L(\hat{\theta}; X_t)\right) \sim \chi^2_{\dim\theta}. \tag{138}
$$
• Test assesses plausibility of restrictions.
– If restrictions move likelihood too far from $\hat{\theta}$, reject restrictions.
• Restrictions on B′ formed by constructing matrices R or H — Lecture 3.
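• Using H form, restrictions imposed by $B' = \psi H'$:
$$
X_t = \psi H'Z_t + \varepsilon_t. \tag{139}
$$
• Estimating gives restricted estimators, denoted by checks:
\begin{align}
\check{\psi} &= M_{XZ}H\left(H'M_{ZZ}H\right)^{-1} \tag{140} \\
\check{\Omega} &= M_{XX} - M_{XZ}H\left(H'M_{ZZ}H\right)^{-1}H'M_{ZX} \tag{141}
\end{align}
• Likelihood ratio test: $-2\ln(LR) = T\ln\left(|\check{\Omega}| / |\hat{\Omega}|\right) \to \chi^2_r$. (142)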
– Test statistic simple and intuitive.
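A sketch of the determinant-ratio form of the test, assuming NumPy. The restriction tested here is Π2 = 0 (a VAR(1) against a VAR(2)), which drops the second-lag block of regressors and so imposes r = 4 zero restrictions in a bivariate system. The simulated design, the seed and the helper name `resid_cov` are illustrative assumptions; 9.488 is the 5% critical value of a χ² with 4 degrees of freedom.

```python
import numpy as np

def resid_cov(Y, W):
    """Residual covariance M_YY - M_YW M_WW^{-1} M_WY from regressing Y on W."""
    T = len(Y)
    M_YW, M_WW = Y.T @ W / T, W.T @ W / T
    return Y.T @ Y / T - M_YW @ np.linalg.solve(M_WW, M_YW.T)

rng = np.random.default_rng(4)
Pi1 = np.array([[0.5, 0.1], [0.2, 0.4]])
Pi2 = np.array([[0.2, 0.0], [0.0, 0.1]])   # non-zero, so the restriction is false here
T, p = 2000, 2

X = np.zeros((T + 2, p))
for t in range(2, T + 2):
    X[t] = Pi1 @ X[t - 1] + Pi2 @ X[t - 2] + rng.normal(size=p)

Y, lag1, lag2 = X[2:], X[1:-1], X[:-2]

Omega_u = resid_cov(Y, np.hstack([lag1, lag2]))   # unrestricted VAR(2)
Omega_r = resid_cov(Y, lag1)                      # restricted model: Pi2 = 0

# -2 ln(LR) = T * ln(|Omega_restricted| / |Omega_unrestricted|)
lr_stat = T * (np.log(np.linalg.det(Omega_r)) - np.log(np.linalg.det(Omega_u)))

CHI2_4_CRIT_5PCT = 9.488
reject = bool(lr_stat > CHI2_4_CRIT_5PCT)
print(f"LR statistic: {lr_stat:.1f}, reject Pi2 = 0 at 5%: {reject}")
```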
The VAR Likelihood Framework
• VAR is simultaneous equations autoregressive model.
• Allows rich characterisation of dynamics of data.
• Equivalent to reduced form of economic theory models.
• Likelihood estimation consistent as no endogeneity.
– Also efficient as Ω estimated.
• Provided VAR well specified, powerful tool for exploring data:
– Forecasting.
– Impulse response analysis.
– Investigating steady-state relationships: Cointegration.
– Issues of causality and exogeneity.
Checking the VAR
• VAR provides much information on modelled data.
• Johansen (2004): “which statistical model describes the data?”
– Statistical models rely on assumptions, properties proved based on these.
– Must test assumptions hold.
– Check on unrestricted VAR before proceeding to cointegration analysis.
∗ Choice of number of cointegrating vectors vital.
∗ Akin to deciding whether data I(1) or I(0).
∗ Choice affected by model misspecification.
The Assumptions of the VAR model
• VAR model assumes:
1. Linear conditional mean explained by past observations and deterministic terms:
– Testing: Un-modelled systematic variation in residuals:
∗ Informal: plots of residuals.
∗ Formal: test for autocorrelated errors, heteroskedasticity and ARCH.
– Remedy by:
∗ Choice of lag length.
∗ Choice of information set: composition of Xt.
∗ Incorporate outliers.
∗ Data transformations: Non-linearity.
∗ Structural breaks: Non-constant parameters, deterministic terms.
Assumptions (continued. . . )
2. Time-invariant conditional variance:
– Heteroskedasticity and ARCH effects:
∗ Informal: plots of residuals.
∗ Formal: White test, ARCH test.
– Remedy:
∗ Add potentially causal regressors?
∗ Regime shifts in the variance: deterministic terms.
3. Independent Normal errors, mean zero, variance Ω:
– Informal testing: histogram of residuals.
– Formal testing: Autocorrelation test on residuals.
4. Parameter space:
– All model outcomes plausible?
– Remedy: data transformation, e.g. logs for % change.
Cointegration in the VAR
• Data generally non-stationary: assume Xt ∼ I(1).
• As with AR(1), reformulate: Error correction form of VAR:
∆Xt = ΠXt−1 + Γ1∆Xt−1 + · · ·+ Γk−1∆Xt−k+1 + ΦDt + εt. (143)
– $\Pi = \left(\sum_{i=1}^{k}\Pi_i\right) - I_p$, $\Gamma_j = -\sum_{i=j+1}^{k}\Pi_i$.
• ∆Xt ∼ I(0), εt ∼ I(0), but Xt ∼ I(1) still. (143) unbalanced.
• Solution: Π reduced rank. Then ∃ p× r matrices α, β s.t. Π = αβ′:
∆Xt = αβ′Xt−1 + Γ1∆Xt−1 + · · ·+ Γk−1∆Xt−k+1 + ΦDt + εt. (144)
– β′Xt−1 ∼ I(0): I(0) combinations of I(1) variables: cointegrating vectors.
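The mapping between the levels VAR and the error correction form, and the reduced rank of Π, can be illustrated numerically. A sketch assuming NumPy and k = 2 lags, using the standard reparameterisation Π = Π1 + Π2 − I and Γ1 = −Π2; the values of `alpha`, `beta` and `Gamma1` are hypothetical.

```python
import numpy as np

alpha = np.array([[-0.2], [0.1]])     # adjustment coefficients (p x r), assumed
beta = np.array([[1.0], [-1.0]])      # cointegrating vector (p x r), assumed
Gamma1 = np.array([[0.2, 0.0],
                   [0.0, 0.1]])       # short-run dynamics, assumed
I2 = np.eye(2)

# Levels VAR(2) implied by the error correction form.
Pi2 = -Gamma1
Pi1 = I2 + alpha @ beta.T + Gamma1

# Map back from the levels VAR to the error correction parameters.
Pi = Pi1 + Pi2 - I2
rank_Pi = np.linalg.matrix_rank(Pi)

# With p = 2 and r = 1 the companion matrix has p - r = 1 unit root.
companion = np.block([[Pi1, Pi2],
                      [I2, np.zeros((2, 2))]])
moduli = np.sort(np.abs(np.linalg.eigvals(companion)))[::-1]

print("Pi =\n", Pi)
print("rank(Pi) =", rank_Pi)
print("companion eigenvalue moduli:", moduli.round(4))
```

The single eigenvalue at one is the common stochastic trend shared by the two variables; the remaining eigenvalues lie inside the unit circle, so β′X_t is stationary.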
A Bivariate Example
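• Example: r = 1, p = 2, β 2 × 1 so
$$
\alpha\beta'X_{t-1} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}\begin{pmatrix} \beta_1 & \beta_2 \end{pmatrix}\begin{pmatrix} X_{1,t-1} \\ X_{2,t-1} \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix}\left(\beta_1 X_{1,t-1} + \beta_2 X_{2,t-1}\right).
$$
• β′Xt: Stationary linear combination of I(1) variables.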
• α1, α2: speed of adjustment of variables in Xt to disequilibrium.
– αi = 0 implies Xi,t weakly exogenous.
• X1,t, X2,t: consumption and income, home and foreign interest rate. . .
• Very powerful framework for analysis of steady-state relationships.
– Can check if more than one variable adjusts to steady-state.
– No empirical examples today: If interested I can provide more slides.
Granger Causality
• Causality central to economics and other fields:
– Does money cause GDP, or GDP cause money?
– Does advertising cause sales? Or sales cause advertising?
• The VAR framework allows us to answer these questions.
• A variable Xt is Granger non-causal for Yt if:
E (Yt |Yt−1, Xt−1, . . . ) = E (Yt |Yt−1) . (145)
– I.e. Previous values of Xt do not provide information on Yt.
– Same as strong exogeneity.
• If (145) does not hold, implication is Xt Granger causal for Yt.
– But need also that Yt Granger non-causal for Xt.
– Rule out feedback, establish causality.
Granger Causality
• Can easily test Granger causality in VAR model. E.g. bivariate VAR(2):
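\begin{align}
X_t &= \Pi_1 X_{t-1} + \Pi_2 X_{t-2} + \varepsilon_t \tag{146} \\
\begin{pmatrix} X_{1,t} \\ X_{2,t} \end{pmatrix} &= \begin{pmatrix} \pi_{1,11} & \pi_{1,12} \\ \pi_{1,21} & \pi_{1,22} \end{pmatrix}\begin{pmatrix} X_{1,t-1} \\ X_{2,t-1} \end{pmatrix} + \begin{pmatrix} \pi_{2,11} & \pi_{2,12} \\ \pi_{2,21} & \pi_{2,22} \end{pmatrix}\begin{pmatrix} X_{1,t-2} \\ X_{2,t-2} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \end{pmatrix}. \tag{147}
\end{align}
• If $\pi_{1,21} = \pi_{2,21} = 0$ then lags of $X_{1,t}$ Granger non-causal for $X_{2,t}$.
• If $\pi_{1,12} \neq 0$, $\pi_{2,12} \neq 0$ then $X_{2,t}$ Granger causal for $X_{1,t}$.
– Hence, with no feedback from $X_{1,t}$, causality runs from $X_{2,t}$ to $X_{1,t}$.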
• Powerful test of causality often used in literature.
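A sketch of the zero-restriction test in a bivariate VAR(2), assuming NumPy. It uses the single-equation form of the likelihood ratio statistic, T ln(RSS_restricted / RSS_unrestricted), which is asymptotically χ² with 2 degrees of freedom under the null (5% critical value 5.991). The simulated design makes X2 Granger causal for X1 but not the reverse; the coefficient values, seed and helper names are illustrative assumptions.

```python
import numpy as np

def rss(y, W):
    """Residual sum of squares from an OLS regression of y on W."""
    coef, *_ = np.linalg.lstsq(W, y, rcond=None)
    e = y - W @ coef
    return e @ e

def granger_lr(X, cause, effect):
    """LR statistic for H0: both lags of column `cause` are zero in the `effect` equation."""
    n_vars = X.shape[1]
    y = X[2:, effect]
    W_u = np.hstack([X[1:-1], X[:-2]])                       # all first and second lags
    keep = [j for j in range(W_u.shape[1]) if j % n_vars != cause]
    W_r = W_u[:, keep]                                       # drop lags of the candidate cause
    return len(y) * np.log(rss(y, W_r) / rss(y, W_u))

rng = np.random.default_rng(5)
Pi1 = np.array([[0.5, 0.3], [0.0, 0.4]])   # pi_{1,21} = 0
Pi2 = np.array([[0.1, 0.2], [0.0, 0.1]])   # pi_{2,21} = 0
T = 1000

X = np.zeros((T + 2, 2))
for t in range(2, T + 2):
    X[t] = Pi1 @ X[t - 1] + Pi2 @ X[t - 2] + rng.normal(size=2)

lr_x2_to_x1 = granger_lr(X, cause=1, effect=0)
lr_x1_to_x2 = granger_lr(X, cause=0, effect=1)

CHI2_2_CRIT_5PCT = 5.991
print(f"X2 -> X1: LR = {lr_x2_to_x1:.1f} (5% critical value {CHI2_2_CRIT_5PCT})")
print(f"X1 -> X2: LR = {lr_x1_to_x2:.1f}")
```

The equation-by-equation version is used here only for brevity; a system test would compare the determinants of the restricted and unrestricted residual covariance matrices instead.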
Granger Causality
• But general case: tri-variate VAR(2):
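$$
\begin{pmatrix} X_{1,t} \\ X_{2,t} \\ X_{3,t} \end{pmatrix} = \begin{pmatrix} \pi_{1,11} & \pi_{1,12} & \pi_{1,13} \\ \pi_{1,21} & \pi_{1,22} & \pi_{1,23} \\ \pi_{1,31} & \pi_{1,32} & \pi_{1,33} \end{pmatrix}\begin{pmatrix} X_{1,t-1} \\ X_{2,t-1} \\ X_{3,t-1} \end{pmatrix} + \begin{pmatrix} \pi_{2,11} & \pi_{2,12} & \pi_{2,13} \\ \pi_{2,21} & \pi_{2,22} & \pi_{2,23} \\ \pi_{2,31} & \pi_{2,32} & \pi_{2,33} \end{pmatrix}\begin{pmatrix} X_{1,t-2} \\ X_{2,t-2} \\ X_{3,t-2} \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,t} \\ \varepsilon_{2,t} \\ \varepsilon_{3,t} \end{pmatrix}. \tag{148}
$$
• If $\pi_{1,21} = \pi_{2,21} = 0$ then lags of $X_{1,t}$ Granger non-causal for $X_{2,t}$.
• If $\pi_{1,12} \neq 0$, $\pi_{2,12} \neq 0$ then $X_{2,t}$ Granger causal for $X_{1,t}$.
– But what about causality from $X_{1,t-2}$ to $X_{3,t-1}$ to $X_{2,t}$?
– Need also $\pi_{2,31} = \pi_{1,23} = 0$.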
Granger Causality
• Granger causality extensively used in empirical work.
– Powerful and intuitive test of causality.
– Reliant on VAR framework.
– Could be run as series of single equation estimations.
∗ Though inefficient.
• But it is severely limited:
– With many lags, complicated structure of zero restrictions required.
– Test is conditional on information set included:
∗ Hence unmodelled lags or variables may provide causality.
∗ Hence Granger non-causality conclusion may be invalidated.
Advertising: Copenhagen Cointegration Summer School
• Learn from the Masters!
• Three week course in August each year:
– August 3–23 2009: register your interest!
– http://www.econ.ku.dk/summerschool/
• Mornings: Cointegration theory from Juselius, Johansen, Rahbek and Nielsen.
• Afternoons: Computer labs to work on your own dataset.
• Hugely useful course:
– Submit paper at end of course for feedback.
– Potential PhD chapter.
Concluding
• In-depth look at multiple-equation modelling.
• Seemingly Unrelated Regression:
– E.g. Demand systems.
– Exploit information between regressions.
• Simultaneous-equation modelling:
– E.g. Demand and supply systems.
– Endogeneity problem.
– IV estimation: Instruments are exogenous variables.
• VAR modelling:
– Extending the time series dimension.
– Forecasting, theory-free/full modelling, impulse responses, Granger causality.
• Next week: Limited-Dependent-Variable Modelling.