MSc Time Series Econometrics Module 2 Lecture 1: VARs, introduction, motivation, estimation,...
-
Upload
gavin-porter -
Category
Documents
-
view
219 -
download
0
Transcript of MSc Time Series Econometrics Module 2 Lecture 1: VARs, introduction, motivation, estimation,...
MSc Time Series Econometrics
Module 2Lecture 1: VARs, introduction, motivation, estimation,
preliminariesTony Yates
Spring 2014, Bristol
Me• New to academia.• 20 years in Bank of England, directorate responsible
for monetary policy.• Various jobs on Inflation Report, Inflation Forecast,
latterly as senior advisor, leading ‘monetary strategy team’
• Research interests: VARs, TVP-VARs, monetary policy design, learning, DSGE, heterogeneous agents.
• Also teaching MSc time series on VARs.
Follow my outputs if you are interested
• My research homepage• Blog ‘longandvariable’ on macro and public
policy• Twitter feed @tonyyates
Topics we will cover
• Vector autoregressions:– motivation– Estimation, MLE, OLS, Bayesian using analytical and Gibbs
Sampling MCMC methods– Identification [short run restrictions, long run restrictions,
sign restrictions, max share criteria]– interpretation, use, contribution to macroeconomics
• Factor models in vector autoregressions• TVP VAR estimation using kernels.• If we have time, for fun: The Kalman Filter,
Bootstrapping
Course/learning strategy
• Rudimentary algebra of VARs, estimation, identification, etc.
• Some examples of code to i) give an insight into what might lie in store if you continue ii) can sometimes help to demystify the algebra.
• Applications: their contribution, impact.• In the exam, you will be expected to understand and
reproduce the algebra, and to cite and comment on the applications.
• You won’t be expected to write code.
VARs useful sources
• Chris Sims, 'Macroeonomics and reality‘• Lutz Kilian 'Structural Vector Autoregressions‘• Fabio Canova: Methods for Applied Business
Cycle research• James Hamilton 'Time Series Analysis‘• Helmut Luktepohl
'New introduction to multiple time series analysis'
Useful sources, ctd
• Stock and Watson: implications of dynamic factor models for VAR analysis
• Stock and Watson: 'Dynamic factor models'
Matrix/linear algebra pre-requisites
• Scalar, vector, matrix.• Transpose• Inverse (matrix equivalent of dividing).• Diagonal matrix.• Eigenvalues and eigenvectors.• Powers of a matrix.• Matrix series sums. Matrix equivalent of geometric scalar
sums.• Variance-covariance matrix.• Cholesky factor of a variance-covariance matrix.• Givens matrix.
Some applications
• Christiano, Eichenbaum, Evans: ‘Monetary policy shocks: what have we learned and to what end?’
• Christiano, Eichenbaum and Evans (2005): ‘Nominal rigidities and the dynamics effects of a monetary policy shock’
• Mountford, Uhlig (2008): ‘what are the effects of fiscal policy shocks?’
• Gali (1999): ’Technology, employment and the business cycle….’
VAR motivation: Cowles Commission models
• Dominant paradigm was large scale macroeconometric models, in policy institutions especially
• Many estimated equations.• Academic origins in foundational work to create
national accounts; Keynesian formulation of macroeconomics; Haavelmo’s notion of ‘probability model’ applied to this.
• Nice discussion in Sims’ Nobel acceptance lecture.
Silly example of a CC modelSims: Equation for C excludes U, TUEquation for Y excludes C, UAnd so on....
Lucas: equation for C sounds like common sense, but is it the C that reflects the solution to a consumers’ problem in a general equilibrium model? Maybe, or maybe not
Those u’s look exogenous and are meant to be, but are they really primitive shocks? Both Sims’ and Lucas’ critiques would suggest not
C t c0 c1Yt c2Yt 1 uCt
Yt y0 y1N t y2N t 1 y3Wt uYt
Wt w0 w1U t w2TU t uWt
U t u0 u1Yt u2Yt 1 uUt
TU t tu0 tu1Yt tu2U t uTUt
Critiques of Cowles Commission approach
• Lucas (1976):– Laws of motion have to come from solving
problems of agents in the model– If not, correlations will change if policy changes
• Sims (1981):– ‘Incredible’ identification restrictions
Response to Sims and Lucas critiques
• No incredible restrictions. Everything left in.• Reduced form shocks span the structural
shocks. • Structural shocks and their effects sought
through identification, reference to classes of Lucas-Critique proof models
• ‘Modest policy interventions’ [Sims and Zha].
Cowles Commission variables set out as a VAR model
Yt
C t
Yt
Wt
U t
TU t
N t
A110 A12
0 . . . A160
A210 . . . . .
. . . . . .
. . . . . .
. . . . . .
A610 . . . . A66
0
Yt 1 A1Yt 2 . . .Z t
Potentially, everything is a function of everything else lagged
Simultaneity encoded in the reduced form errors. To be disentangled into structural shocks through identification.
VAR topics and contributions to macro (1)
• Estimating parameters in a DSGE model:– Rotemburg and Woodford (1998)– Christiano, Eichenbaum and Evans (2005)
• Identify monetary policy ‘shock’• Choose parameters of the model so that
impulse response to this shock in the DSGE model as close as possible to corresponding IRF in the VAR.
VAR topics and contributions to macro (2)
• Evaluating the RBC claim that technology shocks cause business cycles– Gali (1999)
• Identified technology shocks as the only thing that could change N/Y in long run
• Showed that these caused hours work to fall, not rise
• Inconsistent with RBC model (‘make hay while the sun shines’)
VAR topics and contributions to macro (3)
• Cogley and Sargent (2005)– VAR with time-varying parameters– Multivariate counterpart to persistence, ‘predictability’– Bayesian estimation
• By how much did inflation predictability change in the post-war period?
• If it changed a lot, what does that suggest about its causes?
Misc technical preliminaries to help you read the papers
• The lag polynomial operator ‘L’
yt 1yt 1 2yt 2 . . . pyt p
0L0yt 1 1L1yt 1 . . . p 1Lp 1yt 1
Lyt 1
Lyt yt 1 ,L 1yt yt 1Lag / lead operator denoted by positive, negative powers of L
Misc technical preliminaries...
• Cholesky decomposition
11 . .
21 22 .
31 32 33
a 0 0
. b 0
. . c
a . .
0 b .
0 0 c
a 0 0
. b 0
. . c
PP
a . .
0 b .
0 0 c
PP I
1 0 0
0 1 0
0 0 1
a,b,c 0
Sigma is a v-cov matrix, with elements symmetric about the diagonal; Cholesky factor on the RHS
Here we decompose further using an orthonormal matrix P
Diagonals of vcov matrix are positive beause these correspond to variances…
Misc technical preliminaries
• Givens matrix is an example of an orthonormal matrix
Also known as a Givens ‘rotation’Useful theorem: any orthonormal matrix can be shown to be a product of Givens matrices with different thetas, the number depending on the dimension of the orthonormal matrix concerned.
P
1 0 0 0
0 c s 0
0 s c 0
0 0 0 1
,c cos , s sin
Products of orthonormal matrices
PaPa I,PbPb
I
PaPbPaPb I
If two matrices are orthonormal, then so is the product of those matrices.
The VAR impulse response function
Impulse response function in a univariate time series modelTake some unit value for e1, then substitute into eq for y repeatedly
yt yt 1 et
yirf
y1
y2
y3
. . .
yn
e
e
2e
. . .
n 1e
AR(1) impulse response function
0 10 20 30 40 50 60 70 80 90 1000
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
Y_t=0.85*y_t-1+e_t; e(1)=1IRF shrinks back to zero as we multiply 1 by successively higher powers of 0.85
Matlab code to plot IRF in an AR(1)
Matlab code to plot an AR(1) impulse response
• %compute impulse response in an AR(1).• %illustration for MSc Time Series Econometrics module.• • %y_t=rho*y_t-1+e_t.• • %calibrate parameters• rho=0.85; %persistence parameter in the ar(1)• samp=100; %length of time we will compute y for.• y=zeros(samp,1); %create a vector to store our ys.• %semi colons suppress output of every command• %to screen.• e=1; %value of the shock in period 1.• y(1)=e; %first period, y=shock.• • for t=2:samp• y(t)=rho*y(t-1); %loop to project forwards effect of the unit shock.• end• • %now plot the impulse response• time=[1:samp]; %create a vector of numbers to record the progression of time• plot(time,y)
VAR impulse response
A multivariate example.For IRF to e1, choose e=[1,0] for first period, then project forwards....
Yt y1
y2t
AYt 1 et a11 a12
a21 a22
y1
y2t 1
e1
e2
Yirf,1 e,Ae,A2e, . . . ,e e1
e2
Matlab code to plot VAR(1) impulse response function
• %compute impulse response in an VAR(1).• %illustration for MSc Time Series Econometrics module.• • %y_t=A*y_t-1+e_t.• • clear all; %bit of housekeeping to clear all variables so each time you run program as you are debugging• %you know you are not adding onto previous• %values• • %calibrate parameters• A=[0.6 0.2;• 0.2 0.6];• • samp=100; %length of time we will compute y for.• y=zeros(samp,2); %create a matrix this time to store our 2 by samp bivariate time series y={y1,y2}.• • e=[1;0]; %we will simulate a shock to the first equation. Note shock in first period is now a 2 by 1 vector.• y(1,:)=e; %first period, y=shock. the colon ':' means 'corresponding values in this dimension' • • for t=2:samp• y(t,:)=A*y(t-1,:)'; %loop to project forwards effect of the unit shock.• end %' is transpose• • %now plot the impulse response• time=[1:samp]; %create a vector of numbers to record the progression of time• subplot(2,1,1)• plot(time,y(:,1))• subplot(2,1,2)• plot(time,y(:,2))
VAR(1) impulse response
0 10 20 30 40 50 60 70 80 90 1000
0.2
0.4
0.6
0.8
1
0 10 20 30 40 50 60 70 80 90 1000
0.05
0.1
0.15
0.2
0.25
Y_t=A*Y_t-1’+e_t, e_t=[1 0], A=[0.6 0.2;0.2 0.6]Note that the eigenvalues of A are 0.4 and 0.8
Using a VAR to forecast
Yt hf |e s 0, s t e tAh
Forecast at future horizon h, conditional on starting from steady state= IRF to the latest estimated shock.
Yt hf e tAh e t 1Ah 1 e t 2Ah 2 . . . e t nAh n
Forecast conditional on all the shocks estimated to have occurred:Sum of IRF to that shock at increasing horizons.Terms further to the right get smaller as higher powers of h [=smaller for stable VAR]Reflects response to shocks that hit further and further back in time.Forecasts further and further out shrink back to steady state for the same reason. Higher powers of h , A has eigenvalues <1 in absolute value.
VAR(1) representation of a VAR(p)
Yt A1Yt 1 A2Yt 2 . . .ApYt p et
Yt
Yt
Yt 1
. . .
Yt p 1
AYt 1 e t
A
A1 A2 . . . Ap
Ik 0 . . . 0
0 Ik 0 . . .
0 0 Ik 0
,e t
et
0
. . .
0
Moving average representation of a VAR(p)
We have used the VAR(1) form of the VAR(p).In words, it means Y is the sum of shocks, where each shock taken to higher and higher powers as we go back in time.Notice how this is related to the formula for the VAR impulse response we computed before.
Yt i 0
Aie t i
Persistence, memory, predictability, stability
• When a shock hits, how long does it takes for its effects to die out?
• Applications:– Business cycle theory concerned with mechanisms for
propagation. Consumption and output not white noise: why?– Bad monetary and fiscal policy could be part of the story about
why shocks take time to die out• Time series notions of persistence etc are one way to
characterise propagation and bad policy.• Why ‘bad’, because more persistence means larger variance,
and larger variance for most utility functions is bad
Persistence and variance
Persistence, higher rho, means lower denominator, means higher unconditional variance of yMost economic models asssume agents don’t like varianceSo persistence is interesting economically, since it usually indicates something ‘bad’ is happening
yt yt 1 et
varyt 2varyt 1 varet 2 covyt 1 ,et
2varyt 1 varet
NB : varyt varyt 1
varyt varet1 2
Persistence and predictability
P tj 1
e i h 0
j 1A th t A t
h e i
e i h 0
A th t A t
h e i
Multivariate predictability: if set horizon=2, dimension of Y_t=1, then this formula delivers rho^2, ie persistence squared.
AR ‘stability’
yt yt 1 et, | | 1 Univariate model.
Stability=stationarity=series have convergent sums=first and second moments independent of t, and computable
e, e, 2e, . . . ne
limt te 0 Effects of shock eventually die out. So series converges to something.
yt et et 1 2et 2 . . . net n s 0
set s
lims set s 0
Here we take the perspective that today’s data is the sum of the effects of shocks going back into the infinite past.Since today’s data is finite and well-defined, then it must be that shocks infinitely far back have no effect. Otherwise today’s data would be infinitely large.
The contribution of a shock very far back goes to zero
VAR(1) stability
Yt AYt 1 et
x eigA; |x | 1, x
detIK Ax 0 |x | 1
Stability condition echoes the AR case, but where does the dependence on the eigenvalues come from?
Yt A0et A1et 1 A2et 2 . . . Anet n
A0L0e A Let A2L2et . . . LnAnet
Here we write out a vector Y as the sum of contributions from shocks going back further and further into time.
Explaining VAR(1) eigenvalue stability condition
An PDnP 1
Dn
1n 0 0
0 2n 0
0 0 dn
P v 1 v 2 v d
The crucial thing is to make this A^n go to zero as n goes to infinity.Importance of eigenvalues in this happening comes from the fact that we can write a square matrix using the eigenvalue-eigenvector decomposition.And then compute the power of A using the powers of the eigenvales of A.So to make A^n go to zero, we have to make all the diagonal elements of D go to zero, and by analogy with the AR(1) case, this means the eigenvalues have to be < 1 in absolute value.
VAR(p) stability
Yt
Yt
Yt 1
. . .
Yt p 1
AYt 1 e t
A
A1 A2 . . . Ap
Ik 0 . . . 0
0 Ik 0 . . .
0 0 Ik 0
,e t
et
0
. . .
0
x eigA; |x | 0, x
VAR estimation
• Linear, multivariate model• Suggests estimation by...• OLS!• Or MLE, which in these circumstances [linear;
Gaussian errors] is equivalent.• MLE cumbersome because you may have
many parameters over which to optimise• Why is this cumbersome? Well, you tell me.
OLS estimation of VAR parameters
A XX
1XY
yt yt 1 et
y y2 , . . .yT,x y2 , . . .yT 1 x x 1x y
Univariate case
Multivariate case for VAR(1) representation of VAR(p)
OLS estimate of reduced form residual variance covariance matrix
Univariate case.Take data, subtract prediction, multiply residual vector by transpose....
yt yt 1 et
1/Tee , e y x
y y2 . . .yT ,x y1 . . .yT 1
1/Tee , e Y AXAnalogously for the multivariate case
The likelihood function for a VAR
• Why bother if we can use OLS? Given the drawbacks of MLE?
• Log posterior is sum of log likelihood and log prior: so we need it for Bayesian estimation
• Key to understanding many concepts:– Origin and derivation of standard errors from slope of
LF– Estimation when we can’t evaluate the LF but have to
approximate it by simulation– Pseudo-ML when the data are non-Gaussian
Likelihood fn for an AR(1)
yt yt 1 et,et N0, e2
Ey1 0
vary1 Ey1 02 e2
1 y1 ,y1
0
f y1y10 ; , e
2 1
2 e21 2
exp y1
02
2 e2 /1 2
y2 y1 e2
y2 y1 y10 N y10 , e
2
f y2 y1y20 y1
0 , , e2 1
2 e21 2
exp y2
0 y102
2 e2 /1 2
LF for an AR(1)/ctd...
Logging turns complicated product into long sumWe can maximise this and ignore constants.
f y y1,y2... f y1 f y2 y1 f y3 y2 . . . f yT yT1
logf y y1,y2... 0.5 log2 0.5 log 21 2 y102
2 2 /1 2
T 1/2 log2 T 1/2 log 2
t 2
T
yt0 y t 10 2
2 2
Likelihood for a VAR(p)Yt Yt 1 ,Yt 2 , . . .Yt p 1 NYt A1Yt 1 A2Yt 2 . . .ApYt p ,
Yt Yt 1 ,Yt 2 , . . .Yt p
A A1 ,A2 . . . .Ap
Yt Yt 1 ,Yt 2 , . . .Yt p 1 NAYt,
fY t Y t1,Y t2,...Y tp1Yt0 ,Yt 1
0 ,Yt 20 . . . . ;A, 2 n/2 | 1 |
0.5 exp 0.5Yt0 AYt
0 1Yt0 AYt
0
logfY Y1,Y2...YT logfY1 ... logfY2 ... . . . t 1
T
logfY t Y t1,...
Tn/2 log2 T/2 log| 1 |
0.5 t 1
T
Yt0 AYt
0 1Yt0 AYt
0
Recap
• Remember likelihood assumes Gaussian errors• In some circumstances you can get consistent
estimates of parameters (but not standard errors) even if this is violated.
Distributions for a VARs impulse response functions
yt yt 1 et IRF is easy in an AR1.
e, e, 2e, . . . ne It involves powers of only 1 coefficient. So distributions of rho can be used to compute distributions of the elements of the IRFs vector. Work it out.
Yt hirf Ahe,h 0,1,2. . . Things harder with a VAR as involves
many, jointly distributed coefficients.
Bootstrapping algorithm for VAR IRFs
Yt AYt 1 e t Suppose have estimated a VAR(p)
1.Draw, with replacement, a time series of shocks,e i e i1 ,e i2 , , ,
2.Create a new time series of observeables using Yt AYt 1 e it
3.Re-estimate the VAR to produce A i
4. Compute IRFs using a unit shock e0 1,0,0. . . and powers of A i
5. set i i 1, return to step 1. if i iter
Set iter=200 or so.Algorithm will generate iter vectors h long, h=max chosen horizon of the IRF.Q: how to do step 1 with computer random number generator?
Why estimate VARs
• Estimate RBC/DSGE model by choosing the parameters to match the VARs IRF
• Estimate using indirect inference.• Test implications of model by identifying a
shock.• Accounting for the business cycle by
identifying a shock.• Forecasting.
Indirect inference with VARs1.Estimate a VAR on the data
2.Generate data from DSGE model for candidate parameter values i
3.Estimate a VAR on the generated data
4.Compute score S i distance of 1 from 2
5.Let i i 1;Go back to 2. i iterm
6. i such that S i minS
Key reference is Gourieroux et alMeasure of distance, eg, euclidian norm, perhaps weighted in some way.VAR impulse response matching [Rotemburg and Woodford] is akin to indirect inference.In some cases, DSGE models don’t have VAR representations, though they have VAR approximationsWhen they do, more properly called a partial information method, rather than indirect inference.IRFs one of the infinite moments summarised by the likelihood
Next topic, in fact most of rest of course, will be on identification in VARs
• Apologies – we are going to do some economics.
• These are going to be ‘credible’ identification restrictions, contrast with Cowles Commission.
• But still contestable.• They will be based on classes of business cycle
models.• Enables a test of a theory without making too
many auxiliary assumptions that could fail.
From VARs to SVARs
• Short-run, ‘zero’ restrictions [Sims, CEE]• Long run restrictions. [Blanchard-Quah, Gali]• Sign restrictions. [Faust, Uhlig, Mountford and
Uhlig, Canova and de Nicolo]• Max share restrictions and news shocks.
[Francis et al, Barsky-Sims, Pinter, Theodoridis and Yates]
• Heteroskedasticity-based identification [Rigobon]