Juselius_Johansen_The Cointegrated VAR Model

279
The Cointegrated VAR Model: Econometric Methodology and Macroeconomic Applications Katarina Juselius July, 20th, 2003

description

Quant Finance

Transcript of Juselius_Johansen_The Cointegrated VAR Model

Page 1: Juselius_Johansen_The Cointegrated VAR Model

The Cointegrated VAR Model: EconometricMethodology and Macroeconomic Applications

Katarina Juselius

July, 20th, 2003

Page 2: Juselius_Johansen_The Cointegrated VAR Model

Chapter 1

Introduction

Economists frequently formulate an economically well-specified model as theempirical model and apply statistical methods to estimate its parameters. Incontrast, statisticians might formulate a statistically well-specified model forthe data and analyze the statistical model to answer the economic questionsof interest. In the first case, statistics are used passively as a tool to getsome desired estimates, and in the second case, the statistical model is takenseriously and used actively as a means of analyzing the underlying generatingprocess of the phenomenon in question.

The general principle of analyzing statistical models instead of applyingmethods can be traced back to R.A. Fisher. It was introduced into econo-metrics by Haavelmo (1944) [hereafter Haavelmo] and operationalized andfurther developed by Hendry and Richard (1983), Hendry (1987), Johansen(1995) and recent followers. Haavelmo’s influence on modern econometricshas been discussed for example in Hendry, Spanos, and Ericsson (1989) andAnderson (1992).

Few observed macroeconomic variables can be assumed fixed or predeter-mined a prior. Haavelmo’s probability approach to econometrics, therefore,requires a probability formulation of the full process that generated the data.Thus, the statistical model is based on a full system of equations. The com-putational complexities involved in the solution of such a system were clearlyprohibitive at the time of the monograph when even the solution of a multi-ple regression problem was a nontrivial task. In today’s computerized worldit is certainly technically feasible to adopt Haavelmo’s guide-lines in empir-ical econometrics. Although the technical difficulties have been solved longago most papers in empirical macroeconomics do not seem follow the general

1

Page 3: Juselius_Johansen_The Cointegrated VAR Model

2 CHAPTER 1. INTRODUCTION

principles stated very clearly in the monograph. In this monograph we willclaim that the VAR approach offers a number of advantages as a generalframework for addressing empirical questions in (macro)economics withoutviolating Haavelmo’s general probability principle.First we need to discuss a number of important questions raised by

Haavelmo and their relevance for recent developments in the econometricanalysis of time series. In so doing we will essentially focus on issues relatedto empirical macroeconomic analysis using readily available aggregated data,and only briefly contrast this situation with a hypothetical case in which thedata have been collected by controlled experiments.The last few decades, in particular the nineties, will probably be remem-

bered as a period when the scientific status of macroeconomics, in particularempirical macroeconomics was much debated and criticized both by peoplefrom outside, but increasingly also from inside the economics profession. Seefor example the discussions in Colander and Klamer (1987), Colander andBrenner (1992), and Colander (2001). An article expressing critical viewson empirical economics is Summers (1991), in which he discusses what heclaims to be “the scientific illusion in empirical macroeconomics”. He ob-serves that applied econometric work in general has exerted little influenceon the development of economic theory, and generated little new insight intoeconomic mechanisms. As an illustration he discusses two widely differentapproaches to applied econometric modelling; (i) the representative agent’sapproach, where the final aim of the empirical analysis is to estimate a fewdeep parameters characterizing preferences and technology, and (ii) the use ofsophisticated statistical techniques, exemplified by VAR-models a la Sims, to“identify” certain parameters on which inference to the underlying economicmechanisms is based. In neither case he finds that the obtained empiricalresults can convincingly discriminate between theories aimed at explaining amacroeconomic reality which is infinitely more rich and complicated than thehighly simplified empirical models. Therefore, he concludes that a less for-mal examination of empirical observations, the so-called stylized facts (usu-ally given as correlations, mean growth rates, etc.) has generally resultedin more fruitful economic research. This is a very pessimistic view of theusefulness of formal econometric modelling which has not yet been properlymet, nor officially discussed by the profession. The aim of this book is tochallenge this view, claiming that part of the reason why empirical resultsoften appear unconvincing is a neglect to follow the principles laid out byHaavelmo (1943).

Page 4: Juselius_Johansen_The Cointegrated VAR Model

1.1. A HISTORICAL OVERVIEW 3

The formal link between economic theory and empirical modelling liesin the field of statistical inference, and the focus is on statistical aspects ofthe proposed VAR methodology, but at the same time stressing applicabilityin the fields of macroeconomic models. Therefore, all through the text thestatistical concepts are interpreted in terms of relevant economic entities.The idea is to define a different class of ‘stylized facts’ which are statisticallywell founded and much richer than the conventional graphs, correlations andmean growth rates often referred to in discussion of stylized facts.

In this chapter we will revisit Haavelmo’s monograph as a background forthe discussion of the reasons for “the scientific illusion in empirical macroe-conomics” and ask whether it can be explained by a general failure to followthe principles expressed in the monograph. Section 1.1 provides a histori-cal overview, Section 1.2 discusses the choice of a theoretical model. Section1.3-1.5 discuss three issues from Haavelmo that often seem to have been over-looked in empirical macroeconomics: (i) the link between theoretical, trueand observed variables, (ii) the distinction between testing a hypothesis andtesting a theory, and (iii) the formulation of an adequate design of experi-ment in econometrics and relates it to a design by controlled experiments andby passive observation. Section 1.6 finally introduces the empirical problemwhich is to be used as an illustration all through the book.

1.1 A historical overview

To be included!

1.2 On the choice of economic models

This section discusses the important link between economic theory and theempirical model. In order to make the discussion more concrete, we willillustrate the ideas with an example taken from the monetary sector of theeconomy. In particular we will focus on the aggregate demand for money re-lation, being one of the most analyzed relations in empirical macroeconomics.Before selecting a theoretical model describing the demand for money as afunction of some hypothetical variables, we will first discuss the reasons whyit is interesting to investigate such a relation.

The empirical interest in money demand relations stems from basic macroe-

Page 5: Juselius_Johansen_The Cointegrated VAR Model

4 CHAPTER 1. INTRODUCTION

conomic theory postulating that the inflation rate is directly related to theexpansion in the (appropriately defined) supply of money at a rate greaterthan that warranted by the growth of the real productive potential of theeconomy. The policy implication is that the aggregate supply of moneyshould be controlled in order to control the inflation rate. The optimal con-trol of money, however, requires knowledge of the “noninflationary level” ofaggregate demand for money at each point of time, defined as the level ofmoney stock, m∗, at which there is no tendency for the inflation rate toincrease or decrease. Thus, on a practical level, the reasoning is based onthe assumption that there exists a stable aggregate demand-for-money rela-tion, m∗ = f(x), that can be estimated. Given this background, what canbe learned from the available economic theories about the form of such arelation, and which are the crucial determinants?There are three distinct motives for holding money. The transactions

motive is related to the need to hold cash for handling everyday transactions.The precautionary motive is related to the need to hold money to be able tomeet unforeseen expenditures. Finally, the speculative motive is related toagents’ wishes to hold money as part of their portfolio. Since all three motivesare likely to affect agents’ needs to hold money, let the initial assumption bethat m/p = f(yr, c), saying that real money holdings, m/p, is a function ofthe level of real income (assumed to determine the volume of transactionsand precautionary money) and the cost of holding money, c.Further assumptions of optimizing behavior are needed in order to derive

a formal model for agents’ willingness to hold money balances. Among theavailable theories two different approaches can be distinguished: (i) theoriestreating money as a medium of exchange for transaction purposes, so thatminimizing a derived cost function leads to optimizing behavior, (ii) theoriestreating money as a good producing utility, so that maximizing the utilityfunction leads to optimizing behavior.For expository purposes, only the first approach will be discussed here,

specifically the theoretical model suggested by Baumol (1952), which is stillfrequently referred to in this context. The model is strongly influenced byinventory theory, and has the following basic features. Over a certain timeperiod t1−t2, the agent will pay out T units of money in a steady stream. Twodifferent costs are involved, the opportunity cost of the foregone investmentmeasured by the interest rate r, and the so-called “brokerage” cost b. Thelatter should be assumed to cover all kinds of costs in connection with a cashwithdrawal. It is also assumed that liquid money does not yield interest.

Page 6: Juselius_Johansen_The Cointegrated VAR Model

1.3. THEORETICAL, TRUE AND OBSERVABLE VARIABLES 5

The optimal value of cash withdrawn from investment can now be found as:

C =p2bT/r (1.1)

so that the cost-minimizing agent will demand cash in proportion to thesquare root of the value of his transactions. The average holding of cash underthese assumption is C/2. Taking the logarithms of (1.1) gives a transactionselasticity of 0.5 and an interest elasticity of -0.5. These have been the priorhypotheses of many empirical investigations based on aggregated data, andestimates supporting this have been found, for instance, in Baba, Hendry,and Starr (1992).If this theoretical model is tested against data, more precise statements

of what is meant by the theoretical concepts C, b, T and r need to be made.This is no straightforward task, as expressed by Haavelmo, p.4:”When considering a theoretical set-up, involving certain variables and

certain mathematical relations, it is common to ask about the actual meaningof this and that variable. But this question has no sense within a theoreticalmodel. And if the question applies to reality it has no precise answer. Itis one thing to build a theoretical model it is another thing to give rules forchoosing the facts to which the theoretical model is to be applied. It is onething to choose the model from the field of mathematics, it is another thingto classify and measure objects of real life.”As a means to clarify this difficult issue Haavelmo introduces the con-

cepts of true and theoretical variables as opposed to observable variables andproposes that one should try to define how the variables should be measuredin an ideal situation. This will be briefly discussed in the next section.

1.3 Theoretical, true and observable variables

In order to operationalize a theoretical concept, one has to make precisestatements about how to measure the corresponding theoretical variable,i.e. to give the theoretical variable a precise meaning. This is expressed inHaavelmo, p. 5 as:We may express the difference [between the ”true” and the theoretical

variables] by saying that the ”true” variables (or time functions) representour ideal as to accurate measurements of reality ”as it is in fact” while thevariables defined in theory are the true measurements that we should make ifreality were actually in accordance with our theoretical model.

Page 7: Juselius_Johansen_The Cointegrated VAR Model

6 CHAPTER 1. INTRODUCTION

Say, for example, that a careful analysis of the above example showsthat the true measurements are the average holdings of cash and demanddeposits by private persons in private banks, postal banks or similar institu-tions (building societies, etc.) measured at closing time each trading day of amonth. The theoretical variable C as defined by Baumol’s model would thencorrespond to the true measurements given that i) no interest is paid on thisliquid money, ii) transactions are paid out in a steady stream over successiveperiods, iii) the brokerage cost and interest rate r are unambiguously defined,vi) no cash and demand deposits are held for speculative or precautionarymotives, and so on.

Needless to say, the available measurements from official statistics are veryfar from the definitions of the true variables. Even if it were possible to obtainmeasurements satisfying the above definition of the true measurements, itseems obvious that these would not correspond to the theoretical variables1

.

Nevertheless, if the purpose of the empirical investigation is to test atheory, a prerequisite for valid inference about the theoretical model is aclose correspondence between the observed variables and the true variables,or in the words of Haavelmo:

It is then natural to adopt the convention that a theory is called true orfalse according as the hypotheses implied are true or false, when tested againstthe data chosen as the ”true” variables. Then we may speak interchangeablyabout testing hypotheses or testing theories.

For example, the unit root tests of GDP series to discriminate betweendifferent real growth theories are good examples of misleading inference inthis respect. Much of the criticism expressed by Summers may well be relatedto situations in which valid inference would require a much closer correspon-dence between observed and true variables.

Even if data collected by passive observation do not generally qualify fortesting “deep” theoretical models, most empirical macroeconomic models arenonetheless based on the officially collected data. This is simply because ofa genuine interest in the macroeconomic data as such. Since “data seldomspeak by themselves”, theoretical arguments are needed to understand thevariation in these data in spite of the weak correspondence between the theo-retical and observed variables. Nevertheless, the link between the theoretical

1It should be pinted out that Baumol does not claim any such correspondence. In fact,he gives a long list of reasons why this cannot be the case.

Page 8: Juselius_Johansen_The Cointegrated VAR Model

1.4. TESTING A THEORY AS OPPOSED TO A HYPOTHESIS 7

and the empirical model is rather ambiguous in this case and the interpre-tation of the empirical results in terms of the theoretical arguments is notstraightforward. This leads to the second issue to be discussed here, i.e. thedistinction between testing a hypothesis and testing a theory. This mono-graph will claim that based on macroeconomic data it only possible to testtheoretical hypotheses but not a theory model as such. The frequent failiureto separate between the two might explain a great del of Summers’ critique.

1.4 Testing a theory as opposed to a hypoth-

esis

When the arguments of the theory do not apply directly to the empiricalmodel, one compromise is to be less ambitious about testing theories and in-stead concentrate on testing specific hypotheses derived from the theoreticalmodel. For example, the hypotheses that the elasticity of the transactions de-mand for cash ec = 0.5 and of the interest rate er = −0.5 have frequently beentested within empirical models that do not include all aspects of Baumol’stheoretical model. Other popular hypotheses that have been widely testedinclude long-run price and income homogeneity, the sign of first derivatives,zero restrictions, and the direction of causality. “Sophisticated statisticaltechniques” have often been used in this context. Since, according to Sum-mers among others, the outcomes of these exercises have generally not beenconsidered convincing or interesting enough to change economists’ views,there seems to be a need for a critical appraisal of econometric practice inthis context. There are several explanations why empirical results are oftenconsidered unpersuasive.

First, not enough care is taken to ensure that the specification of empir-ical models mimics the general characteristics of the data. If the empiricalmodel is based on a valid probability formulation of the data, then statisticalhypothesis testing is straightforward, and valid procedures can be derived byanalyzing the likelihood function. In this case, inferences about the specifiedhypotheses are valid, though valid inference about the theory as such dependson the strength of the correspondence between the true and the theoreticalvariables.

Second, not enough attention is paid to the crucial role of the ceterisparibus‘everything else constant’ assumption when the necessary information

Page 9: Juselius_Johansen_The Cointegrated VAR Model

8 CHAPTER 1. INTRODUCTION

set is specified.Third, the role of rational expectations as a behavioral assumption in

empirical models has been very unconvincing!Fourth, the available measurments do not correspond closely enough to

the true values of the theoretical variables. For example, in many applica-tions it is assumed that certain (linearly transformed) VAR residuals can beinterpreted as structural shocks. This would require the VAR residuals to beinvariant to changes in the information set, which is not likely to be the casein practical applications.These issues will be further discussed in the next chapter and related to

some general principles for VAR-modelling with nonstationary data. But,first the concept of “a design of experiment” in econometrics as discussed byHaavelmo has to be introduced.

1.5 Experimental design in macroeconomics

As discussed above, the link between the true variables suggested by thetheory and the actual measurements taken from official statistics is, in mostcases, too weak to justify valid inference from the empirical model to thetheory. Even in ideal cases, when the official definition of the aggregatedvariable, say liquid money stock M1, corresponds quite closely to the truemeasurements, there are usually measurement problems. High-quality, rea-sonably long aggregated series are difficult to obtain because definitionschange, new components have entered the aggregates and new regulationshave changed the behavior. The end result is the best set of measurementsin the circumstances, but still quite far from the true measurements of anideal situation. This problem is discussed in terms of a design of experimentin Haavelmo, p. 14:...If economists would describe the experiments they have in mind when

they construct the theories ”they would see that the experiments they have inmind may be grouped into two different classes namely, (1) experiments thatwe should like to make to see if certain real economic phenomena – whenartificially isolated from other influences – would verify certain hypothesisand (2) the stream of experiments that nature is steadily turning out from hisown enormous laboratory, and which we merely watch as passive observers.......In the first case we can make agreements or disagreements between theory

Page 10: Juselius_Johansen_The Cointegrated VAR Model

1.5. EXPERIMENTAL DESIGN IN MACROECONOMICS 9

and facts depend on two things: the facts we choose to consider and ourtheory about them. .....In the second case we can only try to adjust our theories to reality as it

appears before us. And what is the meaning of a design of experiment in thiscase. It is this: We try to choose a theory and a design of experiments to gowith it, in such a way that the resulting data would be those which we get bypassive observation of reality. And to the extent that we succeed in doing so,we become masters of reality – by passive agreement.Since the primary interest here is in the case of passive observations, we

will restrict ourselves to this case. What seems to be needed is a set of as-sumptions which are general enough to ensure a statistically valid descriptionof typical macroeconomic data, and a common modelling strategy to allowquestions of interest to be investigated in a consistent framework. Chapter 3will discuss under which conditions the VAR model can work as a reasonableformalization of a “design of experiment” for data by passive observation.Controlled experiments are not usually possible within a single macroe-

conomy and the only possibility to ‘test’ hypothetical relationships is to waitfor new data which were not used to generate the hypothesis. Another possi-bility is to rely on the “experiments” provided by other countries or regionsthat differ in various aspects with regard to the investigated economic prob-lem. For instance, if the question of interest is whether an expansion ofmoney supply generally tends to increase the inflation rate, it seems advis-able to examine this using similar data from countries that differ in terms ofthe pursued economic policy.In the rest of the book we will discuss the applicability of the cointegrated

VAR model as a common modelling strategy. We will argue that this modelin spite of its simplicity offers a potential richness in the specification of eco-nomically meaningful short- and long-run structures and components, such assteady-state relations and common trends, interaction and feed-back effects.Even more importantly, in the unrestricted general form the VAR modelis essentially only a reformulation of the covariances of the data. Providedthat these have remained approximatively constant over the sample periodthe VAR can be considered a convenient summary of the ‘stylized facts’ ofthe data. To the extent that the ‘true’ economic model underlying behaviorsatisfies a first order linear approximation (see Henry and Richard, 1983),one can test economic hypotheses expressed as the number of autonomouspermanent shocks, steady-state behavior, feed-back and interaction effects,etc. within a statistically valid framework. This is essentially the ‘general-

Page 11: Juselius_Johansen_The Cointegrated VAR Model

10 CHAPTER 1. INTRODUCTION

to-specific’ approach described in Hendry and Mizon (1993), subsequentlyevaluated in Hoover and Perez (1999) and recently implemented as an auto-matic selection procedure in PcGets (Hendry and Krolzig, 2003). The “final”empirical model should, in the ideal case, be structural both in the economicand statistical sense of the word.The VAR procedure is less pretentious about the prior role of a theoretical

economic model, but it avoids the lack of empirical relevance the theory-basedapproch has often been criticized for. Since the starting point is the time-series structure of the chosen data, it is often advantageous to search forstructures against the background of not just one but a variety of possiblyrelevant theories. In this sense the suggested approach is a combination ofinductive and deductive inference.

1.6 On the choice of empirical example

Throughout the book we will illustrate the methodological arguments withempirical analyses of a Danish money demand data. These data have beenextensively analyzed by myself and Søren Johansen both as an inspiration fordeveloping new test procedures and as a means to understand the potentialuse of the cointegrated VAR model. See for example Johansen and Juselius(1990) and Juselius (1993a).Given the above discussion one should ask whether there is a close enough

correspondence between observed variables and the true versus theoreticalvariables for example as given by the Baumol model? The answer is clearlyno. A detailed examination of the measurements reveal that essentially all theusual problems plague the data. For instance, new components have enteredthe aggregate, banking technology has changed, the time interval betweenthe measurements is too broad, and so on. From these observed variables,it would be very hard to justify inference from the empirical analysis to aspecific transactions demand-for-money theory.The main reason why these variables were selected is simply because they

were being used by the Central Bank of Denmark as a basis for their policyanalysis. In that sense, the data have been collected because of an interest inthe variables for their own sake. This choice does not exclude the possibilitythat these variables are closely related to the theoretical variables, nor thatthe empirical investigation might suggest other more relevant measurementsfor policy analysis. For example, LaCour (1993) demonstrated that long-run

Page 12: Juselius_Johansen_The Cointegrated VAR Model

1.6. ON THE CHOICE OF EMPIRICAL EXAMPLE 11

price homogeneity was more strongly accepted when a weighted average ofcomponents of different liquidity was used as a proxy for the transactionsdemand for money.Looking in the rear mirror, after having applied the cointegrated VAR

model to many other data sets, it is obvious that we were very fortunateto begin our first cointegration attempts using this data set. It turned outto be one of the rare data sets describing long-run relationships which haveremained remarkable stable over the last few decades. However, it should bepointed out that this was true only after having econometrically accountedfor a regime shift in 1983 as a consequence of deregulating capital movementsin Denmark. In this sense the Danish data were also able to illustrate a mostimportant finding: the sensitivity of the VAR approach to empirical shifts ingrowth rates and in equilibrium means.Many of the theoretical advances were directly influenced by empirical

analyses of this data set. In the first applied paper (Johansen and Juselius,1990), a vector of real money, real income, and two interest rates was an-alyzed and only one cointegration relation was found. However, this wasbased on an implicit assumption of long-run price homogeneity. A need totest this properly resulted in the development of the I(2) analysis (Johansen,1992, 1995, 1997). The latter approach demonstrated that the original spec-ification in real variables were misspecified without including the inflationrate. Juselius (1993) tested the long-run price homogeneity assumption inthe I(2) model and found that it was accepted. Re-estimating the modelwith inflation rate correctly included as a new variable showed that inflationrate was a crucial determinant in the system, which strongly affected theinterpretation of the steady-state relations and the dynamic feed-back effectswithin the system. Including inflation rate in the vector produced one morecointegration relation (a relation between the two interest rates and inflation)in addition to the previously found money demand relation.When the number of potentially interesting variables is large, so is the

number of cointegrating relations. In this case, it is often a tremendouslydifficult task to identify them. The number of possible combinations is simplytoo large. If, instead, more information is gradually added to the analysis, itis possible to build on previous results and, thus, not to loose track so easily.The idea of gradually increasing the data vector of the VAR model, build-

ing on previously found results, was also influenced by empirical analyses ofthe Danish data. Juselius (1992a) added the loan rate to the previously usedinformation set and found one additional cointegration relation between the

Page 13: Juselius_Johansen_The Cointegrated VAR Model

12 CHAPTER 1. INTRODUCTION

loan rate and the deposit rate. This approach was then further developedin Juselius (1992b), where three different sectors of the macroeconomy wereanalyzed separately and then combined into one model.Thus, contrary to the “general to specific” approach of the statistical

modelling process (i.e. of imposing more and more restrictions on the un-restricted VAR), it appeared more advantageous to follow the principle of“specific to general ” in the choice of information set.The representation and statistical analysis of the I(2) model meant a

major step forward in the empirical understanding of the mechanisms deter-mining nominal growth rates in Denmark. Juselius (1993) analyzed commontrends in both the I(2) and I(1) models, and found that the stochastic trendcomponent in nominal interest rates seemed to have generated price infla-tion. This was clearly against conventional wisdom that predicted the linkto be the other way around. Needless to say, such a result could not convinceeconomists as long as it stood alone. Therefore, a similar design was used ina number of other studies (ref.) based on data from various European coun-tries (with essentially the same conclusions!) The experience of looking atvarious economies characterized by different policies through the same ‘spec-tacles’ generated the idea that the VAR approach could possibly be used asa proxy for a ‘designed experiment’ in a situation where only data by passiveobservations were available.Today’s academics living under an increasingly strong pressure ”to pub-

lish or perish” can seldom afford to spend time on writing computer software.It is no coincidence that the most influential works in econometrics in thelast decades were those which combined theoretical results with the develop-ment of the necessary computer software. To meet the demand for empiricalapplicability, all test and estimation procedures discussed in this book arereadily implemented as user-friendly menu-driven programs in, for example,CATS for RATS (Hansen and Juselius, 1995) or in PcGive (Doornik andHendry, 2002).An urge to understand more fully why this approach frequently produced

results that seemed to add a question mark to conventional theories andbeliefs was the motivation for writing the next chapter. It focuses on whatcould be called the economist’s approach as opposed to the statistician’sapproach to macroeconomic modelling, distinguishing between models andrelations in economics and econometrics. The aim is to propose a frameworkfor discussing the probability approach to econometrics as contrasted to moretraditional methods. It draws heavily on Juselius (1999).

Page 14: Juselius_Johansen_The Cointegrated VAR Model

Chapter 2

Models and Relations inEconomics and Econometrics

In Chapter 1 we discussed Haavelmo’s probability approach to empiricalmacroeconomics and the need to formulate a statistical model based on astochastic formulation of the process that has generated the chosen data.Because most macroeconomic data exhibit strong time dependence, it is nat-ural to formulate the empirical model in terms of time dependent stochasticprocesses.The organization of this chapter is as follows: Section 2.1 discusses in

general terms the VAR approach as contrasted to a theory-based model ap-proach. Section 2.2 briefly considers the treatment of inflation and monetarypolicy in Romer (1996) with special reference to the equilibrium in the moneymarket. Section 2.3 discusses informally some empirical and theoretical im-plications of unit roots in the data. Section 2.4 addresses more formally astochastic formulation based on a decomposition of the data into trends, cy-cles, and irregular components. Section 2.5 gives an empirical motivation fortreating the stochastic trend in nominal prices as I(2), and Section 2.6 asI(1).

2.1 The VAR approach and theory based mod-

els

The vector autoregressive (V AR) process based on Gaussian errors has fre-quently been a popular choice as a description of macroeconomic time series

13

Page 15: Juselius_Johansen_The Cointegrated VAR Model

14 CHAPTER 2. MODELS AND RELATIONS

data. There are many reasons for this: the VAR model is flexible, easy to es-timate, and it usually gives a good fit to macroeconomic data. However, thepossibility of combining long-run and short-run information in the data byexploiting the cointegration property is probably the most important reasonwhy the V ARmodel continues to receive the interest of both econometriciansand applied economists.

Theory-based economic models have traditionally been developed as non-stochastic mathematical entities and applied to empirical data by adding astochastic error process to the mathematical model1. As an example of thisapproach we will use the macroeconomic treatment in ”Inflation and Mone-tary Policy”, Chapter 9 in D. Romer (1996): ”Advanced Macroeconomics”.

From an econometric point of view the two approaches are fundamen-tally different: one starting from an explicit stochastic formulation of alldata and then reducing the general statistical (dynamic) model by imposingtestable restrictions on the parameters, the other starting from a mathemati-cal (static) formulation of a theoretical model and then expanding the modelby adding stochastic components. For a detailed methodological discussion ofthe two approaches, see for example Gilbert (1986), Hendry (1995), Juselius(1993), and Pagan (1987).

Unfortunately, the two approaches have shown to produce very differentresults even when applied to identical data and, hence, different conclusions.From a scientific point of view this is not satisfactory. Therefore, we willhere attempt to bridge the gap between the two views by starting fromsome typical questions of theoretical interest and then show how one wouldanswer these questions based on a statistical analysis of the V AR model.Because the latter by construction is ”bigger” than the theory model, theempirical analysis not only answers a specific theoretical question, but alsogives additional insight into the macroeconomic problem.

A theory model can be simplified by the ceteris paribus assumptions”everything else unchanged”, whereas a statistically well-specified empiricalmodel has to address the theoretical problem in the context of ”everythingelse changing”. By imbedding the theory model in a broader empirical frame-work, the analysis of the statistically based model can provide evidence ofpossible pitfalls in macroeconomic reasoning. In this sense the V AR analysiscan be useful for generating new hypotheses, or for suggesting modificationsof too narrowly specified theoretical models. As a convincing illustration see

1Dynamic general equilibrium models

Page 16: Juselius_Johansen_The Cointegrated VAR Model

2.2. INFLATION AND MONEY GROWTH 15

Hoffman (2001?).Throughout the book we will illustrate a variety of econometric problems

by addressing questions of empirical relevance based on an analysis of mon-etary inflation and the transmission mechanisms of monetary policy. Thesequestions have been motivated by many empirical V AR analyses of money,prices, income, and interest rates and include questions such as:• How effective is monetary policy when based on changes in money stock

or changes in interest rates?• What is the effect of expanding money supply on prices in the short

run? in the medium run? in the long run?• Is an empirically stable demand for money relation a prerequisite for

monetary policy control to be effective?• How strong is the direct (indirect) relationship between a monetary

policy instrument and price inflation?Based on the V AR formulation we will demonstrate that every empirical

statement can, and should, be checked for its consistency with all previousempirical and theoretical statements. This is in contrast to many empiricalinvestigations, where inference relies on many untested assumptions usingtest procedures that only make sense in isolation, but not in the full contextof the empirical model.

2.2 Inflation and money growth

A fundamental proposition in macroeconomic theory is that growth in moneysupply in excess of real productive growth is the cause of inflation, at least inthe long run. Here we will briefly consider some conventional ideas underlyingthis belief as described in Chapter 9 by Romer (1996).The well-known diagram illustrating the intersection of aggregate demand

and aggregate supply provides the framework for identifying potential sourcesof inflation as shocks shifting either aggregate demand upwards or aggregatesupply to the left. See the upper panel of Figure 2.1.As examples of aggregate supply shocks that shift the AS curve to the

left Romer (1996) mentions; negative technology shocks, downward shifts inlabor supply, upwardly skewed relative-cost shocks. As examples of aggregatedemand shocks that shift the AD curve to the right he mentions; increasesin money stock, downward shifts in money demand, increases in governmentpurchases. Since all these types of shocks, and many others, occur quite

Page 17: Juselius_Johansen_The Cointegrated VAR Model

16 CHAPTER 2. MODELS AND RELATIONS

frequently there are many factors that potentially can affect inflation. Someof these shocks may only influence inflation temporarily and are, therefore,less important than shocks with a permanent effect on inflation. Among thelatter economists usually emphasize changes in money supply as the crucialinflationary source. The economic intuition behind this is that other factorsare limited in scope, whereas money in principle is unlimited in supply.More formally the reasoning is based on money demand and supply and

the condition for equilibrium in the money market:

M/P = L(R,Y r), LR < 0, Ly > 0. (2.1)

where M is the money stock, P is the price level, R the nominal interestrate, Y r real income, and L(·) the demand for real money balances. Basedon the equilibrium condition, i.e. no changes in any of the variables, Romer(1996) concludes that the price level is determined by:

P =M/L(R,Y r) (2.2)

The equilibrium condition (2.1) and, hence (2.2), is a static concept thatcan be thought of as a hypothetical relation between money and prices forfixed income and interest rate. The underlying comparative static analysisinvestigates the effect on one variable, say price, when changing anothervariable, say money supply, with the purpose of deriving the new equilibriumposition after the change. Thus, the focus is on the hypothetical effect of achange in one variable (M) on another variable (P ), when the additionalvariables (R and Y r) are exogenously given and everything else is takenaccount of by the ceteris paribus assumption.However, when time is introduced the ceteris paribus assumption and the

assumption of fixed exogenous variables become much more questionable.Neither interest rates nor real income have been fixed or controlled in mostperiods subject to empirical analysis. Therefore, in empirical macroeconomicanalysis all variables (inclusive the ceteris paribus ones) are more or less con-tinuously subject to shocks, some of which permanently change the previousequilibrium condition. In this sense an equilibrium position is always relatedto a specific time point in empirical modelling. Hence, the static equilibriumconcept has to be replaced by a dynamic concept, for instance a steady-state

Page 18: Juselius_Johansen_The Cointegrated VAR Model

2.2. INFLATION AND MONEY GROWTH 17

position. For an equilibrium relation time is irrelevant whereas a steady-staterelation without a time index is meaningless.In a typical macroeconomic system new disturbances push the variables

away from steady-state, but the economic adjustment forces pull them backtowards a new steady-state position. To illustrate the ideas one can use ananalogy from physics and think of the economy as a system of balls connectedby springs. When left alone the system will be in equilibrium, but pushinga ball will bring the system away from equilibrium. Because all balls areconnected, the ‘shock’ will influence the whole system, but after a while theeffect will die out and the system is back in equilibrium. In the economy,the balls would correspond to the economic variables and the springs to thetransmission mechanisms that describe how economic shocks are transmittedthrough the system. As we know, the economy is not a static entity. Insteadof saying that the economy is in equilibrium it is more appropriate to usethe word ‘steady-state’. Hence, we need to replace the above picture witha system where the balls are moving with some ‘controlled’ speed and bypushing a ball the speed will change and influence all the other balls. Leftalone the system will return to the ‘controlled’ state, i.e. the steady state.However, in real applications the adjustment back to steady-state is dis-

turbed by new shocks and the system essentially never comes to rest. There-fore, we will not be able to observe a steady-state position and the empiricalinvestigation has to account for the stochastic properties of the variables aswell as the theoretical equilibrium relationship between them. See the lowerpanel of Figure 1 for an illustration of a stochastic steady-state relation.In (2.1) the money market equilibrium is an exact mathematical expres-

sion and it is straightforward to invert it to determine prices as is done in(2.2). The observations from a typical macroeconomic system are adequatelydescribed by a stochastic vector time series process. But in stochastic sys-tems, inversion of (2.1) is no longer guaranteed (see for instance Hendry andEricsson, 1991). If the inverted (2.1) is estimated as a regression model it islikely to result in misleading conclusions.The observed money stock can be demand or supply determined or both,

but it is not necessarily a measurement of a long-run steady-state position.This raises the question whether it is possible to empirically identify andestimate the underlying theoretical relations. For instance, if central banksare able to effectively control money stock, then observed money holdingsare likely to be supply determined and the demand for money has to adjustto the supplied quantities. This is likely to be the case in trade and capital

Page 19: Juselius_Johansen_The Cointegrated VAR Model

18 CHAPTER 2. MODELS AND RELATIONS

1975 1980 1985 1990 1995

-.1

0

.1 Deviations from money steady-state:m-p-y-14(Rm-Rb)

.5

1

1.5

AD ASAn equilibrium position

Figure 2.1: An equilibrium position of the AD and AS curve (upper panel)and deviations from an estimated money demand relation for Denmark: (m−p− y)t − 14.1(Rm −Rb) (lower panel).

regulated economies or, possibly, in economies with flexible exchange rates,whereas in open deregulated economies with fixed exchange rates centralbanks would not in general be able to control money stock. In the latter caseone would expect observed money stock to be demand determined.Under the assumption that the money demand relation can be empirically

identified, the statistical estimation problem has to be addressed. Becausemacroeconomic variables are generally found to be nonstationary, standardregression methods are no longer feasible from an econometric point of view.Cointegration analysis specifically addresses the nonstationarity problem andis, therefore, a feasible solution in this respect. The empirical counterpartof (2.1) (with the opportunity cost of holding money, R = Rb −Rm) can bewritten as a cointegrating relation, i.e.:

ln(M/PY r)t − L(Rb −Rm)t = vt (2.3)

where vt is a stationary process measuring the deviation from the steady-state

Page 20: Juselius_Johansen_The Cointegrated VAR Model

2.3. THE TIME DEPENDENCE OF MACROECONOMIC DATA 19

position at time t. The stationarity of vt implies that whenever the systemhas been shocked it will adjust back to equilibrium. This is illustrated inFigure 2.1 (lower panel) by the graph of the deviations from an estimatedmoney demand relation based on Danish data with the opportunity cost ofholding money being measured by Rb−Rm (Juselius, 1998b). Note the largeequilibrium error at about 1983, as a result of removing restrictions on capitalmovements and the consequent adjustment back to steady-state.

However, empirical investigation of (2.3) based on cointegration analysisposes several additional problems. Although in a theoretical exercise it isstraightforward to keep some of the variables fixed (the exogenous variables),in an empirical model none of the variables in (2.1), i.e. money, prices, incomeor interest rates, can be assumed to be fixed (i.e. controlled). The stochasticfeature of all variables implies that the equilibrium adjustment can take placein either money, prices, income or interest rates. Therefore, the equilibriumdeviation vt is not necessarily due to a money supply shock at time t, but canoriginate from any change in the variables. Hence, one should be cautiousto interpret a coefficient in a cointegrating relation as in the conventionalregression context, which is based on the assumption of ”fixed” regressors.In multivariate cointegration analysis all variables are stochastic and a shockto one variable is transmitted to all other variables via the dynamics of thesystem until the system has found its new equilibrium position.

The empirical investigation of the above questions raises several econo-metric questions: What is the meaning of a shock and how do we measure iteconometrically? How do we distinguish empirically between the long run,the medium run and the short run? Given the measurements can the param-eter estimates be given an economically meaningful interpretation? Thesequestions will be discussed in more detail in the subsequent sections.

2.3 The time dependence of macroeconomic

data

As advocated above, the strong time dependence of macroeconomic data sug-gests a statistical formulation based on stochastic processes. In this contextit is useful to distinguish between:

• stationary variables with a short time dependence and• nonstationary variables with a long time dependence.

Page 21: Juselius_Johansen_The Cointegrated VAR Model

20 CHAPTER 2. MODELS AND RELATIONS

In practice, it is useful to classify variables exhibiting a high degree oftime persistence (insignificant mean reversion) as nonstationary and variablesexhibiting a significant tendency to mean reversion as stationary. However, itis important to stress that the stationarity/nonstationarity or, alternatively,the order of integration of a variable, is not in general a property of aneconomic variable but a convenient statistical approximation to distinguishbetween the short-run, medium-run, and long-run variation in the data. Wewill illustrate this with a few examples involving money, prices, income, andinterest rates.Most countries have exhibited periods of high and low inflation, lasting

sometimes a decade or even more, after which the inflation rate has returnedto its mean level. If inflation crosses its mean level, say ten times, the econo-metric analysis will find significant mean reversion and, hence, conclude thatinflation rate is stationary. For this to happen we might need up to hundredyears of observations. The time path of, for example, quarterly Europeaninflation over the last few decades will cover a high inflation period in theseventies and beginning of the eighties and a low inflation period from mid-eighties until the present date. Crossing the mean level a few times is notenough to obtain statistically significant mean reversion and the economet-ric analysis will show that inflation should be treated as a nonstationaryvariable. This is illustrated in Figure 2.2 where yearly observations of theDanish inflation rate has been graphed for 1901-1992 (upper panel), for 1945-1992 (middle panel), and 1975-1992 (lower panel). The first two time seriesof inflation rates look mean-reverting (though not to zero mean inflation),whereas significant mean-reversion would not be found for the last section ofthe series.That inflation is considered stationary in one study and nonstationary in

another, where the latter is based, say, on a sub-sample of the former mightseem contradictory. This need not be so, unless a unit root process is givena structural economic interpretation. There are many arguments in favor ofconsidering a unit root (a stochastic trend) as a convenient econometric ap-proximation rather than as a deep structural parameter. For instance, if thetime perspective of our study is the macroeconomic behavior in the mediumrun, then most macroeconomic variables exhibit considerable inertia, consis-tent with nonstationary rather than stationary behavior. Because inflation,for example, would not appear to be statistically different from a nonsta-tionary variable, treating it as a stationary variable is likely to invalidatethe statistical analysis and, therefore, lead to wrong economic conclusions.

Page 22: Juselius_Johansen_The Cointegrated VAR Model

2.3. THE TIME DEPENDENCE OF MACROECONOMIC DATA 21

0 10 20 30 40 50 60 70 80 90

0

.2 Century long inflation

45 50 55 60 65 70 75 80 85 90

0

.05

.1

.15 Inflation after 2nd world war

75 80 85 90

.05

.1Inflation after 1975

Figure 2.2: Yearly Danish inflation 1901-92 (upper panel), 1945-92 (middlepanel), and 75-92 (lower panel).

On the other hand, treating inflation as a nonstationary variable gives usthe opportunity to find out which other variable(s) have exhibited a similarstochastic trend by exploiting the cointegration property. This will be dis-cussed at some length in Section 2.5, where we will demonstrate that the unitroot property of economic variables is very useful for the empirical analysisof long- and medium-run macroeconomic behavior.

When the time perspective of our study is the long historical macroe-conomic movements, inflation as well as interest rates are likely to showsignificant mean reversion and, hence, can be treated as a stationary vari-able.

Finally, to illustrate that the same type of stochastic processes are ableto adequately describe the data, independently of whether one takes a close-up or a long-distance look, we have graphed the Danish bond rate in levelsand differences in Figure 2.3 based on a sample of 95 quarterly observations(1972:1-1995:3), 95 monthly observations (1987:11-1995:9), and 95 daily ob-servations (1.5.95-25.9.95). The daily sample corresponds to the little hump

Page 23: Juselius_Johansen_The Cointegrated VAR Model

22 CHAPTER 2. MODELS AND RELATIONS

0 20 40 60 80 100

8

8.5daily (level)

0 20 40 60 80 100

0

.2daily (difference)

0 20 40 60 80 100

7.5

10

monthly (level)

0 20 40 60 80 100-1

0

1 monthly (difference)

0 20 40 60 80 100

10

15

20quarterly (level)

0 20 40 60 80 100

-2

0

2quarterly (difference)

Figure 2.3: Average Danish bond rates, based on daily observations, 1.5.-25.9.95 (upper panel), monthly observations, Nov 1987 - Sept 1995 (middlepanel), and quarterly observations, 1972:1-1995:3, (lower panel).

at the end of the quarterly time series. It would be considered a small sta-tionary blip from a quarterly perspective, whereas from a daily perspectiveit is nonstationary, showing no significant mean reversion. Altogether, thethree time series look very similar from a stochastic point of view.

Thus, econometrically it is convenient to let the definition of long-run orshort-run, or alternatively the very long-run, the medium long-run, and theshort-run, depend on the time perspective of the study. From an economicpoint of view the question remains in what sense a ”unit root” process canbe given a ”structural” interpretation.

2.4 A stochastic formulation

To illustrate the above questions we will consider a conventional decom-position into trend, T , cycle, C, and irregular component, I, of a typical

Page 24: Juselius_Johansen_The Cointegrated VAR Model

2.4. A STOCHASTIC FORMULATION 23

macroeconomic variable.

X = T × C × I

Instead of treating the trend component as deterministic, as is usuallydone in conventional analysis, we will allow the trend to be both determinis-tic, Td, and stochastic, Ts, i.e. T = Ts×Td, and the cyclical component to beof long duration, say 6-10 years, Cl, and of shorter duration, say 3-5 years,Cs, i.e. C = Cl × Cs. The reason for distinguishing between short and longcycles is that a long/short cycle can either be treated as nonstationary orstationary depending on the time perspective of the study. As an illustrationof long cycles that have been found nonstationary by the statistical analysis(Juselius, 1998b) see the graph of trend-adjusted real income in Figure 4,middle panel.An additive formulation is obtained by taking logarithms:

x = (ts + td) + (cl + cs) + i (2.4)

where lower case letters indicate a logarithmic transformation. In the sub-sequent chapters the stochastic time dependence of the variables will be ofprimary interest, but the linear time trend will also be important as a mea-sure of average linear growth rates usually present in economic data.To give the economic intuition for the subsequent multivariate cointe-

gration analysis of money demand / money supply relations, we will illus-trate the ideas in Section 4.1 and 4.2 using the time series vector xt =[m, p, yr, Rm, Rb]t, t = 1, ..., T, where m is a measure of money stock, p theprice level, yr real income, Rm the own interest on money stock, and Rbthe interest rate on bonds. All variables are treated as stochastic and, hence,from a statistical point of view need to be modelled, independently of whetherthey are considered endogenous or exogenous in the economic model.To illustrate the ideas we will assume two autonomous shocks u1 and u2,

where for simplicity u1 is a nominal shock causing a permanent shift in theAD curve and u2 is a real shock causing a permanent shift in the AS curve.This would be consistent with the aggregate supply (AS ) aggregate demand(AD) curve.To clarify the connection between the econometric analysis and the eco-

nomic interpretation we will first assume that the empirical analysis is based

Page 25: Juselius_Johansen_The Cointegrated VAR Model

24 CHAPTER 2. MODELS AND RELATIONS

on a quarterly model of, say, a few decades and then on a yearly model of,say, a hundred years. In the first case, when the perspective of the studyis the medium run, we will argue that prices should generally be treated asI(2), whereas in the latter case, when the perspective is the very long run,prices can sometimes be approximated as a strongly correlated I(1) process.Figure 4 illustrates different stochastic trends in the Danish quarterly

data set. The stochastic I(2) trend in the upper panel corresponds to trend-adjusted prices and the stochastic I(1) trend in the middle panel correspondsto trend-adjusted real income. The lower panel is a graph of ∆pt and de-scribes the stochastic I(1) trend in the inflation rate, which is equivalent tothe differenced I(2) trend. Note, however, that inflation has been positiveover the full sample, meaning that the price level will contain a linear de-terministic trend. Thus, a nonzero sample average of inflation, ∆p 6= 0, isconsistent with a linear trend in the price levels, pt.

1975 1980 1985 1990 1995

-.1

0

.1Stochastic I(2) trend in prices

1975 1980 1985 1990 1995

-.05

0

.05

.1 Stochastic I(1) trend in real income

1975 1980 1985 1990 1995

.05

.1

.15 Stochastic I(1) trend in inflation

Figure 2.4: Stochastic trends in Danish prices, real income and inflation,based on quarterly data 1975:1-1994:4

The concept of a common stochastic trend or a driving force requires afurther distinction between:

Page 26: Juselius_Johansen_The Cointegrated VAR Model

2.4. A STOCHASTIC FORMULATION 25

• an unanticipated shock with a permanent effect (a disturbance to thesystem with a long lasting effect)• an unanticipated shock with a transitory effect (a disturbance to the

system with a short duration).To give the non-expert reader a more intuitive understanding for the

meaning of a stochastic trend of first or second order, we will assume thatinflation rate, πt, follows a random walk:

πt = πt−1 + εt,= εt/(1− L) + π0, t =1,...T.

(2.5)

where εt = εp,t + εs,t, consists of a permanent shock, εp,t, and a transitoryshock, εs,t.

2 By integrating the shocks, starting from an initial value of infla-tion rate, π0, we get:

πt = εp,t + εp,t−1 + εp,t−2 + ...+ εp,1 + εs,t + εs,t−1 + εs,t−2 + ...+ εs,1 + π0=

Pti=1 εp,i +

Pti=1 εs,i + π0 ≈

Pti=1 εp,i + εs,t + π0

A permanent shock is by definition a shock that has a lasting effect on thelevel of inflation, such as a permanent increase in government expenditure,whereas the effect of a transitory shock disappears either during the nextperiod or over the next few ones. For simplicity we assume that only theformer case is relevant here. An example of a transitory price shock is avalue added tax imposed in one period and removed the next. In the lattercase prices increase temporarily, but return to their previous level after theremoval. Therefore, a transitory shock can be described as a shock thatoccurs a second time in the series but then with opposite sign. Hence, atransitory shock disappears in cumulation, whereas a permanent shock hasa long lasting effect on the level. In the summation of the shocks εi:

πt =tPi=1

εi + π0 (2.6)

only the permanent shocks will remain and we callPt

i=1 εi a stochastic trend.The difference between a linear stochastic and deterministic trend is that

2Note that in this case an ARIMA(0,1,1) model would give a more appropriate speci-fication provided εp and εs are white noise processes.

Page 27: Juselius_Johansen_The Cointegrated VAR Model

26 CHAPTER 2. MODELS AND RELATIONS

the increments of a stochastic trend change randomly, whereas those of adeterministic trend are constant over time. The former is illustrated in themiddle and lower panels of Figure 2.4.A representation of prices instead of inflation is obtained by integrating

(2.6) once, i.e.

pt =tPs=1

πs + p0 =tPs=1

sPi=1

εi + π0t+ p0. (2.7)

It appears that inflation being I(1) with a nonzero mean, corresponds toprices being I(2) with linear trends. The stochastic I(2) trend is illustratedin the upper part of Figure 2.4.The question whether inflation rates should be treated as I(1) or I(0) has

been subject to much debate. Figure 2.2 illustrated that the Danish inflationrate measured over the last decades was probably best approximated by anonstationary process, whereas measured over a century by a stationary,though strongly autocorrelated, process. For a description of the latter case,(2.5) should be replaced by:

πt = ρπt−1 + εt,= εt/(1− ρL) + π0, t =1,...T.

(2.8)

which becomes:

πt =tPi=1

ρt−iεi + π0 = εt + ρεt−1 + ...+ ρt−1ε1 + π0 (2.9)

where the autoregressive parameter ρ is less than but close to one. In thiscase prices would be represented by:

pt =tPs=1

πs + p0 =tPs=1

sPi=1

ρs−iεi + π0t+ p0 (2.10)

= (1− ρ)−1tPi=1

εi − ρ(1− ρ)−1tPi=1

ρt−iεi + π0t+ p0

i.e. by a strongly autoregressive first order stochastic trend and a determin-istic linear trend.

Page 28: Juselius_Johansen_The Cointegrated VAR Model

2.4. A STOCHASTIC FORMULATION 27

The difference between (2.6) and (2.9) is only a matter of approximation.In the first case the parameter ρ is approximated with unity, because thesample period is too short for the estimate to be statistically different fromone. In the second case the sample period contains enough turning points forρ to be significantly different from one. Econometrically it is more optimalto treat a long business cycle component spanning over, say, 10 years as anI(1) process when the sample period is less than, say, 20 years. In this sensethe difference between the long cyclical component cl and ts in (2.4) is thatts is a true unit root process (ρ = 1), whereas cl is a near unit root process(ρ ≤ 1) that needs a very long sample to distinguish it from a true unit rootprocess.we will argue below that, unless a unit root is given a structural inter-

pretation, the choice of one representation or the other is as such not veryimportant, as long as there is consistency between the economic analysis andthe choice. However, from an econometric point of view the choice betweenthe two representations is usually crucial for the whole empirical analysis andshould, therefore, be based on all available information.Nevertheless, the distinction between a permanent (long-lasting) and a

transitory shock is fundamental for the empirical interpretation of integra-tion and cointegration results. The statement that inflation, ∆pt, is I(1) isconsistent with inflationary shocks being strongly persistent. Statisticallythis is expressed in (2.6) as inflation having a first order stochastic trenddefined as the cumulative sum of all previous shock from the starting date.Whether inflation should be considered I(1) or I(0) has been much de-

bated, often based on a structural (economic) interpretation of a unit root.We argue here that the order of integration should be based on statistical,rather than economic arguments. If ρ is not significantly different from one(for example because the sample period is short), but we, nevertheless, treatinflation as I(0) then the statistical inference will sooner or later producelogically inconsistent results.However, the fact that a small value of ρ, say 0.80, is often not signif-

icantly different from one in a small sample, whereas a much higher valueof ρ, say 0.98, can differ significantly from one in a long sample, is likelyto give semantic problems when using ‘long-run’ and ‘short-run’ to describeintegration and cointegration properties. For example, a price variable couldeasily be considered I(2) based on a short sample, whereas I(1) based on alonger period.Because of the sample split, inference on the cointegration and integra-

Page 29: Juselius_Johansen_The Cointegrated VAR Model

28 CHAPTER 2. MODELS AND RELATIONS

tion properties will be based on relatively short samples leading to the aboveinterpretational problems. When interpreting the subsequent results we willuse the concept of ‘long-run relation’ to mean a cointegrating relation be-tween I(1) or I(2) variables, as defined above. We will talk about ‘short-runadjustment’ when a stationary variable is significantly related to a cointegra-tion relation or to another stationary variable. A necessary condition for a‘long-run relation’ to be empirically relevant in the model is ‘short-run ad-justment’ in at least one of the system equations3. Based on this definitionnon-cointegrating relations incorrectly included in the model will eventuallydrop out as future observations become available: a stationary variable can-not significantly adjust to a nonstationary variable.

Because a cointegrating relation does not necessarily correspond to aninterpretable economic relation, we make a further distinction between thestatistical concept of a ‘long-run relation’ and the economic concept of a‘steady-state relation’.

2.5 Scenario Analyses: treating prices as I(2)

In this section we will assume that the long-run stochastic trend ts in (2.4)can be described by the twice cumulated nominal (AD) shocks,

PPu1i,

as in (2.7) and the long cyclical component cl by the once cumulated nom-inal shocks,

Pu1i, and the once cumulated real (AS) shocks,

Pu2i. This

representation gives us the possibility of distinguishing between the long-runstochastic trend component in prices,

PPu1i, the medium-run stochastic

trend in price inflation,Pu1i, and the medium-run stochastic trend in real

activity,Pu2i.

As an illustration of how the econometric analysis is influenced by theabove assumptions we will consider the following decomposition of the datavector:

3When data are I(2) we have integration and cointegration on different levels and theconcepts need to be modified accordingly. The need to distinguish between these conceptsis moderate here and we refer to Juselius (1999b) for a more formal treatment.

Page 30: Juselius_Johansen_The Cointegrated VAR Model

2.5. THE I(2) SCENARIO 29

mt

ptyrt

Rm,tRb,t

=c11c21000

[PPu1i] +

d11 d12d21 d22d31 d32d41 d42d51 d52

· P

u1iPu2i

¸+

g1g2g300

[t] + stat.comp.(2.11)

The deterministic trend component, td = t, accounts for linear growthin nominal money and prices as well as real income. Generally, when g1 6=0, g2 6= 0, g3 6= 0, then the level of real income growth rate and nominalgrowth rates is nonzero consistent with stylized facts in most industrializedcountries. If g3 = 0 and d31 = 0 in (2.11), then

Pu2i is likely to describe

the long-run real growth in the economy, i.e. a ”structural” unit root processas discussed in the many papers on the stochastic versus deterministic realgrowth models. See for instance, King, Plosser, Stock and Watson (1991). Ifg3 6= 0, then the linear time trend is likely to capture the long-run trend andPu2i will describe the medium-run deviations from this trend, i.e. the long

business cycles. The trend-adjusted real income variable in the middle panelof Figure 4 illustrates such long business cycles. For a further discussion, seeRubin (1998). The first case explicitly assumes that the average real growthrate is zero whereas the latter case does not. Whether one includes a lineartrend or not in (2.11) influences, therefore, the possibility of interpreting thesecond stochastic trend,

Pu2i, as a long-run structural trend or not.

Conditions for long-run price homogeneity:

Let us now take a closer look at the trend components of mt and pt in(2.11):

mt = c11PP

u1i + d11Pu1i + d12

Pu2i + g1t+ stat. comp.

pt = c21PP

u1i + d21Pu1i + d22

Pu2i + g2t+ stat. comp.

If (c11, c21) 6= 0, then mt, pt ∼ I(2). If, in addition, c11 = c21 then

mt − pt = (d11 − d21)Pu1i + (d12 − d22)

Pu2i + (g1 − g2)t+ stat.comp.

Page 31: Juselius_Johansen_The Cointegrated VAR Model

30 CHAPTER 2. MODELS AND RELATIONS

is at most I(1). If (d11 6= d21), (d12 6= d22), then mt and pt are cointegratingfrom I(2) to I(1), i.e. they are CI(2, 1). If, in addition, (g1 6= g2) then realmoney stock grows around a linear trend.The case (mt − pt) ∼ I(1) implies long-run price homogeneity and is a

testable hypothesis. Money stock and prices are moving together in the long-run, but not necessarily in the medium-run (over the business cycle). Long-run and medium-run price homogeneity requires c11 = c21, and d11 = d21,i.e. the nominal (AD) shocks u1t affect nominal money and prices in thesame way both in the long run and in the medium run. Because the realstochastic trend

Pu2i is likely to enter mt but not necessarily pt, testing

long- and medium-run price homogeneity jointly is not equivalent to testing(mt− pt) ∼ I(0). Therefore, the joint hypothesis is not as straightforward totest as long-run price homogeneity alone.Note that (mt − pt) ∼ I(1) implies (∆mt − ∆pt) ∼ I(0), i.e. long-run

price homogeneity implies cointegration between price inflation and moneygrowth. If this is the case, then the stochastic trend in inflation can equallywell be measured by the stochastic trend in the growth in money stock.Assuming long-run price homogeneity:In the following we will assume long-run price homogeneity, i.e. c11 =

c21, and discuss various cases where medium-run price homogeneity is eitherpresent or absent. In this case it is convenient to transform the nominal datavector into real money (m− p) and inflation rate ∆p (or equivalently ∆m)4:

mt − pt

∆ptyrt

Rm,tRb,t

=d11 − d21 d12 − d22

c21 0d31 d32d41 d42d51 d52

· P

u1iPu2i

¸+

g1 − g20g300

[t] + ...(2.12)

In (2.12) all variables are at most I(1). The inflation rate (measured by∆pt or ∆mt) is only affected by the once cumulated AD trend,

Pu1i, but

all the other variables can in principle be affected by both stochastic trends,Pu1i and

Pu2i.

The case trend-adjusted (mt − pt) ∼ I(0) requires that both d11 = d21and d12 = d22, which is not very likely from an economic point of view. A

4Chapter 14 and 15 will discuss this nominal to real transformation in more detail.

Page 32: Juselius_Johansen_The Cointegrated VAR Model

2.5. THE I(2) SCENARIO 31

priori, one would expect the real stochastic trendPu2i to influence money

stock (by increasing the transactions, precautionary and speculative demandsfor money) but not the price level, i.e. that d12 6= 0 and d22 = 0.The case (mt − pt − yrt ) ∼ I(0), i.e. money velocity of circulation is a

stationary variable, requires that d11−d21−d31 = 0, d12−d22−d32 = 0 and g1−g2 − g3 = 0. If d11 = d21 (i.e. medium run price homogeneity), d22 = 0 (realstochastic growth does not affect prices), d31 = 0 (medium-run price growthdoes not affect real income), and d12 = d32, then mt− pt− yrt ∼ I(0). In thiscase real money stock and real aggregate income share one common trend,the real stochastic trend

Pu2i. The stationarity of money velocity, implying

common movements in money, prices, and income, is then not inconsistentwith the conventional monetarist assumption as stated by Friedman (1970)that ”inflation always and everywhere is a monetary problem”. This case,(mt − pt − yrt ) ∼ I(0), has generally found little empirical support (Juselius,1996, 1998b, Juselius, 2000, Juselius and Toro, 1999). As an illustration seethe graph of money velocity in the upper panel of Figure 5.We will now turn to the more realistic assumption of money velocity being

I(1).

The case (mt − pt − yrt ) ∼ I(1), implies that either (d11 − d21 − d31) 6=0 or (d12 − d22 − d32) 6= 0. It suggests that the two common stochastictrends affect the level of real money stock and real income differently. A fewexamples illustrate this:Example 1 : Inflation is cointegrating with velocity, i.e.:

mt − pt − yrt + b1∆pt ∼ I(0), (2.13)

or alternatively

(mt − pt − yrt ) + b2∆mt ∼ I(0).

Under the previous assumptions that d31, d22 = 0, and d12 = d32, the I(0)assumption of (2.9) implies that d11 − d21 = b1c21. If b1 > 0, then (2.13) canbe interpreted as a money demand relation, where the opportunity cost ofholding money relative to real stock as measured by ∆pt is a determinantof money velocity. On the other hand if b1 < 0 (or b2 > 0), then inflationadjusts to excess money, though if | b1 |> 1, with some time lag. In this caseit is not possible to interpret (2.13) as a money demand relation.

Page 33: Juselius_Johansen_The Cointegrated VAR Model

32 CHAPTER 2. MODELS AND RELATIONS

Example 2 : The interest rate spread and velocity are cointegrating, i.e.:

(mt − pt − yrt )− b3(Rm −Rb)t ∼ I(0). (2.14)

Because (Rm − Rb)t ∼ I(1), either (d41 − d51) 6= 0, or (d42 − d52) 6= 0, orboth. In either case the stochastic trend in the spread has to cointegrate withthe stochastic trend in velocity for (2.14) to hold. If b3 > 0, then (2.14)can be interpreted as a money demand relation in which the opportunitycost of holding money relative to bonds is a determinant of agent’s desiredmoney holdings. On the other hand, if b3 < 0 then the money demandinterpretation is no longer possible, and (2.14) could instead be a centralbank policy rule. Figure 5, the middle panel, shows the interest spreadbetween the Danish 10 year bond rate and the deposit rate, and the lowerpanel the linear combination (2.14) with b3 = 14. It is notable how well thenonstationary behavior of money velocity and the spread cancels in the linearmoney demand relation.From the perspective of monetary policy a nonstationary spread suggests

that the short-term central bank interest rate can be used as an instrumentto influence money demand. A stationary spread on the other hand signalsfast adjustment between the two interest rates, such that changing the shortinterest rate only changes the spread in the very short run and, hence, leavesmoney demand essentially unchanged.In a model explaining monetary transmission mechanisms, the determi-

nation of real interest rates is likely to play an important role. The Fisherparity predicts that real interest rates are constant, i.e.

Rt = Et(∆mpt+m)/m+R0 = (pt+m − pt)/m+R0 = 1/mmPi=1

∆pt+i +R0

(2.15)

where R0 is a constant real interest rate and Et(∆mpt+m)/m is the expectedvalue at time t of inflation at the period of maturity t+m.If (∆mpt+m−Et∆mpt+m) ∼ I(0), then the predictions do not deviate from

the actual realization with more than a stationary error. If, in addition,(∆pt − (∆mpt+m)/m) ∼ I(0), then Rt − ∆pt is stationary. From (2.15) itappears that if (Rm−∆p) ∼ I(0) and (Rb−∆p) ∼ I(0), then d42 = d52 = 0.Also, if d42 = d52 = 0, then Rm and Rb must be cointegrating, (Rm−b4Rb)t ∼

Page 34: Juselius_Johansen_The Cointegrated VAR Model

2.5. THE I(2) SCENARIO 33

1975 1980 1985 1990 1995

-1

-.8Money velocity

1975 1980 1985 1990 1995

.01

.02

The interest rate spread

1975 1980 1985 1990 1995

-.1

0

.1 Deviations from money steady-state

Figure 2.5: Money velocity (upper panel), the interest rate spread (middelpanel), and money demand (lower panel) for Danish data.

I(0) with b4 = 1 for d41 = d51. In this sense stationary real interest rates areboth econometrically and economically consistent with the spread and thevelocity being stationary. It corresponds to the situation where real incomeand real money stock share the common AS trend,

Pu2i, and inflation and

the two nominal interest rates share the AD trend,Pu1i. This case can be

formulated as a restricted version of (2.12):

mt − pt

∆ptyrt

Rm,tRb,t

=

0 d12c21 00 d12c21 0c21 0

· P

u1iPu2i

¸+ ... (2.16)

Though appealing from a theory point of view, (2.16) has not foundmuch empirical support. Instead, real interest rates, interest rate spreads,and money velocity have frequently been found to be nonstationary. This

Page 35: Juselius_Johansen_The Cointegrated VAR Model

34 CHAPTER 2. MODELS AND RELATIONS

suggests the presence of real and nominal interaction effects, at least overthe horizon of a long business cycle.By modifying some of the assumptions underlying the Fisher parity, the

nonstationarity of real interest rates and the interest rate spread can bejustified. For example, Goldberg and Frydman (2002) shows that imperfectknowledge expectations (instead of rational expectations) is likely to generatean I(1) trend in the interest rate spread. Also if agents systematically mis-predict the future inflation rate we would expect (∆mpt+m − Et∆mpt+m) ∼I(1) and, hence, Rt − ∆pt ∼ I(1). In this case one would also expectEt(∆bpt+b)/b − (∆mpt+m)/m ∼ I(1), and (Rm − Rb)t ∼ I(1) would beconsistent with the predictions from the expectation’s hypothesis (or theFisher parity).

2.6 Scenario Analyses: treating prices as I(1)

In this case ρ < 1 in (2.9) implying that inflation is stationary albeit stronglyautocorrelated. The representation of the vector process becomes:

mt

ptyrt

Rm,tRb,t

=c11 d12c21 d220 d320 d420 d52

· P

u1iPu2i

¸+

g1g2g300

[t] + stat.comp. (2.17)

Money and prices are represented by:

mt = c11Pu1i + d12

Pu2i + g1t+ stat.comp.

pt = c21Pu1i + d22

Pu2i + g2t+ stat.comp.

If c11 = c21 there is long-run price homogeneity, but (mt − pt) ∼ I(1) unless(d12 − d22) = 0. If d12 6= 0 and d22 = 0, then

mt − pt = d12Pu2i + (g1 − g2)t+ stat.comp.

If d12 = d32 then (mt − pt − yrt ) ∼ I(0). From mt, pt ∼ I(1) it followsthat ∆p,∆m ∼ I(0), and real interest rates cannot be stationary unlessd42 = d52 = 0.

Page 36: Juselius_Johansen_The Cointegrated VAR Model

2.7. CONCLUDING REMARKS 35

Hence, a consequence of treating prices as I(1) is that nominal interestrates should be treated as I(0), unless one is prepared a priori to excludethe possibility of stationary real interest rates.As discussed above, the inflation rate and the interest rates have to cross

their mean path fairly frequently to obtain statistically significant mean-reversion. The restricted version of (2.17) given below is economically as wellas econometrically consistent, but is usually only relevant in the analysis oflong historical data sets.

mt

ptyrt

Rm,tRb,t

=c11 d12c21 00 d120 00 0

· P

u1iPu2i

¸+

g1g2g300

[t] + stat.comp. (2.18)

2.7 Concluding remarks

This chapter focused on the decomposition of a nonstationary time-seriesprocess into stochastic and deterministic trends as well as cycles and otherstationary components. In the case when some of the variables contain com-mon stochastic trends Sections 4 and 5 showed that the latter can be canceledby taking linear combinations of the former and that these linear combina-tions potentially can be interpreted as economic steady-state relations. Allthis was given in a purely descriptive manner without specifying a statisticalmodel consistent with these features. This is the purpose of the next chapterwhich discusses the properties of aggregated time-series data and point outunder which assumptions these data will produce the VAR model.

Page 37: Juselius_Johansen_The Cointegrated VAR Model

Chapter 3

The Probability Approach inEconometrics and the VAR

This chapter will (i) define the basic characteristics of a single time seriesand a vector process, (ii) derive the VAR model under certain simplifyingassumptions on the vector process, (iii) discuss the dynamic properties ofthe VAR model (iv) and illustrate the concepts with a data set of quarterlyobservations covering 1975:1-1993:4 on money, income, inflation and two in-terest rates from Denmark. The aim is to discuss under which simplifyingassumptions on the vector time series process the VAR model can be usedas an adequate summary description of the information in the sample data.

3.1 A single time series process

To begin with we will look at a single variable observed over consecutivetime points and discuss its time-series properties. Let xs,t, s = 1, ..., S, t =1, ..., T describe S realizations of a variable x over T time periods. WhenS > 1 this could, for example describe a variable in a study based on paneldata or it could describe a simulation study of a time series process xt, inwhich the number of replications are S. Here we will focus on the casewhen S = 1, i.e. when there is just one realization (x1, ..., xT ) on the indexset T . Since we have just one realization of the random variable xt, wecannot make inference on the shape of the distribution or its parametervalues without making simplifying assumptions. We illustrate the difficultieswith two simple examples in Figures 3.1 and 3.2.

37

Page 38: Juselius_Johansen_The Cointegrated VAR Model

38 CHAPTER 3. THE PROBABILITY APPROACH

x(t)

µ

T

6

-

1

r³³³³³x1

2

r········

x2

3

rQQQQQ

x3

4

r ´´´x4

5

rSSSSSS

x5

6

rx6

Figure 3.1. E(xt) = µ, V ar(xt) = σ2, t = 1, .., 6

x(t)

T

6

-

1

r³³³³³x1

µ1

2

r········

x2

µ2

3

rQQQQQ

x3

µ3

4

r ´´´x4

µ4

5

rSSSSSS

x5

µ5

6

rx6µ6

Figure 3.2. E(xt) = µt, V ar(xt) = σ2, t = 1, .., 6

In the two examples, the line connecting the realizations xt producesthe graph of the time-series. For instance in Figure 3.1 we have assumedthat the distribution, the mean value and the variance is the same for eachxt, t = 1, ..., T . In figure 3.2 the distribution and the variance are identical,but the mean varies with t. Note that the observed time graph is the same

Page 39: Juselius_Johansen_The Cointegrated VAR Model

3.1. A SINGLE TIME SERIES PROCESS 39

in both cases illustrating the fact, that we often need rather long time seriesto be able to statistically distinguish between different hypotheses in timeseries models.To be able to make statistical inference we need:

(i) a probability model for xt, for example the normal model(ii) a sampling model for xt, for example dependent or independent drawings

For the normal distribution, the first two moments around the mean are suffi-cient to describe the variation in the data. Without simplifying assumptionson the time series process we have the general formulation for t = 1, ..., T :

E(xt) = = µtV ar(xt) = E(xt − µt)2 = σ2t,0

Cov(xt, xt−h) = E[(xt − µt)(xt−h − µt−h)] = σt,h h = ...− 1, 1, ...

E[x] = E

x1x2...xT

=µ1µ2...µT

= µ

Cov[x] = E[x− E(x)][x−E(x)]0 =

σ1.0 σ1.1 σ1.2 · · · σ1.T−1σ2.1 σ2.0 σ2.1 · · · σ2.T−2σ3.2 σ3.1 σ3.0 · · · σ3.T−3...

......

. . ....

σT.T−1 σT.T−2 σT.T−3 · · · σT.0

= Σ

x =

x1x2...xT

∼ N(µ,Σ)

Page 40: Juselius_Johansen_The Cointegrated VAR Model

40 CHAPTER 3. THE PROBABILITY APPROACH

Because there is just one realization of the process at each time t, there isnot enough information to make statistical inference about the underlyingfunctional form of the distribution of each xt, t ∈ T and we have to makesimplifying assumptions to secure that the number of parameters describingthe process are fewer than the number of observations available. A typicalassumption in time series models is that each xt has the same distribution andthat the functional form is approximately normal. Furthermore, given thenormal distribution, it is frequently assumed that the mean is the same, i.e.E(xt) = µ, for t = 1, ..., T, and that the variance is the same, i.e. E(xt−µ)2 =σ2, for t = 1, ..., T .We use the following notation to describe (i) time varying mean and

variance, (ii) time varying mean and constant variance, (iii) constant meanand variance:

(i) xt ∼ N(µt,σ2t ) t = 1, ..., T

(ii) xt ∼ N(µt,σ2) t = 1, ..., T

(iii) xt ∼ N(µ,σ2) t = 1, ..., T

For a time series process time dependence is an additional problem that has tobe addressed. Consecutive realizations cannot usually be considered as inde-pendent drawings from the same underlying stochastic process. For the nor-mal distribution the time dependence between xt and xt−h, h = ...,−1, 0, 1, ...and t = 1, ..., T can be described by the covariance function. A simplifyingassumption in this context is that the covariance function is a function ofh, but not of t, i.e. σt.h = σh, for t = 1, ..., T. If xt has constant mean andvariance and in addition σh = 0 for all h = 1, ..., T, then Σ is a diagonalmatrix and xt is independent of xt−h for h = 1, ..., t. In this case we say that:

xt ∼ Niid(µ,σ2),where Niid is a shorthand notation for Normally, identically, independently,distributed.

3.2 A vector process

We will now move on to the more interesting case where we observe a vectorof p variables. In this case we need additionally to discuss covariances be-tween the variables at time t as well as covariances between t and t−h. The

Page 41: Juselius_Johansen_The Cointegrated VAR Model

3.2. A VECTOR PROCESS 41

covariances contain information about static and dynamic relationships be-tween the variables which we would like to uncover using econometrics. Fornotational simplicity xt will here be used to denote both a random variableand its realization.Consider the p× 1 dimensional vector xt:

xt =

x1,tx2,t...xp,t

, t = 1, ..., T.We introduce the following notation for the case when no simplifying as-sumptions have been made:

E[xt] =

µ1,tµ2,t...µp,t

= µt, Cov[xt,xt−h] =

σ11.h σ12.h · · · σ1p.hσ21.h σ22.h · · · σ2p.h...

.... . .

...σp1.h σp2.h · · · σpp.h

= Σt.h

t = 1, ..., T.

We will now assume that the same distribution applies for all xt and that it isapproximately normal, i.e. xt ∼ N(µt,Σt). Under the normality assumptionthe first two moments around the mean (central moments) are sufficient todescribe the variation in the data. We introduce the notation :

Z =

x1x2...xT

, E[Z] =µ1µ2...µT

= µ (3.1)

where Z is a pT × 1 vector. The covariance matrix Σ is given by

E[(Z− µ)(Z− µ)0] =

Σ1.0 Σ02.1 · · · Σ0T−1.T−2 Σ0T.T−1Σ2.1 Σ2.0 · · · Σ0T.T−2...

.... . .

......

ΣT−1.T−2... ΣT−1.0 Σ0T.1

ΣT.T−1 ΣT.T−2 · · · ΣT.1 ΣT.0

= Σ(Tp×Tp)

Page 42: Juselius_Johansen_The Cointegrated VAR Model

42 CHAPTER 3. THE PROBABILITY APPROACH

where Σt.h = Cov(xt, xt−h) = E(xt − µt)(xt−h − µt−h)0. The above notationprovides a completely general description of a multivariate normal vectortime series process. Since there are far more parameters than observationsavailable for estimation, it has no meaning from a practical point of view.Therefore, we have to make simplifying assumptions to reduce the number ofparameters. Empirical models are typically based on the following assump-tions:

• Σt.h =Σh, for all t ∈ T, h = ...,−1, 0, 1, ...

• µt = µ for all t ∈ T.

These two assumptions are needed to secure parameter constancy in theVAR model to be defined in Section 3.4. When the assumptions are satisfiedwe can write the mean and the covariances of the data matrix in the simplifiedform:

µ =

µµ...µ

, Σ =

Σ0 Σ01 Σ02 · · · Σ0T−1

Σ1 Σ0 Σ01. . .

...

Σ2 Σ1 Σ0. . . Σ02

.... . . . . . . . . Σ01

ΣT−1 · · · Σ2 Σ1 Σ0

The above two assumptions for infinite T define a weakly stationary pro-

cess:

Definition 1 Let xt be a stochastic process (an ordered series of randomvariables) for t = ...,−1, 1, 2, .... If

E[xt] = µ <∞ for all t,

E[xt − µ]2 = σ2 <∞ for all t,

E[(xt − µ)(xt+h − µ)] = σ.h <∞ for all t and h = 1, 2, ...

then xt is said to be weakly stationary. Strict stationarity requires that thedistribution of (xt1, ...,xtk) is the same as (xt1+h, ...,xtk+h) for h = ...,−1, 1, 2, ....

Page 43: Juselius_Johansen_The Cointegrated VAR Model

3.2. A VECTOR PROCESS 43

An illustration:The data set introduced here will be used throughout the book to illus-

trate the many questions and their empirical answers that can be asked withinthe cointegrated VAR model. As a matter of fact the development of manyof the subsequent cointegration procedures were more or less forced upon usas a result of the empirical analyses being performed on this data set. Thedata vector is defined by [mr

t , yrt∆ptRm,tRb,t], t = 1975:1, ..., 1993:4,where

mrt = mt − pt is a measurement of real money stock at time t, where

mt is the nominal M3 and pt is the implicit deflator of the gross nationalexpenditure, GNE,

yrt is the real gross national expenditure, GNE,

∆pt is the quarterly inflation rate measured by the implicit GNE deflator,

Rm.t is the average deposit rate as a proxy for the interest yield on M3,

Rb,t is the 10 year government bond rate.

Figure 3.3 and 3.4 show the graphs of the data in levels and in firstdifferences.

1975 1980 1985 1990 1995

.25

.5

.75Real money stock

1975 1980 1985 1990 1995

1.2

1.3

1.4

1.5Real aggregate expenditure

1975 1980 1985 1990 1995

0

.025

.05 Inflation rate

1975 1980 1985 1990 1995

.015

.02

.025

.03Deposit rate

1975 1980 1985 1990 1995

.03

.04

.05 Bond rate

Figure 3.3. The Danish data in levels.

Page 44: Juselius_Johansen_The Cointegrated VAR Model

44 CHAPTER 3. THE PROBABILITY APPROACH

1975 1980 1985 1990 1995

0

.1 Dm

1975 1980 1985 1990 1995

0

.05

.1Dry

1975 1980 1985 1990 1995

-.025

0

.025

.05DDpy

1975 1980 1985 1990 1995

0

.005 Did

1975 1980 1985 1990 1995

-.005

0

.005Dib

Figure 3.4. The Danish data in first differences.

A visual inspection reveals that neither the assumption of a constantmean nor of a constant variance seem appropriate for the levels of the vari-ables, whereas the differenced variables look more satisfactory in this respect.If the marginal processes are normal then the observations should lie sym-metrically on both sides of the mean. This seems approximately to be thecase for real money, but for real income, inflation rate, and the two interestrates there are some ”outlier” observations. The question is whether theseobservations are too far away from the mean to be considered realizationsfrom a normal distribution. At this stage it is a good idea to have a look inthe economic calendar to find out if the outlier observations can be relatedto some significant economic interventions or reforms.The outlier observation in real income and inflation rate appears at the

same date and seems related to a temporary removal of the value added taxrate in 1975:4. Denmark had experienced a stagnating domestic demandin the period after the first oil shock and to boost aggregate activity thegovernment decided to remove VAT for one quarter and gradually put itback again over the next two quarters. The outlier observation in bond rateis related to the lifting of previous restrictions on capital movements andthe start of the ‘hard’ EMS in 1983, whereas the outliers in the depositrate are related Central Bank interventions. Furthermore, we note that the

Page 45: Juselius_Johansen_The Cointegrated VAR Model

3.2. A VECTOR PROCESS 45

variability of the inflation rate in Figure 3.3. looks higher in the first part ofthe sample. Thus, the assumption of constant variance does not seem to besatisfied in this particular case.

These are realistic examples that point at the need to include additionalinformation on interventions and institutional reforms in the empirical modelanalysis. This can be done by including new variables measuring the effect ofinstitutional reforms, or if such variables are not available by using dummyvariables as a proxy for the change in institutions.

At the start of the empirical analysis it is not always possible to knowwhether an intervention was strong enough to produce an ”extraordinary”effect or not. Essentially every single month, quarter, year is subject to somekind of political interventions, most of them have a minor impact on the dataand the model. Thus, if an ordinary intervention does not ‘stick out’ as anoutlier, it will be treated as a random shock for practical reasons. Majorinterventions, like removing restrictions on capital movements, joining theEMS, etc. are likely to have much more fundamental impact on economicbehavior and, hence, need to be included in the systematic part of the model.Ignoring this problem is likely to seriously bias all estimates of our model andresult in invalid inference.

It is always a good idea to start with a visual inspection of the data andtheir time series properties as a first check of the assumptions of the V ARmodel. Based on the graphs we can get a first impression of whether xi,t looksstationary with constant mean and variance, or whether this is the case for∆xi,t. If the answer is negative to the first question, but positive to the nextone, we can solve the problem by respecifying the VAR in error-correctionform as will be demonstrated in the next chapter. If the answer is negativeto both questions, it is often a good idea to check the economic calendarto find out whether any significant departure from the constant mean andconstant variance coincides with specific reforms or interventions. The nextstep is then to include this information in the model and find out whether theintervention or reform had caused a permanent or transitory effect, whetherit seems to be additive to the model or whether it fundamentally changed theparameters of the model. In the latter case the intervention is likely to havecaused a regime shift and the model would need to be re-specified allowing forthe shift in the structure. The econometric modelling of intervention effectswill be discussed in more detail in Chapter 6.

Page 46: Juselius_Johansen_The Cointegrated VAR Model

46 CHAPTER 3. THE PROBABILITY APPROACH

3.3 Sequential decomposition of the likelihood

function

The purpose of this section is to demonstrate (i) that the joint likelihood func-tion P (X; θ) can be sequentially decomposed into T conditional probabilitiesP (xt | xt−1,...,x1;X0,θ), (ii) that the conditional process (xt | xt−1,...,x1;X0)has a parameterization that corresponds to the vector autoregressive model.First we give a repetition of the simple multiplicative rule to calcu-

late joint probabilities, and the formulas for calculating the conditional andmarginal mean and the variance of a multivariate normal vector Y.

Repetition:

***********************************************An illustration of the multiplicative rule for probability calculations based

on four dependent events, A, B, C, andD:

P (A ∩B ∩ C ∩D) = P (A|B ∩ C ∩D)P (B ∩ C ∩D)= P (A|B ∩ C ∩D)P (B|C ∩D)P (C ∩D)= P (A|B ∩ C ∩D)P (B|C ∩D)P (C|D)P (D)

Note that a multiplicative formulation has been achieved for the conditionalevents, even if the events themselves are are not independent.The general principle of the multiplicative rule for probability calculations

will be applied in the derivation of conditional and marginal distributions.Consider first two normally distributed random variables y1,t and y2,t withthe joint distribution:

Y ∼ N(m,S)

Y =

·y1,ty2,t

¸, E[Y] =

·m1

m2

¸, Cov

·y1,ty2,t

¸=

·s11 s12s21 s22

¸The marginal distributions for y1,t and y2,t are given by

y1,t ∼ N(m1, s11)

y2,t ∼ N(m2, s22)

Page 47: Juselius_Johansen_The Cointegrated VAR Model

3.4. DERIVING THE VAR 47

The conditional distribution for y1,t|y2,t is given by(y1,t|y2,t) ∼ N(m1.2, s11.2)

where

m1.2 = m1 + s12s−122 (y2,t −m2) (3.2)

= (m1 − s12s−122m2) + s12s−122 y2,t

= β0 + β1y2,t

and

s11.2 = s11 − s12s−122 s21 (3.3)

The joint distribution of Y can now be expressed as the product of theconditional and the marginal distribution:

P (y1,t, y2,t;θ)| z the joint distribution

= P (y1,t|y2,t;θ1)| z the conditional distribution

× P (y2,t;θ2)| z the marginal distribution

(3.4)

**************************************************

3.4 Deriving the VAR

1The empirical analysis begins with the data matrixX = [x1, ...,xT ]0 where xt

is a (p×1) vector of variables. Under the assumption that the observed dataX is a realization of a stochastic process we can express the joint probabilityof X given the initial value X0 and the parameter value θ describing thestochastic process:

P (X|X0;θ) = P (x1,x2, ...,xT |X0;θ)

For a given probability function maximum likelihood estimates can be foundby maximizing the likelihood function. Here we will restrict the discussionto the multivariate normality distribution. To express the joint probabilityof X|X0 it is convenient to use the stacked process Z

0 = x01,x02,x

03, ...,x

0T ∼

NTp(µ,Σ) defined in (3.1) instead of the (T × p) data matrix X. Since

1This section draws heavily on Hendry and Richard (1983).

Page 48: Juselius_Johansen_The Cointegrated VAR Model

48 CHAPTER 3. THE PROBABILITY APPROACH

µ is Tp × 1 and Σ is Tp × Tp, there are far more parameters than obser-vations without simplifying assumptions. But even if we impose simplifyingrestrictions on the mean and the covariances of the process, they are not veryinformative as such. Therefore, to obtain more interpretable results we willdecompose the joint process into a conditional process and a marginal processand then sequentially repeat the decomposition for the marginal process:

P (x1,x2,x3, ...,xT |X0;θ) (3.5)

= P (xT |xT−1, ...,x1,X0;θ)P (xT−1,xT−2, ...,x1|X0;θ)...

=T

Πt=1P (xt|X0

t−1;θ)

where

X0t−1 = [xt−1,xt−2, ...,x1,X0].

The VAR model is essentially a description of the conditional process

xt|X0t−1 ∼ NIDp(µt,Ω).

It is now possible to see how µt and Ω are related to µ and Σ by using therules given in Section 3.3. for calculating the mean and the variance of theconditional distribution (3.2)-(3.3). We first decompose the data into two

sets, the vector xt and the conditioning set X0t−1, i.e. X =

·xtX0t−1

¸. Using

the notation of Section 3.3 we write the marginal and the conditional process:

y1,t = xt µ1 = E[xt]

y2,t =

xt−1xt−2...x1

µ2 =

E[xt−1]E[xt−2]...

E[x1]

Σ =

Σ0 Σ01 · · · Σ0T−1

Σ1 Σ0 Σ01...

.... . . . . . Σ01

ΣT−1 · · · Σ1 Σ0

=·Σ11 Σ12Σ21 Σ22

¸

Page 49: Juselius_Johansen_The Cointegrated VAR Model

3.4. DERIVING THE VAR 49

We can now derive the parameters of the conditional model:

(xt|X0t−1) ∼ N(µ1.2,Σ11.2)

where

µ1.2 = µ1 + Σ12Σ−122 (X

0t−1 − µ2) (3.6)

and

Σ11.2 = Σ11 − Σ12Σ−−122 Σ21 (3.7)

The difference between the observed value of the process and its conditionalmean is denoted εt :

xt − µt = εt

Inserting the expression for the conditional mean gives:

xt = µ1 + Σ12Σ−122 (X

0t−1 − µ2) + εt

xt = µ1 − Σ12Σ−122 µ2 + Σ12Σ

−122X

0t−1 + εt

Using the notation: Π0 = µ1−Σ12Σ−122 µ2, [Π1,Π2, ...,ΠT−1] = Σ12Σ−122 and

assuming that Πk+1,Πk+2, ...,ΠT−1 = 0, we arrive at the k0th order vectorautoregressive model:

xt = Π0 +Π1xt−1 + ...+Πkxt−k + εt, t = 1, ..., T (3.8)

where εt is Niidp(0,Ω) and x0, ...x−k+1 are assumed fixed.If the assumption that X = [x1,x2, ...,xT ] is multivariate normal (µ,Σ) is

correct then it follows that (3.8):

• is linear in the parameters• has constant parameters• has normally distributed errors εt.Note that the constancy of parameters depends on the constancy of the

covariance matrices Σ12 and Σ22. If any of them change as a result of areform or intervention during the sample, both the intercept, Π0, and the‘slope coefficients’ Π1, ...,Πk are likely to change.

Page 50: Juselius_Johansen_The Cointegrated VAR Model

50 CHAPTER 3. THE PROBABILITY APPROACH

3.5 Interpreting the VAR model

We have shown that the VAR model is essentially a reformulation of thecovariances of the data. The question is whether it can be interpreted interms of rational economic behavior and if so whether it could be used as a’design of experiment’ when data are collected by passive observation. Theidea, drawing on Hendry and Richard (1983), is to interpret the conditionalmean µt of the VAR model

µt = Et−1(xt | xt−1, ...,xt−k) = Π1xt−1 + ...+Πkxt−k, (3.9)

as describing agents’ plans at time t − 1 given the available informationX0t−1 = [xt−1, ...,xt−k]. According to the assumptions of the VAR model the

difference between the mean and the actual realization is white noise process

xt − µt = εt, εt ∼ Niidp(0,Ω). (3.10)

Thus, the Niid(0,Ω) assumption in (3.10) is consistent with economic agentswhich are rational in the sense that they do not make systematic forecasterrors when they make plans for time t based on the available informationat time t − 1. For example, a VAR model with autocorrelated and or het-eroscedatic residuals would describe agents that do not use all informationin the data as efficiently as possible. This is because they could do betterby including the systematic variation left in the residuals, thereby improvingtheir expectations about the future.

Checking the assumptions of the model, i.e. checking the white noise re-quirement of the residuals, is not only crucial for correct statistical inference,but also for the economic interpretation of the model as a rough descriptionof the behavior of rational agents.

As an illustration Figure 3.5 shows the graphs of the (0,1) standardizedresiduals from the VAR(2) model based on the Danish data.

Page 51: Juselius_Johansen_The Cointegrated VAR Model

3.5. INTERPRETING THE VAR MODEL 51

1975 1980 1985 1990 1995

-2

0

2res_m3

1975 1980 1985 1990 1995

-2

0

2res_y

1975 1980 1985 1990 1995

0

2.5Res_infl

1975 1980 1985 1990 1995

0

2.5res_deprate

1975 1980 1985 1990 1995

-2.5

0

2.5 Res_bondrate

Figure 3.5. The graphs of the residuals from the VAR(2) model for realmoney stock, real income, inflation, deposit rate and bond rate.

The residuals do not look too bad, though a few outlier observationscan be found, approximately corresponding to the interventions and reformsdiscussed above.Unfortunately, in many economic applications the multivariate normal-

ity assumption is seldom satisfied for the VAR in its simplest form (3.8).Since in general the statistical inference is valid only to the extent that theassumptions of the underlying methods are satisfied, this is potentially a se-rious problem. Therefore, we have to ask whether it is possible to modify thebaseline VAR model (3.8) so that it preserves its attractiveness as a conve-nient description of the basic properties of the data, while at the same timeyielding valid inference.Simulation studies we shown that valid statistical inference is sensitive

to the validity of some of the assumptions, like parameter non-constancy,autocorrelated residuals (the higher, the worse!) and skewed residuals, whilequite robust to others, like excess kurtosis and residual heteroscedastisity.

Page 52: Juselius_Johansen_The Cointegrated VAR Model

52 CHAPTER 3. THE PROBABILITY APPROACH

This will be discussed in more detail in Chapter 5.Whatever the case, direct or indirect testing of the assumptions is crucial

for the success of the empirical application. As soon as we understand thereasons why the model fails to satisfy the assumptions it is often possibleto modify the model, so that in the end we can start from a statistically”well-behaved” model. Important tools in this context are:

• the use of intervention dummies to account for significant political orinstitutional events during the sample,

• conditioning on weakly exogenous variables,• checking the measurements of the chosen variables,• changing the sample period to avoid fundamental regime shift or split-ting the sample into more homogenous periods.

How to use these tools will be further discussed in the subsequent chap-ters.

3.6 The dynamic properties of the VAR pro-

cess

The dynamic properties of the process can be investigated by calculating theroots of the VAR process (3.8). It is convenient to formulate the VAR as apolynomial in the lag operator L, where Lixt = xt−i:

(I− Π1L− ...− ΠkLk)xt = Φ Dt + εt, (3.11)

Π(L)xt = Φ Dt + εt,

where the model has been extended to contain Dt, a vector of deterministiccomponents, such as a constant, seasonal dummies and intervention dum-mies. The autoregressive formulation is useful for expressing hypotheses oneconomic behavior, whereas the moving average representation is useful whenexamining the properties of the process. When the process is stationary, thelatter representation can be found directly by inverting the VAR model sothat xt, t = 1, ..., T, is expressed as a function of past and present shocks,εt−j, j = 0, 1, ..,initial values X0, possible deterministic components Dt:

Page 53: Juselius_Johansen_The Cointegrated VAR Model

3.6. THE DYNAMIC PROPERTIES OF THE VAR PROCESS 53

xt = Π−1(L)(Φ Dt + εt), (3.12)

= |Π (L)|−1Πa(L)(Φ Dt + εt), (3.13)

= (I+C1L+C2L2...)(Φ Dt + εt), (3.14)

where |Π (L)| = det(Π (L)) and Πa(L) is the adjunct matrix of Π (L) . Jo-hansen (1995), Chapter 2 gives a recursive formula for Cj = f( Π1, ..., Πk)when the VAR process is stationary. When the VAR process is nonstation-ary xt is non-invertible and the Cj matrices have to be derived under theassumption of reduced rank. This case will be discussed in Chapter 5.

3.6.1 The roots of the characteristic function

To calculate the roots of the VAR process we consider first the characteristicpolynomial:

Π(z) = I− Π1z − ...− Πkzk,

(Π(z))−1 = |Π (z)|−1Πa(z).

We will first illustrate that the roots of |Π (z)| = 0 summarize importantinformation about the dynamics and the stability of the process. We assumea stationary two-dimensional VAR(2) model:

(I− Π1L− Π2L2)xt = Φ Dt + εt.

The characteristic function is:

Π (z) = I−·π1.11 π1.12π1.21 π1.22

¸z −

·π2.11 π2.12π2.21 π2.22

¸z2

= I−·π1.11z π1.12zπ1.21z π1.22z

¸−·π2.11z

2 π2.12z2

π2.21z2 π2.22z

2

¸=

·1− π1.11z − π2.11z

2 −π1.12z − π2.12z2

−π1.21z − π2.21z2 1− π1.22z − π2.22z

2

¸

Page 54: Juselius_Johansen_The Cointegrated VAR Model

54 CHAPTER 3. THE PROBABILITY APPROACH

and

|Π (z)| = (1− π1.11z − π2.11z2)(1− π1.22z − π2.22z

2)− (π1.12z + π2.12z2)(π1.21z + π2.21z

2),

= 1− a1z − a2z2 − a3z3 − a4z4,= (1− ρ1z)(1− ρ2z)(1− ρ3z)(1− ρ4z),

i.e. the determinant is a fourth order polynomial in z which gives us fourcharacteristic roots, z1 = 1/ρ1, ..., z4 = 1/ρ4, when solving for |Π (z)| = 0.The characteristic roots contain useful information about the dynamic be-

havior of the process. This can be illustrated using the simple two-dimensionalVAR(2) model:

xt =Πa(L)(Φ Dt + εt)

(1− ρ1L)(1− ρ2L)(1− ρ3L)(1− ρ4L),

=

µΠa(L)

(1− ρ2L)(1− ρ3L)(1− ρ4L)

¶µεt + Φ Dt

(1− ρ1L)

¶,

for t = 1, ..., T. As an example of the dynamic behavior of the process weexpand the first root component (1 − ρ1L)

−1( εt + ΦDt) = (1 + ρ1L + ..)(εt + Φ Dt). Thus, each shock εt will dynamically affect both present andfuture values of the variables in xt. Note that this holds true also for anydummy variable included in Dt. The persistence of the effect depends onthe magnitude of |ρ1| , the larger, the stronger the persistence.It is noteworthy that already the simple two-dimensional VAR(2) model

can generate a very rich dynamic pattern in the variables xt as a resultof the multiplicity of the roots and the additional dynamics given by theΠc(L) matrix. A real root ρj will generate exponentially declining behavior,a complex pair of roots ρj = ρreal ± iρcomplex will generate exponentiallydeclining cyclical behavior. If a real root is lying on the unit circle (ρ = 1, ρ =−1) it will generate non-stationary behavior, i.e. a stochastic trend in xt. Ifthe modulus of a complex root is one it corresponds to nonstationary seasonalbehavior. An example of the latter is the simple fourth order difference modelfor quarterly data:

(1− L4) xt = (1− L)(1 + L)(1 + L2) xt = εt.

We set the characteristic polynomial to zero

Page 55: Juselius_Johansen_The Cointegrated VAR Model

3.6. THE DYNAMIC PROPERTIES OF THE VAR PROCESS 55

(1− z)(1 + z)(1 + z2) = 0

and find the characteristic roots: z1 = 1, z2 = −1, z3 = −i, z4 = i.

3.6.2 Calculating the roots using the companion ma-trix

We will here demonstrate that the roots of the process can be convenientlycalculated by first reformulating the VAR(k) model into the ‘companionAR(1) form’ and then solving an eigenvalue problem. In this case the eigen-value solution gives the roots directly as ρ1, ..., ρp×k instead of the inverseρ−11 , ..., ρ

−1p×k obtained by solving the characteristic function. To distinguish

between the two cases we call the former characteristic roots and the lat-ter eigenvalue roots. The latter are calculated by transforming the VAR(k)model into an AR(1) model based on the companion form. For simplicity weassume k = 2 when we illustrate the procedure. First we rewrite the VAR(2)model in the AR(1) form:

·xtxt−1

¸=

·Π1 Π2I 0

¸ ·xt−1xt−2

¸+

·εt0

¸,

or more compactly:

ext = eΠext−1 + eεtThe roots of the matrix eΠ can be found by solving the eigenvalue problem:

ρV = eΠVwhere V is a kp× 1 vector.

ρ

·v1v2

¸=

·Π1 Π2I 0

¸ ·v1v2

¸

Page 56: Juselius_Johansen_The Cointegrated VAR Model

56 CHAPTER 3. THE PROBABILITY APPROACH

i.e.

ρv1 = Π1v1 + Π2v2ρv2 = v1

The solution can be found from:

ρv1 = Π1v1+ Π2(v1/ρ)

v1 = Π1(v1/ρ) + Π2(v1/ρ2)

i.e. the eigenvalues of Π are the pk roots of the second order polynomial:¯I− Π1ρ

−1− Π2ρ−2¯ = 0

or ¯I− Π1z− Π2z

2¯= 0,

where z = ρ−1. Note that the roots of the companion matrix ρi are theinverse of the roots of the characteristic polynomial. Thus, the solution to

¯I− z Π

¯= 0

gives the stationary roots outside the unit circle, whereas the solution to

¯ρI− Π

¯= 0

gives stationary roots inside the unit circle. To summarize:

• if the roots of |Π (z)| , are all outside the unit circle (or alternatively ifthe eigenvalues of the companion matrix are all inside the unit circle)then xt is stationary,

• if the roots are outside or on the unit circle (alternatively if the eigen-values are inside or on the unit circle) then xt is nonstationary,

• if any of the roots are inside the unit circle (alternatively if the eigen-values are outside the unit circle) then xt is explosive.

Page 57: Juselius_Johansen_The Cointegrated VAR Model

3.6. THE DYNAMIC PROPERTIES OF THE VAR PROCESS 57

Table 3.1: The roots of the VAR(2) model

real complex modulus0.99 0.00 0.990.75 0.10 0.760.75 -0.10 0.760.69 -0.33 0.760.69 0.33 0.76-0.29 -0.40 0.49-0.29 0.40 0.49-0.35 0.00 0.350.11 -0.28 0.300.11 0.28 0.30

3.6.3 Illustration

Table 3.1 reports the roots of the VAR(2) model for the Danish data andFigure 3.6 shows them in the unit circle.

There are two real roots, one is almost on the unit circle, the other isnegative root (which is likely to be the results of ∆pt being to some extentover-differenced). All remaining roots come in complex pairs.

Page 58: Juselius_Johansen_The Cointegrated VAR Model

58 CHAPTER 3. THE PROBABILITY APPROACH

The eigenvalues of the companion matrix

-1.0 -0.5 0.0 0.5 1.0-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

Figure 3.6. The pk = 10 roots of the VAR(2) model for the Danish data.

3.7 Concluding remarks

The aim of this chapter was to describe a “design of experiment” that mayhave generated data by passive observation for which the VAR model is anappropriate description. On an aggregated level, economic agents were as-sumed rational in the sense that they learn by past experience and adjusttheir behavior accordingly, so that their plans do not systematically devi-ate from actual realizations. Thus, the “design of experiment” consistentwith the Niid assumption of the residuals relies on the assumption thatagents make plans based on conditional expectations using the informationset xt−1, Dt, so that the residual (the unexpected component given thechosen information set) behaves as a normal innovation process.In this framework, the success of the empirical analysis relies crucially

on the choice of a sufficient and relevant information set of an appropriatesample period and the skillfulness of the investigator to extract economicallyinteresting results from this information.The purpose of the next chapter is to discuss estimation of the unre-

stricted VAR and some diagnostic tools which can be used when assessingthe appropriateness of the chosen model.

Page 59: Juselius_Johansen_The Cointegrated VAR Model

Chapter 4

Estimation, Specification andTests in the Unrestricted VAR

The probability approach in econometrics requires an explicit probability for-mulation of the empirical model so that a fully specified statistical model canbe derived and checked against the data. Assume that we have derived anestimator under the assumption of multivariate normality as demonstratedin the previous chapter. We then take the model to the data and obtainmodel estimates derived under this assumption. If the multivariate normal-ity assumption is correct the residuals should not deviate significantly fromthe Niid assumption. If they do not pass the tests, for example because theyare autocorrelated or heteroscedastic, or because the distribution is skewedor leptocurtic, then the estimators may no longer have optimal propertiesand cannot be considered full information maximum likelihood (FIML) esti-mators. The obtained parameter estimates (based on an incorrectly derivedestimator) may not have any meaning and since we do not know their ”true”properties inference is likely to be hazardous.However, some assumptions are more crucial for the properties of the es-

timates than others. Therefore, when reporting the various misspecificationtests below we will discuss robustness properties against modest violations ofthe assumptions. Nevertheless, if we are going to claim that our conclusionsare based on FIML inference, then we also have to demonstrate that ourmodel is capable of mirroring the ‘full information’ of the data in a satisfac-tory way.Before being able to test the assumptions we need to estimate the model

and Section 4.1 derives the ML estimator under the null of correct model

59

Page 60: Juselius_Johansen_The Cointegrated VAR Model

60 CHAPTER 4. ESTIMATION AND SPECIFICATION

specification. Section 4.2 discusses different parametrization of the unre-stricted VAR model and illustrates the estimates based on the Danish data.Section 4.3 reports briefly some frequently used misspecification tests.

4.1 Likelihood based estimation in the unre-

stricted VAR

Under the assumption that the parameters Θ = Π1, Π2, ..., Πk,Ω inthe VAR model (3.3.6) of Chapter 3 are unrestricted, it can be shown thatthe simple OLS estimator is identical to the FIML estimator. When thedata contain unit roots we need to derive the likelihood estimator subject toreduced rank restrictions. Chapter 7 will give a detailed discussion of how tosolve this problem. To simplify notation we rewrite (4.7) in compact form:

xt = B0Zt + εt, t = 1, ..., T (4.1)

εt ∼ Np(0,Ω)

where B0 = [ Π1, Π2, ..., Πk], Z0t= [x0t−1,x

0t−2, ...,x

0t−k] and the initial values

X0 = [x00,x0−1, ...,x

0−k+1] are given. For simplicity we assume ΦDt = 0. We

need to derive the equations for estimating B and Ω which can be done byfinding the expression for B and Ω for which the first order derivatives of thelikelihood function are equal to zero.We consider first the log likelihood function

lnL(B,Ω;X) = −T 12ln(2π)− T 1

2|Ω|− 1

2

TXt=1

(xt−B0Zt)0Ω−1(xt−B0Zt),

and calculate ∂ lnL/∂B = 0 which gives

TPt=1

xtZ0t = B

0 TPt=1

ZtZ0t,

so that the FIML estimator for B is:

B0 =TPt=1

(xtZ0t)(

TPt=1

ZtZ0t)−1 =MxZM

−1ZZ . (4.2)

Page 61: Juselius_Johansen_The Cointegrated VAR Model

4.1. LIKELIHOODBASED ESTIMATION IN THE UNRESTRICTEDVAR61

Next we calculate ∂ lnL/∂Ω = 0 which gives the estimator of Ω:

Ω = T−1TPt=1

(xt−B0Zt)(xt−B0Zt)

0 = T−1TPt=1

εtε0t. (4.3)

The ML estimators (4.2) and (4.3) are identical to the correspondingOLS estimators. We can now find the maximal value of the (log) likelihoodfunction for the ML estimates B and Ω:

lnLmax = −12T ln(2π)− 1

2T ln

¯Ω¯− 12

TPt=1

(xt−B0Zt)0Ω−1(xt−B0Zt)

We will now show that lnLmax = −12T ln¯Ω¯+constant terms. Consider

first:

(xt−B0Zt)0Ω−1(xt−B0Zt) = εtΩ

−1ε0t= u21t + ...+ u

2pt.

where ui,t = εtΩ1/2. Let:

U =

u21t

u22t. . .

u2pt

.It follows that trace(U) = u21t + ...+ u

2pt.

Using the rule trace(ABC) = trace(CAB) we have that:

traceh(xt−B0Zt)0Ω−1(xt−B

0Zt)i= trace

h(xt−B0Zt)(xt−B

0Zt)

0Ω−1i

so that trace(ε0tΩ−1εt) = trace(εtε0tΩ

−1). Summing over T we obtain:

traceT−1ΣTt=1(xt−B0Zt)Ω

−1(xt−B0Zt)

0 = traceT−1ΣTt=1εtε0tΩ−1= traceΩΩ−1= traceIp = p

Page 62: Juselius_Johansen_The Cointegrated VAR Model

62 CHAPTER 4. ESTIMATION AND SPECIFICATION

and

lnLmax = −T 12ln¯Ω¯− T 1

2p− T 1

2ln(2π),

i.e. apart from some constant terms, the maximum of the log likelihoodfunction is proportional to the log determinant of the residual covariancematrix Ω :

lnLmax = −T 12ln¯Ω¯+ constant terms

This result will be used in many of the test procedures discussed below andin the derivation of the maximum likelihood estimator for the cointegratedVAR model in Chapter 7.To be able to test hypotheses on B we have to find the distribution of the

estimates B.We will use the simple VAR(2) model to discuss the asymptoticdistribution of B under the assumption of stationarity of the process xt.Next, consider the estimation error of the VAR coefficients:

B0−B0 =hΠ1, Π2

i− [Π1,Π2] (4.4)

First we denote the covariance matrices between xt−1 and xt−2

Σ =

·Σ11 Σ12Σ21 Σ22

¸=

·V ar(xt−1) Cov(xt−1,xt−2)

Cov(xt−2,xt−1) V ar(xt−2)

¸.

Under the stationarity assumption the distribution of (4.4) has the followingasymptotic property:

T12 (B−B) w→ N(0,Σ−1 ⊗ Ω) (4.5)

where

Σ−1⊗Ω =·Σ−111 Ω Σ−112 ΩΣ−121 Ω Σ−122 Ω

¸.

Page 63: Juselius_Johansen_The Cointegrated VAR Model

4.1. LIKELIHOODBASED ESTIMATION IN THE UNRESTRICTEDVAR63

To see how the distribution of the unrestricted VAR estimates relates tothe corresponding results for the standard regression model we digress brieflyto the latter.

Repetition:**********************************

The distribution of the linear regression model estimate.

y = Xβ + ε, ε ∼ Niid(0,σ2ε)β = (X0X)−1X0y

β − β = (X0X)−1X0εV ar(β − β) = σ2ε(X

0X)−1

***********************************

Thus, the VAR results are similar to the linear regression model, ex-cept that the design matrix X0X of the latter is replaced by the 2p × 2pcovariance matrix M. Note that the asymptotic distribution of the linearregression model coefficients are based on the assumption that the design

matrix T−1X0X P→ A where A is a constant matrix. When the data haveunit roots this is no longer the case and the design matrix when normalizeddifferently will instead converge towards a matrix of Brownian motions. Thiswill be discussed further in Chapter 8.

Assume now that we would like to test the significance of a single co-efficient, for example the first element π1,11 of Π1. We define two ‘design’vectors ξ0 = [1, 0, 0, ..., 0] and η0 = [1, 0, 0, ..., 0] where ξ is p × 1 and η is2p × 1, so that ξ0B0η = π1,11. Using (4.5) we can find the test statistic forthe null hypothesis π1,11 = 0, which has a Normal (0,1) distribution. Thiscan be generalized to testing any coefficient in B by appropriately choosingthe vectors ξ and η :

T12ξ0B0η

(ξ0Ωξη0Σ−1η)12

w→ N(0, 1). (4.6)

Page 64: Juselius_Johansen_The Cointegrated VAR Model

64 CHAPTER 4. ESTIMATION AND SPECIFICATION

4.1.1 The estimates of the unrestricted VAR(2) for theDanish data

The unrestricted VAR(2) model was estimated based on the following as-sumptions:

xt = Π1xt−1 + Π2xt−2 + ΦDt + εt (4.7)

t = 1, ...T, εt ∼ Np(0,Ω)where Dt contains three centered seasonal dummies and a constant. Theestimates reported in Table 4.1 are calculated in GiveWin by running anOLS regression equation by equation. As discussed above these estimatesare ML estimates as long as no restrictions have been imposed on the VARmodel. To increase readability we have omitted standard errors of estimatesand ‘t’ ratios. Instead, coefficients with a ‘t’-ratio greater than 1.9 havebeen given in bold face. Since Chapter 3 found at least one characteristicroot very close to the unit circle in the unrestricted VAR, xt is not likely tobe stationary. This implies that the ’design matrix’ normalized differentlywill no longer converge towards a constant matrix in the limit but, instead,towards a matrix of Brownian motions. In this case the ‘t’-ratios are morelikely to be distributed as the Dickey-Fuller’s ‘τ ’ and should, therefore, notbe interpreted as Student’s t.mrt

yrt∆ptRm,tRb,t

=

0.59 0.37 −0.11 2.59 −7.220.11 0.96 −0.23 −2.42 −1.13−0.03 −0.04 −0.52 −0.38 0.29−0.01 0.00 0.00 0.87 0.30−0.03 0.05 0.01 0.03 1.27

mrt−1

yrt−1∆pt−1Rm,t−1Rb,t−1

+

+

0.21 −0.15 −0.07 3.41 2.22−0.01 −0.25 −0.07 0.07 2.240.04 −0.09 −0.22 1.83 −0.430.01 0.01 −0.01 −0.13 −0.250.02 −0.04 0.01 −0.03 −0.40

mrt−2

yrt−2∆pT−2Rm,t−2Rb,t−2

+

ε1.tε2.tε3.tε4.tε5.t

,

Ω =

1.00.25 1.0−0.20 −0.19 1.0−0.09 −0.05 0.02 1.0−0.24 0.09 −0.09 0.32 1.0

, σε =

0.02670.01690.01450.00130.0017

Page 65: Juselius_Johansen_The Cointegrated VAR Model

4.2. THREE DIFFERENT ECM-REPRESENTATIONS 65

Log(Lmax) = 1973.1, log¯Ω¯= −51.24, R2(LR) = 0.9998, R2(LM) =

0.64, F-test on all regressors: F(50,272) =32.5, where R2(LR) and R2(LM)will be explained in Section 4.3.An inspection of the estimated coefficients reveals more significant co-

efficients at lag 1 than lag 2. Most of the coefficients with large t-ratiosare on the diagonal, implying that the variables are highly autoregressive.Only the inflation rate, ∆pt, has a negative autoregressive coefficient, whichis a result of the imposed difference operator1. The bond rate seems to bequite important for most of the variables in this system. The residual cor-relations between equations are generally moderately sized and suggest thatthe current effects are not likely to be very important in this case. The log

likelihood value and log¯Ω¯are only informative when compared to another

model specification. For example, based on the VAR(1) model we obtained

log¯Ω¯= −50.19 > −51.24. The R2(LR) is almost 1.0, which does not imply

that we have explained all variation in the data but, instead, that R2 is anincorrect measure when the variables are trending as they are in the presentcase. See also the discussion in Section 4.3.2.Finally we report F-tests on the significance of single regressors. The

tests are distributed as F(5,59):·mrt−1 yrt−1 ∆pt−1 Rm,t−1 Rb,t−1 mr

t−2 yrt−2 ∆pt−2 Rm,t−2 Rb,t−24.8 14.1 4.8 10.3 22.7 2.8 3.7 1.0 0.9 2.9

¸

We note that the second lag of inflation and deposit rate could altogetherbe omitted from the system.

4.2 Three different ECM-representations

The unrestricted V AR model can be given different parametrization with-out imposing any binding restrictions on the model parameters, i.e. with-out changing the value of the likelihood function. The so called vector(equilibrium)-error-correction model gives a convenient reformulation of (4.7)in terms of differences, lagged differences, and levels of the process. Thereare several advantages of this formulation:

1For example, if pt = 0.5pt−1 + 0.5pt−2 then ∆pt = −0.5∆pt−1 + εt.

Page 66: Juselius_Johansen_The Cointegrated VAR Model

66 CHAPTER 4. ESTIMATION AND SPECIFICATION

1. The multicollinearity effect which typically is strongly present in time-series data is significantly reduced in the error-correction form. Differ-ences are much more ‘orthogonal’ than the levels of variables.

2. All information about long-run effects are summarized in the levelsmatrix Π which can, therefore, be given special attention when solvingthe problem of cointegration.

3. The interpretation of the estimates is much more intuitive, as the co-efficients can be naturally classified into short-run effects and long-runeffects.

4. We are generally interested in understanding why inflation rate, say,changed from the previous to the present period. The ECM-formulationanswers this question directly.

We will now discuss three different versions of the VAR(k) model repre-sented in the general error-correction form

∆xt = Γ(m)1 ∆xt−1 + Γ

(m)2 ∆xt−2 + ...+ Γ

(m)k−1∆xt−k+1 + Πxt−m + Φ Dt + εt

(4.8)

where m is an integer value between 1 and k. Note that the value of the like-lihood function does not change even if we change the value of m. For theDanish data we will assume the lag length k = 2 and report the unrestrictedparameter estimates for that choice. Additionally the value of the log likeli-hood function, some multivariate R2 measures, and F-test of the significanceof the regressors will be reported. The purpose is to illustrate how differentthe estimates can look although it is exactly the same model that has beenestimated in all three cases.

4.2.1 The ECM formulation with m = 1.

The VAR(2) model is specified as:

∆xt = Γ(1)1 ∆xt−1 + Πxt−1 + Φ Dt + εt (4.9)

where Π= I− Π1− Π2, and Γ(1)1 = − Π2. In (4.9) the lagged levels matrix

Π has been placed at time t− 1.

Page 67: Juselius_Johansen_The Cointegrated VAR Model

4.2. THREE DIFFERENT ECM-REPRESENTATIONS 67

The estimated coefficients reported below show that most of the signifi-cant coefficients are now in the lagged levels matrix Π, whereas only four outof 25 coefficients in the Γ

(1)1 matrix seem significant. Among the former, two

are on the diagonal and the remaining two describe the big change in inversevelocity, as a result of a reallocation of money holdings when restrictions oncapital movements were lifted at 1983, which coincided with a huge drop inthe bond rate. In Chapter 5 we will re-estimate the VAR model accountingfor this intervention.

Altogether, estimating the model exclusively in differences, i.e. settingΠxt−1 = 0, would not deliver many interesting results. But, including Πxt−1in the model raises the question of how to handle the nonstationarity problem.Since a stationary process cannot be equal to a nonstationary process, theestimation results can only make sense if Πxt−1 defines stationary linearcombinations of the variables. This can be seen by noting that the first rowof Πxt−1 can be reformulated as:

−0.20(mrt−1 − 1.1yrt−1 + 0.9∆pt−1 − 30.0Rm,t−1 + 25.0Rb,t−1).

If the linear combination in the bracket defines a stationary variable thenall parts of the first equation in the system would be stationary and, therefore,balanced. This is, in a nutshell, what cointegration analysis does: it identifiesstationary linear combinations between nonstationary variables so that anI(1) model can be reformulated exclusively in stationary variables. Our taskis to give the stationary linear combinations an economically meaningfulinterpretation by imposing relevant identifying or over-identifying restrictionson the coefficients. For example the above relation may be interpreted as thedeviation of observed money holdings from a steady-state money demandrelation, mr

t−1 −mr∗t−1 where

mr∗t = 1.1y

rt−1 − 0.9∆pt−1 + 30.0Rm,t−1 − 25.0Rb,t−1.

We might now like to test whether real income has a unit coefficient, whetherthe coefficient to inflation is zero, and whether the interest rate coefficientsare equal with opposite sign. How to do it, will be discussed in Chapter 9.

Page 68: Juselius_Johansen_The Cointegrated VAR Model

68 CHAPTER 4. ESTIMATION AND SPECIFICATION

∆mr

t

∆yrt∆p2t∆Rm,t∆Rb,t

=

−0.20 0.15 0.07 −3.41 −2.220.01 0.25 0.07 −0.07 −2.24−0.04 0.08 0.22 −1.83 0.43−0.00 0.01 0.00 −0.26 0.05−0.02 0.04 −0.01 0.03 0.40

∆mrt−1

∆yrt−1∆p2t−1∆Rm,t−1∆Rb,t−1

+

+

−0.20 0.22 −0.18 6.01 −5.000.10 −0.29 −0.31 −2.35 1.110.01 −0.12 −1.74 1.46 −0.14−0.00 0.01 −0.00 −0.26 0.05−0.01 0.01 0.02 0.00 −0.13

mrt−1

yrt−1∆pt−1Rm,t−1Rb,t−1

+ ΦDt + εt

Log(Lmax) = 1973.1, log¯Ω¯= −51.2, trace correlation = 0.54, R2(LR) =

0.96, R2(LM) = 0.42, F-test on all regressors: F(50,272) =5.4. The tracecorrelation coefficient will be described in Section 4.3.2.Note that Log(Lmax) and log

¯Ω¯are exactly the same as for the unre-

stricted VAR in the previous section, demonstrating that from a likelihoodpoint of view the models are identical. Because the residuals are identicalin all the ECM representations, all residual tests or information criteria areidentical, whereas tests of the significance of single variables need not be (andoften are not). For example, the single F-tests of the lagged variables in dif-ferences, distributed as F(5,59), are very different when compared to the twosubsequent specifications, whereas the test values for the lagged variables inlevels are identical. This illustrates that the Π matrix is invariant to lineartransformations of the VAR system but not the Γ

(m)1 matrices, which depend

on how we choose m.·∆mr

t−1 ∆yrt−1 ∆p2t−1 ∆Rm,t−1 ∆Rb,t−1 mrt−1 yrt−1 ∆pt−1 Rm,t−1 Rb,t−1

2.8 3.7 1.0 0.9 2.9 3.7 4.1 16.0 4.9 4.5

¸

4.2.2 The ECM formulation with m = 2

The VAR(2) model is now specified so that Π is placed at xt−2:

∆xt = Γ(2)1 ∆xt−1 + Πxt−2 + Φ Dt + εt (4.10)

Page 69: Juselius_Johansen_The Cointegrated VAR Model

4.2. THREE DIFFERENT ECM-REPRESENTATIONS 69

with Π= I− Π1− Π2 and Γ(2)1 = (I− Π1). Thus, the Π matrix remains

unchanged, but not the Γ(2)1 matrix. The latter measures the cumulative long-

run effect, whereas Γ(1)1 in (4.9) describes ‘pure’ transitory effects measured

by the lagged changes of the variables. While the explanatory power isidentical for the two model versions, the estimated coefficients and theirp-values can vary considerably. Usually many more significant coefficientsare obtained with formulation (4.10) than with (4.9). We note that thenumber of ‘significant’ coefficients in Γ1 is larger than in Γ1, but that the Πmatrix is unchanged. Thus, many significant coefficients does not necessarilyimply high explanatory power, but may as well be a consequence of theparameterization of the model. It illustrates that the interpretation of theestimated coefficients in dynamic models is less straightforward than in staticregression models.

∆mr

t

∆yrt∆p2t∆Rm,t∆Rb,t

=

−0.40 0.36 −0.11 2.59 −7.220.11 −0.04 −0.23 −2.42 −1.13−0.03 0.04 −1.52 −0.38 0.29−0.01 0.00 0.00 −0.13 0.30−0.03 0.05 0.01 0.03 0.27

∆mrt−1

∆yrt−1∆p2t−1∆Rm,t−1∆Rb,t−1

+

+

−0.20 0.22 −0.18 6.01 −5.000.10 −0.29 −0.31 −2.35 1.110.01 −0.12 −1.74 1.46 −0.14−0.00 0.01 −0.00 −0.26 0.05−0.01 0.01 0.02 0.00 −0.13

mrt−2

yrt−2∆pt−2Rm,t−2Rb,t−2

+ ΦDt + εt

Log(Lmax) = 1973.1, log¯Ω¯= −51.2, R2(LR) = 0.96, R2(LM) = 0.42,

F-test on all regressors: F(50,272) = 5.4. F-test on single regressors:

·∆mr

t−1 ∆yrt−1 ∆p2t−1 ∆Rm,t−1 ∆Rb,t−1 mrt−2 yrt−2 ∆pt−2 Rm,t−2 Rb,t−2

6.7 5.3 33.9 1.2 4.9 3.7 4.1 16.0 4.9 4.5

¸

We note that the single F-tests, F(5,59), on the lagged variables in differ-ences have changed. For example, ∆p2t−1 is now highly significant, whereasit was completely insignificant in the case m = 1.

Page 70: Juselius_Johansen_The Cointegrated VAR Model

70 CHAPTER 4. ESTIMATION AND SPECIFICATION

4.2.3 ECM-representation in acceleration rates, changesand levels

Another convenient formulation of the VAR model is in second order differ-ences (acceleration rates), changes, and levels:

∆2xt = Γ1.1∆2xt−1 + ...+ Γ1.k−2∆2xt−2 + Γ∆xt−1 + Πxt−2 + Φ Dt + εt

(4.11)

where Γ = I− Γ1 − ...− Γk−1 = I− Π2 − 2 Π3 − ...− (k − 1) Πk, Π= I−Π1 − ...− Πk as before, and Γ1.i = Γi+2 + ...+ Γk = Π3 + 2 Π4 + ... + kΠk. Chapter 15 will show that this formulation is particularly useful when xtcontains I(2) variables, but it is in general a convenient representation whenthe sample contains periods of rapid change, so that acceleration rates (inaddition to growth rates) become relevant determinants of agents’ behavior.The VAR(2) model for the Danish data now becomes:

∆2xt = Γ∆xt−1 + Πxt−2 + Φ Dt + εt (4.12)

where Γ = I− Γ1, and Π = I− Π1− Π2. At first sight the estimates ofthe Γ1 matrix reported below look very different from the previous case. Asecond look shows that the coefficients are identical except that a constantfactor of -1 has been added to the diagonal elements. Thus, the significanceof the diagonal elements are only a consequence of applying the differenceoperator once more to ∆xt. Therefore, it may be more meaningful to testwhether the diagonal elements are significantly different from -1 (or from -2for the inflation rate) than from zero.

∆2mrt

∆2yrt∆∆2pt∆2Rm,t∆2Rb,t

=

−1.40 0.36 −0.11 2.59 −7.220.11 −1.04 −0.23 −2.42 −1.13−0.03 0.04 −2.52 −0.38 0.29−0.01 0.00 0.00 −1.13 0.30−0.03 0.05 0.01 0.03 −0.73

∆mrt−1

∆yrt−1∆2pt−1∆Rm,t−1∆Rb,t−1

+

+

−0.20 0.22 −0.18 6.01 −5.000.10 −0.29 −0.31 −2.35 1.110.01 −0.12 −1.74 1.46 −0.14−0.00 0.01 −0.00 −0.26 0.05−0.01 0.01 0.02 0.00 −0.13

mrt−2

yrt−2∆pt−2Rm,t−2Rb,t−2

+ ΦDt + εt

Page 71: Juselius_Johansen_The Cointegrated VAR Model

4.2. THREE DIFFERENT ECM-REPRESENTATIONS 71

Log(Lmax) = 1973.1, log¯Ω¯= −51.2, R2(LR) = 0.999, R2(LM) = 0.65,

F-test on all regressors: F(50,272) =11.5, F-tests on single regressors:

·∆mr

t−1 ∆yrt−1 ∆p2t−1 ∆Rm,t−1 ∆Rb,t−1 mrt−2 yrt−2 ∆pt−2 Rm,t−2 Rb,t−2

34.5 24.9 33.9 18.3 20.3 3.7 4.1 16.0 4.9 4.5

¸

The F-tests on the significance of the first five regressors have now ob-tained very large values, which is just an artifact of the ∆2 transformation,and they do not really say much about how important the lagged t− 1 vari-ables are for explaining the variation in xt.

4.2.4 The relationship between the different VAR for-mulations

We will now evaluate the above model formulations using the characteristicfunction based on the slightly more general VAR(3) model:

xt = Π1xt−1 + Π2xt−2 +Π3xt−3 + ΦDt + εt (4.13)

with the characteristic function:

Π(z) = I− Π1z− Π2z2− Π3z

3,

The ECM form of (4.13), with m = 1, is:

∆xt = Γ(1)1 ∆xt−1 + Γ

(1)2 ∆xt−2 + Πxt−1 + Φ Dt + εt (4.14)

and the characteristic function:

Π(z) = I− z− Γ(1)1 (1− z)z− Γ(1)2 (1− z)z2− Πz,

= I− (I+ Γ(1)1 + Π)z− ( Γ(1)2 − Γ(1)1 )z

2+ Γ(1)2 z

3.

The relation between the parameters of 4.13) and (4.14) can now be foundas:

Page 72: Juselius_Johansen_The Cointegrated VAR Model

72 CHAPTER 4. ESTIMATION AND SPECIFICATION

Γ(1)1 = −( Π2 + Π3),

Γ(1)2 = − Π3,

Π = −(I− Π1 − Π2 − Π3).

The ECM form of (4.13), with m = 3, is:

∆xt = Γ(3)1 ∆xt−1 + Γ

(3)2 ∆xt−2 + Πxt−3 + Φ Dt + εt (4.15)

and the characteristic function:

Π(z) = I− z− Γ(3)1 (1− z)z− Γ

(3)2 (1− z)z

2− Πz3,

= I− (I+ Γ(3)1 )z− ( Γ(3)2 − Γ

(3)1 )z

2+( Γ

(3)2 − Π)z

3.

The relationship between (4.13) and (4.15) is:

Γ(3)1 = −(I− Π1),

Γ(3)2 = −(I− Π1 − Π2),

Π = −(I− Π1 − Π2 − Π3).

In both cases the Π matrix is unchanged, but the Γ(m)i depends on the

chosen lag m of xt in the model.

4.3 Misspecification tests

After the model has been estimated the multivariate normality assumptionunderlying the VAR model can (and should) be checked against the datausing the residuals εt. In the subsequent sections we will briefly discuss someof the test procedures and information criteria contained in CATS in RATS(Hansen and Juselius, 1995) and in GiveWin (Doornik and Hendry, 2002).

4.3.1 Specification checking

It is always useful to begin the specification checking with a graphical analy-sis. Quite often the graphs reveal specification problems that the tests fail to

Page 73: Juselius_Johansen_The Cointegrated VAR Model

4.3. MISSPECIFICATION TESTS 73

discover. Figures 4.1-4.5 show the fitted and actual values of ∆xi,t (panel a),the empirical distribution compared to the normal (panel b), the residuals(panel c), and the autocorrelogram of order 20 (panel d). Figure 4.6 showsall the autocorrelations for the full system. The diagonal autocorrelogramsare defined by Corr(εxit, εxit−h), i = 1, .., 5, h = 1, ..., 18, i.e. are the sameas in Figures 4.1-4.5, panel d, whereas the off diagonal diagrams define thecross-autocorrelograms Corr(εxit, εxjt−h), i 6= j.

Actual and Fitted for DMO

75 78 81 84 87 90 93-0.100

-0.075

-0.050

-0.025

-0.000

0.025

0.050

0.075

0.100

0.125

Standardized Residuals

75 77 79 81 83 85 87 89 91 93-2.7

-1.8

-0.9

-0.0

0.9

1.8

2.7

Histogram of Standardized Residuals

0.00

0.05

0.10

0.15

0.20

0.25

0.30

0.35

0.40

0.45NormalDMO

Correlogram of residuals

Lag2 4 6 8 10 12 14 16 18

-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

Figure 4.1. Graphs of residuals from the money stock equation.

Page 74: Juselius_Johansen_The Cointegrated VAR Model

74 CHAPTER 4. ESTIMATION AND SPECIFICATION

Actual and Fitted for DFY

75 78 81 84 87 90 93-0.050

-0.025

0.000

0.025

0.050

0.075

0.100

Standardized Residuals

75 77 79 81 83 85 87 89 91 93-3

-2

-1

0

1

2

3

Histogram of Standardized Residuals

0.0

0.1

0.2

0.3

0.4

0.5NormalDFY

Correlogram of residuals

Lag2 4 6 8 10 12 14 16 18

-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

Figure 4.2. Graphs of residuals from the income equation.

Actual and Fitted for DDIFPY

75 78 81 84 87 90 93-0.06

-0.04

-0.02

0.00

0.02

0.04

0.06

Standardized Residuals

75 77 79 81 83 85 87 89 91 93-2.4

-1.6

-0.8

-0.0

0.8

1.6

2.4

3.2

Histogram of Standardized Residuals

0.00

0.05

0.10

0.15

0.20

0.25

0.30

0.35

0.40

0.45NormalDDIFPY

Correlogram of residuals

Lag2 4 6 8 10 12 14 16 18

-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

Figure 4.3. Graphs of residuals from the inflation equation.

Page 75: Juselius_Johansen_The Cointegrated VAR Model

4.3. MISSPECIFICATION TESTS 75

Actual and Fitted for DIDE

75 78 81 84 87 90 93-0.006

-0.004

-0.002

0.000

0.002

0.004

0.006

Standardized Residuals

75 77 79 81 83 85 87 89 91 93-3.2

-2.4

-1.6

-0.8

-0.0

0.8

1.6

2.4

3.2

Histogram of Standardized Residuals

0.00

0.08

0.16

0.24

0.32

0.40

0.48

0.56NormalDIDE

Correlogram of residuals

Lag2 4 6 8 10 12 14 16 18

-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

Figure 4.4. Graphs of residuals from the deposit rate equation.

Actual and Fitted for DIBO

75 78 81 84 87 90 93-0.010

-0.008

-0.006

-0.004

-0.002

0.000

0.002

0.004

Standardized Residuals

75 77 79 81 83 85 87 89 91 93-3.6

-2.7

-1.8

-0.9

-0.0

0.9

1.8

2.7

Histogram of Standardized Residuals

0.0

0.1

0.2

0.3

0.4

0.5NormalDIBO

Correlogram of residuals

Lag2 4 6 8 10 12 14 16 18

-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

Figure 4.5. Graphs of residuals from the bond rate equation.

Page 76: Juselius_Johansen_The Cointegrated VAR Model

76 CHAPTER 4. ESTIMATION AND SPECIFICATION

Cross- and autocorrelograms of the residuals

Lags 1 to 19

DMO

DFY

DDIFPY

DIDE

DIBO

DMO DFY DDIFPY DIDE DIBO

Figure 4.6. Cross- and autocorrelograms of the full system.

The graphs helps us to spot the big ‘value-added’ residual in the incomeequation and the big ‘deregulation’ residual in the bond rate equation. But,even if the graphical analysis can be a powerful tool to detect problems inmodel specification, it cannot replace formal misspecification tests. The fol-lowing multivariate and univariate residual tests will be discussed and illus-trated based on the Danish data: an LR test and three information criteriatests for the choice of lag length, a trace correlation statistic, a multivariateLM test for first and fourth order residual autocorrelation, the multivariateDoornik-Hansen test for normality, a univariate ARCH test and a univariateJarque-Bera normality test. Because the VAR estimates are more sensitiveto deviations from normality due to skewness than to excess kurtosis we alsoreport these measures.

4.3.2 Residual correlations and information criteria

The VAR model is often called a ‘reduced form’ model because it describesthe variation in xt as a function of lagged values of the process, but notof current values. This means that all information about current effects inthe data is stored in the residual covariance matrix Ω. Because correlations(standardized covariances) are easier to interpret most software programs

Page 77: Juselius_Johansen_The Cointegrated VAR Model

4.3. MISSPECIFICATION TESTS 77

(inclusive CATS and PcFiml) report correlations instead of covariances. Thecorrelation coefficients are calculated as follows:

ρij =σijpσiiσjj

, i = 1, ..., p. (4.16)

When the correlation coefficients and the residual variances (or residual stan-dard deviations) are given it is, thus, straightforward to derive the corre-sponding covariances. The estimated standardized residual covariance ma-trix for the Danish data is reported in Section 4.1.1.The residual standard errors, σi =

√σii i = 1, ..., p are reported below:

∆mr ∆yr ∆2p ∆Rm ∆Rb0.0241 0.0153 0.0132 0.0012 0.0016

Note that the residual standard errors, multiplied by 100, can be interpretedas an % error in this case because the variables are in logarithmic changes.When assessing the adequacy of the VAR specification we frequently make

use of the maximal likelihood value given by:

−2/T lnLmax = ln |Ω| + constant terms

For example, when determining the truncation lag k of the VAR model onecan use the Likelihood ratio test procedure

−2lnQ(Hk/Hk+1) = T (ln|Ωk|− ln|Ωk+1|)

where Hk is the null hypothesis that k lags are sufficient and Hk+1 is thealternative hypothesis that the VAR model needs k + 1 lags. Because theLR test is testing a p × p matrix to be zero, the test statistic −2lnQ isasymptotically distributed as χ2 with p2 degrees of freedom. The LR test ofk = 1 versus k = 2 for the Danish data becomes:

−2lnQ(H1/H2) = 77(−50.2 + 51.2) = 77,

which is distributed as χ2(25) under the null of no significant coefficients atlag 2 in the VAR model. The χ2.95(25) is approximately 35 and the null istherefore rejected.

Page 78: Juselius_Johansen_The Cointegrated VAR Model

78 CHAPTER 4. ESTIMATION AND SPECIFICATION

There are various other test procedures for the determination of the laglength and we will briefly discuss three of them, the Akaike, the Schwartzand the Hannan-Quinn information criteria. They are defined by:

AIC = ln |Ω|+ (p2k) 2T, (4.17)

SC = ln |Ω|+ (p2k)lnTT

(4.18)

HQ = ln |Ω|+ (p2k)2 ln lnTT

, (4.19)

All of them are based on the maximal value of the likelihood function with anadditional penalizing factor related to the number of estimated parameters.The suggested criteria differ regarding the strength of the penalty associatedwith the increase in model parameters as a result of adding more lags. Theidea is to calculate the test criterium for different values of k and then choosethe value of k that corresponds to the smallest value. When using thesecriteria for the choice of lag length it is important to remember that they arevalid under the assumption of a correctly specified model. If there are otherproblems with the model, such as regime shifts and non-constant parameters,then these should be accounted for prior to choosing the lag length.Section 4.1.1 showed that many of the coefficients at lag two were in-

significant in the Danish data and we may consult the information criteriato find out whether a VAR(1) would be appropriate. We report the Schwarzand Hannan-Quinn criteria for lag 1, 2, and 3 lag below:

k = 1 k = 2 k = 3Schwarz −47.68 −47.30 46.53

Hannan−Quinn −48.50 −48.58 −48.28The SC criterium suggests k = 1 and the H-Q suggests k = 2. Because

the suggested information criteria are based on different penalties of esti-mated coefficients they need not produce the same answer and often do not.Checking the other misspecification tests for k = 1 showed that all of themgot much worse as compared to k = 2.However, the graphical analysis of the previous chapter suggested the

possibility of a regime shift in the model which has not yet been tested for.Before this is done the specification tests remain tentative. At this stage wecontinue with the VAR(2) model.

Page 79: Juselius_Johansen_The Cointegrated VAR Model

4.3. MISSPECIFICATION TESTS 79

In the VAR model we can calculate an overall measure of goodness of fit,which is similar to the conventional R2 in the linear regression model. InCATS this is called ‘trace correlation’.

Trace correlation = 1− trace(Ω(V(∆xt))−1)/p,whereV(∆xt) is the covariance matrix of∆xt. In CATS it is calculated usingthe Rats instruction VCV. It can be roughly interpreted as an average R2 ofall the VAR equations. For the Danish data it is 0.54.PcFiml calculates two alternative measures called R2(LR) and R2(LM)

in the VAR model which are defined in the PcFiml manual, Section 10.8.Finally, R2 for each equation is calculated as R2i = 1−Ωii/V ar(∆xt)ii, i =

1, ..., p. For the Danish data the estimated R2i for the models in ECM form:is

∆mr ∆yr ∆2p ∆Rm ∆Rb0.75 0.36 0.75 0.49 0.44

Note that the R2 values are completely misleading when calculated for theunrestricted VAR(2) in Section 4.1.1 because the dependent variable in thiscase is a nonstationary, trending variable. The R2 compares the model’sability to explain the variation in the dependent variable as compared to thebaseline of a constant mean, i.e. R2 = 1 − Σε2i /Σ(xi − x)2. When xi,t isa nonstationary variable, the baseline hypothesis of a constant mean is nolonger appropriate. Essentially any variable will do better in this respectand the random walk hypothesis should replace the constant mean as thebaseline hypothesis.

4.3.3 Tests of residual autocorrelation

The Ljung-Box test of residual autocorrelation is given by:

Ljung Box = T (T + 2)

[T/4]Xh=1

(T − h)−1trace(Ω0hΩ−1Ω0hΩ−1) (4.20)

where Ωh = T−1PT

t=h εt ε0t−h and the residuals are from the estimated VAR

model. The Ljung-Box test is considered to be approximately distributed asχ2 with p2([T/4]− k + 1)− p2 degrees of freedom. (See Ljung & Box (1978)

Page 80: Juselius_Johansen_The Cointegrated VAR Model

80 CHAPTER 4. ESTIMATION AND SPECIFICATION

and Hosking (1980)). For the Danish data the test becomes χ2(425) = 438.8(p-value = 0.31).The LM-tests for first and fourth order autocorrelation are calculated

using an auxiliary regression as proposed in Godfrey (1988, Chapter 5). Thecovariance matrix of the residuals in the auxiliary model is calculated as:

T Ω(j) = ε0ε− ε0µ

ε0Lag jx0

¶0µ·ε0Lag jx0

¸ ·ε0Lag jx0

¸0¶−1µε0Lag jx0

¶ε, j = 1, 4.

where x0 = [x1,x2, ...,xT ], and ε0Lag j = [ε−j, ε−j+1,..., εT−j]. The first jmissing values ε−j, ..., ε−1 are set equal to 0. The LM-test is calculatedas Wilks’ ratio test with a small-sample correction. (See Anderson (1984,section 8.5.2) or Rao (1973, section 8c.5)),

LM(j) = −(T − p(k + 1)− 12) ln

Ã|Ω(j)||Ω|

!. (4.21)

The test is asymptotically χ2-distributed with p2 degrees of freedom. Thisis a fairly important test, partly because the whole VAR philosophy is basedon the idea of decomposing the variation in the data into a systematic partdescribing all the dynamics and an unsystematic random part. If the testsuggests that there are significant autocorrelations left in the model, agentsplans based on the conditional VAR expectations would have deviated sys-tematically from actual realizations. The properties of the estimators arealso sensitive to significant autocorrelations, the bigger the worse.The test statistic for the Danish data became LM(1) : χ2(25) = 17.4 (p-

value = 0.86) and LM(4) : χ2(25) = 40.2 (p-value = 0.03).

4.3.4 Tests of residual heteroscedastisity

In CATS the number of lags in the test for ARCH is equal to the number oflags in the model, and the test statistic is calculated as (T − k)×R2, whereR2 is from the auxiliary regression,

ε2it = γ0 +kXj=1

γj ε2i,t−j + error.

Page 81: Juselius_Johansen_The Cointegrated VAR Model

4.3. MISSPECIFICATION TESTS 81

The residuals from each equation are tested individually for ARCH effects.For the Danish data the tests became:

∆mr ∆yr ∆2p ∆Rm ∆Rb2.7 4.0 3.5 6.7 0.1

Only the residuals from the deposit rate equation were borderline significant.However, simulation studies have demonstrated that the (cointegrated) VARestimates are robust against moderate ARCH effects in the residuals.

4.3.5 Normality tests

The test for normality used in CATS is based on Doornik & Hansen (1994).Because the test is not very well known we explain the calculations in moredetail. The multivariate test for normality is the sum of p univariate testsbased on ‘system residuals’, where ‘system residuals’ are defined as

ut = VΛ−1V0diag(σ− 12

i )( εt − ε), (4.22)

where Λ is a diagonal matrix of eigenvalues of the correlation matrix of the(original) residuals and V are the eigenvectors, so that (4.22) is a principalcomponents decomposition of the standardized residuals.The ‘system residuals’, which are invariant to affine transformations of

each variable and reordering of the system, are uncorrelated by constructionand thereby independent under the assumption of normality. Testing for nor-mality is done by testing each of the system residual series, and the univariatetests are based on the skewness and kurtosis estimates of the residuals withsmall sample corrections as proposed in Shenton & Bowman (1977), and amodification of that test as proposed in Doornik & Hansen (1994).The skewness and kurtosis of each series is calculated as,

skewnessi =√b1i = T−1

PTt=1 u

3it, (4.23)

kurtosisi = b2i = T−1PT

t=1 u4it, (4.24)

i = 1, . . . , p.With the small sample approximations, we obtain 2p independent stan-

dard normal variables, which are squared and summed to a multivariateomnibus test for normality, so that

Normality test =

pXi=1

¡τ 21i + τ 22i

¢(4.25)

Page 82: Juselius_Johansen_The Cointegrated VAR Model

82 CHAPTER 4. ESTIMATION AND SPECIFICATION

is approximately χ2-distributed with 2p degrees of freedom. For the Danishdata the test became χ2(10) = 18.7 (p-value = 0.04). Thus, normality isborderline rejected.To be able to find out where in the system the non-normality is most

pronounced we also need to calculate the univariate Jarque-Bera test fornormality, distributed as χ2(2):

Jarq.Bera = T (skewness)2/6 + T (kurtosis− 3)2/24 a∼ χ2(2)

where skewness and excess kurtosis are defined below:

mean(εi,t) = T−1TXt=1

εi,t = 0 (4.26)

std.dev(εi,t) = σi (4.27)

skewness(εi,t) = T−1TXt=1

µεi,tσi

¶3(4.28)

kurtosis(εi,t) = T−1TXt=1

µεi,tσi

¶4(4.29)

The Jarque-Bera test of normality is based on the null hypothesis ofnormally distributed errors, under which the following applies:

(εi,t/σi)3 a∼ N(0, 6)

and

(εi,t/σi)4 a∼ N(3, 24).

Thus, the variance of skewness is smaller than the variance of kurtosis, whichmeans that the normality test is more easily rejected when the empiricaldistribution is skewed (often because of outliers in the VARmodel) than whenit is leptocurtic (thick tails or too many small residuals close to the mean).Note also that the Jarque-Bera test is χ2-distributed only asymptotically.This means that the small sample behavior may not follow a χ2 distributionvery closely.For the Danish data the test results are reported in Table 4.1:We find that the normality is rejected primarily because of non-normality

in the two interest rate equations. This is due to the big ‘deregulation’ outlierin 1983 for the bond rate and excess kurtosis for both interest rates.

Page 83: Juselius_Johansen_The Cointegrated VAR Model

4.4. CONCLUDING REMARKS 83

Table 4.1: Specification tests for the unrestricted VAR(2) model.Univariate normality tests for: ∆mr ∆yr ∆p ∆Rm ∆RbJarq.Bera(2) 1.4 1.6 2.2 6.2 5.9Skewness 0.32 -0.20 0.27 -0.05 -0.40Kurtosis 3.00 3.28 3.39 4.02 4.07

Altogether the formal misspecification tests have confirmed the findingsbased on the graphical inspection: multivariate normality is borderline re-jected due to non-normality in the two interest rates; there is some sea-sonal autocorrelation left in the residuals; the deposit rate exhibits moderateARCH effects.

4.4 Concluding remarks

All parameters of the VAR models (4.7) - (4.11) of Chapter 3 were unre-stricted and we demonstrated that OLS equation by equation produced MLestimates. In this chapter the estimates of these models showed that theunrestricted VAR models are heavily over-parametrized. This is consistentwith the discussion in Chapter 3 which showed that the VAR model is essen-tially just a convenient reformulation of the covariances of the data, some ofwhich are very small and statistically insignificant.The generality of the VAR formulation has a cost: adding one variable to a

p-dimensional VAR(k) system introduces (2p+1)×k new parameters. Whenthe sample is small, typically 50-100 in quarterly macroeconomic models,adding more variables can easily become prohibitive. However, in some casesthere can be trade-off between the number of variables in the system, p, andthe number of lags, k, needed to obtain uncorrelated residuals. Since anextra lag corresponds to p× p additional parameters one can in some casesreduce the total number of VAR parameters by adding a relevant variable tothe model.By imposing statistically acceptable restrictions on the VAR model we

hope to uncover meaningful economic models with interpretable coefficients.Such restrictions are, for example, reduced rank restrictions, zero parameterrestrictions and other linear or nonlinear parameter restrictions. This will bethe focus of the remaining chapters of this book.Even if the VAR(2) model seemed to provide a fair description of the

Page 84: Juselius_Johansen_The Cointegrated VAR Model

84 CHAPTER 4. ESTIMATION AND SPECIFICATION

information in the data, the tests and the graphs suggested some scope forimprovements. In Chapter 6 we will discuss the important role of determin-istic components as a means to improve model specification.

Page 85: Juselius_Johansen_The Cointegrated VAR Model

Chapter 5

The Cointegrated VAR Model

The purpose of this chapter is to introduce the nonstationary VAR modeland show that the nonstationarity can be accounted for by a reduced rankcondition on the long-run matrix Π = αβ0. Section 5.1 defines the concepts ofintegration and cointegration and Section 5.2 gives an intuitive interpretationof the reduced rank, r, as the number of stationary long-run relations based onthe unrestricted VAR estimates of the Danish data. Section 5.3 discusses theinterpretation of the number of unit roots, p− r, as the number of commondriving trends and shows how they can be related to the VAR model byinverting the AR lag polynomial. Based on the simple VAR(1) model Section5.4 demonstrates how the parameters of the MA representation are relatedto the AR parameters. Finally, Section 5.5 concludes the chapter with adiscussion of the cointegrated VAR model as a general framework withinwhich one can describe economic behavior in terms of pulling forces towardsequilibrium, generating stationary behavior, and pushing forces away fromequilibrium, generating nonstationary behavior.

5.1 Defining integration and cointegration

We will now show that the presence of unit roots in the unrestricted VARmodel corresponds to nonstationary stochastic behavior which can be ac-counted for by a reduced rank (r < p) restriction of the long-run levelsmatrix Π = αβ0. Johansen (1996), Chapter 3, provides a mathematicallyprecise definition of the order of integration and cointegration. Here we onlyreproduce the basic definitions:

85

Page 86: Juselius_Johansen_The Cointegrated VAR Model

86 CHAPTER 5. THE COINTEGRATED VAR MODEL

Definition 1 xt is integrated of order d if xt has the representation (1 −L)dxt = C(L)εt, where C(1) 6= 0, and εt ∼ Niid(0,σ2).Definition 2 The I(d) process xt is called cointegrated CI(d,b) with cointe-grating vector β 6= 0 if β0xt is I(d-b), b = 1,...,d, d = 1,...Cointegration implies that certain linear combinations of the variables of

the vector process are integrated of lower order than the process itself. Asalready discussed informally in Chapter 2, cointegrated variables are drivenby the same persistent shocks. Thus, if the non-stationarity of one variablecorresponds to the non-stationarity of another variable, then there exists alinear combination between them that becomes stationary. Another way ofexpressing this is that when two or several variables have common stochastic(and deterministic) trends, they will show a tendency to move together inthe long-run. Such cointegrated relations, β0xt, can often be interpreted aslong-run economic steady-state relations and are, therefore, of considerableeconomic interest. In the next section we will give an intuitive account ofsuch relations and how one can find them in the long-run matrix Π.

5.2 An intuitive interpretation of Π = αβ0

Within the VAR model the cointegration hypothesis can be formulated asa reduced rank restriction on the Π matrix defined in the previous chapter,Section 4.5. Below we reproduce the VAR(2) model in ECM form withm = 1 :

∆xt = Γ1∆xt−1 +Πxt−1 + µ+ εt (5.1)

and give the estimate of the unrestricted Π1 matrix for the Danish data(keeping in mind that it was invariant to the choice of model formulation):

Πxt =

−0.20 0.22 −0.18 6.01 −5.000.10 −0.29 −0.31 −2.35 1.110.01 −0.12 −1.74 1.46 −0.14−0.00 0.01 −0.00 −0.26 0.05−0.01 0.01 0.02 0.00 −0.13

mrt−1

yrt−1∆pt−1Rm,t−1Rb,t−1

.1Coefficients with a |t-ratio| > 0 are given in bold face.

Page 87: Juselius_Johansen_The Cointegrated VAR Model

5.2. AN INTUITIVE INTERPRETATION OF Π = αβ0 87

If xt ∼ I(1), then ∆xt ∼ I(0) implying that Π cannot have full rank as thiswould lead to a logical inconsistency in (5.1). This can be seen by consid-ering Π =I as a simple full rank matrix. In this case each equation woulddefine a stationary variable ∆xt to be equal to a nonstationary variable, xt−1,plus some lagged stationary variables Γ1∆xt−1 and a stationary error term.Hence, either Π = 0, or it must have reduced rank:

Π = αβ0

where α is a p × r matrix and β is a p × r matrix, r ≤ p. Thus, under theI(1) hypothesis the cointegrated VAR model is given by:

∆xt = Γ1∆xt−1 + ...+ Γk−1∆xt−k+1 +αβ0xt−1 + µ+ εt (5.2)

where β0xt−1 is an r × 1 vector of stationary cointegration relations. Underthe hypothesis that xt ∼ I(1) all stochastic components are stationary inmodel (5.2) and the system is now logically consistent.In Chapter 4, Section 3, we showed that the first row of Π was a linear

combination of the five variables which could be given a tentative interpre-tation as a (stationary) deviation from a long-run money demand relation,i.e. as an equilibrium error. We will now examine the implications of thisfor the full VAR model, assuming that r = 1, i.e. among the five variablesthere exists only one stationary relation, the money demand relation. Us-ing roughly the estimated coefficients of the Danish data α11 = −0.2 andβ01 = [1, −1, 0.9, −25.0, 25.0] we can approximately reproduce the first rowof Π as α11β

01. If, for simplicity, we assume that Γ1 = 0 the cointegrated VAR

model can now be written as:

∆mr

t

∆yrt∆2pt∆Rm,t∆Rb,t

=−0.2a21a31a41a51

£ mr − yr + 0.9∆p− 25(Rm −Rb)t−1¤+ µ+ εt

(5.3)

The question is now how to choose the estimates of α21, ...,α51 so that therelevant information contained in Π is preserved. An inspection of the unre-stricted Π matrix shows that the coefficients of the second row corresponding

Page 88: Juselius_Johansen_The Cointegrated VAR Model

88 CHAPTER 5. THE COINTEGRATED VAR MODEL

to real income might (with some good will) be considered proportional toβ01xt with a21 ≈ 0.10. (Whether this is so is a testable hypothesis which willbe discussed formally in Chapter 9.) None of the remaining rows seems tohave coefficients even vaguely proportional to β01xt−1 and a31 = a41 = a51 = 0seems the only possible choice. Thus for r = 1 the best representation ofΠ = αβ0 seems to be:

∆mr

t

∆yrt∆2pt∆Rm,t∆Rb,t

=−0.20.10.00.00.0

£ mr − yr + 0.9∆p− 25(Rm −Rb)t−1¤+ µ+ εt

(5.4)

Model (5.4) represents an economy where the deviation between agents’actual money holdings, mr

t−1, and their desired demand for money, mr∗t−1 =

yr−0.9∆p+25(Rm−Rb)t−1, determines the real money stock, mrt , and the

aggregate level of expenditure, yrt . When actual money holdings are above orbelow the long-run desired level, agents make gradual adjustments of theirmoney holdings (and in their expenditure level) until the level of money stockis back at the steady-state level. Because a21 > 0, real expenditure goes upwhen there is ‘excess money’ in the economy. ‘Excess money’ has no short-run or long-run impact on inflation, nor on the two interest rates. In thissimple empirical model it can be shown (see Johansen and Juselius, 2003)that the central bank would only be able to influence the level of aggregateexpenditure by changing money supply but this would have no effect oninflation rate. Thus, the specific values of the coefficients α and β can haveimportant implications for whether a chosen policy is effective or not.Under the assumption that a21 = 0, model (5.4) would be equivalent to

the single equation error-correction model. Such models have been widelyused to estimate money demand relations based on the assumption that astable relation is a prerequisite for monetary policy control. Needless tosay the implications of (5.4) as discussed above suggest that the assumptionr = 1 is a too simplistic2 to be relevant as an analytical tool for monetarypolicy decisions. I will argue that one reason why this is necessarily so is

2At this point we will disregard the possibility of extending the information set andhow this may change the interpretation of the model.

Page 89: Juselius_Johansen_The Cointegrated VAR Model

5.2. AN INTUITIVE INTERPRETATION OF Π = αβ0 89

that r = 1 presumes p− r = 4 common stochastic trends. Given the presentinformation set this seems too many to be consistent with most theoreticalassumptions underlying inflation and monetary policy control. However, itis only by formulating a model for the full system that we can examine theimplications of this seemingly innocent assumption.

The choice of r = 1 forced us to impose many restrictions on Π, suchas the proportionality of the first two rows and the zeros of the remainingrows. Thus, it does not allow for enough flexibility in the description of thefeed-back dynamics of the long-run structure. For example, assume that thecoefficient of inflation rate, 0.9, in β01xt is in fact zero, supported by its verysmall 0t0-ratio in the first row of Π. With r = 1,∆p would have to be excludedaltogether from the cointegration space, which would leave out the variableof primary interest for monetary policy control. From an econometric pointof view such a choice would be contradicted by the significant 0t0-ratio ofinflation in the third row of Π.

We will now assume that r = 2 and see how this choice can give us moreflexibility. The question is how to choose the second relation β02xt−1 so thatthe two cointegration relations together approximately describe the structureof the Π matrix. An inspection of the first two rows of the unrestricted Πshows that the proportionality assumption is probably not valid. Therefore,we can instead let the second cointegration relation describe an IS type ofrelation between real income, real money stock and the deposit rate withcoefficients approximately consistent with the second row of the Π matrix.The r = 2 system can now be represented as:

∆mr

t

∆yrt∆2pt∆Rm,t∆Rb,t

=−0.2 0

0 −0.30 00 00 0

· (mr − yr)− 25(Rm −Rb)t−1yr − 0.3mr + 7Rdt−1

¸+ µ+ εt

(5.5)

Although, (5.5) can approximately reproduce the relevant information of thefirst two rows of Π, it excludes the inflation rate from the long-run relations,which is inconsistent with its highly significant coefficient in the third rowof Π. One possibility is that inflation rate is in fact stationary by itself and,therefore, is a ‘cointegration’ vector by itself. In this case the rank of Π would

Page 90: Juselius_Johansen_The Cointegrated VAR Model

90 CHAPTER 5. THE COINTEGRATED VAR MODEL

have to be increased to r = 3 and the model would become:∆mr

t

∆yrt∆2pt∆Rm,t∆Rb,t

=

−0.2 0 0

0 −0.3 00 0 −1.740 0 00 0 0

(mr − yr)− 25(Rm −Rb)t−1yr − 0.3mr + 7Rdt−1∆pt−1

(5.6)

+

µ1µ2µ3µ4µ5

+ε1,tε2,tε3,tε4,tε5,t

Model (5.6) is now able to roughly reproduce the data information in

Πxt−1, with the exception of the ‘significant’ deposit rate in the fourth rowof Π.

The above discussion served the purpose of illustrating that the choiceof α and β has to reproduce the statistical information in the Π matrix.No testing was performed and the proposed decompositions were, therefore,completely hypothetical. Chapters 8 and 9 will discuss likelihood based testprocedures for a wide variety of hypotheses including the above mentioned.

However, the choice of α and β should also ideally describe an inter-pretable economic structure and provide some empirical insight on the ap-propriateness of the underlying economic model. The tentatively proposedcointegration relations given above are quite different from the ‘monetarytheory’ consistent cointegration relations discussed in Chapter 2 and repro-duced below. Since Chapter 2 did not discuss a hypothetical adjustmentstructure for the cointegration relations, the α coefficients reported belowhave been chosen to be roughly consistent with the predictions from mone-tarists’ theory models.

∆mr

t

∆yrt∆2pt∆Rm∆Rb,t

=−a11 a12 0

0 0 −a23a31 0 0a41 0 0a51 a52 −a53

(m− p− yr)t−1(Rm −Rb)t−1

(Rb −∆p)t−1

+µ1µ2µ3µ4µ5

+ε1,tε2,tε3,tε4,tε5,t

(5.7)

Page 91: Juselius_Johansen_The Cointegrated VAR Model

5.3. COMMON TRENDS 91

It is noteworthy that a11 = −0.20 and a12 ≈ 5 would approximatelyreproduce the money demand relation of (5.6). Thus, a row in the Π matrixcan also be found as a linear combination of several cointegration relations.In this sense the illustration of the three cointegration relations in (5.6) islikely to be too simplified. Most empirical models would require a much morecomplicated cointegration structure and without the help of sophisticated testprocedures it would be utterly hard to uncover these structures. This will bethe topic of Chapters 9 - 13.

5.3 Common trends and the moving average

representation

Chapter 3 showed that the stationary VAR model could be directly invertedinto the moving average form. When the VAR model contains unit roots theautoregressive lag polynomial becomes non-invertible. The purpose of thissection is to demonstrate how we can find the moving average representationin this case. For simplicity of notation we focus on the simple VAR(2) model:

Π(L)xt = (I−Π1L− Π2L2)xt = µ+ εt

We assume now that the characteristic polynomial |Π(z)| = |I−Π1z −Π2z2|

contains a unit root. In this case the determinant |Π(z)| = 0 and Π(z) can-not be inverted for z = 1. Therefore, we need to factorize out the unit rootcomponent of the lag polynomial of the VAR model. First we move thematrix lag polynomial to the right hand side of the equation for xt:

Π(L)xt = µ+ εtxt = Π−1(L)(µ+ εt) (5.8)

Since Π−1(L) = Πa(L)det(Π(L))

and det(Π(L)) is a polynomial in z we multiply both

sides of (5.8) with the difference operator (1− L) so that the non-invertibleunit root is cancelled out:

(1− L)xt = Π−1(L)(1− L)(εt + µ)= C(L)(εt + µ)

= (C0 +C1L+C2L2 + ...)(εt + µ)

Page 92: Juselius_Johansen_The Cointegrated VAR Model

92 CHAPTER 5. THE COINTEGRATED VAR MODEL

where C(L) is now a stationary lag polynomial and hence invertible.

C(z) = C0 +C1z +C2z2 + ...

The characteristic function of C(L) is now expanded using the Taylor rule,evaluated for z = 1 and reformulated as:

C(L) = C(1) +C∗(L)(1− L). (5.9)

By inserting (5.9) in (5.8) we get:

(1− L)xt = C+C∗(L)(1− L)(εt + µ)where C =C(1). This equation can be written as:

xs = xs−1 +Cεs +C∗(L)(εs − εs−1),= xs−1 +Cεs +Ys −Ys−1,

where Ys is a short-hand notation for the stationary part of the process,C∗(L)εs. Summing for s = 1, ..., t we get:

xt = x0 +CtXs=1

εs +Yt −Y0,

= CtXs=1

εs +Yt + ex0.Thus, ex0 contains both the initial value, x0, of the process xt and the initialvalue of the short-run dynamics C∗(L)ε0. Dividing through by (1 − L) weobtain:

xt = C

µεt

1− L¶+C∗(L)εt + ex0.

We can now formulate the VAR model in moving average form as:

xt = CΣti=1εi +C

∗(L)εt + ex0, (5.10)

showing that xt contains stochastic trends CP

ti=1εi, stationary stochastic

components C∗(L)εt, and initial values. Thus the VAR model is capableof reproducing a similar trend-cycle-irregular decomposition of the vectorprocess as was informally discussed in Chapter 2.

Page 93: Juselius_Johansen_The Cointegrated VAR Model

5.4. FROM AR TO MA 93

5.4 From the AR to the MA representation3

We showed above that the Ci matrices are functions of the Πi matrices. Wewill now illustrate how one can find the Ci matrices when β and α are knownfor the simple VAR(1) model. Johansen (1996), Chapter 4 provides a detailedand lengthy derivation of the results for the general VAR(k) model. Theinterested reader is referred to the discussion there. Although the detailedderivation of the result for the VAR(k) model is more complicated, the generalprinciple can be understood using the VAR(1) model:

∆xt = αβ0xt−1 + µ+ εt, t = 1, . . . , T, (5.11)

with initial value x0. We consider now the matrix α⊥ of full rank and di-mension p × (p − r) so that α0α⊥ = 0 and rank (α,α⊥) = p. As will bediscussed in Chapter 12 the matrix α⊥ is not uniquely defined without im-posing identifying restrictions, but the results here apply for any choice ofα⊥.We will now make use of the following relationship between α,β,α⊥,β⊥ :

β⊥ (α0⊥β⊥)

−1α0⊥ +α (β

0α)−1 β0 = I. (5.12)

Using (5.12) we can decompose any vector v inRp into a vector v1 ∈ sp(β⊥)anda vector v2 ∈ sp(α).We now apply the results to the p-dimensional vectorxt :

xt = β⊥ (α0⊥β⊥)

−1α0⊥xt +α (β0α)−1 β0xt

= ω1α0⊥xt +ω1β

0xt.(5.13)

Thus, xtcan be expressed as a linear combination of α0⊥xt and the cointe-

gration relations, β0xt and the next step is to express them as a function ofinitial values and the errors (εt, εt−1, ...).We first pre-multiply equation (5.11) with β0 and then solve for β0xt to

get the equation

β0xt = (I+ β0α)β0xt−1 + β0µ+ β0εt.

3This section relies heavily on the SJ Chapter 3.

Page 94: Juselius_Johansen_The Cointegrated VAR Model

94 CHAPTER 5. THE COINTEGRATED VAR MODEL

The eigenvalues of the matrix (I+ β0α) are inside the unit circle when ther-dimensional process β0xt is stationary and it is straightforward to representβ0xt as a function of εI, I = 1, ..., T and the constant µ :

β0xt =∞Xi=0

(I+ β0α)i β0 (εt−i + µ) . (5.14)

An expression for α0⊥xt is found by pre-multiplying (5.11) with α0⊥ and

get the equation

α0⊥∆xt = α0⊥εt +α0⊥µ,

which has the solution:

α0⊥xt = α0⊥x0 +tXi=1

α0⊥ (εi + µ) . (5.15)

Inserting (5.14) and (5.15) into (5.13) we obtain the following result:

xt = β⊥(α0⊥β⊥)

−1 £α0⊥Pti=1 (εi + µ) +α

0⊥x0

¤+α(β0α)−1

P∞i=0 (I+ β

0α)i β0 (εt−i + µ

= CPt

i=1 εi +Cµt+Cx0 +α (β0α)−1

P∞i=0 (I+ β

0α)i β0 (εt−i + µ) .

= CPt

i=1 εi + τ 1t+ τ 0 +Yt

(5.16)

where C = β⊥(α0⊥β⊥)

−1α0⊥, τ 1 = Cµ measures the slope of a linear trendin xt, τ 0 = Cx0 depends on initial values, and Yt is a stationary process.By expressing the VAR(k) model in the companion form it is possible

to derive the results for the more general case using the same principle asabove. This will not be done here but is left for the interested reader to doas an exercise. The main result relates the C matrix to all the parametersof the VAR(k) model:

C = β⊥(α0⊥Γβ⊥)

−1α0⊥ (5.17)

Page 95: Juselius_Johansen_The Cointegrated VAR Model

5.4. FROM AR TO MA 95

where Γ =I−Γ1 − ...−Γk−1 (See Johansen (1996), Chapter 4). Thus, (5.16)with Γ =I is a special case of (5.17).By expressing (5.17) as:

C = β⊥α0⊥

where β⊥= β⊥(α0⊥Γβ⊥)

−1 it is easy to see that the decomposition of the Cmatrix is similar to the one of the Π matrix, except that in the AR repre-sentation β determines the common long-run relations and α the loadings,whereas in the moving average representation α0⊥ determines the commonstochastic trends and β⊥ their loadings. Thus, (5.17) together with (5.16)show that the non-stationarity in the process xt originates from the cumu-lative sum of the p − r combinations α0⊥Σti=1εi, leading to the followingdefinition:

Definition 3 The common driving trends are the variables α0⊥P

ti=1εi.

All the remaining chapters will use results based on both the AR and theMA representation. It is, therefore, important to have a good intuition forthe meaning of the orthogonal complements. We will illustrate this with afew simple examples:Representation (5.3): r = 1 corresponds to p− r = 4, so that α⊥ and β⊥

are 5× 4 matrices. For α = [−0.1, 0.2, 0.0, 0.0, 0.0] we can find α⊥ as:

α0⊥ =

1 2 0 0 00 0 1 0 00 0 0 1 00 0 0 0 1

Xti=1εi =

P

ti=1ε1,iPti=1ε2,iPti=1ε3,iPti=1ε4,iPti=1ε5,i

(5.18)

It is easy to verify that α0⊥α =0. We also note that a zero row in α cor-responds to a unit vector in α⊥. Such a variable is called a weakly exoge-nous variable, which will be further discussed in Chapter 9. Interpretingα0⊥

Pti=1εi as an estimate of the p− r common stochastic trends shows that

the first one is a weighted sum of cumulated shocks to real money stock andreal income, whereas the next three stochastic trends are equal to the cumu-lated shocks to inflation, the deposit rate and the bond rate, respectively.

Page 96: Juselius_Johansen_The Cointegrated VAR Model

96 CHAPTER 5. THE COINTEGRATED VAR MODEL

Note that a linear combination of common stochastic trends are alsostochastic trends. Thus, (5.18) is just one of infinitely many representations.We will now similarly find β⊥ corresponding to β

0 = [1,−1, 0 , 25,−25],where for simplicity we have set the coefficient to inflation to zero.

β⊥ =

1 0 0 251 0 0 00 1 0 00 0 1 00 0 1 1

(5.19)

To be able to interpret β⊥ as loadings to the common stochastic trendswe would first need to post multiply by the component (α0⊥Γβ⊥)

−1. This willdone in subsequent chapters. We leave it to the reader to find the orthogonalcomplements to α and β for (5.5) and (5.6).

5.5 Pulling and pushing forces

4The simple error correction model (5.11) illustrates on one hand how theprocess is pulled towards steady-state, defined by β0xt−E(β0xt) = 0, with theforce α which activates as soon as the process is out of steady-state, definedby β0xt 6= E(β0xt). The common trends representation (5.16) illustrates onthe other hand how the variables move in a nonstationary manner describedby the common driving trends α0⊥

Pti=1εi. In this sense the AR and the MA

representation are two sides of the same coin: the pulling and the pushingforces of the system. Figure 5.1 illustrates these forces for a simple bivariatesystem with x0t = [mt,yt], where mt is money stock and yt is income.If the steady state position corresponds, for example, to a constant money

velocity m−y = µ, so that β0 = [1,−1], then the attractor set is β⊥ = [1, 1].In the picture this is indicated by the 450 line showing that mt = yt in steadystate. If β0xt = (mt − yt)−E(β0xt) 6= 0, then the adjustment coefficients αwill force the process back towards the attractor set with a speed of adjust-ment that depends on the length of α and the size of the equilibrium errorβ0xt. The common trend, measured by α0⊥Σ

ti=1εi, has pushed money and

income along the line defined by β⊥, the attractor set. Thus, positive shocks

4This section relies heavily on Johansen (1976) Chapter 3.

Page 97: Juselius_Johansen_The Cointegrated VAR Model

5.5. PULLING AND PUSHING FORCES 97

-

6

yt

mt

©©©©

©©©©

©©©©

©©©©

©© sp(β⊥)HHHHj·················©©©*

©©©¼ α0⊥Pt

i=1 ²i

β0xt

α

Figure 5.1: The process x0t = [mt, yt] is pushed along the attractor set bythe common trends and pulled towards the attractor set by the adjustmentcoefficients

to the system will push the process higher up along β⊥, whereas negativeshocks will move it down.

The steady-state positions β0xt−E(β0xt) = 0 describes a system at rest.At these (long-run) equilibrium points there is no economic adjustment force(incentive) to change the system to a new position. But when new (exoge-nous) shocks hit the system, causing β0xt 6= E(β0xt), the adjustment forcesare activated and pull the process back towards E(β0xt). Note, however, thatthe steady-state relations, β0xt−E(β0xt), should not be interpreted to meanthat these relations are satisfied in the limit as t → ∞. The steady-stateposition is something that exists at all time points as a rest point towardswhich the process is drawn when being pushed away.

It should be emphasized that the picture is strictly speaking only valid formodel (5.11) where the short-term dynamics have been left out. When thereis short-run adjustment dynamics in the lagged differences of the process thesituation is more complicated and the simple intuition behind the pullingand pushing forces in the above picture can be misleading. For example,Hansen and Johansen (1998), Exercise ?? gives an example where α0⊥xt isstationary, which shows that the common trends α0⊥Σ

ti=1εi cannot generally

be replaced by α0⊥xt.

Page 98: Juselius_Johansen_The Cointegrated VAR Model

98 CHAPTER 5. THE COINTEGRATED VAR MODEL

Hansen and Johansen (1998) also discusses a model with overshooting

∆x1,t = −12(x1,t−1 − x2,t−1) + ε1,t,

∆x2,t = −14(x1,t−1 − x2,t−1) + ε2,t,

i.e. a model where the variable x2,t does not error-correct with a coefficientof plausible sign, despite the fact that the variables are cointegrating. In thismodel x2,t is not error-correcting to the equilibrium error (x1,t−1−x2,t−1), butinstead is pushing the process further away from steady state. But, since, x1,tis also reacting on the same equilibrium error with a larger error correctioncoefficient, |−0.50| > |−0.25| , the process is nevertheless stable.

5.6 Concluding discussion

This chapter has shown that the notion of common trends, α0⊥P

ti=1εi, and

the notion of cointegrating relations, β0xt, are two sides of the same coin, asare the loadings coefficients, β, and the adjustment coefficients, α. Althoughwe can, of course, choose the representation we prefer, there is, nonetheless,one aspect in which the two concepts differ. A cointegration relation is invari-ant to changes in the information set, whereas this is not necessarily the casewith a common trend. If cointegration holds between a set of variables, thenthe same cointegration relation will still be found in a larger set of variables.This will be discussed in great detail in Chapter 10. An ‘unanticipated dis-turbance’, εt, defining the common trend α

0⊥P

ti=1εi, is only unanticipated

for the chosen information set. Unless the latter is complete in the senseof comprising all relevant variables, the residual, εt, will necessarily containthe effect of omitted variables. Thus an unanticipated εt based on a smallersystem need no longer be so in a larger system. Generally, both εt and α⊥change when the information set changes, implying that the definition of acommon trend is not invariant to changes in the information set, see Hendry(1995). This will be further discussed in Chapter 11.

Page 99: Juselius_Johansen_The Cointegrated VAR Model

Chapter 6

Deterministic Components inthe I(1) Model

The purpose of this chapter is to discuss the interpretation of fixed effects,such as constant, deterministic trends, and intervention dummies and showhow they affect the mean of the differenced process, E(∆xt) and the mean ofthe equilibrium error process, E(β0xt). Section 6.1 first illustrates the dualrole of the constant term and the linear trend in a simple dynamic regressionmodel. Section 6.2 then extends the discussion to the more complicated caseof the VAR model. Section 6.3 discusses five cases of different restrictionsimposed on the trend and the constant in the VAR model. Section 6.4derives the MA representation when there is a trend and a constant in theVAR model. Section 6.5 discusses the role of three different types of dummyvariables in a simple dynamic regression model and Section 6.6 extends thediscussion to the VAR model. Section 6.7 illustrates.

6.1 A trend and a constant in a simple dy-

namic regression model

Most economists are familiar with the standard regression model and howto interpret its coefficients. When dynamics are introduced, inference andinterpretation changes fundamentally and interpretational mistakes are easyto make. We will use a simple univariate model to demonstrate how theinterpretation of a linear time trend and a constant term is crucially relatedto the dynamics of the model, in particular to whether the dynamics contains

99

Page 100: Juselius_Johansen_The Cointegrated VAR Model

100 CHAPTER 6. DETERMINISTIC COMPONENTS

a unit root or not.We consider the following simple regression model for yt containing a

linear trend and a constant:

yt = γt+ ut + µ0, t = 1, ..., T (6.1)

where the residual ut is a first order autoregressive process:

ut =εt

1− ρL(6.2)

and u0 is assumed fixed. Note that the assumption (6.2) implies that (6.1)is a common factor model. As demonstrated below such a model imposesnonlinear restrictions on the parameters of the AR model and is, therefore,a special case of the general autoregressive model. Nonetheless, (6.1)-(6.2)serves the purpose of providing a pedagogical illustration of the dual roles ofdeterministic components in dynamic models.It is useful to see how the constant µ0 is related to the initial value

of yt. Using (6.1) we have that y0 = µ0 + u0. Since an economic variableis usually given in logs, the level contains information about the unit ofmeasurements (the log of 100.000 euro, say). Therefore, the value of µ0is generally dominated by y0. For practical purposes µ0 ' y0 and in thediscussion below we will set µ0 = y0 to emphasize the role of measurmentson the constant in a dynamic regression model.By substituting (6.2) in (6.1) we get:

yt = γt+εt

1− ρL+ y0 (6.3)

and by multiplying through with (1− ρL) :

(1− ρL)yt = (1− ρL)γt+ (1− ρL)y0 + εt. (6.4)

Rewriting (6.4) using Lxt = xt−1 we get:

yt = ρyt−1 + γ(1− ρ)t+ ργ + (1− ρ)y0 + εt. (6.5)

It is easy to see that the ”static” regression model (6.1) is equivalent to thefollowing dynamic regression model:

yt = b1yt−1 + b2t+ b0 + εt (6.6)

Page 101: Juselius_Johansen_The Cointegrated VAR Model

6.1. A DYNAMIC REGRESSION MODEL 101

with

b1 = ρb2 = γ(1− ρ)b0 = ργ + (1− ρ)y0.

(6.7)

We consider the following four cases:Case 1. ρ = 1 and γ 6= 0. It follows from (6.5) that ∆yt = γ+ εt, for t =

1, ..., T, i.e. ”the random walk with drift” model. Note that E(∆yt) = γ 6= 0is equivalent to yt having a linear trend, γt.Case 2. ρ = 1 and γ = 0. It follows from (6.5) that ∆yt = εt, for

t = 1, ..., T, i.e. ”the pure random walk” model. In this case E(∆yt) = 0 andyt contains no linear trend.Case 3. | ρ |< 1 and γ 6= 0 gives (6.6), i.e. yt is stationary around its

mean Eyt = a1t+ a0. We will now show that a1 = γ and a0 = y0:

Eyt = ρEyt−1 + γ(1− ρ)t+ ργ + (1− ρ)y0

a1t+ a0 = ρ(a1(t− 1) + a0) + γ(1− ρ)t+ ργ + (1− ρ)y0

a1(1− ρ)t+ a0(1− ρ) + ρa1 = γ(1− ρ)t+ ργ + (1− ρ)y0

Hence:

a1(1− ρ)t = γ(1− ρ)t

a1 = γ

and:

a0(1− ρ) + ργ = ργ + (1− ρ)y0

a0 = y0.

Thus, one should note that the coefficients in a dynamic regression modelhave to be interpreted with caution. For example b2 in (6.6) is not an estimateof the trend slope in yt and b0 is not an estimate of µ0.Case 4. | ρ |< 1 and γ = 0, gives us yt = ρyt−1 + (1− ρ)y0 + εt, where

Eyt = y0 i.e. ”the stationary autoregressive” model with a constant term.To summarize:• in the static regression model (6.1) the constant term is essentially

accounting for the unit of measurement of yt,

Page 102: Juselius_Johansen_The Cointegrated VAR Model

102 CHAPTER 6. DETERMINISTIC COMPONENTS

• in the dynamic regression model (6.5) the constant term is a weightedaverage of the growth rate γ and the initial value y0,• in the differenced model (ρ = 1) the constant term is only measuring

the growth rate, γ.

6.2 A trend and a constant in the VAR

The above results were derived for the univariate model. We will now demon-strate that one can, with some modifications, apply a similar interpretationof the deterministic components in the multivariate model.A characteristic feature of the error-correction formulation given below is

the inclusion of both differences and levels in the same model, allowing usto investigate short-run as well as long-run effects in the data. When twovariables share the same stochastic trend we showed in the previous chapterthat it is possible to find a linear combination that cancels the trend, i.e. acointegration relation. But many economic variables typically exhibit lineardeterministic growth (at least locally over the sample period) in additionto stochastic growth. Statistically it is not always straightforward to dis-tinguish between the two, especially over short sample periods. Sometimesit is preferable to approximate the trend behavior with a stochastic trend,sometimes it is better to model it with a deterministic trend and in mostcases we need a combination of the two.We call a variable which contains only a deterministic trend, but no

stochastic trend a trend-stationary variable. In the cointegrated VAR modelthe latter case can be modeled by adding a trend to the cointegration space.In other cases a linear combination between the variables removes the stochas-tic trend but not the deterministic trend and we need to add a linear trendto the cointegration relation to achieve stationarity. We call it a trend-stationary cointegration relation.The basic idea can be illustrated with a simple VAR(1) model containing

a constant, µ0, and a trend, µ1t. For notational simplicity all short-rundynamic effects, Γi, have been set to zero. Thus we consider the followingmodel in AR form:

∆xt = αβ0xt−1 + µ0 + µ1t+ εt (6.8)

and in the MA form:

Page 103: Juselius_Johansen_The Cointegrated VAR Model

6.2. A TREND AND A CONSTANT IN THE VAR 103

xt = CtXi=1

(εi + µ0+iµ1) +∞Xi=0

C∗i (εt−i + µ0+(t− i)µ1) (6.9)

Because, ∆xt and β0xt−1 are stationary we can express (6.8) as:

∆xt −E∆xt = α(β0xt−1 −E(β0xt−1)) + εt (6.10)

where

E∆xt = αE(β0xt−1) + µ0+µ1t (6.11)

and

Eβ0∆xt = β0αE(β0xt−1) + β0µ0+β0µ1t (6.12)

Eβ0xt = (1 + β0α)E(β0xt−1) + β0µ0+β0µ1t

The two (p× 1) vectors µ0 and µ1 can always be decomposed into two newvectors, so that one of them belongs to the α space, i.e. to the cointegrationrelations, and the other to the orthogonal space, i.e. to ∆xt. Since a vectorcan be decomposed in many different ways we need some principle for doingit. Generally we would like an equilibrium error to have mean zero and onepossibility is to decompose µ0 and µ1 so that (β

0xt −Eβ0xt) = 0.To achieve this when µ1 6= 0 is quite involved and we will focus here on

the case (µ0 6= 0,µ1 = 0), i.e. a constant term but no linear trend in theVAR model. In this case we can use the equality:

α(β0α)−1β0+β⊥(α0⊥β⊥)

−1α0⊥= I (6.13)

to decompose the vector µ0 into two new vectors:

µ0 = α(β0α)−1β0µ0+β⊥(α0⊥β⊥)

−1α0⊥µ0, = αβ0+γ0 (6.14)

where β0 = β0α)−1β0µ0 andγ0 = β⊥(α0⊥β⊥)

−1α0⊥µ0. We will now showthat E(β0xt + β0) = 0 and E∆xt = γ0 in (6.8) when µ1 = 0. In this case(6.9) becomes:

Page 104: Juselius_Johansen_The Cointegrated VAR Model

104 CHAPTER 6. DETERMINISTIC COMPONENTS

xt = CtXi=1

(εi + µ0) +∞Xi=0

C∗i (εt−i + µ0)

and

E∆xt = Cµ0 (6.15)

Eβ0xt = β0∞Xi=0

C∗iµ0

By noting that Eβ0∆xt = 0 in (6.12) (because Eβ0xt is constant) we obtainthe following expression for Eβ0xt :

Eβ0xt = −(β0α)−1β0µ0 = β0∞Xi=0

C∗iµ0. (6.16)

Inserting (6.15) and (6.16) into (6.10) and noting that C = β⊥(α0⊥β⊥)

−1α0⊥gives:

∆xt − β⊥(α0⊥β⊥)−1α0⊥µ0 = αβ0xt−1 +α(β0α)−1β0µ0 + εt (6.17)

∆xt − γ = α(β0xt−1 + β0) + εt

Because the l.h.s. of (6.17) has a zero mean, as has εt, E(β0xt−1 + β0) = 0

and we have shown that the decomposition (6.13) satisfies the criteriumE(β0xt + β0) = 0.Thus, the motivation for choosing the decomposition (6.13) is that when

Γ1 = 0 and µ1 = 0 we have that E(4xt) = γ0 and, hence, E(β0xt + β0) =

0. When Γ1, ...,Γk−1 6= 0 (but µ1 = 0) we can apply a slightly differentdecomposition:

ΓC+ (I− ΓC) = Γβ⊥(α0⊥Γβ⊥)

−1α0⊥ + I− Γβ⊥(α0⊥Γβ⊥)

−1α0⊥ = I (6.18)

where Γ = I− Γ1 − ...− Γk−1. However, the derivation of the mean value of∆xt becomes more complicated in the general case.

Page 105: Juselius_Johansen_The Cointegrated VAR Model

6.2. A TREND AND A CONSTANT IN THE VAR 105

When µ1 6= 0 we would need a different, more complicated, decompositionof µ0 and µ1 to achieve that the equilibrium error has a zero mean

1. In anycase a similar logic is used for the decomposition of the constant and thetrend into the space spanned by α and β⊥:

µ0 = αβ0+γ0µ1 = αβ1+γ1

(6.19)

By substituting (6.19) in (6.8) we get:

∆xt = αβ0xt−1 +αβ0 +αβ1t+ γ0+γ1t+ εt, (6.20)

and by rearranging (6.20) can be written as:

∆xt = α [β0,β0,β1]

xt−11t

+ γ0+γ1t+ εt.Thus, (??) can be reformulated as:

∆xt = αβ0xt−1 + γ0+γ1t+ εt, (6.21)

where β0= [β0,β0,β1] and xt−1 = (xt−1, 1, t)

0.The γ components can be interpreted from the equations:

E(∆xt) = γ0+γ1t, (6.22)

i.e. γ0 6= 0, implies linear growth in at least some of the variables as demon-strated in Case 1 in Section 1, and γ1 6= 0 implies quadratic trends in thevariables.From the above discussion it appears that the constant and the deter-

ministic trend play a double role in the cointegration model and we needto be able to distinguish between the part that belongs to the cointegrationrelations, and the part that belongs to the differences.

1If, nevertheless, (6.13) is used to obtain (β0,β1,γ0,γ1), the consequence is thatE(β0xt−1 + β0 + β1t) 6= 0. but, in general, the deviation from a zero mean is very small.

Page 106: Juselius_Johansen_The Cointegrated VAR Model

106 CHAPTER 6. DETERMINISTIC COMPONENTS

6.3 Five cases

In empirical work we generally do not know from the outset whether thereare linear trends in some of the variables, or whether they cancel in thecointegrating relations or not. We will demonstrate in Chapter 9 that suchhypotheses can be expressed as testable linear restrictions on the economet-ric model. The five different models discussed below arise from imposingdifferent restrictions on the deterministic components in (??).Case 1. µ1, µ0 = 0. This case corresponds to a model with no deter-

ministic components in the data, i.e. E∆xt = 0 and E(β0xt) = 0, implying

that the intercept of every cointegrating relation is zero. As demonstratedin the previous section an intercept is generally needed to account for theinitial level of measurements, X0, and only in the exceptional case whenthe measurements start from zero, or when the measurements cancel in thecointegrating relations a zero restriction can be justified.Case 2. µ1 = 0,γ0 = 0 but β0 6= 0, i.e. the constant term is restricted

to be in the cointegrating relations. In this case, there are no linear trends inthe data, consistent with E(∆xt) = 0 in (6.22). The only deterministic com-ponent in the model is the intercept of the cointegrating relations, implyingthat the equilibrium mean is different from zero.Case 3. µ1 = 0, and the constant term µ0 is unrestricted, i.e. no linear

trends in the VAR model (6.8), but linear trends in the data (6.9). In thiscase, there is no trend in the cointegration space, but E(∆xt) = γ0 6= 0,is consistent with linear trends in the variables but, since β1 = 0, thesetrends cancel in the cointegrating relations. It appears that µ0 6= 0 impliesboth linear trends in the data and a non-zero intercept in the cointegrationrelations.Case 4. γ1 = 0, but γ0,β0,β1 are unrestricted, i.e. the trend is

restricted only to appear in the cointegrating relations, but the constantis unrestricted in the model. When γ1 is restricted to zero we allow linear,but no quadratic trends, in the data. As illustrated in the previous section,E(∆xt) = γ0 6= 0, implies a linear trend in the level of xt.When, in addition,β1 6= 0 the linear trends in the variables do not cancel in the cointegrating re-lations, i.e. our model contain ‘trend-stationary’ variables or trend-stationarycointegrating relations. These can either describe a trend-stationary variableor a trend-stationary cointegration relation. Therefore, the hypothesis thata variable is trend-stationary, for example that the output gap is stationary,can be tested in this model.

Page 107: Juselius_Johansen_The Cointegrated VAR Model

6.4. THE MA REPRESENTATION 107

Case 5. No restrictions on µ0, µ1, i.e. trend and constant are unrestrictedin the model. With unrestricted parameters, the model is consistent withlinear trends in the differenced series ∆xt and, thus, quadratic trends in xt.Although quadratic trends may sometimes improve the fit within the sam-ple, forecasting outside the sample is likely to produce implausible results.Instead it seems preferable to find out what has caused this approximatequadratic growth, and if possible include more appropriate information inthe model (for example, population growth or the proportion of old/youngpeople in a population).These are only a few examples showing that the role of the deterministic

and stochastic components in the cointegrated VAR is quite complicated.Not only is a correct specification important for the model estimates and theirinterpretation, but also the asymptotic distribution of the rank test dependson the specification of these components. This will be further discussed inChapter 8.

6.4 The MA representation with determinis-

tic components

For simplicity we will here focus on the derivation of the MA representationwhen the VAR contains an unrestricted constant and a linear trend. It isstraightforward to generalize to other cases.Chapter 5 showed that the MA form of the VAR model (??) can be

obtained by inverting:

4xt = C(L)(εt + µ0+µ1t) (6.23)

= [C(1) +C∗(L)(1− L)](εt + µ0+µ1t)

and summing:

xt = C(1)(εt + µ0+µ1t)

(1− L) +C∗(L)(εt + µ0+µ1t) + X0 (6.24)

where X0 contains the effect of the initial values defined so that β0X0 = 0.

As before:

Page 108: Juselius_Johansen_The Cointegrated VAR Model

108 CHAPTER 6. DETERMINISTIC COMPONENTS

C(1) = C = β⊥(α0⊥Γβ⊥)

−1α0⊥. (6.25)

By summing and rearranging (6.24) can be written as:

xt = Cµ0t+ 0.5Cµ1t2 + 0.5Cµ1t+C

∗(L)µ1t+C∗(1)µ0| z

determ. comp.

+

+CΣεi +C∗(L)εt| z

stoch. comp.

+ X0, for t = 1, ..., T.(6.26)

Substituting (6.25) in (6.26) we get:

xt = β⊥(α0⊥Γβ⊥)

−1α0⊥£µ0t+ 0.5µ1t+ 0.5µ1t

2¤| z + [C∗(L)µ1t]| z +

+CΣεi +C∗(L)εt +C∗(1)µ0 + X0 (6.27)

Substituting (6.14) in (6.27) and focusing on the linear and quadratic trendcomponents we get:

α0⊥µ0t = α0⊥αβ0t| z 0

+α0⊥γ0t

α0⊥0.5µ1t = 0.5(α0⊥αβ1t| z 0

+α0⊥γ1t)

and

α0⊥0.5µ1t2 = 0.5(α0⊥αβ1t

2| z 0

+α0⊥γ1t2),

and the MA representation can be written as:

xt = β⊥(α0⊥Γβ⊥)

−1α0⊥γ0t+ 0.5γ1t+ 0.5γ1t2+C∗(L)µ1t++C∗(1)µ0 + CΣεi +C

∗(L)εt + X0.(6.28)

Thus, (6.28) shows that linear trends in the variables can originate from threedifferent sources in the VAR model:

Page 109: Juselius_Johansen_The Cointegrated VAR Model

6.5. DUMMY VARIABLES IN A SIMPLE REGRESSION MODEL 109

1. the α component (C∗(L)µ1t) of the unrestricted linear trend µ1t

2. the β⊥ component (γ1t) of the unrestricted linear trend µ1t

3. the β⊥ component (γ0t) of unrestricted constant term µ0.

We write (6.28) in a more compact form:

xt = Cτ 1t+ τ 2t2| z + CΣεi +C∗(L)εt| z +X0, (6.29)

where τ 1 and τ 2 can be derived from (6.28).

6.5 Dummy variables in a simple regression

model

Similarly as for the trend and the constant we consider first a simple re-gression model for yt containing three different types of dummy variables,Dst, Dpt, and Dtrt :

yt = φsDst + φpDpt + φtrDtrt + ut + y0, t = 1, ..., T (6.30)

where Dst is a mean-shift dummy (...0,0,0,1,1,1,...), Dpt is a permanent in-tervention dummy (...0,0,1,0,0,...) and Dtrt is a transitory shock dummy(...0,0,1,-1,0,0,...), and the residual ut is a first order autoregressive process:

ut =εt

1− ρL(6.31)

By substituting (6.31) in (6.30) we get:

(1− ρL)yt = φs(1− ρL)Dst + φp(1− ρL)Dpt + φtr(1− ρL)Dtrt(6.32)

+(1− ρL)y0 + εt,

yt = ρyt−1 + φsDst − ρφsDst−1 + φpDpt − ρφpDpt−1+φtrDtrt − ρφtrDtrt−1 + (1− ρ)y0 + εt,

= b1yt−1 + b2Dst + b3Dst−1 + b4Dpt + b5Dpt−1 (6.33)

+b6Dtrt + b7Dtrt−1 + b0 + εt

Page 110: Juselius_Johansen_The Cointegrated VAR Model

110 CHAPTER 6. DETERMINISTIC COMPONENTS

Thus, the ”static” regression model (6.30) with autoregressive errors cor-responds to a dynamic model with lagged dummy variables. Note that theeffects of the dummy variables can equivalently be formulated as:

φsDst − ρφsDst−1 = φsρ∆Dst + φs(1− ρ)Dst (6.34)

φpDpt − ρφpDpt−1 = φpρ∆Dpt + φp(1− ρ)Dpt (6.35)

φtrDtrt − ρφtrDtrt−1 = φtrρ∆Dtrt + φtr(1− ρ)Dtrt (6.36)

where ∆Dst now becomes an impulse dummy (...0,0,1,0,0,...) describingan permanent intervention (...0,0,1,0,0,...) and ∆Dpt becomes a transitoryblip dummy (...0,0,1,-1,0,0,...), and ∆Dtrt a double transitory blip dummy(...0,0,1,-2,1,0,0,...). Hence, (6.32) can be reformulated as:

∆yt = −(1− ρ)yt−1 + φs(1− ρ)Dst + [φsρ+ φp(1− ρ)]Dpt + (6.37)

+[φpρ+ φtr(1− ρ)]Dtrt + φtrρ∆Dtrt + εt

When ρ = 1 (6.37) becomes:

∆yt = φs∆Dst + φp∆Dpt + φtr∆Dtrt + εt, t = 2, ..., T,

= φsDpt + φpDtrt + φtr∆Dtrt + εt,

i.e. a shift in the levels of a variable becomes a ’blip’ in the differencedvariable, a permanent ’blip’ in the levels becomes a transitory blip in thedifferences, and finally a transitory blip in the levels becomes a double tran-sitory blip in the differences.

6.6 Dummy variables and the VAR

Significant interventions and reforms frequently show up as extraordinarylarge (non-normal) shocks in the VAR analysis, thus violating the normalityassumption. We will strongly argue below that it would be a mistake to treatthem exclusively as a statistical nuisance to be remedied by appropriatelycorrecting the observations. First we will illustrate the need for interventiondummies and how to model them based on the analysis of the Danish data,then we will discuss more formally how they influence the dynamics of the

Page 111: Juselius_Johansen_The Cointegrated VAR Model

6.6. DUMMY VARIABLES AND THE VAR 111

VAR model, and finally discuss their significance for the empirical modelestimates.

A graphical analysis of the residuals based on the unrestricted VAR(2)model for the Danish data showed that the temporary removal of the VATin 1975 had a very strong impact on real aggregate expenditure2 and thatthe removal of restrictions on capital movements in 1983 caused the yearlylong-term bond rate to fall with approximately 10% from a level of ca. 25%,which was internationally high. The huge drop in interest rate was associatedwith a similarly large increase in aggregate money stock as a result of thepossibility among foreigners to hold Danish krones. It is often much easierto recognize a possible outlier observation in ∆xt than in the levels xt. Thus,a first tentative decision to model a political intervention, for example witha permanent or a transitory blip dummy, can be made by examining thedifferenced process as illustrated below. The effect on the levels of the processwill be discussed subsequently.

The first intervention is transitory in the sense that the VAT was removedfor one quarter and gradually restored again over two quarters. Nonetheless,it was clearly meant to have a permanent effect on real aggregate incomeand we might hypothetically expect both a transitory and a permanent ef-fect. The removal of VAT is also likely to influence prices in various ways.For example, if we measure prices by an CPI index, then a change in VATis likely to be seen as a roughly proportional effect. But it is also pos-sible that some producers, in the expectation of a high demand for theirproducts, will increase prices to some extent. In the present illustrationwe are using the implicit price deflator of GNE as a measure of pricesso the effects are likely to be much smaller than for the CPI. The tran-sitory effect of the VAT intervention should be modelled by a transitoryblip dummy, for example Dtrt = [0, 0, 0, ...0, 1,−0.5,−0.5, 0, 0, ..., 0, 0, 0],whereas a permanent effect should be modelled by a blip dummy, for ex-ample Dpt = [0, 0, 0, ...0, 1, 0, 0, ..., 0, 0, 0]. The statistical analysis of the VARmodel can then be used to find out if the VAT intervention effects were indeedsignificant and if this was the case for which equations.

The second intervention is a result of permanently removing restrictionsin the capital market and is, therefore, first of all likely to have permanenteffects on the system. However, markets sometimes overreact and we often

2The purpose of the intervention was exactly to boost domestic demand to avoid adepression in the aftermath of the first oil shock.

Page 112: Juselius_Johansen_The Cointegrated VAR Model

112 CHAPTER 6. DETERMINISTIC COMPONENTS

see both a permanent effect and a transitory effect as a result of a significantreform or intervention in the market. The latter would show up as a shockin one period, followed shortly afterwards by a similar shock of opposite signmaking the variable return to its previous level. Such transitory shocks arequite common, in particular, for high frequency data. If they are of minorsignificance and, therefore, not explicitly modelled they are likely to producesome negative residual autocorrelation in the VAR model. But, contraryto permanent shocks they will disappear in cumulation and, therefore, willhave no effect on the stochastic trends defined as the cumulative sum of allprevious shocks.Using dummies to account for extraordinary mean-shifts, permanent blips,

and transitory shocks, the CVAR model is reformulated as:

∆xt = Γ1∆xt−1 +αβ0xt−1 + ΦsDst + ΦpDpt + ΦtrDtrt +αµ0 + εt,εt ∼ Niid(0,Ω ), t = 1, ..., T

(6.38)

where Dst is d1 × 1 vector of mean-shift dummy variables (...0,0,0,1,1,1,...),Dpt is a d2×1 vector of permanent blip dummy variables (...0,0,1,0,0,...) andDtrt is a d3×1 vector of transitory shock dummy variables (...0,0,1,-1,0,0,...).Because the VAR model contain both differences and levels of the vari-

ables the role dummy variables (and other deterministic terms) is more com-plicated than in the usual regression model (Hendry and Juselius, 2001, Jo-hansen, Nielsen, and Mosconi, 2001). An unrestricted mean shift dummyaccounts for a mean shift in ∆xt and cumulates to a broken trend in xt.An unrestricted permanent blip dummy accounts for a large blip (impulse)in ∆xt and cumulates to a level shift in xt. An unrestricted transitory blipdummy accounts for two consecutive blips of opposite signs in ∆xt and cu-mulates to a single blip in xt.To understand the role of the dummies in the CVAR model we use (6.13)

to partition the dummy effects into an α and an β⊥ component:

Φs= αδ0+δ1, (6.39)

Φp= αϕ0+ϕ1, (6.40)

Page 113: Juselius_Johansen_The Cointegrated VAR Model

6.6. DUMMY VARIABLES AND THE VAR 113

Φtr= αψ0+ψ1, (6.41)

where δ0= (β0α)−1β0Φs and δ1 = β⊥(α

0⊥β⊥)

−1α0⊥Φs and ϕ0,ϕ1,ψ0,ψ1 aresimilarly defined. It is now possible to investigate the dynamic effects ofthe dummies from the moving average representation of the model, whichdefines the variables xt as a function of εi, i = 1, ..., t, the dummy variablesDst, Dpt and Dtt, and the initial values, X0. For model (6.38) it is givenby:

xt = CPt−1

i=1 εi +CΦsPt−1

i=1Dsi +CΦpPt−1

i=1Dpi +CΦtrPt−1

i=1Dti+

C∗(L)(εt + µ0+δ0Dst+ΦpDpt+ψ1Dtt) + X0

(6.42)

where, as before

C = β⊥(α0⊥Γβ⊥)

−1α0⊥ (6.43)

and C∗(L) is an infinite polynomial in the lag operator L. The first summa-tion in (6.42) gives the common stochastic trends generated by the ordinaryshocks to the system, the second summation generates a broken linear trendin xt, the third summation a shift in the level of the variables associated withthe permanent extraordinary large shock, the forth summation a blip in thevariables. It appears from (6.43) that only the β⊥components of (6.39) -(6.41) will enter with a nonzero coefficient in the summation of the dummycomponents in (6.42), whereas the α components will have a zero coefficientand, hence, disappear. Thus, dummy variables which are restricted to be inthe cointegration relations do not cumulate in xt.Hence, to avoid broken linear trend in the data δ1 = 0 has to be imposed

in (6.39). The second part of (6.39), αδ0, is restricted to lie in the α space,i.e. in the cointegration relations. Thus, δ0 6= 0 describes a mean shift in β0xtas a result of mean shifts in the variables that do not cancel in a cointegratedrelation. A mean shift in a variable xj,t implies a permanent blip in ∆xj,tand, hence, δ0 6= 0 implies Φp 6= 0.If, Φs = 0, then there is no broken trend in the variables nor a mean shift

in the cointegration relations, but if Φp 6= 0, then ψ1 6= 0 describes levelshifts in the variables that cancel in β0xt, whereas ψ0 6= 0 describes a blip

Page 114: Juselius_Johansen_The Cointegrated VAR Model

114 CHAPTER 6. DETERMINISTIC COMPONENTS

in β0xt. A blip in the variables xt implies a transitory shock in ∆xt. Thus,ψ0 6= 0 is consistent with Φtr 6= 0, and describes a situation when the blips inthe levels of xt generated by transitory shocks to ∆xt do not cancel in β

0xt.To avoid adding more dummy components we assume here that αψ0 = 0.Thus, (6.42) shows that a large shock at time t, accounted for by the

dummiesDpt orDtrt, will influence the variables with the same dynamics asan ordinary shock unless the dummies enters the model with lags. Thus, if adummy variable needs a lag in the model we will consider the corresponding’intervention’ shock to be inherently different from the ’ordinary’ shocks,whereas if the dummy is needed only once, at the day of the ’news’, we willconsider it a big, but nevertheless ordinary, shock3.Thus, we need to make a distinction between extraordinary intervention

shocks with a permanent effect, for example as a result of central bank orgovernment interventions, and ’ordinary’ large shocks, for example as a resultof market (over)reaction to various ’news’. Conceptually we can distinguishbetween:

• ordinary (normally distributed) random shocks,

• (extra)ordinary large permanent random shocks (|εi,t|> 3.3σε) describedby a blip dummy without lags,

• intervention shocks (large permanent shocks, |εi,t| > 3.3σε, related toa well-defined intervention) described by a blip dummy with lags

• transitory large shocks, outliers (typing mistakes, etc.) described by a+/- blip dummy.

The occurrence of transitory shocks in the model, whether large or small,will produce some (usually small) residual autocorrelations in the model and,hence, violate the independence assumption of the VAR model. Becausetransitory shocks appear un-systematically this problem cannot be solvedby increasing the lag length of the VAR or by including a moving averageterm in the error process. To some extent it can be accounted for by theinclusion of transitory intervention dummies in the model. But only the verylarge transitory shocks will generally be accounted for by dummies and the

3A similar distinction is between additive outliers, which are extraordinary shocks whichare not subject to the VAR dynamics (a typical example is a typing error in the data) andwhich after they have occurred are subject to the VAR dynamics.

Page 115: Juselius_Johansen_The Cointegrated VAR Model

6.7. AN ILLUSTRATIVE EXAMPLE 115

empirical model is, therefore, likely to exhibit some minor autocorrelationsin the residuals.

Similar arguments as was made for the trend and the constant term in theVAR model can be made for intervention dummies. The intervention mayhave influenced several variables in such a way that the intervention effect iscancelled in a cointegration relation. Alternatively, the intervention may onlyhave affected one of the variables (or several variables but not proportionallywith β), so that the effect does not disappear in a cointegration relation.

Table 1 not yet finished. To be worked out later.

6.7 An illustrative example

Consistent with the discussion above we will now re-estimate the Danish VARmodel allowing for a trend restricted to lie in the cointegration space, a shiftdummyDs83t = 1 for t = 1983:1,.....,1993:4, 0 otherwise, also restricted to liein the cointegration space, an unrestricted transitory blip dummyDtr75t = 1for t = 1975:4, -0.5 for 1976:1 and 1976:2, 0 otherwise and a permanent blipdummy Dp83t = 1 for 1983:1, 0 otherwise. The Dp83t and its lagged valuewill be included in the model. In addition, three centered seasonal dummiesdefined by Dq0t = [Dq1t,Dq2t,Dq3t], where Dqit = 0.75 in quarter i, -0.25 inquarter i+1, i+2, i+3. the advantages of using centered seasonal dummiesis that

PTj=1Dqij = 0 in samples covering complete years.

The VAR model to be estimated is given by:

∆xt = Γ1∆xt−1 +αβ0xt−1 +αβ0 +αβ1t+αδ0Ds83t +

+Φp.1Dp83t + Φp.2Dp83t−1 + ΦtrD75trt + Φ0qDqt + γ0 + εt,

= Γ1∆xt−1 +αβ0xt−1 + Φp.1Dp83t + Φp.2Dp83t−1 + ΦtrD75trt + Φ0qDqt + γ0 + εt,

where

β =

ββ0β1δ0

and xt−1 =

xt−11t

Ds83t

.

Page 116: Juselius_Johansen_The Cointegrated VAR Model

116 CHAPTER 6. DETERMINISTIC COMPONENTS

Table 6.1: Specification tests for the unrestricted VAR(2) model with dum-mies.

Multivariate tests:Ljung-Box χ2(425) = 450.6 p-val. 0.19Residual autocorr.LM1 χ2(25) = 29.3 p-val. 0.25LM4 χ2(25) = 31.8 p-val. 0.16Normality: LM χ2(10) = 17.2 p-val. 0.07Univariate tests: ∆mr ∆yr ∆p ∆Rm ∆RbARCH(2) 5.0 1.9 2.2 4.1 0.2Jarq.Bera(2) 1.1 2.1 1.8 10.3 1.9Skewness 0.22 -0.37 0.26 0.12 -0.37Kurtosis 3.10 2.92 3.29 4.50 3.04The 5 largest roots of the characteristic polynomialReal 0.87 0.87 0.66 0.66 0.58Complex 0.02 -0.02 0.31 -0.31 0.00Modulus 0.88 0.88 0.73 0.73 0.58

The properties of the residuals from the estimated VAR model with dum-mies have now changed compared to the unrestricted model of Chapter 4.We will first have a look at the misspecification tests reported in Table 6.1.

It appears that the model specification has improved to some extent.For example the test for forth order autocorrelation is no longer significantand the residuals from the bond rate equation now pass the normality test.However, the ’excess’ kurtosis of the residuals from the deposit rate is nowfurther away from normality and the Jarque-Bera test clearly rejects nor-mality. This is probably due to the fact that calculated standard errors ofthe equations are now smaller and, hence, any deviations from normality willnow be measured with a more precise ’yardstick’.

We note that instead of one root almost on the unit circle we have nowtwo fairly large complex roots in which the complex part is very small. Thequestion whether they correspond to approximate unit roots will be testedin Chapter 8.

Page 117: Juselius_Johansen_The Cointegrated VAR Model

6.7. AN ILLUSTRATIVE EXAMPLE 117

∆mr

t

∆yrt∆p2t∆Rm,t∆Rb,t

=−0.29 0.17 0.10 −3.83 0.91−0.00 0.25 0.12 −1.10 −0.69−0.05 0.09 0.22 −1.93 0.20−0.01 −0.01 0.01 −0.16 0.25−0.01 0.03 −0.00 0.07 0.20

∆mrt−1

∆yrt−1∆p2t−1∆Rm,t−1∆Rb,t−1

+

+

−0.28 0.19 −0.15 4.87 −4.64 0.05 −0.0000.09 −0.26 −0.27 −2.20 1.08 0.01 −0.0000.04 −0.14 −1.76 1.73 −0.00 −0.00 −0.000−0.01 0.01 −0.00 −0.31 0.08 0.00 0.000−0.01 0.01 0.02 −0.00 −0.10 −0.00 0.000

mrt−1

yrt−1∆pt−1Rm,t−1Rb,t−1Ds83t−1t

+

0.03 0.06 0.01 0.048 −0.008 0.031 −0.090.04 −0.01 0.03 0.006 0.002 −0.000 0.34−0.01 −0.00 0.00 −0.006 −0.000 −0.008 0.170.00 0.00 −0.00 −0.001 0.000 −0.001 −0.00−0.00 −0.01 −0.00 −0.002 −0.001 −0.002 −0.01

Dtr75tDp83tDp83t−1Dq1tDq2tDq3tconst

+ εt

Ω =

1.00.23 1.0−0.19 −0.20 1.0−0.18 −0.05 0.00 1.0−0.11 0.11 −0.13 0.41 1.0

, σε =

0.02180.01430.01300.00110.0014

Log(Lmax) = 1973.1, log |Ω| = −52.0, trace correlation = 0.61,We notice that although the estimates of Π and Γ1 have not changed a

lot compared to the no-dummy VAR, the parameters of the implicit money

Page 118: Juselius_Johansen_The Cointegrated VAR Model

118 CHAPTER 6. DETERMINISTIC COMPONENTS

demand relation in the first row of the Π matrix seem now more reasonable.For example the interest rate elasticities are now more moderately sized.The shift dummy is significant in the money stock and deposit rate equa-tion which is consistent with our prior hypotheses. The linear trend seemsto be important in the income and inflation rate equation. The transitoryVAT dummy is significant only in the income equation and the 1983 blipdummy is significant in the money stock and the bond rate equation. Thelagged 1983 dummy is not significant in any of the equations and we concludethat the 1983 shock can be considered a large, but ordinary shock. Mostof the seasonal effects are quite insignificant. The residual correlations arealmost unchanged, but the estimated standard errors have decreased quitesubstantially.

6.8 Conclusions

The normality assumption of εt is frequently not satisfied in empirical VARmodels without accounting for such reforms and interventions that have pro-duced extraordinary large residuals. There are several possibilities:

1. The linear relationship of the VARmodel does not hold for large shocks:market reacts differently to ordinary and extraordinary shocks.

2. The linear relationship of the VAR model holds approximately, butthe properties of the VAR estimates are sensitive to the presence ofextraordinary large shocks. Ordinary and extraordinary shocks aredrawn from different distributions.

3. The estimates of the VAR model are generally robust to deviationsfrom normality.

It is almost impossible to know beforehand which of the three cases isclosest to the truth in a specific empirical application. It depends very muchon such aspects as the length of the sample, how frequently the outliers occurand whether positive and negative outliers occur relatively symmetrically,and so on. The best advice is to take them seriously. Neglecting the outlierproblem is likely to produce unreliable results.

Page 119: Juselius_Johansen_The Cointegrated VAR Model

Chapter 7

Estimation in the I(1) Model

We assume here that the empirical VAR model can describe the data satisfac-torily (i.e. is data congruent) so that we can proceed by testing the reducedrank conditions of Π under the assumption that Γ = I− Γ1−...− Γk−1 satisfythe I(1) condition, i.e. there are no I(2) components in the model. Section 7.1demonstrates how the short-run dynamics can be concentrated out from thegeneral VAR model, Section 7.2 derives the ML estimator of β and α, Section7.3 shows how to normalize the cointegration vectors, Section 7.4 discussesthe uniqueness of the unrestricted estimates, and Section 7.5 illustrates theestimation procedure.

7.1 Concentrating the general VAR-model

The I(1) condition can be stated as:

Π = αβ0, (7.1)

where α and β are p×r matrices. If r = p then xt is stationary and standardinference applies. If r = 0 then xt is nonstationary and it is not possibleto obtain stationary relations between the levels of the variables by linearcombinations. We say that the variables do not have any common stochastictrends and hence cannot move together in the long run. In this case theVAR model in levels becomes a VAR model in differences (without loss oflong-run information) and since ∆xt ∼ I(0) standard inference applies. Ifp > r > 0 then xt ∼ I(1) and there exists r directions into which the process

119

Page 120: Juselius_Johansen_The Cointegrated VAR Model

120 CHAPTER 7. ESTIMATION IN THE I(1) MODEL

can be made stationary by linear combinations. These are the cointegratingrelations and the question is whether they can be given an interpretation aseconomic steady-state relations.We consider now a VAR(k) model in ECM form with Π = αβ0:

∆xt = Γ1∆xt−1 + ...+ Γk−1∆xt−k+1 +αβ0xt−1 (7.2)

+ΦDt + εt,

where t = 1, ..., T, xt−1 is p1×1, p1 = p+m, m is the number of deterministiccomponents, such as constant or trend, and the initial values x−1,...,x−k areassumed fixed. In (7.2) the time series process xt is dependent on laggedvalues xt−1,...,xt−k. When we estimate the model based on a finite sampleit is useful to condition on the first k observations X0 = x−1,...,x−k, i.e.treating them as fixed known parameters. This is particularly the case whenthe data are nonstationary, simply because it is not meaningful to include themarginal probability of a nonstationary variable in the likelihood function.Note that when choosing the sample period, it is important to make surethat the first observations are not too far away from equilibrium. Otherwisethe first initial values might generate explosive roots in the data.We use the following shorthand notation:

Z0t = ∆xtZ1t = xt−1Z2t = [∆xt−1,∆xt−2, ...,∆xt−k+1,Dt],

and write (7.2) in the more compact form

Z0t = αβ0Z1t+ΨZ2t + εt,

where Ψ = [Γ1,Γ2, ...,Γk−1,Φ]. We will now concentrate out the short-run’transitory’ effects, ΨZ2t, to obtain a ’cleaner’ long-run adjustment model. Toexplain the idea of ’concentration’ which is used in many different situationsin econometrics, we will first illustrate its use in a multiple regression model.

A digression:***********************************It is well known (the Frish-Waugh theorem) that the OLS estimate of β2.1

in the linear regression model:

Page 121: Juselius_Johansen_The Cointegrated VAR Model

7.1. CONCENTRATING THE GENERAL VAR-MODEL 121

yt = β1.2x1t + β2.1x2t + εt

can be obtained in two steps:

• 1.a. Regress yt on x1t, obtaining the residual u1t from yt = b1x1t + u1t

• 1.b. Regress x2t on x1t, obtaining the residual u2t from x2t = b2x1t+u2t• 2. Regress u1t on u2t, to obtain the estimate of β2.1, i.e.:

u1t = β2.1u2t + error.

Hence, we first concentrate out the effect of x1t on both yt and x2t, and thenregress the ”cleaned” yt, i.e. u1t, on the ”cleaned” x2t, i.e. u2t.*************************************************

We now use the same idea on the VAR-model. First we define the auxil-iary regressions:

Z0t = B01Z2t+R0t

Z1t = B02Z2t+R1t(7.3)

where B01=M02M−122 and B

02 =M12M

−122 are OLS estimates andMij= Σt(ZitZ

0jt)/T .

ThusMij−ZiZ0j are the empirical counterparts of the covariance matrices Σijdiscussed in Chapter 3. The following scheme shows how these are definedfor the VAR(k) model:

∆xt ∆xt−1 · · · ∆xt−k+1 xt−1∆xt M00 M01 M02

∆xt−1... M10 M11 M12

∆xt−k+1xt−1 M20 M21 M22

Page 122: Juselius_Johansen_The Cointegrated VAR Model

122 CHAPTER 7. ESTIMATION IN THE I(1) MODEL

The concentrated model

R0t= αβ0R1t + error (7.4)

is important for understanding both the statistical and economic proper-ties of the VAR model. In the form (7.4) we have transformed the original”messy” VAR containing short-run adjustment and intervention effects into”the baby model” form, in which the adjustment exclusively takes place to-wards the long-run steady-state relations. This means that we not only havetransformed the ”dirty” empirical model into a nice statistical model but alsointo a more interpretable economic form.

7.2 Derivation of the ML estimator

Consider the concentrated model (7.4):

R0t= αβ0R1t + εt, t = 1, ..., T,

where

εt ∼ Np(0,Ω).The ML estimator is derived in two steps: First we assume that β is knownand derive an estimator of α under the assumption that β0R1t is a knownvariable. Then we insert α = α(β) in the expression for the maximum ofthe likelihood function so that it becomes a function of β, but not of α. Wethen find the value of β that maximizes the likelihood function. When wehave found the ML estimator of β we can then find α = α(β).Step 1. The ML estimator of α given β corresponds to the standard

LS estimator. It can be derived by post-multiplying (7.4) with R01tβ and

dropping the error term:

R0tR01tβ = αβ0R1tR

01tβ.

Summing over t, and dividing by T gives:

Page 123: Juselius_Johansen_The Cointegrated VAR Model

7.2. DERIVATION OF THE ML ESTIMATOR 123

S01β = αβ0S11β

where Sij = T−1ΣtRitR

0jt =Mij−Mi2M

−122M2j.

It is now easy to derive the least squares estimator of α as a function ofβ :

α(β) = S01β(β0S11β)−1 (7.5)

Step 2. Given the assumption of multivariate normality we have that themaximum of the likelihood function of (7.4) is equal to the determinant ofthe error covariance matrix as a function of fixed β and α:

L−2/Tmax (β,α) = |Ω(β,α)|+ cons tan t terms (7.6)

where

Ω(β,α) = T−1P(R0t−αβ0R1t)(R0t−αβ0R1t)

0

= T−1(PR0tR

00t−

PR0tR

01tβα

0−αβ0PR1tR00t+αβ

0PR1tR01tβα

0)

= S00−S01βα0−αβ0S10+αβ0S11βα0 (7.7)

By substituting (7.5) in (7.7) we can express the error covariance matrix asa function exclusively of β:

Ω(β) = S00−S01β(β0S11β)−1β0S10−S01β(β0S11β)−1β0S10+

S01β(β0S11β)

−1β0S11β(β0S11β)−1| z I

β0S10.

Hence:

Ω(β) = S00−S01β(β0S11β)−1β0S10 (7.8)

Page 124: Juselius_Johansen_The Cointegrated VAR Model

124 CHAPTER 7. ESTIMATION IN THE I(1) MODEL

The ML estimator of β is given by the estimate β that minimizes |Ω(β)| .To derive the estimator we use of the following result:¯

A BB0 C

¯= |A| · |C−B0A−1B| = |C| · |A−BC−1B0| (7.9)

where A, B, C are nonsingular square matrices. Now substitute:

S00 = Aβ0S11β = CS01β = B

in (7.9), resulting in:

|S00| · |β0S11β − β0S10S−100 S01β| = |β0S11β| · |S00−S01β(β0S11β)−1β0S10| z |Ω(β)|

|

Hence,

|Ω(β)| = |S00| · |β0S11β − β0S10S−100 S01β||β0S11β|

= |S00| · |β0(S11−S10S−100 S01)β|

|β0S11β|Using the result that the function

f(x) =|X0MX||X0NX|

is maximized by solving the eigenvalue problem

|ρN−M| = 0, (7.10)

we can obtain a solution for β that minimizes |Ω(β)|.We first substitute:

M = S11−S10S−100 S01N = S11X = β

Page 125: Juselius_Johansen_The Cointegrated VAR Model

7.3. NORMALIZATION 125

in (7.10) to formulate the eigenvalue problem, the solution of which gives theestimates of β:

|ρS11−S11+S10S−100 S01| = 0or equivalently:

|(1− ρ)| z λ

S11−S10S−100 S01| = 0 (7.11)

The solution gives p eigenvalues λ1, ...,λp and p eigenvectors V1, ...,Vp andwe can now express the determinant of the residual covariance matrix as:

|Ω| = |S00|pQi=1

(1− λi) (7.12)

The cointegration vectorsV0ixt are not yet normalized on a variable and in

the next section we will discuss how to choose an appropriate normalizationfor each vector. The normalized vectors will be called βi to distinguish themfrom the non-normalized vectors. Note that the relations β0ixt are orderedaccording to λ1 > ... > λp > 0 and the magnitude of λi is a measure of the’stationarity’ of the corresponding β0ixt. The next chapter will discuss howto classify the p relation into r stationary relations corresponding to the rlargest eigenvalues and p − r non-stationary relations corresponding to thep− r smallest eigenvalues.

7.3 Normalization

To be able to interpret a cointegration relation as a relation to be primarilyassociated with a particular economic variable we need to normalize theformer by setting the coefficient of the latter to be unity. This is similarto what we do in a regression model, x1t = β0+ β1x2t + β3x3t + ut, whenwe choose one of the variables, x1t, to be the ’dependent’ variable, i.e. tohave a unitary coefficient. In a regression model with stochastic variablesit might happen that we choose the ’wrong’ variable to be the dependentvariable. This would be the case if it turns out that the regression, x2t = eβ0+eβ1x1t+eβ3x3t+eut, gives more interpretable coefficient estimates and improvedstatistical properties.

Page 126: Juselius_Johansen_The Cointegrated VAR Model

126 CHAPTER 7. ESTIMATION IN THE I(1) MODEL

In an analog manner the choice of normalization of a cointegrating re-lation should make sense economically as well as statistically. For example,normalizing on an insignificant or irrelevant coefficient does not make sense,normalizing on money stock, say, in a relation describing real income seems abad choice. There is, however, an important difference between a regressionmodel and a cointegration relation. Normalizing on either x1t or x2t in the re-gression model generally changes the estimates of the regression coefficients,whereas in a cointegration relation the ratios between coefficients are thesame independent on the chosen normalization. In this sense the coefficientestimates in a cointegration relation are more ’canonical’.

7.4 The uniqueness of the unrestricted esti-

mates

For a given choice of the number of stationary cointegrating relations, r,the Johansen procedure gives the maximum likelihood estimates of the unre-stricted cointegrating relations β0xt. How to determine r will be discussed inthe next chapter. In Chapter 6 we gave these estimates a first tentative inter-pretation in terms of underlying steady-state relations. In this section we willdiscuss whether such an interpretation is at all meaningful. The unrestrictedestimates of α and β are calculated given the following conditions:

1. Stationarity, i.e. β0xt ∼ I(0).

2. Conditional independence of β0jxt, i.e. β

0S11β = I, where S11 was de-

fined at the beginning of this chapter.

3. The ordering given by the maximal conditional correlation with thestationary process ∆xt.

Given the above criteria the unrestricted cointegrating relations are uniquelydetermined and possibly meaningful if the former can be considered relevant.Because it happens quite frequently that the unrestricted cointegration rela-tions are interpretable, it may be of some interest to discuss whether the threeconditions (or rather the last two, since stationarity is clearly mandatory)are reasonable from an economic point of view.The conditional independence condition is a consequence of the chosen

eigenvalue normalization, β0S11β = I, as a result of analyzing the likelihood

Page 127: Juselius_Johansen_The Cointegrated VAR Model

7.5. AN ILLUSTRATION 127

function. It is a purely statistical condition and is arbitrary in the sense thatwe could have chosen another normalization. Because the conditional or-thogonality condition surprisingly often seem to produce economically inter-pretable relations it is tempting to look for some regularity in macroeconomicbehavior which could be associated with this purely statistical condition.If the empirical problem is about macroeconomic behavior in a market

where equilibrating forces are allowed to work without binding restrictions,at least in the long-run, one would generally expect two types of agents withdisparate goals interacting in such a way that equilibrium is restored once ithas been violated. These can be demanders versus suppliers, producers versusconsumers, employers versus employees, etc. Thus, if we choose a sufficientlyrich set of variables for the VAR analysis, economic theory would generallysuggest at least two (but often more) stationary long-run relationships. Thequestion is whether we should expect them to be conditionally independent.A somewhat heuristic guess is that the conditional independence may

produce empirically interpretable relations when the VAR model containssufficiently many variables to identify these hypothetical long-run relations.For example, to be able to empirically identify a long-run demand and asupply relation among the unrestricted cointegration relations there shouldbe at least one variable which is strongly influencing the demand behaviorbut unrelated to the supply behavior and vice versa. This is, of course, thebasic idea behind identification which will be discussed in great detail inChapter 10, where we discuss how to obtain unique estimates without theneed to imposing the condition of conditional independence.The third statistical criterion, the maximal correlation with the station-

ary part of the process, does not seem easily interpretable as a meaningfuleconomical criterion. Therefore, even if a direct interpretation of the un-restricted cointegration vectors is sometimes possible, the results should beconsidered indicative rather than conclusive, and cannot replace formal test-ing of structural hypotheses.

7.5 An illustration

Solving the eigenvalue problem (7.11) for the Danish data produced the fiveeigenvalues and the corresponding eigenvectors reported in the first partof Table 7.1. The eigenvectors are calculated based on the normalizationv0S11v = I. The ordering is based on the magnitude of λi so that the first

Page 128: Juselius_Johansen_The Cointegrated VAR Model

128 CHAPTER 7. ESTIMATION IN THE I(1) MODEL

relation v01xt is most strongly correlated with the stationary part of the pro-cess. The squared canonical correlation coefficient λ1 = 0.58 corresponds toa correlation coefficient

√0.58 = 0.76. We note that the last eigenvalue λ5 is

quite close to zero. The question is when the value of λi is small enough notto be significantly different from zero. The next chapter will deal with thisimportant and difficult issue.

For each eigenvector vi there is a corresponding vector of weights (load-ings) wi = S01vi satisfying

Ppi=1wiv

0i= Π. The coefficients of the eigenvec-

tors vi reported in Table 7.1 are generally quite large and the coefficientsof the weights wi correspondingly small. Without an adequate normaliza-tion it is hard to see what they mean and the first task is to normalize1 theeigenvectors by an element vij as follows:

βi· = viv−1ij , i = 1, ..., p,

αi· = wivij, i = 1, ..., p.

To distinguish between the non-normalized and the normalized vectors weuse the notation vi· and wi· for the former and βi· and αi· for the latter. Thenormalized vectors are reported in the middle part of Table 7.1. The firstvector has been normalized on ∆p; the second on mr; the third on Rm; thefourth on yr, and finally the fifth on Rb. The choice of normalization elementcan be done arbitrarily, but to be able to interpret the results one needs tonormalize on a variable that is ’representative’ for the relation. For example,normalizing on ∆p in the first relation means that the first relation shouldin some vague sense describe a relation for inflation rate (with significantequilibrium correction in the inflation rate equation). The first choice ofnormalization is tentative by nature and will often change as a result of moredetailed inspection of the results.

The estimated unrestricted cointegration vectors may or may not makeeconomic sense. As already mentioned they are uniquely defined based on(1) the ordering of the λi and (2) the choice of eigenvector normalizationβ0S11β = I, both of which are statistical criteria without any obvious eco-nomic interpretation. Nevertheless, we will illustrate below that a carefulinspection of the first estimation results can often be crucial for a successfulcompletion of the empirical exercise.

1We will further discuss normalization of cointegration vectors in Section 9.1.

Page 129: Juselius_Johansen_The Cointegrated VAR Model

7.5. AN ILLUSTRATION 129

Table 7.1: Estimated eigenvalues, eigenvectors, and loadings for the Danishdata

Non-normalized eigenvectors: v0iλi mr yr ∆p Rm Rb D83 trend0.58 v 01 -1.5 -7.2 -122.7 199.5 -82.6 0.3 -0.00.36 v 02 -25.2 25.5 17.2 327.1 -350.9 3.0 0.00.26 v 03 4.6 2.7 24.2 556.9 -208.8 -2.9 0.00.12 v 04 -13.3 40.7 5.1 68.2 -64.6 -1.7 0.00.06 v 05 -10.2 19.9 3.8 183.7 -339.9 -3.8 -0.0

Normalized eigenvectors: β0iβ01 0.01 0.06 1.00 -1.63 0.67 -.00 0.00β02 1.00 -1.01 -0.68 -12.97 13.92 -0.12 -0.00β03 0.01 0.00 0.04 1.00 -0.37 -0.01 0.00β04 -0.20 0.60 0.07 1.00 -0.95 -0.03 0.00β05 0.03 -0.06 -0.01 -0.54 1.00 0.01 0.00

The weights to the eigenvectors: αiα·1 α·2 α·3 α·4 α·5

∆mrt −0.26

(−0.9)−0.33(−5.6)

0.27(0.2)

−0.25(−1.6)

−0.17(−0.2)

∆yrt −0.11(−0.6)

0.05(1.1)

−1.34(−1.5)

−0.30(−2.8)

−0.41(−0.8)

∆2pt −1.62(−9.3)

0.06(1.7)

0.11(0.1)

0.09(0.9)

0.15(0.3)

∆Rm,t 0.01(0.9)

−0.00(−0.7)

−0.31(−4.4)

0.01(1.5)

−0.02(−0.5)

∆Rb,t 0.02(0.8)

0.00(0.1)

−0.04(−0.5)

0.01(0.9)

−0.11(−2.2)

The combined effects : Πmrt yrt ∆pt Rmt Rbt D83 t trend

∆mrt −0.29

(−4.0)0.18(1.5)

−0.04(−0.1)

4.82(2.9)

−4.80(3.8)

0.04(3.1)

0.00(1.7)

∆yrt 0.08(1.7)

−0.21(−2.6)

−0.21(−1.1)

−1.84(−1.7)

0.94(1.1)

0.00(0.5)

−0.00(−3.3)

∆2pt 0.03(0.7)

−0.11(−1.5)

−1.66(−9.2)

1.96(2.0)

−0.21(−0.3)

−0.00(−0.5)

−0.00(3.2)

∆Rm,t −0.01(−2.0)

0.01(1.6)

0.00(0.2)

−0.28(−3.2)

0.06(0.9)

0.001(1.7)

0.00(0.1)

∆Rb,t −0.00(−1.1)

0.01(1.6)

0.02(0.8)

0.01(0.0)

−0.10(−1.1)

−0.00(−1.5)

0.00(0.6)

Page 130: Juselius_Johansen_The Cointegrated VAR Model

130 CHAPTER 7. ESTIMATION IN THE I(1) MODEL

To be able to discriminate between significant and less significant αicoefficients we have reported the least squares standard errors of estimates.Note, however, that the ’t ’ values are distributed as Student’s t only if thecorresponding β0ixt is stationary. If this is not the case, then a Dickey-Fullertype distribution is probably more appropriate.Based on the estimated α coefficients we note that (i) the first relation

is significantly adjusting only in the inflation rate equation, (ii) the secondrelation only in the money stock equation, (iii) the third relation only in thedeposit rate equation, (iv) the fourth relation only in real income, but with at-value which would hardly be significant based on a Dickey-Fuller distribu-tion, and (v) the last relation is not really important in any equation. Sincea stationary variable, ∆xt, cannot be significantly explained by a nonsta-tionary variable, this is a sign of nonstationarity of the last two eigenvectors.The graphs in Figure 7.1-7.5 of β0ixt and β

0iR1t, where β

0iR1t is derived from

the concentrated model (7.4) seem to support this interpretation.The finding that the cointegration vectors are significant in just one equa-

tion each is a fortunate (and atypical) situation. In this case we already knowfrom the outset of the cointegration analysis that three of the variables, realmoney, inflation and deposit rate, are equilibrium error correcting, whereasthe remaining two, the real income and the bond rate, are not. This is sup-ported by the finding of no significant coefficients in the fifth row of the Πmatrix corresponding to the bond rate equation and only a significant trendcoefficient in the second row corresponding to the income equation. As a zerorow of α is the condition for weak exogeneity w.r.t. the long-run parametersβ, this tentative finding suggests that real income and bond rate are weaklyexogenous in this model, an issue that will be further discussed in Chapter9.Because each cointegration relation β0ixt was found to be important in just

one equation, we will tentatively try to interpret them as potential steady-state relations in respective equations.The first relation seems approximately to describe a relation between real

deposit rate and the interest rate spread:

(Rm −∆p) = 0.6(Rb −Rm) + ... (7.13)

It was found to be important in the inflation rate equation where the sign ofthe adjustment coefficient suggests equilibrium error correction. Hence, in-

Page 131: Juselius_Johansen_The Cointegrated VAR Model

7.5. AN ILLUSTRATION 131

flation rate seems to have adjusted upwards when the real short-term interestrate has been above 0.6 times the long-short interest rate spread.The second relation resembles a typical money demand relation where

the opportunity cost of holding money is measured by the spread betweenthe bond rate and the deposit rate:

mr = yr − 13.5(Rb −Rm) + ... (7.14)

Only money stock is significantly adjusting to this relation with a negativesign of the coefficient α12. This suggests that money stock is equilibrium errorcorrecting to agents’ demand for money.The third relation seems to describe an interest rate relation:

Rm = 0.4Rb + ... (7.15)

Only the deposit rate seems to be significantly adjusting to this relation.Again the coefficient suggests it is equilibrium error correcting.The fourth relation, which is probably nonstationary, seems most im-

portant for the real income equation. Normalizing on real income gives thefollowing relation:

yr = 0.3mr − 1.4(Rm −Rb) + ... (7.16)

which resembles an IS-curve relationship with positive real money effects.The last relation seems definitely nonstationary and we will not attempt tointerpret it.It has been debated whether it is at all meaningful even tentatively to

interpret the estimated eigenvectors and in some cases the unrestricted rela-tions are clearly not interpretable, and it does not make sense to attempt todo so. But, surprisingly often the first unrestricted estimates give a roughpicture of the basic long-run information in the data. The latter can subse-quently be used to facilitate the identification of an acceptable structure ofcointegration relations as argued below:If we assume that the cointegration rank is three in the above exam-

ple, then the first three eigenvectors (7.13) - (7.15) define stationary rela-tions (provided that the coefficients we tentatively set to zero were in fact

Page 132: Juselius_Johansen_The Cointegrated VAR Model

132 CHAPTER 7. ESTIMATION IN THE I(1) MODEL

zero). This can be formally tested based on a LR test procedure discussed inChapter 9 and if accepted we have identified three tentatively interpretablecointegration vectors spanning the cointegration space.

Assume now that the economic model we had in mind contained twosteady-state relations: a money demand relation and an aggregate incomerelation, but due to a ceteris paribus assumption no prior relation for inflationrate and the interest rates. How should we proceed after this first inspectionof the results? In my view already at this stage we need to adjust ourintuition of how the economic and the empirical model work together. Onepossibility is to go back to the economic model and see whether it is possibleto understand the long-run weak exogeneity of the real income variable andthe long-term bond rate and whether it is possible to make inflation andthe deposit rate enter the model. In some cases the economic model needsonly minor modifications, in other cases the model needs mor fundamentalchanges.

Another possibility is to reconsider the choice of variables and find outwhether an extended empirical model would be more consistent with thechosen economic model. For example in the above example we found that thefourth relation, though probably nonstationary, exhibited coefficients whichresembled an income relation. This could suggest that there is an importantomitted I(1) variable, for example real exchange rate, which is needed for theincome relation to become stationary.

Thus, a first tentative inspection of the empirical results might at anearly stage of the analysis suggest how to modify either your empirical oryour economic model. This is in my view one way of translating Haavelmo’s.... The other alternative, which is to force your economic model onto thedata, i.e. squeezing the reality into ’all-too-small-size clothes’, is a too frus-trating experience which all too often makes the desperate researcher choosesolutions which are not scientifically justified.

The last part of Table 7.1 reports the estimates of the unrestrictedΠ basedon full rank. First note that π0i.xt, i = 1, ..., p, defines a stationary relationonly when Π = αβ0 where α and β are p×r and (p +m) × r matrices andm is the number of deterministic components estimated to be proportionalto α. Therefore t-values in the brackets cannot be interpreted as Student’st, because some of the β0xt relations are nonstationary.

Page 133: Juselius_Johansen_The Cointegrated VAR Model

7.5. AN ILLUSTRATION 133

V1` * Zk(t)

74 76 78 80 82 84 86 88 90 92-6

-5

-4

-3

-2

-1

0

1

V1` * Rk(t)

74 76 78 80 82 84 86 88 90 92-3

-2

-1

0

1

2

3

Figure 7.1. The first cointegration relation β01xt (upper panel) and β01R1tcorrected for short-run effects (lower panel).

Page 134: Juselius_Johansen_The Cointegrated VAR Model

134 CHAPTER 7. ESTIMATION IN THE I(1) MODEL

V2` * Zk(t)

74 76 78 80 82 84 86 88 90 92-38.4

-37.6

-36.8

-36.0

-35.2

-34.4

-33.6

-32.8

-32.0

V2` * Rk(t)

74 76 78 80 82 84 86 88 90 92-2.7

-1.8

-0.9

-0.0

0.9

1.8

2.7

Figure 7.2. The second cointegration relation β02xt (upper panel) and β02R1tcorrected for short-run effects (lower panel).

Page 135: Juselius_Johansen_The Cointegrated VAR Model

7.5. AN ILLUSTRATION 135

V3` * Zk(t)

74 76 78 80 82 84 86 88 90 92-15

-14

-13

-12

-11

-10

-9

V3` * Rk(t)

74 76 78 80 82 84 86 88 90 92-2.4

-1.6

-0.8

-0.0

0.8

1.6

2.4

3.2

Figure 7.3. The third cointegration relation β03xt (upper panel) and β03R1tcorrected for short-run effects (lower panel).

Page 136: Juselius_Johansen_The Cointegrated VAR Model

136 CHAPTER 7. ESTIMATION IN THE I(1) MODEL

V4` * Zk(t)

74 76 78 80 82 84 86 88 90 9248

49

50

51

52

53

54

55

V4` * Rk(t)

74 76 78 80 82 84 86 88 90 92-1.8

-1.2

-0.6

0.0

0.6

1.2

1.8

2.4

3.0

Figure 7.4. The fourth cointegration relation β04xt (upper panel) and β04R1tcorrected for short-run effects (lower panel).

Page 137: Juselius_Johansen_The Cointegrated VAR Model

7.5. AN ILLUSTRATION 137

V5` * Zk(t)

74 76 78 80 82 84 86 88 90 92-10

-9

-8

-7

-6

-5

-4

V5` * Rk(t)

74 76 78 80 82 84 86 88 90 92-2.8

-2.1

-1.4

-0.7

0.0

0.7

1.4

2.1

Figure 7.5. The fifth cointegration relation β05xt (upper panel) and β05R1tcorrected for short-run effects (lower panel).

Page 138: Juselius_Johansen_The Cointegrated VAR Model

Chapter 8

Determination of CointegrationRank

Section 8.1 gives the basic results for the derivation of the Likelihood Ratiotest of the cointegration rank and discusses whether there is an optimalsequence of these tests. Section 8.2 discusses the derivation of the asymptotictables and how the presence of deterministic components influence thesetables. Section 8.3 discusses the difficult choice of the cointegration rank in apractical situation, Section 8.4 provides an empirical illustration, and Section8.5 reports on some diagnostic tools for checking parameter constancy andillustrates with an analysis of the Danish data.

8.1 The LR test for cointegration rank

The LR test for the cointegration rank r is based on the VAR model in theR-form (??), where all short-run dynamics, dummies and other deterministiccomponents have been concentrated out. Using (??) and (??) we can writethe log likelihood function as:

−2lnL(β) = Tln|S00|+ TpXi=1

ln(1− λi), (8.1)

by calculating the eigenvalues of the determinant

|λS11 − S01S−100 S10| = 0 (8.2)

139

Page 139: Juselius_Johansen_The Cointegrated VAR Model

140 CHAPTER 8. COINTEGRATION RANK

giving the solution (λ1,λ2, ...,λp).The eigenvalues λi can be interpreted asthe squared canonical correlation between linear combinations of the levelsβ0iR1t−1 and a linear combination of the differences ω

0iR0t . In this sense the

magnitude of λi is an indication of how strongly the linear relation β0iR1t−1is correlated with the stationary part of the process R0t. Another way of ex-

pressing this is by noticing that diag(λ1, ..., λr) = α0S−100 α = β0S10S

−100 S01β,

i.e. λi is related to the estimated αi. When λi = 0 the linear combinationβ0ixt is nonstationary and there is no equilibrium correction, i.e. αi = 0.The statistical problem is to derive a test procedure to discriminate be-

tween those λi, i = 1, ..., r which correspond to stationary relations and thoseλi, i = r + 1, ..., p which correspond to nonstationary relations. Becauseλi = 0 does not change the likelihood function, the maximum is exclusivelya function of the non-zero eigenvalues:

L−2/T = |S00|Πri=1(1− λi). (8.3)

Based on (8.3) it straightforward to derive a likelihood ratio test for thedetermination of the cointegration rank r, which involve the following hy-potheses:

H0(p) : rank = p, i.e. no unit roots, xt is stationaryH1(r) : rank = r, i.e. p− r unit roots, r cointegration relations, xt is non-stationary

The LR test, the so called trace test, is found as:

−2lnQ(Hr/Hp) = T ln

(|S00|(1− λ1)(1− λ2) · · · (1− λr)

|S00|(1− λ1)(1− λ2) · · · (1− λr) · · · (1− λp)

)

τ p−r = −T ln(1− λr+1) · · · (1− λp). (8.4)

As an illustration consider a VAR with p = 5 variables based on whichwe test the hypothesis H2 : r = 2, i.e. p− r = 3 against the null H5 : r = 5,i.e. p− r = 0. The test value is calculated as:

−2lnQ(H2/H5) = T ln

½|S00|(1− λ1)(1− λ2)

|S00|(1− λ1)(1− λ2)(1− λ3)(1− λ4)(1− λ5)

¾τ 3 = −T ln(1− λ3) + ln(1− λ4) + ln(1− λ5)

Page 140: Juselius_Johansen_The Cointegrated VAR Model

8.1. THE TRACE TEST 141

i.e. the LR test is a test of λ3 = λ4 = λ5 = 0, corresponding to threeunit roots in the model. If this hypothesis is correct then the test statisticshould be ”small” when compared to some critical value derived under theassumption that λ3 = λ4 = λ5 = 0. Note, however, that H2, i.e. (λ3 = λ4 =λ5 = 0) is correctly accepted also when λ2 = 0 or λ2 = λ1 = 0. Therefore, ifH2 is accepted, we conclude that there are at least 3 unit roots and, hence,at most two stationary relations.Assume that we have a prior hypothesis of the correct number of common

trends p−r∗, i.e. r∗ cointegrating relations. We could then calculate the teststatistic τ p−r∗ using (8.4) and compare it with the appropriate critical valueCp−r∗ to be discussed in the next section. If τ p−r∗ > Cp−r∗, we reject thehypothesis of p− r∗ unit roots (common trends) in the model, and concludethat they are fewer than assumed. If τ p−r∗ < Cp−r∗, we accept the hypothesisof at least p− r∗ unit roots in the model, but conclude there may be more.Hence, the trace test (8.4) does not give us the exact number of unit rootsp − r (or cointegration relations r). It only tells us whether p − r < p − r∗(r ≥ r∗) when τ p−r∗ > Cp−r∗ or alternatively p − r ≥ p − r∗ (r < r∗) whenτ p−r∗ ≤ Cp−r∗.Therefore, to estimate the value of r we have to perform a sequence of

tests. The question is whether this sequence should be from top to bottom,i.e. r = 0, p unit roots, r = 1, p − 1 unit roots, .... , r = p, 0 unitroots or the other way around.The asymptotic tables are determined so that when Hr is true then

Pp−r(τ p−r ≤ C95%(p− r)) = 95% , where τ p−r is given by:

τ p−r = −2lnQ(Hr|Hp) = −TpX

i=r+1

ln(1− λi) (8.5)

We discuss first the ’top−→bottom’ procedure and then compare it withthe ’bottom−→top’ procedure based on a simple example where p = 3.Applying the ’top−→bottom’ trace test procedure can hypothetically pro-

duce four different choices of the cointegration rank r and, hence, the numberof unit roots (p− r) :

p− r = 3, r = 0 when τ 3 ≤ C3p− r = 2, r = 1 when τ 3 > C3, τ 2 ≤ C2p− r = 1, r = 2 when τ 3 > C3, τ 2 > C2, τ 1 ≤ C1p− r = 0, r = 3 when τ 3 > C3, τ 2 > C2, τ 1 > C1 .

Page 141: Juselius_Johansen_The Cointegrated VAR Model

142 CHAPTER 8. COINTEGRATION RANK

We will now illustrate the properties of the ’top → bottom’ sequenceby investigating P1 r = i, i = 0, ..., 3 when the true value of r = 1, i.e.p− r = 2, and the size of the test is 5%. First,

P1 r = 0 = P1 τ 3 ≤ C3where

τ 3 = −Tnln(1− λ1) + ln(1− λ2) + ln(1− λ3)

o.

For λ1 > 0, we have that −T ln(1− λ1)→∞ when T →∞. Thus, P1(τ 3 ≤C3)→ 0 asymptotically. The next value p− r = 2 corresponds to r = 1, thetrue cointegration number, and P1(τ 2 ≤ C2) → 0.95 in accordance with theway the critical tables have been constructed. Thus, P1(p − r = 1) as.→ 0.05and P1(p− r = 0) as.→≤ 5%. We summarize:P1(p− r = 3, r = 0) = P1(τ 3 ≤ C3) → 0P1(p− r = 2, r = 1) = P1(τ 3 > C3, τ 2 ≤ C2) → 0.95P1(p− r = 1, r = 2) = P1(τ 3 > C3, τ 2 > C2, τ 1 ≤ C1) → 0.05P1(p− r = 0, r = 3) = P1(τ 3 > C3, τ 2 > C2, τ 1 > C1) → p0 < 0.05

Thus, by applying the ’top → bottom’ procedure we will asymptoticallyaccept the correct value of r in 95% of all cases, which is exactly what wewould like a 5% test procedure to do.We will now similarly investigate the ’bottom → top’ procedure.For p = 3 there are the following four different choices of the cointegration

rank:

p− r = 0, r = 3 when τ 1 > C1p− r = 1, r = 2 when τ 1 ≤ C1, τ 2 > C2p− r = 2, r = 1 when τ 1 ≤ C1, τ 2 ≤ C2, τ 3 > C3p− r = 3, r = 0 when τ 1 ≤ C1, τ 2 ≤ C2, τ 3 ≤ C3 .

In this case we start by testing p− r = 0, i.e. stationarity and if rejectedcontinue with p− r = 1, and so on until first acceptance.

P1(p− r = 0, r = 3) = P (τ 1 > C1) → p0 < 0.05P1(p− r = 1, r = 2) = P (τ 1 ≤ C1, τ 2 > C2) → 0.05P1(p− r = 2, r = 1) = P (τ 1 ≤ C1, τ 2 ≤ C2, τ 3 > C3) → ≤ 0.95P1(p− r = 3, r = 0) = P (τ 1 ≤ C1, τ 2 ≤ C2, τ 3 ≤ C3) → 0

Page 142: Juselius_Johansen_The Cointegrated VAR Model

8.2. THE ASYMPTOTIC TABLES 143

In this case the probability of wrongly accepting r = 2 or r = 3 is0.05 + p0 ≥ 0.05. Hence, the probability of choosing the correct value r = 1is ≤ 0.95.The reason why the ’top →down’ procedure is asymptotically more cor-

rect is because the probability of incorrectly accepting r < r∗ is asymptot-ically zero, whereas the probability of incorrectly accepting r > r∗ in the’bottom → top’ procedure is generally greater than the chosen p-value.

8.2 The asymptotic tables and the determin-

istic components

The distribution of the Likelihood Ratio test statistic (8.4) is non-standardand has been determined by simulations for the asymptotic case. The asymp-totic distributions depend on the deterministic terms in the V AR model asshown in Johansen, 1995c where a detailed treatment can be found. Herewe will only give the intuition for how the distributions have been derivedand how they have been affected by deterministic components in the VARmodel.The LR test statistic of the hypothesis H(r) against H(p) is given by

(8.4). Under the null of p − r unit roots the last p − r eigenvectors vi, i =r+1, ..., p, should behave like random walks. Therefore, if the null hypothesisis correct then the calculated trace test statistic, −T

Ppi=r+1 ln(1−λi) should

not deviate significantly from the simulated test values C∗. However, it canbe shown that the asymptotic distribution of −2lnLR(H(r)/H(p) is thesame as −2lnLR(H(0)/H(p − r). Therefore, the asymptotic tables havebeen simulated for (p− r)-dimensional VAR models where r = 0. Under thenull hypothesis of p− r unit roots, the following approximation can be used:

−Tp−rXi=1

ln(1− λi) ≈ Tp−rXi=1

λi

where

p−rXi=1

λi ≈ trace(S−111 S10S−100 S01) (8.6)

Page 143: Juselius_Johansen_The Cointegrated VAR Model

144 CHAPTER 8. COINTEGRATION RANK

Without loss of generality we let S00 = I and hence:

p−rXi=1

λi ≈ trace(S−111 S10S01) = trace(S01S−111 S10) (8.7)

Thus, the idea behind the asymptotic tables is to simulate the distributionof (8.7) by first generating a (p − r)−dimensional random walk process ofacceptable length and then replicate this process a large number of times.In the following we will discuss how to simulate the asymptotic tables for

five different assumptions on the deterministic terms in the model. They areall sub-models of the following model:

∆xt = αβ0xt−1 + µ0 + µ1t+ εt (8.8)

where xt is (p× 1),

µ0 = αβ0 + γ0 (8.9)

and

µ1 = αβ1 + γ1 (8.10)

Because the trace test is exclusively related to the non-stationary directions ofthe model we need not consider the stationary components of the model whenderiving the asymptotic distributions. Therefore, the tables are simulatedunder the assumption that there are no short-run adjustment effects, i.e.Γ1 = 0, ...,Γk−1 = 0, and that r = 0, i.e. there are no equilibrium correctionin the simulated models. Altogether the tables have been simulated for p−r =1, ..., 12 unit roots and for five different assumptions on the deterministiccomponents. Under the assumption that r = 0 in (8.8) we have that µ0 = γ0,µ1 = γ1 in (8.9) and (8.10). Thus, the process xt can be represented as:

xt =tXi=1

εi + µ0t+ µ11

2t(t+ 1) + x0 (8.11)

If µ0 and µ1 are unrestricted in (8.8) all nonstationary directions of theprocess contain stochastic trends as well as linear and quadratic deterministic

Page 144: Juselius_Johansen_The Cointegrated VAR Model

8.2. THE ASYMPTOTIC TABLES 145

time trends. If there are linear but no quadratic trends in the data, thenγ1 = 0. If there are no linear trends at all in the data, µ1 = 0 and γ0 = 0.Because the asymptotic distributions change depending on whether µ0 andµ1 are restricted or unrestricted in the model a correct specification of thedeterministic components in the VAR model is crucial for correct inference.The asymptotic tables reported in Johansen (1976) and reproduced in the

Appendix have been derived from 6000 replications of p - dimensional ran-dom walk processes, ∆yt = εt, p = 1, ..., 12 and εt ∼ Np(0, I), t = 1, ..., 400.As discussed in Chapter 7 the reduced rank regression is based on the co-variance matrices Sij, i, j = 0, 1 reproduced below:

S11 = T−1PT

t=1R1,tR01,t

S01 = T−1PT 0

t=1R0,tR01,t

S00 = T−1PT

t=1R0,tR00,t

(8.12)

where R1,t = xt−1 − b10 − b11t and R0,t = ∆xt − b00 − b01t are the residualsin the auxiliary regressions of xt−1 and ∆x in (8.8). Because the simulatedVAR model does not contain any short-run effects, xt−1 and ∆xt will only becorrected in those versions of the model which contain an unrestricted trendand/or a constant.We will now show how to simulate the asymptotic tables using an example

where p− r = 3. The following five VAR models with different assumptionson the deterministic terms have been simulated:

Case 1. µ0, µ1 = 0.

This corresponds to the VAR model:

∆xt = αβ0xt−1 + εt,

which under the assumption that αβ0 = 0 gives:

xt =tXi=1

εi + x0.

There are no deterministic terms in this model and R1,t and R0,t arespecified as:

Page 145: Juselius_Johansen_The Cointegrated VAR Model

146 CHAPTER 8. COINTEGRATION RANK

R1,t =

Pt−1i=1 ε1iPt−1i=1 ε2iPt−1i=1 ε3i

and R0,t =

ε1tε2tε3t

, t = 1, ..., T.Case 2. µ0 6= 0, with γ0 = 0, and β0 6= 0, µ1 = 0.

This corresponds to the VAR model:

∆xt = α£β0 β0

¤ · xt−11

¸+ εt,

which under the assumption that αβ0 = 0 gives:

xt =tXi=1

εi + x0

There are no linear trends in the model, nor in the data, but the cointe-grating relations have an intercept term, so:

R1,t =

Pt−1

i=1 ε1iPt−1i=1 ε2iPt−1i=1 ε3i

1

and R0,t =

ε1tε2tε3t

, t = 1, ..., T.

Case 3. µ1 = 0, µ0 is unrestricted.

This corresponds to the VAR model:

∆xt = αβ0xt−1 + µ0 + εt

with the vectors of corrected residuals: R1,t = xt−1− b01 and R0,t = ∆xt−1−b00. Under the assumption that αβ

0 = 0 and µ0 6= 0 the process xt is givenby:

xt =tXi=1

εi + µ0t+ x0 (8.13)

Page 146: Juselius_Johansen_The Cointegrated VAR Model

8.2. THE ASYMPTOTIC TABLES 147

There are no linear trends in the VAR model but, because of the unrestrictedconstant µ0, the data contain linear trends. In the auxiliary regression ofxt−1 on the unrestricted constant in the VAR model, the constant x0 in(8.13) cancels and R1,t will contain stochastic trends

Pt−1i=1 εi and a linear

trend t but no constant. Since a linear time trend asymptotically dominatesa stochastic trend, the (p − r) = 3 nonstationary directions of the processare decomposed into (p − r − 1) = 2 directions that contain the correctedstochastic trends and one direction that contains the linear trend. The tablesare based on:

R1,t =

(Pt−1i=1 ε1i) | 1

(Pt−1

i=1 ε2i) | 1t | 1

and R0,t =

ε1tε2tε3t

, t = 1, ..., T.Case 4. β1 6= 0, γ1 = 0, µ0 is unrestricted.

This corresponds to the VAR model:

∆xt = α£β0 β1

¤ · xt−1t

¸+ µ0 + εt

with the vectors of corrected residuals given by : R1,t = [x0t−1, t]

0 − b01 andR0,t = ∆xt−1 − b00. Under the assumption that αβ0 = 0 the process xtbecomes:

xt =tXi=1

εi + µ0t+ x0

In this case we have allowed for a linear trend both in the data and in thecointegration relations, but have restricted the quadratic trend to be zero.Because the constant x0 cancels in the regression of xt−1 on a constant, R1,tcontain three corrected stochastic trends and a linear trend but no constant.

R1,t =

Pt−1

i=1 ε1i | 1Pt−1i=1 ε2i | 1Pt−1i=1 ε3i | 1

trend | 1

and R0,t = ε1t

ε2tε3t

, t = 1, ..., T.

Page 147: Juselius_Johansen_The Cointegrated VAR Model

148 CHAPTER 8. COINTEGRATION RANK

Case 5. µ1, µ0 are unrestricted.

This corresponds to the VAR model:

∆xt = αβ0xt−1 + µ0 + µ1t+ εt

with the vectors of corrected residuals: R1,t = xt−1 − b01 − b11t and R0,t =∆xt−1− b00− b10t. Under the assumption that αβ0 = 0 and µ1 6= 0 the processxt becomes:

xt =tXi=1

εi + µ0t+ µ11

2t(t+ 1) + x0 (8.14)

This model allows for linear trends and quadratic trends in the data aswell as linear trends in the cointegrating relations. Because the constant andthe linear trend in (8.14) cancel in the regression of xt−1 on the unrestrictedconstant and the linear trend, only the stochastic trends,

Pt−1i=1 εi, and the

quadratic trend, t2, are left in R1,t. Furthermore, because a quadratic timetrend asymptotically dominates a linear stochastic trend, the (p − r) = 3nonstationary directions of the process are decomposed into (p− r − 1) = 2directions which contain the corrected stochastic trends and one directionwhich contains the quadratic trend.

R1,t =

(Pt−1i=1 ε1i) | 1, t

(Pt−1

i=1 ε2i) | 1, tt2 | 1, t

and R0,t =

ε1tε2tε3t

, t = 1, ..., T.As an example let us consider a VARmodel with an unrestricted constant.

In this model we would like to test the hypothesis of p − r = 3 unit roots.This case corresponds to the third row of Table A.3 and the first test valuecorresponds to the 50% quantile of the 5000 test statistics calculated fromthe simulated VAR model with p − r = 3 unit roots and an unrestrictedconstant. Thus, in 2500 cases the trace test statistic was smaller than 18.65and in 4750 cases smaller than 29.38, the 95% quantile.We have shown that the asymptotic distributions depend on whether

there is a constant and/or a trend in the VAR model and whether they areunrestricted or not. However, other deterministic components, such as inter-vention dummies, are also likely to influence the shape of the distributions.

Page 148: Juselius_Johansen_The Cointegrated VAR Model

8.2. THE ASYMPTOTIC TABLES 149

In particular, care should be taken when a deterministic component gener-ates trending behavior in the levels of the data. A typical example is anunrestricted shift dummy (· · · ,0,0,0,1,1,1,· · · ) which cumulates to a brokenlinear trend in the data. An detailed discussion of this case can be foundin Johansen, Nielsen, and Mosconi (2000), Juselius (2000), and Doornik,Hendry, and Nielsen (1998).

Because the asymptotic distributions for the rank test depend on thedeterministic components in the model and whether these are restricted orunrestricted, the rank and the specification of the deterministic componentshave to be determined jointly. Alternatively the deterministic componentshave to be removed from the model prior to testing. Nielsen and Rahbek(1998) have demonstrated that a test procedure based on a model formu-lation that allows a deterministic variable, Dt, to be in the cointegrationrelations and its difference, ∆Dt, to be in the VAR equations gives similarityin the test procedure. Assume, for example, that the data contain a lineartrend t so that E[∆xt] = γ0 6= 0. In this case we need to include an unre-stricted constant term µ0 = αβ0 + γ0 in the VAR model (c.f. (??) and thediscussion in the previous chapter) to account for the linear growth in thedata. However, a linear trend in the variables need not cancel in the cointe-grating relations and we need to allow for the possibility of trend-stationarycointegration relations. This is achieved by allowing the linear trend to enterthe cointegrating relations, i.e. β1 6= 0 in (??). Thus, to achieve similarity inthe test procedure the linear trend t has to be restricted to the cointegrationrelations and a constant term (i.e. the difference of t) has to be unrestrictedin the model.

To summarize: Given linear trends in the data, case 4 (see also Chapter6, Section 3) is generally the best specification to start with unless we have astrong prior that the linear trends cancel in the cointegration relations. Thisis because case 4 allows for trends both in the stationary and nonstationarydirections of the model and, hence, similarity in the test procedure. Whenthe rank has been determined it is always possible to test the hypothesisβ1 = 0, as a linear hypothesis on the cointegrating relations. This will beillustrated in the next chapter. Given no linear trends in the data, case 2 isthe appropriate specification (unless exceptionally the cointegration relationscan be assumed to have a zero mean)

Page 149: Juselius_Johansen_The Cointegrated VAR Model

150 CHAPTER 8. COINTEGRATION RANK

8.3 The cointegration rank: a difficult and

crucial choice

The cointegration rank divides the data into r relations towards which theprocess is adjusting and p− r relations which are pushing the process. Theformer will be given an interpretation as equilibrium errors (deviations fromsteady-state) and the latter as common driving trends in the system. Hence,the choice of r will influence all subsequent econometric analysis and will becrucial for conclusions we draw on our economic hypotheses. In the previoussection we showed that the asymptotic distributions depend on the determin-istic components in the VAR model and that the stationary short-run effectsΓ1∆xt−1+ ...+ Γk−1∆xt−k+1 do not matter asymptotically. In small samplesthese effects are in most cases important. Johansen (2002) demonstratedthat the closer the VAR model is to the I(2) boundary the more importantis the short-term dynamic effects. In many cases the proper solution is touse bootstrap methods to determine the critical values (ref.). The idea is tosimulate tables for models mimicking the short-run dynamics of the empiricalmodel.

Unfortunately there is no clear answer to the question of how many obser-vations we need for the asymptotic results to hold sufficiently well. Whetherthe sample is ’small’ or ’big’ is not exclusively a function of the number ofobservations available in the sample but also of the information in the data.If the data are very informative about a hypothetical long-run relation (β0xt,i.e. the equilibrium error crosses the mean line several times over the sampleperiod) then we might have good test properties even if the sample periodis relatively short. If the estimated eigenvalues are empirically informativein the sense of being either very high or very low the trace test is likely toperform well. Note, however, that a high value of λi can also be an indicationof a small ratio between the number of estimated parameters and the numberof observations. If some of the estimated eigenvalues are in the region whereit is hard to discriminate between significant and insignificant eigenvaluesthe trace test will usually have low power for near unit root alternatives. Alow power of the test is a sign that the data are not very informative aboutthe cointegration rank.

Thus, we may have a problem both with the size and the power of the testwhen determining the rank. In the ideal case we would like the probabilityto reject a correct null hypothesis (r = r∗) to be small and the probability

Page 150: Juselius_Johansen_The Cointegrated VAR Model

8.3. CHOOSING THE RANK 151

to accept a correct alternative hypothesis (r 6= r∗) to be high for relevanthypotheses in the ’near unit root’ region. For example, if the adjustment backto equilibrium is very slow then the correct hypothesis would be a stationarybut near unit root.Many simulation studies have demonstrated that the asymptotic distribu-

tions can be poor approximations to the true distributions when the samplesize is small resulting in substantial size and power distortions. Small sam-ple corrections have been developed in Johansen (2002). For moderatelysized samples (50-70) typical of many empirical models in economics thesecorrections can be substantial.While applying a small sample correction to the trace test statistics leads

to a more correct size, it does not solve the power problem. In some casesthe size of the test and the power of alternative hypotheses close to the unitcircle are almost of the same magnitude. In such cases a 5% test procedurewill reject r = r∗ incorrectly in 5 % of all the cases where r∗ is the true value.It will also incorrectly accept r = r∗ in say 90% of the cases when the truevalue of r is greater than r∗. This is particularly worrying when the null of aunit root is not a natural economic hypothesis.As discussed above the trace test is based on a sequence of tests. In

the ’top-down’ case we test the hypothesis ”p unit roots” and, if rejected,continues until first acceptance of p − r unit roots. Whether we choose the’top-down’ or the ’down-top’, the test procedure is essentially based on theprinciple of ”no prior economic knowledge” regarding the rank r. Thisis in many cases difficult to justify. For example, in the monetary modelof Chapter 2 we demonstrated that the hypothesis (r = 3, p − r = 2) wasa priori consistent with two types of autonomous shocks, one shifting theaggregate demand curve and the other the aggregate supply curve. We alsodiscussed that for a more regulated economy the hypothesis (r = 2, p−r = 3)might be preferable a priori as a result of very slow market adjustment.An alternative procedure is, therefore, to test a given prior economic

hypothesis, say p−r = 2, using the trace test and, if accepted, continue withthis assumption unless the data strongly suggest the presence of additionalunit roots. The latter can be investigated in a number of ways. For example,we can test the significance of the adjustment coefficients αi,r, i = 1, ..., p ofthe r0th cointegrating vector. If all αr,i coefficients have small t-ratios, thenincluding the r0th cointegrating relation in the model would not improve theexplanatory power of the model but, more likely, would invalidate subsequentinference. But if the choice of r incorrectly includes a nonstationary relation

Page 151: Juselius_Johansen_The Cointegrated VAR Model

152 CHAPTER 8. COINTEGRATION RANK

among the cointegrating relations, then one of the roots of the characteristicpolynomial of the model would correspond to a unit root or a near unit rootand, thus, be large. If either of these cases occur, then the cointegrationrank should be reduced. Note, however, that additional unit roots in thecharacteristic polynomial can be the result of I(2) components in the data.Reducing the rank in this case will not solve the problem as will be furtherdiscussed in Chapters 14 and 15.Note also that the cointegration rank is not in general equivalent to the

number of theoretical equilibrium relations derived from an economic model.For example, in the monetary model of Chapter 2 there was one equilibriumrelation, the money demand relation (??). As demonstrated in Chapter 2,Section 4 the monetary model would be consistent with r = 3 cointegratingrelations (and not one) in a VAR model with real money, real income, infla-tion and two interest rates (instead of just one as in Romer’s example). Theprior assumption that r = 1 has been incorrectly assumed in many empiricalapplications of money demand data.Thus, cointegration between variables is a statistical property of the data

that only exceptionally can be given a direct interpretation as an economicequilibrium relation. The reason for this is that a theoretically meaning-ful relation can be (and often is) a weighted sum of several ’irreducible’cointegration relations (Davidson, 2001). However, these relations containinvaluable information about common stochastic trends between sets of vari-ables. In the next chapter we will illustrate that this can be used to assessthe hypothetical scenario of the economic problem as proposed in Chapter2.To summarize: When assessing the appropriateness of the asymptotic

tables to determine the cointegration rank we need to consider not only thesample size but also the short-run dynamics. Because the power of the tracetest can be very low for alternative hypotheses in the neighborhood of theunit circle it is advisable to use as much additional information as possible.For example, we can check

1. the characteristic roots of the model: If the rth+1 cointegration vectoris nonstationary and is wrongly included in the model, then the largestcharacteristic root will be close to the unit circle.

2. the t-values of the α-coefficients to the rth + 1 cointegration vector. Ifall of them are small, say less than 3.0, then one would not gain a lotby including the rth+1 vector as a cointegrating relation in the model.

Page 152: Juselius_Johansen_The Cointegrated VAR Model

8.4. AN ILLUSTRATION 153

3. the recursive graphs of the trace statistic for er = 1, 2, ..., p. Since thevariable −Tj ln(1 − λi), j = T1, ..., T, grows linearly over time whenλi 6= 0 the recursively calculated components of the trace statisticshould grow linearly for all i = 1, ..., r , but stay constant for i =r + 1, ..., p.

4. the graphs of the cointegrating relations: If the graph of a suppos-edly stationary cointegration relation reveals distinctly nonstationarybehavior, one should reconsider the choice of r, or find out if the modelspecification is in fact incorrect, for example are data I(2) instead ofI(1).

5. the economic interpretability of the results.

The above criteria will now be illustrated based on the Danish data.

8.4 An illustration based on the Danish data

In Table 8.1 we have reported the estimated eigenvalues, λi, the componentsof the trace test, −Tln(1-λi), the trace test, Trace(i) = −

Pij=1Tln(1-λi),and

C ∗.95, the 95% quantiles from the asymptotic Table 15.4 in Johansen (1995).The trace test appears to reject 5 unit roots (140.4 > 87.0) and 4 unit

roots (71.7> 62.6), but not 3 unit roots (39.0< 42.2). Based on the trace testwe would, therefore, choose r = 2. As demonstrated in Chapter 1, our prioreconomic hypothesis was r = 3 assuming two driving trends; one nominaland one real stochastic trend. The sample size is 79 and the asymptotictables may not be very precise approximations in this case. Furthermore, thepower of the test might be low for the third eigenvector with the eigenvalueλ3 = 0.27. This value is in the borderline region where it is hard to knowwhether the corresponding eigenvector should be considered stationary ornonstationary. Therefore, before choosing the cointegration rank it is usefulto examine all five sources of additional information suggested at the end ofSection 8.3.The largest characteristic root is 0.77 for r = 2 and 0.83 when r = 3,

a moderate difference. Whether the root 0.83 corresponds to a unit root ornot is difficult to know and we will collect more information by inspectingthe significance of the adjustment coefficients αi,2 and αi,3, i = 1, ..., p. Table7.1 reported the Π matrix decomposed into all five α and β vectors. We

Page 153: Juselius_Johansen_The Cointegrated VAR Model

154 CHAPTER 8. COINTEGRATION RANK

Table 8.1: The trace test of the cointegration rank and the eigenvalue rootsof the model

λi p-r Tln(1-λi) Trace(i) C ∗.95 Modulus: 5 largest rootsr = 5 r = 4 r = 3 r = 2

0.59 5 68.7 140.4 87.0 0.88 1.0 1.0 1.00.35 4 32.7 71.7 62.6 0.88 0.87 1.0 1.00.27 3 24.0 39.0 42.2 0.74 0.74 0.83 1.00.12 2 10.1 15.0 25.5 0.55 0.74 0.66 0.770.06 1 5.0 5.0 12.4 0.44 0.56 0.66 0.43

notice that the choice r = 2 will exclude the relation β03xt describing arelation between the two interest rates from the model. The adjustmentcoefficient α4,3 is highly significant implying that the deposit rate is stronglyadjusting to β03xt. In Figure 8.1 the graph of the third relation suggests mean-reverting behavior in spite of some evidence of drift. Finally, the graph of therecursively calculated trace test for the third component given in Figure 8.2exhibits linear growth over time, consistent with λ3 being different from zero.Altogether we have found strong evidence supporting our economic prior r= 3. However, before trusting this choice we need to assess the constancyof parameters of the VAR model. In the next section we will discuss somerecursive procedures which have been developed to detect possible sources ofparameter nonconstancy in the VAR model.

Page 154: Juselius_Johansen_The Cointegrated VAR Model

8.5. RECURSIVE TESTS OF CONSTANCY 155

V3` * Zk(t)

74 76 78 80 82 84 86 88 90 92-15

-14

-13

-12

-11

-10

-9

V3` * Rk(t)

74 76 78 80 82 84 86 88 90 92-2.4

-1.6

-0.8

-0.0

0.8

1.6

2.4

3.2

Figure 8.1. The graphs of the third cointegration relation. Upper panelbased on β0xt and lower panel based on β0R1t.

8.5 Recursive tests of constancy

In Chapter 4 we applied a number of residual misspecification tests of theVAR model. Though the empirical model seemed to pass these tests suffi-ciently well to continue the analysis it is completely possible that the modelsuffer from parameter non-constancy. The purpose of this section is to pro-vide a number of diagnostic tests to check for this important feature of themodel.

8.5.1 Recursively calculated trace tests

Page 155: Juselius_Johansen_The Cointegrated VAR Model

156 CHAPTER 8. COINTEGRATION RANK

The Trace testsZ(t)

THE CRITICAL VALUES ARE NOT VALID83 84 85 86 87 88 89 90 91 92 93

0.8

1.0

1.2

1.4

1.6

1.8

2.0

2.2

R(t)

1 is the 10% significance level83 84 85 86 87 88 89 90 91 92 93

0.4

0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

2.2

Figure 8.2. The recursively calculated components of the trace statisticscaled by the 90% quantile of the asymptotic distributions.

The expression for the trace test in (8.6) is used repeatedly in the re-cursive estimation of the VAR model. Figure 8.2 illustrates the recursivelycalculated components. To increase readability, the trace statistic is scaledby the 90% quantile of the appropriate asymptotic distribution. Note, how-ever, that the scaling by the critical values can have the consequence that thelines cross each other. Note also that if the VAR model contains exogenousvariables or dummy variables, then the applied 90% quantiles may no longerbe appropriate and CATS will insert a warning in the picture ”The criticalvalues are not valid”.

The upper panel is based on recursive estimation of the full model, whereasthe lower panel is based on recursive estimation of the R-form in which short-run effects have been concentrated out as follows: Based on the full sampleR0t and R1t are determined once and for all by the auxiliary regressions (??)

Page 156: Juselius_Johansen_The Cointegrated VAR Model

8.5. RECURSIVE TESTS OF CONSTANCY 157

in Chapter 7. The recursions are now calculated as if R0t = αβ0R1t + εtis the true model. In this sense any parameter instability in the short-runcoefficients will have been averaged out in the lower panel. All trace com-ponents exhibit (to some extent) trending behavior consistent with non-zeroλi, i = 1, ..., p. However, the smallest eigenvalue test component in the upperpanel is hardly growing and is, therefore, likely to correspond to a unit rootor a near unit root.

8.5.2 The recursively calculated log-likelihood

Z(t)-ln(det(S00))

83 84 85 86 87 88 89 90 91 92 9350.3

50.4

50.5

50.6

50.7

50.8

50.9

51.0

51.1

-Sum(ln(1-lambda))

83 84 85 86 87 88 89 90 91 92 931.44

1.60

1.76

1.92

2.08

2.24

2.40

2.56

2.72

-2/T*log-likelihood

83 84 85 86 87 88 89 90 91 92 9345.0

47.5

50.0

52.5

55.0

57.5

60.0

R(t)-ln(det(S00))

83 84 85 86 87 88 89 90 91 92 9350.04

50.16

50.28

50.40

50.52

50.64

-Sum(ln(1-lambda))

83 84 85 86 87 88 89 90 91 92 931.5

1.6

1.7

1.8

1.9

2.0

2.1

-2/T*log-likelihood

83 84 85 86 87 88 89 90 91 92 9345.0

47.5

50.0

52.5

55.0

57.5

60.0

Figure 8.3. The recursively calculated loglikelihood based on the full modeland the R-form.

The log-likelihood value is calculated as:

−2/t1 ln(L(r)) =

Ãln |S00,(t1)|+

rXi=1

ln(1− λi,(t1))

!, (8.15)

t1 = T0, . . . , T,

and the 95% confidence bound is calculated as ±2p2p/t1. Figure 8.3 illus-

trates.

Page 157: Juselius_Johansen_The Cointegrated VAR Model

158 CHAPTER 8. COINTEGRATION RANK

It appears that the calculated log-likelihood lies within the 95% confidencebands for t1 =1983:1-1994:3. Note also that the R - form is more stable thanthe full model. This is a typical outcome, because some of the short-runcoefficients in the full model are likely to be unstable over time.

8.5.3 Recursively calculated prediction tests

1-step prediction testZ(t)

83 84 85 86 87 88 89 90 91 92 930

1

2

3

4

5

R(t)

83 84 85 86 87 88 89 90 91 92 930.0

0.5

1.0

1.5

2.0

2.5

Figure 8.4. Recursively calculated one-step ahead prediction errors of thesystem. The upper panel is for the full model and the lower panel for the

R-form model.

The one-step-ahead prediction test is based on the hypothesis that thevector process ∆xt1 is generated by the same cointegrated process that hasgenerated ∆x1, . . . ,∆xt1−1, (t1 = T0 + 1, . . . , T ). (See Lutkepohl (1991),section 4.6). The one-step-ahead prediction error for the system is calculated

Page 158: Juselius_Johansen_The Cointegrated VAR Model

8.5. RECURSIVE TESTS OF CONSTANCY 159

as:

ft1 = ∆xt1 −k−1Xj=1

Γj,(t1)∆xt1−j − Π(t1−1)xt1−1 − µ0,(t1−1) − Φ(t1−1)Dt1 ,(8.16)

t1 = T0 + 1, . . . , T,

and the test statistic as:

T (t1) =

µt1

d1 + r+ 1

¶f 0(t1)Ω

−1(t1−1)f(t1) (8.17)

t1 = T0 + 1, . . . , T.

where d1 = k− 1 + d and d is the number of dummy variables in the model.Under the null T (t1) is asymptotically distributed as χ2 with p degrees offreedom. Figure 8.4 illustrates.

A prediction error larger than two standard errors it indicated with along vertical line. Note that the prediction errors from the full model aregenerally much larger than from the R-form model. This is almost an artifactof the R-model predicting exclusively the long-run components of the modelin contrast to the full model which has to predict both the long-run and theshort-run components.

The test of one-step-ahead prediction errors for the individual variablesxi,t1 is given by:

Ti(t1) =µ

t1d1 + r

+ 1

¶f2i,t1/Ωii,(t1−1), t1 = T0 + 1, . . . , T, (8.18)

and is asymptotically distributed as χ2(1). Also in this case we can choosebetween predictions from the full model illustrated in Figure 8.5 and fromthe R-form model illustrated in Figure 8.6. Note the large prediction errorsof the bond rate and the deposit rate at 1983:1 based on the full model. TheR-form model does not show a similar prediction failure because this effecthas been concentrated out by the dummy variable Dp83t.

Page 159: Juselius_Johansen_The Cointegrated VAR Model

160 CHAPTER 8. COINTEGRATION RANK

DMO

83 84 85 86 87 88 89 90 91 92 930.0

0.5

1.0

1.5

2.0

2.5

3.0

DFY

83 84 85 86 87 88 89 90 91 92 930.0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

DDIFPY

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75

2.00

DIDE

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75

2.00

DIBO

83 84 85 86 87 88 89 90 91 92 930.0

0.8

1.6

2.4

3.2

4.0

4.8

5.6

6.4

Figure 8.5. One-step ahead prediction errors of each variable of the systembased on the full model. A vertical line indicates a prediction error larger

than two standard errors.

Page 160: Juselius_Johansen_The Cointegrated VAR Model

8.5. RECURSIVE TESTS OF CONSTANCY 161

DMO_R

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75

2.00

2.25

DFY_R

83 84 85 86 87 88 89 90 91 92 930.0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

DDIFPY_R

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75

2.00

2.25

DIDE_R

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75

DIBO_R

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

1.25

1.50

Figure 8.6. One-step ahead prediction errors of each variable of the systembased on the R-form model. A vertical line indicates a prediction error

larger than two standard errors.

The time paths of the recursively calculated r largest eigenvalues showsthe estimated eigenvalues from the unrestricted VAR model (8.22). Thestandard error of the estimate λi(t1) is calculated as:

s.e.(λi) =

qT−14(1− λi)2(λi +

PT13

h=1(1− h/T13 )2(ri,u(h)2 − ri,uv(h)2)),

(8.19)

where

ri,u(h) = T−1PT

t=h ui,tui,t−h,

ri,uv(h) = T−1PT

t=h ui,tvi,t−h,

Page 161: Juselius_Johansen_The Cointegrated VAR Model

162 CHAPTER 8. COINTEGRATION RANK

for

ui,t = λ− 12

i α0iS−100 R0t,

and

vi,t = (λi(1− λi))− 12 α0iS

−100 (R0t − αβ

0R1t).

For further details see Hansen & Johansen (1999). Figure 8.7 illustratesthe recursively calculated λi, i = 1, ..., 3 together with the 95% confidencebands. It appears that the estimated eigenvalues stays within the bands forall periods. This is a quite typical outcome which suggests that the power ofdetecting instability might be quite low.

lambda1

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

lambda2

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

lambda3

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

Figure 8.7. Recursively calculated λi, i = 1, ..., 3 together with 95%confidence bands.

Page 162: Juselius_Johansen_The Cointegrated VAR Model

8.5. RECURSIVE TESTS OF CONSTANCY 163

The test of constancy of β is a test of the hypothesis:

Hβτ : β ∈ sp(β(t1)), t1 = T0, . . . , T.

in which β is a known matrix. The test statistic is given by

−2 ln(Q(Hβτ |β(t1))) = t1

rXi=1

³ln(1− ρi,(t1))− ln(1− λi,(t1))

´, (8.20)

t1 = T0, . . . , T,

where ρi(t1) are the solutions of

|ρβS11,(t1)β − βS10,(t1)S−100,(t1)

S01,(t1)β| = 0, t1 = T0, . . . , T, (8.21)

and λi(τ) are the r largest eigenvalues in the solution of the unrestrictedeigenvalue problem:

|λS11,(t1) − S10,(t1)S−100,(t1)S01,(t1)| = 0, t1 = T0, . . . , T. (8.22)

The test statistic (8.20) is asymptotically distributed as χ2 with (p1 − r)rdegrees of freedom (Hansen & Johansen, 1993). Figure 8.8 illustrates thecase where β is the full sample estimate β and Figure 8.9 where β is theestimate based on the sample 1974:2-1987:1. To increase readability the testvalues have been scaled by the 95% quantiles of the χ2 distribution. A valuelarger than 1.0 means that the test rejects constancy. The solid line is basedon the full model the dotted line based on the R-form model. In Figure 8.8the dotted line shows that the full sample β would have been accepted in allperiods 1974:2-1983:1+j, j = 1, ..., 47 when the short-run effects had beencorrected for whereas the solid line shows that this would not have been thecase in the first few years after 1983. Based on Figure 8.9 the conclusionis not as clear. Here we have chosen a specific sub-sample, 1974:2-1987:1as a reference point. Based on the R-form model (the dotted line) we couldapproximately accept constancy of β over the full period, though at the end ofthe sample the test is close to the critical value. Based on the full model (solidline) constancy of β is less strongly supported. This serves as an illustrationthat the test procedures can give quite different results depending on thequestions we ask. It is often useful to check the sensitivity of the stabilitytests to the choice of reference value β using different sample periods.

Page 163: Juselius_Johansen_The Cointegrated VAR Model

164 CHAPTER 8. COINTEGRATION RANK

Test of known beta eq. to beta(t)

1 is the 5% significance level83 85 87 89 91 93

-0.25

0.00

0.25

0.50

0.75

1.00

1.25

1.50BETA_ZBETA_R

Figure 8.8. Recursively calculated tests of the full sample estimateβ ⊆ sp(βt1).

Test of known beta eq. to beta(t)

1 is the 5% significance level83 85 87 89 91 93

0.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75BETA_ZBETA_R

Figure 8.9. Recursively calculated tests of β ⊆ sp(βt1) where β is estimatedon the sub-sample 1974:2-1987:1.

Page 164: Juselius_Johansen_The Cointegrated VAR Model

Chapter 9

Testing restrictions on β and α

In Section 7.4 we discussed the eigenvector decomposition of the long-runmatrix Π and interpreted the unrestricted estimates as a convenient descrip-tion of the information given by the covariances of the data. In Chapter 8 wediscussed how to determine the cointegration rank which separates the eigen-vectors into r stationary and p− r nonstationary directions. The purpose ofthis chapter is to discuss a number of test procedures by which we can testvarious restrictions on the r stationary cointegrating relations. While thefinal aim is to test and impose over-identifying structural restrictions on thelong-run structure β and on the adjustment coefficients α, the restrictionsdiscussed here are not identifying by themselves. Nonetheless, they implybinding restrictions on Π = αβ0 and, hence, are testable. By systematic ap-plication of these tests one will gain invaluable information on the time-seriesproperties of the long-run relations which will facilitate identification of thelong-run and short-run structure to be discussed in Chapters 10-14.

This chapter will discuss how to test for stationarity of cointegrationrelations subject to the following types of restrictions on β and α: (i) samerestrictions on all cointegrating relations (ii) all coefficients known in some ofthe relations, (iii) only some coefficients known in a cointegrating relation,(iv) zero restrictions of rows of α. The organization is as follows: Section9.1 discusses how to formulate hypotheses as restrictions on the parametermatrices, Section 9.2 how to test the same restriction on all β, Section 9.3 howto test a known β, Section 9.4 how to test some restrictions on a cointegrationvector, Section 9.5 how to test long-run weak exogeneity as a row restrictionon α, and Section 9.6 interprets the results in terms of the scenario analysisof Chapter 2.

165

Page 165: Juselius_Johansen_The Cointegrated VAR Model

166 CHAPTER 9. TESTING RESTRICTIONS

9.1 Formulating hypotheses as restrictions

on β

Hypotheses on the cointegration vectors can be formulated in two alternativeways: either by specifying the si free parameters in each βi vector, or themi restrictions on each vector. We will consider both cases for a generalformulation of restrictions on the (p1×r) matrix β,where p1 is the dimensionof xt−1 in the VAR model. We first specify the constrained cointegrationvector βci in terms of the si free parameters:

βc = (βc1, ...,βcr) = (H1ϕ1, ...,Hrϕr), (9.1)

where ϕi is a (si× 1) coefficient matrix, and Hi is a (p1× si),design matrix,and i = 1, ..., r . In this case we use the design matrices to determine the sifree parameters in each cointegration vector.

In the other case we specify some restriction matrices Ri (p1×mi) whichdefine the mi restrictions on βi:

R01β1 = 0

...

R0rβr = 0.

As an illustration consider the following hypothetical specification of β0xtwhere x0t = [m

rt , y

rt ,∆pt, Rm,t, Rb,t,Ds83t] :

β01xt = mrt − yrt − b1(Rm,t −Rb,t)− b2Ds83t

β02xt = yrt − b3(∆pt −Rb,t)β03xt = (Rm,t −Rb,t) + b4Ds83t

The first cointegration relation has three free parameters (s1 = 3), corre-sponding to three restrictions (m1 = 3) and the second and the third relationhave two free parameters (s2 = s3 = 2), corresponding to four restrictions(m2 = m3 = 4). The three restricted vectors Hiϕi take the following form:

Page 166: Juselius_Johansen_The Cointegrated VAR Model

9.1. FORMULATING HYPOTHESES 167

β1 = H1ϕ1 =

1 0 0−1 0 00 0 00 1 00 −1 00 0 1

ϕ11

ϕ12ϕ13

,

β2 = H2ϕ2 =

0 01 00 10 00 −10 0

·ϕ21ϕ22

¸,

β3 = H3ϕ3 =

0 00 00 01 0−1 00 1

·ϕ31ϕ32

¸.

It appears that ϕ11 = β11 = −β12, ϕ12 = β24 = −β25, ϕ13 = β36. Afternormalization on real money in the first relation the normalized coefficientsbecome b1 = ϕ12/ϕ11, b2 = ϕ13/ϕ11 and similarly for the remaining relations.

We will now formulate the above relations as restrictions on the βi coef-ficients. They are as follows:

Page 167: Juselius_Johansen_The Cointegrated VAR Model

168 CHAPTER 9. TESTING RESTRICTIONS

R01β1 =

1 1 0 0 0 00 0 1 0 0 00 0 0 1 1 0

β11β12β13β14β15β16

= 0,

R02β2 =

1 0 0 0 0 00 0 1 0 1 00 0 0 1 0 00 0 0 0 0 1

β21β22β23β24β25β26

= 0,

R03β3 =

1 0 0 0 0 00 1 0 0 0 00 0 1 0 0 00 0 0 1 1 0

β31β32β33β34β35β36

= 0.

Note that Ri = H⊥,i, i.e. R0iHi = 0.

After the rank has been determined all tests are about stationarity, i.e.the null hypothesis is now that a restricted linear combinations of the vectorprocess βci is stationary. This is in contrast to the rank test where the nullhypothesis was a unit root. Under the assumption that the rank was correctlychosen, the matrix Π = αβ0, where α and β are of dimension r, describesthe r stationary directions β. Hence testing restrictions on β is the same asasking whether a restricted vector βci lies in the ’stationarity space’ spannedby β. When a test rejects it means that the restricted vector points outsidethe ’stationarity space’.

9.2 Same restriction on all β

Before imposing specific restrictions on each βi it is often useful to test forgeneral restrictions on all relations. Typical examples are tests of long-run

Page 168: Juselius_Johansen_The Cointegrated VAR Model

9.2. SAME RESTRICTION 169

exclusion of a variables, i.e. a zero row restriction on β. If accepted, thevariable is not needed in the cointegration relations and can be omittedaltogether from the cointegration space. Another example is the test of long-run price homogeneity between some (or all) of the variables. For example, wewant to test whether nominal money and prices are long-run homogeneousin all cointegration relations. If accepted, we can re-specify the long-runrelations directly in real money (and, as will be shown in Chapter 14 anominal growth rate) thus simplifying the long-run structure.

These are testable hypotheses, but, since they impose the same restrictionon all cointegration relations, they are not identifying as will be shown in thenext chapter. All Hi (or Ri), i = 1, ...r, are identical and we can formulatethe hypothesis as:

Hc(r) : βc = (Hϕ1, ...,Hϕr) = Hϕ (9.2)

where βc is p1× r, H is p1×s, ϕ is s× r and s is the number of unrestrictedcoefficients in each vector. The hypothesis Hc(r) is tested against H(r) : βunrestricted, i.e. we test the following restricted model:

∆xt = αϕ0H0xt−1 +k−1Xi=1

Γi∆xt−i + εt (9.3)

The hypothesis (9.2) generally implies a transformation of the data vector,H0xt, as can be illustrated by the following example. Assume that we wishto test the hypothesis of long-run proportionality between mr and yr in allcointegration relations specified by the following design matrix H:

H0 =

1 −1 0 0 0 00 0 1 0 0 00 0 0 1 0 00 0 0 0 1 00 0 0 0 0 1

, ϕ =

ϕ11 ϕ12 ϕ13ϕ21 ϕ22 ϕ23ϕ31 ϕ32 ϕ33ϕ41 ϕ42 ϕ43ϕ51 ϕ52 ϕ53

The transformed data vector becomes:

Page 169: Juselius_Johansen_The Cointegrated VAR Model

170 CHAPTER 9. TESTING RESTRICTIONS

H0xt =

(mr − yr)t

∆ptRm,tRb,tDs83t

.If this restriction is accepted all cointegrating relations can be expressed asa function of the liquidity ratio (mr − yr)t . Thus, the restricted β,is now(5×3) instead of (6×3).The likelihood ratio test procedure is derived by calculating the ratio

between the value of the likelihood function in the restricted and unrestrictedmodel. The maximum likelihood of the unrestricted model is:

L−2/Tmax (H(r)) = |S00|Πri=1(1− λi).

It appears from (9.3) that we can find the restricted ML estimator simi-larly as for the unrestricted model, i.e. by solving the reduced rank regressionof ∆xt on H

0xt−1 corrected for the short-run dynamics:

¯λH0S11H−H0S10S−100 S01H

¯= 0.

This gives the eigenvalues λc1,λc2, ...,λ

cs and the eigenvectors v

c1,v

c2, ...,v

cs.

Note that when H is p1× s, the restricted model will have s eigenvalues,i.e. less than the p1 eigenvalues of the unrestricted model and, hence, m ≤p1 − r. In the money demand example with r = 3 and p1 − r = 3 we canimpose at most three identical restrictions on the cointegrating relations. Wecould, for example, in addition to the liquidity ratio impose the interest ratespread in all relations as well as long-run exclusion of the shift dummy Ds83tresulting in the transformed data vector x0tH = [(mr − yr),∆p, (Rm − Rb)].Imposing one more restriction would make the dimension of the transformedvector equal to 2, which would violate the condition r = 3.The ML estimator of β is found by choosing ϕ = [ vc1,v

c2, ...,v

cr] and then

transforming the vectors back to βc= Hϕ. Note that this type of restrictions

are not identifying and the estimates of the constrained cointegration rela-tions are, therefore, based on the normalization condition and the orderingof the eigenvalues as discussed in Section 7.4.

Page 170: Juselius_Johansen_The Cointegrated VAR Model

9.2. SAME RESTRICTION 171

The LR test statistic is calculated as:

Λ =

µL−2/Tmax Hc(r)

L−2/Tmax H(r)

¶=

µ|S00| × (1− λc1)(1− λc

2) · · · (1− λc

r)

|S00| × (1− λ1)(1− λ2) · · · (1− λr)

¶i.e.

−2 lnΛ = Tln(1− λc1)− ln(1− λ1) + ...+ ln(1− λc

r)− ln(1− λr).

It is asymptotically distributed as χ2(ν) where ν = rm. There are rm de-grees of freedom because we have imposed m restrictions on each of the rcointegration vectors. If the eigenvalues change significantly when we imposethe restrictions H0xt the test will reject.Section 8.3 gave an interpretation of the eigenvalues λi as a squared corre-

lation coefficients between a linear combination of the stationary part of thevector process and a linear combination of the non-stationary part. Thus,if λ

c

i becomes very small as a result of imposing the restrictions Ri, it isa sign of nonstationarity of the restricted cointegration relation. Anotherway of expressing this is again to notice that diag(λ1, ..., λr) = α0S−100 α =

β0S10S

−100 S01β, i.e. λi is related to the estimated αi. Therefore, the rejection

of the hypothesis β = Hϕ implies that at least one of the restricted relationsno longer define a mean-reverting relation and, thus, will have insignificantα coefficients.

9.2.1 Illustrations

The test of β = Hϕ can be calculated in CATS by choosing the option:”Restrictions on subsets of the β−vectors”. The program first prompts forthe number of different groups. When the same restriction is imposed onall vectors, the answer is [1 ]. The program then asks for the number ofrestrictions [m].In Chapter 6 we discussed the role of the deterministic components in the

model and showed that if there are linear trends in the data, the most generalformulation is to include a linear trend inside the cointegration relations andan unrestricted constant in the VAR equations. As discussed in Chapter8, this formulation delivers ’similar’ inference based on the trace test forcointegration rank. After the rank is determined we first test whether the

Page 171: Juselius_Johansen_The Cointegrated VAR Model

172 CHAPTER 9. TESTING RESTRICTIONS

linear trend is needed in the cointegration relations, i.e. whether we canimpose a zero restriction on the trend coefficient in all cointegrating relations.Example 1: A test of long-run exclusion of a linear trend in the cointe-

gration relations for x0t = [mrt , y

rt ,∆pt, Rm,t, Rb,t, Ds83t, t].

H1 : βc = Hϕ or R0β = 0

where

H =

1 0 0 0 0 00 1 0 0 0 00 0 1 0 0 00 0 0 1 0 00 0 0 0 1 00 0 0 0 0 10 0 0 0 0 0

, ϕ =

ϕ11 ϕ12 ϕ13ϕ21 ϕ22 ϕ23ϕ31 ϕ32 ϕ33ϕ41 ϕ42 ϕ43ϕ51 ϕ52 ϕ53ϕ61 ϕ62 ϕ63

and

R0 = [0, 0, 0, 0, 0, 0, 1].

The design matrix H is specified so that the last row, corresponding tothe trend, is equal to zero and the restriction matrix R so that it makesthe last coefficient row of β equal to zero. With few restrictions the Rformulation is more parsimonious than the H formulation. Table 9.1 reportsthe estimates of λc and βc in the restricted model. It appears that the λcihardly changes compared to the λi. The same is true with β

c compared to theunrestricted β reported in Table 7.1 in Chapter 7. Not surprisingly, the threezero restrictions on the trend can be accepted with a p-values of 0.61 and weconclude that the appropriate specification is Case 3 of Section 6.3, i.e. notrend in the cointegration relations, unrestricted constant in the model.When testing the hypothesis of long-run exclusion of variables which are

strongly correlated, i.e. collinear, we sometimes find that long-run exclusionis accepted even if it subsequently turns out that the variable in questionwas very important in at least some of the long-run relations. Thus, somecaution is needed with this test, in particular if the variables are stronglycollinear. This could be the case, for example, if two variables are stronglytrending.

Page 172: Juselius_Johansen_The Cointegrated VAR Model

9.2. SAME RESTRICTION 173

Example 2. A test of long-run exclusion of the shift dummy Ds83t in thecointegration relations for x0t = [mr

t , yrt ,∆pt, Rm,t, Rb,t, Ds83t, t]. At around

this date the graphs of the variables in levels and differences reported inSection 3.2 showed a major shift in the level of money stock, a less pronouncedshift in the level of real aggregate demand, and a big blip in the differencesof the bond rate. A blip in ∆xt corresponds to a shift in the levels xt.Because it is often hard to know whether the mean shift will cancel or notin a cointegration relation, it is often useful to include the shift dummy inthe cointegration relations and then test its significance. The specification ofthe H and the R matrices are similar as for the test for long-run exclusionof the trend and will, therefore, not be reported. The hypothesis of long-runexclusion of the shift dummy Ds83 t was rejected with a p-value of 0.01. Therestricted estimates are reported under H2 in Table 9.1.

Comparing the estimates under H2 to the estimates under H1 (which are

close to the unrestricted β estimates) it appears that βc

1 has hardly changed(implying that the shift dummy was not needed in this relation), whereas this

is not the case with βc

2 and βc

3. This implies that the estimates of the moneydemand relation and of the interest rate relation are sensitive to whether weinclude the shift dummy or not: The income coefficient in β

c

2 is no longerclose to -1, as it was when the shift dummy was allowed to enter the relation.Moreover, the interest coefficients have almost doubled and one of them haschanged sign. Thus, the consequence of deregulating capital movements in1983 seems to have been a shift in money stock to a higher level withouta corresponding shift in the level of real income. If the shift to the new(deregulated) equilibrium level of money stock is not appropriately accountedfor by the shift dummy, then the econometric procedure will account for theextraordinary increase in money stock by increasing the estimates of the realincome and interest rate coefficients.

The deregulation of capital movements in 1983 also affected the Danishinterest rates, in particular on the long-term bond rate. As a result of theincreased foreign demand for Danish bonds, the previously very high yield onbonds dropped dramatically and, after a period of increased volatility, settleddown on a much lower level where it has stayed all since. Figure 3.3 in Section3.2 illustrates this graphically. The effect was a marked decrease in theinterest rate spread and a similar gradual stabilization at a much lower level inthe new regime. Figure 2.5 in Chapter 2 illustrates this graphically. Withoutaccounting for this shift in the equilibrium level between the interest rates

Page 173: Juselius_Johansen_The Cointegrated VAR Model

174 CHAPTER 9. TESTING RESTRICTIONS

(the risk premium) the relationship between the two interest rates becomesnegative as βc3 demonstrates.This example serves as an illustration of the importance of appropriately

accounting for major reforms and interventions in order not to bias the co-efficient estimates of the economic steady-state relations.Example 3. A test of long-run homogeneity betweenmr and yr in all coin-

tegrating relations for x0t = [mrt , y

rt ,∆pt, Rm,t, Rb,t,Ds83t, t]. The hypothesis

H3 can be accepted with a p-value of 0.13. The design matrices H and Rhave the following form:

H =

1 0 0 0 0 0−1 0 0 0 0 00 1 0 0 0 00 0 1 0 0 00 0 0 1 0 00 0 0 0 1 00 0 0 0 0 1

, R =

£1 1 0 0 0 0 0

¤.

The restricted estimates are reported in Table 9.1 under H3 and showsthat βc does not differ very much from the unrestricted values. The coeffi-cients to real money and real income are very small except for in the secondrelation, suggesting that they enter significantly only in one of the threecointegration relations. Thus, two of the acceptable long-run homogeneityrestrictions are describing a ’0, 0’ relationship.Example 4. A test of the interest rate spread in all cointegration relations.

The hypothesis H4 was strongly rejected with a p-value of 0.00. The H andR matrices are similar to the previous example and will not be reported.The restricted estimates of H4 are reported in Table 9.1. It appears thatthe first and the third restricted cointegration relation. The violation of thespread restriction in the first relation suggests that inflation is not exclusivelyrelated to the interest rate spread, but also to the level of nominal interestrates. The violation of the spread in the third relation suggests that the twointerest rates are not homogeneously related to each other.Example 5. A joint test of H1 and H3. It appeared that the zero restric-

tion on the trend and the homogeneity restriction on real money and incomewere each acceptable. The joint hypothesis H5 was accepted with a p-valueof 0.20. Note that the joint test is not the sum of the individual tests, be-cause the latter are not independent. It frequently happens that the joint

Page 174: Juselius_Johansen_The Cointegrated VAR Model

9.2. SAME RESTRICTION 175

Table 9.1: Estimated eigenvalues, eigenvectors, and loadings for the Danishdata

λi mr yr ∆p Rm Rb Ds83 trendH1 : β7· = 0, χ

2(3) = 1.80, p-value = 0.610.58 βc01 0.03 0.05 1.00 -1.63 0.81 -0.00 0.00.36 βc02 1.00 -1.05 -0.83 -13.9 14.2 -0.15 0.00.26 βc03 0.01 0.00 0.04 1.00 -0.37 -0.00 0.0

H2 : β6· = 0, χ2(3) = 11.69, p-value = 0.01

0.58 βc01 0.01 0.06 1.00 -1.66 0.67 0.0 0.000.36 βc02 1.00 -1.34 -1.21 27.7 21.5 0.0 -0.000.26 βc03 0.12 -0.16 0.05 1.00 0.86 0.0 -0.00

H3 : β1· = −β2·, χ2(3) = 5.71, p-value = 0.130.58 βc01 0.06 -0.06 1.00 -2.69 1.46 0.01 0.000.36 βc02 1.00 -1.00 -0.86 -13.2 14.0 -0.15 0.000.26 βc03 0.02 -0.02 0.05 1.00 -0.33 -0.00 0.00

H4 : β4· = −β5·, χ2(3) = 15.33, p-value = 0.000.58 βc01 0.03 0.05 1.00 -0.54 0.54 -0.00 0.000.36 βc02 1.00 1.04 -0.80 -14.5 14.5 -0.12 -0.000.26 βc03 -0.02 0.01 0.01 1.00 -1.00 -0.01 0.00H5 : β1· = −β2· and β7· = 0, χ2(6) = 8.54, p-value = 0.20

0.58 βc01 0.06 -0.06 1.00 -2.69 1.46 0.01 0.00.36 βc02 1.00 -1.00 -0.86 -13.2 14.0 -0.15 0.00.26 βc03 0.02 -0.02 0.05 1.00 -0.33 -0.00 0.0

test rejects, though the individual tests are accepted. The design matricesH and R are formulated as:

H =

1 0 0 0 0−1 0 0 0 00 1 0 0 00 0 1 0 00 0 0 1 00 0 0 0 10 0 0 0 0

, R =

·1 1 0 0 0 0 00 0 0 0 0 0 1

¸.

We have given several examples of restrictions that were acceptable andrestrictions that were not. In the latter cases it was quite straightforward to

Page 175: Juselius_Johansen_The Cointegrated VAR Model

176 CHAPTER 9. TESTING RESTRICTIONS

see why a restriction violated the information in the data. This was partlybecause the unrestricted cointegration relations could be roughly interpretedas plausible steady-state relations, c.f. the discussion in Section 6.7. In mostcases it is not so obvious why a restriction violates the information in thedata.To be able to re-specify the hypothesis in a more adequate way it is cru-

cial to understand why a restriction rejects. A useful tool in this contextis to compare the unrestricted Π with the restricted Πc matrix. If the re-

strictions are acceptable then the constrained Πc = αcβc0 ≈ αβ

0= Π, i.e.

the constrained cointegration relations and the corresponding weights shouldapproximately reproduce the unrestricted Π. If the restrictions are not ac-

ceptable, then Πc = αcβc0 6= αβ

0= Π. To illustrate how to locate the reason

for a test failure Table 9.2 compares Π with Πc, where the latter is calculatedunder the hypothesis H4 in Table 9.1.It is now easy to see that imposing the interest rate spread on all coin-

tegrating relations hardly changes the rows for real money, real income, andthe bond rate of Πc, but does change the rows for inflation and the depositrate. In the inflation equation the significant effect of the deposit rate in Πhas now disappeared and in the deposit rate equation the deposit rate is nolonger related to the bond rate.Therefore, if we had started the analysis with the transformed vectorex0t = [mr, yr,∆p, (Rm − Rb)] important feedback information from changes

in nominal interest rate on the system would have been lost. Such transfor-mations are often suggested by the underlying theory model and, therefore,frequently employed in empirical models without first testing for data admis-sibility. The consequence is, however, that potentially important informationin the original variables is being lost.

9.3 Some β assumed known

This test is useful when we want to test whether a hypothetical known vectoris stationary. For example we might be interested in whether real interestrate defined as R−4p is stationary, whether 4p is stationary by itself, andwhether the income velocity of money, mr − yr is stationary.To formulate this hypothesis it is convenient to decompose the r coin-

tegrating relations into s known vectors b (in most cases s = 1) and r − sunrestricted vectors ϕ :

Page 176: Juselius_Johansen_The Cointegrated VAR Model

9.3. SOME β ASSUMED KNOWN 177

Table 9.2: Comparing the combined effects of unrestricted and restrictedcointegration relations for r=3

Comparing Π = αβ0(UR) with Πc = αcβ

c0(R) under H4

mr yr ∆p Rm Rb Ds83 trendUR ∆mr

t −0.33(−5.4)

0.32(5.1)

−0.03(−0.1)

4.98(3.1)

−4.88(4.9)

0.04(3.9)

0.00(3.8)

R ∆mrt −0.34

(−5.5)0.32(5.1)

−0.03(−0.1)

5.00(4.3)

−5.00(4.3)

0.04(2.8)

0.00(1.7)

UR ∆yrt 0.03(0.8)

−0.06(−1.3)

−0.20(−0.9)

−1.76(−1.6)

1.07(1.5)

0.00(0.3)

−0.00(−1.8)

R ∆yrt 0.05(1.1)

−0.06(−1.4)

−0.21(−1.0)

−0.87(−1.1)

0.87(1.1)

−0.00(0.1)

−0.00(−0.8)

UR ∆2pt 0.04(1.2)

−0.16(−4.1)

−1.66(−9.2)

1.96(2.0)

−0.28(−0.5)

−0.00(−0.6)

−0.00(5.3)

R ∆2pt 0.00(0.1)

−0.15(−3.7)

−1.65(−8.7)

0.22(0.3)

−0.22(−0.3)

−0.00(−0.4)

−0.00(0.2)

UR ∆Rm,t −0.01(−1.4)

0.01(0.5)

0.00(0.1)

−0.30(−3.5)

0.10(1.8)

0.002(3.4)

−0.00(−1.3)

R ∆Rm,t 0.00(0.1)

−0.00(−0.1)

−0.00(−0.1)

−0.10(−1.5)

0.10(1.5)

0.002(2.5)

−0.00(−2.5)

UR ∆Rb,t 0.00(0.0)

0.00(0.1)

0.01(0.7)

−0.07(−0.6)

0.03(0.4)

0.00(0.2)

0.00(0.1)

R ∆Rb,t 0.00(0.1)

0.00(0.1)

0.01(0.6)

0.02(0.3)

−0.02(0.3)

−0.00(−0.6)

0.00(0.6)

Page 177: Juselius_Johansen_The Cointegrated VAR Model

178 CHAPTER 9. TESTING RESTRICTIONS

Hc(r) : βc = (b,ϕ), (9.4)

where b is a p1× s, and ϕ is a p1× (r − s) vector. We partition

α = (α1,α2), (9.5)

where α1 are the adjustment coefficients to b, and α2 to ϕ. The cointegratedVAR model can now be written as:

4xt = α1b0xt−1 +α2ϕ0xt−1 + Γ14 xt−1 + ΦDt + εt. (9.6)

The unrestricted estimates α and β are first obtained from the ”concen-trated model”:

R0t = αβ0R1t + εt (9.7)

defined in (??). Inserting (9.4) and (9.5) in (9.7) we get:

R0t = α1b0R1t +α2ϕ

0R1t + εt (9.8)

Since the variable b0R1t is known and stationary under Hc it can be concen-trated out from (9.8):

R0,t = B1b0R1,t +R0.b,t

and

R1,t = B2b0R1,t +R1.b,t

The new ”concentrated model” is given as:

R0.b,t = α2ϕ0R1.b,t + εt (9.9)

Page 178: Juselius_Johansen_The Cointegrated VAR Model

9.3. SOME β ASSUMED KNOWN 179

The remaining task is now to derive the ML estimator of (ϕ,α1,α2). Twocomplications have to be solved:First, if ϕ happens to lie in the space spanned by b, then the model

will become singular. Therefore, to avoid this we can force ϕ to lie in thespace spanned by b⊥. This can be achieved by reformulating model (9.9) forϕ = b⊥ψ:

R0.b,t = α2ψ0b0⊥R1.b,t + εt

Second, because the maximum of the likelihood function is given by:

L−T/2max = |S00.b|Πr−si=1(1− λc

i)

and |S00.b| 6= |S00| the determinants do not cancel in the LR test. This canbe circumvented by solving an auxiliary eigenvalue problem:

¯ρS00−S01b(b0S11b)−1b0S10

¯= 0

from which we get the eigenvalues:

ρ1 > ... > ρs > ρs+1 = ... = ρp = 0

If s = 1, as is usually the case in practical applications, only ρ1 > 0. It canbe shown that

|S00.b|= |S00|Πsi=1(1− ρi)

Therefore,

L−2/Tmax (Hc(r)) = |S00|Πsi=1(1− ρi)Πr−si=1(1− λ

c

i)

and the LR test procedure

Λ = L−2/Tmax (Hc(r))/L−2/Tmax (H(r))

Page 179: Juselius_Johansen_The Cointegrated VAR Model

180 CHAPTER 9. TESTING RESTRICTIONS

gives us the LR test statistic:

−2 lnΛ = Tln(1− ρ1) + ...+ ln(1− ρs) + ln(1− λc

1) + ...+ ln(1− λc

r−s)−− ln(1− λ1) + ...+ ln(1− λr)

which is asymptotically distributed as χ2 with (p1− r)s degrees of freedom.

9.3.1 Illustrations

This test can be calculated in CATS by choosing the same option as in theprevious test: ”Restrictions on subsets of the β−vectors. The program firstprompts you to input the number of different groups. If there is one knownvector to test, then the answer is [2 ]; i.e. one group containing the knownvector and the second group the remaining r − 1 vectors. The programprompts for the number of vectors in group 1 [1 ] and for the number ofrestrictions [p1-1]. Then you need to type in the restrictions defining theknown vector and save the restrictions. Finally, the program will ask for thenumber of vectors in the second group [r-1 ] and the number of restrictions[0 ].As an illustration of the test procedure we will ask whether inflation rate,

real interest rates, and the interest rate spread are stationary by themselves.Note that the null is stationarity in this case. First, the test that ∆p ∼ I(0)is formulated as:

H6 : βc = (b,ϕ),

where

b0 =£0 0 1 0 0 0 0

¤i.e. b is a unit vector that picks up the inflation rate. The remaining r-1= 2 vectors are unrestricted and described by the matrix ϕ of dimensionp1× r−1 = 7×2. The coefficients ϕij are uniquely determined based on theordering of eigenvalues and the normalization ϕ0S11.bϕ = I. In most cases ϕijare not of particular interest and there is no need to interpret the estimated

Page 180: Juselius_Johansen_The Cointegrated VAR Model

9.4. SOME COEFFICIENTS KNOWN 181

coefficients. Only when b is found to be stationary can an inspection ofthe remaining relations sometimes be helpful in tentatively identifying thecomplete cointegration structure.The results of testing the four stationarity hypotheses are presented in the

first part of Table 9.3. H6 tests the stationarity of inflation rate, H7 of realdeposit rate, H8 of real bond rate, and H9 of the interest rate spread. All ofthem are rejected. Note, however, that the test results are not independentof the choice of r. If a conservative value of r is chosen (small r) then it willbe more difficult to accept the stationarity hypothesis than if when r is big.We note, however, that the rejection might be related to the zero restric-

tion of the shift dummy Ds83 t. For example, if the interest rate spread isstationary around one level before 1983 and another level after that date,then H9 would probably be rejected as a consequence of imposing a zerorestriction on Ds83 t. If this is the case it would be more relevant to askwhether (Rm − Rb − b1Ds83) ∼ I(0) rather than (Rm − Rb) ∼ I(0), whereb1Ds83 is the estimated shift in the level of interest rate spread as a resultof deregulation of capital movements. In the next section we will discuss atest procedure for the situation when at least one of the coefficients is notknown and, therefore, have to be estimated.

9.4 Only some coefficients are known

Here we will test restrictions on one (or a few) of the cointegration vec-tors assuming that some of the coefficients are known and some have tobe estimated. In the general case we formulate the hypothesis as β =H1ϕ1,H2ϕ2, ...,Hrϕr, where Hi is a (p1 × si) matrix, i = 1, .., r. Forsimplicity we assume in the following that the r cointegration relations aredivided into two groups containing r1 and r2 vectors each. Here we will focuson the special case r1 = 1, r2 = r− 1 and H2 = I and leave the more generalspecification to the next chapter. This case is useful for asking questionsabout the stationarity of a single hypothetical cointegrating relation whileleaving the remaining r − 1 relations unrestricted. For example, this testwould answer the question whether there exists any stationary combinationbetween nominal interest rate R and inflation 4p, i.e. whether (R−ω∆p) isstationary for some value of ω. The test procedure will find the value ω thatproduces the ’most stationary’ relation and calculate the p-value associatedwith the null hypothesis. The latter is expressed as:

Page 181: Juselius_Johansen_The Cointegrated VAR Model

182 CHAPTER 9. TESTING RESTRICTIONS

βc = (βc1,β2) = (H1ϕ1,ϕ2) (9.10)

where H1 is a known design matrix of dimension p1× s1, ϕ1is s1× 1 matrixand ϕ2 is a p1×(r−1) matrix of unrestricted coefficients. Again we partitionα so that it corresponds to the partitioning of βc:

α = (α1,α2).

The concentrated model can be written as:

R0t = α1ϕ01H

01R1t +α2β

02R1t + εt

The estimation problem is now more complicated because neither ϕ01H01R1t

nor β02R1t are known and can be concentrated out. Different algorithms forsolving nonlinear estimation problems can be used. CATS uses a switchingalgorithm described below:

1. Estimate an initial value of βc1 = β1.

2. For a fixed value of βc1 = β1 estimate α2 and β2 by reduced rankregression of R0.β1,t on R1.β1,t, where R0.β1,t and R1.β1,t are corrected

for β01R1t. This defines β2 and L

−2/Tmax (β1).

3. For fixed value of β2 = β2 estimate α1 and ϕ1 by reduced rank regres-sion of R0.β2,t on H

01R1.β2,t, where R0.β2,t and H

01R1.β2,t are corrected

for β02R1t. This defines β1 = H1ϕ1 and L

−2/Tmax (β2).

4. Repeat the steps, always using the last obtained values of βi untilthe values of the maximized likelihood function converge. In CATSL−2/Tmax (β1) − L−2/Tmax (β2) ≤ 0.000001 for the algorithm to stop. Amaximum of 200 iterations is set by CATS.

The eigenvalue problem for fixed β1 = β1 is given by:

¯ρ1S11 − S10.β1(S00.β1)

−1S01.β1

¯= 0

Page 182: Juselius_Johansen_The Cointegrated VAR Model

9.4. SOME COEFFICIENTS KNOWN 183

or for fixed β2 = β2 by:¯ρ2H

01S11H1−H0

1S10.β2(S00.β2)−1S01.β2H1

¯= 0

and the maximum of the likelihood function by:

L−2/Tmax (Hc) = |S00| (1− ρ1)...(1− ρr)

where ρi are the eigenvalues obtained after convergence of the likelihoodfunction.Using the LR procedure the hypothesis (9.10) can be tested by calculating

the test statistic:

−2 lnΛ = Tln(1− ρ1) + ...+ ln(1− ρr)− ln(1− λ1)− ...− ln(1− λr),(9.11)

which is asymptotically χ2 distributed with the degrees of freedom given by:

ν = (m1 − r + 1) = (p1− r)− (s1 − 1)

9.4.1 Illustrations

This test can be performed in CATS by choosing the same option as inthe previous test: ”Restrictions on subsets of the β−vectors”. The programfirst prompts you to input the number of different groups, usually [2 ] i.e. onegroup containing the restricted vector(s) and the second group the remainingvectors. The program prompts for the number of vectors in group 1 [1 ] andfor the number of restrictions [m]. Then CATS will ask you to type in therestrictions of the design matrix H or alternatively the restriction matrix Rdefining the restricted vector and save the restrictions. Finally, the programwill ask for the number of vectors in the second group [r-1 ] and the numberof restrictions [0 ].We will now illustrate this type of tests:

βc = Hϕ1,ϕ2 (9.12)

Page 183: Juselius_Johansen_The Cointegrated VAR Model

184 CHAPTER 9. TESTING RESTRICTIONS

Table 9.3: Testing the stationarity of simple relations

mr yr ∆p Rm Rb Ds83 trend χ2(υ) p.val.Tests of a known β vector

H6 0 0 1 0 0 0 0 20.0(4) 0.00H7 0 0 1 -1 0 0 0 16.5(4) 0.00H8 0 0 1 0 -1 0 0 12.7(4) 0.01H9 0 0 0 1 -1 0 0 20.1(4) 0.00

Tests of a trend-stationary relationH10 1 0 0 0 0 -0.57 0.003 8.9(2) 0.01H11 0 1 0 0 0 -0.28 0.003 12.6(0) 0.01

Tests of velocity relationsH12 1 -1 0 0 0 -0.39 0.002 8.5(2) 0.01H13 0.04 -0.04 1.0 0 0 -0.01 0.000 4.4(1) 0.04H14 1 -1 0 -14.5 14.5 -0.113 -0.001 0.02(1) 0.88H15 1 -1 0 -14.1 14.1 -0.145 0 0.65(2) 0.72

Tests of real income relationsH16 0 1 11.7 0 0 -0.053 0.002 0.39(1) 0.53H17 0 1 15.3 -15.3 0 -0.105 0.002 2.27(1) 0.13H18 0 1 17.7 0 -17.7 -0.438 0.006 9.32(1) 0.00

Tests of inflation, real interest rates and the spreadH19 0 0 1 0 0 0.015 0 7.4(3) 0.06H20 0 0 1 -1 0 0.009 0 7.8(3) 0.05H21 0 0 1 0 -1 -0.001 0 12.6(3) 0.01H22 0 0 0 1 -1 -0.001 0 14.3(3) 0.00

Tests of combinations of interest rates and inflation rateH23 0 0 1 -0.45 0 0.012 0 7.1(2) 0.03H24 0 0 1 0 -0.20 0.011 0 6.8(2) 0.03H25 0 0 0 1 -0.45 -0.001 0 0.9(2) 0.64

Tests of homogeneity between inflation and the interest ratesH26 0 0 1.0 -0.25 -0.75 0 0 12.1(3) 0.01H27 0 0 1.0 -1.66 0.66 0.015 0 5.8(2) 0.06H28 0 -0.04 1.0 -1.56 0.56 0 -0.00 0.02(1) 0.89

Page 184: Juselius_Johansen_The Cointegrated VAR Model

9.4. SOME COEFFICIENTS KNOWN 185

where we impose restrictions on just one of the vectors and leave the remain-ing ones unrestricted. This test is indispensable for identifying irreduciblesets of cointegrated variables (Davidson, 2001), our so called building blocks.Rejecting the stationarity of the hypothetical relation ϕ01H

0xt implies thatit does not qualify as a steady-state relation in the final long-run structure.Therefore, systematically testing the stationarity of all possible relationshipsis an indispensable help in spotting potentially relevant cointegration rela-tions in the structure of identified long-run steady-state relations. The testof a fully identified cointegration structure will be discussed in more detailin the next chapter.Trend-stationarity of real money and real income is tested as the hypothe-

ses H10 and H11. None of them are accepted. Note also that the estimatedtrend coefficient implies a negatively sloped trend in both real money andincome!The next group of tests, H12−H15, are about (inverse) velocity relations.

The trend-stationarity of the liquidity ratio, H12, is clearly rejected (see alsoFigure 2.5 in Chapter 2). It appears from H13 that the liquidity ratio andinflation do not appear to be cointegrated, while H14 shows strong cointegra-tion between the liquidity ratio and the interest rate spread. To investigatewhether the trend is significant we have imposed a zero restriction on thetrend in H15. The difference between the two test values, 0.65-0.02 = 0.63,is approximately χ2(1) and, hence, the effect is not significant and the trendcan be set to zero in H14.The hypotheses H16 − H18 are tests about real income relations. Real

income and inflation appear cointegrated in H16 but with implausible coeffi-cients. This is probably a consequence of a declining inflation and a modestlygrowing real income in this period. Though being found stationary, we wouldnot consider this relation to be a serious candidate for an economic steady-state relation. The same is true for H17 which cannot be interpreted as anIS relation and for H18 the stationarity of which is rejected.The tests in the next group of hypotheses, H19 −H22, are similar to the

tests in the first part of the table. The difference is that we here allow theshift dummy Ds83 t to enter the relations. However, none of the variablesbecomes convincingly stationary and we conclude that inflation, real interestrates and the spread are all nonstationary even when we allow for a meanshift in 1983:1.The hypothesesH23−H25 are similar toH20−H22 except that the unitary

restriction on the bond rate coefficient is now relaxed. We note that the two

Page 185: Juselius_Johansen_The Cointegrated VAR Model

186 CHAPTER 9. TESTING RESTRICTIONS

interest rates appear cointegrated with a coefficient of 0.45. The low value ofthe coefficient can possibly be explained by the deposit rate being an averageof all the components in M3, some of which yielded no interest, while othersquite a high interest.Finally, the last group tests homogeneity between inflation rate and the

two interest rates. The design matrices in H26 are specified as:

H =

0 00 01 0−1 −10 10 00 0

, R =

1 0 0 0 0 0 00 1 0 0 0 0 00 0 1 1 1 0 00 0 0 0 0 1 00 0 0 0 0 0 1

.

Because H26 is rejected we ask in H27 whether the results would improveby allowing for a mean shift in 1983:1. Stationarity did improve, but notconvincingly so. By inspecting the Πc and comparing it with the unrestrictedΠ (not reported here) we found that imposing H27 on the data caused realincome to become insignificant in the inflation equation. Hence, by addingreal income (and the trend) to H27 we obtained H28 which was stronglyaccepted.To summarize: this exercise has ’identified’ the following three possible

candidates for a long-run structure: H15,H25, and H28. These will be testedjointly in Chapter 10.

9.5 Long-run weak exogeneity: restrictions

on α

The hypothesis that a variable has influenced the long-run stochastic pathof the other variables of the system, while at the same time has not beeninfluenced by them, is called the hypothesis of ‘no levels feedback’ or long-run weak exogeneity when the parameters of interest are β. We test thefollowing hypothesis on α:

Hcα(r) : α = Hα

c (9.13)

Page 186: Juselius_Johansen_The Cointegrated VAR Model

9.5. LONG-RUN WEAK EXOGENEITY 187

where α is p × r, H is a p × s matrix, αc is a s × r matrix of nonzero α-coefficients and s ≥ r. (Compare this formulation with the hypothesis of thesame restriction on all β, i.e. β = Hϕ.) As with tests on β we can expressthe restriction (9.13) in the equivalent form:

Hcα(r) : R

0α = 0 (9.14)

where R = H⊥. Since the R matrix is often of much smaller dimension thanthe H matrix, the default in CATS is to input the R matrix.The condition s ≥ r implies that the number of non-zero rows in α must

not be greater than r. This is because a variable that has a zero row in αdoes not adjust to the long-run relations and, hence, can be considered asa common driving trend in the system. Since there can be at most (p− r)common trends the number of zero-row restrictions can at most be equal to(p− r) .The hypothesis (9.13) can be expressed as:

H0 : Hα = αc =

µαc10

¶.

Under Hcα(r):

∆xt = Γ1∆xt−1 +αcβ0xt−1 + µ0 + ΦDt + εt. (9.15)

We consider first the concentrated model:

R0t = αβ0R1t + εt

Under H0 :

R0t = Hαβ0R1t + εt

with εt ∼ NIID(0,Ω) and

Ω =

·ω11 ω12ω21 ω22

¸where ω12 is the covariance matrix between the errors from the m ’endoge-nous’ variables x1,t and the errors from the p−m weakly exogenous variables

Page 187: Juselius_Johansen_The Cointegrated VAR Model

188 CHAPTER 9. TESTING RESTRICTIONS

x2,t. The weak exogeneity hypothesis can be tested with a LR test proceduredescribed in Johansen and Juselius (1990). The idea behind the derivation ofthe test procedure is to partition the equation system (9.15) into the ”x1,t”and the ”x2,t” equations. Formally this can be done by multiplying (9.15)withH andH⊥ respectively. To obtain simpler results we use the normalizedmatrix H = H(H0H)−1 instead of H (because H

0H = H0H= I), i.e.:

H0R0t = αc1β

0R1t +H0εt

H0⊥R0t = H0

⊥εt(9.16)

For the ”baby” model this would correspond to:

∆x1t = αc1β0xt−1 + ε1t

∆x2t = ε2t

Next step is to formulate the joint model (9.16) as a conditional and marginalmodel using results from multivariate normal distributions:

H0R0t = αc1β

0R1t + ρH0⊥R0t − ρH0

⊥εt +H0εt

H0⊥R0t = H0

⊥εt(9.17)

where ρ = ω12ω−122 . For the ”baby” model this would correspond to:

∆x1t = αc1β0xt−1 + ρ∆x2t − ρε2t + ε1t

∆x2t = ε2t

i.e.

∆x1t = αc1β0xt−1 + ρ∆x2t + ε1.2t

∆x2t = ε2t

where ε1.2t = ε1t−ρε2t and Cov(ε1.2t, ε2t) = 0. All relevant information aboutα and β are now in the ∆x1t equation and we can solve the eigenvalueproblem entirely based on that equation. Since ∆x2t is stationary we cancorrect R0,t (∆x1t) and R1,t (xt−1) for this variable based on the auxiliaryregressions:

R0t = B1∆x2t +R0.H⊥t

Page 188: Juselius_Johansen_The Cointegrated VAR Model

9.5. LONG-RUN WEAK EXOGENEITY 189

and

R1t = B2∆x2t +R1.H⊥t

so that:

R0.H⊥t = αcβ0R1.H⊥t + ut (9.18)

The ’usual’ eigenvalue problem is based on (9.18) and the solution delivers

p −m eigenvalues λc

i . The r largest are used in the LR test which is givenby:

−2 lnL(Hcα(r)/H(r)) = −TΣri=1ln(1− λ

c

i)− ln(1− λi)It is asymptotically distributed as χ2(ν) where ν = rm and m is the numberof weakly exogenous variables.

9.5.1 Empirical illustration:

We will now test the hypothesis that the bond rate is weakly exogenousfor the long-run parameters in the Danish money demand data, i.e. thatα51 = α52 = α53 = 0 . We have p = 5, r = 3, m = 1, s = 4.

∆mrt

∆yrt∆2pt∆Rm,t∆Rb,t

= ...+

α11 α12 α13. . .. . .. . .

α51 α51 α51

β01xt−1β02xt−1β03xt−1

=

1 0 0 00 1 0 00 0 1 00 0 0 10 0 0 0

αc11 αc12 αc13. . .. . .

αc41 αc42 αc43

β01xt−1β02xt−1β03xt−1

=

αc11 αc12 αc13. . .. . .

αc41 αc42 αc430 0 0

β01xt−1β02xt−1β03xt−1

Page 189: Juselius_Johansen_The Cointegrated VAR Model

190 CHAPTER 9. TESTING RESTRICTIONS

Table 9.4: Tests of long-run weak exogeneity

r mr yr ∆p Rm Rb ν χ2(ν)1 0.4 0.2 29.8 0.5 0.6 1 3.82 10.9 1.0 40.3 0.6 0.6 2 6.03 → 19.2 2.4 50.4 11.1 0.7 ← 3 7.84 23.8 6.6 54.7 15.4 1.1 4 9.5

The specification of the zero row restriction on the adjustment coefficientsof the bond rate is given by the R matrix (which is the form used in CATS):

R0 = [0, 0, 0, 0, 1].

The test statistic, distributed as χ2(3), became 0.7 and the weak exogene-ity of the bond rate is clearly accepted.

In Table 9.4 we have reported the weak exogeneity test for all variablesand for all possible values of r. The row corresponding to our preferred choicer = 3 is indicated by two arrows. The remaining test results for other choicesof r serve the purpose of sensitivity analyses: If, instead we had chosen r= 2 (remember that based on the rank test we could as well have chosenthis value), the two interest rates would have become weakly exogenous,implying that each of them would have acted as an independent commondriving trend in this system, which a priori would not seem very plausible.On the other hand, if we had chosen r = 1 (as has often been the case inmany empirical models of money demand), then only inflation would havebeen ’non-exogenous’ and all information about agents’ adjustment towardsa ’money-demand’ relation would have been lost. This is because the firstunrestricted cointegration relation was significant exclusively in the inflationequation (see Table 7.1).

Furthermore, the test results in Table 9.4 show that the weak exogeneityresult for yr and Rb is robust to the choice of r and we might ask whetherthey are jointly exogenous. Because the weak exogeneity tests of a singlevariable are not independent the joint hypothesis of two or several variablescan be (and often is) rejected even if the latter are individually accepted asweakly exogenous. The design matrix for the joint test is specified as:

Page 190: Juselius_Johansen_The Cointegrated VAR Model

9.5. LONG-RUN WEAK EXOGENEITY 191

R0 =·0 0 0 0 10 1 0 0 0

¸,

and the test statistic, distributed as χ2(6), became 3.21 with a p-value of0.78. Hence, we can safely accept both variables to be weakly exogenousin this system. Since m = 2 which corresponds to the number of commontrends, p− r, this is the maximum number of weakly exogenous variables inthis case.Long-run weak exogeneity of the real income and the bond rate implies

that valid inference on β can be obtained from the three-dimensional systemdescribingmr,∆p, and Rm conditional on y

r and Rb (Johansen, 1992, Hendryand Richard, 1983). The argument is based on a partitioning of the jointdensity into the conditional and marginal densities:

D(∆xt|∆xt−1, xt−1,Ds83t−1, Dt;θ) == D(∆x1t|∆x2t,∆xt−1, xt−1, Ds83t−1,Dt;θ1)×D(∆x2t|∆xt−1, Dt;θ2)

(9.19)

where x0t = [x01t, x

02t] and x

01t = [m

rt,∆pt, Rm,t], x

02t = [y

rt , Rb,t],

Dt = [D75.4,∆Ds83t,∆Ds83t−1,Dp92.4], θ1,θ2 are variation free andonly θ1 contains the long-run parameters of interest β.In this case the cointegrated VAR(2) model

∆xt= Γ1∆xt−1+αβ0xt−1+ΦDt+εt

is equivalent to

∆x1t = ρ∆x2t+Γ1.1∆xt−1+α1β0xt−1+Φ1Dt+ε1t − ρε2t

∆x2t = Γ1.2∆xt−1+Φ2Dt+ε2t

where Cov(ε1.2t, ε2t) = 0, a1 is (3×2), Γ1.1 (3×5), α1 (3×3), Φ1 (3× 5), Γ1.2(2× 5), and Φ2 (2× 5).When m zero-row restrictions on α are accepted (m ≤ (p − r) we can

partition the p equations into (p−m) equations which exhibit levels feed-back, and m equations with no levels feedback. We say that the m variablesare weakly exogenous when the parameters of interest are β. Because the

Page 191: Juselius_Johansen_The Cointegrated VAR Model

192 CHAPTER 9. TESTING RESTRICTIONS

m weakly exogenous variables do not contain information about the long-run parameters, we can obtain fully efficient estimate of β from the (p−m)equations conditional on the marginal models of the m weakly exogenousvariables. This gives the condition for when a partial model can be used toestimate β without losing information.More formally: let xt = x1,t,x2,t where x2,t is weakly exogenous

when β is the parameter of interest, then a fully efficient estimate of β canbe obtained from the partial model:

∆x1,t = A0∆x2,t + Γ11∆xt−1 +α1β0xt−1 + µ0 + Φ1Dt + ε1t. (9.20)

Thus, to know whether we can estimate β from a partial system we needfirst to estimate the full system and test α2 = 0 in that system. After havingestimated the full system, the question is why would we bother to reestimatea partial system. There are two reasons:

1. It is sometimes a priori very likely that weak exogeneity holds, forexample it is almost sure that US bond rate has an influence on theDanish economy, while it is almost as sure that the Danish economyis irrelevant for the US bond rate. Hence, testing may not be neces-sary and we can estimate a partial system conditional on the US bondrate. When the number of potentially relevant variables to include inthe VAR model is large it can be useful to impose weak exogeneityrestrictions from the outset.

2. By conditioning on weakly exogenous variables one can often achieve apartial system which has more stable parameters than the full system.This is often the case when the marginal model of the weakly exogenousvariable has non-constant parameters or exhibit non-linearities in theparameters.

Finally, finding weakly exogenous variables is often helpful in the identi-fication of the common driving trends as will be illustrated in Chapter 13.Thus the hypothesis α2 = 0 is of interest for its own sake as it correspondsto the hypothesis of ’no long-run feed-back.

9.6 Revisiting the scenario analysis

The choice of r = 3 is consistent with the discussion in Chapter 2, wherewe hypothetically assumed two autonomous common trends,

Pti=1u1i and

Page 192: Juselius_Johansen_The Cointegrated VAR Model

9.6. REVISITING THE SCENARIO ANALYSIS 193

Pti=1u2i as the driving forces in the small monetary system. We also noted

in Chapter 2 that if nominal and real shocks are separated in the sense ofnominal shocks exhibiting no real effects (at least in the long-run) we wouldexpect to find the following relations to be stationary cointegrating relations:

(m− p− yr) ∼ I(0),(Rb −Rm) ∼ I(0),(Rb −∆p) ∼ I(0).

In the next chapter we will demonstrate that each of them imposes twooveridentifying restrictions on the cointegrating space, i.e. a total of sixtestable restrictions, which together would completely determine the cointe-gration space:

β0xt =

1 −1 0 0 00 0 0 1 −10 0 −1 0 1

(m− p)tyrt∆ptRm,tRb,t

.However, none of the above relations were found to be stationary suggestingthat there have been permanent interaction effects between the nominal andthe real side of the economy (at least over the business cycle horizon). Wefound strong empirical support for the stationarity of the following relations1:

(m− p− yr) + 14(Rb −Rm) ∼ I(0),Rm − 0.45Rb ∼ I(0),(∆p−Rm)− 0.56(Rm −Rb)− 0.04(yr − b4t) ∼ I(0).

The first relation shows that there is empirical support for a Danish moneydemand relation of the type discussed in Romer (1996). However, it is a linearcombination of two nonstationary relations rather than a linear combinationof two stationary relations. The finding that (mr−yr)t ∼ I(1) and (Rb−Rm)t∼ I(1), but (mr − yr)t + 14(Rb −Rm)t ∼ I(0) implies that the stochastic

1Similar results have been found in many other empirical applications based on mone-tary transmission models for Germany, Italy and Spain in the post Bretton Woods period.See for example Juselius (1998a, 1998b), Juselius (2000), and Juselius and Toro (2003).

Page 193: Juselius_Johansen_The Cointegrated VAR Model

194 CHAPTER 9. TESTING RESTRICTIONS

trend in money velocity is the same as the stochastic trend in the spread.We will now take a closer look at the time series implications of this result.The fact that (mr − yr)t ∼ I(1) can mean either that mr and yr contain

different stochastic trends or that mr and yr have been affected by the samestochastic trend but not in the same proportion one to one. In the first caseno value of b in the linear combination mr − byr would produce stationar-ity between mr and yr, for example because yr had only been affected bycumulated real shocks, whereas mr by both real and nominal shocks. Thehypothesis (mr − byr) ∼ I(0) is a testable hypothesis which was rejected bythe data. Thus, money velocity seems to have been affected by the nominalstochastic trend, but possibly also by the real stochastic trend in this period.Which of the two cases is empirically correct can be inferred by examiningthe time series properties of the two interest rates.The cointegration results showed that even if (Rm − Rb)t ∼ I(1) there

existed a stationary linear combination (Rm − 0.45Rb)t ∼ I(0). Hence, thetwo interest rates do share a common trend though not in the proportion oneto one. This trend is much more likely to describe cumulated nominal ratherthan real shocks to the system. Since (mr − yr) and (Rm − Rb) were foundto be cointegrating, money velocity must have been affected by the nominalstochastic trend.Chapter 2 pointed out that it is not irrelevant for the effectiveness of

monetary policy whether we get one result or the other. As an example,let us consider the simple case when the central bank increases its interestrate in order to curb excess aggregate demand in the economy (and henceinflationary pressure). This is based on the assumption that the interest rateshock will first influence all market interest rates from the short to the longend, then lower the demand for investment, and finally bring inflation down.If (Rm−Rb) ∼ I(0), then the shocks will be transmitted one to one and thecentral bank may very well be successful in bringing the inflation down byincreasing interest rates. If (Rm − Rb) ∼ I(1), then the shocks will not betransmitted one to one and it is less obvious that the central bank is able tobring the inflation down by increasing its own interest rates. Whether thisis the case or not depends on the remaining cointegration properties and thedynamics of the system (Johansen and Juselius, 2003). For example it de-pends on whether excess aggregate demand is related to the real interest rate,whether the latter is stationary or not, whether excess aggregate demand isrelated to inflation or not, and so on.Because the implications of monetary policy are likely to differ depend-

Page 194: Juselius_Johansen_The Cointegrated VAR Model

9.6. REVISITING THE SCENARIO ANALYSIS 195

ing on the cointegration properties between a policy instrument variable,intermediate targets and a goal variable it is important to have reliable in-formation about these relationships. In periods of deregulation or changesin regimes the latter are likely to change. Therefore, cointegration proper-ties and changes in them are likely to provide valuable information aboutthe consequences of shifting from one regime to another. See for exampleJuselius (1998a).

Page 195: Juselius_Johansen_The Cointegrated VAR Model

Chapter 10

Identification of the Long-RunStructure

When the empirical model is estimated with data that are nonstationary inlevels we need to discuss two different identification problems: identificationof the long-run structure, i.e. of the cointegration relations and identifica-tion of the short-run structure, i.e. of the equations. The former is aboutimposing long-run economic structure on the unrestricted cointegration rela-tions, the latter is about imposing short-run dynamic adjustment structureon the equations for the differenced process. In this chapter we will pri-marily discuss identification of the long-run relations and leave the short-runadjustment structure to the next chapter.

The organization of this chapter is the following: Section 10.1 discussesthe cointegrated VAR model for first order integrated data in reduced andstructural form. The parameters of the model are partitioned into the short-run and the long-run parameters and it is shown that the analysis of thelong-run structure can be performed in either representation. Section 10.2discusses the condition for identification in terms of restrictions of the long-run structure. A general result for formal identification in a statistical modelis given, and empirical identification is defined. Section 10.3 discusses how tocalculate degrees of freedom when testing over-identifying restrictions. Sec-tion 10.4 discusses just-identified restrictions of the long-run structure andprovides two illustrations in which economic identification can be addressed.Section 10.5 does the same for an over-identified structure and Section 10.6for an nonidentified structure. Section 10.7 illustrates some recursive proce-dures to check for parameter constancy and Section 10.8 concludes.

195

Page 196: Juselius_Johansen_The Cointegrated VAR Model

196 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

10.1 Identification when data are nonstation-

ary

To illustrate the difference between the two identification problems it is usefulto consider the cointegrated VAR model both in the so called reduced formand the structural form and discuss in which aspects they differ. First,consider the usual reduced form representation:

∆xt = Γ1∆xt−1 +αβ0xt−1 + ΦDt + εt, εt ∼ Np(0,Ω) (10.1)

and then pre-multiply (10.1) with a nonsingular p × p matrix A0 to obtainthe so called structural form representation (10.2):

A0∆xt = A1∆xt−1 + aβ0xt−1 + ΦDt + vt, vt ∼ Np(0,Σ). (10.2)

At this stage we assume that reduced form parameters λRF = Γ1,α,β,Φ,Ωand structural form parameters λSF = A0, A1,a,β, Φ,Σ are unrestricted.To distinguish between parameters of the long-run and the short-run struc-ture we partition λRF = λSRF ,λLRF, where λSRF = Γ1,α,Φ,Ω and λLRF =β and λSF = λSSF ,λLSF, where λSSF = A0, A1,a, Φ,Σ and λLSF = β.The relation between λSRF and λ

SSF is given by:

Γ1 = A−10 A1, α = A

−10 a, εt = A

−10 vt, Φ = A−10 Φ, Ω = A−10 Σ A0

0−1.

The short-run parameters of the reduced form, λSRF , are uniquely de-fined, whereas those of the structural form, λSSF , are not without imposingp(p − 1) just-identifying restrictions. Although the long-run parameters βare uniquely defined based on the normalization of the eigenvalue problem,this need not coincide with an economic identification and in general we needto impose r(r − 1) just-identifying restrictions also on β. Because the long-run parameters remain unaltered under linear transformations of the VARmodel, β is the same both in both forms and identification of the long-runstructure can be done in either the reduced form or the structural form.This gives the rationale for identifying the long-run and the short-run

structure as two separate statistical problems, though from an economic pointof view they are interrelated. Therefore, we can discuss the statistical prob-lem of how to test structural hypotheses on the long-run structure β before

Page 197: Juselius_Johansen_The Cointegrated VAR Model

10.2. IDENTIFYING RESTRICTIONS 197

addressing structural hypotheses on the short-run structure A0,A1, a,Φ,Σ.From a practical point of view this is invaluable, as the joint identificationof the long- and short-run structure is likely to be immensely difficult. Theidentification process starts with the identification of β in the reduced formand proceed to the identification of the short-run structure keeping the iden-tified β fixed.To understand all aspects of identification it is useful to distinguish be-

tween identification in three different meanings:

• generic (formal) identification, which is related to a statistical model• empirical (statistical) identification, which is related to the actual esti-mated parameter values, and

• economic identification, which is related to the economic interpretabil-ity of the estimated coefficients of a formally and empirically identifiedmodel.

For identification to be empirically useful all three conditions for identifi-cation have to be satisfied in the empirical problem, which as a crucial partinvolves the choice of data.

10.2 Identifying restrictions

1In order to identify the long-run structure we have to impose restrictions oneach of the cointegrating relations. As before, Ri denotes a p1×mi restrictionmatrix andHi =Ri⊥ a p1×si design matrix (mi+si = p1) so thatHi definedbyR0

i Hi = 0. Thus, there aremi restrictions and consequently si parametersto be estimated in the i’ th relation. The cointegrating relations are assumedto satisfy the restrictions R0

i βi = 0, or equivalently βi = Hiϕi for somesi-vector ϕi, that is

β = (H1ϕ1, ...,Hrϕr), (10.3)

where the matrices H1, ...,Hr express linear hypotheses to be tested againstthe data. Note that the linear restrictions do not specify any normalization

1This section relies strongly on Johansen and Juselius (1995).

Page 198: Juselius_Johansen_The Cointegrated VAR Model

198 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

of the vectors βi. In the previous chapter we gave several examples of designmatrices Hi.The idea is to choose Hi so that (10.3) identifies the cointegrating rela-

tions. The well known rank condition expresses that the first cointegrationrelation, say, is identified if

rank(R01β1, ...,R

01βr) = rank(R

01H1ϕ1, ...,R

01Hrϕr) = r − 1. (10.4)

This implies that no linear combination of β2, ...,βr can produce a vectorthat “looks like” the coefficients of the first relation, i.e. satisfies the restric-tions defining the first relation. Note, however, that in order to check therank condition (10.4) we need to know the coefficients ϕi i = 1, ..., r, but inorder to estimate the coefficients we need to know whether the restrictionsare identifying. Most software programs check the rank condition prior toestimation, by first giving the coefficients ϕ some arbitrary numbers. If therank condition is satisfied estimation can proceed.One can, however, avoid the arbitrary coefficients and explicitly check the

rank condition based on the known matrices Ri and Hi. Johansen (1992b)gives the following condition for a set of restrictions to be identifying:

The set of restrictions is formally identifying if for all i and k = 1, ..., r−1and any set of indices 1 ≤ i1 < ... < ik ≤ r not containing i it holds that

rank(R0iHi1, ...,R

0iHik) ≥ k . (10.5)

As an example we consider r = 2, where (10.5) reduces to the condition:

ri.j = rank(R0iHj) ≥ 1, i 6= j.

If r = 3 the conditions to be satisfied are

ri.j = rank(R0iHj) ≥ 1, i 6= j,

ri.jm = rank(R0i(Hj,Hm)) ≥ 2, i, j,m different.

The value of ri.jm can be determined by finding the eigenvalues of the sym-metric matrix (Hj,Hm)

0(I −Hi(H0iHi)

−1H0i)(Hj,Hm) using the identity:

ri.jm = rank(R0i(Hj,Hm)) = rank(Hj,Hm)

0(I −Hi(H0iHi)

−1H0i)(Hj,Hm).

(10.6)

Page 199: Juselius_Johansen_The Cointegrated VAR Model

10.2. IDENTIFYING RESTRICTIONS 199

Thus the usual rank condition (10.4) requires the knowledge of the (not yetestimated) parameters, whereas condition (10.5) is a property of the knowndesign matrices. In the empirical applications below we will illustrate howto check the rank conditions using (10.6).It is useful to distinguish between just identifying restrictions and over-

identifying restrictions. The former can be achieved by linear combinations ofthe relations (equations) and, hence, do not change the likelihood function,whereas the latter constrain the parameter space and, hence, change thelikelihood function.

For identifying restrictions it holds that rank (R0iβ) ≥ r−1, and if equality

holds the i0th relation is exactly identified, if inequality holds then the i0threlation is overidentified. The system is exactly identified if rank(R0

iβ) = r−1for all i, and overidentified if it is identified and rank R0

iβ > r−1 for at leastone i.

As a general procedure it is useful to start with a just-identified systemand then impose further restrictions if the estimated parameters indicate thata further reduction in the statistical model is possible. For example, onewould generally prefer to set insignificant coefficients in the just-identifiedmodel to zero. Such restrictions constrain the parameter space and, thus,are testable. However they may, but need not, be overidentifying. Thereason is that when accepting further overidentifying restrictions the rankcondition (10.5) need not be satisfied and the more restricted model is nolonger identified.For example, if the rank condition (10.5) is satisfied under the condition

that a certain coefficient is nonzero, but the true value is in fact zero, thenthe rank condition in the ’true’ model is not satisfied. Generally the param-eter values of the generically identified model are not known but have to beestimated. If the true coefficient is zero, then the estimate will in general notbe significantly different from zero and restricting it to zero will in such acase violate the rank condition. Thus, we have that although the original sta-tistical model is formally identifying, the economic model is not empiricallyidentified.This can be formalized as:

An economic model specified by the parameter value ϑ, say, is formallyidentified if ϑ is contained in the parameter space specified by identifyingrestrictions. It is empirically identified if ϑ is not contained in any noniden-tified sub-model.

Page 200: Juselius_Johansen_The Cointegrated VAR Model

200 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

In a formally identified model the parameters can be estimated subject tothe restrictions by the iterative procedure discussed in Chapter 9, Section 4.The results can now be generalized to the case where we impose (identifying)restrictions on all cointegrating vectors. We consider the equilibrium errorcorrection term of (10.1) and write it as:

αβ0xt−1 = α1β01xt−1 + ...+αrβ

0rxt−1 = α1ϕ

01H

01xt−1 + ...+αrϕ

0rH

0rxt−1.

The hypothesis on β is expressed as:

β = (β1, ...,βr) = (H1ϕ1, ...,Hrϕr)

where Hi, i = 1, ..., r are known design matrices of dimension p1 × si, andϕiare si × 1 matrices of unrestricted coefficients. Again we partition α sothat it corresponds to the partitioning of β:

α = (α1, ...,αr).

The concentrated model can be written as:

R0t = α1ϕ01H

01R1t + ...+αrϕ

0rHrR1t + εt

The estimation problem can be solved by extending the switching algorithm,introduced in Section 9.3, described below:

1. Estimate an initial value of β1 = β1, ..., and βr−1 = βr−1 .

2. For fixed value of β1 = β1, ..., βr−1 = βr−1 estimate αr and ϕr by re-duced rank regression ofR0t onH

0rR1t corrected for β

01R1t, ..., β

0r−1R1t.

This defines βr = Hrϕr and L−2/Tmax (β1, ..., βr−1).

3. For fixed value of β2 = β2, ...,βr−1 = βr−1,βr = βr estimate α1and ϕ1 by reduced rank regression of R0t on H

01R1t corrected for

β02R1t, ..., β

0rR1t. This defines β1 = H1ϕ1 and L

−2/Tmax (β2, ..., βr).

4. Repeat the steps using the last obtained values of βi until the value ofthe maximized likelihood function has converged.

Page 201: Juselius_Johansen_The Cointegrated VAR Model

10.2. IDENTIFYING RESTRICTIONS 201

The eigenvalue problem for fixedeβ1 = β1 = β1, ...,βr−1 = βr−1 is

given by:

¯ρrH

0rS11Hr −H0

rS10.eβ1(S00.β1)−1S01.eβ1Hr

¯= 0

and for fixedeβ2 = β2 = β2, ...,βr = βrby:

¯ρ1H

01S11H1−H0

1S10.eβ2(S00.eβ2)−1S01.eβ2H1

¯= 0,

and so on. The maximum of the likelihood function is given by:

L−2/Tmax (Hc) = |S00| (1− ρ1)...(1− ρr)

where ρi are the eigenvalues obtained from applying the switching algorithmuntil convergence of the likelihood function.Using the LR procedure, the hypothesis (??) can be tested by calculating

the test statistic:

−2 lnΛ = Tln(1− ρ1) + ...+ ln(1− ρr)− ln(1− λ1)− ...− ln(1− λr),(10.7)

which, under assumption that all Hi are identifying, is asymptotically χ2

distributed with degrees of freedom given by:

υ = Σri=1(mi − r + 1) = Σri=1(p1− r)− (si − 1).How to calculate the degrees of freedom will be given a detailed discussion

in the next section.To summarize: For fixed values of ϕ2, ...,ϕr, or β2, ...,βr, we can find

the ML estimate of β1 by performing a reduced rank regression of ∆xton H0

1xt−1 corrected for all the stationary and deterministic terms, that is,β02xt−1, ...,β

0rxt−1,∆xt−1 and Dt. This determines the estimate of ϕ1 and,

hence, β1 = H1ϕ1. In the next step we keep the values β1,β3, ...,βr fixedand perform a reduced rank regression of ∆xt on H

02xt−1 corrected for all

Page 202: Juselius_Johansen_The Cointegrated VAR Model

202 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

stationary and deterministic terms. This determines β2. By applying thealgorithm until the likelihood function has converged to its maximum wecan find the maximum likelihood estimates of β subject to the identifyingrestrictions.The speed of convergence of the switching algorithm depends very much

on how we choose the initial values of β. For example, the unrestricted es-timates of β are not in general the best choice, because the unrestrictedeigenvectors need not correspond to the ordering given by H1, ...,Hr and,thus, can be very poor initial values. Instead, the linear combination of theunrestricted estimates which is as close as possible to sp(Hi), i = 1, ..., r isclearly preferable as a starting value for βi. These can be found by solvingthe eigenvalue problem:

|ρH0iHi −H0

iβ0(β

0β)β

0Hi| = 0,

for the r eigenvalues ρ1 > ... > ρr and v1, ..., vr, and choose as initial valuefor βi the first eigenvector defined by Hiϕi. This choice of initial values hasthe extra advantage that for exactly identified equations no iterations areneeded.

10.3 Formulation of identifying hypotheses and

degrees of freedom

When testing restrictions imposed on the cointegration relations using read-ily available software packages, the degrees of freedom are usually providedby the program. However, some hypotheses can impose restrictions whichare quite complicated and where standard formula for calculating degrees offreedom may no longer be applicable. CATS will in such cases suggest anumber and ask if you agree or simply prompt for the degrees of freedom. Itis, therefore, important to understand the logic behind the calculations of thedegrees of freedom. Furthermore, even if most software packages check iden-tification using some generic values for the model parameters and inform theuser when identification is violated, we need to understand why identificationfailed in order to respecify the restrictions.The following example illustrates how to calculate degrees of freedom

when we have imposed over-identifying restrictions on three cointegratingrelations in a VAR analysis of the Danish data (mr

t , yrt ,∆pt, Rm.t, Rb,t). The

Page 203: Juselius_Johansen_The Cointegrated VAR Model

10.3. FORMULATING IDENTIFYING HYPOTHESES 203

first relation expresses that (mrt − yrt ) and (Rm.t −Rb,t) are cointegrated, so

are driven by the same stochastic trend; the second relation that yrt ,∆pt andRb,t are cointegrated, so share two common trends; and the third relationthat Rm.t and Rb,t are cointegrated, so share one common trend.The first step is to examine whether the restricted structure defined by

βc = H1ϕ1, ...,Hrϕr satisfies the rank and order condition for identifica-tion. We will here illustrate how this can be done analytically using condition(10.5) based on the following example where, for simplicity, we disregard anydeterministic components in the cointegration relations.

βc11 −βc11 0 βc12 −βc120 βc21 βc22 0 βc230 0 0 βc31 βc32

mrt

yrt∆ptRm,tRb,t

(10.8)

As discussed in Chapter 7 the parameters (βc11,βc12), (β

c21, β

c22, β

c23) and (β

c31,β

c32)

are defined up to a factor of proportionality, and it is possible to normalizeon one element in each vector without changing the likelihood:

1 −1 0 βc12/βc11 −βc12/βc11

0 1 βc22/βc21 0 βc23/β

c21

0 0 0 1 βc32/βc31

mrt

yrt∆ptRm,tRb,t

. (10.9)

When normalizing βci by diving through with a non-zero element βcij, the

corresponding αci vector is multiplied by the same element. Thus, normaliza-tion does not change Π = αciβ

c0i = αβ0 and we can generally choose whether

to normalize or not. However, when the long-run structure is identified nor-malization becomes more important. This is because it is only possible toget standard errors of βij when each cointegration vector is properly normal-

ized. In this case it is convenient to express βci = hi + Hiϕi where ϕ is now(si − 1)× 1, hi is a vector in sp(Hi) defining the chosen normalization, andsp(hi, Hi) = sp(Hi) (see Johansen, 1995, p. 76).As discussed in Chapter 9.1 hypotheses on the cointegration structure

can be formulated either by specifying the number of free parameters ϕi, i.e.βc = (H1ϕ1, ...,Hrϕr) or the number of restrictions mi, i.e. R

0β = 0. Whenwe discuss identification it is convenient to choose the former formulation,

Page 204: Juselius_Johansen_The Cointegrated VAR Model

204 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

i.e. to express the hypotheses in terms of the number of free parameters.For each cointegration vector βci we have to make a distinction between thenormalized coefficient and the remaining si − 1 free coefficients ϕi. As anillustration we express the above structure (10.9) using βci = hi + Hiϕi :

βc = h1 + H1ϕ1, h2+H2ϕ2, h3+H3ϕ3, (10.10)

where

h1 + H1ϕ1 =

1−1000

+0001−1

[ϕ12] , h2 + H2ϕ2 =

01000

+0 00 01 00 00 1

·ϕ22ϕ23

¸,

h3 + H3ϕ3 =

00010

+00001

[ϕ32] .

For given estimates of βc the estimates of αc are given by the formula inChapter 7:

αc = S01βc(β

c0S11β

c)−1, (10.11)

and the standard errors of βc

ij are calculated as:

σβcij=

qT−1diag[HH0(αc0Ω−1αc ⊗ S11)H−1H0]ij (10.12)

where

H =

H1 0 · · · 0

0 H2 0...

... 0. . . 0

0 · · · 0 Hr

Page 205: Juselius_Johansen_The Cointegrated VAR Model

10.3. FORMULATING IDENTIFYING HYPOTHESES 205

and the elements diag[·]ij are defined by the ordering i = 1, ..., r and j =1, ..., p1.

The standard errors of the corresponding αcij coefficients are calculatedas:

σαcij=

qT−1Ωii(β

c(β

c0S11β

c)−1βc0)jj (10.13)

We will show below that it is always possible to impose r − 1 just-identifying restrictions on β by linear manipulations of the unrestricted coin-tegration vectors. Such ’pseudo’ restrictions do not change the value of thelikelihood function and, thus, no testing is involved in this case. Note thatpseudo ’restrictions’ are just-identifying only if there are exactly r − 1 ofthem and they satisfy the condition (10.5). Additional restrictions on thestructure change the value of the likelihood function and, thus, are testable.Such restrictions are over-identifying if they satisfy (10.5), otherwise they arenon-identifying though nevertheless real testable restrictions.

Given that the restrictions satisfy (10.5) the degrees of freedom can becalculated from the following formula:

ν =X(mi − (r − 1)).

Consider si, the number of restricted coefficients in βci , and mi = p− si, the

total number of restrictions on vector βci , then the degrees of freedom in theabove example are calculated as follows:

si s1 = 2 s2 = 3 s3 = 2mi m1 = 3 m2 = 2 m3 = 3r − 1 2 2 2

mi − (r − 1) 1 0 1

so the degrees of freedom are ν = 2. Note, however, that some of the re-strictions may not be identifying (for example the same restriction on allcointegration relations), but are nevertheless testable restrictions. Whateverthe case, the restrictions that are identifying must as a minimum satisfy thecondition for just identification.

Page 206: Juselius_Johansen_The Cointegrated VAR Model

206 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

10.4 Just-identifying restrictions

In general we can always transform the long-run matrix Π = αβ0 by anonsingular r × r matrix Q in the following way: Π = αQQ−1β0 = αβ

0,

where α = αQ and β = βQ0−1.We will now demonstrate how to choose thematrix Q so that it imposes r−1 just-identifying restrictions on each βi. Asan example of the latter, we consider the following design matrix Q = [β1]where β1 is a (r × r) nonsingular matrix defined by β0 = [β1,β2]. In thiscase αβ0 = α(β1β

−101 β0) = α[I, β] where I is the (r × r) unit matrix and

β = β−101 β2 is a r× (p− r) matrix of full rank. For example, assume that βis (5×3):

β =

β11 β12 β13β21 β22 β23β31 β32 β33.... .... .....β41 β42 β43β51 β52 β53

=β1

....β2

; β−11

β1

....β2

=

1 0 00 1 00 0 1.... .... .....eβ41 eβ42 eβ43eβ51 eβ52 eβ53

We notice that the choice of Q = β1 in our example has in fact imposed twozero restrictions and one normalization on each cointegration relation. Thesejust-identifying restrictions have transformed β to the long-run ”reducedform”. Thus, the above example for xt = [x1t,x2t], where x1t = [x1t, x2t, x3t]and x2t = [x4t, x5t], would describe an economic application where the threevariables in x1t are ’endogenous’ and the two in x2t are ’exogenous’.Furthermore, if α2 = 0, then β0xt does not appear in the equation for

∆x1,t and x2,t is weakly exogenous for β. In this case efficient inference onthe long-run relations can be conducted in the conditional model of ∆x1,tgiven ∆x2,t as discussed in the previous chapter, see also Johansen (1992a).When ’endogenous’ and ’exogenous’ are given an economic interpretationthis corresponds to the representation suggested by Phillips (1990). Notethat the triangular representation requires that α2 = 0, which is a testablehypothesis and it is straightforward to test whether this is a good descriptionof the data.Example 1: We consider here a just identified structure describing the

long-run ’reduced form’ assuming real money, inflation, and the short-term

Page 207: Juselius_Johansen_The Cointegrated VAR Model

10.4. JUST-IDENTIFYING RESTRICTIONS 207

Table 10.1: Two just-identified long-run structures

HS1 HS2

β1 β2 β3 β1 β2 β3mr 0 1.0 0 0 1.0 0yr 0.09

(3.9)−0.86(−4.3)

0.01(1.3)

0.09(3.9)

-1.0 0.08(3.9)

∆p 1.0 0 0 1.0 0 1.0Rm 0 0 1.0 0.21

(4.4)−15.9(−3.7)

−0.27(−4.9)

Rb −0.12(−0.6)

8.09(4.8)

−0.44(−6.8)

−0.21(−4.4)

15.0(6.1)

0

D83 −0.00(−0.2)

−0.18(−6.0)

−0.00(−1.7)

−0.00(−0.5)

−0.14(−5.0)

−0.00(−0.1)

α1 α2 α3 α1 α2 α3∆mr

t * −0.31(−5.4)

−5.32(−3.2)

* −0.30(−5.4)

*

∆yrt * * * * * *∆2pt −1.66

(−9.1)* 1.7

(1.7)3.4(1.6)

* −5.00(−2.4)

∆Rm,t * −0.01(−1.7)

−0.31(−3.5)

−0.83(−4.5)

−0.01(−1.7)

0.83(4.5)

∆Rb,t * * * * * *

interest rate to be ’endogenous’ and real income and the bond rate to be’exogenous’ corresponding to the following restrictions on β :

HS1 : β = (H1ϕ1,H2ϕ2,H3ϕ3),

where

H1 =

0 0 0 01 0 0 00 1 0 00 0 0 00 0 1 00 0 0 1

,H2 =

1 0 0 00 1 0 00 0 0 00 0 0 00 0 1 00 0 0 1

,H3 =

0 0 0 01 0 0 00 0 0 00 1 0 00 0 1 00 0 0 1

i.e. H1 picks up inflation rate, H2 real money, H3 the short-term interestrate and the two ’exogenous’ variables and the shift dummy Ds83 enter all

Page 208: Juselius_Johansen_The Cointegrated VAR Model

208 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

three relations. The estimates are reported in Table 10.1. The first relationis similar to the real income relation H16 in Table 9.3, noticing that thecoefficient to bond rate (and the shift dummy) is not significantly differentfrom zero. The second relation is approximately money velocity as a functionof the long-term bond rate, except that the income coefficient is not as closeto unity as we have seen before. The third relation is almost replicating H25

in Table 9.2 (noticing that real income is not significant). There is no testingin this case since the r - 1 = 2 restrictions have been obtained by linearcombinations of the unrestricted relations, i.e. by rotating the cointegrationspace.The α coefficients are reported in the lower part of Table 10.1. Only coeffi-

cients with a |t-value| > 1.6 have been reported. Because no real restrictionshave been imposed the matrix Π = αβ

0= αcβ

c0, i.e. it is exactly the same

as for the unrestricted model. It is now easy to see that the combination

αc12βc02 xt + αc13β

c03 xt will replicate the money demand relation H15 in Table

9.2. Hence, the money demand relation, mr−yr−14.1(Rm−Rb)−0.145Ds83is in fact a linear combination of two stationary relations. Similarly, the lin-

ear combination αc31βc01 xt+α

c33β

c03 xt will replicate the inflation relation H28

in Table 9.2.Example 2. Here we give an example of both zero and non-zero (homo-

geneity) just-identifying restrictions:

HS2 : β = (H1ϕ1,H2ϕ2,H3ϕ3),

where

H1 =

0 0 0 01 0 0 00 1 0 00 0 1 00 0 −1 00 0 0 1

,H2 =

1 0 0 0−1 0 0 00 0 0 00 1 0 00 0 1 00 0 0 1

,H3 =

0 0 0 01 0 0 00 1 0 00 0 1 00 0 0 00 0 0 1

In this case we check that the restrictions are in fact just-identifying bycalculating the rank condition (10.5). Just identification corresponds torank(R0

iHj) = 1 for i, j = 1, 2, 3 and j 6= i, and rank (R0i(Hj,Hk)) = 2

for i, j, k different. The rank indices reported in Table 10.3 column HS2,

Page 209: Juselius_Johansen_The Cointegrated VAR Model

10.5. OVER-IDENTIFYING RESTRICTIONS 209

Table 10.2: Two over-identified long-run structures for the Danish data

HS.3 HS.4

β1 β2 β3 β1 β2 β3mr 0 1.0 0 0 1.0 0yr 0.09

(9.0)−0.96(−5.3)

0 0.08(5.0)

-1.0 0

∆p 1.0 0 0 1.0 0 0Rm 0 0 1.0 −1.48

(−6.7)−13.8(−6.1)

1.0

Rb 0 8.56(5.6)

−0.38(−10.0)

0.48(2.2)

13.8(6.1)

−0.38(−9.7)

D83 0 −0.18(−4.7)

0 0 −0.15(−5.5)

0

α1 α2 α3 α1 α2 α3∆mr

t ∗ −0.29(−5.0)

−5.09(−3.4)

∗ −0.30(−5.0)

∗∆yrt ∗ ∗ ∗ ∗ ∗ ∗∆2pt −1.66

(−9.0)1.5(1.6)

−1.66(−9.1)

∗ ∗∆Rm,t ∗ −0.01

(−2.3)−0.28(−3.5)

∗ −0.01(−2.3)

−0.37(4.4)

∆Rb,t ∗ ∗ ∗ ∗ ∗ ∗

confirm that the above conditions are indeed met. The estimates of α andβ are given in the right hand side of Table 10.1. It appears that the secondrelation is now the money demand relation, whereas β1+β3 becomes the firstrelation and β1−β3 the third relation under HS1.These two examples serve the purpose of illustrating that in general one

can find several identified structures by rotating the cointegrating space.Table 10.1 showed that two of the estimated coefficients in the just-identifiedstructures were insignificant. In the next section we will discuss how toimpose over-identifying restrictions on β and how to choose between differentlong-run structures.

10.5 Over-identifying restrictions

We will consider two overidentified structures based on the money demanddata for p1 = 6 and r = 3. Formal identification requires that rank(R0

iHj) ≥1 for i, j = 1, 2, 3 and j 6= i, and that rank (R0

i(Hj,Hk)) ≥ 2 for i, j, k

Page 210: Juselius_Johansen_The Cointegrated VAR Model

210 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

Table 10.3: Checking the rank conditions

ri.j HS2 HS3 HS4 HS5 ri.jk HS2 HS3 HS4 HS5

1.2 1 3 2 2 1.23 2 4 4 41.3 1 2 1 0

2.1 1 1 2 3 2.13 2 2 3 32.3 1 1 1 2

3.1 1 2 2 2 3.12 2 4 3 23.2 1 3 2 3

different. The rank indices are given in Table 10.3 column HS.3 and HS.4

where the i.j elements should be at least 1 and the i.jk elements at least 2 forgeneric identification. The degrees of freedom in the test for overidentifyingrestrictions are calculated as ν = Σi(mi − r + 1), where mi is the number ofrestrictions on βi.Example 3. We first consider an overidentified structure reported defined

by:

HS3 : β = (H1ϕ1,H2ϕ2,H3ϕ3),

where

H1 =

0 01 00 10 00 00 0

,H2 =

1 0 0 00 1 0 00 0 0 00 0 0 00 0 1 00 0 0 1

,H3 =

0 00 00 01 00 10 0

,

i.e. structureHS1 with the insignificant coefficients set to zero. The estimatesare given in Table 10.2. The degrees of freedom are ν = Σri=1mi − (r − 1) =(4 − 2) + (2 − 2) + (4 − 2) = 2 + 0 + 2 = 4 and the test statistic becameχ2(4) = 1.79 with a p-value of 0.77. Thus the structure can clearly beaccepted.Example 4. The next example consists of the joint testing of the three sta-

tionary relations H28, H15, and H25 reported in Table 9.2. They are reported

Page 211: Juselius_Johansen_The Cointegrated VAR Model

10.6. LACK OF IDENTIFICATION 211

under HS4 in Table 10.2. The degrees of freedom are ν = Σri=1mi− (r−1) =1 + 1 + 2 = 4 and the test statistic became χ2(4) = 1.60 with a p-valueof 0.81. Thus, both HS3 and HS4 are acceptable long-run structures. HS3

has more the character of ’building blocks’ (corresponding to the irreduciblecointegration vectors in Davidson 2001) which, when weighted by the cor-responding αij, become more interpretable from an economic point of view.The relations of HS4 describe directly interpretable steady-state relations,and as demonstrated above, are in fact linear combinations of the buildingblocks in HS3.Does it matter or not whether we choose one representation or the other?

In the present example the answer is probably ’not really’. Most economistswould probably prefer a structural representation that mimics as closely aspossible the hypothetical steady-state relations. From a statistical point ofview the ’building blocks’ representation is in many cases preferable for twodifferent but related reasons:First, when r is large relative to p - r it becomes increasingly difficult to

identify r meaningful steady-state relations given that at least r - 1 restric-tions has to be imposed on each relation.Second, when a steady-state relation is a direct combination of two sta-

tionary building blocks, such as (mr−yr)−b1(Rm−Rb) where (mr−yr) ∼ I(0)and (Rm − Rb) ∼ I(0), then b1 is no longer a coefficient between two non-stationary variables, and the super-consistency result for estimated cointe-gration coefficients no longer holds. In fact, b1 has now the meaning of anα coefficient combining two stationary cointegration relations. Nevertheless,the money demand relation in HS4 is a linear combination of two nonsta-tionary relations, (mr − yr) ∼ I(1) and (Rm −Rb) ∼ I(1), as demonstratedin Table 9.2.We demonstrated above that the same money demand relation can also

be obtained by combining two stationary building blocks. Similar argumentshold for the first relation in HS4. Because of the specific restrictions imposedthe βij coefficients ofHS4 are all coefficients between nonstationary variables.Hence, the calculated standard errors are based on correct distributionalassumptions in both HS3 and HS4.

10.6 Lack of identification

Example 5: . We now consider the structure defined by:

Page 212: Juselius_Johansen_The Cointegrated VAR Model

212 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

HS5 : β = (H1ϕ1,H2ϕ2,H3ϕ3),

where

H1 =

0 0 0 01 0 0 00 1 0 00 0 1 00 0 0 10 0 0 0

,H2 =

1 0 0−1 0 00 0 00 1 00 −1 00 0 1

,H3 =

0 00 00 01 00 10 0

,i.e. structure HS4 without imposing the homogeneity restriction on β1. Therank conditions of Table 10.3 shows that relation 1 is not identified relativeto relation 3. The reason is that the above structure cannot be distinguishedfrom βc1+ωβc3,βc2,βc3, ω ∈ R.We say that H3 is a subset of H1. But, thehomogeneity restriction of HS4 identifies β1 uniquely in the sense that ωβ

c3

can no longer be added without violating the former. Thus, identificationcan be restored by either imposing the homogeneity restriction as in HS4 orby setting one of two interest rates to zero in H1ϕ1.Thus, the model specified by the restrictions in HS5 is not identifying in

the sense defined here implying that the four parameters ϕ11, ϕ12, ϕ13 andϕ14 cannot be estimated uniquely without further restrictions. Another wayof expressing this is that one of the interest rates can be removed from β1by adding a linear combinations of β3. For example, β1 − 0.10

0.36β3, removes

the bond rate from β1. Therefore, in this set-up we can only estimate theimpact of a linear combination of the interest rates in the first relation.Nevertheless, though the restrictions in Table 10.4 are not identifying, theyare genuine restrictions on the parameter space and the model can be testedby a likelihood ratio test but in this case we need to tell CATS how manydegrees of freedom there is in the test. In this case the test statistic became2.07 with v = 1 + 1 + 2 = 4 degrees of freedom and the estimated relationsclearly define stationary relations in the cointegration space.

10.7 Recursive tests of α and β

The parameters of the identified model are simply plots of βc(t1) and α

c(t1)for t1 = T0, . . . , T . The standard errors are calculated as indicated in (10.12)and (10.13).

Page 213: Juselius_Johansen_The Cointegrated VAR Model

10.7. RECURSIVE TESTS OF α AND β 213

Table 10.4: An un-identified long-run structures for the Danish data

HS.5

β1 β2 β3mr 0 1.0 0yr 0.08 −1.0 0∆p 1.0 0 0Rm -0.04 -12.9 1.0Rb 0.10 12.9 −0.36D83 0 −0.15 0

α1 α2 α3∆mr

t ∗ −0.30(−5.0)

∗∆yrt ∗ ∗ ∗∆2pt −1.68

(−9.1)∗ 1.6

(1.8)

∆Rm,t ∗ −0.01(−2.3)

−0.36(−4.5)

∆Rb,t ∗ ∗ ∗

MO = 0.00

83 84 85 86 87 88 89 90 91 92 93-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

FY

83 84 85 86 87 88 89 90 91 92 930.04

0.06

0.08

0.10

0.12

0.14

DIFPY = 1.00

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75

2.00

IDE

83 84 85 86 87 88 89 90 91 92 93-2.16

-1.98

-1.80

-1.62

-1.44

-1.26

-1.08

-0.90

IBO

83 84 85 86 87 88 89 90 91 92 93-0.2

0.0

0.2

0.4

0.6

0.8

1.0

1.2

D83 = 0.00

83 84 85 86 87 88 89 90 91 92 93-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

Figure 10.1. Recursively calculated coefficients of β1(t1) fort1 =1983:2,...,1993:4.

Page 214: Juselius_Johansen_The Cointegrated VAR Model

214 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

DMO

83 84 85 86 87 88 89 90 91 92 93-2.0

-1.5

-1.0

-0.5

0.0

0.5

1.0

DFY

83 84 85 86 87 88 89 90 91 92 93-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

1.25

DDIFPY

83 84 85 86 87 88 89 90 91 92 93-2.2

-2.0

-1.8

-1.6

-1.4

-1.2

-1.0

-0.8

DIDE

83 84 85 86 87 88 89 90 91 92 93-0.075

-0.050

-0.025

0.000

0.025

0.050

0.075

DIBO

83 84 85 86 87 88 89 90 91 92 93-0.125

-0.100

-0.075

-0.050

-0.025

-0.000

0.025

0.050

0.075

Figure 10.2. Recursively calculated coefficients of α1(t1) fort1 =1983:2,...,1993:4.

MO = 1.00

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75

2.00

FY = -1.00

83 84 85 86 87 88 89 90 91 92 93-2.00

-1.75

-1.50

-1.25

-1.00

-0.75

-0.50

-0.25

0.00

DIFPY = 0.00

83 84 85 86 87 88 89 90 91 92 93-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

IDE

83 84 85 86 87 88 89 90 91 92 93-25

-20

-15

-10

-5

0

IBO

83 84 85 86 87 88 89 90 91 92 930

5

10

15

20

25

D83

83 84 85 86 87 88 89 90 91 92 93-0.30

-0.25

-0.20

-0.15

-0.10

-0.05

Figure 10.3. Recursively calculated coefficients of β2(t1) fort1 =1983:2,...,1993:4.

Page 215: Juselius_Johansen_The Cointegrated VAR Model

10.7. RECURSIVE TESTS OF α AND β 215

DMO

83 84 85 86 87 88 89 90 91 92 93-0.48

-0.40

-0.32

-0.24

-0.16

-0.08

0.00

0.08

0.16

DFY

83 84 85 86 87 88 89 90 91 92 93-0.10

-0.05

0.00

0.05

0.10

0.15

0.20

0.25

0.30

0.35

DDIFPY

83 84 85 86 87 88 89 90 91 92 93-0.20

-0.15

-0.10

-0.05

-0.00

0.05

0.10

0.15

DIDE

83 84 85 86 87 88 89 90 91 92 93-0.0240

-0.0200

-0.0160

-0.0120

-0.0080

-0.0040

0.0000

0.0040

0.0080

DIBO

83 84 85 86 87 88 89 90 91 92 93-0.015

-0.010

-0.005

0.000

0.005

0.010

0.015

0.020

0.025

Figure 10.4. Recursively calculated coefficients of α2(t1) fort1 =1983:2,...,1993:4.

MO = 0.00

83 84 85 86 87 88 89 90 91 92 93-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

FY = 0.00

83 84 85 86 87 88 89 90 91 92 93-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

DIFPY = 0.00

83 84 85 86 87 88 89 90 91 92 93-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

IDE = 1.00

83 84 85 86 87 88 89 90 91 92 930.00

0.25

0.50

0.75

1.00

1.25

1.50

1.75

2.00

IBO

83 84 85 86 87 88 89 90 91 92 93-0.520

-0.480

-0.440

-0.400

-0.360

-0.320

-0.280

-0.240

-0.200

-0.160

D83 = 0.00

83 84 85 86 87 88 89 90 91 92 93-1.00

-0.75

-0.50

-0.25

0.00

0.25

0.50

0.75

1.00

Figure 10.5. Recursively calculated coefficients of β3(t1) fort1 =1983:2,...,1993:4.

Page 216: Juselius_Johansen_The Cointegrated VAR Model

216 CHAPTER 10. IDENTIFICATION LONG-RUN STRUCTURE

DMO

83 84 85 86 87 88 89 90 91 92 93-3

-2

-1

0

1

2

3

4

5

6

DFY

83 84 85 86 87 88 89 90 91 92 93-4

-3

-2

-1

0

1

2

DDIFPY

83 84 85 86 87 88 89 90 91 92 93-3

-2

-1

0

1

2

3

DIDE

83 84 85 86 87 88 89 90 91 92 93-0.6

-0.5

-0.4

-0.3

-0.2

-0.1

-0.0

DIBO

83 84 85 86 87 88 89 90 91 92 93-0.36

-0.24

-0.12

0.00

0.12

0.24

Figure 10.6. Recursively calculated coefficients of α3(t1) fort1 =1983:2,...,1993:4.

We note that the estimated long-run coefficients remain remarkably stableall since 1983:2.

More!

10.8 Concluding discussions

This chapter demonstrated that an empirically stable money demand relationof the type discussed in Romer (1996) can be found as a linear combination(1.0, −13.8) of the two stationary cointegration relations (mr− yr+8.6Rb−0.2Ds83)t and (Rm − 0.4Rb)t, or, alternatively, as a cointegration relation(1.0, −13.8) between two nonstationary processes, (mr− yr− 0.2Ds83)t and(Rm−Rb)t. Given that the VAR model is a good description of the informa-tion in the data, we do not only obtain full information maximum likelihoodestimates of the money demand parameters (which in general have optimalproperties) but, as already discussed in Section 9, we also gain informationabout the number of autonomous shocks, i.e. the common stochastic trendsdriving the system.

Page 217: Juselius_Johansen_The Cointegrated VAR Model

10.8. CONCLUDING DISCUSSIONS 217

Furthermore, by imbedding the money demand relation (as well as theother stationary relations) in a dynamic equilibrium error correction systemit is possible gain empirical insight into the short-run dynamics of the ad-justment and feed-back behavior. This will be the topic of the next chapter,in which we will ask questions like: Is it possible by expansion of moneysupply in excess of the real productivity level in the economy to permanentlyincrease real income or is the final effect only an increase in the inflationrate? Should money supply or the interest rates be used as monetary in-struments? Is money stock endogenously or exogenously determined? Is thelevel of the market interest rates determined by the central bank or by thefinancial market, or by both? What are the empirical consequences of eithercase?

Page 218: Juselius_Johansen_The Cointegrated VAR Model

Chapter 11

Identification of the Short-RunStructure

In the previous chapter we gave the arguments for why it is possible totreat the identification as two separate, though interdependent, problems.Essentially all the results discussed in Section 10.1 apply also for the short-run structure of the model and will not be repeated here. This means thatan identified short-run adjustment structure should satisfy the conditions forgeneric, empirical and economic identification similarly as for the long-runstructure. However, the residual covariance matrix plays an important role inthe identification of the short-run structure, whereas the long-run covariancematrix of the cointegrating relations was not part of the identification process.In this important respect the two identification problems differ from eachother.In this chapter we will discuss identification of a short-run structure where

we allow simultaneous current effects as well as short-run adjustment effectsto lagged changes of the variables and to previous equilibrium errors to bepart of the model. However, we will not impose identifying restrictions onthe residuals, such as orthogonality restrictions, except for trivially in thetriangular form model. Identifying restrictions on the residuals will be dis-cussed in Chapter 13, where we also will discuss how to distinguish betweentemporary and permanent errors in a structural VAR model.The organization of this chapter is as follows: Section 11.1 discusses how

to impose identifying restrictions on the short-run adjustment dynamics, Sec-tion 11.2 the difficult problem of interpreting linear functions of VAR residu-als as structural shocks, and Section 11.3 empirical and economic identifica-

219

Page 219: Juselius_Johansen_The Cointegrated VAR Model

220CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

tion and provides some examples of economic questions related to the short-run adjustment structure. Sections 11.4, 11.5 and 11.6 illustrate short-runidentification in three different model representations. Section 11.7 discussesidentification of the short-run structure in a partial model and illustrateswith the Danish data. Finally, Section 11.8 concludes with a discussion ofthe economic plausibility of the estimated results based on generically andempirically identified models for the Danish data.

11.1 Formulating identifying restrictions

As already discussed the identification of the short-run structure is very muchfacilitated by keeping the properly identified cointegrating relations fixed at

their estimated values, i.e. treating βc0xt as predetermined stationary re-

gressors similarly as ∆xt−1. The statistical justification is that the estimatesof the long-run parameters β

care super consistent, i.e. the speed of con-

vergence toward the true value βc is proportional to t as t + ∞, whereasthe convergence of the estimates of the short-run adjustment parameters areproportional to

√t.

The cointegrated VAR model is a reduced form model in the short-rundynamics in the sense that potentially important current (simultaneous) ef-fects are not explicitly modeled but are left in the residuals. Thus, largeoff-diagonal elements of the covariance matrix Ω can be a sign of signifi-cant current effects between the system variables. In some cases the residualcovariances are small and of minor importance and can be disregarded al-together. Since the reduced form is always generically identified all furtherrestrictions on the short-run structure are then overidentifying. While a sim-plification search in the reduced form VAR model is quite simple, this isgenerally not the case when the covariance matrix Ω is part of the identifi-cation process.

When discussing identification of the short-run adjustment parameterswe will assume that the cointegration relations have been properly identifiedin the first step of the identification scheme. For simplicity we disregarddummy variables at this stage. The discussion of how to impose identify-ing restrictions on the short-run parameters will then be based on a givenidentified β = β

ci.e. for:

Page 220: Juselius_Johansen_The Cointegrated VAR Model

11.1. FORMULATING IDENTIFYING RESTRICTIONS 221

A0∆xt = A0Γ1∆xt−1+A0αβc0xt−1+A0µ0+A0εt, (11.1)

εt ∼ Np(0,Ω) (11.2)

or equivalently:

A0∆xt = A1∆xt−1 + aβc0xt−1 + µ0,a+vt, (11.3)

vt ∼ Np(0,Σ)

where A1 = A0Γ1, a = A0α, µ0,a = A0µ0, and vt = A0εt.Multiplying the VAR model with a nonsingular (p × p) matrix A0 does

not change the likelihood function, but introduces p×(p−1) new parameters(assuming that the diagonal elements of A0 are ones and that the residualcovariance matrix is unrestricted). Thus, we need to impose at least p×(p−1)just-identifying restrictions on (11.3) to obtain a unique solution.The structural vector error correction model (11.3) can be written in a

more compact form, similar as for the cointegration structure, β0xt = ut:

A0Xt = µ0,a + vt. (11.4)

where A0= (A0, −A1,−a) and X0t = (∆x0t,∆x

0t−1,x

0t−1β

c) is a stationaryprocess. When checking for generic identification of the short-run structure(11.4) we can use the same rank condition given in Chapter 10 for the long-run structure. Identifying restrictions on the rows ofA0 (we assume here thatthe constant term is not part of the identification process) can, as before, beformulated by the design matrices Hi:

A = (H1ϕ1, ...,Hpϕp).

Note that when the model contains dummy variables we can, in general,choose whether to include them in the identification process or not. In thelatter case, the dummy variables will be unrestricted in all equations, whereasin the former case they will be included in the vector X0

t and, thus, have tosatisfy the usual conditions for identification. Either of the two ways ofcalculating the identification rank indices described in the previous chaptercan be used.To find out whether the restrictions defining the model are identifying

one can either check the rank conditions in (??) based on (??) or by usingsome generic parameter values.

Page 221: Juselius_Johansen_The Cointegrated VAR Model

222CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

Given that the short-run identification structure is generically identified,estimation can in principle be carried out by the switching algorithm ofthe eigenvalue routine described in Chapter 10. Since the dimension of theequation system is generally larger than the dimension of the cointegrationspace, the switching algorithm can be a little cumbersome and it is commonto apply other maximization algorithms. See, for example, the selection ofdifferent estimation algorithms in GiveWin (Doornik and Hendry, 2003).

11.2 Interpreting shocks

Because different choices of A0 lead to different estimates of the residuals,the question of how to define a shock is important. The theoretical concept ofa shock, and its decomposition into an anticipated and unanticipated part,has a straightforward correspondence in the V AR model as a change of avariable, ∆xt, and its decomposition into an explained part, the conditionalexpectation Et−1∆xt | Xt−1, and an unexplained part, the residual εt. Therequirement for εt to be a correct measure of an unanticipated (autonomous)shock is that the conditional expectation Et−1∆xt | xt−1 correctly describeshow agents form their expectations. For example, if agents make model basedrational expectations from a model that is different from the V AR model,then the conditional expectation would no longer be an adequate descriptionof the anticipated part of the shock ∆xt. However, in this case the VARmodel would not be a good description of the information in the data.Theories also require shocks to be ”structural”, implying that they are, in

some sense, objective, meaningful, or absolute. With the reservation that theword structural has been used to cover a wide variety of meanings, we willhere assume that it describes a shock, the effect of which is (1) unanticipated(novelty), (2) unique (a shock hitting money stock alone), and (3) invariant(no additional explanation by increasing the information set).The novelty of a shock depends on the credibility of the expectations

formation, i.e. whether εt = ∆xt− Et−1∆xt | g(Xt−1) is a correct mea-sure of the unanticipated change in x. The uniqueness can be achievedeconometrically by choosing A0 so that the covariance matrix Ω becomesdiagonal. For example, as will be demonstrated in Section 11.5, by postu-lating a causal ordering among the variables of the system one can triviallyachieve uncorrelated residuals. In general, the VAR residuals can be orthog-onalized in many different ways and whether the orthogonalized residuals vt

Page 222: Juselius_Johansen_The Cointegrated VAR Model

11.3. WHICH ECONOMIC QUESTIONS? 223

can be given an economic interpretation as a unique structural shock, de-pends crucially on the plausibility of the identifying assumptions. Some ofthem are just-identifying and cannot be tested against the data. Thus, differ-ent schools will claim structural explanations for differently derived estimatesbased on the same data.The invariance of a structural shock is probably the most crucial require-

ment as it implies that an empirically estimated structural shock should notchange when increasing the information set. Theory models defining deepstructural parameters are always based on many simplifying assumptions in-clusive numerous ceteris paribus assumptions. In empirical models the ceterisparibus assumptions should preferably be accounted for by conditioning onthe ceteris paribus variables. Since essentially all macroeconomic systemsare stochastic and highly interdependent, the inclusion of additional ceterisparibus variables in the model is likely to change the VAR residuals and,hence, the estimated shocks.Therefore, though derived from sophisticated theoretical models, struc-

tural interpretability of estimated shocks seems hard to justify. In the wordsof Haavelmo (1943) there is no close association between the true shocks de-fined by the theory model and the measured shock based on the estimatedVAR residuals. A structural shock seems to be a theoretical concept withlittle or fragile empirical content in macro-econometric modelling.In the remaining part of this chapter we will discuss economic identifica-

tion in the broad sense of answering some questions of economic relevance,and not in the narrow sense of identifying deep structural parameters

11.3 Which economic questions?

Macroeconomic theory is usually quite informative about prior economic hy-potheses relevant for the long-run structure, whereas much less seems to beknown about the short-run adjustment mechanisms. Thus, the identificationof the short-run structure has often the character of data analysis aiming atthe “identification” of a parsimonious parameterization rather than testingwell-specified economic hypotheses. Nevertheless, the simplification searchshould preferably be guided by some relevant questions that the empiricalanalysis should provide an answer to.From (11.3) we note that the data variation can be decomposed into a

systematic (anticipated) part and an unsystematic (unanticipated) part. The

Page 223: Juselius_Johansen_The Cointegrated VAR Model

224CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

systematic part explains the change from t − 1 to t of the system variablesas a result of:

1. current (anticipated) changes (∆xj,t, j 6= i) in the system variables,∆xi,t, i = 1, ..., p.

2. previous changes of the system variables, ∆xi,t−m, i = 1, ..., p, m =1, ..., k − 1,

3. deviations from previous long-run equilibrium states, β0ixt−1, i = 1, .., r,and

4. extraordinary events, ΦDt, such as reforms and interventions.

Thus, this type of empirical models allow for the possibility that agentsreact on (i) previous equilibrium errors in the long-run steady-state relations,(ii) current and lagged changes in the determinants, and (iii) extraordinaryevents.In contrast, most theoretical models make prior assumptions on how

agents are supposed to react under optimizing behavior given some ceterisparibus assumptions. Based on these assumptions, the model may predict,for example, instantaneous or partial adjustment behavior towards equilib-rium states. In this sense the specification of economic models is more precisewith respect to the postulated behavior and its economic consequences. Onthe other hand these consequences are only empirically relevant given thatthe postulated behavior is (at least approximately) correct.The specification of an empirical model like (11.3) is less precise in terms

of an underlying theoretical model, but far more flexible in terms of actualmacroeconomic behavior which is influenced by the wider circumstances thatgenerated the data, i.e. by the ceteris paribus assumptions of the economicmodel. In the ideal case the empirical model should allow for all relevantaspects of (possibly competing) theoretical model(s) as testable hypotheses,but also add to the realism of the empirical model by conditioning on rele-vant ceteris paribus variables. For example, the question of instantaneous orpartial adjustment behavior in the domestic money market can be specifiedas hypotheses on the adjustment coefficients αij in the empirical model. A’surprising’ outcome of a test might then be associated with ceteris paribusassumptions of the theoretical model, such as ”constant real exchange rate”

Page 224: Juselius_Johansen_The Cointegrated VAR Model

11.3. WHICH ECONOMIC QUESTIONS? 225

or ”no risk aversion in the capital markets” just to mention a few of thepossibly very crucial ones.We will illustrate these ideas by recalling that the economic motivation

for doing an empirical analysis of the Danish money demand data was thediscussion in Chapter 2 of inflation and monetary policy. The latter wasstrongly influenced by the discussion in Romer (1996), Chapter 9, in whichthe economic model was based on an assumption on equilibrium behaviorin the money market. The interest in empirical money demand relations ismotivated by the idea that if the latter are known and empirically stable,then central banks would be able to supply exactly the amount of moneysatisfying agents’ demand for money.Based on the empirical results reported in the preceding chapters we

were able to conclude that the Danish money market behavior in the postBretton Woods period has been characterized by short-run equilibrium errorcorrection behavior. By allowing for dynamic adjustment towards long-runsteady states it is now possible to address a number of additional questionswhich are directly or indirectly related to the effectiveness of monetary policy.For example:

• Is money stock adjusting to a long-run money demand or supply rela-tion?

• Has monetary policy been more effective when based on changes inmoney stock or changes in interest rates?

• Does the effect of expanding money supply on prices differ in the shortrun, in the medium run, or in the long run?

• Is an empirically stable demand for money relation a prerequisite formonetary policy to be effective for inflation control?

• How strong is the direct (indirect) relationship between a monetarypolicy instrument and price inflation?

As demonstrated above the short-run adjustment coefficients are not ingeneral invariant to the choice of A0, which is why the answers to the abovequestions are crucially related to the identification issue. In some cases theempirical answers can be very sensitive to the choice of identification scheme,in other cases results are more robust.

Page 225: Juselius_Johansen_The Cointegrated VAR Model

226CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

To illustrate how one can empirically investigate the above questions wewill assume that the causal chain model below is an adequately identifiedrepresentation of the short-run structure of the Danish data. For simplicityof notation we will here assume that A1 = 0, i.e. there is no short-rundynamic adjustment in lagged changes of the process and that the long-run structure can be described by the simple relations discussed in Chapter2. Generalization to more realistic specifications, such as the empiricallyidentified relations of Table 10.2, should be straightforward.We consider the following model specification:

1 a012 a013 a014 a0150 1 a023 a024 a0250 0 1 a034 a0350 0 0 1 a0450 0 0 0 1

∆mrt

∆2pt∆Rm,t∆yrt∆Rb,t

=µ0,1µ0,2µ0,3µ0,4µ0,5

+

+

a11 a12 a13a21 a22 a23a31 a32 a33a41 a42 a43a51 a52 a53

(mr − yr)t−1(Rb −Rm)t−1(∆p−Rm)t−1

+

vmtv∆ptvyrtvRm,tvRb,t

(11.5)

To simplify the discussion we will focus solely on the money and inflationequations, albeit acknowledging that a complete answer should be based onthe full system analysis.

∆mrt = a012∆

2pt + a013∆Rm,t + a

014∆y

rt + a

015∆Rb,t

+a11(mr − yr)t−1 + a12(Rb −Rm)t−1 + a13(∆p−Rm)t−1 + εmt

∆2pt = a023∆Rm,t + a024∆y

rt + a

025∆Rb,t

+a21(mr − yr)t−1 + a22(Rb −Rm)t−1 + a23(∆p−Rm)t−1 + εpt

∆yrt = ....∆Rm,t = .....∆Rb,t = .....

Question 1. Is money stock adjusting to money demand or supply?

Page 226: Juselius_Johansen_The Cointegrated VAR Model

11.3. WHICH ECONOMIC QUESTIONS? 227

If a11 < 0, a12 > 0, and a13 > 0, then empirical evidence is in favor ofmoney holdings adjusting to a long-run money demand relation. The lattercan be derived from the money stock equation as follows:

mr = yr + a12/a11(Rb −Rm) + a13/a11(∆p−Rm)= yr − β1(Rb −Rm)− β2(∆p−Rm) (11.6)

Relation (11.6) corresponds to the aggregate money demand relation dis-cussed in a static equilibrium framework by Romer, with the difference that(11.6) is now imbedded in a dynamic adjustment framework. On the otherhand, the case a11 < 0, a12 = 0, and a13 = 0, could be consistent with asituation where the central bank has controlled money stock so that moneyvelocity is stationary around a constant level.Question 2. Is an empirically stable money demand relation a prerequisite

for central banks to be able to control inflation rate? This hypothesis is basedon essentially three arguments:

2.1 There exists an empirically stable demand for money re-lation.2.2. Central banks can influence the demanded quantity of

money.2.3. Deviations from this relation cause inflation.

The empirical requirement for a stable money demand relation is thata11 < 0, a12 > 0,and a13 > 0, and that the estimates are empirically stable.In this case money stock is endogenously determined by agents’ demand formoney and it is not obvious that money stock can be used as a monetaryinstrument by the central bank. Thus, given the previous result (i.e. a11 <0, a12 > 0,and a13 > 0), central banks cannot directly control money stock.Nevertheless, controlling money stock indirectly might still be possible bychanging the short-term interest rate. But a12 > 0 and (Rb − Rm) ∼ I(0)implies that a change in the short-term interest rate will transmit through thesystem in a way that leaves the spread basically unchanged. Therefore, evenif a stable demand for money relation has been found, it may, nevertheless,be difficult to control money stock.Finally, a21 > 0 would be consistent with the situation where deviations

from the empirically stable money demand relation influence inflation ratein the short-run. Under the assumption that agents can obtain their desired

Page 227: Juselius_Johansen_The Cointegrated VAR Model

228CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

level of money (i.e. no credit restrictions), it seems likely that there is ’excessmoney’ in the economy because of ’excess supply’ rather than ’excess privatedemand’. Excess money would generally then be the result of central bankissuing more money than demanded, i.e. by monetization of governmentdebt.In the next sections we will illustrate the above discussion using the Dan-

ish data based on the following identification schemes:

1. imposing restrictions on the short-run parameters when A0 = I,

2. imposing (just-identifying) zero restrictions on the off-diagonal ele-ments of Σ,

3. imposing general restrictions on A0 without imposing restrictions onΣ,

4. re-specifying the full system model as a partial model based on weakexogeneity test results.

11.4 Overidentifying restrictions on the re-

duced form

The estimates of the unrestricted VAR model were given in Table 7.1. Wenote that: (i) the short-run structure is over-parameterized with many in-significant coefficients, (ii) the adjustment coefficients αij seem to providethe bulk of explanatory power, (iii) the signs and the magnitudes of the αijcoefficients seem reasonable, but the coefficients to the lagged changes of theprocess are more difficult to interpret, (iv) some of the correlations of the(0,1) standardized residuals are quite large possibly because of importantcurrent effects.For the Danish data we have no strong prior hypotheses about the short-

run structure and imposing overidentifying restrictions has more the charac-ter of a simplification search rather than stringent economic identification.The guiding principle is the plausibility of the results, in particular plausibleestimates of the equilibrium correction coefficients, a.Many of the short-run adjustment coefficients in the unrestricted reduced

form VAR discussed in Chapter 4 were statistically insignificant and thefirst step is to test whether they can be set to zero without violating the

Page 228: Juselius_Johansen_The Cointegrated VAR Model

11.4. REDUCED FORM RESTRICTIONS 229

joint test of over-identifying restrictions. First we remove the insignificantlagged variables ∆2pt−1 and ∆Rm,t−1 from the system (altogether 10 coef-ficients) based on F-test. The dummy variables, except for ∆Ds83t, wereonly marginally significant and were also left out (altogether 15 coefficients).In this ’trimmed’ system 20 additional zero restrictions were accepted basedon a LR test of over-identifying restriction, χ2(20) = 19.7 and a p-value of0.54. The estimates of the remaining coefficients reported in Table 11.1 areall significant on the 5% level, albeit some only borderline so.The original VAR contained 50 autoregressive coefficients + 20 seasonal

coefficients, 20 dummy variable coefficients and 15 parameters in the resid-ual covariance matrix. Of these only 16 remained significant, which is asignificant reduction. We note that the equilibrium correction coefficients aijare highly significant demonstrating the major loss of information usuallyassociated with VAR models specified in differences.All current effects are accounted for by the residual covariance matrix Σ

at the bottom of Table 11.1. Most of the correlations are relatively small, butthe residuals from the money stock equation and the real income equationare correlated with a coefficient 0.33 and the residuals from the short-terminterest rate and the bond rate equations with a coefficient of 0.40. Thereare at least three explanations for why the residuals from a VAR model arelikely to be correlated:

1. Causal chain effects get blurred when the data are temporally aggre-gated; the more aggregated the data, the higher the correlations. As-sume, for example, that the bond rate changes the day after a change inthe short-term interest rate (as a result of a central bank intervention,say) and that the central bank changes the short-term interest rate asa result of a market shock to the long-term bond rate but only afterweek. In this case we would be able to identify correctly the causallinks based on daily data. However, when we aggregate the data overtime the information about these links get mixed up.

2. Expectations are inadequately modelled by the VAR model. Chap-ter 3 showed that the VAR model is consistent with agents makingplans based on the expectation: Et−1(∆xt| ∆xt−1, ...,∆xt−k+1,β0xt−1).Assume now that agents have forward looking expectations to the vari-able, x1t, so that Et−1(∆x1t) = ∆xe1t, and that they use this value whenmaking plans: Et−1(∆xj,t | ∆xe1t,∆xt−1, ...,∆xt−k+1,β0xt−1), j 6= 1. If

Page 229: Juselius_Johansen_The Cointegrated VAR Model

230CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

∆xe1t = ∆x1t, i.e. the expectation is exactly correct, then the reducedform VAR residuals from equation ∆x1t would be correlated with theresiduals from the equations in which ∆xe1t were important.

3. Omitted variables effects. In most cases the VAR model contains onlya subset of (the most important) variables needed to explain the eco-nomic problem. This is because the VAR model is very powerful forthe analysis of small system, but the identification of cointegration re-lations becomes increasingly difficult when the dimension of the vectorprocess gets larger. There are two important implications of omittedvariables: (1) the estimated short-run adjustment coefficients are likelyto change when we increase the number variables in the model (pro-vided that the new variables are not orthogonal to the old variables),(2) the residuals generally become smaller when increasing the dimen-sion of the VAR. Thus, the residuals of a ’small’ VAR model are likelyto contain omitted variables effects and the residuals will be correlatedif the left-out variables are important for several of the variables in the’small’ VAR.

The last point makes it very hard to argue that the residuals could possi-bly be a measure of autonomous errors (structural shocks). Therefore, a largeresidual correlation coefficient does not necessarily imply a ’structural’ simul-taneous effect, nor does it imply incorrectly specified expectations. With thisin mind we will now attempt to ’identify’ some of the current effects in themodel as if they were a result of either point 1 or 2 above.

11.5 The VAR in triangular form

Assume that we want to estimate the VARmodel with uncorrelated residuals.The most convenient way of achieving this is by choosing A0 in (11.3) to be anupper triangular matrix Ω−1/2, i.e. by pre-multiplying the VAR model withthe inverse of the Choleski decomposition of the covariance matrix Ω. In thiscase the likelihood function is decomposed into p independent sequentiallydecomposed likelihoods as demonstrated in Section 2.4:

P (∆x1,t, ...,∆xp,t|∆xt−1,β0xt−1;λs) ==

p

Πi=1P (∆xit|∆xi+1,t, ...,∆xp,t,∆xt−1,β0xt−1;λsi )

Page 230: Juselius_Johansen_The Cointegrated VAR Model

11.5. THE VAR IN TRIANGULAR FORM 231

Table 11.1: A parsimonious parametrization of the cointegrated VAR model

∆mrt ∆2pt ∆Rm,t ∆yrt ∆Rb,t

∆mrt−1 −0.21

(−2.3)0 −0.01

(−2.6)0 −0.02

(−3.2)∆yrt−1 0 0 0 0.21

(2.0)0.05(5.6)

∆Rb,t−1 0 0 0.34(4.5)

−2.28(−2.4)

0.29(3.4)

ecm1t−1 0 −1.39(−13.3)

0 0 0

ecm2t−1 −0.33(−6.7)

0 −0.01(−2.0)

0 0

ecm3t−1 0 0 −0.28(−3.9)

0 0

∆Ds83 t 0.06(2.6)

0 0 0 −0.01(−5.0)

Ω (off-diagonal elements are standardized)∆mr

t 0.02432

∆2pt -0.14 0.01462

∆Rm,t -0.22 -0.02 0.00122

∆yrt 0.33 -0.14 -0.09 0.01822

∆Rb,t -0.13 -0.14 0.41 0.02 0.00152

Page 231: Juselius_Johansen_The Cointegrated VAR Model

232CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

We find p conditional expectations:

E(∆xit|∆xi+1,t, ...,∆xp,t,∆xt−1,β0xt−1) =pPj=i

a0,j+1∆xj+1,t +A1∆xt−1 + aβ0xt−1, i = 1, ..., p

where a0i is a row vector defined by the last p-i elements of the ith row ofA0. Because the residuals are uncorrelated in this system, the coefficientscan be estimated fully efficiently by OLS equation by equation. In this casethe OLS estimates are equivalent to FIML estimates.The triangular system is based on a specific ordering of the p variables

and, thus, on an underlying assumption of a causal chain. Since differentorderings will, in general, produce different results, and the choice of order-ing is subjective, there is an inherent arbitrariness in this type of models.It is, therefore, important to check the sensitivity of the results to variousorderings of the causal chains. Here we first investigate the consequencesof different ordering by performing an OLS regression for each variable asif it had been at the end of the causal chain. The results are reported inTable 11.2 where the coefficients of each row are given by the conditional

expectation E(∆xi,t|∆xj,t(j 6= i), βc0xt−1,Dt), i = 1, ..., 5, . By this exercisewe can find out how the reduced form model estimates would have changedfor equation ∆xi,t, i = 1, ..., 5 if we had included current changes of all theother variables in the system as regressors.The results show that ∆mrt and ∆y

rt as well as ∆Rb,t and ∆Rm,t exhibited

significant simultaneous effects and that ∆Rm,t was equilibrium correcting tothe third cointegration relation but not ∆Rb,t.In Chapter 10 we showed that both ∆yrt and ∆Rb,t are weakly exogenous

for the long-run parameters β.However, the short-run adjustment parametersare not invariant to transformations by a non-singular matrix A0 as demon-strated in (11.1). Thus, previously insignificant adjustment coefficients αij inthe reduced form model might become significant in the transformed model.Independently of the chosen ordering no such changes in equilibrium correc-tion effects were found in the equations for ∆yrt and ∆Rb,t, demonstratingthe robustness of the previous results regarding the lack of long-run feedbackfor different model specifications. This is an important reason for placing∆yrt and ∆Rb,t prior to ∆Rm,t, ∆

2pt, and ∆mrt in the causal chain. Finally,because the primary purpose of this study is to investigate the role of money

Page 232: Juselius_Johansen_The Cointegrated VAR Model

11.5. THE VAR IN TRIANGULAR FORM 233

Table 11.2: The fully specified conditional expectations equation by equation

∆mrt ∆2pt ∆Rm,t ∆yrt ∆Rb,t

∆mrt 1 −0.09

(−1.1)−0.01(−1.1)

0.21(2.4)

−0.01(−0.7)

∆2pt −0.22(−1.1)

1 0.00(0.1)

−0.14(−1.0)

−0.02(−1.3)

∆Rm,t −2.79(−1.1)

0.28(0.1)

1 −0.61(−0.3)

0.47(3.3)

∆yrt 0.40(2.4)

−0.10(−1.0)

−0.00(−0.3)

1 0.01(0.5)

∆Rb,t −1.52(−0.7)

−1.61(−1.3)

0.31(3.3)

0.75(0.5)

1

∆mrt−1 −0.32

(−3.0)−0.06(−0.9)

−0.01(−1.3)

0.14(1.7)

−0.01(−2.0)

∆yrt−1 0.17(0.9)

0.22(2.0)

−0.02(−2.0)

0.12(0.9)

0.04(4.5)

∆Rb,t−1 0.66(0.4)

−0.54(−0.5)

0.26(3.0)

−3.13(−2.4)

0.12(1.0)

ecm1t−1 −0.26(−0.8)

−1.43(−12.6)

0.00(0.2)

−0.40(−1.6)

−0.01(−0.6)

ecm2t−1 −0.30(−4.5)

0.01(0.3)

−0.01(−2.0)

0.10(1.8)

0.00(0.3)

ecm3t−1 −0.38(−0.2)

−0.83(−0.7)

−0.30(−3.5)

−1.71(−1.3)

0.08(0.5)

∆Ds83 t 0.05(1.6)

−0.02(−0.7)

0.003(2.2)

−0.03(−1.2)

−0.01(−4.3)

and inflation in this system, ∆2pt and ∆mrt are placed at the end of thechain, prior to ∆Rm,t. Inflation did not exhibit any significant effects fromcurrent changes in the other variables and, hence was placed prior to ∆mrt .

Based on the above arguments the estimates of the triangular system inTable 11.3 are based on the following ordering:

∆Rb,t → ∆yrt → ∆Rm,t → ∆2pt → ∆mrt . (11.7)

The upper triangular representation of the current effects in Table 11.3corresponds to the Cholesky decomposition of the residual covariance matrixwhen the system variables are ordered as in (11.7).

To increase readability coefficients with an absolute t-value > 1.9 are inbold face. It appears that the transformation by A0 does not change theVAR model estimates very much, indicating that the current effects are not

Page 233: Juselius_Johansen_The Cointegrated VAR Model

234CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

Table 11.3: The short-run adjustment structure in triangular form

∆mrt ∆2pt ∆Rm,t ∆yrt ∆Rb,t

∆mrt 1 0 0 0 0

∆2pt −0.22(−1.1)

1 0 0 0

∆Rm,t A00 : −2.79

(−1.1)0.52(0.3)

1 0 0

∆yrt 0.40(2.4)

−0.14(−1.4)

−0.01(−0.7)

1 0

∆Rb,t −1.52(−0.7)

−1.51(−1.2)

0.32(3.5)

0.33(0.2)

1

∆mrt−1 −0.32

(−3.0)−0.04(−0.5)

−0.01(−1.1)

0.10(1.2)

−0.02(−2.6)

∆yrt−1 A01 : 0.17

(0.9)0.22(1.9)

−0.02(−2.2)

0.16(1.2)

0.04(3.8)

∆Rb,t−1 0.66(0.4)

−0.54(−0.5)

0.25(3.0)

−3.66(−2.9)

0.29(2.9)

ecm1t−1 −0.26(−0.8)

−1.43(−12.6)

0.00(0.1)

−0.21(−1.5)

0.01(0.9)

ecm2t−1 a0: −0.30(−4.5)

0.04(0.9)

−0.01(−2.1)

0.04(0.9)

−0.00(−0.1)

ecm3t−1 −0.38(−0.2)

−0.81(−0.7)

−0.30(−3.6)

−1.41(−1.1)

−0.07(−0.6)

∆Ds83 t φ0 : 0.05(1.6)

−0.02(−0.9)

0.003(2.0)

−0.02(−0.8)

−0.01(−4.2)

∆mrt 1

∆2pt 0 1

∆Rm,t bΣ 0 0 1∆yrt 0 0 0 1∆Rb,t 0 0 0 0 1

Page 234: Juselius_Johansen_The Cointegrated VAR Model

11.6. GENERAL RESTRICTIONS 235

very important, i.e. points 1 and 2 above do not seem to be empirically veryimportant in this case.

11.6 Imposing general restrictions on A0

Many of the estimates of the triangular system reported in Table 11.3 wereinsignificant and it would seem natural to restrict them to zero. We will nextdemonstrate how to impose general restrictions on the matrices A0,A1, a,and Φ while leaving Ω unrestricted.Though statistical significance is not necessarily the same as economic

significance the former is important for two reasons; (i) our model shouldpreferably describe empirically relevant aspects of the economic problem forthe chosen period, (ii) leaving insignificant coefficients in the model mayintroduce near singularity in the model if generic identification is violatedwhen restricting the insignificant coefficients to zero.In the following we will demonstrate two different model structures ob-

tained by imposing as many zero restrictions as possible on (A0,A1, a, andΦ) while at the same time keeping the residual correlations of Σ as small aspossible.Short-run structure I reported in Table 11.4 allows simultaneous effects

of ∆mrt and ∆yrt and of ∆Rm,t and ∆Rb,t while imposing overidentifying

restrictions on the remaining model parameters. The ∆2pt equation is inthe reduced form and, therefore, identified by construction. The remainingequations contain simultaneous effects and therefore need to be checked forgeneric and empirical identification.The zero restriction of ecm2t−1 in the equation for ∆yrt and the corre-

sponding nonzero coefficient in the equation for ∆mrt are identifying, both

generically and empirically (because of the highly significant coefficient ofecm2t−1 in the ∆mr

t equation). Similarly, the significant coefficients of ∆yrt−1

and ∆Rb,t−1 in the equation for ∆yrt and the corresponding zero coefficientsin the equation for ∆mr

t are identifying. The calculated rank indices (??)reported in Table 11.5 confirms the visual inspection: r1.4 = 2 implying thateq. 1 is identified w.r.t. eq. 4 with one overidentifying restriction and r4.1 = 3that eq.4 is identified w.r.t. eq.1 with two overidentifying restrictions.The significant coefficients of ∆mr

t−1 and Ds83 t in the equation for ∆Rb,tand the corresponding zero coefficients in the equation for ∆Rm,t are identi-fying the latter w.r.t. the former. This is an example where a strongly signif-

Page 235: Juselius_Johansen_The Cointegrated VAR Model

236CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

Table 11.4: An overidentified simultaneous short-run adjustment structure I

∆mrt ∆2pt ∆Rm,t ∆yrt ∆Rb,t

∆mrt 1 0 0 −0.28

(−1.7)0

∆2pt 0 1 0 0 0∆Rm,t A0

0 : 0 0 1 0 0.24(0.8)

∆yrt 0.76(2.0)

0 0 1 0

∆Rb,t 0 0 0.10(0.7)

0 1

∆mrt−1 −0.32

(−2.9)0 0 0 −0.01

(−2.4)∆yrt−1 A0

1 : 0 0 −0.02(−1.8)

0.31(2.5)

0.04(5.0)

∆Rb,t−1 0 0 0.35(3.5)

−4.04(−2.4)

0.22(1.8)

ecm1t−1 0 −1.40(−13.4)

0 0 0

ecm2t−1 a0: −0.31(−6.2)

0 −0.01(−2.2)

0 0

ecm3t−1 0 0 −0.33(−3.9)

0 0

∆Ds83 t φ0: 0.07(2.7)

0 0 0 −0.01(−4.8)

(off-diagonal elements are standardized)∆mr

t 0.02362

∆2pt -0.08 0.01472

∆Rm,t Σ : -0.15 0.02 0.00122

∆yrt 0.06 -0.20 -0.16 0.02102

∆Rb,t -0.11 -0.14 0.10 0.01 0.00142

Page 236: Juselius_Johansen_The Cointegrated VAR Model

11.6. GENERAL RESTRICTIONS 237

icant (and economically interpretable) dummy variable can contain valuableidentifying information. Similarly, the significant coefficients of ecm2t−1 andecm3t−1 in the ∆Rm,t equation and the corresponding zero coefficient in the∆Rb,t equation are identifying w.r.t. ∆Rm,t, whereas the coefficients of ∆y

rt−1

and ∆Rb,t−1 are not identifying, since they enter similarly in both equations.The calculated rank indices (??) reported in Table 11.5 confirms the visualinspection: r3.5 = 2 implying that eq. 3 is identified w.r.t. eq. 5 with oneoveridentifying restriction. The same is true for eq.5 w.r.t. eq.3.Altogether 17 overidentifying restrictions have been imposed on the short-

run structure I. The LR test, distributed as χ2(17), produced a test statistic of16.0 and the restrictions were accepted with a p-value of 0.52. The residualcorrelations reported at the bottom of Table 11.4 are quite small and in-significant, though not zero as in the triangular representation in Table 11.3.However, the estimated simultaneous effects are generally not very signifi-cant, suggesting that the identifying information is weak in this model. Thisis particularly so for the coefficients of the bond rate and the deposit rate.Therefore, short-run structure II, reported in Table 11.6, sets the insignifi-cant current effect of ∆Rb,t in the equation for ∆Rm,t to zero and imposesadditionally a zero restriction on ∆Rb,t−1 in the equation for ∆Rb,t. As aresult the residual correlations changed to some extent, but not significantlyso as can be inferred from the test of the 20 overidentifying restrictions whichwere accepted based on a test value of 22.5 (p-value 0.34).Based on the estimated results we notice that:

1. real money stock is exclusively equilibrium error correcting to long-runmoney demand,

2. the effect of an increase in real aggregate demand on money stock mightbe smaller in the short run (0.7) than in the long run (1.0),

3. inflation is exclusively equilibrium error correcting to the homogeneousinflation-interest rate relation,

4. short-term interest is essentially equilibrium error correcting to thelong-term bond rate and exhibits a (small) negative effect from excessliquidity,

5. real aggregate demand shows a negative short-run effects from increasesin the long-term bond rate,

Page 237: Juselius_Johansen_The Cointegrated VAR Model

238CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

Table 11.5: Checking the rank conditions of short-run structure 1

ri.j ri.jk ri.jkm ri.jkmn1.23 7

1.2 2 1.24 41.3 5 1.25 6 1.234 71.4 2 1.34 5 1.235 7 1.2345 71.5 4 1.35 5 1.345 5

1.45 42.13 10

2.1 5 2.14 72.3 6 2.15 9 2.134 102.4 4 2.34 8 2.135 10 2.1345 102.5 6 2.35 8 2.345 10

2.45 83.12 6

3.1 4 3.14 43.2 2 3.15 4 3.124 63.4 2 3.24 4 3.125 6 3.1245 63.5 2 3.25 4 3.245 6

3.45 44.12 5

4.1 3 4.13 64.2 2 4.15 5 4.123 64.3 4 4.23 6 4.125 7 4.1235 84.5 4 4.25 6 4.235 8

4.35 65.12 5

5.1 3 5.13 45.2 2 5.14 3 5.123 65.3 2 5.23 4 5.124 5 5.1234 65.4 2 5.24 4 5.234 6

5.34 4

Page 238: Juselius_Johansen_The Cointegrated VAR Model

11.6. GENERAL RESTRICTIONS 239

Table 11.6: An overidentified simultaneous short-run adjustment structureII

∆mrt ∆2pt ∆Rm,t ∆yrt ∆Rb,t

∆mrt 1 0 0 −0.16

(−1.7)0

∆2pt 0 1 0 0 0∆Rm,t A0

0 : 0 0 1 0 0.67(3.8)

∆yrt 0.70(1.8)

0 0 1 0

∆Rb,t 0 0 0 0 1∆mr

t−1 −0.32(−3.0)

0 0 0 −0.01(−2.1)

∆yrt−1 A01 : 0 0 0 0.12

(2.3)0.04(4.6)

∆Rb,t−1 0 0 0.38(4.8)

−4.01(−2.8)

0

ecm1t−1 0 −1.40(−13.4)

0 0 0

ecm2t−1 a0 : −0.31(−6.3)

0 −0.005(−1.5)

0 0

ecm3t−1 0 0 −0.28(−3.4)

0 0

∆Ds83 t φ0 : 0.06(2.7)

0 0 0 −0.01(−5.0)

(off-diagonal elements are standardized)∆mr

t 0.02332

∆2pt -0.08 0.01472

∆Rm,t Σ : -0.18 -0.02 0.00132

∆yrt 0.10 -0.19 -0.14 0.02092

∆Rb,t -0.03 -0.15 -0.17 0.03 0.00142

Page 239: Juselius_Johansen_The Cointegrated VAR Model

240CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

6. the bond rate exhibits a small negative short-run effect from an increasein liquidity and a positive short-run effect from an increase in aggregatedemand,

7. neither real aggregate demand nor the long-term bond rate exhibit anylong run equilibrium correction effects.

The last point can probably be explained by the present information setnot being sufficiently large to give a satisfactory explanation of all five vari-ables. For example, the long-term bond rate would probably be equilibriumerror correcting to the German bond rate of similar maturity and real ag-gregate demand to real exchange rates, terms of trade and foreign aggregatedemand if these variables were included in the analysis.Chapter 16 will discuss a procedure for extending the model to include

such effects.

11.7 A partial system

In this section we re-estimate the model as a partial model for real moneystock, inflation and the short-run interest rate conditional on the bond rateand the real aggregate demand which were found to be weakly exogenous forthe long-run parameters β. Note, however, that the latter does not implyweak exogeneity for the short-run adjustment parameters ( λSSF ). Therefore,if the parameters of interest are λSSF then neither Rb,t nor y

rt is weakly ex-

ogenous for these parameters as demonstrated by the estimation results inTables 10.3 and 10.4. But, because the motivation for the empirical study inmost cases is an interest in the dynamic feed-back effects w.r.t. the long-runstructure, establishing long-run weak exogeneity for a variable is often usedas a justification for performing the model analysis conditional on such avariable.As already discussed in Chapter 9.5 to find out whether weak exogeneity

holds either for the long-run parameters β, or the short-run adjustment pa-rameters λSSF , we need to estimate the full system of equations. The questionis why one would like to continue the analysis in a partial model consider-ing that we already have estimated the full system. However, in some casesthere are clear advantages with a partial system analysis. Assume, for exam-ple, that a weakly exogenous variable (w.r.t. β) has been subject to many

Page 240: Juselius_Johansen_The Cointegrated VAR Model

11.8. ECONOMIC IDENTIFICATION 241

interventions and that current changes of this variable has significantly af-fected the other variables in system. Conditioning on such a variable is likelyto reduce the need for intervention dummies in the model and this mightimprove the stability of the parameter estimates in the conditional model.Furthermore, the residual variance in the conditional model is often signif-icantly reduced, thus improving the precision of the statistical inference ascompared to the full model. In some cases the exogeneity status of a variableis so obvious that testing might not be needed and one can directly estimatea partial model. For example, the economic activity in USA is likely to in-fluence the Danish economy, but whatever happens in the Danish economyit is not likely to influence the US economy. In more doubtful cases the pro-cedure suggested in Harbo, Johansen, and Rahbek (1998) can be used as asafeguard against conditioning on variables, which are not weakly exogenousfor the long-run parameters.Table 11.7 reports the estimates of the partial model for ∆mr

t , ∆p2t , and

∆Rm,t conditional on ∆Rb,t and ∆yrt . The estimated results are similar tothe estimates of the full system reported in Table 11.4 and 11.6. Thus, theempirical conclusions would be fairly robust whether based on the full or thepartial system. The parameter estimates (particularly the current effects)have become slightly more significant in the present model. As mentionedabove the need for dummy variables may change in the partial model. Thiswas the case here: all dummy variables became insignificant after condition-ing on ∆Rb,t and ∆y

rt . Finally, the residual standard error of the deposit rate

equation decreased to some extent compared to the full system estimate.

∆mr ∆2p ∆RmFull system: σε 0.0233 0.0147 0.00130Partial system: σε 0.0232 0.0147 0.00117

11.8 Economic identification

We will now make an attempt to answer the questions raised in Section 11.2based on the above empirical results. First we notice that the basic resultson the short-run adjustment structure remained essentially unaltered in allthe different representations reported in Table 11.1-11.5. In particular, theECM terms in the parsimonious reduced form model reported in Table 11.1hardly changed at all when allowing for simultaneous effects in the model.

Page 241: Juselius_Johansen_The Cointegrated VAR Model

242CHAPTER 11. IDENTIFICATIONOFTHE SHORT-RUN STRUCTURE

Table 11.7: The estimates of a partial system for money, the inflation rateand the deposit rate

∆mrt ∆2pt ∆Rm,t

∆mrt 1 0 0

∆2pt 0 1 0∆Rm,t A0

0 : 0 0 1∆yrt 0.41

(2.9)0 0

∆Rb,t −2.58(2.0)

0 0.27(2.9)

∆mrt−1 −0.26

(−2.7)0 0

∆yrt−1 A01 : 0 0 −0.02

(−2.8)∆Rb,t−1 0 0 0.28

(3.4)

ecm1t−1 0 −1.39(−12.9)

0

ecm2t−1 a0 : −0.29(−5.9)

0 −0.01(−2.0)

ecm3t−1 0 0 −0.31(−3.7)

∆mrt 0.02322

∆2pt -0.12 0.01472

∆Rm,t Σ : -0.07 -0.04 0.00112

The latter were essentially found between changes in real money stock andreal income and between changes in the short and the long-term interest rate.Including the change in current real income in the real money stock equationseemed to improve the model specification. Including current interest ratechanges improved the dynamic specification somewhat but the static long-run solutions remained unaltered.

Altogether, current and lagged changes of the process do not seem verycrucial for the interpretation of the results and we can focus on the equilib-rium error correction results when answering the questions of Section 11.2.

Question 1. Is money stock adjusting to money demand or supply?

The result that real money holdings have been exclusively adjusting toa long-run money demand relation was very robust in all representations.This seems to indicate either that the central bank has willingly supplied the

Page 242: Juselius_Johansen_The Cointegrated VAR Model

11.8. ECONOMIC IDENTIFICATION 243

demanded money, or that agents have been able to satisfy their desired levelof money holdings independently of the central bank. The access to creditoutside Denmark as a result of deregulation of capital movements mightsuggest the latter case.Question 2. Given the empirically stable money demand relation found in

the Danish data, can inflation be effectively controlled by the central bank?If the level of (the broad measure of) money stock is endogenously de-

termined by agents’ demand for money, then the central bank can indirectlyinfluence the level of the demanded quantity by influencing its determinants,i.e. the cost of holding money. According to the estimated money-demandrelation this can be achieved by changing the short-term interest rate. Buta change in the short-term interest rate is likely to change money demandonly to the extent that it changes the spread (Rm−Rb). Although we foundthat (Rm − Rb) ∼ I(1) in the money demand relation, we also found thatthe money demand relation was a linear combination between two station-ary relations (mr − yr + b1Rb + b2Ds83t) and (Rm − 0.4Rb). Hence, it seemslikely that the long-term movements in the demand for money are primarilyinfluenced by the level of the long-term bond rate, which was found to beweakly exogenous and probably not controllable by the central bank. Sincethe ’modified spread’ was found to be stationary an increase in central bankinterest rate is likely to transmit through the system in a way that leavesthis component basically unchanged. Therefore, the present empirical evi-dence seems to suggest that the central bank would not have been able toeffectively control M3.Question 3. Does excess money, defined here as the deviation from the

long-run money demand relation cause inflation?If this is the case, ecm2 should have a positive coefficient in inflation

equation. No such evidence is found in any of the short-run structures.The triangular form in Table 10.2 reports a positive, but very small (0.04)coefficient of ecm2. However, this insignificant effect is canceled by a negativecoefficient -0.04 to lagged changes in money stock. Thus, there is no evidencethat inflation has been caused by monetary expansion in this period, at leastnot in the short run.Question 4. What is the effect of expanding money supply on prices in

the long run? Is money causing prices or prices causing money?This question will be addressed in the common trends model in the next

chapter.

Page 243: Juselius_Johansen_The Cointegrated VAR Model

Chapter 12

Identification of commonTrends

Sections 5.2 and 6.4 demonstrated the duality between the AR-representationdescribing the adjustment dynamics , α, towards the long-run relations β0xtand the MA-representation describing the common driving trends, α0⊥Σεi,and their weights, β⊥. Generally, the identification problem for the commontrends case is similar to the one of the long-run relations in the sense thatone can choose a normalization and (p − r − 1) restrictions without chang-ing the value of the likelihood function, whereas additional restrictions areoveridentifying and, hence, testable.

In this chapter we will discuss how to impose restrictions on the underly-ing common trends without attaching a structural meaning to the estimatedshocks. This will be done in the next chapter, where we will discuss re-strictions on the VAR model that aim at identifying r permanent and p− rtransitory shocks.

The organization of this chapter is as follows: Section 12.1 discusses thecommon trends decomposition based on the VAR model, Section 12.2 focuseson some special cases, Section 12.3 illustrates the ideas using the Danish data,and Section 12.4 discusses whether the results are economically identifiedusing the scenario analysis of Chapter 2.

245

Page 244: Juselius_Johansen_The Cointegrated VAR Model

246 CHAPTER 12. IDENTIFICATION OF COMMON TRENDS

12.1 The common trends representation

Chapters 10 and 11 demonstrated that the V AR residuals are not invariant tolinear transformations. This was shown by pre-multiplying the VAR model:

∆xt = Γ1∆xt−1 +αβ0xt−1 + µ0 +αβ1t+εt, (12.1)

εt ∼ Np(0,Ω)

with a (p× p) non-singular matrix A0 :

A0∆xt = A1∆xt−1 + aβ0xt−1 + µ0,a + aβ1t+ vt, (12.2)

vt ∼ Np(0,A0ΩA00)

where A1 = A0Γ1, a = A0α, µ0,a = A0µ0, and vt = A0εt.The moving average representation for a VAR model with linear, but no

quadratic, trends is given by:

xt= CX

εi + tCµ0 +C∗(L)(εt + µ0 +αβ1t), (12.3)

where

C = β⊥(α0⊥Γβ⊥)

−1α0⊥. (12.4)

It is useful to express the C matrix as a product of two matrices (similarlyas Π = αβ0)

C = eβ⊥α0⊥, (12.5)

or, alternatively,

C = β⊥eα0⊥,where eβ⊥= β⊥(α

0⊥Γβ⊥)

−1, and eα0⊥= (α0⊥Γβ⊥)−1α0⊥. CATS uses the formerformulation. Note that the matrices β⊥ and α⊥ can be directly calculated forgiven estimates of α,β, and Γ based on (12.4). Thus, the common stochastictrends and their weights can be calculated either based on unrestricted α, β,or on restricted estimates, αc, β

c.When choosing the moving average option

Page 245: Juselius_Johansen_The Cointegrated VAR Model

12.2. SOME SPECIAL CASES 247

in CATS the program uses the latest estimates of α and β as a basis for thecalculations.

It appears from (12.5) that the p × (p − r) matrix eβ⊥ (alternativelyeβc⊥) can be given an interpretation as the coefficients of the p − r commonstochastic trends α0⊥

Pεi (alternatively α

c0⊥Pεi) of the variables xt. In the

last section of this chapter we will relate this decomposition to the moreintuitive discussion of common stochastic trends of Chapter 2.

The decomposition of C = eβ⊥α0⊥ resembles the decomposition Π = αβ0

but with the important difference that eβ⊥ is a function not only of β⊥, butalso of α⊥. Similarly as for α and β one can transform eβ⊥and α⊥ by anonsingular (p− r)× (p− r) matrix Q

C =eβ⊥QQ−1α0⊥=eβc⊥αc0⊥. (12.6)

without changing the value of the likelihood function. Thus, the Q trans-formation leads to just-identified common trends for which no testing is in-volved. Additional restrictions on eβ⊥and α⊥ would constrain the likelihoodfunction and, hence, be testable. The next section will discuss a few spe-cial cases of testable restrictions on α⊥ which can be expressed as testablerestrictions on α, so that new test procedures need not be derived.

12.2 Some special cases

We will now discuss a few special cases of restrictions on β and α whichimply some interesting restrictions on eβ⊥ and α⊥.Case 1: Long-run homogeneity:

β =

a b c−ω1a −ω2b −ω3c

−(1− ω1)a −(1− ω2)b −(1− ω3)c∗ ∗ ∗∗ ∗ ∗

→ eβ⊥ =1 11 11 10 00 0

Case 2: A stationary variable in β :

Page 246: Juselius_Johansen_The Cointegrated VAR Model

248 CHAPTER 12. IDENTIFICATION OF COMMON TRENDS

β =

0 ∗ ∗1 ∗ ∗0 ∗ ∗0 ∗ ∗0 ∗ ∗

→ eβ⊥ =∗ ∗0 0∗ ∗∗ ∗∗ ∗

,i.e. a zero row in the C-matrix.Case 3: A column of α is proportional to a unit vector:

α =

∗ ∗ ∗0 ∗ ∗0 ∗ ∗0 ∗ ∗0 ∗ ∗

→ α⊥ =

0 0∗ ∗∗ ∗∗ ∗∗ ∗

,i.e. a zero column in the C−matrix.Case 4: A row in α is equal to zero.

α =

∗ ∗ ∗∗ ∗ ∗∗ ∗ ∗∗ ∗ ∗0 0 0

→ α⊥ =

∗ 0∗ 0∗ 0∗ 0∗ 1

In this case the last variable is weakly exogenous and corresponds to a com-mon driving trend. In addition, if all the rows of Γi, i = 1, ..., k − 1 cor-responding to the weakly exogenous variable are zero, then the latter willappear as a unit row vector in the C-matrix.

12.3 Illustrations

For the Danish data we found r = 3 and, hence, p− r = 2 common trends.We will report three different cases of the estimated common trends repre-sentation:

1. based on the unrestricted α and β from the VAR model,

Page 247: Juselius_Johansen_The Cointegrated VAR Model

12.3. ILLUSTRATIONS 249

Table 12.1: The unrestricted VAR representations

The common trends representationeβ⊥,1

eβ⊥,2 εmr

(0.024)ε∆p(0.015)

εRm(0.001)

εyr(0.018)

εRb(0.002)

mr 5.3 -11.6 α0⊥,1 -0.03 0.02 1.00 -0.24 -0.54∆p 0.1 0.4 α0⊥,2 -0.02 0.03 0.50 -0.16 1.00Rm -0.4 0.5 Residual standard deviations in the brackets

yr -2.2 -3.8Rb -0.9 1.0

The C matrixεmr ε∆p εRm εyr εRb

mr 0.02(0.1)

−0.18(−0.9)

−0.44(−0.1)

0.57(2.6)

−14.5(−4.9)

∆p −0.01(−0.7)

0.01(0.9)

0.31(0.9)

−0.09(−5.5)

0.40(1.8)

Rm 0.00(0.8)

0.00(0.5)

−0.13(−0.7)

0.01(1.2)

0.68(5.4)

yr 0.13(0.4)

−0.14(−0.9)

−4.05(−1.0)

1.11(5.7)

−2.60(−1.0)

Rb 0.01(0.6)

0.01(0.3)

−0.38(−0.9)

0.05(2.2)

1.51(5.1)

t-ratios in the brackets

2. based on a restricted α (the zero row restrictions of real income andbond rate), but unrestricted β.

3. based on β restricted to the structure HS.4 of Table 10.3 and α re-stricted as in 2.

The estimates eβ⊥,1, eβ⊥,2, α⊥,1, and α⊥,2 in Table 12.1 are based on a firsttentative normalization by normalizing on the largest coefficient. We notethat the largest coefficient in α⊥,2 corresponds to the weakly exogenous bondrate, whereas in α⊥,1 it corresponds to the short-term interest rate and not tothe weakly exogenous real income. Thus, by normalizing on εRm we have infact normalized on an insignificant coefficient. This is a reminder that a largecoefficient does not necessarily imply a statistically significant coefficient.Unless the residuals are standardized the magnitude of a coefficient is notvery informative. To improve interpretability we have, therefore, reportedthe residual standard deviations in the brackets underneath the residuals in

Page 248: Juselius_Johansen_The Cointegrated VAR Model

250 CHAPTER 12. IDENTIFICATION OF COMMON TRENDS

the upper part of Table 12.1. It appears that even if the coefficient of εRm is5 times larger than the coefficient of εyr , the residual standard error of thelatter is 18 times larger than the one of the former.Figure 12.1 (to be included) shows the graphs of the cumulated VAR

residuals equation by equation and Figure 12.2 (to be included) the twounrestricted common trends defined by:

Xti=1u1,i = α0⊥,1

Xti=1εi,X

ti=1u2,i = α0⊥,2

Xti=1εi,

where α⊥,1 and α⊥,2 are given by the estimates in Table 12.1 and ε0i =[εmr , ε∆p, εRm, εyr , εRb]i, i = 1974:3, ..., 1993:3.As already mentioned the common trends estimates of Table 12.1 are

not unique in the sense that we can impose (p − r − 1) = 1 identifyingrestriction on each vector without changing the likelihood function. Assume,for example, that we would like to impose a zero restriction on the bond rateresidual and a unitary coefficient on the real income residual in α⊥,1, and azero restriction on the real income residual and a unitary coefficient on thebond rate residual in α⊥,2. This can be achieved by premultiplying α⊥ bythe transformation matrix Q−1

Q−1=· −0.24 −0.54−0.16 1.00

¸−1=

· −3.06 −1.65−0.49 0.73

¸resulting in:

αc0⊥ =·0.12 −0.11 −3.89 1.0 0.00.00 0.01 −0.12 0.0 1.0

¸By post-multiplying β⊥ by Q we get the corresponding loadings matrix:

βc

⊥ =

0.58 −14.46−0.09 0.350.02 0.721.14 −2.610.06 1.49

Page 249: Juselius_Johansen_The Cointegrated VAR Model

12.3. ILLUSTRATIONS 251

Table 12.2: Three representations of the moving average form

εmr

(0.024)ε∆p(0.015)

εRm(0.001)

εyr(0.018)

εRb(0.002)

α⊥,1 0 0 0 1 0α⊥,2 0 0 0 0 1

The C matrix: Case 2 Case 3

εmr

ε∆p εRm εyr(β⊥,1) εRb(β⊥,2) εyr(β⊥,1) εRb(β⊥,2)mr 0 0 0 0.66

(2.8)−14.4(5.1)

0.82(2.8)

−16.0(4.3)

∆p 0 0 0 −0.08(−5.4)

0.51(3.0)

−0.10(−5.3)

0.44(2.0)

Rm 0 0 0 0.02(2.0)

0.68(5.0)

0.02(2.0)

0.55(4.4)

yr 0 0 0 1.25(6.0)

−3.85(1.6)

1.27(5.6)

−3.49(−1.2)

Rb 0 0 0 0.05(2.3)

1.40(4.9)

0.05(2.0)

1.45(4.4)

Note that this can also be achieved by simple row manipulations of theunrestricted vectors, for example:

αc⊥,1 = [(α0⊥1 + 0.54α

0⊥2)/(−0.33)] =

£0.12 −0.09 −3.91 1.0 0.0

¤.

In the lower part of the Table 12.1 we report the C matrix for the un-restricted VAR. The columns of εm, ε∆p and εRm contain no significant co-efficients which is consistent with the columns of α being proportional tounit vectors (as indeed seemed to be the case for HS.4 in Chapter 10). Theresults suggest that there are no significant long-run impact of disturbancesto real money stock, short-term interest rate and the inflation rate on anyof the variables of this system. Note that the first two are probably closelyrelated to unanticipated shocks to the monetary policy instruments and thethird one is the goal variable of monetary policy.In representation 2 reported in Table 12.2 we have imposed the two zero

row restrictions on α corresponding to the weak exogeneity of the real incomeand the bond rate. The corresponding two common trends,

Pti=1 εyr,i andPt

i=1 εRb,i, are now overidentified with three overidentifying restrictions each(which is the equivalent of the six degrees of freedom in the joint exogeneitytest in Chapter 9). Note that the two exogeneity restrictions completely

Page 250: Juselius_Johansen_The Cointegrated VAR Model

252 CHAPTER 12. IDENTIFICATION OF COMMON TRENDS

determine the common trends in our empirical model (recalling that p− r =2). Consequently, two rows of the Π matrix are zero in this case and theremaining three rows can be represented as:

Π =

α11 0 00 α22 00 0 α330 0 00 0 0

β01xt−1β02xt−1β03xt−1

.Thus, when there are exactly p − r weakly exogenous variables, the α

vectors can be represented as r unit vectors resulting in r zero columns inthe C matrix.In addition to the weak exogeneity restrictions on α, representation 3

has imposed the structure HS.4 on β. Since the first three columns of the Cmatrix are equal to zero, Table 12.2 only reports the last two columns. In thiscase they are equivalent to β⊥,1 and β⊥,2. A comparison of the columns for εyrand εRb, i.e. of β⊥,1 and β⊥,2, in the three cases reveals that the unrestrictedestimates have not changed much as a result of imposing the sixα restrictionsand the four β restrictions. This is because these restrictions were acceptedwith very high p-values. When this is not the case the estimates of theC matrix may differ considerably depending on whether they are based onunrestricted or restricted α and β.While the long-run weak exogeneity of the bond rate and the real income

implies that their cumulated residuals can be considered common stochastictrends, it does not necessarily imply that these two variables are the commontrends. For this to be the case we need the further condition on the Γimatricesthat the rows associated with the weakly exogenous variables have tobe zero. In a such a case the equation for the variable xj,t becomes∆xj,t = εj,tso that xj,t =

Pti=1 εj,i, i.e. the common stochastic trend coincides with the

variable itself.Table 12.2 shows that the bond rate is significantly affected by both

of the stochastic trends. The same is true for the real aggregate incomevariable, though less significantly so. The reason for this can be found inTable 11.1 which shows that both variables exhibit significant effects fromlagged changes of process.Chapter 5 demonstrated that cointegration and common trends are two

sides of the same coin. We will now exploit this feature to learn more about

Page 251: Juselius_Johansen_The Cointegrated VAR Model

12.3. ILLUSTRATIONS 253

the generating mechanisms underlying the cointegration structure HS.4. Wereproduce the common trends estimates of Case 3 in Table 12.2:

mr

∆pRmyr

Rb

=

0.82 −16.0−0.10 0.440.02 0.551.27 -3.490.05 1.45

· P

εyr,tPεRb,t

¸+stationary anddeterministiccomponents

where εyr,t is a measure of u1t and εRb,t a measure of u2t. In Chapter 9 wefound that the liquidity ratio, the interest rate spread and the short andlong-term real interest rates were nonstationary. Using the above results wecan now express these as linear functions of the two stochastic trends:

mr − yr = (0.82− 1.27)Σεyr,i − (16.0 + 3.5)ΣεRb,i = −0.45Σεyr ,i − 12.50ΣεRb,iRm −Rb = (0.02− 0.05)Σεyr,i + (0.55− 1.45)ΣεRb,i = −0.03Σεyr,i − 0.90ΣεRb,iRm −∆p = (0.02 + 0.10)Σεyr ,i + (0.55− 0.44)ΣεRb,i = 0.12Σεyr,i + 0.11ΣεRb,iRb −∆p = (0.05 + 0.10)Σεyr ,i + (1.45− 0.44)ΣεRb,i = 0.15Σεyr,i + 1.01ΣεRb,i

The nonstationarity of the liquidity ratio seems primarily to derive fromcumulated shocks to the bond rate. The level of real money balances seemsto have been strongly influenced by the latter trend, whereas the level ofreal aggregate income much less so. Furthermore, cumulated shocks to realaggregate income might have influenced real money balances and real incomedifferently (though the difference between 0.82 and 1.27 may or may not bestatistically significant). Thus, the results seem to suggest that the Danishmoney demand has been more interest rate elastic than aggregate demand.The nonstationarity of the interest spread is related to both stochastic

trends. Both the short- and the long-term interest rate have been affectedby the two shocks, but the bond rate more strongly so.The magnitude of the nonstationarity of the short-term and the long-

term real interest rate appears to differ in the sense that the real short-terminterest rate is closer to stationarity than the real long-term rate. This isprimarily because the real bond rate has been more strongly affected by thetwo stochastic trends than the real deposit rate. Furthermore, it is interest-ing to notice that the first stochastic trend, describing shocks to real income,has had a positive impact on both interest rates but a negative one on the

Page 252: Juselius_Johansen_The Cointegrated VAR Model

254 CHAPTER 12. IDENTIFICATION OF COMMON TRENDS

inflation rate, whereas the second trend describing shocks to the bond ratehas had a positive impact on the inflation rate and the two interest rates. Itseems plausible that the nonstationarity of real interest rates in this periodis related to the ’real income’ stochastic trend and its effect on the inflationrate. Altogether, the results indicate that the relationship between excessaggregate demand and inflation rate has not followed conventional mecha-nisms in the present sample period.Based on the above results it is now straightforward to examine the sta-

tionary cointegration relations defined by HS.4.

(mr − yr)− 13.8(Rm −Rb) = −0.06Σεyr,i − 0.03ΣεRb,iRm − 0.4Rb = 0.03Σεyr,i

(Rm −∆p) + 0.5(Rm −Rb)− 0.08yr == (0.11− 0.45 + 0.28)Σεyr ,i + (0.10− 0.01− 0.10)ΣεRb,i = 0.06Σεyr,i − 0.01ΣεRb,i

The reason why the stochastic trends do not cancel exactly is that cointe-gration relations are derived under the hypothesis H24 but without imposingweak exogeneity restrictions on α, whereas the common trends represen-tation is based on a restricted β and α. The stationarity of the empiricalmoney demand relation is a result of the interest rate spread being influ-enced by the stochastic trends in the same proportion as the liquidity ratio.The stationarity of the interest rate relation has been achieved by combiningthe two interest rates in the same proportion as the two stochastic trendsenter the variables. The stationarity of the third relation is achieved bycombining a homogeneous relation between the inflation rate and the two in-terest rates with the real income variable. It appears that the homogeneousinflation-interest rates relation (i.e. H27 in Table 9.3) does not cancel the twostochastic trends, of which the real trend is probably the most significant.Adding a small fraction of real income is enough to counterbalance the twostochastic trends so that stationarity is strongly improved in the extendedrelation (H28 in Table 9.3).

12.4 Economic identification

We have now obtained estimates of the common trends representation thatcan be used to evaluate the empirical content of the real money representation

Page 253: Juselius_Johansen_The Cointegrated VAR Model

12.4. ECONOMIC IDENTIFICATION 255

discussed in Chapter 2. The theoretical model predicted that nominal growthderives from money expansion in excess of real productive growth in theeconomy and real growth from productivity shocks to the economy, i.e. u1t =α0⊥,1εt and u2t = α0⊥,2εt, where α

0⊥,1 = [∗, 0, 0, 0, 0] and α0⊥,2 = [0, 0, 0, ∗, 0].

The empirical model was consistent with finding two autonomous per-manent disturbances which cumulate to two common driving trends. One ofthese seemed consistent with the hypothetical real income trend, whereas theother was clearly not related to excess monetary expansion in the domesticmarket. Instead, it was generated by permanent ’shocks’ to the long-termbond rate and, thus, seemed more related to financial behavior in long-termcapital market.To facilitate the discussion of the cointegration implications of the empir-

ical results for the real money representation we reproduce both the theoret-ical representation and the estimates from table 12.2, Case 3, i.e. with theweak exogeneity restrictions imposed on α and the restrictions HS,4 on β.To improve comparability u1,t denotes an autonomous real disturbance andu2,t a nominal disturbance.

mrt

∆ptyrt

Rm,tRb,t

=d12 00 c21

d12 00 c210 c21

· P

u1iPu2i

¸+stationary anddeterministiccomponents

(12.7)

mr

∆pyr

RmRb

=

0.82 −16.0−0.10 0.441.27 -3.490.02 0.550.05 1.45

· P

u1iPu2i

¸+stationary anddeterministiccomponents

(12.8)

The cointegration implications of the theory model (12.7) is that velocity,(mr − yr), the interest rate spread, (Rm − Rb), and the real interest rates,(Rm−∆p), and (Rb−∆p) should all be stationary. As demonstrated abovethis was not the case here and we will now discuss why.According to (12.7) the real trend should influence real money stock and

real income with the same coefficients. The estimated coefficients are 0.82 tomoney stock and 1.27 to real income; both are positive and not too far away

Page 254: Juselius_Johansen_The Cointegrated VAR Model

256 CHAPTER 12. IDENTIFICATION OF COMMON TRENDS

from unity, but may, nevertheless, deviate too much from each other. Thetheory model predicts that the nominal I(1) trend should not influence realmoney nor real income. It appears that the nominal trend (defined here asthe cumulated shocks to the long-term bond rate) has indeed an insignificanteffect on the real income variable, but a very significant one on real moneystock. Therefore, some of discrepancy between the assumed monetary modeland the empirical evidence can be related to the strong impact of the long-term interest rate in the money demand relation (conventional monetarymodels would assume the demand for money to be interest rate inelastic).

The Fisher parity predicts that nominal interest rates are only influencedby the stochastic nominal trend and the expectations hypothesis predicts thatthe latter influences the interest rates with equal coefficients. The empiricalresults show that both stochastic trends influence the two interest rates in asimilar way though not in the proportion one to one. Instead the weights areapproximately in the proportion 0.4 to 1.0 consistent with (Rm − 0.4Rb) ∼I(0) (probably because we used average yields on M3 money stock).

Finally, but not least importantly, inflation rate is significantly affectedby both trends which is against the econometric assumption of Chapter 2that only the nominal trend should influence inflation. Furthermore, the ef-fect from the real trend on inflation is highly significant and negative. Thelatter seems counter-intuitive, at least in terms of a conventional Phillipscurve relationship. The effect from the nominal trend (shocks to the bondrate) is positive, possibly reflecting the self-fulfilling effect of long-term infla-tionary expectations on price inflation. The finding of two stochastic trendsin inflation rate could also be consistent with prices containing two stochasticI(2) trends instead of one. However, the estimated results of the I(2) analysisin Chapter 14 clearly supports the existence of one I(2) trend. Therefore, toachieve econometrically consistent results we need to redefine the commonnominal trend as a linear combination of the shocks to real expenditure andto the nominal bond rate. By transforming β⊥ based on (12.8) using thefollowing transformation matrix Q

Q =

·1.0 0.000.23 1.0

¸

we achieve the following common trends representation

Page 255: Juselius_Johansen_The Cointegrated VAR Model

12.4. ECONOMIC IDENTIFICATION 257

mr

∆pyr

RmRb

=−2.86 −16.00.0 0.440.47 -3.490.15 0.550.38 1.45

· P

u1iPu2i

¸+stationary anddeterministiccomponents

(12.9)

in which inflation is affected by a single common stochastic trend definedby u2i = ΣεRb,i + 0.23Σεyr,i and u1i = Σεyr ,i as before. Note that the twocommon trends are no uncorrelated as they were in (12.8) with εyr,t and εRb,tbeing correlated with a coefficient 0.03 (see Table 11.5).Though the coefficient of the nominal trend in yr is insignificant we can

similarly apply the following transformation to (12.9):

Q =

·1.0 7.450.0 1.0

¸and obtain the representation:

mr

∆pyr

RmRb

=−2.86 −5.310.0 0.440.47 0.00.15 1.670.38 4.28

· P

u1iPu2i

¸+stationary anddeterministiccomponents

(12.10)

In (12.10) we have identified the common trends by imposing restrictions

on eβ⊥, in (12.8) on α⊥, and in (12.9) on both. Given the discussion in Sec-tion 12.1 about interpreting shocks we conclude that the definition of u1t andu2t satisfies the requirement of uniqueness and, possibly novelty, but fails onthe econometric condition that only the nominal stochastic trend should in-fluence inflation. In (12.9) and (12.10) the econometric condition is satisfiedbut at the sacrifice of the uniqueness condition. Furthermore, the economicimplications of the results became less plausible in (12.9) and (12.10). Forexample, the condition that real money stock and real aggregate demandshould be affected similarly by the real stochastic trend is now lost. Thisseems to be related to the fact that the cumulated shocks to real aggregatedemand were more significant, and negatively so, in price inflation than wereshocks to the bond rate. Moreover, the finding that the nominal stochastic

Page 256: Juselius_Johansen_The Cointegrated VAR Model

258 CHAPTER 12. IDENTIFICATION OF COMMON TRENDS

trend did not arise from shocks to money stock was very strong and signif-icant. Thus, the results illustrate how fragile a structural interpretation ofVAR residuals can be.The logic of the conventional monetary model seems to be inconsistent

with the logic of the econometric analysis and, thus, points to a need toreconsider the theoretical basis for understanding inflationary mechanisms inthis period. Whether or not such an empirical claim can be forcefully broughtforward depends on the reliability of the empirical results, i.e. whether or notthe structuring of the economic reality using the cointegrated VAR model isempirically convincing. However, as long as a first order linear approximationof the underlying economic structure provides an adequate description of theempirical reality, the VAR model is essentially a convenient summary of thecovariances of the data (see Chapter 3). Provided that further reductions(simplifications) of the model are based on valid procedures for statisticalinference the final empirical results should essentially reflect the informationin the covariances of the data.Altogether, the empirical evidence based on the present sample does not

seem to support the conventional monetary model as represented by (12.7).In particular, the finding that shocks to the long-term bond rate, insteadof the money stock, were an important driving force, suggests that financialderegulation and the increased globalization may have played a more crucialrole for nominal growth in the domestic economy than the actions of thecentral bank. By comparing the theoretical model with the correspondingempirical results we may get some understanding for why the theory modelfailed. Thus, in the ideal case the empirical analysis might suggest possibledirections for modifying the theoretical model. Alternatively, it might sug-gest how to modify the empirical model (for example by adding more data)to make it more consistent with the theory. In either case the analysis pointsforward, which is why we believe the VAR methodology has the potential ofbeing a progressive research paradigm. We will return to this question in thelast chapter.Up to this point we have not explicitly identified the estimated shocks

by making them uncorrelated, or by distinguishing between permanent andtransitory shocks. This will be the topic of the next chapter.

Page 257: Juselius_Johansen_The Cointegrated VAR Model

Chapter 13

Identification of a StructuralVAR

In the previous chapter no attempt was made to identify the transitoryshocks, i.e. the shocks which do not cumulate to stochastic trends. Thisis the purpose of this chapter.There is, however, an important difference: The identification of the long-

run structure is formulated as r linear hypotheses on the variables of thesystem, whereas the identification of the common trends is formulated asp − r linear hypotheses on the shocks to the variables/equations. While wedo not in general need to discuss what a variable is, the empirical definition ofa (structural) shock is more arbitrary. The previous chapter argued that thisis particularly so when the shock has to be estimated from the VAR residualswhich are seldom invariant to extensions of the information set. Therefore,the definition of a structural shock and how to identify it based on the VARresiduals plays an important role in the identification of the common trends.

13.1 Transitory and permanent shocks

The VAR model:

∆xt= Γ1∆xt−1+αβ0xt−1+µ+ εt,

with εt ∼ Np(0,Ω).The Vector Equilibrium Correction model with simultaneous effects:

259

Page 258: Juselius_Johansen_The Cointegrated VAR Model

260 CHAPTER 13. IDENTIFICATION OF A STRUCTURAL VAR

A0∆xt = A1∆xt−1+aβ0xt−1+µa+vt,vt ∼ Np(0,A0ΩA

00)

where A1 = A0Γ1, a = A0α, µa = A0µ, and vt= A0εt.The MA representation of the VAR model assuming no linear trends in

the data:

xt= CX

εi+C∗(L)(εt+µ0), (13.1)

where

C = β⊥(α0⊥Γβ⊥)

−1α0⊥ (13.2)

= eβ⊥α0⊥. (13.3)

The SVAR model:The SVAR model differs from the Eq.C. model in the following sense:

1. the VAR residuals are assumed to be related to some underlying ’struc-tural’ shocks which are linearly independent

2. the p ’structural shocks’ are divided into (p− r) permanent and the rtransitory shocks.

Most common trends models achieve this by assuming that

1. transitory shocks have no long-run impact on the variables in the sys-tem (i.e. the transitory shocks define zero columns in the C matrix)

2. permanent shocks have a significant long-run impact on the variablesof the system.

We consider now a ’structural’ representation as defined by the matrix B(which is similar to the matrix A0 of the Eq.C model, except that B doesnot assume a normalization) associating the ’structural’ shocks ut with theVAR residuals:

ut = Bεt (13.4)

Page 259: Juselius_Johansen_The Cointegrated VAR Model

13.1. TRANSITORY AND PERMANENT SHOCKS 261

or, alternatively, associating the VAR residuals with the underlying struc-tural shocks:

εt = B−1ut (13.5)

We can now reformulate (13.1) using (13.5):

xt=eβ⊥α0⊥B−1Xvi+C∗(L)B−1ut+µ0. (13.6)

We need to choose B so that the ’usual’ assumptions underlying a structuralintepretation are satisfied:

1. Distinction between transitory and permant shocks, i.e. ut = [us,ul] =[us,1, ..., us,r, ul,1, ..., ul,p−r]

2. The transitory shocks, [us,1, ..., us,r] have no long-run impact on thevariables of the system whereas the permanent shocks [ul,1, ..., ul,p−r]have.

3. E(utu0t) = Ip, i.e. all ’structural’ shocks are linearly independent or,

alternatively

4. E(us,tu0s,t) = Ir and E(ul,tu

0l,t) = Ip−r

Identification:By including the relation (13.4) to the unrestricted VAR model we have

introduced p× p (= 25) additional parameters so that we need to impose asmany restrictions to achieve just identification. The orthogonality condition3. implies that p × (p + 1)/2 (= 15) of the parameters in B−1 have beenfixed. The condition 2. restricts (p − r) × r (= 6) additional parameters.Thus, in our example we need to impose 4 additional restrictions to achievecomplete uniqueness of the ’structural’ shocks.We will first choose B so that conditions 1, 2, and 3 are satisfied.1. Orthogonality of the transitory shocks, us,t and the permanent shocks

ul,t = Bεt can be achieved by choosing:

us,t = α0Ω−1εtul,t = α0⊥εt

Page 260: Juselius_Johansen_The Cointegrated VAR Model

262 CHAPTER 13. IDENTIFICATION OF A STRUCTURAL VAR

2. Orthogonality within the two groups can be achieved by choosing:

B =

"α0(Ω−1α)−1/2α0Ω−1

(α0⊥Ωα⊥)−1/2α0⊥

#.

which corresponds to:

B−1 =hα(α0Ω−1α)−1/2, Ωα⊥(α

0⊥Ωα⊥)

−1/2i.

We now relax the assumption that permanent and transitory shocks areuncorrelated, while maintaining the orthogonality assumption within thegroups. Thus, we still need to distinguish between permanent and transi-tory shocks, but transitory shocks, say, can be correlated with permanentshocks: This can be ahieved by the the following choice:

B =

·α0

α0⊥

¸and

B−1 =£α α⊥

¤where α0 = α(α0α)−1 and α⊥ = α⊥(α

0⊥α⊥)

−1, so that α0εt define thetransitory shocks, and α0⊥εt define the permanent shocks.Orhogonality of the transitory shocks implies that E(α0εt)(ε0tα) =

α0Ωα = I, which can be achieved by:

us,t = (α0Ωα)−1/2α0εt

Orhogonality of the permanent shocks implies that E(α0⊥εt)(ε0tα⊥) =α0⊥Ωα⊥ = I, which can be achieved by:

ul,t = (α0⊥Ωα⊥)

−1/2α0⊥εt.

The uniqueness can be achieved econometrically by choosing A0 so thatthe covariance matrix Ω becomes diagonal and by appropriately restrict-ing α⊥. For empirical applications see Mellander, Vredin and Warne (1992),

Page 261: Juselius_Johansen_The Cointegrated VAR Model

13.1. TRANSITORY AND PERMANENT SHOCKS 263

Hansen and Warne (200?), and Coenen and Vega (2000). Whether the result-ing estimate vt can be given an economic interpretation as a unique structuralshock, depends crucially on the plausibility of the identifying assumptions.Some of them are just-identifying and, thus, cannot be tested against thedata.

Page 262: Juselius_Johansen_The Cointegrated VAR Model

Chapter 14

I(2) Symptoms in I(1) Models

The purpose of this chapter is to give a soft introduction to the I(2) model sothat the reader has a good intuition for the basic ideas prior to the analysisof the formal I(2) model of the next chapter. Generally, there are few appli-cations of the cointegrated VAR model for I(2) data. The reason for this isthat the existence of I(2) data and, hence, the relevance of the I(2) modelhas often been disregarded from the outset based on economic arguments.However, as argued in Chapter 2, the unit root property is a statistical con-cept which in general should not be translated into a property of a economicproblem. Therefore, to give a double unit root a structural economic in-terpretation as a ’very’ long-run relationships in the data is generally notgranted. For example, the intuition that I(2) trends can be found over verylong time periods is at odds with the fact that significant mean reversion ismore likely to be found in large than in small samples. Thus, the hypothesisof a double unit root is often hard to reject even in moderately sized sam-ples when the adjustment behavior is sluggish. Thus, while the distinctionbetween the I(1) and I(2) model is theoretically sharp, empirically it is oftenmuch more diffuse.The statistical analysis of the I(2) model is quite involved and the move

from the fairly well-known I(1) world to the more complex I(2) analysismay easily seem prohibitive. The aim of this chapter and the next is toconvince the reader that the I(2) analysis, though possibly demanding, iswell worth the effort. We will argue here that the I(2) analysis offer a wealthof largely unexploited possibilities for an improved understanding of empiricalproblems where acceleration rates matter, for example, the determination ofthe inflation rate.

265

Page 263: Juselius_Johansen_The Cointegrated VAR Model

266 CHAPTER 14. I(2) SYMPTOMS

The distinction between an I(2) variable and a trend-adjusted I(1) vari-able is often diffuse, particularly in small samples. Therefore, Section 14.1gives a brief informal introduction to the role of deterministic and stochas-tic components in models with nominal variables. Most econometric soft-ware packages contains a routine for cointegration analysis based on the I(1)model, but only a few include tests and estimation procedures for the I(2)model. Section 14.2, therefore, provides an intuitive approach to the I(2)model based on the definition of the so called R model of Chapter 7. Section14.3 discusses typical ’symptoms’ in the I(1) model signalling I(2) problems.Section 14.4 gives some examples of questions which can be adequately askedand tested based on the I(1) model even if the data are I(2). Section 14.5discusses under which circumstances the data can be transformed to I(1)variables without loss of information. Section 14.6 concludes.

14.1 Stochastic and deterministic components

in nominal models

When analyzing nominal instead of real variables, for example, mt and ptinstead of (m − pt), we need to reconsider the role of the stochastic anddeterministic components and how they enter the model. However, beforedoing that one would probably first like to know whether the data are I(2)or not. Therefore, most empirical applications in which any of the variablesmight contain a double unit root report the results of some univariate Dickey-Fuller type tests applied to each variable separately. We will, however, notdiscuss the univariate test procedures here because the next chapter willstrongly argue that univariate tests of individual variables cannot (and shouldnot) replace the multivariate I(2) test procedure.

Thus, we will leave the formal I(2) testing to the next chapter and, in-stead, we will take a look at the graphs of the data in levels and differences asa first step in the analysis. Because an I(2) variable typically exhibits a verysmooth behavior, which can be difficult to distinguish from an I(1) variablewith a linear trend, the differenced data are often more informative aboutpotential I(2) behavior than the data in levels. It is often the case that anI(2) variable can be approximated by a trend-adjusted I(1) variable when itis observed over shorter periods. Since the slope coefficient of a linear trendin xt corresponds to an average growth rate of ∆xt, the graph of the latter

Page 264: Juselius_Johansen_The Cointegrated VAR Model

14.1. STOCHASTIC ANDDETERMINISTIC COMPONENTS IN NOMINALMODELS267

1975 1980 1985 1990 1995

.25

.5

.75 m

1975 1980 1985 1990 1995

-.5

0

.5py

1975 1980 1985 1990 1995

1.2

1.3

1.4

1.5 ry

1975 1980 1985 1990 1995

.015

.02

.025

.03 id

1975 1980 1985 1990 1995

.03

.04

.05ib

Figure 14.1: The levels of the variables for the Danish nominal data.

often suggests whether there is significant mean reversion in the differencesor not.

In many cases it may be possible to avoid the I(2) analysis by allowingsufficiently many mean shifts in ∆xt (i.e. broken linear trends in xt) evenif the growth rate seems to be drifting off in a nonstationary manner. Forexample nominal money stock and prices in Figure 14.1, panel a and b,exhibit smooth trending behavior over the whole sample period but with achange in the slope at around 1982-83. Consistent with this the growth rates(in panel c and d) seem to fluctuate around a higher mean value up to 1983and a lower value thereafter. Hypothetically, trend-adjusted nominal moneyand prices could be found to be empirically I(1) when allowing for differentgrowth rates in the two regimes, but to be I(2) when assuming constant lineargrowth rates. In the former case we would have chosen to model the shiftfrom a high inflation to a low inflation regime deterministically in the lattercase stochastically.

Does the choice matter or not? It depends! In the Danish data we de-tected an extraordinary large shock in the bond rate and real money stock at1983:1 which approximately coincided with a change in inflationary regimes.From an econometric point of view this shock violated the normality assump-tion of the VAR model and, hence, was accounted for by a blip dummy. Thelatter cumulates to a shift in the levels of the bond rate and the levels of real

Page 265: Juselius_Johansen_The Cointegrated VAR Model

268 CHAPTER 14. I(2) SYMPTOMS

1975 1980 1985 1990 1995

0

.1 Dm

1975 1980 1985 1990 1995

0

.025

.05Dpy

1975 1980 1985 1990 1995

0

.05

.1Dry

1975 1980 1985 1990 1995

0

.005 Did

1975 1980 1985 1990 1995

-.005

0

.005Dib

Figure 14.2: The differences of the variables for the Danish nominal data.

money stock. Chapter 7 demonstrated that an unrestricted step dummy inthe VAR model corresponds to a broken linear trend in the data. Thus, byeconometrically accounting for the extraordinary large shock in the data wehave in fact chosen to model the shift deterministically. Nonetheless, Table9.3. showed that the stationarity of the Danish inflation rate was rejected(though borderline so) even when corrected for a shift in the mean at 1983:1.

In order not to violate the multivariate normality assumption of the VARmodel, big interventions and reforms often need to be modelled determinis-tically and one can ask whether this is good or bad. One could on one handargue that if the behavioral shift was properly anticipated by the economicagents then a model derived under the assumption of such forward lookingbehavior should be able to account for these large changes in the data. Forexample, if model based forward looking expectations are adequately de-scribing agents’ behavior then the linear VAR model should exhibit signs ofnon-constancy and we should preferably move to a nonlinear model analysis.If on the other hand the VAR model provides a good description of the data,then the magnitude of the shock was probably unanticipated and we shouldtreat it as a large innovation outlier.

When there are large shocks in the data, though not large enough to vi-olate the normality assumption one might, nevertheless, prefer to allow fora mean shift in the differences to avoid the I(2) analysis. In this case wetreat some shocks as deterministic though strictly not required by the econo-

Page 266: Juselius_Johansen_The Cointegrated VAR Model

14.2. ESTIMATING AN I(1) MODEL WITH I(2) DATA 269

metric analysis. Such a choice should, therefore, be justified by economicarguments. For example, one could argue that a shift dummy/linear brokentrend is a proxy for an omitted variable which, if included, would have madethe trend/dummy variable superfluous in the model. To use broken lineartrends and step dummies exclusively to avoid the I(2) analysis does not seemreasonable.

14.2 Estimating an I(1) model with I(2) data

Chapter 9 found that the inflation rate was empirically I(1) based on thedata vector x0t = [mr, yr,∆p,Rm, Rb], where m

r = m − p. This impliesthat prices are I(2) and, therefore, that nominal money stock is likely to beI(2) as well. Chapter 2 demonstrated that we need to reconsider the roleof the stochastic trends when the VAR analysis is based on nominal moneyand prices x0t = [m, p, yr, Rm, Rb] . However, in this case we also need toreconsider the role of the deterministic components.The empirical analyses of Chapter 9 showed that the real money stock

and the real income variable contained a linear trend and a level shift ataround 1983. The latter did not seem to have generated a broken lineartrend in the two variables. The question is whether we need to reconsiderthe possibility of a broken linear trend in the nominal variables. As a matterof fact nominal money and prices might very well contain a broken lineartrend even though (m− p) showed no such evidence. This would be the caseif m and p contain the same (broken) trend as it would then cancel in thereal transformation. The graphs of nominal money stock and prices in Figure14.1 suggest that the slope of the linear trend might indeed have changedat around 1983. Based on the nominal VAR analysis it is possible to testthe hypothesis that there are broken linear trends in the data and that theycancel in m− p.To illustrate this possibility we respecify the VAR model so that it is

consistent with broken linear trends both in the data and in the cointegrationrelations:

∆xt= Γ1∆xt−1+αβ0xt−2+ΦDpt+µ01+µ02Dst+εt

εt ∼ Np(0,Ω ), t = 1, ..., T (14.1)

where β0= [β0,β11,β12] and x

0t = [xt, t, tDst], Φ, µ01, and µ02 are unre-

Page 267: Juselius_Johansen_The Cointegrated VAR Model

270 CHAPTER 14. I(2) SYMPTOMS

stricted and the broken linear trend is restricted to lie in the cointegrationrelations to avoid quadratic trends in the data.

When xt ∼ I(2) and, hence ∆xt ∼ I(1) the reduced rank restriction onthe matrix Π is not sufficient to get rid of all (near) unit roots in the model.Because the process ∆xt also contains unit roots we need to impose anotherreduced rank restriction on the matrix Γ = I− Γ1. This will be formally dis-cussed in the next chapter. Note, however, that even if the rank of Π = αβ0

has been correctly determined there will remain additional unit roots in theVAR model when xt is I(2).

Therefore, a straightforward way of finding out whether there is an addi-tional (near) unit root in the model is to calculate the roots of vector process(as described in Chapter 3) after the rank r has been determined. If thereremain one or several large roots in the model for any reasonable choice of r,then it is a clear sign of I(2) behavior in at least some of the variables. Be-cause the additional unit root(s) belong to the difference matrix Γ = I− Γ1,lowering the value of r does not remove the additional unit roots associatedwith the I(2) components in the data.

In the Danish nominal money model there are altogether p × k = 5 ×2 = 10 roots in the characteristic polynomial. Since the specification ofthe deterministic components is likely to influence any inference regardingpossible I(2) components in the VAR model we will first estimate the Danishnominal money model allowing for broken linear trends both in the data andin the cointegration relations and then without allowing for such trends.

The characteristic roots are reported below for the model (14.1) allowingfor broken linear trends in the data and in the cointegration relations:

V AR(p) 0.98 0.87 0.74 0.74 0.58 0.58 0.45 0.31 0.19 0.19r = 3 1.0 1.0 0.90 0.67 0.67 0.65 0.44 0.30 0.08 0.08r = 2 1.0 1.0 1.0 0.78 0.61 0.61 0.45 0.26 0.22 0.17

There appears to be two large roots in the unrestricted model consistentwith r = 3. When imposing this rank restriction, the model contains twounit roots plus an additional fairly large root (0.90). When imposing threeunit roots a quite large root (0.78) remains in the model, though it is nowconsiderably smaller compared to r = 3. Thus, if the rank is three as arguedin Chapter 8, the results might suggest a double unit root in the data. Nev-

Page 268: Juselius_Johansen_The Cointegrated VAR Model

14.3. AN INTUITIVE APPROACH 271

ertheless, the evidence of I(2) behavior is not very strong, a conclusion thatwill be confirmed by the graphs of β0xt and β0R1,t given below.For the model version with no broken trends and Ds83t restricted to the

cointegration space the modulus of the roots are reported below:

V AR(p) 0.99 0.90 0.74 0.72 0.72 0.56 0.51 0.32 0.24 0.24r = 3 1.0 1.0 0.92 0.79 0.65 0.65 0.51 0.34 0.23 0.23r = 2 1.0 1.0 1.0 0.78 0.78 0.47 0.46 0.46 0.33 0.15

Compared to the model with broken linear trends the results are almostunchanged. Thus, the stochastic movements in the data do not seem to beparticularly well approximated by the introduction of broken linear trendsin the model.

14.3 An intuitive approach

One might ask whether it makes sense at all sense to estimate the I(1) modelin this case. We will argue below that one can test a number of hypothesesbased on the I(1) procedure even if xt is I(2) without loosing much precisionbut that the interpretation of the results has to be modified as compared tothe I(1) case. The intuition for this can be seen from the so called R-modeldiscussed in Section 7.1.We reproduce the basic definitions of the R-model:

R0t= αβ0R1t+εt. (14.2)

where R0t and R1t are found by first concentrating out lagged short-runeffects, ∆xt−1, and intervention effects, Dpt,Dst, in model (14.1):

∆xt= B1∆xt−1+ΦDpt+µ01+µ02Dst +R0t (14.3)

and

xt−2= B2∆xt−1+ΦDpt+µ01+µ02Dst +R1t. (14.4)

Page 269: Juselius_Johansen_The Cointegrated VAR Model

272 CHAPTER 14. I(2) SYMPTOMS

When xt ∼ I(2), both ∆xt and ∆xt−1 contain a common I(1) trend whichis cancelled when regressing one on the other as is done in (14.3). Thus,R0t ∼ I(0) even if ∆xt ∼ I(1). On the other hand, the I(2) trend in xt−2cannot be cancelled by regression on the I(1) variable ∆xt−1 as is done in(14.4). Thus, R1t ∼ I(2). Because R0t ∼ I(0) and εt ∼ I(0), the equation(14.2) can only be true if β0R1t ∼ I(0) or β = 0. Thus, the linear combinationβ0R1t transforms the process from I(2) to I(0) and we say that the estimateβ is super-super consistent.

The connection between β0xt−2 and β0R1t can be seen by inserting (14.4)into (14.2):

R0t = αβ0(xt−2−B2∆xt−1) + εt (14.5)

= α(β0t−2x− β0B2∆xt−1) + εt= α(β0t−2x−ω0∆xt−1) + εt

where ω = β0B2. Thus, the stationary relations β0R1t consists of two compo-nents β0xt−2 and ω0∆xt−1. For the cointegration relations β0iR1t, i = 1, ..., rto be stationary there are two possibilities: (i) either ωi= 0 and β0ixt−2∼ I(0), or (ii) ω0i∆xt−1 ∼ I(1) and β0ixt ∼ I(1) cointegrate to produce thestationary relation β0R1t ∼ I(0). In the first case we talk about directlystationary relations, in the second case about polynomially cointegrated re-lations. Here we consider β0xt ∼ I(1) without separating between the twocases, albeit recognizing that some of the cointegration relations β0xt may bestationary by themselves. The next chapter will discuss more formally howto distinguish between the two cases.

To conclude: when xt ∼ I(2) we have that β0ixt ∼ I(1) and β0iR1t ∼ I(0)for at least one i, i = 1, ..., r. It is a strong indication of double unit rootsin the data when the graphs of at least one of the cointegration relations,β0ixt, exhibits nonstationary behavior but β

0iR1t looks stationary. This gives

a powerful diagnostic for detecting I(2) behavior in the VAR model.

The graphs of the cointegration relations based on model (14.1) with nobroken trend in the cointegration relations are given in Figures 14.3-14.5. Theupper panels show the relations, β0ixt, and the lower panels the cointegrationrelations corrected for short-run dynamics, β0iR1t.

Page 270: Juselius_Johansen_The Cointegrated VAR Model

14.3. AN INTUITIVE APPROACH 273

Beta1'*Z2(t)

74 76 78 80 82 84 86 88 90 92-27.6

-26.4

-25.2

-24.0

-22.8

-21.6

Beta1'*R2(t)

74 76 78 80 82 84 86 88 90 92-3

-2

-1

0

1

2

3

4

Figure 14.3. The graphs of β01xt (upper panel) and β01R1t (lower panel).

Beta2'*Z2(t)

74 76 78 80 82 84 86 88 90 92-10.8

-9.0

-7.2

-5.4

-3.6

-1.8

-0.0

1.8

3.6

Beta2'*R2(t)

74 76 78 80 82 84 86 88 90 92-2.0-1.5

-1.0-0.5

0.00.5

1.0

1.5

2.02.5

Figure 14.4. The graphs of β02xt (upper panel) and β02R1t (lower panel).

Page 271: Juselius_Johansen_The Cointegrated VAR Model

274 CHAPTER 14. I(2) SYMPTOMS

Beta3'*Z2(t)

74 76 78 80 82 84 86 88 90 92-41.6-40.8

-40.0

-39.2-38.4-37.6

-36.8

-36.0

-35.2-34.4

Beta3'*R2(t)

74 76 78 80 82 84 86 88 90 92-3

-2

-1

0

1

2

3

Figure 14.5. The graphs of β03xt (upper panel) and β03R1t (lower panel).

As can be seen the graphs of β0xt and β0R1,t are quite different supportingthe interpretation above that there might be a double unit root in the data.

14.4 Transforming I(2) data to I(1)

Chapter 2 demonstrated that (m − p) ∼ I(1) when both nominal moneyand prices contain the same I(2) trend (with the same coefficients). All theprevious empirical analyses using real money/inflation rate were econometri-cally valid under the implicit assumption that the stochastic I(2) trend hadbeen cancelled by the nominal to real transformation. We will now test thehypothesis that all cointegration relations satisfy long-run price homogeneityusing the standard I(1) procedure. In the next chapter we will address thenominal-to-real transformation more formally and show that long-run pricehomogeneity of the cointegration relations is a necessary, but not sufficientcondition for his transformation.Chapter 2 discussed the case where the I(2) trends exclusively affected

nominal money and prices. Under the assumption of long-run price homo-geneity we demonstrated that a transformation of the two nominal variables,

Page 272: Juselius_Johansen_The Cointegrated VAR Model

14.4. TRANSFORMING I(2) DATA TO I(1) 275

m and p, to real money, m − p, and price inflation, ∆p, removed the I(2)trend without loss of information, i.e.

mt

ptyrtRm,tRb,t

∼ I(2)→(m− p)t∆ptyrtRm,tRb,t

∼ I(1)if (mt, pt) ∼ I(2), (yrt , Rm,t, Rb,t) ∼ I(1), and mt and pt cointegrate (1, -1) from I(2) to I(1). In this case the VAR analysis based on the nominalor the real data give essentially the same results with the exception thata VAR(k) model based on the real vector will have one more lag of prices(pt−k−1) compared to a VAR(k) in nominal variables. Thus, long-run pricehomogeneity is a very important property when analyzing models based onreal transformations.For the Danish data the test of long-run price homogeneity in the nomi-

nal model with no (broken) linear trends in the data or in the cointegrationrelations was accepted based on a χ2(3) = 3.38 and a p-value of 0.34. How-ever, when broken linear trends were allowed in the model long-run pricehomogeneity was not accepted. To be added!We will now examine the results of the nominal analysis when β is unre-

stricted and when long-run price homogeneity has been imposed and comparethe results with the empirical results from the real money analysis of Chapter9. Table 14.1 reports the estimates of the β relations and the correspondingα for Danish money data in nominal values.The results of the nominal and real analysis differ primarily with respect

to inflation being absent in the cointegration space in the former case butpresent in latter case. As shown in Table 8.2 inflation was only present inone of the long-run relations of the preferred structure H4, (∆p − Rm) −0.5(Rm − Rb) + 0.1yr, i.e. in a relation describing a homogeneous relationbetween inflation and the two interest rates as a function of real aggregateincome. The other two long-run relations described a relationship between(1) velocity and the interest rate spread and (2) the short and the long-terminterest rate. We have imposed three over-identifying restrictions (similarto the ones of HS4 in Table 10.2) on the β vectors of Table 14.1 to makethe nominal results as comparable as possible with the results of the realanalysis. We note that β02xt approximately reproduces the money demand

Page 273: Juselius_Johansen_The Cointegrated VAR Model

276 CHAPTER 14. I(2) SYMPTOMS

Table 14.1: Two just-identified long-run structures

Unrestricted V AR Generically identified

β1 β2 β3 β1 β2 β3m -0.03 1.00 0.01 0 1.0 0p 0.02 -1.13 -0.01 0 -1.0 0yr 0.01 -1.10 0.01 −0.18

(4.2)-1.0 0

Rm 1.00 -1.65 1.00 1.00 −13.6(−6.2)

1.00

Rb -0.60 10.48 -0.42 −0.35(1.0)

13.6(6.2)

−0.45(7.1)

D83 0.00 -0.08 -0.00 0 −0.15(−5.0)

−0.00(−0.1)

α1 α2 α3 α1 α2 α3∆mt 7.5

(7.3)−0.10(−1.7)

−0.42(−0.3)

0.75(3.8)

−0.28(−5.2)

2.6(1.6)

∆pt 3.10(4.9)

0.14(3.8)

−0.87(−1.1)

0.64(5.1)

0.00(0.2)

1.2(1.2)

∆yrt −0.15(−0.2)

0.06(1.3)

−1.40(−1.5)

0.25(1.7)

0.05(1.2)

−1.29(−1.1)

∆Rm,t −0.10(−1.9)

−0.01(−2.1)

−0.25(−3.7)

0.00(0.3)

−0.00(−1.3)

−0.39(−4.4)

∆Rb,t −0.07(−0.9)

−0.00(−0.4)

−0.03(−0.3)

−0.00(−0.2)

0.00(0.5)

−0.06(−0.5)

Page 274: Juselius_Johansen_The Cointegrated VAR Model

14.5. CONCLUDING REMARKS 277

relation and β03xt the interest rate relation of H4. The homogeneous inflationinterest rate relation can be reproduced from the second equation in Table14.1 as:

∆pt = 0.64(−0.2yr + 1.0Rm − 0.35Rb) + 1.25(Rm − 0.45Rb) (14.6)= −0.12yr + 1.9Rm − 0.8Rm (14.7)

i.e. it corresponds approximately to the following cointegration relation:(∆p − Rm) − 0.8(Rm − Rb) + 0.1yr. Except for that the coefficient to thespread is slightly higher in (14.6), the nominal money analysis reproduces theresults of the real money analysis remarkably well. The difference betweenthe two versions of the model is basically whether inflation should be treatedas stationary or nonstationary. In the real model where inflation is includedin the cointegration space, it should correspond to a unit vector if stationary.Since, Table 9.3 indicated that this was not so, the econometric analysis willbenefit from treating nominal prices as I(2).

14.5 Concluding remarks

We will argue here that both econometrically as well as economically there ispotentially a lot to be gained from using the econometrically rich structureof the I(2) analysis.There are at least three straightforward ways of checking the possibility

of double unit roots in the data:

1. The graphs of the data in levels and differences.

2. The graphs of the cointegration relations β0xt compared to β0R1,t.

3. The characteristic roots of the model for reasonable choice of cointe-gration rank.

When long-run price homogeneity is not present we may still be able totransform the data but with some violation of the I(1) properties. However,the analysis based on the I(1) model will then imply some loss of information.

Page 275: Juselius_Johansen_The Cointegrated VAR Model

ii

Page 276: Juselius_Johansen_The Cointegrated VAR Model

Contents

0.1 Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii

1 Introduction 11.1 A historical overview . . . . . . . . . . . . . . . . . . . . . . . 31.2 On the choice of economic models . . . . . . . . . . . . . . . . 31.3 Theoretical, true and observable variables . . . . . . . . . . . 51.4 Testing a theory as opposed to a hypothesis . . . . . . . . . . 71.5 Experimental design in macroeconomics . . . . . . . . . . . . 81.6 On the choice of empirical example . . . . . . . . . . . . . . . 10

2 Models and Relations 132.1 Inflation and money growth . . . . . . . . . . . . . . . . . . . 152.2 The time dependence of macroeconomic data . . . . . . . . . . 192.3 A stochastic formulation . . . . . . . . . . . . . . . . . . . . . 222.4 Scenario Analyses: treating prices as I(2) . . . . . . . . . . . . 282.5 Scenario Analyses: treating prices as I(1) . . . . . . . . . . . . 342.6 Concluding remarks . . . . . . . . . . . . . . . . . . . . . . . . 35

3 The Probability Approach 373.1 A single time series process . . . . . . . . . . . . . . . . . . . 373.2 A vector process . . . . . . . . . . . . . . . . . . . . . . . . . . 403.3 Sequential decomposition of the likelihood function . . . . . . 463.4 Deriving the VAR . . . . . . . . . . . . . . . . . . . . . . . . . 473.5 Interpreting the VAR model . . . . . . . . . . . . . . . . . . . 503.6 The dynamic properties of the VAR process . . . . . . . . . . 53

3.6.1 The roots of the characteristic function . . . . . . . . . 533.6.2 Calculating the roots using the companion matrix . . . 553.6.3 Illustration . . . . . . . . . . . . . . . . . . . . . . . . 57

3.7 Concluding remarks . . . . . . . . . . . . . . . . . . . . . . . . 58

iii

Page 277: Juselius_Johansen_The Cointegrated VAR Model

iv CONTENTS

4 Estimation and Specification 59

4.1 Likelihood based estimation in the unrestricted VAR . . . . . 60

4.1.1 The estimates of the unrestricted VAR(2) for the Dan-ish data . . . . . . . . . . . . . . . . . . . . . . . . . . 63

4.2 Three different ECM-representations . . . . . . . . . . . . . . 65

4.2.1 The ECM formulation with m = 1. . . . . . . . . . . . 66

4.2.2 The ECM formulation with m = 2 . . . . . . . . . . . 68

4.2.3 ECM-representation in acceleration rates, changes andlevels . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

4.2.4 The relationship between the different VAR formulations 71

4.3 Misspecification tests . . . . . . . . . . . . . . . . . . . . . . . 72

4.3.1 Specification checking . . . . . . . . . . . . . . . . . . . 72

4.3.2 Residual correlations and information criteria . . . . . 76

4.3.3 Tests of residual autocorrelation . . . . . . . . . . . . . 79

4.3.4 Tests of residual heteroscedastisity . . . . . . . . . . . 80

4.3.5 Normality tests . . . . . . . . . . . . . . . . . . . . . . 81

4.4 Concluding remarks . . . . . . . . . . . . . . . . . . . . . . . . 83

5 The Cointegrated VAR Model 85

5.1 Integration and cointegration . . . . . . . . . . . . . . . . . . 85

5.2 An intuitive interpretation of Π = αβ0 . . . . . . . . . . . . . 86

5.3 Common trends . . . . . . . . . . . . . . . . . . . . . . . . . . 91

5.4 From AR to MA . . . . . . . . . . . . . . . . . . . . . . . . . 93

5.5 Pulling and pushing forces . . . . . . . . . . . . . . . . . . . . 96

5.6 Concluding discussion . . . . . . . . . . . . . . . . . . . . . . 98

6 Deterministic Components 99

6.1 A dynamic regression model . . . . . . . . . . . . . . . . . . . 99

6.2 A trend and a constant in the VAR . . . . . . . . . . . . . . . 102

6.3 Five cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

6.4 The MA representation . . . . . . . . . . . . . . . . . . . . . . 107

6.5 Dummy variables in a simple regression model . . . . . . . . . 109

6.6 Dummy variables and the VAR . . . . . . . . . . . . . . . . . 110

6.7 An illustrative example . . . . . . . . . . . . . . . . . . . . . . 115

6.8 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

Page 278: Juselius_Johansen_The Cointegrated VAR Model

CONTENTS v

7 Estimation in the I(1) Model 1197.1 Concentrating the general VAR-model . . . . . . . . . . . . . 1197.2 Derivation of the ML estimator . . . . . . . . . . . . . . . . . 1227.3 Normalization . . . . . . . . . . . . . . . . . . . . . . . . . . . 1257.4 The uniqueness of the unrestricted estimates . . . . . . . . . . 1267.5 An illustration . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

8 Cointegration Rank 1398.1 The trace test . . . . . . . . . . . . . . . . . . . . . . . . . . . 1398.2 The asymptotic tables . . . . . . . . . . . . . . . . . . . . . . 1438.3 Choosing the rank . . . . . . . . . . . . . . . . . . . . . . . . 1508.4 An illustration . . . . . . . . . . . . . . . . . . . . . . . . . . . 1538.5 Recursive tests of constancy . . . . . . . . . . . . . . . . . . . 155

8.5.1 Recursively calculated trace tests . . . . . . . . . . . . 156

8.5.2 The recursively calculated log-likelihood . . . . . . . . 1578.5.3 Recursively calculated prediction tests . . . . . . . . . 158

9 Testing restrictions 1659.1 Formulating hypotheses . . . . . . . . . . . . . . . . . . . . . 1669.2 Same restriction . . . . . . . . . . . . . . . . . . . . . . . . . . 168

9.2.1 Illustrations . . . . . . . . . . . . . . . . . . . . . . . . 1719.3 Some β assumed known . . . . . . . . . . . . . . . . . . . . . 177

9.3.1 Illustrations . . . . . . . . . . . . . . . . . . . . . . . . 1799.4 Some coefficients known . . . . . . . . . . . . . . . . . . . . . 181

9.4.1 Illustrations . . . . . . . . . . . . . . . . . . . . . . . . 1849.5 Long-run weak exogeneity . . . . . . . . . . . . . . . . . . . . 186

9.5.1 Empirical illustration: . . . . . . . . . . . . . . . . . . 1899.6 Revisiting the scenario analysis . . . . . . . . . . . . . . . . . 192

10 Identification Long-Run Structure 19510.1 Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . 19610.2 Identifying restrictions . . . . . . . . . . . . . . . . . . . . . . 19710.3 Formulating identifying hypotheses . . . . . . . . . . . . . . . 20210.4 Just-identifying restrictions . . . . . . . . . . . . . . . . . . . 20610.5 Over-identifying restrictions . . . . . . . . . . . . . . . . . . . 20910.6 Lack of identification . . . . . . . . . . . . . . . . . . . . . . . 21110.7 Recursive tests of α and β . . . . . . . . . . . . . . . . . . . . 21210.8 Concluding discussions . . . . . . . . . . . . . . . . . . . . . . 216

Page 279: Juselius_Johansen_The Cointegrated VAR Model

vi CONTENTS

11 Identification of the Short-Run Structure 21911.1 Formulating identifying restrictions . . . . . . . . . . . . . . . 22011.2 Interpreting shocks . . . . . . . . . . . . . . . . . . . . . . . . 22211.3 Which economic questions? . . . . . . . . . . . . . . . . . . . 22311.4 Reduced form restrictions . . . . . . . . . . . . . . . . . . . . 22811.5 The VAR in triangular form . . . . . . . . . . . . . . . . . . . 23111.6 General restrictions . . . . . . . . . . . . . . . . . . . . . . . . 23411.7 A partial system . . . . . . . . . . . . . . . . . . . . . . . . . 23911.8 Economic identification . . . . . . . . . . . . . . . . . . . . . . 241

12 Identification of Common trends 24512.1 The common trends representation . . . . . . . . . . . . . . . 24612.2 Some special cases . . . . . . . . . . . . . . . . . . . . . . . . 24712.3 Illustrations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24812.4 Economic identification . . . . . . . . . . . . . . . . . . . . . . 254

13 Identification of a Structural VAR 25913.1 Transitory and permanent shocks . . . . . . . . . . . . . . . . 259

14 I(2) Symptoms 26514.1 Stochastic and deterministic components in nominal models . 26614.2 Estimating an I(1) model with I(2) data . . . . . . . . . . . . 26814.3 An intuitive account . . . . . . . . . . . . . . . . . . . . . . . 27014.4 Transforming I(2) data to I(1) . . . . . . . . . . . . . . . . . . 27314.5 Concluding remarks . . . . . . . . . . . . . . . . . . . . . . . . 276

15 The I(2) Model 27715.1 Introducing the I(2) model . . . . . . . . . . . . . . . . . . . . 27715.2 Defining the I(2) model . . . . . . . . . . . . . . . . . . . . . . 27915.3 Deterministic components in the I(2) model . . . . . . . . . . 28115.4 The two-step procedure . . . . . . . . . . . . . . . . . . . . . . 28215.5 Th full information maximum likelihood procedure . . . . . . 28715.6 Price relations in the long run and the medium run . . . . . . 29015.7 The nominal to real transformation . . . . . . . . . . . . . . . 292

16 On the Econometric Approach 29516.1 A concluding discussion . . . . . . . . . . . . . . . . . . . . . 29616.2 General-to-Specific and Specific-to-General . . . . . . . . . . . 299