Nielsen Dissertation


Multivariate Fractional Integration and Cointegration

By Morten Ørregaard Nielsen

A dissertation submitted to

The Faculty of Social Sciences

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

in

Economics

University of Aarhus, Denmark


To My Family


Table of Contents

Preface

Summary of the Dissertation

Dansk Resume (Danish Summary)

Chapter 1

Local Whittle Analysis of Stationary Fractional Cointegration and the Implied-Realized Volatility Relation

Chapter 2

Semiparametric Estimation in Time Series Regression with Long Range Dependence

Chapter 3

Optimal Residual Based Tests for Fractional Cointegration and Exchange Rate Dynamics

Chapter 4

Multivariate Lagrange Multiplier Tests for Fractional Integration


Preface

The present collection of papers constitutes my PhD dissertation. The dissertation was written during my studies at the Department of Economics, University of Aarhus, in the years 1999-2002. In the Spring of 2002, I visited the Department of Economics and the Cowles Foundation at Yale University. I would like to thank Peter C. B. Phillips and the department for their hospitality.

I am very grateful to my dissertation advisors, Bent Jesper Christensen and Niels Haldrup, for providing intellectual guidance and support throughout my studies, and for always being willing to read and comment on several drafts of this dissertation and my other papers.

During my studies in Aarhus, I have worked with Bent Jesper Christensen, Niels Haldrup, and Svend Hylleberg on various research projects. I am grateful for having had this opportunity to collaborate, and I hope several more opportunities will present themselves in the years to come.

Finally, I would like to thank the other PhD students in Aarhus for generating a very pleasant environment. Special thanks go to my good friend Lars Stentoft with whom I have shared several offices, lots of academic and not-so-academic discussions, and many visits to the Friday bar.

Morten Ørregaard Nielsen
Aarhus, December 2002

The pre-defense of this dissertation took place on 21 February, 2003. I would like to take this opportunity to thank the members of the assessment committee, Søren Johansen (University of Copenhagen), Jörg Breitung (Bonn University), and Svend Hylleberg (University of Aarhus) for their numerous constructive and very helpful comments and suggestions.

Morten Ørregaard Nielsen
Aarhus, March 2003


Summary of the Dissertation

This dissertation is concerned with the properties of statistical inference techniques in time series models of long memory, fractional integration, and cointegration, as well as the application of such models to economic data. The aim is to develop new and improved methods of inference for long memory models, and methods for investigating econometric relationships between such models, in particular to extend existing work on the asymptotic distribution theory for fractionally integrated and cointegrated time series.

Since the seminal work by Granger & Joyeux (1980), Granger (1981), and Hosking (1981), introducing long memory and fractionally integrated models into econometrics, there has been an increasing focus on the development of econometric and statistical inference techniques for such models. Time series exhibiting long memory (or long range dependence) are characterized by a strong dependence between observations that are distant in time, and they provide a flexible modelling framework for both stationary and nonstationary data. Two excellent surveys are Robinson (1994b) and Baillie (1996).

Recent empirical work by a large number of researchers, e.g. Diebold & Rudebusch (1989), Sowell (1992), Baillie (1996), Lobato & Velasco (2000), Andersen, Bollerslev, Diebold & Ebens (2001), and Andersen, Bollerslev, Diebold & Labys (2001), reveals that many important economic time series exhibit long memory and may be modelled using fractional integration techniques. Series where these phenomena are observed include exchange rates, interest rates, production, consumption, unemployment, volatility, and many others. Thus, almost all areas of economics are affected by these observations, and hence the importance of developing appropriate econometric techniques to model such series.

This dissertation consists of four self-contained papers. In chapter 1, I consider local Whittle analysis, see Robinson (1995) and Lobato (1999), of a stationary fractionally cointegrated model. A two step estimator, equivalent to the local Whittle QMLE, is proposed to jointly estimate the integration orders of the regressors, the integration order of the errors, and the cointegration vector. The estimator is semiparametric in the sense that it employs local assumptions on the joint spectral density matrix of the regressors and the errors near the zero frequency. Consequently, the estimator is invariant to short-run dynamics, which do not even have to be specified. I show that, for the entire stationary region of the integration orders, the estimator is asymptotically normal with block diagonal covariance matrix. Thus, the estimates of the integration orders are asymptotically independent of the estimate of the cointegration vector. Furthermore, the present estimator of the cointegrating vector is asymptotically normal for a wider range of integration orders than the narrow band frequency domain least squares estimator of Robinson (1994a), and is superior with respect to asymptotic variance, see also Christensen & Nielsen (2001). An application to financial volatility series is offered, which demonstrates that useful long-run relations may be derived between stationary time series.

Chapter 2 is concerned with semiparametric estimation in time series regression in the presence of long range dependence in both the errors and the stochastic regressors. A central limit theorem is established for a class of semiparametric frequency domain weighted least squares estimates, which includes both narrow band ordinary least squares and narrow band generalized least squares as special cases. The estimates are semiparametric as in chapter 1. This setting differs from earlier work on time series regression with long range dependence where a fully parametric approach has been employed, e.g. Robinson & Hidalgo (1997). The generalized least squares estimate is infeasible when the degree of long range dependence is unknown and must be estimated in an initial step. In that case, I show that a feasible estimate exists which has the same asymptotic properties as the infeasible estimate.

In chapter 3, I propose a Lagrange Multiplier test of the null hypothesis of cointegration in fractionally cointegrated models. The test statistic utilizes fully modified residuals, see Phillips & Hansen (1990), to cancel the endogeneity and serial correlation biases, and I show that standard asymptotic properties apply under the null and under local alternatives. With i.i.d. Gaussian errors the asymptotic Gaussian power envelope of all (unbiased) tests is achieved by the one-sided (two-sided) test. In an application to the dynamics among exchange rates for seven major currencies against the US dollar, mixed evidence of the existence of a cointegrating relation is found.

In chapter 4, I introduce a multivariate Lagrange Multiplier test for fractional integration, generalizing Tanaka’s (1999) univariate test to multiple time series. With no multivariate tests available for testing the order of fractional integration, researchers interested in multiple time series have been forced to apply univariate tests to each element of the multiple time series. That procedure is not only cumbersome, but ignores potentially important correlations between the elements of the multiple time series, which could lead to increased power of a multivariate test. A regression variant of Tanaka’s (1999) LM test is proposed by Breitung & Hassler (2002), but it is not equivalent to the LM test in the multivariate case. I show that the LM statistic is asymptotically chi-squared distributed and efficient against local alternatives. Thus, asymptotic inference in this framework is much simpler than in the usual (integer) integration models where asymptotic distributions are nonstandard, see e.g. Phillips & Durlauf (1986) and Choi & Ahn (1999). An application to multivariate time series of real interest rates for six countries is offered, demonstrating that more clear-cut evidence can be drawn from multivariate tests compared to conducting several univariate tests.


Dansk Resume (Danish Summary)

The dissertation is concerned with the properties of statistical inference methods in time series models with long memory, fractional integration, and cointegration, as well as the application of such models to economic data. The aim is to develop new and improved inference methods for time series models with long memory, and methods for investigating econometric relationships between such models, in particular to extend existing asymptotic distribution theory for fractionally integrated and cointegrated time series.

Since the seminal work of Granger & Joyeux (1980), Granger (1981), and Hosking (1981), which introduced long memory and fractionally integrated models into the econometric literature, there has been an increasing focus on the development of econometric and statistical inference techniques for these models. Time series that exhibit long memory are characterized by a strong dependence between observations that are far apart in time, and they offer flexible modelling of both stationary and nonstationary data. Two excellent survey articles are Robinson (1994b) and Baillie (1996).

Recent empirical work by many researchers, e.g. Diebold & Rudebusch (1989), Sowell (1992), Baillie (1996), Lobato & Velasco (2000), Andersen, Bollerslev, Diebold & Ebens (2001), and Andersen, Bollerslev, Diebold & Labys (2001), has demonstrated that many important economic time series exhibit long memory and can be modelled by means of fractional integration. Time series in which these phenomena have been observed include exchange rates, interest rates, production, consumption, unemployment, volatility, and many others. Thus, almost all areas are affected by these observations, which underlines the importance of developing suitable econometric techniques for modelling such time series.

The dissertation contains four self-contained papers. In chapter 1, I carry out local Whittle analysis, see Robinson (1995) and Lobato (1999), of a stationary fractionally cointegrated model. A two step estimator, which is equivalent to the local Whittle QMLE, is proposed to jointly estimate the integration orders of the regressors, the integration order of the error term, and the cointegration vector. The estimator is semiparametric in the sense that it uses local assumptions on the spectral density matrix of the regressors and the error term near the zero frequency. As a consequence, the estimator is invariant to short-run dynamics, which need not even be specified. I show that the estimator is asymptotically normally distributed for the entire stationary region of the integration orders, and that the covariance matrix is block diagonal. That is, the estimates of the integration orders and the estimates of the cointegration vector are asymptotically independent. Moreover, the new estimator is asymptotically normally distributed for a larger range of integration orders than the narrow band frequency domain least squares estimator of Robinson (1994a), and it is better with respect to asymptotic variance, see also Christensen & Nielsen (2001). The new model and estimator are applied to financial volatilities, and the application demonstrates that useful long-run relations between stationary time series can be derived.

Chapter 2 is concerned with semiparametric estimation in time series regression when the error term and the stochastic regressors have long memory. A central limit theorem is derived for a class of semiparametric frequency domain weighted least squares estimates, which includes both narrow band ordinary least squares and narrow band generalized least squares as special cases. The estimates are semiparametric in the same way as in chapter 1. This approach differs from earlier work on time series regression with long memory, where a fully parametric model has typically been assumed, see e.g. Robinson & Hidalgo (1997). The generalized least squares estimate is infeasible if the degree of long memory is unknown and must be estimated in an initial analysis. In that case, I show that a feasible estimate exists which has the same asymptotic properties as the infeasible estimate.

In chapter 3, I introduce a Lagrange Multiplier test for cointegration in fractionally cointegrated models. The test statistic uses fully modified residuals, see Phillips & Hansen (1990), to remove endogeneity and autocorrelation bias, and I show that standard asymptotic properties hold under the null hypothesis and under local alternatives. With i.i.d. Gaussian errors, the one-sided (two-sided) test achieves the asymptotic Gaussian power envelope of all (unbiased) tests. In an application to exchange rates for seven major currencies against the US dollar, I find mixed evidence for the existence of cointegration.

In chapter 4, I introduce a multivariate Lagrange Multiplier test for fractional integration, which generalizes Tanaka’s (1999) univariate test to multiple time series. In the absence of any available multivariate tests for the fractional integration order, researchers interested in multiple time series have had to apply univariate tests to each element of the multiple time series. That procedure is not only cumbersome, but ignores potentially important correlations between the elements of the multiple time series, which could lead to increased power of a multivariate test. A regression variant of Tanaka’s (1999) LM test is proposed by Breitung & Hassler (2002), but it is not equivalent to the LM test in the multivariate case. I show that the LM statistic is asymptotically chi-squared distributed and efficient against local alternatives. Thus, asymptotic inference in this model is much simpler than in the usual integer-integrated models, where the asymptotic distributions are nonstandard, see e.g. Phillips & Durlauf (1986) and Choi & Ahn (1999). The LM test is applied to a multiple time series of real interest rates for six countries, and the application demonstrates that sharper conclusions can be drawn from multivariate tests compared to conducting a series of univariate tests.

References

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Ebens, H. (2001), ‘The distribution of realized stock return volatility’, Journal of Financial Economics 61, 43–76.

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2001), ‘The distribution of exchange rate volatility’, Journal of the American Statistical Association 96, 42–55.

Baillie, R. T. (1996), ‘Long memory processes and fractional integration in econometrics’, Journal of Econometrics 73, 5–59.

Breitung, J. & Hassler, U. (2002), ‘Inference on the cointegration rank in fractionally integrated processes’, Journal of Econometrics 110, 167–185.

Choi, I. & Ahn, B. C. (1999), ‘Testing the null of stationarity for multiple time series’, Journal of Econometrics 88, 41–77.

Christensen, B. J. & Nielsen, M. Ø. (2001), ‘Semiparametric analysis of stationary fractional cointegration and the implied-realized volatility relation’, Department of Economics Working Paper 2001-04 (revised 2002), University of Aarhus.

Diebold, F. X. & Rudebusch, G. D. (1989), ‘Long memory and persistence in aggregate output’, Journal of Monetary Economics 24, 189–209.

Granger, C. W. J. (1981), ‘Some properties of time series data and their use in econometric model specification’, Journal of Econometrics 16, 121–130.

Granger, C. W. J. & Joyeux, R. (1980), ‘An introduction to long memory time series models and fractional differencing’, Journal of Time Series Analysis 1, 15–29.

Hosking, J. R. M. (1981), ‘Fractional differencing’, Biometrika 68, 165–176.

Lobato, I. N. (1999), ‘A semiparametric two-step estimator in a multivariate long memory model’, Journal of Econometrics 90, 129–153.

Lobato, I. N. & Velasco, C. (2000), ‘Long memory in stock-market trading volume’, Journal of Business and Economic Statistics 18, 410–427.

Phillips, P. C. B. & Durlauf, S. N. (1986), ‘Multiple time series regression with integrated processes’, Review of Economic Studies 53, 473–495.

Phillips, P. C. B. & Hansen, B. E. (1990), ‘Statistical inference in instrumental variables regression with I(1) variables’, Review of Economic Studies 57, 99–125.

Robinson, P. M. (1994a), ‘Semiparametric analysis of long-memory time series’, Annals of Statistics 22, 515–539.

Robinson, P. M. (1994b), Time series with strong dependence, in C. A. Sims, ed., ‘Advances in Econometrics’, Cambridge University Press, Cambridge, pp. 47–95.

Robinson, P. M. (1995), ‘Gaussian semiparametric estimation of long range dependence’, Annals of Statistics 23, 1630–1661.

Robinson, P. M. & Hidalgo, F. J. (1997), ‘Time series regression with long-range dependence’, Annals of Statistics 25, 77–104.

Sowell, F. B. (1992), ‘Modeling long run behavior with the fractional ARIMA model’, Journal of Monetary Economics 29, 277–302.

Tanaka, K. (1999), ‘The nonstationary fractional unit root’, Econometric Theory 15, 549–582.


Chapter 1

Local Whittle Analysis of Stationary Fractional Cointegration and the Implied-Realized Volatility Relation


Morten Ørregaard Nielsen∗

Abstract

We consider local Whittle analysis of a stationary fractionally cointegrated model. A two step estimator, equivalent to the local Whittle QMLE, is proposed to jointly estimate the integration orders of the regressors, the integration order of the errors, and the cointegration vector. We first show that the univariate local Whittle QMLE of the integration order of the residuals in our model is unaffected by the fact that it is based on residuals, and that it may thus be employed as an initial estimator. The two step estimator is semiparametric in the sense that it employs local assumptions on the joint spectral density matrix of the regressors and the errors near the zero frequency. We show that the estimator is asymptotically normal for the entire stationary region of the integration orders, and thus for a wider range of integration orders than the narrow band frequency domain least squares estimator, and that it is superior to the latter estimator with respect to asymptotic variance. Monte Carlo evidence documenting the finite sample feasibility of our new methodology is presented. In an application to financial volatility series, we examine the unbiasedness hypothesis in the implied-realized volatility relation.

JEL Classification: C22

Keywords: Fractional Cointegration, Fractional Integration, Whittle Likelihood, Long Memory, Realized Volatility, Semiparametric Estimation

∗I am grateful to Richard Baillie, Richard Blundell, Jörg Breitung, Bent Jesper Christensen, Niels Haldrup, Uwe Hassler, Svend Hylleberg, Søren Johansen, Helmut Lütkepohl, Neil Shephard, Herman van Dijk, and Tim Vogelsang for many helpful comments and discussions that have significantly improved the paper. I would also like to thank participants at the 2002 Econometric Society European Meeting in Venice, the 2002 Econometric Society Winter European Meeting in Budapest, the 2002 (EC)² Conference in Bologna, and seminar participants at University of Aarhus, Tilburg University, University of British Columbia, Cornell University, Michigan State University, and Nuffield College (Oxford) for comments.


1 Introduction

In this paper we are concerned with the joint estimation of the integration orders and the cointegrating vector in stationary fractionally cointegrated models. Suppose we observe the $p$-vector $z_t = (y_t, x_t')'$, which is integrated of order $d \in (0, 1/2)$, denoted $z_t \in I(d)$. For a precise statement, $z_t \in I(d)$ if

$$(1 - L)^d z_t = \varepsilon_t, \tag{1}$$

where $\varepsilon_t \in I(0)$ and $(1 - L)^d$ is defined by its binomial expansion

$$(1 - L)^d = \sum_{j=0}^{\infty} \frac{\Gamma(j - d)}{\Gamma(-d)\,\Gamma(j + 1)} L^j, \qquad \Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t}\,dt, \tag{2}$$

in the lag operator $L$ ($L z_t = z_{t-1}$). A process is labelled $I(0)$ if it is covariance stationary and has spectral density that is bounded and bounded away from zero at the origin.
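As an illustration of the expansion in (2), the following minimal Python sketch computes the first coefficients of $(1 - L)^d$. The function names are hypothetical and not taken from the paper, and the recursion $\pi_j = \pi_{j-1}(j - 1 - d)/j$ with $\pi_0 = 1$ is an assumption of the sketch that follows from $\Gamma(x + 1) = x\,\Gamma(x)$.

```python
import math


def frac_diff_weights(d, n_terms):
    """Coefficients pi_j of L^j in (1 - L)^d for j = 0, ..., n_terms - 1.

    Uses the recursion pi_j = pi_{j-1} * (j - 1 - d) / j with pi_0 = 1,
    which avoids evaluating Gamma functions directly.
    """
    weights = [1.0]
    for j in range(1, n_terms):
        weights.append(weights[-1] * (j - 1 - d) / j)
    return weights


def frac_diff_weight_gamma(d, j):
    """Direct evaluation of Gamma(j - d) / (Gamma(-d) * Gamma(j + 1)).

    Only valid when -d and j - d are not non-positive integers
    (for example, it fails for d = 0, where Gamma(-d) has a pole).
    """
    return math.gamma(j - d) / (math.gamma(-d) * math.gamma(j + 1))


if __name__ == "__main__":
    d = 0.3  # illustrative value inside the stationary region (0, 1/2)
    for j, w in enumerate(frac_diff_weights(d, 6)):
        print(j, w, frac_diff_weight_gamma(d, j))
```

For $d = 1$ the recursion returns the coefficients $1, -1, 0, 0, \ldots$ of the ordinary first difference, which is a quick sanity check on the sign convention.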

A scalar-valued stochastic process generated by (1) has spectral density

$$f(\lambda) \sim g \lambda^{-2d} \quad \text{as } \lambda \to 0^+, \tag{3}$$

where $g$ is a constant and the symbol “∼” means that the ratio of the left- and right-hand sides tends to one in the limit. Such a process is said to possess strong dependence or long range dependence, since the autocorrelations decay at a hyperbolic rate in contrast to the much faster exponential rate in the weak dependence case. The parameter $d$ determines the memory of the process. If $d > -1/2$, $z_t$ is invertible and admits a linear representation, and if $d < 1/2$ it is covariance stationary. If $d = 0$, the spectral density (3) is bounded at the origin, and the process has only weak dependence. Sometimes, $z_t$ is said to have intermediate memory, short memory, and long memory when $d < 0$, $d = 0$, and $d > 0$, respectively.
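The contrast between hyperbolic and exponential decay can be made concrete in the simplest special case, which is an assumption of this sketch rather than part of the text: if $\varepsilon_t$ in (1) is white noise and $0 < d < 1/2$, the lag-$k$ autocorrelation is $\rho(k) = \Gamma(1 - d)\Gamma(k + d) / (\Gamma(d)\Gamma(k + 1 - d))$, which behaves like $[\Gamma(1 - d)/\Gamma(d)]\,k^{2d - 1}$ for large $k$ (Hosking, 1981). The function names below are hypothetical.

```python
import math


def fractional_noise_acf(d, k):
    """Lag-k autocorrelation of (1 - L)^d z_t = white noise, for 0 < d < 1/2.

    Evaluated on the log scale with lgamma to avoid overflow at long lags.
    """
    log_rho = (math.lgamma(1 - d) + math.lgamma(k + d)
               - math.lgamma(d) - math.lgamma(k + 1 - d))
    return math.exp(log_rho)


def hyperbolic_approximation(d, k):
    """Large-k approximation [Gamma(1 - d) / Gamma(d)] * k^(2d - 1)."""
    scale = math.exp(math.lgamma(1 - d) - math.lgamma(d))
    return scale * k ** (2 * d - 1)


if __name__ == "__main__":
    d, phi = 0.3, 0.9  # illustrative memory parameter and AR(1) coefficient
    for k in (1, 10, 100, 1000):
        # AR(1) autocorrelations phi^k decay exponentially by comparison
        print(k, fractional_noise_acf(d, k), hyperbolic_approximation(d, k), phi ** k)
```

Even a fairly persistent weakly dependent process, here an AR(1) with coefficient 0.9, has autocorrelations that fall below those of the long memory process at long lags.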

Suppose further that $z_t = (y_t, x_t')'$ satisfies the regression model

$$y_t = \beta' x_t + e_t, \tag{4}$$

where the error term is integrated of a smaller order $d_e < d$, i.e. $e_t \in I(d_e)$. A much studied special case is the standard $I(1) - I(0)$ cointegration model which arises when $d = 1$ and $d_e = 0$, see e.g. Watson (1994) for a review. When $d$ and/or $d_e$ are not integers the model is called a fractional cointegration model following the original idea by Granger (1981). We call the model (4) with $0 \le d_e < d < 1/2$ a stationary fractionally cointegrated model, since it is concerned with the long-run linear co-movement between two or more stationary fractionally integrated processes. The properties of the model in the standard $I(1) - I(0)$ cointegration case are well known, see Watson (1994), but the fractional cointegration framework has been examined only recently, see the short review in Robinson & Yajima (2002).

Throughout this paper, we shall be concerned with the stationary case $d \in (0, 1/2)$. This interval is relevant for many applications in finance, for instance stock market trading volume (Lobato & Velasco (2000)), exchange rate volatility (e.g. Andersen, Bollerslev, Diebold & Labys (2001)), stock return volatility (e.g. Andersen, Bollerslev, Diebold & Ebens (2001) and Christensen & Nielsen (2004)), spot prices for crude oil (Robinson & Yajima (2002)), and electricity spot prices (Haldrup & Nielsen (2004)). In particular, it is the relevant region for the volatility processes we study below in our empirical application. Henry & Zaffaroni (2003) provide a survey of empirical applications of fractional integration and long memory in macroeconomics and finance.

Since our model is stationary, a comparison with the standard time series regression model with weakly dependent regressors is natural. It is well known that, in that case, under a wide variety of regularity conditions, the ordinary least squares and generalized least squares estimates of $\beta$ in (4) are asymptotically normal, see e.g. Hannan (1979). The new complication is that, since the regressors and the errors both have long memory, they are potentially correlated even at very long horizons, thus rendering the OLS estimator inconsistent as discussed by Robinson (1994a, 1994b) and Robinson & Hidalgo (1997). To deal with this issue, Robinson (1994a) proposed a semiparametric narrow band frequency domain least squares (FDLS) estimator that assumes only a multivariate generalization of (3), and essentially performs OLS on a degenerating band of frequencies around the origin. The consistency of the estimator in the stationary case is proved by Robinson (1994a), and Christensen & Nielsen (2004) show that its asymptotic distribution is normal when the collective memory of the regressors and the error term is less than 1/2, i.e. when $d + d_e < 1/2$. In contrast, Robinson & Marinucci (2003) consider several cases where the regressors are fractionally integrated and nonstationary, and show that the limiting distributions for the FDLS estimator are then functionals of fractional Brownian motion, and Chen & Hurvich (2003a) generalize the model to allow deterministic polynomial trends. As an alternative, Robinson & Hidalgo (1997) introduced a parametric class of (full band) weighted least squares estimates (including generalized least squares as a special case), and proved root-$n$-consistency and asymptotic normality for their estimates, assuming correct specification of the dynamics at any frequency (later relaxed by Hidalgo & Robinson (2002)) and independence between the regressors and the errors.
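The idea of performing OLS on a narrow band of frequencies around the origin can be sketched in a few lines of Python. The sketch assumes the estimator takes the form $\hat\beta_m = [\mathrm{Re}\sum_{j=1}^{m} I_{xx}(\lambda_j)]^{-1}\,\mathrm{Re}\sum_{j=1}^{m} I_{xy}(\lambda_j)$ with Fourier frequencies $\lambda_j = 2\pi j/n$ and the zero frequency omitted; the exact definition in Robinson (1994a) may differ in such details, and the function name `narrow_band_fdls` is hypothetical.

```python
import numpy as np


def narrow_band_fdls(y, x, m):
    """Narrow band frequency domain least squares slope estimate (a sketch).

    y : array of length n
    x : array of shape (n,) or (n, k)
    m : number of Fourier frequencies used, 1 <= m <= n - 1
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    # Discrete Fourier transforms at lambda_j = 2*pi*j/n for j = 1, ..., m.
    # Dropping j = 0 makes the estimate invariant to the sample means.
    wy = np.fft.fft(y)[1:m + 1]
    wx = np.fft.fft(x, axis=0)[1:m + 1, :]
    # Real parts of the summed (cross-)periodograms; the normalising
    # constant 1 / (2*pi*n) cancels between the two terms.
    f_xx = np.real(wx.conj().T @ wx)
    f_xy = np.real(wx.conj().T @ wy)
    return np.linalg.solve(f_xx, f_xy)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(512)
    y = 1.0 + 2.0 * x + 0.5 * rng.standard_normal(512)  # simulated data
    print(narrow_band_fdls(y, x, m=20))
```

With $m = n - 1$ all nonzero frequencies are used and, by Parseval's identity, the estimate coincides with ordinary least squares on demeaned data; the narrow band version corresponds to letting $m$ grow more slowly than $n$.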

Many estimators of the memory parameter $d$ and the scale parameter $g$ have been suggested in the literature. A semiparametric approach has been developed by Geweke & Porter-Hudak (1983), Künsch (1987), Robinson (1994a, 1995a, 1995b), Lobato & Robinson (1996), and Lobato (1999), among others. The semiparametric estimators of the memory parameter assume only the model (3) for the spectral density, and use a degenerating part of the periodogram around the origin to estimate the model. This approach has the advantage of being invariant to any short- and medium-term dynamics (as well as mean terms since the zero frequency is usually left out). In particular, a local Whittle quasi maximum likelihood estimator (QMLE) based on the maximization of a local Whittle approximation to the likelihood, see our equation (7), has been developed by Künsch (1987), Robinson (1995a) (who called it a Gaussian semiparametric estimator), and Lobato (1999) to estimate the integration orders of univariate and multivariate stationary fractionally integrated time series, respectively. Of course, a fully parametric approach is more efficient, using the entire sample, but is inconsistent if the parametric model is specified incorrectly, e.g. if the lag-structure of the short-term dynamics is misspecified.
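A minimal Python sketch of a univariate local Whittle estimate may help fix ideas. It assumes the familiar concentrated objective associated with Robinson (1995a), $R(d) = \log\big(m^{-1}\sum_{j=1}^{m} \lambda_j^{2d} I(\lambda_j)\big) - 2d\,m^{-1}\sum_{j=1}^{m}\log\lambda_j$, minimized over $d$; it is not a reproduction of the multivariate likelihood referred to as equation (7). The function names, the grid search, and the grid bounds are hypothetical choices made for illustration.

```python
import numpy as np


def local_whittle_objective(d, periodogram, freqs):
    """Concentrated local Whittle objective R(d) (assumed univariate form)."""
    g_hat = np.mean(freqs ** (2.0 * d) * periodogram)  # implied scale estimate
    return np.log(g_hat) - 2.0 * d * np.mean(np.log(freqs))


def local_whittle_d(series, m, grid=None):
    """Grid-search estimate of d from the first m Fourier frequencies."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    freqs = 2.0 * np.pi * np.arange(1, m + 1) / n
    # Periodogram at lambda_1, ..., lambda_m; the zero frequency is left out,
    # so the estimate does not depend on the sample mean.
    dft = np.fft.fft(series)[1:m + 1]
    periodogram = np.abs(dft) ** 2 / (2.0 * np.pi * n)
    if grid is None:
        grid = np.linspace(-0.49, 0.49, 981)  # illustrative search interval
    values = [local_whittle_objective(d, periodogram, freqs) for d in grid]
    return float(grid[int(np.argmin(values))])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    white_noise = rng.standard_normal(4096)  # a process with d = 0
    print(local_whittle_d(white_noise, m=200))
```

A grid search is used only to keep the sketch dependency-free; a numerical optimizer over the same interval would serve equally well. The bandwidth `m` plays the role of the degenerating band of frequencies described above.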

The methods described above are combined by Marinucci & Robinson (2001) and Christensen & Nielsen (2004), who suggest conducting a fractional cointegration analysis in several steps. First, the integration orders of the raw data are estimated by, e.g., the local Whittle QMLE. Secondly, the narrow band FDLS estimator for the cointegrating vector is calculated, and finally the integration order of the residuals is estimated assuming that the approach is equally valid for residuals. Hypothesis testing is then conducted on $d_e$ as if $e_t$ were observed, and on $\beta$ as if $d_e$ (which enters in the limiting distribution of the FDLS estimator) were known. Although this is indeed a valid course of action, see Hassler, Marmol & Velasco (2000) and Velasco (2003) for the nonstationary case, and our Theorem 1 below for the stationary case, a joint estimation method for the integration orders and the cointegration vector would be preferable.

We propose a simple joint semiparametric two step estimator of the integration orders and the cointegration vector in (4), which is equivalent to the local Whittle QMLE. The two step estimator is based on consistent initial estimators. We show that such estimators exist, and in particular, our Theorem 1 shows that the local Whittle QMLE of the integration order of the residuals is unaffected by the fact that it is based on residuals. More generally, this shows that in fact the three step procedure employed by Marinucci & Robinson (2001) and Christensen & Nielsen (2004) is valid. That is, inference on $d_e$ may, in their setup, be conducted based on our distributional result in Theorem 1 and is equivalent to disregarding the fact that the estimator is based on residuals.

Similarly to the narrow band FDLS estimator of the cointegration vector and the local Whittle QMLE of the integration orders, our two step estimator employs local assumptions on the joint spectral density matrix of the regressors and the errors near the zero frequency. It turns out that the limiting distribution of our estimator has a block diagonal covariance matrix, so that the estimates of the integration orders are asymptotically uncorrelated with the estimates of the cointegration vector. Thus, the marginal limiting distribution of the estimates of the integration orders equals the one derived by Lobato (1999), and in particular, it is unaffected by the fact that it is partly based on residuals. In contrast to the FDLS estimator, we show that our two step estimator is asymptotically normal for the entire parameter space, i.e. $0 \le d_e < d < 1/2$, thus avoiding the condition $d + d_e < 1/2$ required by the FDLS estimator for asymptotic normality, see Christensen & Nielsen (2004). We also demonstrate that our two step estimator, in addition to being applicable for a wider range of integration orders, has smaller asymptotic variance than the FDLS estimator when the latter is asymptotically normal.

A similar approach to ours is considered by Velasco (2001) for bivariate nonstationary fractionally cointegrated processes, and similar results for the asymptotic distribution are reached using data tapering, following Lobato & Velasco (2000). However, Velasco’s (2001) results are limited to a bivariate model, and require tapering and an additional user chosen bandwidth parameter to trim out the very first Fourier frequencies as in Robinson (1995b).

Following the semiparametric approach outlined above, our estimator enjoys the extremely general treatment of the short-term dynamics that has made the log-periodogram and local Whittle estimators popular among practitioners. In particular, the short-term dynamics do not even need to be specified, since only a degenerating band of frequencies around the origin is used. In contrast, for a parametric estimator to be consistent we would have to specify correctly the short-run dynamics of the model, employing e.g. a vector fractional ARIMA specification as in Dueker & Startz (1998). The obvious cost for this robustness is that the efficiency of the semiparametric estimator relative to a correctly specified parametric estimator converges to zero.

In order to demonstrate the feasibility of our methodology in finite samples, we present the results of a small Monte Carlo study. The results show that the performance of the estimator is very good with respect to bias and root mean squared error when no short-run dynamics are present. However, they also show that the bandwidth parameter should not be too high in the presence of short-run dynamics to avoid biased results.

The stationary fractional cointegration model has many potential applications, especially in finance. Many financial time series, like the volatility of stock returns and exchange rates, have been found to be well described by stationary fractionally integrated processes, see e.g. the seminal early contributions by Baillie, Bollerslev & Mikkelsen (1996), Breidt, Crato & de Lima (1998), and Harvey (1998) or the more recent studies by Andersen, Bollerslev, Diebold & Ebens (2001), Andersen, Bollerslev, Diebold & Labys (2001), and Christensen & Nielsen (2004). Our model then applies if it is assumed that the long memory is a common feature between two or more such processes, which would seem like a plausible assumption especially if the underlying assets are traded on the same market (exchange rate or stock market).

Finally, we offer an application to the relation between the volatility implied by option prices and the volatility subsequently realized in the stock market, which has previously been analyzed by, e.g., Christensen & Prabhala (1998), Christensen & Nielsen (2004), and Bandi & Perron (2004). The unbiasedness hypothesis in the option market implies a slope coefficient of unity in the implied-realized volatility relation, but the ordinary regression estimate is less than one-half. However, we conduct a stationary fractional cointegration analysis, and find that the volatility series are well described as being stationary fractionally cointegrated with $d$ approximately 0.45 and $d_e$ insignificantly different from zero. When accounting for the possibility of stationary fractional cointegration, the estimated slope coefficient is in one case insignificantly different from unity, thus supporting long-run unbiasedness of implied volatility as a forecaster of realized volatility. However, the evidence when applying our more efficient joint estimation procedure is not as clear-cut as in Christensen & Nielsen (2004) and Bandi & Perron (2004), and in particular the tests for long-run unbiasedness actually reject when larger bandwidths are employed.

The paper is organized as follows. In the next section we present the model and set up the local Whittle likelihood and the assumptions necessary to prove our main result. We also present our first theorem which shows the validity of the univariate local Whittle QMLE of $d$ when based on residuals. In section 3 we state our main theorem which gives the asymptotic distribution of the joint semiparametric two step estimator. The asymptotic distribution is compared to that of the local Whittle QMLE of $d$ and that of the narrow band FDLS estimator of $\beta$. Section 4 presents the results of a Monte Carlo study, illustrating the finite sample behavior of the proposed estimator. Section 5 presents the empirical application to the implied-realized volatility relation, and section 6 concludes. The proofs of the two theorems are provided in three appendices.

2 Stationary Fractional Cointegration Model

Let us now generalize the simple model described above. In particular, suppose the spectral density matrix of the $p$-vector $w_t = (x_t', e_t)'$ is

$$f(\lambda) \sim \Lambda^{-1} G \Lambda^{-1} \quad \text{as } \lambda \to 0^+, \tag{5}$$

where $\Lambda = \mathrm{diag}(\lambda^{d_1}, \ldots, \lambda^{d_p})$, $d_a \in \Delta = \{x \mid 0 \le x \le \Delta_1\}$, $0 < \Delta_1 < 1/2$, $a = 1, \ldots, p$, and $G$ is a $p \times p$ real symmetric matrix. Here, $\mathrm{diag}(a_1, a_2, \ldots, a_k)$ denotes the diagonal $k \times k$ matrix with diagonal elements $a_1, a_2, \ldots, a_k$. Later we shall use the more general notation that, for $m_i \times m_i$ matrices $A_i$, $i = 1, \ldots, k$, $\mathrm{diag}(A_1, A_2, \ldots, A_k)$ is the $\sum_{i=1}^{k} m_i \times \sum_{i=1}^{k} m_i$ block-diagonal matrix with diagonal blocks $A_1, A_2, \ldots, A_k$. Furthermore, the symbol "∼" means that the ratio of the left- and right-hand sides tends to one in the limit, element-by-element. Equation (5) is the natural multivariate extension of (3), including multivariate fractional ARIMA models as a special case, and is also considered in previous work by e.g. Robinson (1995b), Lobato (1999), and Robinson & Yajima (2002). Thus, the elements of the vector $x_t$ can be integrated of different orders, i.e. $x_{at} \in I(d_a)$. This implies, by (4), that $y_t \in I(\max_{1 \le a \le p-1} d_a)$ (with the maximum taken over those $a$ where $\beta_a \neq 0$), such that the conceptual requirement that at least two of the variables in $(y_t, x_t')'$ must be integrated of the same order is automatically satisfied. Notice that $d_p$ now denotes the integration order of the error term, i.e. $e_t \in I(d_p)$.
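As a hedged illustration of how a fractional ARIMA model fits (5), consider the simplest univariate case, fractional noise with innovation variance $\sigma^2$. The symbol $\sigma^2$ is notation introduced only for this example and does not appear in the text.

```latex
% Illustration only: univariate fractional noise (1-L)^d x_t = \varepsilon_t,
% with Var(\varepsilon_t) = \sigma^2 (notation assumed for this example).
\begin{align*}
f(\lambda) &= \frac{\sigma^2}{2\pi}\,\bigl|1-e^{i\lambda}\bigr|^{-2d}
            = \frac{\sigma^2}{2\pi}\,\bigl(2\sin(\lambda/2)\bigr)^{-2d} \\
           &\sim \frac{\sigma^2}{2\pi}\,\lambda^{-2d}
            \quad\text{as } \lambda\to 0^+ ,
\end{align*}
% so (5) holds with p = 1, \Lambda = \lambda^{d} and G = \sigma^2/(2\pi).
```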

For simplicity of presentation we assume that only one cointegration vector exists, i.e. that the cointegration rank is unity. A generalization to the case with cointegration rank $r < p$ along the lines of Engle & Granger (1987) should be possible, but is beyond the scope of this paper. Consistent semiparametric procedures for estimating the cointegration rank, $r$, from data have been explored recently by Robinson & Yajima (2002) in the stationary fractional cointegration case also considered here, and by e.g. Chen & Hurvich (2003b) and Nielsen & Shimotsu (2004) for the nonstationary case.

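We collect the parameters of interest in the $(2p-1)$-vector $\theta = (d_1, \ldots, d_p, \beta')'$. The Whittle approximation to the (negative) likelihood is

$$W(\theta, G) = \int_{-\pi}^{\pi} \left( \log |f(\lambda)| + \mathrm{tr}\left[ f^{-1}(\lambda)\, \mathrm{Re}(I(\lambda)) \right] \right) d\lambda,$$

where $I(\lambda) = (2\pi n)^{-1} \left( \sum_{t=1}^{n} w_t e^{it\lambda} \right) \left( \sum_{t=1}^{n} w_t' e^{-it\lambda} \right)$ is the periodogram matrix of $w_t$ at frequency $\lambda$. Note that $\beta$ enters the likelihood function through the relation $I_{pp}(\lambda) = I_{yy}(\lambda) - \mathrm{Re}(\beta' I_{xy}(\lambda) + I_{yx}(\lambda)\beta - \beta' I_{xx}(\lambda)\beta)$, where the subscripts $pp$, $yy$, and $xx$ denote the periodograms of $e_t$ (or equivalently, $w_{pt}$), $y_t$, and $x_t$, and the subscript $xy$ denotes the cross-periodogram between $x_t$ and $y_t$.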

In the spirit of the semiparametric approach, we prefer the discrete local version of the likelihood

$$W(\theta, G) = \frac{1}{m} \sum_{j=1}^{m} \left( \log |f(\lambda_j)| + \mathrm{tr}\left[ f^{-1}(\lambda_j)\, \mathrm{Re}(I(\lambda_j)) \right] \right) \tag{6}$$

evaluated at the Fourier frequencies $\lambda_j = 2\pi j / n$, $j = 1, \ldots, m$. We let the bandwidth parameter $m = m(n)$ tend to infinity to gather information, but at a slower rate than $n$ to remain in a neighborhood of $\lambda = 0$. Note that the zero frequency has been left out of the summation in (6) to render the estimation invariant to mean terms. An integral version of (6) could also have been considered, but it would not share this property and it would be computationally more burdensome.
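The mean-invariance property noted above can be checked numerically. The following is a minimal sketch for a single series (the scalar analogue of the periodogram matrix), using only the Python standard library; the function name `periodogram` and the toy data are hypothetical and chosen for illustration.

```python
import cmath
import math


def periodogram(x, m):
    """Periodogram of x at the Fourier frequencies 2*pi*j/n, j = 1, ..., m."""
    n = len(x)
    values = []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / n
        # Discrete Fourier transform of x at frequency lambda_j
        dft = sum(x_t * cmath.exp(1j * lam * t) for t, x_t in enumerate(x, start=1))
        values.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return values


# Toy series (made-up numbers, for illustration only)
x = [0.3, -1.2, 0.8, 0.1, -0.5, 1.4, -0.9, 0.2, 0.6, -0.7, 1.1, -0.4]
shifted = [x_t + 5.0 for x_t in x]  # same series with a different mean

m = 3
I_x = periodogram(x, m)
I_shifted = periodogram(shifted, m)
```

Because a constant has zero discrete Fourier transform at every nonzero Fourier frequency, `I_x` and `I_shifted` agree up to rounding error, which is why omitting $j = 0$ removes the dependence on the mean.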

The local Whittle estimator of $(\theta, G)$ is defined as

$$(\hat\theta, \hat G) = \arg\min_{\theta, G} W(\theta, G)$$

over a compact subset of $\Delta^p \times \mathbb{R}^{p^2+p-1}$. We concentrate $G$ out of the likelihood by setting $\hat G(\theta) = m^{-1} \sum_{j=1}^{m} \Lambda_j\, \mathrm{Re}(I(\lambda_j))\, \Lambda_j$, and write the concentrated likelihood as

$$L(\theta) = \log |\hat G(\theta)| - \frac{2 \left( \sum_{a=1}^{p} d_a \right)}{m} \sum_{j=1}^{m} \log \lambda_j \tag{7}$$

apart from constants. The local Whittle estimator of the parameter of interest, $\theta$, can then be defined in terms of the concentrated likelihood as

$$\hat\theta = \arg\min_{\theta \in \Theta} L(\theta), \tag{8}$$

where the minimization is carried out over $\Theta$, a compact subset of $\Delta^p \times \mathbb{R}^{p-1}$.

We propose the following simple two step estimator (TSE) for the integration orders and the cointegrating vector,

$$\hat\theta^{(2)} = \hat\theta^{(1)} - \left( \left. \frac{\partial^2 L(\theta)}{\partial\theta\, \partial\theta'} \right|_{\hat\theta^{(1)}} \right)^{-1} \left( \left. \frac{\partial L(\theta)}{\partial\theta} \right|_{\hat\theta^{(1)}} \right), \tag{9}$$

where $\hat\theta^{(1)}$ is a consistent initial estimator, e.g. the univariate local Whittle QMLE of Robinson (1995a) and the narrow band FDLS estimator of Robinson (1994a) and Christensen & Nielsen (2004). We could iterate (9) until convergence for higher order gains, but that does not change the first order asymptotics. It is well known that the TSE has the same asymptotic distribution as the QMLE, but we prefer the TSE for its simplicity.
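The Newton-type update in (9) can be sketched in the simplest scalar setting. The example below is an illustration under stated assumptions, not the multivariate estimator of the text: it treats a single integration order $d$, uses the univariate objective $R(d) = \log(m^{-1}\sum_j \lambda_j^{2d} I_j) - 2d\, m^{-1}\sum_j \log\lambda_j$ with analytic derivatives, and replaces real data by an idealized periodogram $I_j = \lambda_j^{-2 d_0}$. All names and numbers are hypothetical.

```python
import math


def score_and_hessian(d, lam, I):
    """First and second derivatives of the univariate local Whittle objective R(d)."""
    log_lam = [math.log(l) for l in lam]
    w = [l ** (2.0 * d) * i for l, i in zip(lam, I)]  # weights lam_j^{2d} I_j
    sw = sum(w)
    m1 = sum(wi * ll for wi, ll in zip(w, log_lam)) / sw       # weighted mean of log lam
    m2 = sum(wi * ll * ll for wi, ll in zip(w, log_lam)) / sw  # weighted mean of (log lam)^2
    mean_log = sum(log_lam) / len(lam)
    score = 2.0 * (m1 - mean_log)
    hessian = 4.0 * (m2 - m1 * m1)  # four times a weighted variance, hence positive
    return score, hessian


def two_step(d_init, lam, I):
    """One Newton step from the initial estimate, the scalar analogue of (9)."""
    score, hessian = score_and_hessian(d_init, lam, I)
    return d_init - score / hessian


# Idealized setting (assumed for illustration): exact power-law periodogram
n, m, d0 = 512, 22, 0.3
lam = [2.0 * math.pi * j / n for j in range(1, m + 1)]
I = [l ** (-2.0 * d0) for l in lam]

d1 = 0.35                  # hypothetical initial estimate
d2 = two_step(d1, lam, I)  # two step estimate
```

In this idealized setting the score vanishes at $d_0$, and the single step moves the initial value `d1` closer to `d0`, which is the sense in which one update already captures the first-order behavior.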

To prove our main results we assume, with obvious implications for $y_t$, the following conditions on $w_t = (x_t', e_t)'$, the bandwidth, and the initial estimates.

Assumption 1 The spectral density matrix of $w_t$ given in (5) with typical element $f_{ab}(\lambda)$, the cross spectral density between $w_{at}$ and $w_{bt}$, satisfies

$$\left| f_{ab}(\lambda) - g_{ab} \lambda^{-d_a - d_b} \right| = O\left( \lambda^{\alpha - d_a - d_b} \right) \quad \text{as } \lambda \to 0^+, \quad a, b = 1, \ldots, p,$$

for some $\alpha \in (0, 2]$. The matrix $G$ satisfies $g_{ap} = g_{pa} = 0$ for $a = 1, \ldots, p-1$, and the leading $(p-1) \times (p-1)$ submatrix of $G$, denoted $\bar G$, is positive definite.

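Assumption 2 $w_t$ is a linear process, $w_t = \mu + \sum_{j=0}^{\infty} A_j \varepsilon_{t-j}$, with square summable coefficient matrices, $\sum_{j=0}^{\infty} \|A_j\|^2 < \infty$. The innovations satisfy, almost surely, $E(\varepsilon_t \mid \mathcal{F}_{t-1}) = 0$, $E(\varepsilon_t \varepsilon_t' \mid \mathcal{F}_{t-1}) = I_p$, and the matrices $\mu_3 = E(\varepsilon_t \otimes \varepsilon_t \varepsilon_t' \mid \mathcal{F}_{t-1})$ and $\mu_4 = E(\varepsilon_t \varepsilon_t' \otimes \varepsilon_t \varepsilon_t' \mid \mathcal{F}_{t-1})$ are nonstochastic, finite, and do not depend on $t$, where $\mathcal{F}_t = \sigma(\varepsilon_s, s \le t)$.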

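Assumption 3 As $\lambda \to 0^+$,

$$\frac{dA_a(\lambda)}{d\lambda} = O\left( \lambda^{-1} \|A_a(\lambda)\| \right), \quad a = 1, \ldots, p,$$

where $A_a(\lambda)$ is the $a$'th row of $A(\lambda) = \sum_{j=0}^{\infty} A_j e^{ij\lambda}$.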

Assumption 4 The bandwidth parameter $m = m(n)$ satisfies

$$\frac{1}{m} + \frac{m^{1+2\alpha} (\log m)^2}{n^{2\alpha}} \to 0 \quad \text{as } n \to \infty.$$

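Assumption 5 The initial estimates $\hat\theta^{(1)}$ are consistent, and in particular satisfy

$$\hat d_a^{(1)} - d_a = O_p\left( m^{-1/2} \right) \quad \text{for } a = 1, \ldots, p, \tag{10}$$

$$\hat\beta_a^{(1)} - \beta_a = O_p\left( m^{-1/2} \lambda_m^{d_a - d_p} \right) \quad \text{for } a = 1, \ldots, p-1. \tag{11}$$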

Our assumptions are a multivariate generalization of those in Robinson (1994a, 1995a), see also Lobato (1997, 1999). Since our assumptions are semiparametric in nature they naturally differ from those employed by e.g. Robinson & Hidalgo (1997) in their parametric setup, and are at least in some respects weaker than standard parametric assumptions. In particular, we avoid the standard assumptions (from time series regression theory with stationary variables) of independence between $x_t$ and $u_t$ and complete specification of $f(\lambda)$.

The first part of Assumption 1 specializes (5) by imposing smoothness conditions on the spectral density matrix of $w_t$ commonly employed in the literature. They are satisfied with $\alpha = 2$ if, for instance, $w_t$ is a vector fractional ARIMA process. The positive definiteness condition on $\bar G$ is a no multicollinearity or no cointegration condition within the components of $x_t$, which is typical in single-equation cointegration models or regression models.

The condition that $g_{ap} = g_{pa} = 0$, for $a = 1, \ldots, p-1$, is new compared to previous research from the $I(1)-I(0)$ cointegration literature, but it relaxes the standard orthogonality condition from the time series regression literature with stationary variables which is a more relevant comparison given our stationary setting. The condition ensures that the coherence between the regressors and the error process is zero at the origin, and it can be thought of as a local-to-zero version of the usual orthogonality condition from least squares theory. It is needed, for instance, to show that the estimation of $d_p$ is unaffected by the fact that it is based in part on estimated residuals. The condition is not quite as strong as it may seem at first. In particular, it does not require the regressors $x_t$ and the errors $e_t$ to be uncorrelated at frequencies away from the origin, but allows the regressor and error terms to share the same short- and medium-term dynamics. Thus, it relaxes the independence (or uncorrelatedness) assumption typically employed in the time series regression literature with stationary variables, see e.g. Robinson & Hidalgo (1997), which is a more relevant comparison given our stationary framework.

While the local orthogonality condition is necessary to derive the asymptotic normal distribution theory below, we conjecture that our TSE (9) remains consistent even if some $g_{ap} \neq 0$. The conjecture is based on the rate result of Robinson & Marinucci (2003) which states that the FDLS estimator of $\beta$ is $\lambda_m^{d_e - d}$-consistent for general non-zero $G$, and on the fact that the multivariate TSE of the integration orders of non-cointegrated variables in Lobato (1999) does not require this condition. This conjecture is partially confirmed in simulations in section 4 below. From the simulations it also seems that the estimation of $d_e$ is unaffected by the violation of the local orthogonality condition and the resulting slower rate of convergence for the initial estimate of $\beta$.

Assumptions 2 and 3 follow Robinson (1995a) and Lobato (1999) in imposing a linear structure on $w_t$ with square summable coefficients and martingale difference innovations with finite fourth moments. The assumption of constant conditional variance for the innovations could presumably be relaxed by assuming boundedness of the eighth moment as in Robinson & Henry (1999). Assumption 2 is satisfied, for instance, if $\varepsilon_t$ is an i.i.d. process with finite fourth moments. Under Assumption 2 we can write the spectral density matrix of $w_t$ as

$$f(\lambda) = \frac{1}{2\pi} A(\lambda) A^*(\lambda), \tag{12}$$

where the asterisk is complex conjugation combined with transposition.

Assumption 4 restricts the expansion rate of the bandwidth parameter $m = m(n)$. The bandwidth is required to tend to infinity for consistency, but at a slower rate than $n$ to remain in a neighborhood of the origin, where we have some knowledge of the form of the spectral density. When $\alpha$ is high, (5) is a better approximation to (12) as $\lambda \to 0^+$, and hence (by the second term of Assumption 4) a higher expansion rate of the bandwidth can be chosen. The weakest constraint is implied by $\alpha = 2$, in which case the condition is $m = o(n^{4/5})$ apart from a logarithmic term.
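The rate $m = o(n^{4/5})$ can be read off Assumption 4 directly. The following worked case is a sketch for a power-rate bandwidth $m = n^{\kappa}$ with $\kappa > 0$; the exponent $\kappa$ is notation introduced here for the example.

```latex
% Worked case of Assumption 4 with \alpha = 2 and m = n^{\kappa}, \kappa > 0
% (\kappa is notation assumed for this example only).
\begin{align*}
\left.\frac{m^{1+2\alpha}(\log m)^2}{n^{2\alpha}}\right|_{\alpha=2}
  = \frac{m^{5}(\log m)^2}{n^{4}}
  = n^{5\kappa-4}\,(\kappa\log n)^2 \;\to\; 0
  \quad\Longleftrightarrow\quad \kappa < \tfrac{4}{5},
\end{align*}
% i.e. m = o(n^{4/5}) up to the logarithmic factor.
```

Under this reading, the exponents 0.5, 0.6, and 0.7 used in the Monte Carlo study of section 4 all satisfy the condition when $\alpha = 2$.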

Finally, Assumption 5 states the required rates of convergence of the initial estimates. The cointegration vector may initially be estimated by the narrow band FDLS estimator which satisfies (11) at least when $\max_a d_a + d_p < 1/2$, see Christensen & Nielsen (2004). For any $d_a$ in the stationary and invertible range, i.e. also when $\max_a d_a + d_p < 1/2$ is not satisfied, the narrow band frequency domain generalized least squares type estimate of Nielsen (2002) may be employed, which satisfies Assumption 5 for any such $d_a$, but other estimators would also satisfy the assumption. Note that Christensen & Nielsen (2004) and Nielsen (2002) employ assumptions like our Assumptions 1-4 (except the logarithmic term in Assumption 4) to derive their asymptotic distribution theory. In particular, their stationary setup also requires the local orthogonality condition in Assumption 1. Also note that Assumption 5 depends on the bandwidth $m$ to be used in the second stage of the TSE, so that a slow rate of convergence of the initial estimator would imply restrictions on $m$ and therefore limitations on the rate of convergence of the TSE.

For the initial estimates of the integration orders we suggest using the local Whittle QMLE or log-periodogram methods which obviously satisfy (10) for $a = 1, \ldots, p-1$, see Robinson (1995a, 1995b). When the time series is not observed but instead is a residual, which is the case for $a = p$, the results of Robinson do not apply directly. Hassler et al. (2000) and Velasco (2003) consider the estimation of $d$ for residuals when the observed variables are nonstationary. They show that, under complicated conditions on the bandwidth parameter, the use of the local Whittle QMLE or the log-periodogram estimator is indeed valid. In particular, their approaches assume both nonstationarity and the condition that $\min_{1 \le a \le p-1} d_a - d_p > 1/2$ and thus do not apply in our setting.

We next show that, under Assumptions 1-4 above and the condition (11) on the estimator of $\beta$, the local Whittle QMLE remains valid in our stationary model even when based on residuals. In particular, we do not need to introduce additional, complicated conditions on the expansion rate of the bandwidth parameter.

Thus, suppose $d_p$ is estimated by

$$\hat d_p = \arg\min_{d \in \Delta} R(d), \tag{13}$$

$$R(d) = \log \hat G(d) - 2d\, \frac{1}{m} \sum_{j=1}^{m} \log \lambda_j, \qquad \hat G(d) = \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2d}\, \hat I_{pp}(\lambda_j),$$

where

$$\hat I_{pp}(\lambda_j) = I_{pp}(\lambda_j) + (\beta - \hat\beta)'\, \mathrm{Re}(I_{xx}(\lambda_j))\, (\beta - \hat\beta) + 2 (\beta - \hat\beta)'\, \mathrm{Re}(I_{xp}(\lambda_j)) \tag{14}$$

is the periodogram of the residual series $\hat e_t = y_t - \hat\beta' x_t = e_t + (\beta - \hat\beta)' x_t$. The subscript $xp$ in (14) denotes the cross-periodogram between $x_t$ and $e_t$ (or equivalently $w_{pt}$). Our first theorem shows that, under our assumptions, the effect of using residuals in place of observed series is negligible.
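The decomposition (14) is an exact algebraic identity, which can be confirmed numerically. Below is a minimal sketch for a single regressor using only the standard library; the helper names, the toy series, and the values of `beta` and `beta_hat` are hypothetical.

```python
import cmath
import math


def dft(x, j):
    """Discrete Fourier transform of x at the Fourier frequency 2*pi*j/n."""
    n = len(x)
    lam = 2.0 * math.pi * j / n
    return sum(v * cmath.exp(1j * lam * t) for t, v in enumerate(x, start=1))


def cross_periodogram(a, b, j):
    """I_ab(lambda_j) = (2 pi n)^{-1} w_a(lambda_j) conj(w_b(lambda_j)) for real series."""
    n = len(a)
    return dft(a, j) * dft(b, j).conjugate() / (2.0 * math.pi * n)


# Made-up data: regressor x, error e, true and estimated slope
beta, beta_hat = 1.0, 0.9
x = [0.5, -0.3, 1.2, 0.7, -1.1, 0.4, 0.9, -0.6]
e = [0.1, -0.2, 0.05, 0.3, -0.15, 0.0, 0.2, -0.1]
y = [beta * xt + et for xt, et in zip(x, e)]
e_hat = [yt - beta_hat * xt for yt, xt in zip(y, x)]  # residuals

j = 2
diff = beta - beta_hat
lhs = cross_periodogram(e_hat, e_hat, j).real  # periodogram of the residuals
rhs = (cross_periodogram(e, e, j).real
       + diff ** 2 * cross_periodogram(x, x, j).real
       + 2.0 * diff * cross_periodogram(x, e, j).real)
```

The two sides `lhs` and `rhs` agree up to rounding error; the content of the theory is that the last two terms are asymptotically negligible, not that the identity holds.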

Theorem 1 Let Assumptions 1-4 be satisfied and suppose $\hat d_p$ is given by (13) based on the residuals $\hat e_t = y_t - \hat\beta' x_t$, where $\hat\beta$ satisfies (11) (in place of $\hat\beta^{(1)}$). Let the true value be denoted by $d_{0p}$ and suppose $d_{0p}$ belongs to the interior of $\Delta$. Then, as $n \to \infty$,

$$\sqrt{m}\, (\hat d_p - d_{0p}) \xrightarrow{D} N(0, 1/4).$$

Proof. See appendix A.

The theorem demonstrates that we may choose to use the local Whittle QMLE as the initial estimator of the integration order of the residuals when the estimator of the cointegration vector satisfies (11). Hence, we have found feasible initial estimates of both the integration orders and the cointegration vector that satisfy Assumption 5.
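A minimal sketch of the univariate local Whittle estimate and the standard error implied by the $N(0, 1/4)$ limit is given below. It is an illustration under assumptions: the minimizer is found by a golden-section search (a choice made here, not taken from the text), the upper bound 0.49 stands in for $\Delta_1$, and an idealized periodogram replaces residual data.

```python
import math


def local_whittle_objective(d, lam, I):
    """R(d) = log(m^{-1} sum lam_j^{2d} I_j) - 2 d m^{-1} sum log lam_j."""
    m = len(lam)
    g_hat = sum(l ** (2.0 * d) * i for l, i in zip(lam, I)) / m
    return math.log(g_hat) - 2.0 * d * sum(math.log(l) for l in lam) / m


def local_whittle_estimate(lam, I, upper=0.49, tol=1e-8):
    """Minimize R(d) over [0, upper] by golden-section search (R is convex in d)."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, upper
    while hi - lo > tol:
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if local_whittle_objective(left, lam, I) < local_whittle_objective(right, lam, I):
            hi = right  # minimizer lies in [lo, right]
        else:
            lo = left   # minimizer lies in [left, hi]
    return 0.5 * (lo + hi)


# Idealized periodogram with memory parameter 0.2 (assumed for illustration)
n, m, d_true = 417, 20, 0.2
lam = [2.0 * math.pi * j / n for j in range(1, m + 1)]
I = [l ** (-2.0 * d_true) for l in lam]

d_hat = local_whittle_estimate(lam, I)
std_error = 1.0 / (2.0 * math.sqrt(m))  # from sqrt(m) (d_hat - d) -> N(0, 1/4)
```

With real residuals one would pass the periodogram of $\hat e_t$ as `I`; the standard error depends only on the bandwidth `m`.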

More generally, Theorem 1 in fact shows that the three step procedure employed by Marinucci & Robinson (2001) and Christensen & Nielsen (2004) is valid. That is, inference on $d_p$ may, in their setup, be conducted based on our distributional result in Theorem 1 and is equivalent to disregarding the fact that $\hat d_p$ is based on residuals.

3 Main Result

We are now ready to state our main result regarding the TSE.

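Theorem 2 Let $\theta_0$ denote the true value of the parameter vector $\theta$, and suppose $\theta_0$ belongs to the interior of the parameter space, $\Theta$. Under $0 \le d_p < d_a < 1/2$, for $a = 1, \ldots, p-1$, (4), and Assumptions 1-5,

$$\sqrt{m}\, \mathrm{diag}\left( I_p,\ \lambda_m^{d_p} \bar\Lambda_m^{-1} \right) \left( \hat\theta^{(2)} - \theta_0 \right) \xrightarrow{D} N\left( 0, \Omega^{-1} \right), \tag{15}$$

with

$$\Omega = \begin{pmatrix} E & 0 \\ 0 & F \end{pmatrix}, \tag{16}$$

$$E = 2 \left( I_p + G \odot G^{-1} \right), \tag{17}$$

$$F_{ab} = \frac{2 g_{ab}}{g_{pp} (1 - d_a - d_b + 2 d_p)}, \quad a, b = 1, \ldots, p-1, \tag{18}$$

where $\odot$ denotes the Hadamard product and $\bar\Lambda_m$ is the leading $(p-1) \times (p-1)$ submatrix of $\Lambda_m = \mathrm{diag}(\lambda_m^{d_1}, \ldots, \lambda_m^{d_p})$.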

Proof. The asymptotic distribution of the TSE is the same as that of the QMLE, which is given by (15) if we can show the following. The score is such that

$$\sqrt{m}\, \mathrm{diag}\left( I_p,\ \lambda_m^{-d_p} \bar\Lambda_m \right) \frac{\partial L(\theta_0)}{\partial\theta} \xrightarrow{D} N(0, \Omega), \tag{19}$$

and the Hessian satisfies

$$\mathrm{diag}\left( I_p,\ \lambda_m^{-d_p} \bar\Lambda_m \right) \frac{\partial^2 L(\theta)}{\partial\theta\, \partial\theta'}\, \mathrm{diag}\left( I_p,\ \lambda_m^{-d_p} \bar\Lambda_m \right) \xrightarrow{p} \Omega \tag{20}$$

for all $\theta$ such that $\|\theta - \theta_0\| \le \|\hat\theta^{(1)} - \theta_0\|$. Notice that $\Omega$ is positive definite by Assumption 1 and the fact that the Hadamard product of two positive definite matrices is positive definite. We prove (19) in appendix B, where parts of the proof follow Lobato (1999) in applying the martingale difference array approximation technique by Robinson (1995a). (20) is proven in appendix C.

Some comments on our result are in order. Velasco (2001) reaches a result very similar to our Theorem 2 in a nonstationary setup, using tapered periodograms to account for the nonstationarity, following the approach of Lobato & Velasco (2000). However, Velasco’s (2001) results are limited to a bivariate model, and require tapering and an additional user chosen bandwidth parameter (say $l$) to trim out the first $l$ Fourier frequencies as in Robinson (1995b).

The asymptotic distribution in (15) is block diagonal, such that the estimates of the integration orders are asymptotically uncorrelated with the estimate of the cointegration vector. In particular, the asymptotic distribution of the estimators of the integration orders is unaffected by the fact that they are based in part on residuals. This is due to the local orthogonality condition in Assumption 1, which ensures that the effect of the estimation of $\beta$ on the estimation of the integration orders is negligible. A discussion of the efficiency gains of the multivariate estimator of the integration orders over the univariate local Whittle QMLEs in Robinson (1995a) can be found in Lobato (1999, p. 136).

Let us have a closer look at the asymptotic distribution of the estimator of the cointegration vector in the simple two variable case. Suppose we observe two time series $y_t$ and $x_t$ both integrated of order $d < 1/2$, and that the error term is known to be integrated of order $d_e < d$. Then the asymptotic (marginal) distribution (15) of $\hat\beta^{(2)}$ in Theorem 2 reduces to

$$\sqrt{m}\, \lambda_m^{d_e - d} \left( \hat\beta^{(2)} - \beta_0 \right) \xrightarrow{D} N\left( 0, \frac{g_e (1 - 2d + 2d_e)}{2 g_x} \right),$$

where $g_x$ and $g_e$ are the elements of $G$, which is a diagonal $2 \times 2$ matrix. Thus, the variance depends on the signal-to-noise ratio $g_x / g_e$.

We compare our estimator of the cointegration vector with the narrow band FDLS estimator given by

$$\hat\beta_{FDLS} = \left( \frac{1}{m} \sum_{j=1}^{m} \mathrm{Re}(I_{xx}(\lambda_j)) \right)^{-1} \frac{1}{m} \sum_{j=1}^{m} \mathrm{Re}(I_{xy}(\lambda_j)), \tag{21}$$

and asymptotically distributed according to

$$\sqrt{m}\, \lambda_m^{d_e - d} \left( \hat\beta_{FDLS} - \beta_0 \right) \xrightarrow{D} N\left( 0, \frac{g_e (1 - 2d)^2}{2 g_x (1 - 2d - 2d_e)} \right)$$

in the two variable case with $d + d_e < 1/2$, see Christensen & Nielsen (2004). Thus, for the comparison, we restrict ourselves to the case $d + d_e < 1/2$ since otherwise the narrow band FDLS estimator is non-normal. We note immediately that the convergence rates are the same, and in particular, they are very close to $\sqrt{n}$ for relevant parameter values. For instance, when $m = O(n^{0.5})$ and $d - d_e = 0.4$, which are values close to those in the empirical application below, we get that $\hat\beta^{(2)}$ and $\hat\beta_{FDLS}$ are $n^{0.45}$-consistent. The asymptotic relative efficiency of $\hat\beta^{(2)}$ with respect to $\hat\beta_{FDLS}$ is

$$\frac{Var(\hat\beta_{FDLS})}{Var(\hat\beta^{(2)})} = \frac{(1 - 2d)^2}{(1 - 2d)^2 - 4 d_e^2},$$

which equals unity if and only if $d_e = 0$, and exceeds unity otherwise. Thus, our two step estimator is more efficient and applies for a wider range of $(d, d_e)$ than the FDLS estimator.
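The two quantities discussed above, the relative efficiency and the $n^{0.45}$ rate, are simple to evaluate. The following sketch uses hypothetical function names; the rate formula assumes a power-rate bandwidth $m = n^{\kappa}$, so that $\lambda_m = 2\pi m / n$ is of order $n^{\kappa - 1}$.

```python
def relative_efficiency(d, d_e):
    """Var(FDLS) / Var(TSE) in the bivariate case, valid when d + d_e < 1/2."""
    if not (0.0 <= d_e < d < 0.5 and d + d_e < 0.5):
        raise ValueError("need 0 <= d_e < d < 1/2 and d + d_e < 1/2")
    return (1.0 - 2.0 * d) ** 2 / ((1.0 - 2.0 * d) ** 2 - 4.0 * d_e ** 2)


def rate_exponent(kappa, d, d_e):
    """Exponent r with sqrt(m) * lambda_m^(d_e - d) of order n^r when m = n^kappa."""
    return kappa / 2.0 + (1.0 - kappa) * (d - d_e)


are_no_memory = relative_efficiency(0.4, 0.0)    # errors without long memory
are_weak_coint = relative_efficiency(0.2, 0.1)   # the weaker configuration of section 4
rate = rate_exponent(0.5, 0.4, 0.0)              # the example in the text
```

For $(d, d_e) = (0.2, 0.1)$ the ratio is $0.36 / 0.32 = 1.125$, so the gain from the joint procedure is modest when $d_e$ is small relative to $1 - 2d$.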

The unknown parameters appearing in the asymptotic distribution (15) can be replaced by consistent estimates. In particular, the matrix of coherencies at the zero frequency, $G$, can be estimated consistently by $\hat G(\hat\theta^{(2)})$. Its asymptotic distribution could also be derived by application of the delta-method, see Robinson (1995a) and Lobato & Velasco (2000).

Based on Theorem 2 it is straightforward to construct Wald tests of hypotheses that involve both the integration orders and the cointegration vector. For instance, the linear restrictions $H_0 : R\theta = r$ can be tested by

$$W = \left( R\hat\theta - r \right)' \left( R\, \Omega^{-1} R' \right)^{-1} \left( R\hat\theta - r \right) \xrightarrow{D} \chi^2_q \tag{22}$$

under the null, where $q$ is the number of linearly independent restrictions. Some hypotheses of general interest in this framework are (i) the components of $x_t$ are integrated of the same order, $\theta_1 = \ldots = \theta_{p-1}$, (ii) the errors have no long memory, $\theta_p = 0$, (iii) $x_{kt}$ is not present in the cointegrating relation, $\theta_{p+k} = 0$, or combinations of these. In the empirical application below we apply a combination of (ii) and (iii) since the unbiasedness hypothesis implies that the errors have no long memory and that $x_t$ has a unit coefficient, i.e. $\theta_2 = 0$ and $\theta_3 = 1$ in a bivariate setup.
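As a hedged sketch of the joint test of (ii) and (iii) in the bivariate case, the code below spells out one way to apply (22). It assumes that the normalization in (15) is applied to each restriction, that $G$ is the diagonal matrix $\mathrm{diag}(g_x, g_e)$ so that $E = 4 I_2$ in (17), and that unknown parameters in (18) are replaced by estimates. The function name and all input values are hypothetical.

```python
import math


def wald_unbiasedness(d_hat, de_hat, beta_hat, g_x, g_e, m, n):
    """Wald statistic and p-value for H0: d_e = 0 and beta = 1 (bivariate case)."""
    lam_m = 2.0 * math.pi * m / n
    # Asymptotic variances of the normalized estimates, from (16)-(18) with p = 2:
    # 1/4 for d_e (since E = 4 I_2) and 1/F for beta.
    var_de = 0.25
    var_beta = g_e * (1.0 - 2.0 * d_hat + 2.0 * de_hat) / (2.0 * g_x)
    # Normalized deviations from the null values, following (15)
    z_de = math.sqrt(m) * (de_hat - 0.0)
    z_beta = math.sqrt(m) * lam_m ** (de_hat - d_hat) * (beta_hat - 1.0)
    w = z_de ** 2 / var_de + z_beta ** 2 / var_beta
    p_value = math.exp(-w / 2.0)  # survival function of chi-square with 2 df
    return w, p_value


# Made-up inputs, for illustration only
w_null, p_null = wald_unbiasedness(0.45, 0.0, 1.0, 1.0, 1.0, 20, 417)
w_alt, p_alt = wald_unbiasedness(0.45, 0.05, 0.8, 1.0, 1.0, 20, 417)
```

The block diagonality of $\Omega$ is what allows the statistic to be written as a sum of two squared, separately normalized terms.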

4 Finite Sample Performance

In this section we investigate the finite sample behavior of the TSE. We consider the following three generating mechanisms for $x_t$ and $e_t$,

Model A : $(1 - L)^d x_t = \varepsilon_{1t}$, $(1 - L)^{d_e} e_t = \varepsilon_{2t}$, $\rho = \mathrm{corr}(\varepsilon_{1t}, \varepsilon_{2t}) = 0$,

Model B : $(1 - L)^d x_t = u_{1t}$, $(1 - L)^{d_e} e_t = \varepsilon_{2t}$, $u_{1t} = 0.5 u_{1,t-1} + \varepsilon_{1t}$, $\rho = 0$,

Model C : $(1 - L)^d x_t = u_{1t}$, $(1 - L)^{d_e} e_t = \varepsilon_{2t}$, $u_{1t} = 0.5 u_{1,t-1} + \varepsilon_{1t}$, $\rho = 0.5$,

where $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t})'$ is independently and identically normally distributed with mean zero, unit variances, and contemporaneous correlation $\rho$. We then generate $y_t$ from (4) with $\beta = 1$.

Models A and B satisfy all the assumptions of the model, whereas Model C violates the assumption of block diagonality of the $G$ matrix. In particular, the models are increasing in complexity with Model A being very simple with no short-run dynamics. Model B adds short-run dynamics to the regressor and thus disturbs the signal due to the contamination of the low frequencies of $x_t$ from the higher frequencies which are dominated by the short-run dynamics. In Model C we add the further complication that $g_{ap} \neq 0$, which violates Assumption 1 of our model and hence our distribution theory no longer applies. As conjectured in section 2 above, the TSE is presumably still consistent but biased.

For each model, we use 10,000 replications for sample sizes $n = 200$ and $n = 500$. The bandwidth parameters chosen for the simulation study are $m = n^{0.5}$, $m = n^{0.6}$, and $m = n^{0.7}$, using the same bandwidth for the initial estimates. The reported results are robust to changes in the bandwidth parameters for the initial estimates.
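One way to generate a single draw from Model A is sketched below. The text does not say how the fractionally integrated series were simulated, so the truncated moving-average filter used here is an assumption, as are the function names and the seed; it also assumes that (4) has the form $y_t = \beta x_t + e_t$ in the bivariate case.

```python
import random


def frac_weights(d, n):
    """Coefficients psi_k of (1 - L)^{-d} = sum_k psi_k L^k, for k = 0, ..., n - 1."""
    psi = [1.0]
    for k in range(1, n):
        psi.append(psi[-1] * (k - 1 + d) / k)  # psi_k = psi_{k-1} (k - 1 + d) / k
    return psi


def frac_integrate(eps, d):
    """Apply the filter (1 - L)^{-d}, truncated at the start of the sample, to eps."""
    psi = frac_weights(d, len(eps))
    return [sum(psi[k] * eps[t - k] for k in range(t + 1)) for t in range(len(eps))]


def simulate_model_a(n, d, d_e, beta=1.0, seed=0):
    """One draw of (y, x, e) from Model A with independent N(0, 1) innovations."""
    rng = random.Random(seed)
    eps1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    eps2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = frac_integrate(eps1, d)
    e = frac_integrate(eps2, d_e)
    y = [beta * xt + et for xt, et in zip(x, e)]
    return y, x, e


y, x, e = simulate_model_a(n=200, d=0.4, d_e=0.0)
```

Truncating the filter at the first observation only approximates the stationary process; a burn-in period or an exact simulation method would be natural refinements.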

Tables 1 and 2 about here

Tables 1 and 2 present the Monte Carlo bias and root mean squared error (RMSE) of the TSE for Model A with $(d, d_e) = (0.4, 0)$ and $(d, d_e) = (0.2, 0.1)$, respectively. The former is close to what is expected in many practical situations concerning e.g. volatility series as in the empirical application below, and the latter is a weaker form of cointegration where $d$ and $d_e$ are closer and there is long memory in the error term.

Model A is simple and contains no short-run dynamics and consequently the approximation (5) is close to (12) even for frequencies away from the origin. Hence the bias in the estimates is very low. For almost all the specifications of the model the bias is negative, and it is uniformly lower than 0.06 in absolute value in Table 1 and 0.04 in Table 2. The RMSE is decreasing in the bandwidth for all three parameters suggesting that for Model A larger bandwidths are preferable. We also notice that the bias and RMSE of $\hat d_e$ are only slightly higher than those of $\hat d$. Thus, the fact that $\hat d_e$ is based on residuals only increases the RMSE slightly in Model A.

Tables 3 and 4 about here

Tables 3 and 4 present the simulation results for Model B with $(d, d_e) = (0.4, 0)$ and $(d, d_e) = (0.2, 0.1)$, respectively. In Model B, the short-run dynamics in $u_{1t}$ influence the results significantly. That is, for $x_t$, (5) is a poor approximation to (12) when moving only a short distance away from the origin, due to the contamination from higher frequencies (short-run dynamics), and thus we expect biased results when the bandwidth is chosen too large.

For $n = 200$ with $m = n^{0.6}$ or $m = n^{0.7}$, the estimates of $d$ suffer from severe bias of up to 0.22 for both configurations of integration orders. However, the bias decreases significantly when considering the large sample, $n = 500$, where the bias is 0.06 when $m = n^{0.6}$ and 0.15 when $m = n^{0.7}$ (for both configurations of integration orders). Furthermore, the RMSE of $\hat d$ reflects the bias resulting from large bandwidth parameters and is no longer decreasing in the bandwidth. Thus, in the presence of short-run dynamics a smaller bandwidth is preferable to avoid biased results. The estimates of $d_e$ and $\beta$ contain no bias even with the introduction of short-run dynamics in the regressor. Unreported simulations confirm that exactly the same pattern emerges if the short-run dynamics are instead present in the error term $e_t$, i.e. the estimates of $d_e$ would be biased but the estimates of $d$ and $\beta$ would be virtually unaffected.

Tables 5 and 6 about here

Finally, Tables 5 and 6 present the simulation results for Model C with $(d, d_e) = (0.4, 0)$ and $(d, d_e) = (0.2, 0.1)$, respectively. In Model C we add another complication relative to Model B. That is, the local orthogonality condition between the regressors and the errors is now violated since $\rho = 0.5$, which introduces correlation between the error term and the regressor at all frequencies, and in particular at frequency zero. Thus, the underlying assumptions of our model and the distribution theory in sections 2 and 3 above are no longer satisfied for Model C, and consequently we expect the estimates to be biased but conjecture as in section 2 that the TSE remains consistent even in this case.

In both Tables 5 and 6, the bias in the estimate of $d$ is unchanged relative to Model B and the estimate of $d_e$ remains roughly unbiased. The simulations thus suggest that the results of Theorem 1 are unaffected by the violation of the local orthogonality condition and consequent slower convergence rate of the initial $\beta$ estimate. However, the violation of the local orthogonality condition has introduced a bias in the estimation of $\beta$. For the case $(d, d_e) = (0.4, 0)$ in Table 5, the bias ranges from 0.09 to 0.11 when $n = 200$, but decreases to 0.07-0.10 when the larger sample $n = 500$ is considered. In both cases the bias is increasing in the bandwidth. The RMSE of $\hat\beta$ also reflects the bias and is in fact increasing in the bandwidth. For the case $(d, d_e) = (0.2, 0.1)$, where the cointegrating strength is much weaker, the bias is more pronounced but still increasing in the bandwidth and decreasing in the sample size. This is expected based on the rate result of Robinson & Marinucci (2003) that the FDLS estimator of $\beta$ is $\lambda_m^{d_e - d}$-consistent when $g_{ap}$ is allowed to be non-zero, and hence the rate of convergence is slower in the case $(d, d_e) = (0.2, 0.1)$ compared to the case $(d, d_e) = (0.4, 0)$. Thus, as in the case of short-run dynamics, the long-run coherence between the regressors and errors makes a smaller bandwidth preferable to avoid biases. Unreported simulations show qualitatively very similar results for different (or no) short-run dynamics in Model C.

5 The Implied-Realized Volatility Relation

We proceed to conduct an actual empirical stationary fractional cointegration analysis. We analyze the relation between the volatility implied by option prices, and the subsequently realized return volatility of the underlying asset, following Christensen & Prabhala (1998), Bandi & Perron (2004), and particularly Christensen & Nielsen (2004). The presence of long memory (or fractional integration) in the volatility of financial assets has received a great deal of attention in recent years, see inter alia Robinson (1991), Ding, Granger & Engle (1993), Baillie et al. (1996), Comte & Renault (1996), Andersen & Bollerslev (1997a, 1997b), Breidt et al. (1998), Harvey (1998), Andersen, Bollerslev, Diebold & Ebens (2001), Andersen, Bollerslev, Diebold & Labys (2001), and Deo & Hurvich (2003).

If option market participants are rational and markets are efficient, the price of a financial option should reflect all publicly available information including information about expected future return volatility of the underlying asset. Given an observation on the price of an option, the implied volatility $\sigma_{IV}$ may be determined by inverting the option pricing formula with respect to $\sigma_{IV}$, and if this is done every period $t$ a time series $\sigma_{IV,t}$ results. Each implied volatility $\sigma_{IV,t}$ may now be considered as the market’s forecast of the actually realized return volatility of the underlying asset. Here, realized volatility is simply the sample standard deviation $\sigma_{RV,t}$ of the realized return from $t$ to $t + 1$. In practice, we work with the log volatilities, since they are close to Gaussian, see Andersen, Bollerslev, Diebold & Ebens (2001).

Christensen & Prabhala (1998) considered the regression specification

$$y_t = \alpha + \beta x_t + e_t, \tag{23}$$

where $y_t = \ln \sigma_{RV,t}$ and $x_t = \ln \sigma_{IV,t}$ are the log volatilities, and $\alpha$ and $\beta$ are intercept and slope coefficients. The unbiasedness hypothesis for option markets implies a $\beta$-coefficient of unity. A monthly sampling frequency was employed for $x_t$ and $y_t$. The underlying asset was the S&P100 stock market index, and $y_t$ was calculated from daily returns, see Christensen & Prabhala (1998) for the details. Basic OLS regression in (23) produced a $\beta$-estimate that was greater than zero and less than unity (Christensen & Prabhala (1998) also presented results without the logarithmic transform and the difference was negligible).

Inferences from OLS may be erroneous if $x_t$ and $y_t$ are fractionally cointegrated, see Robinson (1994a) and Robinson & Marinucci (2003), which is exactly what would be expected under the unbiasedness hypothesis. Thus, if $x_t = E_t(y_t)$ with $E_t(\cdot)$ denoting conditional expectation as of time $t$, then $\beta$ is unity and $e_t$ is serially uncorrelated. For a detailed description of the implied-realized volatility relation and its implications, see Christensen & Prabhala (1998). If volatility is fractionally integrated, as empirical literature suggests (Andersen, Bollerslev, Diebold & Ebens (2001), Christensen & Nielsen (2004), and Bandi & Perron (2004) find fractional integration with $d$ around 0.40-0.45), whereas the forecasting error $e_t$ in (23) possesses only short memory, then $x_t$ and $y_t$ are fractionally cointegrated. This is in fact what the empirical results in Christensen & Nielsen (2004) and our empirical results below indicate. In particular, Christensen & Nielsen (2004) considered a fractional cointegration analysis of (23), using first the univariate local Whittle estimator of Robinson (1995a) to estimate the integration orders of the raw data, then the narrow band FDLS estimator to estimate $\beta$, and finally the local Whittle estimator to estimate the integration order of the errors. It was found that accounting for the possibility of stationary fractional cointegration greatly improves the results, and in most cases produces $\beta$-estimates that are insignificantly different from unity.

The data we use are the same as those investigated by Christensen & Nielsen (2004), and are weekly data covering the period January 1, 1988, to December 31, 1995, resulting in $n = 417$ observations. The final data series are based on high-frequency data from the Berkeley Options Data Base (BODB), see the BODB User’s Guide for a description. From the high-frequency options data, a 5-minute return series for the underlying S&P500 index is constructed for the period 9:00 AM to 3:00 PM each trading day. This results in a series of 147,022 observations. From this 5-minute return series we form the realized volatility $\sigma_{RV,t}$ over each one-week interval by taking the sample standard deviation of the 5-minute annualized returns in week $t$.

The implied volatilities are backed out from the Monday 10:00 AM quote, for the call of shortest maturity and closest to the money, using the standard option pricing formula corrected for dividends. This results in a weekly implied volatility series $\sigma_{IV,t}$ with different times to maturity since the options expire monthly. We convert this heterogeneous series to another weekly series $\tilde\sigma_{IV,t}$, that may be associated with the series $\sigma_{RV,t}$ of realized volatilities covering homogeneous nonoverlapping weekly intervals by the formula

\[
\tilde\sigma^2_{IV,t-i} = \frac{1}{d_i - d_{i-1}} \left( d_i \cdot \sigma^2_{IV,t-i} - d_{i-1} \cdot \sigma^2_{IV,t-i+1} \right), \tag{24}
\]

where $d_i$ is the number of days until expiration of $\sigma_{IV,t-i}$, starting with $\tilde\sigma_{IV,t} = \sigma_{IV,t}$ for $t$ corresponding to a one-week option and then applying the recursion (24). This is of course an approximation for implied volatilities, as opposed to realized volatilities where it is an identity. However, the approximation is a high-frequency measurement error, and consequently our semiparametric approach should be robust towards it. For the complete details of the construction of the data set and summary statistics, see Christensen & Nielsen (2004).
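The conversion in (24) can be illustrated with a short sketch. It assumes the reading of (24) in which the right-hand side uses the raw (heterogeneous) implied volatilities and the left-hand side is the converted weekly variance; the function name, the list layout (index 0 is the one-week option, index $i$ the quote $i$ weeks earlier) and the numbers are hypothetical.

```python
def homogeneous_weekly_variances(sigma_iv, days_to_expiry):
    """Apply the conversion (24) within one option cycle.

    Index 0 is the one-week option, which provides the starting value;
    index i is the quote i weeks earlier, so days_to_expiry must be
    strictly increasing. Returns converted variances in the same order.
    """
    if len(sigma_iv) != len(days_to_expiry):
        raise ValueError("inputs must have equal length")
    if not sigma_iv:
        return []
    converted = [sigma_iv[0] ** 2]  # starting value: one-week option
    for i in range(1, len(sigma_iv)):
        d_i, d_prev = days_to_expiry[i], days_to_expiry[i - 1]
        if d_i <= d_prev:
            raise ValueError("days_to_expiry must be strictly increasing")
        total_i = d_i * sigma_iv[i] ** 2             # variance over d_i days
        total_prev = d_prev * sigma_iv[i - 1] ** 2   # variance over d_{i-1} days
        converted.append((total_i - total_prev) / (d_i - d_prev))
    return converted


# Hypothetical quotes with 7, 14 and 21 days to expiration.
weekly = homogeneous_weekly_variances([0.20, 0.20, 0.20], [7, 14, 21])
```

When the raw implied volatility is flat across maturities, as in this toy input, every converted weekly variance equals the common value, which is a quick sanity check on the formula.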

In Table 7 we report the results of our stationary fractional cointegration analysis for bandwidths $m = n^{0.50} = 20$, $m = n^{0.55} = 27$, and $m = n^{0.60} = 37$, respectively. Following the suggestions from the Monte Carlo study in the previous section, the bandwidths are chosen to be quite small, and in particular $m$ is slightly lower than in Christensen & Nielsen (2004).
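The three bandwidths can be reproduced from $n = 417$ if the reported values are read as integer parts of $n^{a}$; that rounding rule is an assumption, since the text only states the resulting integers, and the helper name is hypothetical.

```python
import math


def bandwidth(n, exponent):
    """Integer bandwidth m = floor(n ** exponent)."""
    return math.floor(n ** exponent)


n = 417
bandwidths = [bandwidth(n, a) for a in (0.50, 0.55, 0.60)]
print(bandwidths)  # [20, 27, 37]
```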

The first column shows the initial estimates. For the integration orders $d$ and $d_e$ we choose Robinson's (1995a) univariate local Whittle estimates. To obtain an initial estimate of $\beta$ we use the frequency domain narrow band generalized least squares estimator of Nielsen (2002), since it is suspected that the condition $d + d_e < 1/2$ may not be satisfied in our application. Both sets of initial estimates use the same bandwidth as the TSE. The results are robust to changes in the bandwidth parameters for the initial estimates, see also Christensen & Nielsen (2004). The initial estimates are comparable to those found by Christensen & Nielsen (2004), and in particular the series seem to be stationary ($d < 1/2$) and the errors are close to $I(0)$. The initial estimate of $\beta$ ranges from 0.70 to 0.78, which is well above the 0.3-0.4 that is typical for OLS estimates of $\beta$, see Christensen & Prabhala (1998) and Christensen & Nielsen (2004), but still suggests that implied volatility is a biased forecast of realized volatility.

Table 7 about here

In the next two columns we report the TSE (9) and the standard error of each parameter, respectively. The standard errors are calculated using our new distribution theory as the square root of the diagonal elements of the covariance matrix in Theorem 2. For $d$, the estimates are the same as the initial estimates of approximately 0.45-0.48, which is in line with previous evidence, e.g. Andersen, Bollerslev, Diebold & Ebens (2001), Christensen & Nielsen (2004), and Bandi & Perron (2004). Turning to the estimation of the parameters of primary interest, $d_e$ and $\beta$, we find somewhat different results for bandwidth $m = 20$ compared to $m = 27$ or $m = 37$. The TSE point estimate of $d_e$ is smaller for the small bandwidth and increases with the bandwidth, ranging from 0.15 to 0.17, and is slightly higher than the initial estimates. Similarly, the TSE of $\beta$ is larger for the small bandwidth and decreases (rapidly) with the bandwidth, being estimated at 0.77 for the small bandwidth and 0.68-0.71 for the larger bandwidths. For the small bandwidth, $m = 20$, the estimates of $d_e$ and $\beta$ are insignificantly different from zero and unity, respectively. However, the $\beta$ estimates for $m = 27$ and $m = 37$, the two largest bandwidths considered, are significantly different from unity, and the $d_e$ estimate for the largest bandwidth is significantly different from zero.

The results so far are consistent with the notion that realized and implied volatility are well described as stationary but fractionally integrated series, and that they tend to move together in the sense that the errors in (23) have less memory. The interesting question is how closely they move together and whether the errors are in fact only weakly dependent. To answer this question, the fourth column shows the Wald test statistic (22) of the joint hypothesis that $d_e = 0$ and $\beta = 1$, which is asymptotically distributed as a $\chi^2$ random variable with 2 degrees of freedom (the 5% and 1% critical values are 5.99 and 9.21, respectively). The test rejects at the 1% level for the two larger bandwidth choices, $m = 27$ and $m = 37$, casting some doubt on the conclusions from the literature that implied and realized volatility can indeed be described by a stationary fractionally cointegrated relation with unit coefficient and only weakly dependent errors. However, the results for the smaller bandwidth, $m = 20$, do suggest that all long memory properties in volatility are common features for implied and realized volatility, pointing towards a unit coefficient and weakly dependent errors.
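The quoted critical values can be checked directly, because a $\chi^2$ variable with 2 degrees of freedom has upper-tail probability $\exp(-x/2)$. The sketch below also includes a generic two-parameter Wald quadratic form; it is only an illustration and may differ in normalization from the statistic (22), and all function names and inputs are hypothetical.

```python
import math


def chi2_2df_critical_value(alpha):
    """Upper-alpha critical value of a chi-square variable with 2 degrees of freedom."""
    return -2.0 * math.log(alpha)


def chi2_2df_p_value(statistic):
    """Upper-tail probability P(X > statistic) for X ~ chi-square(2)."""
    return math.exp(-statistic / 2.0)


def wald_statistic(estimate, hypothesis, covariance):
    """Quadratic form (est - hyp)' V^{-1} (est - hyp) for two parameters."""
    u = estimate[0] - hypothesis[0]
    v = estimate[1] - hypothesis[1]
    (a, b), (c, d) = covariance
    det = a * d - b * c
    # Inverse of the 2x2 matrix V written out explicitly.
    return (d * u * u - (b + c) * u * v + a * v * v) / det


print(round(chi2_2df_critical_value(0.05), 2))  # 5.99
print(round(chi2_2df_critical_value(0.01), 2))  # 9.21
```

A statistic above 9.21 therefore corresponds to a p-value below 0.01, which is the sense in which the test "rejects at the 1% level".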

The remaining columns in Table 7 present the estimates, standard errors, and Wald test statistic when (9) is iterated until convergence. These results are very similar to the results for the TSE, and consequently the Wald statistics offer the same conclusions as for the TSE.

Thus, similarly to Christensen & Nielsen (2004) and Bandi & Perron (2004), we find that the volatility series are well described as stationary fractionally integrated series and we cannot reject that implied and realized volatility indeed are stationary fractionally cointegrated. That is, the residuals are of lower order of fractional integration than the volatility series themselves, $d_e < d$. In fact, our results offer some support to the even stronger relation that $d_e = 0$. Under long-run unbiasedness, we would expect the series to follow each other closely resulting in a unit $\beta$-coefficient, which is also somewhat supported by our analysis. However, the evidence when applying our more efficient joint estimation procedure is not as clear-cut as in Christensen & Nielsen (2004) and Bandi & Perron (2004), and in particular the tests for long-run unbiasedness actually reject when larger bandwidths are employed.

6 Conclusion

We consider a local Whittle analysis of a stationary fractionally cointegrated model. In particular, we propose a two step estimator, which is equivalent to the local Whittle QMLE, to jointly estimate the integration orders of the regressors, the integration order of the errors, and the cointegration vector. The estimator is semiparametric in the sense that it employs local assumptions on the joint spectral density matrix of the regressors and the errors near the zero frequency. By using a degenerating part of the periodogram near the origin, the approach is invariant to short-run dynamics, which would have to be specified correctly in a parametric procedure.

The two step estimator is based on consistent initial estimators. We show that such estimators exist, and in particular, our Theorem 1 shows that the local Whittle QMLE of the integration order of the residuals is unaffected by the fact that it is based on residuals. More generally, our Theorem 1 in fact shows that the three step procedure employed by Marinucci & Robinson (2001) and Christensen & Nielsen (2004) is valid. That is, inference on $d_e$ may, in their setup, be conducted based on our distributional result in Theorem 1 and is equivalent to disregarding the fact that the estimator is based on residuals.

In our stationary fractionally integrated case, we show that the two step estimator is asymptotically normal with block diagonal covariance matrix for the entire stationary region of the integration orders. Thus, the estimates of the integration orders are asymptotically uncorrelated with the estimate of the cointegration vector. Furthermore, our estimator of the cointegration vector is asymptotically normal for a wider range of integration orders than the narrow band frequency domain least squares estimator of Robinson (1994a), analyzed by Marinucci & Robinson (2001), Robinson & Marinucci (2003), and Christensen & Nielsen (2004), and is superior with respect to asymptotic variance when the latter is normal.

To demonstrate the feasibility of our methodology in finite samples we have presented the results of a small Monte Carlo study. The results show that the performance of the estimator is very good with respect to bias and root mean squared error when no short-run dynamics are present. However, they also show that the bandwidth parameter should not be too high in the presence of short-run dynamics to avoid biased results.

We have applied our methodology to financial volatility series. The unbiasedness hypothesis of option markets implies a coefficient of unity in the implied-realized volatility relation, but the ordinary regression estimate is less than one-half. We show that implied and realized volatility are well described as being stationary fractionally cointegrated. When accounting for this, our estimates of this coefficient are about twice as large as before and for the smallest bandwidth even insignificantly different from unity. Furthermore, we are unable to reject the joint hypothesis of unit coefficient and weak dependence of the error process, when considering the specification with the smallest bandwidth parameter. The analysis demonstrates that useful long-run relations can be derived even among stationary series.

Appendix A: Proof of Theorem 1

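First we show that $(\log n)(\hat d_p - d_{0p}) \overset{p}{\to} 0$. Rewriting equations (A.1)-(A.4), (A.24), (A.25), and (A.30) from the proof of Theorem 3 of Robinson (1997) it suffices to show that

\begin{align}
m^{2(d_{0p}-\Delta_1)-1} \sum_{j=1}^{m} j^{2(\Delta_1-d_{0p})} |h_j| &\overset{p}{\to} 0 \quad \text{for } \Delta_1 < d_{0p}, \tag{25}\\
(\log n)^2 m^{2\delta-1} \sum_{j=1}^{m} j^{-2\delta} |h_j| &\overset{p}{\to} 0 \quad \text{for some } \delta > 0, \tag{26}\\
\frac{(\log n)^2}{m} \sum_{j=1}^{m} |h_j| &\overset{p}{\to} 0, \tag{27}\\
\frac{1}{m} \sum_{j=1}^{m} \left( (j/q)^{2(\Delta_1-d_{0p})} - 1 \right) h_j &\overset{p}{\to} 0, \tag{28}
\end{align}

where $q = \exp\left( m^{-1} \sum_{j=1}^{m} \log j \right)$ and

\[
h_j = \frac{\hat I_{pp}(\lambda_j) - I_{pp}(\lambda_j)}{G_{pp}\lambda_j^{-2d_{0p}}} \tag{29}
\]

is a normalized measure of the impact of using the periodogram of residuals instead of the periodogram of observed data. Our assumption that $d_{0p} \ge 0$ allows a slight simplification of the conditions (25)-(28) compared to their counterparts in Robinson (1997) which shortens this proof somewhat. It could easily be relaxed at the expense of a longer proof.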
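The quantity $q$ is the geometric mean of $1, \ldots, m$, and by Stirling's formula it behaves like $m/e$ for large $m$. A small numerical sketch of that fact (the function name is hypothetical and the values of $m$ are arbitrary):

```python
import math


def geometric_mean_of_first_integers(m):
    """q = exp(m^{-1} * sum_{j=1}^m log j), the geometric mean of 1, ..., m."""
    return math.exp(sum(math.log(j) for j in range(1, m + 1)) / m)


for m in (20, 200, 2000, 20000):
    q = geometric_mean_of_first_integers(m)
    # The ratio q / (m / e) exceeds one and shrinks towards one as m grows.
    print(m, q / (m / math.e))
```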

Note that, by Assumption 1, (11), (14), and Robinson (1995b, Theorem 2), the random variables $h_j$ satisfy

\[
|h_j| = O_p\left( m^{-1} + m^{-1/2}\lambda_m^{\alpha} \right). \tag{30}
\]

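Using (30) and the fact that

\[
\sup_{-1 \le \alpha \le C} \left| m^{-\alpha-1} (\log m)^{-1} \sum_{j=1}^{m} j^{\alpha} \right| = O(1) \quad \text{for } C \in (1,\infty), \tag{31}
\]

it is easy to show that (25)-(27) are

\[
O_p\left( (\log m) \left( m^{-1} + m^{-1/2}\lambda_m^{\alpha} \right) \right), \quad
O_p\left( (\log n)^2 (\log m) \left( m^{-1} + m^{-1/2}\lambda_m^{\alpha} \right) \right), \quad
O_p\left( (\log n)^2 \left( m^{-1} + m^{-1/2}\lambda_m^{\alpha} \right) \right),
\]

respectively. We will need (31) throughout all our proofs, and we shall use it automatically and without special reference in what follows. Using the fact that $q \sim m/e$ ($e = 2.71...$) as $n \to \infty$, the left-hand side of (28) is bounded by

\[
\frac{1}{m} \sum_{j=1}^{m} \left( \frac{j}{m} \right)^{2(\Delta_1-d_{0p})} |h_j| + \frac{1}{m} \sum_{j=1}^{m} |h_j| ,
\]

which is negligible by (25) and (27).

Thus, we have shown $(\log n)$-consistency of $\hat d_p$ and proceed to prove the asymptotic distribution result. Following Robinson (1995a) we need to show that

\begin{align}
\sup_{d \in \Delta \cap N_\delta} \left| \frac{\hat G_{0,e}(d) - G_{0,e}(d)}{G(d)} \right| &= o_p\left( \frac{1}{(\log m)^6} \right), \tag{32}\\
\left| \hat F_{k,e}(d_{0p}) - F_{k,e}(d_{0p}) \right| &\overset{p}{\to} 0, \quad k = 0, 1, 2, \tag{33}\\
\sqrt{m} \left| \frac{\partial \hat R_e(d_{0p})}{\partial d} - \frac{\partial R_e(d_{0p})}{\partial d} \right| &\overset{p}{\to} 0, \tag{34}
\end{align}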

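where

\begin{align*}
G_{k,a}(d) &= \frac{1}{m} \sum_{j=1}^{m} (\log \lambda_j)^k \lambda_j^{2d} I_{aa}(\lambda_j), \\
G(d) &= G_{pp} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2(d-d_{0p})}, \\
F_{k,a}(d) &= \frac{1}{m} \sum_{j=1}^{m} (\log j)^k \lambda_j^{2d} I_{aa}(\lambda_j), \\
\frac{\partial R_a(d)}{\partial d} &= 2 \frac{G_{1,a}(d)}{G_{0,a}(d)} - \frac{2}{m} \sum_{j=1}^{m} \log \lambda_j ,
\end{align*}

hatted quantities are computed from the periodogram of the residuals, $\hat I_{pp}$, and $N_\delta$ is defined in Robinson (1995a, p. 1634).

By (4.7) in Robinson (1995a), (32) follows if

\[
(\log m)^6 \sum_{j=1}^{m} \left( \frac{j}{m} \right)^{1-2\delta} j^{-2} \left| \sum_{k=1}^{j} h_k \right| \overset{p}{\to} 0
\]

which holds by (26) and (27) above, when changing the order of the summations. The left-hand side of (33) is bounded by

\begin{align*}
\left| \frac{G_{pp}}{m} \sum_{j=1}^{m} (\log j)^k h_j \right| &\le G_{pp} \frac{(\log m)^k}{m} \sum_{j=1}^{m} |h_j| \\
&= O_p\left( (\log m)^k \left( m^{-1} + m^{-1/2}\lambda_m^{\alpha} \right) \right)
\end{align*}

by the same arguments as applied to (27) above. The left-hand side of (34) is

\begin{align*}
& 2\sqrt{m} \left| \frac{\hat G_{1,e}(d_{0p})}{\hat G_{0,e}(d_{0p})} - \frac{G_{1,e}(d_{0p})}{G_{0,e}(d_{0p})} \right| \\
&\quad \le 2\sqrt{m} \frac{\left| \hat G_{1,e}(d_{0p}) - G_{1,e}(d_{0p}) \right|}{\left| G_{0,e}(d_{0p}) \right|} + 2\sqrt{m} \frac{\left| \hat G_{1,e}(d_{0p}) \right| \left| \hat G_{0,e}(d_{0p}) - G_{0,e}(d_{0p}) \right|}{\left| \hat G_{0,e}(d_{0p})\, G_{0,e}(d_{0p}) \right|} ,
\end{align*}

where $G_{0,e}(d_{0p}) = G_{pp} + o_p(1)$ by Robinson (1995a) and thus also $\hat G_{0,e}(d_{0p}) = G_{pp} + o_p(1)$ in view of (33) with $k = 0$. Furthermore,

\[
\left| \hat G_{1,e}(d_{0p}) \right| \le \left| \hat G_{1,e}(d_{0p}) - G_{1,e}(d_{0p}) \right| + \left| G_{1,e}(d_{0p}) \right| ,
\]

where $G_{1,e}(d_{0p}) = G_{0,e}(d_{0p}) \frac{\partial R_e(d_{0p})}{\partial d} = (G_{pp} + o_p(1))\, O_p(m^{-1/2}) = o_p(1)$ by Robinson (1995a). Consequently, to complete the proof, we need to show that

\[
\sqrt{m} \left| \hat G_{1,e}(d_{0p}) - G_{1,e}(d_{0p}) \right| \overset{p}{\to} 0.
\]

The left-hand side is

\begin{align*}
\frac{G_{pp}}{\sqrt{m}} \left| \sum_{j=1}^{m} (\log \lambda_j) h_j \right| &\le G_{pp} \frac{(\log n)}{\sqrt{m}} \sum_{j=1}^{m} |h_j| \\
&= O_p\left( (\log n)(\log m) \left( m^{-1/2} + \lambda_m^{\alpha} \right) \right)
\end{align*}

as in (27).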

Appendix B: Limit of the Score

In this appendix and the following one, we ignore the subscript zero on the true values of the parameters $d$, $\beta$, and $G$ to lighten the notation.

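Applying the Cramér-Wold device we need to show that

\[
\eta' \sqrt{m}\, \operatorname{diag}\left( I_p, \lambda_m^{-d_p} \Lambda_m \right) \frac{\partial L(\theta_0)}{\partial \theta} \overset{D}{\to} N\left( 0, \eta' \Omega \eta \right) \tag{35}
\]

for any non-null vector $\eta$. The derivatives with respect to $d_a$ and $\beta_a$ are

\begin{align}
\frac{\partial L(\theta_0)}{\partial d_a} &= \frac{2}{m} \sum_{j=1}^{m} \nu_j \operatorname{Re}\left( g^a \Lambda_j I_{wa}(\lambda_j) \lambda_j^{d_a} - 1 \right), \quad a = 1, ..., p, \tag{36}\\
\frac{\partial L(\theta_0)}{\partial \beta_a} &= -\frac{2}{m} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( g^p \Lambda_j I_{wa}(\lambda_j) \right), \quad a = 1, ..., p-1, \tag{37}
\end{align}

where $\nu_j = \log j - m^{-1} \sum_{j=1}^{m} \log j$, $g^a$ is the $a$'th row of $G^{-1}$, and $I_{wa}(\lambda)$ is the cross-periodogram between $w_t$ and $w_{at}$. In both (36) and (37) we replaced $\hat G(\theta_0)$ by $G$ since

\[
\left\| \hat G(\theta_0) - G \right\| = O_p\left( m^{-1/2} \right), \tag{38}
\]

see Lobato (1999).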

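The part of the left-hand side of (35) corresponding to (37) is

\begin{align}
& -\sum_{a=1}^{p-1} \eta_{a+p} \lambda_m^{d_a-d_p} \frac{2}{\sqrt{m}} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( g^p \Lambda_j I_{wa}(\lambda_j) \right) \notag\\
&\quad = -\sum_{a=1}^{p-1} \eta_{a+p} \lambda_m^{d_a-d_p} \frac{2}{\sqrt{m}} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( g^p \Lambda_j \left( I_{wa}(\lambda_j) - A(\lambda_j) J(\lambda_j) A_a^*(\lambda_j) \right) \right) \tag{39}\\
&\qquad -\sum_{a=1}^{p-1} \eta_{a+p} \lambda_m^{d_a-d_p} \frac{2}{\sqrt{m}} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( g^p \Lambda_j A(\lambda_j) J(\lambda_j) A_a^*(\lambda_j) \right), \tag{40}
\end{align}

where $J(\lambda)$ is the periodogram of $\varepsilon_t$ and $A_a(\lambda)$ is the $a$'th row of $A(\lambda)$. By (C.2) of Lobato (1999), which is implied by our assumptions,

\begin{align*}
(39) &= O_p\left( \sum_{a=1}^{p-1} \frac{1}{\sqrt{m}} \left( m^{1/3} (\log m)^{2/3} + \log m + \frac{\sqrt{m}}{n^{1/4}} \right) \right) \\
&= O_p\left( \frac{(\log m)^{2/3}}{m^{1/6}} + \frac{\log m}{\sqrt{m}} + \frac{1}{n^{1/4}} \right) \overset{p}{\to} 0.
\end{align*}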

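Write (40) as

\begin{align}
& -\sum_{a=1}^{p-1} \eta_{a+p} \lambda_m^{d_a-d_p} \frac{2}{\sqrt{m}} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( g^p \Lambda_j A(\lambda_j) \frac{1}{2\pi n} \left| \sum_{t=1}^{n} \varepsilon_t e^{it\lambda_j} \right|^2 A_a^*(\lambda_j) \right) \notag\\
&\quad = -\sum_{a=1}^{p-1} \eta_{a+p} \lambda_m^{d_a-d_p} \frac{1}{\pi\sqrt{m}} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( g^p \Lambda_j A(\lambda_j) A_a^*(\lambda_j) \right) \tag{41}\\
&\qquad -\sum_{a=1}^{p-1} \eta_{a+p} \lambda_m^{d_a-d_p} \frac{1}{\pi\sqrt{m}} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( g^p \Lambda_j A(\lambda_j) \left( \frac{1}{n} \sum_{t=1}^{n} \varepsilon_t \varepsilon_t' - I_p \right) A_a^*(\lambda_j) \right) \tag{42}\\
&\qquad -\sum_{a=1}^{p-1} \eta_{a+p} \lambda_m^{d_a-d_p} \frac{2}{\sqrt{m}} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( g^p \Lambda_j A(\lambda_j) \frac{1}{2\pi n} \sum_{t=1}^{n} \sum_{s \ne t} \varepsilon_t \varepsilon_s' e^{i(t-s)\lambda_j} A_a^*(\lambda_j) \right). \tag{43}
\end{align}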

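By definition of $f(\lambda)$, see (12), and using Assumption 1

\begin{align*}
(41) &= \max_{1 \le a \le p-1} O\left( \frac{1}{\sqrt{m}} \lambda_m^{d_a-d_p} \sum_{j=1}^{m} \lambda_j^{2d_p} f_{pa}(\lambda_j) \right) \\
&= \max_{1 \le a \le p-1} O\left( \frac{1}{\sqrt{m}} \lambda_m^{d_a-d_p} \sum_{j=1}^{m} \lambda_j^{\alpha-d_a+d_p} \right),
\end{align*}

which is $O\left( n^{-2\alpha} m^{1+2\alpha} (\log m)^2 \right) \to 0$ by Assumption 4. For equation (42), note that $\varepsilon_t \varepsilon_t' - I_p$ is a martingale difference sequence with respect to $\mathcal{F}_t$ implying that $n^{-1} \sum_{t=1}^{n} \varepsilon_t \varepsilon_t' - I_p = O_p(n^{-1/2})$. Thus,

\begin{align*}
(42) &= \max_{1 \le a \le p-1} O_p\left( \frac{1}{\sqrt{m}} \lambda_m^{d_a-d_p} \sum_{j=1}^{m} \lambda_j^{2d_p} \frac{1}{\sqrt{n}} f_{pa}(\lambda_j) \right) \\
&= \max_{1 \le a \le p-1} O_p\left( \frac{1}{\sqrt{nm}} \lambda_m^{d_a-d_p} \sum_{j=1}^{m} \lambda_j^{\alpha-d_a+d_p} \right) \\
&= O_p\left( \lambda_m^{1/2+\alpha} (\log m) \right).
\end{align*}

We are left with (43), which we rewrite as

\[
\sum_{t=1}^{n} \varepsilon_t' \sum_{s=1}^{t-1} \sum_{a=1}^{p-1} \frac{\eta_{a+p}}{\pi n \sqrt{m}} \lambda_m^{d_a-d_p} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{p\prime} e^{i(t-s)\lambda_j} A_a(\lambda_j) \right) \varepsilon_s .
\]

The corresponding term for (36), derived by Lobato (1999, p. 141), is given by

\[
\sum_{t=1}^{n} \varepsilon_t' \sum_{s=1}^{t-1} \sum_{a=1}^{p-1} \frac{\eta_a}{\pi n \sqrt{m}} \sum_{j=1}^{m} \lambda_j^{d_a} \nu_j \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{a\prime} e^{i(t-s)\lambda_j} A_a(\lambda_j) \right) \varepsilon_s .
\]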

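Thus, (35) has the same asymptotic distribution as $\sum_{t=1}^{n} \varepsilon_t' \sum_{s=1}^{t-1} c_{t-s,n} \varepsilon_s$, where we define

\begin{align*}
c_{tn} &= \frac{1}{\pi n \sqrt{m}} \sum_{j=1}^{m} (\theta_{j1} + \theta_{j2}) \cos(t\lambda_j), \\
\theta_{j1} &= \nu_j \sum_{a=1}^{p} \lambda_j^{d_a} \eta_a \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{a\prime} A_a(\lambda_j) + A_a'(\lambda_j) g^a \Lambda_j A(\lambda_j) \right), \\
\theta_{j2} &= -\lambda_j^{d_p} \sum_{a=1}^{p-1} \eta_{a+p} \lambda_m^{d_a-d_p} \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{p\prime} A_a(\lambda_j) + A_a'(\lambda_j) g^p \Lambda_j A(\lambda_j) \right).
\end{align*}

Notice that, by construction, $\|\theta_{j1}\| = O(1)$ and $\|\theta_{j2}\| = O\left( \max_{1 \le a \le p-1} (m/j)^{d_a-d_p} \right)$.

Since $z_{tn} = \varepsilon_t' \sum_{s=1}^{t-1} c_{t-s,n} \varepsilon_s$ is a martingale difference array with respect to $\mathcal{F}_t = \sigma(\varepsilon_s, s \le t)$, we can apply the CLT for martingale difference arrays if (see Brown (1971) or Hall & Heyde (1980, chp. 3.2))

\begin{align}
\sum_{t=1}^{n} E\left( z_{tn}^2 \,\middle|\, \mathcal{F}_{t-1} \right) - \sum_{a=1}^{2p-1} \sum_{b=1}^{2p-1} \eta_a \eta_b \Omega_{ab} &\overset{p}{\to} 0, \tag{44}\\
\sum_{t=1}^{n} E\left( z_{tn}^2 1(|z_{tn}| > \delta) \right) &\to 0 \quad \text{for all } \delta > 0. \tag{45}
\end{align}

A sufficient condition for (45) is

\[
\sum_{t=1}^{n} E\left( z_{tn}^4 \right) \to 0. \tag{46}
\]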
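The sufficiency of (46) for (45) rests on an elementary inequality, sketched here for completeness: on the event $|z_{tn}| > \delta$ we have $1 < z_{tn}^2/\delta^2$, so that

```latex
\[
\sum_{t=1}^{n} E\left( z_{tn}^{2}\, 1(|z_{tn}| > \delta) \right)
\le \frac{1}{\delta^{2}} \sum_{t=1}^{n} E\left( z_{tn}^{4} \right)
\quad \text{for all } \delta > 0 .
\]
```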

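First, to show (44),

\begin{align}
\sum_{t=1}^{n} E\left( z_{tn}^2 \,\middle|\, \mathcal{F}_{t-1} \right) &= \sum_{t=1}^{n} E\left( \sum_{s=1}^{t-1} \sum_{r=1}^{t-1} \varepsilon_s' c_{t-s,n}' \varepsilon_t \varepsilon_t' c_{t-r,n} \varepsilon_r \,\middle|\, \mathcal{F}_{t-1} \right) \notag\\
&= \sum_{t=1}^{n} \sum_{s=1}^{t-1} \varepsilon_s' c_{t-s,n}' c_{t-s,n} \varepsilon_s \tag{47}\\
&\quad + \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{r \ne s} \varepsilon_s' c_{t-s,n}' c_{t-r,n} \varepsilon_r . \tag{48}
\end{align}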

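The term (48) has mean zero and variance

\[
O\left( n \left( \sum_{s=1}^{n} \|c_{sn}\|^2 \right)^2 + \sum_{t=3}^{n} \sum_{u=2}^{t-1} \left( \sum_{s=1}^{u-1} \|c_{u-s,n}\|^2 \sum_{s=1}^{u-1} \|c_{t-s,n}\|^2 \right) \right) \tag{49}
\]

by (D.10) and (D.11) of Lobato (1999). It is immediate by definition of $c_{sn}$ that $\|c_{sn}\| = O\left( (n\sqrt{m})^{-1} \sum_{j=1}^{m} \|\theta_{j1} + \theta_{j2}\| \right) = O\left( n^{-1} m^{1/2} (\log m) \right)$ using (31). Define the functions $H_a(\lambda) = \lambda^{d_p} \operatorname{Re}\left( A'(\lambda) \Lambda g^{p\prime} A_a(\lambda) + A_a'(\lambda) g^p \Lambda A(\lambda) \right)$ such that $\theta_{j2} = -\sum_{a=1}^{p-1} \eta_{a+p} \lambda_m^{d_a-d_p} H_a(\lambda_j)$, $H_a(\lambda) = O\left( \lambda^{d_p-d_a} \right)$ as $\lambda \to 0^+$, and $H_a(\lambda)$ is differentiable with $\partial H_a(\lambda)/\partial\lambda = O\left( \lambda^{d_p-d_a-1} \right)$ as $\lambda \to 0^+$ by Assumption 3. Now we can derive an alternative bound as

\begin{align*}
\|c_{sn}\| &= O\left( \max_{1 \le a \le p-1} \frac{\lambda_m^{d_a-d_p}}{n\sqrt{m}} \left| \sum_{j=1}^{m} H_a(\lambda_j) \cos(s\lambda_j) \right| \right) \\
&= O\left( \max_{1 \le a \le p-1} \frac{\lambda_m^{d_a-d_p}}{n\sqrt{m}} \left| \sum_{j=1}^{m-1} \left( H_a(\lambda_j) - H_a(\lambda_{j+1}) \right) \sum_{k=1}^{j} \cos(s\lambda_k) \right| \right) \\
&\quad + O\left( \max_{1 \le a \le p-1} \frac{\lambda_m^{d_a-d_p}}{n\sqrt{m}} \left| H_a(\lambda_m) \sum_{j=1}^{m} \cos(s\lambda_j) \right| \right)
\end{align*}

from summation by parts. Using the Mean Value Theorem it follows that $H_a(\lambda_j) - H_a(\lambda_{j+1}) = (\lambda_{j+1} - \lambda_j) \frac{\partial H_a(\lambda_j)}{\partial\lambda} = \frac{2\pi}{n} \frac{\partial H_a(\lambda_j)}{\partial\lambda}$, and the bound is

\begin{align*}
\|c_{sn}\| &= O\left( \max_{1 \le a \le p-1} \frac{\lambda_m^{d_a-d_p}}{n\sqrt{m}} \left| \sum_{j=1}^{m-1} n^{-1} \lambda_j^{d_p-d_a-1} \sum_{k=1}^{j} \cos(s\lambda_k) \right| \right) \\
&\quad + O\left( \max_{1 \le a \le p-1} \frac{\lambda_m^{d_a-d_p}}{n\sqrt{m}} \left| \lambda_m^{d_p-d_a} \sum_{j=1}^{m} \cos(s\lambda_j) \right| \right) \\
&= O\left( \frac{\log m}{s\sqrt{m}} \right)
\end{align*}

using also $\left| \sum_{j=1}^{l} \cos(s\lambda_j) \right| = O(n/s)$, see Zygmund (2002, p. 2). This bound is better when $s > n/m$.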
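The two ingredients of the alternative bound, summation by parts and the $O(n/s)$ bound on partial sums of cosines, can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, $H_a$ is replaced by a made-up power function, and the explicit constant $n/(2s)$ (valid for $1 \le s \le n/2$) is a standard Dirichlet-kernel bound rather than something stated in the text.

```python
import math


def summation_by_parts_gap(a, b):
    """Absolute difference between sum(a_j b_j) and its summation-by-parts form."""
    m = len(a)
    direct = sum(x * y for x, y in zip(a, b))
    partial, running = [], 0.0
    for y in b:
        running += y
        partial.append(running)  # partial[j] = b_1 + ... + b_{j+1}
    abel = sum((a[j] - a[j + 1]) * partial[j] for j in range(m - 1))
    abel += a[m - 1] * partial[m - 1]
    return abs(direct - abel)


def max_cosine_partial_sum(n, s, m):
    """max over l <= m of |sum_{j=1}^l cos(s * lambda_j)|, lambda_j = 2*pi*j/n."""
    running, largest = 0.0, 0.0
    for j in range(1, m + 1):
        running += math.cos(s * 2.0 * math.pi * j / n)
        largest = max(largest, abs(running))
    return largest


n, m = 417, 37
lam = [2.0 * math.pi * j / n for j in range(1, m + 1)]
H = [x ** -0.3 for x in lam]  # hypothetical stand-in for H_a with d_p - d_a = -0.3
for s in (1, 5, 50, 200):
    cosines = [math.cos(s * x) for x in lam]
    print(s, summation_by_parts_gap(H, cosines),
          max_cosine_partial_sum(n, s, m), n / (2 * s))
```

For each `s` the first printed gap is numerically zero, and the largest partial sum stays below `n / (2 * s)`, which is the sense in which the cosine sums are $O(n/s)$.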

Thus, we find that

\[
\sum_{s=1}^{n} \|c_{sn}\|^2 = O\left( \sum_{s=1}^{[n/m]} \frac{m (\log m)^2}{n^2} + \sum_{s=[n/m]+1}^{n} \frac{(\log m)^2}{s^2 m} \right) = O\left( \frac{(\log m)^2}{n} \right),
\]

implying that the first term of (49) is $O(n^{-1} (\log m)^4)$. The second term of (49) is

\[
O\left( n \left( \sum_{s=1}^{n} \|c_{sn}\|^2 \right) \left( \sum_{s=1}^{[n/2]} s \|c_{sn}\|^2 \right) \right),
\]

following the analysis in Robinson (1995a, pp. 1646-1647). Applying the latter bound we find that

\[
\sum_{s=1}^{[n/2]} s \|c_{sn}\|^2 = O\left( \sum_{s=1}^{[n/2]} \frac{1}{sm} \right) = O\left( \frac{\log n}{m} \right)
\]

and $(49) = O\left( n^{-1} (\log m)^4 + m^{-1} (\log n) (\log m)^2 \right)$.

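We still need to show that the mean of (47) is asymptotically equal to $\sum_{a=1}^{2p-1} \sum_{b=1}^{2p-1} \eta_a \eta_b \Omega_{ab}$. Thus,

\[
E(47) = \sum_{t=1}^{n} \sum_{s=1}^{t-1} E \operatorname{tr}\left( c_{t-s,n}' c_{t-s,n} \varepsilon_s \varepsilon_s' \right) = \sum_{t=1}^{n} \sum_{s=1}^{t-1} \operatorname{tr}\left( c_{t-s,n}' c_{t-s,n} \right)
\]

by Assumption 2. Rewrite this expression as

\begin{align}
& \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \sum_{j'=1}^{m} \frac{1}{\pi^2 n^2 m} \operatorname{tr}\left( (\theta_{j1} + \theta_{j2})' (\theta_{j'1} + \theta_{j'2}) \right) \cos((t-s)\lambda_j) \cos((t-s)\lambda_{j'}) \notag\\
&\quad = \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \frac{1}{\pi^2 n^2 m} \operatorname{tr}\left( \theta_{j1}' \theta_{j1} \right) \cos^2((t-s)\lambda_j) \tag{50}\\
&\qquad + \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \frac{1}{\pi^2 n^2 m} \operatorname{tr}\left( \theta_{j2}' \theta_{j2} \right) \cos^2((t-s)\lambda_j) \tag{51}\\
&\qquad + \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \frac{1}{\pi^2 n^2 m} 2 \operatorname{tr}\left( \theta_{j1}' \theta_{j2} \right) \cos^2((t-s)\lambda_j) \tag{52}\\
&\qquad + \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \sum_{j' \ne j} \frac{1}{\pi^2 n^2 m} \operatorname{tr}\left( \theta_{j1}' \theta_{j'1} \right) \cos((t-s)\lambda_j) \cos((t-s)\lambda_{j'}) \tag{53}\\
&\qquad + \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \sum_{j' \ne j} \frac{1}{\pi^2 n^2 m} \operatorname{tr}\left( \theta_{j2}' \theta_{j'2} \right) \cos((t-s)\lambda_j) \cos((t-s)\lambda_{j'}) \tag{54}\\
&\qquad + \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \sum_{j' \ne j} \frac{1}{\pi^2 n^2 m} 2 \operatorname{tr}\left( \theta_{j1}' \theta_{j'2} \right) \cos((t-s)\lambda_j) \cos((t-s)\lambda_{j'}). \tag{55}
\end{align}

It was shown by Lobato (1999) that (50) is asymptotically equal to $\sum_{a=1}^{p} \sum_{b=1}^{p} \eta_a \eta_b E_{ab}$ and that (53) is asymptotically negligible. We consider the remaining terms in turn. First,

\[
(54) = \max_{1 \le a \le p-1} O\left( \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \sum_{j' \ne j} \frac{1}{n^2 m} \left( \frac{m}{j} \right)^{d_a-d_p} \left( \frac{m}{j'} \right)^{d_a-d_p} \cos((t-s)\lambda_j) \cos((t-s)\lambda_{j'}) \right)
\]

and, using that $\sum_{t=1}^{n} \sum_{s=1}^{t-1} \cos((t-s)\lambda_j) \cos((t-s)\lambda_{j'}) = -n/2$ for $\lambda_j \ne \lambda_{j'}$, we can bound (54) by $\max_{1 \le a \le p-1} O\left( (nm)^{-1} m^{2d_a-2d_p} \sum_{j=1}^{m} j^{d_p-d_a} \sum_{j' \ne j} j'^{\,d_p-d_a} \right) = O\left( (\log m)^2 m/n \right)$. Similarly, (55) is also $O\left( (\log m)^2 m/n \right)$.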

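For the covariance term in (52) we notice that

\begin{align*}
\operatorname{tr}\left( \frac{1}{4\pi^2} \theta_{j1}' \theta_{j2} \right) = {} & -\nu_j \sum_{a=1}^{p} \sum_{b=1}^{p-1} \eta_a \eta_{b+p} \lambda_m^{d_b-d_p} \lambda_j^{d_a+d_p} \\
& \times \bigg[ \operatorname{tr}\left( \frac{1}{4\pi^2} \operatorname{Re}\left( A_a'(\lambda_j) g^a \Lambda_j A(\lambda_j) \right) \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{p\prime} A_b(\lambda_j) \right) \right) \\
& \quad + \operatorname{tr}\left( \frac{1}{4\pi^2} \operatorname{Re}\left( A_a'(\lambda_j) g^a \Lambda_j A(\lambda_j) \right) \operatorname{Re}\left( A_b'(\lambda_j) g^p \Lambda_j A(\lambda_j) \right) \right) \\
& \quad + \operatorname{tr}\left( \frac{1}{4\pi^2} \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{a\prime} A_a(\lambda_j) \right) \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{p\prime} A_b(\lambda_j) \right) \right) \\
& \quad + \operatorname{tr}\left( \frac{1}{4\pi^2} \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{a\prime} A_a(\lambda_j) \right) \operatorname{Re}\left( A_b'(\lambda_j) g^p \Lambda_j A(\lambda_j) \right) \right) \bigg]
\end{align*}

and, using the definition of $f(\lambda)$ and Assumption 1, this is easily shown to be $o(1)$. E.g., the first term in the square brackets is $\operatorname{tr}\left( f_{ab}(\lambda_j) g^a \Lambda_j f(\lambda_j) \Lambda_j g^{p\prime} \right) = O\left( \lambda_j^{-d_a-d_b} g^a \lambda_j^{\alpha} \right) = O\left( \lambda_j^{\alpha-d_a-d_b} \right)$ using that $g_{ap} = 0$ for $a = 1, ..., p-1$. This implies that (52) is $o(1)$ since $\sum_{t=1}^{n-1} \sum_{s=1}^{n-t} \cos^2(s\lambda_j) = (n-1)^2/4$.

Now let us examine $\operatorname{tr}(\theta_{j2}' \theta_{j2})$ appearing in (51),

\begin{align*}
\operatorname{tr}\left( \frac{\theta_{j2}' \theta_{j2}}{4\pi^2} \right)
= {} & \operatorname{tr}\left( \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \lambda_m^{d_a+d_b-2d_p} \frac{\lambda_j^{2d_p}}{4\pi^2} \operatorname{Re}\left( A_a'(\lambda_j) g^p \Lambda_j A(\lambda_j) \right) \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{p\prime} A_b(\lambda_j) \right) \right) \\
& + \operatorname{tr}\left( \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \lambda_m^{d_a+d_b-2d_p} \frac{\lambda_j^{2d_p}}{4\pi^2} \operatorname{Re}\left( A_a'(\lambda_j) g^p \Lambda_j A(\lambda_j) \right) \operatorname{Re}\left( A_b'(\lambda_j) g^p \Lambda_j A(\lambda_j) \right) \right) \\
& + \operatorname{tr}\left( \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \lambda_m^{d_a+d_b-2d_p} \frac{\lambda_j^{2d_p}}{4\pi^2} \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{p\prime} A_a(\lambda_j) \right) \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{p\prime} A_b(\lambda_j) \right) \right) \\
& + \operatorname{tr}\left( \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \lambda_m^{d_a+d_b-2d_p} \frac{\lambda_j^{2d_p}}{4\pi^2} \operatorname{Re}\left( A'(\lambda_j) \Lambda_j g^{p\prime} A_a(\lambda_j) \right) \operatorname{Re}\left( A_b'(\lambda_j) g^p \Lambda_j A(\lambda_j) \right) \right).
\end{align*}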

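By definition of the spectral density $f(\lambda)$ in (12), the first term is asymptotically equal to $\sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \lambda_j^{2d_p} f_{ba}(\lambda_j) g^p \Lambda_j f(\lambda_j) \Lambda_j g^{p\prime} = \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \lambda_j^{2d_p-d_a-d_b} g_{ba} g_{pp}^{-1}$, the fourth to $\sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \lambda_j^{2d_p-d_a-d_b} g_{ab} g_{pp}^{-1}$, and the second and third terms to zero using Assumption 1. Hence, (51) is asymptotically equal to

\begin{align}
& \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \frac{4 \lambda_m^{d_a+d_b-2d_p} \lambda_j^{2d_p-d_a-d_b}}{n^2 m} \frac{(g_{ab} + g_{ba})}{g_{pp}} \cos^2((t-s)\lambda_j) \notag\\
&\quad = \sum_{j=1}^{m} \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \frac{4 \lambda_m^{d_a+d_b-2d_p} \lambda_j^{2d_p-d_a-d_b}}{n^2 m} \frac{(g_{ab} + g_{ba})}{g_{pp}} \left( \sum_{t=1}^{n} \sum_{s=1}^{t-1} \cos^2((t-s)\lambda_j) \right) \notag\\
&\quad = \sum_{j=1}^{m} \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \frac{4 \lambda_m^{d_a+d_b-2d_p} \lambda_j^{2d_p-d_a-d_b}}{n^2 m} \frac{(g_{ab} + g_{ba})}{g_{pp}} \frac{(n-1)^2}{4} \tag{56}
\end{align}

since $\sum_{t=1}^{n-1} \sum_{s=1}^{n-t} \cos^2(s\lambda_j) = (n-1)^2/4$.

We can approximate the Riemann sum appearing in (56) by an integral, viz.

\[
\frac{2\pi}{n} \sum_{j=1}^{m} \lambda_j^{2d_p-d_a-d_b} \sim \int_0^{\lambda_m} \lambda^{2d_p-d_a-d_b} \, d\lambda = \frac{\lambda_m^{1-d_a-d_b+2d_p}}{1 - d_a - d_b + 2d_p},
\]

where the symbol "$\sim$" means that the ratio of the left- and right-hand sides tends to one. Using this approximation we get that

\begin{align*}
(56) &\sim \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \frac{2 \lambda_m^{d_a+d_b-2d_p}}{\pi n m} \frac{g_{ab} + g_{ba}}{g_{pp}} \frac{n^2}{4} \frac{\lambda_m^{1-d_a-d_b+2d_p}}{1 - d_a - d_b + 2d_p} \\
&= \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_{a+p} \eta_{b+p} \frac{2 g_{ab}}{g_{pp} (1 - d_a - d_b + 2d_p)},
\end{align*}

and we have shown (44).

Thus, we need to show (46),

\begin{align*}
\sum_{t=1}^{n} E\left( z_{tn}^4 \right) &= \sum_{t=1}^{n} E\left( \sum_{s=1}^{t-1} \varepsilon_s' c_{t-s,n}' \varepsilon_t \varepsilon_t' \sum_{r=1}^{t-1} c_{t-r,n} \varepsilon_r \sum_{p=1}^{t-1} \varepsilon_p' c_{t-p,n}' \varepsilon_t \varepsilon_t' \sum_{q=1}^{t-1} c_{t-q,n} \varepsilon_q \right) \\
&\le C \left( \sum_{t=1}^{n} \operatorname{tr}\left( \sum_{s=1}^{t-1} c_{t-s,n}' c_{t-s,n} c_{t-s,n}' c_{t-s,n} \right) + \sum_{t=1}^{n} \operatorname{tr}\left( \sum_{s=1}^{t-1} c_{t-s,n}' \sum_{r=1}^{t-1} c_{t-r,n} c_{t-r,n}' c_{t-s,n} \right) \right)
\end{align*}

for some constant $C > 0$ by Assumption 2. This expression can be bounded by $O\left( n \left( \sum_{t=1}^{n} \|c_{tn}\|^2 \right)^2 \right) = O\left( n^{-1} (\log m)^4 \right)$, and we are done.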
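The integral approximation of the Riemann sum used for (56) can be examined numerically. The sketch below is illustrative: the function name is hypothetical, and the exponent $c = 2d_p - d_a - d_b = -0.6$ is chosen only loosely in line with the magnitudes of the estimates reported earlier (any $c > -1$ would do).

```python
import math


def riemann_ratio(n, m, c):
    """Ratio of (2*pi/n) * sum_{j<=m} lambda_j**c to lambda_m**(1+c) / (1+c)."""
    step = 2.0 * math.pi / n
    riemann = step * sum((step * j) ** c for j in range(1, m + 1))
    lam_m = step * m
    integral = lam_m ** (1.0 + c) / (1.0 + c)
    return riemann / integral


# c stands for 2*d_p - d_a - d_b; the value below is illustrative only.
for m in (20, 200, 2000):
    print(m, riemann_ratio(10 * m, m, -0.6))
```

The ratio depends on $m$ only and moves towards one as $m$ grows, though slowly when $c$ is close to $-1$, which is one way to see why small bandwidths can matter in finite samples.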

Appendix C: Limit of the Hessian

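We prove that

\begin{align}
\frac{\partial^2 L(\bar\theta)}{\partial d_a \partial d_b} &\overset{p}{\to} E_{ab}, \tag{57}\\
\lambda_m^{d_b-d_p} \frac{\partial^2 L(\bar\theta)}{\partial d_a \partial \beta_b} &\overset{p}{\to} 0, \tag{58}\\
\lambda_m^{d_a+d_b-2d_p} \frac{\partial^2 L(\bar\theta)}{\partial \beta_a \partial \beta_b} &\overset{p}{\to} F_{ab}, \tag{59}
\end{align}

for all $\bar\theta$ such that $\|\bar\theta - \theta_0\| \le \|\theta^{(1)} - \theta_0\|$.

First, we will need to strengthen the approximation (38) to $G$ by showing that

\[
\left\| \hat G(\bar\theta) - \hat G(\theta_0) \right\| = O_p\left( \frac{\log n}{\sqrt{m}} \right). \tag{60}
\]

The proof for the leading $(p-1) \times (p-1)$ block is given in Lobato (1999, pp. 145-148). Consider now, for $a = 1, ..., p-1$,

\[
\hat g_{ap}(\bar\theta) - \hat g_{ap}(\theta_0) = \frac{1}{m} \sum_{j=1}^{m} \left( \lambda_j^{\bar d_a+\bar d_p} \bar I_{ap}(\lambda_j) - \lambda_j^{d_a+d_p} I_{ap}(\lambda_j) \right),
\]

where $\bar I_{ap}(\lambda)$ is the cross-periodogram between $x_{at}$ and $\bar e_t = y_t - \bar\beta' x_t$. Noting that $\bar I_{ap}(\lambda) - I_{ap}(\lambda) = (\beta - \bar\beta)' I_{xa}(\lambda)$, we can rewrite this as

\[
\hat g_{ap}(\bar\theta) - \hat g_{ap}(\theta_0) = \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{\bar d_a+\bar d_p} (\beta - \bar\beta)' I_{xa}(\lambda_j) + \frac{1}{m} \sum_{j=1}^{m} \left( \lambda_j^{\bar d_a+\bar d_p} - \lambda_j^{d_a+d_p} \right) I_{ap}(\lambda_j). \tag{61}
\]

The first term on the right-hand side can be bounded as

\begin{align*}
\frac{1}{m} \sum_{j=1}^{m} \lambda_j^{\bar d_a+\bar d_p} \sum_{b=1}^{p-1} (\beta_b - \bar\beta_b) I_{ba}(\lambda_j)
&\le \frac{1}{m} \left( \max_{1 \le j \le m} \lambda_j^{\bar d_a+\bar d_p-d_a-d_p} \right) \sum_{j=1}^{m} \lambda_j^{d_a+d_p} \sum_{b=1}^{p-1} (\beta_b - \bar\beta_b) I_{ba}(\lambda_j) \\
&= O_p\left( m^{-1/2} \right),
\end{align*}

using $\max_{1 \le j \le m} \lambda_j^{\bar d_a+\bar d_p-d_a-d_p} = 1 + o_p(1)$ which follows since the exponent is $O_p\left( m^{-1/2} \right)$ by Assumption 5. The second term on the right-hand side of (61) is

\[
O_p\left( \frac{1}{m} \left( \max_{1 \le j \le m} \lambda_j^{\bar d_a+\bar d_p-d_a-d_p} - 1 \right) \sum_{j=1}^{m} \lambda_j^{d_a+d_p} I_{ap}(\lambda_j) \right) = o_p\left( \frac{\log n}{\sqrt{m}} \right)
\]

by Assumption 5 and the above analysis. The $(p,p)$'th element of (60) follows in the exact same way by application of the Cauchy-Schwarz Inequality.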

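In view of (60), (57) follows from Lobato (1999). For (58) and (59) it can be shown that

\begin{align}
\lambda_m^{d_b-d_p} \left( \frac{\partial^2 L(\bar\theta)}{\partial d_a \partial \beta_b} - \frac{\partial^2 L(\theta_0)}{\partial d_a \partial \beta_b} \right) &\overset{p}{\to} 0, \tag{62}\\
\lambda_m^{d_a+d_b-2d_p} \left( \frac{\partial^2 L(\bar\theta)}{\partial \beta_a \partial \beta_b} - \frac{\partial^2 L(\theta_0)}{\partial \beta_a \partial \beta_b} \right) &\overset{p}{\to} 0, \tag{63}
\end{align}

by proceeding component by component with the same methods that we applied to show (60). We show next that

\[
\lambda_m^{d_a+d_b-2d_p} \frac{\partial^2 L(\theta_0)}{\partial \beta_a \partial \beta_b} \overset{p}{\to} F_{ab}. \tag{64}
\]

The left-hand side of (64) is asymptotically equal to

\begin{align}
& \lambda_m^{d_a+d_b-2d_p} \frac{2}{m} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( g^p \Lambda_j \begin{pmatrix} 0_{p-1} \\ I_{ab}(\lambda_j) \end{pmatrix} \right) \tag{65}\\
& - \lambda_m^{d_a+d_b-2d_p} \frac{2}{m} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( g^p \left( \frac{1}{m} \sum_{j'=1}^{m} \Lambda_{j'} \begin{pmatrix} O_{p-1} & I_{xa}(\lambda_{j'}) \\ I_{ax}(\lambda_{j'}) & 2 I_{pa}(\lambda_{j'}) \end{pmatrix} \Lambda_{j'} \right) G^{-1} \Lambda_j I_{wa}(\lambda_j) \right) \tag{66}
\end{align}

by (38), with $0_{p-1}$ and $O_{p-1}$ denoting a $(p-1)$-vector of zeros and a $(p-1) \times (p-1)$ matrix of zeros, respectively. The first of these terms is

\begin{align*}
(65) = {} & \lambda_m^{d_a+d_b-2d_p} \frac{2}{m} \sum_{j=1}^{m} \lambda_j^{2d_p} \operatorname{Re}\left( g^{pp} \left( I_{ab}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_b^*(\lambda_j) \right) \right) \\
& + \lambda_m^{d_a+d_b-2d_p} \frac{2}{m} \sum_{j=1}^{m} \lambda_j^{2d_p} \operatorname{Re}\left( g^{pp} A_a(\lambda_j) J(\lambda_j) A_b^*(\lambda_j) \right),
\end{align*}

where the first term is $o_p(1)$ by the same arguments as for (39) in appendix B. The second term is

\begin{align}
& \lambda_m^{d_a+d_b-2d_p} \frac{2}{m} \sum_{j=1}^{m} \lambda_j^{2d_p} \operatorname{Re}\left( g^{pp} A_a(\lambda_j) \frac{1}{2\pi n} \left| \sum_{t=1}^{n} \varepsilon_t e^{it\lambda_j} \right|^2 A_b^*(\lambda_j) \right) \notag\\
&\quad = \lambda_m^{d_a+d_b-2d_p} \frac{2}{m} \sum_{j=1}^{m} \lambda_j^{2d_p} \frac{1}{2\pi} \operatorname{Re}\left( g^{pp} A_a(\lambda_j) A_b^*(\lambda_j) \right) + o_p(1) \tag{67}
\end{align}

by Assumption 2 and the same arguments as for (41)-(42) in appendix B. By definition of $f(\lambda)$ we get that

\[
(67) = \lambda_m^{d_a+d_b-2d_p} \frac{2}{m} \sum_{j=1}^{m} \lambda_j^{2d_p} g^{pp} \operatorname{Re}\left( f_{ab}(\lambda_j) \right) + o_p(1).
\]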

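Applying the integral approximation from appendix B and recalling that $g^{pp} = g_{pp}^{-1}$ and $\lambda_m = 2\pi m/n$, this expression is asymptotically equal to

\[
\lambda_m^{d_a+d_b-2d_p} \frac{n}{m\pi} g^{pp} \int_0^{\lambda_m} \lambda^{2d_p} \left( g_{ab} \lambda^{-d_a-d_b} \right) d\lambda = \frac{2 g_{ab}}{g_{pp} (1 - d_a - d_b + 2d_p)}.
\]

Next, rewrite (66) as

\begin{align*}
& -\lambda_m^{d_a+d_b-2d_p} \frac{2}{m} \sum_{j=1}^{m} \lambda_j^{d_p} \operatorname{Re}\left( \frac{1}{m} \sum_{j'=1}^{m} \left( g^{pp} \lambda_{j'}^{d_p} I_{ax}(\lambda_{j'}),\; g^p \Lambda_{j'} I_{wa}(\lambda_{j'}) + g^{pp} \lambda_{j'}^{d_p} I_{pa}(\lambda_{j'}) \right) \Lambda_{j'} G^{-1} \Lambda_j I_{wa}(\lambda_j) \right) \\
&\quad = O_p\left( \lambda_m^{d_a+d_b-2d_p} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{d_p} \left( \frac{1}{m} \sum_{j'=1}^{m} \lambda_{j'}^{d_p+d_a} \right) \lambda_j^{-d_a} \right)
\end{align*}

applying the same type of analysis as in appendix B. The last expression is seen to be $O(\lambda_m^{d_a+d_b}) = o(1)$.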

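To complete the proof, we need to show that

\[
\lambda_m^{d_b-d_p} \frac{\partial^2 L(\theta_0)}{\partial d_a \partial \beta_b} \overset{p}{\to} 0, \tag{68}
\]

which implies (58) in view of (62). The left-hand side of (68) is asymptotically equal to

\begin{align*}
& \lambda_m^{d_b-d_p} \frac{2}{m} \sum_{j=1}^{m} \nu_j \operatorname{Re}\left( g^a \left( \frac{1}{m} \sum_{j'=1}^{m} \Lambda_{j'} \begin{pmatrix} O_{p-1} & I_{xa}(\lambda_{j'}) \\ I_{ax}(\lambda_{j'}) & 2 I_{pa}(\lambda_{j'}) \end{pmatrix} \Lambda_{j'} \right) G^{-1} \Lambda_j I_{wa}(\lambda_j) \lambda_j^{d_a} \right) \\
& - \lambda_m^{d_b-d_p} \frac{2}{m} \sum_{j=1}^{m} \nu_j \operatorname{Re}\left( g^a \Lambda_j \begin{pmatrix} 0_{p-1} \\ I_{ab}(\lambda_j) \end{pmatrix} \lambda_j^{d_a} \right)
\end{align*}

by (38). The first of these terms is asymptotically negligible by the same arguments as for (66), and the second by those for (52). This completes the proof.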

References

Andersen, T. G. & Bollerslev, T. (1997a), ‘Heterogeneous information arrivals and return volatility dynamics: Uncovering the long-run in high frequency returns’, Journal of Finance 52, 975-1005.

Andersen, T. G. & Bollerslev, T. (1997b), ‘Intraday periodicity and volatility persistence in financial markets’, Journal of Empirical Finance 4, 115-158.

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Ebens, H. (2001), ‘The distribution of realized stock return volatility’, Journal of Financial Economics 61, 43-76.

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2001), ‘The distribution of exchange rate volatility’, Journal of the American Statistical Association 96, 42-55.

Baillie, R. T., Bollerslev, T. & Mikkelsen, H. O. (1996), ‘Fractionally integrated generalized autoregressive conditional heteroskedasticity’, Journal of Econometrics 74, 3-30.

Bandi, F. M. & Perron, B. (2004), ‘Long memory and the relation between implied and realized volatility’, Preprint, Universite de Montreal.

Breidt, F. J., Crato, N. & de Lima, P. (1998), ‘The detection and estimation of long-memory in stochastic volatility’, Journal of Econometrics 83, 325-348.

Brown, B. M. (1971), ‘Martingale central limit theorems’, Annals of Mathematical Statistics 42, 59-66.

Chen, W. W. & Hurvich, C. M. (2003a), ‘Estimating fractional cointegration in the presence of polynomial trends’, Journal of Econometrics 117, 95-121.

Chen, W. W. & Hurvich, C. M. (2003b), ‘Semiparametric estimation of multivariate fractional cointegration’, Journal of the American Statistical Association 98, 629-642.

Christensen, B. J. & Nielsen, M. Ø. (2004), ‘Asymptotic normality of narrow-band least squares in the stationary fractional cointegration model and volatility forecasting’, Forthcoming in Journal of Econometrics.

Christensen, B. J. & Prabhala, N. R. (1998), ‘The relation between implied and realized volatility’, Journal of Financial Economics 50, 125-150.

Comte, F. & Renault, E. (1996), ‘Long-memory continuous-time models’, Journal of Econometrics 73, 101-149.

Deo, R. S. & Hurvich, C. M. (2003), Estimation of long memory in volatility, in P. Doukhan, G. Oppenheim & M. S. Taqqu, eds, ‘Theory and Applications of Long-Range Dependence’, Birkhäuser, Boston, pp. 313-324.

Ding, Z., Granger, C. W. J. & Engle, R. F. (1993), ‘A long memory property of stock market returns and a new model’, Journal of Empirical Finance 1, 83-106.

Dueker, M. & Startz, R. (1998), ‘Maximum-likelihood estimation of fractional cointegration with an application to U.S. and Canadian bond rates’, Review of Economics and Statistics 83, 420-426.

Engle, R. & Granger, C. W. J. (1987), ‘Cointegration and error correction: Representation, estimation and testing’, Econometrica 55, 251-276.

Geweke, J. & Porter-Hudak, S. (1983), ‘The estimation and application of long memory time series models’, Journal of Time Series Analysis 4, 221-238.

Granger, C. W. J. (1981), ‘Some properties of time series data and their use in econometric model specification’, Journal of Econometrics 16, 121-130.

Haldrup, N. & Nielsen, M. Ø. (2004), ‘A regime switching long memory model for electricity prices’, Working paper, Cornell University.

Hall, P. & Heyde, C. C. (1980), Martingale Limit Theory and its Application, Academic Press, New York.

Hannan, E. J. (1979), ‘The central limit theorem for time series regression’, Stochastic Processes and their Applications 9, 281-289.

Harvey, A. C. (1998), Long memory in stochastic volatility, in J. Knight & S. Satchell, eds, ‘Forecasting Volatility in Financial Markets’, Butterworth-Heinemann, Oxford, pp. 307-320.

Hassler, U., Marmol, F. & Velasco, C. (2000), ‘Residual log-periodogram inference for long-run relationships’, Forthcoming in Journal of Econometrics.

Henry, M. & Zaffaroni, P. (2003), The long range dependence paradigm for macroeconomics and finance, in P. Doukhan, G. Oppenheim & M. S. Taqqu, eds, ‘Theory and Applications of Long-Range Dependence’, Birkhäuser, Boston, pp. 417-438.

Hidalgo, F. J. & Robinson, P. M. (2002), ‘Adapting to unknown disturbance autocorrelation in regression with long memory’, Econometrica 20, 1545-1581.

Künsch, H. R. (1987), Statistical aspects of self-similar processes, in Y. Prokhorov & V. V. Sazanov, eds, ‘Proceedings of the First World Congress of the Bernoulli Society’, VNU Science Press, Utrecht, pp. 67-74.

Lobato, I. N. (1997), ‘Consistency of the averaged cross-periodogram in long memory series’, Journal of Time Series Analysis 18, 137-155.

Lobato, I. N. (1999), ‘A semiparametric two-step estimator in a multivariate long memory model’, Journal of Econometrics 90, 129-153.

Lobato, I. N. & Robinson, P. M. (1996), ‘Averaged periodogram estimation of long memory’, Journal of Econometrics 73, 303-324.

Lobato, I. N. & Velasco, C. (2000), ‘Long memory in stock-market trading volume’, Journal of Business and Economic Statistics 18, 410-427.

Marinucci, D. & Robinson, P. M. (2001), ‘Semiparametric fractional cointegration analysis’, Journal of Econometrics 105, 225-247.

Nielsen, M. Ø. (2002), ‘Semiparametric estimation in time series regression with long range dependence’, Forthcoming in Journal of Time Series Analysis.

Nielsen, M. Ø. & Shimotsu, K. (2004), ‘Determining the cointegrating rank in nonstationary fractional systems by the exact local Whittle approach’, Working paper, Cornell University.

Robinson, P. M. (1991), ‘Testing for strong serial correlation and dynamic conditional heteroskedasticity in multiple regressions’, Journal of Econometrics 47, 67-84.

Robinson, P. M. (1994a), ‘Semiparametric analysis of long-memory time series’, Annals of Statistics 22, 515-539.

Robinson, P. M. (1994b), Time series with strong dependence, in C. A. Sims, ed., ‘Advances in Econometrics’, Cambridge University Press, Cambridge, pp. 47-95.

Robinson, P. M. (1995a), ‘Gaussian semiparametric estimation of long range dependence’, Annals of Statistics 23, 1630-1661.

Robinson, P. M. (1995b), ‘Log-periodogram regression of time series with long range dependence’, Annals of Statistics 23, 1048-1072.

Robinson, P. M. (1997), ‘Large-sample inference for nonparametric regression with dependent errors’, Annals of Statistics 25, 2054-2083.

Robinson, P. M. & Henry, M. (1999), ‘Long and short memory conditional heteroscedasticity in estimating the memory parameter of levels’, Econometric Theory 15, 299-336.

Robinson, P. M. & Hidalgo, F. J. (1997), ‘Time series regression with long-range dependence’, Annals of Statistics 25, 77-104.

Robinson, P. M. & Marinucci, D. (2003), Semiparametric frequency domain analysis of fractional cointegration, in P. M. Robinson, ed., ‘Time Series With Long Memory’, Oxford University Press, Oxford, pp. 334-373.


Robinson, P. M. & Yajima, Y. (2002), ‘Determination of cointegrating rank in fractional systems’, Journal of Econometrics 106, 217–241.

Velasco, C. (2001), ‘Gaussian semiparametric estimation of fractional cointegration’, Preprint, Universidad Carlos III de Madrid.

Velasco, C. (2003), ‘Gaussian semi-parametric estimation of fractional cointegration’, Journal of Time Series Analysis 24, 345–378.

Watson, M. W. (1994), Vector autoregressions and cointegration, in R. F. Engle & D. L. McFadden, eds, ‘Handbook of Econometrics, Vol. IV’, North-Holland, Amsterdam, chapter 47, pp. 2843–2915.

Zygmund, A. (2002), Trigonometric Series, third edn, Cambridge University Press, Cambridge.


Table 1: Simulation Results for Model A with d = 0.4 and d_e = 0

                              Bias                            RMSE
                     d         d_e        β          d         d_e        β
n = 200
  m = n^0.5 = 14   −0.0138   −0.0545   −0.0013    0.1962    0.2834    0.1080
  m = n^0.6 = 24   −0.0062   −0.0311   −0.0007    0.1336    0.1588    0.0861
  m = n^0.7 = 40   −0.0086   −0.0191   −0.0000    0.0969    0.1054    0.0746
n = 500
  m = n^0.5 = 22   −0.0041   −0.0333    0.0002    0.1431    0.1719    0.0622
  m = n^0.6 = 41   −0.0009   −0.0158    0.0002    0.0940    0.1016    0.0515
  m = n^0.7 = 77   −0.0021   −0.0086    0.0004    0.0651    0.0673    0.0458

Table 2: Simulation Results for Model A with d = 0.2 and d_e = 0.1

                              Bias                            RMSE
                     d         d_e        β          d         d_e        β
n = 200
  m = n^0.5 = 14   −0.0216   −0.0361    0.0018    0.1971    0.2649    0.1866
  m = n^0.6 = 24   −0.0142   −0.0205    0.0000    0.1358    0.1446    0.1331
  m = n^0.7 = 40   −0.0119   −0.0135    0.0006    0.0958    0.0997    0.1061
n = 500
  m = n^0.5 = 22   −0.0125   −0.0184   −0.0020    0.1419    0.1558    0.1271
  m = n^0.6 = 41   −0.0062   −0.0105   −0.0018    0.0950    0.0979    0.0946
  m = n^0.7 = 77   −0.0056   −0.0065   −0.0011    0.0659    0.0661    0.0729


Table 3: Simulation Results for Model B with d = 0.4 and d_e = 0

                              Bias                            RMSE
                     d         d_e        β          d         d_e        β
n = 200
  m = n^0.5 = 14    0.0486   −0.0600   −0.0007    0.2026    0.2807    0.0563
  m = n^0.6 = 24    0.1180   −0.0350   −0.0010    0.1785    0.1549    0.0454
  m = n^0.7 = 40    0.2153   −0.0222   −0.0007    0.2369    0.1046    0.0424
n = 500
  m = n^0.5 = 22    0.0157   −0.0380   −0.0000    0.1454    0.1902    0.0339
  m = n^0.6 = 41    0.0605   −0.0202   −0.0001    0.1137    0.1025    0.0270
  m = n^0.7 = 77    0.1499   −0.0119   −0.0001    0.1639    0.0671    0.0250

Table 4: Simulation Results for Model B with d = 0.2 and d_e = 0.1

                              Bias                            RMSE
                     d         d_e        β          d         d_e        β
n = 200
  m = n^0.5 = 14    0.0388   −0.0378    0.0004    0.2018    0.3478    0.1488
  m = n^0.6 = 24    0.1156   −0.0180    0.0012    0.1791    0.2951    0.1211
  m = n^0.7 = 40    0.2171   −0.0157    0.0005    0.2383    0.1008    0.0681
n = 500
  m = n^0.5 = 22    0.0108   −0.0220    0.0000    0.1412    0.1573    0.0652
  m = n^0.6 = 41    0.0577   −0.0130   −0.0001    0.1101    0.1010    0.0517
  m = n^0.7 = 77    0.1493   −0.0088   −0.0000    0.1629    0.0667    0.0431


Table 5: Simulation Results for Model C with d = 0.4 and d_e = 0

                              Bias                            RMSE
                     d         d_e        β          d         d_e        β
n = 200
  m = n^0.5 = 14    0.0438   −0.0593    0.0864    0.2018    0.2850    0.1117
  m = n^0.6 = 24    0.1164   −0.0421    0.0973    0.1784    0.1660    0.1125
  m = n^0.7 = 40    0.2163   −0.0279    0.1109    0.2384    0.1113    0.1228
n = 500
  m = n^0.5 = 22    0.0159   −0.0353    0.0720    0.1435    0.1917    0.0840
  m = n^0.6 = 41    0.0604   −0.0220    0.0832    0.1127    0.1069    0.0898
  m = n^0.7 = 77    0.1503   −0.0143    0.0961    0.1645    0.0712    0.1013

Table 6: Simulation Results for Model C with d = 0.2 and d_e = 0.1

                              Bias                            RMSE
                     d         d_e        β          d         d_e        β
n = 200
  m = n^0.5 = 14    0.0389   −0.0479    0.2362    0.2010    0.3412    0.2748
  m = n^0.6 = 24    0.1126   −0.0382    0.2516    0.1771    0.1553    0.2674
  m = n^0.7 = 40    0.2161   −0.0303    0.2772    0.2375    0.1050    0.2883
n = 500
  m = n^0.5 = 22    0.0105   −0.0268    0.2230    0.1407    0.1567    0.2357
  m = n^0.6 = 41    0.0557   −0.0214    0.2377    0.1095    0.1010    0.2452
  m = n^0.7 = 77    0.1486   −0.0202    0.2608    0.1627    0.0700    0.2657


Table 7: Application to the Implied-Realized Volatility Relation

Parameter   Initial   Two Step   Std. Error   W (d_e = 0, β = 1)   Converged   Std. Error   W (d_e = 0, β = 1)
m = n^0.50 = 20
  d         0.4628    0.4628     0.1117                            0.4628      0.1117
  d_e       0.1476    0.1507     0.1117       4.8452               0.1507      0.1117       4.8464
  β         0.7767    0.7717     0.1313                            0.7716      0.1313
m = n^0.55 = 27
  d         0.4807    0.4807     0.0960                            0.4807      0.0960
  d_e       0.1570    0.1673     0.0960       9.8175**             0.1677      0.0960       9.8332**
  β         0.7253    0.7098     0.1115                            0.7095      0.1115
m = n^0.60 = 37
  d         0.4527    0.4527     0.0821                            0.4527      0.0821
  d_e       0.1679    0.1766     0.0821       17.057**             0.1768      0.0821       17.074**
  β         0.6968    0.6810     0.0905                            0.6808      0.0905

Note: For the Wald tests, one or two asterisks denote significance at the 5% or 1% level, respectively.


Chapter 2

Semiparametric Estimation in Time Series Regression with Long Range Dependence

Published in Journal of Time Series Analysis, 2005, vol. 26, pp. 279—304


Semiparametric Estimation in Time Series Regression with Long Range Dependence

Morten Ørregaard Nielsen∗

Abstract

We consider semiparametric estimation in time series regression in the presence of long range dependence in both the errors and the stochastic regressors. A central limit theorem is established for a class of semiparametric frequency domain weighted least squares estimates, which includes both narrow band ordinary least squares and narrow band generalized least squares as special cases. The estimates are semiparametric in the sense that focus is on the neighborhood of the origin, and only periodogram ordinates in a degenerating band around the origin are used. This setting differs from earlier work on time series regression with long range dependence where a fully parametric approach has been employed. The generalized least squares estimate is infeasible when the degree of long range dependence is unknown and must be estimated in an initial step. In that case, we show that a feasible estimate exists, which has the same asymptotic properties as the infeasible estimate. By Monte Carlo simulation, we evaluate the finite-sample performance of the generalized least squares estimate and the feasible estimate.

JEL Classification: C14, C22

Keywords: Fractional integration, generalized least squares, linear regression, long range dependence, semiparametric estimation, Whittle likelihood

∗I am grateful to Jörg Breitung, Svend Hylleberg, Søren Johansen, Peter Phillips, seminar participants at Yale University, an anonymous referee, and an associate editor for comments and suggestions.


1 Introduction

In this paper we derive central limit theorems for semiparametric estimates of the coefficient vector $\beta$ in the multiple linear time series regression model

$$y_t = \alpha + \beta' x_t + u_t, \qquad t = 1, 2, \ldots, \tag{1}$$

where both the $(p-1)$-vector of stochastic regressors $x_t$ and the scalar errors $u_t$ are allowed to have long range dependence.

It is well known that, under a wide variety of regularity conditions, the ordinary least squares and generalized least squares estimates of $\beta$ are asymptotically normal, see e.g. Hannan (1979). However, as discussed by Robinson (1994a, 1994b) and Robinson & Hidalgo (1997), this fails to hold when $x_t$ and $u_t$ have sufficient collective long range dependence. To account for this, Robinson (1994a) suggested a narrow band (semiparametric) frequency domain least squares estimate, where the estimation is conducted over a degenerating band of frequencies near the origin, and proved its consistency for arbitrary short-run dynamics. As an alternative, Robinson & Hidalgo (1997) introduced a parametric class of (full band) weighted least squares estimates (including generalized least squares as a special case), and proved root-n-consistency and asymptotic normality for these estimates, assuming correct specification of the dynamics at any frequency.

We consider a semiparametric version of the class of weighted least squares estimates in Robinson & Hidalgo (1997). The advantage of the semiparametric approach is that consistency and asymptotic normality are retained without the need for correct specification of the short-run dynamics. Suppose the spectral density matrix of the $p$-vector $w_t = (x_t', u_t)'$ exists and satisfies

$$f_w(\lambda) \sim \Lambda^{-1} G \Lambda^{-1} \quad \text{as } \lambda \to 0^+, \tag{2}$$

where the symbol "$\sim$" means that the ratio of the left- and right-hand sides tends to one (elementwise), $\Lambda = \mathrm{diag}(\lambda^{d_1}, \ldots, \lambda^{d_p})$, and $G$ is a $p \times p$ real, symmetric, positive definite matrix. Then the process $w_t$ is said to have long range dependence or strong dependence since the autocorrelations decay hyperbolically. The parameters $d_1, \ldots, d_p$ determine the memory of the process, i.e. each component of $w_t$, say $w_{at}$, is associated with one memory parameter, $d_a$. If $d_a > -1/2$, $w_{at}$ is invertible and admits a linear representation, and if $d_a < 1/2$, $w_{at}$ is covariance stationary. If $d_a = 0$, the spectral density is bounded at the origin and $w_{at}$ has only weak dependence. Sometimes $w_{at}$ is said to have negative, short, or long memory when $d_a < 0$, $d_a = 0$, or $d_a > 0$, respectively. Note that the memory parameter of $u_t = w_{pt}$ is $d_p$ in this notation. Throughout this paper we shall be concerned with the case $0 \le d_a < 1/2$, $a = 1, \ldots, p$, since this is the dominant case in empirical research, see Robinson (1994b) and Beran (1994) for a review of long range dependent processes.

The most well known parametric models satisfying (2) are the fractional Gaussian noise and the fractional ARIMA models, see Mandelbrot & Van Ness (1968), Adenstedt (1974), Granger & Joyeux (1980), and Hosking (1981). The obvious advantage of specifying the spectral density only in a neighborhood of the origin as in (2), is that it allows treating the spectral density away from the origin nonparametrically, assuming only mild regularity conditions. Thus, in applications we need not worry about correct specification of the short-run dynamics of the process, such as the autoregressive and moving average orders in the fractional ARIMA model. Previously, this type of specification, termed semiparametric by Robinson (1994a), has been applied for estimation of the memory parameters by Geweke & Porter-Hudak (1983), Robinson (1994a, 1995a, 1995b), Lobato & Robinson (1996), and Lobato (1997, 1999), among others.
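As a simple illustration of (2), consider a scalar fractional ARIMA(0, d, 0) process with unit innovation variance. This worked step is a standard textbook calculation added here for orientation; it is not part of the original derivations:

```latex
f(\lambda) = \frac{1}{2\pi}\left|1 - e^{i\lambda}\right|^{-2d}
           = \frac{1}{2\pi}\left(2\sin\frac{\lambda}{2}\right)^{-2d}
           \sim \frac{1}{2\pi}\,\lambda^{-2d}
           \quad \text{as } \lambda \to 0^+ .
```

Hence the scalar version of (2) holds with $G = 1/(2\pi)$, whatever short-run dynamics are added away from the origin.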

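Based on observations $(y_t, x_t)$, $t = 1, \ldots, n$, we consider the class of semiparametric weighted least squares estimates

$$\hat\beta_{\delta,m} = \left( \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left(I_{xx}(\lambda_j)\right) \right)^{-1} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left(I_{xy}(\lambda_j)\right), \tag{3}$$

where

$$I_{ab}(\lambda) = w_a(\lambda)\, w_b^*(\lambda) \quad \text{and} \quad w_a(\lambda) = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} a_t e^{it\lambda} \tag{4}$$

are the cross-periodogram matrix between $a_t$ and $b_t$ and the discrete Fourier transform of $a_t$, respectively, $\lambda_j = 2\pi j/n$ are the harmonic frequencies, $m = m(n)$ is a bandwidth parameter, and the asterisk denotes complex conjugation combined with transposition.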
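The estimate (3) can be sketched in a few lines of Python. This is only a minimal illustration assuming NumPy; the function names `dft` and `narrow_band_wls` are hypothetical and not from the paper, whose computations were carried out in Ox.

```python
import numpy as np

def dft(a, m):
    """Discrete Fourier transform (4) of a at the harmonic frequencies λ_1..λ_m."""
    a = np.asarray(a, dtype=float)
    n = a.shape[0]
    lam = 2 * np.pi * np.arange(1, m + 1) / n
    # np.fft.fft uses exp(-i k λ_j) with k = 0..n-1; for real data its conjugate
    # is sum_k a_k exp(+i k λ_j), and exp(i λ_j) shifts the index to t = 1..n.
    w = np.conj(np.fft.fft(a, axis=0))[1:m + 1]
    phase = np.exp(1j * lam)
    if w.ndim == 2:
        phase = phase[:, None]
    return lam, w * phase / np.sqrt(2 * np.pi * n)

def narrow_band_wls(y, x, m, delta=0.0):
    """Weighted narrow band least squares estimate (3) with weights λ_j^(2δ)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    lam, wx = dft(x, m)          # shape (m, k)
    _, wy = dft(y, m)            # shape (m,)
    weights = lam ** (2 * delta)
    # Weighted averages of Re I_xx(λ_j) and Re I_xy(λ_j) over j = 1..m
    sxx = np.einsum('j,ja,jb->ab', weights, wx, np.conj(wx)).real / m
    sxy = np.einsum('j,ja,j->a', weights, wx, np.conj(wy)).real / m
    return np.linalg.solve(sxx, sxy)
```

Because the zero frequency is excluded, the intercept drops out; setting `delta=0.0` gives the FDLS estimate and `delta` equal to the memory parameter of the errors gives the FDGLS estimate.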

Our estimates are semiparametric in the sense that they employ only local assumptions (near the zero frequency), such as (2), on the spectral density matrix of $w_t$, except for weak regularity conditions (see below). Thus, we shall need the bandwidth parameter $m = m(n)$ to tend to infinity at a slower rate than $n$, such that we remain in a neighborhood of the origin where the functional form of the spectral density (2) is assumed. This has the advantage that our estimate is invariant to the short-run dynamics of the processes $x_t$ and $u_t$ (it is also location invariant since $\lambda_0 = 0$ is left out of the summations in (3)). In contrast, the estimates in Robinson & Hidalgo (1997) use all available periodogram ordinates (i.e. $m = n$) and replace our weights $\lambda_j^{2\delta}$ by weight functions $\phi(\lambda)$, $-\pi \le \lambda < \pi$. Thus, $\phi(\lambda) = 1$ and $\phi(\lambda) = f_u^{-1}(\lambda)$ correspond to ordinary least squares and generalized least squares, respectively, and correct specification of the dynamics of the model at any frequency is assumed.

In our setting, (3) with $\delta = 0$ (i.e. $\hat\beta_{0,m}$) is termed the narrow band frequency domain least squares (FDLS) estimate (see Robinson (1994a) and Robinson & Marinucci (2003)). Henceforth, we shall term (3) with $\delta = d_p$ (i.e. $\hat\beta_{d_p,m}$) the narrow band frequency domain generalized least squares (FDGLS) estimate. The latter case also corresponds to the local Whittle QMLE of $\beta$. To see this, consider the local frequency domain Whittle QML objective function for (1),

$$W(\beta, G_{pp}) = \frac{1}{m} \sum_{j=1}^{m} \left( \log f_{pp}(\lambda_j) + \frac{I_{pp}(\lambda_j)}{f_{pp}(\lambda_j)} \right). \tag{5}$$


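Using the local form $f_{pp}(\lambda_j) = G_{pp}\lambda_j^{-2d_p}$ implied by (2), concentrate $G_{pp}$ out of the likelihood by setting $\hat G_{pp}(\beta) = m^{-1} \sum_{j=1}^{m} \lambda_j^{2d_p} I_{pp}(\lambda_j)$. The concentrated likelihood is then $W_c(\beta) = \log \hat G_{pp}(\beta)$ apart from constant terms. The derivative, using that $I_{pp}(\lambda_j) = I_{yy}(\lambda_j) - \mathrm{Re}\left(\beta' I_{xy}(\lambda_j) + I_{yx}(\lambda_j)\beta - \beta' I_{xx}(\lambda_j)\beta\right)$, is

$$W_c'(\beta) = 2\, \hat G_{pp}(\beta)^{-1} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2d_p}\, \mathrm{Re}\left(I_{xx}(\lambda_j)\beta - I_{xy}(\lambda_j)\right),$$

and setting this equal to zero produces (3) with $\delta = d_p$.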
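The equivalence between FDGLS and the concentrated local Whittle objective can be checked numerically. The sketch below assumes NumPy, a scalar regressor, and a known value of $d_p$; all function names are hypothetical and chosen for illustration only.

```python
import numpy as np

def periodograms(y, x, m):
    """Return λ_j and I_yy, I_xx, I_xy at j = 1..m for scalar series y and x."""
    n = len(y)
    lam = 2 * np.pi * np.arange(1, m + 1) / n
    wy = np.fft.fft(y)[1:m + 1] / np.sqrt(2 * np.pi * n)
    wx = np.fft.fft(x)[1:m + 1] / np.sqrt(2 * np.pi * n)
    # The FFT sign convention differs from (4), but |w|^2 and Re(I_xy) are unaffected.
    return lam, np.abs(wy) ** 2, np.abs(wx) ** 2, wx * np.conj(wy)

def concentrated_objective(beta, lam, iyy, ixx, ixy, d_p):
    """W_c(β) = log Ĝ_pp(β), with I_pp the periodogram of y_t − β x_t."""
    ipp = iyy - 2 * beta * ixy.real + beta ** 2 * ixx
    return np.log(np.mean(lam ** (2 * d_p) * ipp))

def fdgls(lam, ixx, ixy, d_p):
    """Scalar version of (3) with δ = d_p."""
    w = lam ** (2 * d_p)
    return np.sum(w * ixy.real) / np.sum(w * ixx)
```

Since $\hat G_{pp}(\beta)$ is a convex quadratic in $\beta$, the value returned by `fdgls` minimises `concentrated_objective` exactly, which is the first-order condition derived above.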

In the next section, we shall give the conditions necessary to prove central limit theorems of the type

$$\sqrt{m}\, \lambda_m^{d_p} \Lambda_m^{-1} \left( \hat\beta_{\delta,m} - \beta \right) \to_d N\left(0, E^{-1} F E^{-1}\right), \tag{6}$$

where $\Lambda_m = \mathrm{diag}(\lambda_m^{d_1}, \ldots, \lambda_m^{d_{p-1}})$ and $E$, $F$ will be defined later. As mentioned above, the fully parametric version of this class of estimates has been examined by Robinson & Hidalgo (1997), who derived a parametric version of (6) in the case of long range dependent regressors and errors.

For the case with long range dependent errors and fixed (nonstochastic) regressors, Yajima (1988, 1991) derived central limit theorems for the ordinary least squares and generalized least squares estimates under conditions on the cumulants of all orders, and gave conditions for the ordinary least squares estimate to achieve the efficiency of the generalized least squares estimate. Dahlhaus (1995) considered an efficient weighted least squares estimate, and proved asymptotic normality under Gaussianity of the errors. Robinson (1997) gave a central limit theorem for nonparametric regression with fixed regressors assuming that the errors are linear in martingale differences. For a detailed discussion of the fixed regressor case, see Robinson & Hidalgo (1997) and the references therein.

Our emphasis on stochastic long range dependent regressors reflects recent empirical research. Thus, we also cover the case of cointegration where, if $d_p < \min_{1 \le a \le p-1} d_a$, $y_t$ and $x_t$ are termed (fractionally) cointegrated, see ? for details on this definition of cointegration and its implications. Cointegration is essentially the necessary condition to avoid spurious regression effects when data is trended, i.e. when $d_a$ is high, see Phillips (1986) and Tsay & Chung (2000). Since we impose only the condition $d_a \in [0, 1/2)$, for all $a$, on the memory parameters, our framework provides a unified treatment of cointegration and regression with fractionally integrated regressors and errors.

The paper proceeds as follows. In the next section we present the central limit theorem for (3), and discuss its implications for the FDLS and FDGLS estimates. Section 3 discusses feasible versions of these estimates, and it is shown that the central limit theorem continues to hold for the feasible estimates. Section 4 reports the results of a Monte Carlo investigation of our estimates. The proofs of our theorems appear in sections 5 and 6, and section 7 contains some auxiliary lemmas and propositions.


2 Asymptotic Distribution of Estimates

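We shall need the following assumptions on $w_t$ and the spectral density matrix $f_w(\lambda)$ (with obvious implications for $y_t$).

Assumption 1 The spectral density matrix of $w_t$ in (2) with typical element $f_{ab}(\lambda)$, the cross spectral density between $w_{at}$ and $w_{bt}$, satisfies

$$\left| f_{ab}(\lambda) - G_{ab} \lambda^{-d_a - d_b} \right| = O\left( \lambda^{\alpha - d_a - d_b} \right) \quad \text{as } \lambda \to 0^+, \quad a, b = 1, \ldots, p,$$

for some $\alpha \in (0, 2]$ and $0 \le d_a < 1/2$, $a = 1, \ldots, p$. The matrix $G$ satisfies $G_{ap} = G_{pa} = 0$ for $a = 1, \ldots, p-1$, and the leading $(p-1) \times (p-1)$ submatrix of $G$, denoted $G_x$, is positive definite.

Assumption 2 $w_t$ is a linear process, $w_t = \mu + \sum_{j=0}^{\infty} A_j \varepsilon_{t-j}$, where the coefficient matrices are square summable, $\sum_{j=0}^{\infty} \|A_j\|^2 < \infty$. The innovations satisfy, almost surely, $E(\varepsilon_t \mid \mathcal{F}_{t-1}) = 0$, $E(\varepsilon_t \varepsilon_t' \mid \mathcal{F}_{t-1}) = I_p$, and the matrices $\mu_3 = E(\varepsilon_t \otimes \varepsilon_t \varepsilon_t' \mid \mathcal{F}_{t-1})$ and $\mu_4 = E(\varepsilon_t \varepsilon_t' \otimes \varepsilon_t \varepsilon_t' \mid \mathcal{F}_{t-1})$ are nonstochastic, finite, and do not depend on $t$, with $\mathcal{F}_t = \sigma(\varepsilon_s, s \le t)$.

Assumption 3 As $\lambda \to 0^+$,

$$\frac{dA_a(\lambda)}{d\lambda} = O\left( \lambda^{-1} \|A_a(\lambda)\| \right), \quad a = 1, \ldots, p,$$

where $A_a(\lambda)$ is the $a$'th row of $A(\lambda) = \sum_{j=0}^{\infty} A_j e^{ij\lambda}$.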

We also need a restriction on the expansion rate of the bandwidth parameter $m$.

Assumption 4 The bandwidth parameter $m = m(n)$ satisfies, as $n \to \infty$,

$$\frac{1}{m} + \frac{m^{1+2\alpha}}{n^{2\alpha}} \to 0.$$

Finally, we need to restrict the weighting parameter depending on the memory parameters as follows.

Assumption 5 The weighting parameter $\delta$ satisfies

$$\max_{1 \le a \le p-1} (2d_a + 2d_p - 1)/4 < \delta \le d_p.$$

Our assumptions are a multivariate generalization of those in Robinson (1994a, 1995a), see also Lobato (1997, 1999). They are in some respects much weaker than those employed by Robinson & Hidalgo (1997) in their parametric setup. In particular, we avoid their assumptions of independence between $x_t$ and $u_t$ and complete specification of $f(\lambda)$.


Assumptions 1 and 3 specialize (2) by imposing smoothness conditions on the spectral density matrix of $w_t$ commonly employed in the literature. They are satisfied with $\alpha = 2$ if, e.g., $w_t$ is a vector fractional Gaussian noise or a vector fractional ARIMA process. The condition that $G_x$ must be positive definite is a no multicollinearity condition for the components of $x_t$. The extra condition that $G_{ap} = G_{pa} = 0$ for $a = 1, \ldots, p-1$ ensures that the coherence at $\lambda = 0$ between the regressors and the error process is of smaller order, and can be thought of as a local version of the usual orthogonality condition from least squares theory. In particular, it relaxes the independence assumption employed by Robinson & Hidalgo (1997). Assumption 2 is a straightforward multivariate generalization of the corresponding condition in Robinson (1995a), following Lobato (1999), and imposes a linear structure on $w_t$ with square summable coefficients and martingale difference innovations with finite fourth moments. It is satisfied, for instance, if $\varepsilon_t$ forms an i.i.d. process with finite fourth moments. Under Assumption 2 we can write the spectral density matrix of $w_t$ as

$$f(\lambda) = \frac{1}{2\pi} A(\lambda) A^*(\lambda). \tag{7}$$

Assumption 4 restricts the expansion rate of the bandwidth parameter $m = m(n)$. The bandwidth is required to tend to infinity for consistency, but at a slower rate than $n$ to remain in a neighborhood of the origin, where some knowledge of the form of the spectral density is assumed. The maximal rate depends on the adequacy of the approximation (2) to (7), i.e. on the parameter $\alpha$ from Assumption 1, and the weakest constraint is implied by $\alpha = 2$ in which case the condition is $m = o(n^{4/5})$.
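To see how the rate condition bites in practice, the quantity in Assumption 4 can be evaluated for bandwidths of the form $m = n^c$. The helper below is a hypothetical illustration (not from the paper) and treats $m$ as real-valued for simplicity.

```python
def bandwidth_condition(n, exponent, alpha=2.0):
    """Value of 1/m + m^(1+2α) / n^(2α) for m = n^exponent (m treated as real-valued)."""
    m = n ** exponent
    return 1.0 / m + m ** (1 + 2 * alpha) / n ** (2 * alpha)

# With α = 2 the quantity shrinks as n grows when the exponent is 0.5,
# but does not vanish at the boundary exponent 0.8 (where m^5 / n^4 = 1).
for n in (256, 1024, 4096):
    print(n, bandwidth_condition(n, 0.5), bandwidth_condition(n, 0.8))
```

This matches the statement that, for $\alpha = 2$, the exponent must be strictly below $4/5$.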

Finally, Assumption 5 states the required restrictions on the weighting parameter. Reversing Assumption 5 effectively gives a restriction on the memory parameters for the narrow band FDLS estimate (i.e. $\delta = 0$) to be covered by our theory. Thus, for $\max_{1 \le a \le p-1} d_a + d_p < 1/2$, the narrow band FDLS estimate satisfies Assumption 5.
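Assumption 5 is easy to check for given memory parameters. The small helper below is a hypothetical illustration: `d_x` holds the memory parameters of the regressors and `d_u` that of the errors.

```python
def satisfies_assumption_5(delta, d_x, d_u):
    """Check max_a (2 d_a + 2 d_p − 1) / 4 < δ ≤ d_p, with d_p = d_u."""
    lower = max((2 * d_a + 2 * d_u - 1) / 4 for d_a in d_x)
    return lower < delta <= d_u

print(satisfies_assumption_5(0.0, [0.2], 0.2))  # FDLS with d_x + d_u < 1/2
print(satisfies_assumption_5(0.0, [0.4], 0.2))  # FDLS with d_x + d_u >= 1/2
print(satisfies_assumption_5(0.2, [0.4], 0.2))  # FDGLS, δ = d_u
```

The choice $\delta = d_p$ always satisfies the condition when all memory parameters lie in $[0, 1/2)$, since the lower bound is then below $d_p$.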

We now state the following central limit theorem for $\hat\beta_{\delta,m}$, which is proved in section 5.

Theorem 1 Under (1) and Assumptions 1-5, the estimator defined by (3) satisfies

$$\sqrt{m}\, \lambda_m^{d_p} \Lambda_m^{-1} \left( \hat\beta_{\delta,m} - \beta \right) \to_d N\left(0, E^{-1} F E^{-1}\right) \tag{8}$$

with

$$E_{ab} = \frac{G_{ab}}{1 - d_a - d_b + 2\delta}, \tag{9}$$

$$F_{ab} = \frac{G_{ab} G_{pp}}{2\left(1 - d_a - d_b - 2d_p + 4\delta\right)}. \tag{10}$$

If the memory parameters of $x_t$ and $u_t$ are all equal, i.e. $d_a = d$, $a = 1, \ldots, p$, inference is particularly simple since the memory parameter does not appear in the convergence rate and $E$, $F$ are scalar multiples of $G_x$. We state this special case as a corollary.


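Corollary 1 Under (1), Assumptions 1-5, and $d_a = d \in [0, 1/2)$, $a = 1, \ldots, p$, the estimator defined by (3) satisfies

$$\sqrt{m} \left( \hat\beta_{\delta,m} - \beta \right) \to_d N\left( 0, \frac{(1 - 2d + 2\delta)^2}{2(1 - 4d + 4\delta)}\, G_{pp}\, G_x^{-1} \right).$$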

Let us focus briefly on the case of scalar $x_t$. Suppose $f_w(\lambda) = \mathrm{diag}\left( G_x \lambda^{-2d_x}, G_u \lambda^{-2d_u} \right)$ as $\lambda \to 0^+$. When $d_x + d_u < 1/2$, the FDLS estimate satisfies

$$\sqrt{m}\, \lambda_m^{d_u - d_x} \left( \hat\beta_{0,m} - \beta \right) \to_d N\left( 0, \frac{G_u}{G_x} \frac{(1/2 - d_x)^2}{1/2 - d_x - d_u} \right). \tag{11}$$

However, the FDGLS estimate satisfies

$$\sqrt{m}\, \lambda_m^{d_u - d_x} \left( \hat\beta_{d_u,m} - \beta \right) \to_d N\left( 0, \frac{G_u}{G_x} \left( 1/2 - d_x + d_u \right) \right) \tag{12}$$

for the entire stationary region of $d_x$ and $d_u$, unlike the FDLS estimate. Furthermore, the asymptotic relative efficiency of $\hat\beta_{d_u,m}$ with respect to $\hat\beta_{0,m}$ (when both are asymptotically normal) is

$$\frac{V(\hat\beta_{0,m})}{V(\hat\beta_{d_u,m})} = \frac{(1/2 - d_x)^2}{(1/2 - d_x)^2 - d_u^2},$$

which equals unity if and only if $d_u = 0$, and exceeds unity otherwise. Hence, as expected, the FDGLS estimate is more efficient and applies for a wider range of $(d_x, d_u)$ than the FDLS estimate.
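The scalar variances in (11) and (12) follow from (9) and (10) by direct substitution. The sketch below, with hypothetical function names, evaluates the sandwich $E^{-1} F E^{-1}$ for scalar $x_t$ and reproduces the relative efficiency formula.

```python
def sandwich_variance(d_x, d_u, delta, g_x=1.0, g_u=1.0):
    """E^{-1} F E^{-1} from (9)-(10) for scalar x_t (d_a = d_b = d_x, d_p = d_u).

    Only meaningful when Assumption 5 holds, so that both denominators are positive.
    """
    e = g_x / (1 - 2 * d_x + 2 * delta)
    f = g_x * g_u / (2 * (1 - 2 * d_x - 2 * d_u + 4 * delta))
    return f / e ** 2

def relative_efficiency(d_x, d_u):
    """Asymptotic variance of FDLS (δ = 0) relative to FDGLS (δ = d_u)."""
    return sandwich_variance(d_x, d_u, 0.0) / sandwich_variance(d_x, d_u, d_u)

# Example with d_x = 0.2 and d_u = 0.1 (so that d_x + d_u < 1/2)
print(sandwich_variance(0.2, 0.1, 0.0))   # FDLS variance, as in (11)
print(sandwich_variance(0.2, 0.1, 0.1))   # FDGLS variance, as in (12)
print(relative_efficiency(0.2, 0.1))
```

For $d_x = 0.2$ and $d_u = 0.1$ the formulas give $0.09/0.2 = 0.45$ for FDLS and $0.4$ for FDGLS, a relative efficiency of $1.125$.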

We end this section by remarking that the location of the spectral pole at the origin is not critical as long as the location is known. If instead the pole were located at $\lambda = \bar\lambda \neq 0$, we assume (2) as $\lambda \to \bar\lambda$ and use periodogram ordinates close to $\bar\lambda$ in the summations in (3). However, the case with a pole at the origin dominates both theoretical and empirical research, so we shall not consider this extension further.

3 Feasible Estimates

For the FDGLS estimate the correct $\delta$ is usually not known a priori, and hence this estimate is infeasible in practice. However, $\delta$ can obviously be estimated in any given situation by $\hat\delta = \hat d_p$, where $\hat d_p$ is an estimate of $d_p$ based on residuals $\hat u_t$ from (1). These residuals can be obtained by e.g. FDLS, which does not require any knowledge of the memory parameters. Although the FDLS estimate is not asymptotically normal for all $d$, it is consistent, see Robinson (1994a) and Lobato (1997), and is thus useful as a preliminary estimate. We assume the following for $\hat d_p$.


Assumption 6 The estimate of $d_p$ satisfies, as $n \to \infty$,

$$(\log n) \left( \hat d_p - d_p \right) \to_p 0.$$

In practice, the estimate can be obtained from residuals $\hat u_t$ as mentioned above. Hassler, Marmol & Velasco (2000) and Velasco (2001) provide some evidence that the log-periodogram and Gaussian semiparametric procedures of Robinson (1995a, 1995b), with carefully chosen bandwidth parameters, satisfy Assumption 6 with $\hat d_p - d_p = O_p(m^{-1/2})$.

Denote the feasible estimate $\hat\beta_{\hat d_p,m}$. The asymptotic distribution is given by the following theorem which is proved in section 6.

Theorem 2 Under (1) and Assumptions 1-4 and 6 the results of Theorem 1 hold with $\delta$ replaced by $\hat d_p$.

Thus, under the additional Assumption 6, the initial estimation of the memory parameter of the error process does not influence the asymptotic distribution theory for the regression coefficients obtained in the previous section.

4 Finite Sample Performance

We proceed to investigate the finite sample properties of the FDGLS (henceforth GLS) and feasible FDGLS (henceforth FGLS) estimates in a Monte Carlo study with two different generating mechanisms for $x_t$ and $u_t$. In particular, we generated 10,000 replications of $x_t$ and $u_t$ of length $n = 256$, $512$, and $1{,}024$. Both were Gaussian fractional ARIMAs with spectra given by the two models

$$\text{Model A}: \quad f_a(\lambda) = \frac{1}{2\pi} \left| 1 - e^{i\lambda} \right|^{-2d_a}, \quad a = x, u,$$

$$\text{Model B}: \quad f_a(\lambda) = \frac{1}{2\pi} \left| \frac{1 + 0.4 e^{i\lambda}}{1 - 0.6 e^{i\lambda}} \right|^2 \left| 1 - e^{i\lambda} \right|^{-2d_a}, \quad a = x, u,$$

for the grid of values $d_x = 0.0(0.1)0.4$ and $d_u = 0.0(0.1)d_x$, i.e. $d_u \le d_x$ to avoid any spurious regression effects, see Phillips (1986) and Tsay & Chung (2000). These models both satisfy Assumptions 1-3 with $\alpha = 2$. From the linear model (1), we then generated $y_t$ with intercept $\alpha = 0$ and slope $\beta = 1$; the results are not sensitive to the choice of these two coefficients. All calculations were performed in Ox version 3.10, see Doornik (2001) and Doornik & Ooms (2001).
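A data generating step of this kind might be sketched as follows for Model A. The paper does not state how the fractional ARIMA series were simulated in Ox, so the truncated moving average representation used here, the burn-in length, and the function names are all assumptions made for illustration (NumPy assumed).

```python
import numpy as np

def frac_ma_weights(d, n_weights):
    """MA(∞) weights of (1 − L)^(−d): ψ_0 = 1, ψ_k = ψ_{k−1} (k − 1 + d) / k."""
    psi = np.empty(n_weights)
    psi[0] = 1.0
    for k in range(1, n_weights):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi

def simulate_fractional_noise(n, d, rng, burn_in=2000):
    """Approximate Gaussian ARFIMA(0, d, 0) path of length n (truncated MA filter)."""
    total = n + burn_in
    eps = rng.standard_normal(total)
    psi = frac_ma_weights(d, total)
    # full[t] = sum_{k=0}^{t} ψ_k ε_{t−k}; the burn-in discards the start-up phase
    full = np.convolve(eps, psi)[:total]
    return full[burn_in:]

rng = np.random.default_rng(123)
x = simulate_fractional_noise(256, 0.4, rng)   # regressor with d_x = 0.4
u = simulate_fractional_noise(256, 0.2, rng)   # error with d_u = 0.2
y = 0.0 + 1.0 * x + u                          # model (1) with intercept 0, slope 1
```

Exact simulation methods (for example, based on the autocovariance function) would avoid the truncation error of this simple filter.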

In each model we computed $\hat\beta_{d_u,m}$ and $\hat\beta_{\hat d_u,m}$ with bandwidth parameters $m = [n^{0.4}]$ and $m = [n^{0.5}]$, where $[z]$ denotes the integer part of $z$. The first bandwidth is more conservative, and is expected to be more robust under more complicated generating mechanisms such as Model B. The FGLS estimate was computed by first obtaining the residual process $\hat u_t$ from FDLS estimation of $\beta$, and then estimating $d_u$ by the Gaussian semiparametric estimator of Robinson (1995a) using the same bandwidth parameter for the entire estimation procedure.
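The two-step FGLS procedure can be sketched as follows for a scalar regressor. This is a rough illustration assuming NumPy: the local Whittle objective of Robinson (1995a) is minimised by a simple grid search over $[-0.49, 0.49]$, which is our own simplification, and all function names are hypothetical.

```python
import numpy as np

def _dft(a, m):
    """Fourier transform at λ_1..λ_m (FFT sign convention; real parts are unaffected)."""
    return np.fft.fft(a)[1:m + 1] / np.sqrt(2 * np.pi * len(a))

def narrow_band_slope(y, x, m, delta):
    """Scalar version of (3) with weights λ_j^(2δ)."""
    lam = 2 * np.pi * np.arange(1, m + 1) / len(y)
    wx, wy = _dft(x, m), _dft(y, m)
    w = lam ** (2 * delta)
    return np.sum(w * (wx * np.conj(wy)).real) / np.sum(w * np.abs(wx) ** 2)

def local_whittle_d(u, m, grid=None):
    """Grid minimiser of R(d) = log(mean(λ_j^(2d) I_j)) − 2 d mean(log λ_j)."""
    if grid is None:
        grid = np.linspace(-0.49, 0.49, 99)
    lam = 2 * np.pi * np.arange(1, m + 1) / len(u)
    per = np.abs(_dft(u, m)) ** 2
    obj = [np.log(np.mean(lam ** (2 * d) * per)) - 2 * d * np.mean(np.log(lam))
           for d in grid]
    return float(grid[int(np.argmin(obj))])

def feasible_fdgls(y, x, m):
    """FDLS residuals, then local Whittle estimate of d_u, then FDGLS with δ = d̂_u."""
    beta_fdls = narrow_band_slope(y, x, m, delta=0.0)
    resid = y - beta_fdls * x        # the intercept does not matter at λ_j, j >= 1
    d_hat = local_whittle_d(resid, m)
    return narrow_band_slope(y, x, m, delta=d_hat), d_hat
```

The same bandwidth `m` is used in all three steps, as in the simulation design described above.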


Tables 1-4 about here

In Tables 1-4 we present the results of the simulation study for Model A. Tables 1 and 2 display the Monte Carlo bias of the GLS and FGLS estimates, respectively. The bias is universally lower than 0.008 in absolute value, and there are no clear trends. Tables 3 and 4 display the ratio (henceforth MSE ratio) of the asymptotic variance of $\hat\beta_{d_u,m}$ (from Theorem 1) to the simulated mean-squared errors of the GLS and FGLS estimates. In both tables the estimates with the higher bandwidth parameter are superior, their MSE ratios being closer to unity and in some cases up to 20% higher than those with the lower bandwidth parameter. Comparing the results of Tables 3 and 4, the mean-squared errors of the FGLS estimates are in most cases approximately 5% higher than those of the GLS estimates, the difference of course being due to the estimation of $d_u$. Furthermore, we note a clear monotonicity in the MSE ratios for both estimates. Thus, the ratios tend to be decreasing when $d_x - d_u$ increases. When $d_x = d_u$, i.e. on the diagonals, the asymptotic theory performs very well with MSE ratios around 0.9 for the GLS estimate and 0.85 for the FGLS estimate. The MSE ratios for the fully parametric estimates in Robinson & Hidalgo (1997) display similar magnitudes and patterns across $d_x$ and $d_u$ ($c$ and $d$, respectively, in their notation).

Tables 5-8 about here

Tables 5-8 present the corresponding simulation results for Model B. Again the bias is negligible, and the pattern of MSE ratios from Tables 3 and 4 is repeated. Naturally, the MSE ratios tend to be lower under this more complicated generating mechanism, but only slightly so. Robinson & Hidalgo (1997) considered only one generating mechanism, Model A, but do conjecture that their MSE ratios ‘could deteriorate if a richer model of $f(\lambda)$ were estimated.’

Unreported simulations have shown that the highest possible expansion rate for the bandwidth under Assumption 4, $m = [n^{0.8}]$, generally results in an MSE ratio smaller than 0.6 for the GLS estimate for Model B, and thus appears too high for the sample sizes considered here.

Overall, the asymptotic theory seems to perform well, and the results of the Monte Carlo study are very similar to those obtained by Robinson & Hidalgo (1997) for their fully parametric estimates. However, in contrast to the estimates of Robinson & Hidalgo (1997), ours can be obtained without any prior knowledge of the generating mechanism of $x_t$ and $u_t$. In particular, we do not need to know if $x_t$ and $u_t$ are generated by Model A or Model B in order to calculate our semiparametric estimates. The simulated bias is negligible in all our specifications and the MSE ratio is high when $d_x$ and $d_u$ are not too far apart. However, when $d_x$ is much larger than $d_u$, the asymptotic variance is quite small compared to the Monte Carlo result, and consequently asymptotic confidence intervals tend to be too narrow.


5 Proof of Theorem 1

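We prove Theorem 1 using the auxiliary results in section 7. The basic technique is the martingale difference approximation method of Robinson (1995a). The left-hand side of (8) is

$$\left( \Lambda_m \lambda_m^{-2\delta} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left(I_{xx}(\lambda_j)\right) \Lambda_m \right)^{-1} \Lambda_m \lambda_m^{d_p - 2\delta} \frac{1}{\sqrt{m}} \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left(I_{xp}(\lambda_j)\right).$$

From Proposition 1 of section 7, the first term on the right-hand side satisfies

$$\Lambda_m \lambda_m^{-2\delta} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left(I_{xx}(\lambda_j)\right) \Lambda_m \to_p E,$$

where $E$ is defined in (9). Note that $G_x$ (and thus $E$) is invertible by Assumption 1.

For the second term we show that

$$\frac{1}{\sqrt{m}} \lambda_m^{d_p - 2\delta} \Lambda_m \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left(I_{xp}(\lambda_j)\right) \to_d N(0, F).$$

By application of the Cramér-Wold device, we need to examine ($\eta$ is a $(p-1)$-vector)

$$\sum_{a=1}^{p-1} \eta_a \frac{1}{\sqrt{m}} \lambda_m^{d_a + d_p - 2\delta} \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left(I_{ap}(\lambda_j)\right)$$

$$= \sum_{a=1}^{p-1} \eta_a \frac{1}{\sqrt{m}} \lambda_m^{d_a + d_p - 2\delta} \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left( I_{ap}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_p^*(\lambda_j) \right) \tag{13}$$

$$+ \sum_{a=1}^{p-1} \eta_a \frac{1}{\sqrt{m}} \lambda_m^{d_a + d_p - 2\delta} \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left( A_a(\lambda_j) \frac{1}{2\pi n} \sum_{t=1}^{n} \varepsilon_t \varepsilon_t' A_p^*(\lambda_j) \right) \tag{14}$$

$$+ \sum_{a=1}^{p-1} \eta_a \frac{1}{\sqrt{m}} \lambda_m^{d_a + d_p - 2\delta} \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left( A_a(\lambda_j) \frac{1}{2\pi n} \sum_{t=1}^{n} \sum_{s \neq t} \varepsilon_t \varepsilon_s' e^{i(t-s)\lambda_j} A_p^*(\lambda_j) \right), \tag{15}$$

where $J(\lambda)$ is the periodogram matrix of the innovations $\varepsilon_t$. Lemma 2 of section 7 proves that (13) is $o_p(1)$, while Lemma 3 in conjunction with Assumptions 1 and 4 proves that (14) is $o_p(1)$ since $m^{-1/2} \lambda_m^{d_a + d_p - 2\delta} \sum_{j=1}^{m} \lambda_j^{2\delta}\, \mathrm{Re}\left(f_{ap}(\lambda_j)\right) = O\left(m^{1+2\alpha} n^{-2\alpha}\right)$.

We are left with (15), which can be written as $\sum_{t=1}^{n} z_{tn}$, where

$$z_{tn} = \varepsilon_t' \sum_{s=1}^{t-1} c_{t-s,n}\, \varepsilon_s,$$

$$c_{tn} = \frac{1}{2\pi n \sqrt{m}} \sum_{j=1}^{m} \theta_j \cos(t\lambda_j),$$

$$\theta_j = \sum_{a=1}^{p-1} \eta_a \lambda_m^{d_a + d_p - 2\delta} \lambda_j^{2\delta}\, \mathrm{Re}\left( A_a'(\lambda_j) \bar A_p(\lambda_j) + A_p'(\lambda_j) \bar A_a(\lambda_j) \right).$$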


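Since $z_{tn}$ is a martingale difference array with respect to the filtration $(\mathcal{F}_t)_{t \in \mathbb{Z}}$, $\mathcal{F}_t = \sigma(\varepsilon_s, s \le t)$, we can apply the CLT of Brown (1971) and Hall & Heyde (1980, chp. 3.2) if

$$\sum_{t=1}^{n} E\left( z_{tn}^2 \mid \mathcal{F}_{t-1} \right) - \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_a \eta_b F_{ab} \to_p 0, \tag{16}$$

$$\sum_{t=1}^{n} E\left( z_{tn}^2\, 1\left( |z_{tn}| > \kappa \right) \right) \to 0, \quad \kappa > 0. \tag{17}$$

A sufficient condition for (17) is

$$\sum_{t=1}^{n} E\left( z_{tn}^4 \right) \to 0. \tag{18}$$

First we show (16). The first term on the left-hand side is

$$\sum_{t=1}^{n} E\left( \sum_{s=1}^{t-1} \sum_{r=1}^{t-1} \varepsilon_s' c_{t-s,n}' \varepsilon_t \varepsilon_t' c_{t-r,n} \varepsilon_r \,\Big|\, \mathcal{F}_{t-1} \right) = \sum_{t=1}^{n} \sum_{s=1}^{t-1} \varepsilon_s' c_{t-s,n}' c_{t-s,n} \varepsilon_s + o_p(1) \tag{19}$$

by Lemma 4. We need to show that the mean of the first term on the right-hand side of (19) is asymptotically equal to $\sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_a \eta_b F_{ab}$. Thus,

$$\sum_{t=1}^{n} \sum_{s=1}^{t-1} E\, \mathrm{tr}\left( c_{t-s,n}' c_{t-s,n} \varepsilon_s \varepsilon_s' \right) = \sum_{t=1}^{n} \sum_{s=1}^{t-1} \mathrm{tr}\left( c_{t-s,n}' c_{t-s,n} \right)$$

$$= \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \frac{1}{4\pi^2 n^2 m}\, \mathrm{tr}\left( \theta_j' \theta_j \right) \cos^2\left( (t-s)\lambda_j \right) \tag{20}$$

$$+ \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \sum_{k \neq j} \frac{1}{4\pi^2 n^2 m}\, \mathrm{tr}\left( \theta_j' \theta_k \right) \cos\left( (t-s)\lambda_j \right) \cos\left( (t-s)\lambda_k \right). \tag{21}$$

Notice that since $\|\theta_j\| = O(1)$ by Theorem 2 of Robinson (1995b) we have

$$(21) = O\left( \sum_{t=1}^{n} \sum_{s=1}^{t-1} \sum_{j=1}^{m} \sum_{k \neq j} \frac{1}{n^2 m} \cos\left( (t-s)\lambda_j \right) \cos\left( (t-s)\lambda_k \right) \right),$$

and since $\sum_{t=1}^{n} \sum_{s=1}^{t-1} \cos\left( (t-s)\lambda_j \right) \cos\left( (t-s)\lambda_k \right) = -n/2$, (21) is $O\left( \sum_{j=1}^{m} \sum_{k \neq j} (n^2 m)^{-1} n \right) = O(m/n)$. Now, $\mathrm{tr}\left( \theta_j' \theta_j \right)$ equals $\sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \eta_a \eta_b \lambda_m^{d_a + d_b + 2d_p - 4\delta} \lambda_j^{4\delta}$ times

$$\mathrm{tr}\left( \mathrm{Re}\left( A_p^*(\lambda_j) A_a(\lambda_j) + A_a^*(\lambda_j) A_p(\lambda_j) \right) \mathrm{Re}\left( A_b'(\lambda_j) \bar A_p(\lambda_j) + A_p'(\lambda_j) \bar A_b(\lambda_j) \right) \right)$$

$$= 4\pi^2 \left( f_{ab}(\lambda_j) f_{pp}(\lambda_j) + f_{ap}(\lambda_j) f_{bp}(\lambda_j) + f_{pb}(\lambda_j) f_{pa}(\lambda_j) + f_{pp}(\lambda_j) f_{ba}(\lambda_j) \right)$$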


by definition of f (λ), see (7). By Assumption 1 the second and third terms are of smallerorder, and since (x+ x) = 2Re (x) for any complex number x we can thus rewrite (20) as

nXt=1

t−1Xs=1

mXj=1

p−1Xa=1

p−1Xb=1

2ηaηbn2m

λda+db+2dp−4δm λ4δj Re (fab (λj) fpp (λj)) cos

2 ((t− s)λj) . (22)

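Using Lemma 1 to approximate the sum $\sum_{j=1}^{m}$ by an integral, and since $\sum_{t=1}^{n-1} \sum_{s=1}^{n-t} \cos^2(s\lambda_j) = (n-1)^2/4$, we have that (22) is

$$\sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \frac{\eta_a \eta_b}{n \pi m} \lambda_m^{d_a+d_b+2d_p-4\delta} \left(\sum_{t=1}^{n} \sum_{s=1}^{t-1} \cos^2\left((t-s)\lambda_j\right)\right) \int_0^{\lambda_m} \lambda^{4\delta} \operatorname{Re}\left(f_{ab}(\lambda) f_{pp}(\lambda)\right) d\lambda$$

$$= \sum_{a=1}^{p-1} \sum_{b=1}^{p-1} \frac{\eta_a \eta_b \lambda_m^{d_a+d_b+2d_p-4\delta}}{2} \int_0^{\lambda_m} \lambda^{4\delta} \operatorname{Re}\left(f_{ab}(\lambda) f_{pp}(\lambda)\right) d\lambda,$$

and we have shown (16).

Hence, we have to show (18),

$$\sum_{t=1}^{n} E\left(z_{tn}^4\right) = \sum_{t=1}^{n} E\left(\sum_{s=1}^{t-1} \varepsilon_s' c_{t-s,n} \varepsilon_t \varepsilon_t' \sum_{r=1}^{t-1} c_{t-r,n} \varepsilon_r \sum_{p=1}^{t-1} \varepsilon_p' c_{t-p,n} \varepsilon_t \varepsilon_t' \sum_{q=1}^{t-1} c_{t-q,n} \varepsilon_q\right)$$

$$\leq C \left(\sum_{t=1}^{n} \operatorname{tr}\left(\sum_{s=1}^{t-1} c_{t-s,n}' c_{t-s,n} c_{t-s,n}' c_{t-s,n}\right) + \sum_{t=1}^{n} \operatorname{tr}\left(\sum_{s=1}^{t-1} c_{t-s,n}' \sum_{r=1}^{t-1} c_{t-r,n} c_{t-r,n}' c_{t-s,n}\right)\right)$$

for some constant $C > 0$ by Assumption 2. Using the arguments in Lemma 4, this expression can be bounded by $O\left(n \left(\sum_{t=1}^{n} \|c_{tn}^2\|\right)^2\right) = O(n^{-1})$, which completes the proof.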
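The two trigonometric sums used in the proof above can be checked numerically. The following is a minimal sketch, assuming Fourier frequencies $\lambda_j = 2\pi j/n$; the helper name `double_cos_sum` and the chosen values of `n`, `j`, `k` are illustrative and not from the original text. The double sum over $(t, s)$ is collapsed to a single sum over the lag $r = t - s$, which occurs $n - r$ times.

```python
import numpy as np

def double_cos_sum(n, j, k):
    """Return sum_{t=1}^{n} sum_{s=1}^{t-1} cos((t-s)*lam_j) * cos((t-s)*lam_k)."""
    lam_j = 2.0 * np.pi * j / n
    lam_k = 2.0 * np.pi * k / n
    r = np.arange(1, n)      # lag r = t - s runs over 1, ..., n-1
    weights = n - r          # each lag r occurs (n - r) times in the double sum
    return float(np.sum(weights * np.cos(r * lam_j) * np.cos(r * lam_k)))

n = 200
cross = double_cos_sum(n, 3, 7)    # j != k: cross term appearing in (21)
square = double_cos_sum(n, 5, 5)   # j == k: squared-cosine term appearing in (20)
```

For these values `cross` equals $-n/2$ up to rounding, while `square` equals $n(n-2)/4$, which agrees with the $(n-1)^2/4$ used in the text up to an $O(1)$ term that does not affect the asymptotic argument.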

6 Proof of Theorem 2

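We show that

$$\sqrt{m} \lambda_m^{d_p} \Lambda_m^{-1} \left(\left(\hat{\beta}_{\hat{d}_p,m} - \beta\right) - \left(\hat{\beta}_{d_p,m} - \beta\right)\right) \to_p 0.$$

By definition of $\hat{\beta}_{\hat{d}_p,m}$ and $\hat{\beta}_{d_p,m}$, this amounts to showing that

$$\lambda_m^{-2d_p} \Lambda_m \left(\frac{1}{m} \sum_{j=1}^{m} \left(\lambda_j^{2\hat{d}_p} - \lambda_j^{2d_p}\right) \operatorname{Re}\left(I_{xx}(\lambda_j)\right)\right) \Lambda_m \to_p 0, \tag{23}$$

$$\sqrt{m} \lambda_m^{-d_p} \Lambda_m \left(\frac{1}{m} \sum_{j=1}^{m} \left(\lambda_j^{2\hat{d}_p} - \lambda_j^{2d_p}\right) \operatorname{Re}\left(I_{xp}(\lambda_j)\right)\right) \to_p 0. \tag{24}$$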


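Since $\left|\max_{1 \leq j \leq m} \lambda_j^{2\hat{d}_p - 2d_p} - 1\right| = O_p\left(|\hat{d}_p - d_p| \log n\right)$, we have

$$\lambda_m^{d_a+d_b-2d_p} \frac{1}{m} \sum_{j=1}^{m} \left(\lambda_j^{2\hat{d}_p} - \lambda_j^{2d_p}\right) \operatorname{Re}\left(I_{ab}(\lambda_j)\right)$$

$$= O_p\left(\lambda_m^{d_a+d_b-2d_p} \frac{1}{m} \left|\max_{1 \leq j \leq m} \lambda_j^{2\hat{d}_p - 2d_p} - 1\right| \sum_{j=1}^{m} \lambda_j^{2d_p} \operatorname{Re}\left(I_{ab}(\lambda_j)\right)\right)$$

$$= O_p\left(\lambda_m^{d_a+d_b-2d_p} \frac{1}{m} |\hat{d}_p - d_p| (\log n) \sum_{j=1}^{m} \lambda_j^{2d_p-d_a-d_b}\right) = O_p\left(|\hat{d}_p - d_p| \log n\right)$$

and

$$\sqrt{m} \lambda_m^{d_a-d_p} \frac{1}{m} \sum_{j=1}^{m} \left(\lambda_j^{2\hat{d}_p} - \lambda_j^{2d_p}\right) \operatorname{Re}\left(I_{ap}(\lambda_j)\right)$$

$$= O_p\left(\sqrt{m} \lambda_m^{d_a-d_p} \frac{1}{m} \left(|\hat{d}_p - d_p| \log n\right) \sum_{j=1}^{m} \lambda_j^{\alpha+d_p-d_a}\right) = o_p\left(\sqrt{m} \lambda_m^{\alpha} |\hat{d}_p - d_p| \log n\right)$$

by Assumption 1. In view of Assumptions 4 and 6, this proves (23) and (24).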
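The order of magnitude $O_p(|\hat{d}_p - d_p| \log n)$ used at the start of the argument above can be illustrated with a deterministic stand-in for the estimation error. This is a minimal sketch, assuming $\lambda_j = 2\pi j/n$ and a fixed hypothetical error `delta` playing the role of $\hat{d}_p - d_p$; the function name and the parameter values are illustrative only.

```python
import numpy as np

def weight_error_ratio(n, m, delta):
    """Return max_j |lam_j^(2*delta) - 1| divided by |delta| * log(n)."""
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n     # Fourier frequencies lam_1, ..., lam_m
    err = np.max(np.abs(lam ** (2.0 * delta) - 1.0))
    return float(err / (abs(delta) * np.log(n)))

ratios = [
    weight_error_ratio(1024, 32, 0.01),
    weight_error_ratio(1024, 32, -0.01),
    weight_error_ratio(10**6, 1000, 0.001),
]
```

In these configurations the ratio stays bounded (roughly between 1 and 2), consistent with the maximum being attained near $\lambda_1$, where $|\log \lambda_1|$ is of order $\log n$.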

7 Auxiliary Propositions and Lemmas

Here we provide a series of auxiliary results used to prove our main theorems. First, we provide an extension of the consistency result of Lobato (1997, Theorem 1) for the discretely averaged cross-periodogram, showing that the result is equally valid for our weighted cross-periodogram.

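Proposition 1 Under Assumptions 1, 2, 5, and $m^{-1} + m/n \to 0$,

$$\lambda_m^{d_a+d_b-2\delta} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta} \operatorname{Re}\left(I_{ab}(\lambda_j)\right) - \frac{G_{ab}}{1 - d_a - d_b + 2\delta} \to_p 0, \quad a, b = 1, ..., p. \tag{25}$$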


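Proof. Decompose the left-hand side of (25) as

$$\lambda_m^{d_a+d_b-2\delta} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta} \operatorname{Re}\left(I_{ab}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_b^*(\lambda_j)\right) \tag{26}$$

$$+ \lambda_m^{d_a+d_b-2\delta} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta} \operatorname{Re}\left(A_a(\lambda_j) \frac{1}{2\pi n} \sum_{t=1}^{n} \varepsilon_t \varepsilon_t' A_b^*(\lambda_j) - f_{ab}(\lambda_j)\right) \tag{27}$$

$$+ \lambda_m^{d_a+d_b-2\delta} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta} \operatorname{Re}\left(A_a(\lambda_j) \frac{1}{2\pi n} \sum_{t=1}^{n} \sum_{s \neq t} \varepsilon_t \varepsilon_s' e^{i(t-s)\lambda_j} A_b^*(\lambda_j)\right) \tag{28}$$

$$+ \lambda_m^{d_a+d_b-2\delta} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta} \operatorname{Re}\left(f_{ab}(\lambda_j)\right) - \frac{G_{ab}}{1 - d_a - d_b + 2\delta}. \tag{29}$$

By Lemmas 2 and 3 and the analysis of (15) in the proof of Theorem 1, (26)-(28) are all $o_p(1)$. Applying Lemma 1 to (29) we get that

$$\lambda_m^{d_a+d_b-2\delta} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta} G_{ab} \lambda_j^{-d_a-d_b} - \frac{G_{ab}}{1 - d_a - d_b + 2\delta} = o(\lambda_m),$$

thus completing the proof.

The first lemma is undoubtedly well known and is provided for reference.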

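Lemma 1 For $m^{-1} + m/n \to 0$ and any $c \in (-1, 1]$,

$$\frac{2\pi}{n} \sum_{j=1}^{m} \lambda_j^{c} - \int_0^{\lambda_m} \lambda^{c} d\lambda = o\left(\lambda_m^{c+1}\right) \quad \text{as } n \to \infty.$$

Proof. For $n$ sufficiently large, the left-hand side is

$$\sum_{j=1}^{m} \int_{\lambda_{j-1}}^{\lambda_j} \left(\lambda_j^{c} - \lambda^{c}\right) d\lambda = \sum_{j=1}^{m} \lambda_j^{c-1} \int_{\lambda_{j-1}}^{\lambda_j} \left(\lambda_j - \left(\frac{\lambda}{\lambda_j}\right)^{c-1} \lambda\right) d\lambda.$$

As $|\lambda_j - (\lambda/\lambda_j)^{a} \lambda| \leq |\lambda_j - \lambda|$ for $\lambda \in (\lambda_{j-1}, \lambda_j)$ and $a \leq 0$, the first term on the right-hand side is

$$O\left(\sum_{j=1}^{m} \lambda_j^{c-1} \int_{\lambda_{j-1}}^{\lambda_j} \left|\lambda_j - \left(\frac{\lambda}{\lambda_j}\right)^{c-1} \lambda\right| d\lambda\right) = O\left(\left|\sum_{j=1}^{m} \lambda_j^{c-1} \int_{\lambda_{j-1}}^{\lambda_j} (\lambda_j - \lambda) d\lambda\right|\right). \tag{30}$$

Since $\int_{\lambda_{j-1}}^{\lambda_j} (\lambda_j - \lambda) d\lambda = 2\pi^2/n^2$ it follows that (30) is $O\left(n^{-2} \sum_{j=1}^{m} \lambda_j^{c-1}\right)$, which is $O\left(m^{-1} \lambda_m^{c+1}\right)$ if $c > 0$ and $O\left((\log m) n^{-c-1}\right) = o\left(\lambda_m^{c+1}\right)$ if $-1 < c \leq 0$.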
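The approximation in Lemma 1 is easy to examine numerically. The sketch below assumes $\lambda_j = 2\pi j/n$ and uses the closed form $\int_0^{\lambda_m} \lambda^c d\lambda = \lambda_m^{c+1}/(c+1)$; the function name and the sample sizes are illustrative choices, not taken from the text. It returns the approximation error relative to $\lambda_m^{c+1}$, which the lemma says should vanish as $m$ grows.

```python
import numpy as np

def lemma1_relative_error(n, m, c):
    """Return ((2*pi/n) * sum_j lam_j^c - int_0^{lam_m} lam^c dlam) / lam_m^(c+1)."""
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n     # Fourier frequencies lam_1, ..., lam_m
    riemann = (2.0 * np.pi / n) * np.sum(lam ** c)  # discrete average over frequencies
    integral = lam[-1] ** (c + 1.0) / (c + 1.0)     # exact integral of lam^c on [0, lam_m]
    return float((riemann - integral) / lam[-1] ** (c + 1.0))

err_pos_small = lemma1_relative_error(10**6, 10, 0.5)
err_pos_large = lemma1_relative_error(10**6, 1000, 0.5)
err_neg_small = lemma1_relative_error(10**6, 10, -0.5)
err_neg_large = lemma1_relative_error(10**6, 10**4, -0.5)
```

For $c > 0$ the relative error shrinks at rate roughly $1/m$, while for $-1 < c \leq 0$ it shrinks more slowly, in line with the two cases distinguished at the end of the proof.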

The remaining lemmas are straightforward extensions (to incorporate our weights) and variants of previous results appearing in Robinson (1995a) and Lobato (1997, 1999).


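Lemma 2 Under the conditions of Proposition 1, for $a, b = 1, ..., p$,

$$\lambda_m^{d_a+d_b-2\delta} \frac{1}{m} \sum_{j=1}^{m} \lambda_j^{2\delta} \operatorname{Re}\left(I_{ab}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_b^*(\lambda_j)\right) = o_p(1), \tag{31}$$

and under the conditions of Theorem 1, for $a = 1, ..., p-1$,

$$\lambda_m^{d_a+d_p-2\delta} \frac{1}{\sqrt{m}} \sum_{j=1}^{m} \lambda_j^{2\delta} \operatorname{Re}\left(I_{ap}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_p^*(\lambda_j)\right) = o_p(1). \tag{32}$$

Proof. Using summation by parts, we have that

$$\sum_{j=1}^{m} \lambda_j^{2\delta} \operatorname{Re}\left(I_{ab}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_b^*(\lambda_j)\right)$$

$$= \sum_{j=1}^{m-1} \left(\lambda_j^{2\delta} - \lambda_{j+1}^{2\delta}\right) \left(\sum_{k=1}^{j} \operatorname{Re}\left(I_{ab}(\lambda_k) - A_a(\lambda_k) J(\lambda_k) A_b^*(\lambda_k)\right)\right) + \lambda_m^{2\delta} \sum_{j=1}^{m} \operatorname{Re}\left(I_{ab}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_b^*(\lambda_j)\right)$$

$$= \sum_{j=1}^{m} n^{-1} \lambda_j^{2\delta-1} \left(\sum_{k=1}^{j} \operatorname{Re}\left(I_{ab}(\lambda_k) - A_a(\lambda_k) J(\lambda_k) A_b^*(\lambda_k)\right)\right) + \lambda_m^{2\delta} \sum_{j=1}^{m} \operatorname{Re}\left(I_{ab}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_b^*(\lambda_j)\right),$$

which is

$$o_p\left(\sum_{j=1}^{m} n^{-1} \lambda_j^{2\delta-1} \left(n \lambda_j^{1-d_a-d_b}\right) + \lambda_m^{2\delta} n \lambda_m^{1-d_a-d_b}\right) = o_p\left(n \lambda_m^{1-d_a-d_b+2\delta}\right)$$

by (3.3)-(3.4) in Lobato (1997), which apply under the conditions of our Proposition 1. It follows that the left-hand side of (31) is $o_p(1)$.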


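To prove the second statement, we use summation by parts to show that

$$\sum_{j=1}^{m} \lambda_j^{2\delta} \operatorname{Re}\left(I_{ap}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_p^*(\lambda_j)\right)$$

$$= \sum_{j=1}^{m-1} \left(\lambda_j^{2\delta-d_a-d_p} - \lambda_{j+1}^{2\delta-d_a-d_p}\right) \sum_{k=1}^{j} \lambda_k^{d_a+d_p} \operatorname{Re}\left(I_{ap}(\lambda_k) - A_a(\lambda_k) J(\lambda_k) A_p^*(\lambda_k)\right) + \lambda_m^{2\delta-d_a-d_p} \sum_{j=1}^{m} \lambda_j^{d_a+d_p} \operatorname{Re}\left(I_{ap}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_p^*(\lambda_j)\right)$$

$$= \sum_{j=1}^{m-1} n^{-1} \lambda_j^{2\delta-d_a-d_p-1} \sum_{k=1}^{j} \lambda_k^{d_a+d_p} \operatorname{Re}\left(I_{ap}(\lambda_k) - A_a(\lambda_k) J(\lambda_k) A_p^*(\lambda_k)\right) \tag{33}$$

$$+ \lambda_m^{2\delta-d_a-d_p} \sum_{j=1}^{m} \lambda_j^{d_a+d_p} \operatorname{Re}\left(I_{ap}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_p^*(\lambda_j)\right). \tag{34}$$

Under the conditions of Theorem 1, we can apply eq. (C.2) in Lobato (1999) to conclude that (34) is $O_p\left(\lambda_m^{2\delta-d_a-d_p} \left(m^{1/3} (\log m)^{2/3} + (\log m) + m^{1/2} n^{-1/4}\right)\right)$ and (33) is

$$O_p\left(n^{d_a+d_p-2\delta} \sum_{j=1}^{m} j^{2\delta-d_a-d_p-1} \left(j^{1/3} (\log j)^{2/3} + (\log j) + j^{1/2} n^{-1/4}\right)\right)$$

$$= O_p\left(n^{d_a+d_p-2\delta} (\log m)^2 \left(1 + m^{2\delta-d_a-d_p+1/3} + n^{-1/4} m^{2\delta-d_a-d_p+1/2}\right)\right).$$

Thus, the left-hand side of (32) is $O_p\left(m^{-1/6} (\log m)^{2/3} + m^{-1/2} (\log m) + n^{-1/4} + m^{d_a+d_p-2\delta-1/2} (\log m)^2 + m^{-1/6} (\log m)^2 + n^{-1/4} (\log m)^2\right) = o_p(1)$ by Assumptions 4 and 5.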
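The summation-by-parts identity underlying both parts of the proof of Lemma 2 can be verified directly. This is a minimal sketch: the function name `sum_by_parts` is hypothetical, and the random vector `b` is only a stand-in for the terms $\operatorname{Re}(I_{ab}(\lambda_j) - A_a(\lambda_j) J(\lambda_j) A_b^*(\lambda_j))$, which are not computed here.

```python
import numpy as np

def sum_by_parts(a, b):
    """Compute sum_j a_j b_j as sum_{j<m} (a_j - a_{j+1}) B_j + a_m B_m, with B_j = b_1 + ... + b_j."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    partial = np.cumsum(b)                       # partial sums B_1, ..., B_m
    return float(np.sum((a[:-1] - a[1:]) * partial[:-1]) + a[-1] * partial[-1])

n, m, delta = 512, 22, 0.3
lam = 2.0 * np.pi * np.arange(1, m + 1) / n      # Fourier frequencies
rng = np.random.default_rng(0)
b = rng.standard_normal(m)                       # stand-in for the periodogram approximation errors
direct = float(np.dot(lam ** (2.0 * delta), b))  # weighted sum computed directly
by_parts = sum_by_parts(lam ** (2.0 * delta), b) # same sum via summation by parts
```

The point of the rearrangement in the proof is that bounds are available for the partial sums $B_j$ (from Lobato, 1997, 1999), while the weight differences $\lambda_j^{2\delta} - \lambda_{j+1}^{2\delta}$ are of order $n^{-1} \lambda_j^{2\delta-1}$.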

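Lemma 3 Under the conditions of Proposition 1, for $a, b = 1, ..., p$,

$$\frac{1}{\sqrt{m}} \lambda_m^{d_a+d_b-2\delta} \sum_{j=1}^{m} \lambda_j^{2\delta} \operatorname{Re}\left(A_a(\lambda_j) \frac{1}{2\pi n} \sum_{t=1}^{n} \varepsilon_t \varepsilon_t' A_b^*(\lambda_j) - f_{ab}(\lambda_j)\right) = o_p(1).$$

Proof. The proof follows parts of the proof of Lobato (1997, Proposition 3). By definition of $f(\lambda)$, the left-hand side is bounded by

$$\left|\frac{1}{\sqrt{m}} \lambda_m^{d_a+d_b-2\delta} \sum_{j=1}^{m} \lambda_j^{2\delta} A_a(\lambda_j) \frac{1}{2\pi} D A_b^*(\lambda_j)\right|, \tag{35}$$

where $D = n^{-1} \sum_{t=1}^{n} \varepsilon_t \varepsilon_t' - I_p$ satisfies $\|D\| = O_p(n^{-1/2})$, since by Assumption 2, $\varepsilon_t \varepsilon_t' - I_p$ is a martingale difference sequence with respect to the filtration $(F_t)_{t \in \mathbb{Z}}$. Then, since $\|A_i(\lambda_j)\| = O(f_{ii}(\lambda_j)^{1/2})$, $i = a, b$, (35) is bounded by

$$\frac{1}{2\pi \sqrt{m}} \lambda_m^{d_a+d_b-2\delta} \left(\sum_{j=1}^{m} \lambda_j^{4\delta} \|A_a(\lambda_j)\|^2 \|D\|^2 \|A_b(\lambda_j)\|^2\right)^{1/2}$$

$$= O_p\left(m^{-1/2} \lambda_m^{d_a+d_b-2\delta} \|D\| \left(\sum_{j=1}^{m} \lambda_j^{4\delta} f_{aa}(\lambda_j) f_{bb}(\lambda_j)\right)^{1/2}\right)$$

$$= O_p\left(m^{-1/2} \lambda_m^{d_a+d_b-2\delta} \|D\| \left(\sum_{j=1}^{m} \lambda_j^{2\delta} f_{aa}(\lambda_j)\right)^{1/2} \left(\sum_{j=1}^{m} \lambda_j^{2\delta} f_{bb}(\lambda_j)\right)^{1/2}\right),$$

which is $O_p(\lambda_m^{1/2}) = o_p(1)$ as required.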

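Lemma 4 Under the conditions of Theorem 1,

$$\sum_{t=2}^{n} E\left(\sum_{s=1}^{t-1} \sum_{r=1}^{t-1} \varepsilon_s' c_{t-s,n}' \varepsilon_t \varepsilon_t' c_{t-r,n} \varepsilon_r \,\Big|\, F_{t-1}\right) - \sum_{t=2}^{n} \sum_{s=1}^{t-1} \varepsilon_s' c_{t-s,n}' c_{t-s,n} \varepsilon_s = o_p(1).$$

Proof. We prove convergence in mean-square. The left-hand side is $\sum_{t=2}^{n} \sum_{s=1}^{t-1} \sum_{r \neq s} \varepsilon_s' c_{t-s,n}' c_{t-r,n} \varepsilon_r$, which has mean zero and variance

$$O\left(n \left(\sum_{s=1}^{n} \|c_{sn}\|^2\right)^2 + \sum_{t=3}^{n} \sum_{u=2}^{t-1} \left(\sum_{s=1}^{u-1} \|c_{u-s,n}\|^2 \sum_{s=1}^{u-1} \|c_{t-s,n}\|^2\right)\right), \tag{36}$$

following the analysis in Robinson (1995a, p. 1646) and Lobato (1999, pp. 150-151). By Theorem 2 of Robinson (1995b), $\|\theta_j\| = O\left((m/j)^{d_a+d_p-2\delta}\right)$ such that $\|c_{sn}\|$ is bounded by

$$\|c_{sn}\| = O\left(\frac{1}{n\sqrt{m}} \sum_{j=1}^{m} \|\theta_j\|\right) = O\left(\frac{m^{d_a+d_p-2\delta-1/2}}{n} \sum_{j=1}^{m} j^{2\delta-d_a-d_p}\right) = O\left(\frac{\sqrt{m}}{n}\right).$$

Next, define the functions $H_a(\lambda) = \lambda^{2\delta} \operatorname{Re}\left(A_a'(\lambda) \bar{A}_p(\lambda) + A_p'(\lambda) \bar{A}_a(\lambda)\right)$ such that $\theta_j = \sum_{a=1}^{p-1} \eta_a \lambda_m^{d_a+d_p-2\delta} H_a(\lambda_j)$ and $H_a(\lambda) = O\left(\lambda^{2\delta-d_a-d_p}\right)$ as $\lambda \to 0^+$, and $H_a(\lambda)$ is differentiable with $\partial H_a(\lambda)/\partial\lambda = O\left(\lambda^{2\delta-d_a-d_p-1}\right)$ as $\lambda \to 0^+$ by Assumption 3. Now we can derive an alternative bound as

$$\|c_{sn}\| = O\left(\max_{1 \leq a \leq p-1} \frac{\lambda_m^{d_a+d_p-2\delta}}{n\sqrt{m}} \left|\sum_{j=1}^{m} H_a(\lambda_j) \cos(s\lambda_j)\right|\right)$$

$$= O\left(\max_{1 \leq a \leq p-1} \frac{\lambda_m^{d_a+d_p-2\delta}}{n\sqrt{m}} \left|\sum_{j=1}^{m-1} \left(H_a(\lambda_j) - H_a(\lambda_{j+1})\right) \sum_{k=1}^{j} \cos(s\lambda_k)\right|\right) + O\left(\max_{1 \leq a \leq p-1} \frac{\lambda_m^{d_a+d_p-2\delta}}{n\sqrt{m}} \left|H_a(\lambda_m) \sum_{j=1}^{m} \cos(s\lambda_j)\right|\right)$$

from summation by parts. Using the Mean Value Theorem it follows that $H_a(\lambda_j) - H_a(\lambda_{j+1}) = (\lambda_{j+1} - \lambda_j) \frac{\partial H_a(\lambda_j)}{\partial\lambda} = \frac{2\pi}{n} \frac{\partial H_a(\lambda_j)}{\partial\lambda}$, and the bound is

$$\|c_{sn}\| = O\left(\max_{1 \leq a \leq p-1} \frac{\lambda_m^{d_a+d_p-2\delta}}{n\sqrt{m}} \left|\sum_{j=1}^{m-1} n^{-1} \lambda_j^{2\delta-d_a-d_p-1} \sum_{k=1}^{j} \cos(s\lambda_k)\right|\right) + O\left(\max_{1 \leq a \leq p-1} \frac{\lambda_m^{d_a+d_p-2\delta}}{n\sqrt{m}} \left|\lambda_m^{2\delta-d_a-d_p} \sum_{j=1}^{m} \cos(s\lambda_j)\right|\right) = O\left(\frac{1}{s\sqrt{m}}\right)$$

using also $\left|\sum_{j=1}^{l} \cos(s\lambda_j)\right| = O(n/s)$, see Zygmund (2002, p. 2). This bound is better when $s > n/m$.

Thus, we find that

$$\sum_{s=1}^{n} \|c_{sn}\|^2 = O\left(\sum_{s=1}^{[n/m]} \frac{m}{n^2} + \sum_{s=[n/m]+1}^{n} \frac{1}{s^2 m}\right) = O\left(n^{-1}\right),$$

implying that the first term of (36) is $O(n^{-1})$. The second term of (36) is bounded by

$$O\left(n \left(\sum_{s=1}^{n} \|c_{sn}\|^2\right) \left(\sum_{s=1}^{[n/2]} s \|c_{sn}\|^2\right)\right),$$

see Robinson (1995a, pp. 1646-1647). The summand in the last sum is $O(sm/n^2 + (sm)^{-1})$. Choose the first bound when $s \leq [n/m^{2/3}]$, then the last sum is

$$O\left(\sum_{s=1}^{[n/m^{2/3}]} \frac{sm}{n^2} + \sum_{s=[n/m^{2/3}]+1}^{[n/2]} \frac{1}{sm}\right) = O\left(\frac{1}{m^{1/3}}\right),$$

and (36) $= O\left(n^{-1} + m^{-1/3}\right)$.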
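The bound $\left|\sum_{j=1}^{l} \cos(s\lambda_j)\right| = O(n/s)$ used in the proof of Lemma 4 can be checked numerically. The sketch below assumes $\lambda_j = 2\pi j/n$ and compares the partial sums with the explicit constant $n/(2s)$, which follows from the standard closed form $\left|\sum_{j=1}^{l} \cos(jx)\right| \leq 1/|\sin(x/2)|$ together with $\sin(\pi s/n) \geq 2s/n$ for $1 \leq s \leq n/2$. The function name and the values of `n` and `m` are illustrative.

```python
import numpy as np

def max_partial_cos_sum(n, s, m):
    """Return max over l = 1, ..., m of |sum_{j=1}^{l} cos(s * lam_j)|."""
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n   # Fourier frequencies lam_1, ..., lam_m
    partial = np.cumsum(np.cos(s * lam))          # partial sums for l = 1, ..., m
    return float(np.max(np.abs(partial)))

n, m = 1000, 100
# Ratio of the largest partial sum to n / (2 s), maximised over s = 1, ..., n/2.
worst_ratio = max(
    max_partial_cos_sum(n, s, m) / (n / (2.0 * s)) for s in range(1, n // 2 + 1)
)
```

A value of `worst_ratio` not exceeding one confirms that the partial sums are uniformly dominated by $n/(2s)$ in this configuration.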


References

Adenstedt, R. K. (1974), ‘On large-sample estimation of the mean of a stationary random sequence’, Annals of Statistics 2, 1095–1107.

Beran, J. (1994), Statistics for Long-Memory Processes, Chapman-Hall, New York.

Brown, B. M. (1971), ‘Martingale central limit theorems’, Annals of Mathematical Statistics 42, 59–66.

Dahlhaus, R. (1995), ‘Efficient location and regression estimation for long range dependent regression models’, Annals of Statistics 23, 1029–1047.

Doornik, J. A. (2001), Ox: An Object-Oriented Matrix Language, 4th edn, Timberlake Consultants Press, London.

Doornik, J. A. & Ooms, M. (2001), ‘A package for estimating, forecasting and simulating arfima models: Arfima package 1.01 for Ox’, Working Paper, Nuffield College, Oxford.

Geweke, J. & Porter-Hudak, S. (1983), ‘The estimation and application of long memory time series models’, Journal of Time Series Analysis 4, 221–238.

Granger, C. W. J. & Joyeux, R. (1980), ‘An introduction to long memory time series models and fractional differencing’, Journal of Time Series Analysis 1, 15–29.

Hall, P. & Heyde, C. C. (1980), Martingale Limit Theory and its Application, Academic Press, New York.

Hannan, E. J. (1979), ‘The central limit theorem for time series regression’, Stochastic Processes and their Applications 9, 281–289.

Hassler, U., Marmol, F. & Velasco, C. (2000), ‘Residual log-periodogram inference for long-run relationships’, Forthcoming in Journal of Econometrics.

Hosking, J. R. M. (1981), ‘Fractional differencing’, Biometrika 68, 165–176.

Lobato, I. N. (1997), ‘Consistency of the averaged cross-periodogram in long memory series’, Journal of Time Series Analysis 18, 137–155.

Lobato, I. N. (1999), ‘A semiparametric two-step estimator in a multivariate long memory model’, Journal of Econometrics 90, 129–153.

Lobato, I. N. & Robinson, P. M. (1996), ‘Averaged periodogram estimation of long memory’, Journal of Econometrics 73, 303–324.


Mandelbrot, B. B. & Van Ness, J. W. (1968), ‘Fractional brownian motions, fractional noises and applications’, SIAM Review 10, 422–437.

Phillips, P. C. B. (1986), ‘Understanding spurious regressions in econometrics’, Journal of Econometrics 33, 311–340.

Robinson, P. M. (1994a), ‘Semiparametric analysis of long-memory time series’, Annals of Statistics 22, 515–539.

Robinson, P. M. (1994b), Time series with strong dependence, in C. A. Sims, ed., ‘Advances in Econometrics’, Cambridge University Press, Cambridge, pp. 47–95.

Robinson, P. M. (1995a), ‘Gaussian semiparametric estimation of long range dependence’, Annals of Statistics 23, 1630–1661.

Robinson, P. M. (1995b), ‘Log-periodogram regression of time series with long range dependence’, Annals of Statistics 23, 1048–1072.

Robinson, P. M. (1997), ‘Large-sample inference for nonparametric regression with dependent errors’, Annals of Statistics 25, 2054–2083.

Robinson, P. M. & Hidalgo, F. J. (1997), ‘Time series regression with long-range dependence’, Annals of Statistics 25, 77–104.

Robinson, P. M. & Marinucci, D. (2003), Semiparametric frequency domain analysis of fractional cointegration, in P. M. Robinson, ed., ‘Time Series With Long Memory’, Oxford University Press, Oxford, pp. 334–373.

Tsay, W. J. & Chung, C. F. (2000), ‘The spurious regression of fractionally integrated processes’, Journal of Econometrics 96, 155–182.

Velasco, C. (2001), ‘Gaussian semiparametric estimation of fractional cointegration’, Preprint, Universidad Carlos III de Madrid.

Yajima, Y. (1988), ‘On estimation of a regression model with long-memory stationary errors’, Annals of Statistics 16, 791–807.

Yajima, Y. (1991), ‘Asymptotic properties of the LSE in a regression model with long-memory stationary errors’, Annals of Statistics 19, 158–177.

Zygmund, A. (2002), Trigonometric Series, third edn, Cambridge University Press, Cambridge.


Table 1: Bias (x100) of GLS estimate for Model A.

n = 256
             m = [n^0.4] = 9                            m = [n^0.5] = 16
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0          −.7733  −.0060   .1793   .0591   .0591     .2953   .0263  −.1773   .1337   .0020
0.1             −   .4049   .1212  −.3895   .0629         −  −.0518   .1541  −.0450  −.0566
0.2             −       −   .2961   .0829   .0920         −       −   .0927   .0666  −.0435
0.3             −       −       −   .1227   .1817         −       −       −   .0648   .1498
0.4             −       −       −       −   .0347         −       −       −       −   .0270

n = 512
             m = [n^0.4] = 12                           m = [n^0.5] = 22
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0          −.2024  −.1835  −.0131  −.0095   .0542    −.0900  −.0369  −.0850  −.0244  −.0257
0.1             −   .0960  −.0028  −.1694   .1098         −   .0169   .0113   .0200   .1260
0.2             −       −   .3121  −.0062   .2058         −       −   .0622   .1402  −.1765
0.3             −       −       −   .1069   .0496         −       −       −   .0828   .0957
0.4             −       −       −       −   .1913         −       −       −       −  −.1440

n = 1024
             m = [n^0.4] = 16                           m = [n^0.5] = 32
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0          −.3978  −.0492  −.0090  −.0245   .0597    −.1952  −.1215  −.0572   .0241  −.0327
0.1             −  −.0727  −.0642  −.0984  −.0384         −  −.1049   .0279  −.0138   .0683
0.2             −       −   .1535  −.1370   .0691         −       −  −.1881  −.0478  −.0704
0.3             −       −       −   .0975   .0183         −       −       −  −.0736   .0793
0.4             −       −       −       −   .0591         −       −       −       −  −.1162

Table 2: Bias (x100) of FGLS estimate for Model A.

n = 256
             m = [n^0.4] = 9                            m = [n^0.5] = 16
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0          −.7944  −.0014   .1751   .0578   .0273     .3206  −.0159  −.1937   .1089  −.0080
0.1             −   .3735   .0490  −.3513   .0506         −  −.1036   .1480  −.0222  −.0425
0.2             −       −   .3707   .0799   .1320         −       −   .1950   .0565  −.0557
0.3             −       −       −   .1947   .2056         −       −       −   .0201   .1935
0.4             −       −       −       −   .2135         −       −       −       −   .0475

n = 512
             m = [n^0.4] = 12                           m = [n^0.5] = 22
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0          −.1949  −.1477  −.0034  −.0076   .0600    −.1094  −.0205  −.0849  −.0301  −.0145
0.1             −   .0626  −.0300  −.1943   .1075         −  −.0064  −.0021   .0200   .1476
0.2             −       −   .3309   .0133   .2500         −       −   .1177   .1292  −.1904
0.3             −       −       −   .0595   .0487         −       −       −   .0069   .1094
0.4             −       −       −       −   .1509         −       −       −       −  −.1144

n = 1024
             m = [n^0.4] = 16                           m = [n^0.5] = 32
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0          −.3073  −.0163  −.0188  −.0242   .0831    −.2009  −.1301  −.0563   .0063  −.0486
0.1             −   .0170  −.0924  −.1709  −.0285         −  −.0840   .0204   .0025   .0695
0.2             −       −   .1925  −.1273   .0592         −       −  −.1846  −.0452  −.0648
0.3             −       −       −   .1410   .0204         −       −       −  −.0700   .1098
0.4             −       −       −       −   .0024         −       −       −       −  −.0464


Table 3: MSE ratio of GLS estimate for Model A.

n = 256
             m = [n^0.4] = 9                            m = [n^0.5] = 16
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .8858   .8286   .7605   .6042   .4021     .9398   .8692   .8281   .7038   .4553
0.1             −   .9064   .8340   .7468   .6343         −   .9417   .8997   .8580   .7225
0.2             −       −   .8751   .8227   .7672         −       −   .9527   .8950   .8200
0.3             −       −       −   .8532   .8581         −       −       −   .9185   .9092
0.4             −       −       −       −   .8578         −       −       −       −   .8982

n = 512
             m = [n^0.4] = 12                           m = [n^0.5] = 22
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .9212   .8944   .7923   .6594   .4268     .9459   .9099   .8538   .7457   .4942
0.1             −   .9050   .9000   .7855   .6714         −   .9587   .9071   .8954   .7519
0.2             −       −   .9035   .8934   .8185         −       −   .9674   .9200   .8681
0.3             −       −       −   .8877   .8900         −       −       −   .9286   .9258
0.4             −       −       −       −   .8817         −       −       −       −   .9184

n = 1024
             m = [n^0.4] = 16                           m = [n^0.5] = 32
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .9550   .9079   .8373   .7032   .4752     .9827   .9541   .9007   .7810   .5350
0.1             −   .9494   .9050   .8277   .7218         −   .9770   .9386   .9251   .8089
0.2             −       −  .93425   .9159   .8410         −       −   .9710   .9481   .9023
0.3             −       −       −   .8995   .9016         −       −       −   .9500   .9336
0.4             −       −       −       −   .8820         −       −       −       −   .9117

Table 4: MSE ratio of FGLS estimate for Model A.

n = 256
             m = [n^0.4] = 9                            m = [n^0.5] = 16
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .8287   .7737   .7085   .5597  .37413     .8930   .8236   .7826   .6639   .4314
0.1             −   .8448   .7725   .6939   .5829         −   .8910   .8535   .8089   .6755
0.2             −       −   .8050   .7620   .7052         −       −   .9035   .8456   .7709
0.3             −       −       −   .7793   .7882         −       −       −   .8537   .8428
0.4             −       −       −       −   .7608         −       −       −       −   .8314

n = 512
             m = [n^0.4] = 12                           m = [n^0.5] = 22
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .8693   .8395   .7441   .6148   .3996     .9131   .8742   .8179   .7108   .4693
0.1             −   .8448   .8372   .7371   .6239         −   .9278   .8700   .8503   .7104
0.2             −       −   .8479   .8266   .7529         −       −   .9212   .8808   .8195
0.3             −       −       −   .8264   .8093         −       −       −   .8847   .8723
0.4             −       −       −       −   .7972         −       −       −       −   .8683

n = 1024
             m = [n^0.4] = 16                           m = [n^0.5] = 32
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .9127   .8650   .7866   .6617   .4474     .9561   .9231   .8760   .7494   .5104
0.1             −   .8952   .8585   .7873   .6749         −   .9479   .9090   .8855   .7753
0.2             −       −   .8848   .8574   .7866         −       −   .9405   .9173   .8678
0.3             −       −       −   .8469   .8367         −       −       −   .9202   .8947
0.4             −       −       −       −   .8203         −       −       −       −   .8746


Table 5: Bias (x100) of GLS estimate for Model B.

n = 256
             m = [n^0.4] = 9                            m = [n^0.5] = 16
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0          −.2044   .1635   .0611  −.1900   .0728     .4680  −.1765  −.1787   .1638   .0220
0.1             −  −.1195  −.1304   .1619   .0522         −   .0037  −.0905  −.0576   .1679
0.2             −       −  −.4386   .0352  −.1474         −       −  −.0469   .2834  −.2454
0.3             −       −       −  −.1338   .2467         −       −       −   .1265  −.0978
0.4             −       −       −       −  −.0978         −       −       −       −  −.2504

n = 512
             m = [n^0.4] = 12                           m = [n^0.5] = 22
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .0457  −.0221   .1051  −.1531   .0083     .0678  −.2716  −.0721   .0289   .0365
0.1             −   .0220  −.0913   .0911   .0530         −   .1370   .0143  −.0653   .1247
0.2             −       −  −.3577  −.2223  −.1993         −       −  −.1206  −.0003  −.0861
0.3             −       −       −  −.0351   .3337         −       −       −   .1794  −.0876
0.4             −       −       −       −  −.0613         −       −       −       −  −.5301

n = 1024
             m = [n^0.4] = 16                           m = [n^0.5] = 32
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .0908  −.0696  −.0998  −.0120   .0203     .0491  −.0093  −.0748  −.0161  −.0135
0.1             −  −.0906  −.0527  −.0502   .0251         −   .1616   .0710  −.0687   .0174
0.2             −       −  −.3233  −.0296  −.1125         −       −  −.2311  −.1081  −.0775
0.3             −       −       −   .0690   .2212         −       −       −   .1561  −.0833
0.4             −       −       −       −   .2007         −       −       −       −  −.3223

Table 6: Bias (x100) of FGLS estimate for Model B.

n = 256
             m = [n^0.4] = 9                            m = [n^0.5] = 16
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0          −.2263   .0342   .0154  −.2082   .0748     .4661  −.2301  −.1878   .1502   .0067
0.1             −  −.1463  −.1412   .2009   .0675         −  −.0124  −.0525  −.0417   .1743
0.2             −       −  −.4286  −.0571  −.2076         −       −   .0460   .3203  −.2713
0.3             −       −       −  −.1530   .3159         −       −       −   .2171  −.1449
0.4             −       −       −       −  −.2354         −       −       −       −  −.1875

n = 512
             m = [n^0.4] = 12                           m = [n^0.5] = 22
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .2877  −.0495   .1330  −.1251  −.0045     .0552  −.2613  −.0942  −.0040   .0202
0.1             −   .0223  −.0824   .1044   .0610         −   .1756   .0484  −.0707   .1187
0.2             −       −  −.3494  −.1992  −.2081         −       −  −.0588  −.0229  −.0889
0.3             −       −       −   .0305   .4237         −       −       −   .1858  −.1330
0.4             −       −       −       −  −.0911         −       −       −       −  −.5193

n = 1024
             m = [n^0.4] = 16                           m = [n^0.5] = 32
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .1009 −.04255  −.0984  −.0057   .0297     .0297   .0260  −.1000  −.0278  −.0037
0.1             −  −.0780  −.0156  −.0353   .0172         −   .1708   .0915  −.0481   .0172
0.2             −       −  −.4002  −.0451  −.1011         −       −  −.2167  −.1124  −.0743
0.3             −       −       −  −.0015   .2307         −       −       −   .1570  −.0920
0.4             −       −       −       −   .1773         −       −       −       −  −.3276


Table 7: MSE ratio of GLS estimate for Model B.

n = 256
             m = [n^0.4] = 9                            m = [n^0.5] = 16
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .8782   .8500   .7665   .6340   .3966     .9135   .8589   .8212   .7026   .4690
0.1             −   .8903   .8317   .7777   .6326         −   .9240   .8744   .8403   .7057
0.2             −       −   .8817   .8305   .7646         −       −   .9238   .8859   .8334
0.3             −       −       −   .8598   .8311         −       −       −   .8970   .8553
0.4             −       −       −       −   .8540         −       −       −       −   .8810

n = 512
             m = [n^0.4] = 12                           m = [n^0.5] = 22
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .8863   .8912   .8051   .6805   .4308     .9531   .8966   .8478   .7550   .5085
0.1             −   .9139   .8746   .8092   .6478         −   .9336   .9048   .8557   .7446
0.2             −       −   .9220   .8612   .8125         −       −   .9464   .9016   .8787
0.3             −       −       −   .9272   .8929         −       −       −   .9380   .8961
0.4             −       −       −       −   .8676         −       −       −       −   .8992

n = 1024
             m = [n^0.4] = 16                           m = [n^0.5] = 32
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .9217   .8971   .8479   .7196   .4695     .9826   .9504   .8963   .8092   .5561
0.1             −   .9292   .8889   .8366   .7031         −   .9767   .9378   .8871   .7675
0.2             −       −   .9143   .8963   .8374         −       −   .9761   .9410   .9026
0.3             −       −       −   .9386   .9052         −       −       −   .9518   .9313
0.4             −       −       −       −   .9238         −       −       −       −   .9551

Table 8: MSE ratio of FGLS estimate for Model B.

n = 256
             m = [n^0.4] = 9                            m = [n^0.5] = 16
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .8266   .7917   .7093   .5863   .3676     .8857   .8287   .7889   .6725   .4469
0.1             −   .8356   .7706   .7201   .5899         −   .8944   .8344   .8002   .6737
0.2             −       −   .8134   .7671   .7069         −       −   .8864   .8474   .7994
0.3             −       −       −   .7843   .7501         −       −       −   .8544   .8112
0.4             −       −       −       −   .7621         −       −       −       −   .8265

n = 512
             m = [n^0.4] = 12                           m = [n^0.5] = 22
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .8395   .8340   .7511   .6301   .4037     .9164   .8590   .8151   .7277   .4845
0.1             −   .8625   .8172   .7550   .6062         −   .9071   .8626   .8265   .7140
0.2             −       −   .8586   .8037   .7484         −       −   .9143   .8622   .8404
0.3             −       −       −   .8547   .8305         −       −       −   .9022   .8520
0.4             −       −       −       −   .7897         −       −       −       −   .8543

n = 1024
             m = [n^0.4] = 16                           m = [n^0.5] = 32
d_u \ d_x       0     0.1     0.2     0.3     0.4         0     0.1     0.2     0.3     0.4
0           .8783   .8570   .7953   .6714   .4400     .9547   .9216   .8631   .7750   .5352
0.1             −   .8858   .8346   .7891   .6570         −   .9539   .9043   .8531   .7366
0.2             −       −   .8614   .8441   .7830         −       −   .9509   .9109   .8623
0.3             −       −       −   .8773   .8370         −       −       −   .9219   .8994
0.4             −       −       −       −   .8542         −       −       −       −   .9166


Chapter 3

Optimal Residual Based Tests for Fractional Cointegration and Exchange Rate Dynamics

Published in Journal of Business and Economic Statistics, 2004, vol. 22, pp. 331–345


Morten Ørregaard Nielsen∗

Abstract

We propose a Lagrange Multiplier test of the null hypothesis of cointegration in fractionally cointegrated models. The test statistic utilizes fully modified residuals to cancel the endogeneity and serial correlation biases, and we show that standard asymptotic properties apply under the null and under local alternatives. With i.i.d. Gaussian errors the asymptotic Gaussian power envelope of all (unbiased) tests is achieved by the one-sided (two-sided) test. The finite sample properties are illustrated by a Monte Carlo study. In an application to the dynamics among exchange rates for seven major currencies against the US dollar, mixed evidence of the existence of a cointegrating relation is found.

JEL Classification: C12, C22, C32

Keywords: Cointegration Test, Fully Modified Estimation, Nonstationarity, Optimal Test, Power Envelope

∗This paper has benefitted from comments by Niels Haldrup, Michael Jansson, Peter Phillips, Katsumi Shimotsu, seminar participants at the University of Aarhus and Yale University, and two anonymous referees. In particular, Jörg Breitung and Søren Johansen provided many very helpful and constructive suggestions which improved the paper significantly.


1 Introduction

In this paper we propose a Lagrange Multiplier (LM) test of the null hypothesis of cointegration in fractionally cointegrated models. In nonstationary and possibly cointegrated models, estimators and test statistics are often found to have nonstandard distributional properties when the null is nested in the autoregressive alternatives typically considered in the literature. In contrast, we show that by embedding the model of interest in a general I(d) framework, the LM test statistic regains the standard distributional properties and uniform optimality properties well known from simpler models.

The analysis of cointegration has been a very active area of research in the econometrics and time series literature in the last 20 years, starting with the seminal contributions by Granger (1981) and Engle & Granger (1987). Most of this work has considered the I(1) − I(0) type of cointegration in which linear combinations of two or more I(1) variables are I(0). A process is labelled I(0) if it is covariance stationary and has spectral density that is bounded and bounded away from zero at the origin, and I(1) if the first differenced series is I(0). If $y_t$ and $x_t$ are I(1), and hence in particular nonstationary (unit root) processes, but there exists a process $e_t$ which is I(0) and a fixed $\beta$ such that

$$y_t = \beta' x_t + e_t, \tag{1}$$

then $y_t$ and $x_t$ are said to be cointegrated. Thus, the nonstationary series move together in the sense that a linear combination of them is stationary and a common stochastic trend is shared. Testing for cointegration in this framework amounts to testing stationarity of the unobserved residual process $e_t$ against a unit root alternative, see e.g. Shin (1994), Jansson (2004), and the references therein.

The above notion of cointegration is based on the knife-edge distinction between I(1) and I(0) processes. However, many economic and financial time series exhibit strong persistence without exactly possessing unit roots, for some recent evidence see e.g. Diebold & Rudebusch (1989), Baillie & Bollerslev (1994), Baillie (1996), Lobato & Velasco (2000), and Marinucci & Robinson (2001). This has led to the consideration of the class of fractionally integrated processes, which is more general than I(1) and still admits a criterion for linear co-movement of series. Thus, a process is fractionally integrated of order d, denoted I(d), if its d'th difference is I(0). Here, d may be any real number, i.e. d = 0 or d = 1 are special cases. For a precise statement, $x_t$ is I(d) if

$$\Delta^{d} x_t = u_t I(t \geq 1) = u_t^{\#}, \tag{2}$$

or equivalently, inverting (2),

$$x_t = \Delta^{-d} u_t^{\#}, \tag{3}$$

defining $u_t^{\#} = u_t I(t \geq 1)$, where $u_t$ is I(0), $I(\cdot)$ denotes the indicator function, and the fractional difference operator $\Delta^{d} = (1 - L)^{d}$ is defined by its binomial expansion

$$(1 - L)^{d} = \sum_{j=0}^{\infty} \frac{\Gamma(j - d)}{\Gamma(-d) \Gamma(j + 1)} L^{j}, \qquad \Gamma(z) = \int_0^{\infty} t^{z-1} e^{-t} dt, \tag{4}$$

in the lag operator $L$ ($L x_t = x_{t-1}$). With the definition (2) or (3), $x_t$ is a type II fractionally integrated process, which is nonstationary for all d but asymptotically stationary for d < 1/2, see Marinucci & Robinson (1999). Following the original idea by Granger (1981), a natural generalization of the cointegration concept is to assume that the raw series are I(d) and that a certain linear combination is I(d − b), with d ≥ b positive real numbers. This is denoted CI(d, b).
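The binomial expansion (4) and the type II definition (2) translate directly into a short computation. The following is a minimal sketch, not the author's code: the function names are hypothetical, the coefficients are generated by the recursion $\pi_0 = 1$, $\pi_j = \pi_{j-1} (j - 1 - d)/j$, which follows from $\Gamma(z + 1) = z\Gamma(z)$ applied to (4), and pre-sample values are set to zero as in the type II definition.

```python
import numpy as np

def frac_diff_weights(d, n):
    """First n coefficients pi_j of (1 - L)^d, with pi_j = Gamma(j-d) / (Gamma(-d) Gamma(j+1))."""
    w = np.empty(n)
    w[0] = 1.0
    for j in range(1, n):
        w[j] = w[j - 1] * (j - 1 - d) / j   # ratio pi_j / pi_{j-1} implied by (4)
    return w

def frac_diff(x, d):
    """Apply Delta^d to x_1, ..., x_n, treating x_t = 0 for t <= 0 (type II truncation)."""
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, len(x))
    # Element t is sum_{j=0}^{t} pi_j * x_{t-j}.
    return np.array([np.dot(w[: t + 1], x[t::-1]) for t in range(len(x))])

rng = np.random.default_rng(0)
u = rng.standard_normal(100)
x = frac_diff(u, -0.4)           # x_t = Delta^{-d} u_t^# as in (3), with d = 0.4
u_back = frac_diff(x, 0.4)       # Delta^{d} x_t recovers u_t as in (2)
```

With d = 1 the weights reduce to (1, −1, 0, 0, ...), so `frac_diff` is the ordinary first difference with $x_0 = 0$, and applying $\Delta^{-d}$ followed by $\Delta^{d}$ returns the original series, which is the equivalence between (2) and (3).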

To fix ideas, consider the simple system

$$\Delta^{d-b+\theta} \left(y_{1t} - \beta' y_{2t}\right) = u_{1t}^{\#}, \tag{5}$$

$$\Delta^{d} y_{2t} = u_{2t}^{\#}, \tag{6}$$

where $u_t = (u_{1t}, u_{2t}')'$ is I(0). In this model $y_t$ is CI(d, b − θ) and the cointegration vector is given by $(1, -\beta')$. Clearly, this allows the study of co-movement among persistent series much more generally than in the standard unit root based I(1) − I(0) cointegration framework. In the present paper, we assume that d and b are known a priori and satisfy d ≥ b ≥ 3/4 + ε for some ε > 0.

We wish to test the hypothesis $H_0: \theta = 0$; with d = b = 1 this can be seen as an alternative to testing for stationarity of the residuals in (1). If the null hypothesis is changed slightly in this setup, the properties of the process $y_t$ do not change as dramatically as in the standard cointegration model in which the relation (1) is either perfectly cointegrating, i.e. CI(1, 1), or spurious. A notion of near-cointegration does exist in the unit root based I(1) − I(0) cointegration literature, which offers some smoothing of the gap between CI(1, 1) and spurious regression, e.g. Jansson & Haldrup (2002). However, the test statistics in that framework still have nonstandard distributional properties.

We show that in our fractional integration framework much more desirable properties are obtained than in (1). Our test can be considered an extension of the univariate LM tests in Robinson (1991, 1994), Agiakloglou & Newbold (1994), and Tanaka (1999), among others, who considered testing for a unit root in a fractional integration framework, i.e. testing on the parameter d in (2) in the frequency and time domains. They showed that their tests have standard asymptotic distributions and, under Gaussianity, that their tests enjoy optimality properties. Simulations in Tanaka (1999) showed that, in finite samples, the time domain tests are superior to Robinson's (1994) frequency domain LM test with respect to both size and power.

Presumably there exist Wald and likelihood ratio versions of our LM test, which have the same asymptotic properties as our test even though their finite sample properties may differ, as shown by Nielsen (2004) in a general univariate model. However, we consider only the time domain LM test for fractional cointegration with the usual computational motivation that the model only needs to be estimated under the null hypothesis. As we shall see below, in the important special case d = b = 1 the computation of the LM test statistic does not require any fractional differencing, and indeed all that is needed in this case are the residuals from a fully modified regression which can be obtained from readily available computer software.

We show that the likelihood theory in the time domain is tractable and that the ML estimator of the cointegrating vector $\beta$, which is required to compute the test statistic, reduces to a version of the fully modified least squares estimator of Phillips & Hansen (1990) and Phillips (1991), see also Kim & Phillips (2001) for a fractional cointegration version. We then show that the LM test can be calculated using the residuals from the fully modified regression and establish the desirable distributional properties and optimality properties of the test. In particular, the test statistic is consistent and asymptotically normal or chi-squared distributed, and under the additional assumption of Gaussianity the test is locally most powerful. Indeed, we show that in the special case with i.i.d. Gaussian errors, the asymptotic Gaussian power envelope of all (unbiased) tests is achieved by the one-sided (two-sided) version of our test, i.e. the one-sided (two-sided) test is asymptotically uniformly most powerful among all (unbiased) tests. In a simulation study we find that the finite sample rejection frequencies are reasonable but well below the asymptotic local power for samples of size n = 200, and much closer to the asymptotic local power for n = 500.

Our new methodology is applied to the analysis of exchange rate dynamics following Baillie & Bollerslev (1989, 1994). Previous studies have focused on the estimation of the cointegration vector and the memory parameter of the equilibrium errors, but no formal testing of the hypothesis of fractional cointegration has been done. We concentrate on testing for the presence of (fractional) cointegration with various specifications of d and b. Our findings are not decisive, but we do find some evidence of cointegration among a system of exchange rates for seven major currencies against the US Dollar. In particular, we do not reject at the 1% level (against fractional alternatives) that the exchange rates can be described by a standard I(1) − I(0) cointegration model when the errors (i.e. $u_{1t}$ and $u_{2t}$ in (5) and (6) above) are allowed to follow autoregressive processes of order one.

The remainder of the paper is laid out as follows. Section 2 sets up the model of fractional cointegration. In section 3 we consider the estimation of the cointegrating vector, derive the LM test statistic, and establish the desirable distributional properties. In section 4 we derive the asymptotic Gaussian power envelopes for the one-sided and two-sided testing problems and show that they coincide with the local asymptotic power functions of the one-sided and two-sided LM tests. Section 5 presents the results of the Monte Carlo study and in section 6 we provide the empirical application to exchange rate dynamics. Section 7 offers some concluding remarks. All proofs are collected in the appendix.


2 A Model of Fractional Cointegration

Suppose we observe the K-vector time series $y_t$, t = 1, 2, ..., n, which we partition as $y_{1t}$ (scalar) and $y_{2t}$ ((K − 1)-vector). We consider a triangular model of fractional cointegration in the spirit of the Phillips (1991) triangular system. Thus, let $y_t$ be generated by the fractionally cointegrated system

$$y_{1t} = \beta' y_{2t} + z_t, \quad t = 1, 2, ..., \tag{7}$$

$$\Delta^{d-b+\theta} z_t = u_{1t}^{\#}, \quad t = 1, 2, ..., \tag{8}$$

$$\Delta^{d} y_{2t} = u_{2t}^{\#}, \quad t = 1, 2, ..., \tag{9}$$

where $z_t$ is the (unobserved) deviation from the cointegrating relation and $u_t = (u_{1t}, u_{2t}')'$ is an error component. We allow the error components $u_{1t}$ and $u_{2t}$ to be contemporaneously correlated and possibly weakly dependent, cf. Assumption 1 below.

The system (7)-(9) generalizes the standard triangular cointegration model. The series share fractionally integrated stochastic trends of orders I(d) and I(d − b), and the linear combination $(1, -\beta')$ eliminates the most persistent one. Equation (7) can be regarded as an equilibrium relationship between the I(d) components of $y_t$. Under the null, θ = 0, the deviations from equilibrium constitute an I(d − b) process, and when d = b the deviations are only weakly dependent, so this is a case of special interest. The model could be extended to multidimensional cointegrating relationships as in Jeganathan (1999), where the estimation of the cointegration rank and cointegrating vectors is of interest. However, most empirical studies consider a single cointegrating relation among two or more variables, e.g. Cheung & Lai (1993), Baillie & Bollerslev (1994), Dueker & Startz (1998), Marinucci & Robinson (2001), and Kim & Phillips (2001). Thus, we consider only the case of a single cointegrating relationship in this paper to keep focus on optimal testing of hypotheses on θ.

The model is assumed to satisfy the following assumption on the error process.

Assumption 1 We consider four typical specifications for the error component $u_t$. In each case, the innovations $e_t = (e_{1t}, e_{2t}')' \sim$ i.i.d. $(0, \Sigma)$ with finite fourth moment and $\Sigma$ is a positive definite matrix which we partition conformably as

$$\Sigma = \begin{bmatrix} \sigma_{11}^2 & \sigma_{21}' \\ \sigma_{21} & \Sigma_{22} \end{bmatrix}. \tag{10}$$

0. $u_t \sim$ i.i.d. or equivalently $u_t = e_t$.

1. $u_{1t}$ follows the stationary AR($p$) process

$$g(L) u_{1t} = e_{1t}, \quad t = 1, 2, ..., \tag{11}$$

and $u_{2t} = e_{2t}$.


Chapter 3

2. $u_{1t} = e_{1t}$, and $u_{2t}$ follows the $(K-1)$-dimensional stationary VAR($p$) process

$$G(L) u_{2t} = e_{2t}, \quad t = 1, 2, .... \tag{12}$$

3. $u_t$ follows the $K$-dimensional block diagonal stationary VAR($p$) process

$$g(L) u_{1t} = e_{1t}, \quad t = 1, 2, ..., \tag{13}$$

$$G(L) u_{2t} = e_{2t}, \quad t = 1, 2, .... \tag{14}$$

In cases 1-3, $g(z)$ and $G(z)$ are lag polynomials of order $p$ with coefficients gathered in $\gamma_1$ and $\gamma_2$, respectively, and $G(1)$ has full rank (no cointegration among the components of $y_{2t}$).

In the following we write $A(z) = \operatorname{diag}(g(z), G(z))$ as shorthand for the lag polynomial in Assumption 1.3. It would be straightforward to extend Assumption 1.3 to $A(L)u_t = e_t$, for a general lag polynomial $A(z)$ of order $p$, where $A(1)$ has full rank. Applying the formulae in Hosking (1980), the results in Lemma 1 and the following theorems could be extended to cover this more general case. However, the structure imposed by Assumption 1.3 seems relevant and its interpretation is natural.

In our model the constants $d$ and $b$ are prespecified. In particular, we assume that $d \geq b \geq 3/4 + \varepsilon$ for some $\varepsilon > 0$ such that the series are nonstationary and cointegration reduces the integration order by more than 3/4. Assuming that $b$ is known a priori is natural as it effectively specifies the null for our LM test and thus, according to the LM principle, there is no need to estimate $b$. If $d$ is not known a priori it could be estimated in a preliminary step as in, e.g., Cheung & Lai (1993), Baillie & Bollerslev (1994), Marinucci & Robinson (2001), and Kim & Phillips (2001), although this may change the limiting distributions below. Efficient procedures have been developed to estimate $d$ in fractionally integrated time series models, e.g. Sowell (1992) (exact ML) and Tanaka (1999) (conditional ML).

Our objective is to test the hypothesis

$$H_0 : \theta = 0 \tag{15}$$

against $H_1 : \theta > 0$ or $H_2 : \theta \neq 0$ in the model (7)-(9). In particular, $d = b = 1$ generates a standard $I(1)-I(0)$ cointegrated system under the null, so this is a test of the null of cointegration in the usual sense, but the fractional alternatives against which the test is directed are new. Thus, a test of (15) can be considered an alternative to testing stationarity of the residuals in (1), which has been standard in the literature, see e.g. Shin (1994), Jansson (2004), and the references therein. Another important case, for $d \geq 1.25$ and some small user-chosen $\varepsilon > 0$, is the one-sided test of (15) with $b = d - 1/2 + \varepsilon$, i.e. $d - b = 1/2 - \varepsilon$, which is a test for the existence of an (asymptotically) stationary cointegrating relation against the alternative that no stationary cointegrating relation exists (though a nonstationary but mean-reverting cointegrating relation with $1/2 \leq d - b < 1$ may still exist). Finally, for $d \geq 1$ and some small user-chosen $\varepsilon > 0$, it is of interest to conduct a one-sided test of (15) with $b = d - 1/4 + \varepsilon$, i.e. $d - b = 1/4 - \varepsilon$, as a border case for square integrability of the spectral density of the equilibrium errors and asymptotic normality of the autocovariances of the equilibrium errors, see e.g. Fox & Taqqu (1986).

Choosing $d = b = 1$ also suggests applying a test of (15) as a valuable diagnostic tool in a standard $I(1)-I(0)$ cointegration analysis. In this context, rejecting (15) should be taken either as evidence of a drastically misspecified dynamic structure or as a suggestion to employ an actual fractional cointegration analysis. Thus, the test could be thought of as a general test for misspecification of the model. If, for example, $y_{1t}$ and $y_{2t}$ are related by some complicated nonlinear filter and a linear model is imposed, then it is plausible that long-range dependence could be introduced in the residuals as a result of this misspecification.

3 Testing Fractional Cointegration

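The log-likelihood function of the model (7)-(9) under Assumption 1.3 (the most general case) and Gaussianity of the errors is

$$L(\theta, \beta, \Sigma, \gamma) = -\frac{n}{2}\ln|\Sigma| - \frac{1}{2}\sum_{t=1}^{n} \begin{pmatrix} g(L)\Delta^{d-b+\theta} z_t \\ G(L)\Delta^{d} y_{2t} \end{pmatrix}' \Sigma^{-1} \begin{pmatrix} g(L)\Delta^{d-b+\theta} z_t \\ G(L)\Delta^{d} y_{2t} \end{pmatrix} \tag{16}$$

bearing in mind the truncation in our definition of fractionally integrated processes, e.g. $G(L)\Delta^{d} y_{2t} = e_{2t}^{\#}$ by (9) and (14). The log-likelihood in (16) is equal to the sum of the marginal log-likelihood

$$-\frac{n}{2}\ln|\Sigma_{22}| - \frac{1}{2}\sum_{t=1}^{n} \left(G(L)\Delta^{d} y_{2t}\right)' \Sigma_{22}^{-1}\, G(L)\Delta^{d} y_{2t} \tag{17}$$

and the conditional log-likelihood

$$-\frac{n}{2}\ln\sigma_{1.2}^2 - \frac{1}{2\sigma_{1.2}^2}\sum_{t=1}^{n} \left( g(L)\Delta^{d-b+\theta}\left(y_{1t} - \beta' y_{2t}\right) - \sigma_{21}'\Sigma_{22}^{-1} G(L)\Delta^{d} y_{2t} \right)^2, \tag{18}$$

where $\sigma_{1.2}^2 = \sigma_{11}^2 - \sigma_{21}'\Sigma_{22}^{-1}\sigma_{21}$ is the variance of $e_{1.2t} = e_{1t} - \sigma_{21}'\Sigma_{22}^{-1} e_{2t}$, which is $e_{1t}$ centered about its mean conditional on $e_{2t}$. The asymptotic results derived later impose only Assumption 1 on the error process. Gaussianity is not necessary for most of our results and is used only to choose a likelihood function and to derive optimality properties.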

From the conditional likelihood (18), the MLE of $\beta$ under the null, $\theta = 0$, is recognized to be the NLS estimator in the augmented regression

$$\Delta^{d-b} y_{1t} = \beta'\Delta^{d-b} y_{2t} + (1 - g(L))\Delta^{d-b}\left(y_{1t} - \beta' y_{2t}\right) + c' G(L)\Delta^{d} y_{2t} + e_{1.2t}, \tag{19}$$

see Phillips & Loretan (1991) for a discussion of the equivalent estimator in the standard $I(1)-I(0)$ cointegration framework. Presumably, the lagged equilibrium errors in (19) could be replaced by leads of $\Delta^{d} y_{2t}$, as demonstrated by Saikkonen (1991) in the standard cointegration framework, and the resulting regression could be estimated by OLS.

Under Assumption 1.2 where $g(z) = 1$, i.e. when there is no autoregressive term in the equilibrium errors, the estimation of (19) reduces to OLS on

$$\Delta^{d-b} y_{1t} = \beta'\Delta^{d-b} y_{2t} + \sum_{k=0}^{p} c_k \Delta^{d} y_{2t-k} + e_{1.2t}. \tag{20}$$

This simplification is even stronger under Assumption 1.0 where $p = 0$ in (20) and the lagged fractionally differenced $y_{2t}$ disappear. The simplification (20) is especially useful in many applications where cointegration is a result of rational expectations theory, i.e. that deviations from equilibrium in time $t$ should be unpredictable, based on information up to time $t - 1$, which in our framework implies $d = b$ and $g(z) = 1$.
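As a minimal sketch of the regression (20), assume the bivariate case ($K = 2$) with $d = b = 1$ and $p = 0$, so that no fractional differencing is needed and $y_{1t}$ is regressed on $y_{2t}$ and $\Delta y_{2t}$ without an intercept. The function name `fm_ols` and the hand-written normal equations are illustrative assumptions, not the implementation behind the results reported in this paper.

```python
def fm_ols(y1, y2):
    """OLS of y1_t on y2_t and (y2_t - y2_{t-1}), no intercept (d = b = 1, p = 0)."""
    # Truncation convention: y2_0 = 0, so the first difference at t = 1 is y2_1.
    dy2 = [y2[0]] + [y2[t] - y2[t - 1] for t in range(1, len(y2))]
    # Cross products for the two regressors x = y2 and w = dy2.
    sxx = sum(x * x for x in y2)
    sww = sum(w * w for w in dy2)
    sxw = sum(x * w for x, w in zip(y2, dy2))
    sxy = sum(x * y for x, y in zip(y2, y1))
    swy = sum(w * y for w, y in zip(dy2, y1))
    # Solve the 2 x 2 normal equations by Cramer's rule.
    det = sxx * sww - sxw * sxw
    beta = (sxy * sww - swy * sxw) / det
    c = (swy * sxx - sxy * sxw) / det
    # Equilibrium residuals y1 - beta * y2 and the residuals of the full regression.
    e1 = [y - beta * x for y, x in zip(y1, y2)]
    e12 = [e - c * w for e, w in zip(e1, dy2)]
    return beta, c, e1, e12
```

The second set of residuals removes the part of the equilibrium error that is explained by the innovations to $y_{2t}$, which is the correction for endogeneity that the augmented regression is designed to deliver.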

Equivalently, (20) is OLS in the (infeasible) regression

$$\Delta^{d-b} y_{1t}^{*} = \beta'\Delta^{d-b} y_{2t} + e_{1.2t}, \tag{21}$$

where $y_{1t}^{*} = y_{1t} - \sigma_{21}'\Sigma_{22}^{-1}\sum_{k=0}^{p}\Delta^{b} y_{2t-k}$. This is the fully modified least squares method of Phillips & Hansen (1990) and Phillips (1991), which was developed for fractional cointegration by Kim & Phillips (2001). In contrast to our restrictions on $d$ and $b$, Kim & Phillips (2001) require $2d - b > 1$, $d \geq 1$ in their fully modified method and further that $b \geq 1$ in the likelihood analysis of their model. Thus, Kim & Phillips (2001) limit the strength of the cointegrating relation by bounding $b < 2d - 1$ from above, and in particular they exclude the $CI(1,1)$ case. We assume at least $b \geq 3/4 + \varepsilon$ for some $\varepsilon > 0$ in our analysis, since our estimation problem under the null has been transformed into a regression between $I(b)$ processes with $I(0)$ errors, (19)-(20). Thus, the necessity of at least $b > 1/2$ becomes clear, since otherwise the estimator of $\beta$ becomes inconsistent as demonstrated by e.g. Marinucci & Robinson (2001, p. 231).

Note that if OLS is applied to (7) directly, which has often been the case in the literature, see e.g. Cheung & Lai (1993) or Baillie & Bollerslev (1994), it introduces a bias unless $\sigma_{21} = 0$ and $g(z) = 1$. Indeed, if $\sigma_{21} = 0$ and $g(z) = 1$ hold, $y_{2t}$ is strictly exogenous and inference on $\theta$ (and estimation of the parameter $\beta$) will depend only on the part of the likelihood attributed to (7). In particular, the MLE of $\beta$ reduces to OLS on (7) and we can apply the univariate methods of Robinson (1994) and Tanaka (1999). This is not the case when $\sigma_{21} \neq 0$ or $g(z) \neq 1$ because of the well known endogeneity and serial correlation biases, see e.g. Phillips (1991).

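Returning to the full model, the normalized score statistic is found by differentiating (16) or (18) with respect to $\theta$ and evaluating the resulting expression under the null,

$$S_n = \frac{1}{\sqrt{n}} \left.\frac{\partial L(\theta, \beta, \Sigma, \gamma)}{\partial\theta}\right|_{\theta=0,\,\beta=\hat\beta,\,\Sigma=\hat\Sigma,\,\gamma=\hat\gamma} = \frac{-1}{\sqrt{n}\,\hat\sigma_{1.2}^2} \sum_{t=1}^{n} \left( \ln(\Delta)\left(\hat g(L)\Delta^{d-b}(y_{1t} - \hat\beta' y_{2t})\right) \right) \left( \hat g(L)\Delta^{d-b}(y_{1t} - \hat\beta' y_{2t}) - \hat c'\hat G(L)\Delta^{d} y_{2t} \right), \tag{22}$$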


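where $\hat g(z)$ and $\hat G(z)$ are evaluated at $\hat\gamma_1$ and $\hat\gamma_2$, respectively. Using that $\ln(1 - z) = -\sum_{j=1}^{\infty} j^{-1} z^{j}$ and defining the fully modified residuals under $\theta = 0$ as

$$\hat e_{1.2t} = \hat g(L)\Delta^{d-b}(y_{1t} - \hat\beta' y_{2t}) - \hat c'\hat G(L)\Delta^{d} y_{2t} \tag{23}$$

and

$$\hat e_{1t} = \hat g(L)\Delta^{d-b}(y_{1t} - \hat\beta' y_{2t}), \tag{24}$$

the score can be written more compactly as

$$S_n = \frac{1}{\sqrt{n}\,\hat\sigma_{1.2}^2} \sum_{t=1}^{n}\sum_{j=1}^{t-1} j^{-1}\hat e_{1t-j}\hat e_{1.2t} = \frac{\sqrt{n}}{\hat\sigma_{1.2}^2} \sum_{j=1}^{n-1} j^{-1}\left(\hat C_{11}(j) - \hat c'\hat C_{21}(j)\right) = \sqrt{n}\sum_{j=1}^{n-1} j^{-1} e_1'\hat\Sigma^{-1}\hat C(j) e_1, \tag{25}$$

where $\hat C_{ab}(j) = n^{-1}\sum_{t=j+1}^{n}\hat e_{at}\hat e_{bt-j}'$ is the estimated sample autocovariance function, $e_1 = (1, 0')'$ is the selection vector, and we used that $\sigma_{1.2}^{-2} = \Sigma^{11}$ and $-\sigma_{1.2}^{-2}\sigma_{21}'\Sigma_{22}^{-1} = \Sigma^{12}$, where $\Sigma^{ab}$ is the $(a, b)$'th block of $\Sigma^{-1}$ for $a, b = 1, 2$.

The asymptotic distribution of the score statistic $S_n$ under the null (15) is considered next.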

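Theorem 3.1 Suppose $d \geq b \geq 3/4 + \varepsilon$ for some $\varepsilon > 0$ in the model (7)-(9) and let $S_n$ be defined by (25). Under $H_0 : \theta = 0$ and Assumption 1.0,

$$S_n \xrightarrow{D} N\left(0, \frac{\pi^2}{6}\frac{\sigma_{11}^2}{\sigma_{1.2}^2}\right). \tag{26}$$

Under $H_0 : \theta = 0$ and Assumption 1.i, $S_n$ is asymptotically Gaussian with mean zero and variance

$$\frac{\pi^2}{6}\frac{\sigma_{11}^2}{\sigma_{1.2}^2} - \operatorname{vec}\left(\Sigma^{-1}e_1e_1'\Sigma\Phi_{i1}', \ldots, \Sigma^{-1}e_1e_1'\Sigma\Phi_{ip}'\right)' H_i \left(H_i'\left(\Gamma_i \otimes \Sigma^{-1}\right)H_i\right)^{-1} H_i' \operatorname{vec}\left(\Sigma^{-1}e_1e_1'\Sigma\Phi_{i1}', \ldots, \Sigma^{-1}e_1e_1'\Sigma\Phi_{ip}'\right) \tag{27}$$

for $i = 1, 2, 3$. Here, $\Gamma_i$ is the covariance matrix of $(u_t', ..., u_{t-p+1}')'$, $\Phi_{il} = \sum_{j=l}^{\infty} j^{-1}\Psi_{i,j-l}$, $\Psi_{i,k}$ is the $k$'th term in the Wold representation of $u_t$ normalized such that $\Psi_{i,0} = I_K$, and $H_i = \left(\partial a_1'/\partial\gamma_i, ..., \partial a_p'/\partial\gamma_i\right)'$, where $a_j = \operatorname{vec} A_j$ are the coefficients in the autoregressive representation $A(L)u_t = e_t$.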


In the simple bivariate VAR(1) example also considered in the appendix, the variance equations (27) reduce to

$$\frac{\pi^2}{6}\frac{\sigma_{11}^2}{\sigma_{1.2}^2} - \operatorname{vec}\left(\Sigma^{-1}e_1e_1'\Sigma\Phi_{i1}'\right)' H_i \left(H_i'\left(\Gamma_i \otimes \Sigma^{-1}\right)H_i\right)^{-1} H_i' \operatorname{vec}\left(\Sigma^{-1}e_1e_1'\Sigma\Phi_{i1}'\right), \quad i = 1, 2, 3,$$

where $\Gamma_i = E(u_tu_t')$ can be estimated by $n^{-1}\sum_{t=1}^{n} u_tu_t'$ and the particular $\Phi_{i1}$ and $H_i$ for this example are given in the appendix.

The Fisher information for $\theta$, which is derived in the next theorem, illustrates the standard nature of our testing problem.

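Theorem 3.2 Let the assumptions of Theorem 3.1 be satisfied and assume that $e_t$ is Gaussian. Under Assumption 1.0 the Fisher information for $\theta$ is

$$\mathcal{I}_0 = -\lim_{n\to\infty} E\left(\frac{1}{n}\frac{\partial^2 L(\theta, \beta, \Sigma)}{\partial\theta\,\partial\theta'}\right) = \frac{\pi^2}{6}\frac{\sigma_{11}^2}{\sigma_{1.2}^2}, \tag{28}$$

and under Assumption 1.i, $i = 1, 2, 3$, the Fisher information for $\theta$ is

$$\mathcal{I}_i = \frac{\pi^2}{6}\frac{\sigma_{11}^2}{\sigma_{1.2}^2} - \operatorname{vec}\left(\Sigma^{-1}e_1e_1'\Sigma\Phi_{i1}', \ldots, \Sigma^{-1}e_1e_1'\Sigma\Phi_{ip}'\right)' H_i \left(H_i'\left(\Gamma_i \otimes \Sigma^{-1}\right)H_i\right)^{-1} H_i' \operatorname{vec}\left(\Sigma^{-1}e_1e_1'\Sigma\Phi_{i1}', \ldots, \Sigma^{-1}e_1e_1'\Sigma\Phi_{ip}'\right). \tag{29}$$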

To assess the local power properties of the test, we derive the asymptotic distribution under the sequence of local alternatives $\theta_{1n} = \delta/\sqrt{n}$.

Theorem 3.3 Under the assumptions of Theorem 3.1 and $\theta = \delta/\sqrt{n}$,

$$S_n \xrightarrow{D} N\left(\delta\mathcal{I}_i, \mathcal{I}_i\right) \tag{30}$$

as $n \to \infty$, where $\mathcal{I}_i$ is defined in Theorem 3.2.

Consider again briefly the special case where it is known that $\sigma_{21} = 0$. In that case the score (25) and the distributions in Theorem 3.3 coincide with the ones obtained by Tanaka (1999). That is, we can apply the test of Tanaka (1999) to the residuals in (24), and Tanaka's (1999) i.i.d. result is obtained under Assumptions 1.0 and 1.2 and his result for autocorrelated errors is obtained under Assumptions 1.1 and 1.3. Thus, when $\sigma_{21} = 0$, our test has the same functional form and distribution as Tanaka's (1999) test, which is based on more information ($\beta$ known), and therefore our test shares the asymptotic optimality properties of that test when $\sigma_{21} = 0$.

In practice, to construct an approximate size $\alpha$ test of $H_0$ against $H_1 : \theta > 0$ under Assumption 1.i, we compute the statistic

$$LM_{i1} = \frac{1}{\sqrt{\mathcal{I}_i}} S_n \xrightarrow{D} N\left(\delta\sqrt{\mathcal{I}_i}, 1\right) \tag{31}$$

under $\theta = \delta/\sqrt{n}$ as $n \to \infty$, and compare it to the $100(1-\alpha)\%$ point of the standard normal distribution. To test against the two-sided alternative $H_2 : \theta \neq 0$ under Assumption 1.i, at approximate size $\alpha$, we compute

$$LM_{i2} = LM_{i1}^2 \xrightarrow{D} \chi_1^2\left(\delta^2\mathcal{I}_i\right) \tag{32}$$

under $\theta = \delta/\sqrt{n}$ as $n \to \infty$, and compare it to the $100(1-\alpha)\%$ point of the central $\chi_1^2$ distribution.

A useful feature of the asymptotic distributions (31) and (32) is that they are free of the parameters $d$ and $b$. Since $d$ and $b$ are assumed known a priori, their effect is neutralized by suitable differencing. This shows that simple asymptotic inference about $\theta$ can be carried out for any choice of $d$ and $b$ satisfying $d \geq b \geq 3/4 + \varepsilon$ for some $\varepsilon > 0$.

The calculation of the tests may seem to be quite involved as $p$ gets large because of the covariance matrices $\Phi$ and $\Gamma$. However, for a given parameter value $\gamma$ we can calculate $\Phi$ and $\Gamma$ (and thus the tests) simply by finding the coefficients in the Wold representation of $u_t$, then directly evaluate the sums in $\Phi$, and set $\Gamma$ equal to the sample covariance matrix of $(u_t', ..., u_{t-p+1}')'$. Another possibility is to employ the following numerical approximation to the one-sided test,

$$\widehat{LM}_{i1} = \left. -\sqrt{n}\,\sum_{t=1}^{n}\hat e_{1.2t}\frac{\partial\hat e_{1t}}{\partial\theta} \Bigg/ \sqrt{\sum_{t=1}^{n}\left(\frac{\partial\hat e_{1t}}{\partial\theta}\right)^2 \sum_{t=1}^{n}\hat e_{1.2t}^2}\; \right|_{H_0}, \tag{33}$$

which follows by noting that $\partial\hat e_{1t}/\partial\theta = -\sum_{j=1}^{t-1} j^{-1}\hat e_{1t-j}$ and comparing with (25) and (31).
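The approximation (33) is simple to compute once the two residual series are available. The sketch below assumes the residuals in (23) and (24) have already been obtained and are passed in as plain lists; the function name `lm_one_sided` is hypothetical.

```python
import math

def lm_one_sided(e1, e12):
    """Numerical approximation (33) to the one-sided LM statistic."""
    n = len(e1)
    # Derivative of e1_t with respect to theta under the null:
    # -sum_{j=1}^{t-1} e1_{t-j} / j, which is zero for the first observation.
    de1 = [-sum(e1[t - j] / j for j in range(1, t + 1)) for t in range(n)]
    num = -math.sqrt(n) * sum(a * b for a, b in zip(e12, de1))
    den = math.sqrt(sum(b * b for b in de1) * sum(a * a for a in e12))
    return num / den
```

The statistic is invariant to rescaling either residual series, so no separate estimate of $\sigma_{1.2}^2$ is needed, and it is compared to the upper tail of the standard normal distribution as in (31).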

From Theorem 3.3, we can easily calculate the asymptotic local power functions of the one-sided and two-sided tests (31)-(32). This is stated as a corollary.

Corollary 3.1 Under the assumptions of Theorem 3.1 and $\theta = \delta/\sqrt{n}$,

$$P\left(LM_{i1} > Z_{1-\alpha}\right) \to \Phi\left(Z_{\alpha} + |\delta|\sqrt{\mathcal{I}_i}\right), \tag{34}$$

$$P\left(LM_{i2} > \chi_{1,1-\alpha}^2\right) \to 1 - F_{\lambda_i}\left(\chi_{1,1-\alpha}^2\right), \tag{35}$$

where $Z_{1-\alpha}$ and $\chi_{1,1-\alpha}^2$ are the $100(1-\alpha)\%$ points of the standard normal and central $\chi_1^2$ distributions, respectively, and $\Phi$ and $F_{\lambda_i}$ are the distribution functions of the standard normal distribution and the noncentral $\chi_1^2$ distribution with noncentrality parameter $\lambda_i = \delta^2\mathcal{I}_i$.
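The two power functions in Corollary 3.1 can be evaluated with the standard normal distribution alone. The following sketch uses only the Python standard library; the function names and the argument `info` (a value of the Fisher information $\mathcal{I}_i$) are illustrative assumptions. For (35) it uses the fact that a noncentral $\chi_1^2$ variable with noncentrality $\lambda$ is the square of a $N(\sqrt{\lambda}, 1)$ variable.

```python
from math import sqrt
from statistics import NormalDist

_N = NormalDist()

def power_one_sided(delta, info, alpha=0.05):
    """Asymptotic local power (34): Phi(Z_alpha + |delta| * sqrt(I_i))."""
    return _N.cdf(_N.inv_cdf(alpha) + abs(delta) * sqrt(info))

def power_two_sided(delta, info, alpha=0.05):
    """Asymptotic local power (35) with noncentrality lambda_i = delta^2 * I_i."""
    # The central chi-square(1) critical value is the squared normal
    # 100(1 - alpha/2)% point, so P((Z + r)^2 > z^2) splits into two normal tails.
    z = _N.inv_cdf(1 - alpha / 2)
    root = abs(delta) * sqrt(info)
    return _N.cdf(root - z) + _N.cdf(-root - z)
```

At $\delta = 0$ both functions return the nominal size $\alpha$. Under Assumption 1.0 with uncorrelated errors, (28) gives $\mathcal{I}_0 = \pi^2/6$, which can be passed as `info`.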

Figure 1 shows asymptotic local power functions for $d = b = 1$ and a variety of first order autoregressive specifications and contemporaneous correlation structures. When the correlation is low (left-hand side panels) only the autoregressive term in the equilibrium error (8) has a significant effect. In fact, if the errors are contemporaneously uncorrelated, i.e. $\sigma_{21} = 0$, the power functions for cases 1.0 and 1.2 coincide and the power functions for cases 1.1 and 1.3 coincide. With highly correlated errors (right-hand side panels) the autocorrelation in $\Delta^{d} y_{2t}$ spills over and has some effect on the power function, though still not as much as the autoregressive term in the equilibrium error. This is well known from standard cointegration analysis. Since the regressors $y_{2t}$ are already heavily trended it makes little difference if the innovations to the stochastic trend are weakly autocorrelated.

Figure 1 about here

It follows from Corollary 3.1 and Theorem 3.2 that the power functions of the tests depend on the covariance matrix of the underlying innovations $e_t$, such that the power depends on the extent of the endogeneity of the regressors $y_{2t}$. In particular, under Assumptions 1.0 and 1.1 any correlation between $y_{2t}$ and $z_t$ is exploited by the test to increase power, cf. equation (28) and Figure 1 (compare the starred lines in the left-hand and right-hand side panels). Note that in case 1.2 the power may increase or decrease with correlated errors. Comparing the solid line in the left-hand and right-hand side panels the power increases when correlation is increased from .6 to .9. However, comparing the starred and solid lines in the upper left-hand side panel shows that in case 1.2 power is decreased with correlation .6 compared to the uncorrelated case. Thus, when correlation is high the first term in $\mathcal{I}_2$, which increases power as correlation increases, dominates the second term, which decreases power due to the spill-over of the autocorrelation via the contemporaneous correlation.

In general, the ability of the test to exploit the correlation stands in contrast to the standard $I(1)-I(0)$ framework where the power functions and power envelopes do not depend on $\Sigma$, see Jansson & Haldrup (2002) and Jansson (2004). The dependence on $\Sigma$ is due to the fact that $\beta$ can be assumed to be known in the derivation of power functions and power envelopes in our fractional setup. That is not the case in the standard $I(1)-I(0)$ framework and thus cointegration tests in that framework are unable to exploit the correlation between $y_{2t}$ and $z_t$ to gain power.

As a first optimality result, it follows immediately from Theorems 3.2 and 3.3 that the two-sided test is locally most powerful (LMP) and we state this as a corollary.

Corollary 3.2 Under the assumptions of Theorem 3.3 and the additional assumption of Gaussianity, the two-sided test statistic (32) is locally most powerful in the sense that the noncentrality parameter is maximal.

Next, we show that much stronger optimality results than the LMP property of Corollary 3.2 can be obtained for the problem of testing (15) when the errors are assumed to be i.i.d. Gaussian.

4 Asymptotic Gaussian Power Envelopes

In this section, we derive the asymptotic Gaussian power envelopes for the one-sided and two-sided testing problems and proceed to show, following Elliott, Rothenberg & Stock (1996) and Tanaka (1999), that the one-sided test is asymptotically uniformly most powerful (UMP) and, following Nielsen (2004), that the two-sided test is asymptotically uniformly most powerful unbiased (UMPU).

Assume that the data generating process is (7)-(9) with $u_t$ independent, normally distributed, $\beta$ and $\Sigma$ known, and true parameter value $\theta_{0n} = c/\sqrt{n}$ for some fixed $c > 0$. The test of $H_0 : \theta = 0$ against the local alternative $H_1 : \theta_{1n} = \delta/\sqrt{n}$ for some fixed $\delta > 0$ is a test of a simple null against a simple alternative. The Neyman-Pearson Lemma, e.g. Lehmann (1986, chapter 3), states that the test that rejects the null when

$$M_n = n\,\frac{\sum_{t=1}^{n}\tilde u_{1.2nt}^2 - \sum_{t=1}^{n}\bar u_{1.2nt}^2}{\sum_{t=1}^{n}\tilde u_{1.2nt}^2} \tag{36}$$

becomes large is most powerful. Here, $\tilde u_{1.2nt}$ and $\bar u_{1.2nt}$ are the residuals (with $\beta$ and $\Sigma$ known) under $H_0$ and $H_1$, respectively. The next theorem derives the limiting distribution of $M_n$ under local alternatives.

Theorem 4.1 Let $M_n$ denote the test statistic (36) in the model generated by $\theta_{0n} = c/\sqrt{n}$ ($c > 0$ is a fixed scalar). Then, under the sequence of local alternatives $\theta_{1n} = \delta/\sqrt{n}$ ($\delta > 0$ is a fixed scalar), it holds that

$$M_n \xrightarrow{D} M(c, \delta) = 2\delta\sqrt{\mathcal{I}_0}\,Z + \delta(2c - \delta)\mathcal{I}_0$$

as $n \to \infty$, where $Z$ is a standard normal variable.

Let the power of $M_n$ be given by $\pi(c, \delta) = P\left(M(c, \delta) > c_\alpha(\delta)\right)$ under $H_{1n}$ when $\theta_{0n}$ is true, where the critical value $c_\alpha(\delta)$ is determined by $P\left(M(0, \delta) > c_\alpha(\delta)\right) = \alpha$. Then the power envelope of all one-sided tests is given by $\Pi(\delta) = \pi(\delta, \delta)$, and a test whose power attains the power envelope for all points $\delta$ is UMP.
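As a worked step, the critical value and the power of $M_n$ can be written out from the limit in Theorem 4.1. This is only a sketch of the algebra for $\delta > 0$, not a substitute for the proofs in the appendix; $Z_{1-\alpha}$ and $\Phi$ denote the $100(1-\alpha)\%$ point and the distribution function of the standard normal distribution.

$$\begin{aligned}
\alpha &= P\left(2\delta\sqrt{\mathcal{I}_0}\,Z - \delta^2\mathcal{I}_0 > c_\alpha(\delta)\right) \;\Rightarrow\; c_\alpha(\delta) = 2\delta\sqrt{\mathcal{I}_0}\,Z_{1-\alpha} - \delta^2\mathcal{I}_0, \\
\pi(c, \delta) &= P\left(2\delta\sqrt{\mathcal{I}_0}\,Z + \delta(2c - \delta)\mathcal{I}_0 > c_\alpha(\delta)\right) = P\left(Z > Z_{1-\alpha} - c\sqrt{\mathcal{I}_0}\right) = \Phi\left(Z_\alpha + c\sqrt{\mathcal{I}_0}\right), \\
\Pi(\delta) &= \pi(\delta, \delta) = \Phi\left(Z_\alpha + \delta\sqrt{\mathcal{I}_0}\right).
\end{aligned}$$

The power $\pi(c, \delta)$ does not depend on the alternative $\delta$ against which the test is designed, and the resulting envelope has the same form as the one-sided local power function in (34) with $i = 0$.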

To find a test statistic that applies against two-sided alternatives we invoke the principle of unbiasedness, see Lehmann (1986, chapter 4), to construct a most powerful unbiased test. Unbiasedness requires that the power of the test does not fall below the nominal significance level for any point in the alternative. A test whose power attains the power envelope for all points $\delta$ is UMPU.

The following theorem derives the asymptotic Gaussian power envelopes of the one-sided and two-sided testing problems, and shows that these envelopes are achieved by our tests.

Theorem 4.2 The one-sided asymptotic Gaussian power envelope for all tests of size $\alpha$ of $H_0 : \theta = 0$ against $H_1 : \theta_{1n} = \delta/\sqrt{n}$ ($\delta$ a fixed scalar) is given by (34) and the two-sided asymptotic Gaussian power envelope for all unbiased tests of size $\alpha$ is given by (35). Thus, in the i.i.d. Gaussian model, the one-sided LM test (31) is asymptotically uniformly most powerful (UMP) and the two-sided LM test (32) is asymptotically uniformly most powerful among all unbiased tests (UMPU).


This result is in stark contrast to the results in the standard $I(1)-I(0)$ cointegration literature. Tests that enjoy optimality properties have been derived in that framework by e.g. Shin (1994) and Jansson (2004) whose tests are LMP and point optimal, respectively, i.e. tests that have maximal power against a single prespecified point in the (autoregressive) alternative. However, our criterion is against all (fractional) alternatives, i.e. against all points in the alternative $\delta \neq 0$.

5 Finite Sample Performance

The local power functions and power envelopes derived above are asymptotic results, and in this section we examine by Monte Carlo experiments whether these asymptotic approximations carry over to finite samples.

The model we have chosen for the simulation study is a bivariate system with $d = b = 1$, i.e.

$$\Delta^{\theta}(y_{1t} - y_{2t}) = u_{1t}^{\#}, \tag{37}$$

$$\Delta y_{2t} = u_{2t}^{\#}, \tag{38}$$

which is a standard cointegrated model under the null. We consider several specifications for the error process corresponding to each case in Assumption 1 and let $e_t$ be bivariate normal with variances normalized to unity and with contemporaneous correlation 0 or .6. The parameter values for the autoregressive coefficients correspond to those in the upper panels in Figure 1, i.e. $\gamma_1 = .2$ and $\gamma_2 = .5$.
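A minimal sketch of how one replication of (37)-(38) could be generated under Assumption 1.0 is given below. It is written in plain Python for illustration and is not the Ox code used for the reported simulations; the function name `simulate_system`, its arguments, and the seeding scheme are assumptions.

```python
import math
import random

def simulate_system(n, theta, rho=0.0, seed=0):
    """Simulate (37)-(38) with i.i.d. errors, unit variances and correlation rho."""
    rng = random.Random(seed)
    e2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # e1 = rho * e2 + sqrt(1 - rho^2) * noise has unit variance and correlation rho with e2.
    e1 = [rho * b + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0) for b in e2]
    # Coefficients of (1 - L)^(-theta): psi_0 = 1, psi_j = psi_{j-1} * (j - 1 + theta) / j.
    psi = [1.0]
    for j in range(1, n):
        psi.append(psi[-1] * (j - 1 + theta) / j)
    # Equilibrium error z_t = Delta^(-theta) e1_t with pre-sample values set to zero.
    z = [sum(psi[j] * e1[t - j] for j in range(t + 1)) for t in range(n)]
    # y2 is a random walk started at zero, and y1 = y2 + z since beta = 1 in (37).
    y2, level = [], 0.0
    for b in e2:
        level += b
        y2.append(level)
    y1 = [a + b for a, b in zip(y2, z)]
    return y1, y2
```

Setting `theta = 0` gives the standard cointegrated model of the null hypothesis, while a local alternative would use `theta = delta / math.sqrt(n)` for a chosen `delta`.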

All calculations were made in Ox v3.00 (Doornik (2001)) including the Arfima package v1.01 (Doornik & Ooms (2001)). Throughout, we fix the nominal size (type I error) at .05 and the number of replications at 1,000. We consider the sample sizes $n = 200$ and $n = 500$. The former is typical for macroeconomic time series, and the latter (or even larger) for financial time series.

We concentrate on comparing the finite sample performance of the one-sided LM test (reported as LM) with the asymptotic local power, but also report results for the size corrected LM test (reported as LMsc). The properties of the estimator of the cointegrating vector $\beta$ in a similar model were examined by Kim & Phillips (2001), who found that even in samples as small as $n = 100$ the performance of the estimator is very good with respect to bias and variance.

Tables 1-4 show the simulated rejection frequencies of the test statistics (LM and LMsc) for different assumptions on the autocorrelation structure of the errors (as in Figure 1) corresponding to each case outlined in Assumption 1. For comparison, the asymptotic local power, which is equal to the power envelope under Assumption 1.0 by Theorem 4.2, has been calculated from Corollary 3.1 for the same parameter values and is reported under the heading 'Envelope'. The first three columns of each table give the results for contemporaneously uncorrelated errors, whereas in the last three columns the contemporaneous correlation between the errors is .6.

Tables 1-4 about here

First, consider the case where the cointegrating error $u_{1t}$ is i.i.d. and $u_{2t}$ is either i.i.d. (Table 1) or follows an AR(1) (Table 3). In these two cases the finite sample rejection frequencies are quite close to the asymptotic local power, even for the small sample size $n = 200$, and especially with contemporaneously uncorrelated errors. In Table 3 the effect of $G(z)$ spills over via the correlation, but only slightly degrades the size and power compared to the uncorrelated case where there is no spill-over. The insignificance of the specification of $u_{2t}$ is well known from standard cointegration analysis, and is due to the fact that $y_{2t}$ is already highly trended and making the innovations to this $I(d)$ process weakly dependent does not add significantly to this trend.

When $u_{1t}$ is allowed to be autocorrelated as in Tables 2 and 4, where $u_{1t}$ follows an AR(1) process, we know from Corollary 3.1 and Figure 1 that the power of the test degrades and consequently the asymptotic local power functions are much lower than in Tables 1 and 3. The finite sample rejection frequencies reflect this behavior and are well below the asymptotic power for $n = 200$ and also somewhat below the asymptotic power for $n = 500$.

Comparing the middle and right-hand side panels in Tables 1, 2, and 4 shows that the test takes advantage of the correlation between the underlying errors, and the improvement in power when the errors are correlated (right-hand side panels) is evident. The ability of the test to exploit this correlation to increase power even in finite samples is remarkable and contrasts with the inability of conventional cointegration tests to exploit this correlation even asymptotically, see Jansson & Haldrup (2002) and Jansson (2004).

In general, the finite sample power functions for samples of size $n = 200$ are reasonable, but well below the asymptotic local power. For samples of size $n = 500$ they are close to the asymptotic local power functions, especially in the absence of an autoregressive term in the equilibrium errors. Thus, one would expect very good performance of the tests in financial applications where samples are often many times larger. In such cases the power loss resulting from the estimation of a rich autocorrelation structure would also be of less importance. The sample size in our empirical application below is $n = 336$, so for the application we expect the performance of the tests to lie between the two cases considered in the present simulation study.

6 Exchange Rate Dynamics

The analysis of exchange rate dynamics and potential (fractional) cointegrating relations between exchange rates for different currencies has attracted much attention recently. Baillie & Bollerslev (1989) find evidence of one cointegrating relation between seven different (log) spot exchange rates using conventional cointegration methods. This is challenged by Diebold, Gardeazabal & Yilmaz (1994) who show that the inclusion of an intercept changes the conclusion for the Baillie & Bollerslev (1989) data set. This finding is further supported in an analysis of a different data set covering a longer span of time in Diebold et al. (1994).

In the article by Baillie & Bollerslev (1994) it is argued that the failure of conventional cointegration tests to find evidence of cointegration in the Baillie & Bollerslev (1989) exchange rate data is due to the presence of fractional cointegration. Thus, they estimate the cointegration vector by OLS following Cheung & Lai (1993) and fit a simple fractionally integrated white noise model to the residuals. It is concluded that the exchange rates can be described by a $CI(1, .11)$ relationship (in our notation). However, their estimate of the integration order of the equilibrium errors (.89) may well be upwards biased since relevant short-run dynamics may have been left out. This is indeed what is concluded by Kim & Phillips (2001) who apply their fractional fully modified estimation procedure to a different data set covering a longer time span but the same exchange rates. They find that the equilibrium errors are best described by an ARFIMA(1,d,0) process with $d = .33$.

All the above studies concentrate on the estimation of the cointegration vector and the memory parameter of the equilibrium errors, but no formal testing of the hypothesis of fractional cointegration is attempted. We take the opposite view and concentrate on testing for the presence of (i) standard $I(1)-I(0)$ cointegration against fractional alternatives, (ii) $CI(\hat d, \hat d)$ cointegration, where $\hat d$ is a preliminary estimate of $d$, and (iii) fractional cointegration with equilibrium errors that are integrated of order less than one-quarter, i.e. that the spectral density of the equilibrium errors is square integrable or equivalently that their autocovariances are asymptotically normally distributed.

We apply our tests to a system of log exchange rates for the currencies of the following seven countries, (West) Germany, United Kingdom, Japan, Canada, France, Italy, and Switzerland against the US Dollar. The same currencies are examined in the studies cited above. However, where Baillie & Bollerslev (1989, 1994) and Diebold et al. (1994) consider daily observations covering 1 March 1980 to 28 January 1985 and Kim & Phillips (2001) consider quarterly observations from 1957 through 1997, our data set consists of monthly averages of noon (EST) buying rates and runs from January 1974 through December 2001 for a total of $n = 336$ observations. Thus, our data set, which is extracted from the Federal Reserve Board of Governors G.5 release, covers only the period of the current flexible exchange rate regime, but a much longer span of time than the Baillie & Bollerslev (1989) data set. A long time span has generally been found to be important in detecting long-run relations.

Tables 5-6 about here

Table 5 presents the fractional integration analysis of the data set. The first two rows are the estimates of the fractional integration orders estimated by the conditional ML technique (CMLE) in Tanaka (1999) with lag orders $p = 0$ and $p = 1$. The standard errors reported in parentheses are calculated as $\sqrt{6}/(\pi\sqrt{n})$ when $p = 0$ and $(\sqrt{n}\,\omega)^{-1}$ when $p = 1$, where $\omega^2 = \pi^2/6 - (1 - a^2)a^{-2}(\ln(1 - a))^2$ and $a$ is the estimated AR coefficient, see Tanaka (1999). As a robustness check we also report the Gaussian semiparametric (GSP) estimates of Robinson (1995) (applied to the first-differenced data and adding one to the resulting estimate) with two different bandwidths in the final two rows of Table 5. The standard errors of these estimates are $1/(2\sqrt{m})$, see Robinson (1995). The final column gives estimates of a common integration order, computed simply as an average of the estimated integration orders for each exchange rate, which we use in our fractional cointegration analysis.
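The standard error formulas quoted above are easy to evaluate directly. The sketch below is illustrative only; the function names are hypothetical, and $\omega^2$ is taken as $\pi^2/6 - (1 - a^2)a^{-2}(\ln(1 - a))^2$, which is the form obtained by specializing the variance expression (27) to a univariate AR(1) error.

```python
import math

def se_cmle(n, a=None):
    """Asymptotic standard error of the CMLE of d (p = 0 if a is None, else p = 1)."""
    if a is None:
        return math.sqrt(6.0) / (math.pi * math.sqrt(n))
    # omega^2 for an AR(1) error with coefficient a (a must be nonzero and below one).
    omega2 = math.pi ** 2 / 6.0 - (1.0 - a * a) / (a * a) * math.log(1.0 - a) ** 2
    return 1.0 / math.sqrt(n * omega2)

def se_gsp(m):
    """Asymptotic standard error of the Gaussian semiparametric estimate, bandwidth m."""
    return 1.0 / (2.0 * math.sqrt(m))
```

For the sample size of this application, `se_cmle(336)` is roughly .043, and allowing for an autoregressive lag always gives a larger standard error because the second term in $\omega^2$ is nonnegative.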

In Table 6 we report two common misspecification tests for the residuals of the univariate CMLEs in Table 5. The Portmanteau test of autocorrelation up to lag 6 is reported as AR(6) and the test for ARCH up to lag 1 is reported as ARCH(1). When $p = 0$ the AR(6) test rejects at the 1% level for all the time series except CAN, whereas when $p = 1$ the test only rejects at the 5% level for JAP and UK. Likewise, when $p = 0$ the ARCH test rejects at the 1% level for ITA, JAP, and UK, but when $p = 1$ it rejects only for UK at the 1% level. Thus, the misspecification tests suggest that the data are well described when allowance is made for one autoregressive lag, i.e. when $p = 1$.

Returning to the estimates in Table 5, it is clear that the exchange rates can be well described as $I(1)$ processes. The CMLEs are insignificantly different from unity except CAN with $p = 0$, but that estimate may be upwards biased if relevant short-run dynamics are left out of the estimation (although the misspecification tests do not suggest so). Thus, when $p = 1$ the CAN estimate is insignificantly different from unity. The GSP estimates are all insignificantly different from unity except FRA with $m = 67$. Hence, the results of Table 5 support the overwhelming evidence in the previous literature that exchange rates are $I(1)$. For example, Baillie & Bollerslev (1989) conduct unit root tests of the $I(1)$ hypothesis against the $I(0)$ alternative and Baillie (1996) provides evidence from fractional models.

Table 7 about here

In Table 7 the results from applying the one-sided LM test (33) to the exchange rate dataare presented. The exchange rate for (West) Germany is y1t, and the remaining six exchangerates are gathered in y2t, which is then a six-dimensional vector. Following the evidence infavor of p = 1 in Table 6, we consider only Assumptions 1.2 and 1.3 with p = 1. We testthree different hypotheses. Based on the evidence in Table 5 and the previous literature, wespecify d = b = 1 in the first hypothesis corresponding to the standard I (1) − I (0) model asdiscussed above. Secondly, we use the estimated common integration order dc for d and b (i.e.we set d = b = dc), using the estimates from Table 5 with p = 1 for dc. The third hypothesis,d = 1, b = .76, is that there exists a cointegrating relation which is integrated of order lessthan one-quarter (using ε = .01). This implies square integrability of the spectral density ofthe cointegrating errors and asymptotic normality of their autocovariances.

The results we obtain in Table 7 are mixed. Under Assumption 1.2 all the tests reject strongly. However, when allowance is made for an autoregressive specification in the cointegrating relation, i.e. under Assumption 1.3, the test does not reject the third hypothesis, thus supporting a dynamic specification of the cointegrating relation (possibly with fractional integration in the cointegrating relation). In particular, under Assumption 1.3, which we consider the most relevant based on Table 6, the test rejects the first two hypotheses at the 5% level, but none of the hypotheses are rejected at the 1% level. In the cases with an estimated autoregressive term in the equilibrium errors, the estimates of the autoregressive parameter (not reported in the table) are between .84 and .94. Hence, there appears to be persistence in the cointegrating relation, and the results of Table 7 suggest that it could be only short memory.

7 Conclusion

We have proposed and examined a time domain LM test for the null of cointegration in a fractionally cointegrated model with the usual computational motivation. In the important case where the null hypothesis is that of standard I(1) − I(0) cointegration, but the test is against fractional alternatives, the calculation of the LM test statistic does not require any fractional differencing and can be based on residuals from readily available computer software.

The likelihood theory in the time domain is tractable and the ML estimation of the cointegration vector β reduces to a version of the fully modified least squares estimator. Thus, the LM test statistic utilizes fully modified residuals to cancel the endogeneity and serial correlation biases. The test statistic is shown to have standard distributional properties under the null and under local alternatives, such that inference can be drawn from the normal and chi-squared distributions.

In the special case with i.i.d. Gaussian errors, the asymptotic Gaussian power envelope of all tests is achieved by the one-sided version of our test, and the asymptotic Gaussian power envelope of all unbiased tests is achieved by the two-sided version of our test. Thus, with i.i.d. Gaussian errors, the one-sided (two-sided) version of our test is asymptotically uniformly most powerful among all (unbiased) tests.

The empirical relevance of our test is established by Monte Carlo experiments, which show that finite sample rejection frequencies are reasonable for samples of size n = 200 and close to the asymptotic local power for n = 500.

Finally, we have applied our methodology to the analysis of exchange rate dynamics in a system of exchange rates for seven major currencies against the US Dollar. We have focused on testing for the presence of (fractional) cointegration, rather than the estimation of any particular model, but the evidence is mixed.


Appendix: Proofs

Before we prove the theorems we need a lemma. Define the sample autocovariance and residual autocovariance functions

$$C(j) = \frac{1}{n}\sum_{t=j+1}^{n} e_t e_{t-j}' \quad\text{and}\quad \hat{C}(j) = \frac{1}{n}\sum_{t=j+1}^{n} \hat{e}_t \hat{e}_{t-j}',$$

where the $\hat{e}_t$ are estimated residuals of a VAR($p$) process. We consider the asymptotic distribution of a particular linear combination of the residual autocovariances in each of the four cases outlined in Assumption 1.

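Lemma 1 Let $\hat{e}_t$ be the estimated residuals of the $K$-dimensional VAR($p$) process $A(L)u_t = e_t$, where $e_t$ is i.i.d. $(0,\Sigma)$ with finite fourth moments and $A(z)$ has the structural parameterization in Assumption 1.i. Then

$$\sqrt{n}\sum_{j=1}^{n-1} j^{-1}\operatorname{vec} C(j) \xrightarrow{D} N(0,\Omega_0),$$

$$\sqrt{n}\sum_{j=1}^{n-1} j^{-1}\operatorname{vec} \hat{C}(j) \xrightarrow{D} N(0,\Omega_i)$$

as $n\to\infty$, where

$$\Omega_0 = \frac{\pi^2}{6}\,\Sigma\otimes\Sigma,$$

$$\Omega_i = \frac{\pi^2}{6}\,\Sigma\otimes\Sigma - \left(\Sigma\Phi_{i1}'\otimes I_K,\ldots,\Sigma\Phi_{ip}'\otimes I_K\right) H_i \left(H_i'\left(\Gamma_i\otimes\Sigma^{-1}\right)H_i\right)^{-1} H_i' \left(\Sigma\Phi_{i1}'\otimes I_K,\ldots,\Sigma\Phi_{ip}'\otimes I_K\right)',$$

for $i = 1, 2, 3$. Here, $\Gamma_i$ is the covariance matrix of $(u_t',\ldots,u_{t-p+1}')'$, $\Phi_{il} = \sum_{j=l}^{\infty} j^{-1}\Psi_{i,j-l}$, $\Psi_{i,k}$ is the $k$'th term in the Wold representation of $u_t$ normalized such that $\Psi_{i,0} = I_K$, and $H_i = \left(\partial a_1'/\partial\gamma_i,\ldots,\partial a_p'/\partial\gamma_i\right)'$, where $a_j = \operatorname{vec} A_j$ are the coefficients in the autoregressive representation $A(L)u_t = e_t$.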

Proof. For a fixed $m > p$ define the $K^2 m$-vectors $C_m = \operatorname{vec}(C(1),\ldots,C(m))'$ and $\hat{C}_m = \operatorname{vec}(\hat{C}(1),\ldots,\hat{C}(m))'$. Consider first case 1.0, where $u_t$ is i.i.d. and $C(j)$ is observable. It is well known that in this case

$$\sqrt{n}\,C_m \xrightarrow{D} N(0, I_m\otimes\Sigma\otimes\Sigma)$$

and thus

$$\sqrt{n}\sum_{j=1}^{m} j^{-1}\operatorname{vec} C(j) \xrightarrow{D} N\left(0, \sum_{j=1}^{m} j^{-2}\,\Sigma\otimes\Sigma\right).$$

The desired result for case 1.0 now follows by application of Bernstein's Lemma, see e.g. Hall & Heyde (1980, pp. 191-192).
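The role of the truncation point m in this argument can be illustrated numerically. The following is a minimal sketch, not part of the original proof; the function name `truncated_variance_factor` is hypothetical. It computes the scalar $\sum_{j=1}^{m} j^{-2}$ that multiplies $\Sigma\otimes\Sigma$ in the truncated variance and compares it with its limit $\pi^2/6$.

```python
import math


def truncated_variance_factor(m: int) -> float:
    """Return sum_{j=1}^{m} j**-2, the scalar in the truncated variance."""
    return sum(1.0 / (j * j) for j in range(1, m + 1))


# Limit of the truncated factor as m -> infinity.
LIMIT = math.pi ** 2 / 6

for m in (10, 100, 1000, 10000):
    factor = truncated_variance_factor(m)
    gap = LIMIT - factor
    # The neglected tail sum_{j>m} j**-2 lies between 1/(m+1) and 1/m.
    print(f"m={m:6d}  factor={factor:.6f}  gap={gap:.2e}")
```

The gap shrinks at rate 1/m, which is why the truncated sums can be replaced by their limits once m is allowed to grow.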


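For the remaining three cases we employ a result of Ahn (1988) on the asymptotic distribution of the residual autocovariances of a VAR($p$) process under structural parameterization. Consider case 1.2. Define the matrix $H_2 = \left(\partial a_1'/\partial\gamma_2,\ldots,\partial a_p'/\partial\gamma_2\right)'$, where $a_j = \operatorname{vec} A_j$ are the coefficients in the autoregressive representation $A(L)u_t = e_t$ and $\gamma_2$ is the vector of coefficients in $G(z)$. In this setup, Ahn (1988) showed that (in our notation)

$$\sqrt{n}\,\hat{C}_m \xrightarrow{D} N\left(0,\; I_m\otimes\Sigma\otimes\Sigma - G_m H_2\left(H_2'\left(\Gamma\otimes\Sigma^{-1}\right)H_2\right)^{-1}H_2' G_m'\right),$$

where

$$G_m' = \begin{bmatrix}
\Sigma\otimes I_K & \Psi_1\Sigma\otimes I_K & \cdots & \cdots & \cdots & \Psi_{m-1}\Sigma\otimes I_K \\
0 & \ddots & & & & \vdots \\
\vdots & \ddots & \ddots & & & \vdots \\
0 & \cdots & 0 & \Sigma\otimes I_K & \cdots & \Psi_{m-p}\Sigma\otimes I_K
\end{bmatrix}.$$

Consequently,

$$\sqrt{n}\sum_{j=1}^{m} j^{-1}\operatorname{vec}\hat{C}(j) \xrightarrow{D} N\left(0,\Omega_2^{(m)}\right),$$

where $\Omega_2^{(m)}$ is a truncated version of $\Omega_2$, i.e. with $\pi^2/6$ replaced by $\sum_{j=1}^{m} j^{-2}$ and $\Phi_{il}$ replaced by $\Phi_{il}^{(m)}$, which is truncated at $m$. Again, we can apply Bernstein's Lemma to replace the truncated sums by their limits. For cases 1.1 and 1.3 the same results hold, except that $H_i$, $\Phi_{il}$, and $\Gamma_i$ are different as indicated by the subscript $i$.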

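As a simple example consider a bivariate VAR(1) system with $g(z) = 1-\gamma_1 z$ and $G(z) = 1-\gamma_2 z$. The $H_i$ matrices are $H_1 = (1,0,0,0)'$, $H_2 = (0,0,0,1)'$, and $H_3 = (H_1,H_2)$ and the covariance equations simplify to

$$\Omega_i = \frac{\pi^2}{6}\,\Sigma\otimes\Sigma - \left(\Sigma\Phi_{i1}'\otimes I_K\right)H_i\left(H_i'\left(\Sigma^{-1}\otimes\Gamma_i\right)H_i\right)^{-1}H_i'\left(\Sigma\Phi_{i1}\otimes I_K\right),\quad i = 1,2,3,$$

where $\Phi_{11} = \operatorname{diag}(\phi_1, 1)$, $\Phi_{21} = \operatorname{diag}(1,\phi_2)$, $\Phi_{31} = \operatorname{diag}(\phi_1,\phi_2)$, $\phi_i = -\gamma_i^{-1}\ln(1-\gamma_i)$, $i = 1,2$, and $\Gamma_i = E(u_t u_t')$ can be estimated by $n^{-1}\sum_{t=1}^{n} u_t u_t'$.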
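The closed form for $\phi_i$ can be checked against the defining sum in Lemma 1. The sketch below is illustrative only and assumes a scalar AR(1) component $(1-\gamma L)u_t = e_t$, whose Wold coefficients are $\Psi_k = \gamma^k$, so that $\Phi_{i1} = \sum_{j\ge 1} j^{-1}\gamma^{j-1}$. The function names are hypothetical.

```python
import math


def phi_closed_form(gamma: float) -> float:
    """phi = -ln(1 - gamma) / gamma, with the i.i.d. limit 1 at gamma = 0."""
    if gamma == 0.0:
        return 1.0
    return -math.log(1.0 - gamma) / gamma


def phi_series(gamma: float, terms: int = 2000) -> float:
    """Truncated sum_{j>=1} j**-1 * gamma**(j-1), using Psi_k = gamma**k."""
    return sum(gamma ** (j - 1) / j for j in range(1, terms + 1))


# Autoregressive parameters of the size used in Figure 1.
for gamma in (0.2, 0.4, 0.5, 0.8):
    print(gamma, phi_closed_form(gamma), phi_series(gamma))
```

For $\gamma = 0$ the series collapses to its first term, which is consistent with the unit entries in $\Phi_{11}$ and $\Phi_{21}$ for the i.i.d. component.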

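Proof of Theorem 3.1. Suppose first that $\beta$ is known. Using that $\operatorname{vec}(A)'\operatorname{vec}(B) = \operatorname{tr}(A'B)$ and by application of Lemma 1, the score statistic is

$$S_n = \sqrt{n}\sum_{j=1}^{n-1} j^{-1}\operatorname{vec}\left(\Sigma^{-1}e_1e_1'\right)'\operatorname{vec}\hat{C}(j) \xrightarrow{D} N\left(0,\;\operatorname{vec}\left(\Sigma^{-1}e_1e_1'\right)'\Omega_i\operatorname{vec}\left(\Sigma^{-1}e_1e_1'\right)\right),$$

where the $\Omega_i$ are defined in Lemma 1. The variance equations (26) and (27) follow immediately from Lemma 1, e.g. in case 1.0 with i.i.d. errors the variance is

$$\begin{aligned}
\operatorname{vec}\left(\Sigma^{-1}e_1e_1'\right)'\frac{\pi^2}{6}\left(\Sigma\otimes\Sigma\right)\operatorname{vec}\left(\Sigma^{-1}e_1e_1'\right)
&= \frac{\pi^2}{6}\operatorname{vec}\left(\Sigma^{-1}e_1e_1'\right)'\operatorname{vec}\left(e_1e_1'\Sigma\right) \\
&= \frac{\pi^2}{6}\operatorname{tr}\left(e_1e_1'\Sigma^{-1}e_1e_1'\Sigma\right) = \frac{\pi^2}{6}\,\frac{\sigma_{11}^2}{\sigma_{1.2}^2}.
\end{aligned}$$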

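Next, we show that estimating $\beta$ does not influence the result. From e.g. Cheung & Lai (1993), Marinucci & Robinson (2001), and Kim & Phillips (2001) we know that, since $\beta$ is estimated by OLS between $I(b)$ processes with $I(0)$ errors, $\hat\beta - \beta = O_p(n^{1-2b})$ when $1/2 < b \le 1$ and $\hat\beta - \beta = O_p(n^{-b})$ when $b > 1$.

For simplicity we consider only the case with i.i.d. errors and scalar $\beta$ in the remainder of the proof, i.e. $u_t = e_t$ is a bivariate i.i.d. process. The general case follows similarly. Consider the residual processes

$$\begin{aligned}
\hat{z}_t &= y_{1t} - \hat\beta y_{2t} = z_t + (\beta - \hat\beta)y_{2t}, \\
\hat{u}_{1t} &= \Delta^{d-b}\hat{z}_t = u_{1t} + \Delta^{d-b}(\beta - \hat\beta)y_{2t}, \\
\hat{u}_{1.2t} &= u_{1.2t} + \Delta^{d-b}(\beta - \hat\beta)y_{2t},
\end{aligned}$$

and define $\hat{w}_t = \sum_{j=1}^{t-1} j^{-1}\hat{u}_{1t-j}\hat{u}_{1.2t}$ and $w_t = \sum_{j=1}^{t-1} j^{-1}u_{1t-j}u_{1.2t}$. Then

$$\begin{aligned}
\frac{1}{\sqrt{n}}\sum_{t=1}^{n}\left(\hat{w}_t - w_t\right)
&= \frac{1}{\sqrt{n}}\sum_{t=1}^{n}\sum_{j=1}^{t-1} j^{-1}\Big((\beta - \hat\beta)^2\left(\Delta^{d-b}y_{2t-j}\right)\left(\Delta^{d-b}y_{2t}\right) \\
&\qquad\qquad + u_{1t-j}\Delta^{d-b}(\beta - \hat\beta)y_{2t} + u_{1.2t}\Delta^{d-b}(\beta - \hat\beta)y_{2t-j}\Big) \\
&= O_p\left(\frac{1}{\sqrt{n}}\sum_{j=1}^{n-1} j^{-1}\sum_{t=j+1}^{n}(\hat\beta - \beta)^2 v_{2t-j}v_{2t}\right) \qquad (39) \\
&\quad + O_p\left(\frac{1}{\sqrt{n}}\sum_{j=1}^{n-1} j^{-1}\sum_{t=j+1}^{n}(\hat\beta - \beta)u_{1t-j}v_{2t}\right) \qquad (40) \\
&\quad + O_p\left(\frac{1}{\sqrt{n}}\sum_{j=1}^{n-1} j^{-1}\sum_{t=j+1}^{n}(\hat\beta - \beta)u_{1.2t}v_{2t-j}\right) \qquad (41)
\end{aligned}$$

defining $v_{2t} = \Delta^{-b}u_{2t}^{\#} \in I(b)$.

To prove that (39)-(41) are negligible, we first show that

$$(\log n)^{-1} n^{-2b}\sum_{j=1}^{n-1} j^{-1}\sum_{t=j+1}^{n} v_{2t-j}v_{2t} = O_p(1). \qquad (42)$$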


It is easily seen that

$$a_{nj} = n^{-2b}\sum_{t=j+1}^{n} v_{2t-j}v_{2t} = O_p(1) \qquad (43)$$

for $j/n\to 0$, $n\to\infty$ by a slight variation of the results in, e.g., Marinucci & Robinson (2000). We thus rewrite the left-hand side of (42) as

$$(\log n)^{-1}\sum_{j=1}^{n-1} j^{-1}a_{nj} = (\log n)^{-1}\sum_{j=1}^{k_n} j^{-1}a_{nj} + (\log n)^{-1}\sum_{j=k_n+1}^{n-1} j^{-1}a_{nj}. \qquad (44)$$

Applying (43) and the fact that $(\log n)^{-1}\sum_{j=1}^{n} j^{-1} = O(1)$, the first term on the right-hand side of (44) is readily seen to be $O_p(1)$ if $k_n/n\to 0$ as $n\to\infty$. The last term of (44) is bounded by

$$\begin{aligned}
(\log n)^{-1}\left|\sum_{j=k_n+1}^{n-1} j^{-1}a_{nj}\right|
&\le \max_{j\le n}|a_{nj}|\,(\log n)^{-1}\sum_{j=k_n+1}^{n-1} j^{-1} \\
&= \max_{j\le n}|a_{nj}|\;O\left(\frac{\log(n/k_n)}{\log n}\right),
\end{aligned}$$

where $\max_{j\le n}|a_{nj}| \le n^{-2b}\max_{j\le n}\sum_{t=j+1}^{n}|v_{2t-j}v_{2t}| \le n^{-2b}\sum_{t=1}^{n} v_{2t}^2 = a_{n0} = O_p(1)$ by (43). Hence, we find that (42) is $O_p(1)$ if we can choose $k_n$ such that $k_n/n\to 0$ and $(\log n)^{-1}\log(n/k_n) = O(1)$, which is satisfied if, e.g., $k_n = n/(\log n)$.

Now we return to the evaluation of (39)-(41). When $1/2 < b \le 1$, it follows from (42) that (39) is

$$O_p\left(n^{3/2-4b}\sum_{j=1}^{n-1} j^{-1}\sum_{t=j+1}^{n} v_{2t-j}v_{2t}\right) = O_p\left((\log n)\,n^{3/2-2b}\right).$$

Similarly, when $1/2 < b \le 1$, (40) and (41) are of orders

$$O_p\left(n^{1/2-2b}\sum_{j=1}^{n-1} j^{-1}\sum_{t=j+1}^{n} u_{1t-j}v_{2t}\right) = O_p\left((\log n)\,n^{1/2-b}\right)$$

and

$$O_p\left(n^{1/2-2b}\sum_{j=1}^{n-1} j^{-1}\sum_{t=j+1}^{n} u_{1.2t}v_{2t-j}\right) = O_p\left((\log n)\,n^{1/2-b}\right),$$

respectively. Since $b \ge 3/4+\varepsilon$ for some $\varepsilon > 0$ by assumption, all these terms are $o_p(1)$ using that $n^{-\varepsilon}(\log n) = o(1)$ for any $\varepsilon > 0$.

When $b > 1$, (39)-(41) are all $O_p\left((\log n)\,n^{-1/2}\right)$ by the same arguments. This completes the proof.


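Proof of Theorem 3.2. As in the proof of Theorem 3.1 it can be shown that estimating $\beta$ does not affect the result, so we assume that $\beta$ is known. The second derivative of the likelihood (18) is

$$\begin{aligned}
\frac{\partial^2 L(\theta,\beta,\Sigma)}{\partial\theta^2}
&= -\frac{1}{\sigma_{1.2}^2}\sum_{t=1}^{n}\left\{\ln(1-L)\ln(1-L)\left(g(L)\Delta^{d-b+\theta}\left(y_{1t}-\beta'y_{2t}\right)\right)\right\} \\
&\qquad\quad\times\left(g(L)\Delta^{d-b+\theta}\left(y_{1t}-\beta'y_{2t}\right) - \sigma_{21}'\Sigma_{22}^{-1}G(L)\Delta^{d}y_{2t}\right) \\
&\quad - \frac{1}{\sigma_{1.2}^2}\sum_{t=1}^{n}\left\{\ln(1-L)\left(g(L)\Delta^{d-b+\theta}\left(y_{1t}-\beta'y_{2t}\right)\right)\right\}^2 \\
&= -\frac{1}{\sigma_{1.2}^2}\sum_{t=1}^{n}\sum_{j=1}^{t-1}\sum_{k=1}^{t-j-1} j^{-1}k^{-1}\hat{e}_{1t-j-k}\hat{e}_{1.2t}
 - \frac{1}{\sigma_{1.2}^2}\sum_{t=1}^{n}\sum_{j=1}^{t-1}\sum_{k=1}^{t-1} j^{-1}k^{-1}\hat{e}_{1t-j}\hat{e}_{1t-k}. \qquad (45)
\end{aligned}$$

In the case with i.i.d. errors, $e_t$ is observable and the contribution of the first term to the Fisher information is zero by uncorrelatedness of $e_t$. The contribution of the second term is

$$E\,\frac{1}{n\sigma_{1.2}^2}\sum_{t=1}^{n}\sum_{j=1}^{t-1}\sum_{k=1}^{t-1} j^{-1}k^{-1}e_{1t-j}e_{1t-k} = \frac{1}{\sigma_{1.2}^2}\sum_{j=1}^{n-1} j^{-2}\sigma_{11}^2$$

by uncorrelatedness of $e_t$, which proves the result for i.i.d. errors.

In the remaining cases, we need to take the estimation of the autoregressive parameters into account. Again it can be shown that the first term of (45) is negligible. Since $\hat{C}(0) = \Sigma + O_p(n^{-1/2})$ and $\sigma_{1.2}^{-2} = e_1'\Sigma^{-1}e_1$, the contribution of the second term is

$$\begin{aligned}
&E\operatorname{tr}\frac{1}{n}\sum_{t=1}^{n}\sum_{j=1}^{t-1}\sum_{k=1}^{t-1} j^{-1}k^{-1}e_1e_1'\Sigma^{-1}\hat{C}(0)\Sigma^{-1}\hat{e}_{1t-j}\hat{e}_{1t-k} \\
&\quad= E\operatorname{tr}\,n\sum_{j=1}^{n-1}\sum_{k=1}^{n-1} j^{-1}k^{-1}e_1e_1'\Sigma^{-1}\hat{C}(j)e_1e_1'\hat{C}(k)'\Sigma^{-1} \\
&\quad= E\,n\sum_{j=1}^{n-1} j^{-1}e_1'\Sigma^{-1}\hat{C}(j)e_1\sum_{k=1}^{n-1} k^{-1}e_1'\Sigma^{-1}\hat{C}(k)e_1,
\end{aligned}$$

which is equal to $\operatorname{vec}\left(\Sigma^{-1}e_1e_1'\right)'\Omega_i\operatorname{vec}\left(\Sigma^{-1}e_1e_1'\right)$ as $n\to\infty$ by Lemma 1 and Theorem 3.1.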

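Proof of Theorem 3.3. Let $\theta = \delta/\sqrt{n}$. First suppose $\beta$ is known and define $e_{1nt} = g(L)\Delta^{d-b}z_t = g(L)\Delta^{-\theta}e_{1t}$ and $e_{2t} = G(L)\Delta^{d}y_{2t}$. By the Mean Value Theorem we obtain

$$e_{1nt} = e_{1t} + \frac{\delta}{\sqrt{n}}\sum_{j=1}^{t-1} j^{-1}e_{1t-j} + o_p(n^{-1/2})$$

for all $t = 1,\ldots,n$. Thus, under $\theta = \delta/\sqrt{n}$,

$$\begin{aligned}
S_n &= \frac{1}{\sigma_{1.2}^2\sqrt{n}}\sum_{t=1}^{n}\sum_{j=1}^{t-1} j^{-1}e_{1n,t-j}\left(e_{1nt} - \sigma_{21}'\Sigma_{22}^{-1}e_{2t}\right) \\
&= \frac{1}{\sigma_{1.2}^2\sqrt{n}}\sum_{t=1}^{n}\sum_{j=1}^{t-1} j^{-1}\left(e_{1t-j} + \frac{\delta}{\sqrt{n}}\sum_{k=1}^{t-j-1} k^{-1}e_{1t-j-k}\right)\left(e_{1.2t} + \frac{\delta}{\sqrt{n}}\sum_{k=1}^{t-1} k^{-1}e_{1t-k}\right) + o_p(1) \\
&= \sqrt{n}\sum_{j=1}^{n-1} j^{-1}e_1'\Sigma^{-1}\hat{C}(j)e_1 + \delta n\sum_{j=1}^{n-1} j^{-1}e_1'\Sigma^{-1}\hat{C}(j)e_1\sum_{k=1}^{n-1} k^{-1}e_1'\Sigma^{-1}\hat{C}(k)e_1 + o_p(1)
\end{aligned}$$

as in the proofs of Theorems 3.1 and 3.2. The result when $\beta$ is known now follows from Lemma 1 and the above theorems. When $\beta$ is unknown we can apply the same arguments as in the proof of Theorem 3.1, along with elementary inequalities to the components due to $e_{1nt} - e_{1t}$, to show that the result is unaffected.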
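The first-order expansion used in this proof can be illustrated numerically. The sketch below is an illustration under an assumption not spelled out in the text above: that $\Delta^{-\theta} = (1-L)^{-\theta}$ is expanded by the usual binomial series $\sum_{j\ge 0}\pi_j(\theta)L^j$ with $\pi_0 = 1$ and $\pi_j = \pi_{j-1}(j-1+\theta)/j$. The function name is hypothetical. For small $\theta$ the weights satisfy $\pi_j(\theta)\approx\theta/j$, which is the source of the $j^{-1}$ weights in the expansion of $e_{1nt}$.

```python
import math
from typing import List


def frac_integration_weights(theta: float, n_terms: int) -> List[float]:
    """Coefficients pi_0, ..., pi_{n_terms-1} of (1 - L)**(-theta)."""
    weights = [1.0]
    for j in range(1, n_terms):
        # Standard recursion for the binomial expansion of (1 - L)**(-theta).
        weights.append(weights[-1] * (j - 1 + theta) / j)
    return weights


n = 10_000
delta = 1.0
theta = delta / math.sqrt(n)  # local alternative theta = delta / sqrt(n)
weights = frac_integration_weights(theta, 6)
for j in range(1, 6):
    # Compare the exact weight with the first-order approximation theta / j.
    print(j, weights[j], theta / j)
```

The relative error of the approximation $\theta/j$ at lag $j$ is of order $\theta\log j$, so it vanishes as $\theta = \delta/\sqrt{n}\to 0$.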

Proof of Corollary 3.1. Follows immediately from Theorem 3.3.

Proof of Corollary 3.2. Follows immediately from (32) and Theorem 3.2.

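Proof of Theorem 4.1. By the Mean Value Theorem we obtain

$$\hat{u}_{1.2nt} = u_{1t} - \sigma_{21}'\Sigma_{22}^{-1}u_{2t} + \frac{c}{\sqrt{n}}\sum_{j=1}^{t-1} j^{-1}u_{1t-j} + o_p(n^{-1/2})$$

$$\tilde{u}_{1.2nt} = u_{1t} - \sigma_{21}'\Sigma_{22}^{-1}u_{2t} + \frac{c-\delta}{\sqrt{n}}\sum_{j=1}^{t-1} j^{-1}u_{1t-j} + o_p(n^{-1/2})$$

for all $t = 1,\ldots,n$. Thus, we note that

$$\frac{1}{n}\sum_{t=1}^{n}\hat{u}_{1.2nt}^2 \xrightarrow{P} \sigma_{1.2}^2 \qquad (46)$$

as $n\to\infty$. The numerator of $M_n$ is

$$\begin{aligned}
\sum_{t=1}^{n}\hat{u}_{1.2nt}^2 - \sum_{t=1}^{n}\tilde{u}_{1.2nt}^2
&= \sum_{t=1}^{n}\frac{c^2}{n}\sum_{j=1}^{t-1} j^{-2}u_{1t-j}^2 - \sum_{t=1}^{n}\frac{(c-\delta)^2}{n}\sum_{j=1}^{t-1} j^{-2}u_{1t-j}^2 \\
&\quad + 2\sum_{t=1}^{n}u_{1.2t}\frac{\delta}{\sqrt{n}}\sum_{j=1}^{t-1} j^{-1}u_{1t-j} + o_p(1) \\
&= \delta(2c-\delta)\frac{\pi^2}{6}\sigma_{11}^2 + 2\delta\sqrt{\frac{\pi^2}{6}\sigma_{11}^2\sigma_{1.2}^2}\;Z + o_p(1). \qquad (47)
\end{aligned}$$

Combining (46) and (47) we get the desired result.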


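Proof of Theorem 4.2. Consider the one-sided case with $\delta > 0$ (the reverse case follows similarly). The one-sided power envelope is

$$\begin{aligned}
\Pi(\delta) &= P\left(M(\delta,\delta) > c_\alpha(\delta)\right) \\
&= P\left(\delta\left(2\sqrt{I_0}\,Z + \delta I_0\right) > c_\alpha(\delta)\right) \\
&= P\left(Z > \left(\frac{c_\alpha(\delta)}{\delta} - \delta I_0\right)\Big/2\sqrt{I_0}\right),
\end{aligned}$$

where $c_\alpha(\delta)$ satisfies

$$\begin{aligned}
\alpha &= P\left(M(0,\delta) > c_\alpha(\delta)\right) \\
&= P\left(Z > \left(\frac{c_\alpha(\delta)}{\delta} + \delta I_0\right)\Big/2\sqrt{I_0}\right)
\end{aligned}$$

such that $c_\alpha(\delta) = 2\delta\sqrt{I_0}\,Z_\alpha - \delta^2 I_0$. Then

$$\begin{aligned}
\Pi(\delta) &= P\left(Z > \left(2\sqrt{I_0}\,Z_\alpha - 2\delta I_0\right)\big/2\sqrt{I_0}\right) \\
&= P\left(Z > Z_\alpha - \delta\sqrt{I_0}\right).
\end{aligned}$$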
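The one-sided envelope $\Pi(\delta) = P(Z > Z_\alpha - \delta\sqrt{I_0})$ is straightforward to evaluate. The following is a minimal sketch under stated assumptions: $I_0$ is taken to be the i.i.d.-case information $(\pi^2/6)\,\sigma_{11}^2/\sigma_{1.2}^2$ from the proof of Theorem 3.1, the system is bivariate with unit variances so that $\sigma_{1.2}^2 = 1-\rho^2$, and $\delta = \theta\sqrt{n}$. The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def iid_information(sigma11_sq: float, sigma1_2_sq: float) -> float:
    """Assumed form of I_0 with i.i.d. errors: (pi^2 / 6) * sigma11^2 / sigma1.2^2."""
    return (math.pi ** 2 / 6) * sigma11_sq / sigma1_2_sq


def one_sided_envelope(delta: float, info: float, alpha: float = 0.05) -> float:
    """Pi(delta) = P(Z > Z_alpha - delta * sqrt(I_0))."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(z_alpha - delta * math.sqrt(info))


n, theta = 200, 0.05
delta = theta * math.sqrt(n)
for rho in (0.0, 0.6):
    info = iid_information(1.0, 1.0 - rho ** 2)
    print(f"rho={rho}: envelope={one_sided_envelope(delta, info):.3f}")
```

Under these assumptions the two values should come out close to the Envelope entries reported in Table 1 for n = 200 and θ = 0.05 (0.230 without correlation and 0.305 with correlation .6), and at δ = 0 the function returns the nominal size α.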

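In the two-sided case we note that, since for varying $c$ the family of distributions $M(c,\delta)$ is normal, it satisfies the requirement that it be strictly totally positive of order 3 (STP3, see Lehmann (1986, p. 119)). Hence the power envelope of all unbiased tests of $H_0:\theta = 0$ against $H_1:\theta_{1n} = \delta/\sqrt{n}$ is given by $\Pi_2(\delta) = 1 - P\left(C_{1,\alpha}(\delta) < M(\delta,\delta) < C_{2,\alpha}(\delta)\right)$ (Lehmann (1986, p. 303)), where the constants are determined by

$$P\left(C_{1,\alpha}(\delta) < M(0,\delta) < C_{2,\alpha}(\delta)\right) = 1-\alpha \qquad (48)$$

$$\left.\frac{\partial P\left(C_{1,\alpha}(\delta) < M(c,\delta) < C_{2,\alpha}(\delta)\right)}{\partial c}\right|_{c=0} = 0. \qquad (49)$$

Consider first (49) which implies that ($\phi(\cdot)$ is the density function of the standard normal distribution)

$$\phi\left(\frac{C_{2,\alpha}(\delta)+\delta^2 I_0}{2\delta\sqrt{I_0}}\right) = \phi\left(\frac{C_{1,\alpha}(\delta)+\delta^2 I_0}{2\delta\sqrt{I_0}}\right)$$

with the non-trivial solution $C_{1,\alpha}(\delta) = -C_{2,\alpha}(\delta) - 2\delta^2 I_0$. Now we can find the constants from (48),

$$\begin{aligned}
1-\alpha &= P\left(-C_{2,\alpha}(\delta) - 2\delta^2 I_0 < M(0,\delta) < C_{2,\alpha}(\delta)\right) \\
&= P\left(-\frac{C_{2,\alpha}(\delta)+\delta^2 I_0}{2\delta\sqrt{I_0}} < Z < \frac{C_{2,\alpha}(\delta)+\delta^2 I_0}{2\delta\sqrt{I_0}}\right),
\end{aligned}$$

where $Z$ is a standard normal random variable. Thus, $C_{2,\alpha}(\delta)$ solves $\Phi\left(\left(C_{2,\alpha}(\delta)+\delta^2 I_0\right)/2\delta\sqrt{I_0}\right) = 1-\alpha/2$, which implies $C_{2,\alpha}(\delta) = 2\delta\sqrt{I_0}\,Z_{1-\alpha/2} - \delta^2 I_0$, where $Z_{1-\alpha/2}$ is the $100(1-\alpha/2)\%$ point of the standard normal distribution.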


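The two-sided power envelope is then

$$\begin{aligned}
\Pi_2(\delta) &= 1 - P\left(C_{1,\alpha}(\delta) < M(\delta,\delta) < C_{2,\alpha}(\delta)\right) \\
&= 1 - P\left(-2\delta\sqrt{I_0}\,Z_{1-\alpha/2} - \delta^2 I_0 < 2\delta\sqrt{I_0}\,Z + \delta^2 I_0 < 2\delta\sqrt{I_0}\,Z_{1-\alpha/2} - \delta^2 I_0\right) \\
&= 1 - P\left(-Z_{1-\alpha/2} < Z + \delta\sqrt{I_0} < Z_{1-\alpha/2}\right) \\
&= 1 - F_\lambda\left(\chi^2_{1,1-\alpha}\right).
\end{aligned}$$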


References

Agiakloglou, C. & Newbold, P. (1994), ‘Lagrange multiplier tests for fractional difference’, Journal of Time Series Analysis 15, 253-262.

Ahn, S. K. (1988), ‘Distribution for residual autocovariances in multivariate autoregressive models with structural parameterization’, Biometrika 75, 590-593.

Baillie, R. T. (1996), ‘Long memory processes and fractional integration in econometrics’, Journal of Econometrics 73, 5-59.

Baillie, R. T. & Bollerslev, T. (1989), ‘Common stochastic trends in a system of exchange rates’, Journal of Finance 44, 167-181.

Baillie, R. T. & Bollerslev, T. (1994), ‘Cointegration, fractional cointegration, and exchange rate dynamics’, Journal of Finance 49, 737-745.

Cheung, Y. E. & Lai, K. S. (1993), ‘A fractional cointegration analysis of purchasing power parity’, Journal of Business and Economic Statistics 11, 103-122.

Diebold, F. X., Gardeazabal, J. & Yilmaz, K. (1994), ‘On cointegration and exchange rate dynamics’, Journal of Finance 49, 727-735.

Diebold, F. X. & Rudebusch, G. D. (1989), ‘Long memory and persistence in aggregate output’, Journal of Monetary Economics 24, 189-209.

Doornik, J. A. (2001), Ox: An Object-Oriented Matrix Language, 4th edn, Timberlake Consultants Press, London.

Doornik, J. A. & Ooms, M. (2001), ‘A package for estimating, forecasting and simulating arfima models: Arfima package 1.01 for Ox’, Working Paper, Nuffield College, Oxford.

Dueker, M. & Startz, R. (1998), ‘Maximum-likelihood estimation of fractional cointegration with an application to U.S. and Canadian bond rates’, Review of Economics and Statistics 83, 420-426.

Elliott, G., Rothenberg, T. J. & Stock, J. H. (1996), ‘Efficient tests for an autoregressive unit root’, Econometrica 64, 813-836.

Engle, R. & Granger, C. W. J. (1987), ‘Cointegration and error correction: Representation, estimation and testing’, Econometrica 55, 251-276.

Fox, R. & Taqqu, M. S. (1986), ‘Large-sample properties of parameter estimates for strongly dependent stationary gaussian series’, Annals of Statistics 14, 517-532.


Granger, C. W. J. (1981), ‘Some properties of time series data and their use in econometric model specification’, Journal of Econometrics 16, 121-130.

Hall, P. & Heyde, C. C. (1980), Martingale Limit Theory and its Application, Academic Press, New York.

Hosking, J. R. M. (1980), ‘The multivariate portmanteau statistic’, Journal of the American Statistical Association 75, 602-608.

Jansson, M. (2004), ‘Point optimal tests of the null hypothesis of cointegration’, Forthcoming in Journal of Econometrics.

Jansson, M. & Haldrup, N. (2002), ‘Regression theory for nearly cointegrated time series’, Econometric Theory 18, 1309-1335.

Jeganathan, P. (1999), ‘On asymptotic inference in cointegrated time series with fractionally integrated errors’, Econometric Theory 15, 583-621.

Kim, C. S. & Phillips, P. C. B. (2001), ‘Fully modified estimation of fractional cointegration models’, Preprint, Yale University.

Lehmann, E. L. (1986), Testing Statistical Hypotheses, 2nd edn, Springer, New York.

Lobato, I. N. & Velasco, C. (2000), ‘Long memory in stock-market trading volume’, Journal of Business and Economic Statistics 18, 410-427.

Marinucci, D. & Robinson, P. M. (1999), ‘Alternative forms of fractional Brownian motion’, Journal of Statistical Planning and Inference 80, 111-122.

Marinucci, D. & Robinson, P. M. (2000), ‘Weak convergence of multivariate fractional processes’, Stochastic Processes and their Applications 86, 103-120.

Marinucci, D. & Robinson, P. M. (2001), ‘Semiparametric fractional cointegration analysis’, Journal of Econometrics 105, 225-247.

Nielsen, M. Ø. (2004), ‘Efficient likelihood inference in nonstationary univariate models’, Econometric Theory 20, 116-146.

Phillips, P. C. B. (1991), ‘Optimal inference in cointegrated systems’, Econometrica 59, 283-306.

Phillips, P. C. B. & Hansen, B. E. (1990), ‘Statistical inference in instrumental variables regression with I(1) variables’, Review of Economic Studies 57, 99-125.

Phillips, P. C. B. & Loretan, M. (1991), ‘Estimating long-run economic equilibria’, Review of Economic Studies 58, 407-436.


Robinson, P. M. (1991), ‘Testing for strong serial correlation and dynamic conditional heteroskedasticity in multiple regressions’, Journal of Econometrics 47, 67-84.

Robinson, P. M. (1994), ‘Efficient tests of nonstationary hypotheses’, Journal of the American Statistical Association 89, 1420-1437.

Robinson, P. M. (1995), ‘Gaussian semiparametric estimation of long range dependence’, Annals of Statistics 23, 1630-1661.

Saikkonen, P. (1991), ‘Asymptotically efficient estimation of cointegration regressions’, Econometric Theory 7, 1-21.

Shin, Y. (1994), ‘A residual-based test of the null of cointegration against the alternative of no cointegration’, Econometric Theory 10, 91-115.

Sowell, F. B. (1992), ‘Maximum likelihood estimation of stationary univariate fractionally integrated time series models’, Journal of Econometrics 53, 165-188.

Tanaka, K. (1999), ‘The nonstationary fractional unit root’, Econometric Theory 15, 549-582.


Table 1: Finite Sample Rejection Frequencies Under Assumption 1.0

                           Uncorrelated                Correlation .6
Sample Size   θ      Envelope   LM      LMsc     Envelope   LM      LMsc
n = 200       0      0.050      0.024   0.050    0.050      0.031   0.050
              0.05   0.230      0.172   0.247    0.305      0.222   0.275
              0.10   0.567      0.468   0.573    0.733      0.585   0.643
              0.15   0.859      0.755   0.828    0.960      0.867   0.886
              0.20   0.976      0.913   0.949    0.998      0.976   0.985
              0.25   0.998      0.986   0.993    1.000      0.996   0.998
n = 500       0      0.050      0.037   0.050    0.050      0.040   0.050
              0.05   0.416      0.360   0.416    0.559      0.507   0.555
              0.10   0.889      0.838   0.879    0.974      0.944   0.957
              0.15   0.996      0.980   0.986    1.000      0.997   0.997
              0.20   1.000      1.000   1.000    1.000      1.000   1.000
              0.25   1.000      1.000   1.000    1.000      1.000   1.000

Table 2: Finite Sample Rejection Frequencies Under Assumption 1.1

                           Uncorrelated                Correlation .6
Sample Size   θ      Envelope   LM      LMsc     Envelope   LM      LMsc
n = 200       0      0.050      0.022   0.050    0.050      0.024   0.050
              0.05   0.121      0.060   0.117    0.146      0.075   0.109
              0.10   0.243      0.116   0.195    0.323      0.136   0.197
              0.15   0.412      0.191   0.289    0.553      0.229   0.305
              0.20   0.600      0.270   0.383    0.766      0.330   0.415
              0.25   0.766      0.348   0.455    0.906      0.397   0.469
n = 500       0      0.050      0.029   0.050    0.050      0.031   0.050
              0.05   0.185      0.123   0.159    0.240      0.161   0.208
              0.10   0.442      0.261   0.336    0.591      0.356   0.432
              0.15   0.727      0.464   0.543    0.878      0.594   0.664
              0.20   0.912      0.611   0.673    0.982      0.762   0.810
              0.25   0.982      0.742   0.802    0.999      0.851   0.887


Table 3: Finite Sample Rejection Frequencies Under Assumption 1.2

                           Uncorrelated                Correlation .6
Sample Size   θ      Envelope   LM      LMsc     Envelope   LM      LMsc
n = 200       0      0.050      0.017   0.050    0.050      0.015   0.050
              0.05   0.230      0.178   0.312    0.286      0.146   0.226
              0.10   0.567      0.452   0.581    0.696      0.421   0.580
              0.15   0.859      0.742   0.836    0.944      0.666   0.798
              0.20   0.976      0.922   0.956    0.996      0.903   0.952
              0.25   0.998      0.975   0.992    1.000      0.972   0.991
n = 500       0      0.050      0.035   0.050    0.050      0.023   0.050
              0.05   0.416      0.378   0.436    0.524      0.308   0.444
              0.10   0.889      0.819   0.850    0.961      0.809   0.880
              0.15   0.996      0.987   0.993    1.000      0.983   0.995
              0.20   1.000      1.000   1.000    1.000      1.000   1.000
              0.25   1.000      1.000   1.000    1.000      1.000   1.000

Table 4: Finite Sample Rejection Frequencies Under Assumption 1.3

                           Uncorrelated                Correlation .6
Sample Size   θ      Envelope   LM      LMsc     Envelope   LM      LMsc
n = 200       0      0.050      0.022   0.050    0.050      0.024   0.050
              0.05   0.121      0.058   0.088    0.146      0.065   0.110
              0.10   0.243      0.115   0.193    0.321      0.148   0.233
              0.15   0.412      0.223   0.319    0.550      0.251   0.356
              0.20   0.600      0.269   0.358    0.763      0.321   0.419
              0.25   0.766      0.332   0.432    0.904      0.355   0.478
n = 500       0      0.050      0.029   0.050    0.050      0.027   0.050
              0.05   0.185      0.122   0.177    0.238      0.148   0.239
              0.10   0.442      0.287   0.385    0.588      0.368   0.496
              0.15   0.727      0.483   0.598    0.876      0.599   0.722
              0.20   0.912      0.621   0.727    0.982      0.764   0.847
              0.25   0.982      0.734   0.805    0.999      0.836   0.889


Table 5: Estimates of Fractional Integration Orders

                WG (y1t)   CAN        SW         FRA        ITA        JAP        UK         d_c = \bar{d}
CMLE  p = 0     1.0057     1.1211**   0.9938     1.0081     1.0033     0.9975     1.0770     1.0295
                (0.0425)   (0.0425)   (0.0425)   (0.0425)   (0.0425)   (0.0425)   (0.0425)
      p = 1     0.9625     1.0588     0.9465     0.9920     1.0023     0.9959     1.0163     0.9963
                (0.0975)   (0.0744)   (0.1016)   (0.0914)   (0.1026)   (0.0980)   (0.0960)
GSP   m = 33    1.0428     1.1311     1.0197     1.1141     1.0857     0.9696     0.9837     1.0495
                (0.0870)   (0.0870)   (0.0870)   (0.0870)   (0.0870)   (0.0870)   (0.0870)
      m = 67    1.0847     1.0406     1.0662     1.1338*    1.1029     1.1185     1.0725     1.0885
                (0.0611)   (0.0611)   (0.0611)   (0.0611)   (0.0611)   (0.0611)   (0.0611)

Standard errors are given in parentheses, see also Tanaka (1999) and Robinson (1995). One asterisk denotes significantly different from unity at 5% level and two asterisks denote significantly different from unity at 1% level.

Table 6: Misspecification Tests for Univariate CMLEs

                   WG (y1t)   CAN      SW         FRA        ITA        JAP        UK
p = 0   AR(6)      35.095**   6.5008   42.468**   33.236**   54.811**   49.539**   30.144**
        ARCH(1)    0.0201     2.9443   0.4078     0.2719     29.695**   16.123**   14.914**
p = 1   AR(6)      2.9604     6.5195   6.5442     6.4627     4.2951     11.789*    11.389*
        ARCH(1)    0.7083     2.2356   0.9340     0.5906     2.3098     2.1289     7.1594**

AR(6) is the Portmanteau test up to lag 6 and ARCH(1) is the test for ARCH up to lag 1, which are asymptotically χ2(5 − p) and F(1, 333 − p) distributed, respectively. One asterisk denotes significance at 5% level and two asterisks denote significance at 1% level.

Table 7: One-sided LM Tests for Fractional Cointegration

                  d = b = 1   d = b = dc   d = 1, b = 0.76
Assumption 1.2    13.40**     13.38**      11.91**
Assumption 1.3    2.16*       2.14*        −0.20

One asterisk denotes significance at 5% level and two asterisks denote significance at 1% level.


[Figure 1 appears here: four panels of asymptotic local power curves, each with a horizontal axis running from 0 to 5 and a vertical axis running from 0.25 to 1.00. Two panels show the curves labelled u ~ iid; u1 ~ iid, γ2 = 0.5; γ1 = 0.2, u2 ~ iid; and γ1 = 0.2, γ2 = 0.5. The other two panels show the curves labelled u ~ iid; u1 ~ iid, γ2 = 0.8; γ1 = 0.4, u2 ~ iid; and γ1 = 0.4, γ2 = 0.8.]

Figure 1: Asymptotic local power functions calculated using Corollary 3.1 with d = b = 1 and first order autoregressive specifications. The variances are normalized to unity, and the correlation is .6 and .9 in the left-hand and right-hand side panels, respectively.


Chapter 4

Multivariate Lagrange Multiplier Tests for Fractional Integration

Forthcoming in Journal of Financial Econometrics, 2005


Multivariate Lagrange Multiplier Tests for Fractional Integration

Morten Ørregaard Nielsen∗

Abstract

We introduce a multivariate Lagrange Multiplier (LM) test for fractional integration. We derive and analyze the LM statistic and show that it is asymptotically noncentral chi-squared distributed under local alternatives, and that, under Gaussianity, the LM test is asymptotically efficient against local alternatives. It is shown that the regression variant in Breitung & Hassler (2002, Journal of Econometrics 110, 167-185) is not equivalent to the LM test in the multivariate case, although it is in the univariate case. A generalization of the LM test that explicitly allows for different integration orders for each variable is also introduced. The finite sample properties of the LM test are evaluated by Monte Carlo experiments which demonstrate that it is superior to the Breitung & Hassler (2002) test. An application to multivariate time series of real interest rates for six countries is offered, demonstrating that more clear-cut evidence can be drawn from multivariate tests compared to conducting several univariate tests.

JEL Classification: C32

Keywords: Asymptotic Local Power, Efficient Test, Fractional Integration, Lagrange Multiplier Test, Multivariate Fractional Unit Root, Nonstationarity

∗I am grateful to Jörg Breitung, Uwe Hassler, Søren Johansen, Eric Renault (the editor), an associate editor, and two anonymous referees for many helpful comments and constructive suggestions, to Byung Chul Ahn and Klaus Neusser for providing the data, and to the Danish Social Science Research Council for financial support (SSF grant no 24-02-0181).


1 Introduction

In this paper we introduce multivariate Lagrange Multiplier (LM) tests (or efficient score tests) for fractional integration. Multivariate procedures are important since most applied work concerns multiple time series, either stationary or nonstationary. Parametric tests for fractional integration have been examined previously by Robinson (1991, 1994), Agiakloglou & Newbold (1994), and Tanaka (1999), among others, in a univariate framework, and recently by Breitung & Hassler (2002) and Gil-Alana (2003) in the multivariate case. The objective is to test if an observed $K$-vector time series $y_t$ is integrated of order $d$, denoted $I(d)$, against the hypothesis that it is $I(d+\theta)$ for $\theta \neq 0$. By differencing the observed time series, this is equivalent to testing if $x_t = (1-L)^d y_t$ is $I(0)$ against $I(\theta)$. Multivariate nonparametric tests for $I(0)$ against $I(\theta)$ have been considered by Robinson (1995), who considers a test based on the multivariate log-periodogram estimator, Lobato & Robinson (1998), who propose a multivariate LM test based on the objective function considered by Lobato (1999), who also considers a Wald statistic.

With no multivariate parametric tests available for testing the order of fractional integration, researchers interested in conducting parametric tests on multiple time series have been forced to apply univariate tests to each element of the multiple time series. That procedure is not only cumbersome, but ignores potentially important correlations between the elements of the multiple time series, which could lead to increased power of a multivariate test. Hence, the purpose of the present paper is to introduce LM tests that apply to the multivariate case, with the usual computational motivation for the LM principle. The proposed multivariate tests in the present paper in many ways parallel the ones by Choi & Ahn (1999) and Nyblom & Harvey (2000), who propose stationarity tests, i.e. tests of I(0) against I(1), for multiple time series, and our work can thus also be seen as a generalization of their work with the important difference that our test is directed against different (i.e. fractional) alternatives.

The tests proposed in this paper are intended primarily for preliminary data analysis. For instance, when testing the null of stationarity or I(0)-ness (against fractional alternatives), non-rejection would allow standard methods to be employed for conducting, e.g., causality, structural vector autoregression, or impulse response analyses. More generally, the tests may indicate the transformation of the data that would be required in order to make the data suitable for such analyses. For instance, in Andersen, Bollerslev, Diebold & Labys (2003) a fractional difference is taken of the multivariate volatility processes considered there, and the resulting multivariate series are modeled by vector autoregressions. Our tests could then be applied to ensure that the fractional difference is sufficient to render the volatility processes I(0). Another example is the analysis of the Real Interest Parity hypothesis by Kugler & Neusser (1993) using a co-dependence approach, which requires that the data is I(0). Again, our methodology could then be applied to test the latter hypothesis, underlying their entire analysis, and we return to this in section 5 below.

Suppose we observe $y_t$, $t = 1,\ldots,n$, generated by

$$(1-L)^{d+\theta}\,y_t = e_t I(t \ge 1),\quad t = 0,\pm 1,\pm 2,\ldots, \qquad (1)$$

where $I(\cdot)$ denotes the indicator function and $e_t$ is $I(0)$, i.e. is covariance stationary and has spectral density that is bounded and bounded away from zero at the origin. The process $y_t$ generated by (1) is well defined for all $d$ and $\theta$, and is sometimes called a multivariate type II fractionally integrated process. The effect of the truncation or initial values condition in (1) has been analyzed by Robinson (2004) who investigates the difference between the process defined in (1) and the corresponding process without the truncation (type I). The process in (1) allows a uniform definition, valid for all $d$ and $\theta$, whereas the alternative definition without truncation would be valid only for $d+\theta \in (-1/2, 1/2)$ and partial summation would be needed to generate a process with integration order outside this range. For more details on type I and type II fractionally integrated processes and their difference, see Marinucci & Robinson (1999) and Robinson (2004). Deterministic terms could be added to (1), allowing for a non-zero mean and trend or deterministic seasonal behavior, see section 3.1. In section 3.2 we consider the extension to different values of $d$ and $\theta$ for each component of $y_t$ in (1).
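To make the truncation in (1) concrete, the following is a minimal sketch of how a single component of a type II process could be generated. It assumes the usual binomial expansion of the fractional filter, so that $y_t = \sum_{j=0}^{t-1}\pi_j e_{t-j}$ with $\pi_0 = 1$ and $\pi_j = \pi_{j-1}(j-1+d+\theta)/j$. The function names are hypothetical and the sketch is univariate; with a common integration order the same filter would be applied to each component of the K-vector.

```python
import random
from typing import List


def frac_weights(order: float, n: int) -> List[float]:
    """Coefficients pi_0, ..., pi_{n-1} of (1 - L)**(-order)."""
    weights = [1.0]
    for j in range(1, n):
        weights.append(weights[-1] * (j - 1 + order) / j)
    return weights


def type_ii_process(errors: List[float], order: float) -> List[float]:
    """y_t = sum_{j=0}^{t-1} pi_j * e_{t-j}; errors before t = 1 are set to zero."""
    weights = frac_weights(order, len(errors))
    return [
        sum(weights[j] * errors[t - j] for j in range(t + 1))
        for t in range(len(errors))
    ]


random.seed(0)
e = [random.gauss(0.0, 1.0) for _ in range(200)]  # i.i.d. errors (illustrative choice)
y = type_ii_process(e, 0.8)  # order d + theta = 0.8, outside (-1/2, 1/2)
print(y[:5])
```

Because only errors from t = 1 onwards enter the sum, the construction is valid for any order, including nonstationary values such as 0.8 above. An order of 1 reproduces the cumulative sum of the errors and an order of 0 returns the errors themselves.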

For the moment, we let the errors $e_t$ be independently and identically distributed with mean zero and positive definite covariance matrix $\Sigma$, i.i.d.$(0,\Sigma)$. In section 3.3 we relax this assumption, and let $e_t$ follow a stationary vector autoregressive process of order $p$, VAR($p$). We could presumably relax the assumption of constant second moment structure to accommodate conditional heteroskedasticity along the lines of Ling & Li (1997) who analyze a univariate fractionally integrated autoregressive moving average model with conditional heteroskedasticity. Conditional heteroskedasticity is often found in financial data, where our methodology is especially applicable due to the large amount of data available, and hence this would be an important direction for future research. Furthermore, note that positive definiteness of $\Sigma$ rules out cointegration among the components of $y_t$. However, even though we rule out the possibility of cointegration, which has been popular especially in recent empirical macroeconomics, we are still able to apply our model to test a number of interesting hypotheses such as joint stationarity or I(0)-ness as described above, in which case we need not worry about the possibility of cointegration. It turns out that our test is also implicitly a test of the null of no cointegration (see section 2 and the discussion following Corollary 2 below), and for such tests Lee & Tse (1996) provide evidence that leptokurticity as produced by conditional heteroskedasticity may cause overrejection of the null of no cointegration, and Sin & Ling (2004) show how reduced rank analysis may be modified to accommodate conditional heteroskedasticity. Further analysis in the case of cointegration, reduced rank of $\Sigma$, and/or conditional heteroskedasticity is beyond the scope of this paper, and we leave these important topics for future research. However, in the next section we do briefly consider the properties of our proposed tests in the presence of cointegration.

In the model (1) we assume that $d$ is specified a priori and wish to test the hypothesis

$$H_0 : \theta = 0 \tag{2}$$

against the alternative $H_1 : \theta \neq 0$. For instance, the unit root hypothesis and the hypothesis of joint stationarity (or more precisely, weak dependence) of $y_t$ are given by (1) and (2) with $d = 1$ and $d = 0$, respectively. Indeed, the most important motivation for the current study is the test of joint stationarity or joint $I(0)$-ness, which we also illustrate empirically in section 5 below. In that case we also do not need to worry about the assumption of no cointegration. It is important to note that the assumption that $d$ is known a priori is made without loss of generality. The specification of a particular value of $d$ exactly specifies the null hypothesis since $\theta = 0$ in (2).

Robinson (1994) and Tanaka (1999) consider testing (2) in the univariate model, i.e. (1) with $K = 1$. Robinson (1994) shows that the frequency domain LM test statistic has a chi-squared limiting distribution under the null, and is asymptotically efficient against local alternatives, $\theta = \delta/\sqrt{n}$, under Gaussianity. The frequency domain LM test of Robinson (1994) is extended to the multivariate case by Gil-Alana (2003). Tanaka (1999) shows that the univariate time domain LM test statistic has a normal (one-sided test) or chi-squared (two-sided test) limiting distribution, and that, under Gaussianity, it is asymptotically most powerful against local alternatives among all the invariant tests. A simulation study by Tanaka (1999) also indicates the finite sample superiority of the time domain test over Robinson's (1994) frequency domain test. Breitung & Hassler (2002) suggest a regression variant of Tanaka's (1999) LM test similar to the Dickey-Fuller test, see also Dolado, Gonzalo & Mayoral (2002). Breitung & Hassler (2002) also suggest a multivariate version, which generalizes to a trace test for the cointegrating rank, along the lines of the Johansen (1988) test, and show that their multivariate test has a limiting chi-squared distribution, where the degrees of freedom depend only on the cointegrating rank under the null.

We show that the equivalence of the LM test and the regression based test of Breitung & Hassler (2002) fails to hold in the multivariate case. We derive the LM test statistic for the hypothesis (2) in the time domain, with the usual computational advantage of estimation under the null. Thus, no multivariate fractionally integrated model needs to be estimated. Of course, in the presence of short-run dynamics a vector autoregressive model needs to be estimated under the null, but that is quite simple and computationally not as demanding as the estimation of a multivariate fractionally integrated model, which would typically require numerical optimization. In fact, the test is based on computationally simple moment matrices, see (4) and (7) below.

We establish desirable distributional properties and optimality properties of the test. In particular, the test statistic is asymptotically noncentral chi-squared distributed under local alternatives, where the degrees of freedom equal the number of restrictions tested, and under Gaussianity, it is asymptotically efficient against local alternatives. Furthermore, the LM test is shown to be consistent against fractional cointegration, i.e. it rejects with probability tending to one in the case where the integration order of some linear combination of the observed vector time series is lower than the hypothesized value. Thus, the test could be employed as a test of non-cointegration against the alternative of cointegration. An extension of the LM test statistic that explicitly allows for different integration orders (both different $d$ and different $\theta$) for each variable in the vector time series $y_t$ is also introduced, and its asymptotic properties are examined.

In a simulation study we examine the properties of the LM test in finite samples and compare with the Breitung & Hassler (2002) test. We find that the LM test compares favorably with the Breitung & Hassler (2002) test, and in particular that the LM test has higher finite sample power than the Breitung & Hassler (2002) test in the non-cointegrated model. The frequency domain test by Gil-Alana (2003) is not considered in our simulation study since the evidence in Tanaka (1999) suggests that time domain tests are superior in terms of finite sample properties.

Finally, we present an empirical application of the multivariate LM test and its extensions, which demonstrates their usefulness in practice. We apply our tests to a multivariate time series of real interest rates for six major industrialized countries previously examined by Kugler & Neusser (1993) and Choi & Ahn (1999). Kugler & Neusser (1993) analyze the Real Interest Parity hypothesis by a co-dependence approach, which requires the vector time series in question to be stationary, or more precisely, to be $I(0)$. To test this underlying hypothesis, Kugler & Neusser (1993) apply univariate unit root tests to each element of the multiple time series, which mostly reject the null of a unit root, and Choi & Ahn (1999) apply their multivariate stationarity test (i.e. test of $I(0)$ against $I(1)$) and find no evidence against the null hypothesis. Our objective is to test if the real interest rates are jointly $I(0)$ against fractional alternatives, and the evidence we obtain from the multivariate tests is more clear-cut than the evidence from applying univariate tests to each element of the multiple time series.

The rest of the paper is laid out as follows. Next, we derive and analyze the multivariate LM test in the basic model with only one integration order, which is common to all the variables. In section 3 we consider generalizations of the basic model allowing deterministic terms, different values of $d$ and $\theta$ for each variable, and short-run dynamics. Section 4 presents the results of the simulation study, and section 5 presents the empirical application. Section 6 offers some concluding remarks. Proofs are collected in the appendix.

2 Multivariate LM Test

The Gaussian log-likelihood function of the model in (1) is

$$L(\theta,\Sigma) = -\frac{n}{2}\ln\left(2\pi|\Sigma|\right) - \frac{1}{2}\sum_{t=1}^{n}(1-L)^{d+\theta}y_t'\,\Sigma^{-1}(1-L)^{d+\theta}y_t, \tag{3}$$


and hence the score is, see also Tanaka (1999) and Breitung & Hassler (2002),

$$\left.\frac{\partial L(\theta,\Sigma)}{\partial\theta}\right|_{\theta=0,\,\Sigma=\hat\Sigma} = -\sum_{t=1}^{n}\left(\ln(1-L)\right)x_t'\,\hat\Sigma^{-1}x_t = \mathrm{tr}\left(\hat\Sigma^{-1}S_{10}\right), \tag{4}$$

where $x_t = (1-L)^d y_t$, $S_{10} = \sum_{t=2}^{n}x^*_{t-1}x_t'$, $x^*_{t-1} = \sum_{j=1}^{t-1}j^{-1}x_{t-j}$, and $\hat\Sigma = n^{-1}\sum_{t=1}^{n}x_tx_t'$ is a consistent estimate of $\Sigma = E(e_te_t')$ under the null. When $K = 1$, i.e. when the observed time series is univariate, the score in (4), normalized by $\sqrt{n}$, reduces to Tanaka's (1999) univariate time domain score statistic, $s_n = \sqrt{n}\sum_{j=1}^{n-1}j^{-1}\hat\rho(j)$, where $\hat\rho(j)$ is the $j$'th order sample autocorrelation of $x_t$. Our multivariate score (4) is similar to Choi & Ahn's (1999, p. 47) SBDH statistic and Nyblom & Harvey's (2000, p. 179) LBI statistic for testing $I(0)$ against $I(1)$ in multiple time series. The difference is that we introduce the $j^{-1}$ weights in the calculation of $x^*_{t-1}$, where Choi & Ahn (1999) and Nyblom & Harvey (2000) use unweighted partial sums.
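The reduction of (4) to Tanaka's statistic when $K = 1$ can be checked numerically. The sketch below is illustrative only: it assumes that the input is the already differenced series $x_t$, that "normalized by $\sqrt{n}$" means division by $\sqrt{n}$, and that the sample autocorrelation is computed without mean correction. All function names are hypothetical.

```python
import math


def weighted_partial_sums(x):
    """Element t (0-based) holds x*_t = sum_{j=1}^{t} x[t-j] / j, paired with x[t]."""
    return [sum(x[t - j] / j for j in range(1, t + 1)) for t in range(len(x))]


def score_univariate(x):
    """Score (4) for K = 1 divided by sqrt(n): S10 / (sigma2_hat * sqrt(n))."""
    n = len(x)
    x_star = weighted_partial_sums(x)
    s10 = sum(x_star[t] * x[t] for t in range(1, n))
    sigma2_hat = sum(v * v for v in x) / n
    return s10 / sigma2_hat / math.sqrt(n)


def tanaka_score(x):
    """sqrt(n) * sum_{j=1}^{n-1} rho_hat(j) / j, with uncentered autocorrelations."""
    n = len(x)
    denom = sum(v * v for v in x)
    rho = [sum(x[t] * x[t - j] for t in range(j, n)) / denom for j in range(1, n)]
    return math.sqrt(n) * sum(r / j for j, r in enumerate(rho, start=1))
```

The two functions agree up to rounding because interchanging the order of summation in $S_{10}$ gives $S_{10} = \sum_{j=1}^{n-1}j^{-1}\sum_{t=j+1}^{n}x_tx_{t-j}$.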

Breitung & Hassler (2002) consider the test statistic

$$\Lambda_0(d) = \mathrm{tr}\left(\hat\Sigma^{-1}S_{10}'S_{11}^{-1}S_{10}\right), \tag{5}$$

where $S_{11} = \sum_{t=2}^{n}x^*_{t-1}x^{*\prime}_{t-1}$, and show that $\Lambda_0(d) \to_d \chi^2_{K^2}$ under the null (2). However, since $\mathrm{tr}(AB) \neq \mathrm{tr}(A)\,\mathrm{tr}(B)$ in general, (5) is not equivalent to the multivariate LM test of (2), even though Breitung & Hassler (2002) demonstrate such an equivalence for the univariate test. Instead, (5) is a regression variant along the lines of the Dickey-Fuller test and the fractional Dickey-Fuller test, see Dolado et al. (2002). Indeed, the main aim of Breitung & Hassler (2002) is to construct a fractional trace statistic similar to Johansen (1988), just as the Dickey-Fuller test generalizes to Johansen's (1988) trace statistic. In particular, (5) can be rewritten as a sum of eigenvalues, $\Lambda_0(d) = \sum_{j=1}^{K}\lambda_j$, where $\lambda_j$ turns out to be the test statistic for $\phi_j = 0$ in

$$(v_j'x_t) = \phi_j'x^*_{t-1} + e_t$$

and $v_j$ is the eigenvector corresponding to $\lambda_j$. Thus, $K^2$ restrictions are being tested ($\phi_j = 0$, $j = 1, \ldots, K$) instead of one restriction as in (2), which explains the $K^2$ degrees of freedom in the asymptotic distribution of $\Lambda_0(d)$. Consequently, the test statistic (5) is not the LM test statistic for testing the hypothesis (2).

The multivariate LM test statistic for testing (2) is, e.g. Amemiya (1985, p. 142),

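$$LM = \left.\frac{\partial L(\eta)}{\partial\eta'}\right|_{\theta=0,\,\Sigma=\hat\Sigma}\left[-\left.\frac{\partial^2L(\eta)}{\partial\eta\,\partial\eta'}\right|_{\theta=0,\,\Sigma=\hat\Sigma}\right]^{-1}\left.\frac{\partial L(\eta)}{\partial\eta}\right|_{\theta=0,\,\Sigma=\hat\Sigma}, \tag{6}$$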


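where $\eta = ((\mathrm{vec}\,\Sigma)', \theta')'$. The relevant block of the Hessian matrix in (6) is

$$\begin{aligned}
-\left.\frac{\partial^2L(\theta,\Sigma)}{\partial\theta^2}\right|_{\theta=0,\,\Sigma=\hat\Sigma}
&= \sum_{t=1}^{n}\left(\ln(1-L)\right)x_t'\,\hat\Sigma^{-1}\left(\ln(1-L)\right)x_t \\
&\quad + \frac{1}{2}\sum_{t=1}^{n}\left(x_t'\hat\Sigma^{-1}\left(\ln(1-L)\right)^2x_t + \left(\ln(1-L)\right)^2x_t'\,\hat\Sigma^{-1}x_t\right) \\
&= \mathrm{tr}\left(\hat\Sigma^{-1}M_{11}\right),
\end{aligned}$$

defining $M_{11} = S_{11} + \frac{1}{2}(S_{20} + S_{20}')$, $S_{20} = \sum_{t=1}^{n}x^{**}_{t-2}x_t'$, and $x^{**}_{t-2} = \sum_{j=1}^{t-2}j^{-1}x^*_{t-j-1}$. Thus, we find that

$$LM = \frac{\mathrm{tr}\left(\hat\Sigma^{-1}S_{10}\right)^2}{\mathrm{tr}\left(\hat\Sigma^{-1}M_{11}\right)}. \tag{7}$$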
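The moment matrices entering (7) are simple to compute. The following is a minimal NumPy sketch, not the Ox implementation used for the results in this paper. It assumes that `x` is an $n \times K$ array holding $x_t = (1-L)^dy_t$ in its rows, and the names `weighted_lags` and `lm_statistic` are hypothetical.

```python
import numpy as np


def weighted_lags(x):
    """Row t holds sum_{j=1}^{t} x[t-j] / j, i.e. x*_{t-1} aligned with x_t."""
    n = x.shape[0]
    out = np.zeros_like(x, dtype=float)
    for t in range(1, n):
        weights = 1.0 / np.arange(1, t + 1)   # 1, 1/2, ..., 1/t
        out[t] = weights @ x[t - 1::-1]       # rows x[t-1], x[t-2], ..., x[0]
    return out


def lm_statistic(x):
    """LM statistic (7) from the moment matrices S10, S11 and S20."""
    n = x.shape[0]
    x_star = weighted_lags(x)
    # Applying the weighted lag twice gives x**_{t-2}, because x*_0 = 0
    x_2star = weighted_lags(x_star)
    sigma_inv = np.linalg.inv(x.T @ x / n)
    s10 = x_star.T @ x
    s11 = x_star.T @ x_star
    s20 = x_2star.T @ x
    m11 = s11 + 0.5 * (s20 + s20.T)
    return np.trace(sigma_inv @ s10) ** 2 / np.trace(sigma_inv @ m11)
```

Under the null the resulting value would be compared with a $\chi^2_1$ critical value. Replacing `x` by `x @ D.T` for a non-singular `D` leaves the statistic unchanged, since both traces are invariant to such transformations.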

In the following theorem we present the limiting distribution of the test statistic under alternatives local to the null, $H_{1n} : \theta = \delta/\sqrt{n}$, where $\delta$ is a fixed scalar.

Theorem 1 Under $\theta = \delta/\sqrt{n}$, the LM test statistic (7) is asymptotically distributed as $\chi^2_1(\mathcal{I}\delta^2)$, where

$$\mathcal{I} = -\lim_{n\to\infty}E_0\,\frac{1}{n}\frac{\partial^2L(\theta,\Sigma)}{\partial\theta^2} = \frac{\pi^2K}{6}. \tag{8}$$

Under the additional assumption of Gaussianity, the test is asymptotically efficient against local alternatives. Under the null hypothesis (2), $LM \to_d \chi^2_1$.

Thus, the LM test is chi-squared with one degree of freedom under the null, which is expected since only one restriction is being tested. In contrast, the test (5) has $K^2$ degrees of freedom. More generally, standard statistical results apply in the present fractional model, unlike in the multivariate unit root and stationarity tests nested in autoregressive models, e.g. Phillips & Durlauf (1986), Fountis & Dickey (1989), Choi & Ahn (1999), and Nyblom & Harvey (2000).
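The value $\pi^2K/6$ in (8) can be traced back to the moment matrices in (7). The following is a heuristic sketch only, not the proof given in the appendix. It assumes the null with i.i.d. errors, so that $x_t = e_t$, and it treats $\hat\Sigma$ as equal to $\Sigma$.

```latex
\begin{align*}
E\left(x^*_{t-1}x^{*\prime}_{t-1}\right) &= \sum_{j=1}^{t-1} j^{-2}\,\Sigma,
\qquad
E\left(x^{**}_{t-2}x_t'\right) = 0, \\
\frac{1}{n}\,E\,\mathrm{tr}\left(\Sigma^{-1}M_{11}\right)
&= \frac{1}{n}\sum_{t=2}^{n}\sum_{j=1}^{t-1} j^{-2}\,\mathrm{tr}\left(\Sigma^{-1}\Sigma\right)
\;\longrightarrow\; K\sum_{j=1}^{\infty} j^{-2} = \frac{\pi^2 K}{6}.
\end{align*}
```

The second expectation vanishes because $x^{**}_{t-2}$ depends only on observations dated $t-2$ and earlier, so only $S_{11}$ contributes to the limit.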

Note that Theorem 1 continues to hold if (the negative inverse of) the Fisher information matrix (8) is substituted for the Hessian matrix in the definition of the LM test in (6) or (7). However, in simulation experiments not reported here, it was found that the LM test defined in (6) has superior finite sample properties, especially in the presence of short-run dynamics. In addition, when allowance is made for short-run dynamics, the calculation of the Fisher information matrices, see (17) and (18) below, can be quite complicated. Thus, we maintain the definition of the LM test in terms of the Hessian matrix as in (6).

Next, as in Choi & Ahn (1999), we use the fact that the LM test is invariant to non-singular linear transformations, i.e. transformations of the type $\tilde x_t = Dx_t$ for $D$ non-singular, to show that the test is consistent against fractional cointegration. Following Breitung & Hassler (2002), we say that $y_t$ is fractionally cointegrated, denoted $CI(d,b)$, if $y_t$ is $I(d)$ and there exist $K \times r$ and $K \times (K-r)$ linearly independent matrices $\beta$ and $\gamma$ of full rank such that

$$\gamma'y_t \sim I(d),$$

$$\beta'y_t \sim I(d-b),$$

where it is assumed that the fractional integration order $d$ is given, but $b > 0$ is unknown. That is, the maintained hypothesis is that $y_t$ is $I(d)$, but it is now assumed that there exists some linear combination of $y_t$ which is integrated of a lower order. Thus, we are still under the null in the sense of (2). We also assume that $u_t = (\gamma, (1-L)^{-b}\beta)'x_t$ is i.i.d.$(0,\Sigma)$. The following corollary shows that our multivariate LM test (7) rejects with probability tending to one when $y_t$ is $CI(d,b)$.

Corollary 2 The LM test statistic (7) is $O_p(n)$ when $y_t$ is $CI(d,b)$.

Note that in practical applications with e.g. $d = 1$, rejection of the null can be caused either by cointegration among the variables or by one of the variables not being $I(1)$. Hence, in that case, rejection of the null warrants further investigation to determine the cause of the rejection, e.g. analyzing subsets of the variables. In addition, Lee & Tse (1996) argue that rejection could be caused by leptokurticity as produced by conditional heteroskedasticity, and Sin & Ling (2004) show how reduced rank analysis may be modified to accommodate conditional heteroskedasticity. However, this is not a concern when testing the important hypothesis of joint stationarity or joint $I(0)$-ness, i.e. with $d = 0$, in which case we do not need to worry about the possible presence of cointegration.

More generally, by setting $\beta$ equal to a column of the identity matrix, Corollary 2 actually demonstrates that the LM test in this section is also consistent against the alternatives considered in section 3.2 below, in the sense that if the integration order of just one of the variables differs from $d$, the test statistic will be $O_p(n)$.

3 Extensions of the Model

3.1 Deterministic Terms

We allow for deterministic terms in the data generating process following Robinson (1994). Suppose we observe the $K$-vector time series $\{y^0_t, t = 1, 2, \ldots, n\}$, generated by the linear model

$$y^0_t = \beta z_t + y_t, \tag{9}$$

where $z_t$ is a $q$-vector of purely deterministic components and $y_t$ is an unobserved $K$-dimensional component generated by (1).

Two leading cases for the deterministic terms are $z_t = 1$ and $z_t = (1,t)'$, which yield the models $y^0_{kt} = \beta_{k0} + y_{kt}$ and $y^0_{kt} = \beta_{k0} + \beta_{k1}t + y_{kt}$, respectively, but other terms like seasonal dummies or polynomial trends can also be accommodated. As in Definition 2 of Robinson (1994), it is only required that $\sum_{t=1}^{n}\tilde z_t\tilde z_t'$ is positive definite for $n$ sufficiently large, where $\tilde z_t = (1-L)^dz_t$. It follows from Robinson (1994) that $\beta$ can be estimated by least squares regression of $(1-L)^dy^0_t$ on $\tilde z_t$, yielding the estimate $\hat\beta$. The test statistic is then based on the residuals $\hat y_t = y^0_t - \hat\beta z_t$.

Note that we assume the deterministic terms appear in the generating mechanism of the observed vector time series $y^0_t$, instead of $x_t$ as in Breitung & Hassler (2002). This follows the approach of Robinson (1994), and is more natural for interpretation of $z_t$ when $d$ is nonintegral. Consider the simple case with $z_t = 1$ and $0 < d < 1/2$. In our setup, $y^0_t$ is then an asymptotically stationary long memory process around a non-zero mean vector, $\beta_0$. However, in the setup of Breitung & Hassler (2002), $y^0_t$ would be an asymptotically stationary long memory process around the vector of fractional deterministic trends, $(1-L)^{-d}I(t \geq 1)\beta_0$.
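For the leading case $z_t = 1$ the least squares step can be written out for one component of $y^0_t$. This is a sketch under the assumption that the type II (truncated) operator is used for both the data and the regressor; `frac_diff_type2` and `remove_mean` are hypothetical names.

```python
def frac_diff_type2(series, d):
    """Apply (1 - L)^d with all pre-sample values set to zero."""
    n = len(series)
    pi = [1.0]
    for j in range(1, n):
        pi.append(pi[-1] * (j - 1 - d) / j)
    return [sum(pi[j] * series[t - j] for j in range(t + 1)) for t in range(n)]


def remove_mean(y_obs, d):
    """Least squares regression of (1-L)^d y0_t on z~_t = (1-L)^d 1.

    Returns beta_hat and x_hat_t = (1-L)^d (y0_t - beta_hat).
    """
    n = len(y_obs)
    z_tilde = frac_diff_type2([1.0] * n, d)
    y_tilde = frac_diff_type2(y_obs, d)
    beta_hat = sum(z * y for z, y in zip(z_tilde, y_tilde)) / sum(z * z for z in z_tilde)
    x_hat = [y - beta_hat * z for y, z in zip(y_tilde, z_tilde)]
    return beta_hat, x_hat
```

With `d = 0` the estimate is the sample mean, while with `d = 1` the differenced constant is non-zero only at $t = 1$, so the estimate equals the first observation.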

3.2 Different θ for Each Variable

Suppose the generating mechanism (1) is modified to

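$$(1-L)^{d_k+\theta_k}y_{kt} = e_{kt}I(t \geq 1), \quad k = 1, \ldots, K, \quad t = 0, \pm1, \pm2, \ldots, \tag{10}$$

such that $\theta = (\theta_1, \ldots, \theta_K)'$ is now a $K$-vector. Redefining the log-likelihood accordingly and denoting it $L_K(\theta,\Sigma)$ (subscript $K$ denoting different $\theta$ for each variable), the score is now given by

$$\begin{aligned}
\left.\frac{\partial L_K(\theta,\Sigma)}{\partial\theta}\right|_{\theta=0,\,\Sigma=\hat\Sigma}
&= -\sum_{t=1}^{n}\mathrm{diag}\left((\ln(1-L))x_t\right)\hat\Sigma^{-1}x_t \\
&= \sum_{t=1}^{n}J_K'\left(x^*_{t-1}\otimes\hat\Sigma^{-1}x_t\right) \\
&= J_K'\,\mathrm{vec}\left(\hat\Sigma^{-1}S_{10}'\right)
\end{aligned} \tag{11}$$

by use of $\mathrm{vec}(ABC) = (C'\otimes A)\,\mathrm{vec}\,B$ and property 1 of Lemma 1. We denote by $\mathrm{diag}(a)$ the diagonal matrix having the vector $a$ on the diagonal, and the matrix $J_K$ is defined in Lemma 1. As in the previous section, the score (11) reduces to the univariate score when $K = 1$.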

The relevant block of the Hessian matrix in (6) is

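$$\begin{aligned}
-\left.\frac{\partial^2L_K(\theta,\Sigma)}{\partial\theta\,\partial\theta'}\right|_{\theta=0,\,\Sigma=\hat\Sigma}
&= \sum_{t=1}^{n}\mathrm{diag}\left((\ln(1-L))x_t\right)\hat\Sigma^{-1}\,\mathrm{diag}\left((\ln(1-L))x_t\right) \\
&\quad + \sum_{t=1}^{n}J_K'\left(I_K\otimes\hat\Sigma^{-1}x_t\right)\mathrm{diag}\left((\ln(1-L))^2x_t\right) \\
&= \sum_{t=1}^{n}\mathrm{diag}\left(x^*_{t-1}\right)\hat\Sigma^{-1}\,\mathrm{diag}\left(x^*_{t-1}\right) + \sum_{t=1}^{n}\mathrm{diag}\left(\hat\Sigma^{-1}x_t\right)\mathrm{diag}\left(x^{**}_{t-2}\right) \\
&= S_{11}\odot\hat\Sigma^{-1} + \left(\hat\Sigma^{-1}S_{20}'\right)\odot I_K,
\end{aligned}$$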


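using property 3 of Lemma 1. Here, $\odot$ denotes the Hadamard product, see the appendix or Magnus & Neudecker (1999). We thus form the LM test statistic

$$LM_K = \mathrm{vec}\left(\hat\Sigma^{-1}S_{10}'\right)'J_K\left(S_{11}\odot\hat\Sigma^{-1} + \left(\hat\Sigma^{-1}S_{20}'\right)\odot I_K\right)^{-1}J_K'\,\mathrm{vec}\left(\hat\Sigma^{-1}S_{10}'\right). \tag{12}$$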
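A sketch of (12) in NumPy follows. Lemma 1 is in the appendix and is not reproduced here, so the code assumes that $J_K'\,\mathrm{vec}(A)$ picks out the diagonal of a $K \times K$ matrix $A$, which is what the elementwise form of the score in (11) implies. The input `x` is again assumed to hold the differenced series in its rows, and the function names are hypothetical.

```python
import numpy as np


def weighted_lags(x):
    """Row t holds sum_{j=1}^{t} x[t-j] / j, i.e. x*_{t-1} aligned with x_t."""
    n = x.shape[0]
    out = np.zeros_like(x, dtype=float)
    for t in range(1, n):
        out[t] = (1.0 / np.arange(1, t + 1)) @ x[t - 1::-1]
    return out


def lm_k_statistic(x):
    """LM_K statistic (12), assuming J_K' vec(A) equals the diagonal of A."""
    n = x.shape[0]
    x_star = weighted_lags(x)
    x_2star = weighted_lags(x_star)
    sigma_inv = np.linalg.inv(x.T @ x / n)
    s10 = x_star.T @ x
    s11 = x_star.T @ x_star
    s20 = x_2star.T @ x
    score = np.diag(sigma_inv @ s10.T)                  # K-vector score (11)
    hessian = s11 * sigma_inv + np.diag(np.diag(sigma_inv @ s20.T))
    return score @ np.linalg.solve(hessian, score)
```

For $K = 1$ the expression collapses to (7), and rescaling the individual series leaves the statistic unchanged. Under the null the value would be compared with a $\chi^2_K$ critical value.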

The asymptotic distribution of the test statistic under local alternatives, $H_{1n} : \theta = \delta/\sqrt{n}$, where $\delta$ is now a fixed $K$-vector, is given by the following theorem.

Theorem 3 Under $\theta = \delta/\sqrt{n}$, $\delta$ a fixed $K$-vector, the LM test statistic (12) is asymptotically distributed as $\chi^2_K(\delta'\mathcal{I}_K\delta)$, where

$$\mathcal{I}_K = -\lim_{n\to\infty}E_0\,\frac{1}{n}\frac{\partial^2L_K(\theta,\Sigma)}{\partial\theta\,\partial\theta'} = \frac{\pi^2}{6}\,\Sigma\odot\Sigma^{-1}.$$

Under the additional assumption of Gaussianity, the test is asymptotically efficient against local alternatives. Under the null hypothesis (2), $LM_K \to_d \chi^2_K$.

From Theorem 3 it is worth noting once more that, in the more general model considered in this section, the degrees of freedom still equal the number of restrictions tested, $K$.

3.3 Short-run Dynamics

In this section we allow for short-run dynamics following Tanaka (1999) and Breitung & Hassler (2002). In particular, suppose $e_t$ is generated according to the vector autoregressive (VAR) process

$$A(L)e_t = \varepsilon_t, \quad t = 0, \pm1, \pm2, \ldots, \tag{13}$$

where $\varepsilon_t$ satisfies the assumptions of $e_t$ before. Here, $A(z)$ is a matrix polynomial of order $p$ such that $A(1)$ has full rank and $e_t$ is a stationary VAR($p$) process. The parameters of $A(z)$ are gathered in the $K^2p$-vector $a = \mathrm{vec}(A_1, \ldots, A_p)$, and we also define $\phi = (\theta', a')'$.

In the case with a different $d$ or a different $\theta$ for each equation, an important caveat applies in our multivariate setup as pointed out by Comte & Renault (1996) and Lobato (1997) in different contexts, namely that the ordering of the autoregressive polynomial and the differencing operator matters. In our multivariate ARFIMA($p,d,0$) time series model in (10) and (13), it is apparent that, under the null, $y_{kt}$ is integrated of order $d_k$ for all $k = 1, \ldots, K$. However, suppose instead that the model (bivariate for simplicity) were given by

$$\begin{pmatrix}(1-L)^{d_1} & 0 \\ 0 & (1-L)^{d_2}\end{pmatrix}\begin{pmatrix}a_{11}(L) & a_{12}(L) \\ a_{21}(L) & a_{22}(L)\end{pmatrix}\begin{pmatrix}y_{1t} \\ y_{2t}\end{pmatrix} = \begin{pmatrix}e_{1t}I(t \geq 1) \\ e_{2t}I(t \geq 1)\end{pmatrix} \tag{14}$$


under the null. That is, the autoregressive polynomial and the differencing operator have been interchanged compared to our model in (10) and (13). Then we can write $y_{1t}$ as

$$\left(a_{11}(L)a_{22}(L) - a_{12}(L)a_{21}(L)\right)(1-L)^{d_1+d_2}y_{1t} = a_{22}(L)(1-L)^{d_2}e_{1t}I(t \geq 1) - a_{12}(L)(1-L)^{d_1}e_{2t}I(t \geq 1)$$

and thus $y_{1t}$ is $I(d_2)$ if $d_1 < d_2$ and $a_{12}(1) \neq 0$, and $y_{1t}$ is $I(d_1)$ otherwise. Similarly, $y_{2t}$ is $I(d_1)$ if $d_2 < d_1$ and $a_{21}(1) \neq 0$, and $y_{2t}$ is $I(d_2)$ otherwise. Thus, in (14), the integration orders of $y_{1t}$ and $y_{2t}$ are no longer constant throughout the parameter space as they are in our model, where $y_{1t}$ is $I(d_1)$ and $y_{2t}$ is $I(d_2)$ for any $d_1, d_2$. The model (14) is equivalent to our model only in the univariate setup or when $d_k = d$ for some $d$ and all $k = 1, \ldots, K$, i.e. when the setup is as in section 2.

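For the model with short-run dynamics (13), we construct the test statistics based on the prewhitened series, i.e. we use the residuals from the regression

$$x_t = \hat A_1x_{t-1} + \ldots + \hat A_px_{t-p} + \hat\varepsilon_t, \quad t = 1, \ldots, n,$$

and define $\hat\varepsilon^*_{t-1} = \sum_{j=1}^{t-1}j^{-1}\hat\varepsilon_{t-j}$, $\hat\varepsilon^{**}_{t-2} = \sum_{j=1}^{t-2}j^{-1}\hat\varepsilon^*_{t-j-1}$, and $X_{t-1} = (x_{t-1}', \ldots, x_{t-p}')'$. The test statistics (7) and (12) are now defined in terms of $\hat\Sigma = n^{-1}\sum_{t=1}^{n}\hat\varepsilon_t\hat\varepsilon_t'$, $S_{10} = \sum_{t=2}^{n}\hat\varepsilon^*_{t-1}\hat\varepsilon_t'$, $S_{11} = \sum_{t=2}^{n}\hat\varepsilon^*_{t-1}\hat\varepsilon^{*\prime}_{t-1}$, $S_{20} = \sum_{t=2}^{n}\hat\varepsilon^{**}_{t-2}\hat\varepsilon_t'$, $S_{x1} = \sum_{t=2}^{n}X_{t-1}\hat\varepsilon^{*\prime}_{t-1}$, $S_{xx} = \sum_{t=2}^{n}X_{t-1}X_{t-1}'$, and the Hessian matrices

$$-\left.\frac{\partial^2L(\theta,a,\Sigma)}{\partial\phi\,\partial\phi'}\right|_{\theta=0,\,a=\hat a,\,\Sigma=\hat\Sigma} = \begin{bmatrix}\mathrm{tr}\left(\hat\Sigma^{-1}M_{11}\right) & \mathrm{vec}(S_{x1})' \\ \mathrm{vec}\,S_{x1} & S_{xx}\otimes\hat\Sigma^{-1}\end{bmatrix},$$

$$-\left.\frac{\partial^2L_K(\theta,a,\Sigma)}{\partial\phi\,\partial\phi'}\right|_{\theta=0,\,a=\hat a,\,\Sigma=\hat\Sigma} = \begin{bmatrix}S_{11}\odot\hat\Sigma^{-1} + \left(\hat\Sigma^{-1}S_{20}'\right)\odot I_K & J_K'\left(S_{x1}'\otimes I_K\right) \\ \left(S_{x1}\otimes I_K\right)J_K & S_{xx}\otimes\hat\Sigma^{-1}\end{bmatrix}.$$

Applying the partitioned matrix inverse formula, the test statistics are

$$LM = \frac{\mathrm{tr}\left(\hat\Sigma^{-1}S_{10}\right)^2}{\mathrm{tr}\left(\hat\Sigma^{-1}\left(M_{11} - S_{x1}'S_{xx}^{-1}S_{x1}\right)\right)}, \tag{15}$$

$$LM_K = \mathrm{vec}\left(\hat\Sigma^{-1}S_{10}'\right)'J_K\left(S_{11}\odot\hat\Sigma^{-1} + \left(\hat\Sigma^{-1}S_{20}'\right)\odot I_K - \left(S_{x1}'S_{xx}^{-1}S_{x1}\right)\odot\hat\Sigma^{-1}\right)^{-1}J_K'\,\mathrm{vec}\left(\hat\Sigma^{-1}S_{10}'\right). \tag{16}$$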

The results of Theorems 1 and 3 continue to hold in the present case with autocorrelated errors, though the noncentrality parameters are different.
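The prewhitening regression behind (15) and (16) is an ordinary multivariate least squares problem. The sketch below assumes that pre-sample values of $x_t$ are set to zero, in line with the truncation in (1), so that the regression can be run for $t = 1, \ldots, n$. The function name `prewhiten` is hypothetical.

```python
import numpy as np


def prewhiten(x, p):
    """OLS residuals from x_t = A_1 x_{t-1} + ... + A_p x_{t-p} + eps_t.

    x is an (n, K) array; returns (residuals, [A_1, ..., A_p] stacked as K x Kp).
    """
    n, k = x.shape
    if p == 0:
        return x.copy(), np.zeros((k, 0))
    padded = np.vstack([np.zeros((p, k)), x])           # zero pre-sample values
    # Row t of `lags` is X_{t-1}' = (x_{t-1}', ..., x_{t-p}')
    lags = np.hstack([padded[p - i:p - i + n] for i in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(lags, x, rcond=None)
    return x - lags @ coef, coef.T
```

The residuals returned here would take the place of $x_t$ when the weighted partial sums and the moment matrices $S_{10}$, $S_{11}$ and $S_{20}$ are formed.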

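Theorem 4 Suppose (13) holds and let the LM test statistics be defined by (15) and (16). The results of Theorems 1 and 3 continue to hold with noncentrality parameters defined by

$$\mathcal{I} = \frac{\pi^2K}{6} - \mathrm{tr}\left(\Phi'\Gamma^{-1}\Phi\Sigma\right), \tag{17}$$

$$\mathcal{I}_K = \frac{\pi^2}{6}\,\Sigma\odot\Sigma^{-1} - \left(\Sigma\Phi'\Gamma^{-1}\Phi\Sigma\right)\odot\Sigma^{-1}, \tag{18}$$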


where $\Gamma$ is the covariance matrix of $(e_t', \ldots, e_{t-p+1}')'$, $\Phi = (\Phi_1', \ldots, \Phi_p')'$, $\Phi_i = \sum_{j=i}^{\infty}j^{-1}B_{j-i}$, and $B_i$ is the coefficient on $z^i$ in the moving average polynomial $B(z)$ from the Wold representation of $e_t$.

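As a simple example consider the VAR(1), $e_t = Ae_{t-1} + \varepsilon_t = \sum_{j=0}^{\infty}A^j\varepsilon_{t-j}$. In this case, $\mathcal{I}$ and $\mathcal{I}_K$ reduce to $\pi^2K/6 - \mathrm{tr}\left(\Phi_1\Gamma^{-1}\Phi_1'\Sigma\right)$ and $\frac{\pi^2}{6}\Sigma\odot\Sigma^{-1} - \left(\Sigma\Phi_1\Gamma^{-1}\Phi_1'\Sigma\right)\odot\Sigma^{-1}$, respectively, where $\Phi_1 = I_K + \sum_{j=2}^{\infty}j^{-1}A^{j-1}$ and $\Gamma = E(e_te_t')$ can be recovered from the relation $\mathrm{vec}\,\Gamma = (I_{K^2} - A\otimes A)^{-1}\,\mathrm{vec}\,\Sigma$.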
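The VAR(1) expression for $\mathcal{I}$ can be evaluated numerically. The sketch below follows the expression as printed in the VAR(1) example, truncates the infinite sum defining $\Phi_1$ at a fixed number of terms (an assumption of the sketch), and uses a column-major vec operator. The function name `var1_information` is hypothetical.

```python
import numpy as np


def var1_information(a_mat, sigma, terms=2000):
    """pi^2 K / 6 - tr(Phi_1 Gamma^{-1} Phi_1' Sigma) for a stationary VAR(1)."""
    k = a_mat.shape[0]
    # vec(Gamma) = (I - A kron A)^{-1} vec(Sigma), with column-major vec
    vec_gamma = np.linalg.solve(
        np.eye(k * k) - np.kron(a_mat, a_mat), sigma.reshape(-1, order="F")
    )
    gamma = vec_gamma.reshape(k, k, order="F")
    # Phi_1 = sum_{j>=1} A^{j-1} / j, truncated after `terms` terms
    phi1 = np.zeros((k, k))
    power = np.eye(k)
    for j in range(1, terms + 1):
        phi1 += power / j
        power = power @ a_mat
    return np.pi ** 2 * k / 6 - np.trace(phi1 @ np.linalg.inv(gamma) @ phi1.T @ sigma)
```

For $A = aI_K$ the matrices commute and the result should equal $K\left(\pi^2/6 - (1-a^2)(\ln(1-a)/a)^2\right)$, which gives a simple check on the computation.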

4 Finite Sample Performance

In this section we compare the finite sample properties of the LM test in (7) or (15), and Breitung & Hassler's (2002) $\Lambda_0(d)$ test (henceforth the BH test) in (5) with allowance for short-run dynamics when relevant, see Breitung & Hassler (2002). The asymptotic local power of the LM test can easily be derived from the previous results as

$$P\left(LM > \chi^2_{1,1-\alpha}\right) = 1 - F_{1,\lambda}\left(\chi^2_{1,1-\alpha}\right), \tag{19}$$

where $\chi^2_{1,1-\alpha}$ is the $100(1-\alpha)\%$ point of the central $\chi^2$ distribution with one degree of freedom, and $F_{1,\lambda}$ is the distribution function of the noncentral $\chi^2$ distribution with one degree of freedom and noncentrality parameter $\lambda$ defined in Theorems 1 and 4. Setting $\delta = \theta\sqrt{n}$ in (19), we can compare the asymptotic local power with the finite sample rejection frequencies for any fixed values of $\theta$ and $n$.
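The local power in (19) has a closed form with one degree of freedom, because a noncentral $\chi^2_1(\lambda)$ variable is distributed as $(Z + \sqrt{\lambda})^2$ with $Z$ standard normal. The sketch below uses only the Python standard library. The default noncentrality $\lambda = \mathcal{I}\delta^2$ with $\mathcal{I} = \pi^2K/6$ corresponds to Theorem 1, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def local_power_lm(theta, n, k, alpha=0.05, information=None):
    """Asymptotic local power (19) of the LM test with delta = theta * sqrt(n)."""
    if information is None:
        information = math.pi ** 2 * k / 6            # Theorem 1, i.i.d. errors
    lam = information * (theta * math.sqrt(n)) ** 2   # noncentrality parameter
    z = NormalDist()
    root_crit = z.inv_cdf(1 - alpha / 2)              # square root of chi2(1) critical value
    shift = math.sqrt(lam)
    # P((Z + shift)^2 > crit) = P(Z > root_crit - shift) + P(Z < -root_crit - shift)
    return 1 - z.cdf(root_crit - shift) + z.cdf(-root_crit - shift)
```

Passing a value from (17) through `information` would give the corresponding limit for the case with short-run dynamics.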

The models we consider for the simulation study are

$$\text{Model A}: \quad \begin{bmatrix}(1-L)^{d+\theta} & 0 \\ 0 & (1-L)^{d+\theta}\end{bmatrix}y_t = \varepsilon_tI(t \geq 1),$$

$$\text{Model B}: \quad (I_2 - AL)\begin{bmatrix}(1-L)^{d+\theta} & 0 \\ 0 & (1-L)^{d+\theta}\end{bmatrix}y_t = \varepsilon_tI(t \geq 1), \quad A = \begin{bmatrix}a & 0 \\ 0 & a\end{bmatrix},$$

$$\text{Model C}: \quad y_{1t} = \beta y_{2t} + u_{1t}, \quad (I_2 - AL)\begin{bmatrix}(1-L)^{1-\theta} & 0 \\ 0 & (1-L)\end{bmatrix}\begin{bmatrix}u_{1t} \\ y_{2t}\end{bmatrix} = \varepsilon_tI(t \geq 1),$$

where the $\varepsilon_t$ are i.i.d. $N(0,\Sigma)$. Unreported simulations have shown that using fat-tailed error distributions, such as the $t_5$ or Cauchy, does not change the results below by much. The contemporaneous covariance matrix $\Sigma$ is normalized such that the diagonal elements equal unity and the correlation coefficient $\rho$ is 0 or 0.6. Models A and B are non-cointegrated and the alternatives are of the form considered in Theorem 1, i.e. with the same $\theta$ for each variable. We use the values $d = 0$ and $d = 1$ in Models A and B, but the results do not vary much with $d$ as seen below. The simulations have also been run using other values of $d$ and the results are almost identical to those obtained in these two cases. The cointegrated alternatives of Corollary 2 are considered in Model C, where $y_{1t}$ and $y_{2t}$ are fractionally cointegrated if $\theta > 0$ and non-cointegrated under the null hypothesis, $\theta = 0$. To generate data we used $\beta = 1$.
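Data from Model A can be generated by applying the truncated expansion of $(1-L)^{-(d+\theta)}$ to correlated Gaussian errors. The following NumPy sketch is an illustration only; the simulations reported here were carried out in Ox, and the function name and arguments are hypothetical.

```python
import numpy as np


def simulate_model_a(n, d, theta, rho, rng):
    """Draw (y, eps) from Model A with unit variances and correlation rho."""
    sigma = np.array([[1.0, rho], [rho, 1.0]])
    eps = rng.standard_normal((n, 2)) @ np.linalg.cholesky(sigma).T
    # Coefficients of (1 - L)^{-(d + theta)} = sum_j psi_j L^j
    order = d + theta
    psi = np.ones(n)
    for j in range(1, n):
        psi[j] = psi[j - 1] * (j - 1 + order) / j
    # Truncated convolution: y_t = sum_{j=0}^{t-1} psi_j eps_{t-j}
    y = np.column_stack([np.convolve(eps[:, i], psi)[:n] for i in range(2)])
    return y, eps
```

A rejection frequency would then be obtained by differencing each draw with the hypothesized $d$, computing the test statistic, and averaging the rejection indicator over replications.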

All calculations were made in Ox version 3.20 including the Arfima package version 1.01, see Doornik (2001) and Doornik & Ooms (2001). To calculate the BH test we adapted the Gauss code available on Jörg Breitung's internet homepage. Throughout, the nominal size (type I error) of the tests is fixed at 5%, and the number of replications at 10,000.

Tables 1-2 about here

In Tables 1 and 2 the finite sample rejection frequencies of the LM and BH tests for the case with i.i.d. errors are presented, i.e. for Model A, with $d = 0$ and $d = 1$, respectively. Under the heading 'Limit', we give the asymptotic local power calculated from (19) with $\delta = \theta\sqrt{n}$. Size corrected rejection frequencies have also been computed and are reported as $LM_{sc}$ and $BH_{sc}$.

There are no significant differences between the case with $d = 0$ (Table 1) and the one with $d = 1$ (Table 2). The simulated sizes of both tests are close to the nominal 5% level, but the LM test is the more powerful test for Model A, except against $\theta > 0$ with $n = 100$, in which case the BH test appears slightly more powerful. Furthermore, the finite sample power of the LM test is close to the corresponding asymptotic local power.

Unreported simulations show that the BH test is robust to the case where the $\theta$'s in Model A are allowed to be different, i.e. as in the model of section 3.2. However, the $LM_K$ test is specifically designed for that model and is directed against alternatives where the $\theta$'s are different. Hence, the $LM_K$ test is clearly superior to the BH test in that model.

Tables 3-4 about here

Tables 3 and 4 present the simulation results for Model B with $d = 0$ and $d = 1$, respectively, and $a = 0.4$. As in Model A, there are no significant differences between the two cases $d = 0$ and $d = 1$.

For the small sample size, $n = 100$, the BH test is slightly size distorted, with simulated sizes ranging from 0.0694 to 0.0745 for the different choices of $\rho$ and $d$. When $n = 100$, the BH test has slightly higher power against $\theta < 0$ (opposite the case in Tables 1 and 2), but against $\theta > 0$ the LM test has much higher power than the BH test. When considering the larger sample size, $n = 250$, or the size corrected tests, the LM test is clearly the superior test for Model B. It is worth noting that in all cases, i.e. for both $n = 100$ and $n = 250$, for both values of $d$, and for both values of $\rho$, the BH test has lower power against $\theta = 0.3$ than against $\theta = 0.2$.

Tables 5-6 about here

To evaluate the sensitivity to the particular value of the coefficient matrix (i.e. $a = 0.4$) in the autoregressive specification in Model B, Tables 5 ($d = 0$) and 6 ($d = 1$) present the simulated sizes of the LM and BH tests for different specifications of the coefficient matrix $A$ in Model B. In particular, the values $a = -0.75, -0.5, -0.25, 0, 0.25, 0.5, 0.75$ and sample sizes $n = 100$, $n = 250$, and $n = 500$ are considered. Notice that the column $a = 0$ corresponds to the case where a VAR(1) is estimated for $e_t$ even though it is really an i.i.d. process.

For all specifications the size distortions of both tests are small, and as before the results for $d = 0$ and $d = 1$ are almost identical. For samples of $n = 100$ the simulated size of the LM test ranges from 0.0497 to 0.0775 when $a \leq 0.50$. However, when $a = 0.75$ and $n = 100$, the simulated size of the LM test is almost 13% for a nominal 5% test. When larger samples of $n = 250$ and $n = 500$ are considered, the size distortions for $a = 0.75$ are smaller. Overall, Tables 5 and 6 show that the size of the LM test is close to the nominal 5% level.

Table 7 about here

Table 7 shows finite sample rejection frequencies of the LM and BH tests for Model C with $d = 1$ and $a = 0.4$, i.e. when $y_t$ is fractionally cointegrated with short-run dynamics. The column $\theta = 0$ corresponds to $I(1)$ non-cointegrated data, the column $\theta = 1$ to standard bivariate $I(1)$-$I(0)$ cointegration, and $0 < \theta < 1$ corresponds to fractional cointegration with $I(1-\theta)$ cointegration errors. Thus, the degree of cointegration is determined by the magnitude of $\theta$. In this model, both tests exhibit simulated rejection frequencies very close to the nominal 5% level when $\theta = 0$, i.e. when there is no cointegration. When $\rho = 0$, the finite sample rejection frequencies of the two tests are close. When the errors are contemporaneously correlated, $\rho = 0.6$, both tests have increased rejection frequencies, but the rejection frequencies of the BH test are higher than those of the LM test, which is expected since the BH test is specifically directed towards testing against cointegration.

Overall, the Monte Carlo study shows that the LM test has higher finite sample power than the BH test in the non-cointegrated model, although both tests can be slightly size distorted when the errors exhibit positive autocorrelation. Since the results for the non-cointegrated model in Tables 1-6 were practically identical for $d = 0$ and $d = 1$, we expect that these results carry over to any value of $d$, which is also indicated by unreported simulations for several alternative values of $d$. Moreover, as the present test is not considered a test for cointegration but rather a test of stationarity (or more generally of fractional integration of a given order), we do not put much weight on the results for the cointegrated model. Thus, the LM test is superior in the non-cointegrated model for any value of the integration order, $d$.

5 Empirical Application

In this section we apply our tests to the data examined previously by Kugler & Neusser (1993) and Choi & Ahn (1999). The data are monthly observations on real interest rates for the USA, Japan, the UK, (West) Germany, France, and Switzerland from January 1980 to October 1991, i.e. 142 observations on six time series. A more detailed description is available in Kugler & Neusser (1993) or Choi & Ahn (1999). Time series plots of the six real interest rate data series are presented in Figure 1. From the time series plots, the time series data do appear serially correlated. Whether they are in fact fractionally integrated is the question to which we turn next.

Figure 1 about here

The objective of Kugler & Neusser (1993) was to test the Real Interest Parity hypothesis using a co-dependence approach, which requires the vector time series in question to be stationary, or more precisely, to be $I(0)$. In order to establish stationarity of the data, Kugler & Neusser (1993) conducted a series of univariate unit root tests, which rejected the unit root null hypothesis for most of the series. They found some sensitivity to the choice of lag length for the augmented Dickey-Fuller tests, while the Phillips-Perron tests all rejected the null.

Choi & Ahn (1999) reversed the null and alternative hypotheses, and tested the null hypothesis of level-stationarity against the alternative of a unit root, which seems to be a more natural testing strategy in the present case. They applied the multivariate stationarity tests developed in their paper and also the univariate counterparts for comparison. It was found that one of the univariate stationarity tests (their LMI test) rejected the null at the 5% level for France, and that two univariate stationarity tests (their SBDHT and SBDHB tests) rejected the null at the 10% level for the USA. However, none of their multivariate tests rejected the null at the 10% level, thus providing more certain evidence than the univariate tests.

We apply our $LM$ and $LM_K$ tests and the BH test of Breitung & Hassler (2002) to the real interest rate data to test the hypothesis that $d = 0$, i.e. that the data are $I(0)$, against fractionally integrated alternatives. Thus, we test one of the underlying assumptions of the Kugler & Neusser (1993) analysis, where non-rejection of the hypothesis that $d = 0$ implies that their analysis is applicable. We allow for a non-zero mean by setting $z_t = 1$ as in section 3.1, and report the tests without allowing short-run dynamics ($p = 0$) and allowing VAR($p$) dynamics with $p = 1$ and $p = 4$. Note that in the case of $d = 0$, the treatment of deterministic terms is the same for our tests and for the BH test, see section 3.1. We thus demonstrate a wide variety of the tests proposed in the above sections.

Table 8 about here

In panel (a) of Table 8 we report the results from applying the univariate LM and BH tests to each individual time series. When $p = 0$ both tests reject clearly for all the time series. However, when $p > 0$ the LM test rejects at the 1% level in two of the twelve cases (Germany and Switzerland with $p = 1$), and similarly the BH test rejects at the 5% level in one case (Germany with $p = 1$) and at the 1% level in one case (France with $p = 4$).

The results from applying the multivariate $LM$, $LM_K$, and BH tests are reported in panel (b) of Table 8. Again, the null is soundly rejected when no short-run dynamics is allowed, i.e. when $p = 0$, and also when $p = 4$ for the BH test. However, when allowing short-run dynamics with either $p = 1$ or $p = 4$, the $LM$ and $LM_K$ tests do not reject the null.

Thus, the empirical results provide strong evidence that the data are indeed $I(0)$ with non-zero means, when allowance is made for short-run dynamics, and hence support the unit-root tests in Kugler & Neusser (1993) and the stationarity tests in Choi & Ahn (1999). Indeed, our results for the multivariate tests, as well as those of the multivariate tests of Choi & Ahn (1999), are less ambiguous than the results of Kugler & Neusser (1993) and offer more clear-cut evidence in favor of the null hypothesis.

6 Conclusion

We have introduced a multivariate LM test for fractional integration, generalizing the univariate tests developed recently by Robinson (1994) and Tanaka (1999), among others. The test is intended primarily for preliminary data analysis. For instance, when testing the null of stationarity or I(0)-ness (against fractional alternatives), non-rejection would allow standard methods to be employed for conducting, e.g., causality, structural VAR, or impulse response analyses. More generally, our multivariate test may indicate the transformation of the data that would be required in order to make the data suitable for said analyses.

We have shown that the regression variant of the LM test derived by Breitung & Hassler (2002) is not equivalent to the LM test in the multivariate case, although they are equivalent in the univariate case. Indeed, in the multivariate case, the two tests have different degrees of freedom in their asymptotic chi-squared distributions.

We have established desirable distributional properties and optimality properties of the LM test. In particular, the test statistic is asymptotically noncentral chi-squared distributed under local alternatives, where the degrees of freedom equals the number of restrictions tested. Under Gaussianity the LM test is asymptotically efficient against local alternatives. An extension of the LM test statistic, explicitly allowing different integration orders for each variable, has also been introduced.

Finite sample properties have been evaluated by Monte Carlo experiments, which show that the LM test compares favorably with the Breitung & Hassler (2002) test.

Finally, we have presented an interesting empirical application, demonstrating the practical usefulness of our tests. We apply our tests to a multivariate time series of real interest rates for six major industrialized countries previously examined by Kugler & Neusser (1993) and Choi & Ahn (1999) to test an underlying assumption of their analysis, namely that the vector time series is I(0). Kugler & Neusser (1993) apply univariate unit root tests to each element of the multiple time series, which mostly reject the null of a unit root, and Choi & Ahn (1999) apply their multivariate stationarity test (i.e. test of I(0) against I(1)) and find no evidence against the null hypothesis. Our objective is to test if the real interest rates are jointly I(0) against fractional alternatives, and the evidence we obtain from the multivariate tests is more clear-cut than the evidence from applying univariate tests to each element of the multiple time series. The results indicate that, when allowing for short-run dynamics, the real interest rates are jointly I(0) with non-zero means.

Appendix: Proofs

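Proof of Theorem 1. Breitung & Hassler (2002) show that, under $\theta = 0$,

$$\frac{1}{\sqrt{n}} \operatorname{vec}\left(\Sigma^{-1/2} S_{10}\right) \to_d N\left(0, I_K \otimes \Omega\right), \tag{20}$$

and by slight modification of the arguments of Breitung & Hassler (2002, p. 180), it follows that

$$n^{-1} S_{11} \to_p \Omega, \quad n^{-1} S_{20} \to_p 0, \quad n^{-1} M_{11} \to_p \Omega, \tag{21}$$

where

$$\Omega = \lim_{n \to \infty} n^{-1} \sum_{t=1}^{n} E\left(x_t^* x_t^{*\prime}\right) = \frac{\pi^2}{6} \Sigma. \tag{22}$$

The distribution under the null follows immediately using $\operatorname{tr}(A'B) = \operatorname{vec}(A)' \operatorname{vec}(B)$ and consistency of the estimate of $\Sigma$.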

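Consider next the case $\theta = \delta/\sqrt{n}$. Then

$$\operatorname{tr}\left(\Sigma^{-1} S_{10}\right) = \operatorname{tr}\left(\Sigma^{-1} \sum_{t=2}^{n} e_{t-1}^* e_t'\right) + \frac{\delta}{\sqrt{n}} \operatorname{tr}\left(\Sigma^{-1} \sum_{t=2}^{n} e_{t-1}^* e_{t-1}^{*\prime}\right) + O_p\left(n^{-1}\right), \tag{23}$$

following the arguments of Tanaka (1999, p. 579). Applying (20) and (21) to the second-moment matrices of $e_t$, the desired result follows.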

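By uncorrelatedness of $x_t$ under the null,

$$I = -\lim_{n \to \infty} E_0 \frac{1}{n} \frac{\partial^2 L(\theta, \Sigma)}{\partial \theta^2} = \operatorname{tr}\left(\Sigma^{-1} \Omega\right) = \frac{\pi^2 K}{6}$$

is the Fisher information for $\theta$ under Gaussianity. Hence, the noncentrality parameter is maximal, and the test is efficient against local alternatives.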
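To illustrate what the local power result implies numerically, the sketch below evaluates the asymptotic power of a one-degree-of-freedom test under $\theta = \delta/\sqrt{n}$. It rests on assumptions not stated in this proof: the noncentrality parameter is taken to be $\delta^2 \pi^2 K / 6$ (the squared local parameter times the Fisher information above), the function name `lm_local_power` is hypothetical, and $K = 2$ in the usage lines is an illustrative choice that appears to be consistent with the "Limit" column of Tables 1 and 2.

```python
from math import pi, sqrt
from statistics import NormalDist


def lm_local_power(theta, n, K, alpha=0.05):
    """Asymptotic power of a chi-squared(1) test under theta = delta / sqrt(n).

    Assumes the noncentrality parameter is delta^2 * pi^2 * K / 6.
    """
    std_normal = NormalDist()
    # Two-sided normal critical value; its square is the chi-squared(1) critical value.
    z = std_normal.inv_cdf(1.0 - alpha / 2.0)
    delta = theta * sqrt(n)
    root_ncp = abs(delta) * sqrt(pi ** 2 * K / 6.0)
    # A noncentral chi-squared(1) variable is (Z + root_ncp)^2 with Z standard normal,
    # so the rejection probability is P(|Z + root_ncp| > z).
    return std_normal.cdf(root_ncp - z) + std_normal.cdf(-root_ncp - z)


# Illustrative evaluation with K = 2 (an assumption) at the sample sizes of Tables 1 and 2.
for n in (100, 250):
    for theta in (0.0, 0.1, 0.2, 0.3):
        print(f"n = {n:3d}  theta = {theta:.1f}  power = {lm_local_power(theta, n, K=2):.4f}")
```

The function is symmetric in the sign of $\theta$, so negative and positive departures of the same size have the same limiting power, and at $\theta = 0$ it returns the nominal size.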

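Proof of Corollary 2. Since the LM test is invariant to non-singular linear transformations, we equivalently consider the transformed series $D x_t$ in place of $x_t$ (corresponding to $z_t$ in Breitung & Hassler (2002)), where

$$D = \begin{pmatrix} (\gamma' \Sigma \gamma)^{-1/2} \gamma' \\ \beta' - \beta' \Sigma \gamma (\gamma' \Sigma \gamma)^{-1} \gamma' \end{pmatrix}$$

such that, for the transformed series, the $(K-r)$-vector $x_{1t}$ is i.i.d. $(0, I_{K-r})$ and the $r$-vector $x_{2t}$ is uncorrelated with $x_{1t}$.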

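The LM test is proportional to $\left(\sum_{k=1}^{K} \lambda_k\right)^2$, where the $\lambda_k$ are the eigenvalues solving $\left|\lambda \Sigma - n^{-1/2} S_{10}\right| = 0$, or equivalently

$$\left|\lambda n^{-1} X'X - n^{-1/2} X'X^*\right| = 0, \tag{24}$$

with capital letters denoting matrices of observations, i.e. $X = (x_1, \ldots, x_n)'$ and $X^* = (x_1^*, \ldots, x_n^*)'$. By Lemma A.1 of Breitung & Hassler (2002),

$$\frac{1}{\sqrt{n}} X'X^* = \frac{1}{\sqrt{n}} \begin{pmatrix} X_1' \\ X_2' \end{pmatrix} \begin{pmatrix} X_1^* & X_2^* \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix},$$

say, where $A_{11} = O_p(1)$, $A_{12} = O_p(1)$, $A_{21} = O_p(1)$, and $A_{22} = O_p\left(n^{1/2}\right)$. Thus, it follows from eigenvalue inequality (6) of Lütkepohl (1996, section 5.3.1) that (24) has $K - r$ eigenvalues that are $O_p(1)$ and $r$ eigenvalues that are $O_p\left(n^{1/2}\right)$.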

In the following we need a lemma on some properties of the Hadamard product, which is defined for two $m \times n$ matrices $A = (a_{ij})$ and $B = (b_{ij})$ as

$$A \odot B = (a_{ij} b_{ij}),$$

see e.g. Magnus & Neudecker (1999, Chapter 3.6) for more details. The proof of the lemma is easy and is omitted.

Lemma 1

Property 1. There exists a $K^2 \times K$ matrix $J_K := (\operatorname{vec} E_{11}, \ldots, \operatorname{vec} E_{KK})$, $E_{ii} = e_i e_i'$, where $e_i$ is the $i$'th unit $K$-vector, such that for any $K \times K$ matrix $A$,

$$J_K' \operatorname{vec} A = a,$$

where $a$ is the $K$-vector holding the diagonal of $A$. If $A_d := I_K \odot A$ is the diagonal matrix obtained from $A$ then

$$\operatorname{vec} A_d = J_K a.$$

Property 2. Connection with the Kronecker product. For all $K \times K$ matrices $A$ and $B$,

$$J_K' (A \otimes B) J_K = A \odot B,$$

where $J_K$ is defined as in property 1.

Property 3. Let $A$ and $B$ be $K \times K$ matrices such that $A$ is diagonal and $B$ is symmetric. Then

$$ABA = aa' \odot B,$$

where $a$ is defined as in property 1.
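Since the proof of the lemma is omitted, a quick numerical check may be useful. The sketch below verifies the three properties on randomly drawn $3 \times 3$ matrices using only the Python standard library; all helper names (`vec`, `kron`, `hadamard`, `selection_matrix`, and so on) are hypothetical, and a numerical check of this kind illustrates the identities but does not prove them.

```python
import random


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def vec(A):
    """Stack the columns of A into one flat list."""
    return [A[i][j] for j in range(len(A[0])) for i in range(len(A))]


def kron(A, B):
    p, q = len(B), len(B[0])
    return [[A[i // p][j // q] * B[i % p][j % q] for j in range(len(A[0]) * q)]
            for i in range(len(A) * p)]


def hadamard(A, B):
    return [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def selection_matrix(K):
    """J_K = (vec E_11, ..., vec E_KK), a K^2 x K matrix."""
    columns = []
    for i in range(K):
        E_ii = [[1.0 if r == i and c == i else 0.0 for c in range(K)] for r in range(K)]
        columns.append(vec(E_ii))
    return transpose(columns)


def column(values):
    return [[v] for v in values]


def close(A, B, tol=1e-10):
    return all(abs(x - y) < tol for row_a, row_b in zip(A, B) for x, y in zip(row_a, row_b))


random.seed(0)
K = 3
A = [[random.uniform(-1, 1) for _ in range(K)] for _ in range(K)]
B = [[random.uniform(-1, 1) for _ in range(K)] for _ in range(K)]
identity = [[1.0 if i == j else 0.0 for j in range(K)] for i in range(K)]
J = selection_matrix(K)
a = [A[i][i] for i in range(K)]

# Property 1: J_K' vec A = a, and vec A_d = J_K a with A_d = I_K (Hadamard) A.
A_d = hadamard(identity, A)
prop1a = close(matmul(transpose(J), column(vec(A))), column(a))
prop1b = close(column(vec(A_d)), matmul(J, column(a)))

# Property 2: J_K' (A kron B) J_K equals the Hadamard product of A and B.
prop2 = close(matmul(matmul(transpose(J), kron(A, B)), J), hadamard(A, B))

# Property 3: with A_d diagonal and S symmetric, A_d S A_d = aa' (Hadamard) S.
S = [[(B[i][j] + B[j][i]) / 2.0 for j in range(K)] for i in range(K)]
aa = [[a[i] * a[j] for j in range(K)] for i in range(K)]
prop3 = close(matmul(matmul(A_d, S), A_d), hadamard(aa, S))

print(prop1a, prop1b, prop2, prop3)
```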

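Proof of Theorem 3. It follows from (20), application of $\operatorname{vec}(ABC) = (C' \otimes A) \operatorname{vec} B$, and property 2 of Lemma 1 that

$$\frac{1}{\sqrt{n}} J_K' \operatorname{vec}\left(\Sigma^{-1} S_{10}'\right) \to_d N\left(0, \frac{\pi^2}{6} \Sigma \odot \Sigma^{-1}\right).$$

By (21) and consistency of the estimate of $\Sigma$, the distribution under the null follows. Under $\theta = \delta/\sqrt{n}$ the expansion corresponding to (23) is

$$J_K' \operatorname{vec}\left(\Sigma^{-1} S_{10}'\right) = \sum_{t=2}^{n} \operatorname{diag}\left(e_{t-1}^*\right) \Sigma^{-1} e_t + \sum_{t=2}^{n} \operatorname{diag}\left(e_{t-1}^*\right) \Sigma^{-1} \operatorname{diag}\left(e_{t-1}^*\right) \frac{\delta}{\sqrt{n}} + O_p\left(n^{-1}\right), \tag{25}$$

and the result follows as above.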

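Proof of Theorem 4. Consider first $\theta = 0$. For a fixed $m > p$, define the $K^2 m$-vector $C_m = ((\operatorname{vec} C(1))', \ldots, (\operatorname{vec} C(m))')'$, where $C(j) = n^{-1} \sum_{t=j+1}^{n} \varepsilon_t \varepsilon_{t-j}'$ is the $j$'th residual autocovariance. Hosking (1980) showed that

$$\sqrt{n}\, C_m \to_d N\left(0, I_m \otimes \Sigma \otimes \Sigma - K_m \left(\Gamma^{-1} \otimes \Sigma\right) K_m'\right),$$

where $\Gamma^{-1} \otimes \Sigma$ is the inverse Fisher information for the parameters in $A(z)$,

$$K_m = \begin{bmatrix} \Sigma & 0 & & \\ \Sigma B_1' & \Sigma & & \\ \vdots & \vdots & \ddots & \\ \Sigma B_{m-1}' & \Sigma B_{m-2}' & \cdots & \Sigma B_{m-p}' \end{bmatrix} \otimes I_K,$$

and $B_i$ is the coefficient on $z^i$ in the moving average polynomial $B(z)$ from the Wold representation of $e_t$. Thus,

$$\sqrt{n} \sum_{j=1}^{m} j^{-1} \operatorname{vec} C(j) \to_d N\left(0, \Psi_m\right)$$

with $\Psi_m = \sum_{j=1}^{m} j^{-2} \Sigma \otimes \Sigma - \left(\Sigma \left(\Phi_1^{(m)\prime}, \ldots, \Phi_p^{(m)\prime}\right) \Gamma^{-1} \left(\Phi_1^{(m)\prime}, \ldots, \Phi_p^{(m)\prime}\right)' \Sigma\right) \otimes \Sigma$ and $\Phi_i^{(m)} = \sum_{j=i}^{m} j^{-1} B_{j-i}$. It now follows by application of Bernstein's Lemma, see e.g. Hall & Heyde (1980, pp. 191-192), that

$$\sqrt{n} \sum_{j=1}^{n-1} j^{-1} \operatorname{vec} C(j) \to_d N\left(0, \Psi\right),$$

where $\Psi = \lim_{m \to \infty} \Psi_m$. The limiting distributions of LM and LMK in (15) and (16), when $\theta = 0$, now follow by recalling that $n^{-1} S_{10} = \sum_{j=1}^{n-1} j^{-1} C(j)$, and using that $n^{-1} S_{x1} \to_p \Phi \Sigma$ and $n^{-1} S_{xx} \to_p \Gamma$ along with (21).

When $\theta = \delta/\sqrt{n}$, the desired results follow by combining the arguments of the previous theorems, and using expansions like (23) and (25).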
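The identity $n^{-1} S_{10} = \sum_{j=1}^{n-1} j^{-1} C(j)$ used in the proof is purely algebraic and can be checked on simulated data. The sketch below does so under assumptions that are not spelled out in this appendix: $S_{10}$ is taken to be $\sum_{t=2}^{n} x_t x_{t-1}^{*\prime}$ with $x_{t-1}^* = \sum_{j=1}^{t-1} j^{-1} x_{t-j}$, white noise stands in for the residuals, and the helper names are hypothetical.

```python
import random


def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]


def zeros(K):
    return [[0.0] * K for _ in range(K)]


def add_into(acc, M, weight=1.0):
    """Add weight * M to the accumulator matrix in place."""
    for i, row in enumerate(M):
        for j, value in enumerate(row):
            acc[i][j] += weight * value


random.seed(1)
n, K = 40, 2
# x[s] holds observation s + 1; Gaussian white noise stands in for the residuals.
x = [[random.gauss(0.0, 1.0) for _ in range(K)] for _ in range(n)]

# Route 1: build x*_{t-1} = sum_{j=1}^{t-1} x_{t-j} / j and accumulate x_t x*_{t-1}'.
S10 = zeros(K)
for s in range(1, n):
    x_star = [sum(x[s - j][k] / j for j in range(1, s + 1)) for k in range(K)]
    add_into(S10, outer(x[s], x_star))
lhs = [[value / n for value in row] for row in S10]

# Route 2: weighted sum of the autocovariances C(j) = n^{-1} sum_{t=j+1}^{n} x_t x_{t-j}'.
rhs = zeros(K)
for j in range(1, n):
    C_j = zeros(K)
    for s in range(j, n):
        add_into(C_j, outer(x[s], x[s - j]), 1.0 / n)
    add_into(rhs, C_j, 1.0 / j)

max_gap = max(abs(lhs[i][k] - rhs[i][k]) for i in range(K) for k in range(K))
print(max_gap)
```

The two routes differ only in the order of summation, so the gap should be of the order of floating-point rounding error.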


References

Agiakloglou, C. & Newbold, P. (1994), ‘Lagrange multiplier tests for fractional difference’, Journal of Time Series Analysis 15, 253-262.

Amemiya, T. (1985), Advanced Econometrics, Harvard University Press, Cambridge.

Andersen, T. G., Bollerslev, T., Diebold, F. X. & Labys, P. (2003), ‘Modelling and forecasting realized volatility’, Econometrica 71, 579-625.

Breitung, J. & Hassler, U. (2002), ‘Inference on the cointegration rank in fractionally integrated processes’, Journal of Econometrics 110, 167-185.

Choi, I. & Ahn, B. C. (1999), ‘Testing the null of stationarity for multiple time series’, Journal of Econometrics 88, 41-77.

Comte, F. & Renault, E. (1996), ‘Long-memory continuous-time models’, Journal of Econometrics 73, 101-149.

Dolado, J. J., Gonzalo, J. & Mayoral, L. (2002), ‘A fractional Dickey-Fuller test for unit roots’, Econometrica 70, 1963-2006.

Doornik, J. A. (2001), Ox: An Object-Oriented Matrix Language, 4th edn, Timberlake Consultants Press, London.

Doornik, J. A. & Ooms, M. (2001), ‘A package for estimating, forecasting and simulating arfima models: Arfima package 1.01 for Ox’, Working Paper, Nuffield College, Oxford.

Fountis, N. G. & Dickey, D. A. (1989), ‘Testing for a unit root nonstationarity in multivariate autoregressive time series’, Annals of Statistics 17, 419-428.

Gil-Alana, L. A. (2003), ‘A fractional multivariate long memory model for the US and the Canadian real output’, Economics Letters 81, 355-359.

Hall, P. & Heyde, C. C. (1980), Martingale Limit Theory and its Application, Academic Press, New York.

Hosking, J. R. M. (1980), ‘The multivariate portmanteau statistic’, Journal of the American Statistical Association 75, 602-608.

Johansen, S. (1988), ‘Statistical analysis of cointegration vectors’, Journal of Economic Dynamics and Control 12, 231-254.

Kugler, P. & Neusser, K. (1993), ‘International real interest rate equalization: A multivariate time series approach’, Journal of Applied Econometrics 8, 163-174.


Lee, T. & Tse, Y. (1996), ‘Cointegration tests with conditional heteroskedasticity’, Journal of Econometrics 73, 401-410.

Ling, S. & Li, W. K. (1997), ‘On fractionally integrated autoregressive moving-average time series models with conditional heteroskedasticity’, Journal of the American Statistical Association 92, 1184-1194.

Lobato, I. N. (1997), ‘Consistency of the averaged cross-periodogram in long memory series’, Journal of Time Series Analysis 18, 137-155.

Lobato, I. N. (1999), ‘A semiparametric two-step estimator in a multivariate long memory model’, Journal of Econometrics 90, 129-153.

Lobato, I. N. & Robinson, P. M. (1998), ‘A nonparametric test for I(0)’, Review of Economic Studies 65, 475-495.

Lütkepohl, H. (1996), Handbook of Matrices, John Wiley and Sons, New York.

Magnus, J. R. & Neudecker, H. (1999), Matrix Differential Calculus with Applications in Statistics and Econometrics, revised edn, John Wiley and Sons, New York.

Marinucci, D. & Robinson, P. M. (1999), ‘Alternative forms of fractional Brownian motion’, Journal of Statistical Planning and Inference 80, 111-122.

Nyblom, J. & Harvey, A. C. (2000), ‘Tests of common stochastic trends’, Econometric Theory 16, 176-199.

Phillips, P. C. B. & Durlauf, S. N. (1986), ‘Multiple time series regression with integrated processes’, Review of Economic Studies 53, 473-495.

Robinson, P. M. (1991), ‘Testing for strong serial correlation and dynamic conditional heteroskedasticity in multiple regressions’, Journal of Econometrics 47, 67-84.

Robinson, P. M. (1994), ‘Efficient tests of nonstationary hypotheses’, Journal of the American Statistical Association 89, 1420-1437.

Robinson, P. M. (1995), ‘Log-periodogram regression of time series with long range dependence’, Annals of Statistics 23, 1048-1072.

Robinson, P. M. (2004), ‘The distance between rival nonstationary fractional processes’, Forthcoming in Journal of Econometrics.

Sin, C. Y. & Ling, S. (2004), ‘Estimation and testing for partially nonstationary vector autoregressive models with GARCH’, Working Paper, Hong Kong University of Science and Technology.

Tanaka, K. (1999), ‘The nonstationary fractional unit root’, Econometric Theory 15, 549-582.


Table 1: Finite sample rejection frequencies for Model A with d = 0

        ρ = 0                                   ρ = 0.6
     θ  Limit   LM      BH      LMsc    BHsc    Limit   LM      BH      LMsc    BHsc
n = 100
  −0.3  0.9998  0.9966  0.9771  0.9970  0.9789  0.9998  0.9957  0.9777  0.9959  0.9793
  −0.2  0.9523  0.8900  0.7117  0.8969  0.7195  0.9523  0.8864  0.7148  0.8900  0.7265
  −0.1  0.4420  0.3918  0.1965  0.4076  0.2020  0.4420  0.3899  0.1978  0.3948  0.2094
     0  0.0500  0.0468  0.0476  0.0500  0.0500  0.0500  0.0489  0.0458  0.0500  0.0500
   0.1  0.4420  0.1891  0.2645  0.1949  0.2704  0.4420  0.1891  0.2608  0.1915  0.2706
   0.2  0.9523  0.7128  0.7971  0.7187  0.8029  0.9523  0.7139  0.8054  0.7160  0.8144
   0.3  0.9998  0.9667  0.9848  0.9682  0.9852  0.9998  0.9668  0.9858  0.9676  0.9863
n = 250
  −0.3  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
  −0.2  0.9999  0.9999  0.9965  1.0000  0.9968  0.9999  0.9997  0.9954  0.9996  0.9954
  −0.1  0.8180  0.7879  0.5412  0.7921  0.5478  0.8180  0.7836  0.5344  0.7820  0.5409
     0  0.0500  0.0478  0.0483  0.0500  0.0500  0.0500  0.0505  0.0483  0.0500  0.0500
   0.1  0.8180  0.6276  0.6188  0.6319  0.6250  0.8180  0.6284  0.6174  0.6267  0.6227
   0.2  0.9999  0.9954  0.9959  0.9956  0.9962  0.9999  0.9963  0.9956  0.9963  0.9959
   0.3  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000

Note: The table reports simulated rejection frequencies under the null (θ = 0) and alternative (θ ≠ 0) based on 10,000 replications of Model A with d = 0.


Table 2: Finite sample rejection frequencies for Model A with d = 1

        ρ = 0                                   ρ = 0.6
     θ  Limit   LM      BH      LMsc    BHsc    Limit   LM      BH      LMsc    BHsc
n = 100
  −0.3  0.9998  0.9945  0.9767  0.9950  0.9788  0.9998  0.9966  0.9779  0.9967  0.9775
  −0.2  0.9523  0.8914  0.7161  0.8977  0.7328  0.9523  0.8923  0.7064  0.8947  0.7057
  −0.1  0.4420  0.3864  0.1998  0.4000  0.2156  0.4420  0.3899  0.2038  0.3937  0.2032
     0  0.0500  0.0457  0.0444  0.0500  0.0500  0.0500  0.0489  0.0501  0.0500  0.0500
   0.1  0.4420  0.1855  0.2616  0.1906  0.2726  0.4420  0.1879  0.2609  0.1894  0.2606
   0.2  0.9523  0.7159  0.8056  0.7234  0.8152  0.9523  0.7171  0.8029  0.7191  0.8022
   0.3  0.9998  0.9667  0.9872  0.9677  0.9881  0.9998  0.9666  0.9884  0.9670  0.9884
n = 250
  −0.3  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000
  −0.2  0.9999  0.9999  0.9951  0.9999  0.9952  0.9999  0.9998  0.9965  0.9998  0.9965
  −0.1  0.8180  0.7882  0.5324  0.7867  0.5377  0.8180  0.7876  0.5400  0.7832  0.5458
     0  0.0500  0.0504  0.0482  0.0500  0.0500  0.0500  0.0519  0.0477  0.0500  0.0500
   0.1  0.8180  0.6241  0.6166  0.6234  0.6203  0.8180  0.6380  0.6278  0.6339  0.6326
   0.2  0.9999  0.9964  0.9973  0.9964  0.9974  0.9999  0.9973  0.9966  0.9972  0.9968
   0.3  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000  1.0000

Note: The table reports simulated rejection frequencies under the null (θ = 0) and alternative (θ ≠ 0) based on 10,000 replications of Model A with d = 1.


Table 3: Finite sample rejection frequencies for Model B with d = 0 and a = 0.4

        ρ = 0                                   ρ = 0.6
     θ  Limit   LM      BH      LMsc    BHsc    Limit   LM      BH      LMsc    BHsc
n = 100
  −0.3  0.6044  0.2201  0.3222  0.1847  0.2577  0.6044  0.2195  0.3175  0.1863  0.2591
  −0.2  0.3171  0.0830  0.1510  0.0676  0.1124  0.3171  0.0789  0.1630  0.0631  0.1227
  −0.1  0.1150  0.0394  0.0887  0.0308  0.0588  0.1150  0.0459  0.0859  0.0361  0.0609
     0  0.0500  0.0603  0.0732  0.0500  0.0500  0.0500  0.0597  0.0694  0.0500  0.0500
   0.1  0.1150  0.1517  0.0997  0.1400  0.0719  0.1150  0.1457  0.0963  0.1315  0.0725
   0.2  0.3171  0.3047  0.1196  0.2874  0.0868  0.3171  0.3036  0.1169  0.2838  0.0908
   0.3  0.6044  0.3964  0.1151  0.3795  0.0832  0.6044  0.4019  0.1165  0.3832  0.0899
n = 250
  −0.3  0.9404  0.8180  0.7826  0.8390  0.7542  0.9404  0.8222  0.7888  0.8490  0.7607
  −0.2  0.6500  0.4034  0.3918  0.4378  0.3539  0.6500  0.4079  0.3923  0.4519  0.3617
  −0.1  0.2164  0.1025  0.1253  0.1190  0.1068  0.2164  0.1061  0.1171  0.1317  0.1023
     0  0.0500  0.0428  0.0598  0.0500  0.0500  0.0500  0.0410  0.0597  0.0500  0.0500
   0.1  0.2164  0.1059  0.1224  0.1141  0.1063  0.2164  0.1087  0.1188  0.1194  0.1032
   0.2  0.6500  0.2983  0.2357  0.3087  0.2120  0.6500  0.2961  0.2349  0.3097  0.2144
   0.3  0.9404  0.5081  0.2179  0.5176  0.1954  0.9404  0.5114  0.2176  0.5262  0.1962

Note: The table reports simulated rejection frequencies under the null (θ = 0) and alternative (θ ≠ 0) based on 10,000 replications of Model B with d = 0 and a = 0.4.


Table 4: Finite sample rejection frequencies for Model B with d = 1 and a = 0.4

        ρ = 0                                   ρ = 0.6
     θ  Limit   LM      BH      LMsc    BHsc    Limit   LM      BH      LMsc    BHsc
n = 100
  −0.3  0.6044  0.2308  0.3211  0.1923  0.2513  0.6044  0.2157  0.3137  0.1766  0.2405
  −0.2  0.3171  0.0892  0.1639  0.0702  0.1220  0.3171  0.0897  0.1638  0.0677  0.1210
  −0.1  0.1150  0.0441  0.0885  0.0331  0.0614  0.1150  0.0376  0.0860  0.0292  0.0579
     0  0.0500  0.0591  0.0745  0.0500  0.0500  0.0500  0.0598  0.0740  0.0500  0.0500
   0.1  0.1150  0.1450  0.0973  0.1304  0.0689  0.1150  0.1577  0.0982  0.1402  0.0711
   0.2  0.3171  0.3130  0.1232  0.2924  0.0882  0.3171  0.3034  0.1155  0.2842  0.0839
   0.3  0.6044  0.3959  0.1143  0.3728  0.0818  0.6044  0.3990  0.1089  0.3765  0.0797
n = 250
  −0.3  0.9404  0.8209  0.7818  0.8520  0.7560  0.9404  0.8130  0.7797  0.8330  0.7611
  −0.2  0.6500  0.4126  0.3943  0.4651  0.3655  0.6500  0.4032  0.3923  0.4346  0.3652
  −0.1  0.2164  0.1087  0.1288  0.1366  0.1146  0.2164  0.1069  0.1241  0.1230  0.1099
     0  0.0500  0.0394  0.0570  0.0500  0.0500  0.0500  0.0425  0.0576  0.0500  0.0500
   0.1  0.2164  0.1081  0.1161  0.1217  0.1051  0.2164  0.1083  0.1217  0.1172  0.1104
   0.2  0.6500  0.3029  0.2367  0.3213  0.2185  0.6500  0.3002  0.2290  0.3093  0.2120
   0.3  0.9404  0.5084  0.2118  0.5287  0.1949  0.9404  0.5009  0.2226  0.5132  0.2032

Note: The table reports simulated rejection frequencies under the null (θ = 0) and alternative (θ ≠ 0) based on 10,000 replications of Model B with d = 1 and a = 0.4.


Table 5: Simulated size of nominal 5% test for Model B with d = 0

        ρ = 0                                                   ρ = 0.6
Test\a  −0.75   −0.5    −0.25   0       0.25    0.5     0.75    −0.75   −0.5    −0.25   0       0.25    0.5     0.75
n = 100
LM      0.0620  0.0540  0.0528  0.0497  0.0508  0.0749  0.1291  0.0498  0.0523  0.0517  0.0502  0.0516  0.0774  0.1276
BH      0.0591  0.0512  0.0550  0.0568  0.0669  0.0710  0.0588  0.0518  0.0524  0.0568  0.0595  0.0639  0.0708  0.0591
n = 250
LM      0.0556  0.0503  0.0531  0.0495  0.0437  0.0488  0.1004  0.0470  0.0503  0.0516  0.0501  0.0472  0.0471  0.1082
BH      0.0487  0.0565  0.0560  0.0524  0.0547  0.0625  0.0540  0.0510  0.0502  0.0506  0.0505  0.0566  0.0587  0.0506
n = 500
LM      0.0508  0.0496  0.0517  0.0505  0.0461  0.0426  0.0862  0.0520  0.0521  0.0532  0.0514  0.0441  0.0424  0.0895
BH      0.0512  0.0544  0.0511  0.0554  0.0490  0.0554  0.0497  0.0500  0.0508  0.0496  0.0517  0.0480  0.0597  0.0487

Note: The table reports simulated rejection frequencies under the null (θ = 0) based on 10,000 replications of Model B with d = 0.


Table 6: Simulated size of nominal 5% test for Model B with d = 1

        ρ = 0                                                   ρ = 0.6
Test\a  −0.75   −0.5    −0.25   0       0.25    0.5     0.75    −0.75   −0.5    −0.25   0       0.25    0.5     0.75
n = 100
LM      0.0569  0.0614  0.0532  0.0560  0.0527  0.0762  0.1334  0.0587  0.0592  0.0518  0.0521  0.0513  0.0690  0.1271
BH      0.0544  0.0531  0.0554  0.0595  0.0656  0.0667  0.0649  0.0561  0.0564  0.0578  0.0604  0.0669  0.0713  0.0618
n = 250
LM      0.0516  0.0541  0.0519  0.0516  0.0462  0.0505  0.1083  0.0537  0.0535  0.0544  0.0496  0.0474  0.0457  0.1064
BH      0.0509  0.0533  0.0483  0.0528  0.0556  0.0561  0.0506  0.0503  0.0529  0.0493  0.0556  0.0600  0.0611  0.0520
n = 500
LM      0.0510  0.0517  0.0528  0.0511  0.0432  0.0403  0.0870  0.0531  0.0494  0.0556  0.0491  0.0444  0.0407  0.0879
BH      0.0543  0.0498  0.0481  0.0554  0.0554  0.0564  0.0464  0.0520  0.0535  0.0493  0.0535  0.0545  0.0589  0.0515

Note: The table reports simulated rejection frequencies under the null (θ = 0) based on 10,000 replications of Model B with d = 1.


Table 7: Finite sample rejection frequencies for Model C with a = 0.4

        ρ = 0                                           ρ = 0.6
Test\θ  0       0.2     0.4     0.6     0.8     1.0     0       0.2     0.4     0.6     0.8     1.0
n = 100
LM      0.0622  0.0704  0.1921  0.4606  0.7339  0.9000  0.0609  0.0893  0.2718  0.5731  0.8376  0.9535
BH      0.0691  0.1073  0.2515  0.5294  0.7942  0.9414  0.0711  0.1306  0.3625  0.6574  0.8732  0.9619
n = 250
LM      0.0429  0.1554  0.5948  0.9256  0.9939  0.9998  0.0387  0.1862  0.6743  0.9589  0.9979  1.0000
BH      0.0583  0.1901  0.6791  0.9657  0.9992  1.0000  0.0595  0.2845  0.8556  0.9947  0.9998  1.0000
n = 500
LM      0.0411  0.3409  0.9124  0.9976  1.0000  1.0000  0.0403  0.3873  0.9388  0.9995  1.0000  1.0000
BH      0.0578  0.3942  0.9697  1.0000  1.0000  1.0000  0.0534  0.5824  0.9979  1.0000  1.0000  1.0000

Note: The table reports simulated rejection frequencies based on 10,000 replications of Model C with β = 1 and a = 0.4.


Table 8: Empirical results for the real interest rate data

(a) Univariate tests of d = 0 with non-zero mean

              p = 0               p = 1               p = 4
              LM(1)     BH(1)     LM(1)     BH(1)     LM(1)     BH(1)
USA           36.48**   25.34**   1.81      0.19      1.69      1.30
Japan         37.56**   26.97**   0.02      0.39      0.45      0.36
UK            51.98**   31.23**   0.85      0.64      0.10      0.32
Germany       26.09**   18.07**   8.99**    4.06*     0.43      2.74
France        42.63**   31.12**   0.92      1.16      1.04      8.93**
Switzerland   44.53**   28.05**   20.17**   2.25      0.14      2.05

(b) Multivariate tests of d = 0 with non-zero mean

              LM(1)     BH(36)    LMK(6)
p = 0         136.47**  166.44**  145.09**
p = 1         0.18      41.11     3.76
p = 4         2.23      76.76**   6.44

Note: One asterisk denotes significance at 5% level and two asterisks denote significance at 1% level. All test statistics are asymptotically χ²-distributed, with the appropriate degrees of freedom reported in parentheses.


Figure 1: Time series plots of real interest rates

[Figure: six time series panels (USA, Japan, UK, Germany, France, Switzerland); horizontal axis from 1980 to 1990.]