Bollerslev, T., et al. (2010), Volatility and Time Series Econometrics



ADVANCED TEXTS IN ECONOMETRICS

Editors

Manuel Arellano Guido Imbens Grayham E. Mizon

Adrian Pagan Mark Watson

Advisory Editor

C. W. J. Granger


Other Advanced Texts in Econometrics

ARCH: Selected Readings, edited by Robert F. Engle

Bayesian Inference in Dynamic Econometric Models, by Luc Bauwens, Michel Lubrano, and Jean-François Richard

Co-integration, Error Correction, and the Econometric Analysis of Non-Stationary Data, by Anindya Banerjee, Juan J. Dolado, John W. Galbraith, and David Hendry

Dynamic Econometrics, by David F. Hendry

Finite Sample Econometrics, by Aman Ullah

Generalized Method of Moments, by Alastair R. Hall

Likelihood-Based Inference in Cointegrated Vector Autoregressive Models, by Søren Johansen

Long-Run Economic Relationships: Readings in Cointegration, edited by Robert F. Engle and Clive W. J. Granger

Micro-Econometrics for Policy, Program and Treatment Effects, by Myoung-jae Lee

Modelling Economic Series: Readings in Econometric Methodology, edited by Clive W. J. Granger

Modelling Non-Linear Economic Relationships, by Clive W. J. Granger and Timo Teräsvirta

Modelling Seasonality, edited by S. Hylleberg

Non-Stationary Time Series Analysis and Cointegration, edited by Colin P. Hargreaves

Panel Data Econometrics, by Manuel Arellano

Periodic Time Series Models, by Philip Hans Franses and Richard Paap

Periodicity and Stochastic Trends in Economic Time Series, by Philip Hans Franses

Readings in Unobserved Components Models, edited by Andrew C. Harvey and Tommaso Proietti

Stochastic Limit Theory: An Introduction for Econometricians, by James Davidson

Stochastic Volatility, edited by Neil Shephard

Testing Exogeneity, edited by Neil R. Ericsson and John S. Irons

The Cointegrated VAR Model, by Katarina Juselius

The Econometrics of Macroeconomic Modelling, by Gunnar Bårdsen, Øyvind Eitrheim, Eilev S. Jansen, and Ragnar Nymoen

Time Series with Long Memory, edited by Peter M. Robinson

Time-Series-Based Econometrics: Unit Roots and Co-integrations, by Michio Hatanaka

Workbook on Cointegration, by Peter Reinhard Hansen and Søren Johansen


Volatility and Time Series Econometrics:
Essays in Honor of Robert F. Engle

Edited by

Tim Bollerslev, Jeffrey R. Russell, and Mark W. Watson


Great Clarendon Street, Oxford OX2 6DP

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide in

Oxford New York

Auckland Cape Town Dar es Salaam Hong Kong Karachi Kuala Lumpur Madrid Melbourne Mexico City Nairobi New Delhi Shanghai Taipei Toronto

With offices in

Argentina Austria Brazil Chile Czech Republic France Greece Guatemala Hungary Italy Japan Poland Portugal Singapore South Korea Switzerland Thailand Turkey Ukraine Vietnam

Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

Published in the United States by Oxford University Press Inc., New York

© Oxford University Press 2010

The moral rights of the authors have been asserted
Database right Oxford University Press (maker)

First published 2010

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this book in any other binding or cover and you must impose the same condition on any acquirer

British Library Cataloguing in Publication Data

Data available

Library of Congress Cataloging-in-Publication Data

Volatility and time series econometrics : essays in honor of Robert F. Engle / edited by Mark W. Watson, Tim Bollerslev, and Jeffrey R. Russell.
p. cm. (Advanced texts in econometrics)
ISBN 978-0-19-954949-8 (hbk.)
1. Econometrics. 2. Time-series analysis. I. Engle, R. F. (Robert F.)
II. Watson, Mark W. III. Bollerslev, Tim, 1958– IV. Russell, Jeffrey R.
HB139.V65 2009
330.01′51955–dc22 2009041065

Typeset by SPI Publisher Services, Pondicherry, India
Printed in Great Britain on acid-free paper by CPI Antony Rowe, Chippenham, Wiltshire

ISBN 978-0-19-954949-8

1 3 5 7 9 10 8 6 4 2


Contents

Introduction x

1 A History of Econometrics at the University of California, San Diego: A Personal Viewpoint 1
Clive W. J. Granger

1 Introduction 1
2 The Founding Years: 1974–1984 1
3 The Middle Years: 1985–1993 3
4 The Changing Years: 1994–2003 4
5 Graduate students 6
6 Visitors 6
7 Wives 8
8 The Econometrics Research Project 8
9 The UCSD Economics Department 8
10 The way the world of econometrics has changed 8
11 Visitors and students 9

2 The Long Run Shift-Share: Modeling the Sources of Metropolitan Sectoral Fluctuations 13
N. Edward Coulson

1 Introduction 13
2 A general model and some specializations 14
3 Data and evidence 21
4 Summary and conclusions 33

3 The Evolution of National and Regional Factors in US Housing Construction 35
James H. Stock and Mark W. Watson

1 Introduction 35
2 The state building permits data set 38
3 The DFM-SV model 45
4 Empirical results 49
5 Discussion and conclusions 60

4 Modeling UK Inflation Uncertainty, 1958–2006 62
Gianna Boero, Jeremy Smith, and Kenneth F. Wallis

1 Introduction 62
2 UK inflation and the policy environment 63
3 Re-estimating the original ARCH model 66
4 The nonstationary behavior of UK inflation 69
5 Measures of inflation forecast uncertainty 73
6 Uncertainty and the level of inflation 77
7 Conclusion 78

5 Macroeconomics and ARCH 79
James D. Hamilton

1 Introduction 79
2 GARCH and inference about the mean 81
3 Application 1: Measuring market expectations of what the Federal Reserve is going to do next 87
4 Application 2: Using the Taylor Rule to summarize changes in Federal Reserve policy 91
5 Conclusions 95

6 Macroeconomic Volatility and Stock Market Volatility, World-Wide 97
Francis X. Diebold and Kamil Yilmaz

1 Introduction 97
2 Data 99
3 Empirical results 100
4 Variations and extensions 105
5 Concluding remark 109

7 Measuring Downside Risk – Realized Semivariance 117
Ole E. Barndorff-Nielsen, Silja Kinnebrock, and Neil Shephard

1 Introduction 117
2 Econometric theory 122
3 More empirical work 128
4 Additional remarks 131
5 Conclusions 133


8 Glossary to ARCH (GARCH) 137
Tim Bollerslev

9 An Automatic Test of Super Exogeneity 164
David F. Hendry and Carlos Santos

1 Introduction 164
2 Detectable shifts 166
3 Super exogeneity in a regression context 170
4 Impulse saturation 173
5 Null rejection frequency of the impulse-based test 175
6 Potency at stage 1 179
7 Super-exogeneity failure 181
8 Co-breaking based tests 186
9 Simulating the potencies of the automatic super-exogeneity test 186
10 Testing super exogeneity in UK money demand 190
11 Conclusion 192

10 Generalized Forecast Errors, a Change of Measure, and Forecast Optimality 194
Andrew J. Patton and Allan Timmermann

1 Introduction 194
2 Testable implications under general loss functions 196
3 Properties under a change of measure 200
4 Numerical example and an application to US inflation 202
5 Conclusion 209

11 Multivariate Autocontours for Specification Testing in Multivariate GARCH Models 213
Gloria González-Rivera and Emre Yoldas

1 Introduction 213
2 Testing methodology 215
3 Monte Carlo simulations 219
4 Empirical applications 224
5 Concluding remarks 230

12 Modeling Autoregressive Conditional Skewness and Kurtosis with Multi-Quantile CAViaR 231
Halbert White, Tae-Hwan Kim, and Simone Manganelli

1 Introduction 231
2 The MQ-CAViaR process and model 232
3 MQ-CAViaR estimation: Consistency and asymptotic normality 234
4 Consistent covariance matrix estimation 237
5 Quantile-based measures of conditional skewness and kurtosis 238
6 Application and simulation 239
7 Conclusion 246

13 Volatility Regimes and Global Equity Returns 257
Luis Catão and Allan Timmermann

1 Econometric methodology 261
2 Data 265
3 Global stock return dynamics 267
4 Variance decompositions 275
5 Economic interpretation: Oil, money, and tech shocks 281
6 Implications for global portfolio allocation 287
7 Conclusion 293

14 A Multifactor, Nonlinear, Continuous-Time Model of Interest Rate Volatility 296
Jacob Boudoukh, Christopher Downing, Matthew Richardson, Richard Stanton, and Robert F. Whitelaw

1 Introduction 296
2 The stochastic behavior of interest rates: Some evidence 298
3 Estimation of a continuous-time multifactor diffusion process 307
4 A generalized Longstaff and Schwartz (1992) model 313
5 Conclusion 321

15 Estimating the Implied Risk-Neutral Density for the US Market Portfolio 323
Stephen Figlewski

1 Introduction 323
2 Review of the literature 325
3 Extracting the risk-neutral density from options prices, in theory 329
4 Extracting a risk-neutral density from options market prices, in practice 331
5 Adding tails to the risk-neutral density 342
6 Estimating the risk-neutral density for the S&P 500 from S&P 500 index options 345
7 Concluding comments 352


16 A New Model for Limit Order Book Dynamics 354
Jeffrey R. Russell and Taejin Kim

1 Introduction 354
2 The model 356
3 Model estimation 358
4 Data 358
5 Results 360
6 Conclusions 364

Bibliography 365

Index 401


Introduction

On June 20–21, 2009, a large group of Rob Engle’s students, colleagues, friends, and close family members met in San Diego to celebrate his extraordinary career. This book contains 16 chapters written to honor Rob for that occasion.

Rob’s career spans several areas of economics, econometrics, and finance. His Cornell Ph.D. thesis focused on temporal aggregation and dynamic macroeconometric models. As an assistant professor at MIT he began working in urban economics. In his long career at UCSD he continued his empirical work in macroeconomics and urban economics, and branched out into energy economics and finance, an interest that eventually led him to NYU’s Stern School of Business. His interest in applied problems and his original way of looking at them led Rob to develop econometric methods that have fundamentally changed empirical analysis in economics and finance. Along the way, Rob worked closely with scores of graduate students, fundamentally changing their lives for the better.

We have organized the contributions in the book to highlight many of the themes in Rob’s career. Appropriately, the book begins with Clive Granger’s history of econometrics at UCSD, tracing Clive’s arrival at UCSD and how he recruited a young Rob Engle to join him to build what ultimately became the dominant econometrics group of the late twentieth century. For those of us who were part of it (and, in one way or another, that includes nearly every practicing econometrician of the time), this is an extraordinary story.

The next two contributions focus on urban economics and housing. Ed Coulson investigates the sources of metropolitan fluctuations in sectoral employment by studying various restrictions on VAR representations of stochastic processes describing national, local, and industry employment. Jim Stock and Mark Watson investigate sources of volatility changes in residential construction using 40 years of state building permit data and a dynamic factor model with stochastic volatility.

Of course, Rob’s most famous contribution to econometrics is the ARCH model, and the next five contributions focus on time-varying volatility. The empirical application in Rob’s original ARCH paper was to UK inflation uncertainty, and Gianna Boero, Jeremy Smith, and Ken Wallis test the external validity of Rob’s conclusions by extending his 1958–77 sample through 2006. The ARCH class of models has subsequently found most widespread use in applications with financial data. However, Jim Hamilton shows that macroeconomists primarily interested in inference about the conditional mean, rather than the conditional variance, still need to think about possible ARCH effects in the data. Further exploring the link between macroeconomics and finance, Frank Diebold and Kamil Yilmaz examine the cross-sectional relationship between stock market returns and volatility and a host of macroeconomic fundamentals. The chapter by Ole Barndorff-Nielsen, Silja Kinnebrock, and Neil Shephard shows how the standard ARCH modeling framework may be enriched through the use of high-frequency intraday data and a new so-called realized semivariance measure of downside risk. Finally, Tim Bollerslev provides a glossary for the large number of models (and acronyms) that followed Rob’s original ARCH formulation.

The next four chapters study various aspects of dynamic specification and forecasting that have interested Rob. David Hendry and Carlos Santos propose a test for “super exogeneity”, a concept originally developed by Rob, David, and Jean-François Richard. Andrew Patton and Allan Timmermann discuss properties of optimal forecasts under general loss functions, and propose an interesting change of measure under which minimum mean square error forecast properties can be recovered. Gloria González-Rivera and Emre Yoldas develop a new set of specification tests for multivariate dynamic models based on the concept of autocontours. On comparing the fit of different multivariate ARCH models for a set of portfolio returns, they find that Rob’s DCC model provides the best specification. This section is rounded out by Hal White, Tae-Hwan Kim, and Simone Manganelli, who extend the CAViaR model for conditional quantiles that was originally proposed by Rob and Simone to simultaneously model multiple quantiles.

The final four chapters take up topics in finance. Luis Catão and Allan Timmermann study to what extent equity market volatility can be attributed to global, country-specific, and sector-specific shocks. Jacob Boudoukh, Christopher Downing, Matthew Richardson, and Richard Stanton explore the relationship between volatility and the term structure of interest rates. The continuous-time model developed in that chapter is quite general, but some of the ideas and empirical results are naturally related to Rob’s original ARCH-M paper on time-varying risk premia in the term structure. The concept of risk-neutral distributions figures prominently in asset pricing finance as a way of valuing future risky payoffs and characterizing preferences toward risk, as exemplified in Rob’s work with Josh Rosenberg. In his contribution to the volume, Stephen Figlewski provides an easy-to-follow, step-by-step procedure for the construction of well-behaved empirical risk-neutral distributions. Rob has also been a leader in developing models to analyze intraday, high-frequency transactions data in financial markets. The last chapter, by Taejin Kim and Jeffrey Russell, proposes a new model for the minute-by-minute adjustments to the limit order book.

We thank the conference sponsors Duke University, the Journal of Applied Econometrics, Princeton University, the University of Chicago, and the University of California, San Diego. We thank all of the authors for their original contributions to this volume. More importantly, on behalf of the economics profession we thank Rob for his fundamental contributions to our field. Finally, at the end of the June 21st dinner, Rob was presented with a bronze oak tree with 77 leaves. Inscribed on each leaf was the name and thesis title of one of Rob’s students. So most importantly, on behalf of all Rob’s past, present and future students we say simply “Thanks for growing us.”

Tim Bollerslev
Jeffrey R. Russell
Mark W. Watson


1

A History of Econometrics at the University of California, San Diego: A Personal Viewpoint

Clive W. J. Granger

1. Introduction

It is difficult to decide when a history should start or finish, but as this account is based on my own recollections, I decided to start in 1974. This was when I arrived at the University of California, San Diego (UCSD) with a full-time position, although I had been a visitor for six months a couple of years earlier. The account will end in 2003, when both Rob Engle and I officially retired from UCSD. Of course, history never really ends, and it will be up to later participants in the program to add further essays in the future.

The account has been divided into three convenient periods: 1974–1984, the founding years; 1985–1993, the middle years; and 1994–2003, the changing years.

2. The Founding Years: 1974–1984

I arrived at UCSD in the summer of 1974 having spent 22 years at the University of Nottingham in England (apart from a year as a post-doc at Princeton in 1959–1960), starting there as an undergraduate and finishing as a Full Professor.

At the time of my arrival the teaching of econometrics was largely done by John Hooper, who was well known but not actively engaged in research. For my first year I was accompanied by Paul Newbold from Nottingham so that we could finish off our book on forecasting economic time series. We were surprised by how much time we had to work at UCSD compared to England, as our teaching was easy, marking and student help was provided by graduate students, lunch was brief, and there were no lengthy tea or coffee breaks during the day.

The head of the department was Dan Orr, who had been my best man when Patricia and I were married in Princeton’s chapel in 1960, so we automatically had some good friends in the department.

During the first year I found myself on an outside committee chaired by Arnold Zellner of Chicago, which was organizing a large conference on seasonal adjustment. The committee met in Washington, DC to make a variety of decisions. Also on the committee was Rob Engle, then at MIT. After the meeting he asked me if I knew of a department looking for a time series econometrician, to which I replied that we were. He came out for a visit and both sides liked each other. Rob joined the department in the fall of 1975.

I had met Rob a couple of years earlier in a fortunate manner. Marc Nerlove at Chicago and I had been asked to select the speakers for three sessions at the forthcoming Econometrics World Congress. As might be expected we had many good applications, many from well-known people. However, we decided to dedicate one of our sessions just to young, promising, and (at that time) unpublished authors. Amongst the three we chose were Rob as well as Chris Sims, which suggests that long run forecasting is quite possible! It produced a good session at the congress.

A couple of years later, Hal White came as a visitor from the University of Rochester during our spring quarter, 1979. He soon found that he quite liked the department but greatly liked the beaches and weather. He joined us permanently in 1980, completing the initial group.

By the end of this period, in 1984, all three of us were Fellows of the Econometric Society: Rob in 1982, Hal in 1983, and I had been one since 1972, which made us a small but distinguished group on the West Coast. We did our research not only alone but also jointly when we found appropriate topics. We would be on the committees of each other’s graduate students and also attend the almost weekly econometrics seminar, which was usually given by a visitor. Early in this period we started the “Tuesday’s Econometricians Lunch” at a local restaurant, initially with just Rob, Hal, and myself and the occasional visitor. The topics could be far ranging, from football to going through one of our papers almost line by line. Some of our visitors so liked the idea that they adopted it, particularly Nuffield College, Oxford and Monash University in Melbourne. As our numbers grew, we stopped going out but instead met in the department for a “brown bag” luncheon. Some of the more famous ideas that came out of the group first saw the light in these meetings, as well as some other ideas that did not survive.

Two developments in this period are worth special attention as they produced two Nobel Prizes: Autoregressive Conditional Heteroskedasticity (ARCH) for Rob and Cointegration for me. I had written a paper on forecasting white noise, which was quite controversial. It suggested that functions of an uncorrelated series could be autocorrelated. When Bob Hall visited from Stanford to give a macroseminar, which Rob and I both attended, he had several equations with residuals having no serial correlation. I suggested to Rob that the squares of these residuals might not be white noise, but he did not agree. He was still connected electronically to MIT, so he called up the same data that Hall had used, performed the identical regressions, obtained the residuals, squared them, and found quite strong autocorrelations. Later, on a visit to the London School of Economics, he considered what model would produce this behavior and found the ARCH class, which has been such an enormous success. It is interesting to note that this example and also the one used in Rob’s first paper in the area were from macroeconomics, whereas the vast majority of its applications have been in finance.
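The diagnostic described here, squaring a seemingly uncorrelated series and checking whether the squares are autocorrelated, is easy to reproduce. The following minimal NumPy sketch (with illustrative parameters, not the Hall data or anyone's actual code) simulates an ARCH(1) process and compares the lag-1 autocorrelation of the levels with that of the squares:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an ARCH(1) process: the series itself has no serial correlation,
# but its conditional variance depends on the previous squared shock.
# (omega and alpha are illustrative values only.)
n, omega, alpha = 5000, 0.2, 0.5
eps = np.zeros(n)
sigma2 = np.full(n, omega / (1 - alpha))  # start at the unconditional variance
for t in range(1, n):
    sigma2[t] = omega + alpha * eps[t - 1] ** 2
    eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation of a series."""
    x = x - x.mean()
    return float(x[1:] @ x[:-1] / (x @ x))

# The levels look like white noise, yet the squares are clearly autocorrelated.
print(f"lag-1 autocorr of eps:   {lag1_autocorr(eps):+.3f}")
print(f"lag-1 autocorr of eps^2: {lag1_autocorr(eps**2):+.3f}")
```

The levels pass a casual white-noise check while the squares do not, which is exactly the signature that led to the ARCH class.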

From the start I decided not to do research in ARCH and to leave the field to Rob, as it was clear that it would be heavily involved with financial data, which was an area I had decided to leave, at least most of the time, a couple of decades before. Autoregressive Conditional Heteroskedasticity has become a huge research and application success, mostly within the finance area, and many of our Ph.D. students chose to work in this area.

Cointegration arose from my interest in the “balance” of econometric models, where if one side of an equation contains a strong feature, such as a trend, then necessarily the other side must also do so. I was particularly concerned about the error-correction model being possibly out of balance. I had a disagreement with David Hendry, of Oxford, who said he thought it was possible to add two I(1) series and get an I(0) process. I said he was wrong, but my attempt at a proof found cointegration and showed that he was correct.
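The point at issue, that two I(1) series can combine linearly into an I(0) series when they share the same stochastic trend, can be illustrated with a small simulation. This NumPy sketch is purely hypothetical; the cointegrating weight of 2 is chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# x and y share a common random-walk (I(1)) trend plus stationary noise,
# so each series is I(1), yet the combination y - 2x is stationary (I(0)).
trend = np.cumsum(rng.standard_normal(n))
x = trend + rng.standard_normal(n)
y = 2 * trend + rng.standard_normal(n)

spread = y - 2 * x  # the cointegrating combination

# The I(1) series wander, so their sample standard deviation is large
# and grows with the sample size; the spread's stays bounded.
print(f"sample std of x:      {x.std():.2f}")
print(f"sample std of spread: {spread.std():.2f}")
```

The individual series drift arbitrarily far from their starting points, while the spread behaves like ordinary stationary noise.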

I did publish a couple of small papers on cointegration, but in trying to get something into Econometrica I was told that they would need a discussion of testing, estimation, and an application. Rob said that he would be happy to provide these, and he became a co-author of a paper that eventually became a “Citation Classic.”

In this first period Rob produced 41 papers, five of which appeared in Econometrica, concerning spectral analysis and particularly spectral regression, regional economics, electrical residential load forecasting, various testing questions, exogeneity, forecasting inflation, and ARCH (in 1982). The exogeneity work has David Hendry as a co-author and links together my causality ideas and the statistical assumptions underlying estimation. Details can be found in his CV on his website.

In this period Hal produced one book, two edited volumes, and 14 papers, five of which appeared in Econometrica. Amongst the topics of interest to him were a variety of testing and asymptotic questions, maximum likelihood estimation of mis-specified dynamic models, and mis-specified nonlinear regression models.

My own contributions in this period were three books and 86 papers,¹ concerning forecasting, transformed variables, temporal and spatial data, causality, seasonality, nonlinear time series, electric load curve pricing and forecasting, and the invertibility of time series. The innovation published in this period that had the greatest impact was fractional integration, or long memory processes.

3. The Middle Years: 1985–1993

In this period the econometrics group was steadily productive, had many excellent visitors as discussed below, and also built the reputation of the graduate program substantially (also discussed in more detail later). This was a period of consolidation and growth in maturity. Towards the end of the period the original three of us were joined by Jim Hamilton, who works in time series and macroeconometrics and had previously been a visitor here, as had Hal and I.

¹ In counting papers I have excluded notes, comments, and book reviews.


In this period Rob produced one book and 40 articles on topics including: Kalman filters, ARCH-M, cointegration and error-correction, meteor showers or heat waves (with an application to volatility in the foreign exchange market), modeling peak electricity demand, implied ARCH models for option prices, seasonal cointegration, testing super-exogeneity in variance, and common features and trends.

In this period Hal produced one book and 29 papers. Some of the papers considered neural networks, interval forecasting, trends in energy consumption, and testing for neglected nonlinearity in time series models.

He also had several papers attempting to win the “least comprehensible title” competition. Examples are “Efficient Instrumental Variables Estimation of Systems of Implicit Heterogeneous Nonlinear Dynamic Equations with Nonspherical Errors” and “Universal Approximation Using Feedforward Networks with Non-Sigmoid Hidden Layer Activation Functions.” He is well known for the robust standard errors now known as “White’s standard errors.”

In his short couple of years with the department, Jim produced an enormous and highly successful textbook on “Time Series” and also an article in the American Economic Review, as well as his important work on switching regime models.

My own contributions in this period were three books and 60 articles. The topics include aggregation with common factors, cointegration, causality testing and recent developments, models that generate trends, nonlinear models, chaos, gold and silver prices, multicointegration, nonlinear transformations of integrated series, treasury bill curves and cointegration, and positively related processes.

One active area of research in this period concerned electricity prices and was conducted within a small consulting company called QUERI, directed by Rob, Ramu Ramanathan, and myself. The advantages of the work were that we were encouraged to publish, and a couple of graduate students obtained their degrees on the topics and took jobs with the electricity production industry. We were involved in an interesting real-time forecasting project of hourly electricity demand in a particular region of the Northwest. Using a very simple dynamic model we beat several other consulting groups who used rather complicated and sophisticated methods. The following year we also won, and we were not allowed to enter in the third year because the organizers wanted a different method to win. We submitted a paper to a leading journal about our experiences, but it was initially rejected because the editor said it was not surprising that forecasts provided by Rob and myself won a competition.

In this eight-year period the group produced six books and 130 papers, often highly innovative and progressive.

4. The Changing Years: 1994–2003

In the previous period both Hal and Rob had received very tempting offers from other universities but fortunately had been persuaded to stay at UCSD. However, in this third period the inevitable changes started to occur when, in 1999, Rob accepted a professorship at the Stern School at New York University (NYU), although he did not officially retire from UCSD until 2003.

This period started on a high note as two new econometricians joined us: Graham Elliott and Allan Timmermann. Both immediately showed considerable quality and enthusiasm. Allan is best known for his financial forecasting models and Graham for unit root inference. For the first few years we had six econometricians at UCSD, and the lunches, seminars, and other activities all continued. However, towards the end of the period the newer members had to take charge, as Rob had left, Hal was often involved in consulting projects, and I was running out of energy.

I finish this account in 2003 because that is the year that Rob and I both officially retired from UCSD, and then a few months later we heard that we had won the Nobel Prize. Whether or not there is any causality involved will have to be tested by later retirements.

Of course, the econometrics program is continuing and remains very active with Jim, Allan, Graham, and the more recent additions of Yixiao Sun and Ivana Komunjer.

In this period, while at UCSD, Rob published one book and 16 articles, and in the five years at Stern he had one book and 10 articles. These included work on international transmission of volatility, forecasts of electricity loads, and autoregressive conditional duration.

Hal was very productive in the period, with one book and 40 articles. The topics included the dangers of data mining (with Allan) and reality checks for data snooping, testing for stationarity, ergodicity, and for co-movement between nonlinear discrete-time Markov processes.

Jim published one book and 14 papers, of which one was in Econometrica and one in the American Economic Review. He also became a Fellow of the Econometric Society. His research topics included testing Markov switching models, asking “What do leading indicators lead?”, measuring the liquidity effect, the daily market for federal funds, what is an oil shock, and the predictability of the yield spread.

Allan published 30 papers on topics including implied volatility dynamics and predictive densities, nonlinear dynamics of UK stock returns, structural breaks and stock prices, moments of Markov switching models, data snooping, reform of the pension systems in Europe, and mutual fund performance in the UK.

Graham published 12 papers, three of which appeared in Econometrica. The topics included near nonstationary processes, testing unit roots, cointegration testing and estimation, monetary policy, and exchange rates.

I published two books and 65 papers. The books were about deforestation in the Amazon region of Brazil, and about modeling and evaluation. I was elected Corresponding Fellow of the British Academy and a Distinguished Fellow of the American Economic Association. Rob, Hal, and I all became Fellows of the American Academy of Arts and Sciences.

In all, the econometricians at UCSD produced five books and 187 papers in this period. We received two awards for best paper of the year from the International Journal of Forecasting (one by Hal and one by myself).

The period ended on a high note as Rob and I each received the Nobel Prize in Economics for 2003. The awards were presented in Stockholm in an exciting and memorable ceremony before many of our family members, colleagues, and friends. Rob’s award was for ARCH and mine was for cointegration, although causality was also mentioned.

Although not explicitly stated, I do believe that we partly won the awards for helping to develop the econometrics group at San Diego in 30 years from being inconsequential and unranked to a major group of substantial importance. A couple of rankings produced by the journal Econometric Theory had UCSD ranked in the top three departments in the world. A later ranking, considering the productivity of students after obtaining their doctorates, ranked our students second, which I think is excellent. It suggests that we produce some serious academic econometricians.

5. Graduate students

On my arrival at San Diego I was told that I had a graduate student and that he was a rather unusual one. He was an Augustinian monk named Augustine. He was based at the Escorial in Spain and the story of how he found himself at UCSD was rather complicated. His religious order in Spain ran a college, not exclusively religious, and wanted someone to teach econometrics. They thought he was good at mathematics so they sent him to the United States to learn first statistics and then econometrics. Why he chose us was not clear, but I was happy that he had passed the preliminary examination satisfactorily. However, I was surprised that he had decided to study stock market prices as a Ph.D. topic. After a successful first year he was called back by his order to start teaching and so did not finish his degree in San Diego. Later, he rose to a very high position in the college and always retained a very positive outlook, was cheerful, monkish, and delightful.

We have attracted some excellent graduate students who have built very successful careers, such as Mark Watson, Tim Bollerslev, and Norm Swanson, but to mention just a few is unfair to all the others, many of whom have been terrific.

Unfortunately the department has not kept careful records of all our students, and so the lists that are included with this paper are of those who worked in some way with Rob or published with the other faculty members. I am sure that many of our excellent students have been left off the list and I apologize for this.

From the very beginning we had good students, some very good students, and in later years several excellent students. Many have built successful academic careers and have become well-known academics. As well as the steady flow from the United States, we have had memorable students from Spain, Canada, Australia, England, New Zealand, Taiwan, China, Korea, Hong Kong, Japan, Mexico, Italy, and Lebanon. Although some stayed in the US, most returned to their home countries, which makes our international travel interesting these days. The quality and quantity of our graduate students certainly increased the standing of the department and made working here a pleasure.

6. Visitors

The location of the campus near the beaches of San Diego and the cliffs of Torrey Pines, the usually great weather, and the steadily improving group of econometricians quickly attracted the notice of possible visitors to the department, especially econometricians. Over the years we have enjoyed visits from many of the very best econometricians in the world, such as David Hendry, Søren Johansen, and Timo Terasvirta. There was a period when all three were here together with Katrina Juselius, James MacKinnon, and Tony Hall, which produced some exceptionally interesting discussions, socializing, and tennis. Over the years we have received and hopefully welcomed an incredible group of visitors (a list is attached). I apologize to anyone who is left off, but no official list was maintained and my memory is clearly fallible.

The visitors added a great deal of depth, breadth, and activity to the econometrics group, as well as further improving our social life.

To illustrate the impact that the UCSD econometrics group had, it is worth looking at the article “What Has Mattered to Economics Since 1970” by E. Han Kim, Adair Morse, and Luigi Zingales, published in the Journal of Economic Perspectives, volume 20, number 4, Fall 2006, pages 189–202. The most cited article, with 4,318 citations, is Hal White’s 1980 piece titled “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity,” Econometrica, volume 48, pages 817–838.

The fourth most cited article is by Rob Engle and myself in 1987 on “Co-integration and Error Correction: Representation, Estimation, and Testing,” which appeared in Econometrica, volume 55, pages 251–276, with 3,432 citations.

The 10th most cited article is also by Rob Engle, in 1982, on “Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of United Kingdom Inflation,” which appeared in Econometrica, volume 50, pages 987–1007, with 2,013 citations.

Thus the UCSD group registered three of the top ten most cited papers, with a total of nearly 10,000 citations between them. Also in the top 10 was a paper by one of our visitors, Søren Johansen, in 1988, “Statistical Analysis of Cointegration Vectors,” from the Journal of Economic Dynamics and Control, volume 12, pages 231–254. It is worth noting that this article lists the most cited articles throughout economics and not just econometrics.

Appearing at number 24 is Tim Bollerslev, a UCSD graduate, with his paper on GARCH from the Journal of Econometrics, volume 31, pages 307–327, with 1,314 citations. Hal White appears again at 49th place with his 1982 paper on “Maximum Likelihood Estimation of Misspecified Models,” Econometrica, volume 50, pages 1–25. Finally, in 72nd place is Jim Hamilton with his 1989 paper on “A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle,” Econometrica, volume 57, pages 357–384, in which he introduced his well-known regime switching model.

Our final two mentions in the published list involve our own graduates. At 92 is T. Bollerslev, R. Chou, and K. Kroner on “ARCH Modeling in Finance,” from the Journal of Econometrics, volume 52, with 1,792 citations; and at number 99 are Rob Engle and B. S. Yoo on “Forecasting and Testing in Co-integrated Systems,” also from the Journal of Econometrics, volume 35, with 613 citations.

To illustrate how highly ranked these papers are, it is worth noting that further down, at numbers 131, 132, and 133, are three very well-known Econometrica papers: by Jim Durbin (1970) on “Testing for Serial Correlation in Least Squares Regressions,” by Trevor Breusch and Adrian Pagan (1979) on a “Simple Test for Heteroskedasticity and Random Coefficient Variation,” and by Roger Koenker and Gilbert Bassett (1978) on “Regression Quantiles.”

Most publications in economics get very few citations, so the papers mentioned here have been exceptionally successful.

There are a few concepts in our field that can be considered as “post-citation.” Examples are the Durbin–Watson statistic and Student’s t-test, which are frequently mentioned but very rarely cited. “Granger causality” seems to fall into this category now, and we should expect that ARCH will also fall into it.


7. Wives

Virtually all good researchers are backed up by a patient and caring wife, and it would be wrong to write this history without mentioning our wives. The original three, Marianne, Patricia, and Kim, were later joined by Marjorie and then by Solange. All have made substantial contributions to our success.

8. The Econometrics Research Project

In 1989 the UCSD administration decided to reward the publishing success of the econometrics group by forming an official UCSD Research Project for Econometrics. It was designed to encourage our faculty to seek research grants that could use the project as a home office with little overhead. It was also charged with being helpful to our visitors.

In 1992 we were fortunate to have Mike Bacci join us after service in the US Navy. He keeps the project in good shape and has been extremely helpful to our many visitors, both senior and students, particularly those from overseas.

9. The UCSD Economics Department

The department has provided a supportive environment that has allowed us to grow both in size and quality. It has matured a great deal itself, now containing excellent scholars in several fields.

There have been close research links between the econometricians and other faculty members, often leading to publications, including Ramu Ramanathan (with Rob and myself), Max Stinchcombe (with Hal), Mark Machina (with me), and Valerie Ramey (with myself).

Although many of us are most excited by doing research and derive a great deal of pleasure from being involved in successful research projects, we are actually paid to teach both undergraduates and graduates. Over the years the number of students involved grew substantially. This produced larger class sizes and consequently changes in the methods of teaching. These developments allowed some of us to develop new approaches towards getting our messages across to classes who seem to be declining in levels of interest. The graduate students, acting as teaching assistants (TAs), were essential in helping overcome any problems. However, the UCSD faculty continued to give lectures and be available for discussions.

10. The way the world of econometrics has changed

When the UCSD econometrics group was starting, in the mid- to late 1970s, the field of econometrics was still dominated by large simultaneous models with little dynamics, often built using short lengths of annual or, at best, quarterly data. The problems of how to specify, estimate, test, and identify such models were essential ones, but very difficult, and some excellent econometrics was done on these topics. When data are insufficient in quantity one always replaces them, or expands them, by using theory. Evaluation of these models was difficult, but forecasting comparisons were used.

The advent of faster computers and more frequent data, even in macro but particularly in finance, brought more attention to time series methods, such as those developed by Box and Jenkins. Forecast comparisons usually found that the new, simpler, dynamic models outperformed the old models. The fact that some of the classical techniques, such as linear regression, could perform badly when series are I(1) also turned researchers’ attention to the new methods.

Some very famous university economics groups moved very reluctantly away from the classical areas, and the research of the new groups, such as at UCSD, was not always well received. A sign of this can be seen in the development of standard econometrics textbooks. The early versions, available in the 1970s and well into the 1980s and sometimes beyond, would make virtually no mention of time series methods, apart from a possible brief mention of an AR(1) model or a linear trend. Today many textbooks cover almost nothing but time series methods, with considerable attention being paid to ARCH, cointegration, fractional integration, nonlinear models including neural networks, White robust standard errors, regime switching models, and causality, all of which were developed at UCSD.

I think that it can be claimed that the work at UCSD has had a major impact. It will be a difficult task to keep this level of activity going.

Throughout the years discussed above the major institution concerned with econometrics was the Econometric Society, and it was influential through its journal, Econometrica, started in 1933. It is acknowledged to be one of the most prestigious within the field of economics. Several of us have published in Econometrica, particularly Rob and Hal, and four of us are Fellows of the society.

However, there have been remarkably few contacts between the organization of the society and the UCSD group. Rob was a member of the council for two three-year terms and I was a member for one term, but we were not asked to be active. Rob was an associate editor of Econometrica for the years 1975–1981, and that is the total of the contacts! We were asked to be on the boards of many other journals, but somehow those who run the Econometric Society never warmed to what was being achieved here.

11. Visitors and students

Much of the strength of the UCSD econometrics group came from the quality of our students and visitors. Unfortunately no comprehensive lists were kept of our students or visitors, so to make appropriate lists we have taken two approaches. In list “A” are all the students who had Rob Engle as one of their examiners, so that he signed their theses, up to the year 2003. There are 60 names on list “A.”

For the rest of us we have just listed graduate students who published with us up to the year 2003 or so. These lists give 31 students for Granger, 25 for White, eight for Hamilton, 10 for Timmermann, and one for Elliott, giving a total of 75 in all, although several students appear on more than one list. There is, of course, a great deal of overlap between the Engle list and these other lists.


There are 44 names on the visitors list, and this is a very distinguished group. What follows is a partial list of distinguished econometricians who visited UCSD:

Lykke Andersen, Allan Anderson, Badi Baltagi, Alvaro Escribano, Philip Hans Franses, Ron Gallant, Peter Bossaerts, Peter Boswijk, James Davidson, Jesus Gonzalo, Neils Haldrup, Tony Hall, David Hendry, Kurt Hornik, Svend Hylleberg, Joao Issler, Eilev Jansen, Michael Jansson, Søren Johansen, Katarina Juselius, Jan Kiviets, Erich Kole, Asger Lunde, Helmut Lutkepohl, Essie Maasoumi, J. Magnus, John MacDonald, James MacKinnon, Graham Mizon, Ulrich Mueller, Paul Newbold, Dirk Ormoneit, Rachida Ouysse, Gary Phillips, Ser-Huang Poon, Jeff Racine, Barbara Rossi, Pierre Siklos, Norm Swanson, Timo Terasvirta, Dag Tjøstheim, Dick van Dijk, Herman van Dijk, Andrew Weiss, Minxian Yang

A. List of students for whom Rob Engle signed the Ph.D. thesis as an examiner

Richard Anderson, Heather Anderson, Yoshihisa Baba, Tim Bollerslev, Michael Brennan, Kathy Bradbury, Scott Brown, Sharim Chaudhury, Ray Chou, Mustafa Chowdhury, Riccardo Colacito, Ed Coulson, Zhuanxin Ding, Ian Domowitz, Alfonso Dufour, Alvaro Escribano, Ying-Feng (Tiffany) Gau, Isamu Ginama, Gloria Gonzalez-Rivera, Jesus Gonzalo, Peter Hansen, Andreas Heinen, Che-Hsiung (Ted) Hong, Owen Irvine, Isao Ishida, Joao Issler, Oscar Jorda, Sharon Kozicki, Dennis Kraft, Sandra Krieger, Kenneth Kroner, Joe Lange, Gary Lee, Han Shik Lee, Wen-Ling Lin, Henry Lin, Simone Manganelli, Juri Marcucci, Robert Marshall, Allen Mitchem, Frank Monforte, Walter Nicholson, Victor Ng, Jaesun Noh, Andrew Patton, Lucio Picci, Gonzalo Rangel, Russell Robins, Joshua Rosenberg, Jeffrey Russell, Dean Schiffman, Kevin Sheppard, Aaron Smith, Gary Stern, Zheng Sun, Raul Susmel, Farshid Vahid, Artem Voronov, Mark Watson, Jeff Wooldridge, Byungsam (Sam) Yoo, Allan Zebede


B. List of UCSD students who published with Granger

Lykke Andersen, Heather Anderson, Melinda Deutsch, Zhuanxin Ding, Luigi Ermini, Raffaella Giacomini, Jesus Gonzalo, Jeff Hallman, B.-N. Huang, Tomoo Inoue, Yongil Jeon, Roselyn Joyeux, Mark Kamstra, Dennis Kraft, Chung-Ming Kuan, H.-S. Lee, T.-H. Lee, C.-F. Lin, J.-L. Lin, Matthew Mattson, Allan Mitchem, Norm Morin, Andrew Patton, Russell Robins, Chor-Yiu Sin, Scot Spear, Norman R. Swanson, Farshid Vahid-Araghi, Mark Watson, Sam Yoo

C. List of UCSD graduate students who published with Elliott

Elena Pesavento

D. List of UCSD graduate students who published with White

Stephen C. Bagley, Xiaohong Chen, C.-S. James Chu, Valentina Corradi, Ian Domowitz, Raffaella Giacomini, Silvia Goncalves, Christian Haefke, Yong-Miao Hong, Mark Kamstra, Pauline Kennedy, Tae-Hwan Kim, Robert Kosowski, Chung-Ming Kuan, T.-H. Lee, Robert Lieli, Matthew Mattson, Teo Perez-Amara, Mark Plutowski, Shinichi Sakata, Chor-Yiu Sin, Liangjun Su, Ryan Sullivan, Norman R. Swanson, Jeff Wooldridge

E. List of UCSD graduate students who published with Hamilton

Michael C. Davis, Ana Maria Herrera, Oscar Jorda, Dong Heon Kim, Gang Lin, Josefa Monteagudo, Gabriel Perez-Quiros, Raul Susmel


F. List of UCSD graduate students who published with Timmermann

Marco Aiolfi, Massimo Guidolin, Robert Kosowski, Asger Lunde, David Miles, Andrew Patton, Bradley Paye, Gabriel Perez-Quiros, Davide Pettenuzzo, Ryan Sullivan


2

The Long Run Shift-Share: Modeling the Sources of Metropolitan Sectoral Fluctuations

N. Edward Coulson

1. Introduction

In this tribute to the career of Robert Engle, attention should be given to an aspect of his early career that is not universally recognized, that of urban and regional economist. As related in his interview with Diebold (2003), upon arriving at the Massachusetts Institute of Technology (MIT) Engle was asked by Franklin Fisher and Jerome Rothenberg to collaborate on the construction of a multi-equation structural model of the Massachusetts economy, and this led to a number of publications at the outset of his career. His involvement with the field did not end there. An examination of Engle’s curriculum vitae reveals that of his first 13 publications, seven were in the field of urban economics, and there are many more publications in that area through the early 1990s. Perhaps of equal interest is the fact that many of his contributions to “pure” econometrics used urban and regional data to illustrate the methods associated with those contributions. Two prominent examples are his paper on parameter variation across the frequency domain (Engle, 1978a), and Engle and Watson (1981), which introduced the Dynamic Multiple-Indicator Multiple-Cause (DYMIMIC) model by treating

Acknowledgments: My thanks go to Mark Watson, an anonymous referee, and participants at presentations at the 2006 Regional Science Association and the Federal Reserve Banks of New York and St. Louis for helpful comments.


the decomposition of metropolitan wage rates. As he notes in the interview with Diebold, “there is wonderful data in urban economics that provides a great place for econometric analysis. In urban economics we have time series by local areas, and wonderful cross sections . . . ”.

One of the natural links between urban economics and time series econometrics is the examination of aggregate urban fluctuations. Because of data availability, such analysis focuses on the determination of metropolitan area employment and labor earnings, and, again because of the data, sectoral level data are often employed in the analysis. This is helpful and appropriate, because both journalistic and academic explanations of the differences in cyclical movements of aggregate urban employment often center on differences in sectoral composition across metropolitan areas. On that account the focus turns to the sources of fluctuations in metropolitan industry sectors. For example, Brown, Coulson and Engle (1991), following Brown (1986), ask the basic question of whether or not metropolitan industry sectors are cointegrated (Engle and Granger, 1987) with their national industry counterparts, and Brown, Coulson and Engle (1992) ask, in the context of constructing export base multipliers, under what circumstances metropolitan sectoral employment is cointegrated with aggregate metropolitan employment.

In what follows, I build on the methods of the above papers and others and propose to delineate the sources of sectoral fluctuations in metropolitan economies. This delineation has four steps. First, a general “city-industry” vector autoregression (VAR) is constructed, which accounts for both short and long run fluctuations at a number of different levels of aggregation. Second, a large number of “traditional” models of regional economics (including the two cointegration analyses of the preceding paragraph) are shown to be reductions of this general VAR, although a by-product of the analysis is that it is not likely that all of these reductions can be applied simultaneously. Both of these steps occur in the next section. In Section 3 the restrictions implied by the traditional models are tested using data from 10 sectors and five cities. None is found to be universally applicable, though some do less violence to the data than others. Given these results, the fourth step of estimating the complete VARs (for each city industry) is undertaken under four different assumptions. The overall result is that the traditional models are unsatisfactory because they neglect the role of local supply shocks, although this neglect does more damage in “short run” models than in those that invoke cointegration.

2. A general model and some specializations

The goal of this analysis is to estimate the sources of sectoral fluctuations in a metropolitan area – for example, the Los Angeles manufacturing sector. Such sources can be conveniently catalogued as arising from four different levels: national (aggregate US), industrial (US manufacturing), metropolitan (aggregate Los Angeles), and sources that are idiosyncratic to the particular metropolitan sector. Consider, then, the following VAR, which for simplicity is restricted to first order autoregressive processes


(an assumption relaxed in its empirical implementation):

$$
\begin{pmatrix}\Delta n\\ \Delta i\\ \Delta m\\ \Delta e\end{pmatrix}_{t}
=\begin{pmatrix}k_1\\ k_2\\ k_3\\ k_4\end{pmatrix}
+A_1\begin{pmatrix}\Delta n\\ \Delta i\\ \Delta m\\ \Delta e\end{pmatrix}_{t-1}
+A_0\begin{pmatrix}n\\ i\\ m\\ e\end{pmatrix}_{t-1}
+\begin{pmatrix}u_n\\ u_i\\ u_m\\ u_e\end{pmatrix}_{t},
\qquad
\operatorname{cov}\begin{pmatrix}u_n\\ u_i\\ u_m\\ u_e\end{pmatrix}=\Omega
\tag{1}
$$

where

nt = log of aggregate national employment at time t
it = log of national employment in the same specified industry at time t
mt = log of aggregate local employment at time t
et = log of local employment in a specified industry at time t,

and the ki are intercept terms.
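As a concrete sketch (not the paper's estimation procedure), a first-order system like (1) can be fit equation by equation with ordinary least squares, regressing Δx_t on a constant, Δx_{t−1}, and the levels x_{t−1}. The data below are simulated random walks, invented purely to show the layout:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate T observations of the four log-employment series (n, i, m, e)
# as random walks with drift -- illustrative data only, not the paper's.
T = 200
x = np.cumsum(0.005 + 0.01 * rng.standard_normal((T, 4)), axis=0)

dx = np.diff(x, axis=0)                      # first differences, shape (T-1, 4)
# Regressors for (1): a constant, the lagged differences, and the lagged levels.
Z = np.hstack([np.ones((T - 2, 1)), dx[:-1], x[1:-1]])
Y = dx[1:]                                   # left-hand side: current differences

# Multivariate OLS: each column of B stacks one equation's coefficients.
B, *_ = np.linalg.lstsq(Z, Y, rcond=None)
k_hat = B[0]          # intercepts k1..k4
A1_hat = B[1:5].T     # coefficients on the lagged differences
A0_hat = B[5:9].T     # coefficients on the lagged levels
print(A0_hat.shape)   # (4, 4)
```

In the chapter the lag order is longer and the rank of A0 is restricted; this unrestricted regression is only meant to make the pieces of (1) concrete.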

Consideration of this issue has been one of the primary concerns of empirical regional economics over the past half century.1 Over that period of time, a number of models have been developed that in essence impose extra structure on the parameters of (1). In the extreme, such simplifications become shift-share decompositions, requiring no estimation at all. The exact form of the shift-share decomposition varies somewhat. Our baseline version of this decomposition seems to originate in Emmerson, Ramanathan and Ramm (1975):

$$
\Delta e_t = \Delta n_t + (\Delta i_t - \Delta n_t) + (\Delta m_t - \Delta n_t) + (\Delta e_t - \Delta m_t - \Delta i_t + \Delta n_t).
\tag{2}
$$

Growth in a local industry is decomposed into four parts. The first component, the national component, estimates the impact of national employment movements on local employment movements. If, say, national employment grows at 5% in a year, then, other things equal, the local industry – say the finance sector in Boston – is also expected to grow at the same 5% rate. The second component, the industry component, is the deviation of the national industry growth rate from that of the nation as a whole. Thus if the national finance sector grew at a rate of 10%, then the Boston finance sector should, other things equal, also be expected to grow at that same rate, with national and industry factors each responsible for half of that growth. Similarly, the third component is dubbed by Dunn (1960) the “total share component”, and is the deviation of the overall metropolitan growth rate from the national growth rate; obviously this is the contribution of local aggregate growth to local sector growth. The fourth component

1It should be noted at the outset that such a model can only be used to assess the sources of fluctuations of et, and not the other three series, all of which include et in their aggregations. A finding that e was a source of fluctuations of n, m, or i would seem to be vacuous without consideration of the impact of other industries or locations. For an analysis of the reverse question, industry and regional impacts on national fluctuations, see e.g. Horvath and Verbrugge (1996), Altonji and Ham (1990), and Norrbin and Schlagenhauf (1988). At the metropolitan level the role of sectoral fluctuations in determining aggregate metropolitan employment is discussed in Coulson (1999) and Carlino, DeFina and Sill (2001).


is the change in the industry’s share of employment at the metropolitan level relative to its share at the national level. It is the percentage change in the familiar location quotient and is interpretable as the outcome of local supply shocks to local employment growth (given that the total share component nets out local demand factors and the industry component presumably nets out technology shocks that are common to all locations).
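The decomposition in (2) can be sketched numerically; the growth rates below are invented solely for illustration:

```python
# Hypothetical one-year log growth rates (decimal form), invented for illustration.
d_n = 0.05  # national aggregate
d_i = 0.10  # national industry (say, finance)
d_m = 0.07  # local aggregate (say, Boston)
d_e = 0.12  # local industry (say, Boston finance)

national    = d_n                        # national component
industry    = d_i - d_n                  # industry component
total_share = d_m - d_n                  # Dunn's total share component
location    = d_e - d_m - d_i + d_n      # change in the location quotient

components = (national, industry, total_share, location)
# By construction the four components sum back to local-industry growth.
assert abs(sum(components) - d_e) < 1e-12
print([round(c, 6) for c in components])
```

With these numbers the national and industry components each contribute 5%, the total share 2%, and the location quotient component is zero, matching the 12% local-industry growth.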

How can the shift-share model be used to inform the specification of the VAR (1)? There are effectively two general approaches, which are not mutually exclusive, though for the purposes of this paper they will be. One is to view (2) as an orthogonalization; that is, each of the components is assumed to be uncorrelated with the others, and therefore capable of separate study. How this has happened in the historical literature will be addressed below, but for the moment note that in the context of the VAR, the implication of this (Coulson, 1993) is that we should premultiply both sides of (1) by the orthogonalization matrix W, where:

$$
W = \begin{pmatrix}1&0&0&0\\ -1&1&0&0\\ -1&0&1&0\\ 1&-1&-1&1\end{pmatrix}
\tag{3}
$$

and we have

$$
W\begin{pmatrix}\Delta n\\ \Delta i\\ \Delta m\\ \Delta e\end{pmatrix}_{t}
=Wk+WA_1\begin{pmatrix}\Delta n\\ \Delta i\\ \Delta m\\ \Delta e\end{pmatrix}_{t-1}
+WA_0\begin{pmatrix}n\\ i\\ m\\ e\end{pmatrix}_{t-1}
+\begin{pmatrix}e_n\\ e_i\\ e_m\\ e_e\end{pmatrix}_{t}
\tag{4}
$$

where k is the vector representation of the intercept terms,

$$
u = W^{-1}e
\tag{5}
$$

and the components of e are orthogonal. Thus we can write

$$
\Omega = W^{-1}D(W')^{-1}.
\tag{6}
$$

The orthogonalization of the VAR is much the same as occurs in ordinary VARs, in that the orthogonalization matrix is triangular; however, given the nature of the homogeneity restrictions, the model is an overidentified structural (B-form) VAR (Coulson, 1993; Lutkepohl, 2005). The reasonableness of the structure, which is equivalent to testing the overidentifying restrictions, is also a test of the reasonableness of separately analyzing the components of the shift-share decomposition, as is typically the case, even today.
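A sketch of these mechanics (growth rates and variances invented for illustration): W maps the vector of growth rates into the four shift-share components of (2), and a diagonal matrix D of orthogonal structural variances maps into the reduced-form covariance as in (6):

```python
import numpy as np

# The orthogonalization matrix W of (3).
W = np.array([[ 1,  0,  0, 0],
              [-1,  1,  0, 0],
              [-1,  0,  1, 0],
              [ 1, -1, -1, 1]], dtype=float)

# Hypothetical growth rates (dn, di, dm, de): W recovers the four
# shift-share components of (2).
dx = np.array([0.05, 0.10, 0.07, 0.12])
assert np.allclose(W @ dx, [0.05, 0.05, 0.02, 0.0])

# Illustrative variances for the orthogonal structural shocks; the
# reduced-form covariance then follows (6) ...
D = np.diag([1.0, 0.5, 0.8, 0.3])
W_inv = np.linalg.inv(W)
Omega = W_inv @ D @ W_inv.T
# ... and the mapping inverts: W Omega W' recovers the diagonal D.
assert np.allclose(W @ Omega @ W.T, D)
```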

As it happens, models and modes of regional analysis that view shift-share through this lens very often make implicit (and sometimes explicit) assumptions about the nature of the long run behavior of the components, that is to say, about the form of the matrix A0. As is well known, the rank of A0 is critical to the time series representation of the vector of variables. If this rank is zero the variables are all integrated of order 1 (at least) and are not cointegrated; it happens that this is the explicit assumption of many previous models, as noted in Brown, Coulson and Engle (1991). It is for this reason that shift-share is regarded as a short run model. Long run considerations, as manifested in A0, are non-existent.

If the rank is positive but less than the full rank of four, there is cointegration among the variables. If the rank is full, then the variables are ostensibly stationary – that is, integrated of order zero (I(0)). It will be demonstrated later that this last possibility will not trouble us much, and so if A0 ≠ 0 then the proper question is how many cointegrating vectors exist within the system. Let the four components of the data vector be notated as xt. The essence of cointegration is that the number of variables in x is greater than the number of integrated processes characterizing their evolution; therefore the components of x are tied together in the long run. This is delivered by the fact that while each of the x variables is I(1), certain linear combinations are I(0). If those combinations are notated as β′x we can write:

$$
A_0 = \alpha\beta'
\tag{7}
$$

where β is the k × r matrix of the r cointegrating vectors, and α is the k × r matrix of adjustment speeds.2 As is well known, α and β are not separately identified (since for any nonsingular r × r matrix F, the two matrices α∗ = αF and β∗ defined by β∗′ = F−1β′ would be observationally equivalent to α and β).
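The lack of separate identification is easy to verify numerically; the sketch below assumes k = 4 and r = 3, with invented values for α and the βi:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical k = 4, r = 3 system: A0 = alpha beta' has rank 3.
alpha = rng.standard_normal((4, 3))                 # adjustment speeds, 4 x 3
beta = np.array([[-0.8, 1, 0, 0],                   # beta' rows normalized as
                 [-1.2, 0, 1, 0],                   # in (8); the beta_i values
                 [-0.9, 0, 0, 1]], dtype=float).T   # are invented; beta is 4 x 3
A0 = alpha @ beta.T
assert np.linalg.matrix_rank(A0) == 3

# Observational equivalence: for any nonsingular r x r matrix F, the rotated
# pair alpha* = alpha F, beta*' = F^{-1} beta' reproduces A0 exactly, so alpha
# and beta are not separately identified without normalizations such as (8).
F = rng.standard_normal((3, 3))
alpha_star = alpha @ F
beta_star_T = np.linalg.inv(F) @ beta.T
assert np.allclose(alpha_star @ beta_star_T, A0)
```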

The usual procedure is to specify restrictions on β, which are usually zero or normalization restrictions. To anticipate the implications of the long run shift-share model, we suppose that in our system of four variables we have three cointegrating vectors. A0 would therefore have rank 3, and therefore 15 free parameters. The matrix of adjustment speeds, α, is typically freely estimated, and therefore uses up 4 × 3 = 12 parameters, leaving β with three free parameters. Typically, then, the cointegrating vectors would be given, without loss of generality, as:

$$
\beta' = \begin{pmatrix}\beta_1&1&0&0\\ \beta_2&0&1&0\\ \beta_3&0&0&1\end{pmatrix}.
\tag{8}
$$

This is, of course, where the shift-share decomposition comes in. The second strand of models that deal with shift-share analysis have used the decomposition to identify, and overidentify, the matrix β. Accumulate and slightly rearrange the decomposition (2) to obtain the identity:

$$
(e_t - n_t) = (i_t - n_t) + (m_t - n_t) + (e_t - m_t - i_t + n_t).
\tag{9}
$$

The idea is that now each of the parenthetical terms represents a cointegrating vector; that while each of the data series is I(1), the differences displayed represent I(0) objects. Equally obvious is the fact that if any three of the parenthetic terms are I(0), the fourth one is as well, and so one can, indeed must, be omitted from the rows of β. In the standard formulation (8), this long run shift-share model would impose the further restrictions

2Note that we could write the levels term as α(β′xt). The parenthetic part is known as the error correction term, and is a measure of the distance of the x vector from its long run equilibrium. The α term is then described as the speed of adjustment and, as the name suggests, it is a measure of how fast the error correction term goes to its equilibrium value of zero.


β1 = β2 = β3 = −1. But clearly we could implement the alternative formulation:

$$
\beta' = \begin{pmatrix}-1&1&0&0\\ -1&0&1&0\\ 1&-1&-1&1\end{pmatrix}
\tag{10}
$$

which implies that the industry component, the total share, and the location quotient are all I(0). This form of β is attractive in that it is simply the last three rows of W.
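A quick numerical check (with invented levels for x) confirms both that this β′ is the last three rows of W and that, per the identity (9), a fourth row such as e − n would be redundant:

```python
import numpy as np

# The orthogonalization matrix W of (3).
W = np.array([[ 1,  0,  0, 0],
              [-1,  1,  0, 0],
              [-1,  0,  1, 0],
              [ 1, -1, -1, 1]], dtype=float)

# beta' of (10): industry component, total share, and (log) location quotient.
beta_T = np.array([[-1,  1,  0, 0],
                   [-1,  0,  1, 0],
                   [ 1, -1, -1, 1]], dtype=float)

# beta' is exactly the last three rows of W.
assert np.array_equal(beta_T, W[1:])

# Identity (9): for any x = (n, i, m, e), e - n equals the sum of the three
# rows of beta' applied to x, so a fourth row would be redundant. The levels
# below are invented for illustration.
x = np.array([4.1, 3.2, 2.7, 1.9])
assert np.isclose(x[3] - x[0], (beta_T @ x).sum())
```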

It should be noted that the existence of three cointegrating vectors among four variables implies that the entire system is driven in the long run by one shock. Given the implicit assumptions on causality that are inherent in the W matrix, that shock is the one to national employment. This seems somewhat implausible, so (as in the short run model) we can consider other parameterizations of this model as alternatives to the long run shift-share: models that assume some cointegration, but not “full cointegration” as implied by the long run shift-share model.

To summarize: the short run shift-share model implies (a) that rank(A0) = 0, so that the model is one of changes; and (b) an orthogonalization of those data series that involves homogeneity restrictions. The long run shift-share implies (a) rank(A0) = 3, and (b) similar homogeneity restrictions on the cointegrating matrix β.

We can now survey the historical development of the shift-share model as a series of restrictions on the above delineated types. It should not be assumed that the authors who are cited as developing these various models necessarily found evidence in their favor, only that they developed and used them for testing or forecasting purposes.

2.1. Dunn (1960): The total share model

Dunn (1960) views the shift-share model as a model of total regional employment rather than local sectoral employment. He proposes the following decomposition:3

Δm = Δn + (Δm − Δn)    (11)

With m − n as the share of the region in the national economy, the second term is, naturally enough, the shift in that share. Hence, the name. Because this needs to be distinguished from industry based shift-share, these are actually dubbed the "total" shift and share. Given the language in Dunn (1960), this model is viewed as one in which, other things equal, the region should grow at the same rate as the nation as a whole. Dunn clearly views the model as one of the short run, hence we would view the decomposition as a reduction of the orthogonalization scheme above, specifically W31 = −1 and W32 = 0.

The total share model does not operate at the level of the industry (either local or national). This is not at all the same thing as assuming that industry effects are nonexistent, merely that they are not part of the assumed structure. Thus the W matrix

3Actually, Dunn (1960), and much of the literature that follows, frames shift and share in terms of numbers of jobs gained or lost. Thus they would premultiply both sides of (2) by mt−1 (or later by et−1). In the interest of simplicity this modification is ignored.


2 A general model and some specializations 19

is written as:

W =
⎛ 1     0     0     0 ⎞
⎜ β21   1     0     0 ⎟
⎜ −1    0     1     0 ⎟
⎝ β41   β42   β43   1 ⎠

Also, there is no cointegration between m and n, and thus the total share is I(1). In the first order model above this implies that the share is a random walk.

2.2. Carlino and Mills (1993): Long run constant total share(stochastic convergence)

In direct contrast to Dunn's implicit assumption that the total share is a random walk, Carlino and Mills (1993) test the proposition that the total share m − n is I(0).4 Thus the share held by a particular region is constant in the long run. This is taken as evidence of stochastic convergence, the idea being that deviations from this share will not persist. The long run constant total share model is therefore manifested as the restriction rank(β) = 1, as there is only one long run restriction, and this row will be of the form (−1 0 1 0); that is, neither et nor it is expected to be part of the long run model.

2.3. H. Brown (1969): Sectoral shift-share

Brown (1969) introduced the three part shift-share model, which shifted focus from the total regional share to the industry share:

Δe = Δn + (Δi − Δn) + (Δe − Δi)    (12)

with attention focusing on the behavior of the final term, the regional shift, which is easily seen to be the change in the industry share (e/i) held by the region. The fact that these three terms were regarded as separately analyzed series is an implicit assumption that the decomposition is in fact an orthogonal one (Coulson, 1993). Noting that m plays no role in this decomposition, the W matrix is of the form:

W =
⎛  1     0     0    0 ⎞
⎜ −1     1     0    0 ⎟        (13)
⎜ w31    w32   1    0 ⎟
⎝  0    −1     0    1 ⎠

Once the three part decomposition is developed, the assumption of orthogonality becomes explicit, as modeling of the shift component e/i is now the focus of the research program. Not only that, but the short run assumption also becomes operational. In an attempt to frame shift-share as a forecasting tool, Brown (1969) postulated that the region's industry share was a random walk. This implies not only the orthogonalization suggested in (12) but also that there is no cointegration between e and i.

4Though perhaps with a structural break.



2.4. S. Brown, Coulson and Engle (1991): Constant share

This is the natural counterpart to the martingale share model, implying that the orthogonalization in (12) is appropriate at the long run rather than the short run time horizon; that is, that there is a single cointegrating vector in β, and that the row is of the form (0 1 0 −1). Brown, Coulson and Engle (1991) frame this model as a set of regional production functions with a fixed-in-location factor of production; technology shocks are national, and migration across regions equilibrates the share of production for each region in the long run as a function of the region's share in the fixed factor.

2.5. Sasaki (1963): The base multiplier model

One of the workhorse models of regional economics is the base multiplier model, which implies a relationship between (a set of) local industry employments (the basic sectors) and aggregate regional employment. This is not a relationship between e and m, per se, but between the sum of a subset of e's and m, and was first placed in a regression context by Sasaki (1963). Nevertheless, while a regression of m on e would yield a biased estimate of the multiplier (as the intercept term), if the base multiplier theory holds there should still be a unit elastic relationship between each sectoral employment and total regional employment in the short run.

2.6. Brown, Coulson and Engle (1992): Long run base multiplier

If the employment series are integrated, then Brown, Coulson and Engle (1992) demonstrate that the base multiplier model implies that e and m will be cointegrated, regardless of whether or not e is part of the basic sectors, if certain other conditions hold, to be discussed shortly. Thus there will be a row of β that can be rendered as (0 0 1 −1).

It is of interest to note that the combination of restrictions implied in models 2, 4 and 6 yields the long run shift-share model. The three rows of β discussed in those models are linearly independent and are equivalent to the matrix in equation (10). As a further demonstration of this, note that:

(et − mt) = (nt − mt) + (et − it) + (it − nt).

Thus the three models together imply that national industries are cointegrated with the national aggregate. This seems implausible on its face, as it would imply that technology shocks are identical across industries.
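The claimed equivalence between the rows of models 2, 4 and 6 and the matrix in equation (10) can be verified numerically. A sketch (numpy; variable ordering (n, i, m, e) assumed, with the model 6 row taken to be m − e):

```python
import numpy as np

# Rows implied by models 2, 4 and 6 under the ordering (n, i, m, e):
# m - n constant (Carlino-Mills), i - e constant (constant industry
# share), m - e constant (long run base multiplier).
rows_246 = np.array([
    [-1, 0, 1,  0],
    [ 0, 1, 0, -1],
    [ 0, 0, 1, -1],
], dtype=float)

# Rows of the long run shift-share matrix in equation (10).
rows_10 = np.array([
    [-1,  1,  0,  0],
    [-1,  0,  1,  0],
    [ 1, -1, -1,  1],
], dtype=float)

# Each set has full row rank, and stacking the two sets does not raise
# the rank, so they span the same three-dimensional cointegrating space.
assert np.linalg.matrix_rank(rows_246) == 3
assert np.linalg.matrix_rank(rows_10) == 3
assert np.linalg.matrix_rank(np.vstack([rows_246, rows_10])) == 3
```

Because both sets of rows sum to zero across variables, both lie in the same three-dimensional subspace; full row rank in each then forces the row spaces to coincide.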

Thus, one of the three long run models must be wrong. Our preliminary candidate is model (6), the long run base multiplier. The "certain other conditions" alluded to above have to do with the cointegration of the various sectors at the local level. Basically, if e is a basic sector, then it must be cointegrated with other basic sectors, again implying something like common technology shocks. If e is a local sector, it must be cointegrated with other local sectors, which presumably implies common demand shocks. At the local level this is slightly more plausible; nevertheless, the long run shift-share model does require a lot of the data.

To round out the model descriptions we reiterate the full models described in the beginning:



2.7. Coulson: The four part shift-share model

The three part decomposition/orthogonalization (model 3) is unsuitable particularly because it is difficult to interpret the components of the decomposition in a coherent manner. If e and i are employments in an export-oriented industry, then a change in the share plausibly represents supply shocks to that region-industry (at least relative to the national industry); but if there are regional demand shocks for the local output, then the shift term will conflate them with the supply shocks. As noted, the four part decomposition originated by Emmerson, Ramanathan and Ramm (1975) overcomes this problem by re-introducing the total shift and re-orthogonalizing (as particularly noted by Berzeg (1978)) with W as described in (3) above. Thus this four part model is basically a pair of hypotheses: (a) that there is no cointegration among the four variables in levels, and thus that the VAR should be specified in changes; (b) that the matrix W describes the appropriate orthogonalization.

2.8. Long run shift-share

The long run counterpart to Coulson (1993) is the long run shift-share, as previously discussed.5 There are three maintained hypotheses: that (a) the data are integrated; (b) there are three cointegrating vectors among the four variables; (c) (10) describes the cointegrating relationships.

3. Data and evidence

Data on full-time employment are drawn from the website of the US Bureau of Labor Statistics (www.bls.gov). Data are drawn from five different Metropolitan Statistical Areas (MSAs): Philadelphia, Dallas, Atlanta, Chicago, and Los Angeles. These example MSAs were chosen more or less at random, to represent a diversity of regions and economic bases. Not every industry is available in every MSA, so for purposes of comparability, we use the broad industry aggregates ("supersectors") of the North American Industry Classification System (NAICS), which are listed in the Tables. Comparable data are drawn from the US employment page for aggregate and supersector employment. The data are monthly and range from January 1990 through August 2006. The start date reflects the availability of consistently measured city-industry data.6

Our first task is to determine the order of integration of the series in question. All of the models above implicitly assume that the series are indeed integrated. Table 2.1 presents augmented Dickey–Fuller tests for each of the series. The Dickey–Fuller test is a test for stationarity, regressing the change in a variable on the lagged level (i.e. a univariate version of the final equation in the VAR, equation (1)). Rejection of the null hypothesis indicates that the series in question is I(0).

5There are several other variants on the above models, but these are omitted from the present survey. Brown (1969) argues that the shift itself is a random walk, which would indicate that the employment series are I(2). Test results (not included) indicate no evidence of this. Theil and Ghosh (1980) model the decomposition in effect as a two-way ANOVA model, where the interaction term, i.e. the location quotient, plays no role.

6The conversion of BLS industry classifications from Standard Industrial Classification (SIC) to NAICS occurred in 1997. The BLS could only reliably backcast the MSA industry-level data to 1990, and neither has it recreated SIC data after this change. The lack of long run information in these time series may cause the lack of specificity in the VAR results.

Table 2.1. Unit root tests

                                       US     Philadelphia   Dallas   Atlanta   Chicago   Los Angeles
Total                                −1.88      −2.69        −3.92∗    −2.86     −3.01      −3.74∗
Construction                         −3.79∗     −3.20        −3.37     −3.01     −1.86      −3.48∗
Durable Manufacturing                −1.60      −2.81        −1.53     −0.59     −2.75      −3.49∗
Nondurable Manufacturing             −1.35      −2.14        −2.51     −0.78     −1.60      −2.06
Trade, Transportation and Utilities  −1.55      −3.42        −3.34     −2.51     −2.91      −2.91
Finance                              −3.45∗     −2.52        −2.66     −1.05     −2.97      −2.09
Information                          −1.39      −0.23         0.10      0.61     −0.57      −1.65
Professional and Business Services   −2.10      −3.19        −3.16     −2.17     −2.66      −2.16
Education and Health Services        −2.14      −2.07        −2.39     −1.46     −3.22      −3.34
Leisure and Hospitality Services     −1.00      −2.40        −1.55     −1.58     −1.97      −2.24
Other Services                       −0.79      −3.22        −0.55     −1.82     −2.61      −2.27
Government                           −1.69      −1.81        −2.47     −2.97     −0.59      −3.04

The table entries are the t-values from an Augmented Dickey–Fuller test for unit roots in the indicated series. Asterisks indicate a test-statistic with a prob-value between 1 and 5% for rejecting the null hypothesis that a unit root exists, against the I(0) alternative. The Dickey–Fuller regressions contain an intercept, a time trend, and lags of the first difference as selected by the Schwarz information criterion.

As can be seen, the test-statistics are almost invariably above (i.e. closer to zero than) the 5% critical value. Of the 72 series listed in Table 2.1, four have test-values less than the 5% critical value, about what would be expected if the null were universally true and the tests were independent (which of course they are not). The general conclusion is therefore that the series are indeed integrated.
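The regression just described can be sketched in a few lines of numpy. This is an illustrative implementation on simulated data, not the estimator applied to the BLS series; the critical value quoted is the usual large-sample 5% Dickey–Fuller value for the intercept-and-trend case:

```python
import numpy as np

def adf_t_stat(y, lags=1):
    """t-statistic on the lagged level in the augmented Dickey-Fuller
    regression: dy_t = a + b*t + rho*y_{t-1} + c_1*dy_{t-1} + ... + e_t."""
    dy = np.diff(y)
    T = len(dy) - lags
    X = [np.ones(T), np.arange(T), y[lags:-1]]    # intercept, trend, lagged level
    for j in range(1, lags + 1):
        X.append(dy[lags - j:-j])                 # lagged first differences
    X = np.column_stack(X)
    z = dy[lags:]
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    s2 = resid @ resid / (T - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[2, 2])
    return beta[2] / se                           # t-ratio on rho

rng = np.random.default_rng(0)
shocks = rng.standard_normal(600)
random_walk = np.cumsum(shocks)                   # I(1): should not reject
ar1 = np.zeros(600)
for t in range(1, 600):
    ar1[t] = 0.5 * ar1[t - 1] + shocks[t]         # I(0): should reject

# The 5% critical value with intercept and trend is roughly -3.41.
assert adf_t_stat(ar1) < -3.41                    # stationary: unit root rejected
assert adf_t_stat(random_walk) > adf_t_stat(ar1)  # random walk: far less negative
```

The stationary AR(1) produces a strongly negative t-ratio on the lagged level (its coefficient estimates ρ ≈ φ − 1 = −0.5), while the random walk's statistic stays near the nonrejection region, mirroring the pattern in Table 2.1.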

This paves the way for Table 2.2, which tests the extent to which the four series are cointegrated with each other. The unrestricted VAR (equation (1)) with four lags is estimated using each city-industry and the three more aggregated series that correspond to it.7 Trace tests (Johansen, 1995) are then performed sequentially to reject or fail to reject whether the rank of the matrix A0 is zero, one, or two. That is, zero rank is tested, and if rejected, a rank of one is tested, and if that is rejected a rank of two is tested. Given that a rank of four is only possible if the data are I(0), testing ceases if a rank of two is rejected in favor of a rank of three. Recall that the long run shift-share hypothesis is that the rank of the matrix is three.

7Equation (1) contains intercept terms. The VARs are estimated under the assumption that part of this intercept is "inside" the cointegrating relation and part is "outside" (which are not separately identified). The first part then corresponds (under the homogeneity assumption at least) with the proportionalities which exist across different levels of aggregation, and the second is congruent with the assumption that the employment levels have deterministic trends. The models are estimated using Eviews, which identifies the first part by assuming that the "inside" part is zero during estimation, and then regressing the error correction term on an intercept. The difference between that intercept and the estimated constant becomes the trend term.

Table 2.2. Trace tests of the long run shift-share

                                      Philadelphia   Dallas   Atlanta   Chicago   Los Angeles
Construction                               3           3         3         3          2
Durable Manufacturing                      1           2         3         3          2
Nondurable Manufacturing                   3           2         1         2          2
Trade, Transportation and Utilities        2           1         0         1          3
Finance, Insurance and Real Estate         1           1         2         1          2
Information Services                       1           1         1         1          3
Professional and Business Services         2           2         2         1          2
Education and Health Services              3           2         2         3          3
Leisure and Hospitality Services           2           2         2         1          3
Other Services                             1           1         2         1          3
Government                                 3           3         2         1          3

The table entries are the number of cointegrating vectors in a four equation system consisting of the logs of national employment, total city employment, total industry employment and city-industry employment for the indicated row and column. Sequential trace tests were employed to reject (or fail to reject) ranks in the A0 matrix of zero, one, and two at 5% critical values.
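The sequential procedure can be sketched as a simple decision rule. The statistics below are made-up illustrations; the critical values approximate the 5% Johansen trace values for the successive null hypotheses in a system of this size:

```python
def select_coint_rank(trace_stats, crit_vals):
    """Sequential trace-test rank selection: test rank 0, 1, 2, ... in
    turn and stop at the first hypothesized rank whose trace statistic
    fails to exceed its critical value; rejecting every null implies
    full rank (i.e. I(0) data)."""
    for rank, (stat, cv) in enumerate(zip(trace_stats, crit_vals)):
        if stat < cv:
            return rank
    return len(trace_stats)

# Illustrative statistics: rank 0 and rank 1 rejected, rank 2 not
# rejected, so the procedure settles on two cointegrating vectors.
assert select_coint_rank([55.2, 31.0, 8.4], [47.9, 29.8, 15.5]) == 2
# If even the rank-0 null cannot be rejected, no cointegration is found.
assert select_coint_rank([40.1, 20.0, 8.4], [47.9, 29.8, 15.5]) == 0
```

Applied to each city-industry VAR, this rule produces exactly the entries reported in Table 2.2, with the long run shift-share hypothesis corresponding to a selected rank of three.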

Five points can be made from Table 2.2:

1. There is cointegration of some kind in almost all of the VARs. Only one combination, that associated with Atlanta's Trade, Transport and Utilities sector, failed to reject the null hypothesis rank(A0) = 0, and the prob-value of that test was 6.5%.

2. At the other extreme, there is only a modest amount of empirical support for the long run shift-share model, in that relatively few of the VARs exhibit three cointegrating vectors. This is to be expected given the discussion above.

3. Nevertheless, there are patterns to the number of cointegrating vectors. More cointegration (and more evidence of the long run shift-share model) is observable in the Construction and Government sectors and in the Los Angeles MSA. Other industries (Information Services, Finance, Other Services) and cities exhibit much less cointegration.

4. A question of importance is the extent to which the results from point 3 are influenced by the results from Table 2.1. For instance, Los Angeles has more cointegration than other cities, but it is also one of the two cities where the unit root null was rejected for its aggregate employment. On the other hand, Dallas' aggregate employment also was not found to have a unit root, and its VARs exhibit considerably less cointegration than those of Los Angeles. Similarly, aggregate employment in the US finance sector also appeared to be I(0), and yet across the MSAs, the finance sector's VARs exhibit much less cointegration than the construction sector, or indeed any sector. The bottom line is that very little about Table 2.3 could have been inferred a priori from the results in Table 2.1.

For comparability purposes, it was desirable that the number of lags (four) in the VAR be the same across the different models. Four lags is something of a compromise; one lag is clearly too short to provide for the dynamics in the data, but using 12 (as might be suggested by the use of monthly data) seems, according to a random inspection of information criteria, like overparameterizing the model.

5. As the predominant finding is that these VARs exhibit one or two cointegrating relationships, it would be prudent to use the bivariate cointegrating models 2, 4, and 6 to seek a better understanding. Tables 2.3 and 2.4 pursue this course.

The first row of Table 2.3 examines the cointegration (or lack thereof) between aggregate city employment and national employment. The table entries are the trace test-statistic for cointegration in the indicated two-variable VAR. An asterisk indicates rejection at the 5% level of the null hypothesis that there is no cointegration between aggregate city employment and aggregate US employment. Nonrejection is taken as evidence in favor of Dunn's total share model (Model 1), and this is the case for Dallas and Chicago. For the other three cities, the trace test indicates that there is cointegration between city and national employment. This is partial evidence in favor of the stochastic convergence, long run constant share model of Carlino and Mills (Model 2), but that model also requires that the cointegrating vector have unit coefficients. Thus, for those entries that reject the null hypothesis, a notation of =1 indicates that 1 is in the 95% confidence interval for the un-normalized coefficient in the cointegrating regression. This result is congruent with Model 2. An indication of ≠1 indicates otherwise. The Carlino–Mills model appears to hold for Los Angeles, but not for Atlanta and Philadelphia.

Rows in Table 2.3 beyond the first are analogous results for city-industry employment and its national counterpart. Lack of cointegration (no asterisk) is taken as evidence in favor of Model 3, H. Brown's presentation of the shift-share model, whereas cointegration with unit coefficients is evidence for Model 4, S. Brown, Coulson and Engle's constant industry share model. What conclusions can be drawn from these rows of Table 2.3? The industry level results have a broad range of interpretations. At one extreme is the Education and Health Services sector, in which all five city employments are cointegrated with national employment, and four of those are statistically indistinguishable from the constant share model. An interpretation of this is of course that permanent shocks, i.e. productivity shocks, occur at the national level and percolate immediately to each local industry, and local productivity shocks are unimportant. At the other extreme is the Information sector, where no local sector is cointegrated with the aggregate. An interpretation of this result is that productivity shocks are completely local; there is no national trend to tie local sectors to the broader one. Although a few industries display results that are to an extent like those of the Information sector (e.g. Nondurable Manufacturing), the most common outcome is a mixture of noncointegration and cointegration with a nonunit coefficient. For example, Professional and Business Services exhibits two cities with no cointegration and three with nonunit cointegration. Aside from the difficulties of interpreting nonunit cointegration (as a partial adoption of national technology shocks?), the variety of responses makes it supremely difficult to draw general conclusions.

Returning to the aggregate level, we see that only Los Angeles fails to reject the homogeneity requirement for the constant long run share model, whereas Dallas and Chicago, at the other extreme, are not cointegrated at all with national employment. Again the absence of similar results across cities makes generalities impossible. But even so, puzzles arise. For example, Los Angeles conforms to the long run total share model even though none of the component industries do.

Table 2.3. Trace tests of the constant share model

                                      Philadelphia   Dallas      Atlanta     Chicago      Los Angeles
Total                                 24.39∗ ≠1     14.73       23.19∗ ≠1   13.76        42.12∗ =1
Construction                          57.24∗ ≠1     18.40∗ ≠1   13.58       87.28∗ ≠1    25.20∗ ≠1
Durable Manufacturing                  7.27          8.51        9.98       15.53∗ ≠1    30.88∗ ≠1
Nondurable Manufacturing              13.29          5.87        5.01       22.69∗ ≠1     7.79
Trade, Transportation and Utilities   29.95∗ ≠1     21.28∗ ≠1   22.10∗ ≠1   12.31        13.45
Finance, Insurance and Real Estate    10.61          6.39       11.34       18.39∗ ≠1    23.07∗ ≠1
Information Services                   2.87         11.98       12.81        7.85        12.90
Professional and Business Services    11.22         39.92∗ ≠1   15.74∗ ≠1   26.85∗ ≠1    14.23
Education and Health Services         35.34∗ ≠1     19.69∗ =1   19.78∗ =1   20.87∗ =1    28.66∗ =1
Leisure and Hospitality Services      35.43∗ =1     62.67∗ ≠1   61.21∗ ≠1  120.41∗ ≠1    13.43
Other Services                        19.31∗ =1     32.32∗ ≠1   12.81       33.56∗ ≠1    24.31∗ ≠1
Government                            27.15∗ ≠1     25.51∗ ≠1   20.25∗ ≠1   12.00         8.57

With the exception of the first row, the table entries are the trace test-statistic for cointegration between the indicated city-industry and its national counterpart. An asterisk indicates significance (i.e. cointegration) at the 5% level. For each significant result, the notation =1 indicates that the cointegration coefficient contains 1 in its 95% confidence interval, ≠1 indicating the contrary. The first row is the corresponding statistic for aggregate employment.

Table 2.4. Trace tests of the multiplier model

                                      Philadelphia   Dallas      Atlanta     Chicago      Los Angeles
Construction                          30.54∗ ≠1     66.18∗ ≠1   31.19∗ ≠1   43.39∗ ≠1     5.44
Durable Manufacturing                  5.06          7.62        7.75        9.87        14.49
Nondurable Manufacturing               4.29         36.94∗ ≠1    4.17       20.69∗ ≠1     9.80
Trade, Transportation and Utilities   11.84          7.64        8.23       33.90∗ ≠1    39.20∗ ≠1
Finance, Insurance and Real Estate     6.24         14.66       10.09       12.77        45.5∗ ≠1
Information Services                   5.83          2.91        6.85        2.76        41.02∗ ≠1
Professional and Business Services     6.09         41.45∗ ≠1   30.44∗ ≠1   25.37∗ ≠1    48.02∗ ≠1
Education and Health Services         11.66          3.66        8.42       15.08        10.16
Leisure and Hospitality Services       4.39         11.92        9.16       14.12        19.13∗ ≠1
Other Services                         2.73         10.29       25.3∗ =1     3.51         8.56
Government                            31.83∗ =1     13.15        7.55       58.80∗ ≠1    14.92

The table entries are the trace test-statistic for cointegration between the indicated city-industry and its regional aggregate. An asterisk indicates significance (i.e. cointegration) at the 5% level. For each significant result, the notation =1 indicates that the normalized cointegration coefficient contains 1 in its 95% confidence interval, ≠1 indicating the contrary.

Table 2.4 provides tests of the Brown, Coulson and Engle model of long run base multipliers. The table entries are notated as before. There is very little evidence of cointegration (aside from Los Angeles, and the construction and business service sectors) and almost no evidence of unit responses (only two cases). This, as noted, is to be expected. Because the model of Brown, Coulson and Engle (1991), for example, assumes that the permanent components of employment series are due to productivity shocks, it is quite natural for there to be cointegration between local and national sectors in the same industry. It would be quite another matter for different industries in the same city to have such a correspondence. As Brown, Coulson and Engle (1992) note, it is possible for a single local industry series to be cointegrated with its metropolitan aggregate. The example discussed there concerned a single basic sector, which could be cointegrated with metropolitan employment if it was cointegrated with the other basic sectors. Such a scenario, as noted, is quite unlikely, as the productivity shocks are unlikely to be the same across industries. What is perhaps more likely is a second example (only indirectly discussed in Brown, Coulson and Engle (1992)) where a single local sector can be cointegrated with the aggregate if it is cointegrated with other local-serving industries. This is more plausible only in the sense that local-serving industries are largely in the service sector, and the dominant form of permanent shocks is perhaps more likely to be local demand shocks, and therefore common across sectors. By this reasoning it is perhaps sensible that the cointegration that does occur is in two sectors that are plausibly local-serving: construction and business services.

Obviously, given the mixture of results, neither the long run shift-share nor the short run shift-share fully describes the fluctuations of regional economies. In order to say more, the VAR itself must actually be estimated. We will perform four VARs for each city-industry:

• (A) The short run shift-share: The model is estimated in differences, and the orthogonalization (3) is imposed.

• (B) The short run VAR: The model is estimated in differences and only a causal ordering implied by W is imposed (i.e. without the homogeneity restrictions).

• (C) The intermediate model: Cointegration is assumed, with the number of cointegrating vectors as indicated by Table 2.2. Statistically, this is the "correct" model.

• (D) The long run shift-share: Three cointegrating relations are assumed and the homogeneity restrictions are added.8

The VARs are estimated using four lags as before, and compared using the 24-month forecast error variance decomposition. The results for the five sampled MSAs are in Tables 2.5 through 2.9. The variation of results across cities and industries is large.

8A fifth model was estimated, which provided for three cointegrating relations, but without imposing the homogeneity restriction of the long run shift-share. As might be expected, the results were intermediate between those of Models C and D.
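The 24-month forecast error variance decomposition used to compare these models can be sketched for a first-order system. This is a hypothetical two-variable VAR(1) with a recursive orthogonalization, standing in for the chapter's four-variable, four-lag models:

```python
import numpy as np

def fevd(A, P, horizon):
    """FEVD for x_t = A x_{t-1} + P u_t with orthonormal shocks u_t.
    Returns shares[h, i, j]: the fraction of variable i's (h+1)-step
    forecast error variance attributable to shock j."""
    theta = [P]                                   # MA coefficients A**s @ P
    for _ in range(1, horizon):
        theta.append(A @ theta[-1])
    sq = np.cumsum([th ** 2 for th in theta], axis=0)   # cumulated squared responses
    return sq / sq.sum(axis=2, keepdims=True)           # normalize to shares

# A recursive (causal-ordering) orthogonalization, as with the W matrix
# in the text: shock 1 may move both variables on impact, shock 2 only
# the second variable.  Coefficients here are illustrative.
A = np.array([[0.5, 0.0],
              [0.2, 0.3]])
P = np.array([[1.0, 0.0],
              [0.4, 0.8]])

shares = fevd(A, P, horizon=24)
assert np.allclose(shares.sum(axis=2), 1.0)       # shares sum to one at each horizon
assert np.allclose(shares[:, 0, 1], 0.0)          # variable 1 never loads on shock 2
```

The zero share of the second shock in the first variable's variance is the FEVD analogue of the causal ordering: under the recursive scheme, the "higher" aggregate is driven only by shocks above it in the ordering.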


Table 2.5. Philadelphia VARs

         Model A:              Model B: short run     Model C: long run      Model D:
         short run             without shift-share    without shift-share    long run
         shift-share           restrictions           restrictions           shift-share
Sector    n     i     m     e    n     i     m     e    n     i     m     e    n     i     m     e
C        17.6  31.5  17.7  33.2  11.7  35.9   9.2  63.3  11.0   8.1  27.6  53.4  10.6   9.9  20.3  59.3
DM       39.8  21.0  20.1  19.0  13.0   9.5  26.6  50.9  21.2  61.3   9.8   7.8  60.3  20.9   1.7  17.1
NM       28.9  22.9  21.8  26.4   3.5   3.6  35.5  57.4   1.5  13.5  28.1  56.9   1.4   9.4  20.5  68.8
TU       24.2  23.9  26.7  25.2  13.3  29.1  10.8  75.0   2.9  23.9  72.4  13.2  30.8   8.8  48.2  12.2
F        28.8  25.5  22.0  23.7   3.8  26.6   8.0  61.6   8.8  22.0  20.3  48.8  15.2   9.2  22.1  53.5
IS       30.7  25.7  21.4  22.2   4.6   1.9  57.3  36.2  17.3  50.8   2.2  29.7  31.8  37.2   4.2  26.8
PS       26.4  24.3  26.4  22.9   1.4   0.7  58.0  39.9  15.2  33.1   6.4  45.4  24.9  15.0  16.3  43.8
ES       26.5  18.9  36.8  17.8   1.6   0.7  56.1  41.6   4.8   2.3  36.4  56.6  15.3   0.5  36.0  48.2
LS       22.1  23.5  25.0  29.4   0.6   2.2  47.4  49.8   5.3   7.5  39.7  47.5   4.8  10.4  20.4  64.4
OS       24.6  26.5  24.1  24.8   2.4   1.1  31.7  64.8   8.1   5.1   8.1  78.6   1.5   1.5   7.1  89.9
G        28.7  23.7  26.9  20.7   0.7   2.9  23.9  72.4   4.3   3.7  32.2  59.7  10.1   3.4  29.7  56.7

The table entries are the percentage of the 24-month forecast error variance of local employment that can be ascribed to the indicated shock. Sector abbreviations given in the first column correspond to sectors listed in the first columns of Tables 2.1–2.4.


Table 2.6. Dallas VARs

         Model A:              Model B: short run     Model C: long run      Model D:
         short run             without shift-share    without shift-share    long run
         shift-share           restrictions           restrictions           shift-share
Sector    n     i     m     e    n     i     m     e    n     i     m     e    n     i     m     e
C        53.0  20.6   5.7  20.7   8.5   1.5  31.7  62.4  63.1   1.2  20.2  15.5  42.7   9.9  21.9  25.5
DM       72.3  19.6   3.2   4.9  22.7   6.8  10.8  59.7  56.3  33.9   0.7   9.1  58.0  25.8   2.7  13.6
NM       54.4  31.1   1.5  13.0   8.0   4.3  52.5  35.3  73.6  12.4   9.6   4.3  75.4   8.1   9.8   6.7
TU       60.5   7.6  15.5  16.4   3.7   2.4  54.9  39.0  49.5   3.4  44.5   2.6  55.7   4.9  28.2  11.2
F        69.5  24.1   2.3   4.1   6.7   6.8  22.0  64.5  32.1  19.5   2.5  45.9  26.2  17.4  23.1  33.2
IS       63.8  18.5   7.7  10.0  21.3   4.5   4.8  69.4  59.0   6.9   1.3  32.8  58.0   2.8   1.4  37.8
PS        9.7  68.9   9.8  11.6   5.3   4.0  59.8  30.9  22.8  59.3   9.7   8.2  40.5  28.7  17.6  13.2
ES       12.2  48.3   7.3  32.2   1.2   1.6  55.4  41.9   1.2   0.8  10.4  87.6   1.3  13.8  32.3  52.5
LS       64.5  14.9   6.7  13.9   1.0   1.3  36.2  61.4  45.5   7.5  18.4  28.6  34.5  12.0  16.9  36.6
OS       66.1  18.9   5.0  10.0   3.5   1.2  20.0  75.3  38.4   4.4   7.6  49.6  37.7   8.0   6.7  47.5
G        52.9  22.3  10.3  14.5   0.7   1.7  14.3  83.4  19.6   8.2  24.2  48.1  12.5   7.4  22.4  57.7

The table entries are the percentage of the 24-month forecast error variance of local employment that can be ascribed to the indicated shock.


Table 2.7. Atlanta VARs

         Model A:              Model B: short run     Model C: long run      Model D:
         short run             without shift-share    without shift-share    long run
         shift-share           restrictions           restrictions           shift-share
Sector    n     i     m     e    n     i     m     e    n     i     m     e    n     i     m     e
C        27.7  25.9  22.7  23.8   6.7   3.7  39.2  50.4  59.1   5.1   7.3  28.6  27.0  14.3  29.2  29.5
DM       27.1  25.5  22.5  24.8   5.1   4.0  14.5  76.4  30.8  24.8  13.1  31.2  44.4   5.4   3.2  47.1
NM       25.4  27.2  22.6  24.8   2.1   3.0  22.5  72.3  15.5   1.6  33.0  49.9  49.9   9.4   9.6  31.2
TU       19.1  30.3  17.8  32.8   4.3   2.4  55.7  37.5   4.3   2.4  55.7  37.5  57.0   2.7  20.4  19.9
F        27.9  24.8  23.9  23.4   2.5   5.7  29.2  62.6  39.3   6.3   3.5  50.9  33.5   4.8  10.3  51.4
IS       35.7  25.2  17.7  21.4   8.7  10.7  14.2  66.4  31.1  38.4  22.2   8.3  40.1  34.5   5.6  19.8
PS       25.9  24.8  27.5  21.8   3.9   4.6  65.7  25.8  50.0  18.7  14.6  16.7  46.8  16.5   4.5  32.1
ES       22.8  22.5  28.4  26.3   1.9   0.9  35.8  61.4   6.5   5.3  14.9  73.3   0.9  19.5  28.8  50.8
LS       22.0  25.6  25.2  27.2   1.6   1.2  45.0  52.2  22.0  17.6  22.8  37.7  27.1  17.4  17.5  38.0
OS       26.8  23.9  25.0  24.3   3.7   0.3   6.1  89.9  28.4   1.1   2.6  67.9  16.9   3.2  13.3  66.6
G        30.9  19.2  29.0  27.9  23.7  27.7  20.7  77.3  10.2  18.6  19.2  51.9   1.4  14.0  30.2  54.4

The table entries are the percentage of the 24-month forecast error variance of local employment that can be ascribed to the indicated shock.


Table 2.8. Chicago VARs

         Model A:              Model B: short run     Model C: long run      Model D:
         short run             without shift-share    without shift-share    long run
         shift-share           restrictions           restrictions           shift-share
Sector    n     i     m     e    n     i     m     e    n     i     m     e    n     i     m     e
C        45.1   8.8  38.9  16.8  30.0  14.8  38.4  62.3  15.6   2.4  33.0  49.0  16.2   4.6  31.0  48.2
DM       36.4  24.9  19.6  19.1   9.7  11.3  32.8  46.3  25.0  17.4   2.2  55.4  17.4   2.2  55.4  57.2
NM       35.0  23.4  22.1  19.6   4.9   4.8  47.8  42.6  21.1  63.3   4.8  10.8  15.7  48.9   8.2  27.2
TU       17.7  30.0  18.4  34.0   2.6   2.0  51.7  43.7  47.6   1.4  27.8  23.1  46.4   1.3  32.2  20.1
F        28.0  24.3  22.8  25.0   2.0   1.9  30.1  66.1   3.4   2.2  40.4  54.0   7.5   1.3   6.7  84.6
IS       28.4  23.3  28.6  33.5  22.9  21.8  21.8  77.1  24.7  30.2   2.2  42.8  10.1  23.9   9.5  56.5
PS       21.3  29.6  18.3  30.9   3.4   2.4  56.7  37.6  18.7  45.5  16.9  18.9  51.4  17.0  13.9  17.7
ES       36.3  13.6  33.9  16.1   1.0   3.6  45.7  49.6  10.0  42.8  14.4  32.8   4.1  17.4  20.7  57.8
LS       13.5  31.5  14.3  40.7   1.0   1.2  43.1  54.7  23.8  10.2  34.4  31.6  21.0  10.8  33.9  34.3
OS       25.0  25.3  24.0  25.7   2.0   1.5  29.9  66.6  19.6  15.1  27.9  37.4  35.0   4.5  22.1  38.3
G        30.2  19.4  31.7  18.6   0.6   1.1  29.9  68.5   2.7   5.2  47.2  44.9   6.0   6.2  41.1  46.6

The table entries are the percentage of the 24-month forecast error variance of local employment that can be ascribed to the indicated shock.


Table 2.9. Los Angeles VARs

Sector | Model A: short run, shift-share | Model B: short run without shift-share restrictions | Model C: long run without shift-share restrictions | Model D: long run, shift-share

n i m e | n i m e | n i m e | n i m e

C 32.0 21.9 26.6 19.5 | 3.5 2.8 48.2 45.5 | 20.7 17.7 5.9 55.6 | 26.1 6.7 4.9 62.4
DM 36.6 23.1 22.3 18.0 | 8.1 6.6 45.6 39.7 | 17.6 43.3 5.9 33.3 | 29.3 46.8 3.5 20.5
NM 5.5 5.4 51.2 33.6 | 23.2 27.9 15.4 37.9 | 31.0 12.5 1.5 55.0 | 23.5 24.2 3.9 48.4
TU 22.2 27.3 22.5 28.0 | 0.8 2.2 66.9 30.0 | 30.9 0.2 58.5 10.4 | 24.1 2.2 55.4 18.3
F 25.1 25.3 26.3 23.4 | 2.4 3.1 40.2 54.3 | 12.9 5.5 13.6 68.0 | 6.0 20.4 14.4 59.2
IS 26.9 19.2 33.1 20.8 | 4.7 2.2 23.1 70.0 | 22.7 22.6 7.5 47.3 | 6.0 7.3 4.4 82.3
PS 27.4 24.2 27.3 21.1 | 3.5 2.4 54.4 39.7 | 35.0 6.0 8.3 50.6 | 33.0 6.0 10.0 51.1
ES 21.8 30.5 21.2 26.4 | 0.1 0.6 30.0 69.3 | 0.8 16.1 8.5 74.6 | 4.6 9.9 8.5 77.1
LS 25.2 24.5 24.9 25.3 | 0.3 1.9 53.1 44.7 | 9.8 0.4 9.2 80.5 | 16.7 19.4 22.7 41.2
OS 27.1 23.6 29.4 20.0 | 0.3 2.3 58.0 39.3 | 3.9 0.4 7.9 87.8 | 4.0 2.3 26.4 67.3
G 25.8 25.5 24.2 24.5 | 2.0 2.1 19.6 76.3 | 0.4 1.1 17.9 80.6 | 6.5 5.1 15.0 73.3

The table entries are the percentage of the 24-month forecast error variance of local employment that can be ascribed to the indicated shock.


The following stylized conclusions might be drawn, although every one of these has exceptions.

A good starting point is the comparison of Models A and B, both of which assume a lack of cointegration, but which differ in whether they impose the shift-share orthogonalization (A) or not (B). Not shown in the tables is that the overidentifying restrictions that impose the normalization are universally rejected. The shift-share model is not appropriate; in comparing Models A and B we see that the statistically preferred Model B assigns far more explanatory power, on average, to the local industry shock (and, less regularly, to the aggregate metro shock) than Model A. This is natural; what the shift-share model does is force local movements to follow movements in broader aggregates in a one-for-one manner, thus ignoring the role of local supply shocks. This can contradict the actual movements of local industries, and thus not imposing the short run constraints would seem to be preferable. Another way of looking at this is to note that in the first step of the variance decomposition in Model A, all four shocks are given equal weight (as per the structure of the matrix W), and the force of this persists even to the 24-step horizon.

When the statistically preferred number of cointegrating vectors is assumed to exist (Model C), the results are generally closer to the results in Model B than to those in Model A. Generally, though, Model C does assign more explanatory power to national and national-industry shocks than does B. This is to be expected given the previous bivariate results of Tables 2.3 and 2.4. Note that bivariate cointegration was far more common in the relationship between local industry and national industry than between local industry and the aggregate local economy. Thus, we would expect that when cointegration is allowed into the system, the impact of the nation and national industry would increase. By and large (but by no means universally) this result is confirmed.

As we move from Model C to Model D, recall that two modeling changes are made. First, the number of cointegrating vectors is forced to be three. This would not be expected to make much of a difference in the results, as the extra cointegrating coefficient would presumably be close to zero. The imposition of unit coefficients (especially when they would otherwise be zero) is therefore presumably of more importance. Note first of all (test statistics not shown) that these unitary restrictions are universally rejected by the data at any conventional level of significance. Second, although there are strong differences in the results, these results do not appear to have any systematic pattern. In particular, the share of the forecast error variance that is absorbed by the idiosyncratic shock does not show a systematic rise or fall when the long run shift-share restrictions are imposed. Thus, the imposition of the long run shift-share might be particularly dangerous, as there is little indication of the direction in which the bias from the model runs.

4. Summary and conclusions

A natural intersection of urban economics and time series econometrics is in the examination of urban fluctuations. In this chapter, the work of Robert Engle at this intersection is carried forward. The traditional models of metropolitan sectoral fluctuations investigated by Engle and others are shown to be special cases of a general four-dimensional VAR.


34 The long run shift-share

Many of the restrictions that the traditional models embody are shown to be largely rejected by the data in favor of models with greater parameterization. This would seem to be due, at least in the short run, to the fact that the traditional models try to track local sectoral fluctuations by using broader aggregates. This implicitly minimizes the role of local productivity shocks, which, according to the variance decomposition, turn out to be quite important. In the long run there is some connection between local sectoral movements and broader aggregates via cointegrating relationships, but the relationship is not homogeneous, and the imposition of shift-share type restrictions is not recommended even in the long run.


3

The Evolution of National and Regional Factors in US Housing Construction

James H. Stock and Mark W. Watson

1. Introduction

This chapter uses a dynamic factor model with time-varying volatility to study the dynamics of quarterly data on state-level building permits for new residential units from 1969–2007. In doing so, we draw on two traditions in empirical economics, both started by Rob Engle. The first tradition is the use of dynamic factor models to understand regional economic fluctuations. Engle and Watson (1981) estimated a dynamic factor model of sectoral wages in the Los Angeles area, with a single common factor designed to capture common regional movements in wages, and Engle, Lilien, and Watson (1985) estimated a related model applied to housing prices in San Diego. These papers, along with Engle (1978b) and Engle and Watson (1983), also showed how the Kalman filter could be used to obtain maximum likelihood estimates of the parameters of dynamic factor models in the time domain. The second tradition is modeling the time-varying volatility of economic time series, starting with the seminal work on ARCH of Engle (1982). That work, and the extraordinary literature that followed, demonstrated how time series models can be used to estimate time-varying variances, and how changes in those variances in turn can be linked to economic variables.

The dynamics of the US housing construction industry are of particular interest for both historical and contemporary reasons. From an historical perspective the issuance of building permits for new residential units has been strongly procyclical, moving closely

Acknowledgments: This research was funded in part by NSF grant SBR-0617811. We thank DongBeong Choi and the Survey Research Center at Princeton University for their help on this project and Jeff Russell and a referee for comments on an earlier draft. Data and replication files are available at http://www.princeton.edu/∼mwatson



Fig. 3.1. Four-quarter growth rate of GDP (dark line) and total US building permits in decimal units (upper panel) and in units of standard deviations (lower panel)

with overall growth in GDP but with much greater volatility. Figure 3.1 plots four-quarter growth rates of GDP and aggregate US building permits from 1960–2007. Like GDP growth and other macroeconomic aggregates, building permits were much more volatile in the first half of the sample period (1960–1985) than in the second half of the period (1986–2007). In fact, the median decline in the volatility of building permits is substantially greater than for other major macroeconomic aggregates. From a contemporary perspective, building permits have declined sharply recently, falling by approximately 30% nationally between 2006 and the end of our sample in 2007, and the contraction in housing construction is a key real side-effect of the decline in housing prices and the



Regions: Northeast, Southeast, Northcentral, Southwest, West

Fig. 3.2. Deviation of regional 30-year fixed mortgage rates from the national median, 1976–2007 (units are decimal points at an annual rate). Data Source: Freddie Mac Primary Mortgage Market Survey

turbulence in financial markets during late 2007 into 2008. Because building permit data are available by state, there is potentially useful information beyond that contained in the national aggregate plotted in Figure 3.1, but we are unaware of any systematic empirical analysis of state-level building permit data.

In this chapter, we build on Engle’s work and examine the coevolution of state-level building permits for residential units. Our broad aim is to provide new findings concerning the link between housing construction, as measured by building permits,1 and the decline in US macroeconomic volatility from the mid-1980s through the end of our sample in 2007, often called the Great Moderation. One hypothesis about the source of the Great Moderation in US economic activity is that developments in mortgage markets, such as the elimination of interest rate ceilings and the bundling of mortgages to diversify the risk of holding a mortgage, led to wider and less cyclically sensitive availability of housing credit. As can be seen in Figure 3.2, prior to the mid-1980s there were substantial regional differences in mortgage rates across the US; however, after approximately 1987 these differences disappeared, suggesting that what had been regional mortgage markets became a single national mortgage market. According to this hypothesis, these changes in financial markets reduced the cyclicality of mortgage credit, which in turn moderated the volatility of housing construction and thus of overall employment.

This chapter undertakes two specific tasks. The first task is to provide a new data set on state-level monthly building permits and to provide descriptive statistics about

1Somerville (2001), Goodman (1986) and Coulson (1999) discuss various aspects of the links between housing permits, starts, and completions.


these data. This data set was put into electronic form from paper records provided by the US Bureau of the Census. These data allow us to characterize both the comovements (spatial correlation) of permits across states and changes in volatility of state permits from 1969 to the present.

The second task is to characterize the changes over time in the volatility of building permits with an eye towards the Great Moderation. If financial market developments were an important source of the Great Moderation, one would expect that the volatility of building permits would exhibit a similar pattern across states, and especially that any common or national component of building permits would exhibit a decline in volatility consistent with the patterns documented in the literature on the Great Moderation. Said differently, finding a lack of a substantial common component in building permits along with substantial state-by-state differences in the evolution of volatility would suggest that national-level changes in housing markets, such as the secondary mortgage market, were not an important determinant of housing market volatility.

The model we use to characterize the common and idiosyncratic aspects of changes in state-level volatility is the dynamic factor model introduced by Geweke (1977), modified to allow for stochastic volatility in the factors and the idiosyncratic disturbances; we refer to this as the DFM-SV model. The filtered estimates of the state variables implied by the DFM-SV model can be computed by Markov Chain Monte Carlo (MCMC). The DFM-SV model is a multivariate extension of the univariate unobserved components-stochastic volatility model in Stock and Watson (2007a).

In the DFM-SV model, state-level building permits are a function of a single national factor and one of five regional factors, plus a state-specific component. Thus specification of the DFM-SV model requires determining which states belong in which region. One approach would be to adopt the Department of Commerce’s definition of US regions; however, that grouping of states was made for administrative reasons and, although the groupings involved some economic considerations, those considerations are now out of date. We therefore follow Abraham, Goetzmann, and Wachter (1994) (AGW) and Crone (2005) by estimating the regional composition using k-means cluster analysis. Our analysis differs from these previous analyses in three main respects. First, we are interested in state building permits, whereas AGW studied metropolitan housing prices, and Crone was interested in aggregate state-wide economic activity (measured by state coincident indexes from Crone and Clayton-Matthews (2005)). Second, we estimate the clusters after extracting a single national factor, whereas AGW estimated clusters using percentage changes in metropolitan housing price indexes and Crone estimated clusters using business cycle components of the state-level data, where in both cases a national factor was not extracted. Third, we examine the stability of these clusters before and after 1987.
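The clustering details are deferred, but the mechanics of grouping series by k-means can be sketched with a plain numpy implementation. The data, scaling, and number of clusters below are invented for illustration and are not the chapter's actual inputs:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means on the rows of X (one row per state's time series)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each row to its nearest center (squared Euclidean distance)
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # move each center to the mean of its members
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Toy panel: 10 "states", two well-separated latent regions, plus noise
rng = np.random.default_rng(2)
T = 60
region = np.repeat([0, 1], 5)
regional = 3.0 * rng.normal(size=(2, T))
X = regional[region] + 0.2 * rng.normal(size=(10, T))

labels = kmeans(X, k=2)
# states sharing a latent region should end up in the same cluster
print(len(set(labels[:5])) == 1 and len(set(labels[5:])) == 1)
```

In the chapter the clustering is applied after a national factor is extracted; here the toy series contain only a regional signal, which is the feature k-means picks up.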

The outline of the chapter is as follows. The state-level building permits data set is described in Section 2, along with initial descriptive statistics. The DFM-SV model is introduced in Section 3. Section 4 contains the empirical results, and Section 5 concludes.

2. The state building permits data set

This section first describes the state housing start data set, then presents some summary statistics and time series plots.


2.1. The data

The underlying raw data are monthly observations on residential housing units authorized by building permits by state, from 1969:1–2008:1. The data were obtained from the US Department of Commerce, Bureau of the Census, and are reported in the monthly news release “New Residential Construction (Building Permits, Housing Starts, and Housing Completions).” Data from 1988–present are available from the Bureau of the Census in electronic form.2 Data prior to 1988 are available in hard copy, which we obtained from the Bureau of the Census. These data were converted into electronic form by the Survey Research Center at Princeton University.

For the purpose of the building permits survey, a housing unit is defined as a new housing unit intended for occupancy and maintained by occupants, thereby excluding hotels, motels, group residential structures like college dorms, nursing homes, etc. Mobile homes typically do not require a building permit so they are not counted as authorized units.

Housing permit data are collected by a mail survey of selected permit-issuing places (municipalities, counties, etc.), where the sample of places includes all the largest permitting places and a random sample of the less active permitting places. In addition, in states with few permitting places, all permitting places are included in the sample. Currently the universe is approximately 20,000 permitting places, of which 9,000 are sampled, and the survey results are used to estimate total monthly state permits. The universe of permitting places has increased over time, from 13,000 at the beginning of the sample to 20,000 since 2004.3

The precision of the survey estimates varies from state to state, depending on coverage. As of January 2008, eight states have 100% coverage of permitting places so for these states there is no sampling error. In an additional 34 states, the sampling standard error in January 2008 was less than 5%. The states with the greatest sampling standard error are Missouri (17%), Wyoming (17%), Ohio (13%), and Nebraska (12%).

In some locations, housing construction does not require a permit, and any construction occurring in such a location is outside the universe of the survey. Currently more than 98% of the US population resides in permit-issuing areas. In some states, however, the fraction of the population residing in a permit-issuing area is substantially less; the states with the lowest percentages of population living within a permit-requiring area are Arkansas (60%), Mississippi (65%), and Alabama (68%). In January 2008, Arkansas had 100% of permitting places in the survey so there was no survey sampling error; however, the survey universe only covered 60% of Arkansas residents.4

The series analyzed in this chapter is total residential housing units authorized by building permits, which is the sum of authorized units in single-family and multiple-family dwellings, where each apartment or town house within a multi-unit dwelling is counted as a distinct unit.

2Monthly releases of building permits data and related documentation are provided at the Census Bureau Website, http://www.census.gov/const/www/newresconstindex.html

3The number of permit-issuing places in the universe sampled by date is: 1967–1971, 13,000; 1972–1977, 14,000; 1978–1983, 16,000; 1984–1993, 17,000; 1994–2003, 19,000; 2004–present, 20,000.

4Additional information about the survey and the design is available at http://www.census.gov/const/www/newresconstdoc.html#reliabilitybp and http://www.census.gov/const/www/C40/sample.html



Fig. 3.3. Quarterly building permits data for five representative states. Upper panel: not seasonally adjusted. Lower panel: seasonally adjusted using Census X12

The raw data are seasonally unadjusted and exhibit pronounced seasonality. Data for each state were seasonally adjusted using the X12 program available from the Bureau of the Census. Quarterly sums of the monthly data served as the basis for our analysis. The quarterly data are from 1969:I through 2007:IV.5

2.2. Summary statistics and plots

Quarterly data for five representative states, Ohio, Louisiana, Kansas, New Jersey, and Vermont, are plotted in Figure 3.3 (upper panel). Three features are evident in these plots. First, there is not a clear long run overall trend in the number of permits issued,

5The raw data are available at http://www.princeton.edu/∼mwatson


and for these states the number of permits issued in 2007 is not substantially different from the number issued in 1970. Second, the raw data are strongly seasonal, but the seasonality differs across states. Not surprisingly, the states with harsher winters (Ohio and Vermont) have stronger seasonal components than those with more moderate winters (Louisiana). Third, there is considerable volatility in these series over the several-year horizon (building permits are strongly procyclical).

The lower panel of Figure 3.3 presents the seasonally adjusted quarterly building permits data for the same five states. The comovements among these series can be seen more clearly in these seasonally adjusted data than in the nonseasonally adjusted data. For example, these states (except Vermont) exhibited a sharp slowdown in building activity in the early 1980s and a steady growth in permits through the 1990s.

Summary statistics for the seasonally adjusted building permits data for all 50 states are given in Table 3.1. The average quarterly number of building permits (first numeric column) differs by an order of magnitude across states. The average growth rate of building permits (second numeric column) is typically small in absolute value, and is negative for many states, especially in the northeast. The third and fourth numeric columns report the standard deviation of the four-quarter growth in building permits, defined as:

Δ4yit = yit − yi,t−4, (1)

where yit = ln(BPit), and BPit denotes the number of building permits in state i and time t. These standard deviations reveal first the great volatility in permits in all states, and second the marked decline in volatility in most states between the first and second half of the sample. In most states, the standard deviation fell by approximately one-half (variances fell by 75%) between the two subsamples.
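In code, definition (1) is simply a lag-4 difference of log permit counts; the numbers below are invented for the illustration:

```python
import numpy as np

# Hypothetical quarterly permit counts BP_it for one state (not Census data)
bp = np.array([1000.0, 1100.0, 1050.0, 1200.0, 1300.0, 1250.0, 1400.0, 1500.0])

y = np.log(bp)           # y_it = ln(BP_it)
d4y = y[4:] - y[:-4]     # four-quarter growth: Delta_4 y_it = y_it - y_i,t-4

print(len(d4y))          # four observations are lost to the lag
```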

The final three columns of Table 3.1 examine the persistence of building permits by reporting a 95% confidence interval, constructed by inverting the ADF tμ statistic (columns 5 and 6) and, in the final column, the DF-GLSμ t statistic, both computed using four lags in the quarterly data. The confidence intervals indicate that the largest AR root is near one, and all but three of the confidence intervals contain a unit root. The DF-GLSμ statistics paint a somewhat different picture, with 25 of the 50 statistics rejecting a unit root at the 5% significance level. Such differences are not uncommon using unit root statistics, however. Taken together, we interpret these confidence intervals and DF-GLSμ statistics as consistent with the observation suggested by Figure 3.3 that the series are highly persistent and plausibly can be modeled as containing a unit root. For the rest of the chapter we therefore focus on the growth rate of building permits, either the quarterly growth rate or (for comparability to the literature on the Great Moderation) the four-quarter growth rate Δ4yit defined in (1).
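The mechanics of an ADF-type unit root regression can be illustrated on simulated data. This is a generic textbook ADF t-statistic with an intercept and four lags, not the chapter's exact ADF tμ confidence-interval inversion or the DF-GLSμ procedure:

```python
import numpy as np

def adf_t(y, lags=4):
    """ADF t-statistic with intercept: regress dy_t on y_{t-1},
    lagged dy's, and a constant; return the t-stat on y_{t-1}."""
    dy = np.diff(y)
    rows, z = [], dy[lags:]
    for t in range(lags, len(dy)):
        # dy[t] = y[t+1] - y[t], so the lagged level is y[t]
        rows.append([y[t], *dy[t - lags:t][::-1], 1.0])
    X = np.array(rows)
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    s2 = resid @ resid / (len(z) - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[0, 0])
    return beta[0] / se

rng = np.random.default_rng(4)
rw = np.cumsum(rng.normal(size=300))         # unit root process
ar = np.zeros(300)
for t in range(1, 300):
    ar[t] = 0.5 * ar[t - 1] + rng.normal()   # stationary AR(1)

# -2.86 is roughly the 5% Dickey-Fuller critical value with a constant
print(round(adf_t(rw), 2), round(adf_t(ar), 2))
```

The stationary series produces a large negative statistic, while the random walk typically does not reject; this is the same logic behind the persistence evidence in Table 3.1.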

The four-quarter growth rates of building permits for each of the 50 states are plotted in Figure 3.4. Also shown (solid lines) are the median, 25%, and 75% percentiles of growth rates across states, computed quarter by quarter. The median growth rate captures the common features of the five states evident in Figure 3.3, including the sharp fall in permits (negative growth) in the early 1980s, the steady rise through the 1990s (small fluctuations around a positive average growth rate), and the sharp decline in permits at the end of the sample. This said, there is considerable dispersion of state-level growth


Table 3.1. Seasonally adjusted state building permits: summary statistics

State | Average quarterly permits | Average annual growth rate | Std. dev. of four-quarter growth rate: 1970–1987, 1988–2007 | 95% confidence interval for largest AR root: Lower, Upper | DF-GLSμ unit root statistic

CT 3463 −0.030 0.29 0.23 0.92 1.02 −0.42
MA 5962 −0.022 0.33 0.20 0.90 1.02 −1.42
MD 7780 −0.013 0.34 0.18 0.84 1.01 −2.42∗
ME 1313 0.022 0.34 0.20 0.88 1.02 −1.30
NH 1657 0.002 0.39 0.26 0.86 1.01 −2.06∗
NJ 8247 −0.010 0.33 0.28 0.83 1.01 −2.15∗
NY 11147 −0.002 0.33 0.17 0.90 1.02 −1.33
PA 10329 −0.005 0.29 0.15 0.86 1.01 −2.72∗∗
RI 953 −0.020 0.45 0.23 0.90 1.02 −2.04∗
CA 43210 −0.015 0.38 0.21 0.87 1.01 −2.70∗∗
ID 2047 0.061 0.45 0.21 0.90 1.02 −0.12
IN 7482 −0.004 0.35 0.15 0.89 1.02 −2.02∗
MI 11138 −0.028 0.38 0.19 0.91 1.02 −0.87
NV 5745 0.044 0.48 0.33 0.88 1.02 −0.73
OH 11619 −0.016 0.35 0.14 0.88 1.02 −1.37
OR 5259 0.009 0.38 0.22 0.86 1.01 −2.55∗
SD 863 0.032 0.46 0.30 0.90 1.02 −1.31
WA 9828 0.006 0.31 0.16 0.87 1.01 −2.76∗∗
WI 6918 0.002 0.31 0.15 0.90 1.02 −2.11∗
IA 2837 −0.001 0.38 0.19 0.93 1.02 −1.95∗
IL 12256 −0.011 0.45 0.16 0.88 1.02 −1.49
KA 2989 0.002 0.41 0.22 0.82 1.00 −3.30∗∗
MN 6937 −0.013 0.32 0.17 0.87 1.02 −1.95∗
MO 5870 −0.005 0.36 0.18 0.83 1.01 −2.17∗
ND 762 0.011 0.44 0.31 0.88 1.02 −2.18∗
NE 2025 0.004 0.37 0.22 0.89 1.02 −2.28∗
DE 1274 0.000 0.44 0.18 0.89 1.02 −1.94
FL 39213 −0.005 0.45 0.22 0.36 0.91 −4.61∗∗
GA 15586 0.016 0.36 0.17 0.90 1.02 −1.83
HA 2021 −0.018 0.39 0.36 0.90 1.02 −0.72
KY 3846 0.000 0.40 0.19 0.82 1.01 −2.50∗
MS 2449 0.019 0.42 0.21 0.87 1.01 −2.71∗∗
NC 13623 0.032 0.33 0.14 0.87 1.01 −1.04
SC 6520 0.025 0.31 0.16 0.90 1.02 −1.34
TN 7687 0.013 0.42 0.17 0.81 1.00 −2.88∗∗
VA 12674 −0.001 0.35 0.18 0.79 0.97 −3.52∗∗
VT 704 0.014 0.36 0.25 0.89 1.02 −0.77
WV 749 0.016 0.52 0.20 0.93 1.02 −1.05
AK 703 0.006 0.62 0.35 0.89 1.02 −1.97∗

cont.


Table 3.1. (Continued)

State | Average quarterly permits | Average annual growth rate | Std. dev. of four-quarter growth rate: 1970–1987, 1988–2007 | 95% confidence interval for largest AR root: Lower, Upper | DF-GLSμ unit root statistic

AL 4612 0.015 0.40 0.17 0.85 1.01 −2.99∗∗
AR 2405 0.015 0.40 0.22 0.85 1.01 −2.19∗
AZ 12274 0.016 0.46 0.22 0.86 1.01 −1.92
CO 8725 0.007 0.44 0.22 0.83 1.01 −2.70∗∗
LA 4461 0.005 0.42 0.20 0.87 1.02 −2.23∗
MT 690 0.027 0.46 0.29 0.92 1.02 −1.64
NM 2481 0.036 0.45 0.21 0.36 0.94 −1.25
OK 3637 0.000 0.46 0.21 0.88 1.02 −2.06∗
TX 30950 0.017 0.37 0.16 0.91 1.02 −1.89
UT 3891 0.032 0.40 0.21 0.86 1.01 −1.45
WY 548 0.045 0.42 0.31 0.92 1.02 −0.75

The units for the first numeric column are units permitted per quarter. The units for columns 2–4 are decimal annual growth rates. The 95% confidence interval for the largest autoregressive root in column 5 is computed by inverting the ADF tμ-statistic, computed using four lags. The final column reports the DF-GLSμ t-statistic, also computed using four lags. The DF-GLSμ t-statistic rejects the unit root at the ∗5% or ∗∗1% significance level. The full quarterly data set spans 1969Q1–2007Q4.

rates around the median, especially in the mid-1980s. Also clearly visible in Figure 3.4 is the greater volatility of the four-quarter growth rate of building permits in the first part of the sample than in the second.

2.3. Rolling standard deviations and correlations

Figure 3.4 shows a decline in volatility in the state-level building permit data and also substantial comovements across states. Here we provide initial, model-free measurements of these two features.

Volatility. Rolling standard deviations of the four-quarter growth rate of building permits for the 50 states (that is, the standard deviation of Δ4yit), computed using a centered 21-quarter window, are plotted in Figure 3.5; as in Figure 3.4, the dark lines are the median, 25%, and 75% percentiles. The median standard deviation clearly shows a sharp, almost discrete decline in state-level volatility that occurred in approximately 1984–1985, essentially the same date that has been identified as a break date for the Great Moderation. After 1985, however, the median volatility continued to decrease to a low of approximately 0.15 (decimal units for annual growth rates), although a sharp increase is evident at the end of the sample when it returned to the levels of the late 1980s (approximately 0.2). The magnitude of the overall decline in volatility is remarkable, from approximately 0.4 during the 1970s and 1980s to less than 0.2 on average during the 1990s and 2000s.
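The rolling volatility measure is straightforward to compute. A sketch with simulated growth rates whose volatility halves mid-sample (all numbers illustrative, not the permits data):

```python
import numpy as np

def rolling_std(x, window=21):
    """Centered rolling standard deviation; NaN where the window
    does not fit (the first and last (window - 1) / 2 points)."""
    h = window // 2
    out = np.full(len(x), np.nan)
    for t in range(h, len(x) - h):
        out[t] = np.std(x[t - h:t + h + 1])  # std over the centered window
    return out

rng = np.random.default_rng(0)
# Simulated four-quarter growth rates: volatility 0.4, then 0.2
x = np.concatenate([rng.normal(0, 0.4, 80), rng.normal(0, 0.2, 80)])
sd = rolling_std(x)
print(np.nanmean(sd[:70]) > np.nanmean(sd[90:]))  # early volatility exceeds late
```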

Spatial correlation. There are, of course, many statistics available for summarizing the comovements of two series, including cross correlations and spectral measures such



Fig. 3.4. Four-quarter growth rate of building permits for all 50 states. The dotted lines are the state-level time series; the median, 25%, and 75% percentiles of the 50 growth rates (quarter by quarter) are in solid lines

as coherence. In this application, a natural starting point is the correlation between the four-quarter growth rates of two state series, computed over a rolling window to allow for time variation. With a small number of series it is possible to display the N(N − 1)/2 pairs of cross-correlations, but this is not practical when N = 50. We therefore draw on the spatial correlation literature for a single summary time series that summarizes the possibly time-varying comovements among these 50 series. Specifically, we use a measure based on Moran’s I, applied to a centered 21-quarter rolling window.6 The modified Moran’s I used here is:

It = [Σ_{i=1}^{N} Σ_{j=1}^{i−1} cov(Δ4yit, Δ4yjt) / (N(N − 1)/2)] / [Σ_{i=1}^{N} var(Δ4yit) / N] (2)

where cov(Δ4yit, Δ4yjt) = (1/21) Σ_{s=t−10}^{t+10} (Δ4yis − Δ4ȳit)(Δ4yjs − Δ4ȳjt), var(Δ4yit) = (1/21) Σ_{s=t−10}^{t+10} (Δ4yis − Δ4ȳit)², Δ4ȳit = (1/21) Σ_{s=t−10}^{t+10} Δ4yis, and N = 50.
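Under these definitions, (2) is an average pairwise covariance scaled by an average variance, both computed within the rolling window. A minimal numpy sketch on a simulated growth-rate panel (the factor structure and numbers are made up):

```python
import numpy as np

def modified_moran_I(X, t, h=10):
    """Modified Moran's I at date t from a (T x N) panel X of
    four-quarter growth rates, using a centered (2h+1)-quarter window."""
    W = X[t - h:t + h + 1]            # the window, shape (21, N)
    Wc = W - W.mean(axis=0)           # demean each state within the window
    C = Wc.T @ Wc / W.shape[0]        # covariance matrix with 1/21 scaling
    N = X.shape[1]
    iu = np.triu_indices(N, k=1)
    avg_cov = C[iu].mean()            # average of the N(N-1)/2 pairwise covariances
    avg_var = np.diag(C).mean()       # average variance across states
    return avg_cov / avg_var

rng = np.random.default_rng(1)
T, N = 100, 50
common = rng.normal(size=(T, 1))                    # a "national" component
X = 0.8 * common + 0.6 * rng.normal(size=(T, N))    # plus idiosyncratic noise
I = modified_moran_I(X, t=50)
print(0 < I < 1)
```

With a strong common component the statistic is well above zero; it is bounded above by one because each state's own variance appears in the denominator.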

The time series It is plotted in Figure 3.6. For the first half of the sample, the spatial correlation was relatively large, approximately 0.5. Since 1985, however, the spatial correlation has been substantially smaller, often less than 0.2 except in the early 1990s

6Moran’s I is a weighted spatial correlation measure. Here we are interested in comovement over time across states.



Fig. 3.5. Rolling standard deviation (centered 21-quarter window) of the four-quarter growth rate of building permits for all 50 states (decimal values). The dotted lines are the state-level rolling standard deviations; the median, 25%, and 75% percentiles of the 50 rolling standard deviations (quarter by quarter) are in solid lines

and in the very recent collapse of the housing market. Aside from these two periods of national decline in housing construction, the spatial correlation in state building permits seems to have fallen at approximately the same time as did their volatility.

3. The DFM-SV model

This section lays out the dynamic factor model with stochastic volatility (the DFM-SV model), discusses the estimation of its parameters and the computation of the filtered estimates of the state variables, and describes the algorithm for grouping states into regions.

3.1. The dynamic factor model with stochastic volatility

We examine the possibility that state-level building permits have a national component, a regional component, and an idiosyncratic component. Specifically, we model log building permits (yit) as following the dynamic factor model,

yit = αi + λiFt + Σ_{j=1}^{NR} γijRjt + eit (3)



Fig. 3.6. Rolling average spatial correlation in the four-quarter growth of building permits across states as measured by the modified Moran’s I statistic It

where the national factor Ft and the NR regional factors Rjt follow random walks and the idiosyncratic disturbance eit follows an AR(1):

Ft = Ft−1 + ηt (4)

Rjt = Rjt−1 + υjt (5)

eit = ρieit−1 + εit. (6)

The disturbances ηt, υjt, and εit are independently distributed and have stochastic volatility:

ηt = ση,tζη,t (7)

υjt = συj ,tζυj ,t (8)

εit = σεi,tζεi,t (9)

ln σ2η,t = ln σ2η,t−1 + νη,t (10)

ln σ2υj,t = ln σ2υj,t−1 + νυj,t (11)

ln σ2εi,t = ln σ2εi,t−1 + νεi,t (12)


where ζt = (ζη,t, ζυ1,t, …, ζυNR,t, ζε1,t, …, ζεN,t)′ is i.i.d. N(0, I1+NR+N), νt = (νη,t, νυ1,t, …, νυNR,t, νε1,t, …, νεN,t)′ is i.i.d. N(0, φI1+NR+N), ζt and νt are independently distributed, and φ is a scalar parameter.

The factors are identified by restrictions on the factor loadings. The national factor enters all equations, so {λi} is unrestricted. The regional factors are restricted to load on only those variables in a region, so γij is nonzero if state i is in region j and is zero otherwise; the grouping of states into regions is described below. The scale of the factors is normalized by setting λ′λ/N = 1 and γ′jγj/NR,j = 1, where λ = (λ1, …, λN)′, γj = (γ1j, …, γNj)′, and NR,j is the number of states in region j.

The parameters of the model consist of {αi, λi, γij, ρi, φ}.7 In this subsection we discuss estimation of the parameters and states conditional on the grouping of states into regions. We then discuss the regional groupings.
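Equations (3)–(12) are straightforward to simulate, which helps build intuition for what the random-walk factors with stochastic volatility imply for the data. A minimal NumPy sketch; the loadings, AR coefficients, and dimensions below are illustrative only (although φ = 0.04 matches the value used in the chapter):

```python
# Minimal simulation of the DFM-SV equations (3)-(12): one national factor,
# NR regional factors, N states. Parameter values are illustrative, not estimates.
import numpy as np

rng = np.random.default_rng(1)
T, NR, N, phi = 200, 5, 50, 0.04
region = rng.integers(0, NR, size=N)      # hypothetical state-to-region assignment
lam = rng.normal(1.0, 0.3, N)             # loadings on the national factor
gam = rng.normal(1.0, 0.5, N)             # loadings on the own-region factor
rho = rng.uniform(0.0, 0.95, N)           # idiosyncratic AR(1) coefficients

# log variances follow random walks with innovation variance phi (eqs 10-12)
lsig = np.log(0.01) + np.cumsum(rng.normal(0, np.sqrt(phi), (T, 1 + NR + N)), axis=0)
sig = np.exp(0.5 * lsig)                  # instantaneous standard deviations

F = np.cumsum(sig[:, 0] * rng.standard_normal(T))                        # eq (4)
R = np.cumsum(sig[:, 1:1 + NR] * rng.standard_normal((T, NR)), axis=0)   # eq (5)

e = np.zeros((T, N))
eps = sig[:, 1 + NR:] * rng.standard_normal((T, N))
for t in range(1, T):
    e[t] = rho * e[t - 1] + eps[t]                                       # eq (6)

y = lam * F[:, None] + gam * R[:, region] + e     # eq (3) with alpha_i = 0
```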

3.2. Estimation and filtering

Estimation of fixed model coefficients. Estimation was carried out using a two-step process. In the first step, the parameters {αi, λi, γij, ρi}, i = 1, …, 50 were estimated by Gaussian maximum likelihood in a model in which the values of σ2η, σ2υj, and σ2εi are allowed to break midway through the sample (1987:IV). The pre- and post-break values of the variances are modeled as unknown constants. This approximation greatly simplifies the likelihood by eliminating the need to integrate out the stochastic volatility. The likelihood is maximized using the EM algorithm described in Engle and Watson (1983). The scale parameter φ (defined below equation (12)) was set equal to 0.04, a value that we have used previously for univariate models (Stock and Watson, 2007a).

Filtering. Conditioning on the values of {αi, λi, γij, ρi, φ}, smoothed estimates of the factors and variances, E(Ft, Rjt, σ2η,t, σ2υj,t, σ2εi,t | {yiτ, i = 1, …, 50, τ = 1, …, T}), were computed using Gibbs sampling. Draws of the factors {Ft, Rjt} conditional on the data {yit} and the variances {σ2η,t, σ2υj,t, σ2εi,t} were generated from the relevant multivariate normal density using the algorithm in Carter and Kohn (1994). Draws of the variances {σ2η,t, σ2υj,t, σ2εi,t} conditional on the data and the factors were obtained using a normal mixture approximation to the distribution of the logarithm of a χ21 random variable (ln(ζ2)) and data augmentation, as described in Shephard (1994) and Kim, Shephard and Chib (1998) (we used a bivariate normal mixture approximation). The smoothed estimates and their standard deviations were approximated by sample averages from 20,000 Gibbs draws (after discarding 1,000 initial draws). Repeating the simulations using another set of 20,000 independent draws resulted in estimates essentially indistinguishable from the estimates obtained from the first set of draws.

3.3. Estimation of housing market regions

In the DFM-SV model, regional variation is independent of national variation, and any regional comovements would be most noticeable after removing the national factor Ft.

7 The model (3)–(6) has tightly parameterized dynamics. We also experimented with more loosely parameterized models that allow leads and lags of the factors to enter (3) and allow the factors to follow more general AR processes. The key empirical conclusions reported below were generally unaffected by these changes.


Accordingly, the housing market regions were estimated after removing a single common component associated with the national factor. Our method follows Abraham, Goetzmann, and Wachter (1994) and Crone (2005) by using k-means cluster analysis, except that we apply the k-means procedure after subtracting the contribution of the national factor.

Specifically, the first step in estimating the regions used the single-factor model,

yit = αi + λiFt + uit (13)

Ft = Ft−1 + ηt (14)

uit = ρi1uit−1 + ρi2uit−2 + εit, (15)

where (ηt, ε1t, …, εNt) are independently distributed normal variables with mean zero and constant variances. Note that in this specification, uit consists of the contribution of the regional factors as well as the idiosyncratic term; see (3). The model (13)–(15) was estimated by maximum likelihood, using as starting values least-squares estimates of the coefficients with the first principal component as an estimator of Ft (Stock and Watson, 2002a). After subtracting out the common component, this produced the residual ûit = yit − α̂i − λ̂iF̂t.
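The principal-components step that delivers the starting values and residuals can be sketched as follows; the data here are synthetic and the variable names hypothetical, but the mechanics (demean, take the first principal component via an SVD, subtract the common component) follow the standard approach:

```python
# Sketch of the starting-value step: estimate F_t by the first principal
# component of y_it and form the residual u_it. All data are synthetic.
import numpy as np

rng = np.random.default_rng(2)
T, N = 160, 50
F = np.cumsum(rng.normal(0, 0.1, T))                     # latent national factor
lam = rng.normal(1.0, 0.3, N)                            # loadings
y = 0.5 + lam * F[:, None] + rng.normal(0, 0.2, (T, N))  # y_it = a_i + lam_i F_t + u_it

alpha = y.mean(axis=0)
yc = y - alpha                                           # demean each state series
U, s, Vt = np.linalg.svd(yc, full_matrices=False)        # SVD of the T x N panel
Fhat = U[:, 0] * s[0]                  # first-PC estimate of F_t (up to scale/sign)
lamhat = Vt[0]                         # corresponding loading estimates
u = yc - np.outer(Fhat, lamhat)        # residual u_it after removing the common part
```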

The k-means method was then used to estimate the constituents of the clusters. In general, let {Xi}, i = 1, …, N be T-dimensional vectors and let μj be the mean vector of Xi if i is in cluster j. The k-means method solves,

min{μj,Sj} ∑j=1…k ∑i∈Sj (Xi − μj)′(Xi − μj) (16)

where Sj is the set of indexes contained in cluster j. That is, the k-means method is the least-squares solution to the problem of assigning entity i with data vector Xi to group j.8

We implemented the k-means cluster method using four-quarter changes in ûit, that is, with Xi = (Δ4ûi5, …, Δ4ûiT)′. In principle, (16) should be minimized over all possible index sets Sj. With 50 states and more than two clusters, however, this is computationally infeasible. We therefore used the following algorithm:

(i) An initial set of k clusters is assigned at random; call this S0.
(ii) The cluster sample means are computed for the grouping S0, yielding the k-vector of means, μ0.
(iii) The distance from each Xi to each element of μ0 is computed and each state i is reassigned to the cluster with the closest mean; call this grouping S1.
(iv) The k cluster means μ1 are computed for the grouping S1, and steps (iii) and (iv) are repeated until there are no switches or until the number of iterations reaches 100.

This algorithm was repeated for multiple random starting values.
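Steps (i)–(iv) with random restarts can be sketched directly; this is a generic implementation of the procedure described above, with an ad hoc choice (a random data point) for re-seeding any cluster that empties out, a detail the chapter does not specify:

```python
# k-means by alternating assignment and mean updates, with random restarts.
import numpy as np

def kmeans_once(X, k, rng):
    """One run of steps (i)-(iv) on the N x T data matrix X."""
    n = X.shape[0]
    labels = rng.integers(0, k, n)                     # (i) random initial grouping
    for _ in range(100):                               # iteration cap from step (iv)
        # (ii)/(iv) cluster means; an empty cluster gets a random point as its mean
        mu = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                       else X[rng.integers(n)] for j in range(k)])
        d = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
        new_labels = d.argmin(axis=1)                  # (iii) reassign to closest mean
        if np.array_equal(new_labels, labels):
            break                                      # no switches: converged
        labels = new_labels
    return d.min(axis=1).sum(), labels                 # objective (16) and grouping

def kmeans_restarts(X, k, n_starts, seed=0):
    """Repeat the algorithm from many random starting values; keep the best run."""
    rng = np.random.default_rng(seed)
    runs = [kmeans_once(X, k, rng) for _ in range(n_starts)]
    return min(runs, key=lambda run: run[0])

# tiny demonstration on two well-separated groups of synthetic "states"
demo = np.vstack([np.zeros((5, 3)), 10.0 * np.ones((5, 3))])
best_obj, best_labels = kmeans_restarts(demo, k=2, n_starts=20)
```

In the chapter the same logic is run from tens of thousands of random starting values, keeping the grouping with the smallest value of (16).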

8 In the context of the DFM under consideration, the model-consistent objective function would be to assign states to regions so as to maximize the likelihood of the DFM. This is numerically infeasible, however, as each choice of index sets would require estimation of the DFM parameters.


We undertook an initial cluster analysis to estimate the number of regions, in which the foregoing algorithm was used with 20,000 random starting values. Moving from two to three clusters reduced the value of the minimized objective function (16) by approximately 10%, as did moving from three to four clusters. The improvements from four to five, and from five to six, were less, and for six clusters the number of states was as few as five in one of the clusters. Absent a statistical theory for estimating the number of clusters, and lacking a persuasive reason for choosing six clusters, we therefore chose k = 5.

We then estimated the composition of these five regions using 400,000 random starting values. We found that even after 200,000 starting values there were some improvements in the objective function; however, those improvements were very small and the switches of states in regions involved were few. We then re-estimated the regions for the 1970–1987 and 1988–2007 subsamples, using 200,000 additional random starting values and using the full-sample regional estimates as an additional starting value.

4. Empirical results

4.1. Housing market regions

The resulting estimated regions for the full sample and subsamples are tabulated in Table 3.2 and are shown in Figure 3.7 (full sample), Figure 3.8 (1970–1987), and Figure 3.9 (1988–2007).

Perhaps the most striking feature of the full-sample estimates shown in Figure 3.7 is the extent to which the cluster algorithm, which did not impose contiguity, created largely contiguous regions, that is, regions in a traditional sense. Other than Vermont, the Northeast states comprise Region 1, and the Southeast states comprise Region 4,


Fig. 3.7. Estimated housing market regions, 1970–2007


Table 3.2. Estimated composition of housing market regions

State 1970–2007 1970–1987 1988–2007 State 1970–2007 1970–1987 1988–2007

CT 1 1 1   NE 3 3 3
MA 1 1 1   DE 4 4 2
MD 1 1 2   FL 4 4 2
ME 1 2 1   GA 4 4 4
NH 1 1 1   HA 4 1 3
NJ 1 1 1   KY 4 4 2
NY 1 1 1   MS 4 4 4
PA 1 1 1   NC 4 4 4
RI 1 1 1   SC 4 4 4
CA 2 2 2   TN 4 4 4
ID 2 3 5   VA 4 4 1
IN 2 2 3   VT 4 4 1
MI 2 2 2   WV 4 4 4
NV 2 5 2   AK 5 5 5
OH 2 2 3   AL 5 4 5
OR 2 2 2   AR 5 4 5
SD 2 3 3   AZ 5 5 4
WA 2 3 2   CO 5 5 5
WI 2 2 3   LA 5 5 4
IA 3 3 3   MT 5 3 5
IL 3 2 3   NM 5 5 5
KA 3 3 4   OK 5 5 5
MN 3 2 3   TX 5 5 5
MO 3 2 4   UT 5 3 5
ND 3 3 4   WY 5 5 4

Estimated using k-means cluster analysis after eliminating the effect of the national factor as described in Section 3.3.

excluding Alabama and including Vermont. Region 3 is the Upper Midwest, without South Dakota, and Region 5 consists of the Rocky Mountain and South Central states, plus Alabama, Arkansas and Louisiana. The only region which is geographically dispersed is Region 2, which consists of the entire West Coast but also South Dakota, and the rust belt states.

Figures 3.8 and 3.9 indicate that the general location of the regions was stable between the two subsamples, especially the New England and Rocky Mountain/South Central regions. Housing in Florida, Washington, and Nevada evidently behaved more like California in the second sample than in the first, in which they were in other clusters. It is difficult to assess the statistical significance of these changes, and without the guidance of formal tests we are left to our own judgment about whether the groupings appear to be stable. The fact that the objective function is essentially unaltered by some changes in the groupings suggests that there is considerable statistical uncertainty associated with the regional definitions, which in turn suggests that one would expect a fair amount of region switching in a subsample analysis even if the true (unknown population) regions



Fig. 3.8. Estimated housing market regions, 1970–1987

were stably defined. We therefore proceed using five regions with composition that is kept constant over the full sample.

4.2. Results for split-sample estimates of the dynamic factor model

Before estimating the DFM-SV model, we report results from estimation of the dynamic factor model with split-sample estimates of the disturbance variances. This model is given by (3)–(6), where ηt, υjt, and εit are i.i.d. normal. The purpose of this estimation is to


Fig. 3.9. Estimated housing market regions, 1988–2007


examine the stability of the factor loading coefficients and the disturbance variances over the two split subsamples, 1969–1987 and 1988–2007.

Accordingly, two sets of estimates were computed. First, the unrestricted split-sample estimates were produced by estimating the model separately by maximum likelihood on the two subsamples, 1969–1987 and 1988–2007. Second, restricted split-sample estimates were computed, where the factor loading coefficients λ and γ and the idiosyncratic autoregressive coefficients ρ were restricted to be constant over the entire sample period, and the variances {σ2η, σ2υj, σ2εi} were allowed to change between the two subsamples. This restricted split model has the effect of holding the coefficients of the mean dynamics constant but allows for changes in the variances and in the relative importance of the factors and idiosyncratic components.

The MLEs for the restricted split-sample model are reported in Table 3.3. The factor loadings are normalized so that λ′λ/N = 1 and γ′jγj/NR,j = 1. The loadings on the national factor are all positive and, for 44 states, are between 0.6 and 1.4. The states with the smallest loadings of the national factor are Hawaii (0.16), Wyoming (0.51), Rhode Island (0.55), and Alaska (0.57). There is considerably more spread on the loadings of the regional factors, and in fact four states have negative regional factor loadings: West Virginia (−1.30), South Carolina (−0.66), Georgia (−0.65), and Mississippi (−0.39). All the states with negative loadings are in Region 4, which suggests either a lack of homogeneity within that region or some intra-region flows in economic activity as these four states see declines in activity associated with gains in Florida and Virginia. The idiosyncratic disturbances exhibit considerable persistence, with a median AR(1) coefficient of 0.71.

The restricted split estimates allow only the disturbance variances to change between samples, and the results in Table 3.3 and Table 3.4 (which presents the restricted split-sample estimates of the standard deviations of the factor innovations) show that nearly all these disturbance variances fall, and none increase. The average change in the idiosyncratic disturbance innovation standard deviation is −0.07, the same as the change in the standard deviation of the national factor innovation. The change in the innovation standard deviations of the regional factors is less, typically −0.03.

Table 3.5 provides a decomposition of the variance of four-quarter growth in building permits, Δ4yit, between the two samples. Each column contains two estimates for the column entry, the first from the unrestricted split model and the second from the restricted split model. The first block of columns reports the fraction of the variance of Δ4yit explained by the national factor, regional factor, and idiosyncratic term for the first subsample, and the second block reports these statistics for the second subsample. The final block provides a decomposition of the change in variance of Δ4yit between the two subsamples, attributable to changes in the contributions of the national factor, regional factor, and idiosyncratic term.
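Because the national factor, regional factors, and idiosyncratic terms are mutually independent in the model, the variance of Δ4yit splits into three parts, and each R2 entry is that part's share of the total. A minimal numerical sketch of this bookkeeping (the loadings and variances below are invented, not estimates from the chapter):

```python
# Variance-decomposition arithmetic behind the Table 3.5 R2 columns.
import numpy as np

lam_i, gam_ij = 0.9, 1.4                  # hypothetical loadings for one state
var_F, var_R, var_e = 0.010, 0.002, 0.012 # hypothetical variances of the Delta4
                                          # components of F, R_j, and e_i

# Var(D4 y_it) = lam_i^2 Var(D4 F) + gam_ij^2 Var(D4 R_j) + Var(D4 e_i)
parts = np.array([lam_i**2 * var_F, gam_ij**2 * var_R, var_e])
var_y = parts.sum()
r2 = parts / var_y                        # the R2-F, R2-R, and R2-e columns
```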

Five features of this table are noteworthy. For now, consider the results based on the restricted model (the second of each pair of entries in Table 3.5).

First, in both samples most of the variance in Δ4yit is attributable to the idiosyncratic component, followed by a substantial contribution of the national factor, followed by a small contribution of the regional factor. For example, in the first sample, the mean partial R2 attributable to the national factor is 36%, to the regional factor is 10%, and to the state idiosyncratic disturbance is 54%. There is, however, considerable heterogeneity


Table 3.3. Maximum likelihood estimates, restricted split-sample estimation

Region λ γ ρ σε (69–87) σε (88–07)

CT 1 0.90 1.38 −0.04 0.09 0.07
MA 1 0.91 1.21 0.47 0.15 0.06
MD 1 0.78 0.70 0.79 0.13 0.11
ME 1 1.00 0.67 0.86 0.20 0.09
NH 1 1.16 1.04 0.78 0.23 0.11
NJ 1 1.08 1.13 0.64 0.12 0.10
NY 1 0.86 0.55 0.83 0.18 0.10
PA 1 0.74 0.60 0.76 0.13 0.07
RI 1 0.55 1.30 0.48 0.26 0.12
CA 2 1.02 0.45 0.97 0.12 0.08
ID 2 1.07 0.53 0.91 0.28 0.10
IN 2 1.02 1.01 0.42 0.13 0.08
MI 2 1.23 1.89 0.92 0.11 0.06
NV 2 1.31 0.11 0.84 0.22 0.19
OH 2 1.11 0.93 0.89 0.11 0.05
OR 2 0.69 1.10 0.84 0.16 0.13
SD 2 1.14 0.70 0.63 0.25 0.22
WA 2 0.68 0.68 0.79 0.13 0.10
WI 2 0.99 1.38 0.07 0.07 0.05
IA 3 1.23 1.58 −0.17 0.11 0.08
IL 3 1.55 1.03 0.90 0.14 0.06
KA 3 0.83 0.42 0.55 0.22 0.12
MN 3 1.24 0.73 0.91 0.17 0.08
MO 3 1.01 0.26 0.77 0.15 0.09
ND 3 1.02 1.32 0.63 0.25 0.23
NE 3 1.12 0.96 0.37 0.16 0.15
DE 4 1.09 1.00 0.68 0.29 0.11
FL 4 0.83 0.95 0.93 0.13 0.07
GA 4 1.21 −0.65 0.94 0.10 0.07
HA 4 0.16 1.11 0.71 0.32 0.26
KY 4 1.04 0.23 0.50 0.23 0.10
MS 4 0.92 −0.39 0.70 0.22 0.13
NC 4 1.07 0.24 0.91 0.15 0.06
SC 4 0.79 −0.66 0.83 0.12 0.08
TN 4 1.18 0.24 0.70 0.14 0.07
VA 4 1.13 1.56 −0.18 0.07 0.04
VT 4 0.90 1.88 0.71 0.30 0.15
WV 4 0.93 −1.30 0.29 0.39 0.12
AK 5 0.57 1.42 0.69 0.36 0.24
AL 5 1.02 0.37 0.40 0.20 0.10
AR 5 1.09 0.18 0.33 0.15 0.13
AZ 5 1.41 0.39 0.68 0.18 0.09

(cont.)


Table 3.3. (Continued)

Region λ γ ρ σε (69–87) σε (88–07)

CO 5 1.10 0.86 0.83 0.11 0.11
LA 5 0.90 1.34 0.06 0.11 0.11
MT 5 1.10 0.88 0.79 0.33 0.17
NM 5 1.00 0.38 0.63 0.23 0.12
OK 5 0.72 1.51 0.42 0.15 0.12
TX 5 0.76 1.29 0.96 0.08 0.06
UT 5 0.89 0.59 0.90 0.17 0.10
WY 5 0.51 1.39 0.83 0.31 0.19

Estimates are restricted split-sample MLEs of the dynamic factor model in Section 3.3, with innovation variances that are constant over each sample but differ between samples.

behind these averages; for example, in the first period the partial R2 attributable to the national factor ranges from 0% to 67%. The states with 5% or less of the variance explained by the national factor in both periods are Hawaii, Wyoming, and Alaska. The states with 45% or more of the variance explained by the national factor in both periods are Georgia, Wisconsin, Illinois, Arizona, Ohio, Tennessee, Virginia, and North Carolina.

Second, the importance of the national factor to state-level fluctuations falls from the first sample to the second: the median partial R2 in the first period is 0.37 and in the second period is 0.23. The contribution of the regional factor is approximately unchanged, and the contribution of the state-specific disturbance increases for most states.

Third, all states experienced a reduction in the volatility of Δ4yit, and for most states that reduction was large. The variance reductions ranged from 35% (Hawaii) to 88% (West Virginia), with a median reduction of 72%. This reduction in variance is, on average, attributable equally to a reduction in the volatility of the contribution of the national factor and a reduction in the volatility of the idiosyncratic disturbance; on average, the regional factor makes only a small contribution to the reduction in volatility.

Table 3.4. Restricted split-sample estimates of the standard deviation of factor shocks for the national and regional factors

1969–1987 1988–2007 Change

National Factor 0.12 0.05 −0.07
Region 1 0.06 0.05 −0.01
Region 2 0.06 0.03 −0.03
Region 3 0.09 0.03 −0.06
Region 4 0.03 0.03 0.00
Region 5 0.07 0.04 −0.03


Table 3.5. Variance decompositions for four-quarter growth in state building permits (Δ4yit) based on unrestricted and restricted split-sample estimation of the dynamic factor model, 1969–1987 and 1988–2007

1969–1987 1988–2007 Decomposition of (Var69−87 − Var88−07)/Var88−07

σ R2 − F R2 − R R2 − e σ R2 − F R2 − R R2 − e Total F R e

CT 1 0.28 0.29 0.63 0.53 0.03 0.27 0.34 0.20 0.22 0.20 0.46 0.23 0.15 0.51 0.39 0.27 −0.38 −0.55 −0.34 −0.43 0.06 −0.05 −0.10 −0.08
MA 1 0.30 0.35 0.61 0.39 0.07 0.15 0.33 0.45 0.18 0.18 0.52 0.27 0.15 0.45 0.33 0.28 −0.63 −0.72 −0.41 −0.32 −0.01 −0.03 −0.21 −0.37
MD 1 0.29 0.30 0.58 0.37 0.02 0.07 0.40 0.56 0.18 0.22 0.21 0.13 0.23 0.10 0.56 0.77 −0.60 −0.45 −0.49 −0.30 0.07 −0.01 −0.17 −0.14
ME 1 0.39 0.45 0.35 0.28 0.00 0.03 0.65 0.70 0.24 0.21 0.28 0.25 0.29 0.11 0.43 0.64 −0.63 −0.79 −0.25 −0.22 0.10 0.00 −0.49 −0.56
NH 1 0.40 0.51 0.58 0.29 0.01 0.05 0.40 0.66 0.24 0.25 0.34 0.23 0.01 0.17 0.65 0.60 −0.64 −0.75 −0.46 −0.24 −0.01 −0.01 −0.17 −0.51
NJ 1 0.28 0.35 0.71 0.55 0.01 0.13 0.28 0.32 0.24 0.24 0.33 0.23 0.03 0.24 0.64 0.54 −0.30 −0.54 −0.47 −0.44 0.01 −0.02 0.17 −0.07
NY 1 0.30 0.40 0.47 0.26 0.08 0.02 0.45 0.71 0.24 0.21 0.26 0.18 0.13 0.07 0.60 0.75 −0.38 −0.72 −0.31 −0.21 0.00 0.00 −0.08 −0.51
PA 1 0.30 0.30 0.43 0.34 0.01 0.05 0.55 0.61 0.14 0.16 0.40 0.23 0.16 0.15 0.44 0.62 −0.78 −0.72 −0.35 −0.28 0.02 −0.01 −0.46 −0.44
RI 1 0.42 0.45 0.23 0.08 0.00 0.10 0.77 0.81 0.23 0.24 0.23 0.06 0.22 0.30 0.54 0.64 −0.69 −0.71 −0.16 −0.07 0.07 −0.02 −0.60 −0.63
CA 2 0.34 0.35 0.69 0.49 0.04 0.02 0.27 0.49 0.19 0.20 0.30 0.29 0.01 0.02 0.69 0.69 −0.68 −0.68 −0.60 −0.39 −0.04 −0.02 −0.05 −0.27
ID 2 0.53 0.59 0.15 0.18 0.11 0.01 0.74 0.81 0.22 0.23 0.23 0.24 0.01 0.02 0.76 0.74 −0.83 −0.85 −0.11 −0.15 −0.11 −0.01 −0.61 −0.69
IN 2 0.34 0.34 0.59 0.51 0.10 0.13 0.31 0.36 0.17 0.17 0.46 0.40 0.06 0.13 0.48 0.47 −0.76 −0.75 −0.48 −0.41 −0.09 −0.10 −0.19 −0.24
MI 2 0.43 0.42 0.66 0.47 0.13 0.30 0.21 0.23 0.18 0.21 0.55 0.37 0.33 0.31 0.13 0.32 −0.82 −0.76 −0.56 −0.38 −0.07 −0.22 −0.19 −0.15
NV 2 0.45 0.51 0.30 0.38 0.01 0.00 0.69 0.62 0.29 0.37 0.18 0.14 0.00 0.00 0.82 0.86 −0.60 −0.47 −0.23 −0.31 −0.01 0.00 −0.36 −0.16
OH 2 0.36 0.35 0.60 0.56 0.11 0.10 0.29 0.34 0.13 0.16 0.51 0.51 0.10 0.12 0.39 0.37 −0.87 −0.79 −0.53 −0.45 −0.10 −0.08 −0.24 −0.26
OR 2 0.36 0.36 0.23 0.21 0.26 0.14 0.51 0.65 0.21 0.26 0.12 0.08 0.00 0.07 0.88 0.86 −0.64 −0.46 −0.19 −0.17 −0.26 −0.11 −0.19 −0.19
SD 2 0.46 0.50 0.18 0.30 0.11 0.03 0.72 0.67 0.32 0.40 0.12 0.09 0.03 0.01 0.85 0.90 −0.53 −0.37 −0.12 −0.24 −0.09 −0.02 −0.32 −0.11
WA 2 0.31 0.30 0.34 0.30 0.08 0.08 0.58 0.63 0.16 0.21 0.04 0.12 0.00 0.04 0.96 0.84 −0.75 −0.52 −0.33 −0.24 −0.08 −0.06 −0.34 −0.22
WI 2 0.31 0.31 0.65 0.59 0.21 0.31 0.14 0.10 0.15 0.15 0.41 0.46 0.08 0.31 0.51 0.24 −0.75 −0.75 −0.55 −0.48 −0.19 −0.23 −0.01 −0.04
IA 3 0.40 0.44 0.42 0.44 0.41 0.43 0.17 0.14 0.20 0.20 0.32 0.40 0.00 0.27 0.68 0.33 −0.74 −0.79 −0.34 −0.35 −0.41 −0.37 0.00 −0.07
IL 3 0.52 0.50 0.68 0.55 0.17 0.15 0.15 0.30 0.16 0.20 0.70 0.62 0.13 0.12 0.17 0.26 −0.91 −0.83 −0.62 −0.45 −0.16 −0.13 −0.13 −0.25
KA 3 0.40 0.41 0.30 0.23 0.05 0.03 0.65 0.74 0.20 0.22 0.21 0.16 0.01 0.02 0.78 0.83 −0.75 −0.73 −0.25 −0.18 −0.05 −0.03 −0.46 −0.51
MN 3 0.34 0.45 0.49 0.42 0.04 0.09 0.47 0.49 0.22 0.20 0.56 0.41 0.04 0.06 0.40 0.53 −0.59 −0.80 −0.26 −0.34 −0.02 −0.07 −0.31 −0.39
MO 3 0.40 0.37 0.76 0.43 0.07 0.02 0.18 0.56 0.16 0.19 0.44 0.32 0.05 0.01 0.51 0.67 −0.83 −0.74 −0.68 −0.35 −0.06 −0.01 −0.09 −0.38
ND 3 0.58 0.54 0.29 0.20 0.20 0.20 0.51 0.60 0.34 0.41 0.08 0.07 0.00 0.05 0.92 0.88 −0.64 −0.43 −0.27 −0.16 −0.20 −0.17 −0.18 −0.09
NE 3 0.43 0.40 0.43 0.45 0.21 0.20 0.36 0.35 0.22 0.26 0.14 0.21 0.03 0.06 0.84 0.73 −0.73 −0.58 −0.39 −0.37 −0.20 −0.17 −0.14 −0.04
DE 4 0.47 0.56 0.20 0.21 0.06 0.01 0.74 0.78 0.24 0.23 0.24 0.25 0.01 0.08 0.76 0.67 −0.73 −0.84 −0.14 −0.17 −0.06 0.00 −0.53 −0.67
FL 4 0.32 0.33 0.30 0.36 0.09 0.02 0.60 0.61 0.17 0.16 0.33 0.27 0.01 0.13 0.66 0.59 −0.72 −0.75 −0.21 −0.29 −0.09 0.01 −0.42 −0.46
GA 4 0.35 0.35 0.63 0.67 0.03 0.01 0.34 0.32 0.19 0.19 0.34 0.45 0.44 0.05 0.22 0.50 −0.69 −0.71 −0.52 −0.54 0.11 0.00 −0.28 −0.18

(cont.)


Table 3.5. (Continued)

1969–1987 1988–2007 Decomposition of (Var69−87 − Var88−07)/Var88−07

σ R2 − F R2 − R R2 − e σ R2 − F R2 − R R2 − e Total F R e

HA 4 0.52 0.56 0.00 0.00 0.12 0.01 0.88 0.98 0.45 0.45 0.01 0.00 0.00 0.02 0.99 0.97 −0.24 −0.35 0.01 0.00 −0.12 0.00 −0.13 −0.36

KY 4 0.42 0.44 0.18 0.32 0.16 0.00 0.66 0.68 0.19 0.19 0.51 0.31 0.02 0.01 0.47 0.68 −0.80 −0.80 −0.08 −0.26 −0.15 0.00 −0.57 −0.54

MS 4 0.37 0.44 0.13 0.25 0.61 0.00 0.26 0.75 0.23 0.25 0.16 0.14 0.12 0.01 0.72 0.85 −0.61 −0.67 −0.07 −0.20 −0.57 0.00 0.02 −0.47

NC 4 0.33 0.38 0.27 0.45 0.21 0.00 0.52 0.55 0.13 0.17 0.50 0.45 0.16 0.01 0.34 0.54 −0.85 −0.81 −0.19 −0.36 −0.18 0.00 −0.47 −0.45

SC 4 0.27 0.29 0.33 0.43 0.13 0.02 0.54 0.56 0.17 0.18 0.21 0.21 0.22 0.05 0.57 0.74 −0.62 −0.61 −0.25 −0.34 −0.05 0.01 −0.32 −0.27

TN 4 0.34 0.37 0.59 0.57 0.17 0.00 0.23 0.43 0.16 0.18 0.43 0.49 0.01 0.01 0.57 0.51 −0.77 −0.78 −0.50 −0.46 −0.17 0.00 −0.10 −0.32

VA 4 0.28 0.30 0.60 0.81 0.10 0.08 0.30 0.11 0.18 0.16 0.45 0.51 0.02 0.35 0.53 0.13 −0.57 −0.70 −0.40 −0.65 −0.09 0.03 −0.07 −0.07

VT 4 0.55 0.58 0.07 0.14 0.00 0.03 0.93 0.83 0.34 0.29 0.20 0.10 0.07 0.16 0.73 0.74 −0.61 −0.74 0.01 −0.11 0.02 0.01 −0.64 −0.64

WV 4 0.59 0.62 0.00 0.13 0.04 0.01 0.96 0.86 0.26 0.22 0.20 0.20 0.07 0.14 0.73 0.65 −0.81 −0.88 0.04 −0.10 −0.03 0.00 −0.82 −0.78

AK 5 0.67 0.67 0.01 0.04 0.08 0.09 0.90 0.87 0.37 0.43 0.10 0.02 0.06 0.06 0.85 0.92 −0.71 −0.59 0.01 −0.03 −0.07 −0.07 −0.65 −0.49

AL 5 0.39 0.39 0.26 0.39 0.04 0.02 0.70 0.59 0.16 0.19 0.09 0.33 0.18 0.02 0.73 0.65 −0.82 −0.77 −0.24 −0.32 −0.01 −0.01 −0.57 −0.45

AR 5 0.35 0.35 0.41 0.56 0.07 0.01 0.52 0.44 0.20 0.22 0.16 0.26 0.04 0.00 0.80 0.73 −0.66 −0.60 −0.36 −0.45 −0.05 0.00 −0.25 −0.14

AZ 5 0.43 0.46 0.50 0.53 0.05 0.01 0.45 0.46 0.21 0.21 0.68 0.49 0.02 0.02 0.30 0.49 −0.76 −0.79 −0.34 −0.43 −0.04 −0.01 −0.38 −0.36

CO 5 0.37 0.35 0.61 0.57 0.23 0.12 0.16 0.31 0.23 0.25 0.06 0.22 0.18 0.06 0.76 0.72 −0.60 −0.50 −0.59 −0.46 −0.16 −0.09 0.15 0.05

LA 5 0.36 0.33 0.44 0.44 0.35 0.33 0.21 0.23 0.18 0.20 0.03 0.21 0.41 0.23 0.56 0.56 −0.73 −0.60 −0.43 −0.35 −0.24 −0.25 −0.06 0.00

MT 5 0.60 0.66 0.05 0.16 0.07 0.04 0.87 0.81 0.31 0.33 0.04 0.12 0.18 0.04 0.78 0.84 −0.73 −0.75 −0.04 −0.13 −0.02 −0.03 −0.66 −0.60

NM 5 0.49 0.46 0.30 0.27 0.04 0.01 0.66 0.72 0.19 0.23 0.32 0.21 0.02 0.01 0.66 0.78 −0.85 −0.75 −0.25 −0.22 −0.03 −0.01 −0.56 −0.52

OK 5 0.42 0.36 0.31 0.23 0.35 0.36 0.34 0.41 0.20 0.22 0.13 0.12 0.19 0.25 0.68 0.64 −0.78 −0.61 −0.28 −0.19 −0.31 −0.26 −0.18 −0.17

TX 5 0.31 0.30 0.38 0.37 0.35 0.37 0.27 0.26 0.16 0.17 0.13 0.22 0.49 0.31 0.38 0.47 −0.73 −0.68 −0.35 −0.30 −0.22 −0.27 −0.16 −0.11

UT 5 0.38 0.40 0.28 0.29 0.05 0.04 0.67 0.67 0.20 0.21 0.11 0.20 0.19 0.04 0.69 0.76 −0.72 −0.72 −0.24 −0.23 0.00 −0.03 −0.48 −0.46

WY 5 0.58 0.61 0.00 0.04 0.12 0.10 0.88 0.86 0.25 0.37 0.05 0.02 0.31 0.08 0.64 0.90 −0.81 −0.64 0.01 −0.03 −0.06 −0.07 −0.76 −0.54

Mean 0.40 0.42 0.38 0.36 0.12 0.10 0.49 0.54 0.22 0.23 0.28 0.25 0.11 0.12 0.61 0.63 −0.69 −0.68 −0.30 −0.29 −0.09 −0.06 −0.30 −0.33

0.10 0.29 0.30 0.05 0.13 0.01 0.00 0.18 0.23 0.16 0.16 0.05 0.07 0.00 0.01 0.33 0.28 −0.83 −0.81 −0.56 −0.46 −0.24 −0.23 −0.64 −0.63

0.25 0.33 0.35 0.23 0.23 0.04 0.01 0.29 0.35 0.17 0.19 0.13 0.14 0.01 0.02 0.47 0.51 −0.78 −0.77 −0.46 −0.39 −0.16 −0.08 −0.48 −0.51

0.50 0.38 0.40 0.35 0.37 0.08 0.04 0.47 0.56 0.20 0.21 0.23 0.23 0.06 0.06 0.64 0.65 −0.73 −0.72 −0.31 −0.30 −0.06 −0.02 −0.25 −0.36

0.75 0.45 0.50 0.59 0.49 0.17 0.13 0.67 0.71 0.24 0.25 0.43 0.33 0.18 0.16 0.76 0.77 −0.62 −0.60 −0.19 −0.19 −0.01 0.00 −0.13 −0.14

0.90 0.55 0.58 0.65 0.56 0.26 0.30 0.87 0.81 0.31 0.37 0.51 0.46 0.29 0.31 0.85 0.86 −0.57 −0.47 −0.04 −0.11 0.02 0.00 −0.05 −0.07

The first entry in each cell is computed using the unrestricted split-sample estimates of the dynamic factor model; the second entry is computed using restricted split-sample estimates for which the factor loadings and idiosyncratic autoregressive coefficients are restricted to equal their full-sample values. The first numeric column is the region of the state. The next block of columns contains the standard deviation of Δ4yit over 1969–1987 and the fraction of the variance attributable to the national factor F, the regional factor R, and the idiosyncratic disturbance e. The next block provides the same statistics for 1988–2007. The final block decomposes the relative change in the variance from the first to the second period as the sum of changes in the contribution of F, R, and e; for each state, the sum of the final three columns equals the Total column up to rounding.


Fourth, the summary statistics based on the restricted and unrestricted split-sample estimation results are similar. For example, the median estimated R2 explained by the national factor in the first period (numeric column 3) is 0.38 using the unrestricted estimates and 0.36 using the restricted estimates. Similarly, the median fractional change in the variance between the first and the second sample attributed to a reduction in the contribution of the national factor (numeric column 11) is 0.31 for the unrestricted estimates and 0.30 for the restricted estimates. These comparisons indicate that little is lost, at least on average, by modeling the factor loadings and autoregressive coefficients as constant across the two samples and allowing only the variances to change.9 Moreover, inspection of Table 3.5 reveals that the foregoing conclusions based on the restricted estimates also follow from the unrestricted estimates.

4.3. Results for the DFM-SV model

We now turn to the results based on the DFM-SV model. As discussed in Section 3.2, the parameters λ, γ, and ρ are fixed at the full-sample MLEs, and the filtered estimates of the factors and their time-varying variances were computed numerically.

National and regional factors. The four-quarter growth of the estimated national factor from the DFM-SV model, Δ4Ft, is plotted in Figure 3.10 along with three other measures of national movements in building permits: the first principal component of the 50 series Δ4y1t, …, Δ4y50t; the average state four-quarter growth rate, (1/50) ∑i=1…50 Δ4yit; and the four-quarter growth rate of total national building permits, ln(BPt/BPt−4), where BPt = ∑i=1…50 BPit. The first principal component is an estimator of the four-quarter growth rate of the national factor in a single-factor model (Stock and Watson, 2002a), as is the average of the state-level four-quarter growth rates under the assumption that the average population factor loading for the national factor is nonzero (Forni and Reichlin, 1998). The fourth series plotted, the four-quarter growth rate of national aggregate building permits, does not have an interpretation as an estimate of the factor in a single-factor version of the DFM specified in logarithms because the factor model is specified in logarithms at the state level.
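Given a panel of state building permits BPit, the three comparison series are direct to compute; a sketch on a synthetic panel (the data-generating choices here are illustrative only):

```python
# The comparison series of Figure 3.10 from a panel of state permits BP_it.
import numpy as np

rng = np.random.default_rng(3)
T, N = 160, 50
BP = np.exp(5 + np.cumsum(rng.normal(0, 0.05, (T, N)), axis=0))  # synthetic permits

y = np.log(BP)
d4y = y[4:] - y[:-4]                               # state four-quarter growth rates

avg_growth = d4y.mean(axis=1)                      # average state growth rate
BP_total = BP.sum(axis=1)
agg_growth = np.log(BP_total[4:] / BP_total[:-4])  # ln(BP_t / BP_{t-4})

dc = d4y - d4y.mean(axis=0)                        # demeaned panel for the PC
U, s, Vt = np.linalg.svd(dc, full_matrices=False)
pc1 = U[:, 0] * s[0]                               # first principal component
```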

As is clear from Figure 3.10, the three estimates of the factor (the DFM-SV estimate, the first principal component, and the average of the state-level growth rates) yield very nearly the same estimated four-quarter growth of the national factor. These in turn are close to the growth rate of national building permits; however, there are some discrepancies between the national permits and the estimates of the national factor, particularly in 1974, 1990, and 2007. Like national building permits and consistent with the split-sample analysis, the four-quarter growth rate of the national factor shows a marked reduction in volatility after 1985.

Figure 3.11 presents the four-quarter growth rates of the national and five regional factors, along with ±1 standard deviation bands, where the standard deviation bands represent filtering uncertainty but not parameter estimation uncertainty (as discussed in Section 3.2). The regional factors show substantial variation across regions; for example, the housing slowdown in the mid-1980s in the South Central (Region 5) and the slowdown

9. The restricted and unrestricted split-sample log-likelihoods differ by 280 points, with 194 additional parameters in the unrestricted model. However, it would be heroic to rely on a chi-squared asymptotic distribution of the likelihood ratio statistic for inference with this many parameters.


58 The evolution of national and regional factors in US housing construction


Fig. 3.10. Comparison of the DFM-SV filtered estimate of the national factor (solid line) to the first principal component of the 50 state series, total US building permits, and the average of the state-level building permit growth rates, all computed using four-quarter growth rates

in the late 1980s in the Northeast (Region 1) are both visible in the regional factors, and these slowdowns do not appear in other regions.

Figure 3.12 takes a closer look at the pattern of volatility in the national and regional factors by reporting the estimated instantaneous standard deviation of the factor innovations. The estimated volatility of the national factor falls markedly over the middle of the sample, as does the volatility for Region 3 (the Upper Midwest). However, the pattern of volatility changes for regions other than 3 is more complicated; in fact, there is evidence of a volatility peak in the 1980s in Regions 1, 2, 4, and 5. This suggests that the DFM-SV model attributes the common aspect of the decline in volatility of state building permits over the sample to a decline in the volatility of the national factor.

Figure 3.13 uses the DFM-SV estimates to compute statistics analogous to those from the split-sample analysis of Section 4.2, specifically, state-by-state instantaneous estimates of the standard deviation of the innovation to the idiosyncratic disturbance and the partial R2 attributable to the national and regional factors and to the idiosyncratic disturbance. The conclusions are consistent with those reached by the examination of the split-sample results in Table 3.5. Specifically, for a typical state the fraction of the state-level variance of Δ4yit explained by the national factor has declined over time, the fraction attributable to the idiosyncratic disturbance has increased, and the fraction attributable to the regional factor has remained approximately constant. In addition, the volatility of the idiosyncratic disturbance has decreased over time.
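The variance-decomposition logic behind these partial R2 statistics can be sketched for a single state at a single quarter. The loadings and variances below are hypothetical, and the static sketch ignores the dynamics that the actual filtered calculation accounts for:

```python
# Hypothetical loadings of one state on the national (lam) and regional (gam)
# factors, and hypothetical instantaneous innovation variances at one quarter.
lam, gam = 0.8, 0.5
var_F, var_R, var_u = 0.04, 0.02, 0.03

total = lam**2 * var_F + gam**2 * var_R + var_u   # total state-level variance
r2_national = lam**2 * var_F / total              # partial R2, national factor
r2_regional = gam**2 * var_R / total              # partial R2, regional factor
r2_idio = var_u / total                           # partial R2, idiosyncratic

# the three shares sum to one by construction
assert abs(r2_national + r2_regional + r2_idio - 1.0) < 1e-12
```

Tracking these three shares quarter by quarter, as the instantaneous variances evolve, produces exactly the kind of time paths summarized across states in Figure 3.13.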




Fig. 3.11. Four-quarter decimal growth of the filtered estimates of the national factor (first panel) and the five regional factors from the DFM-SV model, and ±1 standard deviation bands (dotted lines)


Fig. 3.12. DFM-SV estimates of the instantaneous standard deviations of the innovations to the national and regional factors, with ±1 standard deviation bands (dotted lines)




Fig. 3.13. DFM-SV estimates of the evolution of the state-level factor model: the standard deviation of the idiosyncratic innovation (upper left) and the partial R2 from the national factor (upper right), the regional factor (lower left), and the idiosyncratic term (lower right). Shown are the 10%, 25%, 50%, 75%, and 90% percentiles across states, evaluated quarter by quarter

This said, the patterns in Figure 3.13 suggest some nuances that the split-sample analysis masks. Notably, the idiosyncratic standard deviation declines at a nearly constant rate over this period, and does not appear to be well characterized as having a single break. The volatility of the regional factor does not appear to be constant, and instead increases substantially for many states in the 1980s. Also, the importance of the national factor has fluctuated over time: it was greatest during the recessions of the late 1970s and early 1980s, but in the early 1970s the contribution of the national factor was essentially the same as in 2007. For the partial R2 associated with the national factor, the pattern that emerges is less one of a sharp break than of a slow evolution.

5. Discussion and conclusions

The empirical results in Section 4 suggest four main findings that bear on the issues, laid out in the introduction, about the relationship between state-level volatility in housing construction and the Great Moderation in overall US economic activity.



First, there has been a large reduction in the volatility of state-level housing construction, with the state-level variance of the four-quarter growth in building permits falling by between 35% and 88% from the period 1970–1987 to the period 1988–2007, with a median decline of 72%.

Second, according to the estimates from the state building permit DFM-SV model, there was a substantial decline in the volatility of the national factor, and this decline occurred sharply in the mid-1980s. On average, this reduction in the volatility of the national factor accounted for one-half of the reduction in the variance of four-quarter growth in state building permits.

Third, there is evidence of regional organization of housing markets and, intriguingly, the cluster analytic methods we used to estimate the composition of the regions resulted in five conventionally identifiable regions – the Northeast, Southeast, Upper Midwest, Rockies, and West Coast – even though no constraints were imposed requiring the estimated regions to be contiguous. The regional factors, however, explain only a modest amount of state-level fluctuations in building permits, and the regional factors show no systematic decline in volatility; if anything, they exhibit a peak in volatility in the mid-1980s.

Fourth, there has been a steady decline in the volatility of the idiosyncratic component of state building permits over the period 1970–2007. The smooth pattern of this decline is different from that for macroeconomic aggregates or for the national factor, which exhibit striking declines in volatility in the mid-1980s.

Taken together, these findings are consistent with the view, outlined in the introduction, that the development of financial markets played an important role in the Great Moderation: less cyclically sensitive access to credit coincided with a decline in the volatility of the national factor in building permits, which in turn led to declines in the volatility of state housing construction. The timing of the decline in the volatility of the national housing factor coincides with the harmonization of mortgage rates across regions shown in Figure 3.2, which occurred in the mid-1980s.

We emphasize that the evidence here is reduced-form, and the moderation of the national factor presumably reflects many influences, including moderation in the volatility of income. Sorting out these multiple influences would require augmenting the state building permits data set developed here with other data, such as state-level incomes.


4

Modeling UK Inflation Uncertainty, 1958–2006

Gianna Boero, Jeremy Smith, and Kenneth F. Wallis

1. Introduction

Introducing the autoregressive conditional heteroskedastic (ARCH) process in his celebrated article in Econometrica in July 1982, Robert Engle observed that the ARCH regression model "has a variety of characteristics which make it attractive for econometric applications" (p. 989). He noted in particular that "econometric forecasters have found that their ability to predict the future varies from one period to another", citing the recognition by McNees (1979, p. 52) that "the inherent uncertainty or randomness associated with different forecast periods seems to vary widely over time", and McNees's finding that "the 'large' and 'small' errors tend to cluster together" (p. 49). McNees had examined the track record of the quarterly macroeconomic forecasts published by five forecasting groups in the United States over the 1970s. He found that, for inflation, the median one-year-ahead forecast persistently underpredicted the annual inflation rate from mid-1972 to mid-1975, with the absolute forecast error exceeding four percentage points for five successive quarters in this period; outside this period forecast errors were more moderate, and changed sign from time to time, though serial correlation remained. Engle's article presented an application of the ARCH regression model to inflation in the United Kingdom over the period 1958–1977, which included the inflationary explosion of 1974–1975, the magnitude of which had likewise been unanticipated by UK forecasters (Wallis, 1989). In both countries this "Great Inflation" is now seen as an exceptional episode, and the transition to the "Great Moderation" has been much studied in recent years. How this has interacted with developments in the analysis of inflation volatility and the treatment of inflation forecast uncertainty is the subject of this chapter.

The quarter-century since the publication of ARCH has seen widespread application in macroeconomics of the basic model and its various extensions – GARCH, GARCH-M, EGARCH . . . – not to mention the proliferation of applications in finance of these and related models under the heading of stochastic volatility, the precursors of which predate




ARCH (Shephard, 2008). There has also been substantial development in the measurement and reporting of inflation forecast uncertainty (Wallis, 2008). Since 1996 the National Institute of Economic and Social Research (NIESR) and the Bank of England have published not only point forecasts but also density forecasts of UK inflation, the latter in the form of the famous fan chart. Simultaneously in 1996 the Bank initiated its Survey of External Forecasters, analogous to the long-running US Survey of Professional Forecasters; based on the responses it publishes quarterly survey average density forecasts of inflation in its Inflation Report. Finally, the last quarter-century has seen substantial development of the econometrics of structural breaks and regime switches, perhaps driven by and certainly relevant to the macroeconomic experience of the period.

These methods have been applied in a range of models to document the decline in persistence and volatility of key macroeconomic aggregates in the United States, where the main break is usually located in the early 1980s. Interpretation has been less straightforward, however, especially with respect to inflation, as "it has proved hard to reach agreement on what monetary regimes were in place in the US and indeed whether there was ever any change at all (except briefly at the start of the 1980s with the experiment in the control of bank reserves)" (Meenagh, Minford, Nowell, Sofat and Srinivasan, 2009). Although the corresponding UK literature is smaller in volume, it has the advantage that the various changes in policy towards inflation are well documented, which Meenagh et al. and other authors have been able to exploit. Using models in this way accords with the earlier view of Nerlove (1965), while studying econometric models of the UK economy, that model building, in addition to the traditional purposes of forecasting and policy analysis, can be described as a way of writing economic history. The modeling approach and the traditional approach to economic history each have limitations, but a judicious blend of the two can be beneficial. At the same time there can be tensions between the ex post and ex ante uses of the model, as discussed below.

The rest of this chapter is organized as follows. Section 2 contains a brief review of UK inflationary experience and the associated policy environment(s), 1958–2006, in the light of the literature alluded to in the previous paragraph. Section 3 returns to Engle's original ARCH regression model, and examines its behavior over the extended period. Section 4 turns to a fuller investigation of the nature of the nonstationarity of inflation, preferring a model with structural breaks, stationary within subperiods. Section 5 considers a range of measures of inflation forecast uncertainty, from these models and other UK sources. Section 6 considers the association between uncertainty and the level of inflation, first mooted in Milton Friedman's Nobel lecture. Section 7 concludes.

2. UK inflation and the policy environment

Measures of inflation based on the Retail Prices Index (RPI) are plotted in Figure 4.1, using quarterly data, 1958–2006. We believe that this is the price index used by Engle (1982a), although the internationally more standard term, "consumer price index", is used in his text; in common with most time-series econometricians, he defined inflation as the first difference of the log of the quarterly index. In 1975 mortgage interest payments were introduced into the RPI to represent owner-occupiers' housing costs, replacing a rental equivalent approach, and a variant index excluding mortgage interest payments




Fig. 4.1(a). UK RPI inflation 1958:1–2006:4 (percentage points of annual inflation), Δ1pt


Fig. 4.1(b). UK RPI inflation 1958:1–2006:4 (percentage points of annual inflation), Δ4pt

(RPIX) also came into use. This became the explicit target of the inflation targeting policy initiated in October 1992, as it removed a component of the all-items RPI that reflected movements in the policy instrument. In December 2003 the official target was changed to the Harmonised Index of Consumer Prices, constructed on principles harmonized across member countries of the European Union and promptly relabeled CPI in the UK, while the all-items RPI continues in use in a range of indexation applications, including index-linked gilts. Neither of these indices, nor their variants, is ever revised after first publication. For policy purposes, and hence also in public discussion and practical forecasting, inflation is defined in terms of the annual percentage increase in the relevant index. We denote the "econometric" and "policy" measures of inflation, respectively, as Δ1pt and Δ4pt, where Δi = 1 − Lⁱ with lag operator L, and p is the log of the quarterly



index. The former, annualized (by multiplying by four), is shown in the upper panel of Figure 4.1; the latter in the lower panel. It is seen that annual differencing removes the mild seasonality in the quarterly RPI, which is evident in the first-differenced series, and also much reduces short-term volatility.
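In code, the two measures differ only in the differencing horizon. A minimal sketch (ours, using a toy index rather than the RPI) is:

```python
import numpy as np

rng = np.random.default_rng(1)
p = np.log(100.0) + np.cumsum(rng.uniform(0.005, 0.02, size=60))  # toy log index

d1p = p[1:] - p[:-1]        # Δ1p_t = (1 − L)p_t, the "econometric" measure
d4p = p[4:] - p[:-4]        # Δ4p_t = (1 − L⁴)p_t, the "policy" measure
d1p_pct = 400 * d1p         # annualized, in percentage points (Fig. 4.1(a))
d4p_pct = 100 * d4p         # annual percentage increase (Fig. 4.1(b))

# Δ4 sums four successive quarterly differences, so a fixed additive
# quarterly seasonal pattern contributes the same total every quarter
assert np.allclose(d4p, d1p[3:] + d1p[2:-1] + d1p[1:-2] + d1p[:-3])
```

The identity in the final line is why annual differencing removes constant additive seasonality, as the text notes for the quarterly RPI.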

Episodes of distinctly different inflationary experience are apparent in Figure 4.1, and their identification in the context of different modeling exercises and their association with different approaches to macroeconomic policy have been studied in the UK literature mentioned above. Haldane and Quah (1999) consider the Phillips curve from the start of the original Phillips sample, 1861, to 1998. For the post-war period, with a specification in terms of price inflation (unlike the original Phillips curve specification in terms of wage inflation), they find distinctly different "curves" pre- and post-1980: at first the curve is "practically vertical; after 1980, the Phillips curve is practically horizontal" (p. 266). Benati (2004), however, questions Haldane and Quah's use of frequency-domain procedures that focus on periodicities between five and eight years, and argues for a more "standard" business-cycle range of six quarters to eight years. With this alternative approach he obtains a further division of each episode, identifying "a period of extreme instability (the 1970s), a period of remarkable stability (the post-1992 period), and two periods 'in-between' (the Bretton Woods era and the period between 1980 and 1992)" (p. 711). This division is consistent with his prior univariate analysis of RPI inflation, 1947:1–2003:2, which finds three breaks in the intercept, coefficients and innovation variance of a simple autoregression, with estimated dates 1972:3, 1981:2 and 1992:2 (although the date of the second break is much less precisely determined than the other two dates).

Nelson and Nikolov (2004) and Meenagh et al. (2009) consider a wide range of "real-time" policy statements and pronouncements to document the vicissitudes of UK macroeconomic policymaking since the late 1950s. Until 1997, when the Bank of England gained operational independence, monetary policy, like fiscal policy, was in the hands of elected politicians, and their speeches and articles are a rich research resource. This evidence, together with their simulation of an estimated New Keynesian model of aggregate demand and inflation behavior, leads Nelson and Nikolov to conclude that "monetary policy neglect", namely the failure in the 1960s and 1970s to recognize the primacy of monetary policy in controlling inflation, is important in understanding the inflation of that period. Study of a yet wider range of policymaker statements leads Nelson (2009) to conclude that the current inflation targeting regime is the result not of changed policymaker objectives, but rather of an "overhaul of doctrine", in particular a changed view of the transmission mechanism, with the divide between the "old" and "modern" eras falling in 1979.

Meenagh et al. (2009) provide a finer division of policy episodes, identifying five subperiods: the Bretton Woods fixed exchange rate system, up to 1970:4; the incomes policy regime, 1971:1–1978:4; the money targeting regime, 1979:1–1985:4; exchange rate targeting, 1986:1–1992:3; and inflation targeting, since 1992:4. They follow their narrative analysis with statistical tests in a three-variable VAR model, finding general support for the existence of the breaks, although the estimated break dates are all later than those suggested by the narrative analysis. These reflect lags in the effect of policy on inflation and growth outcomes and, when policy regimes change, "there may well be a lag before agents' behaviour changes; this lag will be the longer when the regime change is not clearly communicated or its effects are not clearly understood" (p. 980). Meenagh et al. suggest that this applies to the last two changes: the switch to exchange rate



targeting in 1986, with a period of "shadowing the Deutsche Mark" preceding formal membership of the Exchange Rate Mechanism of the European Monetary System, was deliberately kept unannounced by the Treasury, while in 1992 inflation targeting was unfamiliar, with very little experience from other countries to draw on. Independent evidence on responses to later changes to the detail of the inflation targeting arrangements is presented in Section 5.

None of the research discussed above is cast in the framework of a regime switching model, of which a wide variety is available in the econometric literature. The brief account of five policy episodes in the previous paragraph makes it clear that there was no switching from one regime to another and back again; at each break point the old policy was replaced by something new. Likewise no regime switching models feature in the analysis presented below.

3. Re-estimating the original ARCH model

The original ARCH regression model for UK inflation is (Engle, 1982a, pp. 1001–2):

Δ1pt = β0 + β1Δ1pt−1 + β2Δ1pt−4 + β3Δ1pt−5 + β4(pt−1 − wt−1) + εt, (1)

εt|ψt−1 ∼ N(0, ht),   ht = α0 + α1(0.4ε²t−1 + 0.3ε²t−2 + 0.2ε²t−3 + 0.1ε²t−4)   (2)

where p is the log of the quarterly RPI and ψt−1 is the information set available at time t−1. The wage variable used by Engle (in logs) in the real wage "error correction" term, namely an index of manual wage rates, was subsequently discontinued, and for consistency in all our re-estimations we use the average earnings index, also used by Haldane and Quah (1999). For the initial sample period, 1958:1–1977:2, we are able to reproduce Engle's qualitative findings, with small differences in the quantitative details due to these minor variations. In particular, with respect to the h-process, our maximum likelihood estimate of α0 is, like his, not significantly different from zero, whereas our estimate of α1, at 0.897, is slightly smaller than his (0.955). The turbulence of the period is illustrated in Figure 4.2, which plots the square root of the estimates of ht over the sample period: these are the standard errors of one-quarter-ahead forecasts of annual inflation based on the model. The width of an interval forecast with nominal 50% coverage (the interquartile range) varies from a minimum of 2.75 percentage points to a maximum of 14 percentage points of annual inflation. Engle concludes that "this example illustrates the usefulness of the ARCH model . . . for obtaining more realistic forecast variances", although these were not subject to test in an out-of-sample exercise.
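To illustrate how the restricted variance process in equation (2) can be estimated, the following sketch (our own, not Engle's code, on simulated residuals) maximizes the Gaussian likelihood over (α0, α1) using SciPy's Nelder-Mead optimizer:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
w = np.array([0.4, 0.3, 0.2, 0.1])        # the fixed declining weights in (2)

# simulate residuals from the process so the estimator has a known target
T, a0_true, a1_true = 500, 0.10, 0.80
e = np.zeros(T)
for t in range(4, T):
    h = a0_true + a1_true * (w @ e[t-4:t][::-1] ** 2)
    e[t] = np.sqrt(h) * rng.standard_normal()

def neg_loglik(theta, e):
    """Negative Gaussian log-likelihood of (a0, a1), conditioning on e_0..e_3."""
    a0, a1 = theta
    e2 = e ** 2
    lagged = np.column_stack([e2[3:-1], e2[2:-2], e2[1:-3], e2[:-4]])
    h = a0 + a1 * (lagged @ w)            # h_t for t = 4, ..., T-1
    if np.any(h <= 0):
        return np.inf
    return 0.5 * np.sum(np.log(h) + e2[4:] / h)

res = minimize(neg_loglik, x0=np.array([0.05, 0.5]), args=(e,),
               method="Nelder-Mead")
a0_hat, a1_hat = res.x
# one-step-ahead conditional standard error, the quantity plotted in Fig. 4.2
cond_std = np.sqrt(a0_hat + a1_hat * (w @ e[-4:][::-1] ** 2))
```

The square root of the fitted h-series, computed this way at every date, gives a path like the conditional standard errors in Figure 4.2.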

Re-estimation over the extended sample period 1958:1–2006:4 produces the results shown in Table 4.1. These retain the main features of the original model – significant autoregressive coefficients, insignificant α0, estimated α1 close to 1 – except for the estimate of the error correction coefficient, β4, which is virtually zero. Forward recursive estimation shows that this coefficient maintains its significance from the initial sample to samples ending in the mid-1980s, but then loses its significance as more recent observations are added to the sample. Figure 4.3(a) shows the conditional standard error of annualized inflation over the fully extended period. The revised estimates are seen to extend the peaks in the original sample period shown in Figure 4.2; there is then a




Fig. 4.2. Conditional standard errors, 1958:1–1977:2, Δ1pt

further peak around the 1979–1981 recession, after which the conditional standard error calms down.

Practical forecasters familiar with the track record of inflation projections over the past decade may be surprised by forecast standard errors as high as two percentage points of annual inflation shown in Figure 4.3(a). Their normal practice, however, is to work with an inflation measure defined as the percentage increase in prices on a year earlier, Δ4p, whereas Δ1p is used in Engle's model and our various re-estimates of it. The latter series exhibits more short-term volatility, as seen in Figure 4.1. Replacing Δ1p in the original ARCH regression model given above by Δ4p and re-estimating over the extended sample gives the conditional standard error series shown in Figure 4.3(b). This has the same profile as the original specification, but reflects a much lower overall level of uncertainty surrounding the more popular measure of inflation.

Table 4.1. Estimation of the original ARCH model over 1958:1–2006:4
Δ1pt = β0 + β1Δ1pt−1 + β2Δ1pt−4 + β3Δ1pt−5 + β4(pt−1 − wt−1) + εt,
εt|ψt−1 ∼ N(0, ht), ht = α0 + α1(0.4ε²t−1 + 0.3ε²t−2 + 0.2ε²t−3 + 0.1ε²t−4)

        Coeff.    Std Error   z statistic   p value
β0      0.014     0.0097       1.44         0.150
β1      0.391     0.0852       4.59         0.000
β2      0.659     0.0504      13.07         0.000
β3     −0.337     0.0646      −5.22         0.000
β4      0.002     0.0062       0.39         0.696
α0      0.0002    8E–05        2.99         0.003
α1      1.009     0.1564       6.45         0.000

Log likelihood 398.9   Akaike info criterion −4.00
Schwarz criterion −3.88   Hannan–Quinn criterion −3.95




Fig. 4.3(a). Conditional standard errors, 1958:1–2006:4, Δ1pt


Fig. 4.3(b). Conditional standard errors, 1958:1–2006:4, Δ4pt

Over the last decade the time series plotted in Figures 4.1 and 4.3 have a more homoskedastic, rather than heteroskedastic, appearance, despite the significance of the estimate of α1 over the full sample including this period. As a final re-estimation exercise on the original ARCH model, with Δ1p, we undertake backward recursive estimation. We begin with the sample period 1992:4–2006:4, the inflation targeting period, despite reservations about a learning period having been required before the full benefits of the new policy became apparent. We then consider sample periods starting earlier, one quarter at a time, until the complete sample period 1958:1–2006:4 is reached. Equivalently, we could begin with full sample estimation then sequentially remove the earliest observation. Either way, the resulting estimates of the coefficient α1 and the p values of the LM test (Engle, 1982a, Section 8) are plotted in Figure 4.4 against the starting date of the sample; the end date is 2006:4 throughout. There is seen to be a clear change around 1980. To




Fig. 4.4. Backward recursive estimates of α1 and the LM p-value, Δ1pt

exhibit significant conditional heteroskedasticity it is necessary to include periods earlier than this in the sample; samples starting after 1980 offer no support for the existence of ARCH in this model. Similar results are obtained when the model is rewritten in terms of Δ4p, except that the sample has to start in 1990 or later for the significant ARCH effect to have disappeared. These findings prompt more general questions about nonstationarity.
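The backward recursive exercise is conceptually a simple loop. The sketch below is our own illustration, applying Engle's T·R² LM test for ARCH to a toy residual series (not the model's actual residuals) for a sequence of later and later start dates:

```python
import numpy as np
from scipy import stats

def arch_lm_pvalue(e, lags=4):
    """Engle's LM test: regress e_t² on a constant and `lags` lagged squared
    residuals; T·R² is asymptotically χ²(lags) under the null of no ARCH."""
    e2 = e ** 2
    Y = e2[lags:]
    X = np.column_stack(
        [np.ones(len(Y))] +
        [e2[lags - j - 1: len(e2) - j - 1] for j in range(lags)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    r2 = 1.0 - ((Y - X @ beta) ** 2).sum() / ((Y - Y.mean()) ** 2).sum()
    return stats.chi2.sf(len(Y) * r2, df=lags)

rng = np.random.default_rng(3)
# toy residuals: a turbulent early regime followed by a quiet later one
e = np.concatenate([rng.standard_normal(80) * np.linspace(3, 1, 80),
                    rng.standard_normal(120)])
# backward recursion: move the sample start date later, keep the end fixed
pvals = {start: arch_lm_pvalue(e[start:]) for start in range(0, 160, 20)}
```

Plotting `pvals` against the start date reproduces the logic of Figure 4.4: once the turbulent early observations fall out of the sample, the test loses its evidence of ARCH.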

4. The nonstationary behavior of UK inflation

We undertake a fuller investigation of the nature of the nonstationarity of inflation, in the light of the coexistence in the literature of conflicting approaches. For example, Garratt, Lee, Pesaran and Shin (2003; 2006, Ch. 9) present an eight-equation conditional vector error correction model of the UK economy, estimated over 1965:1–1999:4, in which RPI inflation, Δ1p, is treated as an I(1) variable. This leads them to express the target in their monetary policy experiment as a desired constant reduction in the rate of inflation from that observed in the previous period, which does not correspond to the inflation target that is the current focus of policy in the UK, nor anywhere else. In contrast, Castle and Hendry (2008) present error correction equations for inflation (GDP deflator) for use in forecast comparisons, with the same sample starting date as Garratt et al., assuming that "the price level is I(1), but subject to structural breaks which give the impression that the series is I(2)".

Standard unit root tests without structural breaks reveal some of the sources of potential ambiguity. Tests are performed recursively, beginning with a sample of 40 observations, 1958:1–1967:4, then extending the sample quarter-by-quarter to 2006:4. Results for the augmented Dickey–Fuller (ADF) test are representative of those obtained across various other tests. For the quarterly inflation series Δ1p, the results presented in




Fig. 4.5(a). Recursive ADF tests for Δ1p, with 5% and 10% critical values: Constant only


Fig. 4.5(b). Recursive ADF tests for Δ1p, with 5% and 10% critical values: Constant and seasonal dummies

Figure 4.5 demonstrate sensitivity to the treatment of seasonality. The upper panel gives the ADF statistic with the inclusion of a constant term, and shows that over the 1970s and 1980s the null hypothesis of I(1) inflation would not be rejected. The addition of quarterly dummy variables, however, gives the results shown in the lower panel, which lead to the clear rejection of the unit root hypothesis as soon as the end-point of the sample gets clear of the 1975 peak in inflation, and thereafter. Such constant additive seasonality can alternatively be removed by annual differencing, which also reduces short-term volatility, as noted above in the discussion of Figure 4.1. For the Δ4p series, in the corresponding figure (not shown) the ADF statistic lies in the unit root nonrejection region over the whole period. Backward recursive estimation of the ADF test for the Δ4p



series, however, shows that the unit root hypothesis would be rejected in samples with start dates in 1990 or later. These results represent a simple example of the impact of a deterministic component, and different ways of dealing with it, on inference about unit roots, and the sensitivity of such inference to the choice of sample period.
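The recursive computation itself is mechanical. The sketch below (ours) computes the ADF t-ratio on expanding samples of a simulated random walk; it returns only the statistic, since the nonstandard critical values must still come from tables:

```python
import numpy as np

def adf_tstat(y, k=1):
    """t-ratio on y_{t-1} in a regression of Δy_t on a constant, y_{t-1},
    and k lagged differences (critical values are nonstandard, not computed)."""
    dy = np.diff(y)
    Y = dy[k:]
    X = np.column_stack(
        [np.ones(len(Y)), y[k:-1]] +
        [dy[k - j - 1: len(dy) - j - 1] for j in range(k)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    s2 = resid @ resid / (len(Y) - X.shape[1])
    se = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

rng = np.random.default_rng(4)
rw = rng.standard_normal(200).cumsum()                     # a unit-root series
adf_path = [adf_tstat(rw[:n]) for n in range(40, 201, 8)]  # expanding samples
```

Adding seasonal dummy columns to `X` (as in the lower panel of Figure 4.5) or switching the input series is a one-line change, which is what makes the sensitivity analysis in the text easy to run.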

The impact of structural breaks on inference about unit roots over the full data period is assessed using the procedures of Zivot and Andrews (1992), allowing for an estimated break in mean under the alternative hypothesis. Once this is done, the ADF statistic, relative to Zivot and Andrews's critical values, implies rejection of the unit root hypothesis in all three cases: Δ1p, with and without seasonal dummy variables, and Δ4p. These results motivate further investigation of structural change, in models that are stationary within subperiods.

We apply the testing procedure developed by Andrews (1993), which treats the break dates as unknown. Confidence intervals for the estimated break dates are calculated by the method proposed by Bai (1997). For the Δ1p series, in an autoregressive model with seasonal dummy variables, namely

Δ1pt = β0 + β1Δ1pt−1 + β2Δ1pt−4 + ∑³j=1 γjQjt + εt,    (3)

we find three significant breaks in β0, but none in the remaining coefficients, at the following dates (95% confidence intervals in parentheses):

1972:3 (1970:3–1974:3)

1980:2 (1979:2–1981:2)

1990:4 (1987:4–1993:4).

These are similar dates to those of the more general breaks identified by Benati (2004), noted above, although in our case it is the date of the second break that is most precisely estimated. Likewise our three break dates are close to the dates of the first three breaks estimated in the three-variable VAR of Meenagh et al. (2009, Table 1). We have no counterpart to their fourth break, in 1993:4, associated with the introduction of inflation targeting a year earlier, although this date is the upper limit of the 95% confidence interval for our third break, which is the least precisely determined of the three.
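The least-squares location of an unknown break date that underlies estimates of this kind can be sketched in a few lines. The series, shift size, and break date below are simulated and hypothetical, not the chapter’s inflation data: candidate dates in a trimmed range are scanned and the date minimizing the residual sum of squares is selected.

```python
import numpy as np

# Illustrative break-date search: simulate a series with one intercept
# shift, then estimate the break date by least squares over a trimmed
# range. All values below are hypothetical, chosen only for the sketch.
rng = np.random.default_rng(42)
T, true_break, shift = 200, 100, 3.0
y = rng.standard_normal(T)
y[true_break:] += shift          # mean shifts at t = 100

trim = int(0.15 * T)             # 15% trimming at each end
ssr = {}
for k in range(trim, T - trim):  # candidate break dates
    r1 = y[:k] - y[:k].mean()    # residuals before the candidate break
    r2 = y[k:] - y[k:].mean()    # residuals after it
    ssr[k] = r1 @ r1 + r2 @ r2
k_hat = min(ssr, key=ssr.get)    # estimated break date
print(k_hat)
```

Bai’s (1997) confidence intervals are then constructed around the minimizing date; the sketch above stops at point estimation.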

The resulting equation with shifts in β0 shows evidence of ARCH over the whole period, but results given in the final paragraph of Section 3 about its time dependence suggest separate testing in each of the four subperiods defined by the three break dates. In none of the subperiods is there evidence of ARCH. As an alternative representation of heteroskedasticity we consider breaks in the error variance. Following Sensier and van Dijk (2004) we again locate three significant breaks, at similar dates, namely 1974:2, 1981:3, and 1990:2. Estimates of the full model are presented in Table 4.2, and the implied subperiod means and standard deviations of inflation are shown as horizontal lines in Figures 4.1(a) and 4.3(a), respectively.

For the Δ4p series seasonal dummy variables are not required, but a moving average error is included, and the autoregression is slightly revised, giving the model

Δ4pt = β0 + β1Δ4pt−1 + β2Δ4pt−2 + εt + θεt−4. (4)


72 Modeling UK inflation uncertainty, 1958–2006

Table 4.2. Estimation of the “breaks model”, 1958:1–2006:4.
Δ1pt = β0 + β1Δ1pt−1 + β2Δ1pt−4 + ∑³j=1 γjQjt + δ1D72:3 + δ2D80:2 + δ3D90:4 + εt,
εt|ψt−1 ∼ N(0, ht),  ht = α0 + α1D74:2 + α2D81:3 + α3D90:2

        Coeff.    Std Error   z statistic   p value
β0       0.024     0.007        3.62         0.000
γ1      −0.016     0.006       −2.73         0.006
γ2       0.030     0.006        4.92         0.000
γ3      −0.038     0.007       −5.30         0.000
β1       0.405     0.070        5.77         0.000
β2       0.138     0.074        1.88         0.061
δ1       0.047     0.012        3.96         0.000
δ2      −0.038     0.011       −3.50         0.001
δ3      −0.015     0.005       −2.87         0.004
α0       0.001     0.000        6.37         0.000
α1       0.003     0.001        2.67         0.008
α2      −0.003     0.001       −3.05         0.002
α3      −0.001     0.000       −5.68         0.000

Log likelihood 449.5      Akaike info criterion −4.45
Schwarz criterion −4.24   Hannan-Quinn criterion −4.37

Again we find three significant breaks in β0, the first and third of which are accompanied by shifts in β1, the dates being as follows:

1975:3 (1974:2–1976:4)

1981:4 (1981:2–1982:2)

1988:3 (1987:2–1989:4).

As in the quarterly difference series, ARCH effects persist over the whole period, but there are no ARCH effects in any of the subperiods defined by these shifts in mean. With the same motivation as above we also find three significant breaks in variance in this case, namely 1974:2, 1980:2, and 1990:2, the first and last dates exactly coinciding with those estimated for the Δ1p series. This again provides an alternative representation of the observed heteroskedasticity, and the corresponding subperiod means and standard deviations are shown in Figures 4.1(b) and 4.3(b), respectively. (Note that regression residuals sum to zero over the full sample period, but not in each individual subperiod, because some coefficients do not vary between subperiods. Hence the plotted values in Figure 4.1 do not coincide with the subperiod means of the inflation data.)

The ARCH regression model and the alternative autoregressive model with intercept breaks in mean and variance are non-nested, and can be compared via an information criterion that takes account of the difference in the number of estimated parameters in each model. We find that the three measures in popular use, namely Akaike’s information criterion, the Hannan-Quinn criterion and the Schwarz criterion, unambiguously select the breaks model, for both Δ1p and Δ4p versions.

A final note on outliers is perhaps in order, as several empirical researchers identify inflation outliers associated with the increase in Value Added Tax in 1979:3 and the introduction of the Poll Tax in 1990:2, and deal with them accordingly. We simply report that none of the modeling exercises presented in this section is sensitive to changes in the treatment of these observations.

5. Measures of inflation forecast uncertainty

Publication of the UK Government’s short-term economic forecasts began on a regular basis in 1968. The 1975 Industry Act introduced a requirement for the Treasury to publish two forecasts each year, and to report their margins of error. The latter requirement was first met in December 1976, with the publication of a table of the mean absolute error (MAE) over the past 10 years’ forecasts of several variables, compiled in the early part of that period from internal, unpublished forecasts. Subsequently it became standard practice to include a column of MAEs in the forecast table – users could then easily form a forecast interval around the given point forecast, if they so wished – although in the 1980s and 1990s these were often accompanied by a warning that they had been computed over a period when the UK economy was more volatile than expected in the future. This publication practice continues to the present day.

We consider the RPI inflation forecasts described as “fourth quarter to fourth quarter” forecasts, published each year in late November – early December in Treasury documents with various titles over the years – Economic Progress Report, Autumn Statement, Financial Statement and Budget Report, now Pre-Budget Report. For comparability with other measures reported as standard errors or standard deviations we multiply the reported forecast MAEs, which are rounded to the nearest quarter percentage point, by 1.253 (= √(π/2)), as Melliss and Whittaker’s (2000) review of Treasury forecasts found that “the evidence supports the hypothesis that errors were normally distributed”. The resulting series is presented in Figure 4.6(a). The series ends in 2003, RPI having been replaced by CPI in the 2004 forecast; no MAE for CPI inflation forecasts has yet appeared. The peak of 5 percentage points occurs in 1979, when the point forecast for annual inflation was 14%; on this occasion, following the new Conservative government’s policy changes, the accompanying text expressed the view that the published forecast MAEs were “likely to understate the true margins of error”.
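The conversion factor follows from the fact that a zero-mean normal error e with standard deviation σ has E|e| = σ√(2/π), so σ = MAE × √(π/2) ≈ 1.253 × MAE. A quick numerical check (the value of σ below is arbitrary, for illustration only):

```python
import math
import random
import statistics

# Under normality, E|e| = sigma * sqrt(2/pi), hence
# sigma = MAE * sqrt(pi/2) ~= 1.253 * MAE.
factor = math.sqrt(math.pi / 2)
print(round(factor, 3))  # 1.253

# Monte Carlo check with a hypothetical sigma.
random.seed(0)
sigma = 2.5
errors = [random.gauss(0, sigma) for _ in range(200_000)]
mae = statistics.fmean(abs(e) for e in errors)
print(round(mae * factor, 2))  # close to sigma = 2.5
```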

For comparative purposes over the same period we also plot comparable forecast standard errors for the two models estimated in Sections 3 and 4 – the ARCH model and the breaks model. In common with the practice of the Treasury and other forecasters we use the annual inflation (Δ4pt) versions of these models. Similarly we regard the “year-ahead” forecast as a five-quarter-ahead forecast, as when forecasting the fourth quarter next year we first have to “nowcast” the fourth quarter this year, given that only third-quarter information is available when the forecast is constructed. The forecast standard errors take account of the estimated autoregressions in projecting five quarters ahead, but this is an “in-sample” or ex post calculation that assumes knowledge of the full-sample estimates at all intermediate points including, for the breaks model, the dates of the breaks; the contribution of parameter estimation error is also neglected. It is seen that the ARCH model’s forecast standard error shows a much more exaggerated peak than that of Treasury forecasts in 1979, and is more volatile over the first half of the period shown, whereas the breaks model’s forecast standard error is by definition constant over subperiods. Of course, in real-time ex ante forecasting the downward shift in forecast standard error could only be recognized with a lag, as discussed below.

From 1996 two additional lines appear in Figure 4.6(a), following developments noted in the Introduction. As late as 1994 the Treasury could assert that “it is the only major forecasting institution regularly to publish alongside its forecasts the average errors from past forecasts” (HM Treasury, 1994, p. 11), but in 1996 density forecasts of inflation appeared on the scene. We consider the Bank of England’s forecasts published around the same time as the Treasury forecasts, namely those appearing in the November issue of the quarterly Inflation Report. From the Bank’s spreadsheets that underlie the fan charts of quarterly forecasts, originally up to two years ahead (nine quarters), later extended to three years, we take the uncertainty measure (standard deviation) of the five-quarter-ahead inflation forecast. This is labeled MPC in Figure 4.6(a), because the Bank’s Monetary Policy Committee, once it was established, in 1997, assumed responsibility for the forecast.

[Fig. 4.6(a). Measures of uncertainty, year-ahead forecasts, 1976–2006. Series plotted: Treasury, ARCH, breaks model, MPC, SEF.]

[Fig. 4.6(b). Measures of disagreement, year-ahead forecasts, 1986–2006. Series plotted: HMT compilation, SEF.]

In 1996 the Bank of England also initiated its quarterly Survey of External Forecasters, at first concerned only with inflation, later including other variables. The quarterly Inflation Report includes a summary of the results of the latest survey, conducted approximately three weeks before publication. The survey asks for both point forecasts and density forecasts, reported as histograms, and from the individual responses Boero, Smith and Wallis (2008) construct measures of uncertainty and disagreement. Questions 1 and 2 of each quarterly survey concern forecasts for the last quarter of the current year and the following year, respectively, and for comparable year-ahead forecasts we take the responses to question 2 in the November surveys. For these forecasts our SEF average individual uncertainty measure is plotted in Figure 4.6(a).

The general appearance of Figure 4.6(a) has few surprises for the careful reader of the preceding sections. The period shown divides into two subperiods, the first with high and variable levels of forecast uncertainty, the second with low and stable levels of forecast uncertainty, where the different estimates lie within a relatively small range. The recent fall in the Treasury forecast standard error may be overdramatized by rounding, whereas the fall in SEF uncertainty is associated by Boero, Smith and Wallis (2008) with the 1997 granting of operational independence to the Bank of England to pursue a monetary policy of inflation targeting. Their quarterly series show a reduction in uncertainty until the May 1999 Survey of External Forecasters, after which the general level is approximately constant. This reduction in uncertainty about future inflation is attributed to the increasing confidence in, and credibility of, the new monetary policy arrangements.

The forecast evaluation question, how reliable are these forecasts, applies to measures of uncertainty just as it does to measures of location, or point forecasts. Wallis (2004) presents an evaluation of the current-quarter and year-ahead density forecasts of inflation published by the MPC and NIESR. He finds that both overstated forecast uncertainty, with more inflation outcomes falling in the central area of the forecast densities, and fewer in the tails, than the densities had led one to expect. Current estimates of uncertainty are based on past forecast errors, and both groups had gone back too far into the past, into a different monetary policy regime with different inflation experience. Over 1997–2002 the MPC’s year-ahead point forecast errors have mean zero and standard deviation 0.42, and the fan chart standard deviation gets closest to this, at 0.48, only at the end (2002:4) of the period considered. Mitchell (2005), for the NIESR forecasts, asks whether the overestimation of uncertainty could have been detected, in real time, had forecasters been alert to the possibility of a break in the variance. Statistical tests can detect breaks only with a lag, and in a forecast context we must also wait to observe the outcome before having information relevant to the possibility of a break in uncertainty at the forecast origin. In a “pseudo real time” recursive experiment it is concluded that tests such as those used in Section 4 could have detected at the end of 1996 that a break in year-ahead forecast uncertainty had occurred in 1993:4. This is exactly the date of the most recent break identified by Meenagh et al. (2009), and Mitchell’s estimate is that it would not have been recognized by statistical testing until three years later; in the meantime forecasters might have been able to make judgmental adjustments.

As an aside we discuss a recent inflation point forecast evaluation study in which the same issue arises. Groen, Kapetanios and Price (2009) compare the inflation forecasts published in the Bank of England’s Inflation Report with those available in pseudo real time from a suite of statistical forecasting models. All of the latter are subject to possible breaks in mean, so following a breaks test, the identified break dates are used to demean the series prior to model estimation; the statistical forecasts are then the remeaned projections from the models. It is found that in no case does a statistical model outperform the published forecasts. The authors attribute the Bank forecasters’ success to their ability to apply judgment in anticipating the important break, namely the change of regime in 1997:3 following Bank independence. As in Mitchell’s study, the ex ante recursively estimated shift is not detected until three years later.

For Treasury forecasts, which started earlier, we can compare the ex ante uncertainty measures in Figure 4.6(a) with the root mean squared errors of year-ahead inflation forecasts reported by Melliss and Whittaker (2000). Over subperiods, dated by forecast origin, these ex post measures are: 1979–1984, 2.3%; 1985–1992, 1.7%; 1993–1996, 0.8%. These are below, often substantially so, the values plotted in Figure 4.6(a), with the exception of the 1990 and 1992 forecasts, again illustrating the difficulty of projecting from past to future in times of change.

In the absence of direct measures of uncertainty it is often suggested that a measure of disagreement among several competing point forecasts may serve as a useful proxy. How useful such a proxy might be can be checked when both measures are available, and there is a literature based on the US Survey of Professional Forecasters that investigates this question, going back to Zarnowitz and Lambros (1987). However, recent research on the SPF data that brings the sample up to date and studies the robustness of previous findings to the choice of measures finds little support for the proposition that disagreement is a useful proxy for uncertainty (Rich and Tracy, 2006, for example). In the present context we provide a visual illustration of this lack of support by plotting in Figure 4.6(b) two measures of disagreement based on year-ahead point forecasts of UK inflation. Although the series are relatively short, we use the same scales in panels (a) and (b) of Figure 4.6 to make the comparison as direct as possible and the lack of a relation as clear as possible. The first series is based on the Treasury publication Forecasts for the UK Economy, monthly since October 1986, which is a summary of published material from a wide range of forecasting organizations. Forecasts for several variables are compiled, and their averages and ranges are also tabulated. We calculate and plot the sample standard deviation of year-ahead inflation forecasts in the November issue of the publication. The shorter series is our corresponding disagreement measure from the Bank of England Survey of External Forecasters (Boero, Smith and Wallis, 2008). Other than a slight downward drift, neither series shows any systematic pattern of variation, nor any correlation of interest with the uncertainty measures. We attribute the lower standard deviation in the SEF to the Bank’s care in selecting a well-informed sample, whereas the Treasury publication is all-encompassing.


6. Uncertainty and the level of inflation

The suggestion by Friedman (1977) that the level and uncertainty of inflation are positively correlated has spawned a large literature, both theoretical and empirical. Simple evidence of such an association is provided by our breaks model where, using Benati’s (2004) characterization of the four subperiods as a period of high inflation and inflation variability, a period of low inflation and inflation variability, and two “in-between” periods, we note that the high and low periods for both measures coincide. Compare the horizontal lines in Figures 4.1(a) and 4.3(a) for the Δ1p model, and in Figures 4.1(b) and 4.3(b) for the Δ4p model. For the unconditional subperiod means and standard deviations of inflation over a shorter period (1965–2003), the data of Meenagh et al. (2009, Table 2) show a stronger association: when their five policy subperiods are ranked by mean inflation and by inflation standard deviation, the ranks exactly coincide. Of course, the empirical literature contains analyses of much greater sophistication although, perhaps surprisingly, they are not subjected to tests of structural stability.

Two leading examples in the empirical literature, on which we draw, are the articles by Baillie, Chung and Tieslau (1996) and Grier and Perry (2000), in which various extensions of the GARCH-in-mean (GARCH-M) model are developed in order to formalize and further investigate Friedman’s proposition. The first authors analyze inflation in 10 countries, the second authors analyze inflation and GDP growth in the US, including subsample analyses. Of particular relevance for the present purpose is the inclusion of the conditional variance (or standard deviation) in the inflation equation and, simultaneously, lagged inflation in the conditional variance equation. Then, with a GARCH representation of conditional heteroskedasticity, the model is:

Δ1pt = β0 + β1Δ1pt−1 + β2Δ1pt−4 + ∑³j=1 γjQjt + δ1√ht + εt    (5)

ht = α0 + α1ε²t−1 + α2ht−1 + δ2Δ1pt−1.    (6)
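The two-way feedback in this system can be made concrete with a minimal simulation of a simplified version of (5)–(6). The seasonal dummies are dropped, all parameter values are hypothetical, and lagged inflation enters the variance through a positivity-preserving max(·, 0), an assumption of this sketch only, not of the chapter’s estimated model:

```python
import numpy as np

# Sketch of the feedback in the GARCH-M system (5)-(6):
# sqrt(h_t) feeds into the mean; lagged inflation feeds into the variance.
rng = np.random.default_rng(1)
T = 500
b0, b1, b2 = 0.01, 0.4, 0.1    # mean equation coefficients (hypothetical)
d1, d2 = 0.5, 0.05             # GARCH-M and level-in-variance feedbacks
a0, a1, a2 = 1e-4, 0.1, 0.8    # GARCH(1,1) coefficients

p = np.zeros(T)                # quarterly inflation
h = np.zeros(T)                # conditional variance
eps = np.zeros(T)
h[0] = a0 / (1 - a1 - a2)
for t in range(1, T):
    # variance equation (6); max(., 0) keeps h_t positive in this sketch
    h[t] = a0 + a1 * eps[t-1]**2 + a2 * h[t-1] + d2 * max(p[t-1], 0.0)
    eps[t] = np.sqrt(h[t]) * rng.standard_normal()
    # mean equation (5): the conditional standard deviation raises the level
    p[t] = (b0 + b1 * p[t-1] + b2 * (p[t-4] if t >= 4 else 0.0)
            + d1 * np.sqrt(h[t]) + eps[t])
```

With d1, d2 > 0 the simulated series displays Friedman’s association: stretches of high inflation are also stretches of high conditional variance.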

Full-sample estimation results show positive feedback effects between the conditional mean and the conditional variance, with a highly significant coefficient on lagged inflation in the variance equation (δ2), and a marginally significant coefficient (p value 0.063) on the conditional standard deviation in the mean equation (δ1); all other coefficients are highly significant. However, the model is not invariant over subperiods. If we simply split the sample at 1980, then the estimate of δ2 retains its significance while the GARCH-M effect drops out from equation (5), which may be associated with the insignificant estimates of α1 and α2 in equation (6). All of these statements apply to each half-sample; however, further division reveals the fragility of the significance of δ2. As a final test we return to the breaks model of Section 4 and add the conditional standard deviation in mean and lagged inflation in variance effects. Equivalently, we allow the separate intercept terms in equations (5) and (6), β0 and α0, to shift at the dates estimated in Section 4; the coefficients α1 and α2 are pre-tested and set to zero. This model dominates the originally estimated model (5)–(6) on the three standard information criteria, yet has completely insignificant estimates of δ1 and δ2. More elaborate models are not able to take us much beyond Friedman’s simple association between the first and second moments of inflation, as reflected in the shifts of our preferred model.


7. Conclusion

Robert Engle’s concept of autoregressive conditional heteroskedasticity was a major breakthrough in the analysis of time series with time-varying volatility, recognized by the joint award of the Bank of Sweden Prize in Economic Sciences in Memory of Alfred Nobel in 2003. “The ARCH model and its extensions, developed mainly by Engle and his students, proved especially useful for modelling the volatility of asset returns, and the resulting volatility forecasts can be used to price financial derivatives and to assess changes over time in the risk of holding financial assets. Today, measures and forecasts of volatility are a core component of financial econometrics, and the ARCH model and its descendants are the workhorse tools for modelling volatility” (Stock and Watson, 2007b, p. 657). His initial application was in macroeconometrics, however, and reflected his location in the United Kingdom at the time. This chapter returns to his study of UK inflation in the light of the well-documented changes in economic policy from his original sample period to the present time.

Investigation of the stability of the ARCH regression model of UK inflation shows that little support for the existence of the ARCH effect would be obtained in a sample period starting later than 1980; data from the earlier period of “monetary policy neglect” (Nelson and Nikolov, 2004) are necessary to support Engle’s formulation. Fuller investigation of the nature of the nonstationarity of inflation finds that a simple autoregressive model with structural breaks in mean and variance, constant within subperiods (and with no unit roots), provides a preferred representation of the observed heteroskedasticity from an economic historian’s point of view. As noted at the outset, however, the ARCH model has a strong forecasting motivation, and forecasters using the breaks model need to anticipate future breaks. Nevertheless, the shifts also provide a simple characterization of the association between the level and uncertainty of inflation suggested by Friedman (1977), which more elaborate models of possible feedbacks are unable to improve upon.

The United Kingdom can claim several firsts in the measurement and public discussion of the uncertainty surrounding economic forecasts by official agencies, and we present a range of measures of inflation forecast uncertainty, from the models considered here and from other UK sources. The few available evaluations of their accuracy indicate that the well-known problems of projecting from past to future in times of change apply equally well to measures of uncertainty as to point forecasts. Although the chapter re-emphasizes the importance of testing the structural stability of econometric relationships, it also acknowledges the difficulty of dealing with instability in a forecast context, for both the levels of variables of interest and, receiving more attention nowadays, their uncertainty.


5

Macroeconomics and ARCH

James D. Hamilton

1. Introduction

One of the most influential econometric papers of the last generation was Engle’s (1982a) introduction of autoregressive conditional heteroskedasticity (ARCH) as a tool for describing how the conditional variance of a time series evolves over time. The ISI Web of Science lists over 2,000 academic studies that have cited this article, and simply reciting the acronyms for the various extensions of Engle’s theme involves a not insignificant commitment of paper (see Table 5.1, or the more detailed glossary in Chapter 8).

The vast majority of empirical applications of ARCH models have studied financial time series such as stock prices, interest rates, or exchange rates (see Bollerslev, Chou and Kroner, 1992). To be sure, there have also been a number of interesting applications of ARCH to macroeconomic questions. Pelloni and Polasek (2003) analyzed the macroeconomic effects of sectoral shocks within a VAR-GARCH framework. Lee, Ni, and Ratti (1995) noted that the conditional volatility of oil prices, as captured by a GARCH model, seems to matter for the magnitude of the effect on GDP of a given movement in oil prices, and Elder and Serletis (2006) use a vector autoregression with GARCH-in-mean elements to describe the direct consequences of oil-price volatility for GDP. Grier and Perry (2000) and Fountas and Karanasos (2007) use such models to conclude that inflation and output volatility also can depress real GDP growth, while Serven (2003) studied the effects of uncertainty on investment spending, and Shields et al. (2005) analyzed the response of uncertainty to macroeconomic shocks.

However, despite these interesting applications, studying volatility has traditionally been a much lower priority for macroeconomists than for researchers in financial markets because the former’s interest is primarily in describing the first moments. There seems to be an assumption among many macroeconomists that, if your primary interest is in the first moment, ARCH has little relevance apart from possible GARCH-M effects.

The purpose of this chapter is to suggest that even if our primary interest is in estimating the conditional mean, having a correct description of the conditional variance can still be quite important, for two reasons. First, hypothesis tests about the mean in a model in which the variance is mis-specified will be invalid. Second, by incorporating the observed features of the heteroskedasticity into the estimation of the conditional mean, substantially more efficient estimates of the conditional mean can be obtained.

Section 2 develops the theoretical basis for these claims, illustrating the potential magnitude of the problem with a small Monte Carlo study and explaining why the popular White (1980) or Newey–West (Newey and West, 1987) corrections may not fully correct for the inference problems introduced by ARCH. The subsequent sections illustrate the practical relevance of these concerns using two examples from the macroeconomics literature. The first application concerns measures of what the market expects the US Federal Reserve’s next move to be, and the second explores the extent to which US monetary policy today is following a fundamentally different rule from that observed 30 years ago.

I recognize that it may require more than these limited examples to persuade macroeconomists to pay more attention to ARCH. Another thing I learned from Rob Engle is that, in addition to coming up with a great idea, it doesn’t hurt if you also have a catchy acronym that people can use to describe what you’re talking about. After all, where would we be today if we all had to pronounce “autoregressive conditional heteroskedasticity” every time we wanted to discuss these issues? However, Table 5.1 reveals that the acronyms one might logically use for “Macroeconomics and ARCH” seem already to be taken. “MARCH”, for example, is already used (twice), as is “ARCH-M”.

Table 5.1. How many ways can you spell “ARCH”? (A partial lexicography)

AARCH      Augmented ARCH                    Bera, Higgins and Lee (1992)
APARCH     Asymmetric power ARCH             Ding, Engle, and Granger (1993)
ARCH-M     ARCH in mean                      Engle, Lilien and Robins (1987)
FIGARCH    Fractionally integrated GARCH     Baillie, Bollerslev, and Mikkelsen (1996)
GARCH      Generalized ARCH                  Bollerslev (1986)
GARCH-t    Student’s t GARCH                 Bollerslev (1987)
GJR-ARCH   Glosten-Jagannathan-Runkle ARCH   Glosten, Jagannathan, and Runkle (1993)
EGARCH     Exponential generalized ARCH      Nelson (1991)
HGARCH     Hentschel GARCH                   Hentschel (1995)
IGARCH     Integrated GARCH                  Bollerslev and Engle (1986)
MARCH      Modified ARCH                     Friedman, Laibson, and Minsky (1989)
MARCH      Multiplicative ARCH               Milhøj (1987)
NARCH      Nonlinear ARCH                    Higgins and Bera (1992)
PNP-ARCH   Partially Nonparametric ARCH      Engle and Ng (1993)
QARCH      Quadratic ARCH                    Sentana (1995)
QTARCH     Qualitative Threshold ARCH        Gourieroux and Monfort (1992)
SPARCH     Semiparametric ARCH               Engle and Gonzalez-Rivera (1991)
STARCH     Structural ARCH                   Harvey, Ruiz, and Sentana (1992)
SWARCH     Switching ARCH                    Hamilton and Susmel (1994)
TARCH      Threshold ARCH                    Zakoian (1994)
VGARCH     Vector GARCH                      Bollerslev, Engle, and Wooldridge (1988)


Fortunately, Engle and Manganelli (2004) have shown us that it’s also OK to mix upper- and lower-case letters, picking and choosing handy vowels or consonants so as to come up with something catchy, as in “CAViaR” (Conditional Autoregressive Value at Risk). In that spirit, I propose to designate “Macroeconomics and ARCH” as “McARCH.” Maybe not a new product so much as new packaging.

Herewith, then, discussion of the relevance of McARCH.

2. GARCH and inference about the mean

We can illustrate some of the issues with the following simple model:

yt = β0 + β1yt−1 + ut    (1)

ut = √ht vt    (2)

ht = κ + αu²t−1 + δht−1   for t = 1, 2, . . . , T

h0 = κ/(1 − α − δ)

vt ∼ i.i.d. N(0, 1).    (3)

Bollerslev (1986, pp. 312–313) showed that if

3α² + 2αδ + δ² < 1,    (4)

then the noncentral unconditional second and fourth moments of ut exist and are given by

μ2 = E(u²t) = κ/(1 − α − δ)    (5)

μ4 = E(u⁴t) = 3κ²(1 + α + δ) / [(1 − α − δ)(1 − δ² − 2αδ − 3α²)].    (6)
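The moment formulas (5)–(6) can be verified numerically. The parameter values below are hypothetical, chosen so that condition (4) holds; simulated sample moments are compared with the analytic ones:

```python
import numpy as np

# Check the unconditional moment formulas (5)-(6) for a GARCH(1,1)
# with Gaussian innovations; alpha, delta chosen so that (4) holds.
kappa, alpha, delta = 0.1, 0.1, 0.8
assert 3 * alpha**2 + 2 * alpha * delta + delta**2 < 1   # condition (4)

mu2 = kappa / (1 - alpha - delta)                         # equation (5)
mu4 = (3 * kappa**2 * (1 + alpha + delta)
       / ((1 - alpha - delta)
          * (1 - delta**2 - 2 * alpha * delta - 3 * alpha**2)))  # equation (6)

rng = np.random.default_rng(7)
n = 400_000
u = np.zeros(n)
h = mu2                      # start h at its unconditional mean
for t in range(n):
    u[t] = np.sqrt(h) * rng.standard_normal()
    h = kappa + alpha * u[t]**2 + delta * h

print(mu2, np.mean(u**2))    # analytic vs simulated second moment
print(mu4, np.mean(u**4))    # analytic vs simulated fourth moment
```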

Consider the consequences if the mean parameters β0 and β1 are estimated by ordinary least squares,

β̂ = (∑ xtx′t)⁻¹ (∑ xtyt),   β = (β0, β1)′,   xt = (1, yt−1)′,

and where all summations are for t = 1, . . . , T. Suppose further that inference is based on the usual OLS formula for the variance, with no correction for heteroskedasticity:

V̂ = s²(∑ xtx′t)⁻¹    (7)

s² = (T − 2)⁻¹ ∑ û²t

ût = yt − x′t β̂.


Consider first the consequences of this inference when the fourth-moment condition (4) is satisfied. For simplicity of exposition, consider the case when the true value of β = 0. Then from the standard consistency results (e.g., Lee and Hansen, 1994; Lumsdaine, 1996) we see that

T·V̂ = s²(T⁻¹ ∑ xtx′t)⁻¹    (8)

   →p E(u²t) [1, E(yt−1); E(yt−1), E(y²t−1)]⁻¹ = [μ2, 0; 0, 1],

(matrix rows separated by semicolons), since E(yt−1) = 0 and E(y²t−1) = μ2 when the true β = 0.

In other words, the OLS formulas will lead us to act as if √T β̂1 is approximately N(0, 1) if the true value of β1 is zero. But notice

√T(β̂ − β) = (T⁻¹ ∑ xtx′t)⁻¹ (T^(−1/2) ∑ xtut).    (9)

Under the null hypothesis, the term inside the second summation, xtut, is a martingale difference sequence with variance

E(u²t xtx′t) = [E(u²t), E(u²t ut−1); E(ut−1u²t), E(u²t u²t−1)].

When the (2,2) element of this matrix is finite, it then follows from the Central Limit Theorem (e.g., Hamilton, 1994, p. 173) that

T^(−1/2) ∑ yt−1ut →L N(0, E(u²t u²t−1)).    (10)

To calculate the value of this variance, recall (e.g., Hamilton, 1994, p. 666) that the GARCH(1,1) structure for ut implies an ARMA(1,1) structure for u²t:

u²t = κ + (δ + α)u²t−1 + ωt − δωt−1

for ωt a white noise process. It follows from the first-order autocovariance for an ARMA(1,1) process (e.g., Box and Jenkins, 1976, p. 76) that

E(u²t u²t−1) = E[(u²t − μ2)(u²t−1 − μ2)] + μ2² = ρ(μ4 − μ2²) + μ2²    (11)

for

ρ = [1 − (α + δ)δ]α / [1 + δ² − 2(α + δ)δ].    (12)
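Equation (12) can likewise be checked by simulation, since ρ is the first-order autocorrelation of u²t when the fourth moment exists. The parameter values below are hypothetical and satisfy condition (4); this is a sketch, not part of the original derivation:

```python
import numpy as np

# Verify (12): rho is the lag-1 autocorrelation of u_t^2 for a GARCH(1,1)
# whose fourth moment exists. Parameter values are hypothetical.
kappa, alpha, delta = 0.1, 0.1, 0.8
rho = ((1 - (alpha + delta) * delta) * alpha
       / (1 + delta**2 - 2 * (alpha + delta) * delta))   # equation (12)

rng = np.random.default_rng(11)
n = 400_000
u2 = np.zeros(n)                       # the squared-error process u_t^2
h = kappa / (1 - alpha - delta)        # start at the unconditional variance
for t in range(n):
    u2[t] = h * rng.standard_normal()**2
    h = kappa + alpha * u2[t] + delta * h

# Sample lag-1 autocorrelation of u_t^2.
x, y = u2[:-1] - u2.mean(), u2[1:] - u2.mean()
rho_hat = (x * y).mean() / u2.var()
print(round(rho, 3), round(rho_hat, 3))
```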


Substituting (11), (10) and (8) into (9),

\[
\sqrt{T}\,\hat\beta_1 \;\overset{L}{\to}\; N(0, V_{11}), \qquad
V_{11} = \frac{\rho\mu_4 + (1-\rho)\mu_2^2}{\mu_2^2}
= \rho\,\frac{3(1 + \alpha + \delta)(1 - \alpha - \delta)}{1 - \delta^2 - 2\alpha\delta - 3\alpha^2} + (1 - \rho),
\]

with the last equality following from (5) and (6). Notice that V11 ≥ 1, with equality if and only if α = 0. Thus OLS treats √T β̂1 as approximately N(0, 1), whereas the true asymptotic distribution is Normal with a variance bigger than unity, meaning that the OLS t-test will systematically reject more often than it should. The probability of rejecting the null hypothesis that β1 = 0 (even though the null hypothesis is true) gets bigger and bigger as the parameters get closer to the region at which the fourth moment becomes infinite, at which point the asymptotic rejection probability becomes unity. Figure 5.1 plots the rejection probability as a function of α and δ. If these parameters are in the range typically found in estimates of GARCH processes, an OLS t-test with no correction for heteroskedasticity would spuriously reject with arbitrarily high probability for a sufficiently large sample.
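The asymptotic size distortion can be computed directly from (11)-(12) together with the Gaussian GARCH(1,1) fourth-moment ratio μ4/μ2² = 3(1 + α + δ)(1 − α − δ)/(1 − δ² − 2αδ − 3α²). The sketch below is my own illustration (the function name is hypothetical); it returns the large-sample probability that the uncorrected OLS t-test rejects at the nominal 5% level.

```python
from math import sqrt
from statistics import NormalDist

def ols_rejection_probability(alpha, delta, crit=1.96):
    """Large-sample P(|t| > crit) for the uncorrected OLS t-test of
    beta1 = 0 when the errors are Gaussian GARCH(1,1)."""
    # A finite fourth moment requires 1 - delta^2 - 2*alpha*delta - 3*alpha^2 > 0;
    # otherwise V11 is infinite and the asymptotic rejection probability is one.
    if 1 - delta**2 - 2*alpha*delta - 3*alpha**2 <= 0:
        return 1.0
    # eq. (12): first-order autocorrelation of u_t^2
    rho = (1 - (alpha + delta)*delta)*alpha / (1 + delta**2 - 2*(alpha + delta)*delta)
    # mu4/mu2^2 for a Gaussian GARCH(1,1)
    kurt = 3*(1 + alpha + delta)*(1 - alpha - delta) / (1 - delta**2 - 2*alpha*delta - 3*alpha**2)
    v11 = rho*kurt + (1 - rho)
    # sqrt(T)*beta1_hat is N(0, v11) but is judged against N(0, 1) critical values
    return 2*(1 - NormalDist().cdf(crit / sqrt(v11)))

print(ols_rejection_probability(0.0, 0.6))    # -> about 0.05: no ARCH, no distortion
print(ols_rejection_probability(0.35, 0.6))   # -> 1.0: the text's parameter values
```

With α = 0 the test has correct size; as (α, δ) approach the boundary at which the fourth moment becomes infinite, the returned probability rises toward one, as Figure 5.1 shows.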

The good news is that the rate of divergence is pretty slow: it may take a lot of observations before the accumulated excess kurtosis overwhelms the other factors. I simulated 10,000 samples from the above Gaussian GARCH process for samples of size T = 100, 200, 1,000, and 10,000 (and 1,000 samples of size 100,000), where the true values were specified as follows:

β0 = β1 = 0
κ = 2
α = 0.35
δ = 0.6

Fig. 5.1. Asymptotic rejection probability for OLS t-test that autoregressive coefficient is zero as a function of GARCH(1,1) parameters α and δ. Note: Null hypothesis is actually true and test has nominal size of 5%.

The solid line in Figure 5.2 plots the fraction of samples for which an OLS t-test of β1 = 0 exceeds two in absolute value. Thinking we're only rejecting a true null hypothesis 5% of the time, we would in fact do so 15% of the time in a sample of size T = 100 and 33% of the time when T = 1,000.
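This experiment is easy to replicate. The following is an illustrative Python sketch of my own (function name and simulation details are mine, and the random draws will not match the chapter's): it draws Gaussian GARCH(1,1) samples with the parameter values above and records how often the uncorrected OLS t-statistic for β1 exceeds 2 in absolute value.

```python
import random
import math

def simulate_rejection_rate(T, n_sims=1000, kappa=2.0, alpha=0.35,
                            delta=0.6, seed=0):
    """Fraction of simulated samples in which |t| > 2 for the OLS slope."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # start the recursion at the unconditional variance kappa/(1-alpha-delta)
        h = kappa / (1 - alpha - delta)
        u_prev2 = h
        y = []
        for _ in range(T + 1):
            h = kappa + alpha * u_prev2 + delta * h   # GARCH(1,1) variance
            u = math.sqrt(h) * rng.gauss(0.0, 1.0)
            u_prev2 = u * u
            y.append(u)                               # beta0 = beta1 = 0, so y_t = u_t
        # OLS of y_t on (1, y_{t-1}) with the textbook (non-robust) standard error
        x, yy = y[:-1], y[1:]
        n = len(yy)
        mx, my = sum(x) / n, sum(yy) / n
        sxx = sum((a - mx) ** 2 for a in x)
        sxy = sum((a - mx) * (b - my) for a, b in zip(x, yy))
        b1 = sxy / sxx
        b0 = my - b1 * mx
        s2 = sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, yy)) / (n - 2)
        if abs(b1) / math.sqrt(s2 / sxx) > 2:
            rejections += 1
    return rejections / n_sims

print(simulate_rejection_rate(T=100))   # well above the nominal 5% (about 15% in the text)
```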

As one might imagine, for a given sample size, the OLS t-statistic is more poorly behaved if the true innovations υt in (2) are Student's t with 5 degrees of freedom (the dashed line in Figure 5.2) rather than Normal.

What happens if instead of the OLS formula (7) for the variance of β̂ we use White's (1980) heteroskedasticity-consistent estimate,

\[
\tilde V = \Bigl(\sum x_t x_t'\Bigr)^{-1}\Bigl(\sum \hat u_t^2 x_t x_t'\Bigr)\Bigl(\sum x_t x_t'\Bigr)^{-1}? \tag{13}
\]


Fig. 5.2. Fraction of samples in which OLS t-test leads to rejection of the null hypothesis that autoregressive coefficient is zero as a function of the sample size for regression with Gaussian errors (solid line) and Student's t errors (dashed line). Note: Null hypothesis is actually true and test has nominal size of 5%.


Fig. 5.3. Average value of √T times estimated standard error of estimated autoregressive coefficient as a function of the sample size for White standard error (solid line) and OLS standard error (dashed line)

ARCH is not a special case of the class of heteroskedasticity for which the estimate (13) is intended to be robust, and indeed, unlike typical cases, T Ṽ is not a consistent estimate of a given matrix:

\[
T\tilde V = \Bigl(T^{-1}\sum x_t x_t'\Bigr)^{-1}\Bigl(T^{-1}\sum \hat u_t^2 x_t x_t'\Bigr)\Bigl(T^{-1}\sum x_t x_t'\Bigr)^{-1}.
\]

The first and last matrices will converge as before,

\[
T^{-1}\sum x_t x_t' \;\overset{p}{\to}\; \begin{bmatrix} 1 & 0 \\ 0 & \mu_2 \end{bmatrix},
\]

but T⁻¹Σ û_t² x_t x_t' will diverge if the fourth moment μ4 is infinite. Figure 5.3 plots the simulated value for the square root of the lower-right element of T Ṽ for the Gaussian simulations above. However, this growth in the estimated variance of √T β̂1 is exactly right, given the growth of the actual variance of √T β̂1 implied by the GARCH specification. And a t-test based on (13) seems to perform reasonably well for all sample sizes (see the second row of Table 5.2). The small-sample size distortion for the White test is a little worse for Student's t compared with Normal errors, though still acceptable. Table 5.2 also explores the consequences of using the Newey–West (Newey and West, 1987) generalization of the White formula to allow for serial correlation, using a lag window of q = 5:

\[
\hat V^{*} = \Bigl(\sum_{t=1}^{T} x_t x_t'\Bigr)^{-1}
\Biggl[\sum_{t=1}^{T} \hat u_t^2 x_t x_t'
+ \sum_{v=1}^{q}\Bigl(1 - \frac{v}{q+1}\Bigr)\sum_{t=v+1}^{T} \hat u_t \hat u_{t-v}\bigl(x_t x_{t-v}' + x_{t-v} x_t'\bigr)\Biggr]
\Bigl(\sum_{t=1}^{T} x_t x_t'\Bigr)^{-1}.
\]
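For concreteness, here is a compact numpy sketch of my own (not the chapter's code; the function name is hypothetical) that computes the OLS formula (7), the White estimate (13), and the Bartlett-weighted Newey–West estimate for the regression of y_t on x_t = (1, y_{t−1})′.

```python
import numpy as np

def ols_white_nw_se(y, q=5):
    """OLS slope and (OLS, White, Newey-West) standard errors for
    a regression of y_t on x_t = (1, y_{t-1})'."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    yy = y[1:]
    T, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ yy)
    u = yy - X @ beta
    # equation (7): homoskedastic OLS variance
    V_ols = (u @ u / (T - k)) * XtX_inv
    # equation (13): White's heteroskedasticity-consistent variance
    S = (X * (u ** 2)[:, None]).T @ X
    V_white = XtX_inv @ S @ XtX_inv
    # Newey-West: add Bartlett-weighted autocovariance terms to the middle matrix
    S_nw = S.copy()
    for v in range(1, q + 1):
        w = 1.0 - v / (q + 1.0)
        Gamma = (X[v:] * (u[v:] * u[:-v])[:, None]).T @ X[:-v]
        S_nw += w * (Gamma + Gamma.T)
    V_nw = XtX_inv @ S_nw @ XtX_inv
    # standard errors of the slope coefficient beta1
    return beta[1], tuple(float(np.sqrt(V[1, 1])) for V in (V_ols, V_white, V_nw))
```

With homoskedastic i.i.d. data all three standard errors are close; with GARCH errors the White and Newey–West versions grow with the sample, as in Figure 5.3.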


Table 5.2. Fraction of samples for which indicated hypothesis is rejected by test of nominal size 0.05

Errors Normally distributed
H0                              Test based on              T = 100  T = 200  T = 1000
β1 = 0 (H0 is true)             OLS standard error         0.152    0.200    0.327
β1 = 0 (H0 is true)             White standard error       0.072    0.063    0.054
β1 = 0 (H0 is true)             Newey–West standard error  0.119    0.092    0.062
εt homoskedastic (H0 is false)  White TR²                  0.570    0.874    1.000
εt homoskedastic (H0 is false)  Engle TR²                  0.692    0.958    1.000

Errors Student's t with 5 degrees of freedom
β1 = 0 (H0 is true)             OLS standard error         0.174    0.229    0.389
β1 = 0 (H0 is true)             White standard error       0.081    0.070    0.065
β1 = 0 (H0 is true)             Newey–West standard error  0.137    0.106    0.079
εt homoskedastic (H0 is false)  White TR²                  0.427    0.691    0.991
εt homoskedastic (H0 is false)  Engle TR²                  0.536    0.822    0.998

These results (reported in the third row of the two panels of Table 5.2) illustrate one potential pitfall of relying too much on "robust" statistics to solve small-sample problems: the Newey–West statistic has more serious size distortions than does the simple White statistic for all specifications investigated.

Another reason one might not want to assume that White or Newey–West standard errors can solve all the problems is that these formulas only correct the standard error for β̂, but are still using the OLS estimate itself, which from Figure 5.3 was seen not to be √T convergent. By contrast, even if the fourth moment does not exist, maximum likelihood estimation as an alternative to OLS is still √T convergent. Hence the relative efficiency gains of MLE relative to OLS become infinite as the sample size grows for typical values of GARCH parameters. Engle (1982a, p. 999) observed that it is also possible to have an infinite relative efficiency gain for some parameter values even with exogenous explanatory variables and ARCH as opposed to GARCH errors.

Results here are also related to the well-known result that ARCH will render inaccurate traditional tests for serial correlation in the mean. That fact has previously been noted, for example, by Milhøj (1985, 1987), Diebold (1988), Stambaugh (1993), and Bollerslev and Mikkelsen (1996). However, none of the above seems to have commented on the fact (though it is implied by the formulas they use) that the test size goes to unity as the fourth moment approaches infinity, or noted the implications as here for OLS regression.

Finally, I observe that just checking for a difference between the OLS and the White standard errors will sometimes not be sufficient to detect these problems. The difference between the OLS variance estimate (7) and the White estimate (13) will be governed by the size of

\[
\sum (s^2 - \hat u_t^2)\, x_t x_t'.
\]

White (1980) suggested a formal test of whether this magnitude is sufficiently small on the basis of an OLS regression of û_t² on the vector ψ_t consisting of the unique elements of x_t x_t'. In the present case, ψ_t = (1, y_{t−1}, y²_{t−1})'. White showed that, under the null hypothesis that the OLS standard errors are correct, TR² from a regression of û_t² on ψ_t would have a χ²(2) distribution. The next-to-last row of each panel of Table 5.2 reports the fraction of samples for which this test would (correctly) reject the null hypothesis. It would miss about half the time in a sample as small as 100 observations but is more reliable for larger sample sizes.

Alternatively, one can look at Engle's (1982a, 1982b) analogous test for the null of homoskedasticity against the alternative of qth-order ARCH by looking at TR² from a regression of û_t² on (1, û²_{t−1}, û²_{t−2}, . . . , û²_{t−q})', which asymptotically has a χ²(q) distribution under the null. The last rows in Table 5.2 report the rejection frequency for this test using q = 3 lags. Not surprisingly, as this test is designed specifically for the ARCH class of alternatives whereas the White test is not, this test has a little more power. Its advantage over the White test for homoskedasticity is presumably greater in many macro applications in which x_t includes a number of variables and their lags, in which case the vector ψ_t can become unwieldy, whereas the Engle test remains a simple χ²(q) regardless of the size of x_t.
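The Engle test is straightforward to implement. A minimal numpy sketch of my own (names are hypothetical): regress û_t² on a constant and q of its own lags and return T times the R².

```python
import numpy as np

def engle_arch_lm(u, q=3):
    """Engle's TR^2 statistic for ARCH(q); asymptotically chi-squared(q)
    under the null of no conditional heteroskedasticity."""
    u2 = np.asarray(u, dtype=float) ** 2
    Y = u2[q:]
    # regressors: constant plus u2 lagged 1..q
    Z = np.column_stack(
        [np.ones(len(Y))] + [u2[q - j:-j] for j in range(1, q + 1)])
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    resid = Y - Z @ coef
    tss = ((Y - Y.mean()) ** 2).sum()
    r2 = 1.0 - (resid ** 2).sum() / tss
    return len(Y) * r2
```

The statistic is compared with a χ²(q) critical value (7.81 for q = 3 at the 5% level); large values signal ARCH in the residuals.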

The philosophy of McARCH, then, is quite simple. The Engle TR² diagnostic should be calculated routinely in any macroeconomic analysis. If a violation of homoskedasticity is found, one should compare the OLS estimates with maximum likelihood to make sure that the inference is robust. The following sections illustrate the potential importance of doing so with two examples from applied macroeconomics.

3. Application 1: Measuring market expectations of what the Federal Reserve is going to do next

My first example is adapted from Hamilton (2009). The Fed funds rate is a market-determined interest rate at which banks lend reserves to one another overnight. This interest rate is extremely sensitive to the supply of reserves created by the Fed, and in recent years monetary policy has been implemented in terms of a clearly announced target for the Fed funds rate that the Fed intends to achieve.

A critical factor that determines how Fed actions affect the economy is expectations by the public as to what the Fed is going to do next, as discussed, for example, in my (Hamilton, 2009) paper. One natural place to look for an indication of what those expectations might be is the Fed funds futures market.

Let t = 1, 2, . . . , T index monthly observations. In the empirical results reported here, t = 1 corresponds to October 1988 and the last observation (T = 213) is June 2006. For each month, we're interested in what the market expects for the average effective Fed funds rate over that month, denoted r_t. For the empirical estimates reported in this section, r_t is measured in basis points, so that for example r_t = 525 corresponds to an annual interest rate of 5.25%.

On any business day, one can enter into a futures contract through the Chicago Board of Trade whose settlement is based on what the value of r_{t+j} actually turns out to be for some future month. The terms of a j-month-ahead contract traded on the last day of month t can be translated¹ into an interest rate f_t^{(j)} such that, if r_{t+j} turns out to be less than f_t^{(j)}, then the seller of the contract has to compensate the buyer a certain amount (specifically, $41.67 on a standard contract) for every basis point by which f_t^{(j)} exceeds r_{t+j}. If f_t^{(j)} < r_{t+j}, the buyer pays the seller. As f_t^{(j)} is known as of the end of month t but r_{t+j} will not be known until the end of month t + j, the buyer of the contract is basically making a bet that r_{t+j} will be less than f_t^{(j)}. If the marginal market participant were risk neutral, it would be the case that

\[
f_t^{(j)} = E_t(r_{t+j}) \tag{14}
\]

where E_t(·) denotes the mathematical expectation on the basis of any information publicly available as of the last day of month t. If (14) holds, we could just look at the value of f_t^{(j)} to infer what market participants expect the Federal Reserve to do in the coming months.

However, previous investigators such as Sack (2004) and Piazzesi and Swanson (2008) have concluded that (14) does not hold. The simplest way to investigate this claim is to construct the forecast error implied by the one-month-ahead contract,

\[
u_t^{(1)} = r_t - f_{t-1}^{(1)},
\]

and test whether this error indeed has mean zero, as it should if (14) were correct. For contracts at longer horizons j > 1, one can look at the monthly change in contract terms,

\[
u_t^{(j)} = f_t^{(j-1)} - f_{t-1}^{(j)}.
\]

If (14) holds, then u_t^{(j)} would also be a martingale difference sequence:

\[
u_t^{(j)} = E_t(r_{t+j-1}) - E_{t-1}(r_{t+j-1}).
\]

One simple test is then to perform the regression

\[
u_t^{(j)} = \mu^{(j)} + \varepsilon_t^{(j)}
\]

and test the null hypothesis that μ^{(j)} = 0; this is of course just the usual t-test for a sample mean. Table 5.3 reports the results of this test using one-, two-, and three-month-ahead futures contracts. For the historical sample, the one-month-ahead futures contract f_t^{(1)} overestimated the value of r_{t+1} by an average of 2.66 basis points and f_t^{(j)} overestimated the value of f_{t+1}^{(j−1)} by almost 4 basis points. One interpretation is that there is a risk premium built into these contracts. Another possibility is that the market participants failed to recognize fully the chronic decline in interest rates over this period.
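The test just described is the ordinary t-test for a sample mean; as a minimal sketch (my own helper, not the chapter's code):

```python
import math

def mean_bias_test(u):
    """Ordinary t-test that the series u has mean zero: returns the
    sample mean, its standard error, and the t-statistic."""
    n = len(u)
    mean = sum(u) / n
    var = sum((x - mean) ** 2 for x in u) / (n - 1)   # unbiased sample variance
    se = math.sqrt(var / n)
    return mean, se, mean / se
```

Applied to the forecast errors u_t^{(j)}, the ratio of the sample mean to its standard error is the statistic behind the OLS p values in Table 5.3.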

¹Specifically, if P_t is the price of the contract agreed to by the buyer and seller on day t, then f_t = 100 × (100 − P_t).


Table 5.3. OLS estimates of bias in monthly fed funds futures forecast errors

Dependent variable (u_t^{(j)})  Estimated mean (μ^{(j)})  Standard error  OLS p value  ARCH(4) LM p value  Log likelihood
j = 1 month                     −2.66                     0.75            0.001        0.006               −812.61
j = 2 months                    −3.17                     1.06            0.003        0.204               −884.70
j = 3 months                    −3.74                     1.27            0.003        0.001               −922.80

Before putting too much credence in such interpretations, however, recall that the theory (14) implies that u_t^{(j)} should be a martingale difference sequence but makes no claims about predictability of its variance. Figure 5.4 reveals that each of the series u_t^{(j)} exhibits some clustering of volatility and a significant decline in variability over time, in addition to occasional very large outliers. Engle's TR² test for omitted fourth-order ARCH finds very strong evidence of conditional heteroskedasticity at least for u_t^{(1)} and u_t^{(3)}; see Table 5.3. Hence if we are interested in a more accurate estimate of the bias and statistical test of its significance, we might want to model these features of the data.

Fig. 5.4. Plots of the forecast errors u_t^{(j)} as a function of month t, based on j = one-, two-, or three-month-ahead futures contracts

Hamilton (2009) calculated maximum likelihood estimates for parameters of the following EGARCH specification (with (j) superscripts on all variables and parameters suppressed for ease of readability):

\[
u_t = \mu + \sqrt{h_t}\,\varepsilon_t \tag{15}
\]

\[
\log h_t - \gamma' z_t = \alpha(|\varepsilon_{t-1}| - k_2) + \delta(\log h_{t-1} - \gamma' z_{t-1}) \tag{16}
\]

\[
z_t = (1,\, t/1000)'
\]

\[
k_2 = E|\varepsilon_t| = \frac{2\sqrt{\nu}\,\Gamma[(\nu+1)/2]}{(\nu-1)\sqrt{\pi}\,\Gamma(\nu/2)}
\]

for ε_t a Student's t variable with ν degrees of freedom and Γ(·) the gamma function:

\[
\Gamma(s) = \int_0^{\infty} x^{s-1} e^{-x}\,dx.
\]

The log likelihood is then found from

\[
\sum_{t=1}^{T} \log f(u_t \mid \mathcal{U}_{t-1}; \theta) \tag{17}
\]

\[
f(u_t \mid \mathcal{U}_{t-1}; \theta) = \bigl(k_1/\sqrt{h_t}\bigr)\bigl[1 + (\varepsilon_t^2/\nu)\bigr]^{-(\nu+1)/2}
\]

\[
k_1 = \Gamma[(\nu+1)/2]\big/\bigl[\Gamma(\nu/2)\sqrt{\nu\pi}\bigr].
\]

Given numerical values for the parameter vector θ = (μ, γ′, α, δ, ν)′ and observed data U_T = (u_1, u_2, . . . , u_T)′ we can then begin the iteration (16) for t = 1 by setting h_1 = exp(γ′z_0). Plugging this into (15) gives us a value for ε_1, which from (16) gives us the number for h_2. Iterating in this fashion gives the sequence {h_t, ε_t} for t = 1, . . . , T from which the log likelihood (17) can be evaluated for the specified numerical value of θ. One then tries another guess for θ in order to numerically maximize the likelihood function. Asymptotic standard errors can be obtained from numerical second derivatives of the log likelihood as in Hamilton (1994, equation [5.8.3]).
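The evaluation just described can be sketched directly. The following illustrative Python is my own (names and argument order are assumptions, and the numerical maximization over θ is omitted); it iterates (16) and accumulates the Student-t log density (17).

```python
import math

def egarch_t_loglik(u, mu, g1, g2, alpha, delta, nu):
    """Evaluate the log likelihood (17) for the EGARCH-t model (15)-(16),
    with gamma = (g1, g2)' and z_t = (1, t/1000)'. Requires nu > 1."""
    lg = math.lgamma
    k1 = math.exp(lg((nu + 1) / 2) - lg(nu / 2)) / math.sqrt(nu * math.pi)
    k2 = (2 * math.sqrt(nu) * math.exp(lg((nu + 1) / 2) - lg(nu / 2))
          / ((nu - 1) * math.sqrt(math.pi)))          # k2 = E|eps_t|
    def gz(t):                                        # gamma'z_t
        return g1 + g2 * t / 1000.0
    loglik = 0.0
    for t, ut in enumerate(u, start=1):
        if t == 1:
            logh = gz(0)                              # h_1 = exp(gamma'z_0)
        else:
            # eq. (16), using eps and logh carried over from the previous step
            logh = (gz(t) + alpha * (abs(eps) - k2)
                    + delta * (logh - gz(t - 1)))
        eps = (ut - mu) / math.exp(0.5 * logh)        # eq. (15)
        loglik += (math.log(k1) - 0.5 * logh
                   - (nu + 1) / 2 * math.log(1.0 + eps * eps / nu))
    return loglik
```

Maximizing this function over θ (for example with a generic numerical optimizer) reproduces the estimation strategy described in the text.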

Maximum likelihood parameter estimates are reported in Table 5.4. Adding these features provides an overwhelming improvement in fit, with a likelihood ratio test statistic well in excess of 100 when adding just four parameters to a simple Gaussian specification with constant variance. The very low estimated degrees of freedom results from the big outliers in the data, and both the serial dependence (δ) and trend parameter (γ2) for the variance are extremely significant.

A very remarkable result is that the estimates for the mean of the forecast error μ actually switch signs, shrink by an order of magnitude, and become far from statistically significant. Evidently the sample means of u_t^{(j)} are more influenced by negative outliers and observations early in the sample than they should be.

Table 5.4. Maximum likelihood estimates (asymptotic standard errors in parentheses) for EGARCH model of Fed funds futures forecast errors

Horizon (j)                         u_t^{(1)}     u_t^{(2)}     u_t^{(3)}
Mean (μ)                            0.12 (0.24)   0.43 (0.34)   0.27 (0.67)
Log average variance (γ1)           5.73 (0.42)   6.47 (0.51)   7.01 (0.54)
Trend in variance (γ2)              −22.7 (3.1)   −23.6 (3.3)   −17.1 (3.8)
|ε_{t−1}| (α)                       0.18 (0.07)   0.15 (0.07)   0.30 (0.12)
log h_{t−1} (δ)                     0.63 (0.16)   0.74 (0.22)   0.84 (0.11)
Student's t degrees of freedom (ν)  2.1 (0.4)     2.2 (0.4)     4.1 (1.2)
Log likelihood                      −731.08       −793.38       −860.16

Note that for this example, the problem is not adequately addressed by simply replacing OLS standard errors with White standard errors, as when the regressors consist only of a constant term, the two would be identical. Moreover, whenever, as here, there is an affirmative objective of obtaining accurate estimates of a parameter (the possible risk premium incorporated in these prices) as opposed solely to testing a hypothesis, the concern is with the quality of the coefficient estimate itself rather than the correct size of a hypothesis test.

4. Application 2: Using the Taylor Rule to summarize changes in Federal Reserve policy

One of the most influential papers for both macroeconomic research and policy over the last decade has been John Taylor's (1993) proposal of a simple rule that the central bank should follow in setting an interest rate like the Fed funds rate r_t. Taylor's proposal called for the Fed to raise the interest rate by an amount governed by a parameter ψ1 when the observed inflation rate π_t is higher than it wishes (so as to bring inflation back down), and to raise the interest rate by an amount governed by ψ2 when y_t, the gap between real GDP and its potential value, is positive:

\[
r_t = \psi_0 + \psi_1 \pi_t + \psi_2 y_t.
\]

In this equation, the value of ψ0 reflects factors such as the Fed's long-run inflation target and the equilibrium real interest rate. There are a variety of ways such an expression has been formulated in practice, such as "forward-looking" specifications, in which the Fed is responding to what it expects to happen next to inflation and output, and "backward-looking" specifications, in which lags are included to capture expectations formation and adjustment dynamics.
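As a tiny numerical illustration of the rule (the default coefficients below are the commonly cited illustrative values of 1.5 on inflation and 0.5 on the output gap, not estimates from this chapter):

```python
def taylor_rate(pi_t, y_t, psi0=1.0, psi1=1.5, psi2=0.5):
    """Interest rate implied by r_t = psi0 + psi1*pi_t + psi2*y_t.
    Coefficient defaults are illustrative, not estimates from the text."""
    return psi0 + psi1 * pi_t + psi2 * y_t

# e.g. inflation at 2% with output at potential:
print(taylor_rate(2.0, 0.0))   # -> 4.0
```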

A number of studies have looked at the way that the coefficients in such a relation may have changed over time, including Judd and Rudebusch (1998), Clarida, Galí and Gertler (2000), Jalil (2004), and Boivin and Giannoni (2006). Of particular interest has been the claim that the coefficient on inflation ψ1 has increased relative to the 1970s, and that this increased willingness on the part of the Fed to fight inflation has been a factor helping to make the US economy become more stable. In this chapter, I will explore the variant investigated by Judd and Rudebusch, whose reduced-form representation is

\[
\Delta r_t = \gamma_0 + \gamma_1 \pi_t + \gamma_2 y_t + \gamma_3 y_{t-1} + \gamma_4 r_{t-1} + \gamma_5 \Delta r_{t-1} + v_t. \tag{18}
\]

Here t = 1, 2, . . . , T now will index quarterly data, with t = 1 in my sample corresponding to 1956:Q1 and T = 205 corresponding to 2007:Q1. The value of r_t for a given quarter is the average of the three monthly series for the effective Fed funds rate, with Δr_t = r_t − r_{t−1}, and for empirical results here is reported as percent rather than basis points, e.g., r_t = 5.25 when the average Fed funds rate over the three months of the quarter is 5.25%. Inflation π_t is measured as 100 times the difference between the natural logarithm of the implicit GDP deflator for quarter t and its value for the corresponding quarter of the preceding year, with data taken from Bureau of Economic Analysis Table 1.1.9. As in Judd and Rudebusch, the output gap y_t was calculated as

\[
y_t = \frac{100\,(Y_t - Y_t^{*})}{Y_t^{*}}
\]

for Y_t the level of real GDP (in billions of chained 2000 dollars, from BEA Table 1.1.6) and Y_t^* the series for potential GDP from the Congressional Budget Office (obtained from the St. Louis FRED database). Judd and Rudebusch focused on certain rearrangements of the parameters in (18), though here I will simply report results in terms of the reduced-form estimates themselves. The term v_t in (18) is the regression error.

Table 5.5 presents results from OLS estimation of (18) using the full sample of data. Of particular interest are γ1 and γ2, the contemporary responses to inflation and output, respectively. Table 5.6 then re-estimates the relation, allowing for separate coefficients since 1979:Q3, when Paul Volcker became Chair of the Federal Reserve. The OLS results reproduce the findings of the many researchers noted above that monetary policy seems to have responded much more vigorously to disturbances since 1979, with the inflation coefficient γ1 increasing by 0.26 and the output coefficient γ2 increasing by 0.64.

However, the White standard errors for the coefficients on d_tπ_t and d_ty_t are almost twice as large as the OLS standard errors, and suggest that the increased response to inflation is in fact not statistically significant and the increased response to output is measured very imprecisely. Moreover, Engle's LM test for the null of Gaussian errors with no heteroskedasticity against the alternative of fourth-order ARCH leads to overwhelming rejection of the null hypothesis.² All of which suggests that, if we are indeed interested in measuring the magnitudes by which these coefficients have changed, it is preferable to adjust not just the standard errors but the parameter estimates themselves in light of the dramatic ARCH displayed in the data.

Table 5.5. Fixed-coefficient Taylor Rule as estimated from full sample OLS regression

Regressor                  Coefficient    Std error (OLS)  Std error (White)
Constant                   0.06           0.13             0.18
π_t                        0.13           0.04             0.06
y_t                        0.37           0.07             0.11
y_{t−1}                    −0.27          0.07             0.10
r_{t−1}                    −0.08          0.03             0.03
Δr_{t−1}                   0.14           0.07             0.15
TR² for ARCH(4) (p value)  23.94 (0.000)
Log likelihood             −252.26

Table 5.6. Taylor Rule with separate pre- and post-Volcker parameters as estimated by OLS regression (d_t = 1 for t > 1979:Q2)

Regressor                  Coefficient    Std error (OLS)  Std error (White)
constant                   0.37           0.19             0.19
π_t                        0.17           0.07             0.04
y_t                        0.18           0.08             0.07
y_{t−1}                    −0.07          0.08             0.07
r_{t−1}                    −0.21          0.07             0.06
Δr_{t−1}                   0.42           0.11             0.13
d_t                        −0.50          0.24             0.30
d_tπ_t                     0.26           0.09             0.16
d_ty_t                     0.64           0.14             0.24
d_ty_{t−1}                 −0.55          0.14             0.21
d_tr_{t−1}                 0.05           0.08             0.08
d_tΔr_{t−1}                −0.53          0.13             0.24
TR² for ARCH(4) (p value)  45.45 (0.000)
Log likelihood             −226.80

I therefore estimated the following GARCH-t generalization of (18):

\[
y_t = x_t'\beta + v_t
\]

\[
v_t = \sqrt{h_t}\,\varepsilon_t
\]

\[
h_t = \kappa + \tilde h_t
\]

\[
\tilde h_t = \alpha(v_{t-1}^2 - \kappa) + \delta \tilde h_{t-1} \tag{19}
\]

with ε_t a Student's t random variable with ν degrees of freedom. Iteration on (19) is initialized with h̃1 = 0. The log likelihood is then evaluated exactly as in (17). Maximum likelihood estimates are reported in Table 5.7.

Once again generalizing a homoskedastic Gaussian specification is overwhelmingly favored by the data, with a comparison of the specifications in Tables 5.6 and 5.7 producing a likelihood ratio χ²(4) statistic of 183.34. The degrees of freedom for the Student's t distribution are only 2.29, and the implied GARCH process is highly persistent (α + δ = 0.82). Of particular interest is the fact that the changes in the Fed's response to inflation and output are now considerably smaller than suggested by the OLS estimates. The change in γ1 is now estimated to be only 0.09 and the change in γ2 has dropped to 0.05 and no longer appears to be statistically significant.

²Siklos and Wohar (2005) also make this point.

Table 5.7. Taylor Rule with separate pre- and post-Volcker parameters as estimated by GARCH-t maximum likelihood (d_t = 1 for t > 1979:Q2)

Regressor         Coefficient    Asymptotic std error
constant          0.13           0.08
π_t               0.06           0.03
y_t               0.14           0.03
y_{t−1}           −0.12          0.03
r_{t−1}           −0.07          0.03
Δr_{t−1}          0.47           0.09
d_t               −0.03          0.12
d_tπ_t            0.09           0.04
d_ty_t            0.05           0.07
d_ty_{t−1}        0.02           0.07
d_tr_{t−1}        −0.01          0.03
d_tΔr_{t−1}       −0.01          0.11
GARCH parameters
constant          0.015          0.010
α                 0.11           0.05
δ                 0.71           0.07
ν                 2.29           0.48
Log likelihood    −135.13

Figure 5.5 offers some insight into what produces these results. The top panel illustrates the tendency for interest rates to exhibit much more volatility at some times than others, with the 1979:Q2–1982:Q3 episode particularly dramatic. The bottom panel plots observations on the pairs (y_t, Δr_t) in the second half of the sample. The apparent positive slope in that scatter plot is strongly influenced by the observations in the 1979–1982 period. If one allowed the possibility of serial dependence in the squared residuals, one would give less weight to the 1979–1982 observations, resulting in a flatter slope estimate over 1979–2007 relative to OLS.

This is not to attempt to overturn the conclusion of earlier researchers that there has been a change in Fed policy in the direction of a more active policy. A comparison of the changing-parameter specification of Table 5.7 with a fixed-parameter GARCH specification produces a χ²(4) likelihood ratio statistic of 18.22, which is statistically significant with a p value of 0.001. Nevertheless, the magnitude of this change appears to be substantially smaller than one would infer on the basis of OLS estimates of the parameters.

Nor is this discussion meant to displace the large and thoughtful literature on possible changes in the Taylor Rule, which has raised a number of other substantive issues not explored here. These include whether one wants to use real-time or subsequently revised data (Orphanides, 2001), the distinction between the "backward-looking" Taylor Rule explored here and "forward-looking" specifications (Clarida, Galí, and Gertler, 2000), and continuous evolution of parameters rather than a sudden break (Jalil, 2004; Boivin, 2006). The simple exercise undertaken nevertheless does in my mind establish the potential importance for macroeconomists to check for the presence of ARCH even when their primary interest is in the conditional mean.

Fig. 5.5. Change in Fed funds rate for the full sample (1956:Q2–2007:Q1), and scatter plot for later subsample (1979:Q2–2007:Q1) of change in Fed funds rate against deviation of GDP from potential

5. Conclusions

The reader may note that both of the examples I have used to illustrate the potential relevance of McARCH use the Fed funds rate as the dependent variable. This is not entirely an accident. Although Kilian and Goncalves (2004) concluded that most macro series exhibit some ARCH, the Fed funds rate may be the macro series for which one is most likely to observe wild outliers and persistent volatility clustering, regardless of the data frequency or subsample. It is nevertheless, as the examples used here illustrate, a series that features very importantly for some of the most fundamental questions in macroeconomics.


The rather dramatic way in which accounting for outliers and ARCH can change one's inference that was seen in these examples presumably would not be repeated for every macroeconomic relation estimated. However, routinely checking something like a TR² statistic, or the difference between OLS and White standard errors, seems a relatively costless and potentially quite beneficial habit. And the assumption by many practitioners that we can avoid all these problems simply by always relying on the White standard errors may not represent best possible practice.


6

Macroeconomic Volatility and Stock Market Volatility, World-Wide

Francis X. Diebold and Kamil Yilmaz

1. Introduction

The financial econometrics literature has been strikingly successful at measuring, modeling, and forecasting time-varying return volatility, contributing to improved asset pricing, portfolio management, and risk management, as surveyed for example in Andersen, Bollerslev, Christoffersen and Diebold (2006a, 2006b). Much of the financial econometrics of volatility is of course due to Rob Engle, starting with the classic contribution of Engle (1982a).

Interestingly, the subsequent financial econometric volatility literature, although massive, is largely silent on the links between asset return volatility and its underlying determinants. Instead, one typically proceeds in reduced-form fashion, modeling and forecasting volatility but not modeling or forecasting the effects of fundamental macroeconomic developments.¹ In particular, the links between asset market volatility and fundamental

Acknowledgments: We gratefully dedicate this paper to Rob Engle on the occasion of his 65th birthday. The research was supported by the Guggenheim Foundation, the Humboldt Foundation, and the National Science Foundation. For outstanding research assistance we thank Chiara Scotti and Georg Strasser. For helpful comments we thank the Editor and Referee, as well as Joe Davis, Aureo DePaula, Jonathan Wright, and participants at the Penn Econometrics Lunch, the Econometric Society 2008 Winter Meetings in New Orleans, and the Engle Festschrift Conference.

¹The strongly positive volatility–volume correlation has received attention, as in Clark (1973), Tauchen and Pitts (1983), and many others, but that begs the question of what drives volume, which again remains largely unanswered.


volatility remain largely unstudied; effectively, asset market volatility is modeled in isolation of fundamental volatility.²

Ironically, although fundamental volatility at business cycle frequencies has been studied recently, as for example in Ramey and Ramey (1995) and several of the papers collected in Pinto and Aizenman (2005), that literature is largely macroeconomic, focusing primarily on the link between fundamental volatility and subsequent real growth.³ Hence the links between fundamental volatility and asset market volatility again remain largely unstudied; fundamental volatility is modeled in isolation of asset market volatility.

Here we focus on stock market volatility. The general failure to link macroeconomic fundamentals to asset return volatility certainly holds true for the case of stock returns. There are few studies attempting to link underlying macroeconomic fundamentals to stock return volatility, and the studies that do exist have been largely unsuccessful. For example, in a classic and well-known contribution using monthly data from 1857 to 1987, Schwert (1989) attempts to link stock market volatility to real and nominal macroeconomic volatility, economic activity, financial leverage, and stock trading activity. He finds very little. Similarly and more recently, using sophisticated regime-switching econometric methods for linking return volatility and fundamental volatility, Calvet, Fisher and Thompson (2006) also find very little. The only robust finding seems to be that the stage of the business cycle affects stock market volatility; in particular, stock market volatility is higher in recessions, as found by Schwert (1989) and echoed in Hamilton and Lin (1996), among others.

In this chapter we provide an empirical investigation of the links between fundamental volatility and stock market volatility. Our exploration is motivated by financial economic theory, which suggests that the volatility of real activity should be related to stock market volatility, as in Shiller (1981) and Hansen and Jagannathan (1991).4 In addition, and crucially, our empirical approach exploits cross-sectional variation in fundamental and stock market volatilities to uncover links that would likely be lost in a pure time series analysis.

This chapter is part of a nascent literature that explores the links between macroeconomic fundamentals and stock market volatility. Engle and Rangel (2008) is a prominent example. Engle and Rangel propose a spline-GARCH model to isolate low-frequency volatility, and they use the model to explore the links between macroeconomic fundamentals and low-frequency volatility.5 Engle, Ghysels and Sohn (2006) is another interesting example, blending the spline-GARCH approach with the mixed data sampling (MIDAS) approach of Ghysels, Santa-Clara, and Valkanov (2005). The above-mentioned Engle

2 By “fundamental volatility,” we mean the volatility of underlying real economic fundamentals. From the vantage point of a single equity, this would typically correspond to the volatility of real earnings or dividends. From the vantage point of the entire stock market, it would typically correspond to the volatility of real GDP or consumption.

3 Another strand of macroeconomic literature, including for example Levine (1997), focuses on the link between fundamental volatility and financial market development. Hence, although related, it too misses the mark for our purposes.

4 Hansen and Jagannathan provide an inequality between the “Sharpe ratios” for the equity market and the real fundamental, and hence implicitly link equity volatility and fundamental volatility, other things equal.

5 Earlier drafts of our paper were completed contemporaneously with and independently of Engle and Rangel.


et al. macro-volatility literature, however, focuses primarily on dynamics, whereas in this chapter we focus primarily on the cross-section, as we now describe.

2. Data

Our goal is to elucidate the relationship, if any, between real fundamental volatility and real stock market volatility in a broad cross-section of countries. To do so, we ask whether time-averaged fundamental volatility appears linked to time-averaged stock market volatility. We now describe our data construction methods in some detail; a more detailed description, along with a complete catalog of the underlying data and sources, appears in the Appendix.

2.1. Fundamental and stock market volatilities

First consider the measurement of fundamental volatility. We use data on real GDP and real personal consumption expenditures (PCE) for many countries. The major source for both variables is the World Development Indicators (WDI) of the World Bank.

We measure fundamental volatility in two ways. First, we calculate it as the standard deviation of GDP (or consumption) growth, which is a measure of unconditional fundamental volatility. Alternatively, following Schwert (1989), we use residuals from an AR(3) model fit to GDP or consumption growth. This is a measure of conditional fundamental volatility, or put differently, a measure of the volatility of innovations to fundamentals.6
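In code, the two measures can be sketched as follows. This is a minimal illustration on synthetic growth data; the AR(3) is fit by ordinary least squares, and the function names are our own, not the authors' code:

```python
import numpy as np

def unconditional_vol(growth):
    """Unconditional volatility: sample standard deviation of growth."""
    return np.std(growth, ddof=1)

def conditional_vol(growth, p=3):
    """Conditional volatility: standard deviation of residuals from an
    AR(p) model fit to growth by OLS, as in Schwert (1989)."""
    growth = np.asarray(growth, dtype=float)
    y = growth[p:]
    # Design matrix: a constant plus p lags of growth
    X = np.column_stack(
        [np.ones(len(y))] + [growth[p - k:len(growth) - k] for k in range(1, p + 1)]
    )
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return np.std(resid, ddof=1)

rng = np.random.default_rng(0)
growth = rng.normal(0.02, 0.03, size=40)   # 40 years of synthetic annual growth
u_vol = unconditional_vol(growth)
c_vol = conditional_vol(growth)
```

For serially uncorrelated growth the two measures nearly coincide; when growth is persistent, the conditional (innovation) measure is the smaller of the two.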

Now consider stock market volatility. We parallel our above-discussed approach to fundamental volatility, using the major stock index series from the IMF’s International Financial Statistics (IFS). Stock indices are not available for some countries and periods. For those countries we obtain data from alternative sources, among which are Datastream, the Standard and Poors Emerging Markets Database, and the World Federation of Exchanges. Finally, using consumer price index data from the IFS, we convert to real stock returns.

We measure real stock market volatility in identical fashion to fundamental volatility, calculating both unconditional and conditional versions. Interestingly, the AR(3) coefficients are statistically significant for a few developing countries, which have small and illiquid stock markets.7

2.2. On the choice of sample period

Our empirical analysis requires data on four time series for each country: real GDP, real consumption expenditures, stock market returns and consumer price inflation. In terms of data availability, countries fall into three groups. The first group is composed

6 The latter volatility measure is more relevant for our purposes, so we focus on it for the remainder of this chapter. The empirical results are qualitatively unchanged, however, when we use the former measure.

7 Again, however, we focus on the conditional version for the remainder of this chapter.


of mostly industrial countries, with data series available for all four variables from the 1960s onward.

The second group of countries is composed mostly of developing countries. In many developing countries, stock markets became an important means of raising capital only in the 1990s; indeed, only a few of the developing countries had active stock markets before the mid-1980s. Hence the second group has shorter available data series, especially for stock returns.

One could of course deal with the problems of the second group simply by discarding it, relying only on the cross-section of industrialized countries. Doing so, however, would radically reduce cross-sectional variation, producing potentially severe reductions in statistical efficiency. Hence we use all countries in the first and second groups, but we start our sample in 1983, reducing the underlying interval used to calculate volatilities to 20 years.

The third group of countries is composed mostly of the transition economies and some African and Asian developing countries, for which stock markets became operational only in the 1990s. As a result, we can include these countries only if we construct volatilities using roughly a 10-year interval of underlying data. Switching from a 20-year to a 10-year interval, the number of countries in the sample increases from around 40 to around 70 (which is good), but using a 10-year interval produces much noisier volatility estimates (which is bad). We feel that, on balance, the bad outweighs the good, so we exclude the third group of countries from our basic analysis, which is based on underlying annual data. However, and as we will discuss, we are able to base some of our analyses on underlying quarterly data, and in those cases we include some of the third group of countries.

In closing this subsection, we note that, quite apart from the fact that data limitations preclude use of pre-1980s data, use of such data would probably be undesirable even if it were available. In particular, the growing literature on the “Great Moderation” – decreased variation of output around trend in industrialized countries, starting in the early 1980s – suggests the appropriateness of starting our sample in the early 1980s, so we take 1983–2002 as our benchmark sample.8 Estimating fundamental volatility using both pre- and post-1983 data would mix observations from the high and low fundamental volatility eras, potentially producing distorted inference.

3. Empirical results

Having described our data and choice of benchmark sample, we now proceed with the empirical analysis, exploring the relationship between stock market volatility and fundamental volatility in a broad cross-section covering approximately 40 countries.

8 On the “Great Moderation” in developed countries, see Kim and Nelson (1999a), McConnell and Perez-Quiros (2000) and Stock and Watson (2002b). Evidence for fundamental volatility moderation in developing countries also exists, although it is more mixed. For example, Montiel and Serven (2006) report a decline in GDP growth volatility from roughly 4% in the 1970s and 1980s to roughly 3% in the 1990s. On the other hand, Kose, Prasad, and Terrones (2006) find that developing countries experience increases in consumption volatility following financial liberalization, and many developing economies have indeed liberalized in recent years.


3.1. Distributions of volatilities in the cross-section

We begin in Figure 6.1 by showing kernel density estimates of the cross-country distributions of fundamental volatility and stock return volatility. The densities indicate wide dispersion in volatilities across countries. Moreover, the distributions tend to be right-skewed, as developing countries often have unusually high volatility. The log transformation largely reduces the right skewness; hence we work with log volatilities from this point onward.9
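The skew-reducing effect of the log transformation is easy to illustrate: if cross-country volatilities are roughly log-normal, the level distribution is right-skewed while the log distribution is close to symmetric. A sketch with synthetic data (this is not the authors' code, and it assumes SciPy is available for the kernel density estimator):

```python
import numpy as np
from scipy.stats import gaussian_kde, skew

rng = np.random.default_rng(1)
# Synthetic cross-section of 45 country volatilities, drawn log-normal
# to mimic the right-skewed pattern in Figure 6.1 (illustrative only)
vols = np.exp(rng.normal(loc=1.0, scale=0.5, size=45))

skew_levels = skew(vols)          # strongly positive: right-skewed in levels
skew_logs = skew(np.log(vols))    # much closer to zero after taking logs

# Kernel density estimates of the cross-section, in levels and in logs
kde_levels = gaussian_kde(vols)
kde_logs = gaussian_kde(np.log(vols))
```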

3.2. The basic relationship

We present our core result in Figure 6.2, which indicates a clear positive relationship between stock return and GDP volatilities, as summarized by the scatterplot of stock market volatility against GDP volatility, together with a fitted nonparametric regression curve.10 The fitted curve, moreover, appears nearly linear. (A fitted linear regression gives a slope coefficient of 0.38 with a robust t-statistic of 4.70, and an adjusted R² of 0.26.)

When we swap consumption for GDP, the positive relationship remains, as shown in Figure 6.3, although it appears less linear. In any event, the positive cross-sectional relationship between stock market volatility and fundamental volatility contrasts with Schwert’s (1989) earlier-mentioned disappointing results for the US time series.
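A cross-sectional regression of this kind, with heteroskedasticity-robust (White) standard errors of the sort behind a robust t-statistic, can be sketched as follows (synthetic data and our own minimal implementation, not the chapter's actual estimates):

```python
import numpy as np

def ols_robust(y, x):
    """OLS of y on a constant and x, with White (HC0) heteroskedasticity-
    robust standard errors; returns (slope, robust t-statistic)."""
    X = np.column_stack([np.ones_like(x), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    u = y - X @ beta
    # Sandwich estimator: (X'X)^{-1} [sum u_i^2 x_i x_i'] (X'X)^{-1}
    cov = XtX_inv @ (X.T @ (X * u[:, None] ** 2)) @ XtX_inv
    return beta[1], beta[1] / np.sqrt(cov[1, 1])

# Hypothetical cross-section of 43 countries with a true slope of 0.4
rng = np.random.default_rng(2)
log_gdp_vol = rng.normal(1.0, 0.5, size=43)
log_ret_vol = 2.5 + 0.4 * log_gdp_vol + rng.normal(0, 0.3, size=43)
slope, t_robust = ols_robust(log_ret_vol, log_gdp_vol)
```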

3.3. Controlling for the level of initial GDP

Inspection of the country acronyms in Figures 6.2 and 6.3 reveals that both stock market and fundamental volatilities are higher in developing (or newly industrializing) countries. Conversely, industrial countries cluster toward low stock market and fundamental volatility. This dependence of volatility on stage of development echoes the findings of Koren and Tenreyro (2007) and has obvious implications for the interpretation of our results. In particular, is it a development story, or is there more? That is, is the apparent positive dependence between stock market volatility and fundamental volatility due to common positive dependence of fundamental and stock market volatilities on a third variable, stage of development, or would the relationship exist even after controlling for stage of development?

To explore this, we follow a two-step procedure. In the first step, we regress all variables on initial GDP per capita, to remove stage-of-development effects (as proxied by initial GDP). In the second step, we regress residual stock market volatility on residual fundamental volatility.
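The two-step procedure is essentially a partial-regression (Frisch–Waugh) exercise. A sketch with a linear first stage for simplicity (the chapter's first-stage fits are actually nonparametric; all data and names below are hypothetical):

```python
import numpy as np

def residualize(y, control):
    """First step: regress y on a constant and the control, return residuals."""
    X = np.column_stack([np.ones_like(control), control])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Hypothetical data: both volatilities decline with initial GDP per capita,
# and return volatility also loads on fundamental volatility (slope 0.4)
rng = np.random.default_rng(3)
init_gdp = rng.normal(9.0, 1.5, size=43)               # log initial GDP per capita
fund_vol = 2.0 - 0.15 * init_gdp + rng.normal(0, 0.2, size=43)
ret_vol = 2.0 + 0.4 * fund_vol - 0.10 * init_gdp + rng.normal(0, 0.2, size=43)

# Second step: relate the purged residuals to each other
resid_ret = residualize(ret_vol, init_gdp)
resid_fund = residualize(fund_vol, init_gdp)
second_step_slope = np.polyfit(resid_fund, resid_ret, 1)[0]
```

If the volatility link were purely a stage-of-development story, the second-step slope would be near zero; a surviving positive slope indicates a relationship beyond the common dependence on initial GDP.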

In Figures 6.4–6.6 we display the first-step regressions, which are of independent interest, providing a precise quantitative summary of the dependence of all variables (stock market volatility, GDP volatility and consumption volatility) on initial GDP per capita. The dependence is clearly negative, particularly if we discount the distortions to the basic relationships caused by India and Pakistan, which have very low

9 The approximate log-normality of volatility in the cross-section parallels the approximate unconditional log-normality documented in the time series by Andersen, Bollerslev, Diebold and Ebens (2001).

10 We use the LOWESS locally weighted regression procedure of Cleveland (1979).
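For readers unfamiliar with LOWESS: at each evaluation point, Cleveland's procedure fits a linear regression weighted by a tricube kernel over the nearest neighbors. A bare-bones sketch of that idea (our own simplified version, omitting the robustness iterations of the full algorithm):

```python
import numpy as np

def lowess_fit(x, y, frac=0.6):
    """Simplified LOWESS: at each x[i], fit a linear regression weighted by
    a tricube kernel over the nearest frac*n points, and record the local fit.
    (Cleveland's full procedure adds robustness iterations, omitted here.)"""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = max(2, int(frac * n))
    fitted = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        h = max(np.sort(d)[k - 1], 1e-12)                # bandwidth: k-th nearest distance
        w = np.clip(1.0 - (d / h) ** 3, 0.0, None) ** 3  # tricube weights
        X = np.column_stack([np.ones(n), x - x[i]])
        WX = X * w[:, None]
        beta, *_ = np.linalg.lstsq(WX.T @ X, WX.T @ y, rcond=None)
        fitted[i] = beta[0]                              # local intercept = fit at x[i]
    return fitted

x_grid = np.linspace(0.0, 1.0, 30)
curve = lowess_fit(x_grid, 1.0 + 2.0 * x_grid)  # a linear relation is recovered exactly
```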


Fig. 6.1. Kernel density estimates, volatilities and fundamentals, 1983–2002
Note: We plot kernel density estimates of real stock return volatility (using data for 43 countries), real GDP growth volatility (45 countries), and real consumption growth volatility (41 countries), in both levels and logs. All volatilities are standard deviations of residuals from AR(3) models fitted to annual data, 1983–2002. For comparison we also include plots of best-fitting normal densities (dashed).


Fig. 6.2. Real stock return volatility and real GDP growth volatility, 1983–2002
Note: We show a scatterplot of real stock return volatility against real GDP growth volatility, with a nonparametric regression fit superimposed, for 43 countries. All volatilities are log standard deviations of residuals from AR(3) models fitted to annual data, 1983–2002.

Fig. 6.3. Real stock return volatility and real PCE growth volatility, 1983–2002
Note: We show a scatterplot of real stock return volatility against real consumption growth volatility, with a nonparametric regression fit superimposed, for 39 countries. All volatilities are log standard deviations of residuals from AR(3) models fitted to annual data, 1983–2002.


Fig. 6.4. Real stock return volatility and initial real GDP per capita, 1983–2002
Note: We show a scatterplot of real stock return volatility against initial (1983) real GDP per capita, with a nonparametric regression fit superimposed, for 43 countries. All volatilities are log standard deviations of residuals from AR(3) models fitted to annual data, 1983–2002.

Fig. 6.5. Real GDP growth volatility and initial GDP per capita, 1983–2002
Note: We show a scatterplot of real GDP growth volatility against initial (1983) real GDP per capita, with a nonparametric regression fit superimposed, for 45 countries. All volatilities are log standard deviations of residuals from AR(3) models fitted to annual data, 1983–2002. The number of countries is two more than in Figure 6.2 because we include Uruguay and Denmark here, whereas we had to exclude them from Figure 6.2 due to missing stock return data.


Fig. 6.6. Real PCE growth volatility and initial GDP per capita, 1983–2002
Note: We show a scatterplot of real consumption growth volatility against initial (1983) real GDP per capita, with a nonparametric regression fit superimposed, for 41 countries. All volatilities are log standard deviations of residuals from AR(3) models fitted to annual data, 1983–2002. The number of countries is two more than in Figure 6.3 because we include Uruguay and Denmark here, whereas we had to exclude them from Figure 6.3 due to missing stock return data.

initial GDP per capita, yet relatively low stock market, and especially fundamental, volatility.

We display second-step results for the GDP fundamental in Figure 6.7. The fitted curve is basically flat for low levels of GDP volatility, but it clearly becomes positive as GDP volatility increases. A positive relationship also continues to obtain when we switch to the consumption fundamental, as shown in Figure 6.8. Indeed the relationship between stock market volatility and consumption volatility would be stronger after controlling for initial GDP if we were to drop a single and obvious outlier (Philippines), which distorts the fitted curve at low levels of fundamental volatility, as Figure 6.8 makes clear.

4. Variations and extensions

Thus far we have studied stock market and fundamental volatility using underlying annual data, 1983–2002. Here we extend our analysis in two directions. First, we incorporate higher frequency data when possible (quarterly for GDP and monthly, aggregated to quarterly, for stock returns). Second, we use the higher frequency data in a panel-data framework to analyze the direction of causality between stock market and fundamental volatility.


Fig. 6.7. Real stock return volatility and real GDP growth volatility, 1983–2002, controlling for initial GDP per capita
Note: We show a scatterplot of real stock return volatility against real GDP growth volatility with a nonparametric regression fit superimposed, for 43 countries, controlling for the effects of initial GDP per capita via separate first-stage nonparametric regressions of each variable on 1983 GDP per capita. All volatilities are log standard deviations of residuals from AR(3) models fitted to annual data, 1983–2002.

Fig. 6.8. Real stock return volatility and real PCE growth volatility, 1983–2002, controlling for initial GDP per capita
Note: We show a scatterplot of real stock return volatility against real consumption growth volatility with a nonparametric regression fit superimposed, for 39 countries, controlling for the effects of initial GDP per capita via separate first-stage nonparametric regressions of each variable on 1983 GDP per capita. All volatilities are log standard deviations of residuals from AR(3) models fitted to annual data, 1983–2002.


Fig. 6.9. Real stock return volatility and real GDP growth volatility, 1999.1–2003.3
Note: We show a scatterplot of real stock return volatility against real GDP growth volatility, with a nonparametric regression fit superimposed, for 40 countries. All volatilities are log standard deviations of residuals from AR(4) models fitted to quarterly data, 1999.1–2003.3.

4.1. Cross-sectional analysis based on underlying quarterly data

As noted earlier, the quality of developing-country data starts to improve in the 1980s. In addition, the quantity improves, with greater availability and reliability of quarterly GDP data. We now use that quarterly data, 1984.1–2003.3, constructing and examining volatilities over four five-year spans: 1984.1–1988.4, 1989.1–1993.4, 1994.1–1998.4, and 1999.1–2003.3.

The number of countries increases considerably as we move through the four periods. Hence let us begin with the fourth period, 1999.1–2003.3. We show in Figure 6.9 the fitted regression of stock market volatility on GDP volatility. The relationship is still positive; indeed it appears much stronger than the one discussed earlier, based on annual data 1983–2002 and shown in Figure 6.2. Perhaps this is because the developing-country GDP data have become less noisy in recent times.

Now let us consider the other periods. We obtained qualitatively identical results when repeating the analysis of Figure 6.9 for each of the three earlier periods: stock market volatility is robustly and positively linked to fundamental volatility. To summarize those results compactly, we show in Figure 6.10 the regression fitted to all the data, so that, for example, a country with data available for all four periods has four data points in the figure. The positive relationship between stock market and fundamental volatility is clear.11

11 Two outliers on the left (corresponding to Spain in the first two windows) distort the fitted curve and should be discounted.


Fig. 6.10. Real stock return volatility and real GDP growth volatility, 1984.1–2003.3
Note: We show a scatterplot of real stock return volatility against real GDP growth volatility, with a nonparametric regression fit superimposed, for 43 countries. All volatilities are log standard deviations of residuals from AR(4) models fitted to quarterly data over four consecutive five-year windows (1984.1–1988.4, 1989.1–1993.4, 1994.1–1998.4, 1999.1–2003.3).

4.2. Panel analysis of causal direction

Thus far we have intentionally and exclusively emphasized the cross-sectional relationship between stock market and fundamental volatility, and we found that the two are positively related. However, economics suggests not only correlation between fundamentals and stock prices, and hence from fundamental volatility to stock market volatility, but also (Granger) causation.12

Hence in this subsection we continue to exploit the rich dispersion in the cross-section, but we no longer average out the time dimension; instead, we incorporate it explicitly via a panel analysis. Moreover, we focus on a particular panel analysis that highlights the value of incorporating cross-sectional information relative to a pure time series analysis. In particular, we follow Schwert’s (1989) two-step approach to obtain estimates of time-varying quarterly stock market and GDP volatilities, country-by-country, and then we test causal hypotheses in a panel framework that facilitates pooling of the cross-country data.

Briefly, Schwert’s approach proceeds as follows. In the first step, we fit autoregressions to stock market returns and GDP, and we take absolute values of the associated residuals, which are effectively (crude) quarterly realized volatilities of stock market and fundamental innovations, in the jargon of Andersen, Bollerslev, Diebold and Ebens (2001).

12 There may of course also be bi-directional causality (feedback).


In the second stage, we transform away from realized volatilities and toward conditional volatilities by fitting autoregressions to those realized volatilities, and keeping the fitted values. We repeat this for each of the 46 countries.
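For a single country, the two steps might be sketched as follows (our own simplified illustration on hypothetical data; Schwert's original procedure includes refinements that we omit here):

```python
import numpy as np

def ar_residuals(series, p):
    """Fit an AR(p) by OLS (with intercept) and return the residuals."""
    y = np.asarray(series, dtype=float)
    Y = y[p:]
    X = np.column_stack(
        [np.ones(len(Y))] + [y[p - k:len(y) - k] for k in range(1, p + 1)]
    )
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return Y - X @ beta

def schwert_conditional_vol(returns, p=4):
    """Step 1: absolute AR(p) residuals serve as crude realized volatilities.
    Step 2: fitted values from an AR(p) on those realized volatilities give
    a time-varying conditional volatility estimate."""
    rv = np.abs(ar_residuals(returns, p))           # realized volatility proxy
    Y = rv[p:]
    X = np.column_stack(
        [np.ones(len(Y))] + [rv[p - k:len(rv) - k] for k in range(1, p + 1)]
    )
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return X @ beta                                 # fitted values = conditional volatility

rng = np.random.default_rng(5)
quarterly_returns = rng.normal(0.0, 0.05, size=120)  # hypothetical country series
cond_vol = schwert_conditional_vol(quarterly_returns)
```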

We analyze the resulting 46 pairs of stock market and fundamental volatilities in two ways. The first follows Schwert and exploits only time series variation, estimating a separate VAR model for each country and testing causality. The results, which are not reported here, mirror Schwert’s, failing to identify causality in either direction in the vast majority of countries.

The second approach exploits cross-sectional variation along with time series variation. We simply pool the data across countries, allowing for fixed effects. First we estimate a fixed-effects model with GDP volatility depending on three lags of itself and three lags of stock market volatility, which we use to test the hypothesis that stock market volatility does not Granger cause GDP volatility. Next we estimate a fixed-effects model with stock market volatility depending on three lags of itself and three lags of GDP volatility, which we use to test the hypothesis that GDP volatility does not Granger cause stock market volatility.
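The pooled test can be sketched as follows: apply the within transformation (demeaning by country) to remove fixed effects, run the restricted and unrestricted pooled regressions, and form the usual F-statistic for joint exclusion of the other variable's three lags. This is our own simplified implementation on a synthetic panel, including our own degrees-of-freedom accounting:

```python
import numpy as np
from scipy.stats import f as f_dist

def granger_fe(y_panel, x_panel, p=3):
    """Test 'x does not Granger-cause y' in a pooled panel with country
    fixed effects removed by the within (demeaning) transformation.
    y_panel, x_panel: dicts mapping country -> 1-D volatility series.
    Returns (F-statistic, p-value) for joint exclusion of x's p lags."""
    Y_parts, Xr_parts, Xu_parts = [], [], []
    for c in y_panel:
        y = np.asarray(y_panel[c], dtype=float)
        x = np.asarray(x_panel[c], dtype=float)
        yc = y[p:]
        Ly = np.column_stack([y[p - k:len(y) - k] for k in range(1, p + 1)])
        Lx = np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])
        Z = np.column_stack([Ly, Lx])
        # Within transformation: demeaning removes the country fixed effect
        Y_parts.append(yc - yc.mean())
        Xr_parts.append(Ly - Ly.mean(axis=0))
        Xu_parts.append(Z - Z.mean(axis=0))
    Y = np.concatenate(Y_parts)
    Xr, Xu = np.vstack(Xr_parts), np.vstack(Xu_parts)

    def ssr(X):
        beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
        e = Y - X @ beta
        return e @ e

    ssr_r, ssr_u = ssr(Xr), ssr(Xu)
    df2 = len(Y) - 2 * p - len(y_panel)      # residual df, net of fixed effects
    F = ((ssr_r - ssr_u) / p) / (ssr_u / df2)
    return F, 1.0 - f_dist.cdf(F, p, df2)

# Hypothetical 5-country panel in which x clearly causes y
rng = np.random.default_rng(6)
y_panel, x_panel = {}, {}
for c in range(5):
    x = rng.normal(0.0, 1.0, size=60)
    y = np.zeros(60)
    for t in range(1, 60):
        y[t] = 0.5 * x[t - 1] + 0.3 * rng.normal()
    y_panel[c], x_panel[c] = y, x
F_stat, p_value = granger_fe(y_panel, x_panel, p=3)
```

Pooling is what delivers the power here: each country's short series contributes to a single F-test instead of 46 underpowered ones.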

We report the results in Table 6.1, using quarterly real stock market volatility and real GDP growth volatility for the panel of 46 countries, 1961.1–2003.3. We test noncausality from fundamental volatility (FV) to return volatility (RV), and vice versa, and we present F-statistics and corresponding p values for both hypotheses. We do this for 30 sample windows, with the ending date fixed at 2003.3 and the starting date varying from 1961.1, 1962.1, . . . , 1990.1. There is no evidence against the hypothesis that stock market volatility does not Granger cause GDP volatility; that is, it appears that stock market volatility does not cause GDP volatility. In sharp contrast, the hypothesis that GDP volatility does not Granger cause stock market volatility is overwhelmingly rejected: evidently GDP volatility does cause stock market volatility.

The intriguing result of one-way causality from fundamental volatility to stock return volatility deserves additional study, as the forward-looking equity market might be expected to predict macro fundamentals, rather than the other way around. Of course here we focus on predicting fundamental and return volatilities, rather than fundamentals or returns themselves. There are subtleties of volatility measurement as well. For example, we do not use implied stock return volatilities, which might be expected to be more forward-looking.13

5. Concluding remark

This chapter is part of a broader movement focusing on the macro-finance interface. Much recent work focuses on high-frequency data, and some of that work focuses on the high-frequency relationships among returns, return volatilities and fundamentals (e.g., Andersen, Bollerslev, Diebold and Vega, 2003, 2007). Here, in contrast, we focus on international cross-sections obtained by averaging over time. Hence this chapter can be interpreted not only as advocating more exploration of the fundamental volatility/return

13 Implied volatilities are generally not available.


Table 6.1. Granger causality analysis of stock market volatility and fundamental volatility

Beginning    RV not ⇒ FV            FV not ⇒ RV
Year         F-stat.   p value      F-stat.   p value
1961         1.16      0.3264       4.14      0.0024
1962         1.18      0.3174       4.09      0.0026
1963         1.11      0.3498       4.21      0.0021
1964         1.14      0.3356       4.39      0.0015
1965         1.07      0.3696       4.33      0.0017
1966         1.06      0.3746       4.33      0.0017
1967         1.01      0.4007       4.48      0.0013
1968         1.00      0.4061       4.44      0.0014
1969         0.98      0.4171       4.38      0.0016
1970         0.96      0.4282       4.14      0.0024
1971         0.89      0.4689       3.86      0.0039
1972         0.78      0.5380       4.16      0.0023
1973         0.62      0.6482       4.06      0.0027
1974         0.84      0.4996       4.40      0.0015
1975         0.83      0.5059       3.90      0.0036
1976         0.83      0.5059       3.89      0.0037
1977         0.95      0.4339       3.93      0.0035
1978         0.88      0.4750       4.11      0.0025
1979         0.73      0.5714       4.02      0.0030
1980         0.74      0.5646       4.52      0.0012
1981         0.49      0.7431       4.67      0.0009
1982         0.47      0.7578       4.77      0.0008
1983         0.59      0.6699       5.15      0.0004
1984         0.71      0.5850       5.39      0.0003
1985         0.83      0.5059       5.58      0.0002
1986         1.07      0.3697       5.59      0.0002
1987         1.29      0.2716       5.76      0.0001
1988         1.29      0.2716       4.84      0.0007
1989         1.21      0.3044       3.86      0.0039
1990         1.23      0.2959       3.42      0.0085

We assess the direction of causal linkages between quarterly real stock market volatility and real GDP growth volatility for the panel of 46 countries, 1961.1 to 2003.3. We test noncausality from fundamental volatility (FV) to return volatility (RV), and vice versa, and we present F-statistics and corresponding p values for both hypotheses. We do this for 30 sample windows, with the ending date fixed at 2003.3 and the starting date varying from 1961.1, 1962.1, . . . , 1990.1.

volatility interface, but also in particular as a call for more exploration of volatility at medium (e.g., business cycle) frequencies. In that regard it is to the stock market as, for example, Diebold, Rudebusch and Aruoba (2006) is to the bond market and Evans and Lyons (2007) is to the foreign exchange market.


Appendix

Here we provide details of data sources, country coverage, sample ranges, and transformations applied. We discuss underlying annual data first, followed by quarterly data.

Annual data

We use four “raw” data series per country: real GDP, real private consumption expenditures (PCE), a broad stock market index, and the CPI. We use those series to compute annual real stock returns, real GDP growth, real consumption growth, and corresponding volatilities. The data set includes a total of 71 countries and spans a maximum of 42 years, 1960–2002. For many countries, however, consumption and especially stock market data are available only for a shorter period, reducing the number of countries with data available.

We obtain annual stock market data from several sources, including International Financial Statistics (IFS), the OECD, Standard and Poor’s Emerging Market Data Base (EMDB), Global Insight (accessed via WRDS), Global Financial Data, Datastream, the World Federation of Exchanges, and various stock exchange websites. Details appear in Table 6.A1, which lists the countries for which stock market index data are available at least for the 20-year period 1983–2002. With stock prices in hand, we calculate nominal returns as i_t = ln(p_t / p_{t-1}). We then calculate annual consumer price index (CPI) inflation, π_t, using the monthly IFS database, 1960–2002, and finally we calculate real stock returns as r_t = (1 + i_t) / (1 + π_t) − 1.
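Transcribed directly into code, the return construction is (function name our own):

```python
import numpy as np

def real_stock_returns(prices, cpi):
    """Annual real stock returns from a stock price index and a CPI series:
    nominal return i_t = ln(p_t / p_{t-1}), CPI inflation pi_t, and
    real return r_t = (1 + i_t) / (1 + pi_t) - 1, as in the text."""
    prices = np.asarray(prices, dtype=float)
    cpi = np.asarray(cpi, dtype=float)
    i = np.log(prices[1:] / prices[:-1])     # nominal (log) returns
    pi = cpi[1:] / cpi[:-1] - 1.0            # CPI inflation
    return (1.0 + i) / (1.0 + pi) - 1.0

# Example: a 10% nominal index gain against 5% inflation
r = real_stock_returns([100.0, 110.0], [100.0, 105.0])
```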

We obtain annual real GDP data from the World Bank World Development Indicators database (WDI). For most countries, WDI covers the full 1960–2002 period. Exceptions are Canada (data start in 1965), Germany (data start in 1971), Israel (data end in 2000), Saudi Arabia (data end in 2001), and Turkey (data start in 1968). We obtain Taiwan real GDP from the Taiwan National Statistics website. We complete the real GDP growth rate series for Canada (1961–1965), Germany (1961–1971), Israel (2001–2002) and Saudi Arabia (2002) using IFS data on nominal growth and CPI inflation. We calculate real GDP growth rates as GDP_t / GDP_{t-1} − 1.

We obtain real personal consumption expenditures data using the household and personal final consumption expenditure series from the World Bank’s WDI database. We recover missing data from the IFS and Global Insight (through WRDS); see Table 6.A2 for details. We calculate real consumption growth rates as C_t / C_{t-1} − 1.

Quarterly data

The quarterly analysis reported in the text is based on 46 countries. Most, but not all, of those countries are also included in the annual analysis.

For stock markets, we construct quarterly returns using the monthly data detailed in Table 6.A3, and we deflate to real terms using quarterly CPI data constructed from the same underlying monthly CPI on which annual real stock market returns are based.

112 Macroeconomic volatility and stock market volatility, world-wide

For real GDP in most countries, we use the IFS volume index. Exceptions are Brazil (real GDP volume index, Brazilian Institute of Geography and Statistics website), Hong Kong (GDP in constant prices, Census and Statistics Department website), Singapore (GDP in constant prices, Ministry of Trade and Industry, Department of Statistics website), and Taiwan (GDP in constant prices, Taiwan National Statistics website).

Table 6.A4 summarizes the availability of the monthly stock index series and quarterly GDP series for each country in our sample.

Table 6.A1. Annual stock market data

Country Period covered Database/Source Acronyms

Argentina            1966–2002   1966–1989 Buenos Aires SE(1) General Index;       ARG
                                 1988–2002 Buenos Aires SE Merval Index
Australia            1961–2002   IFS(2)                                            AUS
Austria              1961–2002   1961–1998 IFS;                                    AUT
                                 1999–2002 Vienna SE WBI index
Brazil               1980–2002   Bovespa SE                                        BRA
Canada               1961–2002   IFS                                               CAN
Chile                1974–2002   IFS                                               CHL
Colombia             1961–2002   IFS                                               COL
Finland              1961–2002   IFS                                               FIN
France               1961–2002   IFS                                               FRA
Germany              1970–2002   IFS                                               GER
Greece               1975–2002   Athens SE General Weighted Index                  GRC
Hong Kong, China     1965–2002   Hang Seng Index                                   HKG
India                1961–2002   IFS                                               IND
Indonesia            1977–2002   EMDB–JSE Composite(3)                             IDN
Ireland              1961–2002   IFS                                               IRL
Israel               1961–2002   IFS                                               ISR
Italy                1961–2002   IFS                                               ITA
Jamaica              1969–2002   IFS                                               JAM
Japan                1961–2002   IFS                                               JPN
Jordan               1978–2002   Amman SE General Weighted Index                   JOR
Korea                1972–2002   IFS                                               KOR
Luxembourg           1970–2002   1980–1998 IFS;                                    LUX
                                 1999–2002 SE–LuxX General Index
Malaysia             1980–2002   KLSE Composite                                    MYS
Mexico               1972–2002   Price & Quotations Index                          MEX
Morocco              1980–2002   EMDB–Upline Securities                            MOR
Netherlands          1961–2002   IFS                                               NLD
New Zealand          1961–2002   IFS                                               NZL
Norway               1961–2002   1961–2000 IFS;                                    NOR
                                 2001–2002 OECD–CLI industrials
Pakistan             1961–2002   1961–1975 IFS;                                    PAK
                                 1976–2002 EMDB–KSE 100
(cont.)


Appendix 113

Table 6.A1. (Continued)

Country Period covered Database/Source Acronyms

Peru                 1981–2002   Lima SE                                           PER
Philippines          1961–2002   IFS                                               PHL
Singapore            1966–2002   1966–1979 Strait Times Old Index;                 SGP
                                 1980–2002 Strait Times New Index
South Africa         1961–2002   IFS                                               SAF
Spain                1961–2002   IFS                                               SPA
Sweden               1961–2002   IFS                                               SWE
Switzerland          1961–2002   OECD–UBS 100 index                                SWI
Taiwan               1967–2002   TSE Weighted Stock Index                          TAI
Thailand             1975–2002   SET Index                                         THA
Trinidad and Tobago  1981–2002   EMDB–TTSE index                                   TTB
United Kingdom       1961–2002   1961–1998 IFS, industrial share index;            UK
                                 1999–2002 OECD, industrial share index
United States        1961–2002   IFS                                               USA
Venezuela, Rep. Bol. 1961–2002   IFS                                               VEN
Zimbabwe             1975–2002   EMDB–ZSE Industrial                               ZBW

(1) SE denotes Stock Exchange.
(2) IFS denotes IMF’s International Financial Statistics. IFS does not provide the name of the stock market index.
(3) EMDB denotes Standard & Poor’s Emerging Market Data Base.

Table 6.A2. Annual Consumption Data

Country      Database                           Country          Database

Argentina    1960–2001 IFS(1), 2002 WRDS(2)     Malaysia         1960–2002 WDI
Australia    1958–2000 WDI(3), 2001–2002 WRDS   Morocco          1960–2001 WDI, 2002 WRDS
Austria      1959–2002 WDI, 2002 WRDS           Mexico           1959–2001 WDI, 2002 WRDS
Brazil       1959–2001 WDI, 2002 WRDS           Netherlands      1959–2001 WDI, 2002 WRDS
Canada       1960–1964 IFS; 1965–2000 WDI,      New Zealand      1958–2000 WDI, 2001–2002 IFS
             2002 WRDS
Chile        1960–2001 WDI, 2002 WRDS           Norway           1958–2000 WDI, 2001–2002 WRDS
Colombia     1960–2001 WDI, 2002 WRDS           Pakistan         1960–2002 WDI
Denmark      1959–2001 WDI, 2002 IFS            Peru             1959–2001 WDI, 2002 WRDS
Finland      1959–2001 WDI, 2002 WRDS           Philippines      1960–2001 WDI, 2002 WRDS

(cont.)


Table 6.A2. (Continued)

Country         Database                          Country          Database

France          1959–2001 WDI, 2002 WRDS          Singapore        1960–2002 WDI
Germany         1960–1970 IFS, 1971–2001 WDI,     South Africa     1960–2002 WDI
                2002 WRDS
Greece          1958–2000 WDI, 2001–2002 WRDS     Spain            1959–2001 WDI, 2002 WRDS
Hong Kong,      1959–2001 WDI, 2002 IFS           Sweden           1959–2001 WDI, 2002 WRDS
China
India           1959–2001 WDI, 2002 WRDS          Switzerland      1959–2001 WDI, 2002 WRDS
Indonesia       1960–2002 WDI                     Taiwan           1964–2002 National Statistics Office
Ireland         1960–2000 WDI, 2001–2002 WRDS     Thailand         1960–2002 WDI
Israel          1960–2000 WDI, 2001–2002 WRDS     United Kingdom   1959–2001 WDI, 2002 WRDS
Italy           1959–2001 WDI, 2002 IFS           United States    1958–2000 WDI, 2001–2002 WRDS
Jamaica         1959–2001 WDI, 2002 IFS           Uruguay          1960–2001 WDI, 2002 WRDS
Japan           1959–2001 WDI, 2002 WRDS          Zimbabwe         1965–2002 WDI
Korea           1960–2002 WDI

(1) IFS denotes IMF’s International Financial Statistics.
(2) Data taken from the Global Insight (formerly DRI) database, which is available through Wharton Research Data Services (WRDS).
(3) WDI denotes World Development Indicators.

Table 6.A3. Monthly Stock Index Data

Acronym Country Definition Period covered Source

ARG   Argentina     Buenos Aires Old (1967–1988),              1983:01–2003:12   GFD(1)
                    Merval Index (1989–2003)
AUS   Australia     19362. . . ZF. . . , Share Prices:         1958:01–2003:12   IFS(2)
                    Ordinaries
AUT   Austria       12262. . . ZF. . . , Share Prices          1957:01–2003:12   IFS
BEL   Belgium       12462. . . ZF. . .                         1957:01–2003:12   IFS
BRA   Brazil        22362. . . ZF. . .                         1980:01–2003:12   IFS
CAN   Canada        15662. . . ZF. . .                         1957:01–2003:11   IFS
CHL   Chile         22862. . . ZF. . .                         1974:01–2003:10   IFS
COL   Colombia      23362. . . ZF. . .                         1959:01–2003:12   IFS

(cont.)


Table 6.A3. (Continued)

Acronym Country Definition Period covered Source

CZE   Czech Republic   PX50 Index                              1994:01–2003:12   EMDB(3)
DEN   Denmark          12862A..ZF. . .                         1967:01–2003:12   IFS
FIN   Finland          17262. . . ZF. . .                      1957:01–2003:12   IFS
FRA   France           13262. . . ZF. . .                      1957:01–2003:11   IFS
GER   Germany          13462. . . ZF. . .                      1970:01–2003:12   IFS
GRC   Greece           Athens General Index                    1980:01–2003:09   GFD
HKG   Hong Kong        Hang Seng Index                         1980:01–2003:05   GFD
HUN   Hungary          BSE BUX Index                           1992:01–2003:12   EMDB
IDN   Indonesia        Jakarta SE Composite Index              1983:03–2003:12   GFD
IRL   Ireland          17862. . . ZF. . . (May 1972 missing)   1957:01–2003:11   IFS
ISR   Israel           43662. . . ZF. . .                      1957:01–2003:11   IFS
ITA   Italy            13662. . . ZF. . .                      1957:01–2003:12   IFS
JPN   Japan            15862. . . ZF. . .                      1957:01–2003:11   IFS
JOR   Jordan           ASE Index                               1986:01–2003:02   EMDB
KOR   S. Korea         KOSPI Index                             1975:01–2003:12   GFD
LAT   Latvia           94162. . . ZF. . .                      1996:04–2003:12   IFS
MYS   Malaysia         KLSE composite                          1980:01–2003:12   GFD
MEX   Mexico           IPC index                               1972:01–2003:12   GFD
NLD   Netherlands      13862. . . ZF. . .                      1957:01–2003:11   IFS
NZL   New Zealand      19662. . . ZF. . .                      1961:01–2003:09   IFS
NOR   Norway           14262. . . ZF. . . (Sep 1997 missing)   1957:01–2003:12   IFS
PER   Peru             Lima SE Index                           1981:12–2003:12   GFD
PHL   Philippines      56662. . . ZF. . .                      1957:01–2003:11   IFS
PRT   Portugal         PSI General Index                       1987:12–2003:12   EMDB
SGP   Singapore        Old+New Strait Times Index              1966:01–2003:11   GFD
SLV   Slovakia         SAX Index                               1996:01–2003:12   EMDB
SAF   South Africa     19962. . . ZF. . .                      1960:01–2003:10   IFS
SPA   Spain            18462. . . ZF. . .                      1961:01–2003:12   IFS
SWE   Sweden           14462. . . ZF. . .                      1996:06–2003:12   IFS
SWI   Switzerland      14662. . . ZF. . .                      1989:01–2003:12   IFS
TAI   Taiwan           SE Capitalization Weighted Index        1967:01–2003:12   GFD
THA   Thailand         SET Index                               1980:01–2003:12   GFD
TUR   Turkey           ISE National-100 Index                  1986:12–2003:12   GFD
UKI   United Kingdom   FTSE 100 Index                          1957:12–2003:11   WRDS(4)
USA   United States    11162 ZF                                1957:01–2003:12   IFS

(1) GFD denotes Global Financial Data.
(2) IFS denotes IMF’s International Financial Statistics.
(3) EMDB denotes Standard & Poor’s Emerging Market Data Base.
(4) WRDS denotes Wharton Research Data Services.


Table 6.A4. Availability of monthly stock returns and quarterly GDP series

Acronym  Country          1984.I–1988.IV   1989.I–1993.IV   1994.I–1998.IV   1999.I–2003.IV
                          Stock idx  GDP   Stock idx  GDP   Stock idx  GDP   Stock idx  GDP

ARG   Argentina        ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
AUS   Australia        ✓ ✓ ✓ ✓ ✓ ✓ ✓
AUT   Austria          ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
BEL   Belgium          ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
BRA   Brazil           ✓ ✓ ✓ ✓ ✓
CAN   Canada           ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
CHL   Chile            ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
COL   Colombia         ✓ ✓ ✓ ✓ ✓
CZE   Czech Republic   ✓ ✓ ✓
DEN   Denmark          ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
FIN   Finland          ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
FRA   France           ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
GER   Germany          ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
GRC   Greece           ✓ ✓ ✓ ✓ ✓
HKG   Hong Kong        ✓ ✓ ✓ ✓ ✓ ✓
HUN   Hungary          ✓ ✓ ✓
IDN   Indonesia        ✓ ✓ ✓ ✓ ✓
IRL   Ireland          ✓ ✓ ✓ ✓ ✓
ISR   Israel           ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
ITA   Italy            ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
JPN   Japan            ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
JOR   Jordan           ✓ ✓ ✓ ✓
KOR   S. Korea         ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
LAT   Latvia           ✓ ✓
MYS   Malaysia         ✓ ✓ ✓ ✓ ✓ ✓ ✓
MEX   Mexico           ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
NLD   Netherlands      ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
NZL   New Zealand      ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
NOR   Norway           ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
PER   Peru             ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
PHL   Philippines      ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
PRT   Portugal         ✓ ✓ ✓ ✓ ✓ ✓
SGP   Singapore        ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
SLV   Slovakia         ✓ ✓
SAF   South Africa     ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
SPA   Spain            ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
SWE   Sweden           ✓ ✓
SWI   Switzerland      ✓ ✓ ✓ ✓ ✓
TAI   Taiwan           ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
THA   Thailand         ✓ ✓ ✓ ✓
TUR   Turkey           ✓ ✓ ✓ ✓ ✓ ✓
UKI   United Kingdom   ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
USA   United States    ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓

[✓ marks transcribed in original order; for countries with fewer than eight marks, the transcription does not show which cells are empty.]


7

Measuring Downside Risk – Realized Semivariance

Ole E. Barndorff-Nielsen, Silja Kinnebrock, and Neil Shephard

“It was understood that risk relates to an unfortunate event occurring, so for an investment this corresponds to a low, or even negative, return. Thus getting returns in the lower tail of the return distribution constitutes this ‘downside risk.’ However, it is not easy to get a simple measure of this risk.” Quoted from Granger (2008).

1. Introduction

A number of economists have wanted to measure downside risk, the risk of prices falling, just using information based on negative returns – a prominent recent example is by Ang, Chen, and Xing (2006). This has been operationalized by quantities such as semivariance, value at risk and expected shortfall, which are typically estimated using daily returns. In this chapter we introduce a new measure of the variation of asset prices based on high frequency data. It is called realized semivariance (RS). We derive its limiting properties, relating it to quadratic variation and, in particular, negative jumps. Further, we show it has some useful properties in empirical work, enriching the standard ARCH models

Acknowledgments: The ARCH models fitted in this chapter were computed using G@RCH 5.0, the package of Laurent and Peters (2002). Throughout, programming was carried out using the Ox language of Doornik (2001) within the OxMetrics 5.0 environment.

We are very grateful for the help of Asger Lunde in preparing some of the data we used in this analysis and advice on various issues. We also would like to thank Rama Cont, Anthony Ledford and Andrew Patton for helpful suggestions at various points. The referee and an editor, Tim Bollerslev, made a number of useful suggestions.

This chapter was first widely circulated on 21 January, 2008.


pioneered by Rob Engle over the last 25 years and building on the recent econometric literature on realized volatility.

Realized semivariance extends the influential work of, for example, Andersen, Bollerslev, Diebold, and Labys (2001) and Barndorff-Nielsen and Shephard (2002), on formalizing so-called realized variances (RV), which links these commonly used statistics to the quadratic variation process. Realized semivariance measures the variation of asset price falls. At a technical level it can be regarded as a continuation of the work of Barndorff-Nielsen and Shephard (2004) and Barndorff-Nielsen and Shephard (2006), who showed it is possible to go inside the quadratic variation process and separate out components of the variation of prices into that due to jumps and that due to the continuous evolution. This work has prompted papers by, for example, Andersen, Bollerslev, and Diebold (2007), Huang and Tauchen (2005) and Lee and Mykland (2008) on the importance of this decomposition empirically in economics. Surveys of this kind of thinking are provided by Andersen, Bollerslev, and Diebold (2009) and Barndorff-Nielsen and Shephard (2007), while a detailed discussion of the relevant probability theory is given in Jacod (2007).

Let us start with statistics and results which are well known. Realized variance estimates the ex post variance of log asset prices Y over a fixed time period. We will suppose that this period is 0 to 1. In our applied work it can be thought of as any individual day of interest. Then RV is defined as

RV = \sum_{j=1}^{n} \left( Y_{t_j} - Y_{t_{j-1}} \right)^2,

where 0 = t_0 < t_1 < \ldots < t_n = 1 are the times at which (trade or quote) prices are available. For arbitrage-free markets, Y must follow a semimartingale. This estimator converges, as we have more and more data in that interval, to the quadratic variation at time one,

[Y]_1 = \operatorname*{p-lim}_{n \to \infty} \sum_{j=1}^{n} \left( Y_{t_j} - Y_{t_{j-1}} \right)^2,

(e.g. Protter, 2004, pp. 66–77) for any sequence of deterministic partitions 0 = t_0 < t_1 < \ldots < t_n = 1 with \sup_j \{t_{j+1} - t_j\} \to 0 for n \to \infty. This limiting operation is often referred to as “in-fill asymptotics” in statistics and econometrics.1

One of the initially strange things about realized variance is that it solely uses squares of the data, whereas the research of, for example, Black (1976), Nelson (1991), Glosten, Jagannathan, and Runkle (1993) and Engle and Ng (1993) has indicated the importance of falls in prices as a driver of conditional variance. The reason for this is clear: as the high-frequency data become dense, the extra information in the sign of the data can fall to zero for some models – see also the work of Nelson (1992). The most elegant framework

1When there are market frictions it is possible to correct this statistic for their effect using the two-scale estimator of Zhang, Mykland, and Aït-Sahalia (2005), the realized kernel of Barndorff-Nielsen, Hansen, Lunde, and Shephard (2008) or the pre-averaging based statistic of Jacod, Li, Mykland, Podolskij, and Vetter (2007).


in which to see this is where Y is a Brownian semimartingale,

Y_t = \int_0^t a_s \, ds + \int_0^t \sigma_s \, dW_s, \quad t \ge 0,

where a is a locally bounded predictable drift process and \sigma is a càdlàg volatility process – all adapted to some common filtration \mathcal{F}_t, implying the model can allow for classic leverage effects. For such a process,

[Y]_t = \int_0^t \sigma_s^2 \, ds,

and so

d[Y]_t = \sigma_t^2 \, dt,

which means that for a Brownian semimartingale the quadratic variation (QV) process tells us everything we can know about the ex post variation of Y, and so RV is a highly interesting statistic. The signs of the returns are irrelevant in the limit – this is true whether there is leverage or not.

If there are jumps in the process there are additional things to learn beyond just the QV process. Let

Y_t = \int_0^t a_s \, ds + \int_0^t \sigma_s \, dW_s + J_t,

where J is a pure jump process. Then, writing the jumps in Y as \Delta Y_t = Y_t - Y_{t-},

[Y]_t = \int_0^t \sigma_s^2 \, ds + \sum_{s \le t} (\Delta Y_s)^2,

and so QV aggregates two sources of risk. Even when we employ bipower variation (Barndorff-Nielsen and Shephard, 2004 and Barndorff-Nielsen and Shephard, 20062), which allows us to estimate \int_0^t \sigma_s^2 \, ds robustly to jumps, this still leaves us with estimates of \sum_{s \le t} (\Delta J_s)^2. This tells us nothing about the asymmetric behavior of the jumps – which is important if we wish to understand downside risk.

In this chapter we introduce the downside realized semivariance (RS^-),

RS^- = \sum_{j=1,\; t_j \le 1} \left( Y_{t_j} - Y_{t_{j-1}} \right)^2 1_{\{Y_{t_j} - Y_{t_{j-1}} \le 0\}},

where 1_{\{y\}} is the indicator function taking the value 1 if the argument y is true. We will study the behavior of this statistic under in-fill asymptotics. In particular we will see that

RS^- \xrightarrow{\;p\;} \frac{1}{2} \int_0^1 \sigma_s^2 \, ds + \sum_{s \le 1} (\Delta Y_s)^2 \, 1_{\{\Delta Y_s \le 0\}},

2Threshold-based decompositions have also been suggested in the literature; examples of this include Mancini (2001), Jacod (2007) and Lee and Mykland (2008).


under in-fill asymptotics. Hence RS^- provides a new source of information, one which focuses on squared negative jumps.3 Of course the corresponding upside realized semivariance,

RS^+ = \sum_{j=1,\; t_j \le 1} \left( Y_{t_j} - Y_{t_{j-1}} \right)^2 1_{\{Y_{t_j} - Y_{t_{j-1}} \ge 0\}} \xrightarrow{\;p\;} \frac{1}{2} \int_0^1 \sigma_s^2 \, ds + \sum_{s \le 1} (\Delta Y_s)^2 \, 1_{\{\Delta Y_s \ge 0\}},

may be of particular interest to investors who have short positions in the market (hence a fall in price can lead to a positive return and hence is desirable), such as hedge funds. Of course,

RV = RS^- + RS^+.
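As a practical illustration, the two semivariances can be computed from a day's high-frequency prices in a few lines (the function name is ours; we assign zero returns to the upside term so that the two pieces exactly partition RV):

```python
import numpy as np

def realized_semivariances(prices):
    """Downside and upside realized semivariance from one day's log prices.

    prices: sequence of intraday log-price observations Y_{t_0}, ..., Y_{t_n}.
    Returns (RS_minus, RS_plus); their sum equals the realized variance.
    """
    y = np.diff(np.asarray(prices, dtype=float))  # high-frequency returns
    rs_minus = np.sum(y[y < 0] ** 2)              # squared negative returns
    rs_plus = np.sum(y[y >= 0] ** 2)              # squared non-negative returns
    return rs_minus, rs_plus
```

By construction `rs_minus + rs_plus` reproduces the realized variance of the same return grid.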

Semivariances, or more generally measures of variation below a threshold (target semivariance), have a long history in finance. The first references are probably Markowitz (1959), Mao (1970b), Mao (1970a), Hogan and Warren (1972) and Hogan and Warren (1974). Examples include the work of Fishburn (1977) and Lewis (1990). Sortino ratios (which are an extension of Sharpe ratios and were introduced by Sortino and van der Meer, 1991) and the so-called post-modern portfolio theory of, for example, Rom and Ferguson (1993) have attracted attention. Sortino and Satchell (2001) look at recent developments and provide a review, whereas Pedersen and Satchell (2002) look at the economic theory of this measure of risk. Our innovation is to bring high-frequency analysis to bear on this measure of risk.

The empirical essence of daily downside realized semivariance can be gleaned from Figure 7.1, which shows an analysis of trades on General Electric (GE) carried out on the New York Stock Exchange4 from 1995 to 2005 (giving us 2,616 days of data). In graph (a) we show the path of the trades drawn in trading time on a particular randomly chosen day in 2004, to illustrate the amount of daily trading which is going on in this asset. Notice that by 2004 the tick size had fallen to one cent.

Graph (b) shows the open to close returns, measured on the log-scale and multiplied by 100, which indicates some moderation in the volatility during the last and first piece of the sample period. The corresponding daily realized volatility (the square root of the realized variance) is plotted in graph (c), based upon returns calculated every 15 trades. The Andersen, Bollerslev, Diebold, and Labys (2000) variance signature plot is shown in graph (d), to assess the impact of noise on the calculation of realized volatility. It suggests statistics computed on returns calculated every 15 trades should not be too sensitive to noise for GE. Graph (e) shows the same but focusing on daily RS− and RS+. Throughout, the statistics are computed using returns calculated every 15 trades. The

3This type of statistic relates to the work of Babsiria and Zakoian (2001), who built separate ARCH-type conditional variance models of daily returns using positive and negative daily returns. It also resonates with the empirical results in a recent paper by Chen and Ghysels (2007) on news impact curves estimated through semiparametric MIDAS regressions.

4These data are taken from the TAQ database, managed through WRDS. Although information on trades is available from all the different exchanges in the US, we solely study trades which are made at the exchange in New York.


[Figure 7.1 about here]

Fig. 7.1. Analysis of trades on General Electric carried out on the NYSE from 1995 to 2005. (a) Path of the trades drawn in trading time on a random day in 2004. (b) Daily open to close returns r_i, measured on the log-scale and multiplied by 100. (c) The corresponding daily realized volatility (\sqrt{RV_i}), based upon returns calculated every 15 trades. (d) Variance signature plot in trade time to assess the impact of noise on the calculation of realized variance (RV). (e) Same thing, but for the realized semivariances (RS_i^+ and RS_i^-). (f) Correlogram for RS_i^+, RV_i and RS_i^-.

average values of these two statistics are quite close to one another over this sample period. This component signature plot is in the spirit of the analysis pioneered by Andersen, Bollerslev, Diebold, and Labys (2001) in their analysis of realized variance. Graph (f) shows the correlogram for the realized semivariances and the realized variance, and suggests the downside realized semivariance has much more dependence in it than RS+. Some summary statistics for these data are available in Table 7.2, which will be discussed in some detail in Section 3.

In the realized volatility literature, authors have typically worked out the impact of using realized volatilities on volatility forecasting using regressions of future realized variance on lagged realized variance and various other explanatory variables.5 Engle and Gallo (2006) prefer a different route, which is to add lagged realized quantities as variance regressors in Engle (2002a) and Bollerslev (1986) GARCH-type models of daily

5Leading references include Andersen, Bollerslev, Diebold, and Labys (2001) and Andersen, Bollerslev, and Meddahi (2004).


returns – the reason for their preference is that it is aimed at a key quantity, a predictive model of future returns, and is more robust to the heteroskedasticity inherent in the data. Typically when Engle generalizes to allow for leverage he uses the Glosten, Jagannathan, and Runkle (1993) (GJR) extension. This is the method we follow here. Throughout we will use the subscript i to denote discrete time.

We model daily open to close returns6 \{r_i;\; i = 1, 2, \ldots, T\} as

E(r_i \mid \mathcal{G}_{i-1}) = \mu,

h_i = \operatorname{Var}(r_i \mid \mathcal{G}_{i-1}) = \omega + \alpha (r_{i-1} - \mu)^2 + \beta h_{i-1} + \delta (r_{i-1} - \mu)^2 I_{\{r_{i-1} - \mu < 0\}} + \gamma z_{i-1},

and then use a standard Gaussian quasi-likelihood to make inference on the parameters, e.g. Bollerslev and Wooldridge (1992). Here z_{i-1} are the lagged daily realized regressors and \mathcal{G}_{i-1} is the information set generated by discrete time daily statistics available to forecast r_i at time i - 1.
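A minimal sketch of this conditional-variance recursion (the function name and the parameter values in the usage note are ours, purely illustrative, not the fitted estimates):

```python
def gjr_x_variance(returns, z, mu, omega, alpha, beta, delta, gamma, h0):
    """Conditional variances from the GJR recursion with a realized regressor:

    h_i = omega + alpha*(r_{i-1}-mu)^2 + beta*h_{i-1}
          + delta*(r_{i-1}-mu)^2 * 1{r_{i-1}-mu < 0} + gamma*z_{i-1}.

    returns[i-1] and z[i-1] (e.g. lagged RS^-) feed the variance at time i.
    """
    h = [h0]                                      # initial condition
    for i in range(1, len(returns)):
        e = returns[i - 1] - mu                   # lagged demeaned return
        lev = e * e if e < 0 else 0.0             # GJR leverage term
        h.append(omega + alpha * e * e + beta * h[-1]
                 + delta * lev + gamma * z[i - 1])
    return h
```

For example, with mu=0, omega=0.1, alpha=0.05, beta=0.7, delta=0.05, gamma=0.2 and a lagged return of 1.0 and regressor of 0.5, the next variance is 0.1 + 0.05 + 0.7·h + 0.1.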

Table 7.1 shows the fit of the GE trade data from 1995 to 2005. It indicates the lagged RS− beating out the GARCH model (δ = 0) and the lagged RV. Both realized terms yield large likelihood improvements over a standard daily returns-based GARCH. Importantly, there is a vast shortening in the information-gathering period needed to condition on, with the GARCH memory parameter β dropping from 0.953 to around 0.7. This makes fitting these realized-based models much easier in practice, allowing their use on relatively short time series of data.

When the comparison with the GJR model is made, which allows for traditional leverage effects, the results are more subtle, with the RS− significantly reducing the importance of the traditional leverage effect while the high-frequency data still has an important impact on improving the fit of the model. In this case the RS− and RV play similar roles, with RS− no longer dominating the impact of the RV in the model.

The rest of this chapter has the following structure. In Section 2 we will discuss the theory of realized semivariances, deriving a central limit theory under some mild assumptions. In Section 3 we will deepen the empirical work reported here, looking at a variety of stocks and also both trade and quote data. In Section 4 we will discuss various extensions and areas of possible future work.

2. Econometric theory

2.1. The model and background

We start this section by repeating some of the theoretical story from Section 1.

6We have no high frequency data to try to estimate the variation of the prices over night and so do not attempt to do this here. Of course, it would be possible to build a joint model of open to close and close to open returns, conditional on the past daily data and the high frequency realized terms, but we have not carried this out here. An alternative would be to model open to open or close to close prices given past data of the same type and the realized quantities. This is quite a standard technique in the literature, but not one we follow here.


Table 7.1. ARCH-type models and lagged realized semivariance and variance

                              GARCH                                    GJR

Lagged RS−       0.685    0.499                       0.371    0.441
                (2.78)   (2.86)                      (0.91)   (2.74)
Lagged RV       −0.114                      0.228     0.037                      0.223
               (−1.26)                     (3.30)    (0.18)                     (2.68)
ARCH             0.040    0.036    0.046    0.040     0.017    0.021    0.016    0.002
                (2.23)  (2.068)   (2.56)   (2.11)    (0.74)   (1.27)   (1.67)   (0.12)
GARCH            0.711    0.691    0.953    0.711     0.710    0.713    0.955    0.708
                (7.79)  (7.071)   (51.9)   (9.24)    (7.28)   (7.65)   (58.0)   (7.49)
GJR                                                   0.055    0.048    0.052    0.091
                                                     (1.05)   (1.51)   (2.86)   (2.27)
Log-likelihood −4527.3  −4527.9  −4577.6  −4533.5   −4526.2  −4526.2  −4562.2  −4526.9

Gaussian quasi-likelihood fit of GARCH and GJR models fitted to daily open to close returns on General Electric share prices, from 1995 to 2005. We allow lagged daily realized variance (RV) and realized semivariance (RS) to appear in the conditional variance. They are computed using every 15th trade. T-statistics, based on robust standard errors, are reported in brackets.


Consider a Brownian semimartingale Y given as

Y_t = \int_0^t a_s \, ds + \int_0^t \sigma_s \, dW_s, \qquad (1)

where a is a locally bounded predictable drift process and \sigma is a càdlàg volatility process. For such a process,

[Y]_t = \int_0^t \sigma_s^2 \, ds,

and so d[Y]_t = \sigma_t^2 \, dt, which means that when there are no jumps the QV process tells us everything we can know about the ex post variation of Y.

When there are jumps this is no longer true; in particular, let

Y_t = \int_0^t a_s \, ds + \int_0^t \sigma_s \, dW_s + J_t, \qquad (2)

where J is a pure jump process. Then

[Y]_t = \int_0^t \sigma_s^2 \, ds + \sum_{s \le t} (\Delta J_s)^2,

and d[Y]_t = \sigma_t^2 \, dt + (\Delta Y_t)^2. Even when we employ devices like realized bipower variation (Barndorff-Nielsen and Shephard, 2004 and Barndorff-Nielsen and Shephard, 2006),

BPV = \mu_1^{-2} \sum_{j=2,\; t_j \le t} \left| Y_{t_j} - Y_{t_{j-1}} \right| \left| Y_{t_{j-1}} - Y_{t_{j-2}} \right| \xrightarrow{\;p\;} \{Y\}_t^{[1,1]} = \int_0^t \sigma_s^2 \, ds, \qquad \mu_1 = E|U|, \quad U \sim N(0,1),

we are able to estimate \int_0^t \sigma_s^2 \, ds robustly to jumps, but this still leaves us with estimates of \sum_{s \le t} (\Delta J_s)^2. This tells us nothing about the asymmetric behavior of the jumps.

2.2. Realized semivariances

The empirical analysis we carry out throughout this chapter is based in trading time, so data arrive into our database at irregular points in time. However, these irregularly spaced observations can be thought of as being equally spaced observations on a new time-changed process, in the same stochastic class, as argued by, for example, Barndorff-Nielsen, Hansen, Lunde, and Shephard (2008). Thus there is no loss in initially considering equally spaced returns

y_i = Y_{i/n} - Y_{(i-1)/n}, \qquad i = 1, 2, \ldots, n.

We study the functional

V(Y, n) = \sum_{i=1}^{\lfloor nt \rfloor} \begin{pmatrix} y_i^2 \, 1_{\{y_i \ge 0\}} \\ y_i^2 \, 1_{\{y_i \le 0\}} \end{pmatrix}. \qquad (3)


The main results then come from an application of some limit theory of Kinnebrock and Podolskij (2008) for bipower variation. This work can be seen as an important generalization of Barndorff-Nielsen, Graversen, Jacod, and Shephard (2006), who studied bipower-type statistics of the form

\frac{1}{n} \sum_{i=2}^{n} g(\sqrt{n}\, y_i) \, h(\sqrt{n}\, y_{i-1}),

when g and h were assumed to be even functions. Kinnebrock and Podolskij (2008) give the extension to the uneven case, which is essential here.7

Proposition 1 Suppose (1) holds; then

\sum_{i=1}^{\lfloor nt \rfloor} \begin{pmatrix} y_i^2 \, 1_{\{y_i \ge 0\}} \\ y_i^2 \, 1_{\{y_i \le 0\}} \end{pmatrix} \xrightarrow{\;p\;} \frac{1}{2} \int_0^t \sigma_s^2 \, ds \begin{pmatrix} 1 \\ 1 \end{pmatrix}.

Proof Trivial application of Theorem 1 in Kinnebrock and Podolskij (2008).
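A quick Monte Carlo check of this limit in the simplest case, Y_t = \sigma W_t with constant \sigma and no drift or jumps (an illustrative sketch, not part of the original analysis):

```python
import numpy as np

# Simulate Y on [0, 1] via i.i.d. Gaussian increments and verify that each
# realized semivariance is close to half the integrated variance, 0.5*sigma^2.
rng = np.random.default_rng(0)
n, sigma = 100_000, 0.2
y = sigma * np.sqrt(1.0 / n) * rng.standard_normal(n)  # equally spaced returns
rs_minus = np.sum(y[y < 0] ** 2)
rs_plus = np.sum(y[y >= 0] ** 2)
# both quantities should be close to 0.5 * sigma**2 = 0.02
```

With n this large the sampling error is of order 10^{-4}, so both statistics land very near 0.02.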

Corollary 1 Suppose

Y_t = \int_0^t a_s \, ds + \int_0^t \sigma_s \, dW_s + J_t

holds, where J is a finite activity jump process; then

\sum_{i=1}^{\lfloor nt \rfloor} \begin{pmatrix} y_i^2 \, 1_{\{y_i \ge 0\}} \\ y_i^2 \, 1_{\{y_i \le 0\}} \end{pmatrix} \xrightarrow{\;p\;} \frac{1}{2} \int_0^t \sigma_s^2 \, ds \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \sum_{s \le t} \begin{pmatrix} (\Delta Y_s)^2 \, 1_{\{\Delta Y_s \ge 0\}} \\ (\Delta Y_s)^2 \, 1_{\{\Delta Y_s \le 0\}} \end{pmatrix}.

Remark 1 The above means that

(1, -1) \sum_{i=1}^{\lfloor nt \rfloor} \begin{pmatrix} y_i^2 \, 1_{\{y_i \ge 0\}} \\ y_i^2 \, 1_{\{y_i \le 0\}} \end{pmatrix} \xrightarrow{\;p\;} \sum_{s \le t} \left\{ (\Delta Y_s)^2 \, 1_{\{\Delta Y_s \ge 0\}} - (\Delta Y_s)^2 \, 1_{\{\Delta Y_s \le 0\}} \right\},

the difference in the squared jumps. Hence this statistic gives us direct econometric evidence on the importance of the sign of jumps. Of course, by combining with bipower variation,

\sum_{i=1}^{\lfloor nt \rfloor} \begin{pmatrix} y_i^2 \, 1_{\{y_i \ge 0\}} \\ y_i^2 \, 1_{\{y_i \le 0\}} \end{pmatrix} - \frac{1}{2} \begin{pmatrix} BPV \\ BPV \end{pmatrix} \xrightarrow{\;p\;} \sum_{s \le t} \begin{pmatrix} (\Delta Y_s)^2 \, 1_{\{\Delta Y_s \ge 0\}} \\ (\Delta Y_s)^2 \, 1_{\{\Delta Y_s \le 0\}} \end{pmatrix},

we can straightforwardly estimate the QV of just positive or negative jumps.
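A sketch of this jump-sign decomposition (the function name is ours; in a sample with no jumps both estimates should be near zero):

```python
import math
import numpy as np

def signed_jump_variation(prices):
    """Estimate the QV of positive and negative jumps by combining realized
    semivariances with bipower variation, in the spirit of Remark 1.
    """
    y = np.diff(np.asarray(prices, dtype=float))
    mu1 = math.sqrt(2.0 / math.pi)                 # E|U|, U ~ N(0, 1)
    bpv = mu1 ** -2 * np.sum(np.abs(y[1:]) * np.abs(y[:-1]))
    rs_minus = np.sum(y[y < 0] ** 2)
    rs_plus = np.sum(y[y >= 0] ** 2)
    # each semivariance estimates half the integrated variance plus the
    # same-sign squared jumps, so subtract half of BPV from each
    return rs_plus - 0.5 * bpv, rs_minus - 0.5 * bpv
```

By construction the two pieces sum to RV minus BPV, the usual jump-variation estimate.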

In order to derive a central limit theory we need to make two assumptions on the volatility process.

(H1) If there were no jumps in the volatility, then it would be sufficient to employ

\sigma_t = \sigma_0 + \int_0^t a_s^* \, ds + \int_0^t \sigma_s^* \, dW_s + \int_0^t v_s^* \, dW_s^*. \qquad (4)

7It is also useful in developing the theory for realized autocovariance under a Brownian semimartingale, which is important in the theory of realized kernels developed by Barndorff-Nielsen, Hansen, Lunde, and Shephard (2008).


Here a^*, \sigma^*, v^* are adapted càdlàg processes, with a^* also being predictable and locally bounded, and W^* is a Brownian motion independent of W.

(H2) \sigma_t^2 > 0 everywhere.

The assumption (H1) is rather general from an econometric viewpoint, as it allows for flexible leverage effects, multifactor volatility effects, jumps, nonstationarities, intraday effects, etc. Indeed, we do not know of a continuous time, continuous sample path volatility model used in financial economics that is outside this class. Kinnebrock and Podolskij (2008) also allow jumps in the volatility under the usual (in this context) conditions introduced by Barndorff-Nielsen, Graversen, Jacod, Podolskij, and Shephard (2006) and discussed by, for example, Barndorff-Nielsen, Graversen, Jacod, and Shephard (2006), but we will not detail this here.

The assumption (H2) is also important: it rules out the situation where the diffusive component disappears.

Proposition 2 Suppose (1), (H1) and (H2) hold; then

\sqrt{n} \left\{ \sum_{i=1}^{\lfloor nt \rfloor} \begin{pmatrix} y_i^2 \, 1_{\{y_i \ge 0\}} \\ y_i^2 \, 1_{\{y_i \le 0\}} \\ |y_i| \, |y_{i-1}| \end{pmatrix} - \int_0^t \sigma_s^2 \, ds \begin{pmatrix} \tfrac{1}{2} \\ \tfrac{1}{2} \\ \mu_1^2 \end{pmatrix} \right\} \xrightarrow{D_{st}} V_t,

where

V_t = \int_0^t \alpha_s(1) \, ds + \int_0^t \alpha_s(2) \, dW_s + \int_0^t \alpha_s(3) \, dW'_s,

\alpha_s(1) = \frac{1}{\sqrt{2\pi}} \left\{ 2 a_s \sigma_s + \sigma_s \sigma_s^* \right\} \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix}, \qquad \alpha_s(2) = \frac{2}{\sqrt{2\pi}} \, \sigma_s^2 \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix},

A_s = \sigma_s^4 \begin{pmatrix} \tfrac{5}{4} & -\tfrac{1}{4} & \mu_1^2 \\ -\tfrac{1}{4} & \tfrac{5}{4} & \mu_1^2 \\ \mu_1^2 & \mu_1^2 & 1 + 2\mu_1^2 - 3\mu_1^4 \end{pmatrix}, \qquad \alpha_s(3) \, \alpha_s(3)' = A_s - \alpha_s(2) \, \alpha_s(2)',

where \alpha_s(3) is a 3 \times 3 matrix. Here W' is independent of (W, W^*), the Brownian motions which appear in the Brownian semimartingale (1) and in (H1).

Proof Given in the Appendix.

Remark 2 When we look at

RV = (1, 1) \sum_{i=1}^{\lfloor nt \rfloor} \begin{pmatrix} y_i^2 \, 1_{\{y_i \ge 0\}} \\ y_i^2 \, 1_{\{y_i \le 0\}} \end{pmatrix},

then we produce the well-known result

\sqrt{n} \left\{ RV - \int_0^t \sigma_s^2 \, ds \right\} \xrightarrow{D_{st}} \int_0^t \sqrt{2} \, \sigma_s^2 \, dW'_s,

which appears in Jacod (1994) and Barndorff-Nielsen and Shephard (2002).

Remark 3  Assume that (a, σ) is independent of W; then

\[
\sqrt{n}\left\{\sum_{i=1}^{\lfloor nt\rfloor}
\begin{pmatrix} y_i^2\,1_{\{y_i\ge 0\}}\\ y_i^2\,1_{\{y_i\le 0\}}\end{pmatrix}
- \frac12\int_0^t \sigma_s^2\,ds\begin{pmatrix}1\\1\end{pmatrix}\right\}
\xrightarrow{D_{st}}
MN\!\left(\frac{1}{\sqrt{2\pi}}\int_0^t \{2a_s\sigma_s + \sigma_s\sigma_s^*\}\,ds
\begin{pmatrix}1\\-1\end{pmatrix},\;
\frac14\int_0^t \sigma_s^4\,ds\begin{pmatrix}5 & -1\\ -1 & 5\end{pmatrix}\right).
\]

If there is no drift and the volatility of volatility is small, then the mean of this mixed Gaussian distribution is zero and we could use this limit result to construct confidence intervals on these quantities. When the drift is not zero we cannot use this result, as we do not have a method for estimating the bias, which is a scaled version of

\[
\frac{1}{\sqrt{n}}\int_0^t \{2a_s\sigma_s + \sigma_s\sigma_s^*\}\,ds.
\]

Of course in practice this bias will be small. The asymptotic variance of

\[
(1,-1)\sum_{i=1}^{\lfloor nt\rfloor}
\begin{pmatrix} y_i^2\,1_{\{y_i\ge 0\}}\\ y_i^2\,1_{\{y_i\le 0\}}\end{pmatrix}
\]

is \(\frac{3}{n}\int_0^t \sigma_s^4\,ds\), but the limit is obviously not mixed Gaussian.

Remark 4  When the assumption that (a, σ) is independent of W fails, we do not know how to construct confidence intervals even if the drift is zero. This is because in the limit

\[
\sqrt{n}\left\{\sum_{i=1}^{\lfloor nt\rfloor}
\begin{pmatrix} y_i^2\,1_{\{y_i\ge 0\}}\\ y_i^2\,1_{\{y_i\le 0\}}\end{pmatrix}
- \frac12\int_0^t \sigma_s^2\,ds\begin{pmatrix}1\\1\end{pmatrix}\right\}
\]

depends upon W. All we know is that the asymptotic variance is again

\[
\frac{1}{4n}\int_0^t \sigma_s^4\,ds\begin{pmatrix}5 & -1\\ -1 & 5\end{pmatrix}.
\]

Notice that throughout the asymptotic variance of RS− is

\[
\frac{5}{4n}\int_0^t \sigma_s^4\,ds,
\]

so it is less than that of the RV (of course it estimates a different quantity). It also means the asymptotic variance of RS+ − RS− is

\[
\frac{3}{n}\int_0^t \sigma_s^4\,ds.
\]


128 Measuring downside risk – realized semivariance

Remark 5  We can look at the measure of the variation of negative jumps through

\[
\sqrt{n}\left(2\sum_{i=1}^{\lfloor nt\rfloor} y_i^2\,1_{\{y_i\le 0\}}
- \frac{1}{\mu_1^2}\sum_{i=1}^{\lfloor nt\rfloor} |y_i|\,|y_{i-1}|\right)
\xrightarrow{D_{st}} V_t,
\]

where

\[
V_t = \int_0^t \alpha_s(1)\,ds + \int_0^t \alpha_s(2)\,dW_s + \int_0^t \alpha_s(3)\,dW'_s,
\]
\[
\alpha_s(1) = -\frac{2}{\sqrt{2\pi}}\{2a_s\sigma_s + \sigma_s\sigma_s^*\},
\qquad
\alpha_s(2) = -\frac{4}{\sqrt{2\pi}}\,\sigma_s^2,
\]
\[
A_s = \sigma_s^4\left(\mu_1^{-4} + 2\mu_1^{-2} - 2\right),
\qquad
\alpha_s(3)\alpha_s(3)' = A_s - \alpha_s(2)\alpha_s(2)'.
\]

We note that

\[
\mu_1^{-4} + 2\mu_1^{-2} - 2 \simeq 3.6089,
\]

which is quite high (the corresponding term is about 0.6 when we look at the difference between realized variance and bipower variation). Without the assumptions of a zero drift and no leverage, it is difficult to see how to use this distribution as the basis of a test.

3. More empirical work

3.1. More on GE trade data

For the GE trade data, Table 7.2 reports basic summary statistics for squared open to close daily returns, realized variance and downside realized semivariance. Much of this is

Table 7.2. Summary information for daily statistics for GE trade data

Variable   Mean   S.D.   Correlation matrix                                  ACF1   ACF20
ri         0.01   1.53    1.00                                              −0.01    0.00
r²i        2.34   5.42    0.06  1.00                                         0.17    0.07
RVi        2.61   3.05    0.03  0.61  1.00                                   0.52    0.26
RS+i       1.33   2.03    0.20  0.61  0.94  1.00                             0.31    0.15
RS−i       1.28   1.28   −0.22  0.47  0.86  0.66  1.00                       0.65    0.37
BPVi       2.24   2.40    0.00  0.54  0.95  0.84  0.93  1.00                 0.64    0.34
BPDVi      0.16   0.46   −0.61 −0.10 −0.08 −0.34  0.34 −0.01  1.00           0.06    0.03

Summary statistics for daily GE data computed using trade data. ri denotes daily open to close returns, RVi is the realized variance, RSi are the realized semivariances, and BPVi is the daily realized bipower variation. BPDV will be defined on the next page.


Table 7.3. GE trade data: regression of returns on lagged realized semivariance and returns

             Coefficient  t-value    Coefficient  t-value    Coefficient  t-value
Constant        0.009       0.03       −0.061      −1.43       −0.067      −1.56
ri−1           −0.012       0.01       −0.001      −0.06        0.016       0.67
RS−i−1                                  0.054       2.28        0.046       1.85
BPDVi−1                                                         0.109       1.26
logL         −4,802.2                −4,799.6                 −4,798.8

Regression of returns ri on lagged realized semivariance RS−i−1 and returns ri−1 for daily returns based on the GE trade database.

familiar, with the average level of squared returns and realized variance being roughly the same, whereas the mean of the downside realized semivariance is around one-half that of the realized variance. The most interesting results are that the RS− statistic has a correlation with RV of around 0.86 and that it is negatively correlated with daily returns. The former correlation is modest for an additional volatility measure and indicates that it may have additional information not in the RV statistic. The latter result shows that large daily semivariances are associated with contemporaneous downward moves in the asset price – which is not surprising of course.

The serial correlations in the daily statistics are also presented in Table 7.2. They show the RV statistic has some predictability through time, but that the autocorrelation in the RS− is much higher. Together with the negative correlation between returns and contemporaneous RS− (which is consistent across a number of different assets), this suggests one should be able to modestly predict returns using past RS−.

Table 7.3 shows the regression fit of ri on ri−1 and RS−i−1 for the GE trade data. The t-statistic on lagged RS− is just significant and positive. Hence a small amount of the variation in the high-frequency falls of price in the previous day is associated with rises in future asset prices – presumably because the high-frequency falls increase the risk premium. The corresponding t-statistics for the impact of RS−i−1 for other series are given in Table 7.6; they show a similar weak pattern.

The RS− statistic has a similar dynamic pattern to the bipower variation statistic.8 The mean and standard deviation of the RS− statistic are slightly higher than half the realized BPV ones. The difference estimator

BPDVi = RS−i − 0.5 BPVi,

which estimates the sum of squared negative jumps, is highly negatively correlated with returns but not very correlated with other measures of volatility. Interestingly this estimator is slightly autocorrelated, but at each of the first 10 lags this correlation is positive, which means it has some forecasting potential.

8 This is computed using not one but two lags, which reduces the impact of market microstructure, as shown by Andersen, Bollerslev, and Diebold (2007).
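The daily BPDV calculation combines the realized semivariance with realized bipower variation. A sketch under simulated no-jump Brownian returns, where BPDV should be close to zero (the one-lag textbook form of BPV is used here rather than the two-lag variant of the footnote; all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4680
y = rng.normal(0.0, 1.0 / np.sqrt(n), size=n)  # no-jump intraday returns

mu1 = np.sqrt(2.0 / np.pi)   # mu_1 = E|U| for U ~ N(0,1)

rs_minus = np.sum(y**2 * (y <= 0))
# Realized bipower variation: mu1^{-2} sum |y_i||y_{i-1}|, jump-robust
bpv = mu1**-2 * np.sum(np.abs(y[1:]) * np.abs(y[:-1]))
# BPDV = RS- minus half of BPV: estimates the sum of squared negative jumps
bpdv = rs_minus - 0.5 * bpv
```

With no jumps both RS− and BPV/2 estimate half the integrated variance (here 0.5), so `bpdv` fluctuates around zero with the asymptotic variance given in Remark 5.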


Table 7.4. Summary information for daily statistics for other trade data

           Mean   S.D.   Correlation matrix                                  ACF1   ACF20
DIS
ri        −0.02   1.74    1.00                                              −0.00    0.00
r²i        3.03   6.52    0.04  1.00                                         0.15    0.08
RVi        3.98   4.69   −0.00  0.53  1.00                                   0.69    0.35
RS+i       1.97   2.32    0.19  0.55  0.94  1.00                             0.66    0.35
RS−i       2.01   2.60   −0.18  0.46  0.95  0.81  1.00                       0.57    0.30
BPVi       3.33   3.97   −0.00  0.53  0.98  0.93  0.93  1.00                 0.69    0.37
BPDVi      0.35   1.03   −0.46  0.13  0.52  0.25  0.72  0.43  1.00           0.05    0.04

AXP
ri         0.01   1.86    1.00                                               0.01    0.01
r²i        3.47   7.75   −0.00  1.00                                         0.15    0.09
RVi        3.65   4.57   −0.01  0.56  1.00                                   0.64    0.37
RS+i       1.83   2.62    0.22  0.52  0.93  1.00                             0.48    0.27
RS−i       1.82   2.30   −0.28  0.53  0.91  0.72  1.00                       0.64    0.36
BPVi       3.09   3.74   −0.04  0.52  0.94  0.83  0.92  1.00                 0.69    0.39
BPDVi      0.27   0.90   −0.63  0.27  0.37  0.10  0.62  0.28  1.00           0.20    0.11

IBM
ri         0.01   1.73    1.00                                              −0.05    0.01
r²i        3.02   7.25    0.04  1.00                                         0.13    0.04
RVi        2.94   3.03    0.03  0.55  1.00                                   0.65    0.34
RS+i       1.50   1.81    0.24  0.54  0.94  1.00                             0.50    0.26
RS−i       1.44   1.43   −0.24  0.48  0.91  0.74  1.00                       0.65    0.34
BPVi       2.62   2.60    0.00  0.51  0.96  0.86  0.93  1.00                 0.70    0.38
BPDVi      0.13   0.49   −0.71  0.05  0.13 −0.11  0.44  0.10  1.00           0.04   −0.01

Summary statistics for various daily data computed using trade data. ri denotes daily open to close returns, RVi is the realized variance, RSi is the realized semivariance, and BPVi is the daily realized bipower variation. BPDVi is the realized bipower downward variation statistic.

3.2. Other trade data

Results in Table 7.4 show that broadly the same results hold for a number of frequently traded assets – American Express (AXP), Walt Disney (DIS) and IBM. Table 7.5 shows the log-likelihood improvements9 from including RV and RS− statistics in the GARCH and GJR models based on trades. The conclusion is clear for GARCH models. By including RS− statistics in the model there is little need to include a traditional leverage effect. Typically it is only necessary to include RS− in the information set; adding RV plays only a modest role. For GJR models, the RV statistic becomes more important and is sometimes slightly more effective than the RS− statistic.

9 Of course the log-likelihoods for the ARCH-type models are Gaussian quasi-likelihoods and so the standard distributional theory for likelihood ratios does not apply directly. Instead one can think of the model fit through a criterion like BIC.
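The fits behind Table 7.5 amount to adding lagged realized measures as exogenous regressors in a GARCH conditional variance and evaluating the Gaussian quasi-likelihood. A bare-bones sketch of that recursion (toy simulated data, fixed rather than estimated parameters, and a crude daily stand-in for RS−; `garch_x_quasi_loglik` is our own name, not from the chapter):

```python
import numpy as np

def garch_x_quasi_loglik(r, x, omega, alpha, beta, phi):
    """Gaussian quasi-log-likelihood of GARCH(1,1) with one lagged
    exogenous variance regressor x (e.g. realized semivariance RS-)."""
    n = len(r)
    sig2 = np.empty(n)
    sig2[0] = np.var(r)  # initialize at the sample variance
    for t in range(1, n):
        sig2[t] = omega + alpha * r[t-1]**2 + beta * sig2[t-1] + phi * x[t-1]
    return -0.5 * np.sum(np.log(2 * np.pi) + np.log(sig2) + r**2 / sig2)

rng = np.random.default_rng(2)
r = rng.normal(0, 1, 500)    # toy daily open-to-close returns
x = r**2 * (r <= 0)          # crude daily proxy for a downside measure
ll = garch_x_quasi_loglik(r, x, omega=0.1, alpha=0.05, beta=0.85, phi=0.05)
```

In practice one would maximize this quasi-likelihood over the parameters and compare the maximized values across the nested specifications, as in the table.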


Table 7.5. Trades: logL improvements by including lagged RS− and RV in conditional variance

                          GARCH model                  GJR model
Lagged variables      AXP   DIS   GE    IBM        AXP   DIS   GE    IBM
RV, RS− & BPV         59.9  66.5  50.5  64.8       47.7  57.2  36.7  45.7
RV & BPV              53.2  63.7  44.7  54.6       45.4  56.9  36.0  44.6
RS− & BPV             59.9  65.7  48.7  62.6       47.6  53.2  36.4  42.5
BPV                   46.2  57.5  44.6  43.9       40.0  50.0  35.8  34.5
RV & RS−              59.8  66.3  49.5  60.7       47.5  56.9  35.4  42.4
RV                    53.0  63.5  43.2  51.5       45.1  56.7  34.7  41.9
RS−                   59.6  65.6  48.7  60.6       47.1  52.4  35.4  41.7
None                   0.0   0.0   0.0   0.0        0.0   0.0   0.0   0.0

Improvements in the Gaussian quasi-likelihood by including lagged realized quantities in the conditional variance over standard GARCH and GJR models. Fit of GARCH and GJR models for daily open to close returns on four share prices, from 1995 to 2005. We allow lagged daily realized variance (RV), realized semivariance (RS−), and realized bipower variation (BPV) to appear in the conditional variance. They are computed using every 15th trade.

3.3. Quote data

We have carried out the same analysis based on quote data, looking solely at the series for offers to buy placed on the New York Stock Exchange. The results are given in Tables 7.6 and 7.7. The results are in line with the previous trade data. The RS− statistic is somewhat less effective for quote data, but the changes are marginal.

4. Additional remarks

4.1. Bipower variation

We can build on the work of Barndorff-Nielsen and Shephard (2004), Barndorff-Nielsen and Shephard (2006), Andersen, Bollerslev, and Diebold (2007) and Huang and Tauchen

Table 7.6. t-statistics for ri on RS−i−1, controlling for lagged returns

           AXP     DIS    GE     IBM
Trades   −0.615    3.79   2.28   0.953
Quotes    0.059    5.30   2.33   1.72

The t-statistics on realized semivariance calculated by regressing daily returns ri on lagged daily returns and lagged daily semivariances (RS−i−1). This is carried out for a variety of stock prices using trade and quote data. The RS statistics are computed using every 15th high-frequency data point.


Table 7.7. Quotes: logL improvements by including lagged RS and RV in conditional variance

                       GARCH model                  GJR model
Lagged variables   AXP   DIS   GE    IBM        AXP   DIS   GE    IBM
RV & RS−           50.1  53.9  45.0  53.8       39.7  48.0  31.7  31.5
RV                 45.0  53.6  43.3  43.9       39.1  46.3  31.6  31.3
RS−                49.5  50.7  44.5  53.7       38.0  39.4  29.1  30.0
None                0.0   0.0   0.0   0.0        0.0   0.0   0.0   0.0

Quote data: Improvements in the Gaussian quasi-likelihood by including lagged realized quantities in the conditional variance. Fit of GARCH and GJR models for daily open to close returns on four share prices, from 1995 to 2005. We allow lagged daily realized variance (RV) and realized semivariance (RS) to appear in the conditional variance. They are computed using every 15th quote.

(2005) by defining

\[
\mathrm{BPDV}_t = \sum_{j:\,t_j\le t}\left(Y_{t_j}-Y_{t_{j-1}}\right)^2 1_{\{Y_{t_j}-Y_{t_{j-1}}\le 0\}}
- \frac12\,\mu_1^{-2}\sum_{j\ge 2:\,t_j\le t}\left|Y_{t_j}-Y_{t_{j-1}}\right|\left|Y_{t_{j-1}}-Y_{t_{j-2}}\right|
\xrightarrow{p} \sum_{s\le t}(\Delta Y_s)^2\,1_{\{\Delta Y_s\le 0\}},
\]

the realized bipower downward variation statistic (upward versions are likewise trivial to define). This seems a novel way of thinking about jumps – we do not know of any literature that has identified \(\sum_{s\le t}(\Delta Y_s)^2 1_{\{\Delta Y_s\le 0\}}\) before. It is tempting to try to carry out jump tests based upon it to test for the presence of downward jumps against a null of no jumps at all. However, the theory developed in Section 2 suggests that this is going to be hard to implement based solely on in-fill asymptotics without stronger assumptions than we usually like to make, due to the presence of the drift term in the limiting result and the nonmixed Gaussian limit theory (we could do testing if we assumed the drift was zero and there is no leverage term). Of course, it would not stop us from testing things based on the time series dynamics of the process – see the work of Corradi and Distaso (2006).

Further, a time series of such objects can be used to assess the factors that drive downward jumps, by simply building a time series model for it, conditioning on explanatory variables.

An alternative to this approach is to use higher order power variation statistics (e.g. Barndorff-Nielsen and Shephard, 2004 and Jacod, 2007),

\[
\sum_{j:\,t_j\le t}\left|Y_{t_j}-Y_{t_{j-1}}\right|^r 1_{\{Y_{t_j}-Y_{t_{j-1}}\le 0\}}
\xrightarrow{p} \sum_{s\le t}|\Delta Y_s|^r\,1_{\{\Delta Y_s\le 0\}}, \qquad r > 2,
\]

as n → ∞. The difficulty with using these high order statistics is that they will be more sensitive to noise than the BPDV estimator.


4.2. Effect of noise

Suppose instead of seeing Y we see

X = Y + U,

and think of U as noise. Let us focus entirely on

\[
\sum_{i=1}^n x_i^2\,1_{\{x_i\le 0\}}
= \sum_{i=1}^n y_i^2\,1_{\{y_i\le -u_i\}}
+ \sum_{i=1}^n u_i^2\,1_{\{y_i\le -u_i\}}
+ 2\sum_{i=1}^n y_i u_i\,1_{\{y_i\le -u_i\}}
\]
\[
\simeq \sum_{i=1}^n y_i^2\,1_{\{u_i\le 0\}}
+ \sum_{i=1}^n u_i^2\,1_{\{u_i\le 0\}}
+ 2\sum_{i=1}^n y_i u_i\,1_{\{u_i\le 0\}},
\]

where the approximation reflects that the y_i shrink as n grows while the u_i do not. If we use the framework of Zhou (1996), where U is white noise, uncorrelated with Y, with E(U) = 0 and Var(U) = ω², then it is immediately apparent that the noise will totally dominate this statistic in the limit as n → ∞.
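The dominance of the noise term is easy to see by simulation: the pure-noise contribution Σu²ᵢ1{uᵢ≤0} grows linearly in n while the signal term stays bounded. A sketch (the noise scale is deliberately exaggerated for illustration and is not calibrated to any data in the chapter):

```python
import numpy as np

rng = np.random.default_rng(3)
omega = 0.05  # illustrative noise std of U; exaggerated for visibility

for n in (100, 10_000):
    y = rng.normal(0.0, 1.0 / np.sqrt(n), size=n)   # efficient-price returns
    u_levels = rng.normal(0.0, omega, size=n + 1)   # white-noise levels U
    u = np.diff(u_levels)                            # noise returns u_i
    x = y + u                                        # observed returns
    rs_obs = np.sum(x**2 * (x <= 0))                 # noisy semivariance
    noise_part = np.sum(u**2 * (u <= 0))             # grows like n * omega^2
    print(n, rs_obs, noise_part)
```

Here the target quantity ½∫σ²ds is 0.5, yet at the larger n the noise term alone is orders of magnitude bigger, which is the sense in which the noise "totally dominates."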

Pre-averaging based statistics of Jacod, Li, Mykland, Podolskij, and Vetter (2007) could be used here to reduce the impact of noise on the statistic.

5. Conclusions

This chapter has introduced a new measure of variation called downside “realized semivariance.” It is determined solely by high-frequency downward moves in asset prices. We have seen that it is possible to carry out an asymptotic analysis of this statistic and that its limit is affected only by downward jumps.

We have assessed the effectiveness of this new measure using it as a conditioning variable for a GARCH model of daily open to close returns. Throughout, for nonleverage-based GARCH models, downside realized semivariance is more informative than the usual realized variance statistic. When a leverage term is introduced it is hard to tell the difference.

Various extensions to this work were suggested.

The conclusion that downward jumps seem to be associated with increases in future volatility is interesting, for it is at odds with nearly all continuous time parametric stochastic volatility models. It could only hold, except for very contrived models, if the volatility process also has jumps in it and these jumps are correlated with the jumps in the price process. This is because it is not possible to correlate a Brownian motion process with a jump process. This observation points us towards models of the type, for example, introduced by Barndorff-Nielsen and Shephard (2001). It would suggest the possibility of empirically rejecting the entire class of stochastic volatility models built solely from Brownian motions. This seems worthy of some more study.

Appendix: Proof of Proposition 2

Consider the framework of Theorem 2 in Kinnebrock and Podolskij (2008) and choose

\[
g(x) = \begin{pmatrix} g_1(x)\\ g_2(x)\\ g_3(x)\end{pmatrix}
= \begin{pmatrix} x^2\,1_{\{x\ge 0\}}\\ x^2\,1_{\{x\le 0\}}\\ |x|\end{pmatrix},
\qquad
h(x) = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & |x|\end{pmatrix}.
\]


Assume that X is a Brownian semimartingale and conditions (H1) and (H2) are satisfied, and note that g is continuously differentiable, so their theory applies directly. Due to the particular choice of h we obtain the stable convergence

\[
\sqrt{n}\left\{V(Y,n)_t - \int_0^t \sigma_s^2\,ds
\begin{pmatrix}\tfrac12\\ \tfrac12\\ \mu_1^2\end{pmatrix}\right\}
\to \int_0^t \alpha_s(1)\,ds + \int_0^t \alpha_s(2)\,dW_s + \int_0^t \alpha_s(3)\,dW'_s, \tag{5}
\]

where W′ is a one-dimensional Brownian motion defined on an extension of the filtered probability space and independent of the σ-field F. Using the notation

\[
\rho_\sigma(g) = E\{g(\sigma U)\},\qquad U \sim N(0,1),
\]
\[
\rho^{(1)}_\sigma(g) = E\{U g(\sigma U)\},\qquad U \sim N(0,1),
\]
\[
\rho^{(1,1)}_\sigma(g) = E\left\{g(\sigma W_1)\int_0^1 W_s\,dW_s\right\},
\]

the α(1), α(2) and α(3) are defined by

\[
\alpha_s(1)_j = \sigma^*_s\,\rho^{(1,1)}_{\sigma_s}\!\left(\frac{\partial g_j}{\partial x}\right)\rho_{\sigma_s}(h_{jj})
+ a_s\,\rho_{\sigma_s}\!\left(\frac{\partial g_j}{\partial x}\right)\rho_{\sigma_s}(h_{jj}),
\]
\[
\alpha_s(2)_j = \rho^{(1)}_{\sigma_s}(g_j)\,\rho_{\sigma_s}(h_{jj}),
\qquad
\alpha_s(3)\alpha_s(3)' = A_s - \alpha_s(2)\alpha_s(2)',
\]

and the elements of the 3 × 3 matrix process A are given by

\[
A^{j,j'}_s = \rho_{\sigma_s}(g_j g_{j'})\,\rho_{\sigma_s}(h_{jj}h_{j'j'})
+ \rho_{\sigma_s}(g_j)\,\rho_{\sigma_s}(g_{j'}h_{jj})\,\rho_{\sigma_s}(h_{j'j'})
+ \rho_{\sigma_s}(g_{j'})\,\rho_{\sigma_s}(g_j h_{j'j'})\,\rho_{\sigma_s}(h_{jj})
- 3\,\rho_{\sigma_s}(g_j)\,\rho_{\sigma_s}(g_{j'})\,\rho_{\sigma_s}(h_{jj})\,\rho_{\sigma_s}(h_{j'j'}).
\]

Then we obtain the result using the following Lemma.

Lemma 1  Let U be standard normally distributed. Then

\[
E\!\left[1_{\{U\ge 0\}}U^3\right] = \frac{2}{\sqrt{2\pi}},\qquad
E\!\left[1_{\{U\ge 0\}}U\right] = \frac{1}{\sqrt{2\pi}},
\]
\[
E\!\left[1_{\{U\le 0\}}U^3\right] = -\frac{2}{\sqrt{2\pi}},\qquad
E\!\left[1_{\{U\le 0\}}U\right] = -\frac{1}{\sqrt{2\pi}}.
\]

Proof  Let f be the density of the standard normal distribution. Then

\[
\int_0^\infty f(x)\,x\,dx
= \frac{1}{\sqrt{2\pi}}\int_0^\infty \exp\!\left(-\frac{x^2}{2}\right)x\,dx
= \frac{1}{\sqrt{2\pi}}\left[-\exp\!\left(-\frac{x^2}{2}\right)\right]_0^\infty
= \frac{1}{\sqrt{2\pi}}.
\]


Using partial integration we obtain

\[
\int_0^\infty f(x)\,x\,dx
= \frac{1}{\sqrt{2\pi}}\int_0^\infty \exp\!\left(-\frac{x^2}{2}\right)x\,dx
= \frac{1}{\sqrt{2\pi}}\left[\frac12 x^2\exp\!\left(-\frac{x^2}{2}\right)\right]_0^\infty
- \frac{1}{\sqrt{2\pi}}\int_0^\infty \frac12 x^2\left(-\exp\!\left(-\frac{x^2}{2}\right)x\right)dx
\]
\[
= \frac{1}{2\sqrt{2\pi}}\int_0^\infty \exp\!\left(-\frac{x^2}{2}\right)x^3\,dx
= \frac12\int_0^\infty x^3 f(x)\,dx.
\]

Thus

\[
\int_0^\infty x^3 f(x)\,dx = \frac{2}{\sqrt{2\pi}}.
\]

Obviously, it holds that

\[
\int_{-\infty}^0 f(x)\,x\,dx = -\int_0^\infty f(x)\,x\,dx,
\qquad
\int_{-\infty}^0 x^3 f(x)\,dx = -\int_0^\infty x^3 f(x)\,dx.
\]

This completes the proof of the Lemma.
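The truncated-moment values in Lemma 1 are simple to confirm numerically; this quick Monte Carlo check is ours, not part of the original proof:

```python
import numpy as np

rng = np.random.default_rng(4)
u = rng.normal(0.0, 1.0, size=2_000_000)  # standard normal draws

c = 1.0 / np.sqrt(2.0 * np.pi)
# E[1{U>=0} U] = 1/sqrt(2*pi) and E[1{U>=0} U^3] = 2/sqrt(2*pi)
m1 = np.mean(u * (u >= 0))
m3 = np.mean(u**3 * (u >= 0))
assert abs(m1 - c) < 0.01
assert abs(m3 - 2 * c) < 0.01
# the negative-tail moments are the mirror images
assert abs(np.mean(u * (u <= 0)) + c) < 0.01
assert abs(np.mean(u**3 * (u <= 0)) + 2 * c) < 0.01
```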

Using the lemma we can calculate the moments

\[
\rho_{\sigma_s}(g_1) = \rho_{\sigma_s}(g_2) = \tfrac12\sigma_s^2,
\]
\[
\rho_{\sigma_s}(h_{1,1}) = \rho_{\sigma_s}(h_{2,2}) = 1,
\]
\[
\rho_{\sigma_s}(h_{3,3}) = \rho_{\sigma_s}(g_3) = \mu_1\sigma_s,
\]
\[
\rho^{(1)}_{\sigma_s}(g_1) = \frac{2}{\sqrt{2\pi}}\,\sigma_s^2 = -\rho^{(1)}_{\sigma_s}(g_2),
\]
\[
\rho_{\sigma_s}(g_1 h_{3,3}) = \rho_{\sigma_s}(g_2 h_{3,3}) = \tfrac12\sigma_s^3\mu_3,
\]
\[
\rho_{\sigma_s}(g_3 h_{3,3}) = \mu_2\sigma_s^2,
\qquad
\rho_{\sigma_s}(g_3^2) = \rho_{\sigma_s}(h_{3,3}^2) = \mu_2\sigma_s^2.
\]

We note that μ₂ = 1 and μ₃ = 2μ₁. Further,

\[
\rho_{\sigma_s}\!\left(\frac{\partial g_1}{\partial x}\right) = \frac{2}{\sqrt{2\pi}}\,\sigma_s
= -\rho_{\sigma_s}\!\left(\frac{\partial g_2}{\partial x}\right),
\]


\[
\rho^{(1)}_{\sigma_s}\!\left(\frac{\partial g_1}{\partial x}\right)
= \rho^{(1)}_{\sigma_s}\!\left(\frac{\partial g_2}{\partial x}\right) = \sigma_s,
\]
\[
\rho_{\sigma_s}\!\left((g_1)^2\right) = \rho_{\sigma_s}\!\left((g_2)^2\right) = \tfrac32\sigma_s^4,
\]
\[
\rho^{(1,1)}_{\sigma_s}\!\left(\frac{\partial g_1}{\partial x}\right) = \frac{\sigma_s}{\sqrt{2\pi}}
= -\rho^{(1,1)}_{\sigma_s}\!\left(\frac{\partial g_2}{\partial x}\right).
\]

The last statement follows from

\[
\rho^{(1,1)}_{\sigma_s}\!\left(\frac{\partial g_1}{\partial x}\right)
= E\!\left[\frac{\partial g_1}{\partial x}(\sigma_s W_1)\int_0^1 W_u\,dW_u\right]
= 2E\!\left[\sigma_s W_1\,1_{\{W_1\ge 0\}}\int_0^1 W_u\,dW_u\right]
\]
\[
= 2E\!\left[\sigma_s W_1\,1_{\{W_1\ge 0\}}\left(\tfrac12 W_1^2 - \tfrac12\right)\right]
= \sigma_s E\!\left[\left(W_1^3 - W_1\right)1_{\{W_1\ge 0\}}\right]
= \frac{\sigma_s}{\sqrt{2\pi}}.
\]


8

Glossary to ARCH (GARCH)

Tim Bollerslev

Rob Engle’s seminal Nobel Prize winning 1982 Econometrica article on the AutoRegressive Conditional Heteroskedastic (ARCH) class of models spurred a virtual “arms race” into the development of new and better procedures for modeling and forecasting time-varying financial market volatility. Some of the most influential of these early papers were collected in Engle (1995). Numerous surveys of the burgeoning ARCH literature also exist; e.g., Andersen and Bollerslev (1998), Andersen, Bollerslev, Christoffersen and Diebold (2006a), Bauwens, Laurent and Rombouts (2006), Bera and Higgins (1993), Bollerslev, Chou and Kroner (1992), Bollerslev, Engle and Nelson (1994), Degiannakis and Xekalaki (2004), Diebold (2004), Diebold and Lopez (1995), Engle (2001, 2004), Engle and Patton (2001), Pagan (1996), Palm (1996), and Shephard (1996). Moreover, ARCH models have now become standard textbook material in econometrics and finance as exemplified by, e.g., Alexander (2001, 2008), Brooks (2002), Campbell, Lo and MacKinlay (1997), Chan (2002), Christoffersen (2003), Enders (2004), Franses and van Dijk (2000), Gourieroux and Jasiak (2001), Hamilton (1994), Mills (1993), Poon (2005), Singleton (2006), Stock and Watson (2007), Tsay (2002), and Taylor (2004). So, why another survey-type chapter?

Even a cursory glance at the many reviews and textbook treatments cited above reveals a perplexing “alphabet-soup” of acronyms and abbreviations used to describe the plethora of models and procedures that have been developed over the years. Hence, as a complement to these more traditional surveys, I have tried to provide an alternative and easy-to-use encyclopedic-type reference guide to the long list of ARCH acronyms. Comparing the length of this list to the list of general Acronyms in Time Series Analysis (ATSA) compiled by Granger (1983) further underscores the scope of the research efforts and new developments that have occurred in the area following the introduction of the basic linear ARCH model in Engle (1982a).

Acknowledgments: I would like to acknowledge the financial support provided by a grant from the NSF to the NBER and CREATES funded by the Danish National Research Foundation. I would also like to thank Frank Diebold, Xin Huang, Andrew Patton, Neil Shephard and Natalia Sizova for valuable comments and suggestions. Of course, I am solely to blame for any errors or omissions.



My definition of what constitutes an ARCH acronym is, of course, somewhat arbitrary and subjective. In addition to the obvious cases of acronyms associated with specific parametric models, I have also included descriptions of some abbreviations associated with more general procedures and ideas that figure especially prominently in the ARCH literature. With a few exceptions, I have restricted the list of acronyms to those that have appeared in already published studies. Following Granger (1983), I have purposely not included the names of specific computer programs or procedures, as these are often of limited availability and may also be sold commercially. Even though I have tried my best to be as comprehensive and inclusive as possible, I have almost surely omitted some abbreviations. To everyone responsible for an acronym that I have inadvertently left out, please accept my apology.

Lastly, let me make it clear that the mere compilation of this list does not mean that I endorse the practice of associating each and every ARCH formulation with its own unique acronym. In fact, the sheer length of this list arguably suggests that the use of special names and abbreviations originally intended for easily telling different ARCH models apart might have reached a point of diminishing returns to scale.

AARCH (Augmented ARCH) The AARCH model of Bera, Higgins and Lee (1992) extends the linear ARCH(q) model (see ARCH) to allow the conditional variance to depend on cross-products of the lagged innovations. Defining the q × 1 vector e_{t−1} ≡ (ε_{t−1}, ε_{t−2}, . . . , ε_{t−q})′, the AARCH(q) model may be expressed as:

\[
\sigma_t^2 = \omega + e_{t-1}' A\, e_{t-1},
\]

where A denotes a q × q symmetric positive definite matrix. If A is diagonal, the model reduces to the standard linear ARCH(q) model. The Generalized AARCH, or GAARCH model is obtained by including lagged conditional variances on the right-hand side of the equation. The slightly more general GQARCH representation was proposed independently by Sentana (1995) (see GQARCH).

ACD (Autoregressive Conditional Duration) The ACD model of Engle and Russell (1998) was developed to describe dynamic dependencies in the durations between randomly occurring events. The model has found especially wide use in the analysis of high-frequency financial data and times between trades or quotes. Let x_i ≡ t_i − t_{i−1} denote the time interval between the ith and the (i−1)th event. The popular ACD(1,1) model then parameterizes the expected durations, ψ_i = E(x_i | x_{i−1}, x_{i−2}, . . .), analogous to the conditional variance in the GARCH(1,1) model (see GARCH),

\[
\psi_i = \omega + \alpha x_{i-1} + \beta\psi_{i-1}.
\]

Higher order ACD(p,q) models are defined in a similar manner. Quasi Maximum Likelihood Estimates (see QMLE) of the parameters in the ACD(p,q) model may be obtained by applying standard GARCH(p,q) estimation procedures to y_i ≡ x_i^{1/2}, with the conditional mean fixed at zero (see also ACH and MEM).
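The ACD(1,1) recursion is simple to simulate: with unit-mean exponential innovations ε_i, durations are x_i = ψ_i ε_i, so that E(x_i | past) = ψ_i. A sketch with purely illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(5)
omega, alpha, beta = 0.1, 0.1, 0.8   # illustrative ACD(1,1) parameters

n = 20_000
psi = np.empty(n)   # conditional expected durations psi_i
x = np.empty(n)     # simulated durations between events
psi[0] = omega / (1 - alpha - beta)  # unconditional mean duration (= 1 here)
x[0] = psi[0] * rng.exponential()
for i in range(1, n):
    psi[i] = omega + alpha * x[i-1] + beta * psi[i-1]
    x[i] = psi[i] * rng.exponential()  # E(x_i | past) = psi_i
```

The sample mean of the simulated durations should settle near the unconditional mean ω/(1 − α − β), mirroring the GARCH(1,1) variance-targeting identity.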

ACH1 (Autoregressive Conditional Hazard) The ACH model of Hamilton and Jorda (2002) is designed to capture dynamic dependencies in hazard rates, or the probability for the occurrence of specific events. The basic ACH(p,q) model without any updating of


the expected hazard rates between events is asymptotically equivalent to the ACD(p,q) model for the times between events (see ACD).

ACH2 (Adaptive Conditional Heteroskedasticity) In parallel to the idea of allowing for time-varying variances in a sequence of normal distributions underlying the basic ARCH model (see ARCH), it is possible to allow the scale parameter in a sequence of Stable Paretian distributions to change over time. The ACH formulation for the scale parameter, c_t, first proposed by McCulloch (1985) postulates that the temporal variation may be described by an exponentially weighted moving average (see EWMA) of the form

\[
c_t = \alpha|\varepsilon_{t-1}| + (1-\alpha)c_{t-1}.
\]

Many other more complicated Stable GARCH formulations have subsequently been proposed and analyzed in the literature (see SGARCH).

ACM (Autoregressive Conditional Multinomial) The ACM model of Engle and Russell (2005) involves an ARMA-type representation for discrete-valued multinomial data, in which the conditional transition probabilities between the different values are guaranteed to lie between zero and one and sum to unity. The ACM and ACD models (see ACD) may be combined in modeling high-frequency financial price series and other irregularly spaced discrete data.

ADCC (Asymmetric Dynamic Conditional Correlations) The ADCC GARCH model of Cappiello, Engle and Sheppard (2006) extends the DCC model (see DCC) to allow for asymmetries in the time-varying conditional correlations based on a GJR threshold-type formulation (see GJR).

AGARCH1 (Asymmetric GARCH) The AGARCH model was introduced by Engle (1990) to allow for asymmetric effects of negative and positive innovations (see also EGARCH, GJR, NAGARCH, and VGARCH1). The AGARCH(1,1) model is defined by:

\[
\sigma_t^2 = \omega + \alpha\varepsilon_{t-1}^2 + \gamma\varepsilon_{t-1} + \beta\sigma_{t-1}^2,
\]

where negative values of γ imply that positive shocks will result in smaller increases in future volatility than negative shocks of the same absolute magnitude. The model may alternatively be expressed as:

\[
\sigma_t^2 = \omega' + \alpha(\varepsilon_{t-1} + \gamma')^2 + \beta\sigma_{t-1}^2,
\]

for which ω′ > 0, α ≥ 0 and β ≥ 0 readily ensure that the conditional variance is positive almost surely.
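Expanding the square shows the two AGARCH(1,1) forms are linked by γ = 2αγ′ and ω = ω′ + αγ′². A quick numerical check of this equivalence (parameter values arbitrary and purely illustrative):

```python
# Two AGARCH(1,1) parameterizations should give identical conditional variances.
omega_p, alpha, beta, gamma_p = 0.05, 0.08, 0.9, -0.5  # primed-form parameters
omega = omega_p + alpha * gamma_p**2   # implied unprimed intercept
gamma = 2 * alpha * gamma_p            # implied linear asymmetry coefficient

for eps, sig2 in [(-1.0, 1.2), (1.0, 1.2), (0.3, 0.8)]:
    v1 = omega + alpha * eps**2 + gamma * eps + beta * sig2       # first form
    v2 = omega_p + alpha * (eps + gamma_p)**2 + beta * sig2       # second form
    assert abs(v1 - v2) < 1e-12
```

Note that a negative γ′ delivers the negative γ associated with the usual leverage-type asymmetry.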

AGARCH2 (Absolute value GARCH) See TS-GARCH.

ANN-ARCH (Artificial Neural Network ARCH) Donaldson and Kamstra (1997) term the GJR model (see GJR) augmented with a logistic function, as commonly used in Neural Networks, the ANN-ARCH model.

ANST-GARCH (Asymmetric Nonlinear Smooth Transition GARCH) The ANST-GARCH(1,1) model of Nam, Pyun and Arize (2002) postulates that

\[
\sigma_t^2 = \omega + \alpha\varepsilon_{t-1}^2 + \beta\sigma_{t-1}^2
+ \left[\kappa + \delta\varepsilon_{t-1}^2 + \rho\sigma_{t-1}^2\right]F(\varepsilon_{t-1}, \gamma),
\]

Page 153: Bollerslev T. Et Al._2010_Volatility and Time Series Eco No Metrics

140 Glossary to ARCH (GARCH)

where F(·) denotes a smooth transition function. The model simplifies to the ST-GARCH(1,1) model of Gonzalez-Rivera (1998) for κ = ρ = 0 (see ST-GARCH) and the standard GARCH(1,1) model for κ = δ = ρ = 0 (see GARCH).

APARCH (Asymmetric Power ARCH) The APARCH, or APGARCH, model of Ding, Engle and Granger (1993) nests several of the most popular univariate parameterizations. In particular, the APGARCH(p,q) model,

\[
\sigma_t^\delta = \omega + \sum_{i=1}^q \alpha_i\left(|\varepsilon_{t-i}| - \gamma_i\varepsilon_{t-i}\right)^\delta
+ \sum_{i=1}^p \beta_i\sigma_{t-i}^\delta,
\]

reduces to the standard linear GARCH(p,q) model for δ = 2 and γ_i = 0, the TS-GARCH(p,q) model for δ = 1 and γ_i = 0, the NGARCH(p,q) model for γ_i = 0, the GJR-GARCH model for δ = 2 and 0 ≤ γ_i ≤ 1, the TGARCH(p,q) model for δ = 1 and 0 ≤ γ_i ≤ 1, while the log-GARCH(p,q) model is obtained as the limiting case of the model for δ → 0 and γ_i = 0 (see GARCH, TS-GARCH, NGARCH, GJR, TGARCH and log-GARCH).

ARCD (AutoRegressive Conditional Density) The ARCD class of models proposed by Hansen (1994) extends the basic ARCH class of models to allow for conditional dependencies beyond the mean and variance by postulating a specific non-normal distribution for the standardized innovations z_t ≡ ε_tσ_t^{−1}, explicitly parameterizing the shape parameters of this distribution as a function of lagged information. Most empirical applications of the ARCD model have relied on the standardized skewed Student-t distribution (see also GARCH-t and GED-GARCH). Specific examples of ARCD models include the GARCH with Skewness, or GARCHS, model of Harvey and Siddique (1999), in which the skewness is allowed to be time-varying. In particular, for the GARCHS(1,1,1) model,

\[
s_t = \gamma_0 + \gamma_1 z_t^3 + \gamma_2 s_{t-1},
\]

where s_t ≡ E_{t−1}(z_t³). Similarly, the GARCH with Skewness and Kurtosis, or GARCHSK, model of Leon, Rubio and Serna (2005), parameterizes the conditional kurtosis as:

\[
k_t = \delta_0 + \delta_1 z_t^4 + \delta_2 k_{t-1},
\]

where k_t ≡ E_{t−1}(z_t⁴).

ARCH (AutoRegressive Conditional Heteroskedasticity) The ARCH model was originally developed by Engle (1982a) to describe UK inflationary uncertainty. However, the ARCH class of models has subsequently found especially wide use in characterizing time-varying financial market volatility. The ARCH regression model for y_t first analyzed in Engle (1982a) is defined by:

\[
y_t \mid F_{t-1} \sim N(x_t'\beta,\ \sigma_t^2),
\]

where F_{t−1} refers to the information set available at time t − 1, and the conditional variance,

\[
\sigma_t^2 = f(\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-p};\ \theta),
\]

is an explicit function of the p lagged innovations, ε_t ≡ y_t − x_t′β. Using a standard prediction error decomposition-type argument, the log-likelihood function for the ARCH


model may be expressed as:

\[
\log L(y_T, y_{T-1}, \ldots, y_1;\ \beta, \theta)
= -\frac{T}{2}\log(2\pi) - \frac12\sum_{t=1}^T\left[\log(\sigma_t^2) + (y_t - x_t'\beta)^2\sigma_t^{-2}\right].
\]

Even though analytical expressions for the Maximum Likelihood Estimates (see also QMLE) are not available in closed form, numerical procedures may readily be used to maximize the function. The qth-order linear ARCH(q) model suggested by Engle (1982a) provides a particularly convenient and natural parameterization for capturing the tendency for large (small) variances to be followed by other large (small) variances,

\[
\sigma_t^2 = \omega + \sum_{i=1}^q \alpha_i\varepsilon_{t-i}^2,
\]

where for the conditional variance to be non-negative and the model well defined, ω has to be positive and all of the α_i non-negative. Most of the early empirical applications of ARCH models, including Engle (1982a), were based on the linear ARCH(q) model with the additional constraint that the α_i decline linearly with the lag,

\[
\sigma_t^2 = \omega + \alpha\sum_{i=1}^q (q+1-i)\,\varepsilon_{t-i}^2,
\]

in turn requiring the estimation of only a single α parameter irrespective of the value of q. More generally, any nontrivial measurable function of the time t − 1 information set, σ_t², such that

\[
\varepsilon_t = \sigma_t z_t,
\]

where z_t is a sequence of independent random variables with mean zero and unit variance, is now commonly referred to as an ARCH model.
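The defining pair ε_t = σ_t z_t with σ²_t = ω + αε²_{t−1} is a two-line simulation; the unconditional variance is ω/(1 − α), and the simulated series is fat-tailed even though z_t is Gaussian. A sketch with illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(6)
omega, alpha = 0.7, 0.3   # ARCH(1); unconditional variance omega/(1-alpha) = 1

n = 200_000
eps = np.empty(n)
sig2 = omega / (1 - alpha)                 # start at the unconditional variance
for t in range(n):
    eps[t] = np.sqrt(sig2) * rng.normal()  # eps_t = sigma_t * z_t, z_t ~ N(0,1)
    sig2 = omega + alpha * eps[t]**2       # ARCH(1) conditional variance update

kurt = np.mean(eps**4) / np.var(eps)**2    # > 3: excess kurtosis from ARCH
```

The sample variance should be close to 1 while the sample kurtosis exceeds the Gaussian value of 3, illustrating why even the simplest ARCH model generates volatility clustering and fat tails.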

ARCH-Filters ARCH and GARCH models may alternatively be given a nonparametric interpretation as discrete-time filters designed to extract information about some underlying, possibly latent continuous-time, stochastic volatility process. Issues related to the design of consistent and asymptotically optimal ARCH-Filters have been studied extensively by Nelson (1992, 1996a) and Nelson and Foster (1994). For instance, the asymptotically efficient filter (in a mean-square-error sense for increasingly finer sample observations) for the instantaneous volatility in the GARCH diffusion model (see GARCH Diffusion) is given by the discrete-time GARCH(1,1) model (see also ARCH-Smoothers).

ARCH-NNH (ARCH Nonstationary Nonlinear Heteroskedasticity) The ARCH-NNH model of Han and Park (2008) includes a nonlinear function of a near or exact unit root process, x_t, in the conditional variance of the ARCH(1) model,

\[
\sigma_t^2 = \alpha\varepsilon_{t-1}^2 + f(x_t).
\]

The model is designed to capture slowly decaying stochastic long run volatility dependencies (see also CGARCH1, FIGARCH, IGARCH).

ARCH-M (ARCH-in-Mean) The ARCH-M model was first introduced by Engle, Lilien and Robins (1987) for modeling risk-return tradeoffs in the term structure of US interest


rates. The model extends the ARCH regression model in Engle (1982a) (see ARCH) byallowing the conditional mean to depend directly on the conditional variance,

yt|Ft−1 ∼ N(x′tβ + δσ2

t , σ2t ).

This breaks the block-diagonality between the parameters in the conditional mean andthe parameters in the conditional variance, so that the two sets of parameters must beestimated jointly to achieve asymptotic efficiency. Nonlinear functions of the conditionalvariance may be included in the conditional mean in a similar fashion. The final preferredmodel estimated in Engle, Lilien and Robins (1987) parameterizes the conditional meanas a function of log

(σ2

t

). Multivariate extensions of the ARCH-M model were first analyzed

and estimated by Bollerslev, Engle and Wooldridge (1988) (see also MGARCH1).
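The risk-return feedback is easy to illustrate with a small simulation in which an ARCH(1) conditional variance enters the mean through the δσ2t term. This is a hedged sketch of the mechanism only, not the estimator used in the paper; the function name and parameter values are illustrative.

```python
import math
import random

def simulate_arch_m(T, omega, alpha, delta, mu=0.0, seed=0):
    """Sketch of an ARCH(1)-in-mean process:
    sigma2_t = omega + alpha * eps_{t-1}^2      (ARCH(1) variance)
    y_t      = mu + delta * sigma2_t + eps_t    (variance-in-mean term)
    """
    rng = random.Random(seed)
    y, sigma2 = [], []
    eps_prev = 0.0
    for _ in range(T):
        s2 = omega + alpha * eps_prev ** 2
        eps_prev = math.sqrt(s2) * rng.gauss(0.0, 1.0)
        sigma2.append(s2)
        y.append(mu + delta * s2 + eps_prev)
    return y, sigma2
```

A positive δ shifts the conditional mean up whenever volatility is high, which is the risk-return tradeoff the model is designed to capture.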

ARCH-SM (ARCH Stochastic Mean) The ARCH-SM acronym was coined by Lee and Taniguchi (2005) to distinguish ARCH models in which εt ≡ yt − Et−1(yt) ≠ yt − E(yt) (see ARCH).

ARCH-Smoothers ARCH-Smoothers, first developed by Nelson (1996b) and Foster and Nelson (1996), extend the ARCH and GARCH models and corresponding ARCH-Filters based solely on past observations (see ARCH-Filters) to allow for the use of both current and future observations in the estimation of the latent volatility.

ATGARCH (Asymmetric Threshold GARCH) The ATGARCH(1,1) model of Crouhy and Rockinger (1997) combines and extends the TS-GARCH(1,1) and GJR(1,1) models (see TS-GARCH and GJR) by allowing the threshold used in characterizing the asymmetric response to differ from zero,

σt = ω + α|εt−1|I(εt−1 ≥ γ) + δ|εt−1|I(εt−1 < γ) + βσt−1.

Higher order ATGARCH(p,q) models may be defined analogously (see also AGARCH and TGARCH).

Aug-GARCH (Augmented GARCH) The Aug-GARCH model developed by Duan (1997) nests most of the popular univariate parameterizations, including the standard linear GARCH model, the Multiplicative GARCH model, the Exponential GARCH model, the GJR-GARCH model, the Threshold GARCH model, the Nonlinear GARCH model, the Taylor–Schwert GARCH model, and the VGARCH model (see GARCH, MGARCH2, EGARCH, GJR, TGARCH, NGARCH, TS-GARCH and VGARCH1). The Aug-GARCH(1,1) model may be expressed as:

σ2t = |λϕt − λ + 1|I(λ ≠ 0) + exp(ϕt − 1)I(λ = 0),

where

ϕt = ω + α1|zt−1 − κ|δϕt−1 + α2 max(0, κ − zt−1)δϕt−1 + α3(|zt−1 − κ|δ − 1)/δ + α4(max(0, κ − zt−1)δ − 1)/δ + βϕt−1,

and zt ≡ εtσ−1t denotes the corresponding standardized innovations. The basic GARCH(1,1) model is obtained by fixing λ = 1, κ = 0, δ = 2 and α2 = α3 = α4 = 0, whereas the EGARCH model corresponds to λ = 0, κ = 0, δ = 1 and α1 = α2 = 0 (see also HGARCH).

AVGARCH (Absolute Value GARCH) See TS-GARCH.

β-ARCH (Beta ARCH) The β-ARCH(1) model of Guegan and Diebolt (1994) allows the conditional variance to depend asymmetrically on positive and negative lagged innovations,

σ2t = ω + [αI(εt−1 > 0) + γI(εt−1 < 0)]ε2·βt−1,

where I(·) denotes the indicator function. For α = γ and β = 1 the model reduces to the standard linear ARCH(1) model. More general β-ARCH(q) and β-GARCH(p,q) models may be defined in a similar fashion (see also GJR, TGARCH, and VGARCH1).

BEKK (Baba, Engle, Kraft and Kroner) The BEKK acronym refers to a specific parameterization of the multivariate GARCH model (see MGARCH1) developed in Engle and Kroner (1995). The simplest BEKK representation for the N × N conditional covariance matrix Ωt takes the form:

Ωt = C′C + A′εt−1ε′t−1A + B′Ωt−1B,

where C denotes an upper triangular N × N matrix, and A and B are both unrestricted N × N matrices. This quadratic representation automatically guarantees that Ωt is positive definite. The reference to Y. Baba and D. Kraft in the acronym stems from an earlier unpublished four-authored working paper.
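For intuition, one step of the BEKK recursion can be traced in the bivariate case: every term is a quadratic form M′XM (plus the positive definite C′C piece), so the update preserves symmetry and positive definiteness. The 2 × 2 helpers below are an illustrative sketch, not part of any particular software package.

```python
def mmul(A, B):
    # 2x2 matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def tr(A):
    # 2x2 transpose
    return [[A[j][i] for j in range(2)] for i in range(2)]

def bekk_step(Omega, eps, C, A, B):
    # Omega_t = C'C + A' eps_{t-1} eps_{t-1}' A + B' Omega_{t-1} B
    ee = [[eps[i] * eps[j] for j in range(2)] for i in range(2)]
    terms = [mmul(tr(C), C), mmul(mmul(tr(A), ee), A),
             mmul(mmul(tr(B), Omega), B)]
    return [[sum(t[i][j] for t in terms) for j in range(2)]
            for i in range(2)]
```

Positive definiteness of the 2 × 2 output can be verified directly from the leading minors (positive diagonal and positive determinant).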

BGARCH (Bivariate GARCH) See MGARCH1.

CARR (Conditional AutoRegressive Range) The CARR(p,q) model proposed by Chou (2005) postulates a GARCH(p,q) structure (see GARCH) for the dynamic dependencies in time series of high–low asset prices over some fixed time interval. The model is essentially analogous to the ACD model (see ACD) for the times between randomly occurring events (see also REGARCH).

CAViaR (Conditional Autoregressive Value at Risk) The CAViaR model of Engle and Manganelli (2004) specifies the evolution of a particular conditional quantile of a time series, say ft where Pt−1(yt ≤ ft) = p for some pre-specified fixed level p, as an autoregressive process. The indirect GARCH(1,1) model parameterizes the conditional quantiles as:

ft = (ω + αy2t−1 + βf2t−1)1/2.

This formulation would be correctly specified if the underlying process for yt follows a GARCH(1,1) model with i.i.d. standardized innovations (see GARCH). Alternative models allowing for asymmetries may be specified in a similar manner. The CAViaR model was explicitly developed for predicting quantiles in financial asset return distributions, or so-called Value-at-Risk.

CCC (Constant Conditional Correlations) The N × N conditional covariance matrix for the N × 1 vector process εt, say Ωt, may always be decomposed as:

Ωt = DtRtDt,

where Rt denotes the N × N matrix of conditional correlations with typical element

ρijt = Covt−1(εit, εjt)/[Vart−1(εit)1/2 Vart−1(εjt)1/2],

and Dt denotes the N × N diagonal matrix with typical element Vart−1(εit)1/2. The CCC GARCH model of Bollerslev (1990) assumes that the conditional correlations are constant, ρijt = ρij, so that the temporal variation in Ωt is determined solely by the time-varying conditional variances for each of the elements in εt. This assumption greatly simplifies the inference, requiring only the nonlinear estimation of N univariate GARCH models, whereas Rt = R may be estimated by the sample correlations of the corresponding standardized residuals. Moreover, as long as each of the conditional variances are positive, the CCC model guarantees that the resulting conditional covariance matrices are positive definite (see also DCC and MGARCH1).
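The resulting covariance construction is a simple scaling: given conditional variances from the N univariate GARCH models and a constant correlation matrix R, the typical covariance element is ρij times the two conditional standard deviations. The helper below is an illustrative sketch assuming R is a valid correlation matrix.

```python
import math

def ccc_covariance(variances, R):
    # Omega_t with typical element R[i][j] * sqrt(Var_i) * sqrt(Var_j);
    # R is the constant correlation matrix (ones on the diagonal).
    n = len(variances)
    sd = [math.sqrt(v) for v in variances]
    return [[R[i][j] * sd[i] * sd[j] for j in range(n)] for i in range(n)]
```

The diagonal recovers the conditional variances and the off-diagonals scale with both standard deviations, so only the N univariate variance paths need to be modeled dynamically.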

Censored-GARCH See Tobit-GARCH.

CGARCH1 (Component GARCH) The component GARCH model of Engle and Lee (1999) was designed to better account for long run volatility dependencies. Rewriting the GARCH(1,1) model as:

(σ2t − σ2) = α(ε2t−1 − σ2) + β(σ2t−1 − σ2),

where σ2 ≡ ω/(1 − α − β) refers to the unconditional variance, the CGARCH model is obtained by relaxing the assumption of a constant σ2. Specifically,

(σ2t − ζ2t) = α(ε2t−1 − ζ2t−1) + β(σ2t−1 − ζ2t−1),

with the corresponding long run variance parameterized by the separate equation,

ζ2t = ω + ρζ2t−1 + ϕ(ε2t−1 − σ2t−1).

Substituting this expression for ζ2t into the former equation, the CGARCH model may alternatively be expressed as a restricted GARCH(2,2) model (see also FIGARCH).

CGARCH2 (Composite GARCH) The CGARCH model of den Hertog (1994) represents ε2t as the sum of a latent permanent random walk component and another latent AR(1) component.

COGARCH (Continuous GARCH) The continuous-time COGARCH(1,1) model proposed by Kluppelberg, Lindner and Maller (2004) may be expressed as,

dy(t) = σ(t)dL(t),

and

σ2(t) = [σ2(0) + ω ∫t0 exp(x(s))ds] exp(−x(t−)),

where

x(t) = −t log β − Σ0<s≤t log[1 + α exp(− log β)ΔL(s)2].

The model is obtained by backward solution of the difference equation defining the discrete-time GARCH(1,1) model (see GARCH), replacing the standardized innovations by the increments to the Levy process, L(t). In contrast to the GARCH diffusion model of Nelson (1990b) (see GARCH Diffusion), which involves two independent Brownian motions, the COGARCH model is driven by a single innovation process. Higher order COGARCH(p,q) processes have been developed by Brockwell, Chadraa and Lindner (2006) (see also ECOGARCH).

Copula GARCH Any joint distribution function may be expressed in terms of its marginal distribution functions and a copula function linking these. The class of copula GARCH models builds on this idea in the formulation of multivariate GARCH models (see MGARCH1) by linking univariate GARCH models through a sequence of possibly time-varying conditional copulas. For further discussion of estimation and inference in copula GARCH models, see, e.g., Jondeau and Rockinger (2006) and Patton (2006a) (see also CCC and DCC).

CorrARCH (Correlated ARCH) The bivariate CorrARCH model of Christodoulakis and Satchell (2002) parameterizes the time-varying conditional correlations as a distributed lag of the product of the standardized innovations from univariate GARCH models for each of the two series. A Fisher transform is used to ensure that the resulting correlations always lie between −1 and 1 (see also CCC, DCC and MGARCH1).

DAGARCH (Dynamic Asymmetric GARCH) The DAGARCH model of Caporin and McAleer (2006) extends the GJR-GARCH model (see GJR) to allow for multiple thresholds and time-varying asymmetric effects (see also AGARCH, ATGARCH and TGARCH).

DCC (Dynamic Conditional Correlations) The multivariate DCC-GARCH model of Engle (2002a) extends the CCC model (see CCC) by allowing the conditional correlations to be time-varying. To facilitate the analysis of large dimensional systems, the basic DCC model postulates that the temporal variation in the conditional correlations may be described by exponential smoothing (see EWMA) so that

ρijt = qijt/(qiit1/2 qjjt1/2),

where

qijt = (1 − λ)εit−1εjt−1 + λqijt−1,

and εt denotes the N × 1 vector innovation process. A closely related formulation was proposed independently by Tse and Tsui (2002), who refer to their approach as a Varying Conditional Correlation, or VCC-MGARCH model (see also ADCC, CorrARCH, FDCC and MGARCH1).
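The exponential-smoothing recursion for the qijt elements is easy to trace for a bivariate system. The sketch below (illustrative function, not from the original paper) updates the three q terms and returns the implied conditional correlation path; each entry is the correlation implied after observing the shocks through that period.

```python
def dcc_correlations(eps1, eps2, lam=0.94):
    # q_ij,t = (1 - lam) * eps_i,t-1 * eps_j,t-1 + lam * q_ij,t-1
    # rho_ij,t = q_ij,t / (q_ii,t^(1/2) * q_jj,t^(1/2))
    q11, q22, q12 = 1.0, 1.0, 0.0
    rhos = []
    for e1, e2 in zip(eps1, eps2):
        q11 = (1 - lam) * e1 * e1 + lam * q11
        q22 = (1 - lam) * e2 * e2 + lam * q22
        q12 = (1 - lam) * e1 * e2 + lam * q12
        rhos.append(q12 / (q11 ** 0.5 * q22 ** 0.5))
    return rhos
```

Because each q term is an exponentially weighted average of cross-products, a Cauchy-Schwarz argument keeps the implied correlations inside [−1, 1].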

diag MGARCH (diagonal GARCH) The diag MGARCH model refers to the simplification of the vech GARCH model (see vech GARCH) in which each of the elements in the conditional covariance matrix depends on its own past values and the products of the corresponding elements in the innovation vector only. The model is conveniently expressed in terms of Hadamard products, or matrix element-by-element multiplication. In particular, for the diag MGARCH(1,1) model,

Ωt = C + A ∘ εt−1ε′t−1 + B ∘ Ωt−1.

It follows (see Attanasio, 1991) that if each of the three N × N matrices C, A and B are positive definite, the conditional covariance matrix will also be positive definite (see also MGARCH1).

DTARCH (Double Threshold ARCH) The DTARCH model of Li and Li (1996) allows the parameters in both the conditional mean and the conditional variance to change across regimes, with the m different regimes determined by a set of threshold parameters for some lag k ≥ 1 of the observed yt process, say rj−1 < yt−k ≤ rj, where −∞ = r0 < r1 < . . . < rm = ∞ (see also TGARCH).

DVEC-GARCH (Diagonal VECtorized GARCH) See diag MGARCH.

ECOGARCH (Exponential Continuous GARCH) The continuous-time ECOGARCH model developed by Haug and Czado (2007) extends the Levy driven COGARCH model of Kluppelberg, Lindner and Maller (2004) (see COGARCH) to allow for different impact of positive and negative jump innovations, or so-called leverage effects. The model may be seen as a continuous-time analog of the discrete-time EGARCH model (see also EGARCH, GJR and TGARCH).

EGARCH (Exponential GARCH) The EGARCH model was developed by Nelson (1991). The model explicitly allows for asymmetries in the relationship between return and volatility (see also GJR and TGARCH). In particular, let zt ≡ εtσ−1t denote the standardized innovations. The EGARCH(1,1) model may then be expressed as:

log(σ2t) = ω + α(|zt−1| − E(|zt−1|)) + γzt−1 + β log(σ2t−1).

For γ < 0 negative shocks will obviously have a bigger impact on future volatility than positive shocks of the same magnitude. This effect, which is typically observed empirically with equity index returns, is often referred to as a "leverage effect," although it is now widely agreed that the apparent asymmetry has little to do with actual financial leverage. By parameterizing the logarithm of the conditional variance as opposed to the conditional variance, the EGARCH model also avoids complications from having to ensure that the process remains positive. This is especially useful when conditioning on other explanatory variables. Meanwhile, the logarithmic transformation complicates the construction of unbiased forecasts for the level of future variances (see also GARCH and log-GARCH).
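A minimal sketch of the EGARCH(1,1) recursion (parameter values illustrative; E|z| = √(2/π) under conditional normality):

```python
import math

def egarch_variances(z, omega=-0.1, alpha=0.1, gamma=-0.08, beta=0.95):
    # log(sigma2_t) = omega + alpha*(|z_{t-1}| - E|z|) + gamma*z_{t-1}
    #                 + beta*log(sigma2_{t-1})
    e_abs_z = math.sqrt(2.0 / math.pi)  # E|z| for a standard normal z
    log_s2 = omega / (1.0 - beta)       # start at the unconditional mean
    out = []
    for zt in z:
        log_s2 = (omega + alpha * (abs(zt) - e_abs_z)
                  + gamma * zt + beta * log_s2)
        out.append(math.exp(log_s2))
    return out
```

Exponentiating at the end guarantees positive variances without parameter restrictions, and with γ < 0 a negative shock raises next-period variance by more than a positive shock of the same size.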

EVT-GARCH (Extreme Value Theory GARCH) The EVT-GARCH approach pioneered by McNeil and Frey (2000) relies on extreme value theory for i.i.d. random variables and corresponding generalized Pareto distributions for more accurately characterizing the tails of the distributions of the standardized innovations from GARCH models. This idea may be used in the calculation of low-probability quantile, or Value-at-Risk, type predictions (see also CAViaR, GARCH-t and GED-GARCH).

EWMA (Exponentially Weighted Moving Average) EWMA variance measures are defined by the recursion,

σ2t = (1 − λ)ε2t−1 + λσ2t−1.

EWMA may be seen as a special case of the GARCH(1,1), or IGARCH(1,1), model in which ω ≡ 0, α ≡ 1 − λ and β ≡ λ (see GARCH and IGARCH). EWMA covariance measures are readily defined in a similar manner. The EWMA approach to variance estimation was popularized by RiskMetrics, advocating the use of λ = 0.94 with daily financial returns.
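The equivalence with a restricted GARCH(1,1) can be verified directly; both recursions below are illustrative helpers over a list of innovations.

```python
def ewma_variances(eps, lam=0.94, s2_init=1.0):
    # sigma2_t = (1 - lam) * eps_{t-1}^2 + lam * sigma2_{t-1}
    s2, out = s2_init, []
    for e in eps:
        s2 = (1 - lam) * e * e + lam * s2
        out.append(s2)
    return out

def garch11_variances(eps, omega, alpha, beta, s2_init=1.0):
    # sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}
    s2, out = s2_init, []
    for e in eps:
        s2 = omega + alpha * e * e + beta * s2
        out.append(s2)
    return out
```

With ω = 0, α = 1 − λ and β = λ the two paths coincide, which is exactly the IGARCH(1,1) restriction noted above.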

F-ARCH (Factor ARCH) The multivariate factor ARCH model developed by Diebold and Nerlove (1989) (see also Latent GARCH) and the factor GARCH model of Engle, Ng and Rothschild (1990) assume that the temporal variation in the N × N conditional covariance matrix for a set of N returns can be described by univariate GARCH models for a smaller set of K < N portfolios,

Ωt = Ω + Σk=1..K λkλ′kσ2kt,

where λk and σ2kt refer to the time invariant N × 1 vector of factor loadings and time t conditional variance for the kth factor, respectively. More specifically, the F-GARCH(1,1) model may be expressed as:

Ωt = Ω + λλ′[βw′Ωt−1w + α(w′εt−1)2],

where w denotes an N × 1 vector, and α and β are both scalar parameters (see also OGARCH and MGARCH1).

FCGARCH (Flexible Coefficient GARCH) The FCGARCH model of Medeiros and Veiga (2009) defines the conditional variance as a linear combination of standard GARCH-type models, with the weights assigned to each model determined by a set of logistic functions. The model nests several alternative smooth transition and asymmetric GARCH models as special limiting cases, including the DTARCH, GJR, STGARCH, TGARCH, and VSGARCH models.

FDCC (Flexible Dynamic Conditional Correlations) The FDCC-GARCH model of Billio, Caporin and Gobbo (2006) generalizes the basic DCC model (see DCC) to allow for different dynamic dependencies in the time-varying conditional correlations (see also ADCC).

FGARCH (Factor GARCH) See F-ARCH.

FIAPARCH (Fractionally Integrated Power ARCH) The FIAPARCH(p,d,q) model of Tse (1998) combines the FIGARCH(p,d,q) and the APARCH(p,q) models in parameterizing σδt as a fractionally integrated distributed lag of (|εt| − γεt)δ (see FIGARCH and APARCH).

FIEGARCH (Fractionally Integrated EGARCH) The FIEGARCH model of Bollerslev and Mikkelsen (1996) imposes a fractional unit root in the autoregressive polynomial in the ARMA representation of the EGARCH model (see EGARCH). In particular, the FIEGARCH(1,d,1) model may be conveniently expressed as:

(1 − βL)(1 − L)d log(σ2t) = ω + α(|zt−1| − E(|zt−1|)) + γzt−1.

For 0 < d < 1 this representation implies fractionally integrated slowly decaying hyperbolic dependencies in log(σ2t) (see also FIGARCH, HYGARCH and LMGARCH).

FIGARCH (Fractionally Integrated GARCH) The FIGARCH model proposed by Baillie, Bollerslev and Mikkelsen (1996) relies on an ARFIMA-type representation to better capture the long run dynamic dependencies in the conditional variance. The model may be seen as a natural extension of the IGARCH model (see IGARCH), allowing for fractional orders of integration in the autoregressive polynomial in the corresponding ARMA representation,

ϕ(L)(1 − L)dε2t = ω + (1 − β(L))νt,

where νt ≡ ε2t − σ2t, 0 < d < 1, and the roots of ϕ(z) = 0 and β(z) = 1 are all outside the unit circle. For values of 0 < d < 1/2 the model implies an eventual slow hyperbolic decay in the autocorrelations for σ2t (see also FIEGARCH, HYGARCH and LMGARCH).

FIREGARCH (Fractionally Integrated Range EGARCH) See REGARCH.

FLEX-GARCH (Flexible GARCH) The multivariate Flex-GARCH model of Ledoit, Santa-Clara and Wolf (2003) is designed to reduce the computational burden involved in the estimation of the multivariate diagonal MGARCH model (see diag MGARCH). This is accomplished by estimating a set of bivariate MGARCH models for each of the N(N + 1)/2 possible different pairwise combinations of the N variables, and then subsequently "pasting" together the parameter estimates subject to the constraint that the resulting parameter matrices for the full N-dimensional MGARCH model guarantee positive semidefinite conditional covariance matrices.

GAARCH (Generalized Augmented ARCH) See AARCH.

GARCH (Generalized AutoRegressive Conditional Heteroskedasticity) The GARCH(p,q) model of Bollerslev (1986) includes p lags of the conditional variance in the linear ARCH(q) (see ARCH) conditional variance equation,

σ2t = ω + Σi=1..q αiε2t−i + Σi=1..p βiσ2t−i.

Conditions on the parameters to ensure that the GARCH(p,q) conditional variance is always positive are given in Nelson and Cao (1992). The GARCH(p,q) model may alternatively be represented as an ARMA(max{p,q},p) model for the squared innovations:

ε2t = ω + Σi=1..max{p,q} (αi + βi)ε2t−i + νt − Σi=1..p βiνt−i,

where νt ≡ ε2t − σ2t, so that by definition Et−1(νt) = 0. The relatively simple GARCH(1,1) model,

σ2t = ω + αε2t−1 + βσ2t−1,

often provides a good fit in empirical applications. This particular parameterization was also proposed independently by Taylor (1986). The GARCH(1,1) model is well defined and the conditional variance positive almost surely provided that ω > 0, α ≥ 0 and β ≥ 0. The GARCH(1,1) model may alternatively be expressed as an ARCH(∞) model,

σ2t = ω(1 − β)−1 + α Σi=1..∞ βi−1ε2t−i,

provided that β < 1. If α + β < 1 the model is covariance stationary and the unconditional variance equals σ2 ≡ ω/(1 − α − β). Multiperiod conditional variance forecasts from the GARCH(1,1) model may readily be calculated as:

σ2t+h|t = σ2 + (α + β)h−1(σ2t+1 − σ2),

where h ≥ 2 denotes the horizon of the forecast.
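The forecasting formula is easy to check numerically; the helper below is a direct transcription (function name illustrative), showing geometric convergence of the forecasts toward the unconditional variance.

```python
def garch11_forecast(omega, alpha, beta, s2_next, h):
    # sigma2_{t+h|t} = sigma2 + (alpha + beta)^(h-1) * (sigma2_{t+1} - sigma2),
    # with sigma2 = omega / (1 - alpha - beta) the unconditional variance.
    s2_bar = omega / (1.0 - alpha - beta)
    return s2_bar + (alpha + beta) ** (h - 1) * (s2_next - s2_bar)
```

For h = 1 the formula returns σ2t+1 itself, and as h grows the forecast decays toward σ2 at the geometric rate (α + β).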

GARCH-Δ (GARCH Delta) See GARCH-Γ.

GARCH Diffusion The continuous-time GARCH diffusion model is defined by:

dy(t) = σ(t)dW1(t),

and

dσ2(t) = (ω − θσ2(t))dt + √(2α)σ2(t)dW2(t),

where the two Wiener processes, W1(t) and W2(t), that drive the observable y(t) process and the instantaneous latent volatility process, σ2(t), are assumed to be independent. As shown by Nelson (1990b), the sequence of GARCH(1,1) models defined over discrete time intervals of length 1/n,

σ2t,n = (ω/n) + (α/n1/2)ε2t−1/n,n + (1 − α/n1/2 − θ/n)σ2t−1/n,n,

where εt,n ≡ y(t) − y(t − 1/n), converges weakly to a GARCH diffusion model for n → ∞ (see also COGARCH and ARCH-Filters).

GARCH-EAR (GARCH Exponential AutoRegression) The GARCH-EAR model of LeBaron (1992) allows the first order serial correlation of the underlying process to depend directly on the conditional variance,

yt = ϕ0 + [ϕ1 + ϕ2 exp(−σ2t/ϕ3)]yt−1 + εt.

For ϕ2 = 0 the model reduces to a standard AR(1) model, but for ϕ2 > 0 and ϕ3 > 0 the magnitude of the serial correlation in the mean will be a decreasing function of the conditional variance (see also ARCH-M).

GARCH-Γ (GARCH Gamma) The gamma of an option is defined as the second derivative of the option price with respect to the price of the underlying asset. Option gammas play an important role in hedging volatility risk embedded in options positions. GARCH-Γ refers to the gamma obtained under the assumption that the return on the underlying asset follows a GARCH process. Engle and Rosenberg (1995) find that GARCH-Γs are typically much higher than conventional Black–Scholes gammas. Meanwhile, GARCH-Δs, or the first derivatives of the option price with respect to the price of the underlying asset, tend to be fairly close to their Black–Scholes counterparts.

GARCH-M (GARCH in Mean) See ARCH-M.

GARCHS (GARCH with Skewness) See ARCD.

GARCHSK (GARCH with Skewness and Kurtosis) See ARCD.

GARCH-t (GARCH t-distribution) ARCH models are typically estimated by maximum likelihood under the assumption that the errors are conditionally normally distributed (see ARCH). However, in many empirical applications the standardized residuals, εtσ−1t, appear to have fatter tails than the normal distribution. The GARCH-t model of Bollerslev (1987) relaxes the assumption of conditional normality by instead assuming that the standardized innovations follow a standardized Student t-distribution. The corresponding log likelihood function may be expressed as:

LogL(θ) = Σt=1..T log[Γ((ν + 1)/2)Γ(ν/2)−1(π(ν − 2)σ2t)−1/2(1 + (ν − 2)−1σ−2t ε2t)−(ν+1)/2],

where ν > 2 denotes the degrees of freedom to be estimated along with the parameters in the conditional variance equation (see also GED-GARCH, QMLE and SPARCH).
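The likelihood is straightforward to code with log-gamma functions. The sketch below (stdlib only, function name illustrative) sums the per-observation contributions and, as a sanity check, approaches the Gaussian likelihood as ν grows.

```python
import math

def garch_t_loglik(eps, sigma2, nu):
    # Per-observation contribution of the standardized Student-t density:
    # log Gamma((nu+1)/2) - log Gamma(nu/2) - 0.5*log(pi*(nu-2))
    # - 0.5*log(sigma2_t) - ((nu+1)/2)*log(1 + eps_t^2 / ((nu-2)*sigma2_t))
    c = (math.lgamma((nu + 1) / 2.0) - math.lgamma(nu / 2.0)
         - 0.5 * math.log(math.pi * (nu - 2)))
    ll = 0.0
    for e, s2 in zip(eps, sigma2):
        ll += (c - 0.5 * math.log(s2)
               - 0.5 * (nu + 1) * math.log(1.0 + e * e / ((nu - 2) * s2)))
    return ll
```

Small values of ν fatten the tails, so a large standardized shock receives noticeably more likelihood than under conditional normality.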

GARCH-X1 The multivariate GARCH-X model of Lee (1994) includes the error correction term from a cointegrating-type relationship for the underlying vector process yt ∼ I(1), say zt−1 = b′yt−1 ∼ I(0), as an explanatory variable in the conditional covariance matrix (see also MGARCH1).

GARCH-X2 The GARCH-X model proposed by Brenner, Harjes and Kroner (1996) for modeling short-term interest rates includes the lagged interest rate raised to some power, say δrγt−1, as an explanatory variable in the GARCH conditional variance equation (see GARCH).

GARCHX The GARCHX model proposed by Hwang and Satchell (2005) for modeling aggregate stock market return volatility includes a measure of the lagged cross-sectional return variation as an explanatory variable in the GARCH conditional variance equation (see GARCH).

GARJI Maheu and McCurdy (2004) refer to the standard GARCH model (see GARCH) augmented with occasional Poisson distributed "jumps" or large moves, where the time-varying jump intensity is determined by a separate autoregressive process, as a GARJI model.

GDCC (Generalized Dynamic Conditional Correlations) The multivariate GDCC-GARCH model of Cappiello, Engle and Sheppard (2006) utilizes a more flexible BEKK-type parameterization (see BEKK) for the dynamic conditional correlations (see DCC). Combining the ADCC (see ADCC) and the GDCC models results in an AGDCC model (see also FDCC).

GED-GARCH (Generalized Error Distribution GARCH) The GED-GARCH model of Nelson (1991) replaces the assumption of conditionally normal errors traditionally used in the estimation of ARCH models with the assumption that the standardized innovations follow a generalized error distribution, or what is also sometimes referred to as an exponential power distribution (see also GARCH-t).

GJR (Glosten, Jagannathan and Runkle GARCH) The GJR-GARCH, or just GJR, model of Glosten, Jagannathan and Runkle (1993) allows the conditional variance to respond differently to the past negative and positive innovations. The GJR(1,1) model may be expressed as:

σ2t = ω + αε2t−1 + γε2t−1I(εt−1 < 0) + βσ2t−1,

where I(·) denotes the indicator function. The model is also sometimes referred to as a Sign-GARCH model. The GJR formulation is closely related to the Threshold GARCH, or TGARCH, model proposed independently by Zakoïan (1994) (see TGARCH), and the Asymmetric GARCH, or AGARCH, model of Engle (1990) (see AGARCH). When estimating the GJR model with equity index returns, γ is typically found to be positive, so that the volatility increases proportionally more following negative than positive shocks. This asymmetry is sometimes referred to in the literature as a "leverage effect," although it is now widely agreed that it has little to do with actual financial leverage (see also EGARCH).
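The asymmetry is easy to see from the news impact of a single shock; a hypothetical one-step updater (default parameter values illustrative):

```python
def gjr_variance(eps_prev, s2_prev, omega=0.05, alpha=0.04, gamma=0.1, beta=0.9):
    # sigma2_t = omega + alpha*eps^2 + gamma*eps^2*I(eps < 0) + beta*sigma2_{t-1}
    indicator = 1.0 if eps_prev < 0 else 0.0
    return omega + (alpha + gamma * indicator) * eps_prev ** 2 + beta * s2_prev
```

A negative shock of size ε adds γε2 on top of the symmetric αε2 response, matching the positive γ estimates typically found with equity index returns.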

GO-GARCH (Generalized Orthogonal GARCH) The multivariate GO-GARCH model of van der Weide (2002) assumes that the temporal variation in the N × N conditional covariance matrix may be expressed in terms of N conditionally uncorrelated components,

Ωt = XDtX′,

where X denotes an N × N matrix, and Dt is diagonal with the conditional variances for each of the components along the diagonal. This formulation permits estimation by a relatively easy-to-implement two-step procedure (see also F-ARCH, OGARCH and MGARCH1).

GQARCH (Generalized Quadratic ARCH) The GQARCH(p,q) model of Sentana (1995) is defined by:

σ2t = ω + Σi=1..q ψiεt−i + Σi=1..q αiε2t−i + 2 Σi=1..q Σj=i+1..q αijεt−iεt−j + Σi=1..p βiσ2t−i.

The model simplifies to the linear GARCH(p,q) model if all of the ψis and the αijs are equal to zero. Defining the q × 1 vector et−1 ≡ {εt−1, εt−2, . . . , εt−q}, the model may alternatively be expressed as:

σ2t = ω + Ψ′et−1 + e′t−1Aet−1 + Σi=1..p βiσ2t−i,

where Ψ denotes the q × 1 vector of ψi coefficients and A refers to the q × q symmetric matrix of αi and αij coefficients. Conditions on the parameters for the conditional variance to be positive almost surely and the model well defined are given in Sentana (1995) (see also AARCH).

GQTARCH (Generalized Qualitative Threshold ARCH) See QTARCH.

GRS-GARCH (Generalized Regime-Switching GARCH) The GRS-GARCH model proposed by Gray (1996) allows the parameters in the GARCH model to depend upon an unobservable latent state variable governed by a first order Markov process. By aggregating the conditional variances over all of the possible states at each point in time, the model is formulated in such a way that it breaks the path-dependence which complicates the estimation of the SWARCH model of Cai (1994) and Hamilton and Susmel (1994) (see SWARCH).

HARCH (Heterogeneous ARCH) The HARCH(n) model of Muller, Dacorogna, Dave, Olsen, Pictet and von Weizsacker (1997) parameterizes the conditional variance as a function of the square of the sum of lagged innovations, or the squared lagged returns, over different horizons,

σ2t = ω + Σi=1..n γi(Σj=1..i εt−j)2.

The model is motivated as arising from the interaction of traders with different investment horizons. The HARCH model may be interpreted as a restricted QARCH model (see GQARCH).
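A direct transcription of the variance equation (hypothetical helper; eps_lags is ordered most recent first):

```python
def harch_variance(omega, gammas, eps_lags):
    # sigma2_t = omega + sum_i gamma_i * (eps_{t-1} + ... + eps_{t-i})^2,
    # so gamma_i weights the squared return over the last i periods.
    s2 = omega
    for i, g in enumerate(gammas, start=1):
        s2 += g * sum(eps_lags[:i]) ** 2
    return s2
```

Each horizon i contributes the squared i-period return, so the i = 1 term alone reproduces the ARCH(1) response while the longer sums pick up the heterogeneous investment horizons.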

HESTON GARCH See SQR-GARCH.

HGARCH (Hentschel GARCH) The HGARCH model of Hentschel (1995) is based on a Box-Cox transform of the conditional standard deviation. It is explicitly designed to nest some of the most popular univariate parameterizations. The HGARCH(1,1) model may be expressed as:

σδt = ω + ασδt−1(|εt−1σ−1t−1 − κ| − γ(εt−1σ−1t−1 − κ))ν + βσδt−1.

The model obviously reduces to the standard linear GARCH(1,1) model for δ = 2, ν = 2, κ = 0 and γ = 0, but it also nests the APARCH, AGARCH1, EGARCH, GJR, NGARCH, TGARCH, and TS-GARCH models as special cases (see also Aug-GARCH).

HYGARCH (Hyperbolic GARCH) The HYGARCH model proposed by Davidson (2004) nests the GARCH, IGARCH and FIGARCH models (see GARCH, IGARCH and FIGARCH). The model is defined in terms of the ARCH(∞) representation (see also LARCH),

σ2t = ω + Σi=1..∞ αiε2t−i ≡ ω + [1 − δ(L)β(L)−1(1 + α((1 − L)d − 1))]ε2t−1.

The standard GARCH and FIGARCH models correspond to α = 0, and α = 1 and 0 < d < 1, respectively. For d = 1 the HYGARCH model reduces to a standard GARCH or an IGARCH model depending upon whether α < 1 or α = 1.

IGARCH (Integrated GARCH) Estimates of the standard linear GARCH(p,q) model (see GARCH) often result in the sum of the estimated αi and βi coefficients being close to unity. Rewriting the GARCH(p,q) model as an ARMA(max{p,q},p) model for the squared innovations,

(1 − α(L) − β(L))ε2t = ω + (1 − β(L))νt,

where νt ≡ ε2t − σ2t, and α(L) and β(L) denote appropriately defined lag polynomials, the IGARCH model of Engle and Bollerslev (1986) imposes an exact unit root in the corresponding autoregressive polynomial, (1 − α(L) − β(L)) = ϕ(L)(1 − L), so that the model may be written as:

ϕ(L)(1 − L)ε2t = ω + (1 − β(L))νt.

Even though the IGARCH model is not covariance stationary, it is still strictly stationary with a well-defined nondegenerate limiting distribution; see Nelson (1990a). Also, as shown by Lee and Hansen (1994) and Lumsdaine (1996), standard inference procedures may be applied in testing the hypothesis of a unit root, or α(1) + β(1) = 1 (see also FIGARCH).

IV (Implied Volatility) Implied volatility refers to the volatility that would equate the theoretical price of an option according to some valuation model, typically Black–Scholes, to the actual market price of the option.

LARCH (Linear ARCH) The ARCH(∞) representation,

σ2t = ω + Σi=1..∞ αiε2t−i,

is sometimes referred to as a LARCH model. This representation was first used by Robinson (1991) in the derivation of general tests for conditional heteroskedasticity.

Latent GARCH Models formulated in terms of latent variables that adhere to GARCH structures are sometimes referred to as latent GARCH, or unobserved GARCH, models. A leading example is the N-dimensional factor ARCH model of Diebold and Nerlove (1989), εt = λft + ηt, where λ and ηt denote N × 1 vectors of factor loadings and i.i.d. innovations, respectively, and the conditional variance of ft is determined by an ARCH model in lagged squared values of the latent factor (see also F-ARCH). Models in which the innovations are subject to censoring are another example (see Tobit-GARCH). In contrast to standard ARCH and GARCH models, for which the likelihood functions are readily available through a prediction error decomposition-type argument (see ARCH), the likelihood functions for latent GARCH models are generally not available in closed form. General estimation and inference procedures for latent GARCH models based on Markov Chain Monte Carlo methods have been developed by Fiorentini, Sentana and Shephard (2004) (see also SV).

Level-GARCH The Level-GARCH model proposed by Brenner, Harjes and Kroner (1996) for modeling the conditional variance of short-term interest rates postulates that

σ2t = ψ2t r2γt−1,

where ψt follows a GARCH(1,1) structure,

ψ2t = ω + αε2t−1 + βψ2t−1.

For γ = 0 the model obviously reduces to a standard GARCH(1,1) model. The Level-GARCH model is also sometimes referred to as the Time-Varying Parameter Level, or TVP-Level, model (see also GARCH and GARCH-X2).

LGARCH1 (Leverage GARCH) The GJR model is sometimes referred to as a LGARCH model (see GJR).

LGARCH2 (Linear GARCH) The standard GARCH(p,q) model (see GARCH) in which the conditional variance is a linear function of p own lags and q lagged squared innovations is sometimes referred to as a LGARCH model.

LMGARCH (Long Memory GARCH) The LMGARCH(p,d,q) model is defined by,

σ2t = ω + [β(L)ϕ(L)−1(1 − L)−d − 1]νt,

where νt ≡ ε2t − σ2t, and 0 < d < 0.5. Provided that the fourth order moment exists, the resulting process for ε2t is covariance stationary and exhibits long memory. For further discussion and comparisons with the FIGARCH model see Conrad and Karanasos (2006) (see also FIGARCH and HYGARCH).

log-GARCH (logarithmic GARCH) The log-GARCH(p,q) model, which was suggested independently in slightly different forms by Geweke (1986), Pantula (1986) and Milhøj (1987), parameterizes the logarithmic conditional variance as a function of the lagged logarithmic variances and the lagged logarithmic squared innovations,

log(σ2t) = ω + Σi=1..q αi log(ε2t−i) + Σi=1..p βi log(σ2t−i).

The model may alternatively be expressed as:

σ2t = exp(ω) Πi=1..q (ε2t−i)αi Πi=1..p (σ2t−i)βi.

In light of this alternative representation, the model is also sometimes referred to as a Multiplicative GARCH, or MGARCH, model.
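The two representations are numerically identical up to rounding, which the following sketch verifies (illustrative helpers; lag lists ordered most recent first):

```python
import math

def log_garch_log_form(omega, alphas, betas, eps2_lags, s2_lags):
    # log(sigma2_t) = omega + sum_i alpha_i*log(eps2_{t-i})
    #                       + sum_i beta_i*log(sigma2_{t-i})
    log_s2 = (omega
              + sum(a * math.log(e2) for a, e2 in zip(alphas, eps2_lags))
              + sum(b * math.log(v) for b, v in zip(betas, s2_lags)))
    return math.exp(log_s2)

def log_garch_product_form(omega, alphas, betas, eps2_lags, s2_lags):
    # sigma2_t = exp(omega) * prod_i eps2_{t-i}^alpha_i
    #                       * prod_i sigma2_{t-i}^beta_i
    s2 = math.exp(omega)
    for a, e2 in zip(alphas, eps2_lags):
        s2 *= e2 ** a
    for b, v in zip(betas, s2_lags):
        s2 *= v ** b
    return s2
```

Exponentiating a sum of logs and multiplying the corresponding powers are the same operation, which is why the multiplicative name attaches to the second form.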

MACH (Moving Average Conditional Heteroskedastic) The MACH(p) class of models proposed by Yang and Bewley (1995) is formally defined by the condition:

E_t(σ_{t+i}^2) = E(σ_{t+i}^2),  i > p,

so that the effect of a shock to the conditional variance lasts for at most p periods. More specifically, the Linear MACH(1), or L-MACH(1), model is defined by σ_t^2 = ω + α(ε_{t-1}/σ_{t-1})^2. Higher order L-MACH(p) models, Exponential MACH(p), or E-MACH(p), models, and Quadratic MACH(p), or Q-MACH(p), models may be defined in a similar manner (see also EGARCH and GQARCH). The standard linear ARCH(1) model, σ_t^2 = ω + αε_{t-1}^2, is not a MACH(1) process.

MAR-ARCH (Mixture AutoRegressive ARCH) See MGARCH3.

MARCH1 (Modified ARCH) Friedman, Laibson and Minsky (1989) denote the class of GARCH(1,1) models in which the conditional variance depends nonlinearly on the lagged squared innovations as Modified ARCH models,

σ_t^2 = ω + αF(ε_{t-1}^2) + βσ_{t-1}^2,

where F(·) denotes a positive valued function. In their estimation of the model Friedman, Laibson and Minsky (1989) use the function F(x) = sin(θx)·I(θx < π/2) + 1·I(θx ≥ π/2) (see also NGARCH).

MARCH2 (Multiplicative ARCH) See MGARCH2.

Matrix EGARCH The multivariate matrix exponential GARCH model of Kawakatsu (2006) (see also EGARCH and MGARCH1) specifies the second moment dynamics in terms of the matrix logarithm of the conditional covariance matrix. More specifically, let h_t = vech(log Ω_t) denote the N(N+1)/2 × 1 vector of unique elements in log Ω_t, where the logarithm of a matrix is defined by the inverse of the power series expansion used in defining the matrix exponential. A simple multivariate matrix EGARCH extension of the univariate EGARCH(1,1) model may then be expressed as:

h_t = Ω + A(|ε_{t-1}| − E(|ε_{t-1}|)) + Γε_{t-1} + Bh_{t-1},

for appropriately dimensioned matrices Ω, A, Γ and B. By parameterizing only the unique elements of the logarithmic conditional covariance matrix, the matrix EGARCH model automatically guarantees that Ω_t ≡ exp(h_t) is positive definite.

MDH (Mixture of Distributions Hypothesis) The MDH first developed by Clark (1973) postulates that financial returns over nontrivial time intervals, say one day, represent the accumulated effect of numerous within period, or intraday, news arrivals and corresponding price changes. The MDH coupled with the assumption of serially correlated news arrivals is often used to rationalize the apparent volatility clustering, or ARCH effects, in asset returns. More advanced versions of the MDH, relating the time-deformation to various financial market activity variables, such as the number of trades, the cumulative trading volume or the number of quotes, have been developed and explored empirically by Tauchen and Pitts (1983) and Andersen (1996) among many others.

MEM (Multiplicative Error Model) The Multiplicative Error class of Models (MEM) was proposed by Engle (2002b) as a general framework for modeling non-negative valued time series. The MEM may be expressed as,

x_t = μ_t η_t,

where x_t ≥ 0 denotes the time series of interest, μ_t refers to its conditional mean, and η_t is a non-negative i.i.d. process with unit mean. The conditional mean is naturally parameterized as,

μ_t = ω + ∑_{i=1}^{q} α_i x_{t-i} + ∑_{i=1}^{p} β_i μ_{t-i},

where conditions on the parameters for μ_t to be positive follow from the corresponding conditions for the GARCH(p,q) model (see GARCH). Defining x_t ≡ ε_t^2 and μ_t ≡ σ_t^2, the MEM class of models encompasses all ARCH and GARCH models, and specific formulations are readily estimated by the corresponding software for GARCH models. The ACD model for durations may also be interpreted as a MEM (see ACD).
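A minimal simulation sketch of a MEM(1,1) in Python may help fix ideas (the parameter values are hypothetical, and unit-mean exponential errors are just one admissible choice for η_t):

```python
import numpy as np

rng = np.random.default_rng(1)
omega, alpha, beta = 0.1, 0.2, 0.7  # illustrative; alpha + beta < 1
T = 1000
eta = rng.exponential(1.0, T)       # i.i.d., non-negative, unit mean

mu = np.empty(T)
x = np.empty(T)
mu[0] = omega / (1 - alpha - beta)  # start at the unconditional mean
x[0] = mu[0] * eta[0]
for t in range(1, T):
    # mu_t = omega + alpha * x_{t-1} + beta * mu_{t-1}
    mu[t] = omega + alpha * x[t-1] + beta * mu[t-1]
    x[t] = mu[t] * eta[t]
```

With x_t ≡ ε_t^2 the same recursion is exactly the GARCH(1,1) variance equation, which is why GARCH software can be reused to estimate MEMs.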

MGARCH1 (Multivariate GARCH) Multivariate GARCH models were first analyzed and estimated empirically by Bollerslev, Engle and Wooldridge (1988). The unrestricted linear MGARCH(p,q) model is defined by:

vech(Ω_t) = Ω + ∑_{i=1}^{q} A_i vech(ε_{t-i} ε_{t-i}′) + ∑_{i=1}^{p} B_i vech(Ω_{t-i}),

where vech(·) denotes the operator that stacks the lower triangular portion of a symmetric N × N matrix into an N(N+1)/2 × 1 vector of the corresponding unique elements, and the A_i and B_i matrices are all of compatible dimension N(N+1)/2 × N(N+1)/2. This vectorized representation is also sometimes referred to as a VECH GARCH model. The general vech representation does not guarantee that the resulting conditional covariance matrices Ω_t are positive definite. Also, the model involves a total of N(N+1)/2 + (p+q)(N^4 + 2N^3 + N^2)/4 parameters, which becomes prohibitively expensive from a practical computational point of view for anything but the bivariate case, or N = 2. Much of the research on multivariate GARCH models has been concerned with the development of alternative, more parsimonious, yet empirically realistic, representations that easily ensure the conditional covariance matrices are positive definite. The trivariate vech MGARCH(1,1) model estimated in Bollerslev, Engle and Wooldridge (1988) assumes that the A_1 and B_1 matrices are both diagonal, so that each element in Ω_t depends exclusively on its own lagged value and the product of the corresponding shocks. This diagonal simplification, resulting in "only" (1 + p + q)(N^2 + N)/2 parameters to be estimated, is often denoted as a diag MGARCH model (see also diag MGARCH).
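The parameter counts above are easy to verify programmatically; here is a short sketch (the function names are ours), using the identity (N(N+1)/2)^2 = (N^4 + 2N^3 + N^2)/4:

```python
def vech_mgarch_params(N: int, p: int, q: int) -> int:
    """Parameters in the unrestricted vech MGARCH(p,q): the intercept vector
    plus (p+q) square matrices acting on the N(N+1)/2 unique elements."""
    k = N * (N + 1) // 2
    return k + (p + q) * k * k

def diag_mgarch_params(N: int, p: int, q: int) -> int:
    """Parameters in the diagonal (diag MGARCH) simplification."""
    return (1 + p + q) * N * (N + 1) // 2
```

For a trivariate MGARCH(1,1), the unrestricted count is already 78, against 18 for the diagonal version, which illustrates why the restriction is so widely used.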

MGARCH2 (Multiplicative GARCH) Slightly different versions of the univariate Multiplicative GARCH model were proposed independently by Geweke (1986), Pantula (1986) and Milhøj (1987). The model is more commonly referred to as the log-GARCH model (see log-GARCH).

MGARCH3 (Mixture GARCH) The MAR-ARCH model of Wong and Li (2001) and the MGARCH model of Zhang, Li and Yuen (2006) postulate that the time t conditional variance is given by a time-invariant mixture of different GARCH models (see also GRS-GARCH, NM-GARCH and SWARCH).

MS-GARCH (Markov Switching GARCH) See SWARCH.

MV-GARCH (MultiVariate GARCH) The MV-GARCH, MGARCH and VGARCH acronyms are used interchangeably (see MGARCH1).

NAGARCH (Nonlinear Asymmetric GARCH) The NAGARCH(1,1) model of Engle and Ng (1993) is defined by:

σ_t^2 = ω + α(ε_{t-1}σ_{t-1}^{-1} + γ)^2 + βσ_{t-1}^2.

Higher order NAGARCH(p,q) models may be defined similarly (see also AGARCH1 and VGARCH1).

NGARCH (Nonlinear GARCH) The NGARCH(p,q) model proposed by Higgins and Bera (1992) parameterizes the conditional standard deviation raised to the power δ as a function of the lagged conditional standard deviations and the lagged absolute innovations raised to the same power,

σ_t^δ = ω + ∑_{i=1}^{q} α_i |ε_{t-i}|^δ + ∑_{i=1}^{p} β_i σ_{t-i}^δ.

This formulation obviously reduces to the standard GARCH(p,q) model for δ = 2 (see GARCH). The NGARCH model is also sometimes referred to as a Power ARCH or Power GARCH model, or PARCH or PGARCH model. A slightly different version of the NGARCH model was originally estimated by Engle and Bollerslev (1986),

σ_t^2 = ω + α|ε_{t-1}|^δ + βσ_{t-1}^2.

With most financial rates of returns, the estimates for δ are found to be less than two, although not always significantly so (see also APARCH and TS-GARCH).

NL-GARCH (NonLinear GARCH) The NL-GARCH acronym is sometimes used to describe all parameterizations different from the benchmark linear GARCH(p,q) representation (see GARCH).

NM-GARCH (Normal Mixture GARCH) The NM-GARCH model postulates that the distribution of the standardized innovations ε_t σ_t^{-1} is determined by a mixture of two or more normal distributions. The statistical properties of the NM-GARCH(1,1) model have been studied extensively by Alexander and Lazar (2006) (see also GARCH-t, GED-GARCH and SWARCH).

OGARCH (Orthogonal GARCH) The multivariate OGARCH model assumes that the N × 1 vector process ε_t may be represented as ε_t = Γf_t, where the columns of the N × m matrix Γ are mutually orthogonal, and the m elements in the m × 1 f_t vector process are conditionally uncorrelated with GARCH conditional variances. Consequently, the conditional covariance matrix for ε_t may be expressed as:

Ω_t = ΓD_tΓ′,

where D_t denotes the m × m diagonal matrix with the conditional factor variances along the diagonal. Estimation and inference in the OGARCH model are discussed in detail in Alexander (2001, 2008). The OGARCH model is also sometimes referred to as a principal component MGARCH model. The approach is related to but formally different from the PC-GARCH model of Burns (2005) (see also F-ARCH, GO-GARCH, MGARCH1 and PC-GARCH).

PARCH (Power ARCH) See NGARCH.

PC-GARCH (Principal Component GARCH) The multivariate PC-GARCH model of Burns (2005) is based on the estimation of univariate GARCH models to the principal components, defined by the covariance matrix for the standardized residuals from a first stage estimation of univariate GARCH models for each of the individual series (see also OGARCH).

PGARCH1 (Periodic GARCH) The PGARCH model of Bollerslev and Ghysels (1996) was designed to account for periodic dependencies in the conditional variance by allowing the parameters of the model to vary over the cycle. In particular, the PGARCH(1,1) model is defined by:

σ_t^2 = ω_{s(t)} + α_{s(t)} ε_{t-1}^2 + β_{s(t)} σ_{t-1}^2,

where s(t) refers to the stage of the periodic cycle at time t, and ω_{s(t)}, α_{s(t)} and β_{s(t)} denote the different GARCH(1,1) parameter values for s(t) = 1, 2, . . . , P.

PGARCH2 (Power GARCH) See NGARCH.

PNP-ARCH (Partially NonParametric ARCH) The PNP-ARCH model estimated by Engle and Ng (1993) allows the conditional variance to be a partially linear function of the lagged innovations and the lagged conditional variance,

σ_t^2 = ω + βσ_{t-1}^2 + ∑_{i=-m}^{m} θ_i (ε_{t-1} − iσ) I(ε_{t-1} < iσ),

where σ denotes the unconditional standard deviation of the process, and m is an integer. The PNP-ARCH model was used by Engle and Ng (1993) in the construction of so-called news impact curves, reflecting how the conditional variance responds to different sized shocks (see also GJR and TGARCH).
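A news impact curve is straightforward to trace out numerically for any of these specifications. The sketch below uses the simpler GJR form (see GJR) rather than the PNP-ARCH itself, with purely illustrative parameter values of our own:

```python
import numpy as np

def gjr_news_impact(eps, omega=0.05, alpha=0.04, gamma=0.10, beta=0.90, sigma2=1.0):
    """Conditional variance as a function of the lagged shock eps, holding the
    lagged variance fixed at its unconditional level sigma2 (illustrative
    parameters, not estimates from any data set)."""
    return omega + alpha * eps**2 + gamma * eps**2 * (eps < 0) + beta * sigma2

grid = np.linspace(-5, 5, 201)
curve = gjr_news_impact(grid)
```

The curve is a parabola that is steeper on the negative side whenever γ > 0, which is precisely the asymmetry the news impact curve is designed to reveal.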

QARCH (Quadratic ARCH) See GQARCH.

QMLE (Quasi Maximum Likelihood Estimation) ARCH models are typically estimated under the assumption of conditional normality (see ARCH). Even if the assumption of conditional normality is violated (see also GARCH-t, GED-GARCH and SPARCH), the parameter estimates generally remain consistent and asymptotically normally distributed, as long as the first two conditional moments of the model are correctly specified; i.e., E_{t-1}(ε_t) = 0 and E_{t-1}(ε_t^2) = σ_t^2. A robust covariance matrix for the resulting QMLE parameter estimates may be obtained by post- and pre-multiplying the matrix of outer products of the gradients with an estimate of the inverse of Fisher's Information Matrix. A relatively simple-to-compute expression for this matrix involving only first derivatives was derived in Bollerslev and Wooldridge (1992). The corresponding robust standard errors are sometimes referred to in the ARCH literature as Bollerslev–Wooldridge standard errors.
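In matrix terms the robust covariance takes the familiar sandwich form A^{-1} B A^{-1}, with B the outer product of the per-observation scores and A an estimate of the information. A generic numerical sketch (the function name and interface are ours, and this is not the exact Bollerslev–Wooldridge expression):

```python
import numpy as np

def sandwich_cov(scores, info):
    """Sandwich covariance inv(A) @ B @ inv(A), where `scores` is the T x k
    matrix of per-observation score contributions (so B = scores' scores) and
    `info` is a k x k estimate of the information matrix. Generic sketch."""
    A_inv = np.linalg.inv(info)
    B = scores.T @ scores
    return A_inv @ B @ A_inv
```

When the conditional distribution is correctly specified the information equality gives B ≈ A, and the sandwich collapses back to the usual inverse-information covariance.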

QTARCH (Qualitative Threshold ARCH) The QTARCH(q) model of Gourieroux and Monfort (1992) assumes that the conditional variance may be represented by a sum of step functions:

σ_t^2 = ω + ∑_{i=1}^{q} ∑_{j=1}^{J} α_{ij} I_j(ε_{t-i}),

where the I_j(·) function partitions the real line into J sub-intervals, so that I_j(ε_{t-i}) equals unity if ε_{t-i} falls in the jth sub-interval and zero otherwise. The Generalized QTARCH, or GQTARCH(p,q), model is readily defined by including p lagged conditional variances on the right-hand-side of the equation.

REGARCH (Range EGARCH) The REGARCH model of Brandt and Jones (2006) postulates an EGARCH-type formulation for the conditional mean of the demeaned standardized logarithmic range. The FIREGARCH model allows for long-memory dependencies (see EGARCH and FIEGARCH).

RGARCH1 (Randomized GARCH) The RGARCH(r,p,q) model of Nowicka-Zagrajek and Weron (2001) replaces the intercept in the standard GARCH(p,q) model with a sum of r positive i.i.d. stable random variables, η_{t-i}, i = 1, 2, . . . , r,

σ_t^2 = ∑_{i=1}^{r} c_i η_{t-i} + ∑_{i=1}^{q} α_i ε_{t-i}^2 + ∑_{i=1}^{p} β_i σ_{t-i}^2,

where c_i ≥ 0.

RGARCH2 (Robust GARCH) The robust GARCH model of Park (2002) is designed to minimize the impact of outliers by parameterizing the conditional variance as a TS-GARCH model (see TS-GARCH) with the parameters estimated by least absolute deviations, or LAD.

RGARCH3 (Root GARCH) The multivariate RGARCH model (see also MGARCH1 and Stdev-ARCH) of Gallant and Tauchen (1998) is formulated in terms of the lower triangular N × N matrix R_t, where by definition,

Ω_t = R_t R_t′.

By parameterizing R_t instead of Ω_t, the RGARCH formulation automatically guarantees that the resulting conditional covariance matrices are positive definite. However, the formulation complicates the inclusion of asymmetries or "leverage effects" in the conditional covariance matrix.

RS-GARCH (Regime Switching GARCH) See SWARCH.

RV (Realized Volatility) The term realized volatility, or realized variation, is commonly used in the ARCH literature to denote ex post variation measures defined by the summation of within period squared or absolute returns over some nontrivial time interval. A rapidly growing recent literature has been concerned with the use of such measures and the development of new and refined procedures in light of various data complications. Many new empirical insights afforded by the use of daily realized volatility measures constructed from high-frequency intraday returns have also recently been reported in the literature; see, e.g., the review in Andersen, Bollerslev and Diebold (2009).
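The basic measure is a one-liner; a minimal sketch, assuming for illustration a day of five-minute returns:

```python
import numpy as np

def realized_variance(intraday_returns):
    """Realized variance: the sum of squared within-period returns."""
    r = np.asarray(intraday_returns, dtype=float)
    return float(np.sum(r ** 2))

# e.g. 78 five-minute returns spanning one 6.5-hour trading day (simulated here)
rng = np.random.default_rng(2)
rv_day = realized_variance(0.001 * rng.standard_normal(78))
```

Taking the square root gives the corresponding realized volatility for the day; the data complications mentioned above (market microstructure noise, irregular spacing) motivate the more refined estimators in the literature.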

SARV (Stochastic AutoRegressive Volatility) See SV.

SGARCH (Stable GARCH) Let ε_t ≡ z_t c_t, where z_t is independent and identically distributed over time as a standard Stable Paretian distribution. The Stable GARCH model for ε_t of Liu and Brorsen (1995) is then defined by:

c_t^λ = ω + α|ε_{t-1}|^λ + βc_{t-1}^λ.

The SGARCH model nests the ACH model (see ACH2) of McCulloch (1985) as a special case for λ = 1, ω = 0 and β = 1 − α (see also GARCH-t, GED-GARCH and NGARCH).

S-GARCH (Simplified GARCH) The simplified multivariate GARCH (see MGARCH1) approach of Harris, Stoja and Tucker (2007) infers the conditional covariances through the estimation of auxiliary univariate GARCH models for the linear combinations in the identity,

Cov_{t-1}(ε_{it}, ε_{jt}) = (1/4) · [Var_{t-1}(ε_{it} + ε_{jt}) − Var_{t-1}(ε_{it} − ε_{jt})].

Nothing guarantees that the resulting N × N conditional covariance matrix is positive definite (see also CCC and Flex-GARCH).
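The identity exploited here is the usual polarization of the covariance, which is easy to verify on simulated data (a minimal sketch with population moments replaced by sample moments using denominator T):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(10_000)
y = 0.5 * x + rng.standard_normal(10_000)

# direct sample covariance (denominator T)
cov_direct = np.mean(x * y) - np.mean(x) * np.mean(y)
# the S-GARCH identity: Cov = (1/4)[Var(x+y) - Var(x-y)]
cov_polar = 0.25 * (np.var(x + y) - np.var(x - y))
```

The two numbers agree exactly (up to floating point), which is what lets univariate variance models for the sum and difference deliver a covariance estimate.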

Sign-GARCH See GJR.

SPARCH (SemiParametric ARCH) To allow for non-normal standardized residuals, as commonly found in the estimation of ARCH models (see also GARCH-t, GED-GARCH and QMLE), Engle and Gonzalez-Rivera (1991) suggest estimating the distribution of ε_t σ_t^{-1} through nonparametric density estimation techniques. Although Engle and Gonzalez-Rivera (1991) do not explicitly use the name SPARCH, the approach has subsequently been referred to as such by several other authors in the literature.

Spline-GARCH The Spline-GARCH model of Engle and Rangel (2008) specifies the conditional variance of ε_t as the product of a standardized unit GARCH(1,1) model,

σ_t^2 = (1 − α − β)ω + α(ε_{t-1}^2/τ_t) + βσ_{t-1}^2,

and a deterministic component represented by an exponential spline function of time,

τ_t = c · exp[ω_0 t + ω_1((t − t_0)_+)^2 + ω_2((t − t_1)_+)^2 + . . . + ω_k((t − t_{k-1})_+)^2],

where (t − t_i)_+ is equal to (t − t_i) for t > t_i and 0 otherwise, and 0 = t_0 < t_1 < . . . < t_k = T defines a partition of the full sample into k equally spaced time intervals. Other exogenous explanatory variables may also be included in the equation for τ_t. The Spline-GARCH model was explicitly designed to investigate macroeconomic causes of slowly moving, or low-frequency, volatility components (see also CGARCH1).

SQR-GARCH (Square-Root GARCH) The discrete-time SQR-GARCH model of Heston and Nandi (2000),

σ_t^2 = ω + α(ε_{t-1}σ_{t-1}^{-1} − γσ_{t-1})^2 + βσ_{t-1}^2,

is closely related to the VGARCH model of Engle and Ng (1993) (see VGARCH1). In contrast to the standard GARCH(1,1) model, the SQR-GARCH formulation allows for closed form option pricing under reasonable auxiliary assumptions. When defined over increasingly finer sampling intervals, the SQR-GARCH model converges weakly to the continuous-time affine, or square-root, diffusion analyzed by Heston (1993),

dσ^2(t) = κ(θ − σ^2(t))dt + νσ(t)dW(t).

The SQR-GARCH model is also sometimes referred to as the Heston GARCH or the Heston–Nandi GARCH model (see also GARCH diffusion).
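A minimal simulation sketch of the SQR-GARCH variance recursion (the parameter values are illustrative and not calibrated to any option-pricing application):

```python
import numpy as np

rng = np.random.default_rng(4)
omega, alpha, beta, gamma = 0.01, 0.05, 0.90, 0.5  # illustrative parameters
T = 1000
z = rng.standard_normal(T)

sig2 = np.empty(T)
eps = np.empty(T)
sig2[0] = omega / (1 - beta)  # an arbitrary positive starting value
eps[0] = np.sqrt(sig2[0]) * z[0]
for t in range(1, T):
    sig = np.sqrt(sig2[t-1])
    # sigma_t^2 = omega + alpha*(eps_{t-1}/sigma_{t-1} - gamma*sigma_{t-1})^2 + beta*sigma_{t-1}^2
    sig2[t] = omega + alpha * (eps[t-1] / sig - gamma * sig) ** 2 + beta * sig2[t-1]
    eps[t] = np.sqrt(sig2[t]) * z[t]
```

Because the shock enters through the squared term, positivity of the variance path is automatic for ω > 0 and α, β ≥ 0.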

STARCH (Structural ARCH) An unobserved component, or "structural," time series model in which one or more of the disturbances follow an ARCH model was dubbed a STARCH model by Harvey, Ruiz and Sentana (1992).

Stdev-ARCH (Standard deviation ARCH) The Stdev-ARCH(q) model first estimated by Schwert (1990) takes the form,

σ_t^2 = (ω + ∑_{i=1}^{q} α_i |ε_{t-i}|)^2.

This formulation obviously ensures that the conditional variance is positive. However, the nonlinearity complicates the construction of forecasts from the model (see also AARCH).

STGARCH (Smooth Transition GARCH) The ST-GARCH(1,1) model of Gonzalez-Rivera (1998) allows the impact of the past squared innovations to depend upon both the sign and the magnitude of ε_{t-1} through a smooth transition function,

σ_t^2 = ω + αε_{t-1}^2 + δε_{t-1}^2 F(ε_{t-1}, γ) + βσ_{t-1}^2,

where

F(ε_{t-1}, γ) = (1 + exp(γε_{t-1}))^{-1},

so that the value of the function is bounded between 0 and 1 (see also ANST-GARCH, GJR and TGARCH).

Structural GARCH The Structural GARCH approach named by Rigobon (2002) relies on a multivariate GARCH model for the innovations in an otherwise unidentified structural VAR to identify the parameters through time-varying conditional heteroskedasticity. Closely related ideas and models have been explored by Sentana and Fiorentini (2001) among others.

Strong GARCH GARCH models in which the standardized innovations, z_t = ε_t σ_t^{-1}, are assumed to be i.i.d. through time are referred to as strong GARCH models (see also Weak GARCH).

SV (Stochastic Volatility) The term stochastic volatility, or SV model, refers to formulations in which σ_t^2 is specified as a nonmeasurable, or stochastic, function of the observable information set. To facilitate estimation and inference via linear state-space representations, discrete-time SV models are often formulated in terms of time series models for log(σ_t^2), as exemplified by the simple SARV(1) model,

log(σ_t^2) = μ + ϕ log(σ_{t-1}^2) + σ_u u_t,

where u_t is i.i.d. with mean zero and variance one. Meanwhile, the SV approach has proven especially useful in the formulation of empirically realistic continuous-time volatility models of the form,

dy(t) = μ(t)dt + σ(t)dW(t),

where μ(t) denotes the drift, W(t) refers to standard Brownian Motion, and the diffusive volatility coefficient σ(t) is determined by a separate stochastic process (see also GARCH Diffusion).
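A minimal sketch simulating the SARV(1) log-volatility process and the implied returns (all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
mu, phi, sigma_u = -0.2, 0.95, 0.2  # illustrative SARV(1) parameters
T = 1000
u = rng.standard_normal(T)  # volatility innovations
z = rng.standard_normal(T)  # return innovations, independent of u

log_sig2 = np.empty(T)
log_sig2[0] = mu / (1 - phi)  # start at the unconditional mean
for t in range(1, T):
    # log sigma_t^2 = mu + phi * log sigma_{t-1}^2 + sigma_u * u_t
    log_sig2[t] = mu + phi * log_sig2[t-1] + sigma_u * u[t]
ret = np.exp(0.5 * log_sig2) * z  # returns with stochastic volatility
```

Unlike GARCH, the variance here is driven by its own shock u_t, so it is not a measurable function of past returns; that is exactly what makes the likelihood intractable and motivates state-space or MCMC methods.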

SVJ (Stochastic Volatility Jump) The SVJ acronym is commonly used to describe continuous-time stochastic volatility models in which the sample paths may be discontinuous, or exhibit jumps (see also SV and GARJI).

SWARCH (regime SWitching ARCH) The SWARCH model proposed independently by Cai (1994) and Hamilton and Susmel (1994) extends the standard linear ARCH(q) model (see ARCH) by allowing the intercept, ω_{s(t)}, and/or the magnitude of the squared innovations, ε_{t-i}^2/s(t − i), entering the conditional variance equation to depend upon some latent state variable, s(t), with the transition between the different states governed by a Markov chain. Regime switching GARCH models were first developed by Gray (1996) (see GRS-GARCH). Different variants of these models are also sometimes referred to in the literature as Markov Switching GARCH, or MS-GARCH, Regime Switching GARCH, or RS-GARCH, or Mixture GARCH, or MGARCH, models.

TGARCH (Threshold GARCH) The TGARCH(p,q) model proposed by Zakoïan (1994) extends the TS-GARCH(p,q) model (see TS-GARCH) to allow the conditional standard deviation to depend upon the sign of the lagged innovations. In particular, the TGARCH(1,1) model may be expressed as:

σ_t = ω + α|ε_{t-1}| + γ|ε_{t-1}| I(ε_{t-1} < 0) + βσ_{t-1}.

The TGARCH model is also sometimes referred to as the ZARCH, or ZGARCH, model. The basic idea behind the model is closely related to that of the GJR-GARCH model developed independently by Glosten, Jagannathan and Runkle (1993) (see GJR).
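A minimal simulation sketch of the TGARCH(1,1) recursion for the conditional standard deviation (illustrative parameter values of ours):

```python
import numpy as np

rng = np.random.default_rng(6)
omega, alpha, gamma, beta = 0.02, 0.05, 0.05, 0.90  # illustrative parameters
T = 1000
z = rng.standard_normal(T)

sig = np.empty(T)
eps = np.empty(T)
sig[0] = omega / (1 - alpha * np.sqrt(2 / np.pi) - beta)  # rough starting value
eps[0] = sig[0] * z[0]
for t in range(1, T):
    # sigma_t = omega + alpha*|eps_{t-1}| + gamma*|eps_{t-1}|*I(eps_{t-1}<0) + beta*sigma_{t-1}
    sig[t] = (omega + alpha * abs(eps[t-1])
              + gamma * abs(eps[t-1]) * (eps[t-1] < 0) + beta * sig[t-1])
    eps[t] = sig[t] * z[t]
```

Note that, as in TS-GARCH, the recursion is in the standard deviation rather than the variance, and negative shocks receive the extra weight γ.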

t-GARCH (t-distributed GARCH) See GARCH-t.

Tobit-GARCH The Tobit-GARCH model, first proposed by Kodres (1993) for analyzing futures prices, extends the standard GARCH model (see GARCH) to allow for the possibility of censored observations on the ε_t's, or the underlying y_t's. More general formulations allowing for multiperiod censoring and related inference procedures have been developed by Lee (1999), Morgan and Trevor (1999) and Wei (2002).

TS-GARCH (Taylor–Schwert GARCH) The TS-GARCH(p,q) model of Taylor (1986) and Schwert (1989) parameterizes the conditional standard deviation as a distributed lag of the absolute innovations and the lagged conditional standard deviations,

σ_t = ω + ∑_{i=1}^{q} α_i |ε_{t-i}| + ∑_{i=1}^{p} β_i σ_{t-i}.

This formulation mitigates the influence of large, in an absolute sense, observations relative to the traditional GARCH(p,q) model (see GARCH). The TS-GARCH model is also sometimes referred to as an Absolute Value GARCH, or AVGARCH, model, or simply an AGARCH model. It is a special case of the more general Power GARCH, or NGARCH, formulation (see NGARCH).

TVP-Level (Time-Varying Parameter Level) See Level-GARCH.

UGARCH (Univariate GARCH) See GARCH.

Unobserved GARCH See Latent GARCH.

Variance Targeting The use of variance targeting in GARCH models was first suggested by Engle and Mezrich (1996). To illustrate, consider the GARCH(1,1) model (see GARCH),

σ_t^2 = (1 − α − β)σ^2 + αε_{t-1}^2 + βσ_{t-1}^2,

where σ^2 = ω(1 − α − β)^{-1}. Fixing σ^2 at some pre-set value ensures that the long run variance forecasts from the model converge to σ^2. Variance targeting has proven especially useful in multivariate GARCH modeling (see MGARCH1).
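To see how targeting pins down the long-run forecast, consider the standard GARCH(1,1) multi-step forecast, which mean-reverts to σ^2 at rate (α + β); a short sketch with an illustrative target value:

```python
alpha, beta = 0.05, 0.90                # illustrative GARCH(1,1) parameters
target = 0.04                           # pre-set long-run variance (e.g. a sample variance)
omega = (1.0 - alpha - beta) * target   # implied intercept under targeting

def variance_forecast(sig2_now, h):
    """h-step-ahead conditional variance forecast of the targeted GARCH(1,1):
    E[sigma^2_{t+h}] = target + (alpha + beta)^h * (sigma^2_t - target)."""
    return target + (alpha + beta) ** h * (sig2_now - target)

short_run = variance_forecast(0.10, 1)   # still well above the target
long_run = variance_forecast(0.10, 500)  # essentially equal to the target
```

Whatever the current variance, the forecast path decays geometrically toward the targeted value, which is the sense in which fixing σ^2 anchors the long run.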

VCC (Varying Conditional Correlations) See DCC.

vech GARCH (vectorized GARCH) See MGARCH1.

VGARCH1 Following Engle and Ng (1993), the VGARCH(1,1) model refers to the parameterization,

σ_t^2 = ω + α(ε_{t-1}σ_{t-1}^{-1} + γ)^2 + βσ_{t-1}^2,

in which the impact of the innovations for the conditional variance is symmetric and centered at −γσ_{t-1}. Higher order VGARCH(p,q) models may be defined in a similar manner (see also AGARCH1 and NAGARCH).

VGARCH2 (Vector GARCH) The VGARCH, MGARCH and MV-GARCH acronyms are used interchangeably (see MGARCH1).

VSGARCH (Volatility Switching GARCH) The VSGARCH(1,1) model of Fornari and Mele (1996) directly mirrors the GJR model (see GJR),

σ_t^2 = ω + αε_{t-1}^2 + γ(ε_{t-1}^2/σ_{t-1}^2) I(ε_{t-1} < 0) + βσ_{t-1}^2,

except that the asymmetric impact of the lagged squared negative innovations is scaled by the corresponding lagged conditional variance.

Weak GARCH The weak GARCH class of models, or concept, was first developed by Drost and Nijman (1993). In the weak GARCH class of models σ_t^2 is defined as the linear projection of ε_t^2 on the space spanned by {1, ε_{t-1}, ε_{t-2}, . . . , ε_{t-1}^2, ε_{t-2}^2, . . .}, as opposed to the conditional expectation of ε_t^2, or E_{t-1}(ε_t^2) (see also ARCH and GARCH). In contrast to the standard GARCH(p,q) class of models, which is not formally closed under temporal aggregation, the sum of successive observations from a weak GARCH(p,q) model remains a weak GARCH(p′,q′) model, albeit with different orders p′ and q′. Similarly, as shown by Nijman and Sentana (1996) the unrestricted multivariate linear weak MGARCH(p,q) model (see MGARCH1) defined in terms of linear projections as opposed to conditional expectations is closed under contemporaneous aggregation, or portfolio formation (see also Strong GARCH).

ZARCH (Zakoian ARCH) See TGARCH.


9

An Automatic Test of Super Exogeneity
David F. Hendry and Carlos Santos

1. Introduction

It is a real pleasure to contribute to a volume in honor of Rob Engle, who has greatly advanced our understanding of exogeneity, and has published with the first author on that topic. At the time of writing Engle, Hendry and Richard (1983) (which has accrued 750 citations and counting), or even Engle and Hendry (1993), we could not have imagined that an approach based on handling more variables than observations would have been possible, let alone lead to an automatic test as we explain below. Rob has, of course, also contributed hugely to many other aspects of econometrics, not least the modeling of volatility (with over 5,000 citations starting from Engle, 1982a) and nonstationarity (where his famous paper on cointegration, Engle and Granger, 1987, has garnered an astonishing 8,500 cites according to Google Scholar): DFH still remembers discussing cointegration endlessly while running round Florence in a mad dash to see all the sights during a day visit in 1983, while we were both attending the Econometric Society Meeting in Pisa. The hallmarks of Rob's publications are inventiveness, clarity, and succinctness such that his research is filled with ideas that are beautifully explained despite the often complex mathematics lying behind – setting a high standard for others to emulate. He is also one of the driving forces for the rapid progress in our discipline, and we wish him continuing high productivity into the future.

Acknowledgments: Financial support from the ESRC under Research Grant RES-062-23-0061, and from the Fundação para a Ciência e a Tecnologia (Lisboa), is gratefully acknowledged by the first and second authors, respectively. We are indebted to Jennifer L. Castle, Jurgen A. Doornik, Ilyan Georgiev, Søren Johansen, Bent Nielsen, Mark W. Watson and participants at the Festschrift in honor of Robert F. Engle for helpful comments on an earlier draft, and to Jurgen and J. James Reade for providing some of the results based on Autometrics.




In all areas of policy that involve regime shifts, or structural breaks in conditioning variables, the invariance of the parameters of conditional models under changes in the distributions of conditioning variables is of paramount importance, and was called super exogeneity by Engle et al. (1983). Even in models without contemporaneous conditioning variables, such as vector equilibrium systems (EqCMs), invariance under such shifts is equally relevant. Tests for super exogeneity have been proposed by Engle et al. (1983), Hendry (1988), Favero and Hendry (1992), Engle and Hendry (1993), Psaradakis and Sola (1996), Jansen and Terasvirta (1996) and Krolzig and Toro (2002), inter alia: Ericsson and Irons (1994) overview the literature at the time of their publication. Favero and Hendry (1992), building on Hendry (1988), considered the impact of nonconstant marginal processes on conditional models, and concluded that location shifts (changes in unconditional means of nonintegrated, I(0), variables) were essential for detecting violations attributable to the Lucas (1976) critique. Engle and Hendry (1993) examined the impact on a conditional model of changes in the moments of the conditioning variables, using a linear approximation: tests for super exogeneity were constructed by replacing the unobservable changing moments by proxies based on models of the processes generating the conditioning variables, including models based on ARCH processes (see Engle, 1982a), thereby allowing for nonconstant error variances to capture changes in regimes. However, Psaradakis and Sola (1996) claim that such tests have relatively low power for rejecting the Lucas critique. Jansen and Terasvirta (1996) propose self-exciting threshold models for testing constancy in conditional models as well as super exogeneity. Krolzig and Toro (2002) developed super-exogeneity tests using a reduced-rank technique for co-breaking based on the presence of common deterministic shifts, and demonstrated that their proposal dominated existing tests (on co-breaking in general, see Clements and Hendry, 1999, and Hendry and Massmann, 2007). We propose a new addition to this set of possible tests, show that its rejection frequency under the null is close to the nominal significance level in static settings, and examine its rejection frequencies when super exogeneity does not hold.

The ability to detect outliers and shifts in a model using the dummy saturation techniques proposed by Hendry, Johansen and Santos (2008) opens the door to this new class of automatically computable super-exogeneity tests. Their approach is to saturate the marginal model (or system) with impulse indicators (namely, include an impulse for every observation, but entered in feasible subsets), and retain all significant outcomes. They derive the probability under the null of falsely retaining impulses for a location-scale i.i.d. process, and obtain the distribution of the estimated mean and variance after saturation. Johansen and Nielsen (2009) extend that analysis to dynamic regression models, which may have unit roots. Building on the ability to detect shifts in marginal models, we consider testing the relevance of all their significant impulses in conditional models. As we show below, such a test has the correct rejection frequency under the null of super exogeneity of the conditioning variables for the parameters of the conditional model, for a range of null-rejection frequencies in the marginal-model saturation tests. Moreover, our proposed test can detect failures of super exogeneity when there are location shifts in the marginal models. Finally, it can be computed automatically – that is, without explicit user intervention, as occurs with (say) tests for residual autocorrelation – once the desired nominal sizes of the marginal saturation and conditional super-exogeneity tests have been specified.



Six conditions need to be satisfied for a valid and reliable automatic test of superexogeneity. First, the test should not require ex ante knowledge by the investigator ofthe timing, signs, or magnitudes of any breaks in the marginal processes of the condi-tioning variables. The test proposed here uses impulse-saturation techniques applied tothe marginal equations to determine these aspects. Second, the correct data generationprocess for the marginal variables should not need to be known for the test to have thedesired rejection frequency under the null of super exogeneity. That condition is satisfiedhere for the impulse-saturation stage in the marginal models when there are no explosiveroots in any of the variables, by developing congruent models using an automatic variantof general-to-specific modeling (see Hendry, 2009, for a recent discussion of congruence).Third, the test should not reject when super exogeneity holds yet there are shifts inthe marginal models, which would lead to many impulses being retained for testing inthe conditional model. We show this requirement is satisfied as well. Fourth, the con-ditional model should not have to be over-identified under the alternative of a failureof super exogeneity, as needed for tests in the class proposed by (say) Revankar andHartley (1973). Fifth, the test must have power against a large class of potential failuresof super exogeneity in the conditional model when there are location shifts in some ofthe marginal processes. Below, we establish the noncentrality parameter of the proposedtest in a canonical case. Finally, the test should be computable without additional userintervention, as holds for both the impulse-saturation stage and the proposed super-exogeneity test. 
The results here are based partly on the PcGets program (see Hendry and Krolzig, 2001) and partly on the more recent Autometrics algorithm in PcGive (see Doornik, 2009, 2007b), which extends general-to-specific modeling to settings with more variables than observations (see Hendry and Krolzig, 2005, and Doornik, 2007a).

The structure of the chapter is as follows. Section 2 reconsiders which shifts in vector autoregressions (VARs) are relatively detectable, and derives the implications for testing for breaks in conditional representations. Section 3 considers super exogeneity in a regression context to elucidate its testable hypotheses, and discusses how super exogeneity can fail. Section 4 describes the impulse-saturation tests in Hendry et al. (2008) and Johansen and Nielsen (2009), and considers how to extend these to test super exogeneity. Section 5 provides analytic and Monte Carlo evidence on the null rejection frequencies of that procedure. Section 6 considers the power of the first stage to determine location shifts in marginal processes. Section 7 analyzes a failure of weak exogeneity under a nonconstant marginal process. Section 8 notes a co-breaking saturation-based test which builds on Krolzig and Toro (2002) and Hendry and Massmann (2007). Section 9 investigates the powers of the proposed automatic test in Monte Carlo experiments for a bivariate data generation process based on Section 7. Section 10 tests super exogeneity in the much-studied example of UK money demand; and Section 11 concludes.

2. Detectable shifts

Consider the n-dimensional I(0) VAR(1) data generation process (DGP) of {xt} over t = 1, . . . , T :

xt = φ+ Πxt−1 + νt where νt ∼ INn [0,Ων ] (1)

so Π has all its eigenvalues less than unity in absolute value, with unconditional expectation E[xt]:

E [xt] = (In − Π)−1φ = ϕ (2)

hence:

xt − ϕ = Π (xt−1 − ϕ) + νt. (3)

At time T1, however, (φ : Π) changes to (φ∗ : Π∗), so for h ≥ 1 the data are generated by:

xT1+h = φ∗ + Π∗xT1+h−1 + νT1+h (4)

where Π∗ still has all its eigenvalues less than unity in absolute value. Such a shift generates considerable nonstationarity in the distribution of {xT1+h} for many periods afterwards since:

E[xT1+h] = ϕ∗ − (Π∗)h (ϕ∗ − ϕ) = ϕ∗h → ϕ∗ as h → ∞

where ϕ∗ = (In − Π∗)−1φ∗, so that, from (4):

xT1+h − ϕ∗ = Π∗ (xT1+h−1 − ϕ∗) + νT1+h. (5)

Clements and Hendry (1994), Hendry and Doornik (1997), and Hendry (2000) show thatchanges in ϕ are easy to detect, whereas those in φ and Π are not when ϕ is unchanged.This delimits the class of structural breaks and regime changes that any test for superexogeneity can reasonably detect.

To see the problem, consider the one-step forecast errors from T1 + 1 onwards using:

x̂T1+h|T1+h−1 = ϕ + Π (xT1+h−1 − ϕ)

which would be ν̂T1+h|T1+h−1 = xT1+h − x̂T1+h|T1+h−1 where:

ν̂T1+h|T1+h−1 = (ϕ∗ − ϕ) + Π∗ (xT1+h−1 − ϕ∗) − Π (xT1+h−1 − ϕ) + νT1+h. (6)

Finite-sample biases in estimators and estimation uncertainty are neglected here as negligible relative to the sizes of the effects we seek to highlight. Unconditionally, therefore, using (2):

E[ν̂T1+h|T1+h−1] = (In − Π∗) (ϕ∗ − ϕ) + (Π∗ − Π) (ϕ∗h−1 − ϕ). (7)

Consequently, E[ν̂T1+h|T1+h−1] = 0 when ϕ∗ = ϕ, however large the changes in Π or φ. Detectability also depends indirectly on the magnitudes of shifts relative to Ων, as there are data variance shifts following unmodeled breaks, but such shifts are hard to detect when ϕ∗ = ϕ until long after the break has occurred, as the next section illustrates.
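To make (7) concrete, the following sketch uses hypothetical 2×2 values (not the parameters of section 2.1 below) chosen so that ϕ∗ = ϕ: the dynamics Π change radically, yet the unconditional one-step forecast-error mean stays at zero for every horizon.

```python
def mv(A, v):
    """2x2 matrix-vector product."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

# Hypothetical illustration: Pi* differs greatly from the pre-break dynamics,
# but phi* is constructed as (I - Pi*) varphi so the long-run mean is preserved.
Pi_star = [[0.8, 0.0], [0.0, 0.2]]
varphi = [2.0, 2.0]                  # common long-run mean, pre- and post-break
phi_star = [varphi[i] - mv(Pi_star, varphi)[i] for i in range(2)]

# E[x_{T1+h}] obeys the recursion phi*_h = phi* + Pi* phi*_{h-1}, phi*_0 = varphi;
# it never leaves varphi, so both terms of (7) vanish at every horizon h.
mean_h, biases = varphi, []
for h in range(1, 6):
    mean_h = [phi_star[i] + mv(Pi_star, mean_h)[i] for i in range(2)]
    biases.append([m - v for m, v in zip(mean_h, varphi)])

print(biases)  # every entry is zero (up to floating point)
```

The same computation with ϕ∗ ≠ ϕ immediately produces a nonzero forecast-error mean, which is what the constancy tests below detect.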

2.1. Simulation outcomes

To illustrate, let n = 2, and for the baseline case (a):

Π = [π′1; π′2] = [0.7, 0.2; −0.2, 0.6], φ = (1, 1)′ (8)

where Π has eigenvalues of 0.65 ± 0.19i with modulus 0.68, and for |ρ| < 1:

Ων = σ² [1, ρ; ρ, 1] = (0.01)² [1, 0.5; 0.5, 1]

so the error standard deviations are 1% for xt interpreted as logs, with:

ϕ = [1 − 0.7, −0.2; 0.2, 1 − 0.6]⁻¹ (1, 1)′ = (3.75, 0.625)′. (9)

At time T1, Π and φ change to Π∗ and φ∗ leading to case (b):

Π∗ = [0.5, −0.2; 0.1, 0.5], φ∗ = (2.0, −0.0625)′ (10)

where the eigenvalues of Π∗ are 0.5 ± 0.14i with modulus 0.52. The coefficients in Π are shifted at T1 = 0.75T = 75 by −20σ, −40σ, +30σ and −10σ, so the standardized impulse responses are radically altered between Π and Π∗. Moreover, the shifts to the intercepts are 100σ or larger when a residual of ±3σ would be an outlier. Figure 9.1 shows the data outcomes on a randomly selected experiment in the first column, with the Chow test rejection frequencies on 1,000 replications in the second (we will discuss the third below):

• for the baseline DGP in (8);
• for the changed DGP in (10);
• for the intercept-shifted DGP in (11) below;
• for the intercept-shifted DGP in (11) below, changed for just one period.

The data over 1 to T1 are the same in the four cases, and although the DGPs differ over T1 + 1 to T = 100 in (a) and (b), it is hard to tell their data apart. The changes in φ in (b) are vastly larger than any likely shifts in real-world economies. Nevertheless, the rejection frequencies on the Chow test are under 13% at a 1% nominal significance.

However, keeping Π constant in (8), and changing only φ by ±5σ to φ∗∗ yields case (c):

Π = [0.7, 0.2; −0.2, 0.6], φ∗∗ = (1.05, 0.95)′ (11)

which leads to massive forecast failure. Indeed, changing the DGP in (11) for just one period is quite sufficient to reveal the shift almost 100% of the time as seen in (d). The explanation for such dramatic differences between the second and third rows – where the former had every parameter greatly changed and the latter only had a small shift in the intercept – is that ϕ is unchanged from (a) to (b) at:

ϕ∗ = [1 − 0.5, 0.2; −0.1, 1 − 0.5]⁻¹ (2.0, −0.0625)′ = (3.75, 0.625)′ = ϕ (12)

[Figure 9.1 here. Column 1 plots the data over t = 1, . . . , 100 for the four cases: (a) baseline data sample, (b) changing all VAR(1) parameters, (c) changing intercepts only, (d) one-period shift. Column 2 shows the corresponding Chow test rejection frequencies at the 0.01 level for the marginal system, and column 3 the Chow test rejection frequencies at the 0.01 level for the conditional model.]

Fig. 9.1. Data graphs and constancy test rejection frequencies

whereas in (c):

ϕ∗∗ = [1 − 0.7, −0.2; 0.2, 1 − 0.6]⁻¹ (1.05, 0.95)′ = (3.8125, 0.46875)′ (13)

inducing shifts of a little over 6σ and 16σ in the locations of x1,t and x2,t, respectively, relative to the in-sample E[xt]. Case (d) may seem the most surprising – it is far easier to detect a one-period intercept shift of 5σ than when radically changing every parameter in the system for a quarter of the sample, but where the long-run mean is unchanged: indeed the rejection frequency is essentially 100% versus less than 15%. The entailed impacts on conditional models of such shifts in the marginal distributions are considered in the next section.
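The long-run means behind cases (a)–(c) are easy to reproduce. This sketch recomputes (9), (12) and (13) from the quoted parameter values, using only the closed-form 2×2 inverse (no values beyond those stated in the text):

```python
def longrun_mean(Pi, phi):
    """varphi = (I - Pi)^{-1} phi for a 2-variable VAR(1), via the 2x2 inverse."""
    a, b = 1 - Pi[0][0], -Pi[0][1]
    c, d = -Pi[1][0], 1 - Pi[1][1]
    det = a * d - b * c
    return [(d * phi[0] - b * phi[1]) / det,
            (-c * phi[0] + a * phi[1]) / det]

case_a = longrun_mean([[0.7, 0.2], [-0.2, 0.6]], [1.0, 1.0])      # (8)
case_b = longrun_mean([[0.5, -0.2], [0.1, 0.5]], [2.0, -0.0625])  # (10)
case_c = longrun_mean([[0.7, 0.2], [-0.2, 0.6]], [1.05, 0.95])    # (11)

print(case_a)  # [3.75, 0.625]
print(case_b)  # [3.75, 0.625]: unchanged, despite every parameter shifting
print(case_c)  # [3.8125, 0.46875]: location shifts of 6.25 and -15.625 error s.d.s
```

With σ = 0.01, the case (c) differences 0.0625 and −0.15625 are exactly the "a little over 6σ and 16σ" shifts cited above.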

2.2. Detectability in conditional models

In the bivariate case (a) of section 2.1, let x′t = (yt : zt) to match the notation below, then:

E[yt | zt, xt−1] = φ1 + π′1xt−1 + ρ (zt − φ2 − π′2xt−1)
= ϕ1 + ρ (zt − ϕ2) + (π1 − ρπ2)′ (xt−1 − ϕ)

as:

(φ1, φ2)′ = (ϕ1 − π′1ϕ, ϕ2 − π′2ϕ)′.

After the shift in case (b), so t > T1:

E[yt | zt, xt−1] = (φ∗1 − ρφ∗2) + ρzt + (π∗1 − ρπ∗2)′ xt−1
= ϕ1 + ρ (zt − ϕ2) + (π∗1 − ρπ∗2)′ (xt−1 − ϕ) (14)

and hence the conditional model is constant only if:

π1 − π∗1 = ρ (π2 − π∗2), (15)

which is strongly violated by the numerical values used here:

π1 − π∗1 = (0.2, 0.4)′ ≠ (−0.15, 0.05)′ = ρ (π2 − π∗2).

Nevertheless, as the shift in (14) depends on changes in the coefficients of zero-mean variables, detectability will be low. In case (c) when t ≫ T1:

E[yt | zt, xt−1] = ϕ∗1 + ρ (zt − ϕ∗2) + (π1 − ρπ2)′ (xt−1 − ϕ∗) (16)

where E[zt] = ϕ∗2 and E[xt−1] = ϕ∗, so there is a location shift of ϕ∗1 − ϕ1. The third column of graphs in Figure 9.1 confirms that the outcomes in the four cases above carry over to conditional models, irrespective of exogeneity: cases (a) and (b) are closely similar and low, yet rejection is essentially 100% in cases (c) and (d). Notice that there is no shift at all in (14) when (15) holds, however large the changes to the VAR. Consequently, we focus the super-exogeneity test to have power for location shifts in the marginal distributions, which thereby "contaminate" the conditional model.
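Condition (15) can be checked directly for the numerical values of section 2.1 (with ρ = σ12/σ22 = 0.5 from Ων):

```python
rho = 0.5                                 # sigma12/sigma22 implied by Omega_nu
pi1, pi2 = [0.7, 0.2], [-0.2, 0.6]        # rows of Pi in (8)
pi1s, pi2s = [0.5, -0.2], [0.1, 0.5]      # rows of Pi* in (10)

lhs = [a - b for a, b in zip(pi1, pi1s)]            # pi1 - pi1*
rhs = [rho * (a - b) for a, b in zip(pi2, pi2s)]    # rho (pi2 - pi2*)

# (15) fails, so the conditional model is nonconstant in case (b); yet the
# shift loads only on zero-mean regressors, which is why it is hard to detect.
print(lhs, rhs)
```

The printed vectors are (0.2, 0.4) and (−0.15, 0.05), the mismatch reported above.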

2.2.1. Moving window estimation

One approach which could detect that breaks of type (b) had occurred is the use of moving estimation windows, as a purely post-break sample (the last T − T1 + 1 observations) would certainly deliver the second-regime parameters. Sufficient observations must have accrued in the second regime (and no other shifts occurred): see e.g., Castle, Fawcett and Hendry (2009). If impulse response analysis is to play a substantive role in policy advice, it would seem advisable to check on a relatively small final-period subsample that the estimated parameters and error variances have not changed.

3. Super exogeneity in a regression context

Consider the sequentially factorized DGP of the n-dimensional I(0) vector process {xt}:

∏(t=1..T) Dx (xt | Xt−1, θ) = ∏(t=1..T) Dy|z (yt | zt, Xt−1, φ1) Dz (zt | Xt−1, φ2) (17)

where x′t = (y′t : z′t), Xt−1 = (X0, x1, . . . , xt−1) for initial conditions X0, and φ = (φ′1 : φ′2)′ ∈ Φ with φ = f(θ) ∈ Rk. The parameters φ1 ∈ Φ1 and φ2 ∈ Φ2 of the {yt} and {zt} processes need to be variation free, so that Φ = Φ1 × Φ2, if zt is to be weakly exogenous for the parameters of interest ψ = h(φ1) in the conditional model. However, such a variation-free condition by itself does not rule out the possibility that φ1 may change if φ2 is changed. Super exogeneity augments weak exogeneity with parameter invariance in the conditional model such that:

∂φ1/∂φ′2 = 0 ∀φ2 ∈ Cφ2 (18)

where Cφ2 is a class of interventions changing the marginal process parameters φ2, so (18) requires no cross-links between the parameters of the conditional and marginal processes. No DGPs can be invariant for all possible changes, hence the limitation to Cφ2, the "coverage" of which will vary with the problem under analysis.

When Dx(·) is the multivariate normal, we can express (17) as the unconditional model:

(yt, zt)′ ∼ INn [(μ1,t, μ2,t)′, (σ11,t, σ′12,t; σ12,t, Σ22,t)] (19)

where E[yt] = μ1,t and E[zt] = μ2,t are usually functions of Xt−1. To define the parameters of interest, we let the economic theory formulation entail:

μ1,t = μ + β′μ2,t + η′xt−1 (20)

where β is the primary parameter of interest. The Lucas (1976) critique explicitly considers a model where expectations (the latent decision variables given by μ2,t) are incorrectly modeled by the outcomes zt. From (19) and (20):

E[yt | zt, xt−1] = μ + β′μ2,t + η′xt−1 + σ′12,tΣ⁻¹22,t (zt − μ2,t)
= μ + γ1,t + γ′2,tzt + η′xt−1 (21)

where γ′2,t = σ′12,tΣ⁻¹22,t and γ1,t = (β − γ2,t)′μ2,t. The conditional variance is ω²t = σ11,t − γ′2,tσ21,t. Thus, the parameters of the conditional and marginal densities, respectively, are:

φ1,t = (μ : γ1,t : γ2,t : η : ω²t) and φ2,t = (μ2,t : Σ22,t).

When (21) is specified as a constant-parameter regression model over t = 1, . . . , T :

yt = μ + β′zt + η′xt−1 + εt where εt ∼ IN[0, ω²] (22)

four conditions are required for zt to be super exogenous for (μ, β, η, ω²) (see Engle and Hendry, 1993):

(i) γ2,t = γ2 is constant ∀t;
(ii) β = γ2;
(iii) φ1,t is invariant to Cφ2 ∀t;
(iv) ω²t = ω² ∀t.

Condition (i) requires that σ′12,tΣ⁻¹22,t is constant over time, which could occur because the σij happened not to change over the sample, or because the two components move in tandem through being connected by σ′12,t = γ′2Σ22,t. Condition (ii) then entails that zt is weakly exogenous for a constant β. Together, (i) + (ii) entail the key result that γ1,t = 0 in (21), so the conditional expectation does not depend on μ2,t. Next, (iii) requires the absence of links between the conditional and marginal parameters. Finally, a fully constant regression also requires (iv), so ω²t = σ11,t − β′Σ22,tβ = ω² is constant ∀t, with the observed variation in σ11,t derived from changes in Σ22,t. However, nonconstancy in ω²t can be due to factors other than a failure of super exogeneity, so is only tested below as a requirement for congruency. Each of these conditions can be valid or invalid separately: for example, βt = γ2,t is possible when (i) is false, and vice versa.

When conditions (i)–(iv) are satisfied:

E[yt | zt, xt−1] = μ + β′zt + η′xt−1 (23)

in which case zt is super exogenous for (μ, β, η, ω²) in this conditional model.

Consequently:

σ′12,t = β′Σ22,t ∀t (24)

where condition (24) requires that the means in (20) are interrelated by the same parameter β as the covariances σ12,t are with the variances Σ22,t. Under those conditions, the joint density is:

(yt, zt)′ | xt−1 ∼ INn [(μ + β′μ2,t + η′xt−1, μ2,t)′, (ω² + β′Σ22,tβ, β′Σ22,t; Σ22,tβ, Σ22,t)] (25)

so the conditional-marginal factorization is:

(yt | zt, xt−1; zt | xt−1) ∼ INn [(μ + β′zt + η′xt−1, μ2,t)′, (ω², 0′; 0, Σ22,t)]. (26)

Consequently, under super exogeneity, the parameters (μ2,t, Σ22,t) can change in the marginal model:

zt | xt−1 ∼ INn−1 [μ2,t, Σ22,t] (27)

without altering the parameters of (22). Deterministic-shift co-breaking will then occur in (25), as (1 : −β′)xt does not depend on μ2,t: see §8. Conversely, if zt is not super exogenous for β, then changes in (27) should affect (22) through γ1,t = (β − γ2,t)′μ2,t, as we now discuss.

3.1. Failures of super exogeneity

Super exogeneity may fail for any of three reasons, corresponding to (i)–(iii) above:

(a) the regression coefficient γ2 is not constant when β is;
(b) zt is not weakly exogenous for β;
(c) β is not invariant to changes in Cφ2.

When zt is not super exogenous for β, and μ2,t is nonconstant, then (21) holds as:

yt = μ + (β − γ2,t)′μ2,t + γ′2,tzt + η′xt−1 + et (28)

We model μ2,t using lagged values of xt and impulses to approximate the sequential factorization in (17):

zt = μ2,t + v2,t = π0 + Σ(j=1..s) Γjxt−j + dt + v2,t (29)

where v2,t ∼ INn−1[0, Σ22,t] is the error on the marginal model and dt denotes a shift at t. Section 2 established that the detectable breaks in (29) are location shifts, so the next section considers impulse saturation applied to the marginal process, then derives the distribution under the null of no breaks in §5, and the behavior under the alternative in §6. Section 7 proposes the test for super exogeneity based on including the significant impulses from such marginal-model analyses in conditional equations like (28).

4. Impulse saturation

The crucial recent development for our approach is that of testing for nonconstancy by adding a complete set of impulse indicators {1{t}, t = 1, . . . , T} to a marginal model, where 1{t} = 1 for observation t, and zero otherwise: see Hendry et al. (2008) and Johansen and Nielsen (2009). Using a modified general-to-specific procedure, those authors analytically establish the null distribution of the estimator of regression parameters after adding T impulse indicators when the sample size is T. A two-step process is investigated, where half the indicators are added, and all significant indicators recorded, then the other half examined, and finally the two retained sets of indicators are combined. The average retention rate of impulse indicators under the null is αT when the significance level of an individual test is set at α. Moreover, Hendry et al. (2008) show that other splits, such as using three splits of size T/3, or unequal splits, do not affect the retention rate under the null, or the simulation-based distributions. Importantly, Johansen and Nielsen (2009) both generalize the analysis to dynamic models (possibly with unit roots) and establish that for small α (e.g., α ≤ 0.01), the inefficiency of conducting impulse saturation is very small despite testing T indicators: intuitively, retained impulses correspond to omitting individual observations, so only αT data points are "lost".
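A minimal sketch of the split-half idea for a pure location model (this is not the actual PcGets/Autometrics algorithm, and the function name and test data are illustrative): adding impulse dummies for a block of observations makes each dummy's coefficient the residual of that observation from a fit on the remaining data, so one pass reduces to flagging standardized deviations measured from the complementary half-sample.

```python
import random
import statistics

def split_half_saturation(y, c=2.0):
    """Split-half impulse saturation for a location model (sketch only).

    For each half-sample block of impulse dummies, the location and scale are
    effectively estimated from the complementary observations, and a dummy is
    retained when its (approximate) |t|-ratio exceeds the critical value c."""
    T = len(y)
    retained = set()
    for block in (range(0, T // 2), range(T // 2, T)):
        rest = [y[t] for t in range(T) if t not in block]
        m, s = statistics.fmean(rest), statistics.stdev(rest)
        retained |= {t for t in block if abs(y[t] - m) / s > c}
    return sorted(retained)

random.seed(3)
T = 100
y = [random.gauss(0, 1) for _ in range(T)]
for t in range(90, T):
    y[t] += 8.0                      # a late location shift of 8 error s.d.s
retained = split_half_saturation(y)
print(retained)                      # includes every post-shift observation
```

Note how the first pass is protected: the shifted observations inflate the complement's estimated scale, so clean first-half points are not spuriously flagged, while the second pass, judged against the clean first half, picks up the whole break.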

This procedure is applied to the marginal models for the conditioning variables, and the associated significant dummies in the marginal processes are recorded. Specifically, after the first stage when m impulse indicators are retained, a marginal model like (29) has been extended to:

zt = π0 + Σ(j=1..s) Γjxt−j + Σ(i=1..m) τi,α1 1{t=ti} + v∗2,t (30)

[Figure 9.2 here. Panel (a) plots Y and the fitted values from a regression on a constant over t = 1, . . . , 100; panel (b) plots the scaled residuals.]

Fig. 9.2. Absence of outliers despite a break

where the coefficients of the significant impulses are denoted τi,α1 to emphasize their dependence on the significance level α1 used in testing the marginal model. Equation (30) is selected to be congruent. Second, those impulses that are retained are tested as an added-variable set in the conditional model.

There is an important difference between outlier detection, which does just that, and impulse saturation, which will detect outliers but may also reveal other shifts that are hidden by being "picked up" incorrectly by other variables. Figure 9.2(a) illustrates a mean shift near the mid-sample, where a regression on a constant is fitted. Panel (b) shows that no outliers, as defined by |ui,t| > 2σii (say), are detected (for an alternative approach, see Sanchez and Pena, 2003). By way of comparison, Figure 9.3 shows impulse

[Figure 9.3 here. Columns correspond to the first-half, second-half, and final stages; rows show the dummies included in the model at each stage, the actual and fitted values, and the dummies finally selected.]

Fig. 9.3. Impulse saturation in action

saturation for the same data, where the columns show the outcomes for the first half, second half, then combined, respectively, and the rows show the impulses included at that stage, their plot against the data, and the impulses retained. Overall, 20 impulses are significant, spanning the break (the many first-half impulses retained are due to trying to make the skewness diagnostic insignificant). In fact, Autometrics uses a more sophisticated algorithm, which outperforms the split-half procedure in simulation experiments (see Doornik, 2009, for details).

The second stage is to add the m retained impulses to the conditional model, yielding:

yt = μ + β′zt + η′xt−1 + Σ(i=1..m) δi,α2 1{t=ti} + εt (31)

and conduct an F-test for the significance of (δ1,α2 . . . δm,α2) at level α2. Under the null of super exogeneity, the F-test of the joint significance of the m impulse indicators in the conditional model should have an approximate F-distribution and thereby allow an appropriately sized test: Section 5 derives the null distribution and presents Monte Carlo evidence on its small-sample relevance. Under the alternative, the test will have power in a variety of situations discussed in Section 7 below. Such a test can be automated, bringing super exogeneity into the purview of hypotheses about a model that can be as easily tested as (say) residual autocorrelation. Intuitively, if super exogeneity is invalid, so β′ ≠ σ′12,tΣ⁻¹22,t in (28), then the impact on the conditional model of the largest values of the μ2,t should be the easiest to detect, noting that the significant impulses in (30) capture the outliers or breaks not accounted for by the regressor variables used.
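The F-test in (31) is a standard added-variable test. A compact way to sketch it for a single conditioning variable (the helper name and simulated data are hypothetical) exploits the fact that including an impulse dummy for observation t is equivalent to dropping that observation from the regression:

```python
import random

def super_exogeneity_F(y, z, retained, k=2):
    """F-statistic for the joint significance of the m retained impulses in
    y_t = mu + beta z_t + sum_i delta_i 1{t=t_i} + eps_t; compare it with the
    F(m, T-k-m) critical value at level alpha_2."""
    def ols_rss(ys, zs):
        n = len(ys)
        zbar, ybar = sum(zs) / n, sum(ys) / n
        b = (sum((zz - zbar) * (yy - ybar) for zz, yy in zip(zs, ys))
             / sum((zz - zbar) ** 2 for zz in zs))
        a = ybar - b * zbar
        return sum((yy - a - b * zz) ** 2 for yy, zz in zip(ys, zs))

    T, m = len(y), len(retained)
    rss_r = ols_rss(y, z)                         # restricted: no impulses
    keep = [t for t in range(T) if t not in set(retained)]
    rss_u = ols_rss([y[t] for t in keep], [z[t] for t in keep])  # impulses added
    return ((rss_r - rss_u) / m) / (rss_u / (T - k - m))

random.seed(12)
T, beta = 100, 2.0
z = [random.gauss(1, 2) for _ in range(T)]
eps = [random.gauss(0, 1) for _ in range(T)]
dates = [40, 55, 70, 85, 99]    # impulses retained from the marginal (hypothetical)

y_null = [beta * z[t] + eps[t] for t in range(T)]                        # null holds
y_fail = [beta * z[t] + (10.0 if t in dates else 0.0) + eps[t]
          for t in range(T)]                                             # shifts enter y

F_null = super_exogeneity_F(y_null, z, dates)   # moderate, near F(5, 93) values
F_fail = super_exogeneity_F(y_fail, z, dates)   # far out in the tail
print(F_null, F_fail)
```

Under the null the statistic behaves like F(m, T − k − m); when the marginal shifts also enter the conditional model, the statistic explodes, which is the source of the test's potency.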

The null rejection frequency of this F-test of super exogeneity in the conditional model should not depend on the significance level, α1, used for each individual test in the marginal model. However, too large a value of α1 will lead to an F-test with large degrees of freedom; too small will lead to few, or even no, impulses being retained from the marginal models. Monte Carlo evidence presented in Section 5.1 supports that contention. For example, with four conditioning variables and T = 100, then under the null, α1 = 0.01 would yield four impulses in general, whereas α1 = 0.025 would deliver 10. Otherwise, the main consideration for choosing α1 is to allow power against reasonable alternatives to super exogeneity.

A variant of the test in (31), which builds on Hendry and Santos (2005) and has different power characteristics, is to combine the m impulses detected in (30) into an index (see Hendry and Santos, 2007).

5. Null rejection frequency of the impulse-based test

Reconsider the earlier sequentially factorized DGP in (19), where under the null of super exogeneity, from (23):

yt = μ + β′zt + η′xt−1 + εt (32)

so although the {zt} process is nonconstant, the linear relation between yt and zt in (32) is constant.

Let Sα1 denote the dates of the significant impulses {1{ti}} retained in the model for the marginal process (30) where:

|tτ̂i,ti| > cα1 (33)

when cα1 is the critical value for significance level α1. In the model (32) for yt | zt, xt−1, conditioning on zt implies taking the v2,t as fixed, so stacking the impulses in {1{ti}} in the vector 1t:

E[yt | zt, xt−1] = μ + β′zt + η′xt−1 + δ′1t (34)

where δ = 0 under the null. Given a significance level α2, a subset of the indicators {1t} will be retained in the conditional econometric model, given that they were retained in the marginal, when:

|tδ̂j| > cα2. (35)

Thus, when (33) occurs, the probability of retaining any indicator in the conditional is:

P(|tδ̂j| > cα2 | |tτ̂i,ti| > cα1) = P(|tδ̂j| > cα2) = α2 (36)

as (33) holds, which only depends on the significance level cα2 used on the conditional model and not on α1. If (33) does not occur, no impulses are retained, then P(|tδ̂j| > cα2) = 0, so the super-exogeneity test will under-reject under the null.

5.1. Monte Carlo evidence on the null rejection frequency

The Monte Carlo experiments estimate the empirical null rejection frequencies of the super-exogeneity test for a variety of settings, sample sizes, and nominal significance levels, and check if there is any dependence of these on the nominal significance levels for impulse retention in the marginal process. If there is dependence, then searching for the relevant dates at which shifts might have occurred in the marginal would affect testing for associated shifts in the conditional. In the following subsections, super exogeneity is the null, and we consider three settings for the marginal process: where there are no breaks in §5.1.1; a mean shift in §5.1.2; and a variance change in §5.1.3. Because the "size" of a test statistic has a definition that is only precise for a similar test, and the word is ambiguous in many settings (such as sample size), we use the term "gauge" to denote the empirical null rejection frequency of the test procedure. As Autometrics selection seeks a congruent model, irrelevant variables with |t| < cα can sometimes be retained, and gauge correctly reflects their presence, whereas "size" would not (e.g., Hoover and Perez, 1999, report "size" for significant irrelevant variables only).

The general form of DGP is the bivariate system:

(yt, zt)′ | xt−1 ∼ IN2 [(μ + βξ(t)μz + η′xt−1, ξ(t)μz)′, σ22 (σ⁻¹22σ11 + β²θ(t), βθ(t); βθ(t), θ(t))] (37)

where ξ(t) = 1 + ξ1{t>T1} and θ(t) = 1 + θ1{t>T2}, so throughout:

γ2,t = σ12,t/σ22,t = βσ22θ(t)/(σ22θ(t)) = β = γ2 (38)

ω²t = σ11,t − σ²12,tσ⁻¹22,t = σ11 + β²σ22θ(t) − β²σ²22θ²(t)/(σ22θ(t)) = σ11 = ω² (39)

and hence from (37):

E[yt | zt, xt−1] = μ + βξ(t)μz + η′xt−1 + γ2 (zt − ξ(t)μz) = μ + βzt + η′xt−1. (40)

Three cases of interest are ξ = θ = 0, ξ ≠ 0, and θ ≠ 0, in each of which super exogeneity holds, but for different forms of change in the marginal process. In all cases, β = 2 = γ2 and ω² = 1, which are the constant and invariant parameters of interest, with σ22 = 5. Any changes in the marginal process occur at time T1 = 0.8T. The impulse saturation uses a partition of T/2 with M = 10,000 replications. Sample sizes of T = (50, 100, 200, 300) are investigated, and we examine all combinations of four significance levels for both α1 (for testing impulses in the marginal) and α2 (testing in the conditional) equal to (0.1, 0.05, 0.025, 0.01).
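The algebra of (38)–(40) can be checked mechanically: with σ12,t = βσ22θ(t) and σ22,t = σ22θ(t) as in (37), the implied regression coefficient and conditional error variance never move, whatever the variance regime θ(t). A brief verification with the experimental values quoted above:

```python
beta, sigma11, sigma22 = 2.0, 1.0, 5.0       # values used in the experiments

for theta_t in (1.0, 2.0, 5.0, 10.0):        # variance regimes of the marginal
    s22 = sigma22 * theta_t                  # sigma_{22,t}
    s12 = beta * s22                         # sigma_{12,t}
    s11 = sigma11 + beta ** 2 * s22          # sigma_{11,t} implied by (37)
    gamma2_t = s12 / s22                     # (38): equals beta throughout
    omega2_t = s11 - s12 ** 2 / s22          # (39): equals sigma11 throughout
    assert gamma2_t == beta
    assert abs(omega2_t - sigma11) < 1e-12
```

This is why the three Monte Carlo settings below all sit under the null: the marginal parameters shift, but the conditional parameters of interest do not.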

5.1.1. Constant marginal

The baseline DGP is (37) with ξ = θ = 0, μz = 1 and η = 0. Thus, the parameters of the conditional model yt|zt are φ′1 = (μ; γ2; ω²) = (0; 2; 1) and the parameters of the marginal are φ′2,t = (μ2,t; σ22,t) = (1; 5). The conditional representation is:

yt = βzt + Σ(i∈Sα1) δi1ti + εt (41)

and testing super exogeneity is based on the F-test of the null δ = 0 in (41).

The first column in Figure 9.4 reports the test's gauges, where α1 is the nominal significance level used for the t-tests on each individual indicator in the marginal model (horizontal axis), and α2 is the significance level for the F-test on the set of retained dummies in the conditional (vertical axis). Unconditional rejection frequencies are recorded throughout.

The marginal tests should not use too low a probability of retaining impulses, or else the conditional must automatically have a zero null rejection frequency. For example, at T = 50 and α1 = 0.01, about one impulse per two trials will be retained, so half the time no impulses will be retained; on the other half of the trials, rejection occurs with probability about α2, so roughly 0.5α2 will be found overall, as simulation confirms. The simulated gauges and nominal null rejection frequencies are close so long as α1T > 3. Then there is no distortion in the number of retained dummies in the conditional. However, constant marginal processes are the "worst case": the next two sections consider mean and variance changes where many more impulses are retained, so there are fewer cases of no impulses detected to enter in the conditional.

[Figure 9.4 here. Rows correspond to T = 300, 200, 100, 50; each panel plots the gauge (0 to 0.10) for the four α2 levels 0.100, 0.050, 0.025, 0.010. Column 1 varies α1 in the marginal, column 2 varies ξ, and column 3 varies θ.]

Fig. 9.4. Gauges of F-tests in the conditional as α1, ξ or θ vary in the marginal

5.1.2. Changes in the mean of zt

The second DGP is given by (37), where ξ = 2, 10, 100 with θ = 0, μz = 1 and η = 0. Super exogeneity holds irrespective of the level shift in the marginal; however, it is important to check that spurious rejection is not induced by breaks in marginal processes. The variance–covariance matrix is constant, but could be allowed to change as well, provided the values matched the conditions for super exogeneity as in §5.1.3.

The second column of graphs in Figure 9.4 reports the test's gauges where the horizontal axis now corresponds to the three values of ξ, using α1 = 2.5% throughout.

Despite large changes in ξ, when T > 100, the gauges are close to the nominal significance levels. Importantly, the test does not spuriously reject the null, but now is slightly undersized at T = 50 for small shifts, as again sometimes no impulses are retained for small shifts.

5.1.3. Changes in the variance of zt

The third DGP is given by (37), where θ = 2, 5, 10 with ξ = 0, μz = 1 and η = 0, so φ1,t is again invariant to changes in φ2,t induced by changes in σ22,t. The impulse-saturation test has the power to detect variance shifts in the marginal, so, like the previous case,

more than αT impulses should be retained on average, depending on the magnitude of the marginal variance change (see §6.2).

The third column of graphs in Figure 9.4 reports the test's gauges as before. Again, the vertical axis reports α2, the nominal significance level for the F-test on the set of retained impulses in the conditional, but now the horizontal axis corresponds to the three values of θ, using α1 = 2.5% throughout.

The F-test has gauge close to the nominal for T > 100, even when the variance of the marginal process changes markedly, but the test is again slightly undersized at T = 50 for small shifts. As in §5.1.2, the test is not "confused" by variance changes in the marginal to falsely imply a failure of super exogeneity even though the null holds.

Overall, the proposed test has appropriate empirical null rejection frequencies for both constant and changing marginal processes, so we now turn to its ability to detect failures of exogeneity. Being a selection procedure, test rejection no longer corresponds to the conventional notion of "power", so we use the term "potency" to denote the average non-null rejection frequency of the test.

This test involves a two-stage process: first detect shifts in the marginal, then use those to detect shifts in the conditional. The properties of the first stage have been considered in Santos and Hendry (2006), so we only note them here, partly to establish notation for the second stage considered in §7.

6. Potency at stage 1

We consider the potency at stage 1 for a mean shift then a variance change, both at time T1.

6.1. Detecting a mean shift in the marginal

Marginal models in their simplest form are:

zj,t = Σ(i∈Sα1) τi,j,α1 1{ti} + v∗2,j,t (42)

when the marginal process is (43):

zj,t = λj1{t>T1} + v2,j,t (43)

where H1: λj ≠ 0 ∀j holds. The potency to retain each impulse in (42) depends on the probability of rejecting the null for the associated estimated τi,j,α1:

τ̂i,j,α1 = λj + v∗2,j,ti.

The properties of tests on such impulse indicators are discussed in Hendry and Santos (2005). Let ψλ,α1 denote the noncentrality, then as V[τ̂i,j,α1] = σ22,j:

E[tτ̂i,j,α=0 (ψλ,α1)] = E[τ̂i,j,α1/√σ22,j] ≈ λj/√σ22,j = ψλ,α1. (44)

When v2,j,t is normal, the potency could be computed directly from the t-distribution: as most outliers will have been removed, normality should be a reasonable approximation. However, the denominator approximation requires most other shifts to have been detected. We compute the potency functions using an approximation to t²τ̂i,j,α1=0 by a χ² with one degree of freedom:

t²τ̂i,j,α1=0 (ψ²λ,α1) ≈ χ²1 (ψ²λ,α1). (45)

Relating that noncentral χ2 distribution to a central χ2 using (see e.g., Hendry, 1995):

χ21

(ψ2

λ,α1

) hχ2m (0) (46)

where:

h =1 + 2ψ2

λ,α1

1 + ψ2λ,α1

and m =1 + ψ2

λ,α1

h. (47)

Then the potency function of the χ21(ψ

2λ,α1

) test in (45) is approximated by:

P[t2τi,j,α1=0

(ψ2

λ,α1

)> cα1 |H1

] P

[χ2

1

(ψ2

λ,α1

)> cα1 |H1

] P

[χ2

m (0) > h−1cα1

]. (48)

For noninteger values of $m$, a weighted average of the neighboring integer values is used. For example, when $\psi^2_{\lambda,\alpha_1} = 16$ and $c_{\alpha_1} = 3.84$, then $h \simeq 1.94$ and $m = 8.76$ (taking the nearest integer values as 8 and 9 with weights 0.24 and 0.76), which yields $P[t^2_{\tau_{i,j,\alpha_1}=0}(16) > 3.84] \simeq 0.99$, as against the exact t-distribution outcome of 0.975. When $\lambda_j = d\sqrt{\sigma_{22,j}}$, so $\psi^2_{\lambda,\alpha_1} = d^2$, then $p_\lambda = P[t^2_{\tau_{i,j,\alpha_1}}(d^2) > c_{\alpha_1}]$ rises from 0.17, through 0.50 to 0.86 as $d$ is 1, 2, 3 at $c_{\alpha_1} = 3.84$, so the potency is low at $d = 1$ (the t-distribution outcome for $d = 1$ is 0.16), but has risen markedly even by $d = 3$.
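As a numerical check, the approximation (46)-(48) can be coded directly. The sketch below is our own construction, not part of the chapter: `chi2_sf` is an exact central chi-squared survival function for integer degrees of freedom, and `approx_potency` applies the $h$, $m$ conversion with the stated weighted average over the neighboring integer values of $m$.

```python
import math

def chi2_sf(x, k):
    """P[chi^2_k > x] for integer degrees of freedom k >= 1 (central chi-squared)."""
    if k % 2 == 0:
        # even k: closed form exp(-x/2) * sum_{j < k/2} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, k // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # odd k: build up from k = 1, where P[chi^2_1 > x] = erfc(sqrt(x/2))
    sf = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    for j in range((k - 1) // 2):
        sf += term
        term *= x / (2 * j + 3)
    return sf

def approx_potency(psi2, c_alpha):
    """P[chi^2_1(psi2) > c_alpha] via the central approximation (46)-(48)."""
    h = (1 + 2 * psi2) / (1 + psi2)   # eq. (47)
    m = (1 + psi2) / h
    lo, w = int(m), m - int(m)        # weights on the neighboring integer df
    return (1 - w) * chi2_sf(c_alpha / h, lo) + w * chi2_sf(c_alpha / h, lo + 1)
```

With $c_{\alpha_1} = 3.84$, `approx_potency(d**2, 3.84)` gives approximately 0.17, 0.50, 0.86 for $d = 1, 2, 3$ and 0.99 at $\psi^2 = 16$, matching the figures in the text; the same function at $\psi^2 = 5, 10$ also reproduces the "about (60%, 90%)" potencies quoted for variance shifts in §6.2.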

In practice, Autometrics selects impulses within contiguous blocks with approximately these probabilities, but has somewhat lower probabilities for scattered impulses. For example, for the two DGPs:

$$ D_1: \quad y_{1,t} = d\,(I_{T-19} + \cdots + I_T) + u_t, \quad u_t \sim \mathrm{IN}(0, 1) $$
$$ D_3: \quad y_{3,t} = d\,(I_1 + I_6 + I_{11} + \cdots) + u_t, \quad u_t \sim \mathrm{IN}(0, 1) $$

where the model is just a constant and T dummies for T = 100. While both have 20 relevant indicators, the potency per impulse differs as shown in Table 9.1. There is a close match between analytic power and potency in D1, and both rise rapidly with d, the standardized shift. D3 poses greater detection difficulties as all subsamples are alike (by construction); the split-half algorithm performs poorly on such experiments relative to Autometrics. Modifying the experiment to an intermediate case of (say) five breaks of length 4 delivers potency similar to D1. Importantly, breaks at the start or end of the sample are no more difficult to detect. Thus, we use (48) as the approximation for the first-stage potency.


Table 9.1. Impulse saturation in Autometrics at 1% nominal size, T = 100, M = 1000

                     d = 0   d = 1   d = 2   d = 3   d = 4   d = 5
D1  gauge %            1.5     1.2     0.9     0.3     0.7     1.1
    potency %           —      4.6    25.6    52.6    86.3    99.0
    analytic power %    —      6.1    26.9    65.9    93.7    99.7
D3  gauge %            1.5     1.0     0.4     0.3     1.0     0.8
    potency %           —      3.5     7.9    24.2    67.1    90.2
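The contrast between contiguous (D1) and scattered (D3) breaks can be mimicked with a deliberately simplified one-pass split-half rule. This is our own caricature under stated assumptions, not the Autometrics algorithm (which iterates over many partitions and reselects): with a scalar marginal and no regressors other than the impulses, saturating one half leaves each impulse coefficient equal to the corresponding observation.

```python
import math
import random

def split_half_saturate(z, c=2.0):
    """One pass of split-half impulse saturation on a scalar marginal, as in (42):
    each observation in the saturated half is its own impulse coefficient, judged
    against the error s.d. estimated from the other half; then the halves swap."""
    T, half = len(z), len(z) // 2
    retained = []
    for lo, hi in ((0, half), (half, T)):
        other = z[:lo] + z[hi:]
        s = math.sqrt(sum(v * v for v in other) / len(other))
        retained += [t for t in range(lo, hi) if abs(z[t]) > c * s]
    return retained

# D1-style DGP: T = 100, mean shift of d = 4 over the last 20 observations
random.seed(12345)
d, T, T1 = 4.0, 100, 80
z = [d * (t >= T1) + random.gauss(0.0, 1.0) for t in range(T)]
hits = split_half_saturate(z)
```

With a 4σ contiguous end-of-sample break, nearly all 20 break impulses survive and few spurious ones are retained; scattering the same 20 impulses (as in D3) inflates the estimated error s.d. in both halves and degrades this one-pass rule far more, consistent with Table 9.1.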

6.2. Detecting a variance shift in the marginal

Consider a setting where the variance shift $\theta > 1$ occurs when $T_1 > T/2$, so that:

$$ z_t = 1 + \left(1_{\{t < T_1\}} + \sqrt{\theta}\,1_{\{t \geq T_1\}}\right) v_t. \qquad (49) $$

The maximum feasible potency would be from detecting and entering the set of $k = T - T_1 + 1$ impulses $1_{\{t \geq T_1\}}$, each of which would then equal $\sqrt{\theta}\,1_{\{t \geq T_1\}} v_t$, to be judged against a baseline variance of $\sigma^2_v$:

$$ t_{\tau_t} = \frac{\sqrt{\theta}\,1_{\{t \geq T_1\}} v_t}{\sigma_v}, $$

so $t^2_{\tau_t}$ has a noncentrality of $\psi^2_{\theta,\alpha_1} = \theta$. Approximating by $h\,\chi^2_m(0)$ as in (48), for $\psi^2_{\theta,\alpha_1} = (2; 5; 10)$ potency will be about (25%, 60%, 90%), respectively, at $\alpha_1 = 0.05$. Thus, only large changes in variances will be detected.

Viewing the potencies at stage 1 as the probability $p_\lambda$ of retaining a relevant impulse from the marginal model, then approximately $p_\lambda k \leq k$ relevant impulses will be retained for testing in the conditional model, attenuating the noncentrality (denoted $\varphi_{\delta,\alpha_1}$) of the F-test of $\delta = 0$ in (41) relative to known shifts. Further, retention of irrelevant impulses – corresponding to nonbreak-related shocks in the marginal process – will also lower potency relative to knowing the shifts. For the F-test of $\delta = 0$, this increases its degrees of freedom, but that should only induce a small potency reduction for small $\alpha_1$. For a given noncentrality $\varphi_{\delta,\alpha_1}$, however, that effect also differs depending on the magnitudes and lengths of the shifts in the marginal, as fewer irrelevant impulses will be retained when (e.g.) there is a large, short shift.

7. Super-exogeneity failure

In this section, we derive the outcome for a super-exogeneity failure due to a weak exogeneity violation when the marginal process is nonconstant, and obtain the noncentrality and approximate potency of the test when there is a single location shift in the marginal. Figure 9.1 showed high constancy-test rejection frequencies for both that setting and


even a single impulse. Section 9 reports the simulation outcomes. As seen in §3.1, many causes of failure are possible, including shifts in variances in marginal processes and any cross-links between conditional and marginal parameters, but location shifts due to changes in policy rules are a central scenario.

The potency at the second stage, conditional on the saturation approach locating all, and only, the relevant impulses corresponding to shifts in the marginal, is easily calculated, but will only be accurate for large magnitude breaks, parameterized below by $\lambda$. For smaller values of $\lambda$, fewer impulses will be detected in the marginal. Moreover, although the null rejection frequency of the test in the conditional does not depend on $\alpha_1$ once $\alpha_1 T > 3$, the potency will, suggesting that a relatively nonstringent $\alpha_1$ should be used. However, that will lead to retaining some "spurious" impulses in the marginal, albeit fewer than $\alpha_1 T_1$, because shifts lower the remaining null rejection frequency (see, e.g., Table 9.1).

We use the formulation in §3 for a normally distributed $n \times 1$ vector $x'_t = (y_t : z'_t)$ generated by (19), with $E[y_t \mid z_t]$ given by (21), where $\gamma = \Sigma^{-1}_{22}\sigma_{12}$, $\eta = 0$ and conditional variance $\omega^2 = \sigma_{11} - \sigma'_{12}\Sigma^{-1}_{22}\sigma_{12}$. The parameter of interest is $\beta$ in (20), so:

$$ y_t = \mu + \beta' z_t + (\gamma - \beta)'\left(z_t - \mu_{2,t}\right) + \epsilon_t = \mu + \gamma' z_t + (\beta - \gamma)'\mu_{2,t} + \epsilon_t \qquad (50) $$

where $y_t - E[y_t \mid z_t] = \epsilon_t \sim \mathrm{IN}[0, \sigma^2_\epsilon]$, so $E[\epsilon_t \mid z_t] = 0$. However, $E[y_t \mid z_t] \neq \beta' z_t$ when $\beta \neq \gamma$, violating weak exogeneity, so (50) will change as $\mu_{2,t}$ shifts. Such a conditional model is an example of the Lucas (1976) critique, where the agents' behavioral rule depends on $E[z_t]$ as in (20), whereas the econometric equation conditions on $z_t$.

To complete the system, the break in the marginal process for $\{z_t\}$, which induces the violation of super exogeneity, is parameterized as:

$$ z_t = \mu_{2,t} + v_{2,t} = \lambda 1_{\{t > T_1\}} + v_{2,t}. \qquad (51) $$

In practice, there could be multiple breaks in different marginal processes at different times, which may affect one or more $z_t$s, but little additional insight is gleaned over the one-off break in (51), which is sufficiently general as the proposed test is an F-test on all retained impulses, so does not assume any specific break form at either stage. The advantage of using the explicit alternative in (51) is that approximate analytic calculations are feasible. As §2 showed that the key shifts are in the long-run mean, we use the Frisch and Waugh (1933) theorem to partial out means but, with a slight abuse of notation, do not alter it. Combining (50) with (51) and letting $\delta = (\beta - \gamma)'\lambda$, the DGP becomes:

$$ y_t = \mu + \gamma' z_t + (\beta - \gamma)'\lambda 1_{\{t > T_1\}} + \epsilon_t = \mu + \gamma' z_t + \delta 1_{\{t > T_1\}} + \epsilon_t. \qquad (52) $$

Testing for the impulse dummies in the marginal model yields:

$$ z_{t_i} = \sum_{i \in S_{\alpha_1}} \bar{\tau}_{i,\alpha_1} 1_{\{t_i\}} + v^{*}_{2,t_i} \qquad (53) $$


where $S_{\alpha_1}$ denotes the set of impulses $\bar{\tau}_{i,\alpha_1} = \lambda 1_{\{t_i > T_1\}} + v_{2,t_i}$, with $v^{*}_{2,t_i} = 0\ \forall i \in S_{\alpha_1}$, defined by:

$$ t^2_{\tau_{i,j,\alpha_1}=0} > c_{\alpha_1}. \qquad (54) $$

Stacking the significant impulses from (54) in $\iota_t$, and adding these to (50), yields the test regression:

$$ y_t = \kappa_0 + \kappa'_1 z_t + \kappa'_2 \iota_t + e_t \qquad (55) $$

The main difficulty in formalizing the analysis is that $\iota_t$ varies between draws in both its length and its contents. As the test is an F-test for an i.i.d. DGP, the particular relevant and irrelevant impulses retained should not matter, merely their total numbers from the first stage. Consequently, we distinguish:

(a) the length of the break, $Tr$;
(b) the number of relevant retained elements in the index, which on average will be $p_\lambda Tr$, where $p_\lambda$ is the probability of retaining any given relevant impulse from §6.1; and
(c) the total number of retained impulses in the model, $Ts$, usually including some irrelevant ones, where on average $s = (p_\lambda r + \alpha_1)$, which determines the average degrees of freedom of the test.

The F-test will have $Ts$ numerator degrees of freedom and $T(1 - s) - n$ denominator degrees of freedom (allowing for the constant). The potency of the $F^{Ts}_{T(1-s)-n}$-test of:

$$ H_0: \kappa_2 = 0 \qquad (56) $$

in (55) depends on: the strengths of the super-exogeneity violations, $(\beta_i - \gamma_i)$; the magnitudes of the breaks, $\lambda_i$, both directly and through their detectability, $p_\lambda$, in the marginal models, in turn dependent on $\alpha_1$; the sample size $T$; the relative number of periods $r$ affected by the break; the number of irrelevant impulses retained; and $\alpha_2$. The properties are checked by simulation below, and could be contrasted with the optimal, but generally infeasible, test based on adding the index $1_{\{t > T_1\}}$, instead of the impulses $\iota_t$, equivalent to a Chow (1960) test (see Salkever, 1976).
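Given a set of retained impulse dates, the test regression (55) and the F-statistic for (56) are mechanical. The sketch below (assuming NumPy is available; the simulated DGP parameters are our own illustrative choices) takes the break dates as known rather than selected:

```python
import numpy as np

def super_exogeneity_F(y, z, impulse_dates):
    """F-statistic for H0: kappa_2 = 0 in (55)-(56): compare the RSS of y on
    (constant, z_t) with and without the retained impulses iota_t."""
    T = len(y)
    X0 = np.column_stack([np.ones(T), z])
    D = np.zeros((T, len(impulse_dates)))
    D[impulse_dates, np.arange(len(impulse_dates))] = 1.0
    X1 = np.column_stack([X0, D])
    rss = lambda X: np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
    q, dof = D.shape[1], T - X1.shape[1]
    return ((rss(X0) - rss(X1)) / q) / (rss(X1) / dof)

# DGP (52): gamma = 2, lambda = 4, delta = (beta - gamma) * lambda = -4, break at t >= 80
rng = np.random.default_rng(0)
T, T1, gam, delta, lam = 100, 80, 2.0, -4.0, 4.0
shift = (np.arange(T) >= T1).astype(float)
z = lam * shift + rng.standard_normal(T)
eps = rng.standard_normal(T)
F_alt = super_exogeneity_F(gam * z + delta * shift + eps, z, np.arange(T1, T))
F_null = super_exogeneity_F(gam * z + eps, z, np.arange(T1, T))
```

Under the weak-exogeneity failure, F_alt lies far above any conventional critical value, while under the null F_null behaves like a central F(20, 78) draw.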

A formal derivation could either include $p_\lambda Tr$ impulses, akin to a mis-specification analysis, or model $\iota_t$ in (55) as containing all $Tr$ relevant impulses, each retained with probability $p_\lambda > 0$. The impact of irrelevant retained impulses is merely to reduce the number of available observations, so it lowers potency slightly, and can otherwise be neglected. Taking the second route, namely the fixed-length $\kappa_2$, the full-sample representations are:

$$ \text{DGP}: \quad y = Z\gamma + \delta I^{*}_{Tr} 1_{Tr} + \epsilon $$
$$ \text{Model}: \quad y = Z\kappa_1 + J^{*}_{Tr}\kappa_2 + e $$
$$ \text{Exogenous}: \quad Z = I^{*}_{Tr} 1_{Tr}\lambda' + V_2 \qquad (57) $$

where:

$$ I^{*}_{Tr} = \begin{pmatrix} 0_{T(1-r) \times Tr} \\ I_{Tr} \end{pmatrix}; \qquad J^{*}_{Tr} = p_\lambda I^{*}_{Tr} \qquad (58) $$


so $1_{Tr}$ is $Tr \times 1$ with $Tr$ elements of unity, etc. To relate the DGP to the model, add and subtract $\delta J^{*}_{Tr} 1_{Tr}$, noting that omitted impulses are orthogonal to the included, so $(I^{*}_{Tr} - J^{*}_{Tr}) = K^{*}_{Tr}$ with $J^{*\prime}_{Tr} K^{*}_{Tr} = 0$:

$$ y = Z\gamma + J^{*}_{Tr}\left(\delta 1_{Tr}\right) + \delta K^{*}_{Tr} 1_{Tr} + \epsilon. \qquad (59) $$

Combinations involving $J^{*}_{Tr}$ also have probability $p_\lambda$, as it is only the initial chance of selection that matters and, conditional on that, retention thereafter occurs with certainty. Then, using (59) and letting $I^{*\prime}_{Tr} Z = Z_{Tr}$:

$$ \begin{pmatrix} \kappa_1 - \gamma \\ \kappa_2 - \delta 1_{Tr} \end{pmatrix} = \begin{pmatrix} Z'Z & Z'J^{*}_{Tr} \\ J^{*\prime}_{Tr}Z & J^{*\prime}_{Tr}J^{*}_{Tr} \end{pmatrix}^{-1} \begin{pmatrix} Z'y \\ J^{*\prime}_{Tr}y \end{pmatrix} - \begin{pmatrix} \gamma \\ \delta 1_{Tr} \end{pmatrix} $$

$$ = \begin{pmatrix} G^{-1} & -G^{-1}Z'_{Tr} \\ -Z_{Tr}G^{-1} & (p_\lambda I_{Tr})^{-1} + Z_{Tr}G^{-1}Z'_{Tr} \end{pmatrix} \begin{pmatrix} \delta(1 - p_\lambda)Z'I^{*}_{Tr}1_{Tr} + Z'\epsilon \\ p_\lambda \epsilon_{Tr} \end{pmatrix} $$

$$ = \delta(1 - p_\lambda)\begin{pmatrix} G^{-1}Z'_{Tr}1_{Tr} \\ -Z_{Tr}G^{-1}Z'_{Tr}1_{Tr} \end{pmatrix} + \begin{pmatrix} G^{-1}\left(Z'\epsilon - p_\lambda Z'_{Tr}\epsilon_{Tr}\right) \\ \epsilon_{Tr} - Z_{Tr}G^{-1}\left(Z'\epsilon - p_\lambda Z'_{Tr}\epsilon_{Tr}\right) \end{pmatrix} \qquad (60) $$

where $G = (Z'Z - p_\lambda Z'_{Tr}Z_{Tr})$. Since $E[Z_{Tr}] = 1_{Tr}\lambda'$ and $\lambda 1'_{Tr} 1_{Tr} = Tr\lambda$, approximating by:

$$ E\left[TG^{-1}\right] \simeq \left(E\left[T^{-1}G\right]\right)^{-1} = \left((1 - p_\lambda) r\lambda\lambda' + (1 - p_\lambda r)\Sigma_{22}\right)^{-1} \qquad (61) $$

then:

$$ E\begin{bmatrix} \kappa_1 \\ \kappa_2 \end{bmatrix} \simeq \begin{pmatrix} \gamma \\ \delta 1_{Tr} \end{pmatrix} - r\delta(1 - p_\lambda)\begin{pmatrix} -f_\lambda \\ \lambda' f_\lambda 1_{Tr} \end{pmatrix} = \begin{pmatrix} \gamma^{*} \\ \delta^{*} 1_{Tr} \end{pmatrix} \quad \text{where:} \quad f_\lambda = \left(E\left[T^{-1}G\right]\right)^{-1}\lambda. \qquad (62) $$

As expected, the bias term vanishes when $p_\lambda = 1$. Also, using the same approximations, the reported covariance matrix (which will differ from the correct covariance matrix based on the distribution in (60)) is:

$$ \mathrm{Cov}\begin{bmatrix} \kappa_1 \\ \kappa_2 \end{bmatrix} \simeq \frac{\sigma^2_e}{T}\begin{pmatrix} G^{-1} & -G^{-1}\lambda 1'_{Tr} \\ -1_{Tr}\lambda' G^{-1} & \left(Tp_\lambda^{-1} I_{Tr} + \lambda' G^{-1}\lambda\, 1_{Tr}1'_{Tr}\right) \end{pmatrix} \qquad (63) $$

where, evaluated at $\gamma^{*}$ and $\delta^{*}$:

$$ \sigma^2_e \simeq \sigma^2_\epsilon + \delta^2(1 - p_\lambda)^2 r^2\left(1 + f'_\lambda \Sigma_{22} f_\lambda + 2f'_\lambda\lambda + r(1 - p_\lambda)\left(f'_\lambda\lambda\right)^2\right). \qquad (64) $$


In the special case that $p_\lambda = 1$, consistent estimates of $\gamma$ result, with $T^{-1}G = (1 - r)\Sigma_{22}$ and $\sigma^2_e = \sigma^2_\epsilon$. As:

$$ F^{Ts}_{T(1-s)-n}(\kappa_2 = 0) = \frac{(T(1 - s) - n)\,\kappa'_2\left(Tp_\lambda^{-1} I_{Tr} + \lambda' G^{-1}\lambda\, 1_{Tr}1'_{Tr}\right)^{-1}\kappa_2}{Ts\,\sigma^2_e}, $$

using:

$$ 1'_{Tr}\left(I_{Tr} + x\, 1_{Tr}1'_{Tr}\right)^{-1} 1_{Tr} = \frac{Tr}{1 + Tr\,x} $$

then:

$$ T^{-1}p_\lambda 1'_{Tr}\left(I_{Tr} + T^{-1}p_\lambda\lambda' G^{-1}\lambda\, 1_{Tr}1'_{Tr}\right)^{-1} 1_{Tr} = \frac{r p_\lambda}{1 + r p_\lambda \lambda' G^{-1}\lambda} \qquad (65) $$

so an approximate explicit expression for the noncentrality of the $F^{Ts}_{T(1-s)-n}$-test is:

$$ \varphi^2_{s,F} \simeq \frac{(T(1 - s) - n)\, p_\lambda r\, (\delta^{*})^2}{Ts\,\sigma^2_e\left(1 + r p_\lambda \lambda' G^{-1}\lambda\right)}. \qquad (66) $$

All the factors affecting the potency of the automatic test are clear in (66). The important selection mistake is missing relevant impulses: when $p_\lambda < 1$ in (60), then $\sigma^2_e > \sigma^2_\epsilon$, so $\varphi^2_s$ falls rapidly with $p_\lambda$. Consequently, a relatively loose first-stage significance level seems sensible, e.g., 2.5%. The potency is not monotonic in $s$, as the degrees of freedom of the F-test alter: a given value of $\delta$ achieved by a larger $s$ will have lower potency than that from a smaller $s > r$.

For numerical calculations, we allow on average $\alpha_1 T$ random extra impulses and $p_\lambda r T = Tq$ relevant impulses to be retained, so approximate $F^{Ts}_{T(1-s)-n}(\varphi^2_{s,F})$ by a $\chi^2_{Ts}(\varphi^2_s)$ for $\varphi^2_s = Tq\,\varphi^2_{s,F}$, where $P[\chi^2_{Ts}(0) > c_{\alpha_2}] = \alpha_2$, using:

$$ P\left[\chi^2_{Ts}\left(\varphi^2_s\right) > c_{\alpha_2} \mid H_1\right] \simeq P\left[\chi^2_m(0) > h^{-1}c_{\alpha_2}\right] \qquad (67) $$

with:

$$ h = \frac{Ts + 2\varphi^2_s}{Ts + \varphi^2_s} \quad \text{and} \quad m = \frac{Ts + \varphi^2_s}{h}. \qquad (68) $$

Some insight can be gleaned into the potency properties of the test when $n = 2$. In that case, $G = \left((1 - p_\lambda)r\lambda^2 + (1 - p_\lambda r)\sigma_{22}\right)$, and approximately, for small $\alpha_1$:

$$ \varphi^2_s \simeq \frac{T(1 - r)r p_\lambda (\delta^{*})^2}{\sigma^2_e\left(1 + r p_\lambda G^{-1}\lambda^2\right)} \underset{\lambda \to \infty}{\longrightarrow} \frac{T(1 - r)^2(\beta - \gamma)^2\sigma_{22}}{\sigma^2_\epsilon} \qquad (69) $$

where the last expression shows the outcome for large $\lambda$, so $p_\lambda \to 1$. Then (69) reflects the violation of weak exogeneity, $(\beta - \gamma)^2$; the signal–noise ratio, $\sigma_{22}/\sigma^2_\epsilon$; the loss from longer break lengths, $(1 - r)^2$; and the sample size, $T$. The optimal value of the noncentrality, $\varphi^2_r$, for a known break date and form – so the single variable $1_{\{t > T_1\}}$ is added – is:

$$ \varphi^2_r = \frac{Tr\delta^2}{\sigma^2_\epsilon\left(1 + r\sigma^{-1}_{22}\lambda^2\right)} \underset{\lambda \to \infty}{\longrightarrow} \frac{T\sigma_{22}(\beta - \gamma)^2}{\sigma^2_\epsilon}. \qquad (70) $$


Despite the nature of adding $Tr$ separate impulses when nothing is known about the existence or timing of a failure of super exogeneity, so $\varphi^2_s < \varphi^2_r$, their powers converge rapidly as the break magnitude $\lambda$ grows, when $r$ is not too large. The numerical evaluations of (69) in Table 9.4 below are reasonably accurate.

8. Co-breaking based tests

A key assumption underlying the above test is that impulse-saturation tests to detect breaks and outliers were not applied to the conditional model. In many situations, investigators will have done precisely that, potentially vitiating the ability of a direct super-exogeneity test to detect failures. Conversely, one can utilize such results for a deterministic co-breaking test of super exogeneity.

Again considering the simplest case for exposition, add impulses to the conditional model, such that after saturation:

$$ y_t = \mu_0 + \beta' z_t + \sum_{j=1}^{s} \kappa_j 1_{\{t_j\}} + \nu_t \qquad (71) $$

At the same time, if $S_{\alpha_1}$ denotes the significant dummies in the marginal model:

$$ z_t = \tau_0 + \sum_{j \in S_{\alpha_1}} \tau_j 1_{\{t_j\}} + u_t \qquad (72) $$

then the test tries to ascertain whether the timing of the impulses in (71) and (72) overlaps. For example, a perfect match would be strong evidence against super exogeneity, corresponding to the result above that the significance of the marginal-model impulses in the conditional model rejects super exogeneity.
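The timing comparison itself is a simple set operation. The helper below is our own illustrative sketch (the function name and the overlap share are not from the chapter):

```python
def cobreak_overlap(conditional_impulses, marginal_impulses):
    """Impulse dates retained in both the conditional model (71) and the
    marginal model (72); a large overlap is evidence against super exogeneity."""
    common = sorted(set(conditional_impulses) & set(marginal_impulses))
    share = len(common) / len(marginal_impulses) if marginal_impulses else 0.0
    return common, share
```

For example, conditional impulses at dates {10, 40, 75} against marginal impulses at {40, 75, 90} share the dates 40 and 75, i.e., two-thirds of the marginal impulses reappear in the conditional.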

9. Simulating the potencies of the automatic super-exogeneity test

We undertook simulation analyses using the bivariate relationship in Section 5.1 for violations of super exogeneity due to a failure of weak exogeneity under nonconstancy in:

$$ \begin{pmatrix} y_t \\ z_t \end{pmatrix} \sim \mathrm{IN}_2\left[\begin{pmatrix} \beta\mu_{2,t} \\ \mu_{2,t} \end{pmatrix}, \begin{pmatrix} 21 & 10 \\ 10 & 5 \end{pmatrix}\right] \qquad (73) $$

so $\gamma = 2$ and $\omega^2 = 1$, but $\beta \neq \gamma$, with a level shift at $T_1$ in the marginal:

$$ \mu_{2,t} = \lambda 1_{\{t > T_1\}} \quad \text{so} \quad \mu_{1,t} = \beta\lambda 1_{\{t > T_1\}}. \qquad (74) $$

We vary: $d = \lambda/\sqrt{\sigma_{22}}$ over the values 1, 2, 2.5, 3 and 4; $\beta$ over 0.75, 1, 1.5 and 1.75, reducing the extent of departure from weak exogeneity; two sample sizes (T = 100 and T = 300), which have varying break points, T1; and the significance levels $\alpha_1$ and $\alpha_2$ in the marginal and conditional. A partition of T/2 was always used for the impulse saturation in the marginal model, and M = 10,000 replications.

Table 9.2. Potencies of the F-test for a level shift at T1 = 250, T = 300, α1 = α2 = 0.05

d : β     0.75     1.0      1.5      1.75
1.0      0.191    0.153    0.078    0.054
2.0      0.972    0.936    0.529    0.150
2.5      1.000    0.993    0.917    0.339
3.0      1.000    1.000    0.998    0.653
4.0      1.000    1.000    1.000    0.967

Table 9.2 reports the empirical rejection frequencies of the F-test when T = 300 with 5% significance levels in both the marginal and conditional models, for a level shift at T1 = 250, so k = 50 and r = 1/6. The potency of the test increases with the departure of $\beta$ from $\gamma$, as expected, and with the magnitude of the level shift, d. Even moderate violations of the null are detectable for level shifts of 2.5σ or larger.
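A stripped-down end-to-end Monte Carlo of one design point of Table 9.2 can be sketched as follows (assuming NumPy is available). This is our own simplification, not the experiment in the chapter: stage 1 uses a one-pass split-half outlier rule instead of Autometrics, and stage 2 uses a fixed approximate 5% critical value of 1.4 (close to the upper 5% point of an F with roughly 50 and 240 degrees of freedom) instead of exact F quantiles.

```python
import math
import numpy as np

def stage1_impulses(z, c=1.96):
    """Simplified split-half impulse saturation on the marginal (cf. section 6.1):
    each observation in the saturated half is judged against the error s.d.
    estimated from the other half."""
    T, half = len(z), len(z) // 2
    keep = []
    for lo, hi in ((0, half), (half, T)):
        s = math.sqrt(np.mean(np.r_[z[:lo], z[hi:]] ** 2))
        keep += [t for t in range(lo, hi) if abs(z[t]) > c * s]
    return keep

def stage2_F(y, z, dates):
    """F-test of the retained impulses in the conditional test regression (55)."""
    T = len(y)
    X0 = np.column_stack([np.ones(T), z])
    D = np.zeros((T, len(dates)))
    D[dates, np.arange(len(dates))] = 1.0
    X1 = np.column_stack([X0, D])
    rss = lambda X: np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
    return ((rss(X0) - rss(X1)) / D.shape[1]) / (rss(X1) / (T - X1.shape[1]))

# Design point from eqs (73)-(74): T = 300, T1 = 250, d = 4, beta = 0.75,
# gamma = 2, sigma_22 = 5, omega^2 = 1.
rng = np.random.default_rng(7)
T, T1, gam, beta, s22, d = 300, 250, 2.0, 0.75, 5.0, 4.0
lam = d * math.sqrt(s22)
M, rejections = 100, 0
for _ in range(M):
    mu2 = lam * (np.arange(T) >= T1)
    v2 = math.sqrt(s22) * rng.standard_normal(T)
    z = mu2 + v2
    y = beta * mu2 + gam * v2 + rng.standard_normal(T)   # E[y|z] as in (50)
    dates = stage1_impulses(z)
    if dates and stage2_F(y, z, dates) > 1.4:            # approximate 5% critical value
        rejections += 1
potency_hat = rejections / M
```

At this design point Table 9.2 reports potency 1.000, and even this crude two-stage sketch rejects in essentially every replication.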

Table 9.3 shows the impact of reducing T − T1 to 25, cet. par. The potency is never smaller for the shorter break, so the degrees of freedom of the F-test are important, especially at intermediate potencies.

Table 9.3. Potencies of the F-test for a level shift at T1 = 275, T = 300, α1 = α2 = 0.05

d : β     0.75     1.0      1.5      1.75
1.0      0.377    0.274    0.097    0.060
2.0      1.000    0.997    0.803    0.238
2.5      1.000    1.000    0.990    0.504
3.0      1.000    1.000    1.000    0.797
4.0      1.000    1.000    1.000    0.984

Using more stringent significance levels of α1 = α2 = 2.5% naturally leads to a less potent test than the 5% in Table 9.2, although the detection probabilities still rise rapidly with the break magnitude, and even relatively mild departures from weak exogeneity are detected at the break magnitude of d = 4. The italic numbers in parentheses in Table 9.4 report the numerical evaluation of the analytic potency from (69) as a typical example, $p^{*}$, and the response surface in (75) checks its explanatory ability. The coefficient of $\log(p^{*})$ is not significantly different from unity and the intercept is insignificant.

$$ \log(p) = \underset{(0.04)}{0.96}\,\log(p^{*}) - \underset{(0.06)}{0.015} $$

R² = 0.975  σ̂_p = 0.21  F_het(2, 15) = 1.77  F_reset(1, 17) = 2.81  χ²_nd(2) = 6.4*  (75)


Table 9.4. Potencies of the F-test for a level shift at T1 = 250, T = 300, α1 = α2 = 0.025

d : β     0.75           1.0            1.5            1.75
1.0      0.081 (0.087)  0.065 (0.060)  0.035 (0.031)  0.026 (0.027)
2.0      0.717 (0.932)  0.612 (0.918)  0.220 (0.234)  0.062 (0.067)
2.5      0.977 (1.000)  0.953 (1.000)  0.616 (0.615)  0.143 (0.107)
3.0      1.000 (1.000)  0.999 (1.000)  0.953 (0.922)  0.372 (0.203)
4.0      1.000 (1.000)  1.000 (1.000)  1.000 (1.000)  0.908 (0.627)

Here, R² is the squared multiple correlation (when including a constant), σ̂_p is the residual standard deviation, coefficient standard errors are shown in parentheses, and the diagnostic tests are of the form F_j(k, T − l), which denotes an approximate F-test against the alternative hypothesis j: heteroskedasticity (F_het: see White, 1980); the RESET test (F_reset: see Ramsey, 1969); and χ²_nd(2) is a chi-square test for normality (see Doornik and Hansen, 2008); below we also present kth-order serial correlation (F_ar: see Godfrey, 1978); kth-order autoregressive conditional heteroskedasticity (F_arch: see Engle, 1982a); F_Chow for parameter constancy over k periods (see Chow, 1960); and SC, the Schwarz criterion (see Schwarz, 1978); * and ** denote significance at 5% and 1%, respectively. Figure 9.5 records the response-surface fitted and actual values; their cross-plot; the residuals scaled by σ̂; and their histogram and density with N[0,1] for comparison.

[Figure 9.5 comprises four panels for the response surface (75): the fitted and actual values of log(p̂); their cross-plot; the scaled residuals û_i/σ̂; and the residual histogram and density f(û_i/σ̂) against an N(0,1) reference.]

Fig. 9.5. Response surface outcomes for equation (75)


Table 9.5. Potencies of the F-test for a level shift at T1 = 80, T = 100, α1 = α2 = 0.025

d : β     0.75     1.0      1.5      1.75
1        0.027    0.027    0.026    0.022
2        0.114    0.098    0.054    0.034
2.5      0.392    0.349    0.159    0.055
3        0.757    0.715    0.434    0.112
4        0.996    0.994    0.949    0.418

Table 9.6. Potencies of the F-test for a level shift at T1 = 70, T = 100, α1 = α2 = 0.025

d : β     0.75     1.0      1.5      1.75
2.5      0.260    0.245    0.174    0.118
3.0      0.708    0.680    0.486    0.221
4.0      0.997    0.995    0.967    0.576

We now turn to the effect of sample size on potency. Table 9.5 reports the results for significance levels of 2.5% in both marginal and conditional models when T = 100 and T1 = 80.

The test still has reasonable potency for moderate violations of weak exogeneity when breaks are at least 3σ, although there is a loss of potency with the reduction in sample size. The trade-off between length of break and potency remains, as shown in Table 9.6 for T − T1 = 30, beginning at observation 71: small breaks have negligible potency. However, the potency is higher at the larger breaks despite smaller weak exogeneity violations, so the impacts of the various determinants are nonmonotonic, as anticipated from (66).

9.1. Optimal infeasible impulse-based F-test

The optimal infeasible impulse-based F-test with a known break location in the marginal process is computable in simulations. The tables below use α2 = 2.5% for testing in the conditional. The empirical rejection frequencies approximate the maximum achievable power for this type of test. When T = 100 and the break is a mean shift starting at T1 = 80, the correct 20 impulse indicators are always included in the conditional model. Table 9.7 reports the results for the failure of super exogeneity.

Relative to the optimal infeasible test, the automatic test based on saturation of the marginal naturally loses considerable potency for breaks of small magnitudes.

Table 9.8 shows that for a failure of super exogeneity, even when β = 1.75, the optimal test power increases with k for breaks of d = 1 and 2. Thus, the optimal test exhibits power increasing with break length, unlike (69).


Table 9.7. Powers of an F-test for a level shift at T1 = 0.8T = 80 with known break location and form

d : β     0.75     1.0      1.5      1.75
1.0      1.000    0.994    0.404    0.083
2.0      1.000    1.000    0.930    0.247
2.5      1.000    1.000    0.973    0.326
3.0      1.000    1.000    0.985    0.380
4.0      1.000    1.000    0.988    0.432

Table 9.8. Super-exogeneity failures at T1 when T = 100 with known break location and form

d : T − T1    45      40      30      20      15      10      5
1.0          0.572   0.563   0.515   0.423   0.348   0.259   0.073
2.0          0.942   0.938   0.920   0.880   0.828   0.720   0.484

10. Testing super exogeneity in UK money demand

We next test super exogeneity in a model of transactions demand for money in the UK using a sample of quarterly observations over 1964(3) to 1989(2), defined by:

• M: nominal M1
• X: real total final expenditure (TFE) at 1985 prices
• P: TFE deflator
• Rn: net interest rate on retail sight deposits, i.e., the three-month local authority interest rate minus the own rate.

We use the model in Hendry and Doornik (1994) (see also Hendry, 1979; Hendry and Ericsson, 1991; Boswijk, 1992; Hendry and Mizon, 1993; and Boswijk and Doornik, 2004), and express the variables as a vector autoregressive system. Previous cointegration analyses showed two long-run relationships, but confirmed the long-run weak exogeneity of {x_t, Δp_t, R_{n,t}} in that four-variable system. The theoretical basis is a model that links demand for real money, m − p (lower case denoting logs), to (log) income x (transactions motive) and inflation Δp_t, with the interest rate Rn measuring the opportunity cost of holding money. The data series terminate in 1989(2) because a sequence of large building societies converted to banks thereafter, greatly altering M1 measures, as their deposits were previously classified outside M1.

Commencing from the conditional model of m − p on {x_t, Δp_t, R_{n,t}} with two lags of all variables, constant and trend, undertaking selection with impulse saturation on that


equation using Autometrics at α2 = 1% yields:

$$ (m-p)_t = \underset{(0.01)}{0.11}\,x_t - \underset{(0.11)}{0.85}\,\Delta p_t - \underset{(0.08)}{0.44}\,R_{n,t} + \underset{(0.07)}{0.60}\,(m-p)_{t-1} + \underset{(0.07)}{0.30}\,(m-p)_{t-2} $$
$$ \qquad - \underset{(0.10)}{0.27}\,R_{n,t-1} - \underset{(1.1)}{3.5}\,I_{69(2)} + \underset{(1.1)}{4.3}\,I_{71(1)} + \underset{(1.1)}{3.9}\,I_{73(2)} + \underset{(1.1)}{4.2}\,I_{74(4)} - \underset{(1.1)}{2.8}\,I_{83(3)} \qquad (76) $$

F_ar(5, 84) = 1.90  F_arch(4, 81) = 0.57  F_het(22, 66) = 0.35  F_reset(1, 91) = 0.08
σ̂_(m−p) = 0.010  χ²_nd(2) = 0.76  F_Chow:81(4)(30, 59) = 1.0  SC(11) = −5.93

The legend is described in §9. The coefficients of the impulses are multiplied by 100 (so are percentage shifts for (m − p)_t, x_t and Δp_t).

Despite a large number of previous studies of UK M1, (76) has a major new result: the puzzle of why transactions demand did not depend on the contemporaneous expenditure for which it was held is resolved by finding that it does – once impulse saturation is able to remove the contaminating perturbations. Moreover, the PcGive unit-root test is −12.79**, strongly rejecting an absence of cointegration; and the derived long-run expenditure elasticity is 1.02 (0.003), so the match with economic theory has been made much closer. Almost all the impulses have historical interpretations: decimalization began in 1969(2) and was completed in 1971(1); 1973(2) saw the introduction of VAT; 1974(4) was the heart of the first Oil crisis; but 1983(3) is unclear.

Next, we selected the significant impulses in congruent marginal models for {x_t, Δp_t, R_{n,t}} with two lags of every variable, constant and trend, finding:

$$ x_t = \underset{(0.32)}{1.24} + \underset{(0.03)}{0.89}\,x_{t-1} - \underset{(0.03)}{0.14}\,R_{n,t-2} + \underset{(0.0002)}{0.0007}\,t + \underset{(1.0)}{2.9}\,I_{68(1)} + \underset{(1.0)}{3.6}\,I_{72(4)} + \underset{(1.0)}{4.5}\,I_{73(1)} + \underset{(1.0)}{5.7}\,I_{79(2)} \qquad (77) $$

F_ar(5, 91) = 1.50  F_arch(4, 88) = 1.67  F_het(13, 82) = 1.26  F_reset(1, 95) = 0.001
σ̂_x = 0.010  χ²_nd(2) = 0.05

$$ \Delta p_t = -\underset{(0.29)}{1.9} + \underset{(0.07)}{0.43}\,\Delta p_{t-1} + \underset{(0.03)}{0.21}\,x_{t-1} - \underset{(0.01)}{0.03}\,(m-p)_{t-1} - \underset{(0.0002)}{0.0012}\,t - \underset{(0.68)}{3.1}\,I_{73(2)} + \underset{(0.65)}{2.5}\,I_{74(2)} \qquad (78) $$

F_ar(5, 92) = 0.10  F_arch(4, 89) = 0.84  F_het(16, 80) = 0.83  F_reset(1, 96) = 6.5*
σ̂_Δp = 0.0064  χ²_nd(2) = 0.22


$$ R_{n,t} = \underset{(0.01)}{0.99}\,R_{n,t-1} + \underset{(1.2)}{3.9}\,I_{73(3)} + \underset{(1.2)}{3.5}\,I_{76(4)} - \underset{(1.2)}{3.6}\,I_{77(1)} - \underset{(1.2)}{3.4}\,I_{77(2)} \qquad (79) $$

F_ar(5, 94) = 1.08  F_arch(4, 91) = 1.53  F_het(6, 92) = 1.85  F_reset(1, 98) = 3.08
σ̂_Rn = 0.012  χ²_nd(2) = 0.09

Only one mis-specification test is significant at even the 5% level across these three equations, so we judge these marginal models to be congruent. The impulses were selected using α1 = 1%, as although the sample size is only T = 104, many impulses were already known to matter from the economic turbulence of the 1970s and 1980s in the UK, and indeed 10 are retained across these three models; surprisingly, the three-day week loss of output in December 1973 did not show up in (77).

Next, we tested the significance of the 10 retained impulses from (77), (78) and (79) in the same unrestricted conditional model of (m − p)_t as used for selecting (76), but without impulse saturation. This yielded F_SE(10, 81) = 1.28, so the new test does not reject: the model with impulses had SC(22) = −5.11, whereas the unrestricted model without any impulses had SC(12) = −5.41, both much poorer than (76). The one impulse in common between the marginal and conditional models is I_{73(2)}, which entered the equation for Δp_t. However, it does so positively in both equations, even though Δp_t enters (76) negatively.

Finally, we repeated the super-exogeneity impulse-saturation based test at α1 = 2.5%, which now led to 37 impulses being retained across the three marginal models, and a test statistic of F_SE(37, 54) = 1.67* that just rejects at 5%, which may be partly due to the small remaining degrees of freedom, as SC(49) = −4.5, so the conditional model without any impulses has a substantially smaller value of SC. Moreover, the only one of the impulses in (76) selected in any of these marginal models was again I_{73(2)}. Thus, we find minimal evidence against the hypothesis that {x_t, Δp_t, R_{n,t}} are super exogenous for the parameters of the conditional model for (m − p)_t in (76).

Not rejecting the null of super exogeneity implies that agents did not alter their demand-for-money behavior despite quite large changes in the processes generating their conditioning variables. In particular, agents could not have been forming expectations based on the marginal models for any of the three variables. This might be because their near unpredictability led to the use of robust forecasting devices of the general forms discussed by Favero and Hendry (1992) and Hendry and Ericsson (1991):

$$ \widehat{x}_{t+1} = x_t; \qquad \widehat{\Delta p}_{t+1} = \Delta p_t; \qquad \widehat{R}_{n,t+1} = R_{n,t}. $$

If so, the apparent conditioning variables are actually the basis for robust one-step-ahead forecasting devices used in the face of unanticipated structural breaks, as in Hendry (2006). Consequently, the nonrejection of super exogeneity makes sense, and does not contradict an underlying theory of forward-looking money demand behavior.

11. Conclusion

An automatically computable test for super exogeneity, based on selecting shifts in the marginal process by impulse saturation to test for related shifts in the conditional, has


been proposed. The test has the correct null rejection frequency in constant conditional models when the nominal test size, α1, is not too small in the marginal (e.g., 2.5%), even at small sample sizes, for a variety of marginal processes, both constant and with breaks. The approximate rejection-frequency function was derived analytically for regression models, and helps explain the simulation outcomes. These confirm that the test can detect failures of super exogeneity when weak exogeneity fails and the marginal processes change. Although only a single break was considered in detail, the general nature of the test makes it applicable when there are multiple breaks in the marginal processes, perhaps at different times.

A test rejection outcome indicates a dependence between the conditional model parameters and those of the marginals, warning about potential mistakes from using the conditional model to predict the outcomes of policy changes that alter the marginal processes by location shifts, which is a common policy scenario.

The empirical application to UK M1 delivered new results in a much-studied illustration, and confirmed the feasibility of the test. The status of super exogeneity was not completely clear-cut, but suggested, at most, a small degree of dependence between the parameters.

Although all the derivations and Monte Carlo experiments here have been for static regression equations and specific location shifts, the principles are general, and should apply to dynamic equations (although with more approximate null rejection frequencies), to conditional systems, and to nonstationary settings: these are the focus of our present research.


10

Generalized Forecast Errors, a Change of Measure, and Forecast Optimality

Andrew J. Patton and Allan Timmermann

1. Introduction

In a world with constant volatility, concerns about the possibility of asymmetric or nonquadratic loss functions in economic forecasting would (almost) vanish: Granger (1969) showed that in such an environment optimal forecasts will generally equal the conditional mean of the variable of interest, plus a simple constant (an optimal bias term). However, the pioneering and pervasive work of Rob Engle provides overwhelming evidence of time-varying volatility in many macroeconomic and financial time series.¹ In a world with time-varying volatility, asymmetric loss has important implications for forecasting; see Christoffersen and Diebold (1997), Granger (1999) and Patton and Timmermann (2007a).

The traditional assumption of a quadratic and symmetric loss function underlying most of the work on testing forecast optimality is increasingly coming under critical scrutiny, and evaluation of forecast efficiency under asymmetric loss functions has

Acknowledgments: The authors would like to thank seminar participants at the Festschrift Conference in Honor of Robert F. Engle in San Diego, June 2007, and Graham Elliott, Raffaella Giacomini, Clive Granger, Oliver Linton, Mark Machina, Francisco Penaranda, Kevin Sheppard, Mark Watson, Hal White, Stanley Zin and an anonymous referee for useful comments. All remaining deficiencies are the responsibility of the authors. The second author acknowledges support from CREATES, funded by the Danish National Research Foundation.

¹See, amongst many others, Engle (1982a, 2004b), Bollerslev (1986), Engle et al. (1990), the special issue of the Journal of Econometrics edited by Engle and Rothschild (1992), as well as surveys by Bollerslev et al. (1994) and Andersen et al. (2006a).


recently gained considerable attention in the applied econometrics literature.² Progress has also been made on establishing theoretical properties of optimal forecasts for particular families of loss functions (Christoffersen and Diebold, 1997; Elliott et al., 2005, 2008; Patton and Timmermann, 2007b). However, although some results have been derived for certain classes of loss functions, a more complete set of results has not been established.

This chapter fills this lacuna in the literature by deriving properties of an optimal forecast that hold for general classes of loss functions and general data-generating processes. Working out these properties under general loss is important as none of the standard properties established in the linear-quadratic framework survives to a more general setting in the presence of conditional heteroskedasticity, cf. Patton and Timmermann (2007a). Irrespective of the loss function and data-generating process, a generalized orthogonality principle must, however, hold provided information is efficiently embedded in the forecast. Implications of this principle will, of course, vary significantly with assumptions about the loss function and data-generating process (DGP). Our results suggest two approaches: transforming the forecast error for a given loss function, or transforming the density under which the forecast error is being evaluated.

The first approach provides tests that generalize the widely used Mincer–Zarnowitz (Mincer and Zarnowitz, 1969) regressions, established under mean squared error (MSE) loss, to hold for arbitrary loss functions. We propose a seemingly unrelated regression (SUR)-based method for testing multiple forecast horizons simultaneously, which may yield power improvements when forecasts for multiple horizons are available. This is relevant for survey data such as those provided by the Survey of Professional Forecasters (Philadelphia Federal Reserve) or Consensus Economics as well as for individual forecasts such as those reported by the IMF in the World Economic Outlook.

Our second approach introduces a new line of analysis based on a transformation from the usual probability measure to an "MSE-loss probability measure". Under this new measure, optimal forecasts, from any loss function, are unbiased and forecast errors are serially uncorrelated, in spite of the fact that these properties generally fail to hold under the physical (or "objective") measure. This transformation has its roots in asset pricing and "risk neutral" probabilities, see Harrison and Kreps (1979) for example, but to our knowledge has not previously been considered in the context of forecasting.

Relative to existing work, our contributions are as follows. Using the first line of research, we establish population properties for the so-called generalized forecast error, which is similar to the score function known from estimation problems. These results build on, extend and formalize results in Granger (1999) as well as in our earlier work (Patton and Timmermann, 2007a,b) and apply to quite general classes of loss functions and data-generating processes. Patton and Timmermann (2007b) establish testable implications of simple forecast errors (defined as the outcome minus the predicted value) under forecast optimality, whereas Patton and Timmermann (2007a) consider the generalized forecast errors but only for more specialized cases such as linex loss with normally distributed innovations. Unlike Elliott et al. (2005), we do not deal with the issue of

²See, for example, Christoffersen and Diebold (1996), Pesaran and Skouras (2001), Christoffersen and Jacobs (2004) and Granger and Machina (2006).


identification and estimation of the parameters of the forecaster's loss function. The density forecasting results are, to our knowledge, new in the context of the forecast evaluation literature.

The outline of this chapter is as follows. Section 2 establishes properties of optimal forecasts under general known loss functions. Section 3 contains the change of measure result, and Section 4 presents empirical illustrations of the results. Section 5 concludes. An appendix contains technical details and proofs.

2. Testable implications under general loss functions

Suppose that a decision maker is interested in forecasting some univariate time series, Y ≡ {Y_t; t = 1, 2, ...}, h steps ahead given information at time t, F_t. We assume that X_t = [Y_t, Z_t′]′, where Z_t is an (m × 1) vector of predictor variables used by the decision maker, and X ≡ {X_t : Ω → R^{m+1}, m ∈ N, t = 1, 2, ...} is a stochastic process on a complete probability space (Ω, F, P), where Ω = R^{(m+1)∞} ≡ ×_{t=1}^{∞} R^{m+1}, F = B^{(m+1)∞} ≡ B(R^{(m+1)∞}), the Borel σ-field generated by R^{(m+1)∞}, and F_t is the σ-field generated by {X_{t−k}; k ≥ 0}. Y_t is thus adapted to the information set available at time t.³

We will denote a generic sub-vector of Z_t as Z̃_t, and denote the conditional distribution of Y_{t+h} given F_t as F_{t+h,t}, i.e. Y_{t+h}|F_t ∼ F_{t+h,t}, and the conditional density, if it exists, as f_{t+h,t}. Point forecasts conditional on F_t are denoted by Ŷ_{t+h,t} and belong to 𝒴, a compact subset of R, and forecast errors are given by e_{t+h,t} = Y_{t+h} − Ŷ_{t+h,t}.⁴

In general the objective of the forecast is to minimize the expected value of some loss function, L(Y_{t+h}, Ŷ_{t+h,t}), which is a mapping from realizations and forecasts to the real line, L : R × 𝒴 → R. That is, in general

    Y*_{t+h,t} ≡ arg min_{ŷ ∈ 𝒴} E_t[L(Y_{t+h}, ŷ)].    (1)

E_t[·] is shorthand notation for E[·|F_t], the conditional expectation given F_t. We also define the conditional variance, V_t[Y_{t+h}] = E[(Y_{t+h} − E[Y_{t+h}|F_t])²|F_t], and the unconditional equivalents, E[·] and V[·].

The general decision problem underlying a forecast is to maximize the expected value of some utility function, U(Y_{t+h}, A(Ŷ_{t+h,t})), that depends on the outcome of Y_{t+h} as well as on the decision maker's actions, A, which in general depend on the full distribution forecast of Y_{t+h}, F_{t+h,t}. Here we assume that A depends only on the point forecast Ŷ_{t+h,t}, and we write this as A(Ŷ_{t+h,t}). Granger and Machina (2006) show that under certain conditions on the utility function there exists a unique point forecast, which leads to the same decision as if a full distribution forecast had been available.

³The assumption that Y_t is adapted to F_t rules out the direct application of the results in this chapter to, e.g., volatility forecast evaluation. In such a scenario the object of interest, conditional variance, is not adapted to F_t. Using imperfect proxies for the object of interest in forecast optimality tests can cause difficulties, as pointed out by Hansen and Lunde (2006) and further studied in Patton (2006b).

⁴We focus on point forecasts below, and leave the interesting extension to interval and density forecasting for future research.


2.1. Properties under general loss functions

Under general loss the first order condition for the optimal forecast is⁵

    0 = E_t[∂L(Y_{t+h}, Y*_{t+h,t})/∂Ŷ_{t+h,t}] = ∫ ∂L(y, Y*_{t+h,t})/∂Ŷ_{t+h,t} dF_{t+h,t}(y).    (2)

This condition can be rewritten using what Granger (1999) refers to as the (optimal) generalized forecast error, ψ*_{t+h,t} ≡ ∂L(Y_{t+h}, Y*_{t+h,t})/∂Ŷ_{t+h,t},⁶ so that equation (2) simplifies to

    E_t[ψ*_{t+h,t}] = ∫ ψ*_{t+h,t} dF_{t+h,t}(y) = 0.    (3)

Under a broad set of conditions ψ*_{t+h,t} is therefore a martingale difference sequence with respect to the information set used to compute the forecast, F_t. The generalized forecast error is closely related to the "generalized residual" often used in the analysis of discrete, censored or grouped variables, see Gourieroux et al. (1987) and Chesher and Irish (1987) for example. Both the generalized forecast error and the generalized residual are based on first order (or "score") conditions.

We next turn our attention to proving properties of the generalized forecast error analogous to those for the standard case. We will sometimes, though not generally, make use of the following assumption on the DGP for X_t ≡ [Y_t, Z_t′]′:

Assumption D1: {X_t} is a strictly stationary stochastic process.

Note that we do not assume that X_t is continuously distributed and so the results below may apply to forecasts of discrete random variables, such as direction-of-change forecasts or default forecasts. The following properties of the loss function are assumed at various points of the analysis, but not all will be required everywhere.

Assumption L1: The loss function is (at least) once differentiable with respect to its second argument, except on a set of F_{t+h,t}-measure zero, for all t and h.

Assumption L2: E_t[L(Y_{t+h}, ŷ)] < ∞ for some ŷ ∈ 𝒴 and all t, almost surely.

Assumption L2': An interior optimum of the problem

    min_{ŷ ∈ 𝒴} ∫ L(y, ŷ) dF_{t+h,t}(y)

exists for all t and h.

Assumption L3: |E_t[∂L(Y_{t+h}, ŷ)/∂ŷ]| < ∞ for some ŷ ∈ 𝒴 and all t, almost surely.

Assumption L2 simply ensures that the conditional expected loss from a forecast is finite, for some finite forecast. Assumptions L1 and L2' allow us to use the first order condition of the minimization problem to study the optimal forecast. One set of sufficient conditions for Assumption L2' to hold are Assumption L2 and:

Assumption L4: The loss function is a nonmonotonic, convex function solely of the forecast error.

We do not require that L is everywhere differentiable with respect to its second argument, nor do we need to assume a unique optimum (though this is obtained if we impose Assumption L4, with the convexity of the loss function being strict). Assumption L3 is required to interchange expectation and differentiation: ∂E_t[L(Y_{t+h}, ŷ)]/∂ŷ = E_t[∂L(Y_{t+h}, ŷ)/∂ŷ]. The bounds on the integral on the left-hand side of this expression are unaffected by the choice of ŷ, and so two of the terms in Leibniz's rule drop out, meaning we need only assume that the term on the right-hand side is finite.

⁵This result relies on the ability to interchange the expectation and differentiation operators. Assumptions L1–L3 given below are sufficient conditions for this to hold.

⁶Granger (1999) considers loss functions that have the forecast error as an argument, and so defines the generalized forecast error as ψ*_{t+h,t} ≡ ∂L(e_{t+h,t})/∂e_{t+h,t}. In both definitions, ψ*_{t+h,t} can be viewed as the marginal loss associated with a particular prediction, Ŷ_{t+h,t}.

The following proposition establishes properties of the generalized forecast error, ψ*_{t+h,t}:

Proposition 1

1. Let assumptions L1, L2' and L3 hold. Then the generalized forecast error, ψ*_{t+h,t}, has conditional (and unconditional) mean zero.
2. Let assumptions L1, L2' and L3 hold. Then the generalized forecast error from an optimal h-step forecast made at time t exhibits zero correlation with any function of any element of the time t information set, F_t, for which second moments exist. In particular, the generalized forecast error will exhibit zero serial correlation for lags greater than (h − 1).⁷
3. Let assumptions D1 and L2 hold. Then the unconditional expected loss of an optimal forecast error is a nondecreasing function of the forecast horizon.

All proofs are given in the appendix. The above result is useful when the loss function is known, since ψ*_{t+h,t} can then be calculated directly and employed in generalized efficiency tests that project ψ*_{t+h,t} on period-t instruments. For example, the martingale difference property of ψ*_{t+h,t} can be tested by testing α = β = 0 for all Z_t ∈ F_t in the following regression:

    ψ_{t+h,t} = α + β′Z_t + u_{t+h}.    (4)

The above simple test will not generally be consistent against all departures from forecast optimality. A consistent test of forecast optimality based on the generalized forecast errors could be constructed using the methods of Bierens (1990), de Jong (1996) and Bierens and Ploberger (1997). Tests based on generalized forecast errors obtained from a model with estimated parameters can also be conducted, using the methods in West (1996, 2006).
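As an illustration of the regression test in equation (4), the following sketch (our own, not from the chapter; the function name and the simulated generalized forecast errors are hypothetical) runs the OLS projection of ψ on a constant and instruments and forms a Wald statistic for α = β = 0:

```python
import numpy as np

def generalized_efficiency_test(psi, Z):
    """OLS of psi_{t+h,t} = alpha + beta'Z_t + u_{t+h} and a Wald test of
    alpha = beta = 0 (homoskedastic covariance for simplicity). Under the
    null the statistic is asymptotically chi-squared with 1 + dim(Z) dof."""
    T = len(psi)
    X = np.column_stack([np.ones(T), Z])       # constant plus instruments
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ psi)                  # OLS estimates (alpha, beta')
    u = psi - X @ b
    s2 = (u @ u) / (T - X.shape[1])            # residual variance
    wald = b @ np.linalg.inv(s2 * XtX_inv) @ b # Wald statistic for b = 0
    return b, wald

# Toy check: an optimal generalized forecast error is an MDS, so projecting
# a simulated white-noise psi on its own lag should give coefficients near 0.
rng = np.random.default_rng(0)
psi = rng.standard_normal(5000)                # stand-in for psi*_{t+1,t}
b, wald = generalized_efficiency_test(psi[1:], psi[:-1].reshape(-1, 1))
```

In practice one would replace the simulated series with ψ computed from the actual loss function and forecasts, and use a heteroskedasticity-robust covariance.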

If the same forecaster reported forecasts for multiple horizons we can conduct a joint test of forecast optimality across all horizons. This can be done without requiring that the forecaster's loss function is the same across all horizons, i.e., we allow the one-step ahead forecasting problem to involve a different loss function to the two-step ahead forecasting problem, even for the same forecaster. A joint test of optimality across all horizons may be conducted as:

    [ψ_{t+1,t}, ψ_{t+2,t}, ..., ψ_{t+H,t}]′ = A + B Z_t + u_{t,H}    (5)

⁷Optimal h-step forecast errors under MSE loss are MA processes of order no greater than h − 1. In a nonlinear framework an MA process need not completely describe the dependence properties of the generalized forecast error. However, the autocorrelation function of the generalized forecast error will match some MA(h − 1) process.

and then testing H₀: A = B = 0 vs. Hₐ: A ≠ 0 ∪ B ≠ 0. More concretely, one possibility is to estimate a SUR system for the generalized forecast errors:

    [ψ_{t+1,t}, ψ_{t+2,t}, ..., ψ_{t+H,t}]′ = A + B₁ [ψ_{t,t−1}, ψ_{t,t−2}, ..., ψ_{t,t−H}]′ + ... + B_J [ψ_{t−J+1,t−J}, ψ_{t−J+1,t−J−1}, ..., ψ_{t−J+1,t−J−H+1}]′ + u_{t,H},    (6)

and then test H₀: A = B = 0 vs. Hₐ: A ≠ 0 ∪ B ≠ 0.
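A minimal sketch of the joint test across horizons, under the simplifying assumption that every equation uses the same instruments (in which case SUR coincides with equation-by-equation OLS); the function name and the simulated data are illustrative, not from the chapter:

```python
import numpy as np

def joint_horizon_test(Psi, Z):
    """Multivariate OLS of a (T x H) matrix of generalized forecast errors
    on a constant and common instruments Z (T x k), with a Wald test of all
    coefficients being zero. With identical regressors in every equation,
    SUR reduces to equation-by-equation OLS, and
    Cov(vec(B_hat)) = Sigma kron (X'X)^{-1} under the null."""
    T, H = Psi.shape
    X = np.column_stack([np.ones(T), Z])          # (T, k+1)
    B = np.linalg.lstsq(X, Psi, rcond=None)[0]    # (k+1, H) coefficient matrix
    U = Psi - X @ B
    Sigma = U.T @ U / T                           # cross-horizon residual covariance
    V = np.kron(Sigma, np.linalg.inv(X.T @ X))    # covariance of vec(B_hat)
    b = B.flatten(order="F")                      # vec, stacking equation by equation
    return b @ np.linalg.solve(V, b)              # asympt. chi2(H * (k+1)) under H0

# Toy check under the null: independent white-noise "psi" series for H = 3
# horizons and one instrument; the Wald statistic should be moderate.
rng = np.random.default_rng(2)
wald = joint_horizon_test(rng.standard_normal((3000, 3)),
                          rng.standard_normal((3000, 1)))
```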

2.2. Properties under MSE loss

In the special case of a squared error loss function:

    L(Y_{t+h}, Ŷ_{t+h,t}) = θ(Y_{t+h} − Ŷ_{t+h,t})²,  θ > 0,    (7)

optimal forecasts can be shown to have the standard properties, using the results from Proposition 1. For reference we list these below:

Corollary 1  Let the loss function be

    L(Y_{t+h}, Ŷ_{t+h,t}) = θ_h (Y_{t+h} − Ŷ_{t+h,t})²,  θ_h > 0 for all h,

and assume that E_t[Y²_{t+h}] < ∞ for all t and h almost surely. Then

1. The optimal forecast of Y_{t+h} is E_t[Y_{t+h}] for all forecast horizons h;
2. The forecast error associated with the optimal forecast has conditional (and unconditional) mean zero;
3. The h-step forecast error associated with the optimal forecast exhibits zero serial covariance beyond lag (h − 1).

Moreover, if we further assume that Y is covariance stationary, we obtain:

4. The unconditional variance of the forecast error associated with the optimal forecast is a nondecreasing function of the forecast horizon.

This corollary shows that the standard properties of optimal forecasts are generated by the assumption of mean squared error loss alone; in particular, assumptions on the DGP (beyond covariance stationarity and finite first and second moments) are not required. Properties such as these have been extensively tested in empirical studies of optimality of predictions or rationality of forecasts, e.g. by testing that the intercept is zero (α = 0) and the slope is unity (β = 1) in the Mincer–Zarnowitz (Mincer and Zarnowitz, 1969) regression

    Y_{t+h} = α + βŶ_{t+h,t} + ε_{t+h}    (8)

or equivalently in a regression of forecast errors on current instruments,

    e_{t+h,t} = α + β′Z_t + u_{t+h}.    (9)

Elliott, Komunjer and Timmermann (2008) show that the estimates of β will be biased when the loss function used to generate the forecasts is of the asymmetric squared loss variety. Moreover, the bias in that case depends on the correlation between the absolute forecast error and the instruments used in the test. It is possible to show that under general (non-MSE) loss the properties of the optimal forecast error listed in Corollary 1 can all be violated; see Patton and Timmermann (2007a) for an example using a regime switching model and the "linex" loss function of Varian (1974).
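The Mincer–Zarnowitz regression (8) can be sketched in a few lines; the helper name and simulated data below are our own illustration, with the conditional mean standing in for the MSE-optimal forecast:

```python
import numpy as np

def mincer_zarnowitz(y, yhat):
    """OLS of the outcome on a constant and the forecast,
    Y_{t+h} = alpha + beta * Yhat_{t+h,t} + eps; optimality under
    MSE loss implies alpha = 0 and beta = 1."""
    X = np.column_stack([np.ones(len(y)), yhat])
    alpha, beta = np.linalg.lstsq(X, y, rcond=None)[0]
    return alpha, beta

# Toy check: when the forecast is the conditional mean, the estimates
# should be close to (0, 1).
rng = np.random.default_rng(1)
mu = rng.standard_normal(20000)        # stand-in for E_t[Y_{t+h}]
y = mu + rng.standard_normal(20000)    # outcome = conditional mean + noise
alpha, beta = mincer_zarnowitz(y, mu)
```

A full test would add standard errors and a joint Wald test of (α, β) = (0, 1); the point here is only the shape of the regression.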

3. Properties under a change of measure

In the previous section we showed that by changing our object of analysis from the forecast error to the "generalized forecast error" we can obtain the usual properties of unbiasedness and zero serial correlation. As an alternative approach, we next consider instead changing the probability measure used to compute the properties of the forecast error. This analysis is akin to the use of risk-neutral densities in asset pricing, cf. Harrison and Kreps (1979). In asset pricing one may scale the objective (or physical) probabilities by the stochastic discount factor (or the discounted ratio of marginal utilities) to obtain a risk-neutral probability measure and then apply risk-neutral pricing methods. Here we will scale the objective probability measure by the ratio of the marginal loss, ∂L/∂ŷ, to the forecast error, and then show that under the new probability measure the standard properties hold; i.e., under the new measure, (Y_{t+h} − Ŷ_{t+h,t}, F_t) is a martingale difference sequence when Ŷ_{t+h,t} = Y*_{t+h,t}, where Y*_{t+h,t} is defined in equation (1). We call the new measure the "MSE-loss probability measure". The resulting method thus suggests an alternative means of evaluating forecasts made using general loss functions.

Note that the conditional distribution of the forecast error, F^e_{t+h,t}, given F_t and any forecast ŷ ∈ 𝒴, satisfies

    F^e_{t+h,t}(e; ŷ) = F_{t+h,t}(ŷ + e),    (10)

for all (e, ŷ) ∈ R × 𝒴, where F_{t+h,t} is the conditional distribution of Y_{t+h} given F_t. To facilitate the change of measure, we make use of the following assumption:

Assumption L5: ∂L(y, ŷ)/∂ŷ ≤ (≥) 0 for y ≥ (≤) ŷ.

Assumption L5 simply imposes that the loss function is nondecreasing as the forecast moves further away (in either direction) from the true value, which is a reasonable assumption. It is common to impose that L(y, y) = 0, i.e., the loss from a perfect forecast is zero, but this is obviously just a normalization and is not required here.


The sign of (y − ŷ)⁻¹ ∂L(y, ŷ)/∂ŷ is negative under assumption L5, and in defining the MSE-loss probability measure we need to further assume that it is bounded and nonzero:

Assumption L6: 0 < −E_t[(Y_{t+h} − ŷ)⁻¹ ∂L(Y_{t+h}, ŷ)/∂ŷ] < ∞ for all ŷ ∈ 𝒴 and all t, almost surely.

Definition 1  Let assumptions L5 and L6 hold and let

    Λ(e, ŷ) ≡ −(1/e) · ∂L(y, ŷ)/∂ŷ |_{y = ŷ + e}.    (11)

Then the "MSE-loss probability measure", dF̃^e_{t+h,t}(·; ŷ), is defined by

    dF̃^e_{t+h,t}(e; ŷ) = [Λ(e, ŷ) / E_t[Λ(Y_{t+h} − ŷ, ŷ)]] · dF^e_{t+h,t}(e; ŷ).    (12)

By construction the MSE-loss probability measure F̃(·; ŷ) is absolutely continuous with respect to the usual probability measure, F(·; ŷ) (that is, F̃(·; ŷ) ≪ F(·; ŷ)). The function

    Λ̃_{t+h,t}(e, ŷ) ≡ Λ(e, ŷ) / E_t[Λ(Y_{t+h} − ŷ, ŷ)]    (13)

is the Radon–Nikodym derivative dF̃^e_{t+h,t}(·; ŷ)/dF^e_{t+h,t}(·; ŷ). If we let u = e⁻¹, then Assumption L6 requires that ∂L(y, ŷ)/∂ŷ |_{y = ŷ + 1/u} = O(u⁻¹). Note that Λ(e, ŷ) is well defined at e = 0 for some common loss functions. For example,

    MSE:      lim_{e→0} Λ(e, ŷ) = 2
    Linex:    lim_{e→0} Λ(e, ŷ) = a²
    PropMSE:  lim_{e→0} Λ(e, ŷ) = 2/ŷ²

where the Linex and PropMSE loss functions are defined as L(y, ŷ) = exp{ae} − ae − 1 and L(y, ŷ) = (y/ŷ − 1)², respectively. For mean absolute error loss, L(y, ŷ) = |e|, the limits from both directions diverge, meaning that there is no MSE-loss density under MAE in general. However, if the variable of interest is conditionally symmetrically distributed at all points in time, then the optimal forecast under MAE coincides with the optimal forecast under MSE, as the conditional mean is equal to the conditional median, and so the appropriate Radon–Nikodym derivative is equal to one.

We now show that under the MSE-loss probability measure the optimal h-step ahead forecast errors exhibit the properties that we would expect from optimal forecasts under MSE loss:

Proposition 2

1. Let assumptions L1, L5 and L6 hold. Then the "MSE-loss probability measure", F̃^e_{t+h,t}(·; ŷ), defined in equation (12) is a proper probability distribution function for all ŷ ∈ 𝒴.
2. If we further let assumption L2' hold, then the optimal forecast error, e*_{t+h,t} = Y_{t+h} − Y*_{t+h,t}, has conditional mean zero under the MSE-loss probability measure F̃^e_{t+h,t}(·; Y*_{t+h,t}).
3. The optimal forecast error is serially uncorrelated under the MSE-loss probability measure, F̃^e_{t+h,t}(·; Y*_{t+h,t}), for all lags greater than h − 1.
4. Ṽ[e*_{t+h,t}], the variance of e*_{t+h,t} under F̃^e_{t+h,t} evaluated at Y*_{t+h,t}, is a nondecreasing function of the forecast horizon.

Notice that e*_{t+h,t} is a martingale difference sequence, with respect to F_t, under F̃_{t+h,t}. Furthermore, although the MSE-loss probability measure operates on forecast errors, the result holds for general loss functions having Y_{t+h} and Y*_{t+h,t} as separate arguments.

It is worth emphasizing that the MSE-loss probability measure is a conditional distribution, and so obtaining an estimate of it from data is not as simple as it would be if it were an unconditional distribution. If we assume that the density f^e_{t+h,t} exists then it is possible, under some conditions, to obtain a consistent estimate of f^e_{t+h,t} via semi-nonparametric density estimation, see Gallant and Nychka (1987). If L is known then Λ is, of course, also known.⁸ With consistent estimates of f^e_{t+h,t} and Λ it is simple to construct an estimator of f̃^e_{t+h,t}. In recent work, Chernov and Mueller (2007) specify a flexible parametric model for f_t and Λ_t in order to estimate the underlying objective conditional density, f, of forecasters from a variety of macroeconomic surveys. From this density estimate, they are then able to both "bias-correct" the individual forecasts, and compute combination forecasts.

4. Numerical example and an application to US inflation

To illustrate how the MSE-loss error density differs from the objective error density, consider the following simple AR(1)–GARCH(1,1) data generating process:

    Y_t = φ₀ + φ₁Y_{t−1} + ε_t
    ε_t = h_t^{1/2} ν_t                          (14)
    h_t = ω + βh_{t−1} + αε²_{t−1}
    ν_t | F_{t−1} ∼ N(0, 1).

Next, consider the simple and analytically tractable "linex" loss function of Varian (1974), scaled by 2/a²:

    L(y, ŷ; a) = (2/a²)(exp{a(y − ŷ)} − a(y − ŷ) − 1).    (15)

The scaling term 2/a² does not affect the optimal forecast, but ensures that this function limits to the MSE loss function as a → 0. When a > 0, under-predictions (y > ŷ, or e > 0) carry an approximately exponential penalty, whereas over-predictions (y < ŷ, or e < 0) carry an approximately linear penalty. When a < 0 the penalty for over-predictions is approximately exponential whereas the penalty for under-predictions is approximately linear. In Figure 10.1 we present the linex loss function for a = 3.

⁸If L is unknown, a nonparametric estimate of Λ may be obtained via sieve estimation methods, for example, see Andrews (1991) or Chen and Shen (1998).


Fig. 10.1. MSE and Linex loss functions for a range of forecast errors (MSE loss and linex loss with a = 3, plotted against forecast errors from −3 to 3).

Under linex loss, the optimal one-step-ahead forecast and the associated forecast error are (see Varian, 1974; Zellner, 1986; and Christoffersen and Diebold, 1997)

    Ŷ*_t = E_{t−1}[Y_t] + (a/2)V_{t−1}[Y_t]
    e*_t = −(a/2)V_{t−1}[Y_t] + ε_t              (16)
         = −(a/2)h_t + h_t^{1/2}ν_t,

so e*_t | F_{t−1} ∼ N(−(a/2)h_t, h_t), and we see that the process for the conditional mean (an AR(1) process above) does not affect the properties of the optimal forecast error. Notice that the forecast error follows an ARCH-in-mean process of the type analyzed by Engle, Lilien and Robins (1987).

The generalized forecast error for this example is as follows, and has a log-normal distribution when suitably centered and standardized:

    ψ_t ≡ ∂L(Y_t, Ŷ_t)/∂ŷ = (2/a)(1 − exp{a(Y_t − Ŷ_t)}),    (17)

so

    (1 − (a/2)ψ_t) | F_{t−1} ∼ log N(a(μ_t − Ŷ_t), a²h_t)

and

    (1 − (a/2)ψ*_t) | F_{t−1} ∼ log N(−(a²/2)h_t, a²h_t).


Fig. 10.2. Objective and "MSE-loss" error densities for a GARCH process under linex loss, for various values of the predicted conditional variance (panels: ĥ = 0.54, 0.73, 1.00 (mean), 1.11, 1.43 and 2.45).

For the numerical example, we chose values of the predicted variance, ĥ_t, to correspond to the mean and the 0.01, 0.25, 0.75, 0.9 and 0.99 percentiles of the unconditional distribution of h_t when the GARCH parameters are set to (ω, α, β) = (0.02, 0.05, 0.93), which are empirically reasonable. A plot of the objective and the MSE-loss densities is given in Figure 10.2.

In all cases we see that the MSE-loss density is shifted to the right of the objective density, in order to remove the (optimal) negative bias that is present under the objective probability distribution due to the high cost associated with positive forecast errors. The way this probability mass is shifted depends on the level of predicted volatility, and Figure 10.2 reveals a variety of shapes for the MSE-loss density. When volatility is low (ĥ_t = 0.54 or 0.73), the MSE-loss density remains approximately bell-shaped, and is a simple shift of location (with a minor increase in spread) so that the mean of this density is zero. When volatility is average to moderately high (ĥ_t = 1.00 or 1.11), the MSE-loss density becomes a more rounded bell shape and remains unimodal. When volatility is high, the MSE-loss density becomes bimodal: it is approximately "flat-topped" for the ĥ_t = 1.43 case (though actually bimodal) and clearly bimodal for the ĥ_t = 2.45 case. The bimodality arises from the interaction of the three components that affect the shape of the MSE-loss density: the derivative of the loss function, the shape of the objective density, and the inverse of the forecast error.

We also see that the MSE-loss density is symmetric in this example. This is not a general result: a symmetric objective density (such as in this example) combined with an asymmetric loss function will generally lead to an asymmetric MSE-loss density. It is the particular combination of the normal objective density with the linex loss function that leads to the symmetric MSE-loss density observed here. A symmetric but non-normal conditional density for ν_t, such as a mixture of normals, can be shown to lead to an asymmetric MSE-loss density.

4.1. Application to US inflation

In this section we apply the methods of this chapter to inflation forecasting, which was the application in Rob Engle's original ARCH paper, Engle (1982a). We use monthly CPI inflation for the US, Δlog(CPI_t), over the period January 1982 to December 2006. This happens to be the period starting with the publication of the original ARCH paper, and also coincides with the period after the change in the Federal Reserve's monetary policy during the "monetarist experiment" of 1979–1982. This is widely believed to have led to a break in the inflation dynamics and volatility of many macroeconomic time series. We use a simple AR(4) model for the conditional mean, and a GARCH(1,1) model for the conditional variance.⁹ Assuming normality for the standardized residuals from this model, we can then obtain both the MSE-optimal forecast (simply the conditional mean) and the Linex-optimal forecast, where we set the linex shape parameter to equal three, as in the previous section.¹⁰ The data and forecasts are presented in Figure 10.3. In the upper panel we plot both the realized inflation (in percent per month) and the estimated conditional mean, which is labeled the "MSE forecast" in the lower panel. The lower panel reveals that the linex forecast is always greater than the MSE forecast, by an amount that grows in periods with high variance (as shown in the middle panel), with the average difference being 0.087%, or 1.04% per year. With average realized inflation at 3.06% per year in this sample period, the linex forecast (optimal) bias is substantial.

⁹The Engle (1982a) LM test for ARCH in the residuals from the AR(4) model rejected the null of homoskedasticity, at the 0.05 level, for all lags up to 12.

¹⁰The Jarque–Bera (1987) test for the normality of the standardized residuals actually rejects the assumption of normality here. The estimated skewness of these residuals is near zero, but the kurtosis is 4.38, which is far enough from 3 for this test to reject normality. We nevertheless proceed under the assumption of normality.


Fig. 10.3. Monthly CPI inflation in the US over the period January 1982 to December 2006, along with the estimated conditional mean ("MSE forecast"), conditional standard deviation, and the linex-optimal forecast (a = 3).

To emphasize the importance of the loss function in considering forecast optimality, we illustrate two simple tests of optimality for each of the two forecasts.¹¹ The first looks for bias in the forecast, whereas the second looks for bias and first order autocorrelation in the forecast errors. The results for the MSE and Linex forecasts are presented below, with Newey–West (Newey and West, 1987) t-statistics presented in parentheses below the parameter estimates. The "p value" reports the p value associated with the test of the null of forecast optimality, either zero bias or zero bias and zero autocorrelation.

¹¹Formal testing of forecast optimality would use a pseudo-out-of-sample period for analysis, separate from the period used for estimation.

    e^MSE_t   = −0.002 + u_t,                          p value = 0.902
                (−0.123)

    e^MSE_t   = −0.002 + 0.003 e^MSE_{t−1} + u_t,      p value = 0.992    (18)
                (−0.124)  (0.050)

    e^Linex_t = −0.087 + u_t,                          p value = 0.000
                (−6.175)

    e^Linex_t = −0.085 + 0.021 e^Linex_{t−1} + u_t,    p value = 0.000.
                (−6.482)  (0.327)

As expected, the MSE-optimal forecast passes these tests. The Linex-optimal forecast fails both of these tests, primarily due to the positive bias in the linex forecasts. This is, of course, also expected, as the linex forecasts are constructed for a situation where the costs of under-predicting are much greater than those of over-predicting, see Figure 10.1. Thus, the linex forecast is not constructed to be optimal under MSE loss, which is what the above two tests examine.
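The bias tests above rest on a Newey–West long-run variance; a minimal sketch of the t-statistic construction (our own implementation, shown on simulated white noise rather than the chapter's inflation data) is:

```python
import numpy as np

def newey_west_tstat(x, n_lags):
    """t-statistic for H0: E[x] = 0 using a Newey-West (Bartlett-kernel)
    estimate of the long-run variance of x, as in the bias regressions."""
    x = np.asarray(x, dtype=float)
    T = len(x)
    u = x - x.mean()
    lrv = (u @ u) / T                       # lag-0 autocovariance
    for j in range(1, n_lags + 1):
        w = 1 - j / (n_lags + 1)            # Bartlett weight
        gamma = (u[j:] @ u[:-j]) / T        # autocovariance at lag j
        lrv += 2 * w * gamma                # symmetric lag contributions
    return x.mean() / np.sqrt(lrv / T)

# Toy check: for white noise the t-statistic should be well inside +/- 3.5.
rng = np.random.default_rng(7)
t_stat = newey_west_tstat(rng.standard_normal(2000), n_lags=4)
```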

Next we consider testing for optimality under linex loss, using the generalized forecast error for that loss function and the methods discussed in Section 2. The formula for the generalized forecast error under linex loss is given in equation (17), and from that we construct ψ^MSE_t and ψ^Linex_t using the MSE forecast and the Linex forecast. We ran the same tests as above, but now using the generalized forecast error rather than the usual forecast error, and obtained the following results:

$$
\begin{aligned}
\psi^{MSE}_t &= \underset{(-3.985)}{-0.210} + u_t, &\text{p value} &= 0.000\\
\psi^{MSE}_t &= \underset{(-3.737)}{-0.214} - \underset{(-0.342)}{0.019}\, \psi^{MSE}_{t-1} + u_t, &\text{p value} &= 0.000 \qquad (19)\\
\psi^{Linex}_t &= \underset{(-0.256)}{-0.010} + u_t, &\text{p value} &= 0.798\\
\psi^{Linex}_t &= \underset{(-0.263)}{-0.031}\cdot\,\text{(slope)}\;\; {-0.010} + u_t, &&\\
\psi^{Linex}_t &= \underset{(-0.263)}{-0.010} - \underset{(-0.550)}{0.031}\, \psi^{Linex}_{t-1} + u_t, &\text{p value} &= 0.849.
\end{aligned}
$$

Using the test of optimality based on linex loss (with parameter equal to three), we find that the MSE forecasts are strongly rejected, whereas the linex forecasts are not. The contrast between this conclusion and the conclusion from the tests based on the usual forecast errors provides a clear illustration of the importance of matching the loss function used in forecast evaluation with that used in forecast construction. Failure to accurately account for the forecaster's objectives through the loss function can clearly lead to false rejections of forecast optimality.
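The contrast above is easy to reproduce in simulation. The sketch below is a minimal illustration under assumed ingredients: a linex loss of the form $L(e) = \exp(ae) - ae - 1$, whose generalized forecast error is proportional to $1 - \exp(ae_t)$, heteroskedastic Gaussian data, and a hand-rolled Bartlett-kernel (Newey–West) variance. The function and variable names are our own, not the chapter's.

```python
import numpy as np

def newey_west_tstat(x, lags=1):
    """t-statistic for H0: E[x] = 0, with a Bartlett-kernel (Newey-West) variance."""
    x = np.asarray(x, dtype=float)
    T = x.size
    u = x - x.mean()
    lrv = u @ u / T                       # lag-0 autocovariance
    for j in range(1, lags + 1):
        gamma_j = u[j:] @ u[:-j] / T
        lrv += 2.0 * (1.0 - j / (lags + 1)) * gamma_j
    return x.mean() / np.sqrt(lrv / T)

rng = np.random.default_rng(0)
T, a = 20_000, 0.5
h = 0.5 + rng.uniform(size=T)             # stand-in conditional variances
nu = rng.standard_normal(T)
e = np.sqrt(h) * nu - a * h / 2           # error of the linex-optimal forecast
psi = a * (1.0 - np.exp(a * e))           # linex generalized forecast error (up to scale)

t_error = newey_west_tstat(e)             # raw errors look biased under an MSE-type test
t_psi = newey_west_tstat(psi)             # generalized errors pass the zero-mean test
```

As in the comparison of (18) and (19), the raw errors of the linex-optimal forecast produce a large negative t-statistic, while the generalized forecast errors do not reject zero mean.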

Finally, we present the estimated objective and MSE-loss densities associated with these forecasts. We nonparametrically estimated the objective density of the standardized residuals, $\nu_t \equiv (y_t - \mu_t)/\sqrt{h_t}$, where $\mu_t$ is the conditional mean and $\sqrt{h_t}$ is the conditional standard deviation, using a Gaussian kernel with bandwidth set to $0.9 \times \sqrt{V[\nu_t]} \times T^{-1/5}$, where $T = 300$ is the sample size. From this, we can then compute


Fig. 10.4. Estimated objective and "MSE-loss" error densities for US inflation, for various values of the predicted conditional variance. [Six panels plot the objective density and the MSE-loss density against the forecast error, for predicted conditional variances $\hat h$ = 0.030, 0.038, 0.057, 0.066, 0.085, and 0.157.]

an estimate of the conditional (objective) density of the forecast errors:

$$ f\big(e \mid h_t\big) = f_\nu\!\left(\frac{e + a h_t/2}{\sqrt{h_t}}\right)\frac{1}{\sqrt{h_t}}. \qquad (20) $$

The MSE-loss density is estimated as:

$$ \tilde f\big(e \mid h_t\big) = \frac{2\,(1 - \exp\{ae\})/(ae)}{E\!\left[\,2\,(1 - \exp\{ae_t\})/(ae_t) \mid h_t\,\right]}\; f\big(e \mid h_t\big) \qquad (21) $$

$$ \text{where } E\!\left[\frac{2\,(1 - \exp\{ae_t\})}{ae_t}\,\bigg|\,h_t\right] \equiv \frac{1}{T}\sum_{i=1}^{T}\frac{2\left(1 - \exp\left\{a\big(\sqrt{h_t}\,\nu_i - \tfrac{a}{2}h_t\big)\right\}\right)}{a\big(\sqrt{h_t}\,\nu_i - \tfrac{a}{2}h_t\big)} \qquad (22) $$


and thus uses both the nonparametric estimate of the objective density and a data-based estimate of the normalization constant.

The estimated objective and MSE-loss densities are presented in Figure 10.4, using the same method of choosing values for the predicted variance: we use values that correspond to the mean and the 0.01, 0.25, 0.75, 0.9 and 0.99 quantiles of the sample distribution of $h_t$ from our model. As in the simulation example in the previous section, we see that the objective density is centered to the left of zero, and that the centering point moves further from zero as the variance increases. A small "bump" in the right tail of the objective density estimate is amplified in the MSE-loss estimate, particularly as the volatility increases, and the MSE-loss density is approximately centered on zero. The "bump" in the right tail of both of these densities disappears if we impose that the standardized residuals are truly normally distributed; in that case the objective density is, of course, Gaussian, and the resulting MSE-loss density is unimodal across these values of $h_t$.
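The construction in (20)–(22) can be sketched numerically as follows, with simulated standardized residuals standing in for the estimated ones. The kernel, the bandwidth rule, and the reweighting factor $\Lambda(e) \propto (1-\exp\{ae\})/(ae)$ follow the formulas above; the variable names and the particular values of $a$ and $h_t$ are our own illustrative choices.

```python
import numpy as np

def gaussian_kde(points, grid, bw):
    """Gaussian-kernel density estimate evaluated on a grid."""
    z = (grid[:, None] - points[None, :]) / bw
    return np.exp(-0.5 * z**2).sum(axis=1) / (points.size * bw * np.sqrt(2 * np.pi))

rng = np.random.default_rng(1)
T, a, h_t = 300, 3.0, 0.04                  # a: linex parameter, h_t: one predicted variance
nu = rng.standard_normal(T)                 # stand-in standardized residuals
bw = 0.9 * nu.std() * T ** (-1 / 5)         # bandwidth rule used in the chapter

e = np.linspace(-1.5, 1.5, 600)             # 600 points so the grid avoids e = 0 exactly
# eq (20): objective density of the forecast error given h_t
f_obj = gaussian_kde(nu, (e + a * h_t / 2) / np.sqrt(h_t), bw) / np.sqrt(h_t)

# eqs (21)-(22): reweight by Lambda(e) = 2(1 - exp(a e))/(a e), normalized by a
# sample average evaluated at e_i = sqrt(h_t) * nu_i - (a/2) * h_t
lam = lambda x: 2.0 * (1.0 - np.exp(a * x)) / (a * x)
e_i = np.sqrt(h_t) * nu - a * h_t / 2
f_mse = lam(e) / np.mean(lam(e_i)) * f_obj

dx = e[1] - e[0]
mass_obj, mass_mse = f_obj.sum() * dx, f_mse.sum() * dx   # both should be close to 1
```

Both estimated densities integrate to approximately one on this grid, the MSE-loss density by construction of the data-based normalization constant in (22).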

5. Conclusion

This chapter derives properties of an optimal forecast that hold for general classes of loss functions in the presence of conditional heteroskedasticity. Studying these properties is important, given the overwhelming evidence for conditional heteroskedasticity that has accumulated since the publication of Engle's seminal (1982a) ARCH paper. We show that irrespective of the loss function and data generating process, a generalized orthogonality principle must hold provided information is efficiently embedded in the forecast. We suggest that this orthogonality principle leads to two primary implications: (1) a transformation of the forecast error, the "generalized forecast error", must be uncorrelated with elements of the information set available to the forecaster, and (2) a transformation of the density of the forecast errors, labeled the "MSE-loss" density, must exist which gives forecasts that are optimal under non-MSE loss the same properties as those that are optimal under MSE loss.

The first approach to testing forecast optimality has its roots in the widely used Mincer–Zarnowitz (1969) regression, whereas the second approach is based on a transformation from the usual probability measure to an "MSE-loss probability measure". This transformation has its roots in asset pricing and "risk neutral" probabilities, but to our knowledge has not previously been considered in the context of forecasting. Implementing the first approach empirically is relatively straightforward, although it may require estimation of the parameters of the loss function if these are unknown (Elliott et al., 2005); implementing the second approach will require thinking about forecast (sub-)optimality in a different way, which may yield new insights into forecaster behavior.

Appendix

Proof of Proposition 1 1. Assumptions L1 and L2' allow us to analyze the first order condition for the optimal forecast, and assumption L3 permits the exchange of differentiation and expectation in the first order condition, giving us,


by the optimality of $Y^*_{t+h,t}$,

$$ E_t\!\left[\psi^*_{t+h,t}\right] = E_t\!\left[\frac{\partial L\big(Y_{t+h},\, Y^*_{t+h,t}\big)}{\partial \hat y}\right] = 0. $$

$E\big[\psi^*_{t+h,t}\big] = 0$ follows from the law of iterated expectations.

To prove point 2, as $(Y_t, Y_{t-1}, \ldots) \in \mathcal{F}_t$ by assumption, we know that $\psi^*_{t+h-j,t-j} = \partial L(Y_{t+h-j}, Y^*_{t+h-j,t-j})/\partial \hat y$ is an element of $\mathcal{F}_t$ for all $j \ge h$. Assumptions L1 and L2' again allow us to analyze the first order condition for the optimal forecast, and assumption L3 permits the exchange of differentiation and expectation in the first order condition. We thus have

$$ E\!\left[\psi^*_{t+h,t} \mid \mathcal{F}_t\right] = E\!\left[\frac{\partial L\big(Y_{t+h},\, Y^*_{t+h,t}\big)}{\partial \hat y}\,\bigg|\,\mathcal{F}_t\right] = 0, $$

which implies $E[\psi^*_{t+h,t} \cdot \phi(Z_t)] = 0$ for all $Z_t \in \mathcal{F}_t$ and all functions $\phi$ for which this moment exists. Thus, $\psi^*_{t+h,t}$ is uncorrelated with any function of any element of $\mathcal{F}_t$. This implies that $E[\psi^*_{t+h,t} \cdot \psi^*_{t+h-j,t-j}] = 0$ for all $j \ge h$, and so $\psi^*_{t+h,t}$ is uncorrelated with $\psi^*_{t+h-j,t-j}$.

To prove point 3, note that assumption (D1) of strict stationarity for $\{X_t\}$ yields the strict stationarity of $(Y_{t+h}, Y^*_{t+h,t})$, as $Y^*_{t+h,t}$ is a time-invariant function of $Z_t$. Thus for all $h$ and $j$ we have

$$ E\Big[E_t\big[L\big(Y_{t+h},\, Y^*_{t+h,t}\big)\big]\Big] = E\Big[E_{t-j}\big[L\big(Y_{t+h-j},\, Y^*_{t+h-j,t-j}\big)\big]\Big] $$

and so the unconditional expected loss only depends on the forecast horizon, $h$, and not on the period when the forecast was made, $t$. By the optimality of the forecast $Y^*_{t+h,t}$ we also have, $\forall j \ge 0$,

$$
\begin{aligned}
E_t\big[L\big(Y_{t+h},\, Y^*_{t+h,t-j}\big)\big] &\ge E_t\big[L\big(Y_{t+h},\, Y^*_{t+h,t}\big)\big] \\
E\big[L\big(Y_{t+h},\, Y^*_{t+h,t-j}\big)\big] &\ge E\big[L\big(Y_{t+h},\, Y^*_{t+h,t}\big)\big] \\
E\big[L\big(Y_{t+h+j},\, Y^*_{t+h+j,t}\big)\big] &\ge E\big[L\big(Y_{t+h},\, Y^*_{t+h,t}\big)\big]
\end{aligned}
$$

where the second line follows using the law of iterated expectations and the third line follows from strict stationarity. Hence the unconditional expected loss is a nondecreasing function of the forecast horizon.

Proof of Corollary 1 This proof follows directly from the proof of Proposition 1 above, when one observes the relation between the forecast error and the generalized forecast error, $\psi^*_{t+h,t}$, for the mean squared loss case: $e^*_{t+h,t} = -\frac{1}{2\theta_h}\psi^*_{t+h,t}$, and noting that the MSE loss function satisfies assumptions L1, L3 and L4, which implies a unique interior optimum.

To prove Proposition 2 we prove the following lemma for the "$\tilde L$-loss probability measure", which nests the MSE-loss probability measure as a special case. We will require the following generalization of assumption L6:


Assumption L6': Given two loss functions, $L$ and $\tilde L$, $0 < E_t\!\left[\dfrac{\partial L(Y_{t+h}, \hat y)/\partial \hat y}{\partial \tilde L(Y_{t+h}, \hat y)/\partial \hat y}\right] < \infty$ for all $\hat y \in \mathcal{Y}$ almost surely.

Lemma 1 Let $L$ and $\tilde L$ be two loss functions, and let $Y^*_{t+h,t}$ and $\tilde Y^*_{t+h,t}$ be the optimal forecasts of $Y_{t+h}$ at time $t$ under $L$ and $\tilde L$, respectively.

1. Let assumptions L1, L5 and L6' hold for $L$ and $\tilde L$. Then the "$\tilde L$-loss probability measure", $\tilde F_{e_{t+h,t}}$, defined below is a proper probability distribution function for all $\hat y \in \mathcal{Y}$:

$$ d\tilde F_{e_{t+h,t}}(e;\hat y) = \frac{\Lambda(e,\hat y)}{E_t\!\left[\Lambda\big(Y_{t+h}-\hat y,\, \hat y\big)\right]}\cdot dF_{e_{t+h,t}}(e;\hat y) $$

$$ \text{where } \Lambda(e,\hat y) \equiv \frac{\partial L(y,\hat y)/\partial \hat y\,\big|_{y=\hat y+e}}{\partial \tilde L(y,\hat y)/\partial \hat y\,\big|_{y=\hat y+e}} \equiv \frac{\psi(\hat y+e,\, \hat y)}{\tilde\psi(\hat y+e,\, \hat y)}. $$

2. If we further let assumption L2' hold, then the generalized forecast error under $\tilde L$ evaluated at $Y^*_{t+h,t}$, $\tilde\psi(Y_{t+h}, Y^*_{t+h,t}) = \partial \tilde L(Y_{t+h}, Y^*_{t+h,t})/\partial \hat y$, has conditional mean zero under the $\tilde L$-loss probability measure.

3. The generalized forecast error under $\tilde L$, evaluated at $Y^*_{t+h,t}$, is serially uncorrelated under the $\tilde L$-loss probability measure for all lags greater than $h-1$.

4. $\tilde E[\tilde L(Y_{t+h}, Y^*_{t+h,t})]$, the expectation of $\tilde L(Y_{t+h}, Y^*_{t+h,t})$ under $\tilde F_e(\cdot;\hat y)$, is a nondecreasing function of the forecast horizon when evaluated at $\hat y = Y^*_{t+h,t}$.

Proof of Lemma 1 We first need to show that $d\tilde F_{e_{t+h,t}} \ge 0$ for all possible values of $e$, and that $\int d\tilde F_{e_{t+h,t}}(u;\hat y)\,du = 1$. By assumption L5 we have $\Lambda(e,\hat y) > 0$ for all $e$ where $\Lambda(e,\hat y)$ exists. Thus $\Lambda \cdot dF_{e_{t+h,t}}$ is non-negative, and $E_t[\Lambda]$ is positive (and finite by assumption L6'), so $d\tilde F_{e_{t+h,t}}(e;\hat Y_{t+h,t}) \ge 0$ if $dF_{e_{t+h,t}}(e;\hat Y_{t+h,t}) \ge 0$. By the construction of $d\tilde F_{e_{t+h,t}}$ it is clear that it integrates to 1.

To prove part 2, note that, from the optimality of $Y^*_{t+h,t}$ under $L$,

$$ \tilde E_t\!\left[\tilde\psi\big(Y_{t+h},\, Y^*_{t+h,t}\big)\right] \propto \int \tilde\psi\big(Y^*_{t+h,t}+e,\, Y^*_{t+h,t}\big)\,\Lambda\big(e,\, Y^*_{t+h,t}\big)\cdot dF_{e_{t+h,t}}\big(e;\, Y^*_{t+h,t}\big) = \int \psi\big(Y^*_{t+h,t}+e,\, Y^*_{t+h,t}\big)\cdot dF_{e_{t+h,t}}\big(e;\, Y^*_{t+h,t}\big) = 0. $$

The unconditional mean of $\tilde\psi(Y_{t+h}, Y^*_{t+h,t})$ is also zero by the law of iterated expectations.


Part 3: As $\tilde E[\tilde\psi(Y_{t+h}, Y^*_{t+h,t})] = 0$, from part 2, we need only show that $\tilde E[\tilde\psi(Y_{t+h}, Y^*_{t+h,t}) \cdot \tilde\psi(Y_{t+h+j}, Y^*_{t+h+j,t+j})] = 0$ for $j \ge h$. Again, by part 2,

$$ \tilde E_t\!\left[\tilde\psi\big(Y_{t+h},\, Y^*_{t+h,t}\big)\cdot \tilde\psi\big(Y_{t+h+j},\, Y^*_{t+h+j,t+j}\big)\right] = \tilde E_t\!\left[\tilde\psi\big(Y_{t+h},\, Y^*_{t+h,t}\big)\cdot \tilde E_{t+j}\!\left[\tilde\psi\big(Y_{t+h+j},\, Y^*_{t+h+j,t+j}\big)\right]\right] = 0 \quad \text{for } j \ge h. $$

$\tilde E[\tilde\psi(Y_{t+h}, Y^*_{t+h,t}) \cdot \tilde\psi(Y_{t+h+j}, Y^*_{t+h+j,t+j})] = 0$ follows by the law of iterated expectations.

For part 4 note that $\tilde E_t[\tilde\psi(Y_{t+h}, Y^*_{t+h,t})] = 0$ is the first order condition of $\min_{\hat y} \tilde E_t[\tilde L(Y_{t+h}, \hat y)]$, so $\tilde E_t[\tilde L(Y_{t+h}, Y^*_{t+h,t})] \le \tilde E_t[\tilde L(Y_{t+h}, Y^*_{t+h,t-j})]$ $\forall j \ge 0$, and so $\tilde E[\tilde L(Y_{t+h}, Y^*_{t+h,t})] \le \tilde E[\tilde L(Y_{t+h}, Y^*_{t+h,t-j})] = \tilde E[\tilde L(Y_{t+h+j}, Y^*_{t+h+j,t})]$ by the law of iterated expectations and the assumption of strict stationarity. Note that the assumption of strict stationarity for $\{X_t\}$ suffices here as $Y^*_{t+h,t}$ and the change of measure, $\Lambda_{t+h,t}(e, Y^*_{t+h,t})$, are time-invariant functions of $Z_t$.

Proof of Proposition 2 Follows from the proof of Lemma 1, setting $\tilde L(y, \hat y) = (y - \hat y)^2$ and noting that assumption L6 satisfies L6' for this loss function.


11

Multivariate Autocontours for Specification Testing in Multivariate GARCH Models

Gloria Gonzalez-Rivera and Emre Yoldas

1. Introduction

Even though there is an extensive literature on specification tests for univariate time series models, the development of new tests for multivariate models has been very slow. As an example, in the ARCH literature we have numerous univariate specifications for which we routinely scrutinize the standardized residuals for possible neglected dependence and deviation from the assumed conditional density. However, for multivariate GARCH models we rarely test for the assumed multivariate density and for cross-dependence in the residuals. Given the inherent difficulty of estimating multivariate GARCH models, the issue of dynamic mis-specification at the system level – as important as it may be – seems to be secondary. Though univariate specification tests can be performed in each equation of the system, these tests are not independent from each other, and an evaluation of the system will demand adjustments in the size of any joint test that combines the results of the equation-by-equation univariate tests. Bauwens, Laurent, and Rombouts (2006) survey the latest developments in multivariate GARCH models and they also acknowledge the need for further research on multivariate diagnostic tests. There are some portmanteau statistics for neglected multivariate conditional heteroskedasticity, as in Ling and Li (1997), Tse and Tsui (1999), and Duchesne and Lalancette (2003). Some of these tests have unknown asymptotic distributions when applied to the generalized GARCH residuals. Tse (2002) proposes another type of mis-specification test that is based on regressions of the standardized residuals on some explanatory variables. In that case, the usual ordinary least squares (OLS) asymptotics do not apply, but it is possible to construct some statistics that are asymptotically chi-squared distributed under the null of no dynamic mis-specification. None of these tests are concerned with the specification of the multivariate density. However, the knowledge of the density functional form is of paramount importance for density forecast evaluation, which is needed to assess the overall adequacy of the model. Recently, Bai and Chen (2008) adopted the empirical process-based testing approach of Bai (2003), which is developed in the univariate framework, to multivariate models. They use single-indexed empirical processes to make computation feasible, but this causes loss of full consistency. Kalliovirta (2007) also takes an empirical process-based approach and proposes several test statistics for checking dynamic mis-specification and density functional form.

Acknowledgments: We are grateful to Tim Bollerslev and an anonymous referee for helpful comments that significantly improved the presentation of the chapter.

We propose a new battery of tests for dynamic specification and density functional form in multivariate time series models. We focus on the most popular models, for which all the time dependence is confined to the first and second moments of the multivariate process. Multivariate dynamics in moments beyond the second are difficult to find in the data and, to our knowledge, there are only a few attempts in the literature, restricted to at most bivariate systems. Our approach is not based on empirical processes, so we do not require probability integral transformations, as opposed to the above mentioned studies testing for density specification. This makes dealing with parameter uncertainty relatively less challenging on theoretical grounds. When parameter estimation is required, we will adopt a quasi-maximum likelihood procedure as opposed to strict maximum likelihood, which assumes knowledge of the true multivariate density. If the true density were known, it would be possible to construct tests for dynamic mis-specification based on the martingale difference property of the score under the null. However, if the density function is unknown, a quasi-maximum likelihood estimator is the most desirable to avoid the inconsistency of the estimator that we would have obtained under a potentially false density function. The lack of consistency may also jeopardize the asymptotic distribution of the tests. Our approach is less demanding than any score-type testing in the sense that once quasi-maximum likelihood estimates are in place, we can proceed to test different proposals on the functional form of the conditional multivariate density function.

The proposed tests are based on the concept of "autocontour" introduced by Gonzalez-Rivera, Senyuz, and Yoldas (2007) for univariate processes. Our methodology is applicable to a wide range of models including linear and nonlinear VAR specifications with multivariate GARCH disturbances. The variable of interest is the vector of generalized innovations $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t}, \ldots, \varepsilon_{kt})'$ in a model $y_t = \mu_t(\theta_{01}) + H_t^{1/2}(\theta_{02})\,\varepsilon_t$, where $y_t$ is a $k \times 1$ vector of variables with conditional mean vector $\mu_t$ and conditional covariance matrix $H_t$. Under the null hypothesis of correct dynamic specification the vector $\varepsilon_t$ must be i.i.d. with a certain parametric multivariate probability density function $f(.)$. Thus, if we consider the joint distribution of two vectors $\varepsilon_t$ and $\varepsilon_{t-l}$, then under the null we have $f(\varepsilon_t, \varepsilon_{t-l}) = f(\varepsilon_t)f(\varepsilon_{t-l})$. The basic idea of the proposed tests is to calculate the percentage of observations contained within the probability autocontour planes corresponding to the assumed multivariate density of the vector of independent innovations, i.e. $f(\varepsilon_t)f(\varepsilon_{t-l})$, and to statistically compare it to the population percentage. We develop a battery of t-tests based on a single autocontour, and also more powerful chi-squared tests based on multiple autocontours, which have standard asymptotic distributions. Without parameter uncertainty the test statistics are all distribution free, but under parameter uncertainty there are nuisance parameters affecting the asymptotic distributions. We show that a simple bootstrap procedure overcomes this problem and yields the correct size even for moderate sample sizes. We also investigate the power properties of the test statistics in finite samples.

As the null is a joint hypothesis, the rejection of the null begs the question of what is at fault. Thus, it is desirable to separate i.i.d.-ness from the density functional form. In the spirit of goodness-of-fit tests, we also propose an additional test that focuses on the multivariate density functional form of the vector of innovations. Following a similar approach, we construct the probability contours corresponding to the hypothesized multivariate density, $f(\varepsilon_t)$, and compare the sample percentage of observations falling within the contour to the population percentage. The goodness-of-fit tests are also constructed as t-statistics and chi-squared statistics with standard distributions.

The organization of this chapter is as follows. In Section 2, we describe the battery of tests, which follow from Gonzalez-Rivera, Senyuz, and Yoldas (2007), and the construction of the multivariate contours and autocontours. In Section 3, we offer some Monte Carlo simulations to assess the size and power of the tests in finite samples. In Section 4, we apply the tests to the generalized residuals of GARCH models with hypothesized multivariate Normal and multivariate Student-t innovations fitted to excess returns on five size portfolios. In Section 5, we conclude.

2. Testing methodology

2.1. Test statistics

Let $y_t = (y_{1t}, \ldots, y_{kt})'$ and suppose that $y_t$ evolves according to the following process

$$ y_t = \mu_t(\theta_{01}) + H_t^{1/2}(\theta_{02})\,\varepsilon_t, \qquad t = 1, \ldots, T, \qquad (1) $$

where $\mu_t(.)$ and $H_t^{1/2}(.)$ are both measurable with respect to the time $t-1$ sigma field, $\mathcal{F}_{t-1}$, $H_t(.)$ is positive definite, and $\{\varepsilon_t\}$ is an i.i.d. vector process with zero mean and identity covariance matrix. The conditional mean vector, $\mu_t(.)$, and the conditional covariance matrix, $H_t(.)$, are fully parameterized by the parameter vector $\theta_0 = (\theta_{01}', \theta_{02}')'$, which for now we assume to be known, but later on we will relax this assumption to account for parameter uncertainty.

If all the dependence is contained in the first and second conditional moments of the process $y_t$, then the null hypothesis of interest to test for model mis-specification is

$$ H_0: \varepsilon_t \text{ is i.i.d. with density } f(.). $$

The alternative hypothesis is the negation of the null. Though we wish to capture all the dynamic dependence of $y_t$ through the modeling of the conditional mean and conditional covariance matrix, there may be another degree of dependence that is built into the assumed multivariate density, $f(.)$. In fact, once we move beyond the assumption of multivariate normality, for instance when we assume a multivariate Student-t distribution, the components of the vector $\varepsilon_t$ are dependent among themselves and this information is only contained within the functional form of the density. This is why, among other reasons, it is of interest to incorporate the assumed density function in the null hypothesis.

Let us consider the joint distribution of two $k \times 1$ vectors $\varepsilon_t$ and $\varepsilon_{t-l}$, $l = 1, \ldots, L < \infty$. Define a $2k \times 1$ vector $\eta_t = (\varepsilon_t', \varepsilon_{t-l}')'$ and let $\psi(.)$ denote the associated density function. Under the null hypothesis of i.i.d. and correct probability density function, we can write $\psi(\eta_t) = f(\varepsilon_t)f(\varepsilon_{t-l})$. Then, under the null, we define the $\alpha$-autocontour, $C_{l,\alpha}$, as the set of vectors $(\varepsilon_t', \varepsilon_{t-l}')'$ that results from slicing the multivariate density, $\psi(.)$, at a certain value to guarantee that the set contains $\alpha\%$ of the observations, that is,

$$ C_{l,\alpha} = \Big\{ S(\eta_t) \subset \mathbb{R}^{2k} \,\Big|\, \int_{h_1}^{g_1}\cdots\int_{h_{2k}}^{g_{2k}} \psi(\eta_t)\, d\eta_{1t}\ldots d\eta_{2k,t} \le \alpha \Big\}, \qquad (2) $$

where the limits of integration are determined by the density functional form so that the shape of the probability contours is preserved under integration, e.g. when the assumed density is normal, the autocontours are $2k$-spheres (a circle when $k = 1$). We construct an indicator process defined as

$$ I^{l,\alpha}_t = \begin{cases} 1 & \text{if } \eta_t \notin C_{l,\alpha} \\ 0 & \text{otherwise.} \end{cases} \qquad (3) $$

The process $\{I^{l,\alpha}_t\}$ forms the building block of the proposed test statistics. Let $p_\alpha \equiv 1 - \alpha$. As the indicator is a Bernoulli random variable, its mean and variance are given by $E[I^{l,\alpha}_t] = p_\alpha$ and $Var(I^{l,\alpha}_t) = p_\alpha(1 - p_\alpha)$. Although $\{\varepsilon_t\}$ is an i.i.d. process, $\{I^{l,\alpha}_t\}$ exhibits some linear dependence because $I^{l,\alpha}_t$ and $I^{l,\alpha}_{t-l}$ share common information contained in $\varepsilon_{t-l}$. Hence, the autocovariance function of $\{I^{l,\alpha}_t\}$ is given by

$$ \gamma^\alpha_h = \begin{cases} P\big(I^{l,\alpha}_t = 1,\, I^{l,\alpha}_{t-h} = 1\big) - p_\alpha^2 & \text{if } h = l \\ 0 & \text{otherwise.} \end{cases} $$

Proposition 1 Define $\hat p^{\,l}_\alpha = (T - l)^{-1}\sum_{t=1}^{T-l} I^{l,\alpha}_t$. Under the null hypothesis,

$$ t_{l,\alpha} = \frac{\sqrt{T-l}\,\big(\hat p^{\,l}_\alpha - p_\alpha\big)}{\sigma_{l,\alpha}} \to_d N(0, 1), \qquad (4) $$

where $\sigma^2_{l,\alpha} = p_\alpha(1 - p_\alpha) + 2\gamma^\alpha_l$.

Proof See Gonzalez-Rivera, Senyuz, and Yoldas (2007) for all mathematical proofs.
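Under normality with known parameters, the whole construction — the χ²-based autocontour threshold, the hit indicator in (3), and the $t_{l,\alpha}$ statistic of Proposition 1 — fits in a few lines. Below is a sketch with our own variable names, using SciPy's `chi2` for the quantile (the threshold construction for the normal case is detailed in Section 2.2.1):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
T, k, l, alpha = 20_000, 2, 1, 0.90
eps = rng.standard_normal((T, k))              # i.i.d. N(0, I_k) innovations under H0

# autocontour indicator: eta_t = (eps_t', eps_{t-l}')', hit when eta'eta > d_alpha
d_alpha = stats.chi2.ppf(alpha, df=2 * k)
x = (eps[l:] ** 2).sum(axis=1) + (eps[:-l] ** 2).sum(axis=1)
I = (x > d_alpha).astype(float)                # indicator series of length T - l

p_alpha = 1 - alpha
p_hat = I.mean()
gamma_l = np.mean((I[l:] - p_hat) * (I[:-l] - p_hat))   # lag-l autocovariance of I
sigma2 = p_alpha * (1 - p_alpha) + 2 * gamma_l          # variance from Proposition 1
t_stat = np.sqrt(I.size) * (p_hat - p_alpha) / np.sqrt(sigma2)
```

With correctly specified innovations the sample exceedance rate is close to $1 - \alpha$ and the t-statistic behaves like a standard normal draw.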

Now let us consider a finite number of contours, $(\alpha_1, \ldots, \alpha_n)$, jointly. Let $p_\alpha = (p_{\alpha_1}, \ldots, p_{\alpha_n})'$ where $p_{\alpha_i} = 1 - \alpha_i$, and define $\hat p^{\,l}_{\alpha_i} = (T - l)^{-1}\sum_{t=1}^{T-l} I^{l,\alpha_i}_t$ for $i = 1, \ldots, n$. We then collect all the $\hat p^{\,l}_{\alpha_i}$s in an $n \times 1$ vector, $\hat p^{\,l}_\alpha = (\hat p_1, \ldots, \hat p_n)'$.

Proposition 2 Under the null hypothesis, $\sqrt{T-l}\,(\hat p^{\,l}_\alpha - p_\alpha) \to_d N(0, \Xi)$, where the elements of $\Xi$ are $\xi_{ij} = \min(p_{\alpha_i}, p_{\alpha_j}) - p_{\alpha_i}p_{\alpha_j} + Cov(I^{l,\alpha_i}_t, I^{l,\alpha_j}_{t-l}) + Cov(I^{l,\alpha_j}_t, I^{l,\alpha_i}_{t-l})$. Then, it directly follows that

$$ J^l_n = (T-l)\big(\hat p^{\,l}_\alpha - p_\alpha\big)'\,\Xi^{-1}\big(\hat p^{\,l}_\alpha - p_\alpha\big) \to_d \chi^2(n). \qquad (5) $$


A complementary test to those described above can be constructed in the spirit of goodness-of-fit. Suppose that we consider only the vector $\varepsilon_t$ and we wish to test in the direction of density functional form. We construct the probability contour sets $C_\alpha$ corresponding to the probability density function that is assumed under the null hypothesis. The set is given by

$$ C_\alpha = \Big\{ S(\varepsilon_t) \subset \mathbb{R}^k \,\Big|\, \int_{h_1}^{g_1}\cdots\int_{h_k}^{g_k} f(\varepsilon_t)\, d\varepsilon_{1t}\ldots d\varepsilon_{kt} \le \alpha \Big\}. \qquad (6) $$

Then, as before, we construct an indicator process as follows

$$ I^\alpha_t = \begin{cases} 1 & \text{if } \varepsilon_t \notin C_\alpha \\ 0 & \text{otherwise,} \end{cases} \qquad (7) $$

for which the mean and variance are $E[I^\alpha_t] = 1 - \alpha$ and $Var(I^\alpha_t) = \alpha(1 - \alpha)$, respectively. The main difference between the sets $C_{l,\alpha}$ and $C_\alpha$ is that the latter does not explicitly consider the time-independence assumed under the null and, therefore, the following tests based on $C_\alpha$ will be less powerful against independence. There is also a difference in the properties of the indicator process: now the indicator is also an i.i.d. process, and the analogous tests to those of Propositions 1 and 2 will have a simpler asymptotic distribution.

Let $p_\alpha = 1 - \alpha$ and define an estimator of $p_\alpha$ as $\hat p_\alpha = T^{-1}\sum_{t=1}^{T} I^\alpha_t$. Under the null hypothesis the distribution of the analogue test statistic to that of Proposition 1 is

$$ t_\alpha = \frac{\sqrt{T}\,(\hat p_\alpha - p_\alpha)}{\sqrt{p_\alpha(1 - p_\alpha)}} \to_d N(0, 1). $$

If, as in Proposition 2, we now jointly consider a finite number of contours and define the vectors $p_\alpha = (p_{\alpha_1}, \ldots, p_{\alpha_n})'$ and $\hat p_\alpha = (\hat p_{\alpha_1}, \ldots, \hat p_{\alpha_n})'$, where $p_{\alpha_i} = 1 - \alpha_i$ and $\hat p_{\alpha_i} = T^{-1}\sum_{t=1}^{T} I^{\alpha_i}_t$, then $\sqrt{T}(\hat p_\alpha - p_\alpha) \to_d N(0, \Xi)$, where the elements of $\Xi$ simplify to $\xi_{ij} = \min(p_{\alpha_i}, p_{\alpha_j}) - p_{\alpha_i}p_{\alpha_j}$, and it follows that

$$ J_n = T\big(\hat p_\alpha - p_\alpha\big)'\,\Xi^{-1}\big(\hat p_\alpha - p_\alpha\big) \to_d \chi^2(n). $$
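For this i.i.d. goodness-of-fit version, the covariance matrix has the closed form $\xi_{ij} = \min(p_{\alpha_i}, p_{\alpha_j}) - p_{\alpha_i}p_{\alpha_j}$ (the contours are nested, so the joint exceedance probability equals the smaller of the two marginal probabilities), and $J_n$ can be computed directly. A sketch under bivariate normality, with our own illustrative contour levels:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
T, k = 20_000, 2
eps = rng.standard_normal((T, k))              # data drawn under the null
alphas = np.array([0.50, 0.70, 0.90, 0.99])    # illustrative contour coverage levels
p = 1 - alphas                                 # population exceedance rates

q = stats.chi2.ppf(alphas, df=k)               # contour thresholds q_alpha (normal case)
x = (eps ** 2).sum(axis=1)
p_hat = np.array([(x > q_a).mean() for q_a in q])

Xi = np.minimum.outer(p, p) - np.outer(p, p)   # xi_ij = min(p_i, p_j) - p_i p_j
diff = p_hat - p
J_n = T * diff @ np.linalg.solve(Xi, diff)     # approximately chi2(n) under the null
p_value = stats.chi2.sf(J_n, df=alphas.size)
```

Under the null, $J_n$ is an ordinary chi-squared draw with $n$ degrees of freedom, so the p-value is uniform over repeated samples.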

Note that to make these tests operational we replace the covariance terms by their sample counterparts. Furthermore, the asymptotic normality results established above still hold under parameter uncertainty, as shown by Gonzalez-Rivera, Senyuz, and Yoldas (2007). However, one needs to deal with nuisance parameters in the asymptotic covariance matrices to make the statistics operational. They suggest using a parametric bootstrap procedure, which imposes all restrictions of the null hypothesis, to estimate asymptotic covariance matrices under parameter uncertainty. Specifically, after the model is estimated, bootstrap samples are generated by using the estimated model as the data generating process, where innovation vectors are drawn from the hypothesized parametric distribution. Their Monte Carlo simulations indicate that this approach provides satisfactory results. Hence, in this chapter we take the same approach in our applications.
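The parametric bootstrap just described can be sketched as follows, for the simple location-scale model of Section 3 under a hypothesized Gaussian density. The helper name and the choice of $B = 200$ bootstrap replications are our own (the chapter's simulations use 500):

```python
import numpy as np
from scipy import stats

def contour_exceedance(y, alpha):
    """Exceedance rate of the alpha-contour for estimated standardized residuals."""
    mu_hat = y.mean(axis=0)
    S = np.cov(y, rowvar=False)
    L = np.linalg.cholesky(S)                      # square-root matrix via Cholesky
    eps_hat = np.linalg.solve(L, (y - mu_hat).T).T
    q = stats.chi2.ppf(alpha, df=y.shape[1])
    return ((eps_hat ** 2).sum(axis=1) > q).mean()

rng = np.random.default_rng(4)
T, k, alpha, B = 1_000, 2, 0.90, 200
y = rng.standard_normal((T, k))                    # data generated under the null

stat = np.sqrt(T) * (contour_exceedance(y, alpha) - (1 - alpha))

# parametric bootstrap: redraw from the hypothesized density, re-estimate the model,
# and use the bootstrap dispersion of the statistic in place of its asymptotic variance
boot = np.empty(B)
for b in range(B):
    y_b = rng.standard_normal((T, k))
    boot[b] = np.sqrt(T) * (contour_exceedance(y_b, alpha) - (1 - alpha))

t_stat = stat / boot.std(ddof=1)
```

Because each bootstrap sample is generated and re-estimated under the null, the bootstrap dispersion absorbs the nuisance-parameter effect of replacing $\mu$ and $H$ with their estimates.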


2.2. Multivariate contours and autocontours

2.2.1. Multivariate normal distribution

In this case the density function is $f(\varepsilon_t) = (2\pi)^{-k/2}\exp(-0.5\,\varepsilon_t'\varepsilon_t)$. Let $f_\alpha$ denote the value of the density such that the corresponding probability contour contains $\alpha\%$ of the observations. Then the equation describing this contour is

$$ q_\alpha = \varepsilon_t'\varepsilon_t \equiv \varepsilon_{1t}^2 + \varepsilon_{2t}^2 + \cdots + \varepsilon_{kt}^2, $$

where $q_\alpha = -2\ln\big(f_\alpha \times (2\pi)^{k/2}\big)$. Hence, the $C_\alpha$ contour set is defined as follows

$$ C_\alpha = \Big\{ S(\varepsilon_t) \subset \mathbb{R}^k \,\Big|\, \int_{-g_1}^{g_1}\cdots\int_{-g_k}^{g_k} (2\pi)^{-k/2}\exp(-0.5\,\varepsilon_t'\varepsilon_t)\, d\varepsilon_{1t}\ldots d\varepsilon_{kt} \le \alpha \Big\}, $$

where $g_1 = \sqrt{q_\lambda}$, $g_i = \sqrt{q_\lambda - \sum_{j=1}^{i-1}\varepsilon_{jt}^2}$ for $i = 2, \ldots, k$, and $\lambda \le \alpha$. We need to determine the mapping $q_\alpha$ in order to construct the indicator process. Let $x_t = \varepsilon_t'\varepsilon_t$; then $x_t \sim \chi^2(k)$ and we have $q_\alpha \equiv \inf\{q : F_{x_t}(q) \ge \alpha\}$, where $F_{x_t}$ is the cumulative distribution function of a chi-squared random variable with $k$ degrees of freedom. As a result, the indicator series is obtained as follows

$$ I^\alpha_t = \begin{cases} 1 & \text{if } \varepsilon_t'\varepsilon_t > q_\alpha \\ 0 & \text{otherwise.} \end{cases} $$

To construct the autocontour $C_{l,\alpha}$, we consider the joint distribution of $\varepsilon_t$ and $\varepsilon_{t-l}$. Let $\eta_t = (\varepsilon_t', \varepsilon_{t-l}')'$; then the density of interest is given by $\psi(\eta_t) = (2\pi)^{-k}\exp(-0.5\,\eta_t'\eta_t)$. Hence, the autocontour equation is given by

$$ d_\alpha = \eta_t'\eta_t \equiv \eta_{1t}^2 + \cdots + \eta_{2k,t}^2, $$

where $d_\alpha = -2\ln\big(\psi_\alpha \times (2\pi)^k\big)$. Following the same arguments as above, the corresponding indicator process is

$$ I^{l,\alpha}_t = \begin{cases} 1 & \text{if } \eta_t'\eta_t > d_\alpha \\ 0 & \text{otherwise,} \end{cases} $$

where $d_\alpha \equiv \inf\{d : F_{x_t}(d) \ge \alpha\}$, $x_t = \eta_t'\eta_t$, and $F_{x_t}$ is the cumulative distribution function of a chi-squared random variable with $2k$ degrees of freedom.

2.2.2. Student-t distribution

The multivariate density function is

$$ f(\varepsilon_t) = G(k, v)\,\big[1 + \varepsilon_t'\varepsilon_t/(v-2)\big]^{-(k+v)/2}, $$

where $G(k, v) = \Gamma[(v + k)/2]\big/\big\{[\pi(v-2)]^{0.5k}\,\Gamma(v/2)\big\}$. Then the equation for the $\alpha$-probability contour is

$$ q_\alpha = 1 + \varepsilon_t'\varepsilon_t/(v - 2), $$

where $q_\alpha = [f_\alpha/G(k, v)]^{-2/(k+v)}$. As a result, the $C_\alpha$ contour set is defined as

$$ C_\alpha = \Big\{ S(\varepsilon_t) \subset \mathbb{R}^k \,\Big|\, \int_{-g_1}^{g_1}\cdots\int_{-g_k}^{g_k} G(k, v)\,\big[1 + \varepsilon_t'\varepsilon_t/(v-2)\big]^{-(k+v)/2}\, d\varepsilon_{1t}\ldots d\varepsilon_{kt} \le \alpha \Big\}, $$


where $g_1 = \sqrt{(q_\lambda - 1)(v-2)}$, $g_i = \sqrt{(q_\lambda - 1)(v-2) - \sum_{j=1}^{i-1}\varepsilon_{jt}^2}$ for $i = 2, \ldots, k$, and $\lambda \le \alpha$. Now let $x_t = 1 + \varepsilon_t'\varepsilon_t/(v-2)$; then $x_t \equiv 1 + (k/v)w_t$ where $w_t$ has an F-distribution with $(k, v)$ degrees of freedom. Consequently, we have $q_\alpha \equiv \inf\{q : F_{w_t}[v(q-1)/k] \ge \alpha\}$. Then the indicator series is defined as

$$ I^\alpha_t = \begin{cases} 1 & \text{if } 1 + \varepsilon_t'\varepsilon_t/(v-2) > q_\alpha \\ 0 & \text{otherwise.} \end{cases} $$

To construct the autocontour $C_{l,\alpha}$, we consider the joint distribution of $\varepsilon_t$ and $\varepsilon_{t-l}$ under the null hypothesis, which is

$$ \psi(\varepsilon_t, \varepsilon_{t-l}) = G(k, v)^2\,\big[\big(1 + \varepsilon_t'\varepsilon_t/(v-2)\big)\big(1 + \varepsilon_{t-l}'\varepsilon_{t-l}/(v-2)\big)\big]^{-(k+v)/2}. $$

Then, the equation for the $\alpha$-probability autocontour is given by

$$ d_\alpha = 1 + \big(\varepsilon_t'\varepsilon_t + \varepsilon_{t-l}'\varepsilon_{t-l}\big)/(v-2) + \big(\varepsilon_t'\varepsilon_t\big)\big(\varepsilon_{t-l}'\varepsilon_{t-l}\big)/(v-2)^2. $$

Let $x_t = 1 + (\varepsilon_t'\varepsilon_t + \varepsilon_{t-l}'\varepsilon_{t-l})/(v-2) + (\varepsilon_t'\varepsilon_t)(\varepsilon_{t-l}'\varepsilon_{t-l})/(v-2)^2$; then we have $x_t = 1 + (k/v) \times [(w_{1t} + w_{2t}) + (k/v)(w_{1t}w_{2t})]$ where $w_{1t}$ and $w_{2t}$ are independent random variables with an F-distribution with $(k, v)$ degrees of freedom. Similar to the previous case, we have $d_\alpha \equiv \inf\{d : F_{x_t}(d) \ge \alpha\}$, but we do not have readily available results for the quantiles of $x_t$ as before. A plausible solution is using Monte Carlo simulation to approximate the quantiles of interest, as we already know that $x_t$ is a specific function of two independent F-distributed random variables.

As an illustration, we provide sample contour and autocontour plots under Normal and Student-t (with $v = 5$) distributions in Figure 11.1. Due to the graphical constraints imposed by high dimensionality, we consider $k = 2$ and $k = 1$ for $C_\alpha$ and $C_{l,\alpha}$, respectively. Note that while $C_\alpha$ and $C_{l,\alpha}$ are of identical shape under normality, as the product of two independent normal densities yields a bivariate normal density, this is not the case under the Student-t distribution.

3. Monte Carlo simulations

We investigate the size and power properties of the proposed tests in finite samples by Monte Carlo simulations for two cases: when the parameters of the model are known, and when they are unknown and need to be estimated.

3.1. Size simulations

For the size experiments we consider two alternative distributions for the innovation process: a multivariate Normal, $\varepsilon_t \sim$ i.i.d. $N(0, I_k)$, and a multivariate Student-t with 5 degrees of freedom, $\varepsilon_t \sim$ i.i.d. $t(0, I_k, 5)$. Under parameter uncertainty, we consider a simple multivariate location-scale model: $y_t = \mu + H^{1/2}\varepsilon_t$, where we set $\mu = 0$ and $H = I_k$. We consider both distributions under parameter uncertainty and apply the tests to the estimated standardized residual vector, $\hat\varepsilon_t = \hat H^{-1/2}(y_t - \hat\mu)$, where we obtain $\hat H^{1/2}$ by


Fig. 11.1. Contour and autocontour plots under Normal and Student-t distributions. [Top row: $C_\alpha$ under bivariate Normal and Student-t distributions, $\alpha \in \{0.5, 0.7, 0.9, 0.99\}$, plotted in the $(\varepsilon_{1t}, \varepsilon_{2t})$ plane; bottom row: $C_{l,\alpha}$ under Normal and Student-t distributions, $\alpha \in \{0.5, 0.7, 0.9, 0.99\}$, plotted in the $(\varepsilon_{1t}, \varepsilon_{1,t-1})$ plane.]

using the Cholesky decomposition.1 The asymptotic variance of the tests is obtained by the simple parametric bootstrap procedure outlined above (see Section 2.1). The number of Monte Carlo replications is equal to 1,000, and the number of bootstrap replications is set to 500. We consider 13 autocontours ($n = 13$) with coverage levels (%): 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, and 99, spanning the entire density function.2 We start with a sample size of 250 and consider increments of 250 up to 2,000 observations. In all experiments, the nominal size is 5%.

1 Alternative decompositions can be used to calculate the square-root matrix. We conjecture that the choice of the decomposition technique is not critical for the application of our tests.

2 Our choice of the contour coverage levels is motivated by the need to cover the entire range of the density, from the tails to the very center, as we do not have a theoretical result indicating the optimal choice of the number of contours to guide our practice. The flexibility of our approach permits considering different types of coverage levels depending on the purpose of the application, e.g. concentrating on the tails for risk models. Note also that the Monte Carlo results presented below provide guidance as to how far one can go in the tails and the center of the density without losing precision in finite samples. Additional Monte Carlo simulations, not reported here to save space, also indicate that the size and power results are robust to the number of contours as long as the range considered is identical, i.e. a finer grid does not change the results.

Page 234: Bollerslev T. Et Al._2010_Volatility and Time Series Eco No Metrics

3 Monte Carlo simulations 221

Table 11.1(a). Size of the J^l_n-statistics

         Panel a: Normal (k = 2)                  Panel b: Student-t (k = 2)
T      J^1_13  J^2_13  J^3_13  J^4_13  J^5_13   J^1_13  J^2_13  J^3_13  J^4_13  J^5_13
250     11.3    11.3    11.6     8.8    11.8     10.5    11.0    10.5    12.3     9.4
500      6.5     6.0     5.8     5.9     8.0      7.5     5.8     5.9     7.0     6.2
1000     6.8     5.0     6.2     5.3     4.9      7.2     5.2     5.1     5.4     6.0
2000     6.4     5.1     5.7     4.1     4.8      7.2     5.8     5.5     6.4     6.4

         Panel a: Normal (k = 5)                  Panel b: Student-t (k = 5)
250     12.7    11.8    11.5    14.0    12.9     10.4    11.7    12.3    10.3    11.6
500      9.2     8.4     6.9     7.6     8.3      7.3     6.6     7.3     7.9     8.1
1000     6.3     7.1     5.5     6.0     6.4      5.9     4.8     6.6     5.7     7.8
2000     5.3     5.6     5.3     3.4     6.5      6.9     4.8     5.7     5.5     5.4

Table 11.1(b). Size of the J^l_n-statistics under parameter uncertainty

         Panel a: Normal (k = 2)                  Panel b: Student-t (k = 2)
T      J^1_13  J^2_13  J^3_13  J^4_13  J^5_13   J^1_13  J^2_13  J^3_13  J^4_13  J^5_13
250      8.1     6.1     7.3     7.5     6.9      6.8     6.4     7.8     6.5     6.0
500      7.5     5.9     5.8     7.3     7.4      7.5     6.7     8.3     8.0     8.1
1000     8.1     5.8     8.0     7.3     6.6      8.5     6.9     8.8     8.3     7.6
2000     5.7     5.4     7.7     6.4     4.8      6.2     7.6     7.6     6.4     7.0

         Panel a: Normal (k = 5)                  Panel b: Student-t (k = 5)
250     10.5     9.3     7.7     9.2     8.1      7.1     7.3     6.3     7.2     6.3
500      7.7     6.9     6.3     6.9     7.6      6.8     5.5     6.0     6.9     6.4
1000     5.9     6.1     7.1     5.5     5.5      6.4     5.7     6.8     7.5     6.6
2000     8.0     8.0     7.4     6.8     7.1      7.0     6.5     7.3     6.3     7.9

In Tables 11.1(a) and 11.1(b) we present the simulated size results for the J^l_n-statistics. We consider a system of two equations (k = 2) and a system of five equations (k = 5). For a small sample of 250 observations, the J^l_n-statistics are oversized for both densities and both systems. However, under parameter uncertainty, the bootstrap procedure seems to correct the oversize behavior to some extent. For samples of 1,000 and more observations, the simulated size is within an acceptable range of values. There are no major differences between the results for the small versus the large systems of equations, indicating that the dimensionality of the system is not an issue for the implementation of these tests.

In Tables 11.2(a) and 11.2(b) we show the simulated size for the Jn-statistics, which should be understood primarily as goodness-of-fit tests as they do not explicitly take into account the independence of the innovations over time. The sizes reported in Table 11.2(a)


222 Multivariate autocontours for specification testing

Table 11.2(a). Size of the Jn-statistics (n = 13)

            Normal            Student-t
T       k = 2   k = 5     k = 2   k = 5
250      5.7     6.3       4.3     6.6
500      4.9     5.3       3.1     5.1
1000     5.7     5.7       5.6     5.3
2000     5.6     6.2       4.9     5.6

Table 11.2(b). Size of the Jn-statistics (n = 13) under parameter uncertainty

            Normal            Student-t
T       k = 2   k = 5     k = 2   k = 5
250      6.9     9.1       7.3     6.8
500      7.0     6.1       6.8     6.7
1000     6.7     5.5       6.7     5.6
2000     6.4     7.4       6.8     5.7

are very good, though those in Table 11.2(b) tend to be slightly larger than 5%, mainly for small samples. However, when we consider the tests with individual contours (see Table 11.3 below), the size distortion tends to disappear.

For the t-tests, which are based on individual contours, the simulated sizes are very good. In Table 11.3, we report these results for the case of parameter uncertainty. The major size distortions occur for small samples at the extreme contour t13 (99% coverage), but this is not very surprising as we do not expect enough variation in the indicator series for small samples.

3.2. Power simulations

We investigate the power of the tests by generating data from a system with two equations that follows three different stochastic processes. We maintain the null hypothesis as y_t = μ + H^{1/2} ε_t, where ε_t ∼ i.i.d. N(0, I_k), and consider the following DGPs:

DGP 1: y_t = μ + H^{1/2} ε_t, where ε_t ∼ i.i.d. t(0, I_2, 5), μ = 0, and H = I_2. In this case, we maintain the independence hypothesis and analyze departures from the hypothesized density function by generating i.i.d. observations from a multivariate Student-t distribution with 5 degrees of freedom.

DGP 2: y_t = A y_{t−1} + H^{1/2} ε_t, where ε_t ∼ i.i.d. N(0, I_2), a_11 = 0.7, a_12 = 0.1, a_21 = 0.03, a_22 = 0.85, and H = I_2. In this case, we maintain the same density function as that of the null hypothesis and analyze departures from the independence assumption by considering a linear VAR(1).


Table 11.3. Size of the t-statistics under parameter uncertainty

T      t1   t2   t3   t4   t5   t6   t7   t8   t9   t10  t11  t12  t13

Panel a: Normal (k = 2)
250   5.0  4.6  5.2  5.1  6.5  6.7  5.7  4.9  5.2  4.6  6.0  4.8  2.0
500   4.3  4.2  5.3  5.4  4.1  4.6  4.5  5.1  5.3  5.2  5.1  4.7  6.4
1000  4.7  4.2  5.2  5.8  5.4  5.5  5.2  5.7  5.7  4.6  5.9  7.6  3.7
2000  5.4  3.9  5.1  4.0  5.0  5.3  5.3  6.2  4.8  5.9  4.3  6.4  4.9

Panel b: Normal (k = 5)
250   4.5  6.2  5.3  5.0  4.5  5.2  5.3  5.8  5.5  5.1  6.1  6.7  2.1
500   4.1  4.8  5.8  4.8  6.0  5.6  5.3  6.4  6.5  4.3  6.3  6.0  6.3
1000  3.8  5.3  5.7  5.3  4.9  5.2  3.8  3.3  4.6  5.3  6.0  4.7  3.9
2000  4.5  5.3  5.0  5.0  4.6  4.1  5.4  6.0  4.6  5.5  5.5  4.4  6.5

Panel c: Student-t (k = 2)
250   4.5  5.1  5.3  4.9  4.9  6.0  4.8  4.6  4.5  5.4  5.7  4.3  8.7
500   4.5  6.1  5.9  4.8  4.5  4.2  4.9  5.3  4.2  5.3  6.1  5.9  4.9
1000  4.3  5.9  6.4  5.8  5.7  5.5  6.6  6.4  5.9  5.8  5.5  6.0  6.3
2000  5.7  5.0  5.2  5.4  5.5  4.7  5.4  5.9  5.5  5.0  4.9  5.2  4.8

Panel d: Student-t (k = 5)
250   4.5  5.5  4.8  4.6  5.8  6.0  7.6  6.7  7.0  6.6  5.8  4.1  8.4
500   4.6  5.4  6.4  4.9  4.9  6.6  5.8  7.1  7.7  6.5  5.4  5.0  5.9
1000  3.4  4.2  4.9  5.5  4.7  6.2  5.8  5.3  5.2  6.0  5.2  4.7  3.7
2000  5.1  5.6  5.3  5.2  5.2  5.0  5.3  4.4  5.3  6.1  5.0  5.1  3.8

DGP 3: y_t = H_t^{1/2} ε_t, ε_t ∼ i.i.d. N(0, I_2), with H_t = C + A′ y_{t−1} y′_{t−1} A + G′ H_{t−1} G and parameter values A = 0.1^{1/2} × I_2, G = 0.85^{1/2} × I_2, and C = V − A′V A − G′V G, where V is the unconditional covariance matrix with v_11 = v_22 = 1 and v_12 = 0.5. In this case, we analyze departures from both independence and density functional form by generating data from a system with multivariate conditional heteroskedasticity.
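The three DGPs can be simulated in a few lines. The sketch below is illustrative (arbitrary seed and sample size; `y1`, `y2`, `y3` are hypothetical names); the Student-t draws are scaled so that the covariance matrix is the identity, consistent with the chapter's specification of t(0, I_2, 5).

```python
import numpy as np

rng = np.random.default_rng(42)
T, k = 1000, 2

# DGP 1: i.i.d. multivariate Student-t with 5 df, scaled so Cov = I_2
# (a t variate with identity scale has covariance nu/(nu-2) * I).
nu = 5
z = rng.standard_normal((T, k))
w = rng.chisquare(nu, size=(T, 1)) / nu
y1 = np.sqrt((nu - 2) / nu) * z / np.sqrt(w)

# DGP 2: Gaussian VAR(1) with A = [[0.7, 0.1], [0.03, 0.85]] (stationary)
A = np.array([[0.7, 0.1], [0.03, 0.85]])
y2 = np.zeros((T, k))
eps = rng.standard_normal((T, k))
for t in range(1, T):
    y2[t] = A @ y2[t - 1] + eps[t]

# DGP 3: BEKK-type GARCH with A = sqrt(0.1) I, G = sqrt(0.85) I, so the
# recursion reduces to H_t = C + 0.1 y y' + 0.85 H with C = 0.05 V.
V = np.array([[1.0, 0.5], [0.5, 1.0]])
a, g = 0.1, 0.85
C = (1 - a - g) * V
H = V.copy()
y3 = np.zeros((T, k))
for t in range(T):
    L = np.linalg.cholesky(H)       # H^{1/2} via Cholesky, as in the text
    y3[t] = L @ rng.standard_normal(k)
    H = C + a * np.outer(y3[t], y3[t]) + g * H
```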

In Table 11.4 we report the power of the J^l_n-statistics. The test is the most powerful at detecting departures from density functional form (DGP 1), as the rejection rates are almost 100% even in small samples. For departures from independence, the test has better power to detect dependence in the conditional mean (DGP 2) than in the conditional variance (DGP 3). As expected, in the case of the VAR(1) model (DGP 2), the power decreases as l becomes larger, indicating first order linear dependence. The power is also very good (69%) for small samples of 250 observations. In the case of the GARCH model (DGP 3), the rejection rate reaches 60% for sample sizes of 500 observations and above.

As expected, in Table 11.5 we observe that the goodness-of-fit test, Jn, has the largest power for DGP 1 and is not very powerful for DGP 2. It has reasonable power against DGP 3, mainly for samples of 1,000 observations and above.

We find a similar message in Table 11.6 when we analyze the power of the t-statistics. The tests are the most powerful to detect DGP 1, the least powerful to detect DGP 2,


Table 11.4. Power of the J^l_n-statistics under parameter uncertainty

T      J^1_13  J^2_13  J^3_13  J^4_13  J^5_13

Panel a: DGP 1
250     98.6    98.2    98.6    97.8    98.3
500    100.0   100.0   100.0   100.0   100.0
1000   100.0   100.0   100.0   100.0   100.0
2000   100.0   100.0   100.0   100.0   100.0

Panel b: DGP 2
250     68.9    40.2    26.6    19.3    16.5
500     93.6    60.0    38.1    27.9    20.4
1000    99.9    84.8    58.0    39.2    28.9
2000   100.0    99.4    83.7    59.8    40.6

Panel c: DGP 3
250     35.5    36.0    32.9    31.9    31.9
500     62.8    61.6    60.5    61.4    60.3
1000    90.5    88.8    88.1    86.9    86.7
2000    99.4    99.6    99.7    98.9    99.2

and acceptable power against DGP 3 for samples of 1,000 observations and above. There is a substantial drop in power for the t11 test (90% contour) for the cases of DGP 1 and DGP 3. This behavior is similar to that encountered in the univariate tests of Gonzalez-Rivera, Senyuz, and Yoldas (2007). This result is due to the specific density under the null. In the case of DGP 1, for some contour coverage levels the normal density and the Student-t are very similar. Hence it is very difficult for any test to discriminate the null from the alternative with respect to the coverage level of those contour planes. A similar argument applies to DGP 3 as well, as the GARCH structure in the conditional covariance matrix is associated with a non-normal unconditional density.

4. Empirical applications

In this section we apply the proposed testing methodology to the generalized residuals of multivariate GARCH models fitted to US stock return data. Our data set consists of

Table 11.5. Power of the Jn-statistics (n = 13) under parameter uncertainty

T      DGP 1   DGP 2   DGP 3
250     99.1    12.4    19.7
500    100.0    12.1    44.5
1000   100.0    12.9    70.2
2000   100.0    14.2    94.7


Table 11.6. Power of the t-statistics under parameter uncertainty

T      t1    t2    t3    t4    t5    t6    t7    t8    t9    t10   t11   t12   t13

Panel a: DGP 1
250   23.1  55.3  76.6  91.8  96.1  97.7  98.0  96.6  89.9  59.6   8.5  33.7  85.2
500   32.3  80.6  95.3  99.5 100.0 100.0 100.0 100.0  99.4  85.6   8.6  57.8  98.5
1000  49.7  97.4  99.9 100.0 100.0 100.0 100.0 100.0 100.0  98.9  14.0  78.7 100.0
2000  75.4  99.9 100.0 100.0 100.0 100.0 100.0 100.0 100.0 100.0  16.2  94.9 100.0

Panel b: DGP 2
250    3.3   4.7   8.4  11.2  11.1  12.4  13.4  11.0   7.3   6.7   9.7  11.6   3.5
500    3.6   5.6   7.6  11.5  12.8  11.5  11.8  11.0   8.9   7.0   7.2  10.9  13.1
1000   5.1   6.4   8.4  11.2  13.5  14.0  11.7  11.9   9.6   7.1   7.9  11.9  13.2
2000   4.4   6.7   9.2  10.8  13.3  15.3  14.6  11.6   9.5   8.7   8.7  12.3  14.0

Panel c: DGP 3
250    5.6   7.2  10.7  12.8  15.3  17.6  18.5  18.7  14.6   8.3   6.3   9.0  17.0
500    7.2  11.9  17.7  25.5  33.4  38.3  41.5  41.1  32.6  15.6   5.3  20.0  48.0
1000   8.1  20.5  31.4  46.3  58.6  64.3  68.7  67.1  59.1  32.1   8.6  34.8  70.4
2000  13.5  35.3  56.8  77.7  86.7  91.5  92.8  91.8  85.4  54.7   9.5  60.0  93.5

daily excess returns on five size portfolios, i.e. portfolios sorted with respect to market capitalization in increasing order.3 The sample period runs from January 2, 1996 to December 29, 2006, providing a total of 2,770 observations. A plot of the data is provided in Figure 11.2.

As we are working with daily data, we assume a constant conditional mean vector. In terms of the multivariate GARCH specifications, we consider two popular alternatives: the BEKK model of Engle and Kroner (1995) and the DCC model of Engle (2002a). Define u_t = y_t − μ, where μ is the constant conditional mean vector. Then the BEKK(1, 1, K) specification for the conditional covariance matrix, H_t ≡ E[u_t u′_t | F_{t−1}], is given by

H_t = C′C + Σ_{j=1}^{K} A′_j u_{t−1} u′_{t−1} A_j + Σ_{j=1}^{K} G′_j H_{t−1} G_j.   (9)

In our applications we set K = 1 and, for parsimony, use the scalar version of the model, where A = αI_k, G = βI_k, and α and β are scalars. We also use variance targeting to facilitate estimation, i.e. we set C′C = V − A′V A − G′V G, where V = E[u_t u′_t]; see, e.g., Ding and Engle (2001).

In the DCC specification, conditional variances and conditional correlations are modeled separately. Specifically, consider the following decomposition of the conditional covariance matrix: H_t = D_t R_t D_t, where D_t = diag{h^{1/2}_{11,t}, . . . , h^{1/2}_{kk,t}}, and each element of D_t is modeled as an individual GARCH process. In our applications, we consider the

3 Data is obtained from Kenneth French's website: http://mba.tuck.dartmouth.edu/pages/faculty/ken.french. We are grateful to him for making this data publicly available.



Fig. 11.2. Daily excess returns on five size portfolios (1/2/1996–12/29/2006), from the smallest quintile portfolio to the largest quintile portfolio


standard GARCH(1,1) process:

h_{ii,t} = ω_i + α_i u²_{i,t−1} + β_i h_{ii,t−1},   i = 1, . . . , k.

Now define z_t = D_t^{−1} u_t; then R_t = diag{Q_t}^{−1/2} Q_t diag{Q_t}^{−1/2}, where

Q_t = (1 − α − β)Q̄ + α z_{t−1} z′_{t−1} + β Q_{t−1},   (10)

and Q̄ = E[z_t z′_t].

Under both BEKK and DCC specifications, we consider the two alternative distributional assumptions that are most commonly used in empirical applications involving multivariate GARCH models: the multivariate Normal and multivariate Student-t distributions. Under multivariate normality, the sample log-likelihood function, up to a constant, is given by

L_T(θ) = −(1/2) Σ_{t=1}^{T} ln[det(H_t)] − (1/2) Σ_{t=1}^{T} u′_t H_t^{−1} u_t.   (11)

In the case of the DCC model, a two-step estimation procedure is applicable under normality as one can write the total likelihood as the sum of two parts, where the former depends on the individual GARCH parameters and the latter on the correlation parameters. Under this estimation strategy, consistency is still guaranteed to hold. For further details on two-step estimation in the DCC model, the interested reader is referred to Engle (2002a), and Engle and Sheppard (2001). Under the assumption of a multivariate Student-t distribution, we do not need to estimate the model with the corresponding likelihood, as the estimates obtained under normality are consistent due to the quasi-maximum likelihood interpretation. Therefore, we obtain the standardized residual vectors under normality and then simply test the Student-t assumption on these residuals.4 One remaining issue in the case of the Student-t distribution is the choice of the degrees of freedom. We follow Pesaran and Zaffaroni (2008) and obtain estimates of the degrees of freedom parameters for all series separately and then consider an average of the individual estimates for the distributional specification in the multivariate model.
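As an illustration of the recursions above, a scalar BEKK(1,1,1) filter with variance targeting and the Gaussian quasi log-likelihood of equation (11) can be sketched as follows. This is a minimal sketch on placeholder data: `bekk_scalar_loglik` is a hypothetical helper, the parameter values are illustrative rather than estimates, and a and b stand for the squared scalar parameters α² and β².

```python
import numpy as np

def bekk_scalar_loglik(u, a, b):
    """Gaussian quasi log-likelihood (up to a constant) of a scalar
    BEKK(1,1,1) with variance targeting. With A = alpha*I and G = beta*I,
    A'uu'A = alpha^2 uu' and G'HG = beta^2 H, so writing a = alpha^2 and
    b = beta^2 the recursion is
        H_t = (1 - a - b) V + a u_{t-1} u_{t-1}' + b H_{t-1},
    which enforces C'C = V - A'VA - G'VG with V = E[u_t u_t']."""
    T, k = u.shape
    V = np.cov(u.T, bias=True)          # variance target
    H = V.copy()
    ll = 0.0
    for t in range(T):
        _, logdet = np.linalg.slogdet(H)
        ll -= 0.5 * (logdet + u[t] @ np.linalg.solve(H, u[t]))
        H = (1.0 - a - b) * V + a * np.outer(u[t], u[t]) + b * H
    return ll

rng = np.random.default_rng(1)
u = rng.standard_normal((500, 2))       # placeholder demeaned returns
ll = bekk_scalar_loglik(u, a=0.05, b=0.90)
```

With a + b < 1 and a positive-definite variance target, the filtered H_t stays positive definite, so the log-determinant and quadratic form are always well defined.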

The results are summarized in Figures 11.3 through 11.6 and Table 11.7. From the figures we observe that under both GARCH specifications, the J^l_n-statistics are highly statistically significant when multivariate normality is the maintained distributional assumption. The J^l_n-statistics of the BEKK model are larger than those obtained under the DCC specification. Furthermore, there is an obvious pattern in the behavior of the statistics as a function of the lag order, especially under the BEKK specification. This indicates that the rejection is partly due to remaining dependence in the model residuals. When we switch to the multivariate Student-t distribution with 11 degrees of freedom,5 the J^l_n-statistics go down substantially under both multivariate GARCH specifications. Hence, we can argue that the distributional assumption plays a greater role in the rejection of both models under normality. The J^l_n-statistics are barely significant

4 Note that in the specification of the multivariate Student-t distribution (see Section 2), the covariance matrix is already scaled to be an identity matrix; thus no re-scaling of residuals is necessary to implement the test, e.g. Harvey, Ruiz and Sentana (1992).

5 This value is obtained by averaging individual degrees of freedom estimates obtained from individual GARCH models under the Student-t density.



Fig. 11.3. J l13-statistics of BEKK model under multivariate Normal distribution


Fig. 11.4. J l13-statistics of DCC model under multivariate Normal distribution


Fig. 11.5. J l13-statistics of BEKK model under multivariate Student-t distribution



Fig. 11.6. J l13-statistics of DCC model under multivariate Student-t distribution

at the 5% level for only a few lag values under the DCC specification coupled with the multivariate Student-t distribution. However, under the BEKK specification, the J^l_n-statistics are significant at early lags, even at the 1% level. Table 11.7 reports individual t-statistics and the Jn-statistics. Both types of test statistics indicate that normality is very strongly rejected under both GARCH specifications. Similar to the case of the J^l_n-statistics, the results dramatically change when the distributional assumption is altered to multivariate Student-t. The DCC model produces better results with respect to both types of test statistics, but the chi-squared test in particular strongly supports the DCC specification compared to the BEKK model. Combining the information from all test statistics we can

Table 11.7. Individual t and J13-statistics for estimated GARCH models

       BEKK Normal   DCC Normal   BEKK Student-t   DCC Student-t
t1        −1.85        −2.17          2.78             2.30
t2        −8.52       −10.18         −0.31            −0.38
t3        −9.97       −12.26          1.00            −0.64
t4        −9.37       −11.22          0.84            −0.10
t5       −10.34       −11.81          2.47             0.18
t6       −11.54       −10.95          1.13             0.95
t7        −9.28       −10.03          0.09             0.50
t8        −6.85        −7.19          0.25             0.59
t9        −2.74        −5.70          0.92            −0.32
t10        0.24        −1.52          0.66            −0.89
t11        5.39         2.17          0.08            −3.51
t12        8.23         5.58          1.00            −1.30
t13       12.18        12.50          1.26             0.74

J13      351.47       388.54         30.07            24.35


conclude that multivariate normality is a bad assumption to make regardless of the multivariate GARCH specification. Furthermore, the DCC model with a multivariate Student-t distribution does a good job in terms of capturing dependence and producing a reasonable fit with respect to density functional form.

5. Concluding remarks

Motivated by the relative scarcity of tests for dynamic specification and density functional form in multivariate time series models, we proposed a new battery of tests based on the concept of the "autocontour" introduced by Gonzalez-Rivera, Senyuz, and Yoldas (2007) for univariate processes. We developed t-tests based on a single autocontour and also more powerful chi-squared tests based on multiple autocontours, which have standard asymptotic distributions. We also developed a second type of chi-squared test statistic, which is informative as a goodness-of-fit test when combined with the first type of chi-squared test. Monte Carlo simulations indicate that the tests have good size and power against dynamic mis-specification and deviations from the hypothesized density. We applied our methodology to multivariate GARCH models and showed that the DCC specification of Engle (2002a), coupled with a multivariate Student-t distribution, provides a fine model for multivariate time dependence in a relatively large system of stock returns.


12

Modeling Autoregressive Conditional Skewness and Kurtosis with Multi-Quantile CAViaR

Halbert White, Tae-Hwan Kim, and Simone Manganelli

1. Introduction

It is widely recognized that the use of higher moments, such as skewness and kurtosis, can be important for improving the performance of various financial models. Responding to this recognition, researchers and practitioners have started to incorporate these higher moments into their models, mostly using the conventional measures, e.g. the sample skewness and/or the sample kurtosis. Models of conditional counterparts of the sample skewness and the sample kurtosis, based on extensions of the GARCH model, have also been developed and used; see, for example, Leon, Rubio, and Serna (2004). Nevertheless, Kim and White (2004) point out that because standard measures of skewness and kurtosis are essentially based on averages, they can be sensitive to one or a few outliers – a regular feature of financial returns data – making their reliability doubtful.

To deal with this, Kim and White (2004) propose the use of more stable and robust measures of skewness and kurtosis, based on quantiles rather than averages. Nevertheless, Kim and White (2004) only discuss unconditional skewness and kurtosis measures. In this chapter, we extend the approach of Kim and White (2004) by proposing conditional quantile-based skewness and kurtosis measures. For this, we extend Engle and Manganelli's (2004) univariate CAViaR model to a multi-quantile version, MQ-CAViaR. This allows for both a general vector autoregressive structure in the conditional quantiles and the presence of exogenous variables. We then use the MQ-CAViaR model to specify conditional versions of the more robust skewness and kurtosis measures discussed in Kim and White (2004).

The chapter is organized as follows. In Section 2, we develop the MQ-CAViaR data generating process (DGP). In Section 3, we propose a quasi-maximum likelihood estimator for the MQ-CAViaR process and prove its consistency and asymptotic normality. In Section 4, we show how to consistently estimate the asymptotic variance-covariance matrix of the MQ-CAViaR estimator. Section 5 specifies conditional quantile-based measures of skewness and kurtosis based on MQ-CAViaR estimates. Section 6 contains an empirical application of our methods to the S&P 500 index. We also report results of a simulation experiment designed to examine the finite sample behavior of our estimator. Section 7 contains a summary and concluding remarks. Mathematical proofs are gathered into an Appendix.

2. The MQ-CAViaR process and model

We consider data generated as a realization of the following stochastic process.

Assumption 1 The sequence {(Y_t, X′_t) : t = 0, ±1, ±2, . . .} is a stationary and ergodic stochastic process on the complete probability space (Ω, F, P_0), where Y_t is a scalar and X_t is a countably dimensioned vector whose first element is one.

Let F_{t−1} be the σ-algebra generated by Z^{t−1} ≡ {X_t, (Y_{t−1}, X_{t−1}), . . .}, i.e. F_{t−1} ≡ σ(Z^{t−1}). We let F_t(y) ≡ P_0[Y_t < y | F_{t−1}] define the cumulative distribution function (CDF) of Y_t conditional on F_{t−1}.

Let 0 < θ_1 < . . . < θ_p < 1. For j = 1, . . . , p, the θ_j-th quantile of Y_t conditional on F_{t−1}, denoted q∗_{j,t}, is

q∗_{j,t} ≡ inf{y : F_t(y) = θ_j},   (1)

and if F_t is strictly increasing,

q∗_{j,t} = F_t^{−1}(θ_j).

Alternatively, q∗_{j,t} can be represented as

∫_{−∞}^{q∗_{j,t}} dF_t(y) = E[1[Y_t ≤ q∗_{j,t}] | F_{t−1}] = θ_j,   (2)

where dF_t is the Lebesgue-Stieltjes differential for Y_t conditional on F_{t−1}, corresponding to F_t.

Our objective is to jointly estimate the conditional quantile functions q∗_{j,t}, j = 1, 2, . . . , p. For this we write q∗_t ≡ (q∗_{1,t}, . . . , q∗_{p,t})′ and impose additional appropriate structure.

First, we ensure that the conditional distribution of Y_t is everywhere continuous, with positive density at each conditional quantile of interest, q∗_{j,t}. We let f_t denote the conditional probability density function (PDF) corresponding to F_t. In stating our next condition (and where helpful elsewhere), we make explicit the dependence of the conditional CDF F_t on ω by writing F_t(ω, y) in place of F_t(y). Realized values of the conditional quantiles are correspondingly denoted q∗_{j,t}(ω). Similarly, we write f_t(ω, y) in place of f_t(y).

After ensuring this continuity, we impose specific structure on the quantiles of interest.

Assumption 2 (i) Y_t is continuously distributed such that for each t and each ω ∈ Ω, F_t(ω, ·) and f_t(ω, ·) are continuous on R; (ii) For given 0 < θ_1 < . . . < θ_p < 1 and {q∗_{j,t}} as defined above, suppose: (a) For each t and j = 1, . . . , p, f_t(ω, q∗_{j,t}(ω)) > 0; (b) For given finite integers k and m, there exist a stationary ergodic sequence of random k × 1 vectors {Ψ_t}, with Ψ_t measurable−F_{t−1}, and real vectors β∗_j ≡ (β∗_{j,1}, . . . , β∗_{j,k})′ and γ∗_{jτ} ≡ (γ∗_{jτ1}, . . . , γ∗_{jτp})′ such that for all t and j = 1, . . . , p,

q∗_{j,t} = Ψ′_t β∗_j + Σ_{τ=1}^{m} q∗′_{t−τ} γ∗_{jτ}.   (3)

The structure of (3) is a multi-quantile version of the CAViaR process introduced by Engle and Manganelli (2004). When γ∗_{jτi} = 0 for i ≠ j, we have the standard CAViaR process. Thus, we call processes satisfying our structure "Multi-Quantile CAViaR" (MQ-CAViaR) processes. For MQ-CAViaR, the number of relevant lags can differ across the conditional quantiles; this is reflected in the possibility that for given j, elements of γ∗_{jτ} may be zero for values of τ greater than some given integer. For notational simplicity, we do not represent m as depending on j. Nevertheless, by convention, for no τ ≤ m do we have γ∗_{jτ} equal to the zero vector for all j.

The finitely dimensioned random vectors Ψ_t may contain lagged values of Y_t, as well as measurable functions of X_t and lagged X_t and Y_t. In particular, Ψ_t may contain Stinchcombe and White's (1998) GCR transformations, as discussed in White (2006).
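To make the recursion in (3) concrete, the sketch below generates conditional quantile paths for p = 2 quantile levels with m = 1 and Ψ_t = (1, |Y_{t−1}|)′. Both the choice of Ψ_t and all parameter values here are hypothetical, chosen only so that the two quantile paths stay ordered.

```python
import numpy as np

# Sketch of the MQ-CAViaR recursion (3): for each quantile level j,
#   q_{j,t} = Psi_t' beta_j + sum_{tau=1}^{m} q'_{t-tau} gamma_{j,tau}.
rng = np.random.default_rng(0)
T, p = 500, 2
y = rng.standard_normal(T)

beta = np.array([[-1.0, -0.3],      # beta_1 (lower quantile)
                 [ 1.0,  0.3]])     # beta_2 (upper quantile)
gamma = np.array([[0.6, 0.0],       # gamma_{1,1}: q_{1,t} loads on q_{1,t-1}
                  [0.0, 0.6]])      # gamma_{2,1}: q_{2,t} loads on q_{2,t-1}

q = np.zeros((T, p))
q[0] = [-1.0, 1.0]                  # initialize at plausible quantiles
for t in range(1, T):
    psi = np.array([1.0, abs(y[t - 1])])   # illustrative Psi_t
    for j in range(p):
        q[t, j] = psi @ beta[j] + q[t - 1] @ gamma[j]
```

Setting the off-diagonal elements of gamma to zero recovers two standard univariate CAViaR recursions, as noted in the text; nonzero off-diagonal elements let one quantile's dynamics feed into another's.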

For a particular quantile, say θ_j, the coefficients to be estimated are β∗_j and γ∗_j ≡ (γ∗′_{j1}, . . . , γ∗′_{jm})′. Let α∗′_j ≡ (β∗′_j, γ∗′_j), and write α∗ = (α∗′_1, . . . , α∗′_p)′, an ℓ × 1 vector, where ℓ ≡ p(k + mp). We will call α∗ the "MQ-CAViaR coefficient vector." We estimate α∗ using a correctly specified model of the MQ-CAViaR process. First, we specify our model.

Assumption 3 Let A be a compact subset of R^ℓ. (i) The sequence of functions {q_t : Ω × A → R^p} is such that for each t and each α ∈ A, q_t(·, α) is measurable−F_{t−1}; for each t and each ω ∈ Ω, q_t(ω, ·) is continuous on A; and for each t and j = 1, . . . , p,

q_{j,t}(·, α) = Ψ′_t β_j + Σ_{τ=1}^{m} q_{t−τ}(·, α)′ γ_{jτ}.

Next, we impose correct specification and an identification condition. Assumption 4(i.a) delivers correct specification by ensuring that the MQ-CAViaR coefficient vector α∗ belongs to the parameter space, A. This ensures that α∗ optimizes the estimation objective function asymptotically. Assumption 4(i.b) delivers identification by ensuring that α∗ is the only such optimizer. In stating the identification condition, we define δ_{j,t}(α, α∗) ≡ q_{j,t}(·, α) − q_{j,t}(·, α∗) and use the norm ||α|| ≡ max_{i=1,...,ℓ} |α_i|.

Assumption 4 (i)(a) There exists α∗ ∈ A such that for all t,

q_t(·, α∗) = q∗_t;   (4)

(b) There exists a nonempty set J ⊆ {1, . . . , p} such that for each ε > 0 there exists δ_ε > 0 such that for all α ∈ A with ||α − α∗|| > ε,

P[∪_{j∈J} {|δ_{j,t}(α, α∗)| > δ_ε}] > 0.

Among other things, this identification condition ensures that there is sufficient variation in the shape of the conditional distribution to support estimation of a sufficient number (#J) of variation-free conditional quantiles. In particular, distributions that depend on a given finite number of parameters, say k, will generally be able to support k variation-free quantiles. For example, the quantiles of the N(μ, 1) distribution all depend on μ alone, so there is only one "degree of freedom" for the quantile variation. Similarly, the quantiles of scaled and shifted t-distributions depend on three parameters (location, scale, and kurtosis), so there are only three "degrees of freedom" for the quantile variation.

3. MQ-CAViaR estimation: Consistency and asymptotic normality

We estimate α∗ by the method of quasi-maximum likelihood. Specifically, we construct a quasi-maximum likelihood estimator (QMLE) α̂_T as the solution to the following optimization problem:

min_{α∈A} S_T(α) ≡ T^{−1} Σ_{t=1}^{T} { Σ_{j=1}^{p} ρ_{θj}(Y_t − q_{j,t}(·, α)) },   (5)

where ρ_θ(e) = e ψ_θ(e) is the standard "check function," defined using the usual quantile step function, ψ_θ(e) = θ − 1[e ≤ 0]. We thus view

S_t(α) ≡ −{ Σ_{j=1}^{p} ρ_{θj}(Y_t − q_{j,t}(·, α)) }

as the quasi log-likelihood for observation t. In particular, S_t(α) is the log-likelihood of a vector of p independent asymmetric double exponential random variables (see White, 1994, ch. 5.3; Kim and White, 2003; Komunjer, 2005). Because Y_t − q_{j,t}(·, α∗), j = 1, . . . , p, need not actually have this distribution, the method is quasi maximum likelihood.
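The check function and the multi-quantile objective in (5) are straightforward to evaluate. The sketch below, with simulated data and fixed candidate quantile paths rather than a full CAViaR recursion, illustrates that the objective is smaller at the (empirical) true quantiles than at a perturbed point; `rho` and `S_T` are hypothetical helper names.

```python
import numpy as np

def rho(e, theta):
    """Quantile check function: rho_theta(e) = e * (theta - 1[e <= 0])."""
    return e * (theta - (e <= 0).astype(float))

def S_T(y, q, thetas):
    """Multi-quantile objective (5): average over t of the check losses
    summed across the p quantile levels; q has one column per level."""
    return np.mean(sum(rho(y - q[:, j], th) for j, th in enumerate(thetas)))

rng = np.random.default_rng(0)
y = rng.standard_normal(100_000)
thetas = [0.25, 0.75]
true_q = np.quantile(y, thetas)

# Constant candidate quantile paths: the empirical quantiles vs. a shift.
q_true = np.tile(true_q, (y.size, 1))
q_bad = q_true + 0.5
```

Since each check loss is minimized (in population) at the corresponding conditional quantile, the summed objective is minimized by the vector of true quantiles, which is what makes (5) a valid estimation criterion.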

We can establish the consistency of α̂_T by applying results of White (1994). For this we impose the following moment and domination conditions. In stating this next condition, and where convenient elsewhere, we exploit stationarity to omit explicit reference to all values of t.

Assumption 5 (i) E|Y_t| < ∞; (ii) let D_{0,t} ≡ max_{j=1,...,p} sup_{α∈A} |q_{j,t}(·, α)|, t = 1, 2, . . .. Then E(D_{0,t}) < ∞.

We now have conditions sufficient to establish the consistency of α̂_T.

Theorem 1 Suppose that Assumptions 1, 2(i,ii), 3(i), 4(i), and 5(i,ii) hold. Then α̂_T → α∗ almost surely.

Next, we establish the asymptotic normality of T^{1/2}(α̂_T − α∗). We use a method originally proposed by Huber (1967) and later extended by Weiss (1991). We first sketch the method before providing formal conditions and results.


Huber's method applies to our estimator α̂_T, provided that α̂_T satisfies the asymptotic first order conditions

T^{−1} Σ_{t=1}^{T} { Σ_{j=1}^{p} ∇q_{j,t}(·, α̂_T) ψ_{θj}(Y_t − q_{j,t}(·, α̂_T)) } = o_p(T^{−1/2}),   (6)

where ∇q_{j,t}(·, α) is the ℓ × 1 gradient vector with elements (∂/∂α_i) q_{j,t}(·, α), i = 1, . . . , ℓ, and ψ_{θj}(Y_t − q_{j,t}(·, α̂_T)) is a generalized residual. Our first task is thus to ensure that (6) holds.

Next, we define

λ(α) ≡ Σ_{j=1}^{p} E[∇q_{j,t}(·, α) ψ_{θj}(Y_t − q_{j,t}(·, α))].

With λ continuously differentiable at α∗ interior to A, we can apply the mean value theorem to obtain

λ(α) = λ(α∗) + Q_0(α − α∗),   (7)

where Q_0 is an ℓ × ℓ matrix with (1 × ℓ) rows Q_{0,i} = ∇′λ(α^{(i)}), where α^{(i)} is a mean value (different for each i) lying on the segment connecting α and α∗, i = 1, . . . , ℓ. It is straightforward to show that correct specification ensures that λ(α∗) is zero. We will also show that

Q_0 = −Q∗ + O(||α − α∗||),   (8)

where Q∗ ≡ Σ_{j=1}^{p} E[f_{j,t}(0) ∇q_{j,t}(·, α∗) ∇′q_{j,t}(·, α∗)], with f_{j,t}(0) the value at zero of the density f_{j,t} of ε_{j,t} ≡ Y_t − q_{j,t}(·, α∗), conditional on F_{t−1}. Combining (7) and (8) and putting λ(α∗) = 0, we obtain

λ(α) = −Q∗(α − α∗) + O(||α − α∗||²).   (9)

The next step is to show that

T^{1/2} λ(α̂_T) + H_T = o_p(1),   (10)

where H_T ≡ T^{−1/2} Σ_{t=1}^{T} η∗_t, with η∗_t ≡ Σ_{j=1}^{p} ∇q_{j,t}(·, α∗) ψ_{θj}(ε_{j,t}). Equations (9) and (10) then yield the following asymptotic representation of our estimator α̂_T:

T^{1/2}(α̂_T − α∗) = Q∗^{−1} T^{−1/2} Σ_{t=1}^{T} η∗_t + o_p(1).   (11)

As we impose conditions sufficient to ensure that {η∗_t, F_t} is a martingale difference sequence (MDS), a suitable central limit theorem (e.g., theorem 5.24 in White, 2001) applies to (11) to yield the desired asymptotic normality of α̂_T:

T^{1/2}(α̂_T − α∗) →_d N(0, Q∗^{−1} V∗ Q∗^{−1}),   (12)

where V∗ ≡ E(η∗_t η∗′_t).
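The outer matrix of the sandwich covariance in (12) can be estimated directly from the generalized residuals. The sketch below uses simulated placeholder arrays for the gradients and quantile residuals (all names and inputs are hypothetical), and ignores the density-weighted inner matrix Q∗, whose estimation the chapter treats separately in Section 4.

```python
import numpy as np

# Sketch: estimate V* = E[eta_t eta_t'] from generalized residuals.
# grad[j] is a (T, l) array of gradients of q_{j,t}; eps[:, j] holds the
# quantile residuals Y_t - q_{j,t}; thetas are the quantile levels.
rng = np.random.default_rng(0)
T, l, thetas = 1000, 3, [0.05, 0.5, 0.95]
grad = [rng.standard_normal((T, l)) for _ in thetas]   # placeholder gradients
eps = rng.standard_normal((T, len(thetas)))            # placeholder residuals

def psi(e, theta):
    # Quantile step function: psi_theta(e) = theta - 1[e <= 0]
    return theta - (e <= 0).astype(float)

# eta_t = sum_j grad_{j,t} * psi_{theta_j}(eps_{j,t}), stacked as rows
eta = sum(grad[j] * psi(eps[:, j], th)[:, None]
          for j, th in enumerate(thetas))
V_hat = eta.T @ eta / T                                # estimate of V*
```

Because V̂ is a scaled Gram matrix of the η̂_t vectors, it is symmetric and positive semi-definite by construction, matching the positive definiteness required of V∗ in Assumption 6(ii) when the scores have full rank.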


We now strengthen the conditions above to ensure that each step of the above argument is valid.

Assumption 2 (iii) (a) There exists a finite positive constant f_0 such that for each t, each ω ∈ Ω, and each y ∈ R, f_t(ω, y) ≤ f_0 < ∞; (b) There exists a finite positive constant L_0 such that for each t, each ω ∈ Ω, and each y_1, y_2 ∈ R, |f_t(ω, y_1) − f_t(ω, y_2)| ≤ L_0 |y_1 − y_2|.

Next we impose sufficient differentiability of q_t with respect to α.

Assumption 3 (ii) For each t and each ω ∈ Ω, q_t(ω, ·) is continuously differentiable on A; (iii) For each t and each ω ∈ Ω, q_t(ω, ·) is twice continuously differentiable on A.

To exploit the mean value theorem, we require that α∗ belongs to the interior of A, int(A).

Assumption 4 (ii) α∗ ∈ int(A).

Next, we place domination conditions on the derivatives of q_t.

Assumption 5 (iii) Let D1,t ≡ max_{j=1,...,p} max_{i=1,...,ℓ} sup_{α∈A} |(∂/∂αi)qj,t(·, α)|, t = 1, 2, . . .. Then (a) E(D1,t) < ∞; (b) E(D1,t²) < ∞; (iv) Let D2,t ≡ max_{j=1,...,p} max_{i=1,...,ℓ} max_{h=1,...,ℓ} sup_{α∈A} |(∂²/∂αi∂αh)qj,t(·, α)|, t = 1, 2, . . .. Then (a) E(D2,t) < ∞; (b) E(D2,t²) < ∞.

Assumption 6 (i) Q∗ ≡ ∑_{j=1}^p E[fj,t(0)∇qj,t(·, α∗)∇′qj,t(·, α∗)] is positive definite; (ii) V∗ ≡ E(η∗t η∗′t) is positive definite.

Assumptions 3(ii) and 5(iii.a) are additional assumptions helping to ensure that (6) holds. Further imposing Assumptions 2(iii), 3(iii.a), 4(ii), and 5(iv.a) suffices to ensure that (9) holds. The additional regularity provided by Assumptions 5(iii.b), 5(iv.b), and 6(i) ensures that (10) holds. Assumptions 5(iii.b) and 6(ii) help ensure the availability of the MDS central limit theorem.

We now have conditions sufficient to ensure asymptotic normality of our MQ-CAViaR estimator. Formally, we have

Theorem 2 Suppose that Assumptions 1–6 hold. Then

V∗^{−1/2}Q∗T^{1/2}(αT − α∗) d→ N(0, I).

Theorem 2 shows that our QML estimator αT is asymptotically normal with asymptotic covariance matrix Q∗^{−1}V∗Q∗^{−1}. There is, however, no guarantee that αT is asymptotically efficient. There is now a considerable literature investigating efficient estimation in quantile models; see, for example, Newey and Powell (1990), Otsu (2003), and Komunjer and Vuong (2006, 2007a, 2007b). So far, this literature has only considered single quantile models. It is not obvious how the results for single quantile models extend to multi-quantile models such as ours. Nevertheless, Komunjer and Vuong (2007a) show that the class of QML estimators is not large enough to include an efficient estimator, and that the class of M-estimators, which strictly includes the QMLE class, yields an estimator that attains the efficiency bound. Specifically, they show that replacing the usual quantile check function ρθj appearing in (5) with

ρ∗θj(Yt − qj,t(·, α)) = (θj − 1[Yt−qj,t(·,α)≤0])(Ft(Yt) − Ft(qj,t(·, α)))


will deliver an asymptotically efficient quantile estimator under the single quantile restriction. We conjecture that replacing ρθj with ρ∗θj in (5) will improve estimator efficiency. We leave the study of the asymptotically efficient multi-quantile estimator for future work.
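The contrast between the usual check function and the efficient variant can be illustrated numerically. The sketch below is illustrative only (it is not the authors' code) and assumes, purely for the illustration, a known standard normal conditional CDF Ft = Φ; in practice Ft is unknown and would have to be estimated. Both losses have their expected value minimized at the true θ-quantile.

```python
import numpy as np
from scipy.stats import norm

def rho(u, theta):
    # standard quantile check function: rho_theta(u) = u * (theta - 1[u <= 0])
    return u * (theta - (u <= 0))

def rho_star(y, q, theta, F=norm.cdf):
    # efficient variant: the raw gap y - q is replaced by F(y) - F(q)
    return (theta - ((y - q) <= 0)) * (F(y) - F(q))

# expected rho* loss is smallest at the true quantile q* = F^{-1}(theta)
theta = 0.25
y = np.random.default_rng(6).standard_normal(100_000)
q_true, q_wrong = norm.ppf(theta), 0.0
loss_true = rho_star(y, q_true, theta).mean()
loss_wrong = rho_star(y, q_wrong, theta).mean()
```

Because Ft is strictly increasing, ρ∗θ is just the check loss applied to the probability-integral transforms, which is why its minimizer is unchanged.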

4. Consistent covariance matrix estimation

To test restrictions on α∗ or to obtain confidence intervals, we require a consistent estimator of the asymptotic covariance matrix C∗ ≡ Q∗^{−1}V∗Q∗^{−1}. First, we provide a consistent estimator V̂T for V∗; then we give a consistent estimator Q̂T for Q∗. It follows that ĈT ≡ Q̂T^{−1}V̂T Q̂T^{−1} is a consistent estimator for C∗.

Recall that V∗ ≡ E(η∗t η∗′t), with η∗t ≡ ∑_{j=1}^p ∇qj,t(·, α∗)ψθj(εj,t). A straightforward plug-in estimator of V∗ is

V̂T ≡ T^{−1} ∑_{t=1}^T η̂t η̂′t, with

η̂t ≡ ∑_{j=1}^p ∇qj,t(·, αT)ψθj(ε̂j,t)

ε̂j,t ≡ Yt − qj,t(·, αT).

We already have conditions sufficient to deliver the consistency of V̂T for V∗. Formally, we have

Theorem 3 Suppose that Assumptions 1–6 hold. Then V̂T p→ V∗.

Next, we provide a consistent estimator of

Q∗ ≡ ∑_{j=1}^p E[fj,t(0)∇qj,t(·, α∗)∇′qj,t(·, α∗)].

We follow Powell's (1984) suggestion of estimating fj,t(0) with 1[−ĉT ≤ ε̂j,t ≤ ĉT]/2ĉT for a suitably chosen sequence {ĉT}. This is also the approach taken in Kim and White (2003) and Engle and Manganelli (2004). Accordingly, our proposed estimator is

Q̂T = (2ĉT T)^{−1} ∑_{t=1}^T ∑_{j=1}^p 1[−ĉT ≤ ε̂j,t ≤ ĉT] ∇qj,t(·, αT)∇′qj,t(·, αT).
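The two estimators can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation; the array names (`grads`, `resids`) and the simulated demo inputs are assumptions introduced for the sketch.

```python
import numpy as np

def covariance_estimators(grads, resids, thetas, c_T):
    """Sketch of the plug-in V and Powell-kernel Q covariance estimators.

    grads  : (T, p, ell) gradients of q_{j,t} w.r.t. alpha, at the estimate
    resids : (T, p)      residuals eps_{j,t} = Y_t - q_{j,t}
    thetas : (p,)        quantile levels theta_j
    c_T    : bandwidth for the kernel density estimate at zero
    """
    T, p, ell = grads.shape
    # eta_t = sum_j grad_{j,t} * psi_{theta_j}(eps_{j,t}), psi_theta(e) = theta - 1[e <= 0]
    psi = thetas[None, :] - (resids <= 0).astype(float)          # (T, p)
    eta = np.einsum('tjk,tj->tk', grads, psi)                    # (T, ell)
    V_hat = eta.T @ eta / T
    # Q = (2 c_T T)^{-1} sum_{t,j} 1[-c_T <= eps <= c_T] grad grad'
    in_band = (np.abs(resids) <= c_T).astype(float)              # (T, p)
    Q_hat = np.einsum('tj,tjk,tjl->kl', in_band, grads, grads) / (2 * c_T * T)
    Q_inv = np.linalg.inv(Q_hat)
    C_hat = Q_inv @ V_hat @ Q_inv                                # sandwich estimator
    return V_hat, Q_hat, C_hat

# toy demo with simulated inputs (purely illustrative)
rng = np.random.default_rng(0)
grads = rng.normal(size=(200, 2, 3))
resids = rng.normal(size=(200, 2))
V_hat, Q_hat, C_hat = covariance_estimators(grads, resids, np.array([0.25, 0.75]), 1.0)
```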

To establish consistency, we strengthen the domination condition on ∇qj,t and impose conditions on {ĉT}.

Assumption 5 (iii.c) E(D1,t³) < ∞.

Assumption 7 {ĉT} is a stochastic sequence and {cT} is a nonstochastic sequence such that (i) ĉT/cT p→ 1; (ii) cT = o(1); and (iii) cT^{−1} = o(T^{1/2}).

Theorem 4 Suppose that Assumptions 1–7 hold. Then Q̂T p→ Q∗.


5. Quantile-based measures of conditional skewness and kurtosis

Moments of asset returns of order higher than two are important because these permit a recognition of the multi-dimensional nature of the concept of risk. Such higher order moments have thus proved useful for asset pricing, portfolio construction, and risk assessment. See, for example, Hwang and Satchell (1999) and Harvey and Siddique (2000). Higher order moments that have received particular attention are skewness and kurtosis, which involve moments of order three and four, respectively. Indeed, it is widely held as a “stylized fact” that the distribution of stock returns exhibits both left skewness and excess kurtosis (fat tails); there is a large amount of empirical evidence to this effect.

Recently, Kim and White (2004) have challenged this stylized fact and the conventional way of measuring skewness and kurtosis. As moments, skewness and kurtosis are computed using averages, specifically, averages of third and fourth powers of standardized random variables. Kim and White (2004) point out that averages are sensitive to outliers, and that taking third or fourth powers greatly enhances the influence of any outliers that may be present. Moreover, asset returns are particularly prone to containing outliers, as the result of crashes or rallies. According to Kim and White's simulation study, even a single outlier of a size comparable to the sharp drop in stock returns caused by the 1987 stock market crash can generate dramatic irregularities in the behavior of the traditional moment-based measures of skewness and kurtosis.

Kim and White (2004) propose using more robust measures instead, based on sample quantiles. For example, Bowley's (1920) coefficient of skewness is given by

SK2 = (q∗3 + q∗1 − 2q∗2)/(q∗3 − q∗1),

where q∗1 = F^{−1}(0.25), q∗2 = F^{−1}(0.5), and q∗3 = F^{−1}(0.75), where F(y) ≡ P0[Yt < y] is the unconditional CDF of Yt. Similarly, Crow and Siddiqui's (1967) coefficient of kurtosis is given by

KR4 = (q∗4 − q∗0)/(q∗3 − q∗1) − 2.91,

where q∗0 = F^{−1}(0.025) and q∗4 = F^{−1}(0.975). (The notations SK2 and KR4 correspond to those of Kim and White, 2004.)
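The quantile-based measures are straightforward to compute from any five quantiles. A minimal sketch (illustrative, not the chapter's code): for a large standard normal sample both measures should be close to zero, which is exactly what the centering constant 2.91 in KR4 is for.

```python
import numpy as np

def bowley_skewness(q1, q2, q3):
    # SK2 = (q3 + q1 - 2*q2) / (q3 - q1), with q1, q2, q3 the 25%, 50%, 75% quantiles
    return (q3 + q1 - 2.0 * q2) / (q3 - q1)

def crow_siddiqui_kurtosis(q0, q1, q3, q4):
    # KR4 = (q4 - q0) / (q3 - q1) - 2.91, with q0, q4 the 2.5% and 97.5% quantiles;
    # 2.91 centers the measure at zero for the standard normal
    return (q4 - q0) / (q3 - q1) - 2.91

# sanity check on simulated N(0,1) data: SK2 ~ 0 and KR4 ~ 0
x = np.random.default_rng(1).standard_normal(100_000)
q0, q1, q2, q3, q4 = np.quantile(x, [0.025, 0.25, 0.5, 0.75, 0.975])
sk2 = bowley_skewness(q1, q2, q3)
kr4 = crow_siddiqui_kurtosis(q0, q1, q3, q4)
```

Replacing the unconditional sample quantiles by conditional ones gives the conditional measures used below.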

A limitation of these measures is that they are based on unconditional sample quantiles. Thus, in measuring skewness or kurtosis, these can neither incorporate useful information contained in relevant exogenous variables nor exploit the dynamic evolution of quantiles over time. To avoid these limitations, we propose constructing measures of conditional skewness and kurtosis using conditional quantiles q∗j,t in place of the unconditional quantiles q∗j. In particular, the conditional Bowley coefficient of skewness and


the conditional Crow and Siddiqui coefficient of kurtosis are given by

CSK2 = (q∗3,t + q∗1,t − 2q∗2,t)/(q∗3,t − q∗1,t),

CKR4 = (q∗4,t − q∗0,t)/(q∗3,t − q∗1,t) − 2.91.

Another quantile-based kurtosis measure discussed in Kim and White (2004) is Moors's (1988) coefficient of kurtosis, which involves computing six quantiles. Because our approach requires joint estimation of all relevant quantiles, and, in our model, each quantile depends not only on its own lags, but also possibly on the lags of other quantiles, the number of parameters to be estimated can be quite large. Moreover, if the θj's are too close to each other, then the corresponding quantiles may be highly correlated, which can result in an analog of multicollinearity. For these reasons, in what follows we focus only on SK2 and KR4, as these require jointly estimating at most five quantiles.

6. Application and simulation

6.1. Time-varying skewness and kurtosis for the S&P 500

In this section we obtain estimates of time-varying skewness and kurtosis for the S&P 500 index daily returns. Figure 12.1 plots the S&P 500 daily returns series used for estimation. The sample ranges from January 1, 1999 to September 28, 2007, for a total of 2,280 observations.


Fig. 12.1. S&P 500 daily returns: January 1, 1999–September 30, 2007


Table 12.1. S&P 500 index: estimation results for the LRS model

  β1      β2      β3      β4      β5      β6      β7      β8      β9
 0.01    0.05    0.94   −0.04    0.01    0.01    3.25    0.00    0.00
(0.18)  (0.19)  (0.04)  (0.15)  (0.01)  (0.02)  (0.04)  (0.00)  (0.00)

Standard errors are in parentheses.

First, we estimate time-varying skewness and kurtosis using the GARCH-type model of Leon, Rubio, and Serna (2004), the LRS model for short. Letting rt denote the return for day t, we estimate the following specification of their model:

rt = ht^{1/2} ηt
ht = β1 + β2 r²t−1 + β3 ht−1
st = β4 + β5 η³t−1 + β6 st−1
kt = β7 + β8 η⁴t−1 + β9 kt−1,

where we assume that Et−1(ηt) = 0, Et−1(η²t) = 1, Et−1(η³t) = st, and Et−1(η⁴t) = kt, where Et−1 denotes the conditional expectation given rt−1, rt−2, . . .. The likelihood is constructed using a Gram-Charlier series expansion of the normal density function for ηt, truncated at the fourth moment. We refer the interested reader to Leon, Rubio, and Serna (2004) for technical details.

The model is estimated via (quasi-)maximum likelihood. As starting values for the optimization, we use estimates of β1, β2, and β3 from the standard GARCH model. We set initial values of β4 and β7 equal to the unconditional skewness and kurtosis values of the GARCH residuals. The remaining coefficients are initialized at zero. The point estimates for the model parameters are given in Table 12.1. Figures 12.2 and 12.3 display the time series plots for st and kt, respectively.
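The LRS recursions are simple to filter through a return series once parameter values are given. The sketch below is illustrative only: it applies the Table 12.1 point estimates to simulated Gaussian returns and omits the Gram-Charlier quasi-likelihood used for actual estimation; the initial values `h0`, `s0`, `k0` are assumptions introduced for the sketch.

```python
import numpy as np

def lrs_filter(r, beta, h0, s0, k0):
    """Filter the LRS variance/skewness/kurtosis recursions through returns r.

    h_t = b1 + b2*r_{t-1}^2 + b3*h_{t-1}
    s_t = b4 + b5*eta_{t-1}^3 + b6*s_{t-1}
    k_t = b7 + b8*eta_{t-1}^4 + b9*k_{t-1},   eta_t = r_t / sqrt(h_t)
    """
    b1, b2, b3, b4, b5, b6, b7, b8, b9 = beta
    T = len(r)
    h, s, k = np.empty(T), np.empty(T), np.empty(T)
    h[0], s[0], k[0] = h0, s0, k0
    for t in range(1, T):
        eta_prev = r[t - 1] / np.sqrt(h[t - 1])
        h[t] = b1 + b2 * r[t - 1] ** 2 + b3 * h[t - 1]
        s[t] = b4 + b5 * eta_prev ** 3 + b6 * s[t - 1]
        k[t] = b7 + b8 * eta_prev ** 4 + b9 * k[t - 1]
    return h, s, k

# Table 12.1 point estimates, applied to simulated Gaussian returns
beta = (0.01, 0.05, 0.94, -0.04, 0.01, 0.01, 3.25, 0.00, 0.00)
r = np.random.default_rng(2).standard_normal(500)
h, s, k = lrs_filter(r, beta, h0=1.0, s0=0.0, k0=3.0)
```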

Next, we estimate the MQ-CAViaR model. Given the expressions for CSK2 and CKR4, we require five quantiles, i.e. those for θj = 0.025, 0.25, 0.5, 0.75, and 0.975. We thus estimate an MQ-CAViaR model for the following DGP:

q∗0.025,t = β∗11 + β∗12|rt−1| + q∗′t−1 γ∗1
q∗0.25,t = β∗21 + β∗22|rt−1| + q∗′t−1 γ∗2
...
q∗0.975,t = β∗51 + β∗52|rt−1| + q∗′t−1 γ∗5,

where q∗t−1 ≡ (q∗0.025,t−1, q∗0.25,t−1, q∗0.5,t−1, q∗0.75,t−1, q∗0.975,t−1)′ and γ∗j ≡ (γ∗j1, γ∗j2, γ∗j3, γ∗j4, γ∗j5)′, j = 1, . . . , 5. Hence, the coefficient vector α∗ consists of all the coefficients β∗jk and γ∗jk, as above.
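The quantile recursion in this DGP can be sketched compactly. The parameter values below are purely illustrative (a diagonal γ rather than the estimates of Table 12.2), chosen so that the five quantiles stay ordered.

```python
import numpy as np

def mq_caviar_quantiles(r, beta, gamma, q0):
    """One pass of the MQ-CAViaR recursion
       q*_{j,t} = beta_{j1} + beta_{j2}*|r_{t-1}| + q*'_{t-1} gamma_j.

    r     : returns, length T
    beta  : (p, 2) intercepts and |r| loadings
    gamma : (p, p) row j holds gamma_j (loadings on all p lagged quantiles)
    q0    : (p,) initial quantile vector
    """
    T, p = len(r), len(q0)
    q = np.empty((T, p))
    q[0] = q0
    for t in range(1, T):
        q[t] = beta[:, 0] + beta[:, 1] * abs(r[t - 1]) + gamma @ q[t - 1]
    return q

# illustrative parameters only: each quantile follows its own lag (diagonal gamma)
p = 5
beta = np.column_stack([np.zeros(p), np.array([-0.2, -0.1, 0.0, 0.1, 0.2])])
gamma = 0.9 * np.eye(p)
q0 = np.array([-2.0, -0.7, 0.0, 0.7, 2.0])
r = np.random.default_rng(3).standard_normal(300)
q = mq_caviar_quantiles(r, beta, gamma, q0)
```

With off-diagonal entries in γ, each quantile also responds to the lags of the others, which is the interaction the model is designed to capture.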

Estimating the full model is not trivial. We discuss this briefly before presenting the estimation results. We perform the computations in a step-wise fashion as follows. In



Fig. 12.2. S&P 500: estimated conditional skewness, LRS model


Fig. 12.3. S&P 500: estimated conditional kurtosis, LRS model

the first step, we estimate the MQ-CAViaR model containing just the 2.5% and 25% quantiles. The starting values for optimization are the individual CAViaR estimates, and we initialize the remaining parameters at zero. We repeat this estimation procedure for the MQ-CAViaR model containing the 75% and 97.5% quantiles. In the second step, we


Table 12.2. S&P 500 index: estimation results for the MQ-CAViaR model

θj       βj1     βj2     γj1     γj2     γj3     γj4     γj5
0.025   −0.04   −0.11    0.93    0.02    0       0       0
        (0.05)  (0.07)  (0.12)  (0.10)  (0.29)  (0.93)  (0.30)
0.25     0.001  −0.01    0       0.99    0       0       0
        (0.02)  (0.05)  (0.06)  (0.03)  (0.04)  (0.63)  (0.20)
0.50     0.10    0.04    0.03    0      −0.32    0      −0.02
        (0.02)  (0.04)  (0.04)  (0.02)  (0.02)  (0.52)  (0.17)
0.75     0.03   −0.01    0       0       0       0.04    0.29
        (0.31)  (0.05)  (0.70)  (0.80)  (2.33)  (0.84)  (0.34)
0.975    0.03    0.24    0       0       0       0.03    0.89
        (0.06)  (0.07)  (0.16)  (0.16)  (0.33)  (0.99)  (0.29)

Standard errors are in parentheses.

use the estimated parameters of the first step as starting values for the optimization of the MQ-CAViaR model containing the 2.5%, 25%, 75%, and 97.5% quantiles, initializing the remaining parameters at zero. Third and finally, we use the estimates from the second step as starting values for the full MQ-CAViaR model optimization containing all five quantiles of interest, again setting to zero the remaining parameters.
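A toy version of the underlying optimization problem can be sketched as follows: pack the coefficients into a vector, filter the quantile recursion, and minimize the summed check losses. This is illustrative only (two quantiles, a derivative-free optimizer, arbitrary starting values); it is not the authors' estimation code, which proceeds pair-by-pair with warm starts as just described.

```python
import numpy as np
from scipy.optimize import minimize

def check_loss(u, theta):
    # quantile check function rho_theta(u) = u * (theta - 1[u <= 0])
    return u * (theta - (u <= 0))

def mq_objective(params, r, thetas, q0):
    # params packs, per quantile j: intercept, |r| loading, and p lag loadings
    p = len(thetas)
    params = params.reshape(p, 2 + p)
    beta, gamma = params[:, :2], params[:, 2:]
    q = q0.copy()
    loss = 0.0
    for t in range(1, len(r)):
        q = beta[:, 0] + beta[:, 1] * abs(r[t - 1]) + gamma @ q
        loss += check_loss(r[t] - q, thetas).sum()
    return loss / len(r)

# toy two-quantile fit on simulated returns, warm-started at own-lag persistence 0.5
rng = np.random.default_rng(4)
r = rng.standard_normal(300)
thetas = np.array([0.25, 0.75])
q0 = np.quantile(r[:100], thetas)        # initialize quantiles empirically
x0 = np.zeros(2 * 4); x0[2] = 0.5; x0[7] = 0.5
loss0 = mq_objective(x0, r, thetas, q0)
res = minimize(mq_objective, x0, args=(r, thetas, q0), method='Nelder-Mead',
               options={'maxiter': 400})
```

A flat objective of this kind is exactly why the choice of starting values matters, as the next paragraph notes.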

The likelihood function appears quite flat around the optimum, making the optimization procedure sensitive to the choice of initial conditions. In particular, choosing a different combination of quantile couples in the first step of our estimation procedure tends to produce different parameter estimates for the full MQ-CAViaR model. Nevertheless, the likelihood values are similar, and there are no substantial differences in the dynamic behavior of the individual quantiles associated with these different estimates.

Table 12.2 presents our MQ-CAViaR estimation results. In calculating the standard errors, we have set the bandwidth to 1. Results are slightly sensitive to the choice of the bandwidth, with standard errors increasing for lower values of the bandwidth. We observe that there is interaction across quantile processes. This is particularly evident for the 75% quantile: the autoregressive coefficient associated with the lagged 75% quantile is only 0.04, whereas that associated with the lagged 97.5% quantile is 0.29. This implies that the autoregressive process of the 75% quantile is mostly driven by the lagged 97.5% quantile, although this is not statistically significant at the usual significance level. Figure 12.4 displays plots of the five individual quantiles for the time period under consideration.

Next, we use the estimates of the individual quantiles q∗0.025,t, . . . , q∗0.975,t to calculate the robust skewness and kurtosis measures CSK2 and CKR4. The resulting time series plots are shown in Figures 12.5 and 12.6, respectively.

We observe that the LRS model estimates of both skewness and kurtosis do not vary much and are dwarfed by those for the end of February 2007. The market was doing well until February 27, when the S&P 500 index dropped by 3.5%, as the market worried about global economic growth. (The sub-prime mortgage fiasco was still not yet public knowledge.) Interestingly, this is not a particularly large negative return (there are larger



Fig. 12.4. S&P 500 conditional quantiles: January 1, 1999–September 30, 2007


Fig. 12.5. S&P 500: estimated conditional skewness, MQ-CAViaR model

negative returns in our sample between 2000 and 2001), but this one occurred in a period of relatively low volatility.

Our more robust MQ-CAViaR measures show more plausible variability and confirm that the February 2007 market correction was indeed a case of large negative



Fig. 12.6. S&P 500: estimated conditional kurtosis, MQ-CAViaR model

conditional skewness and high conditional kurtosis. This episode appears to be substantially affecting the LRS model estimates for the entire sample, raising doubts about the reliability of LRS estimates in general, consistent with the findings of Sakata and White (1998).

6.2. Simulation

In this section we provide some Monte Carlo evidence illustrating the finite sample behavior of our methods. We consider the same MQ-CAViaR process estimated in the previous subsection,

q∗0.025,t = β∗11 + β∗12|rt−1| + q∗′t−1 γ∗1
q∗0.25,t = β∗21 + β∗22|rt−1| + q∗′t−1 γ∗2
...
q∗0.975,t = β∗51 + β∗52|rt−1| + q∗′t−1 γ∗5.    (13)

For the simulation exercise, we set the true coefficients equal to the estimates reported in Table 12.2. Using these values, we generate the above MQ-CAViaR process 100 times, and each time we estimate all the coefficients, using the procedure described in the previous subsection.

Data were generated as follows. We initialize the quantiles q∗θj,t, j = 1, . . . , 5, at t = 1 using the empirical quantiles of the first 100 observations of our S&P 500 data. Given quantiles for time t, we generate a random variable rt compatible with these using the following procedure. First, we draw a random variable Ut, uniform over the interval


Table 12.3. Means of point estimates through 100 replications (T = 1,000)

                        True parameters
θj       βj1     βj2     γj1     γj2     γj3     γj4     γj5
0.025   −0.05   −0.10    0.93    0.04    0.00    0.00    0.00
        (0.08)  (0.02)  (0.14)  (0.34)  (0.00)  (0.01)  (0.00)
0.25    −0.05   −0.01    0.04    0.81    0.00    0.00    0.00
        (0.40)  (0.05)  (0.17)  (0.47)  (0.01)  (0.00)  (0.00)
0.50    −0.08    0.02    0.00    0.00   −0.06    0.00    0.00
        (0.15)  (0.06)  (0.01)  (0.00)  (0.81)  (0.00)  (0.01)
0.75     0.20    0.05    0.00    0.00    0.00    0.38    0.13
        (0.42)  (0.11)  (0.02)  (0.02)  (0.00)  (0.63)  (0.19)
0.975    0.06    0.22    0.00    0.00    0.00    0.10    0.87
        (0.16)  (0.03)  (0.00)  (0.01)  (0.00)  (0.56)  (0.16)

Standard errors are in parentheses.

[0,1]. Next, we find θj such that θj−1 < Ut < θj. This determines the quantile range within which the random variable to be generated should fall. Finally, we generate the desired random variable rt by drawing it from a uniform distribution within the interval [q∗θj−1,t, q∗θj,t]. The procedure can be represented as follows:

rt = ∑_{j=1}^{p+1} 1[θj−1 < Ut < θj][q∗θj−1,t + (q∗θj,t − q∗θj−1,t)Vt],

where Ut and Vt are i.i.d. U(0,1), θ0 = 0, θp+1 = 1, q∗θ0,t = q∗θ1,t − 0.05, and q∗θp+1,t = q∗θp,t + 0.05. It is easy to check that the random variable rt has the desired quantiles by construction. Further, it does not matter that the distribution within the quantile ranges is uniform, as that distribution has essentially no impact on the resulting parameter estimates. Using these values of rt and q∗t, we apply (13) to generate conditional quantiles for the next period. The process iterates until t = T. Once we have a full sample, we perform the estimation procedure described in the previous subsection.
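The data-generating step can be sketched directly from the representation above. The code below is an illustrative implementation (not the authors'); the quantile vector `q_t` is held fixed here as an assumption for the sketch, rather than being updated by the recursion (13).

```python
import numpy as np

def draw_compatible(q_t, thetas, rng):
    """Draw r_t with the prescribed conditional quantiles q_t at levels thetas.

    Extends the grid with q_{theta_0} = q_1 - 0.05 and q_{theta_{p+1}} = q_p + 0.05,
    picks a bin via U ~ U(0,1), then draws uniformly within that bin.
    """
    edges = np.concatenate(([q_t[0] - 0.05], q_t, [q_t[-1] + 0.05]))
    probs = np.concatenate(([0.0], thetas, [1.0]))
    u = rng.uniform()
    j = np.searchsorted(probs, u, side='right')   # probs[j-1] <= u < probs[j]
    v = rng.uniform()
    return edges[j - 1] + (edges[j] - edges[j - 1]) * v

thetas = np.array([0.025, 0.25, 0.5, 0.75, 0.975])
q_t = np.array([-2.0, -0.7, 0.0, 0.7, 2.0])       # illustrative quantile vector
rng = np.random.default_rng(5)
draws = np.array([draw_compatible(q_t, thetas, rng) for _ in range(20_000)])
```

By construction P(rt ≤ q∗θj,t) = θj, which is all the estimation step needs; the within-bin distribution is immaterial.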

Tables 12.3 and 12.4 provide the sample means and standard deviations over 100 replications of each coefficient estimate for two different sample sizes, T = 1,000 and T = 2,280 (the sample size of the S&P 500 data), respectively. The mean estimates are fairly close to the values of Table 12.2, showing that the available sample sizes are sufficient to recover the true DGP parameters. (To obtain standard error estimates for the means, divide the reported standard deviations by 10.)

A potentially interesting experiment that one might consider is to generate data from the LRS process and see how the MQ-CAViaR model performs in revealing underlying patterns of conditional skewness and kurtosis. Nevertheless, we leave this aside here, as the LRS model depends on four distributional shape parameters, but we require five variation-free quantiles for the present exercise. As noted in Section 2, the MQ-CAViaR model will generally not satisfy the identification condition in such circumstances.


Table 12.4. Means of point estimates through 100 replications (T = 2,280)

                        True parameters
θj       βj1     βj2     γj1     γj2     γj3     γj4     γj5
0.025   −0.04   −0.10    0.93    0.03    0.00    0.00    0.00
        (0.03)  (0.01)  (0.07)  (0.21)  (0.00)  (0.00)  (0.00)
0.25    −0.04   −0.01    0.03    0.88    0.00    0.00    0.00
        (0.18)  (0.02)  (0.12)  (0.38)  (0.00)  (0.00)  (0.00)
0.50    −0.01    0.02    0.00    0.00   −0.03    0.00    0.00
        (0.11)  (0.04)  (0.00)  (0.01)  (0.75)  (0.00)  (0.02)
0.75     0.09    0.01    0.00    0.00    0.00    0.33    0.19
        (0.21)  (0.07)  (0.00)  (0.01)  (0.00)  (0.58)  (0.18)
0.975    0.05    0.24    0.00    0.00    0.00    0.18    0.83
        (0.13)  (0.02)  (0.00)  (0.03)  (0.00)  (0.69)  (0.22)

Standard errors are in parentheses.

7. Conclusion

In this chapter, we generalize Engle and Manganelli's (2004) single-quantile CAViaR process to its multi-quantile version. This allows for (i) joint modeling of multiple quantiles; (ii) dynamic interactions between quantiles; and (iii) the use of exogenous variables. We apply our MQ-CAViaR process to define conditional versions of existing unconditional quantile-based measures of skewness and kurtosis. Because of their use of quantiles, these measures may be much less sensitive than standard moment-based methods to the adverse impact of outliers that regularly appear in financial market data. An empirical analysis of the S&P 500 index demonstrates the use and utility of our new methods.

Appendix

Proof of Theorem 1 We verify the conditions of corollary 5.11 of White (1994), which delivers αT → α∗, where

αT ≡ arg max_{α∈A} T^{−1} ∑_{t=1}^T ϕt(Yt, qt(·, α)),

and ϕt(Yt, qt(·, α)) ≡ −∑_{j=1}^p ρθj(Yt − qj,t(·, α)). Assumption 1 ensures White's Assumption 2.1. Assumption 3(i) ensures White's Assumption 5.1. Our choice of ρθj satisfies White's Assumption 5.4. To verify White's Assumption 3.1, it suffices that ϕt(Yt, qt(·, α)) is dominated on A by an integrable function (ensuring White's Assumption 3.1(a,b))


and that for each α in A, {ϕt(Yt, qt(·, α))} is stationary and ergodic (ensuring White's Assumption 3.1(c), the strong uniform law of large numbers (ULLN)). Stationarity and ergodicity is ensured by Assumptions 1 and 3(i). To show domination, we write

|ϕt(Yt, qt(·, α))| ≤ ∑_{j=1}^p |ρθj(Yt − qj,t(·, α))|
= ∑_{j=1}^p |(Yt − qj,t(·, α))(θj − 1[Yt−qj,t(·,α)≤0])|
≤ 2 ∑_{j=1}^p (|Yt| + |qj,t(·, α)|)
≤ 2p(|Yt| + |D0,t|),

so that

sup_{α∈A} |ϕt(Yt, qt(·, α))| ≤ 2p(|Yt| + |D0,t|).

Thus, 2p(|Yt| + |D0,t|) dominates |ϕt(Yt, qt(·, α))| and has finite expectation by Assumption 5(i,ii).

It remains to verify White's Assumption 3.2; here this is the condition that α∗ is the unique maximizer of E(ϕt(Yt, qt(·, α))). Given Assumptions 2(ii.b) and 4(i), it follows by an argument directly parallel to that in the proof of White (1994, corollary 5.11) that for all α ∈ A,

E(ϕt(Yt, qt(·, α))) ≤ E(ϕt(Yt, qt(·, α∗))).

Thus, it suffices to show that the above inequality is strict for α ≠ α∗. Letting Δ(α) ≡ ∑_{j=1}^p E(Δj,t(α)) with Δj,t(α) ≡ ρθj(Yt − qj,t(·, α)) − ρθj(Yt − qj,t(·, α∗)), it suffices to show that for each ε > 0, Δ(α) > 0 for all α ∈ A such that ‖α − α∗‖ > ε.

Pick ε > 0 and α ∈ A such that ‖α − α∗‖ > ε. With δj,t(α, α∗) ≡ qj,t(·, α) − qj,t(·, α∗), by Assumption 4(i.b), there exist J ⊆ {1, . . . , p} and δε > 0 such that P[∪j∈J {|δj,t(α, α∗)| > δε}] > 0. For this δε and all j, some algebra and Assumption 2(ii.a) ensure that

E(Δj,t(α)) = E[∫_0^{δj,t(α,α∗)} (δj,t(α, α∗) − s) fj,t(s) ds]
≥ E[½δε² 1[|δj,t(α,α∗)|>δε] + ½δj,t(α, α∗)² 1[|δj,t(α,α∗)|≤δε]]
≥ ½δε² E[1[|δj,t(α,α∗)|>δε]].


The first inequality above comes from the fact that Assumption 2(ii.a) implies that for any δε > 0 sufficiently small, we have fj,t(s) > δε for |s| < δε. Thus,

Δ(α) ≡ ∑_{j=1}^p E(Δj,t(α)) ≥ ½δε² ∑_{j=1}^p E[1[|δj,t(α,α∗)|>δε]]
= ½δε² ∑_{j=1}^p P[|δj,t(α, α∗)| > δε] ≥ ½δε² ∑_{j∈J} P[|δj,t(α, α∗)| > δε]
≥ ½δε² P[∪j∈J {|δj,t(α, α∗)| > δε}]
> 0,

where the final inequality follows from Assumption 4(i.b). As ε > 0 and α are arbitrary, the result follows.

Proof of Theorem 2 As outlined in the text, we first prove

T^{−1/2} ∑_{t=1}^T ∑_{j=1}^p ∇qj,t(·, αT)ψθj(Yt − qj,t(·, αT)) = op(1).    (14)

The existence of ∇qj,t is ensured by Assumption 3(ii). Let ei be the ℓ × 1 unit vector with ith element equal to one and the rest zero, and let

Gi(c) ≡ T^{−1/2} ∑_{t=1}^T ∑_{j=1}^p ρθj(Yt − qj,t(·, αT + cei)),

for any real number c. Then by the definition of αT, Gi(c) is minimized at c = 0. Let Hi(c) be the derivative of Gi(c) with respect to c from the right. Then

Hi(c) = −T^{−1/2} ∑_{t=1}^T ∑_{j=1}^p ∇iqj,t(·, αT + cei)ψθj(Yt − qj,t(·, αT + cei)),

where ∇iqj,t(·, αT + cei) is the ith element of ∇qj,t(·, αT + cei). Using the facts that (i) Hi(c) is nondecreasing in c and (ii) for any ε > 0, Hi(−ε) ≤ 0 and Hi(ε) ≥ 0, we have

|Hi(0)| ≤ Hi(ε) − Hi(−ε)
≤ T^{−1/2} ∑_{t=1}^T ∑_{j=1}^p |∇iqj,t(·, αT)| 1[Yt−qj,t(·,αT)=0]
≤ T^{−1/2} max_{1≤t≤T} D1,t ∑_{t=1}^T ∑_{j=1}^p 1[Yt−qj,t(·,αT)=0],

where the last inequality follows by the domination condition imposed in Assumption 5(iii.a). Because D1,t is stationary, T^{−1/2} max_{1≤t≤T} D1,t = op(1). The second term is bounded in probability: ∑_{t=1}^T ∑_{j=1}^p 1[Yt−qj,t(·,αT)=0] = Op(1) given Assumption


2(i,ii.a) (see Koenker and Bassett, 1978, for details). Since Hi(0) is the ith element of T^{−1/2} ∑_{t=1}^T ∑_{j=1}^p ∇qj,t(·, αT)ψθj(Yt − qj,t(·, αT)), the claim in (14) is proved.

Next, for each α ∈ A, Assumptions 3(ii) and 5(iii.a) ensure the existence and finiteness of the ℓ × 1 vector

λ(α) ≡ ∑_{j=1}^p E[∇qj,t(·, α)ψθj(Yt − qj,t(·, α))]
= ∑_{j=1}^p E[∇qj,t(·, α) ∫_{δj,t(α,α∗)}^0 fj,t(s) ds],

where δj,t(α, α∗) ≡ qj,t(·, α) − qj,t(·, α∗) and fj,t(s) = (d/ds)Ft(s + qj,t(·, α∗)) represents the conditional density of εj,t ≡ Yt − qj,t(·, α∗) with respect to the Lebesgue measure. The differentiability and domination conditions provided by Assumptions 3(iii) and 5(iv.a) ensure (e.g., by Bartle, corollary 5.9) the continuous differentiability of λ on A, with

∇λ(α) = ∑_{j=1}^p E[∇{∇′qj,t(·, α) ∫_{δj,t(α,α∗)}^0 fj,t(s) ds}].

As α∗ is interior to A by Assumption 4(ii), the mean value theorem applies to each element of λ to yield

λ(α) = λ(α∗) + Q0(α − α∗),    (15)

for α in a convex compact neighborhood of α∗, where Q0 is an ℓ × ℓ matrix with (1 × ℓ) rows Qi(α(i)) = ∇′λ(α(i)), where α(i) is a mean value (different for each i) lying on the segment connecting α and α∗, i = 1, . . . , ℓ. The chain rule and an application of the Leibniz rule to ∫_{δj,t(α,α∗)}^0 fj,t(s) ds then give

Qi(α) = Ai(α) − Bi(α),

where

Ai(α) ≡ ∑_{j=1}^p E[∇i∇′qj,t(·, α) ∫_{δj,t(α,α∗)}^0 fj,t(s) ds]
Bi(α) ≡ ∑_{j=1}^p E[fj,t(δj,t(α, α∗))∇iqj,t(·, α)∇′qj,t(·, α)].

Assumption 2(iii) and the other domination conditions (those of Assumption 5) then ensure that

Ai(α(i)) = O(‖α − α∗‖)
Bi(α(i)) = Q∗i + O(‖α − α∗‖),

where Q∗i ≡ ∑_{j=1}^p E[fj,t(0)∇iqj,t(·, α∗)∇′qj,t(·, α∗)]. Letting Q∗ ≡ ∑_{j=1}^p E[fj,t(0)∇qj,t(·, α∗)∇′qj,t(·, α∗)], we obtain

Q0 = −Q∗ + O(‖α − α∗‖).    (16)


Next, we have that λ(α∗) = 0. To show this, we write

λ(α∗) = ∑_{j=1}^p E[∇qj,t(·, α∗)ψθj(Yt − qj,t(·, α∗))]
= ∑_{j=1}^p E(E[∇qj,t(·, α∗)ψθj(Yt − qj,t(·, α∗))|Ft−1])
= ∑_{j=1}^p E(∇qj,t(·, α∗)E[ψθj(Yt − qj,t(·, α∗))|Ft−1])
= ∑_{j=1}^p E(∇qj,t(·, α∗)E[ψθj(εj,t)|Ft−1])
= 0,

as E[ψθj(εj,t)|Ft−1] = θj − E[1[Yt≤q∗j,t]|Ft−1] = 0, by definition of q∗j,t, j = 1, . . . , p (see (2)). Combining λ(α∗) = 0 with (15) and (16), we obtain

λ(α) = −Q∗(α − α∗) + O(‖α − α∗‖²).    (17)

The next step is to show that

T^{1/2}λ(αT) + HT = op(1),    (18)

where HT ≡ T^{−1/2} ∑_{t=1}^T η∗t, with η∗t ≡ ηt(α∗), ηt(α) ≡ ∑_{j=1}^p ∇qj,t(·, α)ψθj(Yt − qj,t(·, α)). Let ut(α, d) ≡ sup_{τ:‖τ−α‖≤d} ‖ηt(τ) − ηt(α)‖. By the results of Huber (1967) and Weiss (1991), to prove (18) it suffices to show the following: (i) there exist a > 0 and d0 > 0 such that ‖λ(α)‖ ≥ a‖α − α∗‖ for ‖α − α∗‖ ≤ d0; (ii) there exist b > 0, d0 > 0, and d ≥ 0 such that E[ut(α, d)] ≤ bd for ‖α − α∗‖ + d ≤ d0; and (iii) there exist c > 0, d0 > 0, and d ≥ 0 such that E[ut(α, d)²] ≤ cd for ‖α − α∗‖ + d ≤ d0.

The condition that Q∗ is positive definite in Assumption 6(i) is sufficient for (i). For (ii), we have that for given (small) d > 0

ut(α, d) ≤ sup_{τ:‖τ−α‖≤d} ∑_{j=1}^p ‖∇qj,t(·, τ)ψθj(Yt − qj,t(·, τ)) − ∇qj,t(·, α)ψθj(Yt − qj,t(·, α))‖
≤ ∑_{j=1}^p sup_{τ:‖τ−α‖≤d} ‖ψθj(Yt − qj,t(·, τ))‖ × sup_{τ:‖τ−α‖≤d} ‖∇qj,t(·, τ) − ∇qj,t(·, α)‖
+ ∑_{j=1}^p sup_{τ:‖τ−α‖≤d} ‖ψθj(Yt − qj,t(·, α)) − ψθj(Yt − qj,t(·, τ))‖ × sup_{τ:‖τ−α‖≤d} ‖∇qj,t(·, α)‖
≤ pD2,t d + D1,t ∑_{j=1}^p 1[|Yt−qj,t(·,α)|<D1,t d],    (19)

using the following: (i) ‖ψθj(Yt − qj,t(·, τ))‖ ≤ 1; (ii) ‖ψθj(Yt − qj,t(·, α)) − ψθj(Yt − qj,t(·, τ))‖ ≤ 1[|Yt−qj,t(·,α)|<|qj,t(·,τ)−qj,t(·,α)|]; and (iii) the mean value theorem applied to ∇qj,t(·, τ) and qj,t(·, α). Hence, we have

E[ut(α, d)] ≤ pC0d + 2pC1f0d

for some constants C0 and C1, given Assumptions 2(iii.a), 5(iii.a), and 5(iv.a). Hence, (ii) holds for b = pC0 + 2pC1f0 and d0 = 2d. The last condition (iii) can be similarly verified by applying the cr-inequality to (19) with d < 1 (so that d² < d) and using Assumptions 2(iii.a), 5(iii.b), and 5(iv.b). Thus, (18) is verified.

Combining (17) and (18) thus yields

Q∗T^{1/2}(αT − α∗) = T^{−1/2} ∑_{t=1}^T η∗t + op(1).

But {η∗t, Ft} is a stationary ergodic martingale difference sequence (MDS). In particular, η∗t is measurable-Ft, and E(η∗t|Ft−1) = E(∑_{j=1}^p ∇qj,t(·, α∗)ψθj(εj,t)|Ft−1) = ∑_{j=1}^p ∇qj,t(·, α∗)E(ψθj(εj,t)|Ft−1) = 0, as E[ψθj(εj,t)|Ft−1] = 0 for all j = 1, . . . , p. Assumption 5(iii.b) ensures that V∗ ≡ E(η∗t η∗′t) is finite. The MDS central limit theorem (e.g., theorem 5.24 of White, 2001) applies, provided V∗ is positive definite (as ensured by Assumption 6(ii)) and T^{−1} ∑_{t=1}^T η∗t η∗′t = V∗ + op(1), which is ensured by the ergodic theorem. The standard argument now gives

V∗^{−1/2}Q∗T^{1/2}(αT − α∗) d→ N(0, I),

which completes the proof.

Proof of Theorem 3 We have

V̂T − V∗ = (T^{−1} ∑_{t=1}^T η̂t η̂′t − T^{−1} ∑_{t=1}^T η∗t η∗′t) + (T^{−1} ∑_{t=1}^T η∗t η∗′t − E[η∗t η∗′t]),

where η̂t ≡ ∑_{j=1}^p ∇q̂j,t ψ̂j,t and η∗t ≡ ∑_{j=1}^p ∇q∗j,t ψ∗j,t, with ∇q̂j,t ≡ ∇qj,t(·, αT), ψ̂j,t ≡ ψθj(Yt − qj,t(·, αT)), ∇q∗j,t ≡ ∇qj,t(·, α∗), and ψ∗j,t ≡ ψθj(Yt − qj,t(·, α∗)). Assumptions 1 and 2(i,ii) ensure that {η∗t η∗′t} is a stationary ergodic sequence. Assumptions 3(i,ii), 4(i.a), and 5(iii) ensure that E[η∗t η∗′t] < ∞. It follows by the ergodic theorem that T^{−1} ∑_{t=1}^T η∗t η∗′t − E[η∗t η∗′t] = op(1). Thus, it suffices to prove that T^{−1} ∑_{t=1}^T η̂t η̂′t − T^{−1} ∑_{t=1}^T η∗t η∗′t = op(1).


The (h, i) element of T^{−1} ∑_{t=1}^T η̂t η̂′t − T^{−1} ∑_{t=1}^T η∗t η∗′t is

T^{−1} ∑_{t=1}^T ∑_{j=1}^p ∑_{k=1}^p {ψ̂j,t ψ̂k,t ∇hq̂j,t ∇iq̂k,t − ψ∗j,t ψ∗k,t ∇hq∗j,t ∇iq∗k,t}.

Thus, it will suffice to show that for each (h, i) and (j, k) we have

T^{−1} ∑_{t=1}^T {ψ̂j,t ψ̂k,t ∇hq̂j,t ∇iq̂k,t − ψ∗j,t ψ∗k,t ∇hq∗j,t ∇iq∗k,t} = op(1).

By the triangle inequality,

|T^{−1} ∑_{t=1}^T {ψ̂j,t ψ̂k,t ∇hq̂j,t ∇iq̂k,t − ψ∗j,t ψ∗k,t ∇hq∗j,t ∇iq∗k,t}| ≤ AT + BT,

where

AT = T^{−1} ∑_{t=1}^T |ψ̂j,t ψ̂k,t ∇hq̂j,t ∇iq̂k,t − ψ∗j,t ψ∗k,t ∇hq̂j,t ∇iq̂k,t|
BT = T^{−1} ∑_{t=1}^T |ψ∗j,t ψ∗k,t ∇hq∗j,t ∇iq∗k,t − ψ∗j,t ψ∗k,t ∇hq̂j,t ∇iq̂k,t|.

We now show that AT = op(1) and BT = op(1), delivering the desired result. For AT, the triangle inequality gives

AT ≤ A1T + A2T + A3T,

where

A1T = T^{−1} ∑_{t=1}^T θj |1[ε̂k,t≤0] − 1[εk,t≤0]| |∇hq̂j,t ∇iq̂k,t|
A2T = T^{−1} ∑_{t=1}^T θk |1[ε̂j,t≤0] − 1[εj,t≤0]| |∇hq̂j,t ∇iq̂k,t|
A3T = T^{−1} ∑_{t=1}^T |1[ε̂j,t≤0]1[ε̂k,t≤0] − 1[εj,t≤0]1[εk,t≤0]| |∇hq̂j,t ∇iq̂k,t|.

Theorem 2, ensured by Assumptions 1–6, implies that T^{1/2}‖αT − α∗‖ = Op(1). This, together with Assumptions 2(iii,iv) and 5(iii.b), enables us to apply the same techniques used in Kim and White (2003) to show A1T = op(1), A2T = op(1), and A3T = op(1), implying AT = op(1).

It remains to show BT = op(1). By the triangle inequality,

BT ≤ B1T + B2T,


where

B1T = |T^{−1} ∑_{t=1}^T ψ∗j,t ψ∗k,t ∇hq∗j,t ∇iq∗k,t − E[ψ∗j,t ψ∗k,t ∇hq∗j,t ∇iq∗k,t]|
B2T = |T^{−1} ∑_{t=1}^T ψ∗j,t ψ∗k,t ∇hq̂j,t ∇iq̂k,t − E[ψ∗j,t ψ∗k,t ∇hq∗j,t ∇iq∗k,t]|.

Assumptions 1, 2(i,ii), 3(i,ii), 4(i.a), and 5(iii) ensure that the ergodic theorem applies to {ψ∗j,t ψ∗k,t ∇hq∗j,t ∇iq∗k,t}, so B1T = op(1). Next, Assumptions 1, 3(i,ii), and 5(iii) ensure that the stationary ergodic ULLN applies to {ψ∗j,t ψ∗k,t ∇hqj,t(·, α)∇iqk,t(·, α)}. This and the result of Theorem 1 (αT − α∗ = op(1)) ensure that B2T = op(1) by, e.g., White (1994, corollary 3.8), and the proof is complete.

Proof of Theorem 4 We begin by sketching the proof. We first define

$$\bar Q_T \equiv (2c_T T)^{-1}\sum_{t=1}^{T}\sum_{j=1}^{p}1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\nabla q^*_{j,t}\nabla' q^*_{j,t},$$

and then we will show the following:

$$Q^* - E(\bar Q_T) \stackrel{p}{\to} 0, \qquad (20)$$

$$E(\bar Q_T) - \bar Q_T \stackrel{p}{\to} 0, \qquad (21)$$

$$\bar Q_T - \hat Q_T \stackrel{p}{\to} 0. \qquad (22)$$

Combining the results above will deliver the desired outcome: $\hat Q_T - Q^* \stackrel{p}{\to} 0$. For (20), one can show by applying the mean value theorem to $F_{j,t}(c_T) - F_{j,t}(-c_T)$, where $F_{j,t}(c) \equiv \int 1_{\{s\le c\}}f_{j,t}(s)\,ds$, that

$$E(\bar Q_T) = T^{-1}\sum_{t=1}^{T}\sum_{j=1}^{p}E\big[f_{j,t}(\xi_{j,T})\nabla q^*_{j,t}\nabla' q^*_{j,t}\big] = \sum_{j=1}^{p}E\big[f_{j,t}(\xi_{j,T})\nabla q^*_{j,t}\nabla' q^*_{j,t}\big],$$

where $\xi_{j,T}$ is a mean value lying between $-c_T$ and $c_T$, and the second equality follows by stationarity. Therefore, the $(h,i)$ element of $|E(\bar Q_T) - Q^*|$ satisfies

$$\Big|\sum_{j=1}^{p}E\big\{(f_{j,t}(\xi_{j,T}) - f_{j,t}(0))\nabla_h q^*_{j,t}\nabla_i q^*_{j,t}\big\}\Big| \le \sum_{j=1}^{p}E\big\{|f_{j,t}(\xi_{j,T}) - f_{j,t}(0)|\,|\nabla_h q^*_{j,t}\nabla_i q^*_{j,t}|\big\}$$

$$\le \sum_{j=1}^{p}L_0\,E\big\{|\xi_{j,T}|\,|\nabla_h q^*_{j,t}\nabla_i q^*_{j,t}|\big\}$$

$$\le p\,L_0\,c_T\,E[D^2_{1,t}],$$


which converges to zero as $c_T \to 0$. The second inequality follows by Assumption 2(iii.b), and the last inequality follows by Assumption 5(iii.b). Therefore, we have the result in (20).

To show (21), it suffices simply to apply a LLN for double arrays, e.g., theorem 2 in Andrews (1988).

Finally, for (22), we consider the $(h,i)$ element of $|\hat Q_T - \bar Q_T|$, which is given by

$$\Big|\frac{1}{2\hat c_T T}\sum_{t=1}^{T}\sum_{j=1}^{p}1_{[-\hat c_T\le \hat\varepsilon_{j,t}\le \hat c_T]}\nabla_h\hat q_{j,t}\nabla_i\hat q_{j,t} - \frac{1}{2c_T T}\sum_{t=1}^{T}\sum_{j=1}^{p}1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\nabla_h q^*_{j,t}\nabla_i q^*_{j,t}\Big|$$

$$= \frac{c_T}{\hat c_T}\Big|\frac{1}{2c_T T}\sum_{t=1}^{T}\sum_{j=1}^{p}\big(1_{[-\hat c_T\le \hat\varepsilon_{j,t}\le \hat c_T]} - 1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\big)\nabla_h\hat q_{j,t}\nabla_i\hat q_{j,t}$$

$$\quad + \frac{1}{2c_T T}\sum_{t=1}^{T}\sum_{j=1}^{p}1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\big(\nabla_h\hat q_{j,t} - \nabla_h q^*_{j,t}\big)\nabla_i\hat q_{j,t}$$

$$\quad + \frac{1}{2c_T T}\sum_{t=1}^{T}\sum_{j=1}^{p}1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\nabla_h q^*_{j,t}\big(\nabla_i\hat q_{j,t} - \nabla_i q^*_{j,t}\big)$$

$$\quad + \frac{1}{2c_T T}\Big(1 - \frac{\hat c_T}{c_T}\Big)\sum_{t=1}^{T}\sum_{j=1}^{p}1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\nabla_h q^*_{j,t}\nabla_i q^*_{j,t}\Big|$$

$$\le \frac{c_T}{\hat c_T}\Big[A_{1T} + A_{2T} + A_{3T} + \Big(1 - \frac{\hat c_T}{c_T}\Big)A_{4T}\Big],$$

where

$$A_{1T} \equiv \frac{1}{2c_T T}\sum_{t=1}^{T}\sum_{j=1}^{p}\big|1_{[-\hat c_T\le \hat\varepsilon_{j,t}\le \hat c_T]} - 1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\big|\times\big|\nabla_h\hat q_{j,t}\nabla_i\hat q_{j,t}\big|$$

$$A_{2T} \equiv \frac{1}{2c_T T}\sum_{t=1}^{T}\sum_{j=1}^{p}1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\big|\nabla_h\hat q_{j,t} - \nabla_h q^*_{j,t}\big|\times\big|\nabla_i\hat q_{j,t}\big|$$

$$A_{3T} \equiv \frac{1}{2c_T T}\sum_{t=1}^{T}\sum_{j=1}^{p}1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\big|\nabla_h q^*_{j,t}\big|\times\big|\nabla_i\hat q_{j,t} - \nabla_i q^*_{j,t}\big|$$

$$A_{4T} \equiv \frac{1}{2c_T T}\sum_{t=1}^{T}\sum_{j=1}^{p}1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\big|\nabla_h q^*_{j,t}\nabla_i q^*_{j,t}\big|.$$

It will suffice to show that $A_{1T} = o_p(1)$, $A_{2T} = o_p(1)$, $A_{3T} = o_p(1)$, and $A_{4T} = O_p(1)$. Then, because $\hat c_T/c_T \stackrel{p}{\to} 1$, we obtain the desired result: $\hat Q_T - Q^* \stackrel{p}{\to} 0$.


We first show $A_{1T} = o_p(1)$. It will suffice to show that for each $j$,

$$\frac{1}{2c_T T}\sum_{t=1}^{T}\big|1_{[-\hat c_T\le \hat\varepsilon_{j,t}\le \hat c_T]} - 1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\big|\times\big|\nabla_h\hat q_{j,t}\nabla_i\hat q_{j,t}\big| = o_p(1).$$

Let $\bar\alpha_T$ lie between $\hat\alpha_T$ and $\alpha^*$, and put $d_{j,t,T} \equiv \|\nabla q_{j,t}(\cdot,\bar\alpha_T)\|\times\|\hat\alpha_T - \alpha^*\| + |\hat c_T - c_T|$. Then

$$(2c_T T)^{-1}\sum_{t=1}^{T}\big|1_{[-\hat c_T\le \hat\varepsilon_{j,t}\le \hat c_T]} - 1_{[-c_T\le \varepsilon_{j,t}\le c_T]}\big|\times\big|\nabla_h\hat q_{j,t}\nabla_i\hat q_{j,t}\big| \le U_T + V_T,$$

where

$$U_T \equiv (2c_T T)^{-1}\sum_{t=1}^{T}1_{[\,|\varepsilon_{j,t}-c_T|<d_{j,t,T}\,]}\big|\nabla_h\hat q_{j,t}\nabla_i\hat q_{j,t}\big|$$

$$V_T \equiv (2c_T T)^{-1}\sum_{t=1}^{T}1_{[\,|\varepsilon_{j,t}+c_T|<d_{j,t,T}\,]}\big|\nabla_h\hat q_{j,t}\nabla_i\hat q_{j,t}\big|.$$

It will suffice to show that $U_T \stackrel{p}{\to} 0$ and $V_T \stackrel{p}{\to} 0$. Let $z$ be an arbitrary positive number. Then, using reasoning similar to that of Kim and White (2003, lemma 5), one can show that for any $\eta > 0$,

$$P(U_T > \eta) \le P\Big((2c_T T)^{-1}\sum_{t=1}^{T}1_{[\,|\varepsilon_{j,t}-c_T|<(\|\nabla q_{j,t}(\cdot,\bar\alpha_T)\|+1)z c_T\,]}\big|\nabla_h\hat q_{j,t}\nabla_i\hat q_{j,t}\big| > \eta\Big)$$

$$\le \frac{z f_0}{\eta T}\sum_{t=1}^{T}E\big\{(\|\nabla q_{j,t}(\cdot,\bar\alpha_T)\|+1)\big|\nabla_h\hat q_{j,t}\nabla_i\hat q_{j,t}\big|\big\}$$

$$\le z f_0\big\{E|D^3_{1,t}| + E|D^2_{1,t}|\big\}/\eta,$$

where the second inequality is due to the Markov inequality and Assumption 2(iii.a), and the third is due to Assumption 5(iii.c). As $z$ can be chosen arbitrarily small and the remaining terms are finite by assumption, we have $U_T \stackrel{p}{\to} 0$. The same argument is used to show $V_T \stackrel{p}{\to} 0$. Hence, $A_{1T} = o_p(1)$ is proved.

Next, we show $A_{2T} = o_p(1)$. For this, it suffices to show that $A_{2T,j} \equiv \frac{1}{2c_T T}\sum_{t=1}^{T}1_{[-c_T\le \varepsilon_{j,t}\le c_T]}|\nabla_h\hat q_{j,t} - \nabla_h q^*_{j,t}|\times|\nabla_i\hat q_{j,t}| = o_p(1)$ for each $j$. Note that

$$A_{2T,j} \le \frac{1}{2c_T T}\sum_{t=1}^{T}\big|\nabla_h\hat q_{j,t} - \nabla_h q^*_{j,t}\big|\times\big|\nabla_i\hat q_{j,t}\big|$$

$$\le \frac{1}{2c_T T}\sum_{t=1}^{T}\big\|\nabla^2_h q_{j,t}(\cdot,\bar\alpha)\big\|\times\big\|\hat\alpha_T - \alpha^*\big\|\times\big|\nabla_i\hat q_{j,t}\big|$$

$$\le \frac{1}{2c_T}\big\|\hat\alpha_T - \alpha^*\big\|\,\frac{1}{T}\sum_{t=1}^{T}D_{2,t}D_{1,t}$$

$$= \frac{1}{2c_T T^{1/2}}\,T^{1/2}\big\|\hat\alpha_T - \alpha^*\big\|\,\frac{1}{T}\sum_{t=1}^{T}D_{2,t}D_{1,t},$$

where $\bar\alpha$ is between $\hat\alpha_T$ and $\alpha^*$, and $\nabla^2_h q_{j,t}(\cdot,\bar\alpha)$ is the first derivative of $\nabla_h q_{j,t}$ with respect to $\alpha$, evaluated at $\bar\alpha$. The last expression above is $o_p(1)$ because (i) $T^{1/2}\|\hat\alpha_T - \alpha^*\| = O_p(1)$ by Theorem 2, (ii) $T^{-1}\sum_{t=1}^{T}D_{2,t}D_{1,t} = O_p(1)$ by the ergodic theorem, and (iii) $1/(c_T T^{1/2}) = o(1)$ by Assumption 7(iii). Hence, $A_{2T} = o_p(1)$. The other claims, $A_{3T} = o_p(1)$ and $A_{4T} = O_p(1)$, can be proven analogously and more easily, and hence are omitted. Therefore, we finally have $\hat Q_T - \bar Q_T \stackrel{p}{\to} 0$, which, together with (20) and (21), implies that $\hat Q_T - Q^* \stackrel{p}{\to} 0$. The proof is complete.
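At its core, the estimator $\hat Q_T$ averages uniform-kernel terms of the form $(2c_T)^{-1}1_{[-c_T\le \varepsilon_{j,t}\le c_T]}$, which estimate the error density at zero, $f_{j,t}(0)$. The following minimal simulation (illustrative only: i.i.d. standard normal errors and a particular bandwidth rate are assumptions, not the setting of the theorem) shows the mechanism that (20)–(22) formalize: with $c_T \to 0$ and $c_T T \to \infty$, the scaled indicator average converges to the density at zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def uniform_kernel_density_at_zero(eps, c):
    """(2c)^{-1} times the fraction of eps_t falling in [-c, c]."""
    return np.mean(np.abs(eps) <= c) / (2.0 * c)

T = 200_000
eps = rng.standard_normal(T)          # stand-in for the quantile residuals eps_{j,t}
c_T = T ** (-1.0 / 3.0)               # a rate with c_T -> 0 and c_T * T -> infinity
est = uniform_kernel_density_at_zero(eps, c_T)
true_f0 = 1.0 / np.sqrt(2.0 * np.pi)  # standard normal density at 0
print(round(est, 3), round(true_f0, 3))
```

The bandwidth conditions in Assumption 7 play exactly this role in the proof: a shrinking window keeps the bias term $p L_0 c_T E[D^2_{1,t}]$ small, while $c_T T^{1/2} \to \infty$ keeps the stochastic terms negligible.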


13

Volatility Regimes and Global Equity Returns

Luis Catao and Allan Timmermann

The stock market bust of the early 2000s and widespread perception of tighter international co-movements in stock prices over the past boom and bust cycle have renewed interest in patterns of equity market volatility and their sources. Three important questions arise in this connection: first, does market volatility display well-characterized temporary switches that can nevertheless be quite persistent? Second, to what extent is such volatility accounted for by global, country- or sector-specific factors, and how do these factor contributions evolve across distinct volatility states (if any)? Third, what implications follow for international risk diversification?

Each of these questions has been addressed in distinct literatures that have built on and been shaped by Rob Engle's pioneering work on volatility modeling. Taking off from Engle (1982a), a first body of literature has looked at the question of whether stock return volatility is time-varying, using a variety of econometric models capable of gauging rich asset pricing dynamics, which have been applied to broad stock market indices (see Bollerslev et al., 1994, and Campbell et al., 1997 for comprehensive surveys). It is typically found that stock return volatility has been strongly time-varying, with evidence showing US stock market volatility to have risen in the run-up to the 1987 crash, then dropped to unusually low levels through 1996/97 before rising markedly since, although some controversy remains as to whether stock return volatility has been trendless (Schwert, 1989) or U-shaped over longer horizons (Eichengreen and Tong, 2004).

Although the above studies do not decompose such time-varying stock return volatility into its country-, sector-, and firm-specific components, other researchers have used international firm-level data to try to measure the relative importance of these factors.

Acknowledgments: We thank Steve Figlewski, seminar participants at the Festschrift conference in honor of Rob Engle, at the European econometric society meetings, IMF, and Monash University, as well as many other colleagues on both sides of the Atlantic for many helpful comments on earlier drafts. The usual caveats apply. The second author acknowledges support from CREATES, funded by the Danish National Research Foundation.


The employed econometric apparatus in the earlier strand of this literature has generally been much simpler, consisting of cross-sectional regressions of firms' stock returns on a set of country and industry dummies for each period. As these dummies are orthogonal in each cross-section, and their estimated coefficients represent the excess return associated with belonging to a given sector and country relative to a global average (the regression's intercept), the contribution of each factor can then be computed in two ways: either by the time series variance of the coefficients estimated in the successive cross-sectional regressions over fixed or rolling time windows of arbitrarily specified lengths, or by the average absolute sum of the coefficients on the sector and country dummies over the chosen window. On this basis, it has been concluded that the country factor typically explains most of the cross-sectional variation in stock returns, with sector- or industry-specific factors accounting for less than 10% on average (Heston and Rouwenhorst, 1994; Beckers et al., 1992; Griffin and Karolyi, 1998), albeit rising in the more recent period (Brooks and Catao, 2000; Brooks and del Negro, 2002; Cavaglia et al., 2000; L'Her et al., 2002). Underlying this approach is thus the assumption that factors driving country- and industry-affiliation effects have very limited dynamics, being either constant or changing only very gradually over time. Although more recent work has overcome some of these limitations by using an arbitrage pricing theory (APT) model where APT factors are extracted from the covariance matrix of returns and re-estimated over fixed intervals (Bekaert, Hodrick, and Zhang, 2005), or by using a GARCH framework (Baele and Inghelbrecht, 2005), this strand of the literature has continued to rely on linear factor specifications.
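The two window-based contribution measures just described can be sketched in a few lines. The snippet below is purely illustrative (the simulated coefficients, their scale, and the 36-month window are assumptions, not the chapter's data or estimates); given a panel of period-by-period country dummy coefficients, it computes (i) the time series variance of each coefficient over a window and (ii) the average absolute coefficient over that window.

```python
import numpy as np

# Hypothetical period-by-period country dummy coefficients gamma_{kt} from the
# successive cross-sectional regressions (simulated, not the chapter's data).
rng = np.random.default_rng(1)
T, K = 120, 13                      # months and countries (13 countries, as in the chapter's sample)
gamma = 0.04 * rng.standard_normal((T, K))

window = 36                         # an arbitrarily specified rolling window, as the text notes
recent = gamma[-window:]

# Measure (i): time series variance of each country's coefficient over the window.
var_measure = recent.var(axis=0, ddof=1)

# Measure (ii): average absolute coefficient over the window.
abs_measure = np.abs(recent).mean(axis=0)

print(var_measure.mean(), abs_measure.mean())
```

Both measures inherit the window length as a tuning choice, which is exactly the arbitrariness the chapter's regime-switching approach is designed to avoid.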

In light of evidence that country factors have been consistently important in driving stock returns, a third strand of the literature has focused on the issue of how they correlate over time and, hence, what scope there is for international equity risk diversification arising from the covariance patterns of equity returns across the various national markets. Unsurprisingly, in view of the evidence of time-varying correlations in many financial return series (Engle, 2002a), it has generally been found that such covariances display considerable time variation (King, Sentana, and Wadhwani, 1994; Engle, Ito, and Lin, 1994; Longin and Solnik, 1995; Bekaert and Harvey, 1995; Karolyi and Stulz, 1996). Further, it has also been found that informational proximity and common institutional factors play a role (Portes and Rey, 2005), as do a variety of macroeconomic measures (Engle and Rangel, 2008), particularly at lower frequencies. Although Portes and Rey (2005) use disaggregated data on equity flows to test the informational gravity view, the bulk of this literature on time-varying national market correlations has typically relied on broad stock indices. Among other things, this does not allow one to distinguish how much of these correlations is due to "pure" country-specific factors or to differences in the sector composition across the various national market indices – an issue that is better addressed with firm-level data and a consistent sector classification across countries. By the same token, the important question of how risk diversification possibilities evolve as the various country and industry factors move into distinct (and not necessarily coincident) volatility regimes is also overlooked in this literature.

Against this background, the contribution of this chapter is twofold. First, we develop a dynamically flexible econometric framework that is capable of addressing the above questions about patterns and sources of international equity market volatility. We do so without imposing unwarranted restrictions featured in previous work, including the assumption of a single volatility regime, the assumption that the contribution of sector- or


country-specific factors cannot discretely change across regimes, or the use of arbitrarily specified rolling windows, which have been well known since Frisch (1933) to be capable of inducing spurious dynamics in the data. There are clear reasons why relaxing these assumptions is important. National policies that influence country risk may display nongradual changes, which have been deemed one culprit for the time-varying nature of stock return volatility (Eichengreen and Tong, 2004) and are also a well-known source of nonlinearities in macroeconomic and financial data (Engel and Hamilton, 1990; Driffill and Sola, 1994). By the same token, widely studied supply shocks such as changes in oil prices are known to have potentially large and discrete effects on equity market volatility, and similarly so the emergence of new technologies. Both are thus potentially capable of radically changing the industry-specific dynamics of stock returns and of generating significant differences in the persistence of high versus low volatility regimes, which cannot typically be accounted for by linear models and/or GARCH-type specifications. All this underscores the need for greater flexibility in modeling the factor dynamics driving stock returns.

The approach we propose consists of two steps. In the first step we form "pure" country and "pure" industry (or sector) portfolios from a large cross-section of firms. Such a country–industry decomposition yields an important benefit relative to the practice of measuring international correlations using broad national indices, in that it permits disentangling the extent to which a given variation in country X's stock index is due to country X's specific (institutional or policy related) factor or, instead, due to say an information technology (IT) shock that affects the country disproportionally simply because of a large weight of the IT sector in that country. No less importantly, this standard procedure of forming portfolios is instrumental for achieving the dimensionality reduction required in the application of richly parameterized models such as ours to large unbalanced panels. By summarizing the relevant firm-level information into a much smaller and hence manageable number of time series, we can then model the dynamics of returns on the various country and sector portfolios in a possibly nonlinear fashion in a second stage, allowing for regime switches in volatility processes. As shown below, once country-, industry- and global factors are each allowed to be in a different volatility regime at any given point in time, this will permit the characterization of a broader array of diversification possibilities than those considered in previous studies.

The second contribution of the chapter lies in applying this methodology to a uniquely long firm-level data set so as to shed light on the substantive questions pertaining to the distinct strands of the literature referred to above. Our sample spans 13 countries over nearly 30 years, compared with at most 15–20 years or so of data in previous studies.1 As it accounts for around 80% of advanced countries' stock market capitalization towards the end of the period and between 56 and 73% of world stock market capitalization over 1973–2002, our data set is thus broadly representative of global stock market developments. We use these data to answer the following questions. First, does the "stylized fact" that country factors overwhelmingly dominate sectoral-affiliation effects hold uniformly, or does it change, whether slowly or rapidly, over time? Second, what is the strength of the

1 The two studies using the longest time series that we are aware of are Brooks and del Negro (2002) and Bekaert, Hodrick, and Zhang (2005). The sample coverage is 1985:1 to 2002:2 in the former, and 1980:1 to 2003:12 in the latter. Thus, neither of these studies incorporates in its estimates the effects of the large shifts in stock market volatility and relative factor contribution following the oil shocks and monetary disturbances of the 1970s, which we document. We return to this issue below.


various individual country and sectoral return correlations within the distinct volatility states (if more than one)? In particular, do we observe tighter equity return correlations within certain groups even after allowing for distinct volatility states and distinct sectoral compositions of the various national indices, consistent with informational gravity models of equity holdings? Finally, what are the implications for international portfolio diversification?

The main results are as follows. First, we find strong evidence of nonlinear dynamic dependencies in both sector and country portfolios, indicating that the dynamic "mixtures of normals" model underlying the Markov switching approach is superior to the single state model; we corroborate this evidence through a variety of tests on model residuals as well as by comparing our model's smoothed probability estimates with nonparametric volatility measures spanning our entire sample. Second, we use this purportedly more accurate gauge provided by our model to estimate that the country factor explains about 50% of market volatility over the entire period on average, as opposed to 16% accounted for by the sector- or industry-specific factor. Thus, while this average contribution of the industry factor is substantially lower than that of the country factor, it is well above that estimated in earlier studies (less than 10%). No less importantly, these relative factor contributions are shown to vary widely across volatility states. The sectoral factor contribution typically rises sharply during major industry-specific shocks (such as the oil shocks of the early and late 1970s and mid-1980s, and the IT boom and bust more recently), the direct counterpart of which is a marked drop in the country factor contribution, down to the 30–35% range.

Third, we provide a new set of measures of international portfolio correlations. As these are model-implied estimates calculated over the various portfolio pairs and conditional upon the entire time series information up to that point, they are not marred by the biases affecting unconditional estimates discussed in Forbes and Rigobon (2001), nor affected by potential biases arising from relying on a small number of observations from a particular volatility state. We find that such correlations vary markedly across states and, in particular, that when both the global and industry factors are in the high volatility state, correlations between country portfolios typically become tighter than correlations across industry portfolios. A key implication is that the sharp rise in country portfolio correlations during high global volatility states undermines the benefits of cross-border diversification during those periods. This effect is further compounded by the finding that such correlations are generally tighter across certain groups of countries (such as Anglo-Saxon countries and some European markets), thus lending support to an information gravity view of cross-border equity flows a la Portes and Rey (2005). Thus, our findings highlight a potentially important connection between global stock market volatility and both the levels and the geographic distribution of international equity flows – an issue which, to our knowledge, is yet to be explored in the literature on the determinants of international capital flows.

The remainder of the chapter is structured as follows. Section 1 lays out the econometric methodology, whereas Section 2 discusses the data. The empirical characterization of the single and joint dynamics of country and industry portfolios and of the global factor is provided in Section 3. Section 4 presents variance decomposition results on the relative contribution of each factor to overall stock return volatility. Section 5 provides an economic interpretation of our model characterization of the volatility states, linking it to the existing literature on the determinants of stock market volatility. Section 6 examines


the within-state portfolio correlations and the respective implications for global risk diversification. Section 7 concludes.

1. Econometric methodology

1.1. Constructing “pure” country and industry portfolios

Panels of individual stock returns are typically highly unbalanced, due to the fact that some firms die whereas others are "born" at some point within any reasonably long time series. To deal with this problem without having to resort to potentially distorting procedures (dropping the observations of both newly born and dead firms to balance the panel and make estimation feasible), we present an approach that does not entail losing the information contained in the time series dynamics of individual country or industry stock return series, nor in the whole cross-sectional dimension of the data. Specifically, we propose a two-stage approach where, in the first stage, we follow Heston and Rouwenhorst (1994) and extract the industry and country returns for a given time period through cross-sectional regressions in which each firm's stock return is defined as:

$$R_{ijkt} = \alpha_t + \beta_{jt} + \gamma_{kt} + \varepsilon_{it}, \qquad (1)$$

where $R_{ijkt}$ stands for the return at time $t$ of the $i$th firm in the $j$th industry and the $k$th country, $\alpha_t$ is a global factor common to all firms, $\beta_{jt}$ is an "excess" return owing to the firm's belonging to industry $j$, $\gamma_{kt}$ is an "excess" return associated with the firm's location in country $k$, and $\varepsilon_{it}$ is an idiosyncratic firm-specific factor.2 This factor structure has been a work-horse in much of the literature on equity market volatility and co-movements, both among studies using firm-level data as well as among those using aggregate country indices (see, e.g., Forbes and Chinn, 2004).3

What has differed among recent studies is whether country and industry factor loads are assumed to be fixed, cross-sectionally varying, time-varying, or both. Although there

2 We have not subtracted a risk-free rate from returns in equation (1). As the model measures returns relative to the benchmark of the average world portfolio, this means that $\alpha_t$ incorporates time-variations in the risk-free world interest rate. As all returns are measured in US dollars, it is thus natural to think of $\alpha_t$ as capturing fluctuations in the three-month US T-bill rate (an oft-used proxy for the world risk-free interest rate). If instead we were to measure returns in the various local currencies relative to the respective country "risk-free" rate, variations in the risk-free rates across different countries would be absorbed in $\gamma_{kt}$.

3 Besides its popularity, there are three particular reasons why we retain this three-factor specification. First, we are interested in country and industry effects in their own right, and we know that a model with both country and industry factors systematically dominates one with country or industry factors alone (see Bekaert, Hodrick, and Zhang, 2005 for a comparison of models with distinct factors). Second, adding a "small cap" factor or a similar proxy related to firm size has been shown to have negligible effects on country–industry decompositions for a similar sample of international firms (Brooks and Catao, 2000). Third, although regional factors are deemed to be important – in particular for Europe (cf. Baele and Inghelbrecht, 2005) – augmenting the model with various regional dummies would be infeasible due to degrees-of-freedom limitations arising from the richly parameterized nature of our regime-switching specification in the second stage estimation. As time goes by and longer time series data become available, however, augmenting the model with regional factors could become a feasible extension to this model.


are advantages of letting the loads vary both across firms and over time as in Bekaert, Hodrick, and Zhang (2005), this choice needs to be traded off against the benefits of modeling the dynamics of factor loadings as a regime-switching process. For letting $\beta$ and $\gamma$ vary both cross-sectionally and over time for each firm would be unfeasible (even for reasonably long time series such as ours) given the already large number of parameters to be estimated with only time-varying loadings, as discussed below. Clearly, fixing the cross-sectional factor loads has the drawback that individual firms may differ in their degree of exposure to the global factor. However, this cost appears to be less consequential in the present context, as we rely on this load homogeneity assumption only to construct country and sector portfolios consisting of hundreds of firms, so that the effect of idiosyncratic factor loadings is largely washed out in the aggregate.4 Further, one other major advantage of doing so – besides that of making the subsequent regime-switching estimation feasible – is to facilitate comparability between our results and those from a large body of the literature, which also uses decomposition schemes based on firm-level homogeneity of factor loads. This allows us to isolate the contribution of our approach relative to earlier studies.

Generalizing to $J$ industries and $K$ countries, equation (1) can be rewritten as:

$$R_{ijkt} = \alpha_t + \sum_{j=1}^{J}e_{ij\beta}\beta_{jt} + \sum_{k=1}^{K}e_{ik\gamma}\gamma_{kt} + \varepsilon_{it}, \qquad (2)$$

where $e_{ij\beta}$ is a dummy variable defined as 1 for the $i$th firm's industry and zero otherwise, whereas $e_{ik\gamma}$ is a dummy defined as 1 for the $i$th firm's country and zero otherwise. As each firm can only belong to one industry and one country at a time, the various industry dummies in (2) will be orthogonal to each other within the cross-section. Likewise, the various country dummies will also be orthogonal to each other.

We can rewrite (2) more succinctly by defining the excess return vectors as:

$$\beta_t = \begin{pmatrix} \beta_{1t} \\ \beta_{2t} \\ \vdots \\ \beta_{Jt} \end{pmatrix}, \qquad \gamma_t = \begin{pmatrix} \gamma_{1t} \\ \gamma_{2t} \\ \vdots \\ \gamma_{Kt} \end{pmatrix},$$

4 This is clear from the evidence presented in Brooks and del Negro (2002) who, after allowing for distinct firm-specific loadings, obtain very similar inferences about the relative factor contributions as those yielded when homogeneity of factor loadings is imposed. As elsewhere in this literature, their results are based on the assumption of a single volatility state. Another important trade-off of allowing for firm-specific factor loadings (which are estimated by a maximum likelihood algorithm as in their study) is that of having to balance the panel, as the algorithm cannot handle missing observations. As discussed above, by eliminating new firm entry and exit from the sample, this panel-balancing procedure is an additional source of potential estimation biases. More recently, Bekaert, Hodrick, and Zhang (2005) compared the performance of several models, including the Heston and Rouwenhorst (1994) model with fixed cross-sectional and time series loads, over shorter sub-periods. Although Bekaert, Hodrick, and Zhang (2005) find that an APT model incorporating both global and local factors fits the covariance structure of stock returns best, their results are restricted to the linear (one-state) version of the model. No less importantly, their study also finds that differences in the measurement of country versus industry contributions between other (one-state) models and the Heston and Rouwenhorst model are mainly due to the time dimension of the sample, rather than to the assumption of unit beta loadings across firms.


so that:

$$R_{ijkt} = \alpha_t + e_{i\beta}'\beta_t + e_{i\gamma}'\gamma_t + \varepsilon_{it}, \qquad (3)$$

where $e_{i\beta}$ is a $J \times 1$ vector of zeros with a one in the $i$th firm's industry, whereas $e_{i\gamma}$ is a $K \times 1$ vector of zeros with a one in the $i$th firm's country.

As equation (3) cannot be estimated as it stands because of perfect multicollinearity (as every company belongs to both an industry and a country, whereas the industry and country effects can only be measured relative to a benchmark), we follow the literature by imposing the restriction that the weighted sum of industry and country effects equals zero at every point in time; the industry and country effects are thus estimated as deviations from a common benchmark, the return on the global factor captured by the intercept $\alpha$. Subject to these zero-sum restrictions, equation (3) can be estimated using weighted least squares, with each stock return being weighted by its beginning-of-period share $x_i$ of the global stock market capitalization (computed as the sum of the market capitalization of all the $N$ firms comprising the cross-section). An advantage of constructing country and industry portfolios this way is that the number of firms in each cross-section can vary and yet the panel of portfolios of country- and sector- or industry-specific excess returns is balanced. This procedure therefore effectively summarizes the relevant information from the original unbalanced panel.
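The restricted weighted least squares step can be made concrete with a small simulation. The sketch below is illustrative only: firm counts, effect sizes, and the noise level are assumptions, and the zero-sum restrictions are imposed by substituting out the last industry and country effect, one standard implementation that the chapter does not itself prescribe.

```python
import numpy as np

rng = np.random.default_rng(2)
N, J, K = 2000, 4, 3                                   # firms, industries, countries (toy sizes)
ind = rng.integers(0, J, N)                            # each firm's industry
cty = rng.integers(0, K, N)                            # each firm's country
w = rng.random(N); w /= w.sum()                        # beginning-of-period cap shares x_i

wj = np.array([w[ind == j].sum() for j in range(J)])   # industry cap weights
wk = np.array([w[cty == k].sum() for k in range(K)])   # country cap weights

# Simulate one cross-section from equation (1), with effects obeying the
# zero-sum restrictions sum_j wj*beta_j = 0 and sum_k wk*gamma_k = 0.
alpha = 0.01
beta = 0.02 * rng.standard_normal(J); beta -= wj @ beta
gamma = 0.03 * rng.standard_normal(K); gamma -= wk @ gamma
R = alpha + beta[ind] + gamma[cty] + 0.001 * rng.standard_normal(N)

def restricted_dummies(idx, weights):
    """Dummy columns with the last category substituted out via the zero-sum restriction."""
    m = len(weights)
    D = np.zeros((len(idx), m - 1))
    for j in range(m - 1):
        D[:, j] = (idx == j) - (idx == m - 1) * (weights[j] / weights[m - 1])
    return D

X = np.column_stack([np.ones(N), restricted_dummies(ind, wj), restricted_dummies(cty, wk)])
sw = np.sqrt(w)                                        # weighted least squares via sqrt-weighting
coef, *_ = np.linalg.lstsq(X * sw[:, None], R * sw, rcond=None)

a_hat = coef[0]
b_hat = np.append(coef[1:J], -(wj[:-1] @ coef[1:J]) / wj[-1])          # recover last industry effect
g_hat = np.append(coef[J:J + K - 1], -(wk[:-1] @ coef[J:J + K - 1]) / wk[-1])
print(abs(a_hat - alpha), abs(wj @ b_hat))
```

The recovered intercept plays the role of the global factor $\alpha_t$, and the recovered dummies are the "pure" industry and country excess returns for that cross-section; the restriction $\sum_j w_j\hat\beta_j = 0$ holds exactly by construction.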

1.2. Modeling stock return dynamics

Whereas the earlier literature has not attempted to link the individual industry ($\beta_t$) and country components ($\gamma_t$) over time, we will allow for such dependencies in these components in a flexible manner that does not impose linearity or serial independence a priori. In doing so, we follow the large empirical literature that has documented the presence of persistent regimes in a variety of financial time series (Ang and Bekaert, 2002; Engel and Hamilton, 1990; Driffill and Sola, 1994; Hamilton, 1988; Kim and Nelson, 1999b; Perez-Quiros and Timmermann, 2000). Typically these studies capture periods of high and low volatility in univariate series or in pairs of series (e.g. Ang and Bekaert, 2002; Perez-Quiros and Timmermann, 2000). In what follows we extend this approach to multi-country/multi-sector portfolios.

Let $s_{\alpha t}$, $s_{\beta jt}$, $s_{\gamma kt}$ be separate state variables driving returns on the global, industry, and country portfolios, respectively. We show in the empirical section that the data justify this assumption. If, furthermore, these state variables are industry- and country-specific, we can write returns on the global, industry, and country portfolios as:

$$\alpha_t = \mu_{\alpha s_{\alpha t}} + \sigma_{\alpha s_{\alpha t}}\varepsilon_{\alpha t},$$

$$\beta_{jt} = \mu_{\beta j s_{\beta jt}} + \sigma_{\beta j s_{\beta jt}}\varepsilon_{\beta jt}, \qquad (4)$$

$$\gamma_{kt} = \mu_{\gamma k s_{\gamma kt}} + \sigma_{\gamma k s_{\gamma kt}}\varepsilon_{\gamma kt}.$$

Suppose, for example, that there are two states for the global return process, so $s_{\alpha t} = 1$ or $s_{\alpha t} = 2$. Then the mean of the global return component in any given period, $t$, is either $\mu_{\alpha 1}$ or $\mu_{\alpha 2}$, whereas its volatility is either $\sigma_{\alpha 1}$ or $\sigma_{\alpha 2}$. Similarly, if the $j$th industry state variable can take two values, $s_{\beta jt} = 1$ or $s_{\beta jt} = 2$, then the $j$th industry's mean return at time $t$ is either $\mu_{\beta j1}$ or $\mu_{\beta j2}$, whereas its volatility is either $\sigma_{\beta j1}$ or $\sigma_{\beta j2}$.


How the state processes alternate between states is obviously important. We follow conventional practice and assume constant state transition probabilities for the global return process as well as for the individual country and industry return processes:

$$\Pr(S_{\alpha t} = s_\alpha \mid S_{\alpha t-1} = s_\alpha) = p_{\alpha s_\alpha s_\alpha},$$

$$\Pr(S_{\beta jt} = s_{\beta j} \mid S_{\beta jt-1} = s_{\beta j}) = p_{\beta j s_{\beta j} s_{\beta j}}, \qquad (5)$$

$$\Pr(S_{\gamma kt} = s_{\gamma k} \mid S_{\gamma kt-1} = s_{\gamma k}) = p_{\gamma k s_{\gamma k} s_{\gamma k}}.$$

Here $p_{\alpha 11}$ is the probability that the global return process remains in state 1 if it is already in this state, $p_{\beta j11}$ is the probability that the $j$th industry state variable remains in state 1, and so forth. This means that the regimes are generated by a discrete state homogeneous Markov chain. We will be interested in studying the state probabilities implied by our models given the current information set, $\Gamma_t$, which comprises all information up to time $t$, i.e., $\pi_{s_\alpha t} = \Pr(S_{\alpha t} = s_\alpha \mid \Gamma_t)$, $\pi_{s_{\beta j}t} = \Pr(S_{\beta jt} = s_{\beta j} \mid \Gamma_t)$, $\pi_{s_{\gamma k}t} = \Pr(S_{\gamma kt} = s_{\gamma k} \mid \Gamma_t)$. As we shall see in the empirical section, the time series of these probabilities extracted from the data provide information about high and low volatility states. Finally, we assume that the innovation terms $\varepsilon_{\alpha t}$, $\varepsilon_{\beta jt}$, and $\varepsilon_{\gamma kt}$ are normally distributed. This implies that the return process will be a mixture of normal random variables, the resulting distribution of which is capable of accommodating features such as skews and fat tails that are frequently found in financial data, cf. Timmermann (2000).

Under this model, the return on the $i$th company in industry $j$ and country $k$ is affected by separate global, industry, and country regimes plus an idiosyncratic error term:

$$R_{ijkt} = \mu_{\alpha s_{\alpha t}} + \mu_{\beta j s_{\beta jt}} + \mu_{\gamma k s_{\gamma kt}} + \sigma_{\alpha s_{\alpha t}}\varepsilon_{\alpha t} + \sigma_{\beta j s_{\beta jt}}\varepsilon_{\beta jt} + \sigma_{\gamma k s_{\gamma kt}}\varepsilon_{\gamma kt} + \varepsilon_{it}. \qquad (6)$$
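Equation (6) implies that firm-level returns are a mixture of normals, which, as noted above via Timmermann (2000), can accommodate fat tails. A short simulation under hypothetical parameter values (all means, volatilities, and staying probabilities below are assumptions for illustration, not estimates from the chapter) makes this visible in the sample kurtosis.

```python
import numpy as np

rng = np.random.default_rng(3)

def markov_chain(T, p_stay, rng):
    """Two-state chain (states 0/1) with symmetric staying probability p_stay."""
    s = np.zeros(T, dtype=int)
    for t in range(1, T):
        s[t] = s[t - 1] if rng.random() < p_stay else 1 - s[t - 1]
    return s

T = 100_000
mu = np.array([0.01, -0.01])       # hypothetical state means
sig = np.array([0.02, 0.10])       # low- vs high-volatility state

# Independent state variables for the global, one industry, and one country
# component, as in equation (6), plus an idiosyncratic term eps_it.
r = 0.01 * rng.standard_normal(T)
for p_stay in (0.98, 0.95, 0.95):
    s = markov_chain(T, p_stay, rng)
    r += mu[s] + sig[s] * rng.standard_normal(T)

# The resulting mixture of normals is leptokurtic relative to a single Gaussian.
z = (r - r.mean()) / r.std()
kurt = np.mean(z ** 4)             # Gaussian benchmark is 3
print(round(kurt, 2))
```

Persistence in the state variables (the staying probabilities) also generates volatility clustering, which is why the smoothed state probabilities discussed later track nonparametric volatility measures.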

It is possible, however, that the state variables driving the industry and country returns share an important common component across industries and countries, respectively. This could be induced, for example, by an oil shock, to the extent that the latter tends to have a large differential effect across industries and a far more homogeneous effect across countries. Similarly, one can think of a number of common shocks of political origin, for instance a war or a large-scale terrorist attack, that spread mainly along country lines as opposed to industry lines.

If so, a more efficient way to gain information about the underlying state variable is to estimate a multivariate regime-switching model jointly for several portfolios. To account for the possibility that a common state factor is driving the individual industry returns on the one hand and the individual country returns on the other, we consider the following model:

\[
\begin{aligned}
\alpha_t &= \mu_{\alpha s_{\alpha t}} + \varepsilon_{\alpha s_{\alpha t}},\\
\beta_t &= \mu_{\beta s_{\beta t}} + \varepsilon_{\beta s_{\beta t}}, \qquad (7)\\
\gamma_t &= \mu_{\gamma s_{\gamma t}} + \varepsilon_{\gamma s_{\gamma t}},
\end{aligned}
\]

where \(\mu_{\alpha s_{\alpha t}}\) is the scalar global mean return in state \(s_{\alpha t}\), \(\mu_{\beta s_{\beta t}}\) is a J-vector of industry means in state \(s_{\beta t}\), and \(\mu_{\gamma s_{\gamma t}}\) is a K-vector of country means in state \(s_{\gamma t}\). Furthermore, the innovations to returns are assumed to be Gaussian with zero mean and state-specific variances, \(\varepsilon_{\alpha s_{\alpha t}} \sim (0, \sigma^2_{\alpha s_{\alpha t}})\), \(\varepsilon_{\beta s_{\beta t}} \sim (0, \Omega_{\beta s_{\beta t}})\), \(\varepsilon_{\gamma s_{\gamma t}} \sim (0, \Omega_{\gamma s_{\gamma t}})\), where \(\sigma^2_{\alpha s_{\alpha t}}\) is the scalar variance of the global return in state \(s_{\alpha t}\), \(\Omega_{\beta s_{\beta t}}\) is the \(J \times J\) variance–covariance matrix of industry returns in state \(s_{\beta t}\), and \(\Omega_{\gamma s_{\gamma t}}\) is the \(K \times K\) variance–covariance matrix of country returns in state \(s_{\gamma t}\).

State transitions for this common factor case are still assumed to be time-invariant:

\[
\begin{aligned}
\Pr(S_{\alpha t} = s_\alpha \mid S_{\alpha,t-1} = s_\alpha) &= p_{\alpha, s_\alpha s_\alpha},\\
\Pr(S_{\beta t} = s_\beta \mid S_{\beta,t-1} = s_\beta) &= p_{\beta, s_\beta s_\beta}, \qquad (8)\\
\Pr(S_{\gamma t} = s_\gamma \mid S_{\gamma,t-1} = s_\gamma) &= p_{\gamma, s_\gamma s_\gamma}.
\end{aligned}
\]

The regime-switching model is fully specified by the state transitions (8), the return equations (3) and (7), and the assumed "mixture of normals" density. However, estimation of the model is complicated by the fact that the state variable is unobserved, or latent. We deal with this by obtaining maximum likelihood estimates based on the EM algorithm (see Hamilton, 1994, for details).
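The filtering recursion at the heart of this likelihood evaluation is standard (Hamilton, 1994). A minimal univariate sketch — a deliberate simplification of the chapter's multivariate setting, with hypothetical argument names — computes the filtered state probabilities and the log likelihood for a k-state Gaussian model:

```python
import numpy as np

def hamilton_filter(r, mu, sigma, P, pi0):
    """Filtered state probabilities Pr(S_t = s | r_1, ..., r_t) and the
    log likelihood of a k-state Gaussian regime-switching model.
    r: (T,) returns; mu, sigma: (k,) state means and std deviations;
    P[i, j] = Pr(S_t = j | S_{t-1} = i); pi0: (k,) initial distribution."""
    r, mu, sigma = map(np.asarray, (r, mu, sigma))
    T, k = len(r), len(mu)
    filt = np.empty((T, k))
    pred = np.asarray(pi0, dtype=float)  # Pr(S_t = s | info up to t-1)
    loglik = 0.0
    for t in range(T):
        # Gaussian density of r_t under each state
        dens = np.exp(-0.5 * ((r[t] - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
        joint = pred * dens
        lik = joint.sum()
        loglik += np.log(lik)
        filt[t] = joint / lik            # Bayes update: filtered probabilities
        pred = filt[t] @ P               # predict next period's state
    return filt, loglik
```

In an EM pass, these filtered probabilities would be smoothed backward and the state means, variances and transition probabilities re-estimated from the probability-weighted data; that step is omitted here.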

A major advantage of our common nonlinear factor approach is that it allows us to extract volatility estimates of portfolio strategies involving an arbitrary number of countries or industries in addition to the global component. As discussed in Solnik and Roulet (2000), the standard way to capture time-variation in market volatility and correlations is to use a fixed-length rolling window of, say, 36 or 60 months of returns data and estimate cross-correlations for pairs of countries. This approach has three major disadvantages compared to ours. First, it does not rely on the full data sample, likely leading to imprecise estimates of volatilities and correlations, which typically require relatively large samples for precise estimation. Second, because they are by construction moving averages of volatilities, rolling window estimates cannot capture relatively short-lived volatility bursts that may be important for investment risk. Third, rolling window estimates provide unconditional estimates of volatilities and correlations and do not exploit any dynamic structure in the covariance of portfolio returns, other than indirectly, as the parameter estimates get updated over time. In contrast, the proposed regime-switching framework can capture richer dynamics: whereas the mean and variance of returns are constant within each state, the state probabilities vary over time, either gradually (if the filtered state probabilities change slowly) or rapidly (if they move more suddenly).
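For concreteness, the rolling-window estimator criticized above amounts to something like the following sketch (the 60-month default mirrors the window lengths mentioned by Solnik and Roulet; equal weighting within the window is an assumption):

```python
import numpy as np

def rolling_vol(r, window=60):
    """Trailing-window sample standard deviation: the conventional
    time-varying volatility estimate based on the last `window` months."""
    r = np.asarray(r, dtype=float)
    out = np.full(r.shape, np.nan)          # undefined until a full window exists
    for t in range(window - 1, len(r)):
        out[t] = r[t - window + 1 : t + 1].std(ddof=1)
    return out
```

A one-month volatility burst raises each of the next `window` estimates by the same small amount, which is exactly why such moving averages cannot isolate short-lived bursts.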

2. Data

The data cover monthly total returns and market capitalizations for up to 3,951 firms in developed stock markets over the period February 1973 to February 2002.⁵ Country

⁵Monthly total returns are computed in local currency using data from Datastream/Primark. The return calculation assumes immediate reinvestment of dividends. These local currency returns are converted to US dollars using end-of-month spot exchange rates. The beginning-of-month stock market capitalizations are converted into US dollars using the beginning-of-month dollar price of one unit of local currency. Expressing all returns and market cap data in US dollars implicitly reflects the perspective of a currency unhedged equity investor whose objective is to maximize US dollar returns. To the extent that changes in equity returns overwhelm those associated with currency fluctuations, expressing


266 Volatility regimes and global equity returns

coverage spans Australia, Belgium, Canada, Denmark, France, Germany, Ireland, Italy, Japan, Netherlands, Switzerland, the United Kingdom, and the United States. Although data are available for other advanced stock markets (notably Austria, Norway, and Sweden) from the late 1970s/early 1980s, including them would entail a shorter estimation period, and the attendant degrees-of-freedom constraints would render the estimation infeasible. The exclusion of emerging markets in particular from our sample is unlikely to alter the main results. Recent work that includes both mature and emerging markets in (value weighted) regressions for the post-1985 period finds that trends in the relative contribution of country and industry factors are basically the same regardless of whether one includes or excludes the emerging market subsample (Brooks and Catao, 2000; Brooks and del Negro, 2002).

Firms in these 13 countries are then grouped into one of 11 FTSE industry sectors: resources, basic industries, general industries, cyclical consumer goods, noncyclical consumer goods, cyclical services, noncyclical services, utilities, information technology, financials and others. Although some recent papers argue in favor of a finer industry classification, the level of aggregation used here is sufficient not only because it follows the traditional industry breakdown used by portfolio managers and much of the academic literature, but also because it clearly distinguishes new industries that appear to have distinct time series dynamics of stock returns (such as information technology).⁶

A desirable feature of these data is that they are a realistic and unbiased representation of the global stock market. As of December 1999, the total capitalization of the sample comes to $26.3 trillion, or 80% of stock market capitalization in advanced countries as measured by the IFC yearbook and 73% of world market capitalization (i.e., including developing countries). Coverage deteriorates somewhat towards the beginning of the sample, but because the data comprise the largest and internationally most actively traded firms in key markets such as the United States, Japan, and the United Kingdom throughout, the sample can be deemed quite representative from the viewpoint of a global investor. It should be noted, however, that the deterioration in coverage reflects two deficiencies of the data set. First, it is subject to survivorship bias, meaning that only firms surviving over the full sample period are covered. Although this bias no doubt affects average rates of return, it does not seem to be too consequential for the analysis

returns and market caps in the distinct national currencies should not change the thrust of the results, as earlier work using a subset of these data has found (Brooks and Catao, 2000). This is consistent with what other researchers have found using other data sets (Heston and Rouwenhorst, 1994; Griffin and Karolyi, 1998; Griffin and Stultz, 2001). One possible reason lies in the exchange rate hedging used by many large firms in developed countries with extensive international operations, which comprise a sizeable portion of those data sets. Developing country firms were excluded from the sample altogether because none of those reported in the Datastream/Primark data set had sufficiently long series, entailing too short a time span for the respective country portfolio to be included in the estimation of the Markov switching regressions.

⁶Although Griffin and Karolyi (1998) note that a finer industry disaggregation may yield a more accurate measure of industry effects, their main result – the far greater dominance of country-specific effects – hardly changes with the move to a finer industry breakdown. An alternative breakdown that groups IT with media and telecoms under a broader TMT group has been considered in Brooks and Catao (2000) and Brooks and del Negro (2002), who show that the dynamics of the broader TMT group is largely dominated by the IT sector from the 1990s. In the FTSE industry classification used in this chapter, media is grouped under cyclical services and telecommunications under noncyclical services.


of relative factor contributions to market volatility, which is the central concern of this chapter. This can be gauged from the very small differences between the results obtained from an application of the Heston–Rouwenhorst decomposition scheme to a subsample that includes dead firms for the post-1986 period (when such a list is available) and the counterpart in our data, which does not include such firms.⁷ The second deficiency of the data is that of including only post-merger companies, dropping companies that go into the merger. The most likely effect of this is to bias the estimates in favor of finding more pronounced global industry effects in the more recent years in the sample; but as this problem applies only to a few firms, it is also likely to have a very limited effect on the estimates.

On the positive side, our sample stretches over a much longer time period than those in the studies referred to above. This is a crucial advantage, required for precise estimation of regime-switching processes. As we shall see, most regimes tend to be quite persistent, so identifying them requires a time series as long as that considered in our study. No country is represented by fewer than 28 firms on average (Ireland and Denmark), and, in the case of large economies such as the US and Japan, coverage approaches 1,000 firms towards the end of the sample from a minimum of 377 firms at the beginning of the sample (February 1973). This reasonably large time series and cross-sectional dimension of the data probably eliminates any significant distortion in the econometric results arising from the deficiencies mentioned above.

3. Global stock return dynamics

Table 13.1 presents some summary statistics for the distribution of the country, industry and world portfolios. All country and industry portfolio returns are measured in excess of the world portfolio, so the mean returns on these portfolios are close to zero on average.⁸

Standard deviations average 4.89% per month for the country portfolios and 2.96% for the industry portfolios, thus verifying the finding in the literature that, on average, country factors matter more than industry factors for explaining variations in stock returns. Country portfolios tend to be slightly more positively skewed than the industry portfolios whereas, interestingly, returns on the global portfolio are not skewed. There is also strong evidence of excess kurtosis in most of the portfolios. Accordingly, Jarque–Bera test statistics reject the null of normally distributed returns for all portfolios except Switzerland and Japan.⁹ This is the type of situation where mixtures of normals may better capture the underlying return distribution.

⁷For a subsample including dead firms, Brooks and del Negro (2002) obtain the following figures for the capitalization-weighted time series variance of country effects (also gauged by the same parameter γ_kt in the Heston–Rouwenhorst decomposition scheme): 18.47 over 1986:3 to 1990:2; 21.08 over 1990:3 to 1994:2; and 9.12 between 1994:3 and 1998:2. Using the same 4-year fixed windows and a similar group of mature markets but without including dead stocks, our respective estimates are 19.21, 22.10, and 8.80. So the differences are small and have no discernible effect on trends.

⁸The only reason the averages are not exactly equal to zero is that we are reporting arithmetic averages, whereas the world portfolio is based on capitalization-weighted returns.

⁹Section 3.3 below reports the results of the normality tests for our fitted model, which show that residuals become broadly normal once conditioned on the regime moments, thus providing strong support for the proposed regime-switching approach.


Table 13.1. Summary statistics for the country, industry, and world portfolio returns

                          mean    s.d.    skew   kurtosis
(a) Country portfolios
US                       −0.12    2.77   −0.42    2.48
UK                        0.07    5.07    1.81   14.8
France                    0.10    5.25    0.27    1.32
Germany                  −0.29    5.02   −0.09    0.81
Italy                    −0.12    7.28    0.38    1.71
Japan                     0.11    4.63    0.02    0.58
Canada                   −0.29    3.85   −0.34    0.55
Australia                −0.15    6.27   −0.25    1.67
Belgium                  −0.22    4.65    0.60    1.87
Denmark                  −0.10    5.32    0.33    1.33
Ireland                   0.18    6.11    0.55    2.72
Netherlands              −0.12    3.31   −0.04    1.02
Switzerland              −0.28    4.04   −0.02    0.09
Average                  −0.09    4.89    0.22    2.38

(b) Industry portfolios
Resources                −0.12    3.74    0.03    0.88
Basic                    −0.19    2.52    0.06    3.71
General industry         −0.05    1.78   −0.40    1.24
Cyclical durables        −0.09    3.24   −0.30    1.22
Noncycl. durables        −0.05    2.45   −0.51    4.27
Cyclical services        −0.06    1.61    0.01    0.68
Noncycl. services        −0.17    3.72    0.88    3.11
Utilities                −0.28    4.07    0.93    6.46
Information technology    0.18    4.34    0.50    3.01
Financials                0.00    2.28   −0.16    4.78
Others                   −0.51    2.79    0.21    2.62
Average                  −0.12    2.96    0.11    2.91

(c) World                 1.71    4.34   −0.04    0.79

This table reports descriptive statistics for the country, industry and global portfolios using the decomposition (2) subject to the constraints (3), (4). Returns are measured at the monthly frequency over the period February 1973–February 2002 and are based on a data set covering up to 3,951 firms in developed stock markets.

3.1. Nonlinearity in returns

Previous studies of country and industry effects in international stock returns have been based on the assumption of a single state, so it is important to investigate the validity of this assumption. To determine whether a regime-switching model is appropriate for our analysis, we first verify that two or more states characterize the return generating process of the individual industry and country portfolios. For this purpose we report the outcome of the statistical test proposed by Davies (1977), which, unlike


Table 13.2. Tests for multiple states

(a) Country portfolios
           US      UK      France  Germany  Italy   Japan   Canada
p value    0.000   0.000   0.000   0.004    0.000   0.005   0.352

           Australia  Belgium  Denmark  Ireland  Netherlands  Switzerland
p value    0.000      0.000    0.000    0.000    0.071        0.341

(b) Industry portfolios
           Resources  Basic  General ind.  Cyc. cons. goods  Noncyc. cons.  Cyc. serv.  Noncyc. serv.
p value    0.006      0.000  0.000         0.000             0.000          0.271       0.000

           Utilities  Inf. technology  Financials  Other
p value    0.000      0.000            0.000       0.000

(c) Global factor
p value    0.000

This table reports Davies' (1977) p values for the test of a single state, accounting for unidentified nuisance parameters under the null hypothesis of a single state. P values below 0.05 indicate the presence of more than one state.

standard likelihood ratio tests, has the advantage of taking into account the problem associated with unidentified nuisance parameters under the null hypothesis of a single regime. The results are shown in Table 13.2. For 10 out of 13 countries and 10 of 11 industries, the null of a single state is rejected at the 1% level. Linearity is also strongly rejected for the global portfolio. Hence, there is overwhelming evidence of nonlinear dynamics in the form of multiple regimes in country, industry and global returns.

These results suggest that there are at least two regimes in the vast majority of return series. However, they do not tell us whether two, three or even more states are needed to model the return dynamics. To choose among model specifications with multiple states, Table 13.3 reports the results of three standard information criteria that are designed to trade off fit (which automatically improves with the number of parameters and thus with the number of underlying states) against parsimony (as measured by the total number of parameters). We report results using the Akaike (AIC), the Schwarz Bayesian (BIC) and the Hannan–Quinn (HQ) information criteria. For the 13 country portfolios, the three criteria unanimously point to a single state for Canada and Switzerland and three states for the UK, and at least two of the above criteria suggest that stock returns in all other countries are better modeled as a two-state process.¹⁰

Turning to the industry portfolios, the results are even more homogeneous, with the BIC and HQ criteria selecting a two-state model for nine industries out of 11. At the same time, all three criteria indicate that stock returns in resources are best captured by a three-state model. Only for cyclical services is there considerable disagreement, with the BIC and HQ choosing a single-state model whereas the AIC selects a three-state specification. Finally, regarding the global portfolio, the AIC and HQ choose a two-state specification, whereas the BIC marginally selects a single-state specification.

¹⁰The finding of a single state for Canada and Switzerland is consistent with the Davies tests in Table 13.2, which could not reject linearity for these two countries.


Table 13.3. Selection criteria for individual country, industry, and global portfolios

                                    AIC                    BIC                    HQ
                             k=1    k=2    k=3      k=1    k=2    k=3      k=1    k=2    k=3

(a) Country portfolios
US                          4.882  4.700  4.728    4.904  4.767  4.860    4.891  4.727  4.780
UK                          6.093  5.803  5.720    6.115  5.869  5.852    6.101  5.829  5.773
France                      6.163  6.045  6.050    6.185  6.111  6.182    6.172  6.071  6.102
Germany                     6.074  6.049  6.069    6.096  6.115  6.201    6.083  6.075  6.121
Italy                       6.817  6.731  6.750    6.839  6.798  6.882    6.826  6.758  6.802
Japan                       5.910  5.886  5.894    5.932  5.952  6.027    5.919  5.912  5.947
Canada                      5.543  5.549  5.559    5.565  5.615  5.693    5.552  5.575  5.613
Australia                   6.518  6.435  6.455    6.540  6.501  6.587    6.526  6.461  6.508
Belgium                     5.920  5.856  5.861    5.942  5.923  5.994    5.928  5.883  5.914
Denmark                     6.189  6.133  6.151    6.211  6.199  6.284    6.197  6.160  6.204
Ireland                     6.468  6.327  6.306    6.490  6.393  6.438    6.477  6.354  6.359
Netherlands                 5.241  5.235  5.250    5.263  5.301  5.383    5.250  5.261  5.303
Switzerland                 5.640  5.646  5.660    5.663  5.712  5.793    5.649  5.672  5.713

(b) Industry portfolios
Resources                   5.485  5.462  5.356    5.507  5.528  5.488    5.493  5.488  5.409
Basic                       4.697  4.514  4.490    4.719  4.581  4.623    4.706  4.541  5.543
General industry            4.000  3.893  3.907    4.022  3.959  4.040    4.009  3.920  3.960
Cyclical consumer goods     5.200  5.120  5.110    5.222  5.185  5.242    5.209  5.145  5.163
Noncyclical consumer goods  4.638  4.340  4.359    4.660  4.407  4.492    4.647  4.367  4.412
Cyclical services           3.795  3.798  3.768    3.817  3.865  3.900    3.803  3.825  3.821
Noncyclical services        5.473  5.396  5.373    5.495  5.462  5.506    5.482  5.422  5.426
Utilities                   5.653  5.448  5.463    5.675  5.514  5.595    5.662  5.474  5.516
Information technology      5.783  5.484  5.460    5.805  5.550  5.593    5.792  5.510  5.513
Financials                  4.494  4.225  4.224    4.513  4.292  4.357    4.500  4.252  4.277
Others                      4.898  4.809  4.822    4.921  4.875  4.954    4.907  4.835  4.875

(c) Global                  5.781  5.741  5.749    5.803  5.808  5.881    5.790  5.768  5.802

This table shows the values of various information criteria used to determine whether a single-state (k = 1), a two-state (k = 2), or a three-state (k = 3) model is chosen for the country, industry and global portfolios. AIC gives the value of the Akaike information criterion, BIC the Schwarz Bayesian information criterion and HQ the Hannan–Quinn information criterion. The lowest value of each criterion indicates the preferred specification.

Overall, therefore, the results in Table 13.3 strongly indicate the presence of two states in the dynamics of the various portfolio returns. Accordingly, the subsequent analysis is based on this specification.
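The magnitudes in Table 13.3 suggest per-observation versions of the three criteria; a sketch under that assumption (the chapter does not state the exact normalization it uses) is:

```python
import numpy as np

def info_criteria(loglik, n_params, n_obs):
    """Per-observation AIC, BIC, and Hannan-Quinn criteria. Each trades off
    fit (-2 log L) against a penalty growing in the parameter count; the
    specification with the lowest value is preferred."""
    aic = (-2.0 * loglik + 2.0 * n_params) / n_obs
    bic = (-2.0 * loglik + n_params * np.log(n_obs)) / n_obs
    hq = (-2.0 * loglik + 2.0 * n_params * np.log(np.log(n_obs))) / n_obs
    return aic, bic, hq
```

With T = 349 monthly observations, the per-parameter penalties are 2 (AIC) < 2 log log T ≈ 3.5 (HQ) < log T ≈ 5.9 (BIC), so the BIC penalizes extra states hardest, consistent with its occasional preference for fewer states in Table 13.3.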

3.2. Joint portfolio dynamics

Addressing the question of the overall importance of industry and country effects requires studying common country and common industry effects. As discussed in Section 2, we do this using a nonlinear dynamic common factor specification, which is distinct from the vast majority of recent work on dynamic factor models (cf. Stock and Watson, 1998) in


Table 13.4. Estimation results for the common component models

            Stayer prob.       Ergodic prob.      Duration (months)   Davies test
            State 1  State 2   State 1  State 2   State 1  State 2
Country     0.975    0.976     0.486    0.514     40.1     42.5       0.0000
Industry    0.870    0.962     0.226    0.774     7.7      26.4       0.0000
Global      0.922    0.899     0.565    0.435     12.9     9.9        0.0004

This table reports maximum likelihood estimates of the transition probability parameters of the regime-switching model (9), (10) fitted to the common state model for countries, industries or the global portfolio. The state transition probabilities give the probabilities of remaining in state 1 and state 2, respectively. Steady state or ergodic probabilities give the average fraction of time spent in the two states, whereas the state durations are the average time spent without exiting from the states (in months). The Davies test is for the null of a single state versus the alternative of multiple states.

that it does not impose a linear factor structure. This distinction is particularly important when the main interest lies in extracting common factors in the volatility of returns on various portfolios, given overwhelming empirical evidence of time-varying volatilities in stock returns.

We estimate the proposed joint regime-switching model for the return series on the 13 country portfolios and 11 industry portfolios. To our knowledge, regime-switching models on such large systems of variables have not previously been estimated. The joint estimation of the parameters of a highly nonlinear model for such a large system is a nontrivial exercise. Yet, it can yield valuable insights into the joint dynamics of portfolio returns, as discussed below.

Table 13.4 presents estimates of the transition probabilities and average state durations, together with the outcome of the Davies test for multiple states. Volatility estimates are shown in Table 13.5, which also presents results for the global portfolio.¹¹ As expected, the null hypothesis of a linear model with a single state is strongly rejected for the country, industry, and world models. All three information criteria support a two-state model over the single-state model in the case of the joint industry and joint country models, whereas both the AIC and the HQ criterion support the two-state specification over the one-state model for the global return model. Table 13.4 also shows that the two states identified in country returns have persistence parameters of 0.975 and 0.976, implying long average durations of 40 and 42 months, respectively. Clearly the model is picking up long-lasting regimes in the common component of the country portfolios. The average volatility is around 4.9% in both states, so here the states are no longer distinguished by high versus low average volatility.

Different results emerge from the parameter estimates for the joint industry model. In the low volatility state (state 2) the average volatility is 2.27%, whereas it is more than twice as high in the high volatility state (4.67%). Average correlations are now negative in the low volatility state and zero in the high volatility state. The state transition probabilities for the industry returns listed in Table 13.4, at 0.87 and 0.96, are quite high and imply

¹¹As the joint country model has 210 parameters and the joint industry model has 156 parameters (most of which measure the covariances between industry returns in the two states), we do not report all the estimates and instead concentrate on the standard deviations.


Table 13.5. Volatility estimates for the common component models

(a) Common country component
              State 1  State 2
US            3.47     1.85
UK            3.76     6.04
France        5.54     4.94
Germany       5.14     4.84
Italy         7.56     6.99
Japan         4.46     4.76
Canada        3.92     3.75
Australia     6.39     6.13
Belgium       4.90     4.38
Denmark       5.14     5.46
Ireland       5.32     6.75
Netherlands   3.01     3.56
Switzerland   4.16     3.89
Average       4.88     4.95

(b) Common industry component
                             State 1  State 2
Resources                    5.66     2.99
Basic                        3.96     1.95
General industry             2.60     1.47
Cyclical consumer goods      4.80     2.62
Noncyclical consumer goods   4.25     1.62
Cyclical services            2.01     1.46
Noncyclical services         5.65     2.98
Utilities                    6.69     2.97
Information technology       7.41     2.99
Financials                   3.70     1.68
Others                       2.76     2.79
Average                      4.67     2.27

(c) Global component
              State 1  State 2
              5.27     2.67

This table reports maximum likelihood estimates of the volatility parameters of the regime-switching model (9), (10) fitted to the common state model for countries or industries. The models thus extract a nonlinear state variable common across the country or across the industry portfolios.

average durations of eight months in regime 1 and 26 months in regime 2. Consequently, the steady state probabilities are 23 and 77%, so that roughly three times as much time is spent by the industry portfolios in the low volatility regime (state 2).
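The ergodic probabilities and durations in Table 13.4 follow mechanically from the "stayer" probabilities of a two-state Markov chain, which makes for a quick sanity check:

```python
def chain_summary(p11, p22):
    """Ergodic (steady state) probabilities and expected durations (in months)
    implied by the 'stayer' probabilities of a two-state Markov chain."""
    pi1 = (1.0 - p22) / ((1.0 - p11) + (1.0 - p22))  # ergodic Pr(state 1)
    dur1 = 1.0 / (1.0 - p11)                         # expected sojourn in state 1
    dur2 = 1.0 / (1.0 - p22)                         # expected sojourn in state 2
    return pi1, 1.0 - pi1, dur1, dur2

# Common industry factor, Table 13.4: stayer probabilities 0.870 and 0.962.
pi1, pi2, dur1, dur2 = chain_summary(0.870, 0.962)
print(round(pi1, 3), round(pi2, 3), round(dur1, 1), round(dur2, 1))
# -> 0.226 0.774 7.7 26.3  (Table 13.4 reports 0.226, 0.774, 7.7, 26.4)
```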

Figure 13.1 plots the time series of the smoothed probabilities of the high volatility state identified by the common country and common industry models, as well as by the model for global returns. The high persistence of the common country component stands out. For example, the common country effect stays in the same regime over the period 1986–1997, although this regime is difficult to interpret in terms of periods of high and low volatility. The common industry regime identifies four high volatility periods: around the early seventies (1974), 1979–1980, a spell from 1986 to September 1987, and the more recent period from late 1997. The global return component follows shorter cyclical movements that nevertheless are well identified by the model. The finding that the global return component is the least persistent factor accords well with the interpretation that it captures a variety of large, common economic shocks typically associated with the global business cycle. In contrast, common country components are likely to undergo less frequent shifts, as they tend to be based on structural relations that evolve more slowly, especially in countries with relatively stable institutions such as the advanced countries comprising our data set. The economic interpretation of these results is further discussed in Section 5.


[Figure 13.1 consists of three panels – Global, Common Country, and Common Industry – each plotting a smoothed probability series on a 0 to 1 scale over 1973–2002.]

Fig. 13.1. Smoothed state probabilities for common components (high volatility state).

3.3. Robustness checks

A simple way of gauging the robustness of our estimates is to compare the smoothed probability estimates in the upper panel of Figure 13.1, as well as the associated measures of market volatility (computed as discussed below), against a simple nonparametric measure of global stock return volatility – the intra-month (capitalization weighted) variance of daily stock returns in the 13 countries that we consider. This comparison is plotted in Figure 13.2. Our model clearly appears to do a very good job of picking up the major volatility shifts: the correlation between our model estimates and this high-frequency nonparametric measure is reasonably high at 0.4.
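The nonparametric benchmark can be sketched as follows for a single, already aggregated daily return series; the chapter's version capitalization-weights across the 13 markets, a step omitted here:

```python
import numpy as np

def intra_month_variance(daily_returns, month_ids):
    """Within-month sample variance of daily returns: a model-free monthly
    volatility measure in the spirit of the series plotted in Figure 13.2.
    month_ids assigns each daily observation to its calendar month."""
    r = np.asarray(daily_returns, dtype=float)
    ids = np.asarray(month_ids)
    return {m: r[ids == m].var(ddof=1) for m in np.unique(ids)}
```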


[Figure 13.2 plots three series over March 1975 to March 2002: the intra-monthly variance and the MS global variance on the left axis, and the MS smoothed probability on the right axis (0 to 1 scale).]

Fig. 13.2. Actual intra-monthly variance of stock returns and MS estimates


We also performed a series of tests on the properties of the residuals that attest to the suitability of our model specification. The main results are as follows. For the world portfolio, the coefficient of excess kurtosis goes from 0.79 in the raw return data to −0.10 for the data normalized by the weighted state means and standard deviations. The Jarque–Bera test for normality (which has a critical value of 5.99) goes from 9.11 to 0.24. For the country portfolios, the average coefficient of excess kurtosis in the raw returns data is 2.38. This drops to 0.16 after standardizing by the regime moments. Moreover, the average normality test statistic drops from 297 to 1.53, and the number of rejections drops from 11 to only one. For the industry portfolios, the average coefficient of excess kurtosis drops from 2.91 to 0.56, and the average value of the normality test declines from 180 to 18.6 upon standardizing by the state moments.

The upshot is that these tests clearly suggest that the proposed model fits the data well, in that the residuals from the two-state specification are close to normally distributed for the vast majority of portfolios, despite the evidence of very strong non-normality prior to accounting for distinct volatility regimes.
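These checks are mechanical to reproduce. The sketch below assumes returns are standardized by probability-weighted state moments (the chapter reports only the resulting statistics, so the exact weighting is our assumption):

```python
import numpy as np

def jarque_bera(x):
    """Jarque-Bera statistic T/6 * (skew^2 + excess_kurtosis^2 / 4);
    the 5% critical value under normality is 5.99 (chi-squared, 2 df)."""
    x = np.asarray(x, dtype=float)
    m = x - x.mean()
    s2 = (m ** 2).mean()
    skew = (m ** 3).mean() / s2 ** 1.5
    ex_kurt = (m ** 4).mean() / s2 ** 2 - 3.0
    return len(x) / 6.0 * (skew ** 2 + ex_kurt ** 2 / 4.0)

def standardize_by_regime(r, probs, mu, sigma):
    """Demean and rescale returns by state-probability-weighted moments.
    probs: (T, k) state probabilities; mu, sigma: (k,) state moments."""
    mu_t = probs @ np.asarray(mu)
    sigma_t = probs @ np.asarray(sigma)
    return (np.asarray(r) - mu_t) / sigma_t
```

If the two-state model is adequate, the standardized residuals should be approximately standard normal, so their Jarque–Bera statistic should fall below 5.99, as the chapter finds for most portfolios.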

4. Variance decompositions

A central question in the literature on the sources of stock return volatility concerns the relative volatility of geographically or industrially diversified portfolios. To get a first measure of how the total market variance evolves over time, we simply sum the global variance, the average country variance and the average industry variance (all based on conditional moment information reflecting the time-varying state probabilities) as follows:

\[
\begin{aligned}
\sigma^2_t = {} & \sum_{s_{\alpha t}} \pi_{s_{\alpha t}}\bigl(\sigma^2_{s_{\alpha t}} + (\mu_{\alpha s_{\alpha t}} - \mu_{\alpha t})^2\bigr) \\
& + \sum_{s_{\beta t}} \pi_{s_{\beta t}}\bigl(\omega'_{\beta t}\Omega_{\beta s_{\beta t}}\omega_{\beta t} + \omega'_{\beta t}(\mu_{\beta s_{\beta t}} - \mu_{\beta t})^2\bigr) \qquad (9)\\
& + \sum_{s_{\gamma t}} \pi_{s_{\gamma t}}\bigl(\omega'_{\gamma t}\Omega_{\gamma s_{\gamma t}}\omega_{\gamma t} + \omega'_{\gamma t}(\mu_{\gamma s_{\gamma t}} - \mu_{\gamma t})^2\bigr),
\end{aligned}
\]
where \(\omega_{\beta t}\) is the vector of weights for the industry portfolios and \(\omega_{\gamma t}\) is the vector of weights for the country portfolios, \(\mu_{\alpha t} = \sum_{s_{\alpha t}} \pi_{s_{\alpha t}} \mu_{\alpha s_{\alpha t}}\) is the conditional expectation of the global portfolio return, and \(\mu_{\beta t} = \sum_{s_{\beta t}} \pi_{s_{\beta t}} \mu_{\beta s_{\beta t}}\) and \(\mu_{\gamma t} = \sum_{s_{\gamma t}} \pi_{s_{\gamma t}} \mu_{\gamma s_{\gamma t}}\) are \(J \times 1\) and \(K \times 1\) vectors of conditionally expected returns on the industry and country portfolios, respectively. The first component in (9) accounts for the total variance of the global return component. The second component is the value-weighted industry variance, whereas the third component is the value-weighted country variance. Besides accounting for state-dependent covariances, there is an extra component in each of these terms arising from variation in the means across states. Notice that this measure of total market variance changes over time due to time variation in the state probabilities.¹²
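A sketch of the computation in (9), with hypothetical array shapes; we read the second term in each bracket, \(\omega'(\mu_s - \bar\mu)^2\), as the weights applied to the elementwise-squared mean deviations, which is our interpretation of the printed formula:

```python
import numpy as np

def total_market_variance(pi_a, mu_a, sig2_a, pi_b, mu_b, Om_b, w_b, pi_g, mu_g, Om_g, w_g):
    """Market variance (9): global mixture variance plus the value-weighted
    industry and country mixture variances. Each term combines within-state
    (co)variances with a between-state mean-dispersion component."""
    def scalar_part(pi, mu, sig2):
        mbar = pi @ mu                             # conditional mean return
        return pi @ (sig2 + (mu - mbar) ** 2)
    def vector_part(pi, mu, Om, w):
        # pi: (k,), mu: (k, n), Om: (k, n, n), w: (n,)
        mbar = pi @ mu
        return sum(pi[s] * (w @ Om[s] @ w + w @ (mu[s] - mbar) ** 2)
                   for s in range(len(pi)))
    return (scalar_part(np.asarray(pi_a), np.asarray(mu_a), np.asarray(sig2_a))
            + vector_part(np.asarray(pi_b), np.asarray(mu_b), np.asarray(Om_b), np.asarray(w_b))
            + vector_part(np.asarray(pi_g), np.asarray(mu_g), np.asarray(Om_g), np.asarray(w_g)))
```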

Figure 13.3 plots the time series of the market volatility component computed from (9). Volatility varies considerably over time, from a low point around 2.8% to a peak

¹²The squared terms in the variance expression enter due to the binomial nature of the state variable, cf. Timmermann (2000).


around 5.5% per month. It was very high around 1974/75, 1980, 1987, 1991, and from late 1997 onward. At these times, the market volatility component was close to twice as large as during the low volatility regimes of the late 1970s and mid-1990s. Recalling that the volatility of the country component does not vary much across the two states, whereas the volatility of the industry and global portfolio returns is about twice as high in the high volatility state as in the low volatility state, the figure is easy to understand. Systematic volatility tends to be high when the common industry component and the global component are both in the high volatility state at the same time, i.e. in 1974, 1980, 1987 and from 1998 to 2002. Conversely, if they are simultaneously in the low volatility state, then systematic volatility will be low.¹³

The measure of market variance in (9) readily lends itself to a decomposition into its three constituents. Figure 13.4 shows the fraction of total market variance represented by the average country, industry and global components, each scaled by the sum of the three. Time variation in the (average) country fraction is very large, ranging from about two-thirds to one-third in recent years. In particular, the importance of the country factor has been noticeably lower in periods such as the 1974–1975 oil shock, the 1987 stock market crash and the information technology boom of the late 1990s.

Likewise, the fraction of total market volatility due to the industry component also varies considerably, as shown in the middle panel of Figure 13.4. It rises to about 30% in the immediate aftermath of the two oil shocks of the 1970s (1974 and 1980/81), during the stock market crash of 1987, and during the IT boom and bust cycle from 1997/98 onwards. In the context of the existing literature, the estimated average level in the 10–15% range is slightly higher than the 7% figure of Heston and Rouwenhorst (1994) and more than twice as high as the estimates in Griffin and Karolyi (1998) – both based on linear single-state models.14 Figure 13.4 clearly unveils significant changes in the relative importance of the industry factor and shows that its recent rise has in fact been the most persistent of all over the past 30 years, though not quite to the point where its contribution to volatility has surpassed that of the country factor. As shown in the bottom panel of Figure 13.4, this is partly due to the concomitant rise of the global

13 Overall, our estimates plotted in Figure 13.3 suggest that systematic volatility is nearly trendless: if a trend is fitted to this measure of market variance (which excludes firm-level idiosyncratic variance, as in equation (9)), it is very mildly positive and its statistical significance is quite sensitive to the end-point. A similar inference obtains when applying the nonparametric measure of intra-monthly volatility discussed in Section 3.3 and plotted in Figure 13.2: the slope from regressing this volatility measure on a linear trend is positive (0.0001) but not statistically significant at 5%. This is consistent with Schwert's (1989) finding for the US that market volatility does not display a significant long-term trend. Using, however, aggregate stock price data spanning over a century for various advanced countries, Eichengreen and Tong (2004) suggest that stock market volatility displays a U-shape in most countries. It remains to be established whether this result stems from their use of much longer data series, the effects of idiosyncratic variance (which they do not filter out), or their methodology based on rolling standard deviations of stock price changes and univariate GARCH(1,1) regressions.

14 Griffin and Karolyi (1998) present two sets of estimates, one using a nine-sector breakdown and the other using a 66-industry breakdown. They find that the mean industry factor contributions to total return variance are 2 and 4%, respectively – a lot lower, therefore, than the above estimates. One possible reason for the lower estimate of Griffin and Karolyi (1998) relative to Heston and Rouwenhorst (1994) as well as ours is the inclusion of emerging markets in their sample. As country-specific shocks have been shown to play a greater role in the determination of stock returns in emerging markets, this is to be expected. However, we show below that much of the difference appears to be model- and time-dependent. Furthermore, Griffin and Karolyi consider a much shorter sample of weekly returns, so differences in estimates are not all that surprising.


[Figure omitted: time series plot of systematic volatility, 1973–2002; vertical axis approximately 2.8–5.3% per month.]

Fig. 13.3. Market volatility



[Figure omitted: three panels, 1973–2002 — variance due to country factor (range roughly 0.30–0.65), variance due to industry factor (roughly 0.00–0.45), and variance due to global factor (roughly 0.15–0.50).]

Fig. 13.4. Decomposition of systematic variance into country, industry and global factors


4 Variance decompositions 279

factor contribution to overall stock return volatility in recent years, which has filled some of the gap arising from the decline in country-specific volatility.

It is instructive to compare these results with those obtained through the widespread practice of estimating relative contributions by the time series variances of the estimated β_jt and γ_kt, computed over a rolling window. We follow the common practice of using a window length of 3 years, but also experimented with 4- and 5-year rolling windows and found the trends to be very similar. To facilitate comparison with our results, Figure 13.5 plots the 3-year rolling window results together with our regime-switching estimates previously plotted in Figure 13.4. Clearly, the rolling window approach smoothes out the shifts in factor volatilities and their relative contributions. The respective states become less clearly defined and the approach overlooks the important spikes associated with the oil shocks of 1973–1974 and 1979–1980.
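The rolling-window benchmark can be sketched as follows. The function and argument names here are illustrative, not the chapter's code, and the inputs stand in for the estimated factor return series (e.g. the β_jt and γ_kt estimates):

```python
import numpy as np

def rolling_variance_shares(factors, window=36):
    """factors: dict mapping factor name -> 1-D array of monthly returns.
    Returns, for each factor, its rolling-window variance expressed as a
    share of the sum of all factor variances in the same window."""
    names = list(factors)
    T = len(next(iter(factors.values())))
    var = {n: np.array([np.var(factors[n][t - window:t])
                        for t in range(window, T + 1)]) for n in names}
    total = sum(var.values())  # elementwise sum across factors
    return {n: var[n] / total for n in names}
```

Because each window averages over 36 months, abrupt regime shifts are spread across three years of estimates, which is the smoothing effect visible in Figure 13.5.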

Finally, we also consider an alternative and complementary measure of the relative significance of the industry and country contributions to portfolio returns, which was proposed by Griffin and Karolyi (1998). Our two-stage econometric methodology allows us to extend the Griffin and Karolyi decomposition scheme by both letting the relative contributions of each factor vary across states and taking into account the various industry covariances within each state. As in Griffin and Karolyi (1998), let the excess return on the national stock market or portfolio of country k (over and above the global portfolio return α_t) be decomposed into country k's unique industry weights times the industry returns summed across industries (i.e., Σ_{j=1}^{J} ω^β_{jkt} β_{jt}) plus a "pure" country effect γ_kt:15

R_kt − α_t = Σ_{j=1}^{J} ω^β_{jkt} β_{jt} + γ_{kt},   (10)

where ω^β_{jkt} is the jth industry's weight in country k. The variance of this excess return, conditional on the country state being s_γt and the industry state being s_βt, is

Var(R_kt − α_t | s_βt, s_γt) = (ω^β_kt)′ Ω_{β,s_β} ω^β_kt + e′_k Ω_{γ,s_γ} e_k + 2 (ω^β_kt)′ Cov(β_t, γ_kt | s_βt, s_γt),   (11)

where ω^β_kt is the J-vector of market capitalization weights of the industries in country k.

Similarly, the excess return on the portfolio of industry j (over and above the global portfolio) can be decomposed into industry j's unique country weights times the country returns summed across countries plus a pure industry effect, β_jt:

R_jt − α_t = Σ_{k=1}^{K} ω^γ_{jkt} γ_{kt} + β_{jt},   (12)

where ω^γ_{jkt} is the kth country's weight in industry j. The variance of this excess return, conditional on the country state being s_γt and the industry state being s_βt, is

Var(R_jt − α_t | s_βt, s_γt) = (ω^γ_jt)′ Ω_{γ,s_γ} ω^γ_jt + e′_j Ω_{β,s_β} e_j + 2 (ω^γ_jt)′ Cov(γ_t, β_jt | s_βt, s_γt),   (13)

15 It is straightforward to show that this decomposition follows from rewriting equation (2) for each individual country portfolio, where the individual firm's weight is the share of that firm in the total market capitalization of the respective country portfolio.
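The state-conditional variance in (11) combines a quadratic form in the country's industry weights, the pure country variance, and a covariance adjustment. A sketch under assumed inputs (the function and argument names are ours, for illustration):

```python
import numpy as np

def country_excess_variance(w_beta, Omega_beta, Omega_gamma, k, cov_beta_gamma_k):
    """Variance of country k's excess return for a given (industry, country)
    state pair, following the structure of equation (11):
    w_beta:           J-vector of the country's industry weights;
    Omega_beta:       J x J industry covariance matrix in state s_beta;
    Omega_gamma:      K x K country covariance matrix in state s_gamma;
    k:                index of the country;
    cov_beta_gamma_k: J-vector of Cov(beta_t, gamma_kt) in that state pair."""
    w = np.asarray(w_beta)
    industry_mix = w @ Omega_beta @ w          # (w^beta)' Omega_beta w^beta
    pure_country = Omega_gamma[k, k]           # e_k' Omega_gamma e_k
    cross = 2.0 * (w @ np.asarray(cov_beta_gamma_k))
    return industry_mix + pure_country + cross
```

The analogous computation for an industry portfolio, per (13), swaps the roles of the weight vector and covariance matrices.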



[Figure omitted: three panels, 1973–2002 — variance due to country, industry, and global factors, comparing the regime-switching estimates with the rolling-window estimates.]

Fig. 13.5. Comparison of variance decomposition results between the proposed model and the 36-month rolling window approach



where ω^γ_jt is the K-vector of market capitalization weights of the countries in industry j.

Part (a) of Table 13.6 reports the time series variances of the "pure" country effects and the cumulative sum of the industry effects in the 13 country portfolios, whereas Part (b) reports the time series variances of the pure industry effects and the cumulative sum of country effects in the 11 industry portfolios. In both cases, these variances are expressed as a ratio relative to the total variances of the excess returns. Their sum is therefore close, but not exactly equal, to one due to the presence of the extra covariance term in (11) and (13) between the industry and country effects.

As country volatility does not vary greatly over the two states, to save space Table 13.6 simply presents results separately for the high and low industry volatility states. Although a number of individual country and sector results are of interest in their own right, looking at the overall means, two findings stand out. First, the 3.3% figure reported in the upper right panel is the overall measure of the industry factor contribution in the low industry volatility state, which is well within the range previously estimated by Griffin and Karolyi (1998) (2 and 4% depending on the level of industry aggregation – see tables 2 and 3 of their paper). Turning to the left panel, however, one can see that the same measure yields a much higher estimate of the aggregate industry component in the country portfolios (22.3% on average). In both the high and low industry volatility states, the average pure country volatility accounts for over 90% of the total country volatility – the fact that the right- and left-hand side estimates in part (a) add to 120% being due to the higher negative covariance between the pure country and the composite industry effect during the high industry volatility state.

Moving to the breakdown of the industry portfolios shown in the bottom panels of Table 13.6, it is clear that the aggregate contribution of country effects to industry portfolios is also state sensitive, being much lower (17%) in the high industry volatility state than in the low industry volatility state, where it more than doubles (41%). Similarly, the pure industry contribution accounts for 91% of the total industry portfolio volatility in the high industry volatility state but only for 69% in the low industry volatility state. These results therefore suggest that the decomposition averages reported in previous studies do vary considerably over economic states.

5. Economic interpretation: Oil, money, and tech shocks

The existence of distinct volatility regimes in stock returns and shifts in the factor contributions therein begs the question of what drives them. Although the construction of a multivariate global risk model capable of identifying the various underlying shocks and their propagation into stock pricing is beyond the scope of this chapter, it is important to relate the above findings to key global economic developments in light of an existing literature on the drivers of stock market volatility. Furthermore, this provides an additional robustness check on the reasonableness of our estimates.

A glance at the top and bottom panels of Figure 13.1, as well as Figure 13.3, is suggestive of one key determinant of the three main spikes in global market volatility identified by our model – 1973–1975, 1979–1980, and 1986–1987. These were periods when the probability of being in the high volatility state for the common industry factor peaked



Table 13.6. Relative contribution of "pure" country and industry factors to the variance of stock returns

                          High industry            Low industry
                          volatility state         volatility state
                          Pure       Acc.          Pure       Acc.
                          country    industry      country    industry
(a) Country Portfolios
US                        0.955      0.091         0.992      0.011
UK                        0.825      0.169         1.010      0.020
France                    1.297      0.114         1.003      0.009
Germany                   0.983      0.153         1.023      0.017
Italy                     0.988      0.102         1.014      0.014
Japan                     0.969      0.112         1.028      0.012
Canada                    0.907      0.213         1.020      0.029
Australia                 0.956      0.212         0.993      0.039
Belgium                   1.092      0.300         1.028      0.033
Denmark                   1.008      0.115         1.033      0.025
Ireland                   0.922      0.227         1.053      0.026
Netherlands               0.626      0.471         0.973      0.107
Switzerland               0.971      0.438         1.033      0.049
Average                   0.974      0.223         1.018      0.033

                          Pure       Acc.          Pure       Acc.
                          industry   country       industry   country
(b) Industry Portfolios
Resources                 0.920      0.161         0.725      0.453
Basic                     0.928      0.080         0.721      0.254
General industry          0.621      0.101         0.684      0.346
Cyclical cons. goods      1.168      0.114         0.941      0.309
Noncycl. cons. goods      0.772      0.138         0.435      0.532
Cyclical services         0.532      0.182         0.594      0.384
Noncycl. services         1.370      0.221         0.708      0.410
Utilities                 1.345      0.060         0.894      0.200
Information technology    0.895      0.059         0.667      0.270
Financials                1.104      0.128         0.726      0.409
Others                    0.349      0.647         0.511      0.923
Average                   0.910      0.172         0.691      0.408

Part (a) of this table shows the contribution of the "pure" country effect and the cumulated industry effect of the excess return (computed relative to the global return) on the individual country portfolios, using the decomposition equation (11) in this chapter. Part (b) shows the contribution of the "pure" industry effect and the cumulated country effect of the excess return (computed relative to the global return) on the individual industry portfolios, using the decomposition equation (13) in this chapter. The reported figures are ratios of the variance of each component to the variance of their sum (including their covariance).



relative to the rest of the sample (bottom panel of Figure 13.1), and the industry factor's contribution to overall global market volatility rose (middle panel of Figure 13.4). This clearly coincided with large oil shocks: oil prices tripled in 1974, more than doubled in 1979, and underwent a sharp decline in 1986, when the spot price of oil reached an in-sample trough below $10/barrel. These periods coincided with spells of substantial short-run volatility in oil prices and marked profitability shifts across industries depending on their oil-intensity, leading to greater uncertainty about future earnings growth that was also reflected in current stock prices (Guo and Kliesen, 2005; Kilian and Park, 2007).

Our estimates suggest that industry-specific shocks are not the whole story behind the successive ups and downs in global stock return volatility during 1973–2002. Specifically, industry-specific shocks do not seem able to account for two other volatility shifts which we identify – the volatility upturns of 1982–1983 and 1985. Both occasions, however, were marked by substantial volatility in a well-known determinant of stock returns, the short-term (risk-free) interest rate (see Eichengreen and Tong, 2004, and the various references therein). Reflecting monetary policy shocks in the US and also in countries like the UK – to which global real interest rates responded by rising to unprecedented levels (see Bernanke and Blinder, 1992 for a discussion of the "exogeneity" of such shocks) – volatility in the three-month US Treasury bill market rose sharply during those episodes.16 This is illustrated in Figure 13.6, which plots the intra-month volatility in Treasury bill yields (calculated from daily data), together with the volatility of returns on the stock market indices of the 13 countries comprising our sample. The correlation between the two series is reasonably tight at such high frequencies, yielding a correlation coefficient of 0.42 between end-1981 and end-1985. The fact that such monetary policy shocks were dramatic and not-so-gradual is consistent with the behavior of the smoothed state transition probabilities, which clearly display changes in volatility states around those episodes that were discrete and nongradual.

Evidence that industry-specific shocks contributed to driving up global stock return volatility from 1997 is also apparent in our estimates, which highlight the higher contribution of the industry factors during that period. Yet a more complex set of circumstances appears to have been at play. To gain insight into these, Figure 13.7(a) plots the smoothed transition probabilities for the different industries. Earlier explanations of the rise in volatility and co-movement across mature stock markets during 1997–2001 focus on the IT sector (Brooks and Catao, 2000; Brooks and del Negro, 2002). This is not only because IT stock volatility rose sharply during those years but also because the weight of the sector in the global market portfolio more than doubled – from 10% in mid-1997 to 25% at the market peak in March 2000. As discussed by Oliner and

16 Other high global volatility spells that overlap with changes in monetary policy stances and higher money market volatility are observed following the 1998 Russian crisis, the March 2000 stock market crash, and between late 2001 and early 2002, in the wake of the September 11th terrorist attacks. As shown in Figure 13.6, however, none of these money market volatility bouts were of similar magnitude to those of the late 1970s and early 1980s. In this more recent period, there is also some evidence that monetary policy has become more reactive to the stock market (cf. IMF, 2000; Rigobon and Sack, 2003) and therefore a less independent driver of stock market volatility. This plausibly reflects the much greater weight of the stock market in aggregate wealth from the 1990s (relative to the 1970s and early 1980s), calling for greater attention by policymakers to stock market developments and their impact on aggregate spending and hence on price stability.


[Figure omitted: monthly time series, 1975–2001, of intra-monthly stock return variance (left axis) and the variance of the three-month T-bill yield (right axis); series labels "Stock Returns" and "3-month TBill (right axis)".]

Fig. 13.6. Intra-monthly variance of stock returns and of the three-month treasury bill yield



[Figure omitted: five panels, 1973–2002 — smoothed probabilities of the high volatility state for Resources, Basic, General Industry, Cyclical Consumer Durables, and Non-Cyclical Consumer Durables.]

Fig. 13.7(a). Smoothed state probabilities for individual industries (high volatility state)



[Figure omitted: five panels, 1973–2002 — smoothed probabilities of the high volatility state for Cyclical Services, Non-Cyclical Services, Utilities, Information Technology, and Financials.]

Fig. 13.7(b). Smoothed state probabilities for individual industries (high volatility state)



Sichel (2000), this coincided with efficiency gains in the information technology sector, an associated shift in relative profitability across industries and frequent revisions in earnings growth expectations, all of which only became tangible – and not so gradually so – in the late 1990s because of a sharp rise in the sector's weight. This is starkly picked up by our smoothed transition probability estimates shown in the second panel of Figure 13.7(b).

However, our estimates indicate that such a transition into a higher volatility state is not the preserve of the IT sector. The post-1997 period was also characterized by rising volatility in oil prices: the world oil price tripled between end-1998 and end-2000, before dropping sharply through 2001 and shooting up subsequently. This is clearly captured by the transition probability estimates for the resource industry in Figure 13.7(a). In addition, other industries also witnessed a volatility bout, and these are not limited to media and telecom firms – industries with closer ties to the IT sector (grouped under cyclical and noncyclical services, respectively – see footnote 6). Some of this generalized volatility rise no doubt reflects the well-known tightening in world monetary conditions and financial distress in Asian emerging markets and Russia, particularly affecting the financial service sector (which was more heavily exposed to those markets – cf. the widely publicized collapse of the US-based hedge fund Long-Term Capital Management), but also large chunks of general industry in the US, Japan and Europe, which also exported heavily to those emerging markets (see Forbes and Chinn, 2004 for related evidence). Hence, it is not surprising that the model picks up such a strong common industry component rapidly transitioning from a low to a high volatility state. In a nutshell, whereas previous work has emphasized the role of the IT sector in driving up global market volatility between 1997 and 2002, our industry estimates and the model's allowance for a common industry factor indicate that the phenomenon was more widespread than previous work may suggest.

6. Implications for global portfolio allocation

Our decompositions of market variance are based on the average country and industry-specific variances. As such, they are statistical measures that do not represent the payoffs from a portfolio investment strategy, as they ignore covariances between the returns on the underlying country, industry, and global equity portfolios. The advantage of such measures is that they provide a clear idea of the relative size of the variances of returns on the three components (global, industry and country). International investors, however, will be interested in economic measures of volatility and risk that represent feasible investment strategies and hence account for covariances between returns on the different portfolios involved. Further, changes in these covariances have important implications. For instance, when such covariances increase, domestic risk becomes less diversifiable, which in turn tends to raise the equity premium and drive up the cost of capital.

The large literature on the links between national stock markets finds that the covariance of (excess) returns between national stock indices displays considerable variation over time (King, Sentana, and Wadhwani, 1994; Engle, Ito, and Lin, 1994; Bekaert and Harvey, 1995; Longin and Solnik, 1995; Karolyi and Stulz, 1996; Hartmann et al.,



2004). In this section, we use firm-level data and the methodology laid out in the previous sections to characterize the behavior of country portfolio covariances. Like King, Sentana, and Wadhwani (1994) and others, we let such time variation in country covariances be driven by an unobserved latent variable but, unlike those authors, we characterize such variations in terms of relatively lengthy historical periods or "states" and allow differences in industry composition across countries to play a role. Likewise, the same approach is used to characterize the covariance patterns of the various industry portfolios. Because the estimated covariances/correlations within volatility states are conditional upon the entire time series information up to that point, they are not subject to the type of biases affecting unconditional estimates, which have been highlighted by Forbes and Rigobon (2001). Moreover, an important spin-off of the proposed approach that, to the best of our knowledge, has not been explored in the literature is the possibility that the country and industry portfolios may be in different states at a given point in time, thus raising interesting possibilities for risk diversification.

To see this, recall that the joint models ((9)–(10)) assume separate state processes for the global return factor (which affects all stocks in every period) and for the country or industry returns. Each of these state variables can be in the high or low volatility state. The return on a geographically diversified portfolio invested in industry j will be α_t + β_jt, whereas the return on an industrially diversified country portfolio is α_t + γ_kt. For such portfolios there are thus four possible state combinations. For the industry portfolios the four states are:

• high industry volatility, high global volatility (s_βt = 1; s_αt = 1)
• high industry volatility, low global volatility (s_βt = 1; s_αt = 2)
• low industry volatility, high global volatility (s_βt = 2; s_αt = 1)
• low industry volatility, low global volatility (s_βt = 2; s_αt = 2).

The correlation between geographically diversified industry portfolios is likely to vary strongly according to the underlying combination of global and industry state variables. By construction, the global component is common to all stocks. Thus, when the global return variable is in the high volatility state, it will contribute relatively more to variations in the returns of such portfolios and correlations will increase. In contrast, when the global return component is in the low volatility state, correlations between country or industry portfolios will tend to be lower. Similarly, when the industry component is in the low volatility state, the relative significance of the common global return component is larger, so that correlations between industry portfolios will be stronger compared to when the industry return process is in the high volatility state. Given the very large differences between volatilities in the high and low volatility states observed for the global and industry portfolios, these effects are likely to give rise to large differences between correlations of geographically diversified industry portfolios in the four possible states.
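The mechanism can be illustrated with a stylized two-factor calculation: if two diversified portfolios share a common global component α, their correlation rises mechanically with the variance of that common component. A sketch under the simplifying assumption (ours, not the chapter's) that the idiosyncratic components are uncorrelated with α:

```python
import math

def two_factor_correlation(var_global, var_i, var_j, cov_ij=0.0):
    """Correlation between portfolios r_i = alpha + beta_i and
    r_j = alpha + beta_j, where alpha has variance var_global and the
    idiosyncratic components beta_i, beta_j (variances var_i, var_j,
    covariance cov_ij) are assumed uncorrelated with alpha."""
    cov = var_global + cov_ij
    sd_i = math.sqrt(var_global + var_i)
    sd_j = math.sqrt(var_global + var_j)
    return cov / (sd_i * sd_j)
```

For example, holding the idiosyncratic variances fixed, quadrupling the global variance (as when moving into the high global volatility state) raises the implied correlation, consistent with the discussion above.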

A complication arises when computing these correlations as they depend on the correlation between the global and industry or country portfolio returns. Terms such as



Cov(α_t, γ_kt | s_αt, s_γt) can be consistently estimated as follows:

Cov(α_t, β_jt | s_αt, s_βt) = [ Σ_{t=1}^{T} π_{s_αt} π_{s_βt} (α_t − α_{s_αt})(β_jt − β_{j,s_βt}) ] / [ Σ_{t=1}^{T} π_{s_αt} π_{s_βt} ],

Cov(α_t, γ_kt | s_αt, s_γt) = [ Σ_{t=1}^{T} π_{s_αt} π_{s_γt} (α_t − α_{s_αt})(γ_kt − γ_{k,s_γt}) ] / [ Σ_{t=1}^{T} π_{s_αt} π_{s_γt} ],

Cov(β_jt, γ_kt | s_βt, s_γt) = [ Σ_{t=1}^{T} π_{s_βt} π_{s_γt} (β_jt − β_{j,s_βt})(γ_kt − γ_{k,s_γt}) ] / [ Σ_{t=1}^{T} π_{s_βt} π_{s_γt} ].   (14)
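Each estimator in (14) is a probability-weighted sample covariance, with every observation weighted by the product of the smoothed probabilities of the two chosen states. A minimal sketch (names are illustrative; here the state-conditional means are estimated with the same smoothed weights, a simplification of the chapter's two-stage estimates):

```python
import numpy as np

def state_conditional_cov(x, y, px, py):
    """Probability-weighted covariance in the spirit of equation (14).
    x, y:   T-vectors of factor returns (e.g. alpha_t and beta_jt);
    px, py: T-vectors of smoothed probabilities of the chosen states."""
    x, y, px, py = map(np.asarray, (x, y, px, py))
    w = px * py                             # joint state weight per period
    xbar = np.sum(px * x) / np.sum(px)      # state-conditional mean of x
    ybar = np.sum(py * y) / np.sum(py)      # state-conditional mean of y
    return np.sum(w * (x - xbar) * (y - ybar)) / np.sum(w)
```

When both probability series are identically one, the expression collapses to the ordinary (population) sample covariance, which is a useful sanity check.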

To investigate just how different these correlations and volatilities are, Table 13.7 presents the estimated covariances and correlations in the two possible states for the industrially diversified country portfolios, whereas Table 13.8 presents the estimated covariances and correlations for the geographically diversified industry portfolios. Variances are presented on the diagonal, covariances above the diagonal, and correlations below the diagonal.

For the country portfolios, the key findings are as follows. First, correlations across countries vary substantially, even after allowing for cross-country differences in industry composition. In particular, correlations are generally higher among the Anglo-Saxon countries (notably between Canada, the United States, and the United Kingdom) and lowest between the United States and much of continental Europe and Japan. This result is consistent with the evidence of other studies using different methodologies and measures (see, e.g., IMF, 2000), and our estimates show that it broadly holds across states.17 Second, correlations change markedly across states. As there is not much difference between the variance of country returns in the high and low volatility states, the main driver of the results will be whether the global portfolio is in the high or low volatility state. The average correlation between the country portfolios is 0.30 in the low global volatility state and 0.56 in the high global volatility state. Thus, as other studies using distinct econometric methodologies and data series have found (see, e.g., Solnik and Roulet, 2000; Bekaert, Harvey, and Ng, 2005), the state process for the global return component clearly makes a big difference to the average correlations between the country portfolios – our estimates for 13 mature markets indicating that such correlations almost double in the high volatility state.

Turning to the geographically diversified industry portfolios listed in Table 13.8, a richer menu of possible combinations emerges, as the global high and low volatility states are now supplemented by the high and low industry volatility states. When the industry process is in the high volatility state while the global process is in the low volatility state, the average correlation across industry portfolios is only 0.19. This rises to 0.50 when the industry and global processes are both in the high volatility state or both are in the low volatility state. Finally, when the industry state process is in the low volatility state while the global process is in the high volatility state, the average correlation across the geographically diversified industry portfolios is 0.81. These results show that

17 Among continental European countries, a main exception is the Netherlands, whose country factor volatility is highly correlated with those of the US and the UK.


Table 13.7. Covariances and correlations for industrially diversified country portfolios

US UK FR GE IT JP CA AU BE DE IR NL SW

(a) High Global Volatility State
US 22.576 21.077 15.481 11.452 12.418 17.511 17.049 16.616 14.321 9.494 20.049 15.753 13.538

UK 0.680 42.534 27.334 23.382 27.582 34.702 20.993 28.034 23.429 20.678 34.213 26.123 24.956

FR 0.445 0.572 53.637 29.912 30.362 36.561 13.863 17.861 28.936 18.813 30.921 22.389 24.173

GE 0.396 0.589 0.671 37.004 22.264 24.403 11.308 14.962 24.821 17.703 24.479 22.563 24.964

IT 0.297 0.481 0.471 0.416 77.383 35.471 16.039 18.271 20.623 20.757 27.111 19.699 16.904

JP 0.435 0.628 0.589 0.473 0.476 71.849 19.028 23.817 27.291 22.830 37.472 25.221 28.139

CA 0.755 0.677 0.398 0.391 0.384 0.472 22.585 23.058 13.024 10.860 20.533 14.562 13.595

AU 0.478 0.587 0.333 0.336 0.284 0.384 0.663 53.595 17.724 14.038 26.122 16.256 18.154

BE 0.490 0.584 0.642 0.663 0.381 0.523 0.445 0.394 37.846 16.177 30.436 20.627 21.922

DE 0.357 0.566 0.458 0.519 0.421 0.481 0.408 0.342 0.469 31.414 24.927 14.033 18.109

IR 0.559 0.695 0.559 0.533 0.408 0.585 0.572 0.472 0.655 0.589 57.045 25.005 26.657

NL 0.698 0.843 0.643 0.780 0.471 0.626 0.645 0.467 0.706 0.527 0.697 22.585 20.626

SW 0.507 0.681 0.588 0.731 0.342 0.591 0.509 0.442 0.635 0.575 0.629 0.773 31.528

(b) Low Global Volatility State
US 13.810 6.208 4.145 2.405 −0.085 −5.585 9.836 8.909 4.076 3.354 4.321 6.349 3.141

UK 0.360 21.562 9.896 8.233 8.976 5.504 7.678 14.224 7.082 8.436 12.383 10.616 8.456

FR 0.177 0.338 39.733 18.296 15.290 10.896 4.081 7.584 16.122 10.104 12.625 10.416 11.207

GE 0.123 0.337 0.552 27.676 9.480 1.026 3.815 6.974 14.295 11.282 8.472 12.878 14.286

IT −0.003 0.247 0.310 0.230 61.143 8.638 5.090 6.826 6.641 10.880 7.647 6.558 2.770

JP −0.256 0.202 0.295 0.033 0.188 34.422 −2.514 1.780 2.717 2.359 7.415 1.487 3.411

CA 0.643 0.402 0.157 0.176 0.158 −0.104 16.926 16.904 4.333 6.273 6.360 6.712 4.751

AU 0.350 0.447 0.176 0.193 0.127 0.044 0.600 46.947 8.538 8.956 11.454 7.911 8.816

BE 0.215 0.298 0.500 0.532 0.166 0.091 0.206 0.244 26.123 8.558 13.230 9.745 10.046

DE 0.171 0.344 0.303 0.406 0.263 0.076 0.289 0.247 0.317 27.899 11.825 7.255 10.337

IR 0.198 0.455 0.342 0.275 0.167 0.216 0.264 0.285 0.442 0.382 34.357 8.640 9.299

NL 0.482 0.646 0.467 0.691 0.237 0.072 0.461 0.326 0.538 0.388 0.416 12.543 9.591

SW 0.191 0.412 0.403 0.615 0.080 0.132 0.261 0.291 0.445 0.443 0.359 0.613 19.500

This table reports estimates of the covariances and correlations between the returns on industrially diversified country portfolios. Results are shown for two states: high global volatility and low global volatility. Numbers above the diagonal show covariance estimates, numbers on the diagonal show variance estimates, whereas numbers below the diagonal are estimates of the correlations. The diagonal is in bold for easy reference.
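The table's convention – covariances above the diagonal, variances on it, correlations below – means either triangle can be recovered from the other. As a quick consistency check (not part of the original chapter), the following sketch converts the US, UK, and CA covariance entries from panel (a) into correlations via ρ_ij = cov_ij / √(var_i · var_j):

```python
import numpy as np

def cov_to_corr(cov):
    """Convert a covariance matrix to a correlation matrix:
    rho_ij = cov_ij / sqrt(var_i * var_j)."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

# US, UK, CA entries from Table 13.7, panel (a)
cov = np.array([[22.576, 21.077, 17.049],
                [21.077, 42.534, 20.993],
                [17.049, 20.993, 22.585]])
corr = cov_to_corr(cov)
print(corr.round(3))  # off-diagonals 0.680, 0.755, 0.677 match the table
```

The recovered correlations agree with the below-diagonal entries reported in the table.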


Table 13.8. Covariances and correlations for geographically diversified industry portfolios

RESOR BASIC GENIN CYCGD NCYCG CYSER NCYSR UTILS ITECH FINAN OTHER

(a) High Global Volatility, High Industry Volatility
RESOR 44.698 27.487 23.026 17.816 10.006 17.214 7.179 5.569 22.755 23.987 26.029
BASIC 0.656 39.249 31.117 34.581 15.977 26.676 11.502 6.966 31.720 32.248 29.707
GENIN 0.566 0.816 37.092 36.960 15.375 29.660 13.931 1.763 53.177 28.895 33.892
CYCGD 0.359 0.744 0.818 54.976 13.293 33.825 16.511 3.227 53.585 32.844 32.822
NCYCG 0.335 0.571 0.565 0.401 19.964 13.737 5.309 2.977 18.766 20.295 17.400
CYSER 0.460 0.760 0.869 0.814 0.549 31.385 18.934 2.281 49.845 27.364 29.477
NCYSR 0.164 0.280 0.349 0.340 0.181 0.516 42.880 −3.578 39.028 15.932 18.059
UTILS 0.165 0.220 0.057 0.086 0.132 0.080 −0.108 25.634 −2.995 13.503 6.295
ITECH 0.302 0.449 0.774 0.640 0.372 0.788 0.528 −0.052 127.348 34.300 52.537
FINAN 0.553 0.793 0.731 0.682 0.700 0.753 0.375 0.411 0.468 42.131 31.869
OTHER 0.607 0.739 0.867 0.690 0.607 0.820 0.430 0.194 0.725 0.765 41.205

(b) Low Global Volatility, High Industry Volatility
RESOR 46.513 18.019 12.537 2.633 7.887 7.772 2.058 13.734 −3.558 12.519 14.541
BASIC 0.614 18.496 9.344 8.113 2.574 5.949 −4.904 3.846 −5.876 9.496 6.935
GENIN 0.486 0.575 14.299 9.472 0.951 7.912 −3.495 −2.377 14.560 5.122 10.099
CYCGD 0.081 0.395 0.525 22.793 −5.826 7.382 −5.610 −5.607 10.273 4.376 4.334
NCYCG 0.310 0.160 0.067 −0.327 13.911 0.360 −3.747 7.207 −11.482 4.892 1.977
CYSER 0.349 0.423 0.640 0.473 0.030 10.684 2.553 −0.813 12.274 4.638 6.731
NCYSR 0.054 −0.205 −0.167 −0.212 −0.181 0.141 30.820 −2.350 5.778 −2.474 −0.367
UTILS 0.318 0.141 −0.099 −0.185 0.305 −0.039 −0.067 40.148 −22.958 8.384 1.156
ITECH −0.061 −0.160 0.451 0.252 −0.361 0.440 0.122 −0.424 72.907 −5.297 12.921
FINAN 0.440 0.530 0.325 0.220 0.315 0.340 −0.107 0.317 −0.149 17.379 7.097
OTHER 0.526 0.398 0.659 0.224 0.131 0.508 −0.016 0.045 0.374 0.420 16.413


(c) High Global Volatility, Low Industry Volatility
RESOR 31.884 26.284 25.658 23.047 23.764 24.252 21.415 22.257 23.168 26.389 27.509
BASIC 0.780 35.637 31.534 28.314 28.892 31.026 24.000 24.962 29.079 31.083 32.749
GENIN 0.802 0.932 32.124 29.490 27.954 29.377 23.803 23.397 29.971 29.746 31.505
CYCGD 0.712 0.827 0.908 32.856 25.744 27.587 21.918 20.924 28.233 27.378 28.885
NCYCG 0.770 0.886 0.903 0.822 29.852 28.563 23.448 23.840 26.034 29.241 29.386
CYSER 0.767 0.928 0.925 0.859 0.933 31.386 23.516 24.143 27.393 30.539 30.651
NCYSR 0.709 0.752 0.786 0.715 0.803 0.785 28.582 22.515 22.197 24.652 25.210
UTILS 0.730 0.774 0.764 0.676 0.808 0.798 0.780 29.187 19.911 26.108 26.343
ITECH 0.684 0.812 0.882 0.822 0.795 0.816 0.692 0.615 35.947 27.480 29.220
FINAN 0.802 0.893 0.901 0.820 0.918 0.935 0.791 0.829 0.786 33.964 31.948
OTHER 0.771 0.868 0.879 0.797 0.851 0.865 0.746 0.771 0.771 0.867 39.976

(d) Low Global Volatility, Low Industry Volatility
RESOR 14.969 5.969 6.483 5.174 4.980 5.283 4.824 5.173 5.590 6.563 7.373
BASIC 0.447 11.922 8.959 7.040 6.708 8.656 4.008 4.478 8.101 7.856 9.213
GENIN 0.513 0.794 10.690 9.358 6.911 8.148 4.952 4.054 10.133 7.660 9.109
CYCGD 0.357 0.544 0.764 14.025 6.002 7.659 4.368 2.883 9.697 6.593 7.791
NCYCG 0.424 0.640 0.697 0.528 9.201 7.725 4.988 4.888 6.588 7.546 7.382
CYSER 0.424 0.779 0.774 0.635 0.791 10.362 4.869 5.005 7.761 8.658 8.461
NCYSR 0.355 0.331 0.432 0.332 0.469 0.431 12.314 5.755 4.942 5.150 5.398
UTILS 0.387 0.375 0.359 0.223 0.466 0.450 0.475 11.935 2.165 6.113 6.039
ITECH 0.343 0.558 0.737 0.615 0.516 0.573 0.335 0.149 17.706 6.991 8.421
FINAN 0.506 0.679 0.699 0.525 0.743 0.803 0.438 0.528 0.496 11.227 8.901
OTHER 0.467 0.655 0.683 0.510 0.597 0.645 0.377 0.429 0.491 0.652 16.620

This table reports estimates of the covariances and correlations between the returns on geographically diversified industry portfolios. Results are shown for four states: high global volatility, high industry volatility (a); low global volatility, high industry volatility (b); high global volatility, low industry volatility (c); low global volatility, low industry volatility (d). Numbers above the diagonal show covariance estimates, numbers on the diagonal show variance estimates, whereas numbers below the diagonal are estimates of the correlations. The diagonal is in bold for easy reference.



the average correlations between geographically diversified industry portfolios vary substantially according to the state process driving the common industry component and the global component, with the non-negligible differences in industry factor correlations within each state being especially magnified in the high industry volatility state. Finally, we note how different the average volatility level is in the high and low volatility states. For the country portfolios the variation in volatility is, unsurprisingly, somewhat smaller. The mean volatility is 6.4% per month in the high global volatility state and 5.3% in the low volatility state. The mean volatility of the industry portfolios is 6.6% per month in the high industry-, high global volatility state as compared with an average volatility of these portfolios of 3.6% in the low industry-, low global volatility state.
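These average volatility figures follow directly from the diagonal variance entries of Tables 13.7 and 13.8: each portfolio's monthly volatility is the square root of its variance, and the state average is the mean across portfolios. A minimal sketch reusing the tables' diagonal entries reproduces the figures quoted above:

```python
import numpy as np

# Diagonal (variance) entries from Tables 13.7 and 13.8, in (% per month)^2
country_high = [22.576, 42.534, 53.637, 37.004, 77.383, 71.849, 22.585,
                53.595, 37.846, 31.414, 57.045, 22.585, 31.528]
country_low  = [13.810, 21.562, 39.733, 27.676, 61.143, 34.422, 16.926,
                46.947, 26.123, 27.899, 34.357, 12.543, 19.500]
industry_hh  = [44.698, 39.249, 37.092, 54.976, 19.964, 31.385, 42.880,
                25.634, 127.348, 42.131, 41.205]
industry_ll  = [14.969, 11.922, 10.690, 14.025, 9.201, 10.362, 12.314,
                11.935, 17.706, 11.227, 16.620]

def mean_vol(variances):
    """Average of per-portfolio volatilities (square roots of variances)."""
    return np.mean(np.sqrt(variances))

print(round(mean_vol(country_high), 1))  # 6.4 (% per month)
print(round(mean_vol(country_low), 1))   # 5.3
print(round(mean_vol(industry_hh), 1))   # 6.6
print(round(mean_vol(industry_ll), 1))   # 3.6
```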

Important implications follow from these results. Generally, it will be more difficult to reduce equity risk through cross-border diversification when the global volatility process is in the high volatility state. On a macro level, this suggests that international capital flows should be expected to rise or accelerate during periods of low global stock market volatility and to ebb during high volatility states. Moreover, as the gains to cross-border diversification appear to be especially meager when global and industry factors both simultaneously lie in a high volatility state, this suggests that cross-border risk diversification should not be so beneficial during those subperiods. Provided that a country has a sufficiently diversified domestic industrial structure that allows residents to diversify risk along broad industry lines without having to go abroad, international equity flows will tend to be dampened as a result. This raises the question of whether these patterns are actually observed in the data. A systematic testing of this relationship is no mean task – not only because international portfolio investments are driven by a number of effects at play (see, e.g., Tesar and Werner, 1994), but also because of considerable data problems even if we were to limit the analysis to US data. Yet, it clearly warrants attention by future research.

7. Conclusion

This chapter has developed a regime-switching modeling framework and applied it to 30 years of firm-level data to address three main questions – whether global stock return volatility displays well-defined volatility regimes, the extent to which equity market volatility is accounted for by global, country- or sector-specific factors, and what implications this has for national equity market correlations and international risk diversification.

Our results reveal strong evidence of regimes in international stock returns characterized by different levels of volatility, with the low volatility regime being two to three times more persistent once we average over the various individual country and industry portfolios. The robustness of these results is buttressed not only by a variety of statistical tests on the model’s residuals but also by an identification of volatility regimes that is broadly consistent with estimates from alternative nonparametric measures, as well as with what we know about the timing of major shocks deemed to affect stock market volatility. At the very least, this suggests that the single-state assumption underlying the linear models used in previous studies can be improved upon. As discussed above and further stressed below, the inadequacy of the single-state assumption is not only a



technical econometric issue: it also leads one to gloss over important shifts in portfolio diversification possibilities as the various factors switch between high and low volatility regimes over time. To the extent that such states are persistent enough to allow the respective probabilities to be estimated with reasonable precision, market participants should thus be able to reap significant benefits from monitoring the underlying state probabilities as well as cross-country and cross-industry portfolio correlations within them.

As allowing for time-varying factor contributions appears to characterize the data better than linear models with a similar factor structure, this should also deliver more accurate estimates of the various factor contributions. Over the entire period 1973–2002, the country factor contribution averaged some 50% as opposed to 16% for the industry factor. Yet, these contributions have witnessed important variations across volatility states, with the country factor contribution dropping sharply at times, to as low as under 35% around 1973–1974, 1986–1987 and 2000–2001. Further, as each factor in the model is allowed to be in one of two states at any point in time, we also show that economically interesting state combinations arise, as each combination gives rise to a stronger or weaker pattern of correlations between the various portfolios. In general, the correlations among the various country and industry portfolios are stronger in the high global volatility state than in the low global volatility state; in the case of industrially diversified country-specific portfolios, those correlations nearly double on average. Hence the diversification benefits of investing abroad tend to be considerably smaller when global volatility is high. Further, and also after accounting for different industrial make-ups across countries and differences in volatility regimes, pair-wise correlations between the various country portfolios indicate that international diversification benefits are even smaller when confined to certain subsets of countries, such as the Anglo–Saxon nations or within continental Europe.

These results speak directly to various strands of the literature. First, our findings suggest that the apparently greater potential for industry diversification arising from the greater contribution of industry factors to stock return volatility between 1997 and 2002 is likely to be, at least in part, temporary: global stock market volatility typically goes through ups and downs and the contribution of country factors typically declines (rises) during high (low) global volatility states; so, the incentive for global equity diversification along industry lines (as opposed to country lines) should shift accordingly – rather than being a permanent phenomenon. A similar inference follows from the evidence presented in Brooks and del Negro (2002) and Bekaert et al. (2005) using different econometric approaches and distinct levels of industry disaggregation in their discussion of the IT bubble of the late 1990s.

Second, related inferences can be drawn about the role of “globalization” in driving down the contribution of country factors in stock returns. Although a number of studies have pointed to a decline in home bias and noted that firm operations (particularly among advanced countries) have grown more international (cf. Diermeier and Solnik, 2001), our finding that the contribution of country factors has fluctuated throughout the period cautions against seeing the 1997–2002 decline as permanent due to “globalization” forces. This is not to exclude that this shift may have a sizeable permanent component, especially for certain country subgroups – notably in Europe (cf. Baele and Inghelbrecht, 2005). What our look at the historical evidence on regime shifts simply suggests is that the estimated longer stay in a low country volatility state plus the attendant decline in the



contribution from the country factor from the mid-1990s may be picking up temporary as well as permanent factors. More definitively, our estimates also suggest that, in any event, greater globalization has not yet resulted in the industry factor becoming more important than the country factor. More time series data, together with richer structural models that pin down the various sources of market integration, are clearly needed before firmer inferences can be made about the permanency of such shifts.

Third, our estimates of “pure” country portfolio correlations across national stock markets are consistent with the findings of an emerging literature on information frictions or institution-based “gravity” views of international equity flows (Portes and Rey, 2005). Because our estimates are conditional upon the distinct volatility states and use all the time series information up to that point, they are robust to the bias discussed in Forbes and Rigobon (2001), which plagues the unconditional correlations reported in much of the literature. As in such gravity-type models, these conditional correlation estimates clearly show that market correlations tend to be systematically higher – both during high and low volatility states – among Anglo–Saxon countries and across much of continental Europe. An open question for future research is to what extent higher “pure” country correlations among European countries have intensified over time and on a permanent basis since the introduction of the Euro in 1999 and of other regional harmonization policies – rather than resulting from the rise in global factor-driven volatility since 1997.

Finally, interesting implications for the pattern of cross-border capital flows also follow. For one thing, evidence that over the long run average stock return volatility has been mainly determined by country-specific factors suggests that equity risk can be greatly reduced by diversifying portfolios across national borders. This provides an important rationale for the observed dramatic growth in gross cross-border equity holdings, which, in turn, has important implications for international macroeconomic adjustment (Lane and Milesi-Ferretti, 2001). Conversely, however, during high global and high industry volatility regimes (such as those historically associated with large sector-specific shocks such as oil), the risk diversification incentive to cross-border flows is weakened and perceived home bias is strengthened. Even granting that risk diversification is simply one among other potential drivers of cross-border flows, allowing for the existence of such regime shifts may shed new light on the question of what drives the massive swings in the growth of cross-border equity holdings observed in the data. Although further research considering a host of other factors and based on better equity flow data is clearly needed before any robust inference is drawn, this hypothesis emerges as an interesting spin-off of this chapter’s results.


14

A Multifactor, Nonlinear, Continuous-Time Model of Interest Rate Volatility

Jacob Boudoukh, Christopher Downing, Matthew Richardson, Richard Stanton, and Robert F. Whitelaw

1. Introduction

When one sees so many co-authors on a single chapter and they are not from the hard sciences, the natural question is why? Looking at both the number and quality of contributors to this volume and how much econometric talent Rob Engle has helped nurture through his career, it becomes quite clear that the only way we could participate in Rob’s Festschrift is to pool our limited abilities in Financial Econometrics. Given Rob’s obvious importance to econometrics, and in particular to finance via his seminal work on volatility, it is quite humbling to be asked to contribute to this volume.

Looking over Rob’s career, it is clear how deeply rooted the finance field is in Rob’s work. When one thinks of the major empirical papers in the area of fixed income, Fama and Bliss (1987), Campbell and Shiller (1991), Litterman and Scheinkman (1991), Chan, Karolyi, Longstaff and Sanders (1992), Longstaff and Schwartz (1992), Pearson and Sun (1994), Aït-Sahalia (1996b) and Dai and Singleton (2000) come to mind. Yet in terms of citations, all of these papers are dominated by Rob’s 1987 paper with David Lilien and Russell Robins, “Estimating Time Varying Risk Premia in the Term Structure: The ARCH-M Model.” In this chapter, further expanded upon in Engle and Ng (1993),

Acknowledgments: We would like to thank Tim Bollerslev, John Cochrane, Lars Hansen, Chester Spatt, an anonymous referee and seminar participants at the Engle Festschrift conference, the New York Federal Reserve, the Federal Reserve Board, Goldman Sachs, the University of North Carolina, U.C. Berkeley, ITAM, the San Diego meetings of the Western Finance Association, the Utah Winter Finance Conference, and the NBER asset pricing program for helpful comments.




the authors present evidence that the yield curve is upward sloping when interest rate volatility is high via an ARCH-M effect on term premia. The result is quite natural to anyone who teaches fixed income and tries to relate the tendency for the term structure to be upward sloping to the duration of the underlying bonds. Given this work by Rob, our contribution to this Festschrift is to explore the relation between volatility and the term structure more closely.

It is now widely believed that interest rates are affected by multiple factors.1 Nevertheless, most of our intuition concerning bond and fixed-income derivative pricing comes from stylized facts generated by single-factor, continuous-time interest rate models. For example, the finance literature is uniform in its view that interest rate volatility is increasing in interest rate levels, though there is some disagreement about the rate of increase (see, for example, Chan, Karolyi, Longstaff and Sanders, 1992; Aït-Sahalia, 1996b; Conley, Hansen, Luttmer and Scheinkman, 1995; Brenner, Harjes and Kroner, 1996; and Stanton, 1997). If interest rates possess multiple factors such as the level and slope of the term structure (Litterman and Scheinkman, 1991), and given the Engle, Lilien and Robins (1987) finding, then this volatility result represents an average over all possible term structure slopes. Therefore, conditional on any particular slope, volatility may be severely misestimated, with serious consequences especially for fixed-income derivative pricing.

Two issues arise in trying to generate stylized facts about the underlying continuous-time, stochastic process for interest rates. First, how do we specify ex ante the drift and diffusion of the multivariate process for interest rates so that it is consistent with the true process underlying the data? Second, given that we do not have access to continuous-time data, but instead to interest rates/bond prices at discretely sampled intervals, how can we consistently infer an underlying continuous-time multivariate process from these data? In single-factor settings, much headway has been made at addressing these issues (see, for example, Aït-Sahalia, 1996a, 2007; Conley, Hansen, Luttmer and Scheinkman, 1995; and Stanton, 1997). Essentially, using variations on nonparametric estimators with carefully chosen moments, the underlying single-factor, continuous-time process can be backed out of interest rate data.

Here, we extend the work of Stanton (1997) to a multivariate setting and provide for the nonparametric estimation of the drift and volatility functions of multivariate stochastic differential equations.2 Basically, we use Milshtein’s (1978) approximation schemes for writing expectations of functions of the sample path of stochastic differential equations in terms of the drift, volatility and correlation coefficients. If the expectations are known (or, in our case, estimated nonparametrically) and the functions are chosen appropriately, then the approximations can be inverted to recover the drift, volatility and correlation coefficients. In this chapter, we apply this technique to the short- and long-end of the term structure for a general two-factor, continuous-time diffusion process for interest

1See, for example, Stambaugh (1988), Litterman and Scheinkman (1991), Longstaff and Schwartz (1992), Pearson and Sun (1994), Andersen and Lund (1997), Dai and Singleton (2000) and Collin-Dufresne, Goldstein and Jones (2006), to name a few. This ignores the obvious theoretical reasons for multifactor pricing, as in Brennan and Schwartz (1979), Schaefer and Schwartz (1984), Heath, Jarrow and Morton (1992), Longstaff and Schwartz (1992), Chen and Scott (1992), Duffie and Kan (1996), Ahn, Dittmar and Gallant (2002) and Piazzesi (2005), among others.

2An exception is Aït-Sahalia (2008) and Aït-Sahalia and Kimmel (2007b), who provide closed-form expansions for the log-likelihood function for a wide class of multivariate diffusions.



rates. Our methods can be viewed as a nonparametric alternative to the affine class of multifactor continuous-time interest rate models studied in Longstaff and Schwartz (1992), Duffie and Kan (1996), Dai and Singleton (2000) and Aït-Sahalia and Kimmel (2007b), the quadratic term structure class studied in Ahn, Dittmar and Gallant (2002), and the nonaffine parametric specifications of Andersen and Lund (1997). As an application, we show directly how our model relates to the two-factor model of Longstaff and Schwartz (1992).
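To illustrate the idea behind this kind of estimator – not the chapter's actual implementation – the single-factor version of the first-order approximation can be sketched as follows: over a short interval Δt, E[Δr | r] ≈ μ(r)Δt and E[(Δr)² | r] ≈ σ²(r)Δt, so kernel regressions of Δr and (Δr)² on r recover the drift and diffusion functions. The simulated square-root process, parameters, and bandwidth below are all illustrative assumptions:

```python
import numpy as np

def nw_regression(x0, x, y, h):
    """Nadaraya-Watson kernel estimate of E[y | x = x0] (Gaussian kernel)."""
    w = np.exp(-0.5 * ((x - x0[:, None]) / h) ** 2)
    return (w @ y) / w.sum(axis=1)

# Simulate a square-root (CIR-type) short-rate path as stand-in data;
# the process and its parameters are assumptions for this sketch.
rng = np.random.default_rng(0)
dt, n = 1.0 / 252, 100_000
kappa, theta, sigma = 0.5, 0.05, 0.1
r = np.empty(n)
r[0] = theta
for t in range(n - 1):
    r[t + 1] = abs(r[t] + kappa * (theta - r[t]) * dt
                   + sigma * np.sqrt(r[t] * dt) * rng.standard_normal())

# First-order approximations: E[dr | r] ~ mu(r) dt and E[dr^2 | r] ~ sigma^2(r) dt,
# so kernel regressions of dr and dr^2 on the lagged rate recover the functions.
dr = np.diff(r)
grid = np.linspace(0.03, 0.07, 5)
drift_hat = nw_regression(grid, r[:-1], dr, h=0.005) / dt
vol_hat = np.sqrt(nw_regression(grid, r[:-1], dr ** 2, h=0.005) / dt)
print(np.round(vol_hat, 4))  # should lie close to sigma * sqrt(grid)
```

As is well known for this kind of estimator, the diffusion function is recovered far more precisely than the drift at any realistic sample size, since squared increments are dominated by the diffusion term over short intervals.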

Our chapter provides two contributions to the existing literature. First, in estimating this multifactor diffusion process, some new empirical facts emerge from the data. Of particular note, although the volatility of interest rates increases in the level of interest rates, it does so primarily for sharply upward sloping term structures. Thus, the results of previous studies, suggesting an almost exponential relation between interest rate volatility and levels, are due to the term structure on average being upward sloping, and are not a general result per se. Moreover, our volatility result holds for both the short- and long-term rates of interest. Thus, conditional on particular values of the two factors, such as a high short rate of interest and a negative slope of the term structure, the term structure of interest rate volatilities is generally at a lower level across maturities than implied by previous work.

The second contribution is methodological. In this chapter, we provide a way of linking empirical facts and continuous-time modeling techniques so that generating implications for fixed-income pricing is straightforward. Specifically, we use nonparametrically estimated conditional moments of “relevant pricing factors” to build a multifactor continuous-time diffusion process, which can be used to price securities. This process can be considered a generalization of the Longstaff and Schwartz (1992) two-factor model. Using this estimated process, we then show how to value fixed-income securities, in conjunction with an estimation procedure for the functional form of the market prices of risk. As the analysis is performed nonparametrically, without any priors on the underlying economic structure, the method provides a unique opportunity to study the economic structure’s implications for pricing. Of course, ignoring the last 25 years of term structure theory and placing more reliance on empirical estimation, with its inevitable estimation error, may not be a viable alternative on its own. Nevertheless, we view this approach as helpful for understanding the relation between interest rate modeling and fixed-income pricing.

2. The stochastic behavior of interest rates: Some evidence

In this section, we provide some preliminary evidence for the behavior of interest rates across various points of the yield curve. Under the assumption that there are two interest-rate dependent state variables, and that these variables are spanned by the short rate of interest and the slope of the term structure, we document conditional means and volatilities of changes in the six-month through five-year rates of interest. The results are generated nonparametrically, and thus impose no structure on the underlying functional forms for the term structure of interest rates.



2.1. Data description

Daily values for constant maturity Treasury yields on the three-year, five-year and 10-year US government bond were collected from Datastream over the period January 1983 to December 2006. In addition, three-month, six-month and one-year T-bill rates were obtained from the same source, and converted to annualized yields. This provides us with over 6,000 daily observations.

The post-1982 period was chosen because there is considerable evidence that the period prior to 1983 came from a different regime (see, for example, Huizinga and Mishkin, 1986; Sanders and Unal, 1988; Klemkosky and Pilotte, 1992; and Torous and Ball, 1995). In particular, these researchers argue that the October 1979 change in Federal Reserve operating policy led to a once-and-for-all shift in the behavior of the short-term riskless rate. As the Federal Reserve experiment ended in November 1982, it is fairly standard to treat only the post-late-1982 period as stationary.

In estimating the conditional distribution of the term structure of interest rates, we employ two conditioning factors. These factors are the short rate of interest – defined here as the three-month yield – and the slope of the term structure – defined as the spread between the 10-year and three-month yields. These variables are chosen to coincide with interest rate variables used in other studies (see Litterman and Scheinkman, 1991; and Chan, Karolyi, Longstaff and Sanders, 1992, among others). Figure 14.1 graphs the time series of both the short rate and slope. Over the 1983–2006 period, the short rate ranges from 1% to 11%, whereas the slope varies from −1% to 4%. There are several distinct periods of low and high interest rates, as well as slope ranges. As the correlation between the short rate and slope is −0.31, there exists the potential for the two variables combined to possess information in addition to a single factor.

Fig. 14.1. Time series plot of the three-month rate and term structure slope (i.e., the spread between the 10-year and three-month rate) over the 1983–2006 period

Fig. 14.2. Scatter plot of the three-month rate versus the term structure slope over the 1983–2006 period

Figure 14.2 presents a scatter plot of the short rate and term structure slope. Of particular importance to estimating the conditional distribution of interest rates is the availability of the conditioning data. Figure 14.2 shows that there are two holes in the data ranges, namely at low short rates (i.e., from 1% to 4%) and low slopes (i.e., from −1% to 2%), and at high short rates (i.e., from 9.5% to 11.5%) and low slopes (i.e., from −1% to 1%). This means that the researcher should be cautious in interpreting the implied distribution of interest rates conditional on these values for the short rate and slope.

2.2. The conditional distribution of interest rates: A first look

In order to understand the stochastic properties of interest rates, consider conditioning the data on four possible states: (i) high level (i.e., of the short rate)/high slope, (ii) high level/low slope, (iii) low level/low slope, and (iv) low level/high slope. In a generalized



method of moments framework, the moment conditions are:3

$$
E\left(\begin{array}{c}
\left(\Delta i^{\tau}_{t,t+1} - \mu^{\tau}_{hr:hs}\right)\times I_{t,hr:hs}\\
\left(\Delta i^{\tau}_{t,t+1} - \mu^{\tau}_{hr:ls}\right)\times I_{t,hr:ls}\\
\left(\Delta i^{\tau}_{t,t+1} - \mu^{\tau}_{lr:ls}\right)\times I_{t,lr:ls}\\
\left(\Delta i^{\tau}_{t,t+1} - \mu^{\tau}_{lr:hs}\right)\times I_{t,lr:hs}\\
\left[\left(\Delta i^{\tau}_{t,t+1} - \mu^{\tau}_{hr:hs}\right)^{2} - \left(\sigma^{\tau}_{hr:hs}\right)^{2}\right]\times I_{t,hr:hs}\\
\left[\left(\Delta i^{\tau}_{t,t+1} - \mu^{\tau}_{hr:ls}\right)^{2} - \left(\sigma^{\tau}_{hr:ls}\right)^{2}\right]\times I_{t,hr:ls}\\
\left[\left(\Delta i^{\tau}_{t,t+1} - \mu^{\tau}_{lr:ls}\right)^{2} - \left(\sigma^{\tau}_{lr:ls}\right)^{2}\right]\times I_{t,lr:ls}\\
\left[\left(\Delta i^{\tau}_{t,t+1} - \mu^{\tau}_{lr:hs}\right)^{2} - \left(\sigma^{\tau}_{lr:hs}\right)^{2}\right]\times I_{t,lr:hs}
\end{array}\right) = 0, \qquad (1)
$$

where $\Delta i^{\tau}_{t,t+1}$ is the change in the $\tau$-period interest rate from $t$ to $t+1$, $\mu^{\tau}_{\cdot|\cdot}$ is the mean change in rates conditional on one of the four states occurring, $\sigma^{\tau}_{\cdot|\cdot}$ is the volatility of the change in rates conditional on these states, and $I_{t,\cdot|\cdot} = 1$ if $[\cdot|\cdot]$ occurs, zero otherwise. These moments, $\mu^{\tau}$ and $\sigma^{\tau}$, thus represent coarse estimates of the underlying conditional moments of the distribution of interest rates.

These moment conditions allow us to test a variety of restrictions. First, are σ^τ_{hr:hs} = σ^τ_{hr:ls} and σ^τ_{lr:hs} = σ^τ_{lr:ls}? That is, does the slope of the term structure help explain volatility at various interest rate levels? Second, similarly, with respect to the mean, are μ^τ_{hr:hs} = μ^τ_{hr:ls} and μ^τ_{lr:hs} = μ^τ_{lr:ls}? Table 14.1 provides estimates of μ^τ_{·|·} and σ^τ_{·|·}, and the corresponding test statistics. Note that the framework allows for autocorrelation and heteroskedasticity in the underlying squared interest rate series when calculating the variance–covariance matrix of the estimates. Further, the cross-correlation between the volatility estimates is taken into account in deriving the test statistics.

Several facts emerge from Table 14.1. First, as documented by others (e.g., Chan, Karolyi, Longstaff and Sanders, 1992; and Aït-Sahalia, 1996a), interest rate volatility is increasing in the short rate of interest. Of some interest here, this result holds across the yield curve. That is, conditional on either a low or high slope, volatility is higher for the six-month, one-year, three-year and five-year rates at higher levels of the short rate. Second, the slope also plays an important role in determining interest rate volatility. In particular, at high levels of interest rates, the volatility of interest rates across maturities is much higher at steeper slopes. For example, the six-month and five-year volatilities rise from 5.25 and 6.35 to 7.65 and 7.75 basis points, respectively. Formal tests of the hypothesis σ^τ_{hr:hs} = σ^τ_{hr:ls} provide 1% level rejections at each of the maturities. There is some evidence in the literature that expected returns on bonds are higher for steeper term structures (see, for example, Fama, 1986, and Boudoukh, Richardson, Smith and Whitelaw, 1999a, 1999b); these papers and the finding of Engle, Lilien and Robins (1987) may provide a link to the volatility result here. Third, the effect of the slope is most important at high interest rate levels. At low short rate levels, though the

3We define a low (high) level or slope as one that lies below (above) its unconditional mean. Here, this mean is being treated as a known constant, though, of course, it is estimated via the data.


302 A multifactor, nonlinear, continuous-time model of interest rate volatility

Table 14.1. Conditional moments of daily interest rate changes (basis points)

                      HR,HS     HR,LS   χ²(HR,HS=HR,LS)    LR,HS     LR,LS   χ²(LR,HS=LR,LS)
Probability          22.76%    26.83%                     27.45%    22.96%

Mean (bp/day)
Six-month             0.032    −0.292        1.747         0.031     0.056        0.033
(s.e.)/[p value]     (0.207)   (0.131)      [0.186]       (0.092)   (0.108)      [0.857]
One-year              0.032    −0.365        2.339         0.060     0.069        0.003
(s.e.)/[p value]     (0.215)   (0.147)      [0.126]       (0.120)   (0.119)      [0.957]
Three-year            0.032    −0.365        1.462         0.063     0.082        0.007
(s.e.)/[p value]     (0.211)   (0.158)      [0.227]       (0.167)   (0.149)      [0.932]
Five-year            −0.070    −0.371        1.304         0.017     0.110        0.170
(s.e.)/[p value]     (0.210)   (0.158)      [0.254]       (0.167)   (0.149)      [0.680]

Volatility (bp/day)
Six-month             7.645     5.248       35.314         3.715     4.006        0.862
(s.e.)/[p value]     (0.364)   (0.165)      [0.000]       (0.163)   (0.265)      [0.353]
One-year              7.928     5.869       24.452         4.879     4.428        2.024
(s.e.)/[p value]     (0.367)   (0.187)      [0.000]       (0.168)   (0.266)      [0.155]
Three-year            7.928     5.869       13.564         6.784     5.520       18.173
(s.e.)/[p value]     (0.341)   (0.187)      [0.000]       (0.180)   (0.229)      [0.000]
Five-year             7.746     6.347       13.567         6.761     5.571       20.389
(s.e.)/[p value]     (0.329)   (0.179)      [0.000]       (0.180)   (0.229)      [0.000]

Average correlation   0.840     0.823                      0.807     0.796

The table presents summary statistics for daily changes in the six-month, one-year, three-year, and five-year yields on US government securities over the 1983–2006 period. Specifically, the table provides the mean, volatility, and cross-correlation of these series, conditional on whether the level of the short rate and slope of the term structure are either low or high (and the associated standard errors). These states of the world are labeled HR and LR for high and low short rates, respectively, and HS and LS for high and low slopes, respectively, and they occur with the probabilities given in the first row of the table. A Wald test that the conditional moments are equal (and the associated p value), holding the short rate state fixed but varying the state for the slope of the term structure, is also provided for the mean and volatility of these series.

volatility at low slopes is less than that at high slopes, the effect is much less pronounced. This is confirmed by the fact that a number of the p values are no longer significant at conventional levels for the test of the hypothesis, σ^τ_{lr:hs} = σ^τ_{lr:ls}. Fourth, the conditional means, though not in general reliably estimated, are consistent with existing results in the literature (e.g., Chan, Karolyi, Longstaff and Sanders, 1992; Aït-Sahalia, 1996a; and Stanton, 1997). That is, at low levels of interest rates, the mean tends to be greater than at high interest rates, which can be explained by mean reversion. However, the table also provides an interesting new result, namely that the effect of the slope is of higher magnitude than the level. Further, low slopes tend to be associated with negative changes in rates, whereas high slopes are linked to positive interest rate changes.


2.3. The conditional distribution of interest rates: A closer look

In order to generalize the results of Section 2.2, we employ a kernel estimation procedure for estimating the relation between interest rate changes and components of the term structure of interest rates. Kernel estimation is a nonparametric method for estimating the joint density of a set of random variables. Specifically, given a time series Δi^τ_{t,t+1}, i^r_t and i^s_t (where i^r is the level of interest rates, and i^s is the slope), generated from an unknown density f(Δi^τ, i^r, i^s), then a kernel estimator of this density is

f̂(Δi^τ, i^r, i^s) = (1/(T h^m)) Σ_{t=1}^T K( [(Δi^τ, i^r, i^s) − (Δi^τ_{t,t+1}, i^r_t, i^s_t)] / h ),    (2)

where K(·) is a suitable kernel function and h is the window width or smoothing parameter.

We employ the commonly used independent multivariate normal kernel for K(·). The other parameter, the window width, is chosen based on the dispersion of the observations. For the independent multivariate normal kernel, Scott (1992) suggests the window width

h = k σ_i T^{−1/(m+4)},

where σ_i is the standard deviation estimate of each variable z_i, T is the number of observations, m is the dimension of the variables, and k is a scaling constant often chosen via cross-validation. Here, we employ a cross-validation procedure to find the k that provides the right trade-off between the bias and variance of the errors. Across all the data points, we find the ks that minimize the mean-squared error between the observed data and the estimated conditional data. This mean-squared error minimization is implemented using a jackknife-based procedure. In particular, the various implied conditional moments at each data point are estimated using the entire sample, except for the actual data point and its nearest neighbors.4 Once the k is chosen, the actual estimation of the conditional distribution of interest rates involves the entire sample, albeit using window widths chosen from partial samples. To coincide with Section 2.2, we focus on the first two conditional moments of the distribution, and it is possible to show that

μ_{Δi^τ}(i^r, i^s) = Σ_{t=1}^T w_t(i^r, i^s) Δi^τ_t,    (3)

σ^2_{Δi^τ}(i^r, i^s) = Σ_{t=1}^T w_t(i^r, i^s) (Δi^τ_t − μ_{Δi^τ}(i^r, i^s))^2,    (4)

4Due to the serial dependence of the data, we performed the cross-validation omitting 100 observations, i.e., four months in either direction of the particular data point in question. Depending on the moments in question, the optimal ks range from roughly 1.7 to 27.6, which implies approximately twice to 28 times the smoothing parameter of Scott's asymptotically optimal implied value.
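The jackknife cross-validation of k can be sketched as follows. Everything here is an illustrative assumption: the data are synthetic, the setting is univariate (m = 1), and only a coarse grid of ks is searched, whereas the chapter works with multivariate conditional moments:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data: predict daily yield changes (bp) from the rate level.
T = 600
level = rng.uniform(0.03, 0.11, size=T)
dy = 100.0 * level + rng.normal(0.0, 4.0, size=T)

m = 1
h_scott = level.std() * T ** (-1.0 / (m + 4))  # Scott's rule with k = 1

def jackknife_mse(k, omit=100):
    """MSE of kernel-regression predictions, omitting each point and its
    `omit` nearest time neighbors (cf. footnote 4) when predicting it."""
    h = k * h_scott
    err = np.empty(T)
    for t in range(T):
        keep = np.abs(np.arange(T) - t) > omit
        w = np.exp(-0.5 * ((level[t] - level[keep]) / h) ** 2)
        err[t] = dy[t] - np.sum(w * dy[keep]) / np.sum(w)
    return np.mean(err ** 2)

ks = [0.5, 1.0, 2.0, 4.0, 8.0]
mses = [jackknife_mse(k) for k in ks]
k_star = ks[int(np.argmin(mses))]
print("cross-validated k:", k_star)
```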


[Surface plot: volatility (bp/day) of the one-year yield, roughly 4.5 to 9, against the short rate (0.03–0.11) and the slope (0–0.035).]

Fig. 14.3. The volatility of the daily change in the one-year yield (in basis points), conditional on the short rate and the slope of the term structure

where w_t(i^r, i^s) = K( [(i^r, i^s) − (i^r_t, i^s_t)] / h ) / Σ_{t=1}^T K( [(i^r, i^s) − (i^r_t, i^s_t)] / h ). The weights, w_t(i^r, i^s), are determined by how close the chosen state, i.e., the particular values of the level and slope, i^r and i^s, is to the observed level and slope of the term structure, i^r_t and i^s_t.

As an illustration, using equation (4), Figure 14.3 provides estimates of the volatility of daily changes in the one-year rate, conditional on the current level of the short rate and the slope of the term structure (i.e., i^r_t and i^s_t). Although Figure 14.3 represents only the one-year rate, the same effects carry through to the rest of the yield curve and have therefore been omitted for purposes of space. The figure maps these estimates to the relevant range of the data, in particular, for short rates ranging from 3% to 11% and slopes ranging from 0.0% to 3.5%. That said, from Figure 14.2, the data are quite sparse in the joint region of very low rates and low slopes, and thus results must be treated with caution in this range. The main result is that the volatility is maximized at high interest rate levels and high slopes though the more dramatic changes occur at high slopes.

To see this a little more clearly, Figures 14.4 and 14.5 present cut-throughs of Figure 14.3 across the term structure at short rates of 8.0% and 5.5%, respectively. From Figure 14.2, these levels represent data ranges in which there are many different slopes; thus, conditional on these levels, the estimated relation between the volatility of the six-month, one-year, three-year and five-year rates as a function of the slope is more reliable. Several observations are in order. First, as seen from the figures,


[Line plot: volatility (bp/day) versus the slope (0–0.035) for the six-month, one-year, three-year and five-year yields.]

Fig. 14.4. The volatility of the daily change in yields versus the slope, with the short rate fixed at 8%

[Line plot: volatility (bp/day) versus the slope (0–0.035) for the six-month, one-year, three-year and five-year yields.]

Fig. 14.5. The volatility of the daily change in yields versus the slope, with the short rate fixed at 5.5%


[Line plot: volatility (bp/day) versus the short rate (0.03–0.11) for the six-month, one-year, three-year and five-year yields.]

Fig. 14.6. The volatility of the daily change in yields versus the short rate, with the slope fixed at 2.75%

volatility is increasing in the slope for all maturities, though primarily only for steep term structures, i.e., above 2.0%. Second, volatility is also higher at greater magnitudes of the short rate albeit less noticeably. These results suggest that any valuation requiring a volatility estimate of interest rates should be done with caution. For example, estimating volatility when the term structure is flat relative to upward sloping should lead to quite different point estimates. Third, the relation between volatility and the slope is nonlinear, which, as it turns out in Section 4, will lead to a nonlinear continuous-time diffusion process. This feature can be potentially important as the majority of the multifactor, term structure pricing models are derived from the affine class.

Alternatively, Figures 14.6 and 14.7 provide cut-throughs of Figure 14.3 across the term structure at slopes of 2.75% and 1.00%, respectively. These slopes represent data ranges in which there are a number of observations of the interest rate level. The figures show that the estimated relation between the volatility of the six-month, one-year, and especially the three-year and five-year rates as a function of the level depends considerably on the slope of the term structure. For example, the volatility of the six-month and one-year interest rate changes is almost flat over levels of 3.0% to 6.0% at low slopes, whereas it increases roughly 200 basis points at high slopes. Similarly, even at the long end of the yield curve, the increase in volatility is higher at high versus low slopes.


[Line plot: volatility (bp/day) versus the short rate (0.03–0.11) for the six-month, one-year, three-year and five-year yields.]

Fig. 14.7. The volatility of the daily change in yields versus the short rate, with the slope fixed at 1%

3. Estimation of a continuous-time multifactor diffusion process

The results of Section 2 suggest that the volatility of changes in the term structure of interest rates depends on at least two factors. Given the importance of continuous-time mathematics in the fixed income area, the question arises as to how these results can be interpreted in a continuous-time setting. Using data on bond prices, and explicit theoretical pricing models (e.g., Cox, Ingersoll and Ross, 1985), Brown and Dybvig (1986), Pearson and Sun (1994), Gibbons and Ramaswamy (1993) and Dai and Singleton (2000) all estimate parameters of the underlying interest rate process in a fashion consistent with the underlying continuous-time model. These procedures limit themselves, however, to fairly simple specifications.

As a result, a literature emerged which allows estimation and inference of fairly general continuous-time diffusion processes using discretely sampled data. Aït-Sahalia (2007) provides a survey of this literature and we provide a quick review here. First, at a parametric level, there has been considerable effort in the finance literature at working through maximum likelihood applications of continuous-time processes with discretely sampled data, starting with Lo (1988) and continuing more recently with Aït-Sahalia (2002) and Aït-Sahalia and Kimmel (2007a, 2007b). Second, by employing the infinitesimal generators of the underlying continuous-time diffusion processes, Hansen and Scheinkman (1995) and Conley, Hansen, Luttmer and Scheinkman (1995)


construct moment conditions that also make the investigation of continuous-time models possible with discrete time data. Third, in a nonparametric framework, Aït-Sahalia (1996a, 1996b) develops a procedure for estimating the underlying process for interest rates using discrete data by choosing a model for the drift of interest rates and then nonparametrically estimating its diffusion function. Finally, as an alternative method, Stanton (1997) employs approximations to the true drift and diffusion of the underlying process, and then nonparametrically estimates these approximation terms to back out the continuous-time process (see also Bandi, 2002; Chapman and Pearson, 2000; and Pritsker, 1998). The advantage of this approach is twofold: (i) similar to the other procedures, the data need only be observed at discrete time intervals, and (ii) the drift and diffusion are unspecified, and thus may be highly nonlinear in the state variable.

In this section, we extend the work of Stanton (1997) to a multivariate setting and provide for the nonparametric estimation of the drift and volatility functions of multivariate stochastic differential equations. Similar to Stanton (1997), we use Milshtein's (1978) approximation schemes for writing expectations of functions of the sample path of stochastic differential equations in terms of the drift and volatility coefficients. If the expectations are known (albeit estimated nonparametrically in this paper) and the functions are chosen appropriately, then the approximations can be inverted to recover the drift and volatility coefficients. We have performed an extensive simulation analysis (not shown here) to better understand the properties of the estimators. Not surprisingly, the standard errors around the estimators, as well as the properties of the goodness of fit, deteriorate as the data becomes more sparse. Given the aforementioned literature that looks at univariate properties of interest rates, it is important to point out that these properties suffer more in the multivariate setting as we introduce more "Star Trek" regions of the data with the increasing dimensionality of the system. Nevertheless, this point aside, the approximation results here for the continuous-time process carry through to those presented in Stanton (1997), in particular, the first order approximation works well at daily to weekly horizons, while higher order approximations are required for less frequent sampling.

3.1. Drift, diffusion and correlation approximations

Assume that no arbitrage opportunities exist, and that bond prices are functions of two state variables, the values of which can always be inverted from the current level, R_t, and a second state variable, S_t. Assume that these variables follow the (jointly) Markov diffusion process

dR_t = μ_R(R_t, S_t) dt + σ_R(R_t, S_t) dZ^R_t,    (5)

dS_t = μ_S(R_t, S_t) dt + σ_S(R_t, S_t) dZ^S_t,    (6)

where the drift, volatility and correlation coefficients (i.e., the correlation between Z^R and Z^S) all depend on R_t and S_t. Define the vector X_t = (R_t, S_t).


Under suitable restrictions on μ, σ, and a function f, we can write the conditional expectation E_t[f(X_{t+Δ})] in the form of a Taylor series expansion,5

E_t[f(X_{t+Δ})] = f(X_t) + Lf(X_t) Δ + (1/2) L^2 f(X_t) Δ^2 + … + (1/n!) L^n f(X_t) Δ^n + O(Δ^{n+1}),    (7)

where L is the infinitesimal generator of the multivariate process {X_t} (see Øksendal, 1985; and Hansen and Scheinkman, 1995), defined by

Lf(X_t) = (∂f(X_t)/∂X_t)′ μ_X(X_t) + (1/2) trace[ Σ(X_t) (∂^2 f(X_t)/∂X_t ∂X_t′) ],

where

Σ(X_t) = ( σ^2_R(R_t, S_t)                              ρ(R_t, S_t) σ_R(R_t, S_t) σ_S(R_t, S_t) )
         ( ρ(R_t, S_t) σ_R(R_t, S_t) σ_S(R_t, S_t)      σ^2_S(R_t, S_t)                         ).

Equation (7) can be used to construct numerical approximations to E_t[f(X_{t+Δ})] in the form of a Taylor series expansion, given known functions μ_R, μ_S, ρ, σ_R and σ_S (see, for example, Milshtein, 1978). Alternatively, given an appropriately chosen set of functions f(·) and nonparametric estimates of E_t[f(X_{t+Δ})], we can use equation (7) to construct approximations to the drift, volatility and correlation coefficients (i.e., μ_R, μ_S, ρ, σ_R and σ_S) of the underlying multifactor, continuous-time diffusion process. The nice feature of this method is that the functional forms for μ_R, μ_S, ρ, σ_R and σ_S are quite general, and can be estimated nonparametrically from the underlying data. Rearranging equation (7), and using a time step of length iΔ (i = 1, 2, …), we obtain

E_i(X_t) ≡ (1/(iΔ)) E_t[f(X_{t+iΔ}) − f(X_t)]
         = Lf(X_t) + (1/2) L^2 f(X_t) (iΔ) + … + (1/n!) L^n f(X_t) (iΔ)^{n−1} + O(Δ^n).    (8)

From equation (8), each of the Ei is a first order approximation to Lf ,

Ei(Xt) = Lf(Xt) +O(Δ).

5For a discussion see, for example, Hille and Phillips (1957), Chapter 11. Milshtein (1974, 1978) gives examples of conditions under which this expansion is valid, involving boundedness of the functions μ, σ, f and their derivatives. There are some stationary processes for which this expansion does not hold for the functions f that we shall be considering, including processes such as

dx = μ dt + x^3 dZ,

which exhibit "volatility induced stationarity" (see Conley, Hansen, Luttmer and Scheinkman, 1995). However, any process for which the first order Taylor series expansion fails to hold (for linear f) will also fail if we try to use the usual numerical simulation methods (e.g. Euler discretization). This severely limits their usefulness in practice.


Now consider forming linear combinations of these approximations, Σ_{i=1}^N α_i E_i(X_t). That is, from equation (8),

Σ_{i=1}^N α_i E_i(X_t) = [Σ_{i=1}^N α_i] Lf(X_t) + (1/2) [Σ_{i=1}^N α_i i] L^2 f(X_t) Δ + (1/6) [Σ_{i=1}^N α_i i^2] L^3 f(X_t) Δ^2 + … .    (9)

Can we choose the α_i so that this linear combination is an approximation to Lf of order N?

For the combination to be an approximation to Lf, we require first that the weights α_1, α_2, …, α_N sum to 1. Furthermore, from equation (9), in order to eliminate the first order error term, the weights must satisfy the equation

Σ_{i=1}^N α_i i = 0.

More generally, in order to eliminate the nth order error term (n ≤ N − 1), the weights must satisfy the equation

Σ_{i=1}^N α_i i^n = 0.

We can write this set of restrictions more compactly in matrix form as

( 1      1        1       ···    1        )
( 1      2        3       ···    N        )
( 1      4        9       ···    N^2      )  α ≡ V α = (1, 0, 0, …, 0)′.
( ⋮      ⋮        ⋮        ⋱     ⋮        )
( 1   2^{N−1}  3^{N−1}    ···  N^{N−1}    )

The matrix V is called a Vandermonde matrix, and is invertible for any value of N. We can thus obtain α by calculating

α = V^{−1} (1, 0, …, 0)′.    (10)

For example, for N = 3, we obtain

α = ( 1  1  1 )^{−1} ( 1 )       ( 3 )
    ( 1  2  3 )      ( 0 )   =   (−3 ).    (11), (12)
    ( 1  4  9 )      ( 0 )       ( 1 )


Substituting α into equation (9), and using equation (8), we get the following third order approximation of the infinitesimal generator of the process {X_t}:

Lf(X_t) = (1/(6Δ)) [ 18 E_t(f(X_{t+Δ}) − f(X_t)) − 9 E_t(f(X_{t+2Δ}) − f(X_t)) + 2 E_t(f(X_{t+3Δ}) − f(X_t)) ] + O(Δ^3).

To approximate a particular function g(x), we now need merely to find a specific function f satisfying

Lf(x) = g(x).

For our purposes, consider the functions

f(1)(R) ≡ R − R_t,
f(2)(S) ≡ S − S_t,
f(3)(R) ≡ (R − R_t)^2,
f(4)(S) ≡ (S − S_t)^2,
f(5)(R, S) ≡ (R − R_t)(S − S_t).

From the definition of L, we have

Lf(1)(R) = μ_R(R, S),
Lf(2)(S) = μ_S(R, S),
Lf(3)(R) = 2(R − R_t) μ_R(R, S) + σ^2_R(R, S),
Lf(4)(S) = 2(S − S_t) μ_S(R, S) + σ^2_S(R, S),
Lf(5)(R, S) = (S − S_t) μ_R(R, S) + (R − R_t) μ_S(R, S) + ρ(R, S) σ_R(R, S) σ_S(R, S).

Evaluating these at R = R_t, S = S_t, we obtain

Lf(1)(R_t) = μ_R(R_t, S_t),
Lf(2)(S_t) = μ_S(R_t, S_t),
Lf(3)(R_t) = σ^2_R(R_t, S_t),
Lf(4)(S_t) = σ^2_S(R_t, S_t),
Lf(5)(R_t, S_t) = ρ(R_t, S_t) σ_R(R_t, S_t) σ_S(R_t, S_t).

Using each of these functions in turn as the function f above, we can generate approximations to μ_R, μ_S, σ_R, σ_S and ρ respectively. For example, the third order approximations


(taking square roots for σ_R and σ_S) are

μ_R(R_t, S_t) = (1/(6Δ)) [ 18 E_t(R_{t+Δ} − R_t) − 9 E_t(R_{t+2Δ} − R_t) + 2 E_t(R_{t+3Δ} − R_t) ] + O(Δ^3),    (13)

μ_S(R_t, S_t) = (1/(6Δ)) [ 18 E_t(S_{t+Δ} − S_t) − 9 E_t(S_{t+2Δ} − S_t) + 2 E_t(S_{t+3Δ} − S_t) ] + O(Δ^3),

σ_R(R_t, S_t) = √{ (1/(6Δ)) ( 18 E_t[(R_{t+Δ} − R_t)^2] − 9 E_t[(R_{t+2Δ} − R_t)^2] + 2 E_t[(R_{t+3Δ} − R_t)^2] ) },

σ_S(R_t, S_t) = √{ (1/(6Δ)) ( 18 E_t[(S_{t+Δ} − S_t)^2] − 9 E_t[(S_{t+2Δ} − S_t)^2] + 2 E_t[(S_{t+3Δ} − S_t)^2] ) },

σ_{RS}(R_t, S_t) = (1/(6Δ)) ( 18 E_t[(R_{t+Δ} − R_t)(S_{t+Δ} − S_t)] − 9 E_t[(R_{t+2Δ} − R_t)(S_{t+2Δ} − S_t)] + 2 E_t[(R_{t+3Δ} − R_t)(S_{t+3Δ} − S_t)] ).

The approximations of the drift, volatility and correlation coefficients are written in terms of the true first, second and cross moments of multiperiod changes in the two state variables. If the two-factor assumption is appropriate, and a large stationary time series is available, then these conditional moments can be estimated using appropriate nonparametric methods. In this chapter, we estimate the moments using multivariate density estimation, with appropriately chosen factors as the conditioning variables. All that is required is that these factors span the same space as the true state variables.6 The results for daily changes were provided in Section 2. Equation (13) shows that these estimates are an important part of the approximations to the underlying continuous-time dynamics. By adding multiperiod extensions of these nonparametric estimated conditional moments, we can estimate the drift, volatility and correlation coefficients of the multifactor process described by equations (5) and (6).
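As a numerical check on equation (13), the third order drift approximation can be applied to a process whose drift is known in closed form. The mean-reverting specification and all parameter values below are illustrative assumptions, and the conditional expectations, which the chapter estimates nonparametrically, are computed here by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative process with known drift: dR = kappa*(theta - R) dt + sigma dZ.
kappa, theta, sigma = 2.0, 0.06, 0.01
R0, dt, n_paths = 0.04, 1.0 / 250.0, 200_000

def simulate(steps):
    """Euler-simulate R over `steps` daily increments from R0."""
    R = np.full(n_paths, R0)
    for _ in range(steps):
        R += kappa * (theta - R) * dt + sigma * np.sqrt(dt) * rng.normal(size=n_paths)
    return R

# Monte Carlo estimates of E_t[R_{t+i*dt} - R_t] for i = 1, 2, 3.
E1, E2, E3 = (simulate(i).mean() - R0 for i in (1, 2, 3))

# Third order approximation from equation (13).
drift_hat = (18.0 * E1 - 9.0 * E2 + 2.0 * E3) / (6.0 * dt)
print("approximate drift:", drift_hat, "true drift:", kappa * (theta - R0))
```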

Figure 14.8 provides the first, second and third order approximations to the diffusion of the short rate against the short rate level and the slope of the term structure.7 The most notable result is that a first order approximation works well; thus, one can consider the theoretical results of this section as a justification for discretization methods currently used in the literature. The description of interest rate behavior given in Section 2, therefore, carries through to the continuous-time setting. Our major finding is that the

6See Duffie and Kan (1996) for a discussion of the conditions under which this is possible (in a linear setting).

7Figures showing the various approximations to the drift of the short rate, the drift and diffusion of the slope, and the correlation between the short rate and the slope are available upon request.


[Surface plot: first, second and third order approximations to the annualized short rate diffusion, roughly 0.004 to 0.018, against the short rate (0.03–0.11) and the slope (0–0.035).]

Fig. 14.8. First, second and third order approximations to the diffusion (annualized) of the short rate versus the short rate and the slope of the term structure

volatility of interest rates is increasing in the level of interest rates mostly for sharply upward sloping term structures. The question then is what does Figure 14.8, and more generally the rest of the estimated process, mean for fixed-income pricing?

4. A generalized Longstaff and Schwartz (1992) model

Longstaff and Schwartz (1992) provide a two-factor general equilibrium model of the term structure. Their model is one of the more popular versions within the affine class of models for describing the yield curve (see also Cox, Ingersoll and Ross, 1985; Chen and Scott, 1995; Duffie and Kan, 1996; and Dai and Singleton, 2000). In the Longstaff and Schwartz setting, all fixed-income instruments are functions of two fundamental factors, the instantaneous interest rate and its volatility. These factors follow diffusion processes, which in turn lead to a fundamental valuation condition for the price of any bond, or bond derivative. As an alternative, here we also present a two-factor continuous-time model for interest rates. The results of Section 2 suggest that the affine class may be too restrictive.

Although our results shed valuable light on the factors driving interest rate movements, there are potential problems in using this specification to price interest rate contingent claims. A general specification for R_t and S_t (and the associated prices of risk) may allow arbitrage opportunities if either of these state variables is a known function of an


asset price.8 Of course, this point is true of all previous estimations of continuous-time processes to the extent that they use a priced proxy as the instantaneous rate. If we are willing to assume that we have the right factors, however, then there is no problem in an asymptotic sense. That is, as we are estimating these processes nonparametrically, as the sample size gets larger, our estimates will converge to the true functions, which are automatically arbitrage-free (if the economy is). Nevertheless, this is of little consolation if we are trying to use the estimated functions to price assets.

To get around this problem, we need to write the model in a form in which neither state variable is an asset price or a function of asset prices. In this chapter, we follow convention by using the observable three-month yield as a proxy for the instantaneous rate, R_t. Furthermore, suppose that the mapping from (R, S) to (R, σ_R) is invertible,9 so we can write asset prices as a function of R and σ_R, instead of R and S.10 As σ_R is not an asset price, using this variable avoids the inconsistency problem.

Specifically, suppose that the true model governing interest rate movements is a generalization of the two-factor Longstaff and Schwartz (1992) model,

dR_t = μ_R(R, σ) dt + σ dZ_1,    (14)

dσ_t = μ_σ(R, σ) dt + ρ(R, σ) s(R, σ) dZ_1 + √(1 − ρ^2) s dZ_2,    (15)

where dZ_1 dZ_2 = 0.11 In vector terms,

d(R_t, σ_t) = M dt + θ dZ,

where

M ≡ ( μ_R )          θ ≡ ( σ         0          )
    ( μ_σ ),             ( ρs   √(1 − ρ^2) s    ).

Asset prices, and hence the slope of the term structure, can be written as some function of the short rate and instantaneous short rate volatility, S(R, σ).
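In discrete time, the system (14)–(15) can be simulated with a simple Euler scheme. The functional forms for μ_R, μ_σ, ρ and s below are illustrative placeholders, not the chapter's nonparametric estimates:

```python
import numpy as np

rng = np.random.default_rng(4)

# Placeholder drift/volatility functions for equations (14)-(15).
def mu_R(R, sig):   return 0.5 * (0.06 - R)      # level mean-reverts to 6%
def mu_sig(R, sig): return 2.0 * (0.01 - sig)    # vol mean-reverts to 1%
def rho(R, sig):    return 0.3                   # diffusion correlation
def s(R, sig):      return 0.004                 # vol of vol

dt, T = 1.0 / 250.0, 2500
R, sig = 0.06, 0.01
path_R, path_sig = [R], [sig]
for _ in range(T):
    z1, z2 = rng.normal(size=2)   # dZ1, dZ2 orthogonal, as in the text
    r, vv = rho(R, sig), s(R, sig)
    dR = mu_R(R, sig) * dt + sig * np.sqrt(dt) * z1
    dsig = (mu_sig(R, sig) * dt + r * vv * np.sqrt(dt) * z1
            + np.sqrt(1.0 - r ** 2) * vv * np.sqrt(dt) * z2)
    R, sig = R + dR, max(sig + dsig, 1e-4)   # keep volatility positive
    path_R.append(R)
    path_sig.append(sig)

print("mean short rate:", np.mean(path_R), "mean volatility:", np.mean(path_sig))
```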

From equations (14) and (15), how do we estimate the underlying processes for R and σ given the estimation results of Section 3? Although the short rate volatility, σ, is not directly observable, it is possible to estimate this process. Specifically, using Ito's

8See, for example, Duffie, Ma and Yong (1995). The problem is that, given such a model, we can price any bond, and are thus able to calculate what the state variable "ought" to be. Without imposing any restrictions on the assumed dynamics for R_t and S_t, there is no guarantee that we will get back to the same value of the state variable that we started with.

9That is, for a given value of R_t, the volatility, σ_R, is monotonic in the slope, S. This is the case in most existing multifactor interest rate models, including, for example, all affine models, such as Longstaff and Schwartz (1992).

10This follows by writing

V (R, S) = V (R, S(R, σR)) ≡ U(R, σR).

11This specification is the most convenient to deal with, as we now have orthogonal noise terms. The correlation between the diffusion terms is ρ, and the overall variance of σ is s^2 dt.


Lemma, together with estimates for μ_R, σ_R, μ_S, σ_S and ρ, it is possible to write

dσ_t = σ_R dR_t + σ_S dS_t + (1/2) [ σ_{RR} σ^2(R_t, S_t) + σ_{SS} σ^2_S(R_t, S_t) + 2σ_{RS} σ(R_t, S_t) σ_S(R_t, S_t) ρ(R_t, S_t) ] dt,

where the subscripts on σ denote partial derivatives of σ(R, S).

Given this equation, and the assumption that the function S(R, σ) is invertible, the dynamics of σ_t can be written as a function of the current level of R and σ in a straightforward way.

This procedure requires estimation of a matrix of second derivatives. Although there are well-known problems in estimating higher order derivatives using kernel density estimation techniques, it is possible to link the results of Sections 2 and 3 to this generalized Longstaff and Schwartz (1992) model. In particular, using estimates of the second derivatives (not shown), several facts emerge. First, due to the small magnitudes of the estimated drifts of the state variables R and S, the drift of σ depends primarily on the second order terms. Consequently, the importance of the second factor (the slope) is determined by how much the sensitivity of short rate volatility to this factor changes relative to the changes in the sensitivity to the first factor (the level). The general pattern is that volatility increases at a slower rate for high levels and a faster rate for high slopes. Consequently, for high volatilities and levels, the drift of volatility is negative, generating mean reversion. The effect of the second factor, however, is to counter this phenomenon. Second, the diffusion of σ is determined by the sensitivities of short rate volatility to the two factors and the magnitudes of the volatilities of the factors. Based on the estimates of the volatilities and derivatives, the slope has the dominant influence on this effect. In particular, the volatility of σ is high for upward sloping term structures, which also correspond to states with high short rate volatility. Moreover, sensitivity of this diffusion to the two factors is larger in the slope direction than in the level direction.

As an alternative to the above method, we can estimate an implied series for σ by assuming that the function S(R, σ) is invertible, i.e., that we can equivalently write the model in the form

$$dR_t = \mu_R(R_t,S_t)\,dt + \sigma(R_t,S_t)\,dZ_1^*,$$
$$dS_t = \mu_S(R_t,S_t)\,dt + \sigma_S(R_t,S_t)\,dZ_2^*,$$

where $Z_1^*$ and $Z_2^*$ may be correlated. To estimate the function σ(R, S), we apply the methodology described in Section 3.1 to the function $f^{(3)}(R,S) \equiv (R - R_t)^2$. Applying the estimated function to each observed (R, S) pair in turn yields a series for the volatility σ, which we can then use in estimating the generalized Longstaff and Schwartz (1992) model given in equations (14) and (15).12 This procedure is in stark contrast to that of Longstaff and Schwartz (1992), and others, who approximate the dynamics of the volatility factor as a Generalized Autoregressive Conditional Heteroskedasticity (GARCH) process. The GARCH process is not strictly compatible with the underlying dynamics of their continuous-time model; here, the estimation is based on approximation

12Although the use of an estimated series for σ rather than the true series may not be the most efficient approach, this procedure is consistent. That is, the problem will disappear as the sample size becomes large, and our pointwise estimates of σ converge to the true values.

Page 329: Bollerslev T. Et Al._2010_Volatility and Time Series Eco No Metrics

316 A multifactor, nonlinear, continuous-time model of interest rate volatility

[Figure: scatter of Sigma against the short rate]

Fig. 14.9. Scatter plot of the three-month rate versus the term structure volatility over the 1983–2006 period

schemes to the diffusion process and is internally consistent. Due to the difficulties in estimating derivatives, we choose this second approach to estimate the continuous-time process.13
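The conditional-moment step behind this second approach can be sketched with a bivariate Nadaraya–Watson kernel regression of squared rate changes on the two factors. The simulated data, bandwidths, and diffusion function below are purely illustrative assumptions, not the chapter's estimates:

```python
import numpy as np

def nw_regression(points, X, y, h):
    """Bivariate Nadaraya-Watson estimator with a Gaussian product kernel.
    X: (n, 2) observed (R, S) pairs; y: (n,) responses; h: (2,) bandwidths."""
    out = np.empty(len(points))
    for i, x in enumerate(points):
        u = (X - x) / h                           # scaled distances to x
        w = np.exp(-0.5 * (u ** 2).sum(axis=1))   # product Gaussian kernel weights
        out[i] = w @ y / w.sum()                  # locally weighted mean
    return out

# Illustration on simulated data (all functional forms are hypothetical):
rng = np.random.default_rng(0)
n, dt = 5000, 1.0 / 252
R = 0.03 + 0.08 * rng.random(n)                  # short-rate level
S = -0.01 + 0.03 * rng.random(n)                 # term-structure slope
sigma_true = 0.02 + 0.3 * np.maximum(S, 0.0)     # assumed diffusion sigma(R, S)
dR = sigma_true * np.sqrt(dt) * rng.standard_normal(n)

# First-order approximation: E[(R_{t+dt} - R_t)^2 | R, S] ~ sigma^2(R, S) dt
X = np.column_stack([R, S])
grid = np.array([[0.05, 0.018], [0.05, -0.005]])  # (level, slope) evaluation points
m2 = nw_regression(grid, X, dR ** 2, h=np.array([0.01, 0.005]))
sigma_hat = np.sqrt(m2 / dt)   # larger at the steep-slope point, as constructed
```

Because the regression only uses first conditional moments of the sample path, it avoids the noisier second-derivative estimates discussed above.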

4.1. A general two-factor diffusion process: Empirical results

Figures 14.10–14.12 show approximations to equation (15) for the generalized Longstaff and Schwartz (1992) process as a function of the two factors, the instantaneous short rate and its volatility. It is important to point out that there are few available data at low short rates/high volatilities and high short rates/low volatilities, which corresponds to the earlier comment about interest rates and slopes (see Figure 14.9). Therefore, results in these regions need to be treated cautiously.

Figures 14.10 and 14.11 provide the estimates of the continuous-time process for the second interest rate factor, namely its volatility. Several observations are in order. First, there is estimated mean-reversion in volatility; at low (high) levels of volatility, volatility tends to drift upward (downward). The effect of the level of interest rates on this relation appears minimal. Second, and perhaps most important, there is clear evidence that the diffusion of the volatility process is increasing in the level of volatility, yet is affected

13Although the first approach provides similar results to the second approach, the functional forms underlying the second method are smoother and thus more suitable for analysis.


[Figure: surface of Drift (sigma) against r and Sigma]

Fig. 14.10. First order approximation to the drift (annualized) of the volatility versus the short rate and the volatility of the term structure

by the level of interest rates only marginally. Moreover, volatility's effect is nonlinear in that it takes effect only at higher levels. This finding suggests extreme caution should be applied when inputting interest rate volatility into derivative pricing models. Most of our models take the relation between the level and volatility for granted; however, with increases from 3% to 11% in the interest rate level, both the drift and diffusion of volatility exhibit only mild increases. On the other hand, changes in the volatility level of much smaller magnitudes have a much larger impact on the volatility process. This finding links the term structure slope result documented earlier in the chapter to a second factor, namely the volatility of the instantaneous rate, and provides a close connection to the Engle, Lilien and Robins (1987) paper mentioned throughout this chapter.

As the final piece of the multifactor process for interest rates, Figure 14.12 graphs a first order approximation of the correlation coefficient between the short rate and the volatility, given values of the two factors. Taken at face value, the results suggest a complex variance–covariance matrix between these series in continuous time. In particular, whereas the correlation decreases in the volatility for most interest rate levels, there appears to be some nonmonotonicity across the level itself. Why is correlation falling as volatility increases? Perhaps high volatility, just like the corresponding high term structure slope, is associated with aggregate economic phenomena that are less related to the level of interest rates. Given that interest rates are driven by two relatively independent economic factors, namely expectations about both real rates and inflation, this argument


[Figure: surface of Volatility (sigma) against r and Sigma]

Fig. 14.11. First order approximation to the diffusion (annualized) of the volatility versus the short rate and the volatility of the term structure

seems reasonable. It remains an open question, however, what the exact relationship is between Figure 14.12 and these economic factors.

4.2. Valuation of fixed-income contingent claims

Given the interest rate model described in equation (15), we can write the price of an interest rate contingent claim as V(r, σ, t), depending only on the current values of the two state variables plus time. Then, by Ito's Lemma,

$$\frac{dV(r,\sigma,t)}{V(r,\sigma,t)} = m(r,\sigma,t)\,dt + s_1(r,\sigma,t)\,dZ_1 + s_2(r,\sigma,t)\,dZ_2, \qquad (16)$$

where

$$\begin{aligned}
m(r,\sigma,t)\,V &= V_t + \mu_r(r,\sigma)V_r + \mu_\sigma(r,\sigma)V_\sigma + \tfrac{1}{2}\,\mathrm{trace}\big[\theta^{T}\,\nabla^2 V(r,\sigma)\,\theta\big] \\
&= V_t + \mu_r(r,\sigma)V_r + \mu_\sigma(r,\sigma)V_\sigma + \tfrac{1}{2}\sigma^2 V_{rr} + \tfrac{1}{2}s^2 V_{\sigma\sigma} + \rho\sigma s V_{r\sigma}, \qquad (17) \\
s_1(r,\sigma,t)\,V &= \sigma V_r + \rho s V_\sigma, \\
s_2(r,\sigma,t)\,V &= \sqrt{1-\rho^2}\,s V_\sigma.
\end{aligned}$$


[Figure: surface of Correlation (r, sigma) against r and Sigma]

Fig. 14.12. First order approximation to the correlation coefficient between changes in the short rate and the volatility versus the short rate and the volatility of the term structure

The volatility of the asset, σ_V, is given by

$$\sigma_V V = \sqrt{(\sigma V_r + \rho s V_\sigma)^2 + (1-\rho^2)\,s^2 V_\sigma^2} = \sqrt{\sigma^2 V_r^2 + 2\rho\sigma s V_r V_\sigma + s^2 V_\sigma^2}.$$

With a one-factor interest rate model, to prevent arbitrage, the risk premium on any asset must be proportional to its standard deviation.14 Similarly, with two factors, absence of arbitrage requires the excess return on an asset to be a linear combination of its exposure to the two sources of risk. Thus, if the asset pays out dividends at rate d, we can write

$$m = r - \frac{d}{V} + \lambda_r(r,\sigma)\,\frac{V_r}{V} + \lambda_\sigma(r,\sigma)\,\frac{V_\sigma}{V}, \qquad (18)$$

where λ_r and λ_σ are the prices of short rate risk and volatility risk, respectively. Substituting equation (18) into equation (17), and simplifying, leads to a partial differential equation that must be satisfied by any interest rate contingent claim, assuming the usual

14Suppose this did not hold for two risky assets. We could then create a riskless portfolio of these two assets with a return strictly greater than r, leading to an arbitrage opportunity (see Ingersoll, 1987).


technical smoothness and integrability conditions (see, for example, Duffie, 1988),

$$\tfrac{1}{2}\sigma^2 V_{rr} + [\mu_r - \lambda_r]V_r + \tfrac{1}{2}s^2 V_{\sigma\sigma} + [\mu_\sigma - \lambda_\sigma]V_\sigma + \rho\sigma s V_{r\sigma} + V_t - rV + d = 0, \qquad (19)$$

subject to appropriate boundary conditions. To price interest rate dependent assets, we need to know not only the processes governing movements in r and σ, but also the prices of risk, λ_r and λ_σ.

Equation (18) gives an expression for these functions in terms of the partial derivatives V_r and V_σ, which could be used to estimate the prices of risk, given estimates of these derivatives for two different assets, plus estimates of the excess return for each asset. As mentioned above, it is difficult to estimate derivatives precisely using nonparametric density estimation. Therefore, instead of following this route, one could avoid directly estimating the partial derivatives, V_r and V_σ, by considering the instantaneous covariances between the asset return and changes in the interest rate/volatility, c_Vr and c_Vσ. From equations (14), (15) and (16) (after a little simplification),

$$\begin{pmatrix} c_{Vr} \\ c_{V\sigma} \end{pmatrix} \equiv \begin{pmatrix} dV\,dr/(V\,dt) \\ dV\,d\sigma/(V\,dt) \end{pmatrix} = \begin{pmatrix} \sigma^2 & \rho\sigma s \\ \rho\sigma s & s^2 \end{pmatrix}\begin{pmatrix} V_r/V \\ V_\sigma/V \end{pmatrix}. \qquad (20)$$

This can be inverted, as long as |ρ| < 1, to obtain

$$\begin{pmatrix} V_r/V \\ V_\sigma/V \end{pmatrix} = \begin{pmatrix} \sigma^2 & \rho\sigma s \\ \rho\sigma s & s^2 \end{pmatrix}^{-1}\begin{pmatrix} c_{Vr} \\ c_{V\sigma} \end{pmatrix} = \frac{1}{1-\rho^2}\begin{pmatrix} 1/\sigma^2 & -\rho/(\sigma s) \\ -\rho/(\sigma s) & 1/s^2 \end{pmatrix}\begin{pmatrix} c_{Vr} \\ c_{V\sigma} \end{pmatrix}.$$

To preclude arbitrage, the excess return on the asset must also be expressible as a linear combination of c_Vr and c_Vσ,

$$m = r - \frac{d}{V} + \lambda_r^*(r,\sigma)\,c_{Vr} + \lambda_\sigma^*(r,\sigma)\,c_{V\sigma}. \qquad (21)$$

Given two different interest rate dependent assets, we can estimate the instantaneous covariances for each in the same way as we estimated ρ(r, σ) above. We can also estimate the excess return for each asset, m_i(r, σ) − r, as a function of the two state variables. The two excess returns can be expressed in the form

$$\begin{pmatrix} m_1 - r \\ m_2 - r \end{pmatrix} = \begin{pmatrix} c_{1Vr} & c_{1V\sigma} \\ c_{2Vr} & c_{2V\sigma} \end{pmatrix}\begin{pmatrix} \lambda_r^* \\ \lambda_\sigma^* \end{pmatrix},$$

which can be inverted to yield an estimate of the prices of risk,

$$\begin{pmatrix} \lambda_r^* \\ \lambda_\sigma^* \end{pmatrix} = \begin{pmatrix} c_{1Vr} & c_{1V\sigma} \\ c_{2Vr} & c_{2V\sigma} \end{pmatrix}^{-1}\begin{pmatrix} m_1 - r \\ m_2 - r \end{pmatrix}.$$

Finally, for estimates of the more standard representation of the prices of risk, λ_r and λ_σ, equate equations (18) and (21), using equation (20), to obtain

$$\begin{pmatrix} \lambda_r \\ \lambda_\sigma \end{pmatrix} = \begin{pmatrix} \sigma^2 & \rho\sigma s \\ \rho\sigma s & s^2 \end{pmatrix}\begin{pmatrix} \lambda_r^* \\ \lambda_\sigma^* \end{pmatrix}.$$


Given estimates for the process governing movements in r and σ, and the above procedure for the functions λ_r and λ_σ, we can value interest rate dependent assets in one of two ways. The first is to solve equation (19) numerically using a method such as the Hopscotch method of Gourlay and McKee (1977). The second is to use the fact that we can write the solution to equation (19) in the form of an expectation. Specifically, we can write V, the value of an asset which pays out cash flows at a (possibly path-dependent) rate C_t, in the form

$$V_t = E\left[\int_t^T e^{-\int_t^s \hat r_u\,du}\,C_s\,ds\right], \qquad (22)$$

where $\hat r$ follows the "risk adjusted" process,

$$d\hat r_\tau = \big[\mu_r(\hat r_\tau,\hat\sigma_\tau) - \lambda_r(\hat r_\tau,\hat\sigma_\tau)\big]\,d\tau + \hat\sigma_\tau\,dZ_1, \qquad (23)$$
$$d\hat\sigma_\tau = \big[\mu_\sigma(\hat r_\tau,\hat\sigma_\tau) - \lambda_\sigma(\hat r_\tau,\hat\sigma_\tau)\big]\,d\tau + \rho\,s(\hat r_\tau,\hat\sigma_\tau)\,dZ_1 + \sqrt{1-\rho^2}\,s(\hat r_\tau,\hat\sigma_\tau)\,dZ_2, \qquad (24)$$

for all τ > t, and where

$$\hat r_t = r_t, \qquad \hat\sigma_t = \sigma_t.$$

This says that the value of the asset equals the expected sum of discounted cash flows paid over the life of the asset, except that it substitutes the risk adjusted process (r̂, σ̂) for the true process (r, σ).

This representation leads directly to a valuation algorithm based on Monte Carlo simulation. For a given starting value of (r_t, σ_t), simulate a number of paths for r̂ and σ̂ using equations (23) and (24). Along each path, calculate the cash flows C_t, and discount these back along the path followed by the instantaneous riskless rate, r̂_t. The average of the sum of these values taken over all simulated paths is an approximation to the expectation in equation (22), and hence to the security value, V_t. The more paths simulated, the closer the approximation.
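As a concrete sketch of this algorithm, the following Euler discretization of equations (23) and (24) prices a claim paying a constant coupon. The mean-reverting risk-adjusted drifts, the vol-of-vol function, and the constant correlation are stand-in assumptions, since the chapter's own estimates are nonparametric:

```python
import numpy as np

def mc_value(r0, sig0, T, n_steps, n_paths, coupon=1.0, rho=0.3, seed=0):
    """Monte Carlo value of a claim paying cash at a constant rate `coupon`
    until T, discounted along simulated risk-adjusted short-rate paths."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    r = np.full(n_paths, r0)
    sig = np.full(n_paths, sig0)
    discount = np.ones(n_paths)          # exp(-integral of r) along each path
    value = np.zeros(n_paths)
    for _ in range(n_steps):
        value += discount * coupon * dt  # accumulate the discounted cash flow
        z1 = rng.standard_normal(n_paths)
        z2 = rng.standard_normal(n_paths)
        # Stand-in risk-adjusted drifts (mu - lambda) and vol-of-vol s(r, sigma):
        drift_r = 0.5 * (0.05 - r)
        drift_sig = 1.0 * (0.01 - sig)
        s = 0.1 * sig
        r = r + drift_r * dt + sig * np.sqrt(dt) * z1
        sig = np.abs(sig + drift_sig * dt
                     + s * np.sqrt(dt) * (rho * z1 + np.sqrt(1 - rho ** 2) * z2))
        discount *= np.exp(-r * dt)
    return value.mean()

# Five-year annuity paying 1 per year, short rate starting at its long-run mean;
# the value should be close to the deterministic annuity factor of about 4.4.
v = mc_value(r0=0.05, sig0=0.01, T=5.0, n_steps=250, n_paths=2000)
```

Increasing `n_paths` shrinks the Monte Carlo error at the usual square-root rate, which is the sense in which "the more paths simulated, the closer the approximation."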

5. Conclusion

This chapter provides a method for estimating multifactor continuous-time Markov processes. Using Milshtein's (1978) approximation schemes for writing expectations of functions of the sample path of stochastic differential equations in terms of the drift, volatility and correlation coefficients, we provide nonparametric estimation of the drift and diffusion functions of multivariate stochastic differential equations. We apply this technique to the short- and long-end of the term structure for a general two-factor, continuous-time diffusion process for interest rates. In estimating this process, several results emerge. First, the volatility of interest rates is increasing in the level of interest rates only for sharply upward sloping term structures. Thus, the result of previous studies, suggesting an almost exponential relation between interest rate volatility and levels, is due to the term structure on average being upward sloping, and is not a general result per se. Second, the finding that partly motivates this chapter, i.e., the link between slope


and interest rate volatility in Engle, Lilien and Robins (1987), comes out quite naturally from the estimation. Finally, the slope of the term structure, on its own, plays a large role in determining the magnitude of the diffusion coefficient. These volatility results hold across maturities, which suggests that a low dimensional system (with nonlinear effects) may be enough to explain the term structure of interest rates.

As a final comment, there are several advantages of the procedure adopted in this chapter. First, there is a constant debate among researchers on the relative benefits of using equilibrium versus arbitrage-free models. Here, we circumvent this issue by using actual data to give us the process and corresponding prices of risk. As the real world coincides with the intersection of equilibrium and arbitrage-free models, our model is automatically consistent. Of course, in a small sample, statistical error will produce estimated functional forms that do not conform. This problem, however, is true of all empirical work. Second, we show how our procedure for estimating the underlying multifactor continuous-time diffusion process can be used to generate fixed income pricing. As an example, we show how our results can be interpreted within a generalized Longstaff and Schwartz (1992) framework, that is, one in which the drift and diffusion coefficients of the instantaneous interest rate and volatility are both (nonlinear) functions of the level of interest rates and the volatility. Third, and perhaps most important, the pricing of fixed-income derivatives depends crucially on the level of volatility. The results in this chapter suggest that volatility depends on both the level and slope of the term structure, and they therefore contain insights into the eventual pricing of derivatives.


15

Estimating the Implied Risk-Neutral Density for the US Market Portfolio

Stephen Figlewski

1. Introduction

The Black–Scholes (BS) option pricing model has had an enormous impact on academic valuation theory and also, impressively, in the financial marketplace. It is safe to say that virtually all serious participants in the options markets are aware of the model and most use it extensively. Academics tend to focus on the BS model as a way to value an option as a function of a handful of intuitive input parameters, but practitioners quickly realized that one required input, the future volatility of the underlying asset, is neither observable directly nor easy to forecast accurately. However, an option's price in the market is observable, so one can invert the model to find the implied volatility (IV) that makes the option's model value consistent with the market. This property is often more useful than the theoretical option price for a trader who needs the model to price less liquid options consistently with those that are actively traded in the market, and to manage his risk exposure.
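The inversion has no closed form, but because the BS value is strictly increasing in volatility it can be done with simple bisection. A self-contained sketch for a non-dividend-paying call, with illustrative parameter values:

```python
from math import erf, exp, log, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, vol):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (log(S / K) + (r + 0.5 * vol ** 2) * T) / (vol * sqrt(T))
    d2 = d1 - vol * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Invert the BS formula by bisection; `price` must lie between the
    no-arbitrage bounds for a root to exist in [lo, hi]."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid            # model value too low -> raise volatility
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round-trip check: price an option at 20% volatility, then recover the 20%
p = bs_call(100.0, 105.0, 0.5, 0.02, 0.20)
iv = implied_vol(p, 100.0, 105.0, 0.5, 0.02)
```

Production systems typically use Newton's method with the vega as the derivative, but bisection makes the monotonicity argument explicit.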

An immediate problem with IVs is that when they are computed for options written on the same underlying, they differ substantially according to "moneyness". The now-familiar pattern is called the volatility smile, although for options on equities, and stock indexes in particular, the smile has become sufficiently asymmetrical over time, with higher

Acknowledgments: Thanks to Justin Birru for excellent assistance on this research and to Otto van Hemert, Robert Bliss, Tim Bollerslev, an anonymous reviewer, and seminar participants at NYU, Baruch, Georgia Tech, Essex University, Lancaster University, Bloomberg, and the Robert Engle Festschrift Conference for valuable comments.


IVs for low exercise price options, that it is now more properly called a "smirk" or a "skew".1

Implied volatility depends on the valuation model used to extract it, and the existence of a volatility smile in Black–Scholes IVs implies that options market prices are not fully consistent with that model. Even so, the smile is stable enough over short time intervals that traders use the BS model anyway, by inputting different volatilities for different options according to their moneyness. This jury-rigged procedure, known as "practitioner Black–Scholes", is an understandable strategy for traders, who need some way to impose pricing consistency across a broad range of related financial instruments, and do not care particularly about theoretical consistency with academic models. This has led to extensive analysis of the shape and dynamic behavior of volatility smiles, even though it is odd to begin with a model that is visibly inconsistent with the empirical data and hope to improve it by modeling the behavior of the inconsistency.

Extracting important but unobservable parameters from option prices in the market is not limited to implied volatility. More complex models can be calibrated to the market by implying out the necessary parameter values, such as the size and intensity of discrete price jumps. The most fundamental valuation principle, which applies to all financial assets, not just options, is that a security's market price should be the market's expected value of its future payoff, discounted back to the present at a discount rate appropriately adjusted for risk.

Risk premia are also unobservable, unfortunately, but a fundamental insight of contingent claims pricing theory is that when a pricing model can be obtained using the principle of no-arbitrage, the risk-neutral probability distribution can be used in computing the expected future payoff, and the discount rate to bring that expectation back to the present is the riskless rate. The derivative security can be priced relative to the underlying asset under the risk-neutralized probability distribution because investors' actual risk preferences are embedded in the price of the underlying asset.

Breeden and Litzenberger (1978) and Banz and Miller (1978) showed that, like implied volatility, the entire risk-neutral probability distribution can be extracted from market option prices, given a continuum of exercise prices spanning the possible range of future payoffs. An extremely valuable feature of this procedure is that it is model-free, unlike extracting IV. The risk-neutral distribution does not depend on any particular pricing model.
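The Breeden–Litzenberger result says the RND is the discounted second strike-derivative of the call pricing function, f(K) = e^{rT} ∂²C/∂K². A minimal finite-difference sketch, checked on synthetic Black–Scholes prices (where the true RND is known to be lognormal); the grid and parameter values are illustrative:

```python
from math import erf, exp, log, sqrt
import numpy as np

def bs_call(S, K, T, r, vol):
    """Black-Scholes European call price (no dividends)."""
    d1 = (log(S / K) + (r + 0.5 * vol ** 2) * T) / (vol * sqrt(T))
    d2 = d1 - vol * sqrt(T)
    N = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return S * N(d1) - K * exp(-r * T) * N(d2)

def rnd_from_calls(strikes, calls, r, T):
    """Risk-neutral density at the interior strikes: f(K) = exp(rT) C''(K),
    using central second differences on an evenly spaced strike grid."""
    dK = strikes[1] - strikes[0]
    curv = (calls[2:] - 2.0 * calls[1:-1] + calls[:-2]) / dK ** 2
    return strikes[1:-1], np.exp(r * T) * curv

S0, r, T, vol = 100.0, 0.02, 0.5, 0.20
strikes = np.arange(40.0, 200.0, 1.0)            # dense synthetic strike grid
calls = np.array([bs_call(S0, K, T, r, vol) for K in strikes])
K_mid, density = rnd_from_calls(strikes, calls, r, T)
# density integrates to ~1 over the strike range and peaks near the forward
```

With real quotes the same second difference is applied only after the interpolation and smoothing steps discussed below, since raw market prices make the curvature estimate wildly noisy.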

At a point in time, the risk-neutral probability distribution and the associated risk-neutral density function, for which we will use the acronym RND, contain an enormous amount of information about the market's expectations and risk preferences, and their dynamics can reveal how information releases and events that affect risk attitudes impact the market. Not surprisingly, a considerable amount of previous work has been done to extract and interpret RNDs, using a variety of methods and with a variety of purposes in mind.2

1Occasionally a writer will describe the pattern as a "sneer" but this is misleading. A smile curves upward more or less symmetrically at both ends; a smirk also curves upward but more so at one end than the other; a "skew" slopes more or less monotonically downward from left to right; but the term "sneer" would imply a downward curvature, i.e., a concave portion of the curve at one end, which is not a pattern seen in actual options markets.

2An important practical application of this concept has been the new version of the Chicago Board Options Exchange's VIX index of implied volatility (Chicago Board Options Exchange, 2003). The original VIX methodology constructed the index as a weighted average of BS implied volatilities from


Estimation of the RND is hampered by two serious problems. First, the theory calls for options with a continuum of exercise prices, but actual options markets only trade a relatively small number of discrete strikes. This is especially problematic for options on individual stocks, but even index options have strikes at least 5 points apart, and up to 25 points apart or more in some parts of the available range. Market prices also contain microstructure noise from various sources, and bid-ask spreads are quite wide for options, especially for less liquid contracts and those with low prices. Slight irregularities in observed option prices can easily translate into serious irregularities in the implied RND, such as negative probabilities. Extracting a well-behaved estimate of a RND requires interpolation, to fill in option values for a denser set of exercise prices, and smoothing, to reduce the influence of microstructure noise.

The second major problem is that the RND can be extracted only over the range of available strikes, which generally does not extend very far into the tails of the distribution. For some purposes, knowledge of the full RND is not needed. But in many cases, what makes options particularly useful is the fact that they have large payoffs in the comparatively rare times when the underlying asset makes a large price move, i.e., in the tails of its returns distribution.

The purpose of this chapter is to present a new methodology for extracting complete well-behaved RND functions from options market prices and to illustrate the potential of this tool for understanding how expectations and risk preferences are incorporated into prices in the US stock market. We review a variety of techniques for obtaining smooth densities from a set of observed options prices and select one that offers good performance. This procedure is then modified to incorporate the market's bid-ask spread into the estimation. Second, we will show how the tails of the RND obtained from the options market may be extended and completed by appending tails from a generalized extreme value (GEV) distribution. We then apply the procedure to estimate RNDs for the S&P 500 stock index from 1996–2008 and develop several interesting results.

The next section will give a brief review of the extensive literature related to this topic. Section 3 details how the RND can theoretically be extracted from options prices. The following section reviews alternative smoothing procedures needed to obtain a well-behaved density from actual options prices. Section 5 presents our new methodology for completing the RND by appending tails from a GEV distribution. Section 6 applies the methodology to explore the behavior of the empirical RND for the Standard and Poor's 500 index over the period 1996–2008. The results presented in this section illustrate some of the great potential of this tool for revealing how the stock market behaves. The final section will offer some concluding comments and a brief description of several potentially fruitful lines of future research based on this technology.

2. Review of the literature

The literature on extracting and interpreting the risk-neutral distribution from market option prices is broad, and it becomes much broader if the field is extended to cover

eight options written on the S&P 100 stock index. This was replaced in 2003 by a calculation that amounts to estimating the standard deviation of the risk-neutral density from options on the S&P 500 index.


research on implied volatilities and on modeling the returns distribution. In this literature review, we restrict our attention to papers explicitly on RNDs.

The monograph by Jackwerth (2004) provides an excellent and comprehensive review of the literature on this topic, covering both methodological issues and applications. Bliss and Panigirtzoglou (2002) also give a very good review of the alternative approaches to extracting the RND and the problems that arise with different methods. Bahra (1997) is another often-cited review of methodology, done for the Bank of England prior to the most recent work in this area.

One way to categorize the literature is according to the methods used by different authors to extract a RND from a set of option market prices. These fall largely into three approaches: fitting a parametric density function to the market data, approximating the RND with a nonparametric technique, or developing a model of the returns process that produces the empirical RND as the density for the value of the underlying asset on option expiration day.

An alternative classification is according to the authors' purpose in extracting a risk-neutral distribution. Many authors begin with a review of the pros and cons of different extraction techniques in order to select the one they expect to work best for their particular application. Because a risk-neutral density combines estimates of objective probabilities and risk preferences, a number of papers seek to use the RND as a window on market expectations about the effects of economic events and policy changes on exchange rates, interest rates, and stock prices. Other papers take the opposite tack, in effect abstracting from the probabilities in order to examine the market's risk preferences that are manifested in the difference between the risk-neutral density and the empirical density. A third branch of the literature is mainly concerned with extracting the RND as an econometric problem. These papers seek to optimize the methodology for estimating RNDs from noisy market options prices. The most ambitious papers construct an implied returns process, such as an implied binomial tree, that starts from the underlying asset's current price and generates the implied RND on option expiration date. This approach leads to a full option pricing model, yielding both theoretical option values and Greek letter hedging parameters.

Bates (1991) was one of the first papers concerned with extracting information about market expectations from option prices. It analyzed the skewness of the RND from S&P 500 index options around the stock market crash of 1987 as a way to judge whether the crash was anticipated by the market. Like Bahra (1997), Söderlind and Svensson (1997) proposed learning about the market's expectations for short-term interest rates, exchange rates, and inflation by fitting RNDs as mixtures of two normal or lognormal densities. Melick and Thomas (1997) modeled the RND as a mixture of three lognormals. Using crude oil options, their estimated RNDs for oil prices during the period of the 1991 Persian Gulf crisis were often bimodal and exhibited shapes that were inconsistent with a univariate lognormal. They interpreted this as the market's response to media commentary at the time and the anticipation that a major disruption in world oil prices was possible.

In their examination of exchange rate expectations, Campa, Chang and Reider (1998) explored several estimation techniques and suggested that there is actually little difference among them. However, this conclusion probably depends strongly on the fact that their currency options data only provided five strike prices per day, which substantially limits the flexibility of the functional forms that could be fitted. Malz (1997) also modeled


exchange rate RNDs and added a useful wrinkle. FX option prices are typically quoted in terms of their implied volatilities under the Garman–Kohlhagen model and moneyness is expressed in terms of the option's delta. For example, a "25 delta call" is an out of the money call option with a strike such that the option's delta is 0.25. Malz used a simple function involving the prices of option combination positions to model and interpolate the implied volatility smile in delta-IV space.
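Under Garman–Kohlhagen the (spot) delta of a call is e^{−r_f T} N(d₁), so a quoted delta maps to a strike by inverting the normal CDF and then solving d₁ for K. A small sketch with made-up FX-style parameter values (it is not Malz's interpolation function, only the delta-to-strike conversion it relies on):

```python
from math import exp, log, sqrt
from statistics import NormalDist

def strike_from_delta(delta, S, T, sigma, r_dom, r_for):
    """Strike of a call with the given spot delta under Garman-Kohlhagen.
    Call delta = exp(-r_for*T) * N(d1); invert N, then solve d1 for K."""
    d1 = NormalDist().inv_cdf(delta * exp(r_for * T))
    return S * exp(-d1 * sigma * sqrt(T) + (r_dom - r_for + 0.5 * sigma ** 2) * T)

# A "25 delta call" on a hypothetical currency pair quoted at 1.10:
K = strike_from_delta(0.25, S=1.10, T=0.25, sigma=0.10, r_dom=0.03, r_for=0.01)
# K lies above the forward, i.e., the 25-delta call is out of the money
```

Mapping each quoted delta to its strike in this way converts a delta-IV smile into the strike-IV function needed for RND extraction.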

Quite a few authors have fitted RNDs to stock market returns, but for the most part, their focus has not been on the market's probability estimates but on risk preferences. An exception is Gemmill and Saflekos (2000), who fitted a mixture of two lognormals to FTSE stock index options and looked for evidence that investors' probability beliefs prior to British elections reflected the dichotomous nature of the possible outcomes.

Papers that seek to use RNDs to examine the market's risk preferences include Aït-Sahalia and Lo (1998, 2000), Jackwerth (2000), Rosenberg and Engle (2002) and Bliss and Panigirtzoglou (2004).

In their 1998 paper, Aït-Sahalia and Lo used a nonparametric kernel smoothing procedure to extract RNDs from S&P 500 index option prices. Unlike other researchers, they assumed that if the RND is properly modeled as a function of moneyness and the other parameters that enter the Black–Scholes model, it will be sufficiently stable over time that a single RND surface defined on log return and days to maturity can be fitted to a whole calendar year (1993). Although they did not specifically state that their approach focuses primarily on risk preferences, it is clear that if the RND is this stationary, its shape is not varying in response to the flow of new information entering the market's expectations, beyond what is reflected in the changes in the underlying asset price. Aït-Sahalia and Lo (2000) applies the results of their earlier work to the Value-at-Risk problem and proposes a new VaR concept that includes the market's risk preferences as revealed in the nonparametric RND.

Jackwerth (2000) uses the methodology proposed in Jackwerth and Rubinstein (1996) to fit smooth RNDs to stock prices around the 1987 stock market crash and the period following it. The paper explores the market's risk attitudes, essentially assuming that they are quite stable over time, but subject to substantial regime changes. The resulting risk aversion functions exhibit some anomalies, however, leaving some important open questions.

In their cleverly designed study, Bliss and Panigirtzoglou (2004) assume a utility function of a particular form. Given a level of risk aversion, they can then extract the representative investor's true (subjective) expected probability distribution. They assume the representative investor has rational expectations and find the value of the constant risk-aversion parameter that gives the best match between the extracted subjective distribution and the distribution of realized outcomes. By contrast, Rosenberg and Engle (2002) model a fully dynamic risk-aversion function by fitting a stochastic volatility model to S&P 500 index returns and extracting the "empirical pricing kernel" on each date from the difference between the estimated empirical distribution and the observed RND.

The literature on implied trees began with three papers written at about the same time. Perhaps the best-known is Rubinstein's (1994) Presidential Address to the American Finance Association, in which he described how to fit binomial trees that replicate the RNDs extracted from options prices. Rubinstein found some difficulty in fitting a well-behaved left tail for the RND and chose the approach of using a lognormal density as a Bayesian prior for the RND. Jackwerth (1997) generalized Rubinstein's binomial lattice to produce a better fit, and Rubinstein (1998) suggested a different extension, using an Edgeworth expansion to fit the RND and then constructing a tree consistent with the resulting distribution. Both Dupire (1994) and Derman and Kani (1994) also developed implied tree models at about the same time as Rubinstein. Dupire fit an implied trinomial lattice, whereas Derman and Kani, like Rubinstein, used options prices to imply out a binomial tree, but they combined multiple maturities to get a tree that simultaneously matched RNDs for different expiration dates. Their approach was extended in Derman and Kani (1998) to allow implied (trinomial) trees that matched both option prices and implied volatilities. Unfortunately, despite the elegance of these techniques, their ability to produce superior option pricing and hedging parameters was called into question by Dumas, Fleming and Whaley (1998), who offered empirical evidence that the implied lattices were no better than "practitioner Black–Scholes".

The most common method to model the RND is to select a known parametric density function, or a mixture of such functions, and fit its parameters by minimizing the discrepancy between the fitted function and the empirical RND. A variety of distributions and objective functions have been investigated and their relative strengths debated in numerous papers, including those already mentioned. Simply computing an implied volatility using the Black–Scholes equation inherently assumes that the risk-neutral density for the cumulative return as of expiration is lognormal. Its mean is the riskless rate (with an adjustment for the concavity of the log function) and its standard deviation is consistent with the implied volatility, both properly scaled by the time to expiration.3

But given the extensive evidence that actual returns distributions are too fat-tailed to be lognormal, research with the lognormal has typically used a mixture of two or more lognormal densities with different parameters.

Yet, using the Black–Scholes equation to smooth and interpolate option values has become a common practice. Shimko (1993) was the first to propose converting option prices into implied volatilities, interpolating and smoothing the curve, typically with a cubic spline or a low-order polynomial, then converting the smoothed IVs back into price space and proceeding with the extraction of an RND from the resulting dense set of option prices. We adopt this approach below, but illustrate the potential pitfall of simply fitting a spline to the IV data: since a standard cubic spline must pass through all of the original data points, it incorporates all of the noise from the bid-ask spread and other market microstructure frictions into the RND. A more successful spline-based technique, discussed by Bliss and Panigirtzoglou (2002), uses a "smoothing" spline. This produces a much better-behaved RND by imposing a penalty function on the choppiness of the spline approximation and not requiring the curve to pass through all of the original points exactly.

Other papers achieve smooth RNDs by positing either a specific returns process (e.g., a jump diffusion) or a specific terminal distribution (e.g., a lognormal) and extracting its parameters from option prices. Nonparametric techniques (e.g., kernel regression) inherently smooth the estimated RND and achieve the same goal.

Several papers in this group, in addition to those described above, are worth mentioning. Bates (1996) used currency options prices to estimate the parameters of a jump-diffusion model for exchange rates, implying out parameters that lead to the best match between the terminal returns distribution under the model and the observed RNDs. Buchen and Kelly (1996) suggested using the principle of Maximum Entropy to establish an RND that places minimal constraints on the data. They evaluated the procedure by simulating options prices and trying to extract the correct density. Bliss and Panigirtzoglou (2002) also used simulated option prices, to compare the performance of smoothing splines versus a mixture of lognormals in extracting the correct RND when prices are perturbed by amounts that would still leave them inside the typical bid-ask spread. They concluded that the spline approach dominates a mixture of lognormals. Bu and Hadri (2007), on the other hand, also used Monte Carlo simulation in comparing the spline technique against a parametric confluent hypergeometric density and preferred the latter.

3 The implied volatility literature is voluminous. Poon and Granger (2003) provide an extensive review of this literature, from the perspective of volatility prediction.

One might summarize the results from this literature as showing that the implied risk-neutral density may be extracted from market option prices using a number of different methods, but none of them is clearly superior to the others. Noisy market option prices and sparse strikes in the available set of traded contracts are a pervasive problem that must be dealt with in any viable procedure. We will select and adapt elements of the approaches used by these researchers to extract the RND from a set of option prices, add a key wrinkle to take account of the bid-ask spread in the market, and then propose a new technique for completing the tails of the distribution.

3. Extracting the risk-neutral density from options prices, in theory

In the following, the symbols C, S, X, r, and T all have the standard meanings of option valuation: C = call price; S = time-0 price of the underlying asset; X = exercise price; r = riskless interest rate; T = option expiration date, which is also the time to expiration. P will be the price of a put option. We will also use f(x) = risk-neutral probability density function (RND) and F(x) = ∫_{−∞}^{x} f(z) dz = risk-neutral distribution function.

The value of a call option is the expected value of its payoff on the expiration date T, discounted back to the present. Under risk-neutrality, the expectation is taken with respect to the risk-neutral probabilities and discounting is at the risk-free interest rate:

C = ∫_X^∞ e^{−rT} (S_T − X) f(S_T) dS_T.    (1)

Increasing the exercise price by an amount dX changes the option value for two reasons. First, it narrows the range of stock prices S_T for which the call has a positive payoff. Second, increasing X reduces the payoff by the amount dX at every S_T for which the option is in the money. The first effect occurs only when S_T falls between X and X + dX. The maximum lost payoff there is just dX, and it enters the option value multiplied by the probability that S_T will end up in that narrow range. So, for discrete dX the impact of the first effect is very small, and it becomes infinitesimal relative to the second effect in the limit as dX goes to 0.


These two effects are seen clearly when we take the partial derivative in (1) with respect to X:

∂C/∂X = ∂/∂X [ ∫_X^∞ e^{−rT} (S_T − X) f(S_T) dS_T ]
       = e^{−rT} [ −(X − X) f(X) + ∫_X^∞ −f(S_T) dS_T ].

The first term in brackets corresponds to the effect of changing the range of S_T for which the option is in the money. This is zero in the limit, leaving

∂C/∂X = −e^{−rT} ∫_X^∞ f(S_T) dS_T = −e^{−rT} [1 − F(X)].

Solving for the risk-neutral distribution F(X) gives

F(X) = e^{rT} ∂C/∂X + 1.    (2)

In practice, an approximate solution to (2) can be obtained using finite differences of option prices observed at discrete exercise prices in the market. Let there be option prices available for maturity T at N different exercise prices, with X_1 representing the lowest exercise price and X_N the highest.

We will use three options with sequential strike prices X_{n−1}, X_n, and X_{n+1} in order to obtain an approximation to F(X) centered on X_n:4

F(X_n) ≈ e^{rT} [ (C_{n+1} − C_{n−1}) / (X_{n+1} − X_{n−1}) ] + 1.    (3)

To estimate the probability in the left tail of the risk-neutral distribution up to X_2, we approximate ∂C/∂X at X_2 by (C_3 − C_1)/(X_3 − X_1), so that F(X_2) ≈ e^{rT} (C_3 − C_1)/(X_3 − X_1) + 1. The probability in the right tail, from X_{N−1} to infinity, is approximated by

1 − ( e^{rT} (C_N − C_{N−2})/(X_N − X_{N−2}) + 1 ) = −e^{rT} (C_N − C_{N−2})/(X_N − X_{N−2}).

Taking the derivative with respect to X in (2) a second time yields the risk-neutral density function at X:

f(X) = e^{rT} ∂²C/∂X².    (4)

The density f(X_n) is approximated as

f(X_n) ≈ e^{rT} (C_{n+1} − 2C_n + C_{n−1}) / (ΔX)².    (5)

Equations (1)–(5) show how the portion of the RND lying between X_2 and X_{N−1} can be extracted from a set of call option prices. A similar derivation yields a procedure for obtaining the RND from put prices. The equivalent expressions to (2)–(5) for puts are:

F(X) = e^{rT} ∂P/∂X    (6)

F(X_n) ≈ e^{rT} [ (P_{n+1} − P_{n−1}) / (X_{n+1} − X_{n−1}) ]    (7)

f(X) = e^{rT} ∂²P/∂X²    (8)

f(X_n) ≈ e^{rT} (P_{n+1} − 2P_n + P_{n−1}) / (ΔX)².    (9)

4 In general, the differences (X_n − X_{n−1}) and (X_{n+1} − X_n) need not be equal, in which case a weighting procedure could be used to approximate F(X_n). In our methodology ΔX is a constant value, because we construct equally spaced artificial option prices to fill in values for strikes in between those traded in the market.

4. Extracting a risk-neutral density from options market prices, in practice

The approach described in the previous section assumes the existence of a set of option prices that are all fully consistent with the theoretical pricing relationship of equation (1). Implementing it with actual market prices for traded options raises several important issues and problems. First, market imperfections in observed option prices must be dealt with carefully or the resulting risk-neutral density can have unacceptable features, such as regions in which it is negative. Second, some way must be found to complete the tails of the RND beyond the range from X_2 to X_{N−1}. This section will review several approaches that have been used in the literature to obtain the middle portion of the RND from market option prices, and will describe the technique we adopt here. The next section will add the tails.

We will be estimating RNDs from the daily closing bid and ask prices for Standard and Poor's 500 Index options. S&P 500 options are particularly good for this exercise because the underlying index is widely accepted as the proxy for the US "market portfolio", the options are very actively traded on the Chicago Board Options Exchange, and they are cash-settled with European exercise style. S&P 500 options have major expirations quarterly, on the third Friday of the months of March, June, September and December. This will allow us to construct time series of RNDs applying to the value of the S&P index on each expiration date. The data set will be described in further detail below. Here we will take a single date, January 5, 2005, selected at random, to illustrate extraction of an RND in practice.

4.1. Interpolation and smoothing

The available options prices for January 5, 2005 are shown in Table 15.1. The index closed at 1,183.74 on that date, and the March options contracts expired 72 days later, on March 18, 2005. Strike prices ranged from 1,050 to 1,500 for calls, and from 500 to 1,350 for puts. Bid-ask spreads were relatively wide: 2 points for contracts trading above 20 dollars, down to a minimum of 0.50 in most cases even for the cheapest options. This amounted


Table 15.1. S&P 500 index options prices, January 5, 2005

S&P 500 Index closing level = 1,183.74        Interest rate = 2.69
Option expiration: 3/18/2005 (72 days)        Dividend yield = 1.70

                         Calls                                   Puts
Strike     Best     Best     Average   Implied      Best     Best     Average   Implied
price      bid      offer    price     volatility   bid      offer    price     volatility

  500        —        —         —        —           0.00     0.05     0.025    0.593
  550        —        —         —        —           0.00     0.05     0.025    0.530
  600        —        —         —        —           0.00     0.05     0.025    0.473
  700        —        —         —        —           0.00     0.10     0.050    0.392
  750        —        —         —        —           0.00     0.15     0.075    0.356
  800        —        —         —        —           0.10     0.20     0.150    0.331
  825        —        —         —        —           0.00     0.25     0.125    0.301
  850        —        —         —        —           0.00     0.50     0.250    0.300
  900        —        —         —        —           0.00     0.50     0.250    0.253
  925        —        —         —        —           0.20     0.70     0.450    0.248
  950        —        —         —        —           0.50     1.00     0.750    0.241
  975        —        —         —        —           0.85     1.35     1.100    0.230
  995        —        —         —        —           1.30     1.80     1.550    0.222
 1005        —        —         —        —           1.50     2.00     1.750    0.217
 1025        —        —         —        —           2.05     2.75     2.400    0.208
 1050     134.50   136.50    135.500   0.118         3.00     3.50     3.250    0.193
 1075     111.10   113.10    112.100   0.140         4.50     5.30     4.900    0.183
 1100      88.60    90.60     89.600   0.143         6.80     7.80     7.300    0.172
 1125      67.50    69.50     68.500   0.141        10.10    11.50    10.800    0.161
 1150      48.20    50.20     49.200   0.135        15.60    17.20    16.400    0.152
 1170      34.80    36.80     35.800   0.131        21.70    23.70    22.700    0.146
 1175      31.50    33.50     32.500   0.129        23.50    25.50    24.500    0.144
 1180      28.70    30.70     29.700   0.128        25.60    27.60    26.600    0.142
 1190      23.30    25.30     24.300   0.126        30.30    32.30    31.300    0.141
 1200      18.60    20.20     19.400   0.123        35.60    37.60    36.600    0.139
 1205      16.60    18.20     17.400   0.123        38.40    40.40    39.400    0.139
 1210      14.50    16.10     15.300   0.121        41.40    43.40    42.400    0.138
 1215      12.90    14.50     13.700   0.122        44.60    46.60    45.600    0.138
 1220      11.10    12.70     11.900   0.120        47.70    49.70    48.700    0.136
 1225       9.90    10.90     10.400   0.119        51.40    53.40    52.400    0.137
 1250       4.80     5.30      5.050   0.117        70.70    72.70    71.700    0.139
 1275       1.80     2.30      2.050   0.114        92.80    94.80    93.800    0.147
 1300       0.75     1.00      0.875   0.115       116.40   118.40   117.400    0.161
 1325       0.10     0.60      0.350   0.116       140.80   142.80   141.800    0.179
 1350       0.15     0.50      0.325   0.132       165.50   167.50   166.500    0.198
 1400       0.00     0.50      0.250   0.157         —        —         —        —
 1500       0.00     0.50      0.250   0.213         —        —         —        —

Source: Optionmetrics.


[Figure: probability (y-axis, −0.2 to 1.0) against the S&P 500 index level (x-axis, 800–1400); one step curve constructed from put prices and one from call prices.]

Fig. 15.1. Risk-neutral distribution from raw options prices

to spreads of more than 100% of the average price for many deep out of the money contracts. It is customary to use either transactions prices or the midpoints of the quoted bid-ask spreads as the market's option prices. Options transactions occur irregularly in time and only a handful of strikes have frequent trading, even for an actively traded contract like S&P 500 index options. Use of transactions data also requires obtaining synchronous prices for the underlying. By contrast, bids and offers are quoted continuously for all traded strikes, whether or not trades are occurring. We will begin by taking the average of bid and ask as the best available measure of the option price. We then modify the procedure to make use of the full spread in the smoothing and interpolation stage.

Equations (3) and (7) show how to estimate the probability distribution using a centered difference to compute the slope of the price curve, and hence the distribution at X_n. In Figure 15.1, we have used uncentered differences, C_n − C_{n−1} and P_n − P_{n−1}, simply for illustration, to construct probability distributions from the average call and put price quotes shown in Table 15.1. The distribution from the puts extends further to the left and the one from the calls extends further to the right, but in the middle range where they overlap, the values are quite close together. There are some discrepancies, notably around 1,250, where the cumulative call probability is 0.698 and the put probability is 0.776, but the more serious problem is around 1,225, where the fitted probability distribution from call prices is nonmonotonic.

Figure 15.2 plots the risk-neutral densities corresponding to the distribution functions displayed in Figure 15.1. These are clearly unacceptable as plausible estimates of the true density function. Both RNDs have ranges of negative values, and the extreme fluctuations in the middle portion and the sharp differences between the call and put RNDs violate our prior beliefs that the RND should be fairly smooth and that the same expectations should govern pricing of both calls and puts.

[Figure: density (y-axis, −0.005 to 0.025) against the S&P 500 index level (x-axis, 800–1400); one curve from put prices and one from call prices.]

Fig. 15.2. Risk-neutral density from raw options prices

Looking at the prices in Table 15.1, it is clear that there will be problems with out of the money puts. Except at 800, there is no bid for puts at any strike below 925, and the ask price is unchanged over multiple contiguous strikes, making the average price equal for different exercise prices. From (9), the estimated RND over these regions will be 0, implying no possibility that the S&P could end up there at expiration. A similar situation occurs for out of the money calls between 1,400 and 1,500. Moreover, the single 0.10 bid for puts at X = 800 produces an average put price higher than that for the next higher strike, which violates the static no-arbitrage condition that a put with a higher strike must be worth more than one with a lower strike. This leads to a region in which the implied risk-neutral density is negative. However, it is not obvious from the prices in Table 15.1 what produces the extreme choppiness and negative densities around the at the money index levels between 1,150 and 1,250.

Table 15.1 and Figure 15.1 show that even for this very actively traded index option, the available strikes are limited and the resulting risk-neutral distribution is a coarse step function. The problem would be distinctly worse for individual traded stock options, whose available strikes are considerably less dense than this. This suggests the use of an interpolation technique to fill in intermediate values between the traded strike prices and to smooth out the risk-neutral distribution.

Cubic spline interpolation is a very common first choice as an interpolation tool. Figure 15.3 shows the spline-interpolated intermediate option prices for our calls and puts. To the naked eye, the curves look extremely good, without obvious bumps or wiggles between the market prices, indicated by the markers. Yet these option prices produce the RNDs shown in Figure 15.4, with erratic fluctuations around the at the money stock prices, large discrepancies between the RNDs from calls and puts, and negative portions in both curves. The problem is that cubic spline interpolation generates a curve that is forced to go through every observed price, which has the effect of incorporating all of the noise due to market microstructure and other imperfections into the RND.
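The noise-amplification mechanism behind this failure is easy to demonstrate numerically. The sketch below uses a hypothetical smooth, convex price curve (a placeholder shape, not calibrated to Table 15.1) plus small bid-ask-sized pricing noise, and shows how second differencing, as in equation (5), turns noise that is invisible in the prices into negative "densities":

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical smooth, convex call-price curve plus pricing noise of roughly
# bid-ask magnitude (both are illustrative stand-ins, not market data).
strikes = np.arange(1050.0, 1355.0, 5.0)
true_prices = 40.0 * np.exp(-(strikes - 1050.0) / 60.0)
noisy_prices = true_prices + rng.normal(0.0, 0.05, size=strikes.size)

def second_diff(p, dX):
    # Second difference, as in eq. (5); the e^{rT} factor is omitted because
    # it does not affect the sign pattern.
    return (p[2:] - 2.0 * p[1:-1] + p[:-2]) / dX**2

f_true = second_diff(true_prices, 5.0)    # everywhere positive: the curve is convex
f_noisy = second_diff(noisy_prices, 5.0)  # noise / (dX)^2 swamps the far strikes
```

Price noise with standard deviation 0.05 becomes second-difference noise of roughly 0.05·√6/25 ≈ 0.005, which exceeds the true curvature at the far out of the money strikes, so the "density" from the noisy prices goes negative there even though the noisy price curve looks perfectly smooth to the eye.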

[Figure: option price (y-axis, 0–180) against the S&P 500 index level (x-axis, 500–1500); spline-interpolated call and put price curves, with markers at the market call and put prices.]

Fig. 15.3. Market option prices with cubic spline interpolation

[Figure: density (y-axis, −0.04 to 0.05) against the S&P 500 index level (x-axis, 800–1400); one curve from interpolated put prices and one from interpolated call prices.]

Fig. 15.4. Densities from option prices with cubic spline interpolation

David Shimko (1993) proposed transforming the market option prices into implied volatility (IV) space before interpolating, then retransforming the interpolated curve back into price space to compute a risk-neutral distribution. This procedure does not assume that the Black–Scholes model holds for these option prices. It simply uses the Black–Scholes equation as a computational device to transform the data into a space that is more conducive to the kind of smoothing one wishes to do.
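Shimko's round trip can be sketched as follows. This is a simplified illustration, not the chapter's fitted procedure: the call pricer is a plain European Black–Scholes formula with dividends ignored, the smile fit is the quadratic of Shimko's original paper, and the five IV inputs are hypothetical round numbers in the spirit of Table 15.1:

```python
import numpy as np
from math import log, sqrt, exp, erf

def ncdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, r, T, sigma):
    """Black-Scholes European call; dividends ignored for brevity."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * ncdf(d1) - K * exp(-r * T) * ncdf(d2)

def implied_vol(price, S, K, r, T, lo=1e-4, hi=5.0):
    # Bisection works because the BS call price is strictly increasing in sigma.
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, T, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical quotes in the spirit of Table 15.1 (not the actual data).
S, r, T = 1183.74, 0.0269, 72 / 365
strikes = np.array([1100.0, 1150.0, 1200.0, 1250.0, 1300.0])
mid_prices = np.array([bs_call(S, K, r, T, v)
                       for K, v in zip(strikes, [0.17, 0.15, 0.14, 0.14, 0.16])])

ivs = np.array([implied_vol(p, S, K, r, T) for p, K in zip(mid_prices, strikes)])
coefs = np.polyfit(strikes, ivs, 2)          # Shimko's quadratic smile fit
dense_K = np.arange(1100.0, 1300.1, 5.0)     # dense artificial strike grid
dense_C = np.array([bs_call(S, K, r, T, np.polyval(coefs, K)) for K in dense_K])
```

The dense call-price curve can then be fed into the finite differences of Section 3. Because the fitted smile is smooth, the resulting price curve is monotonically decreasing in the strike, as no-arbitrage requires.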

Consider the transformed option values represented by their BS IVs. Canonical Black–Scholes would require all of the options to have the same IV. If this constraint were imposed and the fitted IVs transformed back into prices, by construction the resulting risk-neutral density would be lognormal, and hence well-behaved. But because of the well-known volatility smile, or skew, in this market, the new prices would be systematically different from the observed market prices, especially in the left tail. Most option traders do not use the canonical form of the BS model, but instead use "practitioner Black–Scholes", in which each option is allowed to have its own distinct implied volatility. Despite the theoretical inconsistency this introduces, the empirical volatility smile/skew is quite smooth and not too badly sloped, so it works well enough.

Considerable research effort has been devoted to finding arbitrage-free theoretical models based on nonlognormal returns distributions that produce volatility smiles resembling those found empirically. Inverting those (theoretical) smiles will also lead to option prices that produce well-behaved RNDs. Of course, if market prices do not obey the alternative theoretical model because of market noise, transforming through implied volatility space will not cure the problem.

To moderate the effects of market imperfections in option prices, a smooth curve is fitted to the volatility smile/skew by least squares. Shimko used a simple quadratic function, but we prefer to allow greater flexibility with a higher order polynomial.

Applying a cubic spline to interpolate the volatility smile still produces bad results for the fitted RND. The main reason is that an n-th degree spline constructs an interpolating curve consisting of segments of n-th order polynomials joined together at a set of "knot" points. At each of those points, the two curve segments entering from the left and the right are constrained to have the same value and the same derivatives up to order n−1. Thus, a cubic spline has no discontinuities in level, slope, or second derivative, meaning there will be no breaks, kinks, or even visible changes in curvature at its knot points. But when the interpolated IV curve is translated back into option strike-price space and the RND is constructed by taking the second derivative as in (5), the discontinuous third derivative of the IV curve becomes a discontinuous first derivative (a kink) in the RND. The simple solution is just to interpolate with a fourth order spline or higher.5

The other problem with using a standard n-th degree spline as an interpolating function is that it must pass through every knot point, which forces the curve to incorporate all of the pricing noise into the RND. Also, with K knot points there will be K + n + 1 parameters to fit, which requires applying enough constraints to the curve at its end-points to allow all of the parameters to be identified with only K data points.

Previous researchers have used a "smoothing spline" that allows a tradeoff between how close the curve comes to the observed data points (it no longer goes through them exactly) and how well its shape conforms to the standard spline constraint that the derivatives of the spline curve should be smooth across the knot points. For any given problem, the researcher must choose how this tradeoff is resolved by setting the value of a smoothness parameter.6

5 As mentioned above, some researchers plot the IV smile against the option deltas rather than against the strike prices, which solves this problem automatically. Applying a cubic spline in delta-IV space produces a curve that is smooth up to second order in terms of the partial derivative of option price, which makes it smooth up to third order in the price itself, eliminating any kinks in the RND.

We depart somewhat from previous practice in this area. We have found that fitted RNDs behave very well using interpolation with just a fourth order polynomial, essentially a fourth degree spline with no knots. Additional degrees of freedom, which allow the estimated densities to take more complex shapes, can be added either by fitting higher order polynomials or by adding knots to a fourth order spline. In this exercise, we found very little difference from either of these modifications. We therefore have done all of the interpolation for our density estimation using fourth order splines with a single knot point placed at the money.
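The no-knot special case, a least-squares fourth order polynomial on the smile, can be sketched in a few lines. The strike/IV pairs below are illustrative stand-ins, not the chapter's full data set:

```python
import numpy as np

# Illustrative smile points (strike, IV), loosely shaped like a typical skew.
strikes = np.array([1050.0, 1100.0, 1150.0, 1200.0, 1250.0, 1300.0])
ivs = np.array([0.193, 0.172, 0.152, 0.139, 0.139, 0.161])

# Center and scale the strikes before fitting: raw strikes near 1200 raised
# to the fourth power would make the least-squares problem badly conditioned.
x = (strikes - strikes.mean()) / 100.0
coefs = np.polyfit(x, ivs, 4)                      # least-squares 4th order fit

dense_K = np.arange(1050.0, 1300.5, 1.0)           # dense grid between strikes
smooth_iv = np.polyval(coefs, (dense_K - strikes.mean()) / 100.0)
```

Unlike an interpolating spline, the polynomial is not forced through every quote, so small quote noise is averaged out rather than differentiated into the RND; adding a knot at the money, as in the text, simply grants one extra degree of freedom on each side.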

Looking again at Table 15.1, we see that many of the bid and ask quotes are for options that are either very deep in the money or very deep out of the money. For the former, the effect of optionality is quite limited, such that the IV might range from 12.9% to 14.0% within the bid-ask spread. For the lowest strike call, there is no IV at the bid price, because it is below the no-arbitrage minimum call price. The IV at the ask is 15.6%, whereas the IV at the midpoint, which is what goes into the calculations, is 11.8%. In addition to the wide bid-ask spreads, there is little or no trading in deep in the money contracts. On this day, no 1,050 or 1,075 strike calls were traded at all, and only three 1,150 strike calls changed hands. Most of the trading is in at the money or out of the money contracts.

But out of the money contracts present their own data problems, because of extremely wide bid-ask spreads relative to their prices. The 925 strike put, for example, would have an IV of 22.3% at its bid price of 0.20 and 26.2% at the ask price of 0.70. Setting the IV for this option at 24.8% based on the mid-price of 0.45 is clearly rather arbitrary. One reason the spread is so wide is that there is very little trading of deep out of the money contracts. On this date, the only trades in puts with strikes of 925 or below were five contracts at a strike of 850, for a total option premium of no more than a couple hundred dollars. It is obvious that the quality of information about the risk-neutral density that can be extracted from posted quotes on options that do not trade in the market may be quite limited.

These observations suggest that it is desirable to limit the range of option strikes brought into the estimation process, eliminating those that are too deep in or out of the money. Also, as most trading is in at the money and somewhat out of the money contracts, we can broaden the range of usable data if we combine calls and puts. The CBOE does this in its calculation of the VIX index, for example, combining calls and puts but using only out of the money contracts.

To incorporate these ideas into our methodology, we first discard all options whose bid prices are less than 0.50. On this date, this eliminates calls with strikes of 1,325 and above, and puts with strikes of 925 and below. Next we want to combine calls and puts, using the out of the money contracts for each. But from Table 15.1, with the current index level at 1,183.74, if we simply use puts with strikes up to 1,180 and calls with strikes from 1,190 to 1,300, there will be a jump from the put IV of 14.2% to the call IV of 12.6% at the break point. To smooth out the effect of this jump at the transition point, we blend the call and put IVs in the region around the at the money index level. We have chosen a range of 20 points on either side of the current index value S_0 in which the IV is set to a weighted average of the IVs from the calls and the puts.7

6 The procedure imposes a penalty function on the integral of the second derivative of the spline curve to make the fitted curve smoother. The standard smoothing spline technique still uses a knot at every data point, so it requires constraints to be imposed at the end-points. See Bliss and Panigirtzoglou (2002), Appendix A, for further information about this approach.

[Figure: implied volatility (y-axis, 0.00–0.70) against the S&P 500 index level (x-axis, 500–1500); the interpolated curve from a fourth degree polynomial on the combined IVs, with markers for the traded call IVs and traded put IVs.]

Fig. 15.5. Implied volatilities from all calls and puts, minimum bid price 0.50, fourth degree spline interpolation (1 knot)

Let X_low be the lowest traded strike such that (S_0 − 20) ≤ X_low, and X_high be the highest traded strike such that X_high ≤ (S_0 + 20). For traded strikes between X_low and X_high we use a value blended between IV_put(X) and IV_call(X), computed as

IV_blend(X) = w IV_put(X) + (1 − w) IV_call(X),    (10)

where

w = (X_high − X) / (X_high − X_low).

In this case, we take put IVs for strikes up to 1,150, blended IVs for strikes from 1,170 to 1,200, and call IVs for strikes from 1,205 up. Figure 15.5 plots the raw IVs from the traded options with markers, together with the interpolated IV curve computed from calls and puts whose bid prices are at least 0.50, as just described.

7 The choice of a 40 point range over which to blend the put and call IVs is arbitrary, but we believe that the specific choice has little impact on the overall performance of the methodology. On January 5, 2005, the discrepancy between the two IVs is about 0.015 in this range, which becomes distributed over the 40 point range of strikes at the rate of about 0.0004 per point. The effect on the fitted RND will be almost entirely concentrated around the midpoint, and it will be considerably smoother than if no adjustment were made and the IV simply jumped from the put value to the call value at the at the money strike. A reasonable criterion in setting the range for IV blending would be to limit it to the area before the IVs from the two sets begin to diverge, which Figure 15.5 shows happens when one of them gets far enough out of the money.
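Equation (10) amounts to a short helper function; the function name and the constant-IV inputs in the usage note are our own illustration, not the chapter's code:

```python
def blended_iv(X, iv_put, iv_call, X_low, X_high):
    """Eq. (10): put IVs below the blend band, call IVs above it, and a
    linear put/call blend for strikes between X_low and X_high.
    iv_put and iv_call are callables returning the IV at a given strike."""
    if X < X_low:
        return iv_put(X)
    if X > X_high:
        return iv_call(X)
    w = (X_high - X) / (X_high - X_low)   # 1 at X_low, 0 at X_high
    return w * iv_put(X) + (1.0 - w) * iv_call(X)
```

At X_low the weight is 1 (pure put IV), at X_high it is 0 (pure call IV), and the IV jump documented above is spread linearly over the band instead of appearing all at once at the break point.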


This procedure produces an implied risk-neutral density with a very satisfying shape, based on the prior expectation that the RND should be smooth. Even so, there might be some concern that we have smoothed out too much. We have no reason to rule out minor bumps in the RND, which could arise when an important dichotomous future event is anticipated, such as the possibility of a cut in the Federal Reserve's target interest rate, or alternatively, when there are distinct groups in the investor population with sharply divergent expectations.

We have explored increasing flexibility by fitting fourth order splines using three knots, with one at the midpoint and the others 20 points above and below that price. The choice of how many knots to use and where to place them allows considerable latitude for the user. But we will see shortly that, at least in the present case, it makes very little difference to the results.

4.2. Incorporating market bid-ask spreads

The spline is fitted to the IV observations from the market by least squares. This applies equal weights to the squared deviation between the spline curve and the market IV evaluated at the midpoint of the bid-ask spread at all data points, regardless of whether the spline falls inside or outside the quoted spread. Given the width of the spreads, it would make sense to be more concerned about cases where the spline falls outside the quoted spread than about those remaining within it.

To take account of the bid-ask spread, we apply a weighting function that increases the weight on deviations falling outside the quoted spread relative to those that remain within it. We adapt the cumulative normal distribution function to construct a weighting function that assigns weights between 0 and 1 as a function of a single parameter σ:

w(IV) = N[IV − IVAsk, σ]   if IVMidpoint ≤ IV
        N[IVBid − IV, σ]   if IV ≤ IVMidpoint        (11)

Figure 15.6 plots an example of this weighting function for three values of σ. Implied volatility is on the x axis, with the vertical solid lines indicating a given option's IV values at the market's bid, ask, and midprice, 0.1249, 0.1331, and 0.1290, respectively. These values are obtained by applying the spline interpolation described above separately to the three sets of IVs, from the bid prices, the ask prices, and the midprices at each traded strike level. In the middle range where call and put IVs are blended according to equation (10), the bid and ask IV curves from calls and puts are blended in the same way before the interpolation step.

Setting σ to a very high value like 100 assigns (almost) equal weights of 0.5 to all squared deviations between the IV at the midpoint and the fitted spline curve at every strike. This is the standard approach that does not take account of the bid-ask spread. With σ = 0.005, all deviations are penalized, but those falling well outside the quoted spread are weighted about three times more heavily than those close to the midprice IV. Setting σ = 0.001 puts very little weight on deviations that are within the spread and close to the midprice IV, while assigning full weight to nearly all deviations falling outside the spread. This is our preferred weighting pattern for making use of the information contained in the quoted spread in the market.
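As a concrete sketch of equation (11): assuming N[x, σ] denotes the mean-zero normal CDF with standard deviation σ (consistent with the weighting behavior described above), the scheme can be written as follows. The function name and the example IV values from Figure 15.6 are used purely for illustration.

```python
import math

def spread_weight(iv, iv_bid, iv_ask, iv_mid, sigma=0.001):
    """Weight on a squared IV deviation, per equation (11).
    N[x, sigma] is taken to be the mean-zero normal CDF with std dev sigma."""
    def ncdf(x, s):
        return 0.5 * (1.0 + math.erf(x / (s * math.sqrt(2.0))))
    if iv >= iv_mid:
        return ncdf(iv - iv_ask, sigma)   # penalize distance beyond the ask IV
    return ncdf(iv_bid - iv, sigma)       # penalize distance below the bid IV

# Example values from Figure 15.6: IVs at the bid, ask, and midprice
bid, ask, mid = 0.1249, 0.1331, 0.1290
print(round(spread_weight(mid, bid, ask, mid), 4))         # near 0: inside the spread
print(round(spread_weight(0.140, bid, ask, mid), 4))       # near 1: well outside the ask
print(round(spread_weight(mid, bid, ask, mid, 100.0), 4))  # near 0.5: equal weighting
```

With σ = 0.001 the weight behaves like a soft indicator for "outside the quoted spread", which is the preferred pattern described above.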


[Figure: weight on each squared deviation (y axis, 0 to 1.2) plotted against implied volatility (x axis, 0.115 to 0.145) for equal weights (sigma = 100), increased penalty outside the bid-ask spread (sigma = 0.005), and very small penalty inside the bid-ask spread (sigma = 0.001), with vertical lines marking the IV at the bid, midprice, and ask.]

Fig. 15.6. Alternative weighting of squared deviations within and outside the bid-ask spread

Figure 15.7 illustrates the effect of changing the degree of the polynomial, the number of knot points, and the bid-ask weighting parameter used in the interpolation step. Lines in gray show densities constructed by fitting polynomials of degree 4, 6, and 8, with no knots and equal weighting of all squared deviations. The basic shape of the three curves is close, but higher order polynomials allow greater flexibility in the RND. This allows it to fit more complex densities, but also increases the impact of market noise. Consider the left end of the density. The missing portion of the left tail must be attached below 950, but it is far from clear how it should look if it is to match the density obtained either from the eighth degree polynomial, which slopes sharply downward at that level, or from the sixth degree polynomial, which has a more reasonable slope at that point but whose estimated density there is negative.

By contrast, the fourth order polynomial and all three of the spline functions produce very reasonably shaped RNDs that are so close together that they cannot be distinguished in the graph. Although these plots are for a single date, I have found similar results on nearly every date for which this comparison was done, which supports the choice of a fourth order spline with a single knot and with a very small relative weight on deviations that fall within the bid-ask spread in order to extract risk-neutral densities from S&P 500 index options.

4.3. Summary

The following steps summarize our procedure for extracting a well-behaved risk-neutral density from market prices for S&P 500 index options, over the range spanned by the available option strike prices.


[Figure: densities (y axis, −0.001 to 0.009) plotted against the S&P 500 index level (800–1400) for polynomials of degree 4, 6, and 8, and for fourth order splines with 1 knot (sigma = .001), 1 knot (sigma = .005), and 3 knots (sigma = .001).]

Fig. 15.7. Densities constructed using alternative interpolation methods

1. Begin with bid and ask quotes for calls and puts with a given expiration date.

2. Discard quotes for very deep out of the money options. We required a minimum bid price of $0.50 for this study.

3. Combine calls and puts to use only the out of the money and at the money contracts, which are the most liquid.

4. Convert the option bid, ask and midprices into implied volatilities using the Black–Scholes equation. To create a smooth transition from put to call IVs, take weighted averages of the bid, ask and midprice IVs from puts and calls in a region around the current at the money level, using equation (10).

5. Fit a spline function of at least fourth order to the midprice implied volatilities by minimizing the weighted sum of squared differences between the spline curve and the midprice IVs. The weighting function shown in equation (11) downweights deviations that lie within the market's quoted bid-ask spread relative to those falling outside it. The number of knots should be kept small, and their optimal placement may depend on the particular data set under consideration. In this study we used a fourth order spline with a single knot at the money.

6. Compute a dense set of interpolated IVs from the fitted spline curve and then convert them back into option prices.

7. Apply the procedure described in Section 3 to the resulting set of option prices in order to approximate the middle portion of the RND.

8. These steps produce an empirical RND over the range between the lowest and highest strike price with usable data. The final step is to extend the density into the tails.
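Steps 5–7 above can be sketched in miniature. The code below stands in for the fitted spline with a hypothetical smooth IV curve, converts a dense grid of IVs back into call prices, and recovers the middle portion of the density from the discounted second difference of call prices in the strike (the Breeden–Litzenberger relation, one standard reading of the "procedure described in Section 3"). All numerical inputs are illustrative assumptions, not the chapter's data, and dividends are ignored for simplicity.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, T, r, vol):
    """Black-Scholes call price (no dividends, for illustration)."""
    d1 = (math.log(S / K) + (r + 0.5 * vol * vol) * T) / (vol * math.sqrt(T))
    d2 = d1 - vol * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def smile_iv(K, S0=1183.0):
    """Hypothetical smooth IV curve standing in for the fitted spline (step 5)."""
    return 0.13 + 0.15 * (K / S0 - 1.0) ** 2

S0, T, r = 1183.0, 0.2, 0.025

# Step 6: dense grid of interpolated IVs converted back to call prices
dK = 1.0
strikes = [950.0 + i * dK for i in range(351)]   # 950 .. 1300
calls = [bs_call(S0, K, T, r, smile_iv(K)) for K in strikes]

# Step 7: RND as the discounted second difference in the strike
rnd = [math.exp(r * T) * (calls[i - 1] - 2.0 * calls[i] + calls[i + 1]) / dK ** 2
       for i in range(1, len(strikes) - 1)]

mass = sum(rnd) * dK
print(min(rnd) > 0.0, round(mass, 3))   # positive density; total mass a bit below 1
```

The total probability falls short of one because, exactly as step 8 notes, the strike range truncates the tails; that missing mass is what the GEV tails of Section 5 supply.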


5. Adding tails to the risk-neutral density

The range of strike prices {X1, X2, . . . , XN} for which usable option prices are available from the market or can be constructed by interpolation does not extend very far into the tails of the distribution. The problem is further complicated by the fact that what we are trying to approximate is the market's aggregation of the individual risk-neutralized subjective probability beliefs in the investor population. The resulting density function need not obey any particular probability law, nor is it even a transformation of the true (but unobservable) distribution of realized returns on the underlying asset.

We propose to extend the empirical RND by grafting onto it tails drawn from a suitable parametric probability distribution in such a way as to match the shape of the estimated RND over the portion of the tail region for which it is available. The first question is which parametric probability distribution to use. Some of the earlier approaches to this problem implicitly assume a distribution. For example, the Black–Scholes implied volatility function can be extended by setting IV(X) = IV(X1) for all X < X1 and IV(X) = IV(XN) for all X > XN, where IV(·) is the implied volatility from the Black–Scholes model.8 This forces the tails to be lognormal. Bliss and Panigirtzoglou (2004) do something similar by employing a smoothing spline for the middle portion of the distribution but constraining it to become linear outside the range of the available strikes. Given the extensive empirical evidence of fat tails in returns distributions, constraining the tails of the RND to be lognormal is unlikely to be satisfactory in practice if one is concerned about modeling tail events accurately.

Fortunately, just as the Central Limit Theorem makes the normal a natural choice for modeling the distribution of the sample average from an unknown distribution, the Extreme Value distribution is a natural candidate for modeling the tails of an unknown distribution. The Fisher–Tippett Theorem proves that under weak regularity conditions the largest value in a sample drawn from an unknown distribution will converge in distribution to one of three types of probability laws, all of which belong to the generalized extreme value (GEV) family.9 We will therefore use the GEV distribution to construct tails for the RND.

The standard generalized extreme value distribution has a single parameter ξ, which determines the tail shape. Its distribution function is

F(z) = exp[−(1 + ξz)^(−1/ξ)]. (12)

The value of ξ determines whether the tail comes from the Fréchet distribution with fat tails relative to the normal (ξ > 0), the Gumbel distribution with tails like the normal (ξ = 0), or the Weibull distribution (ξ < 0) with finite tails that do not extend out to infinity.
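A minimal implementation of equation (12) makes the three tail types concrete; the boundary handling for 1 + ξz ≤ 0 follows from the support of the GEV, and the ξ = 0 case is the Gumbel limit.

```python
import math

def gev_cdf(z, xi):
    """Standard GEV distribution function, equation (12).
    Support requires 1 + xi*z > 0; xi = 0 is the Gumbel limit exp(-exp(-z))."""
    if xi == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0.0:
        # below the lower endpoint (xi > 0) or above the upper endpoint (xi < 0)
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

# xi < 0 (Weibull type): the distribution ends at the finite point z = -1/xi
xi = -0.139                                 # a right-tail estimate reported later in this section
print(gev_cdf(-1.0 / xi, xi))               # 1.0: all mass lies below the finite endpoint
# xi > 0 (Frechet type): fatter upper tail than the Gumbel
print(gev_cdf(10.0, 0.25) < gev_cdf(10.0, 0.0))   # True: more mass beyond z = 10
```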

8 See, for example, Jiang and Tian (2005).

9 Specifically, let x1, x2, . . . be an i.i.d. sequence of draws from some distribution F and let Mn denote the maximum of the first n observations. If we can find sequences of real numbers an and bn such that the sequence of normalized maxima (Mn − bn)/an converges in distribution to some nondegenerate distribution H(x), i.e., P((Mn − bn)/an ≤ x) → H(x) as n → ∞, then H is a GEV distribution. The class of distribution functions that satisfy this condition is very broad, including all of those commonly used in finance. See Embrechts et al. (1997) or McNeil et al. (2005) for further detail.


Two other parameters, μ and σ, can be introduced to set the location and scale of the distribution, by defining

z = (ST − μ)/σ. (13)

Thus we have three GEV parameters to set, which allows us to impose three conditions on the tail. We will use the expressions FEVL(·) and FEVR(·) to denote the approximating GEV distributions for the left and right tails, respectively, with fEVL(·) and fEVR(·) as the corresponding density functions, and the same notation without the L and R subscripts when referring to both tails without distinction. FEMP(·) and fEMP(·) will denote the estimated empirical risk-neutral distribution and density functions.

Let X(α) denote the exercise price corresponding to the α-quantile of the risk-neutral distribution. That is, FEMP(X(α)) = α. We first choose the value of α at which the GEV tail is to begin, and then a second, more extreme point on the tail that will be used in matching the GEV tail shape to that of the empirical RND. These values will be denoted α0R and α1R, respectively, for the right tail, and α0L and α1L for the left.

The choice of α0 and α1 values is flexible, subject to the constraint that we must be able to compute the empirical RND at both points, which requires X2 ≤ X(α1L) and X(α1R) ≤ XN−1. However, the GEV will fit the more extreme tail of an arbitrary distribution better than the near tail, so there is a tradeoff between data availability and quality, which would favor less extreme values for α0 and α1, versus tail fit, which would favor more extreme values.

Consider first fitting a GEV upper tail for the RND. The first condition to be imposed is that the total probability in the tail must be the same for the RND and the GEV approximation. We also want the GEV density to have the same shape as the RND in the area of the tail where the two overlap, so we use the other two degrees of freedom to set the two densities equal at α0R and α1R.

The three conditions for the right tail are shown in equations (14a–c):

FEVR(X(α0R)) = α0R, (14a)

fEVR(X(α0R)) = fEMP(X(α0R)), (14b)

fEVR(X(α1R)) = fEMP(X(α1R)). (14c)

The GEV parameter values that will cause these conditions to be satisfied can be found easily using standard optimization procedures.
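As a sketch of what such an optimization works with, the functions below evaluate the GEV distribution and density with location and scale, combining equations (12) and (13); a root-finder would search over (μ, σ, ξ) until conditions (14a–c) hold. Evaluated at the right-tail estimates for January 5, 2005 reported later in this section, they approximately reproduce the connection-point probabilities (approximately, because the quantiles are placed on a discrete strike grid).

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function with location mu and scale sigma (eqs. 12-13)."""
    z = (x - mu) / sigma
    t = 1.0 + xi * z
    if t <= 0.0:
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

def gev_pdf(x, mu, sigma, xi):
    """Matching GEV density, as needed for conditions (14b) and (14c)."""
    z = (x - mu) / sigma
    t = 1.0 + xi * z
    if t <= 0.0:
        return 0.0
    return (t ** (-1.0 - 1.0 / xi)) * math.exp(-t ** (-1.0 / xi)) / sigma

# Right-tail estimates for January 5, 2005 reported in the text
mu, sigma, xi = 1195.04, 36.18, -0.139
print(round(gev_cdf(1271.50, mu, sigma, xi), 3))  # near 0.92 at the alpha_0R connection point
print(round(gev_cdf(1283.50, mu, sigma, xi), 3))  # near 0.95 at the alpha_1R connection point
```

Conditions (14b) and (14c) would additionally pin gev_pdf at the two connection points to the empirical density values, which are not reproduced here.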

Fitting the left tail of the RND is slightly more complicated than the right tail. As the GEV is the distribution of the maximum in a sample, its left tail relates to probabilities of small values of the maximum, rather than to extreme values of the sample minimum, i.e., the left tail. To adapt the GEV to fitting the left tail, we must reverse it left to right, by defining it on −z. That is, z values must be computed from (15) in place of (13):

z = ((−μL) − ST)/σ (15)


where μL is the (positive) value of the location parameter for the left tail GEV. (The optimization algorithm will return the location parameter μ ≡ −μL as a negative number.)10

The optimization conditions for the left tail become

FEVL(−X(α0L)) = 1 − α0L, (16a)

fEVL(−X(α0L)) = fEMP(X(α0L)), (16b)

fEVL(−X(α1L)) = fEMP(X(α1L)). (16c)

Our initial preference was to connect the left and right tails at α0 values of 5% and 95%, respectively. However, for the S&P 500 index options in the sample that will be analyzed below, market prices for options with the relevant exercise prices were not always available for the left tail and rarely were for the right tail. We have therefore chosen default values of α0L = 0.05 and α0R = 0.92, with α1L = 0.02 and α1R = 0.95 as the more remote connection points. In cases where data were not available for these α values, we set α1L = FEMP(X2), the lowest connection point available from the data, and α0L = α1L + 0.03. For the right tail, α1R = FEMP(XN−1), and α0R = α1R − 0.03.

On January 5, 2005, the 5% and 2% quantiles of the empirical RND fell at 1,044.00 and 985.50, respectively, and the 92% and 95% right-tail quantiles were 1,271.50 and 1,283.50, respectively.11 The fitted GEV parameters that satisfied equations (14) and (16) were as follows:

Left tail: μ = 1274.60, σ = 91.03, ξ = −0.112;

Right tail: μ = 1195.04, σ = 36.18, ξ = −0.139.

Figure 15.8 plots three curves: the middle portion of the empirical RND extracted from the truncated set of options prices with interpolation using a fourth degree spline with one knot at the money and bid-ask weighting parameter σ = 0.001, as shown in Figure 15.7, and the two GEV distributions whose tails have been matched to the RND at the four connection points. As the figure illustrates, the GEV tails match the empirical RND very closely in the region of the 5% and 92% connection points. Figure 15.9 shows the resulting completed RND with GEV tails.

10 The procedure as described works well for fitting tails to a RND that is defined on positive X values only, as it is when X refers to an asset price ST, or a simple gross return ST/S0. Fitting a RND in terms of log returns, however, raises the problem that it may not be possible to fit a good approximating GEV function on the same support as the empirical RND. This difficulty can be dealt with by simply adding a large positive constant to every X value to shift the empirical RND to the right for fitting the tails, and then subtracting it out afterwards, to move the completed RND back to the right spot on the x axis.

11 With finite stock price increments in the interpolation, these quantiles will not fall exactly on any Xn. We therefore choose n at the left-tail connection points such that Xn−1 ≤ X(α) < Xn and set the actual quantiles α0L and α1L equal to the appropriate actual values of the empirical risk-neutral distribution and density at Xn. Similarly, the right connection points are set such that Xn−1 < X(α) ≤ Xn.


[Figure: the empirical RND plotted against the S&P 500 index level (800–1400) together with the left tail and right tail GEV functions, with connection points marked at the 2%, 5%, 92%, and 95% quantiles.]

Fig. 15.8. Risk-neutral density and fitted GEV tail functions

[Figure: the empirical RND with the left and right GEV tails appended, plotted against the S&P 500 index level (800–1400).]

Fig. 15.9. Full estimated risk-neutral density function for January 5, 2005

6. Estimating the risk-neutral density for the S&P 500 from S&P 500 index options

We applied the methodology described above to fit risk-neutral densities for the Standard and Poor's 500 stock index using S&P 500 index call and put options over the period January 4, 1996–February 20, 2008. In this section we will present interesting preliminary


results on some important issues, obtained from analyzing these densities. The purpose is to illustrate the potential of this approach to generate valuable insights about how investors' information and risk preferences are incorporated in market prices. The issues we consider are complex and we will not attempt to provide in-depth analysis of them in this chapter. Rather, we offer a small set of what we hope are tantalizing "broad brush" results that suggest directions in which further research along these lines is warranted. Specifically, we first examine the moments of the fitted RNDs and compare them to the lognormal densities assumed in the Black–Scholes model. We then look at how the RND behaves dynamically, as the level of the underlying index changes.

6.1. Data sample

Closing bid and ask option price data were obtained from Optionmetrics through the WRDS system. The RND for a given expiration date is extracted from the set of traded options with that maturity, and each day's option prices provide an updated RND estimate for the same expiration date. We focus on the quarterly maturities with expirations in March, June, September, and December, which are the months with the most active trading interest.12 The data sample includes option prices for 49 contract maturities and 2,761 trading days.

We construct RNDs, updated daily, for each quarterly expiration, beginning immediately after the previous contract expires and ending when the contract has less than two weeks remaining to maturity. Very short maturity contracts were eliminated because we found that their RNDs are often badly behaved. This may be partly due to price effects from trading strategies related to contract expiration and rollover of hedge positions into later expirations. Also, the range of strikes for which there is active trading interest in the market gets much narrower as expiration approaches.

We computed Black–Scholes IVs using the closing bid and ask prices reported by Optionmetrics. Optionmetrics was also the source for the riskless rate and dividend yield data, which are also needed in calculating forward values for the index on the option maturity dates.13

Table 15.2 provides summary information on the data sample and the estimated tail parameters. During this period, the S&P index ranged from a low of just under 600 to a high of 1,565.20, averaging 1,140.60. Contract maturities ranged from slightly over three months down to 14 days, with an average value of about 54 days.

The number of market option prices available varied from day to day, and some of those for which prices were reported were excluded because implied volatilities could not be computed (typically because the option price violated a no-arbitrage bound). The numbers of usable calls and puts averaged about 46 and 42 each day, respectively. We eliminated those with bid prices in the market less than $0.50. The excluded deep out of the money contracts are quite illiquid and, as Table 15.1 shows, their bid-ask spreads

12 The CBOE lists contracts with maturities in the next three calendar months plus three more distant months from the March–June–September–December cycle, meaning that off-month contracts such as April and May are only introduced when the time to maturity is less than three months.

13 Optionmetrics interpolates US dollar LIBOR to match option maturity and converts it into a continuously compounded rate. The projected dividends on the index are also converted to a continuous annual rate. See the Optionmetrics Manual (2003) for detailed explanations of how Optionmetrics handles the data.


Table 15.2. Summary statistics on fitted S&P 500 risk-neutral densities, January 4, 1996–February 20, 2008

                             Average    Standard    Minimum    Maximum
                                        deviation
S&P index                    1140.60    234.75      598.48     1565.20
Days to expiration           54.2       23.6        14         94
Number of option prices
  # calls available          46.2       17.6        8          135
  # calls used               37.6       15.4        7          107
  IVs for calls used         0.262      0.180       0.061      3.101
  # puts available           41.9       15.0        6          131
  # puts used                32.9       12.4        6          114
  IVs for puts used          0.238      0.100       0.062      1.339
Left tail
  α0L connection point       0.8672     0.0546      0.6429     0.9678
  ξ                          0.0471     0.1864      −0.8941    0.9620
  μ                          1.0611     0.0969      0.9504     2.9588
  σ                          0.0735     0.0920      0.0020     2.2430
Right tail
  α0R connection point       1.0900     0.0370      1.0211     1.2330
  ξ                          −0.1800    0.0707      −0.7248    0.0656
  μ                          1.0089     0.0085      0.8835     1.0596
  σ                          0.0416     0.0175      0.0114     0.2128

Tail parameters refer to the risk-neutral density expressed in terms of gross returns, ST/S0. "# calls (puts) available" is the number for which it was possible to compute implied volatilities. "# calls (puts) used" is the subset of those available that had bid prices of $0.50 and above.

are very wide relative to the option price. On average about 38 calls and 33 puts were used to fit a given day's RND, with a minimum of six puts and seven calls. Their implied volatilities averaged around 25%, but covered a very wide range of values.

The tail parameters reported in the table relate to risk-neutral densities estimated on gross returns, defined as ST/S0, where S0 is the current index level and ST is the index on the contract's expiration date. This rescaling makes it possible to combine RNDs from different expirations so that their tail properties can be compared. Under Black–Scholes assumptions, these simple returns should have a lognormal distribution.

For the left tail, if sufficient option price data are available, the connection point is set at the index level where the empirical RND has cumulative probability of α0L = 5%. This averaged 0.8672, i.e., a put option with that strike was about 13% out of the money. The mean value of the fitted left-tail shape parameter ξ was 0.0471, which makes the left-tail shape close to the normal on average, but with a fairly large standard deviation. Note that this does not mean the RND is not fat-tailed relative to the normal, as we will see when we look at its excess kurtosis in Table 15.3, only that the extreme left tail of the RND defined on simple returns is not fat on average. Indeed, as the RND defined on gross returns is bounded below by 0, the true left tail must be thin-tailed relative to the normal, asymptotically.


Table 15.3. Summary statistics on the risk-neutral density for returns on the S&P 500, January 4, 1996–February 20, 2008

                                         Mean      Std Dev              Quantile
                                                             0.10     0.25     0.50     0.75     0.90
Expected return to expiration            0.61%     0.41%     0.13%    0.25%    0.52%    0.92%    1.22%
Expected return annualized               4.05%     1.89%     1.08%    2.03%    4.88%    5.46%    5.93%
Excess return relative to the
  riskless rate, annualized              −0.21%    0.43%     −0.57%   −0.30%   −0.16%   −0.04%   0.10%
Standard deviation                       7.55%     2.86%     4.13%    5.49%    7.22%    9.34%    11.40%
Standard deviation annualized            20.10%    5.82%     12.80%   15.56%   19.67%   23.79%   27.57%
Skewness                                 −1.388    0.630     −2.165   −1.651   −1.291   −0.955   −0.730
Excess kurtosis                          6.000     6.830     1.131    2.082    3.806    7.221    13.449
Skewness of RND on log returns           −2.353    1.289     −3.940   −2.834   −2.020   −1.508   −1.202
Excess kurtosis of RND on log returns    20.516    28.677    2.929    4.861    10.515   23.872   49.300

The table summarizes properties of the risk-neutral densities fitted to market S&P 500 Index option prices, with GEV tails appended, as described in the text. The period covers 2,761 days from 49 quarterly options expirations, with between 14 and 94 days to expiration. The RNDs are fitted in terms of gross return, ST/S0. Excess return relative to the riskless rate is the mean return, including dividends, under the RND minus LIBOR interpolated to match the time to expiration. Excess kurtosis is the kurtosis of the distribution minus 3.0. Skewness and excess kurtosis of RND on log returns are those moments from the fitted RNDs transformed to log returns, defined as log(ST/S0).


The right connection point averaged 1.0900, i.e., where a call was 9% out of the money. The tail shape parameter ξ was negative for the right tail, implying a short-tailed distribution with a density that hits zero at a finite value. This result was very strong: although the fitted values for ξ varied over a fairly wide range, with a standard deviation of 0.0707, only 1 out of 2,761 ξ estimates for the right tail was positive. Comparing the σ estimates for the left and right tails, we see that the typical GEV approximations generally resemble those shown for January 5, 2005 in Figures 15.8 and 15.9, with the left tail coming from a substantially wider distribution than the right tail.

6.2. Moments of the risk-neutral density

Table 15.3 displays summary statistics on the moments of the fitted S&P 500 risk-neutral densities. The table, showing the mean, standard deviation, and several quantiles of the distribution of the first four moments of the fitted densities within the set of 2,761 estimated RNDs, provides a number of interesting results.

The mean risk-neutralized expected return including dividends was 0.61%, over time horizons varying from three months down to two weeks. At annualized rates, this was 4.05%, but with a standard deviation of 1.89%. The quantile results indicate that the range of expected returns was fairly wide. Perhaps more important is how the return option traders expected to earn compared to the riskless rate. Under risk-neutrality, the expected return on any security, including the stock market portfolio, should be equal to the riskless interest rate, but the third row of Table 15.3 shows that on average, option traders expected a risk-neutralized return 21 basis points below the riskless rate (using LIBOR as the proxy for that rate). The discrepancy was distributed over a fairly narrow range, however, with more than 95% of the values between −1% and +1%.

Skewness of the RND defined over returns was strongly negative. In fact, the skewness of the RND was negative on every single day in the sample. Under Black–Scholes assumptions, the distribution of gross returns is lognormal and risk-neutralization simply shifts the density to the left so that its mean becomes the riskless rate. The skewness result in Table 15.3 strongly rejects the hypothesis that the risk-neutral density is consistent with the standard model. Kurtosis was well over 3.0, indicating the RNDs were fat-tailed relative to the normal, although the nonzero skewness makes this result difficult to interpret clearly.

To explore these results a little further, we converted the RNDs defined on terminal index levels to RNDs for log returns, defined as r = log(ST/S0).14 This would yield a normal distribution if returns were lognormal. The results for skewness and excess kurtosis are shown in Table 15.3 for comparison, and they confirm what we have seen for gross returns. The RND defined on log returns is even more strongly left-skewed and

14 Let x be a continuous r.v. with density fX(·). Let y = g(x) be a one-to-one transformation of x such that the derivative of x = g−1(y) with respect to y is continuous. Then Y = g(X) is a continuous r.v. with density

fY(y) = |(d/dy) g−1(y)| fX(g−1(y)).

In our case, r = g(S) = ln(S/S0). Therefore, RNDr(r) = S × RNDS(S), where r = ln(S/S0).


excess kurtosis is increased. The RND was fat-tailed relative to the normal on every single day in the sample.
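The change of variables in footnote 14 can be checked numerically. The sketch below uses an illustrative lognormal price density (an assumption standing in for the fitted empirical RND, with hypothetical parameter values) and confirms that the transformed log-return density RNDr(r) = S × RNDS(S) still integrates to one.

```python
import math

def rnd_S(S, S0=1183.0, vol=0.13, T=0.2, rf=0.025):
    """Illustrative lognormal price density (a Black-Scholes RND);
    an assumption standing in for the fitted empirical RND."""
    m = math.log(S0) + (rf - 0.5 * vol * vol) * T   # mean of ln(ST)
    s = vol * math.sqrt(T)                           # std dev of ln(ST)
    return math.exp(-((math.log(S) - m) ** 2) / (2.0 * s * s)) \
        / (S * s * math.sqrt(2.0 * math.pi))

def rnd_r(r, S0=1183.0):
    """Density of the log return r = ln(S/S0), via footnote 14:
    RND_r(r) = S * RND_S(S)."""
    S = S0 * math.exp(r)
    return S * rnd_S(S)

# The transformed density must still carry total probability one
dr = 0.001
grid = [-0.6 + i * dr for i in range(1201)]
total = sum(rnd_r(x) for x in grid) * dr
print(round(total, 3))   # close to 1.0
```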

6.3. The dynamic behavior of the S&P 500 risk-neutral density

How does the RND behave when the underlying Standard and Poor's 500 index moves? The annualized excess return shown in Table 15.3 implies that the mean of the RND (in price space) is approximately equal to the forward price of the index on average. A reasonable null hypothesis would therefore be that if the forward index value changes by some amount ΔF, the whole RND will shift left or right by that amount at every point. This does not appear to be true.

Table 15.4 reports the results of regressing the changes in quantiles of the RND on the change in the forward value of the index. Regression (17) was run for each of 11 quantiles, Qj, j = 1, . . . , 11, of the risk-neutral densities:

ΔQj(t) = a + bΔF(t). (17)

Under the null hypothesis that the whole density shifts up or down by ΔF(t) as the index changes, the coefficient b should be 1.0 for all quantiles.
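Regression (17) is ordinary least squares run quantile by quantile. A minimal sketch with synthetic data illustrates the mechanics; the 1.4 slope and noise levels are illustrative assumptions, not the chapter's estimates.

```python
import random

def slope(y, x):
    """OLS slope b in y = a + b*x, as in equation (17): cov(x, y) / var(x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var = sum((xi - mx) ** 2 for xi in x)
    return cov / var

# Synthetic illustration: a lower quantile that moves 1.4 points
# per point of forward index change, plus noise
random.seed(0)
dF = [random.gauss(0.0, 10.0) for _ in range(2000)]      # daily forward changes
dQ = [1.4 * f + random.gauss(0.0, 2.0) for f in dF]      # quantile changes
b = slope(dQ, dF)
print(round(b, 2))   # recovers a slope close to the true 1.4
```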

When (17) is estimated on all observations, all b coefficients are positive and highly significant, but they show a clear negative and almost perfectly monotonic relationship between the quantile and the size of b. When the index falls, the left end of the curve drops by more than the change in the forward index and the right end moves down by substantially less. For example, a 10 point drop in the forward index leads to about a 14 point drop in the 1% and 2% quantiles, but the upper quantiles, 0.90 and above, go down less than 8 points. Similarly, when the index rises the lower quantiles go up further than the upper quantiles. Visually, the RND stretches out to the left when the S&P drops, and when the S&P rises the RND tends to stack up against its relatively inflexible upper end.

The next two sets of results compare the behavior of the quantiles between positive and negative returns. Although the same difference in the response of the left and right tails is present in both cases, it is more pronounced when the market goes down than when it goes up. To explore whether a big move has a different impact, the last two sets of results in Table 15.4 report regression coefficients fitted only on days with large negative returns, below −1.0%, or large positive returns greater than +1.0%. When the market falls sharply, the effect on the left tail is about the same as the overall average response to both up and down moves, but the extreme right tail moves distinctly less than for a normal day. By contrast, if the market rises more than 1.0%, the left-tail effect is attenuated whereas the right tail seems to move somewhat more than for a normal day.

These interesting and provocative results on how the RND responds to and reflects the market's changing expectations and (possibly) risk attitudes as prices fluctuate in the market warrant further investigation. One potentially important factor here is that the biggest differences are found in the extreme tails of the RND, in the regions where the empirical RND has been extended with GEV tails. What we are seeing may be a result of changes in the shape of the empirical RND at its ends when the market makes a big move, which the GEV tails then try to match. Note, however, that the empirically observed portion of the RND for the full sample shows the strong monotonic


Table 15.4. Regression of change in quantile on change in the forward S&P index level

                          Quantile
                  0.01     0.02     0.05     0.10      0.25      0.50      0.75      0.90      0.95     0.98     0.99

All observations  1.365    1.412    1.385    1.297     1.127     0.974     0.857     0.773     0.730    0.685    0.659
Nobs = 2712      (58.65)  (72.43)  (98.62)  (180.26)  (269.44)  (269.88)  (272.05)  (131.16)  (88.75)  (60.08)  (46.60)

Negative return   1.449    1.467    1.404    1.291     1.119     0.975     0.867     0.780     0.727    0.661    0.613
Nobs = 1298      (30.07)  (35.53)  (46.96)  (88.75)   (128.75)  (130.77)  (134.93)  (64.80)   (43.13)  (28.37)  (21.45)

Positive return   1.256    1.308    1.340    1.306     1.148     0.978     0.845     0.756     0.720    0.696    0.691
Nobs = 1414      (26.88)  (34.32)  (48.97)  (88.04)   (137.21)  (134.38)  (131.44)  (62.87)   (43.00)  (29.88)  (23.75)

Return < −1.0%    1.352    1.390    1.368    1.282     1.140     1.001     0.879     0.756     0.670    0.559    0.478
Nobs = 390       (12.64)  (14.26)  (19.62)  (39.18)   (54.80)   (59.48)   (61.05)   (27.71)   (17.94)  (11.28)  (8.12)

Return > 1.0%     1.106    1.194    1.292    1.310     1.173     0.988     0.843     0.756     0.726    0.710    0.712
Nobs = 395       (11.36)  (14.05)  (20.28)  (35.88)   (61.28)   (57.29)   (55.29)   (26.27)   (18.38)  (13.11)  (10.53)

Regression equation: ΔRNDQ(t) = a + bΔF(t). The table shows the estimated b coefficient, with t-statistics in parentheses.


352 Estimating the implied risk-neutral density for the US market portfolio

coefficient estimates throughout its full range, so the patterns revealed in Table 15.4 are clearly more than simply artifacts of the tail-fitting procedure.
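Each entry in Table 15.4 comes from an OLS regression of the daily change in an RND quantile on the daily change in the forward index level. A minimal sketch of that regression on simulated data (the series `dF` and `dQ` are hypothetical stand-ins for the actual changes):

```python
import numpy as np

def quantile_response(dq, df):
    """OLS slope b and its t-statistic for dRNDQ(t) = a + b * dF(t)."""
    X = np.column_stack([np.ones_like(df), df])
    coef = np.linalg.lstsq(X, dq, rcond=None)[0]
    resid = dq - X @ coef
    s2 = resid @ resid / (len(dq) - 2)          # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)           # OLS covariance matrix
    return coef[1], coef[1] / np.sqrt(cov[1, 1])

# hypothetical daily changes: a left-tail quantile that moves ~1.4x the forward
rng = np.random.default_rng(0)
dF = rng.normal(0.0, 10.0, 500)                 # change in forward index level
dQ = 1.4 * dF + rng.normal(0.0, 2.0, 500)       # change in, say, the 0.01 quantile
b, t = quantile_response(dQ, dF)
```

A slope above one, as in the left-tail columns of the table, means the quantile moves more than one-for-one with the forward.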

7. Concluding comments

We have proposed a comprehensive approach for extracting a well-behaved estimate of the risk-neutral density over the price or return of an underlying asset, using the market prices of its traded options. This involves two significant technical problems: first, how best to obtain a sufficient number of valid option prices to work with, by smoothing the market quotes to reduce the effect of market noise and interpolating across the relatively sparse set of traded strike prices; and second, how to complete the density functions by extending them into the tails. We explored several potential solutions for the first problem and settled on converting market option price quotes into implied volatilities, smoothing and interpolating them in strike price-implied volatility space, converting back to a dense set of prices, and then applying the standard methodology to extract the middle portion of the risk-neutral density. We then addressed the second problem by appending left and right tails from a generalized extreme value distribution in such a way that each tail contains the correct total probability and has a shape that approximates the shape of the empirical RND in the portion of the tail that was available from the market.
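The "standard methodology" for the middle of the density is the Breeden–Litzenberger relation, f(K) = e^{rT} ∂²C/∂K², applied to the dense grid of call prices recovered from the smoothed implied volatility curve. A sketch under the simplifying assumption that the dense prices are exact Black–Scholes prices, so the recovered density should be lognormal (all parameter values are illustrative, and the GEV tail-appending step is omitted):

```python
import numpy as np
from math import exp, log, sqrt
from statistics import NormalDist

def bs_call(S, K, r, sigma, T):
    """Black-Scholes call price, used here only to generate a dense price grid."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    N = NormalDist().cdf
    return S * N(d1) - K * exp(-r * T) * N(d2)

S, r, sigma, T = 100.0, 0.02, 0.2, 0.25
K = np.linspace(60.0, 160.0, 2001)              # dense strike grid
C = np.array([bs_call(S, k, r, sigma, T) for k in K])
dK = K[1] - K[0]

# f(K) = e^{rT} * C''(K), via two numerical derivatives
rnd = np.exp(r * T) * np.gradient(np.gradient(C, dK), dK)
rnd = np.clip(rnd, 0.0, None)                   # clip tiny negative wiggles
mass = rnd.sum() * dK                           # should integrate to ~1
```

In the chapter's procedure the call prices come from the smoothed implied-volatility curve rather than a formula, and the truncated tails are then replaced by fitted GEV tails.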

Although the main concentration in this chapter has been on developing the estimation technology, the purpose of the exercise is ultimately to use the fitted RND functions to learn more about how the market prices options, how it responds to the arrival of new information, and how market risk preferences behave and vary over time. We presented results showing that the risk-neutral density for the S&P 500 index, as reflected in its options, is far from the lognormal density assumed by the Black–Scholes model: it is strongly negatively skewed and fat-tailed relative to the (log)normal. We also found that when the underlying index moves, the RND not only moves along with the index, but it also changes shape in a regular way, with the left tail responding much more strongly than the right tail to the change in the index.

These results warrant further investigation, for the S&P 500 and for other underlying assets that have active options markets. The following is a selection of such projects that are currently under way.

The Federal Reserve announces its Federal funds interest rate target and policy decisions at approximately 2:15 in the afternoon at the end of its regular meeting, about every six weeks. This is a major piece of new information, and the market's response is immediate and often quite extreme. Using intraday options data, it is possible to estimate real-time RNDs that give a very detailed picture of how the market's expectations and risk preferences are affected by the information release.

The volatility of the underlying asset is a very important input into all modern option pricing models, but volatility is hard to predict accurately and there are a number of alternative techniques in common use. There are also a number of index-based securities that are closely related to one another and should therefore have closely related volatilities. The RND provides insight into what the market's expected volatility is, and how it


is connected to other volatility measures, like realized historical volatility, volatility estimated from a volatility model such as GARCH, realized future volatility over the life of the option, implied volatility from individual options or from the VIX index, volatility of S&P index futures prices, implied volatility from futures options, volatility of the SPDR tracking ETF, etc.

Yet another important issue involves causality and the predictive ability of the risk-neutral density. Does the information contained in the RND predict the direction and volatility of future price movements, or does it lag behind and follow the S&P index or the S&P futures price?

We hope and anticipate that the procedure we have developed here can be put to work in these and other projects, and will ultimately generate valuable new insights into the behavior of financial markets.


16

A New Model for Limit Order Book Dynamics

Jeffrey R. Russell and Taejin Kim

1. Introduction

Nearly half the world's stock exchanges are organized as order-driven markets, such as Electronic Communications Networks, or ECNs. These markets are purely electronic, with no designated specialists or market makers. In the absence of a market maker, prices are completely determined by limit orders submitted by market participants. Hence, for these markets, the structure of the limit order book, the quantity of shares available for immediate execution at any given price, determines the cost of immediate order execution. The dynamics of the limit order book, therefore, determine how this cost varies over time. Despite the prevalence of these markets, there are remarkably few models for the determinants of the structure of the limit order book and its dynamics. This chapter proposes a new dynamic model for the determinants of the structure of the limit order book as determined by the state of the market and asset characteristics.

There is a substantial literature that has examined specific features of the limit order book. One strand has examined the limit order placement strategies of individual investors. Examples include Biais, Hillion, and Spatt (1995), Coppejans and Domowitz (2002), Ranaldo (2004), and Hall and Hautsch (2004). This approach provides insight into the microbehavior of decisions, but provides only indirect evidence about the overall structure of the limit order book.

A second strand has focused on depth (the number of shares available) at the best bid and the best ask. Bollerslev, Domowitz, and Wang (1997) propose a model for the dynamics of the best bid and best ask conditional on order flow. Kavajecz (1999) decomposes depth into specialist and limit order components. He shows that depth at the best bid and best ask is reduced during periods of uncertainty and possible private information.

Acknowledgments: We thank Tim Bollerslev and Mark Watson for comments on a previous draft.


The existing literature can address specific questions regarding the limit order book but, in the end, cannot provide direct answers to questions like "what is the expected cost of buying 2,000 shares of Google one minute from now?". Answers to such questions require a more complete model of the limit order book, one that models the entire structure, not just a component. These answers will clearly be useful when considering optimal trade execution strategies such as those considered in Almgren and Chriss (2000), who, under parametric assumptions regarding price dynamics, derive a closed form expression for optimal order execution. Furthermore, Engle and Ferstenberg show that optimally executing an order by breaking it up and spreading the execution over time induces a risk component that can be analyzed in the context of classic mean-variance portfolio risk. This risk and the time to execution are tied to the future shape of the limit order book. Engle, Ferstenberg, and Russell (2008) empirically evaluate this mean-variance tradeoff, but clearly optimal execution strategies will benefit from rigorous models for the dynamics of the limit order book.

This chapter therefore diverges from the existing literature that focuses on specific features of the limit order book. The limit order book is a set of quantities to be bought or sold at different prices, and we propose directly modeling the time-varying demand and supply curves. The forecast from the model is therefore a function producing expected quantities over a range of prices, as a function of the history of the limit order book and of market and asset conditions. The model, therefore, can directly answer questions regarding the expected cost of a purchase (or sale) in one minute.

The model is parameterized in a way that allows for easy interpretation, and it is therefore useful in assessing and interpreting how market conditions affect the shape of the limit order book, and hence liquidity. The distribution of depth across the limit order book is modeled by a time-varying normal distribution and therefore depends on two time-varying parameters. The first determines the average distance of the depth from the midquote. As this parameter increases, market liquidity tends to decrease. The second parameter determines how spread out the depth is. Larger values of this parameter lead to a flatter limit order book. These parameters are made time-varying in an autoregressive manner, so that the shape of the limit order book next period depends on the shape of the limit order book in the previous period, and possibly on other variables that characterize the market condition. The ease of interpretation of the proposed model differentiates it from the ACM model proposed by Russell and Engle (2005). The probit structure of the model is in the spirit of the time series models proposed by Hausman, Lo, and MacKinlay (1992) and Bollerslev and Melvin (1994), although the specific dynamics and application are new.

The model is applied to one month of limit order book data. The data come from the Archipelago Exchange. Model estimates are presented for limit order book dynamics at 1-minute increments. We find that the limit order book exhibits very strong persistence, suggesting that new limit orders are slow to replenish the book. We also find that depth tends to move away from the midquote, so that the market becomes less liquid, following larger spreads, smaller trade volume, higher transaction rates, and higher volatility. We also find that the book tends to become more dispersed (flatter) when spreads are low, trade size is large, transaction rates are high, and volatility is high.


2. The model

This section presents a model for the distribution of the number of shares available across different prices. Our approach decomposes the limit order book into two components: the total depth in the market and the distribution of that depth across the multiple prices.

We begin with some notation. Let the midquote at time t be denoted by m_t. Next, we denote a grid of N prices on each of the ask and bid sides. The ith ask price on the grid is denoted by p^a_it and the ith bid price by p^b_it. p^a_1t is the first price at or above the midquote at which depth can be listed and, similarly, p^b_1t is the first price below the midquote at which depth can be listed. We will treat the grid as being equally spaced, so that each consecutive price on the ask side is a fixed unit above the previous price. The grid accounts for the fact that available prices in most markets are restricted to fall on values at fixed tick sizes. Hence, the smallest increment considered would be the tick size, although larger increments could be considered as well. Finally, we define the total number of shares available in each price bin. On the ask side, a_it denotes the total depth available in the ith bin: a_1t is the shares available in the limit order book at prices p where p^a_1t ≤ p ≤ p^a_2t, and for i > 1, a_it denotes the shares available at prices p where p^a_it < p ≤ p^a_{i+1,t}. A similar notation is used for the bid side of the market, where b_it denotes the shares available in the limit order book on the bid side. Because the grid is anchored at the midquote, it carries a time subscript. In both cases, larger values of i are associated with prices further away from the midquote.

Our goal is to specify a model for the expected shares available in each bin given the state of the market and perhaps characteristics of the asset. For a small number of bins (small N) the depth could be modeled by standard time series techniques such as a VAR. These approaches quickly become intractable when N is more than one or two. Additionally, it is difficult to directly interpret the results of a VAR in the relevant context of liquidity. We take a different approach that decomposes the problem into two components. Define the total shares in the limit order book over the first N bins as D^a_t = Σ_{i=1}^N a_it. We decompose the model for the limit order book into shape and level components. Given the total shares in the limit order book, define

π_it = E(a_it / D^a_t | D^a_t)    (1)

as the expected fraction of the depth D^a_t in bin i at time t. Given the total shares, the expected depth in bin i at time t is given by

E(a_it | D^a_t) = π_it D^a_t.    (2)

Differences in depth across bins are driven by the π terms. Hence, this decomposition separates the model for the limit order book into a shape component described by the πs and a level given by the overall depth, D^a_t. In general, both the shape of the limit order book and the total shares available, D^a_t, will depend on characteristics of the asset and market conditions. Let F_{t−1} denote an information set available at time t−1, and let g(D^a_t | F_{t−1}) denote a model for the time-varying total shares. We can now generalize (1) and (2) to allow for time-varying probabilities, time-varying total shares, D^a_t, and a time-varying limit order book:

π_it = E(a_it / D^a_t | D^a_t, F_{t−1}).    (3)

The one-step-ahead predicted depth is then given by

E(a_it | F_{t−1}) = ∫_D π_it g(D^a_t | F_{t−1}) dD.    (4)

Hence, the limit order book can be modeled using a multinomial model for (3) and a univariate time series model for g(D^a_t | F_{t−1}). The latter is a univariate series that could be modeled with standard time series models such as an ARMA model. The new part here is, therefore, to find a good model for the multinomial probabilities.
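A toy illustration of the shape/level decomposition in equations (1) and (2), with made-up values for the shape probabilities and the total depth:

```python
import numpy as np

# hypothetical shape: expected fraction of depth in each of N = 6 bins
pi_t = np.array([0.10, 0.25, 0.30, 0.20, 0.10, 0.05])
assert np.isclose(pi_t.sum(), 1.0)   # the shape component is a probability vector

# level: expected total ask-side depth, e.g. from a univariate ARMA forecast
D_t = 8000.0

# eq. (2): E(a_it | D_t) = pi_it * D_t
expected_depth = pi_t * D_t
```

The two components can be modeled and forecast separately, which is what makes the decomposition tractable for large N.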

The goal in specifying the multinomial model is to find a model that fits the data well, is easily interpreted, and allows N to be large without requiring a large number of parameters. The limit order book clearly exhibits dependence, especially when viewed over short time periods. The model must, therefore, be specified in a flexible way, so that the shape depends on the history of the limit order book.

Our model is formulated using a multinomial probit model. For the probit model, the multinomial probabilities are determined by areas under the normal density function. These probabilities are time-varying when the mean and variance of the normal density are time-varying. Specifically, given a mean μ_t and a variance σ²_t, the probability is given by:

π_it = [Φ_t(p_it − m_t) − Φ_t(p_{i−1,t} − m_t)] / [Φ_t(p_Nt − m_t) − Φ_t(0)]

where Φ_t is the cumulative distribution function for a Normal(μ_t, σ²_t). The denominator simply normalizes the probabilities to sum to one. If the grid is set on ticks, then π_it corresponds to the fraction of the depth that lies on the ith tick above the midquote.

This parameterization is convenient to interpret. Clearly, as μ_t increases, the center of the distribution moves away from the midquote. Therefore, larger values of μ_t are associated with depth lying, on average, further from the midquote. This would correspond to a less liquid market. As σ²_t increases, the normal density becomes flatter, spreading out the probability more evenly across the N bins. As σ²_t goes to infinity, the probabilities become equal. An increase or decrease in either the mean or the variance is, therefore, easily interpreted in terms of the average distance that the depth lies from the midquote and how spread out the depth is across the N bins.
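The probit shape probabilities can be computed directly from normal CDF differences over the bin edges, renormalized as in the displayed formula (the grid and parameter values below are illustrative):

```python
import numpy as np
from statistics import NormalDist

def bin_probabilities(mu, sigma2, edges):
    """Multinomial probit probabilities for depth bins.

    edges: bin edges in cents above the midquote, edges[0] = 0.
    pi_i is the normal mass between consecutive edges, renormalized to
    sum to one over the N observed bins (the denominator in the text).
    """
    F = NormalDist(mu, sigma2 ** 0.5).cdf
    cdf = np.array([F(x) for x in edges])
    return np.diff(cdf) / (cdf[-1] - cdf[0])

edges = np.arange(0, 35, 5)   # six 5-cent bins: 0-5, 5-10, ..., 25-30
pi = bin_probabilities(mu=15.0, sigma2=100.0, edges=edges)
```

With μ = 15 the mass concentrates in the middle bins; raising σ² flattens the vector toward equal probabilities, exactly the two interpretations given above.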

We now turn to the dynamics of the mean and variance. As the shape of the limit order book will be highly dependent, especially over short time intervals, we begin with the simplest version of the model, using an autoregressive structure for the mean and variance. At each time period t, we can calculate the center of the empirical distribution of the depth,

x_t = (1/D_t) Σ_{i=1}^n (p_it − m_t) a_it.

The difference between the actual mean and the predicted mean is given by ε_t = x_t − Σ_{i=1}^n π_it (p_it − m_t). Similarly, we can compute the empirical variance of the depth across the bins as

s²_t = (1/D_t) Σ_{i=1}^n (p_it − x_t)² a_it,

and the associated error is given by η_t = s²_t − Σ_{i=1}^n π_it (p_it − x_t)². If the model is correctly specified, then both error terms will be serially uncorrelated.

These errors are used to build an autoregressive model for the time-varying mean and variance that, in turn, dictate the time-varying probabilities in the multinomial.
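The empirical center x_t, spread s²_t, and the two forecast errors can be computed from a single book snapshot as follows (bin midpoints and depths are made up, and a flat π is used as the baseline model):

```python
import numpy as np

def shape_errors(depth, offsets, pi):
    """Empirical center x_t and spread s2_t of the depth, plus the
    forecast errors eps_t and eta_t against the model probabilities pi."""
    D = depth.sum()
    x = (offsets * depth).sum() / D                  # x_t
    s2 = ((offsets - x) ** 2 * depth).sum() / D      # s_t^2
    eps = x - (pi * offsets).sum()                   # eps_t
    eta = s2 - (pi * (offsets - x) ** 2).sum()       # eta_t
    return x, s2, eps, eta

# hypothetical snapshot: bin midpoints (cents above midquote) and observed shares
mids = np.array([2.5, 7.5, 12.5, 17.5, 22.5, 27.5])
a_t = np.array([500.0, 1500.0, 2500.0, 2000.0, 1000.0, 500.0])
pi_t = np.full(6, 1.0 / 6.0)   # flat baseline model
x_t, s2_t, eps_t, eta_t = shape_errors(a_t, mids, pi_t)
```

A serially uncorrelated sequence of `eps_t` and `eta_t` is the specification check applied in the Results section.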


Specifically, a simple model for the dynamics of the mean is given by:

μ_t = β_0 + β_1 μ_{t−1} + β_2 ε_{t−1}.

Similarly, a simple model for the dynamics of the variance is given by:

σ²_t = γ_0 + γ_1 σ²_{t−1} + γ_2 η_{t−1}.

Clearly, higher order models could be considered. Additionally, other variables that capture the state of the market could be included. The explicit dependence of the current mean and variance on the past mean and variance allows for potential persistence in the series. The error terms allow the updating to depend on the differences between the expected and actual mean and variance. In the next section, we turn to model estimation.
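The two updating recursions are one-liners; here they are seeded with the Table 16.1 point estimates as default coefficients (the state values passed in are illustrative):

```python
def update_mean(mu_prev, eps_prev, b0=0.057, b1=0.998, b2=-0.51):
    # mu_t = b0 + b1*mu_{t-1} + b2*eps_{t-1}; defaults are the Table 16.1 estimates
    return b0 + b1 * mu_prev + b2 * eps_prev

def update_var(s2_prev, eta_prev, g0=0.13, g1=0.962, g2=-0.91):
    # sigma2_t = g0 + g1*sigma2_{t-1} + g2*eta_{t-1}
    return g0 + g1 * s2_prev + g2 * eta_prev

# one update step from a hypothetical state
mu_t = update_mean(15.0, 0.5)       # depth centered ~15 cents out, small error
var_t = update_var(100.0, -2.0)
```

With autoregressive coefficients this close to one, shocks to the book's shape decay very slowly, which is the persistence reported in the Results section.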

3. Model estimation

The data on a given side of the market consist of the number of shares available in each bin. We proceed to estimate parameters for the mean and variance dynamics by maximum likelihood. If each share submitted at each time period t could be viewed as n i.i.d. draws from a multinomial distribution, then the likelihood associated with the tth period is given by:

l_t = π_1t^{a_1t} π_2t^{a_2t} · · · π_nt^{a_nt}.

This assumes that the shares are i.i.d. draws, which is surely false. Orders are submitted in packets of multiple shares, typically in increments of 100 shares. If all orders were submitted in packets of 100 shares, then the likelihood for the tth observation would be given by:

l_t = π_1t^{ā_1t} π_2t^{ā_2t} · · · π_nt^{ā_nt}

where ā_it = a_it/100. The log likelihood is then given by:

L = Σ_{t=1}^T Σ_{i=1}^n ā_it ln(π_it).

Given initial values μ_0 and σ²_0, the multinomial probabilities can be sequentially updated and the likelihood evaluated for any set of parameters.
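With depth counted in 100-share lots, the log likelihood is a weighted sum of log probabilities. A sketch on two hypothetical periods:

```python
import numpy as np

def log_likelihood(depth_panel, pi_panel, lot=100):
    """L = sum_t sum_i (a_it / lot) * ln(pi_it), treating 100-share lots as draws."""
    lots = depth_panel / lot
    return float((lots * np.log(pi_panel)).sum())

# two hypothetical periods, three bins each
a = np.array([[300.0, 500.0, 200.0],
              [400.0, 400.0, 200.0]])
pi = np.array([[0.3, 0.5, 0.2],
               [0.4, 0.4, 0.2]])
ll = log_likelihood(a, pi)
```

Here `pi` matches the empirical bin fractions exactly, so any other probability vector yields a lower log likelihood, which is the property the maximum likelihood estimator exploits.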

4. Data

The data consist of limit orders that were submitted through the Archipelago Exchange (ARCA). This exchange has since been bought by the NYSE and is now called NYSE Arca. As of March 2007, Archipelago is the second largest ECN in terms of shares traded (about 20% market share for NASDAQ stocks). Our data consist of one month of all limit orders submitted in January 2005. The data contain the type of order action: add, modify, and delete. "Add" corresponds to a new order submission. "Modify" occurs when an order is


[Figure 16.1 plots average depth (vertical axis, roughly 50 to 150 shares) against price, measured in cents away from the midquote (horizontal axis, −40 to 40).]

Fig. 16.1. Distribution of depth measured in cents away from midquote

modified either in its price or number of shares, or if an order is partially filled. "Delete" signifies that an order was cancelled, filled, or expired. The data also contain a time stamp down to the millisecond, the price and order size, a buy or sell indicator, the stock symbol, and the exchange.

We extract orders for a single stock, Google (GOOG). Only orders submitted during regular hours (9:30 to 4:00) are considered. From the order-by-order data we construct the complete limit order book at every minute. This results in 390 observations per day. The average trade price for Google over the month is close to $200. Figure 16.1 presents a plot of the depth at each cent moving away from the midquote, from 1 cent to 40 cents. The plot reveals a peaked distribution, with its peak around 15–20 cents away from the midquote. Of course, this is an unconditional distribution.

The limit order book data are merged with Trades and Quotes (TAQ) data for the same time period. From these data we create several variables related to trading and volatility. Past order flow should be related to future order flow and therefore to future limit order placement. For every minute, we construct the logarithm of the average trade size over the most recent 15-minute period. Additionally, we construct the total number of trades executed over the most recent 15-minute period. Both are indications of the degree of market activity. We also create a realized volatility measure constructed by summing squared 1-minute interval returns over the 15 most recent minutes. Finally, the bid-ask spread at transaction times is averaged over the 15 most recent minutes.
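The realized volatility regressor, for instance, is just the sum of squared 1-minute log returns over the trailing 15 minutes. A sketch with a hypothetical price path:

```python
import numpy as np

def realized_variance(prices):
    """Sum of squared 1-minute log returns.

    prices: the last 16 one-minute price observations (giving 15 returns).
    """
    r = np.diff(np.log(prices))
    return float((r ** 2).sum())

# hypothetical 1-minute trade prices for a ~$200 stock
p = np.array([200.0, 200.1, 199.9, 200.05, 200.0, 200.2, 200.1, 200.0,
              199.95, 200.1, 200.3, 200.2, 200.25, 200.1, 200.0, 200.05])
rv = realized_variance(p)
```

The same 15-minute rolling window is used for the trade-size, trade-count, and spread variables.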

In principle, we could model depth out through any distance from the midquote. We focus our attention in this analysis on the depth out through 30 cents. We aggregate the shares into larger, 5-cent bins and, consequently, have six bins on the bid side and six bins


r0 =
1.000  0.081  0.094  0.039  0.010  0.020
0.081  1.000  0.190  0.167  0.090  0.075
0.094  0.190  1.000  0.190  0.210  0.092
0.039  0.167  0.190  1.000  0.185  0.033
0.010  0.090  0.210  0.185  1.000  0.263
0.020  0.075  0.092  0.133  0.263  1.000

[The lag-1 through lag-3 autocorrelation matrices r1, r2, and r3 shown in Figure 16.2 are not cleanly recoverable from the transcript.]

Fig. 16.2. Autocorrelations of depth in different bins on the ask side

on the ask side. Our modeling strategy has separate models for the bid and ask sides of the market. In our analysis, we focus on the ask side only.

5. Results

We begin with some summary statistics for the minute-by-minute data. At each minute, we have observed depth in the first six 5-cent bins, a_1t, a_2t, . . . , a_6t. It is interesting to assess the dependence structure in this vector time series. Specifically, if we stack the depth at time t into a vector x_t, where the first element of x_t is a_1t and the last element is a_6t, we can construct the autocorrelations of the vector x_t for lags 0 through 3 minutes. The sample autocorrelations are presented in Figure 16.2. For reference, conventional Bartlett standard errors imply that an autocorrelation larger than 2/√T = 0.024 in absolute value is statistically significant.

All autocorrelations are positive, indicating that the depth at the different prices tends to move together. Depth near the diagonal tends to be more highly correlated than depth away from the diagonal, indicating that the correlation between close bins is larger than the


correlation between bins that are far apart. The diagonal terms, that is, the autocorrelations of the same element of the vector x_t, tend to be the highest of all the correlations. Although not presented, the generally positive and significant correlation structure continues out through lag 10 (10 minutes).
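The autocorrelation matrices reported in Figure 16.2 are cross-autocorrelations of the stacked depth vector. A sketch that computes them on a simulated persistent panel (the data-generating process below is made up purely for illustration):

```python
import numpy as np

def vector_autocorr(X, lag):
    """Cross-autocorrelation matrix A[i, j] = corr(X_t[i], X_{t-lag}[j]) for a T x N panel."""
    T, N = X.shape
    Xc = (X - X.mean(0)) / X.std(0)   # standardize each series
    return Xc[lag:].T @ Xc[:T - lag] / (T - lag)

# simulate 6 depth-bin series: AR(1) with phi = 0.3 plus a common noise factor
rng = np.random.default_rng(1)
T, N, phi = 2000, 6, 0.3
e = rng.normal(size=(T, N)) + 0.5 * rng.normal(size=(T, 1))
X = np.zeros((T, N))
for t in range(1, T):
    X[t] = phi * X[t - 1] + e[t]

r1 = vector_autocorr(X, 1)   # lag-1 matrix, analogous to r1 in Figure 16.2
```

The diagonal of the lag-0 matrix is one by construction, and the diagonal of `r1` recovers the AR coefficient, mirroring the pattern in the figure where own-lag correlations dominate.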

We now estimate the model for the distribution of the depth across the bins, the multinomial probit. We begin by estimating the simple, first order model presented in Section 2: μ_t = β_0 + β_1 μ_{t−1} + β_2 ε_{t−1} and σ²_t = γ_0 + γ_1 σ²_{t−1} + γ_2 η_{t−1}. In principle, one could re-initialize at the start of each day by setting the initial values of μ_t and σ²_t to some set value, such as an unconditional mean, or perhaps treat the initial values as parameters to be estimated. In reality, with 390 observations per day, there is unlikely to be any meaningful effect from simply connecting the days and neglecting the re-initialization. This is consistent with the findings in Engle and Russell (1998) for trade-by-trade duration models. The parameter estimates are given in Table 16.1 with t-statistics in parentheses. All parameters are statistically significant at the 1% level. Both the mean and the variance exhibit very strong persistence, indicating that the average distance of the depth from the midquote is highly persistent, as is the degree of spread of the depth across bins. The autoregressive term is near 1 for both models.

A natural test of the model is to check whether the one-step-ahead forecast errors for the mean and variance equations (ε_t and η_t) are uncorrelated. The null of a white noise series can be tested by examining the autocorrelations of these in-sample errors. Specifically, we perform a Ljung-Box test on the first 15 autocorrelations associated with the errors for the mean equation and the variance equation. The p-values are 0.53 and 0.06, respectively. Hence, this simple first order model appears to do a reasonably good job of capturing the dependence in the shape of the limit order book.
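The Ljung-Box statistic used here can be computed directly from the in-sample errors; a minimal sketch (the Q statistic is compared against a chi-square critical value with 15 degrees of freedom):

```python
import numpy as np

def ljung_box(resid, lags=15):
    """Ljung-Box Q statistic on the first `lags` autocorrelations."""
    T = len(resid)
    r = resid - resid.mean()
    denom = (r ** 2).sum()
    acf = np.array([(r[k:] * r[:-k]).sum() / denom for k in range(1, lags + 1)])
    return T * (T + 2) * ((acf ** 2) / (T - np.arange(1, lags + 1))).sum()

# white noise should produce a small Q; a persistent series a very large one
rng = np.random.default_rng(2)
Q_white = ljung_box(rng.normal(size=2000))
```

A p-value of 0.53 for the mean-equation errors, as reported above, means the Q statistic is well inside the body of the chi-square(15) distribution.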

It is interesting to see that a simple first order version of the model can capture the substantial dependence in the shape of the limit order book. We now turn our attention to additional market factors that might influence the dynamics of the limit order book. Glosten (1994) predicts that higher trading rates should result in depth clustering around the midquote. Competition among traders in an active market leads to more limit orders being placed near the midquote. Similarly, Rosu (2008) proposes a theoretical model for the dynamics of the limit order book, which also predicts that more depth should cluster around the midquote. Following Glosten and Rosu, we should expect the mean to decrease, and the average distance of the depth to move closer to the midquote, in periods of high trading rates.

Periods of high volatility are associated with greater uncertainty. In periods of high uncertainty there might be a higher probability of trading against better informed agents. Classic microstructure theory predicts a widening of bid-ask spreads when the probability of trading against better informed agents is higher. We might therefore expect that depth

Table 16.1. Estimated coefficients for time series model

Model for mean                 Model for variance
Intercept   0.057 (6.61)       Intercept   0.13 (3.69)
μ_{t−1}     0.998 (179.5)      σ²_{t−1}    0.962 (679.9)
ε_{t−1}     −0.51 (−511.05)    η_{t−1}     −0.91 (25.93)


should move away from the midquote in periods of high volatility. At the same time, high volatility in the asset price increases the probability that a limit order far from the current price gets executed. This might also serve as an incentive for traders to seek superior execution by placing limit orders further from the current price. Both of these ideas imply that in periods of higher volatility, the average distance of the depth from the midquote should increase. We might also expect that the distribution of depth should flatten. Hence, we might expect the mean and variance to increase in periods of high asset price volatility.

In light of these economic arguments, we next estimate models that condition on recent transaction history and volatility. Specifically, we use the transaction volume over the past 15 minutes, the number of trades over the last 15 minutes, and the realized minute-by-minute volatility over the last 15 minutes. Additionally, we include some other economic variables that are of interest, including the average spread over the last 15 minutes and the price change over the last 15 minutes. We include all these economic variables within the first order time series model estimated above. The coefficients of the economic variables are presented in Table 16.2 with t-statistics in parentheses.

We begin with a discussion of the realized volatility. Realized variance has a positive coefficient in the mean equation, indicating that when the volatility of the asset price increases, the average distance of the depth tends to move away from the midquote. This is consistent with both ideas, namely that an increased likelihood of trading against better informed agents moves depth to more conservative prices that account for this risk. It is also consistent with the idea that high volatility increases the likelihood of depth further from the midquote getting executed at some point in the future. Similarly, the coefficient on the volatility is positive in the variance equation. This indicates a flattening of the distribution, so that the depth is more evenly spread over the bins.

Next, consider the trade size and trading rate variables. We see that larger average trade size tends to move the depth closer to the midquote. Higher trading rates tend to move the depth further from the midquote, on average. The effects of trade size and trading rates are both positive on the variance. Larger trade size may be indicative of larger depth posted at the best bid and ask prices. As the depth at any price is positively serially correlated, this might simply be indicative of large depth at the ask following larger depth at the ask. The trading rates are a little easier to interpret because there is less of a direct link between trading rates and quantities at the best ask. The positive sign here indicates that depth tends to move away from the midquote during periods of high transaction rates. Additionally, the positive sign on both variables in the variance

Table 16.2. Estimated coefficients for time series model with economic variables

                    Model for mean    Model for variance
Realized variance   0.83 (1.76)       45.51 (5.85)
Trade size          −0.07 (1.87)      1.26 (3.08)
Spread              2.12 (4.15)       26.48 (3.46)
Trading rate        0.072 (3.69)      2.45 (8.34)
Price change        0.56 (7.43)       −10.08 (−9.49)


equation indicates that the depth is more evenly distributed during periods of high trading rates and larger average trade size. Overall, the evidence does not support the predictions of Glosten or the model of Rosu.

Wider spreads are associated with more uncertainty. As with volatility, we might expect that depth should move away from the midquote in periods of greater uncertainty. Indeed, the sign on the spread is positive both for the mean equation and for the variance equation. Rising prices tend to be associated with depth moving away from the midquote and with the distribution becoming more evenly spread.

Next, we estimate a model for the second component, namely the level of the depth D^a_t on the ask side of the market. Specifically, we specify an ARMA(2,2) model for the logarithm of the total depth:

ln(D^a_t) = c + α_1 ln(D^a_{t−1}) + α_2 ln(D^a_{t−2}) + θ_1 ξ_{t−1} + θ_2 ξ_{t−2} + λ rv_{t−1} + ξ_t

where ξt is white noise and rvt−1 is the realized volatility over the last 15 minutes. The other economic variables are not significant, so they are not included in the final model. The estimated model is given in Table 16.3. T-statistics are presented in parentheses.

The in-sample residuals pass a Ljung-Box test with 15 lags. The process is also highly persistent. Although the other economic variables are insignificant, the realized volatility is significant at the 1% level and implies that the level of depth tends to increase following periods of higher volatility. Combining the results for the distribution and the level, we see that the total number of shares in the first 30 cents tends to increase following high volatility periods, but that the distribution of the depth shifts away from the midquote and flattens out. Figure 16.3 presents a plot of the predicted depth under average conditions for all variables except the volatility, which is varied from average to the 5th percentile (low) and the 95th percentile (high).

This plot can be used to calculate the expected cost of purchasing different quantities. Specifically, about 200 more shares are expected to be available in the first price bin when the volatility is high as compared to the low volatility state. About 500 more shares are expected to be available in the high volatility state for the second price bin. Alternatively, the expected cost can be computed for any size trade directly off the curves. The expected cost of purchasing 2,000 shares is about $10 higher in the low volatility state than in the high volatility state.
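The cost calculation works by walking the predicted book bin by bin until the order is filled. A sketch with made-up depth curves (the bin prices and share counts below are illustrative, chosen so the gap is roughly the $10 reported in the text, not the fitted values from Figure 16.3):

```python
# Hypothetical per-bin prices (bin midpoints, $) and predicted depth (shares);
# real values would come from the fitted curves in Fig. 16.3.
bin_price  = [20.03, 20.08, 20.13, 20.18, 20.23, 20.28]
depth_low  = [1400, 1900, 2400, 2800, 3100, 3300]   # low-volatility state
depth_high = [1600, 2400, 2700, 3000, 3200, 3400]   # high-volatility state

def expected_cost(order_size, prices, depths):
    """Walk the book bin by bin until the order is filled."""
    remaining, cost = order_size, 0.0
    for p, q in zip(prices, depths):
        take = min(remaining, q)
        cost += take * p
        remaining -= take
        if remaining == 0:
            return cost
    raise ValueError("order exceeds displayed depth")

# Extra cost of buying 2,000 shares when predicted depth is in the low state.
gap = (expected_cost(2000, bin_price, depth_low)
       - expected_cost(2000, bin_price, depth_high))
print(round(gap, 2))
```

With more depth near the midquote in the high-volatility state, fewer shares spill into the more expensive second bin, which is where the cost difference comes from.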

Table 16.3. Estimated coefficients for total depth model

                    Estimate
Intercept            9.76  (86.69)
AR(1)                1.23  (2.83)
AR(2)               −0.28  (20.51)
MA(1)               −0.28  (−7.04)
MA(2)               −0.18 (−12.21)
Realized variance    2.55  (2.05)


Fig. 16.3. Predicted limit order book under average conditions as volatility varies from low to high (curves shown for the mean, high volatility, and low volatility states)

6. Conclusions

We propose a model for limit order book dynamics. The model is formulated in a way that separates the modeling problem into a model for the level of the depth and a model for the distribution of the depth across specified bins. The decomposition, combined with the use of a convenient Probit model, allows the dynamics to be interpreted in a particularly simple way. Specifically, we model the level, the average distance of the depth from the midquote, and the flatness or spread of the depth across the bins. The model for the level of the depth can be taken from off-the-shelf processes. The new part here is the model for the time-varying multinomial distribution.

We show that simple low-order models for the Probit are able to capture the strong temporal dependence in the shape of the distribution of the depth. More interestingly, we also consider several economic variables. We find that higher volatility predicts that the overall level of the depth will increase, but that depth moves away from the midquote and the distribution tends to flatten out, becoming more dispersed.
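The time-varying multinomial distribution can be illustrated with an ordered-probit mapping: a latent normal variable with conditional mean μt and standard deviation σt is discretized at fixed cutoffs, so bin k receives probability Φ((ck − μt)/σt) − Φ((ck−1 − μt)/σt). The cutoffs and moments below are invented for illustration; shifting μt moves mass away from the midquote, and raising σt flattens the distribution:

```python
import numpy as np
from scipy.stats import norm, entropy

# Hypothetical cutoffs defining five depth bins on the latent scale.
cutoffs = np.array([-np.inf, -1.0, -0.3, 0.3, 1.0, np.inf])

def bin_probs(mu, sigma):
    """Probability of each bin under a latent N(mu, sigma^2) variable."""
    return np.diff(norm.cdf((cutoffs - mu) / sigma))

p_calm = bin_probs(mu=-0.2, sigma=0.8)     # depth concentrated near the midquote
p_volatile = bin_probs(mu=0.3, sigma=1.3)  # shifted away from the midquote, flatter

print(p_calm.round(3))
print(p_volatile.round(3))
```

Comparing the two vectors shows both effects reported above: the expected bin index rises (depth further from the midquote) and the entropy of the distribution rises (a flatter, more dispersed shape).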

Contrary to the predictions of Glosten (1994) and Rosu (2008), we find evidence that higher market activity, as measured by trading rates, tends to move depth away from the midquote and flatten the distribution.


Bibliography

Abraham, J.M., Goetzmann, W.N. and Wachter, S.M. (1994). “Homogeneous Groupings of Metropolitan Housing Markets,” Journal of Housing Economics, 3, 186–206.

Ahn, D., Dittmar, R. and Gallant, A. (2002). “Quadratic Term Structure Models: Theory and Evidence,” Review of Financial Studies, 16, 459–485.

Aït-Sahalia, Y. (1996a). “Nonparametric Pricing of Interest Rate Derivative Securities,” Econometrica, 64, 527–560.

Aït-Sahalia, Y. (1996b). “Testing Continuous-Time Models of the Spot Interest Rate,” Review of Financial Studies, 9, 385–426.

Aït-Sahalia, Y. (2002). “Maximum-Likelihood Estimation of Discretely-Sampled Diffusions: A Closed-Form Approximation Approach,” Econometrica, 70, 223–262.

Aït-Sahalia, Y. (2007). “Estimating Continuous-Time Models Using Discretely Sampled Data,” Econometric Society World Congress Invited Lecture, in Advances in Economics and Econometrics, Theory and Applications, Ninth World Congress, (R. Blundell, T. Persson and W.K. Newey, eds), Econometric Society Monographs, Cambridge University Press.

Aït-Sahalia, Y. (2008). “Closed-Form Likelihood Expansions for Multivariate Diffusions,” Annals of Statistics, 36, 906–937.

Aït-Sahalia, Y. and Lo, A.W. (1998). “Nonparametric Estimation of State-Price Densities Implicit in Financial Asset Prices,” Journal of Finance, 53, 499–547.

Aït-Sahalia, Y. and Lo, A.W. (2000). “Nonparametric Risk Management and Implied Risk Aversion,” Journal of Econometrics, 94, 9–51.

Aït-Sahalia, Y. and Kimmel, R. (2007a). “Maximum Likelihood Estimation of Stochastic Volatility Models,” Journal of Financial Economics, 83, 413–452.

Aït-Sahalia, Y. and Kimmel, R. (2007b). “Estimating Affine Multifactor Term Structure Models Using Closed-Form Likelihood Expansions,” Working Paper, Princeton University.

Alexander, C. (2001). Market Models: A Guide to Financial Data Analysis. Chichester, UK: John Wiley and Sons, Ltd.


Alexander, C. (2008). Market Risk Analysis, Vol. II: Practical Financial Econometrics. Chichester, UK: John Wiley and Sons, Ltd.

Alexander, C. and Lazar, E. (2006). “Normal Mixture GARCH(1,1): Applications to Exchange Rate Modelling,” Journal of Applied Econometrics, 21, 307–336.

Almgren, R. and Chriss, N. (2000). “Optimal Execution of Portfolio Transactions,” Journal of Risk, 3, 5–39.

Altonji, J. and Ham, J. (1990). “Variation in Employment Growth in Canada,” Journal of Labor Economics, 8, 198–236.

Andersen, T.G. (1996). “Return Volatility and Trading Volume: An Information Flow Interpretation of Stochastic Volatility,” Journal of Finance, 51, 169–204.

Andersen, T.G. and Bollerslev, T. (1998). “ARCH and GARCH Models.” In Encyclopedia of Statistical Sciences, Vol. II, (S. Kotz, C.B. Read and D.L. Banks, eds), New York: John Wiley and Sons.

Andersen, T.G., Bollerslev, T., Christoffersen, P.F. and Diebold, F.X. (2006a). “Volatility and Correlation Forecasting.” In Handbook of Economic Forecasting, (G. Elliott, C.W.J. Granger and A. Timmermann, eds), Amsterdam: North-Holland, 778–878.

Andersen, T.G., Bollerslev, T., Christoffersen, P.F. and Diebold, F.X. (2006b). “Practical Volatility and Correlation Modeling for Financial Market Risk Management.” In Risks of Financial Institutions, (M. Carey and R. Stulz, eds), University of Chicago Press for NBER, 513–548.

Andersen, T.G., Bollerslev, T. and Diebold, F.X. (2007). “Roughing it up: Including Jump Components in the Measurement, Modeling and Forecasting of Return Volatility,” Review of Economics and Statistics, 89, 707–720.

Andersen, T.G., Bollerslev, T. and Diebold, F.X. (2009). “Parametric and Nonparametric Measurements of Volatility.” In Handbook of Financial Econometrics, (Y. Aït-Sahalia and L.P. Hansen, eds), North-Holland, forthcoming.

Andersen, T.G., Bollerslev, T., Diebold, F.X. and Ebens, H. (2001). “The Distribution of Realized Stock Return Volatility,” Journal of Financial Economics, 61, 43–76.

Andersen, T.G., Bollerslev, T., Diebold, F.X. and Labys, P. (2000). “Great Realizations,” Risk, 13, 105–108.

Andersen, T.G., Bollerslev, T., Diebold, F.X. and Labys, P. (2001). “The Distribution of Exchange Rate Volatility,” Journal of the American Statistical Association, 96, 42–55. Correction published in 2003, volume 98, page 501.

Andersen, T.G., Bollerslev, T., Diebold, F.X. and Vega, C. (2003). “Micro Effects of Macro Announcements: Real-Time Price Discovery in Foreign Exchange,” American Economic Review, 93, 38–62.


Andersen, T.G., Bollerslev, T., Diebold, F.X. and Vega, C. (2007). “Real-Time Price Discovery in Stock, Bond and Foreign Exchange Markets,” Journal of International Economics, 73, 251–277.

Andersen, T.G. and Lund, J. (1997). “Estimating Continuous Time Stochastic Volatility Models of the Short Term Interest Rate,” Journal of Econometrics, 77, 343–377.

Andersen, T.G., Bollerslev, T. and Meddahi, N. (2004). “Analytic Evaluation of Volatility Forecasts,” International Economic Review, 45, 1079–1110.

Andrews, D.W.K. (1988). “Laws of Large Numbers for Dependent Non-Identically Distributed Random Variables,” Econometric Theory, 4, 458–467.

Andrews, D.W.K. (1991). “Asymptotic Normality of Series Estimators for Nonparametric and Semiparametric Regression Models,” Econometrica, 59, 307–346.

Andrews, D.W.K. (1993). “Tests for Parameter Instability and Structural Change with Unknown Change Point,” Econometrica, 61, 501–533.

Ang, A. and Bekaert, G. (2002). “International Asset Allocation with Regime Shifts,” Review of Financial Studies, 15, 1137–87.

Ang, A., Chen, J. and Xing, Y. (2006). “Downside Risk,” Review of Financial Studies, 19, 1191–1239.

Attanasio, O. (1991). “Risk, Time-Varying Second Moments and Market Efficiency,” Review of Economic Studies, 58, 479–494.

Babsiria, M.E. and Zakoian, J.-M. (2001). “Contemporaneous Asymmetry in GARCH Processes,” Journal of Econometrics, 101, 257–294.

Baele, L. and Inghelbrecht, K. (2005). “Time-Varying Integration and International Diversification Strategies,” Tilburg University, unpublished manuscript.

Bahra, B. (1997). “Implied Risk-Neutral Probability Density Functions from Options Prices: Theory and Application,” Working Paper, Bank of England.

Bai, J. (1997). “Estimating Multiple Breaks One at a Time,” Econometric Theory, 13, 315–352.

Bai, J. (2003). “Testing Parametric Conditional Distributions of Dynamic Models,” Review of Economics and Statistics, 85, 532–549.

Bai, J. and Chen, Z. (2008). “Testing Multivariate Distributions in GARCH Models,” Journal of Econometrics, 143, 19–36.

Baillie, R.T., Bollerslev, T. and Mikkelsen, H.O. (1996). “Fractionally Integrated Generalized Autoregressive Conditional Heteroskedasticity,” Journal of Econometrics, 74, 3–30.


Baillie, R.T., Chung, C.F. and Tieslau, M.A. (1996). “Analysing Inflation by the Fractionally Integrated ARFIMA-GARCH Model,” Journal of Applied Econometrics, 11, 23–40.

Bandi, F. (2002). “Short-Term Interest Rate Dynamics: A Spatial Approach,” Journal of Financial Economics, 65, 73–110.

Banz, R. and Miller, M. (1978). “Prices for State-Contingent Claims: Some Estimates and Applications,” Journal of Business, 51, 653–672.

Barndorff-Nielsen, O.E., Graversen, S.E., Jacod, J. and Shephard, N. (2006). “Limit Theorems for Realised Bipower Variation in Econometrics,” Econometric Theory, 22, 677–719.

Barndorff-Nielsen, O.E., Graversen, S.E., Jacod, J., Podolskij, M. and Shephard, N. (2006). “A Central Limit Theorem for Realised Power and Bipower Variations of Continuous Semimartingales.” In From Stochastic Analysis to Mathematical Finance, Festschrift for Albert Shiryaev, (Y. Kabanov, R. Lipster and J. Stoyanov, eds), 33–68. Springer.

Barndorff-Nielsen, O.E., Hansen, P.R., Lunde, A. and Shephard, N. (2008). “Designing Realised Kernels to Measure the Ex-Post Variation of Equity Prices in the Presence of Noise,” Econometrica, 76, 1481–1536.

Barndorff-Nielsen, O.E. and Shephard, N. (2001). “Non-Gaussian Ornstein–Uhlenbeck-Based Models and Some of their Uses in Financial Economics (with discussion),” Journal of the Royal Statistical Society, Series B, 63, 167–241.

Barndorff-Nielsen, O.E. and Shephard, N. (2002). “Econometric Analysis of Realised Volatility and its Use in Estimating Stochastic Volatility Models,” Journal of the Royal Statistical Society, Series B, 64, 253–280.

Barndorff-Nielsen, O.E. and Shephard, N. (2004). “Power and Bipower Variation with Stochastic Volatility and Jumps (with discussion),” Journal of Financial Econometrics, 2, 1–48.

Barndorff-Nielsen, O.E. and Shephard, N. (2006). “Econometrics of Testing for Jumps in Financial Economics Using Bipower Variation,” Journal of Financial Econometrics, 4, 1–30.

Barndorff-Nielsen, O.E. and Shephard, N. (2007). “Variation, Jumps and High Frequency Data in Financial Econometrics.” In Advances in Economics and Econometrics, Theory and Applications, Ninth World Congress, (R. Blundell, T. Persson and W.K. Newey, eds), Econometric Society Monographs, Cambridge University Press, 328–372.

Bartle, R. (1966). The Elements of Integration. New York: Wiley.

Bates, D.S. (1991). “The Crash of ’87: Was it Expected? The Evidence from Options Markets,” Journal of Finance, 43, 1009–1044.


Bates, D.S. (1996). “Jumps and Stochastic Volatility: Exchange Rate Process Implicit in Deutsche Mark Options,” Review of Financial Studies, 9, 69–107.

Bauwens, L., Laurent, S. and Rombouts, J.V.K. (2006). “Multivariate GARCH Models: A Survey,” Journal of Applied Econometrics, 21, 79–109.

Beckers, S., Grinold, R., Rudd, A. and Stefek, D. (1992). “The Relative Importance of Common Factors Across the European Equity Markets,” Journal of Banking and Finance, 16, 75–95.

Bekaert, G. and Harvey, C. (1995). “Time-Varying World Market Integration,” Journal of Finance, 50, 403–44.

Bekaert, G., Harvey, C. and Ng, A. (2005). “Market Integration and Contagion,” Journal of Business, 78, 39–70.

Bekaert, G., Hodrick, R. and Zhang, X. (2005). “International Stock Return Comovements,” NBER Working Paper 11906.

Benati, L. (2004). “Evolving Post-World War II U.K. Economic Performance,” Journal of Money, Credit, and Banking, 36, 691–717.

Bera, A.K. and Higgins, M.L. (1993). “ARCH Models: Properties, Estimation and Testing,” Journal of Economic Surveys, 7, 305–366.

Bera, A.K., Higgins, M.L. and Lee, S. (1992). “Interaction Between Autocorrelation and Conditional Heteroskedasticity: A Random-Coefficient Approach,” Journal of Business and Economic Statistics, 10, 133–142.

Bernanke, B. and Blinder, A. (1992). “The Federal Funds Rate and the Channels of Monetary Transmission,” American Economic Review, 82(4), 901–921.

Berzeg, K. (1978). “The Empirical Content of Shift-Share Analysis,” Journal of Regional Science, 18, 463–469.

Biais, B., Hillion, P. and Spatt, C. (1995). “An Empirical Analysis of the Limit Order Book and the Order Flow in the Paris Bourse,” Journal of Finance, 1655–1689.

Bierens, H.J. (1990). “A Consistent Conditional Moment Test of Functional Form,” Econometrica, 58, 1443–1458.

Bierens, H.J. and Ploberger, W. (1997). “Asymptotic Theory of Integrated Conditional Moment Tests,” Econometrica, 65, 1129–1151.

Billio, M., Caporin, M. and Gobbo, M. (2006). “Flexible Dynamic Conditional Correlation Multivariate GARCH Models for Asset Allocation,” Applied Financial Economics Letters, 2, 123–130.

Black, F. (1976). “Studies of Stock Price Volatility Changes,” Proceedings of the Business and Economic Statistics Section, American Statistical Association, 177–181.


Bliss, R. and Panigirtzoglou, N. (2002). “Testing the Stability of Implied Probability Density Functions,” Journal of Banking and Finance, 26, 381–422.

Bliss, R. and Panigirtzoglou, N. (2004). “Option Implied Risk Aversion Estimates,” Journal of Finance, 59, 407–446.

Boero, G., Smith, J. and Wallis, K.F. (2008). “Uncertainty and Disagreement in Economic Prediction: The Bank of England Survey of External Forecasters,” Economic Journal, 118, 1107–1127.

Boivin, J. (2006). “Has U.S. Monetary Policy Changed? Evidence from Drifting Coefficients and Real-Time Data,” Journal of Money, Credit and Banking, 38, 1149–1173.

Boivin, J. and Giannoni, M.P. (2006). “Has Monetary Policy Become More Effective?” Review of Economics and Statistics, 88, 445–462.

Bollerslev, T. (1986). “Generalized Autoregressive Conditional Heteroskedasticity,” Journal of Econometrics, 31, 307–327.

Bollerslev, T. (1987). “A Conditionally Heteroskedastic Time Series Model for Speculative Prices and Rates of Return,” Review of Economics and Statistics, 69, 542–547.

Bollerslev, T. (1990). “Modeling the Coherence in Short-Run Nominal Exchange Rates: A Multivariate Generalized ARCH Model,” Review of Economics and Statistics, 72, 498–505.

Bollerslev, T., Chou, R.Y. and Kroner, K.F. (1992). “ARCH Modeling in Finance: A Selective Review of the Theory and Empirical Evidence,” Journal of Econometrics, 52, 5–59.

Bollerslev, T., Domowitz, I. and Wang, I. (1997). “Order Flow and the Bid-Ask Spread: An Empirical Probability Model of Screen-Based Trading,” Journal of Economic Dynamics and Control, 1471–1491.

Bollerslev, T. and Engle, R.F. (1986). “Modeling the Persistence of Conditional Variances,” Econometric Reviews, 5, 1–50.

Bollerslev, T., Engle, R.F. and Nelson, D.B. (1994). “ARCH Models.” In Handbook of Econometrics, Volume IV, (R.F. Engle and D. McFadden, eds), Amsterdam: North-Holland, 2959–3038.

Bollerslev, T., Engle, R.F. and Wooldridge, J.M. (1988). “A Capital Asset Pricing Model with Time Varying Covariances,” Journal of Political Economy, 96, 116–131.

Bollerslev, T. and Ghysels, E. (1996). “Periodic Autoregressive Conditional Heteroskedasticity,” Journal of Business and Economic Statistics, 14, 139–151.


Bollerslev, T. and Melvin, M. (1994). “Bid-Ask Spreads and Volatility in the Foreign Exchange Market: An Empirical Analysis,” Journal of International Economics, 355–372.

Bollerslev, T. and Mikkelsen, H.O. (1996). “Modeling and Pricing Long Memory in Stock Market Volatility,” Journal of Econometrics, 73, 151–184.

Bollerslev, T. and Wooldridge, J.M. (1992). “Quasi-Maximum Likelihood Estimation and Inference in Dynamic Models with Time Varying Covariances,” Econometric Reviews, 11, 143–172.

Boswijk, H.P. (1992). Cointegration, Identification and Exogeneity, Vol. 37 of Tinbergen Institute Research Series. Amsterdam: Thesis Publishers.

Boswijk, H.P. and Doornik, J.A. (2004). “Identifying, Estimating and Testing Restricted Cointegrated Systems: An Overview,” Statistica Neerlandica, 58, 440–465.

Boudoukh, J., Richardson, M., Smith, T. and Whitelaw, R.F. (1999a). “Ex Ante Bond Returns and the Liquidity Preference Hypothesis,” Journal of Finance, 54, 1153–1167.

Boudoukh, J., Richardson, M., Smith, T. and Whitelaw, R.F. (1999b). “Bond Returns and Regime Shifts,” Working Paper, NYU.

Bowley, A.L. (1920). Elements of Statistics. New York: Charles Scribner’s Sons.

Box, G.E.P. and Jenkins, G.M. (1976). Time Series Analysis: Forecasting and Control. Revised edition. San Francisco: Holden-Day.

Brandt, M.W. and Jones, C.S. (2006). “Volatility Forecasting with Range-Based EGARCH Models,” Journal of Business and Economic Statistics, 24, 470–486.

Breeden, D. and Litzenberger, R. (1978). “Prices of State-Contingent Claims Implicit in Option Prices,” Journal of Business, 51, 621–652.

Brennan, M. and Schwartz, E. (1979). “A Continuous Time Approach to the Pricing of Bonds,” Journal of Banking and Finance, 3, 133–155.

Brenner, R.J., Harjes, R.H. and Kroner, K.F. (1996). “Another Look at Models of the Short-Term Interest Rate,” Journal of Financial and Quantitative Analysis, 31, 85–107.

Brockwell, P., Chadraa, E. and Lindner, A. (2006). “Continuous-Time GARCH Processes,” Annals of Applied Probability, 16, 790–826.

Brooks, C. (2002). Introductory Econometrics for Finance. Cambridge, UK: Cambridge University Press.

Brooks, R. and Catao, L. (2000). “The New Economy and Global Stock Returns,” IMF Working Paper 00/216, Washington: International Monetary Fund.


Brooks, R. and del Negro, M. (2002). “International Diversification Strategies,” Working Paper 2002–23, Federal Reserve Bank of Atlanta.

Brown, H.J. (1969). “Shift Share Projections of Regional Growth: An Empirical Test,” Journal of Regional Science, 9, 1–18.

Brown, S. (1986). Post-Sample Forecasting Comparisons and Model Selection Procedures, Ph.D. dissertation, University of California, San Diego.

Brown, S., Coulson, N.E. and Engle, R. (1991). “Noncointegration and Econometric Evaluation of Models of Regional Shift and Share,” Working Paper.

Brown, S., Coulson, N.E. and Engle, R. (1992). “On the Determination of Regional Base and Regional Base Multipliers,” Regional Science and Urban Economics, 27, 619–635.

Brown, S. and Dybvig, P. (1986). “The Empirical Implications of the Cox, Ingersoll, Ross Theory of the Term Structure of Interest Rates,” Journal of Finance, 41, 617–630.

Bu, R. and Hadri, K. (2007). “Estimating Option Implied Risk-Neutral Densities using Spline and Hypergeometric Functions,” Econometrics Journal, 10, 216–244.

Buchen, P.W. and Kelly, M. (1996). “The Maximum Entropy Distribution of an Asset Inferred from Option Prices,” Journal of Financial and Quantitative Analysis, 31, 143–159.

Burns, P. (2005). “Multivariate GARCH with Only Univariate Estimation,” http://www.burns-stat.com.

Cai, J. (1994). “A Markov Model of Switching-Regime ARCH,” Journal of Business and Economic Statistics, 12, 309–316.

Calvet, L.E., Fisher, A.J. and Thompson, S.B. (2006). “Volatility Comovement: A Multifrequency Approach,” Journal of Econometrics, 131, 179–215.

Campa, J.M., Chang, P.H.K. and Reider, R.L. (1998). “Implied Exchange Rate Distributions: Evidence from OTC Option Markets,” Journal of International Money and Finance, 17, 117–160.

Campbell, J.Y. and Shiller, R. (1991). “Yield Spreads and Interest Rate Movements: A Bird’s Eye View,” Review of Economic Studies, 58, 495–514.

Campbell, J.Y., Lo, A.W. and Mackinlay, A.C. (1997). The Econometrics of Financial Markets, Princeton: Princeton University Press.

Caporin, M. and McAleer, M. (2006). “Dynamic Asymmetric GARCH,” Journal of Financial Econometrics, 4, 385–412.


Cappiello, L., Engle, R.F. and Sheppard, K. (2006). “Asymmetric Dynamics in the Correlations of Global Equity and Bond Returns,” Journal of Financial Econometrics, 4, 537–572.

Carlino, G.A., DeFina, R. and Sill, K. (2001). “Sectoral Shocks and Metropolitan Employment Growth,” Journal of Urban Economics, 50, 396–417.

Carlino, G.A. and Mills, L.O. (1993). “Are U.S. Regional Incomes Converging? A Time Series Analysis,” Journal of Monetary Economics, 32, 335–346.

Carter, C.K. and Kohn, R. (1994). “On Gibbs Sampling for State Space Models,” Biometrika, 81, 541–553.

Castle, J.L., Fawcett, N.W.P. and Hendry, D.F. (2009). “Forecasting with Equilibrium-Correction Models During Structural Breaks,” Journal of Econometrics, forthcoming.

Castle, J.L. and Hendry, D.F. (2008). “Forecasting UK Inflation: The Roles of Structural Breaks and Time Disaggregation.” In Forecasting in the Presence of Structural Breaks and Model Uncertainty, (D.E. Rapach and M.E. Wohar, eds), Bingley: Emerald, 41–92.

Cavaglia, S., Brightman, C. and Aked, M. (2000). “The Increasing Importance of Industry Factors,” Financial Analysts Journal, 41–54.

Chan, N.H. (2002). Time Series: Applications to Finance. New York: John Wiley and Sons, Inc.

Chan, K.C., Karolyi, A., Longstaff, F. and Sanders, A. (1992). “An Empirical Comparison of Alternative Models of the Short-Term Interest Rate,” Journal of Finance, 47, 1209–1227.

Chapman, D.A. and Pearson, N.D. (2000). “Is the Short Rate Drift Actually Nonlinear?” Journal of Finance, 55, 355–388.

Chen, X. and Ghysels, E. (2007). “News – Good or Bad – and its Impact over Multiple Horizons,” Unpublished paper: Department of Economics, University of North Carolina at Chapel Hill.

Chen, R.-R. and Scott, L. (1992). “Pricing Interest Rate Options in a Two-Factor Cox-Ingersoll-Ross Model of the Term Structure,” Review of Financial Studies, 5, 613–636.

Chen, X. and Shen, X. (1998). “Sieve Extremum Estimates for Weakly Dependent Data,” Econometrica, 66, 289–314.

Chernov, M. and Mueller, P. (2007). “The Term Structure of Inflation Forecasts,” Working Paper, London Business School.

Chesher, A. and Irish, M. (1987). “Residual Analysis in the Grouped and Censored Normal Linear Model,” Journal of Econometrics, 34, 33–61.


Chicago Board Options Exchange (2003). VIX CBOE Volatility Index. http://www.cboe.com/micro/vix/vixwhite.pdf.

Chou, R.Y. (2005). “Forecasting Financial Volatilities with Extreme Values: The Conditional Autoregressive Range (CARR) Model,” Journal of Money, Credit and Banking, 37, 561–582.

Chow, G.C. (1960). “Tests of Equality Between Sets of Coefficients in Two Linear Regressions,” Econometrica, 28, 591–605.

Christodoulakis, G.A. and Satchell, S.E. (2002). “Correlated ARCH (CorrARCH): Modelling Time-Varying Conditional Correlation Between Financial Asset Returns,” European Journal of Operational Research, 139, 351–370.

Christoffersen, P.F. (2003). Elements of Financial Risk Management. San Diego: Academic Press.

Christoffersen, P.F. and Diebold, F.X. (1996). “Further Results on Forecasting and Model Selection Under Asymmetric Loss,” Journal of Applied Econometrics, 11, 561–72.

Christoffersen, P.F. and Diebold, F.X. (1997). “Optimal Prediction Under Asymmetric Loss,” Econometric Theory, 13, 808–817.

Christoffersen, P.F. and Jacobs, K. (2004). “The Importance of the Loss Function in Option Valuation,” Journal of Financial Economics, 72, 291–318.

Clarida, R., Galí, J. and Gertler, M. (2000). “Monetary Policy Rules and Macroeconomic Stability: Evidence and Some Theory,” Quarterly Journal of Economics, 115, 147–180.

Clark, P.K. (1973). “A Subordinated Stochastic Process Model with Finite Variance for Speculative Prices,” Econometrica, 41, 135–156.

Clements, M.P. and Hendry, D.F. (1994). “Towards a Theory of Economic Forecasting.” In Non-stationary Time-series Analysis and Cointegration, (Hargreaves, C., ed), Oxford: Oxford University Press, 9–52.

Clements, M.P. and Hendry, D.F. (1999). Forecasting Non-stationary Economic Time Series. Cambridge, Mass.: MIT Press.

Cleveland, W.S. (1979). “Robust Locally Weighted Regression and Smoothing Scatterplots,” Journal of the American Statistical Association, 74, 829–836.

Collin-Dufresne, P., Goldstein, R. and Jones, C. (2006). “Can Interest Rate Volatility be Extracted from the Cross Section of Bond Yields? An Investigation of Unspanned Stochastic Volatility,” Working Paper, U.C. Berkeley.

Conley, T., Hansen, L., Luttmer, E. and Scheinkman, J. (1995). “Estimating Subordinated Diffusions from Discrete Time Data,” The Review of Financial Studies, 10, 525–577.


Conrad, C. and Karanasos, M. (2006). “The Impulse Response Function of the Long Memory GARCH Process,” Economics Letters, 90, 34–41.

Coppejans, M. and Domowitz, I. (2002). “An Empirical Analysis of Trades, Orders, and Cancellations in a Limit Order Market,” Discussion Paper, Duke University.

Corradi, V. and Distaso, W. (2006). “Semiparametric Comparison of Stochastic Volatility Models Using Realized Measures,” Review of Economic Studies, 73, 635–667.

Coulson, N.E. (1993). “The Sources of Sectoral Fluctuations in Metropolitan Areas,” Journal of Urban Economics, 33, 76–94.

Coulson, N.E. (1999). “Housing Inventory and Completion,” Journal of Real Estate Finance and Economics, 18, 89–106.

Coulson, N.E. (1999). “Sectoral Sources of Metropolitan Growth,” Regional Science and Urban Economics, 39, 723–743.

Cox, J.C., Ingersoll, J.E. Jr. and Ross, S.A. (1985). “A Theory of the Term Structure of Interest Rates,” Econometrica, 53, 385–408.

Crone, T.M. (2005). “An Alternative Definition of Economic Regions in the United States Based on Similarities in State Business Cycles,” The Review of Economics and Statistics, 87, 617–626.

Crone, T.M. and Clayton-Matthews, A. (2005). “Consistent Economic Indexes for the 50 States,” The Review of Economics and Statistics, 87, 593–603.

Crouhy, H. and Rockinger, M. (1997). “Volatility Clustering, Asymmetry and Hysteresis in Stock Returns: International Evidence,” Financial Engineering and the Japanese Markets, 4, 1–35.

Crow, E.L. and Siddiqui, M.M. (1967). “Robust Estimation of Location,” Journal of the American Statistical Association, 62, 353–389.

Dai, Q. and Singleton, K.J. (2000). “Specification Analysis of Affine Term Structure Models,” Journal of Finance, 55, 1943–1978.

Davidson, J. (2004). “Moment and Memory Properties of Linear Conditional Heteroskedasticity Models, and a New Model,” Journal of Business and Economic Statistics, 22, 16–29.

Davies, R. (1977). “Hypothesis Testing When a Nuisance Parameter is Present Only Under the Alternative,” Biometrika, 64, 247–254.

de Jong, R.M. (1996). “The Bierens Test Under Data Dependence,” Journal of Econometrics, 72, 1–32.

Degiannakis, S. and Xekalaki, E. (2004). “Autoregressive Conditional Heteroscedasticity (ARCH) Models: A Review,” Quality Technology and Quantitative Management, 1, 271–324.


den Hertog, R.G.J. (1994). “Pricing of Permanent and Transitory Volatility for U.S. Stock Returns: A Composite GARCH Model,” Economics Letters, 44, 421–426.

Derman, E. and Kani, I. (1994). “Riding on a Smile,” RISK, 7 (Feb.), 32–39.

Derman, E. and Kani, I. (1998). “Stochastic Implied Trees: Arbitrage Pricing with Stochastic Term and Strike Structure of Volatility,” International Journal of Theoretical and Applied Finance, 1, 61–110.

Diebold, F.X. (1988). Empirical Modeling of Exchange Rate Dynamics. New York: Springer-Verlag.

Diebold, F.X. (2003). “The ET Interview: Professor Robert F. Engle, January 2003,” Econometric Theory, 19, 1159–1193.

Diebold, F.X. (2004). “The Nobel Memorial Prize for Robert F. Engle,” Scandinavian Journal of Economics, 106, 165–185.

Diebold, F.X. and Lopez, J. (1995). “Modeling Volatility Dynamics.” In Macroeconometrics: Developments, Tensions and Prospects, (K. Hoover, ed), Boston: Kluwer Academic Press, 427–472.

Diebold, F.X. and Nerlove, M. (1989). “The Dynamics of Exchange Rate Volatility: A Multivariate Latent Factor ARCH Model,” Journal of Applied Econometrics, 4, 1–21.

Diebold, F.X., Rudebusch, G.D. and Aruoba, B. (2006). “The Macroeconomy and the Yield Curve: A Dynamic Latent Factor Approach,” Journal of Econometrics, 131, 309–338.

Diemeier, J. and Solnik, J. (2001). “Global Pricing of Equity,” Financial Analysts Journal, 57, 37–47.

Ding, Z. and Engle, R.F. (2001). “Large Scale Conditional Covariance Modeling, Estimation and Testing,” Academia Economic Papers, 29, 157–184.

Ding, Z., Engle, R.F. and Granger, C.W.J. (1993). “A Long Memory Property of Stock Market Returns and a New Model,” Journal of Empirical Finance, 1, 83–106.

Donaldson, R.G. and Kamstra, M. (1997). “An Artificial Neural Network GARCH Model for International Stock Return Volatility,” Journal of Empirical Finance, 4, 17–46.

Doornik, J.A. (2001). Ox: Object Oriented Matrix Programming, 5.0. London: Timberlake Consultants Press.

Doornik, J.A. (2007a). “Econometric Modelling When There are More Variables than Observations.” Working Paper, Economics Department, University of Oxford.

Doornik, J.A. (2007b). Object-Oriented Matrix Programming using Ox, 6th edn. London: Timberlake Consultants Press.


Doornik, J.A. (2009). “Autometrics.” Working Paper, Economics Department, University of Oxford.

Doornik, J.A. and Hansen, H. (2008). “A Practical Test for Univariate and Multivariate Normality,” Discussion Paper, Nuffield College.

Driffill, J. and Sola, M. (1994). “Testing the Term Structure of Interest Rates from a Stationary Switching Regime VAR,” Journal of Economic Dynamics and Control, 18, 601–628.

Drost, F.C. and Nijman, T.E. (1993). “Temporal Aggregation of GARCH Processes,” Econometrica, 61, 909–927.

Duan, J. (1997). “Augmented GARCH(p,q) Process and its Diffusion Limit,” Journal of Econometrics, 79, 97–127.

Duchesne, P. and Lalancette, S. (2003). “On Testing for Multivariate ARCH Effects in Vector Time Series Models,” La Revue Canadienne de Statistique, 31, 275–292.

Duffie, D. (1988). Security Markets: Stochastic Models, Academic Press, Boston.

Duffie, D. and Kan, R. (1996). “A Yield-Factor Model of Interest Rates,” Mathematical Finance, 6, 379–406.

Duffie, D., Ma, J. and Yong, J. (1995). “Black’s Consol Rate Conjecture,” Annals of Applied Probability, 5, 356–382.

Dumas, B., Fleming, J. and Whaley, R.E. (1998). “Implied Volatility Functions: Empirical Tests,” Journal of Finance, 53, 2059–2106.

Dunn, E.S., Jr. (1960). “A Statistical and Analytical Technique for Regional Analysis,” Regional Science Association Papers and Proceedings, 6, 97–112.

Dupire, B. (1994). “Pricing and Hedging with Smiles,” RISK, 7 (Jan), 18–20.

Eichengreen, B. and Tong, H. (2004). “Stock Market Volatility and Monetary Policy: What the Historical Record Shows,” Paper presented at the Central Bank of Australia.

Elder, J. and Serletis, A. (2006). “Oil Price Uncertainty,” Working Paper, North Dakota State University.

Elliott, G., Komunjer, I. and Timmermann, A. (2005). “Estimation and Testing of Forecast Rationality under Flexible Loss,” Review of Economic Studies, 72, 1107–1125.

Elliott, G., Komunjer, I. and Timmermann, A. (2008). “Biases in Macroeconomic Forecasts: Irrationality or Asymmetric Loss?” Journal of the European Economic Association, 6, 122–157.

Embrechts, P., Klüppelberg, C. and Mikosch, T. (1997). Modelling Extremal Events for Insurance and Finance. Springer.

Emmerson, R., Ramanathan, R. and Ramm, W. (1975). “On the Analysis of Regional Growth Patterns,” Journal of Regional Science, 15, 17–28.

Enders, W. (2004). Applied Econometric Time Series. Hoboken, NJ: John Wiley and Sons, Inc.

Engel, C. and Hamilton, J.D. (1990). “Long Swings in the Dollar: Are they in the Data and do the Markets Know it?,” American Economic Review, 80, 689–713.

Engle, R.F. (1978a). “Testing Price Equations for Stability Across Frequency Bands,” Econometrica, 46, 869–881.

Engle, R.F. (1978b). “Estimating Structural Models of Seasonality.” In Seasonal Analysis of Economic Time Series, (A. Zellner, ed.). U.S. Department of Commerce, Bureau of Census.

Engle, R.F. (1982a). “Autoregressive Conditional Heteroskedasticity with Estimates of the Variance of U.K. Inflation,” Econometrica, 50, 987–1008.

Engle, R.F. (1982b). “A General Approach to Lagrange Multiplier Model Diagnostics,” Journal of Econometrics, 20, 83–104.

Engle, R.F. (1990). “Discussion: Stock Market Volatility and the Crash of ’87,” Review of Financial Studies, 3, 103–106.

Engle, R.F. (1995). ARCH: Selected Readings. Oxford, UK: Oxford University Press.

Engle, R.F. (2001). “GARCH 101: The Use of ARCH/GARCH Models in Applied Econometrics,” Journal of Economic Perspectives, 15, 157–168.

Engle, R.F. (2002a). “Dynamic Conditional Correlation: A Simple Class of Multivariate GARCH Models,” Journal of Business and Economic Statistics, 20, 339–350.

Engle, R.F. (2002b). “New Frontiers for ARCH Models,” Journal of Applied Econometrics, 17, 425–446.

Engle, R.F. (2004). “Nobel Lecture. Risk and Volatility: Econometric Models and Financial Practice,” American Economic Review, 94, 405–420.

Engle, R.F. and Bollerslev, T. (1986). “Modeling the Persistence of Conditional Variances,” Econometric Reviews, 5, 1–50.

Engle, R.F. and Ferstenberg, R. (2007). “Execution Risk,” Journal of Portfolio Management, 34–45.

Engle, R.F., Ferstenberg, R. and Russell, J. (2008). “Measuring and Modeling Execution Cost and Risk,” University of Chicago Booth School of Business, Working Paper.

Engle, R.F. and Gallo, J.P. (2006). “A Multiple Indicator Model for Volatility Using Intra Daily Data,” Journal of Econometrics, 131, 3–27.

Engle, R.F., Ghysels, E. and Sohn, B. (2006). “On the Economic Sources of Stock Market Volatility,” Manuscript, New York University.

Engle, R.F. and Gonzalez-Rivera, G. (1991). “Semi-Parametric ARCH Models,” Journal of Business and Economic Statistics, 9, 345–359.

Engle, R.F. and Granger, C.W.J. (1987). “Cointegration and Error Correction: Representation, Estimation and Testing,” Econometrica, 55, 251–276.

Engle, R.F. and Hendry, D.F. (1993). “Testing Super Exogeneity and Invariance in Regression Models,” Journal of Econometrics, 56, 119–139.

Engle, R.F., Hendry, D.F. and Richard, J.-F. (1983). “Exogeneity,” Econometrica, 51, 277–304.

Engle, R.F., Ito, T. and Lin, W.L. (1990). “Meteor Showers or Heat Waves? Heteroskedastic Intra-Daily Volatility in the Foreign Exchange Market,” Econometrica, 58, 525–542.

Engle, R.F. and Kroner, K.F. (1995). “Multivariate Simultaneous Generalized GARCH,” Econometric Theory, 11, 122–150.

Engle, R.F. and Lee, G.G.J. (1999). “A Permanent and Transitory Component Model of Stock Return Volatility.” In Cointegration, Causality, and Forecasting: A Festschrift in Honor of Clive W.J. Granger, (R.F. Engle and H. White eds), Oxford, UK: Oxford University Press, 475–497.

Engle, R.F., Lilien, D. and Robins, R. (1987). “Estimating Time-Varying Risk Premia in the Term Structure: The ARCH-M Model,” Econometrica, 55, 391–407.

Engle, R.F., Lilien, D.M. and Watson, M.W. (1985). “A DYMIMIC Model of Housing Price Determination,” Journal of Econometrics, 28, 307–326.

Engle, R.F. and Manganelli, S. (2004). “CAViaR: Conditional Autoregressive Value-at-Risk by Regression Quantiles,” Journal of Business and Economic Statistics, 22, 367–381.

Engle, R.F. and Mezrich, J. (1996). “GARCH for Groups,” Risk, 9, 36–40.

Engle, R.F. and Ng, V.K. (1993). “Measuring and Testing the Impact of News on Volatility,” Journal of Finance, 48, 1749–1778.

Engle, R.F. and Ng, V. (1993). “Time-Varying Volatility and the Dynamic Behavior of the Term Structure,” Journal of Money, Credit, and Banking, 25, 336–349.

Engle, R.F., Ng, V.K. and Rothschild, M. (1990). “Asset Pricing with a Factor-ARCH Covariance Structure: Empirical Estimates for Treasury Bills,” Journal of Econometrics, 45, 213–238.

Engle, R.F. and Patton, A.J. (2001). “What Good is a Volatility Model?” Quantitative Finance, 1, 237–245.

Engle, R.F. and Rangel, J.G. (2008). “The Spline-GARCH Model for Low Frequency Volatility and its Global Macroeconomic Causes,” Review of Financial Studies, 21, 1187–1222.

Engle, R.F. and Rosenberg, J. (1995). “GARCH Gamma,” Journal of Derivatives, 17, 229–247.

Engle, R.F. and Rothschild, M. (1992). “Statistical Models for Financial Volatility,” Journal of Econometrics, 52, 1–311.

Engle, R.F. and Russell, J.R. (1998). “Autoregressive Conditional Duration: A New Model for Irregularly Spaced Transaction Data,” Econometrica, 66, 1127–1162.

Engle, R.F. and Russell, J.R. (2005). “A Discrete-State Continuous-Time Model of Financial Transactions Prices and Times: The ACM–ACD Model,” Journal of Business and Economic Statistics, 23, 166–180.

Engle, R.F. and Sheppard, K. (2001). “Theoretical and Empirical Properties of Dynamic Conditional Correlation Multivariate GARCH,” Mimeo, UC San Diego.

Engle, R.F. and Watson, M.W. (1981). “A One-Factor Multivariate Time Series Model of Metropolitan Wage Rates,” Journal of the American Statistical Association, 76, 774–781.

Engle, R.F. and Watson, M.W. (1983). “Alternative Algorithms for Estimation of Dynamic MIMIC, Factor, and Time Varying Coefficient Regression Models,” Journal of Econometrics, 23, 385–400.

Ericsson, N.R. and Irons, J.S. (eds) (1994). Testing Exogeneity. Oxford: Oxford University Press.

Evans, M.D.D. and Lyons, R.K. (2007). “Exchange Rate Fundamentals and Order Flow.” Manuscript, Georgetown University and University of California, Berkeley.

Fama, E.F. (1986). “Term Premiums and Default Premiums in Money Markets,” Journal of Financial Economics, 17, 175–196.

Fama, E.F. and Bliss, R. (1987). “The Information in Long Maturity Forward Rates,” American Economic Review, 77, 680–692.

Favero, C. and Hendry, D.F. (1992). “Testing the Lucas Critique: A Review,” Econometric Reviews, 11, 265–306.

Fiorentini, G., Sentana, E. and Shephard, N. (2004). “Likelihood-Based Estimation of Latent Generalized ARCH Structures,” Econometrica, 72, 1481–1517.

Fishburn, P.C. (1977). “Mean-Risk Analysis with Risk Associated Below Target Variance,” American Economic Review, 67, 116–126.

Forbes, K. and Chinn, M. (2004). “A Decomposition of Global Linkages in Financial Markets Over Time,” Review of Economics and Statistics, 86, 705–722.

Forbes, K. and Rigobon, R. (2001). “No Contagion, Only Interdependence: Measuring Stock Market Co-Movements,” Journal of Finance, 57, 2223–2261.

Fornari, F. and Mele, A. (1996). “Modeling the Changing Asymmetry of Conditional Variances,” Economics Letters, 50, 197–203.

Forni, M. and Reichlin, L. (1998). “Let’s Get Real: A Dynamic Factor Analytical Approach to Disaggregated Business Cycle,” Review of Economic Studies, 65, 453–474.

Foster, D. and Nelson, D.B. (1996). “Continuous Record Asymptotics for Rolling Sample Estimators,” Econometrica, 64, 139–174.

Fountas, S. and Karanasos, M. (2007). “Inflation, Output Growth, and Nominal and Real Uncertainty: Empirical Evidence for the G7,” Journal of International Money and Finance, 26, 229–250.

Franses, P.H. and van Dijk, D. (2000). Non-Linear Time Series Models in Empirical Finance. Cambridge, UK: Cambridge University Press.

Friedman, M. (1977). “Nobel Lecture: Inflation and Unemployment,” Journal of Political Economy, 85, 451–472.

Friedman, B.M., Laibson, D.I. and Minsky, H.P. (1989). “Economic Implications of Extraordinary Movements in Stock Prices,” Brookings Papers on Economic Activity, 2, 137–189.

Frisch, R. (1933). “Propagation and Impulse Problems in Dynamic Economics,” Essays in Honor of Gustav Cassel, London.

Frisch, R. and Waugh, F.V. (1933). “Partial Time Regression as Compared with Individual Trends,” Econometrica, 1, 221–223.

Gallant, A.R. and Nychka, D.W. (1987). “Semi-Nonparametric Maximum Likelihood Estimation,” Econometrica, 55, 363–390.

Gallant, A.R. and Tauchen, G. (1998). “SNP: A Program for Nonparametric Time Series Analysis,” an online guide available at www.econ.duke.edu/~get/wpapers/index.html.

Garratt, A., Lee, K., Pesaran, M.H. and Shin, Y. (2003). “A Long Run Structural Macroeconometric Model of the UK,” Economic Journal, 113, 412–455.

Garratt, A., Lee, K., Pesaran, M.H. and Shin, Y. (2006). Global and National Macroeconometric Modelling: A Long-Run Structural Approach. Oxford: Oxford University Press.

Gemmill, G. and Saflekos, A. (2000). “How Useful are Implied Distributions? Evidence from Stock-Index Options,” Journal of Derivatives, 7, 83–98.

Geweke, J. (1977). “The Dynamic Factor Analysis of Economic Time Series.” In Latent Variables in Socio-Economic Models, (D.J. Aigner and A.S. Goldberger, eds), Amsterdam: North-Holland.

Geweke, J. (1986). “Modeling the Persistence of Conditional Variances: A Comment,” Econometric Reviews, 5, 57–61.

Ghysels, E., Santa-Clara, P. and Valkanov, R. (2005). “There is a Risk-Return Tradeoff After All,” Journal of Financial Economics, 76, 509–548.

Ghysels, E., Santa-Clara, P. and Valkanov, R. (2006). “Predicting Volatility: How to Get the Most Out of Returns Data Sampled at Different Frequencies,” Journal of Econometrics, 131, 59–95.

Gibbons, M. and Ramaswamy, K. (1993). “A Test of the Cox, Ingersoll and Ross Model of the Term Structure,” Review of Financial Studies, 6, 619–658.

Glosten, L. (1994). “Is the Electronic Open Limit Order Book Inevitable?,” Journal of Finance, 1127–1161.

Glosten, L.R., Jagannathan, R. and Runkle, D.E. (1993). “On the Relationship between the Expected Value and the Volatility of the Nominal Excess Return on Stocks,” Journal of Finance, 48, 1779–1801.

Godfrey, L.G. (1978). “Testing for Higher Order Serial Correlation in Regression Equations When the Regressors Include Lagged Dependent Variables,” Econometrica, 46, 1303–1313.

Gonzalez-Rivera, G. (1998). “Smooth Transition GARCH Models,” Studies in Nonlinear Dynamics and Econometrics, 3, 61–78.

Gonzalez-Rivera, G., Senyuz, Z. and Yoldas, E. (2007). “Autocontours: Dynamic Specification Testing,” Mimeo, UC Riverside.

Goodman, J.L. (1986). “Reducing the Error in Monthly Housing Starts Estimates,” AREUEA Journal, 14, 557–566.

Gourieroux, C. and Jasiak, J. (2001). Financial Econometrics. Princeton, NJ: Princeton University Press.

Gourieroux, C. and Monfort, A. (1992). “Qualitative Threshold ARCH Models,” Journal of Econometrics, 52, 159–199.

Gourieroux, C., Monfort, A., Renault, E. and Trognon, A. (1987). “Generalized Residuals,” Journal of Econometrics, 34, 5–32.

Gourlay, A.R. and McKee, S. (1977). “The Construction of Hopscotch Methods for Parabolic and Elliptic Equations in Two Space Dimensions with a Mixed Derivative,” Journal of Computational and Applied Mathematics, 3, 201–206.

Granger, C.W.J. (1969). “Prediction with a Generalized Cost Function,” OR, 20, 199–207.

Granger, C.W.J. (1983). “Acronyms in Time Series Analysis (ATSA),” Journal of Time Series Analysis, 3, 103–107.

Granger, C.W.J. (1999). “Outline of Forecast Theory Using Generalized Cost Functions,” Spanish Economic Review, 1, 161–173.

Granger, C.W.J. (2008). “In Praise of Pragmatic Econometrics.” In The Methodology and Practice of Econometrics: A Festschrift in Honour of David F. Hendry, (J.L. Castle and N. Shephard, eds), Oxford University Press. Forthcoming.

Granger, C.W.J. and Machina, M.J. (2006). “Forecasting and Decision Theory.” In Handbook of Economic Forecasting, (G. Elliott, C.W.J. Granger and A. Timmermann eds), Amsterdam: North-Holland.

Gray, S.F. (1996). “Modeling the Conditional Distribution of Interest Rates as a Regime-Switching Process,” Journal of Financial Economics, 42, 27–62.

Grier, K.B. and Perry, M.J. (2000). “The Effects of Real and Nominal Uncertainty on Inflation and Output Growth: Some GARCH-M Evidence,” Journal of Applied Econometrics, 15, 45–58.

Griffin, J. and Karolyi, G.A. (1998). “Another Look at the Role of Industrial Structure of Markets for International Diversification Strategies,” Journal of Financial Economics, 50, 351–373.

Griffin, J. and Stulz, R. (2001). “International Competition and Exchange Rate Shocks: A Cross-Country Industry Analysis,” Review of Financial Studies, 14, 215–241.

Groen, J.J.J., Kapetanios, G. and Price, S. (2009). “Real Time Evaluation of Bank of England Forecasts for Inflation and Growth,” International Journal of Forecasting, 25, 74–80.

Guegan, D. and Diebolt, J. (1994). “Probabilistic Properties of the β-ARCH Model,” Statistica Sinica, 4, 71–87.

Guo, H. and Kliesen, K. (2005). “Oil Price Volatility and U.S. Macroeconomic Activity,” Federal Reserve Bank of St. Louis Review, Nov/Dec., 669–683.

Haldane, A. and Quah, D. (1999). “UK Phillips Curves and Monetary Policy,” Journal of Monetary Economics, 44, 259–278.

Hall, A. and Hautsch, N. (2004). “Order Aggressiveness and Order Book Dynamics,” Working Paper, University of Copenhagen.

Hamilton, J. (1988). “Rational Expectations Econometric Analysis of Changes in Regime: An Investigation of the Term Structure of Interest Rates,” Journal of Economic Dynamics and Control, 12, 365–423.

Hamilton, J.D. (1994). Time Series Analysis. Princeton: Princeton University Press.

Hamilton, J.D. (2008). “Daily Monetary Policy Shocks and New Home Sales,” Journal of Monetary Economics, 55, 1171–1190.

Hamilton, J.D. (2009). “Daily Changes in Fed Funds Futures Prices,” Journal of Money, Credit and Banking, 41, 567–582.

Hamilton, J. and Jorda, O. (2002). “A Model of the Federal Funds Rate Target,” Journal of Political Economy, 110, 1135–1167.

Hamilton, J.D. and Lin, G. (1996). “Stock Market Volatility and the Business Cycle,” Journal of Applied Econometrics, 11, 573–593.

Hamilton, J.D. and Susmel, R. (1994). “Autoregressive Conditional Heteroskedasticity and Changes in Regimes,” Journal of Econometrics, 64, 307–333.

Han, H. and Park, J.Y. (2008). “Time Series Properties of ARCH Processes with Persistent Covariates,” Journal of Econometrics, 146, 275–292.

Hansen, B.E. (1994). “Autoregressive Conditional Density Estimation,” International Economic Review, 35, 705–730.

Hansen, L.P. and Jagannathan, R. (1991). “Implications of Security Market Data for Models of Dynamic Economies,” Journal of Political Economy, 99, 225–262.

Hansen, P.R. and Lunde, A. (2006). “Consistent Ranking of Volatility Models,” Journal of Econometrics, 131, 97–121.

Hansen, L.P. and Scheinkman, J. (1995). “Back to the Future: Generating Moment Implications for Continuous Time Markov Processes,” Econometrica, 63, 767–804.

Harris, R.D.F., Stoja, E. and Tucker, J. (2007). “A Simplified Approach to Modeling the Comovement of Asset Returns,” Journal of Futures Markets, 27, 575–598.

Harrison, J.M. and Kreps, D.M. (1979). “Martingales and Arbitrage in Multiperiod Securities Markets,” Journal of Economic Theory, 20, 381–408.

Hartmann, P., Straetmans, S. and de Vries, C. (2004). “Asset Market Linkages in Crisis Periods,” Review of Economics and Statistics, 86, 313–326.

Harvey, A., Ruiz, E. and Sentana, E. (1992). “Unobserved Component Time Series Models with ARCH Disturbances,” Journal of Econometrics, 52, 129–157.

Harvey, C.R. and Siddique, A. (1999). “Autoregressive Conditional Skewness,” Journal of Financial and Quantitative Analysis, 34, 465–487.

Harvey, C.R. and Siddique, A. (2000). “Conditional Skewness in Asset Pricing Tests,” Journal of Finance, 55, 1263–1295.

Haug, S. and Czado, C. (2007). “An Exponential Continuous-Time GARCH Process,” Journal of Applied Probability, 44, 960–976.

Hausman, J., Lo, A.W. and MacKinlay, A.C. (1992). “An Ordered Probit Analysis of Transaction Stock Prices,” Journal of Financial Economics, 319–379.

Heath, D., Jarrow, R. and Morton, A. (1992). “Bond Pricing and the Term Structure of Interest Rates,” Econometrica, 60, 77–105.

Hendry, D.F. (1979). “Predictive Failure and Econometric Modelling in Macro-Economics: The Transactions Demand for Money.” In Economic Modelling, (Ormerod, P. ed), 217–242. London: Heinemann.

Hendry, D.F. (1988). “The Encompassing Implications of Feedback Versus Feedforward Mechanisms in Econometrics,” Oxford Economic Papers, 40, 132–149.

Hendry, D.F. (1995). Dynamic Econometrics. Oxford: Oxford University Press.

Hendry, D.F. (2000). “On Detectable and Non-Detectable Structural Change,” Structural Change and Economic Dynamics, 11, 45–65.

Hendry, D.F. (2006). “Robustifying Forecasts from Equilibrium-Correction Models,” Journal of Econometrics, 135, 399–426.

Hendry, D.F. (2009). “The Methodology of Empirical Econometric Modeling: Applied Econometrics Through the Looking-Glass.” In Palgrave Handbook of Econometrics, (Mills, T.C. and Patterson, K.D. eds), Basingstoke: Palgrave MacMillan.

Hendry, D.F. and Doornik, J.A. (1994). “Modelling Linear Dynamic Econometric Systems,” Scottish Journal of Political Economy, 41, 1–33.

Hendry, D.F. and Doornik, J.A. (1997). “The Implications for Econometric Modelling of Forecast Failure,” Scottish Journal of Political Economy, 44, 437–461.

Hendry, D.F. and Ericsson, N.R. (1991). “Modeling the Demand for Narrow Money in the United Kingdom and the United States,” European Economic Review, 35, 833–886.

Hendry, D.F., Johansen, S. and Santos, C. (2008). “Automatic Selection of Indicators in a Fully Saturated Regression,” Computational Statistics, 33, 317–335. Erratum, 337–339.

Hendry, D.F. and Krolzig, H.-M. (2001). Automatic Econometric Model Selection. London: Timberlake Consultants Press.

Hendry, D.F. and Krolzig, H.-M. (2005). “The Properties of Automatic Gets Modelling,” Economic Journal, 115, C32–C61.

Hendry, D.F. and Massmann, M. (2007). “Co-breaking: Recent Advances and a Synopsis of the Literature,” Journal of Business and Economic Statistics, 25, 33–51.

Hendry, D.F. and Mizon, G.E. (1993). “Evaluating Dynamic Econometric Models by Encompassing the VAR.” In Models, Methods and Applications of Econometrics, (Phillips, P.C.B. ed), 272–300. Oxford: Basil Blackwell.

Hendry, D.F. and Santos, C. (2005). “Regression Models with Data-Based Indicator Variables,” Oxford Bulletin of Economics and Statistics, 67, 571–595.

Hendry, D.F. and Santos, C. (2007). “Automatic Index-Based Tests of Super Exogeneity.” Unpublished paper, Economics Department, University of Oxford.

Hentschel, L. (1995). “All in the Family: Nesting Symmetric and Asymmetric GARCH Models,” Journal of Financial Economics, 39, 71–104.

Heston, S.L. (1993). “A Closed-Form Solution for Options with Stochastic Volatility, with Applications to Bond and Currency Options,” Review of Financial Studies, 6, 327–343.

Heston, S.L. and Nandi, S. (2000). “A Closed-Form GARCH Option Valuation Model,” Review of Financial Studies, 13, 585–625.

Heston, S.L. and Rouwenhorst, G. (1994). “Does Industrial Structure Explain the Benefits of International Diversification?,” Journal of Financial Economics, 36, 3–27.

Higgins, M.L. and Bera, A.K. (1992). “A Class of Nonlinear ARCH Models,” International Economic Review, 33, 137–158.

Hille, E. and Phillips, R. (1957). Functional Analysis and Semigroups, American Mathematical Society, Providence, R.I.

HM Treasury (1994). “Economic Forecasting in the Treasury,” Government Economic Service Working Paper No. 121. London: HM Treasury.

Hogan, W.W. and Warren, J.M. (1972). “Computation of the Efficient Boundary in the E-S Portfolio Selection Model,” Journal of Financial and Quantitative Analysis, 7, 1881–1896.

Hogan, W.W. and Warren, J.M. (1974). “Toward the Development of an Equilibrium Capital-Market Model Based on Semivariance,” Journal of Financial and Quantitative Analysis, 9, 1–11.

Hoover, K.D. and Perez, S.J. (1999). “Data Mining Reconsidered: Encompassing and the General-to-specific Approach to Specification Search,” Econometrics Journal, 2, 167–191.

Horvath, M. and Verbrugge, R. (1996). “Shocks and Sectoral Interactions: An Empirical Investigation,” unpublished.

Huang, X. and Tauchen, G. (2005). “The Relative Contribution of Jumps to Total Price Variation,” Journal of Financial Econometrics, 3, 456–499.

Huber, P.J. (1967). “The Behavior of Maximum Likelihood Estimates Under Nonstandard Conditions,” Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 221–233.

Huizinga, J. and Mishkin, F.S. (1986). “Monetary Policy Regime Shifts and the Unusual Behavior of Real Interest Rates,” Carnegie-Rochester Conference Series on Public Policy, 24, 231–274.

Hwang, S. and Satchell, S.E. (1999). “Modelling Emerging Market Risk Premia Using Higher Moments,” International Journal of Finance and Economics, 4, 271–296.

Hwang, S. and Satchell, S.E. (2005). “GARCH Model with Cross-Sectional Volatility: GARCHX Models,” Applied Financial Economics, 15, 203–216.

Ingersoll, J. (1987). Theory of Financial Decision Making, Rowman and Littlefield, Totowa, NJ.

International Monetary Fund (2000). World Economic Outlook – Asset Prices and the Business Cycle, Washington, D.C.: International Monetary Fund.

Jackwerth, J.C. (1997). “Generalized Binomial Trees,” Journal of Derivatives, 5 (Winter), 7–17.

Jackwerth, J.C. (2000). “Recovering Risk Aversion from Option Prices and Realized Returns,” Review of Financial Studies, 13, 433–451.

Jackwerth, J.C. (2004). Option-Implied Risk-Neutral Distributions and Risk Aversion. Charlottesville: Research Foundation of AIMR.

Jackwerth, J.C. and Rubinstein, M. (1996). “Recovering Probability Distributions from Option Prices,” Journal of Finance, 51, 1611–1631.

Jacod, J. (1994). Limit of random measures associated with the increments of a Brownian semimartingale. Preprint number 120, Laboratoire de Probabilités, Université Pierre et Marie Curie, Paris.

Jacod, J. (2007). “Statistics and high frequency data.” Unpublished paper.

Jacod, J., Li, Y., Mykland, P.A., Podolskij, M. and Vetter, M. (2007). “Microstructure noise in the continuous case: the pre-averaging approach.” Unpublished paper: Department of Statistics, University of Chicago.

Jalil, M. (2004). Essays on the Effect of Information on Monetary Policy, unpublished Ph.D. dissertation, UCSD.

Jansen, E.S. and Terasvirta, T. (1996). “Testing parameter constancy and super exogeneity in econometric equations,” Oxford Bulletin of Economics and Statistics, 58, 735–763.

Jarque, C.M. and Bera, A.K. (1987). “A Test for Normality of Observations and Regression Residuals,” International Statistical Review, 55, 163–172.

Jiang, G.J. and Tian, Y.S. (2005). “The Model-Free Implied Volatility and its Information Content,” Review of Financial Studies, 18, 1305–1342.

Johansen, S. (1995). Likelihood-Based Inference in Cointegrated Vector Autoregressive Models, Oxford: Oxford University Press.

Johansen, S. and Nielsen, B. (2009). “An Analysis of the Indicator Saturation Estimator as a Robust Regression Estimator.” In The Methodology and Practice of Econometrics, (Castle, J.L. and Shephard, N. eds), Oxford: Oxford University Press.

Jondeau, E. and Rockinger, M. (2006). “The Copula-GARCH Model of Conditional Dependencies: An International Stock Market Application,” Journal of International Money and Finance, 25, 827–853.

Judd, J.P. and Rudebusch, G.D. (1998). “Taylor’s Rule and the Fed: 1970–1997,” Federal Reserve Bank of San Francisco Review, 3, 3–16.

Kalliovirta, L. (2007). “Quantile Residuals for Multivariate Models,” Mimeo, University of Helsinki.

Karolyi, A. and Stulz, R. (1996). “Why Do Markets Move Together? An Investigation of U.S.-Japan Stock Return Comovements,” Journal of Finance, 51, 951–986.

Kavajecz, K. (1999). “A Specialist’s Quoted Depth and the Limit Book,” Journal of Finance, 747–771.

Kawakatsu, H. (2006). “Matrix Exponential GARCH,” Journal of Econometrics, 134, 95–128.

Kilian, L. and Goncalves, S. (2004). “Bootstrapping Autoregressions with Conditional Heteroskedasticity of Unknown Form,” Journal of Econometrics, 123, 89–120.

Kilian, L. and Park, C. (2007). “The Impact of Oil Prices on the US Stock Market,” CEPR Discussion Paper 6166.

Kim, C.-J. and Nelson, C.R. (1999a). “Has the US Economy Become More Stable? A Bayesian Approach Based on a Markov-Switching Model of the Business Cycle,” Review of Economics and Statistics, 81, 608–616.

Kim, C.-J. and Nelson, C.R. (1999b). State-Space Models with Regime Switching: Classical and Gibbs-Sampling Approaches with Applications, Cambridge, Mass.

Kim, S., Shephard, N. and Chib, S. (1998). “Stochastic Volatility: Likelihood Inference and Comparison with ARCH Models,” Review of Economic Studies, 65, 361–393.

Kim, T.-H. and White, H. (2003). “Estimation, Inference, and Specification Testing for Possibly Misspecified Quantile Regression.” In Maximum Likelihood Estimation of Misspecified Models: Twenty Years Later, (T. Fomby and C. Hill, eds.), New York: Elsevier, 107–132.

Kim, T.-H. and White, H. (2004). “On more robust estimation of skewness and kurtosis,” Finance Research Letters, 1, 56–73.

King, M., Sentana, E. and Wadhwani, S. (1994). “Volatility and Links Between National Stock Markets,” Econometrica, 62, 901–933.

Kinnebrock, S. and Podolskij, M. (2008). “A Note on the Central Limit Theorem for Bipower Variation of General Functions,” Stochastic Processes and Their Applications, 118, 1056–1070.

Klemkosky, R.C. and Pilotte, E.A. (1992). “Time-Varying Term Premiums on U.S. Treasury Bills and Bonds,” Journal of Monetary Economics, 30, 87–106.

Kluppelberg, C., Lindner, A. and Maller, R. (2004). “A Continuous Time GARCH Process Driven by a Levy Process: Stationarity and Second Order Behaviour,” Journal of Applied Probability, 41, 601–622.

Kodres, L.E. (1993). “Test of Unbiasedness in Foreign Exchange Futures Markets: An Examination of Price Limits and Conditional Heteroskedasticity,” Journal of Business, 66, 463–490.

Koenker, R. and Bassett, G. (1978). “Regression Quantiles,” Econometrica, 46, 33–50.

Komunjer, I. (2005). “Quasi-Maximum Likelihood Estimation for Conditional Quantiles,” Journal of Econometrics, 128, 127–164.

Komunjer, I. and Vuong, Q. (2006). “Efficient Conditional Quantile Estimation: The Time Series Case.” University of California, San Diego Department of Economics Discussion Paper 2006–10.

Komunjer, I. and Vuong, Q. (2007a). “Semiparametric Efficiency Bound and M-estimation in Time-Series Models for Conditional Quantiles.” University of California, San Diego Department of Economics Discussion Paper.

Komunjer, I. and Vuong, Q. (2007b). “Efficient Estimation in Dynamic Conditional Quantile Models.” University of California, San Diego Department of Economics Discussion Paper.

Koren, M. and Tenreyro, S. (2007). “Volatility and Development,” Quarterly Journal of Economics, 122, 243–287.

Kose, M.A., Prasad, E.S. and Terrones, M.E. (2006). “How Do Trade and Financial Integration Affect the Relationship Between Growth and Volatility?” Journal of International Economics, 69, 176–202.

Krolzig, H.-M. and Toro, J. (2002). “Testing for Super-Exogeneity in the Presence of Common Deterministic Shifts,” Annales d’Economie et de Statistique, 67/68, 41–71.

Lane, P.R. and Milesi-Ferretti, G.M. (2001). “The External Wealth of Nations: Measures of Foreign Assets and Liabilities for Industrial and Developing Countries,” Journal of International Economics, 55, 263–294.

Laurent, S. and Peters, J.P. (2002). “G@RCH 2.2: An Ox Package for Estimating and Forecasting Various ARCH Models,” Journal of Economic Surveys, 16, 447–485.

LeBaron, B. (1992). “Some Relations Between Volatility and Serial Correlation in Stock Market Returns,” Journal of Business, 65, 199–219.

Ledoit, O., Santa-Clara, P. and Wolf, M. (2003). “Flexible Multivariate GARCH Modeling with an Application to International Stock Markets,” Review of Economics and Statistics, 85, 735–747.

Lee, K., Ni, S. and Ratti, R.A. (1995). “Oil Shocks and the Macroeconomy: The Role of Price Variability,” Energy Journal, 16, 39–56.

Lee, L.F. (1999). “Estimation of Dynamic and ARCH Tobit Models,” Journal of Econometrics, 92, 355–390.

Lee, S. and Mykland, P.A. (2008). “Jumps in Financial Markets: A New Nonparametric Test and Jump Dynamics,” Review of Financial Studies, forthcoming.

Lee, S. and Taniguchi, M. (2005). “Asymptotic Theory for ARCH-SM Models: LAN and Residual Empirical Processes,” Statistica Sinica, 15, 215–234.

Lee, S.W. and Hansen, B.E. (1994). “Asymptotic Theory for the GARCH(1,1) Quasi-Maximum Likelihood Estimator,” Econometric Theory, 10, 29–52.

Lee, T.H. (1994). “Spread and Volatility in Spot and Forward Exchange Rates,” Journal of International Money and Finance, 13, 375–382.

Leon, A., Rubio, G. and Serna, G. (2004). “Autoregressive Conditional Volatility, Skewness and Kurtosis,” WP-AD 2004-13, Instituto Valenciano de Investigaciones Economicas.

Leon, A., Rubio, G. and Serna, G. (2005). “Autoregressive Conditional Volatility, Skewness and Kurtosis,” Quarterly Review of Economics and Finance, 45, 599–618.

Levine, R. (1997). “Financial Development and Economic Growth: Views and Agenda,” Journal of Economic Literature, 35, 688–726.

Lewis, A.L. (1990). “Semivariance and the Performance of Portfolios with Options,” Financial Analysts Journal, 67–76.

L’Her, J.F., Sy, O. and Yassine Tnani, M. (2002). “Country, Industry, and Risk Factor Loadings in Portfolio Management,” Journal of Portfolio Management, 28, 70–79.

Li, C.W. and Li, W.K. (1996). “On a Double Threshold Autoregressive Heteroskedastic Time Series Model,” Journal of Applied Econometrics, 11, 253–274.

Lin, W.-L., Engle, R.F. and Ito, T. (1994). “Do Bulls and Bears Move Across Borders? Transmission of International Stock Returns and Volatility,” Review of Financial Studies, 7, 507–538.

Ling, S. and Li, W.K. (1997). “Diagnostic Checking of Nonlinear Multivariate Time Series with Multivariate ARCH Errors,” Journal of Time Series Analysis, 18, 447–464.

Litterman, R. and Scheinkman, J. (1991). “Common Factors Affecting Bond Returns,” Journal of Fixed Income, 1, 54–61.

Liu, S.M. and Brorsen, B.W. (1995). “Maximum Likelihood Estimation of a GARCH Stable Model,” Journal of Applied Econometrics, 10, 272–285.

Lo, A. (1988). “Maximum Likelihood Estimation of Generalized Ito Processes with Discretely Sampled Data,” Econometric Theory, 4, 231–247.

Longin, F. and Solnik, B. (1995). “Is the Correlation in International Equity Returns Constant: 1970–1990?” Journal of International Money and Finance, 14, 3–26.


Longstaff, F. and Schwartz, E. (1992). “Interest Rate Volatility and the Term Structure: A Two-Factor General Equilibrium Model,” Journal of Finance, 47, 1259–1282.

Lucas, R.E. (1976). “Econometric Policy Evaluation: A Critique.” In The Phillips Curve and Labor Markets, (Brunner, K. and Meltzer, A. eds), Vol. 1 of Carnegie-Rochester Conferences on Public Policy, 19–46. Amsterdam: North-Holland Publishing Company.

Lumsdaine, R.L. (1996). “Consistency and Asymptotic Normality of the Quasi-Maximum Likelihood Estimator in IGARCH(1,1) and Covariance Stationary GARCH(1,1) Models,” Econometrica, 64, 575–596.

Lütkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Berlin: Springer-Verlag.

Maheu, J.M. and McCurdy, T.H. (2004). “News Arrival, Jump Dynamics and Volatility Components for Individual Stock Returns,” Journal of Finance, 59, 755–794.

Malz, A.M. (1997). “Estimating the Probability Distribution of the Future Exchange Rate from Options Prices,” Journal of Derivatives, 5, 18–36.

Mancini, C. (2001). “Disentangling the Jumps of the Diffusion in a Geometric Brownian Motion,” Giornale dell’Istituto Italiano degli Attuari, LXIV, 19–47.

Mao, J.C.T. (1970a). “Models of Capital Budgeting, E-V vs. E-S,” Journal of Financial and Quantitative Analysis, 4, 657–675.

Mao, J.C.T. (1970b). “Survey of Capital Budgeting: Theory and Practice,” Journal of Finance, 25, 349–360.

Markowitz, H. (1959). Portfolio Selection. New York: John Wiley & Sons.

McConnell, M.M. and Perez-Quiros, G. (2000). “Output Fluctuations in the United States: What Has Changed Since the Early 1980s?” American Economic Review, 90, 1464–1476.

McCulloch, J.H. (1985). “Interest-Risk Sensitive Deposit Insurance Premia: Stable ACH Estimates,” Journal of Banking and Finance, 9, 137–156.

McNees, S.K. (1979). “The Forecasting Record for the 1970s,” New England Economic Review, September/October 1979, 33–53.

McNeil, A.J., Frey, R. and Embrechts, P. (2005). Quantitative Risk Management. Princeton University Press.

McNeil, A.J. and Frey, R. (2000). “Estimation of Tail-Related Risk Measures for Heteroskedastic Financial Time Series: An Extreme Value Approach,” Journal of Empirical Finance, 7, 271–300.

Medeiros, M.C. and Veiga, A. (2009). “Modeling Multiple Regimes in Financial Volatility with a Flexible Coefficient GARCH(1,1) Model,” Econometric Theory, 25, 117–161.


Meenagh, D., Minford, P., Nowell, E., Sofat, P. and Srinivasan, N. (2009). “Can the Facts of UK Inflation Persistence be Explained by Nominal Rigidity?” Economic Modelling, 26, 978–992.

Melick, W.R. and Thomas, C.P. (1997). “Recovering an Asset’s Implied PDF from Option Prices: An Application to Crude Oil During the Gulf Crisis,” Journal of Financial and Quantitative Analysis, 32, 91–115.

Melliss, C. and Whittaker, R. (2000). “The Treasury’s Forecasts of GDP and the RPI: How Have They Changed and What are the Uncertainties?” In Econometric Modelling: Techniques and Applications (S. Holly and M.R. Weale, eds), 38–68. Cambridge: Cambridge University Press.

Milhøj, A. (1985). “The Moment Structure of ARCH Processes,” Scandinavian Journal of Statistics, 12, 281–292.

Milhøj, A. (1987). “A Conditional Variance Model for Daily Observations of an Exchange Rate,” Journal of Business and Economic Statistics, 5, 99–103.

Milhøj, A. (1987). “A Multiplicative Parameterization of ARCH Models.” Working Paper, Department of Statistics, University of Copenhagen.

Mills, T.C. (1993). The Econometric Modelling of Financial Time Series. Cambridge, UK: Cambridge University Press.

Milshtein, G.N. (1974). “Approximate Integration of Stochastic Differential Equations,” Theory of Probability and Its Applications, 19, 557–562.

Milshtein, G.N. (1978). “A Method of Second-Order Accuracy Integration for Stochastic Differential Equations,” Theory of Probability and Its Applications, 23, 396–401.

Mincer, J. and Zarnowitz, V. (1969). “The Evaluation of Economic Forecasts.” In Economic Forecasts and Expectations, (J. Mincer ed.) National Bureau of Economic Research, New York.

Mitchell, J. (2005). “The National Institute Density Forecasts of Inflation,” National Institute Economic Review, No. 193, 60–69.

Montiel, P. and Serven, L. (2006). “Macroeconomic Stability in Developing Countries: How Much is Enough?” The World Bank Research Observer, 21, Fall, 151–178.

Moors, J.J.A. (1988). “A Quantile Alternative for Kurtosis,” The Statistician, 37, 25–32.

Morgan, I.G. and Trevor, R.G. (1999). “Limit Moves as Censored Observations of Equilibrium Futures Prices in GARCH Processes,” Journal of Business and Economic Statistics, 17, 397–408.

Müller, U.A., Dacorogna, M.M., Dave, R.D., Olsen, R.B., Pictet, O.V. and von Weizsäcker, J. (1997). “Volatilities of Different Time Resolutions – Analyzing the Dynamics of Market Components,” Journal of Empirical Finance, 4, 213–239.


Nam, K., Pyun, C.S. and Arize, A.C. (2002). “Asymmetric Mean-Reversion and Contrarian Profits: ANST-GARCH Approach,” Journal of Empirical Finance, 9, 563–588.

Nelson, D.B. (1990a). “Stationarity and Persistence in the GARCH(1,1) Model,” Econometric Theory, 6, 318–334.

Nelson, D.B. (1990b). “ARCH Models as Diffusion Approximations,” Journal of Econometrics, 45, 7–39.

Nelson, D.B. (1991). “Conditional Heteroscedasticity in Asset Returns: A New Approach,” Econometrica, 59, 347–370.

Nelson, D.B. (1992). “Filtering and Forecasting with Misspecified ARCH Models I: Getting the Right Variance with the Wrong Model,” Journal of Econometrics, 52, 61–90.

Nelson, D.B. (1996a). “Asymptotic Filtering Theory for Multivariate ARCH Models,” Journal of Econometrics, 71, 1–47.

Nelson, D.B. (1996b). “Asymptotically Optimal Smoothing with ARCH Models,” Econometrica, 64, 561–573.

Nelson, D.B. and Cao, C.Q. (1992). “Inequality Constraints in the Univariate GARCH Model,” Journal of Business and Economic Statistics, 10, 229–235.

Nelson, D.B. and Foster, D. (1994). “Asymptotic Filtering Theory for Univariate ARCH Models,” Econometrica, 62, 1–41.

Nelson, E. (2009). “An Overhaul of Doctrine: The Underpinning of UK Inflation Targeting,” Economic Journal, 119, F333–F368.

Nelson, E. and Nikolov, K. (2004). “Monetary Policy and Stagflation in the UK,” Journal of Money, Credit, and Banking, 36, 293–318.

Nerlove, M. (1965). “Two Models of the British Economy: A Fragment of a Critical Survey,” International Economic Review, 6, 127–181.

Newey, W.K. and Powell, J.L. (1990). “Efficient Estimation of Linear and Type I Censored Regression Models Under Conditional Quantile Restrictions,” Econometric Theory, 6, 295–317.

Newey, W.K. and West, K.D. (1987). “A Simple, Positive Semidefinite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix,” Econometrica, 55, 703–708.

Nijman, T. and Sentana, E. (1996). “Marginalization and Contemporaneous Aggregation in Multivariate GARCH Processes,” Journal of Econometrics, 71, 71–87.

Norrbin, S. and Schlagenhauf, D. (1988). “An Inquiry into the Sources of Macroeconomic Fluctuations,” Journal of Monetary Economics, 22, 43–70.


Nowicka-Zagrajek, J. and Weron, A. (2001). “Dependence Structure of Stable R-GARCH Processes,” Probability and Mathematical Statistics, 21, 371–380.

Øksendal, B. (1985). Stochastic Differential Equations: An Introduction with Applications, 3rd edition, Springer-Verlag, New York.

Oliner, S. and Sichel, D. (2000). “The Resurgence of Growth in the Late 1990s: Is Information Technology the Story?” Journal of Economic Perspectives, 14, 3–22.

Optionmetrics (2003). “Ivy DB File and Data Reference Manual, Version 2.0.” Downloadable .pdf file, available on WRDS.

Orphanides, A. (2001). “Monetary Policy Rules Based on Real-Time Data,” American Economic Review, 91, 964–985.

Otsu, T. (2003). “Empirical Likelihood for Quantile Regression.” University of Wisconsin, Madison, Department of Economics Discussion Paper.

Pagan, A. (1996). “The Econometrics of Financial Markets,” Journal of Empirical Finance, 3, 15–102.

Palm, F. (1996). “GARCH Models of Volatility.” In Handbook of Statistics, Volume 14, (C.R. Rao and G.S. Maddala eds), Amsterdam: North-Holland, 209–240.

Pantula, S.G. (1986). “Modeling the Persistence of Conditional Variances: A Comment,” Econometric Reviews, 5, 71–74.

Park, B.J. (2002). “An Outlier Robust GARCH Model and Forecasting Volatility of Exchange Rate Returns,” Journal of Forecasting, 21, 381–393.

Patton, A.J. (2006a). “Modelling Asymmetric Exchange Rate Dependence,” International Economic Review, 47, 527–556.

Patton, A.J. (2006b). “Volatility Forecast Comparison Using Imperfect Volatility Proxies,” Journal of Econometrics, forthcoming.

Patton, A.J. and Timmermann, A. (2007a). “Properties of Optimal Forecasts under Asymmetric Loss and Nonlinearity,” Journal of Econometrics, 140, 884–918.

Patton, A.J. and Timmermann, A. (2007b). “Testing Forecast Optimality under Unknown Loss,” Journal of the American Statistical Association, 102, 1172–1184.

Pearson, N. and Sun, T.-S. (1994). “Exploiting the Conditional Density in Estimating the Term Structure: An Application to the Cox, Ingersoll, Ross Model,” Journal of Finance, 49, 1279–1304.

Pedersen, C.S. and Satchell, S.E. (2002). “On the Foundation of Performance Measures Under Asymmetric Returns,” Quantitative Finance, 2, 217–223.

Pelloni, G. and Polasek, W. (2003). “Macroeconomic Effects of Sectoral Shocks in Germany, the U.K. and the U.S.: A VAR-GARCH-M Approach,” Computational Economics, 21, 65–85.


Perez-Quiros, G. and Timmermann, A. (2000). “Firm Size and Cyclical Variations in Stock Returns,” Journal of Finance, 55, 1229–1262.

Pesaran, M.H. and Skouras, S. (2001). “Decision-Based Methods for Forecast Evaluation.” In Companion to Economic Forecasting (Clements, M.P. and D.F. Hendry eds). Basil Blackwell.

Pesaran, M.H. and Zaffaroni, P. (2008). “Model Averaging in Risk Management with an Application to Futures Markets,” CESifo Working Paper Series No. 1358; IEPR Working Paper No. 04.3.

Piazzesi, M. (2005). “Bond Yields and the Federal Reserve,” Journal of Political Economy, 113, 311–344.

Piazzesi, M. and Swanson, E. (2008). “Futures Prices as Risk-Adjusted Forecasts of Monetary Policy,” Journal of Monetary Economics, 55, 677–691.

Pinto, B. and Aizenman, J. (eds) (2005). Managing Economic Volatility and Crises: A Practitioner’s Guide. Cambridge: Cambridge University Press.

Poon, S.H. (2005). A Practical Guide to Forecasting Financial Market Volatility. Chichester, UK: John Wiley & Sons, Ltd.

Poon, S.-H. and Granger, C.W.J. (2003). “Forecasting Volatility in Financial Markets: A Review,” Journal of Economic Literature, 41, 478–539.

Portes, R. and Rey, H. (2005). “The Determinants of Cross-Border Equity Flows,” Journal of International Economics, 65, 269–296.

Powell, J. (1984). “Least Absolute Deviations Estimators for the Censored Regression Model,” Journal of Econometrics, 25, 303–325.

Pritsker, M. (1998). “Nonparametric Density Estimation and Tests of Continuous Time Interest Rate Models,” The Review of Financial Studies, 11, 449–487.

Protter, P. (2004). Stochastic Integration and Differential Equations. New York: Springer-Verlag.

Psaradakis, Z. and Sola, M. (1996). “On the Power of Tests for Superexogeneity and Structural Invariance,” Journal of Econometrics, 72, 151–175.

Ramey, G. and Ramey, V.A. (1995). “Cross-Country Evidence on the Link Between Volatility and Growth,” American Economic Review, 85, 1138–1151.

Ramsey, J.B. (1969). “Tests for Specification Errors in Classical Linear Least Squares Regression Analysis,” Journal of the Royal Statistical Society B, 31, 350–371.

Ranaldo, A. (2004). “Order Aggressiveness in Limit Order Book Markets,” Journal of Financial Markets, 53–74.


Revankar, N.S. and Hartley, M.J. (1973). “An Independence Test and Conditional Unbiased Predictions in the Context of Simultaneous Equation Systems,” International Economic Review, 14, 625–631.

Rich, R. and Tracy, J. (2006). “The Relationship Between Expected Inflation, Disagreement and Uncertainty: Evidence from Matched Point and Density Forecasts,” Staff Report No. 253, Federal Reserve Bank of New York.

Rigobon, R. (2002). “The Curse of Non-Investment Grade Countries,” Journal of Development Economics, 69, 423–449.

Rigobon, R. and Sack, B. (2003). “Measuring the Reaction of Monetary Policy to the Stock Market,” Quarterly Journal of Economics, 639–669.

Robinson, P.M. (1991). “Testing for Strong Serial Correlation and Dynamic Conditional Heteroskedasticity in Multiple Regression,” Journal of Econometrics, 47, 67–84.

Rom, B.M. and Ferguson, K. (1993). “Post-Modern Portfolio Theory Comes of Age,” Journal of Investing, 11–17.

Rosenberg, J. and Engle, R. (2002). “Empirical Pricing Kernels,” Journal of Financial Economics, 64, 341–372.

Rosu, I. (2008). “A Dynamic Model of the Limit Order Book,” Review of Financial Studies, forthcoming.

Rubinstein, M. (1994). “Implied Binomial Trees,” Journal of Finance, 49, 771–818.

Rubinstein, M. (1998). “Edgeworth Binomial Trees,” Journal of Derivatives, 5, 20–27.

Russell, J. and Engle, R.F. (2005). “A Discrete-State Continuous-Time Model of Transaction Prices and Times: The ACM-ACD Model,” Journal of Business and Economic Statistics, 166–180.

Sack, B. (2004). “Extracting the Expected Path of Monetary Policy from Futures Rates,” Journal of Futures Markets, 24, 733–754.

Sakata, S. and White, H. (1998). “High Breakdown Point Conditional Dispersion Estimation with Application to S&P 500 Daily Returns Volatility,” Econometrica, 66, 529–568.

Salkever, D.S. (1976). “The Use of Dummy Variables to Compute Predictions, Prediction Errors and Confidence Intervals,” Journal of Econometrics, 4, 393–397.

Sanchez, M.J. and Pena, D. (2003). “The Identification of Multiple Outliers in ARIMA Models,” Communications in Statistics: Theory and Methods, 32, 1265–1287.

Sanders, A.B. and Unal, H. (1988). “On the Intertemporal Stability of the Short Term Rate of Interest,” Journal of Financial and Quantitative Analysis, 23, 417–423.

Santos, C. and Hendry, D.F. (2006). “Saturation in Autoregressive Models,” Notas Economicas, 19, 8–20.


Sasaki, K. (1963). “Military Expenditures and the Employment Multiplier in Hawaii,” Review of Economics and Statistics, 45, 293–304.

Schaefer, S. and Schwartz, E. (1984). “A Two-Factor Model of the Term Structure: An Approximate Analytical Solution,” Journal of Financial and Quantitative Analysis, 19, 413–424.

Schwarz, G. (1978). “Estimating the Dimension of a Model,” Annals of Statistics, 6, 461–464.

Schwert, G.W. (1989). “Why Does Stock Market Volatility Change Over Time?” Journal of Finance, 44, 1115–1153.

Schwert, G.W. (1990). “Stock Volatility and the Crash of ‘87,” Review of Financial Studies, 3, 77–102.

Sensier, M. and van Dijk, D. (2004). “Testing for Volatility Changes in U.S. Macroeconomic Time Series,” Review of Economics and Statistics, 86, 833–839.

Sentana, E. (1995). “Quadratic ARCH Models,” Review of Economic Studies, 62, 639–661.

Sentana, E. and Fiorentini, G. (2001). “Identification, Estimation and Testing of Conditional Heteroskedastic Factor Models,” Journal of Econometrics, 102, 143–164.

Serven, L. (2003). “Real-Exchange-Rate Uncertainty and Private Investment in LDCs,” Review of Economics and Statistics, 85, 212–218.

Shephard, N.H. (1994). “Partial Non-Gaussian State Space,” Biometrika, 81, 115–131.

Shephard, N.H. (1996). “Statistical Aspects of ARCH and Stochastic Volatility Models.” In Time Series Models in Econometrics, Finance and Other Fields, (D.R. Cox, D.V. Hinkley and O.E. Barndorff-Nielsen eds), London: Chapman & Hall, 1–67.

Shephard, N.H. (2008). “Stochastic Volatility Models.” In The New Palgrave Dictionary of Economics, 2nd Edn (S.N. Durlauf and L.E. Blume, eds). Palgrave MacMillan.

Shields, K., Olekalns, N., Henry, O.T. and Brooks, C. (2005). “Measuring the Response of Macroeconomic Uncertainty to Shocks,” Review of Economics and Statistics, 87, 362–370.

Shiller, R.J. (1981). “Do Stock Prices Move Too Much to be Justified by Subsequent Changes in Dividends?” American Economic Review, 71, 421–436.

Shimko, D. (1993). “The Bounds of Probability,” RISK, 6, 33–37.

Siklos, P.L. and Wohar, M.E. (2005). “Estimating Taylor-Type Rules: An Unbalanced Regression?” In Advances in Econometrics, vol. 20 (T.B. Fomby and D. Terrell, eds). Amsterdam: Elsevier.

Singleton, K.J. (2006). Empirical Dynamic Asset Pricing. Princeton: Princeton University Press.


Soderlind, P. and Svensson, L. (1997). “New Techniques to Extract Market Expectations from Financial Instruments,” Journal of Monetary Economics, 40, 383–429.

Solnik, B. and Roulet, J. (2000). “Dispersion as Cross-Sectional Correlation,” Financial Analysts Journal, 56, 54–61.

Somerville, C.T. (2001). “Permits, Starts, and Completions: Structural Relationships Versus Real Options,” Real Estate Economics, 29, 161–190.

Sortino, F.A. and Satchell, S.E. (2001). Managing Downside Risk in Financial Markets. Butterworth-Heinemann.

Sortino, F.A. and van der Meer, R. (1991). “Downside Risk,” The Journal of Portfolio Management, 17, 27–31.

Stambaugh, R.F. (1988). “The Information in Forward Rates: Implications For Models of the Term Structure,” Journal of Financial Economics, 21, 41–70.

Stambaugh, R.F. (1993). “Estimating Conditional Expectations When Volatility Fluctuates,” NBER Working Paper 140.

Stanton, R. (1997). “A Nonparametric Model of Term Structure Dynamics and the Market Price of Interest Rate Risk,” Journal of Finance, 52, 1973–2002.

Stinchcombe, M. and White, H. (1998). “Consistent Specification Testing with Nuisance Parameters Present Only Under the Alternative,” Econometric Theory, 14, 295–324.

Stock, J. and Watson, M. (1998). “Diffusion Indexes,” NBER Working Paper 6702, Cambridge, Mass.: National Bureau of Economic Research.

Stock, J.H. and Watson, M.W. (2002a). “Forecasting Using Principal Components from a Large Number of Predictors,” Journal of the American Statistical Association, 97, 1167–1179.

Stock, J.H. and Watson, M.W. (2002b). “Has the Business Cycle Changed and Why?” In NBER Macroeconomics Annual 2002. (M. Gertler and K. Rogoff eds), Cambridge, Mass.: MIT Press.

Stock, J.H. and Watson, M.W. (2007a). “Why Has U.S. Inflation Become Harder to Forecast?” Journal of Money, Credit, and Banking, 39, 3–34.

Stock, J.H. and Watson, M.W. (2007b). Introduction to Econometrics, 2nd edn. Boston: Pearson Education.

Tauchen, G. and Pitts, M. (1983). “The Price Variability-Volume Relationship on Speculative Markets,” Econometrica, 51, 485–505.

Taylor, J.B. (1993). “Discretion Versus Policy Rules in Practice,” Carnegie-Rochester Conference Series on Public Policy, 39, 195–214.

Taylor, S.J. (1986). Modelling Financial Time Series. Chichester, UK: John Wiley and Sons.


Taylor, S.J. (2004). Asset Price Dynamics and Prediction. Princeton, NJ: Princeton University Press.

Tesar, L. and Werner, I. (1994). “International Equity Transactions and U.S. Portfolio Choice.” In The Internationalization of Equity Markets, (Frankel, J. ed.), Chicago: University of Chicago Press.

Theil, H. and Ghosh, R. (1980). “A Comparison of Shift-share and the RAS Adjustment,” Regional Science and Urban Economics, 10, 175–180.

Timmermann, A. (2000). “Moments of Markov Switching Models,” Journal of Econometrics, 96, 75–111.

Torous, W. and Ball, C. (1995). “Regime Shifts in Short Term Riskless Interest Rates,” Working Paper, London Business School.

Tsay, R.S. (2002). Analysis of Financial Time Series. New York: John Wiley and Sons, Inc.

Tse, Y.K. (1998). “The Conditional Heteroskedasticity of the Yen-Dollar Exchange Rate,” Journal of Applied Econometrics, 13, 49–55.

Tse, Y.K. (2002). “Residual-Based Diagnostics for Conditional Heteroscedasticity Models,” Econometrics Journal, 5, 358–373.

Tse, Y.K. and Tsui, A.K.C. (1999). “A Note on Diagnosing Multivariate Conditional Heteroscedasticity Models,” Journal of Time Series Analysis, 20, 679–691.

Tse, Y.K. and Tsui, A.K.C. (2002). “A Multivariate GARCH Model with Time-Varying Correlations,” Journal of Business and Economic Statistics, 20, 351–362.

van der Weide, R. (2002). “GO-GARCH: A Multivariate Generalized OrthogonalGARCH Model,” Journal of Applied Econometrics, 17, 549–564.

Varian, H.R. (1974). “A Bayesian Approach to Real Estate Assessment.” In Studies in Bayesian Econometrics and Statistics in Honor of Leonard J. Savage (S.E. Fienberg and A. Zellner, eds). Amsterdam: North-Holland, 195–208.

Wallis, K.F. (1989). “Macroeconomic Forecasting: A Survey,” Economic Journal, 99, 28–61.

Wallis, K.F. (2004). “An Assessment of Bank of England and National Institute Inflation Forecast Uncertainties,” National Institute Economic Review, No. 189, 64–71.

Wallis, K.F. (2008). “Forecast Uncertainty, its Representation and Evaluation.” In Econometric Forecasting and High-Frequency Data Analysis (R.S. Mariano and Y.K. Tse, eds), Vol. 13 of the Lecture Notes Series of the Institute for Mathematical Sciences, National University of Singapore, 1–51. Singapore: World Scientific.

Wei, S.X. (2002). “A Censored-GARCH Model of Asset Returns with Price Limits,” Journal of Empirical Finance, 9, 197–223.


Weiss, A. (1991). “Estimating Nonlinear Dynamic Models Using Least Absolute Error Estimation,” Econometric Theory, 7, 46–68.

West, K.D. (1996). “Asymptotic Inference About Predictive Ability,” Econometrica, 64, 1067–1084.

West, K.D. (2006). “Forecast Evaluation.” In Handbook of Economic Forecasting. (G. Elliott, C.W.J. Granger, and A. Timmermann eds), Amsterdam: North-Holland.

White, H. (1980). “A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity,” Econometrica, 48, 817–838.

White, H. (1994). Estimation, Inference and Specification Analysis. New York: Cambridge University Press.

White, H. (2001). Asymptotic Theory for Econometricians. San Diego: Academic Press.

White, H. (2006). “Approximate Nonlinear Forecasting Methods.” In Handbook of Economic Forecasting. (G. Elliott, C.W.J. Granger and A. Timmermann, eds), New York: Elsevier, 460–512.

Wong, C.S. and Li, W.K. (2001). “On a Mixture Autoregressive Conditional Heteroskedastic Model,” Journal of the American Statistical Association, 96, 982–995.

Yang, M. and Bewley, R. (1995). “Moving Average Conditional Heteroskedastic Processes,” Economics Letters, 49, 367–372.

Zakoïan, J.-M. (1994). “Threshold Heteroskedastic Models,” Journal of Economic Dynamics and Control, 18, 931–955.

Zarnowitz, V. and Lambros, L.A. (1987). “Consensus and Uncertainty in Economic Prediction,” Journal of Political Economy, 95, 591–621.

Zellner, A. (1986). “Bayesian Estimation and Prediction Using Asymmetric Loss Functions,” Journal of the American Statistical Association, 81, 446–451.

Zhang, L., Mykland, P.A. and Aït-Sahalia, Y. (2005). “A Tale of Two Time Scales: Determining Integrated Volatility with Noisy High-Frequency Data,” Journal of the American Statistical Association, 100, 1394–1411.

Zhang, Z., Li, W.K. and Yuen, K.C. (2006). “On a Mixture GARCH Time Series Model,” Journal of Time Series Analysis, 27, 577–597.

Zhou, B. (1996). “High-Frequency Data and Volatility in Foreign-Exchange Rates,” Journal of Business and Economic Statistics, 14, 45–52.

Zivot, E. and Andrews, D.W.K. (1992). “Further Evidence on the Great Crash, the Oil-Price Shock, and the Unit-Root Hypothesis,” Journal of Business and Economic Statistics, 10, 251–270.


Index

page numbers in bold refer to glossary definitions

AARCH 138
Abraham, J. M. 38, 48
ACD (Autoregressive Conditional Duration) 138
ACH1 (Autoregressive Conditional Hazard) 138–9
ACH2 (Adaptive Conditional Heteroskedasticity) 139
ACM (Autoregressive Conditional Multinomial) 139
Adair, M. 7
ADCC (Asymmetric Dynamic Conditional Correlations) 139
AGARCH1 (Asymmetric GARCH) 139, 151 see also TS-GARCH
AGARCH2 (Absolute Value GARCH) 139
AGDCC 150
Ahn, D. 298
Aït-Sahalia, Y. 296, 298, 307, 308, 327
Aizenman, J. 98
Akaike’s information criterion 72–3
Alexander, C. 157
Almgren, R. 355
Altonji, J. 15 n.1
American Express (AXP) trade data 130, 130 Tab. 7.4, 131 Tab. 7.5
Andersen, T. 13, 108, 109, 118, 120, 121, 155, 298
Andrews, D. W. K. 71, 254
Ang, A. 117
ANN-ARCH (Artificial Neural Network ARCH) 139
ANOVA models 21 n.5
ANST-GARCH (Asymmetric Nonlinear Smooth Transition GARCH) 139–40
APARCH (Asymmetric Power ARCH) 140, 147, 152
AR(1) model 9
AR(1)-GARCH(1,1) processes and US inflation 202–9
ARCD (AutoRegressive Conditional Density) 140
ARCH (Autoregressive Conditional Heteroskedasticity) x–xi, 2–3, 5, 9, 35, 62–3, 78, 85, 117–18, 140–1, 165
  fourth-order 93
  Federal Reserve forecasting 87–91, 89 Tab. 5.3, 89 Fig. 5.4
  glossary 80 Tab. 5.1, 137–63
  and macroeconomics 79–96
  OLS estimates of bias in Federal Reserve forecasting 89 Tab. 5.3
  re-estimation of original UK inflation model 66–9, 67 Fig. 4.2, 67 Tab. 4.1, 68 Fig. 4.3 (a), (b), 69 Fig. 4.4, 71, 72, 73, 74 Fig. 4.4 (a), 77, 78
  Taylor Rule and Federal Reserve policy 91–5, 92 Tab. 5.5, 93 Tab. 5.6, 94 Tab. 5.7, 95 Fig. 5.5
  and volatility 79–80
  see also GARCH
Archipelago Exchange (ARCA) 355, 358


ARCH-M (ARCH-in-Mean) 4, 141–2
  effects 296–7
ARCH-NNH (ARCH Nonstationary Nonlinear Heteroskedasticity) 141
ARCH-SM (ARCH Stochastic Mean) 142
ARCH-Smoothers 142
Arize, A. C. 139–40
ARMA model 357, 363
  processes 82–3
  representation 147, 148
Aruoba 110
asset market volatility 97–8
asymptotic standard errors 90
ATGARCH (Asymmetric Threshold GARCH) 142
AUG-GARCH (Augmented GARCH) 142–3
Augustine, monk 6
autocontours 214–15, 230 see also multivariate autocontours
autoregressive conditional duration 5
autoregressive conditional skewness and kurtosis 231–56
  conditional quantile-based measures 231
  LRS model 240 Tab. 12.1, 241 Fig. 12.2, 241 Fig. 12.3, 242, 244, 245
  quantile-based measures 238–9
  time-varying for the S & P 500 239–44, 239 Fig. 12.1, 240 Tab. 12.1, 241 Fig. 12.2–12.3, 242 Tab. 12.2, 243 Fig. 12.4–12.5, 244 Fig. 12.6
AVGARCH (Absolute Value GARCH) 143
Baba, Y. 143
Babsiria, M. E. 120 n.3
Bacci, M. 8
Bahra, B. 326
Bai, J. 71, 214
Baillie, R. T. 77, 147
Bank of England 63, 65, 326
  forecasts 74–6
Banz, R. 324
Bartlett standard errors 360
Barndorff-Nielsen, O. E. xi, 118, 124, 125, 126, 127, 131, 132, 133
base multiplier model 20, 21
  Trace test 26 Tab. 2.4
Basset, G. 7
Bates, D. 326, 328–9
Bauwens, L. 213
Beaud 21 n.5
Bekaert, G. 259 n.1, 262, 262 n.4
BEKK (Baba, Engle, Kraft, Kroner) 143
  multivariate autocontours 225, 227, 228 Fig. 11.3, 228 Fig. 11.5, 229 Tab. 11.7
Benati, L. 65, 71, 77
Bera, A. K. 138, 156
Bernoulli random variable 216
Berzeg, K. 21
Bewley, R. 154
Biais, B. 354
Bierens, H. J. 198
Billio, M. 147
Black, F. 118
Black-Scholes (BS) option pricing model 323, 327, 336, 342, 347, 349, 352
  equations 328, 341
  gammas 149
  implied volatilities (IVs) 324, 346
Bliss, R. 296, 326, 327, 328, 329, 342
Boero, G. x, 75
Boivin, J. 91
Bollerslev, T. xi, 6
  ARCH (GARCH) 86
  citations 7
  GARCH modeling 7, 81, 142, 144, 147, 148, 149–50, 151, 155, 156, 157, 158
  macroeconomic volatility and stock market volatility, world-wide 108, 109
  models to limit order book dynamics 354, 355
  realized semivariance 118, 120, 121, 122, 131


Boudoukh, J. xi
Bowley coefficient of skewness 238–9
Box, G. E. P. 9
Brandt, M. W. 158
Breeden, D. 324
Brenner, R. 150, 153
Breusch, T. 7
Brockwell, P. 145
Brooks, R. 259 n.1, 262 n.4, 267 n.7
Brorsen, B. W. 159
Brown, H. 19, 24
Brown, S. 14, 16, 20, 27, 33
Brownian motion 126, 133, 134, 145, 161
Brownian semimartingale 119, 124, 126, 134
Bu, R. 329
Buchen, P. W. 329
building permits (US) 38–47, 40 Fig. 3.3, 42–3 Tab. 3.1, 44 Fig. 3.4, 45 Fig. 3.5, 46 Fig. 3.6
  and GDP growth rate 36 Fig. 3.1
Bureau of Labor Statistics (US) 21
Bureau of the Census (US) 38, 39, 40
Burns, P. 157
business cycle effects 98
Cai, J. 151, 161
Calvet, L. E. 98
Campa, J. M. 326
Campbell, J. 296
Cao, C. Q. 148
Caporin, M. 145, 147
Cappiello, L. 139, 150
Carlino, G. A. 15 n.1, 19, 20, 24, 25
CARR (Conditional AutoRegressive Range) 143
Carter, C. K. 47
Castle, J. L. 69
Catao, Luis xi
CAViaR (Conditional Autoregressive Value At Risk) xi, 81, 143 see also multi-quantile CAViaR and skewness and kurtosis
ccc (Constant Conditional Correlations) 143–4
CCC GARCH 144
Central Limit Theorem 82, 342
CGARCH (Component GARCH) 144
Chadraa, E. 145
Chan, K. C. 296
Chang, P. H. K. 326
Chen, J. 117
Chen, X. 120 n.3
Chen, Z. 214
Chernov, M. 202
Chesher, A. 197
Chib, S. 47
Chicago Board of Trade 88
Choleski decomposition 220
Chou, R. 7, 143
Chow (1960) test 168, 183
Chriss, N. 355
Christodoulakis, G. A. 145
Christoffersen, P. F. 194
Chung, C. F. 77
Clarida, R. 91
Clark, P. K. 155
Clayton-Matthews, A. 38
Clements, M. P. 167
COGARCH (Continuous GARCH) 144–5, 146
cointegration 2, 3, 4, 5, 9, 14, 17, 18, 164
  and long run shift-share modeling 22–33
conditional mean/variance x
Conley, T. 307
Consensus Economics 195
constant share model (Model 4) 20, 24, 27, 33
  Trace tests 25 Tab. 2.3
continuous-time model xi
Copula GARCH 145
Corr ARCH (Correlated ARCH) 145
Corradi, V. 132
Coulson, N. E. x, 14, 15 n.1, 16, 19, 20, 21, 27, 33
Crone, T. M. 38, 48
Crouhy, H. 142
Crow, E. L. 238, 239
Czado, C. 146
Dacorogna, M. M. 151
DAGARCH (Dynamic Asymmetric GARCH) 145


Dai, Q. 296, 298, 307
Dave, R. D. 151
Davidson, J. 151
Davies test 268–9, 271
DCC-GARCH (Dynamic Conditional Correlations) xi, 145, 147
    and multivariate autocontours 225, 227, 228 Fig. 11.4, 229 Fig. 11.6, 229 Tab. 11.7, 230
de Jong, R. M. 198
DeFina, R. 15 n.1
deforestation, Amazon 5
del Negro, M. 259 n.1, 262 n.4, 267 n.7
den Hertog, R. G. J. 144
Department of Commerce (US) 39
Derman, E. 328
developing countries 101
DFM-SV model 38, 45–51
    estimation of fixed model coefficients 47
    filtering 47
    US housing results 57–60, 58 Fig. 3.10, 59–60 Fig. 3.11–13
diag MGARCH (diagonal GARCH) 145–6
Dickey-Fuller (ADF) test 21–2, 69–71, 70 Fig. 4.5 (a), (b)
Diebold, F. X. x–xi, 86, 108, 109, 110, 118, 120, 121, 131, 143, 147, 153, 194
    Engle interview 13, 14
Ding, Z. 140, 225
Distaso, W. 132
Dittmar, R. 298
Domowitz, I. 354
Donaldson, R. G. 139
Doornik, J. A. 167, 190
Downing, C. xi
downside risk 117–36
    American Express (AXP) data 130, 130 Tab. 7.4, 131 Tab. 7.5
    General Electric data 120–1, 121 Fig. 7.1, 128–30, 128 Tab. 7.2, 129 Tab. 7.3, 131 Tab. 7.5
    IBM data 130, 130 Tab. 7.4, 131 Tab. 7.5
    measurement 117–36
    trade data, general 130–1
    Walt Disney (DIS) data 130, 130 Tab. 7.4, 131 Tab. 7.5
Drost, F. C. 163
DTARCH (Double Threshold ARCH) 146, 147
Duan, J. 142
Duchesne, P. 213
Duffie, D. 298
Dumas, B. 328
Dunn, E. S. Jr. 15, 18–19
Dupire, B. 328
Durbin, J. 7
Dynamic Multiple-Indicator Multiple-Cause (DYMIMIC) model 13–14
Ebens, H. 108
Econometric Society 2, 5, 9, 164
Econometrica 3, 5, 9, 62
Econometrics World Congress 2
Edgeworth expansion 328
EGARCH (Exponential GARCH) 62, 90–1, 91 Tab. 5.4, 143, 146, 147, 152
EGOGARCH (Exponential Continuous GARCH) 146
Elder, J. 79
electrical residential load forecasting 3
electricity prices/demand 4
Electronic Communications Networks (ECNs) 354
Elliott, G. 4–5, 9, 11, 195–6, 200
Emmerson, R. 15, 21
EWMA (Exponentially Weighted Moving Average) 146–7
Engle, R. 1, 5, 6, 9, 33, 37, 296
    ARCH and GARCH modeling 4, 16, 20, 27, 35, 47, 86, 87, 92, 121–2, 138, 139, 140–1, 141–2, 144, 145, 147, 149, 150, 151, 152, 155, 156, 157, 159–60, 162, 164, 237, 327
    ARCH paper (1982) x–xi, 2–3, 78, 79, 80, 205, 209
    ARCH-M paper (1987) 203, 296


    BEKK model 143, 225
    CAViaR 81, 143, 231, 246
    citations 7
    Cornell PhD thesis x
    DCC model 225, 231
    Diebold interview 13, 14
    econometric volatility 97, 118, 137, 140, 257
    interest rates 317
    mean variance portfolio risk 355, 361
    MEM 155
    MIT x, 2, 13
    Nobel Prize 2, 5, 78, 137
    spline-GARCH model 98–9
    Stern School of Business, NYU x, 4
    super exogeneity 165
    thesis examinees 10
    time-varying volatility 194
    TR2 test 86, 86 Tab. 5.2, 87, 89, 96
    UK inflation research 62, 66–9, 78
    as urban economist x, 13–14, 33
equity model of volatility xi
Ericsson, N. R. 165, 192
European Monetary System 66
European Union 64
Evans, M. D. D. 110
EVT-GARCH (Extreme Value Theory GARCH) 146
exchange rate mechanism 66
exogeneity see super exogeneity
extreme value theory 146
Fama, E. F. 296
F-ARCH (Factor ARCH) 147
Favero, C. 165, 192
FCGARCH (Flexible Coefficient GARCH) 147
FDCC (Flexible Dynamic Conditional Correlations) 147
Federal Reserve see US Federal Reserve
Ferguson, K. 120
Ferstenberg, R. 355
F-GARCH (Factor GARCH) 147
FIAPARCH (Fractionally Integrated Power ARCH) 147
FIEGARCH (Fractionally Integrated EGARCH) 147
FIGARCH (Fractionally Integrated GARCH) 147–8, 152, 154
Figlewski, S. xi
Fiorentini, G. 153, 161
FIREGARCH 158
Fishburn, P. C. 120
Fisher, A. J. 98
Fisher, F. 13
Fisher-Tippett Theorem 342
Fisher transforms 145
Fisher's Information Matrix 158
Fleming, J. 328
FLEX-GARCH (Flexible GARCH) 148
Forbes, K. 260, 288, 295
forecast errors see generalized forecast errors, optimality and measure
Fornari, F. 163
Foster, D. 141, 142
Fountas, S. 79
four-part shift share model (Model 5) 21
fractional integration/long memory processes 3, 9
Frechet distribution 342
Frey, R. 146
Friedman, B. M. 154
Friedman, M. 63, 77, 78
Frisch, R. 182, 259
FTSE industry sectors 266
    stock index options 327
'fundamental volatility' 98, 98 n.2, 99, 100, 105, 108–9
FX option prices 327
GAARCH, Generalized model 138
Galí, J. 91
Gallant, A. R. 159, 202, 298
Gallo, J. P. 121–2
GARCH (Generalized AutoRegressive Conditional Heteroskedasticity) 7, 62, 121–2, 123 Tab. 7.1, 130, 131 Tab. 7.5, 132 Tab. 7.7, 133, 138, 148–9, 152, 231, 240, 258, 259, 353
    glossary 137–63


    inference about the mean 81–7, 83 Fig. 5.1, 84 Fig. 5.2, 85 Fig. 5.3, 86 Tab. 5.2
    see also multivariate GARCH models
GARCH Diffusion 149
GARCH-EAR (GARCH Exponential AutoRegression) 149
GARCH-in-mean (GARCH-M) model 62, 77, 79
GARCH with skewness and kurtosis (GARCHSK) 140
GARCH-t (GARCH t-distribution) 149–50
GARCH-t generalization 93, 94 Tab. 5.7
GARCHX 150
GARCH-X1 150
GARCH-X2 150
GARCH-Γ (GARCH Gamma) 149
GARCH-Δ (GARCH Delta) 149
GARJI 150
Garratt, A. 69
Gaussian GARCH processes 83
Gaussian limit theory 132
Gaussian quasi-likelihood 122
GCR transformations 233
GDCC (Generalized Dynamic Conditional Correlations) 150
GDP, India and Pakistan 101–5
GDP, US 36, 91, 92, 99–100
    growth 77
    oil price effects 79
    volatility 79
GED-GARCH (Generalized Error Distribution GARCH) 150
Gemmill, G. 327
General Electric (GE) trade data 120–1, 121 Fig. 7.1, 128–30, 128 Tab. 7.2, 129 Tab. 7.3, 131 Tab. 7.5
generalized forecast errors, optimality and measure 194–212
    AR-GARCH processes 202–5
    Linex inflation forecasts 205–7, 206 Fig. 10.3
    Mincer-Zarnowitz regressions 195, 200, 209
    MSE inflation forecasts 205–9, 206 Fig. 10.3, 208 Fig. 10.4
    MSE loss error density 199–200, 202–5, 203 Fig. 10.1, 204 Fig. 10.2, 208 Fig. 10.4
    "MSE-loss probability measure" 195
    objective error density 202–5, 204 Fig. 10.2
    properties under change of measure 200–2
    properties under loss functions 197–9
    "risk neutral probabilities" 195
    testable implications under loss functions 196
    US inflation application 205–9, 206 Fig. 10.3, 208 Fig. 10.4
Gertler, M. 91
Geweke, J. 38, 154, 156
Ghosh 21 n.5
Ghysels, E. 98, 120 n.3, 157
Giannoni, M. P. 91
Gibbons, M. 307
Gibbs sampling 47
GJR-GARCH (Glosten, Jagannathan and Runkle) 122, 123 Tab. 7.5, 130, 131 Tab. 7.5, 132 Tab. 7.7, 145, 147, 150–1, 152, 163
global stock market 266
globalization, role of 294
Glosten, L. R. 118, 150, 162, 361, 363, 364
Gobbo, M. 147
Goetzmann, W. N. 38, 48
GO-GARCH (Generalized Orthogonal GARCH) 151
Goncalves, S. 95
Gonzalez-Rivera, G. xi, 159–60, 214, 215, 216, 217, 224, 230
Google (GOOG) stock 359
Gourieroux, C. 158, 197
Gourlay, A. R. 321
GQARCH (Generalized Quadratic ARCH) 138, 151
Gram-Charlier series expansions 240


Granger, C. W. J. x, 9, 14, 194, 195, 196, 197
    ARCH acronyms 137, 139
    citations 7
    downside risk 117
    GDP hypothesis 109, 110 Tab. 6.1
    'Granger Causality' 7
    Nobel Prize 2, 5
    student co-publishers 11
Graversen, S. E. 125, 126
Gray, S. F. 151, 161
Great Inflation 62
Great Moderation 37, 38, 43, 60, 61, 62, 100, 100 n.8
Greek letter hedging parameters 326
Grier, K. B. 77, 79
Griffin, J. 276, 279, 281
Groen, J. J. J. 76
GRS-GARCH (Generalized Regime-Switching GARCH) 151
Guegan, D. 143
Hadri, K. 329
Haldane, A. 65, 66
Hall, A. 6, 354
Hall, B. 2
Ham, J. 15 n.1
Hamilton, J. x, 3, 4, 5, 9, 90, 98, 138, 151, 161
    citations 7
    student co-publishers 11
Han, H. 141
Hannan-Quinn criterion 73
Hansen, B. E. 152–3
Hansen, L. 98, 307
Hansen, P. R. 124
HARCH (Heterogeneous ARCH) 151–2
Harjes, R. 150, 153
Harmonised Index of Consumer Prices (CPI) 64
Harris, R. D. F. 159
Harrison, J. M. 195, 200
Hartley, M. J. 166
Harvey, C. R. 140, 160, 238
Haug, S. 146
Hausman, J. 355
Hautsch, N. 354
Hendry, D. F. xi, 3, 6, 69, 164, 165, 166, 167, 173, 175, 179, 190, 192
Hentschel, L. 152
Heston, S. L. 160, 261, 262 n.4, 276
Heston-Rouwenhorst decomposition scheme 267
HGARCH (Hentschel GARCH) 152
Higgins, M. L. 138, 156
high-frequency intraday data xi
Hille, E. 309 n.5
Hillion, P. 354
Hodrick, R. 259 n.1, 262, 262 n.4
Hogan, W. W. 120
Hooper, J. 1
Hoover, K. D. 176
Horvath, M. 15 n.1
housing construction, US
    Alabama population and housing permits 39, 50
    Alaska, national factors 52
    Arkansas population and housing permits 39, 50
    building permit growth rate 41–4, 44 Fig. 3.4
    building permits data 38–4
    building permits data for representative states 40–1, 40 Fig. 3.3
    California 50
    conclusions 60–1
    DFM-SV model, national/regional factors 38
    DFM-SV model results 45–51, 57–60, 58 Fig. 3.10, 59–60 Fig. 3.11–13
    DFM with split-sample estimates 51–7, 53–4 Tab. 3.3, 54 Tab. 3.4, 55–6 Tab. 3.5, 58, 60
    estimated region results 49–51, 49 Fig. 3.7, 50 Tab. 3.2, 51 Fig. 3.8, 51 Fig. 3.9
    estimation of housing market regions 47–9


    evolution of national/regional factors 35–61
    Florida 50, 52
    Georgia 52
    Hawaii 52
    Louisiana 50
    Mississippi population and housing permits 39, 52
    Missouri sampling standard error 39
    Nebraska sampling standard error 39
    Nevada 50
    Ohio sampling standard error 39
    Rhode Island 52
    seasonality 41–3, 42–3 Tab. 3.1
    South Carolina 52
    South Dakota 50
    spatial correlation 43–5, 46 Fig. 3.6
    standard deviations: volatility 43, 45 Fig. 3.5, 61
    Vermont 49, 50
    Virginia 52
    volatility 43, 57, 58, 60–1
    Washington 50
    West Virginia 52
    Wyoming 52
    Wyoming sampling standard error 39
Huang, X. 118, 131
Huber, P. J. 234–5
Hwang, S. 150, 238
HYGARCH (Hyperbolic GARCH) 152

IBM trade data 130, 130 Tab. 7.4, 131 Tab. 7.5
IGARCH (Integrated GARCH) 146, 148, 152–3
IMF 195
Implied Volatility (IV) 153
India, GDP 101–5
Industry Act UK (1975) 73
inflation uncertainty modeling, UK 62–78
    ARCH model re-estimation 66–9, 67 Fig. 4.2, 67 Tab. 4.1, 68 Fig. 4.3 (a), (b), 69 Fig. 4.4, 71, 72, 73, 74 Fig. 4.4 (a), 77, 78
    Bank of England forecasts 74–6
    Bretton Woods period 65
    business cycle range 65
    CPI 73
    exchange rate targeting 65, 66
    forecast uncertainty 73–6, 74 Fig. 4.6 (a), (b)
    monetary policy 65, 75
    Monetary Policy Committee (MPC) forecasts 74 Fig. 4.6 (a), 75
    mortgage interest and RPI 63–6
    non-stationary behaviour 69–73, 78
    policy environment 63–6
    Retail Price Index (RPI) 66, 69, 73
    seasonality 70
    short-term economic forecasts 73
    structural 'breaks' model 69, 71, 72 Tab. 4.2, 73–4, 74 Fig. 4.6 (a), 75–6, 77, 78
    Survey of External Forecasters (SEF) 74 Fig. 4.6 (a), (b), 75, 76
    'traditional'/modeling approaches compared 63
    Treasury forecasts 74 Fig. 4.6 (a), 76, 77
    uncertainty and the level of inflation 77
    unit root hypothesis 70–1
interest rate volatility, a multifactor, nonlinear, continuous-time model 296–322
    affine class models 298, 313
    ARCH-M effects 296–7
    asset pricing 314
    bond/fixed-income derivative pricing 297
    conditional distribution of 300–2, 302 Tab. 14.1, 303–7, 304 Fig. 14.3, 305 Fig. 14.4–5, 306 Fig. 14.6, 307 Fig. 14.7
    continuous-time multifactor diffusion process 307–13
    data description 299–300, 299 Fig. 14.1, 300 Fig. 14.2


    distribution of four possible states 300–2
    drift, diffusion and correlation approximations 308–13, 313 Fig. 14.8
    equilibrium vs. arbitrage-free debate 322
    fixed-income contingent claims 318–21
    Hopscotch method (Gourlay and McKee) 321
    Kernel estimation 303, 315
    Monte Carlo simulations 321
    nonaffine models 298
    "relevant pricing factor" 298
    stochastic behaviour of interest rates 298–307
    structure theory 298
    two-factor diffusion process results 316–18, 316 Fig. 14.9, 317 Fig. 14.10, 318 Fig. 14.11, 319 Fig. 14.12
    two-factor (Longstaff and Schwartz) model 298, 313–16, 322
    volatilities and levels 298, 301, 315, 316–21, 317 Fig. 14.10, 318 Fig. 14.11, 319 Fig. 14.12, 321
International Financial Statistics (IFS) 99
intraday high-frequency transactions xi
Irish, M. 197
Irons, J. S. 165
ISI Web of Science 79
Jackwerth, J. C. 326, 327, 328
Jacod, J. 118, 125, 126, 127, 133
Jagannathan, R. 98, 118, 150, 162
Jalil, M. 91
Jansen, E. S. 165
Jarque-Bera test 267, 275
Jenkins, G. M. 9
Johansen, S. 6, 7, 165, 166, 173
Jondeau, E. 145
Jones, C. S. 158
Jorda, O. 138
Judd, J. P. 91, 92
Juselius, K. 6

Kalliovirta, L. 214
Kalman filters 4, 35
Kamstra, M. 139
Kan, R. 298
Kani, I. 328
Kapetanios, G. 76
Karanasos, M. 79
Karolyi, G. A. 276, 279, 281, 296
Kavajecz, K. 354
Kawakatsu, H. 154
Kelly, M. 329
Kernel estimation 303, 315
Kilian, L. 95
Kim, E. Han 7
Kim, S. 47
Kim, T. xi
Kim, T.-H. xi, 231, 237, 238, 239, 252
Kimmel, R. 298, 307
King, M. 288
Kinnebrock, S. xi, 125, 126, 133–4
Kluppelberg, C. 144, 146
k-means cluster analysis 48–9
Kodres, L. E. 162
Koenker, R. 7
Kohn, R. 47
Komunjer, I. 5, 200, 236
Koppejans 354
Koren, M. 101
Kraft, D. 143
Kreps, D. M. 195, 200
Krolzig, H.-M. 165, 166
Kroner, K. F. 7, 143, 150, 153, 225
kurtosis
    autoregressive conditional 231–56
    GARCH with skewness and kurtosis (GARCHSK) 140
    and global equity returns 267, 275
    MQ-CAViaR autoregressive conditional skewness and kurtosis 232–4
Labys, P. 118, 120, 121
Laibson, D. I. 154
Lalancette, S. 213
Lambros, L. A. 76
LARCH (Linear ARCH) 153
latent GARCH 153


Laurent, S. 213
Lazar, E. 157
LeBaron, B. 149
Lebesgue measure 249
Lebesgue-Stieltjes probability density function (PDF) 232
Ledoit, O. 148
Lee, G. G. J. 144
Lee, K. 69, 79
Lee, L. F. 162
Lee, S. 118, 138, 142, 152–3
Lee, T. H. 150
Leibnitz, G. 198
Leon, A. 140, 231, 240
Level-GARCH 153
Levy processes 144
Lewis, A. L. 120
LGARCH (Leverage GARCH) 153
    see also GJR
LGARCH2 (Linear GARCH) 153
Li, C. W. 146
Li, W. K. 146, 156, 213
Li, Y. 133
Lilien, D. 35, 141–2, 203, 296, 297, 301, 317, 322
limit order book dynamics, new model 354–64
    ACM model 355
    Arma model 357, 363
    data 358–60, 359 Fig. 16.1, 360 Fig. 16.2
    description 356–8
    estimation 358
    high volatility periods 361–2, 364
    mean variance portfolio risk 355
    results 360–4, 361 Tab. 16.1, 362 Tab. 16.2, 363 Tab. 16.3, 364 Fig. 16.3
Lin, G. 98
Lindner, A. 144, 145, 146
linear regressions 9
Ling, S. 213
Litterman, R. 296
Litzenberger, R. 324
Liu, S. M. 159
Ljung-Box test 361, 363
LM test 92

LMGARCH (Long Memory GARCH) 153–4
Lo, A. W. 307, 327, 355
log-GARCH (Logarithmic GARCH) 154
London School of Economics (LSE) 2
long base multiplier model 20, 27
    trace test 26 Tab. 2.4
long run shift-share modeling, metropolitan sectoral fluctuations 13–34
    Atlanta Trace test 23, 24, 25 Tab. 2.3, 26 Tab. 2.4
    Atlanta VARs 30 Tab. 2.7
    base multiplier model 20, 21, 26 Tab. 2.4
    Chicago Trace test 25 Tab. 2.3, 26 Tab. 2.4
    Chicago VARs 31 Tab. 2.8
    cointegration 22–33
    constant total share (Model 2) 19, 20, 24, 25
    constant share (Model 4) 20, 24, 27, 33
    Dallas Trace test 23–4, 25 Tab. 2.3, 26 Tab. 2.4
    Dallas VARs 29 Tab. 2.6
    data and evidence 21–33
    four-part shift share model (Model 5) 21
    general model 14–18
    intermediate model (C) 27–33, Tab. 2.5–2.9
    long base multiplier model 20, 27
    Los Angeles Trace test 23, 24, 25 Tab. 2.3, 26 Tab. 2.4
    Los Angeles VARs 32 Tab. 2.9
    model (D) 21, 27–33, Tab. 2.5–2.9
    orthogonalization matrix 16, 18, 33
    Philadelphia Trace test 24, 25 Tab. 2.3, 26 Tab. 2.4
    Philadelphia VARs 28 Tab. 2.5
    sectoral shift-share (Model 3) 19, 24
    short run shift-share model (A) 27–33, Tab. 2.5–2.9
    short run VAR model (B) 27–33, Tab. 2.5–2.9
    'total share component' 15


    total share model (Model 1) 18–19, 24
    trace test long run base multiplier 26 Tab. 2.4
    trace tests 22–7, 23 Tab. 2.2, 25 Tab. 2.3, 26 Tab. 2.4
    trace tests constant share model 25 Tab. 2.3
    'traditional' models 14
    unit root tests 22 Tab. 2.1
Longstaff, F. 296, 298, 313
Lucas, R. E. 165
Lucas critique 171, 182
Lumsdaine, R. L. 152–3
Lund, J. 298
Lunde, A. 124
Lutkepohl, H. 16
Luttmer, E. 307
Lyons, R. K. 110

MACH (Moving Average Conditional Heteroskedastic) 154
Machina, M. J. 8, 196
MacKinlay, A. C. 355
MacKinnon, J. 6
macroeconomic volatility and stock market volatility, world-wide 97–116
    annual consumption data 111, 113–14 Tab. 6.A2
    annual stock market data 111, 112–13 Tab. 6.A1
    asset market volatility 97–8
    basic relationship: stock return/GDP volatilities 101
    basic relationship: stock return/PCE volatilities 101, 103 Fig. 6.3
    choice of sample period 99–100
    controlling level of initial GDP 101–5, 104–5 Figs. 6.4–6.6, 106–7 Figs. 6.7–6.9
    cross-sectional analysis 103 Fig. 6.2, 107–8, 107 Fig. 6.9, 108 Fig. 6.10
    data 99–100
    developing countries 100
    distribution of volatilities 101, 102 Fig. 6.1
    empirical results 100–5
    Granger hypothesis 109, 110 Tab. 6.1
    panel analysis of causal direction 108–9
    quarterly stock index data 111, 114–15 Tab. 6.A3
    stock returns and GDP series 111, 116 Tab. 6.A4
    stock markets and developing countries 100
    transition economies 100
Maheu, J. M. 150
Maller, R. 144, 146
Malz, A. M. 326–7
Manganelli, S. xi, 81, 143, 231, 233, 237, 246
Mao, J. C. T. 120
MARCH1 (Modified ARCH) 154
MARCH2 see MGARCH2
Markov Chain 153, 264
Markov Chain Monte Carlo (MCMC) 38
Markov process 5, 151, 321
Markov switching 5, 260
Markowitz, H. 120
martingale difference properties 214
martingale difference sequence (MDS) 82, 88, 89, 235
martingale share model 20
Massmann, M. 166
Matrix EGARCH 154–5
Maximum Entropy principle 329
Maximum Likelihood Estimates 141
    see also QMLE
McAleer, M. 145
McARCH 95
    philosophy 81
McCulloch, J. H. 139, 159
McCurdy, T. H. 150
McKee, S. 321
McNees, S. K. 62
McNeil, A. J. 146
MDH (Mixture of Distribution Hypothesis) 155
mean absolute error (MAE) 73


Medeiros, M. C. 147
Meenagh, D. 63, 65–6, 71, 76, 77
Mele, A. 163
Melick, W. R. 326
Melliss, C. 73, 76
Melvin, M. 355
MEM (Multiplicative Error Model) 155
metropolitan sectoral fluctuations, sources of 13–34
    demand shocks 21, 27
    four aggregate levels 14–15
    growth rates 15
    industry share of employment 16
    productivity shocks 24, 27, 34
    supply shocks 14, 16, 21, 33
    technology shocks 20
    see also long run shift-share modeling
Metropolitan Statistical Areas 21
Mezrich, J. 162
MGARCH 148, 154, 163
MGARCH1 155–6
MGARCH2 (Multiplicative GARCH) 156
    see also log-GARCH
MGARCH3 (Mixture GARCH) 156
Mikkelsen, H. O. 86, 147
Milhøj, A. 86, 154, 156
Miller, M. 324
Mills, L. O. 19, 20, 24, 25
Milshtein, G. N. 297, 308, 309 n.5, 321
Mincer-Zarnowitz regressions 195, 200, 209
Minsky, H. P. 154
MIT 2
Mitchell, J. 75, 76
mixed data sampling (MIDAS) 98
MN-GARCH (Normal Mixture GARCH) 157
Monash University, Melbourne 2
monetary policy (US) 80
monetary policy shocks (UK/US) 283
Monfort, A. 158
Monte Carlo methods 80, 153
Moors coefficient of kurtosis 239
Moran's I 44 n.6, 46
Morgan, I. G. 162
mortgage rate deviation, US regional 37 Fig. 3.2

MQ-CAViaR autoregressive conditional skewness and kurtosis 232–4
"MSE-loss probability measure" 195
MS-GARCH (Markov Switching GARCH) see SWARCH
Mueller, P. 202
Muller, U. A. 151
multi-quantile CAViaR and skewness and kurtosis 231–56
    consistency and asymptotic normality 234–7
    consistent covariance matrix estimation 237
    estimations 240–4, 242 Tab. 12.2, 243 Fig. 12.5, 244 Fig. 12.6
    MQ-CAViaR process and model 232–4
    simulation 244–6, 245 Tab. 12.3, 246 Tab. 12.4
multivariate autocontours xi, 213–30
    concept 214–15, 230
multivariate dynamic models xi
multivariate GARCH models, autocontour testing
    BEKK model 225, 227, 228 Fig. 11.3, 228 Fig. 11.5, 229 Tab. 11.7
    DCC model 225, 227, 228 Fig. 11.4, 229 Fig. 11.6, 229 Tab. 11.7, 230
    empirical applications 224–30, 224 Tab. 11.5, 225 Tab. 11.6, 226 Fig. 11.2, 228–9 Figs. 11.3–11.6, 229 Tab. 11.7
    empirical process-based testing approach 214
    Monte Carlo simulations 215, 217, 219–22, 221 Tab. 11.1 (a), (b), 221 Tab. 11.2 (a), (b), 230
    normal distributions 218, 219, 220 Fig. 11.1, 224, 227
    power simulations 222–4, 223 Tab. 11.3, 224 Tab. 11.4
    quasi-maximum likelihood estimator 214


    Student-t distribution 218–19, 220 Fig. 11.1, 224, 227, 229 Tab. 11.7, 230
    testing methodology 215–17
MV-GARCH (MultiVariate GARCH) 156
    see also MGARCH1
Mykland, P. A. 118, 133
NAGARCH (Nonlinear Asymmetric GARCH) 156
Nam, K. 139–40
Nandi, S. 160
National Institute of Economic and Social Research (NIESR) 63, 75
Nelson, D. B. 118, 141, 142, 146, 148, 149, 150
Nelson, E. 65
Nerlove, M. 2, 63, 147, 153
neural networks 9
New Keynesian model 65
Newbold, P. 1
Newey, W. K. 206, 236
Newey-West corrections 80, 85, 86 Tab. 5.2
Ng, V. K. 118, 147, 156, 157–8, 160, 162, 296
NGARCH (Nonlinear GARCH) 152, 156–7
Ni, S. 79
Nielsen, B. 165, 166, 173
Nijman, T. E. 163
Nikolov, K. 65
NL-GARCH (NonLinear GARCH) 157
Nobel Prize 2, 5, 78
Norrbin, S. 15 n.1
North American Industry Classification System (NAICS) 21
Nottingham University 1
Nowicka-Zagrajek, J. 158
Nuffield College, Oxford 2
Nychka, D. W. 202
OGARCH (Orthogonal GARCH) 157
oil prices 259, 287, 326
oil shocks 283
Oliner, S. 287
OLS formula and tests 81–7, 83 Fig. 5.1, 84 Fig. 5.2, 85 Fig. 5.3, 86 Tab. 5.2, 90–1, 92, 96
OLS t-test, asymptotic rejection probability 83, 84
Olsen, R. B. 151
Orr, D. 2
Otsu, T. 236
Pagan, A. 7
Pakistan GDP 101–5
Panigirtzoglou, N. 326, 327, 328, 329, 342
Pantula, S. G. 154, 156
parameter variation across the frequency domain 13
PARCH (Power ARCH) see NGARCH
Pareto distributions 139, 146, 159
Park, J. Y. 141, 158
Patton, A. J. xi, 145, 194, 195
PC-GARCH (Principal Component GARCH) 157
PcGets program 166
PcGive algorithms 166
Pearson, N. 296
Pedersen, C. S. 120
Pelloni, G. 79
Perez, S. J. 176
Perry, M. J. 77, 79
personal consumption expenditures (PCE) 99–100
Pesaran, M. H. 69
PGARCH1 (Periodic GARCH) 157
PGARCH2 (Power GARCH) see NGARCH
Phillips curve 64, 65
Phillips, R. 309 n.5
Piazzesi, M. 88
Pinto, B. 98
Pitts, M. 155
Ploberger, W. 198
PNP-ARCH (Partially NonParametric ARCH) 157–8
Podolskij, M. 125, 126, 133–4
Polasek, W. 79
Poll Tax (UK) 73
Portes, R. 258, 260
portfolio theory 120


Powell, J. L. 236, 237
Power ARCH see NGARCH
Power GARCH see NGARCH
"practitioner Black-Scholes" 324, 328, 336
    see also Black-Scholes (BS) option pricing model
Price, S. 76
Psaradakis, Z. 165
Pictet, O. V. 151
Pyun, C. S. 139–40
QARCH see GQARCH
QMLE (Quasi Maximum Likelihood Estimation) 158
QTARCH (Qualitative Threshold ARCH) 158
Quah, D. 65, 66
Quasi Maximum Likelihood Estimates (QMLE) 138
QUERI consultancy 4
Radon-Nikodym derivative 201
Ramanathan, R. 8, 15, 21
Ramaswamy, K. 307
Ramey, G. 98
Ramey, V. A. 8, 98
Ramm, W. 15, 21
Ranaldo, A. 354
Rangel, J. G. 160
Rangel spline-GARCH model 98–9
Ratti, R. 79
realized semivariance (RS) 117–36
    bipower variation 125, 131–2, 132 Tab. 7.7
    GARCH models 121–2, 123 Tab. 7.1, 130, 131 Tab. 7.5, 132 Tab. 7.7, 133
    GJR model 122, 123 Tab. 7.5, 130, 131 Tab. 7.5, 132 Tab. 7.7
    models and background 122–4
    noise effect 133
    realized variance (RV) 121, 122, 127, 129, 130
    realized variance (RV) definition 118–19
    signature plots 120, 121 Fig. 7.1 (d), (e)
REGARCH (Range EGARCH) 158
regional economics 3
Reider, R. L. 326
Retail Price Index (RPI) 63–5, 65 Fig. 4.1 (a), (b)
Revankar, N. S. 166
Rey, H. 258, 260
RGARCH1 (Randomized GARCH) 158
RGARCH2 (Robust GARCH) 158–9
RGARCH3 (Root GARCH) 159
Richard, J. F. xi, 164
Richardson, Matthew xi
Rigobon, R. 161, 260, 288, 295
RiskMetrics 147
risk-neutral density, US market portfolio estimation xi, 323–53
    adding tails 342–5, 345 Fig. 15.8
    arbitrage-free theoretical models 336
    Binomial Tree models 327–8
    Black-Scholes equations 328, 341
    Black-Scholes implied volatilities (IVs) 324, 346
    Black-Scholes option pricing model 323, 327, 336, 342, 347, 349, 352
    Central Limit Theorem 342
    dynamic behaviour 350–2, 351 Tab. 15.4
    and economic/political events 326
    estimating from S&P 500 index options 345–52, 347 Tab. 15.2, 348 Tab. 15.3
    extracting from option market prices, in practice 331–9, 332 Tab. 15.1, 333 Fig. 15.1, 334 Fig. 15.2, 335 Figs. 15.3–4, 338 Fig. 15.5
    extracting from option prices, in theory 329–31
    exchange rates and expectations 326–7, 329
    Extreme Value distribution 342
    Fisher-Tippett Theorem 342
    Garman-Kohlhagen model 327
    Generalized Extreme Value (GEV) distribution 325, 342–5, 345 Figs. 15.8–9, 349, 352


    Generalized Extreme Value (GEV) parameter values 343–4
    Generalized Extreme Value (GEV) tails 350–2
    Greek letter hedging parameters 326
    implied volatilities (IVs) 323–4, 326, 328, 334, 336, 337, 338, 339, 341, 342, 352
    market bid-ask spreads 339–41, 340 Fig. 15.6
    Maximum Entropy principle 329
    moments of risk-neutral density 349–50
    Monte Carlo simulations 329
    "practitioner Black-Scholes" 324, 328, 336
    risk preferences 324, 325, 327
    skewness and kurtosis 349–50
    "smoothing spline" 336–7
    spline functions 334, 336, 341
    summary 340–1, 341 Fig. 15.7
    tail parameters 347
    tails 352
    volatility 'smile' 323–4
Robins, R. 141–2, 203, 296, 297, 301, 317, 322
Robinson, P. M. 153
Rochester University 2
Rockinger, M. 142, 145
Rom, B. M. 120
Rombouts, J. V. K. 213
Rosenberg, J. xi, 149, 327
Rosu, I. 361, 363, 364
Rothenberg, Jerome 13
Rothschild, M. 147
Roulet, J. 265
Rouwenhorst, G. 261, 262 n.4, 276
RS-GARCH (Regime Switching GARCH) see SWARCH
Rubinstein, M. 327–8
Rubio, G. 140, 231, 240
Rudebusch, G. D. 91, 92, 110
Ruiz, E. 160
Runkle, D. 118, 150, 162
Russell, J. xi, 138, 139, 355, 361
RV (Realized Volatility) 159
Sack, B. 88
Saflekos, A. 327
Sakata, S. 244
San Diego University, California x, xi, 1–12
    changing years 4–6
    citations 7
    Econometrics Research Project 8
    founding years 2–3
    graduate students 6
    middle years 3–4
    university rankings 5–6
    visitors and students 6–7, 9–12
    wives 8

Sanders, A. 296
Santa-Clara, P. 98, 148
Santos, C. xi, 165, 175, 179
SARV (Stochastic AutoRegressive Volatility) see SV (Stochastic Volatility)
SARV(1) 161
Sasaki, K. 20, 21
Satchell, S. E. 145, 150, 238
Scheinkman, J. 296, 307
Schlagenhauf, D. 15 n.1
Schwartz, E. 296, 298, 313
Schwarz criterion 73
Schwert, G. W. 98, 101, 108–9, 160, 162
sectoral shift-share (Model 3) 19, 24
Sensier, M. 71
Sentana, E. 138, 151, 153, 160, 161, 163, 288
Senyuz, Z. 214, 215, 216, 217, 224, 230
Serletis, A. 79
Serna, G. 140, 231, 240
Serven, L. 79
S-GARCH (Simplified GARCH) 159
SGARCH (Stable GARCH) 159
Shephard, N. xi, 47, 63, 118, 124, 125, 126, 127, 131, 132, 133, 153
Sheppard, K. 139, 150
Shields, K. 79
Shiller, R. 98, 296
Shimko, D. 334–6
Shin, Y. 69
short run shift-share modelling 17, 18


short-term economic forecasts, UK government 73
Sichel, D. 287
Siddique, A. 140, 238
Siddiqui, M. M. 238, 239
Sign-GARCH see GJR
Sill, K. 15 n.1
Sims, C. 2
Singleton, K. J. 296, 298, 307
skewness
    autoregressive conditional 231–56
    Bowley coefficient of 238–9
    GARCH with skewness and kurtosis (GARCHSK) 140
    and global equity returns 267, 275
    MQ-CAViaR autoregressive conditional skewness and kurtosis 232–4
Smith, J. x, 75
Soderlind, P. 326
Sohn, B. 98
Sola, M. 165
Solnik, B. 265
Sortino, F. 120
Sortino ratios 120
SPARCH (SemiParametric ARCH) 159–60
Spatt, C. 354
spectral regression 3
spline-GARCH 98–9, 160
SQR-GARCH (Square-Root GARCH) 160
Satchell, S. E. 120
Stambaugh, R. F. 86
Standard and Poors (S&P) Emerging Markets Database 99
Standard and Poors (S&P) 500 stock index 239–44, 325, 326, 327, 333, 334, 340, 344, 345–52, 353
Stanford University 2
Stanton, R. xi, 297, 308
STARCH (Structural ARCH) 160
Stdev-ARCH (Standard deviation ARCH) 160
Stern School of Business, NYU x, 4, 5
STGARCH (Smooth Transition GARCH) 147, 160–1
Stinchcombe, M. 8, 233
stochastic volatility 62–3
Stock, J. H. x, 47
stock market volatility x–xi, 97–116
stock market crash (1987) 276, 326, 327
Stoja, E. 159
Strong GARCH 161
Structural GARCH 161
Sun, T.-S. 296
super exogeneity xi, 4
super exogeneity, automatic tests of 164–93
    co-breaking based tests 186
    detectability in conditional models 169–70
    detectable shifts 166–70
    failures 172–3, 181–6, 193
    F-test, impulse-based 189–90
    F-test potency 187–90, 187 Tab. 9.2–9.3, 188 Tab. 9.4, 189 Tab. 9.5–9.6, 190 Tab. 9.7
    F-tests 175, 177, 179, 181, 182, 183, 185
    impulse saturation 165, 166, 173–5, 174 Fig. 9.2–9.3, 178–9, 186, 187, 192–3
    mean shift detection 179–81, 181 Tab. 9.2, 182
    Monte Carlo evidence 166, 175, 193
    Monte Carlo evidence and null rejection frequency 176–9, 177–9 Fig. 9.4
    null rejection frequency 175–9
    regression context 170–3
    simulating the potencies 186–90, 187 Tab. 9.2–9.3, 188 Tab. 9.4, 188 Fig. 9.5, 189 Tab. 9.5–9.6, 190 Tab. 9.7–9.8
    simulation outcomes 167–9, 169 Fig. 9.1
    six conditions 166
    UK money demand testing 190–2
    variance shift detection 181
Survey of External Forecasters, Bank of England 63
Survey of Professional Forecasters (US) 63, 76, 195


Survey Research Centre, Princeton University 39
Susmel, R. 151, 161
SV (Stochastic Volatility) 161
Svensson, L. 326
SVJ (Stochastic Volatility Jump) 161
Swanson, E. 88
Swanson, Norm 6
SWARCH (Regime Switching ARCH) 151, 161
Taniguchi, M. 142
Tauchen, G. 118, 131, 155, 159
Taylor, S. J. 148, 162
Taylor series expansions 309, 309 n.5
Tenreyro, S. 101
Terasvirta, T. 6, 165
TGARCH (Threshold GARCH) 147, 150, 152, 161–2
Theil 21 n.5
Thomas, C. P. 326
Thompson, S. B. 98
Tieslau, M. A. 77
time series methods 9
time-varying volatility x
Timmermann, A. xi, 4–5, 9, 194, 195, 200, 264
    student co-publishers 12
Tobit-GARCH 162
Toro, J. 165, 166
total share model (Model 1) 18–19, 24
Trades and Quotes (TAQ) data 359
transition economies 100
Trevor, R. G. 162
Tse, Y. K. 145, 147, 213–14
TS-GARCH (Taylor-Schwert GARCH) 152, 159, 162
Tsui, A. K. C. 145, 213
Tucker, J. 159
'Tuesday's Econometrician's Lunch' 2
TVP-Level (Time-Varying Parameter Level) see Level-GARCH
UGARCH (Univariate GARCH) see GARCH
UK money demand and exogeneity 190–2
unit root inference 5
Unobserved GARCH see Latent GARCH
urban economics x, 13–14, 33
US Federal Reserve 80, 299, 339, 352
    forecasting 87–91
    fund rates 95
    future forecasts 89 Tab. 5.3, 89 Fig. 5.4
    monetary policy 205
    policy and Taylor Rule 91–5, 92 Tab. 5.5, 93 Tab. 5.6, 94 Tab. 5.7, 95 Fig. 5.5
US government bonds 299
US interest rates 141–2
US stock return data 224–6
US Treasury bill market 283
Valkanov, R. 98
Value Added Tax (UK) 73
van Dijk, D. 71
VAR (vector autoregression) x, 161
    B-form 16
    'city-industry' 14
VAR-GARCH framework and sectoral shocks 79
Varian, H. R. 200
Variance Targeting 162
VCC (Varying Conditional Correlations) see DCC
VCC-MGARCH (Varying Conditional Correlation) 145
vech GARCH (vectorized GARCH) see MGARCH1
vector equilibrium systems (EqCMs) 165
Vega, C. 109
Veiga, A. 147
Verbrugge, R. 15 n.1
Vetter, M. 133
VGARCH 160, 162–3
VGARCH2 (Vector GARCH) 163
VIX index 337, 353
volatility
    asset market 97–8
    equity model of xi


    'fundamental' 98, 98 n.2, 99, 100, 105, 108–9
    GDP, US 79
    housing construction, US 43, 57, 58, 60–1
    Implied Volatility (IV) 153
    interest rates 296–322
    limit order book periods 361–2, 364
    macroeconomic and stock market, world-wide 97–116
    'smile' 323–4

volatility regimes and global equity returns 257–95
    Akaike (AIC) information criteria 269, 270 Tab. 13.3, 271
    arbitrage pricing theory (APT) 258
    common nonlinear factor approach 265
    conclusions 293–5
    country-industry/affiliation factors 258–9, 260
    country-industry decomposition 259–60
    data 265–7
    economic fixed-length rolling windows approach 265, 279–81, 280 Fig. 13.5
    global portfolio allocation 287–93, 290 Tab. 13.7, 291–2 Tab. 13.8
    global risk diversification 261
    global stock return dynamics 267–75, 268 Tab. 13.1
    Hannan-Quinn (HQ) information criteria 269, 270 Tab. 13.3, 271
    Heston-Rouwenhorst decomposition scheme 267
    interpretation 281–7, 282 Tab. 13.6, 284 Fig. 13.7, 285 Fig. 13.7 (a), 286 Fig. 13.7 (b)
    industry and country effects benchmark 263
    industry portfolios and state combinations 288–9
    industry-specific shocks 283
    international equity flows 260
    international risk diversification 257, 258
    intra-monthly variance of stock returns 274 Fig. 13.2
    IT sector 287
    joint portfolio dynamics 270–3, 271 Tab. 13.4, 272 Tab. 13.5, 273 Fig. 13.1
    "mixtures of normals" model 260
    modeling stock return dynamics 263–5
    monetary policy shocks (UK/US) 283
    nonlinear dynamic common component models 270–3
    nonlinear dynamic dependencies 260
    nonlinearity in returns 268–70, 269 Tab. 13.3, 270 Tab. 13.3
    oil prices and shocks 283, 287
    portfolio diversification 294
    "pure" country and industry portfolios 261–3
    regime-switching/changes 279, 294–5
    regime-switching models 271, 293
    regime-switching processes 262, 267, 268
    risk diversification 288, 295
    robustness checks 273–5
    rolling windows approach 279
    rolling windows approach comparison 280 Fig. 13.5
    Schwarz Bayesian (BIC) information criteria 269, 270 Tab. 13.3
    sector-specific factors/shocks 257, 258, 295
    short term interest rates 283
    single state model 260
    skewness and kurtosis 267, 275
    smoothed state probabilities 273 Fig. 13.1, 285 Fig. 13.7 (a), 286 Fig. 13.7 (b)
    temporary switches 257
    variance decompositions 275–81, 277 Fig. 13.3, 278 Fig. 13.4, 280 Fig. 13.5, 282 Tab. 13.6


Volcker, P. 92, 93
von Weizsacker 151
VSGARCH (Volatility Switching GARCH) 147, 163
Vuong, Q. 236
Wachter, S. M. 38, 48
Wadhwani, S. 288
wage rates (UK) 66
Wallis, K. F. x, 62, 75
Walt Disney (DIS) trade data 130, 130 Tab. 7.4, 131 Tab. 7.5
Wang, I. 354
Warren, J. M. 120
Watson, M. W. x, 6, 13, 35, 47
Waugh, F. V. 182
Weak GARCH 163
Weibull distribution 342
Weide, R. van der 151
Weiss, A. 234
Werron, A. 158
West, K. D. 198, 206
Whaley, R. E. 328
White, H. xi, 2, 3, 4, 5, 9, 84, 231, 233, 237, 238, 239, 244, 246–7, 252
    citations 7
    student co-publishers 11
white noise and processes 2, 82
White Standard Error test 4, 80, 85 Fig. 5.3, 86 Tab. 5.2, 90–1, 92, 96
White TR2 test 85, 86, 86 Tab. 5.2, 87
Whittaker, R. 73, 76
Wiener processes 149
Wolf, M. 148
Wong, C. S. 156
Wooldridge, J. M. 122, 142, 155, 158
World Bank 99
World Development Indicators database (WDI) 99, 111
World Economic Outlook 195
World Federation of Exchanges 99
Xing, Y. 117
Yang, M. 154
Yilmaz, K. x–xi
Yixiao Sun 5
Yoldas, E. xi, 214, 215, 216, 217, 224, 230
Yuen, K. C. 156
Zakoian, J.-M. 120 n.3, 150, 161
ZARCH (Zakoian ARCH) see TGARCH
Zarnowitz, V. 76
Zellner, A. 2
Zhang, X. 156, 259 n.1, 262, 262 n.4
Zhou, B. 133
Zingales, L. 7
Zivot, E. 71