Statistical Inference for Time-Inhomogeneous

download Statistical Inference for Time-Inhomogeneous

of 27

Transcript of Statistical Inference for Time-Inhomogeneous

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    1/27

    arXiv:math/0

    406430v1

    [math.S

    T]22Jun2004

    The Annals of Statistics

    2004, Vol. 32, No. 2, 577602DOI: 10.1214/009053604000000102

    c Institute of Mathematical Statistics, 2004

    STATISTICAL INFERENCE FOR TIME-INHOMOGENEOUS

    VOLATILITY MODELS

    By Danilo Mercurio1

    and Vladimir Spokoiny

    Humboldt-Universitat zu Berlin and Weierstrass-Institute

    This paper offers a new approach for estimating and forecastingthe volatility of financial time series. No assumption is made aboutthe parametric form of the processes. On the contrary, we only sup-pose that the volatility can be approximated by a constant over someinterval. In such a framework, the main problem consists of filtering

    this interval of time homogeneity; then the estimate of the volatilitycan be simply obtained by local averaging. We construct a locallyadaptive volatility estimate(LAVE) which can perform this task andinvestigate it both from the theoretical point of view and throughMonte Carlo simulations. Finally, the LAVE procedure is applied toa data set of nine exchange rates and a comparison with a standardGARCH model is also provided. Both models appear to be capableof explaining many of the features of the data; nevertheless, the newapproach seems to be superior to the GARCH method as far as theout-of-sample results are concerned.

    1. Introduction. The aim of this paper is to offer a new perspective forthe estimation and forecasting of the volatility of financial asset returns such

    as stocks and exchange rate returns.A remarkable amount of statistical research is devoted to financial timeseries, in particular, to the volatility of asset returns, where the term volatil-ity indicates a measure of dispersion, usually the variance or the standarddeviation. The interest in this topic is motivated by the needs of the finan-cial industry, which regards volatility as one of the main reference numbersfor risk management and derivative pricing.

    Actually, asset returns time series display very peculiar stylized facts,which are connected with their second moments. Graphically, they look like

    Received June 2000; revised March 2003.1Supported by the Deutsche Forschungsgemeinschaft, Graduiertenkolleg fur Ange-

    wandte Mikrookonomik at Humboldt University.AMS 2000 subject classifications. Primary 62M10; secondary 62P20.Key words and phrases. Stochastic volatility model, adaptive estimation, local homo-

    geneity.

    This is an electronic reprint of the original article published by theInstitute of Mathematical Statistics in The Annals of Statistics,2004, Vol. 32, No. 2, 577602.This reprint differs from the original in paginationand typographic detail.

    1

    http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://arxiv.org/abs/math/0406430v1http://www.imstat.org/aos/http://dx.doi.org/10.1214/009053604000000102http://www.imstat.org/http://www.imstat.org/http://www.ams.org/msc/http://www.imstat.org/http://www.imstat.org/aos/http://dx.doi.org/10.1214/009053604000000102http://dx.doi.org/10.1214/009053604000000102http://dx.doi.org/10.1214/009053604000000102http://www.imstat.org/aos/http://www.imstat.org/http://www.ams.org/msc/http://www.imstat.org/http://dx.doi.org/10.1214/009053604000000102http://www.imstat.org/aos/http://arxiv.org/abs/math/0406430v1
  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    2/27

    2 D. MERCURIO AND V. SPOKOINY

    white noise, where periods of high and low volatility seem to alternate. Theirdensity has fat tails if compared to that of a normal random variable, andthey show significantly positive and highly persistent autocorrelation of theabsolute returns, meaning that large (resp. small) absolute returns are likelyto be followed by large (resp. small) absolute returns. Typical examples canbe seen in Section6, and further details on this topic can be found in Taylor(1986). Therefore, a white-noise process with time-varying variance is usuallytaken to model such features. LetStdenote the observed asset process. Thenthe corresponding (log) returns Rt= log(St/St1) follow the heteroscedasticmodel

    Rt= tt,

    where t are standard Gaussian independent innovations and t is a time-varyingvolatilitycoefficient. It is often assumed that t is measurable w.r.t.the -field generated by the preceding returns R1, . . . , Rt1. For modelingthis volatility process, parametric assumptions are usually used. The mainmodel classes are the ARCH and GARCH family [Engle (1995)] and thestochastic volatility models [Harvey, Ruiz and Shephard (1994)]. A largenumber of papers has followed the first publications on this topic, andthe original models have been extended in order to provide better explana-tions. For example, models which take into account asymmetries in volatilityhave been proposed, such as EGARCH [Nelson(1991)], QGARCH[Sentana(1995)] and GJR [Glosten, Jagannathan and Runkle (1993)]; furthermore,

    the research on integrated processes has produced integrated [Engle and Bollerslev(1986)] and fractal integrated versions of the GARCH model.

    The availability of very large samples of financial data has made it possi-ble to construct models which display quite complicated parameterizationsin order to explain all the observed stylized facts. Obviously, these mod-els rely on the assumption that the parametric structure of the processremains constant through the whole sample. This is a nontrivial and possi-bly dangerous assumption, in particular, as far as forecasting is concerned[Clements and Hendry(1998)]. Furthermore, checking for parameter insta-bility becomes quite difficult if the model is nonlinear and/or the numberof parameters is large. Thus, those characteristics of the returns, which areoften explained by the long memory and (fractal) integrated nature of the

    volatility process, could also depend on the parameters being time varying.In this paper we propose another approach focusing on a very simple

    model but with a possibility for model parameters to depend on time. Thismeans that the model is regularly checked and adapted to the data. Noassumption is made about the parametric structure of the volatility process.We only suppose that it can be locally approximated by a constant; thatis, for every time point there exists a past interval [ m, ] where the

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    3/27

    TIME-INHOMOGENEOUS VOLATILITY MODELS 3

    volatility t did not vary much. This interval is referred to as the interval oftime homogeneity. An algorithm is proposed for data-driven estimation ofthe interval of time homogeneity, after which the estimate of the volatilitycan be simply obtained by averaging.

    Our approach is similar to varying-coefficient modeling fromFan and Zhang(1999); see alsoCai, Fan and Li(2000) andCai, Fan and Yao(2000).Fan, Jiang, Zhang and Zh(2003) discussed applications of this method to stock price volatility mod-eling. The proposed procedure is based on the assumption that the modelparameters smoothly vary with time and can be locally approximated by alinear function of time. This approach has the drawback of not allowing oneto incorporate structural breaks into the model.

    Change point modeling with applications to financial time series was

    considered in Mikosch and Starica (2000). Kitagawa (1987) applied non-Gaussian random walk modeling with heavy tails as the prior for the piece-wise constant mean for one-step-ahead prediction of nonstationary time se-ries. However, the aforementioned approaches require some essential amountof prior information about the frequency of change points and their size.

    The LAVE approach proposed in this article does not assume smooth orpiecewise constant structure of the underlying process and does not requireany prior information. The procedure proposed below in Section 3 focuseson adaptive choice of the interval of homogeneity that allows one to proceedin a unified way with smoothly varying coefficient models and change pointmodels.

    The proposed approach attempts to describe the local dynamic of thevolatility process, and it is particularly appealing for short-term forecastingpurposes which is an important building block, for example, in value-at-riskand portfolio hedging problems or backtesting[Hardle and Stahl(1999)].

    The remainder of the paper is organized as follows. Section 2 introducesthe adaptive modeling procedure. Then some theoretical properties are dis-cussed in the general situation and for a change point model. A simulationstudy illustrates the performance of the new methodology with respect tothe change point model. The question of selecting the smoothing parametersis also addressed and some solutions are proposed. Finally, the procedureis applied to a set of nine exchange rates and it appears to be highly com-petitive with standard GARCH(1, 1), which is used as a benchmark model.

    Mathematical proofs are given in Section 8.

    2. Modeling volatility via power transformation. LetSt be an observedasset process in discrete time, t= 1, 2, . . . , and Rt are the correspondingreturns: Rt= log(St/St1). We model this process via the conditional het-eroscedasticity assumption

    Rt= tt,(2.1)

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    4/27

    4 D. MERCURIO AND V. SPOKOINY

    where t, t 1, is a sequence of independent standard Gaussian randomvariables and t is the volatility process which is in general a predictablerandom process, that is, t Ft1 withFt1= (R1, . . . , Rt1) (the -fieldgenerated by the first t 1 observations).

    A time-homogeneous (time-homoscedastic) model means that t is a con-stant. The process St is then a geometric Brownian motion observed atdiscrete time moments. The assumption of time homogeneity is too restric-tive in practical applications, and it does not allow one to fit real data verywell. In this paper, we consider an approach based on the local time ho-mogeneity, which means that for every time moment there exists a timeinterval [m, ] where the volatility process t is nearly constant. Undersuch a modeling, the main intention is both to describe the interval of ho-mogeneity and to estimate the corresponding value which can then beused for one-step forecasting and the like.

    2.1. Data transformation. The model equation (2.1) links the targetvolatility processtwith the observationsRt via the multiplicative errors t.The classical well-developed regression approach relies on the assumption ofadditive errors which can then be smoothed out by some kind of averag-ing. A natural and widespread method of transforming equation (2.1) into aregression-like equation is to apply the log function to both its sides squared:

    log R2t = log 2t +log

    2t ,(2.2)

    which can be rewritten in the form

    log R2t = log 2t + C+ vt,

    with C= E log 2t ,v2 = Var log 2t andt= v

    1(log 2tC); see, for example,Gourieroux(1997). This is a usual regression equation with the responseYt= log R

    2t , target regression function f(t) = log

    2t +Cand homogeneous

    noise vt.The main problem with this approach is due to the distribution of the

    errors t, which is highly skewed and gives very high weights to the smallvalues of the errors t. In particular, this leads to a serious problem withmissing data which are typically modeled equal to previous values providingRt= 0.

    Another possibility is based on power transformation [seeCarroll and Ruppert(1988)] which also leads to a regression with additive noise and this noiseis much closer to a Gaussian one. Due to (2.1), the random variable Rt isconditionally onFt1 Gaussian and

    E(R2t |Ft1) = 2t .

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    5/27

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    6/27

    6 D. MERCURIO AND V. SPOKOINY

    within an interval I= [m, ], and the process Rt follows the regression-like equation (2.3) with the constant trendI= C

    Iwhich can be estimated

    by averaging over this interval I:

    I= 1

    |I|tI

    |Rt|.(3.1)

    For the particular case = 2 the estimateIcoincides with the local maxi-mum likelihood estimator (MLE) of the volatility 2t considered inFan, Jiang, Zhang and Zhou(2003). As discussed in the previous section, a smaller value of might bepreferred for improving the stability of the method. Similarly toFan, Jiang, Zhang and Zhou(2003), one can also incorporate the one-sided kernel weighting to this esti-mator.

    By (2.3)

    I=C|I|tI

    t +

    D|I|tI

    tt=

    1

    |I|tI

    t+s|I|tI

    tt,(3.2)

    with s= D/Cso that

    EI= E1

    |I|tI

    t,(3.3)

    s2|I|2E

    tI

    tt

    2=

    s2|I|2E

    tI

    2t .(3.4)

    3.1. Some properties of the estimateI. Due to our assumption of localhomogeneity, the process t is close to for all t I. Define also

    I= suptI

    |t | and v2I=s2|I|2

    tI

    2t .

    The value of Imeasures the departure from homogeneity within the in-terval I, and it can be regarded as an upper bound of the bias of theestimateI. The value ofv

    2I, because of (3.4), will be referred as the condi-

    tional variance of the estimate I. The next theorem provides a probabilitybound for the estimation error, that is, the deviation ofI from the presentvalue of the volatility in terms of I and vI.

    Theorem 3.1. Let the volatility coefficientt satisfy the condition

    b t bB,(3.5)with some positive constants b, B. Then there exists a>0 such that, forevery 1,

    P(|I |> I+ vI) 4

    ea1 (1+log B)e2/(2a).

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    7/27

    TIME-INHOMOGENEOUS VOLATILITY MODELS 7

    Remark 3.1. This result can be slightly refined for the special casewhen the volatility process t for t I is deterministic or (conditionally)independent of the observations Rt precedingI. Namely, in such a situationthe factor 4

    ea1 (1+log B) in the bound can be replaced by 2:

    P(|I | > I+ vI) 2e2/(2a).A similar remark applies to all the results that follow.

    The result of this theorem bounds the loss of the estimate I via thevalue I and the conditional standard deviation vI. Under homogeneityI 0 and the error of estimation is of ordervI. Unfortunately,vI depends,in turn, on the target process t. One would be interested in another boundwhich does not involve the unknown function t. Namely, using (3.4) andassuming Ismall, one may replace the conditional standard deviation vIby its estimate

    vI= sI|I|1/2.

    Theorem 3.2. LetR1, . . . , Robey (2.1) and let (3.5) hold true. Then,for the estimateI of for everyD 0 and 1,

    P(|I|> vI,I/vID) 4

    e(1+log B)e2/(2a),

    where solves

    + D= /(1 + s|I|1/2).3.2. Adaptive choice of the interval of homogeneity. Given observations

    R1, . . . , R following the time-inhomogeneous model (2.1), we aim to es-timate the current value of the parameter using the estimate I with aproperly selected time intervalIof the form [m, ] to minimize the corre-sponding estimation error. Below we discuss one approach which goes back tothe idea of pointwise adaptive estimation; seeLepski(1990),Lepski and Spokoiny(1997) and Spokoiny (1998). The idea of the method can be explained asfollows. Suppose I is an interval candidate; that is, we expect time homo-geneity in I and, hence, in every subinterval ofI. This particularly impliesthat the value Iis small and similarly for all J,J

    I, and that the mean

    values of the t over Iand over Jnearly coincide. Our adaptive procedureroughly means the choice of the largest possible interval I such that thehypothesis that the value t is a constant within I is not rejected. For test-ing this hypothesis, we consider the family of subintervals ofIof the formJ= [m, ] with m < m and for every such subinterval J compare twodifferent estimates: one is based on the observations from J, and the otherone is calculated from the complement I\ J= [ m, m[. Theorems

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    8/27

    8 D. MERCURIO AND V. SPOKOINY

    3.1and 3.2 can be used to bound the difference J I\Junder homogene-ity within I. Indeed, the conditional variance ofI\JJ is v2I\J+ v2J andcan be estimated by v2I\J+ v

    2J. Thus, with high probability it holds that

    |I\JJ|

    v2I\J+ v2J,

    provided that is sufficiently large. Therefore, if there exists a testing inter-valJ Isuch that the quantity|I\J J| is significantly positive, then wereject the hypothesis of homogeneity for the intervalI. Finally, our adaptiveestimate corresponds to the largest interval I such that the hypothesis ofhomogeneity is not rejected for I itself and all smaller considered intervals.

    Now we present a formal description. Suppose a family

    Iof interval can-

    didates I is fixed. Each of them is of the form I= [ m, ], m N, sothat the setI is ordered due to m. With every such interval, we associatethe estimate I ofand the corresponding estimate vIof the conditionalstandard deviations vI.

    Next, for every interval I fromIwe assume there is a setJ(I) of testingsubintervals J[one example of these setsI andJ(I) is given in Section 6].For every J J(I) we construct the corresponding estimateJ (resp. I\J)from the observations Yt=|Rt| for t J (resp. for t I\ J) accordingto (3.1) and compute vJ (resp. vI\J).

    Now, with a constant , define the adaptive choice of the interval ofhomogeneity by the following iterative procedure:

    Initialization. Select the smallest interval inI.

    Iteration. Select the next intervalI inIand calculate the correspondingestimate Iand the estimated conditional standard deviation vI.

    Testing homogeneity. Reject I if there exists one JJ(I) such that|I\JJ| >

    v2I\J+ v

    2J.(3.6)

    Loop. IfIis not rejected, then continue with the iteration step by choos-ing a larger interval. Otherwise, set I= the latest nonrejected I.

    The locally adaptive volatility estimate (LAVE) of is defined by

    applying this selected interval I:

    = I.

    The next section discusses the theoretical properties of the LAVE algorithmin a general framework, while Section 6 gives a concrete example for thechoice of the setsI,J(I) and the parameter . This choice is then appliedto simulated and real data.

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    9/27

    TIME-INHOMOGENEOUS VOLATILITY MODELS 9

    4. Theoretical properties. In this section we collect some results describ-ing the quality of the proposed adaptive procedure.

    4.1. Accuracy of the adaptive estimate. Let I be the interval selectedby our adaptive procedure. We aim to show that our adaptive choice is upto some constant factor in the losses as good as the ideal choice I thatmay utilize the knowledge of the volatility process t. This ideal choicecan be defined by balancing the accuracy of approximating the underlyingprocess t (which is controlled by I) and the stochastic error controlled bythe stochastic standard deviation vI. By definition,vI= s|I|1(tI2t )1/2so that vI typically decreases when|I| increases. For simplicity of notationwe shall suppose further that v

    IvJ

    for J

    I.We do not give a formal definition of an ideal choice of the interval I

    since there is no one universally optimal choice even if the process t isknown. Instead, we consider a family of all good intervals I such thatthe variability of the process t inside I is not too large compared to theconditional stochastic deviation vI. This, due to Theorem 3.1, allows us tobound with high probability the losses of the ideal estimate Iby (D + )vIprovided that I/vI Dandis sufficiently large. A similar property shouldhold for all smaller intervals I I. Hence, it is natural to quantify the qualityof the interval I by

    I= supII:II

    I/vI.

    The next assertion claims that the risk of the adaptive estimate is not largerin order than vI for all I such that I is sufficiently small.

    Theorem 4.1. Let (3.5) hold true. Let an interval I be such that, forsomeD 0, it holds with positive probability I D. Then

    P( I is rejected, I D)

    II(I)

    JJ(I)

    12

    eJ(1+log B)e(JD)

    2/(2a),(4.1)

    whereJ= (1 sN1/2J ) withNJ= min{|J|, |I\J|}.Moreover, ifNJ 2sfor allJJ(I)and allII, then it holds for the

    adaptive estimate=Ion the random setA = {I is not rejected, I D}:|I I| 2vI

    and

    |I | (D + 3 + 2s(D+ )|I|1/2)vI.

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    10/27

    10 D. MERCURIO AND V. SPOKOINY

    Remark 4.1. It is easy to see that the sum on the right-hand side of thebound (4.1) can be made arbitrarily small by proper choice of the constant and the setsJ(I). Hence, the result of the theorem claims that witha dominating probability a good interval I will not be rejected and theadaptive estimate is up to a constant factor as good as any of the goodestimates I.

    Remark 4.2. As mentioned in Remark3.1, the probability bound onthe right-hand side of (4.1) can be refined for the special case when the pro-

    cesstis constant within Iby replacing the factor 12

    eJ(1+log B)e(JD)

    2/(2a)

    by 6e2J/(2a).

    5. Change point model. Achange pointmodel is described by a sequenceT1 < T2

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    11/27

    TIME-INHOMOGENEOUS VOLATILITY MODELS 11

    Theorem 5.1. If I= [m, ] is an interval of homogeneity, that is,t= for all t I, then

    P(I is rejected)

    II(I)

    JJ(I)

    6exp

    2

    2a(1 + s|J|1/2)2

    .

    This result is a special case of Theorem4.1with J 0 when taking intoaccount Remark4.2.

    Theorem4.1implies that for every fixed value M there exists a fixed providing a prescribed upper bound for the probability of a false alarmfor a homogeneous interval Iof length M. Namely, the choice

    (1 + )

    2alog Mm0

    leads for a proper small positive constant > 0 to the inequality

    II(I)

    JJ(I)

    6exp

    2

    2a(1 + s|J|1/2)2 .

    Here, M/m0 is approximately the number of intervals inJ(I) (see Sec-tion 6.1). This bound is, however, very rough, and it is only of theoreticalimportance since we estimate the probability of the sum of dependent eventsby the sum of single probabilities. The value of providing a prescribed

    probability of a false alarm can be found by Monte Carlo simulation forthe homogeneous model with constant volatility as described in Section 6.

    5.2. Sensitivity to change points and the mean delay. The quality (sen-sitivity) of a change point procedure is usually measured by the mean delaybetween the occurrence of a change point and its detection.

    To study this property of the proposed method, we consider the case ofestimation at a pointimmediately after a change point Tcp. It is convenientto suppose thatTcpbelongs to the end points of an interval which is tested forhomogeneity. In this case the ideal choice Iis clearly [Tcp, ]. Theorem4.1claims that the quality of estimation at is essentially the same as if weknew the latest change point Tcp a priori. In fact, one can state a slightly

    stronger assertion: every interval Iwhich is essentially larger than Iwill berejected with high probability provided that the magnitude of the change islarge enough.

    Denotem = |I|, that is,m = Tcp. Let alsoI= [Tcpm, ] = [mm, ] for some m, so that|I| = m + m, and let (resp. ) denote the valueof the parametertbefore (resp. after) the change point Tcp. The magnitudeof the change point is measured by the relative change b = 2| |/.

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    12/27

    12 D. MERCURIO AND V. SPOKOINY

    It is worth mentioning that the valuest and especially t can be randomand dependent on past observations. For instance, t may depend on Yt forall t < Tcp.

    The interval Iwill certainly be rejected if|I\I I| is sufficiently largecompared to the corresponding critical value.

    Theorem 5.2. Let E(Yt|Ft1) = before the change point at Tcp andE(Yt|Ft1) = after it, and let b = | |/. Let I= [m m, ] withm = Tcp. If := s/

    min{m, m}< 1 and

    b

    2 +

    2(1 + )

    1 ,(5.1)

    thenP(I is not rejected) 4e2/(2a).

    The result of Theorem 5.2 delivers some additional information aboutthe sensitivity of the proposed procedure to change points. One possiblequestion is about the minimal delay m between the change point Tcp andthe first momentwhen the procedure starts to indicate this change point byselecting an interval of type I= [Tcp, ]. Due to Theorem5.2, the change will

    be detected with high probability if the value =s/

    m fulfills (5.1).With fixedb > 0, condition (5.1) leads to bC0for some fixed constantC0.The latter condition can be rewritten in the form m

    b2

    2

    s2/C

    20 . We see

    that this lower bound for the required delay m is proportional tob2, whereb is the change point magnitude. It is also proportional to the threshold squared. In turn, for the prescribed probability of rejecting a homogeneousinterval of length M, the threshold can be bounded by C

    log(M/m0).

    In particular, if we fix the length M and , then m =O(b2). If we keepfixed the values b and M but aim to provide a very small probability of afalse alarm by letting go to 0, then m = O(log 1). All these issues arein agreement with the theory of change point detection; see, for example,Pollak (1985) andBrodsky and Darkhovsky(1993).

    6. LAVE in practice. The aim of this section is to give some hints con-cerning the choice of the testing intervals and the smoothing parameter and to illustrate the performance of the LAVE procedure on simulated andreal data. We consider the simplest homogeneous model and we study thestability of the procedure in such a situation. Then a change point modelis analyzed and the sensitivity with respect to the jump magnitude is mea-sured. Finally, LAVE is applied to a set of exchange rate data.

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    13/27

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    14/27

    14 D. MERCURIO AND V. SPOKOINY

    Table 1The value of, which, for a given power transformation, provides the

    rejection of an interval of time homogeneity of lengthM with a frequency

    of5%

    Smoothing parameter

    = 0.5 = 1.0 = 2.0

    M= 80 M= 40 M= 80 M= 40 M= 80 M= 40 = 2.74 = 2.40 = 2.58 = 2.24 = 2.18 = 1.86

    6.3. Simulation results for the change point model. We now evaluate the

    performance of the LAVE algorithm on simulated data. Two change pointtime series of length T = 240 are considered. The simulated data displaytwo jumps of the same magnitude in opposite directions: t = for t[1, 80] and t [161, 240] and t= for t [81, 160], where = 1 and = 3and 5, respectively. For each model 500 realizations are generated, and theestimation is performed at each time pointt [t0, 240], wheret0 is set equalto 20.

    We compute the estimation error for each combination of and withthe following criterion:

    240t=20

    500=1

    t t

    t

    2(),(6.1)

    where the index indicates the realizations of the change point model. Wenote that in (6.1) the quadratic error is divided by the true volatility sothat the criterion does not depend on the scale oft. The results shown inTable 2 are favorable to the choice of the smaller value of, confirming thatthe loss of efficiency caused by

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    15/27

    TIME-INHOMOGENEOUS VOLATILITY MODELS 15

    Table 2Estimation errors for all the combinations of parameters and

    Estimation error

    = 0.5 = 1.0 = 2.0

    Parameter = 2.74 = 2.40 = 2.58 = 2.24 = 2.18 = 1.86

    Small jump 19,241.9 17,175.3 19,121.2 16,522.5 24,887.2 17,490.9Large jump 46,616.2 43,282.5 51,363.9 46,706.4 68,730.7 55,706.3

    points. The results for other power transformations look very similar and

    therefore are not displayed.

    6.4. Estimation of exchange rate volatility. We apply the LAVE proce-dure to a set of nine exchange rates, which are available from the web sitehttp://federalreserve.gov of the U.S. Federal Reserve. The data sets rep-resent daily exchange rates of the U.S. dollar (USD) against the followingcurrencies: Australian dollar (AUD), British pound (BPD), Canadian dollar

    Fig. 2. Estimation results for the change point model. The upper plots show the values of

    the standard deviation, while the lower plots show the values of the interval of homogeneity

    at each time point. True values (solid line), median of all estimates (thick dotted line),upper and lower quartiles (thin dotted lines). The value of for= 0.5 andM= 40 hasbeen used.

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    16/27

    16 D. MERCURIO AND V. SPOKOINY

    Fig. 3. Estimation results for the jump model. The value of for= 0.5 andM= 80has been used.

    (CAD), Danish krone (DKR), Japanese yen (JPY), Norwegian krone (NKR),New Zealand dollar (NZD), Swiss franc (SFR) and Swedish krona (SKR).

    The period under consideration goes from January 1, 1990, to April 7, 2000.See Table 3.All the time series show qualitatively almost the same pattern; there-

    fore, we provide the graphical example only for the two representative ex-change rates JPY/USD and BPD/USD (Figure 4). The empirical mean ofthe returns is close to 0, while the empirical kurtosis is larger than 3. Fur-

    Table 3

    Summary statistics

    Currency n Mean 105 Variance 105 Skewness Kurtosis

    AUD 2583 10.41 3.191 0.187 8.854BPD 2583 0.679 3.530 0.279 5.792CAD 2583 8.819 0.895 0.042 5.499DKR 2583 6.097 4.201 0.037 4.967JPY 2583 12.70 5.486 0.585 7.366NKR 2583 9.493 4.251 0.313 8.630NZD 2583 6.581 3.604 0.356 49.17SFR 2583 1.480 5.402 0.186 4.526SKR 2583 12.66 4.615 0.372 9.660

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    17/27

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    18/27

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    19/27

    TIME-INHOMOGENEOUS VOLATILITY MODELS 19

    Table 4Forecast performance of LAVE relative to GARCH

    = 0.5 = 1.0 = 2.0

    Currency M = 80 M = 40 M = 80 M = 40 M = 80 M = 40

    AUD 0.942 0.945 0.963 0.962 0.991 0.982BPD 0.961 0.960 0.979 0.970 1.006 1.000CAD 0.974 0.979 0.989 0.992 1.010 0.997DKR 0.978 0.980 0.985 0.987 1.010 1.004JPY 0.951 0.949 0.971 0.966 1.006 0.997NKR 0.961 0.957 0.972 0.965 0.998 0.984NZD 0.878 0.879 0.904 0.902 0.952 0.947SFR 0.985 0.984 0.992 0.990 1.004 1.000

    SKR 0.965 0.961 0.973 0.968 0.982 0.977

    E(R2t+1|Ft) =2t+1. Therefore, given a forecast t+1|t, the empirical meanvalue of|R2t+1 2t+1|t|p can be used to measure the quality of this forecast.The forecast ability of the LAVE and the GARCH estimates is thereforeevaluated with the following criterion:

    1

    T t0 1T

    t=t0

    |R2t+1 2t+1|t|p with p = 0.5.

    The value ofp= 0.5 is chosen instead of the more common p= 2 because

    we are interested in a robust criterion which is not too sensitive to the pres-ence of outliers. The relative performance of the LAVE and the GARCHestimates is displayed in Table 4. The performance of the LAVE approachis clearly better; furthermore, the table gives a clear hint for the choice ofthe power transformation. Indeed, = 0.5 provides the smallest forecast-ing errors, while = 2.0 leads to the largest forecasting errors, which aresometimes larger than that of the GARCH model.

    7. Conclusions and outlook. The locally adaptive volatility estimate (LAVE)is described and analyzed in this paper. It provides a nonparametric way forestimating and short-term forecasting the volatility of financial returns.

    It is assumed that a local constant approximation of the volatility process

    holds over some unknown interval. The issue of filtering this interval of timehomogeneity out of the return time series is considered, and a nonparametricapproach is presented. The estimate of the volatility process is then foundby averaging over the interval of time homogeneity.

    A theoretical analysis of the properties of the LAVE algorithm is providedand the problem of selecting the smoothing parameters is analyzed throughMonte Carlo simulation. The estimation results on change point models

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    20/27

    20 D. MERCURIO AND V. SPOKOINY

    show that the method has reasonable performance in practice. An empiricalapplication to exchange rate returns and a comparison with a GARCH(1, 1)also provide good evidence that the new method is competitive and can evenoutperform the standard parametric models, especially for forecasting witha short horizon.

    An important feature of the proposed method is that it allows for astraightforward extension to multivariate volatility estimation; seeHardle, Herwartz and Spokoi(2000) for a detailed discussion.

    Obviously, if the underlying conditional distribution is not normal, the es-timated volatility can give only partial information about the riskiness of theasset. Recent developments in risk analysis tend to focus on the estimationof the quantiles of the distribution. In this direction, the LAVE procedure

    can be used as a convenient tool for prewhitening the returns and obtaininga sample of almost identical and independently distributed returns, whichdo not display any more variance clustering. Therefore, the usual techniquesof quantile estimation could be applied in a static framework. We regardsuch a development as a topic for future research.

    8. Proofs. In this section, we collect the proofs of the results statedabove. We begin by considering some useful properties of the power trans-formation introduced in Section2.1.

    Some properties of the power transformation. Letg(u) be the momentgenerating function of= D

    1 (||C):

    g(u) = Eeu.It is easy to see that this function is finite for < 2 and all uand for = 2and u < 1. For = 1/2, the function 2u2 log g(u) is plotted in Figure 6.

    Lemma 8.1. For every 1 there exists a constanta> 0 such that

    logEeuau2

    2 .(8.1)

    Fig. 6. The log-Laplace transform of1/2 divided by the log-Laplace transform of a stan-

    dard normal r.v.

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    21/27

    TIME-INHOMOGENEOUS VOLATILITY MODELS 21

    Proof. It is easy to check that the functiong(u) with 1 is positiveand smooth (infinitely many times differentiable). Moreover, the functionh(u) = log g(u) is also smooth and satisfies h(0) =h

    (0) = 0, h

    (0) =

    E2= 1. This yields thatu2h(u) = u

    2 log g(u) is bounded on every finiteinterval of the positive semiaxis [0,). It therefore remains to show that

    limu

    u2 logEeu 0,

    Eeu||D1 = Eeu||

    D1 1(

    |

    | t) + Eeu||

    D1 1(

    |

    |> t)

    eutD1 + Eeu||t1D1 eutD1 + 2Eeut1D1=eut

    D1 + 2eu2t22D2 .

    Next, with t = u1/(2) and < 1, for u,u2 log eut

    D1 =u1/2D1 0,u2 log eu

    2t22D2 =u(1)/D2 0.For = 1, the last expression remains bounded and the assertion follows.

    For = 1/2, condition (8.1) is satisfied with a= 1.005.The next technical statement is a direct consequence of Lemma 8.1.

    Lemma 8.2. Letct be a predictable process w.r.t. the filtrationF= (Ft);that is, every ct is a function of previous observations R1, . . . , Rt1 : ct =ct(R1, . . . , Rt1). Then the process

    Et= exp

    ts=1

    cssa2

    ts=1

    c2s

    is a supermartingale, that is,

    E(Et|Ft1) Et1.(8.2)

    The next result has been stated in Liptser and Spokoiny(2000) for Gaus-sian martingales; however, the proof is based only on the property (8.2) andallows for straightforward extension to sums of the form Mt=

    ts=1 css.

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    22/27

    22 D. MERCURIO AND V. SPOKOINY

    Theorem 8.1. LetMt=

    ts=1 css with predictable coefficientscs. ThenletTbe fixed or a stopping time. For everyb > 0, B 1 and 1,

    P(|MT|> MT, b

    MT bB) 4

    e(1+log B)e

    2/(2a),

    where

    MT=Tt=1

    c2t .

    Remark 8.1. If the coefficients ct are deterministic or independentofM, then Lemma8.1and the Chebyshev inequality yield

    P(|MT|> MT) 2e

    2/(2a)

    .

    Proof of Theorem3.1. Define

    I= 1

    |I|tI

    t, I= s|I|1tI

    tt.

    Then I=I+ I. By the definition of I,

    |I |= |I|1tI

    (t ) I.(8.3)

    Next, by (3.2)

    I

    =I

    + I

    ,

    and the use of (8.3) yields

    P(|I |> I+ vI)PtI

    tt

    >

    tI

    2t

    1/2.

    In addition, if the volatility coefficient t satisfies b 2t bB with somepositive constants b, B, then the conditional variance v2I=s

    2|I|2

    tI

    2t

    satisfies

    b|I|1 v2I b|I|1B,with b = bs2. Now the assertion follows from (3.5) and Theorem8.1.

    Proof of Theorem3.2. It suffices to show that the inequalities I/vID and

    |I|= |II| vI(8.4)imply |I| vI, where solves the equationD + = /(1+s|I|1/2).This would yield the desired result by Theorem 8.1; compare the proof ofTheorem3.1.

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    23/27

    TIME-INHOMOGENEOUS VOLATILITY MODELS 23

    Lemma 8.3. Let(I/vI)s|I|1/2 < 1. Under (8.4)vI vI

    1 (I/vI)2s2|I|1 s|I|1/2

    vI(1 s|I|1/2(I/vI+ )).

    Proof. By the definition of vI in view of (8.4),

    vI= sI|I|1/2 s(IvI)|I|1/2.Since I is the arithmetic mean oft over I,

    tI

    (tI)2 tI

    (t )2 2I|I|.

    Nexts2 |I|v2I= |I|1

    tI

    2t =2I+ |I|1

    tI

    (t I)2 2I+ 2I,

    so that

    I s1 |I|1/2vI

    1 (Isv1I |I|1/2)2.Hence, under (8.4),

    vI vI

    1 (Isv1I |I|1/2)2 s|I|1/2

    ,

    and the assertion follows.

    The bound (8.4) and the definition of I imply

    |I | |I |+ |II| I+ vI (D+ )vI.By Lemma8.3, vI vI(1 sD|I|1/2 s|I|1/2). Thus,

    |I | D + 1 s(D+ )|I|1/2 vI=

    vI

    as required.

    Proof of Theorem4.1. Let I be a good interval in the sense that,with high probability, J/vJ

    D for some nonnegative constant D and ev-

    eryJ J(I). First we show that Iwill not be rejected with high probabilityprovided that is sufficiently large.

    We proceed similarly as in the proofs of Theorems 3.1 and3.2. The pro-cedure involves the estimates J, I\Jand the differences JI\J for allII(I) and all J J(I). The expansion J= J+ J implies

    JI\J= (JI\J) + (J I\J).

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    24/27

    24 D. MERCURIO AND V. SPOKOINY

    Under the conditionID,|J I\J| IDvID

    v2J+ v

    2I\J.

    Define the events

    AI=

    JJ(I)

    |J I\J| (JD)

    v2J+ v

    2I\J

    and

    v2J+ v2I\Jv2J+ v

    2I\J

    1 sN1/2J

    ,

    AI=

    II:IIAI,

    where NJ= min{|J|, |I\J|} and J= (1 sN1/2J ).DefineA

    I= AI {I D}. On this set

    |J I\J|v2J+ v

    2I\J

    |J I\J|+ |J I\J|v2J+ v

    2I\J

    (D+ JD)

    v2J+ v2I\J

    v2J+ v2I\J

    D + JD1 sN1/2J

    = .

    It is easy to see that the conditional variance ofJI\Jis equal tov2J+ v2I\J.Arguing similarly to Lemma8.3and Theorem3.1, we bound, with J,D=JD,

    P(AI)

    JJ(I)

    P

    |J|vJ

    > J,D

    +P

    |I\J|vI\J

    > J,D

    +P

    |J I\J|v2J+ v

    2I\J

    > J,D

    JJ(I)

    12

    eJ(1+log B)e2J,D/(2a),

    and the first assertion of the theorem follows.Now we show that on the set A

    I the estimate =I satisfies| I|

    2vI.Due to the above, on A

    I the interval I will not be rejected and, hence

    |I| |I|. Let Ibe an arbitrary interval fromIwhich is not rejected by theprocedure. By construction I is one of the testing intervals for I. Denote

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    25/27

  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    26/27

    26 D. MERCURIO AND V. SPOKOINY

    see the proof of Theorem 3.1.Hence

    IJ= b + I J.It is straightforward to see that v2J= s

    2/mand v

    2I= s2

    /m. By Lemma8.1(see also Remark8.1)

    P(|J|> vJ) +P(||> vI) 4e2/(2a),and it suffices to check that the inequalities|J| vJ,|I| vI and (5.1)imply

    |J I| v2J+ v2I .Since 1 = b and since vJ=s|J|1/2J and similarly for vI, we haveunder the conditions|J| vJ,|I| vI,

    |J I| bs( + 1)m

    = b(1 ) 2,

    vJ= s

    m(1 + J) 1(1 + ),

    vI= s

    m(1 + I) 1(1 + ),

    with = m1/2s. Therefore,

    |J I|

    v2J+ v2I b(1 ) 22(1 + ) > 0

    in view of (5.1), and the assertion follows.

    REFERENCES

    Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. J.Econometrics 31307327. MR853051

    Brodsky, B. and Darkhovsky, B. (1993). Nonparametric Methods in Change-PointProblems. Kluwer, Dordrecht.

    Cai, Z., Fan, J. and Li, R. (2000). Efficient estimation and inferences for varying-coefficientmodels. J. Amer. Statist. Assoc. 95 888902. MR1804446

    Cai, Z., Fan, J. and Yao, Q. (2000). Functional-coefficient regression models for nonlineartime series. J. Amer. Statist. Assoc. 95941956. MR1804449

    Carroll, R. and Ruppert, D. (1988). Transformation and Weighting in RegressionChap. 4. Chapman and Hall, New York. MR1014890

    Clements, M. P. and Hendry, D. F. (1998). Forecasting Economic Time Series. Cam-bridge Univ. Press. MR1662973

    Engle, R. F., ed. (1995). ARCH, Selected Readings. Oxford Univ. Press.Engle, R. F. and Bollerslev, T. (1986). Modelling the persistence of conditional vari-

    ances (with discussion). Econometric Rev. 5 187.MR876792

    http://www.ams.org/mathscinet-getitem?mr=853051http://www.ams.org/mathscinet-getitem?mr=1804446http://www.ams.org/mathscinet-getitem?mr=1804449http://www.ams.org/mathscinet-getitem?mr=1014890http://www.ams.org/mathscinet-getitem?mr=1662973http://www.ams.org/mathscinet-getitem?mr=876792http://www.ams.org/mathscinet-getitem?mr=876792http://www.ams.org/mathscinet-getitem?mr=1662973http://www.ams.org/mathscinet-getitem?mr=1014890http://www.ams.org/mathscinet-getitem?mr=1804449http://www.ams.org/mathscinet-getitem?mr=1804446http://www.ams.org/mathscinet-getitem?mr=853051
  • 8/12/2019 Statistical Inference for Time-Inhomogeneous

    27/27

    TIME-INHOMOGENEOUS VOLATILITY MODELS 27

    Fan, J., Jiang, J., Zhang, C. and Zhou, Z. (2003). Time-dependent diffusion modelsfor term structure dynamics. Statistical applications in financial econometrics. Statist.Sinica13 965992. MR2026058

    Fan, J. and Zhang, W. (1999). Statistical estimation in varying coefficient models. Ann.Statist. 27 14911518. MR1742497

    Franses, P. and van Dijk, D. (1996). Forecasting stock market volatility using (non-linear) GARCH models. J. Forecasting 15229235.

    Glosten, L., Jagannathan, R. and Runkle, D. (1993). On the relation between theexpected value and the volatility of the nominal excess return on stocks. J. Finance4817791801.MR1242517

    Gourieroux, C. (1997). ARCH Models and Financial Applications. Springer, Berlin.MR1439744

    Hardle, W., Herwartz, H. and Spokoiny, V. (2000). Time inhomogeneous multiplevolatility modelling. Unpublished manuscript.

    Hardle, W. and Stahl, G. (1999). Backtesting beyond var. Technical Report 105, Son-derforschungsbereich 373, Humboldt Univ., Berlin.

    Harvey, A., Ruiz, E. and Shephard, N. (1994). Multivariate stochastic variance models.Rev. Econom. Stud. 61247264.

    Kitagawa, G. (1987). Non-Gaussian state-space modelling of nonstationary time series(with discussion). J. Amer. Statist. Assoc. 82 10321063. MR922169

    Lepski, O. (1990). On a problem of adaptive estimation in Gaussian white noise. TheoryProbab. Appl. 35 454466. MR1091202

    Lepski, O. and Spokoiny, V. (1997). Optimal pointwise adaptive methods in nonpara-metric estimation. Ann. Statist. 25 25122546. MR1604408

    Liptser, R. and Spokoiny, V. (2000). Deviation probability bound for martingales withapplications to statistical estimation. Statist. Probab. Lett. 46347357. MR1743992

    Mikosch, T. and Starica, C. (2000). Change of structure in financial time series, longrange dependence and the garch model. Technical Report 58, Aarhus School of Business,

    Aarhus Univ.Nelson, D. B. (1991). Conditional heteroscedasticity in asset returns: A new approach.

    Econometrica59 347370. MR1097532Pollak, M. (1985). Optimal detection of a change in distribution. Ann. Statist. 13 206

    227.MR773162Sentana, E. (1995). Quadratic ARCH models. Rev. Econom. Stud. 62 639661.Spokoiny, V. (1998). Estimation of a function with discontinuities via local p olynomial

    fit with an adaptive window choice. Ann. Statist. 2613561378. MR1647669Taylor, S. J. (1986). Modelling Financial Time Series. Wiley, Chichester.

    Humboldt-Universitat zu Berlin

    Spandauer Strae 1

    10178 Berlin

    Germany

    e-mail: [email protected]

    Weierstrass-Institute

    Mohrenstrasse 39

    10117 Berlin

    Germany

    e-mail:[email protected]

    http://www.ams.org/mathscinet-getitem?mr=2026058http://www.ams.org/mathscinet-getitem?mr=1742497http://www.ams.org/mathscinet-getitem?mr=1242517http://www.ams.org/mathscinet-getitem?mr=1439744http://www.ams.org/mathscinet-getitem?mr=922169http://www.ams.org/mathscinet-getitem?mr=1091202http://www.ams.org/mathscinet-getitem?mr=1604408http://www.ams.org/mathscinet-getitem?mr=1743992http://www.ams.org/mathscinet-getitem?mr=1097532http://www.ams.org/mathscinet-getitem?mr=773162http://www.ams.org/mathscinet-getitem?mr=1647669mailto:[email protected]:[email protected]:[email protected]:[email protected]://www.ams.org/mathscinet-getitem?mr=1647669http://www.ams.org/mathscinet-getitem?mr=773162http://www.ams.org/mathscinet-getitem?mr=1097532http://www.ams.org/mathscinet-getitem?mr=1743992http://www.ams.org/mathscinet-getitem?mr=1604408http://www.ams.org/mathscinet-getitem?mr=1091202http://www.ams.org/mathscinet-getitem?mr=922169http://www.ams.org/mathscinet-getitem?mr=1439744http://www.ams.org/mathscinet-getitem?mr=1242517http://www.ams.org/mathscinet-getitem?mr=1742497http://www.ams.org/mathscinet-getitem?mr=2026058