Mixtures of autoregressive-autoregressive conditionally heteroscedastic models: semi-parametric approach

Arash Nademi (Department of Statistics, Science and Research Branch, Islamic Azad University, Tehran, Iran) and Rahman Farnoosh (School of Mathematics, Iran University of Science and Technology, Narmak, Tehran, Iran)

Journal of Applied Statistics (2013). Published online: 16 Sep 2013. DOI: 10.1080/02664763.2013.839129



Journal of Applied Statistics, 2013
http://dx.doi.org/10.1080/02664763.2013.839129

Mixtures of autoregressive-autoregressive conditionally heteroscedastic models: semi-parametric approach

Arash Nademi* and Rahman Farnoosh

Department of Statistics, Science and Research Branch, Islamic Azad University, Tehran, Iran; School of Mathematics, Iran University of Science and Technology, Narmak, Tehran, Iran

(Received 10 January 2013; accepted 24 August 2013)

We propose data generating structures which can be represented as a mixture of autoregressive-autoregressive conditionally heteroscedastic models. The switching between the states is governed by a hidden Markov chain. We investigate semi-parametric estimators for estimating the functions based on the quasi-maximum likelihood approach and provide sufficient conditions for geometric ergodicity of the process. We also present an expectation-maximization algorithm for calculating the estimates numerically.

Keywords: EM algorithm; geometric ergodicity; hidden variables; mixture models; semi-parametric autoregression

1. Introduction

Forecasting and capturing the volatility of financial markets has been the object of numerous developments and applications over the past two decades, both theoretically and empirically. In this respect, the most widely used class of models is that of autoregressive-autoregressive conditionally heteroscedastic (AR-ARCH) models. The combination of AR processes and ARCH processes, the so-called AR-ARCH processes, is well established and very popular. AR-ARCH models also exhibit other interesting properties, such as their tail behavior; see, e.g., [2] or [3] for the special case of AR(1)-ARCH(1). Another variant of these models is introduced in [10], which considers an AR model with ARCH residuals, investigates its geometric ergodicity, and derives the asymptotic behavior of the quasi-maximum likelihood estimates.

These findings point to a potential source of misspecification: the form of the variance is relatively inflexible and is held fixed throughout the entire sample period. Hence, the estimates of an AR-ARCH model may suffer from a substantial bias in the persistence parameter. Models in which the parameters are allowed to change over time may therefore be more suitable for modeling such processes.

∗Corresponding author. Email: [email protected]

© 2013 Taylor & Francis


Recently, AR-ARCH models have repeatedly been applied to build switching-regime processes, which allow for more flexibility in modeling data that are only locally homogeneous. The application of switching processes as models for economic time series goes back to Hamilton [8], who used a parametric switching AR model for the long-term US real gross national product. Furthermore, the time-varying volatilities of parametric AR-ARCH have been considered by Wong and Li [21,22] with a focus on algorithms for computing the estimates. Lanne and Saikkonen [11] considered a threshold-like mixture of AR-ARCH models, and Lee [12] developed stability results for these models where the switching is controlled by a hidden white noise sequence. Stockis et al. [17] provided stability results for mixtures of nonlinear and nonparametric AR-ARCH models with Markovian switching regimes. Franke et al. [7] proposed a mixture model with nonparametric regression functions. Kamgaing et al. [9] investigated mixture models with nonparametric components in both the regression function and the variance.

In this paper, we first intend to introduce a Markov mixture of AR-ARCH models as a generalization of the single-regime model, with dependent errors, of the following form

$$Y_t = f(Y_{t-1}) + \varepsilon_t, \qquad \varepsilon_t = \rho \varepsilon_{t-1} + \sigma u_t, \qquad |\rho| < 1,$$

and then to extend the semi-parametric method for estimating the regression function introduced by Farnoosh and Mortazavi [5] for single-regime models. These models are applied to time series in which the errors have periodic variations, as in econometric and financial data. There is a massive literature on applying these models in the single-regime case. Schick [16] proposed efficient estimation in an additive regression model with AR errors. Truong and Stone [20] considered a nonparametric regression model with dependent errors. Tong [19] and Li [14] investigated algorithms for the estimation of these models. Farnoosh and Mortazavi [5] applied this model to monetary data and proposed a semi-parametric algorithm for the estimation of the regression function.

These models can be written as $Y_t = f(Y_{t-1}) + \rho(Y_{t-1} - f(Y_{t-2})) + \sigma u_t$ with independent errors $\{u_t\}$. We use this form to generate a mixture model and, for more flexibility, we let the errors' variance have an ARCH form.

Suppose $Y_1, \ldots, Y_N$ are part of a stationary time series generated by the following Markov switching model, where the switching between the states is controlled by a hidden Markov chain $Q_t$ with values in $E = \{1, \ldots, M\}$:

$$Y_t = \sum_{k=1}^{M} Z_{tk}\big(\mu(Y_{t-1}, Y_{t-2}; \rho_k) + \sigma(Y_{t-1}, Y_{t-2}; \omega_k, \alpha_k, \beta_k)\, u_{t,k}\big) \tag{1}$$

such that

$$\mu(Y_{t-1}, Y_{t-2}; \rho_k) = f_k(Y_{t-1}) + \rho_k\big(Y_{t-1} - f_k(Y_{t-2})\big),$$

and

$$\sigma^2(Y_{t-1}, Y_{t-2}; \omega_k, \alpha_k, \beta_k) = \omega_k + \alpha_k Y_{t-1}^2 + \beta_k Y_{t-2}^2,$$

and

$$Z_{tk} = \begin{cases} 1 & Q_t = k, \\ 0 & \text{otherwise}, \end{cases}$$

where the residuals $u_{t,k}$, $t = 1, \ldots, N$, $k = 1, \ldots, M$, are i.i.d. random variables with mean 0 and variance 1.

We suppose that the residuals have a density function $\psi$ which is continuous and positive everywhere. $Z_t = (Z_{t1}, Z_{t2}, \ldots, Z_{tM})^T$ are random vectors taking as values the unit vectors $e_1, \ldots, e_M \in \mathbb{R}^M$, i.e. exactly one of the $Z_{tk}$ is 1 and the others are 0. Furthermore, we assume


that $Z_t$ is independent of $Y_j$, $j < t$, and $u_{j,k}$, $j \le t$. $Q_t$ is aperiodic and irreducible, and the distribution of the hidden state process is given by the $M \times M$ transition probability matrix $A$, i.e. $A_{jk} = \mathrm{pr}(Q_t = k \mid Q_{t-1} = j)$. We denote the weights of the corresponding stationary distribution by $\pi = (\pi_1, \ldots, \pi_M)$, i.e. in the stationary state we have $\pi_k = \mathrm{pr}(Z_t = e_k) = \mathrm{pr}(Q_t = k)$, such that the latter are determined by $A$ via $\pi A = \pi$. We suppose the $f_k(\cdot)$ have a semi-parametric structure such that $f_k(x) \in \{g(x, \theta_k)\xi_k(x);\ \theta_k \in V\}$, where $\xi_k(x)$ is a nonparametric adjustment factor, $g(x, \theta_k)$ is a known function of $x$ and $\theta_k$, and $V \subseteq \mathbb{R}^p$ is the parameter space. Note that if we do not have any information about the structure of $f_k(\cdot)$, a nonparametric approach is desirable [4,6,7]. Therefore, we can rewrite model (1) in the following form:

$$Y_t = \sum_{k=1}^{M} Z_{tk}\big(\mu(Y_{t-1}, Y_{t-2}; \theta_k, \rho_k) + \sigma(Y_{t-1}, Y_{t-2}; \omega_k, \alpha_k, \beta_k)\, u_{t,k}\big) \tag{2}$$

such that

$$\mu(Y_{t-1}, Y_{t-2}; \theta_k, \rho_k) = g(Y_{t-1}, \theta_k)\xi_k(Y_{t-1}) + \rho_k\big(Y_{t-1} - g(Y_{t-2}, \theta_k)\xi_k(Y_{t-2})\big),$$

and

$$\sigma^2(Y_{t-1}, Y_{t-2}; \omega_k, \alpha_k, \beta_k) = \omega_k + \alpha_k Y_{t-1}^2 + \beta_k Y_{t-2}^2.$$

For brevity, we write $\mu_{\theta_k,\rho_k}$ for $\mu(Y_{t-1}, Y_{t-2}; \theta_k, \rho_k)$ and $\sigma_{\omega_k,\alpha_k,\beta_k}$ for $\sigma(Y_{t-1}, Y_{t-2}; \omega_k, \alpha_k, \beta_k)$ throughout the paper.
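To make the data generating mechanism concrete, the following is a minimal simulation sketch of model (2) in Python. All function and parameter choices here are purely illustrative assumptions of ours, not the specifications used elsewhere in the paper:

```python
import numpy as np

def simulate_mixture_ar_arch(N, A, g_funcs, xi_funcs, rho, omega, alpha, beta, rng=None):
    """Simulate model (2): a Markov-switching mixture of AR-ARCH processes.

    A        : (M, M) transition matrix of the hidden chain Q_t
    g_funcs  : list of M parametric functions g(., theta_k), theta_k absorbed
    xi_funcs : list of M nonparametric adjustment factors xi_k(.)
    rho, omega, alpha, beta : per-state parameter sequences
    """
    rng = np.random.default_rng(rng)
    M = A.shape[0]
    Q = np.zeros(N, dtype=int)   # hidden states; Q_0, Q_1 start in state 0
    Y = np.zeros(N)              # observations; Y_0, Y_1 start at 0
    for t in range(2, N):
        # step of the hidden Markov chain
        Q[t] = rng.choice(M, p=A[Q[t - 1]])
        k = Q[t]
        # semi-parametric conditional mean mu(Y_{t-1}, Y_{t-2}; theta_k, rho_k)
        f1 = g_funcs[k](Y[t - 1]) * xi_funcs[k](Y[t - 1])
        f2 = g_funcs[k](Y[t - 2]) * xi_funcs[k](Y[t - 2])
        mu = f1 + rho[k] * (Y[t - 1] - f2)
        # ARCH conditional variance omega_k + alpha_k Y_{t-1}^2 + beta_k Y_{t-2}^2
        sig2 = omega[k] + alpha[k] * Y[t - 1] ** 2 + beta[k] * Y[t - 2] ** 2
        Y[t] = mu + np.sqrt(sig2) * rng.standard_normal()
    return Y, Q
```

In a real experiment the residuals need not be Gaussian; the sketch uses standard normal innovations only because they satisfy the mean-0, variance-1 requirement on $u_{t,k}$.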

In the next section, we present a quasi-maximum likelihood approach to derive estimates of the model's parameters. Section 3 discusses an expectation-maximization (EM) algorithm as a numerical procedure for calculating the estimators. In Section 4, we prove geometric ergodicity of the process. Finally, Section 5 illustrates the feasibility of the introduced model by applying it to some simulated and real data.

2. Quasi-maximum likelihood estimates

In this section, we derive the quasi-likelihood function for estimating the parameters $\pi_1, \ldots, \pi_M$ and the parameter vector $v = (\theta_1, \ldots, \theta_M, \rho_1, \ldots, \rho_M, \omega_1, \ldots, \omega_M, \alpha_1, \ldots, \alpha_M, \beta_1, \ldots, \beta_M)^T \in V$.

Define $Y^{(N)} = \{Y_0, Y_1, \ldots, Y_N\}$ and $Q^{(N)} = \{Q_0, Q_1, \ldots, Q_N\}$ for $N \ge 2$, the observed sample up to time $N$ and the corresponding hidden sample of the state process. We suppose that the dominating measures for the distributions of $Y^{(N)}$ and of the hidden variables $Q^{(N)}$ are Lebesgue measure and counting measure, respectively. We define the joint distribution of a part $Y$ of $Y^{(N)}$ and a part $W$ of $Q^{(N)}$ by the density $h(y, w)$ of $(Y, W)$ w.r.t. the product measure $\mu \times \eta$, where $\mu$ and $\eta$ denote counting and Lebesgue measure, respectively, i.e.

$$P(Y \in C, W \in D) = \sum_{w \in D} \int_C h(y, w)\, d\eta(y).$$

Now, we introduce our assumptions on the structure of the hidden and the observed process:

(A1) For any $N \ge 2$, $h(Y_N \mid Y^{(N-1)} = y^{(N-1)}, Q^{(N)} = q^{(N)}) = h(Y_N \mid Y_{N-1} = y_{N-1}, Y_{N-2} = y_{N-2}, Q_N = q_N)$.

(A2) For any $N \ge 2$, $p(Q_N = q_N \mid Y^{(N-1)} = y^{(N-1)}, Q^{(N-1)} = q^{(N-1)}) = p(Q_N = q_N \mid Q_{N-1} = q_{N-1})$, i.e. $Q_t$ forms a Markov chain, and its conditional distribution given the past depends only on its own past and not on the past observations $Y_i$, $i < t$.


Note that we use the notations $\mu_v$, $h_{v,A}$ and $\sigma_v$ to make clear that $\mu$, $h$ and $\sigma$ are functions of $A$ or $v$. Applying the definition of $Y^{(N)}$ and $Q^{(N)}$, the complete likelihood conditional on $(Y_0, Y_1)$ and $(Q_0, Q_1)$ is

$$L^c(v, A \mid Y^{(N)}, Q^{(N)}) = h_{v,A}(y^{(N)}, q^{(N)}) = h_{v,A}(y_N \mid Y^{(N-1)} = y^{(N-1)}, Q^{(N)} = q^{(N)}) \times h_{v,A}(Q_N = q_N \mid Y^{(N-1)} = y^{(N-1)}, Q^{(N-1)} = q^{(N-1)}) \times h_{v,A}(y^{(N-1)}, q^{(N-1)}).$$

Using assumptions A1 and A2, we have

$$L^c(v, A \mid Y^{(N)}, Q^{(N)}) = h_{v,A}(y_N \mid Y_{N-1} = y_{N-1}, Y_{N-2} = y_{N-2}, Q_N = q_N) \times p_A(Q_N = q_N \mid Q_{N-1} = q_{N-1})\, h_{v,A}(y^{(N-1)}, q^{(N-1)}),$$

where $h_{v,A}(Q_N = q_N \mid Y^{(N-1)} = y^{(N-1)}, Q^{(N-1)} = q^{(N-1)}) = p_A(Q_N = q_N \mid Q_{N-1} = q_{N-1})$ does not depend on $v$. By iterating the above equation and using the Markov property, we get

$$L^c(v, A \mid Y^{(N)}, Q^{(N)}) = \pi_{q_1} \prod_{t=2}^{N} A_{q_{t-1}, q_t}\, h_{v,A}(Y_t \mid Y_{t-1} = y_{t-1}, Y_{t-2} = y_{t-2}, Q_t = q_t),$$

and for the corresponding log-likelihood, we have

$$l^c(v, A \mid Y^{(N)}, Q^{(N)}) = \log \pi_{q_1} + \sum_{t=2}^{N} \log A_{q_{t-1}, q_t} + \sum_{t=2}^{N} \log h_{v,A}(Y_t \mid Y_{t-1} = y_{t-1}, Y_{t-2} = y_{t-2}, Q_t = q_t).$$

We suppose that the conditional probability density of a single observation $Y_t$ given $Y_{t-1} = y_{t-1}$ and $Y_{t-2} = y_{t-2}$ is

$$h_{v,A}(Y_t \mid Y_{t-1} = y_{t-1}, Y_{t-2} = y_{t-2}, Q_t = q_t) = \sum_{k=1}^{M} Z_{tk}\, \varphi(Y_t; \mu_{\theta_k,\rho_k}, \sigma_{\omega_k,\alpha_k,\beta_k}),$$

where $Z_{tk} = 1_k(q_t)$ and $\varphi(Y_t; \mu_{\theta_k,\rho_k}, \sigma_{\omega_k,\alpha_k,\beta_k})$ denotes the normal density with mean $\mu_{\theta_k,\rho_k}$ and standard deviation $\sigma_{\omega_k,\alpha_k,\beta_k}$. Then, we have

$$l^c(v, A \mid Y^{(N)}, Q^{(N)}) = \log \pi_{q_1} + \sum_{t=2}^{N} \log A_{q_{t-1}, q_t} + \sum_{t=2}^{N} \sum_{k=1}^{M} Z_{tk} \log \frac{1}{\sigma_{\omega_k,\alpha_k,\beta_k}}\, \varphi\!\left(\frac{Y_t - \mu_{\theta_k,\rho_k}}{\sigma_{\omega_k,\alpha_k,\beta_k}}\right). \tag{3}$$

Our goal is to maximize $l^c(v, A \mid Y^{(N)}, Q^{(N)})$ to obtain the quasi-maximum likelihood estimates $\hat{v}, \hat{A}$ of the parameters of our model. Additionally, we want to solve the filtering problem, i.e. to obtain estimates $\hat{Q}_t$ of the hidden state variables $Q_t$, or equivalently, estimates $\hat{Z}_{tk}$ of the state vectors $Z_{tk}$, $t = 1, \ldots, N$, from the observations $Y^{(N)}$.

An efficient way of solving the problem is the EM algorithm. In our case, simply put, in the E-step we suppose that preliminary estimates of the model parameters are known and use them to calculate estimates of the conditional expectations of $Z_{tk}$ given the observations. In the M-step, we use the estimates computed in the E-step to replace $Z_{tk}$ in the complete log-likelihood, which we then maximize to obtain estimates of the parameters $v, A$. Until satisfying some


stopping criteria, the E-step and M-step are repeated iteratively. At the end of the EM algorithm, we not only obtain the quasi-maximum likelihood estimates of $v$, but also estimates of the conditional expectations of $Z_{tk}$ given the data, which may be considered as first estimates of the state variables of the model themselves.

3. The EM algorithm

In the single-regime model, $Y_t = f(Y_{t-1}) + \rho(Y_{t-1} - f(Y_{t-2})) + \sigma u_t$, Farnoosh and Mortazavi [5] introduced a semi-parametric method to estimate the autoregression function $f$. We first illustrate their algorithm and then extend the procedure to our mixture model. They first assume that $f$ takes the form $g(x, \theta)$, where $g(x, \theta)$ is a known function of $x$ and $\theta$. So, the parametric regression estimator $g(x, \hat\theta)$ is first regarded as a crude guess of $f$. They define the following estimator for estimating $\theta$ and $\rho$:

$$(\hat\theta, \hat\rho) = \arg\min_{\theta, \rho} \sum_{t=2}^{N} \big(Y_t - g(Y_{t-1}, \theta) - \rho(Y_{t-1} - g(Y_{t-2}, \theta))\big)^2.$$

In fact, $\hat\theta$ and $\hat\rho$ are the common conditional least-squares estimators based on the data $Y_0, Y_1, \ldots, Y_N$. Next, they adjust the initial approximation by the semi-parametric form $g(x, \hat\theta)\xi(x)$, where $\xi(x)$ is the adjustment factor. The remaining problem is to determine $\xi(x)$. They apply the local $L_2$-fitting criterion

$$q(x, \xi) = \frac{1}{h_N} \sum_{t=2}^{N} \left[ K\!\left(\frac{Y_{t-1} - x}{h_N}\right)\{f(Y_{t-1}) - g(Y_{t-1}, \hat\theta)\xi\}^2 + K\!\left(\frac{Y_{t-2} - x}{h_N}\right)\{f(Y_{t-2}) - g(Y_{t-2}, \hat\theta)\xi\}^2 \right]$$

$$\simeq \frac{1}{h_N} \sum_{t=2}^{N} \left[ K\!\left(\frac{Y_{t-1} - x}{h_N}\right)\{Y_t - g(Y_{t-1}, \hat\theta)\xi\}^2 + K\!\left(\frac{Y_{t-2} - x}{h_N}\right)\{Y_{t-1} - g(Y_{t-2}, \hat\theta)\xi\}^2 \right],$$

where $K(\cdot)$ is the kernel function, $h_N$ is the bandwidth depending on $N$, and $f$ is the true but unknown autoregression function. They calculate the estimator $\hat\xi(x)$ of $\xi(x)$ by minimizing the above criterion with respect to $\xi$. Then, they get

$$\hat\xi(x) = \frac{\sum_{t=2}^{N} \left[ K((Y_{t-1} - x)/h_N)\, g(Y_{t-1}, \hat\theta)\, Y_t + K((Y_{t-2} - x)/h_N)\, g(Y_{t-2}, \hat\theta)\, Y_{t-1} \right]}{\sum_{t=2}^{N} \left[ K((Y_{t-1} - x)/h_N)\, g^2(Y_{t-1}, \hat\theta) + K((Y_{t-2} - x)/h_N)\, g^2(Y_{t-2}, \hat\theta) \right]}.$$

Finally, the autoregression estimator is obtained by

$$\hat\mu(x, y; \hat\theta, \hat\rho) = g(x, \hat\theta)\hat\xi(x) + \hat\rho\big(x - g(y, \hat\theta)\hat\xi(y)\big)$$
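The two-stage single-regime estimator is straightforward to implement. The sketch below is our own simplified illustration: it assumes a Gaussian kernel and replaces a general numerical optimizer for the conditional least squares with a plain grid search; the function names are hypothetical:

```python
import numpy as np

def gaussian_kernel(u):
    # K(u): standard normal kernel (an assumption; the paper leaves K generic)
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)

def fit_theta_rho(Y, g, thetas, rhos):
    """Conditional least squares for (theta, rho) by grid search over
    the criterion sum_t (Y_t - g(Y_{t-1},theta) - rho(Y_{t-1} - g(Y_{t-2},theta)))^2."""
    best, best_val = None, np.inf
    for th in thetas:
        for rh in rhos:
            resid = Y[2:] - g(Y[1:-1], th) - rh * (Y[1:-1] - g(Y[:-2], th))
            val = np.sum(resid ** 2)
            if val < best_val:
                best, best_val = (th, rh), val
    return best

def xi_hat(x, Y, g, theta, h):
    """Closed-form kernel adjustment factor xi_hat(x) at a point x."""
    k1 = gaussian_kernel((Y[1:-1] - x) / h)   # weights K((Y_{t-1}-x)/h)
    k2 = gaussian_kernel((Y[:-2] - x) / h)    # weights K((Y_{t-2}-x)/h)
    num = np.sum(k1 * g(Y[1:-1], theta) * Y[2:] + k2 * g(Y[:-2], theta) * Y[1:-1])
    den = np.sum(k1 * g(Y[1:-1], theta) ** 2 + k2 * g(Y[:-2], theta) ** 2)
    return num / den
```

With a linear guess $g(x,\theta)=\theta x$ and data from a plain AR(1), the fitted adjustment $\hat\xi(x)$ should hover near 1, reflecting that the parametric guess is already close to the truth.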

(see [5] for details).

Now, we extend this approach to our mixture model. If the $Z_{tk}$ were observable, we could treat model (2) as $M$ independent estimation problems, with $M$ independent data sets $\{Y_t = \mu_{\theta_k,\rho_k} + \sigma_{\omega_k,\alpha_k,\beta_k} u_{t,k}\}$ for $t \in T_k = \{n \le N,\ Z_{nk} = 1\}$. So, the vector of function


estimates $(\hat\mu_{\theta_1,\rho_1}, \ldots, \hat\mu_{\theta_M,\rho_M})^T$ that maximizes $l^c(v, A \mid Y^{(N)}, Q^{(N)})$ can be defined by

$$\hat\mu(x, y; \theta_k, \rho_k) = g(x, \hat\theta_k)\hat\xi_k(x) + \hat\rho_k\big(x - g(y, \hat\theta_k)\hat\xi_k(y)\big),$$

in which $(\hat\theta_k, \hat\rho_k)$ is obtained from the following least-squares problem:

$$(\hat\theta_k, \hat\rho_k) = \arg\min_{\theta_k, \rho_k \in V} \sum_{t \in T_k} \big(Y_t - g(Y_{t-1}, \theta_k) - \rho_k(Y_{t-1} - g(Y_{t-2}, \theta_k))\big)^2 = \arg\min_{\theta_k, \rho_k \in V} \sum_{t=2}^{N} Z_{tk}\big(Y_t - g(Y_{t-1}, \theta_k) - \rho_k(Y_{t-1} - g(Y_{t-2}, \theta_k))\big)^2, \tag{4}$$

where $|\rho_k| < 1$ for $k = 1, \ldots, M$. The vector of function estimates $(\hat\xi_1, \ldots, \hat\xi_M)^T$ is calculated by

$$\hat\xi_k(x) = \arg\min_{\xi_k} \frac{1}{h_k} \sum_{t \in T_k} \left[ K\!\left(\frac{Y_{t-1} - x}{h_k}\right)\{Y_t - g(Y_{t-1}, \hat\theta_k)\xi_k(x)\}^2 + K\!\left(\frac{Y_{t-2} - x}{h_k}\right)\{Y_{t-1} - g(Y_{t-2}, \hat\theta_k)\xi_k(x)\}^2 \right]$$

$$= \arg\min_{\xi_k} \frac{1}{h_k} \sum_{t=2}^{N} Z_{tk}\left[ K\!\left(\frac{Y_{t-1} - x}{h_k}\right)\{Y_t - g(Y_{t-1}, \hat\theta_k)\xi_k(x)\}^2 + K\!\left(\frac{Y_{t-2} - x}{h_k}\right)\{Y_{t-1} - g(Y_{t-2}, \hat\theta_k)\xi_k(x)\}^2 \right].$$

Then, we have

$$\hat\xi_k(x) = \frac{\sum_{t \in T_k} \left[ K((Y_{t-1} - x)/h_k)\, g(Y_{t-1}, \hat\theta_k)\, Y_t + K((Y_{t-2} - x)/h_k)\, g(Y_{t-2}, \hat\theta_k)\, Y_{t-1} \right]}{\sum_{t \in T_k} \left[ K((Y_{t-1} - x)/h_k)\, g^2(Y_{t-1}, \hat\theta_k) + K((Y_{t-2} - x)/h_k)\, g^2(Y_{t-2}, \hat\theta_k) \right]} = \frac{\sum_{t=2}^{N} Z_{tk} \left[ K((Y_{t-1} - x)/h_k)\, g(Y_{t-1}, \hat\theta_k)\, Y_t + K((Y_{t-2} - x)/h_k)\, g(Y_{t-2}, \hat\theta_k)\, Y_{t-1} \right]}{\sum_{t=2}^{N} Z_{tk} \left[ K((Y_{t-1} - x)/h_k)\, g^2(Y_{t-1}, \hat\theta_k) + K((Y_{t-2} - x)/h_k)\, g^2(Y_{t-2}, \hat\theta_k) \right]}, \tag{5}$$

for $k = 1, \ldots, M$.

Maximizing Equation (3) as a function of $\omega_k, \alpha_k, \beta_k$ can be regarded as a constrained optimization problem. Therefore,

$$(\hat\omega_k, \hat\alpha_k, \hat\beta_k) = \arg\max \sum_{t \in T_k} \log \frac{1}{\sigma_{\omega_k,\alpha_k,\beta_k}}\, \varphi\!\left(\frac{e_{tk}}{\sigma_{\omega_k,\alpha_k,\beta_k}}\right) = \arg\max \sum_{t=2}^{N} Z_{tk} \log \frac{1}{\sigma_{\omega_k,\alpha_k,\beta_k}}\, \varphi\!\left(\frac{e_{tk}}{\sigma_{\omega_k,\alpha_k,\beta_k}}\right), \quad k = 1, \ldots, M,$$

where $e_{tk} = Y_t - \hat\mu(Y_{t-1}, Y_{t-2}; \hat\theta_k, \hat\rho_k)$ denotes the sample residuals.

Since the $Z_{tk}$ are not observable, we estimate the hidden variables by their conditional expectations. Therefore, we calculate the posterior probability of being in state $j$ at time $t$ given the entire


sequence of observations. So, we get

$$C_{tj} = E[Z_{tj} \mid Y^{(N)}] = p(Z_{tj} = 1 \mid Y^{(N)}) = p(Q_t = j \mid Y^{(N)}) = \frac{h(Y^{(N)}, Q_t = j)}{h(Y^{(N)})} = \frac{h(Y^{(N)}, Q_t = j)}{\sum_{k=1}^{M} h(Y^{(N)}, Q_t = k)}.$$

Now, we evaluate $h(Y^{(N)}, Q_t = j)$. By assumptions A1 and A2, we have

$$h(Y^{(N)}, Q_t = j) = h(Y_0, \ldots, Y_N, Q_t = j) = h(Y_{t+1}, \ldots, Y_N \mid Y_0, \ldots, Y_t, Q_t = j)\, h(Y_0, \ldots, Y_t, Q_t = j) = h(Y_{t+1}, \ldots, Y_N \mid Y_t, Y_{t-1}, Q_t = j)\, h(Y_0, \ldots, Y_t, Q_t = j).$$

Define $\alpha_j^t = h(Y_0, \ldots, Y_t, Q_t = j)$, the joint density of the observations from $Y_0$ to $Y_t$ and of being in state $j$ at time $t$, and $\beta_j^t = h(Y_{t+1}, \ldots, Y_N \mid Y_t, Y_{t-1}, Q_t = j)$, the conditional density of the observations from $Y_{t+1}$ to $Y_N$ given the state $j$ at time $t$ and $Y_t, Y_{t-1}$. Therefore, we conclude

$$h(Y^{(N)}, Q_t = j) = \beta_j^t\, \alpha_j^t.$$

So, we can rewrite $C_{tj}$ in the following form:

$$C_{tj} = \frac{\alpha_j^t \beta_j^t}{\sum_{k=1}^{M} \alpha_k^t \beta_k^t}.$$

Now, we introduce some recursive relations for calculating $\alpha_j^t$ and $\beta_j^t$. By the definition of $\alpha_j^t$, we have

$$\alpha_j^{t+1} = h(Y_0, \ldots, Y_t, Y_{t+1}, Q_{t+1} = j) = h(Y_{t+1} \mid Y^{(t)}, Q_{t+1} = j) \sum_{k=1}^{M} p(Q_{t+1} = j \mid Y^{(t)}, Q_t = k)\, h(Y^{(t)}, Q_t = k).$$

Applying the assumption $h(Y_{t+1} \mid Y_t, Y_{t-1}, Q_{t+1} = j) = \varphi(Y_{t+1}; \mu_{\theta_j,\rho_j}, \sigma_{\omega_j,\alpha_j,\beta_j})$ and the definition of $\alpha_j^t$, we have

$$\alpha_j^{t+1} = h(Y_{t+1} \mid Y_t, Y_{t-1}, Q_{t+1} = j) \sum_{k=1}^{M} p(Q_{t+1} = j \mid Q_t = k)\, \alpha_k^t = \varphi(Y_{t+1}; \mu_{\theta_j,\rho_j}, \sigma_{\omega_j,\alpha_j,\beta_j}) \sum_{k=1}^{M} A_{kj}\, \alpha_k^t. \tag{6}$$

Further, we can obtain the recursive form of $\beta_j^t$ from the following relation:

$$\beta_j^t = h(Y_{t+1}, \ldots, Y_N \mid Y_t, Y_{t-1}, Q_t = j) = \sum_{k=1}^{M} h(Y_{t+1}, \ldots, Y_N, Q_{t+1} = k \mid Y_t, Y_{t-1}, Q_t = j) = \sum_{k=1}^{M} h(Y_{t+2}, \ldots, Y_N \mid Y_{t+1}, Y_t, Q_{t+1} = k)\, h(Y_{t+1} \mid Y_t, Y_{t-1}, Q_{t+1} = k)\, p(Q_{t+1} = k \mid Q_t = j) = \sum_{k=1}^{M} \beta_k^{t+1}\, \varphi(Y_{t+1}; \mu_{\theta_k,\rho_k}, \sigma_{\omega_k,\alpha_k,\beta_k})\, A_{jk}. \tag{7}$$
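Equations (6) and (7) are the familiar forward and backward recursions of hidden-Markov-model filtering, and together they yield the posteriors $C_{tj}$. A vectorized sketch follows (our own illustration; for long series each recursion step should additionally be rescaled to avoid numerical underflow):

```python
import numpy as np

def forward_backward(phi_mat, A, pi):
    """Forward-backward recursions of Equations (6) and (7).

    phi_mat : (N, M) array; phi_mat[t, k] is the normal density of Y_t under
              state k (conditional on Y_{t-1}, Y_{t-2}); early rows may be ones.
    A       : (M, M) transition matrix, pi : (M,) initial weights.
    Returns alpha, beta and the posteriors C[t, j] = p(Q_t = j | Y^(N)).
    """
    N, M = phi_mat.shape
    alpha = np.zeros((N, M))
    beta = np.ones((N, M))
    alpha[0] = pi * phi_mat[0]
    for t in range(1, N):               # Equation (6): alpha_j^{t+1}
        alpha[t] = phi_mat[t] * (alpha[t - 1] @ A)
    for t in range(N - 2, -1, -1):      # Equation (7): beta_j^t
        beta[t] = A @ (phi_mat[t + 1] * beta[t + 1])
    C = alpha * beta                    # C_tj proportional to alpha_j^t beta_j^t
    C /= C.sum(axis=1, keepdims=True)
    return alpha, beta, C
```

The matrix products encode the sums over $k$: `alpha[t-1] @ A` is $\sum_k A_{kj}\alpha_k^{t-1}$ and `A @ (phi * beta)` is $\sum_k A_{jk}\varphi_k\beta_k^{t+1}$.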


For estimating the transition probability $A_{ij}$, we calculate the joint conditional probability of $Q_t$ and $Q_{t+1}$ given the entire sequence of observations:

$$\delta_{ij}^{t,t+1} = p(Q_t = i, Q_{t+1} = j \mid Y^{(N)}) = \frac{h(Q_t = i, Q_{t+1} = j, Y^{(N)})}{h(Y^{(N)})} = \frac{h(Q_t = i, Q_{t+1} = j, Y^{(N)})}{\sum_{k=1}^{M} h(Y^{(N)}, Q_t = k)} = \frac{h(Y_{t+2}, \ldots, Y_N \mid Y^{(t+1)}, Q_t = i, Q_{t+1} = j)\, h(Q_t = i, Q_{t+1} = j, Y^{(t+1)})}{\sum_{k=1}^{M} h(Y^{(N)}, Q_t = k)} = \frac{h(Y_{t+2}, \ldots, Y_N \mid Y_{t+1}, Y_t, Q_{t+1} = j)\, h(Q_t = i, Q_{t+1} = j, Y^{(t+1)})}{\sum_{k=1}^{M} h(Y^{(N)}, Q_t = k)}.$$

By the definitions of $\beta_j^t$ and $\alpha_j^t$, we can write

$$\delta_{ij}^{t,t+1} = \frac{\beta_j^{t+1}\, h(Y_{t+1} \mid Y^{(t)}, Q_t = i, Q_{t+1} = j)\, h(Q_t = i, Q_{t+1} = j, Y^{(t)})}{\sum_{k=1}^{M} \alpha_k^t \beta_k^t} = \frac{\beta_j^{t+1}\, h(Y_{t+1} \mid Y_t, Y_{t-1}, Q_{t+1} = j)\, h(Q_{t+1} = j \mid Q_t = i, Y^{(t)})\, h(Q_t = i, Y^{(t)})}{\sum_{k=1}^{M} \alpha_k^t \beta_k^t} = \frac{\beta_j^{t+1}\, \varphi(Y_{t+1}; \mu_{\theta_j,\rho_j}, \sigma_{\omega_j,\alpha_j,\beta_j})\, A_{ij}\, \alpha_i^t}{\sum_{k=1}^{M} \alpha_k^t \beta_k^t}. \tag{8}$$

Then, we have

$$\hat{A}_{ij} = \frac{\sum_{t=1}^{N} \delta_{ij}^{t,t+1}}{\sum_{t=1}^{N} \sum_{j=1}^{M} \delta_{ij}^{t,t+1}} = \frac{\sum_{t=1}^{N} \delta_{ij}^{t,t+1}}{\sum_{t=1}^{N} p(Q_t = i \mid Y^{(N)})} = \frac{\sum_{t=1}^{N} \delta_{ij}^{t,t+1}}{\sum_{t=1}^{N} C_{ti}},$$

and

$$\hat\pi_i = \frac{\sum_{t=1}^{N} C_{ti}}{N}.$$

The method presented above can be summarized in the following version of the EM algorithm.

E-step: Suppose the parameter vector $v$ and the transition probabilities $A_{ij}$ are given. Then the conditional expectations of the hidden variables $Z_{tk}$ given the observations are estimated by

$$C_{tk} = \frac{\alpha_k^t \beta_k^t}{\sum_{i=1}^{M} \alpha_i^t \beta_i^t}, \quad k = 1, \ldots, M, \quad t = 1, \ldots, N,$$

where $\alpha_i^t$ and $\beta_i^t$ are calculated from Equations (6) and (7).

M-step: Suppose approximations $C_{tk}$ for the hidden variables $Z_{tk}$ are given. Then, the transition probabilities are estimated by

$$\hat{A}_{ij} = \frac{\sum_{t=1}^{N} \delta_{ij}^{t,t+1}}{\sum_{t=1}^{N} C_{ti}},$$

where the $\delta_{ij}^{t,t+1}$ are calculated from Equation (8). $\pi_1, \ldots, \pi_M$ are estimated by

$$\hat\pi_k = \frac{1}{N} \sum_{t=1}^{N} C_{tk}, \quad k = 1, \ldots, M.$$


The $M$ autoregression functions $\mu_k$ are estimated by

$$\hat\mu(x, y; \theta_k, \rho_k) = g(x, \hat\theta_k)\hat\xi_k(x) + \hat\rho_k\big(x - g(y, \hat\theta_k)\hat\xi_k(y)\big), \tag{9}$$

in which $(\hat\rho_k, \hat\theta_k)$ is obtained from $(\hat\rho_k, \hat\theta_k) = \arg\min Q_n(\theta_k, \rho_k)$, $\theta_k, \rho_k \in V$, $|\rho_k| < 1$ for $k = 1, \ldots, M$, such that

$$Q_n(\theta_k, \rho_k) = \sum_{t=2}^{N} C_{tk}\big(Y_t - g(Y_{t-1}, \theta_k) - \rho_k(Y_{t-1} - g(Y_{t-2}, \theta_k))\big)^2,$$

and

$$\hat\xi_k(x) = \frac{\sum_{t=2}^{N} C_{tk}\left[ K((Y_{t-1} - x)/h_k)\, g(Y_{t-1}, \hat\theta_k)\, Y_t + K((Y_{t-2} - x)/h_k)\, g(Y_{t-2}, \hat\theta_k)\, Y_{t-1} \right]}{\sum_{t=2}^{N} C_{tk}\left[ K((Y_{t-1} - x)/h_k)\, g^2(Y_{t-1}, \hat\theta_k) + K((Y_{t-2} - x)/h_k)\, g^2(Y_{t-2}, \hat\theta_k) \right]},$$

and $(\hat\omega_k, \hat\alpha_k, \hat\beta_k)$ are estimated by

$$(\hat\omega_k, \hat\alpha_k, \hat\beta_k) = \arg\max \sum_{t=2}^{N} C_{tk} \log \frac{1}{\sigma_{\omega_k,\alpha_k,\beta_k}}\, \varphi\!\left(\frac{e_{tk}}{\sigma_{\omega_k,\alpha_k,\beta_k}}\right),$$

for $k = 1, \ldots, M$.

The estimates of the parameters are obtained by iterating these two steps until convergence.
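To show how the two steps interlock, the skeleton below implements the EM iteration in a deliberately stripped-down setting of our own: the semi-parametric means $g(x,\theta_k)\xi_k(x)$ and ARCH variances are replaced by constant per-state means and variances, so only the E-step/M-step structure is illustrated, not the full estimator of this paper:

```python
import numpy as np

def em_hmm_gaussian(Y, M=2, n_iter=50, seed=0):
    """EM skeleton for a Markov-switching model with constant state means
    mu_k and variances sig2_k (a simplification of model (2))."""
    rng = np.random.default_rng(seed)
    N = len(Y)
    A = np.full((M, M), 1.0 / M)
    pi = np.full(M, 1.0 / M)
    mu = rng.choice(Y, size=M, replace=False)   # crude initialization
    sig2 = np.full(M, np.var(Y))
    for _ in range(n_iter):
        # E-step: state densities and rescaled forward-backward posteriors
        phi = np.exp(-0.5 * (Y[:, None] - mu) ** 2 / sig2) / np.sqrt(2 * np.pi * sig2)
        alpha = np.zeros((N, M)); beta = np.ones((N, M))
        a = pi * phi[0]
        alpha[0] = a / a.sum()
        for t in range(1, N):
            a = phi[t] * (alpha[t - 1] @ A)
            alpha[t] = a / a.sum()              # rescale to avoid underflow
        for t in range(N - 2, -1, -1):
            b = A @ (phi[t + 1] * beta[t + 1])
            beta[t] = b / b.sum()
        C = alpha * beta                        # posteriors C_tk
        C /= C.sum(axis=1, keepdims=True)
        delta = np.zeros((N - 1, M, M))         # pair posteriors, as in (8)
        for t in range(N - 1):
            d = alpha[t][:, None] * A * (phi[t + 1] * beta[t + 1])[None, :]
            delta[t] = d / d.sum()
        # M-step: reweighted parameter updates
        A = delta.sum(axis=0)
        A = A / A.sum(axis=1, keepdims=True)
        pi = C.mean(axis=0)
        w = C.sum(axis=0) + 1e-12
        mu = (C * Y[:, None]).sum(axis=0) / w
        sig2 = np.maximum((C * (Y[:, None] - mu) ** 2).sum(axis=0) / w, 1e-6)
    return A, pi, mu, sig2, C
```

In the full algorithm the mean and variance updates would instead solve the weighted least-squares problem for $(\hat\theta_k,\hat\rho_k)$, recompute $\hat\xi_k$, and maximize the weighted log-density over $(\omega_k,\alpha_k,\beta_k)$, with the weights $C_{tk}$ playing exactly the role they play here.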

4. Geometric ergodicity

In this section, we prove that the process $\zeta_t = (Z_t, Y_t, Y_{t-1})$ is geometrically ergodic. Geometric ergodicity shows not only that a unique stationary probability measure for the process exists, but also that the process, irrespective of its initialization, converges to it at a geometric rate with respect to the total variation norm. Markov chains with this property satisfy conventional limit theorems such as the central limit theorem and the law of large numbers for any given starting value, given the existence of suitable moments [15, Ch. 17].

We add the following assumptions, which are needed for proving geometric ergodicity. Note that, to simplify the notation, we write the autoregression and variance functions in the forms $\mu(Y_{t-1}, Y_{t-2}, Z_t)$ and $\sigma^2(Y_{t-1}, Y_{t-2}, Z_t)$, respectively.

(A3) For each $Z_t \in \mathbb{R}^M$ and $(Y_{t-1}, Y_{t-2}) \in \mathbb{R}^2$, there exist $(a_1(Z_t), a_2(Z_t), d_1(Z_t), d_2(Z_t)) \in \mathbb{R}^4$ with $d_1(Z_t), d_2(Z_t) \ge 0$, such that

$$\mu(Y_{t-1}, Y_{t-2}, Z_t) \le a_1(Z_t) Y_{t-1} + a_2(Z_t) Y_{t-2} + o(\|(Y_{t-1}, Y_{t-2})\|),$$

and

$$\sigma^2(Y_{t-1}, Y_{t-2}, Z_t) \le d_1(Z_t) Y_{t-1}^2 + d_2(Z_t) Y_{t-2}^2 + o(\|(Y_{t-1}, Y_{t-2})\|^2),$$

where $\|\cdot\|$ is any vector norm on $\mathbb{R}^2$.

(A4) For each $Z_t \in \mathbb{R}^M$, $\sup_{z_{t-1} \in \mathbb{R}^M} E\{2(a_1^2(Z_t) + a_2^2(Z_t)) + d_1(Z_t) + d_2(Z_t) \mid Z_{t-1} = z_{t-1}\} < 1$.

(A5) The functions μ(Yt−1, Yt−2, Zt) and σ 2(Yt−1, Yt−2, Zt) are bounded on compact sets.

Assumption A3 asymptotically bounds the trend functions for $\|Y\| \to \infty$ by linear functions, and the squared volatility functions by quadratic functions, to avoid an explosion of the system for large values of $\|Y\|$. Assumptions A4 and A5 are needed to establish the drift condition.


Note that verification of geometric ergodicity of a Markov chain $\{\zeta_t = (Z_t, Y_t, Y_{t-1})\}$ follows by proving that the process is $\mu \times \eta$-irreducible and aperiodic and by showing the existence of a test function satisfying the Foster-Lyapounov drift condition [13,15,18]. Foster-Lyapounov drift condition: there exists a real-valued measurable function $\phi \ge 1$ such that for some constants $\varepsilon > 0$, $0 < c < \infty$, $0 < \gamma < 1$ and a small set $H$,

$$E[\phi(\zeta_t) \mid \zeta_{t-1} = (z_{t-1}, y)] \le \gamma \phi(z_{t-1}, y) - \varepsilon, \quad y \in H^c,$$

and

$$E[\phi(\zeta_t) \mid \zeta_{t-1} = (z_{t-1}, y)] < c, \quad y \in H,$$

where $y$ is $(y_{t-1}, y_{t-2})$.

Theorem 4.1 Under assumptions A1–A5, {ζt} is geometrically ergodic.

Proof. We first prove that $\zeta_t = (Z_t, Y_t, Y_{t-1})$ is $\mu \times \eta$-irreducible. Let $B = B_1 \times B_2 \times B_3$ be such that $\mu \times \eta(B) > 0$. Then $\eta(B_2 \times B_3) > 0$ and $B_1$ contains at least one integer $k$ in $E$. So, it is enough to prove that there exists $t$ such that for all $k, l, i$, $p((Z_t, Y_t, Y_{t-1}) \in \{e_k\} \times B_2 \times B_3 \mid Z_2 = e_l, Z_1 = e_i, Y_2 = y_2, Y_1 = y_1) > 0$, with $e_k$ denoting the unit vector with $k$th component equal to 1.

By defining $\mu_k(x, y) = \mu(x, y; \theta_k, \rho_k)$ and $\sigma_k(x, y) = \sigma(x, y; \omega_k, \alpha_k, \beta_k)$, we have

$$\begin{aligned}
p((Z_4, Y_4, Y_3) \in \{e_k\} \times B_2 \times B_3 \mid Z_2 &= e_l, Z_1 = e_i, Y_2 = y_2, Y_1 = y_1) \\
&= p(Q_4 = k, Q_3 \in E, (Y_4, Y_3) \in B_2 \times B_3 \mid Z_2 = e_l, Z_1 = e_i, Y_2 = y_2, Y_1 = y_1) \\
&= p((Y_4, Y_3) \in B_2 \times B_3 \mid Q_4 = k, Q_3 \in E, Z_2 = e_l, Z_1 = e_i, Y_2 = y_2, Y_1 = y_1) \\
&\quad \times p(Q_4 = k \mid Q_3 \in E, Q_2 = l, Q_1 = i)\, p(Q_3 \in E \mid Q_2 = l, Q_1 = i) \\
&= \sum_{j=1}^{M} A_{lj} A_{jk}\, p((Y_4, Y_3) \in B_2 \times B_3 \mid Q_4 = k, Q_3 = j, Z_2 = e_l, Z_1 = e_i, Y_2 = y_2, Y_1 = y_1) \\
&= \sum_{j=1}^{M} A_{lj} A_{jk}\, p(\mu_k(Y_3, y_2) + \sigma_k(Y_3, y_2)\, u_{4,k} \in B_2,\ Y_3 \in B_3 \mid Q_3 = j) \\
&= \sum_{j=1}^{M} A_{lj} A_{jk} \int_{B_2} \int_{B_3} \frac{1}{\sigma_k(u, y_2)}\, \psi\!\left(\frac{t - \mu_k(u, y_2)}{\sigma_k(u, y_2)}\right) \frac{1}{\sigma_j(y_2, y_1)}\, \psi\!\left(\frac{u - \mu_j(y_2, y_1)}{\sigma_j(y_2, y_1)}\right) du\, dt,
\end{aligned}$$

which is strictly greater than 0 because of the irreducibility of $\{Q_t\}$, the fact that $\eta(B_2 \times B_3) > 0$, and the continuity and positivity of the density function $\psi$. Further, we can write

$$\begin{aligned}
p((Z_t, Y_t, Y_{t-1}) \in \{e_k\} \times B_2 \times B_3 \mid Z_2 &= e_l, Z_1 = e_i, Y_2 = y_2, Y_1 = y_1) \\
&= \sum_{j_1=1}^{M} \sum_{j_2=1}^{M} \cdots \sum_{j_{t-1}=1}^{M} A_{l j_1} A_{j_1 j_2} \cdots A_{j_{t-1} k} \\
&\quad \times \int_{B_2} \int_{B_3} \int_{\mathbb{R}} \cdots \int_{\mathbb{R}} \frac{1}{\sigma_k(u, y_{t-2})}\, \psi\!\left(\frac{t - \mu_k(u, y_{t-2})}{\sigma_k(u, y_{t-2})}\right) \frac{1}{\sigma_j(y_{t-2}, y_{t-3})}\, \psi\!\left(\frac{u - \mu_j(y_{t-2}, y_{t-3})}{\sigma_j(y_{t-2}, y_{t-3})}\right) \cdots \frac{1}{\sigma_{j_1}(y_2, y_1)}\, \psi\!\left(\frac{y - \mu_{j_1}(y_2, y_1)}{\sigma_{j_1}(y_2, y_1)}\right) du\, dt \cdots dy > 0.
\end{aligned}$$

Analogously, it can easily be seen that $\{\zeta_t\}$ is aperiodic. Moreover, by A5 it can be shown that each compact set is indeed a small set and thus a petite set [1].


Now we check the drift condition. Define $\eta_i = \sup_{z_t \in \mathbb{R}^M} E\{2 a_i^2(Z_{t+1}) + d_i(Z_{t+1}) \mid Z_t = z_t\}$ for $i = 1, 2$ and a test function $\phi \ge 1$ by

$$\phi(\zeta_t) = \phi(Z_t, Y_t, Y_{t-1}) = 1 + \delta_1 Y_t^2 + \delta_2 Y_{t-1}^2,$$

such that $\delta_1, \delta_2 > 0$, and choose $\lambda > 0$ so that $\eta_1 + \eta_2 + \lambda = 1$, $\delta_1 \eta_1 + \delta_2 \le \delta_1(1 - \lambda/2)$ and $\delta_1 \eta_2 \le \delta_2(1 - \lambda/2)$. Then we have

$$E\{\phi(\zeta_{t+1}) \mid (Y_t, Y_{t-1}) = y, Z_t = z_t\} = 1 + E\{\delta_1(\mu(y, Z_{t+1}) + \sigma(y, Z_{t+1}) u_{t+1})^2 \mid Z_t = z_t\} + \delta_2 y_t^2 = 1 + \delta_1 E_{z_t}\{\mu^2(y, Z_{t+1}) + \sigma^2(y, Z_{t+1})\} + \delta_2 y_t^2,$$

where $y$ is $(y_t, y_{t-1})$. Applying A3, we have

$$E\{\phi(\zeta_{t+1}) \mid (Y_t, Y_{t-1}) = y, Z_t = z_t\} \le 1 + \delta_1 E_{z_t}\{(a_1(Z_{t+1}) y_t + a_2(Z_{t+1}) y_{t-1} + o(\|y\|))^2 + d_1(Z_{t+1}) y_t^2 + d_2(Z_{t+1}) y_{t-1}^2 + o(\|y\|^2)\} + \delta_2 y_t^2.$$

Considering the fact that $2AB \le 2|AB| \le A^2 + B^2$ for $A, B \in \mathbb{R}$, and assumption A4, we conclude

$$E\{\phi(\zeta_{t+1}) \mid (Y_t, Y_{t-1}) = y, Z_t = z_t\} \le 1 + \delta_1(\eta_1 y_t^2 + \eta_2 y_{t-1}^2) + \delta_2 y_t^2 + \delta_1 E_{z_t}(R_{Z_{t+1}}(y)) \le (1 - \lambda/2)\phi(z_t, y) + \delta_1 E_{z_t}(R_{Z_{t+1}}(y)) + \lambda/2 = \phi(z_t, y)\{1 - \lambda/2 + (\lambda/2 + \delta_1 E_{z_t}(R_{Z_{t+1}}(y)))/\phi(z_t, y)\},$$

where $R_{z_{t+1}}(y) = 2 o(\|y\|)(a_1(z_{t+1}) y_t + a_2(z_{t+1}) y_{t-1} + o(\|y\|)/2) + o(\|y\|^2)$. On the other hand, $(\delta_1 E_{z_t}(R_{z_{t+1}}(y)) + \lambda/2)/\phi(z_t, y) \to 0$ as $\|y\| \to \infty$. Therefore, for any $\varepsilon > 0$, we can choose $\gamma$ with $1 - \lambda/2 < \gamma < 1$ and $H < \infty$ such that the next two inequalities hold:

$$E\{\phi(\zeta_{t+1}) \mid \zeta_t\} \le \gamma \phi(\zeta_t) - \varepsilon, \quad \|y\| > H,$$

$$\sup_{\|y\| \le H} E\{\phi(\zeta_{t+1}) \mid \zeta_t\} < \infty.$$

So, the proof is completed. �

5. Simulation study and empirical application

In this section, we investigate the estimation algorithm combined with the numerical procedure described above. We first consider computer-generated data, where the underlying hidden Markov chain is known, and then we apply the introduced model (2) to estimate GBP to USD (British Pound to US Dollar) exchange rates.

5.1 Simulation study

We have simulated a data-generating process corresponding to the model defined by Equation (2) for two states, where the utk are i.i.d. N(0, 1) random variables. We supposed


ρ1 = 0.7, ρ2 = 0.8, ω1 = 0.01, ω2 = 0.02, α1 = 0.05, α2 = 0.002, β1 = 0.01, β2 = 0.03 and chose the state probabilities as π1 = 0.4 and π2 = 0.6. The transition probability matrix is given by

$$A = \begin{pmatrix} 0.7 & 0.3 \\ 0.2 & 0.8 \end{pmatrix}$$

and we chose two functions

$$g(x; \delta_1, \kappa_1, \eta_1) = \delta_1 x + \kappa_1 e^{\eta_1(x-2)^2}, \qquad g(x; \delta_2, \kappa_2, \eta_2) = \delta_2 x + \kappa_2 e^{\eta_2(x-2)^2},$$

for g(x; θ), where δ1 = 0.7, κ1 = 0.4, η1 = −2, δ2 = 0.1, κ2 = 0.2 and η2 = −5. In Table 1, we report summary statistics for 1000 observations from this model, and the estimated parameters are summarized in Table 2.

The use of the nonparametric adjustment ξk(x) requires the optimal selection of the smoothing parameter hk, called the bandwidth. For a practical application, the choice of the bandwidth hk is very important. The bandwidth depends on the sample size N, and for consistency of the kernel estimator we must have hk → 0 and hkN → ∞ as N → ∞. For practical implementation, however, this condition is not very helpful. A very small bandwidth yields a rough course of the estimated function, while a large bandwidth leads to a smooth path but will probably flatten out features, beyond what the asymptotic condition suggests. So the bandwidth hk was chosen by an 'opening the window' technique, i.e. by trying several bandwidths and deciding on a good compromise which is neither too smooth nor too rough. To select the best bandwidth, we applied the following least-squares criterion:

$$h_k = \arg\min_{h_k} \sum_{t=2}^{N} C_{tk}\left[Y_t - g(Y_{t-1}, \theta_k)\xi_k(Y_{t-1}) - \rho_k\big(Y_{t-1} - g(Y_{t-2}, \theta_k)\xi_k(Y_{t-2})\big)\right]^2.$$
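The 'opening the window' search over this criterion can be sketched as a plain grid search over candidate bandwidths. The helper `nw_smooth` and the ratio-based stand-in for ξk below are illustrative assumptions (the paper's EM-based estimator of ξk differs in detail), and the single-regime usage with Ct = 1 is a toy example:

```python
import numpy as np

def nw_smooth(x, xs, ys, h):
    # Gaussian-kernel Nadaraya-Watson smoother evaluated at points x
    w = np.exp(-0.5 * ((x[:, None] - xs[None, :]) / h) ** 2)
    return (w @ ys) / np.clip(w.sum(axis=1), 1e-12, None)

def select_bandwidth(Y, C, g, rho, grid):
    # grid search over candidate bandwidths for the least-squares criterion;
    # xi_k is replaced here by a crude Nadaraya-Watson estimate of the ratio
    # Y_t / g(Y_{t-1}) -- an assumption for illustration only
    best_h, best_sse = grid[0], np.inf
    for h in grid:
        xi = lambda x: nw_smooth(x, Y[:-1], Y[1:] / g(Y[:-1]), h)
        resid = (Y[2:] - g(Y[1:-1]) * xi(Y[1:-1])
                 - rho * (Y[1:-1] - g(Y[:-2]) * xi(Y[:-2])))
        sse = np.sum(C[2:] * resid ** 2)
        if sse < best_sse:
            best_h, best_sse = h, sse
    return best_h

# toy usage on a synthetic AR(1) path with a single regime (C_t = 1)
rng = np.random.default_rng(0)
Y = np.zeros(300)
for t in range(1, 300):
    Y[t] = 0.5 * Y[t - 1] + 0.1 * rng.standard_normal()
h_star = select_bandwidth(Y, np.ones(300), lambda x: 0.5 * x + 1.0,
                          0.0, [0.05, 0.1, 0.2, 0.4])
```

In practice the grid would be refined around the winning value, mirroring the compromise between over- and under-smoothing described above.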

For the final bandwidths of the two kernel estimates, we get h1 = 0.0152 and h2 = 0.0209. In Figure 1(a), we show the simulated series. Figures 1(b) and 1(c) show the corresponding scatter plots of (Yt, Yt−1) and (Yt, Yt−2), respectively, which indicate the degree of dependence among the observations.

Table 1. Descriptive statistics for the simulated data.

Mean                0.6199      Maximum     0.7540
Standard deviation  0.0362      Minimum     0.5003
Skewness           −0.0071      Kurtosis    3.0661

Table 2. The mean, bias and standard deviation of the estimated parameters for the simulated data.

              Mean                  Bias                   SD
(A12, A21)    (0.2984, 0.1950)      (−0.0016, −0.0050)     (0.046, 0.063)
(ω1, ω2)      (0.0346, 0.0213)      (0.0246, 0.0013)       (0.075, 0.021)
(α1, α2)      (0.0432, 0.0017)      (−0.0068, −0.0003)     (0.044, 0.033)
(β1, β2)      (0.0112, 0.0451)      (0.0012, 0.0151)       (0.012, 0.086)
(δ1, δ2)      (0.6972, 0.0978)      (−0.0028, −0.0022)     (0.025, 0.032)
(κ1, κ2)      (0.4002, 0.2019)      (0.0002, 0.0019)       (0.012, 0.042)
(η1, η2)      (−2.0040, −4.9828)    (−0.0040, 0.0172)      (0.101, 0.137)
(ρ1, ρ2)      (0.7029, 0.8011)      (0.0029, 0.0011)       (0.037, 0.066)
(π1, π2)      (0.3901, 0.6015)      (−0.0099, 0.0015)      (0.068, 0.053)
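A data-generating process of this kind can be sketched end-to-end. The exact recursion of model (2) is given earlier in the paper, so the AR recursion with ξk ≡ 1 and the ARCH(2)-style variance σt² = ωk + αk e²_{t−1} + βk e²_{t−2} below are assumptions for illustration; the parameter values, g functions and transition matrix A are the ones stated above:

```python
import numpy as np

rng = np.random.default_rng(1)

# stated parameters for the two regimes
rho   = np.array([0.7, 0.8])
omega = np.array([0.01, 0.02])
alpha = np.array([0.05, 0.002])
beta  = np.array([0.01, 0.03])
delta = np.array([0.7, 0.1])
kappa = np.array([0.4, 0.2])
eta   = np.array([-2.0, -5.0])
A = np.array([[0.7, 0.3], [0.2, 0.8]])   # transition probability matrix

def g(x, k):
    # g(x; delta_k, kappa_k, eta_k) = delta_k*x + kappa_k*exp(eta_k*(x-2)^2)
    return delta[k] * x + kappa[k] * np.exp(eta[k] * (x - 2.0) ** 2)

N = 1000
Q = np.zeros(N, dtype=int)            # hidden regime path
Y = np.zeros(N); e = np.zeros(N)      # observations and innovations
Y[0] = Y[1] = 0.6
for t in range(2, N):
    Q[t] = rng.choice(2, p=A[Q[t - 1]])     # Markov regime switch
    k = Q[t]
    # assumed ARCH(2)-style conditional variance (illustrative only)
    sig2 = omega[k] + alpha[k] * e[t - 1] ** 2 + beta[k] * e[t - 2] ** 2
    e[t] = np.sqrt(sig2) * rng.standard_normal()
    # AR part with the nonparametric adjustment taken as xi_k = 1
    Y[t] = g(Y[t - 1], k) + rho[k] * (Y[t - 1] - g(Y[t - 2], k)) + e[t]
```

With both regimes' AR polynomials having roots inside the unit circle (0.7; and 0.8, 0.1), the simulated path stays stable, which is what makes summary statistics like those in Table 1 meaningful.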



Figure 1. (a) Generated data. (b) Scatter plot (Yt, Yt−1). (c) Scatter plot (Yt, Yt−2). (d) Maximum of the estimated state probabilities.

Figure 2 shows the convergence of the n-step transition probability matrix A^n to the measure π = (0.4, 0.6), such that

$$\lim_{n\to\infty} A^n_{11} = \pi_1 = 0.4, \qquad \lim_{n\to\infty} A^n_{12} = \pi_2 = 0.6,$$
$$\lim_{n\to\infty} A^n_{21} = \pi_1 = 0.4, \qquad \lim_{n\to\infty} A^n_{22} = \pi_2 = 0.6,$$

where A^n_ij = pr(Q_{t+n} = j | Q_t = i); this demonstrates that the process has a unique stationary probability measure. Furthermore, one can observe that the process converges to the measure π at a geometric rate (1/4 < ρ < 2/3), such that

$$|A^n_{ij} - \pi_j| = o(\rho^n),$$

which says that the process is geometrically ergodic.

One of the tools suitable for evaluating the model is max(Ctk, k = 1, ..., M). The final values of Ctk, k = 1, ..., M, can be used to classify the observations by the following rule: Yt belongs to regime k if and only if Ctk = max_{i=1,...,M} Cti.

Figure 1(d) shows max(Ct1, Ct2), which, except for a few cases, is at least 0.9; i.e. there is a clear decision for one of the two phases in the large majority of cases, and we find that a high percentage of the data are correctly classified.
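The convergence of A^n to π can also be checked numerically: for this A the second-largest eigenvalue is 0.5, which indeed lies in the stated interval (1/4, 2/3), and both rows of a modest matrix power are already numerically equal to π:

```python
import numpy as np

A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
pi = np.array([0.4, 0.6])       # stationary measure: pi @ A = pi

# the second-largest eigenvalue of A controls the geometric rate
# |A^n_ij - pi_j| = O(rho^n); here the eigenvalues are 1 and 0.5
eigvals = np.sort(np.linalg.eigvals(A).real)

An = np.linalg.matrix_power(A, 50)
err = np.abs(An - pi).max()     # both rows of A^50 collapse onto pi
```

In fact A^n_11 − π1 = 0.6·(0.5)^n exactly for this chain, so the error at n = 50 is at machine-precision level, consistent with the geometric bounds plotted in Figure 2.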



Figure 2. (a) {A^n_11}, (b) {A^n_12}, (c) {A^n_21}, (d) {A^n_22}, (e) {|A^n_11 − π1|}, (f) {|A^n_12 − π2|}, (g) {|A^n_21 − π1|}, (h) {|A^n_22 − π2|}. Remark: {|A^n_ij − πj|} (blue), {(2/3)^n} (red), {(1/4)^n} (green) (color online only).

5.2 Empirical application

We consider a data set consisting of the GBP to USD (British Pound to US Dollar) exchange rates, downloaded from http://www.tititudorancea.com, for the period 27 December 2011–14 December 2012. At first, considering four forms for the initial function g(x, θ), we fitted four single-regime models, M = 1, to the observations. The initial forms consist of exponential, logistic and trigonometric functions as follows:

$$g_1(x, \theta) = \exp(\theta x), \qquad g_2(x, \theta) = \exp(\theta\sqrt{x}),$$
$$g_3(x, \theta) = \frac{\exp(\theta x)}{1 + \exp(\theta x)}, \qquad g_4(x, \theta) = \theta\sin(x).$$
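The four candidate functions translate directly into code. The evaluation point 1.6 below is just a typical GBP/USD level from Figure 3 (not a value used in the paper), paired with the single-regime θ estimates reported in Table 3:

```python
import numpy as np

# the four candidate initial functions fitted to the exchange-rate series
def g1(x, theta): return np.exp(theta * x)
def g2(x, theta): return np.exp(theta * np.sqrt(x))   # requires x >= 0
def g3(x, theta): return np.exp(theta * x) / (1.0 + np.exp(theta * x))
def g4(x, theta): return theta * np.sin(x)

# e.g. evaluated at an illustrative GBP/USD level of 1.6
vals = [g1(1.6, 0.5098), g2(1.6, 0.7561), g3(1.6, 2.8364), g4(1.6, 0.9808)]
```

Note g2 is only defined for non-negative arguments, which is unproblematic here since exchange rates are positive.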


Table 3. The estimated parameters for the single regime (M = 1).

        g1(x, θ) = exp(θx)    g2(x, θ) = exp(θ√x)
ω       0.0225 (0.014)        0.1373 (0.025)
α       0.0027 (0.025)        0.0067 (0.003)
β       0.0038 (0.042)        0.0071 (0.014)
θ       0.5098 (0.016)        0.7561 (0.116)
ρ       0.9248 (0.008)        0.7913 (0.016)
h       0.0253 (0.012)        0.0298 (0.025)
MSE     0.6513                0.8487

        g3(x, θ) = exp(θx)/(1 + exp(θx))    g4(x, θ) = θ sin(x)
ω       0.5409 (0.004)                      0.5073 (0.071)
α       0.6497 (0.033)                      0.4676 (0.020)
β       0.0354 (0.027)                      0.0309 (0.134)
θ       2.8364 (0.037)                      0.9808 (0.017)
ρ       0.5887 (0.184)                      0.7819 (0.016)
h       0.0280 (0.087)                      0.0147 (0.062)
MSE     1.3487                              1.0502

Note: Standard errors are given within parentheses.


Figure 3. GBP to USD (British Pound to US Dollar) exchange rate values (blue) and the step function (red) (color online only).


Table 3 shows the estimated parameters of the single-regime models based on the initial functions g(x, θ), and we computed the average squared error (ASE) to compare the efficiency of the models:

$$\mathrm{ASE} = \frac{1}{N}\sum_{t=1}^{N}\{Y_t - \hat{Y}_t\}^2.$$

The square root of the ASE is denoted by MSE. Comparing the MSE values for the four models, we find that the exponential models g1 and g2 are more efficient than the logistic and trigonometric models based on g3 and g4. This shows that the selection of the initial function g(x, θ) is important; in this case, the logistic and trigonometric functions cannot offer a proper approximation.
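The two error measures are a direct transcription of the definitions above:

```python
import numpy as np

def ase(y, y_hat):
    # average squared error: ASE = (1/N) * sum_{t=1}^{N} (Y_t - Yhat_t)^2
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.mean((y - y_hat) ** 2))

def mse(y, y_hat):
    # the paper reports the square root of the ASE and calls it MSE
    return float(np.sqrt(ase(y, y_hat)))
```

Either quantity ranks the four fitted models identically, since the square root is monotone.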

On the other hand, according to Figure 3, we can see that the observations follow different regimes, which we highlight by a step function: the step function (red) illustrates that the sample path is governed by two regimes, where the first regime corresponds to low-volatility observations, whereas the second regime shows periods of high volatility. This suggests

Table 4. The estimated parameters for the mixture models.

            g1(x, θ) = exp(θx)               g2(x, θ) = exp(θ√x)
A12, A21    0.3142, 0.3129 (0.021, 0.040)    0.2881, 0.4330 (0.115, 0.022)
ω1, ω2      0.0016, 0.0014 (0.025, 0.012)    0.0172, 0.1487 (0.105, 0.138)
α1, β1      0.0210, 0.0121 (0.021, 0.031)    0.0179, 0.0524 (0.138, 0.111)
α2, β2      0.0026, 0.0013 (0.009, 0.012)    0.0176, 0.0035 (0.026, 0.062)
θ1, θ2      0.2959, 0.3203 (0.043, 0.045)    0.7886, 0.5515 (0.054, 0.065)
ρ1, ρ2      0.8775, 0.8627 (0.022, 0.035)    0.7193, 0.6272 (0.077, 0.064)
π1, π2      0.4990, 0.5010 (0.001, 0.002)    0.6005, 0.3995 (0.003, 0.001)
h1, h2      0.0181, 0.0331 (0.052, 0.060)    0.0590, 0.0692 (0.005, 0.003)
MSE         0.0023                           0.0125

            g3(x, θ) = exp(θx)/(1 + exp(θx))    g4(x, θ) = θ sin(x)
A12, A21    0.4381, 0.2149 (0.002, 0.018)       0.5169, 0.5224 (0.025, 0.022)
ω1, ω2      0.0360, 0.0655 (0.022, 0.027)       0.2591, 0.6263 (0.098, 0.057)
α1, β1      0.1143, 0.0144 (0.002, 0.117)       0.0981, 0.0526 (0.022, 0.029)
α2, β2      0.2095, 0.0057 (0.016, 0.010)       0.2594, 0.0706 (0.086, 0.062)
θ1, θ2      1.6989, 3.0156 (0.011, 0.057)       0.6090, 0.5971 (0.057, 0.071)
ρ1, ρ2      0.4191, 0.3261 (0.066, 0.021)       0.6064, 0.6596 (0.095, 0.071)
π1, π2      0.3291, 0.6709 (0.002, 0.004)       0.5026, 0.4974 (0.001, 0.022)
h1, h2      0.0186, 0.0325 (0.021, 0.009)       0.0560, 0.0650 (0.071, 0.036)
MSE         0.0434                              0.0352

Note: Standard errors are given within parentheses.



Figure 4. The sample path and the approximated models 1, 2, 3 and 4.


Figure 5. Maximum of the estimated state probabilities for model 1.

that the use of mixture models can be a proper mechanism to capture these regimes. So, four mixture models based on g1, g2, g3 and g4 with two regimes, M = 2, were considered. Table 4 shows the estimated parameters of the mixture models. Considering the MSE values for the four models, as with the single-regime models, we find that the exponential models g1 and g2 are more efficient than the logistic and trigonometric models based on g3 and g4. Also, model (1), with g1(x, θ) = exp(θx), is the most capable model for estimating the observations among the exponential models. Furthermore, comparing the MSE values in Tables 3 and 4, we can say that the mixture models are better than the single-regime models. Figure 4 shows the observations and the estimated Y_t^(1), Y_t^(2), Y_t^(3) and Y_t^(4) based on g1, g2, g3 and g4, respectively, where the sample size is N = 245. Figure 5 shows max(Ct1, Ct2) based on g1, which is almost always at least 0.8, showing that model (1) is powerful in classifying the observations. We also fitted a model with three regimes to the data, which did not lead to any improvement.


Table 5. The estimated parameters without nonparametric adjustment.

            g1(x, θ) = exp(θx)               g2(x, θ) = exp(θ√x)
A12, A21    0.2931, 0.3075 (0.109, 0.046)    0.3078, 0.5059 (0.064, 0.079)
ω1, ω2      0.0346, 0.0029 (0.082, 0.011)    0.0051, 0.0176 (0.028, 0.006)
α1, β1      0.1742, 0.1544 (0.136, 0.066)    0.0315, 0.0160 (0.105, 0.052)
α2, β2      0.0383, 0.0292 (0.008, 0.036)    0.0959, 0.0355 (0.074, 0.023)
θ1, θ2      0.2848, 0.3284 (0.004, 0.002)    0.8226, 0.5082 (0.104, 0.032)
ρ1, ρ2      0.8307, 0.7831 (0.074, 0.085)    0.9221, 0.6244 (0.059, 0.042)
π1, π2      0.5120, 0.4880 (0.003, 0.002)    0.6217, 0.3783 (0.005, 0.006)
h1, h2      0.0634, 0.0528 (0.008, 0.026)    0.0752, 0.0948 (0.120, 0.017)
MSE         0.1023                           0.1523

            g3(x, θ) = exp(θx)/(1 + exp(θx))    g4(x, θ) = θ sin(x)
A12, A21    0.3528, 0.3516 (0.096, 0.085)       0.4845, 0.4624 (0.008, 0.029)
ω1, ω2      0.0620, 0.0377 (0.007, 0.071)       0.4177, 0.5267 (0.088, 0.125)
α1, β1      0.3516, 0.1631 (0.018, 0.091)       0.1648, 0.0949 (0.017, 0.009)
α2, β2      0.5416, 0.0675 (0.035, 0.119)       0.3708, 0.0632 (0.033, 0.011)
θ1, θ2      1.3453, 3.7528 (0.016, 0.087)       0.6594, 0.6102 (0.097, 0.085)
ρ1, ρ2      0.4494, 0.3945 (0.044, 0.020)       0.6559, 0.6713 (0.036, 0.106)
π1, π2      0.4991, 0.5009 (0.003, 0.007)       0.4880, 0.5117 (0.006, 0.004)
h1, h2      0.0543, 0.0794 (0.018, 0.033)       0.0213, 0.0891 (0.109, 0.085)
MSE         0.1644                              0.1614

Note: Standard errors are given within parentheses.

Furthermore, the existence of the nonparametric adjustment ξk(x), which smooths the function g(x, θ), is also important. To show this, we applied the mixture models to the observations without the nonparametric adjustment. Table 5 shows the estimated parameters of the mixture models based on the four initial functions g(x, θ). Comparing the values in Tables 4 and 5, we find a significant difference between the MSE of the mixture models with and without the nonparametric adjustment multipliers, which shows the importance of applying the nonparametric adjustment.
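How a multiplicative adjustment of this kind can be kernel-estimated is sketched below. The Nadaraya–Watson ratio form and Gaussian kernel here are illustrative assumptions; the paper's estimator of ξk is obtained within the EM iteration and differs in detail:

```python
import numpy as np

def xi_hat(x, Y, C, g, theta, h):
    # Nadaraya-Watson-style sketch of the multiplicative adjustment xi_k:
    # smooth the ratios Y_t / g(Y_{t-1}; theta) with regime weights C_tk
    # and a Gaussian kernel of bandwidth h (illustration only)
    x = np.atleast_1d(np.asarray(x, float))
    K = np.exp(-0.5 * ((x[:, None] - Y[:-1][None, :]) / h) ** 2)
    w = K * C[1:][None, :]
    return (w @ (Y[1:] / g(Y[:-1], theta))) / np.clip(w.sum(axis=1), 1e-12, None)

# sanity check: if Y_t = g(Y_{t-1}; theta) exactly, the adjustment is identically 1
g = lambda x, th: np.exp(th * x)
Y = np.zeros(50); Y[0] = 0.1
for t in range(1, 50):
    Y[t] = g(Y[t - 1], 0.2)
adj = xi_hat(1.0, Y, np.ones(50), g, 0.2, 0.1)
```

When the parametric guess g is already correct, the estimated adjustment stays flat at 1; departures of ξk from 1 are exactly what the comparison between Tables 4 and 5 measures.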

6. Conclusion

We have considered a semi-parametric mixture model, a form of hidden Markov model, which can be useful for modeling time series with sudden changes in structure. We have illustrated that the semi-parametric estimator is applicable to mixtures of AR–ARCH models. The EM algorithm


provides a numerical method for calculating the function estimates, which reduces to applying common local smoothers as part of an iterative scheme. We have proved geometric ergodicity of {ζt} under simple conditions. The applications to simulated and real data look promising, but there are, of course, many possible extensions and open questions to be addressed in future work. Investigating mixtures of more general models and nonlinear estimation methods are possible future projects. Also, the asymptotic behavior of the parameter estimates, conditions for consistency and hypothesis tests for the parameters are topics in the mixture framework that remain to be investigated.

References

[1] R. Bhattacharya and C. Lee, On geometric ergodicity of nonlinear autoregressive models, Statist. Probab. Lett. 22 (1995), pp. 311–315.
[2] M. Borkovec and C. Kluppelberg, The tail of the stationary distribution of an autoregressive process with ARCH(1) errors, Ann. Appl. Probab. 4 (2001), pp. 1220–1241.
[3] D.B.H. Cline, Regular variation of order 1 nonlinear AR–ARCH models, Stochastic Process. Appl. 117 (2006), pp. 840–861.
[4] J. Fan and Y.K. Truong, Nonparametric regression with errors in variables, Ann. Statist. 21 (1993), pp. 1900–1925.
[5] R. Farnoosh and S.J. Mortazavi, A semiparametric method for estimating nonlinear autoregressive model with dependent errors, Nonlinear Anal. 74(17) (2011), pp. 6358–6370.
[6] A. Francesco, Local likelihood for non-parametric ARCH(1) models, J. Time Ser. Anal. 26 (2005), pp. 251–278.
[7] J. Franke, J. Stockis, J. Tadjuidje, and W.K. Li, Mixtures of nonparametric autoregressions, J. Nonparametr. Stat. 23 (2011), pp. 287–303.
[8] J.D. Hamilton, A new approach to the economic analysis of nonstationary time series and the business cycle, Econometrica 57(2) (1989), pp. 357–384.
[9] J.T. Kamgaing, J.P. Stockis, and J. Franke, Nonparametric estimation in Markov switching autoregressive ARCH models, Tech. Rep., University of Kaiserslautern, Kaiserslautern, Germany, 2011.
[10] T. Lange, A. Rahbek, and S.T. Jensen, Estimation and asymptotic inference in the first order AR–ARCH model, Tech. Rep., Institute for Mathematical Sciences, University of Copenhagen, Copenhagen, Denmark, 2006.
[11] M. Lanne and P. Saikkonen, Modeling the US short-term interest rate by mixture autoregressive processes, J. Financ. Econom. 1 (2003), pp. 96–125.
[12] O. Lee, Stationarity and β-mixing property of a mixture AR–ARCH models, Bull. Korean Math. Soc. 43 (2006), pp. 813–820.
[13] O. Lee and D.W. Shin, On geometric ergodicity of an AR–ARCH type process with Markov switching, J. Korean Math. Soc. 41 (2004), pp. 309–318.
[14] Q. Li, Consistent model specification test for time series econometric models, J. Econom. 92 (1999), pp. 101–147.
[15] S.P. Meyn and R.L. Tweedie, Markov Chains and Stochastic Stability, Springer, London, 1993.
[16] A. Schick, Efficient estimation in a semi-parametric additive regression model with autoregressive errors, Stoch. Process. Appl. 61 (1996), pp. 339–361.
[17] J.P. Stockis, J. Franke, and J.T. Kamgaing, On geometric ergodicity of CHARME models, J. Time Ser. Anal. 31 (2010), pp. 141–152.
[18] D. Tjostheim, Nonlinear time series and Markov chains, Adv. Appl. Probab. 22 (1990), pp. 587–611.
[19] H. Tong, Nonlinear Time Series, Oxford University Press, Oxford, 1990.
[20] Y.K. Truong and C.J. Stone, Semi-parametric time series regression, J. Time Ser. Anal. 15 (1994), pp. 405–428.
[21] C.S. Wong and W.K. Li, On a mixture autoregressive model, J. Roy. Statist. Soc. Ser. B 62 (2000), pp. 95–115.
[22] C.S. Wong and W.K. Li, On a mixture autoregressive conditional heteroscedastic model, J. Amer. Statist. Assoc. 96 (2001), pp. 982–995.
