7/29/2019 Gospodinov Ng
Minimum Distance Estimation of Possibly
Non-Invertible Moving Average Models
Nikolay Gospodinov*  Serena Ng†
November 5, 2012
Abstract
This paper proposes classical and simulation-based minimum distance estimation of moving average (MA) models with non-Gaussian errors. Information in higher order cumulants allows identification of the parameters without imposing invertibility. By removing the invertibility restriction, the presence of a moving average unit root no longer presents a boundary problem that gives rise to non-standard asymptotics. As a result, the minimum distance estimator of the MA(1) model has classical root-T asymptotic normal properties when the moving average root is inside, outside, and on the unit circle. For more general models when the dependence of the cumulants on the model parameters is analytically intractable, we propose a simulation estimator based on auxiliary regressions with parameters that are informative about the higher order cumulants. The method uses an error simulator with a flexible functional form that accommodates a large class of distributions with non-Gaussian features. The simulation estimator is also approximately normally distributed without imposing the a priori assumption of invertibility.
JEL Classification: C13, C15, C22
Keywords: Minimum distance; Non-invertibility; Indirect inference; Identification; Non-Gaussian errors; Generalized lambda distribution
*Concordia University and CIREQ, 1455 de Maisonneuve Blvd. West, Montreal, QC H3G 1M8, Canada. Email: [email protected]
†Columbia University, 420 W. 118 St. MC 3308, New York, NY 10027. Email: [email protected]
We would like to thank Prosper Dovonon, Anders Bredahl Kock, Ivana Komunjer and the participants at the CESG meeting at Queen's University for useful comments and suggestions. The first author gratefully acknowledges financial support from FQRSC, IFM2 and SSHRC. The second author acknowledges financial support from the National Science Foundation (SES-0962431).
1 Introduction
Moving average (MA) models can parsimoniously characterize the dynamic behavior of many time
series processes. The challenges in estimating MA models are two-fold. First, invertible and
non-invertible moving average processes are observationally equivalent up to the second moments.
Second, invertibility puts an upper bound of one on all roots of the moving average polynomial,
rendering estimators with non-normal asymptotic distributions when some roots are on or near the
unit circle. Existing estimators treat invertible and non-invertible processes separately, requiring the
researcher to take a stand on the parameter space of interest. While estimators are super-consistent
under the null hypothesis of a moving average unit root, their distributions are not asymptotically
pivotal. To our knowledge, no estimator of the MA model exists that achieves identification without
imposing invertibility and yet enables classical inference over the whole parameter space.
Both invertible and non-invertible representations can be consistent with economic theory. For
example, if the logarithm of the asset price is the sum of a random walk component and a stationary component, the first difference (or asset returns) is generally invertible, but non-invertibility can
arise if the variance of the stationary component is large. While non-invertible models are not ruled
out by theory, invertibility is often the mainstream assumption in empirical work. One reason is that
non-invertible models are not useful for forecasting because future values of the endogenous variable
are not observable. The more practical reason is that the assumption provides the identification restrictions without which maximum likelihood and covariance structure-based estimation of MA models would not be possible when the data are normally distributed.1 Obviously, falsely assuming invertibility will yield an inferior fit of the data. It can also lead to spurious estimates of the impulse coefficients, which are often the objects of interest. Hansen and Sargent (1991), Lippi and Reichlin (1993), Fernández-Villaverde et al. (2007), among others, emphasize the need to verify invertibility because it affects how we interpret what can be recovered from the data. Indeed, it is necessary in many science and engineering applications to admit parameter values in the non-invertible range.2 A key finding in these studies is that higher order cumulants are necessary for identification when the non-invertible models are to be entertained.
This paper considers minimum distance estimation of MA models without imposing invertibility
a priori. We first show using the MA(1) model that use of higher cumulants per se is not sufficient
1 Invertibility can also help to identify structural models. For example, Komunjer and Ng (2011) use invertibility to narrow the class of equivalent DSGE models.
2 For example, in seismology, an accurate model of the seismic source wavelet, in the form of a moving average filter, is necessary to recover the earth's reflectivity sequence. The fact that seismic data typically exhibit non-Gaussian features suggests the need for a wavelet (moving average polynomial) which is non-invertible. Similarly, in communication analysis, an accurate modeling of the communication channel by a possibly non-invertible moving average process is required to back out the underlying message from the observed distorted message.
for the Jacobian matrix to be full rank everywhere in the parameter space. Exploiting the fact that the mapping between the structural parameters and the cumulants can be explicitly derived for the MA(1) case, we show that the cumulants can over- but not exactly identify the MA(1) model if a unit root and parameters consistent with non-invertibility are admissible. However, two second order along with three third order cumulants can be used to construct a classical minimum distance estimator that is root-T consistent and uniformly asymptotically normal.
Extension of the classical minimum distance estimator to more general moving average models
is not possible when the relation between the model parameters and the higher order cumulants is
not analytically tractable. Thus, we also propose a simulation based minimum distance estimator
with errors drawn from the generalized lambda distribution. It is an alternative to the semi-
parametric density considered in Gallant and Tauchen (1996) for simulating non-Gaussian errors.
The estimator uses multiple auxiliary regressions and has the flavor of indirect inference estimation
proposed by Gourieroux et al. (1993) as well as the simulated method of moments of Duffie and
Singleton (1993). The proposed estimator also has classical asymptotic properties regardless of
whether the MA roots are inside, outside, or on the unit circle.
The main arguments of the analysis are presented using the MA(1) model but extensions to more
general models are also discussed. Section 2 proceeds to highlight two identification problems in the
context of minimum distance estimation. Section 3 discusses the properties of the classical minimum
distance estimator based only on information about the covariance structure of the process. It also
motivates the need for using higher order cumulants in estimation and explains how identification
can be achieved. Section 4 develops a simulation minimum distance estimator for more general
moving average models. An empirical application for commodity prices is provided in Section 5. Section 6 concludes.
2 Two Identification Problems
Consider the autoregressive and moving average (ARMA) process of order (p, q):

$$\phi(L) y_t = \theta(L) e_t,$$

where $e_t \sim \mathrm{iid}(0, \sigma^2)$, $L$ is the lag operator such that $L^p y_t = y_{t-p}$, and $\phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$ has no common roots with $\theta(L) = 1 + \theta_1 L + \cdots + \theta_q L^q$. The autoregressive polynomial $\phi(z)$ is said to be causal if $\phi(z) \neq 0$ for all $|z| \le 1$ on the complex plane, and the moving average polynomial is said to be invertible if $\theta(z) \neq 0$ for all $|z| \le 1$ (Brockwell and Davis (1991)). If $y_t$ is a causal function of $e_t$, then there exist constants $h_j$ with $\sum_{j=0}^{\infty} |h_j| < \infty$ such that $y_t = \sum_{j=0}^{\infty} h_j e_{t-j}$ for $t = 0, \pm 1, \ldots$ We say that $y_t$ has minimum phase if the zeros of $\phi(z)$ and $\theta(z)$ are all greater than
2
-
7/29/2019 Gospodinov Ng
4/37
one in absolute value.3 Few economic time series exhibit explosive behavior. If we narrow the focus
to causal and stable processes, invertible processes also have minimum phase.
If a process $y_t$ is invertible in $e_t$, then there exist constants $\pi_j$ with $\sum_{j=0}^{\infty} |\pi_j| < \infty$ such that $e_t = \sum_{j=0}^{\infty} \pi_j y_{t-j} = \pi(L) y_t$. For ARMA(p, q) models, invertibility requires that the inverse of $\theta(L)$ has a convergent series expansion in positive powers of the lag operator $L$. For the MA(1) model

$$y_t = e_t + \theta e_{t-1}, \qquad (1)$$

with $e_t \sim \mathrm{iid}(0, \sigma^2)$, the invertibility condition is satisfied if $|\theta| < 1$ since $\pi(L) = \sum_{s=0}^{\infty} (-\theta)^s L^s$ is a polynomial in positive powers of $L$. This is no longer true when $|\theta|$ in (1) exceeds one. It is, however, misleading to classify invertible and non-invertible processes according to the value of $\theta$ alone. Consider the MA(1) process $y_t$ represented by

$$y_t = \theta e_t + e_{t-1}. \qquad (2)$$

Even if $\theta$ in (2) is less than one, $y_t$ is still non-invertible because the implied $\pi(L) = \sum_{s=0}^{\infty} (-\theta)^s L^{-s-1}$ is a polynomial in negative powers of $L$.
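The contrast can be illustrated numerically. The sketch below is an illustrative simulation, not part of the paper's analysis: the truncation length J and the Gaussian errors are arbitrary choices. Applying the truncated filter $\pi(L) = \sum_{s=0}^{J-1}(-\theta)^s L^s$ to simulated MA(1) data recovers $e_t$ when $|\theta| < 1$ but diverges when $|\theta| > 1$:

```python
import numpy as np

rng = np.random.default_rng(0)
T, J = 3000, 60                                  # sample size and filter truncation (arbitrary)

def recovery_mse(theta):
    """MSE of the truncated AR(inf) filter's error estimates against the true e_t."""
    e = rng.standard_normal(T)
    y = e[1:] + theta * e[:-1]                   # MA(1): y_t = e_t + theta * e_{t-1}
    pi = (-theta) ** np.arange(J)                # pi_s = (-theta)^s
    e_hat = np.array([pi @ y[t - J + 1:t + 1][::-1] for t in range(J - 1, len(y))])
    return np.mean((e_hat - e[J:]) ** 2)

assert recovery_mse(0.5) < 0.01                  # invertible: e_t is recovered
assert recovery_mse(1.5) > 1.0                   # non-invertible: recovery fails
```

The residual error in the invertible case is the truncation term $\theta^J e_{t-J}$, which vanishes geometrically for $|\theta| < 1$ and explodes otherwise.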
Invertible and non-invertible processes have distinctive features with implications for forecasting. In the invertible case, the span of $e_t$ and its history coincides with that of $y_t$, which is observed by the econometrician. The one-step ahead forecast errors are $e_{t|t-1} = y_t - y_{t|t-1} = e_t$. In the non-invertible case, the econometrician does not observe future values of $y_t$ and his information set is strictly inferior to that of the economic agent. As discussed in Ramsey and Montenegro (1992), the one-step ahead forecast errors when $y_t$ is generated by the non-invertible model (2) are

$$e_{t|t-1} = y_t - \theta(e_{t-1} + \theta e_{t-2}) + \theta^2 (e_{t-2} + \theta e_{t-3}) + \cdots \neq e_t.$$

These differences are important in the subsequent analysis.
Identification and estimation of models with a moving-average component are difficult because of two problems that are best understood by focusing on the MA(1) case. The first identification problem concerns $\theta$ at or near unity. When the MA parameter is near the unit circle, the Gaussian maximum likelihood estimator (MLE) takes values exactly on the boundary of the invertibility region with positive probability (the so-called pile-up problem) in finite samples. This point probability mass at unity arises from the symmetry of the likelihood function around one and the small-sample deficiency in identifying all the critical points of the likelihood function in the vicinity of the non-invertibility boundary; see Sargan and Bhargava (1983), Anderson and Takemura (1986), Davis and Dunsmuir (1996), Gospodinov (2002), Davis and Song (2011).
3 A non-stationary process is not mean reverting, with the property that the shocks have permanent effects on the series. This has generated a large literature on testing the unit root hypothesis against the alternative of stationarity.
The second identification problem arises because covariance stationary processes are completely characterized by the first and second moments of the observables, and an MA(1) model with parameters $(\theta, \sigma^2)'$ has the same autocovariance structure as a model parameterized by $(1/\theta, \theta^2 \sigma^2)'$. In consequence, the Gaussian likelihood for an MA(1) model with $L(\theta, \sigma^2)$ is the same as one with $L(1/\theta, \theta^2 \sigma^2)$. The observational equivalence of second moments also implies that the projection coefficients in $\pi(L)$ are the same regardless of whether $\theta$ is less than or greater than one. Thus, $\theta$ cannot be recovered from the coefficients without additional assumptions.
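A minimal numerical check (with hypothetical parameter values) confirms that the two parameterizations imply identical autocovariances $\gamma_0 = (1+\theta^2)\sigma^2$ and $\gamma_1 = \theta\sigma^2$:

```python
def ma1_autocovariances(theta, sigma2):
    """(gamma_0, gamma_1) of y_t = e_t + theta * e_{t-1} with Var(e_t) = sigma2."""
    return (1.0 + theta**2) * sigma2, theta * sigma2

theta, sigma2 = 2.0, 1.0                                   # a non-invertible root
flipped = ma1_autocovariances(1.0 / theta, theta**2 * sigma2)
assert ma1_autocovariances(theta, sigma2) == flipped       # identical second moments
print(flipped)   # (5.0, 2.0) for both parameterizations
```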
This observational equivalence problem can be further elicited from a frequency domain perspective. If we take as a starting point $y_t = h(L) e_t = \sum_{j=-\infty}^{\infty} h_j e_{t-j}$, the frequency response function of the filter is

$$H(\omega) = \sum_j h_j \exp(-i\omega j) = |H(\omega)| \exp(i\varphi(\omega)),$$

where $|H(\omega)|$ is the amplitude and $\varphi(\omega)$ is the phase response of the filter. For ARMA models, $h(z) = \theta(z)/\phi(z) = \sum_{j=-\infty}^{\infty} h_j z^j$. The amplitude response is usually constant for given $\omega$ and tends towards zero outside the interval $[0, \pi]$. If $e_t$ is Gaussian and $h(L)$ is invertible, second order statistics will correctly identify the amplitude and the phase of the wavelet. But for given $a > 0$, the phase $\varphi_0$ is indistinguishable from $\varphi(\omega) = \varphi_0 + a\omega$ for any $\omega \in [0, \pi]$. Recovering $e_t$ from the second order spectrum

$$S_{2,y}(z) = \sigma^2 |H(z)|^2$$

is problematic because $S_{2,y}(z)$ is proportional to the amplitude $|H(z)|^2$ with no information about the phase. The second order spectrum is thus said to be phase-blind. As explained in Lii and Rosenblatt (1982), one can flip the roots of $\theta(z)$ and $\phi(z)$ without affecting the modulus of the transfer function. With real distinct roots, there are $2^{p+q}$ ways of specifying the roots without changing the probability structure of $y_t$.
3 Classical Minimum Distance Estimation of MA(1) Model
The consequences of the two identification problems for estimation and inference are easily seen from the perspective of the classical minimum distance estimator that exploits only the covariance structure of the process. Let $\psi \in \Psi$ be a $K \times 1$ parameter vector of interest with a true value $\psi_0$, where the parameter space $\Psi$ is a subset of the $K$-dimensional Euclidean space $\mathbb{R}^K$. Consider estimating the MA(1) model indirectly via an auxiliary model with an $L \times 1$ ($L \ge K$) vector of parameters $\lambda \in \mathbb{R}^L$ that are functions of $\psi$, $\{\lambda \mid \lambda = \lambda(\psi), \psi \in \Psi\}$, with a pseudo-true value $\lambda_0 \equiv \lambda(\psi_0)$.
Given data $y \equiv (y_1, \ldots, y_T)'$ and a consistent estimator $\widehat{\lambda}_T$ with asymptotic variance $\widehat{\Omega}$, $\psi$ is chosen to minimize the difference between $\widehat{\lambda}_T$ and $\lambda(\psi)$. The classical minimum distance (CMD) estimator of $\psi$ using the optimal weighting matrix is defined as

$$\widehat{\psi}_T = \arg\min_{\psi} J_{CMD}(\widehat{\lambda}_T, \lambda(\psi); \widehat{\Omega}) = \arg\min_{\psi} (\widehat{\lambda}_T - \lambda(\psi))' \widehat{\Omega}^{-1} (\widehat{\lambda}_T - \lambda(\psi)). \qquad (3)$$

If $\lambda$ is of the same dimension as $\psi$ and $\lambda(\psi)$ is of known form and invertible, then $\widehat{\psi}_T = \lambda^{-1}(\widehat{\lambda}_T)$. In general, the dimension of $\lambda$ exceeds that of $\psi$. The auxiliary model need not nest the true model, but identification hinges on a well-behaved mapping from the space of $\psi$ to the parameter space of the auxiliary model.
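In the just-identified, invertible case the inversion $\widehat{\psi}_T = \lambda^{-1}(\widehat{\lambda}_T)$ is available in closed form for the MA(1) covariance structure. The sketch below uses illustrative values; note that the invertible root $|\theta| \le 1$ must be imposed by hand, which is exactly the identifying assumption discussed in Section 2. It solves $\gamma_1/\gamma_0 = \theta/(1+\theta^2)$:

```python
import math

def invert_ma1_moments(gamma1, gamma0):
    """Map (gamma_1, gamma_0) back to (theta, sigma2), picking the root |theta| <= 1."""
    r = gamma1 / gamma0                                        # assumes gamma1 != 0
    theta = (1.0 - math.sqrt(1.0 - 4.0 * r * r)) / (2.0 * r)   # smaller root of r*t^2 - t + r = 0
    return theta, gamma0 / (1.0 + theta**2)

gamma1, gamma0 = 0.5 * 2.0, (1.0 + 0.5**2) * 2.0   # implied by theta = 0.5, sigma2 = 2
theta_hat, sigma2_hat = invert_ma1_moments(gamma1, gamma0)
print(theta_hat, sigma2_hat)   # recovers 0.5 and 2.0 up to floating-point rounding
```

The discarded larger root is $1/\theta$, the observationally equivalent non-invertible parameterization from Section 2.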
Definition 1 Let $\lambda(\cdot): \Psi \to \Lambda$ be a mapping from $\psi$ to $\lambda$ and $G(\psi) = \partial \lambda(\psi)/\partial \psi'$ with $G_0 \equiv G(\psi_0)$. Then, $\psi_0$ is globally identified if $\lambda(\psi)$ is injective and is locally identified if the matrix of partial derivatives $G_0$ has full column rank.

Note that the requirement for full column rank of the derivative matrix $G(\psi_0)$ is a sufficient condition for the mapping $\lambda(\psi_0)$ to be locally injective. Hence, $\psi_0$ is locally identified if $\lambda(\psi_0)$ is locally injective, and Definition 1 provides a sufficient condition for local identification. From Definition 1, $\psi_1$ and $\psi_2$ are observationally equivalent if $\lambda(\psi_1) = \lambda(\psi_2)$.
Lemma 1 Suppose that the following conditions hold: (C.i) $\{y_t, -\infty < t < \infty\}$ is a strictly stationary and ergodic process; (C.ii) $\widehat{\lambda}_T \stackrel{p}{\to} \lambda_0$; (C.iii) the set $\Lambda = \{\lambda \mid \lambda = \lambda(\psi), \psi \in \Psi\}$ is a compact subset of $\mathbb{R}^L$ with $L \ge K$; (C.iv) $\lambda(\psi)$ is twice continuously differentiable in $\psi$; (C.v) the matrix of partial derivatives $G(\psi)$ has rank equal to $K$ on $\Psi$; (C.vi) the mapping $\lambda_0 = \lambda(\psi_0)$ is unique. Then, $\widehat{\psi} \stackrel{p}{\to} \psi_0$. If, in addition, (AN.i) $\psi_0$ is in the interior of $\Psi$; (AN.ii) $\sqrt{T}(\widehat{\lambda}_T - \lambda_0) \stackrel{d}{\to} N(0, \Omega_0)$, where $\Omega_0$ is a symmetric positive definite matrix and $\widehat{\Omega} \stackrel{p}{\to} \Omega_0$, then $\sqrt{T}(\widehat{\psi}_T - \psi_0) \stackrel{d}{\to} N\left(0, (G_0' \Omega_0^{-1} G_0)^{-1}\right)$.
Lemma 1, adapted from Ruud (2000), provides conditions for consistency and asymptotic normality of the classical minimum distance estimator. Except for (C.v), these are more or less standard conditions for extremum estimators to be consistent and asymptotically normally distributed; see, for example, Newey and McFadden (1994). Condition (C.v) is typically stated as a requirement for asymptotic normality but not a necessary condition for global identification; see Hall (2005, p. 69).4 However, in Rothenberg (1971), full rank of the derivative matrix is used in stating

4 Sargan (1983) and, more recently, Dovonon and Renault (2011) show that identification is possible even when the moment conditions have a degenerate derivative matrix at the true values of the parameters.
sufficient conditions for global identification.5 While this condition may be too strong in general, the condition is necessary for identification of the MA(1) model if the case $|\theta_0| = 1$ is to be allowed for, as we now discuss.

The MA(1) model has $\psi = (\theta, \sigma^2)'$ and $\psi_0 = (\theta_0, \sigma_0^2)'$, with parameter space $\Psi = [-\theta_H, \theta_H] \times [\sigma_L^2, \sigma_H^2]$. Identification when non-invertibility is admissible requires moments of order greater than 2; see Lii and Rosenblatt (1982, Lemma 1), Giannakis and Swami (1990), Mendel (1991). This necessarily requires that $e_t$ has non-Gaussian features.
The use of higher order cumulants per se does not, however, automatically guarantee identification of the MA(1) model. Let $e_t = \sigma \varepsilon_t$, where $\varepsilon_t \sim \mathrm{iid}(0, 1)$ with $E(\varepsilon_t^l) = \tau_l$ ($l \ge 3$), and assume that $\tau_3 \neq 0$. Since the third cumulant of a mean-zero, stationary process $y_t$ is defined as $\mathrm{cum}_{3,y}(u, v) = E(y_t y_{t+u} y_{t+v}) = \mathrm{cum}_{3,e} \sum_{i=0}^{q} h_i h_{i+u} h_{i+v}$, the non-zero third order cumulants for the MA(1) process are given by8

$$\mathrm{cum}_{3,y}(0, 0) = E(y_t^3) = (1 + \theta^3)\sigma^3 \tau_3,$$
$$\mathrm{cum}_{3,y}(0, -1) = E(y_t^2 y_{t-1}) = \theta^2 \sigma^3 \tau_3,$$
$$\mathrm{cum}_{3,y}(-1, 0) = E(y_t^2 y_{t-1}) = \theta^2 \sigma^3 \tau_3,$$
$$\mathrm{cum}_{3,y}(-1, -1) = E(y_t y_{t-1}^2) = \theta \sigma^3 \tau_3.$$
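These closed-form expressions can be checked by brute-force enumeration of $E(y_t y_{t+u} y_{t+v})$ over the impulse response weights $h_0 = 1$, $h_1 = \theta$, using only that $e_t$ is iid with moments $(0, \sigma^2, \sigma^3 \tau_3)$. The check below is illustrative, not part of the paper's derivations:

```python
from itertools import product

def third_moment(u, v, theta, sigma2, tau3):
    """E(y_t y_{t+u} y_{t+v}) for y_t = e_t + theta*e_{t-1}, by term enumeration."""
    h = {0: 1.0, 1: theta}                     # impulse responses h_0, h_1
    total = 0.0
    for j1, j2, j3 in product(h, h, h):        # one e-term from each of the three factors
        # the shocks involved are e_{t-j1}, e_{t+u-j2}, e_{t+v-j3}; with iid mean-zero
        # errors only the triple coincidence E(e^3) = sigma^3 * tau3 contributes
        if (-j1, u - j2, v - j3).count(-j1) == 3:
            total += h[j1] * h[j2] * h[j3] * sigma2**1.5 * tau3
    return total

theta, sigma2, tau3 = 1.5, 1.0, 0.85
assert abs(third_moment(0, 0, theta, sigma2, tau3) - (1 + theta**3) * sigma2**1.5 * tau3) < 1e-12
assert abs(third_moment(0, -1, theta, sigma2, tau3) - theta**2 * sigma2**1.5 * tau3) < 1e-12
assert abs(third_moment(-1, -1, theta, sigma2, tau3) - theta * sigma2**1.5 * tau3) < 1e-12
```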
Lemma 2 Let $\lambda = (\mathrm{cum}_{2,y}(1), \mathrm{cum}_{2,y}(0), \mathrm{cum}_{3,y}(u, v))'$, where $\mathrm{cum}_{3,y}(u, v)$ is one of the third order cumulants of $y_t$. Then, (i) $\lambda$ cannot globally identify $\psi$ for any $\psi_0 = (\theta_0, \sigma_0^2, \tau_{3,0})' \in \Psi$ and (ii) $\lambda$ cannot locally identify $\psi$ when $|\theta_0| = 1$ for any $\sigma^2$ and $\tau_3$.
7 All-pass models are non-causal and/or non-invertible ARMA models in which the roots of the autoregressive polynomial are reciprocals of the roots of the moving average polynomial and vice versa.
8 This follows from direct evaluation of $E(y_t y_{t+u} y_{t+v})$ with $y_t = e_t + \theta e_{t-1}$.
Lemma 2 considers the case when $\lambda$ and $\psi$ are of the same dimension. Part (i) of the lemma implies that there always exist $\psi_1 \in \Psi$ and $\psi_2 \in \Psi$ such that $\psi_1$ and $\psi_2$ are observationally equivalent in the sense that they generate the same $\lambda$. For example, $\psi_1 = (\theta, \sigma^2, \tau_3)'$ and $\psi_2 = (1/\theta, \theta^2 \sigma^2, \theta \tau_3)'$ both imply the same $\lambda = (E(y_t y_{t-1}), E(y_t^2), E(y_t^2 y_{t-1}))'$. It is easy to verify that the mapping from $\psi$ to $\lambda$ also fails to be injective when $\mathrm{cum}_{3,y}(0, -1)$ is replaced by $\mathrm{cum}_{3,y}(0, 0)$ or $\mathrm{cum}_{3,y}(-1, -1)$. This result arises because observational equivalence of second moments precludes $\mathrm{cum}_{2,y}(1)$ and $\mathrm{cum}_{2,y}(0)$ from exactly identifying $\theta$ and $\sigma^2$, and a single higher order cumulant might, but cannot be guaranteed to, identify both $\tau_3$ and the parameters of the MA(1) model. Part (ii) of Lemma 2 follows from the fact that the determinant of the derivative matrix is zero at $|\theta_0| = 1$. Local identification of $\psi$ when $|\theta_0| = 1$ will always require replacing $\mathrm{cum}_{2,y}(1)$ or $\mathrm{cum}_{2,y}(0)$ with another third order cumulant to avoid degeneracy in the derivative matrix.
Lemma 3 Let $\Lambda \equiv \{\mathrm{cum}_{2,y}(1), \mathrm{cum}_{2,y}(0), \mathrm{cum}_{3,y}(0, 0), \mathrm{cum}_{3,y}(0, -1), \mathrm{cum}_{3,y}(-1, -1)\}$ and $\lambda(\psi)$ be a function that maps $\psi$ to $\lambda$. Then, there exists at least one value in the parameter space $\Psi$ for $\psi$ at which $\mathrm{rank}(G(\psi)) < 3$ if $\dim(\lambda) = 3$.
The three third order cumulants together with the two second order cumulants $E(y_t y_{t-1})$ and $E(y_t^2)$ create ten distinct combinations of a three-dimensional vector of auxiliary parameters to be considered for exact identification. Direct calculations show that if $\lambda$ consists of $E(y_t y_{t-1})$, $E(y_t^3)$ and $E(y_t y_{t-1}^2)$, or of $E(y_t^2)$, $E(y_t^3)$ and $E(y_t y_{t-1}^2)$, the derivative matrix is singular at $\theta = (1/2)^{1/3}$. If $\lambda$ consists of $E(y_t y_{t-1})$, $E(y_t^2 y_{t-1})$ and $E(y_t y_{t-1}^2)$, or of $E(y_t^2)$, $E(y_t^2 y_{t-1})$ and $E(y_t y_{t-1}^2)$, the matrix of derivatives is singular at $\theta = 0$. If $\lambda$ contains both $E(y_t y_{t-1})$ and $E(y_t^2)$, the determinant of the derivative matrix is zero at $\theta = 1$ and $\theta = -1$. Finally, if $\lambda$ consists of $E(y_t y_{t-1})$, $E(y_t^2 y_{t-1})$ and $E(y_t^3)$, or of $E(y_t^2)$, $E(y_t^2 y_{t-1})$ and $E(y_t^3)$, the determinant of the derivative matrix is zero at $\theta = 0$ and $\theta = 2^{1/3}$. For all ten combinations of $\lambda$ considered, there always exist values of $\psi$ for which the derivative matrix is singular. The implication is that exact identification of all $\psi \in \Psi$ from three-dimensional subsets of $\Lambda$ is not possible.
We now propose an over-identified CMD estimator $\widehat{\psi}_{CMD}$ based on a vector of cumulants

$$\lambda_{CMD} = \left(\mathrm{cum}_{2,y}(1),\ \mathrm{cum}_{2,y}(0),\ \mathrm{cum}_{3,y}(0, -1),\ \mathrm{cum}_{3,y}(0, 0),\ \mathrm{cum}_{3,y}(-1, -1)\right)' = \left(E(y_t y_{t-1}),\ E(y_t^2),\ E(y_t^2 y_{t-1}),\ E(y_t^3),\ E(y_t y_{t-1}^2)\right)' \qquad (6)$$

and a mapping function

$$\Lambda_{CMD}(\psi) = \left(\theta \sigma^2,\ (1 + \theta^2)\sigma^2,\ \theta^2 \sigma^3 \tau_3,\ (1 + \theta^3)\sigma^3 \tau_3,\ \theta \sigma^3 \tau_3\right)'.$$
From a methods of moments perspective, $\lambda_{CMD}$ contains information about the covariance structure, the unconditional skewness of the process, and the time-varying second moments of the observables. The latter is useful for identification because Ramsey and Montenegro (1992) show that the residuals from an autoregressive approximation exhibit ARCH-type structure if the underlying MA process is non-invertible and the true errors are asymmetric.
The derivative matrix of $\Lambda_{CMD}(\psi)$ with respect to $\psi$ is

$$G_{CMD}(\psi) = \begin{pmatrix} \sigma^2 & \theta & 0 \\ 2\theta\sigma^2 & 1 + \theta^2 & 0 \\ 2\theta\sigma^3\tau_3 & 3\theta^2\sigma\tau_3/2 & \theta^2\sigma^3 \\ 3\theta^2\sigma^3\tau_3 & 3(1 + \theta^3)\sigma\tau_3/2 & (1 + \theta^3)\sigma^3 \\ \sigma^3\tau_3 & 3\theta\sigma\tau_3/2 & \theta\sigma^3 \end{pmatrix}. \qquad (7)$$

Notably, due to the addition of the three higher order cumulants, the derivative matrix has full column rank everywhere in $\Psi$ even at $|\theta_0| = 1$, and conditions (C.v) and (AN.i) are thus satisfied. The rank condition is necessary for $\psi_0$ to be a unique solution to the system of non-linear equations characterized by

$$G_{CMD}(\psi)' W (\lambda_{CMD} - \Lambda_{CMD}(\psi)) = 0_{3 \times 1}.$$

Provided that $\tau_3 \neq 0$, the uniqueness condition (C.vi) holds. The full rank condition is also necessary for the estimator to be asymptotically normal. As a result, this CMD estimator is root-T consistent and asymptotically normal.
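As a quick numerical sanity check (with illustrative parameter values), the matrix in (7) can be evaluated directly: its column rank is 3 at the unit root provided $\tau_3 \neq 0$, and drops in the Gaussian case $\tau_3 = 0$:

```python
import numpy as np

def G_cmd(theta, sigma2, tau3):
    """Derivative matrix (7) of Lambda_CMD(psi) with respect to (theta, sigma2, tau3)."""
    s = np.sqrt(sigma2)
    return np.array([
        [sigma2,                     theta,                            0.0],
        [2 * theta * sigma2,         1 + theta**2,                     0.0],
        [2 * theta * s**3 * tau3,    1.5 * theta**2 * s * tau3,        theta**2 * s**3],
        [3 * theta**2 * s**3 * tau3, 1.5 * (1 + theta**3) * s * tau3,  (1 + theta**3) * s**3],
        [s**3 * tau3,                1.5 * theta * s * tau3,           theta * s**3],
    ])

for th in (0.5, 1.0, -1.0, 1.5):                          # includes both unit-circle cases
    assert np.linalg.matrix_rank(G_cmd(th, 1.0, 0.85)) == 3
assert np.linalg.matrix_rank(G_cmd(1.0, 1.0, 0.0)) == 2   # Gaussian errors: rank deficient
```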
Proposition 1 Consider the MA(1) model (1) with $e_t = \sigma \varepsilon_t$, $\varepsilon_t \sim \mathrm{iid}(0, 1)$ and $E(\varepsilon_t^3) = \tau_3$. Assume that $\tau_3 \neq 0$ and $E|\varepsilon_t|^6 < \infty$. Let $\psi = (\theta, \sigma^2, \tau_3)'$ and $\widehat{\psi}_{CMD}$ be the minimum distance estimator based on (6), with $\Omega_{CMD} = \mathrm{Avar}(\widehat{\lambda}_{CMD})$. Then,

$$\sqrt{T}(\widehat{\psi}_{CMD} - \psi_0) \stackrel{d}{\to} N\left(0,\ \left(G_{CMD}(\psi_0)' \Omega_{CMD}^{-1} G_{CMD}(\psi_0)\right)^{-1}\right).$$
While analytical results for moving average processes of higher order are difficult to obtain, our conjecture is that the use of a single higher order cumulant remains necessary but not sufficient for identification. Two more third order cumulants must be used in conjunction with $E(y_t y_{t-1})$ and $E(y_t^2)$ to overidentify $\psi$. Provided that $\tau_4 \neq 3$, fourth order cumulants such as $E(y_t^4) = [(1 + \theta^4)\tau_4 + 6\theta^2]\sigma^4$ and $E(y_t^2 y_{t-1}^2) = (1 + \theta^2 + \theta^2\tau_4 + \theta^4)\sigma^4$ can also be incorporated in conjunction with other cumulants of order three or higher.
3.3 Finite-Sample Properties of the CMD Estimator
To illustrate the finite-sample properties of the CMD estimators, data with $T = 1000$ observations are generated from an MA(1) model $y_t = e_t + \theta e_{t-1}$ and $e_t = \sigma \varepsilon_t$, where $\varepsilon_t$ is $\mathrm{iid}(0, 1)$ and follows
a generalized lambda distribution (GLD), which will be further discussed in Section 4.1. For now, it suffices to note that GLD distributions can be characterized by a skewness parameter $\tau_3$ and a kurtosis parameter $\tau_4$. The true values of the parameters are $\theta = 0.5, 0.7, 1, 1.5$ and $2$, $\sigma = 1$, $\tau_3 = 0, 0.35, 0.6$ and $0.85$, and $\tau_4 = 3$. Lack of identification of $\theta$ arises when $\tau_3 = 0$, and weak to intermediate identification occurs when $\tau_3 = 0.35$, $0.6$ and $0.85$.

Table 1 presents the average estimates and the standard deviations of three CMD estimators of $\theta$ and $\tau_3$ over 5000 Monte Carlo replications. While $\tau_3$ is typically not a parameter of direct interest, information for this parameter would indicate how useful the third cumulants are in identifying and estimating the parameters of the model. The first estimator is the CMD estimator using the sample analog of (6) as auxiliary parameters. As argued above, the use of higher order cumulants does not necessarily guarantee identification. For this reason, we also consider a just-identified classical minimum distance estimator that uses only the sample analog of

$$\lambda_U = \left(E(y_t y_{t-1}),\ E(y_t^2),\ E(y_t^2 y_{t-1})\right)' \subset \lambda_{CMD}$$

as auxiliary parameters. For the sake of comparison, we consider an infeasible minimum distance estimator which is based on $\widehat{\lambda}_U$ but assumes that $\sigma^2$ is known and estimates only $(\theta, \tau_3)'$. As discussed earlier, fixing $\sigma^2$ solves the identification problem. Without imposing invertibility, $|\theta_0| = 1$ is not on the boundary of the parameter space for $\theta$. The infeasible estimator is asymptotically normally distributed uniformly over the whole parameter space for $\theta$ and over all error distributions. The problem is that $\sigma^2$ is, in general, unknown. We demonstrate, however, that our proposed CMD estimator has properties similar to this infeasible estimator.
The results in Table 1 suggest that regardless of the degree of non-Gaussianity, the infeasible estimator produces estimates of $\theta$ that are very precise and essentially unbiased. Hence, fixing $\sigma^2$ solves both identification problems without the need for non-Gaussianity, although prior knowledge of $\sigma^2$ is rarely available in practice. As seen in Table 1, the feasible (just-identified) version of this estimator, based on $\widehat{\lambda}_U$, does not achieve identification of the structural parameters for any value of $\tau_3$. This estimator is also characterized by a large pile-up probability at unity due to a violation of condition (C.v). In contrast, over-identifying the model with the auxiliary moments $E(y_t^3)$ and $E(y_t y_{t-1}^2)$ gives rise to the CMD estimator, which achieves identification as the degree of skewness increases. For the CMD estimator, the skewness parameter appears to be very well identified and estimated over all specifications of the error distribution. In fact, the CMD estimates of $\tau_3$ appear to be much more precise than those of the infeasible estimator, which can be attributed to the usefulness of the additional third order cumulants used in the CMD estimator. However, the estimation of $\theta$ depends on the strength of identification. While for $\tau_3 = 0.35$ the identification is weak and the estimates of $\theta$ are somewhat biased, for higher values of the skewness parameter the CMD
estimates of $\theta$ are practically unbiased. When $\tau_3 = 0.85$, the CMD estimator identifies $\theta$ correctly (with probability one) whether the true value of $\theta$ is in the invertible or the non-invertible region.

Figures 1, 2 and 3 plot the density functions of the standardized CMD estimator of $\theta$, $\sigma^2$ and $\tau_3$, respectively, for the MA(1) model considered in this section with $\theta = 1.5$ and $T = 1000$. While the lack of identification for zero or low values of the skewness parameter induces non-normality (bimodality and fat tails) in the distribution of the estimator, the densities of the standardized CMD estimator of $\theta$, $\sigma^2$ and $\tau_3$ appear to be very close to the standard normal density for $\tau_3 = 0.85$.
4 Semi-Parametric Simulated Minimum Distance Estimation
While the CMD estimator for the MA(1) model has appealing asymptotic and finite-sample properties, analytical expressions for the mapping from general ARMA(p, q) models to the cumulants are not tractable. For this reason, we develop a simulation-based estimator for ARMA(p, q) models. The simulation estimator is similar in spirit to the CMD but can accommodate autoregressive dynamics, kurtosis and other features of the errors. The difference with CMD is that it uses simulations to approximate and invert $\lambda(\psi)$.

More precisely, let $y^s(\psi) = (y_1^s, \ldots, y_T^s)'$ be data simulated for a candidate value of $\psi$. This usually requires drawing errors from a known distribution, and the parameters of this distribution are ancillary for $\psi$. Let

$$\widehat{\lambda}_T = \arg\min_{\lambda} Q_T(\lambda; y)$$

and

$$\widetilde{\lambda}_T^s(\psi) = \arg\min_{\lambda} Q_T(\lambda; y^s(\psi))$$

be the auxiliary parameters estimated from actual and simulated data, respectively, where $Q_T(\lambda)$ denotes the objective function of the auxiliary model. Define $\widetilde{\lambda}_{T,S}(\psi)$ to be the average of the estimates $\widetilde{\lambda}_T^s(\psi)$ over $S$ draws each using simulated data of length $T$, i.e.,9

$$\widetilde{\lambda}_{T,S}(\psi) = \frac{1}{S} \sum_{s=1}^{S} \widetilde{\lambda}_T^s(\psi).$$

A simulation-based minimum distance (SMD) estimator can now be defined as

$$\widehat{\psi}_{T,S} \equiv \arg\min_{\psi} J_{SMD}(\widehat{\lambda}_T, \widetilde{\lambda}_{T,S}(\psi); \widehat{\Omega}) = \arg\min_{\psi} (\widehat{\lambda}_T - \widetilde{\lambda}_{T,S}(\psi))' \widehat{\Omega}^{-1} (\widehat{\lambda}_T - \widetilde{\lambda}_{T,S}(\psi)), \qquad (8)$$
9 Alternatively, one could use one draw of simulated data of length $T \cdot S$, $y^S(\psi) = (y_1^S, \ldots, y_{TS}^S)'$, and define $\widetilde{\lambda}_{T,S}(\psi)$ as $\widetilde{\lambda}_{T,S}(\psi) = \arg\min_{\lambda} Q_T(\lambda; y^S(\psi))$.
where $\widehat{\Omega}_T$ is a consistent estimate of the asymptotic variance of $\widehat{\lambda}_T$. The efficient method of moments (EMM) estimator of Gallant and Tauchen (1996) and the indirect inference estimator (IIE) of Gourieroux et al. (1993) consider a pseudo-maximum likelihood estimator of $\lambda$, in which case $Q_T(\lambda)$ is the log-likelihood. Identification requires that the mapping $\lambda(\psi)$ be injective in the sense of Definition 1. In other words, the auxiliary model must contain features of the data generated under $\psi$. Simulations merely provide an approximation to $\lambda^{-1}(\widehat{\lambda}_T)$.10 Thus, the $\lambda(\psi)$ in simulation-based minimum distance estimation is sample size dependent (Phillips (2012)). Gourieroux et al. (1993) refer to the estimator as the method of indirect inference and $\lambda(\psi)$ as the binding function.

Simulation estimation of the MA(1) model was considered in Gourieroux et al. (1993), Michaelides and Ng (2000), Ghysels et al. (2003), Czellar and Zivot (2008), among others, but only for the invertible case. All of these studies use an autoregression as the auxiliary model. For $\theta = 0.5$ and assuming that $\sigma^2$ is known, Gourieroux et al. (1993) find that the IIE compares favorably to the exact MLE in terms of bias and root-mean squared error. Michaelides and Ng (2000) and Ghysels et al. (2003) also evaluate the properties of simulation-based estimators with $\sigma^2$ assumed known. Czellar and Zivot (2008) report that the IIE is relatively less biased but exhibits some instability, and the tests based on it suffer from size distortions when $\theta_0$ is close to unity. The favorable properties of the IIE when $\theta$ is in the invertible range can be traced to the fact that simulation estimation has a bias-correction property that is absent from classical minimum distance estimation. Intuitively, if the auxiliary parameter estimates $\widehat{\lambda}_T$ obtained from the data are downward biased, so will be the estimates $\widetilde{\lambda}_T^s$ estimated from the data simulated for a given $\psi$. Then, $\psi$ can be calibrated to bias-correct the CMD, akin to the bootstrap.
This bias-correction property provided by simulation estimation has an additional but unexploited role when non-invertible models are allowed. As shown in the previous section, identification without imposing invertibility relies on information in higher order moments, which tend to exhibit finite-sample biases. The next section considers a simulation-based estimator that achieves identification without imposing invertibility and enables classical inference even at $\theta_0 = 1$.
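To fix ideas, here is a deliberately stripped-down sketch of the recipe in (8) for a zero-mean MA(1). An identity weighting matrix, a three-statistic auxiliary vector, $\sigma$ fixed at one, a crude grid search over $\theta$, and skewed errors built from a standardized chi-square draw are all illustrative choices, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_ma1(theta, T):
    eps = (rng.chisquare(4, T + 1) - 4.0) / np.sqrt(8.0)   # skewed iid(0,1) errors
    return eps[1:] + theta * eps[:-1]                      # y_t = e_t + theta * e_{t-1}

def aux_stats(y):
    """Auxiliary statistics: first autocovariance, variance, E(y_t^2 y_{t-1})."""
    return np.array([np.mean(y[1:] * y[:-1]), np.mean(y**2), np.mean(y[1:]**2 * y[:-1])])

T, S, theta0 = 2000, 10, 1.5
lam_hat = aux_stats(simulate_ma1(theta0, T))               # "observed" data statistics

def J(theta):                                              # SMD objective, identity weight
    lam_sim = np.mean([aux_stats(simulate_ma1(theta, T)) for _ in range(S)], axis=0)
    d = lam_hat - lam_sim
    return d @ d

grid = np.arange(0.1, 3.01, 0.05)                          # includes non-invertible values
theta_hat = grid[np.argmin([J(th) for th in grid])]
assert abs(theta_hat - theta0) < 0.2                       # lands near the true root
```

Because the simulated and observed auxiliary statistics share the same finite-sample biases, matching them embodies the bias-correction property discussed above.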
4.1 SMD Estimator Based on GLD Errors
As the key to identification is errors with non-Gaussian properties, we need to be able to simulate non-Gaussian errors in a flexible fashion so that $y_t$ has the desired distributional properties. There is evidently a large class of distributions with third and fourth moments consistent with a non-Gaussian process that one can specify. As assuming a particular parametric error distribution could compromise the robustness of the estimates, we simulate errors from the generalized lambda

10 In the terminology of Gallant and Tauchen (1996), the true densities of the data must be smoothly embedded within the scores of the auxiliary model.
distribution $P_\lambda(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$ considered in Ramberg and Schmeiser (1975). This distribution has two appealing features. First, it can accommodate a wide range of values for the skewness and excess kurtosis parameters, and it includes as special cases the normal, log-normal, exponential, t, beta, gamma and Weibull distributions. The second advantage is that it is easy to simulate from. The percentile function is given by

$$P^{-1}(U) = \lambda_1 + \left[U^{\lambda_3} - (1 - U)^{\lambda_4}\right] / \lambda_2, \qquad (9)$$

where $U$ is a uniform random variable on $[0, 1]$, $\lambda_1$ is a location parameter, $\lambda_2$ is a scale parameter, and $\lambda_3$ and $\lambda_4$ are shape parameters. To simulate $\varepsilon_t$, a $U$ is drawn from the uniform distribution and (9) is evaluated for given values of $(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$. As shown in Ramberg and Schmeiser (1975), the shape parameters $(\lambda_3, \lambda_4)$ are explicitly related to the coefficients of skewness and kurtosis ($\tau_3$ and $\tau_4$) of $\varepsilon_t$. Furthermore, the shape parameters $(\lambda_3, \lambda_4)$ and the location/scale parameters $(\lambda_1, \lambda_2)$ can be sequentially evaluated. Since $\varepsilon_t$ has mean zero and variance one, the parameters $(\lambda_1, \lambda_2)$ are determined by $(\lambda_3, \lambda_4)$ so that $\varepsilon_t$ is effectively characterized by $\lambda_3$ and $\lambda_4$.
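A minimal inverse-transform sampler based on (9); the parameter values below are the well-known Ramberg–Schmeiser calibration that approximates a standard normal, used here purely for illustration:

```python
import random

rng = random.Random(0)

def gld_draw(lam1, lam2, lam3, lam4):
    """One draw via the percentile function: lam1 + (U**lam3 - (1-U)**lam4) / lam2."""
    u = rng.random()
    return lam1 + (u**lam3 - (1.0 - u)**lam4) / lam2

lam = (0.0, 0.1975, 0.1349, 0.1349)          # approximates N(0, 1)
draws = [gld_draw(*lam) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((x - mean)**2 for x in draws) / len(draws)
assert abs(mean) < 0.05 and abs(var - 1.0) < 0.1
```

Skewed error distributions are obtained by choosing $\lambda_3 \neq \lambda_4$, which is what makes the higher order cumulants of $y_t$ informative.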
We jointly estimate the structural parameters and 2 with the nuisance parameters of
the non-Gaussian distribution which are necessary for identication of and 2. The structural
parameter vector is expanded to contain parameters of the error process.11 Dene the augmented
parameter vector of interest by
SMD = (; 2; 3; 4)
0:
Let the vector of auxiliary parameters be defined from the following regression models:

y_t = a1 y_{t-1} + ... + ap y_{t-p} + c1 y^2_{t-1} + v_{1t},    (10a)
y^2_t = c0 + c2 y_{t-1} + c3 y^2_{t-1} + v_{2t}.    (10b)

Model (10a) captures the dynamics of y_t; the slope parameters of model (10b) reflect information
in the higher-order, time-varying cumulants of the process, while the intercept c0 is related to the
second unconditional moment of y_t. To capture information in the skewness and kurtosis of the
errors, we augment the auxiliary parameter vector with the third and fourth moments of the OLS
residuals from regression (10a), i.e., μ3 = E(v^3_{1t}) and μ4 = E(v^4_{1t}). As a result, the auxiliary
parameter vector for the SMD estimator is

ψ_SMD = (a1, ..., ap, c0, c1, c2, c3, μ3, μ4)′.    (11)
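For concreteness, the auxiliary parameter vector in (11) can be computed by two OLS regressions plus the residual moments. The sketch below is our own illustration of that recipe under simple assumptions (p = 4, plain least squares, variable names of our choosing); the actual implementation details are not spelled out in the text.

```python
import numpy as np

def auxiliary_parameters(y, p=4):
    """OLS estimates of (10a)-(10b), stacked as in (11):
    psi = (a_1, ..., a_p, c0, c1, c2, c3, mu3, mu4)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    # (10a): y_t on y_{t-1}, ..., y_{t-p} and y^2_{t-1}
    X1 = np.column_stack([y[p - k:T - k] for k in range(1, p + 1)]
                         + [y[p - 1:T - 1] ** 2])
    b1, *_ = np.linalg.lstsq(X1, y[p:], rcond=None)
    v1 = y[p:] - X1 @ b1                     # residuals of (10a)
    # (10b): y^2_t on a constant, y_{t-1} and y^2_{t-1}
    X2 = np.column_stack([np.ones(T - 1), y[:-1], y[:-1] ** 2])
    b2, *_ = np.linalg.lstsq(X2, y[1:] ** 2, rcond=None)
    mu3, mu4 = np.mean(v1 ** 3), np.mean(v1 ** 4)
    return np.concatenate([b1[:p], [b2[0], b1[p], b2[1], b2[2], mu3, mu4]])

# hypothetical data: an MA(1) with recentered exponential (skewed) errors
rng = np.random.default_rng(1)
e = rng.exponential(1.0, 501) - 1.0
psi = auxiliary_parameters(e[1:] + 0.5 * e[:-1])
```

Matching these statistics between actual and simulated data is what drives the SMD objective.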
11 It would seem tempting to estimate λ3 and λ4 separately from (θ, σ²)′, for example by using the sample skewness and kurtosis of the residuals of a long autoregression. But as discussed in Ramsey and Montenegro (1992), the OLS residuals do not converge in the limit to the true errors when θ(L) is non-invertible, rendering their sample higher moments asymptotically biased as well.
The auxiliary regressions (10a) and (10b) allow us to perform simple tests for identification. By
Lemma 2, two or more cumulants of third order are necessary for identification of θ if |θ0| = 1 is an admissible value in the parameter space. For example, individual t tests of H0: c1 = 0, H0: c2 = 0
and H0: c3 = 0 can shed light on whether the third and fourth order cumulants can identify the
structural parameters of the MA(1) model. If the individual null hypotheses are rejected, a joint test can be performed before using the classical or simulation-based estimation of θ.

The proposed simulated minimum distance estimator θ̂_SMD of θ_SMD is obtained as in (8) for a given consistent estimator ψ̂_SMD of the auxiliary parameter vector ψ_SMD. The estimator is semi-parametric because we use a possibly misspecified error distribution to simulate data from
the structural model. To establish the consistency and asymptotic normality of the SMD estimator θ̂_SMD, we need some additional notation and regularity conditions. Let P denote the class of generalized lambda distributions and let all limits be taken with respect to P as T → ∞.
Proposition 2 Let ψ_SMD be defined as in (11). Suppose that, in addition to the assumptions in
Lemma 1, sup_{θ∈Θ} ||ψ̃_SMD(θ) - ψ_SMD(θ)|| →p 0 and √T(ψ̃_SMD(θ0) - ψ_SMD(θ0)) →d N(0, Σ_SMD), where Σ_SMD = Avar(ψ̃_SMD). Then θ̂_SMD →p θ0 and

√T(θ̂_SMD - θ0) →d N(0, (1 + 1/S)[G_SMD(θ0)′ Σ_SMD^{-1} G_SMD(θ0)]^{-1}) ≡ N(0, Avar(θ̂_SMD)).
Consistency follows from the identifiability of θ, and the moment conditions that exploit information
in higher order cumulants play a crucial role. In our procedure, α3 and α4 are defined in terms of λ3
and λ4, so that the estimates of α3 and α4 are implied by the generalized lambda distribution instead
of the sample estimates of skewness and kurtosis. Even though λ3 and λ4 are not parameters of
direct interest, they are crucial for identification of θ and σ².
A key feature of Proposition 2 is that it holds when θ is less than, greater than, or equal to one. In
a Gaussian likelihood setting, when invertibility is assumed for the purpose of identification, there is
a boundary for the support of |θ| at the unit circle. Thus, likelihood-based estimation has non-standard properties when the true value of θ is on or near the boundary of one. In our setup, this
boundary constraint is lifted because identification is achieved through higher moments instead
of by imposing invertibility. As a consequence, the SMD estimator θ̂_SMD has classical properties
provided that α3 and α4 enable identification.

Consistent estimation of the asymptotic variance of θ̂_SMD can proceed by substituting a consistent estimator of Σ_SMD and evaluating the Jacobian G_SMD(θ̂_{T,S}) numerically. The computed standard errors can then be used for testing hypotheses and constructing confidence intervals.
Alternatively, inference on the MA parameter of interest, θ, can be conducted by constructing
confidence intervals based on test inversion without an explicit computation of the variance matrix
Avar(θ̂_SMD). For a sequence of null hypotheses H0: θ = θ_i for θ_i ∈ Θ, consider a generic distance metric (DM) statistic

DM_SMD = (TS/(S + 1)) [J_SMD(ψ̂_T, ψ̃_{S,T}(θ̃), Σ̂) - J_SMD(ψ̂_T, ψ̃_{S,T}(θ̂), Σ̂)],

where θ̂ is the unrestricted estimate and θ̃ is the restricted estimate under the null. Let α be the significance level of the test and q_{1-α} denote the (1 - α)-th quantile of the chi-square distribution with one degree of freedom. Then, the 100(1 - α)% confidence interval for θ is given by the set of values satisfying DM_SMD ≤ q_{1-α}, i.e., C_{1-α}(θ) = {θ ∈ Θ : DM_SMD ≤ q_{1-α}}. The endpoints of the confidence interval are obtained as

θ_L = inf{θ ∈ Θ : Pr(DM_SMD ≤ q_{1-α} | H0) ≥ 1 - α},
θ_U = sup{θ ∈ Θ : Pr(DM_SMD ≤ q_{1-α} | H0) ≥ 1 - α}.
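In code, the inversion step reduces to a one-dimensional search over a grid of null values. The sketch below is a generic illustration of test inversion, not the paper's implementation: dm_stat is a hypothetical callable standing in for the DM statistic, and the quadratic example is a toy.

```python
def invert_dm(dm_stat, grid, q=2.7055):
    """Collect grid values not rejected at the chi-square(1) critical value q
    (2.7055 is the 90% quantile, i.e. a 10% test / 90% confidence interval)."""
    kept = [th for th in grid if dm_stat(th) <= q]
    return (min(kept), max(kept)) if kept else None

# toy example: a quadratic 'statistic' with its minimum at theta = 0.7
ci = invert_dm(lambda th: 100 * (th - 0.7) ** 2, [i / 100 for i in range(301)])
```

Because the grid can extend past one, the resulting interval can cover invertible and non-invertible values alike.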
This approach is very convenient since it provides information on the invertibility of the process. We
implement the DM test with ψ = ψ_SMD defined in (11) and Σ being the corresponding asymptotic
variance of ψ̂_SMD.

4.2 Monte Carlo Simulations for the SMD Estimator
This section uses simulations to assess the properties of the proposed SMD estimator. Section 4.2.1
evaluates the point estimates of an MA(1) model and Section 4.2.2 studies the estimated impulse
response functions of an ARMA(1, 1) model.
4.2.1 Parameter Estimation in MA(1) Model
We first study the finite-sample behavior of the proposed SMD estimator in invertible and non-
invertible MA(1) models with data generated from

y_t = e_t + θ e_{t-1},  e_t = σ ε_t,

where ε_t ~ iid(0, 1) is drawn from a GLD with zero excess kurtosis and a skewness parameter of 0.85.¹² In all simulation designs, σ = 1 and θ takes the values 0.5, 0.7, 1, 1.5, and 2.¹³ The
sample sizes are T = 1000 and 2000 and the number of Monte Carlo replications is 1000. We also investigate the properties of the SMD estimator for smaller sample sizes (T = 500) and other
asymmetric (chi-squared and exponential) distributions.

12 Results for a larger range of values of the skewness parameter of the GLD are not reported to conserve space but are available from the authors upon request.
13 The results are invariant to the choice of σ.
The proposed SMD estimator is implemented as follows. We use an error simulator based on the
generalized lambda error distribution. For the auxiliary model (10a), we use p = 4 for the lag order
of the AR polynomial. It appears that larger values of S (the number of simulated sample paths
of length T) tend to smooth the objective function, which improves the identification of the MA
parameter. As a result, we set S = 20, although S > 20 seems to offer even further improvement, especially for small T, but at the cost of increased computational time. In addition to the estimate
of θ, the SMD also delivers estimates of σ, λ3 and λ4. From the estimates of λ3 and λ4, we construct
estimates of α3 and α4 as (see Ramberg and Schmeiser (1975))

α3 = (C - 3AB + 2A³)/λ2³,
α4 = (D - 4AC + 6A²B - 3A⁴)/λ2⁴,

where A = 1/(1+λ3) - 1/(1+λ4), B = 1/(1+2λ3) + 1/(1+2λ4) - 2 Beta(1+λ3, 1+λ4), λ2 = √(B - A²), C = 1/(1+3λ3) - 3 Beta(1+2λ3, 1+λ4) + 3 Beta(1+λ3, 1+2λ4) - 1/(1+3λ4), D = 1/(1+4λ3) - 4 Beta(1+3λ3, 1+λ4) + 6 Beta(1+2λ3, 1+2λ4) - 4 Beta(1+λ3, 1+3λ4) + 1/(1+4λ4), and Beta(·, ·) denotes the beta function.

As is true of all non-linear estimation problems, the numerical optimization problem must
take into account the possibility of local minima. Once non-invertibility is allowed, we need to
additionally allow for the possibility of multiple equilibria. Thus, the estimation always considers
two sets of initial values. Specifically, we draw two starting values for θ: one from a uniform
distribution on (0, 1) and one from a uniform distribution on (1, 2), with the starting value for σ
set equal to √(σ̂²_y/(1 + θ²)) for each of the starting values of θ. The starting values for the shape
parameters of the GLD, λ3 and λ4, are set equal to those of the standard normal distribution (with
α3 = 0 and α4 = 3). In this respect, the starting values of θ, σ, λ3 and λ4 contain little prior
knowledge of the true parameters.
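The Ramberg-Schmeiser mapping from (λ3, λ4) to (α3, α4) given above is straightforward to code. The sketch below is our own illustration; it assumes the unit-variance normalization λ2 = √(B - A²) and builds the beta function from log-gammas.

```python
from math import exp, lgamma, sqrt

def beta_fn(a, b):
    """Beta(a, b) computed via log-gammas for numerical stability."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))

def gld_skew_kurt(lam3, lam4):
    """Skewness alpha3 and kurtosis alpha4 implied by the GLD shape
    parameters, under the unit-variance normalization lam2 = sqrt(B - A^2)."""
    A = 1.0 / (1 + lam3) - 1.0 / (1 + lam4)
    B = (1.0 / (1 + 2 * lam3) + 1.0 / (1 + 2 * lam4)
         - 2 * beta_fn(1 + lam3, 1 + lam4))
    C = (1.0 / (1 + 3 * lam3) - 3 * beta_fn(1 + 2 * lam3, 1 + lam4)
         + 3 * beta_fn(1 + lam3, 1 + 2 * lam4) - 1.0 / (1 + 3 * lam4))
    D = (1.0 / (1 + 4 * lam3) - 4 * beta_fn(1 + 3 * lam3, 1 + lam4)
         + 6 * beta_fn(1 + 2 * lam3, 1 + 2 * lam4)
         - 4 * beta_fn(1 + lam3, 1 + 3 * lam4) + 1.0 / (1 + 4 * lam4))
    lam2 = sqrt(B - A * A)
    alpha3 = (C - 3 * A * B + 2 * A ** 3) / lam2 ** 3
    alpha4 = (D - 4 * A * C + 6 * A * A * B - 3 * A ** 4) / lam2 ** 4
    return alpha3, alpha4

# symmetric shape parameters (a normal-like case): alpha3 = 0, alpha4 near 3
a3, a4 = gld_skew_kurt(0.1349, 0.1349)
```

The symmetric case returns zero skewness exactly; asymmetric (λ3, λ4) pairs deliver the skewed errors used in the simulations.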
Figure 4 illustrates how identifiability depends on skewness by plotting the log of the objective
function for the SMD estimator, averaged over 1000 Monte Carlo replications of the MA(1) model,
for different values of θ and σ. The true values of θ and σ are 0.7 and 1, respectively, and the errors
are generated from a GLD with zero excess kurtosis and four values of the skewness parameter: 0,
0.35, 0.6 and 0.85.¹⁴ The first case (skewness = 0) corresponds to lack of identification and there
are two pronounced local minima at θ and 1/θ. As the skewness of the error distribution increases, the second local optimum at 1/θ flattens out and almost completely disappears when the error
distribution is highly asymmetric.
14 In evaluating the objective function, the values of the lambda parameters in the generalized lambda distribution are set equal to their true values.
Table 2 reports the mean and median estimates of θ, the average asymptotic standard error
of the SMD estimator of θ, and the standard deviation of the estimates for which identification is
achieved. In addition, Table 2 presents the empirical probability that the SMD estimate of θ is
greater than one, which provides information on how often the identification of the true parameter
fails. The last column of Table 2 reports the rejection rate of the DM test of H0: θ = θ0 at the 10% significance level. The main findings can be summarized as follows. The SMD estimator of θ appears
to be median unbiased for all values of θ, even for small T. While there is a positive probability
that the SMD estimator will converge to 1/θ instead of θ (especially when θ is in the non-invertible
region), this probability is fairly small and it disappears completely for T = 2000. Interestingly,
in terms of precision, the SMD estimator appears to be more efficient even than the infeasible
estimator in Table 1 for values of θ in the invertible region (see also Gorodnichenko et al. (2012)
for a similar result in the context of autoregressive models). The asymptotic variance expression
in Proposition 2 tends to provide a very good approximation of the finite-sample variation of the
SMD estimates. Finally, the rejection rates of the hypothesis tests based on the SMD estimator
are very close to the nominal level, which suggests that asymptotic normality provides a good
approximation of the distribution of the SMD estimator over the whole parameter space.
Several remarks regarding the efficiency properties of the SMD estimator are in order. First, the
SMD estimator tends to exhibit substantially smaller variability than the CMD estimator in Table
1 (case α3 = 0.85). These efficiency gains are expected since the instrumental model based on the
AR approximation encompasses the dependence structure of the MA(1) model as the lag order p
increases to infinity. What is somewhat surprising is the magnitude of the efficiency gains. Second,
it is instructive to compare the sampling variability of the SMD estimator to that of the ML estimator, which provides the efficiency bound for any estimator in the invertibility region. Recall that the
variance of the Gaussian ML estimator is (1 - θ²)/T which, due to the invertibility restriction, shrinks to zero as the MA parameter approaches one. In contrast, our proposed SMD estimator
does not impose invertibility and its variance does not exhibit this type of behavior. For this
reason, a fair comparison between the SMD and ML estimators would involve values of θ that are
far away from the invertibility boundary, such as θ = 0.5. The sample dispersion measures for the
SMD estimator of θ0 = 0.5 in Table 2 are very close to, and even lower than, the asymptotic
standard error of the MLE, which is 0.0274 and 0.0194 for T = 1000 and T = 2000, respectively. We
should note that similar results are reported by Gourieroux et al. (1993) for the simulation-based
(indirect inference) estimator of the invertible MA(1) model.
To gain some understanding of the source of the excellent properties of the SMD estimator
of θ, Table 3 reports the mean and median SMD estimates of the nuisance parameters σ, α3 and
α4 along with their Monte Carlo standard deviations. The estimate of σ is practically unbiased
and very precise. Importantly, the skewness parameter, albeit slightly downward biased, is very
precisely estimated (its standard deviation is smaller than the standard deviation of the CMD
estimator in Table 1). This points to the possibility that the excellent identification and estimation
properties of the SMD estimator of θ are likely due to the built-in bias correction and improved efficiency in the estimation of the higher order moments of the error process.
Finally, Table 4 presents results for the SMD estimator of θ and σ for a smaller sample size
(T = 500) and two other asymmetric error distributions: a chi-squared distribution with 6 degrees of
freedom (with skewness and excess kurtosis parameters of 1.15 and 2, respectively) and an exponential
distribution with a scale parameter of one (with skewness and excess kurtosis parameters of 2 and
6, respectively). The errors are recentered and rescaled to have a mean of zero and variance one.
Note that the simulator for the SMD estimator is still based on the GLD family and, hence,
is misspecified. The results in Table 4 are in line with the previous results for larger sample
sizes and GLD errors. The SMD estimates of θ and σ appear to be almost unbiased and exhibit
small variability. With the smaller sample size, the probability that the SMD estimate of θ is
not identified increases up to 3.8% in some cases but, overall, the finite-sample properties of our
proposed estimator remain quite attractive.
4.2.2 Impulse Response Function Estimation of an All-Pass ARMA(1, 1) Model

One of the main advantages of SMD is its flexibility to accommodate more general models and
dependence structures. To illustrate this, we consider the all-pass ARMA(1, 1) model

y_t - φ y_{t-1} = e_t - (1/φ) e_{t-1} for |φ| < 1,    (12)

where e_t is a standard exponential random variable with a scale parameter equal to one, which is
recentered and rescaled to have mean zero and variance 1. As discussed in Davis (2010), this process
possesses some interesting properties. First, the process in (12) is uncorrelated but it exhibits
higher order dependence (conditional heteroskedasticity). Furthermore, while the process y_t is
causal, it has a non-invertible MA component. If one imposes invertibility on the MA component
(or replaces the MA parameter 1/φ by φ and the unit variance of the error term by (1/φ)²),
the process has cancelling roots in the AR and MA polynomials and reduces to an iid random
sequence. Therefore, using estimators that impose invertibility would result in a flat impulse
response function, while the true impulse response function for horizon j ≥ 1 is given by

∂y_t/∂e_{t-j} = φ^{j-1}(φ - 1/φ).
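These properties are easy to check numerically. The sketch below is our own illustration: it simulates (12) with recentered exponential errors and verifies that the level of the series is (nearly) uncorrelated while its square is not; the sample size, seed and burn-in are arbitrary choices.

```python
import numpy as np

def simulate_allpass(T, phi, seed=None, burn=200):
    """Simulate the all-pass ARMA(1,1) in (12):
    y_t - phi*y_{t-1} = e_t - (1/phi)*e_{t-1},
    with recentered standard exponential errors (mean 0, variance 1)."""
    rng = np.random.default_rng(seed)
    e = rng.exponential(1.0, T + burn) - 1.0
    y = np.zeros(T + burn)
    for t in range(1, T + burn):
        y[t] = phi * y[t - 1] + e[t] - e[t - 1] / phi
    return y[burn:]

def true_irf(phi, horizons):
    """Impulse responses dy_t/de_{t-j} = phi^(j-1)*(phi - 1/phi), j >= 1."""
    return [phi ** (j - 1) * (phi - 1.0 / phi) for j in horizons]

def lag1_corr(x):
    """Sample first-order autocorrelation."""
    x = x - x.mean()
    return float((x[1:] * x[:-1]).mean() / (x * x).mean())

y = simulate_allpass(20000, 0.5, seed=3)
```

With φ = 0.5 the level of y is serially uncorrelated but y² has clearly positive first-order autocorrelation, which is the conditional heteroskedasticity described in the text.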
We investigate the SMD and Gaussian quasi-ML estimates of the impulse response functions
(IRFs) for φ = 0.5 and -0.5 (T = 500). The SMD estimator uses the same auxiliary model as in the previous section. The median IRF estimates obtained from 1,000 Monte Carlo replications are
plotted in Figure 5 and Figure 6, respectively. The SMD-based IRF estimates are median unbiased
and trace closely the shape of the true impulse response. In sharp contrast, the Gaussian quasi-MLE fails to identify the AR and MA parameters and produces a flat IRF around zero.
5 Empirical Application: Commodity Prices
Non-invertibility can be consistent with economic theory. For example, suppose y_t = E_t Σ_{s=0}^∞ β^s x_{t+s}
is the present value of x_t = e_t + ϑ e_{t-1}. The solution y_t = (1 + βϑ)e_t + ϑ e_{t-1} = h(L)e_t implies that
the root of h(z) is -(1 + βϑ)/ϑ, which can be on or inside the unit circle even if |ϑ| < 1. If there is no discounting and β = 1, y_t has a moving average unit root when ϑ = -0.5 and h(L) is non-invertible
in the past whenever ϑ < -0.5.¹⁵
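The root condition is easy to verify directly. The helper below is our own illustration, not from the paper; it returns the root of h(z) = (1 + βϑ) + ϑz and confirms the unit-root and non-invertibility thresholds at β = 1.

```python
def ma_root(beta, vartheta):
    """Root of h(z) = (1 + beta*vartheta) + vartheta*z, i.e. of the MA
    polynomial of y_t = (1 + beta*vartheta)e_t + vartheta*e_{t-1}."""
    return -(1.0 + beta * vartheta) / vartheta

# with no discounting (beta = 1):
unit = ma_root(1.0, -0.5)     # on the unit circle: MA unit root
inside = ma_root(1.0, -0.7)   # |root| < 1: non-invertible
outside = ma_root(1.0, -0.3)  # |root| > 1: invertible
```

So a perfectly well-behaved fundamental (|ϑ| < 1) can generate a non-invertible observable once the present-value filter is applied.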
Present value models are used to analyze variables with a forward looking component, including
commodity prices, whose dynamics have implications for monetary policy and asset allocation. It is a
stylized fact that commodity price changes are almost uncorrelated (or very weakly autocorrelated)
over time and exhibit conditional heteroskedasticity. These two characteristics are also properties
of the all-pass models considered in the previous section, and it is interesting to see if commodity
price changes are driven by a non-invertible MA component. To see that this is also theoretically
plausible, we revisit the present value model of commodity price determination by Pindyck (1993).
Let s_t and f_t denote the spot and futures commodity price for delivery at time t + 1, and cy_t be the
marginal convenience yield (net of insurance and storage costs) over the period. The no-arbitrage
condition implies that

E_t(cy_t) = (1 + i)s_t - f_t,    (13)

where i is the risk-free rate. Let E_t(s_{t+1}) = f_t + rp_t, where rp_t is a time-varying risk premium,
and assume that rp_t = (ρ - i)s_t, where ρ denotes a risk-adjusted discount rate for the commodity. Substituting f_t = E_t(s_{t+1}) - (ρ - i)s_t into (13) yields

E_t(s_{t+1}) = (1 + ρ)s_t - E_t(cy_t).    (14)

The stationary (no-bubble) solution to the expectational difference equation (14) is given by

s_t = (1 + ρ)^{-1} Σ_{i=0}^∞ (1 + ρ)^{-i} E_t(cy_{t+i}).
15 If the moving average polynomial ϑ(L) is of infinite order, as would be the case for causal autoregressive
processes, it is still possible for the roots of h(L) = (Lϑ(L) - βϑ(β))/(L - β) to be inside the unit disk.
The presence of an invertible MA component in the convenience yield would induce a (possibly)
non-invertible MA component in the dynamics of the observable commodity prices. Given the
possible nonstationarity in commodity prices, we estimate an ARMA(1, 1) model of commodity
(log) price changes

Δs_t = φ Δs_{t-1} + e_t + θ e_{t-1}

using the Gaussian MLE and the proposed SMD estimator.

The data for the empirical analysis consist of commodity prices of the nearest futures contract
from the Commodity Research Bureau and cover the period March 1983 - July 2008. The ARMA(1,
1) model is estimated at monthly frequency by taking the last daily price in the month as the
corresponding monthly observation. We use 22 commodity prices from 6 commodity groups: energy
(crude oil, heating oil), grains and oilseeds (soybean oil, corn, oats, soybeans, wheat, canola), metals
(platinum, copper, gold, silver, palladium), industrials (cotton, lumber), livestock and meats (feeder
cattle, live cattle, pork bellies, lean hogs) and foodstuffs (cocoa, sugar, coffee).

Table 5 presents the estimation results. Practically all of the commodity price changes exhibit
some form of non-Gaussianity, which is necessary for identifying possible non-invertible MA components. The Gaussian ML tends to produce estimates of φ and θ of similar magnitude and opposite
sign, suggesting the presence of cancelling roots and a lack of identifiability. However, this lack of identification could be an artifact of imposing invertibility on the MA root, as argued in the previous
section. Indeed, when this restriction is relaxed within the SMD procedure, most of the commodity
price changes (except for gold and live cattle) appear to be driven by a non-invertible MA component. Another interesting observation is that the estimated AR and MA parameters are of similar
magnitude and sign across the different commodities, which seems to suggest that the parameters
are well identified within the SMD procedure. This is not the case for the Gaussian MLE, where the
parameter estimates span a wide range of values, which possibly arises from the non-identifiability
of the parameters. Overall, there is strong evidence in support of non-invertibility in commodity price changes, which has potentially important implications for impulse response analysis and
forecasting.¹⁶
6 Conclusions
This paper proposes classical and simulation-based minimum distance estimation of possibly non-
invertible MA models with non-Gaussian errors. The classical minimum distance estimator is
16 Non-invertibility is expected to arise in other variables, such as stock prices, that are believed to be determined by the present value model. In unreported results for the period February 1952 - August 2012, fitting an ARMA(1, 1) to monthly returns on the S&P 500 index (with sample skewness of -0.68 and sample kurtosis of 5.44) produced SMD (ML) estimates of 0.684 (-0.726) and -1.394 (0.791) for the AR and MA parameters, respectively.
developed and analyzed for the MA(1) model with asymmetric errors. The identification of the
structural parameters is achieved by exploiting the non-Gaussianity of the process through third
order cumulants. This type of identification also removes the boundary problem at the unit circle,
which gives rise to the pile-up probability and non-standard asymptotics of the Gaussian maximum
likelihood estimator. As a consequence, the proposed classical minimum distance estimator is root-T consistent and asymptotically normal over the whole parameter range, provided that the
non-Gaussianity in the data is sufficiently large to ensure identification.

To accommodate more general models with analytically intractable binding functions, we develop a simulation estimator based on auxiliary regressions that incorporate information from the
higher order cumulants of the data. The efficiency of the estimator is controlled by the ability of the
auxiliary model to approximate the true data generating process. Our proposed simulated minimum distance estimator is semi-parametric in the sense that it uses a possibly misspecified error
simulator with a flexible functional form that approximates a large class of distributions with non-
Gaussian features. Particular attention is paid to the accurate estimation of the shape parameters
of the error distribution, which play a critical role in identifying the structural parameters.
References
Anderson, T. and Takemura, A. 1986, Why Do Noninvertible Estimated Moving Averages Occur?, Journal of Time Series Analysis 7(4), 235-254.

Andrews, B., Davis, R. and Breidt, F. 2006, Maximum Likelihood Estimation of All-Pass Time Series Models, Journal of Multivariate Analysis 97, 1638-1659.

Andrews, B., Davis, R. and Breidt, F. 2007, Rank-Based Estimation of All-Pass Time Series Models, Annals of Statistics 35, 844-869.

Brockwell, P. and Davis, R. 1991, Time Series: Theory and Methods, 2nd edn, Springer-Verlag, New York.

Czellar, V. and Zivot, E. 2008, Improved Small Sample Inference for Efficient Method of Moments and Indirect Inference Estimators, University of Washington.

Davis, R. 2010, All-Pass Processes with Applications to Finance, Plenary Talk at the 7th International Iranian Workshop on Stochastic Processes.

Davis, R. and Dunsmuir, W. 1996, Maximum Likelihood Estimation for MA(1) Processes with a Root on the Unit Circle, Econometric Theory 12, 1-20.

Davis, R. and Song, L. 2011, Unit Roots in Moving Averages Beyond First Order, Annals of Statistics 39(6), 3062-3091.

Dovonon, P. and Renault, E. 2011, Testing for Common GARCH Factors, MPRA Paper 40244.

Duffie, D. and Singleton, K. 1993, Simulated Moments Estimation of Markov Models of Asset Prices, Econometrica 61, 929-952.

Fernández-Villaverde, J., Rubio-Ramírez, J., Sargent, T. and Watson, M. 2007, ABCs (and Ds) of Understanding VARs, American Economic Review 97(3), 1021-1026.

Gallant, R. and Tauchen, G. 1996, Which Moments to Match?, Econometric Theory 12, 657-681.

Ghysels, E., Khalaf, L. and Vodounou, C. 2003, Simulation Based Inference in Moving Average Models, Annales d'Économie et de Statistique 69, 85-99.

Giannakis, G. and Swami, A. 1990, On Estimating Noncausal Nonminimum Phase ARMA Models of Non-Gaussian Processes, IEEE Transactions on Acoustics, Speech, and Signal Processing 38, 478-495.

Gorodnichenko, Y., Mikusheva, A. and Ng, S. 2012, Estimators for Persistent and Possibly Non-Stationary Data with Classical Properties, Econometric Theory 28, 1003-1036.

Gospodinov, N. 2002, Bootstrap Based Inference in Models with a Nearly Noninvertible Moving Average Component, Journal of Business and Economic Statistics 20, 254-268.

Gourieroux, C., Monfort, A. and Renault, E. 1993, Indirect Inference, Journal of Applied Econometrics 8, S85-S118.

Hall, A. 2005, Generalized Method of Moments, Advanced Texts in Econometrics, Oxford University Press, Oxford.
Hansen, L. and Sargent, T. 1991, Two Difficulties in Interpreting Vector Autoregressions, in L. P. Hansen and T. J. Sargent (eds), Rational Expectations Econometrics, Westview, London, pp. 77-119.

Harris, D. 1999, GMM Estimation of Time Series Models, in L. Mátyás (ed.), Generalized Method of Moments Estimation, Themes in Modern Econometrics, Cambridge University Press, Cambridge, U.K., pp. 149-169.

Huang, J. and Pawitan, Y. 2000, Quasi-Likelihood Estimation of Non-Invertible Moving Average Processes, Scandinavian Journal of Statistics 27, 689-702.

Komunjer, I. 2012, Global Identification in Nonlinear Models with Moment Restrictions, Econometric Theory, forthcoming.

Komunjer, I. and Ng, S. 2011, Dynamic Identification of Dynamic Stochastic General Equilibrium Models, Econometrica 79(6), 1995-2032.

Lii, K. and Rosenblatt, M. 1982, Deconvolution and Estimation of Transfer Function Phase Coefficients for Non-Gaussian Linear Processes, Annals of Statistics 10, 1195-1208.

Lii, K. and Rosenblatt, M. 1992, An Approximate Maximum Likelihood Estimation of Non-Gaussian Non-Minimum Phase Moving Average Processes, Journal of Multivariate Analysis 43, 272-299.

Lippi, M. and Reichlin, L. 1993, The Dynamic Effects of Aggregate Demand and Supply Disturbances: Comment, American Economic Review 83, 644-652.

Meitz, M. and Saikkonen, P. 2011, Maximum Likelihood Estimation of a Non-Invertible ARMA Model with Autoregressive Conditional Heteroskedasticity, mimeo, University of Helsinki.

Mendel, J. 1991, Tutorial on Higher Order Statistics in Signal Processing and System Theory: Theoretical Results and Some Applications, Proceedings of the IEEE 79(3), 278-305.

Michaelides, A. and Ng, S. 2000, Estimating the Rational Expectations Model of Speculative Storage: A Monte Carlo Comparison of Three Simulation Estimators, Journal of Econometrics 96(2), 231-266.

Newey, W. and McFadden, D. 1994, Large Sample Estimation and Hypothesis Testing, Handbook of Econometrics, Vol. 4, Chapter 36, North Holland.

Phillips, P. 2012, Folklore Theorems, Implicit Maps, and Indirect Inference, Econometrica 80(1), 425-454.

Pindyck, R. 1993, The Present Value Model of Rational Commodity Pricing, Economic Journal 103, 511-530.

Ramberg, J. and Schmeiser, B. 1975, An Approximate Method for Generating Asymmetric Random Variables, Communications of the ACM 17(2), 78-82.

Ramsey, J. and Montenegro, A. 1992, Identification and Estimation of Non-invertible Non-Gaussian MA(q) Processes, Journal of Econometrics 54, 301-320.

Rothenberg, T. 1971, Identification in Parametric Models, Econometrica 39(3), 577-591.

Ruud, P. 2000, An Introduction to Classical Econometric Theory, Oxford University Press, New York.
Sargan, D. and Bhargava, A. 1983, Maximum Likelihood Estimation of Regression Models with First Order Moving Average Errors When the Root Lies on the Unit Circle, Econometrica 51, 799-820.

Sargan, J. D. 1983, Identification and Lack of Identification, Econometrica 51(6), 1605-1633.

Tugnait, J. 1986, Identification of Non-Minimum Phase Linear Stochastic Systems, Automatica 22, 457-464.
Table 1: CMD estimates from MA(1) model with possibly asymmetric errors.

             CMD estimator                 just-identified estimator     infeasible estimator
  θ0       θ̂            α̂3              θ̂            α̂3              θ̂            α̂3
        mean   std.   mean   std.     mean   std.   mean   std.     mean   std.   mean   std.

 α3 = 0
  0.5   1.459  0.754   0.001  0.126   1.623  0.675  -0.020  0.360   0.501  0.048  -0.019  0.516
  0.7   1.139  0.394  -0.003  0.168   1.271  0.342  -0.012  0.287   0.697  0.056  -0.014  0.374
  1.0   1.035  0.223  -0.006  0.189   1.082  0.200  -0.011  0.294   0.995  0.052  -0.010  0.296
  1.5   1.146  0.443  -0.006  0.159   1.342  0.365  -0.015  0.293   1.496  0.047  -0.009  0.260
  2.0   1.417  0.764  -0.002  0.126   1.753  0.594  -0.025  0.335   1.996  0.051  -0.008  0.257

 α3 = 0.35
  0.5   0.648  0.446   0.327  0.141   1.643  0.666   0.203  0.351   0.501  0.048   0.323  0.500
  0.7   0.801  0.266   0.342  0.178   1.290  0.329   0.249  0.282   0.696  0.056   0.329  0.363
  1.0   1.026  0.204   0.331  0.178   1.097  0.197   0.330  0.287   0.994  0.052   0.329  0.288
  1.5   1.432  0.305   0.343  0.167   1.383  0.334   0.363  0.284   1.496  0.047   0.331  0.253
  2.0   1.925  0.429   0.336  0.130   1.793  0.557   0.393  0.345   1.996  0.052   0.331  0.250

 α3 = 0.6
  0.5   0.506  0.108   0.593  0.102   1.614  0.677   0.355  0.357   0.501  0.048   0.561  0.471
  0.7   0.702  0.105   0.602  0.144   1.305  0.320   0.431  0.281   0.697  0.056   0.569  0.342
  1.0   1.010  0.153   0.572  0.174   1.104  0.194   0.570  0.280   0.994  0.052   0.568  0.273
  1.5   1.514  0.201   0.601  0.139   1.369  0.345   0.635  0.298   1.496  0.047   0.571  0.238
  2.0   2.019  0.234   0.594  0.099   1.741  0.604   0.702  0.408   1.996  0.052   0.572  0.236

 α3 = 0.85
  0.5   0.499  0.055   0.825  0.078   1.619  0.674   0.493  0.360   0.500  0.048   0.790  0.433
  0.7   0.692  0.083   0.826  0.133   1.315  0.313   0.603  0.281   0.696  0.056   0.796  0.314
  1.0   1.001  0.095   0.804  0.161   1.105  0.194   0.795  0.273   0.994  0.052   0.793  0.251
  1.5   1.527  0.183   0.828  0.127   1.337  0.367   0.901  0.320   1.495  0.047   0.797  0.219
  2.0   2.020  0.216   0.824  0.083   1.648  0.667   1.026  0.509   1.995  0.052   0.798  0.217

Notes: The table reports the mean and the standard deviation (std.) of the CMD estimates of θ and α3 from the MA(1) model y_t = e_t + θe_{t-1}, e_t = σε_t and ε_t ~ iid(0, 1) generated from a generalized lambda distribution with a skewness parameter α3 and zero excess kurtosis. The sample size is T = 1000, the number of Monte Carlo replications is 5000 and σ = 1. The CMD estimator is the over-identified classical minimum distance estimator of (θ, σ, α3)′ with a vector of auxiliary parameters (E(y_t y_{t-1}), E(y^2_t), E(y^2_t y_{t-1}), E(y^3_t), E(y_t y^2_{t-1}))′; the just-identified estimator is the classical minimum distance estimator of (θ, σ, α3)′ with auxiliary parameters (E(y_t y_{t-1}), E(y^2_t), E(y^2_t y_{t-1}))′; and the infeasible estimator is the classical minimum distance estimator of (θ, α3)′ with σ = 1 assumed known and auxiliary parameters (E(y_t y_{t-1}), E(y^2_t), E(y^2_t y_{t-1}))′.
Table 2: SMD estimates of θ from MA(1) model with asymmetric errors

 true θ0   mean    median   Pr(θ̂_SMD > 1)   s.e.    std.    DM test
 T = 1000
  0.5      0.500   0.500    0.000            0.026   0.027   0.112
  0.7      0.701   0.700    0.000            0.030   0.031   0.112
  1.0      0.969   0.978    0.381            0.074   0.074   0.104
  1.5      1.499   1.499    0.997            0.067   0.070   0.112
  2.0      2.007   2.004    0.997            0.113   0.119   0.114
 T = 2000
  0.5      0.500   0.500    0.000            0.019   0.019   0.111
  0.7      0.701   0.700    0.000            0.022   0.022   0.101
  1.0      0.982   0.990    0.433            0.056   0.058   0.101
  1.5      1.500   1.499    1.000            0.048   0.050   0.117
  2.0      2.004   2.002    1.000            0.080   0.083   0.108

Notes: The table reports summary statistics of the simulated minimum distance (SMD) estimates of θ from the MA(1) model y_t = e_t + θe_{t-1}, e_t = σε_t and ε_t ~ iid(0, 1) generated from a generalized lambda distribution with a skewness parameter α3 = 0.85 and zero excess kurtosis. The sample sizes are T = 1000 and 2000, the number of Monte Carlo replications is 1000 and σ = 1. Pr(θ̂_SMD > 1) signifies the probability (over Monte Carlo replications) that θ̂_SMD > 1; s.e. is the average standard error computed from consistent estimates of the relevant asymptotic variance expressions, and std. denotes the Monte Carlo standard deviation of θ̂_SMD. The last column of the table reports the rejection rates of the DM test of H0: θ = θ0 at the 10% significance level.
Table 3: SMD estimates of σ, α3 and α4 from MA(1) model with asymmetric errors

 true θ0       σ̂_SMD                      α̂3,SMD                     α̂4,SMD
            mean    median  std.       mean    median  std.       mean    median  std.
 T = 1000
  0.5       0.994   0.994   0.022      0.817   0.816   0.062      2.966   2.937   0.242
  0.7       0.994   0.994   0.022      0.817   0.815   0.068      2.956   2.942   0.263
  1.0       1.000   0.997   0.040      0.803   0.805   0.100      2.982   2.961   0.298
  1.5       1.004   0.997   0.049      0.817   0.817   0.100      2.993   2.933   0.503
  2.0       1.008   0.998   0.062      0.816   0.816   0.086      2.993   2.910   0.389
 T = 2000
  0.5       0.997   0.997   0.015      0.824   0.824   0.044      2.988   2.978   0.153
  0.7       0.997   0.997   0.015      0.824   0.822   0.049      2.982   2.972   0.189
  1.0       1.000   0.999   0.031      0.816   0.818   0.074      2.988   2.966   0.244
  1.5       1.001   1.001   0.034      0.823   0.820   0.069      2.975   2.936   0.359
  2.0       1.008   0.999   0.043      0.822   0.823   0.059      2.977   2.911   0.281

Notes: The table reports summary statistics of the simulated minimum distance (SMD) estimates of σ, α3 and α4 from the MA(1) model y_t = e_t + θe_{t-1}, e_t = σε_t and ε_t ~ iid(0, 1) generated from a generalized lambda distribution with a skewness parameter α3 = 0.85 and zero excess kurtosis (α4 = 3). The sample sizes are T = 1000 and 2000, the number of Monte Carlo replications is 1000 and σ = 1. std. denotes the Monte Carlo standard deviation of the corresponding estimate.
Table 4: SMD estimates of θ and σ from MA(1) model with chi-squared and exponential errors

error                    θ̂SMD                                     σ̂SMD
distr.   true θ0   mean    median  Pr(θ̂SMD > 1)  std.      mean    median  std.
χ²(6)     0.5      0.548   0.504      0.027       0.046     0.955   0.968   0.045
χ²(6)     0.7      0.716   0.706      0.022       0.055     0.967   0.969   0.045
χ²(6)     1.0      0.979   0.982      0.425       0.098     0.983   0.985   0.064
χ²(6)     1.5      1.458   1.478      0.962       0.105     1.009   0.996   0.077
χ²(6)     2.0      1.967   1.982      0.978       0.177     1.006   0.987   0.096
exp(1)    0.5      0.558   0.500      0.038       0.047     0.935   0.953   0.054
exp(1)    0.7      0.705   0.700      0.004       0.051     0.960   0.960   0.053
exp(1)    1.0      0.966   0.978      0.377       0.083     0.987   0.985   0.064
exp(1)    1.5      1.493   1.500      0.977       0.114     0.988   0.982   0.083
exp(1)    2.0      2.015   2.021      0.987       0.188     0.981   0.973   0.105

Notes: The table reports summary statistics of the simulated minimum distance (SMD) estimates of θ and σ from the MA(1) model yt = et + θet−1, et = σεt, where εt is either an iid chi-squared random variable with 6 degrees of freedom (χ²(6)) or an exponential random variable with a scale parameter equal to one (exp(1)). The errors εt are recentered and rescaled to have mean zero and variance 1. The sample size is T = 500, the number of Monte Carlo replications is 1000, and σ = 1. std. denotes the Monte Carlo standard deviation of the corresponding estimate.
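The data-generating process used in the table — an MA(1) driven by exponential errors that are recentered and rescaled to mean zero and unit variance — can be sketched as follows (the function name and seed are illustrative):

```python
import numpy as np

def simulate_ma1(T, theta, sigma, rng):
    """Simulate y_t = e_t + theta * e_{t-1} with e_t = sigma * eps_t,
    where eps_t are exponential(1) draws recentered and rescaled to
    have mean zero and variance one, as described in the table notes."""
    eps = rng.exponential(scale=1.0, size=T + 1)
    eps = (eps - eps.mean()) / eps.std()   # recenter and rescale
    e = sigma * eps
    return e[1:] + theta * e[:-1]          # T observations of the MA(1)

rng = np.random.default_rng(0)
y = simulate_ma1(500, theta=1.5, sigma=1.0, rng=rng)
```

The standardization leaves the skewness of the exponential draws intact, which is what provides the higher-order-cumulant information exploited by the SMD estimator.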
Table 5: SMD and Gaussian ML estimates of ARMA(1, 1) model for commodity prices

                      sample moments            Gaussian ML                        SMD
commodity           skewness   kurtosis     φ̂              θ̂               φ̂              θ̂
crude oil             0.019     5.259    0.403 (0.251)   0.538 (0.241)   0.617 (0.075)   1.575 (0.218)
heating oil           0.207     7.394    0.653 (0.239)   0.742 (0.217)   0.501 (0.096)   1.531 (0.076)
soybean oil           0.018     5.440    0.893 (0.106)   0.814 (0.124)   0.635 (0.065)   1.595 (0.108)
corn                  0.346     6.884    0.634 (0.562)   0.583 (0.588)   0.628 (0.060)   1.627 (0.170)
oats                  1.086     9.538    0.176 (0.512)   0.288 (0.497)   0.696 (0.057)   1.274 (0.081)
soybeans              0.724     7.259    0.881 (0.095)   0.797 (0.116)   0.515 (0.108)   1.886 (0.255)
wheat                 0.020     3.565    0.154 (0.693)   0.078 (0.697)   0.924 (0.412)   1.044 (0.456)
canola                0.887    11.255    0.408 (0.405)   0.500 (0.390)   0.402 (0.091)   1.691 (0.067)
platinum              0.246     4.963    0.046 (0.606)   0.146 (0.591)   0.638 (0.043)   1.428 (0.066)
copper                0.495     5.400    0.314 (1.524)   0.341 (1.498)   0.691 (0.084)   1.347 (0.126)
gold                  0.347     3.598    0.267 (0.614)   0.348 (0.586)   0.234 (0.241)   4.426 (4.804)
silver                0.014     3.979    0.171 (0.314)   0.335 (0.295)   0.098 (0.346)   0.266 (0.330)
palladium             0.229     5.189    0.290 (1.272)   0.253 (1.278)   0.680 (0.049)   1.443 (0.154)
cotton                2.040    18.857    0.903 (0.023)   1.000 (0.016)   0.883 (0.078)   1.048 (0.101)
lumber                0.139     3.498    0.449 (0.317)   0.329 (0.332)   0.712 (0.149)   1.166 (0.181)
cattle, feeder        0.498     5.912    0.046 (2.333)   0.020 (2.337)   0.623 (0.052)   1.386 (0.064)
cattle, live          0.462     5.079    0.698 (0.084)   0.891 (0.053)   0.670 (0.596)   0.955 (0.987)
pork bellies          0.503     5.198    0.841 (0.045)   0.961 (0.025)   0.390 (0.078)   1.673 (0.094)
hogs, lean            0.396     5.462    0.822 (0.046)   0.973 (0.029)   0.579 (0.065)   1.290 (0.090)
cocoa                 0.325     3.832    0.230 (0.281)   0.054 (0.287)   0.016 (0.433)   0.180 (0.410)
sugar                 1.127     6.920    0.933 (0.021)   1.000 (0.014)   0.363 (0.119)   2.506 (0.658)
coffee                0.374     4.685    0.338 (0.657)   0.263 (0.664)   0.669 (0.060)   1.375 (0.132)

Notes: The table reports the SMD and Gaussian quasi-ML estimates and standard errors (in parentheses) for the ARMA(1, 1) model Δst = φΔst−1 + et + θet−1, where et ~ iid(0, σ²). The first two columns report the sample skewness and kurtosis of Δst.
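The sample skewness and kurtosis reported in the first two columns correspond to the standard moment-based definitions; a minimal sketch (function name illustrative; the paper's exact moment estimator is not shown here and may differ in small-sample corrections):

```python
import numpy as np

def skew_kurt(x):
    """Sample skewness m3 / m2^(3/2) and raw (non-excess) kurtosis
    m4 / m2^2, where m_k are central sample moments."""
    d = x - x.mean()
    m2 = np.mean(d**2)
    return np.mean(d**3) / m2**1.5, np.mean(d**4) / m2**2

# For Gaussian data the skewness is near 0 and the kurtosis near 3,
# which is the benchmark against which the table's values are read.
x = np.random.default_rng(0).normal(size=100_000)
s, k = skew_kurt(x)
```

Values such as cotton's kurtosis of 18.857 are thus far into the heavy-tailed region, which is the non-Gaussian feature the SMD estimator exploits.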
Figure 1: Density functions of the standardized CMD estimator (t-statistic) of θ based on data (T = 1000) generated from an MA(1) model yt = et + θet−1 with θ = 1.5 and et ~ iid(0, 1). The errors are drawn from a generalized lambda distribution with zero excess kurtosis and a skewness parameter equal to 0, 0.35, 0.6 and 0.85.
Figure 2: Density functions of the standardized CMD estimator (t-statistic) of σ based on data (T = 1000) generated from an MA(1) model yt = et + θet−1 with θ = 1.5 and et ~ iid(0, 1). The errors are drawn from a generalized lambda distribution with zero excess kurtosis and a skewness parameter equal to 0, 0.35, 0.6 and 0.85.
Figure 3: Density functions of the standardized CMD estimator (t-statistic) of the skewness parameter (τ3) based on data (T = 1000) generated from an MA(1) model yt = et + θet−1 with θ = 1.5 and et ~ iid(0, 1). The errors are drawn from a generalized lambda distribution with zero excess kurtosis and a skewness parameter equal to 0, 0.35, 0.6 and 0.85.
Figure 4: Logarithm of the objective function of the SMD estimator of θ and σ based on data (T = 1000) generated from an MA(1) model yt = et + θet−1 with θ = 0.7 and et ~ iid(0, 1). The errors are drawn from a generalized lambda distribution with zero excess kurtosis and a skewness parameter equal to 0, 0.35, 0.6 and 0.85.
[Figure 5 plot: impulse response function plotted against horizon (1 to 10); lines show the true IRF and the SMD-based and ML-based median IRF estimates.]
Figure 5: SMD and Gaussian quasi-ML median estimates of the impulse response function from the ARMA(1, 1) model (1 + 0.5L)yt = (1 + 2L)et, where et is an exponential random variable with a scale parameter equal to one, recentered and rescaled to have mean zero and variance 1. The sample size of the simulated series is T = 500.
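The true IRF shown in the figure follows from the ARMA(1, 1) lag polynomials: writing the model as (1 − φL)yt = (1 + θL)et, the impulse responses satisfy ψ0 = 1, ψ1 = φ + θ and ψj = φψj−1 for j ≥ 2; for (1 + 0.5L)yt = (1 + 2L)et this means φ = −0.5 and θ = 2. A minimal sketch of the recursion (function name illustrative):

```python
def arma11_irf(phi, theta, horizons):
    """Impulse responses psi_j of (1 - phi*L) y_t = (1 + theta*L) e_t:
    psi_0 = 1, psi_1 = phi + theta, psi_j = phi * psi_{j-1} for j >= 2."""
    psi = [1.0, phi + theta]
    for _ in range(2, horizons):
        psi.append(phi * psi[-1])
    return psi[:horizons]

# Figure 5 model (1 + 0.5L) y_t = (1 + 2L) e_t, i.e. phi = -0.5, theta = 2:
irf = arma11_irf(-0.5, 2.0, 10)   # [1.0, 1.5, -0.75, 0.375, ...]
```

With a non-invertible MA root (|θ| > 1) the Gaussian quasi-ML estimator flips the root inside the unit circle, so its implied IRF diverges from the true one — the contrast the figure illustrates.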
[Figure 6 plot: impulse response function plotted against horizon (1 to 10); lines show the true IRF and the SMD-based and ML-based median IRF estimates.]
Figure 6: SMD and Gaussian quasi-ML median estimates of the impulse response function from the ARMA(1, 1) model (1 − 0.5L)yt = (1 − 2L)et, where et is an exponential random variable with a scale parameter equal to one, recentered and rescaled to have mean zero and variance 1. The sample size of the simulated series is T = 500.