7/29/2019 Gospodinov Ng
Minimum Distance Estimation of Possibly
Non-Invertible Moving Average Models
Nikolay Gospodinov*  Serena Ng†
November 5, 2012
Abstract
This paper proposes classical and simulation-based minimum distance estimation of moving average (MA) models with non-Gaussian errors. Information in higher order cumulants allows identification of the parameters without imposing invertibility. By removing the invertibility restriction, the presence of a moving average unit root no longer presents a boundary problem that gives rise to non-standard asymptotics. As a result, the minimum distance estimator of the MA(1) model has classical root-T asymptotic normal properties when the moving average root is inside, outside, and on the unit circle. For more general models when the dependence of the cumulants on the model parameters is analytically intractable, we propose a simulation estimator based on auxiliary regressions with parameters that are informative about the higher order cumulants. The method uses an error simulator with a flexible functional form that accommodates a large class of distributions with non-Gaussian features. The simulation estimator is also approximately normally distributed without imposing the a priori assumption of invertibility.
JEL Classification: C13, C15, C22
Keywords: Minimum distance; Non-invertibility; Indirect inference; Identification; Non-Gaussian errors; Generalized lambda distribution
*Concordia University and CIREQ, 1455 de Maisonneuve Blvd. West, Montreal, QC H3G 1M8, Canada. Email: [email protected]
†Columbia University, 420 W. 118 St. MC 3308, New York, NY 10027. Email: [email protected]
We would like to thank Prosper Dovonon, Anders Bredahl Kock, Ivana Komunjer and the participants at the CESG meeting at Queen's University for useful comments and suggestions. The first author gratefully acknowledges financial support from FQRSC, IFM2 and SSHRC. The second author acknowledges financial support from the National Science Foundation (SES-0962431).
1 Introduction
Moving average (MA) models can parsimoniously characterize the dynamic behavior of many time
series processes. The challenges in estimating MA models are two-fold. First, invertible and
non-invertible moving average processes are observationally equivalent up to the second moments.
Second, invertibility puts an upper bound of one on all roots of the moving average polynomial,
rendering estimators with non-normal asymptotic distributions when some roots are on or near the
unit circle. Existing estimators treat invertible and non-invertible processes separately, requiring the
researcher to take a stand on the parameter space of interest. While estimators are super-consistent
under the null hypothesis of a moving average unit root, their distributions are not asymptotically
pivotal. To our knowledge, no estimator of the MA model exists that achieves identification without
imposing invertibility and yet enables classical inference over the whole parameter space.
Both invertible and non-invertible representations can be consistent with economic theory. For
example, if the logarithm of the asset price is the sum of a random walk component and a stationary component, the first difference (or asset returns) is generally invertible, but non-invertibility can
arise if the variance of the stationary component is large. While non-invertible models are not ruled
out by theory, invertibility is often the mainstream assumption in empirical work. One reason is that
non-invertible models are not useful for forecasting because future values of the endogenous variable
are not observable. The more practical reason is that the assumption provides the identification restrictions without which maximum likelihood and covariance structure-based estimation of MA models would not be possible when the data are normally distributed.1 Obviously, falsely assuming invertibility will yield an inferior fit of the data. It can also lead to spurious estimates of the impulse coefficients, which are often the objects of interest. Hansen and Sargent (1991), Lippi and Reichlin (1993), Fernández-Villaverde et al. (2007), among others, emphasize the need to verify invertibility because it affects how we interpret what can be recovered from the data. Indeed, it is necessary in many science and engineering applications to admit parameter values in the non-invertible range.2 A key finding in these studies is that higher order cumulants are necessary for identification when the non-invertible models are to be entertained.
This paper considers minimum distance estimation of MA models without imposing invertibility
a priori. We first show using the MA(1) model that use of higher cumulants per se is not sufficient
1 Invertibility can also help to identify structural models. For example, Komunjer and Ng (2011) use invertibility to narrow the class of equivalent DSGE models.
2 For example, in seismology, an accurate model of the seismic source wavelet, in the form of a moving average filter, is necessary to recover the earth's reflectivity sequence. The fact that seismic data typically exhibit non-Gaussian features suggests the need for a wavelet (moving average polynomial) which is non-invertible. Similarly, in communication analysis, an accurate modeling of the communication channel by a possibly non-invertible moving average process is required to back out the underlying message from the observed distorted message.
for the Jacobian matrix to be full rank everywhere in the parameter space. Exploiting the fact that the mapping between the structural parameters and the cumulants can be explicitly derived for the MA(1) case, we show that the cumulants can over- but not exactly identify the MA(1) model if a unit root and parameters consistent with non-invertibility are admissible. However, two second order along with three third order cumulants can be used to construct a classical minimum distance estimator that is root-T consistent and uniformly asymptotically normal.
Extension of the classical minimum distance estimator to more general moving average models
is not possible when the relation between the model parameters and the higher order cumulants is
not analytically tractable. Thus, we also propose a simulation based minimum distance estimator
with errors drawn from the generalized lambda distribution. It is an alternative to the semi-
parametric density considered in Gallant and Tauchen (1996) for simulating non-Gaussian errors.
The estimator uses multiple auxiliary regressions and has the flavor of indirect inference estimation
proposed by Gourieroux et al. (1993) as well as the simulated method of moments of Duffie and
Singleton (1993). The proposed estimator also has classical asymptotic properties regardless of
whether the MA roots are inside, outside, or on the unit circle.
The main arguments of the analysis are presented using the MA(1) model but extensions to more
general models are also discussed. Section 2 proceeds to highlight two identification problems in the
context of minimum distance estimation. Section 3 discusses the properties of the classical minimum
distance estimator based only on information about the covariance structure of the process. It also
motivates the need for using higher order cumulants in estimation and explains how identification
can be achieved. Section 4 develops a simulation minimum distance estimator for more general
moving average models. An empirical application for commodity prices is provided in Section 5. Section 6 concludes.
2 Two Identification Problems
Consider the autoregressive and moving average (ARMA) process of order (p, q):

$$\phi(L) y_t = \theta(L) e_t,$$

where $e_t \sim \mathrm{iid}(0, \sigma^2)$, $L$ is the lag operator such that $L^p y_t = y_{t-p}$, and $\phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$ has no common roots with $\theta(L) = 1 + \theta_1 L + \cdots + \theta_q L^q$. The autoregressive polynomial $\phi(z)$ is said to be causal if $\phi(z) \neq 0$ for all $|z| \le 1$ on the complex plane, and the moving average polynomial is said to be invertible if $\theta(z) \neq 0$ for all $|z| \le 1$ (Brockwell and Davis (1991)). If $y_t$ is a causal function of $e_t$, then there exist constants $h_j$ with $\sum_{j=0}^{\infty} |h_j| < \infty$ such that $y_t = \sum_{j=0}^{\infty} h_j e_{t-j}$ for $t = 0, \pm 1, \ldots$ We say that $y_t$ has minimum phase if the zeros of $\phi(z)$ and $\theta(z)$ are all greater than
2
-
7/29/2019 Gospodinov Ng
4/37
one in absolute value.3 Few economic time series exhibit explosive behavior. If we narrow the focus
to causal and stable processes, invertible processes also have minimum phase.
If a process $y_t$ is invertible in $e_t$, then there exist constants $\pi_j$ with $\sum_{j=0}^{\infty} |\pi_j| < \infty$ such that $e_t = \sum_{j=0}^{\infty} \pi_j y_{t-j} = \pi(L) y_t$. For ARMA(p, q) models, invertibility requires that the inverse of $\theta(L)$ has a convergent series expansion in positive powers of the lag operator $L$. For the MA(1) model

$$y_t = e_t + \theta e_{t-1}, \qquad (1)$$

with $e_t \sim \mathrm{iid}(0, \sigma^2)$, the invertibility condition is satisfied if $|\theta| < 1$ since $\pi(L) = \sum_{s=0}^{\infty} (-\theta)^s L^s$ is a polynomial in positive powers of $L$. This is no longer true when $|\theta|$ in (1) exceeds one. It is, however, misleading to classify invertible and non-invertible processes according to the value of $\theta$ alone. Consider the MA(1) process $y_t$ represented by

$$y_t = \theta e_t + e_{t-1}. \qquad (2)$$

Even if $\theta$ in (2) is less than one, $y_t$ is still non-invertible because the implied $\pi(L) = \sum_{s=0}^{\infty} (-\theta)^s L^{-s-1}$ is a polynomial in negative powers of $L$.
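The contrast can be illustrated numerically. The sketch below is an illustrative simulation, not part of the paper's analysis: the truncation length J and the Gaussian errors are arbitrary choices. Applying the truncated filter $\pi(L) = \sum_{s=0}^{J-1}(-\theta)^s L^s$ to simulated MA(1) data recovers $e_t$ when $|\theta| < 1$ but diverges when $|\theta| > 1$:

```python
import numpy as np

rng = np.random.default_rng(0)
T, J = 3000, 60                                  # sample size and filter truncation (arbitrary)

def recovery_mse(theta):
    """MSE of the truncated AR(inf) filter's error estimates against the true e_t."""
    e = rng.standard_normal(T)
    y = e[1:] + theta * e[:-1]                   # MA(1): y_t = e_t + theta * e_{t-1}
    pi = (-theta) ** np.arange(J)                # pi_s = (-theta)^s
    e_hat = np.array([pi @ y[t - J + 1:t + 1][::-1] for t in range(J - 1, len(y))])
    return np.mean((e_hat - e[J:]) ** 2)

assert recovery_mse(0.5) < 0.01                  # invertible: e_t is recovered
assert recovery_mse(1.5) > 1.0                   # non-invertible: recovery fails
```

The residual error in the invertible case is the truncation term $\theta^J e_{t-J}$, which vanishes geometrically for $|\theta| < 1$ and explodes otherwise.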
Invertible and non-invertible processes have distinctive features with implications for forecasting. In the invertible case, the span of $e_t$ and its history coincides with that of $y_t$, which is observed by the econometrician. The one-step ahead forecast errors are $e_{t|t-1} = y_t - y_{t|t-1} = e_t$. In the non-invertible case, the econometrician does not observe future values of $y_t$ and his information set is strictly inferior to that of the economic agent. As discussed in Ramsey and Montenegro (1992), the one-step ahead forecast errors when $y_t$ is generated by the non-invertible model (2) are

$$e_{t|t-1} = y_t - \theta(e_{t-1} + \theta e_{t-2}) + \theta^2 (e_{t-2} + \theta e_{t-3}) + \cdots \neq e_t.$$

These differences are important in the subsequent analysis.
Identification and estimation of models with a moving-average component are difficult because of two problems that are best understood by focusing on the MA(1) case. The first identification problem concerns $\theta$ at or near unity. When the MA parameter is near the unit circle, the Gaussian maximum likelihood estimator (MLE) takes values exactly on the boundary of the invertibility region with positive probability (the so-called pile-up problem) in finite samples. This point probability mass at unity arises from the symmetry of the likelihood function around one and the small-sample deficiency in identifying all the critical points of the likelihood function in the vicinity of the non-invertibility boundary; see Sargan and Bhargava (1983), Anderson and Takemura (1986), Davis and Dunsmuir (1996), Gospodinov (2002), Davis and Song (2011).
3 A non-stationary process is not mean reverting, with the property that the shocks have permanent effects on the series. This has generated a large literature on testing the unit root hypothesis against the alternative of stationarity.
The second identification problem arises because covariance stationary processes are completely characterized by the first and second moments of the observables, and an MA(1) model with parameters $(\theta, \sigma^2)'$ has the same autocovariance structure as a model parameterized by $(1/\theta, \theta^2 \sigma^2)'$. In consequence, the Gaussian likelihood for an MA(1) model with $L(\theta, \sigma^2)$ is the same as one with $L(1/\theta, \theta^2 \sigma^2)$. The observational equivalence of second moments also implies that the projection coefficients in $\pi(L)$ are the same regardless of whether $\theta$ is less than or greater than one. Thus, $\theta$ cannot be recovered from the coefficients without additional assumptions.
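A minimal numerical check (with hypothetical parameter values) confirms that the two parameterizations imply identical autocovariances $\gamma_0 = (1+\theta^2)\sigma^2$ and $\gamma_1 = \theta\sigma^2$:

```python
def ma1_autocovariances(theta, sigma2):
    """(gamma_0, gamma_1) of y_t = e_t + theta * e_{t-1} with Var(e_t) = sigma2."""
    return (1.0 + theta**2) * sigma2, theta * sigma2

theta, sigma2 = 2.0, 1.0                                   # a non-invertible root
flipped = ma1_autocovariances(1.0 / theta, theta**2 * sigma2)
assert ma1_autocovariances(theta, sigma2) == flipped       # identical second moments
print(flipped)   # (5.0, 2.0) for both parameterizations
```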
This observational equivalence problem can be further elicited from a frequency domain perspective. If we take as a starting point $y_t = h(L) e_t = \sum_{j=-\infty}^{\infty} h_j e_{t-j}$, the frequency response function of the filter is

$$H(\omega) = \sum_j h_j \exp(-i\omega j) = |H(\omega)| \exp(i\varphi(\omega)),$$

where $|H(\omega)|$ is the amplitude and $\varphi(\omega)$ is the phase response of the filter. For ARMA models, $h(z) = \theta(z)/\phi(z) = \sum_{j=-\infty}^{\infty} h_j z^j$. The amplitude response is usually constant for given $\omega$ and tends towards zero outside the interval $[0, \pi]$. If $e_t$ is Gaussian and $h(L)$ is invertible, second order statistics will correctly identify the amplitude and the phase of the wavelet. But for given $a > 0$, the phase $\varphi_0$ is indistinguishable from $\varphi(\omega) = \varphi_0 + a\omega$ for any $\omega \in [0, \pi]$. Recovering $e_t$ from the second order spectrum

$$S_{2,y}(z) = \sigma^2 |H(z)|^2$$

is problematic because $S_{2,y}(z)$ is proportional to the amplitude $|H(z)|^2$ with no information about the phase. The second order spectrum is thus said to be phase-blind. As explained in Lii and Rosenblatt (1982), one can flip the roots of $\theta(z)$ and $\phi(z)$ without affecting the modulus of the transfer function. With real distinct roots, there are $2^{p+q}$ ways of specifying the roots without changing the probability structure of $y_t$.
3 Classical Minimum Distance Estimation of MA(1) Model
The consequences of the two identification problems for estimation and inference are easily seen from the perspective of the classical minimum distance estimator that exploits only the covariance structure of the process. Let $\psi \in \Psi$ be a $K \times 1$ parameter vector of interest with a true value $\psi_0$, where the parameter space $\Psi$ is a subset of the $K$-dimensional Euclidean space $\mathbb{R}^K$. Consider estimating the MA(1) model indirectly via an auxiliary model with an $L \times 1$ ($L \ge K$) vector of parameters $\lambda \in \mathbb{R}^L$ that are functions of $\psi$, $\{\lambda \mid \lambda = \lambda(\psi), \psi \in \Psi\}$, with a pseudo-true value $\lambda_0 \equiv \lambda(\psi_0)$.
Given data $y \equiv (y_1, \ldots, y_T)'$ and a consistent estimator $\widehat{\lambda}_T$ with asymptotic variance $\widehat{\Omega}$, $\psi$ is chosen to minimize the difference between $\widehat{\lambda}_T$ and $\lambda(\psi)$. The classical minimum distance (CMD) estimator of $\psi$ using the optimal weighting matrix is defined as

$$\widehat{\psi}_T = \arg\min_{\psi} J_{CMD}(\widehat{\lambda}_T, \lambda(\psi); \widehat{\Omega}) = \arg\min_{\psi} (\widehat{\lambda}_T - \lambda(\psi))' \widehat{\Omega}^{-1} (\widehat{\lambda}_T - \lambda(\psi)). \qquad (3)$$

If $\lambda$ is of the same dimension as $\psi$ and $\lambda(\psi)$ is of known form and invertible, then $\widehat{\psi}_T = \lambda^{-1}(\widehat{\lambda}_T)$. In general, the dimension of $\lambda$ exceeds that of $\psi$. The auxiliary model need not nest the true model, but identification hinges on a well-behaved mapping from the space of $\psi$ to the parameter space of the auxiliary model.
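In the just-identified, invertible case the inversion $\widehat{\psi}_T = \lambda^{-1}(\widehat{\lambda}_T)$ is available in closed form for the MA(1) covariance structure. The sketch below uses illustrative values; note that the invertible root $|\theta| \le 1$ must be imposed by hand, which is exactly the identifying assumption discussed in Section 2. It solves $\gamma_1/\gamma_0 = \theta/(1+\theta^2)$:

```python
import math

def invert_ma1_moments(gamma1, gamma0):
    """Map (gamma_1, gamma_0) back to (theta, sigma2), picking the root |theta| <= 1."""
    r = gamma1 / gamma0                                        # assumes gamma1 != 0
    theta = (1.0 - math.sqrt(1.0 - 4.0 * r * r)) / (2.0 * r)   # smaller root of r*t^2 - t + r = 0
    return theta, gamma0 / (1.0 + theta**2)

gamma1, gamma0 = 0.5 * 2.0, (1.0 + 0.5**2) * 2.0   # implied by theta = 0.5, sigma2 = 2
theta_hat, sigma2_hat = invert_ma1_moments(gamma1, gamma0)
print(theta_hat, sigma2_hat)   # recovers 0.5 and 2.0 up to floating-point rounding
```

The discarded larger root is $1/\theta$, the observationally equivalent non-invertible parameterization from Section 2.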
Definition 1 Let $\lambda(\cdot): \Psi \to \Lambda$ be a mapping from $\psi$ to $\lambda$ and $G(\psi) = \partial \lambda(\psi)/\partial \psi'$ with $G_0 \equiv G(\psi_0)$. Then, $\psi_0$ is globally identified if $\lambda(\psi)$ is injective and is locally identified if the matrix of partial derivatives $G_0$ has full column rank.

Note that the requirement for full column rank of the derivative matrix $G(\psi_0)$ is a sufficient condition for the mapping $\lambda(\psi_0)$ to be locally injective. Hence, $\psi_0$ is locally identified if $\lambda(\psi_0)$ is locally injective, and Definition 1 provides a sufficient condition for local identification. From Definition 1, $\psi_1$ and $\psi_2$ are observationally equivalent if $\lambda(\psi_1) = \lambda(\psi_2)$.
Lemma 1 Suppose that the following conditions hold: (C.i) $\{y_t, -\infty < t < \infty\}$ is a strictly stationary and ergodic process; (C.ii) $\widehat{\lambda}_T \stackrel{p}{\to} \lambda_0$; (C.iii) the set $\Lambda = \{\lambda \mid \lambda = \lambda(\psi), \psi \in \Psi\}$ is a compact subset of $\mathbb{R}^L$ with $L \ge K$; (C.iv) $\lambda(\psi)$ is twice continuously differentiable in $\psi$; (C.v) the matrix of partial derivatives $G(\psi)$ has rank equal to $K$ on $\Psi$; (C.vi) the mapping $\lambda_0 = \lambda(\psi_0)$ is unique. Then, $\widehat{\psi} \stackrel{p}{\to} \psi_0$. If, in addition, (AN.i) $\psi_0$ is in the interior of $\Psi$; (AN.ii) $\sqrt{T}(\widehat{\lambda}_T - \lambda_0) \stackrel{d}{\to} N(0, \Omega_0)$, where $\Omega_0$ is a symmetric positive definite matrix and $\widehat{\Omega} \stackrel{p}{\to} \Omega_0$, then $\sqrt{T}(\widehat{\psi}_T - \psi_0) \stackrel{d}{\to} N\left(0, (G_0' \Omega_0^{-1} G_0)^{-1}\right)$.
Lemma 1, adapted from Ruud (2000), provides conditions for consistency and asymptotic normality of the classical minimum distance estimator. Except for (C.v), these are more or less standard conditions for extremum estimators to be consistent and asymptotically normally distributed; see, for example, Newey and McFadden (1994). Condition (C.v) is typically stated as a requirement for asymptotic normality but not a necessary condition for global identification; see Hall (2005, p. 69).4 However, in Rothenberg (1971), full rank of the derivative matrix is used in stating

4 Sargan (1983) and, more recently, Dovonon and Renault (2011) show that identification is possible even when the moment conditions have a degenerate derivative matrix at the true values of the parameters.
sufficient conditions for global identification.5 While this condition may be too strong in general, the condition is necessary for identification of the MA(1) model if the case $|\theta_0| = 1$ is to be allowed for, as we now discuss.

The MA(1) model has $\psi = (\theta, \sigma^2)'$ and $\psi_0 = (\theta_0, \sigma_0^2)'$, with parameter space $\Psi = [-\theta_H, \theta_H] \times [\sigma_L^2, \sigma_H^2]$. Identification when non-invertibility is admissible requires moments of order greater than 2; see Lii and Rosenblatt (1982, Lemma 1), Giannakis and Swami (1990), Mendel (1991). This necessarily requires that $e_t$ has non-Gaussian features.
The use of higher order cumulants per se does not, however, automatically guarantee identification of the MA(1) model. Let $e_t = \sigma \varepsilon_t$, where $\varepsilon_t \sim \mathrm{iid}(0, 1)$ with $E(\varepsilon_t^l) = \tau_l$ ($l \ge 3$), and assume that $\tau_3 \neq 0$. Since the third cumulant of a mean-zero, stationary process $y_t$ is defined as $\mathrm{cum}_{3,y}(u, v) = E(y_t y_{t+u} y_{t+v}) = \mathrm{cum}_{3,e} \sum_{i=0}^{q} h_i h_{i+u} h_{i+v}$, the non-zero third order cumulants for the MA(1) process are given by8

$$\mathrm{cum}_{3,y}(0, 0) = E(y_t^3) = (1 + \theta^3)\sigma^3 \tau_3,$$
$$\mathrm{cum}_{3,y}(0, -1) = E(y_t^2 y_{t-1}) = \theta^2 \sigma^3 \tau_3,$$
$$\mathrm{cum}_{3,y}(-1, 0) = E(y_t^2 y_{t-1}) = \theta^2 \sigma^3 \tau_3,$$
$$\mathrm{cum}_{3,y}(-1, -1) = E(y_t y_{t-1}^2) = \theta \sigma^3 \tau_3.$$
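These closed-form expressions can be checked by brute-force enumeration of $E(y_t y_{t+u} y_{t+v})$ over the impulse response weights $h_0 = 1$, $h_1 = \theta$, using only that $e_t$ is iid with moments $(0, \sigma^2, \sigma^3 \tau_3)$. The check below is illustrative, not part of the paper's derivations:

```python
from itertools import product

def third_moment(u, v, theta, sigma2, tau3):
    """E(y_t y_{t+u} y_{t+v}) for y_t = e_t + theta*e_{t-1}, by term enumeration."""
    h = {0: 1.0, 1: theta}                     # impulse responses h_0, h_1
    total = 0.0
    for j1, j2, j3 in product(h, h, h):        # one e-term from each of the three factors
        # the shocks involved are e_{t-j1}, e_{t+u-j2}, e_{t+v-j3}; with iid mean-zero
        # errors only the triple coincidence E(e^3) = sigma^3 * tau3 contributes
        if (-j1, u - j2, v - j3).count(-j1) == 3:
            total += h[j1] * h[j2] * h[j3] * sigma2**1.5 * tau3
    return total

theta, sigma2, tau3 = 1.5, 1.0, 0.85
assert abs(third_moment(0, 0, theta, sigma2, tau3) - (1 + theta**3) * sigma2**1.5 * tau3) < 1e-12
assert abs(third_moment(0, -1, theta, sigma2, tau3) - theta**2 * sigma2**1.5 * tau3) < 1e-12
assert abs(third_moment(-1, -1, theta, sigma2, tau3) - theta * sigma2**1.5 * tau3) < 1e-12
```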
Lemma 2 Let $\lambda = (\mathrm{cum}_{2,y}(1), \mathrm{cum}_{2,y}(0), \mathrm{cum}_{3,y}(u, v))'$, where $\mathrm{cum}_{3,y}(u, v)$ is one of the third order cumulants of $y_t$. Then, (i) $\lambda$ cannot globally identify $\psi$ for any $\psi_0 = (\theta_0, \sigma_0^2, \tau_{3,0})' \in \Psi$ and (ii) $\lambda$ cannot locally identify $\psi$ when $|\theta_0| = 1$ for any $\sigma^2$ and $\tau_3$.
7 All-pass models are non-causal and/or non-invertible ARMA models in which the roots of the autoregressive polynomial are reciprocals of the roots of the moving average polynomial and vice versa.
8 This follows from direct evaluation of $E(y_t y_{t+u} y_{t+v})$ with $y_t = e_t + \theta e_{t-1}$.
Lemma 2 considers the case when $\lambda$ and $\psi$ are of the same dimension. Part (i) of the lemma implies that there always exist $\psi_1 \in \Psi$ and $\psi_2 \in \Psi$ such that $\psi_1$ and $\psi_2$ are observationally equivalent in the sense that they generate the same $\lambda$. For example, $\psi_1 = (\theta, \sigma^2, \tau_3)'$ and $\psi_2 = (1/\theta, \theta^2 \sigma^2, \theta \tau_3)'$ both imply the same $\lambda = (E(y_t y_{t-1}), E(y_t^2), E(y_t^2 y_{t-1}))'$. It is easy to verify that the mapping from $\psi$ to $\lambda$ also fails to be injective when $\mathrm{cum}_{3,y}(0, -1)$ is replaced by $\mathrm{cum}_{3,y}(0, 0)$ or $\mathrm{cum}_{3,y}(-1, -1)$. This result arises because observational equivalence of second moments precludes $\mathrm{cum}_{2,y}(1)$ and $\mathrm{cum}_{2,y}(0)$ from exactly identifying $\theta$ and $\sigma^2$, and a single higher order cumulant might, but cannot be guaranteed to, identify both $\tau_3$ and the parameters of the MA(1) model. Part (ii) of Lemma 2 follows from the fact that the determinant of the derivative matrix is zero at $|\theta_0| = 1$. Local identification of $\psi$ when $|\theta_0| = 1$ will always require replacing $\mathrm{cum}_{2,y}(1)$ or $\mathrm{cum}_{2,y}(0)$ with another third order cumulant to avoid degeneracy in the derivative matrix.
Lemma 3 Let $\Lambda \equiv \{\mathrm{cum}_{2,y}(1), \mathrm{cum}_{2,y}(0), \mathrm{cum}_{3,y}(0, 0), \mathrm{cum}_{3,y}(0, -1), \mathrm{cum}_{3,y}(-1, -1)\}$ and $\lambda(\psi)$ be a function that maps $\psi$ to $\lambda$. Then, there exists at least one value in the parameter space $\Psi$ for $\psi$ at which $\mathrm{rank}(G(\psi)) < 3$ if $\dim(\lambda) = 3$.
The three third order cumulants together with the two second order cumulants $E(y_t y_{t-1})$ and $E(y_t^2)$ create ten distinct combinations of a three-dimensional vector of auxiliary parameters to be considered for exact identification. Direct calculations show that if $\lambda$ consists of $E(y_t y_{t-1})$, $E(y_t^3)$ and $E(y_t y_{t-1}^2)$, or of $E(y_t^2)$, $E(y_t^3)$ and $E(y_t y_{t-1}^2)$, the derivative matrix is singular at $\theta = (1/2)^{1/3}$. If $\lambda$ consists of $E(y_t y_{t-1})$, $E(y_t^2 y_{t-1})$ and $E(y_t y_{t-1}^2)$, or of $E(y_t^2)$, $E(y_t^2 y_{t-1})$ and $E(y_t y_{t-1}^2)$, the matrix of derivatives is singular at $\theta = 0$. If $\lambda$ contains both $E(y_t y_{t-1})$ and $E(y_t^2)$, the determinant of the derivative matrix is zero at $\theta = 1$ and $\theta = -1$. Finally, if $\lambda$ consists of $E(y_t y_{t-1})$, $E(y_t^2 y_{t-1})$ and $E(y_t^3)$, or of $E(y_t^2)$, $E(y_t^2 y_{t-1})$ and $E(y_t^3)$, the determinant of the derivative matrix is zero at $\theta = 0$ and $\theta = 2^{1/3}$. For all ten combinations of $\lambda$ considered, there always exist values of $\psi$ for which the derivative matrix is singular. The implication is that exact identification of all $\psi \in \Psi$ from three-dimensional subsets of $\Lambda$ is not possible.
We now propose an over-identified CMD estimator $\widehat{\psi}_{CMD}$ based on a vector of cumulants

$$\lambda_{CMD} = \left(\mathrm{cum}_{2,y}(1),\ \mathrm{cum}_{2,y}(0),\ \mathrm{cum}_{3,y}(0, -1),\ \mathrm{cum}_{3,y}(0, 0),\ \mathrm{cum}_{3,y}(-1, -1)\right)' = \left(E(y_t y_{t-1}),\ E(y_t^2),\ E(y_t^2 y_{t-1}),\ E(y_t^3),\ E(y_t y_{t-1}^2)\right)' \qquad (6)$$

and a mapping function

$$\Lambda_{CMD}(\psi) = \left(\theta \sigma^2,\ (1 + \theta^2)\sigma^2,\ \theta^2 \sigma^3 \tau_3,\ (1 + \theta^3)\sigma^3 \tau_3,\ \theta \sigma^3 \tau_3\right)'.$$
From a methods of moments perspective, $\lambda_{CMD}$ contains information about the covariance structure, the unconditional skewness of the process, and the time-varying second moments of the observables. The latter is useful for identification because Ramsey and Montenegro (1992) show that the residuals from an autoregressive approximation exhibit ARCH-type structure if the underlying MA process is non-invertible and the true errors are asymmetric.
The derivative matrix of $\Lambda_{CMD}(\psi)$ with respect to $\psi$ is

$$G_{CMD}(\psi) = \begin{pmatrix} \sigma^2 & \theta & 0 \\ 2\theta\sigma^2 & 1 + \theta^2 & 0 \\ 2\theta\sigma^3\tau_3 & 3\theta^2\sigma\tau_3/2 & \theta^2\sigma^3 \\ 3\theta^2\sigma^3\tau_3 & 3(1 + \theta^3)\sigma\tau_3/2 & (1 + \theta^3)\sigma^3 \\ \sigma^3\tau_3 & 3\theta\sigma\tau_3/2 & \theta\sigma^3 \end{pmatrix}. \qquad (7)$$

Notably, due to the addition of the three higher order cumulants, the derivative matrix has full column rank everywhere in $\Psi$ even at $|\theta_0| = 1$, and conditions (C.v) and (AN.i) are thus satisfied. The rank condition is necessary for $\psi_0$ to be a unique solution to the system of non-linear equations characterized by

$$G_{CMD}(\psi)' W (\lambda_{CMD} - \Lambda_{CMD}(\psi)) = 0_{3 \times 1}.$$

Provided that $\tau_3 \neq 0$, the uniqueness condition (C.vi) holds. The full rank condition is also necessary for the estimator to be asymptotically normal. As a result, this CMD estimator is root-T consistent and asymptotically normal.
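As a quick numerical sanity check (with illustrative parameter values), the matrix in (7) can be evaluated directly: its column rank is 3 at the unit root provided $\tau_3 \neq 0$, and drops in the Gaussian case $\tau_3 = 0$:

```python
import numpy as np

def G_cmd(theta, sigma2, tau3):
    """Derivative matrix (7) of Lambda_CMD(psi) with respect to (theta, sigma2, tau3)."""
    s = np.sqrt(sigma2)
    return np.array([
        [sigma2,                     theta,                            0.0],
        [2 * theta * sigma2,         1 + theta**2,                     0.0],
        [2 * theta * s**3 * tau3,    1.5 * theta**2 * s * tau3,        theta**2 * s**3],
        [3 * theta**2 * s**3 * tau3, 1.5 * (1 + theta**3) * s * tau3,  (1 + theta**3) * s**3],
        [s**3 * tau3,                1.5 * theta * s * tau3,           theta * s**3],
    ])

for th in (0.5, 1.0, -1.0, 1.5):                          # includes both unit-circle cases
    assert np.linalg.matrix_rank(G_cmd(th, 1.0, 0.85)) == 3
assert np.linalg.matrix_rank(G_cmd(1.0, 1.0, 0.0)) == 2   # Gaussian errors: rank deficient
```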
Proposition 1 Consider the MA(1) model (1) with $e_t = \sigma \varepsilon_t$, $\varepsilon_t \sim \mathrm{iid}(0, 1)$ and $E(\varepsilon_t^3) = \tau_3$. Assume that $\tau_3 \neq 0$ and $E|\varepsilon_t|^6 < \infty$. Let $\psi = (\theta, \sigma^2, \tau_3)'$ and $\widehat{\psi}_{CMD}$ be the minimum distance estimator based on (6), with $\Omega_{CMD} = \mathrm{Avar}(\widehat{\lambda}_{CMD})$. Then,

$$\sqrt{T}(\widehat{\psi}_{CMD} - \psi_0) \stackrel{d}{\to} N\left(0,\ \left(G_{CMD}(\psi_0)' \Omega_{CMD}^{-1} G_{CMD}(\psi_0)\right)^{-1}\right).$$
While analytical results for moving average processes of higher order are difficult to obtain, our conjecture is that the use of a single higher order cumulant remains necessary but not sufficient for identification. Two more third order cumulants must be used in conjunction with $E(y_t y_{t-1})$ and $E(y_t^2)$ to overidentify $\psi$. Provided that $\tau_4 \neq 3$, fourth order cumulants such as $E(y_t^4) = [(1 + \theta^4)\tau_4 + 6\theta^2]\sigma^4$ and $E(y_t^2 y_{t-1}^2) = (1 + \theta^2 + \theta^2\tau_4 + \theta^4)\sigma^4$ can also be incorporated in conjunction with other cumulants of order three or higher.
3.3 Finite-Sample Properties of the CMD Estimator
To illustrate the finite-sample properties of the CMD estimators, data with $T = 1000$ observations are generated from an MA(1) model $y_t = e_t + \theta e_{t-1}$ and $e_t = \sigma \varepsilon_t$, where $\varepsilon_t$ is $\mathrm{iid}(0, 1)$ and follows
a generalized lambda distribution (GLD), which will be further discussed in Section 4.1. For now, it suffices to note that GLD distributions can be characterized by a skewness parameter $\tau_3$ and a kurtosis parameter $\tau_4$. The true values of the parameters are $\theta = 0.5, 0.7, 1, 1.5$ and $2$, $\sigma = 1$, $\tau_3 = 0, 0.35, 0.6$ and $0.85$, and $\tau_4 = 3$. Lack of identification of $\theta$ arises when $\tau_3 = 0$, and weak to intermediate identification occurs when $\tau_3 = 0.35$, $0.6$ and $0.85$.

Table 1 presents the average estimates and the standard deviations of three CMD estimators of $\theta$ and $\tau_3$ over 5000 Monte Carlo replications. While $\tau_3$ is typically not a parameter of direct interest, information for this parameter would indicate how useful the third cumulants are in identifying and estimating the parameters of the model. The first estimator is the CMD estimator using the sample analog of (6) as auxiliary parameters. As argued above, the use of higher order cumulants does not necessarily guarantee identification. For this reason, we also consider a just-identified classical minimum distance estimator that uses only the sample analog of

$$\lambda_U = \left(E(y_t y_{t-1}),\ E(y_t^2),\ E(y_t^2 y_{t-1})\right)' \subset \lambda_{CMD}$$

as auxiliary parameters. For the sake of comparison, we consider an infeasible minimum distance estimator which is based on $\widehat{\lambda}_U$ but assumes that $\sigma^2$ is known and estimates only $(\theta, \tau_3)'$. As discussed earlier, fixing $\sigma^2$ solves the identification problem. Without imposing invertibility, $|\theta_0| = 1$ is not on the boundary of the parameter space for $\theta$. The infeasible estimator is asymptotically normally distributed uniformly over the whole parameter space for $\theta$ and over all error distributions. The problem is that $\sigma^2$ is, in general, unknown. We demonstrate, however, that our proposed CMD estimator has properties similar to this infeasible estimator.
The results in Table 1 suggest that regardless of the degree of non-Gaussianity, the infeasible estimator produces estimates of $\theta$ that are very precise and essentially unbiased. Hence, fixing $\sigma^2$ solves both identification problems without the need for non-Gaussianity, although prior knowledge of $\sigma^2$ is rarely available in practice. As seen in Table 1, the feasible (just-identified) version of this estimator, based on $\widehat{\lambda}_U$, does not achieve identification of the structural parameters for any value of $\tau_3$. This estimator is also characterized by a large pile-up probability at unity due to a violation of condition (C.v). In contrast, over-identifying the model with the auxiliary moments $E(y_t^3)$ and $E(y_t y_{t-1}^2)$ gives rise to the CMD estimator, which achieves identification as the degree of skewness increases. For the CMD estimator, the skewness parameter appears to be very well identified and estimated over all specifications of the error distribution. In fact, the CMD estimates of $\tau_3$ appear to be much more precise than those of the infeasible estimator, which can be attributed to the usefulness of the additional third order cumulants used in the CMD estimator. However, the estimation of $\theta$ depends on the strength of identification. While for $\tau_3 = 0.35$ the identification is weak and the estimates of $\theta$ are somewhat biased, for higher values of the skewness parameter the CMD
estimates of $\theta$ are practically unbiased. When $\tau_3 = 0.85$, the CMD estimator identifies $\theta$ correctly (with probability one) whether the true value of $\theta$ is in the invertible or the non-invertible region.

Figures 1, 2 and 3 plot the density functions of the standardized CMD estimator of $\theta$, $\sigma^2$ and $\tau_3$, respectively, for the MA(1) model considered in this section with $\theta = 1.5$ and $T = 1000$. While the lack of identification for zero or low values of the skewness parameter induces non-normality (bimodality and fat tails) in the distribution of the estimator, the densities of the standardized CMD estimator of $\theta$, $\sigma^2$ and $\tau_3$ appear to be very close to the standard normal density for $\tau_3 = 0.85$.
4 Semi-Parametric Simulated Minimum Distance Estimation
While the CMD estimator for the MA(1) model has appealing asymptotic and finite-sample properties, analytical expressions for the mapping from general ARMA(p, q) models to the cumulants are not tractable. For this reason, we develop a simulation-based estimator for ARMA(p, q) models. The simulation estimator is similar in spirit to the CMD but can accommodate autoregressive dynamics, kurtosis and other features of the errors. The difference with CMD is that it uses simulations to approximate and invert $\lambda(\psi)$.

More precisely, let $y^s(\psi) = (y_1^s, \ldots, y_T^s)'$ be data simulated for a candidate value of $\psi$. This usually requires drawing errors from a known distribution, and the parameters of this distribution are ancillary for $\psi$. Let

$$\widehat{\lambda}_T = \arg\min_{\lambda} Q_T(\lambda; y)$$

and

$$\widetilde{\lambda}_T^s(\psi) = \arg\min_{\lambda} Q_T(\lambda; y^s(\psi))$$

be the auxiliary parameters estimated from actual and simulated data, respectively, where $Q_T(\lambda)$ denotes the objective function of the auxiliary model. Define $\widetilde{\lambda}_{T,S}(\psi)$ to be the average of the estimates $\widetilde{\lambda}_T^s(\psi)$ over $S$ draws each using simulated data of length $T$, i.e.,9

$$\widetilde{\lambda}_{T,S}(\psi) = \frac{1}{S} \sum_{s=1}^{S} \widetilde{\lambda}_T^s(\psi).$$

A simulation-based minimum distance (SMD) estimator can now be defined as

$$\widehat{\psi}_{T,S} \equiv \arg\min_{\psi} J_{SMD}(\widehat{\lambda}_T, \widetilde{\lambda}_{T,S}(\psi); \widehat{\Omega}) = \arg\min_{\psi} (\widehat{\lambda}_T - \widetilde{\lambda}_{T,S}(\psi))' \widehat{\Omega}^{-1} (\widehat{\lambda}_T - \widetilde{\lambda}_{T,S}(\psi)), \qquad (8)$$
9 Alternatively, one could use one draw of simulated data of length $T \cdot S$, $y^S(\psi) = (y_1^S, \ldots, y_{TS}^S)'$, and define $\widetilde{\lambda}_{T,S}(\psi)$ as $\widetilde{\lambda}_{T,S}(\psi) = \arg\min_{\lambda} Q_T(\lambda; y^S(\psi))$.
where $\widehat{\Omega}_T$ is a consistent estimate of the asymptotic variance of $\widehat{\lambda}_T$. The efficient method of moments (EMM) estimator of Gallant and Tauchen (1996) and the indirect inference estimator (IIE) of Gourieroux et al. (1993) consider a pseudo-maximum likelihood estimator of $\lambda$, in which case $Q_T(\lambda)$ is the log-likelihood. Identification requires that the mapping $\lambda(\psi)$ be injective in the sense of Definition 1. In other words, the auxiliary model must contain features of the data generated under $\psi$. Simulations merely provide an approximation to $\lambda^{-1}(\widehat{\lambda}_T)$.10 Thus, the $\lambda(\psi)$ in simulation-based minimum distance estimation is sample size dependent (Phillips (2012)). Gourieroux et al. (1993) refer to the estimator as the method of indirect inference and $\lambda(\psi)$ as the binding function.

Simulation estimation of the MA(1) model was considered in Gourieroux et al. (1993), Michaelides and Ng (2000), Ghysels et al. (2003), Czellar and Zivot (2008), among others, but only for the invertible case. All of these studies use an autoregression as the auxiliary model. For $\theta = 0.5$ and assuming that $\sigma^2$ is known, Gourieroux et al. (1993) find that the IIE compares favorably to the exact MLE in terms of bias and root-mean squared error. Michaelides and Ng (2000) and Ghysels et al. (2003) also evaluate the properties of simulation-based estimators with $\sigma^2$ assumed known. Czellar and Zivot (2008) report that the IIE is relatively less biased but exhibits some instability, and the tests based on it suffer from size distortions when $\theta_0$ is close to unity. The favorable properties of the IIE when $\theta$ is in the invertible range can be traced to the fact that simulation estimation has a bias-correction property that is absent from classical minimum distance estimation. Intuitively, if the auxiliary parameter estimates $\widehat{\lambda}_T$ obtained from the data are downward biased, so will be the estimates $\widetilde{\lambda}_T^s$ estimated from the data simulated for a given $\psi$. Then, $\psi$ can be calibrated to bias-correct the CMD, akin to the bootstrap.
This bias-correction property provided by simulation estimation has an additional but unexploited role when non-invertible models are allowed. As shown in the previous section, identification without imposing invertibility relies on information in higher order moments, which tend to exhibit finite-sample biases. The next section considers a simulation-based estimator that achieves identification without imposing invertibility and enables classical inference even at $\theta_0 = 1$.
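To fix ideas, here is a deliberately stripped-down sketch of the recipe in (8) for a zero-mean MA(1). An identity weighting matrix, a three-statistic auxiliary vector, $\sigma$ fixed at one, a crude grid search over $\theta$, and skewed errors built from a standardized chi-square draw are all illustrative choices, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_ma1(theta, T):
    eps = (rng.chisquare(4, T + 1) - 4.0) / np.sqrt(8.0)   # skewed iid(0,1) errors
    return eps[1:] + theta * eps[:-1]                      # y_t = e_t + theta * e_{t-1}

def aux_stats(y):
    """Auxiliary statistics: first autocovariance, variance, E(y_t^2 y_{t-1})."""
    return np.array([np.mean(y[1:] * y[:-1]), np.mean(y**2), np.mean(y[1:]**2 * y[:-1])])

T, S, theta0 = 2000, 10, 1.5
lam_hat = aux_stats(simulate_ma1(theta0, T))               # "observed" data statistics

def J(theta):                                              # SMD objective, identity weight
    lam_sim = np.mean([aux_stats(simulate_ma1(theta, T)) for _ in range(S)], axis=0)
    d = lam_hat - lam_sim
    return d @ d

grid = np.arange(0.1, 3.01, 0.05)                          # includes non-invertible values
theta_hat = grid[np.argmin([J(th) for th in grid])]
assert abs(theta_hat - theta0) < 0.2                       # lands near the true root
```

Because the simulated and observed auxiliary statistics share the same finite-sample biases, matching them embodies the bias-correction property discussed above.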
4.1 SMD Estimator Based on GLD Errors
As the key to identification is errors with non-Gaussian properties, we need to be able to simulate non-Gaussian errors in a flexible fashion so that $y_t$ has the desired distributional properties. There is evidently a large class of distributions with third and fourth moments consistent with a non-Gaussian process that one can specify. As assuming a particular parametric error distribution could compromise the robustness of the estimates, we simulate errors from the generalized lambda

10 In the terminology of Gallant and Tauchen (1996), the true densities of the data must be smoothly embedded within the scores of the auxiliary model.
distribution $P_\lambda(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$ considered in Ramberg and Schmeiser (1975). This distribution has two appealing features. First, it can accommodate a wide range of values for the skewness and excess kurtosis parameters, and it includes as special cases the normal, log-normal, exponential, t, beta, gamma and Weibull distributions. The second advantage is that it is easy to simulate from. The percentile function is given by

$$P^{-1}(U) = \lambda_1 + \left[U^{\lambda_3} - (1 - U)^{\lambda_4}\right] / \lambda_2, \qquad (9)$$

where $U$ is a uniform random variable on $[0, 1]$, $\lambda_1$ is a location parameter, $\lambda_2$ is a scale parameter, and $\lambda_3$ and $\lambda_4$ are shape parameters. To simulate $\varepsilon_t$, a $U$ is drawn from the uniform distribution and (9) is evaluated for given values of $(\lambda_1, \lambda_2, \lambda_3, \lambda_4)$. As shown in Ramberg and Schmeiser (1975), the shape parameters $(\lambda_3, \lambda_4)$ are explicitly related to the coefficients of skewness and kurtosis ($\tau_3$ and $\tau_4$) of $\varepsilon_t$. Furthermore, the shape parameters $(\lambda_3, \lambda_4)$ and the location/scale parameters $(\lambda_1, \lambda_2)$ can be sequentially evaluated. Since $\varepsilon_t$ has mean zero and variance one, the parameters $(\lambda_1, \lambda_2)$ are determined by $(\lambda_3, \lambda_4)$ so that $\varepsilon_t$ is effectively characterized by $\lambda_3$ and $\lambda_4$.
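A minimal inverse-transform sampler based on (9); the parameter values below are the well-known Ramberg–Schmeiser calibration that approximates a standard normal, used here purely for illustration:

```python
import random

rng = random.Random(0)

def gld_draw(lam1, lam2, lam3, lam4):
    """One draw via the percentile function: lam1 + (U**lam3 - (1-U)**lam4) / lam2."""
    u = rng.random()
    return lam1 + (u**lam3 - (1.0 - u)**lam4) / lam2

lam = (0.0, 0.1975, 0.1349, 0.1349)          # approximates N(0, 1)
draws = [gld_draw(*lam) for _ in range(20000)]
mean = sum(draws) / len(draws)
var = sum((x - mean)**2 for x in draws) / len(draws)
assert abs(mean) < 0.05 and abs(var - 1.0) < 0.1
```

Skewed error distributions are obtained by choosing $\lambda_3 \neq \lambda_4$, which is what makes the higher order cumulants of $y_t$ informative.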
We jointly estimate the structural parameters and 2 with the nuisance parameters of
the non-Gaussian distribution which are necessary for identication of and 2. The structural
parameter vector is expanded to contain parameters of the error process.11 Dene the augmented
parameter vector of interest by
SMD = (; 2; 3; 4)
0:
Let the vector of auxiliary parameters be defined from the following regression models:

y_t = a1 y_{t-1} + ... + ap y_{t-p} + c1 y^2_{t-1} + v_{1t},    (10a)
y^2_t = c0 + c2 y_{t-1} + c3 y^2_{t-1} + v_{2t}.    (10b)

Model (10a) captures the dynamics of y_t; the slope parameters of model (10b) reflect information
in the higher-order, time-varying cumulants of the process, while the intercept c0 is related to the
second unconditional moment of y_t. To capture information in the skewness and kurtosis of the
errors, we augment the auxiliary parameter vector with the third and fourth moments of the OLS
residuals from regression (10a), i.e., μ3 = E(v^3_{1t}) and μ4 = E(v^4_{1t}). As a result, the auxiliary
parameter vector for the SMD estimator is

ψ_SMD = (a1, ..., ap, c0, c1, c2, c3, μ3, μ4)′.    (11)
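For concreteness, the auxiliary parameter vector in (11) can be computed by two OLS regressions plus the residual moments. The sketch below is our own illustration of that recipe under simple assumptions (p = 4, plain least squares, variable names of our choosing); the actual implementation details are not spelled out in the text.

```python
import numpy as np

def auxiliary_parameters(y, p=4):
    """OLS estimates of (10a)-(10b), stacked as in (11):
    psi = (a_1, ..., a_p, c0, c1, c2, c3, mu3, mu4)."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    # (10a): y_t on y_{t-1}, ..., y_{t-p} and y^2_{t-1}
    X1 = np.column_stack([y[p - k:T - k] for k in range(1, p + 1)]
                         + [y[p - 1:T - 1] ** 2])
    b1, *_ = np.linalg.lstsq(X1, y[p:], rcond=None)
    v1 = y[p:] - X1 @ b1                     # residuals of (10a)
    # (10b): y^2_t on a constant, y_{t-1} and y^2_{t-1}
    X2 = np.column_stack([np.ones(T - 1), y[:-1], y[:-1] ** 2])
    b2, *_ = np.linalg.lstsq(X2, y[1:] ** 2, rcond=None)
    mu3, mu4 = np.mean(v1 ** 3), np.mean(v1 ** 4)
    return np.concatenate([b1[:p], [b2[0], b1[p], b2[1], b2[2], mu3, mu4]])

# hypothetical data: an MA(1) with recentered exponential (skewed) errors
rng = np.random.default_rng(1)
e = rng.exponential(1.0, 501) - 1.0
psi = auxiliary_parameters(e[1:] + 0.5 * e[:-1])
```

Matching these statistics between actual and simulated data is what drives the SMD objective.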
11 It would seem tempting to estimate λ3 and λ4 separately from (θ, σ²)′, for example by using the sample skewness and kurtosis of the residuals of a long autoregression. But as discussed in Ramsey and Montenegro (1992), the OLS residuals do not converge in the limit to the true errors when θ(L) is non-invertible, rendering their sample higher moments asymptotically biased as well.
The auxiliary regressions (10a) and (10b) allow us to perform simple tests for identification. By
Lemma 2, two or more cumulants of third order are necessary for identification of θ if |θ0| = 1 is an admissible value in the parameter space. For example, individual t tests of H0: c1 = 0, H0: c2 = 0
and H0: c3 = 0 can shed light on whether the third and fourth order cumulants can identify the
structural parameters of the MA(1) model. If the individual null hypotheses are rejected, a joint test can be performed before using the classical or simulation-based estimation of θ.

The proposed simulated minimum distance estimator θ̂_SMD of θ_SMD is obtained as in (8) for a given consistent estimator ψ̂_SMD of the auxiliary parameter vector ψ_SMD. The estimator is semi-parametric because we use a possibly misspecified error distribution to simulate data from
the structural model. To establish the consistency and asymptotic normality of the SMD estimator θ̂_SMD, we need some additional notation and regularity conditions. Let P denote the class of generalized lambda distributions and let all limits be taken with respect to P as T → ∞.
Proposition 2 Let ψ_SMD be defined as in (11). Suppose that, in addition to the assumptions in
Lemma 1, sup_{θ∈Θ} ||ψ̃_SMD(θ) - ψ_SMD(θ)|| →p 0 and √T(ψ̃_SMD(θ0) - ψ_SMD(θ0)) →d N(0, Σ_SMD), where Σ_SMD = Avar(ψ̃_SMD). Then θ̂_SMD →p θ0 and

√T(θ̂_SMD - θ0) →d N(0, (1 + 1/S)[G_SMD(θ0)′ Σ_SMD^{-1} G_SMD(θ0)]^{-1}) ≡ N(0, Avar(θ̂_SMD)).
Consistency follows from the identifiability of θ, and the moment conditions that exploit information
in higher order cumulants play a crucial role. In our procedure, α3 and α4 are defined in terms of λ3
and λ4, so that the estimates of α3 and α4 are implied by the generalized lambda distribution instead
of the sample estimates of skewness and kurtosis. Even though λ3 and λ4 are not parameters of
direct interest, they are crucial for identification of θ and σ².
A key feature of Proposition 2 is that it holds when θ is less than, greater than, or equal to one. In
a Gaussian likelihood setting, when invertibility is assumed for the purpose of identification, there is
a boundary for the support of |θ| at the unit circle. Thus, likelihood-based estimation has non-standard properties when the true value of θ is on or near the boundary of one. In our setup, this
boundary constraint is lifted because identification is achieved through higher moments instead
of by imposing invertibility. As a consequence, the SMD estimator θ̂_SMD has classical properties
provided that α3 and α4 enable identification.

Consistent estimation of the asymptotic variance of θ̂_SMD can proceed by substituting a consistent estimator of Σ_SMD and evaluating the Jacobian G_SMD(θ̂_{T,S}) numerically. The computed standard errors can then be used for testing hypotheses and constructing confidence intervals.
Alternatively, inference on the MA parameter of interest, θ, can be conducted by constructing
confidence intervals based on test inversion without an explicit computation of the variance matrix
Avar(θ̂_SMD). For a sequence of null hypotheses H0: θ = θ_i for θ_i ∈ Θ, consider a generic distance metric (DM) statistic

DM_SMD = (TS/(S + 1)) [J_SMD(ψ̂_T, ψ̃_{S,T}(θ̃), Σ̂) - J_SMD(ψ̂_T, ψ̃_{S,T}(θ̂), Σ̂)],

where θ̂ is the unrestricted estimate and θ̃ is the restricted estimate under the null. Let α be the significance level of the test and q_{1-α} denote the (1 - α)-th quantile of the chi-square distribution with one degree of freedom. Then, the 100(1 - α)% confidence interval for θ is given by the set of values satisfying DM_SMD ≤ q_{1-α}, i.e., C_{1-α}(θ) = {θ ∈ Θ : DM_SMD ≤ q_{1-α}}. The endpoints of the confidence interval are obtained as

θ_L = inf{θ ∈ Θ : Pr(DM_SMD ≤ q_{1-α} | H0) ≥ 1 - α},
θ_U = sup{θ ∈ Θ : Pr(DM_SMD ≤ q_{1-α} | H0) ≥ 1 - α}.
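In code, the inversion step reduces to a one-dimensional search over a grid of null values. The sketch below is a generic illustration of test inversion, not the paper's implementation: dm_stat is a hypothetical callable standing in for the DM statistic, and the quadratic example is a toy.

```python
def invert_dm(dm_stat, grid, q=2.7055):
    """Collect grid values not rejected at the chi-square(1) critical value q
    (2.7055 is the 90% quantile, i.e. a 10% test / 90% confidence interval)."""
    kept = [th for th in grid if dm_stat(th) <= q]
    return (min(kept), max(kept)) if kept else None

# toy example: a quadratic 'statistic' with its minimum at theta = 0.7
ci = invert_dm(lambda th: 100 * (th - 0.7) ** 2, [i / 100 for i in range(301)])
```

Because the grid can extend past one, the resulting interval can cover invertible and non-invertible values alike.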
This approach is very convenient since it provides information on the invertibility of the process. We
implement the DM test with ψ = ψ_SMD defined in (11) and Σ being the corresponding asymptotic
variance of ψ̂_SMD.

4.2 Monte Carlo Simulations for the SMD Estimator
This section uses simulations to assess the properties of the proposed SMD estimator. Section 4.2.1
evaluates the point estimates of an MA(1) model and Section 4.2.2 studies the estimated impulse
response functions of an ARMA(1, 1) model.
4.2.1 Parameter Estimation in MA(1) Model
We first study the finite-sample behavior of the proposed SMD estimator in invertible and non-
invertible MA(1) models with data generated from

y_t = e_t + θ e_{t-1},  e_t = σ ε_t,

where ε_t ~ iid(0, 1) is drawn from a GLD with zero excess kurtosis and a skewness parameter of 0.85.¹² In all simulation designs, σ = 1 and θ takes the values 0.5, 0.7, 1, 1.5, and 2.¹³ The
sample sizes are T = 1000 and 2000 and the number of Monte Carlo replications is 1000. We also investigate the properties of the SMD estimator for smaller sample sizes (T = 500) and other
asymmetric (chi-squared and exponential) distributions.

12 Results for a larger range of values of the skewness parameter of the GLD are not reported to conserve space but are available from the authors upon request.
13 The results are invariant to the choice of σ.
The proposed SMD estimator is implemented as follows. We use an error simulator based on the
generalized lambda error distribution. For the auxiliary model (10a), we use p = 4 for the lag order
of the AR polynomial. It appears that larger values of S (the number of simulated sample paths
of length T) tend to smooth the objective function, which improves the identification of the MA
parameter. As a result, we set S = 20, although S > 20 seems to offer even further improvement, especially for small T, but at the cost of increased computational time. In addition to the estimate
of θ, the SMD also delivers estimates of σ, λ3 and λ4. From the estimates of λ3 and λ4, we construct
estimates of α3 and α4 as (see Ramberg and Schmeiser (1975))

α3 = (C - 3AB + 2A³)/λ2³,
α4 = (D - 4AC + 6A²B - 3A⁴)/λ2⁴,

where A = 1/(1+λ3) - 1/(1+λ4), B = 1/(1+2λ3) + 1/(1+2λ4) - 2 Beta(1+λ3, 1+λ4), λ2 = √(B - A²), C = 1/(1+3λ3) - 3 Beta(1+2λ3, 1+λ4) + 3 Beta(1+λ3, 1+2λ4) - 1/(1+3λ4), D = 1/(1+4λ3) - 4 Beta(1+3λ3, 1+λ4) + 6 Beta(1+2λ3, 1+2λ4) - 4 Beta(1+λ3, 1+3λ4) + 1/(1+4λ4), and Beta(·, ·) denotes the beta function.

As is true of all non-linear estimation problems, the numerical optimization problem must
take into account the possibility of local minima. Once non-invertibility is allowed, we need to
additionally allow for the possibility of multiple equilibria. Thus, the estimation always considers
two sets of initial values. Specifically, we draw two starting values for θ: one from a uniform
distribution on (0, 1) and one from a uniform distribution on (1, 2), with the starting value for σ
set equal to √(σ̂²_y/(1 + θ²)) for each of the starting values of θ. The starting values for the shape
parameters of the GLD, λ3 and λ4, are set equal to those of the standard normal distribution (with
α3 = 0 and α4 = 3). In this respect, the starting values of θ, σ, λ3 and λ4 contain little prior
knowledge of the true parameters.
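The Ramberg-Schmeiser mapping from (λ3, λ4) to (α3, α4) given above is straightforward to code. The sketch below is our own illustration; it assumes the unit-variance normalization λ2 = √(B - A²) and builds the beta function from log-gammas.

```python
from math import exp, lgamma, sqrt

def beta_fn(a, b):
    """Beta(a, b) computed via log-gammas for numerical stability."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))

def gld_skew_kurt(lam3, lam4):
    """Skewness alpha3 and kurtosis alpha4 implied by the GLD shape
    parameters, under the unit-variance normalization lam2 = sqrt(B - A^2)."""
    A = 1.0 / (1 + lam3) - 1.0 / (1 + lam4)
    B = (1.0 / (1 + 2 * lam3) + 1.0 / (1 + 2 * lam4)
         - 2 * beta_fn(1 + lam3, 1 + lam4))
    C = (1.0 / (1 + 3 * lam3) - 3 * beta_fn(1 + 2 * lam3, 1 + lam4)
         + 3 * beta_fn(1 + lam3, 1 + 2 * lam4) - 1.0 / (1 + 3 * lam4))
    D = (1.0 / (1 + 4 * lam3) - 4 * beta_fn(1 + 3 * lam3, 1 + lam4)
         + 6 * beta_fn(1 + 2 * lam3, 1 + 2 * lam4)
         - 4 * beta_fn(1 + lam3, 1 + 3 * lam4) + 1.0 / (1 + 4 * lam4))
    lam2 = sqrt(B - A * A)
    alpha3 = (C - 3 * A * B + 2 * A ** 3) / lam2 ** 3
    alpha4 = (D - 4 * A * C + 6 * A * A * B - 3 * A ** 4) / lam2 ** 4
    return alpha3, alpha4

# symmetric shape parameters (a normal-like case): alpha3 = 0, alpha4 near 3
a3, a4 = gld_skew_kurt(0.1349, 0.1349)
```

The symmetric case returns zero skewness exactly; asymmetric (λ3, λ4) pairs deliver the skewed errors used in the simulations.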
Figure 4 illustrates how identifiability depends on skewness by plotting the log of the objective
function for the SMD estimator, averaged over 1000 Monte Carlo replications of the MA(1) model,
for different values of θ and σ. The true values of θ and σ are 0.7 and 1, respectively, and the errors
are generated from a GLD with zero excess kurtosis and four values of the skewness parameter: 0,
0.35, 0.6 and 0.85.¹⁴ The first case (skewness = 0) corresponds to lack of identification and there
are two pronounced local minima at θ and 1/θ. As the skewness of the error distribution increases, the second local optimum at 1/θ flattens out and almost completely disappears when the error
distribution is highly asymmetric.
14 In evaluating the objective function, the values of the lambda parameters in the generalized lambda distribution are set equal to their true values.
Table 2 reports the mean and median estimates of θ, the average asymptotic standard error
of the SMD estimator of θ, and the standard deviation of the estimates for which identification is
achieved. In addition, Table 2 presents the empirical probability that the SMD estimate of θ is
greater than one, which provides information on how often the identification of the true parameter
fails. The last column of Table 2 reports the rejection rate of the DM test of H0: θ = θ0 at the 10% significance level. The main findings can be summarized as follows. The SMD estimator of θ appears
to be median unbiased for all values of θ, even for small T. While there is a positive probability
that the SMD estimator will converge to 1/θ instead of θ (especially when θ is in the non-invertible
region), this probability is fairly small and it disappears completely for T = 2000. Interestingly,
in terms of precision, the SMD estimator appears to be more efficient even than the infeasible
estimator in Table 1 for values of θ in the invertible region (see also Gorodnichenko et al. (2012)
for a similar result in the context of autoregressive models). The asymptotic variance expression
in Proposition 2 tends to provide a very good approximation of the finite-sample variation of the
SMD estimates. Finally, the rejection rates of the hypothesis tests based on the SMD estimator
are very close to the nominal level, which suggests that asymptotic normality provides a good
approximation of the distribution of the SMD estimator over the whole parameter space.
Several remarks regarding the efficiency properties of the SMD estimator are in order. First, the
SMD estimator tends to exhibit substantially smaller variability than the CMD estimator in Table
1 (case α3 = 0.85). These efficiency gains are expected since the instrumental model based on the
AR approximation encompasses the dependence structure of the MA(1) model as the lag order p
increases to infinity. What is somewhat surprising is the magnitude of the efficiency gains. Second,
it is instructive to compare the sampling variability of the SMD estimator to that of the ML estimator, which provides the efficiency bound for any estimator in the invertibility region. Recall that the
variance of the Gaussian ML estimator is (1 - θ²)/T which, due to the invertibility restriction, shrinks to zero as the MA parameter approaches one. In contrast, our proposed SMD estimator
does not impose invertibility and its variance does not exhibit this type of behavior. For this
reason, a fair comparison between the SMD and ML estimators would involve values of θ that are
far away from the invertibility boundary, such as θ = 0.5. The sample dispersion measures for the
SMD estimator of θ0 = 0.5 in Table 2 are very close to, and even lower than, the asymptotic
standard error of the MLE, which is 0.0274 and 0.0194 for T = 1000 and T = 2000, respectively. We
should note that similar results are reported by Gourieroux et al. (1993) for the simulation-based
(indirect inference) estimator of the invertible MA(1) model.
To gain some understanding of the source of the excellent properties of the SMD estimator
of θ, Table 3 reports the mean and median SMD estimates of the nuisance parameters σ, α3 and
α4 along with their Monte Carlo standard deviations. The estimate of σ is practically unbiased
and very precise. Importantly, the skewness parameter, albeit slightly downward biased, is very
precisely estimated (its standard deviation is smaller than the standard deviation of the CMD
estimator in Table 1). This points to the possibility that the excellent identification and estimation
properties of the SMD estimator of θ are likely due to the built-in bias correction and improved efficiency in the estimation of the higher order moments of the error process.
Finally, Table 4 presents results for the SMD estimator of θ and σ for a smaller sample size
(T = 500) and two other asymmetric error distributions: a chi-squared distribution with 6 degrees of
freedom (with skewness and excess kurtosis parameters of 1.15 and 2, respectively) and an exponential
distribution with a scale parameter of one (with skewness and excess kurtosis parameters of 2 and
6, respectively). The errors are recentered and rescaled to have a mean of zero and variance one.
Note that the simulator for the SMD estimator is still based on the GLD family and, hence,
is misspecified. The results in Table 4 are in line with the previous results for larger sample
sizes and GLD errors. The SMD estimates of θ and σ appear to be almost unbiased and exhibit
small variability. With the smaller sample size, the probability that the SMD estimate of θ is
not identified increases up to 3.8% in some cases but, overall, the finite-sample properties of our
proposed estimator remain quite attractive.
4.2.2 Impulse Response Function Estimation of an All-Pass ARMA(1, 1) Model

One of the main advantages of SMD is its flexibility to accommodate more general models and
dependence structures. To illustrate this, we consider the all-pass ARMA(1, 1) model

y_t - φ y_{t-1} = e_t - (1/φ) e_{t-1} for |φ| < 1,    (12)

where e_t is a standard exponential random variable with a scale parameter equal to one, which is
recentered and rescaled to have mean zero and variance 1. As discussed in Davis (2010), this process
possesses some interesting properties. First, the process in (12) is uncorrelated but it exhibits
higher order dependence (conditional heteroskedasticity). Furthermore, while the process y_t is
causal, it has a non-invertible MA component. If one imposes invertibility on the MA component
(or replaces the MA parameter 1/φ by φ and the unit variance of the error term by (1/φ)²),
the process has cancelling roots in the AR and MA polynomials and reduces to an iid random
sequence. Therefore, using estimators that impose invertibility would result in a flat impulse
response function, while the true impulse response function for horizon j ≥ 1 is given by

∂y_t/∂e_{t-j} = φ^{j-1}(φ - 1/φ).
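These properties are easy to check numerically. The sketch below is our own illustration: it simulates (12) with recentered exponential errors and verifies that the level of the series is (nearly) uncorrelated while its square is not; the sample size, seed and burn-in are arbitrary choices.

```python
import numpy as np

def simulate_allpass(T, phi, seed=None, burn=200):
    """Simulate the all-pass ARMA(1,1) in (12):
    y_t - phi*y_{t-1} = e_t - (1/phi)*e_{t-1},
    with recentered standard exponential errors (mean 0, variance 1)."""
    rng = np.random.default_rng(seed)
    e = rng.exponential(1.0, T + burn) - 1.0
    y = np.zeros(T + burn)
    for t in range(1, T + burn):
        y[t] = phi * y[t - 1] + e[t] - e[t - 1] / phi
    return y[burn:]

def true_irf(phi, horizons):
    """Impulse responses dy_t/de_{t-j} = phi^(j-1)*(phi - 1/phi), j >= 1."""
    return [phi ** (j - 1) * (phi - 1.0 / phi) for j in horizons]

def lag1_corr(x):
    """Sample first-order autocorrelation."""
    x = x - x.mean()
    return float((x[1:] * x[:-1]).mean() / (x * x).mean())

y = simulate_allpass(20000, 0.5, seed=3)
```

With φ = 0.5 the level of y is serially uncorrelated but y² has clearly positive first-order autocorrelation, which is the conditional heteroskedasticity described in the text.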
We investigate the SMD and Gaussian quasi-ML estimates of the impulse response functions
(IRFs) for φ = 0.5 and -0.5 (T = 500). The SMD estimator uses the same auxiliary model as in the previous section. The median IRF estimates obtained from 1,000 Monte Carlo replications are
plotted in Figure 5 and Figure 6, respectively. The SMD-based IRF estimates are median unbiased
and trace closely the shape of the true impulse response. In sharp contrast, the Gaussian quasi-MLE fails to identify the AR and MA parameters and produces a flat IRF around zero.
5 Empirical Application: Commodity Prices
Non-invertibility can be consistent with economic theory. For example, suppose y_t = E_t Σ_{s=0}^∞ β^s x_{t+s}
is the present value of x_t = e_t + ϑ e_{t-1}. The solution y_t = (1 + βϑ)e_t + ϑ e_{t-1} = h(L)e_t implies that
the root of h(z) is -(1 + βϑ)/ϑ, which can be on or inside the unit circle even if |ϑ| < 1. If there is no discounting and β = 1, y_t has a moving average unit root when ϑ = -0.5 and h(L) is non-invertible
in the past whenever ϑ < -0.5.¹⁵
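The root condition is easy to verify directly. The helper below is our own illustration, not from the paper; it returns the root of h(z) = (1 + βϑ) + ϑz and confirms the unit-root and non-invertibility thresholds at β = 1.

```python
def ma_root(beta, vartheta):
    """Root of h(z) = (1 + beta*vartheta) + vartheta*z, i.e. of the MA
    polynomial of y_t = (1 + beta*vartheta)e_t + vartheta*e_{t-1}."""
    return -(1.0 + beta * vartheta) / vartheta

# with no discounting (beta = 1):
unit = ma_root(1.0, -0.5)     # on the unit circle: MA unit root
inside = ma_root(1.0, -0.7)   # |root| < 1: non-invertible
outside = ma_root(1.0, -0.3)  # |root| > 1: invertible
```

So a perfectly well-behaved fundamental (|ϑ| < 1) can generate a non-invertible observable once the present-value filter is applied.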
Present value models are used to analyze variables with a forward looking component, including
commodity prices, whose dynamics have implications for monetary policy and asset allocation. It is a
stylized fact that commodity price changes are almost uncorrelated (or very weakly autocorrelated)
over time and exhibit conditional heteroskedasticity. These two characteristics are also properties
of the all-pass models considered in the previous section, and it is interesting to see if commodity
price changes are driven by a non-invertible MA component. To see that this is also theoretically
plausible, we revisit the present value model of commodity price determination by Pindyck (1993).
Let s_t and f_t denote the spot and futures commodity price for delivery at time t + 1, and cy_t be the
marginal convenience yield (net of insurance and storage costs) over the period. The no-arbitrage
condition implies that

E_t(cy_t) = (1 + i)s_t - f_t,    (13)

where i is the risk-free rate. Let E_t(s_{t+1}) = f_t + rp_t, where rp_t is a time-varying risk premium,
and assume that rp_t = (ρ - i)s_t, where ρ denotes a risk-adjusted discount rate for the commodity. Substituting f_t = E_t(s_{t+1}) - (ρ - i)s_t into (13) yields

E_t(s_{t+1}) = (1 + ρ)s_t - E_t(cy_t).    (14)

The stationary (no-bubble) solution to the expectational difference equation (14) is given by

s_t = (1 + ρ)^{-1} Σ_{i=0}^∞ (1 + ρ)^{-i} E_t(cy_{t+i}).
15 If the moving average polynomial ϑ(L) is of infinite order, as would be the case for causal autoregressive
processes, it is still possible for the roots of h(L) = (Lϑ(L) - βϑ(β))/(L - β) to be inside the unit disk.
The presence of an invertible MA component in the convenience yield would induce a (possibly)
non-invertible MA component in the dynamics of the observable commodity prices. Given the
possible nonstationarity in commodity prices, we estimate an ARMA(1, 1) model of commodity
(log) price changes

Δs_t = φ Δs_{t-1} + e_t + θ e_{t-1}

using the Gaussian MLE and the proposed SMD estimator.

The data for the empirical analysis consist of commodity prices of the nearest futures contract
from the Commodity Research Bureau and cover the period March 1983 - July 2008. The ARMA(1,
1) model is estimated at monthly frequency by taking the last daily price in the month as the
corresponding monthly observation. We use 22 commodity prices from 6 commodity groups: energy
(crude oil, heating oil), grains and oilseeds (soybean oil, corn, oats, soybeans, wheat, canola), metals
(platinum, copper, gold, silver, palladium), industrials (cotton, lumber), livestock and meats (feeder
cattle, live cattle, pork bellies, lean hogs) and foodstuffs (cocoa, sugar, coffee).

Table 5 presents the estimation results. Practically all of the commodity price changes exhibit
some form of non-Gaussianity, which is necessary for identifying possible non-invertible MA components. The Gaussian ML tends to produce estimates of φ and θ of similar magnitude and opposite
sign, suggesting the presence of cancelling roots and a lack of identifiability. However, this lack of identification could be an artifact of imposing invertibility on the MA root, as argued in the previous
section. Indeed, when this restriction is relaxed within the SMD procedure, most of the commodity
price changes (except for gold and live cattle) appear to be driven by a non-invertible MA component. Another interesting observation is that the estimated AR and MA parameters are of similar
magnitude and sign across the different commodities, which seems to suggest that the parameters
are well identified within the SMD procedure. This is not the case for the Gaussian MLE, where the
parameter estimates span a wide range of values, which possibly arises from the non-identifiability
of the parameters. Overall, there is strong evidence in support of non-invertibility in commodity price changes, which has potentially important implications for impulse response analysis and
forecasting.¹⁶
6 Conclusions
This paper proposes classical and simulation-based minimum distance estimation of possibly non-
invertible MA models with non-Gaussian errors. The classical minimum distance estimator is
16 Non-invertibility is expected to arise in other variables, such as stock prices, that are believed to be determined by the present value model. In unreported results for the period February 1952 - August 2012, fitting an ARMA(1, 1) to monthly returns on the S&P 500 index (with sample skewness of -0.68 and sample kurtosis of 5.44) produced SMD (ML) estimates of 0.684 (-0.726) and -1.394 (0.791) for the AR and MA parameters, respectively.
developed and analyzed for the MA(1) model with asymmetric errors. The identification of the
structural parameters is achieved by exploiting the non-Gaussianity of the process through third
order cumulants. This type of identification also removes the boundary problem at the unit circle,
which gives rise to the pile-up probability and non-standard asymptotics of the Gaussian maximum
likelihood estimator. As a consequence, the proposed classical minimum distance estimator is root-T consistent and asymptotically normal over the whole parameter range, provided that the
non-Gaussianity in the data is sufficiently large to ensure identification.

To accommodate more general models with analytically intractable binding functions, we develop a simulation estimator based on auxiliary regressions that incorporate information from the
higher order cumulants of the data. The efficiency of the estimator is controlled by the ability of the
auxiliary model to approximate the true data generating process. Our proposed simulated minimum distance estimator is semi-parametric in the sense that it uses a possibly misspecified error
simulator with a flexible functional form that approximates a large class of distributions with non-
Gaussian features. Particular attention is paid to the accurate estimation of the shape parameters
of the error distribution, which play a critical role in identifying the structural parameters.
References
Anderson, T. and Takemura, A. 1986, Why Do Noninvertible Estimated Moving Averages Occur?, Journal of Time Series Analysis 7(4), 235-254.

Andrews, B., Davis, R. and Breidt, F. 2006, Maximum Likelihood Estimation of All-Pass Time Series Models, Journal of Multivariate Analysis 97, 1638-1659.

Andrews, B., Davis, R. and Breidt, F. 2007, Rank-Based Estimation of All-Pass Time Series Models, Annals of Statistics 35, 844-869.

Brockwell, P. and Davis, R. 1991, Time Series: Theory and Methods, 2nd edn, Springer-Verlag, New York.

Czellar, V. and Zivot, E. 2008, Improved Small Sample Inference for Efficient Method of Moments and Indirect Inference Estimators, University of Washington.

Davis, R. 2010, All-Pass Processes with Applications to Finance, Plenary Talk at the 7th International Iranian Workshop on Stochastic Processes.

Davis, R. and Dunsmuir, W. 1996, Maximum Likelihood Estimation for MA(1) Processes with a Root on the Unit Circle, Econometric Theory 12, 1-20.

Davis, R. and Song, L. 2011, Unit Roots in Moving Averages Beyond First Order, Annals of Statistics 39(6), 3062-3091.

Dovonon, P. and Renault, E. 2011, Testing for Common GARCH Factors, MPRA Paper 40244.

Duffie, D. and Singleton, K. 1993, Simulated Moments Estimation of Markov Models of Asset Prices, Econometrica 61, 929-952.

Fernández-Villaverde, J., Rubio-Ramírez, J., Sargent, T. and Watson, M. 2007, ABCs (and Ds) of Understanding VARs, American Economic Review 97(3), 1021-1026.

Gallant, R. and Tauchen, G. 1996, Which Moments to Match?, Econometric Theory 12, 657-681.

Ghysels, E., Khalaf, L. and Vodounou, C. 2003, Simulation Based Inference in Moving Average Models, Annales d'Économie et de Statistique 69, 85-99.

Giannakis, G. and Swami, A. 1990, On Estimating Noncausal Nonminimum Phase ARMA Models of Non-Gaussian Processes, IEEE Transactions on Acoustics, Speech, and Signal Processing 38, 478-495.

Gorodnichenko, Y., Mikusheva, A. and Ng, S. 2012, Estimators for Persistent and Possibly Non-Stationary Data with Classical Properties, Econometric Theory 28, 1003-1036.

Gospodinov, N. 2002, Bootstrap Based Inference in Models with a Nearly Noninvertible Moving Average Component, Journal of Business and Economic Statistics 20, 254-268.

Gourieroux, C., Monfort, A. and Renault, E. 1993, Indirect Inference, Journal of Applied Econometrics 8, S85-S118.

Hall, A. 2005, Generalized Method of Moments, Advanced Texts in Econometrics, Oxford University Press, Oxford.
Hansen, L. and Sargent, T. 1991, Two Difficulties in Interpreting Vector Autoregressions, in L. P. Hansen and T. J. Sargent (eds), Rational Expectations Econometrics, Westview, London, pp. 77-119.

Harris, D. 1999, GMM Estimation of Time Series Models, in L. Mátyás (ed.), Generalized Method of Moments Estimation, Themes in Modern Econometrics, Cambridge University Press, Cambridge, U.K., pp. 149-169.

Huang, J. and Pawitan, Y. 2000, Quasi-Likelihood Estimation of Non-Invertible Moving Average Processes, Scandinavian Journal of Statistics 27, 689-702.

Komunjer, I. 2012, Global Identification in Nonlinear Models with Moment Restrictions, Econometric Theory, forthcoming.

Komunjer, I. and Ng, S. 2011, Dynamic Identification of Dynamic Stochastic General Equilibrium Models, Econometrica 79(6), 1995-2032.

Lii, K. and Rosenblatt, M. 1982, Deconvolution and Estimation of Transfer Function Phase Coefficients for Non-Gaussian Linear Processes, Annals of Statistics 10, 1195-1208.

Lii, K. and Rosenblatt, M. 1992, An Approximate Maximum Likelihood Estimation of Non-Gaussian Non-Minimum Phase Moving Average Processes, Journal of Multivariate Analysis 43, 272-299.

Lippi, M. and Reichlin, L. 1993, The Dynamic Effects of Aggregate Demand and Supply Disturbances: Comment, American Economic Review 83, 644-652.

Meitz, M. and Saikkonen, P. 2011, Maximum Likelihood Estimation of a Non-Invertible ARMA Model with Autoregressive Conditional Heteroskedasticity, mimeo, University of Helsinki.

Mendel, J. 1991, Tutorial on Higher Order Statistics in Signal Processing and System Theory: Theoretical Results and Some Applications, Proceedings of the IEEE 79(3), 278-305.

Michaelides, A. and Ng, S. 2000, Estimating the Rational Expectations Model of Speculative Storage: A Monte Carlo Comparison of Three Simulation Estimators, Journal of Econometrics 96(2), 231-266.

Newey, W. and McFadden, D. 1994, Large Sample Estimation and Hypothesis Testing, Handbook of Econometrics, Vol. 4, Chapter 36, North Holland.

Phillips, P. 2012, Folklore Theorems, Implicit Maps, and Indirect Inference, Econometrica 80(1), 425-454.

Pindyck, R. 1993, The Present Value Model of Rational Commodity Pricing, Economic Journal 103, 511-530.

Ramberg, J. and Schmeiser, B. 1975, An Approximate Method for Generating Asymmetric Random Variables, Communications of the ACM 17(2), 78-82.

Ramsey, J. and Montenegro, A. 1992, Identification and Estimation of Non-invertible Non-Gaussian MA(q) Processes, Journal of Econometrics 54, 301-320.

Rothenberg, T. 1971, Identification in Parametric Models, Econometrica 39(3), 577-591.

Ruud, P. 2000, An Introduction to Classical Econometric Theory, Oxford University Press, New York.
Sargan, D. and Bhargava, A. 1983, Maximum Likelihood Estimation of Regression Models with First Order Moving Average Errors When the Root Lies on the Unit Circle, Econometrica 51, 799-820.

Sargan, J. D. 1983, Identification and Lack of Identification, Econometrica 51(6), 1605-1633.

Tugnait, J. 1986, Identification of Non-Minimum Phase Linear Stochastic Systems, Automatica 22, 457-464.
Table 1: CMD estimates from MA(1) model with possibly asymmetric errors.

             CMD estimator                 just-identified estimator     infeasible estimator
  θ0       θ̂            α̂3              θ̂            α̂3              θ̂            α̂3
        mean   std.   mean   std.     mean   std.   mean   std.     mean   std.   mean   std.

 α3 = 0
  0.5   1.459  0.754   0.001  0.126   1.623  0.675  -0.020  0.360   0.501  0.048  -0.019  0.516
  0.7   1.139  0.394  -0.003  0.168   1.271  0.342  -0.012  0.287   0.697  0.056  -0.014  0.374
  1.0   1.035  0.223  -0.006  0.189   1.082  0.200  -0.011  0.294   0.995  0.052  -0.010  0.296
  1.5   1.146  0.443  -0.006  0.159   1.342  0.365  -0.015  0.293   1.496  0.047  -0.009  0.260
  2.0   1.417  0.764  -0.002  0.126   1.753  0.594  -0.025  0.335   1.996  0.051  -0.008  0.257

 α3 = 0.35
  0.5   0.648  0.446   0.327  0.141   1.643  0.666   0.203  0.351   0.501  0.048   0.323  0.500
  0.7   0.801  0.266   0.342  0.178   1.290  0.329   0.249  0.282   0.696  0.056   0.329  0.363
  1.0   1.026  0.204   0.331  0.178   1.097  0.197   0.330  0.287   0.994  0.052   0.329  0.288
  1.5   1.432  0.305   0.343  0.167   1.383  0.334   0.363  0.284   1.496  0.047   0.331  0.253
  2.0   1.925  0.429   0.336  0.130   1.793  0.557   0.393  0.345   1.996  0.052   0.331  0.250

 α3 = 0.6
  0.5   0.506  0.108   0.593  0.102   1.614  0.677   0.355  0.357   0.501  0.048   0.561  0.471
  0.7   0.702  0.105   0.602  0.144   1.305  0.320   0.431  0.281   0.697  0.056   0.569  0.342
  1.0   1.010  0.153   0.572  0.174   1.104  0.194   0.570  0.280   0.994  0.052   0.568  0.273
  1.5   1.514  0.201   0.601  0.139   1.369  0.345   0.635  0.298   1.496  0.047   0.571  0.238
  2.0   2.019  0.234   0.594  0.099   1.741  0.604   0.702  0.408   1.996  0.052   0.572  0.236

 α3 = 0.85
  0.5   0.499  0.055   0.825  0.078   1.619  0.674   0.493  0.360   0.500  0.048   0.790  0.433
  0.7   0.692  0.083   0.826  0.133   1.315  0.313   0.603  0.281   0.696  0.056   0.796  0.314
  1.0   1.001  0.095   0.804  0.161   1.105  0.194   0.795  0.273   0.994  0.052   0.793  0.251
  1.5   1.527  0.183   0.828  0.127   1.337  0.367   0.901  0.320   1.495  0.047   0.797  0.219
  2.0   2.020  0.216   0.824  0.083   1.648  0.667   1.026  0.509   1.995  0.052   0.798  0.217

Notes: The table reports the mean and the standard deviation (std.) of the CMD estimates of θ and α3 from the MA(1) model y_t = e_t + θe_{t-1}, e_t = σε_t and ε_t ~ iid(0, 1) generated from a generalized lambda distribution with a skewness parameter α3 and zero excess kurtosis. The sample size is T = 1000, the number of Monte Carlo replications is 5000 and σ = 1. The CMD estimator is the over-identified classical minimum distance estimator of (θ, σ, α3)′ with a vector of auxiliary parameters (E(y_t y_{t-1}), E(y^2_t), E(y^2_t y_{t-1}), E(y^3_t), E(y_t y^2_{t-1}))′; the just-identified estimator is the classical minimum distance estimator of (θ, σ, α3)′ with auxiliary parameters (E(y_t y_{t-1}), E(y^2_t), E(y^2_t y_{t-1}))′; and the infeasible estimator is the classical minimum distance estimator of (θ, α3)′ with σ = 1 assumed known and auxiliary parameters (E(y_t y_{t-1}), E(y^2_t), E(y^2_t y_{t-1}))′.
Table 2: SMD estimates of θ from MA(1) model with asymmetric errors

 true θ0   mean    median   Pr(θ̂_SMD > 1)   s.e.    std.    DM test
 T = 1000
  0.5      0.500   0.500    0.000            0.026   0.027   0.112
  0.7      0.701   0.700    0.000            0.030   0.031   0.112
  1.0      0.969   0.978    0.381            0.074   0.074   0.104
  1.5      1.499   1.499    0.997            0.067   0.070   0.112
  2.0      2.007   2.004    0.997            0.113   0.119   0.114
 T = 2000
  0.5      0.500   0.500    0.000            0.019   0.019   0.111
  0.7      0.701   0.700    0.000            0.022   0.022   0.101
  1.0      0.982   0.990    0.433            0.056   0.058   0.101
  1.5      1.500   1.499    1.000            0.048   0.050   0.117
  2.0      2.004   2.002    1.000            0.080   0.083   0.108

Notes: The table reports summary statistics of the simulated minimum distance (SMD) estimates of θ from the MA(1) model y_t = e_t + θe_{t-1}, e_t = σε_t and ε_t ~ iid(0, 1) generated from a generalized lambda distribution with a skewness parameter α3 = 0.85 and zero excess kurtosis. The sample sizes are T = 1000 and 2000, the number of Monte Carlo replications is 1000 and σ = 1. Pr(θ̂_SMD > 1) signifies the probability (over Monte Carlo replications) that θ̂_SMD > 1; s.e. is the average standard error computed from consistent estimates of the relevant asymptotic variance expressions, and std. denotes the Monte Carlo standard deviation of θ̂_SMD. The last column of the table reports the rejection rates of the DM test of H0: θ = θ0 at the 10% significance level.
Table 3: SMD estimates of σ, α3 and α4 from MA(1) model with asymmetric errors

 true θ0       σ̂_SMD                      α̂3,SMD                     α̂4,SMD
            mean    median  std.       mean    median  std.       mean    median  std.
 T = 1000
  0.5       0.994   0.994   0.022      0.817   0.816   0.062      2.966   2.937   0.242
  0.7       0.994   0.994   0.022      0.817   0.815   0.068      2.956   2.942   0.263
  1.0       1.000   0.997   0.040      0.803   0.805   0.100      2.982   2.961   0.298
  1.5       1.004   0.997   0.049      0.817   0.817   0.100      2.993   2.933   0.503
  2.0       1.008   0.998   0.062      0.816   0.816   0.086      2.993   2.910   0.389
 T = 2000
  0.5       0.997   0.997   0.015      0.824   0.824   0.044      2.988   2.978   0.153
  0.7       0.997   0.997   0.015      0.824   0.822   0.049      2.982   2.972   0.189
  1.0       1.000   0.999   0.031      0.816   0.818   0.074      2.988   2.966   0.244
  1.5       1.001   1.001   0.034      0.823   0.820   0.069      2.975   2.936   0.359
  2.0       1.008   0.999   0.043      0.822   0.823   0.059      2.977   2.911   0.281

Notes: The table reports summary statistics of the simulated minimum distance (SMD) estimates of σ, α3 and α4 from the MA(1) model y_t = e_t + θe_{t-1}, e_t = σε_t and ε_t ~ iid(0, 1) generated from a generalized lambda distribution with a skewness parameter α3 = 0.85 and zero excess kurtosis (α4 = 3). The sample sizes are T = 1000 and 2000, the number of Monte Carlo replications is 1000 and σ = 1. std. denotes the Monte Carlo standard deviation of the corresponding estimate.
Table 4: SMD estimates of θ and σ from MA(1) model with chi-squared and exponential errors

error                    θ̂SMD                                     σ̂SMD
distr.   true θ0   mean    median  Pr(θ̂SMD > 1)  std.      mean    median  std.
χ²(6)     0.5      0.548   0.504      0.027       0.046     0.955   0.968   0.045
χ²(6)     0.7      0.716   0.706      0.022       0.055     0.967   0.969   0.045
χ²(6)     1.0      0.979   0.982      0.425       0.098     0.983   0.985   0.064
χ²(6)     1.5      1.458   1.478      0.962       0.105     1.009   0.996   0.077
χ²(6)     2.0      1.967   1.982      0.978       0.177     1.006   0.987   0.096
exp(1)    0.5      0.558   0.500      0.038       0.047     0.935   0.953   0.054
exp(1)    0.7      0.705   0.700      0.004       0.051     0.960   0.960   0.053
exp(1)    1.0      0.966   0.978      0.377       0.083     0.987   0.985   0.064
exp(1)    1.5      1.493   1.500      0.977       0.114     0.988   0.982   0.083
exp(1)    2.0      2.015   2.021      0.987       0.188     0.981   0.973   0.105

Notes: The table reports summary statistics of the simulated minimum distance (SMD) estimates of θ and σ from the MA(1) model yt = et + θet−1, et = σεt, where εt is either an iid chi-squared random variable with 6 degrees of freedom (χ²(6)) or an exponential random variable with a scale parameter equal to one (exp(1)). The errors εt are recentered and rescaled to have mean zero and variance 1. The sample size is T = 500, the number of Monte Carlo replications is 1000, and σ = 1. std. denotes the Monte Carlo standard deviation of the corresponding estimate.
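The data-generating process used in the table — an MA(1) driven by exponential errors that are recentered and rescaled to mean zero and unit variance — can be sketched as follows (the function name and seed are illustrative):

```python
import numpy as np

def simulate_ma1(T, theta, sigma, rng):
    """Simulate y_t = e_t + theta * e_{t-1} with e_t = sigma * eps_t,
    where eps_t are exponential(1) draws recentered and rescaled to
    have mean zero and variance one, as described in the table notes."""
    eps = rng.exponential(scale=1.0, size=T + 1)
    eps = (eps - eps.mean()) / eps.std()   # recenter and rescale
    e = sigma * eps
    return e[1:] + theta * e[:-1]          # T observations of the MA(1)

rng = np.random.default_rng(0)
y = simulate_ma1(500, theta=1.5, sigma=1.0, rng=rng)
```

The standardization leaves the skewness of the exponential draws intact, which is what provides the higher-order-cumulant information exploited by the SMD estimator.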
Table 5: SMD and Gaussian ML estimates of ARMA(1, 1) model for commodity prices

                      sample moments            Gaussian ML                        SMD
commodity           skewness   kurtosis     φ̂              θ̂               φ̂              θ̂
crude oil             0.019     5.259    0.403 (0.251)   0.538 (0.241)   0.617 (0.075)   1.575 (0.218)
heating oil           0.207     7.394    0.653 (0.239)   0.742 (0.217)   0.501 (0.096)   1.531 (0.076)
soybean oil           0.018     5.440    0.893 (0.106)   0.814 (0.124)   0.635 (0.065)   1.595 (0.108)
corn                  0.346     6.884    0.634 (0.562)   0.583 (0.588)   0.628 (0.060)   1.627 (0.170)
oats                  1.086     9.538    0.176 (0.512)   0.288 (0.497)   0.696 (0.057)   1.274 (0.081)
soybeans              0.724     7.259    0.881 (0.095)   0.797 (0.116)   0.515 (0.108)   1.886 (0.255)
wheat                 0.020     3.565    0.154 (0.693)   0.078 (0.697)   0.924 (0.412)   1.044 (0.456)
canola                0.887    11.255    0.408 (0.405)   0.500 (0.390)   0.402 (0.091)   1.691 (0.067)
platinum              0.246     4.963    0.046 (0.606)   0.146 (0.591)   0.638 (0.043)   1.428 (0.066)
copper                0.495     5.400    0.314 (1.524)   0.341 (1.498)   0.691 (0.084)   1.347 (0.126)
gold                  0.347     3.598    0.267 (0.614)   0.348 (0.586)   0.234 (0.241)   4.426 (4.804)
silver                0.014     3.979    0.171 (0.314)   0.335 (0.295)   0.098 (0.346)   0.266 (0.330)
palladium             0.229     5.189    0.290 (1.272)   0.253 (1.278)   0.680 (0.049)   1.443 (0.154)
cotton                2.040    18.857    0.903 (0.023)   1.000 (0.016)   0.883 (0.078)   1.048 (0.101)
lumber                0.139     3.498    0.449 (0.317)   0.329 (0.332)   0.712 (0.149)   1.166 (0.181)
cattle, feeder        0.498     5.912    0.046 (2.333)   0.020 (2.337)   0.623 (0.052)   1.386 (0.064)
cattle, live          0.462     5.079    0.698 (0.084)   0.891 (0.053)   0.670 (0.596)   0.955 (0.987)
pork bellies          0.503     5.198    0.841 (0.045)   0.961 (0.025)   0.390 (0.078)   1.673 (0.094)
hogs, lean            0.396     5.462    0.822 (0.046)   0.973 (0.029)   0.579 (0.065)   1.290 (0.090)
cocoa                 0.325     3.832    0.230 (0.281)   0.054 (0.287)   0.016 (0.433)   0.180 (0.410)
sugar                 1.127     6.920    0.933 (0.021)   1.000 (0.014)   0.363 (0.119)   2.506 (0.658)
coffee                0.374     4.685    0.338 (0.657)   0.263 (0.664)   0.669 (0.060)   1.375 (0.132)

Notes: The table reports the SMD and Gaussian quasi-ML estimates and standard errors (in parentheses) for the ARMA(1, 1) model Δst = φΔst−1 + et + θet−1, where et ~ iid(0, σ²). The first two columns report the sample skewness and kurtosis of Δst.
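The sample skewness and kurtosis reported in the first two columns correspond to the standard moment-based definitions; a minimal sketch (function name illustrative; the paper's exact moment estimator is not shown here and may differ in small-sample corrections):

```python
import numpy as np

def skew_kurt(x):
    """Sample skewness m3 / m2^(3/2) and raw (non-excess) kurtosis
    m4 / m2^2, where m_k are central sample moments."""
    d = x - x.mean()
    m2 = np.mean(d**2)
    return np.mean(d**3) / m2**1.5, np.mean(d**4) / m2**2

# For Gaussian data the skewness is near 0 and the kurtosis near 3,
# which is the benchmark against which the table's values are read.
x = np.random.default_rng(0).normal(size=100_000)
s, k = skew_kurt(x)
```

Values such as cotton's kurtosis of 18.857 are thus far into the heavy-tailed region, which is the non-Gaussian feature the SMD estimator exploits.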
Figure 1: Density functions of the standardized CMD estimator (t-statistic) of θ based on data (T = 1000) generated from an MA(1) model yt = et + θet−1 with θ = 1.5 and et ~ iid(0, 1). The errors are drawn from a generalized lambda distribution with zero excess kurtosis and a skewness parameter equal to 0, 0.35, 0.6 and 0.85.
Figure 2: Density functions of the standardized CMD estimator (t-statistic) of σ based on data (T = 1000) generated from an MA(1) model yt = et + θet−1 with θ = 1.5 and et ~ iid(0, 1). The errors are drawn from a generalized lambda distribution with zero excess kurtosis and a skewness parameter equal to 0, 0.35, 0.6 and 0.85.
Figure 3: Density functions of the standardized CMD estimator (t-statistic) of the skewness parameter (τ3) based on data (T = 1000) generated from an MA(1) model yt = et + θet−1 with θ = 1.5 and et ~ iid(0, 1). The errors are drawn from a generalized lambda distribution with zero excess kurtosis and a skewness parameter equal to 0, 0.35, 0.6 and 0.85.
Figure 4: Logarithm of the objective function of the SMD estimator of θ and σ based on data (T = 1000) generated from an MA(1) model yt = et + θet−1 with θ = 0.7 and et ~ iid(0, 1). The errors are drawn from a generalized lambda distribution with zero excess kurtosis and a skewness parameter equal to 0, 0.35, 0.6 and 0.85.
[Figure 5 plot: impulse response function plotted against horizon (1 to 10); lines show the true IRF and the SMD-based and ML-based median IRF estimates.]
Figure 5: SMD and Gaussian quasi-ML median estimates of the impulse response function from the ARMA(1, 1) model (1 + 0.5L)yt = (1 + 2L)et, where et is an exponential random variable with a scale parameter equal to one, recentered and rescaled to have mean zero and variance 1. The sample size of the simulated series is T = 500.
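The true IRF shown in the figure follows from the ARMA(1, 1) lag polynomials: writing the model as (1 − φL)yt = (1 + θL)et, the impulse responses satisfy ψ0 = 1, ψ1 = φ + θ and ψj = φψj−1 for j ≥ 2; for (1 + 0.5L)yt = (1 + 2L)et this means φ = −0.5 and θ = 2. A minimal sketch of the recursion (function name illustrative):

```python
def arma11_irf(phi, theta, horizons):
    """Impulse responses psi_j of (1 - phi*L) y_t = (1 + theta*L) e_t:
    psi_0 = 1, psi_1 = phi + theta, psi_j = phi * psi_{j-1} for j >= 2."""
    psi = [1.0, phi + theta]
    for _ in range(2, horizons):
        psi.append(phi * psi[-1])
    return psi[:horizons]

# Figure 5 model (1 + 0.5L) y_t = (1 + 2L) e_t, i.e. phi = -0.5, theta = 2:
irf = arma11_irf(-0.5, 2.0, 10)   # [1.0, 1.5, -0.75, 0.375, ...]
```

With a non-invertible MA root (|θ| > 1) the Gaussian quasi-ML estimator flips the root inside the unit circle, so its implied IRF diverges from the true one — the contrast the figure illustrates.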
[Figure 6 plot: impulse response function plotted against horizon (1 to 10); lines show the true IRF and the SMD-based and ML-based median IRF estimates.]
Figure 6: SMD and Gaussian quasi-ML median estimates of the impulse response function from the ARMA(1, 1) model (1 − 0.5L)yt = (1 − 2L)et, where et is an exponential random variable with a scale parameter equal to one, recentered and rescaled to have mean zero and variance 1. The sample size of the simulated series is T = 500.