Laplace Likelihood and LAD Estimation for Non-invertible MA(1)rdavis/lectures/Tokyo_06.pdf ·...
Transcript of Laplace Likelihood and LAD Estimation for Non-invertible MA(1)rdavis/lectures/Tokyo_06.pdf ·...
1Waseda 1/06
LaplaceLaplace Likelihood and LAD Estimation Likelihood and LAD Estimation for Nonfor Non--invertible MA(1)invertible MA(1)
Colorado State UniversityNational Tsing-Hua University
U. of California, San Diego
(http://www.stat.colostate.edu/~rdavis/lectures)
Richard A. Davis
F. Jay Breidt
Nan-Jung Hsu
Murray Rosenblatt
3Waseda 1/06
MA(1) unit root problem
MA(1): (world’s simplest time series model!)
Yt = Zt − θ Zt-1 , {Zt} ~ IID (0, σ2)
Properties:
• |θ| < 1 ⇒ (invertible)
• |θ| > 1 ⇒ (non-invertible)
• |θ| = 1 ⇒ and
⇒ (perfect interpolation)
• |θ| < 1 ⇒
MLE = maximum (Gaussian) likelihood, n = sample size
What if θ =1?
∑∞
=−θ=
0
j
jjtt YZ
∑∞
=+θ−=
1
j-
jjtt YZ
},...,,{sp },...,,{sp 211 ++− ∈∈ tttttt YYZYYZ
00}0 ,sp{P YYsYs=≠
)/)1(,(AN is ˆ 2 nmle θ−θθ
4Waseda 1/06
Why study MA(1) with a unit root?
a) differencing (to remove non-stationarity)
• linear trend model: Xt = a + bt + Zt .Yt = Xt − Xt-1 = b + Zt − Zt-1 ~ MA(1) with θ = 1.
• seasonal model: Xt = st + Zt , st seasonal component w/ period 12.Yt = Xt − Xt-12 = Zt − Zt-12 ~ MA(12) with θ = 1.
b) random walk + noise
Xt = Xt-1 + Ut (random walk signal)
Yt = Xt + Vt (random walk signal + noise)
ThenYt − Yt-1 = Ut + Vt − Vt-1 ~MA(1)
with θ=1 if and only if Var(Ut) = 0.
5Waseda 1/06
Identifiability and the Gaussian likelihoodIdentifiability
• |θ| > 1 ⇒ Yt = εt – θ−1 εt-1 , where {εt} ~ WN(0,θ2σ2).
• {εt} is IID if and only if {Zt} is Gaussian (Breidt and Davis `91)
• {εt} is a special case of an All-Pass Model (Breidt, Davis, Trindade `01, Andrews et al. `05a, `05b)
Gaussian Likelihood
LG(θ, σ2) = LG(1/θ, θ2σ2) ⇒ θ is only identifiably for |θ| ≤ 1.Notes:i) this implies LG(θ) = LG(1/θ) for the profile likelihood and θ = 1 is a
critical point,
ii) a pile-up effect ensues, i.e., even if θ < 1.
.0)1(L'G =
0)1ˆ( >=θP
6Waseda 1/06
Gaussian likelihood, non-Gaussian data
theta
Gau
ssia
n lik
elih
ood
0.5 1.0 1.5 2.0
1.20
1.25
1.30
1.35
1.40
100 observations from Yt = Zt − θ0 Zt-1 , {Zt} ~ IID, Laplace pdf
θ0 =1.0θ0 =.8 θ0 =1.25
7Waseda 1/06
Gaussian MLE for near-unit roots
Idea: build parameter normalization into the likelihood function.
Model: Yt = Zt - (1-β/n) Zt-1 , t =1,…,n.
β = n(1-θ), θ = 1- β/n, θ0 = 1- γ/n
Gaussian Likelihood:Ln(β) = ln (1- β/n) - ln (1), ln ( ) = profile log-like.
Theorem (Davis and Dunsmuir `96): Under θ0 = 1-γ / n,Ln(β) →d Zγ (β) on C[0,∞).
Results:
• argmax Ζγ(β)
• arglocalmax Ζγ(β)
• if γ = 0.
=β→θ− ˆ)ˆ1( LLn
6518.)0ˆP()1ˆP( ==β→=θ LL
=β→θ− mlemlen ˆ)ˆ1(
8Waseda 1/06
Extensions of MLE (Gaussian likelihood)
ii) heavy tails (Davis and Mikosch `98): {Zt} symmetric alpha stable
(SαS). Then the max Gaussian likelihood estimator has the same
normalizing rate, i.e.,
The pile-up decreases with increasing tail heaviness.
i) non-zero mean (Chen and Davis `00): same type of limit, except pile-up is more excessive.
This makes hypothesis testing easy!
Reject H0: θ = 1 if (size of test is .045)
LdLn β→θ− ˆ)ˆ1(
)0ˆP()1ˆP( =β→=θ LL
.955)1ˆ(P →=θmle
1ˆ <θmle
12Waseda 1/06
Laplace likelihood/LAD estimation If noise distribution is non-Gaussian, the MA(1) parameter θ is identifiable for all real values.
Q1. For MLE (non-Gaussian) does one have 1/n or 1/n1/2 asymptotics?
Q2. Is there a pile-up effect?
Look at this problem with non-Gaussian likelihood• Specifically, consider Laplace likelihood / Least AbsoluteDeviations for unit root only (not near-unit root) • Preliminary results only!
13Waseda 1/06
Non-Gaussian likelihood – Joint and ExactModel. Yt = Zt − θ Zt-1 , {Zt} ~ IID with median 0 and EZ4 < ∞. Initial variable.
Joint density: Let Yn=(Y1, . . . ,Yn), then
where the zt are solved
⎩⎨⎧
−≤θ
= ∑ =
n
t tn
init
YZZ
Z1
0
otherwise. ,,1||if ,
( ),1||1),...,,(),( }1|{|}1|{|10 >θ−
≤θ θ+= nn
initn zzzfzf y
forward by: zt = Yt + θ zt-1, t =1,…,n for |θ| ≤ 1 with z0 = zinit
backward by: zt-1 = θ−1(zt – Yt ), t =n,…,1 for |θ| > 1 with zn = zinit+ Y1+. . .+ Yn
Note: integrate out zinit to get Exact likelihood.
initinit dzzff nn ∫∞
∞−
= ),()( yy
14Waseda 1/06
Laplace likelihood examples
100 observations from Yt = Zt − θ0 Zt-1 , {Zt} ~ IID Laplace pdf
0.5
1
1.5
2
theta
-2
-1
0
1
2
z.init 0
0.2
0.4
0.6
0.8
1Z
0.5
1
1.5
2
theta
-2
-1
0
1
2
z.init
00.
20.
40.
60.
81
Z
θ0 =.8
θ0 =1.0
15Waseda 1/06
Laplace likelihood, Laplace noise
100 observations from Yt = Zt − θ0 Zt-1 , {Zt} ~ IID Laplace pdf
θ0 =1.0θ0 =.8 θ0 =1.25
Exact likelihood Joint likelihood at zmax(θ)
theta
Lapl
ace
likel
ihoo
d
0.5 1.0 1.5 2.0
0.0
0.2
0.4
0.6
0.8
1.0
theta
Lapl
ace
likel
ihoo
d
0.5 1.0 1.5 2.0
0.0
0.02
0.04
0.06
0.08
0.10
17Waseda 1/06
Laplace likelihood-LAD estimation
(Joint) Laplace log-likelihood. (σ = E|Z0| is a scale parameter)
Maximizing wrt σ, we obtain
so that maximizing L is equivalent to minimizing
∑=
>θ− θ−σ−σ+−=σθ
n
tt nznzL init
0}1|{|
1 1|)|(log||2log)1(),,(
∑=
+=σn
tt nz
0
)1/(||ˆ
⎪⎪⎩
⎪⎪⎨
⎧
θ
≤θ=θ
∑
∑
=
=n
tt
n
tt
initn
z
zzl
0
0
otherwise. |,|||
1, || if |,|),(
18Waseda 1/06
Joint Laplace likelihood — limit results
Result 1. Under the parameterizations, θ = 1 + β/n and zinit = Z0 + ασ/n1/2,
we have
on C(R2), where
for β ≤ 0, and
for β > 0.
),()),1(),((),( 01 αβ→−θσ=αβ − UZlzlU dnnn
init
dsetdSef
sdWetdSeU
ssts
ssts
21
0 0
)(
1
0 0
)(
)()0(
)()(),(
∫ ∫
∫ ∫
⎟⎟⎠
⎞⎜⎜⎝
⎛α+β+
⎟⎟⎠
⎞⎜⎜⎝
⎛α+β=αβ
β−β
−β−β
dsetdSef
sdWetdSeU
s
sts
s
sts
21
0
1)1()(
1
0
1)1()(
)()0(
)()(),(
∫ ∫
∫ ∫
⎟⎟⎠
⎞⎜⎜⎝
⎛α+β+
⎟⎟⎠
⎞⎜⎜⎝
⎛α+β−=αβ
−β−−β
+
−β−−β
19Waseda 1/06
From the limit,
it suggests (from the continuous mapping theorem?) that limit(optimum(criterion)) = optimum(limit(criterion)).
So for the Local optimizer of the Joint likelihood
where
Joint Laplace likelihood — limit results
( ) ( )( ) ( ) ˆ,ˆˆ,1ˆLJLJ0
initLJ
1LJ αβ→−σ−θ −
dZznn
.)(sign1)( ),(1)( d
][
0
][
0W(t)Z
ntWtSZ
ntS
nt
iind
nt
iin →=→= ∑∑
== σσ
).,( minarg(local) )ˆ,ˆ( LJLJ αβ=αβ U
The limits contain correlated Brownian Motions S(t) and W(t), obtained as the limits of the partial sum processes
),,(),( αβ→αβ UU dn
20Waseda 1/06
Might expect a similar result to hold for the Global optimizer of the Joint likelihood
where
Joint Laplace likelihood — limit results
( ) ( )( ) ( ) ˆ,ˆˆ,1ˆGJGJ0
initGJ
1GJ αβ→−σ−θ −
dZznn
).,( argmin )ˆ,ˆ( GJGJ αβ=αβ U
22Waseda 1/06
Exact Laplace likelihood — limit results
Exact Laplace Likelihood:
Result 2. For the Global optimizer of the Exact likelihood,
where
and U*(β) is a stochastic process defined in terms of S(t) and W(t).
In addition, for the Local optimizer of the Exact likelihood
∫∞
∞−
=σθ initinit dzzfL nn ),(),( y
,ˆ)1ˆ( GEGE β→−θ dn
, )( min argˆ *GE β=β U
).( min (local) argˆ ,ˆ)1ˆ( *LELELE β=ββ→−θ Un d
23Waseda 1/06
Simulating from the limit process
Step 1. Simulate two indep sequences (W1, . . . , Wm) and (V1, . . . , Vm) of iid N(0,1) random variables with m=100000.
.100000/)( and 100000/)(]100000[
1
]100000[
1∑∑
==
==t
jj
t
jj VtVWtW
.1||/)(Var 02
1 −= ZEZc t
Step 5. Determine the respective Local and Global minimizers of Joint limit U(β,α) and Exact limit U*(β) numerically.
Step 2. Form W(t) and V(t) by the partial sum processes,
Step 3. Set S(t) = W(t) + c1V(t), where
Limit process depends only on c1 and f (0).
Step 4. Compute U(β,α) and U*(β) from the definition.
24Waseda 1/06
Simulated realizations of the limit processes
Simulate Joint and Exact limit processes, U(β,α(β)), U*(β).
• Simulate realization of each limit process, joint and exact
• Compute local and global optima
• Repeat…
• Build up limit distribution functions
t(5) pdf
beta
U a
nd U
star
-20 -10 0 10 20
-20
24
27Waseda 1/06
Limit cdf
beta.min
cdf
-20 -10 0 10 20
0.0
0.2
0.4
0.6
0.8
1.0
Joint Lap Like
red graph = Laplace pdf for Zt blue graph = Gaussian pdf for Zt
beta.min
cdf
-20 -10 0 10 200.
00.
20.
40.
60.
81.
0
Exact Lap Like
29Waseda 1/06
.0000 .0004 .0293
.0574 .0208 .1511
.0574 .0208 .1539
.0483 .0211 .1414
biass.d.rmseasymp
n = 50
-.0057 -.0033 -.0208.1438 .0656 .2430.1439 .0657 .2438.1207 .0526 .2236
biass.d.rmseasymp
n = 20
.0005 -.0003 -.0025
.0303 .0107 .1000
.0303 .0107 .1000
.0241 .0105 .1000
biass.d.rmseasymp
n = 100
.0005 .0000 -.0016
.0140 .0058 .0718
.0141 .0058 .0718
.0121 .0053 .0707
biass.d.rmseasymp
n = 200
Exact Jointn ˆ ˆ ˆ
LJLE σθθLocal results
Laplace noise
θ = 1, σ =1
1000 reps
30Waseda 1/06
-.013 .002 .000.096 .078 .057
biasrmse
n = 50
-.047 -.050 -.003.224 .213 .144
biasrmse
n = 20
.003 -.003 .000
.051 .034 .011biasrmse
n = 100
.000 .000 .000
.028 .014 .006biasrmse
n = 200
Exact Joint Localn ˆ ˆ ˆ
LJGJGE θθθLaplace noise
θ = 1, σ =1
1000 reps
Simulation results: Global Exact and Global Joint
Global Exact = MLE
Global Joint = maximize over θ and zinit
Note:
• Local dominates Global
• Joint dominates Exact (rmse is half the size)
32Waseda 1/06
Analysis of pile-up probabilities
Look back at realizations of the limit processes, U(β,α(β)), U*(β).
• When is there a local optimum at θ = 1?
• Check derivatives
• Negative derivative from the left
• Positive derivative from the right
• Local optimum at θ = 1
t(5) pdf
beta
U a
nd U
star
-20 -10 0 10 20
-20
24
33Waseda 1/06
Pile-up probabilities (Joint)
Result 3. (Local Joint Laplace likelihood)
where
Idea: look at derivatives
Now,
and the result follows.
),1 0(P)1ˆ(P LJ <<→=θ Y
)0))(ˆ,( lim and 0))(ˆ,( (limP)1ˆ(P 00 >βαββ∂∂
<βαββ∂∂
==θ ↓β↑β nnlm UU
)0))(ˆ,( lim and 0))(ˆ,( (limP 00 >βαββ∂∂
<βαββ∂∂
→ ↓β↑β UU
1))(ˆ,( lim 0 −=βαββ∂∂
↑β YU
YU =βαββ∂∂
↓β ))(ˆ,( lim 0
)2)1()(()0(2
)1()()1()()( 1
0
1
0
1
0∫ ∫ ∫+= /ds-WsW
fWdssS-WsdWsSY
34Waseda 1/06
Pile-up probabilities (Exact)
Result 4. (Local Exact Laplace likelihood)
The pile-up probability is always zero for the Local Exact, and always positive for the Local Joint (see Result 3).
0 21-1
21P)1ˆ(P LE =⎥⎦
⎤⎢⎣⎡ <<→=θ Y
Remark. (Laplace pile-up)
If Zt has a Laplace density then
where W(s) and V(s) are independent standard Brownian motions.
,21)( -|z|/σezfσ
=
[ ] .21)()()1(
1
0∫ +−= sdVsWsWY
35Waseda 1/06
Laplace pile-up probabilities (cont)
It follows that Local Joint has pile-up probability
)10P()1ˆ(P LJ <<→=θ Y
[ ]∫ <+−<=1
0
)15.)()()1( P(0 sdVsWsW
[ ] ⎥⎦
⎤⎢⎣
⎡∈<−<= ∫
1
0
])1,0[ ),(|5.)()()1( P(-.5 ttWsdVsWsWE
[ ]⎥⎥
⎦
⎤
⎢⎢
⎣
⎡
⎟⎟
⎠
⎞
⎜⎜
⎝
⎛
⎭⎬⎫
⎩⎨⎧
−Φ=−
∫ 1-)()1( .522/11
0
2 dssWsWE
820.0≈
But no pile-up probability for Local Exact:
Remark: if Local does not pile up, Global does not pile up
if Local does pile up, Global probably does as well
36Waseda 1/06
.827
.846
.831
.817
.823
.796t(5)
.836
.841
.843
.864
.864
.831Unif
.820
.809
.819
.819
.806
.796Lap
.858
.855
.844
.873
.859
.827Gau
200
500
50
20
100
∞
n
Simulation results – pile-up probabilities
Pile-up probabilities for Local Joint: )1ˆ(P LJ =θ )1ˆ(P LJ =θ
(No pile-up probabilities for Local Exact.)