Adaptive estimation of survival function in the convolution model on $\mathbb{R}^+$
Gwennaëlle MABON
CREST - ENSAE & Université Paris Descartes
April 20th, 2016
G. MABON (CREST & MAP5) Survival function in the convolution model April, 20th 2016 1 / 15
Framework Motivation: additive processes
Motivations: one-sided error in convolution models (a.k.a. additive measurement errors).
→ Application to back-calculation problems in AIDS research: Groeneboom and Wellner (1992), van Es et al. (1998), Jongbloed (1998), Groeneboom and Jongbloed (2003).
→ Application in finance (nonparametric regression): Jirak, Meister and Reiß (2014), Reiß & Selk (2015).
Statistical model
• We study the following model:
$$Z_i = X_i + Y_i, \quad i = 1, \dots, n. \tag{1}$$
• The $X_i$'s are i.i.d. nonnegative variables with unknown density $f$ and survival function $S_X$.
• The $Y_i$'s are i.i.d. nonnegative variables with known density $g$ and survival function $S_Y$.
• $(X_i)_i \perp\!\!\!\perp (Y_i)_i$ and $Z_i \sim h$, with survival function $S_Z$.
Target: estimation of $S_X$ when the $Z_i$'s are observed and $g$ is known.
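As a minimal sketch of model (1), the snippet below simulates a sample under purely illustrative assumptions: $X \sim \mathrm{Gamma}(2, 1)$ and $Y \sim \mathrm{Exp}(1)$ are arbitrary choices, not laws used in the talk, and the function names are hypothetical.

```python
import numpy as np

def simulate_convolution_model(n, rng):
    """Draw (X, Y, Z) from model (1) under illustrative laws (assumptions)."""
    x = rng.gamma(shape=2.0, scale=1.0, size=n)   # unobserved X_i >= 0, density f unknown
    y = rng.exponential(scale=1.0, size=n)        # one-sided noise Y_i >= 0, density g known
    z = x + y                                     # only the Z_i are observed
    return x, y, z

def empirical_survival(sample, t):
    """Empirical survival function: proportion of the sample strictly above t."""
    return float(np.mean(sample > t))

rng = np.random.default_rng(0)
x, y, z = simulate_convolution_model(1000, rng)
print(empirical_survival(z, 2.0), empirical_survival(x, 2.0))
```

Because the noise is nonnegative, the empirical survival function of the $Z_i$'s always lies above that of the $X_i$'s, which is why $S_Z$ cannot be used directly as an estimate of $S_X$.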
Statistical model Steps
• Assumptions: $S_X$, $S_Y$ and $g$ belong to $L^2(\mathbb{R}^+)$.
• Find an appropriate orthonormal basis $(\varphi_k)_{k \ge 0}$ of $L^2(\mathbb{R}^+)$:
$$S_X(x) = \sum_{k \ge 0} a_k(S_X)\,\varphi_k(x),$$
where $a_k(S_X)$ is the $k$-th component of $S_X$ in the orthonormal basis.
• Study the MISE of the estimator in this basis:
$$\mathbb{E}\|S_X - \hat S_{X,m}\|^2 \le \ ?$$
• Build a model selection procedure à la Birgé and Massart:
$$\hat m = \arg\min_{m \in \mathcal{M}} \left\{ \gamma_n(\hat S_{X,m}) + \mathrm{pen}(m) \right\}.$$
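Survival function estimation Convolution equation
Let $z \ge 0$. Since $S_Z(z) = \mathbb{P}(Z > z)$ by definition, we get
$$\begin{aligned}
S_Z(z) = \mathbb{P}(X + Y > z) &= \iint \mathbf{1}_{x+y>z}\, f(x)\mathbf{1}_{x \ge 0}\, g(y)\mathbf{1}_{y \ge 0}\, dx\, dy \\
&= \int \left( \int_{z-y}^{+\infty} f(x)\, dx \right) g(y)\mathbf{1}_{y \ge 0}\mathbf{1}_{z-y \ge 0}\, dy + \int \left( \int_0^{+\infty} f(x)\, dx \right) g(y)\mathbf{1}_{y \ge 0}\mathbf{1}_{z-y \le 0}\, dy \\
&= \int_0^z S_X(z-y)\, g(y)\, dy + S_Y(z).
\end{aligned}$$
Hence
$$S_Z(z) = S_X \star g(z) + S_Y(z). \tag{2}$$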
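Equation (2) can be checked numerically on a case where everything is explicit. The sketch below assumes, for illustration only, $X \sim \mathrm{Exp}(1)$ and $Y \sim \mathrm{Exp}$ with rate 2, so that $S_X(x) = e^{-x}$, $g(y) = 2e^{-2y}$, $S_Y(z) = e^{-2z}$ and $S_Z(z) = 2e^{-z} - e^{-2z}$. The function names are hypothetical.

```python
import numpy as np

def trapezoid(values, grid):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0)

def survival_z_via_convolution(z, num_points=20001):
    """Right-hand side of (2) for X ~ Exp(1), Y ~ Exp(rate 2) (illustrative laws)."""
    y = np.linspace(0.0, z, num_points)
    s_x = np.exp(-(z - y))           # S_X(z - y)
    g = 2.0 * np.exp(-2.0 * y)       # known density of Y
    s_y = np.exp(-2.0 * z)           # S_Y(z)
    return trapezoid(s_x * g, y) + s_y

def survival_z_closed_form(z):
    """Survival function of the sum of independent Exp(1) and Exp(rate 2) variables."""
    return 2.0 * np.exp(-z) - np.exp(-2.0 * z)

for z in (0.5, 1.0, 3.0):
    print(z, survival_z_via_convolution(z), survival_z_closed_form(z))
```

The two columns agree up to quadrature error, which illustrates that the extra term $S_Y(z)$ accounts for the event $\{Y > z\}$, on which $X + Y > z$ holds whatever the value of $X$.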
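Survival function estimation Laguerre procedure
• For $\mathbb{R}^+$-supported functions, the convolution product can be written as
$$S_X \star g(z) = \int_0^z S_X(u)\, g(z-u)\, du = \sum_{k=0}^{\infty} \sum_{j=0}^{\infty} a_k(S_X)\, a_j(g) \int_0^z \varphi_k(u)\, \varphi_j(z-u)\, du.$$
• We introduce the Laguerre basis, defined for $k \in \mathbb{N}$ and $x \ge 0$ by
$$\varphi_k(x) = \sqrt{2}\, L_k(2x)\, e^{-x} \quad \text{with} \quad L_k(x) = \sum_{j=0}^{k} \binom{k}{j} \frac{(-x)^j}{j!}.$$
The $(\varphi_k)_k$'s form an orthonormal basis of $L^2(\mathbb{R}^+)$.
• What makes the Laguerre basis relevant is the relation
$$\int_0^x \varphi_k(u)\, \varphi_j(x-u)\, du = 2^{-1/2} \left( \varphi_{k+j}(x) - \varphi_{k+j+1}(x) \right).$$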
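The two properties of the Laguerre basis used here (orthonormality and the convolution relation) can be verified numerically. This is only a sketch: it relies on NumPy's `numpy.polynomial.laguerre.lagval` to evaluate $L_k$, uses a hand-written trapezoidal rule on a truncated grid, and the helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial.laguerre import lagval

def laguerre_phi(k, x):
    """phi_k(x) = sqrt(2) * L_k(2x) * exp(-x) for x >= 0."""
    x = np.asarray(x, dtype=float)
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0                     # selects the single polynomial L_k
    return np.sqrt(2.0) * lagval(2.0 * x, coeffs) * np.exp(-x)

def trapezoid(values, grid):
    """Composite trapezoidal rule."""
    return float(np.sum((values[1:] + values[:-1]) * np.diff(grid)) / 2.0)

def gram_matrix(m, upper=60.0, num_points=120001):
    """Approximate <phi_i, phi_j> on [0, upper]; the tail beyond `upper` is negligible."""
    u = np.linspace(0.0, upper, num_points)
    basis = [laguerre_phi(k, u) for k in range(m)]
    return np.array([[trapezoid(bi * bj, u) for bj in basis] for bi in basis])

def convolution_lhs(k, j, x, num_points=20001):
    """int_0^x phi_k(u) phi_j(x - u) du by quadrature."""
    u = np.linspace(0.0, x, num_points)
    return trapezoid(laguerre_phi(k, u) * laguerre_phi(j, x - u), u)

def convolution_rhs(k, j, x):
    """2^{-1/2} (phi_{k+j}(x) - phi_{k+j+1}(x))."""
    return float((laguerre_phi(k + j, x) - laguerre_phi(k + j + 1, x)) / np.sqrt(2.0))

print(np.round(gram_matrix(4), 6))
print(convolution_lhs(1, 2, 1.3), convolution_rhs(1, 2, 1.3))
```

The Gram matrix is close to the identity, and both sides of the convolution relation coincide up to quadrature error. This relation is what turns the convolution equation into a triangular linear system on the coefficients.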
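Survival function estimation Laguerre procedure
• It yields
$$S_X \star g(z) = \frac{1}{\sqrt{2}} \sum_{k=0}^{\infty} \varphi_k(z) \left( a_k(S_X)\, a_0(g) + \sum_{l=0}^{k-1} \left( a_{k-l}(g) - a_{k-l-1}(g) \right) a_l(S_X) \right).$$
• Equation (2) implies
$$S_X \star g(z) = S_Z(z) - S_Y(z) = \sum_{k \ge 0} \left( a_k(S_Z) - a_k(S_Y) \right) \varphi_k(z).$$
• We obtain for any $m$ that
$$G_m \vec{S}_{X,m} = \vec{S}_{Z,m} - \vec{S}_{Y,m}, \qquad \vec{S}_{\bullet,m} = {}^t\!\left( a_0(S_\bullet), \dots, a_{m-1}(S_\bullet) \right).$$
• $G_m$ is the lower triangular Toeplitz matrix with elements
$$[G_m]_{i,j} = \frac{1}{\sqrt{2}} \begin{cases} a_0(g) & \text{if } i = j, \\ a_{i-j}(g) - a_{i-j-1}(g) & \text{if } j < i, \\ 0 & \text{otherwise.} \end{cases} \tag{3}$$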
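The construction of $G_m$ in (3) can be sketched as follows. The check at the end assumes, for illustration only, $X \sim \mathrm{Exp}(1)$ and $Y \sim \mathrm{Exp}$ with rate 2; for these laws the Laguerre coefficients are explicit ($a_k(g) = 2\sqrt{2}/3^{k+1}$, $S_Y = g/2$, $S_X = \varphi_0/\sqrt{2}$, $S_Z = 2e^{-x} - e^{-2x}$). The function name `build_g_matrix` is hypothetical.

```python
import numpy as np

def build_g_matrix(a_g):
    """Lower triangular Toeplitz matrix G_m of (3) from a_0(g), ..., a_{m-1}(g)."""
    a_g = np.asarray(a_g, dtype=float)
    m = a_g.size
    # First column: a_0, a_1 - a_0, ..., a_{m-1} - a_{m-2}, all scaled by 1/sqrt(2).
    first_col = np.concatenate(([a_g[0]], np.diff(a_g))) / np.sqrt(2.0)
    G = np.zeros((m, m))
    for j in range(m):
        G[j:, j] = first_col[: m - j]    # same entries along each diagonal
    return G

# Illustrative check with X ~ Exp(1) and Y ~ Exp(rate 2) (assumed laws).
m = 5
k = np.arange(m)
a_g = 2.0 * np.sqrt(2.0) / 3.0 ** (k + 1)    # a_k(g) for g(y) = 2 exp(-2y)
a_sy = np.sqrt(2.0) / 3.0 ** (k + 1)         # a_k(S_Y), since S_Y = g / 2
a_sx = np.zeros(m)
a_sx[0] = 1.0 / np.sqrt(2.0)                 # S_X = exp(-x) = phi_0 / sqrt(2)
a_sz = 2.0 * a_sx - a_sy                     # S_Z = 2 exp(-x) - exp(-2x)

G = build_g_matrix(a_g)
print(np.allclose(G @ a_sx, a_sz - a_sy))
```

In this example the linear system $G_m \vec{S}_{X,m} = \vec{S}_{Z,m} - \vec{S}_{Y,m}$ holds exactly, because all three survival functions have explicit Laguerre expansions.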
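Survival function estimation Laguerre procedure
• $G_m$ is a lower triangular matrix, so it is invertible iff its diagonal coefficients are different from 0:
$$a_0(g) = \sqrt{2} \int_{\mathbb{R}^+} g(u)\, e^{-u}\, du = \sqrt{2}\, \mathbb{E}[e^{-Y}] > 0.$$
• It yields
$$\vec{S}_{X,m} = G_m^{-1} \left( \vec{S}_{Z,m} - \vec{S}_{Y,m} \right).$$
• Remark:
$$\begin{aligned}
a_k(S_Z) = \int_{\mathbb{R}^+} S_Z(u)\, \varphi_k(u)\, du &= \int_{\mathbb{R}^+} \varphi_k(u) \left( \int_u^{+\infty} h(v)\, dv \right) du \\
&= \int_{\mathbb{R}^+} \left( \int_0^v \varphi_k(u)\, du \right) h(v)\, dv = \mathbb{E}[\Phi_k(Z_1)],
\end{aligned}$$
with $\Phi_k$ the primitive of $\varphi_k$ defined as $\Phi_k(x) = \int_0^x \varphi_k(u)\, du$.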
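The remark suggests estimating $a_k(S_Z)$ by an empirical mean of $\Phi_k(Z_i)$. The sketch below computes $\Phi_k$ by plain trapezoidal quadrature (a numerical shortcut chosen here, not a method stated in the talk) and checks the identity on an assumed toy case, $Z_1 \sim \mathrm{Exp}(1)$, for which $S_Z = \varphi_0/\sqrt{2}$ and hence $a_0(S_Z) = 1/\sqrt{2}$ and $a_1(S_Z) = 0$. All function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.laguerre import lagval

def laguerre_phi(k, x):
    """phi_k(x) = sqrt(2) * L_k(2x) * exp(-x)."""
    x = np.asarray(x, dtype=float)
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0
    return np.sqrt(2.0) * lagval(2.0 * x, coeffs) * np.exp(-x)

def laguerre_primitive(k, x, num_points=4001):
    """Phi_k(x) = int_0^x phi_k(u) du, by trapezoidal quadrature (numerical sketch)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.empty_like(x)
    for i, xi in enumerate(x):
        u = np.linspace(0.0, xi, num_points)
        vals = laguerre_phi(k, u)
        out[i] = np.sum((vals[1:] + vals[:-1]) * np.diff(u)) / 2.0
    return out

def empirical_coefficient(k, z_sample):
    """(1/n) sum_i Phi_k(Z_i), the empirical counterpart of E[Phi_k(Z_1)]."""
    return float(np.mean(laguerre_primitive(k, z_sample)))

rng = np.random.default_rng(0)
z_sample = rng.exponential(scale=1.0, size=2000)   # toy law for Z (assumption)
print(empirical_coefficient(0, z_sample), 1.0 / np.sqrt(2.0))
print(empirical_coefficient(1, z_sample), 0.0)
```

The point of the identity is that no preliminary estimate of $S_Z$ or $h$ is needed: each coefficient is a simple average over the observations.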
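Survival function estimation Laguerre procedure
• Let $S_m = \mathrm{span}\{\varphi_k\}_{k \in \{0, \dots, m-1\}}$ and consider $S_{X,m}$, the projection of $S_X$ on $S_m$:
$$S_{X,m}(x) = \sum_{k=0}^{m-1} a_k(S_X)\, \varphi_k(x). \tag{4}$$
Definition (Projection estimator)
$$\hat S_{X,m}(x) = \sum_{k=0}^{m-1} \hat a_k\, \varphi_k(x), \tag{5}$$
where ${}^t(\hat a_0, \dots, \hat a_{m-1}) = \hat{\vec{S}}_{X,m}$ and $\hat{\vec{S}}_{Z,m} = {}^t(\hat a_0(Z), \dots, \hat a_{m-1}(Z))$, with
$$\hat{\vec{S}}_{X,m} = G_m^{-1} \left( \hat{\vec{S}}_{Z,m} - \vec{S}_{Y,m} \right) \quad \text{and} \quad \hat a_k(Z) = \frac{1}{n} \sum_{i=1}^{n} \Phi_k(Z_i).$$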
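Putting the pieces together, a projection estimator along the lines of (5) might be sketched as below. Several choices are assumptions made for the example: the coefficients $a_k(g)$ and $a_k(S_Y)$ are obtained by quadrature on a truncated grid, $\Phi_k$ is tabulated by cumulative trapezoids and interpolated, the data are simulated with $X \sim \mathrm{Exp}(1)$ and $Y \sim \mathrm{Exp}$ with rate 2, and all function names are hypothetical.

```python
import numpy as np
from numpy.polynomial.laguerre import lagval

def laguerre_phi(k, x):
    """phi_k(x) = sqrt(2) * L_k(2x) * exp(-x)."""
    x = np.asarray(x, dtype=float)
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0
    return np.sqrt(2.0) * lagval(2.0 * x, coeffs) * np.exp(-x)

def laguerre_coefficients(func, m, upper=60.0, num_points=120001):
    """a_k(func) for k < m, by trapezoidal quadrature on [0, upper]."""
    u = np.linspace(0.0, upper, num_points)
    vals = [func(u) * laguerre_phi(k, u) for k in range(m)]
    return np.array([np.sum((v[1:] + v[:-1]) * np.diff(u)) / 2.0 for v in vals])

def build_g_matrix(a_g):
    """Lower triangular Toeplitz matrix G_m of (3)."""
    m = len(a_g)
    first_col = np.concatenate(([a_g[0]], np.diff(a_g))) / np.sqrt(2.0)
    G = np.zeros((m, m))
    for j in range(m):
        G[j:, j] = first_col[: m - j]
    return G

def empirical_sz_coefficients(z, m, step=1e-3):
    """hat a_k(Z) = (1/n) sum_i Phi_k(Z_i), with Phi_k tabulated on a fine grid."""
    u = np.arange(0.0, z.max() + step, step)
    coeffs = np.empty(m)
    for k in range(m):
        vals = laguerre_phi(k, u)
        primitive = np.concatenate(([0.0], np.cumsum((vals[1:] + vals[:-1]) * step / 2.0)))
        coeffs[k] = np.mean(np.interp(z, u, primitive))
    return coeffs

def projection_estimator(z, m, g_density, s_y):
    """Return (x -> hat S_{X,m}(x), hat a) using G_m^{-1} (hat S_Z - S_Y)."""
    G = build_g_matrix(laguerre_coefficients(g_density, m))
    rhs = empirical_sz_coefficients(z, m) - laguerre_coefficients(s_y, m)
    a_hat = np.linalg.solve(G, rhs)
    return (lambda x: sum(a_hat[k] * laguerre_phi(k, x) for k in range(m))), a_hat

rng = np.random.default_rng(1)
n = 5000
z = rng.exponential(1.0, n) + rng.exponential(0.5, n)   # X ~ Exp(1), Y ~ Exp(rate 2)
s_hat, a_hat = projection_estimator(
    z, 4, lambda y: 2.0 * np.exp(-2.0 * y), lambda y: np.exp(-2.0 * y)
)
x_grid = np.linspace(0.0, 4.0, 9)
print(np.max(np.abs(s_hat(x_grid) - np.exp(-x_grid))))
```

In this toy setting $S_X = \varphi_0/\sqrt{2}$ belongs to every $S_m$, so there is no bias and the printed error is driven by the variance alone.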
Survival function estimation Upper bounds
Proposition (M. (2014))
If $S_X$ and $g \in L^2(\mathbb{R}^+)$ and $\mathbb{E}[Z_1] < \infty$, then, for $G_m$ defined by (3) and $\hat S_{X,m}$ defined by (5), the following result holds:
$$\mathbb{E}\|S_X - \hat S_{X,m}\|^2 \le \|S_X - S_{X,m}\|^2 + \frac{\mathbb{E}[Z_1]}{n}\, \varrho^2(G_m^{-1}). \tag{6}$$
Here $\varrho^2(A)$ is the largest eigenvalue of the matrix ${}^t\!AA$ in absolute value.
Consequence
• $m$ plays the same role as a bandwidth parameter: $m$ too small ⇒ dominant bias; $m$ too big ⇒ dominant variance.
• Choose $m$ to achieve a trade-off between the bias and the variance.
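To see how the variance term of (6) behaves, one can compute $\varrho^2(G_m^{-1})$ for increasing $m$. The sketch below assumes, as an illustration only, $Y \sim \mathrm{Exp}$ with rate 2 (for which $a_k(g) = 2\sqrt{2}/3^{k+1}$), $\mathbb{E}[Z_1] = 1.5$ and $n = 1000$; the helper names are hypothetical.

```python
import numpy as np

def build_g_matrix(a_g):
    """Lower triangular Toeplitz matrix G_m of (3)."""
    m = len(a_g)
    first_col = np.concatenate(([a_g[0]], np.diff(a_g))) / np.sqrt(2.0)
    G = np.zeros((m, m))
    for j in range(m):
        G[j:, j] = first_col[: m - j]
    return G

def rho_squared(matrix):
    """Largest eigenvalue of t(A) A, i.e. the squared spectral norm of A."""
    return float(np.linalg.eigvalsh(matrix.T @ matrix).max())

def variance_term(m, a_g, mean_z, n):
    """(E[Z_1] / n) * rho^2(G_m^{-1}), the variance term of bound (6)."""
    g_inv = np.linalg.inv(build_g_matrix(a_g[:m]))
    return mean_z / n * rho_squared(g_inv)

k = np.arange(16)
a_g = 2.0 * np.sqrt(2.0) / 3.0 ** (k + 1)     # assumed noise law: Y ~ Exp(rate 2)
terms = {m: variance_term(m, a_g, mean_z=1.5, n=1000) for m in (1, 2, 4, 8, 16)}
for m, value in terms.items():
    print(m, value)
```

Since $G_m^{-1}$ is the leading block of $G_{m+1}^{-1}$, the variance term is nondecreasing in $m$, while the bias term $\|S_X - S_{X,m}\|^2$ is nonincreasing, which is the trade-off described above.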
Model selection
Goal: define an empirical version of the upper bound on the MISE
$$\|S_X - S_{X,m}\|^2 + \frac{\mathbb{E}[Z_1]}{n}\, \varrho^2(G_m^{-1}).$$
→ Approximation of the bias term by
$$-\|\hat S_{X,m}\|^2.$$
→ Approximation of the variance term by
$$\mathrm{pen}(m) = \kappa\, \frac{\mathbb{E}[Z_1]}{n}\, \varrho^2(G_m^{-1}) \log n.$$
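Model selection Theorem
(B1) $\mathcal{M}_n = \left\{ 1 \le m \le n,\ \varrho^2(G_m^{-1}) \log n \le Cn \right\}$, where $C > 0$.
(B2) $0 < \mathbb{E}[Z_1^3] < \infty$.
Theorem (M. (2014))
If $S_X$ and $g \in L^2(\mathbb{R}^+)$, suppose that (B1)-(B2) hold. Let $\hat S_{X,m}$ be defined by (5) and
$$\hat m = \arg\min_{m \in \mathcal{M}_n} \left\{ -\|\hat S_{X,m}\|^2 + \mathrm{pen}(m) \right\},$$
with $\mathrm{pen}(m) = \kappa\, \dfrac{\mathbb{E}[Z_1]}{n}\, \varrho^2(G_m^{-1}) \log n$. Then there exists a positive numerical constant $\kappa_0$ such that, for any $\kappa \ge \kappa_0$,
$$\mathbb{E}\|S_X - \hat S_{X,\hat m}\|^2 \le 4 \inf_{m \in \mathcal{M}_n} \left\{ \|S_X - S_{X,m}\|^2 + \mathrm{pen}(m) \right\} + \frac{C}{n},$$
where $C$ is a constant depending on $\mathbb{E}[Z_1^3]$.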
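Model selection Theorem
Corollary (M. (2014))
If $S_X$ and $g \in L^2(\mathbb{R}^+)$, suppose that (B1)-(B2) hold. Let $\hat S_{X,m}$ be defined by (5) and
$$\hat m = \arg\min_{m \in \mathcal{M}_n} \left\{ -\|\hat S_{X,m}\|^2 + \mathrm{pen}(m) \right\},$$
$$\mathrm{pen}(m) = 2\kappa\, \frac{\bar Z_n}{n}\, \varrho^2(G_m^{-1}) \log n \quad \text{where} \quad \bar Z_n = \frac{1}{n} \sum_{i=1}^{n} Z_i.$$
Then there exists a positive numerical constant $\kappa_0$ such that, for any $\kappa \ge \kappa_0$,
$$\mathbb{E}\|S_X - \hat S_{X,\hat m}\|^2 \le 4 \inf_{m \in \mathcal{M}_n} \left\{ \|S_X - S_{X,m}\|^2 + \mathrm{pen}(m) \right\} + \frac{C}{n},$$
where $C$ is a constant depending on $\mathbb{E}[Z_1]$, $\mathbb{E}[Z_1^2]$, $\mathbb{E}[Z_1^3]$.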
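The data-driven choice of $m$ in the corollary can be sketched as follows. The slides do not give a numerical value for $\kappa$ or for the constant $C$ of (B1); the values `kappa=1.0` and `c_const=1.0` below are arbitrary placeholders. The input coefficients are synthetic: exact coefficients for the assumed toy case $X \sim \mathrm{Exp}(1)$, $Y \sim \mathrm{Exp}$ with rate 2, plus small Gaussian noise standing in for sampling error. All function names are hypothetical.

```python
import numpy as np

def build_g_matrix(a_g):
    """Lower triangular Toeplitz matrix G_m of (3)."""
    m = len(a_g)
    first_col = np.concatenate(([a_g[0]], np.diff(a_g))) / np.sqrt(2.0)
    G = np.zeros((m, m))
    for j in range(m):
        G[j:, j] = first_col[: m - j]
    return G

def rho_squared(matrix):
    """Largest eigenvalue of t(A) A."""
    return float(np.linalg.eigvalsh(matrix.T @ matrix).max())

def select_dimension(a_sz_hat, a_sy, a_g, z_bar, n, kappa, c_const=1.0):
    """Return (m_hat, criteria) for the penalized criterion of the corollary."""
    m_max = len(a_sz_hat)
    G = build_g_matrix(a_g[:m_max])
    # G is lower triangular, so the first m entries of this solution are the
    # coefficients of the estimator of dimension m (the models are nested).
    a_hat = np.linalg.solve(G, a_sz_hat - a_sy[:m_max])
    criteria = {}
    for m in range(1, m_max + 1):
        rho2 = rho_squared(np.linalg.inv(G[:m, :m]))
        if rho2 * np.log(n) > c_const * n:       # m is outside the collection M_n of (B1)
            continue
        pen = 2.0 * kappa * z_bar / n * rho2 * np.log(n)
        criteria[m] = -float(np.sum(a_hat[:m] ** 2)) + pen   # -||hat S_{X,m}||^2 + pen(m)
    return min(criteria, key=criteria.get), criteria

# Synthetic inputs for the assumed toy case X ~ Exp(1), Y ~ Exp(rate 2).
rng = np.random.default_rng(2)
n, m_max = 1000, 8
k = np.arange(m_max)
a_g = 2.0 * np.sqrt(2.0) / 3.0 ** (k + 1)
a_sy = np.sqrt(2.0) / 3.0 ** (k + 1)
a_sz = -a_sy.copy()
a_sz[0] += np.sqrt(2.0)                               # S_Z = 2 exp(-x) - exp(-2x)
a_sz_hat = a_sz + rng.normal(0.0, 0.01, size=m_max)   # noise mimics sampling error

m_hat, criteria = select_dimension(a_sz_hat, a_sy, a_g, z_bar=1.5, n=n, kappa=1.0)
print(m_hat)
```

Because the basis is orthonormal, $\|\hat S_{X,m}\|^2$ is simply the sum of the squared estimated coefficients, so the whole criterion is computed from one triangular solve. In this toy case $S_X$ lies in $S_1$, and the penalty prevents the procedure from selecting a larger model.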
Conclusion Extensions
• Estimation of the survival function in a global setting on $\mathbb{R}^+$.
• Related works:
  ▸ Estimation of the density → Mabon (2014).
  ▸ Estimation of linear functionals of the density (c.d.f., pointwise estimation of the density, Laplace transform) → Mabon (2015).
• Perspectives:
  ▸ Estimation when $g$ is unknown → work in progress.
  ▸ Goodness-of-fit test.
Thank you for your attention.
Conclusion Bibliography
▸ van Es, B. (2011). Combining kernel estimators in the uniform deconvolution problem. Statistica Neerlandica, 65(3):275–296.
▸ Groeneboom, P. and Jongbloed, G. (2003). Density estimation in the uniform deconvolution model. Statistica Neerlandica, 57(1):136–157.
▸ Jirak, M., Meister, A. and Reiß, M. (2014). Adaptive function estimation in nonparametric regression with one-sided errors. The Annals of Statistics, 42(5):1970–2002.
▸ Jongbloed, G. (1998). Exponential deconvolution: two asymptotically equivalent estimators. Statistica Neerlandica, 52(1):6–17.
▸ Mabon, G. (2014). Adaptive deconvolution on the nonnegative real line. Preprint MAP5 2014-33. In revision.
▸ Mabon, G. (2015). Adaptive deconvolution of linear functionals on the nonnegative real line. Preprint MAP5 2015-24. In revision.
▸ Reiß & Selk (2015). Efficient estimation of functionals in nonparametric boundary models. To appear in Bernoulli.