Shrinkage Drift Parameters Estimation for Multi-factors Ornstein ...
Transcript of Shrinkage Drift Parameters Estimation for Multi-factors Ornstein ...
manuscript No.(will be inserted by the editor)
Shrinkage Drift Parameters Estimation for Multi-factors
Ornstein-Uhlenbeck Processes
Sévérien Nkurunziza · and · S. Ejaz Ahmed
Abstract We consider some inference problem concerning the drift parameters of multi-factors Vasicek
model (or multivariate Ornstein-Uhlebeck process). For example, it asserts that the term structure of interest
rate is not just a single process, but rather a superpositions of several analogous processes. This motivates us
to develop an improved estimation theory for the drift parameters when homogeneity of several parameters
may hold. However, the information regarding the equality of the theses parameters may be imprecise. In this
context, we consider Stein-rule (or shrinkage) estimators which allow us to improve upon the performance
of classical maximum likelihood estimator (MLE). Asymptotic properties of some shrinkage estimators are
studied. Under an asymptotic distributional quadratic risk criterion, their relative dominance picture is explored
and assessed. A simulation study illustrates the behavior of the suggested method for small and moderate length
of time observation period. Our analytical and simulation results demonstrate that shrinkage estimators provide
excellent estimation accuracy and outperform the MLE uniformly. An over-ridding theme of this paper is that
the shrinkage estimators provides a powerful extension of its classical counterparts.
Sévérien Nkurunziza · and · S. Ejaz Ahmed
University of Windsor, 401 Sunset Avenue, Windsor, Ontario, N9B 3P4
Tel. : 1-519-253-3000 ext. 3017 · Fax : 1-519-971-3649 · E-mail: [email protected] · [email protected]
Shrinkage Estimator of the drift parameters 1
Keywords Ornstein-Uhlenbeck process · Vasicek process · Gaussian process · shrinkage estimator · MLE ·
diffusion process
1 Introduction
The Ornstein-Uhlenbeck process has been extensively used in modelling of different phenomenon such
as in Biology (Engen and Sæther, 2005), in Ecology (Engen et al., 2002), in Finances (Schöbel and Zhu,
1999). In this last particular field, the Ornstein-Uhlenbeck process is used in modelling the term structure of
the interest rates and that is mostly known as Vasicek (1977) process. In a such particular point of view, the
Vasicek model describes the interaction between equilibrium value of what the interest rate should be and a
stochastic movements into the interest rate that results of the unpredictable economic environment (see for e.g.
Vasicek, 1977, Abu-Mostafa, 2001). Namely, the instantaneous interest rate, r(t) is governed by the stochastic
differential equation (SDE)
dr(t) = θ (α− r(t))dt +σdW (t) (1)
where α > 0 represents a steady-state interest rate (or the long-term mean), θ > 0 represents a speed of conver-
ging to the steady-state, σ > 0 is a volatility or “randomness level” and {W (t), t > 0} is a Wiener process. The
first part θ (α− r(t)) represents the drift term and thus, α and θ are so-called the drift parameters. Also, the
second part σdW (t) represents the diffusion term of the process whose σ is so-called the diffusion parameter.
With this in mind, let us illustrate the more general form of the above model in the following sub-section and
the remanning discussions follows.
1.1 Multi-factors Model
The model given in (1) is univariate while the term structure of the interest rates is embedded in a large
macroeconomic system (Langetieg, 1980). For this reason, Langetieg (1980) developed a multivariate version
2 S. Nkurunziza and S. E. Ahmed
of the model (1), so-called “multi-factors Vasicek model”. This model takes care of an arbitrary number of
economic relationships and given as follows :
drk(t) = θk (αk− rk(t))dt +σkdWk(t), k = 1,2, . . . , p, (2)
where {Wk, t > 0}, k = 1,2, . . . , p are Wiener processes possibly correlated. In some particular economic contexts,
the philosophy behind having a such multiple factors stems from the observation can be interpreted as different
time scales for the behavior of interest rates.
The interest rate Vasicek model has been extensively used in literature. This model was used to analyze
the maturity structure of the public debt both at the Bank of Canada and at the Department of Finance, and in
Danish National bank, see Georges (2003). Abu-Mostafa (2001) calibrates the correlated multi-factors Vasicek
model of interest rates, and then applied it to Japanese Yen swaps market and U.S. Dollar yield market.
From a statistical point of view, Liptser and Shiryayev (1978), Basawa, Ishwar and Rao (1980) and Ku-
toyants (2004) considered the inference problem concerning the drift parameters of the model (1) and derived
the maximum likelihood estimator (MLE). Our goal is here to improve the performance of the existing MLE of
the drift parameters of multivariate Vasicek (or Ornstein-Uhlenbeck) process. In order to simplify the analysis,
we consider that steady-state interest rate α is known or available from the previous study. Accordingly, our
parameter of interest is the speed of converging to the steady-state, i.e., θk. Thus, let Xk(t) = rk(t)−αk. Then,
from the multi-factors Vasicek model (2), we get
dXk(t) =−θkXk(t)dt +σkdWk(t), k = 1,2, . . . , p, (3)
where θk > 0, σk > 0, k = 1,2, . . . , p. We assume that {Wk(t), t > 0,k = 1,2 . . . , p} is p-dimensional Wiener
process. The independence assumption between the p different Wiener processes is made to simplify the analy-
sis. Similar results can be established for the case where these Wiener processes are pairwise jointly Gaussian
with a non-zero correlation coefficient.
Motivated by the diverse applications of the above model, we consider the estimation problem of drift
parameter vector θθθ = (θ1,θ2, . . . ,θp)′.
Shrinkage Estimator of the drift parameters 3
We are interested in the inference problem when it is suspected that
θ1 = θ2 = · · ·= θp
or nearly equal. In other words, considering the situations when all the parameters may be equal. Some aspects
of this behavior are observed in a short time horizon (high-speed factors). In other scenario perhaps the data was
collected from various sources under similar conditions. Hence, under these scenarios it is reasonable to suspect
the homogeneity of the several drift parameters. The estimation and testing of homogeneity or parameters is
of paramount interest equally and in scientific and statistical literatures. For this reason, the main focus of this
contribution is to suggest some improved estimators of θθθ with high estimation accuracy when homogeneity of
several drift parameters may hold. An asymptotic test for the equality of parameters is also suggested. Hence,
we have extended the single factor inference problem into a more general form of the Vasicek model, the
multi-factors model. Further, an alternative estimator of variance parameters is also suggested.
The rest of this paper is organized as follows. Section 2 is Meta Analysis and Testing the Homogeneity. In
this section, we present the shrinkage estimator and show its supremacy over the MLE. Section 3 is devoted to
some simulation studies of the ADR results. Finally, Section 4 gives concluding remarks and, technical results
are given in the Appendix.
2 Meta Analysis and Testing the Homogeneity
Let {X1(t),0 6 t 6 T}, {X2(t),0 6 t 6 T},. . . ,{
Xp(t),0 6 t 6 T}
be Ornstein-Uhlenbeck processes (Ku-
toyants, 2004, p. 51, Steele, 2001, p. 140) whose diffusion equations are given by
dXk(t) =−θkXk(t)dt +σkdWk(t) Xk(0) fixed , (4)
where θk > 0, σk > 0, k = 1,2, . . . , p and {Wk(t),0 6 t 6 T}, k = 1,2, . . . , p are p-independent Wiener pro-
cesses. Further, let θθθ be a p-column vector given by θθθ = (θ1,θ2, . . . ,θp)′ , and let Σ be diagonal matrix whose
4 S. Nkurunziza and S. E. Ahmed
diagonal entrees are σ21 ,σ2
2 , . . . ,σ2p , i.e., ΣΣΣ = diag
(σ2
1 ,σ22 , . . . ,σ2
p). Also, let XXX(t) and WWW (t) be column vectors
given by
XXX(t) = (X1(t),X2(t), . . . ,Xp(t))′ , WWW (t) = (W1(t),W2(t), . . . ,Wp(t))
′ ,
for 0 6 t 6 T . Further, let
VVV (t) = diag(X1(t),X2(t), . . . ,Xp(t)) , 0 6 t 6 T .
From relation (4), we get
dXXX(t) =−VVV (t)θθθdt +Σ12 dWWW (t), 0 6 t 6 T . (5)
As mentioned in the Introduction, we are interested in the situation where we suspect that
θ1 = θ2 = · · ·= θp,
which leads to consider the testing problem
H0 : θ1 = θ2 = · · ·= θp versus H1 : θ j 6= θk for some 1 6 j < k 6 p. (6)
The provided solution to (6) is based on the maximum likelihood estimators of θθθ under H0 and under the full pa-
rametric space. This estimation and/or testing problem is considered when ΣΣΣ is either known or unknown. Also,
note that the MLE established here corresponds to that given in Lipter and Shiryayev (1978, chapter 17, p. 206-
207) or in Kutoyants (2004, p. 63), for the univariate case. Hence, we extend univariate estimation problem to
a multivariate situation.
In the sequel, we denote
UUUT =(∫ T
0X1(t)dX1(t),
∫ T
0X2(t)dX2(t), . . . ,
∫ T
0Xp(t)dXp(t)
)′,
and let
DDDT =∫ T
0VVV 2(t)dt, θ̂θθ
∗=−DDD−1
T UUUT =(
θ̂ ∗1 , θ̂ ∗2 , . . . , θ̂ ∗p)′
. (7)
Also, let z+ = max(z,0). Conditionally to XXX0, let θ̂θθ be maximum likelihood estimator of θθθ satisfying the
model (5) and θ̃θθ be the restricted MLE (RMLE) of θθθ under H0.
Shrinkage Estimator of the drift parameters 5
Proposition 1 We have
θ̂θθ =(
θ̂ ∗+1 , θ̂ ∗+2 , . . . , θ̂ ∗+p
)′and θ̃θθ =
(eee′pΣ−1DDDT eeep
)−1eeepeee′pΣ−1DDDT θ̂θθ , (8)
where eeep is a p-column vector whose all entrees are equal to 1.
The proof follows from Proposition 4 that is given in the Appendix.
It is established that θ̂θθ and θ̃θθ are strongly consistent for θθθ (see Appendix, Proposition 5). Moreover, these
estimators are asymptotically normal as T tends to infinity (see, Appendix, Proposition 6). Based on these
results, we suggest the following test for the testing problem (6), when Σ is known,
Ψ =
1 if T(
θ̂θθ − θ̃θθ)′
Σ−1(
θ̂θθ − θ̃θθ)
> χ2p−1;α
0 if T(
θ̂θθ − θ̃θθ)′
Σ−1(
θ̂θθ − θ̃θθ)
< χ2p−1;α ,
(9)
where α is fixed and 0 < α < 1. Also, we denote by χ2p−1, the chi-square random variable with p−1 degrees
of freedom and
Pr{
χ2p−1 > χ2
p−1;α}
= α.
When Σ is unknown, the test (9) can be modified by replacing Σ by a its strongly consistent estimator. Later,
we prove that the test Ψ is asymptotically α-level. Further, the test obtained for the unknown Σ case has
asymptotically the same level α , and it is as powerful as the test Ψ .
Corollary 1 Under the model (5), the test Ψ given in (6) is asymptotically α-level test for the testing pro-
blem (6).
The proof is straightforward by using Corollary 4 that is given in the Appendix.
2.1 Practical Aspect
In practice the covariance matrix Σ is unknown and needs to be estimated in order to compute θ̂θθ and θ̃θθ .
Namely, Σ is replaced by its corresponding strongly consistent estimator Σ̂ . By Slutsky Theorem, the corres-
ponding new estimators are strongly consistent and asymptotically normal. The benchmark strongly consistent
6 S. Nkurunziza and S. E. Ahmed
estimator for the diffusion parameter is based on quadratic variation. In the present investigation we suggest an
alternative estimator for (σ1,σ2, . . . ,σp)′, that is
(σ̂1, σ̂2, . . . , σ̂p)′ where σ̂2
i =X2
i (T )T
− 2T
∫ T
0Xi(t)dXi(t).
Further, if σ1 = σ2 = · · ·= σp = σ , then the common value σ2 is estimated by
σ̂2 =1p
p
∑i=1
(X2
i (T )T
− 2T
∫ T
0Xi(t)dXi(t)
).
Hence
Σ̂ = diag(σ̂2
1 , σ̂22 , . . . , σ̂2
p)
or Σ̃ = σ̂2Ip. (10)
Proposition 2 Assume that the model (5) holds. Then,
(i) Σ̂ a.s.−−−→T→∞
Σ and for any 0 < ν < 1, T ν(
Σ̂ −Σ)
a.s.−−−→T→∞
0;
(ii) if E{‖XXX(0)‖2
}< ∞ then lim
T→∞E
{∥∥∥Σ̂ −Σ∥∥∥
2}
= 0.
From Proposition 2, it should be noted that the suggested estimator Σ̂ converges faster than the classical esti-
mator based on quadratic variation. Since the strongly consistency and normality properties of θ̂θθ and θ̃θθ do not
change when Σ is replaced by Σ̂ , the brevity sake, we treat Σ as known in the remaining discussions.
Finally, we close this subsection by noting that a continuous time process can be viewed as an approxi-
mation of discrete time process ; since in practice the observations are collected in discrete times. Indeed, in
order to evaluate the suggested estimator, the stochastic integrals are replaced with their corresponding discrete
Riemann-Îto sums. In the following subsection, we showcase the main contribution of the paper.
2.2 Shrinkage Estimation Strategy (SES)
In this subsection, we shows how to improve the precision of the MLE by shrinking them towards a common
estimate. Following Ahmed (2001) and others, we consider two Stein-rule (or shrinkage) estimators of the drift
parameters.
Shrinkage Estimator of the drift parameters 7
The shrinkage estimator (SE) θ̂S
of θ is defined as
θ̂S= θ̃ +{1− cψ−1}(θ̂ − θ̃), (11)
where
ψ = T(
θ̂θθ − θ̃θθ)′
Σ−1(
θ̂θθ − θ̃θθ)
,
and c is allowed to vary over [0,2(p−2)), p≥ 3, and often is taken as c = p−2 ; thus tacitly, we assume that
p≥ 3. Note that ψ ≥ 0, and hence, for
ψ < c⇐⇒ 1− cψ−1 < 0,
that causes a possible inversion of sign or over-shrinkage. For this reason θ̂S
may not be useful at all in the
sense that the components of this estimator may have different signs from the maximum likelihood estimator.
Thus, shrinking towards the pivot is commonly accepted but over-shrinking may be somewhat more difficult
to interpret, albeit this drawback does not adversely effect the risk performance of shrinkage estimator. The
positive-rule shrinkage estimators (PSE) control this drawback satisfactorily and may dominate both SE and
MLE and hence, are useful for practical purposes. Formally, the PSE(
θ̂S+)
is defined as
θ̂S+
= θ̃ +{1− cψ−1}+(
θ̂ − θ̃)
, (12)
where z+ = max(0,z). Accordingly, Ahmed (2001) recommended that shrinkage estimator should be used as a
tool for developing the PSE and should not be used as an estimator in its own right.
In parametric setups, the SE, PSE, and other related estimators have been extensively studied (Judge and
Bock, 1978 and the references therein). Large sample properties of these estimators were studied by Sen (1986),
Ahmed (2005) and others. Stigler (1990) and Kubokawa (1998) provide excellent reviews of (parametric) shrin-
kage estimators. Jurec̆ková and Sen (1996) have an extensive treatise of the asymptotic and interrelationships
of robust estimators of location, scale and regression parameters with due emphasis on Stein-rule estimators.
8 S. Nkurunziza and S. E. Ahmed
An interesting feature of this paper is that we consider the shrinkage estimation of drift parameters in the
multifactors model, a more general form of the Vasicek model. Based on the reviewed literature, this kind of
study is not available for practitioner.
Having all these estimators defined, we need to stress on the regularity conditions under which they have
good performance characteristics. Unlike maximum likelihood estimators θ̂ and θ̃ , these shrinkage estimators
are not linear. Hence, even if the distribution was a standard normal, the finite sample distribution theory of these
shrinkage estimators are not simple to obtain, for even normal ones. This difficulty has been largely overcome
by asymptotic methods (Ahmed and Saleh (1999), Jurec̆ková and Sen (1996), and others). These asymptotic
methods relate primarily to convergence in distribution which may not generally guarantee convergence in
quadratic risk. This technicality has been taken care of by the introduction of asymptotic distributional risk
(ADR) (Sen (1986)), which, in turn, is based on the concept of a shrinking neighborhood of the pivot for which
the ADR serves a useful and interpretable role in asymptotic risk analysis.
We focus on some small and moderate length of time observation period and study ADR of different
estimators by means of a simulation study. For this purpose, in the next section, we recapitulate briefly ADR
and asymptotic bias (AB) results pertaining to them. This forms a basis and provides some justifications for
moderate length of time observation period.
2.3 Asymptotic Properties
First, for the testing problem (6), let us consider the following local alternative
KT : θθθ = θ̄eeep +δδδ√T
(13)
where δδδ is a p-column vector with different direction with eeep. Also, we assume that ‖δ‖< ∞. Further, let
ϒ =(eee′pΣ−1eeep
)−1eeepeee′p, δδδ ∗ = δδδ −ϒ Σ−1δδδ , and let Σ ∗ = Σ −ϒ .
Shrinkage Estimator of the drift parameters 9
The asymptotic study of the shrinkage estimator (11) can be derived from the asymptotic distribution of the
quantities
ρρρT =√
T(
θ̂θθ − θ̄eeep
), ξξξ T =
√T
(θ̂θθ − θ̃θθ
), ζT =
√T
(θ̃θθ − θ̄eeep
).
Proposition 3 Assume that the model (5) holds. Under the local alternative (13),
(i)
ρρρT
ξξξ T
L−−−→T→∞
N2p
δδδ
δδδ ∗
,
Σ Σ ∗
Σ ∗ Σ ∗
.
(ii)
ζζζ T
ξξξ T
L−−−→T→∞
N2p
δδδ −δδδ ∗
δδδ ∗
,
ϒ 000
000 Σ ∗
.
The proof of this proposition is given in the Appendix.
Corollary 2 Let Ξ = Σ−1−Σ−1(eee′pΣ−1eeep
)−1 eeepeee′pΣ−1. If (13) holds, then
ξξξ ′T Σ−1ξξξ TL−−−→
T→∞χ2
p−1
(δδδ ∗
′Ξδδδ ∗
).
For the proof, see the Appendix.
Based on Corollary 2, we establish Corollary 3 which gives the asymptotic power for the test Ψ .
Corollary 3 Under the conditions of Corollary 2, we have
limT→∞
ΠΨ
(θθθ +
δδδ√T
)= Pr
{χ2
p−1
(δδδ ∗
′Ξδδδ ∗
)> χ2
p−1;α
(δδδ ∗
′Ξδδδ ∗
)}.
The proof is straightforward by using Corollary 2.
It is well known that, even for normal distribution, the effective domain of risk dominance of PSE or
SE over MLE is a small neighborhood of the chosen pivot (viz., θ = θe′p) ; and as we make the observation
period T larger and larger, this domain becomes narrower. In the present context, under the assumed regularity
conditions, Corollary 2 shows that for any fixed θ 6= θeee′p,
10 S. Nkurunziza and S. E. Ahmed
ψ L−−−→T→∞
χ2p−1
(δδδ ∗
′Ξδδδ ∗
)and T−1ψ L−−−→
T→∞0 (14)
as such, the shrinkage factor cψ−1 = Op(T−1
), as T → ∞, so that asymptotically there is no shrinkage effect.
This justifies the choice of the usual Pitman type of alternatives given in (13).
Even for such local alternatives, the SE, PSE and θ̃ may not be asymptotically unbiased estimators of
θ . With that in mind, we introduce the following optimality criterion. For an estimator θ̂?
of θ , we confine
ourselves to a quadratic loss function of the form
L(
θ̂?,θ ;W
)=
[√T
(θ̂
?−θ)]′
W[√
T(
θ̂?−θ
)], (15)
where W is positive semi-definite (p.s.d). Using the distribution of√
T(
θ̂?−θ
)and taking the expected
value both sides of (15), we get the expected loss that would be called the quadratic risk RoT
(θ̂
?,θ ;W
)(=
trace(
WΣ̂ T
), where Σ̂ T is the dispersion matrix of
√T
(θ̂
?−θ)
. Whenever
limT→∞
Σ̂ T = Σ
exists, RoT
(θ̂
?,θ ;W
)→ Ro
(θ̂
?,θ ;W
)= trace(WΣ), which is termed the asymptotic risk. In our setup, we
denote the distribution of√
T(
θ̂?−θ
)by G̃T (u), u ∈ Rp. Suppose that G̃T → G̃ (at all points of continuity),
as T → ∞, and let Σ G̃ be the dispersion matrix of G̃. Then the ADR of θ̂?
is defined as
Ro(
θ̂?,θ ;W
)= trace(WΣ G̃) . (16)
Note that if√
T(
θ̂?−θ
)converges in second moment, then ADR is indeed the asymptotic risk. However, this
stronger mode of convergence may be difficult to prove analytically, especially for SE and PSE, and thus, we
shall work with the ADR results in the ongoing discussions.
The shrinkage estimators are, in general, biased estimators. Nevertheless, the bias is accompanied by reduc-
tion in risk, and hence, it does not have serious impact on risk assessment. In this vein, we define the asymptotic
bias as
B0T
(θ̂
?,θ
)= E
[√T
(θ̂
?−θ)]
, (17)
Shrinkage Estimator of the drift parameters 11
and side by side, the asymptotic distributional bias (ADB) as the limit
B0T
(θ̂
?,θ
)=
∫. . .
∫xdG̃T (x) −−−→
T→∞
(B
(θ̂
?,θ
)=
∫. . .
∫xdG̃(x)
). (18)
Two central key results to the study of ADR and ADB of the shrinkage estimators are given in Proposition 3
and Corollary 2. With these results intact, we are in a position to use the results on the (normal distribution)
parametric model, and thereby give at the main results of this subsection. We present (without derivation)
the results on SE and PSE. The proofs are similar to that given in Ahmed (1992). Let ∆ = δδδ ∗′Ξδδδ ∗ and let
Hν(x ;∆) = P{χ2ν (∆)≤ x}, x ∈ R+.
Theorem 1 The ADB functions of the estimators are given as follows :
B(
θ̂ ,θ)
= 0, B(
θ̃ ,θ)
=−δ , B(
θ̂S,θ
)=−δ (p−3)E{χ−2
p+1(∆)
B(
θ̂S+
,β1
)= −δ
[Hp+1(p−3;∆)+E{χ−2
p+1(∆)I(
χ2p+1(∆)
)> (p−3))
].
(19)
Since for the ADB of θ̃ , θ̂S
and θ̂S+
, the component δ is common and they differ only by scalar factors,
it suffices to compare the scalar factors ∆ only. It is clear that bias of the θ̃ is an unbounded function of ∆ . On
the other hand, the ADB of both θ̂S
and θ̂S+
are bounded in ∆ . Noting that since E{χ−2p+1(∆)} is a decreasing
log-convex function of ∆ the ADB of θ̂S
starts from the origin at ∆ = 0, increases to a maximum, and then
decreases towards 0. The characteristic θ̂S+
is similar to θ̂S. Interestingly, the bias curve of θ̂
S+remains below
the curve of SE for all values of ∆ . The simulation results in the next section are also pointing in that direction.
Theorem 2 The ADR functions of the estimators are given as follows :
R(
θ̂ ,θ ;W)
= trace(WΣ),
R(
θ̃ ,θ ;W)
= trace(WΣ)− trace(WΣ−)+δ ∗′Wδ ∗, Σ− = Σ − epe′p,
R(
θ̂S,θ ;W
)= ADR(θ̂)+δ ∗′Wδ ∗(p2−3)E(χ−4
p+3(∆))
−(p−3)trace(WΣ−){2E(χ−2p+1(∆))− (p−3)E(χ−4
p+1(∆))},
(20)
12 S. Nkurunziza and S. E. Ahmed
R(
θ̂S+
,θ ;W)
= ADR(θ̂S)+(p−3)trace(WΣ−)
×[2E
{χ−2
p+1(∆)I(χ2p+1(∆)≤ (p−3)
}
−(p−3)E{
χ−4p+1(∆)I(χ2
p+1(∆)≤ (p−3)}]
−trace(WΣ)Hp+1(p−2;∆)
+δ ∗′Wδ ∗{
2Hp+1(p−3;∆)−Hp+3(p−3;∆)}
−(p−3)δ ∗′Wδ ∗[2E{χ−2
p+1(∆))I(χ2p+1(∆)≤ (p−3)}
−2E{χ−2p+3(∆)I(χ2
p+3(∆)≤ (p−3)}
+(p−2)E {χ−4p+3(∆)I(χ2
p+3(∆)≤ (p−2)}].
(21)
The proof is similar to that given in Ahmed (1992). For a suitable choice of the matrix W, risk dominance
of the estimators are similar to those under normal theory and can be summarized as follows :
– (i) Clearly
R(θ̂S,θ ;W) < trace(WΣ) for all ∆ ∈ [0,∞),
hence providing greater estimation accuracy than MLE, a gold standard. Indeed, the ADR function of
SE is monotone in ∆ , the smallest value is achieved at ∆ = 0 and the largest is trace(WΣ). Hence, θ̂S
outperforms θ̂ and is an admissible estimator when compared with θ̂ .
– (ii) θ̂S+
asymptotically superior to θ̂S
in the entire parameter space induced by ∆ . Therefore, θ̂S+
is also
superior to the classical MLE.
– (iii) Importantly, θ̂S+
does not inherent over-shrinking problem.
For practical reasons we conduct an extensive simulation study to investigate the performance of the propo-
sed estimators for moderate values of the observation period. This study is not available in reviewed literature
for the shrinkage procedures in the present context. Our simulation experiments provide strong evidence that
corroborates with the asymptotic theory and confirms the validity of the SES.
Shrinkage Estimator of the drift parameters 13
3 Simulation Study
In this section, we carry out a Monte Carlo simulation study to examine risk (namely MSE) performance
of the all estimators under consideration.
In order to conserve space, we report detailed results only for p = 3 and p = 5. The UPI (pivot) is given by
the null hypothesis H0 : θ = θeee′p. The length of the time period of observation T = 30,T = 50 are considered.
The relative efficiency of the estimators with respect to θ̂ is defined by
RMSE =risk
(θ̂)
risk(proposed estimator).
In particular, we define
RMSE_U =risk
(θ̂)
risk(
θ̂) , RMSE_R =
risk(
θ̂)
risk(
θ̃) ,
RMSE_S =risk
(θ̂)
risk(
θ̂S) , RMSE_S_plus =
risk(
θ̂)
risk(
θ̂S+) .
Thus, a relative efficiency larger than one indicates the degree of superiority of the estimator over θ̂ . The results
are graphically reported in Figure 1(a)-1(b) for p = 3 and in Figure 1(c)-1(d) for p = 5. Graphically, Figure 1
indicates that a reduction of 50% or more in the risk seem quite realistic depending on the values of ∆ and p.
14 S. Nkurunziza and S. E. Ahmed
0 0.5 1 1.5 2 2.5 3 3.5 4
1
1.5
2
2.5
3
3.5
4
4.5
Delta
RM
SE
RMSEU
RMSER
RMSES
RMSESp
lus
(a) T = 30 and p = 3
0 0.5 1 1.5 2 2.5 30.5
1
1.5
2
2.5
3
Delta
RM
SE
RMSEU
RMSER
RMSES
RMSESp
lus
(b) T = 50 and p = 3
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
1
2
3
4
5
6
Delta
RM
SE
RMSEU
RMSER
RMSES
RMSESp
lus
(c) T = 30 and p = 5
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
1
2
3
4
5
6
Delta
RM
SE
RMSEU
RMSER
RMSES
RMSESp
lus
(d) T = 30 and p = 5
FIG. 1 Relative efficiency vs ∆
The salient features given by these simulations can be summarized as follows (with few exceptions that are
detailed later) :
(a) The behavior of the SES is quite stable and does always have risk-dominance over the MLE around the
pivot, as theoretically expected.
Shrinkage Estimator of the drift parameters 15
(b) Comparing the above Figure 1(a)-1(b) to the following Figure 1(c)-1(d), one can see that higher is
the parameter dimensions better pronounced is the risk dominance of the shrinkage estimators over the
MLE.
(c) The efficiency of θ̃ converges to 0 as the normalized distance between true values of drift parameters
and hypothesized values tends to grow. However, the performance of θ̃ is superior to shrinkage estimators
around the pivot.
4 Recommendations and concluding remarks
In this paper, we consider the inference problem concerning the drift parameter vector of a multivariate
Ornstein-Uhlenbeck process. This model have a great application in finance, specifically in modelling the term
structure of interest rates in macro-economic context. It is natural to establish inference strategies for the drift
parameters of this model. Our contribution consists in suggesting shrinkage estimators that improve drastically
the performance of classical maximum likelihood estimator. This is demonstrated by theoretical results as well
as numerical results.
In summary, it is concluded that the positive-part shrinkage estimator dominates the usual shrinkage esti-
mator. Further, both shrinkage estimators outperform the MLE in the entire parameter space. We recommend
the use of PSE over SE and MLE. We re-emphasize here that θ̂S
should not be used since it is not a convex
combination of restricted and unrestricted methods and also changes the sign of the baseline estimator. One
may agree on adjusting the magnitudes of the usual estimator, but change of sign is somewhat a serious matter
and it would make a practitioner rather uncomfortable. For this reason we have introduced SE as a tool for de-
veloping θ̂S. Our simulation studies have provided strong evidence that corroborates with the said asymptotic
theory related to proposed SES.
Finally, our numerical findings indicate that a reduction of 50% or more in the risk seem quite realistic de-
pending on the values of ∆ and p (see Figure 1). Like the statistical models underlying the statistical inferences
16 S. Nkurunziza and S. E. Ahmed
to be made, the homogeneity of the drift parameters will be susceptible to uncertainty and the practitioners may
be reluctant to impose the additional information regarding parameters in the estimation process. The suggested
SES is robust in the sense that it still work better when such constraint may not hold. More importantly, the
estimation strategy combining the sample and parameter information is superior to a strategy based on sample
information only. Research on statistical implications of these and other estimators for a range of statistical
models is ongoing.
Acknowledgements This research received financial support from Natural Sciences and Engineering Research Council of canada.
Appendix
A Some Theoretical results
Let µWWW be the measure induced by the Wiener process {WWW (t), t ≥ 0}. Also, let us denote by µU the probability measure induced
by process {U(t), t ≥ 0}.
µU (B) = P{ω : Ut(ω) ∈ B}, where B is a Borel set.
The following result plays a central role in establishing the MLE of θθθ .
Proposition 4 Conditionally to XXX0, the Radon-Nicodym derivative of µXXX with respect to µWWW is given by
dµXXX
dµWWW(XXX) = exp
{−θθθ ′Σ−1UUUT − 1
2θθθ ′Σ−1DDDT θθθ
}. (22)
The proof is straightforward by applying Theorem (7.7) in Liptser and Shiryayev (1977). It should be noted that, the relation (22)
has the same form as the equation 17.24 of Liptser and Shiryayev (1977) for the univariate case with the non-random initial value.
A.1 Asymptotic Properties of MLE
In univariate case, Lipter and Shiryayev (1978, Theorem 17.3, Lemma 17.3 and Theorem 17.4) and Kutoyants (2004) give
some asymptotic results for the maximum likelihood estimator of the drift parameter of an Ornstein-Uhlenbech process. The ideas
of proof are the same as given in the quoted references. Nevertheless, for the completeness, we outline the proof of the strong
consistency.
Shrinkage Estimator of the drift parameters 17
Proposition 5 (Strong consistency) Assume that the model (5) holds.
(i) Then,
Pr{
limT→∞
θ̂θθ = θθθ}
= 1.
(ii) If θ1 = θ2 = · · ·= θp, then
Pr{
limT→∞
θ̃θθ = θ̄eeep
}= 1.
where θ̄ as the common drift parameter.
Proof For any matrix A, we denote by
‖A‖2 = trace(AA′).
(i) Let
M(T ) =(∫ T
0X1(t)dW1(t),
∫ T
0X2(t)dW2(t), . . . ,
∫ T
0Xp(t)dWp(t)
)′.
By some computations, we have
UUUT =−DDDT θθθ −M(T )Σ12 .
Therefore,
θ̂θθ∗−θθθ = DDD−1
T M(T )Σ12 . (23)
Obviously, M(T ) is a martingale whose quadratic variation is DDDT . Moreover, one can verify that
Pr{
limT→∞
‖DDDT ‖= ∞}
= 1,
and, for any column vector aaa, the process aaa′DDDT aaa is nondecreasing in T . Hence, by strong law of large number for martingale,
we get
θ̂θθ∗−θθθ a.s.−−−→
T→∞000,
that imply that θ̂θθ is strongly consistent for θθθ and this completes the proof of (i).
(ii) In the similar way, we prove that
θ̃θθ −θθθ a.s.−−−→T→∞
000,
which completes the proof.
¤
Proposition 6 (Asymptotic normality) Assume that the model (5) holds and suppose that XXX0 has the same moment as the invariant
distribution.
18 S. Nkurunziza and S. E. Ahmed
(i) Then,
√T
(θ̂θθ −θθθ
)L−−−→
T→∞Np (000, Σ) , and T
(θ̂θθ −θθθ
)′Σ−1
(θ̂θθ −θθθ
)L−−−→
T→∞χ2
p .
(ii) If θ1 = θ2 = · · ·= θp, then
√T
(θ̃θθ − θ̄eeep
)L−−−→
T→∞Np
(000, eeepeee′p
(eee′pΣ−1eeep
)−1)
.
The proof follows by applying Proposition 1.34 or Theorem 2.8 of Kutoyants (2004, p. 61 and p. 121). Let
ρρρT =√
T(
θ̂θθ − θ̄eeep
), ξξξ T =
√T
(θ̂θθ − θ̃θθ
), ζT =
√T
(θ̃θθ − θ̄eeep
).
ϒ =(eee′pΣ−1eeep
)−1eeepeee′p, and let Σ∗ = Σ −ϒ .
Proposition 7 Assume that Proposition 6 holds. Under H0 given in (6), we have
(i)
ρρρT
ξξξ T
L−−−→
T→∞N2p
000
000
,
Σ Σ∗
Σ∗ Σ∗
.
(ii)
ζζζ T
ξξξ T
L−−−→
T→∞N2p
000
000
,
ϒ 000
000 Σ∗
.
Proof
(i) By some computations, we get under the null hypothesis
(ρρρ ′T ,ξξξ ′T
)′=
(Ip, Ip−
(eee′pΣ−1DDDT eeep
)−1DDDT Σ−1eeepeee′p
)′√T
(θ̂θθ −θθθ
).
By combining Proposition 3 and Slutsky Theorem we get the result stated in (i).
(ii) In the similar way as in (i), we get under the null hypothesis,
ζζζ T
ξξξ T
=
(eee′pΣ−1DDDT eeep
)−1 eeepeee′pΣ−1DDDT
Ip−(eee′pΣ−1DDDT eeep
)−1 eeepeee′pΣ−1DDDT
√
T(
θ̂θθ −θθθ)
.
Again, by combining Proposition 3 and Slutsky Theorem we get the result stated in (ii), which completes the proof.
¤
Corollary 4 Under the conditions of Proposition 7 and under H0,
ξξξ ′T Σ−1ξξξ TL−−−→
T→∞χ2
p−1.
Shrinkage Estimator of the drift parameters 19
Proof
ξξξ ′T Σ−1ξξξ T = ρρρ ′T(
Σ−1−Σ−1 (eee′pΣ−1eeep
)−1eeepeee′pΣ−1
)ρρρT +ρρρ ′T (ΞT −Ξ)ρρρT , (24)
where
ΞT =(
Ip−DDDT Σ−1eeep(eee′pΣ−1DDDT eeep
)−1)
Σ−1(
Ip−(eee′pΣ−1DDDT eeep
)−1eee′pΣ−1DDDT
),
and
Ξ = Σ−1p −Σ−1 (
eee′pΣ−1eeep)−1
eeepeee′pΣ−1.
Combining Proposition 4 and Slutsky theorem, we deduce that,
ρρρ ′T [ΞT −Ξ ]ρρρTL−−−→
T→∞000.
Moreover,
Σ(
Σ−1−Σ−1 (eee′pΣ−1eeep
)−1eeepeee′pΣ−1
)= ΣΞ
is an idempotent matrix and hence, using Proposition 4
ZZZ′(
Σ−1−Σ−1 (eee′pΣ−1eeep
)−1eeepeee′pΣ−1
)ZZZ = ZZZ′ΞZZZ ∼ χ2
r ,
where
r = rank(Ξ) = trace(
Σ(
Σ−1−Σ−1 (eee′pΣ−1eeep
)−1eeepeee′pΣ−1
))= p−1.
¤
Proof of Proposition 2
First, by Îto formula, we prove that
σ̂2i = σ2 +
X2i (0)T
, for i = 1,2, . . . , p. (25)
Hence, for i = 1,2, . . . , p,
Pr{
limT→∞
σ̂2i = σ2
i + limT→∞
X2i (0)T
}= 1 and then, Pr
{lim
T→∞σ̂2
i = σ2i
}= 1,
that proves the first statement of Proposition 2. Second, from (25), we get
limT→∞
E{
max16i6p
∣∣σ̂2i −σ2
i∣∣2
}= lim
T→∞
1T
E{
max16i6p
|Xi(0)|2}
6 pτ2 limT→∞
1T
= 0,
that completes the proof. ¤
Proof of Proposition 3
20 S. Nkurunziza and S. E. Ahmed
(i) By some computations, we get
(ρρρ ′T ,ξξξ ′T
)′=
(Ip, Ip−
(eee′pΣ−1DDDT eeep
)−1DDDT Σ−1eeepeee′p
)′√T
(θ̂θθ −θθθ
)
+(
Ip, Ip−(eee′pΣ−1DDDT eeep
)−1DDDT Σ−1eeepeee′p
)′δδδ .
By combining Proposition 7 and Slutsky Theorem we get the result stated in (i).
(ii) In the similar way as in (i), we get
ζζζ T
ξξξ T
=
(eee′pΣ−1DDDT eeep
)−1 eeepeee′pΣ−1DDDT
Ip−(eee′pΣ−1DDDT eeep
)−1 eeepeee′pΣ−1DDDT
√
T(
θ̂θθ −θθθ)
+
(eee′pΣ−1DDDT eeep
)−1 eeepeee′pΣ−1DDDT
Ip−(eee′pΣ−1DDDT eeep
)−1 eeepeee′pΣ−1DDDT
δδδ .
Again, by combining Proposition 7 and Slutsky Theorem we get the result stated in (ii) and that completes the proof. ¤
References
Abu-Mostafa, Y. S. (2001). Financial Model Calibration Using Consistency Hints. IEEE transactions on neural networks 12, 4,
791-808.
Ahmed, S. E.(1992). Large-sample pooling procedure for correlation. The Statistician 41, 425-438.
Ahmed, S. E. and Saleh, A. K. Md. E. (1999). Improved nonparametric estimation of location vector in a multivariate regression
model . Nonparametric Statistics. 11, 51-78.
Ahmed, S. E. (2001). Shrinkage estimation of regression coefficients from censored data with multiple observations. In S.E.
Ahmed and N. Reid (Eds.), Empirical Bayes and and Likelihood inference (pp. 103-120). NewYork :Springer.
Ahmed, S. E.(2002). Simultaneous estimation of coefficient of variations. Journal of Statistical Planning and Inference, 104,
31-51.
Ahmed, S. E.(2005). Assessing Process Capability Index for Nonnormal Processes. Journal of Statistical Planning and Infe-
rence, 129, 195-206.
Basawa, V. Ishwar, and B. L. S. Rao, P. B. L. S. (1980). Statistical inference for stochastic processes. London : Academic Press.
Engen, S. and Sæther, B. E. (2005). Generalizations of the Moran Effect Explaining Spatial Synchrony in Population Fluctua-
tions. The American Naturalist, 166, 603–612.
Engen, S., Lande, R., Wall, T. and DeVries, J. P. (2002). Analyzing Spatial Structure of Communities Using the Two-Dimensional
Poisson Lognormal Species Abundance Model. The American Naturalist, 160,1, 60–73.
Shrinkage Estimator of the drift parameters 21
Georges, P. (2003). The Vasicek and CIR Models and the Expectation Hypothesis of the Interest Rate Term Structure. The Bank
of Canada and Department of Finance, Canada.
Judge, G. G. and M. E. Bock (1978). The statistical implications of pre-test and Stein-rule estimators in econometrics. Amster-
dam :North Holland.
Jurec̆ková, J. and P. K. Sen (1996). Robust Statistical Procedures. New York :Wiley.
Kubokawa, T. (1998). The Stein phenomenon in simultaneous estimation : A review. In S.E. Ahmed et al. (Eds.), Nonparametric
Statistics and Related Topics (pp. 143-173). NewYork : Nova Science.
Kutoyants, A. Y. (2004). Statistical Inference for Ergodic Diffusion Processes. New York : Springer.
Liptser, R. S. and A. N., Shiryayev (1977). Statistics of Random Processes : Generale Theory, Vol. I. New York : Springer-
Verlag.
Liptser, R. S. and A. N., Shiryayev (1978). Statistics of Random Processes : Applications, Vol. II. New York : Springer-Verlag.
Sclove, S. L., C. Morris, R. Radhakrishnan (1972). Non-optimality of preliminary test estimators for the mean of a multivariate
normal distribution. Annals of Mathematical Statistics 43, 1481-1490.
Sen, P.K. (1986). On the asymptotic distributional risks of shrinkage and preliminary test versions of maximum likelihood
estimators. Sankhya, Series A, 48, 354-371.
R. Schöbel, R. and Zhu, J. (1999). Stochastic Volatility With an Ornstein-Uhlenbeck Process : An Extension. European Finance
Review 3, 23-46.
Steele, J. M. (2001). Stochastic calculus and financial applications. New York : Springer-Verlag.
Stigler, S. M. (1990). The 1988 Neyman Memorial Lecture : A Galtonian perspective on shrinkage estimators. Statistical
Science 5, 147-155.
Langetieg, T. C. (1980). A Multivariate Model of the Term Structure. The Journal of Finance, 35, 1, 71-97.
Vasicek, O. (1977). An Equilibrium Characterization of the Term Structure. Journal of Financial Economics 5, 177-188.