Supplement to
“On High-dimensional Sign Tests”
DAVY PAINDAVEINE 1,* and THOMAS VERDEBOUT 2,**
1 ECARES and Departement de Mathematiques, Universite Libre de Bruxelles. E-mail: *[email protected]
2 INRIA and Laboratoire EQUIPPE, Universite Lille 3. E-mail: **[email protected]
Below, Theorem M-2.4, Lemma M-B.1, etc. refer to Theorem 2.4, Lemma B.1, etc. from Paindaveine and Verdebout (2014). Unless otherwise stated, other cross-references relate to this supplement itself.
1. Proof of Theorem M-2.4
To prove Theorem M-2.4, we define $\mathcal{F}_{n\ell}$ as the σ-algebra generated by $(X_{n1}', Y_{n1}')', \ldots, (X_{n\ell}', Y_{n\ell}')'$, and we let
$$
D^I_{n\ell} = \mathrm{E}_{n\ell}\big[I^{(n)}_{N,p,q}\big] - \mathrm{E}_{n,\ell-1}\big[I^{(n)}_{N,p,q}\big]
= \frac{\sqrt{2 p_n q_n}}{n} \sum_{i=1}^{\ell-1} (U_{ni}'U_{n\ell})(V_{ni}'V_{n\ell}),
$$
where $\mathrm{E}_{n\ell}$ still denotes conditional expectation with respect to $\mathcal{F}_{n\ell}$. Clearly, we have that $I^{(n)}_{N,p,q} = \sum_{\ell=1}^{n} D^I_{n\ell}$, where $D^I_{n\ell}$ is almost surely bounded, hence has a finite variance. The asymptotic normality result in Theorem M-2.4 will then follow from Theorem M-B.1 provided that we establish the following two lemmas (in the proofs, we use the same notational shortcuts as in those of Lemmas M-B.1 to M-B.4).
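Lemma 1.1. Letting $\sigma^2_{n\ell} = \mathrm{E}_{n,\ell-1}\big[(D^I_{n\ell})^2\big]$, $\sum_{\ell=1}^{n} \sigma^2_{n\ell}$ converges to one in quadratic mean as $n\to\infty$.

Lemma 1.2. For any $\varepsilon > 0$, $\sum_{\ell=1}^{n} \mathrm{E}\big[(D^I_{n\ell})^2\, \mathbb{I}[|D^I_{n\ell}| > \varepsilon]\big] \to 0$ as $n\to\infty$.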
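Proof of Lemma 1.1. The identity $(\mathrm{vec}\,A)'(\mathrm{vec}\,B) = \mathrm{Trace}[A'B]$ allows us to write $(U_i'U_\ell)(V_i'V_\ell)$ as $(\mathrm{vec}(U_iV_i'))'\,\mathrm{vec}(U_\ell V_\ell')$, which yields
$$
\sigma^2_{n\ell} = \mathrm{E}_{\ell-1}\big[(D^I_{n\ell})^2\big]
= \frac{2 p_n q_n}{n^2} \sum_{i,j=1}^{\ell-1} (\mathrm{vec}(U_iV_i'))'\, \mathrm{E}\big[\mathrm{vec}(U_\ell V_\ell')(\mathrm{vec}(U_\ell V_\ell'))'\big]\, \mathrm{vec}(U_jV_j').
$$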
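The rewriting step above can be checked numerically. The following is only an illustrative sketch: it assumes NumPy, and the dimensions `p = 5`, `q = 3` and the helper names `unit` and `vec` are ours, not the paper's. It compares both sides of the trace identity and of its special case for arbitrary unit vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

def unit(dim):
    """Draw a vector uniformly on the unit sphere of R^dim."""
    x = rng.standard_normal(dim)
    return x / np.linalg.norm(x)

def vec(a):
    """Stack the columns of a matrix into one long vector."""
    return a.reshape(-1, order="F")

p, q = 5, 3  # illustrative dimensions (assumption, not from the paper)
u_i, u_l = unit(p), unit(p)
v_i, v_l = unit(q), unit(q)

# General identity (vec A)'(vec B) = Trace[A'B] on arbitrary p x q matrices.
a, b = rng.standard_normal((p, q)), rng.standard_normal((p, q))
trace_gap = abs(vec(a) @ vec(b) - np.trace(a.T @ b))

# Special case used in the proof: A = U_i V_i', B = U_l V_l'.
lhs = (u_i @ u_l) * (v_i @ v_l)
rhs = vec(np.outer(u_i, v_i)) @ vec(np.outer(u_l, v_l))
sign_gap = abs(lhs - rhs)

print(trace_gap, sign_gap)  # both should be at floating-point noise level
```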
1imsart-bj ver. 2014/02/20 file: TSWLatexianTemp_000002.tex date: October 14, 2015
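Using Lemma M-A.2(iv), we obtain
$$
\sigma^2_{n\ell} = \frac{2}{n^2} \sum_{i,j=1}^{\ell-1} (\mathrm{vec}(U_iV_i'))'\, \mathrm{vec}(U_jV_j')
= \frac{2}{n^2} \sum_{i,j=1}^{\ell-1} \rho^U_{ij}\rho^V_{ij},
$$
where we let $\rho^U_{ij} = \rho^U_{n,ij} = U_{ni}'U_{nj}$ and $\rho^V_{ij} = \rho^V_{n,ij} = V_{ni}'V_{nj}$. Lemma M-A.1(iii) then entails
$$
\mathrm{E}\big[\sigma^2_{n\ell}\big] = \frac{2(\ell-1)}{n^2} \quad \Big( = \mathrm{E}\big[(D^I_{n\ell})^2\big] \Big). \tag{1.1}
$$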
Therefore, as $n\to\infty$,
$$
\mathrm{E}\Big[\sum_{\ell=1}^{n} \sigma^2_{n\ell}\Big] = \frac{n-1}{n} \to 1.
$$
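From the pairwise independence of the $\rho^U_{ij}$'s and of the $\rho^V_{ij}$'s and the fact that the $\rho^U_{ij}$'s are independent of the $\rho^V_{ij}$'s, we have
$$
\begin{aligned}
\mathrm{Var}\Big[\sum_{\ell=1}^{n} \sigma^2_{n\ell}\Big]
&= \frac{16}{n^4}\, \mathrm{Var}\Big[\sum_{\ell=3}^{n} \sum_{1\le i<j\le \ell-1} \rho^U_{ij}\rho^V_{ij}\Big] \\
&= \frac{16}{n^4}\, \mathrm{Var}\Big[\sum_{1\le i<j\le n} (n-j)\, \rho^U_{ij}\rho^V_{ij}\Big] \\
&= \frac{16}{n^4}\, \mathrm{E}\big[(\rho^U_{ij})^2\big]\, \mathrm{E}\big[(\rho^V_{ij})^2\big] \sum_{1\le i<j\le n} (n-j)^2.
\end{aligned}
$$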
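Lemma M-A.1(iii) then yields
$$
\begin{aligned}
\mathrm{Var}\Big[\sum_{\ell=1}^{n} \sigma^2_{n\ell}\Big]
&= \frac{16}{p_n q_n n^4} \sum_{1\le i<j\le n} (n-j)^2
= \frac{16}{p_n q_n n^4} \sum_{j=2}^{n} (j-1)(n-j)^2 \\
&= \frac{16}{p_n q_n n^4} \sum_{j=1}^{n-1} j(n-j-1)^2
\le \frac{16}{p_n q_n n^2} \sum_{j=1}^{n-1} j
= \frac{8(n-1)}{p_n q_n n},
\end{aligned}
$$
which is $o(1)$ as $n\to\infty$. The result follows. □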
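The index manipulations in this last chain are easy to get wrong, so a small exact check may help. The sketch below is an illustration only (the function names and the test values of n, p, q are our own choices): it verifies with integer and rational arithmetic that the sum over pairs equals the shifted single sum, and that the exact variance expression stays below the final bound 8(n − 1)/(p q n).

```python
from fractions import Fraction

def pair_sum(n):
    """Sum of (n - j)^2 over all pairs 1 <= i < j <= n."""
    return sum((n - j) ** 2 for j in range(2, n + 1) for i in range(1, j))

def shifted_sum(n):
    """Sum of j (n - j - 1)^2 over j = 1, ..., n - 1."""
    return sum(j * (n - j - 1) ** 2 for j in range(1, n))

def variance_exact(n, p, q):
    """Exact expression 16 / (p q n^4) times the pair sum."""
    return Fraction(16 * pair_sum(n), p * q * n ** 4)

def variance_bound(n, p, q):
    """Final upper bound 8 (n - 1) / (p q n)."""
    return Fraction(8 * (n - 1), p * q * n)

# Illustrative regime p = q = n (an assumption made only for this printout).
for n in (3, 10, 50):
    same = pair_sum(n) == shifted_sum(n)
    below = variance_exact(n, n, n) <= variance_bound(n, n, n)
    print(n, same, below, float(variance_bound(n, n, n)))
```

The printed bound shrinks as p q grows, which is the sense in which the variance is negligible here.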
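Proof of Lemma 1.2. Proceeding as in the proofs of Lemmas M-B.2 and M-B.4, we obtain
$$
\begin{aligned}
\sum_{\ell=1}^{n} \mathrm{E}\big[(D^I_{n\ell})^2\, \mathbb{I}[|D^I_{n\ell}| > \varepsilon]\big]
&\le \sum_{\ell=1}^{n} \sqrt{\mathrm{E}\big[(D^I_{n\ell})^4\big]}\, \sqrt{\mathrm{P}\big[|D^I_{n\ell}| > \varepsilon\big]} \\
&\le \frac{1}{\varepsilon} \sum_{\ell=1}^{n} \sqrt{\mathrm{E}\big[(D^I_{n\ell})^4\big]}\, \sqrt{\mathrm{Var}\big[D^I_{n\ell}\big]}
\le \frac{\sqrt{2}}{\varepsilon n} \sum_{\ell=1}^{n} \sqrt{(\ell-1)\, \mathrm{E}\big[(D^I_{n\ell})^4\big]},
\end{aligned} \tag{1.2}
$$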
where we have used the fact that (see (1.1))
$$
\mathrm{Var}\big[D^I_{n\ell}\big] \le \mathrm{E}\big[(D^I_{n\ell})^2\big] = \frac{2(\ell-1)}{n^2}.
$$
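Now, Lemma M-A.1(iv) yields
$$
\begin{aligned}
\mathrm{E}\big[(D^I_{n\ell})^4\big]
&= \frac{4 p_n^2 q_n^2}{n^4} \sum_{i,j,r,s=1}^{\ell-1} \mathrm{E}\big[\rho^U_{i\ell}\rho^U_{j\ell}\rho^U_{r\ell}\rho^U_{s\ell}\, \rho^V_{i\ell}\rho^V_{j\ell}\rho^V_{r\ell}\rho^V_{s\ell}\big] \\
&= \frac{4 p_n^2 q_n^2}{n^4} \Big\{ (\ell-1)\, \mathrm{E}\big[(\rho^U_{i\ell}\rho^V_{i\ell})^4\big] + 3(\ell-1)(\ell-2) \big(\mathrm{E}\big[(\rho^U_{i\ell}\rho^V_{i\ell})^2\big]\big)^2 \Big\} \\
&\le \frac{4 p_n^2 q_n^2 (\ell-1)^2}{n^4} \Big\{ \mathrm{E}\big[(\rho^U_{i\ell}\rho^V_{i\ell})^4\big] + 3 \big(\mathrm{E}\big[(\rho^U_{i\ell}\rho^V_{i\ell})^2\big]\big)^2 \Big\}.
\end{aligned} \tag{1.3}
$$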
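Lemma M-A.1(iii) then allows us to evaluate the upper bound in (1.3), which yields
$$
\mathrm{E}\big[(D^I_{n\ell})^4\big]
\le \frac{4 p_n^2 q_n^2 (\ell-1)^2}{n^4} \Big\{ \frac{9}{p_n q_n (p_n+2)(q_n+2)} + \frac{3}{p_n^2 q_n^2} \Big\}
\le \frac{48(\ell-1)^2}{n^4},
$$
as $n\to\infty$. Plugging this into (1.2), we conclude that
$$
\sum_{\ell=1}^{n} \mathrm{E}\big[(D^I_{n\ell})^2\, \mathbb{I}[|D^I_{n\ell}| > \varepsilon]\big]
\le \frac{\sqrt{96}}{\varepsilon n^3} \sum_{\ell=1}^{n} (\ell-1)^{3/2},
$$
which, still by using Lemma M-A.3, is $O(n^{-1/2})$. The result follows. □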
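Two numerical facts carry this proof: the constant multiplying (ℓ − 1)²/n⁴ never exceeds 48, and the final sum decays like n^(-1/2). The sketch below is a hedged illustration (the function names and the chosen values of p, q, n and ε = 1 are assumptions of ours) that evaluates both quantities directly.

```python
import math

def moment_constant(p, q):
    """4 p^2 q^2 * {9 / (p q (p+2)(q+2)) + 3 / (p^2 q^2)}: the factor of (l-1)^2 / n^4."""
    return 4 * p**2 * q**2 * (9 / (p * q * (p + 2) * (q + 2)) + 3 / (p**2 * q**2))

def lindeberg_bound(n, eps):
    """sqrt(96) / (eps n^3) * sum over l = 1..n of (l - 1)^(3/2)."""
    total = sum((l - 1) ** 1.5 for l in range(1, n + 1))
    return math.sqrt(96) / (eps * n**3) * total

# The constant equals 36 p q / ((p+2)(q+2)) + 12, hence stays below 48.
for p, q in [(1, 1), (2, 3), (50, 50), (1000, 10)]:
    print(p, q, moment_constant(p, q))

# If the bound is O(n^(-1/2)), then sqrt(n) * bound should remain bounded.
for n in (100, 1000, 10000):
    print(n, math.sqrt(n) * lindeberg_bound(n, eps=1.0))
```

Since the sum of (ℓ − 1)^(3/2) is at most the integral of x^(3/2) over [0, n], that is (2/5) n^(5/2), the rescaled values printed in the second loop cannot exceed (2/5)√96.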
2. Proof of Theorem M-2.5
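Of course, it is sufficient to show that, under the assumptions of Theorem M-2.5,
$$
S^{\mathrm{St}}_n = \frac{p_n}{n} \sum_{1\le i<j\le n} \Big( (U_{ni}'U_{nj})^2 - \frac{1}{p_n} \Big)
$$
is asymptotically standard normal. To do so, we define $\mathcal{F}_{n\ell}$ as the σ-algebra generated by $X_{n1}, \ldots, X_{n\ell}$, and we let
$$
D^S_{n\ell} = \mathrm{E}_{n\ell}\big[S^{\mathrm{St}}_n\big] - \mathrm{E}_{n,\ell-1}\big[S^{\mathrm{St}}_n\big]
= \frac{p_n}{n} \sum_{i=1}^{\ell-1} \Big( (U_{ni}'U_{n\ell})^2 - \frac{1}{p_n} \Big),
$$
where $\mathrm{E}_{n\ell}$ still denotes conditional expectation with respect to $\mathcal{F}_{n\ell}$. Clearly, we have that $S^{\mathrm{St}}_n = \sum_{\ell=1}^{n} D^S_{n\ell}$, where $D^S_{n\ell}$ is almost surely bounded, hence has a finite variance. The asymptotic normality result in Theorem M-2.5 will then follow from Theorem M-B.1 if we prove the following two lemmas.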
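Lemma 2.1. Letting $\sigma^2_{n\ell} = \mathrm{E}_{n,\ell-1}\big[(D^S_{n\ell})^2\big]$, $\sum_{\ell=1}^{n} \sigma^2_{n\ell}$ converges to one in quadratic mean as $n\to\infty$.

Lemma 2.2. For any $\varepsilon > 0$, $\sum_{\ell=1}^{n} \mathrm{E}\big[(D^S_{n\ell})^2\, \mathbb{I}[|D^S_{n\ell}| > \varepsilon]\big] \to 0$ as $n\to\infty$.

We use the same notational shortcuts in the proofs of these lemmas as in those of Lemmas M-B.1 and M-B.2.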
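Proof of Lemma 2.1. Using the identity $(\mathrm{vec}\,A)'(\mathrm{vec}\,B) = \mathrm{Trace}[A'B]$, we obtain
$$
(U_i'U_\ell)^2 - \frac{1}{p_n} = (\mathrm{vec}(U_iU_i'))'\, J_p^{\perp}\, \mathrm{vec}(U_\ell U_\ell'),
$$
where the (symmetric and idempotent) matrix $J_p^{\perp} = I_{p^2} - \frac{1}{p} J_p$ is based on the matrix $J_p$ involved in Lemma M-A.2. Hence,
$$
\sigma^2_{n\ell} = \mathrm{E}_{\ell-1}\big[(D^S_{n\ell})^2\big]
= \frac{p_n^2}{n^2} \sum_{i,j=1}^{\ell-1} (\mathrm{vec}(U_iU_i'))'\, J_p^{\perp}\, \mathrm{E}\big[\mathrm{vec}(U_\ell U_\ell')(\mathrm{vec}(U_\ell U_\ell'))'\big]\, J_p^{\perp}\, \mathrm{vec}(U_jU_j').
$$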
Since $K_p\,\mathrm{vec}(A) = \mathrm{vec}(A')$ for any $p\times p$ matrix $A$, we have that $J_p^{\perp}\big(I_{p^2} + K_p + J_p\big)J_p^{\perp} = I_{p^2} + K_p - \frac{2}{p} J_p$. Lemma M-A.2(iii) then yields
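This matrix identity can be verified numerically for small p. The sketch below is an illustration under a stated assumption: Lemma M-A.2 is not reproduced in this supplement, so we take J_p = vec(I_p)(vec(I_p))', which is consistent with the properties used here (J_p^⊥ idempotent and (vec(UU'))' J_p vec(VV') = 1 for unit vectors). The helper names are ours.

```python
import numpy as np

def commutation_matrix(p):
    """K_p with K_p vec(A) = vec(A') for p x p matrices A (column-stacking vec)."""
    k = np.zeros((p * p, p * p))
    for i in range(p):
        for j in range(p):
            # A[i, j] sits at position i + j*p in vec(A) and at j + i*p in vec(A').
            k[j + i * p, i + j * p] = 1.0
    return k

def identity_gap(p):
    """Largest entrywise gap between both sides of the identity for dimension p."""
    eye = np.eye(p * p)
    vec_id = np.eye(p).reshape(-1, order="F")
    j = np.outer(vec_id, vec_id)  # assumed form of J_p (see the lead-in)
    k = commutation_matrix(p)
    j_perp = eye - j / p
    lhs = j_perp @ (eye + k + j) @ j_perp
    rhs = eye + k - 2.0 * j / p
    return float(np.max(np.abs(lhs - rhs)))

for p in (2, 3, 5):
    print(p, identity_gap(p))  # gaps should be at floating-point noise level
```

The identity rests on J_p^⊥ J_p = 0 and K_p J_p = J_p, both of which hold for this choice of J_p.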
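$$
\begin{aligned}
\sigma^2_{n\ell}
&= \frac{p_n}{(p_n+2) n^2} \sum_{i,j=1}^{\ell-1} (\mathrm{vec}(U_iU_i'))' \Big( I_{p^2} + K_p - \frac{2}{p} J_p \Big) \mathrm{vec}(U_jU_j') \\
&= \frac{2 p_n}{(p_n+2) n^2} \sum_{i,j=1}^{\ell-1} (\mathrm{vec}(U_iU_i'))'\, J_p^{\perp}\, \mathrm{vec}(U_jU_j')
= \frac{2 p_n}{(p_n+2) n^2} \sum_{i,j=1}^{\ell-1} \Big( \rho_{ij}^2 - \frac{1}{p_n} \Big).
\end{aligned}
$$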
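By applying Lemma M-A.1(iii), we then obtain
$$
\mathrm{E}\big[\sigma^2_{n\ell}\big] = \frac{2(p_n-1)(\ell-1)}{(p_n+2) n^2} \quad \Big( = \mathrm{E}\big[(D^S_{n\ell})^2\big] \Big). \tag{2.1}
$$
Therefore, as $n\to\infty$,
$$
\mathrm{E}\Big[\sum_{\ell=1}^{n} \sigma^2_{n\ell}\Big]
= \frac{2(p_n-1)}{(p_n+2) n^2} \sum_{\ell=1}^{n} (\ell-1)
= \frac{(p_n-1)(n-1)}{(p_n+2) n} \to 1.
$$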
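From the pairwise independence of the $\rho_{ij}$'s, we have
$$
\begin{aligned}
\mathrm{Var}\Big[\sum_{\ell=1}^{n} \sigma^2_{n\ell}\Big]
&= \frac{16 p_n^2}{(p_n+2)^2 n^4}\, \mathrm{Var}\Big[\sum_{\ell=3}^{n} \sum_{1\le i<j\le \ell-1} \Big( \rho_{ij}^2 - \frac{1}{p_n} \Big)\Big] \\
&= \frac{16 p_n^2}{(p_n+2)^2 n^4}\, \mathrm{Var}\Big[\sum_{1\le i<j\le n} (n-j) \Big( \rho_{ij}^2 - \frac{1}{p_n} \Big)\Big] \\
&= \frac{16 p_n^2}{(p_n+2)^2 n^4} \big( \mathrm{E}[\rho_{ij}^4] - \mathrm{E}^2[\rho_{ij}^2] \big) \sum_{1\le i<j\le n} (n-j)^2.
\end{aligned}
$$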
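Lemma M-A.1(iii) then yields
$$
\begin{aligned}
\mathrm{Var}\Big[\sum_{\ell=1}^{n} \sigma^2_{n\ell}\Big]
&= \frac{32(p_n-1)}{(p_n+2)^3 n^4} \sum_{1\le i<j\le n} (n-j)^2 \\
&= \frac{32(p_n-1)}{(p_n+2)^3 n^4} \sum_{j=2}^{n} (j-1)(n-j)^2
= \frac{32(p_n-1)}{(p_n+2)^3 n^4} \sum_{j=1}^{n-1} j(n-j-1)^2 \\
&\le \frac{32(p_n-1)}{(p_n+2)^3 n^2} \sum_{j=1}^{n-1} j
\le \frac{16(p_n-1)(n-1)}{(p_n+2)^3 n},
\end{aligned}
$$
which is $o(1)$ as $n\to\infty$. The result follows. □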
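The constant 32(p_n − 1)/(p_n + 2)³ comes from the variance of ρ²_ij. As a rough sanity check, the sketch below assumes the standard moments E[ρ²] = 1/p and E[ρ⁴] = 3/(p(p + 2)) for the inner product of two independent uniform unit vectors (we take these to be what Lemma M-A.1(iii) supplies, which this supplement does not restate) and compares them with a Monte Carlo estimate. The dimension `p = 6`, the sample size and the seed are arbitrary choices of ours.

```python
import numpy as np

def squared_cosines(p, size, rng):
    """Draw `size` independent values of rho^2 = (U'V)^2, with U and V uniform on the unit sphere of R^p."""
    x = rng.standard_normal((size, p))
    y = rng.standard_normal((size, p))
    u = x / np.linalg.norm(x, axis=1, keepdims=True)
    v = y / np.linalg.norm(y, axis=1, keepdims=True)
    return np.sum(u * v, axis=1) ** 2

p = 6  # illustrative dimension (assumption)
rng = np.random.default_rng(1)
rho2 = squared_cosines(p, 200_000, rng)

# Assumed closed-form moments: E[rho^2] = 1/p and E[rho^4] = 3 / (p (p + 2)).
exact_var = 3 / (p * (p + 2)) - 1 / p**2
closed_form = 2 * (p - 1) / (p**2 * (p + 2))
mc_var = rho2.var()

# Constant in front of n^(-4) * sum (n - j)^2, before and after simplification.
lhs_const = 16 * p**2 / (p + 2) ** 2 * exact_var
rhs_const = 32 * (p - 1) / (p + 2) ** 3

print(mc_var, closed_form, lhs_const, rhs_const)
```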
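Proof of Lemma 2.2. Proceeding as in the proofs of Lemmas M-B.2, M-B.4 and 1.2, we obtain
$$
\begin{aligned}
\sum_{\ell=1}^{n} \mathrm{E}\big[(D^S_{n\ell})^2\, \mathbb{I}[|D^S_{n\ell}| > \varepsilon]\big]
&\le \sum_{\ell=1}^{n} \sqrt{\mathrm{E}\big[(D^S_{n\ell})^4\big]}\, \sqrt{\mathrm{P}\big[|D^S_{n\ell}| > \varepsilon\big]} \\
&\le \frac{1}{\varepsilon} \sum_{\ell=1}^{n} \sqrt{\mathrm{E}\big[(D^S_{n\ell})^4\big]}\, \sqrt{\mathrm{Var}\big[D^S_{n\ell}\big]}
\le \frac{\sqrt{2(p_n-1)}}{\varepsilon n \sqrt{p_n+2}} \sum_{\ell=1}^{n} \sqrt{(\ell-1)\, \mathrm{E}\big[(D^S_{n\ell})^4\big]},
\end{aligned} \tag{2.2}
$$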
where we have used the fact that (see (2.1))
$$
\mathrm{Var}\big[D^S_{n\ell}\big] \le \mathrm{E}\big[(D^S_{n\ell})^2\big] = \frac{2(p_n-1)(\ell-1)}{(p_n+2) n^2}.
$$
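Now, Lemma M-A.1(iv) yields
$$
\begin{aligned}
\mathrm{E}\big[(D^S_{n\ell})^4\big]
&= \frac{p_n^4}{n^4} \sum_{i,j,r,s=1}^{\ell-1} \mathrm{E}\Big[ \Big(\rho_{i\ell}^2 - \frac{1}{p_n}\Big) \Big(\rho_{j\ell}^2 - \frac{1}{p_n}\Big) \Big(\rho_{r\ell}^2 - \frac{1}{p_n}\Big) \Big(\rho_{s\ell}^2 - \frac{1}{p_n}\Big) \Big] \\
&= \frac{p_n^4}{n^4} \Big\{ (\ell-1)\, \mathrm{E}\Big[\Big(\rho_{i\ell}^2 - \frac{1}{p_n}\Big)^4\Big] + 3(\ell-1)(\ell-2) \Big( \mathrm{E}\Big[\Big(\rho_{i\ell}^2 - \frac{1}{p_n}\Big)^2\Big] \Big)^2 \Big\} \\
&\le \frac{p_n^4 (\ell-1)^2}{n^4} \Big\{ \mathrm{E}\Big[\Big(\rho_{i\ell}^2 - \frac{1}{p_n}\Big)^4\Big] + 3 \Big( \mathrm{E}\Big[\Big(\rho_{i\ell}^2 - \frac{1}{p_n}\Big)^2\Big] \Big)^2 \Big\}.
\end{aligned} \tag{2.3}
$$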
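Lemma M-A.1(iii) allows us, after tedious computations, to evaluate the upper bound in (2.3), which yields
$$
\mathrm{E}\big[(D^S_{n\ell})^4\big]
\le \frac{p_n^4 (\ell-1)^2}{n^4} \times \frac{72(p_n-1)(p_n+1)}{p_n^2 (p_n+2)^2 (p_n+4)(p_n+6)}
= \frac{(\ell-1)^2}{n^4}\, O(1),
$$
as $n\to\infty$. Plugging this into (2.2), we conclude that
$$
\sum_{\ell=1}^{n} \mathrm{E}\big[(D^S_{n\ell})^2\, \mathbb{I}[|D^S_{n\ell}| > \varepsilon]\big]
\le \frac{\sqrt{2(p_n-1)}}{\varepsilon n^3 \sqrt{p_n+2}}\, O(1) \sum_{\ell=1}^{n} (\ell-1)^{3/2},
$$
which, still by using Lemma M-A.3, is $O(n^{-1/2})$. The result follows. □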
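The "tedious computations" behind the closed form 72(p_n − 1)(p_n + 1) / (p_n²(p_n + 2)²(p_n + 4)(p_n + 6)) can be reproduced with exact rational arithmetic. The sketch below is an illustration, not the authors' derivation: it assumes the standard even moments E[ρ^(2k)] = ∏_{i<k} (2i + 1)/(p + 2i) of the inner product of two independent uniform unit vectors in R^p (not restated in this supplement), and the function names are ours.

```python
from fractions import Fraction
from math import comb

def even_moment(p, k):
    """Assumed moment E[rho^(2k)] = prod_{i<k} (2i + 1) / (p + 2i)."""
    value = Fraction(1)
    for i in range(k):
        value *= Fraction(2 * i + 1, p + 2 * i)
    return value

def central_moment(p, order):
    """E[(rho^2 - 1/p)^order], expanded with the binomial formula."""
    mean = even_moment(p, 1)
    return sum(
        comb(order, k) * even_moment(p, k) * (-mean) ** (order - k)
        for k in range(order + 1)
    )

def brace_in_bound(p):
    """Quantity in braces in the bound: fourth central moment + 3 * variance^2."""
    return central_moment(p, 4) + 3 * central_moment(p, 2) ** 2

def closed_form(p):
    """72 (p-1)(p+1) / (p^2 (p+2)^2 (p+4)(p+6))."""
    return Fraction(72 * (p - 1) * (p + 1), p**2 * (p + 2) ** 2 * (p + 4) * (p + 6))

for p in (2, 3, 4, 10, 100):
    print(p, brace_in_bound(p) == closed_form(p), float(p**4 * closed_form(p)))
```

The last printed column is the factor multiplying (ℓ − 1)²/n⁴; it stays bounded as p grows, which is the O(1) claim.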
3. Simulations: checking universal asymptotics of the sign test for independence
As in Paindaveine and Verdebout (2014), the statistic $I^{(n)}_{N,p,q}$ in (2.10) (with $p = q$) was evaluated on $M = 10\,000$ independent random samples of size $n$ from the $(p+q)$-dimensional standard normal distribution, with $(n, p+q) \in \{4, 30, 200, 1\,000\}^2$. Figure 1, which provides histograms of the $M = 10\,000$ corresponding values of the standardized test statistic $I^{(n)}_{N,p,q}$ in each case, confirms that the null distribution of $I^{(n)}_{N,p,q}$ is close to the standard normal as soon as $n$ and $p\,(= q)$ are both moderate-to-large. The histograms in Figures 2 and 3 are the histograms from the simulations of Section 3.1 of Paindaveine and Verdebout (2014).
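A scaled-down version of this experiment can be sketched as follows. This is not the authors' code: formula (2.10) of the main paper is not reproduced here, so the statistic is written in the form implied by the martingale decomposition of Section 1, namely √(2pq)/n times the sum over i < j of (U_i'U_j)(V_i'V_j), with U_i and V_i taken to be the spatial signs X_i/‖X_i‖ and Y_i/‖Y_i‖. The function names, the seed, and the reduced number of replications (2000 instead of 10 000) are our own assumptions.

```python
import numpy as np

def sign_statistic(x, y):
    """sqrt(2 p q) / n * sum_{i<j} (U_i'U_j)(V_i'V_j), with U, V the row-wise spatial signs of x, y."""
    n, p = x.shape
    q = y.shape[1]
    u = x / np.linalg.norm(x, axis=1, keepdims=True)
    v = y / np.linalg.norm(y, axis=1, keepdims=True)
    prod = (u @ u.T) * (v @ v.T)
    # The diagonal contributes n (each entry is 1); each pair i < j appears twice.
    pair_sum = (prod.sum() - n) / 2.0
    return np.sqrt(2.0 * p * q) / n * pair_sum

def simulate(n, p, q, reps, seed=0):
    """Evaluate the statistic on `reps` independent standard normal samples."""
    rng = np.random.default_rng(seed)
    return np.array([
        sign_statistic(rng.standard_normal((n, p)), rng.standard_normal((n, q)))
        for _ in range(reps)
    ])

values = simulate(n=30, p=15, q=15, reps=2000)
# Under the null, the mean is 0 and the variance is (n - 1) / n (see Section 1).
print(values.mean(), values.var())
```

A histogram of `values` against the standard Gaussian density would give a small-scale analogue of one panel of Figure 1.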
References
Paindaveine, D. and Verdebout, T. (2014), “On high-dimensional sign tests,” submitted.
[Figure 1: 4 × 4 grid of histograms, with rows n = 4, 30, 200, 1000 and columns p + q = 4, 30, 200, 1000; horizontal axes range from −4 to 4 and vertical axes from 0.0 to 0.4.]
Figure 1. Histograms of $M = 10\,000$ values of the standardized sign test statistic $I^{(n)}_{N,p,q}$ for multivariate independence in (2.10). These values are obtained from independent samples of size $n$; in each sample, observations are i.i.d. from the $(p+q)$-dimensional standard normal distribution. Every $(n, p+q) \in \{4, 30, 200, 1\,000\}^2$ was considered, with $p = q$ in each case. The standard Gaussian density is plotted in black in each plot.
[Figure 2: 4 × 4 grid of histograms, with rows n = 4, 30, 200, 1000 and columns p = 4, 30, 200, 1000; horizontal axes range from −4 to 4 and vertical axes from 0.0 to 0.4.]
Figure 2. Histograms of $M = 10\,000$ values of the standardized Rayleigh test statistic $R^{(n)}_{N,p}$. These values are obtained from independent random samples of size $n$ from the uniform distribution on the unit sphere of $\mathbb{R}^p$. Every $(n, p) \in \{4, 30, 200, 1\,000\}^2$ was considered. The standard Gaussian density is plotted in black in each plot.
[Figure 3: 4 × 4 grid of histograms, with rows n = 4, 30, 200, 1000 and columns p = 4, 30, 200, 1000; horizontal axes range from −4 to 4 and vertical axes from 0.0 to 0.4.]
Figure 3. Histograms of $M = 10\,000$ values of the standardized Portmanteau-type sign test statistic $T^{(n)}_{N,p}$. These values are obtained from independent random samples of size $n$ from the $p$-dimensional standard normal distribution. Every $(n, p) \in \{4, 30, 200, 1\,000\}^2$ was considered. The standard Gaussian density is plotted in black in each plot.