Supplement to

“On High-dimensional Sign Tests”

DAVY PAINDAVEINE 1,* and THOMAS VERDEBOUT 2,**

1 ECARES and Département de Mathématiques, Université Libre de Bruxelles. E-mail: *[email protected]

2 INRIA and Laboratoire EQUIPPE, Université Lille 3. E-mail: **[email protected]

Below, Theorem M-2.4, Lemma M-B.1, etc. refer to Theorem 2.4, Lemma B.1, etc. from Paindaveine and Verdebout (2014). Unless otherwise stated, other cross-references relate to this supplement itself.

1. Proof of Theorem M-2.4

To prove Theorem M-2.4, we define $\mathcal{F}_{n\ell}$ as the σ-algebra generated by $(X_{n1}', Y_{n1}')', \ldots, (X_{n\ell}', Y_{n\ell}')'$, and we let

\[
D^I_{n\ell} = E_{n\ell}\big[I^{(n)}_{N,p,q}\big] - E_{n,\ell-1}\big[I^{(n)}_{N,p,q}\big] = \frac{\sqrt{2 p_n q_n}}{n} \sum_{i=1}^{\ell-1} (U_{ni}' U_{n\ell})(V_{ni}' V_{n\ell}),
\]

where $E_{n\ell}$ still denotes conditional expectation with respect to $\mathcal{F}_{n\ell}$. Clearly, we have that $I^{(n)}_{N,p,q} = \sum_{\ell=1}^n D^I_{n\ell}$, where $D^I_{n\ell}$ is almost surely bounded, hence has a finite variance. The asymptotic normality result in Theorem M-2.4 will then follow from Theorem M-B.1 provided that we establish the following two lemmas (in the proofs, we use the same notational shortcuts as in those of Lemmas M-B.1 to M-B.4).

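Lemma 1.1. Letting $\sigma^2_{n\ell} = E_{n,\ell-1}\big[(D^I_{n\ell})^2\big]$, $\sum_{\ell=1}^n \sigma^2_{n\ell}$ converges to one in quadratic mean as $n \to \infty$.

Lemma 1.2. For any $\varepsilon > 0$, $\sum_{\ell=1}^n E\big[(D^I_{n\ell})^2\, I[|D^I_{n\ell}| > \varepsilon]\big] \to 0$ as $n \to \infty$.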
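The martingale-difference decomposition of $I^{(n)}_{N,p,q}$ introduced above can be illustrated numerically. The following is a minimal sketch, assuming that $U_{ni}$ and $V_{ni}$ are the observations projected onto the unit sphere (their precise definition is in the main paper, not in this supplement); the function names `unit_rows` and `independence_sign_statistic` are hypothetical and chosen for illustration only.

```python
import numpy as np

def unit_rows(x):
    """Project each row of x onto the unit sphere (assumed definition of the signs)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def independence_sign_statistic(U, V):
    """Return (statistic, differences) for sign matrices U (n x p) and V (n x q).

    differences[l] is D_l = sqrt(2 p q) / n * sum_{i < l} (U_i'U_l)(V_i'V_l),
    and the statistic is the sum of the differences.
    """
    n, p = U.shape
    q = V.shape[1]
    prod = (U @ U.T) * (V @ V.T)      # entry (i, l) equals (U_i'U_l)(V_i'V_l)
    upper = np.triu(prod, k=1)        # keep only the pairs with i < l
    diffs = np.sqrt(2.0 * p * q) / n * upper.sum(axis=0)
    return diffs.sum(), diffs

rng = np.random.default_rng(0)
n, p, q = 50, 20, 20
U = unit_rows(rng.standard_normal((n, p)))
V = unit_rows(rng.standard_normal((n, q)))
stat, diffs = independence_sign_statistic(U, V)
print(stat, diffs[:3])
```

The first difference is zero by construction (the inner sum is empty for $\ell = 1$), and each later difference only involves observations with index at most $\ell$, which is what makes the sequence adapted to $\mathcal{F}_{n\ell}$.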

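Proof of Lemma 1.1. The identity $(\mathrm{vec}\,A)'(\mathrm{vec}\,B) = \mathrm{Trace}[A'B]$ allows us to write $(U_i' U_\ell)(V_i' V_\ell)$ as $(\mathrm{vec}(U_i V_i'))'\, \mathrm{vec}(U_\ell V_\ell')$, which yields

\[
\sigma^2_{n\ell} = E_{\ell-1}\big[(D^I_{n\ell})^2\big] = \frac{2 p_n q_n}{n^2} \sum_{i,j=1}^{\ell-1} (\mathrm{vec}(U_i V_i'))'\, E\big[\mathrm{vec}(U_\ell V_\ell')(\mathrm{vec}(U_\ell V_\ell'))'\big]\, \mathrm{vec}(U_j V_j').
\]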

1imsart-bj ver. 2014/02/20 file: TSWLatexianTemp_000002.tex date: October 14, 2015


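Using Lemma M-A.2(iv), we obtain

\[
\sigma^2_{n\ell} = \frac{2}{n^2} \sum_{i,j=1}^{\ell-1} (\mathrm{vec}(U_i V_i'))'\, \mathrm{vec}(U_j V_j') = \frac{2}{n^2} \sum_{i,j=1}^{\ell-1} \rho^U_{ij} \rho^V_{ij},
\]

where we let $\rho^U_{ij} = \rho^U_{n,ij} = U_{ni}' U_{nj}$ and $\rho^V_{ij} = \rho^V_{n,ij} = V_{ni}' V_{nj}$. Lemma M-A.1(iii) then entails

\[
E\big[\sigma^2_{n\ell}\big] = \frac{2(\ell-1)}{n^2} \quad \Big( = E\big[(D^I_{n\ell})^2\big] \Big). \tag{1.1}
\]

Therefore, as $n \to \infty$,

\[
E\Big[\sum_{\ell=1}^n \sigma^2_{n\ell}\Big] = \frac{n-1}{n} \to 1.
\]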
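The two elementary facts used so far, namely the vec/trace identity and the summation leading to $(n-1)/n$, can be checked numerically. The sketch below is only an illustration: it assumes the column-stacking convention for vec, and the helper names `vec`, `unit` and `mean_of_sum` are hypothetical.

```python
import numpy as np
from fractions import Fraction

def vec(a):
    """Stack the columns of a matrix into a single vector."""
    return np.asarray(a).reshape(-1, order="F")

def unit(x):
    """Normalize a vector to unit Euclidean norm."""
    return x / np.linalg.norm(x)

rng = np.random.default_rng(1)
p, q = 4, 3

# (vec A)'(vec B) = Trace[A'B] for arbitrary p x q matrices.
A = rng.standard_normal((p, q))
B = rng.standard_normal((p, q))
trace_gap = abs(vec(A) @ vec(B) - np.trace(A.T @ B))

# (U_i'U_l)(V_i'V_l) = (vec(U_i V_i'))' vec(U_l V_l') for unit vectors.
U_i, U_l = unit(rng.standard_normal(p)), unit(rng.standard_normal(p))
V_i, V_l = unit(rng.standard_normal(q)), unit(rng.standard_normal(q))
product = (U_i @ U_l) * (V_i @ V_l)
via_vec = vec(np.outer(U_i, V_i)) @ vec(np.outer(U_l, V_l))

def mean_of_sum(n):
    """Exact value of sum_{l=1}^n 2(l-1)/n^2, i.e. the sum of (1.1) over l."""
    return sum(Fraction(2 * (l - 1), n * n) for l in range(1, n + 1))

print(trace_gap, abs(product - via_vec), mean_of_sum(10))
```

Exact rational arithmetic is used for the sum so that the equality with $(n-1)/n$ is checked without rounding error.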

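From the pairwise independence of the $\rho^U_{ij}$'s and of the $\rho^V_{ij}$'s, and the fact that the $\rho^U_{ij}$'s are independent of the $\rho^V_{ij}$'s, we have

\[
\begin{aligned}
\mathrm{Var}\Big[\sum_{\ell=1}^n \sigma^2_{n\ell}\Big]
&= \frac{16}{n^4}\, \mathrm{Var}\Big[\sum_{\ell=3}^n \sum_{1\le i<j\le \ell-1} \rho^U_{ij}\rho^V_{ij}\Big] \\
&= \frac{16}{n^4}\, \mathrm{Var}\Big[\sum_{1\le i<j\le n} (n-j)\,\rho^U_{ij}\rho^V_{ij}\Big] \\
&= \frac{16}{n^4}\, E\big[(\rho^U_{ij})^2\big]\, E\big[(\rho^V_{ij})^2\big] \sum_{1\le i<j\le n} (n-j)^2.
\end{aligned}
\]

Lemma M-A.1(iii) then yields

\[
\begin{aligned}
\mathrm{Var}\Big[\sum_{\ell=1}^n \sigma^2_{n\ell}\Big]
&= \frac{16}{p_n q_n n^4} \sum_{1\le i<j\le n} (n-j)^2 = \frac{16}{p_n q_n n^4} \sum_{j=2}^{n} (j-1)(n-j)^2 \\
&= \frac{16}{p_n q_n n^4} \sum_{j=1}^{n-1} j(n-j-1)^2 \le \frac{16}{p_n q_n n^2} \sum_{j=1}^{n-1} j = \frac{8(n-1)}{p_n q_n n},
\end{aligned}
\]

which is $o(1)$ as $n \to \infty$. The result follows. □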
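The chain of sums in the variance bound above is purely combinatorial and can be verified by brute force for small $n$. The following sketch is an illustration only; the function names are hypothetical, and the values of $p_n$ and $q_n$ are arbitrary test values.

```python
from fractions import Fraction

def pair_sum(n):
    """sum over 1 <= i < j <= n of (n - j)^2, by explicit enumeration."""
    return sum((n - j) ** 2 for j in range(2, n + 1) for i in range(1, j))

def single_sum(n):
    """sum_{j=2}^{n} (j - 1)(n - j)^2."""
    return sum((j - 1) * (n - j) ** 2 for j in range(2, n + 1))

def shifted_sum(n):
    """sum_{j=1}^{n-1} j (n - j - 1)^2, obtained by shifting the index by one."""
    return sum(j * (n - j - 1) ** 2 for j in range(1, n))

def variance_value(n, p, q):
    """Exact value 16 / (p q n^4) * sum_{i<j} (n - j)^2."""
    return Fraction(16 * pair_sum(n), p * q * n ** 4)

def variance_bound(n, p, q):
    """Upper bound 8 (n - 1) / (p q n) from the last display."""
    return Fraction(8 * (n - 1), p * q * n)

for n in (3, 10, 40):
    print(n, variance_value(n, 5, 7), variance_bound(n, 5, 7))
```

The bound only uses $(n-j-1)^2 \le n^2$, so it is far from tight, but it is enough for the $o(1)$ conclusion.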

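Proof of Lemma 1.2. Proceeding as in the proofs of Lemmas M-B.2 and M-B.4, we obtain

\[
\begin{aligned}
\sum_{\ell=1}^n E\big[(D^I_{n\ell})^2\, I[|D^I_{n\ell}| > \varepsilon]\big]
&\le \sum_{\ell=1}^n \sqrt{E\big[(D^I_{n\ell})^4\big]}\, \sqrt{P[|D^I_{n\ell}| > \varepsilon]} \\
&\le \frac{1}{\varepsilon} \sum_{\ell=1}^n \sqrt{E\big[(D^I_{n\ell})^4\big]}\, \sqrt{\mathrm{Var}[D^I_{n\ell}]}
\le \frac{\sqrt{2}}{\varepsilon n} \sum_{\ell=1}^n \sqrt{(\ell-1)\, E\big[(D^I_{n\ell})^4\big]},
\end{aligned} \tag{1.2}
\]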


where we have used the fact that (see (1.1))

\[
\mathrm{Var}[D^I_{n\ell}] \le E\big[(D^I_{n\ell})^2\big] = \frac{2(\ell-1)}{n^2}.
\]

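Now, Lemma M-A.1(iv) yields

\[
\begin{aligned}
E\big[(D^I_{n\ell})^4\big]
&= \frac{4 p_n^2 q_n^2}{n^4} \sum_{i,j,r,s=1}^{\ell-1} E\big[\rho^U_{i\ell}\rho^U_{j\ell}\rho^U_{r\ell}\rho^U_{s\ell}\,\rho^V_{i\ell}\rho^V_{j\ell}\rho^V_{r\ell}\rho^V_{s\ell}\big] \\
&= \frac{4 p_n^2 q_n^2}{n^4} \Big\{ (\ell-1)\, E\big[(\rho^U_{i\ell}\rho^V_{i\ell})^4\big] + 3(\ell-1)(\ell-2) \big(E\big[(\rho^U_{i\ell}\rho^V_{i\ell})^2\big]\big)^2 \Big\} \\
&\le \frac{4 p_n^2 q_n^2 (\ell-1)^2}{n^4} \Big\{ E\big[(\rho^U_{i\ell}\rho^V_{i\ell})^4\big] + 3 \big(E\big[(\rho^U_{i\ell}\rho^V_{i\ell})^2\big]\big)^2 \Big\}.
\end{aligned} \tag{1.3}
\]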

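Lemma M-A.1(iii) then allows us to evaluate the upper bound in (1.3), which yields

\[
E\big[(D^I_{n\ell})^4\big] \le \frac{4 p_n^2 q_n^2 (\ell-1)^2}{n^4} \Big\{ \frac{9}{p_n q_n (p_n+2)(q_n+2)} + \frac{3}{p_n^2 q_n^2} \Big\} \le \frac{48(\ell-1)^2}{n^4},
\]

as $n \to \infty$. Plugging this into (1.2), we conclude that

\[
\sum_{\ell=1}^n E\big[(D^I_{n\ell})^2\, I[|D^I_{n\ell}| > \varepsilon]\big] \le \frac{\sqrt{96}}{\varepsilon n^3} \sum_{\ell=1}^n (\ell-1)^{3/2},
\]

which, still by using Lemma M-A.3, is $O(n^{-1/2})$. The result follows. □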
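Lemma M-A.3 is not restated in this supplement. As an illustration only, the $O(n^{-1/2})$ rate can also be seen from an elementary integral comparison, which may differ from the argument of that lemma: since $x \mapsto x^{3/2}$ is increasing on $[0, \infty)$, each term $(\ell-1)^{3/2}$ is at most the integral of $x^{3/2}$ over $[\ell-1, \ell]$.

```latex
\[
\sum_{\ell=1}^n (\ell-1)^{3/2} \le \int_0^n x^{3/2}\, dx = \frac{2}{5}\, n^{5/2},
\qquad\text{so that}\qquad
\frac{\sqrt{96}}{\varepsilon n^3} \sum_{\ell=1}^n (\ell-1)^{3/2}
\le \frac{2\sqrt{96}}{5\,\varepsilon}\, n^{-1/2}.
\]
```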

2. Proof of Theorem M-2.5

Of course, it is sufficient to show that, under the assumptions of Theorem M-2.5,

\[
S^{St}_n = \frac{p_n}{n} \sum_{1\le i<j\le n} \Big( (U_{ni}' U_{nj})^2 - \frac{1}{p_n} \Big)
\]

is asymptotically standard normal. To do so, we define $\mathcal{F}_{n\ell}$ as the σ-algebra generated by $X_{n1}, \ldots, X_{n\ell}$, and we let

\[
D^S_{n\ell} = E_{n\ell}\big[S^{St}_n\big] - E_{n,\ell-1}\big[S^{St}_n\big] = \frac{p_n}{n} \sum_{i=1}^{\ell-1} \Big( (U_{ni}' U_{n\ell})^2 - \frac{1}{p_n} \Big),
\]

where $E_{n\ell}$ still denotes conditional expectation with respect to $\mathcal{F}_{n\ell}$. Clearly, we have that $S^{St}_n = \sum_{\ell=1}^n D^S_{n\ell}$, where $D^S_{n\ell}$ is almost surely bounded, hence has a finite variance. The asymptotic normality result in Theorem M-2.5 will then follow from Theorem M-B.1 if we prove the following two lemmas.



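Lemma 2.1. Letting $\sigma^2_{n\ell} = E_{n,\ell-1}\big[(D^S_{n\ell})^2\big]$, $\sum_{\ell=1}^n \sigma^2_{n\ell}$ converges to one in quadratic mean as $n \to \infty$.

Lemma 2.2. For any $\varepsilon > 0$, $\sum_{\ell=1}^n E\big[(D^S_{n\ell})^2\, I[|D^S_{n\ell}| > \varepsilon]\big] \to 0$ as $n \to \infty$.

We use the same notational shortcuts in the proofs of these lemmas as in those of Lemmas M-B.1 and M-B.2.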

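Proof of Lemma 2.1. Using the identity $(\mathrm{vec}\,A)'(\mathrm{vec}\,B) = \mathrm{Trace}[A'B]$, we obtain

\[
(U_i' U_\ell)^2 - \frac{1}{p_n} = (\mathrm{vec}(U_i U_i'))'\, J^\perp_p\, \mathrm{vec}(U_\ell U_\ell'),
\]

where the (symmetric and idempotent) matrix $J^\perp_p = I_{p^2} - \frac{1}{p} J_p$ is based on the matrix $J_p$ involved in Lemma M-A.2. Hence,

\[
\sigma^2_{n\ell} = E_{\ell-1}\big[(D^S_{n\ell})^2\big] = \frac{p_n^2}{n^2} \sum_{i,j=1}^{\ell-1} (\mathrm{vec}(U_i U_i'))'\, J^\perp_p\, E\big[\mathrm{vec}(U_\ell U_\ell')(\mathrm{vec}(U_\ell U_\ell'))'\big]\, J^\perp_p\, \mathrm{vec}(U_j U_j').
\]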

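Since $K_p \mathrm{vec}(A) = \mathrm{vec}(A')$ for any $p \times p$ matrix $A$, we have that $J^\perp_p \big(I_{p^2} + K_p + J_p\big) J^\perp_p = I_{p^2} + K_p - \frac{2}{p} J_p$. Lemma M-A.2(iii) then yields

\[
\begin{aligned}
\sigma^2_{n\ell}
&= \frac{p_n}{(p_n+2) n^2} \sum_{i,j=1}^{\ell-1} (\mathrm{vec}(U_i U_i'))' \Big( I_{p^2} + K_p - \frac{2}{p} J_p \Big) \mathrm{vec}(U_j U_j') \\
&= \frac{2 p_n}{(p_n+2) n^2} \sum_{i,j=1}^{\ell-1} (\mathrm{vec}(U_i U_i'))'\, J^\perp_p\, \mathrm{vec}(U_j U_j') = \frac{2 p_n}{(p_n+2) n^2} \sum_{i,j=1}^{\ell-1} \Big( \rho_{ij}^2 - \frac{1}{p_n} \Big).
\end{aligned}
\]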
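The matrix identities used in this step can be checked numerically for a small dimension. The sketch below rests on an assumption that is not stated in this supplement: it takes $J_p = (\mathrm{vec}\, I_p)(\mathrm{vec}\, I_p)'$, which is consistent with $J^\perp_p$ being symmetric and idempotent, and it builds the commutation matrix $K_p$ directly from its defining property. The helper names are hypothetical.

```python
import numpy as np

def vec(a):
    """Stack the columns of a matrix into a single vector."""
    return np.asarray(a).reshape(-1, order="F")

def commutation_matrix(p):
    """Matrix K_p such that K_p vec(A) = vec(A') for any p x p matrix A."""
    K = np.zeros((p * p, p * p))
    for i in range(p):
        for j in range(p):
            # Entry i + j*p of vec(A') is A[j, i], i.e. entry j + i*p of vec(A).
            K[i + j * p, j + i * p] = 1.0
    return K

p = 4
I = np.eye(p * p)
K = commutation_matrix(p)
v = vec(np.eye(p))
J = np.outer(v, v)          # assumed form of J_p (not given in the supplement)
J_perp = I - J / p

lhs = J_perp @ (I + K + J) @ J_perp
rhs = I + K - 2.0 * J / p

# Quadratic-form identity: (U_i'U_l)^2 - 1/p = vec(U_i U_i')' J_perp vec(U_l U_l').
rng = np.random.default_rng(2)
A = rng.standard_normal((p, p))
U_i = rng.standard_normal(p)
U_i /= np.linalg.norm(U_i)
U_l = rng.standard_normal(p)
U_l /= np.linalg.norm(U_l)
quad = vec(np.outer(U_i, U_i)) @ J_perp @ vec(np.outer(U_l, U_l))

print(np.abs(lhs - rhs).max(), quad - ((U_i @ U_l) ** 2 - 1.0 / p))
```

Under this assumed form of $J_p$, one has $J_p^2 = p J_p$ and $K_p J_p = J_p$, which is what makes $J^\perp_p J_p = 0$ and reduces the product to $I_{p^2} + K_p - \frac{2}{p} J_p$.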

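By applying Lemma M-A.1(iii), we then obtain

\[
E\big[\sigma^2_{n\ell}\big] = \frac{2(p_n-1)(\ell-1)}{(p_n+2) n^2} \quad \Big( = E\big[(D^S_{n\ell})^2\big] \Big). \tag{2.1}
\]

Therefore, as $n \to \infty$,

\[
E\Big[\sum_{\ell=1}^n \sigma^2_{n\ell}\Big] = \frac{2(p_n-1)}{(p_n+2) n^2} \sum_{\ell=1}^n (\ell-1) = \frac{(p_n-1)(n-1)}{(p_n+2) n} \to 1.
\]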



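From the pairwise independence of the $\rho_{ij}$'s, we have

\[
\begin{aligned}
\mathrm{Var}\Big[\sum_{\ell=1}^n \sigma^2_{n\ell}\Big]
&= \frac{16 p_n^2}{(p_n+2)^2 n^4}\, \mathrm{Var}\Big[\sum_{\ell=3}^n \sum_{1\le i<j\le \ell-1} \Big( \rho_{ij}^2 - \frac{1}{p_n} \Big)\Big] \\
&= \frac{16 p_n^2}{(p_n+2)^2 n^4}\, \mathrm{Var}\Big[\sum_{1\le i<j\le n} (n-j) \Big( \rho_{ij}^2 - \frac{1}{p_n} \Big)\Big] \\
&= \frac{16 p_n^2}{(p_n+2)^2 n^4}\, \big( E[\rho_{ij}^4] - E^2[\rho_{ij}^2] \big) \sum_{1\le i<j\le n} (n-j)^2.
\end{aligned}
\]

Lemma M-A.1(iii) then yields

\[
\begin{aligned}
\mathrm{Var}\Big[\sum_{\ell=1}^n \sigma^2_{n\ell}\Big]
&= \frac{32(p_n-1)}{(p_n+2)^3 n^4} \sum_{1\le i<j\le n} (n-j)^2 \\
&= \frac{32(p_n-1)}{(p_n+2)^3 n^4} \sum_{j=2}^{n} (j-1)(n-j)^2 = \frac{32(p_n-1)}{(p_n+2)^3 n^4} \sum_{j=1}^{n-1} j(n-j-1)^2 \\
&\le \frac{32(p_n-1)}{(p_n+2)^3 n^2} \sum_{j=1}^{n-1} j \le \frac{16(p_n-1)(n-1)}{(p_n+2)^3 n},
\end{aligned}
\]

which is $o(1)$ as $n \to \infty$. The result follows. □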
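For the reader's convenience, the constant in the last display can be traced as follows. The moments $E[\rho_{ij}^2] = 1/p_n$ and $E[\rho_{ij}^4] = 3/(p_n(p_n+2))$ are taken here as the content of Lemma M-A.1(iii); they are not restated in this supplement, but they are consistent with the constants appearing in the proof of Lemma 1.2.

```latex
\[
E[\rho_{ij}^4] - E^2[\rho_{ij}^2]
= \frac{3}{p_n(p_n+2)} - \frac{1}{p_n^2}
= \frac{2(p_n-1)}{p_n^2(p_n+2)},
\qquad
\frac{16 p_n^2}{(p_n+2)^2 n^4} \cdot \frac{2(p_n-1)}{p_n^2(p_n+2)}
= \frac{32(p_n-1)}{(p_n+2)^3 n^4}.
\]
```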

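Proof of Lemma 2.2. Proceeding as in the proofs of Lemmas M-B.2, M-B.4 and 1.2, we obtain

\[
\begin{aligned}
\sum_{\ell=1}^n E\big[(D^S_{n\ell})^2\, I[|D^S_{n\ell}| > \varepsilon]\big]
&\le \sum_{\ell=1}^n \sqrt{E\big[(D^S_{n\ell})^4\big]}\, \sqrt{P[|D^S_{n\ell}| > \varepsilon]} \\
&\le \frac{1}{\varepsilon} \sum_{\ell=1}^n \sqrt{E\big[(D^S_{n\ell})^4\big]}\, \sqrt{\mathrm{Var}[D^S_{n\ell}]}
\le \frac{\sqrt{2(p_n-1)}}{\varepsilon n \sqrt{p_n+2}} \sum_{\ell=1}^n \sqrt{(\ell-1)\, E\big[(D^S_{n\ell})^4\big]},
\end{aligned} \tag{2.2}
\]

where we have used the fact that (see (2.1))

\[
\mathrm{Var}[D^S_{n\ell}] \le E\big[(D^S_{n\ell})^2\big] = \frac{2(p_n-1)(\ell-1)}{(p_n+2) n^2}.
\]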



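Now, Lemma M-A.1(iv) yields

\[
\begin{aligned}
E\big[(D^S_{n\ell})^4\big]
&= \frac{p_n^4}{n^4} \sum_{i,j,r,s=1}^{\ell-1} E\Big[ \Big( \rho_{i\ell}^2 - \frac{1}{p_n} \Big) \Big( \rho_{j\ell}^2 - \frac{1}{p_n} \Big) \Big( \rho_{r\ell}^2 - \frac{1}{p_n} \Big) \Big( \rho_{s\ell}^2 - \frac{1}{p_n} \Big) \Big] \\
&= \frac{p_n^4}{n^4} \Big\{ (\ell-1)\, E\Big[ \Big( \rho_{i\ell}^2 - \frac{1}{p_n} \Big)^4 \Big] + 3(\ell-1)(\ell-2) \Big( E\Big[ \Big( \rho_{i\ell}^2 - \frac{1}{p_n} \Big)^2 \Big] \Big)^2 \Big\} \\
&\le \frac{p_n^4 (\ell-1)^2}{n^4} \Big\{ E\Big[ \Big( \rho_{i\ell}^2 - \frac{1}{p_n} \Big)^4 \Big] + 3 \Big( E\Big[ \Big( \rho_{i\ell}^2 - \frac{1}{p_n} \Big)^2 \Big] \Big)^2 \Big\}.
\end{aligned} \tag{2.3}
\]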

Lemma M-A.1(iii) allows us, after tedious computations, to evaluate the upper bound in (2.3), which yields

\[
E\big[(D^S_{n\ell})^4\big] \le \frac{p_n^4 (\ell-1)^2}{n^4} \times \frac{72(p_n-1)(p_n+1)}{p_n^2 (p_n+2)^2 (p_n+4)(p_n+6)} = \frac{(\ell-1)^2}{n^4}\, O(1),
\]

as $n \to \infty$. Plugging this into (2.2), we conclude that

\[
\sum_{\ell=1}^n E\big[(D^S_{n\ell})^2\, I[|D^S_{n\ell}| > \varepsilon]\big] \le \frac{\sqrt{2(p_n-1)}}{\varepsilon n^3 \sqrt{p_n+2}}\, O(1) \sum_{\ell=1}^n (\ell-1)^{3/2},
\]

which, still by using Lemma M-A.3, is $O(n^{-1/2})$. The result follows. □
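The "tedious computations" behind the closed form $72(p_n-1)(p_n+1) / \big(p_n^2(p_n+2)^2(p_n+4)(p_n+6)\big)$ for the bracketed term in (2.3) can be cross-checked in exact arithmetic. The sketch below is an illustration rather than the authors' derivation: it assumes the standard fact that, for independent uniform unit vectors in $\mathbb{R}^p$, $\rho^2$ follows a Beta$(1/2, (p-1)/2)$ distribution, so that $E[\rho^{2k}] = \prod_{m=0}^{k-1} (1+2m)/(p+2m)$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def raw_moment(k, p):
    """E[rho^(2k)], assuming rho^2 ~ Beta(1/2, (p-1)/2)."""
    out = Fraction(1)
    for m in range(k):
        out *= Fraction(1 + 2 * m, p + 2 * m)
    return out

def centered_moment(r, p):
    """E[(rho^2 - 1/p)^r], via the binomial expansion in raw moments."""
    c = Fraction(1, p)
    return sum(comb(r, k) * raw_moment(k, p) * (-c) ** (r - k)
               for k in range(r + 1))

def bracket(p):
    """Bracketed term of (2.3): fourth centered moment plus 3 x squared variance."""
    return centered_moment(4, p) + 3 * centered_moment(2, p) ** 2

def closed_form(p):
    """Closed form stated after (2.3)."""
    return Fraction(72 * (p - 1) * (p + 1),
                    p ** 2 * (p + 2) ** 2 * (p + 4) * (p + 6))

for p in (2, 3, 10, 200):
    print(p, bracket(p) == closed_form(p), float(p ** 4 * closed_form(p)))
```

Multiplying the closed form by $p^4$ gives a quantity that stays below 72 for every $p$, which is the $O(1)$ factor used in the final display.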

3. Simulations: checking universal asymptotics of the sign test for independence

As in Paindaveine and Verdebout (2014), the statistic $I^{(n)}_{N,p,q}$ in (2.10) (with $p = q$) was evaluated on M = 10 000 independent random samples of size n from the (p + q)-dimensional standard normal distribution, with $(n, p+q) \in \{4, 30, 200, 1\,000\}^2$. Figure 1, which provides histograms of the M = 10 000 corresponding values of the standardized test statistic $I^{(n)}_{N,p,q}$ in each case, confirms that the null distribution of $I^{(n)}_{N,p,q}$ is close to the standard normal as soon as n and p (= q) are both moderate-to-large. The histograms in Figures 2 and 3 are the histograms from the simulations of Section 3.1 of Paindaveine and Verdebout (2014).
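A scaled-down version of this experiment can be sketched as follows. This is not the authors' simulation code: it assumes that the statistic coincides with the sum of the martingale differences of Section 1, that is, $\frac{\sqrt{2pq}}{n} \sum_{i<\ell} (U_i'U_\ell)(V_i'V_\ell)$ with $U_i$ and $V_i$ the normalized observations, and it uses a smaller number of replications than the M = 10 000 reported above. The function names and the seed are hypothetical choices.

```python
import numpy as np

def unit_rows(x):
    """Project each row of x onto the unit sphere (assumed definition of the signs)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def sign_independence_stat(X, Y):
    """Assumed form of the standardized statistic: sqrt(2pq)/n * sum over pairs i < l."""
    n, p = X.shape
    q = Y.shape[1]
    U, V = unit_rows(X), unit_rows(Y)
    prod = (U @ U.T) * (V @ V.T)           # (U_i'U_l)(V_i'V_l) for all pairs
    return np.sqrt(2.0 * p * q) / n * np.triu(prod, k=1).sum()

def simulate(n, p, q, M, seed=0):
    """Draw M independent samples under the null and return the M statistic values."""
    rng = np.random.default_rng(seed)
    return np.array([
        sign_independence_stat(rng.standard_normal((n, p)),
                               rng.standard_normal((n, q)))
        for _ in range(M)
    ])

values = simulate(n=30, p=15, q=15, M=2000)
print("mean:", values.mean(), "variance:", values.var())
```

Under these assumptions, Section 1 gives a null mean of zero and a null variance of $(n-1)/n$, so the empirical mean and variance printed above can be compared with 0 and 29/30; a histogram of `values` against the standard Gaussian density would mimic one panel of Figure 1.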

References

Paindaveine, D. and Verdebout, T. (2014), “On high-dimensional sign tests,” submitted.



[Figure 1: 4 × 4 grid of histograms; rows n = 4, 30, 200, 1 000; columns p + q = 4, 30, 200, 1 000; horizontal axis from -4 to 4, vertical axis from 0.0 to 0.4.]

Figure 1. Histograms of M = 10 000 values of the standardized sign test statistic $I^{(n)}_{N,p,q}$ for multivariate independence in (2.10). These values are obtained from independent samples of size n; in each sample, observations are i.i.d. from the (p + q)-dimensional standard normal distribution. Every $(n, p+q) \in \{4, 30, 200, 1\,000\}^2$ was considered, with p = q in each case. The standard Gaussian density is plotted in black in each plot.



[Figure 2: 4 × 4 grid of histograms; rows n = 4, 30, 200, 1 000; columns p = 4, 30, 200, 1 000; horizontal axis from -4 to 4, vertical axis from 0.0 to 0.4.]

Figure 2. Histograms of M = 10 000 values of the standardized Rayleigh test statistic $R^{(n)}_{N,p}$. These values are obtained from independent random samples of size n from the uniform distribution on the unit sphere of $\mathbb{R}^p$. Every $(n, p) \in \{4, 30, 200, 1\,000\}^2$ was considered. The standard Gaussian density is plotted in black in each plot.



[Figure 3: 4 × 4 grid of histograms; rows n = 4, 30, 200, 1 000; columns p = 4, 30, 200, 1 000; horizontal axis from -4 to 4, vertical axis from 0.0 to 0.4.]

Figure 3. Histograms of M = 10 000 values of the standardized Portmanteau-type sign test statistic $T^{(n)}_{N,p}$. These values are obtained from independent random samples of size n from the p-dimensional standard normal distribution. Every $(n, p) \in \{4, 30, 200, 1\,000\}^2$ was considered. The standard Gaussian density is plotted in black in each plot.

