Page 1: Likelihood Ratio Tests for Covariance Matrices of High ...users.stat.umn.edu/~jiang040/...revision02262012.pdf · Likelihood Ratio Tests for Covariance Matrices of High-Dimensional

Likelihood Ratio Tests for Covariance Matrices of High-Dimensional

Normal Distributions

Dandan Jiang 1, Tiefeng Jiang 2 and Fan Yang 3

Abstract For a random sample of size n obtained from a p-variate normal population, the likelihood

ratio test (LRT) for the covariance matrix equal to a given matrix is considered. By using the Selberg

integral, we prove that the LRT statistic converges to a normal distribution under the assumption

p/n → y ∈ (0, 1]. The result for y = 1 is much different from the case for y ∈ (0, 1). Another test is

studied: given two sets of random observations of sample size n1 and n2 from two p-variate normal

distributions, we study the LRT for testing the two normal distributions having equal covariance

matrices. It is shown through a corollary of the Selberg integral that the LRT statistic has an

asymptotic normal distribution under the assumption p/n1 → y1 ∈ (0, 1] and p/n2 → y2 ∈ (0, 1]. The

case for max{y1, y2} = 1 is much different from the case max{y1, y2} < 1.

1 Introduction

In their pioneering work, Bai, Jiang, Yao and Zheng [2] studied two Likelihood Ratio Tests (LRT) by using Random Matrix Theory and derived the limiting distributions of the LRT statistics.

There are two purposes in this paper. We first use the Selberg integral, a different method, to revisit

the two problems. We then prove two theorems which cover the critical cases that are not studied

in [2]. Now we review the two tests and present our results.

Let x1, · · · ,xn be i.i.d. Rp-valued random variables with normal distribution Np(µ,Σ), where

µ ∈ Rp is the mean vector and Σ is the covariance matrix. Consider the test:

H0 : Σ = Ip vs Ha : Σ ≠ Ip (1.1)

with µ unspecified. Any test H0 : Σ = Σ0 with known non-singular Σ0 and unspecified µ can be reduced to (1.1) by transforming the data to yi = Σ0^{−1/2} xi for i = 1, 2, · · · , n (then y1, · · · , yn are i.i.d. with distribution Np(µ̃, Ip), where µ̃ = Σ0^{−1/2} µ). Recall

x̄ = (1/n) ∑_{i=1}^{n} xi and S = (1/n) ∑_{i=1}^{n} (xi − x̄)(xi − x̄)∗. (1.2)

Of course S is a p × p matrix. After scaling and taking the logarithm, an LRT statistic for (1.1) is chosen

1 School of Mathematics, Jilin University, Changchun 130012, China, [email protected].

2 Supported in part by NSF #DMS-0449365. School of Statistics, University of Minnesota, 224 Church Street, MN 55455, [email protected].

3 School of Statistics, University of Minnesota, 224 Church Street, MN 55455, [email protected].

Key Words: High-dimensional data, testing on covariance matrices, Selberg integral, Gamma function.

AMS (2000) subject classifications: Primary 62H15; secondary 62H10.


to be in the following form:

L∗n = tr(S) − log |S| − p = (1/n) ∑_{i=1}^{p} (λi − n log λi) + p log n − p, (1.3)

where λ1, · · · , λp are the eigenvalues of nS. See, for example, p. 355 from [14] for this. The notation

log above stands for the natural logarithm loge throughout the paper.
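As a concrete illustration of (1.2)–(1.3), the statistic is easy to compute from a data matrix. The following Python sketch is our own illustration (not part of the paper); it checks that the eigenvalue form in (1.3) agrees with the defining form tr(S) − log |S| − p:

```python
import numpy as np

def lrt_statistic(x):
    """L*_n from (1.3) for an n x p data matrix x (rows = observations)."""
    n, p = x.shape
    xc = x - x.mean(axis=0)
    lam = np.linalg.eigvalsh(xc.T @ xc)        # eigenvalues of nS, with S as in (1.2)
    return (lam - n * np.log(lam)).sum() / n + p * np.log(n) - p

rng = np.random.default_rng(0)
x = rng.standard_normal((200, 20))             # a sample under H0: Sigma = I_p
Ln = lrt_statistic(x)

# cross-check against the defining form tr(S) - log|S| - p
n, p = x.shape
S = (x - x.mean(axis=0)).T @ (x - x.mean(axis=0)) / n
assert abs(Ln - (np.trace(S) - np.linalg.slogdet(S)[1] - p)) < 1e-8
```

Note that L∗n ≥ 0 always, since u − log u − 1 ≥ 0 applied to the eigenvalues of S.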

For fixed p, it is known from classical multivariate analysis theory that a (constant) linear transform of nL∗n converges to χ²_{p(p+1)/2} as n → ∞. See, e.g., p. 359 from [14]. When p is large,

particularly as n → ∞ and p/n → y ∈ (0, 1), there are some results on improving the convergence; see, e.g., [3]. In modern data, the dimension p is commonly large and proportional to the sample size n. A failure of a similar LRT in the high-dimensional case (large p) was observed by Dempster [8] as early as 1958. For this reason, Bai, Jiang, Yao and Zheng [2] studied the statistic L∗n in (1.3) when both n and p are large and proportional to each other.

We now state our results.

THEOREM 1 Let x1, · · · ,xn be i.i.d. random vectors with normal distribution Np(µ,Σ). Let L∗n

be as in (1.3). Assume H0 in (1.1) holds. If n > p = pn and limn→∞ p/n = y ∈ (0, 1], then

(L∗n − µn)/σn converges in distribution to N(0, 1) as n → ∞, where

µn = (n − p − 3/2) log(1 − p/n) + p − y and σ²n = −2[ p/n + log(1 − p/n) ].
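For reference, the centering µn and scaling σn of Theorem 1 are straightforward to compute; here is a small sketch of ours (variable names are our own, and y is approximated by p/n):

```python
import math

def theorem1_constants(n, p):
    """mu_n and sigma_n of Theorem 1; y is approximated by p/n."""
    log1m = math.log(1 - p / n)                # log(1 - p/n) = -r_n^2
    mu = (n - p - 1.5) * log1m + p - p / n
    sigma2 = -2 * (p / n + log1m)              # positive since log(1-x) < -x
    return mu, math.sqrt(sigma2)

# with p/n = y < 1 held fixed, sigma_n^2 approaches -2y - 2log(1-y)
mu, sigma = theorem1_constants(10**6, 5 * 10**5)
```

For p/n = 1/2 the value of σ²n already coincides with the limit −2y − 2 log(1 − y).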

A simulation study was carried out for the quantity (L∗n − µn)/σn in Theorem 1. We chose p/n = 0.9 in Figure 1 with different values of n. The figure shows that the approximation becomes more accurate as n increases. To see the convergence rate for the case y = 1, we chose an extreme scenario with p = n − 4 in Figure 2. As n increases, the convergence rate seems quite decent too.

Now, note that σ²n → −2y − 2 log(1 − y) if p/n → y ∈ (0, 1). We obviously have the following corollary.

COROLLARY 1.1 Let x1, · · · ,xn be i.i.d. random vectors with normal distribution Np(µ,Σ). Let

L∗n be as in (1.3). Assume H0 in (1.1) holds. If n > p = pn and limn→∞ p/n = y ∈ (0, 1), then

L∗n − µn converges in distribution to N(0, σ²) as n → ∞, where σ² = −2y − 2 log(1 − y) and

µn = (n − p) log(1 − p/n) + p − y − (3/2) log(1 − y).

Looking at Theorem 1, it is obvious that σ²n ∼ −2 log(1 − p/n) as p/n → 1. We then get the following.

COROLLARY 1.2 Assume all the conditions in Theorem 1 hold with y = 1. Let rn = (− log(1 − p/n))^{1/2}. Then

( L∗n − p − (p − n + 1.5)r²n ) / (√2 rn) converges in distribution to N(0, 1) as n → ∞.


Figure 1: Histograms based on 10,000 simulations of the normalized likelihood ratio statistic (L∗n − µn)/σn in Theorem 1 under the null hypothesis Σ = Ip with p/n = 0.9. The curve on top of each histogram is the standard normal density.

Figure 2: Histograms based on 10,000 simulations of the normalized likelihood ratio statistic (L∗n − µn)/σn in Theorem 1 under the null hypothesis Σ = Ip with p = n − 4. The curve on top of each histogram is the standard normal density.


The above result studies the critical case for y = 1, which is not covered in [2]. In fact, the

random matrix tool by Bai and Silverstein [4] is used to derive the results in [2]. Their tool fails

when y = 1.

For a practical testing procedure, we would use Theorem 1 directly rather than Corollaries 1.1 and 1.2, which deal with the cases y ∈ (0, 1) and y = 1 separately. This is because, for a real data set, it is sometimes hard to judge whether p/n goes to 1 or to a number less than 1.

Now we study another likelihood ratio test. For two p-dimensional normal distributions Np(µk, Σk), k = 1, 2, where Σ1 and Σ2 are non-singular and unknown, we wish to test

H0 : Σ1 = Σ2 vs Ha : Σ1 ≠ Σ2 (1.4)

with unspecified µ1 and µ2. The data are given as follows: x1, · · · , xn1 is a random sample from Np(µ1, Σ1); y1, · · · , yn2 is a random sample from Np(µ2, Σ2); and the two sets of random vectors are independent. The two relevant covariance matrices are

A = (1/n1) ∑_{i=1}^{n1} (xi − x̄)(xi − x̄)∗ and B = (1/n2) ∑_{i=1}^{n2} (yi − ȳ)(yi − ȳ)∗ (1.5)

where

x̄ = (1/n1) ∑_{i=1}^{n1} xi and ȳ = (1/n2) ∑_{i=1}^{n2} yi. (1.6)

Let N = n1 + n2 and ck = nk/N for k = 1, 2. The likelihood ratio test statistic is

TN = −2 log L1 where L1 = ( |A|^{n1/2} · |B|^{n2/2} ) / |c1A + c2B|^{N/2}. (1.7)

See, e.g., section 8.2 from [14] for this. The second main result in this paper is as follows.

THEOREM 2 Let ni > p for i = 1, 2 and TN be as in (1.7). Assume H0 in (1.4) holds. If

n1 → ∞, n2 → ∞ and p → ∞ with p/ni → yi ∈ (0, 1] for i = 1, 2, then

(1/σn)( TN/N − µn ) converges in distribution to N(0, 1),

where

µn = (p − N + 2.5) log(1 − p/N) − ∑_{i=1}^{2} (p − ni + 1.5)(ni/N) log(1 − p/ni);

σ²n = 2 log(1 − p/N) − 2 ∑_{i=1}^{2} (ni²/N²) log(1 − p/ni). (1.8)
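The statistic TN of (1.7) and the normalizing constants of (1.8) can both be computed directly; the following Python sketch is our own illustration (not part of the paper):

```python
import numpy as np

def two_sample_lrt(x, y):
    """T_N = -2 log L1 as in (1.7); x is an n1 x p sample, y is n2 x p."""
    n1, n2 = x.shape[0], y.shape[0]
    N = n1 + n2
    A = (x - x.mean(0)).T @ (x - x.mean(0)) / n1
    B = (y - y.mean(0)).T @ (y - y.mean(0)) / n2
    ld = lambda M: np.linalg.slogdet(M)[1]     # log-determinant
    return -(n1 * ld(A) + n2 * ld(B) - N * ld((n1 * A + n2 * B) / N))

def theorem2_constants(n1, n2, p):
    """mu_n and sigma_n of (1.8)."""
    N = n1 + n2
    mu = (p - N + 2.5) * np.log(1 - p / N) - sum(
        (p - ni + 1.5) * (ni / N) * np.log(1 - p / ni) for ni in (n1, n2))
    s2 = 2 * np.log(1 - p / N) - 2 * sum(
        (ni / N) ** 2 * np.log(1 - p / ni) for ni in (n1, n2))
    return mu, np.sqrt(s2)
```

By concavity of log det, TN ≥ 0 for any pair of samples, which gives a quick sanity check.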

We did some simulations for the statistic (TN/N − µn)/σn in Theorem 2. In Figure 3, we chose p/n1 = p/n2 = 0.9; the picture shows that the convergence is quite robust as the values of n1, n2 and p increase, even though the ratio 0.9 is close to 1. To see the convergence rate for the case max{y1, y2} = 1, we chose an extreme situation with p = n1 − 4 = n2 − 4 in Figure 4. The convergence looks good too, although it is not as fast as in the case p/n1 = p/n2 = 0.9.

According to the notation in Theorem 2, we know that p/N = (n1/p + n2/p)^{−1} → y1y2/(y1 + y2) and ni/N = (ni/p) · (n1/p + n2/p)^{−1} → yi^{−1}/(y1^{−1} + y2^{−1}) for i = 1, 2. We easily get the following corollary.


Figure 3: Histograms based on 10,000 simulations of the normalized likelihood ratio statistic (TN/N − µn)/σn in Theorem 2 under the null hypothesis Σ1 = Σ2 with p/n1 = p/n2 = 0.9. The curve on top of each histogram is the standard normal density.

Figure 4: Histograms based on 10,000 simulations of the normalized likelihood ratio statistic (TN/N − µn)/σn in Theorem 2 under the null hypothesis Σ1 = Σ2 with p = n1 − 4 = n2 − 4. The curve on top of each histogram is the standard normal density.


COROLLARY 1.3 Let ni > p for i = 1, 2 and TN be as in (1.7). Assume H0 in (1.4) holds. If

n1 → ∞, n2 → ∞ and p → ∞ with p/ni → yi ∈ (0, 1) for i = 1, 2, then

TN/N − νn converges in distribution to N(µ, σ²),

where

µ = (1/2)[ 5 log(1 − y) − 3γ1 log(1 − y1) − 3γ2 log(1 − y2) ];

σ² = 2[ log(1 − y) − γ1² log(1 − y1) − γ2² log(1 − y2) ]; (1.9)

νn = (p − N) log(1 − p/N) − (p − n1)(n1/N) log(1 − p/n1) − (p − n2)(n2/N) log(1 − p/n2)

with γ1 = y2(y1 + y2)^{−1}, γ2 = y1(y1 + y2)^{−1} and y = y1y2(y1 + y2)^{−1}.

Our method of proving the above results is much different from that of [2]. The random matrix theories, developed by Bai and Silverstein [4] for the Wishart matrices and Zheng [15] for the F-matrices, are used in [2]. Those tools are universal in the sense that no normality assumption is needed. However, the requirements y < 1 as in Corollary 1.1 and max{y1, y2} < 1 as in Corollary 1.3 are crucial. Technically, the study of the critical cases y = 1 and max{y1, y2} = 1 is more challenging.

Under the normality assumption, without relying on random matrix theories similar to Bai and Silverstein [4] and Zheng [15], we are able to use analytic tools. In fact, the Selberg integral is used in the proofs of both theorems. Through the Selberg integral, closed forms of the moment generating functions of the two likelihood ratio test statistics are obtained. We then study the moment generating functions to derive the central limit theorems for the two likelihood ratio test statistics. In particular, our results cover the cases y ≤ 1 and max{y1, y2} ≤ 1. As shown in Corollary 1.2, the result for y = 1 and the result for y ∈ (0, 1) are much different. The same applies for the second test.

We develop a tool for the product of a series of Gamma functions (Proposition 2.1). It is powerful in analyzing the moment generating functions of the two log-likelihood ratio statistics studied in this paper.

The organization of the rest of the paper is as follows. In Section 2, we derive a tool to study the product of a series of Gamma functions. The proofs of the main theorems stated above are given in Section 3.

2 Auxiliary Results

PROPOSITION 2.1 Let n > p = pn and rn = (− log(1 − p/n))^{1/2}. Assume that p/n → y ∈ (0, 1] and t = tn = O(1/rn) as n → ∞. Then, as n → ∞,

log ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) = pt(1 + log 2) − pt log n + r²n( t² + (p − n + 1.5)t ) + o(1).
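Before turning to the proof, the expansion can be sanity-checked numerically against the exact log-gamma sum. The sketch below is our own check using the standard library's lgamma; already for moderate n the o(1) term is tiny compared with the two sides, which are of order several thousand:

```python
from math import lgamma, log

def exact_log_product(n, p, t):
    """log prod_{i=n-p}^{n-1} Gamma(i/2 - t)/Gamma(i/2), computed exactly."""
    return sum(lgamma(i / 2 - t) - lgamma(i / 2) for i in range(n - p, n))

def proposition_rhs(n, p, t):
    """Right-hand side of Proposition 2.1 without the o(1) term."""
    r2 = -log(1 - p / n)                       # r_n^2
    return p * t * (1 + log(2)) - p * t * log(n) + r2 * (t * t + (p - n + 1.5) * t)

n, p, t = 4000, 2000, 0.3
gap = exact_log_product(n, p, t) - proposition_rhs(n, p, t)
```

Here the gap is far below 1 although both sides are in the thousands in absolute value.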

The proposition is proved through the following three lemmas.


LEMMA 2.1 Let b := b(x) be a real-valued and bounded function defined on (0, ∞). Then

log Γ(x + b)/Γ(x) = b log x + (b² − b)/(2x) + O(1/x²)

as x → +∞, where Γ(x) is the gamma function.

Proof. Recall the Stirling formula (see, e.g., p. 368 from [6] or (37) on p. 204 from [1]):

log Γ(z) = z log z − z − (1/2) log z + log √(2π) + 1/(12z) + O(1/x³)

as x = Re(z) → +∞. It follows that

log Γ(x + b)/Γ(x) = (x + b) log(x + b) − x log x − b − (1/2)( log(x + b) − log x ) + (1/12)( 1/(x + b) − 1/x ) + O(1/x³) (2.1)

as x → +∞. First, use the fact that log(1 + t) = t − t²/2 + O(t³) as t → 0 to get

(x + b) log(x + b) − x log x = (x + b)( log x + log(1 + b/x) ) − x log x
= (x + b)( log x + b/x − b²/(2x²) + O(x^{−3}) ) − x log x
= b log x + b + b²/(2x) + O(1/x²)

as x → +∞. Evidently,

log(x + b) − log x = log(1 + b/x) = b/x + O(1/x²) and 1/(x + b) − 1/x = O(1/x²)

as x → +∞. Plugging these two assertions into (2.1), we have

log Γ(x + b)/Γ(x) = b log x + (b² − b)/(2x) + O(1/x²)

as x → +∞. □
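The O(1/x²) error rate in Lemma 2.1 is easy to observe numerically; the following is our own check with math.lgamma: multiplying x by 10 should shrink the remainder by roughly a factor of 100.

```python
from math import lgamma, log

def remainder(x, b):
    """log Gamma(x+b)/Gamma(x) minus the expansion b*log x + (b^2 - b)/(2x)."""
    return lgamma(x + b) - lgamma(x) - (b * log(x) + (b * b - b) / (2 * x))

b = -0.3                                       # a bounded value of b(x)
r_small = remainder(100.0, b)
r_large = remainder(1000.0, b)
```

The ratio r_small / r_large is close to 100, consistent with an O(1/x²) remainder.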

LEMMA 2.2 Let n > p = pn. Assume that lim_{n→∞} p/n = y ∈ (0, 1) and {tn; n ≥ 1} is bounded. Then, as n → ∞,

log ∏_{i=n−p}^{n−1} Γ(i/2 − tn)/Γ(i/2) = ptn(1 + log 2) − tn n log n + tn(n − p) log(n − p) − ( t²n + 3tn/2 ) log(1 − y) + o(1). (2.2)


Proof. Since p/n → y ∈ (0, 1), we have n − p → +∞ as n → ∞. By Lemma 2.1, there exists an integer C1 ≥ 2 such that

log Γ(i/2 − t)/Γ(i/2) = −t log(i/2) + (t² + t)/i + φ(i) and |φ(i)| ≤ C1/i²

for all i ≥ n − p as n is sufficiently large, where here and later in this proof we write t for tn for short. Notice −t log(i/2) = t log 2 − t log i. Then,

∑_{i=n−p}^{n−1} log Γ(i/2 − t)/Γ(i/2)
= pt log 2 − t ∑_{i=n−p}^{n−1} log i + (t² + t) ∑_{i=n−p}^{n−1} 1/i + ∑_{i=n−p}^{n−1} φ(i)
= pt log 2 + (t² + t) ∑_{i=n−p}^{n−1} 1/i − t log( n!/(n − p)! ) + t log( n/(n − p) ) + O(1/n)
= pt log 2 + (t² + t) ∑_{i=n−p}^{n−1} 1/i − t log(1 − y) − t log( n!/(n − p)! ) + o(1) (2.3)

since ∑_{i=n−p}^{n−1} φ(i) = O(1/n) and log( n/(n − p) ) → − log(1 − y) as n → ∞. First,

∑_{i=n−p}^{n−1} 1/i ≤ ∑_{i=n−p}^{n−1} ∫_{i−1}^{i} (1/x) dx = ∫_{n−p−1}^{n−1} (1/x) dx.

By working on the lower bound similarly, we have

log( n/(n − p) ) = ∫_{n−p}^{n} (1/x) dx ≤ ∑_{i=n−p}^{n−1} 1/i ≤ ∫_{n−p−1}^{n−1} (1/x) dx = log( (n − 1)/(n − p − 1) ).

This implies, by the assumption p/n → y, that

∑_{i=n−p}^{n−1} 1/i → − log(1 − y) (2.4)

as n → ∞. Second, by the Stirling formula (see, e.g., p. 210 from [11]), there are some θn, θ′n ∈ (0, 1) such that

log( n!/(n − p)! ) = log[ ( √(2πn) nⁿ e^{−n + θn/(12n)} ) / ( √(2π(n − p)) (n − p)^{n−p} e^{−n + p + θ′n/(12(n−p))} ) ]
= n log n − (n − p) log(n − p) − p + (1/2) log( n/(n − p) ) + o(1)
= n log n − (n − p) log(n − p) − p − (1/2) log(1 − y) + o(1)


as n → ∞. Joining this with (2.3) and (2.4), we arrive at

log ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2)
= pt log 2 − (t² + t) log(1 − y) − t log(1 − y) − tn log n + t(n − p) log(n − p) + tp + (t/2) log(1 − y) + o(1)
= pt(1 + log 2) − ( t² + 3t/2 ) log(1 − y) − tn log n + t(n − p) log(n − p) + o(1)

as n → ∞. The proof is then complete. □
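The two elementary estimates driving this proof, (2.4) and the Stirling approximation of log(n!/(n − p)!), are easy to confirm numerically. The following is our own sketch (with y = 1/2 and p/n exactly equal to y):

```python
from math import log, lgamma

n = 10**6
p = n // 2                                     # p/n = y = 1/2
# (2.4): the partial harmonic sum tends to -log(1 - y) = log 2
harmonic = sum(1 / i for i in range(n - p, n))
gap_harmonic = harmonic - (-log(1 - 0.5))

# Stirling estimate of log(n!/(n-p)!) used right after (2.4)
stirling = n * log(n) - (n - p) * log(n - p) - p - 0.5 * log(1 - 0.5)
gap_stirling = (lgamma(n + 1) - lgamma(n - p + 1)) - stirling
```

Both gaps are of the order 1/n here, matching the o(1) terms in the proof.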

LEMMA 2.3 Let n > p = pn and rn = (− log(1 − p/n))^{1/2}. Assume that lim_{n→∞} p/n = 1 and t = tn = O(1/rn) as n → ∞. Then, as n → ∞,

log ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) = pt(1 + log 2) − pt log n + r²n( t² + (p − n + 1.5)t ) + o(1).

Proof. Obviously, lim_{n→∞} rn = +∞. Hence, {tn; n ≥ 2} is bounded. By Lemma 2.1, there exist integers C1 ≥ 2 and C2 ≥ 2 such that

log Γ(i/2 − t)/Γ(i/2) = −t log(i/2) + (t² + t)/i + φ(i) and |φ(i)| ≤ C1/i² (2.5)

for all i ≥ C2.

We will use (2.5) to estimate ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2). However, when n − p is small, say, 2 or 3 (which is possible since p/n → 1), the identity (2.5) cannot be applied directly to each term in the product. We next use a truncation to solve this problem, thanks to the fact that Γ(i/2 − t)/Γ(i/2) → 1 as n → ∞ for fixed i.

Fix M ≥ C2. Write

ai = Γ(i/2 − t)/Γ(i/2) for i ≥ 1 and γn = 1 if n − p ≥ M; γn = ∏_{i=n−p}^{M−1} ai if n − p < M.

Then,

∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) = γn · ∏_{i=(n−p)∨M}^{n−1} Γ(i/2 − t)/Γ(i/2). (2.6)

Easily,

( min_{1≤i≤M} (1 ∧ ai) )^M ≤ γn ≤ ( max_{1≤i≤M} (1 ∨ ai) )^M


for all n ≥ 1. Note that, for each i ≥ 1, ai → 1 as n → ∞ since lim_{n→∞} tn = 0. Thus, since M is fixed, the two bounds above go to 1 as n → ∞. Consequently, lim_{n→∞} γn = 1. This and (2.6) say that

∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) ∼ ∏_{i=(n−p)∨M}^{n−1} Γ(i/2 − t)/Γ(i/2) (2.7)

as n → ∞. By (2.5), as n is sufficiently large, we know

log ∏_{i=(n−p)∨M}^{n−1} Γ(i/2 − t)/Γ(i/2) = ∑_{i=(n−p)∨M}^{n−1} ( −t log(i/2) + (t² + t)/i + φ(i) )

with |φ(i)| ≤ C1 i^{−2} for i ≥ C2. Write −t log(i/2) = −t log i + t log 2. It follows that

log ∏_{i=(n−p)∨M}^{n−1} Γ(i/2 − t)/Γ(i/2)
= ( n − (n − p) ∨ M ) t log 2 − t ∑_{i=(n−p)∨M}^{n−1} log i + (t² + t) ∑_{i=(n−p)∨M}^{n−1} 1/i + ∑_{i=(n−p)∨M}^{n−1} φ(i)
:= An − Bn + Cn + Dn (2.8)

as n is sufficiently large. Now we analyze the four terms above.

By distinguishing the cases n − p > M and n − p ≤ M, we get

|An − pt log 2| ≤ (t log 2) · |n − p − M| · I(n − p ≤ M) ≤ (M log 2)t. (2.9)

Now we estimate Bn. By the same argument as in (2.9), we get

| ∑_{i=(n−p)∨M}^{n−1} h(i) − ∑_{i=n−p}^{n−1} h(i) | ≤ ∑_{i=1}^{M} |h(i)| (2.10)

for h(x) = log x or h(x) = 1/x on x ∈ (0, ∞). By the Stirling formula (see, e.g., p. 210 from [11]), n! = √(2πn) nⁿ e^{−n + θn/(12n)} with θn ∈ (0, 1) for all n ≥ 1. It follows that for some θn, θ′n ∈ (0, 1),

∑_{i=n−p}^{n−1} log i = log( n!/(n − p)! ) + log( (n − p)/n )
= log[ ( √(2πn) nⁿ e^{−n + θn/(12n)} ) / ( √(2π(n − p)) (n − p)^{n−p} e^{−n + p + θ′n/(12(n−p))} ) ] + log( (n − p)/n )
= n log n − (n − p) log(n − p) − p + (1/2) log( (n − p)/n ) + Rn

with |Rn| ≤ 1 as n is sufficiently large. Recall Bn = t ∑_{i=(n−p)∨M}^{n−1} log i. We know from (2.10) that

| Bn − ( tn log n − t(n − p) log(n − p) − tp + (t/2) log( (n − p)/n ) ) | ≤ Ct, (2.11)

where C here and later stands for a constant and can be different from line to line.


Now we estimate Cn. Recall the identity sn := ∑_{i=1}^{n} 1/i = log n + cn for all n ≥ 1 with lim_{n→∞} cn = c, where c ≈ 0.577 is the Euler constant. Thus, |(sn − sn−p) − log( n/(n − p) )| ≤ cn + cn−p. Moreover,

∑_{i=n−p+1}^{n} 1/i = sn − sn−p and | ∑_{i=n−p}^{n−1} 1/i − ∑_{i=n−p+1}^{n} 1/i | ≤ 1.

Therefore,

| ∑_{i=n−p}^{n−1} 1/i − log( n/(n − p) ) | ≤ C.

Consequently, since Cn = (t² + t) ∑_{i=(n−p)∨M}^{n−1} 1/i, we know from (2.10) that

| Cn − (t² + t) log( n/(n − p) ) | ≤ (t² + t)C. (2.12)

Finally, it is easy to see from the second fact in (2.5) that

|Dn| ≤ C1 ∑_{i=M}^{∞} 1/i² (2.13)

for all n ≥ 2. Now, recalling that t = tn → 0 as n → ∞, we have from (2.7), (2.8), (2.9), (2.11) and (2.12) that, for any fixed integer M > 0,

An − Bn + Cn + Dn
= pt log 2 − ( tn log n − t(n − p) log(n − p) − tp + (t/2) log( (n − p)/n ) ) + (t² + t) log( n/(n − p) ) + Dn + o(1)
= pt(1 + log 2) + ( t² + 3t/2 − nt ) log n − ( t² + 3t/2 − (n − p)t ) log(n − p) + Dn + o(1)
=: En + Dn + o(1)

as n → ∞. Write log(n − p) = log n − r²n. Then

En = pt(1 + log 2) − pt log n + r²n( t² + 3t/2 − (n − p)t ).

From (2.13) we have

lim sup_{n→∞} |(An − Bn + Cn + Dn) − En| ≤ C1 ∑_{i=M}^{∞} 1/i²

for any M ≥ C2. Recalling (2.7) and (2.8) and letting M → ∞, we eventually obtain the desired conclusion. □

Proof of Proposition 2.1. The conclusion for the case y = 1 follows from Lemma 2.3. If y ∈ (0, 1), then lim_{n→∞} rn = (− log(1 − y))^{1/2}, and hence {tn; n ≥ 1} is bounded. It follows that

pt(1 + log 2) − pt log n + r²n( t² + (p − n + 1.5)t )
= pt(1 + log 2) − pt log n − t(p − n) log(1 − p/n) − ( t² + 3t/2 ) log(1 − p/n).


The last term above is identical to −( t² + 3t/2 ) log(1 − y) + o(1) since p/n → y as n → ∞. Moreover,

−pt log n − t(p − n) log(1 − p/n) = −pt log n + t(n − p)( log(n − p) − log n ) = −nt log n + t(n − p) log(n − p).

The above assertions conclude that

pt(1 + log 2) − pt log n + r²n( t² + (p − n + 1.5)t ) = pt(1 + log 2) − nt log n + (n − p)t log(n − p) − ( t² + 3t/2 ) log(1 − y) + o(1)

as n → ∞. This is exactly the right hand side of (2.2). □

3 Proof of Main Results

We first prove Theorem 1. To do that, we need some preparation. Assume that x1, · · · , xn are Rp-valued random variables. Recall

S = (1/n) ∑_{i=1}^{n} (xi − x̄)(xi − x̄)∗ where x̄ = (1/n) ∑_{i=1}^{n} xi. (3.1)

The following is from Theorem 3.1.2 and Corollary 3.2.19 in [14].

LEMMA 3.1 Assume n > p. Let x1, · · · , xn be i.i.d. Rp-valued random variables with distribution Np(µ, Ip). Then nS and Z∗Z have the same distribution, where Z := (zij)(n−1)×p and the zij's are i.i.d. with distribution N(0, 1). Further, the eigenvalues λ1, · · · , λp of nS have joint density function

f(λ1, · · · , λp) = Const · ∏_{1≤i<j≤p} |λi − λj| · ∏_{i=1}^{p} λi^{(n−p−2)/2} · e^{−(1/2) ∑_{i=1}^{p} λi}

for all λ1 > 0, λ2 > 0, · · · , λp > 0.

Recall the β-Laguerre ensemble:

fβ,a(λ1, · · · , λp) = c^{β,a}_L · ∏_{1≤i<j≤p} |λi − λj|^β · ∏_{i=1}^{p} λi^{a−q} · e^{−(1/2) ∑_{i=1}^{p} λi} (3.2)

for all λ1 > 0, λ2 > 0, · · · , λp > 0, where

c^{β,a}_L = 2^{−pa} ∏_{j=1}^{p} Γ(1 + β/2) / [ Γ(1 + (β/2)j) Γ( a − (β/2)(p − j) ) ], (3.3)

β > 0, p ≥ 2, a > (β/2)(p − 1) and q = 1 + (β/2)(p − 1). See, e.g., [9, 12] for further details. It is known that fβ,a(λ1, · · · , λp) is a probability density function, i.e.,

∫_{[0,∞)^p} fβ,a(λ1, · · · , λp) dλ1 · · · dλp = 1.

See (17.6.5) from [13] (which is essentially a corollary of the Selberg integral in (3.23) below). Evidently, the density function in Lemma 3.1 corresponds to the β-Laguerre ensemble in (3.2) with

β = 1, a = (1/2)(n − 1) and q = 1 + (1/2)(p − 1). (3.4)


LEMMA 3.2 Let n > p and L∗n be as in (1.3). Assume λ1, · · · , λp have density function fβ,a(λ1, · · · , λp) as in (3.2) with a = (β/2)(n − 1) and q = 1 + (β/2)(p − 1). Then

E e^{tL∗n} = e^{(log n − 1)pt} · (1 − 2t/n)^{p(t − (β/2)(n−1))} · 2^{−pt} · ∏_{j=0}^{p−1} Γ( a − t − (β/2)j ) / Γ( a − (β/2)j )

for any t ∈ ( −β/2, (1/2)(β ∧ n) ).

Proof. Recall

L∗n = (1/n) ∑_{j=1}^{p} (λj − n log λj) + p log n − p.

We then have

E e^{tL∗n} = e^{(log n − 1)pt} ∫_{[0,∞)^p} e^{(t/n) ∑_{j=1}^{p} λj} · ∏_{j=1}^{p} λj^{−t} · fβ,a(λ1, · · · , λp) dλ1 · · · dλp
= e^{(log n − 1)pt} · c^{β,a}_L ∫_{[0,∞)^p} e^{−(1/2 − t/n) ∑_{j=1}^{p} λj} · ∏_{j=1}^{p} λj^{(a−t)−q} · ∏_{1≤k<l≤p} |λk − λl|^β dλ1 · · · dλp. (3.5)

For t ∈ ( −β/2, (1/2)(β ∧ n) ), we know 1/2 − t/n > 0. Make the transforms µj = (1 − 2t/n)λj for 1 ≤ j ≤ p. It follows that the above is identical to

e^{(log n − 1)pt} · c^{β,a}_L · (1 − 2t/n)^{−p(a−t−q) − (β/2)p(p−1) − p} · ∫_{[0,∞)^p} e^{−(1/2) ∑_{j=1}^{p} µj} · ∏_{j=1}^{p} µj^{(a−t)−q} · ∏_{1≤k<l≤p} |µk − µl|^β dµ1 · · · dµp. (3.6)

Since t ∈ ( −β/2, (1/2)(β ∧ n) ) and n − p ≥ 1, we know

t < β/2 ≤ (β/2)(n − p) = (β/2)(n − 1) − (β/2)(p − 1) = a − (β/2)(p − 1).

That is, a − t > (β/2)(p − 1). Therefore the integral in (3.6) is equal to 1/c^{β,a−t}_L by (3.2) and (3.3). It then follows from (3.5) and (3.6) that

E e^{tL∗n} = e^{(log n − 1)pt} · (1 − 2t/n)^{−p(a−t−q) − (β/2)p(p−1) − p} · c^{β,a}_L / c^{β,a−t}_L
= e^{(log n − 1)pt} · (1 − 2t/n)^{−p(a−t−q) − (β/2)p(p−1) − p} · 2^{−pt} · ∏_{j=1}^{p} Γ( a − t − (β/2)(p − j) ) / Γ( a − (β/2)(p − j) ).

Now, use a = (β/2)(n − 1) and q = 1 + (β/2)(p − 1) to obtain

E e^{tL∗n} = e^{(log n − 1)pt} · (1 − 2t/n)^{p(t − (β/2)(n−1))} · 2^{−pt} · ∏_{j=0}^{p−1} Γ( a − t − (β/2)j ) / Γ( a − (β/2)j ).


The proof is complete. □
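Since Lemma 3.2 is an exact finite-n identity (not merely asymptotic), it can be verified directly by simulation. The following sketch of ours compares the closed form, with β = 1 and a = (n − 1)/2 as in (3.4), against a Monte Carlo average over samples drawn under H0:

```python
import numpy as np
from math import lgamma, log, exp

def mgf_lemma32(n, p, t):
    """E exp(t L*_n) from Lemma 3.2 with beta = 1 and a = (n-1)/2."""
    a = (n - 1) / 2
    gammas = sum(lgamma(a - t - j / 2) - lgamma(a - j / 2) for j in range(p))
    return exp((log(n) - 1) * p * t + p * (t - a) * log(1 - 2 * t / n)
               - p * t * log(2) + gammas)

n, p, t = 80, 4, 0.1
rng = np.random.default_rng(2)
mc = []
for _ in range(4000):
    x = rng.standard_normal((n, p))            # H0: Sigma = I_p
    xc = x - x.mean(0)
    lam = np.linalg.eigvalsh(xc.T @ xc)        # eigenvalues of nS
    Ln = (lam - n * np.log(lam)).sum() / n + p * np.log(n) - p
    mc.append(exp(t * Ln))
monte_carlo = float(np.mean(mc))
```

The two numbers agree to Monte Carlo accuracy.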

Let {Z, Zn; n ≥ 1} be a sequence of random variables. It is known that

Zn converges to Z in distribution if lim_{n→∞} E e^{tZn} = E e^{tZ} < ∞ (3.7)

for all t ∈ (−t0, t0), where t0 > 0 is a constant. See, e.g., page 408 from [5].

Proof of Theorem 1. First, since log(1 − x) < −x for all 0 < x < 1, we know σ²n > 0 for all n > p ≥ 1. Now, by assumption, it is easy to see that

lim_{n→∞} σ²n = −2[ y + log(1 − y) ] if y ∈ (0, 1), and lim_{n→∞} σ²n = +∞ if y = 1. (3.8)

Trivially, the limit is always positive. Consequently,

δ0 := inf{ σn; n > p ≥ 1 } > 0.

To finish the proof, by (3.7) it is enough to show that

E exp{ s(L∗n − µn)/σn } → e^{s²/2} = E e^{sN(0,1)} (3.9)

as n → ∞ for all s such that |s| < δ0/2.

Fix s such that |s| < δ0/2. Set t = tn = s/σn. Then |tn| < 1/2 for all n > p ≥ 1. In Lemma 3.2, take β = 1 and a = (n − 1)/2; by (3.4),

E e^{tL∗n} = e^{(log n − 1)pt} · (1 − 2t/n)^{pt − np/2 + p/2} · 2^{−pt} · ∏_{j=0}^{p−1} Γ( (n − j − 1)/2 − t ) / Γ( (n − j − 1)/2 ).

Letting i = n− j − 1, we get

EetL∗n = 2−pt · e(logn−1)pt ·

(1− 2t

n

)pt−np2 + p

2 ·n−1∏

i=n−p

Γ( i2 − t)

Γ( i2 )

(3.10)

for n > p. Then

log E e^{tL∗n} = pt( log n − 1 − log 2 ) + p( t + (1 − n)/2 ) log(1 − 2t/n) + log ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2).

Now, use the identity log(1 − x) = −x − x²/2 + O(x³) as x → 0 to get

p( t + (1 − n)/2 ) log(1 − 2t/n) = p( t + (1 − n)/2 )( −2t/n − 2t²/n² + O(1/n³) )
= −(2pt/n)( t + (1 − n)/2 )( 1 + t/n ) + o(1)
= −(2pt/n)( (1/2)t + (1 − n)/2 + O(1/n) ) + o(1)
= −(p/n)t² + pt − yt + o(1)


as n → ∞. Recall rn = (− log(1 − p/n))^{1/2}. We know t = tn = s/σn = O(1/rn) as n → ∞. By Proposition 2.1,

log ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) = pt(1 + log 2) − pt log n + r²n( t² + (p − n + 1.5)t ) + o(1)

as n → ∞. Joining all the assertions from (3.10) onward, we obtain

log E e^{tL∗n} = pt( log n − 1 − log 2 ) − (p/n)t² + pt − yt + pt(1 + log 2) − pt log n + r²n( t² + (p − n + 1.5)t ) + o(1)
= ( −p/n + r²n )t² + [ p + r²n(p − n + 1.5) − y ]t + o(1) (3.11)

as n → ∞. Noticing that

p + r²n(p − n + 1.5) − y = ( n − p − 3/2 ) log(1 − p/n) + p − y = µn,

and that ( −p/n + r²n )t² = s²/2 by the definition of σn and the notation t = s/σn, it follows from (3.11) that

log E exp{ s(L∗n − µn)/σn } = log E e^{tL∗n} − µn t → s²/2

as n → ∞. This implies (3.9). The proof is completed. □

Now we start to prove Theorem 2. The following lemma says that the distribution of L1 in (1.7) does not depend on the mean vectors or covariance matrices of the population distributions from which the random samples xi's and yj's come.

LEMMA 3.3 Let L1 be defined as in (1.7) with n1 > p and n2 > p. Then, under H0 in (1.4), L1 and

L̃1 := ( (n1 + n2)^{(n1+n2)p/2} / ( n1^{n1p/2} n2^{n2p/2} ) ) |C|^{n1/2} · |I − C|^{n2/2} (3.12)

have the same distribution, where

C = (U∗U + V∗V)^{−1/2} (U∗U) (U∗U + V∗V)^{−1/2} (3.13)

with U = (uij)(n1−1)×p and V = (vij)(n2−1)×p, and {uij, vkl} are i.i.d. random variables with distribution N(0, 1).

Proof. Recall that x1, · · · , xn1 is a random sample from the population Np(µ1, Σ1), y1, · · · , yn2 is a random sample from the population Np(µ2, Σ2), and the two sets of random variables are independent. Under H0 in (1.4), Σ1 = Σ2 = Σ and Σ is non-singular. Set

x̃i = Σ^{−1/2} xi and ỹj = Σ^{−1/2} yj


for 1 ≤ i ≤ n1 and 1 ≤ j ≤ n2. Then {x̃i; 1 ≤ i ≤ n1} are i.i.d. with distribution Np(µ̃1, Ip), where µ̃1 = Σ^{−1/2} µ1; {ỹj; 1 ≤ j ≤ n2} are i.i.d. with distribution Np(µ̃2, Ip), where µ̃2 = Σ^{−1/2} µ2. Further, {x̃i; 1 ≤ i ≤ n1} and {ỹj; 1 ≤ j ≤ n2} are obviously independent. Similar to (1.5) and (1.6), define

Ã = (1/n1) ∑_{i=1}^{n1} (x̃i − x̃̄)(x̃i − x̃̄)∗ and B̃ = (1/n2) ∑_{i=1}^{n2} (ỹi − ỹ̄)(ỹi − ỹ̄)∗ (3.14)

where

x̃̄ = (1/n1) ∑_{i=1}^{n1} x̃i and ỹ̄ = (1/n2) ∑_{i=1}^{n2} ỹi. (3.15)

It is easy to check that

A = Σ^{1/2} Ã Σ^{1/2} and B = Σ^{1/2} B̃ Σ^{1/2}. (3.16)

By Lemma 3.1,

n1Ã =ᵈ U∗U and n2B̃ =ᵈ V∗V, (3.17)

where U = (uij)(n1−1)×p and V = (vij)(n2−1)×p, and {uij, vkl; i, j, k, l ≥ 1} are i.i.d. random variables with distribution N(0, 1). Reviewing (1.7),

L1 = ( |A|^{n1/2} · |B|^{n2/2} ) / |c1A + c2B|^{N/2}
= ( N^{Np/2} / ( n1^{n1p/2} n2^{n2p/2} ) ) · ( |n1A|^{n1/2} · |n2B|^{n2/2} ) / |n1A + n2B|^{N/2}
= ( N^{Np/2} / ( n1^{n1p/2} n2^{n2p/2} ) ) · ( |n1Ã|^{n1/2} · |n2B̃|^{n2/2} ) / |n1Ã + n2B̃|^{N/2} (3.18)

since |n1A| = |n1Ã| · |Σ|, |n2B| = |n2B̃| · |Σ| and

|n1A + n2B| = |Σ^{1/2}(n1Ã + n2B̃)Σ^{1/2}| = |n1Ã + n2B̃| · |Σ|

by (3.16); hence the term |Σ|^{(n1+n2)/2} in the numerator cancels |Σ|^{N/2} in the denominator.

Define C̃ = (n1Ã + n2B̃)^{−1/2} (n1Ã) (n1Ã + n2B̃)^{−1/2}. We see from the independence between n1Ã and n2B̃ and the independence between U∗U and V∗V that

C̃ =ᵈ C, (3.19)

where C is as in (3.13). It is obvious that

|C̃| = |n1Ã| · |n1Ã + n2B̃|^{−1} and |I − C̃| = |n2B̃| · |n1Ã + n2B̃|^{−1}.

Hence we have from (3.18) that

L1 = ( N^{Np/2} / ( n1^{n1p/2} n2^{n2p/2} ) ) · |C̃|^{n1/2} · |I − C̃|^{n2/2}. (3.20)


Finally, we get the desired conclusion from (3.19) and (3.20). □
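The Σ-cancellation behind (3.18)–(3.20) is easy to confirm numerically. The sketch below is an illustration, not part of the proof; the dimensions p = 3, n1 = 8, n2 = 10, the seed, and the random Σ are arbitrary choices. It draws two samples with a common covariance and checks that L1 computed directly from (1.7) agrees with the expression in (3.20).

```python
import numpy as np

rng = np.random.default_rng(7)
p, n1, n2 = 3, 8, 10
N = n1 + n2

# a common non-singular covariance Sigma under H0
M = rng.standard_normal((p, p))
Sigma = M @ M.T + p * np.eye(p)
Sroot = np.linalg.cholesky(Sigma)  # any square root works for the determinant identities

X = rng.standard_normal((n1, p)) @ Sroot.T
Y = rng.standard_normal((n2, p)) @ Sroot.T

def cov(Z):  # sample covariance with divisor n, as in (1.5)-(1.6)
    Zc = Z - Z.mean(axis=0)
    return Zc.T @ Zc / len(Z)

A, B = cov(X), cov(Y)
c1, c2 = n1 / N, n2 / N

ld = lambda S: np.linalg.slogdet(S)[1]
# log L1 directly from (1.7)
logL1 = 0.5 * (n1 * ld(A) + n2 * ld(B) - N * ld(c1 * A + c2 * B))

# log L1 via (3.20): Sigma cancels out of the ratio
Si = np.linalg.inv(Sroot)
At, Bt = Si @ A @ Si.T, Si @ B @ Si.T            # (3.16) inverted
W = n1 * At + n2 * Bt
Ct = np.linalg.inv(W) @ (n1 * At)                # same determinant as the symmetrized C-tilde
logL1_alt = 0.5 * (N * p * np.log(N) - n1 * p * np.log(n1) - n2 * p * np.log(n2)) \
            + 0.5 * (n1 * ld(Ct) + n2 * ld(np.eye(p) - Ct))

print(abs(logL1 - logL1_alt))  # agreement to machine precision
```

The identity is exact for any pair of positive definite A, B, which is why the check succeeds to roundoff rather than only asymptotically.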

Let λ1, · · · , λp be the eigenvalues of the β-Jacobi ensemble or the β-MANOVA matrix, that is, they have the joint probability density function

f(λ1, · · · , λp) = c_J^{β,a1,a2} ∏_{1≤i<j≤p} |λi − λj|^β · ∏_{i=1}^{p} λi^{a1−q}(1 − λi)^{a2−q}  (3.21)

for 0 ≤ λ1, · · · , λp ≤ 1, where a1, a2 > (β/2)(p − 1) are parameters, q = 1 + (β/2)(p − 1), and

c_J^{β,a1,a2} = ∏_{j=1}^{p} Γ(1 + β/2) Γ(a1 + a2 − (β/2)(p − j)) / [Γ(1 + (β/2)j) Γ(a1 − (β/2)(p − j)) Γ(a2 − (β/2)(p − j))]  (3.22)

with a1 = (β/2)(n1 − 1) and a2 = (β/2)(n2 − 1). The fact that f(λ1, · · · , λp) is a probability density function follows from the Selberg integral (see, e.g., [10, 13]):

∫_{[0,1]^p} ∏_{1≤i<j≤p} |λi − λj|^β · ∏_{i=1}^{p} λi^{a1−q}(1 − λi)^{a2−q} dλ1 · · · dλp = 1 / c_J^{β,a1,a2}.  (3.23)
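As a sanity check on the constant in (3.22)–(3.23), the following sketch evaluates the left side of (3.23) for β = 1, p = 2 with a1 = 2.5 and a2 = 3.5 (i.e., n1 = 6, n2 = 8; values chosen only so that the integrand is polynomial) by Gauss–Legendre quadrature over the triangle λ2 < λ1, and compares it with 1/c_J computed from (3.22).

```python
import numpy as np
from math import lgamma, exp

beta, p = 1.0, 2
a1, a2 = 2.5, 3.5                  # a1 = (n1-1)/2, a2 = (n2-1)/2 with n1 = 6, n2 = 8
q = 1 + 0.5 * beta * (p - 1)

# log of c_J from (3.22); the integral (3.23) should equal 1/c_J
log_c = 0.0
for j in range(1, p + 1):
    log_c += lgamma(1 + beta / 2) + lgamma(a1 + a2 - 0.5 * beta * (p - j)) \
             - lgamma(1 + 0.5 * beta * j) - lgamma(a1 - 0.5 * beta * (p - j)) \
             - lgamma(a2 - 0.5 * beta * (p - j))

f = lambda lam: lam ** (a1 - q) * (1 - lam) ** (a2 - q)    # single-point weight

# integrate 2 * (l1 - l2) f(l1) f(l2) over 0 < l2 < l1 < 1 via the map l2 = l1*u,
# which makes the integrand smooth (here polynomial); a 30-point rule is ample
x, w = np.polynomial.legendre.leggauss(30)
x, w = 0.5 * (x + 1), 0.5 * w                              # nodes and weights on [0, 1]
L1, U = np.meshgrid(x, x, indexing="ij")
L2 = L1 * U
vals = 2 * (L1 - L2) * f(L1) * f(L2) * L1                  # Jacobian dl2 = l1 du
integral = float(np.einsum("i,j,ij->", w, w, vals))

print(integral, exp(-log_c))   # the two numbers agree
```

Since the transformed integrand is a polynomial of low degree, the quadrature is exact up to roundoff, so any discrepancy here would point to an error in the constant.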

It is known that

the eigenvalues of C defined in (3.13) have density function f(λ1, · · · , λp) in (3.21) with β = 1, a1 = (1/2)(n1 − 1), a2 = (1/2)(n2 − 1) and q = 1 + (1/2)(p − 1).  (3.24)

See, for example, [7, 14] for this fact.
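The parametrization (3.24) can also be probed by simulation: shifting a1 to a1 + 1 in (3.23) gives E|C| = ∏_{j=1}^{p} (a1 − (p − j)/2)/(a1 + a2 − (p − j)/2), and |C| = |U*U|/|U*U + V*V| is cheap to sample from the construction in (3.13). A Monte Carlo sketch (the sizes p = 3, n1 = 10, n2 = 12 and the seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n1, n2 = 3, 10, 12
a1, a2 = (n1 - 1) / 2, (n2 - 1) / 2
reps = 100_000

# |C| = |U*U| / |U*U + V*V| for C in (3.13)
U = rng.standard_normal((reps, n1 - 1, p))
V = rng.standard_normal((reps, n2 - 1, p))
WU = np.einsum("rij,rik->rjk", U, U)
WV = np.einsum("rij,rik->rjk", V, V)
detC = np.linalg.det(WU) / np.linalg.det(WU + WV)
mc = detC.mean()

# E|C| from the Selberg normalization (3.23) with a1 shifted to a1 + 1
exact = np.prod([(a1 - (p - j) / 2) / (a1 + a2 - (p - j) / 2) for j in range(1, p + 1)])

print(mc, exact)  # close; Monte Carlo error is well below 1%
```

The exact value is a ratio of the constants c_J at parameters (a1, a2) and (a1 + 1, a2), where all Gamma functions cancel telescopically.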

LEMMA 3.4 Let TN be as in (1.7). Assume n1 > p and n2 > p. Then

E e^{tTN} = C_{n1,n2} · Un(t) · V1,n(t)^{−1} · V2,n(t)^{−1}

for all t < (1/2)(1 − p/(n1 ∧ n2)), where

C_{n1,n2} = n1^{n1pt} n2^{n2pt} / (n1 + n2)^{(n1+n2)pt},  Un(t) = ∏_{i=N−p−1}^{N−2} Γ(i/2) / Γ(i/2 − Nt),

V1,n(t) = ∏_{i=n1−p}^{n1−1} Γ(i/2) / Γ(i/2 − n1t)  and  V2,n(t) = ∏_{i=n2−p}^{n2−1} Γ(i/2) / Γ(i/2 − n2t).  (3.25)

Proof. From (1.7), e^{tTN} = (L1)^{−2t} for any t ∈ R. Therefore, by Lemma 3.3,

E e^{tTN} = C_{n1,n2} · E(|C|^{−n1t} · |I − C|^{−n2t}) = C_{n1,n2} · E(∏_{j=1}^{p} λj^{−n1t}(1 − λj)^{−n2t}),

where λ1, · · · , λp are the eigenvalues of C in (3.13). Write c_J^{a1,a2} = c_J^{1,a1,a2}. By (3.22) and (3.24),

E e^{tTN} = C_{n1,n2} · c_J^{a1,a2} · ∫_{[0,1]^p} ∏_{j=1}^{p} λj^{a1−n1t−q}(1 − λj)^{a2−n2t−q} · ∏_{1≤i<j≤p} |λi − λj| dλ1 · · · dλp
          = C_{n1,n2} · c_J^{a1,a2} / c_J^{a1−n1t, a2−n2t}  (3.26)


since f(λ1, · · · , λp) is a probability density function. Of course, recalling ai = (1/2)(ni − 1) for i = 1, 2 and the assumption that t < (1/2)(1 − p/(n1 ∧ n2)), we know

a1 − n1t > (1/2)(p − 1) and a2 − n2t > (1/2)(p − 1),

which are required in (3.21). From (3.26), we see

E e^{tTN} = C_{n1,n2} · ∏_{j=1}^{p} Γ(a1 + a2 − (1/2)(p − j)) / Γ(a1 + a2 − Nt − (1/2)(p − j))
            · [∏_{j=1}^{p} Γ(a1 − (1/2)(p − j)) / Γ(a1 − n1t − (1/2)(p − j))]^{−1}
            · [∏_{j=1}^{p} Γ(a2 − (1/2)(p − j)) / Γ(a2 − n2t − (1/2)(p − j))]^{−1}
         =: C_{n1,n2} · Ũn(t) · Ṽ1,n(t)^{−1} · Ṽ2,n(t)^{−1}.  (3.27)

Now, use ai = (1/2)(ni − 1) for i = 1, 2 again to have

a1 − (1/2)(p − j) = (1/2)(n1 − p + j − 1);  a2 − (1/2)(p − j) = (1/2)(n2 − p + j − 1);
a1 + a2 − (1/2)(p − j) = (1/2)(N − p + j − 2).

Thus, by setting i = N − p + j − 2 for j = 1, 2, · · · , p, we have

Ũn(t) = ∏_{j=1}^{p} Γ(a1 + a2 − (1/2)(p − j)) / Γ(a1 + a2 − Nt − (1/2)(p − j)) = ∏_{i=N−p−1}^{N−2} Γ(i/2) / Γ(i/2 − Nt) = Un(t).

Similarly, Ṽi,n(t) = Vi,n(t) for i = 1, 2. Combining these with (3.27) yields the desired result. □
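The closed form in Lemma 3.4 can be compared against simulation. The sketch below (the sizes n1 = n2 = 8, p = 2, the seed, and the choice t = 0.1 < (1/2)(1 − p/(n1 ∧ n2)) = 0.375 are arbitrary) estimates E e^{tTN} by Monte Carlo under H0 with Σ = I and compares it with the Gamma-product formula (3.25).

```python
import numpy as np
from math import lgamma, log

def log_mgf(t, n1, n2, p):
    # log E e^{t T_N} from Lemma 3.4, valid for t < (1/2)(1 - p/(n1 ^ n2))
    N = n1 + n2
    out = p * t * (n1 * log(n1) + n2 * log(n2) - N * log(N))      # log C_{n1,n2}
    out += sum(lgamma(i / 2) - lgamma(i / 2 - N * t) for i in range(N - p - 1, N - 1))
    out -= sum(lgamma(i / 2) - lgamma(i / 2 - n1 * t) for i in range(n1 - p, n1))
    out -= sum(lgamma(i / 2) - lgamma(i / 2 - n2 * t) for i in range(n2 - p, n2))
    return out

rng = np.random.default_rng(3)
n1, n2, p, t = 8, 8, 2, 0.1
N, reps = n1 + n2, 200_000

def covs(Z, n):  # stacked sample covariances with divisor n
    Zc = Z - Z.mean(axis=1, keepdims=True)
    return np.einsum("rij,rik->rjk", Zc, Zc) / n

A = covs(rng.standard_normal((reps, n1, p)), n1)
B = covs(rng.standard_normal((reps, n2, p)), n2)
ld = lambda S: np.linalg.slogdet(S)[1]
# T_N = -2 log L1 with L1 as in (1.7), c1 = n1/N, c2 = n2/N
TN = -2 * (0.5 * n1 * ld(A) + 0.5 * n2 * ld(B) - 0.5 * N * ld(n1 / N * A + n2 / N * B))
mc = np.log(np.mean(np.exp(t * TN)))

print(mc, log_mgf(t, n1, n2, p))  # the two values agree within Monte Carlo error
```

Note that t must stay inside the range of Lemma 3.4 so that every Gamma argument i/2 − Nt, i/2 − nit remains positive.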

LEMMA 3.5 Let TN be as in (1.7). Assume ni > p and p/ni → yi ∈ (0, 1] for i = 1, 2. Recall σn^2 in (1.8). Then 0 < σn < ∞ for all n1 ≥ 2, n2 ≥ 2, and E exp{(TN/(Nσn)) t} < ∞ for all t ∈ R once n1 and n2 are sufficiently large.

Proof. First, we claim that

σ^2 := 2[log(1 − y) − γ1^2 log(1 − y1) − γ2^2 log(1 − y2)] > 0  (3.28)

for all y1, y2 ∈ (0, 1), where γ1 = y2(y1 + y2)^{−1}, γ2 = y1(y1 + y2)^{−1} and y = y1y2(y1 + y2)^{−1}.

In fact, consider h(x) = −log(1 − x) for x < 1. Then h′′(x) = (1 − x)^{−2} > 0 for x < 1; that is, h(x) is a convex function. Take γ3 = 2y1y2/(y1 + y2)^2. Then γ1^2 + γ2^2 + γ3 = 1. Hence, by the convexity,

−γ1^2 log(1 − y1) − γ2^2 log(1 − y2) = −γ1^2 log(1 − y1) − γ2^2 log(1 − y2) − γ3 log(1 − 0)
                                     > −log(1 − (γ1^2 y1 + γ2^2 y2 + γ3 · 0)) = −log(1 − y),

where the strict inequality comes since y1 ≠ 0 and y2 ≠ 0.
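This convexity step can be double-checked numerically: on a grid of (y1, y2) ∈ (0, 1)^2 the quantity σ^2 in (3.28) stays strictly positive, and the weights satisfy γ1^2 + γ2^2 + γ3 = 1. A quick sketch (the grid and the sample point (0.3, 0.8) are arbitrary):

```python
from math import log

def sigma2(y1, y2):
    # sigma^2 in (3.28)
    g1 = y2 / (y1 + y2)
    g2 = y1 / (y1 + y2)
    y = y1 * y2 / (y1 + y2)
    return 2 * (log(1 - y) - g1**2 * log(1 - y1) - g2**2 * log(1 - y2))

grid = [i / 20 for i in range(1, 20)]          # y in {0.05, 0.10, ..., 0.95}
min_sigma2 = min(sigma2(a, b) for a in grid for b in grid)

# weight identity gamma1^2 + gamma2^2 + gamma3 = 1 at an arbitrary point
y1, y2 = 0.3, 0.8
g1, g2 = y2 / (y1 + y2), y1 / (y1 + y2)
g3 = 2 * y1 * y2 / (y1 + y2) ** 2
print(min_sigma2 > 0, abs(g1**2 + g2**2 + g3 - 1) < 1e-12)  # prints: True True
```

As y1, y2 → 0 the minimum of σ^2 approaches 0 while staying positive, consistent with the strictness of the Jensen inequality above.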

Now, taking yi = p/ni ∈ (0, 1) for i = 1, 2 in (3.28), we get

γ1 = y2/(y1 + y2) = n1/N,  γ2 = y1/(y1 + y2) = n2/N  and  y = y1y2/(y1 + y2) = p/N.

Evidently, n1/N, n2/N, p/N ∈ (0, 1). Then, by (3.28), we know 0 < σn < ∞ for all n1 ≥ 2, n2 ≥ 2.

Second, noting that

{t : t/(Nσn) < (1/2)(1 − p/(n1 ∧ n2))} = (−∞, (1/2)(1 − p/(n1 ∧ n2)) Nσn),

to prove the second part, it suffices to show from Lemma 3.4 that

lim_{n1,n2→∞} (1 − p/(n1 ∧ n2)) Nσn = +∞.  (3.29)

Case 1: y1 < 1, y2 < 1. Recall σ^2 in (3.28). Evidently, σn^2 → σ^2 ∈ (0, ∞) as n1, n2 → +∞. Hence (3.29) follows since 1 − p/(n1 ∧ n2) → 1 − y1 ∨ y2 > 0.

Case 2: max{y1, y2} = 1. This implies σn^2 → +∞ as n1, n2 → ∞ because log(1 − p/N) → log(1 − y) ∈ (−∞, 0) and the sum of the last two terms on the right-hand side of (1.8) goes to +∞. Further, the given conditions say that ni − 1 ≥ p, and hence 1 − p/ni ≥ 1/ni ≥ 1/N for i = 1, 2. Thus,

(1 − p/(n1 ∧ n2)) Nσn = min{1 − p/n1, 1 − p/n2} · Nσn ≥ σn → +∞

as n1, n2 → ∞. We get (3.29). The proof is complete. □

Proof of Theorem 2. From Lemma 3.5, we assume, without loss of generality, that E exp{(TN/(Nσn)) t} < ∞ for all n1 ≥ 2, n2 ≥ 2 and t ∈ R. Fix t ∈ R. Set tn = t_{n1,n2} = t/σn for n1, n2 ≥ 2. From the condition p/ni → yi for i = 1, 2 as p ∧ n1 ∧ n2 = p → ∞ by the assumption n1 > p and n2 > p (we will simply say “p → ∞” in similar situations later), we know σn^2 has a positive limit (possibly +∞) as p → ∞. It follows that {tn; n1, n2 ≥ 2} is bounded. By Lemma 3.4,

log E exp{(TN/(Nσn)) t} = −log V1,n(tn/N) − log V2,n(tn/N) + log Un(tn/N) + (ptn/N)(n1 log n1 + n2 log n2 − N log N).  (3.30)

Set γ1 = y2(y1 + y2)^{−1}, γ2 = y1(y1 + y2)^{−1} and y = y1y2(y1 + y2)^{−1}. Easily,

ni/N → γi ∈ (0, 1),  p/(N − 1) → y ∈ (0, 1)  and  2 log(1 − p/N) → 2 log(1 − y) ∈ (−∞, 0)

as p → ∞. Then, from (1.8) we know that

(ni/N) tn ∼ γi t · (1/σn) = O((−log(1 − p/ni))^{−1/2})  and  tn = O((−log(1 − p/(N − 1)))^{−1/2})  (3.31)

for i = 1, 2 as p → ∞.

for i = 1, 2 as p → ∞. Replacing “t” in Proposition 2.1 with “n1tn/N”, we have

− log V1,n

( tnN

)= log

n1−1∏i=n1−p

Γ( i2 − n1

N tn)

Γ( i2 )

=n1ptnN

(1 + log 2)− n1ptnN

log n1

+ r2n,1

( n21

N2t2n + (p− n1 + 1.5)

n1

Ntn

)+ o(1) (3.32)

19

Page 20: Likelihood Ratio Tests for Covariance Matrices of High ...users.stat.umn.edu/~jiang040/...revision02262012.pdf · Likelihood Ratio Tests for Covariance Matrices of High-Dimensional

as p → ∞, where

rn,i :=(− log(1− p

ni))1/2

, i = 1, 2. (3.33)

Similarly,

−log V2,n(tn/N) = log ∏_{i=n2−p}^{n2−1} Γ(i/2 − (n2/N)tn) / Γ(i/2)
                = (n2ptn/N)(1 + log 2) − (n2ptn/N) log n2 + r_{n,2}^2 ((n2^2/N^2) tn^2 + (p − n2 + 1.5)(n2/N) tn) + o(1)  (3.34)

as p → ∞. By the same argument, using (3.31) we see

−log Un(tn/N) = log ∏_{i=(N−1)−p}^{(N−1)−1} Γ(i/2 − tn) / Γ(i/2)
              = ptn(1 + log 2) − ptn log(N − 1) + Rn^2 (tn^2 + (p − N + 2.5) tn) + o(1)  (3.35)

as p → ∞, where

Rn = (−log(1 − p/(N − 1)))^{1/2}.  (3.36)

From (3.32) and (3.34),

−log Vi,n(tn/N) + (ptn/N) ni log ni
   = (niptn/N)(1 + log 2) + r_{n,i}^2 ((ni^2/N^2) tn^2 + (p − ni + 1.5)(ni/N) tn) + o(1)
   = (niptn/N)(1 + log 2) + (ni^2 r_{n,i}^2 / N^2) tn^2 + ((p − ni + 1.5) ni r_{n,i}^2 / N) tn + o(1)  (3.37)

as p → ∞ for i = 1, 2. Since {tn} is bounded, use log(1 + x) = x + O(x^2) as x → 0 to see

ptn log N − ptn log(N − 1) = ptn log(1 + 1/(N − 1)) = ytn + o(1)

as p → ∞, where lim p/(N − 1) = y1y2/(y1 + y2) = y < 1.

− logUn

( tnN

)+ ptn logN

= ptn(1 + log 2) + ytn +R2n

(t2n + (p−N + 2.5)tn

)+ o(1)

=n1ptn + n2ptn

N(1 + log 2) +R2

nt2n +

(y + (p−N + 2.5)R2

n

)tn + o(1) (3.38)

as p → ∞. Joining (3.30) with (3.37) and (3.38), we obtain

logEetnTN/N =( n2

1

N2r2n,1 +

n22

N2r2n,2 −R2

n

)t2n + ρntn + o(1) (3.39)


as p → ∞, where

ρn = (1/N)((p − n1 + 1.5) n1 r_{n,1}^2 + (p − n2 + 1.5) n2 r_{n,2}^2) − (p − N + 2.5) Rn^2 − y.  (3.40)

By using the fact log(1 + x) = x + O(x^2) again, we have that

log((N − 1)/N · (N − p)/(N − p − 1)) = log(1 − 1/N) − log(1 − 1/(N − p)) = p/(N(N − p)) + O(1/N^2)

as p → ∞.

as p → ∞. Reviewing (3.36), we have

R2n = − log

(1− p

N − 1

)= − log

(1− p

N

)+ log

(N − 1

N· N − p

N − p− 1

)= r2n +

p

N(N − p)+O(

1

N2) (3.41)

as p → ∞, where

rn =(− log

(1− p

N

))1/2

.
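The expansion (3.41) is elementary but easy to get wrong by a factor, so a quick numerical check is worthwhile. Fixing y = p/N at 0.3 (an arbitrary value below 1), the sketch below confirms that Rn^2 − rn^2 − p/(N(N − p)) is indeed O(1/N^2):

```python
from math import log

def err(N, p):
    # the remainder in (3.41): Rn^2 - rn^2 - p/(N(N-p))
    Rn2 = -log(1 - p / (N - 1))
    rn2 = -log(1 - p / N)
    return Rn2 - rn2 - p / (N * (N - p))

# along N -> infinity with p/N = 0.3, the remainder scaled by N^2 stays bounded
scaled = [abs(err(N, int(0.3 * N))) * N**2 for N in (10**3, 10**4, 10**5)]
print(scaled)
```

The scaled remainder settles near a constant (of order 1/2 at y = 0.3), exactly the O(1/N^2) behavior claimed.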

In particular, since {tn} is bounded,

Rn^2 tn^2 = rn^2 tn^2 + o(1)  (3.42)

as p → ∞. By (3.41), recalling p/N → y, we get

(p − N + 2.5) Rn^2 = (p − N + 2.5) rn^2 − p/N + o(1) = (p − N + 2.5) rn^2 − y + o(1)

as p → ∞. Plug this into (3.40) to have that

ρn = (1/N)((p − n1 + 1.5) n1 r_{n,1}^2 + (p − n2 + 1.5) n2 r_{n,2}^2) − (p − N + 2.5) rn^2 + o(1)  (3.43)

as p → ∞. Now plug the above and (3.42) into (3.39); since {tn} is bounded, we have

log E e^{tnTN/N} = ((n1^2/N^2) r_{n,1}^2 + (n2^2/N^2) r_{n,2}^2 − rn^2) tn^2 + µn tn + o(1)  (3.44)

as p → ∞ with

µn = (1/N)((p − n1 + 1.5) n1 r_{n,1}^2 + (p − n2 + 1.5) n2 r_{n,2}^2) − (p − N + 2.5) rn^2.

Using tn = t/σn and the definition of σn, we get

((n1^2/N^2) r_{n,1}^2 + (n2^2/N^2) r_{n,2}^2 − rn^2) tn^2
   = tn^2 (log(1 − p/N) − (n1^2/N^2) log(1 − p/n1) − (n2^2/N^2) log(1 − p/n2)) → t^2/2

as p → ∞. This and (3.44) conclude that

log E exp{((TN − Nµn)/N) tn} = log E e^{tnTN/N} − µn tn → t^2/2

as p → ∞, which is equivalent to

E exp{(1/σn)(TN/N − µn) t} → e^{t^2/2} = E e^{tN(0,1)}

as p → ∞ for any t ∈ R. The proof is completed by using (3.7). □
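The whole computation can be sanity-checked without simulation: Lemma 3.4 gives the exact moment generating function, so log E exp{(t/σn)(TN/N − µn)} can be evaluated numerically via lgamma and watched as it approaches t^2/2. A sketch with y1 = y2 = 1/2, taking ni = 2p (an arbitrary choice with max{y1, y2} < 1):

```python
from math import lgamma, log

def standardized_log_mgf(t, n1, n2, p):
    # exact log E exp{(t/sigma_n)(T_N/N - mu_n)}, using Lemma 3.4 and the
    # mu_n, sigma_n^2 appearing in the proof of Theorem 2
    N = n1 + n2
    r1s, r2s = -log(1 - p / n1), -log(1 - p / n2)
    rns = -log(1 - p / N)
    sig2 = 2 * ((n1 / N) ** 2 * r1s + (n2 / N) ** 2 * r2s - rns)
    mu = ((p - n1 + 1.5) * n1 * r1s + (p - n2 + 1.5) * n2 * r2s) / N - (p - N + 2.5) * rns
    s = t / (N * sig2 ** 0.5)                       # the argument fed to the MGF
    out = p * s * (n1 * log(n1) + n2 * log(n2) - N * log(N))
    out += sum(lgamma(i / 2) - lgamma(i / 2 - N * s) for i in range(N - p - 1, N - 1))
    out -= sum(lgamma(i / 2) - lgamma(i / 2 - n1 * s) for i in range(n1 - p, n1))
    out -= sum(lgamma(i / 2) - lgamma(i / 2 - n2 * s) for i in range(n2 - p, n2))
    return out - t * mu / sig2 ** 0.5

vals = [standardized_log_mgf(1.0, 2 * p, 2 * p, p) for p in (50, 200, 800)]
print(vals)  # approaches t^2/2 = 0.5 as p grows
```

Since the MGF is exact, the only error here is the o(1) term of the expansion, so the convergence to t^2/2 is visible already at moderate p.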

Acknowledgements. We thank Danning Li very much for checking our proofs and for many good suggestions. We also thank an anonymous referee for very helpful comments on the revision.

References

[1] Ahlfors, L. V. (1979). Complex Analysis, 3rd Edition. McGraw-Hill, Inc.

[2] Bai, Z., Jiang, D., Yao, J. and Zheng, S. (2009). Corrections to LRT on large-dimensional covariance matrix by RMT. Ann. Stat. 37(6B) 3822-3840.

[3] Bai, Z. and Saranadasa, H. (1996). Effect of high dimension: by an example of a two sample problem. Statist. Sinica 6 311-329.

[4] Bai, Z. and Silverstein, J. (2004). CLT for linear spectral statistics of large-dimensional sample covariance matrices. Ann. Probab. 32 553-605.

[5] Billingsley, P. (1986). Probability and Measure, 2nd Edition. Wiley Series in Probability and Mathematical Statistics.

[6] Gamelin, T. W. (2001). Complex Analysis, 1st Edition. Springer.

[7] Constantine, A. (1963). Some non-central distribution problems in multivariate analysis. Ann. Math. Stat. 34 1270-1285.

[8] Dempster, A. (1958). A high-dimensional two sample significance test. Ann. Math. Statist. 29 995-1010.

[9] Dumitriu, I. and Edelman, A. (2002). Matrix models for beta-ensembles. J. Math. Phys. 43(11) 5830-5847.

[10] Forrester, P. and Warnaar, S. (2008). The importance of the Selberg integral. Bull. Amer. Math. Soc. 45(4) 489-534.

[11] Freitag, E. and Busam, R. (2005). Complex Analysis. Springer.

[12] Jiang, T. Limit Theorems on Beta-Jacobi Ensembles. http://arxiv.org/abs/0911.2262.

[13] Mehta, M. L. (2004). Random Matrices, 3rd Edition. Pure and Applied Mathematics (Amsterdam), 142. Elsevier/Academic Press, Amsterdam.

[14] Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory. Wiley, New York.

[15] Zheng, S. (2008). Central limit theorem for linear spectral statistics of large dimensional F-matrix. Preprint. Northeast Normal Univ., Changchun, China.
