arXiv:1108.5266v1 [math.PR] 26 Aug 2011
Fluctuations of an Improved Population Eigenvalue Estimator in Sample Covariance Matrix Models

J. Yao, R. Couillet, J. Najim, M. Debbah

October 16, 2018
Abstract
This article provides a central limit theorem for a consistent estimator of population eigenvalues with large multiplicities, based on sample covariance matrices. The focus is on limited sample size situations, in which the number of available observations is known and comparable in magnitude to the observation dimension. An exact expression, as well as an empirical, asymptotically accurate approximation of the limiting variance, is derived. Simulations are performed that corroborate the theoretical claims. A specific application to wireless sensor networks is developed.
I. INTRODUCTION
Problems of statistical inference based on M independent observations of an N-variate random variable y, with E[y] = 0 and E[yyᴴ] = R_N, have drawn the attention of researchers from many fields for years: portfolio optimization in finance [1], gene coexistence in biostatistics [2], channel capacity in wireless communications [3], power estimation in sensor networks [4], array processing [5], etc.
In particular, retrieving spectral properties of the population covariance matrix R_N, based on the observation of M independent and identically distributed (i.i.d.) samples y⁽¹⁾, …, y⁽ᴹ⁾, is paramount to many questions of general science. If M is large compared to N, then it is known that almost surely
This work was partially supported by Agence Nationale de la Recherche (France), program ANR-07-MDCO-012-01 'Sesame'.
Yao and Najim are with Télécom Paristech, France. {yao,najim}@telecom-paristech.fr; Yao is also with Ecole Normale Supérieure, and Najim with CNRS.
Couillet and Debbah are with Supélec, France. {romain.couillet, merouane.debbah}@supelec.fr; Debbah also holds the Alcatel-Lucent/Supélec Flexible Radio chair, France.
October 16, 2018 DRAFT
http://arxiv.org/abs/1108.5266v1
$\|\hat R_N - R_N\| \to 0$ as $M\to\infty$, for any standard matrix norm, where $\hat R_N$ is the sample covariance matrix
$$\hat R_N \triangleq \frac 1M \sum_{m=1}^M y^{(m)} y^{(m)\mathsf H}\ .$$
However, one cannot always afford a large number of samples, especially
in wireless communications, where the number of available samples often has a size comparable to the dimension of each sample. In order to cope with this issue, random matrix theory [6], [7] has proposed new tools, mainly spurred by the G-estimators of Girko [8]. Other works include convex optimization methods [9], [10] and free probability tools [11], [12]. Many of those estimators are consistent in the sense that they are asymptotically unbiased as M, N grow large at the same rate. Nonetheless, only recently have techniques been unveiled which make it possible to estimate individual eigenvalues and functionals of eigenvectors of R. The main contributor is Mestre [13]-[14], who studies the case where R_N = U_N D_N U_Nᴴ, with D_N diagonal with entries of large multiplicities and U_N with i.i.d. entries. For this model, he provides an estimator for every eigenvalue of R with large multiplicity under some separability condition; see also Vallet et al. [15] and Couillet et al. [4] for more elaborate models.
These estimators, although proven asymptotically unbiased, have nonetheless not been fully characterized in terms of performance statistics. It is in particular fundamental to evaluate the variance of these estimators for not-too-large M, N. The purpose of this article is to study the fluctuations of the population eigenvalue estimator of [14] in the case of structured population covariance matrices. A central limit theorem (CLT) is provided to describe the asymptotic fluctuations of the estimators, with an exact expression for the variance as M, N tend to infinity. An empirical, asymptotically accurate approximation is also derived.
The results are applied in a cognitive radio context, in which we assume the co-existence of a licensed (primary) network and an opportunistic (secondary) network aiming at reusing the bandwidth resources left unoccupied by the primary network. The eigenvalue estimator is used here by secondary users to estimate the transmit power of primary users, while the fluctuations are used to provide a confidence margin on the estimate.
The remainder of the article is structured as follows: In Section II, the system model is introduced and the main results from [13], [14] are recalled. In Section III, the CLT for the estimator in [14] is stated with the asymptotic variance. In Section IV, an empirical approximation for the variance is derived. A cognitive radio application of these results is provided in Section V, with comparative Monte Carlo simulations. Finally, Section VI concludes this article. Technical proofs are postponed to the appendix.
II. ESTIMATION OF THE POPULATION EIGENVALUES
A. Notations
In this paper, the notations s, x, M stand for scalars, vectors and matrices, respectively. As usual, ‖x‖ represents the Euclidean norm of the vector x and ‖M‖ stands for the spectral norm of M. The superscripts (·)ᵀ and (·)ᴴ respectively stand for the transpose and the transpose conjugate; the trace of M is denoted by Tr(M); the mathematical expectation operator, by E. If x is an N × 1 vector, then diag(x) is the N × N matrix whose diagonal elements are the components of x. If z ∈ ℂ, then ℜ(z) and ℑ(z) respectively stand for z's real and imaginary parts, while i stands for √−1; z̄ stands for z's conjugate, and δ_kℓ denotes Kronecker's symbol (whose value is 1 if k = ℓ and 0 otherwise).
If the support S of a probability measure over ℝ is the finite union of closed compact intervals S_k for 1 ≤ k ≤ L, we will refer to each compact interval S_k as a cluster of S.
If Z ∈ ℂᴺˣᴺ is a nonnegative Hermitian matrix with eigenvalues (ξ_i; 1 ≤ i ≤ N), we denote in the sequel by eig(Z) = {ξ_i, 1 ≤ i ≤ N} the set of its eigenvalues and by F^Z the empirical distribution of its eigenvalues (also called the spectral distribution of Z), i.e.:
$$F^{\mathbf Z}(d\lambda) = \frac 1N \sum_{i=1}^N \delta_{\xi_i}(d\lambda)\ ,$$
where δ_x stands for the Dirac probability measure at x.
Convergence in distribution will be denoted by $\xrightarrow{\mathcal D}$, convergence in probability by $\xrightarrow{\mathcal P}$, and almost sure convergence by $\xrightarrow{\text{a.s.}}$.
B. Matrix Model
Consider an N × M matrix X_N = (X_ij) whose entries are i.i.d. random variables with distribution CN(0, 1), i.e. X_ij = U + iV, where U, V are independent real Gaussian random variables N(0, 1/2). Let R_N be an N × N Hermitian matrix with L (L being fixed) distinct eigenvalues ρ₁ < ⋯ < ρ_L with respective multiplicities N₁, ⋯, N_L (notice that $\sum_{i=1}^L N_i = N$). Consider now
$$Y_N = R_N^{1/2} X_N\ .$$
The matrix Y_N = [y₁, ⋯, y_M] is the concatenation of M independent observations, where each observation writes y_i = R_N^{1/2} x_i with X_N = [x₁, ⋯, x_M]. In particular, the (population) covariance matrix of each observation y_i is R_N = E y_i y_iᴴ. In this article, we are interested in recovering
information on R_N based on the observation
$$\hat R_N = \frac 1M\, R_N^{1/2} X_N X_N^{\mathsf H} R_N^{1/2}\ ,$$
which is referred to as the sample covariance matrix.
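The two regimes discussed below can be illustrated numerically. The following is a minimal sketch (our own code, not the authors'): it builds a diagonal R_N with eigenvalues (1, 3, 10), draws Y_N = R_N^{1/2} X_N with i.i.d. CN(0,1) entries, and compares the spectral-norm error of the sample covariance matrix for many versus few samples.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_cov(R_sqrt, M, rng):
    """Return the N x N sample covariance (1/M) R^{1/2} X X^H R^{1/2}."""
    N = R_sqrt.shape[0]
    X = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
    Y = R_sqrt @ X
    return (Y @ Y.conj().T) / M

N = 30
rho = np.repeat([1.0, 3.0, 10.0], N // 3)   # population eigenvalues, equal multiplicity
R_sqrt = np.diag(np.sqrt(rho))              # diagonal R_N: R^{1/2} is the entrywise sqrt

err_many = np.linalg.norm(sample_cov(R_sqrt, 3000, rng) - np.diag(rho), 2)
err_few = np.linalg.norm(sample_cov(R_sqrt, 60, rng) - np.diag(rho), 2)
print(err_many, err_few)
```

As expected, the error is small when M ≫ N but of the order of ‖R_N‖ when M is comparable to N, which is precisely the regime the paper addresses.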
It is in general a complicated task to infer the spectral properties of R_N based on R̂_N for finite N, M. Instead, in the following, we assume that N and M are large, and consider the following asymptotic regime:
Assumption 1 (A1):
$$N, M \to \infty\ ,\quad \text{with}\quad \frac NM \to c \in (0,\infty)\quad \text{and}\quad \frac{N_i}M \to c_i \in (0,\infty)\ ,\quad 1\le i\le L. \tag{1}$$
This assumption will be referred to, for short, as N, M → ∞.
Assumption 2 (A2):
We assume that the limiting support S of the eigenvalue distribution of $\hat R_N$ is formed of L compact disjoint subsets (cf. Figure 1). Following [14], one can also reformulate this condition in a more analytic manner: the limiting support of $\hat R_N$ is formed of L clusters if and only if, for every i ∈ {1, …, L}, $\inf_N\{\frac MN - \Psi_N(i)\} > 0$, where
$$\Psi_N(i) = \begin{cases} \dfrac 1N \sum_{r=1}^L N_r \left(\dfrac{\rho_r}{\rho_r-\alpha_1}\right)^2 & i = 1,\\[2mm] \max\left\{ \dfrac 1N \sum_{r=1}^L N_r \left(\dfrac{\rho_r}{\rho_r-\alpha_{i-1}}\right)^2,\ \dfrac 1N \sum_{r=1}^L N_r \left(\dfrac{\rho_r}{\rho_r-\alpha_i}\right)^2 \right\} & 1 < i < L,\\[2mm] \dfrac 1N \sum_{r=1}^L N_r \left(\dfrac{\rho_r}{\rho_r-\alpha_{L-1}}\right)^2 & i = L, \end{cases}$$
where α₁ ≤ ⋯ ≤ α_{L−1} are the L − 1 different ordered solutions of the equation
$$\frac 1N \sum_{r=1}^L \frac{N_r\, \rho_r^2}{(\rho_r - x)^3} = 0\ .$$
This condition is also called the separability condition.
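The separability condition can be checked numerically. The sketch below (our own code, under the setting of Figure 1: ρ = (1, 3, 10), N₁ = N₂ = N₃ = 20, M = 600) finds the α_i by bisection — the left-hand side of the equation above is increasing on each gap (ρ_i, ρ_{i+1}), going from −∞ to +∞, so there is exactly one root per gap — and then evaluates Ψ_N(i) against M/N.

```python
import numpy as np

rho = np.array([1.0, 3.0, 10.0])
Nr = np.array([20, 20, 20])
N, M, L = Nr.sum(), 600, len(rho)

def f(x):
    # (1/N) sum_r N_r rho_r^2 / (rho_r - x)^3, increasing on each gap (rho_i, rho_{i+1})
    return np.sum(Nr * rho**2 / (rho - x) ** 3) / N

def bisect(lo, hi, tol=1e-12):
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

alphas = [bisect(rho[i] + 1e-9, rho[i + 1] - 1e-9) for i in range(L - 1)]

def psi_at(alpha):
    return np.sum(Nr * (rho / (rho - alpha)) ** 2) / N

vals = [psi_at(a) for a in alphas]
Psi = [vals[0]] + [max(vals[i - 1], vals[i]) for i in range(1, L - 1)] + [vals[-1]]
separable = all(M / N > p for p in Psi)
print(alphas, Psi, separable)
```

For this setting M/N = 10 comfortably exceeds all Ψ_N(i), consistent with the three well-separated clusters in Figure 1.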
Figure 1 depicts the eigenvalues of a realization of the random matrix $\hat R_N$ and the associated limiting distribution as N, M grow large, for ρ₁ = 1, ρ₂ = 3, ρ₃ = 10 and N = 60 with equal multiplicities.
C. Mestre’s Estimator of the population eigenvalues
In [14], an estimator of the population covariance matrix eigenvalues(ρk; 1 ≤ k ≤ L) based on theobservationŝRN is proposed.
Fig. 1. Empirical and asymptotic eigenvalue distribution of $\hat R_N$ for L = 3, ρ₁ = 1, ρ₂ = 3, ρ₃ = 10, N/M = c = 0.1, N = 60, N₁ = N₂ = N₃ = 20.
Theorem 1 ([14]): Denote by $\hat\lambda_1 \le \cdots \le \hat\lambda_N$ the ordered eigenvalues of $\hat R_N$. Let M, N → ∞ in the sense of assumption (A1). Under assumptions (A1)-(A2), the following convergence holds true:
$$\hat\rho_k - \rho_k \xrightarrow[M,N\to\infty]{\text{a.s.}} 0\ , \tag{2}$$
where
$$\hat\rho_k = \frac M{N_k} \sum_{m\in\mathcal N_k} \big( \hat\lambda_m - \hat\mu_m \big)\ , \tag{3}$$
with $\mathcal N_k = \{\sum_{j=1}^{k-1} N_j + 1, \ldots, \sum_{j=1}^k N_j\}$ and $\hat\mu_1 \le \cdots \le \hat\mu_N$ the (real and) ordered solutions of:
$$\frac 1N \sum_{m=1}^N \frac{\hat\lambda_m}{\hat\lambda_m - \mu} = \frac MN\ . \tag{4}$$
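The estimator (3)-(4) can be implemented in a few lines. In this sketch (our own code, not the authors'), the μ̂ solving (4) are obtained as the eigenvalues of $\operatorname{diag}(\hat\lambda) - \frac 1M \sqrt{\hat\lambda}\sqrt{\hat\lambda}^{\mathsf T}$: by the rank-one determinant lemma, the characteristic polynomial of that matrix vanishes at μ exactly when $\frac 1N\sum_m \hat\lambda_m/(\hat\lambda_m-\mu) = \frac MN$, and its eigenvalues are automatically real and ordered.

```python
import numpy as np

rng = np.random.default_rng(1)
rho, mult = np.array([1.0, 3.0, 10.0]), 20     # setting of Figure 1
N, M = 3 * mult, 600

R_sqrt = np.diag(np.sqrt(np.repeat(rho, mult)))
X = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
Y = R_sqrt @ X
lam = np.sort(np.linalg.eigvalsh((Y @ Y.conj().T) / M))   # lambda-hat, ascending

# mu-hat: the N real solutions of (4), via the rank-one "secular" matrix
s = np.sqrt(lam)
mu = np.sort(np.linalg.eigvalsh(np.diag(lam) - np.outer(s, s) / M))

# rho-hat_k = (M/N_k) * sum over the k-th group N_k of (lambda-hat - mu-hat)
rho_hat = np.array([(M / mult) * np.sum((lam - mu)[k * mult:(k + 1) * mult])
                    for k in range(3)])
print(rho_hat)
```

With N = 60 and M = 600, the three estimates land close to (1, 3, 10), as Theorem 1 predicts.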
D. Integral Representation of the Estimator ρ̂_k - Stieltjes Transforms
The proof of Theorem 1 relies on random matrix theory; in particular, [16], [17] use the Stieltjes transform as a key ingredient.
The Stieltjes transform m_P of a probability distribution P over ℝ⁺ is the ℂ-valued function defined by:
$$m_{\mathbb P}(z) = \int_{\mathbb R^+} \frac{\mathbb P(d\lambda)}{\lambda - z}\ , \qquad z \in \mathbb C\setminus\mathbb R^+\ .$$
There also exists an inverse formula to recover the probability distribution associated with a Stieltjes transform: let a < b be two continuity points of the cumulative distribution function associated with P; then
$$\mathbb P([a,b]) = \frac 1\pi \lim_{y\downarrow 0}\ \Im\left[ \int_a^b m_{\mathbb P}(x+iy)\, dx \right]\ .$$
In the case where F^Z is the spectral distribution associated with a nonnegative Hermitian matrix Z ∈ ℂᴺˣᴺ with eigenvalues (ξ_i; 1 ≤ i ≤ N), the Stieltjes transform m_Z of F^Z takes the particular form
$$m_{\mathbf Z}(z) = \int \frac{F^{\mathbf Z}(d\lambda)}{\lambda - z} = \frac 1N \sum_{i=1}^N \frac 1{\xi_i - z} = \frac 1N \operatorname{Tr}\,(\mathbf Z - z I_N)^{-1}\ ,$$
which can be seen as the normalized trace of the resolvent (Z − zI_N)⁻¹. Since the seminal paper of Marčenko and Pastur [16], the Stieltjes transform has proved to be extremely efficient at describing the limiting spectrum of large dimensional random matrices.
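The equality between the eigenvalue average and the normalized trace of the resolvent is immediate to check numerically; a small sketch (our own code):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 50)) + 1j * rng.standard_normal((50, 50))
Z = A @ A.conj().T / 50                      # nonnegative Hermitian matrix
xi = np.linalg.eigvalsh(Z)
z = 1.5 + 0.1j                               # any z off the positive real axis

m_from_eigs = np.mean(1.0 / (xi - z))        # (1/N) sum 1/(xi_i - z)
m_from_resolvent = np.trace(np.linalg.inv(Z - z * np.eye(50))) / 50
print(m_from_eigs, m_from_resolvent)
```

Both expressions agree to machine precision, and ℑ m_Z(z) > 0 whenever ℑ z > 0, as for any Stieltjes transform.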
In the following, we recall some elements of the proof of Theorem 1, necessary for the remainder of
the article. The first important result is due to Bai and Silverstein [17] (see also [16]).
Theorem 2 ([17]): Denote by F^R the limiting spectral distribution of R_N, i.e. $F^{\mathbf R}(d\lambda) = \sum_{k=1}^L \frac{c_k}{c}\, \delta_{\rho_k}(d\lambda)$. Under assumption (A1), the spectral distribution $F^{\hat R_N}$ of the sample covariance matrix $\hat R_N$ converges (weakly and almost surely) to a probability distribution F as M, N → ∞, whose Stieltjes transform m(z) satisfies
$$m(z) = \frac 1c\, \underline m(z) - \left(1 - \frac 1c\right)\frac 1z\ ,$$
for z ∈ ℂ⁺ = {z ∈ ℂ : ℑ(z) > 0}, where $\underline m(z)$ is defined as the unique solution in ℂ⁺ of
$$\underline m(z) = -\left( z - c \int \frac{t}{1 + t\,\underline m(z)}\, dF^{\mathbf R}(t) \right)^{-1}.$$
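The fixed-point equation of Theorem 2 can be solved numerically for any z ∈ ℂ⁺. Below is an illustrative sketch (our own code, not the paper's algorithm): plain damped iteration of the map, for the discrete F^R of Figure 1, followed by a residual check and recovery of m(z).

```python
import numpy as np

rho = np.array([1.0, 3.0, 10.0])
weights = np.array([1 / 3, 1 / 3, 1 / 3])    # c_k / c
c = 0.1                                      # N/M

def m_under(z, iters=5000):
    """Damped fixed-point iteration for underline-m(z), z in C+."""
    m = -1.0 / z
    for _ in range(iters):
        m_new = -1.0 / (z - c * np.sum(weights * rho / (1.0 + rho * m)))
        m = 0.5 * m + 0.5 * m_new
    return m

z = 3.0 + 1.0j
mz = m_under(z)
resid = mz + 1.0 / (z - c * np.sum(weights * rho / (1.0 + rho * mz)))
m_F = mz / c - (1 - 1 / c) / z               # Stieltjes transform of F
print(mz, m_F, abs(resid))
```

The residual confirms that the fixed point is reached; both transforms have positive imaginary part on ℂ⁺, as they must.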
Note that $\underline m(z)$ is also a Stieltjes transform, whose associated distribution function will be denoted $\underline F$; it turns out to be the limit of the spectral distribution $F^{\hat{\underline R}_N}$, where $\hat{\underline R}_N$ is defined as
$$\hat{\underline R}_N \triangleq \frac 1M\, X_N^{\mathsf H} R_N X_N\ .$$
Denote by $m_{\hat R_N}(z)$ and $m_{\hat{\underline R}_N}(z)$ the Stieltjes transforms of $F^{\hat R_N}$ and $F^{\hat{\underline R}_N}$. Notice in particular that
$$m_{\hat R_N}(z) = \frac MN\, m_{\hat{\underline R}_N}(z) - \left(1 - \frac MN\right)\frac 1z\ .$$
Remark 1: This relation, combined with (4), readily implies that $m_{\hat{\underline R}_N}(\hat\mu_i) = 0$. Otherwise stated, the $\hat\mu_i$'s are the zeros of $m_{\hat{\underline R}_N}$. This fact will be of importance in the sequel.
Denote by m_N(z) and $\underline m_N(z)$ the finite-dimensional counterparts of m(z) and $\underline m(z)$, respectively, defined by the relations
$$\underline m_N(z) = -\left( z - \frac NM \int \frac{t}{1 + t\,\underline m_N(z)}\, dF^{\mathbf R_N}(t) \right)^{-1},$$
$$m_N(z) = \frac MN\, \underline m_N(z) - \left(1 - \frac MN\right)\frac 1z\ .$$
It can be shown that m_N and $\underline m_N$ are Stieltjes transforms of given probability measures F_N and $\underline F_N$, respectively (cf. [7, Theorem 3.2]).
With these notations at hand, we can now derive Theorem 1. By Cauchy's formula, write
$$\rho_k = \frac N{N_k}\, \frac 1{2i\pi} \oint_{\Gamma_k} \left( \frac 1N \sum_{r=1}^L \frac{N_r\, w}{\rho_r - w} \right) dw\ ,$$
where Γ_k is a negatively oriented contour taking values in ℂ∖{ρ₁, ⋯, ρ_L} and enclosing only ρ_k. With the change of variable $w = -1/\underline m_N(z)$ and the condition that the limiting support S of the eigenvalue distribution of $\hat R_N$ is formed of L distinct clusters (S_k, 1 ≤ k ≤ L) (cf. Figure 1), we can write
$$\rho_k = \frac M{2i\pi N_k} \oint_{\mathcal C_k} z\, \frac{\underline m_N'(z)}{\underline m_N(z)}\, dz\ , \qquad 1\le k\le L\ , \tag{5}$$
where C_k and C_ℓ denote negatively oriented contours enclosing the corresponding clusters S_k and S_ℓ, respectively. Defining
$$\hat\rho_k \triangleq \frac M{2\pi i N_k} \oint_{\mathcal C_k} z\, \frac{m_{\hat{\underline R}_N}'(z)}{m_{\hat{\underline R}_N}(z)}\, dz\ , \qquad 1\le k\le L\ , \tag{6}$$
dominated convergence arguments ensure that ρ_k − ρ̂_k → 0, almost surely. The integral form of ρ̂_k can then be computed explicitly by residue calculus, and this finally yields (3).
The main objective of this article is to study the performance of the estimators (ρ̂_k, 1 ≤ k ≤ L). More precisely, we will establish a central limit theorem (CLT) for (M(ρ̂_k − ρ_k), 1 ≤ k ≤ L) as M, N → ∞, explicitly characterize the limiting covariance matrix Θ = (Θ_kℓ)_{1≤k,ℓ≤L}, and finally provide an estimator for Θ.
III. FLUCTUATIONS OF THE POPULATION EIGENVALUE ESTIMATORS
A. The Central Limit Theorem
The main result of this article is the following CLT, which expresses the fluctuations of (ρ̂_k, 1 ≤ k ≤ L).
Theorem 3: Under assumptions (A1)-(A2) and with the same notations,
$$\big( M(\hat\rho_k - \rho_k),\ 1\le k\le L \big) \xrightarrow[M,N\to\infty]{\mathcal D} \mathbf x \sim \mathcal N_L(0, \Theta)\ ,$$
where N_L refers to a real L-dimensional Gaussian distribution, and Θ is the L × L matrix whose entries Θ_kℓ are given by (7), where C_k and C_ℓ are defined as before (cf. formula (5)).
$$\Theta_{k\ell} = -\frac 1{4\pi^2 c_k c_\ell} \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} \left[ \frac{\underline m'(z_1)\,\underline m'(z_2)}{\big(\underline m(z_1) - \underline m(z_2)\big)^2} - \frac 1{(z_1 - z_2)^2} \right] \frac{dz_1\, dz_2}{\underline m(z_1)\,\underline m(z_2)}\ . \tag{7}$$
B. Proof of Theorem 3
We first outline the main steps of the proof and then provide the details.
Using the integral representations of ρ̂_k and ρ_k, we get, almost surely,
$$M(\hat\rho_k - \rho_k) = \frac{M^2}{2\pi i N_k} \oint_{\mathcal C_k} z \left( \frac{m_{\hat{\underline R}_N}'(z)}{m_{\hat{\underline R}_N}(z)} - \frac{\underline m_N'(z)}{\underline m_N(z)} \right) dz\ .$$
Denote by C(C_k, ℂ) the set of continuous functions from C_k to ℂ, endowed with the supremum norm ‖u‖_∞ = sup_{C_k} |u|. Consider the process
$$(X_N, X_N', u_N, u_N') : \mathcal C_k \to \mathbb C^4$$
where
$$X_N(z) = M\big( m_{\hat{\underline R}_N}(z) - \underline m_N(z) \big)\ , \qquad X_N'(z) = M\big( m_{\hat{\underline R}_N}'(z) - \underline m_N'(z) \big)\ ,$$
$$u_N(z) = m_{\hat{\underline R}_N}(z)\ , \qquad u_N'(z) = m_{\hat{\underline R}_N}'(z)\ .$$
Then, due to the 'no eigenvalue' result (cf. [18], see also Proposition 1), each component of (X_N, X_N', u_N, u_N') almost surely belongs to C(C_k, ℂ), and M(ρ̂_k − ρ_k) writes
$$M(\hat\rho_k - \rho_k) = \frac M{2\pi i N_k} \oint_{\mathcal C_k} z \left( \frac{\underline m_N(z)\, X_N'(z) - u_N'(z)\, X_N(z)}{\underline m_N(z)\, u_N(z)} \right) dz \ \overset\triangle=\ \Upsilon_N(X_N, X_N', u_N, u_N')\ ,$$
where
$$\Upsilon_N(x, x', u, u') = \frac M{2\pi i N_k} \oint_{\mathcal C_k} z \left( \frac{\underline m_N(z)\, x'(z) - u'(z)\, x(z)}{\underline m_N(z)\, u(z)} \right) dz\ . \tag{8}$$
If needed, we shall explicitly indicate the dependence on the contour C_k and write Υ_N(x, x', u, u', C_k).
The main idea of the proof of the theorem lies in three steps:
(i) Prove the convergence in distribution of the process (X_N, X_N', u_N, u_N') to a Gaussian process.
(ii) Transfer this convergence to the quantity Υ_N(X_N, X_N', u_N, u_N') with the help of the continuous mapping theorem [19].
(iii) Check that the limit (in distribution) of Υ_N(X_N, X_N', u_N, u_N') is Gaussian, and compute the limiting covariance between Υ_N(X_N, X_N', u_N, u_N', C_k) and Υ_N(X_N, X_N', u_N, u_N', C_ℓ).
Remark 2: Note that the convergence in step (i) is a convergence in distribution at the process level; hence one first has to establish the finite-dimensional convergence of the process and then to prove that the process is tight over C_k. Tightness turns out to be difficult to establish, due to the lack of control over the eigenvalues of $\hat R_N$ whenever the contour crosses the real line. In order to circumvent this issue, we shall introduce, following Bai and Silverstein [20], a process that approximates X_N and X_N'.
Let us now start the proof of Theorem 3.
Lemma 1: Under assumptions (A1)-(A2), the process
$$(X_N, X_N') : \mathcal C_k \to \mathbb C^2$$
converges in distribution to a Gaussian process (X, Y) with zero mean function and covariance functions:
$$\operatorname{cov}\big(X(z), X(\tilde z)\big) = \frac{\underline m'(z)\,\underline m'(\tilde z)}{\big(\underline m(z) - \underline m(\tilde z)\big)^2} - \frac 1{(z - \tilde z)^2} \ \overset\triangle=\ \kappa(z, \tilde z)\ , \tag{9}$$
$$\operatorname{cov}\big(Y(z), X(\tilde z)\big) = \frac{\partial}{\partial z}\kappa(z, \tilde z)\ , \qquad \operatorname{cov}\big(X(z), Y(\tilde z)\big) = \frac{\partial}{\partial \tilde z}\kappa(z, \tilde z)\ , \qquad \operatorname{cov}\big(Y(z), Y(\tilde z)\big) = \frac{\partial^2}{\partial z\,\partial\tilde z}\kappa(z, \tilde z)\ .$$
Lemma 1 is the cornerstone of the proof of Theorem 3. The proof of Lemma 1 is postponed to Appendix B and relies on the following proposition, of independent interest:
Proposition 1: Assume (A1)-(A2), and denote by S the support of the probability distribution associated with the Stieltjes transform m. Then, for every ε > 0 and ℓ ∈ ℕ*,
$$\mathbb P\left( \sup_{\lambda\in\operatorname{eig}(\hat R_N)} d(\lambda, \mathcal S) > \varepsilon \right) = O\left( \frac 1{N^\ell} \right)\ ,$$
where d(λ, S) = inf_{x∈S} |λ − x|. The proof of Proposition 1 is postponed to Appendix A.
As $(u_N, u_N') \xrightarrow[N,M\to\infty]{\text{a.s.}} (\underline m, \underline m')$, a straightforward corollary of Lemma 1 yields the convergence in distribution of (X_N, X_N', u_N, u_N') to the Gaussian process $(X, Y, \underline m, \underline m')$. This concludes the proof of step (i).
We are now in position to transfer the convergence of (X_N, X_N', u_N, u_N') to Υ_N(X_N, X_N', u_N, u_N') via the continuous mapping theorem, whose statement, as expressed in [19], is recalled below.
Proposition 2 (cf. [19, Th. 4.27]): For any metric spaces S₁ and S₂, let ξ, (ξ_n)_{n≥1} be random elements in S₁ with $\xi_n \xrightarrow[n\to\infty]{\mathcal D} \xi$, and consider measurable mappings f, (f_n)_{n≥1} : S₁ → S₂ and a measurable set Γ ⊂ S₁ with ξ ∈ Γ a.s., such that f_n(s_n) → f(s) as s_n → s ∈ Γ. Then $f_n(\xi_n) \xrightarrow[n\to\infty]{\mathcal D} f(\xi)$.
It remains to apply Proposition 2 to the process (X_N, X_N', u_N, u_N') and to the function Υ_N defined in (8). Denote by¹
$$\Upsilon(x, y, v, w) = \frac 1{2\pi i c_k} \oint_{\mathcal C_k} z \left( \frac{\underline m(z)\, y(z) - w(z)\, x(z)}{\underline m(z)\, v(z)} \right) dz\ ,$$
and consider the set
$$\Gamma = \left\{ (x, y, v, w) \in C(\mathcal C_k, \mathbb C)^4\ :\ \inf_{\mathcal C_k} |v| > 0 \right\}\ .$$
Then it is shown in [6, Section 9.12.1] that $\inf_{\mathcal C_k} |\underline m| > 0$, and, by a dominated convergence argument, that (x_N, y_N, v_N, w_N) → (x, y, v, w) ∈ Γ implies Υ_N(x_N, y_N, v_N, w_N) → Υ(x, y, v, w). Therefore, Proposition 2 applies to Υ_N(X_N, X_N', u_N, u_N') and the following convergence holds true:
$$\Upsilon_N(X_N, X_N', u_N, u_N') \xrightarrow[M,N\to\infty]{\mathcal D} \Upsilon(X, Y, \underline m, \underline m')\ ,$$
and step (ii) is established.
It now remains to prove step (iii), i.e. to check the Gaussianity of the random variable $\Upsilon(X, Y, \underline m, \underline m')$ and to compute the covariance between $\Upsilon(X, Y, \underline m, \underline m', \mathcal C_k)$ and $\Upsilon(X, Y, \underline m, \underline m', \mathcal C_\ell)$.
In order to propagate the Gaussianity of the deviations in the integrand of (6) to the deviations of the integral which defines ρ̂_k, it suffices to notice that the integral can be written as the limit of a finite Riemann sum, and that a finite Riemann sum of Gaussian random variables is still Gaussian. Therefore M(ρ̂_k − ρ_k) converges to a Gaussian distribution. As $\inf_{z\in\mathcal C_k} |\underline m(z)| > 0$, a straightforward application of Fubini's theorem, together with the fact that E(X) = E(Y) = 0, yields
$$\mathbb E \oint \left( \frac{z\,\underline m'(z)\, X(z)}{\underline m^2(z)} - \frac{z\, Y(z)}{\underline m(z)} \right) dz = 0\ .$$
It remains to compute the covariance between $\Upsilon(X, Y, \underline m, \underline m', \mathcal C_k)$ and $\Upsilon(X, Y, \underline m, \underline m', \mathcal C_\ell)$ for possibly different contours C_k and C_ℓ. We shall therefore evaluate, for 1 ≤ k, ℓ ≤ L:
$$\Theta_{k\ell} = \mathbb E\big( \Upsilon(X, Y, \underline m, \underline m', \mathcal C_k)\, \Upsilon(X, Y, \underline m, \underline m', \mathcal C_\ell) \big)$$
$$\overset{(a)}= -\frac 1{4\pi^2 c_k c_\ell} \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} z_1 z_2 \left( \frac{\underline m'(z_1)\,\underline m'(z_2)\,\kappa(z_1,z_2)}{\underline m^2(z_1)\,\underline m^2(z_2)} - \frac{\underline m'(z_1)\,\partial_2\kappa(z_1,z_2)}{\underline m^2(z_1)\,\underline m(z_2)} - \frac{\underline m'(z_2)\,\partial_1\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m^2(z_2)} + \frac{\partial^2_{12}\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)} \right) dz_1\, dz_2\ ,$$
¹ As previously, we shall explicitly indicate the dependence on the contour C_k if needed and write Υ(x, x', u, u', C_k).
$$\hat\Theta_{k\ell} = -\frac{M^2}{4\pi^2 N_k N_\ell} \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} \left( \frac{m_{\hat{\underline R}_N}'(z_1)\, m_{\hat{\underline R}_N}'(z_2)}{\big(m_{\hat{\underline R}_N}(z_1) - m_{\hat{\underline R}_N}(z_2)\big)^2} - \frac 1{(z_1 - z_2)^2} \right) \frac{dz_1\, dz_2}{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)}\ . \tag{10}$$
where (a) follows from the fact that $\inf_{z\in\mathcal C_k} |\underline m(z)| > 0$ together with Fubini's theorem, and ∂₁, ∂₂, ∂²₁₂ respectively stand for ∂/∂z₁, ∂/∂z₂ and ∂²/∂z₁∂z₂.
By integration by parts, we obtain
$$\oint \frac{z_1 z_2\, \underline m'(z_2)\,\partial_1\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m^2(z_2)}\, dz_1 = \oint \left( -\frac{z_2\,\underline m'(z_2)\,\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m^2(z_2)} + \frac{z_1 z_2\,\underline m'(z_1)\,\underline m'(z_2)\,\kappa(z_1,z_2)}{\underline m^2(z_1)\,\underline m^2(z_2)} \right) dz_1\ .$$
Similarly,
$$\oint \frac{z_1 z_2\, \partial^2_{12}\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_1 = -\oint \frac{z_2\,\partial_2\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_1 + \oint \frac{z_1 z_2\,\underline m'(z_1)\,\partial_2\kappa(z_1,z_2)}{\underline m^2(z_1)\,\underline m(z_2)}\, dz_1\ .$$
Hence
$$\Theta_{k\ell} = -\frac 1{4\pi^2 c_k c_\ell} \left\{ \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} \frac{z_2\,\underline m'(z_2)\,\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m^2(z_2)}\, dz_1\, dz_2 - \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} \frac{z_2\,\partial_2\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_1\, dz_2 \right\}\ .$$
Another integration by parts yields
$$\oint \frac{z_2\,\partial_2\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_2 = -\oint \frac{\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_2 + \oint \frac{z_2\,\underline m'(z_2)\,\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m^2(z_2)}\, dz_2\ .$$
Finally, we obtain
$$\Theta_{k\ell} = -\frac 1{4\pi^2 c_k c_\ell} \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} \frac{\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_1\, dz_2\ ,$$
and (7) is established.
IV. ESTIMATION OF THE COVARIANCE MATRIX
Theorem 3 describes the limiting performance of the estimator of Theorem 1, with an exact characterization of its variance. Unfortunately, the variance Θ depends upon unknown quantities. We provide hereafter consistent estimates Θ̂ of Θ based on the observations R̂_N.
$$\hat\Theta_{k\ell} = \frac{M^2}{N_k N_\ell} \left[ \sum_{(i,j)\in\mathcal N_k\times\mathcal N_\ell,\ i\ne j} \frac{-1}{(\hat\mu_i - \hat\mu_j)^2\, m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}'(\hat\mu_j)} + \delta_{k\ell} \sum_{i\in\mathcal N_k} \left( \frac{m_{\hat{\underline R}_N}'''(\hat\mu_i)}{6\, m_{\hat{\underline R}_N}'(\hat\mu_i)^3} - \frac{m_{\hat{\underline R}_N}''(\hat\mu_i)^2}{4\, m_{\hat{\underline R}_N}'(\hat\mu_i)^4} \right) \right]. \tag{11}$$
Theorem 4: Assume that (A1)-(A2) hold true, and recall the definition of Θ_kℓ given in (7). Let Θ̂_kℓ be defined by (11), where (N_k) and (μ̂_k) are defined in Theorem 1. Then
$$\hat\Theta_{k\ell} - \Theta_{k\ell} \xrightarrow{\text{a.s.}} 0$$
as N, M → ∞.
Theorem 4 is useful in practice, as one can simultaneously obtain an estimate ρ̂_k of the values ρ_k and an estimate of the degree of confidence in each ρ̂_k.
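Formula (11) involves only observable quantities and can be implemented directly. A sketch (our own code, under the setting of Figure 1): the derivatives of $m_{\hat{\underline R}_N}$ are explicit, since $m_{\hat{\underline R}_N}^{(p)}(z) = \frac{p!}M \sum_i (\lambda_i - z)^{-(p+1)}$, where the λ_i are the eigenvalues of $\hat R_N$ completed with M − N zeros, and the μ̂_i are computed as in Theorem 1.

```python
import math
import numpy as np

rng = np.random.default_rng(4)
rho, mult = np.array([1.0, 3.0, 10.0]), 20
N, M, L = 3 * mult, 600, 3

X = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
Y = np.diag(np.sqrt(np.repeat(rho, mult))) @ X
lam = np.sort(np.linalg.eigvalsh(Y @ Y.conj().T / M))
lam_under = np.concatenate([np.zeros(M - N), lam])   # spectrum of underline-R-hat

s = np.sqrt(lam)                                     # mu-hat: zeros of the transform,
mu = np.sort(np.linalg.eigvalsh(np.diag(lam) - np.outer(s, s) / M))  # via (4)

def d_m(p, z):
    """p-th derivative of the Stieltjes transform of underline-R-hat at z."""
    return math.factorial(p) / M * np.sum(1.0 / (lam_under - z) ** (p + 1))

m1, m2, m3 = (np.array([d_m(p, u) for u in mu]) for p in (1, 2, 3))

Theta_hat = np.zeros((L, L))
for k in range(L):
    for l in range(L):
        gk = np.arange(k * mult, (k + 1) * mult)
        gl = np.arange(l * mult, (l + 1) * mult)
        tot = sum(-1.0 / ((mu[i] - mu[j]) ** 2 * m1[i] * m1[j])
                  for i in gk for j in gl if i != j)
        if k == l:  # the delta_{kl} correction terms of (11)
            tot += np.sum(m3[gk] / (6 * m1[gk] ** 3) - m2[gk] ** 2 / (4 * m1[gk] ** 4))
        Theta_hat[k, l] = M * M / (mult * mult) * tot
print(np.diag(Theta_hat))
```

The resulting matrix is symmetric by construction, and its diagonal entries estimate the variances of the limiting Gaussian fluctuations of Theorem 3.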
Proof: In view of formula (7), and taking into account the fact that $m_{\hat{\underline R}_N}$ and $m_{\hat{\underline R}_N}'$ are consistent estimates of $\underline m$ and $\underline m'$, it is natural to define Θ̂_kℓ by replacing the unknown quantities $\underline m$ and $\underline m'$ in (7) by their empirical counterparts $m_{\hat{\underline R}_N}$ and $m_{\hat{\underline R}_N}'$; hence the definition of Θ̂_kℓ in (10).
The proof of Theorem 4 now breaks down into two steps: the convergence of Θ̂_kℓ to Θ_kℓ, which relies on the definition (10) of Θ̂_kℓ and on a dominated convergence argument, and the effective computation of the integral in (10), which relies on Cauchy's residue theorem [21] and yields (11).
We first address the convergence of Θ̂_kℓ to Θ_kℓ. Due to [18], [22], almost surely, the eigenvalues of R̂_N eventually belong to any ε-blow-up of the support S of the probability measure associated with m, i.e. the set {x ∈ ℝ : d(x, S) < ε}. Hence, if ε is small enough, the distance between these eigenvalues and any z ∈ C_k is eventually uniformly lower-bounded. By [14, Lemma 1], the same result holds true for the zeros of $m_{\hat{\underline R}_N}$ (which are real). In particular, this implies that $m_{\hat{\underline R}_N}$ is eventually uniformly lower-bounded on C_k (if not, then by compactness there would exist z ∈ C_k such that $m_{\hat{\underline R}_N}(z) = 0$, which yields a contradiction, because all the zeros of $m_{\hat{\underline R}_N}$ lie strictly within the contour). With these arguments at hand, one can easily apply the dominated convergence theorem and conclude that, almost surely, Θ̂_kℓ → Θ_kℓ.
We now evaluate the integral (10) by computing the residues of the integrand within C_k and C_ℓ. There are two cases to discuss, depending on whether k ≠ ℓ or k = ℓ. Denote by h(z₁, z₂) the integrand in (10), that is:
$$h(z_1, z_2) = \left( \frac{m_{\hat{\underline R}_N}'(z_1)\, m_{\hat{\underline R}_N}'(z_2)}{\big(m_{\hat{\underline R}_N}(z_1) - m_{\hat{\underline R}_N}(z_2)\big)^2} - \frac 1{(z_1 - z_2)^2} \right) \frac 1{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)}\ . \tag{12}$$
We first consider the case where k ≠ ℓ. In this case, the two integration contours are different, and it can be assumed that they never intersect (so that it can always be assumed that z₁ ≠ z₂). Let z₂ be fixed, and denote by μ̂_i the zeros (labeled in increasing order) of $m_{\hat{\underline R}_N}$; then the computation of the residue Res(h(·, z₂), μ̂_i) of h(·, z₂) at a zero μ̂_i of $m_{\hat{\underline R}_N}$ located within C_k is straightforward and yields:
$$r(z_2) \overset\triangle= \operatorname{Res}\big(h(\cdot, z_2), \hat\mu_i\big) = \left( \frac{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}'(z_2)}{m_{\hat{\underline R}_N}^2(z_2)} - \frac 1{(\hat\mu_i - z_2)^2} \right) \frac 1{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}(z_2)}\ . \tag{13}$$
Similarly, if one computes Res(r, μ̂_j) at a zero μ̂_j of $m_{\hat{\underline R}_N}$ located within C_ℓ, one obtains:
$$\operatorname{Res}(r, \hat\mu_j) = \frac{-1}{(\hat\mu_i - \hat\mu_j)^2\, m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}'(\hat\mu_j)}\ .$$
We then need to consider the residues at points ξ of the set $R_{z_2} = \{z_1 : m_{\hat{\underline R}_N}(z_1) = m_{\hat{\underline R}_N}(z_2) \ne 0,\ z_1 \ne z_2\}$ (if this set is empty, the corresponding residue is zero). Notice that such a ξ is not a pole of $\frac 1{(z_1-z_2)^2}\,\frac 1{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)}$; hence one only needs to study
$$g(z_1, z_2) = \frac{m_{\hat{\underline R}_N}'(z_1)\, m_{\hat{\underline R}_N}'(z_2)}{\big(m_{\hat{\underline R}_N}(z_1) - m_{\hat{\underline R}_N}(z_2)\big)^2}\ \frac 1{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)}$$
for the residue at ξ. By integration by parts, one gets
$$\oint g(z_1, z_2)\, dz_1 = -\oint \frac{m_{\hat{\underline R}_N}'(z_1)\, m_{\hat{\underline R}_N}'(z_2)}{m_{\hat{\underline R}_N}(z_1) - m_{\hat{\underline R}_N}(z_2)}\ \frac{dz_1}{m_{\hat{\underline R}_N}^2(z_1)\, m_{\hat{\underline R}_N}(z_2)}\ .$$
Let $p = \min\{i \in \mathbb N^* : m_{\hat{\underline R}_N}^{(i)}(\xi) \ne 0\}$; then, by a Taylor expansion,
$$m_{\hat{\underline R}_N}(z_1) = m_{\hat{\underline R}_N}(z_2) + \frac{(z_1 - \xi)^p}{p!}\, m_{\hat{\underline R}_N}^{(p)}(\xi) + o\big((z_1-\xi)^p\big)\ ,$$
and
$$m_{\hat{\underline R}_N}'(z_1) = \frac{(z_1 - \xi)^{p-1}}{(p-1)!}\, m_{\hat{\underline R}_N}^{(p)}(\xi) + o\big((z_1-\xi)^{p-1}\big)\ .$$
Hence
$$\operatorname{Res}(g, \xi) = -\,p\, \frac{m_{\hat{\underline R}_N}'(z_2)}{m_{\hat{\underline R}_N}^3(z_2)}\ .$$
As this is the derivative of $\frac{p}{2\, m_{\hat{\underline R}_N}^2(z_2)}$, the integration with respect to z₂ yields zero.
It remains to count the number of zeros within each contour. By [14, Lemma 1], eventually there are exactly as many zeros as eigenvalues within each contour. It has been proved that the contribution of the residues at the points ξ ∈ R_{z₂} is null; hence the result in the case k ≠ ℓ:
$$\hat\Theta_{k\ell} = \frac{M^2}{N_k N_\ell} \sum_{(i,j)\in\mathcal N_k\times\mathcal N_\ell} \frac{-1}{(\hat\mu_i - \hat\mu_j)^2\, m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}'(\hat\mu_j)}\ .$$
We now compute the integral (10) in the case where k = ℓ, and begin with the computation of the residues at the μ̂_i. The definition (13) of r and the computation of Res(r, μ̂_j) still hold true when μ̂_j lies within C_k and differs from μ̂_i. It remains to compute Res(r, μ̂_i), i.e. the residue of the third-order pole of $\frac 1{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}(z_2)\, (\hat\mu_i - z_2)^2}$ at z₂ = μ̂_i. Extracting its Laurent coefficients as z₂ → μ̂_i, we get:
$$\lim_{z_2\to\hat\mu_i} (z_2 - \hat\mu_i)^3\ \frac 1{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}(z_2)\, (\hat\mu_i - z_2)^2} = \frac 1{m_{\hat{\underline R}_N}'^2(\hat\mu_i)}\ ,$$
$$\lim_{z_2\to\hat\mu_i} (z_2 - \hat\mu_i)^2 \left( \frac 1{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}(z_2)\, (\hat\mu_i - z_2)^2} - \frac 1{m_{\hat{\underline R}_N}'^2(\hat\mu_i)\, (z_2 - \hat\mu_i)^3} \right) = -\frac{m_{\hat{\underline R}_N}''(\hat\mu_i)}{2\, m_{\hat{\underline R}_N}'^3(\hat\mu_i)}\ ,$$
and finally
$$\lim_{z_2\to\hat\mu_i} (z_2 - \hat\mu_i) \left( \frac 1{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}(z_2)\, (\hat\mu_i - z_2)^2} - \frac 1{m_{\hat{\underline R}_N}'^2(\hat\mu_i)\, (z_2 - \hat\mu_i)^3} + \frac{m_{\hat{\underline R}_N}''(\hat\mu_i)}{2\, m_{\hat{\underline R}_N}'^3(\hat\mu_i)\, (z_2 - \hat\mu_i)^2} \right) = \frac{m_{\hat{\underline R}_N}''(\hat\mu_i)^2}{4\, m_{\hat{\underline R}_N}'(\hat\mu_i)^4} - \frac{m_{\hat{\underline R}_N}'''(\hat\mu_i)}{6\, m_{\hat{\underline R}_N}'(\hat\mu_i)^3}\ .$$
Since this term enters r with a minus sign, and since the first term of r, $m_{\hat{\underline R}_N}'(z_2)/m_{\hat{\underline R}_N}^3(z_2)$, is the derivative of $-1/(2 m_{\hat{\underline R}_N}^2(z_2))$ and hence has zero residue, we obtain:
$$\operatorname{Res}(r, \hat\mu_i) = \frac{m_{\hat{\underline R}_N}'''(\hat\mu_i)}{6\, m_{\hat{\underline R}_N}'(\hat\mu_i)^3} - \frac{m_{\hat{\underline R}_N}''(\hat\mu_i)^2}{4\, m_{\hat{\underline R}_N}'(\hat\mu_i)^4}\ .$$
There are two other residues to take into account in the computation of the integral: the residues at the points ξ ∈ R_{z₂}, and the residue at z₁ = z₂. The first case can be handled as before. For z₁ = z₂, the contribution of g(z₁, z₂) is computed exactly as before. It remains to handle the term $\frac 1{(z_1-z_2)^2}\,\frac 1{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)}$ at z₁ = z₂. Integration by parts yields
$$\oint \frac 1{(z_1 - z_2)^2}\ \frac{dz_1}{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)} = \oint \frac{-m_{\hat{\underline R}_N}'(z_1)}{z_1 - z_2}\ \frac{dz_1}{m_{\hat{\underline R}_N}^2(z_1)\, m_{\hat{\underline R}_N}(z_2)}\ .$$
The residue at z₁ = z₂ is therefore
$$-\frac{m_{\hat{\underline R}_N}'(z_2)}{m_{\hat{\underline R}_N}^3(z_2)}\ .$$
Again, this is the derivative of $\frac 1{2\, m_{\hat{\underline R}_N}^2(z_2)}$, so the integration with respect to z₂ is zero.
Finally, both contributions are null; hence the formula:
$$\hat\Theta_{kk} = \frac{M^2}{N_k^2} \left[ \sum_{(i,j)\in\mathcal N_k^2,\ i\ne j} \frac{-1}{(\hat\mu_i - \hat\mu_j)^2\, m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}'(\hat\mu_j)} + \sum_{i\in\mathcal N_k} \left( \frac{m_{\hat{\underline R}_N}'''(\hat\mu_i)}{6\, m_{\hat{\underline R}_N}'(\hat\mu_i)^3} - \frac{m_{\hat{\underline R}_N}''(\hat\mu_i)^2}{4\, m_{\hat{\underline R}_N}'(\hat\mu_i)^4} \right) \right].$$
V. PERFORMANCE IN THE CONTEXT OF COGNITIVE RADIOS
We introduce below a practical application of the above result to the telecommunication field of
cognitive radios. Consider a communication network implementing orthogonal code division multiple
access (CDMA) in the uplink, which we refer to as theprimary network. The primary network is
composed ofK transmitters. The data of transmitterk are modulated by thenk orthogonalN -chip
codeswk,1, . . . ,wk,nk ∈ CN . Consider also a secondary network, in sensor mode, that we assumetime-synchronized with the primary network, and whose objective is to determine the distances of the
primary transmitters in order to optimally reuse the frequencies used by the primary transmitters2. From
the viewpoint of the secondary network, primary userk has powerPk. Then, at symbol timem, any
secondary user receives theN -dimensional data vector
y(m) =
K∑
k=1
√
Pk
nk∑
j=1
wk,jx(m)k,j + σn
(m) (14)
with σn(m) ∈ CN an additive white Gaussian noiseCN(0, σ2I) received at timem and x(m)k,j ∈ Cthe signal transmitted by userk on the carrier codej at timem, which we assumeCN(0, 1) as well.
The propagation channel is considered frequency flat on the CDMA transmission bandwidth. We do not
assume that the sensor knowsσ2 neither the vectorswk,j. The secondary users may or may not be aware
of the number of codewords employed by each user.
Equation (14) can be put under the compact form
$$\mathbf y^{(m)} = \mathbf W \mathbf P^{\frac 12} \mathbf x^{(m)} + \sigma \mathbf n^{(m)}$$
with W = [w_{1,1}, …, w_{1,n₁}, w_{2,1}, …, w_{K,n_K}] ∈ ℂᴺˣⁿ, $n \triangleq \sum_{k=1}^K n_k$, P ∈ ℂⁿˣⁿ the diagonal matrix with entry P₁ of multiplicity n₁, P₂ of multiplicity n₂, etc., and P_K of multiplicity n_K, and $\mathbf x^{(m)} = [\mathbf x_1^{(m)\mathsf T}, \ldots, \mathbf x_K^{(m)\mathsf T}]^{\mathsf T} \in \mathbb C^n$, where x_k⁽ᵐ⁾ ∈ ℂⁿᵏ is the column vector with j-th entry x_{k,j}⁽ᵐ⁾.
² The rationale being that far transmitters will not be interfered with by low power communications within the secondary network.
Gathering M successive independent observations, we obtain the matrix Y = [y⁽¹⁾, …, y⁽ᴹ⁾] ∈ ℂᴺˣᴹ given by
$$\mathbf Y = \mathbf W \mathbf P^{\frac 12} \mathbf X + \sigma \mathbf N = \begin{bmatrix} \mathbf W \mathbf P^{\frac 12} & \sigma I_N \end{bmatrix} \begin{bmatrix} \mathbf X \\ \mathbf N \end{bmatrix}$$
where X = [x⁽¹⁾, …, x⁽ᴹ⁾] and N = [n⁽¹⁾, …, n⁽ᴹ⁾].
The y⁽ᵐ⁾ are therefore independent Gaussian vectors with zero mean and covariance $\mathbf R \triangleq \mathbf W \mathbf P \mathbf W^{\mathsf H} + \sigma^2 I_N$. Since the objective is to retrieve the powers P_k, while σ² is known or can be separately estimated, the problem boils down to finding the eigenvalues of WPWᴴ + σ²I_N. However, the sensors only have access to Y, or equivalently to the sample covariance matrix
$$\hat{\mathbf R}_N \triangleq \frac 1M \mathbf Y \mathbf Y^{\mathsf H} = \frac 1M \sum_{m=1}^M \mathbf y^{(m)} \mathbf y^{(m)\mathsf H}\ .$$
Assuming R̂_N conveys a good appreciation of the eigenvalue clustering to the secondary user (as in Figure 1), Theorem 1 enables the detection of the primary transmitters and the estimation of their transmit powers P₁, …, P_K; this boils down to estimating the K largest eigenvalues of WPWᴴ + σ²I_N, i.e. the P_k + σ², and subtracting σ² (optionally estimated from the smallest eigenvalue of WPWᴴ + σ²I_N if n < N). Call P̂_k the estimate of P_k.
Based on these power estimates, the secondary user can determine the optimal coverage for secondary communications that ensures no interference with the primary network. A basic idea, for instance, is to ensure that the closest primary user, i.e. the one with strongest received power, is not interfered with. Our interest is then cast on P_K. Now, since the power estimator is imperfect, it is hazardous for the secondary network to state that user K has power P̂_K, or to add some empirical security margin to P̂_K. The results of Section III partially answer this problem.
Theorems 3 and 4 enable the secondary sensor to evaluate the accuracy of P̂_k. In particular, assume that the cognitive radio protocol allows the secondary network to interfere with the primary network with probability q, and denote by A the value
$$A \triangleq \inf_a \big\{ \mathbb P(P_K - \hat P_K > a) \le q \big\}\ .$$
According to Theorem 3, for N, M large, A is well approximated by $\sqrt{\hat\Theta_{K,K}}\, Q^{-1}(q)/M$, with Q the complementary Gaussian cumulative distribution function. If the secondary users detect a user with power P_K, estimated by P̂_K, then P(P̂_K + A < P_K) ≤ q, and it is safe for the secondary network to assume the worst-case scenario where user K transmits at power $\hat P_K + A \simeq \hat P_K + \sqrt{\hat\Theta_{K,K}}\, Q^{-1}(q)/M$.
In Figure 2, the performance of Theorem 3 is compared against 10,000 Monte Carlo simulations of a scenario with three users, P₁ = 1, P₂ = 3, P₃ = 10, n₁ = n₂ = n₃ = 20, N = 60 and M = 600.
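The confidence margin above reduces to a one-line computation once Θ̂_{K,K} is available from Theorem 4. A minimal sketch (our own code; the value `theta_KK` below is a hypothetical placeholder, not a number from the paper):

```python
import math
from statistics import NormalDist

def margin(theta_KK, M, q):
    """A ~= sqrt(Theta_hat_KK) * Q^{-1}(q) / M, Q the Gaussian tail function."""
    q_inv = NormalDist().inv_cdf(1.0 - q)   # Q^{-1}(q) = Phi^{-1}(1 - q)
    return math.sqrt(theta_KK) * q_inv / M

A = margin(theta_KK=4.0, M=600, q=0.05)     # hypothetical Theta_hat_KK = 4.0
print(A)
```

The secondary network then plans around the worst-case transmit power P̂_K + A; smaller allowed interference probabilities q yield larger margins.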
Fig. 2. Comparison of empirical against theoretical variances, based on Theorem 3, for three users, P₁ = 1, P₂ = 3, P₃ = 10, n₁ = n₂ = n₃ = 20 codes per user, N = 60, M = 600 and SNR = 20 dB. The histogram of the ρ̂_k is plotted against the theoretical limiting distribution.
It appears that the limiting distribution is very accurate for these values of N, M. We also performed simulations to obtain the empirical estimates Θ̂_kk of Θ_kk from Theorem 4, which suggest that Θ̂_kk is an accurate estimator as well.
VI. CONCLUSION
In this article, we derived an exact expression and an approximation of the limiting performance of a statistical inference method that estimates the population eigenvalues of a class of sample covariance matrices. These results are applied in the context of cognitive radios to optimize the secondary network coverage based on measurements of the primary network activity.
APPENDIX
A. Proof of Proposition 1
Let us first begin with considerations related to the supports of the probability distributions associated with m(z) and m_N(z). Denote by S and S_N these supports, and recall that S is the union of L clusters:
$$\mathcal S = (a_1, b_1) \cup \cdots \cup (a_L, b_L)\ .$$
The following proposition clarifies the relations between S_N and S.
Proposition 3: Let N, M → ∞; then, for N large enough, the support S_N of the probability distribution associated with the Stieltjes transform m_N(z) is the union of L clusters:
$$\mathcal S_N = (a_1^N, b_1^N) \cup \cdots \cup (a_L^N, b_L^N)\ .$$
Moreover, the following convergences hold true:
$$a_\ell^N \xrightarrow[N,M\to\infty]{} a_\ell\ , \qquad b_\ell^N \xrightarrow[N,M\to\infty]{} b_\ell\ ,$$
for 1 ≤ ℓ ≤ L.
Remark 3: If the support S_N contains zero (e.g. when N > M), then zero also belongs to the support S, and the conclusion still holds true.
Proof of Proposition 3: Recall the relations:

m̄_N(z) = −( z − (N/M) ∫ t/(1 + t m̄_N(z)) dF^{R_N}(t) )^{−1}    (15)

and

m_N(z) = (M/N) m̄_N(z) − (1 − M/N) (1/z).    (16)

As the inverse Stieltjes transform of −1/z is δ_0 (the Dirac mass at 0) and m̄_N(z) is a continuous function over R*_+, the Stieltjes inversion formula yields, for a, b with 0 < a < b:

F_N([a, b]) = (M/N) F̄_N([a, b]).

So it suffices to study the support S̄_N of F̄_N, the distribution associated with m̄_N.
From the definition of m̄_N(z) (see formula (15)), we obtain:

z_{R_N}(m̄) = −1/m̄ + (N/M) ∫ t dF^{R_N}(t)/(1 + t m̄).

Denote by B = {m ∈ R : m ≠ 0, −m^{−1} ∉ {ρ_1, ···, ρ_L}}. In [23, Theorem 4.1 and Theorem 4.2], Silverstein and Choi show that, for a real number x, x ∈ S̄_N^c (the complement of the support of F̄_N) if and only if m_x ∈ B and

z'_{R_N}(m_x) = 1/m_x² − (N/M) ∫ t² dF^{R_N}(t)/(1 + t m_x)² > 0,

where m_x = m̄_N(x), so that z_{R_N}(m_x) = x.
Then, if a ∈ ∂S̄_N, either m_a ∉ B or z'_{R_N}(m_a) ≤ 0, where m_a = m̄_N(a). We first show that m_a ∈ B. By [23, Theorem 5.1], m_a ≠ 0. If −m_a^{−1} belonged to the support of F^{R_N}, then, as F^{R_N} is discrete, ∫ t² dF^{R_N}(t)/(1 + tm)² → ∞ as m → m_a; hence z'_{R_N} < 0 in a neighborhood to the left and to the right of m_a, which contradicts [23, Theorem 5.1]. Therefore m_a ∈ B, and z'_{R_N}(m_a) ≤ 0. By continuity, we get

z'_{R_N}(m_a) = 1/m_a² − (N/M) ∫ t² dF^{R_N}(t)/(1 + t m_a)² = 0.
This is equivalent to the following equation:

z'_{R_N}(m_a) = 1/m_a² − (1/M) Σ_{i=1}^{L} N_i ρ_i²/(1 + ρ_i m_a)² = 0.    (17)

Multiplying by the common denominator, one gets a polynomial of degree 2L in m_a. We now show that these 2L roots are real. First, notice that
1/m² − (N/M) ∫ t² dF^{R_N}(t)/(1 + tm)² → −∞ as m → −1/ρ_i,

and

z''_{R_N}(m) = −2/m³ + (N/M) ∫ 2t³ dF^{R_N}(t)/(1 + tm)³.
So z''_{R_N}(m) has one and only one zero in the open interval (−1/ρ_i, −1/ρ_{i+1}), for each i ∈ {1, ···, L−1}. Then, for β_i ∈ (−1/ρ_i, −1/ρ_{i+1}) such that z''_{R_N}(β_i) = 0, it suffices to show that z'_{R_N}(β_i) > 0 in order to prove that z'_{R_N}(m) has two zeros in the interval (−1/ρ_i, −1/ρ_{i+1}). Writing β_i = −1/α_i, the separability condition (cf. Assumption (A2)), inf_N {M/N − Ψ_N(i)} > 0, yields

z'_{R_N}(−1/α_i) = α_i² − (N/M) ∫ t² dF^{R_N}(t)/(1 − t/α_i)² = α_i² ( 1 − (1/M) Σ_{r=1}^{L} N_r ρ_r²/(α_i − ρ_r)² ) > 0.
Thus we obtain 2(L − 1) roots. Besides, in the open interval (−ρ_L^{−1}, 0),

1/m² − (N/M) ∫ t² dF^{R_N}(t)/(1 + tm)² → +∞ as m → 0⁻,

so there exists another root in this interval. In the open interval (−∞, −ρ_1^{−1}),

1/m² − (N/M) ∫ t² dF^{R_N}(t)/(1 + tm)² → 0 as m → −∞,

and

1/m² − (N/M) ∫ t² dF^{R_N}(t)/(1 + tm)² ∼ (1/m²)(1 − N/M) > 0 as m → −∞.

Hence the last root lies in this interval. This proves that S̄_N = (a_1^N, b_1^N) ∪ ··· ∪ (a_L^N, b_L^N), and hence the same holds for S_N.
To prove that a_ℓ^N → a_ℓ and b_ℓ^N → b_ℓ as N, M → ∞, notice that the a_i and b_i satisfy the same type of equation, with N/M replaced by c and F^{R_N} by F^R. As N/M → c and N_i/M → c_i, the roots of Equation (17) converge to those of the limiting equation (see [24]). This yields the second conclusion.
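The root-counting argument above can be checked numerically: multiplying (17) by the common denominator and running a polynomial solver recovers the 2L real critical points, which map through z_{R_N} to the support endpoints. The sketch below assumes the parameters ρ = (1, 3, 10), N_i = 20, M = 600 of the simulation section, for which the separability condition holds:

```python
import numpy as np

def support_endpoints(rho, mult, M):
    """Roots of z'_{R_N}(m) = 0 (Eq. (17)), mapped back through z_{R_N}(m).

    Multiplying (17) by m^2 * prod_i (1 + rho_i m)^2 gives a degree-2L
    polynomial; under cluster separation its 2L roots are real and map to
    the support endpoints a_1 < b_1 < ... < a_L < b_L.
    """
    prod_all = np.poly1d([1.0])
    for r in rho:
        prod_all *= np.poly1d([r, 1.0]) ** 2          # (1 + rho_i m)^2
    poly = prod_all
    for i, r in enumerate(rho):
        q = np.poly1d([1.0, 0.0, 0.0])                # m^2
        for j, rj in enumerate(rho):
            if j != i:
                q *= np.poly1d([rj, 1.0]) ** 2
        poly = poly - (mult[i] * r ** 2 / M) * q
    roots = poly.roots
    real = roots[np.abs(roots.imag) < 1e-6].real      # real critical points

    def z(m):  # z_{R_N}(m) = -1/m + (1/M) sum_i N_i rho_i / (1 + rho_i m)
        return -1.0 / m + sum(n * r / (1.0 + r * m) for n, r in zip(mult, rho)) / M

    return np.sort([z(m) for m in real])

ends = support_endpoints([1.0, 3.0, 10.0], [20, 20, 20], 600)
print(ends)  # 2L = 6 increasing positive endpoints
```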
We are now in a position to establish the proof of Proposition 1.

Denote by S(ε) the ε-blow-up of S, i.e. S(ε) = {x ∈ R : d(x, S) < ε}. Let ε > 0 be small enough, and consider a smooth function φ with 0 ≤ φ ≤ 1, equal to zero on S(ε/3), equal to 1 if x ∉ S(ε) and |x| ≤ τ − ε (as we shall see, τ will be chosen large), and equal to zero again if |x| ≥ τ:

φ(x) = 0 if d(x, S) < ε/3,
       1 if d(x, S) > ε and |x| ≤ τ − ε,
       0 if |x| > τ,

and smooth in between. Notice that if N, M → ∞ and N is large enough, then by Proposition 3, φ(x) = 0 for all x ∈ S_N. Now, if Z is an M × M Hermitian matrix with spectral decomposition Z = U diag(γ_i; 1 ≤ i ≤ M) U^H, where U is unitary and diag(γ_i; 1 ≤ i ≤ M) stands for the M × M diagonal matrix whose entries are Z's eigenvalues, write φ(Z) = U diag(φ(γ_i); 1 ≤ i ≤ M) U^H.
We have:

P(sup_n d(λ_n, S) > ε) ≤ P(‖R̂_N‖ > τ − ε) + P(Tr φ(R̂_N) ≥ 1)
                        = P(‖R̂_N‖ > τ − ε) + P([Tr φ(R̂_N)]^p ≥ 1)
                   (a) ≤ P(‖R̂_N‖ > τ − ε) + E[Tr φ(R̂_N)]^p,

for every p ≥ 1, where (a) follows from Markov's inequality. The fact that P(‖R̂_N‖ > τ) = O(N^{−ℓ}) for τ large enough and every ℓ ∈ N* is well known (see for instance [6, Section 9.7]). We shall therefore establish estimates over E[Tr φ(R̂_N)]^p. Take p = 2^k; we prove the following statement by induction: for every k ≥ 0, every integer β < 2^k and every smooth function f with compact support vanishing on S(ε/3),

E( Tr f(R̂_N) )^{2^k} = O( 1/N^β ).

First notice that, due to Proposition 3, ∫_{S_N} f(λ) F_N(dλ) = 0 (where F_N is the probability distribution associated with m_N) for N, M large enough. A minor modification of [25, Lemma 2] (whose model is slightly different), with the help of [26, Proposition 5], yields that for N, M → ∞ and N large enough, E Tr f(R̂_N) = O(N^{−1}), and the property is verified for k = 0.
Let k > 0 be fixed and assume that the result holds true for β < 2^k. We want to show that E[Tr f(R̂_N)]^{2^{k+1}} = O(N^{−2β}). At step k + 1, the expectation writes:

|E[Tr f(R̂_N)]^{2^{k+1}}|    (18)
  = |E( [Tr f(R̂_N)]^{2^k} − E[Tr f(R̂_N)]^{2^k} + E[Tr f(R̂_N)]^{2^k} )²|
  ≤ 2( Var( [Tr f(R̂_N)]^{2^k} ) + |E[Tr f(R̂_N)]^{2^k}|² ).    (19)

The second term of the right-hand side (r.h.s.) can be handled by the induction hypothesis:

|E[Tr f(R̂_N)]^{2^k}|² = O( 1/N^{2β} ).
We now rely on the Poincaré-Nash inequality (see for instance [26, Section II-B]) to handle the first term of the r.h.s. Applying this inequality, we obtain:

Var( (Tr f(R̂_N))^{2^k} ) ≤ K Σ_{i,j} E[ |∂[Tr f(R̂_N)]^{2^k}/∂Y_{i,j}|² + |∂[Tr f(R̂_N)]^{2^k}/∂Ȳ_{i,j}|² ],    (20)
where K is a constant which does not depend on N, M and which is greater than R_N's eigenvalues. In order to compute the derivatives in the r.h.s., we rely on [27, Lemma 4.6]. This yields:

∂[Tr f(R̂_N)]^{2^k}/∂Y_{i,j} = (2^k/M) [Tr f(R̂_N)]^{2^k − 1} [Y_N^* f'(R̂_N)]_{j,i},
∂[Tr f(R̂_N)]^{2^k}/∂Ȳ_{i,j} = (2^k/M) [Tr f(R̂_N)]^{2^k − 1} [f'(R̂_N) Y_N]_{i,j}.
Plugging these derivatives into (20), we obtain:

Var( [Tr f(R̂_N)]^{2^k} )
  ≤ K (2^{2k+1}/M²) E[ (Tr f(R̂_N))^{2^{k+1} − 2} Tr( f'(R̂_N) Y_N Y_N^* f'(R̂_N) ) ]
  = K (2^{2k+1}/M) E[ (Tr f(R̂_N))^{2^{k+1} − 2} Tr( f'(R̂_N)² R̂_N ) ]
  ≤ K (2^{2k+1}/M) |E[Tr f(R̂_N)]^{2^{k+1}}|^{(2^{k+1} − 2)/2^{k+1}} × |E[Tr( f'(R̂_N)² R̂_N )]^{2^k}|^{1/2^k},

where the last inequality is a consequence of Hölder's inequality.
As the function h(λ) = λ[f'(λ)]² satisfies the induction hypothesis, we have, for every α < 1:

|E[Tr( f'(R̂_N)² R̂_N )]^{2^k}|^{1/2^k} = O(N^{−α}).
Plugging this estimate into (18), we obtain:

|E[Tr f(R̂_N)]^{2^{k+1}}| ≤ K (1/N^{1+α}) |E[Tr f(R̂_N)]^{2^{k+1}}|^{(2^{k+1} − 2)/2^{k+1}} + O(N^{−2β}),    (21)
where K is a constant independent of M, N, k. Notice that inequality (21) involves twice the quantity of interest E[Tr f(R̂_N)]^{2^{k+1}} that we want to upper-bound by O(N^{−2β}). We shall proceed iteratively. Notice that Tr f(R̂_N) ≤ sup_{x∈R} |f(x)| × N because f is bounded on R; hence the rough estimate:

E[Tr f(R̂_N)]^{2^{k+1}} = O(N^{2^{k+1}}).

Plugging this into (21) yields:

E[Tr f(R̂_N)]^{2^{k+1}} = O(N^{a_1}),

where a_0 = 2^{k+1} and a_1 = a_0 (2^{k+1} − 2)/2^{k+1} − (1 + α). Iterating the procedure, we obtain:

E[Tr f(R̂_N)]^{2^{k+1}} = O( N^{a_ℓ ∨ (−2β)} ),

where a_ℓ = a_{ℓ−1} (2^{k+1} − 2)/2^{k+1} − (1 + α) and x ∨ y stands for sup(x, y). Now, in order to conclude the proof, it remains to prove that (i) the sequence (a_ℓ) converges to some limit a_∞, and (ii) for some well-chosen α < 1, a_∞ ∈ (−2^{k+1}, −2β). Write:

a_{ℓ+1} + 2^k(1 + α) = ((2^k − 1)/2^k) ( a_ℓ + 2^k(1 + α) ),

hence a_ℓ converges to −2^k(1 + α), which belongs to (−2^{k+1}, −2β) for a well-chosen α ∈ (0, 1). Finally, E[Tr f(R̂_N)]^{2^{k+1}} = O(N^{−2β}), which ends the induction.
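As a quick sanity check of this recursion, the affine map a ↦ a (2^{k+1} − 2)/2^{k+1} − (1 + α) can be iterated numerically from the rough estimate a_0 = 2^{k+1}; the values k = 3, α = 0.5 below are purely illustrative:

```python
# Iterate a_{l+1} = r * a_l - (1 + alpha) with contraction rate
# r = (2^{k+1} - 2) / 2^{k+1} = (2^k - 1) / 2^k, starting from a_0 = 2^{k+1}.
k, alpha = 3, 0.5                 # illustrative values
r = (2 ** (k + 1) - 2) / 2 ** (k + 1)
a = float(2 ** (k + 1))           # a_0
for _ in range(500):
    a = r * a - (1 + alpha)

fixed_point = -(2 ** k) * (1 + alpha)
print(a, fixed_point)             # both converge to -2^k (1 + alpha) = -12.0
```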
It remains to apply this estimate to E[Tr φ(R̂_N)]^{2^k} in order to get the desired result.
B. Proof of Lemma 1
As explained in Section III, there are two conditions to prove (Billingsley [28, Theorem 13.1]):

• finite-dimensional convergence of the process (X_N, X'_N);
• tightness on the contour C_k.

Remark 4: As u_N (resp. u'_N) converges almost surely to u (resp. u') (see Silverstein and Bai [17]), the convergence of the process (X_N, X'_N, u_N, u'_N) is achieved as soon as the convergence of the process (X_N, X'_N) is proved.

In [20], Bai and Silverstein establish a central limit theorem for F^{R̂_N} with complex Gaussian entries X_{ij}. We recall their main result below.

Proposition 4 ([20]): With the notations introduced in Section II, for f_1, ···, f_p analytic on an open region containing the support S,

1) ( N ∫ f_i(x) d(F^{R̂_N} − F_N)(x) )_{1≤i≤p} forms a tight sequence in N,
2) ( N ∫ f_i(x) d(F^{R̂_N} − F_N)(x) )_{1≤i≤p} →^D N(0, V),

where V = (V_{ij}) and

V_{ij} = −(1/4π²) ∮∮ f_i(z_1) f_j(z_2) v_{ij}(z_1, z_2) dz_1 dz_2,

with

v_{ij}(z_1, z_2) = m'(z_1) m'(z_2)/( m(z_1) − m(z_2) )² − 1/(z_1 − z_2)²,

where the integration is over positively oriented contours that circle around the support S.
Now we apply this proposition to show the finite-dimensional convergence. For all z_i ∈ C_k \ R, notice that

m_{R̂_N}(z) − m_N(z) = (1/2iπ) ∮ 1/(x − z) d(F^{R̂_N} − F_N)(x),

where the contour encloses the support S, and X_N(z) = M( m_{R̂_N}(z) − m_N(z) ). Then Proposition 4 directly implies that, for all p ∈ N, the random vector

( X_N(z_1), X'_N(z_1), ···, X_N(z_p), X'_N(z_p) )

converges to a centered Gaussian vector, upon considering the functions

( f_1(x) = 1/(x − z_1), f_2(x) = 1/(x − z_1)², ···, f_{2p−1}(x) = 1/(x − z_p), f_{2p}(x) = 1/(x − z_p)² ).

Thus the finite-dimensional convergence is achieved.
The proof of the tightness is based on the Nash-Poincaré inequality ([25] and [26]). In Appendix A, it is proved that, for all ε > 0 and all ℓ ∈ N,

P( sup_{λ∈eig(R̂_N)} d(λ, S) > ε ) = o(N^{−ℓ}).

Following the same idea as Bai and Silverstein [20, Sections 3 and 4], (X_N, X'_N) is indeed a tight sequence. The details of the proof are given in Appendix C. Thus Lemma 1 is proved.
C. Proof of the tightness
We will show the tightness of the sequences M(m_{R̂_N} − m_N) and M(m'_{R̂_N} − m'_N) by using the Nash-Poincaré inequality [26]. First, write M( m_{R̂_N}(z) − m_N(z) ) = M_N^1(z) + M_N^2(z), with M_N^1(z) = M( m_{R̂_N}(z) − E[m_{R̂_N}(z)] ) and M_N^2(z) = M( E[m_{R̂_N}(z)] − m_N(z) ).
As 1/(ρ̂_k − z) can blow up when z is close to the real axis, tightness becomes problematic there, and we need a truncated version of the process. More precisely, let ε_N be a real sequence decreasing to zero and satisfying, for some δ ∈ (0, 1):

ε_N ≥ N^{−δ}.

Remark 5: Notice that X_N(z̄) = M( m_{R̂_N}(z̄) − m_N(z̄) ) = X̄_N(z) for z ∈ C⁺. So it suffices to verify the arguments for z ∈ C⁺.

Denote by [x_{2k−1}, x_{2k}], k = 1, ···, L, the k-th cluster of the support of the limiting spectral measure, and take l_{2k−1}, l_{2k} such that x_{2k−2} < l_{2k−1} < x_{2k−1} and x_{2k} < l_{2k} < x_{2k+1} for k ∈ {1, ···, L}, with the conventions x_0 = 0 and x_{2L+1} = ∞; i.e., [l_{2k−1}, l_{2k}] only contains the k-th cluster. Let d > 0. Consider:

C_u = { x + id : x ∈ [l_{2k−1}, l_{2k}] },
C_r = { l_{2k−1} + iv : v ∈ [N^{−1}ε_N, d] },
C_l = { l_{2k} + iv : v ∈ [N^{−1}ε_N, d] }.

Then C_N = C_l ∪ C_u ∪ C_r. The process M̂_N^1(·) is defined by

M̂_N^1(z) = M_N^1(z) for z ∈ C_N,
M̂_N^1(l_{2k} + iv) = M_N^1(l_{2k} + iN^{−1}ε_N) for v ∈ [0, N^{−1}ε_N],
M̂_N^1(l_{2k−1} + iv) = M_N^1(l_{2k−1} + iN^{−1}ε_N) for v ∈ [0, N^{−1}ε_N].
This partition of C_N is identical to that used in [20, Section 1]. With probability one (see [18] and [22]), for all ε > 0,

lim sup_{λ∈eig(R̂_N)} d(λ, S_N) < ε,

with d(x, S) the Euclidean distance from x to the set S. So, with probability one, for all N large ([20, page 563]),

| ∮ ( M_N^1(z) − M̂_N^1(z) ) dz | ≤ K_1 ε_N

and

| ∮ ( M_N^{1}′(z) − M̂_N^{1}′(z) ) dz | ≤ K_2 ε_N

for some constants K_1 and K_2. Both terms converge to zero as M → ∞. It then suffices to establish the tightness of M̂_N^1(z) and M̂_N^{1}′(z).
We now prove tightness based on [28, Theorem 13.1], i.e.:

1) tightness at any point of the contour (here C_N);
2) satisfaction of the condition

sup_{N; z_1, z_2 ∈ C_N} E|M̂_N^1(z_1) − M̂_N^1(z_2)|² / |z_1 − z_2|² ≤ K.

Condition 1) is achieved by an immediate application of Proposition 4. We now verify the second condition and evaluate E|M̂_N^1(z_1) − M̂_N^1(z_2)|²/|z_1 − z_2|². Notice that

m_{R̂_N}(z_1) − m_{R̂_N}(z_2) = ((z_1 − z_2)/M) Σ_{i=1}^{N} 1/( (λ̂_i − z_1)(λ̂_i − z_2) ) = ((z_1 − z_2)/M) Tr( D_N^{−1}(z_1) D_N^{−1}(z_2) ),

with D_N(z) = R̂_N − zI_N. We have
∂/∂Y_{i,j} ( (m_{R̂_N}(z_1) − m_{R̂_N}(z_2))/(z_1 − z_2) )
  = (1/M) ∂/∂Y_{i,j} Tr( (R̂_N − z_1 I)^{−1} (R̂_N − z_2 I)^{−1} )
  = (1/M²) [ −Y_N^* D_N^{−2}(z_1) D_N^{−1}(z_2) − Y_N^* D_N^{−1}(z_1) D_N^{−2}(z_2) ]_{j,i},

and

∂/∂Ȳ_{i,j} ( (m_{R̂_N}(z_1) − m_{R̂_N}(z_2))/(z_1 − z_2) )
  = (1/M²) [ −D_N^{−2}(z_1) D_N^{−1}(z_2) Y_N − D_N^{−1}(z_1) D_N^{−2}(z_2) Y_N ]_{i,j}.
Then, by the Nash-Poincaré inequality and the fact that R̂_N is uniformly bounded in spectral norm almost surely, one gets

E|M̂^1(z_1) − M̂^1(z_2)|² / |z_1 − z_2|²
  ≤ (C_1/N) E[ Tr(L_N) ]
  = (C_1/N) E( Tr(L_N) 1_{sup_n d(λ̂_n, S) ≤ ε} ) + (C_1/N) E( Tr(L_N) 1_{sup_n d(λ̂_n, S) > ε} ),

with

L_N = R̂_N D_N^{−4}(z_1) D_N^{−2}(z_2) + 2 R̂_N D_N^{−3}(z_1) D_N^{−3}(z_2) + R̂_N D_N^{−2}(z_1) D_N^{−4}(z_2)
and C_1 a constant which depends neither on N nor on M. For the first term, Tr(L_N) is bounded on the set {sup_n d(λ̂_n, S) ≤ ε}. For the second term, since for all i ∈ N and all z ∈ C_N we have 1/|λ̂_n − z|^i ≤ N^i/ε_N^i, it follows that

Σ_{n=1}^{N} 1/|λ̂_n − z|^i ≤ N^{i+1}/ε_N^i.

Then

|Tr(L_N)| ≤ O( N^7/ε_N^6 ).

As P(sup_n d(λ̂_n, S) ≥ ε) = o(N^{−16}), taking ε_N = N^{−0.01} one obtains

| E( Tr(L_N) 1_{sup_n d(λ̂_n, S) > ε} ) | ≤ E| Tr(L_N) 1_{sup_n d(λ̂_n, S) > ε} | ≤ O( (N^7/ε_N^6) P(sup_n d(λ̂_n, S) > ε) ) ≤ O( N^{7 + 0.06 − 16} ) → 0.

The second condition of tightness is achieved.
For M_N^2(z), following exactly the same method as in [6, Section 9.11], one can show that M_N^2(z) is bounded and forms an equicontinuous family that converges to 0. Hence the tightness of M( m_{R̂_N}(z) − m_N(z) ).
The next step is to prove the tightness of M( m'_{R̂_N}(z) − m'_N(z) ). We have

m'_{R̂_N}(z_1) − m'_{R̂_N}(z_2)
  = ((z_1 − z_2)/M) Σ_{i=1}^{N} (2λ̂_i − z_1 − z_2)/( (λ̂_i − z_1)²(λ̂_i − z_2)² )
  = ((z_1 − z_2)/M) Tr( D_N^{−2}(z_1) D_N^{−2}(z_2) ( D_N(z_1) + D_N(z_2) ) ).
Following the same method as before, one obtains

∂/∂Y_{ij} Tr( D_N^{−1}(z_1) D_N^{−2}(z_2) ) = −(1/M) [ Y_N^* D_N^{−2}(z_1) D_N^{−2}(z_2) + 2 Y_N^* D_N^{−1}(z_1) D_N^{−3}(z_2) ]_{j,i},

and

Σ_{i,j} | ∂/∂Y_{ij} Tr( D_N^{−2}(z_1) D_N^{−2}(z_2) ( D_N(z_1) + D_N(z_2) ) ) |² = (1/M) Tr(L_2),

with

L_2 = 4 R̂_N ( 3 D_N^{−4}(z_1) D_N^{−4}(z_2) + 2 D_N^{−3}(z_1) D_N^{−5}(z_2) + 2 D_N^{−5}(z_1) D_N^{−3}(z_2) + D_N^{−2}(z_1) D_N^{−6}(z_2) + D_N^{−6}(z_1) D_N^{−2}(z_2) ).
Then the Nash-Poincaré inequality yields

E|M̂_N^{1}′(z_1) − M̂_N^{1}′(z_2)|² / |z_1 − z_2|² ≤ (C_1/N) E( Tr(L_2) 1_{sup_n d(λ̂_n, S) ≤ ε} ) + (C_1/N) E( Tr(L_2) 1_{sup_n d(λ̂_n, S) > ε} ),

with C_1 the same constant as before. The term Tr(L_2) is bounded on the set {sup_n d(λ̂_n, S) ≤ ε}. For the second term, |Tr(L_2)| ≤ O( N^9/ε_N^8 ). As P(sup_n d(λ̂_n, S) ≥ ε) = o(N^{−16}) and ε_N = N^{−0.01}, the tightness of M_N^{1}′(z) follows as before.
The proof of the tightness is completed by verifying that M_N^{2}′(z), for z ∈ C_N, is bounded, forms an equicontinuous family, and converges to 0. We use the same method as for the process M_N^2(z) (see [6, Section 9.11]).

By Formula (9.11.1) in [6, Section 9.11],

( E m_{R̂_N} − m_N ) ( 1 − A_N(z) ) = E m_{R̂_N} m_N T_N,    (22)

where

A_N(z) = [ (N/M) ∫ m_N t² dF^{R_N}(t) / ( (1 + t E m_{R̂_N})(1 + t m_N) ) ] / [ −z + (N/M) ∫ t dF^{R_N}(t)/(1 + t E m_{R̂_N}) − T_N ],

and

T_N = (N/M²) Σ_{j=1}^{M} E[β_j d_j] ( E m_{R̂_N} )^{−1},
d_j = d_j(z) = −q_j^* R^{1/2} (R̂_{(j)} − zI)^{−1} ( E m_{R̂_N} R + I )^{−1} R^{1/2} q_j + (1/M) Tr( ( E m_{R̂_N} R + I )^{−1} R (R̂_N − zI)^{−1} ),
β_j = 1 / ( 1 + (1/M) y_j^* (R̂_{(j)} − zI)^{−1} y_j ),
q_j = (1/√N) x_j,
R̂_{(j)} = R̂_N − (1/M) y_j y_j^*.

Differentiating (22) with respect to z, the equation becomes

( E m'_{R̂_N} − m'_N ) ( 1 − A_N(z) ) + ( E m_{R̂_N} − m_N ) ( 1 − A_N(z) )′ = E m'_{R̂_N} m_N T_N + E m_{R̂_N} m'_N T_N + E m_{R̂_N} m_N T'_N.
In [6, Section 9.11], it is shown that, as N tends to infinity:

1) sup_{z∈C_N} |E m_{R̂_N}(z) − m(z)| → 0 and sup_{z∈C_N} |m_N(z) − m(z)| → 0;
2) the fraction appearing in the left-hand side of (22), namely

[ (N/M) ∫ t² m_N dF^{R_N}(t) / ( (1 + t E m_{R̂_N})(1 + t m_N) ) ] / [ −z + (N/M) ∫ t dF^{R_N}(t)/(1 + t E m_{R̂_N}) − T_N ],

converges;
3) M_N^2(z) → 0 and T_N → 0.
With the same method, one can easily show that:

4) sup_{z∈C_N} |E m'_{R̂_N}(z) − m'(z)| → 0;
5) sup_{z∈C_N} |m'_N(z) − m'(z)| → 0;
6) (N/M) ( Σ_{j=1}^{M} E[β_j d_j] )′ converges.

With these results, it suffices to show that T'_N → 0 and that M_N^{2}′ is equicontinuous.
In [6, Section 9.9], it is shown that, for m, q ∈ N, non-random N × N matrices A_k, k = 1, ···, m, and B_ℓ, ℓ = 1, ···, q, we have

| E( Π_{k=1}^{m} r_t^* A_k r_t Π_{ℓ=1}^{q} ( r_t^* B_ℓ r_t − M^{−1} Tr(R B_ℓ) ) ) | ≤ K M^{−(1∧q)} Π_{k=1}^{m} ‖A_k‖ Π_{ℓ=1}^{q} ‖B_ℓ‖.    (23)

We also have, for any positive p,

max( E‖D^{−1}(z)‖^p, E‖D_j^{−1}(z)‖^p, E‖D_{ij}^{−1}(z)‖^p ) ≤ K_p    (24)

and

sup_{N; z∈C_N} ‖( E m_{R̂_N}(z) R + I )^{−1}‖ < ∞,    (25)

where K_p is a constant which depends only on p.
With all these preliminaries, and since T_N → 0, by dominated convergence it suffices to show that T'_N is bounded over C_N. Following [6, Section 9.11], it is in turn sufficient to show that f'_M(z) is bounded, where

f_M(z) = Σ_{j=1}^{M} E[ ( r_j^* D_j^{−1} r_j − M^{−1} Tr(D_j^{−1} R) ) ( r_j^* D_j^{−1} ( E m_{R̂_N} R + I )^{−1} r_j − M^{−1} Tr( D_j^{−1} ( E m_{R̂_N} R + I )^{−1} R ) ) ].

With the help of (23)-(25), f'_M(z) is indeed bounded on C_N.

We finally show that M_N^{2}′ is equicontinuous. Proceeding as before, it is sufficient to show that f''_M(z) is bounded. Using (23), we obtain
|f''_M(z)| ≤ K M^{−1} [
  ( E( Tr D_1^{−3} R D̄_1^{−3} R ) E( Tr D_1^{−1} ( E m_{R̂_N} R + I )^{−1} R ( E m̄_{R̂_N} R + I )^{−1} D̄_1^{−1} R ) )^{1/2}
  + 2 ( E( Tr D_1^{−2} R D̄_1^{−2} R ) E( Tr D_1^{−2} ( E m_{R̂_N} R + I )^{−1} R ( E m̄_{R̂_N} R + I )^{−1} D̄_1^{−2} R ) )^{1/2}
  + 2 |E m'_{R̂_N}| ( E( Tr D_1^{−2} R D̄_1^{−2} R ) E( Tr D_1^{−1} ( E m_{R̂_N} R + I )^{−2} R ( E m̄_{R̂_N} R + I )^{−2} D̄_1^{−1} R ) )^{1/2}
  + ( E( Tr D_1^{−1} R D̄_1^{−1} R ) E( Tr D_1^{−3} ( E m_{R̂_N} R + I )^{−1} R ( E m̄_{R̂_N} R + I )^{−1} D̄_1^{−3} R ) )^{1/2}
  + 2 |E m'_{R̂_N}| ( E( Tr D_1^{−1} R D̄_1^{−1} R ) E( Tr D_1^{−2} ( E m_{R̂_N} R + I )^{−2} R ( E m̄_{R̂_N} R + I )^{−2} D̄_1^{−2} R ) )^{1/2}
  + |E m''_{R̂_N}| ( E( Tr D_1^{−1} R D̄_1^{−1} R ) E( Tr D_1^{−1} ( E m_{R̂_N} R + I )^{−2} R ( E m̄_{R̂_N} R + I )^{−2} D̄_1^{−1} R ) )^{1/2}
  + |E m'_{R̂_N}|² ( E( Tr D_1^{−1} R D̄_1^{−1} R ) E( Tr D_1^{−1} ( E m_{R̂_N} R + I )^{−3} R ( E m̄_{R̂_N} R + I )^{−3} D̄_1^{−1} R ) )^{1/2}
].

Thanks to (24) and (25), the right-hand side is indeed bounded. This ends the proof of the tightness.
REFERENCES

[1] V. Plerou, P. Gopikrishnan, B. Rosenow, L. Amaral, T. Guhr, and H. Stanley, "Random matrix approach to cross correlations in financial data," Phys. Rev. E, vol. 65, no. 6, Jun. 2002.
[2] F. Luo, Y. Yang, J. Zhong, H. Gao, L. Khan, D. K. Thompson, and J. Zhou, "Constructing gene co-expression networks and predicting functions of unknown genes by random matrix theory," BMC Bioinformatics, vol. 8, no. 1, p. 299, 2007.
[3] A. Kammoun, M. Kharouf, W. Hachem, and J. Najim, "BER and outage probability approximations for LMMSE detectors on correlated MIMO channels," IEEE Trans. Inf. Theory, vol. 55, no. 10, pp. 4386-4397, 2009.
[4] R. Couillet, J. W. Silverstein, and M. Debbah, "Eigen-inference for energy estimation of multiple sources," IEEE Trans. Inf. Theory, submitted for publication. [Online]. Available: http://arxiv.org/abs/1001.3934
[5] X. Mestre and M. Lagunas, "Modified subspace algorithms for DoA estimation with large arrays," IEEE Trans. Signal Process., vol. 56, no. 2, pp. 598-614, Feb. 2008.
[6] Z. Bai and J. W. Silverstein, Spectral Analysis of Large Dimensional Random Matrices, Springer Series in Statistics, 2009.
[7] R. Couillet and M. Debbah, Random Matrix Methods for Wireless Communications, 1st ed. New York, NY, USA: Cambridge University Press, to appear.
[8] V. L. Girko, "Ten years of general statistical analysis." [Online]. Available: http://www.general-statistical-analysis.girko.freewebspace.com/chapter14.pdf
[9] J. W. Silverstein and P. L. Combettes, "Signal detection via spectral theory of large dimensional random matrices," IEEE Trans. Signal Process., vol. 40, no. 8, pp. 2100-2105, 1992.
[10] N. El Karoui, "Spectrum estimation for large dimensional covariance matrices using random matrix theory," Annals of Statistics, vol. 36, no. 6, pp. 2757-2790, Dec. 2008.
[11] Ø. Ryan and M. Debbah, "Free deconvolution for signal processing applications," in Proc. IEEE International Symposium on Information Theory (ISIT'07), Nice, France, Jun. 2007, pp. 1846-1850.
[12] R. Couillet and M. Debbah, "Free deconvolution for OFDM multicell SNR detection," in Proc. IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC'08), Cannes, France, 2008.
[13] X. Mestre, "On the asymptotic behavior of the sample estimates of eigenvalues and eigenvectors of covariance matrices," IEEE Trans. Signal Process., vol. 56, no. 11, pp. 5353-5368, Nov. 2008.
[14] ——, "Improved estimation of eigenvalues of covariance matrices and their associated subspaces using their sample estimates," IEEE Trans. Inf. Theory, vol. 54, no. 11, pp. 5113-5129, Nov. 2008.
[15] P. Vallet, P. Loubaton, and X. Mestre, "Improved subspace DoA estimation methods with large arrays: The deterministic signals case," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'09), 2009, pp. 2137-2140.
[16] V. A. Marčenko and L. A. Pastur, "Distributions of eigenvalues for some sets of random matrices," Math. USSR-Sbornik, vol. 1, no. 4, pp. 457-483, Apr. 1967.
[17] J. W. Silverstein and Z. D. Bai, "On the empirical distribution of eigenvalues of a class of large dimensional random matrices," Journal of Multivariate Analysis, vol. 54, no. 2, pp. 175-192, 1995.
[18] Z. D. Bai and J. W. Silverstein, "No eigenvalues outside the support of the limiting spectral distribution of large dimensional sample covariance matrices," Annals of Probability, vol. 26, no. 1, pp. 316-345, Jan. 1998.
[19] O. Kallenberg, Foundations of Modern Probability, 2nd ed. New York: Springer-Verlag, 2002.
[20] Z. D. Bai and J. W. Silverstein, "CLT of linear spectral statistics of large dimensional sample covariance matrices," Annals of Probability, vol. 32, no. 1A, pp. 553-605, 2004.
[21] J. Marsden and M. Hoffman, Basic Complex Analysis, 3rd ed. New York: Freeman, 1987.
[22] Z. D. Bai and J. W. Silverstein, "Exact separation of eigenvalues of large dimensional sample covariance matrices," The Annals of Probability, vol. 27, no. 3, pp. 1536-1555, 1999.
[23] J. W. Silverstein and S. Choi, "Analysis of the limiting spectral distribution of large dimensional random matrices," Journal of Multivariate Analysis, vol. 54, no. 2, pp. 295-309, 1995.
[24] F. Cucker and A. Corbalan, "An alternate proof of the continuity of the roots of a polynomial," The American Mathematical Monthly, vol. 96, no. 4, pp. 342-345, Apr. 1989.
[25] P. Vallet, P. Loubaton, and X. Mestre, "Improved subspace estimation for multivariate observations of high dimension: the deterministic signals case," IEEE Trans. Inf. Theory, submitted for publication. [Online]. Available: http://arxiv.org/abs/1002.3234
[26] W. Hachem, O. Khorunzhy, P. Loubaton, J. Najim, and L. A. Pastur, "A new approach for capacity analysis of large dimensional multi-antenna channels," IEEE Trans. Inf. Theory, vol. 54, no. 9, 2008.
[27] U. Haagerup, H. Schultz, and S. Thorbjørnsen, "A random matrix approach to the lack of projections in C*_red(F_2)," Advances in Mathematics, vol. 204, no. 1, pp. 1-83, 2006.
[28] P. Billingsley, Probability and Measure, 3rd ed. Hoboken, NJ: John Wiley & Sons, Inc., 1995.