
  • arXiv:1108.5266v1 [math.PR] 26 Aug 2011

    Fluctuations of an Improved Population Eigenvalue Estimator in Sample Covariance Matrix Models

    J. Yao, R. Couillet, J. Najim, M. Debbah

    October 16, 2018

    Abstract

    This article provides a central limit theorem for a consistent estimator of population eigenvalues with

    large multiplicities based on sample covariance matrices. The focus is on limited sample size situations,

    whereby the number of available observations is known and comparable in magnitude to the observation

    dimension. An exact expression as well as an empirical, asymptotically accurate, approximation of the

    limiting variance is derived. Simulations are performed that corroborate the theoretical claims. A specific

    application to wireless sensor networks is developed.

    I. INTRODUCTION

    Problems of statistical inference based on $M$ independent observations of an $N$-variate random variable

    $\mathbf{y}$, with $E[\mathbf{y}] = 0$ and $E[\mathbf{y}\mathbf{y}^H] = \mathbf{R}_N$, have drawn the attention of researchers from many fields for years:

    Portfolio optimization in finance [1], gene coexistence in biostatistics [2], channel capacity in wireless

    communications [3], power estimation in sensor networks [4], array processing [5], etc.

    In particular, retrieving spectral properties of the population covariance matrix $\mathbf{R}_N$, based on the

    observation of $M$ independent and identically distributed (i.i.d.) samples $\mathbf{y}^{(1)}, \ldots, \mathbf{y}^{(M)}$, is paramount

    to many questions of general science. If $M$ is large compared to $N$, then it is known that almost surely

    This work was partially supported by Agence Nationale de la Recherche (France), program ANR-07-MDCO-012-01 ’Sesame’.

    Yao and Najim are with Télécom Paristech, France. {yao,najim}@telecom-paristech.fr;

    Yao is also with Ecole Normale Supérieure and Najim, with CNRS.

    Couillet and Debbah are with Supélec, France. {romain.couillet, merouane.debbah}@supelec.fr.

    Debbah also holds the Alcatel-Lucent/Supélec Flexible Radio chair, France.

    October 16, 2018 DRAFT

    http://arxiv.org/abs/1108.5266v1

  • 2

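    $\|\hat{\mathbf{R}}_N - \mathbf{R}_N\| \to 0$ as $M \to \infty$, for any standard matrix norm, where $\hat{\mathbf{R}}_N$ is the sample covariance matrix $\hat{\mathbf{R}}_N \triangleq \frac{1}{M} \sum_{m=1}^{M} \mathbf{y}^{(m)} \mathbf{y}^{(m)H}$. However, one cannot always afford a large number of samples, especially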

    in wireless communications where the number of available samples is often comparable to the dimension of each sample. In order to cope with this issue, random matrix theory [6], [7] has proposed new tools, mainly spurred by the G-estimators of Girko [8]. Other works include convex optimization methods [9], [10] and free probability tools [11], [12]. Many of those estimators are consistent in the sense that they are asymptotically unbiased as $M, N$ grow large at the same rate. Nonetheless, only recently have techniques been unveiled which allow one to estimate individual eigenvalues and functionals of eigenvectors of $\mathbf{R}_N$. The main contributor is Mestre [13]-[14], who studies the case where $\mathbf{R}_N = \mathbf{U}_N \mathbf{D}_N \mathbf{U}_N^H$ with $\mathbf{D}_N$ diagonal with entries of large multiplicities and $\mathbf{U}_N$ with i.i.d. entries. For this model, he provides an estimator for every eigenvalue of $\mathbf{R}_N$ with large multiplicity under some separability condition; see also Vallet et al. [15], Couillet et al. [4] for more elaborate models.

    These estimators, although proven asymptotically unbiased, have nonetheless not been fully characterized in terms of performance statistics. It is in particular fundamental to evaluate the variance of these estimators for not-too-large $M, N$. The purpose of this article is to study the fluctuations of the population eigenvalue estimator of [14] in the case of structured population covariance matrices. A central limit theorem (CLT) is provided to describe the asymptotic fluctuations of the estimators, with an exact expression for the variance as $M, N$ tend to infinity. An empirical, asymptotically accurate approximation is also derived.

    The results are applied in a cognitive radio context in which we assume the co-existence of a licensed

    (primary) network and an opportunistic (secondary) network aiming at reusing the bandwidth resources

    left unoccupied by the primary network. The eigenvalue estimator is used here by secondary users to

    estimate the transmit power of primary users, while the fluctuations are used to provide a confidence

    margin on the estimate.

    The remainder of the article is structured as follows: In Section II, the system model is introduced

    and the main results from [13], [14] are recalled. In Section III, the CLT for the estimator in [14] is

    stated with the asymptotic variance. In Section IV, an empirical approximation for the variance is derived.

    A cognitive radio application of these results is provided in Section V, with comparative Monte Carlo

    simulations. Finally, Section VI concludes this article. Technical proofs are postponed to the appendix.


  • 3

    II. ESTIMATION OF THE POPULATION EIGENVALUES

    A. Notations

    In this paper, the notations $s$, $\mathbf{x}$, $\mathbf{M}$ stand for scalars, vectors and matrices, respectively. As usual, $\|\mathbf{x}\|$ represents the Euclidean norm of vector $\mathbf{x}$ and $\|\mathbf{M}\|$ stands for the spectral norm of $\mathbf{M}$. The superscripts $(\cdot)^T$ and $(\cdot)^H$ respectively stand for the transpose and transpose conjugate; the trace of $\mathbf{M}$ is denoted by $\mathrm{Tr}(\mathbf{M})$; the mathematical expectation operator, by $E$. If $\mathbf{x}$ is an $N \times 1$ vector, then $\mathrm{diag}(\mathbf{x})$ is the $N \times N$ matrix whose diagonal elements are the components of $\mathbf{x}$. If $z \in \mathbb{C}$, then $\Re(z)$ and $\Im(z)$ respectively stand for $z$'s real and imaginary parts, while $i$ stands for $\sqrt{-1}$; $\bar{z}$ stands for $z$'s conjugate and $\delta_{k\ell}$ denotes Kronecker's symbol (whose value is 1 if $k = \ell$, 0 otherwise).

    If the support $\mathcal{S}$ of a probability measure over $\mathbb{R}$ is the finite union of closed compact intervals $\mathcal{S}_k$ for $1 \le k \le L$, we will refer to each compact interval $\mathcal{S}_k$ as a cluster of $\mathcal{S}$.

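    If $\mathbf{Z} \in \mathbb{C}^{N \times N}$ is a nonnegative Hermitian matrix with eigenvalues $(\xi_i; 1 \le i \le N)$, we denote in the sequel by $\mathrm{eig}(\mathbf{Z}) = \{\xi_i, 1 \le i \le N\}$ the set of its eigenvalues and by $F^{\mathbf{Z}}$ the empirical distribution of its eigenvalues (also called spectral distribution of $\mathbf{Z}$), i.e.:

    $$F^{\mathbf{Z}}(d\lambda) = \frac{1}{N} \sum_{i=1}^{N} \delta_{\xi_i}(d\lambda),$$

    where $\delta_x$ stands for the Dirac probability measure at $x$.

    Convergence in distribution will be denoted by $\xrightarrow{\mathcal{D}}$, convergence in probability by $\xrightarrow{\mathcal{P}}$, and almost sure convergence by $\xrightarrow{a.s.}$.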

    B. Matrix Model

    Consider an $N \times M$ matrix $\mathbf{X}_N = (X_{ij})$ whose entries are independent and identically distributed (i.i.d.) random variables with distribution $\mathcal{CN}(0, 1)$, i.e. $X_{ij} = U + iV$, where $U, V$ are i.i.d. real Gaussian random variables $\mathcal{N}(0, \frac{1}{2})$. Let $\mathbf{R}_N$ be an $N \times N$ Hermitian matrix with $L$ ($L$ being fixed) distinct eigenvalues $\rho_1 < \cdots < \rho_L$ with respective multiplicities $N_1, \ldots, N_L$ (notice that $\sum_{i=1}^{L} N_i = N$). Consider now

    $$\mathbf{Y}_N = \mathbf{R}_N^{1/2} \mathbf{X}_N.$$

    The matrix $\mathbf{Y}_N = [\mathbf{y}_1, \ldots, \mathbf{y}_M]$ is the concatenation of $M$ independent observations $\mathbf{y}_1, \ldots, \mathbf{y}_M$, where each observation writes $\mathbf{y}_i = \mathbf{R}_N^{1/2} \mathbf{x}_i$ with $\mathbf{X}_N = [\mathbf{x}_1, \ldots, \mathbf{x}_M]$. In particular, the (population) covariance matrix of each observation $\mathbf{y}_i$ is $\mathbf{R}_N = E\,\mathbf{y}_i \mathbf{y}_i^H$. In this article, we are interested in recovering


  • 4

    information on $\mathbf{R}_N$ based on the observation

    $$\hat{\mathbf{R}}_N = \frac{1}{M} \mathbf{R}_N^{1/2} \mathbf{X}_N \mathbf{X}_N^H \mathbf{R}_N^{1/2},$$

    which is referred to as the sample covariance matrix.
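The construction of the sample covariance matrix can be sketched numerically. The snippet below is only an illustration under stated assumptions: the population matrix is taken diagonal (any unitary rotation would give the same spectrum), and the function name `sample_covariance`, the seed, and the sizes are hypothetical choices, not part of the article.

```python
import numpy as np

def sample_covariance(rho, mult, M, rng):
    """Draw Y_N = R_N^{1/2} X_N with CN(0, 1) entries in X_N; return (R_hat, R)."""
    diag_r = np.repeat(np.asarray(rho, dtype=float), mult)  # eigenvalues with multiplicities
    N = diag_r.size
    # CN(0, 1): real and imaginary parts are independent N(0, 1/2)
    X = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2.0)
    Y = np.sqrt(diag_r)[:, None] * X        # R_N^{1/2} X_N for a diagonal R_N (assumption)
    R_hat = (Y @ Y.conj().T) / M            # sample covariance matrix
    return R_hat, np.diag(diag_r)

rng = np.random.default_rng(0)
R_hat, R = sample_covariance([1.0, 3.0, 10.0], [20, 20, 20], M=600, rng=rng)
eigs = np.linalg.eigvalsh(R_hat)            # sample eigenvalues, in increasing order
```

With these sizes the ratio N/M equals 0.1, so the sample eigenvalues spread around the population values 1, 3 and 10 rather than coinciding with them.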

    It is in general a complicated task to infer the spectral properties of $\mathbf{R}_N$ based on $\hat{\mathbf{R}}_N$ for all finite $N, M$. Instead, in the following, we assume that $N$ and $M$ are large, and consider the following asymptotic regime:

    Assumption 1 (A1):

    $$N, M \to \infty, \quad \text{with } \frac{N}{M} \to c \in (0, \infty), \quad \text{and } \frac{N_i}{M} \to c_i \in (0, \infty), \quad 1 \le i \le L. \tag{1}$$

    This assumption will be referred to in short as $N, M \to \infty$.

    Assumption 2 (A2):

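    We assume that the limiting support $\mathcal{S}$ of the eigenvalue distribution of $\hat{\mathbf{R}}_N$ is formed of $L$ compact disjoint subsets (cf. Figure 1). Following [14], one can also reformulate this condition in a more analytic manner: the limiting support of $\hat{\mathbf{R}}_N$ is formed of $L$ clusters if and only if, for $m \in \{1, \ldots, L\}$, $\inf_N \{\frac{M}{N} - \Psi_N(m)\} > 0$, where

    $$\Psi_N(m) = \begin{cases} \dfrac{1}{N} \sum_{r=1}^{L} N_r \left( \dfrac{\rho_r}{\rho_r - \alpha_1} \right)^2, & m = 1, \\[2ex] \max\left\{ \dfrac{1}{N} \sum_{r=1}^{L} N_r \left( \dfrac{\rho_r}{\rho_r - \alpha_{m-1}} \right)^2, \ \dfrac{1}{N} \sum_{r=1}^{L} N_r \left( \dfrac{\rho_r}{\rho_r - \alpha_m} \right)^2 \right\}, & 1 < m < L, \\[2ex] \dfrac{1}{N} \sum_{r=1}^{L} N_r \left( \dfrac{\rho_r}{\rho_r - \alpha_{L-1}} \right)^2, & m = L, \end{cases}$$

    where $\alpha_1 \le \cdots \le \alpha_{L-1}$ are the $L - 1$ different ordered solutions to the equation

    $$\frac{1}{N} \sum_{r=1}^{L} N_r \frac{\rho_r^2}{(\rho_r - x)^3} = 0.$$

    This condition is also called the separability condition.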
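The separability condition can be checked numerically for given population eigenvalues and multiplicities. The following is a minimal sketch, not the procedure of [14]: the function name `separability_margins` is hypothetical, and the root search relies on the observation (an elementary check, not stated in the text) that the left-hand side of the cubic equation is increasing between two consecutive $\rho_r$, so that each gap contains exactly one solution that bisection can locate.

```python
import numpy as np

def separability_margins(rho, mult, M):
    """Return M/N - Psi_N(m) for m = 1..L; all entries > 0 means separability holds."""
    rho = np.asarray(rho, dtype=float)
    mult = np.asarray(mult, dtype=float)
    N, L = mult.sum(), rho.size            # requires L >= 2

    def f(x):
        # left-hand side of the equation defining the alphas
        return np.sum(mult * rho**2 / (rho - x) ** 3) / N

    alphas = []
    for m in range(L - 1):
        gap = rho[m + 1] - rho[m]
        lo, hi = rho[m] + 1e-9 * gap, rho[m + 1] - 1e-9 * gap
        for _ in range(200):               # bisection: f(lo) < 0 < f(hi), f increasing
            mid = 0.5 * (lo + hi)
            if f(mid) < 0:
                lo = mid
            else:
                hi = mid
        alphas.append(0.5 * (lo + hi))

    def g(a):
        return np.sum(mult * (rho / (rho - a)) ** 2) / N

    vals = [g(a) for a in alphas]
    psi = np.empty(L)
    psi[0], psi[-1] = vals[0], vals[-1]
    for m in range(1, L - 1):
        psi[m] = max(vals[m - 1], vals[m])
    return M / N - psi

margins_ok = separability_margins([1.0, 3.0, 10.0], [20, 20, 20], M=600)   # c = 0.1
margins_bad = separability_margins([1.0, 3.0, 10.0], [20, 20, 20], M=120)  # c = 0.5
```

For the setting of Figure 1 (c = 0.1) all margins are positive, while with only M = 120 observations at least one margin becomes negative, which suggests that the clusters would then merge.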

    Figure 1 depicts the eigenvalues of a realization of the random matrix $\hat{\mathbf{R}}_N$ and the associated limiting

    distribution as $N, M$ grow large, for $\rho_1 = 1$, $\rho_2 = 3$, $\rho_3 = 10$ and $N = 60$ with equal multiplicity.

    C. Mestre’s Estimator of the population eigenvalues

    In [14], an estimator of the population covariance matrix eigenvalues $(\rho_k; 1 \le k \le L)$ based on the observations $\hat{\mathbf{R}}_N$ is proposed.


  • 5

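    [Figure 1: density plotted against eigenvalues (axis ticks at 1, 3, 10), showing the asymptotic spectrum and the empirical eigenvalues.]

    Fig. 1. Empirical and asymptotic eigenvalue distribution of $\hat{\mathbf{R}}_N$ for $L = 3$, $\rho_1 = 1$, $\rho_2 = 3$, $\rho_3 = 10$, $N/M = c = 0.1$, $N = 60$, $N_1 = N_2 = N_3 = 20$.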

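    Theorem 1: [14] Denote by $\hat{\lambda}_1 \le \cdots \le \hat{\lambda}_N$ the ordered eigenvalues of $\hat{\mathbf{R}}_N$. Let $M, N \to \infty$ in the sense of the assumption (A1). Under the assumptions (A1)-(A2), the following convergence holds true:

    $$\hat{\rho}_k - \rho_k \xrightarrow[M, N \to \infty]{a.s.} 0, \tag{2}$$

    where

    $$\hat{\rho}_k = \frac{M}{N_k} \sum_{m \in \mathcal{N}_k} \left( \hat{\lambda}_m - \hat{\mu}_m \right), \tag{3}$$

    with $\mathcal{N}_k = \{\sum_{j=1}^{k-1} N_j + 1, \ldots, \sum_{j=1}^{k} N_j\}$ and $\hat{\mu}_1 \le \cdots \le \hat{\mu}_N$ the (real and) ordered solutions of:

    $$\frac{1}{N} \sum_{m=1}^{N} \frac{\hat{\lambda}_m}{\hat{\lambda}_m - \mu} = \frac{M}{N}. \tag{4}$$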
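The estimator (3) can be sketched in a few lines. The snippet below is an illustration under assumptions that go beyond the text: it assumes $N < M$ and distinct positive sample eigenvalues, in which case the solutions of (4) coincide with the eigenvalues of the rank-one perturbed matrix $\mathrm{diag}(\hat{\lambda}) - \frac{1}{M} \sqrt{\hat{\lambda}} \sqrt{\hat{\lambda}}^T$ (a consequence of the matrix determinant lemma, used here as a computational shortcut). The name `mestre_estimator`, the diagonal population matrix and the seed are hypothetical.

```python
import numpy as np

def mestre_estimator(lam, M, mult):
    """Return (rho_hat, mu_hat) computed from the sample eigenvalues via (3)-(4)."""
    lam = np.sort(np.asarray(lam, dtype=float))
    s = np.sqrt(lam)
    # Assumption N < M: the roots of (4) are the eigenvalues of diag(lam) - s s^T / M
    mu = np.linalg.eigvalsh(np.diag(lam) - np.outer(s, s) / M)
    edges = np.concatenate(([0], np.cumsum(mult)))   # boundaries of the index sets N_k
    rho_hat = np.array([
        M / mult[k] * np.sum(lam[edges[k]:edges[k + 1]] - mu[edges[k]:edges[k + 1]])
        for k in range(len(mult))
    ])
    return rho_hat, mu

# Illustrative run in the setting of Figure 1 (diagonal R_N is an assumption)
rng = np.random.default_rng(0)
rho, mult, M = np.array([1.0, 3.0, 10.0]), [20, 20, 20], 600
diag_r = np.repeat(rho, mult)
N = diag_r.size
X = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2.0)
Y = np.sqrt(diag_r)[:, None] * X
lam_hat = np.linalg.eigvalsh(Y @ Y.conj().T / M)
rho_hat, mu_hat = mestre_estimator(lam_hat, M, mult)
```

In this setting `rho_hat` is expected to lie close to (1, 3, 10), whereas the plain average of the sample eigenvalues in each cluster would typically be biased when $N/M$ is not small.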

    D. Integral representation of estimator ρ̂k - Stieltjes transforms

    The proof of Theorem 1 relies on random matrix theory and, in particular, on the Stieltjes transform, which [16], [17] use as a key ingredient.

    The Stieltjes transform $m_P$ of a probability distribution $P$ over $\mathbb{R}^+$ is a $\mathbb{C}$-valued function defined by:

    $$m_P(z) = \int_{\mathbb{R}^+} \frac{P(d\lambda)}{\lambda - z}, \quad z \in \mathbb{C} \setminus \mathbb{R}^+.$$

    There also exists an inverse formula to recover the probability distribution associated to a Stieltjes transform: let $a < b$ be two continuity points of the cumulative distribution function associated to $P$, then


  • 6

    $$P([a, b]) = \frac{1}{\pi} \lim_{y \downarrow 0} \Im \left[ \int_a^b m_P(x + iy) \, dx \right].$$

    In the case where $F^{\mathbf{Z}}$ is the spectral distribution associated to a nonnegative Hermitian matrix $\mathbf{Z} \in \mathbb{C}^{N \times N}$ with eigenvalues $(\xi_i; 1 \le i \le N)$, the Stieltjes transform $m_{\mathbf{Z}}$ of $F^{\mathbf{Z}}$ takes the particular form:

    $$m_{\mathbf{Z}}(z) = \int \frac{F^{\mathbf{Z}}(d\lambda)}{\lambda - z} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{\xi_i - z} = \frac{1}{N} \mathrm{Tr}\, (\mathbf{Z} - z\mathbf{I}_N)^{-1},$$

    which can be seen as the normalized trace of the resolvent $(\mathbf{Z} - z\mathbf{I}_N)^{-1}$. Since the seminal paper of Marčenko and Pastur [16], the Stieltjes transform has proved to be extremely efficient to describe the limiting spectrum of large dimensional random matrices.
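The two expressions of the Stieltjes transform of a spectral distribution (eigenvalue average and normalized resolvent trace) can be compared numerically. The snippet is a small sketch; the helper name `stieltjes`, the random test matrix and the evaluation point are arbitrary illustrative choices.

```python
import numpy as np

def stieltjes(eigs, z):
    """Stieltjes transform of the empirical spectral distribution at z."""
    return np.mean(1.0 / (np.asarray(eigs) - z))

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
Z = A @ A.conj().T                       # nonnegative Hermitian matrix
z = 0.3 + 0.7j                           # point in the upper half-plane

m_eig = stieltjes(np.linalg.eigvalsh(Z), z)
m_res = np.trace(np.linalg.inv(Z - z * np.eye(5))) / 5   # (1/N) Tr (Z - z I)^{-1}
```

Both values agree up to rounding, and the imaginary part is positive, as expected for a Stieltjes transform evaluated in the upper half-plane.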

    In the following, we recall some elements of the proof of Theorem 1, necessary for the remainder of

    the article. The first important result is due to Bai and Silverstein [17] (see also [16]).

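    Theorem 2: [17] Denote by $F^{\mathbf{R}}$ the limiting spectral distribution of $\mathbf{R}_N$, i.e. $F^{\mathbf{R}}(d\lambda) = \sum_{k=1}^{L} \frac{c_k}{c} \delta_{\rho_k}(d\lambda)$. Under the assumption (A1), the spectral distribution $F^{\hat{\mathbf{R}}_N}$ of the sample covariance matrix $\hat{\mathbf{R}}_N$ converges (weakly and almost surely) to a probability distribution $F$ as $M, N \to \infty$, whose Stieltjes transform $m(z)$ satisfies:

    $$m(z) = \frac{1}{c} \underline{m}(z) - \left( 1 - \frac{1}{c} \right) \frac{1}{z},$$

    for $z \in \mathbb{C}^+ = \{z \in \mathbb{C}, \Im(z) > 0\}$, where $\underline{m}(z)$ is defined as the unique solution in $\mathbb{C}^+$ of:

    $$\underline{m}(z) = -\left( z - c \int \frac{t}{1 + t\,\underline{m}(z)} \, dF^{\mathbf{R}}(t) \right)^{-1}.$$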
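The fixed-point equation of Theorem 2 can be explored numerically. The sketch below uses a plain fixed-point iteration started at $-1/z$; this is an illustrative choice, not a method taken from the article, and its convergence is simply assumed (and checked through the residual) for the evaluation point used. For the discrete $F^{\mathbf{R}}$ of the theorem, $c \int \frac{t}{1 + t\,\underline{m}} dF^{\mathbf{R}}(t) = \sum_k \frac{c_k \rho_k}{1 + \rho_k \underline{m}}$. The names `m_underline` and `m_limit` are hypothetical.

```python
import numpy as np

def m_underline(z, rho, c_k, n_iter=5000, tol=1e-13):
    """Iterate m <- -1 / (z - sum_k c_k rho_k / (1 + rho_k m)), starting from -1/z."""
    rho = np.asarray(rho, dtype=float)
    c_k = np.asarray(c_k, dtype=float)
    m = -1.0 / z
    for _ in range(n_iter):
        m_new = -1.0 / (z - np.sum(c_k * rho / (1.0 + rho * m)))
        if abs(m_new - m) < tol:
            return m_new
        m = m_new
    return m

def m_limit(z, rho, c_k):
    """m(z) = (1/c) m_underline(z) - (1 - 1/c) / z, with c = sum_k c_k."""
    c = float(np.sum(c_k))
    return m_underline(z, rho, c_k) / c - (1.0 - 1.0 / c) / z

rho = np.array([1.0, 3.0, 10.0])
c_k = np.array([1.0, 1.0, 1.0]) / 30.0           # c = 0.1, as in Figure 1
z = 2.0 + 1.0j
mu_ = m_underline(z, rho, c_k)
residual = abs(mu_ + 1.0 / (z - np.sum(c_k * rho / (1.0 + rho * mu_))))
m_z = m_limit(z, rho, c_k)
```

Closer to the real axis the plain iteration may converge slowly, in which case a damped update or a root finder could be substituted.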

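    Note that $\underline{m}(z)$ is also a Stieltjes transform whose associated distribution function will be denoted $\underline{F}$, which turns out to be the limiting spectral distribution of $F^{\underline{\hat{\mathbf{R}}}_N}$, where $\underline{\hat{\mathbf{R}}}_N$ is defined as:

    $$\underline{\hat{\mathbf{R}}}_N \triangleq \frac{1}{M} \mathbf{X}_N^H \mathbf{R}_N \mathbf{X}_N.$$

    Denote by $m_{\hat{\mathbf{R}}_N}(z)$ and $\underline{m}_{\hat{\mathbf{R}}_N}(z)$ the Stieltjes transforms of $F^{\hat{\mathbf{R}}_N}$ and $F^{\underline{\hat{\mathbf{R}}}_N}$. Notice in particular that

    $$m_{\hat{\mathbf{R}}_N}(z) = \frac{M}{N} \underline{m}_{\hat{\mathbf{R}}_N}(z) - \left( 1 - \frac{M}{N} \right) \frac{1}{z}.$$

    Remark 1: This relation associated to (4) readily implies that $\underline{m}_{\hat{\mathbf{R}}_N}(\hat{\mu}_i) = 0$. Otherwise stated, the $\hat{\mu}_i$'s are the zeros of $\underline{m}_{\hat{\mathbf{R}}_N}$. This fact will be of importance in the sequel.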
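Remark 1 can be verified numerically on a toy spectrum. The snippet below is a sketch under assumptions: the sample eigenvalues `lam` and the value of `M` are hypothetical, $N < M$ is assumed, and the solutions of (4) are obtained as eigenvalues of a rank-one perturbation of $\mathrm{diag}(\hat{\lambda})$ (matrix determinant lemma), which is a computational shortcut rather than something stated in the text.

```python
import numpy as np

lam = np.array([0.5, 1.0, 2.0, 4.0])     # hypothetical sample eigenvalues, N = 4
M = 8                                    # hypothetical number of observations (N < M)
N = lam.size
s = np.sqrt(lam)
mu = np.linalg.eigvalsh(np.diag(lam) - np.outer(s, s) / M)   # solutions of (4)

def m_hat(z):
    # Stieltjes transform of the spectral distribution of R_hat
    return np.mean(1.0 / (lam - z))

def m_hat_underline(z):
    # inverts m = (M/N) m_underline - (1 - M/N)/z
    return (N / M) * m_hat(z) - (1.0 - N / M) / z

lhs = np.array([np.mean(lam / (lam - x)) for x in mu])       # left-hand side of (4)
residuals = np.array([m_hat_underline(x) for x in mu])       # should vanish
```

The array `lhs` equals $M/N$ at every $\hat{\mu}_i$ and `residuals` vanishes up to rounding, in line with the remark.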


  • 7

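    Denote by $m_N(z)$ and $\underline{m}_N(z)$ the finite-dimensional counterparts of $m(z)$ and $\underline{m}(z)$, respectively, defined by the relations:

    $$\underline{m}_N(z) = -\left( z - \frac{N}{M} \int \frac{t}{1 + t\,\underline{m}_N(z)} \, dF^{\mathbf{R}_N}(t) \right)^{-1},$$

    $$m_N(z) = \frac{M}{N} \underline{m}_N(z) - \left( 1 - \frac{M}{N} \right) \frac{1}{z}.$$

    It can be shown that $m_N$ and $\underline{m}_N$ are Stieltjes transforms of given probability measures $F_N$ and $\underline{F}_N$, respectively (cf. [7, Theorem 3.2]).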

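    With these notations at hand, we can now derive Theorem 1. By Cauchy's formula, write:

    $$\rho_k = \frac{N}{N_k} \frac{1}{2i\pi} \oint_{\Gamma_k} \left( \frac{1}{N} \sum_{r=1}^{L} N_r \frac{w}{\rho_r - w} \right) dw,$$

    where $\Gamma_k$ is a negatively oriented contour taking values on $\mathbb{C} \setminus \{\rho_1, \ldots, \rho_L\}$ and only enclosing $\rho_k$. With the change of variable $w = -\frac{1}{\underline{m}_N(z)}$ and the condition that the limiting support $\mathcal{S}$ of the eigenvalue distribution of $\hat{\mathbf{R}}_N$ is formed of $L$ distinct clusters $(\mathcal{S}_k, 1 \le k \le L)$ (cf. Figure 1), we can write:

    $$\rho_k = \frac{M}{2i\pi N_k} \oint_{\mathcal{C}_k} z \frac{\underline{m}_N'(z)}{\underline{m}_N(z)} \, dz, \quad 1 \le k \le L, \tag{5}$$

    where $\mathcal{C}_k$ and $\mathcal{C}_\ell$ denote negatively oriented contours which enclose the corresponding clusters $\mathcal{S}_k$ and $\mathcal{S}_\ell$ respectively. Defining

    $$\hat{\rho}_k \triangleq \frac{M}{2\pi i N_k} \oint_{\mathcal{C}_k} z \frac{\underline{m}_{\hat{\mathbf{R}}_N}'(z)}{\underline{m}_{\hat{\mathbf{R}}_N}(z)} \, dz, \quad 1 \le k \le L, \tag{6}$$

    dominated convergence arguments ensure that $\rho_k - \hat{\rho}_k \to 0$, almost surely. The integral form of $\hat{\rho}_k$ can then be explicitly computed thanks to residue calculus, and this finally yields (3).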
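The first Cauchy representation of $\rho_k$ can be checked by numerical contour integration. The sketch below is illustrative only: it takes $\Gamma_k$ to be a clockwise circle of radius 0.5 around $\rho_k$ (a hypothetical choice that encloses no other $\rho_r$ for the values used) and discretizes the integral with the trapezoidal rule; the function name `contour_rho` is an assumption.

```python
import numpy as np

rho = np.array([1.0, 3.0, 10.0])
mult = np.array([20, 20, 20])
N = mult.sum()

def contour_rho(k, radius=0.5, n_pts=400):
    """Evaluate (N/N_k) (1/(2 i pi)) * integral over a clockwise circle around rho[k]."""
    theta = 2.0 * np.pi * np.arange(n_pts) / n_pts
    w = rho[k] + radius * np.exp(-1j * theta)                 # negatively oriented circle
    dw = -1j * radius * np.exp(-1j * theta) * (2.0 * np.pi / n_pts)
    integrand = np.sum(mult[:, None] * w[None, :] / (rho[:, None] - w[None, :]), axis=0) / N
    return (N / mult[k]) * np.sum(integrand * dw) / (2j * np.pi)

vals = np.array([contour_rho(k) for k in range(rho.size)])
```

Each entry of `vals` reproduces the corresponding $\rho_k$ up to rounding, with the sign fixed by the negative orientation of the contour.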

    The main objective of this article is to study the performance of the estimators $(\hat{\rho}_k, 1 \le k \le L)$. More precisely, we will establish a central limit theorem (CLT) for $(M(\hat{\rho}_k - \rho_k), 1 \le k \le L)$ as $M, N \to \infty$, explicitly characterize the limiting covariance matrix $\Theta = (\Theta_{k\ell})_{1 \le k, \ell \le L}$, and finally provide an estimator for $\Theta$.

    III. FLUCTUATIONS OF THE POPULATION EIGENVALUE ESTIMATORS

    A. The Central Limit Theorem

    The main result of this article is the following CLT which expresses the fluctuations of $(\hat{\rho}_k, 1 \le k \le L)$.

    Theorem 3: Under the assumptions (A1)-(A2) and with the same notations:

    $$(M(\hat{\rho}_k - \rho_k), 1 \le k \le L) \xrightarrow[M, N \to \infty]{\mathcal{D}} \mathbf{x} \sim \mathcal{N}_L(0, \Theta),$$

    where $\mathcal{N}_L$ refers to a real $L$-dimensional Gaussian distribution, and $\Theta$ is an $L \times L$ matrix whose entries $\Theta_{k\ell}$ are given by (7), where $\mathcal{C}_k$ and $\mathcal{C}_\ell$ are defined as before (cf. Formula (5)).


  • 8

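    $$\Theta_{k\ell} = -\frac{1}{4\pi^2 c_k c_\ell} \oint_{\mathcal{C}_k} \oint_{\mathcal{C}_\ell} \left[ \frac{\underline{m}'(z_1)\, \underline{m}'(z_2)}{(\underline{m}(z_1) - \underline{m}(z_2))^2} - \frac{1}{(z_1 - z_2)^2} \right] \frac{1}{\underline{m}(z_1)\, \underline{m}(z_2)} \, dz_1 dz_2. \tag{7}$$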

    B. Proof of Theorem 3

    We first outline the main steps of the proof and then provide the details.

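    Using the integral representation of $\hat{\rho}_k$ and $\rho_k$, we get: almost surely,

    $$M(\hat{\rho}_k - \rho_k) = \frac{M^2}{2\pi i N_k} \oint_{\mathcal{C}_k} z \left( \frac{\underline{m}_{\hat{\mathbf{R}}_N}'(z)}{\underline{m}_{\hat{\mathbf{R}}_N}(z)} - \frac{\underline{m}_N'(z)}{\underline{m}_N(z)} \right) dz.$$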

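    Denote by $C(\mathcal{C}_k, \mathbb{C})$ the set of continuous functions from $\mathcal{C}_k$ to $\mathbb{C}$ endowed with the supremum norm $\|u\|_\infty = \sup_{\mathcal{C}_k} |u|$. Consider the process:

    $$(X_N, X_N', u_N, u_N') : \mathcal{C}_k \to \mathbb{C}^4$$

    where

    $$X_N(z) = M \left( \underline{m}_{\hat{\mathbf{R}}_N}(z) - \underline{m}_N(z) \right),$$

    $$X_N'(z) = M \left( \underline{m}_{\hat{\mathbf{R}}_N}'(z) - \underline{m}_N'(z) \right),$$

    $$u_N(z) = \underline{m}_{\hat{\mathbf{R}}_N}(z), \quad u_N'(z) = \underline{m}_{\hat{\mathbf{R}}_N}'(z).$$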

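    Then, due to the 'no eigenvalue' result (cf. [18], see also Proposition 1), $(X_N, X_N', u_N, u_N')$ almost surely belongs to $C(\mathcal{C}_k, \mathbb{C})$ and $M(\hat{\rho}_k - \rho_k)$ writes:

    $$M(\hat{\rho}_k - \rho_k) = \frac{M}{2\pi i N_k} \oint_{\mathcal{C}_k} z \left( \frac{\underline{m}_N(z) X_N'(z) - u_N'(z) X_N(z)}{\underline{m}_N(z)\, u_N(z)} \right) dz \triangleq \Upsilon_N(X_N, X_N', u_N, u_N'),$$

    where

    $$\Upsilon_N(x, x', u, u') = \frac{M}{2\pi i N_k} \oint_{\mathcal{C}_k} z \left( \frac{\underline{m}_N(z)\, x'(z) - u'(z)\, x(z)}{\underline{m}_N(z)\, u(z)} \right) dz. \tag{8}$$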

    If needed, we shall explicitly indicate the dependence on the contour $\mathcal{C}_k$ and write $\Upsilon_N(x, x', u, u', \mathcal{C}_k)$.

    The main idea of the proof of the theorem lies in three steps:

    (i) To prove the convergence in distribution of the process $(X_N, X_N', u_N, u_N')$ to a Gaussian process.

    (ii) To transfer this convergence to the quantity $\Upsilon_N(X_N, X_N', u_N, u_N')$ with the help of the continuous mapping theorem [19].


  • 9

    (iii) To check that the limit (in distribution) of $\Upsilon_N(X_N, X_N', u_N, u_N')$ is Gaussian and to compute the limiting covariance between $\Upsilon_N(X_N, X_N', u_N, u_N', \mathcal{C}_k)$ and $\Upsilon_N(X_N, X_N', u_N, u_N', \mathcal{C}_\ell)$.

    Remark 2: Note that the convergence in step (i) is a distribution convergence at a process level, hence one has to first establish the finite dimensional convergence of the process and then to prove that the process is tight over $\mathcal{C}_k$. Tightness turns out to be difficult to establish due to the lack of control over the eigenvalues of $\hat{\mathbf{R}}_N$ whenever the contour crosses the real line. In order to circumvent this issue, we shall introduce, following Bai and Silverstein [20], a process that approximates $X_N$ and $X_N'$.

    Let us now start the proof of Theorem 3.

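    Lemma 1: Under the assumptions (A1)-(A2), the process

    $$(X_N, X_N') : \mathcal{C}_k \to \mathbb{C}^2$$

    converges in distribution to a Gaussian process $(X, Y)$ with mean function zero and covariance function:

    $$\mathrm{cov}(X(z), X(\tilde{z})) = \frac{\underline{m}'(z)\, \underline{m}'(\tilde{z})}{(\underline{m}(z) - \underline{m}(\tilde{z}))^2} - \frac{1}{(z - \tilde{z})^2} \triangleq \kappa(z, \tilde{z}), \tag{9}$$

    $$\mathrm{cov}(Y(z), X(\tilde{z})) = \frac{\partial}{\partial z} \kappa(z, \tilde{z}),$$

    $$\mathrm{cov}(X(z), Y(\tilde{z})) = \frac{\partial}{\partial \tilde{z}} \kappa(z, \tilde{z}),$$

    $$\mathrm{cov}(Y(z), Y(\tilde{z})) = \frac{\partial^2}{\partial z\, \partial \tilde{z}} \kappa(z, \tilde{z}).$$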

    Lemma 1 is the cornerstone of the proof of Theorem 3. The proof of Lemma 1 is postponed to Appendix B and relies on the following proposition, of independent interest:

    Proposition 1: Assume that (A1)-(A2) hold and denote by $\mathcal{S}$ the support of the probability distribution associated to the Stieltjes transform $m$. Then, for every $\varepsilon > 0$, $\ell \in \mathbb{N}^*$:

    $$P\left( \sup_{\lambda \in \mathrm{eig}(\hat{\mathbf{R}}_N)} d(\lambda, \mathcal{S}) > \varepsilon \right) = O\left( \frac{1}{N^\ell} \right),$$

    where $d(\lambda, \mathcal{S}) = \inf_{x \in \mathcal{S}} |\lambda - x|$.

    The proof of Proposition 1 is postponed to Appendix A.

    As $(u_N, u_N') \xrightarrow[N, M \to \infty]{a.s.} (\underline{m}, \underline{m}')$, a straightforward corollary of Lemma 1 yields that $(X_N, X_N', u_N, u_N') : \mathcal{C}_k \to \mathbb{C}^4$ converges in distribution to the Gaussian process $(X, Y, \underline{m}, \underline{m}')$. This concludes the proof of step (i).

    We are now in position to transfer the convergence of $(X_N, X_N', u_N, u_N')$ to $\Upsilon_N(X_N, X_N', u_N, u_N')$ via the continuous mapping theorem, whose statement as expressed in [19] is recalled below.


  • 10

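    Proposition 2 (cf. [19, Th. 4.27]): For any metric spaces $S_1$ and $S_2$, let $\xi, (\xi_n)_{n \ge 1}$ be random elements in $S_1$ with $\xi_n \xrightarrow[n \to \infty]{\mathcal{D}} \xi$ and consider some measurable mappings $f, (f_n)_{n \ge 1} : S_1 \to S_2$ and a measurable set $\Gamma \subset S_1$ with $\xi \in \Gamma$ a.s. such that $f_n(s_n) \to f(s)$ as $s_n \to s \in \Gamma$. Then $f_n(\xi_n) \xrightarrow[n \to \infty]{\mathcal{D}} f(\xi)$.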

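    It remains to apply Proposition 2 to the process $(X_N, X_N', u_N, u_N')$ and to the function $\Upsilon_N$ as defined in (8). Denote by¹

    $$\Upsilon(x, y, v, w) = \frac{1}{2\pi i c_k} \oint_{\mathcal{C}_k} z \left( \frac{\underline{m}(z)\, y(z) - w(z)\, x(z)}{\underline{m}(z)\, v(z)} \right) dz,$$

    and consider the set

    $$\Gamma = \left\{ (x, y, v, w) \in C^4(\mathcal{C}_k, \mathbb{C}), \ \inf_{\mathcal{C}_k} |v| > 0 \right\}.$$

    Then, it is shown in [6, Section 9.12.1] that $\inf_{\mathcal{C}_k} |\underline{m}| > 0$, and, by a dominated convergence theorem argument, that $(x_N, y_N, v_N, w_N) \to (x, y, v, w) \in \Gamma$ implies that $\Upsilon_N(x_N, y_N, v_N, w_N) \to \Upsilon(x, y, v, w)$. Therefore, Proposition 2 applies to $\Upsilon_N(x_N, y_N, v_N, w_N)$ and the following convergence holds true:

    $$\Upsilon_N(X_N, X_N', u_N, u_N') \xrightarrow[M, N \to \infty]{\mathcal{D}} \Upsilon(X, Y, \underline{m}, \underline{m}'),$$

    and step (ii) is established.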

    It now remains to prove step (iii), i.e. to check the Gaussianity of the random variable $\Upsilon(X, Y, \underline{m}, \underline{m}')$ and to compute the covariance between $\Upsilon(X, Y, \underline{m}, \underline{m}', \mathcal{C}_k)$ and $\Upsilon(X, Y, \underline{m}, \underline{m}', \mathcal{C}_\ell)$.

    In order to propagate the Gaussianity of the deviations in the integrands of (6) to the deviations of the integral which defines $\hat{\rho}_k$, it suffices to notice that the integral can be written as the limit of finite Riemann sums and that a finite Riemann sum of Gaussian random variables is still Gaussian. Therefore $M(\hat{\rho}_k - \rho_k)$ converges to a Gaussian distribution. As $\inf_{z \in \mathcal{C}_k} |\underline{m}(z)| > 0$, a straightforward application of Fubini's theorem together with the fact that $E(X) = E(Y) = 0$ yields:

    $$E \oint \left( z \frac{\underline{m}'(z)\, X(z)}{\underline{m}^2(z)} - z \frac{Y(z)}{\underline{m}(z)} \right) dz = 0.$$

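    It remains to compute the covariance between $\Upsilon(X, Y, \underline{m}, \underline{m}', \mathcal{C}_k)$ and $\Upsilon(X, Y, \underline{m}, \underline{m}', \mathcal{C}_\ell)$ for possibly different contours $\mathcal{C}_k$ and $\mathcal{C}_\ell$. We shall therefore evaluate, for $1 \le k, \ell \le L$:

    $$\Theta_{k\ell} = E\left( \Upsilon(X, Y, \underline{m}, \underline{m}', \mathcal{C}_k)\, \Upsilon(X, Y, \underline{m}, \underline{m}', \mathcal{C}_\ell) \right)$$

    $$\overset{(a)}{=} -\frac{1}{4\pi^2 c_k c_\ell} \oint_{\mathcal{C}_k} \oint_{\mathcal{C}_\ell} z_1 z_2 \left( \frac{\underline{m}'(z_1)\, \underline{m}'(z_2)\, \kappa(z_1, z_2)}{\underline{m}^2(z_1)\, \underline{m}^2(z_2)} - \frac{\underline{m}'(z_1)\, \partial_2 \kappa(z_1, z_2)}{\underline{m}^2(z_1)\, \underline{m}(z_2)} - \frac{\underline{m}'(z_2)\, \partial_1 \kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}^2(z_2)} + \frac{\partial_{12}^2 \kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}(z_2)} \right) dz_1 dz_2,$$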

    ¹ As previously, we shall explicitly indicate the dependence on the contour $\mathcal{C}_k$ if needed and write $\Upsilon(x, x', u, u', \mathcal{C}_k)$.


  • 11

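    $$\hat{\Theta}_{k\ell} = -\frac{M^2}{4\pi^2 N_k N_\ell} \oint_{\mathcal{C}_k} \oint_{\mathcal{C}_\ell} \left( \frac{\underline{m}_{\hat{\mathbf{R}}_N}'(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}'(z_2)}{(\underline{m}_{\hat{\mathbf{R}}_N}(z_1) - \underline{m}_{\hat{\mathbf{R}}_N}(z_2))^2} - \frac{1}{(z_1 - z_2)^2} \right) \times \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)} \, dz_1 dz_2. \tag{10}$$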

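    where (a) follows from the fact that $\inf_{z \in \mathcal{C}_k} |\underline{m}(z)| > 0$ together with Fubini's theorem, and $\partial_1$, $\partial_2$, $\partial_{12}^2$ respectively stand for $\partial/\partial z_1$, $\partial/\partial z_2$ and $\partial^2/\partial z_1 \partial z_2$.

    By integration by parts, we obtain

    $$\oint \frac{z_1 z_2\, \underline{m}'(z_2)\, \partial_1 \kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}^2(z_2)} \, dz_1 = \oint \left( -\frac{z_2\, \underline{m}'(z_2)\, \kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}^2(z_2)} + \frac{z_1 z_2\, \underline{m}'(z_1)\, \underline{m}'(z_2)\, \kappa(z_1, z_2)}{\underline{m}^2(z_1)\, \underline{m}^2(z_2)} \right) dz_1.$$

    Similarly,

    $$\oint \frac{z_1 z_2\, \partial_{12}^2 \kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}(z_2)} \, dz_1 = -\oint \frac{z_2\, \partial_2 \kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}(z_2)} \, dz_1 + \oint \frac{z_1 z_2\, \underline{m}'(z_1)\, \partial_2 \kappa(z_1, z_2)}{\underline{m}^2(z_1)\, \underline{m}(z_2)} \, dz_1.$$

    Hence

    $$\Theta_{k\ell} = -\frac{1}{4\pi^2 c_k c_\ell} \left\{ \oint_{\mathcal{C}_k} \oint_{\mathcal{C}_\ell} \frac{z_2\, \underline{m}'(z_2)\, \kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}^2(z_2)} \, dz_1 dz_2 - \oint_{\mathcal{C}_k} \oint_{\mathcal{C}_\ell} \frac{z_2\, \partial_2 \kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}(z_2)} \, dz_1 dz_2 \right\}.$$

    Another integration by parts yields

    $$\oint \frac{z_2\, \partial_2 \kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}(z_2)} \, dz_2 = -\oint \frac{\kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}(z_2)} \, dz_2 + \oint \frac{z_2\, \underline{m}'(z_2)\, \kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}^2(z_2)} \, dz_2.$$

    Finally, we obtain:

    $$\Theta_{k\ell} = -\frac{1}{4\pi^2 c_k c_\ell} \oint_{\mathcal{C}_k} \oint_{\mathcal{C}_\ell} \frac{\kappa(z_1, z_2)}{\underline{m}(z_1)\, \underline{m}(z_2)} \, dz_1 dz_2,$$

    and (7) is established.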

    IV. ESTIMATION OF THE COVARIANCE MATRIX

    Theorem 3 describes the limiting performance of the estimator of Theorem 1, with an exact characterization of its variance. Unfortunately, the variance $\Theta$ depends upon unknown quantities. We provide hereafter consistent estimates $\hat{\Theta}$ for $\Theta$ based on the observations $\hat{\mathbf{R}}_N$.


  • 12

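    $$\hat{\Theta}_{k\ell} = \frac{M^2}{N_k N_\ell} \left[ \sum_{(i,j) \in \mathcal{N}_k \times \mathcal{N}_\ell, \ i \ne j} \frac{-1}{(\hat{\mu}_i - \hat{\mu}_j)^2\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_j)} + \delta_{k\ell} \sum_{i \in \mathcal{N}_k} \left( \frac{\underline{m}_{\hat{\mathbf{R}}_N}'''(\hat{\mu}_i)}{6\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^3} - \frac{\underline{m}_{\hat{\mathbf{R}}_N}''(\hat{\mu}_i)^2}{4\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^4} \right) \right]. \tag{11}$$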

    Theorem 4: Assume that the assumptions (A1)-(A2) hold true, and recall the definition of $\Theta_{k\ell}$ given in (7). Let $\hat{\Theta}_{k\ell}$ be defined by (11), where $(\mathcal{N}_k)$ and $(\hat{\mu}_k)$ are defined in Theorem 1, then:

    $$\hat{\Theta}_{k\ell} - \Theta_{k\ell} \xrightarrow{a.s.} 0$$

    as $N, M \to \infty$.

    Theorem 4 is useful in practice as one can obtain simultaneously an estimate $\hat{\rho}_k$ of the values of $\rho_k$ as well as an estimation of the degree of confidence for each $\hat{\rho}_k$.
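Formula (11) can be evaluated directly from the sample eigenvalues. The sketch below is an illustration under assumptions, not the authors' implementation: it assumes $N < M$, obtains the $\hat{\mu}_i$ as eigenvalues of a rank-one perturbation of $\mathrm{diag}(\hat{\lambda})$, and differentiates the explicit expression $\underline{m}_{\hat{\mathbf{R}}_N}(z) = \frac{1}{M} \left[ \sum_n \frac{1}{\hat{\lambda}_n - z} - \frac{M - N}{z} \right]$, which follows from the relation between $m_{\hat{\mathbf{R}}_N}$ and $\underline{m}_{\hat{\mathbf{R}}_N}$. The function name `theta_hat` and the toy inputs are hypothetical.

```python
import numpy as np

def theta_hat(lam, M, mult):
    """Evaluate the covariance estimate (11) from sample eigenvalues (assumes N < M)."""
    lam = np.sort(np.asarray(lam, dtype=float))
    N = lam.size
    s = np.sqrt(lam)
    mu = np.linalg.eigvalsh(np.diag(lam) - np.outer(s, s) / M)   # solutions of (4)

    d = lam[None, :] - mu[:, None]                 # d[i, n] = lam_n - mu_i
    # first three derivatives of the underlined Stieltjes transform at z = mu_i
    m1 = (np.sum(d ** -2.0, axis=1) + (M - N) / mu ** 2) / M
    m2 = (2.0 * np.sum(d ** -3.0, axis=1) - 2.0 * (M - N) / mu ** 3) / M
    m3 = (6.0 * np.sum(d ** -4.0, axis=1) + 6.0 * (M - N) / mu ** 4) / M

    diff = mu[:, None] - mu[None, :]
    np.fill_diagonal(diff, np.inf)                 # removes the i == j terms
    A = -1.0 / (diff ** 2 * np.outer(m1, m1))      # off-diagonal kernel of (11)
    diag_term = m3 / (6.0 * m1 ** 3) - m2 ** 2 / (4.0 * m1 ** 4)

    edges = np.concatenate(([0], np.cumsum(mult)))
    L = len(mult)
    theta = np.empty((L, L))
    for k in range(L):
        ik = slice(edges[k], edges[k + 1])
        for l in range(L):
            il = slice(edges[l], edges[l + 1])
            val = A[ik, il].sum()
            if k == l:
                val += diag_term[ik].sum()
            theta[k, l] = M ** 2 / (mult[k] * mult[l]) * val
    return theta

theta_single = theta_hat([2.0], 10, [1])                        # N = 1 sanity case
theta_toy = theta_hat([0.8, 1.0, 1.2, 2.8, 3.1, 3.3], 40, [3, 3])
```

In the degenerate case $N = 1$ the formula reduces by hand to $\hat{\lambda}^2 (M - 1)$, which gives a quick sanity check. Given the CLT of Theorem 3, $\sqrt{\hat{\Theta}_{kk}} / M$ can then serve as an approximate standard deviation for $\hat{\rho}_k$.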

    Proof: In view of formula (7), and taking into account the fact that $\underline{m}_{\hat{\mathbf{R}}_N}$ and $\underline{m}_{\hat{\mathbf{R}}_N}'$ are consistent estimates for $\underline{m}$ and $\underline{m}'$, it is natural to define $\hat{\Theta}_{k\ell}$ by replacing the unknown quantities $\underline{m}$ and $\underline{m}'$ in (7) by their empirical counterparts $\underline{m}_{\hat{\mathbf{R}}_N}$ and $\underline{m}_{\hat{\mathbf{R}}_N}'$, hence the definition of $\hat{\Theta}_{k\ell}$ in (10).

    The proof of Theorem 4 now breaks down into two steps: the convergence of $\hat{\Theta}_{k\ell}$ to $\Theta_{k\ell}$, which relies on the definition (10) of $\hat{\Theta}_{k\ell}$ and on a dominated convergence argument, and the effective computation of the integral in (10), which relies on Cauchy's residue theorem [21] and yields (11).

    We first address the convergence of $\hat{\Theta}_{k\ell}$ to $\Theta_{k\ell}$. Due to [18], [22], almost surely, the eigenvalues of $\hat{\mathbf{R}}_N$ will eventually belong to any $\varepsilon$-blow-up of the support $\mathcal{S}$ of the probability measure associated to $m$, i.e. the set $\{x \in \mathbb{R} : d(x, \mathcal{S}) < \varepsilon\}$. Hence, if $\varepsilon$ is small enough, the distance between these eigenvalues and any $z \in \mathcal{C}_k$ will be eventually uniformly lower-bounded. By [14, Lemma 1], the same result holds true for the zeros of $\underline{m}_{\hat{\mathbf{R}}_N}$ (which are real). In particular, this implies that $\underline{m}_{\hat{\mathbf{R}}_N}$ is eventually uniformly lower-bounded on $\mathcal{C}_k$ (if not, then by compactness, there would exist $z \in \mathcal{C}_k$ such that $\underline{m}_{\hat{\mathbf{R}}_N}(z) = 0$, which yields a contradiction because all the zeros of $\underline{m}_{\hat{\mathbf{R}}_N}$ are strictly within the contour). With these arguments at hand, one can easily apply the dominated convergence theorem and conclude that a.s. $\hat{\Theta}_{k\ell} \to \Theta_{k\ell}$.

    We now evaluate the integral (10) by computing the residues of the integrand within $\mathcal{C}_k$ and $\mathcal{C}_\ell$. There are two cases to discuss, depending on whether $k \ne \ell$ or $k = \ell$. Denote by $h(z_1, z_2)$ the integrand in (10), that is:


  • 13

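    $$h(z_1, z_2) = \left( \frac{\underline{m}_{\hat{\mathbf{R}}_N}'(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}'(z_2)}{(\underline{m}_{\hat{\mathbf{R}}_N}(z_1) - \underline{m}_{\hat{\mathbf{R}}_N}(z_2))^2} - \frac{1}{(z_1 - z_2)^2} \right) \times \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)}. \tag{12}$$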

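    We first consider the case where $k \ne \ell$.

    In this case, the two integration contours are different and it can be assumed that they never intersect (so it can always be assumed that $z_1 \ne z_2$). Let $z_2$ be fixed, and denote by $\hat{\mu}_i$ the zeros (labeled in increasing order) of $\underline{m}_{\hat{\mathbf{R}}_N}$; then the computation of the residue $\mathrm{Res}(h(\cdot, z_2), \hat{\mu}_i)$ of $h(\cdot, z_2)$ at a zero $\hat{\mu}_i$ of $\underline{m}_{\hat{\mathbf{R}}_N}$ which is located within $\mathcal{C}_k$ is straightforward and yields:

    $$r(z_2) \triangleq \mathrm{Res}(h(\cdot, z_2), \hat{\mu}_i) = \left( \frac{\underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)\, \underline{m}_{\hat{\mathbf{R}}_N}'(z_2)}{\underline{m}_{\hat{\mathbf{R}}_N}^2(z_2)} - \frac{1}{(\hat{\mu}_i - z_2)^2} \right) \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)}. \tag{13}$$

    Similarly, if one computes $\mathrm{Res}(r, \hat{\mu}_j)$ at a zero $\hat{\mu}_j$ of $\underline{m}_{\hat{\mathbf{R}}_N}$ located within $\mathcal{C}_\ell$, one obtains:

    $$\mathrm{Res}(r, \hat{\mu}_j) = -\frac{1}{(\hat{\mu}_i - \hat{\mu}_j)^2\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_j)}.$$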

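    Then we need to consider the residue at a point $\xi$ of the set $R_{z_2} = \{z_1 : \underline{m}_{\hat{\mathbf{R}}_N}(z_1) = \underline{m}_{\hat{\mathbf{R}}_N}(z_2) \ne 0, \ z_1 \ne z_2\}$. (If this set is empty, then the residue is zero.) Notice that $\xi$ is not a pole of $\frac{1}{(z_1 - z_2)^2} \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)}$, hence one only needs to consider

    $$g(z_1, z_2) = \frac{\underline{m}_{\hat{\mathbf{R}}_N}'(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}'(z_2)}{(\underline{m}_{\hat{\mathbf{R}}_N}(z_1) - \underline{m}_{\hat{\mathbf{R}}_N}(z_2))^2} \, \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)}$$

    for the residue at $\xi$. By integration by parts, one gets

    $$\oint g(z_1, z_2) \, dz_1 = -\oint \frac{\underline{m}_{\hat{\mathbf{R}}_N}'(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}'(z_2)}{\underline{m}_{\hat{\mathbf{R}}_N}(z_1) - \underline{m}_{\hat{\mathbf{R}}_N}(z_2)} \, \frac{dz_1}{\underline{m}_{\hat{\mathbf{R}}_N}^2(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)}.$$

    Let $k = \min\{i \in \mathbb{N}^* : \underline{m}_{\hat{\mathbf{R}}_N}^{(i)}(\xi) \ne 0\}$, then by a Taylor expansion

    $$\underline{m}_{\hat{\mathbf{R}}_N}(z_1) = \underline{m}_{\hat{\mathbf{R}}_N}(z_2) + \frac{(z_1 - \xi)^k}{k!} \underline{m}_{\hat{\mathbf{R}}_N}^{(k)}(\xi) + o\left((z_1 - \xi)^k\right),$$

    and

    $$\underline{m}_{\hat{\mathbf{R}}_N}'(z_1) = \frac{(z_1 - \xi)^{k-1}}{(k - 1)!} \underline{m}_{\hat{\mathbf{R}}_N}^{(k)}(\xi) + o\left((z_1 - \xi)^{k-1}\right).$$

    Hence

    $$\mathrm{Res}(g, \xi) = -\frac{k\, \underline{m}_{\hat{\mathbf{R}}_N}'(z_2)}{\underline{m}_{\hat{\mathbf{R}}_N}^3(z_2)}.$$

    As it is the derivative of the function $\frac{k}{2\, \underline{m}_{\hat{\mathbf{R}}_N}^2(z_2)}$, its integral with respect to $z_2$ is zero.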


  • 14

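    It remains to count the number of zeros within each contour. By [14, Lemma 1], eventually, there are exactly as many zeros as eigenvalues within each contour. It has been proved that the contribution of the residues at $\xi \in R_{z_2}$ is null, hence the result in the case $k \ne \ell$, in agreement with (11):

    $$\hat{\Theta}_{k\ell} = \frac{M^2}{N_k N_\ell} \sum_{(i,j) \in \mathcal{N}_k \times \mathcal{N}_\ell} \frac{-1}{(\hat{\mu}_i - \hat{\mu}_j)^2\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_j)}.$$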

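    We now compute the integral (10) in the case where $k = \ell$, and begin by the computation of the residues at $\hat{\mu}_i$. The definition (13) of $r$ and the computation of $\mathrm{Res}(r, \hat{\mu}_j)$ still hold true in the case where $\hat{\mu}_j$ is within $\mathcal{C}_k$ but different from $\hat{\mu}_i$. It remains to compute $\mathrm{Res}(r, \hat{\mu}_i)$. Taking $z_2 \to \hat{\mu}_i$, we get:

    $$\lim_{z_2 \to \hat{\mu}_i} (z_2 - \hat{\mu}_i)^3 \left( \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)\, (\hat{\mu}_i - z_2)^2} \right) = \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^2},$$

    $$\lim_{z_2 \to \hat{\mu}_i} (z_2 - \hat{\mu}_i)^2 \left( \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)\, (\hat{\mu}_i - z_2)^2} - \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^2\, (z_2 - \hat{\mu}_i)^3} \right) = -\frac{\underline{m}_{\hat{\mathbf{R}}_N}''(\hat{\mu}_i)}{2\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^3}.$$

    Finally,

    $$\lim_{z_2 \to \hat{\mu}_i} (z_2 - \hat{\mu}_i) \left( \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)\, (\hat{\mu}_i - z_2)^2} - \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^2\, (z_2 - \hat{\mu}_i)^3} + \frac{\underline{m}_{\hat{\mathbf{R}}_N}''(\hat{\mu}_i)}{2\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^3\, (z_2 - \hat{\mu}_i)^2} \right) = \frac{\underline{m}_{\hat{\mathbf{R}}_N}''(\hat{\mu}_i)^2}{4\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^4} - \frac{\underline{m}_{\hat{\mathbf{R}}_N}'''(\hat{\mu}_i)}{6\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^3}.$$

    This term enters $r$ in (13) with a negative sign, while the remaining term $\underline{m}_{\hat{\mathbf{R}}_N}'(z_2) / \underline{m}_{\hat{\mathbf{R}}_N}^3(z_2)$ of $r$ is an exact derivative and has no residue. Hence the residue:

    $$\mathrm{Res}(r, \hat{\mu}_i) = \frac{\underline{m}_{\hat{\mathbf{R}}_N}'''(\hat{\mu}_i)}{6\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^3} - \frac{\underline{m}_{\hat{\mathbf{R}}_N}''(\hat{\mu}_i)^2}{4\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^4}.$$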

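    There are two other residues that should be taken into account for the computation of the integral: the residues at $\xi \in R_{z_2}$, and the residue for $z_1 = z_2$. The first case can be handled as before. For $z_1 = z_2$, the treatment of $g(z_1, z_2)$ is exactly the same as before. It remains to handle $\frac{1}{(z_1 - z_2)^2} \frac{1}{\underline{m}_{\hat{\mathbf{R}}_N}(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)}$ for the residue at $z_1 = z_2$. Integration by parts yields:

    $$\oint \frac{1}{(z_1 - z_2)^2} \, \frac{dz_1}{\underline{m}_{\hat{\mathbf{R}}_N}(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)} = \oint \frac{-\underline{m}_{\hat{\mathbf{R}}_N}'(z_1)}{z_1 - z_2} \, \frac{dz_1}{\underline{m}_{\hat{\mathbf{R}}_N}^2(z_1)\, \underline{m}_{\hat{\mathbf{R}}_N}(z_2)}.$$

    Then the residue for $z_1 = z_2$ is:

    $$-\frac{\underline{m}_{\hat{\mathbf{R}}_N}'(z_2)}{\underline{m}_{\hat{\mathbf{R}}_N}^3(z_2)}.$$

    Again, this is the derivative of the function $\frac{1}{2\, \underline{m}_{\hat{\mathbf{R}}_N}^2(z_2)}$, so its integral is zero.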


  • 15

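    Finally, both have a null contribution, hence the formula:

    $$\hat{\Theta}_{kk} = \frac{M^2}{N_k^2} \left[ \sum_{(i,j) \in \mathcal{N}_k^2, \ i \ne j} \frac{-1}{(\hat{\mu}_i - \hat{\mu}_j)^2\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_j)} + \sum_{i \in \mathcal{N}_k} \left( \frac{\underline{m}_{\hat{\mathbf{R}}_N}'''(\hat{\mu}_i)}{6\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^3} - \frac{\underline{m}_{\hat{\mathbf{R}}_N}''(\hat{\mu}_i)^2}{4\, \underline{m}_{\hat{\mathbf{R}}_N}'(\hat{\mu}_i)^4} \right) \right].$$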

    V. PERFORMANCE IN THE CONTEXT OF COGNITIVE RADIOS

    We introduce below a practical application of the above result to the telecommunication field of cognitive radios. Consider a communication network implementing orthogonal code division multiple access (CDMA) in the uplink, which we refer to as the primary network. The primary network is composed of $K$ transmitters. The data of transmitter $k$ are modulated by the $n_k$ orthogonal $N$-chip codes $\mathbf{w}_{k,1}, \ldots, \mathbf{w}_{k,n_k} \in \mathbb{C}^N$. Consider also a secondary network, in sensor mode, that we assume time-synchronized with the primary network, and whose objective is to determine the distances of the primary transmitters in order to optimally reuse the frequencies used by the primary transmitters². From the viewpoint of the secondary network, primary user $k$ has power $P_k$. Then, at symbol time $m$, any secondary user receives the $N$-dimensional data vector

    \[
    \mathbf{y}^{(m)} = \sum_{k=1}^{K} \sqrt{P_k} \sum_{j=1}^{n_k} \mathbf{w}_{k,j}\, x^{(m)}_{k,j} + \sigma \mathbf{n}^{(m)} \tag{14}
    \]

    with $\sigma \mathbf{n}^{(m)} \in \mathbb{C}^N$ an additive white Gaussian noise $\mathcal{CN}(0, \sigma^2 \mathbf{I})$ received at time $m$ and $x^{(m)}_{k,j} \in \mathbb{C}$ the signal transmitted by user $k$ on the carrier code $j$ at time $m$, which we assume $\mathcal{CN}(0, 1)$ as well. The propagation channel is considered frequency flat on the CDMA transmission bandwidth. We do not assume that the sensor knows either $\sigma^2$ or the vectors $\mathbf{w}_{k,j}$. The secondary users may or may not be aware of the number of codewords employed by each user.

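    Equation (14) can be written in the compact form

    \[
    \mathbf{y}^{(m)} = \mathbf{W} \mathbf{P}^{1/2} \mathbf{x}^{(m)} + \sigma \mathbf{n}^{(m)}
    \]

    with $\mathbf{W} = [\mathbf{w}_{1,1}, \ldots, \mathbf{w}_{1,n_1}, \mathbf{w}_{2,1}, \ldots, \mathbf{w}_{K,n_K}] \in \mathbb{C}^{N\times n}$, $n \triangleq \sum_{k=1}^{K} n_k$, $\mathbf{P} \in \mathbb{C}^{n\times n}$ the diagonal matrix with entry $P_1$ of multiplicity $n_1$, $P_2$ of multiplicity $n_2$, etc., and $P_K$ of multiplicity $n_K$, and $\mathbf{x}^{(m)} = [\mathbf{x}^{(m)T}_1, \ldots, \mathbf{x}^{(m)T}_K]^T \in \mathbb{C}^n$, where $\mathbf{x}^{(m)}_k \in \mathbb{C}^{n_k}$ is a column vector with $j$-th entry $x^{(m)}_{k,j}$.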

    ² The rationale being that far transmitters will not be interfered with by low power communications within the secondary network.


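    Gathering $M$ successive independent observations, we obtain the matrix $\mathbf{Y} = [\mathbf{y}^{(1)}, \ldots, \mathbf{y}^{(M)}] \in \mathbb{C}^{N\times M}$ given by

    \[
    \mathbf{Y} = \mathbf{W} \mathbf{P}^{1/2} \mathbf{X} + \sigma \mathbf{N} = \begin{bmatrix} \mathbf{W} \mathbf{P}^{1/2} & \sigma \mathbf{I}_N \end{bmatrix} \begin{bmatrix} \mathbf{X} \\ \mathbf{N} \end{bmatrix}
    \]

    where $\mathbf{X} = [\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(M)}]$ and $\mathbf{N} = [\mathbf{n}^{(1)}, \ldots, \mathbf{n}^{(M)}]$.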

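    The $\mathbf{y}^{(m)}$ are therefore independent Gaussian vectors of zero mean and covariance $\mathbf{R} \triangleq \mathbf{W}\mathbf{P}\mathbf{W}^H + \sigma^2 \mathbf{I}_N$. Since the objective is to retrieve the powers $P_k$, while $\sigma^2$ is known, the problem boils down to finding the eigenvalues of $\mathbf{W}\mathbf{P}\mathbf{W}^H + \sigma^2 \mathbf{I}_N$. However, the sensors only have access to $\mathbf{Y}$, or equivalently to the sample covariance matrix

    \[
    \hat{\mathbf{R}}_N \triangleq \frac{1}{M} \mathbf{Y}\mathbf{Y}^H = \frac{1}{M} \sum_{m=1}^{M} \mathbf{y}^{(m)} \mathbf{y}^{(m)H}.
    \]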
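The model above can be simulated directly. The following is a minimal sketch, not the authors' simulation code: the orthonormal codes are taken as the columns of a random unitary matrix, and the noise variance `sigma2 = 0.1` is an arbitrary illustrative value (both are assumptions). It draws $\mathbf{Y}$, forms the sample covariance matrix, and prints the mean of each group of 20 sample eigenvalues, which should sit near $P_k + \sigma^2$ when the clusters separate.

```python
import numpy as np

def simulate_sample_covariance(powers, codes_per_user, N, M, sigma2, seed=0):
    """Draw Y = W P^{1/2} X + sigma * Noise; return (R_hat, sorted eigenvalues of R_hat)."""
    rng = np.random.default_rng(seed)
    n = sum(codes_per_user)

    def cgauss(shape):
        # Standard complex Gaussian CN(0, 1) entries.
        return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

    # Hypothetical choice of codes: n orthonormal columns of a random unitary matrix.
    Q, _ = np.linalg.qr(cgauss((N, N)))
    W = Q[:, :n]
    # Diagonal of P: P_k repeated n_k times.
    p_diag = np.repeat(np.asarray(powers, dtype=float), codes_per_user)
    X = cgauss((n, M))
    Noise = cgauss((N, M))
    Y = W @ (np.sqrt(p_diag)[:, None] * X) + np.sqrt(sigma2) * Noise
    R_hat = Y @ Y.conj().T / M
    return R_hat, np.linalg.eigvalsh(R_hat)  # eigenvalues in ascending order

# Scenario of Figure 2 (sigma2 is an assumed value).
R_hat, lam = simulate_sample_covariance([1, 3, 10], [20, 20, 20], N=60, M=600, sigma2=0.1)
cluster_means = lam.reshape(3, 20).mean(axis=1)
print(cluster_means)
```

The cluster means are slightly biased with respect to $P_k + \sigma^2$, which is the motivation for the improved estimator discussed next.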

    Assuming $\hat{\mathbf{R}}_N$ conveys a good appreciation of the eigenvalue clustering to the secondary user (as in Figure 1), Theorem 1 enables the detection of primary transmitters and the estimation of their transmit powers $P_1, \ldots, P_K$; this boils down to estimating the largest $K$ eigenvalues of $\mathbf{W}\mathbf{P}\mathbf{W}^H + \sigma^2 \mathbf{I}_N$, i.e. the $P_k + \sigma^2$, and subtracting $\sigma^2$ (optionally estimated from the smallest eigenvalue of $\mathbf{W}\mathbf{P}\mathbf{W}^H + \sigma^2 \mathbf{I}_N$ if $n < N$). Call $\hat P_k$ the estimate of $P_k$.
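As an illustration of the estimation step, here is a minimal sketch of the population eigenvalue estimator for $N < M$. It assumes that Theorem 1 (stated earlier in the paper, after Mestre [14]) takes the form $\hat t_k = \frac{M}{N_k} \sum_{m\in\mathcal{N}_k} (\hat\lambda_m - \hat\mu_m)$, where the $\hat\mu_m$ are the ordered eigenvalues of $\mathrm{diag}(\hat\lambda) - \frac{1}{M}\sqrt{\hat\lambda}\sqrt{\hat\lambda}^T$. The function name `mestre_estimates` and the diagonal test covariance are hypothetical choices made for this example only.

```python
import numpy as np

def mestre_estimates(sample_eigs, M, cluster_sizes):
    """Estimate one population eigenvalue per cluster (assumed form of Theorem 1, N < M)."""
    lam = np.sort(np.clip(np.asarray(sample_eigs, dtype=float), 0.0, None))
    s = np.sqrt(lam)
    # mu: ordered eigenvalues of diag(lam) - (1/M) sqrt(lam) sqrt(lam)^T.
    mu = np.linalg.eigvalsh(np.diag(lam) - np.outer(s, s) / M)
    bounds = np.concatenate(([0], np.cumsum(cluster_sizes)))
    return np.array([
        M / (hi - lo) * np.sum(lam[lo:hi] - mu[lo:hi])
        for lo, hi in zip(bounds[:-1], bounds[1:])
    ])

# Illustrative data: diagonal population covariance with eigenvalues 1, 3, 10.
rng = np.random.default_rng(1)
N, M = 60, 600
pop = np.repeat([1.0, 3.0, 10.0], 20)
Z = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
Y = np.sqrt(pop)[:, None] * Z
lam_hat = np.linalg.eigvalsh(Y @ Y.conj().T / M)
est = mestre_estimates(lam_hat, M, [20, 20, 20])
print(est)
```

A useful sanity check on this form of the estimator is that it preserves the trace: $\sum_k N_k \hat t_k = \sum_m \hat\lambda_m$, since the rank-one correction removes exactly $\frac{1}{M}\sum_m \hat\lambda_m$ from the trace.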

    Based on these power estimates, the secondary user can determine the optimal coverage for secondary communications that ensures no interference with the primary network. A basic idea for instance is to ensure that the closest primary user, i.e. that with strongest received power, is not interfered with. Our interest is then cast on $P_K$. Now, since the power estimator is imperfect, it is hazardous for the secondary network to state that user $K$ has power $\hat P_K$ or to add some empirical security margin to $\hat P_K$. The results of Section III partially answer this problem.

    Theorems 3 and 4 enable the secondary sensor to evaluate the accuracy of $\hat P_k$. In particular, assume that the cognitive radio protocol allows the secondary network to interfere with the primary network with probability $q$ and denote by $A$ the value

    \[
    A \triangleq \inf_a \{ P(P_K - \hat P_K > a) \le q \}.
    \]

    According to Theorem 3, for $N, M$ large, $A$ is well approximated by $\hat\Theta_{K,K}\, Q^{-1}(q)$, with $Q$ the Gaussian cumulative distribution function. If the secondary users detect a user with power $P_K$, estimated by $\hat P_K$, then $P(\hat P_K + A < P_K) < q$ and it is safe for the secondary network to assume the worst case scenario where user $K$ transmits at power $\hat P_K + A \simeq \hat P_K + \hat\Theta_K\, Q^{-1}(q)$.

    In Figure 2, the performance of Theorem 3 is compared against 10,000 Monte Carlo simulations of a scenario of three users, with $P_1 = 1$, $P_2 = 3$, $P_3 = 10$, $n_1 = n_2 = n_3 = 20$, $N = 60$ and $M = 600$.

    [Figure 2: histogram of the estimates $\hat\rho_k$ overlaid with the theoretical limiting distribution; horizontal axis "Estimates" (ticks at 1, 3, 10), vertical axis "Density".]

    Fig. 2. Comparison of empirical against theoretical variances, based on Theorem 3, for three users, $P_1 = 1$, $P_2 = 3$, $P_3 = 10$, $n_1 = n_2 = n_3 = 20$ codes per user, $N = 60$, $M = 600$ and SNR = 20 dB.

    It appears that the limiting distribution is very accurate for these values of $N, M$. We also performed simulations to obtain empirical estimates $\hat\Theta_k$ of $\Theta_k$ from Theorem 4, which suggest that $\hat\Theta_k$ is an accurate estimator as well.
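The security margin $A$ defined in this section is a quantile of the estimation error $P_K - \hat P_K$. The sketch below is an illustration under stated assumptions rather than the procedure of the paper: `empirical_margin` reads $A$ off a set of simulated errors, and `gaussian_margin` evaluates the corresponding quantile of a centered Gaussian whose standard deviation `std` is an assumed input (for instance one derived from an estimate of the limiting variance). The simulated errors with standard deviation 0.1 are purely hypothetical.

```python
import math
from statistics import NormalDist

import numpy as np

def empirical_margin(errors, q):
    """Sample value a such that the fraction of errors strictly above a is at most q."""
    e = np.sort(np.asarray(errors, dtype=float))
    n = e.size
    k = max(math.ceil(n * (1.0 - q)) - 1, 0)
    return e[k]

def gaussian_margin(std, q):
    """Value a with P(error > a) = q when error ~ N(0, std^2)."""
    return std * NormalDist().inv_cdf(1.0 - q)

# Hypothetical errors P_K - P_hat_K, drawn here from N(0, 0.1^2) for illustration.
rng = np.random.default_rng(0)
errors = 0.1 * rng.standard_normal(10_000)
q = 0.05
A_emp = empirical_margin(errors, q)
A_gauss = gaussian_margin(0.1, q)
print(A_emp, A_gauss)
```

When the Gaussian approximation is accurate, the two margins agree closely, which is the kind of agreement Figure 2 illustrates for the estimator itself.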

    VI. CONCLUSION

    In this article, we derived an exact expression and an approximation of the limiting performance of

    a statistical inference method that estimates the population eigenvalues of a class of sample covariance

    matrices. These results are applied in the context of cognitive radios to optimize secondary network

    coverage based on measures of the primary network activity.

    APPENDIX

    A. Proof of Proposition 1

    Let us begin with considerations related to the supports of the probability distributions associated to $m(z)$ and $m_N(z)$. Denote by $S$ and $S_N$ these supports and recall that $S$ is the union of $L$ clusters:

    \[
    S = (a_1, b_1) \cup \cdots \cup (a_L, b_L).
    \]

    The following proposition clarifies the relations between $S_N$ and $S$.


    Proposition 3: Let $N, M \to \infty$, then for $N$ large enough, the support $S_N$ of the probability distribution associated to the Stieltjes transform $m_N(z)$ is the union of $L$ clusters:

    \[
    S_N = (a^N_1, b^N_1) \cup \cdots \cup (a^N_L, b^N_L).
    \]

    Moreover, the following convergence holds true:

    \[
    a^N_\ell \xrightarrow[N,M\to\infty]{} a_\ell, \qquad b^N_\ell \xrightarrow[N,M\to\infty]{} b_\ell,
    \]

    for $1 \le \ell \le L$.

    Remark 3: If the support $S_N$ contains zero (e.g. $N > M$), then zero is also in the support $S$, and the conclusion still holds.

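    Proof of Proposition 3: Recall the relations:

    \[
    \underline{m}_N(z) = -\left( z - \frac{N}{M} \int \frac{t}{1 + t\, \underline{m}_N(z)}\, dF^{R_N}(t) \right)^{-1} \tag{15}
    \]

    and

    \[
    m_N(z) = \frac{M}{N}\, \underline{m}_N(z) - \left( 1 - \frac{M}{N} \right) \frac{1}{z}. \tag{16}
    \]

    As the inverse Stieltjes transform of $-\frac{1}{z}$ is $\delta_0$ (the Dirac mass at 0) and $m_N(z)$ is a continuous function over $\mathbb{R}^*_+$, for $a, b$ with $0 < a < b$, the inversion formula for the Stieltjes transform gives:

    \[
    F_N([a, b]) = \frac{M}{N}\, \underline{F}_N([a, b]).
    \]

    So it suffices to study the support $\underline{S}_N$ associated to $\underline{F}_N$.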

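    From the definition of $\underline{m}_N(z)$ (see formula (15)), we obtain:

    \[
    z_{R_N}(\underline{m}_N) = -\frac{1}{\underline{m}_N} + \frac{N}{M} \int \frac{t\, dF^{R_N}(t)}{1 + t\, \underline{m}_N(z)}.
    \]

    Denote by $B = \{ m \in \mathbb{R} : m \neq 0,\ -m^{-1} \notin \{\rho_1, \cdots, \rho_L\} \}$. In [23, Theorem 4.1 and Theorem 4.2], Silverstein and Choi show that for a real number $x$, $x \in S^c_N \iff m_x \in B$ and

    \[
    z'_{R_N}(m_x) = \frac{1}{m_x^2} - \frac{N}{M} \int \frac{t^2\, dF^{R_N}(t)}{(1 + t m_x)^2} > 0
    \]

    with $\underline{m}_N(x) = m_x$ and $z_{R_N}(m_x) = x$.

    Then if $a \in \partial S_N$, either $m_a \notin B$ or $z'_{R_N}(m_a) \le 0$, with $m_a = \underline{m}_N(a)$. Now we will show that $m_a \in B$. By [23, Theorem 5.1], $m_a \neq 0$. If $-m_a^{-1} \in S_{F^{R_N}}$, as $F^{R_N}$ is discrete, we get that $\lim_{m\to m_a} \int \frac{t^2\, dF^{R_N}(t)}{(1+tm)^2} = \infty$. So on the neighborhood to the left and to the right of $m_a$, $z'_{R_N} < 0$, which contradicts [23, Theorem 5.1]. Hence $z'_{R_N}(m_a) \le 0$. By continuity, we get

    \[
    z'_{R_N}(m_a) = \frac{1}{m_a^2} - \frac{N}{M} \int \frac{t^2\, dF^{R_N}(t)}{(1 + t m_a)^2} = 0.
    \]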

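    It is equivalent to the following equation:

    \[
    z'_{R_N}(m_a) = \frac{1}{m_a^2} - \frac{1}{M} \sum_{i=1}^{L} \frac{N_i \rho_i^2}{(1 + \rho_i m_a)^2} = 0. \tag{17}
    \]

    By multiplying by the common denominator, one gets a polynomial of degree $2L$ in $m_a$. Now we will show that these $2L$ roots are real. At first, notice that

    \[
    \frac{1}{m^2} - \frac{N}{M} \int \frac{t^2\, dF^{R_N}(t)}{(1 + tm)^2} \xrightarrow[m \to -\frac{1}{\rho_i}]{} -\infty,
    \]

    and

    \[
    z''_{R_N}(m) = -\frac{2}{m^3} + \frac{N}{M} \int \frac{2 t^3\, dF^{R_N}(t)}{(1 + tm)^3}.
    \]

    So $z''_{R_N}(m)$ has one and only one zero in the open set $(-\frac{1}{\rho_i}, -\frac{1}{\rho_{i+1}})$ for $i \in \{1, \cdots, L-1\}$. Then for $\beta_i \in (-\frac{1}{\rho_i}, -\frac{1}{\rho_{i+1}})$ such that $z''_{R_N}(\beta_i) = 0$, it suffices to show that $z'_{R_N}(\beta_i) > 0$ in order to prove that there will be two zeros for $z'_{R_N}(m)$ in the set $(-\frac{1}{\rho_i}, -\frac{1}{\rho_{i+1}})$. From the separability condition (cf. Assumption (A2)), $\inf_N \{ \frac{M}{N} - \Psi_N(i) \} > 0$, and

    \[
    z'_{R_N}\!\left(-\frac{1}{\alpha_i}\right) = \alpha_i^2 - \frac{N}{M} \int \frac{t^2\, dF^{R_N}(t)}{(1 - \frac{t}{\alpha_i})^2} = \alpha_i^2 \left( 1 - \frac{1}{M} \sum_{r=1}^{L} \frac{N_r \rho_r^2}{(\alpha_i - \rho_r)^2} \right) > 0.
    \]

    Thus we obtain $2(L-1)$ roots. Besides, in the open set $(-\rho_L^{-1}, 0)$,

    \[
    \frac{1}{m^2} - \frac{N}{M} \int \frac{t^2\, dF^{R_N}(t)}{(1 + tm)^2} \xrightarrow[m \to 0^-]{} +\infty,
    \]

    so there exists another root in this set. In the open set $(-\infty, -\rho_1^{-1})$,

    \[
    \frac{1}{m^2} - \frac{N}{M} \int \frac{t^2\, dF^{R_N}(t)}{(1 + tm)^2} \xrightarrow[m \to -\infty]{} 0
    \]

    and

    \[
    \frac{1}{m^2} - \frac{N}{M} \int \frac{t^2\, dF^{R_N}(t)}{(1 + tm)^2} \underset{m \to -\infty}{\sim} \frac{1}{m^2} \left( 1 - \frac{N}{M} \right) > 0.
    \]

    Hence the last root lies in this open set. This proves that $S_N = (a^N_1, b^N_1) \cup \cdots \cup (a^N_L, b^N_L)$.

    To prove $a^N_\ell \xrightarrow[N,M\to\infty]{} a_\ell$ and $b^N_\ell \xrightarrow[N,M\to\infty]{} b_\ell$, notice that $a_i$, $b_i$ satisfy the same type of equation, with $\frac{N}{M}$ replaced by $c$ and $F^{R_N}$ by $F^{R}$. As $\frac{N}{M} \to c$ and $\frac{K_i}{M} \to c_i$, the roots of Equation (17) converge to those of the limit equation (see [24]). Thus we achieve the second conclusion.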
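The characterization of the support edges through Equation (17) can be explored numerically. The sketch below is an illustration, not part of the proof: it clears denominators in (17) to obtain the degree-$2L$ polynomial in $m$, computes its roots with `numpy.roots`, and maps each real root through $z_{R_N}(m) = -\frac{1}{m} + \frac{1}{M}\sum_i \frac{N_i\rho_i}{1+\rho_i m}$ to get the $2L$ edges. The function name `support_edges` and the test values $\rho = (1, 3, 10)$, $N_i = 20$, $M = 600$ (the scenario of Figure 2 without noise) are choices made for this example.

```python
import numpy as np

def support_edges(rho, mult, M):
    """Return the 2L sorted values z(m) at the real roots of z'(m) = 0 (Equation (17))."""
    rho = np.asarray(rho, dtype=float)
    c = np.asarray(mult, dtype=float) / M  # c_i = N_i / M
    # Coefficients (highest degree first) of (1 + rho_i m)^2.
    squares = [np.polymul([r, 1.0], [r, 1.0]) for r in rho]
    full = np.array([1.0])
    for sq in squares:
        full = np.polymul(full, sq)
    weighted = np.zeros(1)
    for i in range(len(rho)):
        others = np.array([1.0])
        for j, sq in enumerate(squares):
            if j != i:
                others = np.polymul(others, sq)
        weighted = np.polyadd(weighted, c[i] * rho[i] ** 2 * others)
    # prod_i (1 + rho_i m)^2 - m^2 * sum_i c_i rho_i^2 prod_{j != i} (1 + rho_j m)^2
    poly = np.polysub(full, np.polymul([1.0, 0.0, 0.0], weighted))
    roots = np.roots(poly)
    if np.max(np.abs(roots.imag)) > 1e-8:
        raise ValueError("complex roots found: fewer than 2L real edges for these parameters")
    m = roots.real
    z = -1.0 / m + (c * rho / (1.0 + np.outer(m, rho))).sum(axis=1)
    return np.sort(z)

edges = support_edges([1.0, 3.0, 10.0], [20, 20, 20], M=600)
print(edges.reshape(3, 2))  # one row (a_k, b_k) per cluster
```

For these parameters the six roots are real and each interval $(a^N_k, b^N_k)$ contains the corresponding $\rho_k$, in line with the cluster separation assumed throughout.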


    We are now in position to establish the proof of Proposition 1.

    Denote by $S(\varepsilon)$ the $\varepsilon$-blow-up of $S$, i.e. $S(\varepsilon) = \{ x \in \mathbb{R},\ d(x, S) < \varepsilon \}$. Let $\varepsilon > 0$ be small enough and consider a smooth function $\phi$ equal to zero on $S(\varepsilon/3)$, equal to 1 if $x \notin S(\varepsilon)$, equal to zero again if $|x| \ge \tau$ (as we shall see, $\tau$ will be chosen to be large), and smooth in-between with $0 \le \phi \le 1$:

    \[
    \phi(x) = \begin{cases} 0 & \text{if } d(x, S) < \varepsilon/3, \\ 1 & \text{if } d(x, S) > \varepsilon,\ |x| \le \tau - \varepsilon, \\ 0 & \text{if } |x| > \tau. \end{cases}
    \]

    Notice that if $N, M \to \infty$ and $N$ is large enough, then by Proposition 3, $\phi(x) = 0$ for all $x \in S_N$. Now if $Z$ is an $M \times M$ Hermitian matrix with spectral decomposition $Z = U\, \mathrm{diag}(\gamma_i;\ 1 \le i \le M)\, U^H$, where $U$ is unitary and $\mathrm{diag}(\gamma_i;\ 1 \le i \le M)$ stands for the $M \times M$ diagonal matrix whose entries are $Z$'s eigenvalues, write $\phi(Z) = U\, \mathrm{diag}(\phi(\gamma_i);\ 1 \le i \le M)\, U^H$.
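The functional calculus $\phi(Z) = U\, \mathrm{diag}(\phi(\gamma_i))\, U^H$ can be sketched in a few lines. In the example below, `cutoff` is a piecewise-linear stand-in for $\phi$ (it is not smooth and ignores the outer truncation at $\tau$, both simplifications made for illustration), and the interval $[1, 2]$ plays the role of a single-cluster support; these choices are assumptions, not taken from the paper.

```python
import numpy as np

def matrix_function(phi, Z):
    """phi(Z) = U diag(phi(gamma_i)) U^H for a Hermitian matrix Z."""
    gamma, U = np.linalg.eigh(Z)
    return (U * phi(gamma)) @ U.conj().T  # scales column i of U by phi(gamma_i)

def cutoff(x, a, b, eps):
    """Piecewise-linear stand-in for phi: 0 within eps/3 of [a, b], 1 beyond distance eps."""
    dist = np.maximum(np.maximum(a - x, x - b), 0.0)
    return np.clip((dist - eps / 3) / (2 * eps / 3), 0.0, 1.0)

# Hermitian matrix with five eigenvalues inside [1, 2] and one far outside.
rng = np.random.default_rng(0)
G = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
U, _ = np.linalg.qr(G)
gammas = np.array([1.0, 1.2, 1.5, 1.7, 2.0, 5.0])
Z = (U * gammas) @ U.conj().T
phi_Z = matrix_function(lambda x: cutoff(x, 1.0, 2.0, 0.3), Z)
print(np.trace(phi_Z).real)
```

Here $\mathrm{Tr}\, \phi(Z)$ counts the eigenvalues lying far from the interval, which is why the event $\{\mathrm{Tr}\, \phi(\hat R_N) \ge 1\}$ appears in the bound that follows.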

    We have:

    \[
    \begin{aligned}
    P\Big(\sup_n d(\lambda_n, S) > \varepsilon\Big) &\le P(\|\hat R_N\| > \tau - \varepsilon) + P(\mathrm{Tr}\, \phi(\hat R_N) \ge 1) \\
    &= P(\|\hat R_N\| > \tau - \varepsilon) + P([\mathrm{Tr}\, \phi(\hat R_N)]^p \ge 1) \\
    &\overset{(a)}{\le} P(\|\hat R_N\| > \tau - \varepsilon) + E[\mathrm{Tr}\, \phi(\hat R_N)]^p,
    \end{aligned}
    \]

    for every $p \ge 1$, where (a) follows from Markov's inequality. The fact that $P(\|\hat R_N\| > \tau) = O(N^{-\ell})$ for $\tau$ large enough and every $\ell \in \mathbb{N}^*$ is well-known (see for instance [6, Section 9.7]). We shall therefore establish estimates over $E[\mathrm{Tr}\, \phi(\hat R_N)]^p$. Take $p = 2^k$; we prove the following statement by induction: for $k \ge 1$, for every integer $\beta < 2^k$ and for every smooth function $f$ with compact support whose value on $S(\varepsilon/3)$ is zero,

    \[
    E\left( \mathrm{Tr}\, f(\hat R_N) \right)^{2^k} = O\left( \frac{1}{N^{\beta}} \right).
    \]

    First notice that due to Proposition 3, $\int_{S_N} f(\lambda)\, F_N(d\lambda) = 0$ (where $F_N$ is the probability distribution associated to $m_N$) for $N, M$ large enough ($N, M \to \infty$). A minor modification of [25, Lemma 2] (whose model is slightly different) with the help of [26, Proposition 5] yields that for $N, M \to \infty$ and $N$ large enough, $E\, \mathrm{Tr}\, f(\hat R_N) = O(N^{-1})$, and the property is verified for $k = 0$.


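    Let $k > 0$ be fixed and assume that the result holds true for $\beta < 2^k$. We want to show that $E[\mathrm{Tr}\, f(\hat R_N)]^{2^{k+1}} = O(N^{-2\beta})$. At step $k + 1$, the expectation writes:

    \[
    \left| E[\mathrm{Tr}\, f(\hat R_N)]^{2^{k+1}} \right| = \left| E\left( [\mathrm{Tr}\, f(\hat R_N)]^{2^k} + E[\mathrm{Tr}\, f(\hat R_N)]^{2^k} - E[\mathrm{Tr}\, f(\hat R_N)]^{2^k} \right)^2 \right| \tag{18}
    \]

    \[
    \le 2 \left( \mathrm{Var}[\mathrm{Tr}\, f(\hat R_N)]^{2^k} + \left| E[\mathrm{Tr}\, f(\hat R_N)]^{2^k} \right|^2 \right). \tag{19}
    \]

    The second term of the right hand side (r.h.s.) of the equation can be handled by the induction hypothesis:

    \[
    \left| E[\mathrm{Tr}\, f(\hat R_N)]^{2^k} \right|^2 = O\left( \frac{1}{N^{2\beta}} \right).
    \]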

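    We now rely on the Poincaré-Nash inequality (see for instance [26, Section II-B]) to handle the first term of the r.h.s. Applying this inequality, we obtain:

    \[
    \mathrm{Var}\left( (\mathrm{Tr}\, f(\hat R_N))^{2^k} \right) \le K \sum_{i,j} E\left[ \left| \frac{\partial [\mathrm{Tr}\, f(\hat R_N)]^{2^k}}{\partial Y_{i,j}} \right|^2 + \left| \frac{\partial [\mathrm{Tr}\, f(\hat R_N)]^{2^k}}{\partial \overline{Y}_{i,j}} \right|^2 \right], \tag{20}
    \]

    where $K$ is a constant which does not depend on $N, M$ and which is greater than $R_N$'s eigenvalues. In order to compute the derivatives of the r.h.s., we rely on [27, Lemma 4.6]. This yields:

    \[
    \frac{\partial}{\partial Y_{i,j}} [\mathrm{Tr}\, f(\hat R_N)]^{2^k} = \frac{2^k}{M} [\mathrm{Tr}\, f(\hat R_N)]^{2^k - 1} [Y_N^* f'(\hat R_N)]_{j,i},
    \]

    \[
    \frac{\partial}{\partial \overline{Y}_{i,j}} [\mathrm{Tr}\, f(\hat R_N)]^{2^k} = \frac{2^k}{M} [\mathrm{Tr}\, f(\hat R_N)]^{2^k - 1} [f'(\hat R_N) Y_N]_{i,j}.
    \]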

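    Plugging these derivatives into (20), we obtain:

    \[
    \begin{aligned}
    \mathrm{Var}\left( (\mathrm{Tr}\, f(\hat R_N))^{2^k} \right) &\le K \frac{2^{2k+1}}{M^2} E\left[ (\mathrm{Tr}\, f(\hat R_N))^{2^{k+1} - 2}\, \mathrm{Tr}\left( f'(\hat R_N) Y_N Y_N^* f'(\hat R_N) \right) \right] \\
    &= K \frac{2^{2k+1}}{M} E\left[ (\mathrm{Tr}\, f(\hat R_N))^{2^{k+1} - 2}\, \mathrm{Tr}\left( f'(\hat R_N)^2 \hat R_N \right) \right] \\
    &\le K \frac{2^{2k+1}}{M} \left| E[\mathrm{Tr}\, f(\hat R_N)]^{2^{k+1}} \right|^{\frac{2^{k+1} - 2}{2^{k+1}}} \times \left| E[\mathrm{Tr}\, f'(\hat R_N)^2 \hat R_N]^{2^k} \right|^{\frac{1}{2^k}},
    \end{aligned}
    \]

    where the last inequality is a consequence of Hölder's inequality.

    As the function $h(\lambda) = \lambda [f'(\lambda)]^2$ satisfies the induction hypothesis, we have for every $\alpha < 1$:

    \[
    \left| E[\mathrm{Tr}\, f'(\hat R_N)^2 \hat R_N]^{2^k} \right|^{\frac{1}{2^k}} = O(N^{-\alpha}).
    \]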

    Plugging this estimate into (18), we obtain:

    \[
    \left| E[\mathrm{Tr}\, f(\hat R_N)]^{2^{k+1}} \right| \le K \left( \frac{1}{N^{1+\alpha}} \left| E[\mathrm{Tr}\, f(\hat R_N)]^{2^{k+1}} \right|^{\frac{2^{k+1} - 2}{2^{k+1}}} \right) + O(N^{-2\beta}), \tag{21}
    \]

    where $K$ is a constant independent of $M, N, k$. Notice that inequality (21) involves twice the quantity of interest $E[\mathrm{Tr}\, f(\hat R_N)]^{2^{k+1}}$ that we want to upper bound by $O(N^{-2\beta})$. We shall proceed iteratively. Notice that $\mathrm{Tr}[f(\hat R_N)] \le \sup_{x\in\mathbb{R}} |f(x)| \times N$ because $f$ is bounded on $\mathbb{R}$; hence the rough estimate:

    \[
    E[\mathrm{Tr}\, f(\hat R_N)]^{2^{k+1}} = O(N^{2^{k+1}}).
    \]

    Plugging this into (21) yields:

    \[
    E[\mathrm{Tr}\, f(\hat R_N)]^{2^{k+1}} = O(N^{a_1}),
    \]

    where $a_0 = 2^{k+1}$ and $a_1 = a_0 \frac{2^{k+1} - 2}{2^{k+1}} - (1 + \alpha)$. Iterating the procedure, we obtain:

    \[
    E[\mathrm{Tr}\, f(\hat R_N)]^{2^{k+1}} = O\left( N^{a_\ell \vee (-2\beta)} \right),
    \]

    where $a_\ell = a_{\ell-1} \frac{2^{k+1} - 2}{2^{k+1}} - (1 + \alpha)$ and $x \vee y$ stands for $\sup(x, y)$. Now, in order to conclude the proof, it remains to prove that i) the sequence $(a_\ell)$ converges to some limit $a_\infty$, ii) for some well-chosen $\alpha < 1$, $a_\infty \in (-2^{k+1}, -2\beta)$. Write:

    \[
    a_{\ell+1} + 2^k (1 + \alpha) = \frac{2^k - 1}{2^k} \left( a_\ell + 2^k (1 + \alpha) \right),
    \]

    hence $a_\ell$ converges to $-2^k (1 + \alpha)$, which readily belongs to $(-2^{k+1}, -2\beta)$ for a well-chosen $\alpha \in (0, 1)$. Finally $E[\mathrm{Tr}\, f(\hat R_N)]^{2^{k+1}} = O(N^{-2\beta})$, which ends the induction.

    It remains to apply this estimate to $E[\mathrm{Tr}\, \phi(\hat R_N)]^{\ell}$ in order to get the desired result.
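The convergence of the exponents $a_\ell$ used at the end of the induction is easy to check numerically. The short sketch below iterates the recursion $a_\ell = a_{\ell-1}\frac{2^{k+1}-2}{2^{k+1}} - (1+\alpha)$ from $a_0 = 2^{k+1}$; the values $k = 2$, $\alpha = 0.9$ and $\beta = 3$ are arbitrary illustrative choices (with $\beta < 2^k$), not values from the paper.

```python
def iterate_exponents(k, alpha, steps):
    """Iterate a_l = a_{l-1} * (2^{k+1} - 2) / 2^{k+1} - (1 + alpha), starting at a_0 = 2^{k+1}."""
    ratio = (2 ** (k + 1) - 2) / 2 ** (k + 1)
    a = float(2 ** (k + 1))
    seq = [a]
    for _ in range(steps):
        a = a * ratio - (1 + alpha)
        seq.append(a)
    return seq

k, alpha, beta = 2, 0.9, 3
seq = iterate_exponents(k, alpha, steps=200)
limit = -(2 ** k) * (1 + alpha)
print(seq[:4], seq[-1], limit)
```

The sequence decreases geometrically (with ratio $1 - 2^{-k}$) towards $-2^k(1+\alpha)$, which for these values lies in $(-2^{k+1}, -2\beta) = (-8, -6)$, so the exponent eventually saturates at $-2\beta$ as claimed.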

    B. Proof of Lemma 1

    As explained in Section III, there are two conditions to prove (Billingsley [28, Theorem 13.1]):

    • Finite-dimensional convergence of the process $(X_N, X'_N)$.

    • Tightness on the contour $\mathcal{C}_k$.

    Remark 4: As $u_N$ (resp. $u'_N$) converges almost surely to $u$ (resp. $u'$) (see Silverstein and Bai [17]), the convergence of the process $(X_N, X'_N, u_N, u'_N)$ is achieved as soon as the convergence of the process $(X_N, X'_N)$ is proved.

    In [20], Bai and Silverstein establish a central limit theorem for $F^{\hat R_N}$ with complex Gaussian entries $X_{ij}$. We recall below their main result.

    Proposition 4: [20] With the notations introduced in Section II, for $f_1, \ldots, f_p$ analytic on an open region containing $\mathbb{R}$,

    1) $\left( N \int f_i(x)\, d(F^{\hat R_N} - F_N)(x) \right)_{1 \le i \le p}$ forms a tight sequence on $N$,

    2) $\left( N \int f_i(x)\, d(F^{\hat R_N} - F_N)(x) \right)_{1 \le i \le p} \xrightarrow{\mathcal{D}} \mathcal{N}(0, V)$,

    where $V = (V_{ij})$ and

    \[
    V_{ij} = -\frac{1}{4\pi^2} \oint\!\!\oint f_i(z_1) f_j(z_2)\, v_{ij}(z_1, z_2)\, dz_1\, dz_2,
    \]

    with

    \[
    v_{ij}(z_1, z_2) = \frac{m'(z_1)\, m'(z_2)}{(m(z_1) - m(z_2))^2} - \frac{1}{(z_1 - z_2)^2},
    \]

    where the integration is over positively oriented contours that circle around the support $S$.

    Now we apply this proposition to show the finite-dimensional convergence. For all $z_i \in \mathcal{C}_k \setminus \mathbb{R}$, notice that

    \[
    m_{\hat R_N}(z) - m_N(z) = \frac{1}{2i\pi} \oint \frac{1}{x - z}\, d(F^{\hat R_N} - F_N)(x)
    \]

    with a contour that contains the support $S$, and $X_N(z) = M(m_{\hat R_N}(z) - m_N(z))$. Then Proposition 4 directly implies that for all $p \in \mathbb{N}$, the random vector

    \[
    \left( X_N(z_1), X'_N(z_1), \cdots, X_N(z_p), X'_N(z_p) \right)
    \]

    converges to a centered Gaussian vector, by considering the functions:

    \[
    \left( f_1(x) = \frac{1}{x - z_1},\ f_2(x) = \frac{1}{(x - z_1)^2},\ \cdots,\ f_{2p-1}(x) = \frac{1}{x - z_p},\ f_{2p}(x) = \frac{1}{(x - z_p)^2} \right).
    \]

    Thus the finite-dimensional convergence is achieved.

    The proof of the tightness is based on the Nash-Poincaré inequality ([25] and [26]). In Appendix A, it is proved that for all $\varepsilon > 0$ and all $\ell \in \mathbb{N}$,

    \[
    P\left( \sup_{\lambda \in \mathrm{eig}(\hat R_N)} d(\lambda, S) > \varepsilon \right) = o(N^{-\ell}).
    \]

    Following the same idea as Bai and Silverstein [20, Sections 3 and 4], it is indeed a tight sequence. The details of the proof are in Appendix C. This establishes Lemma 1.

    C. Proof of the tightness

    We will show the tightness of the sequences $M(m_{\hat R_N} - m_N)$ and $M(m'_{\hat R_N} - m'_N)$ by using the Nash-Poincaré inequality [26]. First, write $M(m_{\hat R_N}(z) - m_N(z)) = M^1_N(z) + M^2_N(z)$ with $M^1_N(z) = M(m_{\hat R_N}(z) - E[m_{\hat R_N}(z)])$ and $M^2_N(z) = M(E[m_{\hat R_N}(z)] - m_N(z))$.

    As $\frac{1}{\hat\rho_k - z}$ can diverge when $z$ is close to the real axis, tightness requires some care, and we need a truncated version of the process. More precisely, let $\varepsilon_N$ be a real sequence decreasing to zero satisfying for some $\delta \in\, ]0, 1[$:

    \[
    \varepsilon_N \ge N^{-\delta}.
    \]

    Remark 5: Notice that $X_N(\bar z) = M(m_{\hat R_N}(\bar z) - m_N(\bar z)) = \overline{X_N(z)}$ for $z \in \mathbb{C}^+$. So it suffices to verify the arguments for $z \in \mathbb{C}^+$.

    Denote by $([x_{2k-1}, x_{2k}],\ k = 1, \cdots, L)$ the $k$-th cluster of the support of the limiting spectral measure; and take $l_{2k-1}, l_{2k}$ such that $x_{2k-2} < l_{2k-1} < x_{2k-1}$ and $x_{2k} < l_{2k} < x_{2k+1}$ for $k \in \{1, \ldots, L\}$ with conventions $x_0 = 0$ and $x_{2L+1} = \infty$, i.e., $[l_{2k-1}, l_{2k}]$ only contains the $k$-th cluster. Let $d > 0$. Consider:

    \[
    \mathcal{C}_u = \{ x + id : x \in [l_{2k-1}, l_{2k}] \}
    \]

    and

    \[
    \mathcal{C}_r = \{ l_{2k-1} + iv : v \in [N^{-1}\varepsilon_N, d] \}.
    \]

    Also

    \[
    \mathcal{C}_l = \{ l_{2k} + iv : v \in [N^{-1}\varepsilon_N, d] \}.
    \]

    Then $\mathcal{C}_N = \mathcal{C}_l \cup \mathcal{C}_u \cup \mathcal{C}_r$. The process $\hat M^1_N(\cdot)$ is defined by

    \[
    \hat M^1_N(z) = \begin{cases} M^1_N(z) & \text{for } z \in \mathcal{C}_N, \\ M^1_N(l_{2k} + iN^{-1}\varepsilon_N) & \text{for } x = l_{2k},\ v \in [0, N^{-1}\varepsilon_N], \\ M^1_N(l_{2k-1} + iN^{-1}\varepsilon_N) & \text{for } x = l_{2k-1},\ v \in [0, N^{-1}\varepsilon_N]. \end{cases}
    \]

    This partition of $\mathcal{C}_N$ is identical to that used in [20, Section 1]. With probability one (see [18] and [22]), for all $\varepsilon > 0$,

    \[
    \limsup_{N}\ \sup_{\lambda \in \mathrm{eig}(\hat R_N)} d(\lambda, S_N) < \varepsilon
    \]

    with $d(x, S)$ the Euclidean distance of $x$ to the set $S$. So with probability one, for all $N$ large ([20, page 563]),

    \[
    \left| \int \left( M^1_N(z) - \hat M^1_N(z) \right) dz \right| \le K_1 \varepsilon_N,
    \]

    and

    \[
    \left| \int \left( M^{1\prime}_N(z) - \hat M^{1\prime}_N(z) \right) dz \right| \le K_2 \varepsilon_N
    \]

    for some constants $K_1$ and $K_2$. Both terms converge to zero as $M \to \infty$. Then it suffices to ensure the tightness for $\hat M^1_N(z)$ and $\hat M^{1\prime}_N(z)$.

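    We now prove tightness based on [28, Theorem 13.1], i.e.

    1) Tightness at any point of the contour (here $\mathcal{C}_N$).

    2) Satisfaction of the condition

    \[
    \sup_{N,\, z_1, z_2 \in \mathcal{C}_N} \frac{E|\hat M^1_N(z_1) - \hat M^1_N(z_2)|^2}{|z_1 - z_2|^2} \le K.
    \]

    Condition 1) is achieved by an immediate application of Proposition 4. We now verify the second condition.

    We evaluate $\frac{E|\hat M^1_N(z_1) - \hat M^1_N(z_2)|^2}{|z_1 - z_2|^2}$. Notice that

    \[
    m_{\hat R_N}(z_1) - m_{\hat R_N}(z_2) = \frac{z_1 - z_2}{M} \sum_{i=1}^{N} \frac{1}{(\hat\lambda_i - z_1)(\hat\lambda_i - z_2)} = \frac{z_1 - z_2}{M}\, \mathrm{Tr}\left( D_N^{-1}(z_1) D_N^{-1}(z_2) \right)
    \]

    with $D_N(z) = \hat R_N - z I_N$. We have

    \[
    \frac{\partial}{\partial Y_{i,j}} \left( \frac{m_{\hat R_N}(z_1) - m_{\hat R_N}(z_2)}{z_1 - z_2} \right) = \frac{\partial}{\partial Y_{i,j}} \mathrm{Tr}\, (\hat R_N - z_1 I)^{-1} (\hat R_N - z_2 I)^{-1} = \frac{1}{M} \left[ -Y_N^* D_N^{-2}(z_1) D_N^{-1}(z_2) - Y_N^* D_N^{-1}(z_1) D_N^{-2}(z_2) \right]_{j,i},
    \]

    and

    \[
    \frac{\partial}{\partial \overline{Y}_{i,j}} \left( \frac{m_{\hat R_N}(z_1) - m_{\hat R_N}(z_2)}{z_1 - z_2} \right) = \frac{1}{M} \left[ -D_N^{-2}(z_1) D_N^{-1}(z_2) Y_N - D_N^{-1}(z_1) D_N^{-2}(z_2) Y_N \right]_{i,j}.
    \]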
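The resolvent identity that opens this computation can be checked numerically on a small example. The sketch below uses the normalization $\frac{1}{M}\sum_{i=1}^{N}$ exactly as written in the text; the dimensions, the random matrix and the two points $z_1, z_2$ are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 8, 20
Y = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
R_hat = Y @ Y.conj().T / M
lam = np.linalg.eigvalsh(R_hat)

def m_hat(z):
    # Normalization 1/M over the N sample eigenvalues, as in the displayed identity.
    return np.sum(1.0 / (lam - z)) / M

def D(z):
    return R_hat - z * np.eye(N)

z1, z2 = 0.3 + 0.5j, 1.2 + 0.7j
lhs = m_hat(z1) - m_hat(z2)
rhs_sum = (z1 - z2) / M * np.sum(1.0 / ((lam - z1) * (lam - z2)))
rhs_trace = (z1 - z2) / M * np.trace(np.linalg.inv(D(z1)) @ np.linalg.inv(D(z2)))
print(lhs, rhs_sum, rhs_trace)
```

All three quantities coincide up to rounding, since $\frac{1}{\lambda - z_1} - \frac{1}{\lambda - z_2} = \frac{z_1 - z_2}{(\lambda - z_1)(\lambda - z_2)}$ for each eigenvalue and the two resolvents share the eigenvectors of $\hat R_N$.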

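    Then by the Nash-Poincaré inequality and the fact that $\hat R_N$ is uniformly bounded in spectral norm almost surely, one gets

    \[
    \frac{E|\hat M^1(z_1) - \hat M^1(z_2)|^2}{|z_1 - z_2|^2} \le \frac{C_1}{N} E[\mathrm{Tr}(L_N)] = \frac{C_1}{N} E\left( \mathrm{Tr}(L_N)\, I_{\sup_n d(\hat\lambda_n, S) \le \varepsilon} \right) + \frac{C_1}{N} E\left( \mathrm{Tr}(L_N)\, I_{\sup_n d(\hat\lambda_n, S) > \varepsilon} \right)
    \]

    with

    \[
    L_N = \hat R_N D_N^{-4}(z_1) D_N^{-2}(z_2) + 2 \hat R_N D_N^{-3}(z_1) D_N^{-3}(z_2) + \hat R_N D_N^{-2}(z_1) D_N^{-4}(z_2)
    \]

    and $C_1$ a constant which does not depend on $N$ or $M$. For the first term, $\mathrm{Tr}(L_N)$ is bounded on the set $\sup_n d(\hat\lambda_n, S) \le \varepsilon$. For the second term, since for all $i \in \mathbb{N}$ and all $z \in \mathcal{C}_N$, $\frac{1}{|\hat\lambda_n - z|^i} \le \frac{N^i}{\varepsilon_N^i}$, it follows that

    \[
    \sum_{n=1}^{N} \frac{1}{|\hat\lambda_n - z|^i} \le \frac{N^{i+1}}{\varepsilon_N^i}.
    \]

    Then

    \[
    |\mathrm{Tr}(L_N)| \le O\left( \frac{N^7}{\varepsilon_N^6} \right).
    \]

    As $P(\sup d(\hat\lambda_n, S) \ge \varepsilon) = o(N^{-16})$, taking $\varepsilon_N = N^{-0.01}$, one obtains

    \[
    \left| E\left( \mathrm{Tr}(L_N)\, I_{\sup_n d(\hat\lambda_n, S) > \varepsilon} \right) \right| \le E\left| \mathrm{Tr}(L_N)\, I_{\sup d(\hat\lambda_n, S) > \varepsilon} \right| \le O\left( \frac{N^7}{\varepsilon_N^6} P(\sup d(\hat\lambda_n, S) > \varepsilon) \right) \le O\left( N^{7 + 0.06 - 16} \right) \to 0.
    \]

    The second condition of tightness is achieved.

    For $M^2_N(z)$, following exactly the same method as in [6, Section 9.11], one can show that $M^2_N(z)$ is bounded and forms an equicontinuous family that converges to 0. Hence the tightness for $M(m_{\hat R_N}(z) - m_N(z))$.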

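    The next step is to prove the tightness of $M(m'_{\hat R_N}(z) - m'_N(z))$. We have

    \[
    m'_{\hat R_N}(z_1) - m'_{\hat R_N}(z_2) = \frac{z_1 - z_2}{M} \sum_{i=1}^{N} \frac{2\hat\lambda_i - z_1 - z_2}{(\hat\lambda_i - z_1)^2 (\hat\lambda_i - z_2)^2} = \frac{z_1 - z_2}{M}\, \mathrm{Tr}\left( D_N^{-2}(z_1) D_N^{-2}(z_2) (D_N(z_1) + D_N(z_2)) \right).
    \]

    Following the same method as before, one obtains

    \[
    \frac{\partial}{\partial Y_{ij}} \mathrm{Tr}\, D_N^{-1}(z_1) D_N^{-2}(z_2) = -\frac{1}{M} \left[ Y_N^* D_N^{-2}(z_1) D_N^{-2}(z_2) + 2 Y_N^* D_N^{-1}(z_1) D_N^{-3}(z_2) \right]_{j,i},
    \]

    and

    \[
    \left| \frac{\partial}{\partial Y_{ij}} \mathrm{Tr}\, D_N^{-2}(z_1) D_N^{-2}(z_2) (D_N(z_1) + D_N(z_2)) \right|^2 = \frac{1}{M} \mathrm{Tr}(L_2)
    \]

    with

    \[
    L_2 = 4 \hat R_N \left( 3 D_N^{-4}(z_1) D_N^{-4}(z_2) + 2 D_N^{-3}(z_1) D_N^{-5}(z_2) + 2 D_N^{-5}(z_1) D_N^{-3}(z_2) + D_N^{-2}(z_1) D_N^{-6}(z_2) + D_N^{-6}(z_1) D_N^{-2}(z_2) \right).
    \]

    Then the Nash-Poincaré inequality yields that

    \[
    \mathrm{Var}\, \frac{|\hat M^{1\prime}_N(z_1) - \hat M^{1\prime}_N(z_2)|}{|z_1 - z_2|} \le \frac{C_1}{N} E\left( \mathrm{Tr}(L_2)\, I_{\sup_n d(\hat\lambda_n, S) \le \varepsilon} \right) + \frac{C_1}{N} E\left( \mathrm{Tr}(L_2)\, I_{\sup_n d(\hat\lambda_n, S) > \varepsilon} \right)
    \]

    with $C_1$ the same constant as before. The term $\mathrm{Tr}(L_2)$ is bounded on the set $\sup d(\hat\lambda_n, S) \le \varepsilon$. For the second term, $|\mathrm{Tr}(L_2)| \le O\left( \frac{N^9}{\varepsilon_N^8} \right)$. As $P(\sup d(\hat\lambda_n, S) \ge \varepsilon) = o(N^{-16})$ and $\varepsilon_N = N^{-0.01}$, the proof of the tightness of $M^{1\prime}_N(z)$ is achieved as before.

    The proof of the tightness is completed by verifying that $M^{2\prime}_N(z)$, for $z \in \mathcal{C}_N$, is bounded, forms an equicontinuous family, and converges to 0. We use the same method as for the process $M^2_N(z)$ (see [6, Section 9.11]).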

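    By Formula (9.11.1) in [6, Section 9.11],

    \[
    (E m_{\hat R_N} - m_N) \left( 1 - \frac{\frac{N}{M} \int \frac{m_N t^2\, dF^{R_N}(t)}{(1 + t E m_{\hat R_N})(1 + t m_N)}}{-z + \frac{N}{M} \int \frac{t\, dF^{R_N}(t)}{1 + t E m_{\hat R_N}} - T_N} \right) = E m_{\hat R_N}\, m_N\, T_N \tag{22}
    \]

    where

    \[
    T_N = \frac{N}{M^2} \sum_{j=1}^{M} E \beta_j d_j\, (E m_{\hat R_N})^{-1},
    \]

    \[
    d_j = d_j(z) = -q_j^* R^{1/2} (\hat R_{(j)} - zI)^{-1} (E m_{\hat R_N} R + I)^{-1} R^{1/2} q_j + \frac{1}{M} \mathrm{Tr}\, (E m_{\hat R_N} R + I)^{-1} R (\hat R_N - zI)^{-1},
    \]

    \[
    \beta_j = \frac{1}{1 + \frac{1}{M} y_j^* (\hat R_{(j)} - zI)^{-1} y_j}, \qquad q_j = \frac{1}{\sqrt{N}} x_j, \qquad \hat R_{(j)} = \hat R_N - \frac{1}{M} y_j y_j^*.
    \]

    If one differentiates (22) with respect to $z$, the equation becomes

    \[
    (E m'_{\hat R_N} - m'_N) \left( 1 - \frac{\frac{N}{M} \int \frac{m_N t^2\, dF^{R_N}(t)}{(1 + t E m_{\hat R_N})(1 + t m_N)}}{-z + \frac{N}{M} \int \frac{t\, dF^{R_N}(t)}{1 + t E m_{\hat R_N}} - T_N} \right) + (E m_{\hat R_N} - m_N) \left( 1 - \frac{\frac{N}{M} \int \frac{m_N t^2\, dF^{R_N}(t)}{(1 + t E m_{\hat R_N})(1 + t m_N)}}{-z + \frac{N}{M} \int \frac{t\, dF^{R_N}(t)}{1 + t E m_{\hat R_N}} - T_N} \right)' = E m'_{\hat R_N}\, m_N T_N + E m_{\hat R_N}\, m'_N T_N + E m_{\hat R_N}\, m_N T'_N.
    \]

    In [6, Section 9.11], it is shown that when $N$ tends to infinity,

    1) $\sup_{z \in \mathcal{C}_N} |E m_{\hat R_N}(z) - m(z)| \to 0$ and $\sup_{z \in \mathcal{C}_N} |m_N(z) - m(z)| \to 0$,

    2) $\dfrac{\frac{N}{M} \int \frac{t^2 m_N\, dF^{R_N}(t)}{(1 + t E m_{\hat R_N})(1 + t m_N)}}{-z + \frac{N}{M} \int \frac{t\, dF^{R_N}(t)}{1 + t E m_{\hat R_N}} - T_N}$ converges,

    3) $M^2_N(z) \to 0$, $T_N \to 0$.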


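    With the same method, one can easily show that

    4) $\sup_{z \in \mathcal{C}_N} |E m'_{\hat R_N}(z) - m'(z)| \to 0$,

    5) $\sup_{z \in \mathcal{C}_N} |m'_N(z) - m'(z)| \to 0$,

    6) $\frac{N}{M} \left( \sum_{j=1}^{M} E \beta_j d_j \right)'$ converges.

    With these results, it suffices to show that $T'_N \to 0$ and that $M^{2\prime}_N$ is equicontinuous.

    In [6, Section 9.9], it is shown that for $m, q \in \mathbb{N}$ and non-random $N \times N$ matrices $A_k$, $k = 1, \ldots, m$, and $B_\ell$, $\ell = 1, \ldots, q$, we have

    \[
    \left| E\left( \prod_{k=1}^{m} r_t^* A_k r_t \prod_{\ell=1}^{q} \left( r_t^* B_\ell r_t - M^{-1} \mathrm{Tr}\, R B_\ell \right) \right) \right| \le K M^{-(1 \wedge q)} \prod_{k=1}^{m} \|A_k\| \prod_{\ell=1}^{q} \|B_\ell\|. \tag{23}
    \]

    We have also that for any positive $p$,

    \[
    \max\left( E\|D^{-1}(z)\|^p,\ E\|D_j^{-1}(z)\|^p,\ E\|D_{ij}^{-1}(z)\|^p \right) \le K_p \tag{24}
    \]

    and

    \[
    \sup_{n,\, z \in \mathcal{C}_n} \|(E m_{\hat R_N}(z) R + I)^{-1}\| < \infty \tag{25}
    \]

    where $K_p$ is a constant which depends only on $p$.

    With all these preliminaries, as $T_N \to 0$, by the dominated convergence theorem for derivatives, it suffices to show that $T'_N$ is bounded over $\mathcal{C}_N$. As in [6, Section 9.11], it is sufficient to show that $(f'_M(z))$ is bounded, where

    \[
    f_M(z) = \sum_{j=1}^{M} E\left[ \left( r_j^* D_j^{-1} r_j - M^{-1} \mathrm{Tr}\, D_j^{-1} R \right) \left( r_j^* D_j^{-1} (E m_{\hat R_N} R + I)^{-1} r_j - M^{-1} \mathrm{Tr}\, D_j^{-1} (E m_{\hat R_N} R + I)^{-1} R \right) \right].
    \]

    With the help of (23)-(25), $f'_M(z)$ is indeed bounded in $\mathcal{C}_N$.

    Now we will show that $M^{2\prime}_N$ is equicontinuous. As before, it is sufficient to show that $f''_M(z)$ is bounded. Using (23), we obtain

    \[
    \begin{aligned}
    |f''(z)| \le K M^{-1} \Big[ & \left( E(\mathrm{Tr}\, D_1^{-3} R \bar D_1^{-3} R)\, E(\mathrm{Tr}\, D_1^{-1} (E m_{\hat R_N} R + I)^{-1} R (E \bar m_{\hat R_N} R + I)^{-1} \bar D_1^{-1} R) \right)^{1/2} \\
    & + 2 \left( E(\mathrm{Tr}\, D_1^{-2} R \bar D_1^{-2} R)\, E(\mathrm{Tr}\, D_1^{-2} (E m_{\hat R_N} R + I)^{-1} R (E \bar m_{\hat R_N} R + I)^{-1} \bar D_1^{-2} R) \right)^{1/2} \\
    & + 2 |E m'_{\hat R_N}| \left( E(\mathrm{Tr}\, D_1^{-2} R \bar D_1^{-2} R)\, E(\mathrm{Tr}\, D_1^{-1} (E m_{\hat R_N} R + I)^{-2} R (E \bar m_{\hat R_N} R + I)^{-2} \bar D_1^{-1} R) \right)^{1/2} \\
    & + \left( E(\mathrm{Tr}\, D_1^{-1} R \bar D_1^{-1} R)\, E(\mathrm{Tr}\, D_1^{-3} (E m_{\hat R_N} R + I)^{-1} R (E \bar m_{\hat R_N} R + I)^{-1} \bar D_1^{-3} R) \right)^{1/2} \\
    & + 2 |E m'_{\hat R_N}| \left( E(\mathrm{Tr}\, D_1^{-1} R \bar D_1^{-1} R)\, E(\mathrm{Tr}\, D_1^{-2} (E m_{\hat R_N} R + I)^{-2} R (E \bar m_{\hat R_N} R + I)^{-2} \bar D_1^{-2} R) \right)^{1/2} \\
    & + |E m''_{\hat R_N}| \left( E(\mathrm{Tr}\, D_1^{-1} R \bar D_1^{-1} R)\, E(\mathrm{Tr}\, D_1^{-1} (E m_{\hat R_N} R + I)^{-2} R (E \bar m_{\hat R_N} R + I)^{-2} \bar D_1^{-1} R) \right)^{1/2} \\
    & + |E m'_{\hat R_N}|^2 \left( E(\mathrm{Tr}\, D_1^{-1} R \bar D_1^{-1} R)\, E(\mathrm{Tr}\, D_1^{-1} (E m_{\hat R_N} R + I)^{-3} R (E \bar m_{\hat R_N} R + I)^{-3} \bar D_1^{-1} R) \right)^{1/2} \Big].
    \end{aligned}
    \]

    Thanks to (24) and (25), the right-hand side is indeed bounded. This ends the proof of the tightness.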

    REFERENCES

    [1] V. Plerou, P. Gopikrishnan, B. Rosenow, L. Amaral, T. Guhr, and H. Stanley, "Random matrix approach to cross correlations in financial data," Phys. Rev. E, vol. 65, no. 6, Jun. 2002.

    [2] F. Luo, Y. Yang, J. Zhong, H. Gao, L. Khan, D. K. Thompson, and J. Zhou, "Constructing gene co-expression networks and predicting functions of unknown genes by random matrix theory," BMC Bioinformatics, vol. 8, no. 1, p. 299, 2007.

    [3] A. Kammoun, M. Kharouf, W. Hachem, and J. Najim, "BER and Outage Probability Approximations for LMMSE Detectors on Correlated MIMO channels," Information Theory, vol. 55, no. 10, pp. 4386–4397, 2009.

    [4] R. Couillet, J. W. Silverstein, and M. Debbah, "Eigen-Inference for Energy Estimation of Multiple Sources," IEEE Trans. Inf. Theory, submitted for publication. [Online]. Available: http://arxiv.org/abs/1001.3934

    [5] X. Mestre and M. Lagunas, "Modified Subspace Algorithms for DoA Estimation With Large Arrays," IEEE Trans. Signal Process., vol. 56, no. 2, pp. 598–614, Feb. 2008.

    [6] Z. Bai and J. W. Silverstein, "Spectral Analysis of Large Dimensional Random Matrices," Springer Series in Statistics, 2009.

    [7] R. Couillet and M. Debbah, Random matrix methods for wireless communications, 1st ed. New York, NY, USA: Cambridge University Press, to appear.

    [8] V. L. Girko, "Ten years of general statistical analysis." [Online]. Available: http://www.general-statistical-analysis.girko.freewebspace.com/chapter14.pdf

    [9] J. W. Silverstein and P. L. Combettes, "Signal detection via spectral theory of large dimensional random matrices," IEEE Trans. Signal Process., vol. 40, no. 8, pp. 2100–2105, 1992.

    [10] N. E. Karoui, "Spectrum estimation for large dimensional covariance matrices using random matrix theory," Annals of Statistics, vol. 36, no. 6, pp. 2757–2790, Dec. 2008.

    [11] Ø. Ryan and M. Debbah, "Free deconvolution for signal processing applications," in Proc. IEEE International Symposium on Information Theory (ISIT'07), Nice, France, Jun. 2007, pp. 1846–1850.

    [12] R. Couillet and M. Debbah, "Free deconvolution for OFDM multicell SNR detection," in Proc. IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC'08), Cannes, France, 2008.

    [13] X. Mestre, "On the asymptotic behavior of the sample estimates of eigenvalues and eigenvectors of covariance matrices," IEEE Trans. Signal Process., vol. 56, no. 11, pp. 5353–5368, Nov. 2008.

    [14] X. Mestre, "Improved estimation of eigenvalues of covariance matrices and their associated subspaces using their sample estimates," IEEE Trans. Inf. Theory, vol. 54, no. 11, pp. 5113–5129, Nov. 2008.

    [15] P. Vallet, P. Loubaton, and X. Mestre, "Improved subspace DoA estimation methods with large arrays: The deterministic signals case," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'09), 2009, pp. 2137–2140.

    [16] V. A. Marčenko and L. A. Pastur, "Distributions of eigenvalues for some sets of random matrices," Math USSR-Sbornik, vol. 1, no. 4, pp. 457–483, Apr. 1967.

    [17] J. W. Silverstein and Z. D. Bai, "On the empirical distribution of eigenvalues of a class of large dimensional random matrices," Journal of Multivariate Analysis, vol. 54, no. 2, pp. 175–192, 1995.

    [18] Z. D. Bai and J. W. Silverstein, "No Eigenvalues Outside the Support of the Limiting Spectral Distribution of Large Dimensional Sample Covariance Matrices," Annals of Probability, vol. 26, no. 1, pp. 316–345, Jan. 1998.

    [19] O. Kallenberg, Foundations of Modern Probability, 2nd edition. Springer Verlag New York, 2002.

    [20] Z. D. Bai and J. W. Silverstein, "CLT of linear spectral statistics of large dimensional sample covariance matrices," Annals of Probability, vol. 32, no. 1A, pp. 553–605, 2004.

    [21] J. Marsden and M. Hoffman, Basic Complex Analysis, 3rd ed. New York: Freeman, 1987.

    [22] Z. D. Bai and J. W. Silverstein, "Exact Separation of Eigenvalues of Large Dimensional Sample Covariance Matrices," The Annals of Probability, vol. 27, no. 3, pp. 1536–1555, 1999.

    [23] J. W. Silverstein and S. Choi, "Analysis of the limiting spectral distribution of large dimensional random matrices," Journal of Multivariate Analysis, vol. 54, no. 2, pp. 295–309, 1995.

    [24] F. Cucker and A. Corbalan, "An Alternate Proof of the Continuity of the Roots of a Polynomial," The American Mathematical Monthly, vol. 96, no. 4, pp. 342–345, Apr. 1989.

    [25] P. Vallet, P. Loubaton, and X. Mestre, "Improved subspace estimation for multivariate observations of high dimension: the deterministic signals case," IEEE Trans. Inf. Theory, submitted for publication. [Online]. Available: http://arxiv.org/abs/1002.3234

    [26] W. Hachem, O. Khorunzhy, P. Loubaton, J. Najim, and L. A. Pastur, "A new approach for capacity analysis of large dimensional multi-antenna channels," IEEE Trans. Inf. Theory, vol. 54, no. 9, 2008.

    [27] U. Haagerup, H. Schultz, and S. Thorbjørnsen, "A random matrix approach to the lack of projections in $C^*_{\mathrm{red}}(F_2)$," Advances in Mathematics, vol. 204, no. 1, pp. 1–83, 2006.

    [28] P. Billingsley, Probability and Measure, 3rd ed. Hoboken, NJ: John Wiley & Sons, Inc., 1995.
