arXiv:1108.5266v1 [math.PR] 26 Aug 2011
Fluctuations of an Improved Population Eigenvalue Estimator in Sample Covariance Matrix Models

J. Yao, R. Couillet, J. Najim, M. Debbah

October 16, 2018
Abstract
This article provides a central limit theorem for a consistent estimator of population eigenvalues with large multiplicities, based on sample covariance matrices. The focus is on limited sample size situations, in which the number of available observations is known and comparable in magnitude to the observation dimension. An exact expression, as well as an empirical, asymptotically accurate approximation of the limiting variance, is derived. Simulations are performed that corroborate the theoretical claims. A specific application to wireless sensor networks is developed.
I. INTRODUCTION
Problems of statistical inference based on M independent observations of an N-variate random variable y, with E[y] = 0 and E[yyᴴ] = R_N, have drawn the attention of researchers from many fields for years: portfolio optimization in finance [1], gene coexistence in biostatistics [2], channel capacity in wireless communications [3], power estimation in sensor networks [4], array processing [5], etc.
In particular, retrieving spectral properties of the population covariance matrix R_N, based on the observation of M independent and identically distributed (i.i.d.) samples y⁽¹⁾, …, y⁽ᴹ⁾, is paramount to many questions of general science. If M is large compared to N, then it is known that almost surely
This work was partially supported by Agence Nationale de la Recherche (France), program ANR-07-MDCO-012-01 'Sesame'.
Yao and Najim are with Télécom Paristech, France. {yao,najim}@telecom-paristech.fr; Yao is also with Ecole Normale Supérieure, and Najim with CNRS.
Couillet and Debbah are with Supélec, France. {romain.couillet, merouane.debbah}@supelec.fr; Debbah also holds the Alcatel-Lucent/Supélec Flexible Radio chair, France.
October 16, 2018 DRAFT
http://arxiv.org/abs/1108.5266v1
$\|\hat R_N - R_N\| \to 0$ as $M\to\infty$, for any standard matrix norm, where $\hat R_N$ is the sample covariance matrix
$$\hat R_N \triangleq \frac 1M \sum_{m=1}^M y^{(m)} y^{(m)\mathsf H}\ .$$
However, one cannot always afford a large number of samples, especially
in wireless communications, where the number of available samples often has a size comparable to the dimension of each sample. In order to cope with this issue, random matrix theory [6], [7] has proposed new tools, mainly spurred by the G-estimators of Girko [8]. Other works include convex optimization methods [9], [10] and free probability tools [11], [12]. Many of those estimators are consistent in the sense that they are asymptotically unbiased as M, N grow large at the same rate. Nonetheless, only recently have techniques been unveiled which make it possible to estimate individual eigenvalues and functionals of eigenvectors of R. The main contributor is Mestre [13]-[14], who studies the case where R_N = U_N D_N U_Nᴴ, with D_N diagonal with entries of large multiplicities and U_N with i.i.d. entries. For this model, he provides an estimator for every eigenvalue of R with large multiplicity under some separability condition; see also Vallet et al. [15] and Couillet et al. [4] for more elaborate models.
These estimators, although proven asymptotically unbiased, have nonetheless not been fully characterized in terms of performance statistics. It is in particular fundamental to evaluate the variance of these estimators for not-too-large M, N. The purpose of this article is to study the fluctuations of the population eigenvalue estimator of [14] in the case of structured population covariance matrices. A central limit theorem (CLT) is provided to describe the asymptotic fluctuations of the estimators, with an exact expression for the variance as M, N tend to infinity. An empirical, asymptotically accurate approximation is also derived.
The results are applied in a cognitive radio context, in which we assume the co-existence of a licensed (primary) network and an opportunistic (secondary) network aiming at reusing the bandwidth resources left unoccupied by the primary network. The eigenvalue estimator is used here by secondary users to estimate the transmit power of primary users, while the fluctuations are used to provide a confidence margin on the estimate.
The remainder of the article is structured as follows: In Section II, the system model is introduced and the main results from [13], [14] are recalled. In Section III, the CLT for the estimator in [14] is stated with the asymptotic variance. In Section IV, an empirical approximation for the variance is derived. A cognitive radio application of these results is provided in Section V, with comparative Monte Carlo simulations. Finally, Section VI concludes this article. Technical proofs are postponed to the appendix.
II. ESTIMATION OF THE POPULATION EIGENVALUES
A. Notations
In this paper, the notations s, x, M stand for scalars, vectors and matrices, respectively. As usual, ‖x‖ represents the Euclidean norm of the vector x and ‖M‖ stands for the spectral norm of M. The superscripts (·)ᵀ and (·)ᴴ respectively stand for the transpose and the transpose conjugate; the trace of M is denoted by Tr(M); the mathematical expectation operator, by E. If x is an N × 1 vector, then diag(x) is the N × N matrix whose diagonal elements are the components of x. If z ∈ ℂ, then ℜ(z) and ℑ(z) respectively stand for z's real and imaginary parts, while i stands for √−1; z̄ stands for z's conjugate, and δ_kℓ denotes Kronecker's symbol (whose value is 1 if k = ℓ and 0 otherwise).
If the support S of a probability measure over ℝ is the finite union of closed compact intervals S_k for 1 ≤ k ≤ L, we will refer to each compact interval S_k as a cluster of S.
If Z ∈ ℂᴺˣᴺ is a nonnegative Hermitian matrix with eigenvalues (ξ_i; 1 ≤ i ≤ N), we denote in the sequel by eig(Z) = {ξ_i, 1 ≤ i ≤ N} the set of its eigenvalues and by F^Z the empirical distribution of its eigenvalues (also called the spectral distribution of Z), i.e.:
$$F^{\mathbf Z}(d\lambda) = \frac 1N \sum_{i=1}^N \delta_{\xi_i}(d\lambda)\ ,$$
where δ_x stands for the Dirac probability measure at x.
Convergence in distribution will be denoted by $\xrightarrow{\mathcal D}$, convergence in probability by $\xrightarrow{\mathcal P}$, and almost sure convergence by $\xrightarrow{\text{a.s.}}$.
B. Matrix Model
Consider an N × M matrix X_N = (X_ij) whose entries are i.i.d. random variables with distribution CN(0, 1), i.e. X_ij = U + iV, where U, V are independent real Gaussian random variables N(0, 1/2). Let R_N be an N × N Hermitian matrix with L (L being fixed) distinct eigenvalues ρ₁ < ⋯ < ρ_L with respective multiplicities N₁, ⋯, N_L (notice that $\sum_{i=1}^L N_i = N$). Consider now
$$Y_N = R_N^{1/2} X_N\ .$$
The matrix Y_N = [y₁, ⋯, y_M] is the concatenation of M independent observations, where each observation writes y_i = R_N^{1/2} x_i with X_N = [x₁, ⋯, x_M]. In particular, the (population) covariance matrix of each observation y_i is R_N = E y_i y_iᴴ. In this article, we are interested in recovering
information on R_N based on the observation
$$\hat R_N = \frac 1M\, R_N^{1/2} X_N X_N^{\mathsf H} R_N^{1/2}\ ,$$
which is referred to as the sample covariance matrix.
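The two regimes discussed below can be illustrated numerically. The following is a minimal sketch (our own code, not the authors'): it builds a diagonal R_N with eigenvalues (1, 3, 10), draws Y_N = R_N^{1/2} X_N with i.i.d. CN(0,1) entries, and compares the spectral-norm error of the sample covariance matrix for many versus few samples.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_cov(R_sqrt, M, rng):
    """Return the N x N sample covariance (1/M) R^{1/2} X X^H R^{1/2}."""
    N = R_sqrt.shape[0]
    X = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
    Y = R_sqrt @ X
    return (Y @ Y.conj().T) / M

N = 30
rho = np.repeat([1.0, 3.0, 10.0], N // 3)   # population eigenvalues, equal multiplicity
R_sqrt = np.diag(np.sqrt(rho))              # diagonal R_N: R^{1/2} is the entrywise sqrt

err_many = np.linalg.norm(sample_cov(R_sqrt, 3000, rng) - np.diag(rho), 2)
err_few = np.linalg.norm(sample_cov(R_sqrt, 60, rng) - np.diag(rho), 2)
print(err_many, err_few)
```

As expected, the error is small when M ≫ N but of the order of ‖R_N‖ when M is comparable to N, which is precisely the regime the paper addresses.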
It is in general a complicated task to infer the spectral properties of R_N based on R̂_N for finite N, M. Instead, in the following, we assume that N and M are large, and consider the following asymptotic regime:
Assumption 1 (A1):
$$N, M \to \infty\ ,\quad \text{with}\quad \frac NM \to c \in (0,\infty)\quad \text{and}\quad \frac{N_i}M \to c_i \in (0,\infty)\ ,\quad 1\le i\le L. \tag{1}$$
This assumption will be referred to, for short, as N, M → ∞.
Assumption 2 (A2):
We assume that the limiting support S of the eigenvalue distribution of $\hat R_N$ is formed of L compact disjoint subsets (cf. Figure 1). Following [14], one can also reformulate this condition in a more analytic manner: the limiting support of $\hat R_N$ is formed of L clusters if and only if, for every i ∈ {1, …, L}, $\inf_N\{\frac MN - \Psi_N(i)\} > 0$, where
$$\Psi_N(i) = \begin{cases} \dfrac 1N \sum_{r=1}^L N_r \left(\dfrac{\rho_r}{\rho_r-\alpha_1}\right)^2 & i = 1,\\[2mm] \max\left\{ \dfrac 1N \sum_{r=1}^L N_r \left(\dfrac{\rho_r}{\rho_r-\alpha_{i-1}}\right)^2,\ \dfrac 1N \sum_{r=1}^L N_r \left(\dfrac{\rho_r}{\rho_r-\alpha_i}\right)^2 \right\} & 1 < i < L,\\[2mm] \dfrac 1N \sum_{r=1}^L N_r \left(\dfrac{\rho_r}{\rho_r-\alpha_{L-1}}\right)^2 & i = L, \end{cases}$$
where α₁ ≤ ⋯ ≤ α_{L−1} are the L − 1 different ordered solutions of the equation
$$\frac 1N \sum_{r=1}^L \frac{N_r\, \rho_r^2}{(\rho_r - x)^3} = 0\ .$$
This condition is also called the separability condition.
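The separability condition can be checked numerically. The sketch below (our own code, under the setting of Figure 1: ρ = (1, 3, 10), N₁ = N₂ = N₃ = 20, M = 600) finds the α_i by bisection — the left-hand side of the equation above is increasing on each gap (ρ_i, ρ_{i+1}), going from −∞ to +∞, so there is exactly one root per gap — and then evaluates Ψ_N(i) against M/N.

```python
import numpy as np

rho = np.array([1.0, 3.0, 10.0])
Nr = np.array([20, 20, 20])
N, M, L = Nr.sum(), 600, len(rho)

def f(x):
    # (1/N) sum_r N_r rho_r^2 / (rho_r - x)^3, increasing on each gap (rho_i, rho_{i+1})
    return np.sum(Nr * rho**2 / (rho - x) ** 3) / N

def bisect(lo, hi, tol=1e-12):
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

alphas = [bisect(rho[i] + 1e-9, rho[i + 1] - 1e-9) for i in range(L - 1)]

def psi_at(alpha):
    return np.sum(Nr * (rho / (rho - alpha)) ** 2) / N

vals = [psi_at(a) for a in alphas]
Psi = [vals[0]] + [max(vals[i - 1], vals[i]) for i in range(1, L - 1)] + [vals[-1]]
separable = all(M / N > p for p in Psi)
print(alphas, Psi, separable)
```

For this setting M/N = 10 comfortably exceeds all Ψ_N(i), consistent with the three well-separated clusters in Figure 1.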
Figure 1 depicts the eigenvalues of a realization of the random matrix $\hat R_N$ and the associated limiting distribution as N, M grow large, for ρ₁ = 1, ρ₂ = 3, ρ₃ = 10 and N = 60 with equal multiplicities.
C. Mestre’s Estimator of the population eigenvalues
In [14], an estimator of the population covariance matrix eigenvalues(ρk; 1 ≤ k ≤ L) based on theobservationŝRN is proposed.
Fig. 1. Empirical and asymptotic eigenvalue distribution of $\hat R_N$ for L = 3, ρ₁ = 1, ρ₂ = 3, ρ₃ = 10, N/M = c = 0.1, N = 60, N₁ = N₂ = N₃ = 20.
Theorem 1 ([14]): Denote by $\hat\lambda_1 \le \cdots \le \hat\lambda_N$ the ordered eigenvalues of $\hat R_N$. Let M, N → ∞ in the sense of assumption (A1). Under assumptions (A1)-(A2), the following convergence holds true:
$$\hat\rho_k - \rho_k \xrightarrow[M,N\to\infty]{\text{a.s.}} 0\ , \tag{2}$$
where
$$\hat\rho_k = \frac M{N_k} \sum_{m\in\mathcal N_k} \big( \hat\lambda_m - \hat\mu_m \big)\ , \tag{3}$$
with $\mathcal N_k = \{\sum_{j=1}^{k-1} N_j + 1, \ldots, \sum_{j=1}^k N_j\}$ and $\hat\mu_1 \le \cdots \le \hat\mu_N$ the (real and) ordered solutions of:
$$\frac 1N \sum_{m=1}^N \frac{\hat\lambda_m}{\hat\lambda_m - \mu} = \frac MN\ . \tag{4}$$
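The estimator (3)-(4) can be implemented in a few lines. In this sketch (our own code, not the authors'), the μ̂ solving (4) are obtained as the eigenvalues of $\operatorname{diag}(\hat\lambda) - \frac 1M \sqrt{\hat\lambda}\sqrt{\hat\lambda}^{\mathsf T}$: by the rank-one determinant lemma, the characteristic polynomial of that matrix vanishes at μ exactly when $\frac 1N\sum_m \hat\lambda_m/(\hat\lambda_m-\mu) = \frac MN$, and its eigenvalues are automatically real and ordered.

```python
import numpy as np

rng = np.random.default_rng(1)
rho, mult = np.array([1.0, 3.0, 10.0]), 20     # setting of Figure 1
N, M = 3 * mult, 600

R_sqrt = np.diag(np.sqrt(np.repeat(rho, mult)))
X = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
Y = R_sqrt @ X
lam = np.sort(np.linalg.eigvalsh((Y @ Y.conj().T) / M))   # lambda-hat, ascending

# mu-hat: the N real solutions of (4), via the rank-one "secular" matrix
s = np.sqrt(lam)
mu = np.sort(np.linalg.eigvalsh(np.diag(lam) - np.outer(s, s) / M))

# rho-hat_k = (M/N_k) * sum over the k-th group N_k of (lambda-hat - mu-hat)
rho_hat = np.array([(M / mult) * np.sum((lam - mu)[k * mult:(k + 1) * mult])
                    for k in range(3)])
print(rho_hat)
```

With N = 60 and M = 600, the three estimates land close to (1, 3, 10), as Theorem 1 predicts.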
D. Integral Representation of the Estimator ρ̂_k - Stieltjes Transforms
The proof of Theorem 1 relies on random matrix theory; in particular, [16], [17] use the Stieltjes transform as a key ingredient.
The Stieltjes transform m_P of a probability distribution P over ℝ⁺ is the ℂ-valued function defined by:
$$m_{\mathbb P}(z) = \int_{\mathbb R^+} \frac{\mathbb P(d\lambda)}{\lambda - z}\ , \qquad z \in \mathbb C\setminus\mathbb R^+\ .$$
There also exists an inverse formula to recover the probability distribution associated with a Stieltjes transform: let a < b be two continuity points of the cumulative distribution function associated with P; then
$$\mathbb P([a,b]) = \frac 1\pi \lim_{y\downarrow 0}\ \Im\left[ \int_a^b m_{\mathbb P}(x+iy)\, dx \right]\ .$$
In the case where F^Z is the spectral distribution associated with a nonnegative Hermitian matrix Z ∈ ℂᴺˣᴺ with eigenvalues (ξ_i; 1 ≤ i ≤ N), the Stieltjes transform m_Z of F^Z takes the particular form
$$m_{\mathbf Z}(z) = \int \frac{F^{\mathbf Z}(d\lambda)}{\lambda - z} = \frac 1N \sum_{i=1}^N \frac 1{\xi_i - z} = \frac 1N \operatorname{Tr}\,(\mathbf Z - z I_N)^{-1}\ ,$$
which can be seen as the normalized trace of the resolvent (Z − zI_N)⁻¹. Since the seminal paper of Marčenko and Pastur [16], the Stieltjes transform has proved to be extremely efficient at describing the limiting spectrum of large dimensional random matrices.
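The equality between the eigenvalue average and the normalized trace of the resolvent is immediate to check numerically; a small sketch (our own code):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 50)) + 1j * rng.standard_normal((50, 50))
Z = A @ A.conj().T / 50                      # nonnegative Hermitian matrix
xi = np.linalg.eigvalsh(Z)
z = 1.5 + 0.1j                               # any z off the positive real axis

m_from_eigs = np.mean(1.0 / (xi - z))        # (1/N) sum 1/(xi_i - z)
m_from_resolvent = np.trace(np.linalg.inv(Z - z * np.eye(50))) / 50
print(m_from_eigs, m_from_resolvent)
```

Both expressions agree to machine precision, and ℑ m_Z(z) > 0 whenever ℑ z > 0, as for any Stieltjes transform.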
In the following, we recall some elements of the proof of Theorem 1, necessary for the remainder of
the article. The first important result is due to Bai and Silverstein [17] (see also [16]).
Theorem 2 ([17]): Denote by F^R the limiting spectral distribution of R_N, i.e. $F^{\mathbf R}(d\lambda) = \sum_{k=1}^L \frac{c_k}{c}\, \delta_{\rho_k}(d\lambda)$. Under assumption (A1), the spectral distribution $F^{\hat R_N}$ of the sample covariance matrix $\hat R_N$ converges (weakly and almost surely) to a probability distribution F as M, N → ∞, whose Stieltjes transform m(z) satisfies
$$m(z) = \frac 1c\, \underline m(z) - \left(1 - \frac 1c\right)\frac 1z\ ,$$
for z ∈ ℂ⁺ = {z ∈ ℂ : ℑ(z) > 0}, where $\underline m(z)$ is defined as the unique solution in ℂ⁺ of
$$\underline m(z) = -\left( z - c \int \frac{t}{1 + t\,\underline m(z)}\, dF^{\mathbf R}(t) \right)^{-1}.$$
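The fixed-point equation of Theorem 2 can be solved numerically for any z ∈ ℂ⁺. Below is an illustrative sketch (our own code, not the paper's algorithm): plain damped iteration of the map, for the discrete F^R of Figure 1, followed by a residual check and recovery of m(z).

```python
import numpy as np

rho = np.array([1.0, 3.0, 10.0])
weights = np.array([1 / 3, 1 / 3, 1 / 3])    # c_k / c
c = 0.1                                      # N/M

def m_under(z, iters=5000):
    """Damped fixed-point iteration for underline-m(z), z in C+."""
    m = -1.0 / z
    for _ in range(iters):
        m_new = -1.0 / (z - c * np.sum(weights * rho / (1.0 + rho * m)))
        m = 0.5 * m + 0.5 * m_new
    return m

z = 3.0 + 1.0j
mz = m_under(z)
resid = mz + 1.0 / (z - c * np.sum(weights * rho / (1.0 + rho * mz)))
m_F = mz / c - (1 - 1 / c) / z               # Stieltjes transform of F
print(mz, m_F, abs(resid))
```

The residual confirms that the fixed point is reached; both transforms have positive imaginary part on ℂ⁺, as they must.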
Note that $\underline m(z)$ is also a Stieltjes transform, whose associated distribution function will be denoted $\underline F$; it turns out to be the limit of the spectral distribution $F^{\hat{\underline R}_N}$, where $\hat{\underline R}_N$ is defined as
$$\hat{\underline R}_N \triangleq \frac 1M\, X_N^{\mathsf H} R_N X_N\ .$$
Denote by $m_{\hat R_N}(z)$ and $m_{\hat{\underline R}_N}(z)$ the Stieltjes transforms of $F^{\hat R_N}$ and $F^{\hat{\underline R}_N}$. Notice in particular that
$$m_{\hat R_N}(z) = \frac MN\, m_{\hat{\underline R}_N}(z) - \left(1 - \frac MN\right)\frac 1z\ .$$
Remark 1: This relation, combined with (4), readily implies that $m_{\hat{\underline R}_N}(\hat\mu_i) = 0$. Otherwise stated, the $\hat\mu_i$'s are the zeros of $m_{\hat{\underline R}_N}$. This fact will be of importance in the sequel.
Denote by m_N(z) and $\underline m_N(z)$ the finite-dimensional counterparts of m(z) and $\underline m(z)$, respectively, defined by the relations
$$\underline m_N(z) = -\left( z - \frac NM \int \frac{t}{1 + t\,\underline m_N(z)}\, dF^{\mathbf R_N}(t) \right)^{-1},$$
$$m_N(z) = \frac MN\, \underline m_N(z) - \left(1 - \frac MN\right)\frac 1z\ .$$
It can be shown that m_N and $\underline m_N$ are Stieltjes transforms of given probability measures F_N and $\underline F_N$, respectively (cf. [7, Theorem 3.2]).
With these notations at hand, we can now derive Theorem 1. By Cauchy's formula, write
$$\rho_k = \frac N{N_k}\, \frac 1{2i\pi} \oint_{\Gamma_k} \left( \frac 1N \sum_{r=1}^L \frac{N_r\, w}{\rho_r - w} \right) dw\ ,$$
where Γ_k is a negatively oriented contour taking values in ℂ∖{ρ₁, ⋯, ρ_L} and enclosing only ρ_k. With the change of variable $w = -1/\underline m_N(z)$ and the condition that the limiting support S of the eigenvalue distribution of $\hat R_N$ is formed of L distinct clusters (S_k, 1 ≤ k ≤ L) (cf. Figure 1), we can write
$$\rho_k = \frac M{2i\pi N_k} \oint_{\mathcal C_k} z\, \frac{\underline m_N'(z)}{\underline m_N(z)}\, dz\ , \qquad 1\le k\le L\ , \tag{5}$$
where C_k and C_ℓ denote negatively oriented contours enclosing the corresponding clusters S_k and S_ℓ, respectively. Defining
$$\hat\rho_k \triangleq \frac M{2\pi i N_k} \oint_{\mathcal C_k} z\, \frac{m_{\hat{\underline R}_N}'(z)}{m_{\hat{\underline R}_N}(z)}\, dz\ , \qquad 1\le k\le L\ , \tag{6}$$
dominated convergence arguments ensure that ρ_k − ρ̂_k → 0, almost surely. The integral form of ρ̂_k can then be computed explicitly by residue calculus, and this finally yields (3).
The main objective of this article is to study the performance of the estimators (ρ̂_k, 1 ≤ k ≤ L). More precisely, we will establish a central limit theorem (CLT) for (M(ρ̂_k − ρ_k), 1 ≤ k ≤ L) as M, N → ∞, explicitly characterize the limiting covariance matrix Θ = (Θ_kℓ)_{1≤k,ℓ≤L}, and finally provide an estimator for Θ.
III. FLUCTUATIONS OF THE POPULATION EIGENVALUE ESTIMATORS
A. The Central Limit Theorem
The main result of this article is the following CLT, which expresses the fluctuations of (ρ̂_k, 1 ≤ k ≤ L).
Theorem 3: Under assumptions (A1)-(A2) and with the same notations,
$$\big( M(\hat\rho_k - \rho_k),\ 1\le k\le L \big) \xrightarrow[M,N\to\infty]{\mathcal D} \mathbf x \sim \mathcal N_L(0, \Theta)\ ,$$
where N_L refers to a real L-dimensional Gaussian distribution, and Θ is the L × L matrix whose entries Θ_kℓ are given by (7), where C_k and C_ℓ are defined as before (cf. formula (5)).
$$\Theta_{k\ell} = -\frac 1{4\pi^2 c_k c_\ell} \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} \left[ \frac{\underline m'(z_1)\,\underline m'(z_2)}{\big(\underline m(z_1) - \underline m(z_2)\big)^2} - \frac 1{(z_1 - z_2)^2} \right] \frac{dz_1\, dz_2}{\underline m(z_1)\,\underline m(z_2)}\ . \tag{7}$$
B. Proof of Theorem 3
We first outline the main steps of the proof and then provide the details.
Using the integral representations of ρ̂_k and ρ_k, we get, almost surely,
$$M(\hat\rho_k - \rho_k) = \frac{M^2}{2\pi i N_k} \oint_{\mathcal C_k} z \left( \frac{m_{\hat{\underline R}_N}'(z)}{m_{\hat{\underline R}_N}(z)} - \frac{\underline m_N'(z)}{\underline m_N(z)} \right) dz\ .$$
Denote by C(C_k, ℂ) the set of continuous functions from C_k to ℂ, endowed with the supremum norm ‖u‖_∞ = sup_{C_k} |u|. Consider the process
$$(X_N, X_N', u_N, u_N') : \mathcal C_k \to \mathbb C^4$$
where
$$X_N(z) = M\big( m_{\hat{\underline R}_N}(z) - \underline m_N(z) \big)\ , \qquad X_N'(z) = M\big( m_{\hat{\underline R}_N}'(z) - \underline m_N'(z) \big)\ ,$$
$$u_N(z) = m_{\hat{\underline R}_N}(z)\ , \qquad u_N'(z) = m_{\hat{\underline R}_N}'(z)\ .$$
Then, due to the 'no eigenvalue' result (cf. [18], see also Proposition 1), each component of (X_N, X_N', u_N, u_N') almost surely belongs to C(C_k, ℂ), and M(ρ̂_k − ρ_k) writes
$$M(\hat\rho_k - \rho_k) = \frac M{2\pi i N_k} \oint_{\mathcal C_k} z \left( \frac{\underline m_N(z)\, X_N'(z) - u_N'(z)\, X_N(z)}{\underline m_N(z)\, u_N(z)} \right) dz \ \overset\triangle=\ \Upsilon_N(X_N, X_N', u_N, u_N')\ ,$$
where
$$\Upsilon_N(x, x', u, u') = \frac M{2\pi i N_k} \oint_{\mathcal C_k} z \left( \frac{\underline m_N(z)\, x'(z) - u'(z)\, x(z)}{\underline m_N(z)\, u(z)} \right) dz\ . \tag{8}$$
If needed, we shall explicitly indicate the dependence on the contour C_k and write Υ_N(x, x', u, u', C_k).
The main idea of the proof of the theorem lies in three steps:
(i) Prove the convergence in distribution of the process (X_N, X_N', u_N, u_N') to a Gaussian process.
(ii) Transfer this convergence to the quantity Υ_N(X_N, X_N', u_N, u_N') with the help of the continuous mapping theorem [19].
(iii) Check that the limit (in distribution) of Υ_N(X_N, X_N', u_N, u_N') is Gaussian, and compute the limiting covariance between Υ_N(X_N, X_N', u_N, u_N', C_k) and Υ_N(X_N, X_N', u_N, u_N', C_ℓ).
Remark 2: Note that the convergence in step (i) is a convergence in distribution at the process level; hence one first has to establish the finite-dimensional convergence of the process and then to prove that the process is tight over C_k. Tightness turns out to be difficult to establish, due to the lack of control over the eigenvalues of $\hat R_N$ whenever the contour crosses the real line. In order to circumvent this issue, we shall introduce, following Bai and Silverstein [20], a process that approximates X_N and X_N'.
Let us now start the proof of Theorem 3.
Lemma 1: Under assumptions (A1)-(A2), the process
$$(X_N, X_N') : \mathcal C_k \to \mathbb C^2$$
converges in distribution to a Gaussian process (X, Y) with zero mean function and covariance functions:
$$\operatorname{cov}\big(X(z), X(\tilde z)\big) = \frac{\underline m'(z)\,\underline m'(\tilde z)}{\big(\underline m(z) - \underline m(\tilde z)\big)^2} - \frac 1{(z - \tilde z)^2} \ \overset\triangle=\ \kappa(z, \tilde z)\ , \tag{9}$$
$$\operatorname{cov}\big(Y(z), X(\tilde z)\big) = \frac{\partial}{\partial z}\kappa(z, \tilde z)\ , \qquad \operatorname{cov}\big(X(z), Y(\tilde z)\big) = \frac{\partial}{\partial \tilde z}\kappa(z, \tilde z)\ , \qquad \operatorname{cov}\big(Y(z), Y(\tilde z)\big) = \frac{\partial^2}{\partial z\,\partial\tilde z}\kappa(z, \tilde z)\ .$$
Lemma 1 is the cornerstone of the proof of Theorem 3. The proof of Lemma 1 is postponed to Appendix B and relies on the following proposition, of independent interest:
Proposition 1: Assume (A1)-(A2), and denote by S the support of the probability distribution associated with the Stieltjes transform m. Then, for every ε > 0 and ℓ ∈ ℕ*,
$$\mathbb P\left( \sup_{\lambda\in\operatorname{eig}(\hat R_N)} d(\lambda, \mathcal S) > \varepsilon \right) = O\left( \frac 1{N^\ell} \right)\ ,$$
where d(λ, S) = inf_{x∈S} |λ − x|. The proof of Proposition 1 is postponed to Appendix A.
As $(u_N, u_N') \xrightarrow[N,M\to\infty]{\text{a.s.}} (\underline m, \underline m')$, a straightforward corollary of Lemma 1 yields the convergence in distribution of (X_N, X_N', u_N, u_N') to the Gaussian process $(X, Y, \underline m, \underline m')$. This concludes the proof of step (i).
We are now in position to transfer the convergence of (X_N, X_N', u_N, u_N') to Υ_N(X_N, X_N', u_N, u_N') via the continuous mapping theorem, whose statement, as expressed in [19], is recalled below.
Proposition 2 (cf. [19, Th. 4.27]): For any metric spaces S₁ and S₂, let ξ, (ξ_n)_{n≥1} be random elements in S₁ with $\xi_n \xrightarrow[n\to\infty]{\mathcal D} \xi$, and consider measurable mappings f, (f_n)_{n≥1} : S₁ → S₂ and a measurable set Γ ⊂ S₁ with ξ ∈ Γ a.s., such that f_n(s_n) → f(s) as s_n → s ∈ Γ. Then $f_n(\xi_n) \xrightarrow[n\to\infty]{\mathcal D} f(\xi)$.
It remains to apply Proposition 2 to the process (X_N, X_N', u_N, u_N') and to the function Υ_N defined in (8). Denote by¹
$$\Upsilon(x, y, v, w) = \frac 1{2\pi i c_k} \oint_{\mathcal C_k} z \left( \frac{\underline m(z)\, y(z) - w(z)\, x(z)}{\underline m(z)\, v(z)} \right) dz\ ,$$
and consider the set
$$\Gamma = \left\{ (x, y, v, w) \in C(\mathcal C_k, \mathbb C)^4\ :\ \inf_{\mathcal C_k} |v| > 0 \right\}\ .$$
Then it is shown in [6, Section 9.12.1] that $\inf_{\mathcal C_k} |\underline m| > 0$, and, by a dominated convergence argument, that (x_N, y_N, v_N, w_N) → (x, y, v, w) ∈ Γ implies Υ_N(x_N, y_N, v_N, w_N) → Υ(x, y, v, w). Therefore, Proposition 2 applies to Υ_N(X_N, X_N', u_N, u_N') and the following convergence holds true:
$$\Upsilon_N(X_N, X_N', u_N, u_N') \xrightarrow[M,N\to\infty]{\mathcal D} \Upsilon(X, Y, \underline m, \underline m')\ ,$$
and step (ii) is established.
It now remains to prove step (iii), i.e. to check the Gaussianity of the random variable $\Upsilon(X, Y, \underline m, \underline m')$ and to compute the covariance between $\Upsilon(X, Y, \underline m, \underline m', \mathcal C_k)$ and $\Upsilon(X, Y, \underline m, \underline m', \mathcal C_\ell)$.
In order to propagate the Gaussianity of the deviations in the integrand of (6) to the deviations of the integral which defines ρ̂_k, it suffices to notice that the integral can be written as the limit of a finite Riemann sum, and that a finite Riemann sum of Gaussian random variables is still Gaussian. Therefore M(ρ̂_k − ρ_k) converges to a Gaussian distribution. As $\inf_{z\in\mathcal C_k} |\underline m(z)| > 0$, a straightforward application of Fubini's theorem, together with the fact that E(X) = E(Y) = 0, yields
$$\mathbb E \oint \left( \frac{z\,\underline m'(z)\, X(z)}{\underline m^2(z)} - \frac{z\, Y(z)}{\underline m(z)} \right) dz = 0\ .$$
It remains to compute the covariance between $\Upsilon(X, Y, \underline m, \underline m', \mathcal C_k)$ and $\Upsilon(X, Y, \underline m, \underline m', \mathcal C_\ell)$ for possibly different contours C_k and C_ℓ. We shall therefore evaluate, for 1 ≤ k, ℓ ≤ L:
$$\Theta_{k\ell} = \mathbb E\big( \Upsilon(X, Y, \underline m, \underline m', \mathcal C_k)\, \Upsilon(X, Y, \underline m, \underline m', \mathcal C_\ell) \big)$$
$$\overset{(a)}= -\frac 1{4\pi^2 c_k c_\ell} \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} z_1 z_2 \left( \frac{\underline m'(z_1)\,\underline m'(z_2)\,\kappa(z_1,z_2)}{\underline m^2(z_1)\,\underline m^2(z_2)} - \frac{\underline m'(z_1)\,\partial_2\kappa(z_1,z_2)}{\underline m^2(z_1)\,\underline m(z_2)} - \frac{\underline m'(z_2)\,\partial_1\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m^2(z_2)} + \frac{\partial^2_{12}\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)} \right) dz_1\, dz_2\ ,$$
¹ As previously, we shall explicitly indicate the dependence on the contour C_k if needed and write Υ(x, x', u, u', C_k).
$$\hat\Theta_{k\ell} = -\frac{M^2}{4\pi^2 N_k N_\ell} \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} \left( \frac{m_{\hat{\underline R}_N}'(z_1)\, m_{\hat{\underline R}_N}'(z_2)}{\big(m_{\hat{\underline R}_N}(z_1) - m_{\hat{\underline R}_N}(z_2)\big)^2} - \frac 1{(z_1 - z_2)^2} \right) \frac{dz_1\, dz_2}{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)}\ . \tag{10}$$
where (a) follows from the fact that $\inf_{z\in\mathcal C_k} |\underline m(z)| > 0$ together with Fubini's theorem, and ∂₁, ∂₂, ∂²₁₂ respectively stand for ∂/∂z₁, ∂/∂z₂ and ∂²/∂z₁∂z₂.
By integration by parts, we obtain
$$\oint \frac{z_1 z_2\, \underline m'(z_2)\,\partial_1\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m^2(z_2)}\, dz_1 = \oint \left( -\frac{z_2\,\underline m'(z_2)\,\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m^2(z_2)} + \frac{z_1 z_2\,\underline m'(z_1)\,\underline m'(z_2)\,\kappa(z_1,z_2)}{\underline m^2(z_1)\,\underline m^2(z_2)} \right) dz_1\ .$$
Similarly,
$$\oint \frac{z_1 z_2\, \partial^2_{12}\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_1 = -\oint \frac{z_2\,\partial_2\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_1 + \oint \frac{z_1 z_2\,\underline m'(z_1)\,\partial_2\kappa(z_1,z_2)}{\underline m^2(z_1)\,\underline m(z_2)}\, dz_1\ .$$
Hence
$$\Theta_{k\ell} = -\frac 1{4\pi^2 c_k c_\ell} \left\{ \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} \frac{z_2\,\underline m'(z_2)\,\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m^2(z_2)}\, dz_1\, dz_2 - \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} \frac{z_2\,\partial_2\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_1\, dz_2 \right\}\ .$$
Another integration by parts yields
$$\oint \frac{z_2\,\partial_2\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_2 = -\oint \frac{\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_2 + \oint \frac{z_2\,\underline m'(z_2)\,\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m^2(z_2)}\, dz_2\ .$$
Finally, we obtain
$$\Theta_{k\ell} = -\frac 1{4\pi^2 c_k c_\ell} \oint_{\mathcal C_k}\oint_{\mathcal C_\ell} \frac{\kappa(z_1,z_2)}{\underline m(z_1)\,\underline m(z_2)}\, dz_1\, dz_2\ ,$$
and (7) is established.
IV. ESTIMATION OF THE COVARIANCE MATRIX
Theorem 3 describes the limiting performance of the estimator of Theorem 1, with an exact characterization of its variance. Unfortunately, the variance Θ depends upon unknown quantities. We provide hereafter consistent estimates Θ̂ of Θ based on the observations R̂_N.
$$\hat\Theta_{k\ell} = \frac{M^2}{N_k N_\ell} \left[ \sum_{(i,j)\in\mathcal N_k\times\mathcal N_\ell,\ i\ne j} \frac{-1}{(\hat\mu_i - \hat\mu_j)^2\, m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}'(\hat\mu_j)} + \delta_{k\ell} \sum_{i\in\mathcal N_k} \left( \frac{m_{\hat{\underline R}_N}'''(\hat\mu_i)}{6\, m_{\hat{\underline R}_N}'(\hat\mu_i)^3} - \frac{m_{\hat{\underline R}_N}''(\hat\mu_i)^2}{4\, m_{\hat{\underline R}_N}'(\hat\mu_i)^4} \right) \right]. \tag{11}$$
Theorem 4: Assume that (A1)-(A2) hold true, and recall the definition of Θ_kℓ given in (7). Let Θ̂_kℓ be defined by (11), where (N_k) and (μ̂_k) are defined in Theorem 1. Then
$$\hat\Theta_{k\ell} - \Theta_{k\ell} \xrightarrow{\text{a.s.}} 0$$
as N, M → ∞.
Theorem 4 is useful in practice, as one can simultaneously obtain an estimate ρ̂_k of the values ρ_k and an estimate of the degree of confidence in each ρ̂_k.
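Formula (11) involves only observable quantities and can be implemented directly. A sketch (our own code, under the setting of Figure 1): the derivatives of $m_{\hat{\underline R}_N}$ are explicit, since $m_{\hat{\underline R}_N}^{(p)}(z) = \frac{p!}M \sum_i (\lambda_i - z)^{-(p+1)}$, where the λ_i are the eigenvalues of $\hat R_N$ completed with M − N zeros, and the μ̂_i are computed as in Theorem 1.

```python
import math
import numpy as np

rng = np.random.default_rng(4)
rho, mult = np.array([1.0, 3.0, 10.0]), 20
N, M, L = 3 * mult, 600, 3

X = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2)
Y = np.diag(np.sqrt(np.repeat(rho, mult))) @ X
lam = np.sort(np.linalg.eigvalsh(Y @ Y.conj().T / M))
lam_under = np.concatenate([np.zeros(M - N), lam])   # spectrum of underline-R-hat

s = np.sqrt(lam)                                     # mu-hat: zeros of the transform,
mu = np.sort(np.linalg.eigvalsh(np.diag(lam) - np.outer(s, s) / M))  # via (4)

def d_m(p, z):
    """p-th derivative of the Stieltjes transform of underline-R-hat at z."""
    return math.factorial(p) / M * np.sum(1.0 / (lam_under - z) ** (p + 1))

m1, m2, m3 = (np.array([d_m(p, u) for u in mu]) for p in (1, 2, 3))

Theta_hat = np.zeros((L, L))
for k in range(L):
    for l in range(L):
        gk = np.arange(k * mult, (k + 1) * mult)
        gl = np.arange(l * mult, (l + 1) * mult)
        tot = sum(-1.0 / ((mu[i] - mu[j]) ** 2 * m1[i] * m1[j])
                  for i in gk for j in gl if i != j)
        if k == l:  # the delta_{kl} correction terms of (11)
            tot += np.sum(m3[gk] / (6 * m1[gk] ** 3) - m2[gk] ** 2 / (4 * m1[gk] ** 4))
        Theta_hat[k, l] = M * M / (mult * mult) * tot
print(np.diag(Theta_hat))
```

The resulting matrix is symmetric by construction, and its diagonal entries estimate the variances of the limiting Gaussian fluctuations of Theorem 3.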
Proof: In view of formula (7), and taking into account the fact that $m_{\hat{\underline R}_N}$ and $m_{\hat{\underline R}_N}'$ are consistent estimates of $\underline m$ and $\underline m'$, it is natural to define Θ̂_kℓ by replacing the unknown quantities $\underline m$ and $\underline m'$ in (7) by their empirical counterparts $m_{\hat{\underline R}_N}$ and $m_{\hat{\underline R}_N}'$; hence the definition of Θ̂_kℓ in (10).
The proof of Theorem 4 now breaks down into two steps: the convergence of Θ̂_kℓ to Θ_kℓ, which relies on the definition (10) of Θ̂_kℓ and on a dominated convergence argument, and the effective computation of the integral in (10), which relies on Cauchy's residue theorem [21] and yields (11).
We first address the convergence of Θ̂_kℓ to Θ_kℓ. Due to [18], [22], almost surely, the eigenvalues of R̂_N eventually belong to any ε-blow-up of the support S of the probability measure associated with m, i.e. the set {x ∈ ℝ : d(x, S) < ε}. Hence, if ε is small enough, the distance between these eigenvalues and any z ∈ C_k is eventually uniformly lower-bounded. By [14, Lemma 1], the same result holds true for the zeros of $m_{\hat{\underline R}_N}$ (which are real). In particular, this implies that $m_{\hat{\underline R}_N}$ is eventually uniformly lower-bounded on C_k (if not, then by compactness there would exist z ∈ C_k such that $m_{\hat{\underline R}_N}(z) = 0$, which yields a contradiction, because all the zeros of $m_{\hat{\underline R}_N}$ lie strictly within the contour). With these arguments at hand, one can easily apply the dominated convergence theorem and conclude that, almost surely, Θ̂_kℓ → Θ_kℓ.
We now evaluate the integral (10) by computing the residues of the integrand within C_k and C_ℓ. There are two cases to discuss, depending on whether k ≠ ℓ or k = ℓ. Denote by h(z₁, z₂) the integrand in (10), that is:
$$h(z_1, z_2) = \left( \frac{m_{\hat{\underline R}_N}'(z_1)\, m_{\hat{\underline R}_N}'(z_2)}{\big(m_{\hat{\underline R}_N}(z_1) - m_{\hat{\underline R}_N}(z_2)\big)^2} - \frac 1{(z_1 - z_2)^2} \right) \frac 1{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)}\ . \tag{12}$$
We first consider the case where k ≠ ℓ. In this case, the two integration contours are different, and it can be assumed that they never intersect (so that it can always be assumed that z₁ ≠ z₂). Let z₂ be fixed, and denote by μ̂_i the zeros (labeled in increasing order) of $m_{\hat{\underline R}_N}$; then the computation of the residue Res(h(·, z₂), μ̂_i) of h(·, z₂) at a zero μ̂_i of $m_{\hat{\underline R}_N}$ located within C_k is straightforward and yields:
$$r(z_2) \overset\triangle= \operatorname{Res}\big(h(\cdot, z_2), \hat\mu_i\big) = \left( \frac{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}'(z_2)}{m_{\hat{\underline R}_N}^2(z_2)} - \frac 1{(\hat\mu_i - z_2)^2} \right) \frac 1{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}(z_2)}\ . \tag{13}$$
Similarly, if one computes Res(r, μ̂_j) at a zero μ̂_j of $m_{\hat{\underline R}_N}$ located within C_ℓ, one obtains:
$$\operatorname{Res}(r, \hat\mu_j) = \frac{-1}{(\hat\mu_i - \hat\mu_j)^2\, m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}'(\hat\mu_j)}\ .$$
We then need to consider the residues at points ξ of the set $R_{z_2} = \{z_1 : m_{\hat{\underline R}_N}(z_1) = m_{\hat{\underline R}_N}(z_2) \ne 0,\ z_1 \ne z_2\}$ (if this set is empty, the corresponding residue is zero). Notice that such a ξ is not a pole of $\frac 1{(z_1-z_2)^2}\,\frac 1{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)}$; hence one only needs to study
$$g(z_1, z_2) = \frac{m_{\hat{\underline R}_N}'(z_1)\, m_{\hat{\underline R}_N}'(z_2)}{\big(m_{\hat{\underline R}_N}(z_1) - m_{\hat{\underline R}_N}(z_2)\big)^2}\ \frac 1{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)}$$
for the residue at ξ. By integration by parts, one gets
$$\oint g(z_1, z_2)\, dz_1 = -\oint \frac{m_{\hat{\underline R}_N}'(z_1)\, m_{\hat{\underline R}_N}'(z_2)}{m_{\hat{\underline R}_N}(z_1) - m_{\hat{\underline R}_N}(z_2)}\ \frac{dz_1}{m_{\hat{\underline R}_N}^2(z_1)\, m_{\hat{\underline R}_N}(z_2)}\ .$$
Let $p = \min\{i \in \mathbb N^* : m_{\hat{\underline R}_N}^{(i)}(\xi) \ne 0\}$; then, by a Taylor expansion,
$$m_{\hat{\underline R}_N}(z_1) = m_{\hat{\underline R}_N}(z_2) + \frac{(z_1 - \xi)^p}{p!}\, m_{\hat{\underline R}_N}^{(p)}(\xi) + o\big((z_1-\xi)^p\big)\ ,$$
and
$$m_{\hat{\underline R}_N}'(z_1) = \frac{(z_1 - \xi)^{p-1}}{(p-1)!}\, m_{\hat{\underline R}_N}^{(p)}(\xi) + o\big((z_1-\xi)^{p-1}\big)\ .$$
Hence
$$\operatorname{Res}(g, \xi) = -\,p\, \frac{m_{\hat{\underline R}_N}'(z_2)}{m_{\hat{\underline R}_N}^3(z_2)}\ .$$
As this is the derivative of $\frac{p}{2\, m_{\hat{\underline R}_N}^2(z_2)}$, the integration with respect to z₂ yields zero.
It remains to count the number of zeros within each contour. By [14, Lemma 1], eventually there are exactly as many zeros as eigenvalues within each contour. It has been proved that the contribution of the residues at the points ξ ∈ R_{z₂} is null; hence the result in the case k ≠ ℓ:
$$\hat\Theta_{k\ell} = \frac{M^2}{N_k N_\ell} \sum_{(i,j)\in\mathcal N_k\times\mathcal N_\ell} \frac{-1}{(\hat\mu_i - \hat\mu_j)^2\, m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}'(\hat\mu_j)}\ .$$
We now compute the integral (10) in the case where k = ℓ, and begin with the computation of the residues at the μ̂_i. The definition (13) of r and the computation of Res(r, μ̂_j) still hold true when μ̂_j lies within C_k and differs from μ̂_i. It remains to compute Res(r, μ̂_i), i.e. the residue of the third-order pole of $\frac 1{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}(z_2)\, (\hat\mu_i - z_2)^2}$ at z₂ = μ̂_i. Extracting its Laurent coefficients as z₂ → μ̂_i, we get:
$$\lim_{z_2\to\hat\mu_i} (z_2 - \hat\mu_i)^3\ \frac 1{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}(z_2)\, (\hat\mu_i - z_2)^2} = \frac 1{m_{\hat{\underline R}_N}'^2(\hat\mu_i)}\ ,$$
$$\lim_{z_2\to\hat\mu_i} (z_2 - \hat\mu_i)^2 \left( \frac 1{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}(z_2)\, (\hat\mu_i - z_2)^2} - \frac 1{m_{\hat{\underline R}_N}'^2(\hat\mu_i)\, (z_2 - \hat\mu_i)^3} \right) = -\frac{m_{\hat{\underline R}_N}''(\hat\mu_i)}{2\, m_{\hat{\underline R}_N}'^3(\hat\mu_i)}\ ,$$
and finally
$$\lim_{z_2\to\hat\mu_i} (z_2 - \hat\mu_i) \left( \frac 1{m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}(z_2)\, (\hat\mu_i - z_2)^2} - \frac 1{m_{\hat{\underline R}_N}'^2(\hat\mu_i)\, (z_2 - \hat\mu_i)^3} + \frac{m_{\hat{\underline R}_N}''(\hat\mu_i)}{2\, m_{\hat{\underline R}_N}'^3(\hat\mu_i)\, (z_2 - \hat\mu_i)^2} \right) = \frac{m_{\hat{\underline R}_N}''(\hat\mu_i)^2}{4\, m_{\hat{\underline R}_N}'(\hat\mu_i)^4} - \frac{m_{\hat{\underline R}_N}'''(\hat\mu_i)}{6\, m_{\hat{\underline R}_N}'(\hat\mu_i)^3}\ .$$
Since this term enters r with a minus sign, and since the first term of r, $m_{\hat{\underline R}_N}'(z_2)/m_{\hat{\underline R}_N}^3(z_2)$, is the derivative of $-1/(2 m_{\hat{\underline R}_N}^2(z_2))$ and hence has zero residue, we obtain:
$$\operatorname{Res}(r, \hat\mu_i) = \frac{m_{\hat{\underline R}_N}'''(\hat\mu_i)}{6\, m_{\hat{\underline R}_N}'(\hat\mu_i)^3} - \frac{m_{\hat{\underline R}_N}''(\hat\mu_i)^2}{4\, m_{\hat{\underline R}_N}'(\hat\mu_i)^4}\ .$$
There are two other residues to take into account in the computation of the integral: the residues at the points ξ ∈ R_{z₂}, and the residue at z₁ = z₂. The first case can be handled as before. For z₁ = z₂, the contribution of g(z₁, z₂) is computed exactly as before. It remains to handle the term $\frac 1{(z_1-z_2)^2}\,\frac 1{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)}$ at z₁ = z₂. Integration by parts yields
$$\oint \frac 1{(z_1 - z_2)^2}\ \frac{dz_1}{m_{\hat{\underline R}_N}(z_1)\, m_{\hat{\underline R}_N}(z_2)} = \oint \frac{-m_{\hat{\underline R}_N}'(z_1)}{z_1 - z_2}\ \frac{dz_1}{m_{\hat{\underline R}_N}^2(z_1)\, m_{\hat{\underline R}_N}(z_2)}\ .$$
The residue at z₁ = z₂ is therefore
$$-\frac{m_{\hat{\underline R}_N}'(z_2)}{m_{\hat{\underline R}_N}^3(z_2)}\ .$$
Again, this is the derivative of $\frac 1{2\, m_{\hat{\underline R}_N}^2(z_2)}$, so the integration with respect to z₂ is zero.
Finally, both contributions are null; hence the formula:
$$\hat\Theta_{kk} = \frac{M^2}{N_k^2} \left[ \sum_{(i,j)\in\mathcal N_k^2,\ i\ne j} \frac{-1}{(\hat\mu_i - \hat\mu_j)^2\, m_{\hat{\underline R}_N}'(\hat\mu_i)\, m_{\hat{\underline R}_N}'(\hat\mu_j)} + \sum_{i\in\mathcal N_k} \left( \frac{m_{\hat{\underline R}_N}'''(\hat\mu_i)}{6\, m_{\hat{\underline R}_N}'(\hat\mu_i)^3} - \frac{m_{\hat{\underline R}_N}''(\hat\mu_i)^2}{4\, m_{\hat{\underline R}_N}'(\hat\mu_i)^4} \right) \right].$$
V. PERFORMANCE IN THE CONTEXT OF COGNITIVE RADIOS
We introduce below a practical application of the above result to the telecommunication field of
cognitive radios. Consider a communication network implementing orthogonal code division multiple
access (CDMA) in the uplink, which we refer to as theprimary network. The primary network is
composed ofK transmitters. The data of transmitterk are modulated by thenk orthogonalN -chip
codeswk,1, . . . ,wk,nk ∈ CN . Consider also a secondary network, in sensor mode, that we assumetime-synchronized with the primary network, and whose objective is to determine the distances of the
primary transmitters in order to optimally reuse the frequencies used by the primary transmitters2. From
the viewpoint of the secondary network, primary userk has powerPk. Then, at symbol timem, any
secondary user receives theN -dimensional data vector
y(m) =
K∑
k=1
√
Pk
nk∑
j=1
wk,jx(m)k,j + σn
(m) (14)
with σn(m) ∈ CN an additive white Gaussian noiseCN(0, σ2I) received at timem and x(m)k,j ∈ Cthe signal transmitted by userk on the carrier codej at timem, which we assumeCN(0, 1) as well.
The propagation channel is considered frequency flat on the CDMA transmission bandwidth. We do not
assume that the sensor knowsσ2 neither the vectorswk,j. The secondary users may or may not be aware
of the number of codewords employed by each user.
Equation (14) can be put under the compact form
$$\mathbf y^{(m)} = \mathbf W \mathbf P^{\frac 12} \mathbf x^{(m)} + \sigma \mathbf n^{(m)}$$
with W = [w_{1,1}, …, w_{1,n₁}, w_{2,1}, …, w_{K,n_K}] ∈ ℂᴺˣⁿ, $n \triangleq \sum_{k=1}^K n_k$, P ∈ ℂⁿˣⁿ the diagonal matrix with entry P₁ of multiplicity n₁, P₂ of multiplicity n₂, etc., and P_K of multiplicity n_K, and $\mathbf x^{(m)} = [\mathbf x_1^{(m)\mathsf T}, \ldots, \mathbf x_K^{(m)\mathsf T}]^{\mathsf T} \in \mathbb C^n$, where x_k⁽ᵐ⁾ ∈ ℂⁿᵏ is the column vector with j-th entry x_{k,j}⁽ᵐ⁾.
² The rationale being that far transmitters will not be interfered with by low power communications within the secondary network.
Gathering M successive independent observations, we obtain the matrix Y = [y⁽¹⁾, …, y⁽ᴹ⁾] ∈ ℂᴺˣᴹ given by
$$\mathbf Y = \mathbf W \mathbf P^{\frac 12} \mathbf X + \sigma \mathbf N = \begin{bmatrix} \mathbf W \mathbf P^{\frac 12} & \sigma I_N \end{bmatrix} \begin{bmatrix} \mathbf X \\ \mathbf N \end{bmatrix}$$
where X = [x⁽¹⁾, …, x⁽ᴹ⁾] and N = [n⁽¹⁾, …, n⁽ᴹ⁾].
The y⁽ᵐ⁾ are therefore independent Gaussian vectors with zero mean and covariance $\mathbf R \triangleq \mathbf W \mathbf P \mathbf W^{\mathsf H} + \sigma^2 I_N$. Since the objective is to retrieve the powers P_k, while σ² is known or can be separately estimated, the problem boils down to finding the eigenvalues of WPWᴴ + σ²I_N. However, the sensors only have access to Y, or equivalently to the sample covariance matrix
$$\hat{\mathbf R}_N \triangleq \frac 1M \mathbf Y \mathbf Y^{\mathsf H} = \frac 1M \sum_{m=1}^M \mathbf y^{(m)} \mathbf y^{(m)\mathsf H}\ .$$
Assuming R̂_N conveys a good appreciation of the eigenvalue clustering to the secondary user (as in Figure 1), Theorem 1 enables the detection of the primary transmitters and the estimation of their transmit powers P₁, …, P_K; this boils down to estimating the K largest eigenvalues of WPWᴴ + σ²I_N, i.e. the P_k + σ², and subtracting σ² (optionally estimated from the smallest eigenvalue of WPWᴴ + σ²I_N if n < N). Call P̂_k the estimate of P_k.
Based on these power estimates, the secondary user can determine the optimal coverage for secondary communications that ensures no interference with the primary network. A basic idea, for instance, is to ensure that the closest primary user, i.e. the one with strongest received power, is not interfered with. Our interest is then cast on P_K. Now, since the power estimator is imperfect, it is hazardous for the secondary network to state that user K has power P̂_K, or to add some empirical security margin to P̂_K. The results of Section III partially answer this problem.
Theorems 3 and 4 enable the secondary sensor to evaluate the accuracy of P̂_k. In particular, assume that the cognitive radio protocol allows the secondary network to interfere with the primary network with probability q, and denote by A the value
$$A \triangleq \inf_a \big\{ \mathbb P(P_K - \hat P_K > a) \le q \big\}\ .$$
According to Theorem 3, for N, M large, A is well approximated by $\sqrt{\hat\Theta_{K,K}}\, Q^{-1}(q)/M$, with Q the complementary Gaussian cumulative distribution function. If the secondary users detect a user with power P_K, estimated by P̂_K, then P(P̂_K + A < P_K) ≤ q, and it is safe for the secondary network to assume the worst-case scenario where user K transmits at power $\hat P_K + A \simeq \hat P_K + \sqrt{\hat\Theta_{K,K}}\, Q^{-1}(q)/M$.
In Figure 2, the performance of Theorem 3 is compared against 10,000 Monte Carlo simulations of a scenario with three users, P₁ = 1, P₂ = 3, P₃ = 10, n₁ = n₂ = n₃ = 20, N = 60 and M = 600.
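The confidence margin above reduces to a one-line computation once Θ̂_{K,K} is available from Theorem 4. A minimal sketch (our own code; the value `theta_KK` below is a hypothetical placeholder, not a number from the paper):

```python
import math
from statistics import NormalDist

def margin(theta_KK, M, q):
    """A ~= sqrt(Theta_hat_KK) * Q^{-1}(q) / M, Q the Gaussian tail function."""
    q_inv = NormalDist().inv_cdf(1.0 - q)   # Q^{-1}(q) = Phi^{-1}(1 - q)
    return math.sqrt(theta_KK) * q_inv / M

A = margin(theta_KK=4.0, M=600, q=0.05)     # hypothetical Theta_hat_KK = 4.0
print(A)
```

The secondary network then plans around the worst-case transmit power P̂_K + A; smaller allowed interference probabilities q yield larger margins.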
Fig. 2. Comparison of empirical against theoretical variances, based on Theorem 3, for three users, P₁ = 1, P₂ = 3, P₃ = 10, n₁ = n₂ = n₃ = 20 codes per user, N = 60, M = 600 and SNR = 20 dB. The histogram of the ρ̂_k is plotted against the theoretical limiting distribution.
It appears that the limiting distribution is very accurate for these values of N, M. We also performed simulations to obtain the empirical estimates Θ̂_kk of Θ_kk from Theorem 4, which suggest that Θ̂_kk is an accurate estimator as well.
VI. CONCLUSION
In this article, we derived an exact expression and an approximation of the limiting performance of a statistical inference method that estimates the population eigenvalues of a class of sample covariance matrices. These results are applied in the context of cognitive radios to optimize the secondary network coverage based on measurements of the primary network activity.
APPENDIX
A. Proof of Proposition 1
Let us first begin with considerations related to the supports of the probability distributions associated with m(z) and m_N(z). Denote by S and S_N these supports, and recall that S is the union of L clusters:
$$\mathcal S = (a_1, b_1) \cup \cdots \cup (a_L, b_L)\ .$$
The following proposition clarifies the relations between S_N and S.
Proposition 3: Let N, M → ∞; then, for N large enough, the support S_N of the probability distribution associated with the Stieltjes transform m_N(z) is the union of L clusters:
$$\mathcal S_N = (a_1^N, b_1^N) \cup \cdots \cup (a_L^N, b_L^N)\ .$$
Moreover, the following convergences hold true:
$$a_\ell^N \xrightarrow[N,M\to\infty]{} a_\ell\ , \qquad b_\ell^N \xrightarrow[N,M\to\infty]{} b_\ell\ ,$$
for 1 ≤ ℓ ≤ L.
Remark 3: If the support S_N contains zero (e.g. when N > M), then zero also belongs to the support S, and the conclusion still holds true.
Proof of Proposition 3: Recall the relations:

m̄_N(z) = −( z − (N/M) ∫ t/(1 + t m̄_N(z)) dF^{R_N}(t) )^{−1}    (15)

and

m_N(z) = (M/N) m̄_N(z) − (1 − M/N) (1/z).    (16)

As the inverse Stieltjes transform of −1/z is δ_0 (the Dirac mass at 0) and m̄_N(z) is a continuous function over R*_+, the Stieltjes inversion formula yields, for a, b with 0 < a < b:

F_N([a, b]) = (M/N) F̄_N([a, b]).

So it suffices to study the support S̄_N of F̄_N, the distribution associated with m̄_N.
From the definition of m̄_N(z) (see formula (15)), we obtain:

z_{R_N}(m̄) = −1/m̄ + (N/M) ∫ t dF^{R_N}(t)/(1 + t m̄).

Denote by B = {m ∈ R : m ≠ 0, −m^{−1} ∉ {ρ_1, ···, ρ_L}}. In [23, Theorem 4.1 and Theorem 4.2], Silverstein and Choi show that, for a real number x, x ∈ S̄_N^c (the complement of the support of F̄_N) if and only if m_x ∈ B and

z'_{R_N}(m_x) = 1/m_x² − (N/M) ∫ t² dF^{R_N}(t)/(1 + t m_x)² > 0,

where m_x = m̄_N(x), so that z_{R_N}(m_x) = x.
Then, if a ∈ ∂S̄_N, either m_a ∉ B or z'_{R_N}(m_a) ≤ 0, where m_a = m̄_N(a). We first show that m_a ∈ B. By [23, Theorem 5.1], m_a ≠ 0. If −m_a^{−1} belonged to the support of F^{R_N}, then, as F^{R_N} is discrete, ∫ t² dF^{R_N}(t)/(1 + tm)² → ∞ as m → m_a; hence z'_{R_N} < 0 in a neighborhood to the left and to the right of m_a, which contradicts [23, Theorem 5.1]. Therefore m_a ∈ B, and z'_{R_N}(m_a) ≤ 0. By continuity, we get

z'_{R_N}(m_a) = 1/m_a² − (N/M) ∫ t² dF^{R_N}(t)/(1 + t m_a)² = 0.
This is equivalent to the following equation:

z'_{R_N}(m_a) = 1/m_a² − (1/M) Σ_{i=1}^{L} N_i ρ_i²/(1 + ρ_i m_a)² = 0.    (17)

Multiplying by the common denominator, one gets a polynomial of degree 2L in m_a. We now show that these 2L roots are real. First, notice that
1/m² − (N/M) ∫ t² dF^{R_N}(t)/(1 + tm)² → −∞ as m → −1/ρ_i,

and

z''_{R_N}(m) = −2/m³ + (N/M) ∫ 2t³ dF^{R_N}(t)/(1 + tm)³.
So z''_{R_N}(m) has one and only one zero in the open interval (−1/ρ_i, −1/ρ_{i+1}), for each i ∈ {1, ···, L−1}. Then, for β_i ∈ (−1/ρ_i, −1/ρ_{i+1}) such that z''_{R_N}(β_i) = 0, it suffices to show that z'_{R_N}(β_i) > 0 in order to prove that z'_{R_N}(m) has two zeros in the interval (−1/ρ_i, −1/ρ_{i+1}). Writing β_i = −1/α_i, the separability condition (cf. Assumption (A2)), inf_N {M/N − Ψ_N(i)} > 0, yields

z'_{R_N}(−1/α_i) = α_i² − (N/M) ∫ t² dF^{R_N}(t)/(1 − t/α_i)² = α_i² ( 1 − (1/M) Σ_{r=1}^{L} N_r ρ_r²/(α_i − ρ_r)² ) > 0.
Thus we obtain 2(L − 1) roots. Besides, in the open interval (−ρ_L^{−1}, 0),

1/m² − (N/M) ∫ t² dF^{R_N}(t)/(1 + tm)² → +∞ as m → 0⁻,

so there exists another root in this interval. In the open interval (−∞, −ρ_1^{−1}),

1/m² − (N/M) ∫ t² dF^{R_N}(t)/(1 + tm)² → 0 as m → −∞,

and

1/m² − (N/M) ∫ t² dF^{R_N}(t)/(1 + tm)² ∼ (1/m²)(1 − N/M) > 0 as m → −∞.

Hence the last root lies in this interval. This proves that S̄_N = (a_1^N, b_1^N) ∪ ··· ∪ (a_L^N, b_L^N), and hence the same holds for S_N.
To prove that a_ℓ^N → a_ℓ and b_ℓ^N → b_ℓ as N, M → ∞, notice that the a_i and b_i satisfy the same type of equation, with N/M replaced by c and F^{R_N} by F^R. As N/M → c and N_i/M → c_i, the roots of Equation (17) converge to those of the limiting equation (see [24]). This yields the second conclusion.
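The root-counting argument above can be checked numerically: multiplying (17) by the common denominator and running a polynomial solver recovers the 2L real critical points, which map through z_{R_N} to the support endpoints. The sketch below assumes the parameters ρ = (1, 3, 10), N_i = 20, M = 600 of the simulation section, for which the separability condition holds:

```python
import numpy as np

def support_endpoints(rho, mult, M):
    """Roots of z'_{R_N}(m) = 0 (Eq. (17)), mapped back through z_{R_N}(m).

    Multiplying (17) by m^2 * prod_i (1 + rho_i m)^2 gives a degree-2L
    polynomial; under cluster separation its 2L roots are real and map to
    the support endpoints a_1 < b_1 < ... < a_L < b_L.
    """
    prod_all = np.poly1d([1.0])
    for r in rho:
        prod_all *= np.poly1d([r, 1.0]) ** 2          # (1 + rho_i m)^2
    poly = prod_all
    for i, r in enumerate(rho):
        q = np.poly1d([1.0, 0.0, 0.0])                # m^2
        for j, rj in enumerate(rho):
            if j != i:
                q *= np.poly1d([rj, 1.0]) ** 2
        poly = poly - (mult[i] * r ** 2 / M) * q
    roots = poly.roots
    real = roots[np.abs(roots.imag) < 1e-6].real      # real critical points

    def z(m):  # z_{R_N}(m) = -1/m + (1/M) sum_i N_i rho_i / (1 + rho_i m)
        return -1.0 / m + sum(n * r / (1.0 + r * m) for n, r in zip(mult, rho)) / M

    return np.sort([z(m) for m in real])

ends = support_endpoints([1.0, 3.0, 10.0], [20, 20, 20], 600)
print(ends)  # 2L = 6 increasing positive endpoints
```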
We are now in a position to establish the proof of Proposition 1.

Denote by S(ε) the ε-blow-up of S, i.e. S(ε) = {x ∈ R : d(x, S) < ε}. Let ε > 0 be small enough, and consider a smooth function φ with 0 ≤ φ ≤ 1, equal to zero on S(ε/3), equal to 1 if x ∉ S(ε) and |x| ≤ τ − ε (as we shall see, τ will be chosen large), and equal to zero again if |x| ≥ τ:

φ(x) = 0 if d(x, S) < ε/3,
       1 if d(x, S) > ε and |x| ≤ τ − ε,
       0 if |x| > τ,

and smooth in between. Notice that if N, M → ∞ and N is large enough, then by Proposition 3, φ(x) = 0 for all x ∈ S_N. Now, if Z is an M × M Hermitian matrix with spectral decomposition Z = U diag(γ_i; 1 ≤ i ≤ M) U^H, where U is unitary and diag(γ_i; 1 ≤ i ≤ M) stands for the M × M diagonal matrix whose entries are Z's eigenvalues, write φ(Z) = U diag(φ(γ_i); 1 ≤ i ≤ M) U^H.
We have:

P(sup_n d(λ_n, S) > ε) ≤ P(‖R̂_N‖ > τ − ε) + P(Tr φ(R̂_N) ≥ 1)
                        = P(‖R̂_N‖ > τ − ε) + P([Tr φ(R̂_N)]^p ≥ 1)
                   (a) ≤ P(‖R̂_N‖ > τ − ε) + E[Tr φ(R̂_N)]^p,

for every p ≥ 1, where (a) follows from Markov's inequality. The fact that P(‖R̂_N‖ > τ) = O(N^{−ℓ}) for τ large enough and every ℓ ∈ N* is well known (see for instance [6, Section 9.7]). We shall therefore establish estimates over E[Tr φ(R̂_N)]^p. Take p = 2^k; we prove the following statement by induction: for every k ≥ 0, every integer β < 2^k and every smooth function f with compact support vanishing on S(ε/3),

E( Tr f(R̂_N) )^{2^k} = O( 1/N^β ).

First notice that, due to Proposition 3, ∫_{S_N} f(λ) F_N(dλ) = 0 (where F_N is the probability distribution associated with m_N) for N, M large enough. A minor modification of [25, Lemma 2] (whose model is slightly different), with the help of [26, Proposition 5], yields that for N, M → ∞ and N large enough, E Tr f(R̂_N) = O(N^{−1}), and the property is verified for k = 0.
Let k > 0 be fixed and assume that the result holds true for β < 2^k. We want to show that E[Tr f(R̂_N)]^{2^{k+1}} = O(N^{−2β}). At step k + 1, the expectation writes:

|E[Tr f(R̂_N)]^{2^{k+1}}|    (18)
  = |E( [Tr f(R̂_N)]^{2^k} − E[Tr f(R̂_N)]^{2^k} + E[Tr f(R̂_N)]^{2^k} )²|
  ≤ 2( Var( [Tr f(R̂_N)]^{2^k} ) + |E[Tr f(R̂_N)]^{2^k}|² ).    (19)

The second term of the right-hand side (r.h.s.) can be handled by the induction hypothesis:

|E[Tr f(R̂_N)]^{2^k}|² = O( 1/N^{2β} ).
We now rely on the Poincaré-Nash inequality (see for instance [26, Section II-B]) to handle the first term of the r.h.s. Applying this inequality, we obtain:

Var( (Tr f(R̂_N))^{2^k} ) ≤ K Σ_{i,j} E[ |∂[Tr f(R̂_N)]^{2^k}/∂Y_{i,j}|² + |∂[Tr f(R̂_N)]^{2^k}/∂Ȳ_{i,j}|² ],    (20)
where K is a constant which does not depend on N, M and which is greater than R_N's eigenvalues. In order to compute the derivatives in the r.h.s., we rely on [27, Lemma 4.6]. This yields:

∂[Tr f(R̂_N)]^{2^k}/∂Y_{i,j} = (2^k/M) [Tr f(R̂_N)]^{2^k − 1} [Y_N^* f'(R̂_N)]_{j,i},
∂[Tr f(R̂_N)]^{2^k}/∂Ȳ_{i,j} = (2^k/M) [Tr f(R̂_N)]^{2^k − 1} [f'(R̂_N) Y_N]_{i,j}.
Plugging these derivatives into (20), we obtain:

Var( [Tr f(R̂_N)]^{2^k} )
  ≤ K (2^{2k+1}/M²) E[ (Tr f(R̂_N))^{2^{k+1} − 2} Tr( f'(R̂_N) Y_N Y_N^* f'(R̂_N) ) ]
  = K (2^{2k+1}/M) E[ (Tr f(R̂_N))^{2^{k+1} − 2} Tr( f'(R̂_N)² R̂_N ) ]
  ≤ K (2^{2k+1}/M) |E[Tr f(R̂_N)]^{2^{k+1}}|^{(2^{k+1} − 2)/2^{k+1}} × |E[Tr( f'(R̂_N)² R̂_N )]^{2^k}|^{1/2^k},

where the last inequality is a consequence of Hölder's inequality.
As the function h(λ) = λ[f'(λ)]² satisfies the induction hypothesis, we have, for every α < 1:

|E[Tr( f'(R̂_N)² R̂_N )]^{2^k}|^{1/2^k} = O(N^{−α}).
Plugging this estimate into (18), we obtain:

|E[Tr f(R̂_N)]^{2^{k+1}}| ≤ K (1/N^{1+α}) |E[Tr f(R̂_N)]^{2^{k+1}}|^{(2^{k+1} − 2)/2^{k+1}} + O(N^{−2β}),    (21)
where K is a constant independent of M, N, k. Notice that inequality (21) involves twice the quantity of interest E[Tr f(R̂_N)]^{2^{k+1}} that we want to upper-bound by O(N^{−2β}). We shall proceed iteratively. Notice that Tr f(R̂_N) ≤ sup_{x∈R} |f(x)| × N because f is bounded on R; hence the rough estimate:

E[Tr f(R̂_N)]^{2^{k+1}} = O(N^{2^{k+1}}).

Plugging this into (21) yields:

E[Tr f(R̂_N)]^{2^{k+1}} = O(N^{a_1}),

where a_0 = 2^{k+1} and a_1 = a_0 (2^{k+1} − 2)/2^{k+1} − (1 + α). Iterating the procedure, we obtain:

E[Tr f(R̂_N)]^{2^{k+1}} = O( N^{a_ℓ ∨ (−2β)} ),

where a_ℓ = a_{ℓ−1} (2^{k+1} − 2)/2^{k+1} − (1 + α) and x ∨ y stands for sup(x, y). Now, in order to conclude the proof, it remains to prove that (i) the sequence (a_ℓ) converges to some limit a_∞, and (ii) for some well-chosen α < 1, a_∞ ∈ (−2^{k+1}, −2β). Write:

a_{ℓ+1} + 2^k(1 + α) = ((2^k − 1)/2^k) ( a_ℓ + 2^k(1 + α) ),

hence a_ℓ converges to −2^k(1 + α), which belongs to (−2^{k+1}, −2β) for a well-chosen α ∈ (0, 1). Finally, E[Tr f(R̂_N)]^{2^{k+1}} = O(N^{−2β}), which ends the induction.
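As a quick sanity check of this recursion, the affine map a ↦ a (2^{k+1} − 2)/2^{k+1} − (1 + α) can be iterated numerically from the rough estimate a_0 = 2^{k+1}; the values k = 3, α = 0.5 below are purely illustrative:

```python
# Iterate a_{l+1} = r * a_l - (1 + alpha) with contraction rate
# r = (2^{k+1} - 2) / 2^{k+1} = (2^k - 1) / 2^k, starting from a_0 = 2^{k+1}.
k, alpha = 3, 0.5                 # illustrative values
r = (2 ** (k + 1) - 2) / 2 ** (k + 1)
a = float(2 ** (k + 1))           # a_0
for _ in range(500):
    a = r * a - (1 + alpha)

fixed_point = -(2 ** k) * (1 + alpha)
print(a, fixed_point)             # both converge to -2^k (1 + alpha) = -12.0
```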
It remains to apply this estimate to E[Tr φ(R̂_N)]^{2^k} in order to get the desired result.
B. Proof of Lemma 1
As explained in Section III, there are two conditions to prove (Billingsley [28, Theorem 13.1]):

• finite-dimensional convergence of the process (X_N, X'_N);
• tightness on the contour C_k.

Remark 4: As u_N (resp. u'_N) converges almost surely to u (resp. u') (see Silverstein and Bai [17]), the convergence of the process (X_N, X'_N, u_N, u'_N) is achieved as soon as the convergence of the process (X_N, X'_N) is proved.

In [20], Bai and Silverstein establish a central limit theorem for F^{R̂_N} with complex Gaussian entries X_{ij}. We recall their main result below.

Proposition 4 ([20]): With the notations introduced in Section II, for f_1, ···, f_p analytic on an open region containing the support S,

1) ( N ∫ f_i(x) d(F^{R̂_N} − F_N)(x) )_{1≤i≤p} forms a tight sequence in N,
2) ( N ∫ f_i(x) d(F^{R̂_N} − F_N)(x) )_{1≤i≤p} →^D N(0, V),

where V = (V_{ij}) and

V_{ij} = −(1/4π²) ∮∮ f_i(z_1) f_j(z_2) v_{ij}(z_1, z_2) dz_1 dz_2,

with

v_{ij}(z_1, z_2) = m'(z_1) m'(z_2)/( m(z_1) − m(z_2) )² − 1/(z_1 − z_2)²,

where the integration is over positively oriented contours that circle around the support S.
Now we apply this proposition to show the finite-dimensional convergence. For all z_i ∈ C_k \ R, notice that

m_{R̂_N}(z) − m_N(z) = (1/2iπ) ∮ 1/(x − z) d(F^{R̂_N} − F_N)(x),

where the contour encloses the support S, and X_N(z) = M( m_{R̂_N}(z) − m_N(z) ). Then Proposition 4 directly implies that, for all p ∈ N, the random vector

( X_N(z_1), X'_N(z_1), ···, X_N(z_p), X'_N(z_p) )

converges to a centered Gaussian vector, upon considering the functions

( f_1(x) = 1/(x − z_1), f_2(x) = 1/(x − z_1)², ···, f_{2p−1}(x) = 1/(x − z_p), f_{2p}(x) = 1/(x − z_p)² ).

Thus the finite-dimensional convergence is achieved.
The proof of the tightness is based on the Nash-Poincaré inequality ([25] and [26]). In Appendix A, it is proved that, for all ε > 0 and all ℓ ∈ N,

P( sup_{λ∈eig(R̂_N)} d(λ, S) > ε ) = o(N^{−ℓ}).

Following the same idea as Bai and Silverstein [20, Sections 3 and 4], (X_N, X'_N) is indeed a tight sequence. The details of the proof are given in Appendix C. Thus Lemma 1 is proved.
C. Proof of the tightness
We will show the tightness of the sequences M(m_{R̂_N} − m_N) and M(m'_{R̂_N} − m'_N) by using the Nash-Poincaré inequality [26]. First, write M( m_{R̂_N}(z) − m_N(z) ) = M_N^1(z) + M_N^2(z), with M_N^1(z) = M( m_{R̂_N}(z) − E[m_{R̂_N}(z)] ) and M_N^2(z) = M( E[m_{R̂_N}(z)] − m_N(z) ).
As 1/(ρ̂_k − z) can blow up when z is close to the real axis, tightness becomes problematic there, and we need a truncated version of the process. More precisely, let ε_N be a real sequence decreasing to zero and satisfying, for some δ ∈ (0, 1):

ε_N ≥ N^{−δ}.

Remark 5: Notice that X_N(z̄) = M( m_{R̂_N}(z̄) − m_N(z̄) ) = X̄_N(z) for z ∈ C⁺. So it suffices to verify the arguments for z ∈ C⁺.

Denote by [x_{2k−1}, x_{2k}], k = 1, ···, L, the k-th cluster of the support of the limiting spectral measure, and take l_{2k−1}, l_{2k} such that x_{2k−2} < l_{2k−1} < x_{2k−1} and x_{2k} < l_{2k} < x_{2k+1} for k ∈ {1, ···, L}, with the conventions x_0 = 0 and x_{2L+1} = ∞; i.e., [l_{2k−1}, l_{2k}] only contains the k-th cluster. Let d > 0. Consider:

C_u = { x + id : x ∈ [l_{2k−1}, l_{2k}] },
C_r = { l_{2k−1} + iv : v ∈ [N^{−1}ε_N, d] },
C_l = { l_{2k} + iv : v ∈ [N^{−1}ε_N, d] }.

Then C_N = C_l ∪ C_u ∪ C_r. The process M̂_N^1(·) is defined by

M̂_N^1(z) = M_N^1(z) for z ∈ C_N,
M̂_N^1(l_{2k} + iv) = M_N^1(l_{2k} + iN^{−1}ε_N) for v ∈ [0, N^{−1}ε_N],
M̂_N^1(l_{2k−1} + iv) = M_N^1(l_{2k−1} + iN^{−1}ε_N) for v ∈ [0, N^{−1}ε_N].
This partition of C_N is identical to that used in [20, Section 1]. With probability one (see [18] and [22]), for all ε > 0,

lim sup_{λ∈eig(R̂_N)} d(λ, S_N) < ε,

with d(x, S) the Euclidean distance from x to the set S. So, with probability one, for all N large ([20, page 563]),

| ∮ ( M_N^1(z) − M̂_N^1(z) ) dz | ≤ K_1 ε_N

and

| ∮ ( M_N^{1}′(z) − M̂_N^{1}′(z) ) dz | ≤ K_2 ε_N

for some constants K_1 and K_2. Both terms converge to zero as M → ∞. It then suffices to establish the tightness of M̂_N^1(z) and M̂_N^{1}′(z).
We now prove tightness based on [28, Theorem 13.1], i.e.:

1) tightness at any point of the contour (here C_N);
2) satisfaction of the condition

sup_{N; z_1, z_2 ∈ C_N} E|M̂_N^1(z_1) − M̂_N^1(z_2)|² / |z_1 − z_2|² ≤ K.

Condition 1) is achieved by an immediate application of Proposition 4. We now verify the second condition and evaluate E|M̂_N^1(z_1) − M̂_N^1(z_2)|²/|z_1 − z_2|². Notice that

m_{R̂_N}(z_1) − m_{R̂_N}(z_2) = ((z_1 − z_2)/M) Σ_{i=1}^{N} 1/( (λ̂_i − z_1)(λ̂_i − z_2) ) = ((z_1 − z_2)/M) Tr( D_N^{−1}(z_1) D_N^{−1}(z_2) ),

with D_N(z) = R̂_N − zI_N. We have
∂/∂Y_{i,j} ( (m_{R̂_N}(z_1) − m_{R̂_N}(z_2))/(z_1 − z_2) )
  = (1/M) ∂/∂Y_{i,j} Tr( (R̂_N − z_1 I)^{−1} (R̂_N − z_2 I)^{−1} )
  = (1/M²) [ −Y_N^* D_N^{−2}(z_1) D_N^{−1}(z_2) − Y_N^* D_N^{−1}(z_1) D_N^{−2}(z_2) ]_{j,i},

and

∂/∂Ȳ_{i,j} ( (m_{R̂_N}(z_1) − m_{R̂_N}(z_2))/(z_1 − z_2) )
  = (1/M²) [ −D_N^{−2}(z_1) D_N^{−1}(z_2) Y_N − D_N^{−1}(z_1) D_N^{−2}(z_2) Y_N ]_{i,j}.
Then, by the Nash-Poincaré inequality and the fact that R̂_N is uniformly bounded in spectral norm almost surely, one gets

E|M̂^1(z_1) − M̂^1(z_2)|² / |z_1 − z_2|²
  ≤ (C_1/N) E[ Tr(L_N) ]
  = (C_1/N) E( Tr(L_N) 1_{sup_n d(λ̂_n, S) ≤ ε} ) + (C_1/N) E( Tr(L_N) 1_{sup_n d(λ̂_n, S) > ε} ),

with

L_N = R̂_N D_N^{−4}(z_1) D_N^{−2}(z_2) + 2 R̂_N D_N^{−3}(z_1) D_N^{−3}(z_2) + R̂_N D_N^{−2}(z_1) D_N^{−4}(z_2)
and C_1 a constant which depends neither on N nor on M. For the first term, Tr(L_N) is bounded on the set {sup_n d(λ̂_n, S) ≤ ε}. For the second term, since for all i ∈ N and all z ∈ C_N we have 1/|λ̂_n − z|^i ≤ N^i/ε_N^i, it follows that

Σ_{n=1}^{N} 1/|λ̂_n − z|^i ≤ N^{i+1}/ε_N^i.

Then

|Tr(L_N)| ≤ O( N^7/ε_N^6 ).

As P(sup_n d(λ̂_n, S) ≥ ε) = o(N^{−16}), taking ε_N = N^{−0.01} one obtains

| E( Tr(L_N) 1_{sup_n d(λ̂_n, S) > ε} ) | ≤ E| Tr(L_N) 1_{sup_n d(λ̂_n, S) > ε} | ≤ O( (N^7/ε_N^6) P(sup_n d(λ̂_n, S) > ε) ) ≤ O( N^{7 + 0.06 − 16} ) → 0.

The second condition of tightness is achieved.
For M_N^2(z), following exactly the same method as in [6, Section 9.11], one can show that M_N^2(z) is bounded and forms an equicontinuous family that converges to 0. Hence the tightness of M( m_{R̂_N}(z) − m_N(z) ).
The next step is to prove the tightness of M( m'_{R̂_N}(z) − m'_N(z) ). We have

m'_{R̂_N}(z_1) − m'_{R̂_N}(z_2)
  = ((z_1 − z_2)/M) Σ_{i=1}^{N} (2λ̂_i − z_1 − z_2)/( (λ̂_i − z_1)²(λ̂_i − z_2)² )
  = ((z_1 − z_2)/M) Tr( D_N^{−2}(z_1) D_N^{−2}(z_2) ( D_N(z_1) + D_N(z_2) ) ).
Following the same method as before, one obtains

∂/∂Y_{ij} Tr( D_N^{−1}(z_1) D_N^{−2}(z_2) ) = −(1/M) [ Y_N^* D_N^{−2}(z_1) D_N^{−2}(z_2) + 2 Y_N^* D_N^{−1}(z_1) D_N^{−3}(z_2) ]_{j,i},

and

Σ_{i,j} | ∂/∂Y_{ij} Tr( D_N^{−2}(z_1) D_N^{−2}(z_2) ( D_N(z_1) + D_N(z_2) ) ) |² = (1/M) Tr(L_2),

with

L_2 = 4 R̂_N ( 3 D_N^{−4}(z_1) D_N^{−4}(z_2) + 2 D_N^{−3}(z_1) D_N^{−5}(z_2) + 2 D_N^{−5}(z_1) D_N^{−3}(z_2) + D_N^{−2}(z_1) D_N^{−6}(z_2) + D_N^{−6}(z_1) D_N^{−2}(z_2) ).
Then the Nash-Poincaré inequality yields

E|M̂_N^{1}′(z_1) − M̂_N^{1}′(z_2)|² / |z_1 − z_2|² ≤ (C_1/N) E( Tr(L_2) 1_{sup_n d(λ̂_n, S) ≤ ε} ) + (C_1/N) E( Tr(L_2) 1_{sup_n d(λ̂_n, S) > ε} ),

with C_1 the same constant as before. The term Tr(L_2) is bounded on the set {sup_n d(λ̂_n, S) ≤ ε}. For the second term, |Tr(L_2)| ≤ O( N^9/ε_N^8 ). As P(sup_n d(λ̂_n, S) ≥ ε) = o(N^{−16}) and ε_N = N^{−0.01}, the tightness of M_N^{1}′(z) follows as before.
The proof of the tightness is completed by verifying that M_N^{2}′(z), for z ∈ C_N, is bounded, forms an equicontinuous family, and converges to 0. We use the same method as for the process M_N^2(z) (see [6, Section 9.11]).

By Formula (9.11.1) in [6, Section 9.11],

( E m_{R̂_N} − m_N ) ( 1 − A_N(z) ) = E m_{R̂_N} m_N T_N,    (22)

where

A_N(z) = [ (N/M) ∫ m_N t² dF^{R_N}(t) / ( (1 + t E m_{R̂_N})(1 + t m_N) ) ] / [ −z + (N/M) ∫ t dF^{R_N}(t)/(1 + t E m_{R̂_N}) − T_N ],

and

T_N = (N/M²) Σ_{j=1}^{M} E[β_j d_j] ( E m_{R̂_N} )^{−1},
d_j = d_j(z) = −q_j^* R^{1/2} (R̂_{(j)} − zI)^{−1} ( E m_{R̂_N} R + I )^{−1} R^{1/2} q_j + (1/M) Tr( ( E m_{R̂_N} R + I )^{−1} R (R̂_N − zI)^{−1} ),
β_j = 1 / ( 1 + (1/M) y_j^* (R̂_{(j)} − zI)^{−1} y_j ),
q_j = (1/√N) x_j,
R̂_{(j)} = R̂_N − (1/M) y_j y_j^*.

Differentiating (22) with respect to z, the equation becomes

( E m'_{R̂_N} − m'_N ) ( 1 − A_N(z) ) + ( E m_{R̂_N} − m_N ) ( 1 − A_N(z) )′ = E m'_{R̂_N} m_N T_N + E m_{R̂_N} m'_N T_N + E m_{R̂_N} m_N T'_N.
In [6, Section 9.11], it is shown that, as N tends to infinity:

1) sup_{z∈C_N} |E m_{R̂_N}(z) − m(z)| → 0 and sup_{z∈C_N} |m_N(z) − m(z)| → 0;
2) the fraction appearing in the left-hand side of (22), namely

[ (N/M) ∫ t² m_N dF^{R_N}(t) / ( (1 + t E m_{R̂_N})(1 + t m_N) ) ] / [ −z + (N/M) ∫ t dF^{R_N}(t)/(1 + t E m_{R̂_N}) − T_N ],

converges;
3) M_N^2(z) → 0 and T_N → 0.
With the same method, one can easily show that:

4) sup_{z∈C_N} |E m'_{R̂_N}(z) − m'(z)| → 0;
5) sup_{z∈C_N} |m'_N(z) − m'(z)| → 0;
6) (N/M) ( Σ_{j=1}^{M} E[β_j d_j] )′ converges.

With these results, it suffices to show that T'_N → 0 and that M_N^{2}′ is equicontinuous.
In [6, Section 9.9], it is shown that, for m, q ∈ N, non-random N × N matrices A_k, k = 1, ···, m, and B_ℓ, ℓ = 1, ···, q, we have

| E( Π_{k=1}^{m} r_t^* A_k r_t Π_{ℓ=1}^{q} ( r_t^* B_ℓ r_t − M^{−1} Tr(R B_ℓ) ) ) | ≤ K M^{−(1∧q)} Π_{k=1}^{m} ‖A_k‖ Π_{ℓ=1}^{q} ‖B_ℓ‖.    (23)

We also have, for any positive p,

max( E‖D^{−1}(z)‖^p, E‖D_j^{−1}(z)‖^p, E‖D_{ij}^{−1}(z)‖^p ) ≤ K_p    (24)

and

sup_{N; z∈C_N} ‖( E m_{R̂_N}(z) R + I )^{−1}‖ < ∞,    (25)

where K_p is a constant which depends only on p.
With all these preliminaries, and since T_N → 0, by dominated convergence it suffices to show that T'_N is bounded over C_N. Following [6, Section 9.11], it is in turn sufficient to show that f'_M(z) is bounded, where

f_M(z) = Σ_{j=1}^{M} E[ ( r_j^* D_j^{−1} r_j − M^{−1} Tr(D_j^{−1} R) ) ( r_j^* D_j^{−1} ( E m_{R̂_N} R + I )^{−1} r_j − M^{−1} Tr( D_j^{−1} ( E m_{R̂_N} R + I )^{−1} R ) ) ].

With the help of (23)-(25), f'_M(z) is indeed bounded on C_N.

We finally show that M_N^{2}′ is equicontinuous. Proceeding as before, it is sufficient to show that f''_M(z) is bounded. Using (23), we obtain
|f''_M(z)| ≤ K M^{−1} [
  ( E( Tr D_1^{−3} R D̄_1^{−3} R ) E( Tr D_1^{−1} ( E m_{R̂_N} R + I )^{−1} R ( E m̄_{R̂_N} R + I )^{−1} D̄_1^{−1} R ) )^{1/2}
  + 2 ( E( Tr D_1^{−2} R D̄_1^{−2} R ) E( Tr D_1^{−2} ( E m_{R̂_N} R + I )^{−1} R ( E m̄_{R̂_N} R + I )^{−1} D̄_1^{−2} R ) )^{1/2}
  + 2 |E m'_{R̂_N}| ( E( Tr D_1^{−2} R D̄_1^{−2} R ) E( Tr D_1^{−1} ( E m_{R̂_N} R + I )^{−2} R ( E m̄_{R̂_N} R + I )^{−2} D̄_1^{−1} R ) )^{1/2}
  + ( E( Tr D_1^{−1} R D̄_1^{−1} R ) E( Tr D_1^{−3} ( E m_{R̂_N} R + I )^{−1} R ( E m̄_{R̂_N} R + I )^{−1} D̄_1^{−3} R ) )^{1/2}
  + 2 |E m'_{R̂_N}| ( E( Tr D_1^{−1} R D̄_1^{−1} R ) E( Tr D_1^{−2} ( E m_{R̂_N} R + I )^{−2} R ( E m̄_{R̂_N} R + I )^{−2} D̄_1^{−2} R ) )^{1/2}
  + |E m''_{R̂_N}| ( E( Tr D_1^{−1} R D̄_1^{−1} R ) E( Tr D_1^{−1} ( E m_{R̂_N} R + I )^{−2} R ( E m̄_{R̂_N} R + I )^{−2} D̄_1^{−1} R ) )^{1/2}
  + |E m'_{R̂_N}|² ( E( Tr D_1^{−1} R D̄_1^{−1} R ) E( Tr D_1^{−1} ( E m_{R̂_N} R + I )^{−3} R ( E m̄_{R̂_N} R + I )^{−3} D̄_1^{−1} R ) )^{1/2}
].

Thanks to (24) and (25), the right-hand side is indeed bounded. This ends the proof of the tightness.
REFERENCES

[1] V. Plerou, P. Gopikrishnan, B. Rosenow, L. Amaral, T. Guhr, and H. Stanley, "Random matrix approach to cross correlations in financial data," Phys. Rev. E, vol. 65, no. 6, Jun. 2002.
[2] F. Luo, Y. Yang, J. Zhong, H. Gao, L. Khan, D. K. Thompson, and J. Zhou, "Constructing gene co-expression networks and predicting functions of unknown genes by random matrix theory," BMC Bioinformatics, vol. 8, no. 1, p. 299, 2007.
[3] A. Kammoun, M. Kharouf, W. Hachem, and J. Najim, "BER and outage probability approximations for LMMSE detectors on correlated MIMO channels," IEEE Trans. Inf. Theory, vol. 55, no. 10, pp. 4386-4397, 2009.
[4] R. Couillet, J. W. Silverstein, and M. Debbah, "Eigen-inference for energy estimation of multiple sources," IEEE Trans. Inf. Theory, submitted for publication. [Online]. Available: http://arxiv.org/abs/1001.3934
[5] X. Mestre and M. Lagunas, "Modified subspace algorithms for DoA estimation with large arrays," IEEE Trans. Signal Process., vol. 56, no. 2, pp. 598-614, Feb. 2008.
[6] Z. Bai and J. W. Silverstein, Spectral Analysis of Large Dimensional Random Matrices, Springer Series in Statistics, 2009.
[7] R. Couillet and M. Debbah, Random Matrix Methods for Wireless Communications, 1st ed. New York, NY, USA: Cambridge University Press, to appear.
[8] V. L. Girko, "Ten years of general statistical analysis." [Online]. Available: http://www.general-statistical-analysis.girko.freewebspace.com/chapter14.pdf
[9] J. W. Silverstein and P. L. Combettes, "Signal detection via spectral theory of large dimensional random matrices," IEEE Trans. Signal Process., vol. 40, no. 8, pp. 2100-2105, 1992.
[10] N. El Karoui, "Spectrum estimation for large dimensional covariance matrices using random matrix theory," Annals of Statistics, vol. 36, no. 6, pp. 2757-2790, Dec. 2008.
[11] Ø. Ryan and M. Debbah, "Free deconvolution for signal processing applications," in Proc. IEEE International Symposium on Information Theory (ISIT'07), Nice, France, Jun. 2007, pp. 1846-1850.
[12] R. Couillet and M. Debbah, "Free deconvolution for OFDM multicell SNR detection," in Proc. IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC'08), Cannes, France, 2008.
[13] X. Mestre, "On the asymptotic behavior of the sample estimates of eigenvalues and eigenvectors of covariance matrices," IEEE Trans. Signal Process., vol. 56, no. 11, pp. 5353-5368, Nov. 2008.
[14] ——, "Improved estimation of eigenvalues of covariance matrices and their associated subspaces using their sample estimates," IEEE Trans. Inf. Theory, vol. 54, no. 11, pp. 5113-5129, Nov. 2008.
[15] P. Vallet, P. Loubaton, and X. Mestre, "Improved subspace DoA estimation methods with large arrays: The deterministic signals case," in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP'09), 2009, pp. 2137-2140.
[16] V. A. Marčenko and L. A. Pastur, "Distributions of eigenvalues for some sets of random matrices," Math. USSR-Sbornik, vol. 1, no. 4, pp. 457-483, Apr. 1967.
[17] J. W. Silverstein and Z. D. Bai, "On the empirical distribution of eigenvalues of a class of large dimensional random matrices," Journal of Multivariate Analysis, vol. 54, no. 2, pp. 175-192, 1995.
[18] Z. D. Bai and J. W. Silverstein, "No eigenvalues outside the support of the limiting spectral distribution of large dimensional sample covariance matrices," Annals of Probability, vol. 26, no. 1, pp. 316-345, Jan. 1998.
[19] O. Kallenberg, Foundations of Modern Probability, 2nd ed. New York: Springer-Verlag, 2002.
[20] Z. D. Bai and J. W. Silverstein, "CLT of linear spectral statistics of large dimensional sample covariance matrices," Annals of Probability, vol. 32, no. 1A, pp. 553-605, 2004.
[21] J. Marsden and M. Hoffman, Basic Complex Analysis, 3rd ed. New York: Freeman, 1987.
[22] Z. D. Bai and J. W. Silverstein, "Exact separation of eigenvalues of large dimensional sample covariance matrices," The Annals of Probability, vol. 27, no. 3, pp. 1536-1555, 1999.
[23] J. W. Silverstein and S. Choi, "Analysis of the limiting spectral distribution of large dimensional random matrices," Journal of Multivariate Analysis, vol. 54, no. 2, pp. 295-309, 1995.
[24] F. Cucker and A. Corbalan, "An alternate proof of the continuity of the roots of a polynomial," The American Mathematical Monthly, vol. 96, no. 4, pp. 342-345, Apr. 1989.
[25] P. Vallet, P. Loubaton, and X. Mestre, "Improved subspace estimation for multivariate observations of high dimension: the deterministic signals case," IEEE Trans. Inf. Theory, submitted for publication. [Online]. Available: http://arxiv.org/abs/1002.3234
[26] W. Hachem, O. Khorunzhy, P. Loubaton, J. Najim, and L. A. Pastur, "A new approach for capacity analysis of large dimensional multi-antenna channels," IEEE Trans. Inf. Theory, vol. 54, no. 9, 2008.
[27] U. Haagerup, H. Schultz, and S. Thorbjørnsen, "A random matrix approach to the lack of projections in C*_red(F_2)," Advances in Mathematics, vol. 204, no. 1, pp. 1-83, 2006.
[28] P. Billingsley, Probability and Measure, 3rd ed. Hoboken, NJ: John Wiley & Sons, Inc., 1995.