
• PCA Regularized PCA Results Discussion

Regularized PCA to denoise and visualise data

Marie Verbanck · Julie Josse · François Husson

Laboratoire de statistique, Agrocampus Ouest, Rennes, France

Statlearn, Bordeaux, 9 April 2013

1 / 35

Outline

1 PCA
2 Regularized PCA
  MSE point of view
  Bayesian point of view
3 Results
4 Discussion

Geometrical point of view

• Maximize the variance of the projected points
• Minimize the reconstruction error

⇒ Approximation of X of low rank (S < p): minimize ‖X_{n×p} − X̂_{n×p}‖²

SVD: X̂^PCA = U_{n×S} Λ^{1/2}_{S×S} V′_{p×S} = F_{n×S} V′_{p×S}

F = UΛ^{1/2}: the principal components
V: the principal axes

Alternating least squares (ALS) algorithm for ‖X − FV′‖²:
F = XV(V′V)⁻¹
V = X′F(F′F)⁻¹
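The ALS updates above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code: the random initialization and the fixed iteration count are arbitrary choices. On a matrix of exact rank S the iteration converges to the SVD reconstruction.

```python
import numpy as np

def pca_als(X, S, n_iter=100, seed=0):
    """Alternating least squares for min ||X - F V'||^2 over rank-S F, V."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    V = rng.standard_normal((p, S))           # arbitrary starting axes
    for _ in range(n_iter):
        F = X @ V @ np.linalg.inv(V.T @ V)    # F = X V (V'V)^-1
        V = X.T @ F @ np.linalg.inv(F.T @ F)  # V = X' F (F'F)^-1
    return F, V

# On an exactly rank-2 matrix, F V' recovers X itself.
rng = np.random.default_rng(1)
X = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 10))
F, V = pca_als(X, S=2)
print(np.allclose(F @ V.T, X))  # True
```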

Model in PCA

x_ij = signal + noise

• Random effect model: Factor Analysis, Probabilistic PCA (Lawley, 1953; Roweis, 1998; Tipping & Bishop, 1999)
  ⇒ individuals are exchangeable
  ⇒ relationship between variables

• Fixed effect model (Lawley, 1941; Caussinus, 1986)
  ⇒ individuals are not i.i.d.
  ⇒ study both the individuals and the variables
  • sensory analysis (products - sensory descriptors)
  • plant breeding (genotypes - environments)

Fixed effect model

X_{n×p} = X̃_{n×p} + ε_{n×p}

x_ij = Σ_{s=1}^{S} √d_s q_is r_js + ε_ij,  ε_ij ∼ N(0, σ²)

• Randomness is due to the noise
• Individuals have different expectations: individual parameters
• Inferential framework: as n ↗, the number of parameters ↗
  ⇒ asymptotic regime: σ → 0

• Maximum likelihood estimates (θ̂) = least squares estimates: X̂^PCA = UΛ^{1/2}V′
  EM algorithm = ALS algorithm

⇒ Not necessarily the best in MSE
⇒ Shrinkage estimates: minimize E(φθ̂ − θ)²


Minimizing the MSE

x_ij = x̃_ij + ε_ij = Σ_{s=1}^{S} x̃^(s)_ij + ε_ij

MSE = E[ Σ_{i,j} ( Σ_{s=1}^{min(n−1,p)} φ_s x̂^(s)_ij − x̃_ij )² ]  with  x̂^(s)_ij = √λ_s u_is v_js

MSE = E[ Σ_{i,j} ( Σ_{s=1}^{S} (φ_s x̂^(s)_ij − x̃^(s)_ij)² + Σ_{s=S+1}^{min(n−1,p)} (φ_s x̂^(s)_ij − x̃^(s)_ij)² ) ]

For all s ≥ S + 1, x̃^(s)_ij = 0, therefore φ_{S+1} = … = φ_{min(n−1,p)} = 0.

Minimizing the MSE

MSE = E[ Σ_{i,j} ( Σ_{s=1}^{S} φ_s x̂^(s)_ij − x̃_ij )² ]  with  x̂^(s)_ij = √λ_s u_is v_js

φ_s = Σ_{i,j} E(x̂^(s)_ij) x̃_ij / Σ_{i,j} E(x̂^(s)²_ij)
    = Σ_{i,j} E(x̂^(s)_ij) x̃_ij / Σ_{i,j} ( V(x̂^(s)_ij) + (E(x̂^(s)_ij))² )

When σ → 0: E(x̂^(s)_ij) = x̃^(s)_ij (unbiased); V(x̂^(s)_ij) = σ²/min(n−1, p)
(Pazman & Denis, 1996; Denis & Gower, 1999)

φ_s = Σ_{i,j} x̃^(s)²_ij / ( Σ_{i,j} σ²/min(n−1, p) + Σ_{i,j} x̃^(s)²_ij )

Minimizing the MSE

x_ij = x̃_ij + ε_ij = Σ_{s=1}^{S} √d_s q_is r_js + ε_ij,  ε_ij ∼ N(0, σ²)

MSE = E( Σ_{i,j} ( Σ_{s=1}^{S} φ_s x̂^(s)_ij − x̃_ij )² )  with  x̂^(s)_ij = √λ_s u_is v_js

φ_s = Σ_{i,j} x̃^(s)²_ij / ( Σ_{i,j} σ²/min(n−1, p) + Σ_{i,j} x̃^(s)²_ij )
    = d_s / ( np σ²/min(n−1, p) + d_s )
    = signal variance / total variance

φ̂_s = (λ_s − σ̂²)/λ_s  for s = 1, …, S

Regularized PCA

x̂^PCA_ij = Σ_{s=1}^{S} √λ_s u_is v_js

x̂^rPCA_ij = Σ_{s=1}^{S} ((λ_s − σ̂²)/λ_s) √λ_s u_is v_js = Σ_{s=1}^{S} (√λ_s − σ̂²/√λ_s) u_is v_js

Compromise between hard and soft thresholding:
σ² small → regularized PCA ≈ PCA
σ² large → individuals shrunk toward the center of gravity

σ̂² = Σ_{s=S+1}^{min(n−1,p)} λ_s / (np − p − nS − pS + S² + S)
(degrees of freedom counted from the parameters of (X_{n×p}; U_{n×S}; V_{p×S}); only asymptotically unbiased)
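The shrinkage above is straightforward to apply: take the SVD, replace each of the first S singular values √λ_s by √λ_s − σ̂²/√λ_s, and reconstruct. A minimal sketch, assuming an uncentered data matrix and estimating σ̂² as the mean of the discarded eigenvalues rather than with the degrees-of-freedom correction above:

```python
import numpy as np

def rpca(X, S):
    """Rank-S reconstruction with shrunk singular values (rPCA-style)."""
    n, p = X.shape
    U, sv, Vt = np.linalg.svd(X, full_matrices=False)
    lam = sv**2                                        # lambda_s
    q = min(n - 1, p)
    sigma2 = lam[S:q].mean()                           # crude noise-variance estimate
    shrunk = np.maximum(sv[:S] - sigma2 / sv[:S], 0.0) # sqrt(lam) - sigma2/sqrt(lam)
    return (U[:, :S] * shrunk) @ Vt[:S]

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 10)) \
    + 0.5 * rng.standard_normal((30, 10))
U, sv, Vt = np.linalg.svd(X, full_matrices=False)
X_pca = (U[:, :2] * sv[:2]) @ Vt[:2]   # hard-thresholded (plain PCA) fit
X_rpca = rpca(X, 2)
# Shrinkage pulls the fit toward the origin:
print(np.linalg.norm(X_rpca) < np.linalg.norm(X_pca))  # True
```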

Illustration of the regularization

X^sim_{4×10} = X̃_{4×10} + ε^sim_{4×10},  sim = 1, …, 1000

[Figure: estimated configurations over the 1000 simulations, PCA (left) vs rPCA (right)]

⇒ rPCA: more biased, less variable


Bayesian point of view

Reminder, ridge regression as a MAP estimate:

Y = Xβ + ε,  ε ∼ N(0, σ²),  β ∼ N(0, τ²)

β̂^MAP = (X′X + κI)⁻¹X′Y,  with κ = σ²/τ²
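The MAP estimate can be checked numerically in a few lines; a sketch where the simulated sizes and variances are illustrative choices, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 5
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
Y = X @ beta + 0.5 * rng.standard_normal(n)

sigma2, tau2 = 0.25, 1.0
kappa = sigma2 / tau2                  # kappa = sigma^2 / tau^2
# beta_MAP = (X'X + kappa I)^-1 X'Y
beta_map = np.linalg.solve(X.T @ X + kappa * np.eye(p), X.T @ Y)
beta_ols = np.linalg.solve(X.T @ X, X.T @ Y)
# The N(0, tau^2) prior shrinks the MAP estimate toward 0:
print(np.linalg.norm(beta_map) < np.linalg.norm(beta_ols))  # True
```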

Probabilistic PCA (Roweis, 1998; Tipping & Bishop, 1999)

⇒ A specific case of the exploratory factor analysis model

x_ij = (Z_{n×S} B′_{p×S})_ij + ε_ij,  z_i ∼ N(0, I_S),  ε_i ∼ N(0, σ²I_p)

• Conditional independence: x_i | z_i ∼ N(Bz_i, σ²I_p)
• Marginal distribution of the observations: x_i ∼ N(0, Σ) with Σ = BB′ + σ²I_p
• Explicit solution:
  σ̂² = 1/(p − S) Σ_{s=S+1}^{p} λ_s
  B̂ = V(Λ − σ̂²I_S)^{1/2}

Probabilistic PCA

x_ij = (Z_{n×S} B′_{p×S})_ij + ε_ij,  z_i ∼ N(0, I_S),  ε_i ∼ N(0, σ²I_p)

⇒ Random effect model: no individual parameters
⇒ BLUP E(z_i | x_i):  Ẑ = XB̂(B̂′B̂ + σ²I_S)⁻¹

X̂^pPCA = ẐB̂′

Using B̂ = V(Λ − σ²I_S)^{1/2} and the SVD X = UΛ^{1/2}V′ (so XV = UΛ^{1/2}):

X̂^pPCA = U(Λ − σ²I_S)Λ^{−1/2}V′,  i.e.  x̂^pPCA_ij = Σ_{s=1}^{S} (√λ_s − σ̂²/√λ_s) u_is v_js

Bayesian treatment of the fixed effect model with a prior on z_i:
⇒ regularized PCA
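The identity between the pPCA fitted matrix U(Λ − σ²I_S)Λ^{−1/2}V′ and the shrunk-singular-value form can be verified numerically; a sketch with arbitrary sizes and an arbitrary σ² smaller than the S-th eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((15, 8))
S, sigma2 = 3, 0.1

U, sv, Vt = np.linalg.svd(X, full_matrices=False)
lam = sv[:S] ** 2                       # Lambda: the first S eigenvalues
# Matrix form: U (Lambda - sigma^2 I) Lambda^{-1/2} V'
Xhat1 = (U[:, :S] * ((lam - sigma2) / np.sqrt(lam))) @ Vt[:S]
# Element-wise form: sum_s (sqrt(lam_s) - sigma^2/sqrt(lam_s)) u_is v_js
Xhat2 = (U[:, :S] * (np.sqrt(lam) - sigma2 / np.sqrt(lam))) @ Vt[:S]
print(np.allclose(Xhat1, Xhat2))  # True
```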

Probabilistic PCA

⇒ EM algorithm in pPCA: two ridge regressions

E step: Ẑ = XB̂(B̂′B̂ + σ̂²I_S)⁻¹
M step: B̂ = X′Ẑ(Ẑ′Ẑ + σ̂²Λ⁻¹)⁻¹

⇒ EM algorithm in PCA: two ordinary regressions for ‖X − FV′‖²

F = XV(V′V)⁻¹
V = X′F(F′F)⁻¹

An empirical Bayesian approach

Model: x_ij = Σ_{s=1}^{S} x̃^(s)_ij + ε_ij,  ε_ij ∼ N(0, σ²)

Prior: x̃^(s)_ij ∼ N(0, τ_s²)

Posterior: E(x̃^(s)_ij | x^(s)_ij) = Φ_s x^(s)_ij  with  Φ_s = τ_s² / (τ_s² + σ²/min(n−1, p))

Maximizing the likelihood of x^(s)_ij ∼ N(0, τ_s² + σ²/min(n−1, p)) as a function of τ_s² gives:
τ̂_s² = (1/np) λ_s − σ̂²/min(n−1, p)

Φ̂_s = (λ_s − (np/min(n−1, p)) σ̂²) / λ_s

Other Bayesian points of view

X = X̃ + ε = UDV′ + ε

⇒ Bayesian SVD models: overparametrization issue

• Hoff (2009): uniform priors for U and V (special cases of von Mises–Fisher distributions on the Stiefel manifold); (d_s)_{s=1,…,S} ∼ N(0, s²_λ)

• Josse & Denis (2013): disregarding the constraints
  u_is ∼ N(0, 1),  v_js ∼ N(0, 1),  (d_s)_{s=1,…,S} ∼ N(0, s²_λ)
  ⇒ E(x̃_ij) = 0,  V(x̃_ij) = S s²_λ


Simulation design

The simulated data: X = X̃ + ε

Vary several parameters to construct X̃:
• n/p: 100/20 = 5, 50/50 = 1 and 20/100 = 0.2
• S underlying dimensions: 2, 4, 10
• the ratio d₁/d₂: 4, 1

ε_ij is drawn from a Gaussian distribution with a standard deviation chosen to obtain a signal-to-noise ratio (SNR) of 4, 1 or 0.8
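One way to generate such data is sketched below. The slides do not spell out the exact construction of X̃ or the SNR convention, so the choices here — orthonormal score and axis matrices, and SNR defined as signal standard deviation over noise standard deviation — are assumptions for illustration only:

```python
import numpy as np

def simulate(n=100, p=20, S=2, d_ratio=4.0, snr=0.8, seed=0):
    """Draw X = Xtilde + eps with a rank-S signal and a target SNR."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, S)))  # orthonormal scores q_is
    R, _ = np.linalg.qr(rng.standard_normal((p, S)))  # orthonormal axes r_js
    d = np.linspace(d_ratio, 1.0, S)                  # singular values, d1/dS = d_ratio
    Xtilde = (Q * d) @ R.T
    noise_sd = np.sqrt((Xtilde**2).mean()) / snr      # SNR = signal sd / noise sd (assumed)
    return Xtilde + rng.normal(0.0, noise_sd, (n, p)), Xtilde

X, Xtilde = simulate()
print(X.shape, np.linalg.matrix_rank(Xtilde))  # (100, 20) 2
```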

Soft thresholding and SURE (Candès et al., 2012)

x̂^Soft_ij = Σ_{s=1}^{min(n,p)} (√λ_s − λ)₊ u_is v_js,  solution of  min ‖X − X̃‖² + λ‖X̃‖_*

Choosing the tuning parameter λ?  MSE(λ) = E‖X̃ − SVT_λ(X)‖²

⇒ Stein's Unbiased Risk Estimate (SURE):

SURE(SVT_λ)(X) = −npσ² + Σ_{i=1}^{min(n,p)} min(λ², λ_i) + 2σ² div(SVT_λ)(X)

⇒ λ is obtained by minimizing SURE. Remark: σ̂² is required.
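Soft thresholding of the singular values — the proximal operator of the nuclear norm — is itself a one-liner around the SVD; choosing λ by minimizing SURE then only requires σ̂² and the divergence term above. A sketch of the thresholding step only:

```python
import numpy as np

def svt(X, lam):
    """Singular value soft-thresholding SVT_lambda(X)."""
    U, sv, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(sv - lam, 0.0)) @ Vt

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))
print(np.allclose(svt(X, 0.0), X))                    # True: no shrinkage at lambda = 0
sv_max = np.linalg.svd(X, compute_uv=False).max()
print(np.allclose(svt(X, sv_max), np.zeros_like(X)))  # True: everything thresholded
```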

Simulation study

n/p     S  SNR  d1/d2  MSE(X̂^PCA, X̃)  MSE(X̂^rPCA, X̃)  MSE(X̂^SURE, X̃)
100/20  4  4    4      8.25E-04        8.24E-04         1.42E-03
100/20  4  4    1      8.26E-04        8.25E-04         1.43E-03
100/20  4  1    4      2.60E-01        1.96E-01         2.43E-01
100/20  4  1    1      2.47E-01        2.04E-01         2.62E-01
100/20  4  0.8  4      7.41E-01        4.27E-01         4.36E-01
100/20  4  0.8  1      6.68E-01        4.40E-01         4.83E-01
50/50   4  4    4      5.48E-04        5.48E-04         1.04E-03
50/50   4  4    1      5.46E-04        5.46E-04         1.04E-03
50/50   4  1    4      1.75E-01        1.53E-01         2.00E-01
50/50   4  1    1      1.68E-01        1.52E-01         2.09E-01
50/50   4  0.8  4      5.07E-01        3.87E-01         3.85E-01
50/50   4  0.8  1      4.67E-01        3.76E-01         4.13E-01
20/100  4  4    4      8.28E-04        8.27E-04         1.41E-03
20/100  4  4    1      8.29E-04        8.28E-04         1.42E-03
20/100  4  1    4      2.55E-01        1.97E-01         2.45E-01
20/100  4  1    1      2.48E-01        2.04E-01         2.60E-01
20/100  4  0.8  4      7.13E-01        4.15E-01         4.37E-01
20/100  4  0.8  1      6.66E-01        4.34E-01         4.78E-01

MSE in favor of rPCA

(Same table.) rPCA ≈ PCA when the SNR is high; rPCA improves on PCA when the SNR is small.

(Same table.) rPCA improves more on PCA when d₁ ≥ d₂.

(Same table.) Errors are of the same order for n/p = 0.2 or 5, and smaller for n/p = 1!

(Same table.) SURE performs well when the data are noisy and S is high.
⇒ SNR is not a good measure of the level of noise.

(Same table.) SURE can perform poorly: (√λ_s − λ)₊ selects too many components.

Candès et al. (2012)'s simulation

• Same simulation model
• n = 200, p = 500, SNR ∈ {4, 2, 1, 0.5}, S ∈ {10, 100}

S    SNR  MSE(X̂^PCA, X̃)  MSE(X̂^rPCA, X̃)  MSE(X̂^SURE, X̃)
10   4    4.31E-03        4.29E-03         8.74E-03
10   2    1.74E-02        1.71E-02         3.29E-02
10   1    7.16E-02        6.75E-02         1.16E-01
10   0.5  3.19E-01        2.57E-01         3.53E-01
100  4    3.79E-02        3.69E-02         4.50E-02
100  2    1.58E-01        1.41E-01         1.56E-01
100  1    7.29E-01        4.91E-01         4.48E-01
100  0.5  3.16E+00        1.48E+00         8.52E-01

⇒ Good behavior of rPCA
⇒ SURE performs well when the data are noisy and S is high

(Same table.) With 100 dimensions the MSE of rPCA is 1.48, while on data this noisy the trivial 0-dimension estimate already has MSE = 1.

Individuals representation

100 individuals, 20 variables, 2 dimensions, SNR = 0.8, d₁/d₂ = 4

[Figures: individuals maps comparing the true configuration vs PCA; PCA vs rPCA; PCA vs SURE; true configuration vs rPCA]

⇒ rPCA improves the recovery of the inner-product matrix X̃X̃′

Variables representation

[Figures: variables maps comparing the true configuration vs PCA; PCA vs rPCA; PCA vs SURE; true configuration vs rPCA]

cor(X₇, X₉): true configuration = 1; PCA = 0.68; rPCA = 0.81; SURE = 0.82

⇒ rPCA improves the recovery of the covariance matrix X̃′X̃

Real data set

• 27 chickens submitted to 4 nutritional statuses
• 12664 gene expressions

[Figures: individuals maps comparing PCA vs rPCA; SURE vs rPCA; SPCA vs rPCA]

SPCA: sparse PCA (Witten, 2009)

Real data set

[Figures: bi-clustering heatmaps comparing PCA vs rPCA; SURE vs rPCA; SPCA vs rPCA]

F16R16 clusters with F16R5 and with N in rPCA

Bi-clustering from rPCA better recovers the 4 nutritional statuses!


Conclusion

⇒ Recover a low-rank matrix from noisy observations
• singular value thresholding: a popular estimation strategy
• a regularization term derived a priori, (√λ_s − σ̂²/√λ_s): minimizes the MSE (asymptotic results); Bayesian treatment of the fixed effect model

⇒ Good results: recovery of the structure, of the distances between individuals, of the covariance matrix; visualization

⇒ Tuning parameter: the number of dimensions!
• too few: information is lost (noise variance overestimated)
• too many: noise is kept as if it were relevant information (noise variance underestimated); is more always better? Not always!

Noisy settings → keep fewer dimensions, so as to regularize more


Selecting the number of dimensions (Josse & Husson, 2011)

[Figure: data matrix with cells left out in turn for cross-validation]

⇒ EM-CV (Bro et al., 2008):  MSEP_S = (1/np) Σ_{i=1}^{n} Σ_{j=1}^{p} (x_ij − x̂^{−ij}_ij)²
⇒ Computationally costly

In regression, ŷ = Py (Craven & Wahba, 1979):  ŷ^{−i}_i − y_i = (ŷ_i − y_i)/(1 − P_ii)

In PCA, vec(X̂^(S)) = P^(S) vec(X) with X̂ = FV′ and

P^(S)_{np×np} = (P_V ⊗ I_n) + (I_p ⊗ P_F) − (P_V ⊗ P_F)

ACV_S = (1/np) Σ_{i,j} ( (x̂_ij − x_ij)/(1 − P_{ij,ij}) )²

GCV_S = (1/np) · Σ_{i,j}(x̂_ij − x_ij)² / (1 − tr(P^(S))/np)²
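GCV_S is cheap to evaluate: with orthonormal scores and axes, tr(P_V) = tr(P_F) = S, so tr(A ⊗ B) = tr(A)·tr(B) gives tr(P^(S)) = nS + pS − S². A sketch under that simplification, using a plain truncated SVD for X̂ (without the centering step a real PCA would include):

```python
import numpy as np

def gcv(X, S):
    """Generalized cross-validation criterion for the number of dimensions."""
    n, p = X.shape
    U, sv, Vt = np.linalg.svd(X, full_matrices=False)
    Xhat = (U[:, :S] * sv[:S]) @ Vt[:S]
    rss = ((X - Xhat) ** 2).sum()
    tr_P = n * S + p * S - S**2    # tr(P_V (x) I_n) + tr(I_p (x) P_F) - tr(P_V (x) P_F)
    return (rss / (n * p)) / (1.0 - tr_P / (n * p)) ** 2

# On strong rank-2 data, the criterion drops sharply once both dimensions are kept.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 20)) \
    + 0.1 * rng.standard_normal((50, 20))
print(gcv(X, 2) < gcv(X, 1))  # True
```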


Best graphics for X = X̃ + ε

Regularized PCA: biased, but closer to X̃

PCA (unbiased) with confidence areas? ⇒ Residuals bootstrap

1 residuals (ε̂ = X − X̂): bootstrap them, or draw from N(0, σ̂²): ε^b
2 X^b = X̂ + ε^b
3 PCA on X^b: U^b, Λ^b, V^b ⇒ (U¹, Λ¹, V¹), …, (U^B, Λ^B, V^B)
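The three steps above can be sketched directly. This is an illustrative sketch, not the authors' implementation: residuals are resampled element-wise with replacement, and no centering is redone on each replicate.

```python
import numpy as np

def residual_bootstrap_pca(X, S, B=200, seed=0):
    """Steps 1-3: resample residuals around the rank-S fit, redo PCA B times."""
    rng = np.random.default_rng(seed)
    U, sv, Vt = np.linalg.svd(X, full_matrices=False)
    Xhat = (U[:, :S] * sv[:S]) @ Vt[:S]           # X-hat (rank-S fit)
    E = X - Xhat                                  # eps-hat = X - X-hat
    reps = []
    for _ in range(B):
        Eb = rng.choice(E.ravel(), size=X.shape)  # step 1: bootstrap the residuals
        Ub, svb, Vtb = np.linalg.svd(Xhat + Eb, full_matrices=False)  # steps 2-3
        reps.append((Ub[:, :S], svb[:S] ** 2, Vtb[:S].T))             # (U^b, Lambda^b, V^b)
    return reps

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 8)) \
    + 0.3 * rng.standard_normal((20, 8))
reps = residual_bootstrap_pca(X, S=2, B=50)
print(len(reps), reps[0][0].shape)  # 50 (20, 2)
```

The scatter of the B replicate configurations around the point estimate is what draws the confidence areas on the factor maps.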

[Figures: individuals factor map and variables factor map (PCA) of a wine sensory data set, Dim 1 (52.64%) × Dim 2 (25.59%), with bootstrap confidence areas. Wines: S Michaud, S Renaudie, S Trotignon, S Buisse Domaine, S Buisse Cristal, V Aub Silex, V Aub Marigny, V Font Domaine, V Font Brûlés, V Font Coteaux. Sensory descriptors include O.Av.Intensité, O.Ap.Intensité, O.Alcool, O.Végétale, O.Champignon, O.Fruitpassion, O.Typicité, G.Intensité, Sucrée, Acide, Amère, Astringent, G.Alcool, Equilibre, G.Typicité.]

Analysis of variance framework

[Diagram: a two-way table Y (genotypes G₁, …, G_I × environments E₁, …, E_J) unfolded into a response Y with factors G and E]

⇒ Related to AMMI (Additive Main effects and Multiplicative Interaction) models in ANOVA:

y_ij = μ + α_i + β_j + Σ_{s=1}^{S} √λ_s u_is v_js + ε_ij
