Introduction CCA Lasso Sparse CCA Summary
Sparse CCA using Lasso
Anastasia Lykou & Joe Whittaker
Department of Mathematics and Statistics,Lancaster University
July 23, 2008
Outline
1 Introduction: Motivation
2 CCA: Definition; CCA as a least squares problem
3 Lasso: Definition; Lasso algorithms; The Lasso algorithms contrasted
4 Sparse CCA: SCCA; Algorithm for SCCA; Example
5 Summary
Motivation
SCCA: improve the interpretation of CCA
sparse principal component analysis (SCoTLASS by Jolliffe et al. (2003) and SPCA by Zou et al. (2004))
interesting data sets (market basket analysis)
Sparsity: shrinkage and model selection simultaneously (may reduce the prediction error, can be extended to high-dimensional data sets)
Definition
Canonical Correlation Analysis
[Diagram: variables X1, ..., Xp combine into the variate S; variables Y1, ..., Yq combine into the variate T.]

Seek linear combinations S = α^T X and T = β^T Y such that ρ = max_{α,β} corr(S, T).

S, T are the canonical variates; α, β are called the canonical loadings.
Standard solution through eigen decomposition.
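The eigen-decomposition solution mentioned on the slide can be sketched in a few lines of numpy. This is an illustrative sketch of the standard textbook computation, not the authors' code; the function name and data are invented here.

```python
import numpy as np

def cca_first_pair(X, Y):
    """First canonical loadings via the standard eigen decomposition.

    X: (n, p) array, Y: (n, q) array.  Returns (alpha, beta, rho), where
    rho is the first canonical correlation.  A sketch, not the talk's code.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = len(X)
    Sxx = Xc.T @ Xc / n
    Syy = Yc.T @ Yc / n
    Sxy = Xc.T @ Yc / n
    # Eigen problem: Sxx^{-1} Sxy Syy^{-1} Syx alpha = rho^2 alpha
    M = np.linalg.solve(Sxx, Sxy) @ np.linalg.solve(Syy, Sxy.T)
    vals, vecs = np.linalg.eig(M)
    i = np.argmax(vals.real)
    alpha = vecs[:, i].real
    beta = np.linalg.solve(Syy, Sxy.T) @ alpha   # proportional to beta
    # normalise so each canonical variate has unit variance
    alpha /= np.sqrt(alpha @ Sxx @ alpha)
    beta /= np.sqrt(beta @ Syy @ beta)
    rho = alpha @ Sxy @ beta
    return alpha, beta, rho
```

Because the canonical correlation maximises over all linear combinations, rho is at least as large as the correlation of any single pair of variables.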
CCA as least squares problem
1st dimension
Theorem 1. Let α, β be p- and q-dimensional vectors, respectively.

(α̂, β̂) = argmin_{α,β} { var(α^T X − β^T Y) },

subject to α^T var(X) α = β^T var(Y) β = 1.

Then α̂, β̂ are proportional to the first-dimension ordinary canonical loadings.
CCA as least squares problem
2nd dimension
Theorem 2. Let α, β be p- and q-dimensional vectors.

(α̂, β̂) = argmin_{α,β} { var(α^T X − β^T Y) },

subject to α^T var(X) α = β^T var(Y) β = 1 and α_1^T var(X) α = β_1^T var(Y) β = 0,

where α_1, β_1 are the first canonical loadings.
Then α̂, β̂ are proportional to the second-dimension ordinary canonical loadings.

The theorems establish an Alternating Least Squares algorithm for CCA.
CCA as least squares problem
ALS for CCA
Let the objective function be

Q(α, β) = var(α^T X − β^T Y),

subject to α^T var(X) α = β^T var(Y) β = 1.

Q is continuous with closed and bounded domain ⇒ Q attains its infimum.

ALS algorithm:
Given α̂: β̂ = argmin_β Q(α̂, β) subject to var(β^T Y) = 1.
Given β̂: α̂ = argmin_α Q(α, β̂) subject to var(α^T X) = 1.

Q decreases over the iterations and is bounded from below ⇒ Q converges.
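Each half-step of the ALS scheme is an ordinary regression followed by a rescaling to unit variance. A minimal numpy sketch of the alternation (illustrative only; the function name, data, and iteration count are invented here):

```python
import numpy as np

def als_cca(X, Y, iters=100):
    """Alternating least squares for the first canonical pair (a sketch).

    Each half-step minimises var(alpha'X - beta'Y) in one argument and
    renormalises so the fitted variate has unit variance.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    beta = np.ones(Y.shape[1])
    beta /= np.sqrt(np.var(Yc @ beta))
    for _ in range(iters):
        # given beta: regress T = Y beta on X, then rescale
        T = Yc @ beta
        alpha, *_ = np.linalg.lstsq(Xc, T, rcond=None)
        alpha /= np.sqrt(np.var(Xc @ alpha))
        # given alpha: regress S = X alpha on Y, then rescale
        S = Xc @ alpha
        beta, *_ = np.linalg.lstsq(Yc, S, rcond=None)
        beta /= np.sqrt(np.var(Yc @ beta))
    rho = np.corrcoef(Xc @ alpha, Yc @ beta)[0, 1]
    return alpha, beta, rho
```

The objective decreases at every half-step, so the attained correlation stabilises at the first canonical correlation.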
Definition
Lasso (least absolute shrinkage and selection operator)
Introduced by Tibshirani (1996). Imposes the L1 norm on the linear regression coefficients.
Lasso
β̂_lasso = argmin_β { var(Y − β^T X) } subject to Σ_{j=1}^p |β_j| ≤ t

The L1 norm properties shrink the coefficients towards zero, and exactly to zero if t is small enough.
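The constrained form on the slide is usually solved in its equivalent penalised form; a minimal coordinate-descent sketch shows the shrink-to-exactly-zero behaviour. This is a generic illustration, not any of the algorithms discussed in the talk; names and data are invented.

```python
import numpy as np

def lasso_cd(X, y, lam, iters=200):
    """Minimise (1/2n)||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    Penalised form of the slide's constrained problem: every bound t
    corresponds to some penalty lam.  A sketch, not the talk's code.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0) / n
    for _ in range(iters):
        for j in range(p):
            r = y - X @ b + X[:, j] * b[j]      # partial residual
            rho_j = X[:, j] @ r / n
            # soft-thresholding: shrinks towards zero, and exactly to
            # zero when |rho_j| <= lam
            b[j] = np.sign(rho_j) * max(abs(rho_j) - lam, 0.0) / col_ss[j]
    return b
```

With a large enough penalty every coefficient is exactly zero; with a tiny penalty the solution approaches OLS.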
Lasso algorithms
Lasso algorithms available in the literature
Lasso by Tibshirani
Expresses the problem as a least squares problem with 2^p inequality constraints
Adapts the NNLS algorithm
Lars-Lasso
A modified version of the Lars algorithm introduced by Efron et al. (2004)
Lasso estimates are calculated such that the angle between the active covariates and the residuals is always equal.
Lasso algorithms
Proposed algorithm
Lasso with positivity constraints: suppose that the sign of the coefficients does not change during shrinkage.
Positivity Lasso
β̂_lasso = argmin_β { var(Y − β^T X) } subject to s_0^T β ≤ t and s_{0j} β_j ≥ 0 for j = 1, ..., p,

where s_0 is the sign of the OLS estimate.

simple algorithm, but quite general
a restricted version of the Lasso algorithms, since the sign of the coefficients cannot change
up to p + 1 constraints imposed, ≪ the 2^p constraints of Tibshirani's Lasso
Lasso algorithms
Numerical solution
The solution is given through quadratic programming methods.

Positivity Lasso solution

β̂ = b_0 − λ var(X)^{-1} s_0 + var(X)^{-1} diag(s_0) µ

b_0 is the OLS estimate.
λ is the shrinkage parameter; there is a one-to-one correspondence between λ and t.
µ is zero for active and positive for nonactive coefficients.
The parameters λ and µ are calculated to satisfy the KKT conditions under the positivity constraints.
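One way to check the closed-form KKT path numerically is to hand the same constrained problem to a general solver. The sketch below uses scipy's SLSQP as a stand-in for the quadratic programming solution on the slide; it is not the authors' algorithm, and the function name and data are invented here.

```python
import numpy as np
from scipy.optimize import minimize

def positivity_lasso(X, y, t):
    """Least squares under the slide's sign-constrained L1 bound:

        min ||y - X b||^2   s.t.  s0' b <= t  and  s0_j * b_j >= 0,

    where s0 is the sign of the OLS estimate.  Under the sign
    constraints, s0' b equals the L1 norm of b, so the first constraint
    is exactly the Lasso bound.  Numerical sketch via SLSQP.
    """
    b0 = np.linalg.lstsq(X, y, rcond=None)[0]
    s0 = np.sign(b0)
    cons = [
        {"type": "ineq", "fun": lambda b: t - s0 @ b},   # s0'b <= t
        {"type": "ineq", "fun": lambda b: s0 * b},       # signs fixed
    ]
    res = minimize(lambda b: np.sum((y - X @ b) ** 2),
                   np.zeros(len(b0)), constraints=cons, method="SLSQP")
    return res.x
```

A large bound t recovers the OLS estimate; a small bound shrinks the coefficients.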
The Lasso algorithms contrasted
Diabetes data set
442 observations
age, sex, body mass index, average blood pressure and six blood serum measurements
disease progression one year after baseline
[Figure: Lasso coefficient paths for the diabetes data. Left panel: Lars-Lasso; right panel: Positivity Lasso. Horizontal axis: sum|b|/sum|b_ols| from 0.0 to 1.0; vertical axis: coefficients from −500 to 500, with variable labels on the paths.]
The Lasso algorithms contrasted
Simulation studies
We simulate 200 data sets, each consisting of 100 observations, from the following model:

Y = β^T X + σε,  corr(X_i, X_j) = ρ^{|i−j|}
Dataset  n    p  β                             σ  ρ
1        100  8  (3, 1.5, 0, 0, 2, 0, 0, 0)^T  3  0.50
2        100  8  (3, 1.5, 0, 0, 2, 0, 0, 0)^T  3  0.90
3        100  8  β_j = 0.85 for all j          3  0.50
4        100  8  (5, 0, 0, 0, 0, 0, 0, 0)^T    2  0.50
Table: Proportion of cases in which the correct model was selected.

Dataset  Tibs-Lasso  Lars-Lasso  Pos-Lasso
1        0.06        0.13        0.14
2        0.02        0.04        0.04
3        0.84        0.89        0.87
4        0.09        0.19        0.19
Table: Proportions of agreement between Pos-Lasso and the other two algorithms.

Dataset  Tibs-Lasso  Lars-Lasso
1        0.76        0.83
2        0.63        0.65
3        0.95        0.98
4        0.77        0.78
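The AR(1)-type design corr(X_i, X_j) = ρ^{|i−j|} used above can be generated through a Cholesky factor of the correlation matrix. A sketch of the setup (not the study's code; the function name is invented):

```python
import numpy as np

def simulate(n, beta, sigma, rho, rng):
    """One data set from the simulation design:
    Y = beta'X + sigma*eps with corr(X_i, X_j) = rho^{|i-j|}."""
    p = len(beta)
    # AR(1) correlation matrix, then a Cholesky factor to colour the noise
    cov = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    X = rng.normal(size=(n, p)) @ np.linalg.cholesky(cov).T
    Y = X @ beta + sigma * rng.normal(size=n)
    return X, Y
```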
SCCA
ALS for CCA and Lasso
First dimension. Given the canonical variate T = β^T Y,

α̂ = argmin_α { var(T − α^T X) } subject to var(α^T X) = 1 and ||α||_1 ≤ t

We seek an algorithm solving this optimization problem, or modify the Lasso algorithm in order to incorporate the equality constraint.
Algorithm for SCCA
ALS for CCA and Lasso
Tibshirani's Lasso: the NNLS algorithm cannot incorporate the equality constraint.
Lars-Lasso: the equality constraint violates the equiangular condition.
Positivity Lasso: by additionally imposing the positivity constraints, the above optimization problem can be solved.
Algorithm for SCCA
SCCA with positivity
First dimension

min_α { var(T − α^T X) } subject to α^T var(X) α = 1, s_0^T α ≤ t, and s_{0j} α_j ≥ 0 for j = 1, ..., p

The entire Lasso path is derived by considering the KKT conditions.
Cross-validation methods select the shrinkage level applied.
α_sp and β_sp for each set of variables are derived alternately until corr(S_sp, T_sp) converges to its maximum.
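The alternation can be sketched by solving each half-step's constrained subproblem with a generic solver. The sketch below again uses scipy's SLSQP as a stand-in for the KKT-path solution of the talk; function names, data, and stopping rule are invented, and the shrinkage levels are fixed rather than cross-validated.

```python
import numpy as np
from scipy.optimize import minimize

def scca_step(Xc, target, t):
    """One SCCA half-step (sketch of the subproblem above):
    min var(target - a'X)  s.t.  var(a'X) = 1, s0'a <= t, s0_j a_j >= 0."""
    a0 = np.linalg.lstsq(Xc, target, rcond=None)[0]
    s0 = np.sign(a0)                       # signs from the OLS fit
    a0 /= np.sqrt(np.var(Xc @ a0))         # feasible unit-variance start
    cons = [
        {"type": "eq",   "fun": lambda a: np.var(Xc @ a) - 1.0},
        {"type": "ineq", "fun": lambda a: t - s0 @ a},
        {"type": "ineq", "fun": lambda a: s0 * a},
    ]
    res = minimize(lambda a: np.var(target - Xc @ a), a0,
                   constraints=cons, method="SLSQP")
    return res.x

def scca(X, Y, t_x, t_y, iters=20):
    """Alternate the half-steps until corr(S, T) stabilises (sketch)."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    beta = np.linalg.lstsq(Yc, Xc[:, 0], rcond=None)[0]
    beta /= np.sqrt(np.var(Yc @ beta))
    for _ in range(iters):
        alpha = scca_step(Xc, Yc @ beta, t_x)
        beta = scca_step(Yc, Xc @ alpha, t_y)
    return alpha, beta, np.corrcoef(Xc @ alpha, Yc @ beta)[0, 1]
```

Smaller bounds t_x, t_y shrink more loadings towards zero at the cost of a slightly smaller correlation.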
Algorithm for SCCA
SCCA with positivity
Second dimension

min_α { var(T − α^T X) } subject to α^T var(X) α = 1, α_1^T var(X) α = 0, s_0^T α ≤ t, and s_{0j} α_j ≥ 0 for j = 1, ..., p,

where α_1 is the first-dimension loading.

Cross-validation methods select the shrinkage level.
Again an alternating algorithm derives the second-dimension canonical loadings.
Example
Simulations
We simulate 300 observations from the following model.

[Diagram: X1, ..., X4 load on the variate S1 and X5, ..., X10 on S2; Y1, ..., Y3 load on T1 and Y4, ..., Y7 on T2; the variate pairs have correlations r = 0.98 and r = 0.90.]
Variable          1st dim CCA  1st dim SCCA  2nd dim CCA  2nd dim SCCA
X1                0.229        0.248         0.122        0.056
X2                0.350        0.366         -0.052       0
X3                0.337        0.341         0.027        0
X4                0.304        0.298         0.114        0
X5                0.135        0.014         0.198        0.208
X6                -0.037       0             0.381        0.472
X7                -0.052       0             0.212        0.183
X8                -0.052       0             0.205        0.266
X9                -0.111       0             0.166        0.177
X10               -0.019       0             0.168        0
Y1                0.402        0.419         0.112        0.014
Y2                0.460        0.444         -0.018       0
Y3                0.309        0.325         0.085        0
Y4                -0.018       0             0.279        0.421
Y5                0.032        0             0.183        0.008
Y6                -0.113       -0.028        0.395        0.361
Y7                -0.089       -0.025        0.384        0.427
ρ                 0.745        0.737         0.654        0.638
RdX (%)           14.2         13.9          13.4         12.6
RdY (%)           16.6         16.4          15           14
Var.Ext of X (%)  25.7         25.5          31.4         30.9
Var.Ext of Y (%)  30           30.1          35           33.8
Summary
Extra work
Sparse CCA without positivity constraints, using the Lars-Lasso algorithm

Further work
Compare the performance of SCCA with and without positivity constraints
Bayesian model selection: imposing different Lasso penalties; using GVS, Dellaportas et al. (2002); a Bayesian version of the SCCA
Appendix References
Literature
Dellaportas, P., Forster, J., and Ntzoufras, I. (2002). On Bayesian model and variable selection using MCMC. Statistics and Computing, 12:27–36.

Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32:407–499.

Jolliffe, I., Trendafilov, N., and Uddin, M. (2003). A modified principal component technique based on the lasso. Journal of Computational and Graphical Statistics, 12(3):531–547.

Lawson, C. and Hanson, R. (1974). Solving Least Squares Problems. Prentice Hall, Englewood Cliffs, NJ.

Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58:267–288.

Zou, H., Hastie, T., and Tibshirani, R. (2004). Sparse principal component analysis. To appear, Journal of Computational and Graphical Statistics.
SCCA
Figure: CCA and SCCA with positivity. S1 = a1'X and T1 = b1'Y with correlation ρ1; S2 = a2'X and T2 = b2'Y with correlation ρ2.

Figure: SCCA without positivity. S1 = a1'X and T1 = b1'Y with correlation ρ1; S2 = a2'X_res and T2 = b2'Y_res with correlation ρ2.
NNLS
Lawson and Hanson (1974) define the following problems:

LSI problem: min_β ||Y − β^T X|| subject to Gβ ≥ h
NNLS problem: min_β ||Y − β^T X|| subject to β ≥ 0
LDP problem: min_β ||β|| subject to Gβ ≥ h

The Lasso is an LSI problem; LSI reduces to LDP, which reduces to NNLS.
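The NNLS building block is available directly in scipy. A tiny worked example (data invented here) in which the unconstrained least squares solution has a negative coefficient, so NNLS pins it at exactly zero:

```python
import numpy as np
from scipy.optimize import nnls

# Lawson-Hanson NNLS: min ||A x - y||_2 subject to x >= 0.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([2.0, -1.0, 1.0])

# Unconstrained least squares gives x = (2, -1); the second coefficient
# violates x >= 0, so NNLS sets it to exactly zero and refits the rest.
x, resid = nnls(A, y)
```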