Semi-supervised Discriminant Analysis
Lishan Qiao
2009.03.13
Outline
• Motivation
• Locality Preserving Regularization based…
– Laplacian Linear Discriminant Analysis(LapLDA)[1]
– Semi-supervised Discriminant Analysis(SDA)[2]
– Comments: Does Locality Preserving Reg. really work?
• Optimization based…
– Semi-supervised Discriminant Analysis Via CCCP(SSDACCCP)[3]
• Conclusion
Semi-supervised Discriminant Analysis Lishan Qiao 2009-3
[1] J. H. Chen, J. P. Ye, Q. Li. Integrating global and local structures: A least squares framework for dimensionality reduction. CVPR, 2007.
[2] D. Cai, X. F. He, J. W. Han. Semi-supervised discriminant analysis. ICCV, 2007.
[3] Y. Zhang, D. Y. Yeung. Semi-supervised discriminant analysis via CCCP. ECML PKDD, 2008.
Motivation: Why extend LDA?
Linear Discriminant Analysis (LDA) is a popular supervised dimensionality reduction (DR) method.
Objective function: $\max_w \frac{w^T S_b w}{w^T S_t w}$
However, LDA has the following limitations:
• Small Sample Size (SSS): $\mathrm{rank}(S_w) \le n - c$, $\mathrm{rank}(S_t) \le n - 1$
– PseudoLDA, PCA+LDA, NullLDA, RLDA, …
– 2DLDA, TensorLDA, …
• Global DR method
– LapLDA, SDA, SSLDA
• Completely supervised method
– SDA, SSLDA, SSDACCCP
Besides, semi-supervised learning offers:
– Co-Training
– Transductive, e.g. Label Propagation
– Inductive, e.g. LapSVM
– …
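The scatter matrices and the rank bounds on this slide can be checked numerically. The following is a minimal sketch, assuming a data matrix with samples as columns and integer class labels; the function name `scatter_matrices` and the toy sizes are illustrative choices, not part of the original slides.

```python
import numpy as np

def scatter_matrices(X, y):
    """X: (d, n) data matrix with samples as columns; y: (n,) integer labels."""
    m = X.mean(axis=1, keepdims=True)
    Xc = X - m
    St = Xc @ Xc.T                                  # total scatter
    Sw = np.zeros_like(St)
    Sb = np.zeros_like(St)
    for k in np.unique(y):
        Xk = X[:, y == k]
        mk = Xk.mean(axis=1, keepdims=True)
        Sw += (Xk - mk) @ (Xk - mk).T               # within-class scatter
        Sb += Xk.shape[1] * (mk - m) @ (mk - m).T   # between-class scatter
    return Sb, Sw, St

# Toy small-sample-size setting (assumed): dimension d larger than sample count n.
rng = np.random.default_rng(0)
d, n, c = 50, 12, 3
X = rng.normal(size=(d, n))
y = np.repeat(np.arange(c), n // c)

Sb, Sw, St = scatter_matrices(X, y)
rank_Sw = np.linalg.matrix_rank(Sw)   # bounded by n - c
rank_St = np.linalg.matrix_rank(St)   # bounded by n - 1
```

Because both ranks stay below d in this setting, S_w and S_t are singular, which is the SSS problem that the listed LDA variants try to address.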
LapLDA Motivation & Objective function
Motivation: LDA captures the global geometric structure of the data by simultaneously maximizing the between-class distance and minimizing the within-class distance. However, local geometric structure has recently been shown to be effective for dimensionality reduction.
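Graph weight $w_{ij}$ between $x_i$ and $x_j$:
$w_{ij} = \begin{cases} \exp(-\|x_i - x_j\|^2 / 2\sigma^2), & \text{if } x_i \text{ is among the kNN of } x_j \text{ or } x_j \text{ is among the kNN of } x_i \\ 0, & \text{otherwise} \end{cases}$
Objective function: $\max_w \frac{w^T S_b w}{w^T S_t w + \lambda\, w^T X L X^T w}$, where $L = D - W$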
LapLDA = LDA + LPP
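The kNN heat-kernel graph and its Laplacian L = D − W can be sketched as follows. This is an illustrative NumPy version under the assumption that samples are stored as columns; the function name `knn_heat_graph` and the default values of `k` and `sigma` are hypothetical.

```python
import numpy as np

def knn_heat_graph(X, k=3, sigma=1.0):
    """X: (d, n) with samples as columns. Returns W (n, n) and L = D - W."""
    n = X.shape[1]
    sq = np.sum(X**2, axis=0)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * X.T @ X   # squared distances
    dist2 = np.maximum(dist2, 0.0)                      # guard against rounding
    np.fill_diagonal(dist2, np.inf)                     # exclude self-neighbors
    nn = np.argsort(dist2, axis=1)[:, :k]               # k nearest neighbors per sample
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), nn.ravel()] = True
    mask = mask | mask.T        # x_i among kNN of x_j, or x_j among kNN of x_i
    W = np.where(mask, np.exp(-dist2 / (2.0 * sigma**2)), 0.0)
    D = np.diag(W.sum(axis=1))
    return W, D - W
```

With a symmetric W, the regularizer satisfies w^T X L X^T w = (1/2) Σ_ij w_ij (w^T x_i − w^T x_j)^2, so it penalizes projections that separate neighboring points.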
[Results table, partially recoverable: 88.67 ± 2.32, 81.02 ± 1.31 (↑4.54), 82.27 ± 1.72, 90.90 ± 0.61; RLDA: ?]
Does locality preserving Regularizer really work?
It seems to only play the role of Tikhonov Regularizer!!
LapLDA Experiments & Discussion
RLDA: $\max_w \frac{w^T S_b w}{w^T S_t w + \lambda\, w^T I w}$
LapLDA: $\max_w \frac{w^T S_b w}{w^T S_t w + \lambda\, w^T X L X^T w}$
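RLDA and LapLDA differ only in the matrix that regularizes the denominator, so both can be solved with the same routine. The sketch below is one possible NumPy implementation, not the authors' code: the function name `regularized_direction`, the toy data, and the fully connected heat-kernel graph (used here instead of a kNN graph for brevity) are all assumptions.

```python
import numpy as np

def regularized_direction(Sb, St, R, lam):
    """Leading w maximizing (w^T Sb w) / (w^T (St + lam*R) w)."""
    M = St + lam * R
    C = np.linalg.cholesky(M)                    # M = C C^T; M must be positive definite
    Ci = np.linalg.inv(C)
    vals, vecs = np.linalg.eigh(Ci @ Sb @ Ci.T)  # eigenvalues in ascending order
    w = Ci.T @ vecs[:, -1]                       # map the top eigenvector back to w
    return w / np.linalg.norm(w), vals[-1]

# Toy data (assumed): 3 classes in 5 dimensions, samples as columns.
rng = np.random.default_rng(0)
d, n = 5, 30
y = np.repeat([0, 1, 2], n // 3)
X = rng.normal(size=(d, n)) + 2.0 * np.eye(d)[:, y]

m = X.mean(axis=1, keepdims=True)
St = (X - m) @ (X - m).T
Sb = np.zeros((d, d))
for k in range(3):
    mk = X[:, y == k].mean(axis=1, keepdims=True)
    Sb += np.sum(y == k) * (mk - m) @ (mk - m).T

# Fully connected heat-kernel graph (a simplification of the kNN graph).
sq = np.sum(X**2, axis=0)
dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X.T @ X, 0.0)
W = np.exp(-dist2 / (2.0 * d))
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W

lam = 0.1
w_rlda, J_rlda = regularized_direction(Sb, St, np.eye(d), lam)    # Tikhonov regularizer
w_lap, J_lap = regularized_direction(Sb, St, X @ L @ X.T, lam)    # locality regularizer
cosine = abs(w_rlda @ w_lap)   # how similar the two directions are
```

Comparing `w_rlda` and `w_lap` (for example via `cosine`) on real data is one way to probe the question raised on this slide, namely whether the locality term behaves differently from a plain Tikhonov term.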
[Figure: Accuracy (y-axis, 0.80 to 0.81) versus K (x-axis, 0 to 20) on Letter (a-m); K = 1, 2, 3, 5, 10, 15, 20]
SDA Motivation & Objective function
SDA=RLDA+LPP=LapLDA+Tikhonov Reg.=LDA+LPP+Tikhonov Reg.
Motivation: The labeled data points are used to maximize the separability between different classes, and the unlabeled data points are used to estimate the intrinsic geometric structure of the data.
Objective function: $\max_w \frac{w^T S_b w}{w^T S_t w + \alpha\, w^T X L X^T w + \beta \|w\|^2}$
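One practical effect of the Tikhonov term in SDA is that it keeps the denominator matrix invertible when the dimension exceeds the sample count. The sketch below illustrates this numerically; the placeholder graph (fully connected, unit weights) and the values of `alpha` and `beta` are arbitrary assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 40, 10                          # small sample size: d > n
X = rng.normal(size=(d, n))            # samples as columns
Xc = X - X.mean(axis=1, keepdims=True)
St = Xc @ Xc.T                         # total scatter, rank at most n - 1

W = np.ones((n, n)) - np.eye(n)        # placeholder graph: fully connected, unit weights
L = np.diag(W.sum(axis=1)) - W         # graph Laplacian

alpha, beta = 0.1, 0.01                # arbitrary illustrative values
M_lap = St + alpha * X @ L @ X.T       # locality term only: still rank deficient
M_sda = M_lap + beta * np.eye(d)       # adding the Tikhonov term: positive definite

rank_lap = np.linalg.matrix_rank(M_lap)
min_eig_sda = np.linalg.eigvalsh(M_sda).min()
```

Here `rank_lap` stays below `d`, while the smallest eigenvalue of `M_sda` is strictly positive, so only the second matrix can be inverted directly.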
Globality Preserving DA: $\max_w \frac{w^T S_b w + \alpha\, w^T X X^T w}{w^T S_t w + \beta \|w\|^2}$
$\max_w \frac{w^T X X^T w}{w^T S_t w}$
Only 1 labeled training sample per class
SDA Experiments & Discussion
% Graph construction options
WOptions = [];
WOptions.Metric = 'Cosine';
WOptions.NeighborMode = 'KNN';
WOptions.k = 2;
WOptions.WeightMode = 'Cosine';
WOptions.bSelfConnected = 0;
WOptions.bNormalized = 1;
% SDA options
options = [];
options.ReguType = 'Ridge';
options.ReguAlpha = 0.01;
options.beta = 0.1;
No parameters at all!
3.27.32 1 labeled + 29 unlabeled
1.35.37 1 labeled + 1 unlabeled
Discussion About Locality Preserving Reg.
Although the graph is at the heart of graph-based semi-supervised learning methods, its construction has not been studied extensively. [X. Zhu, SSL_survey, 05-08]
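1) Graph Construction
$w_{ij} = \begin{cases} \exp(-\|x_i - x_j\|^2 / 2\sigma^2), & \text{if } x_i \text{ is among the kNN of } x_j \text{ or } x_j \text{ is among the kNN of } x_i \\ 0, & \text{otherwise} \end{cases}$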
Issue 1: Curse of dimensionality
For example, the face space is estimated to have at least 100 dimensions [4].
[4] M. Meytlis, L. Sirovich. On the dimensionality of face space. PAMI, 29(7):1262–1267, 2007.
[Figure labels: 1.90 × 10^3; 0.84 × 10^3, 0.92 × 10^3]
Issue 2: The performance of classification relies heavily on how well the nearest neighbor criterion works in the original high-dimensional space [5].
Discussion About Locality Preserving Reg.
[Figure: plot with x-axis from 0 to 70 and y-axis from 0 to 0.7; annotated improvement ↑ 4.35%]
[5] H. T. Chen, H. W. Chang, and T. L. Liu, Local discriminant embedding and its variants. CVPR, 2005.
[Figure: points marked ★, ■, □, ☆ with a direction labeled LDA]
Issue 3: Difficulty of parameter selection. Cross-validation? [6]
[6] D. Zhou, O. Bousquet, B. Scholkopf. Learning with Local and Global Consistency.NIPS,2004
Discussion About Locality Preserving Reg.
2) Parametric model vs. non-parametric model
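LDA: $\max_w \frac{w^T S_b w}{w^T S_t w}$
RLDA: $\max_w \frac{w^T S_b w}{w^T S_t w + \lambda\, w^T I w}$
LapLDA: $\max_w \frac{w^T S_b w}{w^T S_t w + \lambda\, w^T X L X^T w}$
SDA: $\max_w \frac{w^T S_b w}{w^T S_t w + \alpha\, w^T X L X^T w + \beta \|w\|^2}$
gpDA: $\max_w \frac{w^T X X^T w}{w^T S_t w}$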
Related Works: Semi-supervised DR
Columns: Algorithm | Discriminative term (Fisher, MMC, Pairwise) | Regularization term (Tikhonov $\|w\|_p$, Globality $w^T X X^T w$, Locality $w^T X L X^T w$) | Parameters
RLDA √ √ λ
LapLDA [1] √ √ λ, K, σ
SDA [2] / SSLDA √ √ √ α, β, K, σ
SSDR √ √ α, β
SSDRL √ √ λ
SSMMC √ √ √ α, β, K, σ
Sparsity preserving “regularization”
Outline
• Motivation
• Locality Preserving Regularization based…
– Laplacian Linear Discriminant Analysis(LapLDA)[1]
– Semi-supervised Discriminant Analysis(SDA)[2]
– Comments: Does Locality Preserving Reg. really work?
• Optimization based…
– Semi-supervised Discriminant Analysis Via CCCP(SSDACCCP)[3]
• Conclusion
[Figure: points marked ★ and ■ among points marked ×]
SSDACCCP Motivation
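Samples $x_1, \ldots, x_l$ are labeled and $x_{l+1}, \ldots, x_n$ are unlabeled. Class indicator matrix (columns: classes 1, 2, …, C):
1 0 0
1 0 0
0 0 1
? ? ?
? ? ?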
LDA:
SSDACCCP Formulation
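Class indicator matrix for labeled samples $x_1, \ldots, x_l$ and unlabeled samples $x_{l+1}, \ldots, x_n$ (columns: classes 1, 2, …, C):
1 0 0
1 0 0
0 0 1
? ? ?
? ? ?
With the unknown rows estimated, every row of $A$ is one-hot, e.g.:
1 0 0
1 0 0
0 1 0
0 1 0
0 0 1
$A = [A_1, A_2, \ldots, A_C]$, $D = [X_1, X_2, \ldots, X_C]$
$n_k = A_k^T 1_n$, $m_k = D A_k / n_k$, $m = D 1_n / n$
$S_b = \sum_{k=1}^{C} n_k (m_k - m)(m_k - m)^T$, $S_t = D D^T$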
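The point of this formulation is that S_b depends on the labels only through the indicator matrix A, so estimating labels becomes an optimization over A. The following minimal sketch computes n_k, m_k, m and S_b directly from A and D; the function name `between_scatter_from_indicator` and the toy data are assumptions for illustration.

```python
import numpy as np

def between_scatter_from_indicator(D, A):
    """D: (d, n) data with samples as columns; A: (n, C) one-hot class indicator."""
    d, n = D.shape
    ones = np.ones(n)
    m = D @ ones / n                      # overall mean: m = D 1_n / n
    Sb = np.zeros((d, d))
    for k in range(A.shape[1]):
        Ak = A[:, k]
        nk = Ak @ ones                    # class size: n_k = A_k^T 1_n
        mk = D @ Ak / nk                  # class mean: m_k = D A_k / n_k
        Sb += nk * np.outer(mk - m, mk - m)
    return Sb

# Toy example (assumed): 7 samples in 3 dimensions, 3 classes.
rng = np.random.default_rng(0)
y = np.array([0, 0, 1, 1, 2, 2, 1])
D = rng.normal(size=(3, 7))
A = np.eye(3)[y]                          # one-hot rows, one per sample
Sb = between_scatter_from_indicator(D, A)
```

Changing a single row of `A` changes `Sb`, which is what allows the unknown rows to be treated as optimization variables.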
SSDACCCP Formulation
1) $\mathrm{trace}(A + B) = \mathrm{trace}(A) + \mathrm{trace}(B)$
2) $\mathrm{trace}(AB) = \mathrm{trace}(BA)$
3) $\mathrm{trace}(ABC) = \mathrm{trace}(BCA) = \mathrm{trace}(CAB)$
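The three trace properties can be verified numerically on random matrices. This is only a sanity-check sketch; the matrix sizes are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.normal(size=(4, 4)) for _ in range(3))
P = rng.normal(size=(3, 5))   # rectangular pair: PQ is 3x3, QP is 5x5
Q = rng.normal(size=(5, 3))

# 1) Linearity: trace(A + B) = trace(A) + trace(B)
id1 = np.isclose(np.trace(A + B), np.trace(A) + np.trace(B))
# 2) trace(AB) = trace(BA), also for rectangular factors
id2 = np.isclose(np.trace(P @ Q), np.trace(Q @ P))
# 3) Cyclic invariance: trace(ABC) = trace(BCA) = trace(CAB)
t_abc = np.trace(A @ B @ C)
id3 = np.isclose(t_abc, np.trace(B @ C @ A)) and np.isclose(t_abc, np.trace(C @ A @ B))
```

Property 3 covers cyclic shifts only; a non-cyclic reordering such as trace(ACB) is generally different.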
SSDACCCP Formulation
$\max_A \;\cdots$
Without loss of generality,
$\max_{x,t} \frac{x^T S x}{t}$
$\min_{x,t} \; \underbrace{\mathrm{const}}_{g(x)} - \underbrace{\frac{x^T S x}{t}}_{h(x)}$
D.C. Programming
SSDACCCP CCCP
[Figure: CCCP iteration on $g(x)$ and $h(x)$, from $x_p$ to $x_{p+1}$]
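As a reminder of how the concave-convex procedure (CCCP) works for an objective g(x) − h(x) with g and h convex: each iteration replaces h by its linearization at the current point x_p and minimizes the resulting convex surrogate. The sketch below uses a one-dimensional toy problem, g(x) = x^4 and h(x) = 2x^2, chosen as an assumption because the update has a closed form; it is not the SSDACCCP objective.

```python
import numpy as np

def f(x):
    """Toy D.C. objective f = g - h with g(x) = x**4 and h(x) = 2*x**2 (both convex)."""
    return x**4 - 2.0 * x**2

def cccp(x0, iters=50):
    xs = [x0]
    for _ in range(iters):
        grad_h = 4.0 * xs[-1]              # h'(x_p)
        # Surrogate: minimize g(x) - grad_h * x, i.e. 4 x^3 = grad_h.
        xs.append(np.cbrt(grad_h / 4.0))
    return np.array(xs)

xs = cccp(0.1)      # iterates x_0, x_1, ...
fs = f(xs)          # objective values along the iterations
```

In this toy case the objective values in `fs` never increase and the iterates approach the local minimizer x = 1, which illustrates the monotone descent that CCCP provides for D.C. programs.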
SSDACCCP CCCP
SSDACCCP CCCP
SSDACCCP Formulation
$\max_{x,t} \frac{x^T S x}{t}$
$\min_{x,t} \; \underbrace{\mathrm{const}}_{g(x)} - \underbrace{\frac{x^T S x}{t}}_{h(x)}$
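SSDACCCP Formulation
$h(x,t) = \frac{x^T S x}{t}$
Gradient: $\nabla h(x_p,t_p) = \left[\left(\frac{2 S x_p}{t_p}\right)^T,\; -\frac{x_p^T S x_p}{t_p^2}\right]^T$
First-order Taylor expansion: $h(x,t) \approx h(x_p,t_p) + \left[\left(\frac{2 S x_p}{t_p}\right)^T,\; -\frac{x_p^T S x_p}{t_p^2}\right] \begin{bmatrix} x - x_p \\ t - t_p \end{bmatrix}$
Omit constant term: $\left(\frac{2 S x_p}{t_p}\right)^T x - \frac{x_p^T S x_p}{t_p^2}\, t$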
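The gradient and the linearization of h(x, t) = x^T S x / t used in the CCCP step can be written down and checked numerically. The sketch below assumes S is symmetric positive semidefinite and t > 0, in which case h is convex and its linearization is a global lower bound; the function names are illustrative.

```python
import numpy as np

def h(x, t, S):
    """h(x, t) = x^T S x / t."""
    return x @ S @ x / t

def grad_h(x, t, S):
    """Gradient of h with respect to x and t, for symmetric S."""
    return 2.0 * S @ x / t, -(x @ S @ x) / t**2

def linearized_h(x, t, xp, tp, S):
    """First-order Taylor expansion of h around (xp, tp).

    The constant terms cancel here because h is homogeneous of degree 1
    in (x, t), so only the two linear terms remain.
    """
    gx, gt = grad_h(xp, tp, S)
    return gx @ x + gt * t
```

A typical use inside a CCCP loop would call `grad_h` at the current iterate (x_p, t_p) and hand the two resulting coefficients to a convex solver as the linear part of the surrogate objective.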
SSDACCCP Experiments
SSDACCCP Experiments
Conclusion
The power of the Locality Preserving Reg. was somewhat overstated.
Prior knowledge from the practical problem is of paramount importance.
[Figure: points marked ★ and ■ among points marked ×]
1. Data-dependent Regularizer
2. Label estimation via optimization
Thanks!