Component Analysis Methods for Computer Vision and Pattern Recognition - Home - CVPR -...
1
Component Analysis Methods for Computer Vision and Pattern Recognition
Fernando De la Torre
Computer Vision and Pattern Recognition Easter School
Component Analysis for CV & PR F. De la Torre CVPR Easter School-2011 1
March 2011
2
Component Analysis for CV & PR
• Computer Vision & Image Processing
– Structure from motion.
– Spectral graph methods for segmentation.
– Appearance and shape models.
– Fundamental matrix estimation and calibration.
– Compression.
– Classification.
– Dimensionality reduction and visualization.
• Signal Processing
– Spectral estimation, system identification (e.g. Kalman filter), sensor array processing (e.g. cocktail party problem, echo cancellation), blind source separation, …
• Computer Graphics
– Compression (BRDF), synthesis, …
• Speech, bioinformatics, combinatorial problems.
3
Component Analysis for CV & PR
Highlighted application: structure from motion.
4
Component Analysis for CV & PR
Highlighted application: spectral graph methods for segmentation.
5
Component Analysis for CV & PR
Highlighted application: appearance and shape models.
6
Component Analysis for CV & PR
Highlighted application: dimensionality reduction and visualization.
7
Component Analysis for CV & PR
Highlighted application: cocktail party problem (sensor array processing).
8
Independent Component Analysis (ICA)
[Figure: Sound Sources 1-3 are mixed into Mixtures 1-3; ICA takes the mixtures as input and produces Outputs 1-3.]
10
Why CA for CV & PR?
• Learn from high dimensional data and few samples.
– Useful for dimensionality reduction.
• Easy to incorporate
– Robustness to noise, missing data, outliers (de la Torre & Black, 2003a)
– Invariance to geometric transformations (Frey et al. 99; de la Torre & Black, 2003b; Cox et al. 2008)
– Non-linearities (Kernel methods) (Scholkopf & Smola, 2002; Shawe-Taylor & Cristianini, 2004)
– Probabilistic (latent variable models) (Everitt, 1984)
– Multi-factorial (tensors) (Paatero & Tapper, 1994; O’Leary & Peleg, 1983; Vasilescu & Terzopoulos, 2002; Vasilescu & Terzopoulos, 2003)
– Exponential family PCA (Gordon, 2002; Collins et al. 01)
• Efficient methods O(d n << n^2), with d = features and n = samples
11
Are CA Methods Popular/Useful/Used?
• About 28% of CVPR-07 papers use CA.
• Google:
– Results 1 - 10 of about 1,870,000 for "principal component analysis".
– Results 1 - 10 of about 506,000 for "independent component analysis".
– Results 1 - 10 of about 273,000 for "linear discriminant analysis".
– Results 1 - 10 of about 46,100 for "negative matrix factorization".
– Results 1 - 10 of about 491,000 for "kernel methods".
• Still work to do
– Results 1 - 10 of about 65,300,000 for "Britney Spears".
12
Outline
• Introduction
• Generative models
– Principal Component Analysis (PCA) and extensions
– K-means, spectral clustering and extensions
– Non-negative Matrix Factorization (NMF)
– Independent Component Analysis (ICA)
• Discriminative models
– Linear Discriminant Analysis (LDA) and extensions
– Oriented Component Analysis (OCA)
– Canonical Correlation Analysis (CCA) and extensions
• A unifying view of CA
• Standard extensions of linear models
– Latent variable models.
– Tensor factorization
13
Generative Models: $D \approx BC$
• Principal Component Analysis/Singular Value Decomposition
1) Robust PCA/SVD, PCA with uncertainty and missing data.
2) Parameterized PCA
3) Filtered Component Analysis
4) Subspace regression
5) Kernel PCA
• K-means and spectral clustering
6) Aligned Cluster Analysis (ACA)
• Non-Negative Matrix Factorization
• Independent Component Analysis.
14
Principal Component Analysis (PCA)
(Pearson, 1901; Hotelling, 1933; Mardia et al., 1979; Jolliffe, 1986; Diamantaras, 1996)
• PCA finds the directions of maximum variation of the data based on linear correlation.
• PCA decorrelates the original variables.
15
PCA
• Data matrix: $D = [d_1\ d_2\ \dots\ d_n] \in \mathbb{R}^{d \times n}$ (d = pixels, n = images).
• Model: $D \approx BC + \mu 1_n^T$, with $B = [b_1\ b_2\ \dots\ b_k] \in \mathbb{R}^{d \times k}$, $C = [c_1\ c_2\ \dots\ c_n] \in \mathbb{R}^{k \times n}$ and $\mu \in \mathbb{R}^{d \times 1}$.
• Assuming 0 mean data, the basis B that preserves the maximum variation of the signal is given by the eigenvectors of $DD^T$:
$DD^T B = B\Lambda$, with $DD^T \in \mathbb{R}^{d \times d}$.
16
Snap-shot Method & SVD
• If d >> n (e.g. images 100×100 vs. 300 samples), forming the d×d matrix $DD^T$ is impractical.
• $DD^T$ and $D^TD$ have the same eigenvalues (energy) and related eigenvectors (by D).
• B is a linear combination of the data! (Sirovich, 1987)
$DD^T B = B\Lambda$, $B = D\alpha$ $\Rightarrow$ $D^TD\,D^TD\,\alpha = D^TD\,\alpha\Lambda$
• [α,L]=eig(D^T D); B=D α (diag(diag(L)))^(-0.5)
• SVD factorizes the data matrix D as (Beltrami, 1873; Schmidt, 1907; Golub & Loan, 1989):
$D = U\Sigma V^T$, so that $DD^T = U\Lambda U^T$ and $D^TD = V\Lambda V^T$
• SVD vs. PCA:
– SVD: $D = U\Sigma V^T$ with $U^TU = I$, $V^TV = I$, $\Sigma$ diagonal.
– PCA: $D \approx BC$ with $B \in \mathbb{R}^{d \times k}$, $C \in \mathbb{R}^{k \times n}$, $B^TB = I$, $CC^T = \Lambda$ diagonal.
17
Error Function for PCA
(Eckart & Young, 1936; Gabriel & Zamir, 1979; Baldi & Hornik, 1989; Shum et al., 1995; de la Torre & Black, 2003a)
• PCA minimizes the following function:
$E_1(B, C) = \sum_{i=1}^{n} \|d_i - Bc_i\|_2^2 = \|D - BC\|_F^2$
• Not unique solution: $BC = BRR^{-1}C$ for any invertible $R \in \mathbb{R}^{k \times k}$.
• To obtain same PCA solution R has to satisfy:
$\hat{B} = BR$, $\hat{C} = R^{-1}C$, $\hat{B}^T\hat{B} = I$, $\hat{C}\hat{C}^T = \Lambda$
• R is computed as a generalized k×k eigenvalue problem (de la Torre, 2006).
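The ambiguity $BC = BRR^{-1}C$ can be illustrated with a small NumPy sketch. Note the swapped-in technique: instead of the generalized k×k eigenvalue problem named on the slide, this sketch normalizes the factors with a QR decomposition followed by an SVD of the small k×n factor, which yields the same constraints ($\hat{B}^T\hat{B} = I$, $\hat{C}\hat{C}^T$ diagonal). All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
D = rng.standard_normal((20, 50))
k = 3

# A rank-k least-squares fit from the SVD, then scrambled by an invertible R.
U, s, Vt = np.linalg.svd(D, full_matrices=False)
B, C = U[:, :k], s[:k, None] * Vt[:k]
R = rng.standard_normal((k, k)) + 3 * np.eye(k)
B_mixed, C_mixed = B @ R, np.linalg.solve(R, C)   # same product B C

def error(B, C):
    """Squared Frobenius reconstruction error ||D - B C||_F^2."""
    return np.linalg.norm(D - B @ C, "fro") ** 2

# Assumed normalization (not the slide's method): QR on the basis,
# then SVD of the k x n matrix T C_mixed.
Q, T = np.linalg.qr(B_mixed)
Uk, sk, Vkt = np.linalg.svd(T @ C_mixed, full_matrices=False)
B_hat, C_hat = Q @ Uk, sk[:, None] * Vkt
```

Both factorizations give the same error, but only `B_hat`, `C_hat` satisfy the PCA constraints.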
18
PCA/SVD in Computer Vision
• PCA/SVD has been applied to:
– Recognition (eigenfaces: Turk & Pentland, 1991; Sirovich & Kirby, 1987; Leonardis & Bischof, 2000; Gong et al., 2000; McKenna et al., 1997a)
– Parameterized motion models (Yacoob & Black, 1999; Black et al., 2000; Black, 1999; Black & Jepson, 1998)
– Appearance/shape models (Cootes & Taylor, 2001; Cootes et al., 1998; Pentland et al., 1994; Jones & Poggio, 1998; Casia & Sclaroff, 1999; Black & Jepson, 1998; Blanz & Vetter, 1999; Cootes et al., 1995; McKenna et al., 1997; de la Torre et al., 1998b)
– Dynamic appearance models (Soatto et al., 2001; Rao, 1997; Orriols & Binefa, 2001; Gong et al., 2000)
– Structure from Motion (Tomasi & Kanade, 1992; Bregler et al., 2000; Sturm & Triggs, 1996; Brand, 2001)
– Illumination based reconstruction (Hayakawa, 1994)
– Visual servoing (Murase & Nayar, 1995; Murase & Nayar, 1994)
– Visual correspondence (Zhang et al., 1995; Jones & Malik, 1992)
– Camera motion estimation (Hartley, 1992; Hartley & Zisserman, 2000)
– Image watermarking (Liu & Tan, 2000)
– Signal processing (Moonen & de Moor, 1995)
– Neural approaches (Oja, 1982; Sanger, 1989; Xu, 1993)
– Bilinear models (Tenenbaum & Freeman, 2000; Marimont & Wandell, 1992)
– Direct extensions (Welling et al., 2003; Penev & Atick, 1996)
19
1-Robust PCA
• Two types of outliers:
– Sample outliers (Xu & Yuille, 1995)
– Intra-sample outliers (de la Torre & Black, 2001b; Skocaj & Leonardis, 2003)
• Standard PCA solution (noisy data):
20
Robust PCA
• Using robust statistics (de la Torre & Black, 2001b; de la Torre & Black, 2003a):
$E_{rpca}(B, C, \mu) = \sum_{i=1}^{n} \sum_{p=1}^{d} \rho\big(d_{pi} - \mu_p - \sum_{j=1}^{k} b_{pj}c_{ji},\ \sigma_p\big)$
– The argument of $\rho$ is the pixel residual: the data minus the mean ($\mu$) and the basis (B) times the coefficients (c).
[Figure: robust error function compared with the quadratic one, with the outlier region marked.]
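To make the effect of the robust error function concrete, here is a small NumPy sketch of the energy above. The slide does not state which $\rho$ is used; the Geman-McClure function below is an assumption chosen for illustration, and all names and data are hypothetical.

```python
import numpy as np

def geman_mcclure(x, sigma):
    """Assumed robust rho: roughly quadratic near 0, saturating at 1 for outliers."""
    return x**2 / (x**2 + sigma**2)

def robust_energy(D, B, C, mu, sigma):
    """Sum of rho over all pixel residuals d_pi - mu_p - sum_j b_pj c_ji."""
    residual = D - mu[:, None] - B @ C            # d x n pixel residuals
    return geman_mcclure(residual, sigma[:, None]).sum()

def quadratic_energy(D, B, C, mu):
    """Standard least-squares energy for comparison."""
    return ((D - mu[:, None] - B @ C) ** 2).sum()

rng = np.random.default_rng(2)
d, n, k = 6, 40, 2
B = rng.standard_normal((d, k))
C = rng.standard_normal((k, n))
mu = rng.standard_normal(d)
D = mu[:, None] + B @ C + 0.01 * rng.standard_normal((d, n))
sigma = np.full(d, 0.5)

D_out = D.copy()
D_out[0, 0] += 100.0                              # one intra-sample outlier

robust_jump = robust_energy(D_out, B, C, mu, sigma) - robust_energy(D, B, C, mu, sigma)
quad_jump = quadratic_energy(D_out, B, C, mu) - quadratic_energy(D, B, C, mu)
```

A single corrupted pixel changes the robust energy by less than 1, whereas the quadratic energy grows by roughly the squared size of the corruption.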
21
Numerical Problems
• No closed form solution in terms of an eigen-equation.
• Deflation approaches do not hold.
– $Au_1 = \lambda_1 u_1$: first eigenvector with highest eigenvalue.
– $A' = A - \lambda_1 u_1 u_1^T$, $A'u_2 = \lambda_2 u_2$: second eigenvector with highest eigenvalue.
• In the robust case all the basis have to be computed simultaneously (including the mean).
22
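How to Optimize it?
$E_{rpca}(B, C, \mu) = \sum_{i=1}^{n} \sum_{p=1}^{d} \rho\big(d_{pi} - \mu_p - \sum_{j=1}^{k} b_{pj}c_{ji},\ \sigma_p\big)$
• Normalized Gradient descent
$B^{n+1} = B^n - [H_b]^{-1} \circ \frac{\partial E_{rpca}}{\partial B}$, with $H_{b_i} = \max \mathrm{diag}\big(\frac{\partial^2 E_{rpca}}{\partial b_i \partial b_i^T}\big)$
$C^{n+1} = C^n - [H_c]^{-1} \circ \frac{\partial E_{rpca}}{\partial C}$, with $H_{c_i} = \max \mathrm{diag}\big(\frac{\partial^2 E_{rpca}}{\partial c_i \partial c_i^T}\big)$
• Deterministic annealing methods to avoid local minima (Blake & Zisserman, 1987).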
23
Example
Statistical outlier
• Small region
• Short amount of time
24
Robust PCA
[Figure panels: Original, PCA, RPCA, Outliers]
25
Structure from Motion
More work on Robust PCA
• Robust estimation of coefficients (Black & Jepson, 1998; Leonardis & Bischof, 2000; Ke & Kanade, 2004)
• Robust estimation of basis and coefficients (Gabriel & Odoro, 1984; Croux & Filzmoser, 1981; Skocaj et al., 2002; Skocaj & Leonardis, 2003; de la Torre & Black, 2001b; de la Torre & Black, 2003a)
• Other Robust PCA techniques (sample outliers) (Campbell, 1980; Ruymagaart, 1981; Xu & Yuille, 1995)
26
1- PCA with Uncertainty and Missing Data
• Adding uncertainty:
$E_2(B, C) = \|W \circ (D - BC)\|_F^2 = \sum_{i=1}^{d} \sum_{j=1}^{n} \big(w_{ij}(d_{ij} - \sum_{s=1}^{k} b_{is}c_{sj})\big)^2$, where $\circ$ is the Hadamard product and $W \in \mathbb{R}^{d \times n}$.
• Missing data: $w_{ij} = 0$.
• If weights are separable, $W = w_r w_c^T$ with $w_r = [w_1^r \dots w_d^r]^T$ and $w_c = [w_1^c \dots w_n^c]^T$, there is a closed-form solution.
– Generalized SVD (Greenacre, 1984; Irani & Anandan, 2000)
27
General Case
• For arbitrary weights no closed-form solution (Wiberg, 1976; de la Torre & Black, 2003a).
$E_2(B, C) = \|W \circ (D - BC)\|_F^2 = \sum_{i=1}^{n} (d_i - Bc_i)^T \mathrm{diag}(w_i)^2 (d_i - Bc_i) = \sum_{p=1}^{d} (d^p - C^Tb^p)^T \mathrm{diag}(w^p)^2 (d^p - C^Tb^p)$
where $w_i$ is the i-th column of W and $d^p$, $b^p$, $w^p$ are the p-th rows of D, B and W written as column vectors.
• Alternated least squares algorithms
– Slow convergence, easy implementation.
• Damped Newton Algorithm (Buchanan & Fitzgibbon, 2005)
– Fast convergence.
– With $v = [\mathrm{vec}(B); \mathrm{vec}(C)]$, $g = \partial E_2 / \partial v$ and $H = \partial^2 E_2 / \partial v^2$: repeat $v^{(n+1)} = v^{(n)} - (H + \lambda I)^{-1} g$, rescaling $\lambda$ by a factor of 10 (down after a step that decreases $E_2$, up otherwise), until convergence.
– $H + \lambda I$ positive definite.
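The alternated least squares option can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: W multiplies the residual entry-wise (so a zero marks a missing entry), a tiny ridge term is added purely for numerical stability, and the function name `weighted_als` and the test data are hypothetical.

```python
import numpy as np

def weighted_als(D, W, k, n_iter=50, seed=0):
    """Alternated least squares for min ||W o (D - B C)||_F^2."""
    d, n = D.shape
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((d, k))
    C = np.zeros((k, n))
    W2 = W**2                                     # squared entry-wise weights
    ridge = 1e-9 * np.eye(k)                      # assumption: tiny ridge for stability
    for _ in range(n_iter):
        for i in range(n):                        # solve for column c_i with B fixed
            Bw = B * W2[:, i:i + 1]
            C[:, i] = np.linalg.solve(B.T @ Bw + ridge, Bw.T @ D[:, i])
        for p in range(d):                        # solve for row b^p with C fixed
            Cw = C * W2[p]
            B[p] = np.linalg.solve(Cw @ C.T + ridge, Cw @ D[p])
    return B, C

rng = np.random.default_rng(0)
D_true = rng.standard_normal((15, 2)) @ rng.standard_normal((2, 25))
W = (rng.random(D_true.shape) > 0.3).astype(float)   # 0 marks a missing entry
B, C = weighted_als(D_true * W, W, k=2, n_iter=100)

fit_error = np.linalg.norm(W * (D_true - B @ C)) ** 2
base_error = np.linalg.norm(W * D_true) ** 2
```

Each inner solve is a small k×k weighted normal equation, which is what makes the method easy to implement; convergence can be slow, as the slide notes.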
28
Related work
• Iterative (Wiberg, 1976; Shum et al., 1995; Morris & Kanade, 1998; Aanæs et al., 2002; Guerreiro & Aguiar, 2002)
• Closed-form (Aguiar & Moura, 1999; Irani & Anandan, 2000)
• Power factorization (Hartley & Schaffalitzky, 2003)
• Bayesian estimation (Torresani & Bregler, 2004)
29
2- Parameterized Component Analysis (PaCA) (de la Torre & Black, 2003b)
• Learn a subspace invariant to geometric transformations?
• Data has to be geometrically normalized
– Tedious manual cropping.
– Inaccuracies due to matching ambiguities.
– Hard to achieve sub-pixel accuracy.
30
Error function for PaCA
$E(B, C, A) = \sum_{t=1}^{T} \|d_t(f(x, a_t)) - Bc_t\|_{W_t}^2$ + regularization
– $f(x, a_t)$: motion (warping) with parameters $a_t$.
– $Bc_t$: basis (B) and coefficients (c).
– Regularization: quadratic penalties $c_t^T\Gamma_c c_t$ and $a_t^T\Gamma_a a_t$ on the coefficients and the motion parameters.
31
EigenEye Learning
32
More examples
• USPS dataset.
– Random selection of 100 images (16×16 pixels).
– Incrementally update until 80% of the energy is preserved.
[Figure panels: Original, Congealing, PaK-PCA]
33
Improving facial landmark labeling
• Hand label (red dots), PaK-PCA label (yellow)
34
More on Parameterized CA
• Probabilistic model (Frey & Jojic, 1999a; Frey & Jojic, 1999b; Williams & Titsias, 2004)
– Search scales exponentially with the number of motion parameters
• Other continuous approaches (Schweitzer, 1999; Rao, 1999; Shashua et al., 2002; Cox et al. 2008)
• Invariant clustering (Fitzgibbon & Zisserman, 2003)
• Non-rigid motion (Baker et al., 2004)
• Invariant recognition (Black & Jepson, 1998)
• Invariant support vector machines (Avidan, 2001)
• Parameterized Kernel Component Analysis (De la Torre, 2008)
35
3- Filtered Component Analysis (de la Torre et al., 2007b)
1) No local minimum in the expected place.
2) Many local minima.
36
Multi-band representation
• Texture classification (Nunes et al. ‘03, Freeman, Zalesny & Van Gool, Leung & Malik ‘01, Cula & Dana ‘01, Varma & Zisserman ‘02, De Bonet ’97, Heeger & Bergen ’95, Portilla & Simoncelli ’00, Zhu et al. ‘98)
• Face recognition (Wang et al. ’03, Hie et al. ’04, Wiskott et al. ’97, Lades et al. ’93, Wechler et al. ’02, Zhao et al. ‘98)
• Filters (Gabor, Wavelets, Volterra, Fourier transform, …)
– Convolution
37
Multi-band representation
1) Global minimum in the expected place.
2) Distance between global and other minima is larger.
38
Filtered Component Analysis (FCA)
[Figure: images convolved with a bank of filters]
• Energy $E(F_1, \dots, F_f)$: a sum over the filters $F_f$ of squared norms $\|F_f * (d_i - \mu)\|_2^2$ of the filtered (convolved) mean-subtracted images, with separate sums over the n images and over the background samples.
• Constraints on the filters:
– $\mathrm{vec}(F_i)^T \mathrm{vec}(F_i) = 1$: no trivial solution (0).
– $\mathrm{vec}(F_i)^T \mathrm{vec}(F_j) = 0\ \forall i \neq j$: no overlap between filters.
39
Robustness of FCA
Training: 100 images. Testing: 120 images.

                                   Gray     FCA (4)   Gabor (4)
Correct global minimum             41 %     74 %      62 %
Correct to 2nd minimum distance    14.59    26        19.68
Average number of local minima     3.28     1.4       1.92
40
Other work
• Incremental PCA (de la Torre et al., 1998b; Ross et al., 2004; Brand, 2002; Skocaj & Leonardis, 2003; Champagne & Liu, 1998; A. Levy, 2000)
• Mixture of subspaces (Vidal et al., 2003; Leonardis et al., 2002)
• Changing the margin in SVM (Ashraf and Lucey 2010)
• Exponential family PCA (Collins et al. 01)
41
4- Subspace Regression: From a Single Image to a Subspace
• Traditional subspace methods
• Subspace Regression (Kim et al. 2010)
42
4- Subspace Regression (II)
[Figure: for each training subject (subj=1, subj=2, …, subj=i, …), a frontal image (s=0) is paired with that subject's subspace; for a test image (s=0) the subspace is unknown.]
• Predict a subspace from a single image
43
Subspace Regression (II)
• Generated samples for each pose
[Figure: basis images b1, b2, b3, b4, b5]
• Optimization problem
44
Experiment I
Error measure           Baseline I (img -> img)   Baseline II (img -> subsp)   Subspace Regression
Matlab®’s subspace()    1.3507 (1.2312)           1.4088 (1.1645)              1.0860 (1.0651)
45
Experiment II
• Predicting a Subspace for Illumination
– CMU PIE data set
– 60 aligned subjects
– 19 different illuminations
46
Subspace tracking
• Template Matching: 42.99
• IVT-SS: 38.41
• Subspace Regression: 37.98
47
5-Kernel PCA
$\phi: (x_1, x_2) \mapsto (z_1, z_2, z_3) = (x_1^2, \sqrt{2}x_1x_2, x_2^2)$
[Figure: input space and feature space]
• The kernel defines an implicit mapping (usually high dimensional and non-linear) from input to feature space, so the data becomes linearly separable.
• Computation in the feature space can be costly because it is (usually) high dimensional
– The feature space is typically infinite-dimensional!
48
Kernel Methods
• Suppose $\phi(.)$ is given as follows
• An inner product in the feature space is
• So, if we define the kernel function as follows, there is no need to carry out $\phi(.)$ explicitly
• This use of a kernel function to avoid carrying out $\phi(.)$ explicitly is known as the kernel trick. Any linear algorithm that can be expressed by inner products can be made nonlinear by going to the feature space.
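As a concrete instance of the kernel trick, the sketch below uses the degree-2 mapping $(x_1, x_2) \mapsto (x_1^2, \sqrt{2}x_1x_2, x_2^2)$ and checks that the inner product in feature space equals $(x^Ty)^2$ computed in input space. The choice of this particular mapping and the function names are assumptions made for illustration.

```python
import numpy as np

def phi(x):
    """Explicit feature map (x1, x2) -> (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = x
    return np.array([x1**2, np.sqrt(2) * x1 * x2, x2**2])

def poly_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2, computed in input space."""
    return float(np.dot(x, y)) ** 2

x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])
explicit = float(phi(x) @ phi(y))   # inner product in feature space
implicit = poly_kernel(x, y)        # same value without forming phi
```

Here both quantities equal 1, since $x^Ty = 3 - 2 = 1$; the kernel evaluation never constructs the 3-dimensional feature vectors.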
49
Kernel PCA (Scholkopf et al., 1998)
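A minimal NumPy sketch of kernel PCA follows, assuming samples are stored as columns and a Gaussian kernel is used; the function names, the kernel choice and the value of `gamma` are illustrative assumptions rather than details from the slides. The kernel matrix is centered in feature space, and the training samples are projected onto the leading components.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian kernel matrix for the columns of X (d x n)."""
    sq = (X**2).sum(axis=0)
    dist2 = sq[:, None] + sq[None, :] - 2 * X.T @ X
    return np.exp(-gamma * np.maximum(dist2, 0.0))

def kernel_pca(K, k):
    """Project the training samples onto the top-k kernel principal components."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n           # centering in feature space
    Kc = H @ K @ H
    evals, alpha = np.linalg.eigh(Kc)             # ascending eigenvalues
    evals, alpha = evals[::-1][:k], alpha[:, ::-1][:, :k]
    return alpha * np.sqrt(evals), evals          # n x k projections

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 20))
Z, evals = kernel_pca(rbf_kernel(X, gamma=0.5), k=2)
```

With a linear kernel `K = X.T @ X`, the same function reproduces ordinary PCA scores up to sign, which is a convenient sanity check.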
50
Generative Models: $D \approx BC$
• Principal Component Analysis/Singular Value Decomposition
1) Robust PCA/SVD, PCA with uncertainty and missing data.
2) Parameterized PCA
3) Filtered Component Analysis
4) Subspace regression
5) Kernel PCA
• K-means and spectral clustering
6) Aligned Cluster Analysis (ACA)
• Non-Negative Matrix Factorization
• Independent Component Analysis.
51
The Clustering Problem
• Partition the data set in c-disjoint “clusters” of data points.
• Number of possible partitions:
$S(n, c) = \frac{1}{c!} \sum_{i=1}^{c} (-1)^{c-i} \binom{c}{i} i^n$
• NP-hard and approximate algorithms (k-means, hierarchical clustering, mog, …)
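The number of possible partitions is the Stirling number of the second kind, and a short sketch shows how quickly it grows. The function name `num_partitions` is illustrative.

```python
from math import comb, factorial

def num_partitions(n, c):
    """Stirling number of the second kind: ways to split n points into c non-empty clusters."""
    total = sum((-1) ** (c - i) * comb(c, i) * i**n for i in range(1, c + 1))
    return total // factorial(c)

small = num_partitions(4, 2)     # 7 ways to split 4 points into 2 clusters
medium = num_partitions(10, 4)   # 34105 ways to split 10 points into 4 clusters
```

Even for 10 points and 4 clusters there are 34105 partitions, which is why exhaustive search is ruled out and approximate algorithms are used.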
52
K-means (Ding et al., ‘02, Torre et al ‘06)
$E_0(M, G) = \|D - MG^T\|_F$
[Figure: ten numbered 2-D points (x, y) grouped into clusters, with the corresponding D, M and $G^T$ matrices.]
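The matrix form $\|D - MG^T\|_F$ can be made concrete with Lloyd's algorithm written in terms of an indicator matrix G. This is a hedged sketch: the function name `kmeans`, the initialization from randomly chosen samples and the guard for empty clusters are assumptions for illustration.

```python
import numpy as np

def kmeans(D, c, n_iter=20, seed=0):
    """Lloyd's algorithm with an n x c indicator matrix G (columns of D are samples)."""
    d, n = D.shape
    rng = np.random.default_rng(seed)
    M = D[:, rng.choice(n, size=c, replace=False)]            # initial centres
    for _ in range(n_iter):
        dist2 = ((D[:, :, None] - M[:, None, :]) ** 2).sum(axis=0)   # n x c distances
        G = np.eye(c)[dist2.argmin(axis=1)]                   # exactly one 1 per row
        counts = G.sum(axis=0)
        M_new = (D @ G) / np.maximum(counts, 1)               # M = D G (G^T G)^-1
        M = np.where(counts > 0, M_new, M)                    # keep centre if cluster is empty
    return M, G

rng = np.random.default_rng(5)
D = np.hstack([rng.normal(0.0, 0.5, (2, 20)), rng.normal(10.0, 0.5, (2, 20))])
M, G = kmeans(D, c=2)
cost = np.linalg.norm(D - M @ G.T, "fro") ** 2
```

Because each row of G has a single 1, the column of $MG^T$ for sample i is simply the centre of its cluster, so `cost` is the usual sum of squared distances to the assigned centres.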
53
Spectral Clustering (Dhillon et al., ‘04, Zass & Shashua, 2005; Ding et al., 2005, De la Torre et al ‘06)
$E_0(M, C) = \|(\Gamma - MC^T)W_c\|_F$, with $\Gamma = \phi(D) = [\phi(d_1)\ \phi(d_2)\ \dots\ \phi(d_n)]$
• Affinity Matrix
• Normalized Cuts (Shi & Malik ’00)
• Ratio-cuts (Hagen & Kahng ’02)
54
6- Aligned Cluster Analysis (ACA)
• Mining facial expression
55
Problem
• Mining facial expression for one subject
• Summarization
• Visualization
• Indexing
56
Problem
• Mining facial expression for one subject
[Figure labels: Looking up, Sleeping, Smiling, Looking forward, Waking up]
• Summarization
• Visualization
• Indexing
57
Problem
• Mining facial expression of one subject
• Summarization
• Embedding
• Indexing
58
Problem
• Mining facial expression for one subject
• Summarization
• Embedding
• Indexing
59
k-means and kernel k-means
(MacQueen 67, Ding et al. 02, Dhillon et al. 04, Zass and Shashua 05, De la Torre 06)
• k-means: $J(M, G) = \|X - MG^T\|_F^2$
• Kernel k-means: $J(G) = tr\big((I_n - G(G^TG)^{-1}G^T)K\big)$, with $K = \phi(X)^T\phi(X)$
[Figure: ten numbered 2-D points (x, y) grouped into clusters.]
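The link between the two costs can be verified numerically: once the centres are set to their optimal value $M = XG(G^TG)^{-1}$, the k-means error equals the trace expression, here checked with a linear kernel $K = X^TX$. The data and the arbitrary label assignment below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((4, 12))                   # d x n data, samples as columns
labels = np.arange(12) % 3                         # arbitrary assignment to 3 clusters
G = np.eye(3)[labels]                              # n x c indicator matrix

M = X @ G @ np.linalg.inv(G.T @ G)                 # optimal centres for this G
J_direct = np.linalg.norm(X - M @ G.T, "fro") ** 2

K = X.T @ X                                        # linear kernel
L = np.eye(12) - G @ np.linalg.inv(G.T @ G) @ G.T  # projection onto the complement of G
J_trace = np.trace(L @ K)
```

Since the trace form only touches K, replacing the linear kernel by any other kernel matrix gives kernel k-means without ever forming the centres explicitly.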
60
Problem formulation for ACA (I)
$X = [X_{[s_1,s_2)}, X_{[s_2,s_3)}, \dots, X_{[s_m,s_{m+1})}]$
• Labels (G)
• Start and end of the segments (s)
$J_{aca}(M, G, s) = \|\psi(X_{[s_1,s_2)}, X_{[s_2,s_3)}, \dots, X_{[s_m,s_{m+1})}) - MG^T\|_F^2$
61
Problem formulation for ACA (II)
$J_{aca}(M, G, S) = \|\psi(X_{[s_1,s_2)}, X_{[s_2,s_3)}, \dots, X_{[s_m,s_{m+1})}) - MG^T\|_F^2 = \sum_{c=1}^{k} \sum_{i=1}^{m} g_{ci} \|\psi(X_{[s_i,s_{i+1})}) - m_c\|_2^2$
• The distance between a segment $X_{[s_i,s_{i+1})}$ and a cluster centre $m_c$ is computed with the Dynamic Time Alignment Kernel (Shimodaira et al. 01).
62
Matrix formulation for ACA
• k-means: $J_{km} = tr(LK)$ with $L = I_n - G(G^TG)^{-1}G^T$ and $K = \phi(X)^T\phi(X)$
• ACA: $J_{aca} = tr((L \circ W)K)$ with $L = I_n - HG(G^TG)^{-1}G^TH^T$
– $W \in \mathbb{R}^{23 \times 23}$ (samples × samples)
– $G \in \{0,1\}^{7 \times 3}$ (segments × clusters)
– $H \in \{0,1\}^{23 \times 7}$ (samples × segments)
63
Optimizing ACA (forward step)
• Efficient Dynamic Programming
[Figure: forward pass at i = 23, i = 25 and i = 29, with candidate segment costs and the maximum segment length $w_{max}$.]
64
Optimizing ACA (backward step)
• Complexity: $O(n^2 w_{max})$
65
Honey bee dance data (Oh et al. 08)
Three behaviors: 1-waggle, 2-left turn, 3-right turn

                                 Seq 1   Seq 2   Seq 3   Seq 4   Seq 5   Seq 6
ACA                              0.845   0.925   0.600   0.922   0.878   0.928
PS-SLDS (Oh et al 08)            0.759   0.924   0.831   0.934   0.904   0.910
HDP-VAR(1)-HMM (Fox et al 08)    0.465   0.441   0.456   0.832   0.932   0.887
Spectral Clustering              0.698   0.631   0.509   0.671   0.577   0.649
66
Facial image features
• Active Appearance Models (Baker and Matthews ‘04)
• Image features
– Shape and appearance
– Upper face and lower face
67
Facial event discovery across subjects
• Cohn-Kanade: 30 people and five different expressions (surprise, joy, sadness, fear, anger)
68
Facial event discovery across subjects
• Cohn-Kanade: 30 people and five different expressions (surprise, joy, sadness, fear, anger)
• 10 sets of 30 people
– ACA: 0.87 (.05)
– Spectral Clustering (SC): 0.56 (.04)
Unsupervised facial event discovery
70
Clustering human motion
71
Clustering human motion (II)
72
Generative Models: $D \approx BC$
• Principal Component Analysis/Singular Value Decomposition
1) Robust PCA/SVD, PCA with uncertainty and missing data.
2) Parameterized PCA
3) Filtered Component Analysis
4) Kernel PCA
• K-means and spectral clustering
5) Aligned Cluster Analysis (ACA)
• Non-Negative Matrix Factorization
• Independent Component Analysis.
73
“Intercorrelations among variables are the bane of the multivariate researcher’s struggle for meaning”
Cooley and Lohnes, 1971
74
Part-based Representation
• The firing rates of neurons are never negative.
• Independent representations.
NMF & ICA
75
Non-negative Matrix Factorization
• Positive factorization.
• Leads to part-based representation.
$E(B, C) = \|D - BC\|_F$, $B, C \geq 0$
76
Nonnegative Factorization (Lee & Seung, 1999; Lee & Seung, 2000)
$\min_{B \geq 0, C \geq 0} F = \sum_{ij} |d_{ij} - (BC)_{ij}|^2$
• Derivatives (up to a constant factor):
$\frac{\partial F}{\partial C_{ij}} \propto (B^TBC)_{ij} - (B^TD)_{ij}$, $\frac{\partial F}{\partial B_{ij}} \propto (BCC^T)_{ij} - (DC^T)_{ij}$
• Inference: $C_{ij} \leftarrow C_{ij} \frac{(B^TD)_{ij}}{(B^TBC)_{ij}}$
• Learning: $B_{ij} \leftarrow B_{ij} \frac{(DC^T)_{ij}}{(BCC^T)_{ij}}$
• Multiplicative algorithm can be interpreted as diagonally rescaled gradient descent.
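The multiplicative updates of Lee & Seung can be sketched as follows. This is a minimal NumPy illustration: the function name `nmf`, the random initialization and the small `eps` added to the denominators to avoid division by zero are assumptions, not details from the slides.

```python
import numpy as np

def nmf(D, k, n_iter=200, seed=0, eps=1e-12):
    """Multiplicative updates for min ||D - B C||_F^2 subject to B >= 0, C >= 0."""
    rng = np.random.default_rng(seed)
    d, n = D.shape
    B = rng.random((d, k)) + 0.1                  # positive initialization
    C = rng.random((k, n)) + 0.1
    errors = []
    for _ in range(n_iter):
        C *= (B.T @ D) / (B.T @ B @ C + eps)      # inference step (coefficients)
        B *= (D @ C.T) / (B @ C @ C.T + eps)      # learning step (basis)
        errors.append(np.linalg.norm(D - B @ C, "fro") ** 2)
    return B, C, errors

rng = np.random.default_rng(1)
D = rng.random((20, 3)) @ rng.random((3, 30))     # non-negative rank-3 data
B, C, errors = nmf(D, k=3)
```

Because every update multiplies by a ratio of non-negative quantities, B and C stay non-negative without any explicit projection step.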
77
Generative Models: $D \approx BC$
• Principal Component Analysis/Singular Value Decomposition
1) Robust PCA/SVD, PCA with uncertainty and missing data.
2) Parameterized PCA
3) Filtered Component Analysis
4) Kernel PCA
• K-means and spectral clustering
5) Aligned Cluster Analysis (ACA)
• Non-Negative Matrix Factorization
• Independent Component Analysis.
78
Independent Component Analysis
• We need more than second order statistics to represent the signal.
79
ICA (Hyvärinen et al., 2001)
$D = BC$, $C \approx S = WD$, $W \approx B^{-1}$
• Look for $s_i$ that are independent.
• PCA finds uncorrelated variables; the independent components have non-Gaussian distributions.
• Uncorrelated: $E(s_is_j) = E(s_i)E(s_j)$
• Independent: $E(g(s_i)f(s_j)) = E(g(s_i))E(f(s_j))$ for any non-linear f, g
[Figure: PCA vs. ICA]
80
ICA vs PCA
81
Many optimization criteria
• Minimize high order moments, e.g. kurtosis: $kurt(W) = E\{s^4\} - 3(E\{s^2\})^2$
• Many other information criteria.
• Also an error function (Olshausen & Field, 1996):
$\sum_{i=1}^{n} \|d_i - Bc_i\| + \sum_{i=1}^{n} S(c_i)$, with S a sparseness term (e.g. $S = |c_i|$)
• Other sparse PCA (Chennubhotla & Jepson, 2001b; Zou et al., 2005; d'Aspremont et al., 2004).
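The kurtosis criterion can be illustrated by evaluating $E\{s^4\} - 3(E\{s^2\})^2$ on samples from three distributions. The sample size, seed and the choice of distributions are assumptions for illustration.

```python
import numpy as np

def kurtosis(s):
    """kurt(s) = E{s^4} - 3 (E{s^2})^2 for a signal s, after removing its mean."""
    s = s - s.mean()
    return np.mean(s**4) - 3 * np.mean(s**2) ** 2

rng = np.random.default_rng(4)
n = 200_000
k_gauss = kurtosis(rng.standard_normal(n))     # close to 0 for a Gaussian
k_laplace = kurtosis(rng.laplace(size=n))      # positive: heavy tails (super-Gaussian)
k_uniform = kurtosis(rng.uniform(-1, 1, n))    # negative: flat density (sub-Gaussian)
```

The Gaussian has kurtosis near zero, so moving the kurtosis of the recovered signals away from zero is one way of looking for non-Gaussian, and hence potentially independent, components.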
82
Basis of natural images
83
Denoising
[Figure panels: Original image, Noisy image (30% noise), Denoise (Wiener filter), ICA]
84
Outline
• Introduction
• Generative models
– Principal Component Analysis (PCA) and extensions
– K-means, spectral clustering and extensions
– Non-negative Matrix Factorization (NMF)
– Independent Component Analysis (ICA)
• Discriminative models
– Linear Discriminant Analysis (LDA) and extensions
– Oriented Component Analysis (OCA)
– Canonical Correlation Analysis (CCA) and extensions
• A unifying view of CA
• Standard extensions of linear models
– Latent variable models.
– Tensor factorization
85
Discriminative Models
• Linear Discriminant Analysis (LDA)
  7) Discriminative Cluster Analysis
  8) Multimodal Oriented Discriminant Analysis
• Oriented Component Analysis (OCA)
• Canonical Correlation Analysis (CCA)
  9) Dynamical Coupled Component Analysis
  10) Canonical Time Warping
86
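Linear Discriminant Analysis (LDA)
(Fisher, 1938; Mardia et al., 1979; Bishop, 1995)
$\mathbf{S}_b = \sum_{i=1}^{C}\sum_{j=1}^{C} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)^T$
$\mathbf{S}_t = \mathbf{D}\mathbf{D}^T = \sum_{i=1}^{n} \mathbf{d}_i\mathbf{d}_i^T$
$\mathbf{S}_w = \sum_{i=1}^{C}\sum_{j=1}^{c_i} (\mathbf{d}_j - \boldsymbol{\mu}_i)(\mathbf{d}_j - \boldsymbol{\mu}_i)^T$
$J(\mathbf{B}) = \dfrac{|\mathbf{B}^T\mathbf{S}_b\mathbf{B}|}{|\mathbf{B}^T\mathbf{S}_t\mathbf{B}|}$, $\mathbf{S}_b\mathbf{B} = \mathbf{S}_t\mathbf{B}\boldsymbol{\Lambda}$
• Optimal linear dimensionality reduction if classes are Gaussian with equal covariance matrix.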
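The generalized eigenproblem $\mathbf{S}_b\mathbf{B} = \mathbf{S}_t\mathbf{B}\boldsymbol{\Lambda}$ can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's implementation: the function name `lda`, the toy data, and the assumption that $\mathbf{S}_t$ is invertible (more samples than dimensions) are all choices made for the example.

```python
import numpy as np

def lda(D, labels, k):
    """D is d x n (one sample per column); returns a d x k projection B."""
    D = D - D.mean(axis=1, keepdims=True)       # zero-mean data
    classes = np.unique(labels)
    means = np.stack([D[:, labels == c].mean(axis=1) for c in classes], axis=1)

    # Between-class scatter: sum over all pairs of class means.
    diffs = means[:, :, None] - means[:, None, :]           # d x C x C
    S_b = np.einsum("aij,bij->ab", diffs, diffs)
    # Total scatter of the centered data.
    S_t = D @ D.T

    # S_b B = S_t B Lambda, solved as S_t^{-1} S_b b = lambda b
    # (assumes S_t is invertible, i.e. n > d).
    evals, evecs = np.linalg.eig(np.linalg.solve(S_t, S_b))
    order = np.argsort(-evals.real)[:k]
    return evecs[:, order].real

# Toy data (made up): two classes separated along the second axis,
# while most of the variance lies along the first axis.
rng = np.random.default_rng(0)
n = 400
labels = np.repeat([0, 1], n)
D = np.vstack([
    5.0 * rng.standard_normal(2 * n),
    np.where(labels == 0, -1.0, 1.0) + 0.3 * rng.standard_normal(2 * n),
])
B = lda(D, labels, k=1)
b = B[:, 0] / np.linalg.norm(B[:, 0])
print("LDA direction:", b)
```

On this data the first principal component would follow the high-variance first axis, whereas the LDA direction aligns with the second axis, where the classes actually differ.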
87
Error function for LDA
(de la Torre & Kanade, 2006)
$E_{LDA}(\mathbf{A},\mathbf{B}) = \|(\mathbf{G}^T\mathbf{G})^{-1/2}(\mathbf{G}^T - \mathbf{B}\mathbf{A}^T\mathbf{D})\|_F$
• $\mathbf{D} = [\mathbf{d}_1\ \mathbf{d}_2\ \ldots\ \mathbf{d}_n]$, with d = pixels (dimension of the space) and n = samples
• $\mathbf{G} \in \Re^{n \times c}$ is the label matrix, $g_{ij} \in \{0,1\}$, $\mathbf{G}\mathbf{1}_c = \mathbf{1}_n$, c = classes
• Equations: n×c. Unknowns: d×c.
• d >> n: an UNDERDETERMINED system of equations! (over-fitting)
88
7- Discriminative Cluster Analysis (DCA)
(de la Torre & Kanade, 2006)
$E_{DCA}(\mathbf{A},\mathbf{B},\mathbf{G}) = \|(\mathbf{G}^T\mathbf{G})^{-1/2}(\mathbf{G}^T - \mathbf{B}\mathbf{A}^T\mathbf{D})\|_F$
[Figure: 3-D scatter plots (axes X, Y, Z) of the data and its projections, comparing PCA+k-means with DCA]
89
Clustering faces
$\mathbf{G}(\mathbf{G}^T\mathbf{G})^{-1}\mathbf{G}^T$
[Figure: image of the matrix above for the face data; 3-D embeddings obtained with PCA and with DCA]
90
DCA vs. PCA+k-means
[Figure: accuracy versus number of clusters (classes) for DCA and PCA+k-means]
91
8- Multimodal Oriented Discriminant Analysis (MODA)
(de la Torre & Kanade, 2005a)
• How to extend LDA to deal with:
– Model class covariances.
– Multimodal classes.
– Deal efficiently with huge covariance matrices (e.g. 100*100).
92
Multimodality
[Figure: multimodal 3-D data and its 2-D projections obtained with MODA and with LDA]
93
MODA
• B that MAXIMIZES the Kullback-Leibler divergence between clusters among classes.
[Equation: a sum over pairs of classes $(i, j)$ and over clusters $r_1 \in C_i$, $r_2 \in C_j$ of a trace term in $\mathbf{B}$, the cluster means $\boldsymbol{\mu}_i^{r_1}$, $\boldsymbol{\mu}_j^{r_2}$ and the cluster covariances $\boldsymbol{\Sigma}_i^{r_1}$, $\boldsymbol{\Sigma}_j^{r_2}$]
• 1 mode per class and equal covariances: equivalent to LDA.
94
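Optimization
• Hard optimization problem
$J(\mathbf{B}) = \sum_{i} tr\big((\mathbf{B}^T\boldsymbol{\Sigma}_i\mathbf{B})^{-1}(\mathbf{B}^T\mathbf{A}_i\mathbf{B})\big)$
• Iterative Majorization (Kiers, 1995; Leeuw, 1994)
$J(\mathbf{B}_0) = T(\mathbf{B}_0)$, $J(\mathbf{B}) \le T(\mathbf{B})$
[Figure: the cost $J(\mathbf{B})$ and a majorizing function $T(\mathbf{B})$, with iterates W0 and W1]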
95
Related LDA work
• Face recognition (Belhumeur et al., 1997; Zhao, 2000; Martinez & Kak, 2003)
• Small sample problem (Chen et al., 2000; Yu & Yang, 2001)
• Mixture (Hastie et al., 1995; Zhu & Martinez, 2006)
• Neural approaches (Gallinari et al., 1991; Lowe & Webb, 1991)
• Heteroscedastic discriminant analysis (Kumar & Andreou, 1998; Fukunaga, 1990; Mardia et al., 1979; Saon et al., 2000)
97
Oriented Component Analysis (OCA)
$\max_{\mathbf{b}_{OCA}} \dfrac{\mathbf{b}_{OCA}^T\boldsymbol{\Sigma}_{signal}\mathbf{b}_{OCA}}{\mathbf{b}_{OCA}^T\boldsymbol{\Sigma}_{noise}\mathbf{b}_{OCA}}$
• Generalized eigenvalue problem: $\boldsymbol{\Sigma}_{i}\mathbf{b}_k = \lambda_k\boldsymbol{\Sigma}_{e}\mathbf{b}_k$
• $\mathbf{b}_{OCA}$ is steered by the distribution of noise
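A signal-to-noise quotient of this kind is maximized by the leading generalized eigenvector of the signal and noise covariances. The sketch below illustrates that under stated assumptions: the helper name `oca` and the two toy covariance matrices are invented for the example, and the noise covariance is assumed invertible.

```python
import numpy as np

def oca(Sigma_signal, Sigma_noise, k=1):
    """Top-k generalized eigenvectors of the pair (Sigma_signal, Sigma_noise)."""
    # Sigma_signal b = lambda Sigma_noise b, solved as
    # Sigma_noise^{-1} Sigma_signal b = lambda b.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sigma_noise, Sigma_signal))
    order = np.argsort(-evals.real)[:k]
    B = evecs[:, order].real
    return B / np.linalg.norm(B, axis=0), evals.real[order]

# Toy covariances (made up): the signal varies mostly along the first
# axis, but the noise is even stronger there.
Sigma_signal = np.diag([4.0, 1.0])
Sigma_noise = np.diag([16.0, 1.0])

B, lam = oca(Sigma_signal, Sigma_noise)
b = B[:, 0]
ratio = (b @ Sigma_signal @ b) / (b @ Sigma_noise @ b)
print("OCA direction:", b, "signal-to-noise ratio:", ratio)
```

PCA on the signal alone would pick the first axis; the OCA direction moves to the second axis because the noise there is weaker, which is the sense in which the solution is steered by the noise distribution.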
98
Representational Oriented Component Analysis (ROCA)
(de la Torre et al., 2005a)
[Equation: OCA quotients for $\mathbf{b}_k$ (with $\boldsymbol{\Sigma}^1$) and $\mathbf{b}_j$ (with $\boldsymbol{\Sigma}^2$)]
100
Canonical Correlation Analysis (CCA)
(Mardia et al., 1979; Borga, 1998)
• Learn relations between multiple data sets? (e.g. find features in one set related to another data set)
• Given two sets $\mathbf{X} \in \Re^{d_1 \times n}$ and $\mathbf{Y} \in \Re^{d_2 \times n}$, CCA finds the pair of directions $\mathbf{w}_x$ and $\mathbf{w}_y$ that maximize the correlation between the projections (assume zero mean data)
$r = \dfrac{\mathbf{w}_x^T\mathbf{X}\mathbf{Y}^T\mathbf{w}_y}{\sqrt{\mathbf{w}_x^T\mathbf{X}\mathbf{X}^T\mathbf{w}_x\ \mathbf{w}_y^T\mathbf{Y}\mathbf{Y}^T\mathbf{w}_y}}$
• Several ways of optimizing it:
$\mathbf{A} = \begin{bmatrix} \mathbf{0} & \mathbf{X}\mathbf{Y}^T \\ \mathbf{Y}\mathbf{X}^T & \mathbf{0} \end{bmatrix}$, $\mathbf{B} = \begin{bmatrix} \mathbf{X}\mathbf{X}^T & \mathbf{0} \\ \mathbf{0} & \mathbf{Y}\mathbf{Y}^T \end{bmatrix}$, both in $\Re^{(d_1+d_2)\times(d_1+d_2)}$, $\mathbf{w} = \begin{bmatrix} \mathbf{w}_x \\ \mathbf{w}_y \end{bmatrix}$
• A stationary point of $r$ is the solution to CCA: $\mathbf{A}\mathbf{w} = \lambda\mathbf{B}\mathbf{w}$
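The block eigenproblem $\mathbf{A}\mathbf{w} = \lambda\mathbf{B}\mathbf{w}$ can be tried on synthetic data. The sketch below is a minimal illustration: `cca_first_pair` and the toy data are assumptions for the example, and $\mathbf{X}\mathbf{X}^T$ and $\mathbf{Y}\mathbf{Y}^T$ are assumed invertible.

```python
import numpy as np

def cca_first_pair(X, Y):
    """X is d1 x n, Y is d2 x n, both zero-mean. Returns (w_x, w_y, r)."""
    d1, d2 = X.shape[0], Y.shape[0]
    Cxy = X @ Y.T
    A = np.block([[np.zeros((d1, d1)), Cxy],
                  [Cxy.T, np.zeros((d2, d2))]])
    B = np.block([[X @ X.T, np.zeros((d1, d2))],
                  [np.zeros((d2, d1)), Y @ Y.T]])
    # A w = lambda B w, solved as B^{-1} A w = lambda w.
    evals, evecs = np.linalg.eig(np.linalg.solve(B, A))
    w = evecs[:, np.argmax(evals.real)].real
    w_x, w_y = w[:d1], w[d1:]
    # Correlation between the two projections.
    px, py = w_x @ X, w_y @ Y
    r = (px @ py) / np.sqrt((px @ px) * (py @ py))
    return w_x, w_y, r

# Toy data (made up): a shared latent signal z hidden in the first
# row of X and in the second row of Y; the other rows are pure noise.
rng = np.random.default_rng(0)
n = 2000
z = rng.standard_normal(n)
X = np.vstack([z + 0.1 * rng.standard_normal(n), rng.standard_normal(n)])
Y = np.vstack([rng.standard_normal(n), z + 0.1 * rng.standard_normal(n)])
X -= X.mean(axis=1, keepdims=True)
Y -= Y.mean(axis=1, keepdims=True)

w_x, w_y, r = cca_first_pair(X, Y)
print("w_x:", w_x, "w_y:", w_y, "correlation:", r)
```

The largest eigenvalue equals the first canonical correlation, and the recovered directions put their weight on the rows that carry the shared signal.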
101
9- Dynamic Coupled Component Analysis (DCCA)
(de la Torre & Black, 2001a)
[Figure: Data 1 and Data 2]
• Learning the coupling.
• High dimensional data.
• Limited training data.
102
Solutions?
• PCA independently and general mapping
[Figure: PCA on each data set, followed by a mapping between the two]
• Dependent signals with small energy can be lost.
103
DCCA
[Equation: the energy $E_{cca}$ over $\mathbf{B}$, $\hat{\mathbf{B}}$, $\mathbf{A}$, $\mathbf{C}$, $\boldsymbol{\mu}$ and $\hat{\boldsymbol{\mu}}$ is a sum over the n samples of weighted squared-error terms, labelled Reconstruction, Projection and Dynamics on the slide]
104
Dynamic Coupled Component Analysis
105
10- Canonical Time Warping (CTW)
106
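Canonical Correlation Analysis (CCA)
(Hotelling 1936)
• $\mathbf{X} \in \Re^{d_x \times n}$, $\mathbf{Y} \in \Re^{d_y \times n}$ (different #rows, same #columns)
• CCA minimizes:
$J_{cca}(\mathbf{V}_x, \mathbf{V}_y) = \|\mathbf{V}_x^T\mathbf{X} - \mathbf{V}_y^T\mathbf{Y}\|_F^2$
s.t. $\mathbf{V}_x^T\mathbf{X}\mathbf{X}^T\mathbf{V}_x = \mathbf{V}_y^T\mathbf{Y}\mathbf{Y}^T\mathbf{V}_y = \mathbf{I}_b$
• $\mathbf{V}_x$, $\mathbf{V}_y$: spatial transformation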
107
A least-squares formulation for DTW
• $\mathbf{X} \in \Re^{d \times n_x}$, $\mathbf{Y} \in \Re^{d \times n_y}$ (same #rows, different #columns)
$J_{dtw}(\mathbf{W}_x, \mathbf{W}_y) = \|\mathbf{X}\mathbf{W}_x^T - \mathbf{Y}\mathbf{W}_y^T\|_F^2$
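To make the least-squares view of DTW concrete, the sketch below runs the standard dynamic-programming recursion and then encodes the warping path as two binary matrices, so that the DTW cost can be compared with $\|\mathbf{X}\mathbf{W}_x^T - \mathbf{Y}\mathbf{W}_y^T\|_F^2$. The function name `dtw`, the squared Euclidean frame distance, and the toy sequences are assumptions made for the example.

```python
import numpy as np

def dtw(X, Y):
    """X is d x nx, Y is d x ny. Returns (cost, W_x, W_y), where the binary
    matrices W_x (m x nx) and W_y (m x ny) encode the m steps of the path."""
    nx, ny = X.shape[1], Y.shape[1]
    # Pairwise squared distances between frames.
    dist = ((X[:, :, None] - Y[:, None, :]) ** 2).sum(axis=0)
    acc = np.full((nx + 1, ny + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, nx + 1):
        for j in range(1, ny + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    # Backtrack the optimal path from the last pair of frames.
    path, i, j = [], nx, ny
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    W_x = np.zeros((len(path), nx))
    W_y = np.zeros((len(path), ny))
    for t, (a, b) in enumerate(path):
        W_x[t, a] = 1.0
        W_y[t, b] = 1.0
    return acc[nx, ny], W_x, W_y

# Toy sequences (made up): Y is X with two frames repeated.
X = np.array([[0.0, 1.0, 2.0, 3.0]])
Y = np.array([[0.0, 0.0, 1.0, 2.0, 2.0, 3.0]])
cost, W_x, W_y = dtw(X, Y)
ls_cost = np.linalg.norm(X @ W_x.T - Y @ W_y.T, "fro") ** 2
print("DTW cost:", cost, "least-squares cost:", ls_cost)
```

Each row of `W_x` and `W_y` selects one frame, so the two costs agree by construction; writing DTW this way is what allows it to be combined with a spatial projection.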
108
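Canonical Time Warping (CTW)
Reminder:
$J_{cca}(\mathbf{V}_x, \mathbf{V}_y) = \|\mathbf{V}_x^T\mathbf{X} - \mathbf{V}_y^T\mathbf{Y}\|_F^2$
$J_{dtw}(\mathbf{W}_x, \mathbf{W}_y) = \|\mathbf{X}\mathbf{W}_x^T - \mathbf{Y}\mathbf{W}_y^T\|_F^2$
• $\mathbf{X} \in \Re^{d_x \times n_x}$, $\mathbf{Y} \in \Re^{d_y \times n_y}$ (different #rows, different #columns)
$J_{ctw}(\mathbf{W}_x, \mathbf{W}_y, \mathbf{V}_x, \mathbf{V}_y) = \|\mathbf{V}_x^T\mathbf{X}\mathbf{W}_x^T - \mathbf{V}_y^T\mathbf{Y}\mathbf{W}_y^T\|_F^2$
• $\mathbf{V}_x$, $\mathbf{V}_y$: spatial transformation; $\mathbf{W}_x$, $\mathbf{W}_y$: temporal alignment
s.t. $\mathbf{V}_x^T\mathbf{X}\mathbf{W}_x^T\mathbf{W}_x\mathbf{X}^T\mathbf{V}_x = \mathbf{V}_y^T\mathbf{Y}\mathbf{W}_y^T\mathbf{W}_y\mathbf{Y}^T\mathbf{V}_y = \mathbf{I}_b$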
109
Facial expression alignment
110
Facial expression alignment
111
Aligning human motion
[Figure: two examples, "Boxing" and "Opening a cabinet"]
112
Aligning motion capture and video
114
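The fundamental equation of CA
Given two datasets $\mathbf{D} \in \Re^{d \times n}$ and $\mathbf{X} \in \Re^{x \times n}$:
$E_0(\mathbf{A},\mathbf{B}) = \|\mathbf{W}_r(\boldsymbol{\Gamma} - \mathbf{B}\mathbf{A}^T\boldsymbol{\Upsilon})\mathbf{W}_c\|_F$
• $\boldsymbol{\Gamma} = \phi(\mathbf{D})$, $\boldsymbol{\Upsilon} = \varphi(\mathbf{X})$
• $\mathbf{W}_r$: weights for rows; $\mathbf{W}_c$: weights for columns
• $\mathbf{A}$, $\mathbf{B}$: regression matrices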
115
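Properties of the cost function
• $E_0(\mathbf{A},\mathbf{B})$ has a unique global minimum (Baldi and Hornik-89).
• Closed form solutions for A and B are:
$E_0(\mathbf{A}) = tr\big((\mathbf{A}^T\boldsymbol{\Upsilon}\mathbf{W}_c^2\boldsymbol{\Upsilon}^T\mathbf{A})^{-1}(\mathbf{A}^T\boldsymbol{\Upsilon}\mathbf{W}_c^2\boldsymbol{\Gamma}^T\mathbf{W}_r^2\boldsymbol{\Gamma}\mathbf{W}_c^2\boldsymbol{\Upsilon}^T\mathbf{A})\big)$
$E_0(\mathbf{B}) = tr\big((\mathbf{B}^T\mathbf{W}_r^2\mathbf{B})^{-1}(\mathbf{B}^T\mathbf{W}_r^2\boldsymbol{\Gamma}\mathbf{W}_c^2\boldsymbol{\Upsilon}^T(\boldsymbol{\Upsilon}\mathbf{W}_c^2\boldsymbol{\Upsilon}^T)^{-1}\boldsymbol{\Upsilon}\mathbf{W}_c^2\boldsymbol{\Gamma}^T\mathbf{W}_r^2\mathbf{B})\big)$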
116
Principal Component Analysis (PCA)
(Pearson, 1901; Hotelling, 1933; Mardia et al., 1979; Jolliffe, 1986; Diamantaras, 1996)
• PCA finds the directions of maximum variation of the data based on linear correlation.
• Kernel PCA finds the directions of maximum variation of the data in the feature space.
$(x_1, x_2) \to (z_1, z_2, z_3) = (x_1^2, \sqrt{2}\,x_1x_2, x_2^2)$
117
PCA-Kernel PCA
$E_0(\mathbf{A},\mathbf{B}) = \|\mathbf{W}_r(\boldsymbol{\Gamma} - \mathbf{B}\mathbf{A}^T\boldsymbol{\Upsilon})\mathbf{W}_c\|_F$
• Error function for KPCA: (Eckart & Young, 1936; Gabriel & Zamir, 1979; Baldi & Hornik, 1989; Shum et al., 1995; de la Torre & Black, 2003a)
$E_{kpca}(\mathbf{A},\mathbf{B}) = \|\boldsymbol{\Gamma} - \mathbf{B}\mathbf{A}^T\|_F$, $\boldsymbol{\Gamma} = \phi(\mathbf{D})$
• The primal problem:
$E_{kpca}(\mathbf{B}) = tr\big((\mathbf{B}^T\mathbf{B})^{-1}(\mathbf{B}^T\boldsymbol{\Gamma}\boldsymbol{\Gamma}^T\mathbf{B})\big)$
• The dual problem:
$E_{kpca}(\mathbf{A}) = tr\big((\mathbf{A}^T\mathbf{A})^{-1}(\mathbf{A}^T\boldsymbol{\Gamma}^T\boldsymbol{\Gamma}\mathbf{A})\big)$
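For the linear case (feature map equal to the identity, so the mapped data is just $\mathbf{D}$), the error $\|\mathbf{D} - \mathbf{B}\mathbf{A}^T\|_F$ and the primal/dual pair can be checked with a truncated SVD. This is a sketch under that assumption; the random data and variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 20, 50, 3
D = rng.standard_normal((d, n))
D -= D.mean(axis=1, keepdims=True)       # zero-mean data, one sample per column

# Rank-k factorization D ~ B A^T from the truncated SVD.
U, s, Vt = np.linalg.svd(D, full_matrices=False)
B = U[:, :k]                             # d x k basis
A = Vt[:k].T * s[:k]                     # n x k coefficients

error = np.linalg.norm(D - B @ A.T, "fro") ** 2

# Primal problem: eigenvalues of the d x d matrix D D^T.
# Dual problem: eigenvalues of the n x n Gram (kernel) matrix D^T D.
evals_primal = np.linalg.eigvalsh(D @ D.T)[::-1][:k]
evals_dual = np.linalg.eigvalsh(D.T @ D)[::-1][:k]

print("squared error:", error, "discarded energy:", np.sum(s[k:] ** 2))
print("primal:", evals_primal)
print("dual:  ", evals_dual)
```

The squared error equals the energy in the discarded singular values, and the primal and dual problems share their non-zero eigenvalues. The dual only needs inner products between samples, which is what makes the kernel version possible.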
118
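Linear Discriminant Analysis (LDA)
(Fisher, 1938; Mardia et al., 1979; Bishop, 1995)
$\mathbf{S}_b = \sum_{i=1}^{C}\sum_{j=1}^{C} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)^T$
$\mathbf{S}_t = \mathbf{D}\mathbf{D}^T = \sum_{i=1}^{n} \mathbf{d}_i\mathbf{d}_i^T$
$J(\mathbf{B}) = tr\big((\mathbf{B}^T\mathbf{S}_t\mathbf{B})^{-1}(\mathbf{B}^T\mathbf{S}_b\mathbf{B})\big)$, $\mathbf{S}_b\mathbf{B} = \mathbf{S}_t\mathbf{B}\boldsymbol{\Lambda}$
• Optimal linear dimensionality reduction if classes are Gaussian with equal covariance matrix.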
119
Canonical Correlation Analysis (CCA)
(Fisher 36; Mardia et al., 1979)
• Given two sets $\mathbf{X} \in \Re^{d_1 \times n}$ and $\mathbf{D} \in \Re^{d_2 \times n}$, CCA finds the pair of directions $\mathbf{w}_x$ and $\mathbf{w}_d$ that maximize the correlation between the projections (assume zero mean data)
$r = \dfrac{\mathbf{w}_x^T\mathbf{X}\mathbf{D}^T\mathbf{w}_d}{\sqrt{\mathbf{w}_x^T\mathbf{X}\mathbf{X}^T\mathbf{w}_x\ \mathbf{w}_d^T\mathbf{D}\mathbf{D}^T\mathbf{w}_d}}$
120
Canonical Correlation Analysis (CCA)
$E_0(\mathbf{A},\mathbf{B}) = \|\mathbf{W}_r(\boldsymbol{\Gamma} - \mathbf{B}\mathbf{A}^T\boldsymbol{\Upsilon})\mathbf{W}_c\|_F$
$E_{CCA}(\mathbf{A},\mathbf{B}) = \|(\mathbf{D}\mathbf{D}^T)^{-1/2}(\mathbf{D} - \mathbf{B}\mathbf{A}^T\mathbf{X})\|_F$
[Figure: the LDA label matrix $\mathbf{G}^T$ with 0/1 entries, c = classes, n = samples]
• CCA is the same as LDA changing the label matrix by a new set X
121
K-means
(Ding et al., '02; Torre et al., '06)
$E_0(\mathbf{A},\mathbf{B}) = \|\mathbf{W}_r(\mathbf{D} - \mathbf{B}\mathbf{A}^T)\mathbf{W}_c\|_F$
[Figure: 2-D scatter plot (axes x, y) of ten numbered sample points, with the factors $\mathbf{A}^T$ and $\mathbf{B}$]
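K-means fits this factorization view when $\mathbf{B}$ holds the cluster centers and $\mathbf{A}$ is a binary indicator matrix with a single 1 per row. The sketch below uses plain Lloyd iterations with identity weights; the function name `kmeans`, the random initialization, and the toy data are assumptions for illustration.

```python
import numpy as np

def kmeans(D, k, iters=50, seed=0):
    """D is d x n. Returns centers B (d x k) and indicator matrix A (n x k)."""
    rng = np.random.default_rng(seed)
    n = D.shape[1]
    B = D[:, rng.choice(n, size=k, replace=False)].copy()
    for _ in range(iters):
        # Assignment step: nearest center for every sample.
        dist = ((D[:, :, None] - B[:, None, :]) ** 2).sum(axis=0)   # n x k
        labels = dist.argmin(axis=1)
        A = np.eye(k)[labels]                  # one 1 per row
        # Update step: B = D A (A^T A)^{-1}, i.e. the cluster means.
        counts = A.sum(axis=0)
        nonempty = counts > 0
        B[:, nonempty] = (D @ A)[:, nonempty] / counts[nonempty]
    return B, A

# Toy data (made up): two well-separated 2-D blobs.
rng = np.random.default_rng(1)
D = np.hstack([rng.normal(0.0, 0.5, (2, 30)), rng.normal(5.0, 0.5, (2, 30))])
B, A = kmeans(D, k=2)
error = np.linalg.norm(D - B @ A.T, "fro") ** 2
print("k-means error ||D - B A^T||_F^2 =", error)
```

Because each column of $\mathbf{B}\mathbf{A}^T$ is the center assigned to that sample, the Frobenius error is exactly the usual within-cluster sum of squared distances.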
x
122
Normalized cuts(Dhillon et al., ‘04, Zass & Shashua, 2005; Ding et al., 2005, De la Torre et al ‘06)
FcT
rE ||)(||),(0 WBAΓWBA
)(DΓ )](...)()([ 21 ndddΓ Normalized Cuts (Shi & Malik ’00)Ratio-cuts(Hagen & Kahng ’02)
Affinity Matrix
(Hagen & Kahng 02)
Component Analysis for CV & PR F. De la Torre CVPR Easter School-2011
123
Other Connections
• The LS-KRRR (E0) is also the generative model for:
– Laplacian Eigenmaps, Locality Preserving projections, MDS, Partial least-squares, ….
• Benefits of LS framework:
– Common framework to understand differences and commonalities between different CA methods (e.g. KPCA, KLDA, KCCA, Ncuts)
– Better understanding of normalization factors and generalizations
– Efficient numerical optimization, less than θ(n^3) or θ(d^3), where n is number of samples and d dimensions
124
Outline
• Introduction
• Generative models
– Principal Component Analysis (PCA).
– Non-negative Matrix Factorization (NMF).
– Independent Component Analysis (ICA).
• Discriminative models
– Linear Discriminant Analysis (LDA).
– Oriented Component Analysis (OCA).
– Canonical Correlation Analysis (CCA).
• A unifying view of CA
• Standard extensions of linear models
– Latent variable models.
– Tensor factorization
125
Latent Variable Models
126
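Factor Analysis
• A Gaussian distribution on the coefficients and noise is added to PCA: Factor Analysis. (Mardia et al., 1979)
$\mathbf{d} = \boldsymbol{\mu} + \mathbf{B}\mathbf{c} + \boldsymbol{\eta}$, $p(\mathbf{c}) = N(\mathbf{c}\,|\,\mathbf{0}, \mathbf{I}_k)$, $p(\mathbf{d}\,|\,\mathbf{c}) = N(\mathbf{d}\,|\,\boldsymbol{\mu} + \mathbf{B}\mathbf{c}, \boldsymbol{\Psi})$
$p(\boldsymbol{\eta}) = N(\boldsymbol{\eta}\,|\,\mathbf{0}, \boldsymbol{\Psi})$, $\boldsymbol{\Psi} = diag(\eta_1, \eta_2, \ldots, \eta_d)$
$p(\mathbf{d}) = \int p(\mathbf{d}\,|\,\mathbf{c})\,p(\mathbf{c})\,d\mathbf{c}$, $cov(\mathbf{d}) = E((\mathbf{d} - \boldsymbol{\mu})(\mathbf{d} - \boldsymbol{\mu})^T) = \mathbf{B}\mathbf{B}^T + \boldsymbol{\Psi}$
• Inference (Roweis & Ghahramani, 1999; Tipping & Bishop, 1999a)
– $p(\mathbf{d}, \mathbf{c})$ is jointly Gaussian
– $p(\mathbf{c}\,|\,\mathbf{d}) = N(\mathbf{c}\,|\,\mathbf{m}, \mathbf{V})$
– $\mathbf{m} = \mathbf{B}^T(\mathbf{B}\mathbf{B}^T + \boldsymbol{\Psi})^{-1}(\mathbf{d} - \boldsymbol{\mu})$, $\mathbf{V} = (\mathbf{I} + \mathbf{B}^T\boldsymbol{\Psi}^{-1}\mathbf{B})^{-1}$
• PCA: low reconstruction error. FA: high reconstruction error (low likelihood).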
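Posterior inference in factor analysis reduces to a few matrix operations. The sketch below assumes the standard factor-analysis posterior, with mean $\mathbf{B}^T(\mathbf{B}\mathbf{B}^T + \boldsymbol{\Psi})^{-1}(\mathbf{d} - \boldsymbol{\mu})$ and covariance $(\mathbf{I} + \mathbf{B}^T\boldsymbol{\Psi}^{-1}\mathbf{B})^{-1}$, where $\boldsymbol{\Psi}$ is the diagonal noise covariance. The random model parameters and variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 6, 2
B = rng.standard_normal((d, k))          # factor loadings
mu = rng.standard_normal(d)              # mean
psi = rng.uniform(0.5, 1.5, size=d)      # diagonal of Psi
Psi = np.diag(psi)

# Sample one observation x = mu + B c + eta.
c_true = rng.standard_normal(k)
x = mu + B @ c_true + np.sqrt(psi) * rng.standard_normal(d)

# Posterior p(c | x) = N(c | m, V).
V = np.linalg.inv(np.eye(k) + B.T @ np.linalg.inv(Psi) @ B)
m = B.T @ np.linalg.solve(B @ B.T + Psi, x - mu)

# Equivalent form of the mean that only inverts k x k and diagonal matrices.
m_small = V @ B.T @ np.linalg.inv(Psi) @ (x - mu)

print("posterior mean:", m)
print("same mean, small-matrix form:", m_small)
```

The second form follows from the matrix inversion lemma and is the cheaper one when the number of factors k is much smaller than the data dimension d.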
127
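PPCA
• If $\boldsymbol{\Psi} = E(\boldsymbol{\eta}\boldsymbol{\eta}^T) = \sigma^2\mathbf{I}_d$: PPCA.
• If $\sigma^2 \to 0$, it is equivalent to PCA: $\mathbf{B}^T(\mathbf{B}\mathbf{B}^T + \sigma^2\mathbf{I})^{-1} \to (\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T$
• Probabilistic visual learning (Moghaddam & Pentland, 1997)
[Equation: $p(\mathbf{d}) = \int p(\mathbf{d}\,|\,\mathbf{c})\,p(\mathbf{c})\,d\mathbf{c}$, written as the product of a Gaussian over the k subspace coefficients $c_i$ and a Gaussian over the residual in the remaining d − k dimensions]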
128
More on PPCA
• Extension to mixtures of PPCA (mixture of subspaces). (Tipping & Bishop, 1999b; Black et al., 1998; Jebara et al., 1998)
• Tracking (Yang et al., 1999; Yang et al., 2000a; Lee et al., 2005; de la Torre et al., 2000b)
• Recognition/Detection (Moghaddam et al., 2000; Shakhnarovich & Moghaddam, 2004; Everingham & Zisserman, 2006)
• PCA for the exponential family (Collins et al., 2001)
129
Tensor Factorization
130
Tensor faces
(Vasilescu & Terzopoulos, 2002; Vasilescu & Terzopoulos, 2003)
[Figure: face images arranged by people, expressions, views and illuminations]
131
Eigenfaces
• Facial images (identity change)
• Eigenfaces basis vectors capture the variability in facial appearance (do not decouple pose, illumination, …)
132
Data Organization
• Linear/PCA: Data Matrix
– $\Re^{pixels \times images}$
– a matrix of image vectors
• Multilinear: Data Tensor
– $\Re^{people \times views \times illums \times express \times pixels}$
– N-dimensional matrix
– 28 people, 45 images/person
– 5 views, 3 illuminations, 3 expressions per person
[Figure: data matrix D (pixels × images) and data tensor D indexed by people, views, illuminations and expressions]
133
N-Mode SVD Algorithm
$\mathcal{D} = \mathcal{Z} \times_1 \mathbf{U}_{people} \times_2 \mathbf{U}_{views} \times_3 \mathbf{U}_{illums.} \times_4 \mathbf{U}_{express.} \times_5 \mathbf{U}_{pixels}$
[Figure: N-mode SVD illustrated for N = 3]
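An N-mode SVD of this form can be sketched with NumPy by taking one SVD per mode unfolding and then projecting the data tensor onto the resulting bases to obtain the core. The helper names `unfold`, `mode_multiply` and `n_mode_svd`, and the small random 3-mode tensor, are assumptions for illustration; no truncation is applied, so the reconstruction is exact.

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: rows index `mode`, columns index all other modes."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_multiply(T, U, mode):
    """Mode-n product: multiplies every mode-n fiber of T by the matrix U."""
    return np.moveaxis(np.tensordot(U, T, axes=(1, mode)), 0, mode)

def n_mode_svd(D):
    """Returns the core tensor Z and one orthonormal matrix U_n per mode."""
    Us = [np.linalg.svd(unfold(D, n), full_matrices=False)[0]
          for n in range(D.ndim)]
    Z = D
    for n, U in enumerate(Us):
        Z = mode_multiply(Z, U.T, n)       # project onto each mode basis
    return Z, Us

# Toy tensor (made up), e.g. people x views x pixels.
rng = np.random.default_rng(0)
D = rng.standard_normal((4, 3, 5))
Z, Us = n_mode_svd(D)

# Reconstruct D from the core tensor and the mode matrices.
D_rec = Z
for n, U in enumerate(Us):
    D_rec = mode_multiply(D_rec, U, n)

print("max reconstruction error:", np.abs(D - D_rec).max())
```

Dropping columns of a single mode matrix (for example the illumination mode) compresses along that factor only, which is the kind of selective reduction a single matrix SVD of stacked image vectors cannot express.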
134
PCA:
TensorFaces:
135
Strategic Data Compression = Perceptual Quality
• TensorFaces data reduction in illumination space primarily degrades illumination effects (cast shadows, highlights)
• PCA has lower mean square error but higher perceptual error
[Figure, four panels: Original (176 basis vectors); TensorFaces (6 illum + 11 people param., 66 basis vectors); TensorFaces (3 illum + 11 people param., 33 basis vectors, Mean Sq. Err. = 409.15); PCA (33 parameters, 33 basis vectors, Mean Sq. Err. = 85.75)]
136
Acknowledgments
• The content of some of the slides has been taken from previous presentations/papers of:
– Ales Leonardis.
– Horst Bischof.
– Michael Black.
– Rene Vidal.
– Anat Levin.
– Aleix Martinez.
– Juha Karhunen.
– Andrew Fitzgibbon.
– Daniel Lee.
– Chris Ding.
– M. Alex Vasilescu.
– Sam Roweis.
– Daoqiang Zhang.
– Amnon Shashua.
137
Thanks
138
Bibliography
Zhou, F., De la Torre, F. and Hodgins, J. (2008) "Aligned Cluster Analysis for Temporal Segmentation of Human Motion", IEEE Conference on Automatic Face and Gestures Recognition, September, 2008.
De la Torre, F. and Nguyen, M. (2008) "Parameterized Kernel Principal Component Analysis: Theory and Applications to Supervised and Unsupervised Image Alignment", IEEE Conference on Computer Vision and Pattern Recognition, June, 2008.