Sparse and Low-Rank Modeling for High-Dimensional Data Analysis
Ehsan Elhamifar, Rene Vidal, John Wright, Guillermo Sapiro
CVPR 2015 Tutorial Boston, MA
High-dimensional data deluge
• 72 hours of new video per minute
• 300M new photos per day
Low-dimensional structures
• Intrinsic structures are low-dimensional
Basri-Jacobs’03, Tomasi-Kanade’92, Deerwester et al. ’90, Goldberg-Nichols-Oki-Terry ’92
High-dimensional data analysis
• Clustering
• Embedding ($\mathbb{R}^{d_1} \to \mathbb{R}^{d_2}$)
• Classification
• Subset selection
Challenges
• Clustering and subset selection: Non-convex and NP-hard
• Real data are often corrupted
• Little prior knowledge about low-dim structures
• Points in different groups can be very close
- Ext YaleB dataset (38 subjects, 64 images)
- K nearest neighbors: K = 1: 6%, K = 2: 14%, K = 3: 23%, K = 4: 31%
This tutorial
Tools:
- Sparse & low-rank representation
- High-dimensional statistics & geometry
- Convex programming & analysis
Efficient, robust and provably correct algorithms for
(1) clustering, subset selection (2) classification, dimension reduction
This tutorial
1) Clustering, Subset selection: algorithm, theory, applications (Ehsan Elhamifar, Rene Vidal)
Coffee Break: 3:30pm to 4:15pm
2) Robust PCA, Learning low-rank transformations: algorithm, theory, applications (John Wright, Guillermo Sapiro)
Sparse Subspace Clustering
Ehsan Elhamifar
Subspace clustering problem
• Given points $\{y_1, \ldots, y_N\}$ in $\mathbb{R}^n$ lying in $S_1 \cup \cdots \cup S_L$, find
- Basis for each subspace
- Clustering of the data
• Challenging for multiple subspaces
- Do not know subspace bases
- Do not know memberships of points
- Corruption by noise, missing entries, outliers, ...
• Possible approach: Expectation Maximization. Issue: local minima
Tomasi-Kanade’92, Tipping-Bishop’99, Tseng’00, Kanatani’01, Vidal-Ma-Sastry’05, Yan-Pollefeys’06, Chen-Lerman’09
Spectral clustering-based approach
• Spectral Clustering
- Represent points as graph nodes
- Connect $i$ and $j$ with weight $c_{ij}$
- Infer clusters from graph Laplacian
• Good similarity for subspaces?
- Points in the same subspace: $c_{ij} \neq 0$
- Points in different subspaces: $c_{ij} = 0$
- Nearest neighbors
Fiedler’73, Shi-Malik’01, Belkin-Niyogi’01, Ng-Weiss-Jordan’01, Xing-Jordan’03, Von Luxburg’07
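The "infer clusters from graph Laplacian" step can be sketched for the two-cluster case as follows. This is a minimal illustration, not the tutorial's code: the function name `spectral_bipartition` and the toy affinity matrix are assumptions, and the split uses the sign of the Fiedler vector of the unnormalized Laplacian rather than a full k-means step.

```python
import numpy as np

def spectral_bipartition(W):
    """Split a graph into two clusters using the Fiedler vector of its Laplacian."""
    W = 0.5 * (W + W.T)              # symmetrize the affinity matrix
    d = W.sum(axis=1)                # node degrees
    L = np.diag(d) - W               # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)      # eigenvalues are returned in ascending order
    fiedler = vecs[:, 1]             # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int)

# Hypothetical toy affinity: nodes 0-2 and nodes 3-5 form two tight groups,
# joined by a single weak edge between nodes 2 and 3.
W = np.zeros((6, 6))
for a, b in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[a, b] = W[b, a] = 1.0
W[2, 3] = W[3, 2] = 0.01

labels = spectral_bipartition(W)
```

When the weights $c_{ij}$ are (close to) zero across subspaces, the Laplacian is (close to) block diagonal and the sign pattern of this eigenvector separates the two groups.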
Subspace clustering: idea
• Self-Expressiveness Property (SEP)
- $y_i = Y c_i$, with $Y = [\, Y_1 \; Y_2 \; \cdots \; Y_L \,]\, \Gamma$
- $Y$ has low column-rank: many solutions
• In $S_\ell$ of dim $d_\ell$, a point $y_i$ can be reconstructed by $d_\ell$ other points
- $\min \|c_i\|_0 \ \text{s.t.} \ y_i = Y c_i, \ c_{ii} = 0$ (NP-hard)
- $\ell_0$: number of nonzero elements
Subspace clustering: idea
• Self-Expressiveness Property (SEP)
- $y_i = Y c_i$, with $Y = [\, Y_1 \; Y_2 \; \cdots \; Y_L \,]\, \Gamma$
- $Y$ has low column-rank: many solutions
• In $S_\ell$ of dim $d_\ell$, a point $y_i$ can be reconstructed by $d_\ell$ other points
- $\min \|c_i\|_1 \ \text{s.t.} \ y_i = Y c_i, \ c_{ii} = 0$ (Convex)
- $\ell_1$: sum of absolute values of elements
Sparse subspace clustering (SSC)
• 1: Solve the sparse optimization
$$\min \|c_i\|_1 \ \text{s.t.} \ y_i = Y c_i, \ c_{ii} = 0, \qquad c_i^* = [\, c_{i1}^* \; \cdots \; c_{iN}^* \,]^\top$$
• 2: Infer clustering from similarity graph via spectral clustering
E. Elhamifar and R. Vidal, CVPR 2009; E. Elhamifar and R. Vidal, IEEE Trans. PAMI, 2013
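Step 1 can be sketched in a few lines of NumPy. Note the substitution: instead of the equality-constrained $\ell_1$ program on the slide, this sketch solves the penalized (Lasso) relaxation $\min_C \lambda\|C\|_1 + \tfrac12\|Y - YC\|_F^2$ with $\operatorname{diag}(C) = 0$ by plain ISTA. The function name `ssc_coefficients`, the value of `lam`, the iteration count and the toy data (two orthogonal planes in $\mathbb{R}^4$) are all assumptions made for illustration.

```python
import numpy as np

def ssc_coefficients(Y, lam=0.01, n_iter=500):
    """ISTA for min_C lam*||C||_1 + 0.5*||Y - Y C||_F^2 subject to diag(C) = 0."""
    N = Y.shape[1]
    G = Y.T @ Y                                  # Gram matrix of the data
    step = 1.0 / np.linalg.norm(Y, 2) ** 2       # 1 / Lipschitz constant of the gradient
    C = np.zeros((N, N))
    for _ in range(n_iter):
        V = C - step * (G @ C - G)               # gradient step on the quadratic term
        C = np.sign(V) * np.maximum(np.abs(V) - step * lam, 0.0)  # soft thresholding
        np.fill_diagonal(C, 0.0)                 # enforce c_ii = 0
    return C

rng = np.random.default_rng(0)
# Hypothetical toy data: 5 points in each of two orthogonal planes of R^4.
Y1 = np.vstack([rng.standard_normal((2, 5)), np.zeros((2, 5))])
Y2 = np.vstack([np.zeros((2, 5)), rng.standard_normal((2, 5))])
Y = np.hstack([Y1, Y2])
Y /= np.linalg.norm(Y, axis=0)                   # unit-norm columns

C = ssc_coefficients(Y)
A = np.abs(C) + np.abs(C).T                      # symmetric affinity for step 2
cross_mass = A[:5, 5:].sum()                     # weight on edges between the planes
within_mass = A[:5, :5].sum() + A[5:, 5:].sum()  # weight on edges inside each plane
```

In this easy orthogonal case the affinity `A` has no edges between the two planes, so any spectral clustering routine applied to `A` in step 2 sees two disconnected components. Harder configurations depend on the conditions discussed in the theoretical analysis.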
L1 graph vs k-NN graph
• Conventional graph clustering
- 1) build a k-NN graph
- 2) learn edge weights: $c_{ij} = e^{-\|y_i - y_j\|_2^2 / (2\sigma^2)}$
- 3) partition the graph
• SSC algorithm
- 1) learn graph & weights: $c_i^* = [\, 0 \;\; 0.8 \;\; 0 \;\; \cdots \;\; 0.3 \;\; 0 \,]^\top$
- 2) partition the graph
• SSC automatically selects the right number of neighbors!
• SSC can deal with subspaces of different dimensions!
E. Elhamifar and R. Vidal, CVPR 2009; E. Elhamifar and R. Vidal, IEEE Trans. PAMI, 2013
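For contrast with the L1 graph, the conventional pipeline can be sketched as below: the number of neighbors `k` and the kernel width `sigma` must be chosen by hand. The function name `knn_gaussian_graph` and the 1-D toy points are assumptions for illustration only.

```python
import numpy as np

def knn_gaussian_graph(Y, k=2, sigma=1.0):
    """k-NN graph with weights exp(-||y_i - y_j||^2 / (2 sigma^2)); columns of Y are points."""
    sq = np.sum(Y ** 2, axis=0)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Y.T @ Y), 0.0)  # squared distances
    N = Y.shape[1]
    W = np.zeros((N, N))
    for i in range(N):
        order = np.argsort(D2[i])
        nbrs = [int(j) for j in order if j != i][:k]   # k nearest neighbors, excluding i
        W[i, nbrs] = np.exp(-D2[i, nbrs] / (2.0 * sigma ** 2))
    return np.maximum(W, W.T)                          # symmetrize the graph

# Hypothetical toy data: two well-separated groups of points on a line.
Y = np.array([[0.0, 1.0, 2.5, 10.0, 11.0, 12.5]])
W = knn_gaussian_graph(Y, k=1, sigma=1.0)
```

Here every point gets exactly `k` outgoing edges regardless of the local geometry, which is the fixed-neighborhood behavior the slide contrasts with SSC.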
Theoretical analysis
• When does SSC succeed?
- $\ell_1$ selects points from the correct subspace: no false discovery
• More challenging than conventional sparse recovery
- Sparse representation from the correct subspace
- Sparse representation not unique
Donoho-Elad’03, La-Do’05, Candes-Romberg-Tao’06, Tropp-Gilbert’07, Wright-Ma’10, Sankaranarayanan-Turaga-Baraniuk-Chellappa’10
[Figure: entries of $c_i$ for $q = 1$]
Geometry-based theoretical guarantees
• Theorem: SSC has zero false discovery for any $y \in S_i$ if
$$\max_{j \neq i} \cos(\theta_{ij}) < \max_{\operatorname{rank}(Y'_i) = d_i} \frac{\sigma_{d_i}(Y'_i)}{\sqrt{d_i}}$$
E. Elhamifar and R. Vidal, ICASSP 2010; E. Elhamifar and R. Vidal, IEEE Trans. PAMI, 2013
[Figure: subspaces $S_i$, $S_j$, $S_\ell$ with polytopes $P_i$ and $P_{-i}$; cases where L1 succeeds and where L1 fails]
Geometry-based theoretical guarantees
• Theorem: SSC has zero false discovery for any $y \in S_i$ if
$$\max_{j \neq i} \cos(\theta_{ij}) < \max_{\operatorname{rank}(Y'_i) = d_i} \frac{\sigma_{d_i}(Y'_i)}{\sqrt{d_i}}$$
• No need to have many points; need a few, but well distributed!
E. Elhamifar and R. Vidal, ICASSP 2010; E. Elhamifar and R. Vidal, IEEE Trans. PAMI, 2013
[Figure: subspaces $S_i$, $S_j$, $S_\ell$ with polytopes $P_i$ and $P_{-i}$]
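Both sides of the condition can be evaluated numerically on small examples. The sketch below is an illustration under assumptions: it reads $\theta_{ij}$ as the smallest principal angle between $S_i$ and $S_j$, treats $Y'_i$ as any $d_i$-column submatrix of unit-norm points from $S_i$, and uses the hypothetical helper names `max_cos_principal_angle` and `best_sigma_ratio`. The brute-force search over column subsets is only practical for tiny inputs.

```python
import itertools
import numpy as np

def max_cos_principal_angle(B1, B2):
    """Cosine of the smallest principal angle between span(B1) and span(B2)."""
    Q1, _ = np.linalg.qr(B1)                     # orthonormal basis of span(B1)
    Q2, _ = np.linalg.qr(B2)                     # orthonormal basis of span(B2)
    return float(np.linalg.svd(Q1.T @ Q2, compute_uv=False)[0])

def best_sigma_ratio(Yi, d):
    """Max over d-column submatrices Y' of Yi of sigma_d(Y') / sqrt(d)."""
    best = 0.0
    for cols in itertools.combinations(range(Yi.shape[1]), d):
        s = np.linalg.svd(Yi[:, list(cols)], compute_uv=False)
        best = max(best, float(s[d - 1]) / np.sqrt(d))
    return best

# Hypothetical example in R^3: S_1 is the xy-plane, S_2 is a line at 60 degrees to it.
B1 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
B2 = np.array([[0.5], [0.0], [np.sqrt(0.75)]])
# Three unit-norm points in S_1; the pair (e1, e2) is the best-conditioned one.
Y1 = np.array([[1.0, 0.0, 1.0 / np.sqrt(2)],
               [0.0, 1.0, 1.0 / np.sqrt(2)],
               [0.0, 0.0, 0.0]])

lhs = max_cos_principal_angle(B1, B2)            # cos(60 deg) = 0.5
rhs = best_sigma_ratio(Y1, d=2)                  # 1 / sqrt(2)
condition_holds = lhs < rhs
```

The right-hand side only needs one well-conditioned set of $d_i$ points, which matches the remark that a few well-distributed points suffice.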
Clustering noisy data
• All points contaminated by noise: $\tilde{Y} = Y + Z$, with $z_{ij} \sim \mathcal{N}(0, \sigma^2/n)$ i.i.d.
• Self-expressiveness $y_i = Y c_i$ (sparse $c_i$) implies $\tilde{y}_i = \tilde{Y} c_i + (z_i - Z c_i)$ (sparse $c_i$ plus a perturbation)
• Solve Lasso
$$\min \ \lambda \|c_i\|_1 + \frac{1}{2} \|\tilde{y}_i - \tilde{Y} c_i\|_2^2 \ \text{s.t.} \ c_{ii} = 0$$
M. Soltanolkotabi, E. Elhamifar and E. Candes, Annals of Stats, 2014; E. Elhamifar, M. Soltanolkotabi, S. Sastry, IEEE TSP, 2015
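The perturbed self-expressive relation follows in one line from the noise-free one. A short derivation, using only the symbols on this slide ($\tilde{Y} = Y + Z$, so $\tilde{y}_i = y_i + z_i$ and $Y = \tilde{Y} - Z$):

```latex
\begin{aligned}
\tilde{y}_i &= y_i + z_i = Y c_i + z_i \\
            &= (\tilde{Y} - Z)\, c_i + z_i \\
            &= \tilde{Y} c_i + (z_i - Z c_i).
\end{aligned}
```

The residual $z_i - Z c_i$ is what the quadratic term of the Lasso absorbs, while the $\ell_1$ term keeps $c_i$ sparse.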
Robust SSC
• Theorem: Assume noise-free data is drawn uniformly at random from the intersection of each subspace and hypersphere. Apply the two-step procedure to $\tilde{y} \in S_i$. Under some assumptions, if
$$\max_{j \neq i} \sqrt{\operatorname{Ave}(\cos^2(\theta_{ij}))} \lesssim (\log N)^{-1}$$
with high prob, a) no false discovery, b) about subspace dim nonzeros.
• Algorithm: two-step approach
1) $\beta_i^* = \arg\min_{\beta_i} \|\beta_i\|_1 \ \text{s.t.} \ \|\tilde{y}_i - \tilde{Y} \beta_i\|_2 \leq 2\sigma, \ \beta_{ii} = 0$
2) $c_i^* = \arg\min_{c_i} \lambda_i \|c_i\|_1 + \frac{1}{2} \|\tilde{y}_i - \tilde{Y} c_i\|_2^2 \ \text{s.t.} \ c_{ii} = 0$, with $\lambda_i = \frac{1}{4 \|\beta_i^*\|_1}$ (data dependent!)
M. Soltanolkotabi, E. Elhamifar and E. Candes, Annals of Stats, 2014; E. Elhamifar, M. Soltanolkotabi, S. Sastry, IEEE TSP, 2015
Application: motion segmentation
• Given feature trajectories of multiple rigid motions
• Find segmentation into underlying motions
Tomasi-Kanade’92, Sugaya-Kanatani’04, Vidal-Ma-Sastry’05, Elhamifar-Vidal’09, Chen-Lerman’09, Zhao-Medioni’11
Experiments: motion segmentation
• Hopkins 155 dataset
- 155 sequences
- 2 and 3 motions
• Clustering errors (%)

| Algorithms | RANSAC | GPCA | MSL | LSA | SCC | LRR | LRSC | SSC |
|---|---|---|---|---|---|---|---|---|
| 2 Motions, Mean | 5.56 | 4.59 | 4.14 | 4.23 | 2.89 | 4.10 | 3.69 | 1.52 |
| 2 Motions, Median | 1.18 | 0.38 | 0.00 | 0.56 | 0.00 | 0.22 | 0.29 | 0.00 |
| 3 Motions, Mean | 22.94 | 28.66 | 8.23 | 7.02 | 8.25 | 9.89 | 7.69 | 4.40 |
| 3 Motions, Median | 22.03 | 28.26 | 1.76 | 1.45 | 0.24 | 6.22 | 3.80 | 0.56 |

• Drawbacks (-) vs. advantages (+)
- (-) nonconvex, local min; (+) convex, provable
- (-) k-NN based; (+) automatic selection
- (-) sensitive to noise; (+) robust to noise
- (-) exponential complexity; (+) computationally efficient
E. Elhamifar and R. Vidal, IEEE Trans. PAMI, 2013
Application: face clustering
• Corruption by sparse errors: $\tilde{y}_i = y_i + e_i$
$$\min \ \lambda \|c_i\|_1 + \|\tilde{y}_i - \tilde{Y} c_i\|_1 \ \text{s.t.} \ c_{ii} = 0$$
[Figure: Laplace vs. Gaussian densities]
• SSC error on Ext YaleB faces
- < 2.0% for 2 subjects
- < 11.0% for 10 subjects
[Figure: clustering error (%) vs. number of subjects (2 to 10) on D = 2,016 dimensional data, for SSC, LRSC, LRR-H, LRR, SCC, LSA]
E. Elhamifar and R. Vidal, IEEE Trans. PAMI, 2013
Other extensions of SSC
• Extension to clustering and DR of nonlinear manifolds [Elhamifar-Vidal NIPS’11]
• Scaling to large datasets
- Greedy algorithm, theory [Dyer-Sankaranarayanan-Baraniuk JMLR’13]
- Sampling + a more compact dictionary [Peng-Zhang-Yi CVPR’13]
• Dealing with sequential and spatial data [Tierney-Gao-Guo CVPR’13, Pham et al. CVPR’12]
• Enforcing block-diagonal structure on Laplacian / adjacency [Feng-Lin et al. CVPR’14]
• Connectivity of SSC graph [Nasihatkon-Hartley CVPR’11]
Conclusions
• Addressed clustering of data lying in multiple subspaces
• Proposed an efficient algorithm based on sparse modeling
- Proved theoretical guarantees of the algorithm
- Extended to deal with corrupted data
- Resolved challenges of the state of the art
- Showed it performs well in real-world problems
Sparse Subset Selection
Ehsan Elhamifar
Finding representatives
• A subset of data / models, efficiently representing the entire set
- Summarize and visualize images/videos/text/web datasets
- Improve computational time and memory
- Describe (complex) nonlinear models
[Figure: a nonlinear system with input u(t) and output y(t), described by submodel 1 through submodel s]
Column subset selection
• Given $y_1, y_2, \ldots, y_N \in \mathbb{R}^n$, select a subset $\{y_{i_1}, \ldots, y_{i_k}\}$ that “well represents” the dataset
$$\arg\min_{C} \ \lambda \sum_{i=1}^{N} \|C_{i*}\|_p + \frac{1}{2} \|Y - YC\|_F^2 \quad \text{s.t.} \ \mathbf{1}^\top C = \mathbf{1}^\top$$
- $C$ is row-sparse; its nonzero rows mark the representatives $y_{i_1}, y_{i_2}, y_{i_3}$
• Compare: SSC via L1 graph, where $C$ is elementwise sparse
$$\arg\min_{C} \ \lambda \|C\|_1 + \frac{1}{2} \|Y - YC\|_F^2$$
E. Elhamifar, G. Sapiro and R. Vidal, CVPR 2012
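The row-sparsity penalty $\sum_i \|C_{i*}\|_p$ is what turns coefficient selection into representative selection. For $p = 2$, a standard building block is the row-wise (group) soft-thresholding operator. The sketch below shows only that operator and how representatives are read off; it is not a full solver, it ignores the affine constraint $\mathbf{1}^\top C = \mathbf{1}^\top$, and the names `prox_row_l2` and `representatives` are assumptions.

```python
import numpy as np

def prox_row_l2(C, t):
    """Proximal operator of t * sum_i ||C_i*||_2: shrink every row toward zero."""
    norms = np.linalg.norm(C, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)  # rows with norm <= t vanish
    return scale * C

def representatives(C, tol=1e-8):
    """Indices of nonzero rows of C, i.e. the selected representatives."""
    return np.flatnonzero(np.linalg.norm(C, axis=1) > tol)

# Hypothetical coefficient matrix: one strong row, one weak row, one empty row.
C = np.array([[3.0, 4.0],
              [0.3, 0.4],
              [0.0, 0.0]])
P = prox_row_l2(C, t=1.0)
reps = representatives(P)
```

Entire rows are zeroed at once, so a data point is either used to reconstruct the others or dropped altogether, in contrast to the elementwise $\ell_1$ penalty of SSC.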
Column subset selection: theory
• Theorem: Let $H$ be the convex hull of the columns of $Y$ with k vertices. Assume the columns of $Y$ lie in a (k-1)-dim. affine subspace. For p > 1, we obtain k representatives, corresponding to the vertices of $H$.
• Theorem: For points lying in a union of independent subspaces ($\dim(\oplus_i S_i) = \sum_i \dim(S_i)$), we obtain at least $\dim(S_i) + 1$ representatives from each $S_i$.
E. Elhamifar, G. Sapiro and R. Vidal, CVPR 2012
Column subset selection
• Given $y_1, y_2, \ldots, y_N \in \mathbb{R}^n$, select a subset $\{y_{i_1}, \ldots, y_{i_k}\}$ that “well represents” the dataset
$$\arg\min_{C} \ \lambda \sum_{i=1}^{N} \|C_{i*}\|_p + \frac{1}{2} \|Y - YC\|_F^2 \quad \text{s.t.} \ \mathbf{1}^\top C = \mathbf{1}^\top$$
- What if data not in low-dim subspaces?
- What if no feature representation? e.g., social network graph
- What about summarization between two / multiple sets?
E. Elhamifar, G. Sapiro and R. Vidal, CVPR 2012
Subset selection using dissimilarities
• Given: dissimilarities $d : \mathbb{X} \times \mathbb{Y} \to \mathbb{R}_{\geq 0}$ between a source set $\mathbb{X} = \{x_1, x_2, \ldots, x_M\}$ and a target set $\mathbb{Y} = \{y_1, y_2, \ldots, y_N\}$
• Goal: select a small subset of $\mathbb{X}$ that well represents $\mathbb{Y}$ w.r.t. $d(\cdot, \cdot)$
• $d(x_i, y_j) = d_{ij}$: how well $x_i$ represents $y_j$
- $\mathbb{X}$ = models, $\mathbb{Y}$ = data, $d(\cdot, \cdot)$ = representation/coding error
- $\mathbb{X}$ = data, $\mathbb{Y}$ = data, $d(\cdot, \cdot)$ = Euclidean / geodesic distance
Dissimilarity-based sparse subset selection (DS3)
• Let $D = [d_{ij}]$, introduce $Z = [z_{ij}]$
- $z_{ij} = P(x_i \text{ rep. } y_j)$
• To select few elements of $\mathbb{X}$ that well represent $\mathbb{Y}$, minimize
1) Encoding of $\mathbb{Y}$ via representatives: $\sum_{i=1}^{M} \sum_{j=1}^{N} d_{ij} z_{ij} = \operatorname{tr}(D^\top Z)$
2) Number of representatives: $\sum_{i=1}^{M} I(\|Z_{i*}\|_p)$
• Solve the simultaneous sparse recovery program (convex), with $p \in \{2, \infty\}$
$$\min_{Z} \ \lambda \sum_{i=1}^{M} \|Z_{i*}\|_p + \operatorname{tr}(D^\top Z) \quad \text{s.t.} \ Z \geq 0, \ \mathbf{1}^\top Z = \mathbf{1}^\top$$
E. Elhamifar, G. Sapiro and R. Vidal, NIPS, 2012; E. Elhamifar, G. Sapiro and S. Sastry, IEEE Trans. PAMI, 2015
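To make the trade-off between the two terms concrete, the following sketch evaluates the DS3 objective and checks feasibility for two candidate assignments on a tiny example. It is an illustration only, not a solver: the helper names `ds3_objective` and `is_feasible` and the three 1-D points are assumptions.

```python
import numpy as np

def ds3_objective(D, Z, lam, p=2):
    """lam * sum_i ||Z_i*||_p + tr(D^T Z), for p = 2 or p = np.inf."""
    row_norms = np.linalg.norm(Z, ord=p, axis=1)   # one norm per row of Z
    return float(lam * row_norms.sum() + np.sum(D * Z))

def is_feasible(Z, tol=1e-9):
    """Check Z >= 0 and that every column of Z sums to one."""
    return bool(np.all(Z >= -tol) and np.allclose(Z.sum(axis=0), 1.0, atol=tol))

# Hypothetical identical source and target sets: points 0, 1, 5 on a line.
x = np.array([0.0, 1.0, 5.0])
D = np.abs(x[:, None] - x[None, :])        # d_ij = |x_i - x_j|

Z_self = np.eye(3)                         # every point represents itself
Z_one = np.zeros((3, 3))
Z_one[1, :] = 1.0                          # the middle point represents everything

small = (ds3_objective(D, Z_self, lam=1.0), ds3_objective(D, Z_one, lam=1.0))
large = (ds3_objective(D, Z_self, lam=10.0), ds3_objective(D, Z_one, lam=10.0))
```

With `lam=1.0` the self-representation is cheaper, while with `lam=10.0` the single representative wins, which is the qualitative behavior of the regularization parameter.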
Dissimilarity-based sparse subset selection (DS3)
• Identical source and target
[Figure: data points and representatives for λ = 0.01 λmax,∞, λ = 0.1 λmax,∞ and λ = λmax,2 (regularization increasing from left to right: $\lambda_1$, $\lambda_2 > \lambda_1$, $\lambda_3 > \lambda_2$)]
[Figure: Z matrix for λ = 0.01 λmax,∞, λ = 0.1 λmax,∞ and λ = λmax,∞]
Theoretical analysis
$$\min_{Z} \ \lambda \sum_{i=1}^{M} \|Z_{i*}\|_p + \operatorname{tr}(D^\top Z) \quad \text{s.t.} \ Z \geq 0, \ \mathbf{1}^\top Z = \mathbf{1}^\top$$
• Proposition 1: Assume $\mathbb{X}$ and $\mathbb{Y}$ are identical. If $\lambda$ is sufficiently large, only one representative is selected. If $\lambda$ is sufficiently small, each point chooses itself as a representative.
- $\lambda \geq \lambda_{\max,p}(D) \Rightarrow Z = e_\ell \mathbf{1}^\top$, where $\ell = \arg\min_i \mathbf{1}^\top D_{i*}$
- $\lambda \leq \lambda_{\min,p}(D) \Rightarrow Z = I$
• We determine $[\lambda_{\min,p}(D), \lambda_{\max,p}(D)]$ to set the regularization $\lambda$
- e.g., $\lambda_{\min,2}(D) = \min_j (\min_{i \neq j} d_{ij} - d_{jj})$, $\quad \lambda_{\max,2}(D) = \max_{i \neq \ell} \frac{\sqrt{N}}{2} \frac{\|D_{i*} - D_{\ell*}\|_2^2}{\mathbf{1}^\top (D_{i*} - D_{\ell*})}$
E. Elhamifar, G. Sapiro and S. Sastry, IEEE Trans. PAMI, 2015
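The two endpoints for $p = 2$ are cheap to compute from $D$ alone. The sketch below transcribes the formulas on this slide for a square $D$ (identical source and target); the function name `lambda_range_p2` and the toy points are assumptions, and ties in the row sums (which would make a denominator zero) are not handled.

```python
import numpy as np

def lambda_range_p2(D):
    """Return (ell, lambda_min_2, lambda_max_2) for a square dissimilarity matrix D."""
    M, N = D.shape
    ell = int(np.argmin(D.sum(axis=1)))              # ell = argmin_i 1^T D_i*
    masked = D + np.diag(np.full(M, np.inf))         # exclude i = j from the inner min
    lam_min = float(np.min(masked.min(axis=0) - np.diag(D)))
    ratios = []
    for i in range(M):
        if i == ell:
            continue
        diff = D[i] - D[ell]                         # D_i* - D_ell*
        ratios.append(np.sqrt(N) / 2.0 * np.dot(diff, diff) / diff.sum())
    return ell, lam_min, float(max(ratios))

# Hypothetical identical source and target sets: points 0, 1, 5 on a line.
x = np.array([0.0, 1.0, 5.0])
D = np.abs(x[:, None] - x[None, :])
ell, lam_min, lam_max = lambda_range_p2(D)
```

In practice one would pick $\lambda$ as a fraction of `lam_max`, in the spirit of the values such as $0.01\,\lambda_{\max}$ and $0.1\,\lambda_{\max}$ used in the figures.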
Theoretical analysis
$$\min_{Z} \ \lambda \sum_{i=1}^{M} \|Z_{i*}\|_p + \operatorname{tr}(D^\top Z) \quad \text{s.t.} \ Z \geq 0, \ \mathbf{1}^\top Z = \mathbf{1}^\top$$
• Proposition 2: Assume $\mathbb{X}$ and $\mathbb{Y}$ are identical. Assume points partition into $L$ groups. If $\lambda \leq \lambda_g(D)$, the optimal $Z$ is such that
- (1) each group will have representatives;
- (2) points in each group select representatives from that group only.
$$\lambda_g(D) = \min_k \min_{j \in G_k} \Big( \min_{k' \neq k} \min_{i \in G_{k'}} d_{ij} - d_{c_k j} \Big)$$
[Figure: groups $G_1$ and $G_2$ with points $x_i$, $x_j$ and center $x_{c_1}$]
E. Elhamifar, G. Sapiro and S. Sastry, IEEE Trans. PAMI, 2015
DS3 applications: Learning nonlinear models
• Nonlinear dynamical systems as switched linear models
- Human gaits / activities, motor control systems, ...
• Learning switched dynamical models: Non-convex & NP-hard
• Our convex solution
- $\mathbb{X} = \{\theta_1, \ldots, \theta_M\}$ = ensemble of models
- $\mathbb{Y} = \{(u(1), y(1)), \ldots, (u(N), y(N))\}$
[Figure: a nonlinear system with input u(t) and output y(t), described by submodel 1 through submodel s]
E. Elhamifar, G. Sapiro and S. Sastry, IEEE Trans. PAMI, 2015; E. Elhamifar, S. Burden and S. Sastry, IFAC, 2014
DS3 applications: Learning nonlinear models
• Experiments on segmentation of CMU motion capture data
- Discrete-time SS model via subspace ID, snippets of length 100
- $d_{ij}$ = Euclidean norm of representation error of j-th snippet via i-th estimated submodel

| Sequence number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| # activities | 4 | 8 | 7 | 7 | 7 | 10 | 6 | 9 | 4 | 4 | 7 |
| SC error (%) | 23.86 | 30.61 | 19.02 | 40.60 | 26.43 | 47.77 | 14.85 | 38.09 | 9.02 | 8.31 | 3.47 |
| SBiC error (%) | 22.77 | 22.08 | 18.94 | 28.40 | 29.85 | 30.96 | 30.50 | 24.78 | 13.03 | 12.68 | 23.68 |
| Kmedoids error (%) | 18.26 | 46.26 | 49.89 | 51.99 | 37.07 | 54.75 | 29.81 | 49.53 | 9.71 | 33.50 | 33.80 |
| AP error (%) | 22.93 | 41.22 | 49.66 | 54.56 | 37.87 | 50.19 | 37.84 | 48.37 | 9.71 | 26.05 | 23.84 |
| DS3 error (%) | 5.33 | 9.90 | 12.27 | 19.64 | 16.55 | 14.66 | 12.56 | 11.73 | 11.18 | 3.32 | 6.18 |

E. Elhamifar, G. Sapiro and S. Sastry, IEEE Trans. PAMI, 2015
Dealing with outliers via DS3
• Add outlier representative node to source: $e_j = P(\text{outlier node rep. } y_j)$
• Solve the optimization
$$\min_{Z, e} \ \lambda \sum_{i=1}^{M} \|Z_{i*}\|_p + \operatorname{tr}\left( \begin{bmatrix} D \\ d_o \end{bmatrix}^{\top} \begin{bmatrix} Z \\ e^\top \end{bmatrix} \right) \quad \text{s.t.} \ \mathbf{1}^\top \begin{bmatrix} Z \\ e^\top \end{bmatrix} = \mathbf{1}^\top, \ \begin{bmatrix} Z \\ e^\top \end{bmatrix} \geq 0$$
[Figure: source set (data points, representatives) and target set (data points, outliers)]
E. Elhamifar, G. Sapiro and S. Sastry, IEEE Trans. PAMI, 2015
Dealing with outliers via DS3: experiment
• Exclude one activity when estimating LDS ensemble
• Set outlier node weights $w_j = \beta \, e^{-\min_i d_{ij} / \tau}$
[Figure: ROC curves (true positive rate vs. false positive rate) for Walk, Jump and Punch, with τ = 0.1 and τ = 1.0; marked operating points (0.91, 0.06) and (0.91, 0.01)]
E. Elhamifar, G. Sapiro and S. Sastry, IEEE Trans. PAMI, 2015
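The outlier-node weights depend only on how far each target point is from its closest source element. A minimal sketch, assuming the weight has the form $w_j = \beta\, e^{-\min_i d_{ij}/\tau}$ with rows of $D$ indexing source elements and columns indexing target points; the function name `outlier_weights`, the default parameter values and the toy matrix are assumptions.

```python
import numpy as np

def outlier_weights(D, beta=1.0, tau=1.0):
    """w_j = beta * exp(-min_i d_ij / tau): cost of assigning target j to the outlier node."""
    return beta * np.exp(-D.min(axis=0) / tau)

# Hypothetical dissimilarities: target 0 is close to the sources, target 1 is far from all.
D = np.array([[0.1, 5.0],
              [0.2, 4.0]])
w = outlier_weights(D, beta=1.0, tau=1.0)
```

A target that is far from every source element gets a small weight, so assigning it to the outlier node is cheap, while well-represented targets are expensive to declare outliers. Smaller `tau` sharpens this contrast.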
DS3 applications: Active learning
• Successively annotate the most informative unlabeled samples
• For $\mathbb{X} = \mathbb{Y}$ = { unlabeled samples }, solve
$$\min_{Z} \ \lambda \|WZ\|_{1,p} + \operatorname{tr}(D^\top Z) \quad \text{s.t.} \ Z \geq 0, \ \mathbf{1}^\top Z = \mathbf{1}^\top$$
- $W \triangleq \operatorname{diag}(w_1, w_2, \ldots)$
- $w_i \triangleq \min \left\{ \sigma - (\sigma - 1) \frac{E(p_i)}{\log_2(L)}, \ \sigma - (\sigma - 1) \frac{\min_{j \in \mathcal{L}} d_{ji}}{\max_{k \in \mathcal{U}} \min_{j \in \mathcal{L}} d_{jk}} \right\}$
- First term: classifier uncertainty confidence; second term: sample diversity confidence
E. Elhamifar, G. Sapiro, A. Yang and S. Sastry, ICCV, 2013
DS3 applications: Active learning
• Pedestrian vs non-pedestrian
- classifier: SVM
- dissimilarity: HOG distances
[Figure: test set accuracy vs. labeled training set size (0 to 300) for CPAL, CCAL, RAND and the proposed method; higher is better]
• Face recognition
- classifier: SRC
- dissimilarity: Euclidean dist.
[Figure: test set accuracy vs. labeled training set size (300 to 700) for CPAL, CCAL, RAND and the proposed method; higher is better]
E. Elhamifar, G. Sapiro, A. Yang and S. Sastry, ICCV, 2013
Conclusions
• Studied the problem of subset selection
- Feature-space representations
- Pairwise similarities
• Proposed convex programs using simultaneous sparse recovery
- Extended to deal with outliers
• Proved the solution recovers representatives from each group and correctly clusters data
• Addressed several problems effectively
- Active learning
- Learning nonlinear dynamical models
- Segmentation of time-series data
Thanks!
Collaborators: Guillermo Sapiro (Duke), Rene Vidal (JHU), Sam Burden (UCB), Shankar Sastry (UCB), Emmanuel Candes (Stanford), Allen Yang (UCB), Mahdi Soltanolkotabi (USC)
Codes: http://www.eecs.berkeley.edu/~ehsan.elhamifar/code.htm