An Alternating Direction Algorithm for Structure-enforced
Matrix Factorization
Lijun Xu (Dalian University of Technology)
Supervised by
Bo Yu (DUT) and Yin Zhang (Rice University)
March 27, 2013
Outline
• Introduction
• Alternating Direction Method (ADM)
• ADM Extension to SeMF
• Numerical experiments
• Conclusion
Introduction
• Matrix Factorization
$$\min_{X,Y}\ \frac{1}{2}\|M - XY\|_F^2, \qquad M \in \mathbb{R}^{m\times n},\ X \in \mathbb{R}^{m\times k},\ Y \in \mathbb{R}^{k\times n}$$
• Various factorizations require different constraints on $X$ and $Y$:
a) Exact factorizations: LU, QR, SVD, eigendecomposition, etc.
b) Recent approximate factorizations: NMF, K-means, sparse PCA, matrix completion, dictionary learning, etc.
Introduction
• In practice, many constraints on $X$ and $Y$ impose structural properties such as non-negativity, sparsity, orthogonality or normalization, all of which allow easy "projections".
• Structure-enforced Matrix Factorization (SeMF):
$$\min_{X,Y}\ \frac{1}{2}\|M - XY\|_F^2 \quad \text{s.t. } X \in \mathcal{X},\ Y \in \mathcal{Y},$$
where $\mathcal{X}$ and $\mathcal{Y}$ are easily projectable sets.
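Introduction
• Some examples of easily projectable sets $\mathcal{X}$, with $\mathcal{P}(X)$ denoting the projection of $X$ onto $\mathcal{X}$:
Non-negativity: $\mathcal{X} = \{X : X_{ij} \ge 0\}$,
$$(\mathcal{P}(X))_{ij} = \begin{cases} X_{ij}, & X_{ij} \ge 0 \\ 0, & X_{ij} < 0 \end{cases}$$
Sparsity: $\mathcal{X} = \{X : \|X_i\|_0 \le k,\ i = 1, 2, \dots\}$,
$$(\mathcal{P}(X))_{ij} = \begin{cases} X_{ij}, & |X_{ij}| \text{ is among the } k \text{ largest absolute values of } X_i \\ 0, & \text{otherwise} \end{cases}$$
Orthogonality: $\mathcal{X} = \{X : X_i \perp X_J,\ i \in I\}$,
$$(\mathcal{P}(X))_i = \left(\mathbf{I} - X_J (X_J^T X_J)^{-1} X_J^T\right) X_i,\ i \in I; \qquad (\mathcal{P}(X))_j = X_j,\ j \in J$$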
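The three projections above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, columns of the array play the role of $X_i$, and the orthogonality projection uses a least-squares solve in place of the explicit inverse $(X_J^T X_J)^{-1}$.

```python
import numpy as np

def project_nonneg(X):
    """Projection onto {X : X_ij >= 0}: negative entries become zero."""
    return np.maximum(X, 0.0)

def project_col_sparse(X, k):
    """Projection onto {X : ||X_i||_0 <= k}: keep the k largest-magnitude
    entries of every column and zero the rest (assumes k >= 1)."""
    X = np.asarray(X, dtype=float)
    if k >= X.shape[0]:
        return X.copy()
    out = np.zeros_like(X)
    # Row indices of the k largest |entries| in each column.
    idx = np.argpartition(-np.abs(X), k - 1, axis=0)[:k, :]
    cols = np.arange(X.shape[1])
    out[idx, cols] = X[idx, cols]
    return out

def project_orthogonal(X, I, J):
    """Make the columns indexed by I orthogonal to span(X[:, J]);
    the columns indexed by J are left unchanged."""
    X = np.asarray(X, dtype=float)
    out = X.copy()
    XJ = X[:, J]
    # coef solves min ||XJ @ coef - X_I||, so XJ @ coef is the part of X_I
    # lying in span(XJ); subtracting it applies I - XJ (XJ^T XJ)^{-1} XJ^T.
    coef, *_ = np.linalg.lstsq(XJ, X[:, I], rcond=None)
    out[:, I] = X[:, I] - XJ @ coef
    return out
```

Each function returns a new array, so any of them can be passed directly as the projection step of an alternating scheme.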
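Introduction
Normalization: $\mathcal{X} = \{X : \|X_i\| \le 1,\ i = 1, 2, \dots\}$,
$$(\mathcal{P}(X))_i = \begin{cases} X_i / \|X_i\|, & \|X_i\| > 1 \\ X_i, & \|X_i\| \le 1 \end{cases}$$
Combinatorial structure: $\mathcal{X} = \{X = [X_{I_1}\ X_{I_2}\ \cdots\ X_{I_r}] : X_{I_i} \in \mathcal{X}_i,\ i = 1, 2, \dots, r\}$,
$$\mathcal{P}(X) = [\mathcal{P}_1(X_{I_1})\ \ \mathcal{P}_2(X_{I_2})\ \ \cdots\ \ \mathcal{P}_r(X_{I_r})]$$
E.g. 3 groups, each group is sparse. (Figure labels: 1 zero, 2 zeros, 1 zero.)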
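A short NumPy sketch of the normalization and combinatorial projections follows. The function names are hypothetical, the Euclidean norm is assumed for $\|X_i\|$, and each group is given as a list of column indices paired with its own projection function.

```python
import numpy as np

def project_col_norm_ball(X):
    """Projection onto {X : ||X_i|| <= 1}: columns with norm > 1 are
    rescaled to unit norm, all other columns are kept (2-norm assumed)."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=0)
    return X / np.maximum(norms, 1.0)

def project_groups(X, groups, projections):
    """Combinatorial structure: apply the i-th projection to the block of
    columns groups[i]; columns outside every group are left unchanged."""
    X = np.asarray(X, dtype=float)
    out = X.copy()
    for cols, proj in zip(groups, projections):
        out[:, cols] = proj(X[:, cols])
    return out
```

Because the groups do not overlap, the overall projection is just the blockwise projections placed side by side, which is what the loop does.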
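Introduction
• Problems with specific structural patterns
$$\min_{X,Y}\ \frac{1}{2}\|M - XY\|_F^2 \quad \text{s.t. } X \in \mathcal{X},\ Y \in \mathcal{Y}$$
a) Sparse NMF: $\mathcal{X}$: non-negative (+ sparse); $\mathcal{Y}$: non-negative (+ sparse)
b) Sparse PCA: $\mathcal{X}$: sparse; $\mathcal{Y}$: column normalized
c) Dictionary learning for sparse representation: $\mathcal{X}$: column normalized; $\mathcal{Y}$: sparse
etc.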
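Alternating Direction Method
• Classic ADM: applies to a separable problem in which the objective functions are convex and the constraint sets are closed and convex.
• Augmented Lagrangian: the objective plus multiplier and quadratic penalty terms for the coupling constraint.
• ADM: minimize the augmented Lagrangian over one block of variables at a time, then update the multiplier.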
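As a point of reference, classic two-block ADM is commonly stated in the textbook form below. The symbols $f, g, A, B, b, \lambda, \beta, \gamma$ are assumed here from common convention and may differ from the notation on the slide; the sign of the multiplier term is chosen to match the SeMF augmented Lagrangian used later in this talk.

```latex
\min_{x,\,y}\ f(x) + g(y)
\quad \text{s.t. } Ax + By = b,\ x \in \mathcal{X},\ y \in \mathcal{Y}

\mathcal{L}_A(x, y, \lambda)
  = f(x) + g(y) + \lambda^T (Ax + By - b) + \frac{\beta}{2}\,\|Ax + By - b\|_2^2

x^{k+1} \leftarrow \arg\min_{x \in \mathcal{X}} \mathcal{L}_A(x, y^k, \lambda^k)
y^{k+1} \leftarrow \arg\min_{y \in \mathcal{Y}} \mathcal{L}_A(x^{k+1}, y, \lambda^k)
\lambda^{k+1} \leftarrow \lambda^k + \gamma\beta\,(Ax^{k+1} + By^{k+1} - b)
```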
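ADM Extension to SeMF
• Original model:
$$\min_{X,Y}\ \frac{1}{2}\|M - XY\|_F^2 \quad \text{s.t. } X \in \mathcal{X},\ Y \in \mathcal{Y}$$
• Model with splitting variables:
$$\min_{X,Y,U,V}\ \frac{1}{2}\|M - XY\|_F^2 \quad \text{s.t. } X - U = 0,\ Y - V = 0,\ U \in \mathcal{X},\ V \in \mathcal{Y}$$
The splitting variables $U$ and $V$ separate the factors $X$ and $Y$ in the objective from the constraint sets $\mathcal{X}$ and $\mathcal{Y}$. This separation facilitates alternating direction methods.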
ADM framework to SeMF
• Augmented Lagrangian:
$$\mathcal{L}_A(X, Y, U, V, \Lambda, \Pi) = \frac{1}{2}\|M - XY\|_F^2 + \frac{\alpha}{2}\|X - U\|_F^2 + \frac{\beta}{2}\|Y - V\|_F^2 + \Lambda \bullet (X - U) + \Pi \bullet (Y - V),$$
where $\Lambda, \Pi$ are Lagrangian multipliers, $(\alpha, \beta) > 0$ are penalty parameters, and the product is $A \bullet B = \sum_{i,j} a_{ij} b_{ij}$.
Minimize $\mathcal{L}_A$ with respect to $X$, $Y$, $U$ and $V$ one at a time while fixing the others, then update $\Lambda, \Pi$ after each sweep of such alternating minimization.
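The individual minimizations have simple solutions. The worked derivation below is not taken from the slides; it follows from setting the gradient of $\mathcal{L}_A$ to zero for $X$ and $Y$ and from completing the square for $U$. Here $\mathbf{I}$ is the $k \times k$ identity and $\mathcal{P}_{\mathcal{X}}$ denotes projection onto $\mathcal{X}$ (notation assumed).

```latex
% X-subproblem: (XY - M)Y^T + \alpha (X - U) + \Lambda = 0
X = \left(M Y^T + \alpha U - \Lambda\right)\left(Y Y^T + \alpha \mathbf{I}\right)^{-1}

% Y-subproblem: X^T (XY - M) + \beta (Y - V) + \Pi = 0
Y = \left(X^T X + \beta \mathbf{I}\right)^{-1}\left(X^T M + \beta V - \Pi\right)

% U-subproblem: complete the square in U
\frac{\alpha}{2}\|X - U\|_F^2 + \Lambda \bullet (X - U)
  = \frac{\alpha}{2}\left\|U - \left(X + \Lambda/\alpha\right)\right\|_F^2 - \frac{1}{2\alpha}\|\Lambda\|_F^2
\ \Longrightarrow\
U = \mathcal{P}_{\mathcal{X}}\!\left(X + \Lambda/\alpha\right)
```

The $V$-subproblem is identical with $(Y, \Pi, \beta, \mathcal{Y})$ in place of $(X, \Lambda, \alpha, \mathcal{X})$. Both linear systems are only $k \times k$, so the cost per sweep is dominated by the products with $M$.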
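ADM framework to SeMF
• Framework:
$$\begin{aligned}
X^{k+1} &\leftarrow \arg\min_X\ \mathcal{L}_A(X, Y^k, U^k, V^k, \Lambda^k, \Pi^k), \\
Y^{k+1} &\leftarrow \arg\min_Y\ \mathcal{L}_A(X^{k+1}, Y, U^k, V^k, \Lambda^k, \Pi^k), \\
U^{k+1} &\leftarrow \mathcal{P}_{\mathcal{X}}(X^{k+1} + \Lambda^k/\alpha), \\
V^{k+1} &\leftarrow \mathcal{P}_{\mathcal{Y}}(Y^{k+1} + \Pi^k/\beta), \\
\Lambda^{k+1} &\leftarrow \Lambda^k + \gamma\alpha\,(X^{k+1} - U^{k+1}), \\
\Pi^{k+1} &\leftarrow \Pi^k + \gamma\beta\,(Y^{k+1} - V^{k+1}).
\end{aligned}$$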
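The framework can be sketched in NumPy as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: the name `semf_adm`, the random initialization, the fixed penalty parameters and the fixed iteration count are all choices made here, and the $X$ and $Y$ steps use the closed-form solutions obtained by setting the gradient of the augmented Lagrangian to zero.

```python
import numpy as np

def semf_adm(M, k, proj_X, proj_Y, alpha=1.0, beta=1.0, gamma=1.0,
             iters=200, seed=0):
    """One possible ADM loop for SeMF with fixed alpha, beta (assumption).
    proj_X / proj_Y are projections onto the two structure sets."""
    rng = np.random.default_rng(seed)
    m, n = M.shape
    X = rng.standard_normal((m, k))
    Y = rng.standard_normal((k, n))
    U, V = proj_X(X), proj_Y(Y)
    Lam, Pi = np.zeros((m, k)), np.zeros((k, n))
    I = np.eye(k)
    for _ in range(iters):
        # X-step: X (Y Y^T + alpha I) = M Y^T + alpha U - Lam
        rhs = M @ Y.T + alpha * U - Lam
        X = np.linalg.solve(Y @ Y.T + alpha * I, rhs.T).T
        # Y-step: (X^T X + beta I) Y = X^T M + beta V - Pi
        Y = np.linalg.solve(X.T @ X + beta * I, X.T @ M + beta * V - Pi)
        # U, V-steps: projections of the shifted factors
        U = proj_X(X + Lam / alpha)
        V = proj_Y(Y + Pi / beta)
        # Multiplier updates with step length gamma
        Lam = Lam + gamma * alpha * (X - U)
        Pi = Pi + gamma * beta * (Y - V)
    return X, Y, U, V
```

For example, passing `lambda Z: np.maximum(Z, 0.0)` for both projections gives a plain NMF variant. The pair `(U, V)` always satisfies the structure constraints exactly, so it is the natural output to report.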
Implementation
• Choice of $\alpha, \beta, \gamma$:
Step length $\gamma \in (0, 1.618)$; we set $\gamma = 1$.
Adaptive updating of $(\alpha, \beta)$.
Motivation: fixed values often cause slow convergence and trapping in local minima.
Intuition: balance the changes of the 3 terms $M - XY$, $X - U$ and $Y - V$.
• Stopping criterion: $|f_{k+1} - f_k| / |f_k| \le tol$, where $f_k = \|M - X^k Y^k\|_F$.
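The stopping criterion can be written as a small helper. This is a sketch: the function names and the default tolerance `1e-6` are assumptions, and the guard against division by zero is an addition not mentioned on the slide.

```python
import numpy as np

def relative_change(f_prev, f_next):
    """|f_{k+1} - f_k| / |f_k|, guarded against f_k = 0."""
    return abs(f_next - f_prev) / max(abs(f_prev), np.finfo(float).tiny)

def should_stop(M, X_prev, Y_prev, X_next, Y_next, tol=1e-6):
    """True when the relative change of f_k = ||M - X^k Y^k||_F is <= tol."""
    f_prev = np.linalg.norm(M - X_prev @ Y_prev, "fro")
    f_next = np.linalg.norm(M - X_next @ Y_next, "fro")
    return relative_change(f_prev, f_next) <= tol
```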
Implementation
• An updating strategy:
Implementation
• A simple example: $A = XY$, where
$X$: random $40 \times 60$ matrix, $\|x_i\|_2 = 1$,
$Y$: sparse $60 \times 1500$ matrix, each column has 3 non-zeros with random locations and values.
Solve
$$\min_{X,Y}\ \frac{1}{2}\|A - XY\|_F^2 \quad \text{s.t. } \|x_i\|_2 = 1,\ \|y_i\|_0 \le 3$$
using different initial $(\alpha, \beta)$: $[1\ \ 0.1] \times 10^{-k} \times \|A\|_2,\ k = 1, \dots, 5$.
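Test data of this kind can be generated as in the sketch below. The slide only says "random", so the use of standard normal entries, the seed and the function name `make_synthetic` are assumptions.

```python
import numpy as np

def make_synthetic(m=40, k=60, n=1500, nnz=3, seed=0):
    """Return (A, X, Y) with A = X @ Y, unit-norm columns in X and
    exactly nnz non-zeros per column of Y (Gaussian entries assumed)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((m, k))
    X /= np.linalg.norm(X, axis=0)          # enforce ||x_i||_2 = 1
    Y = np.zeros((k, n))
    for j in range(n):
        rows = rng.choice(k, size=nnz, replace=False)   # random support
        Y[rows, j] = rng.standard_normal(nnz)           # random values
    return X @ Y, X, Y
```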
Numerical Experiments
Dictionary Learning
$$\min_{X,Y}\ \frac{1}{2}\|M - XY\|_F^2 \quad \text{s.t. } \|x_i\|_2 \le 1,\ \|y_j\|_0 \le k,\ \forall i, j,$$
$M$: samples of data, $X$: overcomplete dictionary matrix, $Y$: sparse representation of $M$.
Synthetic experiments (compared with K-SVD): $X^*$: random $20 \times 50$, columns normalized; $Y^*$: 3 random non-zeros in each column; $M$: $X^*Y^*$ + white Gaussian noise.
Denote $X$ as the learned dictionary. Distance measure: $dist(x_j^*, X) = \min_i \left(1 - |x_i^T x_j^*|\right)$.
Numerical Experiments
Test: a) Solve with different numbers of samples and compute the percentage of recovered columns: $x_j^*$ is recovered if $dist(x_j^*, X) \le 0.01$, and define $dist(X^*, X) = \mathrm{mean}_j\,(dist(x_j^*, X))$.
Dictionary size: 20*50, Sparsity: 3, Noise: 20 dB.
In this case (sparsity = 3), SeMF recovers the dictionary better when the number of samples is small (< 500).
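The recovery measure can be computed as in the following sketch. The function names are hypothetical, the inner product is taken in absolute value so that an atom and its negative count as the same, and columns are renormalized first as a safeguard (an addition made here).

```python
import numpy as np

def atom_distances(X_true, X_learned):
    """dist(x_j*, X) = min_i (1 - |x_i^T x_j*|) for every true atom x_j*."""
    A = X_true / np.linalg.norm(X_true, axis=0)
    B = X_learned / np.linalg.norm(X_learned, axis=0)
    # Rows index learned atoms i, columns index true atoms j.
    return np.min(1.0 - np.abs(B.T @ A), axis=0)

def recovery_rate(X_true, X_learned, thresh=0.01):
    """Fraction of true atoms with distance <= thresh."""
    return float(np.mean(atom_distances(X_true, X_learned) <= thresh))

def dictionary_distance(X_true, X_learned):
    """dist(X*, X): mean of the per-atom distances."""
    return float(np.mean(atom_distances(X_true, X_learned)))
```

The measure is invariant to column order and sign, which matters because a factorization $XY$ determines the dictionary only up to such permutations and sign flips.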
Numerical Experiments
b) The smallest number of samples needed to reach 95% recovery of the dictionary, for different sparsity levels.
Number of samples: [200:50:2000]; sparsity: [1 2 3 4 5 6]; results averaged over 10 experiments.
Dictionary size: 20*50, Noise: 20 dB.
Numerical Experiments
c) Recovery with respect to different noise levels.
For each SNR, compute the number of recovered atoms, repeat 100 tests, sort the results and average in groups of 20. SNR = [10 20 30] dB.
Numerical Experiments
Test on Swimmer Datasets
• Swimmer consists of 256 images of size 32*32. Each image is composed of 5 parts out of 17 distinct non-overlapping basis images: a centered invariant part called the torso, and four limbs, each in one of 4 positions.
• Goal: extracting the non-negative basis images $\{X_1, \dots, X_{17}\}$, with $M \in \mathbb{R}^{1024\times 256},\ X \in \mathbb{R}^{1024\times 17},\ Y \in \mathbb{R}^{17\times 256}$.
Numerical Experiments
Different structure enforcing
1. Sparse NMF
$$\min_{X \ge 0,\ Y \ge 0}\ \frac{1}{2}\|M - XY\|_F^2 \quad \text{s.t. } \|y_j\|_0 \le 5,\ j = 1, \dots, 256$$
2. Sparse NMF with equal non-zero coefficients
Latent property: the 5 parts of a swimmer image have the same coefficient, which means there are 5 equal non-zeros in each column of the sparse representation $Y$.
$$\min_{X \ge 0,\ Y \ge 0}\ \frac{1}{2}\|M - XY\|_F^2 \quad \text{s.t. } \|y_j\|_0 \le 5,\ y_j = \mathrm{mean}(y_j)_{nnz},\ \forall j$$
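One way to project a column onto the second structure (non-negative, at most 5 non-zeros, all non-zeros equal) is sketched below. It is derived here rather than taken from the slides: for a support of size $s$ with common value $c \ge 0$, the best $c$ is the mean of the selected entries, and the squared distance drops by $(\text{sum of selected entries})^2 / s$, so it suffices to scan the top-$s$ entries for $s = 1, \dots, 5$. The function name and the `s_max` parameter are hypothetical.

```python
import numpy as np

def project_equal_nonneg_sparse(y, s_max=5):
    """Euclidean projection of a vector onto
    {y >= 0, at most s_max non-zeros, all non-zeros equal}."""
    y = np.asarray(y, dtype=float)
    order = np.argsort(-y)                      # largest entries first
    csum = np.cumsum(y[order][:s_max])          # sum of the top-s entries
    sizes = np.arange(1, len(csum) + 1)
    # Reduction in squared distance when the top-s entries share their mean.
    gain = np.where(csum > 0, csum ** 2 / sizes, 0.0)
    out = np.zeros_like(y)
    best = int(np.argmax(gain))
    if gain[best] > 0:
        s = best + 1
        out[order[:s]] = csum[best] / s         # common value = mean of top-s
    return out
```

A simpler heuristic would keep the 5 largest entries and replace them by their mean; the scan over $s$ differs from it only when fewer than 5 entries carry most of the mass.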
Numerical Experiments
Results on different structure enforcing
(Figure panels: Sparse NMF; Sparse NMF with equal coefficients.)
Improved but no sequence
Numerical Experiments
Different structure enforcing
3. Sparse NMF with orthogonal property
Sparse NMF alone does not clearly extract the central torso, but the torso is potentially sparse and orthogonal to the 4 limbs. (In fact all 5 parts are independent and their non-zero parts do not overlap.)
$$\min_{X \ge 0,\ Y \ge 0}\ \frac{1}{2}\|M - XY\|_F^2 \quad \text{s.t. } x_{1,\dots,16} \perp x_{17},\ \|y_j\|_0 \le 5,$$
together with an $\ell_0$ sparsity bound on $x_{17}$.
Numerical Experiments
Results on different structure enforcing
(Figure panels: Sparse NMF; Sparse NMF with orthogonal structure.)
The torso is classified.
Numerical Experiments
Different structure enforcing
4. Sparse NMF with combinatorial patterns
Divide the rows of $Y$ into 5 groups $G_1, \dots, G_5$ (4 limbs and 1 torso); each group has only 1 non-zero and the 5 non-zeros are equal.
$$\min_{X \ge 0,\ Y \ge 0}\ \frac{1}{2}\|M - XY\|_F^2 \quad \text{s.t. } \|y_{G_i, j}\|_0 = 1,\ i = 1, \dots, 5,\ y_j = \mathrm{mean}(y_j)_{nnz}$$
Numerical Experiments
Results on different structure enforcing
(Figure: Sparse NMF enforcing combinatorial patterns.)
The parts are classified quite well.
Numerical Experiments
Test on Face Images
• Goal: return a part-based representation.
The basis elements extract facial features such as eyes, nose and lips.
Numerical Experiments
• Structure property: $Y$ is non-negative, $X$ is sparse and non-negative.
a) L1 sparse NMF (relaxation of L0 sparsity, convex): penalize or constrain the L1 norm of $X$ or $Y$ (Hoyer 2004).
b) L0 sparse NMF (more intuitive, non-convex): constrain the L0 norm of $X$ or $Y$.
Few works address L0 sparse NMF: Non-negative K-SVD (NNK-SVD, 2005), Probabilistic sparse matrix factorization (PSMF, 2004), NMFL0 (2012).
Numerical Experiments
• Model: sparsity enforced on the matrix $X$
$$\min_{X \ge 0,\ Y \ge 0}\ \frac{1}{2}\|M - XY\|_F^2 \quad \text{s.t. } \|x_i\|_0 \le K$$
• Compare to the $\ell_0$-NMF$_X$ algorithm (R. Peharz, F. Pernkopf, 2012):
a) fix $Y$, calculate $X$ using non-negative least squares (NNLS),
b) update $Y$ while maintaining the sparse structure of $X$ (ANLS or multiplicative update).
Difference in subproblems a) and b):
SeMF: minimize the augmented Lagrangian function,
$\ell_0$-NMF$_X$: minimize the original objective.
Numerical Experiments
• Apply to the ORL dataset (10304 × 400, 25 basis parts)
(Figure: basis images from SeMF and NMFL0 at nnz = 33%, 25% and 10%.)
Numerical Experiments
• Comparison of reconstruction quality and running time.
SeMF achieves similar quality but is much faster than $\ell_0$-NMF$_X$ in the less sparse cases (more non-zeros).
Note: $\ell_0$-NMF$_X$ performs better than Hoyer's method in both SNR and time in the paper "Sparse nonnegative matrix factorization with L0-constraints" by R. Peharz and F. Pernkopf.
Conclusions
• SeMF can handle many different structures provided they have easy projections.
• It uses an ADM approach on the augmented Lagrangian of a split model.
• Dynamically updating the penalty parameters performs well empirically.
• Potential applications to many problems with latent structure properties, to improve solution quality.
• Further work: experiments and comparisons, non-convex complications, parameter choices, etc.
Thank you!