Structured Non-Negative Matrix Factorization
Hans Laurberg
Aalborg Universitet
Joint work with Lars Kai Hansen, Søren Holt Jensen, Mads G. Christensen and Mikkel N. Schmidt
Structured NMF – p. 1/32
Outline
1. Introduction to NMF
2. Structured NMF
(a) Affine NMF
(b) Instrument separation using NMF
Rank Reduction
Let V, V̂, W and H be matrices.
Set up:
V ≈ V̂ = WH
where V̂ is a low rank approximation of V.
Purpose:
• Noise reduction
• Classification
• Data reduction
Music Example
Source: Smaragdis 2004
Principal Component Analysis (PCA)
Task:
$\hat{V} = \arg\min_{\operatorname{rank}(\hat{V}) \le d} \|V - \hat{V}\|_F^2$
Solution:
$\hat{V} = \sum_{i=1}^{d} \lambda_i p_i q_i^T$
where $\lambda_i$, $p_i$ and $q_i$ are the d largest singular triplets of V.
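The rank-d solution above can be sketched with a truncated SVD. This is a minimal illustration, assuming NumPy; the function name `best_rank_d` and the toy matrix are hypothetical and not part of the talk.

```python
import numpy as np

def best_rank_d(V, d):
    """Best rank-d approximation of V in the Frobenius norm."""
    # Thin SVD: V = P @ diag(lam) @ Qt, singular values in descending order.
    P, lam, Qt = np.linalg.svd(V, full_matrices=False)
    # Keep only the d largest singular triplets (lam_i, p_i, q_i).
    return (P[:, :d] * lam[:d]) @ Qt[:d, :]

rng = np.random.default_rng(0)
V = rng.random((6, 8))                  # non-negative toy data
V_hat = best_rank_d(V, d=2)

print(np.linalg.matrix_rank(V_hat))     # rank of the approximation
print(V_hat.min())                      # entries of V_hat are not constrained to be >= 0
```

The squared approximation error equals the sum of the squared discarded singular values, which gives a quick sanity check for such a routine.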
Positive Data
Examples of positive V :
1. Images
2. Amplitude/Power Spectrums
3. Histograms
Interpretation of negative solution?
Positive Construction
The good news
1. Part based
2. Sparse
3. Understandable
NMF
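Task: Find element-wise non-negative W and H that minimize the error function, $E(\hat{V})$.
Some Error Functions:
• $E_{Power}(\hat{V}) = \|V - \hat{V}\|_F^2$
• $E_{Sparse}(\hat{V}) = \|V - WH\|_F^2 + \lambda \sum_{ij} H_{ij}$
• $E_{KL}(\hat{V}) = \sum_{ij} \left( V_{ij} \log\frac{V_{ij}}{\hat{V}_{ij}} - V_{ij} + \hat{V}_{ij} \right)$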
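The three error functions can be written down directly. The sketch below assumes NumPy; the function names and the small constant `eps` (used only to avoid log(0) and division by zero) are illustrative assumptions, not part of the slides.

```python
import numpy as np

def e_power(V, V_hat):
    """Squared Frobenius norm of the residual."""
    return np.sum((V - V_hat) ** 2)

def e_sparse(V, W, H, lam):
    """Squared error plus lam times the sum of the entries of H."""
    return np.sum((V - W @ H) ** 2) + lam * np.sum(H)

def e_kl(V, V_hat, eps=1e-12):
    """Generalized Kullback-Leibler divergence between V and V_hat."""
    # eps is an assumed numerical guard, not part of the definition.
    return np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat)

rng = np.random.default_rng(0)
W = rng.random((5, 2))
H = rng.random((2, 7))
V = rng.random((5, 7))

print(e_power(V, W @ H), e_sparse(V, W, H, lam=0.1), e_kl(V, W @ H))
```

All three are zero when the approximation is exact (and, for the sparse variant, when λ = 0), and positive otherwise.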
Power
Generation model:
V = V̂ + N,
where the elements in N are Gaussian IID.
$p(V \mid \hat{V}) \propto \prod_{ij} \exp\left( \frac{(V_{ij} - \hat{V}_{ij})^2}{-2\sigma^2} \right) = \exp\left( \frac{\|V - \hat{V}\|_F^2}{-2\sigma^2} \right)$
Minimizing $E_{Power}$ is equivalent to maximum likelihood (ML) estimation.
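Taking the negative logarithm of the likelihood makes the equivalence explicit. In this short worked step, "const" collects the terms that do not depend on $\hat{V}$; the noise variance $\sigma^2$ is the one from the generation model.

```latex
-\log p(V \mid \hat{V})
  = \frac{1}{2\sigma^2} \sum_{ij} (V_{ij} - \hat{V}_{ij})^2 + \text{const}
  = \frac{1}{2\sigma^2} E_{Power}(\hat{V}) + \text{const}
```

Since $\sigma^2 > 0$ is fixed, the $\hat{V}$ that maximizes the likelihood is the one that minimizes $E_{Power}$.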
Sparse
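Generation model V = V̂ + N, where the elements in N are Gaussian IID and the prior of H is exponential IID.
$p(\hat{V} \mid V) \propto p(\hat{V})\, p(V \mid \hat{V})$
$\propto \exp\left( -\alpha \sum_{ij} H_{ij} \right) \exp\left( -\|V - \hat{V}\|_F^2 / 2\sigma^2 \right)$
$\propto \exp\left( \left( -\|V - \hat{V}\|_F^2 - 2\alpha\sigma^2 \sum_{ij} H_{ij} \right) \frac{\sigma^{-2}}{2} \right)$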
Minimizing $E_{Sparse}$ is equivalent to maximum a posteriori (MAP) estimation.
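The negative log-posterior shows how the sparsity weight relates to the model parameters. This is a worked step under the stated model (Gaussian noise with variance $\sigma^2$, exponential prior with rate $\alpha$ on the entries of H); "const" collects the terms independent of W and H.

```latex
-\log p(\hat{V} \mid V)
  = \frac{1}{2\sigma^2} \|V - \hat{V}\|_F^2 + \alpha \sum_{ij} H_{ij} + \text{const}
  = \frac{1}{2\sigma^2} \Big( \|V - WH\|_F^2
      + \underbrace{2\alpha\sigma^2}_{\lambda} \sum_{ij} H_{ij} \Big) + \text{const}
```

Under these assumptions the penalty weight in $E_{Sparse}$ corresponds to $\lambda = 2\alpha\sigma^2$.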
Kullback-Leibler Divergence
The $E_{KL}$ error function is equivalent to PLSA (probabilistic latent semantic analysis):
Gaussier and Goutte 2005
Ding, Li and Peng 2006
PLSA:
• ML of V using a mixture model with d mixtures.
• Each mixture consists of two independent variables.
Implementation
Example - W in the “Power” error function:
$-\nabla_W E(V, \hat{V}) = -\nabla_W \|V - WH\|_F^2 = \underbrace{2 V H^T}_{\nabla_W^+} - \underbrace{2 W H H^T}_{\nabla_W^-}$
Update rule:
$W = W \odot \frac{\nabla_W^+}{\nabla_W^-} = W \odot \frac{V H^T}{W H H^T}$
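The update rule for W can be sketched in a few lines of NumPy. The matching H update, $H = H \odot \frac{W^T V}{W^T W H}$, is not on the slide; it is the standard counterpart obtained by the same argument and is included here as an assumption. The function name `nmf_power`, the random initialization and the guard `eps` are likewise illustrative.

```python
import numpy as np

def nmf_power(V, d, n_iter=200, eps=1e-9, seed=0):
    """Multiplicative updates for the 'Power' (squared error) cost."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, d)) + eps        # strictly positive start
    H = rng.random((d, m)) + eps
    errors = []
    for _ in range(n_iter):
        # W <- W * (V H^T) / (W H H^T), element-wise product and division.
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # H <- H * (W^T V) / (W^T W H): assumed counterpart of the W rule.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        errors.append(np.sum((V - W @ H) ** 2))
    return W, H, errors

rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))   # exactly rank-3, non-negative
W, H, errors = nmf_power(V, d=3)
print(errors[0], errors[-1])
```

Because every factor in the update is non-negative, W and H stay non-negative without any explicit projection step.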
Critical Theoretical Issues
1. When does NMF exist?
2. Gaussian assumption vs. positivity.
3. Convergence to local minima.
NMF Summary
1. V ≈ V̂ = WH
2. The non-negativity constraint leads to part-based basis vectors.
3. There are some theoretical foundations for the NMF cost functions.
4. There are critical theoretical issues.
Structured NMF
How small changes can make NMF more useful.
1. Affine NMF (Laurberg and Hansen, ICASSP 2007)
2. Instrument separation using NMF (Laurberg and Schmidt, ongoing work)
Affine NMF
Problem: An offset leads to non-uniqueness.
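[Figure: four 2-D panels titled Data, Power, Sparse and Affine; horizontal axes from 0 to 1, vertical axes from 0 to 4. The Data and Affine panels are annotated with W0, W1, W2; the Power and Sparse panels with W1, W2, W3.]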
Affine NMF
Affine NMF model:
$\hat{V} = WH + w_0 \mathbf{1}^T$
Affine NMF cost function:
$E(\hat{V}) = \|V - \hat{V}\|_F^2 + \lambda \sum_{ij} H_{ij}$
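One way to fit the affine model is to apply the same gradient-ratio heuristic to W, H and the offset $w_0$. The sketch below is an assumption-laden illustration and not necessarily the algorithm of the cited ICASSP paper: the names `affine_cost` and `affine_nmf`, the update order, the guard `eps` and the synthetic data are all hypothetical.

```python
import numpy as np

def affine_cost(V, W, H, w0, lam):
    # w0 has shape (n, 1); broadcasting adds it to every column, i.e. w0 @ 1^T.
    V_hat = W @ H + w0
    return np.sum((V - V_hat) ** 2) + lam * np.sum(H)

def affine_nmf(V, d, lam=0.1, n_iter=300, eps=1e-9, seed=0):
    """Multiplicative updates for V ~ W H + w0 1^T with a sparsity term on H."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, d)) + eps
    H = rng.random((d, m)) + eps
    w0 = rng.random((n, 1)) + eps
    costs = [affine_cost(V, W, H, w0, lam)]
    for _ in range(n_iter):
        V_hat = W @ H + w0
        W *= (V @ H.T) / (V_hat @ H.T + eps)
        V_hat = W @ H + w0
        # The gradient of the penalty contributes lam; the factor 2 of the
        # squared-error gradient turns it into lam / 2 in the denominator.
        H *= (W.T @ V) / (W.T @ V_hat + lam / 2 + eps)
        V_hat = W @ H + w0
        # Offset update: row sums of V over row sums of the current model.
        w0 *= V.sum(axis=1, keepdims=True) / (V_hat.sum(axis=1, keepdims=True) + eps)
        costs.append(affine_cost(V, W, H, w0, lam))
    return W, H, w0, costs

rng = np.random.default_rng(2)
offset = 2.0 + rng.random((15, 1))                        # positive offset per row
V = rng.random((15, 2)) @ rng.random((2, 40)) + offset    # data with an offset
W, H, w0, costs = affine_nmf(V, d=2)
print(costs[0], costs[-1])
```

A penalty on H alone can be reduced by scaling W up and H down, so a practical implementation would typically also constrain the scale of W; that step is omitted here for brevity.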
The Swimmer Database
The “Swimmer Database” was introduced by Donoho and Stodden 2004 to discuss the uniqueness issues.
[Figure: four image panels titled Data, Power, Sparse and Affine]
The Swimmer Database
Two-dimensional projection of the “Swimmer Database”.
[Figure: four panels titled Data, Power, Sparse and Affine; both axes from 0 to 1]
Business Card Data Set
Photos plus ‘watermark’
[Figure: four image panels titled Data, Power, Sparse and Affine]
Business Card Data Set
Two-dimensional projection of the business card data set.
[Figure: four panels titled Data, Power, Sparse and Affine; horizontal axes from 0 to 4, vertical axes from 0 to 5]
Summary of Affine NMF
1. Offsets occur in different kinds of positive data.
2. If the data has an offset, performance is improved if an affine method is used.
NMF and Instrument Separation
Existing Instrument Separation
1. Let V be the spectrogram of a music piece
2. Use training data (instruments playing solo) to find instrument models $W_1, \dots, W_N$
3. Estimate mixing coefficients $H_1, \dots, H_N$
4. $\hat{V} = \sum_i W_i H_i$
5. The instruments are separated by: $\hat{V}_i = W_i H_i$
Existing Instrument Separation
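Model training:
• $V_{Bass} \approx W_{Bass} H_{temp1}$
• $V_{Drum} \approx W_{Drum} H_{temp2}$
Separation:
• $V_{Data} \approx \begin{bmatrix} W_{Bass} & W_{Drum} \end{bmatrix} \begin{bmatrix} H_{Bass} \\ H_{Drum} \end{bmatrix}$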
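The two-stage procedure (train instrument models on solos, then estimate only the activations on the mixture) can be sketched as follows. The helper `fit_nmf`, the squared-error multiplicative updates, and the synthetic "spectrograms" are assumptions for illustration; real spectrogram data would replace the random matrices.

```python
import numpy as np

def fit_nmf(V, d=None, n_iter=200, eps=1e-9, seed=0, W_fixed=None):
    """Squared-error NMF; if W_fixed is given, only H is estimated."""
    rng = np.random.default_rng(seed)
    W = W_fixed if W_fixed is not None else rng.random((V.shape[0], d)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        if W_fixed is None:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return W, H

rng = np.random.default_rng(0)
n_freq, n_time, d = 40, 60, 3
B = rng.random((n_freq, d))                      # hypothetical "true" bass basis
D = rng.random((n_freq, d))                      # hypothetical "true" drum basis
V_bass = B @ rng.random((d, n_time))             # bass playing solo
V_drum = D @ rng.random((d, n_time))             # drum playing solo
V_data = B @ rng.random((d, n_time)) + D @ rng.random((d, n_time))  # mixture

# Model training on the solo recordings.
W_bass, _ = fit_nmf(V_bass, d)
W_drum, _ = fit_nmf(V_drum, d, seed=1)

# Separation: keep the stacked models fixed and estimate the activations.
W = np.hstack([W_bass, W_drum])
_, H = fit_nmf(V_data, W_fixed=W)
V_hat_bass = W_bass @ H[:d]
V_hat_drum = W_drum @ H[d:]
```

The two separated estimates sum to the full model $W H$ of the mixture by construction.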
New Method
• Is it possible to separate instruments without instrument models?
Joint Estimation and Separation
Separation:
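$\begin{bmatrix} V_{Data} & V_{Bass} & V_{Drum} \end{bmatrix} \approx \begin{bmatrix} W_{Bass} & W_{Drum} \end{bmatrix} \begin{bmatrix} H_{Bass} & H_{temp1} & 0 \\ H_{Drum} & 0 & H_{temp2} \end{bmatrix}$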
Joint Estimation and Separation
Implementation?
No problem:
$W = W \odot \frac{\nabla_W^+}{\nabla_W^-}$
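A plausible reading of why the implementation is "no problem": a multiplicative update can never turn a zero entry into a non-zero one, so the block structure of H only has to be imposed at initialization. The sketch below illustrates this for the joint bass/drum factorization, assuming squared-error updates; the variable names, the `mask` array and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_freq, n_time, d = 30, 40, 2
eps = 1e-9

# Synthetic stand-ins for the three spectrograms.
W_true = rng.random((n_freq, 2 * d))
V_data = W_true @ rng.random((2 * d, n_time))        # mixture
V_bass = W_true[:, :d] @ rng.random((d, n_time))     # bass solo
V_drum = W_true[:, d:] @ rng.random((d, n_time))     # drum solo
V = np.hstack([V_data, V_bass, V_drum])

# Zero pattern of H: rows are [bass; drum], columns are [data, bass solo, drum solo].
mask = np.ones((2 * d, 3 * n_time))
mask[d:, n_time:2 * n_time] = 0.0    # drum components are silent in the bass solo
mask[:d, 2 * n_time:] = 0.0          # bass components are silent in the drum solo

W = rng.random((n_freq, 2 * d)) + eps
H = (rng.random(mask.shape) + eps) * mask            # zeros imposed once, here

for _ in range(300):
    W *= (V @ H.T) / (W @ H @ H.T + eps)
    H *= (W.T @ V) / (W.T @ W @ H + eps)             # zeros stay zero

V_hat_bass = W[:, :d] @ H[:d, :n_time]               # bass part of the mixture
V_hat_drum = W[:, d:] @ H[d:, :n_time]               # drum part of the mixture
```

No masking step is needed inside the loop, which is what makes the structured variant as easy to implement as plain NMF.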
No Training Data
Structured NMF
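$\begin{bmatrix} V_{NoPiano} & V_{NoBass} & V_{NoDrum} \end{bmatrix} \approx WH = \begin{bmatrix} W_{Piano} & W_{Bass} & W_{Drum} \end{bmatrix} \begin{bmatrix} 0 & * & * \\ * & 0 & * \\ * & * & 0 \end{bmatrix}$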
Are zeros enough to ensure uniqueness? (yes)
Demo
Music NMF Summary
1. Instrument separation is possible without solo songs if labels are known.
2. Easy to make update rule.
3. Ongoing work.
Questions?