Structured Non-Negative Matrix Factorization

Hans Laurberg

Aalborg Universitet

Joint work with Lars Kai Hansen, Søren Holt Jensen, Mads G. Christensen and Mikkel N. Schmidt

Structured NMF – p. 1/32

Outline

1. Introduction to NMF

2. Structured NMF

(a) Affine NMF

(b) Instrument separation using NMF


Rank Reduction

Set up: Let V, V̂, W and H be matrices.

V ≈ V̂ = WH

where V̂ is a low rank approximation of V.

Purpose:

• Noise reduction

• Classification

• Data reduction


Music Example

Source: Smaragdis 2004


Principal Component Analysis (PCA)

Task:

$$\hat{V} = \arg\min_{\operatorname{rank}(\hat{V}) \le d} \|V - \hat{V}\|_F^2$$

Solution:

$$\hat{V} = \sum_{i=1}^{d} \lambda_i p_i q_i^T$$

where $\lambda_i$, $p_i$ and $q_i$ are the $d$ largest singular triplets of V (singular value, left singular vector and right singular vector).
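The rank-d solution can be sketched with a truncated singular value decomposition. This is a minimal illustration, assuming NumPy; the function name `low_rank_approx` and the random test matrix are hypothetical and not part of the talk.

```python
import numpy as np

def low_rank_approx(V, d):
    """Best rank-d approximation of V in the Frobenius norm (truncated SVD)."""
    # P holds left singular vectors (columns), lam the singular values in
    # descending order, Qt the right singular vectors (rows).
    P, lam, Qt = np.linalg.svd(V, full_matrices=False)
    # Sum of the d leading triplets: lam_i * p_i * q_i^T.
    return (P[:, :d] * lam[:d]) @ Qt[:d, :]

rng = np.random.default_rng(0)
V = rng.random((6, 5))                 # hypothetical non-negative data
V_hat = low_rank_approx(V, d=2)
err = np.linalg.norm(V - V_hat, "fro") ** 2
has_negative = bool((V_hat < 0).any())  # the approximation is not constrained to be non-negative
```

Nothing in this construction keeps `V_hat` non-negative, so `has_negative` may be true even when every entry of `V` is positive.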


Positive Data

Examples of positive V :

1. Images

2. Amplitude/Power Spectrums

3. Histograms

Interpretation of negative solution?


Positive Construction

The good news

1. Part based

2. Sparse

3. Understandable


NMF

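Task: Find element-wise non-negative W and H that minimize the error function $E(\hat{V})$.

Some error functions:

• $E_{\mathrm{Power}}(\hat{V}) = \|V - \hat{V}\|_F^2$

• $E_{\mathrm{Sparse}}(\hat{V}) = \|V - WH\|_F^2 + \lambda \sum_{ij} H_{ij}$

• $E_{\mathrm{KL}}(\hat{V}) = \sum_{ij} \left( V_{ij} \log \frac{V_{ij}}{\hat{V}_{ij}} - V_{ij} + \hat{V}_{ij} \right)$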
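The three error functions above can be written down directly. The following is a small sketch, assuming NumPy; the function names and the `eps` guard against log(0) are illustrative choices, not part of the slides.

```python
import numpy as np

def e_power(V, V_hat):
    """Squared Frobenius norm of the residual."""
    return np.sum((V - V_hat) ** 2)

def e_sparse(V, W, H, lam):
    """Squared error plus lam times the sum of the entries of H."""
    return np.sum((V - W @ H) ** 2) + lam * np.sum(H)

def e_kl(V, V_hat, eps=1e-12):
    """Generalised Kullback-Leibler divergence between V and V_hat."""
    # eps (an assumption of this sketch) avoids log(0) and division by zero.
    ratio = (V + eps) / (V_hat + eps)
    return np.sum(V * np.log(ratio) - V + V_hat)

rng = np.random.default_rng(0)
W = rng.random((4, 2))        # hypothetical factors
H = rng.random((2, 5))
V = W @ H                     # data that the model fits exactly
```

For this exactly factorisable `V`, the power and KL errors vanish at `V_hat = W @ H`, while the sparse error reduces to the penalty term alone.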


Power

Generation model:

V = V̂ + N,

where the elements in N are Gaussian IID.

$$p(V \mid \hat{V}) \propto \prod_{ij} \exp\left( \frac{(V_{ij} - \hat{V}_{ij})^2}{-2\sigma^2} \right) = \exp\left( \frac{\|V - \hat{V}\|_F^2}{-2\sigma^2} \right)$$

Minimizing $E_{\mathrm{Power}}$ is equivalent to maximum likelihood (ML) estimation.


Sparse

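Generation model: V = V̂ + N, where the elements in N are Gaussian IID and the prior of H is exponential IID.

$$\begin{aligned}
p(\hat{V} \mid V) &\propto p(\hat{V})\, p(V \mid \hat{V}) \\
&\propto \exp\Bigl( -\alpha \sum_{ij} H_{ij} \Bigr) \exp\Bigl( -\|V - \hat{V}\|_F^2 / 2\sigma^2 \Bigr) \\
&\propto \exp\Bigl( -\|V - \hat{V}\|_F^2 - 2\alpha\sigma^2 \sum_{ij} H_{ij} \Bigr)^{\sigma^{-2}/2}
\end{aligned}$$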

Minimizing $E_{\mathrm{Sparse}}$ is equivalent to maximum a posteriori (MAP) estimation.
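The link between the posterior and the sparse error function becomes explicit after taking the negative logarithm. The lines below are a sketch that follows from the Gaussian noise model and the exponential prior; the identification $\lambda = 2\alpha\sigma^2$ is inferred from those expressions rather than stated on the slide.

```latex
\begin{align*}
-\log p(\hat{V} \mid V)
  &= \frac{1}{2\sigma^2}\,\|V - \hat{V}\|_F^2 + \alpha \sum_{ij} H_{ij} + \text{const} \\
  &= \frac{1}{2\sigma^2}\Bigl( \|V - WH\|_F^2 + 2\alpha\sigma^2 \sum_{ij} H_{ij} \Bigr) + \text{const} \\
  &= \frac{1}{2\sigma^2}\, E_{\mathrm{Sparse}}(\hat{V}) + \text{const},
  \qquad \lambda = 2\alpha\sigma^2 .
\end{align*}
```

Setting $\alpha = 0$ removes the prior term and leaves the power error function, which is the ML case.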


Kullback-Leibler Divergence

The $E_{\mathrm{KL}}$ error function is equivalent to probabilistic latent semantic analysis (PLSA) (Gaussier and Goutte 2005; Ding, Li and Peng 2006).

PLSA:

• ML estimation of V using a mixture model with d mixture components.

• Each mixture component consists of two independent variables.


Implementation

Example - W in the “Power” error function:

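$$-\nabla_W E(V, \hat{V}) = -\nabla_W \|V - WH\|_F^2 = \underbrace{2VH^T}_{\nabla_W^+} - \underbrace{2WHH^T}_{\nabla_W^-}$$

Update rule:

$$W \leftarrow W \odot \frac{\nabla_W^+}{\nabla_W^-} = W \odot \frac{VH^T}{WHH^T}$$

where $\odot$ and the fraction denote element-wise multiplication and division.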
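The update rule can be tried out in a few lines. This is a minimal sketch, assuming NumPy. The slide only shows the update for W; the matching update for H, the small constant `eps` in the denominators and the random initialisation are assumptions of this example.

```python
import numpy as np

def nmf_power(V, d, n_iter=200, eps=1e-9, seed=0):
    """NMF with multiplicative updates for the power (Frobenius) error."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, d)) + eps
    H = rng.random((d, n)) + eps
    errors = [np.sum((V - W @ H) ** 2)]
    for _ in range(n_iter):
        # W <- W * (V H^T) / (W H H^T), element-wise
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        # Assumed counterpart for H: H <- H * (W^T V) / (W^T W H)
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        errors.append(np.sum((V - W @ H) ** 2))
    return W, H, errors

rng = np.random.default_rng(1)
V = rng.random((20, 3)) @ rng.random((3, 30))   # hypothetical rank-3 data
W, H, errors = nmf_power(V, d=3)
```

Because every step multiplies by a non-negative ratio, W and H stay non-negative without any explicit projection.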


Critical Theoretical Issues

1. When does an NMF exist?

2. Gaussian assumption vs. positivity.

3. Convergence to local minima.


NMF Summary

1. V ≈ V̂ = WH

2. Non-negative constraint leads to part-based basis vectors.

3. There are some theoretical foundations for the NMF cost functions.

4. There are critical theoretical issues.


Structured NMF

How small changes can make NMF more useful.

1. Affine NMF (Laurberg and Hansen, ICASSP 2007)

2. Instrument separation using NMF (Laurberg and Schmidt, ongoing work)


Affine NMF

Problem: An offset leads to non-uniqueness.

[Figure: four panels, each with a horizontal axis from 0 to 1 and a vertical axis from 0 to 4. Data: basis vectors W0, W1, W2. Power: W1, W2, W3. Sparse: W1, W2, W3. Affine: W0, W1, W2.]


Affine NMF

Affine NMF model:

$$\hat{V} = WH + w_0 \mathbf{1}^T$$

Affine NMF cost function:

$$E(\hat{V}) = \|V - \hat{V}\|_F^2 + \lambda \sum_{ij} H_{ij}$$
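The affine model and its cost function can be sketched with multiplicative updates of the same form as for plain NMF. The code below is an illustration under stated assumptions, not the algorithm from the ICASSP paper: the updates for H and w0 are derived here by splitting the gradient of the cost above into positive and negative parts, and the rescaling of W to unit-norm columns is one possible way to keep the sparsity penalty from being evaded by scaling.

```python
import numpy as np

def affine_nmf(V, d, lam=0.0, n_iter=300, eps=1e-9, seed=0):
    """Sketch of affine NMF: V_hat = W H + w0 1^T with an L1 penalty on H."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, d)) + eps
    H = rng.random((d, n)) + eps
    w0 = rng.random((m, 1)) + eps       # offset vector, broadcast over columns

    def cost():
        V_hat = W @ H + w0              # same as W H + w0 1^T
        return np.sum((V - V_hat) ** 2) + lam * np.sum(H)

    costs = [cost()]
    for _ in range(n_iter):
        V_hat = W @ H + w0
        W *= (V @ H.T) / (V_hat @ H.T + eps)
        # Assumed normalisation: unit-norm columns in W, H rescaled so W H is unchanged.
        norms = np.linalg.norm(W, axis=0, keepdims=True) + eps
        W /= norms
        H *= norms.T
        V_hat = W @ H + w0
        H *= (W.T @ V) / (W.T @ V_hat + lam / 2 + eps)
        V_hat = W @ H + w0
        w0 *= V.sum(axis=1, keepdims=True) / (V_hat.sum(axis=1, keepdims=True) + eps)
        costs.append(cost())
    return W, H, w0, costs

rng = np.random.default_rng(1)
# Hypothetical data: a rank-2 non-negative part plus a constant offset of 2.
V = rng.random((15, 2)) @ rng.random((2, 40)) + 2.0
W, H, w0, costs = affine_nmf(V, d=2, lam=0.1)
```

The offset is estimated jointly with W and H, so the basis vectors do not have to absorb the constant part of the data.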


The Swimmer Database

The “Swimmer Database” was introduced by Donoho and Stodden (2004) to discuss uniqueness issues.

[Figure: four panels labelled Data, Power, Sparse and Affine.]


The Swimmer Database

Two-dimensional projection of the “Swimmer Database”.

[Figure: four panels labelled Data, Power, Sparse and Affine; both axes run from 0 to 1.]


Business Card Data Set

Photos plus ‘watermark’

[Figure: four panels labelled Data, Power, Sparse and Affine.]


Business Card Data Set

Two-dimensional projection of the business card data set.

[Figure: four panels labelled Data, Power, Sparse and Affine; horizontal axis from 0 to 4, vertical axis from 0 to 5.]


Summary of Affine NMF

1. Offsets occur in different kinds of positive data.

2. If the data has an offset, performance is improved if an affine method is used.


NMF and Instrument Separation


Existing Instrument Separation

1. Let V be the spectrogram of a music piece.

2. Use training data (instruments playing solo) to find instrument models $W_1, \dots, W_N$.

3. Estimate mixing coefficients $H_1, \dots, H_N$.

4. $\hat{V} = \sum_i W_i H_i$

5. The instruments are separated by $\hat{V}_i = W_i H_i$.


Existing Instrument Separation

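Model training:

• $V_{\mathrm{Bass}} \approx W_{\mathrm{Bass}} H_{\mathrm{temp1}}$

• $V_{\mathrm{Drum}} \approx W_{\mathrm{Drum}} H_{\mathrm{temp2}}$

Separation:

• $V_{\mathrm{Data}} \approx \begin{bmatrix} W_{\mathrm{Bass}} & W_{\mathrm{Drum}} \end{bmatrix} \begin{bmatrix} H_{\mathrm{Bass}} \\ H_{\mathrm{Drum}} \end{bmatrix}$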
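The two stages, model training on solo recordings followed by separation with fixed instrument models, can be sketched as follows. This assumes NumPy and the power error function with multiplicative updates; the helper `nmf`, the matrix sizes and the synthetic "spectrograms" are hypothetical stand-ins for real audio data.

```python
import numpy as np

def nmf(V, d, W=None, n_iter=200, eps=1e-9, seed=0):
    """Power-error NMF. If W is given it is kept fixed and only H is estimated."""
    rng = np.random.default_rng(seed)
    fixed_W = W is not None
    if not fixed_W:
        W = rng.random((V.shape[0], d)) + eps
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        if not fixed_W:
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return W, H

rng = np.random.default_rng(0)
# Synthetic magnitude spectrograms (frequency bins x time frames).
W_bass_true, W_drum_true = rng.random((64, 2)), rng.random((64, 2))
V_bass = W_bass_true @ rng.random((2, 50))      # bass playing solo
V_drum = W_drum_true @ rng.random((2, 50))      # drum playing solo
V_data = W_bass_true @ rng.random((2, 80)) + W_drum_true @ rng.random((2, 80))

# Model training: one basis per instrument; the training activations are discarded.
W_bass, _ = nmf(V_bass, d=2, seed=1)
W_drum, _ = nmf(V_drum, d=2, seed=2)

# Separation: stack the models, keep them fixed, estimate only H on the mixture.
W = np.hstack([W_bass, W_drum])
_, H = nmf(V_data, d=4, W=W)
V_hat_bass = W_bass @ H[:2]
V_hat_drum = W_drum @ H[2:]
```

The two separated spectrograms add up to the full approximation of the mixture, since they use disjoint rows of H.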


New Method

• Is it possible to separate instruments without instrument models?


Joint Estimation and Separation

Separation:

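$$\begin{bmatrix} V_{\mathrm{Data}} & V_{\mathrm{Bass}} & V_{\mathrm{Drum}} \end{bmatrix} \approx \begin{bmatrix} W_{\mathrm{Bass}} & W_{\mathrm{Drum}} \end{bmatrix} \begin{bmatrix} H_{\mathrm{Bass}} & H_{\mathrm{temp1}} & 0 \\ H_{\mathrm{Drum}} & 0 & H_{\mathrm{temp2}} \end{bmatrix}$$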


Joint Estimation and Separation

Implementation?

No problem:

$$W \leftarrow W \odot \frac{\nabla_W^+}{\nabla_W^-}$$
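One reason the implementation is unproblematic is that a multiplicative update never turns a zero into a non-zero value, so the block structure of H only has to be imposed at initialisation. The sketch below assumes NumPy and the power error function; the function `structured_nmf`, the boolean `mask` and the synthetic data are illustrative names and values, not taken from the talk.

```python
import numpy as np

def structured_nmf(V, mask, n_iter=300, eps=1e-9, seed=0):
    """NMF in which H is zero wherever mask is False."""
    rng = np.random.default_rng(seed)
    d = mask.shape[0]
    W = rng.random((V.shape[0], d)) + eps
    H = (rng.random(mask.shape) + eps) * mask   # structural zeros set once
    for _ in range(n_iter):
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ V) / (W.T @ W @ H + eps)    # zeros stay zero under multiplication
    return W, H

rng = np.random.default_rng(0)
k, f = 2, 64                                    # components per instrument, frequency bins
W_bass_true, W_drum_true = rng.random((f, k)), rng.random((f, k))
V_bass = W_bass_true @ rng.random((k, 40))
V_drum = W_drum_true @ rng.random((k, 40))
V_data = W_bass_true @ rng.random((k, 60)) + W_drum_true @ rng.random((k, 60))

# Columns 0:60 are the mixture, 60:100 the bass solo, 100:140 the drum solo.
V = np.hstack([V_data, V_bass, V_drum])
mask = np.ones((2 * k, V.shape[1]), dtype=bool)
mask[:k, 100:] = False                          # bass rows are zero on the drum solo
mask[k:, 60:100] = False                        # drum rows are zero on the bass solo

W, H = structured_nmf(V, mask)
V_hat_bass = W[:, :k] @ H[:k, :60]
V_hat_drum = W[:, k:] @ H[k:, :60]
```

Training and separation happen in a single factorisation here, so the instrument models are also informed by the mixture itself.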


No Training Data

Structured NMF

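$$\begin{bmatrix} V_{\mathrm{NoPiano}} & V_{\mathrm{NoBass}} & V_{\mathrm{NoDrum}} \end{bmatrix} \approx WH = \begin{bmatrix} W_{\mathrm{Piano}} & W_{\mathrm{Bass}} & W_{\mathrm{Drum}} \end{bmatrix} \begin{bmatrix} 0 & * & * \\ * & 0 & * \\ * & * & 0 \end{bmatrix}$$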

Are zeros enough to ensure uniqueness? (yes)
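The zero pattern in H follows directly from the labels that say which instrument is absent from which recording. As a small sketch, assuming NumPy, the pattern can be built as a boolean mask; the function name `zero_pattern`, the number of components per instrument and the number of frames per recording are hypothetical, and recording j is assumed to lack instrument j.

```python
import numpy as np

def zero_pattern(components, frames):
    """Boolean mask for H: block (i, j) is False when recording j lacks instrument i."""
    k = len(components)
    blocks = ~np.eye(k, dtype=bool)                   # False on the block diagonal
    rows = np.repeat(blocks, components, axis=0)      # expand instrument i to its components
    return np.repeat(rows, frames, axis=1)            # expand recording j to its frames

# Piano, bass and drum with 2, 2 and 1 components; recordings of 3, 4 and 2 frames.
mask = zero_pattern(components=[2, 2, 1], frames=[3, 4, 2])
```

Initialising H to zero wherever the mask is False is enough for a multiplicative update scheme, since such updates leave zero entries at zero.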


Demo


Music NMF Summary

1. Instrument separation is possible without solo songs if the labels are known.

2. The update rule is easy to derive.

3. Ongoing work.


Questions

?


