Bayesian Nonparametric Matrix Factorization for Recorded Music
![Page 1: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/1.jpg)
Bayesian Nonparametric Matrix Factorization for Recorded Music
Matthew D. Hoffman, David M. Blei, Perry R. Cook
Presented by Lu Ren
Electrical and Computer Engineering
Duke University
![Page 2: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/2.jpg)
Outline
Introduction
GaP-NMF Model
Variational Inference
Evaluation
Related Work
Conclusions
![Page 3: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/3.jpg)
Introduction
Breaking audio spectrograms into separate sources of sound
Identifying individual instruments and notes
Predicting hidden or distorted signals
Source separation
Previous work requires specifying the number of sources in advance; Bayesian nonparametric Gamma Process Nonnegative Matrix Factorization (GaP-NMF) infers it from the data
Computational challenge: non-conjugate pairs of distributions
• chosen because they suit spectrogram data, not for computational convenience
• a bigger variational family gives an analytic coordinate ascent algorithm
![Page 4: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/4.jpg)
GaP-NMF Model
Observation: Fourier power spectrogram of an audio signal
$X$: M by N matrix of nonnegative reals
$X_{mn}$: power at time window n and frequency bin m
A window of 2(M-1) samples → DFT → squared magnitude in each frequency bin → keep only the first M bins
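The windowing pipeline above can be sketched in a few lines of NumPy. This is only an illustration: the function name `power_spectrogram`, the Hann window, and the 50% hop are assumptions that do not come from the slides, which specify only the window length, the DFT, the squared magnitude, and the M retained bins.

```python
import numpy as np

def power_spectrogram(signal, M, hop=None):
    """Return an M-by-N power spectrogram of a 1-D signal (illustrative sketch)."""
    win_len = 2 * (M - 1)             # a window of 2(M-1) samples
    hop = hop or win_len // 2         # assumption: 50% overlap between windows
    window = np.hanning(win_len)      # assumption: Hann taper
    n_frames = 1 + (len(signal) - win_len) // hop
    X = np.empty((M, n_frames))
    for n in range(n_frames):
        frame = signal[n * hop : n * hop + win_len] * window
        spectrum = np.fft.rfft(frame)     # rfft returns exactly the first M bins
        X[:, n] = np.abs(spectrum) ** 2   # squared magnitude in each bin
    return X

# Usage: one second of a 440 Hz sine sampled at 8 kHz.
rate = 8000
t = np.arange(rate) / rate
signal = np.sin(2 * np.pi * 440 * t)
X = power_spectrogram(signal, M=257)
```

For a real input of length 2(M-1), `np.fft.rfft` already discards the redundant negative-frequency half, which is why no explicit truncation step appears.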
Assume K static sound sources
$W$ (M by K): describes these sources
$H$ (K by N): amplitude of each source changing over time
$W_{mk}$ is the average amount of energy source k exhibits at frequency m
$H_{kn}$ is the gain of source k at time n
![Page 5: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/5.jpg)
GaP-NMF Model
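Mixing K sound sources in the time domain (under certain assumptions), the spectrogram is distributed¹
$X_{mn} \sim \mathrm{Exponential}\left(\sum_k W_{mk} H_{kn}\right)$ (exponential with the given mean)
Infer both the characteristics and the number of latent audio sources
$W_{ml} \sim \mathrm{Gamma}(a, a)$, $H_{ln} \sim \mathrm{Gamma}(b, b)$, $\theta_l \sim \mathrm{Gamma}(\alpha/L, \alpha c)$
$X_{mn} \sim \mathrm{Exponential}\left(\sum_l \theta_l W_{ml} H_{ln}\right)$
$L$: truncation level
¹ Abdallah & Plumbley (2004) and Fevotte et al. (2009)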
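A minimal sketch of drawing a synthetic spectrogram from the truncated model, assuming the priors of the GaP-NMF paper: $W_{ml} \sim \mathrm{Gamma}(a, a)$, $H_{ln} \sim \mathrm{Gamma}(b, b)$, $\theta_l \sim \mathrm{Gamma}(\alpha/L, \alpha c)$ in shape/rate form, and $X_{mn}$ exponential with mean $\sum_l \theta_l W_{ml} H_{ln}$. The function name `sample_gap_nmf` and the default hyperparameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def sample_gap_nmf(M, N, L, a=1.0, b=1.0, alpha=1.0, c=1.0, seed=0):
    """Draw (X, W, H, theta) from the truncated generative model (sketch)."""
    rng = np.random.default_rng(seed)
    # NumPy's gamma takes (shape, scale), so scale = 1 / rate.
    W = rng.gamma(a, 1.0 / a, size=(M, L))
    H = rng.gamma(b, 1.0 / b, size=(L, N))
    theta = rng.gamma(alpha / L, 1.0 / (alpha * c), size=L)
    mean = (W * theta) @ H        # sum_l theta_l * W_ml * H_ln
    X = rng.exponential(mean)     # exponential with that mean, elementwise
    return X, W, H, theta

X, W, H, theta = sample_gap_nmf(M=5, N=7, L=20)
```

Scaling the columns of `W` by `theta` before the matrix product avoids building an explicit L-by-L diagonal matrix.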
![Page 6: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/6.jpg)
GaP-NMF Model
As $L$ goes to infinity, $\theta$ approximates an infinite sequence drawn from a gamma process
The number of elements greater than some $\epsilon > 0$ is finite almost surely
If $L$ is sufficiently large relative to $\alpha$, only a few elements of $\theta$ are substantially greater than 0
Setting :
[Figure: θ]
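The sparsity claim can be checked by simulation. The sketch below assumes the truncated prior $\theta_l \sim \mathrm{Gamma}(\alpha/L, \alpha c)$ (shape/rate) from the GaP-NMF paper; the function name `count_large_weights`, the threshold `eps`, and the parameter values are hypothetical choices for illustration.

```python
import numpy as np

def count_large_weights(L, alpha=1.0, c=1.0, eps=0.01, seed=0):
    """Count the elements of theta that exceed eps at truncation level L."""
    rng = np.random.default_rng(seed)
    # Shape alpha / L shrinks as L grows; scale is 1 / rate = 1 / (alpha * c).
    theta = rng.gamma(alpha / L, 1.0 / (alpha * c), size=L)
    return int(np.sum(theta > eps))

# The count should stay small even as the truncation level grows.
for L in (10, 100, 1000, 10000):
    print(L, count_large_weights(L))
```

Because the shape parameter of each element falls as 1/L, adding more components mostly adds weights that are numerically indistinguishable from zero.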
![Page 7: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/7.jpg)
Variational Inference
Variational distribution: expanded family
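Generalized Inverse-Gaussian (GIG):
$\mathrm{GIG}(y; \gamma, \rho, \tau) = \dfrac{\exp\{(\gamma - 1)\log y - \rho y - \tau / y\}\, \rho^{\gamma/2}}{2\, \tau^{\gamma/2}\, K_\gamma(2\sqrt{\rho\tau})}$, for $y \ge 0$, $\rho \ge 0$, $\tau \ge 0$
$K_\gamma$ denotes a modified Bessel function of the second kind
The gamma family is a special case of the GIG family where $\gamma > 0$, $\tau \to 0$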
![Page 8: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/8.jpg)
Variational Inference
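Lower bound of GaP-NMF model:
$\log p(X \mid a, b, \alpha, c) \ge E_q[\log p(X \mid W, H, \theta)] + E_q[\log p(W \mid a)] - E_q[\log q(W)] + E_q[\log p(H \mid b)] - E_q[\log q(H)] + E_q[\log p(\theta \mid \alpha, c)] - E_q[\log q(\theta)]$
If $y \sim \mathrm{GIG}(\gamma, \rho, \tau)$:
$E[y] = \dfrac{K_{\gamma+1}(2\sqrt{\rho\tau})\,\sqrt{\tau}}{K_\gamma(2\sqrt{\rho\tau})\,\sqrt{\rho}}$, $E[1/y] = \dfrac{K_{\gamma-1}(2\sqrt{\rho\tau})\,\sqrt{\rho}}{K_\gamma(2\sqrt{\rho\tau})\,\sqrt{\tau}}$
GIG family sufficient statistics: $y$, $1/y$, $\log y$
Gamma family sufficient statistics: $y$, $\log y$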
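The two GIG expectations that the variational updates need, $E[y]$ and $E[1/y]$, are ratios of modified Bessel functions of the second kind. A possible way to evaluate them, assuming the density is proportional to $\exp\{(\gamma - 1)\log y - \rho y - \tau / y\}$ with $\rho, \tau > 0$, is sketched below. The function name `gig_expectations` is hypothetical; `scipy.special.kve` is the exponentially scaled Bessel function, used here because the scaling cancels in the ratios and avoids underflow for large arguments.

```python
import numpy as np
from scipy.special import kve

def gig_expectations(gamma, rho, tau):
    """Return (E[y], E[1/y]) for y ~ GIG(gamma, rho, tau) with rho, tau > 0."""
    z = 2.0 * np.sqrt(rho * tau)
    # kve(v, z) = kv(v, z) * exp(z); the exp(z) factor cancels in each ratio.
    k = kve(gamma, z)
    e_y = kve(gamma + 1.0, z) * np.sqrt(tau) / (k * np.sqrt(rho))
    e_inv_y = kve(gamma - 1.0, z) * np.sqrt(rho) / (k * np.sqrt(tau))
    return e_y, e_inv_y

# Usage: gamma = 0.5 has closed-form Bessel ratios, which makes a handy check.
e_y, e_inv_y = gig_expectations(0.5, 1.0, 1.0)
```

The formulas break down as $\tau \to 0$, where the GIG reduces to a gamma distribution and the gamma expectations should be used instead.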
![Page 9: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/9.jpg)
Variational Inference
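The likelihood term expands to:
$E_q[\log p(X \mid W, H, \theta)] = \sum_{m,n} \left( E_q\left[\dfrac{-X_{mn}}{\sum_l \theta_l W_{ml} H_{ln}}\right] - E_q\left[\log \sum_l \theta_l W_{ml} H_{ln}\right] \right)$
With Jensen’s inequality, for any $\phi_{lmn} \ge 0$ with $\sum_l \phi_{lmn} = 1$:
$E_q\left[\dfrac{-1}{\sum_l \theta_l W_{ml} H_{ln}}\right] \ge -\sum_l \phi_{lmn}^2\, E_q\left[\dfrac{1}{\theta_l W_{ml} H_{ln}}\right]$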
![Page 10: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/10.jpg)
Variational Inference
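With a first order Taylor approximation:
$-E_q\left[\log \sum_l \theta_l W_{ml} H_{ln}\right] \ge -\log \omega_{mn} + 1 - \dfrac{1}{\omega_{mn}} \sum_l E_q[\theta_l W_{ml} H_{ln}]$
$\omega_{mn}$: an arbitrary positive point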
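Both bounds on the likelihood term rest on elementary inequalities that can be checked numerically for fixed positive numbers $x_l$ standing in for $\theta_l W_{ml} H_{ln}$: $-1/\sum_l x_l \ge -\sum_l \phi_l^2 / x_l$ for any weights $\phi$ summing to one, and $-\log \sum_l x_l \ge -\log \omega + 1 - \sum_l x_l / \omega$ for any $\omega > 0$. The sketch below uses deterministic values rather than expectations under $q$; the function names and test values are hypothetical.

```python
import numpy as np

def jensen_bound(x, phi):
    """Lower bound on -1 / sum(x) for weights phi >= 0 with sum(phi) == 1."""
    return -np.sum(phi ** 2 / x)

def taylor_bound(x, omega):
    """Lower bound on -log(sum(x)) from linearizing log at omega > 0."""
    return -np.log(omega) + 1.0 - np.sum(x) / omega

x = np.array([0.5, 2.0, 1.5])
total = x.sum()

# Arbitrary auxiliary parameters give loose bounds.
loose_jensen = jensen_bound(x, np.full(3, 1.0 / 3.0))
loose_taylor = taylor_bound(x, 1.0)

# For fixed x, phi proportional to x and omega = sum(x) make both bounds exact.
tight_jensen = jensen_bound(x, x / total)
tight_taylor = taylor_bound(x, total)
```

In the deterministic case the tightest weights are proportional to $x_l$ itself, and the tightest expansion point is the sum being bounded.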
![Page 11: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/11.jpg)
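Variational Inference
Tightening the likelihood bound:
$\phi_{lmn} \propto E_q\left[\dfrac{1}{\theta_l W_{ml} H_{ln}}\right]^{-1}$, $\omega_{mn} = \sum_l E_q[\theta_l W_{ml} H_{ln}]$
Optimizing the variational distributions
For example, $q(W_{ml}) = \mathrm{GIG}(\gamma^{W}_{ml}, \rho^{W}_{ml}, \tau^{W}_{ml})$ with:
$\gamma^{W}_{ml} = a$, $\rho^{W}_{ml} = a + E_q[\theta_l] \sum_n \dfrac{E_q[H_{ln}]}{\omega_{mn}}$, $\tau^{W}_{ml} = E_q\left[\dfrac{1}{\theta_l}\right] \sum_n X_{mn}\, \phi_{lmn}^2\, E_q\left[\dfrac{1}{H_{ln}}\right]$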
![Page 12: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/12.jpg)
Evaluation
Compare GaP-NMF to two variations:
1. Finite Bayesian model
2. Finite non-Bayesian model
Itakura-Saito Nonnegative Matrix Factorization (IS-NMF): maximize the likelihood in the above formula
Compare with another two NMF algorithms:
EU-NMF: minimize the sum of the squared Euclidean distance
KL-NMF: minimize the generalized KL-divergence
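The three NMF variants differ only in the cost they minimize between the spectrogram $X$ and its reconstruction $Y = WH$. A minimal sketch of the three costs follows, assuming strictly positive entries so the logarithms and ratios are defined; the function names are hypothetical. Maximizing the exponential likelihood of $X$ with mean $Y$ is equivalent to minimizing the Itakura-Saito divergence, since the two differ by a term that does not depend on $Y$.

```python
import numpy as np

def euclidean_cost(X, Y):
    """EU-NMF objective: sum of squared differences."""
    return float(np.sum((X - Y) ** 2))

def kl_cost(X, Y):
    """KL-NMF objective: generalized Kullback-Leibler divergence."""
    return float(np.sum(X * np.log(X / Y) - X + Y))

def is_cost(X, Y):
    """IS-NMF objective: Itakura-Saito divergence."""
    R = X / Y
    return float(np.sum(R - np.log(R) - 1.0))

X = np.array([[2.0, 1.0], [4.0, 0.5]])
Y = np.array([[1.0, 1.0], [2.0, 1.0]])
costs = euclidean_cost(X, Y), kl_cost(X, Y), is_cost(X, Y)
```

The Itakura-Saito divergence depends only on the ratio `X / Y`, so it is unchanged when both arguments are rescaled, which suits power spectrograms whose energy spans several orders of magnitude. The other two costs are dominated by the high-energy bins.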
![Page 13: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/13.jpg)
Evaluation
1. Synthetic Data
![Page 14: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/14.jpg)
Evaluation
2. Marginal Likelihood & Bandwidth Expansion
![Page 15: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/15.jpg)
Evaluation
3. Blind Monophonic Source Separation
![Page 16: Bayesian Nonparametric Matrix Factorization for Recorded Music](https://reader035.fdocuments.in/reader035/viewer/2022062811/56816070550346895dcf976e/html5/thumbnails/16.jpg)
Conclusions
Related work
Bayesian nonparametric model GaP-NMF
Applicable to other types of audio