Variational Bayes

  • date post

    04-Jul-2015
  • Category

    Technology

description

Variational Bayesian inference using the R package VBmix

Transcript of Variational Bayes

1. Variational Bayes using the R package VBmix
   Matt Moores, Zoé van Havre
   Bayesian Research & Applications Group, Queensland University of Technology, Brisbane, Australia
   CRICOS provider no. 00213J
   Thursday October 11, 2012
   BRAG Oct. 11, Variational Bayes

2. Outline
   1 Variational Bayes
     • Introduction
     • univariate Gaussian
     • mixture of Gaussians
   2 VBmix

3. Exact Inference
   When the posterior distribution is analytically tractable, e.g. a Normal distribution with natural conjugate priors:

   $$p(\theta \mid Y) = p(\mu, \sigma^2 \mid Y) = p(\mu \mid \sigma^2, Y)\, p(\sigma^2 \mid Y) \tag{1}$$

   $$\sim \mathcal{N}\!\left(m', \frac{\sigma^2}{\nu'}\right) \mathrm{IG}(a', b') \tag{2}$$

   where

   $$\nu' = \nu_0 + n, \qquad m' = \frac{1}{\nu'}\left(\nu_0 m_0 + n\bar{y}\right), \qquad a' = a_0 + \frac{n}{2},$$

   $$b' = b_0 + \frac{1}{2}\left[\sum_{i=1}^{n} (y_i - \bar{y})^2 + \frac{\nu_0\, n\, (\bar{y} - m_0)^2}{\nu_0 + n}\right]$$

4. Approximate Inference
   • Stochastic approximation
     • Markov chain Monte Carlo
   • Analytic approximation
     • expectation propagation
     • Laplace approximation
     • variational Bayes

5. Variational Bayes
   • VB is derived from the calculus of variations (Euler, Lagrange, et al.)
     • integration and differentiation of functionals (functions of functions)
   • Kullback-Leibler (KL) divergence
     • measures the distance between our approximation q(θ) and the true posterior distribution p(θ|Y)

   $$\mathrm{KL}(q \,\|\, p) = -\int q(\theta) \ln \frac{p(\theta \mid Y)}{q(\theta)}\, d\theta \tag{3}$$

   Kullback & Leibler (1951) On Information and Sufficiency

6. Mean Field Variational Bayes
   If the posterior distribution is analytically intractable, approximate it using a distribution that is tractable, e.g. using mean field theory:

   $$q(\theta) = \prod_{m=1}^{M} q_m(\theta_m) \tag{4}$$

   then minimise the KL divergence using convex optimisation.

   Parisi (1988) Statistical Field Theory
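As a small illustration of equation (3), the sketch below evaluates the KL divergence between two univariate Gaussians in two ways: by numerical integration of the definition, and by the standard closed-form expression for this special case. The function names `kl_numeric` and `kl_closed` are hypothetical and not part of the slides or of VBmix. Integrating over ±10 standard deviations of q is an assumption that is adequate for this example only.

```r
# KL(q || p) = -integral of q(theta) * log(p(theta) / q(theta)) d theta,
# here with q = N(mq, sq^2) and p = N(mp, sp^2) (sq, sp are standard deviations).

# Numerical integration of the definition (hypothetical helper).
kl_numeric <- function(mq, sq, mp, sp) {
  integrand <- function(theta) {
    lq <- dnorm(theta, mean = mq, sd = sq, log = TRUE)
    lp <- dnorm(theta, mean = mp, sd = sp, log = TRUE)
    exp(lq) * (lq - lp)  # q * log(q / p), which equals -q * log(p / q)
  }
  # Assumption: the mass of q outside +/- 10 sd is negligible.
  integrate(integrand, lower = mq - 10 * sq, upper = mq + 10 * sq,
            rel.tol = 1e-8)$value
}

# Closed form for two univariate Gaussians (hypothetical helper).
kl_closed <- function(mq, sq, mp, sp) {
  log(sp / sq) + (sq^2 + (mq - mp)^2) / (2 * sp^2) - 0.5
}

kl_qp <- kl_closed(0, 1, 1, 2)  # q = N(0, 1), p = N(1, 4): about 0.443
kl_pq <- kl_closed(1, 2, 0, 1)  # roles swapped:            about 1.307
```

The two directions give different values, so the KL divergence is not symmetric. Variational Bayes minimises KL(q||p), with the approximation q as the first argument.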
7. VB for the univariate Gaussian distribution
   The exact posterior distribution is analytically tractable (see equation 1):

   $$p(\mu, \sigma^2 \mid Y) = p(\mu \mid \sigma^2, Y)\, p(\sigma^2 \mid Y)$$

   but for the purpose of illustration:

   $$q(\mu, \sigma^2) = q_\mu(\mu)\, q_{\sigma^2}(\sigma^2)$$

   $$q_\mu(\mu) \sim \mathcal{N}\!\left(\frac{\nu_0 m_0 + n\bar{y}}{\nu_0 + n},\; \frac{\mathrm{E}[\sigma^2]}{\nu_0 + n}\right)$$

   $$q_{\sigma^2}(\sigma^2) \sim \mathrm{IG}\!\left(a_0 + \frac{n+1}{2},\; b_0 + \frac{1}{2}\,\mathrm{E}_\mu\!\left[\sum_{i=1}^{n} (y_i - \mu)^2 + \nu_0 (\mu - m_0)^2\right]\right)$$

   This lends itself to estimation via a variant of the EM algorithm.

8. R code for VB

    while (LB - oldLB > 0.1) {
      # E-step
      Emu <- m_vb
      Etau <- a_vb / b_vb
      # M-step
      m_vb <- mean(y)
      n_vb <- n
      a_vb <- n / 2
      b_vb <- (sum((y - Emu)^2) + 1 / Etau) / 2
      # check convergence
      oldLB <- LB
      LB <- calcLowerBound(m_vb, n_vb, a_vb, b_vb)
    }

9. VB in action: iteration 0
   [Figure: both axes run from 0.0 to 2.0]

10. VB in action: iteration 1, bound is 100.6
   [Figure: both axes run from 0.0 to 2.0]

11. VB in action: iteration 2, bound is 100.2
   [Figure: both axes run from 0.0 to 2.0]

12. Gaussian Mixture Model
   Likelihood function:

   $$p(y \mid \lambda, \mu, \sigma^2) = \prod_{i=1}^{n} \sum_{j=1}^{k} \lambda_j \frac{1}{\sqrt{2\pi\sigma_j^2}} \exp\!\left(-\frac{(y_i - \mu_j)^2}{2\sigma_j^2}\right)$$

   where $\sum_{j=1}^{k} \lambda_j = 1$.

   Natural conjugate priors:

   $$p(\lambda) \sim \mathrm{Dirichlet}(\alpha)$$

   $$p(\mu_j \mid \sigma_j^2) \sim \mathcal{N}\!\left(m_j, \frac{\sigma_j^2}{\nu_j}\right)$$

   $$p(\sigma_j^2) \sim \mathrm{IG}(a_j, b_j)$$
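The mixture likelihood above is a product over observations of a sum over components, so its logarithm is a sum of log-sums. A minimal sketch of how it might be evaluated in R follows. The function name `gmm_loglik` and its arguments are hypothetical (they are not part of VBmix), and `sigma2` holds variances rather than standard deviations.

```r
# Log-likelihood of a univariate Gaussian mixture (hypothetical helper).
#   y      : numeric vector of n observations
#   lambda : k mixing weights, summing to 1
#   mu     : k component means
#   sigma2 : k component variances
gmm_loglik <- function(y, lambda, mu, sigma2) {
  stopifnot(length(lambda) == length(mu),
            length(mu) == length(sigma2),
            abs(sum(lambda) - 1) < 1e-8)
  # n x k matrix with entries lambda_j * N(y_i | mu_j, sigma2_j)
  dens <- sapply(seq_along(lambda), function(j) {
    lambda[j] * dnorm(y, mean = mu[j], sd = sqrt(sigma2[j]))
  })
  dens <- matrix(dens, nrow = length(y))  # keep the matrix shape when n = 1
  # sum over components inside the log, sum over observations outside
  sum(log(rowSums(dens)))
}

y_demo <- c(-1.2, 0.3, 2.5)
ll_demo <- gmm_loglik(y_demo, lambda = c(0.4, 0.6), mu = c(0, 2), sigma2 = c(1, 0.5))
```

For very small component densities a log-sum-exp formulation would be numerically safer. The direct form is used here because it mirrors the formula term by term.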
13. Exact Inference for GMM
   • Complexity of the posterior distribution is O(k^n)
     • computationally infeasible for more than a small handful of observations and mixture components
   • back of the envelope:
     • if k = 2 and n = 50, it would take approximately 15 min on an nVidia Tesla M2050 (1288 GFLOPs peak throughput)
     • if k = 2 and n = 100, it would take 31 billion years

   For EM, Gibbs sampling and Variational Bayes, we approximate the posterior by introducing a matrix Z of indicator variables, such that z_ij = 1 if y_i has the label j, and z_ij = 0 otherwise.

   Robert & Mengersen (2011) Exact Bayesian analysis of mixtures

14. Variational Bayes for GMM
   Mean field approximation:

   $$q(\theta) = q(Z)\, q(\lambda) \prod_{j=1}^{k} q(\mu_j \mid \sigma_j^2)\, q(\sigma_j^2)$$

   Variational E-step:

   $$q(Z) = \prod_{i=1}^{n} \prod_{j=1}^{k} r_{ij}^{z_{ij}}, \qquad r_{ij} = \frac{\rho_{ij}}{\sum_{x=1}^{k} \rho_{ix}}$$

   $$\log \rho_{ij} = \mathrm{E}[\log \lambda_j] - \frac{1}{2}\mathrm{E}[\log \sigma_j^2] - \frac{1}{2}\log 2\pi - \frac{1}{2}\,\mathrm{E}_{\mu_j, \sigma_j^2}\!\left[\frac{(y_i - \mu_j)^2}{\sigma_j^2}\right]$$

15. Variational Bayes for GMM, continued
   M-step:

   $$n_j = \sum_{i=1}^{n} r_{ij}, \qquad \bar{y}_j = \frac{1}{n_j}\sum_{i=1}^{n} r_{ij}\, y_i, \qquad s_j^2 = \frac{1}{n_j}\sum_{i=1}^{n} r_{ij}\, (y_i - \bar{y}_j)^2$$

   $$q(\sigma_j^2) \sim \mathrm{IG}\!\left(a_0 + \frac{n_j}{2},\; b_0 + \frac{n_j s_j^2}{2} + \frac{\nu_0\, n_j\, (\bar{y}_j - m_0)^2}{2(\nu_0 + n_j)}\right)$$

   $$q(\mu_j \mid \sigma_j^2) \sim \mathcal{N}\!\left(\frac{\nu_0 m_0 + n_j \bar{y}_j}{\nu_0 + n_j},\; \frac{\sigma_j^2}{\nu_0 + n_j}\right)$$

   $$q(\lambda_1, \ldots, \lambda_k) \sim \mathrm{Dirichlet}(\alpha_0 + n_1, \ldots, \alpha_0 + n_k)$$

16. VBmix
   An R package by Pierrick Bruneau
   • Variational Bayesian inference for mixtures of Gaussians
     • see §10.2 of Bishop (2006)
   • open source (GPL v3)
   • implemented in C using the GNU Scientific Library (GSL)
   • Windows binary unavailable on CRAN

   Christopher M. Bishop (2006) Pattern Recognition and Machine Learning

17. VBmix for Fisher's iris data

    install.packages("VBmix")  # requires GSL, Qt, fftw3
    library(VBmix)
    # 3 component mixture of multivariate Gaussians
    fit_vb <- varbayes(irisdata, ncomp=20)
    factor(ZtoLabels(fit_vb$model$resp))
    # ground truth
    irislabels
    # fit GMM using maximum likelihood, for comparison
    fit_em <- classicEM(irisdata, 4)
    fit_em$labels

18. Summary
   • VB is an analytic approximation to the posterior distribution
   • suited to standard models with natural conjugate priors
   • update equations derived using calculus of variations to minimise the KL divergence
   • algorithm resembles Expectation-Maximisation (EM)
     • can become stuck on suboptimal local maxima
   • tends to underestimate the uncertainty in the posterior
   • The R package VBmix provides fast, approximate inference for mixtures of multivariate Gaussians.

19. For Further Reading I
   • Christopher M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006.
   • John Ormerod & Matt Wand. Explaining Variational Approximations. The American Statistician, 64(2): 140–153, 2010.
   • Mike Jordan, Zoubin Ghahramani, Tommi Jaakkola & Lawrence Saul. An Introduction to Variational Methods for Graphical Models. Machine Learning, 37: 183–233, 1999.
   • Pierrick Bruneau, Marc Gelgon & Fabien Picarougne. Parsimonious reduction of Gaussian mixture models with a variational-Bayes approach. Pattern Recognition, 43(3): 850–858, 2010.

20. For Further Reading II
   • Clare McGrory & Mike Titterington. Variational approximations in Bayesian model selection for finite mixture distributions. Computational Statistics & Data Analysis, 51: 5352–5367, 2007.
   • Solomon Kullback & Richard Leibler. On Information and Sufficiency. The Annals of Mathematical Statistics, 22: 79–86, 1951.
   • Giorgio Parisi. Statistical Field Theory. Addison-Wesley, 1988.
   • Christian Robert & Kerrie Mengersen. Exact Bayesian analysis of mixtures. In Mengersen, Robert & Titterington (eds.), Mixtures: Estimation and Applications. John Wiley & Sons, 2011.