Page 1: Variational Bayes Model Selection for Mixture Distribution

Variational Bayes Model Selection for Mixture Distribution

Presented by Shihao Ji

Duke University Machine Learning Group

Jan. 20, 2006

Authors: Adrian Corduneanu & Christopher M. Bishop

Page 2: Variational Bayes Model Selection for Mixture Distribution

Outline

• Introduction – model selection

• Automatic Relevance Determination (ARD)

• Experimental Results

• Application to HMMs

Page 3: Variational Bayes Model Selection for Mixture Distribution

Introduction

• Cross validation

• Bayesian approaches

– MCMC and Laplace approximation

– (Traditional) variational method

– (Type II) variational method

$$p(D \mid M) = \int p(D \mid \theta, M)\, p(\theta \mid M)\, d\theta$$

$$\log p(D \mid M) \ge \mathcal{L}(Q)$$
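
The variational lower bound on this slide follows from a standard decomposition of the log evidence. The sketch below spells it out; the symbol θ for the model parameters and the notation Q(θ) for the approximating distribution are the usual variational conventions, assumed here rather than recovered from the slide.

```latex
\begin{aligned}
\log p(D \mid M)
  &= \int Q(\theta)\,\log\frac{p(D, \theta \mid M)}{Q(\theta)}\,d\theta
   + \int Q(\theta)\,\log\frac{Q(\theta)}{p(\theta \mid D, M)}\,d\theta \\
  &= \mathcal{L}(Q) + \mathrm{KL}\bigl(Q \,\|\, p(\theta \mid D, M)\bigr)
  \;\ge\; \mathcal{L}(Q)
\end{aligned}
```

The gap is the KL divergence between Q and the true posterior, so the bound is tight exactly when Q equals the posterior.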

Page 4: Variational Bayes Model Selection for Mixture Distribution

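Automatic Relevance Determination (ARD)

• relevance vector regression

Given a dataset $\{\mathbf{x}_n, t_n\}_{n=1}^{N}$, we assume $p(t \mid \mathbf{x})$ is Gaussian, $\mathcal{N}(t \mid y(\mathbf{x}), \sigma^2)$.

Likelihood:

$$p(\mathbf{t} \mid \mathbf{w}, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{-\frac{1}{2\sigma^2}\,\|\mathbf{t} - \boldsymbol{\Phi}\mathbf{w}\|^2\right\}$$

Prior:

$$p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{i=0}^{N} \mathcal{N}(w_i \mid 0, \alpha_i^{-1})$$

Posterior:

$$p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2) \propto p(\mathbf{t} \mid \mathbf{w}, \sigma^2)\, p(\mathbf{w} \mid \boldsymbol{\alpha})$$

Determination of hyperparameters (Type II ML):

$$p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2) = \int p(\mathbf{t} \mid \mathbf{w}, \sigma^2)\, p(\mathbf{w} \mid \boldsymbol{\alpha})\, d\mathbf{w}$$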
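
Because the likelihood and the prior on this slide are both Gaussian in the weights, the integral over w that defines the Type II ML objective has a closed form. The block below states the standard relevance vector regression result; the design matrix Φ and the diagonal matrix A are notation assumed for this sketch.

```latex
p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2)
  = \mathcal{N}\bigl(\mathbf{t} \mid \mathbf{0},\, \mathbf{C}\bigr),
\qquad
\mathbf{C} = \sigma^2 \mathbf{I}
  + \boldsymbol{\Phi}\,\mathbf{A}^{-1}\boldsymbol{\Phi}^{\mathsf{T}},
\qquad
\mathbf{A} = \operatorname{diag}(\alpha_0, \ldots, \alpha_N)
```

Maximizing this over the hyperparameters drives many α_i toward infinity, which pins the corresponding weights at zero and prunes those basis functions.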

Page 5: Variational Bayes Model Selection for Mixture Distribution

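Automatic Relevance Determination (ARD)

• mixture of Gaussians

Given an observed dataset $D = \{\mathbf{x}_n\}_{n=1}^{N}$, we assume each data point is drawn independently from a mixture-of-Gaussians density.

Likelihood:

$$p(\mathbf{x} \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Lambda}) = \sum_{i=1}^{M} \pi_i\, \mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}_i, \boldsymbol{\Lambda}_i)$$

$$p(D \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Lambda}) = \prod_{n=1}^{N} p(\mathbf{x}_n \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Lambda})$$

Prior:

$$p(\boldsymbol{\mu}_i) = \mathcal{N}(\boldsymbol{\mu}_i \mid \mathbf{0}, \beta\mathbf{I}), \qquad p(\boldsymbol{\Lambda}_i) = \mathcal{W}(\boldsymbol{\Lambda}_i \mid \nu, \mathbf{V})$$

Posterior (VB):

$$p(\boldsymbol{\mu}, \boldsymbol{\Lambda} \mid D, \boldsymbol{\pi})$$

Determination of mixing coefficients (Type II ML):

$$p(D \mid \boldsymbol{\pi}) = \iint p(D \mid \boldsymbol{\pi}, \boldsymbol{\mu}, \boldsymbol{\Lambda})\, p(\boldsymbol{\mu}, \boldsymbol{\Lambda})\, d\boldsymbol{\mu}\, d\boldsymbol{\Lambda}$$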
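
To make the mixture likelihood on this slide concrete, here is a minimal sketch for one-dimensional data. It assumes each component is parameterized by a mean and a variance (the slide places a Wishart prior on a matrix-valued parameter instead), and every function name and number is hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_pdf(x, weights, means, variances):
    """Mixture density: sum_i pi_i * N(x | mu_i, var_i)."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))

def log_likelihood(data, weights, means, variances):
    """log p(D | ...) = sum_n log p(x_n | ...), since points are independent."""
    return sum(math.log(mixture_pdf(x, weights, means, variances)) for x in data)

# Hypothetical toy values, not taken from the slides.
weights = [0.3, 0.7]
means = [-1.0, 2.0]
variances = [1.0, 0.5]
data = [-1.2, 0.1, 1.9, 2.3]
print(log_likelihood(data, weights, means, variances))
```

This evaluates the likelihood at fixed parameter values only; the marginal likelihood on the slide additionally integrates over the means and precisions, which is the step the variational approximation handles.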

Page 6: Variational Bayes Model Selection for Mixture Distribution

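Automatic Relevance Determination (ARD)

• model selection

Bayesian method: $p_m(D \mid \boldsymbol{\pi})$, $m \in \{1, 2, \ldots, M_{\max}\}$

Component elimination: component $i \in \{1, 2, \ldots, M_{\max}\}$ is removed if $\pi_i$ is effectively zero, i.e., $\pi_i < 10^{-4}$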
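
The component-elimination rule can be sketched in a few lines. The helper name is hypothetical, and renormalizing the surviving weights is an assumption of this sketch rather than something stated on the slide.

```python
def prune_components(weights, threshold=1e-4):
    """Drop components whose mixing coefficient is below `threshold`.

    Returns the indices of the surviving components and their weights,
    renormalized to sum to one (renormalization is an assumption here).
    """
    kept = [i for i, w in enumerate(weights) if w >= threshold]
    if not kept:
        raise ValueError("all components fell below the threshold")
    total = sum(weights[i] for i in kept)
    return kept, [weights[i] / total for i in kept]

# Hypothetical mixing coefficients after optimization (not from the slides).
weights = [0.52, 3e-6, 0.30, 0.17, 8e-5]
kept, new_weights = prune_components(weights)
print(kept)
print(new_weights)
```

A single run that starts from the maximum number of components and prunes in this way replaces a separate fit for every candidate model size.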

Page 7: Variational Bayes Model Selection for Mixture Distribution

Experimental Results

600 points drawn from a mixture of 5 Gaussians.

• Bayesian method vs. cross-validation

$p_m(D \mid \boldsymbol{\pi})$

Page 8: Variational Bayes Model Selection for Mixture Distribution

Experimental Results

• Component elimination

The model initially had 15 mixture components and was finally pruned down to 3.

Page 9: Variational Bayes Model Selection for Mixture Distribution

Experimental Results

Page 10: Variational Bayes Model Selection for Mixture Distribution

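Automatic Relevance Determination (ARD)

• hidden Markov model

Given an observed dataset $D = \{\mathbf{x}_n\}_{n=1}^{N}$, we assume each data sequence is generated independently from an HMM.

Likelihood:

$$p(\mathbf{x} \mid \boldsymbol{\pi}, A, \theta) = \sum_{s_1, \ldots, s_T} \pi_{s_1} \prod_{t=1}^{T-1} a_{s_t s_{t+1}} \prod_{t=1}^{T} p(x_t \mid \theta_{s_t})$$

$$p(D \mid \boldsymbol{\pi}, A, \theta) = \prod_{n=1}^{N} p(\mathbf{x}_n \mid \boldsymbol{\pi}, A, \theta)$$

Prior:

$$p(\theta_i) = \mathcal{NW}(\boldsymbol{\mu}_i, \boldsymbol{\Lambda}_i \mid \mathbf{m}, \beta, \ldots)$$

Posterior (VB):

$$p(\theta \mid D, \boldsymbol{\pi}, A)$$

Determination of $\boldsymbol{\pi}$ and $A$ (Type II ML):

$$p(D \mid \boldsymbol{\pi}, A) = \int p(D \mid \boldsymbol{\pi}, A, \theta)\, p(\theta)\, d\theta$$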
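
The HMM likelihood on this slide sums over every state path, which is exponential in the sequence length if done directly. The sketch below computes the same quantity with the forward recursion and checks it against the direct sum. It assumes discrete emissions stored in a table `B` purely for brevity (the slide's prior suggests Gaussian emissions), and all names and numbers are hypothetical.

```python
from itertools import product

def sequence_likelihood(obs, pi, A, B):
    """Forward recursion for p(x | pi, A, B) with discrete emissions.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability that state i emits symbol k
    """
    n_states = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    for x in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][x]
            for j in range(n_states)
        ]
    return sum(alpha)

def brute_force_likelihood(obs, pi, A, B):
    """Direct sum over every state path, term by term as in the formula."""
    total = 0.0
    for path in product(range(len(pi)), repeat=len(obs)):
        p = pi[path[0]] * B[path[0]][obs[0]]
        for t in range(1, len(obs)):
            p *= A[path[t - 1]][path[t]] * B[path[t]][obs[t]]
        total += p
    return total

# Hypothetical two-state, two-symbol HMM.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
B = [[0.9, 0.1], [0.3, 0.7]]
obs = [0, 1, 1, 0]
print(sequence_likelihood(obs, pi, A, B))
print(brute_force_likelihood(obs, pi, A, B))
```

The two printed values agree up to floating-point rounding; the forward version costs time linear in the sequence length instead of exponential.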

Page 11: Variational Bayes Model Selection for Mixture Distribution

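Automatic Relevance Determination (ARD)

• model selection

Bayesian method: $p_m(D \mid \boldsymbol{\pi}, A)$, $m \in \{1, 2, \ldots, M_{\max}\}$

Define $vf_i$ -- visiting frequency

$$vf_i = \sum_{n} \sum_{t} \gamma_t^{(n)}(i)$$

where

$$\gamma_t^{(n)}(i) = p(s_t^{(n)} = i \mid \mathbf{x}_n, \ldots)$$

State elimination: state $i \in \{1, 2, \ldots, M_{\max}\}$ is removed if $vf_i$ is negligible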
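
The visiting frequency can be sketched by accumulating per-time-step state posteriors from a forward-backward pass. This version assumes fixed point estimates of the HMM parameters and discrete emissions in a table `B`, and it uses unscaled recursions that are only suitable for short sequences; the names and numbers are hypothetical.

```python
def state_posteriors(obs, pi, A, B):
    """gamma[t][i] = p(s_t = i | obs), via unscaled forward-backward."""
    S, T = len(pi), len(obs)
    alpha = [[0.0] * S for _ in range(T)]
    beta = [[1.0] * S for _ in range(T)]
    for i in range(S):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(S):
            alpha[t][j] = sum(alpha[t - 1][i] * A[i][j] for i in range(S)) * B[j][obs[t]]
    for t in range(T - 2, -1, -1):
        for i in range(S):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(S))
    evidence = sum(alpha[T - 1])
    return [[alpha[t][i] * beta[t][i] / evidence for i in range(S)] for t in range(T)]

def visiting_frequency(sequences, pi, A, B):
    """vf[i] = sum over sequences n and time steps t of gamma_t^(n)(i)."""
    vf = [0.0] * len(pi)
    for obs in sequences:
        for gamma_t in state_posteriors(obs, pi, A, B):
            for i, g in enumerate(gamma_t):
                vf[i] += g
    return vf

# Hypothetical three-state HMM in which state 2 can never be entered.
pi = [0.5, 0.5, 0.0]
A = [[0.8, 0.2, 0.0], [0.3, 0.7, 0.0], [0.4, 0.4, 0.2]]
B = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
sequences = [[0, 0, 1, 1], [1, 0, 0]]
vf = visiting_frequency(sequences, pi, A, B)
print(vf)
```

In this toy model state 2 has zero initial and incoming transition probability, so its visiting frequency is zero and it would be the one eliminated; the entries of `vf` sum to the total number of time steps across all sequences.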

Page 12: Variational Bayes Model Selection for Mixture Distribution

Experimental Results (1)

Page 13: Variational Bayes Model Selection for Mixture Distribution

[Figure: marginal log-likelihood (vertical axis, -1600 to -1350) versus number of iterations (horizontal axis, 0 to 60)]

Page 14: Variational Bayes Model Selection for Mixture Distribution

Experimental Results (2)

Page 15: Variational Bayes Model Selection for Mixture Distribution

[Figure: marginal log-likelihood (vertical axis, -1680 to -1500) versus number of iterations (horizontal axis, 0 to 60)]

Page 16: Variational Bayes Model Selection for Mixture Distribution

Experimental Results (3)

Page 17: Variational Bayes Model Selection for Mixture Distribution

[Figure: marginal log-likelihood (vertical axis, -1820 to -1640) versus number of iterations (horizontal axis, 0 to 180)]

Page 18: Variational Bayes Model Selection for Mixture Distribution

Questions?