Lecture 13: Variational Inference
Scribes: Kaushal Panini, Niklas Smedemark-Margulies
Variational Inference

Idea: Approximate the posterior by maximizing a variational lower bound. For a model $p(y, z, \theta)$ with variational distribution $q(z, \theta; \lambda)$, define

$$\mathcal{L}(\lambda) = \mathbb{E}_{q(z, \theta; \lambda)}\left[ \log \frac{p(y, z, \theta)}{q(z, \theta; \lambda)} \right]$$

$$\mathcal{L}(\lambda) = \log p(y) + \mathbb{E}_{q(z, \theta; \lambda)}\left[ \log \frac{p(z, \theta \mid y)}{q(z, \theta; \lambda)} \right] = \log p(y) - \mathrm{KL}\big( q(z, \theta; \lambda) \,\big\|\, p(z, \theta \mid y) \big) \leq \log p(y)$$

Maximizing $\mathcal{L}(\lambda)$ is the same as minimizing the KL divergence.
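To make this identity concrete, here is a minimal numerical check on a toy conjugate model (the model, numbers, and variable names are illustrative assumptions, not from the lecture). For every choice of $q$, the ELBO equals $\log p(y)$ minus the KL divergence to the exact posterior:

```python
import numpy as np

# Toy model (assumed for illustration):
#   z ~ Norm(0, 1),  y | z ~ Norm(z, 1)
# Then p(z | y) = Norm(y/2, 1/2) and log p(y) = log Norm(y; 0, 2).
y = 1.3

def elbo(m, s2):
    """ELBO for q(z) = Norm(m, s2), in closed form."""
    e_loglik = -0.5 * np.log(2 * np.pi) - 0.5 * ((y - m) ** 2 + s2)
    e_logprior = -0.5 * np.log(2 * np.pi) - 0.5 * (m ** 2 + s2)
    entropy = 0.5 * np.log(2 * np.pi * np.e * s2)
    return e_loglik + e_logprior + entropy

def kl_to_posterior(m, s2):
    """KL( Norm(m, s2) || Norm(y/2, 1/2) ) between two Gaussians."""
    mp, s2p = y / 2, 0.5
    return 0.5 * (np.log(s2p / s2) + (s2 + (m - mp) ** 2) / s2p - 1)

log_py = -0.5 * np.log(2 * np.pi * 2) - y ** 2 / 4

for m, s2 in [(0.0, 1.0), (0.5, 0.3), (y / 2, 0.5)]:
    # L(lambda) = log p(y) - KL(q || posterior) holds for every q
    print(m, s2, elbo(m, s2), log_py - kl_to_posterior(m, s2))
```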
Variational Inference: Interpretation as Regularization

Equivalent interpretation: regularized maximum likelihood. Using $p(y, z, \theta) = p(y \mid z, \theta)\, p(z, \theta)$,

$$\mathcal{L}(\lambda) = \mathbb{E}_{q(z, \theta; \lambda)}\left[ \log \frac{p(y, z, \theta)}{q(z, \theta; \lambda)} \right]$$

$$= \mathbb{E}_{q(z, \theta; \lambda)}\left[ \log p(y \mid z, \theta) + \log \frac{p(z, \theta)}{q(z, \theta; \lambda)} \right]$$

$$= \mathbb{E}_{q(z, \theta; \lambda)}\big[ \log p(y \mid z, \theta) \big] - \mathrm{KL}\big( q(z, \theta; \lambda) \,\big\|\, p(z, \theta) \big)$$

The first term says "make the expected log-likelihood as large as possible"; the second says "make sure $q(z, \theta; \lambda)$ is similar to the prior."
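The same toy conjugate model used in the sketch above can verify this second decomposition: the expected log-likelihood minus the KL divergence to the prior gives the same ELBO value as before.

```python
import numpy as np

# Same assumed toy model: z ~ Norm(0, 1), y | z ~ Norm(z, 1).
y, m, s2 = 1.3, 0.5, 0.3

# Expected log-likelihood under q(z) = Norm(m, s2)
e_loglik = -0.5 * np.log(2 * np.pi) - 0.5 * ((y - m) ** 2 + s2)

# KL(q || prior) between Norm(m, s2) and the prior Norm(0, 1)
kl_prior = 0.5 * (s2 + m ** 2 - 1 - np.log(s2))

# "Regularized maximum likelihood" form of the ELBO; matches the value
# computed from the first decomposition for the same (m, s2)
print(e_loglik - kl_prior)
```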
Intuition: Minimizing KL divergences

Consider a model $p(y, x_1, x_2) = p(y \mid x_1, x_2)\, p(x_1, x_2)$ with a mean-field Gaussian approximation

$$q(x_1, x_2) := q(x_1)\, q(x_2), \qquad q(x_1) := \mathrm{Norm}(x_1; \mu_1, \sigma_1^2), \qquad q(x_2) := \mathrm{Norm}(x_2; \mu_2, \sigma_2^2)$$

$$\mathcal{L}(\lambda) = \mathbb{E}_{q(x_1, x_2)}\left[ \log \frac{p(y, x_1, x_2)}{q(x_1, x_2)} \right] = \log p(y) - \mathrm{KL}\big( q(x_1, x_2) \,\big\|\, p(x_1, x_2 \mid y) \big)$$

Intuition: minimizing this KL divergence under-approximates the variance of the posterior.
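This can be seen in closed form when the posterior is Gaussian: if $p$ has precision matrix $\Lambda$, the mean-field factors minimizing $\mathrm{KL}(q \| p)$ are $q_i = \mathrm{Norm}(\mu_i, \Lambda_{ii}^{-1})$, whose variance never exceeds the true marginal variance $(\Lambda^{-1})_{ii}$. A minimal sketch (the correlation value is an illustrative assumption):

```python
import numpy as np

# Correlated 2D Gaussian "posterior" p(x1, x2 | y) (illustrative choice)
rho = 0.9
Sigma = np.array([[1.0, rho], [rho, 1.0]])
Lambda = np.linalg.inv(Sigma)  # precision matrix

# For a Gaussian target, the mean-field solution minimizing KL(q || p)
# is known in closed form: q_i = Norm(mu_i, 1 / Lambda_ii).
var_q = 1.0 / np.diag(Lambda)   # variances used by q(x1) q(x2)
var_marginal = np.diag(Sigma)   # true marginal variances of p

print(var_q)         # [0.19, 0.19] -- much smaller than ...
print(var_marginal)  # [1.0, 1.0]   -- the true marginal variances
```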
Intuition: Minimizing KL divergences

Same model and mean-field family as above: $p(y, x_1, x_2) = p(y \mid x_1, x_2)\, p(x_1, x_2)$, with $q(x_1, x_2) := q(x_1)\, q(x_2)$, $q(x_1) := \mathrm{Norm}(x_1; \mu_1, \sigma_1^2)$, and $q(x_2) := \mathrm{Norm}(x_2; \mu_2, \sigma_2^2)$.

Compare $\mathrm{KL}\big( p(x_1, x_2 \mid y) \,\big\|\, q(x_1, x_2) \big)$ with the divergence we actually minimize,

$$\mathrm{KL}\big( q(x_1, x_2) \,\big\|\, p(x_1, x_2 \mid y) \big) = \int dx_1\, dx_2\; q(x_1, x_2) \log \frac{q(x_1, x_2)}{p(x_1, x_2 \mid y)}$$

$$\lim_{q \to 0} q \log \frac{q}{p} = 0, \qquad \lim_{p \to 0} q \log \frac{q}{p} = \infty$$

Intuition: $q(x_1, x_2) \to 0$ whenever $p(x_1, x_2 \mid y) \to 0$.
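One consequence of this zero-forcing behavior: when the posterior is multimodal, a single Gaussian $q$ fit by minimizing $\mathrm{KL}(q \| p)$ locks onto one mode rather than spreading over regions where $p \approx 0$. A minimal sketch using numerical quadrature (the bimodal target and the search grid are illustrative assumptions):

```python
import numpy as np

# Bimodal target density p (illustrative): mixture of two unit-variance
# Gaussians centered at -3 and +3.
xs = np.linspace(-10, 10, 4001)
dx = xs[1] - xs[0]
norm = lambda x, m, s2: np.exp(-(x - m) ** 2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)
p = 0.5 * norm(xs, -3, 1.0) + 0.5 * norm(xs, 3, 1.0)

def reverse_kl(m, s2):
    q = norm(xs, m, s2)
    mask = q > 1e-12                # q log(q/p) -> 0 as q -> 0
    return np.sum(q[mask] * np.log(q[mask] / p[mask])) * dx

# Grid search over Gaussian candidates q = Norm(m, s2)
grid = [(m, s2) for m in np.linspace(-5, 5, 101)
                for s2 in np.linspace(0.2, 20, 100)]
m_best, s2_best = min(grid, key=lambda ms: reverse_kl(*ms))
print(m_best, s2_best)  # sits on one mode (m near +/-3, variance near 1)
```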
Algorithm: Variational Expectation Maximization

Define: $q(z, \theta; \lambda) = q(z; \lambda_z)\, q(\theta; \lambda_\theta)$

Objective: $\mathcal{L}(\lambda_z, \lambda_\theta) = \mathbb{E}_{q(z, \theta; \lambda)}\left[ \log \frac{p(y, z, \theta)}{q(z, \theta; \lambda)} \right] \leq \log p(y)$

Repeat until $\mathcal{L}(\lambda_z, \lambda_\theta)$ converges (change smaller than some threshold):

1. Expectation step: $\lambda_z = \arg\max_{\lambda_z} \mathcal{L}(\lambda_z, \lambda_\theta)$ (analogous to the E step in EM)
2. Maximization step: $\lambda_\theta = \arg\max_{\lambda_\theta} \mathcal{L}(\lambda_z, \lambda_\theta)$ (updates a distribution $q(\theta; \lambda_\theta)$ instead of a point estimate of $\theta$)
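A minimal runnable sketch of this alternating loop, using an assumed linear-Gaussian toy model (not the lecture's mixture example) in which both steps have closed-form solutions:

```python
import numpy as np

# Assumed toy model for illustration:
#   theta ~ Norm(0, tau0^2),  z_n | theta ~ Norm(theta, 1),
#   y_n | z_n ~ Norm(z_n, 1)
rng = np.random.default_rng(0)
y = rng.normal(2.0, 1.5, size=50)
tau0_sq = 10.0

m_th, v_th = 0.0, tau0_sq              # q(theta) = Norm(m_th, v_th)
for step in range(100):
    # E step: q(z_n) = Norm(m_z[n], v_z) maximizes L over q(z);
    # combine the two unit precisions of p(z|theta) and p(y|z)
    v_z = 1.0 / (1.0 + 1.0)
    m_z = v_z * (m_th + y)
    # M step: q(theta) maximizes L over q(theta); note this is a full
    # distribution, not a point estimate as in ordinary EM
    v_th_new = 1.0 / (1.0 / tau0_sq + len(y))
    m_th_new = v_th_new * np.sum(m_z)
    converged = abs(m_th_new - m_th) < 1e-8
    m_th, v_th = m_th_new, v_th_new
    if converged:
        break

print(step, m_th, v_th)  # m_th converges near the sample mean of y
```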
Example: Gaussian Mixture (Simplified)

Generative model, for clusters $k = 1, \dots, K$ and data points $n = 1, \dots, N$:

$$\mu_k \sim \mathrm{Norm}(\mu_0, \Sigma_0)$$

$$z_n \sim \mathrm{Discrete}(1/K, \dots, 1/K)$$

$$y_n \mid z_n = k \sim \mathrm{Norm}(\mu_k, \Sigma)$$
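A short sketch of ancestral sampling from this generative model in the 1D case (the hyperparameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
K, N = 3, 500
mu0, sigma0 = 0.0, 5.0   # prior Norm(mu0, sigma0^2) on cluster means
sigma = 1.0              # fixed, known likelihood scale

mu = rng.normal(mu0, sigma0, size=K)   # mu_k ~ Norm(mu0, Sigma0)
z = rng.integers(0, K, size=N)         # z_n ~ Discrete(1/K, ..., 1/K)
y = rng.normal(mu[z], sigma)           # y_n | z_n = k ~ Norm(mu_k, Sigma)
print(mu, y[:5])
```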
Model Selection

Marginal likelihood ("evidence"):

$$\log p(y) = \log \int dz\, d\theta\; p(y, z, \theta)$$

[Figure: the lower bound $\mathcal{L} \leq \log p(y)$ plotted against the number of clusters $K$, peaking at $K = 2$.]

Intuition: We can avoid overfitting by keeping the model with the highest $\mathcal{L}$.
Variational Expectation Maximization: Updates

(From here on we write the model parameters in terms of natural parameters $\eta$, so the joint is $p(y, z, \eta)$ and the variational distribution is $q(z; \lambda_z)\, q(\eta; \lambda_\eta)$.)

$$\mathcal{L}(\lambda_z, \lambda_\eta) = \mathbb{E}_{q(z; \lambda_z)\, q(\eta; \lambda_\eta)}\left[ \log \frac{p(y, z, \eta)}{q(z; \lambda_z)\, q(\eta; \lambda_\eta)} \right]$$

$$= \mathbb{E}_{q(z; \lambda_z)\, q(\eta; \lambda_\eta)}\big[ \log p(y, z, \eta) \big] - \mathbb{E}_{q(z; \lambda_z)}\big[ \log q(z; \lambda_z) \big] - \mathbb{E}_{q(\eta; \lambda_\eta)}\big[ \log q(\eta; \lambda_\eta) \big]$$

The first term depends on both $\lambda_z$ and $\lambda_\eta$; the second depends only on $\lambda_z$; the third only on $\lambda_\eta$.

E step: $\lambda_z = \arg\max_{\lambda_z} \mathbb{E}_{q(z; \lambda_z)}\big[ \mathbb{E}_{q(\eta; \lambda_\eta)}[\log p(y, z, \eta)] - \log q(z; \lambda_z) \big]$, solved by $q(z; \lambda_z) \propto \exp\big( \mathbb{E}_{q(\eta; \lambda_\eta)}[\log p(y, z, \eta)] \big)$

M step: $\lambda_\eta = \arg\max_{\lambda_\eta} \mathbb{E}_{q(\eta; \lambda_\eta)}\big[ \mathbb{E}_{q(z; \lambda_z)}[\log p(y, z, \eta)] - \log q(\eta; \lambda_\eta) \big]$, solved by $q(\eta; \lambda_\eta) \propto \exp\big( \mathbb{E}_{q(z; \lambda_z)}[\log p(y, z, \eta)] \big)$
Intermezzo: Functional Derivatives

Idea: Compute the derivative of an integral with respect to a function. Setting the functional derivative of the ELBO with respect to $q(z)$ to zero,

$$0 = \frac{\delta}{\delta q(z)} \int dz\, d\eta\; q(z)\, q(\eta) \log \frac{p(y, z, \eta)}{q(z)\, q(\eta)}$$

Dropping the integral over $z$ and taking the derivative of the integrand:

$$= \mathbb{E}_{q(\eta)}\big[ \log p(y, z, \eta) - \log q(\eta) \big] - \log q(z) - 1$$

Solving for $\log q(z)$:

$$\log q(z) = \mathbb{E}_{q(\eta)}\big[ \log p(y, z, \eta) \big] + \text{const}$$

where the first term depends on $z$ and the constant ensures normalization.
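For completeness, the same stationarity condition can be derived with an explicit Lagrange multiplier $\kappa$ (a symbol introduced here, not in the slide) for the normalization constraint $\int dz\, q(z) = 1$, a step the slide leaves implicit:

```latex
0 = \frac{\delta}{\delta q(z)} \left[ \int dz\, d\eta\; q(z)\, q(\eta)
      \log \frac{p(y, z, \eta)}{q(z)\, q(\eta)}
      + \kappa \left( \int dz\, q(z) - 1 \right) \right]
  = \mathbb{E}_{q(\eta)}\big[ \log p(y, z, \eta) - \log q(\eta) \big]
      - \log q(z) - 1 + \kappa
```

Solving again gives $\log q(z) = \mathbb{E}_{q(\eta)}[\log p(y, z, \eta)] + \text{const}$, with $\kappa$ absorbed into the normalizing constant.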
Variational Expectation Maximization: Updates

$$\mathcal{L}(\lambda_z, \lambda_\eta) = \mathbb{E}_{q(z; \lambda_z)\, q(\eta; \lambda_\eta)}\left[ \log \frac{p(y, z, \eta)}{q(z; \lambda_z)\, q(\eta; \lambda_\eta)} \right] = \mathbb{E}_{q}\big[ \log p(y, z, \eta) \big] - \mathbb{E}_{q(z; \lambda_z)}\big[ \log q(z; \lambda_z) \big] - \mathbb{E}_{q(\eta; \lambda_\eta)}\big[ \log q(\eta; \lambda_\eta) \big]$$

where the first term depends on both $\lambda_z$ and $\lambda_\eta$, the second only on $\lambda_z$, and the third only on $\lambda_\eta$.

E step: $q(z; \lambda_z) \propto \exp\big( \mathbb{E}_{q(\eta; \lambda_\eta)}[\log p(y, z, \eta)] \big)$

M step: $q(\eta; \lambda_\eta) \propto \exp\big( \mathbb{E}_{q(z; \lambda_z)}[\log p(y, z, \eta)] \big)$
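As a sanity check of the E-step formula, the following sketch discretizes everything (an arbitrary assumed table for $\log p(y, z, \eta)$ and a fixed $q(\eta)$) and confirms that $q(z) \propto \exp(\mathbb{E}_{q(\eta)}[\log p(y, z, \eta)])$ attains a higher ELBO than random alternatives:

```python
import numpy as np

# z takes K values, eta takes M values; all table values are assumptions.
rng = np.random.default_rng(2)
K, M = 4, 5
log_p = rng.normal(size=(K, M))     # log p(y, z=k, eta=m)
q_eta = rng.dirichlet(np.ones(M))   # fixed q(eta)

def elbo_z(q_z):
    # Terms of L that involve q(z): E_q[log p(y, z, eta)] + entropy of q(z)
    return q_z @ (log_p @ q_eta) - q_z @ np.log(q_z)

# Optimal update from the slide: q(z) propto exp(E_{q(eta)}[log p(y, z, eta)])
m = log_p @ q_eta
q_star = np.exp(m - m.max())
q_star /= q_star.sum()

# Any other distribution over z attains a lower (or equal) ELBO
for _ in range(5):
    q_rand = rng.dirichlet(np.ones(K))
    assert elbo_z(q_star) >= elbo_z(q_rand) - 1e-12
print(q_star, elbo_z(q_star))
```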
Gaussian Mixture: Derivation of Updates

Idea: Exploit exponential families.

$$\log p(y, z, \eta) = \log p(y \mid z, \eta_y) + \log p(z \mid \eta_z) + \log p(\eta_y)$$

All of these are exponential family distributions:

$$\log p(y \mid z, \eta_y) = \sum_n \sum_k \Big( \eta_{y,k}^\top\, \mathbb{I}[z_n = k]\, t(y_n) - a(\eta_{y,k})\, \mathbb{I}[z_n = k] \Big) + \sum_n \log h(y_n)$$

$$\log p(z \mid \eta_z) = \sum_n \sum_k \eta_{z,k}\, \mathbb{I}[z_n = k]$$

$$\log p(\eta_y \mid \lambda^0) = \lambda_1^{0\,\top} \eta_y - \lambda_2^0\, a(\eta_y) + \log h(\lambda^0) = \lambda^{0\,\top} t(\eta_y) + \log h(\lambda^0)$$

where the conjugate prior is itself exponential family, with sufficient statistics $t(\eta_y) = (\eta_y, -a(\eta_y))$.
Gaussian Mixture: Derivation of Updates

E step: Collect all terms that depend on $z_n$:

$$\log q(z_n; \lambda_z) = \mathbb{E}_{q(\eta; \lambda_\eta)}\big[ \log p(y_n, z_n \mid \eta) \big] + \dots$$

$$= \sum_k \mathbb{E}_{q(\eta; \lambda_\eta)}\big[ \eta_{y,k}^\top \big]\, \mathbb{I}[z_n = k]\, t(y_n) - \mathbb{E}_{q(\eta; \lambda_\eta)}\big[ a(\eta_{y,k}) \big]\, \mathbb{I}[z_n = k] + \dots$$

We need the expected values $\mathbb{E}_q[\eta_{y,k}]$ and $\mathbb{E}_q[a(\eta_{y,k})]$.
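For the 1D Gaussian case with known likelihood variance $\sigma^2$ (an assumed concrete instantiation), these expectations have closed forms under a Gaussian $q(\mu_k)$, since $\eta_k = \mu_k / \sigma^2$ and $a(\eta_k) = \mu_k^2 / (2\sigma^2)$:

```python
import numpy as np

# Under q(mu_k) = Norm(m_k, s_k^2), with known sigma^2:
#   E[eta_k] = m_k / sigma^2,  E[a(eta_k)] = (m_k^2 + s_k^2) / (2 sigma^2)
sigma_sq = 1.0
m_k, s_k_sq = 0.7, 0.2     # illustrative variational parameters

E_eta = m_k / sigma_sq                       # E_q[eta_k]
E_a = (m_k ** 2 + s_k_sq) / (2 * sigma_sq)   # E_q[a(eta_k)], uses E[mu_k^2]

def expected_loglik(y_n):
    """E_q[log p(y_n | z_n = k, eta)], the term entering log q(z_n)."""
    base = -0.5 * y_n ** 2 / sigma_sq - 0.5 * np.log(2 * np.pi * sigma_sq)
    return E_eta * y_n - E_a + base

print(expected_loglik(1.0))
```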
Gaussian Mixture: Derivation of Updates

M step: Collect all terms that depend on $\eta$:

$$\log q(\eta; \lambda_\eta) = \mathbb{E}_{q(z; \lambda_z)}\big[ \log p(y, z \mid \eta) + \log p(\eta) \big] + \dots$$

$$= \sum_k \eta_{y,k}^\top \Big( \lambda_1^0 + \sum_n \mathbb{E}_{q(z; \lambda_z)}\big[ \mathbb{I}[z_n = k] \big]\, t(y_n) \Big) - a(\eta_{y,k}) \Big( \lambda_2^0 + \sum_n \mathbb{E}_{q(z; \lambda_z)}\big[ \mathbb{I}[z_n = k] \big] \Big) + \dots$$

We need the expected values $\mathbb{E}_{q(z; \lambda_z)}\big[ \mathbb{I}[z_n = k] \big]$ (the responsibilities).
Gaussian Mixture: Variational EM

Objective: variational evidence lower bound (ELBO)

$$\mathcal{L}(\lambda_z, \lambda_\eta) = \mathbb{E}_{q(z; \lambda_z)\, q(\eta; \lambda_\eta)}\left[ \log \frac{p(y, z, \eta)}{q(z; \lambda_z)\, q(\eta; \lambda_\eta)} \right]$$

Repeat until $\mathcal{L}(\lambda_z, \lambda_\eta)$ converges:

1. Expectation step: Update $q(z)$ (keeping $q(\eta)$ fixed):

$$\gamma_{nk} = \mathbb{E}_{q(z)}\big[ \mathbb{I}[z_n = k] \big] = \frac{ \exp\big( \mathbb{E}_{q(\eta; \lambda_\eta)}[\log p(y_n, z_n = k \mid \eta)] \big) }{ \sum_{k'} \exp\big( \mathbb{E}_{q(\eta; \lambda_\eta)}[\log p(y_n, z_n = k' \mid \eta)] \big) }$$

2. Maximization step: Update $q(\eta)$ (keeping $q(z)$ fixed):

$$\lambda_{k,1} = \lambda_1^0 + \sum_n \gamma_{nk}\, t(y_n), \qquad \lambda_{k,2} = \lambda_2^0 + \sum_n \gamma_{nk}$$
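Putting the two updates together, here is a minimal end-to-end sketch for the 1D case with known likelihood variance and uniform mixture weights (all hyperparameters, initializations, and data-generating settings are illustrative assumptions):

```python
import numpy as np

# Variational EM for the simplified mixture in 1D, assuming a known
# likelihood variance sigma^2, uniform weights, and a conjugate
# Norm(mu0, sigma0^2) prior on each cluster mean. Variational family:
# q(z) q(mu) = prod_n q(z_n) * prod_k Norm(mu_k; m[k], s2[k]).
rng = np.random.default_rng(3)
y = np.concatenate([rng.normal(-2, 1, 150), rng.normal(3, 1, 150)])
N, K = len(y), 2
mu0, sigma0_sq, sigma_sq = 0.0, 25.0, 1.0

m = rng.normal(0, 1, size=K)      # initial variational means
s2 = np.ones(K)                   # initial variational variances

for step in range(200):
    # E step: gamma[n, k] propto exp(E_q[log p(y_n | z_n = k, mu_k)]);
    # the uniform log-weights log(1/K) cancel in the normalization
    log_rho = -((y[:, None] - m[None, :]) ** 2 + s2[None, :]) / (2 * sigma_sq)
    log_rho -= log_rho.max(axis=1, keepdims=True)    # for stability
    gamma = np.exp(log_rho)
    gamma /= gamma.sum(axis=1, keepdims=True)

    # M step: conjugate update of q(mu_k), i.e. prior "counts" plus
    # responsibility-weighted sufficient statistics
    N_k = gamma.sum(axis=0)                          # sum_n gamma[n, k]
    s2_new = 1.0 / (1.0 / sigma0_sq + N_k / sigma_sq)
    m_new = s2_new * (mu0 / sigma0_sq + gamma.T @ y / sigma_sq)

    converged = np.max(np.abs(m_new - m)) < 1e-8
    m, s2 = m_new, s2_new
    if converged:
        break

print(step, m, s2)   # posterior means near the true cluster means (-2, 3)
```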