CS839: Probabilistic Graphical Models
Lecture 14: Approximate Inference Sampling Methods
Theo Rekatsinas
Approaches to inference
• Exact inference algorithms
  • The elimination algorithm
  • Message-passing algorithm (sum-product, belief propagation)
  • Junction tree algorithm
• Approximate inference techniques
  • Variational algorithms
    • Loopy belief propagation
    • Mean field approximation
  • Stochastic simulation / sampling methods
    • Markov chain Monte Carlo methods
How to represent a joint distribution?
• Closed-form representation
• Sample-based representation
  • Collect samples $x^{(m)} \sim P(x)$, $m = 1, \dots, M$
  • If we draw a lot of samples, the law of large numbers gives $E_P[f(x)] \approx \frac{1}{M} \sum_{m=1}^{M} f(x^{(m)})$
Monte Carlo Methods
• Draw random samples from the desired distribution
• Yield a stochastic representation of a complex distribution
  • Marginals and other expectations can be approximated using sample-based averages
  • $E_P[f(x)] \approx \frac{1}{M} \sum_{m=1}^{M} f(x^{(m)})$ over $M$ samples $x^{(m)} \sim P(x)$
• Asymptotically exact and easy to apply to arbitrary models
• Challenges:
  • How do we draw samples from a given distribution (not all distributions can be trivially sampled)?
  • How do we make better use of the samples (not all samples are useful, or equally useful; see an example later)?
  • How do we know we've sampled enough?
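The sample-based average above can be sketched in a few lines. This is a minimal illustration, not part of the lecture: the target (a standard normal), the test function f(x) = x^2, and the helper name `monte_carlo_expectation` are all assumptions chosen so that the true answer, E[x^2] = 1, is known.

```python
import random


def monte_carlo_expectation(f, sampler, num_samples, seed=0):
    """Approximate E_P[f(x)] by averaging f over samples drawn from P."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        total += f(sampler(rng))
    return total / num_samples


# Illustrative choice: P is a standard normal and f(x) = x^2, so E_P[f(x)] = 1.
estimate = monte_carlo_expectation(
    f=lambda x: x * x,
    sampler=lambda rng: rng.gauss(0.0, 1.0),
    num_samples=100_000,
)
print(estimate)
```

The estimate gets closer to 1 as `num_samples` grows, which is the "asymptotically exact" property; how many samples are "enough" depends on the variance of f under P.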
Monte Carlo Methods
• Direct sampling
  • We have seen it.
  • Very difficult to populate a high-dimensional state space
• Rejection sampling
  • Create samples as in direct sampling, but only count samples that are consistent with the given evidence.
• Likelihood weighting, ...
  • Sample variables and calculate the evidence weight. Only create samples that support the evidence.
• Markov chain Monte Carlo (MCMC)
  • Metropolis-Hastings
  • Gibbs
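Rejection sampling
• Suppose we wish to sample from the distribution Π(X) = Π'(X)/Z.
  • Π(X) is difficult to sample from, but Π'(X) is easy to evaluate
  • Sample from a simpler distribution Q(X)
• Rejection sampling: draw $x^* \sim Q(X)$ and accept it with probability $\Pi'(x^*) / (k\,Q(x^*))$, where the constant $k$ is chosen so that $k\,Q(x) \ge \Pi'(x)$ for all $x$
• Correctness: accepted samples have density proportional to $Q(x^*) \cdot \frac{\Pi'(x^*)}{k\,Q(x^*)} \propto \Pi'(x^*)$, which normalizes to $\Pi(x^*)$
• Pitfall: we gained a sample, but what did we pay?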
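A small sketch of rejection sampling may make the cost visible. Everything specific here is an assumption for illustration: the unnormalized target Π'(x) = x(1 - x) on [0, 1], the uniform proposal Q, and the envelope constant k = 0.25 (the maximum of Π' on that interval, so kQ(x) ≥ Π'(x) everywhere).

```python
import random


def unnormalized_target(x):
    """Pi'(x) = x(1 - x) on [0, 1]; its normalizer Z = 1/6 is never used."""
    return x * (1.0 - x)


def rejection_sample(num_proposals, k=0.25, seed=0):
    """Draw from Q = Uniform(0, 1) and keep x with probability Pi'(x) / (k Q(x))."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(num_proposals):
        x = rng.random()          # proposal draw, Q(x) = 1 on [0, 1]
        accept_prob = unnormalized_target(x) / (k * 1.0)
        if rng.random() < accept_prob:
            accepted.append(x)
    return accepted


num_proposals = 60_000
samples = rejection_sample(num_proposals)
acceptance_rate = len(samples) / num_proposals
sample_mean = sum(samples) / len(samples)
print(acceptance_rate, sample_mean)
```

In this example the acceptance rate is Z/k = (1/6)/0.25, about two thirds, so roughly one proposal in three is thrown away. With a looser envelope or a high-dimensional target the wasted fraction grows quickly, which is the price the pitfall refers to.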
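Unnormalized Importance Sampling
• Suppose sampling from P(·) is hard.
• Suppose we can sample from a "simpler" proposal distribution Q(·) instead.
• If Q dominates P (i.e., Q(x) > 0 whenever P(x) > 0), we can sample from Q and reweight:
  $E_P[f(X)] = \int f(x) \frac{P(x)}{Q(x)} Q(x)\,dx \approx \frac{1}{M} \sum_{m=1}^{M} f(x^{(m)})\, w^{(m)}$, where $x^{(m)} \sim Q$ and $w^{(m)} = P(x^{(m)}) / Q(x^{(m)})$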
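The reweighting step can be sketched as follows. The target P = N(0, 1), the wider proposal Q = N(0, 2^2), and f(x) = x^2 are illustrative assumptions, picked so that the true value E_P[f] = 1 is known and Q dominates P.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_estimate(f, num_samples, seed=0):
    """Estimate E_P[f] with samples from Q, weighting each by P(x) / Q(x)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.0, 2.0)                                   # x ~ Q
        weight = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)  # P(x) / Q(x)
        total += f(x) * weight
    return total / num_samples


estimate = importance_estimate(lambda x: x * x, num_samples=100_000)
print(estimate)
```

Note that this version needs the normalized density P(x) to compute the weights.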
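Normalized importance sampling
• Suppose we can only evaluate P’(x) = aP(x) for an unknown constant a
• Define the ratio $r(x) = P'(x) / Q(x)$, so that $E_Q[r(X)] = a$
• Then $E_P[f(X)] = \frac{E_Q[f(X)\, r(X)]}{E_Q[r(X)]} \approx \frac{\sum_m f(x^{(m)})\, r^{(m)}}{\sum_m r^{(m)}}$, where $x^{(m)} \sim Q$ and $r^{(m)} = r(x^{(m)})$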
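When only an unnormalized P’(x) is available, the weights can be normalized by their own sum. The sketch below assumes P’(x) = exp(-x^2/2) (so the hidden constant is a = sqrt(2π)), a proposal Q = N(0, 2^2), and f(x) = x^2; these choices and the function names are illustrative only.

```python
import math
import random


def unnormalized_target(x):
    """P'(x) = exp(-x^2 / 2): a standard normal density missing its constant."""
    return math.exp(-0.5 * x * x)


def proposal_pdf(x, std=2.0):
    """Density of the proposal Q = N(0, std^2)."""
    z = x / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normalized_importance_estimate(f, num_samples, seed=0):
    """Return (estimate of E_P[f], estimate of the constant a)."""
    rng = random.Random(seed)
    weighted_sum = 0.0
    ratio_sum = 0.0
    for _ in range(num_samples):
        x = rng.gauss(0.0, 2.0)                          # x ~ Q
        ratio = unnormalized_target(x) / proposal_pdf(x)  # r(x) = P'(x) / Q(x)
        weighted_sum += f(x) * ratio
        ratio_sum += ratio
    return weighted_sum / ratio_sum, ratio_sum / num_samples


estimate, constant_estimate = normalized_importance_estimate(lambda x: x * x, 100_000)
print(estimate, constant_estimate)
```

The average ratio estimates a, and dividing by the ratio sum removes it from the final estimate, so a never has to be known.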
Weighted resampling
• Problem of importance sampling: performance depends on how well Q matches P
  • If P(x)f(x) is strongly varying and has a significant proportion of its mass concentrated in a small region, the importance weights $r^{(m)}$ will be dominated by a few samples
• Solution: use a heavy-tailed Q and weighted resampling
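One common reading of "heavy-tailed Q plus weighted resampling" is sampling-importance-resampling: draw from Q, weight, then resample in proportion to the weights. The sketch below assumes a standard Cauchy proposal and the unnormalized target P'(x) = exp(-x^2/2); both are illustrative choices, not taken from the lecture.

```python
import math
import random


def unnormalized_target(x):
    """P'(x) = exp(-x^2 / 2), a standard normal without its normalizer."""
    return math.exp(-0.5 * x * x)


def cauchy_pdf(x):
    """Density of the standard Cauchy distribution (heavy tails)."""
    return 1.0 / (math.pi * (1.0 + x * x))


def sample_cauchy(rng):
    """Inverse-CDF draw from the standard Cauchy distribution."""
    return math.tan(math.pi * (rng.random() - 0.5))


def weighted_resample(num_proposals, num_resamples, seed=0):
    """Draw from Q, weight by P'(x) / Q(x), then resample proportionally to the weights."""
    rng = random.Random(seed)
    proposals = [sample_cauchy(rng) for _ in range(num_proposals)]
    weights = [unnormalized_target(x) / cauchy_pdf(x) for x in proposals]
    return rng.choices(proposals, weights=weights, k=num_resamples)


resampled = weighted_resample(num_proposals=50_000, num_resamples=20_000)
mean = sum(resampled) / len(resampled)
variance = sum((x - mean) ** 2 for x in resampled) / len(resampled)
print(mean, variance)
```

Because the Cauchy tails are heavier than the Gaussian tails, the weights stay bounded and no single proposal dominates; the resampled points are approximately unweighted draws from the target.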
Limitations of Monte Carlo
• Direct sampling
  • Hard to get rare events in high-dimensional spaces
  • Infeasible for MRFs unless we know the normalizer Z
• Rejection sampling, importance sampling
  • We need a good proposal Q(x) that is not very different from P(x)
• How about we use an adaptive proposal?
Markov Chain Monte Carlo
• MCMC algorithms feature adaptive proposals
  • Instead of Q(x’), use Q(x’|x), where x’ is the new state being sampled and x is the previous sample
  • As x changes, Q(x’|x) can also change
Metropolis-Hastings
• Draw a sample x’ from Q(x’|x), where x is the previous sample
• The new sample x’ is accepted or rejected with some probability A(x’|x)
  • Acceptance probability: $A(x'|x) = \min\left(1, \frac{P(x')\,Q(x|x')}{P(x)\,Q(x'|x)}\right)$
  • A(x’|x) is like a ratio of importance sampling weights
    • P(x’)/Q(x’|x) is the importance weight for x’, and P(x)/Q(x|x’) is the importance weight for x
    • We divide the importance weight for x’ by that of x
    • Notice that we only need to compute the ratio P(x’)/P(x) rather than P(x’) or P(x) separately
• A(x’|x) ensures that after sufficiently many draws, our samples come from the true distribution.
Example of MH
• Let Q(x’|x) be a Gaussian centered on x
• We are trying to sample from a bimodal P(x)
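The setup above (Gaussian proposal centered on the current state, bimodal target) can be sketched as a random-walk sampler. The specific target, an equal mixture of unit-variance bumps at -2 and +2 left unnormalized, as well as the proposal width and burn-in length, are assumptions for illustration. Because the Gaussian proposal is symmetric, Q(x|x’) = Q(x’|x) and the acceptance probability reduces to min(1, P(x’)/P(x)).

```python
import math
import random


def unnormalized_bimodal(x):
    """Unnormalized density with modes at -2 and +2 (illustrative target)."""
    return math.exp(-0.5 * (x + 2.0) ** 2) + math.exp(-0.5 * (x - 2.0) ** 2)


def metropolis_hastings(target, num_steps, proposal_std=2.5, x0=0.0,
                        burn_in=1000, seed=0):
    """Random-walk Metropolis-Hastings with a Gaussian proposal centered on x."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for step in range(num_steps):
        x_new = rng.gauss(x, proposal_std)        # draw x' ~ Q(x' | x)
        # Symmetric proposal: the Q terms cancel, leaving only P(x') / P(x).
        accept_prob = min(1.0, target(x_new) / target(x))
        if rng.random() < accept_prob:
            x = x_new                              # accept the move
        if step >= burn_in:
            samples.append(x)                      # a rejected move repeats x
    return samples


samples = metropolis_hastings(unnormalized_bimodal, num_steps=50_000)
mean = sum(samples) / len(samples)
fraction_right_mode = sum(1 for x in samples if x > 0.0) / len(samples)
print(mean, fraction_right_mode)
```

With a much narrower proposal the chain would rarely cross the low-density region between the modes, so the two modes would be visited unevenly for a long time.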
Some theoretical aspects of MCMC
• The MH algorithm has a burn-in period
  • Initial samples are not truly from P
• Why are the MH samples guaranteed to be from P(x)?
  • The proposal Q(x’|x) keeps changing with the value of x; how do we know the samples will eventually come from P(x)?
• Why Markov chain?
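Markov Chains
• A Markov chain is a sequence of random variables $x_1, x_2, \dots, x_N$ with the Markov property:
  $P(x_n \mid x_1, \dots, x_{n-1}) = P(x_n \mid x_{n-1})$
• The right-hand side is the transition kernel. The next state depends only on the preceding state
• Let’s assume the kernel is fixed with time.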
MC Concepts
• Stationary distributions are of great importance in MCMC. Some notions:
  • Irreducible: an MC is irreducible if you can get from any state x to any other state x’ with probability > 0 in a finite number of steps
  • Aperiodic: an MC is aperiodic if you can return to any state x at any time (returns are not confined to multiples of a fixed period)
  • Ergodic (or regular): an MC is ergodic if it is irreducible and aperiodic
• Ergodicity is important: it implies you can reach the stationary distribution no matter the initial distribution.
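The claim about ergodic chains can be checked numerically on a tiny example. The 3-state kernel below is an assumption made up for illustration; it is irreducible and has self-loops (hence aperiodic), and its stationary distribution is (0.25, 0.5, 0.25).

```python
def step_distribution(dist, kernel):
    """One step of the chain: new_dist[j] = sum_i dist[i] * kernel[i][j]."""
    n = len(dist)
    return [sum(dist[i] * kernel[i][j] for i in range(n)) for j in range(n)]


def distribution_after(initial, kernel, num_steps):
    """Push an initial distribution through the kernel num_steps times."""
    dist = list(initial)
    for _ in range(num_steps):
        dist = step_distribution(dist, kernel)
    return dist


# kernel[i][j] is the probability of moving from state i to state j.
kernel = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

from_first_state = distribution_after([1.0, 0.0, 0.0], kernel, num_steps=50)
from_last_state = distribution_after([0.0, 0.0, 1.0], kernel, num_steps=50)
print(from_first_state)
print(from_last_state)
```

Both starting points end up at the same distribution, which is what makes burn-in followed by sampling meaningful.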
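MC Concepts
• Reversible (detailed balance): an MC with transition kernel T(x’|x) is reversible if there exists a distribution π(x) such that the detailed balance condition holds:
  $\pi(x')\, T(x \mid x') = \pi(x)\, T(x' \mid x)$
• Reversible MCs always have a stationary distribution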
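The last bullet follows in one line from detailed balance. In the sketch below, T(x'|x) denotes the transition kernel and a distribution π is called stationary when one step of the chain leaves it unchanged; the notation is an assumption chosen for this derivation.

```latex
\begin{align*}
\sum_{x} \pi(x)\, T(x' \mid x)
  &= \sum_{x} \pi(x')\, T(x \mid x') && \text{(detailed balance)} \\
  &= \pi(x') \sum_{x} T(x \mid x')   && \text{($\pi(x')$ does not depend on $x$)} \\
  &= \pi(x')                          && \text{(the kernel sums to 1 over next states)}
\end{align*}
```

So any π that satisfies detailed balance is a stationary distribution of the chain.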
Why does MH work?
• We draw a sample x’ according to Q(x’|x) and then accept/reject according to A(x’|x). Hence the transition kernel (for x’ ≠ x) is:
  $T(x' \mid x) = Q(x' \mid x)\, A(x' \mid x)$
• We can prove that MH satisfies detailed balance.
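Why does MH work?
• Now suppose A(x’|x) < 1 and A(x|x’) = 1. We have
  $A(x' \mid x) = \frac{P(x')\, Q(x \mid x')}{P(x)\, Q(x' \mid x)}$
  $P(x)\, Q(x' \mid x)\, A(x' \mid x) = P(x')\, Q(x \mid x') = P(x')\, Q(x \mid x')\, A(x \mid x')$
  $P(x)\, T(x' \mid x) = P(x')\, T(x \mid x')$, where $T(x' \mid x) = Q(x' \mid x)\, A(x' \mid x)$ is the MH transition kernel
• This is the detailed balance condition:
  • The MH algorithm leads to a stationary distribution P(x)
  • We defined P(x) to be the true distribution of x
  • Thus, MH eventually converges to the true distribution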
Gibbs Sampling
• Gibbs sampling is an MCMC algorithm that samples each random variable of a graphical model, one at a time
• GS is fairly easy to derive for many graphical models
• GS has reasonable computation and memory requirements (because we sample one r.v. at a time)
Gibbs Sampling Algorithm
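The slide body did not survive extraction. As a hedged reconstruction, the standard systematic-scan form of Gibbs sampling for variables x_1, ..., x_n updates each variable in turn from its conditional given the current values of all the others; the indexing and scan order below are assumptions, not necessarily the lecture's notation.

```latex
\begin{align*}
&\text{Initialize } x^{(0)} = (x_1^{(0)}, \dots, x_n^{(0)}). \\
&\text{For } t = 0, 1, 2, \dots \text{ and for } i = 1, \dots, n: \\
&\qquad x_i^{(t+1)} \sim P\!\left(x_i \,\middle|\, x_1^{(t+1)}, \dots, x_{i-1}^{(t+1)},\; x_{i+1}^{(t)}, \dots, x_n^{(t)}\right)
\end{align*}
```

In a graphical model the conditional depends only on the Markov blanket of x_i, which is why each update is cheap.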
Gibbs Sampling Example
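The original example on this slide was not recoverable, so the following is a substitute sketch rather than the lecture's example. It assumes a zero-mean bivariate normal with unit variances and correlation rho, for which each conditional is itself normal: x given y is N(rho * y, 1 - rho^2), and symmetrically for y given x.

```python
import math
import random


def gibbs_bivariate_normal(rho, num_steps, burn_in=1000, seed=0):
    """Gibbs sampling for a zero-mean, unit-variance bivariate normal."""
    rng = random.Random(seed)
    cond_std = math.sqrt(1.0 - rho * rho)   # std of each conditional
    x, y = 0.0, 0.0
    samples = []
    for step in range(num_steps):
        x = rng.gauss(rho * y, cond_std)    # sample x from P(x | y)
        y = rng.gauss(rho * x, cond_std)    # sample y from P(y | x), using the new x
        if step >= burn_in:
            samples.append((x, y))
    return samples


samples = gibbs_bivariate_normal(rho=0.8, num_steps=100_000)
mean_x = sum(x for x, _ in samples) / len(samples)
mean_xy = sum(x * y for x, y in samples) / len(samples)
print(mean_x, mean_xy)
```

Every draw is accepted, but consecutive samples are correlated; the closer rho is to 1, the smaller each step and the slower the exploration.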
Parallel Gibbs Sampling
• Complete model copies
  • Run Gibbs independently on full copies of the same model
  • Fewer iterations per copy
  • More samples means more accurate marginals
[Figure labels: Data to materialize factor graph; Run sequential Gibbs; Variable Tally]
Parallel Gibbs Sampling
• Colored model
  • Compute a k-coloring of the factor graph
  • Sample all variables with the same color in parallel
  • Load balancing is a key challenge
[Figure labels: Variable Assignments; Data to materialize factor graph; Variable Tally; Coordinated Workers]
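The colored-model idea rests on the fact that variables with the same color share no factor, so they are conditionally independent given the rest and can be updated at the same time. The sketch below shows only the coloring step, using a simple greedy heuristic on a hypothetical 4-variable cycle; the adjacency format and function names are assumptions, and a real system would add the parallel sampling and load balancing on top.

```python
def greedy_coloring(neighbors):
    """Assign each variable the smallest color not used by an already-colored neighbor.

    neighbors maps a variable to the set of variables it shares a factor with.
    """
    colors = {}
    for variable in sorted(neighbors):
        used = {colors[n] for n in neighbors[variable] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[variable] = color
    return colors


def color_classes(colors):
    """Group variables by color; each group could be sampled in parallel."""
    classes = {}
    for variable, color in sorted(colors.items()):
        classes.setdefault(color, []).append(variable)
    return classes


# Hypothetical model: four variables on a cycle 0 - 1 - 2 - 3 - 0.
neighbors = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
colors = greedy_coloring(neighbors)
classes = color_classes(colors)
print(classes)
```

Greedy coloring does not guarantee the fewest colors, and the color classes can be very unequal in size, which is one source of the load-balancing challenge.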
Summary
• Sampling can be easy to implement, but we can get poor-quality samples
  • We need a good proposal distribution
• Markov chain Monte Carlo methods use adaptive proposals Q(x’|x) to sample from the true distribution P(x)
• Metropolis-Hastings allows you to specify any proposal Q(x’|x)
• Gibbs sampling sets the proposal Q(x’|x) to the conditional P(x’|x)
  • Acceptance rate is always 1, but this means slow exploration
• Burn-in is an art!