CS 839: Probabilistic Graphical Models

Lecture 14: Approximate Inference Sampling Methods

Theo Rekatsinas

Approaches to inference

• Exact inference algorithms
  • The elimination algorithm
  • Message-passing algorithm (sum-product, belief propagation)
  • Junction tree algorithm

• Approximate inference techniques
  • Variational algorithms
    • Loopy belief propagation
    • Mean field approximation
  • Stochastic simulation / sampling methods
    • Markov chain Monte Carlo methods

How to represent a joint distribution?

• Closed-form representation

• Sample-based representation
  • Collect samples $x^{(m)} \sim P(x)$, $m = 1, \dots, M$. If we draw many samples, the law of large numbers gives $E_P[f(x)] \approx \frac{1}{M} \sum_{m=1}^{M} f(x^{(m)})$.

Monte Carlo Methods

• Draw random samples from the desired distribution
• Yield a stochastic representation of a complex distribution
  • Marginals and other expectations can be approximated using sample-based averages
  • $E_P[f(x)] \approx \frac{1}{M} \sum_{m=1}^{M} f(x^{(m)})$
• Asymptotically exact and easy to apply to arbitrary models
• Challenges:
  • How do we draw samples from a given distribution? (Not all distributions can be trivially sampled.)
  • How do we make better use of the samples? (Not all samples are useful, or equally useful; see an example later.)
  • How do we know we have sampled enough?
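The sample-based average, and one common heuristic for judging whether enough samples have been drawn, can be sketched together. This is a minimal sketch under assumptions chosen here for illustration: the target is P = N(0, 1) and f(x) = x², so the true expectation is 1. The names `monte_carlo_estimate`, `estimate`, and `std_err` are hypothetical. The standard error is only meaningful when the samples are independent.

```python
import math
import random

def monte_carlo_estimate(f, sampler, n):
    """Return the sample average of f and its estimated standard error."""
    values = [f(sampler()) for _ in range(n)]
    mean = sum(values) / n
    # Unbiased sample variance of f(x); the standard error shrinks like 1/sqrt(n).
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)

rng = random.Random(0)
# Assumed toy problem: x ~ N(0, 1), f(x) = x^2, true expectation 1.
estimate, std_err = monte_carlo_estimate(
    lambda x: x * x, lambda: rng.gauss(0.0, 1.0), 20000
)
```

Halving the standard error requires roughly four times as many samples, which is one way to decide whether drawing more is worth the cost.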

Monte Carlo Methods

• Direct sampling
  • We have seen it.
  • Very difficult to populate a high-dimensional state space
• Rejection sampling
  • Create samples as in direct sampling, but keep only the samples that are consistent with the given evidence.
• Likelihood weighting, ...
  • Sample the non-evidence variables, fix the evidence variables to their observed values, and weight each sample by the likelihood of the evidence. Every sample created supports the evidence.
• Markov chain Monte Carlo (MCMC)
  • Metropolis-Hastings
  • Gibbs
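The difference between rejecting on evidence and weighting by evidence can be seen on a tiny example. This is a sketch on an assumed two-node network Rain → Wet with made-up probabilities (not from the lecture); the query is P(Rain | Wet = true), whose exact value under these numbers is 0.18 / 0.26 ≈ 0.692.

```python
import random

# Hypothetical network: Rain -> Wet, with illustrative probabilities.
P_RAIN = 0.2
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}

def rejection_estimate(n, rng):
    """Sample (rain, wet) jointly; keep only samples where wet is True."""
    kept = 0
    rain_count = 0
    for _ in range(n):
        rain = rng.random() < P_RAIN
        wet = rng.random() < P_WET_GIVEN_RAIN[rain]
        if wet:  # consistent with the evidence
            kept += 1
            rain_count += rain
    return rain_count / kept, kept

def likelihood_weighting_estimate(n, rng):
    """Sample rain only; fix wet=True and weight by P(wet=True | rain)."""
    total_weight = 0.0
    rain_weight = 0.0
    for _ in range(n):
        rain = rng.random() < P_RAIN
        weight = P_WET_GIVEN_RAIN[rain]
        total_weight += weight
        if rain:
            rain_weight += weight
    return rain_weight / total_weight

rs_estimate, rs_kept = rejection_estimate(50000, random.Random(0))
lw_estimate = likelihood_weighting_estimate(50000, random.Random(1))
```

With these numbers rejection sampling discards about three quarters of its draws, while likelihood weighting uses every one.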

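Rejection sampling

• Suppose we wish to sample from the distribution $\Pi(X) = \Pi'(X)/Z$.
  • Π(X) is difficult to sample from, but Π'(X) is easy to evaluate
  • Sample from a simpler distribution Q(X)
  • Rejection sampling: draw $x^* \sim Q(X)$ and accept it with probability $\Pi'(x^*) / (k\,Q(x^*))$, where the constant k satisfies $k\,Q(x) \ge \Pi'(x)$ for all x
• Correctness: accepted samples have density proportional to $Q(x) \cdot \Pi'(x) / (k\,Q(x)) = \Pi'(x)/k$, which normalizes to Π(x)
• Pitfall: we gained a sample, but what did we pay? Every rejected proposal is wasted work, and the overall acceptance rate is Z/k.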
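A minimal sketch of the accept/reject step, under assumptions chosen here for illustration: the unnormalized target is Π'(x) = x(1 − x) on [0, 1] (so Z = 1/6), the proposal Q is Uniform(0, 1), and the envelope constant is k = 0.25, the maximum of Π'. The function names are hypothetical.

```python
import random

def target_unnorm(x):
    """Assumed unnormalized target on [0, 1]: a Beta(2, 2) shape with Z = 1/6."""
    return x * (1.0 - x)

K = 0.25  # envelope constant: K * Q(x) >= target_unnorm(x), with Q(x) = 1 on [0, 1]

def rejection_sample(n, rng):
    """Draw n accepted samples; also report the fraction of proposals accepted."""
    samples = []
    proposals = 0
    while len(samples) < n:
        x = rng.random()  # x ~ Q = Uniform(0, 1)
        proposals += 1
        if rng.random() < target_unnorm(x) / (K * 1.0):
            samples.append(x)
    return samples, n / proposals

accepted, accept_rate = rejection_sample(20000, random.Random(2))
```

The observed acceptance rate should be close to Z/k = (1/6)/0.25 ≈ 0.67. A looser envelope (larger k) lowers it proportionally, which is the price the pitfall refers to.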

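Unnormalized Importance Sampling

• Suppose sampling from P(·) is hard.
• Suppose we can sample from a "simpler" proposal distribution Q(·) instead.
• If Q dominates P (i.e., Q(x) > 0 whenever P(x) > 0), we can sample from Q and reweight:
  $E_P[f(x)] = \int f(x)\,\frac{P(x)}{Q(x)}\,Q(x)\,dx \approx \frac{1}{M} \sum_{m=1}^{M} f(x^{(m)})\,w^{(m)}$, where $x^{(m)} \sim Q$ and $w^{(m)} = P(x^{(m)}) / Q(x^{(m)})$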
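The reweighting step can be sketched as follows. The setup is an assumption made for illustration: P = N(0, 1), Q = N(0, 2²) (wider than P, so it dominates P), and f(x) = x², whose true expectation under P is 1. All names are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def importance_estimate(f, n, rng, q_sd=2.0):
    """Estimate E_P[f] by sampling from Q and weighting each sample by P/Q."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, q_sd)  # x ~ Q
        w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, q_sd)  # w = P(x) / Q(x)
        total += f(x) * w
    return total / n

is_estimate = importance_estimate(lambda x: x * x, 50000, random.Random(1))
```

Note that this version needs the normalized density P(x), since the weights are used without renormalization.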

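Normalized importance sampling

• Suppose we can only evaluate P'(x) = aP(x), with the constant a unknown
• Define the ratio $r(x) = P'(x)/Q(x)$, so that $E_Q[r(x)] = a$
• Then $E_P[f(x)] = \frac{E_Q[f(x)\,r(x)]}{E_Q[r(x)]} \approx \frac{\sum_m f(x^{(m)})\,r^{(m)}}{\sum_m r^{(m)}}$, where $x^{(m)} \sim Q$ and $r^{(m)} = r(x^{(m)})$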
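A sketch of the self-normalized estimator, assuming for illustration that P'(x) = exp(−x²/2) (so P = N(0, 1) and the constant a = √(2π) is treated as unknown), Q = N(0, 2²), and f(x) = x². The names are hypothetical.

```python
import math
import random

def unnormalized_target(x):
    """Assumed P'(x) = a * P(x) with P = N(0, 1); the constant a is never used."""
    return math.exp(-0.5 * x * x)

def proposal_pdf(x, sd):
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def normalized_importance_estimate(f, n, rng, q_sd=2.0):
    """Return the self-normalized estimate of E_P[f] and the estimate of a."""
    weighted_sum = 0.0
    ratio_sum = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, q_sd)  # x ~ Q
        r = unnormalized_target(x) / proposal_pdf(x, q_sd)  # r = P'(x) / Q(x)
        weighted_sum += f(x) * r
        ratio_sum += r
    return weighted_sum / ratio_sum, ratio_sum / n

nis_estimate, a_hat = normalized_importance_estimate(
    lambda x: x * x, 50000, random.Random(3)
)
```

The average of the ratios estimates a as a by-product. The self-normalized estimator is biased for finite M, but the bias vanishes as M grows.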

Weighted resampling

• Problem of importance sampling: performance depends on how well Q matches P
  • If P(x)f(x) is strongly varying and has a significant proportion of its mass concentrated in a small region, the importance weights $r^{(m)}$ will be dominated by a few samples
• Solution: use a heavy-tailed Q and weighted resampling
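One way to sketch weighted resampling: draw from a heavy-tailed Q, compute normalized weights, then resample with replacement in proportion to those weights. The choices here are assumptions for illustration: Q is a standard Cauchy, the unnormalized target is exp(−x²/2), and the effective sample size 1/Σw² is used as a diagnostic for weight degeneracy. All names are hypothetical.

```python
import math
import random

def cauchy_sample(rng):
    """Inverse-CDF draw from a standard Cauchy (a heavy-tailed proposal)."""
    return math.tan(math.pi * (rng.random() - 0.5))

def cauchy_pdf(x):
    return 1.0 / (math.pi * (1.0 + x * x))

def unnormalized_target(x):
    return math.exp(-0.5 * x * x)

def weighted_resample(n_draws, n_keep, rng):
    xs = [cauchy_sample(rng) for _ in range(n_draws)]
    ratios = [unnormalized_target(x) / cauchy_pdf(x) for x in xs]
    total = sum(ratios)
    weights = [r / total for r in ratios]  # normalized weights, sum to 1
    # Effective sample size: n_draws if weights are equal, near 1 if one dominates.
    ess = 1.0 / sum(w * w for w in weights)
    resampled = rng.choices(xs, weights=weights, k=n_keep)
    return resampled, ess

resampled, ess = weighted_resample(20000, 5000, random.Random(5))
```

The resampled points are approximately unweighted draws from the target. A small effective sample size relative to the number of draws signals that a few weights dominate.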

Limitations of Monte Carlo

• Direct sampling
  • Hard to get rare events in high-dimensional spaces
  • Infeasible for MRFs unless we know the normalizer Z
• Rejection sampling, importance sampling
  • We need a good proposal Q(x) that is not very different from P(x)
• How about we use an adaptive proposal?

Markov Chain Monte Carlo

• MCMC algorithms feature adaptive proposals
  • Instead of Q(x'), use Q(x'|x), where x' is the new state being sampled and x is the previous sample
  • As x changes, Q(x'|x) can also change

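Metropolis-Hastings

• Draw a sample x' from Q(x'|x), where x is the previous sample
• The new sample x' is accepted or rejected with some probability A(x'|x)
  • Acceptance probability: $A(x' \mid x) = \min\left(1, \frac{P(x')\,Q(x \mid x')}{P(x)\,Q(x' \mid x)}\right)$
  • A(x'|x) is like a ratio of importance sampling weights
    • P(x')/Q(x'|x) is the importance weight for x', and P(x)/Q(x|x') is the importance weight for x
    • We divide the importance weight for x' by that of x
    • Notice that we only need to compute the ratio P(x')/P(x) rather than P(x') or P(x) separately, so any normalizing constant of P cancels
  • A(x'|x) ensures that, after sufficiently many draws, our samples come from the true distribution.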


Example of MH

• Let Q(x'|x) be a Gaussian centered on x
• We are trying to sample from a bimodal P(x)
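This setup can be sketched in a few lines. The specific target is an assumption for illustration: an unnormalized mixture of two unit-variance bumps at −2 and +2. Because the Gaussian proposal is symmetric, Q(x|x') = Q(x'|x) and the acceptance probability reduces to min(1, P(x')/P(x)). The step size, seed, and burn-in length are arbitrary choices, and the function names are hypothetical.

```python
import math
import random

def log_target(x):
    """Log of an assumed unnormalized bimodal density with modes at -2 and +2."""
    a = -0.5 * (x + 2.0) ** 2
    b = -0.5 * (x - 2.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def metropolis_hastings(n_steps, step_sd, x0, rng):
    x = x0
    samples = []
    n_accepted = 0
    for _ in range(n_steps):
        x_new = rng.gauss(x, step_sd)  # x' ~ Q(x'|x), Gaussian centered on x
        log_ratio = log_target(x_new) - log_target(x)
        # Accept with probability min(1, P(x') / P(x)); the proposal terms cancel.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = x_new
            n_accepted += 1
        samples.append(x)  # a rejected move repeats the current state
    return samples, n_accepted / n_steps

chain, accept_rate = metropolis_hastings(60000, 2.0, 0.0, random.Random(4))
kept = chain[5000:]  # discard an assumed burn-in of 5000 steps
```

With a much smaller step size the chain tends to stay in one mode for long stretches, which is one way the choice of proposal shows up in practice.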

Some theoretical aspects of MCMC

• The MH algorithm has a burn-in period
  • Initial samples are not truly from P
• Why are the MH samples guaranteed to be from P(x)?
  • The proposal Q(x'|x) keeps changing with the value of x; how do we know the samples will eventually come from P(x)?
• Why Markov chain?

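Markov Chains

• A Markov chain is a sequence of random variables $x_1, x_2, \dots, x_N$ with the Markov property
  $P(x_n \mid x_1, \dots, x_{n-1}) = P(x_n \mid x_{n-1})$
• The right-hand side is the transition kernel: the next state depends only on the preceding state
• Let's assume the kernel is fixed over time.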

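MC Concepts

• Stationary distributions are of great importance in MCMC. A distribution π is stationary for a kernel T if $\pi(x') = \sum_x \pi(x)\,T(x' \mid x)$. Some notions:
  • Irreducible: an MC is irreducible if you can get from any state x to any other state x' with probability > 0 in a finite number of steps
  • Aperiodic: an MC is aperiodic if you can return to any state x at any time, i.e., returns are not confined to multiples of some period greater than 1
  • Ergodic (or regular): an MC is ergodic if it is irreducible and aperiodic
• Ergodicity is important: it implies you can reach the stationary distribution no matter the initial distribution.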
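The claim about ergodic chains can be checked numerically on a small example. This sketch assumes a hypothetical 3-state kernel (rows are current states, columns are next states) that is irreducible and aperiodic; it pushes two different initial distributions through the kernel and compares the results.

```python
def step(dist, kernel):
    """One step of the chain: new_dist[j] = sum_i dist[i] * kernel[i][j]."""
    n = len(dist)
    return [sum(dist[i] * kernel[i][j] for i in range(n)) for j in range(n)]

def run(dist, kernel, n_steps):
    for _ in range(n_steps):
        dist = step(dist, kernel)
    return dist

# Assumed kernel: every state has a self-loop (aperiodic) and all states
# communicate (irreducible). Each row sums to 1.
KERNEL = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

from_first = run([1.0, 0.0, 0.0], KERNEL, 100)
from_last = run([0.0, 0.0, 1.0], KERNEL, 100)
```

Both runs end at the same distribution, (0.25, 0.5, 0.25), which is unchanged by one more application of `step`.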

MC Concepts

• Reversible (detailed balance): an MC is reversible if there exists a distribution π(x) such that the detailed balance condition holds
  $\pi(x')\,T(x \mid x') = \pi(x)\,T(x' \mid x)$
• Reversible MCs always have a stationary distribution, namely π
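The step from detailed balance to stationarity is short enough to write out. Assuming the detailed balance condition in the form π(x') T(x|x') = π(x) T(x'|x), with T(x'|x) denoting the probability of moving from x to x', summing both sides over x gives:

```latex
\begin{align*}
\sum_{x} \pi(x)\, T(x' \mid x)
  &= \sum_{x} \pi(x')\, T(x \mid x') && \text{(detailed balance, term by term)} \\
  &= \pi(x') \sum_{x} T(x \mid x')   && \text{($\pi(x')$ does not depend on $x$)} \\
  &= \pi(x')                         && \text{($T(\cdot \mid x')$ is a probability distribution)}
\end{align*}
```

The left-hand side is the distribution after one step of the chain started from π, so π is left unchanged by the kernel.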

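Why does MH work?

• We draw a sample x' according to Q(x'|x) and then accept or reject it according to A(x'|x). Hence the transition kernel (for x' ≠ x) is:
  $T(x' \mid x) = Q(x' \mid x)\,A(x' \mid x)$
• We can prove that MH satisfies detailed balance.
• The two acceptance ratios are reciprocals of each other, so at most one of them is below 1. Now suppose A(x'|x) < 1 and A(x|x') = 1. We have
  $A(x' \mid x) = \frac{P(x')\,Q(x \mid x')}{P(x)\,Q(x' \mid x)}$
  $P(x)\,Q(x' \mid x)\,A(x' \mid x) = P(x')\,Q(x \mid x')\,A(x \mid x')$
  $P(x)\,T(x' \mid x) = P(x')\,T(x \mid x')$
• The last line is the detailed balance condition:
  • The MH algorithm leads to a stationary distribution P(x)
  • We defined P(x) to be the true distribution of x
  • Thus, for an ergodic chain, MH eventually converges to the true distribution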
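The argument can also be checked numerically on a small discrete state space. This sketch builds the MH kernel from an assumed unnormalized target `P_UNNORM` and an assumed non-symmetric proposal matrix `Q` (both made up for illustration), using the acceptance probability min(1, P(x')Q(x|x') / (P(x)Q(x'|x))), and leaves the verification of detailed balance to simple comparisons.

```python
def mh_kernel(p, q):
    """Build T[x][y] = Q(y|x) * A(y|x) for y != x; rejected mass stays at x."""
    n = len(p)
    t = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if y != x and q[x][y] > 0.0:
                accept = min(1.0, (p[y] * q[y][x]) / (p[x] * q[x][y]))
                t[x][y] = q[x][y] * accept
        # Probability of staying put: proposing x itself, or being rejected.
        t[x][x] = 1.0 - sum(t[x])
    return t

P_UNNORM = [1.0, 2.0, 3.0, 4.0]  # assumed unnormalized target over 4 states
Q = [                            # assumed proposal; Q[x][y] = Q(y|x), rows sum to 1
    [0.10, 0.60, 0.20, 0.10],
    [0.30, 0.10, 0.30, 0.30],
    [0.25, 0.25, 0.25, 0.25],
    [0.40, 0.30, 0.20, 0.10],
]

T = mh_kernel(P_UNNORM, Q)
# Largest violation of P(x) T(x'|x) = P(x') T(x|x') over all pairs of states.
max_violation = max(
    abs(P_UNNORM[x] * T[x][y] - P_UNNORM[y] * T[y][x])
    for x in range(4) for y in range(4)
)
```

Here `max_violation` is zero up to floating-point error even though `Q` is not symmetric and `P_UNNORM` is not normalized.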

Gibbs Sampling

• Gibbs sampling is an MCMC algorithm that samples each random variable of a graphical model, one at a time
• GS is fairly easy to derive for many graphical models
• GS has reasonable computation and memory requirements (because we sample one r.v. at a time)

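Gibbs Sampling Algorithm

• Let $x_1, \dots, x_n$ be the variables of the model; initialize them to starting values $x_1^{(0)}, \dots, x_n^{(0)}$
• For each iteration t = 1, 2, ..., visit each variable i in turn and sample $x_i^{(t)} \sim P(x_i \mid x_1^{(t)}, \dots, x_{i-1}^{(t)}, x_{i+1}^{(t-1)}, \dots, x_n^{(t-1)})$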

Gibbs Sampling Example
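A small illustration of one-variable-at-a-time sampling, using a model chosen here as an assumption rather than taken from the lecture: a standard bivariate Gaussian with correlation ρ, for which each conditional is itself Gaussian, x | y ~ N(ρy, 1 − ρ²) and y | x ~ N(ρx, 1 − ρ²). The function names and the burn-in length are hypothetical.

```python
import math
import random

def gibbs_bivariate_normal(n_samples, rho, rng, burn_in=1000):
    """Alternate draws from P(x | y) and P(y | x) for a standard bivariate normal."""
    x, y = 0.0, 0.0
    cond_sd = math.sqrt(1.0 - rho * rho)
    samples = []
    for t in range(n_samples + burn_in):
        x = rng.gauss(rho * y, cond_sd)  # x ~ P(x | y)
        y = rng.gauss(rho * x, cond_sd)  # y ~ P(y | x), using the fresh x
        if t >= burn_in:
            samples.append((x, y))
    return samples

def correlation(samples):
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    cov = sum((x - mx) * (y - my) for x, y in samples) / n
    vx = sum((x - mx) ** 2 for x, _ in samples) / n
    vy = sum((y - my) ** 2 for _, y in samples) / n
    return cov / math.sqrt(vx * vy)

draws = gibbs_bivariate_normal(30000, 0.8, random.Random(6))
sample_corr = correlation(draws)
```

Every draw is accepted. As ρ approaches 1 the conditionals become very narrow and successive samples move only a little, which illustrates how Gibbs can explore slowly.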

Parallel Gibbs Sampling

Complete Model Copies

• Run Gibbs independently on full copies of the same model
• Fewer iterations per copy
• More samples mean more accurate marginals

[Figure: data to materialize the factor graph; sequential Gibbs run on each model copy; variable tally]
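The model-copies strategy can be sketched as independent chains whose tallies are pooled at the end. The model is an assumption for illustration: two binary variables with the unnormalized joint table `WEIGHT`, for which P(a = 1) = 8/12. Threads are used only to keep the sketch self-contained; for CPU-bound chains in Python, separate processes or machines would be the more realistic choice. All names are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Assumed unnormalized joint over two binary variables (a, b).
WEIGHT = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 2.0, (1, 1): 6.0}

def run_chain(seed, n_iter=5000, burn_in=500):
    """Sequential Gibbs on one full copy of the model; returns the tally for a=1."""
    rng = random.Random(seed)  # each copy has its own independent random stream
    a, b = 0, 0
    tally_a = 0
    for t in range(n_iter + burn_in):
        p_a1 = WEIGHT[(1, b)] / (WEIGHT[(0, b)] + WEIGHT[(1, b)])
        a = 1 if rng.random() < p_a1 else 0  # a ~ P(a | b)
        p_b1 = WEIGHT[(a, 1)] / (WEIGHT[(a, 0)] + WEIGHT[(a, 1)])
        b = 1 if rng.random() < p_b1 else 0  # b ~ P(b | a)
        if t >= burn_in:
            tally_a += a
    return tally_a, n_iter

def pooled_marginal(seeds):
    """Run one chain per seed in parallel and pool the tallies."""
    with ThreadPoolExecutor(max_workers=len(seeds)) as pool:
        results = list(pool.map(run_chain, seeds))
    ones = sum(count for count, _ in results)
    total = sum(n for _, n in results)
    return ones / total

marginal_a = pooled_marginal([0, 1, 2, 3])
```

Each copy pays its own burn-in, so the saving comes from needing fewer post-burn-in iterations per copy.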

Parallel Gibbs Sampling

Colored Model

• Compute a k-coloring of the factor graph
• Sample all variables with the same color in parallel
• Load balancing is a key challenge

[Figure: data to materialize the factor graph; coordinated workers; variable assignments; variable tally]
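The coloring step can be sketched with a greedy heuristic: two variables conflict if they appear in the same factor, and variables with the same color share no factor, so they are conditionally independent given the rest and can be sampled together. This sketch covers only the coloring and grouping, not the sampling or load balancing. Greedy coloring is an assumption made here, not necessarily the method behind the slide, and it does not guarantee the minimum number of colors. The 2×3 grid of pairwise factors is a made-up example.

```python
from collections import defaultdict

def color_variables(factors):
    """Greedily color variables so that no two variables in a factor share a color.

    `factors` is a list of scopes, each a tuple of variable identifiers.
    """
    neighbors = defaultdict(set)
    for scope in factors:
        for v in scope:
            neighbors[v].update(u for u in scope if u != v)
    colors = {}
    for v in sorted(neighbors):
        used = {colors[u] for u in neighbors[v] if u in colors}
        c = 0
        while c in used:  # smallest color not taken by an already-colored neighbor
            c += 1
        colors[v] = c
    groups = defaultdict(list)
    for v, c in colors.items():
        groups[c].append(v)  # each group can be sampled in parallel
    return colors, dict(groups)

# Assumed example: pairwise factors on a 2x3 grid of variables (row, col).
grid_factors = (
    [((r, c), (r, c + 1)) for r in range(2) for c in range(2)]
    + [((0, c), (1, c)) for c in range(3)]
)
colors, groups = color_variables(grid_factors)
```

On this grid the result is a checkerboard with two groups of three variables each. Groups of unequal size are one source of the load-balancing problem.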

Summary

• Sampling can be easy to implement, but we can get poor-quality samples
  • We need a good proposal distribution
• Markov chain Monte Carlo methods use adaptive proposals Q(x'|x) to sample from the true distribution P(x)
• Metropolis-Hastings allows you to specify any proposal Q(x'|x)
• Gibbs sampling sets the proposal Q(x'|x) to the conditional of one variable given all the others, $P(x_i' \mid x_{-i})$
  • The acceptance rate is always 1, but changing one variable at a time can mean slow exploration
• Burn-in is an art!