Markov Chain Monte Carlo modelsmcvean/DTC/STAT/Lectures/Tues_wk2/...1/30/12 1 1/29/12 Markov Chain...

1/30/12


1/29/12

Markov Chain Monte Carlo

Zamin Iqbal

Models and methods

•  We want to be able to study biological systems (populations, inheritance, disease susceptibility, genome evolution, selection).

•  In this endeavour, we make models.

•  We have to choose which parameters/concepts to include and how to build the model.

•  Typically there is a trade-off between “model realism” and computational/statistical applicability.

•  MCMC is a method that allows you to explore more realistic models that have no simple analytical solutions.

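Bayesian Inference

•  In Bayesian statistics, we want to learn about the probability distribution of the parameter of interest given the data: the posterior.

•  In simple cases we can derive an analytical expression for the posterior:

$$P(\theta \mid D) = \frac{P(\theta)\,P(D \mid \theta)}{P(D)}$$

Here $P(\theta \mid D)$ is the posterior, $P(\theta)$ the prior, $P(D \mid \theta)$ the likelihood and $P(D)$ the normalising constant.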

Bayes Cartoon

Belief before = $P(\theta)$ (prior)

Likelihood = $P(D \mid \theta)$

Belief after = $P(\theta \mid D)$ (posterior)


Monte Carlo (no Markov Chains yet)

•  In many situations the normalising constant cannot be calculated analytically:

$$P(D) = \int P(\theta)\,P(D \mid \theta)\,d\theta$$

•  This is typically true if you have multiple parameters, multidimensional parameters, complex model structures or complex likelihood functions (i.e. most of modern statistics).

•  Monte Carlo methods allow you to sample from (and therefore estimate functions of) the posterior (see next slide).

Sampling to estimate an integral

•  You have already met the idea that sampling can be used to estimate an expectation (= integral). If we have a set of iid random variables $X_1, \ldots, X_n$ with mean $\mu$, then

$$\frac{1}{n}\sum_{i=1}^{n} X_i \to \mu$$

•  The mean of the sample converges to the mean of the $X_i$.

Sampling to estimate an integral

•  This is also true for more general expectations:

$$\frac{1}{n}\sum_{i=1}^{n} g(X_i) \to \int g(x)\,f_X(x)\,dx$$

•  So if we are interested in some expectation of the posterior, we could use an iid sequence to approximate it.
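As a small illustration of estimating an expectation by sampling, the sketch below averages g over iid draws. The specific choices are assumptions made for this example and are not from the slides: X is Exp(1), g(x) = x^2 (so the exact expectation is 2), and the helper name `mc_expectation` is hypothetical.

```python
import random

def mc_expectation(g, draw, n, seed=0):
    """Estimate E[g(X)] as the average of g over n iid draws of X."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += g(draw(rng))
    return total / n

# Hypothetical example: X ~ Exp(1) and g(x) = x^2, for which E[g(X)] = 2 exactly.
estimate = mc_expectation(lambda x: x * x,
                          lambda rng: rng.expovariate(1.0),
                          200_000)
print(estimate)  # should be close to 2, with Monte Carlo error of order 1/sqrt(n)
```

The error of such an estimate shrinks like 1/sqrt(n), whatever the dimension of the integral, which is the main appeal of the approach.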

Markov Chain Monte Carlo

•  Great idea: we don’t need our random variables to be iid. Any Markov chain whose stationary distribution is the posterior will do.

•  Used by Metropolis and Ulam as part of the Manhattan project to solve the problem of initiating fusion in a bomb.

•  Brought to prominence in statistics by Gelfand and Smith in 1990.


Markov Chains

•  A Markov Chain is a stochastic process that generates random variables $X_i$ such that

$$P(X_i \mid X_1, X_2, \ldots, X_{i-1}) = P(X_i \mid X_{i-1})$$

i.e. the distribution of the next variable depends only on the current one.

•  We talk about transition probabilities:

$$p_{ij} = \Pr(X_{n+1} = j \mid X_n = i)$$

•  Note: the $X_i$ are typically highly correlated, so each sample is not an independent draw from the posterior. (Thinning of the chain can lead to effectively independent samples.)
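To make the transition probabilities concrete, here is a minimal sketch that simulates a finite-state Markov chain from a transition matrix. The two-state matrix and the name `simulate_chain` are assumptions chosen for illustration; for this matrix the stationary distribution is (0.75, 0.25), which follows from solving 0.1 × π0 = 0.3 × π1.

```python
import random

def simulate_chain(P, start, n_steps, seed=0):
    """Simulate a finite-state Markov chain; P[i][j] = Pr(X_{n+1} = j | X_n = i)."""
    rng = random.Random(seed)
    states = list(range(len(P)))
    x = start
    path = [x]
    for _ in range(n_steps):
        # The next state is drawn using only the current state x.
        x = rng.choices(states, weights=P[x])[0]
        path.append(x)
    return path

# Hypothetical two-state chain with p01 = 0.1 and p10 = 0.3.
P = [[0.9, 0.1],
     [0.3, 0.7]]
path = simulate_chain(P, start=0, n_steps=200_000)
print(path.count(0) / len(path))  # long-run fraction of time spent in state 0
```

Successive states are strongly correlated here (the chain tends to stay where it is), yet the long-run fraction of time in each state still approaches the stationary distribution.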

Notation

•  I shall refer to the posterior distribution as the target distribution.

•  I’ll call the transition probabilities in the Markov chain the proposal distribution, or the transition kernel, $q(X \mid Y)$.

•  When talking about multiple parameters I talk about the joint posterior

$$\pi(\theta_1, \theta_2, \ldots, \theta_r)$$

and the conditional distributions, for example

$$\pi(\theta_2 \mid \theta_1, \theta_3, \ldots, \theta_r)$$

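The Metropolis Algorithm

•  How can we construct a Markov chain that converges to the posterior we want?

•  Suppose we are in state X, and we want to move. Draw a new state Y from the proposal distribution q(Y|X), where q can be anything you like, provided it is symmetric, i.e. q(Y|X) = q(X|Y).

•  Accept this proposal with probability $\alpha$ given by

$$\alpha = \min\left(1, \frac{\pi(Y)}{\pi(X)}\right) = \min\left(1, \frac{P(Y)\,L(Y)}{P(X)\,L(X)}\right)$$

where $\pi$ is the target distribution, $P$ the prior and $L$ the likelihood. If the proposal is rejected, the chain stays at X.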
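The Metropolis step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the lecturer's code: the target is a standard normal known only up to a constant, the proposal is a normal centred on the current state, and the name `metropolis` is hypothetical. Working with log densities avoids numerical underflow.

```python
import math
import random

def metropolis(log_target, x0, proposal_sd, n_steps, seed=0):
    """Random-walk Metropolis with a symmetric normal proposal.

    log_target returns the log of the unnormalised target density pi.
    x0 is assumed to lie in the support of pi.
    """
    rng = random.Random(seed)
    x, log_px = x0, log_target(x0)
    samples, accepted = [], 0
    for _ in range(n_steps):
        y = rng.gauss(x, proposal_sd)        # symmetric: q(Y|X) = q(X|Y)
        log_py = log_target(y)
        # Accept with probability alpha = min(1, pi(Y) / pi(X)).
        alpha = math.exp(min(0.0, log_py - log_px))
        if rng.random() < alpha:
            x, log_px = y, log_py
            accepted += 1
        samples.append(x)                    # a rejected move repeats the current state
    return samples, accepted / n_steps

# Hypothetical target: standard normal, known only up to its normalising constant.
samples, rate = metropolis(lambda x: -0.5 * x * x, x0=0.0,
                           proposal_sd=1.0, n_steps=50_000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, var, rate)  # mean near 0 and variance near 1 if the chain has mixed
```

Note that only the ratio π(Y)/π(X) is ever needed, so the normalising constant P(D) never has to be computed.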

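What happens when you “reach” the limiting distribution?

•  Consider a system in which a single parameter can take k possible values. My proposal is to select at random from the other k-1 possible values.

•  Suppose the system has reached its stationary state. What does this mean? The probability of being in a state is proportional to the prior times the likelihood.

•  Consider two states i and j, where $\pi(X_i) > \pi(X_j)$. The rates of flow in the two directions are:

Flow i -> j:

$$\pi(X_i)\,q(X_j \mid X_i)\,\alpha_{ij} = \pi(X_i) \times \frac{1}{k-1} \times \frac{\pi(X_j)}{\pi(X_i)} = \frac{\pi(X_j)}{k-1}$$

Flow j -> i (here $\alpha_{ji} = 1$, because the move is to a state of higher probability):

$$\pi(X_j)\,q(X_i \mid X_j)\,\alpha_{ji} = \pi(X_j) \times \frac{1}{k-1} \times 1 = \frac{\pi(X_j)}{k-1}$$

The two flows are equal: these are the “detailed balance” equations.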
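The k-state argument can be checked numerically. The sketch below builds the Metropolis transition matrix for a proposal that picks one of the other k-1 states uniformly, then verifies that the flows balance for every pair of states and that π is left unchanged by one step. The weights and function names are hypothetical, chosen only for illustration.

```python
def metropolis_transition_matrix(weights):
    """Transition matrix of a k-state Metropolis chain.

    weights[i] is the unnormalised target pi(X_i); the proposal picks one of
    the other k - 1 states uniformly at random.
    """
    k = len(weights)
    P = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i != j:
                alpha = min(1.0, weights[j] / weights[i])
                P[i][j] = alpha / (k - 1)
        P[i][i] = 1.0 - sum(P[i])  # rejected proposals leave the chain at i
    return P

def max_detailed_balance_gap(weights):
    """Largest |pi_i p_ij - pi_j p_ji| over all pairs of states."""
    total = sum(weights)
    pi = [w / total for w in weights]
    P = metropolis_transition_matrix(weights)
    k = len(weights)
    return max(abs(pi[i] * P[i][j] - pi[j] * P[j][i])
               for i in range(k) for j in range(k))

def one_step_from_stationary(weights):
    """Distribution after one step when the chain starts distributed as pi."""
    total = sum(weights)
    pi = [w / total for w in weights]
    P = metropolis_transition_matrix(weights)
    k = len(weights)
    return pi, [sum(pi[i] * P[i][j] for i in range(k)) for j in range(k)]

weights = [1.0, 2.0, 3.0, 4.0]  # hypothetical unnormalised target
print(max_detailed_balance_gap(weights))   # zero up to rounding error
print(one_step_from_stationary(weights))   # the two distributions agree
```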


The Hastings Ratio

•  A simple change to the acceptance formula allows you to use asymmetric proposals:

$$\alpha = \min\left(1, \frac{\pi(Y)\,q(X \mid Y)}{\pi(X)\,q(Y \mid X)}\right)$$

•  Moves that multiply or divide parameters need to apply the change of variables rule.
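A multiplicative move is a standard case where the Hastings correction matters. In the sketch below the proposal is Y = X × exp(σZ) with Z standard normal, which is a log-normal proposal; for it the ratio q(X|Y)/q(Y|X) works out to Y/X by the change of variables rule. The Exp(1) target, the value of σ and the name `mh_multiplicative` are assumptions for this example.

```python
import math
import random

def mh_multiplicative(log_target, x0, sigma, n_steps, seed=0):
    """Metropolis-Hastings on x > 0 with proposal Y = X * exp(sigma * Z), Z ~ N(0, 1)."""
    rng = random.Random(seed)
    x, log_px = x0, log_target(x0)
    samples = []
    for _ in range(n_steps):
        y = x * math.exp(sigma * rng.gauss(0.0, 1.0))
        log_py = log_target(y)
        # Hastings correction for this asymmetric proposal: q(X|Y) / q(Y|X) = Y / X.
        log_alpha = (log_py - log_px) + (math.log(y) - math.log(x))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x, log_px = y, log_py
        samples.append(x)
    return samples

# Hypothetical target: Exp(1) on x > 0, whose mean is 1.
samples = mh_multiplicative(lambda x: -x, x0=1.0, sigma=1.0, n_steps=50_000)
print(sum(samples) / len(samples))  # should be close to 1
```

Dropping the `math.log(y) - math.log(x)` term would give a chain that targets the wrong distribution, which is the practical meaning of the warning about multiply/divide moves.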

The small print

•  If detailed balance holds for every pair of states, then if the system reaches the stationary distribution, it will stay there

- it’s up to you to ensure it reaches it

•  There are three conditions for the chain $X_i$ to converge to the stationary distribution:

1)  X must be irreducible (every state must be reachable from every other state)

2)  X must be aperiodic (stops the system from getting stuck oscillating between states)

3)  X must be positive recurrent (the expected waiting time to return to a state is finite)

Proposal distribution choice

•  MCMC allows you to explore the behaviour of models that are too complex for you to solve analytically.

•  However there is no guarantee it will work; bad choices of proposal distribution can prevent you from converging to the distribution you want.

Let’s look at another example

First attempt, using MCMC

•  Uniform prior on [0,1]

•  Proposal distribution is normally distributed around current position, with sd = 1


[Figure: “Compare 3 runs of the chain”. Trace plot of z1 against Index (0 to 1000), with density histograms of z1, z2 and z3.]

Multiple runs get the same (right) answer

Second attempt, small modification

•  Uniform prior on [0,1]

•  Proposal distribution is normally distributed around current position, with sd = 0.1

[Figure: “Compare 3 runs of the chain”. Trace plot of z1 against Index (0 to 1000), with density histograms of z1, z2 and z3.]

We are stuck in one side of the target distribution
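The effect of the proposal width can be reproduced on a toy problem. The slides do not state the target used in the figures, so the sketch below assumes a hypothetical bimodal target (an equal mixture of two narrow normals centred at 1 and 4) and compares proposal standard deviations of 0.1 and 1. All names are illustrative.

```python
import math
import random

def log_bimodal(x):
    """Hypothetical target: equal mixture of N(1, 0.25^2) and N(4, 0.25^2), unnormalised."""
    a = math.exp(-0.5 * ((x - 1.0) / 0.25) ** 2)
    b = math.exp(-0.5 * ((x - 4.0) / 0.25) ** 2)
    return math.log(a + b) if a + b > 0.0 else float("-inf")

def rw_metropolis(log_target, x0, proposal_sd, n_steps, seed=0):
    """Random-walk Metropolis with a symmetric normal proposal."""
    rng = random.Random(seed)
    x, log_px = x0, log_target(x0)
    chain = []
    for _ in range(n_steps):
        y = rng.gauss(x, proposal_sd)
        log_py = log_target(y)
        if rng.random() < math.exp(min(0.0, log_py - log_px)):
            x, log_px = y, log_py
        chain.append(x)
    return chain

def fraction_right(chain, boundary=2.5):
    """Fraction of samples in the right-hand mode."""
    return sum(1 for v in chain if v > boundary) / len(chain)

# Both chains start in the left-hand mode.
narrow = rw_metropolis(log_bimodal, x0=1.0, proposal_sd=0.1, n_steps=100_000, seed=1)
wide = rw_metropolis(log_bimodal, x0=1.0, proposal_sd=1.0, n_steps=100_000, seed=2)
print(fraction_right(narrow))  # small steps rarely if ever cross the gap between modes
print(fraction_right(wide))    # larger steps visit both modes; the true value is 0.5
```

Both chains have the correct stationary distribution in theory; the narrow proposal simply cannot cross the low-probability region in any reasonable run length.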

Gibbs sampling

•  In Gibbs sampling, we want to find the posterior for a set of parameters.

•  Each parameter is updated in turn by sampling from the conditional distribution given the data and the current values of all the other parameters.

•  Consider the case of a single parameter updated using the Metropolis algorithm (with the Hastings ratio), where the proposal density is the conditional distribution, so that $q(Y \mid X) = \pi(Y)$ and $q(X \mid Y) = \pi(X)$:

$$\alpha_{XY} = \min\left(1, \frac{\pi(Y)\,q(X \mid Y)}{\pi(X)\,q(Y \mid X)}\right) = \min\left(1, \frac{\pi(Y)\,\pi(X)}{\pi(X)\,\pi(Y)}\right) = 1$$

•  i.e. the Gibbs sampler is an MCMC where every proposal is accepted.

•  With multiple parameters, you need to be careful about update order to ensure reversibility.
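A minimal Gibbs sampler, assuming a hypothetical target for which the conditionals are known in closed form: a bivariate normal with zero means, unit variances and correlation ρ, where each parameter given the other is normal with mean ρ × (other) and variance 1 - ρ². Names and the value of ρ are illustrative.

```python
import math
import random

def gibbs_bivariate_normal(rho, n_sweeps, seed=0):
    """Gibbs sampling for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)
    theta1, theta2 = 0.0, 0.0
    draws = []
    for _ in range(n_sweeps):
        # Each update is a draw from the full conditional, so it is always accepted.
        theta1 = rng.gauss(rho * theta2, cond_sd)   # theta1 | theta2
        theta2 = rng.gauss(rho * theta1, cond_sd)   # theta2 | theta1
        draws.append((theta1, theta2))
    return draws

def sample_correlation(draws):
    """Pearson correlation of the two coordinates."""
    n = len(draws)
    m1 = sum(d[0] for d in draws) / n
    m2 = sum(d[1] for d in draws) / n
    s11 = sum((d[0] - m1) ** 2 for d in draws)
    s22 = sum((d[1] - m2) ** 2 for d in draws)
    s12 = sum((d[0] - m1) * (d[1] - m2) for d in draws)
    return s12 / math.sqrt(s11 * s22)

draws = gibbs_bivariate_normal(rho=0.8, n_sweeps=20_000)
print(sample_correlation(draws))  # should be close to 0.8
```

The closer ρ is to 1, the smaller each conditional step becomes relative to the joint distribution, so the sampler mixes more slowly even though nothing is ever rejected.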


Convergence

•  If well constructed, the Markov chain is guaranteed to have the posterior as its stationary distribution

•  BUT this does not tell you how long you have to run it to reach stationarity

- The initial position may have a big influence

- The proposal distribution may lead to low acceptance rates

- The chain may get caught in a local maximum in the likelihood surface

•  Multiple runs from different initial conditions, and graphical checks can be used to check convergence

- The efficiency of the chain can be measured in terms of the variance of estimates obtained by running the chain for a short time
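One way to turn "multiple runs from different initial conditions" into a number is to compare between-chain and within-chain variance. The sketch below computes a basic version of the potential scale reduction factor associated with Gelman and Rubin; this particular statistic is not named in the slides and is offered only as an illustration. The standard normal target, the starting points and the function names are assumptions.

```python
import math
import random

def rw_metropolis(log_target, x0, proposal_sd, n_steps, seed):
    """Random-walk Metropolis with a symmetric normal proposal."""
    rng = random.Random(seed)
    x, log_px = x0, log_target(x0)
    chain = []
    for _ in range(n_steps):
        y = rng.gauss(x, proposal_sd)
        log_py = log_target(y)
        if rng.random() < math.exp(min(0.0, log_py - log_px)):
            x, log_px = y, log_py
        chain.append(x)
    return chain

def scale_reduction(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    m, n = len(chains), len(chains[0])
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    between = n * sum((mu - grand) ** 2 for mu in means) / (m - 1)
    within = sum(sum((v - mu) ** 2 for v in c) / (n - 1)
                 for c, mu in zip(chains, means)) / m
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

def run_from_starts(proposal_sd, starts=(-5.0, 0.0, 5.0), n_steps=5000):
    """Run one chain per starting point on a standard normal target."""
    log_target = lambda x: -0.5 * x * x
    return [rw_metropolis(log_target, x0, proposal_sd, n_steps, seed=i)
            for i, x0 in enumerate(starts)]

print(scale_reduction(run_from_starts(proposal_sd=1.0)))   # near 1: chains agree
print(scale_reduction(run_from_starts(proposal_sd=0.01)))  # far above 1: chains disagree
```

A value near 1 is necessary but not sufficient evidence of convergence: chains that are all stuck in the same mode will also agree with each other.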

Watch your chain

Two chains, sampling from an exp(1) distribution. Proposal distribution is normal with sd = 0.001 (red) and sd = 1 (black).

[Figure: “Watch chains (sd=0.001 in red, sd=1 in black)”. Trace plot against Index (0 to 10000), with density histograms for sd = 0.001 and for sd = 1.]

Acceptance rate 99.97% (sd = 0.001)

Acceptance rate 51.9% (sd = 1)
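The experiment on this slide can be approximated with the sketch below: two random-walk Metropolis chains on an Exp(1) target with proposal sd 0.001 and 1. The starting point, seeds and function name are assumptions, so the acceptance rates will be similar to, not identical with, the figures quoted on the slide.

```python
import math
import random

def rw_metropolis(log_target, x0, proposal_sd, n_steps, seed=0):
    """Random-walk Metropolis; returns the chain and its acceptance rate."""
    rng = random.Random(seed)
    x, log_px = x0, log_target(x0)
    chain, accepted = [], 0
    for _ in range(n_steps):
        y = rng.gauss(x, proposal_sd)
        log_py = log_target(y)
        if rng.random() < math.exp(min(0.0, log_py - log_px)):
            x, log_px = y, log_py
            accepted += 1
        chain.append(x)
    return chain, accepted / n_steps

def log_exp1(x):
    """Log density of Exp(1); proposals below zero have zero density."""
    return -x if x >= 0.0 else float("-inf")

tiny, tiny_rate = rw_metropolis(log_exp1, x0=1.0, proposal_sd=0.001, n_steps=10_000, seed=1)
unit, unit_rate = rw_metropolis(log_exp1, x0=1.0, proposal_sd=1.0, n_steps=10_000, seed=2)
print(tiny_rate, max(tiny) - min(tiny))  # almost everything accepted, chain barely moves
print(unit_rate, max(unit) - min(unit))  # about half accepted, chain covers the target
```

A very high acceptance rate is therefore not a sign of a good sampler: with sd = 0.001 nearly every move is accepted precisely because the moves are too small to explore the distribution.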

Burn-in

•  Often you start the chain far away from the target distribution

- Truth unknown

- Check for convergence

•  The first “few” samples from the chain are a poor representation of the stationary distribution

•  These are usually thrown away as “burn-in”

•  There is no theory telling you how much to throw away, but better to err on the side of more than less
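The effect of discarding burn-in can be seen in a small sketch. The setup is an assumption for illustration: a standard normal target, a chain deliberately started far away at x = 100, and a burn-in of 1000 out of 5000 steps chosen by eye rather than by any rule.

```python
import math
import random

def rw_metropolis(log_target, x0, proposal_sd, n_steps, seed=0):
    """Random-walk Metropolis with a symmetric normal proposal."""
    rng = random.Random(seed)
    x, log_px = x0, log_target(x0)
    chain = []
    for _ in range(n_steps):
        y = rng.gauss(x, proposal_sd)
        log_py = log_target(y)
        if rng.random() < math.exp(min(0.0, log_py - log_px)):
            x, log_px = y, log_py
        chain.append(x)
    return chain

def mean(values):
    return sum(values) / len(values)

# Hypothetical setup: standard normal target, chain started far from it.
chain = rw_metropolis(lambda x: -0.5 * x * x, x0=100.0, proposal_sd=1.0, n_steps=5000)
burn_in = 1000                     # arbitrary choice for this example
kept = chain[burn_in:]
print(mean(chain))  # pulled away from 0 by the initial transient
print(mean(kept))   # much closer to the true mean of 0
```

In practice the trace plot is the usual guide: discard everything before the chain visibly settles into its long-run behaviour, and then some.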

Other uses

•  Marginal effects: suppose we have a multidimensional parameter, we may only be interested in some subset

•  Prediction: given our posterior distribution on parameters, we can predict the distribution of future data by sampling parameters from the posterior, and simulating data given those parameters

- The posterior predictive distribution is a useful source of goodness-of-fit testing: if the data we simulate does not look like the data we originally collected, the model is poor.
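Posterior predictive simulation follows the two steps just described: draw a parameter from the posterior, then simulate data given it. The sketch below assumes a hypothetical binomial experiment (7 successes in 20 trials) with a uniform prior, for which the posterior is Beta(8, 14) and can be sampled directly; with a non-conjugate model the parameter draws would come from the MCMC output instead.

```python
import random

def posterior_predictive_counts(successes, trials, n_draws, seed=0):
    """Simulate replicate success counts from the posterior predictive distribution.

    Assumes a binomial likelihood and a uniform prior, so the posterior for the
    success probability is Beta(1 + successes, 1 + trials - successes).
    """
    rng = random.Random(seed)
    a = 1 + successes
    b = 1 + trials - successes
    replicates = []
    for _ in range(n_draws):
        theta = rng.betavariate(a, b)                               # parameter from the posterior
        y_rep = sum(1 for _ in range(trials) if rng.random() < theta)  # data given theta
        replicates.append(y_rep)
    return replicates

# Hypothetical observed data: 7 successes in 20 trials.
reps = posterior_predictive_counts(successes=7, trials=20, n_draws=20_000)
print(sum(reps) / len(reps))                     # predictive mean of a replicate count
print(sum(1 for r in reps if r >= 7) / len(reps))  # how often a replicate is at least as large as observed
```

For a goodness-of-fit check, the observed data should look like a typical replicate; if a chosen summary of the observed data sits far in the tail of the replicates, the model is suspect.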


Modifications

•  An important development has been to allow trans-dimensional moves (Green 1995), also known as reversible-jump MCMC.

- useful when looking for change points (e.g. in rate) along a sequence

- e.g. when looking for periods of elevated accident rate or elevated recombination rate

There are many subtle variations of basic MCMC that allow you to increase the efficiency in complex situations (see Liu 2001 for many).

Further reading

•  For basic Markov Chain background: Probability and Random Processes, Grimmett and Stirzaker (OUP, 2001).

•  Markov Chain Monte Carlo in Practice, 1996, eds Gilks, Richardson, Spiegelhalter (Chapman and Hall/CRC).

•  Bayesian Data Analysis, 2004. Gelman, Carlin, Stern and Rubin (Chapman and Hall/CRC).

•  Monte Carlo Strategies in Scientific Computing, 2001, Liu (Springer-Verlag).

•  Chris Holmes’ short course on Bayesian Statistics: http://www.stats.ox.ac.uk/~cholmes/Courses/BDA/bda_mcmc.pdf
