Bayesian inference: calculate the model parameters that produce a distribution that gives the observed data the greatest probability
Thomas Bayes: Bayesian methods were invented in the 18th century, but their application in phylogenetics dates from 1996.

[Portrait: Thomas Bayes? (1701?–1761?)]
Bayes’ theorem: Bayes’ theorem links a conditional probability to its inverse

$$\mathrm{Prob}(H \mid D) = \frac{\mathrm{Prob}(H)\,\mathrm{Prob}(D \mid H)}{\sum_{H}\mathrm{Prob}(H)\,\mathrm{Prob}(D \mid H)}$$
Bayes’ theorem: in the case of two alternative hypotheses, the theorem

$$\mathrm{Prob}(H \mid D) = \frac{\mathrm{Prob}(H)\,\mathrm{Prob}(D \mid H)}{\sum_{H}\mathrm{Prob}(H)\,\mathrm{Prob}(D \mid H)}$$

can be written as

$$\mathrm{Prob}(H_1 \mid D) = \frac{\mathrm{Prob}(H_1)\,\mathrm{Prob}(D \mid H_1)}{\mathrm{Prob}(H_1)\,\mathrm{Prob}(D \mid H_1) + \mathrm{Prob}(H_2)\,\mathrm{Prob}(D \mid H_2)}$$
Bayes’ theorem: Bayes for smarties

The data D: five smarties drawn from a bag, four orange and one blue. The hypotheses:

H1 = D came from the mainly orange bag (¾ orange, ¼ blue)
H2 = D came from the mainly blue bag (¼ orange, ¾ blue)

$$\mathrm{Prob}(D \mid H_1) = \tfrac34 \cdot \tfrac34 \cdot \tfrac34 \cdot \tfrac34 \cdot \tfrac14 \cdot 5 = 405/1024$$

$$\mathrm{Prob}(D \mid H_2) = \tfrac14 \cdot \tfrac14 \cdot \tfrac14 \cdot \tfrac14 \cdot \tfrac34 \cdot 5 = 15/1024$$

(the factor 5 counts the positions among the five draws that the single blue smarty can occupy)

$$\mathrm{Prob}(H_1) = \mathrm{Prob}(H_2) = \tfrac12$$

$$\mathrm{Prob}(H_1 \mid D) = \frac{\mathrm{Prob}(H_1)\,\mathrm{Prob}(D \mid H_1)}{\mathrm{Prob}(H_1)\,\mathrm{Prob}(D \mid H_1) + \mathrm{Prob}(H_2)\,\mathrm{Prob}(D \mid H_2)} = \frac{\tfrac12 \cdot 405/1024}{\tfrac12 \cdot 405/1024 + \tfrac12 \cdot 15/1024} \approx 0.964$$
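A minimal Python sketch (not part of the original slides) that reproduces this arithmetic; the bag compositions ¾ / ¼ are read off the likelihood terms above:

```python
from math import comb

# Data D: 5 smarties drawn, 4 orange and 1 blue.
# H1: mainly orange bag, P(orange) = 3/4; H2: mainly blue bag, P(orange) = 1/4.
def likelihood(p_orange, n_orange=4, n_blue=1):
    """Binomial probability of the draw; comb(5, 1) = 5 is the slide's
    factor 5 (positions the single blue smarty can occupy)."""
    n = n_orange + n_blue
    return comb(n, n_blue) * p_orange**n_orange * (1 - p_orange)**n_blue

prior = {"H1": 0.5, "H2": 0.5}
lik = {"H1": likelihood(3 / 4), "H2": likelihood(1 / 4)}  # 405/1024, 15/1024

norm = sum(prior[h] * lik[h] for h in prior)              # normalizing constant
posterior = {h: prior[h] * lik[h] / norm for h in prior}
print(posterior["H1"])   # 0.96428..., the 0.964 on the slide
```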
Bayes’ theorem: a-priori knowledge can affect one’s conclusions

| | positive test result | negative test result |
| --- | --- | --- |
| ill | true positive | false negative |
| healthy | false positive | true negative |

| | positive test result | negative test result |
| --- | --- | --- |
| ill | 99% | 1% |
| healthy | 0.1% | 99.9% |

Using the test data only, P(ill | positive test result) ≈ 0.99.
Bayes’ theorem: a-priori knowledge can affect one’s conclusions

| | positive test result | negative test result |
| --- | --- | --- |
| ill | 99% | 1% |
| healthy | 0.1% | 99.9% |

A-priori knowledge: 0.1% of the population (n = 100,000) is ill.

| | positive test result | negative test result |
| --- | --- | --- |
| ill (100) | 99 | 1 |
| healthy (99,900) | 100 | 99,800 |

With this a-priori knowledge, 99 of the 199 persons with a positive test result are ill, so P(ill | positive result) ≈ 50%.
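The same ≈50% figure follows directly from Bayes’ theorem; a short sketch, assuming only the sensitivity and false-positive rate from the table plus the 0.1% prevalence prior:

```python
# Test characteristics from the table above.
p_pos_given_ill = 0.99       # P(positive | ill), the sensitivity
p_pos_given_healthy = 0.001  # P(positive | healthy), the false-positive rate

p_ill = 0.001                # a-priori knowledge: 0.1% of the population is ill

# Bayes' theorem:
# P(ill | pos) = P(ill) P(pos | ill) / [P(ill) P(pos | ill) + P(healthy) P(pos | healthy)]
numer = p_ill * p_pos_given_ill
denom = numer + (1 - p_ill) * p_pos_given_healthy
print(numer / denom)   # ≈ 0.50, matching the 99-in-199 count above
```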
Bayes’ theorem: a-priori knowledge can affect one’s conclusions (the Monty Hall problem)

| Behind door 1 | Behind door 2 | Behind door 3 | Result if staying at door 1 | Result if switching to the door offered |
| --- | --- | --- | --- | --- |
| Car | Goat | Goat | Car | Goat |
| Goat | Car | Goat | Goat | Car |
| Goat | Goat | Car | Goat | Car |
Bayes’ theorem: a-priori knowledge can affect one’s conclusions

$$P(C=c \mid H=h, S=s) = \frac{P(H=h \mid C=c, S=s) \cdot P(C=c \mid S=s)}{P(H=h \mid S=s)}$$

where C = the number of the door hiding the car, S = the number of the door selected by the player, and H = the number of the door opened by the host. This is the probability of finding the car behind door c, after the player’s original selection and the host’s opening of one door.
Bayes’ theorem: a-priori knowledge can affect one’s conclusions

Expanding the denominator over the possible locations of the car:

$$P(C=c \mid H=h, S=s) = \frac{P(H=h \mid C=c, S=s) \cdot P(C=c \mid S=s)}{\sum_{c'=1}^{3} P(H=h \mid C=c', S=s) \cdot P(C=c' \mid S=s)}$$

where C = the number of the door hiding the car, S = the number of the door selected by the player, H = the number of the door opened by the host. The host’s behaviour depends on the candidate’s selection and on where the car is.
Bayes’ theorem: a-priori knowledge can affect one’s conclusions

$$P(C=2 \mid H=3, S=1) = \frac{1 \cdot \tfrac13}{\tfrac12 \cdot \tfrac13 + 1 \cdot \tfrac13 + 0 \cdot \tfrac13} = \tfrac23$$

so after selecting door 1 and seeing the host open door 3, switching to door 2 wins the car with probability 2/3.
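The 2/3 answer can also be checked by brute-force simulation; this sketch is illustrative and assumes the standard rules (the host always opens a goat door the player did not pick, choosing at random when two such doors are available):

```python
import random

def monty_hall_trial(switch: bool) -> bool:
    """One round of the game; returns True if the player wins the car."""
    doors = [0, 1, 2]
    car = random.choice(doors)    # C: the door hiding the car
    pick = 0                      # S: the player selects door 1 (index 0)
    # H: the host opens a goat door the player did not pick.
    host = random.choice([d for d in doors if d != pick and d != car])
    if switch:                    # switch to the one remaining closed door
        pick = next(d for d in doors if d != pick and d != host)
    return pick == car

n = 100_000
print(sum(monty_hall_trial(True) for _ in range(n)) / n)   # ≈ 2/3 when switching
print(sum(monty_hall_trial(False) for _ in range(n)) / n)  # ≈ 1/3 when staying
```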
Bayes’ theorem: Bayes’ theorem is used to combine a prior probability with the likelihood to produce a posterior probability

$$\underbrace{\mathrm{Prob}(H \mid D)}_{\text{posterior probability}} = \frac{\overbrace{\mathrm{Prob}(H)}^{\text{prior probability}}\;\overbrace{\mathrm{Prob}(D \mid H)}^{\text{likelihood}}}{\underbrace{\sum_{H}\mathrm{Prob}(H)\,\mathrm{Prob}(D \mid H)}_{\text{normalizing constant}}}$$
Bayesian inference of trees: in BI, the players are the tree topology and branch lengths, the evolutionary model, and the (sequence) data

- tree topology and branch lengths
- evolutionary model (substitution scheme among the nucleotides A, C, G, T)
- (sequence) data
Bayesian inference of trees: the posterior probability of a tree is calculated from the prior and the likelihood

$$\mathrm{Prob}(\text{tree}, \text{model} \mid \text{data}) = \frac{\mathrm{Prob}(\text{tree}, \text{model}) \cdot \mathrm{Prob}(\text{data} \mid \text{tree}, \text{model})}{\mathrm{Prob}(\text{data})}$$

The left-hand side is the posterior probability of a tree; the numerator is the prior probability of a tree times the likelihood; the denominator Prob(data) involves summation over all possible branch lengths and model-parameter values.
Bayesian inference of trees: the prior probability of a tree is often not known, and therefore all trees are considered equally probable

[Figure: all 15 possible unrooted topologies for the five taxa A–E, each assigned equal prior probability 1/15.]
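For five taxa there are indeed 15 unrooted binary topologies; in general the count is the double factorial (2n − 5)!!, a standard result not stated on the slide. A quick Python sketch:

```python
def n_unrooted_topologies(n_taxa: int) -> int:
    """Number of unrooted binary topologies: (2n - 5)!! = 3 * 5 * ... * (2n - 5)."""
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

print(n_unrooted_topologies(5))      # 15, hence the flat prior of 1/15 per tree
print(1 / n_unrooted_topologies(5))  # 0.0666...
print(n_unrooted_topologies(10))     # 2027025: exhaustive enumeration fails fast
```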
Bayesian inference of trees: the prior probability of a tree is often not known, and therefore all trees are considered equally probable

[Figure: bar charts over the candidate trees of the prior probability Prob(Tree i) (flat), the likelihood Prob(Data | Tree i), and the resulting posterior probability Prob(Tree i | Data).]
Bayesian inference of trees: but prior knowledge of taxonomy could suggest other prior probabilities

[Figure: the same 15 topologies with the group (CDE) constrained: the three topologies in which C, D and E form a group each receive prior probability 1/3; the remaining twelve receive prior 0.]
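A sketch of how such a constrained prior could be encoded; the topology labels are hypothetical stand-ins, since the slide identifies the three (CDE)-compatible trees only graphically:

```python
# Hypothetical labels T1..T15 for the 15 unrooted 5-taxon topologies;
# assume T1-T3 are the three in which C, D and E form a group.
all_topologies = [f"T{i}" for i in range(1, 16)]
cde_compatible = {"T1", "T2", "T3"}

def prior(tree: str, constrained: bool = True) -> float:
    """Uniform prior over the trees allowed by the constraint, zero elsewhere."""
    if not constrained:
        return 1 / len(all_topologies)  # flat prior: 1/15 for every tree
    return 1 / len(cde_compatible) if tree in cde_compatible else 0.0  # 1/3 or 0

print(sum(prior(t) for t in all_topologies))  # 1.0: the prior still sums to one
```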
Bayesian inference of trees: BI requires summation over all possible trees … which is impossible to do analytically

$$\mathrm{Prob}(\text{tree}, \text{model} \mid \text{data}) = \frac{\mathrm{Prob}(\text{tree}, \text{model}) \cdot \mathrm{Prob}(\text{data} \mid \text{tree}, \text{model})}{\mathrm{Prob}(\text{data})}$$

The normalizing constant Prob(data) entails summation over all possible trees and, for each tree, over all possible branch lengths and model-parameter values.
Bayesian inference of trees: but Markov chain Monte Carlo (MCMC) allows approximating the posterior probability

1. Start at a random point.

[Figure: posterior probability density over parameter space, with peaks corresponding to tree 1, tree 2 and tree 3.]
Bayesian inference of trees: but MCMC allows approximating the posterior probability

1. Start at a random point.
2. Make a small random move.
3. Calculate the posterior density ratio r = (density at new state) / (density at old state).

[Figure: the chain proposes a move from point 1 to point 2 on the posterior density.]
Bayesian inference of trees: but MCMC allows approximating the posterior probability

1. Start at a random point.
2. Make a small random move.
3. Calculate the posterior density ratio r = (density at new state) / (density at old state).
4. If r > 1, always accept the move.

[Figure: an uphill move from point 1 to point 2 is always accepted.]
Bayesian inference of trees: but MCMC allows approximating the posterior probability

1. Start at a random point.
2. Make a small random move.
3. Calculate the posterior density ratio r = (density at new state) / (density at old state).
4. If r > 1, always accept the move. If r < 1, accept the move with probability r: the further the density drops, the less likely the move is accepted.

[Figure: a mildly downhill move from point 1 to point 2 is perhaps accepted.]
Bayesian inference of trees: but MCMC allows approximating the posterior probability

1. Start at a random point.
2. Make a small random move.
3. Calculate the posterior density ratio r = (density at new state) / (density at old state).
4. If r > 1, always accept the move. If r < 1, accept the move with probability r.

[Figure: a strongly downhill move from point 1 to point 2 is rarely accepted.]
Bayesian inference of trees: the proportion of time that the MCMC chain spends in a particular parameter region is an estimate of that region’s posterior probability

1. Start at a random point.
2. Make a small random move.
3. Calculate the posterior density ratio r = (density at new state) / (density at old state).
4. If r > 1, always accept the move; if r < 1, accept the move with probability r.
5. Go to step 2.

[Figure: the chain spends 20%, 48% and 32% of its time in the regions around tree 1, tree 2 and tree 3, respectively.]
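A minimal Metropolis sampler over a toy one-dimensional parameter space with three peaks standing in for trees 1–3; the target density and proposal width are illustrative assumptions, but the accept/reject rule is exactly the r = new/old recipe above:

```python
import math, random

def posterior_density(x: float) -> float:
    """Toy unnormalized posterior: three peaks with weights 0.20, 0.48, 0.32."""
    peaks = [(1.0, 0.20), (2.0, 0.48), (3.0, 0.32)]  # (location, weight)
    return sum(w * math.exp(-0.5 * ((x - m) / 0.15) ** 2) for m, w in peaks)

random.seed(1)
x = random.uniform(0.0, 4.0)                 # 1. start at a random point
samples = []
for _ in range(200_000):
    x_new = x + random.gauss(0.0, 0.3)       # 2. make a small random move
    r = posterior_density(x_new) / posterior_density(x)  # 3. ratio new/old
    if r >= 1 or random.random() < r:        # 4. accept if r > 1, else w.p. r
        x = x_new
    samples.append(x)                        # 5. go to step 2

# Time spent near each peak estimates that region's posterior probability.
for i, m in enumerate([1.0, 2.0, 3.0], start=1):
    frac = sum(abs(s - m) < 0.5 for s in samples) / len(samples)
    print(f"tree {i}: {frac:.2f}")           # ≈ 0.20, 0.48, 0.32
```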
Bayesian inference of trees: Metropolis-coupled Markov chain Monte Carlo speeds up the search

- cold chain: samples P(tree | data)
- hot chain: samples P(tree | data)^β
- hotter chain: samples P(tree | data)^β (smaller β)
- hottest chain: samples P(tree | data)^β (smallest β)

with 0 < β < 1: raising the posterior to a power β < 1 makes it flatter than the cold chain’s, so heated chains move between peaks more easily.
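To see why heating helps, note that raising a density to a power β ∈ (0, 1) flattens it, so the valleys between peaks become easier to cross; an illustrative sketch (the β values and densities are assumptions, not from the slides):

```python
def heat(density: float, beta: float) -> float:
    """Heated-chain density: P(tree | data) ** beta, with 0 < beta < 1."""
    return density ** beta

peak, valley = 0.5, 0.001   # assumed densities at a peak and in a valley
for beta in [1.0, 0.5, 0.2]:                 # cold, hot, hotter
    ratio = heat(valley, beta) / heat(peak, beta)
    print(f"beta={beta}: valley/peak = {ratio:.4f}")
# beta=1.0: 0.0020  -> the cold chain almost never accepts a move into the valley
# beta=0.5: 0.0447  -> roughly 22x easier for the hot chain
# beta=0.2: 0.2885  -> the landscape is nearly flat; the chain roams freely
```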
Bayesian inference of trees: Metropolis-coupled Markov chain Monte Carlo speeds up the search

[Cartoon: the cold scout is stuck on a local optimum while the hot scout, roaming a flatter landscape, signals a better spot: “Hey! Over here!”]