Divergence-time estimation in RevBayes

108
E S D T Tracy A. Heath University of Kansas University of California, Berkeley Iowa State University (in 2015) Tanja Stadler ETH Zürich 2014 Phylogenetic Analysis in RevBayes NESCent Academy

Transcript of Divergence-time estimation in RevBayes

Page 1: Divergence-time estimation in RevBayes

E S D T

Tracy A. HeathUniversity of Kansas

University of California, BerkeleyIowa State University (in 2015)

Tanja StadlerETH Zürich

2014 Phylogenetic Analysis in RevBayesNESCent Academy

Page 2: Divergence-time estimation in RevBayes

A T-S E

Phylogenetic trees can provide both topological informationand temporal information

0.2 expected

substitutions/site

Primates

Carnivora

Cetacea

Simiiformes

Artiodactyla

Mic

roce

bu

s

Homininae

Hippo

Fo

ssil

Ca

libra

tio

ns

Ph

ylo

ge

ne

tic R

ela

tio

nsh

ips

Se

qu

en

ce

Da

ta

Did Simiiformes experience

accelerated rates of molecular

evolution?

What is the age of the MRCA of

mouse lemurs (Microcebus)?

100 0.020.040.060.080.0

EquusRhinocerosBosHippopotamusBalaenopteraPhyseterUrsusCanisFelisHomoPanGorillaPongoMacacaCallithrixLorisGalagoDaubentoniaVareciaEulemurLemurHapalemurPropithecusLepilemur

MirzaM. murinusM. griseorufus

M. myoxinusM. berthaeM. rufus1M. tavaratraM. rufus2M. sambiranensisM. ravelobensis

Cheirogaleus

Sim

iiform

es

Mic

rocebus

Cretaceous Paleogene Neogene Q

Time (Millions of years)

Understanding Evolutionary Processes (Yang & Yoder Syst. Biol. 2003)

Page 3: Divergence-time estimation in RevBayes

A T-S EPhylogenetic divergence-timeestimation

• What was the spacial andclimatic environment of ancientangiosperms?

• Did the uplift of the PatagonianAndes drive the diversity ofPeruvian lilies?

• How has mammalian body-sizechanged over time?

• Is diversification in Caribbeananoles correlated with ecologicalopportunity?

• How has the rate of molecularevolution changed across theTree of Life?

(Antonelli & Sanmartin. Syst. Biol. 2011)

(Lartillot & Delsuc. Evolution 2012) (Mahler, Revell, Glor, & Losos. Evolution 2010)

(Nabholz, Glemin, Galtier. MBE 2008)

Historical biogeography Molecular evolution

Trait evolution

Diversi�cation

Anolis fowleri (image by L. Mahler)

Understanding Evolutionary Processes

Page 4: Divergence-time estimation in RevBayes

T D I D

Divergence time estimationof rapidly evolvingpathogens provideinformation about spatialand temporal dynamics ofinfectious diseases

Sequences sampled atdifferent time horizonsimpose a temporal structureon the tree by providingages for non-contemporaneous tips

(Pybus & Rambaut. 2009. Nature Reviews Genetics.)

Page 5: Divergence-time estimation in RevBayes

D T E

Goal: Estimate the ages of interior nodes to understand thetiming and rates of evolutionary processes

Model how rates aredistributed across the tree

Describe the distribution ofspeciation events over time

External calibrationinformation for estimates ofabsolute node times

calibrated node

100 0.020.040.060.080.0

EquusRhinocerosBosHippopotamusBalaenopteraPhyseterUrsusCanisFelisHomoPanGorillaPongoMacacaCallithrixLorisGalagoDaubentoniaVareciaEulemurLemurHapalemurPropithecusLepilemur

MirzaM. murinusM. griseorufus

M. myoxinusM. berthaeM. rufus1M. tavaratraM. rufus2M. sambiranensisM. ravelobensis

Cheirogaleus

Sim

iiform

es

Mic

rocebus

Cretaceous Paleogene Neogene Q

Time (Millions of years)

Page 6: Divergence-time estimation in RevBayes

A T-S E

Phylogenetic trees can provide both topological informationand temporal information

100 0.020.040.060.080.0

EquusRhinocerosBosHippopotamusBalaenopteraPhyseterUrsusCanisFelisHomoPanGorillaPongoMacacaCallithrixLorisGalagoDaubentoniaVareciaEulemurLemurHapalemurPropithecusLepilemur

MirzaM. murinusM. griseorufus

M. myoxinusM. berthaeM. rufus1M. tavaratraM. rufus2M. sambiranensisM. ravelobensis

Cheirogaleus

Sim

iiform

es

Mic

roce

bu

s

Cretaceous Paleogene Neogene Q

Time (Millions of years)

Understanding Evolutionary Processes (Yang & Yoder Syst. Biol. 2003; Heath et al. MBE 2012)

Page 7: Divergence-time estimation in RevBayes

T G M C

Assume that the rate ofevolutionary change isconstant over time

(branch lengths equalpercent sequencedivergence) 10%

400 My

200 My

A B C

20%

10%10%

(Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)

Page 8: Divergence-time estimation in RevBayes

T G M C

We can date the tree if weknow the rate of change is1% divergence per 10 My N

A B C

20%

10%10%

10%200 My

400 My

200 My

(Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)

Page 9: Divergence-time estimation in RevBayes

T G M C

If we found a fossil of theMRCA of B and C, we canuse it to calculate the rateof change & date the rootof the tree

N

A B C

20%

10%10%

10%200 My

400 My

(Based on slides by Jeff Thorne; http://statgen.ncsu.edu/thorne/compmolevo.html)

Page 10: Divergence-time estimation in RevBayes

R G M C

Rates of evolution vary across lineages and over time

Mutation rate:Variation in

• metabolic rate

• generation time

• DNA repair

Fixation rate:Variation in

• strength and targets ofselection

• population sizes

10%

400 My

200 My

A B C

20%

10%10%

Page 11: Divergence-time estimation in RevBayes

U A

Sequence data provideinformation about branchlengths

In units of the expected # ofsubstitutions per site

branch length = rate × time0.2 expected

substitutions/site

Ph

ylo

ge

ne

tic R

ela

tio

nsh

ips

Se

qu

en

ce

Da

ta

Page 12: Divergence-time estimation in RevBayes

P L

f (D | V , θs,Ψ)

V Vector of branch lengths

θs Sequence model parameters

D Sequence data

Ψ Tree topology

Page 13: Divergence-time estimation in RevBayes

R T

The expected # of substitutions/site occurring along abranch is the product of the substitution rate and time

length = rate × time length = rate length = time

Methods for dating species divergences estimate thesubstitution rate and time separately

Page 14: Divergence-time estimation in RevBayes

R T

The sequence dataprovide informationabout branch length

for any possible rate,there’s a time that fitsthe branch lengthperfectly

0

1

2

3

4

5

0 1 2 3 4 5

Bra

nch

Rat

e

Branch Time

time = 0.8rate = 0.625

branch length = 0.5

(based on Thorne & Kishino, 2005)

Page 15: Divergence-time estimation in RevBayes

B D T E

length = rate length = time

R = (r, r, r, . . . , rN−)

A = (a, a, a, . . . , aN−)

N = number of tips

Page 16: Divergence-time estimation in RevBayes

B D T E

length = rate length = time

R = (r, r, r, . . . , rN−)

A = (a, a, a, . . . , aN−)

N = number of tips

Page 17: Divergence-time estimation in RevBayes

B D T E

Posterior probability

f (R,A, θR, θA, θs | D,Ψ)

R Vector of rates on branches

A Vector of internal node ages

θR, θA, θs Model parameters

D Sequence data

Ψ Tree topology

Page 18: Divergence-time estimation in RevBayes

B D T E

f(R,A, θR, θA, θs | D) =

f (D | R,A, θs) f(R | θR) f(A | θA) f(θs)

f(D)

f(D | R,A, θR, θA, θs) Likelihood

f(R | θR) Prior on rates

f(A | θA) Prior on node ages

f(θs) Prior on substitution parameters

f(D) Marginal probability of the data

Page 19: Divergence-time estimation in RevBayes

B D T E

Estimating divergence times relies on 2 main elements:

• Branch-specific rates: f (R | θR)

• Node ages: f (A | θA,C)

Page 20: Divergence-time estimation in RevBayes

M R VSome models describing lineage-specific substitution ratevariation:

• Global molecular clock (Zuckerkandl & Pauling, 1962)

• Local molecular clocks (Hasegawa, Kishino & Yano 1989;Kishino & Hasegawa 1990; Yoder & Yang 2000; Yang & Yoder

2003, Drummond and Suchard 2010)

• Punctuated rate change model (Huelsenbeck, Larget andSwofford 2000)

• Log-normally distributed autocorrelated rates (Thorne,Kishino & Painter 1998; Kishino, Thorne & Bruno 2001; Thorne &

Kishino 2002)

• Uncorrelated/independent rates models (Drummond et al.2006; Rannala & Yang 2007; Lepage et al. 2007)

• Mixture models on branch rates (Heath, Holder, Huelsenbeck2012)

Models of Lineage-specific Rate Variation

Page 21: Divergence-time estimation in RevBayes

G M C

The substitution rate isconstant over time

All lineages share the samerate

branch length = substitution rate

low high

Models of Lineage-specific Rate Variation (Zuckerkandl & Pauling, 1962)

Page 22: Divergence-time estimation in RevBayes

G M C

rate

de

nsity

r

rate prior distribution

Models of Lineage-specific Rate Variation (Zuckerkandl & Pauling, 1962)

Page 23: Divergence-time estimation in RevBayes

G M C

The sampled rate is appliedto every branch in the tree

rate

de

nsity

r

rate prior distribution

Models of Lineage-specific Rate Variation (Zuckerkandl & Pauling, 1962)

Page 24: Divergence-time estimation in RevBayes

R G M C

Rates of evolution vary across lineages and over time

Mutation rate:Variation in

• metabolic rate

• generation time

• DNA repair

Fixation rate:Variation in

• strength and targets ofselection

• population sizes

10%

400 My

200 My

A B C

20%

10%10%

Page 25: Divergence-time estimation in RevBayes

R-C M

To accommodate variation in substitution rates‘relaxed-clock’ models estimate lineage-specific substitutionrates

• Local molecular clocks

• Punctuated rate change model

• Log-normally distributed autocorrelated rates

• Uncorrelated/independent rates models

• Mixture models on branch rates

Page 26: Divergence-time estimation in RevBayes

L M C

Rate shifts occurinfrequently over the tree

Closely related lineageshave equivalent rates(clustered by sub-clades)

low high

branch length = substitution rate

Models of Lineage-specific Rate Variation (Yang & Yoder 2003, Drummond and Suchard 2010)

Page 27: Divergence-time estimation in RevBayes

L M C

Most methods forestimating local clocksrequired specifying thenumber and locations ofrate changes a priori

Drummond and Suchard(2010) introduced aBayesian method thatsamples over a broad rangeof possible random localclocks

low high

branch length = substitution rate

Models of Lineage-specific Rate Variation (Yang & Yoder 2003, Drummond and Suchard 2010)

Page 28: Divergence-time estimation in RevBayes

A R

Substitution rates evolvegradually over time –closely related lineages havesimilar rates

The rate at a node isdrawn from a lognormaldistribution with a meanequal to the parent rate

low high

branch length = substitution rate

Models of Lineage-specific Rate Variation (Thorne, Kishino & Painter 1998; Kishino, Thorne & Bruno 2001)

Page 29: Divergence-time estimation in RevBayes

A R

Models of Lineage-specific Rate Variation (Thorne, Kishino & Painter 1998; Kishino, Thorne & Bruno 2001)

Page 30: Divergence-time estimation in RevBayes

P R C

Rate changes occur alonglineages according to apoint process

At rate-change events, thenew rate is a product ofthe parent’s rate and aΓ-distributed multiplier

low high

branch length = substitution rate

Models of Lineage-specific Rate Variation (Huelsenbeck, Larget and Swofford 2000)

Page 31: Divergence-time estimation in RevBayes

I/U R

Lineage-specific rates areuncorrelated when the rateassigned to each branch isindependently drawn froman underlying distribution

low high

branch length = substitution rate

Models of Lineage-specific Rate Variation (Drummond et al. 2006)

Page 32: Divergence-time estimation in RevBayes

I/U R

low high

branch length = substitution rate

Models of Lineage-specific Rate Variation (Drummond et al. 2006)

Page 33: Divergence-time estimation in RevBayes

I M M

Dirichlet process prior:Branches are partitionedinto distinct rate categories

branch length = substitution rate

c5

c4

c3

c2

substitution rate classes

c1

Models of Lineage-specific Rate Variation (Heath, Holder, Huelsenbeck. 2012 MBE)

Page 34: Divergence-time estimation in RevBayes

T D P P (DPP)

A stochastic process that models data as a mixture ofdistributions and can identify latent classes present in thedata

Branches are assumed toform distinct substitutionrate clusters

Efficient Markov chainMonte Carlo (MCMC)implementations allow forinference under this model

branch length = substitution rate

c5

c4

c3

c2

substitution rate classes

c1

DPP Model of Lineage-specific Rate Variation (Heath, Holder, Huelsenbeck. 2012 MBE 29:939-955)

Page 35: Divergence-time estimation in RevBayes

T D P P (DPP)

A stochastic process that models data as a mixture ofdistributions and can identify latent classes present in thedata

Random variables under theDPP informed by the data:

• the number of rateclasses

• the assignment ofbranches to classes

• the rate value for eachclass

branch length = substitution rate

c5

c4

c3

c2

substitution rate classes

c1

DPP Model of Lineage-specific Rate Variation (Heath, Holder, Huelsenbeck. 2012 MBE 29:939-955)

Page 36: Divergence-time estimation in RevBayes

T D P P (DPP)

A stochastic process that models data as a mixture ofdistributions and can identify latent classes present in thedata

Random variables under theDPP informed by the data:

• the number of rateclasses

• the assignment ofbranches to classes

• the rate value for eachclass

DPP Model of Lineage-specific Rate Variation (Heath, Holder, Huelsenbeck. 2012 MBE 29:939-955)

Page 37: Divergence-time estimation in RevBayes

DPP C P

Global molecular

clock

rate classes

branch length = substitution rate

18c1

DPP Model of Lineage-specific Rate Variation (Heath, Holder, Huelsenbeck. 2012 MBE 29:939-955)

Page 38: Divergence-time estimation in RevBayes

DPP C P

Independent

rates

1c1

1c18

1c3

1c2

rate classes

branch length = substitution rate

DPP Model of Lineage-specific Rate Variation (Heath, Holder, Huelsenbeck. 2012 MBE 29:939-955)

Page 39: Divergence-time estimation in RevBayes

DPP C P

Local molecular

clock

5c3

5c2

rate classes

branch length = substitution rate

8c1

(262,142 different local molecular clocks)

DPP Model of Lineage-specific Rate Variation (Heath, Holder, Huelsenbeck. 2012 MBE 29:939-955)

Page 40: Divergence-time estimation in RevBayes

DPP C P

Global molecular

clock

4c3

6c2

rate classes

branch length = substitution rate

8c1

Each of the

682,076,806,159

configurations

has a prior weight

DPP Model of Lineage-specific Rate Variation (Heath, Holder, Huelsenbeck. 2012 MBE 29:939-955)

Page 41: Divergence-time estimation in RevBayes

M R V

These are only a subset of the available models forbranch-rate variation

• Global molecular clock

• Local molecular clocks

• Punctuated rate change model

• Log-normally distributed autocorrelated rates

• Uncorrelated/independent rates models

• Dirchlet process prior

Models of Lineage-specific Rate Variation

Page 42: Divergence-time estimation in RevBayes

M R V

Are our models appropriate across all data sets?

cave bear

American

black bear

sloth bear

Asian

black bear

brown bear

polar bear

American giant

short-faced bear

giant panda

sun bear

harbor seal

spectacled

bear

4.08

5.39

5.66

12.86

2.75

5.05

19.09

35.7

0.88

4.58

[3.11–5.27]

[4.26–7.34]

[9.77–16.58]

[3.9–6.48]

[0.66–1.17]

[4.2–6.86]

[2.1–3.57]

[14.38–24.79]

[3.51–5.89]14.32

[9.77–16.58]

95% CI

mean age (Ma)

t 2

t 3

t 4

t 6

t 7

t 5

t 8

t 9

t 10

t x

node

MP•MLu•MLp•Bayesian

100•100•100•1.00

100•100•100•1.00

85•93•93•1.00

76•94•97•1.00

99•97•94•1.00

100•100•100•1.00

100•100•100•1.00

100•100•100•1.00

t 1

Eocene Oligocene Miocene Plio Plei Hol

34 5.3 1.823.8 0.01

Epochs

Ma

Global expansion of C4 biomassMajor temperature drop and increasing seasonality

Faunal turnover

Krause et al., 2008. Mitochondrial genomes reveal anexplosive radiation of extinct and extant bears near theMiocene-Pliocene boundary. BMC Evol. Biol. 8.

Taxa

1

5

10

50

100

500

1000

5000

10000

20000

0100200300MYA

Ophidiiformes

Percomorpha

Beryciformes

Lampriformes

Zeiforms

Polymixiiformes

Percopsif. + Gadiif.

Aulopiformes

Myctophiformes

Argentiniformes

Stomiiformes

Osmeriformes

Galaxiiformes

Salmoniformes

Esociformes

Characiformes

Siluriformes

Gymnotiformes

Cypriniformes

Gonorynchiformes

Denticipidae

Clupeomorpha

Osteoglossomorpha

Elopomorpha

Holostei

Chondrostei

Polypteriformes

Clade r ε ΔAIC

1. 0.041 0.0017 25.32. 0.081 * 25.53. 0.067 0.37 45.1 4. 0 * 3.1Bg. 0.011 0.0011

Ostariophysi

Acanthomorpha

Teleo

stei

Santini et al., 2009. Did genome duplication drive the originof teleosts? A comparative study of diversification inray-finned fishes. BMC Evol. Biol. 9.

Page 43: Divergence-time estimation in RevBayes

M R V

These are only a subset of the available models forbranch-rate variation

• Global molecular clock

• Local molecular clocks

• Punctuated rate change model

• Log-normally distributed autocorrelated rates

• Uncorrelated/independent rates models

• Dirchlet process prior

Model selection and model uncertainty are very importantfor Bayesian divergence time analysis

Models of Lineage-specific Rate Variation

Page 44: Divergence-time estimation in RevBayes

B D T E

Estimating divergence times relies on 2 main elements:

• Branch-specific rates: f (R | θR)

• Node ages: f (A | θA,C)

http://bayesiancook.blogspot.com/2013/12/two-sides-of-same-coin.html

Page 45: Divergence-time estimation in RevBayes

P N T

Sequence data are only informative on relative rates & times

Node-time priors cannot give precise estimates of absolutenode ages

We need external information (like fossils) to calibrate orscale the tree to absolute time

Node Age Priors

Page 46: Divergence-time estimation in RevBayes

C D T

Fossils (or other data) are necessary to estimate absolutenode ages

There is no information inthe sequence data forabsolute time

Uncertainty in theplacement of fossils

N

A B C

20%

10%10%

10%200 My

400 My

Page 47: Divergence-time estimation in RevBayes

C D

Bayesian inference is well suited to accommodatinguncertainty in the age of the calibration node

Divergence times arecalibrated by placingparametric densities oninternal nodes offset by ageestimates from the fossilrecord

N

A B C

200 My

De

nsity

Age

Page 48: Divergence-time estimation in RevBayes

F C

Fossil and geological datacan be used to estimate theabsolute ages of ancientdivergences

Time (My)

Calibrating Divergence Times

Page 49: Divergence-time estimation in RevBayes

F C

The ages of extant taxaare known

Time (My)

Calibrating Divergence Times

Page 50: Divergence-time estimation in RevBayes

F C

Fossil taxa are assigned tomonophyletic clades

Time (My)Minimum age

Calibrating Divergence Times Notogoneus osculus (Grande & Grande J. Paleont. 2008)

Page 51: Divergence-time estimation in RevBayes

F C

Fossil taxa are assigned tomonophyletic clades andconstrain the age of theMRCA

Minimum age Time (My)

Calibrating Divergence Times

Page 52: Divergence-time estimation in RevBayes

A F C

Misplaced fossils can affect node age estimates throughoutthe tree – if the fossil is older than its presumed MRCA

Calibrating the Tree (figure from Benton & Donoghue Mol. Biol. Evol. 2007)

Page 53: Divergence-time estimation in RevBayes

A F C

Crown clade: allliving species andtheir most-recentcommon ancestor(MRCA)

Calibrating the Tree (figure from Benton & Donoghue Mol. Biol. Evol. 2007)

Page 54: Divergence-time estimation in RevBayes

A F C

Stem lineages:purely fossil formsthat are closer totheir descendantcrown clade thanany other crownclade

Calibrating the Tree (figure from Benton & Donoghue Mol. Biol. Evol. 2007)

Page 55: Divergence-time estimation in RevBayes

A F C

Fossiliferoushorizons: thesources in therock record forrelevant fossils

Calibrating the Tree (figure from Benton & Donoghue Mol. Biol. Evol. 2007)

Page 56: Divergence-time estimation in RevBayes

F C

Age estimates from fossilscan provide minimum timeconstraints for internalnodes

Reliable maximum boundsare typically unavailable

Minimum age Time (My)

Calibrating Divergence Times

Page 57: Divergence-time estimation in RevBayes

P D C N

Parametric distributions aretypically off-set by the ageof the oldest fossil assignedto a clade

These prior densities do not(necessarily) requirespecification of maximumbounds

Uniform (min, max)

Exponential (λ)

Gamma (α, β)

Log Normal (µ, σ2)

Time (My)Minimum age

Calibrating Divergence Times

Page 58: Divergence-time estimation in RevBayes

P D C N

Describe the waiting timebetween the divergenceevent and the age of theoldest fossil

Minimum age Time (My)

Calibrating Divergence Times

Page 59: Divergence-time estimation in RevBayes

P D C N

Overly informative priorscan bias node ageestimates to be too young

Minimum age

Exponential (λ)

Time (My)

Calibrating Divergence Times

Page 60: Divergence-time estimation in RevBayes

P D C N

Uncertainty in the age ofthe MRCA of the claderelative to the age of thefossil may be bettercaptured by vague priordensities

Minimum age

Exponential (λ)

Time (My)

Calibrating Divergence Times

Page 61: Divergence-time estimation in RevBayes

P P

Hyperprior: place an higher-order prior on the parameter ofa prior distribution

Sample the time fromthe MRCA to thefossil from a mixtureof differentexponentialdistributions

Account foruncertainty in valuesof λ

De

nsity

Hyperparameter

Hyperprior

De

nsity

Parameter

Prior

Page 62: Divergence-time estimation in RevBayes

C P RB

Calibrating Divergence Times

Page 63: Divergence-time estimation in RevBayes

P D C N

Common practice in Bayesian divergence-time estimation:

Estimates of absolute nodeages are driven primarily bythe calibration density

Specifying appropriatedensities is a challenge formost molecular biologists

Uniform (min, max)

Exponential (λ)

Gamma (α, β)

Log Normal (µ, σ2)

Time (My)Minimum age

Calibration Density Approach

Page 64: Divergence-time estimation in RevBayes

I F C

We would prefer toeliminate the need forad hoc calibrationprior densities

Calibration densitiesdo not account fordiversification of fossils

Domestic dog

Spotted seal

Giant panda

Spectacled bear

Sun bear

Am. black bear

Asian black bear

Brown bear

Polar bear

Sloth bear

Zaragocyon daamsi

Ballusia elmensis

Ursavus brevihinus

Ailurarctos lufengensis

Ursavus primaevus

Agriarctos spp.

Kretzoiarctos beatrix

Indarctos vireti

Indarctos arctoides

Indarctos punjabiensis

Giant short-faced bear

Cave bear

Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)

Page 65: Divergence-time estimation in RevBayes

I F C

We want to use allof the available fossils

Example: Bears12 fossils are reducedto 4 calibration ageswith calibration densitymethods

Domestic dog

Spotted seal

Giant panda

Spectacled bear

Sun bear

Am. black bear

Asian black bear

Brown bear

Polar bear

Sloth bear

Zaragocyon daamsi

Ballusia elmensis

Ursavus brevihinus

Ailurarctos lufengensis

Ursavus primaevus

Agriarctos spp.

Kretzoiarctos beatrix

Indarctos vireti

Indarctos arctoides

Indarctos punjabiensis

Giant short-faced bear

Cave bear

Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)

Page 66: Divergence-time estimation in RevBayes

I F C

We want to use allof the available fossils

Example: Bears12 fossils are reducedto 4 calibration ageswith calibration densitymethods

Domestic dog

Spotted seal

Giant panda

Spectacled bear

Sun bear

Am. black bear

Asian black bear

Brown bear

Polar bear

Sloth bear

Zaragocyon daamsi

Ballusia elmensis

Ursavus brevihinus

Ailurarctos lufengensis

Ursavus primaevus

Agriarctos spp.

Kretzoiarctos beatrix

Indarctos vireti

Indarctos arctoides

Indarctos punjabiensis

Giant short-faced bear

Cave bear

Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)

Page 67: Divergence-time estimation in RevBayes

I F C

Because fossils arepart of thediversification process,we can combine fossilcalibration withbirth-death models

Domestic dog

Spotted seal

Giant panda

Spectacled bear

Sun bear

Am. black bear

Asian black bear

Brown bear

Polar bear

Sloth bear

Zaragocyon daamsi

Ballusia elmensis

Ursavus brevihinus

Ailurarctos lufengensis

Ursavus primaevus

Agriarctos spp.

Kretzoiarctos beatrix

Indarctos vireti

Indarctos arctoides

Indarctos punjabiensis

Giant short-faced bear

Cave bear

Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)

Page 68: Divergence-time estimation in RevBayes

I F C

This relies on abranching model thataccounts forspeciation, extinction,and rates offossilization,preservation, andrecovery

Domestic dog

Spotted seal

Giant panda

Spectacled bear

Sun bear

Am. black bear

Asian black bear

Brown bear

Polar bear

Sloth bear

Zaragocyon daamsi

Ballusia elmensis

Ursavus brevihinus

Ailurarctos lufengensis

Ursavus primaevus

Agriarctos spp.

Kretzoiarctos beatrix

Indarctos vireti

Indarctos arctoides

Indarctos punjabiensis

Giant short-faced bear

Cave bear

Fossil and Extant Bears (Krause et al. BMC Evol. Biol. 2008; Abella et al. PLoS ONE 2012)

Page 69: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Improving statistical inference of absolute node ages

Eliminates the need to specify arbitrarycalibration densities

Better capture our statisticaluncertainty in species divergence dates

All reliable fossils associated with aclade are used 150 100 50 0

Time

Heath, Huelsenbeck, & Stadler. 2014. The fossilized birth-death process forcoherent calibration of divergence time estimates. PNAS 111(29):E2957–E2966.

http://www.pnas.org/content/111/29/E2957.abstract

Page 70: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Recovered fossil specimensprovide historicalobservations of thediversification process thatgenerated the tree ofextant species

150 100 50 0

Time

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 71: Divergence-time estimation in RevBayes

T F B-D P (FBD)

The probability of the treeand fossil observationsunder a birth-death modelwith rate parameters:

λ = speciation

μ = extinction

ψ = fossilization/recovery

150 100 50 0

Time

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 72: Divergence-time estimation in RevBayes

T F B-D P (FBD)

We assume that the fossilis a descendant of aspecified calibrated node

The time of the fossil: Aindicates an observation ofthe birth-death process afterthe age of the node N

0250 50100150200

Time (My)

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 73: Divergence-time estimation in RevBayes

T F B-D P (FBD)

The fossil must attach tothe tree at some time: ⋆

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 74: Divergence-time estimation in RevBayes

T F B-D P (FBD)

If it is the descendant ofan unobserved lineage, thenthere is a speciation eventat time ⋆ on one of the 2branches

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 75: Divergence-time estimation in RevBayes

T F B-D P (FBD)

If it is the descendant ofan unobserved lineage, thenthere is a speciation eventat time ⋆ on one of the 2branches

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 76: Divergence-time estimation in RevBayes

T F B-D P (FBD)

If ⋆ = A, the fossil is anobservation of a lineageancestral to the extantspecies

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 77: Divergence-time estimation in RevBayes

T F B-D P (FBD)

If ⋆ = A, the fossil is anobservation of a lineageancestral to the extantspecies

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 78: Divergence-time estimation in RevBayes

T F B-D P (FBD)

The probability of thisrealization of thediversification process isconditional on:

λ, μ, and ψ

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 79: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Using MCMC, we cansample the age of thecalibrated node • whileconditioning on

λ, μ, and ψother node agesA and ⋆

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 80: Divergence-time estimation in RevBayes

T F B-D P (FBD)

MCMC allows us toconsider all possible valuesof ⋆ (marginalization)

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 81: Divergence-time estimation in RevBayes

T F B-D P (FBD)

MCMC allows us toconsider all possible valuesof ⋆ (marginalization)

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 82: Divergence-time estimation in RevBayes

T F B-D P (FBD)

MCMC allows us toconsider all possible valuesof ⋆ (marginalization)

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 83: Divergence-time estimation in RevBayes

T F B-D P (FBD)

MCMC allows us toconsider all possible valuesof ⋆ (marginalization)

0250 50100150200

Time (My)

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 84: Divergence-time estimation in RevBayes

T F B-D P (FBD)

The posterior samples ofthe calibrated node age areinformed by the fossilattachment times

0250 50100150200

Time (My)

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 85: Divergence-time estimation in RevBayes

T F B-D P (FBD)

The FBD model allowsmultiple fossils to calibratea single node

N

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 86: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Given ⋆ and A, the newfossil can attach to the treevia speciation along eitherbranch in the extant tree

N

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 87: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Given ⋆ and A, the newfossil can attach to the treevia speciation along eitherbranch in the extant tree

N

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 88: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Given ⋆ and A, the newfossil can attach to the treevia speciation along eitherbranch in the extant tree

N

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 89: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Or the unobserved branchleading to the othercalibrating fossil

N

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 90: Divergence-time estimation in RevBayes

T F B-D P (FBD)

If ⋆ = A, then the newfossil lies directly on abranch in the extant tree

N

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 91: Divergence-time estimation in RevBayes

T F B-D P (FBD)

If ⋆ = A, then the newfossil lies directly on abranch in the extant tree

N

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 92: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Or it is an ancestor of theother calibrating fossil

N

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 93: Divergence-time estimation in RevBayes

T F B-D P (FBD)

The probability of thisrealization of thediversification process isconditional on:

λ, μ, and ψ

N

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 94: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Using MCMC, we cansample the age of thecalibrated node whileconditioning on

λ, μ, and ψother node agesA and ⋆

A and ⋆N

0250 50100150200

Time (My)

N

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 95: Divergence-time estimation in RevBayes

T F B-D P (FBD)

MCMC allows us toconsider all possible valuesof ⋆ (marginalization)

0250 50100150200

Time (My)

Diversification of Fossil & Extant Lineages (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 96: Divergence-time estimation in RevBayes

B I U FBD

Implemented in:

• DPPDiv:github.com/trayc7/FDPPDIV

• BEAST2:beast2.cs.auckland.ac.nz/

Available soon:

• RevBayes:github.com/revbayes/revbayes

150 100 50 0

Time

0250 50100150200

Time (My)

FBD Implementation (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 97: Divergence-time estimation in RevBayes

L T T

The FBD model provides anestimate of the number oflineages over time

050100150200

Time (My)

3.0

2.0

1.0

1.0

50100150 0200

Time (My)

Nu

mb

er

of lin

ea

ge

s

Page 98: Divergence-time estimation in RevBayes

L T T

Lineage diversity over time with fossils

MCMC samples thetimes of lineages inthe reconstructed treeand the times of thefossil lineages

(for 1 simulation replicate with 10% random sample of fossils)

Simulations: Results

Page 99: Divergence-time estimation in RevBayes

L T T

Lineage diversity over time with fossils

Visualize extant andsampled fossil lineagediversification whenusing all availablefossils (21 total)

(for 1 simulation replicate with 10% random sample of fossils)

Simulations: Results

Page 100: Divergence-time estimation in RevBayes

L T T

Lineage diversity over time with fossils

Choosing only theoldest (calibration)fossils reduced the setof sampled fossilsfrom 21 to 12,giving less informationabout diversificationover time

(for 1 simulation replicate with 10% random sample of fossils)

Simulations: Results

Page 101: Divergence-time estimation in RevBayes

B: D T

Sequence data for extantspecies:

• 8 Ursidae• 1 Canidae (gray wolf)• 1 Phocidae (spotted seal)

Fossil ages:

• 12 Ursidae• 5 Canidae• 5 Pinnipedimorpha

DPP relaxed clock model

(Heath, Holder, Huelsenbeck MBE 2012)

Fixed tree topology

Se

qu

en

ce

Da

ta

Fossil Data Phylogenetic Relationships

fossil canids

fossil pinnepeds

stem fossil ursids

fossil Ailuropodinae

giant short-faced bear

cave bear

Empirical Analysis (Wang 1994, 1999; Krause et al. 2008; Fulton & Strobeck 2010; Abella et al. 2012)

Page 102: Divergence-time estimation in RevBayes

B: D TGray wolf

Spotted seal

Giant panda

Spectacled bear

Sun bear

Am. black bear

Asian black bear

Brown bear

Polar bear

Sloth bear

1. fossil canids

2. fossil pinnepeds

2. stem fossil ursids

3. fossil Ailuropodinae

4. giant short-faced bear

5. cave bear

1

3

2

4

5

Empirical Analysis (Heath et al. PNAS 2014; silhouette images: http://phylopic.org/)

Page 103: Divergence-time estimation in RevBayes

B: D TGray wolf

Spotted seal

Giant panda

Spectacled bear

Sun bear

Am. black bear

Asian black bear

Brown bear

Polar bear

Sloth bear

Urs

ida

e

Time (My)

60 2040 0

Eocene Oligocene Miocene

Plio

Ple

is

Paleocene

95% CI

Empirical Analysis (Heath et al. PNAS 2014; silhouette images: http://phylopic.org/)

Page 104: Divergence-time estimation in RevBayes

B: D T

fossil Ailuropodinae

fossil Arctodus

fossil Ursus

fossil pinnipeds

Gray wolf

Spotted seal

Giant panda

Spectacled bear

Sun bear

Am. black bear

Asian black bear

Brown bear

Polar bear

Sloth bear

Urs

ida

e

Time (My)

60 2040 0

Eocene Oligocene Miocene

Plio

Ple

is

Paleocene

stem fossil Ursidae

fossil canids

Empirical Analysis (Heath et al. PNAS 2014; silhouette images: http://phylopic.org/)

Page 105: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Improved statistical inference of absolute node ages

Biologically motivatedmodels can bettercapture statisticaluncertainty

fossil Ailuropodinae

fossil Arctodus

fossil Ursus

fossil pinnipeds

Gray wolf

Spotted seal

Giant panda

Spectacled bear

Sun bear

Am. black bear

Asian black bear

Brown bear

Polar bear

Sloth bear

Urs

ida

e

Time (My)

60 2040 0

Eocene Oligocene Miocene

Plio

Ple

is

Paleocene

stem fossil Ursidae

fossil canids

Modeling Diversification Processes (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 106: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Improved statistical inference of absolute node ages

Use all available fossils

Eliminates arbitrarychoice of calibrationpriors

fossil Ailuropodinae

fossil Arctodus

fossil Ursus

fossil pinnipeds

Gray wolf

Spotted seal

Giant panda

Spectacled bear

Sun bear

Am. black bear

Asian black bear

Brown bear

Polar bear

Sloth bear

Urs

ida

e

Time (My)

60 2040 0

Eocene Oligocene Miocene

Plio

Ple

is

Paleocene

stem fossil Ursidae

fossil canids

Modeling Diversification Processes (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 107: Divergence-time estimation in RevBayes

T F B-D P (FBD)

Improved statistical inference of absolute node ages

Extensions of the FBDcan account forstratigraphic samplingof fossils and shifts inrates of speciationand extinction

0175 255075100125150

Time

0.2 1.05

Preservation Rate

Modeling Diversification Processes (Heath, Huelsenbeck, Stadler. PNAS 2014)

Page 108: Divergence-time estimation in RevBayes

F T E D

Combining extant and fossilspecies

Fo

ssil

Ag

e D

ata

Se

qu

en

ce

Da

ta

© AntWeb.org

Mo

rph

olo

gic

al

Da

ta © AntWeb.org

0%

20%

40%

60%

80%

100%

OrthopteraParaneoptera

NeuropteraRaphidiopteraColeo PolyphagaColeoptera AdephagaLepidopteraMecoptera

XyelaMacroxyela

RunariaParemphytus

Blasticotoma

TenthredoAglaostigmaDolerus

SelandriaStrongylogasterMonophadnoidesMetallus

Athalia

Taxonus

HoplocampaNematinusNematusCladiusMonoctenusGilpiniaDiprion

CimbicinaeAbiaCorynis

ArgeSterictiphoraPergaPhylacteophagaLophyrotoma

AcorduleceraDecameria

Neurotoma

OnycholydaPamphiliusCephalciaAcantholyda

Megalodontes cephalotesMegalodontes skorniakowii

CephusCalameutaHartigiaSyntexisSirexXerisUrocerusTremexXiphydria

Orussus

Stephanidae AStephanidae B

Megalyridae

Trigonalidae

Chalcidoidea

Evanioidea

Ichneumonidae

Cynipoidea

Apoidea AApoidea BApoidea CVespidae

Grimmaratavites

Ghilarella

AulidontesProtosirexAuliscaKaratavitesSepulca

Onokhoius

TrematothoraxThoracotremaProsyntexis

FerganolydaRudisiriciusSogutia

XyelulaBrigittepterus

MesolydaBrachysyntexis

DahuratomaPseudoxyelocerus

PalaeathaliaAnaxyela

SyntexyelaKulbastavia

Undatoma

Abrotoxyela

Mesoxyela mesozoicaSpathoxyela

TriassoxyelaLeioxyela

NigrimonticolaChaetoxyelaAnagaridyela

EoxyelaLiadoxyela

Xyelotoma

Pamphiliidae undescribed

Turgidontes

PraeoryssusParoryssus

Mesorussus

SymphytopterusCleistogasterLeptephialtitesStephanogaster

97

100

100

100100

100100

100

100

100100

100

100100

100

100100

100

100100

100

100

100100

100

100

100

100

100

100

97

68

55

77

75

91

84

6157

62

9896

71

71

85

93

52

95

64

80

64

99

77

99

98

97

82

52

58

9490

60

70

57

8361

7060

350 300 250 200 150 100 50 0 million years before present

% of morphological

characters scored

for each terminal

(Ronquist, Klopfstein, et al. Syst. Biol. 2012.)