Generalizing phylogenetics to infer shared evolutionary events

Generalizing phylogenetics to infer shared evolutionary events Jamie R. Oaks 1,2 1 Department of Biological Sciences, Auburn University 2 Department of Biology, University of Washington March 31, 2016 c 2007 Boris Kulikov boris-kulikov.blogspot.com Shared divergences Jamie Oaks – phyletica.org 1/35

Transcript of Generalizing phylogenetics to infer shared evolutionary events

Generalizing phylogenetics to infer shared evolutionary events

Jamie R. Oaks 1,2

1 Department of Biological Sciences, Auburn University

2 Department of Biology, University of Washington

March 31, 2016

© 2007 Boris Kulikov boris-kulikov.blogspot.com

Shared divergences Jamie Oaks – phyletica.org 1/35

Outline

An assumption (i.e., exciting opportunity) in phylogenetics

An approach to the problem

Empirical applications

Current and future directions


Current state of phylogenetics

• Shared ancestry is a fundamental property of life

• Phylogenetics is rapidly progressing as the statistical foundation of comparative biology

• “Big data” present exciting possibilities and computational challenges

• Exciting opportunities to develop new ways to study biology in the light of phylogeny

Current state of phylogenetics

• Assumption: All processes of diversification affect each lineage independently and only cause bifurcating divergences.

• We know this assumption is frequently violated

Violating independent divergences

Violations are pervasive and interesting

Biogeography

• Environmental changes that affect whole communities of species

Gene family evolution

• Chromosomal duplications

Epidemiology

• Disease spread via co-infected individuals

• Transmission at social gatherings

Endosymbiont evolution (e.g., parasites, microbiome)

• Speciation of the host

• Co-colonization of new host species

Why account for shared divergences?

1. Improve inference

2. Provide a framework for studying processes of co-diversification

True history: τ1, τ2, τ3

Current tree model: τ1, τ2, τ3, τ4, τ5, τ6, τ7, τ8

Problem: Current methods only consider the general model

Consequence: Unnecessary parameters introduce error

Solution: Accommodate shared divergence models

Advantage: More data to estimate shared parameters

Divergence model choice

[Figure: divergence times labeled τ1, τ2, and τ3]

Inferring co-diversification

[Figure: divergence models m1 to m5, each labeled with its divergence times (τ1, τ2, τ3) and its posterior probability p(m1 | X), ..., p(m5 | X)]

We want to infer m and T given DNA sequence alignments X

$$p(m_i \mid X) \propto p(X \mid m_i)\, p(m_i)$$

$$p(X \mid m_i) = \int_{\theta} p(X \mid \theta, m_i)\, p(\theta \mid m_i)\, d\theta$$

Parameters θ to integrate over:

• Divergence times

• Gene trees

• Substitution parameters

• Demographic parameters

J. R. Oaks et al. (2013). Evolution 67: 991–1010, J. R. Oaks (2014). BMC Evolutionary Biology 14: 150
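As a minimal sketch of the model-choice rule p(m_i | X) ∝ p(X | m_i) p(m_i), the snippet below normalizes the product of marginal likelihood and prior over a small set of models. The function name `posterior_model_probs` and the numerical values are hypothetical and chosen only for illustration; obtaining real marginal likelihoods is the hard part of the problem.

```python
import math


def posterior_model_probs(log_marginal_likelihoods, priors):
    """Return p(m_i | X) from log p(X | m_i) and p(m_i) using Bayes' rule."""
    log_joint = [ll + math.log(p)
                 for ll, p in zip(log_marginal_likelihoods, priors)]
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    shift = max(log_joint)
    weights = [math.exp(lj - shift) for lj in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up log marginal likelihoods for five models m1..m5 (illustration only),
# combined with a uniform prior over the models.
log_ml = [-1002.3, -1000.1, -1001.7, -1000.9, -1004.0]
prior = [0.2] * 5
post = posterior_model_probs(log_ml, prior)
for i, p in enumerate(post, start=1):
    print(f"p(m{i} | X) = {p:.3f}")
```

Working on the log scale matters here because marginal likelihoods of sequence data are typically far too small to represent directly as floating-point numbers.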

Challenges:

1. Cannot solve all the integrals analytically

2. Likelihood is tractable, but “cumbersome” (or is it?...)

   • Numerical approximation via approximate-likelihood Bayesian computation (ABC)

3. Sampling over all possible models

   • 5 taxa = 52 models

   • 10 taxa = 115,975 models

   • 20 taxa = 51,724,158,235,372 models!!

   • A “diffuse” Dirichlet process prior (DPP)
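The model counts quoted for 5, 10, and 20 taxa are Bell numbers: the number of ways to partition a set of taxa into groups that share a divergence time. A short sketch using the Bell triangle reproduces them; the function name `bell_number` is just a label for this example.

```python
def bell_number(n):
    """Number of ways to partition n items into non-empty groups."""
    row = [1]
    for _ in range(n - 1):
        # Each new row of the Bell triangle starts with the last entry of
        # the previous row; each later entry adds the entry above-left.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[-1]


for n_taxa in (5, 10, 20):
    print(f"{n_taxa} taxa = {bell_number(n_taxa):,} models")
```

Because the count grows faster than exponentially, enumerating every model is hopeless beyond a handful of taxa, which is why a prior that can be sampled, such as the Dirichlet process, is attractive.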


“Easy” as ABC

[Figure: DNA sequence alignments (columns of A, C, G, and T) with labels S1, S2, and S3, followed by a three-dimensional plot with axes S1, S2, and S3, each running from 0.0 to 1.0]
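The general idea of ABC rejection sampling can be sketched as follows, assuming that S1, S2, and S3 in the figure are summary statistics of the alignments, as is usual in ABC. This is a toy illustration and not the simulator used by dpp-msbayes: the model (exponential waiting times with unknown mean `theta`), the single summary statistic (the sample mean), the prior, and the tolerance are all assumptions made for the example.

```python
import random
import statistics


def abc_rejection(observed_summary, draw_prior, simulate, summarize,
                  n_draws, tolerance, rng):
    """Keep prior draws whose simulated summary lands near the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = draw_prior(rng)
        simulated = simulate(theta, rng)
        if abs(summarize(simulated) - observed_summary) <= tolerance:
            accepted.append(theta)
    return accepted


def draw_prior(rng):
    # Uniform prior on theta; the lower bound avoids dividing by zero below.
    return rng.uniform(0.1, 10.0)


def simulate(theta, rng):
    # Toy stand-in for a data simulator: 50 exponential draws with mean theta.
    return [rng.expovariate(1.0 / theta) for _ in range(50)]


rng = random.Random(2016)
observed = simulate(3.0, rng)  # pretend these are the real data
observed_summary = statistics.mean(observed)

accepted = abc_rejection(observed_summary, draw_prior, simulate,
                         statistics.mean, n_draws=5000, tolerance=0.2, rng=rng)
print(f"accepted {len(accepted)} of 5000 prior draws")
if accepted:
    print(f"approximate posterior mean of theta: {statistics.mean(accepted):.2f}")
```

The accepted draws approximate the posterior without ever evaluating a likelihood. The price is the information lost by reducing the data to summary statistics and the error introduced by a nonzero tolerance.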

Dirichlet process prior probability of each of the five divergence models, for α = 0.5 and α = 10.0:

$\left(\frac{\alpha}{\alpha+1}\right)\left(\frac{\alpha}{\alpha+2}\right)$ = 0.067 (α = 0.5) or 0.758 (α = 10.0)

$\left(\frac{\alpha}{\alpha+1}\right)\left(\frac{1}{\alpha+2}\right)$ = 0.133 (α = 0.5) or 0.076 (α = 10.0)

$\left(\frac{\alpha}{\alpha+1}\right)\left(\frac{1}{\alpha+2}\right)$ = 0.133 (α = 0.5) or 0.076 (α = 10.0)

$\left(\frac{1}{\alpha+1}\right)\left(\frac{\alpha}{\alpha+2}\right)$ = 0.133 (α = 0.5) or 0.076 (α = 10.0)

$\left(\frac{1}{\alpha+1}\right)\left(\frac{2}{\alpha+2}\right)$ = 0.533 (α = 0.5) or 0.015 (α = 10.0)
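These prior probabilities follow from the Chinese restaurant process view of the Dirichlet process: a partition of n taxa into k groups of sizes n_1, ..., n_k has probability α^k ∏(n_j − 1)! / ∏_{i=0}^{n−1}(α + i). The sketch below evaluates that formula for the five ways of grouping three taxa; the function name `crp_partition_prob` is a label chosen for this example.

```python
import math


def crp_partition_prob(group_sizes, alpha):
    """Prior probability of one specific partition under a Dirichlet process."""
    n = sum(group_sizes)
    numerator = alpha ** len(group_sizes)
    for size in group_sizes:
        numerator *= math.factorial(size - 1)
    denominator = 1.0
    for i in range(n):
        denominator *= alpha + i
    return numerator / denominator


# Group sizes for the five ways to assign three taxa to divergence events:
# all separate, three ways to let two taxa share an event, and all shared.
models = [(1, 1, 1), (2, 1), (2, 1), (2, 1), (3,)]

for alpha in (0.5, 10.0):
    probs = [crp_partition_prob(sizes, alpha) for sizes in models]
    print(alpha, [round(p, 3) for p in probs], "sum =", round(sum(probs), 3))
```

A small α puts most prior mass on all taxa sharing one divergence, while a large α favors every taxon diverging independently, so α controls how strongly the prior favors shared events.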

New method: dpp-msbayes

• Approximate-likelihood Bayesian approach to inferring models of shared divergences

• Flexible Dirichlet-process prior (DPP) over all possible divergence models

• Flexible priors on parameters to avoid strongly weighted posteriors

• Multi-processing to accommodate genomic datasets

J. R. Oaks (2014). BMC Evolutionary Biology 14: 150

dpp-msbayes: Simulation-based assessment

Validation:

• Simulate 50,000 datasets and analyze each under the same model

Robustness:

• Simulate datasets that violate model assumptions and analyze each of them

dpp-msbayes: Validation results

[Figure, two panels: posterior probability of one divergence (x-axis, 0.0 to 1.0) against true probability of one divergence (y-axis, 0.0 to 1.0)]

J. R. Oaks (2014). BMC Evolutionary Biology 14: 150
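A validation plot of this kind can be built by binning the estimated posterior probabilities across simulated datasets and comparing each bin with the fraction of datasets in which the event was actually true. The sketch below shows that bookkeeping on synthetic stand-in values rather than dpp-msbayes output: the "estimates" are uniform random numbers and the truths are drawn to match them, so the result is well calibrated by construction. The function name `calibration_table` is hypothetical.

```python
import random


def calibration_table(posterior_probs, truths, n_bins=10):
    """Bin estimated probabilities and report the observed frequency per bin."""
    counts = [0] * n_bins
    hits = [0] * n_bins
    for p, is_true in zip(posterior_probs, truths):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        counts[b] += 1
        hits[b] += int(is_true)
    table = []
    for b in range(n_bins):
        if counts[b] > 0:
            midpoint = (b + 0.5) / n_bins
            table.append((midpoint, hits[b] / counts[b], counts[b]))
    return table


# Synthetic stand-in for 50,000 simulation replicates (illustration only).
rng = random.Random(1)
probs = [rng.random() for _ in range(50000)]
truths = [rng.random() < p for p in probs]

table = calibration_table(probs, truths)
for midpoint, frequency, count in table:
    print(f"estimated ~{midpoint:.2f}  observed {frequency:.3f}  (n = {count})")
```

For a well-behaved method the observed frequency in each bin should track the bin midpoint, which corresponds to points falling along the identity line in the plot.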

dpp-msbayes: Performance

• New method for estimating shared evolutionary history shows:

  1. Model-choice accuracy
  2. Robustness to model violations
  3. Power to detect variation in divergence times
  4. It’s fast!

• A new tool for biologists to leverage comparative genomic data to explore processes of co-diversification

J. R. Oaks (2014). BMC Evolutionary Biology 14: 150

Did repeated fragmentation of islands during inter-glacial rises in sea level promote diversification?

Climate-driven diversification

Results

[Figure: posterior probability (y-axis, 0.00 to 0.10) against number of divergence events (x-axis, 1 to 21); a second panel shows sea level (m, 0 to -100) against time (kya, 500 to 0)]

J. R. Oaks (2014). BMC Evolutionary Biology 14: 150

More data!

• Collecting genomic data from taxa co-distributed across Southeast Asian islands and mainland

• Preliminary results for 1000 loci from 5 pairs of Gekko mindorensis populations

[Figure: 2ln(Bayes factor) (y-axis, -5.0 to 3.0) against number of divergence events, |τ| (x-axis, 1 to 5)]
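The quantity 2ln(Bayes factor) reported for each number of divergence events can be computed from prior and posterior probabilities, assuming the Bayes factor compares "exactly k events" against "any other number of events" as the ratio of posterior odds to prior odds. That comparison is an assumption of this sketch, and the probabilities below are made up for illustration; they are not the results shown in the talk.

```python
import math


def two_ln_bayes_factor(posterior_prob, prior_prob):
    """2 ln(BF) for a hypothesis versus its complement, from probabilities."""
    posterior_odds = posterior_prob / (1.0 - posterior_prob)
    prior_odds = prior_prob / (1.0 - prior_prob)
    return 2.0 * math.log(posterior_odds / prior_odds)


# Made-up prior and posterior probabilities for 1 to 5 divergence events.
prior = [0.10, 0.25, 0.30, 0.25, 0.10]
posterior = [0.05, 0.30, 0.40, 0.20, 0.05]

for k, (pr, po) in enumerate(zip(prior, posterior), start=1):
    print(f"{k} events: 2ln(BF) = {two_ln_bayes_factor(po, pr):+.2f}")
```

Values above zero mean the data increased support for that number of events relative to the prior, and values below zero mean the data decreased it.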

Diversification across African rainforests

• Did climate cycles drive diversification and community assembly across rainforest taxa?

• Preliminary results with 300 loci from 3 taxa

[Figure: 2ln(Bayes factor) (y-axis, -1.5 to 1.5) against number of divergence events, |τ| (x-axis, 1 to 3)]

Conclusions

• New method for estimating shared evolutionary history

  • Shows good “frequentist” behavior

  • Relatively robust to model violations

• Finding support for temporally clustered divergences in multiple systems

  • However, there is a lot of uncertainty!

Current work: More power

Ecoevolity: Estimating evolutionary coevality

• Full-likelihood Bayesian implementation

  • Uses all the information in the data

  • Applicable to deeper timescales

• Analytically integrate over gene trees [1]

  • Very efficient numerical approximation of posterior

  • Applicable to NGS datasets

[1] D. Bryant et al. (2012). Molecular Biology and Evolution 29: 1917–1932

Next step: A general framework

• Develop a framework for inferring shared divergences across phylogenies

• Generalize Bayesian phylogenetics to incorporate shared divergences

• Sample models numerically via reversible-jump Markov chain Monte Carlo

Benefits:

• Improve phylogenetic inference

• Framework for studying processes of co-diversification

[Figure: divergence times labeled τ1 and τ2]

Everything is on GitHub...

Software:

• Ecoevolity: https://github.com/phyletica/ecoevolity

• PyMsBayes: https://joaks1.github.io/PyMsBayes

• dpp-msbayes: https://github.com/joaks1/dpp-msbayes

• ABACUS: Approximate BAyesian C UtilitieS. https://github.com/joaks1/abacus

Open-Science Notebook:

• msbayes-experiments: https://github.com/joaks1/msbayes-experiments


Acknowledgments

Ideas and feedback:

• Leache Lab

• Minin Lab

• Holder Lab

• Brown Lab/KU Herpetology

Computation:

Funding:

Photo credits:

• Rafe Brown, Cam Siler, Jesse Grismer, & Jake Esselstyn

• FMNH Philippine Mammal Website:

  • D.S. Balete, M.R.M. Duya, & J. Holden

• PhyloPic!


Questions?

[email protected]
