Inference under the Wright-Fisher model using an accurate beta approximation

Paula Tataru, Thomas Bataillon, Asger Hobolth

Bioinformatics Research Centre, Aarhus University

CSHL, April 15th 2015


Theoretical population genetics

› Mathematical models formalize the evolution of genetic variation within and between populations
› Provide a framework for inferring evolutionary paths from observed data


Inference problems

› Inference of population history from DNA data

› (Variable) population size

› Migration / admixture

› Divergence times

› Selection coefficients


Inference problems: population size

PSMC

H. Li and R. Durbin. Inference of human population history from individual whole-genome sequences. Nature, 475:493–496, 2011


Inference problems: population divergence

Kim Tree

M. Gautier and R. Vitalis. Inferring population histories using genome-wide allele frequency data. Molecular Biology and Evolution, 30(3):654–668, 2013


Inference problems: population admixture

TreeMix

J. K. Pickrell and J. K. Pritchard. Inference of population splits and mixtures from genome-wide allele frequency data. PLOS Genetics, 8(11):e1002967, 2012


Inference problems: population admixture

G-PhoCS

I. Gronau, M. J. Hubisz, B. Gulko, C. G. Danko and A. Siepel. Bayesian inference of ancient human demography from individual genome sequences. Nature Genetics, 43(10):1031–1034, 2011


Inference problems: loci under selection

spectralHMM

M. Steinrücken, A. Bhaskar and Y. S. Song. A novel spectral method for inferring general selection from time series genetic data. The Annals of Applied Statistics, 8(4):2203–2222, 2014


Population genetics: the Wright-Fisher model

› Evolution of a population forward in time
› Follow one locus (region in the DNA)
› Different variants at the locus are called alleles

[Figure: individuals across generations (time)]


Population genetics: the Wright-Fisher model

› Basic model: only two alleles per locus
› Follow the frequency of one of the alleles

[Figure: individuals across generations (time); allele count per generation: 3, 2, 3, 3, 4, 5, 5]


Allele frequency distribution


Population genetics: the coalescent model

› Trace the genealogy of sampled individuals backward in time
› Coalescent process terminates when reaching the most recent common ancestor (MRCA)

[Figure: genealogy of the sampled individuals across generations (time), ending at the MRCA]
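The backward-in-time process can be sketched in a few lines. This is a minimal illustration, assuming the standard Kingman coalescent with time measured in units of 2N generations (the slides do not state the time scale): while k lineages remain, the waiting time to the next coalescence is exponential with rate k(k-1)/2. The names `simulate_tmrca` and `expected_tmrca` are hypothetical and chosen for this example only.

```python
import random

def simulate_tmrca(n_samples, rng):
    """Return one simulated time to the MRCA, in units of 2N generations."""
    t = 0.0
    k = n_samples
    while k > 1:
        rate = k * (k - 1) / 2.0      # number of lineage pairs that can coalesce
        t += rng.expovariate(rate)    # waiting time to the next coalescence
        k -= 1                        # two lineages merge into one
    return t

def expected_tmrca(n_samples):
    """Expectation under the assumed model: sum of 2/(k(k-1)) = 2(1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n_samples)

rng = random.Random(1)
draws = [simulate_tmrca(10, rng) for _ in range(20000)]
mean_tmrca = sum(draws) / len(draws)
print(mean_tmrca, expected_tmrca(10))
```

The loop stops when a single lineage is left, which is the MRCA of the sample. The number of steps grows with the sample size.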


Two dual models

› The Wright-Fisher
  › Forward in time
  › Follow allele frequency
  › Selection
  › Scalability
    › Sample size decreases uncertainty
› The coalescent
  › Backward in time
  › Follow genealogy
  › Recombination
  › Scalability
    › Sample size increases complexity


Approximations to the Wright-Fisher

› Diffusion
  › Large population size
  › Infinitesimal change
  › No closed-form solution
  › Cumbersome to evaluate
› Moment-based
  › Convenient distributions
    › Normal distribution
    › Beta distribution
  › Closed analytical forms
  › Fast to evaluate
  › Problematic at boundaries


Behavior at the boundaries

› Normal distribution
  › Support: real line
  › Truncation
    › Incorrect variance
  › Intermediary frequencies
› Beta distribution
  › Support: [0, 1]
  › Intermediary frequencies
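The support mismatch of the normal approximation can be made concrete with a small calculation. The sketch below assumes pure drift, for which the allele frequency has mean x0 and variance x0(1 - x0)(1 - (1 - 1/(2N))^t), and measures how much probability a normal distribution with those moments places outside [0, 1]. The function names and parameter values are illustrative and not taken from the talk.

```python
import math

def drift_moments(x0, two_n, t):
    """Mean and variance of the allele frequency after t generations of pure drift."""
    mean = x0
    var = x0 * (1.0 - x0) * (1.0 - (1.0 - 1.0 / two_n) ** t)
    return mean, var

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def normal_mass_outside_unit_interval(mean, var):
    """Probability that a normal with these moments falls below 0 or above 1."""
    sd = math.sqrt(var)
    below = normal_cdf(0.0, mean, sd)
    above = 1.0 - normal_cdf(1.0, mean, sd)
    return below + above

mean, var = drift_moments(x0=0.1, two_n=200, t=100)
outside = normal_mass_outside_unit_interval(mean, var)
print(outside)
```

With a low starting frequency a sizeable share of the normal mass falls below zero, so truncating it to [0, 1] changes the variance. A beta distribution with the same moments has no mass outside [0, 1].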


The Beta with spikes

›Use of Wright-Fisher

› Scalable

›Use of moments

› Simple mathematical calculations

› Improve behavior at boundaries

› Preserve mean and variance


The Wright-Fisher model

› Z_t: allele count
› X_t = Z_t / 2N
› Z_{t+1} follows a binomial distribution
› g encodes the evolutionary pressures

[Figure: individuals across generations (time); allele count per generation: 3, 2, 3, 3, 4, 5, 5]
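One generation of the model can be simulated directly. The sketch below assumes that the binomial has 2N trials and success probability g(X_t), i.e. Z_{t+1} | Z_t ~ Binomial(2N, g(Z_t / 2N)), and that drift alone corresponds to g(x) = x. The slide text states neither explicitly, so both are labeled assumptions, and all function names are hypothetical.

```python
import random

def drift_only(x):
    """Assumed form of g under pure drift: the frequency is passed on unchanged."""
    return x

def wright_fisher_step(z, two_n, g, rng):
    """Draw Z_{t+1} ~ Binomial(2N, g(Z_t / 2N)) as a sum of Bernoulli trials."""
    p = g(z / two_n)
    return sum(1 for _ in range(two_n) if rng.random() < p)

def simulate_counts(z0, two_n, generations, g, rng):
    """Return the allele counts Z_0, Z_1, ..., Z_generations."""
    counts = [z0]
    for _ in range(generations):
        counts.append(wright_fisher_step(counts[-1], two_n, g, rng))
    return counts

rng = random.Random(42)
counts = simulate_counts(z0=3, two_n=10, generations=6, g=drift_only, rng=rng)
freqs = [z / 10 for z in counts]   # X_t = Z_t / 2N
print(counts)
```

Under pure drift the counts 0 and 2N are absorbing: once the allele is lost or fixed, every later generation has the same count.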


The Wright-Fisher model: Drift only

[Figure: individuals across generations (time); allele count per generation: 3, 2, 3, 3, 4, 5, 5]


The Wright-Fisher model: Mutations

[Figure: individuals across generations (time); allele count per generation: 3, 2, 4, 5, 4, 3, 2; labels u and v]


The Wright-Fisher model: Migration

[Figure: individuals across generations (time); allele count per generation: 3, 2, 3, 5, 4, 2, 3; labels m1 and m2]


The Wright-Fisher model: Linear forces

› Mutations
› Migration
› Mutations & Migration
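What makes a force "linear" is that g is a linear function of the frequency, so the expected frequency can be propagated by applying g to the previous expectation. The sketch below illustrates this for mutation only and assumes the form g(x) = (1 - u) x + v (1 - x), where u is the probability that the tracked allele mutates away and v the probability that the other allele mutates into it. The direction assigned to u and v is an assumption, since the slide formulas did not survive extraction, and the function names are hypothetical.

```python
def make_mutation_g(u, v):
    """Return an assumed linear mutation force g(x) = (1 - u) * x + v * (1 - x)."""
    def g(x):
        return (1.0 - u) * x + v * (1.0 - x)
    return g

def mean_trajectory(x0, g, generations):
    """Propagate E[X_t]; for linear g, E[g(X_t)] = g(E[X_t])."""
    means = [x0]
    for _ in range(generations):
        means.append(g(means[-1]))
    return means

u, v = 0.01, 0.02
g = make_mutation_g(u, v)
means = mean_trajectory(0.1, g, 2000)
stationary = v / (u + v)   # fixed point of g: x = (1 - u) x + v (1 - x)
print(means[-1], stationary)
```

A non-linear g, as arises with selection, would break the identity E[g(X_t)] = g(E[X_t]) used in `mean_trajectory`.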


The Beta approximation: Main idea

› The density of X_t
› Use recursive approach to calculate
  › Mean and variance
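A recursion of this kind can be sketched as follows, assuming a linear force written as g(x) = a + b x and binomial sampling of 2N alleles per generation. The law of total variance then gives Var(X_{t+1}) = m'(1 - m')/(2N) + (1 - 1/(2N)) b^2 Var(X_t), with m' = a + b E[X_t]. The beta parameters are obtained by matching mean and variance. This is a derivation under stated assumptions rather than the talk's exact formulas, and the function names are illustrative.

```python
def moment_recursion(x0, two_n, generations, a=0.0, b=1.0):
    """Iterate the mean and variance of X_t for a linear force g(x) = a + b * x."""
    mean, var = x0, 0.0
    for _ in range(generations):
        new_mean = a + b * mean
        # Binomial sampling noise plus the variance carried over through g.
        var = new_mean * (1.0 - new_mean) / two_n + (1.0 - 1.0 / two_n) * b * b * var
        mean = new_mean
    return mean, var

def beta_from_moments(mean, var):
    """Return (alpha, beta) of the beta distribution with this mean and variance."""
    common = mean * (1.0 - mean) / var - 1.0   # equals alpha + beta
    return mean * common, (1.0 - mean) * common

# Drift only: a = 0, b = 1.
mean, var = moment_recursion(x0=0.3, two_n=200, generations=50)
alpha, beta = beta_from_moments(mean, var)
print(mean, var, alpha, beta)
```

For drift only the recursion reproduces the closed form x0(1 - x0)(1 - (1 - 1/(2N))^t). Because the population size enters one generation at a time, it could be changed between iterations.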


The Beta approximation: Drift only


The Beta with spikes: Main idea

› The density of X_t
› Use recursive approach to calculate
  › Loss and fixation probabilities
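One way to read "beta with spikes" is as a mixture: a point mass at 0 (loss), a point mass at 1 (fixation), and a beta distribution on the interior. The sketch below assumes that reading and shows how the interior beta can be chosen so that the overall mean and variance are preserved. The loss and fixation probabilities are passed in as plain numbers. The values used are arbitrary placeholders, not results from the talk, and the recursion that would produce them is not reproduced here. All names are hypothetical.

```python
def conditional_moments(mean, var, p_loss, p_fix):
    """Mean and variance of X_t conditional on 0 < X_t < 1."""
    p_poly = 1.0 - p_loss - p_fix
    second = var + mean * mean                # E[X^2]
    cond_mean = (mean - p_fix) / p_poly       # the spike at 1 contributes p_fix
    cond_second = (second - p_fix) / p_poly   # and p_fix * 1^2 to E[X^2]
    return cond_mean, cond_second - cond_mean ** 2

def beta_with_spikes(mean, var, p_loss, p_fix):
    """Return spike masses and interior beta parameters matching mean and var."""
    cond_mean, cond_var = conditional_moments(mean, var, p_loss, p_fix)
    common = cond_mean * (1.0 - cond_mean) / cond_var - 1.0
    return {
        "p_loss": p_loss,
        "p_fix": p_fix,
        "alpha": cond_mean * common,
        "beta": (1.0 - cond_mean) * common,
    }

# Placeholder inputs chosen only to exercise the functions.
params = beta_with_spikes(mean=0.3, var=0.05, p_loss=0.1, p_fix=0.02)
print(params)
```

The spike at 0 contributes nothing to either moment, so only the fixation probability enters the numerators of the conditional moments.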


The Beta with spikes: loss probability


The Beta with spikes: fixation probability


The Beta with spikes: Drift only


Numerical accuracy: Drift only

[Figure: two panels, "Beta" and "Beta with spikes"]


Inference of divergence times: Drift only

› Simulated data
  › 5000 independent loci
  › 100 samples in each population
  › 50 data sets (replicates)
› Allele frequency distribution is used to calculate likelihood of data
  › Likelihood is numerically optimized
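The step from an allele frequency distribution to a likelihood can be illustrated with a deliberately simplified setup. The sketch assumes a single population evolving under pure drift from a known ancestral frequency, a known 2N, and a sample of n alleles per locus. Integrating the binomial sampling over the plain beta approximation (without spikes) gives a beta-binomial probability for each locus. A grid search stands in for the numerical optimizer. None of this is the talk's actual inference pipeline, the loci are made-up numbers, and all names are hypothetical.

```python
import math

def log_beta_fn(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_beta_binomial(k, n, alpha, beta):
    """Log-probability of k copies among n sampled alleles under Beta(alpha, beta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta_fn(k + alpha, n - k + beta) - log_beta_fn(alpha, beta)

def drift_beta_params(x0, two_n, t):
    """Beta parameters matching the pure-drift mean and variance after t generations."""
    rho_t = (1.0 - 1.0 / two_n) ** t
    common = rho_t / (1.0 - rho_t)   # alpha + beta under pure drift
    return x0 * common, (1.0 - x0) * common

def log_likelihood(t, loci, two_n):
    """Sum of per-locus log-probabilities, treating loci as independent."""
    total = 0.0
    for x0, k, n in loci:
        alpha, beta = drift_beta_params(x0, two_n, t)
        total += log_beta_binomial(k, n, alpha, beta)
    return total

def grid_search(loci, two_n, grid):
    """Return the grid value of t with the highest log-likelihood."""
    return max(grid, key=lambda t: log_likelihood(t, loci, two_n))

# Made-up loci: (ancestral frequency, observed count, sample size).
loci = [(0.5, 38, 100), (0.2, 31, 100), (0.7, 55, 100), (0.4, 47, 100)]
grid = list(range(10, 501, 10))
t_hat = grid_search(loci, two_n=1000, grid=grid)
print(t_hat, log_likelihood(t_hat, loci, 1000))
```

Under pure drift the likelihood depends on t and 2N only through (1 - 1/(2N))^t, so the two cannot be estimated separately from such data.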


Conclusions

› Beta with spikes
  › An extension built on the beta approximation
  › Improves the quality of the approximation
  › Simple mathematical formulation
  › Works under linear evolutionary forces
  › Comparable to state-of-the-art methods for inference of divergence times
  › Recursive formulation enables incorporation of variable population size


Future work

› Incorporate selection
  › Non-linear evolutionary force
  › Positive selection increases probability of fixation
  › Mean and variance are no longer available in closed form
  › Extend the approximation for loss/fixation probabilities to mean and variance