Pennell defense-talk

  • date post

    16-Jul-2015
  • Category

    Science

Transcript of Pennell defense-talk

  • Matthew Wesley Pennell, PhD Candidate - Bioinformatics and Computational Biology, Institute for Bioinformatics and Evolutionary Studies, University of Idaho

    MODELS, MEANINGS, AND MACROEVOLUTION

  • How can statistical models help us understand the drivers of long-term

    evolutionary change?

  • What we talk about when we talk about

    MACROEVOLUTION

  • We know the ingredients of evolutionary change within populations

  • Mutation

  • Selection

  • Drift

  • Gene flow

  • Mutation, Selection, Drift, Gene flow

  • But how do these work together to

    SHAPE DIVERSITY?

  • Simone Des Roches

  • Long term dynamics of evolutionary processes

    MACROEVOLUTION

  • Peter Park

  • Time

  • Daniel Berner

  • Time

  • F(time)

  • Models for continuous traits

    Brownian motion: random walk

    Ornstein-Uhlenbeck: random walk with a central tendency

    Early Burst: evolution is rapid early & slows down over time
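
The three continuous-trait models can be sketched as a single-lineage simulation. This is a minimal illustration, not the GEIGER implementation: the function name, the Euler-Maruyama discretization, and every parameter value (`sigma2`, `alpha`, `theta`, `r`) are assumptions chosen for the example.

```python
import math
import random


def simulate_lineage(model, n_steps=1000, total_time=1.0, sigma2=1.0,
                     alpha=2.0, theta=0.0, r=-3.0, z0=0.0, seed=None):
    """Simulate one trait trajectory along a single lineage.

    model is "BM", "OU", or "EB". All defaults are illustrative only.
    """
    rng = random.Random(seed)
    dt = total_time / n_steps
    z = z0
    path = [z]
    for step in range(n_steps):
        t = step * dt
        if model == "BM":      # random walk with constant rate
            drift, rate = 0.0, sigma2
        elif model == "OU":    # random walk pulled toward theta
            drift, rate = alpha * (theta - z), sigma2
        elif model == "EB":    # rate decays through time when r < 0
            drift, rate = 0.0, sigma2 * math.exp(r * t)
        else:
            raise ValueError("model must be 'BM', 'OU', or 'EB'")
        z += drift * dt + math.sqrt(rate * dt) * rng.gauss(0.0, 1.0)
        path.append(z)
    return path


if __name__ == "__main__":
    for name in ("BM", "OU", "EB"):
        print(name, round(simulate_lineage(name, seed=1)[-1], 3))
```

Setting `alpha` to 0 in the OU branch or `r` to 0 in the EB branch recovers the Brownian motion case, which is why the three are often compared as nested alternatives.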

  • Models for discrete traits

    Mk: transitions between states 0 and 1 are instantaneous & occur at some constant rate (q01, q10)

    Threshold: character states (0 or 1) are determined by a continuous liability
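
For the two-state Mk model, the probability of each transition over a branch of length t has a closed form. The sketch below assumes the standard continuous-time Markov chain with rates q01 and q10; the function name is made up for this example.

```python
import math


def mk2_transition_probs(q01, q10, t):
    """Return the 2x2 matrix P(t) for a two-state Mk model.

    Row i, column j holds Pr(state j at time t | state i at time 0).
    """
    total = q01 + q10
    if total == 0.0:
        return [[1.0, 0.0], [0.0, 1.0]]
    decay = math.exp(-total * t)
    p01 = q01 * (1.0 - decay) / total  # 0 -> 1
    p10 = q10 * (1.0 - decay) / total  # 1 -> 0
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


if __name__ == "__main__":
    for row in mk2_transition_probs(q01=0.2, q10=0.1, t=5.0):
        print([round(p, 4) for p in row])
```

As t grows, both rows converge to the stationary frequencies q10/(q01+q10) and q01/(q01+q10).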

  • GEIGER

    Pennell et al. 2014 Bioinformatics

    https://github.com/mwpennell/geiger-v2

  • What can we learn about evolution

    FROM TRAIT MODELS

  • Macroevolutionary Dynamics

    Population processes

    Statistical descriptors

  • Macroevolutionary Dynamics

    Population processes

    Statistical descriptors

    In order to make ANY interpretation of the model, we need to know if our model is actually explaining our data

  • 1. Is the model capturing the variation in the data we have observed?

    2. What about the data we haven't?

  • [Figure: four panels, each with R² = 0.67, p = 0.002]

  • Is the model

    APPROPRIATE

    and if not...

    WHAT ARE WE MISSING?

  • Linear regression models

    [Figure: Cook's distance for each observation]

  • Linear regression models

    [Figure: residuals vs. fitted values]

  • Assessing the adequacy of

    PHYLOGENETIC TRAIT MODELS

  • Establishing scope

    Univariate, quantitative traits

    Models that predict multivariate normal data

    [Figure axis: trait value]

  • Fit a model to comparative data

    Use fitted parameters to simulate data

    Compare observed to simulated data

  • Old idea in statistics

    $\Pr(D \mid \theta)$: Parametric bootstrapping

    $\Pr(\theta \mid D)$: Posterior predictive simulation

  • But new in comparative biology

    $\Pr(D \mid \theta)$: Parametric bootstrapping (Boettiger et al. 2012 Evolution)

    $\Pr(\theta \mid D)$: Posterior predictive simulation (Slater and Pennell 2014 Sys Bio)

  • If we re-ran evolution, how likely are we to see a data set like ours?

  • SIMILAR: Model is likely adequate

    DIFFERENT: Model is likely inadequate

  • How similar is similar?

    Problem: No two datasets are exactly alike

    Solution: Use test statistics to summarize data in meaningful ways

    Problem: Species are not independent data points

    Solution: Calculate test statistics on contrasts rather than the data

  • Independent contrasts

    [Figure: tree with tips A, B, C and contrasts Ci, Cj]

    n-1 contrasts for n tips

    Under Brownian motion, $C \sim \mathrm{Normal}(0, \sigma^2)$

    Felsenstein 1985 Am Nat; Felsenstein 1973 Am J Hum Gen
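
A minimal sketch of how standardized contrasts can be computed with Felsenstein's pruning recursion. The nested-tuple tree encoding, the function name, and the three-tip example are assumptions made for illustration.

```python
import math

# Assumed encoding: a tip is (name, branch_length);
# an internal node is ((left_child, right_child), branch_length).


def independent_contrasts(node, traits, contrasts):
    """Return (estimated value, adjusted branch length) for node.

    One standardized contrast is appended to `contrasts` per internal node.
    """
    content, length = node
    if isinstance(content, str):
        return traits[content], length
    left, right = content
    x1, v1 = independent_contrasts(left, traits, contrasts)
    x2, v2 = independent_contrasts(right, traits, contrasts)
    # Difference between the two descendants, scaled by its expected sd.
    contrasts.append((x1 - x2) / math.sqrt(v1 + v2))
    # Weighted average of the descendants, weights inversely
    # proportional to branch length.
    value = (x1 / v1 + x2 / v2) / (1.0 / v1 + 1.0 / v2)
    # Lengthen the stem to account for uncertainty in the estimate.
    return value, length + v1 * v2 / (v1 + v2)


if __name__ == "__main__":
    ab = ((("A", 1.0), ("B", 1.0)), 1.0)
    tree = ((ab, ("C", 2.0)), 0.0)
    contrasts = []
    independent_contrasts(tree, {"A": 1.0, "B": 3.0, "C": 5.0}, contrasts)
    print(contrasts)
```

A tree with n tips yields n-1 contrasts, one per internal node, matching the slide.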

  • For non-Brownian models

    Problem: Contrasts will no longer be normally distributed

    Solution: Use model parameters to standardize branch lengths by the expected (co)variance that will accumulate along them

    Refer to rescaled tree as a unit tree
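
The rescaling can be sketched for two of the models. The function below is hypothetical and assumes each branch is described by its start and end times measured from the root, that the Early Burst rate is sigma2 * exp(r * t), and that the expected variance along a branch is the integral of the rate over that branch. The Ornstein-Uhlenbeck case needs a more involved transformation and is left out.

```python
import math


def unit_branch_length(start, end, model, sigma2, r=0.0):
    """Expected trait variance accumulated along one branch.

    start, end: times since the root (start < end).
    model: "BM" or "EB". r is the assumed EB rate-change parameter.
    """
    if model == "BM" or r == 0.0:
        return sigma2 * (end - start)
    if model == "EB":
        # Integral of sigma2 * exp(r * t) from start to end.
        return sigma2 * (math.exp(r * end) - math.exp(r * start)) / r
    raise ValueError("model must be 'BM' or 'EB'")


if __name__ == "__main__":
    print(unit_branch_length(0.0, 1.0, "BM", sigma2=2.0))
    print(unit_branch_length(0.0, 1.0, "EB", sigma2=2.0, r=-1.0))
```

After every branch is replaced by its expected variance, data generated by the fitted model should behave like Brownian motion with a rate of 1 on the rescaled tree.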

  • Test statistics

    Slope of contrasts vs. ancestral state

    Slope of contrasts vs. expected variances

    Slope of contrasts vs. node height

    Mean of squared contrasts

    Coefficient of variation of contrasts

    KS-Test for normality of contrasts
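
A sketch of how several of these summaries might be computed from a list of contrasts. The function names are invented here, and two choices are assumptions rather than something the slide states: the coefficient of variation is taken on absolute contrasts, and the KS statistic compares the contrasts to a normal distribution with mean 0 and standard deviation equal to the root of the mean squared contrast.

```python
import math
import statistics


def normal_cdf(x, sd):
    """Cumulative distribution function of Normal(0, sd)."""
    return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx


def summarize_contrasts(contrasts):
    """Return mean squared contrast, CV of |contrasts|, and KS D statistic."""
    n = len(contrasts)
    mean_sq = sum(c * c for c in contrasts) / n
    abs_c = [abs(c) for c in contrasts]
    cv = statistics.stdev(abs_c) / statistics.mean(abs_c)
    sd = math.sqrt(mean_sq)
    d_stat = 0.0
    for i, c in enumerate(sorted(contrasts), start=1):
        f = normal_cdf(c, sd)
        d_stat = max(d_stat, i / n - f, f - (i - 1) / n)
    return {"mean_sq": mean_sq, "cv_abs": cv, "ks_d": d_stat}


if __name__ == "__main__":
    print(summarize_contrasts([0.3, -1.2, 0.8, -0.1, 1.5]))
```

The three slope statistics would call `slope` with absolute contrasts as `ys` and the ancestral state, expected variance, or node height at each node as `xs`.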

  • Simulating datasets for comparison

    Simulate a lot of new datasets on unit tree

    Use a BM model with a rate of 1

    Calculate test statistics on simulated dataset
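
The simulation and comparison steps can be sketched as follows. The tree encoding (tips as `(name, length)`, internal nodes as `((left, right), length)`), the function names, and the two-tailed p-value convention are assumptions for illustration and may differ from what ARBUTUS does.

```python
import math
import random


def simulate_bm_tips(node, rng, parent_value=0.0, tips=None):
    """Simulate rate-1 Brownian motion down a tree; return {tip: value}."""
    tips = {} if tips is None else tips
    content, length = node
    # With a rate of 1, the change along a branch has variance = its length.
    value = parent_value + math.sqrt(length) * rng.gauss(0.0, 1.0)
    if isinstance(content, str):
        tips[content] = value
    else:
        for child in content:
            simulate_bm_tips(child, rng, value, tips)
    return tips


def two_tailed_p(observed, simulated):
    """Two-tailed p-value of an observed statistic against simulated ones."""
    n = len(simulated)
    lower = sum(s <= observed for s in simulated) / n
    upper = sum(s >= observed for s in simulated) / n
    return min(1.0, 2.0 * min(lower, upper))


if __name__ == "__main__":
    rng = random.Random(42)
    ab = ((("A", 1.0), ("B", 1.0)), 1.0)
    unit_tree = ((ab, ("C", 2.0)), 0.0)
    datasets = [simulate_bm_tips(unit_tree, rng) for _ in range(1000)]
    # Toy statistic: range of tip values in each simulated dataset.
    sim_stats = [max(d.values()) - min(d.values()) for d in datasets]
    print(two_tailed_p(4.0, sim_stats))
```

A small p-value for any test statistic flags an aspect of the data that the fitted model fails to reproduce.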

  • Putting it all together...

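  • [Figure: workflow in six steps: (1) fit model to tree and data, (2) build unit tree, (3) calculate test statistics on observed data, (4) simulate m datasets under BM on the unit tree, (5) calculate test statistics on each simulated dataset, (6) compare]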

  • ARBUTUS

    Pennell et al. 2015 Am Nat

    https://github.com/mwpennell/arbutus

  • Cornwell et al. 2014 J Ecology

    [Figure: phylogeny of vascular plant orders with selected clades labelled (Arec., Ast., Ast2., Bras., Cary., Eric., Fab., Gymn., Magn., Mono., Myrt., Prot., Rosid.) and five traits: Leaf N, SLA, Max height, Leaf size, Seed mass]

  • Specific Leaf Area: 72 datasets (20 - 2,200 species); Kleyer et al. 2008 J Ecology

    Seed mass: 226 datasets (20 - 22,817 species); Kew Seed Information Database 2014

    Leaf Nitrogen Content: 39 datasets (20 - 936 species); Wright et al. 2004 Nature

    Zanne et al. 2014 Nature

  • Empirical analyses

    1. Fit Brownian motion, Ornstein-Uhlenbeck and Early Burst to each dataset

    2. Calculate relative support using AIC

    3. Assess adequacy of best-fitting model
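
Step 2 can be sketched with the standard definitions of AIC and Akaike weights. The log-likelihoods and parameter counts below are hypothetical placeholders, not values from the analyses.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model; weights sum to 1."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [x / total for x in rel]


if __name__ == "__main__":
    # Hypothetical fits: (log-likelihood, number of parameters).
    fits = {"BM": (-120.4, 2), "OU": (-117.9, 3), "EB": (-120.4, 3)}
    scores = [aic(lnl, k) for lnl, k in fits.values()]
    for name, w in zip(fits, akaike_weights(scores)):
        print(name, round(w, 3))
```

A high AIC weight only says a model is the best of those compared, which is why step 3 checks adequacy separately.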

  • [Figure: AIC weight of each model (Brownian motion, Ornstein-Uhlenbeck, Early burst) for each dataset (1 - 337)]

    Pennell et al. 2015 Am Nat

  • Specific Leaf Area: Model deviations detected in 32/72 datasets

    Seed mass: Model deviations detected in 153/226 datasets

    Leaf Nitrogen Content: Model deviations detected in 19/39 datasets

  • Simple, commonly used models are often (woefully) inadequate

  • But we already knew that...

  • We are (often) here

    This is how we learn about biology

  • Learn about our data

    Phylogenetic error (topology and branch lengths)

    Measurement error

    Biologically interesting outlier species

  • Learn about evolutionary processes

    Time heterogeneous models

    Different models for different parts of the tree

    Biologically motivated models

  • Understanding how and why a model fails can provide new biological insights

  • 1. Is the model capturing the variation in the data we have observed?

    2. What about the data we haven't?

  • True diversity

    Sampled diversity

  • If missing data is non-random, model parameters will be biased

  • HOW MANY SPECIES ARE WOODY

    Willem van Aken

  • True diversity?

    Known species: 316,000

    Trait data: 49,000

    Genetic data: 55,000

  • Sampling bias is

    EVERYWHERE

  • Hinchliff and Smith 2014 PLoS ONE

  • Sampling bias in...

    The groups we choose to study (Pennell, Sarver, and Harmon 2012 PLoS ONE)

    And the traits we choose to measure (Uyeda, Caetano, and Pennell 2015 Sys Bio)

  • MISSING DATA HAS STRUCTURE

    100% HERBACEOUS

    100% WOODY

    Gnangarra Willem van Aken

  • Microcoelia (Orchid family)

    [Figure: 30 species, 12 recorded as herbaceous (H) and 18 unknown (?); axis from W to H with ticks at 0, 12, 18, 30]

  • Strong Prior: Sampling with replacement (Binomial)

    Pr(All are H) = 1

    Weak Prior: Sampling without replacement (Hypergeometric)

    Pr(All are H) = 0.42

    Pr(At least 15 are H) = 0.90
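
The numbers on this slide can be reproduced for a genus with 30 species, 12 of them scored and all 12 herbaceous. The sketch assumes that the strong prior fixes the probability of being herbaceous at the observed fraction, and that the weak prior puts a flat prior on the number of herbaceous species in the genus and uses the hypergeometric likelihood. The flat prior and the function names are assumptions of this example.

```python
from math import comb


def strong_prior_all_h(n_obs_h, n_obs, n_unknown):
    """Binomial: every unknown species is H with the observed frequency."""
    return (n_obs_h / n_obs) ** n_unknown


def weak_prior_posterior(n_obs_h, n_obs, n_total):
    """Posterior over K, the number of H species among n_total.

    Hypergeometric likelihood with a flat prior on K (assumed).
    """
    weights = [comb(k, n_obs_h) * comb(n_total - k, n_obs - n_obs_h)
               for k in range(n_total + 1)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    posterior = weak_prior_posterior(n_obs_h=12, n_obs=12, n_total=30)
    print(strong_prior_all_h(12, 12, 18))   # strong prior: all 18 unknown are H
    print(round(posterior[30], 2))          # weak prior: all 18 unknown are H
    print(round(sum(posterior[27:]), 2))    # weak prior: at least 15 of 18 are H
```

Under these assumptions the weak prior gives 13/31 (about 0.42) for "all herbaceous" and about 0.90 for "at least 15 herbaceous", matching the slide.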

  • Distribution of woodiness bimodal

    791 genera with records for >10 species

    [Figure: genus counts 411, 271, and 58 shown against woody (W) and herbaceous (H) categories]

  • [Figure: probability density of the percentage of woody species per genus (0 - 100); Strong prior vs. Weak prior]

  • Global proportion of woody species

    [Figure: probability density of the global percentage of woody species (axis ticks 44, 46, 48); Strong prior vs. Weak prior]

    Taking the dataset at face value: 59% woody

  • [Figure: woody vs. herbaceous species in Monilophytes, Gymnosperms, Basal Angiosperms, Monocots, Eudicots]

    FitzJohn, Pennell, et al. 2014 J Ecology

  • Can use estimated sampling proportions in model-based analyses

  • So we have a good model and have incorporated sampling error...

    WHAT CAN WE SAY?

  • Macroevolutionary Dynamics

    Population processes

    Statistical descriptors

  • Strict pop gen interpretation

    Brownian Motion: $\Delta z = \sigma\, dW$

    Mutation-Drift Equilibrium: $\sigma^2 = 2V_M$

    Hansen and Martins 1996 Evolution

  • Quantitative genetics interpretation

    Brownian Motion: $\Delta z = \sigma\, dW$

    Mutation-Drift Equilibrium: $\sigma^2 = 2V_M$

    $\Delta z$: change in trait mean; $\sigma^2$: rate; $dW$: diffusion process; $V_M$: mutational variance

    Hansen and Martins 1996 Evolution

    Lynch and Hill 1986 Evolution
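
One way to see where the link between the Brownian rate and the mutational variance comes from is the short derivation below. It is a sketch under the usual neutral assumptions; the additive genetic variance $V_A$ and effective population size $N_e$ are symbols introduced here and do not appear on the slide.

```latex
% Assumed symbols (not on the slide): V_A = additive genetic variance,
% N_e = effective population size.
\begin{align}
\operatorname{Var}(\Delta z) &= \frac{V_A}{N_e}
  && \text{(drift of the trait mean, per generation)} \\
\hat{V}_A &= 2 N_e V_M
  && \text{(mutation-drift equilibrium)} \\
\sigma^2 &= \frac{\hat{V}_A}{N_e} = 2 V_M
  && \text{(Brownian rate, independent of } N_e\text{)}
\end{align}
```

Under this reading the Brownian rate depends only on the input of mutational variance, which is what makes it comparable to population-level estimates.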

  • By fitting alternative models we can evaluate the effects of microevolutionary processes over long time scales

  • But such intuitive interpretations are

    LIKELY NAÏVE

  • Micro to Macro: Use population estimates to predict divergence over long time scales

    Macro to Micro: Use phylogenetic models to estimate population level parameters

  • Micro to Macro: Use population estimates to predict divergence over long time scales (Hansen 2012 Book Chapter; Estes and Arnold 2007 Am Nat)

    Macro to Micro: Use phylogenetic models to estimate population level parameters (Lynch 1990 Am Nat)

    THE NUMBERS DON'T ADD UP!

  • Macroevolutionary Dynamics

    Population processes

    Statistical descriptors

  • Macroevolutionary models may reflect dynamics of adaptive landscapes rather than evolution along an adaptive landscape

    Pennell et al. 2014 TREE; Pennell and Harmon 2013 NYAS; Pennell 2015 Sys Bio

  • Simpson 1944 Tempo and Mode

    Dynamics of adaptive landscapes

  • Adaptive radiation

    Adaptive zones

    Red Queen (Van Valen)

    Escape and radiate

    Punctuated equilibrium

    Diversity dependence

    Key innovations

    Ephemeral divergence

  • Punctuated equilibrium

    Eldredge and Gould 1972; Gould and Eldredge 1977 Paleobiology

    [Figure: morphology vs. time]

  • What about punctuated equilibrium?

    Eldredge and Gould 1972; Gould and Eldredge 1977 Paleobiology

    [Figure: morphology vs. time]

    This is confusing (to everyone)

  • Is evolution gradual or pulsed?

    Is trait evolution (mainly) associated with speciation?

    Is evolution during speciation adaptive or neutral?

    Does species selection drive evolutionary trends?

    Each can be tested with a specific macroevolutionary model

  • Macroevolutionary Dynamics

    Population processes

    Statistical descriptors

  • Nothing.

  • Models purely phenomenological

    Capture patterns not processes

  • A case study:

    EVOLUTION OF KARYOTYPES

  • To a geneticist many of the comparisons (i.e., between karyotypes of different species) will seem of little significance, because to him [sic] it is not the shapes and sizes of chromosomes which are important, but the genes contained in them.

    T. H. Morgan et al. 1925

  • Physical linkage keeps genes together

    Genetic material lost/gained when mutations change chromosome number/form

    Structural changes may be involved in adaptation and speciation

  • New chromosomes arise from

    Duplications (including polyploidy)

    Fissions - chromosome breaks into two

    Fusions - two chromosomes come together

  • XX XY

  • ZZ ZW

  • Sex chromosomes are natural

    EVOLUTION EXPERIMENTS

  • Hemizygous: Y (Males), W (Females)

    Homozygous: Z (Males), X (Females)

  • [Figure: a Y-A fusion turns an XY system into X1X2Y; an X-A fusion turns it into XY1Y2]

    Pennell et al. in press PLoS Genetics

  • FISHES

    SQUAMATE REPTILES

  • [Table: Y-A fusions and X-A fusions against total XY systems; W-A fusions and Z-A fusions against total ZW systems. Extracted values: 109, 423, 120400, 3802, 24024]

    Pennell et al. in press PLoS Genetics

    Data from Tree of Sex Consortium 2014

    Both highly significant (Fisher's exact test)
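
A two-sided Fisher's exact test on a 2x2 table can be sketched in a few lines. The counts in the usage example are hypothetical placeholders chosen only to show the call; they are not the counts from the slide.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Unnormalized hypergeometric weights, kept as exact integers.
    weights = {x: comb(row1, x) * comb(row2, col1 - x)
               for x in range(lo, hi + 1)}
    observed = weights[a]
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / comb(row1 + row2, col1)


if __name__ == "__main__":
    # Hypothetical counts: fused vs. unfused systems for two chromosome types.
    hypothetical_table = (12, 88, 3, 97)
    print(fisher_exact_two_sided(*hypothetical_table))
```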

  • [Figure: phylogeny of fish genera (Xiphophorus through Acipenser) with sex chromosome systems marked; legend: XY, ZW, YA/WA, XA/ZA]

    Pennell et al. in press PLoS Genetics

  • [Figure: phylogeny of squamate reptile genera (Enhydrina through Dibamus) with sex chromosome systems marked; legend: XY, ZW, YA/WA, XA/ZA]

    Pennell et al. in press PLoS Genetics

  • [Figure: model with four states: XY, XY Fused, ZW, ZW Fused]

  • [Figure: the XY Fused state shown alongside the nucleotide states C, A, T, G]

  • [Figure: probability density of the difference in fusion rate between Y and other sex chromosomes]

    Pennell et al. in press PLoS Genetics

  • Y chromosomes fuse with autosomes more frequently than X, W, or Z

  • How can this help us understand

    EVOLUTIONARY PROCESSES?

  • Y, W, Z fusions > X fusions?

    All else equal

    Male-biased mutation

    Bateman gradient

  • Neutral case: consistent with an excess of Y-A fusions?

    All else equal: NO

    Male-biased mutation: NO

    Bateman gradient: NO

  • Direct fitness effects

    [Figure: fusion of an autosome (A) with the X or the Y chromosome]

    Causes expression changes near breakpoints

  • Direct fitness effects (deleterious): consistent with an excess of Y-A fusions?

    All else equal: NO

    Male-biased mutation: YES

    Bateman gradient: YES

  • Sexually antagonistic selection

    [Figure: fitness of an allele at autosomal locus A in males vs. females]

    If A fuses with Y, allele will only be found in males (assuming no recombination between X and Y)

  • Sexually antagonistic selection: consistent with an excess of Y-A fusions?

    All else equal: NO

    Male-biased mutation: YES

    Bateman gradient: NO

  • Most scenarios inconsistent with excess of

    Y-A fusions

  • Fusions deleterious + male-biased mutation

    Fusions deleterious + Bateman gradient

    Fusions driven by sexually antagonistic selection + male-biased mutation

    (requires very high male-biased mutation rate)

  • Phylogenetic models used not mechanistic

    But model fits can ground truth theoretical analyses

  • Macroevolutionary Dynamics

    Population processes

    Statistical descriptors

  • Macroevolutionary Dynamics

    Population processes

    Statistical descriptors

    Software for fitting models

    Assessing model adequacy

    Incorporating sampling bias

    Punctuated equilibrium

    Chromosome fusions

  • Because I can't eat phylogenies

    Institute for Bioinformatics & Evolutionary Studies

  • Luke Harmon

  • Jack Sullivan, Scott Nuismer, Paul Joyce, Arne Mooers

  • David Tank, Larry Forney

  • Rich FitzJohn, Josef Uyeda, Jon Eastman, David Bapst

  • Michael Alfaro, Steve Arnold, Frank Burbrink, Will Cornwell, Bernie Crespi, Joe Felsenstein, David Green, Paul Harnik

    Mark Kirkpatrick, Craig Miller, Brian O'Meara, Erica Bree Rosenblum, Carl Simpson, Graham Slater, David Swofford, Amy Zanne

  • Joseph Brown, Daniel Caetano, Simone Des Roches, Travis Hagey, Kayla Hardwick, Denim Jochimsen

    Suzanne Joneson, Rafael Maia, Eliot Miller, Tom Poorten, James Rosindell, Jamie Voyles

  • Simon Uribe-Convers, Tyler Hether, Brice Sarver

  • All y'all

    Institute for Bioinformatics & Evolutionary Studies

  • Lisha Abendroth, Eva Top

  • Roxana Hickey

  • My family