So many different kinds of mistakes

So many different kinds of mistakes Or why systematic error is the 21st century’s sampling error Liliana M. Dávalos Assistant Professor, Department of Ecology & Evolution SUNY, Stony Brook Grand Valley State University 10 April 2014

Transcript of So many different kinds of mistakes

Page 1: So many different kinds of mistakes

So many different kinds of mistakes

Or why systematic error is the 21st century’s sampling error
Liliana M. Dávalos
Assistant Professor, Department of Ecology & Evolution, SUNY, Stony Brook
Grand Valley State University, 10 April 2014

Page 2: So many different kinds of mistakes

My lab’s research mission

Biological diversity
Diversification
Human impact

Page 3: So many different kinds of mistakes

Two kinds of questions

Biological diversity

• Diversification, speciation: increase
• Habitat loss: decrease

Page 4: So many different kinds of mistakes

So many kinds of mistakes

• Sampling error vs. systematic error
• In phylogenetics
  • How phenotypes evolve
• In environmental change
  • Why are we losing forests?

Page 5: So many different kinds of mistakes

So many kinds of mistakes

• Sampling error vs. systematic error
• In phylogenetics
  • How phenotypes evolve
• In environmental change
  • Why are we losing forests?

Page 6: So many different kinds of mistakes

Thinking about errors

• Let’s say we want to answer a question:
  • In a finite population, what is the frequency of an allele?

Sampling vs. systematic

Page 7: So many different kinds of mistakes

How to answer this question

• We go out, get samples, genotype different individuals

• Then we count the alleles

• What is the main source of error?

Sampling vs. systematic

Page 8: So many different kinds of mistakes

This is sampling error

• We want to get a better estimate of the allele frequency
  • => Sample more
• We could sample the entire population
  • => Best possible estimate of allele frequency

Sampling vs. systematic
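A minimal sketch of sampling error in the allele-frequency example. The population size (10,000 haploid individuals), the true frequency (0.30), and the function name are hypothetical choices made for illustration. The estimate tends to approach the true value as the sample grows, and sampling the whole population recovers it exactly.

```python
import random


def sample_allele_frequency(population, n, rng):
    """Estimate the frequency of allele 'A' from n individuals drawn without replacement."""
    sample = rng.sample(population, n)
    return sum(1 for allele in sample if allele == "A") / n


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical finite population: 3,000 copies of 'A' and 7,000 copies of 'a'.
population = ["A"] * 3000 + ["a"] * 7000
true_freq = population.count("A") / len(population)

for n in (10, 100, 1000, len(population)):
    estimate = sample_allele_frequency(population, n, rng)
    print(f"n={n:>6}  estimate={estimate:.4f}  error={abs(estimate - true_freq):.4f}")
```

The last row samples every individual, so its error is zero: with sampling error, more data is always the cure.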

Page 9: So many different kinds of mistakes

Now let’s ask a different question

• We want to find out how these 3000 microbial lineages relate to one another

• We get their genomes, map out each of the single-copy genes, estimate a phylogeny

Lang, Darling, Eisen 2013 PLoS One

Sampling vs. systematic

Page 10: So many different kinds of mistakes

But our results don’t make sense

• Is it sampling error?
  • Can we sample more than the whole genome?
• We discover the model of gene evolution we are using was wrong
  • What kind of error is this?

Lang, Darling, Eisen 2013 PLoS One

Sampling vs. systematic

Page 11: So many different kinds of mistakes

This is systematic error

• Even sampling whole genomes won’t fix the problem
  • Having more data can make the problem worse!

• As long as we don’t change the model, we will keep obtaining the wrong answer

Lang, Darling, Eisen 2013 PLoS One

Sampling vs. systematic
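A minimal sketch of how more data can make systematic error worse. It reuses the allele-frequency setting rather than a phylogenetic model, and the scenario is an assumption made for illustration: a hypothetical genotyping assay miscalls allele 'A' 20% of the time, while the estimator assumes the assay is perfect. As the sample grows, the estimate converges on the wrong value and the confidence interval tightens around it.

```python
import math
import random


def biased_estimate(n, true_freq, dropout, rng):
    """Naive frequency estimate when each 'A' is miscalled as 'a' with probability `dropout`.

    Returns the estimate and the half-width of a normal-approximation 95% interval.
    """
    observed_a = 0
    for _ in range(n):
        carries_a = rng.random() < true_freq
        if carries_a and rng.random() >= dropout:
            observed_a += 1
    p_hat = observed_a / n
    half_width = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, half_width


rng = random.Random(1)
TRUE_FREQ, DROPOUT = 0.30, 0.20  # hypothetical values

for n in (100, 10_000, 100_000):
    p_hat, hw = biased_estimate(n, TRUE_FREQ, DROPOUT, rng)
    covers_truth = p_hat - hw <= TRUE_FREQ <= p_hat + hw
    print(f"n={n:>7}  estimate={p_hat:.4f} +/- {hw:.4f}  interval covers truth: {covers_truth}")
```

Under these assumptions the estimator targets 0.30 x (1 - 0.20) = 0.24, so a larger sample only yields more confidence in the wrong answer. The fix is to change the model, not the sample size.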

Page 12: So many different kinds of mistakes

So many kinds of mistakes

• Sampling error vs. systematic error
• In phylogenetics
  • How phenotypes evolve
• In environmental change
  • Why are we losing forests?

Page 13: So many different kinds of mistakes

[Figure: phylogeny of bacterial genomes (Mycobacterium, Nocardia, Rhodococcus, Bifidobacterium, Corynebacterium, Streptomyces, Frankia, and relatives) with node support values; scale bar 0.1 substitutions/site. Labeled clades: pathogenic Mycobacterium complex (avium-bovis-tuberculosis); non-pathogenic Mycobacterium smegmatis complex]

Phylogenetics

• Testing relatedness
• All of comparative biology
• Historical biogeography
• Evolutionary aspects of community ecology
• Diagnostics and similar applications

Corthals...Dávalos 2012 PLoS One

How phenotypes evolve

Page 14: So many different kinds of mistakes

Dated trees more important than ever

• Dated trees need fossils

• Why use dated trees?
  • Trait evolution
  • History of assemblages in time and space
  • Key innovations

Dumont, Dávalos et al. 2012 P R Soc B

How phenotypes evolve

Page 15: So many different kinds of mistakes

Fossils without genomes

• We use morphological characters
• How good are the models of evolution for morphological characters?
  • Characteristics of the data
  • Compare to models of molecular evolution

Dávalos & Russell 2012 Ecol Evol

How phenotypes evolve

Page 16: So many different kinds of mistakes

These are morphological characters

[Figure: species-by-character matrix]

• They look like this ->
  • Discontinuous between species
  • Factors, not numbers
  • Difficult to model

How phenotypes evolve

Page 17: So many different kinds of mistakes

The organisms in question

New World Leaf-nosed bats and relatives

How phenotypes evolve

Page 18: So many different kinds of mistakes

[Figure: labels not recoverable from the transcript]

Baker et al. 2003 Occas Pap Mus TTU Dávalos, Cirranello et al. 2012 Biol Rev

Wetterer et al. 2000 B Am Mus Nat Hist

How phenotypes evolve

Page 19: So many different kinds of mistakes

The trouble with morphological characters

• At first, the only model was parsimony
• Neutral Jukes-Cantor (1969) model implemented in 2001
  • Current model has gamma variation across characters
• Applying this model does not solve the conflict

Dávalos, Cirranello et al. 2012 Biol Rev

How phenotypes evolve
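A minimal sketch of the transition probabilities in a neutral, equal-rates model with k character states. It assumes the 2001 implementation referred to above is the k-state generalization of Jukes-Cantor (often called the Mk model); k = 4 gives Jukes-Cantor itself. The function name and the example values are illustrative.

```python
import math


def mk_transition_probs(k, v):
    """Return (P_same, P_diff) for a k-state equal-rates model.

    v is the branch length in expected changes per character.
    P_same: probability of ending in the starting state.
    P_diff: probability of ending in one particular different state.
    """
    decay = math.exp(-k * v / (k - 1))
    p_same = 1 / k + (k - 1) / k * decay
    p_diff = 1 / k - decay / k
    return p_same, p_diff


for k in (2, 4):
    for v in (0.01, 0.1, 1.0):
        p_same, p_diff = mk_transition_probs(k, v)
        print(f"k={k}  v={v:<4}  P_same={p_same:.4f}  P_diff={p_diff:.4f}")
```

Every state and every character is treated identically: changes are neutral, and characters evolve independently of one another. Those are the two assumptions the following slides question.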

Page 20: So many different kinds of mistakes

If the Jukes-Cantor model yields conflicting answers, could the model be inadequate given these data?

Page 21: So many different kinds of mistakes

Homoplasy I: inconsistency

[Figure: four-taxon trees with branches labeled p and q; regions labeled "consistent" and "not consistent"]

Felsenstein 1978 Syst Biol

How phenotypes evolve
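A minimal sketch of the inconsistency result for a four-taxon tree ((A,B),(C,D)). It assumes a symmetric two-state model, with change probability p on the branches leading to A and C and q on the other three branches; the function names and the example values of p and q are illustrative. With unlimited data, parsimony picks the grouping supported by the most frequent informative site pattern, so comparing exact pattern probabilities shows which tree it converges on.

```python
from itertools import product


def edge(u, v, change):
    """Probability of observing state v at one end of a branch given state u at the other."""
    return change if u != v else 1.0 - change


def pattern_prob(tips, p, q):
    """Probability of tip states (a, b, c, d) on the true tree ((A,B),(C,D)).

    Internal nodes x (joining A, B) and y (joining C, D) are summed over.
    Branches to A and C change with probability p; all others with q.
    """
    a, b, c, d = tips
    total = 0.0
    for x, y in product((0, 1), repeat=2):
        total += (0.5 * edge(x, a, p) * edge(x, b, q) * edge(x, y, q)
                  * edge(y, c, p) * edge(y, d, q))
    return total


def informative_pattern_freqs(p, q):
    """Frequencies of the three parsimony-informative splits.

    The factor 2 counts both ways of assigning states 0 and 1 to each split.
    """
    return {
        "AB|CD": 2 * pattern_prob((0, 0, 1, 1), p, q),
        "AC|BD": 2 * pattern_prob((0, 1, 0, 1), p, q),
        "AD|BC": 2 * pattern_prob((0, 1, 1, 0), p, q),
    }


def parsimony_choice(p, q):
    """Grouping that parsimony converges on as the number of characters grows."""
    freqs = informative_pattern_freqs(p, q)
    return max(freqs, key=freqs.get)


for p, q in ((0.05, 0.05), (0.40, 0.05)):
    print(p, q, parsimony_choice(p, q), informative_pattern_freqs(p, q))
```

When all branches are short, the true split AB|CD dominates. When the two long branches belong to unrelated taxa A and C, parallel changes on them outnumber the true signal and AC|BD dominates, so adding characters only strengthens support for the wrong tree.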

Page 22: So many different kinds of mistakes

[Figure 12, panels A and B: labels not recoverable from the transcript]

Homoplasy II: ecological convergence

• Can bring together unrelated, ecologically similar lineages
  • This example: mt cytochrome b gene of nectar-feeding bats
• Association between adaptive molecular evolution and support for the wrong node

Dávalos, Cirranello et al. 2012 Biol Rev

How phenotypes evolve

Page 23: So many different kinds of mistakes

Homoplasy III: correlated evolution

• Expected in protein-coding genes

• Models in use for codons, amino acids, ribosomal RNA secondary structure

Dávalos & Perkins 2008 Genomics

How phenotypes evolve

Page 24: So many different kinds of mistakes

Might these affect morphological characters?

Reviewer 1:

I don't see the point. If the characters are good characters (meaning that they have some phylogenetic signal at some level), then there is nothing especially wrong with the fact that they are weighted a little more than other characters.

How phenotypes evolve

Page 25: So many different kinds of mistakes

Dávalos, Cirranello et al. 2012 Biol Rev

Inconsistency!

How phenotypes evolve

Page 26: So many different kinds of mistakes

[Figure 12, panels A and B: labels not recoverable from the transcript]

Dávalos et al. In Press Syst Biol
Dávalos, Cirranello et al. 2012 Biol Rev

Convergent evolution!

How phenotypes evolve

Page 27: So many different kinds of mistakes

Correlated evolution!

How phenotypes evolve

[Figure axis: dissimilarity between characters ->]

Page 28: So many different kinds of mistakes

Models incur systematic error

• Morphology = phenotype
• Neutrality and independence are the wrong assumptions for models
  • Not neutral
  • Not independent

Skelly et al. 2013 Genome Res

How phenotypes evolve

Page 29: So many different kinds of mistakes

How does morphology evolve?

• Ordering: each character state gives rise to a finite range of states

• There are limits to states because of
  • Development
  • Natural selection

Dávalos, Cirranello et al. 2012 Biol Rev

How phenotypes evolve

Page 30: So many different kinds of mistakes

Modeling selection in morphology

• Brownian motion vs. Ornstein-Uhlenbeck models

• Continuous phenotypic traits

• Might selection explain homoplasy in morphological data?

How phenotypes evolve

Butler & King 2004 Am Nat
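A minimal sketch contrasting the two models for a continuous trait. Brownian motion lets the trait wander without constraint; Ornstein-Uhlenbeck adds a pull of strength alpha toward an optimum theta, which is how selection enters. The simulation runs along a single lineage rather than a phylogeny, and the parameter values and function names are hypothetical.

```python
import math
import random


def simulate_trait(x0, steps, dt, sigma, alpha=0.0, theta=0.0, rng=None):
    """Euler-Maruyama simulation of dX = alpha * (theta - X) dt + sigma dW.

    alpha = 0 gives Brownian motion; alpha > 0 gives Ornstein-Uhlenbeck.
    Returns the trait value after `steps` time steps of length dt.
    """
    if rng is None:
        rng = random.Random()
    x = x0
    for _ in range(steps):
        x += alpha * (theta - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


rng = random.Random(0)
# 200 independent lineages per model, each evolving for 10 time units.
bm = [simulate_trait(0.0, 1000, 0.01, 1.0, rng=rng) for _ in range(200)]
ou = [simulate_trait(0.0, 1000, 0.01, 1.0, alpha=2.0, theta=1.0, rng=rng) for _ in range(200)]

print(f"BM: mean={sum(bm) / len(bm):.2f}  variance={variance(bm):.2f}")
print(f"OU: mean={sum(ou) / len(ou):.2f}  variance={variance(ou):.2f}")
```

Under Brownian motion the variance among lineages keeps growing with time, while under Ornstein-Uhlenbeck lineages cluster near the optimum. Independent lineages drawn to the same optimum end up similar, which is one way selection could produce homoplasy.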

Page 31: So many different kinds of mistakes

[Figure 5, panels A-D: dietary regimes for four OU models, mapped onto a phylogeny whose tips are bat genera (Ardops through Brachyphylla)]
• OU2a: nectarivorous, other
• OU2b: frugivorous (figs), other
• OU3: frugivorous (figs), nectarivorous, other
• OU4: frugivorous (figs), nectarivorous, strictly frugivorous (figs, Short-faced bats), other

Dumont ... Dávalos 2014 Evolution

Engineering model of performance

How phenotypes evolve

Page 32: So many different kinds of mistakes

Three performance peaks

[Figure: histogram of mechanical advantage (MA, x-axis 0.0 to 1.2) against frequency (count, 0 to 500), colored by diet: figs, figs only, nectar, other]

• Performance related to diet
  • Low mechanical advantage in nectar-feeding bats
  • Convergence on this phenotype
• Analyzing function and integrating selection is better than ignoring them

How phenotypes evolve

Dumont ... Dávalos 2014 Evolution

Page 33: So many different kinds of mistakes

Model complexity

[Figure: models arranged by complexity, labeled "Neutral genotype", "Codons", "Amino acids", and "Morphology..."]

How phenotypes evolve

Page 34: So many different kinds of mistakes

The trouble with systematic error

• In sampling error mode
  • More is more
  • More characters
  • = thousands of correlated phenotypes
• This will fail: we have systematic error
  • Improve model
  • Improve data
  • Reduce data

Page 35: So many different kinds of mistakes

So many kinds of mistakes

• Sampling error vs. systematic error
• In phylogenetics
  • How phenotypes evolve
• In environmental change
  • Why are we losing forests?

Page 36: So many different kinds of mistakes

My lab’s research mission

Biological diversity
Diversification
Human impact

Page 37: So many different kinds of mistakes

Why do rainforests decline? Three hypotheses

• Hamburger (or steak): Kaimowitz et al. 2004 CIFOR
• Coca: Dávalos et al. 2011 Environ Sci Technol
• Land tenure and property: Hecht 1993 BioScience

Why lose forests?

Page 38: So many different kinds of mistakes

Predictions

• Hamburger (or steak): Kaimowitz et al. 2004 CIFOR
  • + demand beef -> + beef, + cattle -> + cattle, + pasture -> + pasture, - forest
• Coca: Dávalos et al. 2011 Environ Sci Technol
  • + demand cocaine -> + cocaine, + coca -> + coca, - forest
• Land tenure and property: Hecht 1993 BioScience
  • + demand land -> + pasture, + cattle -> + cattle, - forest

Why lose forests?

Page 39: So many different kinds of mistakes

The real drivers of habitat loss

[Diagram: boxes labeled "Forest", "Coca", "Eradication", "Urbanization & Development", "Pasture & Cows", and "Property", linked by the labels "nothing", "decrease", "becomes", and "is"]

Dávalos et al. 2014 Biol Cons

Why lose forests?

Page 40: So many different kinds of mistakes

These systematic errors are scary

• Models inform policy
  • Real decisions are made based on these inadequate models
• Models influence what data we collect
  • If we focus on cattle and the problem is palm, we are missing the real story

Page 41: So many different kinds of mistakes

Shifting to the present

• 20th century challenge
  • Collecting enough data
  • i.e., sampling
• Still relevant in many cases
• New challenges
  • Formulating models
  • “Big” data
  • Correlated data
  • Otherwise biased data

Fjeldsa et al. 2005 Ambio

Page 42: So many different kinds of mistakes

• Funding
  • NSF–DEB, CIDER–SBU
• Speciation & diversification: A. Cirranello, A. Russell, N. Simmons, P. Velazco
• Functional evolution: E. Dumont, S. Rossiter, E. Teeling
• Conservation & policy: D. Armenteras, A. Bejarano, A. Corthals, L. Correa, J. Holmes, N. Rodriguez, C. Romero
• Dávalos Lab
  • Phylogenetics: R. Dahan, S. DelSerra, A. Goldberg, O. Warsi, L. Yohe, X. Zhang
  • Land use: P. Connell, M. Hall, E. Simola, G. Tudda, Y. Shah

Thanks!