Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular...

21
Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology

Transcript of Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular...

Page 1: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Coalescence and the Cenancestor

J. Peter GogartenUniversity of Connecticut

Department of Molecular and Cell Biology

Page 2: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Acknowledgements

NSF Microbial GeneticsNASA Exobiology Program

Re. Coalescence:Andrew Martin (U of C Boulder)

Joe Felsenstein (U of Wash)Hyman Hartman (MIT)

Yuri Wolf (NCBI)

Gogarten Lab:Olga Zhaxybayeva

Page 3: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Cenancestor – (from the Greek “kainos” meaning recent and

“koinos” meaning common) – the most recent common

ancestor of all the organisms that are alive today. The term

was proposed by Fitch in 1987.

All life forms on Earth share common ancestry

Where is the root of the tree of life?

• On the branch leading to Bacteria (Gogarten, J.P. et al.,1989; Iwabe, N. et al., 1989)

• On the branch leading to Archaea (Woese, C.R.,1987)

• On the branch leading to Eukaryotes (Forterre, P. and Philippe, H., 1999; Lopez, P. et al., 1999)

• Under aboriginal trifurcation (Woese, C.R., 1978)

• Inconclusive results (Caetano-Anolles, G., 2002)

Page 4: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Detection of Ancient Gene Duplications using Whole Genome Data

SELECTION OF GENOMES12 representative completed genomes

SELECTION OF GENOMES12 representative completed genomes

RANKING OF PUTATIVE ANCIENT PARALOGOUS PAIRSElimination of duplicates; ranking by number of occurrences

RANKING OF PUTATIVE ANCIENT PARALOGOUS PAIRSElimination of duplicates; ranking by number of occurrences

ANALYSIS OF PUTATIVE ANCIENT PARALOGOUS PAIRSAdd homologous sequences, perform alignment and reconstruct phylogenetic trees

ANALYSIS OF PUTATIVE ANCIENT PARALOGOUS PAIRSAdd homologous sequences, perform alignment and reconstruct phylogenetic trees

DETECTION OF PUTATIVE ANCIENT PARALOGOUS PAIRSDETECTION OF PUTATIVE ANCIENT PARALOGOUS PAIRS

Encoded proteins,

Genome A

Encoded proteins,

Genome A

Encoded proteins,

Genome B

Encoded proteins,

Genome B

BLAST,E-value cutoff 10-8

(A1,B1) and (A2,B2) are putativeanciently

duplicated genes

(A1,B1) and (A2,B2) are putativeanciently

duplicated genes

Gene A1Gene A1 Gene B1Gene B1

Gene A2Gene A2 Gene B2Gene B2

Reciprocal top scoring BLAST hit

Reciprocal top scoring BLAST hit

Page 5: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Ancient Gene Duplications and Root of Tree of Life

Root LocationTotal

number of datasets

Reliable Ambiguous or

Unresolved

Between Domains 32 7 25

Within Archaea 3 2 1

Within Bacteria 9 0 9

Polyphyletic, only Archaeal sequences group with Archaeal ortholog of the A-B pair

17 1 16

Polyphyletic, only Bacterial sequences group with Bacterial ortholog of the A-B pair

14 5 9

Polyphyletic on both sides of root

11 0 11

Page 6: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

SS

U-r

RN

A T

ree

of L

i fe

EuglenaTrypanosoma

Zea

Paramecium

Dictyostelium

EntamoebaNaegleria

Coprinus

Porphyra

Physarum

HomoTritrichomonas

Sulfolobus

ThermofilumThermoproteus

pJP 27pJP 78

pSL 22pSL 4

pSL 50

pSL 12

E.coli

Agrobacterium

Epulopiscium

AquifexThermotoga

Deinococcus

Synechococcus

Bacillus

Chlorobium

Vairimorpha

Cytophaga

HexamitaGiardia

mitochondria

chloroplast

Haloferax

Methanospirillum

Methanosarcina

Methanobacterium

ThermococcusMethanopyrus

Methanococcus

ARCHAEA BACTERIA

EUCARYA

Encephalitozoon

Thermus

EM 17

0.1 changes per nt

Marine group 1

RiftiaChromatium

ORIGIN

Treponema

CPSV/A-ATPaseProlyl RSLysyl RSMitochondriaPlastids

Fig. modified from Norman Pace

Page 7: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Another example of topology with long empty branches connecting three domains of life –

proton pumping ATPases

Olendzenski et al, 2000

0.1

BfTreponemaBArabidops

BHomoBSaccharomyces

BCandidaBNeurospora373

744812

1000

BEnterococcosBBMethanosarcina

BSulfolobuBDesulfurococcus

206

BDeinococcusBThermus957

1000

alpE.colialpVibrio1000

alpPropionigeniumalpSynechococcus

alpNicotianaalpBos

alpSchizosalpSaccharom.

alpNeurospora830927610

949

alpMycobacteriumalpThermotoga

alpPS3441593

388

541

1000

838

FfBorreliaFlSalmonellaFTreponema

FlBacillus901951

1000

betAquifexbetMycoplasma

betThermotogabetE.coli

betVibrio997

betEnterococcusbetPS3971

betWolinellabetChlorobium

betPectinatusbetPropionigenium

betBosbetNicotiana616

betSaccharomycesbetNeurospora

BetSchizossacch.388315

962

88

174

442564

1000

ASulfolobusAfTreponema378

AEnterococcusABMethanosarcinaAMMethanosarcina1000

AHaloferaxAHalobacterium1000

984

AjMethanoccusADesulfurococcus989

370268

417

AThermusADeinococcus1000

448

AGiardiaAPlasmodium

ATrypanosoACyanidium

AArabidopsAVigna

ADaucusAGossypium

8271000

A1AcetabulariaA2Acetabularia

1000999

914

AGallus A1AMusA1Homo1

ABos1

1000

ADrosophilaAAedes

AManduca7271000

1000

ANeurosporaASaccharomycesACandida1000

1000

262

ADictyosteliumAEntamoeba

282

299

513

954

973

1000

901

901

Flagellar Assembly ATPases

Non C

atalytic Subunits

Catalytic S

ubunits

Bacterial (F

-)A

rchaeal (A-) &

Vacuolar (v-)

Bacterial (F

-)A

rchaeal (A-)

Vacuolar (V

-) AT

Pase S

U

Page 8: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Phylogenetic tree of phenylalanyl-tRNA synthetase beta-subunits.

J.R. Brown, Syst. Biol. 50(4):497–512 , 2001

Page 9: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

H. Philippe, P. Forterre, J Mol Evol (1999) 49:509–523

Phylogenetic tree of Ile- and Val-tRNA synthetases

Page 10: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Hypotheses to explain the long empty branches connecting the three domains of life

Early evolution prior to the split of the three domains was following different mechanisms.

• Wolfram Zillig, 1992

• Otto Kandler, 1994

• Carl Woese, 1998

• Arthur Koch, 1994

Page 11: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Hypotheses to explain the long empty branches connecting the three domains of life (cont.)

There were major catastrophic events in the past that led to a bottleneck with only few survivors

For example:

•Tail of early heavy bombardment

•Rise of oxygen

Page 12: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Hypotheses to explain the long empty branches connecting the three domains of life (cont.)

There were major catastrophic events in the past that led to a bottleneck with only few survivors

Page 13: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Hypotheses to explain the long empty branches connecting the three domains of life (cont.)

Page 14: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Coalescence – the

process of tracing

lineages backwards

in time to their

common ancestors.

Every two extant

lineages coalesce

to their most recent

common ancestor.

Eventually, all

lineages coalesce

to the cenancestor.

t/2(Kingman, 1982)

Illustration is from J. Felsenstein, “Inferring Phylogenies”, Sinauer, 2003

Page 15: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Clade – (from the Greek “klados” meaning branch or twig) –

a group of organisms that includes all of the descendants of

an ancestral taxon. In a rooted phylogeny every node

defines a clade as the lineages originating from this node,

including those that arise in successive furcations.

Coalescence as an Approach to Study Cladogenesis

Cladogenesis – the process of clade formation

clade

Page 16: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Simulations of cladogenesis by coalescence

•One extinction and one speciation event per generation

•Horizontal transfer event once in 10 generations

RED: organismal lineages (no HGT)BLUE: molecular lineages (with HGT)GRAY: extinct lineages

RESULTS:•Most recent common ancestors are different for organismal and molecular phylogenies

•Different coalescence times

•Long coalescence of the last two lineages

Page 17: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

EX

TA

NT

LIN

EA

GE

S F

OR

TH

E S

IMU

LAT

ION

S O

F 5

0 LI

NE

AG

ES

Page 18: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

green: organismal lineages ; red: molecular lineages (with gene transfer)

Number of extant lineages over time

Page 19: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

Adam and Eve never met

Albrecht Durer, The Fall of Man, 1504

MitochondrialEve

Y chromosomeAdam

Lived approximately

50,000 years ago

Lived 166,000-249,000

years ago

Thomson, R. et al. (2000) Proc Natl Acad Sci U S A 97, 7360-5

Underhill, P.A. et al. (2000) Nat Genet 26, 358-61

Cann, R.L. et al. (1987) Nature 325, 31-6

Vigilant, L. et al. (1991) Science 253, 1503-7

Page 20: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

CONCLUSIONS•Coalescence leads to long branches at the root.

•Using single genes as phylogenetic markers makes it difficult to trace organismal phylogeny in the presence of horizontal gene transfer.

•Each contemporary molecule has its own history and traces back to an individual molecular cenancestor, but these molecular ancestors were likely to be present in different organisms at different times.

•The model alone already explains some features of the observed topology of the tree of life. Therefore, it does not appear warranted to invoke more complex hypotheses involving bottlenecks and extinction events to explain these features of the tree of life.

Page 21: Coalescence and the Cenancestor J. Peter Gogarten University of Connecticut Department of Molecular and Cell Biology.

OUTLOOK

• Incorporate non-random Horizontal Gene Transfer into the

model.

• Consider the model as a null hypothesis and look for

deviations from it. According to the model the bottom

branches are expected to be long for every individual clade.

This does not seem to be the case for bacterial domain. This

deviation could be due to sudden radiation.

• Explore if the model can be used to infer parameters that

describe the early evolution of life. E.g., number of species.