EVE 161 Lecture 4
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
Lecture 4
EVE 161: Microbial Phylogenomics
Lecture #4:
Era I: Modern View of the Tree of Life
UC Davis, Winter 2014
Instructor: Jonathan Eisen
Where we are going and where we have been
• Previous lecture: 3. Woese and the Tree of Life
• Current lecture: 4. Modern view of Tree of Life
• Next lecture: 5. Era II: rRNA from environment
Two papers for today
Syst. Biol. 59(5):518–533, 2010
© The Author(s) 2010. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved.
DOI: 10.1093/sysbio/syq037
Advance Access publication on July 23, 2010
Broadly Sampled Multigene Analyses Yield a Well-Resolved Eukaryotic Tree of Life
LAURA WEGENER PARFREY1, JESSICA GRANT2, YONAS I. TEKLE2,6, ERICA LASEK-NESSELQUIST3,4, HILARY G. MORRISON3, MITCHELL L. SOGIN3, DAVID J. PATTERSON5, AND LAURA A. KATZ1,2,∗
1 Program in Organismic and Evolutionary Biology, University of Massachusetts, 611 North Pleasant Street, Amherst, MA 01003, USA
2 Department of Biological Sciences, Smith College, 44 College Lane, Northampton, MA 01063, USA
3 Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, 7 MBL Street, Woods Hole, MA 02543, USA
4 Department of Ecology and Evolutionary Biology, Brown University, 80 Waterman Street, Providence, RI 02912, USA
5 Biodiversity Informatics Group, Marine Biological Laboratory, 7 MBL Street, Woods Hole, MA 02543, USA
6 Present address: Department of Epidemiology and Public Health, Yale University School of Medicine, New Haven, CT 06520, USA
∗ Correspondence to be sent to: Laura A. Katz, 44 College Lane, Northampton, MA 01003, USA; E-mail: [email protected]
Laura Wegener Parfrey and Jessica Grant have contributed equally to this work.
Received 30 September 2009; reviews returned 1 December 2009; accepted 25 May 2010
Associate Editor: Cecile Ane
Abstract. An accurate reconstruction of the eukaryotic tree of life is essential to identify the innovations underlying the diversity of microbial and macroscopic (e.g., plants and animals) eukaryotes. Previous work has divided eukaryotic diversity into a small number of high-level “supergroups,” many of which receive strong support in phylogenomic analyses. However, the abundance of data in phylogenomic analyses can lead to highly supported but incorrect relationships due to systematic phylogenetic error. Furthermore, the paucity of major eukaryotic lineages (19 or fewer) included in these genomic studies may exaggerate systematic error and reduce power to evaluate hypotheses. Here, we use a taxon-rich strategy to assess eukaryotic relationships. We show that analyses emphasizing broad taxonomic sampling (up to 451 taxa representing 72 major lineages) combined with a moderate number of genes yield a well-resolved eukaryotic tree of life. The consistency across analyses with varying numbers of taxa (88–451) and levels of missing data (17–69%) supports the accuracy of the resulting topologies. The resulting stable topology emerges without the removal of rapidly evolving genes or taxa, a practice common to phylogenomic analyses. Several major groups are stable and strongly supported in these analyses (e.g., SAR, Rhizaria, Excavata), whereas the proposed supergroup “Chromalveolata” is rejected. Furthermore, extensive instability among photosynthetic lineages suggests the presence of systematic biases including endosymbiotic gene transfer from symbiont (nucleus or plastid) to host. Our analyses demonstrate that stable topologies of ancient evolutionary relationships can be achieved with broad taxonomic sampling and a moderate number of genes. Finally, taxon-rich analyses such as presented here provide a method for testing the accuracy of relationships that receive high bootstrap support (BS) in phylogenomic analyses and enable placement of the multitude of lineages that lack genome scale data. [Excavata; microbial eukaryotes; Rhizaria; supergroups; systematic error; taxon sampling.]
Perspectives on the structure of the eukaryotic tree of life have shifted in the past decade as molecular analyses provide hypotheses for relationships among the approximately 75 robust lineages of eukaryotes. These lineages are defined by ultrastructural identities (Patterson 1999), that is, patterns of cellular and subcellular organization revealed by electron microscopy, and are strongly supported in molecular analyses (Parfrey et al. 2006; Yoon et al. 2008). Most of these lineages now fall within a small number of higher level clades, the supergroups of eukaryotes (Simpson and Roger 2004; Adl et al. 2005; Keeling et al. 2005). Several of these clades (Opisthokonta, Rhizaria, and Amoebozoa) are increasingly well supported by phylogenomic (Rodríguez-Ezpeleta et al. 2007a; Burki et al. 2008; Hampl et al. 2009) and phylogenetic (Parfrey et al. 2006; Pawlowski and Burki 2009) analyses, whereas support for “Archaeplastida” predominantly comes from some phylogenomic studies (Rodríguez-Ezpeleta et al. 2005; Burki et al. 2007) or analyses of plastid genes (Yoon et al. 2002; Parfrey et al. 2006). In contrast, support for “Chromalveolata” and Excavata is mixed, often dependent on the selection of taxa included in analyses (Rodríguez-Ezpeleta et al. 2005; Parfrey et al. 2006; Rodríguez-Ezpeleta et al. 2007a; Burki et al. 2008; Hampl et al. 2009). We use quotation marks throughout to note groups where uncertainties remain. Moreover, it is difficult to evaluate the overall stability of major clades of eukaryotes because phylogenomic analyses have 19 or fewer of the major lineages and hence do not sufficiently sample eukaryotic diversity (Rodríguez-Ezpeleta et al. 2007b; Burki et al. 2008; Hampl et al. 2009), whereas taxon-rich analyses with 4 or fewer genes yield topologies with poor support at deep nodes (Cavalier-Smith 2004; Parfrey et al. 2006; Yoon et al. 2008).
Estimating the relationships of the major lineages of eukaryotes is difficult because of both the ancient age of eukaryotes (1.2–1.8 billion years; Knoll et al. 2006) and complex gene histories that include heterogeneous rates of molecular evolution and paralogy (Maddison 1997; Gribaldo and Philippe 2002; Tekle et al. 2009). A further issue obscuring eukaryotic relationships is the chimeric nature of the eukaryotic genome (not all genes are vertically inherited, due to lateral gene transfer [LGT] and endosymbiotic gene transfer [EGT]) that can also mislead efforts to reconstruct phylogenetic relationships (Andersson 2005; Rannala and Yang 2008; Tekle et al. 2009). This is especially true among photosynthetic lineages that comprise “Chromalveolata” and “Archaeplastida” where a large portion of the host genome (approximately 8–18%) is
A congruent phylogenomic signal places eukaryotes within the Archaea
Tom A. Williams, Peter G. Foster, Tom M. W. Nye, Cymon J. Cox and T. Martin Embley
Proc. R. Soc. B 2012 279, first published online 24 October 2012, doi: 10.1098/rspb.2012.1795
Phylogeny Review
Parts of a phylogenetic tree
[Figure: a rooted tree with terminal (or tip) taxa a–h and interior labels t–z, marking the root (root node), internal nodes, internal branches and terminal branches.]
Internal nodes represent hypothetical ancestral taxa
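The parts named on this slide can be sketched as a small data structure. The following is a minimal illustration, not the topology drawn on the slide: the `Node` class, the helper functions and the four-taxon toy tree are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One node of a rooted tree; a node without children is a terminal (tip) taxon."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def is_terminal(self) -> bool:
        return not self.children


def terminal_taxa(node: Node) -> List[str]:
    """Names of all tips below (or at) this node, left to right."""
    if node.is_terminal():
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(terminal_taxa(child))
    return names


def internal_nodes(node: Node) -> List[str]:
    """Names of all nodes that have descendants (hypothetical ancestors), root first."""
    if node.is_terminal():
        return []
    names = [node.name]
    for child in node.children:
        names.extend(internal_nodes(child))
    return names


def branches(node: Node) -> List[Tuple[str, str, str]]:
    """(parent, child, kind) for every branch; kind is 'terminal' or 'internal'."""
    out: List[Tuple[str, str, str]] = []
    for child in node.children:
        kind = "terminal" if child.is_terminal() else "internal"
        out.append((node.name, child.name, kind))
        out.extend(branches(child))
    return out


# Hypothetical toy tree ((a,b)u,(c,d)v)root, chosen only for illustration.
toy_tree = Node("root", [
    Node("u", [Node("a"), Node("b")]),
    Node("v", [Node("c"), Node("d")]),
])
```

Calling `terminal_taxa(toy_tree)` lists the tips, `internal_nodes(toy_tree)` lists the root and the two hypothetical ancestors, and `branches(toy_tree)` separates internal branches from terminal ones.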
Characters
• A heritable feature of an organism is known as a character (also character trait or trait).
• The form that a character takes is known as its state (also known as character state). Note: Presence/absence can be a state.
• Example: Character = heart; Character state = present/absent; Character state = # of chambers
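One way to picture characters and states is as a table with taxa in rows and characters in columns. The sketch below follows the heart example from the slide; the taxon names and the state values assigned to them are hypothetical placeholders, not data about real organisms.

```python
# Rows are taxa, columns are characters, and each cell holds a character state.
# Taxon names and values are made up for illustration.
character_matrix = {
    "taxon_A": {"heart": "present", "heart_chambers": 4},
    "taxon_B": {"heart": "present", "heart_chambers": 2},
    # With no heart, the chamber count is not applicable, so it is left as None.
    "taxon_C": {"heart": "absent", "heart_chambers": None},
}


def states(matrix, character):
    """Distinct states observed for one character across all taxa."""
    return {row[character] for row in matrix.values()}


def taxa_with_state(matrix, character, state):
    """Sorted names of the taxa that show a given state for a character."""
    return sorted(t for t, row in matrix.items() if row[character] == state)
```

Here `states(character_matrix, "heart")` returns the presence/absence states, while `heart_chambers` shows a character whose states are counts.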
Character ancestry is critical to understand
• Characters that are inherited from a common ancestor are homologous.
• Species change over time. This is known (generally) as divergence, or divergent evolution.
• Species change over time due to the combined processes of mutation, recombination, drift, selection, etc.
Data matrices
Sequence Alignment
Tree reconstruction methods
Some other bells and whistles
Long branch attraction
Homoplasy
Bootstrapping
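The slide gives only the title, so what follows is a generic sketch of the nonparametric bootstrap as it is commonly applied to alignments, not a reconstruction of what was shown in class. Alignment columns are resampled with replacement and the analysis is repeated on each pseudo-replicate. To keep the example self-contained, a nearest-neighbour check by Hamming distance stands in for real tree inference; that stand-in, the function names and the toy sequences are all assumptions.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in cols) for taxon, seq in alignment.items()}


def hamming(a, b):
    """Number of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbour(alignment, taxon):
    """Stand-in for tree inference: the taxon closest to `taxon` (ties broken by name)."""
    others = [t for t in alignment if t != taxon]
    return min(others, key=lambda t: (hamming(alignment[taxon], alignment[t]), t))


def bootstrap_support(alignment, taxon, partner, replicates=200, seed=1):
    """Fraction of pseudo-replicates in which `partner` is the nearest neighbour of `taxon`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        pseudo = bootstrap_alignment(alignment, rng)
        if nearest_neighbour(pseudo, taxon) == partner:
            hits += 1
    return hits / replicates


# Hypothetical toy alignment (four taxa, ten sites).
toy_alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTTCGTAA",
    "C": "ACCTACGTAA",
    "D": "TTGAACCTCA",
}
```

In a real analysis the nearest-neighbour step would be replaced by a full tree search on every pseudo-replicate, and support would be tallied per clade rather than per pair of taxa.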
Jackknifing
Congruence
Rooting
Masking
Concatenation
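As a rough illustration of concatenation, the sketch below joins per-gene alignments into one supermatrix and pads taxa that lack a gene with a missing-data symbol, which also makes it easy to compute the kind of missing-data percentage quoted in the first paper's abstract. The function names, the `?` symbol and the toy genes are assumptions made for the example, not the procedure used in either paper.

```python
def concatenate(gene_alignments, missing="?"):
    """Join per-gene alignments into one supermatrix, padding absent genes with `missing`."""
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    supermatrix = {t: "" for t in taxa}
    for gene in sorted(gene_alignments):
        aln = gene_alignments[gene]
        length = len(next(iter(aln.values())))
        for t in taxa:
            # A taxon without this gene contributes a run of missing-data symbols.
            supermatrix[t] += aln.get(t, missing * length)
    return supermatrix


def missing_fraction(supermatrix, missing="?"):
    """Share of all cells in the supermatrix that are missing data."""
    total = sum(len(seq) for seq in supermatrix.values())
    absent = sum(seq.count(missing) for seq in supermatrix.values())
    return absent / total


# Hypothetical toy data: taxon t2 has no sequence for geneB.
toy_genes = {
    "geneA": {"t1": "ACGT", "t2": "ACGA", "t3": "ACTT"},
    "geneB": {"t1": "GG", "t3": "GC"},
}
toy_supermatrix = concatenate(toy_genes)
```

Adding taxa that were sequenced for only some genes increases the missing fraction, which is the trade-off a taxon-rich sampling strategy accepts.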
Two papers for today
Broadly Sampled Multigene Analyses Yield a Well-Resolved Eukaryotic Tree of Life
451 Taxa
88 Taxa
Just Rhizaria
Just Excavata
Two papers for today
A congruent phylogenomic signal places eukaryotes within the Archaea
Tom A. Williams, Peter G. Foster, Tom M. W. Nye, Cymon J. Cox and T. Martin Embley
Proc. R. Soc. B 2012 279, first published online 24 October 2012, doi: 10.1098/rspb.2012.1795
Determining the relationships among the major groups of cellular life is important for understanding the evolution of biological diversity, but is difficult given the enormous time spans involved. In the textbook ‘three domains’ tree based on informational genes, eukaryotes and Archaea share a common ancestor to the exclusion of Bacteria. However, some phylogenetic analyses of the same data have placed eukaryotes within the Archaea, as the nearest relatives of different archaeal lineages. We compared the support for these competing hypotheses using sophisticated phylogenetic methods and an improved sampling of archaeal biodiversity. We also employed both new and existing tests of phylogenetic congruence to explore the level of uncertainty and conflict in the data. Our analyses suggested that much of the observed incongruence is weakly supported or associated with poorly fitting evolutionary models. All of our phylogenetic analyses, whether on small subunit and large subunit ribosomal RNA or concatenated protein-coding genes, recovered a monophyletic group containing eukaryotes and the TACK archaeal superphylum comprising the Thaumarchaeota, Aigarchaeota, Crenarchaeota and Korarchaeota. Hence, while our results provide no support for the iconic three-domain tree of life, they are consistent with an extended eocyte hypothesis whereby vital components of the eukaryotic nuclear lineage originated from within the archaeal radiation.
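The contrast between the three-domains tree and the eocyte hypothesis comes down to which groups are monophyletic. The sketch below checks monophyly on two heavily simplified rooted trees written as nested tuples. Both topologies, the collapse of the TACK lineages into a single tip and the rooting on Bacteria are assumptions made for illustration only; they are not the trees inferred in the paper.

```python
def leaves(tree):
    """Set of tip names in a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    result = frozenset()
    for child in tree:
        result |= leaves(child)
    return result


def clades(tree):
    """Yield the tip set of every subtree, including single tips and the whole tree."""
    yield leaves(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if some subtree contains exactly the tips in `group` and nothing else."""
    return frozenset(group) in set(clades(tree))


# Simplified, hypothetical topologies rooted on Bacteria.
three_domains = ("Bacteria", (("Euryarchaeota", "TACK"), "Eukaryota"))
eocyte = ("Bacteria", ("Euryarchaeota", ("TACK", "Eukaryota")))

archaea = {"Euryarchaeota", "TACK"}
```

Under the three-domains sketch `is_monophyletic(three_domains, archaea)` holds, whereas under the eocyte sketch the Archaea are no longer monophyletic and the group `{"TACK", "Eukaryota"}` is.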
[Figure 1, panels (a)–(d): trees with clades labeled Bacteria, Euryarchaeota, Korarchaeota, Crenarchaeota, Aigarchaeota, Thaumarchaeota and Eukaryota; panel (b) shows only Bacteria, Euryarchaeota, Crenarchaeota and Eukaryota. Scale bars: 0.2.]
Figure 1. Phylogenies of Bacteria, Archaea and eukaryotes inferred from concatenated rRNA. (a) A Bayesian phylogeny of Bacteria, Archaea and eukaryotes inferred under the GTR model, showing an eocyte-like topology in which eukaryotes emerge from within the Archaea with maximal support (posterior probability (PP) = 1). (b) Removal of recently characterized archaeal groups (the Thaumarchaeota, Aigarchaeota and Korarchaeota) converts this tree into a canonical three-domains topology, again with maximal support (PP = 1), indicating that sampling plays an important role in the resolution of these ancient relationships. Analyses of the full dataset using the better-fitting NDRH + NDCH (c) and CAT (d) models recover maximally supported eocyte-like topologies; these models also recover eocyte-like topologies on the reduced dataset, without the TAK sequences (see the electronic supplementary material, figure S1). Branch lengths are proportional to substitutions per site.
Source: T. A. Williams et al., Evolution of eukaryotes from Archaea, Proc. R. Soc. B (2012), p. 4873.
!45
Slides for UC Davis EVE161 Course Taught by Jonathan Eisen Winter 2014
Bacteria
Euryarchaeota
Korarchaeota
Crenarchaeota
Aigarchaeota
Thaumarchaeota
Eukaryota
Archaeoglobus fulgidus
Methanococcus jannaschii
Methanothermobacter thermautotrophicus
Pyrococcus furiosus
Methanosarcina mazei
Thermoplasma volcanium
Giardia lamblia
Trichomonas vaginalisNaegleria gruberi
Arabidopsis thaliana
Homo sapiens
Saccharomyces cerevisiae
Thalassiosira pseudonana
Dictyostelium discoideum
Trypanosoma brucei
Entamoeba histolytica
Cenarchaeum symbiosum
Nitrosopumilus maritimus
Korarchaeum cryptofilum
Caldiarchaeum subterraneum
Caldivirga maquilingensis
Pyrobaculum aerophilum
Thermofilum pendens
Sulfolobus solfataricusStaphylothermus marinus
Hyperthermus butylicus
Ignicoccus hospitalis
Aeropyrum pernix
Campylobacter jejuni
Escherichia coli
Rhodopseudomonas palustris
Clostridium acetobutylicum
Synechocystis sp.
Treponema pallidum
Chlamydia trachomatis
Rhodopirellula baltica
1
1
1
1
0.83
1
0.2
(a)
Bacteria
Euryarchaeota
Crenarchaeota
Eukaryota
Trichomonas vaginalis
Arabidopsis thaliana
Giardia lamblia
Homo sapiens
Saccharomyces cerevisiae
Thalassiosira pseudonana
Dictyostelium discoideum
Trypanosoma brucei
Entamoeba histolytica
Naegleria gruberi
Archaeoglobus fulgidus
Methanosarcina mazei
Thermoplasma volcanium
Methanococcus jannaschii
Methanothermobacter thermautotrophicus
[Figure 1, panels (a)-(d): phylogenetic trees with tips grouped as Bacteria, Euryarchaeota, Korarchaeota, Crenarchaeota, Aigarchaeota, Thaumarchaeota and Eukaryota; node labels are posterior probabilities; scale bar 0.2.]
Figure 1. Phylogenies of Bacteria, Archaea and eukaryotes inferred from concatenated rRNA. (a) A Bayesian phylogeny of Bacteria, Archaea and eukaryotes inferred under the GTR model, showing an eocyte-like topology in which eukaryotes emerge from within the Archaea with maximal support (posterior probability (PP) = 1). (b) Removal of recently characterized archaeal groups (the Thaumarchaeota, Aigarchaeota and Korarchaeota) converts this tree into a canonical three-domains topology, again with maximal support (PP = 1), indicating that sampling plays an important role in the resolution of these ancient relationships. Analyses of the full dataset using the better-fitting NDRH + NDCH (c) and CAT (d) models recover maximally supported eocyte-like topologies; these models also recover eocyte-like topologies on the reduced dataset, without the TAK sequences (see the electronic supplementary material, figure S1). Branch lengths are proportional to substitutions per site.
Evolution of eukaryotes from Archaea. T. A. Williams et al. Proc. R. Soc. B (2012), p. 4873. Downloaded from rspb.royalsocietypublishing.org on January 16, 2014.
With New Data
[Figure 1 of Williams et al. (2012) shown again: concatenated rRNA phylogenies, panels (a)-(d).]
Without New Data
Better Models
[Figure 1 of Williams et al. (2012) shown again: concatenated rRNA phylogenies, panels (a)-(d).]
Concatenated Proteins
complete datasets are presented in figure 2; phylogenies inferred from the CONCATERPILLAR-derived congruent subsets are provided in the electronic supplementary material, figures S2 and S3. The support from all these analyses for current hypotheses on eukaryotic origins is summarized in the electronic supplementary material, tables S4 and S5.
Our analyses including bacterial outgroups consistently supported the monophyly of eukaryotes with the TACK superphylum of Archaea, to the exclusion of the euryarchaeotes, although the strength of support for this eocyte-like hypothesis varied with the subset of the data analysed (see figure 2 and the electronic supplementary material, tables S4 and S5). By contrast, we found no support for the three-domains hypothesis and the monophyly of Archaea from any of these analyses. While the monophyly of eukaryotes and the TACK superphylum was consistently recovered, the specific relationships within this clade were more ambiguous. The phylogeny inferred from the 63AE dataset recovered eukaryotes and the TACK superphylum as separate clusters (figure 2b); in contrast, the 29BAE dataset and the two largest CONCATERPILLAR-derived congruent subsets inferred from the 64AE dataset supported the nesting of the eukaryotes within the TACK superphylum, either as the neighbour of Korarchaeum or with the relationship unresolved (see figure 2a and the electronic supplementary material, tables S4 and S5). Given the sparse sampling of Korarchaeota and their relatives, and the long branch leading to eukaryotes, this finding must be treated with caution [14]. In particular, when the bacterial sequences were removed from the 29BAE dataset and the analyses were repeated, the relationship between eukaryotes, Korarchaeum and the rest of the TACK superphylum collapsed to a trichotomy, suggesting that the Korarchaeum/eukaryote link is not strongly supported (see the electronic supplementary material, figure S6). Further Korarchaeal genome sequences are likely to be very informative about this part of the tree of life. It is interesting to note that we consistently recovered a strongly supported Thaumarchaeota/Aigarchaeota clade within the TACK group, confirming the relationship between these groups [20] and suggesting that they do not represent the earliest-diverging archaeal lineage [16,49]; in our trees, the eukaryotes and the TACK superphylum consistently form a monophyletic group to the exclusion of euryarchaeotes (figure 2, electronic supplementary material, figure S2).
With the exception of Korarchaeum, our analyses did not provide support for a specific relationship between any
[Figure 2, panels (a) and (b): phylogenetic trees with tips grouped as Bacteria (panel (a) only), Euryarchaeota, Korarchaeota, Crenarchaeota, Aigarchaeota, Thaumarchaeota and Eukaryota; node labels are posterior probabilities; scale bars 0.2 (a) and 0.5 (b).]
Figure 2. Phylogenies of Bacteria, Archaea and eukaryotes inferred from conserved protein-coding genes. (a) A phylogeny inferred from 29 concatenated proteins conserved between Bacteria, Archaea and eukaryotes. An eocyte topology was recovered with strong (PP = 0.99) support. In this phylogeny, the eukaryotes emerge as the sister group of Korarchaeum, nested with the TACK superphylum. (b) A phylogeny inferred from 63 concatenated proteins shared between Archaea and eukaryotes. The position of the root is not explicitly indicated. However, based on the result from (a) and the electronic supplementary material, table S4, it is likely to be either within, or on the branch leading to, the Euryarchaea. If this position is correct, then the tree shows the eukaryotes emerging as the sister group to the TACK superphylum, including Korarchaeum. These trees were inferred using the CAT model in PHYLOBAYES. Branch lengths are proportional to substitutions per site, except the truncated bacterial branch in (a).
Tree Congruence
3. CONCLUSIONS
Under the three-domains hypothesis, important components of the eukaryotic genetic machinery were vertically inherited from a common ancestor shared with Archaea, and this relationship is taken to explain the shared properties of both groups. In an eocyte-like scenario, those same eukaryotic components were vertically inherited from an ancestor that was already an archaeon, and the phylogenetic position of this ancestor could be particularly informative about the genetic and metabolic context of early eukaryotic evolution and for theories of eukaryotic origins [1]. Here, we have compared support for these hypotheses and others, using conserved components of the genetic machinery. With an updated sampling of archaeal diversity, we found no support for the three-domains hypothesis either from rRNA or protein-coding genes under any phylogenetic model. Instead, we detected a congruent phylogenetic signal that placed essential informational genes of the eukaryotic nuclear lineage within the archaeal radiation, sharing common ancestry with the TACK superphylum. The monophyly of eukaryotic genes with the TACK
[Figure 3, panels (a)-(c): (a) histogram of frequency against distance; (b) bar chart of no. tests passed (P > 0.05) for saturation and homoplasy, site-specific biochemical diversity, and compositional heterogeneity, under the CAT20 and LG models; (c) density against distance for the CAT20 and LG models.]
Figure 3. Analysing incongruence using a novel measure of distance between gene trees. We used distributions of pairwise geodesic distances between gene trees to compare levels of incongruence inferred under different evolutionary models. (a) The distribution of distances under a single model (CAT20) can be used to identify obvious outliers corresponding to highly incongruent gene trees; a single gene was responsible for the peak highlighted in red, and was removed from subsequent analyses. (b) Overview of model-fitting tests (posterior predictive simulations) for each gene in the 64AE dataset. The height of the bars indicates the proportion of genes that 'passed' a test under a particular model; we said that a test was passed when the value of the test statistic on the real data fell within the central 95% of the distribution of values produced by posterior predictive simulation. The results suggest that CAT20 fits better than LG, successfully accounting for the observed levels of saturation and homoplasy in all but one of the alignments. Both models do a poor job of modelling the site-specific selective constraints in our dataset, although again CAT20 performs better than LG (13 passes as opposed to 0). (c) Comparison of the distance distributions inferred under the CAT20 and LG models. The trees inferred under the better-fitting CAT20 model are significantly more congruent than those inferred under LG (mean distance: 2.68 versus 3.22, p < 0.0001). The significance of this difference was assessed using a permutation test that took the correlations between pairwise distances into account (see §4). These results suggest that a significant portion of the incongruence in this dataset of informational genes can be attributed to model misspecification, rather than genuinely distinct evolutionary histories.
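The pass criterion described for panel (b) can be illustrated with a small sketch. This is not the authors' code: the function names, the linear-interpolation quantile, and the toy numbers are assumptions made for illustration. It only shows the stated rule, that a test passes when the statistic observed on the real data falls within the central 95% of the values simulated from the posterior predictive distribution.

```python
from typing import Sequence


def empirical_quantile(sorted_values: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile (0 <= q <= 1) of an already sorted sample."""
    if not sorted_values:
        raise ValueError("need at least one simulated value")
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def passes_posterior_predictive_test(
    observed: float, simulated: Sequence[float], central_mass: float = 0.95
) -> bool:
    """True if `observed` lies inside the central `central_mass` of `simulated`."""
    ordered = sorted(simulated)
    tail = (1.0 - central_mass) / 2.0  # 2.5% in each tail for the 95% rule
    low = empirical_quantile(ordered, tail)
    high = empirical_quantile(ordered, 1.0 - tail)
    return low <= observed <= high


# Hypothetical toy data: 101 simulated test-statistic values from 0.00 to 1.00.
simulated_values = [i / 100 for i in range(101)]
typical = passes_posterior_predictive_test(0.50, simulated_values)
extreme = passes_posterior_predictive_test(0.99, simulated_values)
print(typical, extreme)
```

Applying such a check to every gene and counting the passes per model would give bar heights of the kind plotted in panel (b); which test statistics to use is a separate choice that this sketch does not cover.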