Eisen.All Hands

description

Talk summarizing our GEBA (Genomic Encyclopedia of Bacteria and Archaea) project for the "All Hands" meeting at the Joint Genome Institute

transcript

  • 1. A phylogeny-driven genomic encyclopedia of bacteria and archaea (or what is GEBA anyway?). Jonathan A. Eisen, October 27, 2009

2. From http://genomesonline.org
3. rRNA Tree of Life
4. The Tree is not Happy
5. As of 2002: At least 40 phyla of bacteria (based on Hugenholtz, 2002). Phyla shown in tree: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11
6. As of 2002 (same tree as slide 5): At least 40 phyla of bacteria; Genome sequences are mostly from three phyla (based on Hugenholtz, 2002)
7. As of 2002 (same tree as slide 5): At least 40 phyla of bacteria; Genome sequences are mostly from three phyla; Some other phyla are only sparsely sampled (based on Hugenholtz, 2002)
8. As of 2002 (same tree as slide 5): At least 40 phyla of bacteria; Genome sequences are mostly from three phyla; Some other phyla are only sparsely sampled; Same trend in Archaea (based on Hugenholtz, 2002)
9. Need for Tree Guidance Well Established: Common approach within some eukaryotic groups; Many small projects funded to fill in some bacterial or archaeal gaps; Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature
10. (Same tree as slide 5) At least 100 phyla of bacteria; Genome sequences are mostly from three phyla; Most phyla with cultured species are sparsely sampled; Lineages with no cultured taxa even more poorly sampled; Solution: use tree to really fill gaps. Legend: Well sampled phyla
11. http://www.jgi.doe.gov/programs/GEBA/pilot.html
12. GEBA Pilot Project Overview: Identify major branches in rRNA tree for which no genomes are available; Identify a cultured representative for each group; Grow > 200 of these and prep. DNA; Sequence and finish 100; Annotate, analyze, release data; Assess benefits of tree guided sequencing
13. GEBA Pilot Project: Components: Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow); Project management (David Bruce, Eileen Dalin, Lynne Goodwin); Culture collection and DNA prep (DSMZ, Hans-Peter Klenk); Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng); Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al.); Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D'haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla); Adopt a microbe education project (Cheryl Kerfeld); Outreach (David Gilbert); $$$ (DOE, Eddy Rubin, Jim Bristow)
14. Some Lessons From GEBA
15. GEBA Lesson 1: rRNA Tree of Life is a Useful Guide and Genomes Improve Resolution
16. GEBA Lesson 2: Phylogenetically Guided Selection Can Help Annotate Other Genomes
17. Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling: Better definition of protein family sequence patterns; Greatly improves comparative and evolutionary based predictions; Conversion of hypothetical into conserved hypotheticals; Linking distantly related members of protein families; Improved non-homology prediction (Kostas Mavrommatis, Natalia Ivanova, Thanos Lykidis, Nikos Kyrpides, Iain Anderson)
18. GEBA Lesson 3: Phylogenetically Guided Selection Can Help Study Uncultured Organisms
19. Environmental Shotgun Sequencing (figure label: shotgun sequence)
20. Binning challenge (figure labels: A to G and T to Z)
21. Metagenomic Analysis Improves: Small but real improvement in metagenomic annotation and analysis (Sean Hooper, Amrita Pati)
22. GEBA Lesson 4: We have still only scratched the surface of microbial diversity
23. Protein Family Rarefaction Curves: Take data set of multiple complete genomes; Identify all protein families using MCL; Plot # of genomes vs. # of protein families
24. Phylogenetic Distribution Novelty: 1st Bacterial Actin Related Protein; Haliangium ochraceum DSM 14365 (Victor Kunin, Patrik D'haeseleer, Adam Zemla)
25. Phylogenetic Diversity with GEBA
26. Phylogenetic Diversity: Isolates
27. Phylogenetic Diversity: All
28. (Same tree as slide 5) At least 40 phyla of bacteria; Genome sequences are mostly from three phyla; Most phyla with cultured species are sparsely sampled; Lineages with no cultured taxa even more poorly sampled. Legend: Well sampled phyla; Poorly sampled; No cultured taxa
29. Uncultured Lineages: Technical Approaches: Get into culture; Enrichment cultures; If abundant in low diversity ecosystems; Flow sorting; Microbeads; Microfluidic sorting; Single cell amplification
30. GEBA Lesson 6: Need Experiments from Across the Tree of Life too
31. Adopt a Microbe
32. MICROBES
33. A Happy Tree of Life
34. Related Lesson 1: METADATA ROCKS
35. SIGS. The Genomic Standards Consortium: The GSC is an open-membership working body which formed in September 2005. The goal of this international community is to promote mechanisms that standardize the description of genomes and the exchange and integration of genomic data. See http://gensc.org/gc_wiki/index.php/Main_Page
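Note on the rarefaction procedure (slide 23): the three steps listed there (take a set of complete genomes, identify protein families, plot number of genomes against number of protein families) can be sketched roughly as below. This is only an illustration, not the pipeline used in the talk. It assumes the MCL clustering has already been run and that each genome is represented as a set of family identifiers; the function name `rarefaction_curve`, the toy genomes, and the permutation count are all hypothetical.

```python
import random
from typing import Dict, List, Set


def rarefaction_curve(
    genome_families: Dict[str, Set[str]],
    n_permutations: int = 100,
    seed: int = 0,
) -> List[float]:
    """Mean number of distinct protein families seen after adding 1..N genomes.

    genome_families maps a genome name to the set of protein family IDs
    found in it (assumed to come from a prior clustering step such as MCL).
    Genomes are added in random order; the curve is averaged over orderings.
    """
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)
        seen: Set[str] = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]  # families discovered so far
            totals[i] += len(seen)
    return [total / n_permutations for total in totals]


# Hypothetical toy input: three genomes sharing some families.
toy = {
    "genome_A": {"fam1", "fam2"},
    "genome_B": {"fam2", "fam3"},
    "genome_C": {"fam1", "fam2", "fam3", "fam4"},
}
curve = rarefaction_curve(toy)
for n_genomes, n_families in enumerate(curve, start=1):
    print(n_genomes, round(n_families, 2))
```

A curve that keeps climbing as genomes are added suggests that new protein families are still being discovered, which is the point made under Lesson 4.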