Rob Edwards phage.sdsu/~rob San Diego State University

SGM Meeting, Warwick, April 2006. Challenges for metagenomic data analysis and lessons from viral metagenomes [What would you do if sequencing were free?]. Rob Edwards, http://phage.sdsu.edu/~rob, San Diego State University, Fellowship for Interpretation of Genomes.

  • Challenges for metagenomic data analysis and lessons from viral metagenomes

    [What would you do if sequencing were free?]

    Rob Edwards
    http://phage.sdsu.edu/~rob
    San Diego State University
    Fellowship for Interpretation of Genomes

    SGM Meeting, Warwick, April 2006

  • Outline

    The envy is not mine
    A tour around the world, thanks to phage
    People suck
    What is the most successful gene in evolution?
    Is there a Future?

  • This is all 454 sequence data

    21 libraries
      10 microbial, 11 phage
    597,340,328 bp total
      20% of the human genome
      50% of all complete and partial microbial genomes
    5,769,035 sequences
      Average 274,716 per library
    Average read length 103.5 bp
      Av. read length has not increased in 7 months
    Cost 0.04 per bp
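
The per-library and per-read averages on this slide follow directly from the totals. A quick check, as a sketch: the variable names are illustrative, and the human genome size of roughly 3 Gbp is an assumption that does not come from the slide.

```python
# Totals quoted on the slide.
total_bp = 597_340_328
total_reads = 5_769_035
libraries = 21

# Assumption (not from the slide): a human genome of roughly 3 Gbp.
HUMAN_GENOME_BP = 3.0e9

reads_per_library = total_reads / libraries
mean_read_length = total_bp / total_reads
human_fraction = total_bp / HUMAN_GENOME_BP

print(f"reads per library: {reads_per_library:,.0f}")
print(f"mean read length:  {mean_read_length:.1f} bp")
print(f"share of a human genome: {human_fraction:.0%}")
```

The same totals reappear later in the talk, rounded to 6 million sequences and 600 million bp.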

  • Sequencing is cheap and easy.

    Bioinformatics is neither.

  • The Soudan Mine, Minnesota

    Red Stuff: Oxidized
    Black Stuff: Reduced

  • Red and Black Samples Are Different

    Cloned and 454 sequenced
    16S are indistinguishable

    [Figure labels: Black stuff, Red, Cloned, Red]

  • There are different amounts of metabolism in each environment

  • There are different amounts of substrates in each environment

    [Figure labels: Black Stuff, Red Stuff]

  • But are the differences significant?

    Sample 10,000 proteins from site 1
    Count frequency of each subsystem
    Repeat 20,000 times

    Repeat for sample 2

    Combine both samples
    Sample 10,000 proteins 20,000 times
    Build 95% CI

    Compare medians from sites 1 and 2 with 95% CI

    Rodriguez-Brito (2006). BMC Bioinformatics
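
The resampling steps above can be sketched in a few lines of Python. This is one reading of the slide, not the published implementation: the function names are made up for illustration, sampling is assumed to be with replacement, and the final decision rule (flag a subsystem when either site's median count falls outside the pooled 95% interval) is an assumption about how the medians are compared. The cited paper is the authority on the actual method.

```python
import random
from collections import Counter
from statistics import median


def resampled_counts(proteins, sample_size, repeats, rng):
    """Resample `proteins` (a list of subsystem labels) with replacement
    and record, for every subsystem, its count in each resample."""
    subsystems = sorted(set(proteins))
    counts = {s: [] for s in subsystems}
    for _ in range(repeats):
        tally = Counter(rng.choices(proteins, k=sample_size))
        for s in subsystems:
            counts[s].append(tally[s])  # a Counter returns 0 for absent labels
    return counts


def percentile_interval(values, lower=2.5, upper=97.5):
    """Empirical interval between two percentiles of the resampled counts."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[round(last * lower / 100)], ordered[round(last * upper / 100)]


def compare_sites(site1, site2, sample_size=10_000, repeats=20_000, seed=0):
    """Per subsystem: median count at each site and the pooled 95% interval."""
    rng = random.Random(seed)
    counts1 = resampled_counts(site1, sample_size, repeats, rng)
    counts2 = resampled_counts(site2, sample_size, repeats, rng)
    pooled = resampled_counts(site1 + site2, sample_size, repeats, rng)

    report = {}
    for subsystem, values in pooled.items():
        low, high = percentile_interval(values)
        m1 = median(counts1.get(subsystem, [0]))
        m2 = median(counts2.get(subsystem, [0]))
        report[subsystem] = {
            "median_site1": m1,
            "median_site2": m2,
            "ci95": (low, high),
            # Assumed rule: different if either median leaves the pooled interval.
            "differs": not (low <= m1 <= high and low <= m2 <= high),
        }
    return report
```

The defaults mirror the numbers on the slide, which is slow in pure Python; for a quick trial, something like `compare_sites(site1, site2, sample_size=200, repeats=300)` on small label lists is enough to see the shape of the output.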

  • Subsystem differences & metabolism

    Iron acquisition

    Black Stuff:
    Siderophore enterobactin biosynthesis
    ferric enterobactin transport
    ABC transporter ferrichrome
    ABC transporter heme

    Black stuff: ferrous iron (Fe2+, ferroan [(Mg,Fe)6(Si,Al)4O10(OH)8])

    Red stuff: ferric iron (goethite [FeO(OH)])

  • Nitrification differentiates the samples

    Edwards (2006). BMC Genomics

  • The challenge is explaining the differences between samples

    Red Sample

    Arg, Trp, His
    Ubiquinone
    FA oxidation
    Chemotaxis, Flagella
    Methylglyoxal metabolism

    Black Sample

    Ile, Leu, Val
    Siderophores
    Glycerolipids
    NiFe hydrogenase
    Phenylpropionate degradation

  • We can cheaply compare the important biochemistry happening in different environments

    We don't care which organisms are doing the metabolism, but we know what organisms are there

  • Outline

    The envy is not mine
    A tour around the world, thanks to phage
    People suck
    What is the most successful gene in evolution?
    Is there a Future?

  • Why Phages?

    Phages are viruses that infect bacteria
    10:1 ratio of phages:bacteria
    10^31 phages on the planet
    Specific interactions (probably)
      one virus : one host
    Small genome size
      Higher coverage
    Horizontal gene transfer
      10^25-10^28 bp DNA per year in the oceans
    Can't do fosmids

  • Phages In The World's Oceans

  • Most Marine Phage Sequences are Novel

  • Phages are specific to environments

    Phage Proteomic Tree v. 5 (Edwards, Rohwer)

    Thanks: Mya Breitbart

  • Marine Single-Stranded DNA Viruses

    6% of SAR sequences ssDNA phage (Chlamydia-like Microviridae)

    40% viral particles in SAR are ssDNA phage

    Several full-genome sequences were recovered via de novo assembly of these fragments

    Confirmed by PCR and sequencing

  • SAR Aligned Against the Chlamydia phi 4

    12,297 sequence fragments hit using TBLASTX over a ~4.5 kb genome

    [Figure labels: Individual sequence reads, Chlamydia phi 4 genome, Coverage, Concatenated hits]
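
A coverage track like the one in this figure can be derived from hit coordinates. The talk does not say how the plot was produced, so this is only a minimal sketch: it assumes each hit has already been reduced to a (start, end) interval on the genome (0-based, end-exclusive), and the function names are illustrative.

```python
def coverage_depth(genome_length, hits):
    """Per-base depth from (start, end) hit intervals (0-based, end-exclusive).

    Uses a difference array: +1 where a hit starts, -1 where it ends,
    then a running sum gives the depth at each position.
    """
    delta = [0] * (genome_length + 1)
    for start, end in hits:
        start = max(0, start)          # clip hits to the genome boundaries
        end = min(genome_length, end)
        if start < end:
            delta[start] += 1
            delta[end] -= 1

    depth = []
    running = 0
    for change in delta[:genome_length]:
        running += change
        depth.append(running)
    return depth


def fraction_covered(depth):
    """Share of genome positions touched by at least one hit."""
    return sum(1 for d in depth if d > 0) / len(depth)
```

For a ~4.5 kb genome, `coverage_depth(4500, hits)` would return one depth value per base; concatenating the hits then amounts to reading off the positions where the depth is non-zero.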

  • Outline

    The envy is not mine
    A tour around the world, thanks to phage
    People suck
    What is the most successful gene in evolution?
    Is there a Future?

  • Phages, Reefs, and Human Disturbance

  • Phages, Reefs, and Human Disturbance

    The Northern Line Islands Expedition, 2005

    [Map labels: Christmas, Kingman, Palmyra, Washington, Fanning]

  • Christmas to Kingman Bias in No. Phage Hosts

    Negative numbers mean relatively more phage hosts at Kingman

  • Outline

    The envy is not mine
    A tour around the world, thanks to phage
    People suck
    What is the most successful gene in evolution?
    Is there a Future?

  • Phages enrich for important genes

    Rios Mesquites Stromatolites: No photosynthesis genes in phages

    Pozas Azules Stromatolites: 5 different photosynthesis genes in phages

  • RNR is the most successful reaction in evolution

  • Outline

    The envy is not mine
    A tour around the world, thanks to phage
    People suck
    What is the most successful gene in evolution?
    Is there a Future?

  • Computational Challenges

    Sequence annotations and analysis
      What is there?
      What is it doing?
      How is it doing it?
    Gene predictions in unknowns
      Lutz Krause (Bielefeld)
    Sequence comparisons
      BLAST
      Other ways to rapidly compare short sequences
    What happens when everyone is using 454 sequencing?

  • Sequence data from 21 libraries

    6 million sequences
    600 million bp
    Each BLASTX search takes 1,000 CPU hours
    21 libraries = 21,000 CPU hours or 2.4 CPU years
    Users want repeat runs, TBLASTX, more analysis, more data, more, more, more, more
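
The CPU-year figure is simple arithmetic on the numbers above. A small sketch: the constant names are illustrative, and the core counts in the loop are hypothetical cluster sizes chosen for the example, not anything stated in the talk.

```python
CPU_HOURS_PER_LIBRARY = 1_000   # one BLASTX search per library, from the slide
LIBRARIES = 21
HOURS_PER_YEAR = 24 * 365

total_cpu_hours = CPU_HOURS_PER_LIBRARY * LIBRARIES
cpu_years = total_cpu_hours / HOURS_PER_YEAR


def wall_clock_days(cpu_hours, cores):
    """Elapsed days if the work splits evenly across `cores` (idealized)."""
    return cpu_hours / cores / 24


print(f"{total_cpu_hours:,} CPU hours = {cpu_years:.1f} CPU years")
for cores in (10, 100, 1000):   # hypothetical cluster sizes
    print(f"{cores:>5} cores: {wall_clock_days(total_cpu_hours, cores):.2f} days")
```

Every repeat run or additional TBLASTX pass multiplies the same total, which is why the bioinformatics, not the sequencing, becomes the bottleneck.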

  • SDSU: Forest Rohwer, Beltran Rodriguez-Brito

    USF: Mya Breitbart
    Rohwer Lab: Linda Wegley, Florent Angly, Matt Haynes
    Stromatolites: Janet Seifert (Rice University), Valeria Souza (UNAM, Mexico)
    Math: Peter Salamon, Joe Mahaffy, James Nulton, Ben Felts, David Bangor, Steve Rayhawk, Jennifer Mueller
    MIT: Ed DeLong
    FIG: Veronika Vonstein, Ross Overbeek, Annotators
    ANL: Rick Stevens, Bob Olsen, CI Support
    Also at SDSU: Anca Segall, Stanley Maloy
    UBC: Curtis Suttle, Amy Chan