Rob Edwards phage.sdsu/~rob Fellowship for Interpretation of Genomes,

SIO, San Diego, May 2006. What's going on in the environment? Getting a grip on microbial physiology with genomics and metagenomics. Rob Edwards, http://phage.sdsu.edu/~rob. Fellowship for Interpretation of Genomes, San Diego State University, Burnham Institute for Medical Research.

Page 1: Rob Edwards phage.sdsu/~rob Fellowship for Interpretation of Genomes,

What's going on in the environment? Getting a grip on microbial physiology with genomics and metagenomics

Rob Edwards

http://phage.sdsu.edu/~rob

Fellowship for Interpretation of Genomes, San Diego State University, Burnham Institute for Medical Research, IMEC, LLC

SIO, San Diego, May 2006

Page 2:

Outline

• Sequencing statistics scare skeptics

• The SEED database

• Some simply stunning Subsystems

• Mysterious missing methionine metabolism

• Marine metabolism mined from metagenomics

• Fabulous four-five-four for facile functional findings

• Marine phage most puzzling

Page 3:

The Players

• FIG: Fellowship for Interpretation of Genomes

• NMPDR: Natl. Microbial Pathogen Data Resource

• BRC: NIH Bioinformatics Resource Centers

• SEED: The SEED database.

Page 7:

How Many Genomes Have Been Sequenced?

            Complete   Draft   Total
Archaea           26      12      38
Bacteria         342     238     580
Eukarya           29     533     562

Page 8:

When will the 1,000th microbial genome be sequenced?

[Chart: number of complete genomes (y-axis, 1,000 to 5,000) plotted against year (x-axis, 1996 to 2008); ten data points are marked with X.]

Page 9:

Outline

• Sequencing statistics scare skeptics

• The SEED database

• Some simply stunning Subsystems

• Mysterious missing methionine metabolism

• Marine metabolism mined from metagenomics

• Fabulous four-five-four for facile functional findings

• Marine phage most puzzling

Page 10:

http://theseed.uchicago.edu/FIG/index.cgi

The SEED database developed by FIG

Current version:

580 Bacteria (342 complete)
38 Archaea (26 complete)
562 Eukarya (29 complete)
1335 Viruses
2 Environmental Genomes

Page 11:

The problem:

How do you generate consistent annotations for 1,000 genomes?

Page 12:

Basic biology

lacI lacZ lacY lacA

Page 13:

Different types of clustering

< 80 % < 80 % < 80%

Page 14:

Occurrence of clustering in different genomes

[Chart: for each group (Actinobacteria, Aquificae, Bacteroidetes, Chlamydiae, Chloroflexi, Cyanobacteria, Deinococcus-Thermus, Firmicutes, Spirochaetes, Thermotogae, Proteobacteria, and the average), the fraction of genes in clusters (axis 0 to 1) and the number of genomes (axis 0 to 120). Series: clusters of genes with maximum 80% identity; genes in subsystems in clusters; total number of genomes in group.]

Page 15:

Outline

• Sequencing statistics scare skeptics

• The SEED database

• Some simply stunning Subsystems

• Mysterious missing methionine metabolism

• Marine metabolism mined from metagenomics

• Fabulous four-five-four for facile functional findings

• Marine phage most puzzling

Page 16:

The Subsystems Approach to Annotation

• Subsystem is a generalization of “pathway”
  – collection of functional roles jointly involved in a biological process or complex

• Functional Role is the abstract biological function of a gene product
  – atomic, or user-defined; examples:
    • 6-phosphofructokinase (EC 2.7.1.11)
    • LSU ribosomal protein L31p
    • Streptococcal virulence factors
  – does not contain “putative”, “thermostable”, etc.

• Populated subsystem is complete spreadsheet of functions and roles

Page 17:

Subsystems developed based on

• Wet lab
• Chromosomal context
• Metabolic context
• Phylogenetic context
• Microarray data
• Proteomics data
• …

Page 18:

Example Subsystem: Histidine Degradation

Subsystem: Histidine Degradation

1 HutH Histidine ammonia-lyase (EC 4.3.1.3)
2 HutU Urocanate hydratase (EC 4.2.1.49)
3 HutI Imidazolonepropionase (EC 3.5.2.7)
4 GluF Glutamate formiminotransferase (EC 2.1.2.5)
5 HutG Formiminoglutamase (EC 3.5.3.8)
6 NfoD N-formylglutamate deformylase (EC 3.5.1.68)
7 ForI Formiminoglutamic iminohydrolase (EC 3.5.3.13)

• Conversion of histidine to glutamate
• Functional roles defined in table
• Inclusion in subsystem is only by functional role
• Controlled vocabulary …

Page 19:

Subsystem Spreadsheet

• Column headers taken from table of functional roles
• Rows are selected genomes or organisms
• Cells are populated with specific, annotated genes
• Functional variants defined by the annotated roles
• Variant code -1 indicates subsystem is not functional
• Clustering shown by color

Organism Variant HutH HutU HutI GluF HutG NfoD ForI
Bacteroides thetaiotaomicron 1 Q8A4B3 Q8A4A9 Q8A4B1 Q8A4B0
Desulfotalea psychrophila 1 gi51246205 gi51246204 gi51246203 gi51246202
Halobacterium sp. 2 Q9HQD5 Q9HQD8 Q9HQD6 Q9HQD7
Deinococcus radiodurans 2 Q9RZ06 Q9RZ02 Q9RZ05 Q9RZ04
Bacillus subtilis 2 P10944 P25503 P42084 P42068
Caulobacter crescentus 3 P58082 Q9A9MI P58079 Q9A9M0 Q9A9L9
Pseudomonas putida 3 Q88CZ7 Q88CZ6 Q88CZ9 Q88D00 Q88CZ3
Xanthomonas campestris 3 Q8PAA7 P58988 Q8PAA6 Q8PAA8 Q8PAA5
Listeria monocytogenes -1
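The spreadsheet idea (roles as columns, organisms as rows, a variant code per row) can be sketched as a small data structure. This is only an illustration and not how the SEED stores subsystems: the class name, its methods, and the example gene IDs are hypothetical, while the role names and the meaning of variant code -1 come from the slides.

```python
from dataclasses import dataclass, field

# Functional roles as listed on the slide (abbreviation -> role name).
ROLES = {
    "HutH": "Histidine ammonia-lyase (EC 4.3.1.3)",
    "HutU": "Urocanate hydratase (EC 4.2.1.49)",
    "HutI": "Imidazolonepropionase (EC 3.5.2.7)",
    "GluF": "Glutamate formiminotransferase (EC 2.1.2.5)",
    "HutG": "Formiminoglutamase (EC 3.5.3.8)",
    "NfoD": "N-formylglutamate deformylase (EC 3.5.1.68)",
    "ForI": "Formiminoglutamic iminohydrolase (EC 3.5.3.13)",
}


@dataclass
class SpreadsheetRow:
    """One organism's row in a subsystem spreadsheet (hypothetical class)."""
    organism: str
    variant: int                               # -1 means "subsystem not functional"
    genes: dict = field(default_factory=dict)  # role abbreviation -> gene ID

    def is_functional(self) -> bool:
        return self.variant != -1

    def missing_roles(self, required) -> list:
        """Roles from `required` that have no annotated gene in this row."""
        return [role for role in required if role not in self.genes]


# Hypothetical row: the gene IDs are placeholders, not real accessions.
example = SpreadsheetRow(
    "Example organism", 2,
    {"HutH": "gene_001", "HutU": "gene_002", "HutG": "gene_004"},
)
listeria = SpreadsheetRow("Listeria monocytogenes", -1)

print(example.missing_roles(["HutH", "HutU", "HutI", "HutG"]))  # ['HutI']
print(listeria.is_functional())                                 # False
```

A row whose variant implies a role but has no gene in that cell is the kind of gap the talk later calls a "missing gene".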

Page 20:

“The Populated Subsystem”

Subsystem: Histidine Degradation

1 HutH Histidine ammonia-lyase (EC 4.3.1.3)
2 HutU Urocanate hydratase (EC 4.2.1.49)
3 HutI Imidazolonepropionase (EC 3.5.2.7)
4 GluF Glutamate formiminotransferase (EC 2.1.2.5)
5 HutG Formiminoglutamase (EC 3.5.3.8)
6 NfoD N-formylglutamate deformylase (EC 3.5.1.68)
7 ForI Formiminoglutamic iminohydrolase (EC 3.5.3.13)

Subsystem Spreadsheet

Organism Variant HutH HutU HutI GluF HutG NfoD ForI
Bacteroides thetaiotaomicron 1 Q8A4B3 Q8A4A9 Q8A4B1 Q8A4B0
Desulfotalea psychrophila 1 gi51246205 gi51246204 gi51246203 gi51246202
Halobacterium sp. 2 Q9HQD5 Q9HQD8 Q9HQD6 Q9HQD7
Deinococcus radiodurans 2 Q9RZ06 Q9RZ02 Q9RZ05 Q9RZ04
Bacillus subtilis 2 P10944 P25503 P42084 P42068
Caulobacter crescentus 3 P58082 Q9A9MI P58079 Q9A9M0 Q9A9L9
Pseudomonas putida 3 Q88CZ7 Q88CZ6 Q88CZ9 Q88D00 Q88CZ3
Xanthomonas campestris 3 Q8PAA7 P58988 Q8PAA6 Q8PAA8 Q8PAA5
Listeria monocytogenes -1

Page 21:

Subsystem Diagram

• Three functional variants
• Universal subset has three roles, followed by three alternative paths from IV to VI
• No ForI known experimentally

[Diagram: I → II (HutH, releases NH3) → III (HutU, H2O) → IV (HutI, H2O). From IV to VI: HutG (H2O, releases formamide); GluF (tetrahydrofolate to formiminotetrahydrofolate); or ForI (H2O, releases NH3) to V, then NfoD.]

www.nmpdr.org

Page 22:

Subsystem Spreadsheet

• Prediction from subsystems confirmed experimentally

Organism Variant HutH HutU HutI GluF HutG NfoD ForI
Bacteroides thetaiotaomicron 1 Q8A4B3 Q8A4A9 Q8A4B1 Q8A4B0
Desulfotalea psychrophila 1 gi51246205 gi51246204 gi51246203 gi51246202
Halobacterium sp. 2 Q9HQD5 Q9HQD8 Q9HQD6 Q9HQD7
Deinococcus radiodurans 2 Q9RZ06 Q9RZ02 Q9RZ05 Q9RZ04
Bacillus subtilis 2 P10944 P25503 P42084 P42068
Caulobacter crescentus 3 P58082 Q9A9MI P58079 Q9A9M0 Q9A9L9
Pseudomonas putida 3 Q88CZ7 Q88CZ6 Q88CZ9 Q88D00 Q88CZ3
Xanthomonas campestris 3 Q8PAA7 P58988 Q8PAA6 Q8PAA8 Q8PAA5
Listeria monocytogenes -1

Page 23:

Outline

• Sequencing statistics scare skeptics

• The SEED database

• Some simply stunning Subsystems

• Mysterious missing methionine metabolism

• Marine metabolism mined from metagenomics

• Fabulous four-five-four for facile functional findings

• Marine phage most puzzling

Page 24:

How do bacteria make methionine?

acquire homoserine
convert cysteine to cystathionine
convert cystathionine to homocysteine
acquire met or convert homocysteine to methionine
sulfur and acetylhomoserine sulfhydrylase

Page 25:

Column groups: Homoserine activation, Sulfhydrylation, Transsulfuration, Methylation

Organism Variant Code HSDH HK HSST HSAT AHSH/SHSH CTGS CTBL MetH MetE BhmT MTHFR
Nostoc sp. PCC 7120 0 4427 657 619 1093
Synechocystis sp. PCC 6803 0 2356 1112 2469 1144
Thermosynechococcus elongatus BP-1 0 277 1764 1027 1090 1770
Trichodesmium erythraeum IMS101 0 415, 4266 6167106, 1229 2279 4433
Gloeobacter violaceus PCC 7421 0 4295 1127 2500 477 789
Anabaena variabilis ATCC 29413 33 2331 5519 3872 3873 4254, 6365 6434
Nostoc punctiforme 33 2895 6648 5301 5302 4055 1885
Prochlorococcus marinus MED4 66 1204 1764 1714 1715 2 1 1421 295
Prochlorococcus marinus str. MIT 9313 66 1141 426 875 874 225 226 728 2005
Prochlorococcus marinus subsp. marinus str. CCMP1375 66 1148 1064 799 798 404 405 957 176
Prochlorococcus marinus subsp. pastoris str. CCMP1986 66 1047 592 640 639 405 406 874 153
Synechococcus sp. WH 8102 66 706 1476 845 846 669 670 1233 2258
Synechococcus elongatus PCC 7942 0 1397 769 2172 1030 2173 702 639

Page 26:

Column groups: Homoserine activation, Sulfhydrylation, Transsulfuration, Methylation

Organism Variant Code HSDH HK HSST HSAT AHSH/SHSH CTGS CTBL MetH MetE BhmT MTHFR
Nostoc sp. PCC 7120 0 4427 657 619 1093
Synechocystis sp. PCC 6803 0 2356 1112 2469 1144
Thermosynechococcus elongatus BP-1 0 277 1764 1027 1090 1770
Trichodesmium erythraeum IMS101 0 415, 4266 6167106, 1229 2279 4433
Gloeobacter violaceus PCC 7421 0 4295 1127 2500 477 789
Anabaena variabilis ATCC 29413 33 2331 5519 3872 3873 4254, 6365 6434
Nostoc punctiforme 33 2895 6648 5301 5302 4055 1885
Prochlorococcus marinus MED4 66 1204 1764 1714 1715 2 1 1421 295
Prochlorococcus marinus str. MIT 9313 66 1141 426 875 874 225 226 728 2005
Prochlorococcus marinus subsp. marinus str. CCMP1375 66 1148 1064 799 798 404 405 957 176
Prochlorococcus marinus subsp. pastoris str. CCMP1986 66 1047 592 640 639 405 406 874 153
Synechococcus sp. WH 8102 66 706 1476 845 846 669 670 1233 2258
Synechococcus elongatus PCC 7942 0 1397 769 2172 1030 2173 702 639

[Slide annotation: two cells marked “?”, labelled “Missing genes”]

Page 27:

Cyanoseed: http://cyanoseed.theFIG.info

Page 28:

Marineseed: http://theseed.uchicago.edu/FIG/organisms.cgi?show=marine

Page 29:

Subsystems are not just for gene clusters

predicted or measured co-regulation
genome context (virulence islands, prophages, conserved gene clusters)
virulence mechanism
cellular localization
enzymatic activity
common phenotype
combinations of criteria

Page 30:

How much progress has been made?

• 541 subsystems encoded

• 80–85% of the genes in core machinery are contained in subsystems

• 30–35% of the genes in NMPDR organism genomes are contained in subsystems

• 20–30% of the genes in other genomes are contained in subsystems

Page 31:

Outline

• Sequencing statistics scare skeptics

• The SEED database

• Some simply stunning Subsystems

• Mysterious missing methionine metabolism

• Marine metabolism mined from metagenomics

• Fabulous four-five-four for facile functional findings

• Marine phage most puzzling

Page 32:

Metagenomics

[Workflow diagram: sample (200 liters water; 5-500 g fresh fecal matter) → concentrate and purify viruses → epifluorescent microscopy → extract nucleic acids → DNA/RNA → LASL → sequence]

Breitbart et al., multiple papers

Page 33:

Control datasets for metagenome comparisons

Number of proteins in different datasets

Dataset                                          Proteins
Bacteria                                          952,758
Archaea                                            49,694
Eukarya                                           259,653
Acid mine                                           7,588
Sargasso (without Shewanella, Burkholderia)       960,561
Sorcerer II                                   ~13,000,000

Page 34:

Subsystems per million CDS

Page 35:

Determination of Statistical Differences Between Metagenomes

• Take 10,000 proteins from sample 1
• Count frequency of each subsystem
• Repeat 20,000 times
• Repeat for sample 2
• Combine both samples
• Sample 10,000 proteins 20,000 times
• Build 95% CI
• Compare medians from samples 1 and 2 with 95% CI

Rodriguez-Brito (2006). BMC Bioinformatics
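The resampling steps listed above can be sketched in Python. This is a minimal illustration under stated assumptions, not the published implementation: the function names are hypothetical, draws are assumed to be made with replacement (the slide does not say), the 95% interval is taken as the 2.5th to 97.5th percentile of the pooled resamples, and the toy inputs are invented. Each protein is represented only by the name of the subsystem it belongs to.

```python
import random
from statistics import median


def resample_counts(proteins, subsystem, n_draw, n_rep, rng):
    """For each of n_rep repeats, draw n_draw proteins (with replacement,
    an assumption) and count how many belong to `subsystem`."""
    return [
        sum(1 for p in rng.choices(proteins, k=n_draw) if p == subsystem)
        for _ in range(n_rep)
    ]


def compare_samples(sample1, sample2, subsystem,
                    n_draw=10_000, n_rep=20_000, seed=0):
    """Compare the median count in each sample with a 95% interval
    built from resamples of the combined data."""
    rng = random.Random(seed)
    med1 = median(resample_counts(sample1, subsystem, n_draw, n_rep, rng))
    med2 = median(resample_counts(sample2, subsystem, n_draw, n_rep, rng))
    pooled = sorted(
        resample_counts(sample1 + sample2, subsystem, n_draw, n_rep, rng)
    )
    lo = pooled[int(0.025 * n_rep)]       # lower bound of the 95% interval
    hi = pooled[int(0.975 * n_rep) - 1]   # upper bound of the 95% interval
    return {
        "median1": med1,
        "median2": med2,
        "ci": (lo, hi),
        "sample1_differs": not lo <= med1 <= hi,
        "sample2_differs": not lo <= med2 <= hi,
    }


# Toy data (invented): subsystem "A" is common in sample 1, rare in sample 2.
toy1 = ["A"] * 90 + ["B"] * 10
toy2 = ["A"] * 10 + ["B"] * 90
# Small n_draw and n_rep keep the demo fast; the slide uses 10,000 and 20,000.
result = compare_samples(toy1, toy2, "A", n_draw=200, n_rep=300)
print(result)
```

A subsystem whose per-sample median falls outside the pooled interval is one that the procedure would flag as differing between the two metagenomes.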

Page 36:

Sampling Sargasso and “SEED” metagenomes

Page 37:

Comparison of all Subsystems

[Figure labels: More in Sargasso, More in SEED]

Page 38:

Is serine being used as an osmolyte?

• Few trehalose, proline, sucrose synthetic genes

• Serine is most abundant amino acid in ocean (Suttle, Keil)

• Serine is more effective osmoprotectant than glycine betaine (Yancey)

Page 39:

Outline

• Sequencing statistics scare skeptics

• The SEED database

• Some simply stunning Subsystems

• Mysterious missing methionine metabolism

• Marine metabolism mined from metagenomics

• Fabulous four-five-four for facile functional findings

• Marine phage most puzzling

Page 40:

Metagenomics

[Workflow diagram: sample (200 liters water; 5-500 g fresh fecal matter) → concentrate and purify viruses → epifluorescent microscopy → extract nucleic acids → DNA/RNA → LASL (marked “So 2004”) or 454 → sequence]

Breitbart et al., multiple papers

Page 41:

454 Sequence Data (Only from Rohwer Lab, in one year)

• 42 libraries
  – 22 microbial, 20 phage

• 1,028,563,420 bp total
  – 33% of the human genome
  – 95% of all complete and partial bacterial genomes
  – 10% of community sequencing of JGI per year

• 9,933,184 sequences
  – Average 236,511 per library

• Average read length 103.5 bp
  – Av. read length has not increased in 12 months
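The summary figures on this slide can be checked with simple arithmetic. The short sketch below uses the totals from the slide; the human genome size of about 3.1 Gbp is an assumption that does not appear in the talk.

```python
# Totals as given on the slide.
total_bp = 1_028_563_420
n_reads = 9_933_184
n_libraries = 42

# Assumption (not on the slide): a human genome of roughly 3.1 billion bp.
human_genome_bp = 3.1e9

mean_read_length = total_bp / n_reads
reads_per_library = n_reads / n_libraries
fraction_of_human = total_bp / human_genome_bp

print(f"{mean_read_length:.1f} bp")   # 103.5 bp
print(f"{reads_per_library:,.0f}")    # 236,504
print(f"{fraction_of_human:.0%}")     # 33%
```

The simple quotient for reads per library comes to about 236,504, slightly below the 236,511 shown on the slide, so the slide's average may have been computed from slightly different counts.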

Page 42:

The Soudan Mine, Minnesota

Red Stuff: Oxidized
Black Stuff: Reduced

Page 43:

Red and Black Samples Are Different

Cloned and 454-sequenced 16S are indistinguishable

[Figure labels: Black stuff, Red, Cloned Red]

Page 44:

There are different amounts of metabolism in each environment

Page 45:

There are different amounts of substrates in each environment

[Figure labels: Black Stuff, Red Stuff]

Page 46:

But are the differences significant?

• Sample 10,000 proteins from site 1
• Count frequency of each “subsystem”
• Repeat 20,000 times
• Repeat for sample 2
• Combine both samples
• Sample 10,000 proteins 20,000 times
• Build 95% CI
• Compare medians from sites 1 and 2 with 95% CI

Rodriguez-Brito (2006). BMC Bioinformatics

Page 47:

Subsystem differences & metabolism

Iron acquisition

Black Stuff
Siderophore enterobactin biosynthesis
ferric enterobactin transport
ABC transporter ferrichrome
ABC transporter heme

Black stuff: ferrous iron (Fe2+, ferroan [(Mg,Fe)6(Si,Al)4O10(OH)8])

Red stuff: ferric iron (goethite [FeO(OH)])

Page 48:

Nitrification differentiates the samples

Edwards (2006) BMC Genomics

Page 49:

The challenge is explaining the differences between samples

Red Sample
Arg, Trp, His
Ubiquinone
FA oxidation
Chemotaxis, Flagella
Methylglyoxal metabolism

Black Sample
Ile, Leu, Val
Siderophores
Glycerolipids
NiFe hydrogenase
Phenylpropionate degradation

Page 50:

We can cheaply compare the important biochemistry happening in different environments

We don’t care which organisms are doing the metabolism but we know what organisms are there

Page 51:

Outline

• Sequencing statistics scare skeptics

• The SEED database

• Some simply stunning Subsystems

• Mysterious missing methionine metabolism

• Marine metabolism mined from metagenomics

• Fabulous four-five-four for facile functional findings

• Marine phage most puzzling

Page 52:

Phages In The World's Oceans

GOM: 41 samples, 13 sites, 5 years
SAR: 1 sample, 1 site, 1 year
BBC: 85 samples, 38 sites, 8 years
ARC: 56 samples, 16 sites, 1 year
LI: 4 sites, 1 year

Page 53:

Phages, Reefs, and Human Disturbance

The Northern Line Islands Expedition, 2005

[Map labels: Kingman, Palmyra, Washington, Fanning, Christmas]

Page 54:

16S rDNA at each island

Page 55:

16S rDNA of the Proteobacteria

Page 56:

Phages at each island

Page 57:

Christmas to Kingman Bias in No. Phage Hosts

Negative numbers mean relatively more phage hosts at Kingman

Page 58:

Phages In The World's Oceans

GOM: 41 samples, 13 sites, 5 years
SAR: 1 sample, 1 site, 1 year
BBC: 85 samples, 38 sites, 8 years
ARC: 56 samples, 16 sites, 1 year
LI: 4 sites, 1 year

Page 59:

Most Marine Phage Sequences are Novel

Page 60:

Phages are specific to environments

Phage Proteomic Tree v. 5 (Edwards, Rohwer)

[Tree labels: ssDNA, -like, T7-like, T4-like]

Thanks: Mya Breitbart

Page 61:

Marine Single-Stranded DNA Viruses

• 6% of SAR sequences ssDNA phage (Chlamydia-like Microviridae)

• 40% viral particles in SAR are ssDNA phage

• Several full-genome sequences were recovered via de novo assembly of these fragments

• Confirmed by PCR and sequencing

Page 62:

SAR Aligned Against the Chlamydia phi 4 genome

12,297 sequence fragments hit using TBLASTX over a ~4.5 kb genome

[Figure: individual sequence reads and concatenated hits aligned along the Chlamydia phi 4 genome (labels at 3890 bp and 4490 bp), with a coverage track scaled from 0 to 1033.]
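A coverage track like the one on this slide can be derived from the alignment coordinates of the individual hits. The sketch below is an illustration only: the function name and the example intervals are hypothetical, and it assumes each hit has already been reduced to a (start, end) pair of 1-based genome positions, for example parsed from tabular TBLASTX output.

```python
def per_base_coverage(hits, genome_length):
    """Return the number of hits covering each genome position.

    `hits` is an iterable of (start, end) pairs, 1-based and inclusive.
    Start may exceed end for hits on the reverse strand.
    """
    # Difference array: +1 where a hit begins, -1 just past where it ends.
    diff = [0] * (genome_length + 2)
    for start, end in hits:
        lo, hi = sorted((start, end))
        lo, hi = max(lo, 1), min(hi, genome_length)
        if lo > hi:
            continue  # hit lies entirely outside the genome
        diff[lo] += 1
        diff[hi + 1] -= 1

    depth, running = [], 0
    for pos in range(1, genome_length + 1):
        running += diff[pos]
        depth.append(running)
    return depth


# Invented intervals on a 10 bp toy genome; the third hit is reverse-strand.
depth = per_base_coverage([(1, 5), (3, 8), (10, 7)], 10)
print(depth)        # [1, 1, 2, 2, 2, 1, 2, 2, 1, 1]
print(max(depth))   # 2
```

The difference-array approach touches each hit once and each position once, which matters little for a ~4.5 kb genome but keeps the sketch usable for larger references.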

Page 63:

Summary

You only need to remember:

• Subsystems are the best way to annotate genomes

• 454 generates lots of data

• We can use subsystems to find out what is going on in the environment

Page 64:

SDSU: Forest Rohwer, Beltran Rodriguez-Brito, Linda Wegley

USF: Mya Breitbart

University of Bielefeld: Folker Meyer, Lutz Krause

FIG: Veronika Vonstein, Ross Overbeek, Gordon Pusch

ANL: Rick Stevens, Bob Olsen, Terry Disz

Annotators: Gary Olsen, Andrei Ostermann, Olga Zagnitko, Olga Vassieva, Svetlana Gerdes, Ramy Aziz

UBC: Curtis Suttle, Amy Chan