Leveraging ancestral state reconstruction to infer community function from a single marker gene

Morgan Langille, Dalhousie University, July 10, 2012


This was presented at iEvoBio 2012 in Ottawa.

Transcript of Leveraging ancestral state reconstruction to infer community function from a single marker gene

Page 1: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Morgan Langille

Dalhousie University

July 10, 2012

Page 2: Leveraging ancestral state reconstruction to infer community function from a single marker gene

16S rRNA gene

Standard marker gene for bacterial and archaeal species identification

Recent widespread use in metagenomic microbiome surveys

Limited to telling us: “who is there?”

Page 3: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Using 16S anonymously

16S reads often clustered into OTUs

Alpha diversity

Beta diversity

Rarefaction

Biogeography

Bik et al., 2012

Page 7: Leveraging ancestral state reconstruction to infer community function from a single marker gene

What is in a name?

Real names vs OTU1234

Haloferax

Prochlorococcus

Bacillus

Lee et al. 2010

Page 8: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Extending 16S to functions

Metagenomics: “What are they doing?”

Requires WGS sequencing

More costly

Use microbial databases

~3500 genomes

IMG

NCBI

Etc.

16S gene (or other marker gene)

Find genome

Functional information: KEGG, PFAM, EC, SEED, etc.

Page 9: Leveraging ancestral state reconstruction to infer community function from a single marker gene

PICRUST: Phylogenetic Investigation of Communities by Reconstruction of Unobserved STates

http://picrust.sourceforge.net

Page 10: Leveraging ancestral state reconstruction to infer community function from a single marker gene

PICRUST: Predicting genomes

Reference 16S Tree (Greengenes)

Genome Trait Table (e.g. KEGG, 16S copy number)

Prune taxa with no genome information

Infer ancestral genome traits

Predict genome compositions
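The pruning step can be illustrated with a minimal sketch. It assumes a hypothetical tree encoding (nested tuples of `(label_or_children, branch_length)`) and made-up tip names; it is not PICRUST's own code. When a node is left with a single child, the two branch lengths are merged so that distances between the remaining tips are preserved.

```python
def prune(node, keep):
    """Return a copy of `node` restricted to the tips named in `keep`.

    A tip is (name, branch_length); an internal node is
    ([child, ...], branch_length). Returns None if no tip survives.
    This encoding is an assumption made for illustration.
    """
    label, length = node
    if isinstance(label, str):
        # Tip: keep it only if genome information is available.
        return node if label in keep else None
    kids = [p for p in (prune(child, keep) for child in label) if p is not None]
    if not kids:
        return None
    if len(kids) == 1:
        # Collapse a single-child node by adding the two branch lengths.
        child_label, child_length = kids[0]
        return (child_label, child_length + length)
    return (kids, length)


# Hypothetical reference tree: ((A:1.0, B:2.0):0.5, C:3.0)
tree = ([([("A", 1.0), ("B", 2.0)], 0.5), ("C", 3.0)], 0.0)
sequenced = {"A", "C"}  # tips that have a sequenced genome
pruned = prune(tree, sequenced)
print(pruned)  # ([('A', 1.5), ('C', 3.0)], 0.0)
```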

Page 11: Leveraging ancestral state reconstruction to infer community function from a single marker gene

PICRUST: Predicting metagenomes

OTU Table (16S by Sample)

16S Copy Number Predictions (per genome)

Functional Trait Predictions (per genome)

Normalize OTU Table

Predict Metagenome Functional Traits

Functions by Sample
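The two steps on this slide can be sketched as follows, assuming plain dictionaries as a stand-in for the real tables. The OTU names, function identifiers, and counts are invented for illustration, and the function name `predict_metagenome` is hypothetical. Each OTU count is divided by its predicted 16S copy number to estimate organism abundance, and that abundance is then multiplied by the predicted gene copies per genome and summed per sample.

```python
def predict_metagenome(otu_table, copy_number, traits):
    """Combine an OTU table with per-genome predictions.

    otu_table:   {sample: {otu: 16S read count}}
    copy_number: {otu: predicted 16S copies per genome}
    traits:      {otu: {function: predicted gene copies per genome}}
    Returns {sample: {function: predicted abundance}}.
    """
    result = {}
    for sample, counts in otu_table.items():
        totals = {}
        for otu, count in counts.items():
            # Normalize: reads divided by 16S copies approximates organisms.
            organisms = count / copy_number[otu]
            for function, copies in traits[otu].items():
                totals[function] = totals.get(function, 0.0) + organisms * copies
        result[sample] = totals
    return result


# Hypothetical inputs, for illustration only.
otu_table = {"gut": {"otu1": 10, "otu2": 4}}
copy_number = {"otu1": 5, "otu2": 2}
traits = {"otu1": {"K00001": 1, "K00002": 2}, "otu2": {"K00001": 3}}

functions_by_sample = predict_metagenome(otu_table, copy_number, traits)
print(functions_by_sample)  # {'gut': {'K00001': 8.0, 'K00002': 4.0}}
```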

Page 12: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Ancestral State Reconstruction

Needs to accept continuous data

Must run fast! (8000 traits across 3500 genomes)

Wagner Parsimony (Count software; Csűrös, 2010)

ACE (APE R Library; Paradis, 2004)

PIC

ML

REML

Page 13: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Accuracy for metagenome prediction

1. Obtain metagenomic projects with both WGS and 16S only sequencing

2. Make functional predictions using PICRUST with 16S only data

3. Compare predictions with WGS data
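Step 3 can be sketched as a comparison of two abundance vectors over the same functions. The slides do not say how R2 was computed, so the squared Pearson correlation used here is an assumption, and the numbers are invented.

```python
def r_squared(predicted, observed):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)


# Hypothetical abundances for three functions in one sample.
picrust_prediction = [1.0, 2.0, 3.0]
wgs_observation = [1.0, 3.0, 2.0]
print(r_squared(picrust_prediction, wgs_observation))  # 0.25
```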

Page 14: Leveraging ancestral state reconstruction to infer community function from a single marker gene

ASR methods on metagenomics

HMP Mock Community (known organisms sequenced)

All methods give similar results except “ACE ML”, which has a known problem; the recently added “REML” method solves it

Wagner Parsimony: R2 = 0.92

ACE PIC: R2 = 0.91

ACE REML: R2 = 0.92

ACE ML: R2 = 0.72

Page 15: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Accuracy on metagenomes

Page 16: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Accuracy across various HMP sites

Page 17: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Accuracy for genome prediction

1. Pretend a genome has not been sequenced

2. Predict genome composition using PICRUST

3. Compare predictions to real data

4. Repeat for all genomes
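The four steps above amount to a leave-one-out loop, sketched here under stated assumptions. The `mean_of_others` predictor is a deliberately naive, hypothetical stand-in for the PICRUST prediction step, and the genome names and trait vectors are invented.

```python
def leave_one_out(genomes, predict):
    """Hold out each genome in turn and predict it from the rest.

    genomes: {name: list of trait values}
    predict: function(name, others) -> predicted list of trait values
    Returns {name: (predicted, actual)}.
    """
    results = {}
    for name, actual in genomes.items():
        # Step 1: pretend this genome has not been sequenced.
        others = {k: v for k, v in genomes.items() if k != name}
        # Steps 2 and 3: predict it and keep the pair for comparison.
        results[name] = (predict(name, others), actual)
    return results


def mean_of_others(name, others):
    """Hypothetical stand-in predictor: per-trait mean of the other genomes."""
    return [sum(column) / len(column) for column in zip(*others.values())]


# Hypothetical trait vectors (gene copy numbers for two functions).
genomes = {"a": [1, 0], "b": [3, 2], "c": [5, 4]}
results = leave_one_out(genomes, mean_of_others)
print(results["a"])  # ([4.0, 3.0], [1, 0])
```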

Page 18: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Accuracy depends on distance to closest sequenced genome

R2=-0.72

Page 19: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Accuracy across the tree of life (TOL)

http://itol.embl.de/shared/mlangill

Staphylococcus aureus

E. coli

Page 20: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Accuracy depends on type of functional category

PICRUST Accuracy

Page 21: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Possible applications

1. 16S only microbiome studies: make hypotheses about the functions they encode

2. Complete metagenomic studies: compare functions we “observe” to what we would expect based on species present

3. Aid other metagenomic computational methods: binning, metabolic reconstruction

4. Insight into correlation between species & function: for different taxonomic groups, for different functional classes

Page 22: Leveraging ancestral state reconstruction to infer community function from a single marker gene

Acknowledgements

Rob Beiko

Curtis Huttenhower

Rob Knight

Jesse Zaneveld

Greg Caporaso

Joshua Reyes

Dan Knights

Daniel McDonald