Metagenomics at Second Genome Tanya Yatsunenko [email protected].
• San Francisco based company leveraging microbiome science to enable the discovery and development of human health products through services, collaborations and internal R&D
• Taking a mechanistic approach to discovery
  – First-of-kind microbiome drug discovery platform with pharma partner validation
  – Not Dx, not nutrition, not fecal transplant, not strains as drugs
• Curator of Greengenes™ database (Todd DeSantis)
• QIIME developer (Justin Kuczynski from Rob Knight Lab)
• Over 200 microbiome studies completed to date across industry, government, academic researchers, nutrition companies, and pharma
Metagenomic (and RNA-seq) Pipeline at SG
• Input: Sample1_Left.fastq, Sample1_Right.fastq
• Remove adapters: fastq-mcf
• Remove poor quality bases and short reads: prinseq-lite
• Remove host DNA: Bowtie2
• Remove rRNA: SortMeRNA
• Filtered sequences feed two branches
  – MetaPhlAn: taxonomic table
  – RAPSearch against the BioCyc database: functional annotation
• Outputs: genes, genomes, pathway abundance and coverage
• Samples comparison: PCoA, hierarchical clustering; discriminatory organisms and pathways
• Open source software
• Cloud = Amazon AWS spot instances
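The "remove poor quality bases and short reads" step can be illustrated with a minimal sketch. This is not how prinseq-lite is implemented and not the configuration used at SG; the function names, the Phred+33 encoding, the quality cutoff of 20, and the minimum length of 5 are all assumptions chosen for a tiny example.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 is assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq, quality, min_qual=20):
    """Drop bases from the 3' end until the last base reaches min_qual."""
    scores = phred_scores(quality)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_qual:
        end -= 1
    return seq[:end], quality[:end]


def filter_reads(records, min_qual=20, min_len=5):
    """Yield (name, seq, quality) for reads still long enough after trimming."""
    for name, seq, quality in records:
        seq, quality = trim_3prime(seq, quality, min_qual)
        if len(seq) >= min_len:
            yield name, seq, quality


# Made-up reads: 'I' encodes Phred 40, '#' encodes Phred 2.
reads = [
    ("read1", "ACGTACGTAC", "IIIIIIII##"),  # two low-quality bases at the 3' end
    ("read2", "ACGTAC", "II####"),          # too short once trimmed
]
kept = list(filter_reads(reads))
print(kept)
```

In this toy input, read1 loses its last two bases and read2 is discarded because only two bases survive trimming.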
Functional annotation
Genes -> Enzymes -> Pathways and Strains

Functional assignments

Genes                                    Sample 1   Sample 2
GJXV-1205, GTP cyclohydrolase            1          0
GJXV-2161, Na+-driven multidrug pump     0          10

Enzymes                                  Sample 1   Sample 2
ENZRXNJXV-1763                           1          0
ENZRXNJXV-1765                           0          10

Pathways                                 Sample 1   Sample 2
NAGLIPASYN-PWY                           1          0
PWY-5687                                 0          10

Bacterial strain assignments

Strains                                  Sample 1   Sample 2
Faecalibacterium prausnitzii M-65        1          100
Acidovorax sp. JS42                      0          1

Query Sequence from Sample1: KDYDTAQRVLGNVLVLNIIIGLAFTVLTLIFLD
Connecting genes/enzymes to bacterial genomes
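The Genes -> Enzymes -> Pathways roll-up can be sketched as a summation of per-sample counts through two lookup tables. The gene counts below come from the slide; the two mappings are hypothetical placeholders, since the slide does not state which gene belongs to which enzyme or pathway, and `roll_up` is an illustrative helper rather than part of the SG pipeline.

```python
# Per-sample gene counts from the slide (columns: sample 1, sample 2).
gene_counts = {
    "GJXV-1205": [1, 0],   # GTP cyclohydrolase
    "GJXV-2161": [0, 10],  # Na+-driven multidrug pump
}

# HYPOTHETICAL mappings, used only to make the example run.
gene_to_enzyme = {
    "GJXV-1205": "ENZRXNJXV-1763",
    "GJXV-2161": "ENZRXNJXV-1765",
}
enzyme_to_pathway = {
    "ENZRXNJXV-1763": "NAGLIPASYN-PWY",
    "ENZRXNJXV-1765": "PWY-5687",
}


def roll_up(counts, mapping):
    """Sum per-sample counts of child features into their parent feature."""
    totals = {}
    for child, values in counts.items():
        parent = mapping.get(child)
        if parent is None:
            continue  # features without a parent are dropped
        current = totals.setdefault(parent, [0] * len(values))
        for i, value in enumerate(values):
            current[i] += value
    return totals


enzyme_counts = roll_up(gene_counts, gene_to_enzyme)
pathway_counts = roll_up(enzyme_counts, enzyme_to_pathway)
print(enzyme_counts)
print(pathway_counts)
```

With a one-to-one mapping the counts pass through unchanged; when several genes share an enzyme, their counts are added together.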
Challenges
• Only ~1% of filtered sequences have a significant hit to the BioCyc database
• Assembly with complex microbiota?
• Paired-end sequences are treated independently (for HiSeq)
• Confidence in identification of strain hits from metagenomic and transcriptomic datasets
• Database: KEGG vs BioCyc vs others
• For some samples, forward and reverse reads result in different microbiome profiles
Correlating human with microbial transcriptome
• For each human gene vs microbial gene pair, get correlation coefficient (Rho) and p value
• 23 mln correlations, 400 after Bonferroni correction
[Figure: human gene vs microbial gene correlations; legend: +Rho, -Rho]
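A rank correlation and a Bonferroni cutoff can be sketched in a few lines. The slide says only "Rho", so treating it as Spearman's rank correlation is an assumption, as is the significance level of 0.05; the expression values are made up, and the p value computation itself is left out.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's Rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bonferroni_threshold(alpha, n_tests):
    """Per-test p value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


human = [10, 20, 35, 80, 150]   # made-up human gene expression per sample
microbial = [1, 2, 4, 9, 13]    # made-up microbial enzyme expression per sample
rho = spearman_rho(human, microbial)
threshold = bonferroni_threshold(0.05, 23_000_000)  # 23 mln tests, as on the slide
print(rho, threshold)
```

With 23 million tests, a pair has to reach a p value of roughly 2e-9 to survive the correction, which is consistent with only a few hundred correlations remaining.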
Best correlation: Peptidoglycan glycosyltransferase vs Human gene (inflammasome related)
[Figure: human gene expression (axis 0 to 160) and microbial enzyme expression (axis 0 to 14), by Sample ID]
Best correlation: microbial enzyme vs 5 human genes
[Figure: x-axis: Relative Abundance of MICROBIAL ENZYME RXN-11348 (0 to 14); y-axis: Relative Abundance of HUMAN genes (0 to 160); label: Peptidoglycan glycosyltransferase]
Summary
• Will be happy to discuss our methods and some of the findings
• Currently working on relating human and microbiome functions in disease states