Inferring microbial ecosystem function from community structure
Jeff S. Bowman and Hugh W. Ducklow
Lamont-Doherty Earth Observatory at Columbia [email protected] | www.polarmicrobes.org
Introduction and Motivation
Marine microbes play a central role in the sustainability of the global ocean by mediating the flow of carbon and nutrients through the marine system. Ecologists commonly study the structure and composition of marine microbial communities by analyzing the 16S rRNA gene. Although these data are well suited to evaluating differences between communities, and to correlating community structure with other environmental parameters (e.g. chlorophyll concentration, temperature, salinity), they are less well suited to describing the ecosystem functions (i.e. metabolic functions) of these communities. Although metagenomics and other techniques can bridge the gap between microbial community structure and ecosystem function, these techniques are costly, data intensive, and low throughput.
Our goal was to develop a high-throughput method for inferring community metabolism from community taxonomy. By evaluating metabolic structure in place of community structure we capture key inter-sample relationships and their impact on microbial ecosystem function. Our method produces pathway genome databases (PGDBs) that describe the metabolic pathways likely to be present in the sample. These PGDBs are amenable to flux-based metabolic modeling. Future work will focus on predicting the flow of elements and energy through these pathways, providing a way to model the impact of changing community structure on biogeochemical cycles.
Here we apply our method to a seasonally variable, depth-stratified microbial community from the West Antarctic Peninsula, a region undergoing unprecedented environmental change.
16S sequence library, the bigger the better!
Obtain all completed genomes
Build 16S rRNA reference tree
Find consensus genome for each tree node
Place reads on reference tree
Extract pathways for each placement
Generate confidence score for sample
Predict metabolic pathways
Calculate confidence for each node
Evaluate genomic plasticity for terminal nodes
Evaluate relative core genome size
Fig. 1. Methods. Our metabolic inference pipeline, PAPRICA [1], uses a phylogenetic placement program (pplacer) [2] to place query reads on a reference tree of 16S rRNA genes from all completed genomes. We determine a consensus genome for each point of placement on the tree, and determine the metabolic pathways represented in these genomes. Separately we determine a confidence score for each point of placement on the reference tree from a novel indicator of genomic stability.
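The consensus and pathway-extraction steps in Fig. 1 can be illustrated with a small sketch. This is not PAPRICA's code: it assumes that the consensus for an internal node is the set of pathways shared by every genome below it, and that a sample's pathway abundance is the sum of reads placed at each node. The tree layout, the function names, and the toy pathway sets are all hypothetical.

```python
from collections import Counter


def consensus_pathways(tree, leaf_pathways, node):
    """Return the pathways shared by every genome descending from `node`.

    `tree` maps an internal node to its children; a node that is absent
    from `tree` is treated as a terminal node (a completed genome).
    """
    children = tree.get(node, [])
    if not children:
        return set(leaf_pathways[node])
    shared = None
    for child in children:
        child_set = consensus_pathways(tree, leaf_pathways, child)
        shared = child_set if shared is None else shared & child_set
    return shared


def sample_pathway_abundance(placements, tree, leaf_pathways):
    """Sum read counts over the consensus pathways of each placement node."""
    abundance = Counter()
    for node, n_reads in placements.items():
        for pathway in consensus_pathways(tree, leaf_pathways, node):
            abundance[pathway] += n_reads
    return abundance


# Hypothetical toy reference tree: root -> (A -> (g1, g2), B)
tree = {"root": ["A", "B"], "A": ["g1", "g2"]}
leaf_pathways = {
    "g1": {"glycolysis", "denitrification"},
    "g2": {"glycolysis"},
    "B": {"glycolysis", "urea degradation"},
}
# Hypothetical read counts per placement node
placements = {"g1": 5, "A": 3, "root": 1}

abundance = sample_pathway_abundance(placements, tree, leaf_pathways)
print(dict(abundance))
```

Reads placed on a terminal node inherit that genome's full pathway set, while reads placed deeper in the tree contribute only the pathways common to the whole clade.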
Fig. 2 diagram labels: terminal node, internal node, core genome, accessory genome.
$$c = \left(\frac{S_{core}}{S_{clade}}\right)(1 - \phi)$$
Fig. 2. Confidence score. Placements can be made to terminal and internal nodes. To determine the confidence ($c$) of a metabolic inference for a given placement we consider the core genome size ($S_{core}$), the mean genome size of the clade ($S_{clade}$), and the mean index of plasticity for the clade ($\phi$; Fig. 3).
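A minimal sketch of the confidence calculation follows. It assumes the score takes the form c = (S_core / S_clade)(1 − φ), which is a reading of Fig. 2 rather than a confirmed formula. The function name, the argument names, and the unit used for genome size are hypothetical and are not taken from PAPRICA.

```python
from statistics import mean


def placement_confidence(core_size, clade_genome_sizes, clade_plasticity):
    """Confidence c for one placement, assuming c = (S_core / S_clade) * (1 - phi).

    core_size          -- size of the clade's core genome (S_core)
    clade_genome_sizes -- genome sizes of the clade members, same unit as core_size
    clade_plasticity   -- plasticity index (0 to 1) of each clade member
    """
    s_clade = mean(clade_genome_sizes)  # mean genome size of the clade
    phi = mean(clade_plasticity)        # mean index of plasticity for the clade
    return (core_size / s_clade) * (1.0 - phi)


# Hypothetical clade of two genomes that share half of their content
c = placement_confidence(2000, [4000, 4000], [0.2, 0.4])
print(round(c, 3))
```

Under this form the score falls when the core genome is small relative to the clade's genomes, or when the clade's genomes are highly plastic.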
Fig. 3. Genomic plasticity of genomes in our database. A major impediment to accurate metabolic inference is the genetic diversity that can exist within even a narrow taxonomic clade. We developed a confidence metric for our inferred metabolisms that is based on the degree of genomic plasticity inherent to each genome. The X-axis gives the position of each genome on our reference tree; the Y-axis gives the degree of plasticity. Unusually plastic genomes are indicated by Roman numerals: I) Nanoarchaeum equitans, II) the Mycobacteria, III) a butyrate-producing bacterium within the Clostridium, IV) Candidatus Hodgkinia cicadicola, V) the Mycoplasma, VI) Sulcia muelleri, VII) Portiera aleyrodidarum, VIII) Buchnera aphidicola, IX) the Oxalobacteraceae.
[Fig. 3 plot. X-axis: terminal node (0 to 2500). Y-axis: relative plasticity (0.0 to 1.0). Labeled outliers: I to IX.]
Fig. 4. Sample locations within the Palmer LTER off the WAP (left) and inter-sample similarity (right). The location of Palmer Station is given by the star. Summer surface and deep samples along with winter surface samples were analyzed [3]. A) Hierarchical clustering of samples by metabolic structure. B) Hierarchical clustering of samples by taxonomic structure. Note duplicate samples in both A and B. C) Distances between samples are in good agreement between the two methods (R² = 0.70). D) Distances are also correlated, albeit less well (R² = 0.40), for the alternate metabolic inference approach PICRUSt [4].
[Fig. 4 map. Sampling sites: NW, NE, SW, SE, off the WAP.]
[Fig. 4 panel A. Clustering by pathway abundance, this method. Dendrogram of duplicate summer shallow, summer deep, and winter shallow samples; y-axis: height.]
[Fig. 4 panel B. Clustering by edge abundance, this method. Dendrogram of the same samples; y-axis: height.]
[Fig. 4 panel C. X-axis: distance by pathway abundance. Y-axis: distance by edge abundance. This method, R² = 0.70.]
[Fig. 4 panel D. X-axis: distance by pathway abundance. Y-axis: distance by OTU abundance. PICRUSt, R² = 0.40.]
[Legend: surface, deep, winter surface.]
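The comparison in Fig. 4C and 4D amounts to computing pairwise distances between samples twice, once from pathway abundance and once from taxonomic (edge or OTU) abundance, then correlating the two sets of distances. The sketch below illustrates that idea only. The poster does not state the distance metric, so Bray-Curtis is an assumption here, and the sample names and abundance values are invented.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def pairwise_distances(table):
    """Distances for every sample pair, in a fixed (sorted) pair order."""
    names = sorted(table)
    return [bray_curtis(table[p], table[q]) for p, q in combinations(names, 2)]


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical abundance tables: sample -> abundance vector
pathway_table = {"s1": [10, 0, 5], "s2": [8, 1, 6], "s3": [0, 9, 2]}
edge_table = {"s1": [20, 1, 0, 4], "s2": [18, 2, 1, 5], "s3": [1, 15, 9, 0]}

d_pathway = pairwise_distances(pathway_table)
d_edge = pairwise_distances(edge_table)
print(r_squared(d_pathway, d_edge))
```

The same distance vectors could be passed to a hierarchical clustering routine to produce dendrograms like those in panels A and B.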
Key Points
• Microbial communities can be described by their metabolic structure.
• Metabolic structure provides information on potential microbial ecosystem functions.
• Representing a microbial community by metabolic structure may provide a way to model the flow of elements and energy through the community.
1. Bowman, J. S., and H. W. Ducklow. 2015. Microbial Communities Can Be Described by Metabolic Structure: A General Framework and Application to a Seasonally Variable, Depth-Stratified Microbial Community from the Coastal West Antarctic Peninsula. PLoS ONE, 10(8): e0135868.
2. Matsen, F., R. Kodner, and E. Armbrust. 2010. pplacer: Linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. BMC Bioinformatics, 11: 538.
3. Luria, C., H. Ducklow, and L. Amaral-Zettler. 2014. Marine bacterial, archaeal and eukaryotic diversity and community structure on the continental shelf of the western Antarctic Peninsula. Aquatic Microbial Ecology, 73(2): 107-121.
4. Langille, M. G. I., et al. 2013. Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences. Nature Biotechnology, 31(9): 814-821.
[Fig. 5 plot. X-axis: anomaly (-0.6 to 0.6); negative values are enriched in surface, positive values are enriched in deep and winter. Color scale: p-value, from 0.05 to 4.57 x 10^-5. Pathway categories: key intracellular metabolism, anaerobic metabolism, nitrogen degradation, carbon degradation, C1 metabolism, autotrophy, mercury degradation.]
Pathways shown:
pyruvate fermentation to lactate
phosphonoacetate degradation
adenosine nucleotides degradation III
creatinine degradation II
D-galacturonate degradation I
triacylglycerol degradation
allantoin degradation to ureidoglycolate I (urea producing)
nitrate reduction I (denitrification)
oxalate degradation II
sucrose degradation IV (sucrose phosphorylase)
galactose degradation I (Leloir pathway)
threonine degradation I
S-methyl-5-thio-alpha-D-ribose 1-phosphate degradation
nitrate reduction IV (dissimilatory)
taurine degradation IV
cholesterol degradation to androstenedione II (cholesterol dehydrogenase)
sitosterol degradation to androstenedione
reactive oxygen species degradation (mammalian)
alkylnitronates degradation
reductive monocarboxylic acid cycle
trehalose degradation VI (periplasmic)
arginine degradation III (arginine decarboxylase/agmatinase pathway)
propionyl CoA degradation
phenylmercury acetate degradation
thymine degradation
glutamate degradation I
uracil degradation I (reductive)
ethanol degradation IV
threonine degradation III (to methylglyoxal)
formaldehyde oxidation II (glutathione-dependent)
ethanol degradation II
valine degradation II
S-methyl-5'-thioadenosine degradation II
guanosine nucleotides degradation III
formate oxidation to CO2
pyrimidine deoxyribonucleosides degradation
2'-deoxy-alpha-D-ribose 1-phosphate degradation
methylglyoxal degradation II
glutamate degradation X
glucose and glucose-1-phosphate degradation
glycogen degradation I
urate biosynthesis/inosine 5'-phosphate degradation
pseudouridine degradation
phenylacetate degradation I (aerobic)
D-mannose degradation
urea degradation I
methionine degradation I (to homocysteine)
aspartate degradation I
citrulline degradation
glutamine degradation I
Columbia / Kiel University Sustainable Oceans Symposium
Fig. 5. What metabolic pathways are differentially present between summer surface samples and winter and deep samples? Having determined that the relationship between samples can be accurately represented by metabolic structure, we can begin to ask ecologically relevant questions. A question frequently posed to community structure data is: how are metabolisms partitioned between niches? In the figure at left, color gives the p-value for a Mann-Whitney test between sample groups (summer surface vs. summer deep and winter surface). The X-axis gives the anomaly, calculated as the difference in sample group means divided by the sum of the sample group means.
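The anomaly defined in the caption, and the rank statistic behind a Mann-Whitney test, can be sketched for a single pathway as follows. The sign convention (negative means enriched in surface) is inferred from the axis labels, and the abundance values are invented. Only the U statistic is computed here; a p-value would normally come from a statistics library, for example `scipy.stats.mannwhitneyu`.

```python
from statistics import mean


def anomaly(surface, deep_winter):
    """Difference in group means divided by the sum of group means.

    Assumed sign convention: negative values mean the pathway is
    enriched in surface samples, positive values in deep and winter samples.
    """
    m_surface, m_other = mean(surface), mean(deep_winter)
    total = m_surface + m_other
    if total == 0:
        return 0.0
    return (m_other - m_surface) / total


def mann_whitney_u(x, y):
    """U statistic for x: number of (xi, yj) pairs with xi > yj; ties count 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Hypothetical abundances of one pathway in each sample group
surface = [8.0, 10.0, 12.0]
deep_winter = [2.0, 3.0, 4.0]

print(anomaly(surface, deep_winter))         # negative: enriched in surface
print(mann_whitney_u(surface, deep_winter))  # every surface value exceeds every other value
```

Because the anomaly is normalized by the sum of the group means, it is bounded between -1 and 1 for non-negative abundances, which makes pathways of very different overall abundance comparable on one axis.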