A NOVEL FLOW CYTOMETRY BASED METHODOLOGY FOR RAPID, HIGH-
THROUGHPUT CHARACTERIZATION OF MICROBIOME DYNAMICS IN
ANAEROBIC SYSTEMS
BY
ABHISHEK S. DHOBLE
DISSERTATION
Submitted in partial fulfillment of the requirements
for the degree of Doctor of Philosophy in Agricultural and Biological Engineering
in the Graduate College of the
University of Illinois at Urbana-Champaign, 2016
Urbana, Illinois
Doctoral Committee:
Associate Professor Kaustubh D. Bhalerao, Chair and Director of Research
Associate Professor Michael J. Miller
Associate Professor Kris N. Lambert
Assistant Professor Girish Chowdhary
ABSTRACT
A key challenge in studying complex microbial communities in natural, controlled and
engineered environments is the development of a label-free, high throughput technique to
enable rapid, in-line monitoring of the structure and function of microbiomes sensitive to
perturbations. Here, a novel multidimensional flow cytometry based method has been
demonstrated to monitor and rapidly characterize the dynamics of the complex anaerobic
microbiome associated with perturbations in external environmental factors.
The present study indicates that an autocorrelation analysis between diverging
microbial communities is a simple and rapid tool to monitor perturbations in anaerobic
systems due to addition of various carbon sources. Exploiting multiple measurable
dimensions in flow cytometry such as cell size (FSC or forward scatter), cell
granularity/morphology (SSC or side scatter) and autofluorescence (corresponding to the
same excitation/emission wavelength as in AmCyan standard dye), it is possible to monitor
and rapidly characterize the dynamics of the complex anaerobic microbiome associated with
perturbations in external environmental factors. Further, it is also possible to quantitatively
discriminate between divergent microbiomes, in a manner analogous to community
fingerprinting techniques using automated ribosomal intergenic spacer analysis (ARISA).
While ARISA measures diversity at the genomic level and flow cytometry measures diversity
at the morphological level, a correspondence between the two measures was observed at the
phylum level.
The present study also suggests that machine learning algorithms can be fruitful in the
classification of cytometric fingerprints. With a limited dataset from the carbon source
perturbed anaerobic microbiome, several machine learning algorithms were found to be fast
and comparable in accuracy to traditional microbial ecology statistical analysis. A comparison
between different algorithms based on predictive capabilities suggested that Deep Learning
(DL) was best at predicting the overall community, but Distributed Random Forest (DRF) was
best at predicting the most important putative microbial group(s) in the anaerobic digesters,
viz. methanogens. The utility of the flow cytometry based method has also been demonstrated in
a fully functional industry-scale anaerobic digester to distinguish between microbiome
compositions caused by varying the hydraulic retention time (HRT). The potential utility of the
proposed methodology has also been demonstrated for monitoring the syntrophic resilience of the
anaerobic microbiome perturbed under nanotoxicity.
ACKNOWLEDGEMENTS
I wish to thank my advisor, Dr. Kaustubh D. Bhalerao, for introducing me to the great
fields of flow cytometry and machine learning, which will be the founding pillars of my
career. I’d like to extend my sincere thanks to Dr. Bhalerao for believing in me and for
providing the best advice at the most challenging times. I thank Dr. Bhalerao for teaching me
the best lesson of my life – the lesson of humility – for which I am forever indebted to him. I
am grateful for his continued intellectual support, patience, unrestricted time and bountiful
reservoir of inspiration during my PhD. Greater than the satisfaction of having achieved this
important milestone in academia was the pleasure of knowing and working closely with a
confident, cheerful, down-to-earth personality like Dr. Bhalerao, to whom I extend my
deepest respect and affection.
I am greatly appreciative of having a fantastic, helpful committee. I’d like to express
my deep sense of gratitude to Dr. Michael J. Miller for his valuable support and guidance with
microbiology, for teaching me the functional aspects of food and industrial microbiology in
his class, and for helpful suggestions in the committee meetings. I am equally indebted to Dr.
Kris N. Lambert for his support with ARISA experiments and for letting me use his facilities,
expertise, patience and time. I had a great time serving as his teaching assistant for Genetic
Engineering lab and I thank him for giving me the opportunity. I sincerely thank Dr. Girish
Chowdhary for serving on my committee, providing invaluable suggestions and guiding me
with his expertise in machine learning. I am grateful to Dr. Angela Kent for her cogent
suggestions during the inceptive phase of this dissertation.
I am also thankful to the Roy J. Carver Biotechnology Center (CBC)’s Flow
Cytometry Facility for their services, equipment and resources to carry out this work. I
especially want to thank the director of the flow cytometry facility, Dr. Barbara Pilas, who
made this work possible. The present work is supported by NSF award 0749028 and an
independent grant from the Institute of Sustainability, Energy, and Environment. I’d also like
to extend my sincere thanks to the Post-Doctoral Researchers from my lab - Dr. Chinmay
Soman, Dr. Sadia Bekal and outside the lab – Dr. Christine Xiaoji Liu, Dr. Sujit Jagtap, Dr.
Deepak Kumar Garg, Dr. Kamaljit Banger, for mentoring my progress.
I would like to thank Dr. Chester Brown and J. P. Swigart from The School of
Molecular and Cellular Biology (MCB) for offering teaching assistantship and for their
support in completing the graduate teacher certification and teacher scholar certificate. I
appreciate the time and efforts of the cheerful ABE staff with paperwork, travel and PhD
examinations. A special thanks to our head Dr. K. C. Ting for the incredible leadership and
his system-based thinking, which percolated down to each student in this department.
I would like to thank my lab mates Pratik Lahiri, Xiaofeng Ban, Seokchan Yoo, Nico
Hawley-Weld, Farhan Syed, Safyre Andersen, Allante Whitmore, Xiaolun Chen, Xiaowen
Lin and ABE friends Harshal Maske, Vaskar Dahal, Andres Prada, Michael Stablein, Ana
Martin-Ryals, Wan-Ting Chen, Ranjeet Kumar Jha, Chih-Ting Kuo, Dr. Ravi Challa, Robert
Reis, Dr. Liangcheng Yang, Dr. Siddhartha Varma, Dr. Ming-Hsu Chen and Dr. Haibo
Huang. Thank you for your friendship and support.
Finally I would like to thank my family who believed in me!
TABLE OF CONTENTS
1. INTRODUCTION
1.1. Microbiomes rule the world!
1.2. The anaerobic microbiome
1.3. Need for a microbiome characterization tool for anaerobic digesters
1.4. Overview and goals of this work
1.5. Organization of this work
2. LITERATURE REVIEW
2.1. Background
2.2. Molecular microbial ecology methods for microbial community structure analysis
2.3. Molecular microbial ecology methods for microbial community function analysis
2.4. Future prospects
3. RAPID CHARACTERIZATION OF MICROBIOME DYNAMICS USING FLOW CYTOMETRY
3.1. Background
3.2. Materials and methods
3.3. Results and discussion
3.4. Conclusion and future research directions
4. COMPARISON OF FLOW CYTOMETRY TO TRADITIONAL CULTURE-INDEPENDENT TECHNIQUE LIKE ARISA
4.1. Background
4.2. Materials and methods
4.3. Results and discussion
4.4. Conclusion and future research directions
5. MACHINE LEARNING ANALYSIS OF FLOW CYTOMETRY DATA USING H2O.AI
5.1. Background
5.2. Approach
5.3. Results and discussion
5.4. Conclusions and future research directions
6. APPLICATIONS OF THE PROPOSED FLOW CYTOMETRY BASED METHODOLOGY
6.1. Applications in monitoring community dynamics in operational municipal anaerobic digester
6.2. Applications in evaluating microbiome resilience
6.3. Applications in the proposed campus anaerobic digester
7. CONCLUSIONS AND FUTURE WORK
7.1. Major conclusion of this work
7.2. Future research goals
7.3. Last words
LITERATURE CITED
APPENDIX A: ADDITIONAL EXPERIMENTAL RESULTS AND OUTPUT OF MODEL RUNS
A.1. FSC-SSC plots for all flow cytometry samples showing percentages of cells in the prominent blobs
A.2. Live vs Dead cells in UCSD digester
A.3. E. coli HB 101 growth curves
A.4. Flavin autofluorescence for UCSD samples
A.5. Predictive model run output for UCSD samples
1. INTRODUCTION
1.1. Microbiomes rule the world!
The microbiome plays an important role in maintaining health and wellness in plants
(Chaparro et al., 2012), animals (Henderson et al., 2015) and humans (Balskus, 2016). To
achieve foundational changes in our understanding of health and wellness, and agricultural
and environmental sustainability, there is a recognized need to advance our understanding of
the key players in our ecosystems, viz. microbial communities (Tilman et al., 2011).
Perturbations in the structure and function of a microbiome are now recognized as an
important step in the etiology of infectious disease (Balskus, 2016; Stappenbeck and Virgin,
2016). The role of soil microbes in bio-geo-chemical nutrient cycling (Faure et al., 2009;
Hirsch and Fujishige, 2012) and the disruption of ecosystem functionality due to changes in
the soil microbiota have been well recognized (Q. Li et al., 2016). In the sphere of animal
agriculture, ruminant animals are a vital component of our food systems, and the rumen
microbiome plays a vital role in controlling the effective conversion of animal feed to animal
protein (El Akkad et al., 1966; Henderson et al., 2015). Last but not least, many disease
states caused by microbes like C. difficile (Lyerly et al., 1985) and P. aeruginosa (Stieritz and
Holder, 1975) in human health have their etiology in perturbations of the native, commensal
microbiome. Since a resilient microbiome is considered an integral part of environmental,
agricultural and human ecosystems, there is a need to enhance our understanding of its basic
biological mechanisms and principles.
1.2. The anaerobic microbiome
In today’s era of sustainable energy sources, production of energy from wastes is an
essential component (Angenent et al., 2004). Anaerobic bioreactors are a prominent source of
bioenergy (Pandit et al., 2015) commonly employed for the treatment of wastewaters, the
stabilization of sludges, and the treatment of hazardous and solid waste streams (Dhoble and
Pullammanappallil, 2014). A healthy and functioning microbiome is considered crucial to the
stability and performance of anaerobic bioreactors, and their functional stability is governed by
dynamic interactions within these microbial communities (Werner et al., 2011).
Anaerobic digestion is a process in which syntrophic consortia of microorganisms break down
organic material in the absence of oxygen to produce biogas, a mixture of methane and carbon
dioxide (Manyi-Loh et al., 2013). The digestion process begins with bacterial hydrolysis of
the input materials in order to break down insoluble organic polymers such as carbohydrates
and make them available for utilization by microbial consortia. In this step, large
organic chain molecules such as cellulose and starch are broken down into simpler sugars and
monomers by hydrolyzers. Acidogenic bacteria then convert the sugars and amino acids into
carbon dioxide, hydrogen, ammonia, and organic acids. Acetogenic bacteria then convert
these resulting organic acids into acetic acid, along with additional ammonia, hydrogen, and
carbon dioxide, as shown in Figure 1-1 (Dhoble, 2009). In the absence of methanogens,
fermentation products, volatile organic acids and hydrogen build up. This build-up retards the
overall degradative process, causing a decrease in pH that inhibits growth and stops
fermentation. The overall role of methanogenesis in the biosphere is to complete the
degradation process by removing inhibitory fermentation products (Boone et al., 1993).
Figure 1-1. Anaerobic digestion process (Werner et al., 2011).
1.3. Need for a microbiome characterization tool for anaerobic digesters
There is no simple, in-line, cost-effective method available to distinguish
bioreactors that perform well from those that perform inadequately (Leitão et al., 2006).
Because of this, there is a general perception that anaerobic bioreactors are unreliable or
unstable (Amani et al., 2010). In order to monitor and characterize perturbations in the
structure and function of a microbiome, there is a need to develop tools that measure
microbiome features beyond genomic or functional diversity (Müller and Nebe-von-Caron,
2010). The strict anaerobic requirements are an impediment to culture based studies of
underlying microbial communities (Koch et al., 2014c). It has been demonstrated previously
that microbial communities in anaerobic bioreactors can be characterized using molecular
probes (Raskin et al., 1995). However, the complexity arising from probe design,
characterization and availability of rRNA sequences makes this approach difficult to
implement in an online monitoring system. There is a demand for a quick, simple and
effective method for monitoring the structure and function of bioreactor microbial
communities (Kinet et al., 2016).
Low-cost next generation sequencing (NGS) is not high throughput enough to resolve
dynamic changes in the structure of the microbiome over time (Yu et al., 2015). Flow
cytometry can provide this information in a high throughput manner since it is fast, in-line
and automated, permits sample labeling, and requires small sample volumes and minimal sample
preparation (Müller and Nebe-von-Caron, 2010). It also requires low capital investment, has
a low per-sample cost, produces rich high-dimensional information and is capable of sorting and
classifying a sample (Koch et al., 2014d).
1.4. Overview and goals of this work
Here, a novel methodology has been demonstrated that uses multiple flow cytometry
signals, namely cell size (FSC or forward scatter), cell granularity/morphology (SSC or side
scatter) and autofluorescence (corresponding to the same excitation/emission wavelength as the
AmCyan standard dye), as a high-throughput tool to assess microbial community
structure and dynamics. This approach has the potential to open the door to rapid, label-free
monitoring and characterization of the dynamics of microbial communities in natural as well as
engineered environments.
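As an illustration only, the idea of a fingerprint built from these three signals can be
sketched in Python. The sketch is not the analysis pipeline used in this work: the function
names (fingerprint, bray_curtis, synthetic_sample), the bin count, the log-scale range, the
synthetic event data and the choice of Bray-Curtis dissimilarity are all assumptions made for
the example. It bins (FSC, SSC, autofluorescence) events into a normalized three-dimensional
histogram and compares two such histograms.

```python
import math
import random
from collections import Counter


def fingerprint(events, bins=8, lo=0.0, hi=5.0):
    """Bin log10-scaled (FSC, SSC, autofluorescence) events into a
    normalized 3-D histogram: {bin index tuple: fraction of events}."""
    if not events:
        raise ValueError("at least one event is required")
    width = (hi - lo) / bins
    counts = Counter()
    for event in events:
        key = tuple(
            # Clamp each channel into the bin range [0, bins - 1].
            min(bins - 1, max(0, int((math.log10(max(v, 1.0)) - lo) / width)))
            for v in event
        )
        counts[key] += 1
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def bray_curtis(fp_a, fp_b):
    """Bray-Curtis dissimilarity between two normalized fingerprints
    (0 = identical, 1 = no shared bins)."""
    keys = set(fp_a) | set(fp_b)
    shared = sum(min(fp_a.get(k, 0.0), fp_b.get(k, 0.0)) for k in keys)
    return 1.0 - shared


def synthetic_sample(center, n=2000, seed=0):
    """Hypothetical data: log-normal events around a (FSC, SSC, AF) center."""
    rng = random.Random(seed)
    return [tuple(10 ** rng.gauss(mu, 0.15) for mu in center) for _ in range(n)]


# Two replicates of one synthetic community and one shifted community.
control = fingerprint(synthetic_sample((2.0, 2.0, 1.5), seed=1))
replicate = fingerprint(synthetic_sample((2.0, 2.0, 1.5), seed=2))
perturbed = fingerprint(synthetic_sample((3.0, 2.5, 2.5), seed=3))

d_replicate = bray_curtis(control, replicate)
d_perturbed = bray_curtis(control, perturbed)
```

With these synthetic samples, the two replicates drawn from the same population give a
dissimilarity close to 0, whereas the shifted population gives a value close to 1.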
The central hypothesis of this work is that flow cytometry can classify microbial
consortia based on characteristics such as viability, metabolic activity, and morphology. The
rationale for the proposed methodology is that the combinations of these characteristics form
unique ‘cytometric fingerprints’ (Koch et al., 2014c) that can complement existing
technologies (Park et al., 2005) and may facilitate rapid characterization of the dynamics of
microbial communities. The goal of this study is to demonstrate the potential usefulness and
sensitivity of a novel multi-dimensional flow cytometry based method for rapid,
high-throughput characterization of microbial community evolution and
intracommunity dynamics. Using anaerobic microbial consortia perturbed with controlled
addition of carbon sources (Owen et al., 1979), the microbial diversity information obtained
from an established culture-independent technique, ARISA (Fisher and Triplett, 1999), has
been demonstrated to be comparable with flow cytometry based analysis. Furthermore,
flow cytometry combined with a non-invasive autofluorescence based method has been
demonstrated to potentially detect and classify (type) microbial consortia in an industrial-scale
anaerobic digester operated at varying HRTs, which can provide high throughput
microbiological bioprocess evaluation and support appropriate in-line as well as off-line
intervention strategies in bioprocess industries.
Objectives of this study were to:
1. Create a controlled perturbation in a model ecosystem and determine if flow cytometry
can facilitate rapid characterization of the dynamics of a complex microbiome.
2. Compare flow cytometry to traditional culture-independent techniques like ARISA.
3. Determine if machine learning can be fruitful in the classification of flow cytometry
samples.
4. Demonstrate applications of the proposed methodology in detecting changes in the
microbial community at different HRTs in industrial biodigesters, evaluating
nanotoxicity and bioprocess design.
1.5. Organization of this work
The work that resulted from these objectives is presented in several chapters. Chapter 2
gives an overview of the existing microbiome characterization technologies.
Chapter 3 presents the specific methodology used for the experiments and analysis.
The results of these studies and their comparison with a traditional culture-independent
technique are discussed in Chapter 4. Some aspects of Chapters 3 and 4 have been
published in Bioresource Technology, Volume 220, November 2016, Pages 566–571.
Chapter 5 explores the applications of machine learning in flow cytometry data analysis. The
applications of the proposed methodology are demonstrated in Chapter 6. Finally,
Chapter 7 offers an overall evaluation of the research and suggestions for its extension.
2. LITERATURE REVIEW
2.1. Background
Anaerobic digestion is a microbial process in which syntrophic consortia of
microorganisms break down organic material in the absence of oxygen to produce biogas, a
mixture of methane and carbon dioxide (Boone et al., 1993). Traditional methods of microbial
analysis include culture-type techniques such as microscopy, plating and counting.
However, the strict anaerobic requirements are an impediment to culture based studies of
underlying microbial communities in anaerobic bioreactors (Raskin et al., 1995). There is
a need for a quick, simple and effective real-time tool to monitor the structure and function
of the microbial communities in anaerobic bioreactors.
Because most microorganisms cannot be cultivated in the lab with
current cultivation methods, only a small portion of the microbial communities is known, and the
ecological information revealed by cultivation-based methods is restricted. On the other hand,
molecular techniques that can measure cell DNA and RNA are more direct and robust and can
furnish useful information about the structure (who they are), function (what they do), and
dynamics (how they change through space and time) of these communities.
2.2. Molecular microbial ecology methods for microbial community structure analysis
Most contemporary techniques and technologies analyze marker genes, and the 16S
ribosomal RNA (rRNA) gene has been used almost exclusively as the marker gene in studies of
microbial communities, whether of community composition or of population dynamics.
Microbial community profiling is also commonly used to evaluate and compare different
microbiomes. The profiling methods that have been used include terminal restriction fragment
length polymorphism (T-RFLP), single strand conformation polymorphism (SSCP), and denaturing
gradient gel electrophoresis (DGGE).
Figure 2-1. Overview of molecular analysis tools for anaerobic systems.
[Figure 2-1 content: Molecular ecology methods for anaerobic digesters. To monitor structure: 16S rRNA, DGGE, SSCP, T-RFLP, ARISA, NGS. To monitor function: qPCR, PLFA, SIP, FISH, microarrays.]
2.2.1. 16S rRNA based techniques
Because of the common acceptance of 16S rRNA as the marker for phylogenetic studies
of microbiomes, a variety of 16S rRNA-based techniques have been developed and applied to
microbial ecology studies in the area of anaerobic digestion. Clone libraries of the PCR-amplified
16S rRNA gene (a fragment or the entire gene), followed by traditional sequencing, have been
used for decades in microbial community studies (Nelson et al., 2011). However, because this
traditional sequencing approach sequences individual clones one by one, which is costly and
time consuming, it does not support detailed analysis to reveal the true microbial composition
and diversity, especially when multiple samples are analyzed. Recent advances in, and the
decreasing cost of, next-generation sequencing technologies have made this method largely
obsolete.
2.2.2. Terminal restriction fragment length polymorphism (T-RFLP)
T-RFLP entails amplification of a region of the 16S rRNA gene using primers labeled
with a fluorescent dye at the 5’ end, digestion using one or two restriction endonucleases, and
sizing one or both terminal restriction fragments using Sanger sequencers. T-RFLP has been
successfully used to investigate both the archaeal (Yang et al., 2013b) and bacterial
communities (Westerholm et al., 2011) in different anaerobic digestion systems. The
terminal restriction fragments can be quantified based on the intensity of the fluorescence
signal, but such quantification is not accurate because of PCR bias.
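The digestion and sizing steps can be mimicked in silico. The following Python sketch is an
illustration under stated assumptions, not a procedure taken from the cited studies: the
function names and the toy amplicon sequences are hypothetical, and the enzyme is assumed to be
MspI, which recognizes CCGG and cuts after the first C. Each amplicon is taken to be labeled at
its 5’ end, so only the fragment from the 5’ end to the first cut site is sized.

```python
from collections import defaultdict


def terminal_fragment_length(amplicon, site, cut_offset):
    """Length of the labeled 5' terminal restriction fragment.
    If the recognition site is absent, the whole amplicon is sized."""
    position = amplicon.upper().find(site.upper())
    if position == -1:
        return len(amplicon)
    return position + cut_offset


def trflp_profile(amplicons, site, cut_offset):
    """Map terminal fragment length -> relative abundance,
    given (sequence, copy count) pairs."""
    totals = defaultdict(int)
    for sequence, count in amplicons:
        totals[terminal_fragment_length(sequence, site, cut_offset)] += count
    grand_total = sum(totals.values())
    return {length: n / grand_total for length, n in totals.items()}


# Hypothetical toy amplicons (not real 16S rRNA sequences) with copy counts.
toy_amplicons = [
    ("ATGCATGCCGGTTAA", 4),  # CCGG at index 7 -> 8 nt fragment
    ("ATCCGGATATATAT", 3),   # CCGG at index 2 -> 3 nt fragment
    ("ATATATATAT", 2),       # no site -> full length, 10 nt
    ("GGGGGGGCCGGA", 1),     # different sequence, also an 8 nt fragment
]

# MspI: recognition site CCGG, cut after the first base (C^CGG).
profile = trflp_profile(toy_amplicons, site="CCGG", cut_offset=1)
```

In this toy example, the first and last sequences differ but yield terminal fragments of the
same length, so they collapse into a single peak in the profile.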
2.2.3. Single strand conformation polymorphism (SSCP)
SSCP is another method to profile microbial communities. It is based on the principle that
single-strand DNA fragments of the same length but different sequences can form different
secondary structures, which affect migration during gel electrophoresis and allow separation
of different fragments (Orita et al., 1989). SSCP has been used to investigate the
methanogenic community in several anaerobic digestion systems (Leclerc et al., 2004), but it
is not used as commonly as DGGE (Hori et al., 2006).
2.2.4. Denaturing gradient gel electrophoresis (DGGE)
Like SSCP, DGGE is based on migration differences among 16S rRNA gene
amplicons of the same length but different sequences. Unlike SSCP, DGGE analyzes
double-stranded DNA fragments. It is possible to combine samples that were extracted at
different times within one gel, and hence it is an extremely effective tool for assessing the
ways in which microbial communities change over a given period of time (Turker et al.,
2016a). DGGE is one of the commonly used techniques to screen clone libraries. DGGE has
been widely used for microbial community studies in anaerobic digestion systems because of
its simplicity and rapidity, even as NGS technologies have become increasingly affordable
(Zamanzadehn et al., 2013). It should also be noted that a recent study showed that DGGE
profiles concurred with the detailed community profiles of microbial communities in several
digesters that were determined using 454 pyrosequencing (Nelson, 2011). In studies that
involve a large number of samples, DGGE can be a useful tool to profile all the samples and
identify representative samples to be further analyzed by deep sequencing (Nelson, 2011).
Even though band excision is a powerful feature of DGGE, a band consists of only 150–200 bp
of DNA, which makes phylogenetic analysis challenging. Furthermore, co-migration of bands
leads to poor identification in sequencing (Aydin et al., 2015).
2.2.5. Automated ribosomal intergenic spacer analysis (ARISA)
ARISA constitutes an elegant tool to study microbial diversity. There are very few
studies on anaerobic bioreactors that use ARISA directly. The changes in δ13CH4
and archaeal community structure have been monitored during mesophilic methanization of
municipal solid waste, indicating shifts in archaeal communities (Qu et al., 2009a). Recently,
Limam et al. (2013) utilized ARISA in evaluating the biodegradability of phenol and
bisphenol-A during mesophilic and thermophilic municipal solid waste anaerobic digestion
using 13C-labeled contaminants. However, both of these studies noted the
biases that ARISA may introduce into their conclusions through under- or overestimation of
diversity.
2.2.6. Limitations of traditional culture independent techniques
The major limitation of SSCP, T-RFLP, DGGE and ARISA lies in the difficulty of
obtaining sequence information for the 16S rRNA gene fragments. Nevertheless, despite their
limitations, these profiling methods can still be useful to obtain a snapshot of the
microbial communities from a large number of samples.
2.2.7. Next generation sequencing
The development of the so-called next-generation sequencing (NGS) technologies
makes it possible to reveal ‘who’s there?’ in anaerobic bioreactors by employing massively
parallel sequencing. NGS technologies can produce millions of sequencing reads in a single
instrument run. The two most popular NGS technologies in use are 454 pyrosequencing and
Illumina sequencing on the HiSeq and MiSeq platforms (Illumina). Although the two differ in
sequencing principle, a recent study suggested that both methods were suitable for quantitative
analysis of microbial communities and produced similar results on the same microbial
community (Luo et al., 2012). By directly comparing the sequencing datasets obtained from
these two sequencing methods on the same microbial community, the authors reported that the
results obtained from the two platforms agreed on over 90% of the assembled contigs and
89% of the unassembled reads, as well as on the estimated gene and genome abundance in the
samples (Luo et al., 2012). The major reported disadvantage of these techniques is that they
are very expensive.
2.3. Molecular microbial ecology methods for microbial community function analysis
Although SSCP, DGGE, T-RFLP, ARISA and NGS are useful tools in profiling the
microbial community in the anaerobic bioreactors, they do not support accurate quantification
of individual populations. Hence, in an effort to answer the question “how many are there?”,
investigators have employed various other techniques, as described below:
2.3.1. Quantitative real-time PCR
Quantitative real-time PCR (qPCR) is used to quantify genes, which can help in the
evaluation of microbial structure as well as function. Order-specific primers have been
demonstrated for commonly observed methanogens in anaerobic digestion (Yu et al., 2005),
as have genus-specific primers for Methanoculleus, Methanosarcina, and Methanothermobacter
(Franke-Whittle et al., 2009). Genus-specific primers targeting the 16S rRNA of the hydrolytic
bacterium Clostridium and the syntrophic acetogen Syntrophomonas were also used in qPCR to
investigate the spatial distribution of these genera in different compartments of a plug flow
digester (Talbot et al., 2010). While this technique can be used to quantify the abundance or
expression of a gene of interest, it has been argued that it gives no information on the
identity of the organism being quantified.
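Absolute quantification by qPCR is commonly done against a standard curve, and the arithmetic
can be sketched briefly in Python. The sketch is a generic illustration rather than the
protocol of any study cited above: the function names, the standard dilution series and the
threshold cycle (Ct) values are hypothetical. It fits Ct = slope × log10(copies) + intercept by
least squares, derives the amplification efficiency as 10^(-1/slope) - 1, and reads an unknown
sample off the fitted line.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency per cycle; 1.0 corresponds to perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical ten-fold dilution series of a standard and its Ct values.
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [28.04, 24.72, 21.40, 18.08, 14.76]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
efficiency = amplification_efficiency(slope)
unknown_copies = copies_from_ct(20.0, slope, intercept)
```

A slope near -3.32 corresponds to an efficiency near 100%. The estimate quantifies the targeted
gene only, consistent with the limitation noted above.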
2.3.2. Microarrays
Microarrays have been shown to be usable as biosensors (Burja et al., 2003). They have
been shown to enhance the understanding of the structure, function and dynamics of microbial
communities with a view to enhancing methane production. While they are good for quickly
determining the presence and relative abundance of genes of interest, they are reported to be
quite expensive to analyze.
2.3.3. Stable isotope probing (SIP)
Stable isotope probing (SIP) has been shown to recover the labeled DNA fraction from
methane and methanol assimilation (Radajewski et al., 2000). The technique allows for
functional detection of microbial groups. The major reported advantage of this technique is
that it can be used concurrently with FISH for detecting ammonia oxidation in anaerobic
sludges (Wagner and Loy, 2002).
2.3.4. Fluorescence in situ hybridization (FISH)
Fluorescence in situ hybridization (FISH) is a useful molecular technique to
identify and quantify members of a microbial community. The technique involves rapid fixation
of samples after extraction in order to preserve the cell morphology of the microbial
communities involved. After extraction, samples are washed and hybridized using oligonucleotide
probes that are specific to the gene sequences of the cells involved. The oligonucleotide probes
are labeled with a fluorescent dye so that they can be observed under a fluorescence microscope
(Turker et al., 2016a). FISH has been used to observe the spatial distribution of archaeal and
bacterial cells in anaerobic digesters (Angenent et al., 2004; Yamada et al., 2005). FISH
combined with microautoradiography (FISH-MAR), FISH-SIP and microarrays
have been used to evaluate microbial structure and function (Ariesyady et al., 2007; Ho et al.,
2013). The main claimed advantage of FISH is its ability to localize the specific group of
interest.
2.3.5. Phospholipid fatty acid (PLFA)
Phospholipid fatty acid (PLFA) profiling has been used in other habitats, but its
application to the microbial communities in biodigesters has not been explored much in the
literature. Only one reported study has investigated changes in the
microbial community under different operating conditions in anaerobic digestion
(Schwarzenauer and Illmer, 2012). One important reason could be that archaea cannot be
detected with PLFA. Since methanogenic archaea play a crucial role in anaerobic
bioreactors, PLFA would underestimate the diversity in this system.
2.4. Future prospects
Future work in this area will primarily concentrate on the development of a
real-time, quick, simple and effective method for monitoring the structure and function of the
anaerobic bioreactor using the direct detection of specific genes or gene products within single
microbial cells by advanced FISH techniques. To the best of my knowledge, no comprehensive
study exists for anaerobic digesters that combines single-cell identification of microbes, by
detecting signature regions in their rRNA molecules using Flow-FISH, with a comparison of
microbial community morphology obtained from flow cytometry against microbial diversity
information obtained from ARISA. The methodology is also novel to such
studies and has the potential to yield informative models for understanding and engineering
the function of other complex microbial communities, such as those in the human gut, soils and
oceans.
Figure 2-2. Comparison of processing timeline for NGS (~14 h) vs flow cytometry (~1 h) indicating NGS is not high
throughput enough to resolve dynamic changes in the structure of the microbiome over time while flow cytometry can provide this information in a high throughput manner.
3. Rapid characterization of microbiome dynamics using flow cytometry
In this section, a multidimensional, multi-parameter flow cytometry based method is
demonstrated using anaerobic microbial consortia perturbed with the controlled addition
of various carbon sources. The utility of an autocorrelation function, of the kind commonly
used for time series data, is demonstrated using multiple flow cytometry signals as a
high-throughput tool to assess microbial community structure and dynamics. The novel
methodology presented here has the potential to benefit the research community in various
fields of life sciences and microbial ecology.
3.1. Background
The model microbial community used here is from the operational anaerobic digesters
of a wastewater treatment plant. This model ecosystem was selected
because it is an active, interactive community rather than just a combination of different
microbes. The performance of anaerobic bioreactors and their functional stability are governed
by dynamic interactions within these microbial communities. The metabolic process of anaerobic
digestion can be divided into four sequential steps: hydrolysis, acidogenesis, acetogenesis, and
methanogenesis (Christy et al., 2014). Additional pathways are involved in nutrient removal
(i.e. the formation of inorganic nitrogenous and sulfurous species such as ammonia and
hydrogen sulfide) which compete for the intermediates of methane production (i.e. volatile
fatty acids) (Volmer et al., 2015). Controlled perturbations can be created by the addition of
carbon sources preferred by the microbiome at each stage of anaerobic digestion
(Gunaseelan, 1997). The first step of anaerobic digestion is hydrolysis, the
breakdown of large organic particulates and macromolecules like cellulose into smaller soluble
compounds like glucose (Siegert and Banks, 2005). The second step of
anaerobic digestion is acidogenesis: the further breakdown of soluble organics into volatile
fatty acids (VFAs) by various bacterial species (Chen, 2010). The acidogenesis step can be
depicted with controlled addition of glucose as demonstrated previously (Akunna et al., 1993).
Through acetogenesis, which can be depicted with controlled addition of butyric acid or
propionic acid, various intermediate volatile fatty acid compounds are converted into acetic
acid and other single-carbon compounds (Akunna et al., 1993). Finally, in the methanogenesis
step, acetoclastic methanogens convert acetic acid into methane and carbon dioxide by acetate
decarboxylation, while other methanogens convert hydrogen and carbon dioxide into methane
(Daniels, 1984). The acetoclastic methanogenesis can be depicted with the controlled addition
of acetic acid (Chen, 2010). The microbial communities that perform each of these four steps
of anaerobic digestion can be highly transient, and can vary considerably with minor shifts in
operating conditions; hence there is a need for a promising routine on-site methodology
suitable for the detection of stability/variation/disturbance of complex microbial communities
involved in anaerobic digesters (Kinet et al., 2016).
For analyzing microbial community dynamics in anaerobic systems using cytometric
fingerprinting, four tools have been proposed thus far: Dalmatian Plot (Bombach et al., 2011),
Cytometric Histogram Image Comparison (CHIC) (Koch et al., 2013a), Cytometric Barcoding
(CyBar) (Koch et al., 2014c), and FlowFP (Rogers and Holyst, 2009). The Dalmatian Plot
has been shown to be the most sensitive to operator impact but is still considered useful for
providing an overview of community shifts. CHIC, CyBar, and FlowFP showed less operator
dependence and gave highly resolved information on community structure variation at
different detection levels (Koch et al., 2014c). All these tools are based on two-dimensional
cytometric fingerprints. The methodology proposed in this section is termed ‘novel’
because it is based on three-dimensional flow cytometry signals, viz. cell size (FSC or forward
scatter), cell granularity/morphology (SSC or side scatter) and autofluorescence
(corresponding to the same excitation/emission wavelength as in AmCyan standard dye).
3.2. Materials and methods
3.2.1. Source of anaerobic culture and controlled perturbations
Samples of the anaerobic microbial communities were collected from the mesophilic
anaerobic digester of Urbana and Champaign Sanitary District (UCSD), Urbana, IL, as shown
in Figure 3-1. Laboratory cultures were set up to enrich the community with different
carbon sources to introduce biases in the underlying microbial communities. The cultures
were incubated in Corning No. 1460, 250 ml serum bottles. In each serum bottle, 100 ml of
inoculum and the corresponding nutrient solution were added. Each bottle was fed with 2000
mg/L COD of one of the following substrates: glucose (GLUC), cellulose (CELL), propionate
(PROP), butyrate (BUTY), acetate (ACET) or waste activated sludge (SLUD).
Figure 3-1. Aerial view of UCSD's wastewater treatment plant showing source of anaerobic culture.
Figure 3-2. Experimental set-up showing biogas assays.
The contents of the bottles were sparged with nitrogen gas for about five minutes before
sealing. Bottles were sealed with butyl rubber serum caps and crimped with aluminum seals of
appropriate size. The sealed bottles were incubated at 35 ± 2 °C. Each experiment was
conducted in triplicate, accompanied by controls containing only inoculated medium. In all
cases, biogas production was measured every 24 h using a water displacement column. After
gas measurement, the reactors were shaken manually once a day. Moisture content, total and
particulate volatile solids, soluble chemical oxygen demand and pH were measured according
to Standard Methods for the Examination of Water and Wastewater (APHA, 1998). The peak
day samples correspond to the maximum biogas production observed, as shown in Figures
3-5, 3-6, and 3-7. Samples with no additional carbon source were labeled NONE_0
for the day-zero sample and NONE_2 and NONE_5 for the peak biogas day and fifth-day samples,
respectively. The experimental design is shown in Figure 3-3.
3.2.2. Flow cytometric analysis
Samples of the anaerobic microbial communities were collected from each serum bottle
via a syringe with an 18-gauge needle every 24 h following the biogas measurement. The initial
sample (NONE_0), which was the fresh inoculum collected from UCSD, was the same for all
the assays. A 750 μL sample from each serum bottle was strained prior to flow cytometry
using a BD Falcon 12x75 mm Tube with Cell Strainer Cap having a 35 μm nylon mesh
(Catalog No. 352235). The strained samples were suspended in 1X PBS. Analyses were
performed immediately on a BD Biosciences LSR II Flow Cytometry Analyzer. The excitation
laser wavelength was 405 nm. Autofluorescence was measured as light passing a 450/50 PMT B
band pass filter with no long pass dichroic mirror. Signals were amplified with a 4-decade log
amplifier and collected at a rate of approximately 1000 events/s. A total of 100,000 events
were collected for each sample.
Figure 3-3. Experimental design.
3.2.3. Statistical analysis
For the analysis of flow cytometry data, the .fcs files obtained from the BD Biosciences LSR
II Flow Cytometry Analyzer were imported, and distance matrices were calculated
using Root Mean Square Distance (RMSD) analysis. A 3D structure based on the FSC-A,
SSC-A and Autofluorescence-A data was created, with 50 bins defined along
each dimension as shown in Figure 3-4. Data visualization and ordination analyses were
conducted using the R programming language. Data from six equally spaced time points were
used for an autocorrelation analysis of the flow cytometry data. For every carbon source and
each replicate, a vector was assembled consisting of the distances from the NONE_0 sample
through the 5 d samples. For a given carbon source, the matrix containing the
time series data was thus obtained for each replicate.
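The binning and distance steps described above can be illustrated with a minimal Python sketch. It is not the code used in this work (the analysis here was done in R); the function names, the normalization of voxel counts to fractions, the channel bounds and the example events are all assumptions made for illustration.

```python
import math
from collections import Counter

def voxel_fingerprint(events, lo, hi, bins=50):
    """Bin (FSC-A, SSC-A, autofluorescence-A) events into a sparse 3D histogram.

    events: iterable of 3-tuples; lo/hi: assumed per-channel lower/upper bounds.
    Returns {voxel index: fraction of events in that voxel}.
    """
    counts = Counter()
    for event in events:
        voxel = []
        for value, low, high in zip(event, lo, hi):
            frac = (value - low) / (high - low)
            # Clamp so values on or beyond a bound fall in the first/last bin.
            voxel.append(min(bins - 1, max(0, int(frac * bins))))
        counts[tuple(voxel)] += 1
    total = sum(counts.values())
    return {voxel: n / total for voxel, n in counts.items()}

def rmsd(fp_a, fp_b, bins=50):
    """Root mean square difference over all bins**3 voxels (empty voxels count as 0)."""
    keys = set(fp_a) | set(fp_b)
    squared = sum((fp_a.get(k, 0.0) - fp_b.get(k, 0.0)) ** 2 for k in keys)
    return math.sqrt(squared / bins ** 3)

# Hypothetical events in log10 units on a 4-decade scale; not data from this study.
day0 = [(1.0, 1.2, 0.5), (1.1, 1.3, 0.6), (2.5, 2.0, 1.0)]
day1 = [(1.0, 1.2, 0.5), (3.0, 2.8, 2.2), (3.1, 2.9, 2.3)]
lo, hi = (0.0, 0.0, 0.0), (4.0, 4.0, 4.0)
d = rmsd(voxel_fingerprint(day0, lo, hi), voxel_fingerprint(day1, lo, hi))
```

Storing only occupied voxels keeps the 50 x 50 x 50 grid sparse; the division by bins**3 still averages over every voxel, so the value matches a dense computation under the same assumptions.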
Figure 3-4. RMSD 3D visualization showing binning strategy along each dimension.
3.3. Results and discussion
3.3.1. Trends in biogas production depicting controlled perturbations
The cumulative and differential biogas productions of putative hydrolyzers and
acidogens fed with CELL and GLUC, respectively, are shown in Figure 3-5.
Putative hydrolyzers reached a peak biogas production of 271.33 mL on day 2, out of 430 mL
cumulative, while putative acidogens reached a peak biogas production of 146.33 mL as early as
day 1, with a cumulative biogas production over 5 days of 264.17 mL. The delay in reaching the
peak biogas production for putative hydrolyzers compared to putative acidogens can be
attributed to the fact that hydrolysis is generally considered to have a slow reaction rate and
is therefore regarded as the rate-limiting step of anaerobic digestion (Brummeler et al., 2007).
Hydrolysis rates depend on biomass concentration, extracellular enzyme production,
substrate concentration, and the substrate’s specific surface area, and difficulties with
hydrolysis can be partly attributed to a feed material’s large particle size (Siegert and Banks,
2005). Acidogenesis, on the other hand, represents the breakdown of easily soluble organics like
glucose into butyric acid, propionic acid, and acetic acid by various bacterial species of
Clostridium, Enterobacter, Syntrophobacter, and others (Chen, 2010). Figure 3-6 shows the
cumulative and differential biogas productions for putative acetogens. PROP reached a peak
biogas production of 114.67 mL on day 2, while BUTY took 4 days to reach its peak biogas
production of 47.33 mL. The cumulative biogas production for BUTY (121.33 mL) was also
less than that for PROP (252 mL). These results suggest that the community might favor
propionate-utilizing acetogens (genera Smithella, Syntrophobacter, and Pelotomaculum) over
butyrate-utilizing acetogens (genera Syntrophus and Syntrophomonas). Genomic analysis of
total community data might resolve this divergence.
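As a small illustration of how the cumulative curve and the peak day relate to the daily (differential) readings, the following Python sketch may help. The function name and the daily volumes are hypothetical and are not measurements from this study.

```python
from itertools import accumulate

def summarize_biogas(daily_ml):
    """daily_ml[i] is the biogas volume (mL) measured on day i + 1.

    Returns the running cumulative volumes and the 1-based day of peak production.
    """
    cumulative = list(accumulate(daily_ml))
    peak_day = max(range(len(daily_ml)), key=daily_ml.__getitem__) + 1
    return cumulative, peak_day

# Hypothetical daily readings (mL) for one bottle; not data from this study.
daily = [40.0, 95.0, 60.0, 30.0, 15.0]
cumulative, peak_day = summarize_biogas(daily)
```

In this sketch the peak day is simply the day with the largest daily reading, which is the sense in which "peak day" samples are assumed to have been chosen.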
Figure 3-5. Cumulative (top) and differential (bottom) biogas production for putative hydrolyzers and acidogens.
Figure 3-7. Cumulative (top) and differential (bottom) biogas production for putative methanogens.
Cumulative and differential biogas productions for putative methanogens are
shown in Figure 3-7. SLUD reached a peak biogas production of 34.33 mL on day 2,
while ACET took 4 days to reach a peak biogas production of 30 mL. The cumulative
biogas productions for SLUD and ACET were 97 mL and 62.67 mL, respectively. Surprisingly,
the total biogas production for ACET (62.67 mL) was less than that of the negative control
sample NONE (74.67 mL). Acetoclastic methanogens convert acetic acid into methane and
carbon dioxide by acetate decarboxylation, while other methanogens convert hydrogen and
carbon dioxide into methane (Daniels, 1984). Acetoclastic methanogens are slow growers
compared to hydrogenotrophic methanogens (Christgen et al., 2015). Since the NONE samples
might have had traces of sludge left in them, and since members of the microbial community are
able to feed on one another to maintain steady biogas production, it is possible that the
experimental time for the ACET sample was not long enough to show the true acetate
decarboxylation capability.
3.3.2. Perturbation monitoring with autocorrelation between divergent microbiomes
Figure 3-8 shows the autocorrelation analysis at delays of 1-4 d of flow cytometry data
with six equally spaced time periods. Under controlled perturbations, each culture spiked with
a specific carbon source diverges, as evidenced by the increasing dissimilarity with time, and
hence depicts the continuous evolution of the culture over time. The dissimilarity between
samples is smallest for a time delay of 1 d, and as the delay increases the dissimilarity
increases proportionally. An autocorrelation function commonly seen in time series data
(Wei, 1990) would be a close analogue of the analysis presented here. GLUC exhibits increasing
distance with delay, suggesting it is a divergent culture; in contrast, PROP and BUTY are
somewhat stable. The behavior of ACET resembles that of the controls
(NONE, SLUD), since all three increased sharply at a delay of 4 d.
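One way to read the delay-based analysis is as the average dissimilarity between samples taken a fixed number of days apart. The Python sketch below follows that reading; it is an assumption about the computation rather than the procedure actually used, and the function name and the one-dimensional example "fingerprints" are hypothetical.

```python
def mean_distance_by_delay(fingerprints, distance, delays=(1, 2, 3, 4)):
    """Average distance between samples taken `delay` time points apart.

    fingerprints: one fingerprint per equally spaced time point (e.g. day 0..5).
    distance: function returning a dissimilarity between two fingerprints.
    """
    result = {}
    for delay in delays:
        pairs = [(t, t + delay) for t in range(len(fingerprints) - delay)]
        result[delay] = sum(distance(fingerprints[a], fingerprints[b])
                            for a, b in pairs) / len(pairs)
    return result

# Hypothetical 1-D "fingerprints" drifting steadily over six daily samples.
series = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
profile = mean_distance_by_delay(series, lambda a, b: abs(a - b))
```

For a steadily drifting culture such as this toy series, the mean distance grows in proportion to the delay, whereas a stable culture would give a roughly flat profile.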
Figure 3-9 shows the day-to-day changes (delay = 1) plotted as a function of the day from
which the change is measured. The point corresponding to
minimum change portrays a stable extremum in the colony structure. The greatest
dissimilarity is observed between days 1 and 2, which can be interpreted as the time at which
the greatest perturbation was observed. The extremum in the distance can be interpreted as the
time at which the phenotype reaches a stable point under perturbation.
Figure 3-8. Autocorrelation analysis on flow cytometry data with six equally spaced time periods.
Figure 3-9. Autocorrelation analysis on flow cytometry data with day-to-day changes.
When anaerobic systems were subjected to substrate shock loadings, the recovery periods
lasted from a few days to weeks, as shown in previous perturbation experiments
(McCarty and Mosey, 1991). The magnitude of the perturbation determines how much time an
anaerobic community takes to return to its fully functional condition (Cohen et al.,
1982). As shown in Figures 3-8 and 3-9, ACET along with the controls NONE and SLUD
showed sharp increases at a delay of 4 d. There is evidence that
subdominant communities that take part in the reactive capacity of the digester ecosystem
help it face perturbations and foster functional stability (Alsouleman et al., 2016; Y.-F. Li
et al., 2016; Turker et al., 2016b; Venkiteshwaran et al., 2015). Certain communities can even
grow and remain in the digester at subdominant levels and yet be capable of immediate and
significant responses to environmental changes (Blasco et al., 2014; Ito et al., 2012; Nelson et
al., 2011). In the presented autocorrelation function based approach, cytometric
fingerprints based on morphology (FSC), granularity/internal complexity (SSC) and metabolic
activity (autofluorescence) have been exploited to capture microbiome dynamics ranging
from active to subdominant communities. This demonstrates the usefulness of flow
cytometry as a reliable, high throughput tool to resolve dynamic changes in stable vs
perturbed microbiomes over time.
3.4. Conclusion and future research directions
The present study demonstrates the possibility of quantitative discrimination between
divergent microbiomes using anaerobic microbial consortia perturbed with the controlled
addition of various carbon sources. By exploiting the three-dimensional flow cytometry signals,
namely cell size (FSC or forward scatter), cell granularity/morphology (SSC or side scatter)
and autofluorescence (corresponding to the same excitation/emission wavelength as in
AmCyan standard dye), it is possible to monitor and rapidly characterize the dynamics of the
complex anaerobic microbiome associated with perturbations in external environmental
factors.
Further, depending on the availability of species-specific antibodies, flow cytometry
can be used for differential detection and quantification of the microbiome down to the
species level. Fluorescent antibodies can be applied to the identification of single cells in the
environment. Some regions of the rRNAs have remained essentially unchanged in all
sequenced species; these can be used as targets for the proposed rRNA probes, which could
provide a sensitive, high-throughput analytical technique that can rapidly
detect changes in the composition of a complex microbiota. Flow cytometry coupled with the
live/dead differential staining dyes SYBR Green I (SGI) and propidium iodide (PI) may also
be used to quantify and study other essential characteristics of the microbiome.
In order to develop a promising, routine on-site monitoring tool suitable for the
detection of stability/variation/disturbance of complex microbial communities involved in
anaerobic digesters or any other bioprocess, the proposed methodology can be expanded to
multiple parameters beyond those used in the present study. Total DNA content (DAPI
staining), metabolic activity, and fluorescence in situ hybridization (FISH) rRNA probes are
some of the parameters that can easily be integrated into the existing RMSD based tool. In
addition to serving as a reliable, rapid, high-throughput, routine characterization tool for
bioprocess monitoring, the presented methodology has the potential to benefit the research
community in various fields of life sciences and microbial ecology.
4. Comparison of flow cytometry to traditional culture-independent
technique like ARISA
In order to monitor and characterize perturbations in the structure and function of a
microbiome, there is a need to develop tools that measure microbiome features beyond
genomic or functional diversity. However, any new microbiome characterization methodology
will always be evaluated within the context of the status quo of genomics-based approaches.
Hence, the possibility of quantitative discrimination between divergent microbiomes, analogous
to community fingerprinting using automated ribosomal intergenic spacer analysis
(ARISA), has been evaluated here, and ARISA has been compared with flow cytometry in the
broader context of monitoring and rapidly characterizing the dynamics of the complex anaerobic
microbiome associated with perturbations in external environmental factors.
4.1. Background
While sequencing-based approaches can provide very large amounts of information, the
labor required to process the samples (e.g. to amplify genomic regions and to label them with
barcodes) prior to characterization limits their throughput, so that they are not high
throughput enough to resolve dynamic
changes in the structure of the microbiome over time (Ong et al., 2013). Flow cytometry can
provide this information in a high throughput manner since it is fast, inline and automated,
permits sample labeling, and requires small sample volumes and minimal sample preparation
(Müller and Nebe-von-Caron, 2010). It also requires low capital investment, has a low
per-sample cost, produces rich high-dimensional information and is capable of sorting and
classifying a sample (Koch et al., 2014c).
Since the flow cytometry based methodology does not rely on traditional species or
subtype-level enumeration of microbial species within a consortium to derive a classification
scheme, it is somewhat challenging to defend within the context of the status quo of
genomics-based approaches (Günther et al., 2016; Koch et al., 2014a, 2014c, 2014d, 2013a,
2013b; Melzer et al., 2015). Flow cytometry is still perceived to be a tool and database
development activity with the potential to impact the methodology of microbial ecology
studies, rather than a specific system-level hypothesis pertaining to microbial ecosystems
(Amann et al., 1990; Kinet et al., 2016; Saeys et al., 2016; Xue et al., 2016). The goal of this
exercise is to establish the proposed methodology as (i) increasing the accuracy and
throughput of existing phenotypic and microbial data acquisition and classification, (ii)
potentially transforming the genomic-based analysis pipeline by serving as a preliminary
screening step, and (iii) leading to a complementary approach to genomics-based methods
with the inherent capability of addressing detection / identification / classification problems in
the science of microbiome characterization.
The rationale for choosing ARISA for the comparison is that it is a proven
community fingerprinting technique for microbial community analysis that provides a means
of comparing differing environments or treatment impacts without the bias imposed by
culture-dependent approaches (Fisher and Triplett, 1999). ARISA has also been successfully used
for evaluating community dynamics in the model ecosystem proposed here. It has been used
to observe the microbial community dynamics in the context of evaluating biodegradability
during mesophilic and thermophilic municipal solid waste anaerobic digestion (Limam et al.,
2014). It has also been used to explore the succession of the microbial community in a study of
the methanogenic pathway during thermophilic anaerobic digestion, where the archaeal and
bacterial community structures over the course of the incubation were analyzed by ARISA,
and the number, position and density of the bands on the ARISA profile were used to
explore microbial dynamics (Lin et al., 2013). While ARISA measures diversity at the
genomic level and flow cytometry measures diversity at the morphological level, the
comparison between the two will be a stepping stone towards potential complementarity and
integration into a routine bioprocess monitoring pipeline.
4.2. Materials and methods
4.2.1. ARISA analysis
Samples for DNA isolation were collected alongside the flow cytometry samples.
Total genomic DNA was extracted from 1.5 mL of leachate from each assay using the
MoBio PowerFecal DNA Isolation Kit (MoBio Laboratories, Inc., Carlsbad, CA) following the
manufacturer’s instructions. Purity of the extracted DNA was determined by measuring the
260/280 nm absorbance ratios. The extracted DNA samples were stored at -20 °C before
analysis. ARISA was conducted as described by Fisher and Triplett (1999) with the following
modifications: DNA in the range of 15 to 250 ng was used for ARISA analysis. Each PCR
reaction contained 10 pmoles of each primer pair and 4 µl of Taq 5X mix (New England BioLabs)
in a total volume of 20 µl; five primer pairs were tested for each DNA sample. The
reaction was run on a thermocycler using touchdown PCR: 95 °C (30 s); 20 cycles of
[95 °C (30 s), 65 °C (30 s) decreasing by 0.5 °C per cycle, 68 °C (1 min)]; 16 cycles of
[95 °C (30 s), 55 °C (30 s), 68 °C (1 min)]; and 68 °C (7 min). The resulting PCR products
were confirmed by agarose gel electrophoresis, then diluted threefold and sent for fragment
analysis at the W. M. Keck Center for Comparative and Functional Genomics at the University of
Illinois at Urbana-Champaign. DNA samples were analyzed by denaturing capillary electrophoresis
using an ABI 3730XL Genetic Analyzer (Applied Biosystems Inc.).
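The touchdown program can be made concrete by listing the annealing temperature of each cycle. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes that the first touchdown cycle anneals at 65 °C and that the 0.5 °C decrement applies from the second cycle onward.

```python
def touchdown_annealing_schedule(start_c=65.0, step_c=0.5, touchdown_cycles=20,
                                 final_c=55.0, final_cycles=16):
    """Annealing temperature (deg C) for each PCR cycle, in order."""
    # Touchdown phase: temperature drops by step_c after every cycle.
    touchdown = [start_c - step_c * i for i in range(touchdown_cycles)]
    # Second phase: constant annealing temperature.
    return touchdown + [final_c] * final_cycles

schedule = touchdown_annealing_schedule()
```

Under these assumptions the program runs 36 cycles in total, with the touchdown phase ending at 55.5 °C immediately before the 16 cycles at 55 °C.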
Table 4-1. Primers for ARISA analysis.
Primer                 Sequence (5’-3’)              Reference(s)
Bacterial
1406F                  FAM-TGYACACACCGCCCGT          (Cardinale et al., 2004)
23Sr                   GGGTTBCCCCATTCRG              (Cardinale et al., 2004)
S-D-Bact-1522          FAM-TGCGGCTGGATCCCCTCCTT      (Cardinale et al., 2004)
L-D-Bact-132-a-A-18    CCGGGTTTCCCCATTCGG            (Cardinale et al., 2004)
ITSF                   FAM-GTCGTAACAAGGTAGCCGTA      (Cardinale et al., 2004)
ITSReub                GCCAAGGCATCCACC               (Cardinale et al., 2004)
Fungal
2234C                  GTTTCCGTAGGTGAACCTGC          (Ranjard et al., 2001)
3126T-56               FAM-ATATGCTTAAGTTCAGCGGGT     (Ranjard et al., 2001)
Archaeal
1389F                  FAM-ACGGGCGGTGTGTGCAAG        (Qu et al., 2009b)
71R                    TCGGYGCCGAGCCGAGCCATCC        (Qu et al., 2009b)
Electrophoresis conditions were 63 °C and 15 kV with a run time of 120 min using POP-7
polymer. DNA fragment sizes were calculated using the Peak Scanner 2 software package
(Applied Biosystems Inc.) using a threshold of 100 to 500 fluorescence units. The primers
used are described in Table 4-1. All oligonucleotides were synthesized by Integrated DNA
Technologies.
4.2.2. Statistical analysis
The area of each peak was normalized by dividing it by the total fluorescence detected
in each sample run. The inter sample variability in the
ARISA data set was obtained by first making a matrix of distances between each observation
and then obtaining the distribution of distances of points within replicates. The primary tool
used for ARISA data analysis was the Euclidean distance matrix. ARISA profiles were analyzed
using PeakScanner Software (Applied Biosystems Inc.). The base pair lengths were quantized
to 4 units and the area was normalized. The data from one of the bacterial primer pairs listed
in Table 4-1, the archaeal primer pair and the fungal primer pair were concatenated, and the
area was normalized for every reading for individual primers. The between and within sample
variability was obtained in the same manner as for the flow cytometry data. To assess the
significance of the difference between the between sample variability distribution and the
within sample variability distribution, a Kolmogorov-Smirnov test was performed.
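The normalization, quantization and distance steps can be sketched in Python as follows. This is an illustrative reading, not the analysis code used here: it assumes that "quantized to 4 units" means 4 bp wide bins, and the function names, sample labels and peak values are hypothetical.

```python
import math
from collections import defaultdict

def arisa_profile(peaks, bin_width=4):
    """peaks: list of (fragment length in bp, peak area).

    Returns {bin start in bp: fraction of the total area in that bin}.
    """
    binned = defaultdict(float)
    for length_bp, area in peaks:
        binned[int(length_bp // bin_width) * bin_width] += area
    total = sum(binned.values())
    return {b: a / total for b, a in binned.items()}

def euclidean(p, q):
    """Euclidean distance between two profiles; missing bins count as 0."""
    keys = set(p) | set(q)
    return math.sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))

def split_distances(samples):
    """samples: {(treatment, replicate): profile}. Returns (within, between)."""
    names = sorted(samples)
    within, between = [], []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = euclidean(samples[a], samples[b])
            # Same treatment label means the pair are replicates of each other.
            (within if a[0] == b[0] else between).append(d)
    return within, between

# Hypothetical peak tables; not data from this study.
samples = {
    ("GLUC", 1): arisa_profile([(401.2, 900), (652.7, 100)]),
    ("GLUC", 2): arisa_profile([(402.9, 850), (653.1, 150)]),
    ("ACET", 1): arisa_profile([(401.0, 200), (810.4, 800)]),
}
within, between = split_distances(samples)
```

The two returned lists correspond to the within and between sample variability distributions that are then compared statistically.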
4.3. Results and discussion
4.3.1. Inter sample variability
Figure 4-1 shows the inter sample variability for ARISA data while Figure 4-2 shows the
inter sample variability in the flow cytometry data set. All the between sample variabilities
were clearly larger than the within sample variabilities, indicating that there is a significant
difference between the two distributions. This test is crucial to determine if the distance
between replicates is smaller than the average distance between treatments.
Figure 4-1. Inter sample variability in ARISA dataset (D+ = 0.4246, p-value = 1.894e-07).
Figure 4-2. Inter sample variability in flow cytometry dataset (for RMSD 3D voxel: D+ = 0.6208, p-value = 2.2e-16).
[Figures 4-1 and 4-2 plot density against distance (dist) for the Between and Within types; the Figure 4-2 panels are 3D Voxel, FSC-AmCyan, FSC-SSC and SSC-AmCyan.]
The within sample variability is a distribution of distances of points within replicates.
A two-sample Kolmogorov-Smirnov test suggested that the cumulative distribution function
(CDF) of the within-sample variability is greater than that of the between-sample variability,
and the comparison of D+ values suggests that the flow cytometry data set may be considered
more reliable than the ARISA data set. This statistical comparison of data sets between flow
cytometry and ARISA is the first of its kind and is highly relevant because flow
cytometry is rapidly becoming a popular technique for microbial characterization (Pedreira et
al., 2013), thanks to its low per-sample cost and its capability of producing rich
high-dimensional information (Müller and Nebe-von-Caron, 2010); hence it is crucial to
establish flow cytometry’s reliability as a routine bioprocess monitoring tool.
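For readers unfamiliar with the one-sided statistic, D+ is the largest amount by which one empirical CDF exceeds the other. The Python sketch below computes only the statistic, not the p-value; the function name and the example distances are hypothetical, and the direction (within minus between) is an assumption based on the description above.

```python
from bisect import bisect_right

def ks_d_plus(x, y):
    """One-sided two-sample KS statistic: max over t of F_x(t) - F_y(t)."""
    xs, ys = sorted(x), sorted(y)
    d_plus = 0.0
    # Both empirical CDFs are step functions, so checking the data points suffices.
    for t in xs + ys:
        f_x = bisect_right(xs, t) / len(xs)
        f_y = bisect_right(ys, t) / len(ys)
        d_plus = max(d_plus, f_x - f_y)
    return d_plus

# Hypothetical distances; not values from this study.
within = [0.05, 0.07, 0.08, 0.10]
between = [0.09, 0.15, 0.20, 0.30]
d = ks_d_plus(within, between)
```

A larger D+ means the within-replicate distances are more clearly concentrated at smaller values than the between-treatment distances, which is the sense in which the two data sets are compared.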
4.3.2. Multidimensional scaling analysis
The multidimensional scaling plots from the distance matrices obtained separately from
flow cytometry and ARISA data are shown in Figures 4-3 and 4-4, respectively. All the peak
value samples cluster separately from the 5 d samples in the ARISA plots, with the exception
of GLUC_1. In contrast, NONE_5, NONE_0, ACET_5 and SLUD_5 clustered separately in
the flow cytometry plot. As expected, the ARISA plot indicated a similarity between ACET_5
and NONE_0. The reason for the distance similarity between ACET and NONE could be that
the acetate treatment was administered to enrich methanogens, since they consume acetic acid to
generate methane gas (Berg et al., 1976). Interestingly, as illustrated in Figure 3-7, the
putative methanogens, i.e. samples fed with acetate, sludge and blank samples, reached the
peak biogas production on day 2. Due to their distinctive clustering away from the samples fed
with the rest of the carbon sources, cytometric fingerprints of putative methanogens obtained
by flow cytometry in real time can be considered a promising high-throughput tool for routine
on-site monitoring of anaerobic digesters (Kinet et al., 2016).
Figure 4-3. Multidimensional scaling plots for flow cytometry data.
Figure 4-4. Multidimensional scaling plots for ARISA data.
4.3.3. Correlation between flow cytometry and ARISA data
In order to obtain a correlation between ARISA and flow cytometry data, the
flow cytometry data was reduced to include only those points present in the ARISA data set.
To see whether the distance of any sample from the NONE_0 sample in the ARISA space could be
predicted from the distance of that sample from the NONE_0 sample in the flow cytometry
space, a linear model was constructed as shown in Figure 4-5.
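A linear model of this kind can be sketched with an ordinary least squares fit of ARISA distance on flow cytometry distance. The Python below is a minimal illustration; the function name and the distance values are hypothetical, and the exact model specification used here is not reproduced.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, residuals); large residuals flag poorly fitted samples.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, residuals

# Hypothetical distances from NONE_0; not values from this study.
flow_dist = [0.001, 0.002, 0.003, 0.004]
arisa_dist = [0.12, 0.21, 0.33, 0.40]
slope, intercept, residuals = fit_line(flow_dist, arisa_dist)
```

The residuals returned by such a fit are what a normal Q-Q plot examines when looking for samples that depart from the fitted relationship.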
SLUD_2, GLUC_5 and PROP_5 stood out as outliers in the model, as is evident from
Figure 4-6. This exception can be attributed to the fact that the ecological dynamics of these
syntrophic and resilient populations (J. J. Werner et al., 2011) may not be evident in the flow
cytometry based methodology demonstrated here. Flow cytometry looks more at the
morphology of the cells (Müller and Nebe-von-Caron, 2010) while ARISA is more of a
genetic technique exploiting the highly conserved internal transcribed spacer (ITS) region
(Fisher and Triplett, 1999). Further analysis of ARISA data might resolve this divergence,
since a single organism may contribute more than one peak to the ARISA community profile
and unrelated organisms can have similar spacer lengths, which leads to underestimation
of community diversity (Fisher and Triplett, 1999). The precise sizing information from
ARISA profiles could be used to identify the major bands of interest in a separate manual
polyacrylamide gel, which could then be further characterized through band excision and
subsequent sequencing as illustrated by Fisher and Triplett (1999).
Figure 4-5. Flow cytometry RMSD 3D vs ARISA distance model.
Figure 4-6. Normal Q-Q plot for flow cytometry vs ARISA distance model.
Even though these two methodologies look at entirely different aspects, the observed
similarities in the clustering of most of the samples illustrate that one can be used in place of
the other to study complex microbial population dynamics. Recent studies concluded that,
even though significant correlation was lacking at the genus or species level, the phenotypic
profiles of the microbiota during assays matched several of the 16S rRNA gene sequencing
profiles (Kinet et al., 2016). Owing to the high-throughput nature of flow cytometry, a
large, curated multi-dimensional flow cytometry dataset can be routinely generated and
applied for screening and exploratory purposes in bioprocess monitoring, with an automated
software toolchain to access and analyze this information.
4.4. Conclusion and future research directions
Flow cytometry was found to be comparable to traditional culture-independent techniques like
ARISA. While ARISA measures diversity at the genomic level and flow cytometry measures
diversity at the morphological level, there was an observed complementarity between the two
measures. The flow cytometry-ARISA based method was sensitive enough to detect changes
within 5 d of perturbation to the community.
Even though the proposed methodology does not rely on traditional species- or subtype-level
enumeration of microbial species within a consortium to derive a classification scheme,
cell sorting can improve community structure analysis by segregating the various
clusters in a gated community. The approach has the advantage that sorted cells can be further
investigated at the level of the cells’ genome, transcriptome or proteome. This will
enable quantitative and more precise community resolution, since quantitative information is
often lost in whole-community PCR-based approaches. This is especially important when
low-abundance sub-communities are of interest. The utility of flow cytometry in quickly
analyzing the dynamics and trends in microbial communities has been established, and alleged
constraints on specificity can be addressed using the various approaches mentioned in
the previous chapter.
5. Machine learning analysis of flow cytometry data using H2O.ai
Flow cytometry allows single cells to be measured in unprecedented depth. This unique
power of flow cytometry has revolutionized the fields of biology, medicine, ecology and
environment. While this is exciting, it also means that new types of high-throughput datasets
will be generated at a rapid pace, which will demand novel computational techniques that
efficiently mine single cell data sets. In this section, the utility of
one of the recent machine learning techniques, called ‘Deep Learning’, is explored for
single cell data analysis, comparing it with traditional multivariate data analysis in community
ecology and highlighting interesting avenues that might lead to novel algorithm development
in the future incorporating a community physiology angle.
5.1. Background
During the past 50 years, flow cytometry has established its reputation as a
high-throughput tool for single cell analysis (Fulwyler, 1965). Currently, there are over 100
companies in the flow cytometry business worldwide, constituting a market of more than
$3 billion (Robinson and Roederer, 2015). Since its genesis in 1965 (Fulwyler, 1965) and its
rise in popularity in the 1970s (Gray et al., 1975), the basic design of flow cytometers has
remained almost unchanged, underscoring the robustness of the technology's design.
With the advancement of key technologies in cytometry, the number of parameters that
can be measured simultaneously from single particles has increased multifold, most notably
through the addition of more powerful lasers (Saeys et al., 2016). It has become
routine for many labs to use 18-parameter flow cytometry on a daily basis (Perfetto et al.,
2004). Thirty-parameter flow cytometers have recently become commercially available,
and 50-parameter flow cytometry is forecast to be
available soon (Chattopadhyay and Roederer, 2012). This number will keep increasing in
the coming years, and it is theoretically possible to measure up to 100 parameters per cell
(Saeys et al., 2016). Furthermore, another group of researchers is interested in
combining cytometry with imaging, for instance in tissue engineering (Giesen et al., 2014).
The speed and high-throughput nature of flow cytometry, combined
with the increasing capacity to measure more and more cell parameters at once, will lead to
massive, high-dimensional datasets in the not-so-distant future. There will be limits on how
useful the traditional community ecology tools or classical, mostly manual, analysis techniques
will be in analyzing these huge multidimensional datasets. There is therefore a gap in the
development of novel computational techniques and their adoption by the broad community.
This exercise attempts to exploit open-source, user-friendly machine learning platforms
to analyze flow cytometry data generated in the anaerobic microbiome perturbation
experiments. The applications to real-world anaerobic digesters and nanotoxicity are
discussed in the next chapter. The major drawback of this exercise is the limitation on
the size of the dataset that can be generated from the wet lab experiments. As more
data comes in from further research in the current lab as well as from other researchers, the
performance is expected to improve substantially.
5.2. Approach
The experimental set-up and flow cytometry analysis were performed in the manner
explained in the 'Materials & Methods' section of Chapter 3. For the machine learning based
analysis, the .fcs files for the 5-day experiment obtained from a BD Biosciences LSR II Flow
Cytometry Analyzer were imported. To keep the number of nodes in the input layer to a
reasonable number, only 10000 events from each fcs data set were chosen and only the
"FSC-A", "SSC-A" and "AmCyan-A" columns were considered from each file. Each fcs file of
100000 events was split into 10 files of 10000 events each, and placed in a data frame. Each
data frame was labeled with the corresponding carbon source. Data from the third replicate
was split into two parts: one part was used as the validation data set and the second was used
as the test set in each case.
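The splitting and labeling step above can be sketched as follows. This is a minimal illustration under stated assumptions: events are represented as plain dictionaries keyed by channel name, the events are synthetic, and the function names are hypothetical. Reading the actual .fcs files is not shown.

```python
import random

# Channels retained for the analysis, as named in the text.
CHANNELS = ("FSC-A", "SSC-A", "AmCyan-A")


def split_events(events, chunk_size=10_000):
    """Split a list of events into consecutive chunks of chunk_size events."""
    return [events[i:i + chunk_size] for i in range(0, len(events), chunk_size)]


def make_labeled_frames(events, carbon_source, chunk_size=10_000):
    """Keep only the three channels and attach the carbon source label to each chunk."""
    frames = []
    for chunk in split_events(events, chunk_size):
        rows = [[event[channel] for channel in CHANNELS] for event in chunk]
        frames.append({"label": carbon_source, "rows": rows})
    return frames


# Synthetic stand-in for one file of 100000 events (not real cytometry data).
random.seed(0)
events = [
    {"FSC-A": random.random(), "SSC-A": random.random(),
     "AmCyan-A": random.random(), "Time": i}
    for i in range(100_000)
]

frames = make_labeled_frames(events, "GLUC")
print(len(frames), len(frames[0]["rows"]), len(frames[0]["rows"][0]))
```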
For the clubbed flow frames analysis, the sample names were clubbed as per the
conceptual division of putative microbial groups in the anaerobic digestion process. The four
groups were: 1) putative hydrolyzers, represented by "HYDRO", which were samples fed with
cellulose (CELL); 2) putative acidogens, represented by "ACIDO", which were samples fed
with glucose (GLUC); 3) putative syntrophic acetogens, represented by "ACETO", which
were clubbed samples fed with propionate (PROP) and butyrate (BUTY) individually; and 4)
putative methanogens, represented by "METHA", which were clubbed samples fed with
acetate (ACET) and sludge (SLUD) individually, as well as those fed with no carbon source
(BLAN and NONE). ACET, SLUD, BLAN and NONE were clubbed into the
METHA putative group because all these samples were presumed to be methanogenic,
with acetate-utilizing methanogens being the predominant group in wastewater
treatment plant anaerobic digesters (Li et al., 2013).
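The clubbing scheme above amounts to a lookup from carbon source to putative group. A minimal sketch follows; the dictionary and function names are hypothetical, and the assumption that sample names take the form SOURCE_DAY (for example GLUC_5) is taken from the sample names used in Chapter 4.

```python
# Carbon source code -> putative microbial group, as described in the text.
GROUP_OF_SOURCE = {
    "CELL": "HYDRO",
    "GLUC": "ACIDO",
    "PROP": "ACETO",
    "BUTY": "ACETO",
    "ACET": "METHA",
    "SLUD": "METHA",
    "BLAN": "METHA",
    "NONE": "METHA",
}


def club_label(sample_name):
    """Map a sample name such as 'GLUC_5' (or a bare source code) to its group."""
    source = sample_name.split("_")[0]
    return GROUP_OF_SOURCE[source]


for name in ("CELL_2", "GLUC_5", "PROP_5", "NONE_0"):
    print(name, "->", club_label(name))
```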
An H2O.ai server was set up as per the instructions (http://www.h2o.ai/) and the data was
uploaded as individual csv files. H2O’s Deep Learning algorithm was primarily used for
building the models. The Deep Learning algorithm is based on a multi-layer feed-forward
artificial neural network that is trained with stochastic gradient descent using
back-propagation. The default "RectifierWithDropout" was selected for activation. The
network used input_dropout_ratio = 0.1, hidden_dropout_ratios = c(0.2, 0.2, 0.1) and the
default hidden layer sizes (hidden = c(2000, 1000, 500)). The number of epochs (i.e. the
number of times to iterate over the dataset) was set to 10. To add stability and improve
generalization, either L1 or L2 regularization was used, with a value of 1e-5 in each case.
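The settings listed above can be collected in a single configuration for reference. The sketch below is only a plain Python dictionary plus a consistency check; the key names mirror the parameter names quoted in the text, the helper name is hypothetical, and no H2O call is made because that would need a running H2O cluster. The text states that either L1 or L2 was used, so both are listed here and one would be dropped in an actual run.

```python
# Hyperparameters as quoted in the text (illustrative container, not an H2O call).
DL_PARAMS = {
    "activation": "RectifierWithDropout",
    "input_dropout_ratio": 0.1,
    "hidden": [2000, 1000, 500],
    "hidden_dropout_ratios": [0.2, 0.2, 0.1],
    "epochs": 10,
    "l1": 1e-5,  # used in some runs according to the text
    "l2": 1e-5,  # used in other runs according to the text
}


def check_dropout_config(params):
    """Verify one dropout ratio per hidden layer and that all ratios lie in [0, 1)."""
    hidden = params["hidden"]
    ratios = params["hidden_dropout_ratios"]
    if len(hidden) != len(ratios):
        raise ValueError("need one hidden dropout ratio per hidden layer")
    for ratio in [params["input_dropout_ratio"], *ratios]:
        if not 0.0 <= ratio < 1.0:
            raise ValueError(f"dropout ratio out of range: {ratio}")
    return True


print(check_dropout_config(DL_PARAMS))
```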
5.3. Results and discussion
5.3.1. Trends in the machine learning predictions are comparable to traditional
statistical analysis
Figure 5-1 shows the results of the deep learning model. The idea was to see how well the
model classified the samples, comparing actual with predicted carbon sources. In accordance
with the results from classical multivariate analysis in Chapters 3 and 4, GLUC looked very
different from the rest and was perfectly classified, followed by PROP and then SLUD.
NONE and BLAN were misclassified. As discussed in Chapter 3, the increased distance
with delay in the autocorrelation function analysis suggests a divergent culture, as seen in
GLUC, whereas PROP and BUTY were rather stable. ACET and the controls (NONE, SLUD)
showed similar behavior (increasing sharply for a delay of 4). Similarly, as shown by the
autocorrelation analysis on flow cytometry data with day-to-day changes (Figure 3-9), GLUC
shows a distinct pattern in reaching a stable extremum in the colony structure, which is a point
corresponding to minimum change. Furthermore, as shown in the MDS plot in Figure 4-3,
both the peak value and non-peak value GLUC samples cluster separately in the flow
cytometry plots, which is in accordance with the results from the deep learning model. Also,
NONE_5, NONE_0, ACET_5 and SLUD_5 clustered separately in the flow cytometry MDS
plots, which resembles the NONE, ACET and SLUD misclassification evident in Figure 5-1.
Figure 5-1: Carbon sources prediction with H2O's deep learning algorithm.
The machine learning approach to flow cytometry and microbial ecology datasets is
still in its infancy, but the results presented here demonstrate that it has considerable potential.
Compared to traditional statistical analysis in microbial ecology (like MDS or PCA), it is very
fast and therefore more useful. Further, all the important variables were <V1000, which are
mostly FSC variables. Hence FSC, corresponding to cell size and morphology, stands out as the
most important predictor. With the advancement in the high-throughput nature of flow
cytometry, combined with the increasing capacity to measure more cell parameters at once,
massive, high-dimensional datasets would be generated on a routine daily basis in the future
and would add to the predictive power of the machine learning based approach.
Figure 5-2: Clubbed carbon sources predictions.
5.3.2. MSE reconstruction with autoencoder
Figure 5-3 shows the results from H2O.ai’s DL autoencoder analysis. In an
attempt to determine which data looks the 'weirdest', the autoencoder option was turned on,
which performs a dimensionality-reduction type of analysis. H2O.ai’s DL autoencoder is based
on the standard deep (multi-layer) neural net architecture, where the entire network is learned
together, instead of being stacked layer-by-layer. The only difference is that no response is
required in the input and that the output layer has as many neurons as the input layer (Le,
2015).
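The reconstruction error used to rank samples can be sketched as follows. This is a minimal illustration only: the reconstruction function here simply returns the column-wise mean of all samples as a crude stand-in for a trained autoencoder, and the sample names and values are hypothetical placeholders.

```python
def reconstruction_mse(original, reconstructed):
    """Mean squared error between one input row and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def rank_by_anomaly(samples, reconstruct):
    """Return (name, mse) pairs sorted from most to least anomalous."""
    scores = [
        (name, reconstruction_mse(row, reconstruct(row)))
        for name, row in samples.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature rows (placeholders, not study data).
samples = {
    "A": [0.10, 0.20, 0.30],
    "B": [0.10, 0.20, 0.35],
    "C": [0.90, 0.10, 0.80],
}

# Stand-in for a trained autoencoder: every row is "reconstructed" as the mean row.
n_samples = len(samples)
mean_row = [sum(row[j] for row in samples.values()) / n_samples for j in range(3)]


def reconstruct(row):
    return mean_row


ranked = rank_by_anomaly(samples, reconstruct)
for name, mse in ranked:
    print(name, round(mse, 4))
```

Rows that the model reconstructs poorly receive a high MSE and are the ones that look most unlike the rest of the data.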
Figure 5-3: Reconstruction MSE for DL's Autoencoder for carbon sources parsed with days.
It is evident from the figure that 'CELL' and 'GLUC' look very different. Coincidentally,
day 2 is the peak of biogas production for CELL with 272 mL, while other days were in the range
of 25-40 mL. If this is compared with the clubbed predictions shown in Figure 5-2,
HYDRO (i.e. 'CELL') is predicted more like 'ACETO' than itself in this model, which is the same
as the 'CELL' - 'PROP' clubbing observed previously. There might be a physiological explanation
for this phenomenon, as reported by previous studies (Aydin et al., 2015; Blasco et al., 2014;
Čater et al., 2013; Ito et al., 2012; Venkiteshwaran et al., 2015). Smithella, Syntrophobacter,
and Pelotomaculum might have the ability to break down CELL faster than classical
hydrolyzers like Clostridium or have greater doubling time (Batstone et al., 2009). As
mentioned in the section above, ACIDO (i.e. 'GLUC') was predicted very well. Interestingly,
ACETO (i.e. 'PROP' and 'BUTY') was also predicted well. These syntrophic acetogens are
believed to be very important in maintaining stable and robust anaerobic operation (Ariesyady
et al., 2007a; Fernandez et al., 2000). PROP species fall in the genera Smithella,
Syntrophobacter, and Pelotomaculum (Ariesyady et al., 2007b), while BUTY species fall in the
genera Syntrophus and Syntrophomonas (Ahring and Westermann, 1987). Phenotypic and
physiological similarities have been reported between species of these two groups of genera
that may explain this trend (Khanal, 2008; Narihiro and Sekiguchi, 2007).
5.3.3. Model comparison for predictive capabilities
Table 5-1 lists prediction MSEs for the different models in H2O.ai's supervised
learning data science algorithms suite. The efficiency in terms of speed and resources for
model optimization, training and validation would be an altogether different track of inquiry
and was not used for comparison in the proposed approach. The objective of this exercise was to
see whether one algorithm is better at predicting putative microbial group phenotypes than
another. Deep Learning (DL) performed best overall with 33.47% prediction MSE, followed by
Naive Bayes (NB) (39.19%), Gradient Boosting (GB) (39.52%) and Distributed Random Forest
(DRF) (40.93%). The results are in concordance with similar supervised model
comparison studies, where feedforward neural nets have been reported to perform well and
to be competitive with other algorithms (Caruana and Niculescu-Mizil, 2006).
Methanogens are considered the driving microbial group in anaerobic digesters
(Daniels, 1984), and DRF performed the best in predicting putative methanogens from the
controlled experiments, with a prediction MSE as low as 5.67%. GB is the next best
model, with a prediction MSE of 14.25%. NB and DL performed surprisingly poorly, with
prediction MSEs of 53.68% and 67.08% respectively. The better predictive capability of DRF
for putative methanogens can be attributed to the fact that, for a given dataset, DRF
generates a forest of classification (or regression) trees, rather than a single classification (or
regression) tree. Each of these trees is a weak learner built on a subset of rows and columns
(Geurts et al., 2006). More trees will reduce the variance. Both classification and regression
take the average prediction over all of their trees to make a final prediction, whether
predicting for a class or numeric value.
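The averaging step described above can be sketched for the classification case. This is a simplified illustration rather than H2O's DRF implementation: each tree is represented only by the class probabilities it outputs for one sample, and the function name and probability values are hypothetical.

```python
def forest_predict(tree_probabilities):
    """Average per-tree class probabilities and return (best_class, averages)."""
    n_trees = len(tree_probabilities)
    classes = tree_probabilities[0].keys()
    averaged = {
        cls: sum(tree[cls] for tree in tree_probabilities) / n_trees
        for cls in classes
    }
    best_class = max(averaged, key=averaged.get)
    return best_class, averaged


# Hypothetical outputs of three trees for a single sample.
trees = [
    {"METHA": 0.6, "ACETO": 0.4},
    {"METHA": 0.3, "ACETO": 0.7},
    {"METHA": 0.9, "ACETO": 0.1},
]

best, averaged = forest_predict(trees)
print(best, averaged)
```

Although the second tree alone would vote for ACETO, the averaged probabilities favor METHA, which illustrates how combining many weak learners reduces the variance of the final prediction.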
The limitations of DL in predicting methanogens were also evident in the clubbed
prediction shown in Figure 5-2, where METHA contradicted the expectations.
Physiologically, the experiment may have been too short for the DL algorithm to pick up
distinguishing phenotypic signatures at the single cell level for i) acetoclastic methanogens
that use the acetoclastic pathway to produce methane and carbon dioxide from acetate, ii)
hydrogenotrophic methanogens that use hydrogen to reduce carbon dioxide to methane via the
hydrogenotrophic pathway, and iii) methylotrophic methanogens that produce methane from
C1 compounds, such as methanol, methylamines, and methyl sulfides, through
methylotrophic methanogenesis.
Acidogens and acetogens are the middle groups in the anaerobic digestion hierarchy, and for
these mid-tier putative microbial groups DL performed the best, with 1.60% and 20.00%
prediction MSEs for putative acidogens and acetogens respectively. NB had a comparable
performance, with 2.93% prediction MSE for putative acidogens and 37.47% for putative
acetogens. DRF and GB did not do well for these groups. The overall stability of bioreactors
depends on balancing acid build-up and acid removal, and with its good predictions for
ACIDO and ACETO, DL appears to perform well in predicting this middle level of the
hierarchy. Hydrolyzers are the samples fed with cellulose, and NB and DL were
comparable in terms of prediction MSEs (32.80% and 35.20% respectively). As mentioned
earlier, FSC variables (depicting size and morphology of individual cells) are the deciding
variables, and DL appears to be best at classifying cells based on their size and morphological
features.
Table 5-1. Machine learning model comparison (values are prediction errors; smaller is better).
Putative Groups   Deep Learning   Distributed Random Forests   Gradient Boosting   Naïve Bayes
Hydrolyzers       0.3520          0.7173                       0.6293              0.3280
Acidogens         0.0160          0.2560                       0.2587              0.0293
Acetogens         0.2000          0.8960                       0.7507              0.3747
Methanogens       0.6708          0.0567                       0.1425              0.5358
Overall           0.3347          0.4093                       0.3952              0.3919
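The per-group comparison in Table 5-1 can be reproduced with a short lookup that selects the model with the lowest prediction error for each putative group. The values are copied from the table as printed; the dictionary and function names are hypothetical.

```python
# Prediction errors per putative group, copied from Table 5-1.
TABLE_5_1 = {
    "Hydrolyzers": {"DL": 0.3520, "DRF": 0.7173, "GB": 0.6293, "NB": 0.3280},
    "Acidogens": {"DL": 0.0160, "DRF": 0.2560, "GB": 0.2587, "NB": 0.0293},
    "Acetogens": {"DL": 0.2000, "DRF": 0.8960, "GB": 0.7507, "NB": 0.3747},
    "Methanogens": {"DL": 0.6708, "DRF": 0.0567, "GB": 0.1425, "NB": 0.5358},
}


def best_model(errors):
    """Return the model name with the smallest prediction error."""
    return min(errors, key=errors.get)


for group, errors in TABLE_5_1.items():
    winner = best_model(errors)
    print(f"{group}: {winner} ({errors[winner]:.4f})")
```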
5.4. Conclusions and future research directions
The high-throughput nature of flow cytometry, combined with the increasing capacity to
measure more cell parameters at once, warrants novel computational techniques to handle
massive and high-dimensional datasets on a routine basis. With the limited dataset from the
carbon source perturbed anaerobic microbiome, the machine learning algorithms were found
to be very fast and comparable to traditional statistical analysis in microbial ecology. FSC
variables depicting cell size and morphology stood out as the most important variables for
classification. Communities perturbed with the controlled addition of glucose and
cellulose looked different from the normal community and from those with the controlled
addition of other carbon sources. The model comparison exercise based on predictive
capabilities concluded that DL was best at predicting the overall community but DRF was best
for predicting the most important putative microbial group in anaerobic digesters, viz.
methanogens.
In addition to classical multivariate analysis techniques, the research communities in
microbial ecology, clinical biology and single cell analytics need to adapt to the rapidly
developing novel computational techniques. While each experiment will have its own specific
challenges, scalability may be an issue in the future, particularly when dealing with large
datasets (containing millions of cells) and with increasing dimensionality (such as gene
expression at the single cell level). Furthermore, the most challenging part from the
researcher’s point of view would be obtaining sufficient data quality, striking the right balance
between the computational and biological variations, and incorporating this flexibility in
algorithm development in the future. While this field is developing quickly, the presented
small exercise is a stepping-stone towards making sense of high-dimensional single cell
data in the complex microbiome.
6. Applications of the proposed flow cytometry based methodology
The rich multidimensional information from flow cytometry, along with fast machine
learning algorithms, has been put to the test for rapid characterization of the dynamics of
microbial communities in industrially relevant anaerobic systems. The utility of the flow
cytometry based method has also been demonstrated in a fully functional industry-scale
anaerobic digester to distinguish between microbiome compositions caused by varying
hydraulic retention time (HRT). The potential utility of the proposed methodology has been
demonstrated for monitoring the syntrophic resilience of the anaerobic microbiome perturbed
under nanotoxicity.
6.1. Applications in monitoring community dynamics in operational municipal
anaerobic digester
Anaerobic bioprocessing plays a significant role in overall municipal wastewater treatment
worldwide (Brar et al., 2010). One of the important factors to consider for both waste
treatment and resource recovery in municipal wastewater treatment is HRT, which indicates
the time the waste remains in the reactor in contact with the biomass (Zhang et al., 2006).
The rate of microbial metabolism decides the time required to achieve a given degree of
treatment (Elefsiniotis and Oldham, 1994; Rincón et al., 2008). Simple compounds like glucose
are readily degraded and hence require a short HRT, while complex wastes like halogenated
organic compounds degrade slowly and hence demand a longer HRT (Khanal, 2008). HRT is
considered a deciding factor in process design for complex and slowly degradable organic
pollutants (Speece, 1996). Further, HRT is crucial for maintaining the right balance between
acetoclastic and hydrogenotrophic methanogens and hence plays a crucial role in maintaining
the process stability in anaerobic digesters (Khanal, n.d.; Rincón et al., 2008). A quick, on-line
monitoring tool for HRT in light of microbiome metabolism and dynamics would be
promising for routine on-site monitoring of anaerobic digesters.
6.1.1. Approach
The waste activated sludge used in this study was withdrawn from the secondary
settling tank of the municipal wastewater treatment plant of UCSD, as shown in Figure 6-1.
This stream is often called activated sludge and is the recycled fraction of wastewater
(Safoniuk, 2004). The dry solids in the sludge consist mostly of biodegradable organic
compounds, followed by a substantial amount of inorganics and a small amount of toxic
organics (Kolat and Kadlec, 2013). The organic fraction typically consists of paper fiber, food
waste, and fecal matter; the inorganic fraction often contains sand and other miscellaneous
grit (Barber, 2014). The composition of the sludge can vary greatly from day to day and even
hour to hour (Vanrolleghem and Lee, 2003). For the UCSD experiments, samples were collected
on Day 1, 2, 4, 14 and 26 from Digester 1 (DIG 1) and Digester 2 (DIG 2) in order to capture
the microbiome dynamics beyond the usual residence time. Both digesters have the same
volume but different flow rates, so they operate at different hydraulic retention
times (HRTs). DIG 1 has an HRT of 13 d while DIG 2 has an HRT of 26 d. In order to
explore potential applications of flow cytometry in detecting changes in the microbial
community in industrial-scale anaerobic digesters, the focus of the present study was on
differences caused by varying HRTs.
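The relationship between volume, flow rate and HRT described above follows the standard definition HRT = V / Q. A minimal sketch of the arithmetic is given below; the digester volume is a hypothetical placeholder, since the actual volume is not stated here, and the function names are illustrative.

```python
def hydraulic_retention_time(volume_m3, flow_m3_per_day):
    """HRT in days, defined as reactor volume divided by volumetric flow rate."""
    return volume_m3 / flow_m3_per_day


def flow_for_hrt(volume_m3, hrt_days):
    """Flow rate (m^3/d) needed to operate a reactor of given volume at a target HRT."""
    return volume_m3 / hrt_days


volume = 1000.0  # hypothetical digester volume in m^3 (not given in the text)
flow_dig1 = flow_for_hrt(volume, 13.0)  # DIG 1, HRT of 13 d
flow_dig2 = flow_for_hrt(volume, 26.0)  # DIG 2, HRT of 26 d

print(f"DIG 1 flow: {flow_dig1:.1f} m^3/d")
print(f"DIG 2 flow: {flow_dig2:.1f} m^3/d")
print(f"flow ratio DIG 1 / DIG 2: {flow_dig1 / flow_dig2:.2f}")
```

Because the two digesters share the same volume, the ratio of their flow rates depends only on the ratio of their HRTs, whatever the actual volume is: DIG 1 receives twice the flow of DIG 2.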
Figure 6-1. Schematic of UCSD's wastewater treatment plant showing position of anaerobic digester in the process.
For autofluorescence control experiments, E. coli strain HB101 was obtained from
Promega, Madison, WI. Bacterial cells were grown overnight at 37°C, then diluted
1:100 into 10 mL of fresh LB medium to the optimal optical density (OD) (Rezaeinejad and
Ivanov, 2013) in a 50-mL tube and grown at 37°C to an OD600 of ~1.0. Subcultures diluted
1:100 were then grown in 5 mL of 3 different LB concentrations (10 mg/L, 20 mg/L and
40 mg/L). Multiple samples were collected during different growth phases and the flow
cytometry analyses were performed as described.
6.1.2. Flow cytometry detects changes in the microbial community at different HRTs
A multidimensional scaling (MDS) plot demonstrating clusters based on the RMSD3D
tool (described in Chapter 3) for the samples obtained from UCSD’s DIG 1 and DIG 2 at
different time points is shown in Figure 6-2. As evident from the MDS plot, all the DIG 2
samples cluster separately from the DIG 1 samples, with the exception of the 3 d sample. The
difference in the clustering can be explained with the help of previously published studies
depicting variations in microbial abundance and composition in anaerobic digesters with
different HRTs (Sasaki et al., 2011; Zhang et al., 2006). For each of the samples, DIG 2 shows
a similar fraction of live cells as DIG 1, and the ratio of live cells between the digesters is
constant at 1.04 ± (1 Std. Dev.). As shown in Appendix A5, the deep learning algorithm
described in Chapter 5 worked better for DIG 2 (K) than for DIG 1 (E), pointing to a potential
limitation in the utility of this approach for shorter-HRT systems.
Figure 6-2. MDS plot showing clustering of samples obtained from different anaerobic digesters
Live cell percentages in the two functional digesters at UCSD are shown
in Appendix A2. The main microbial populations of these digesters, belonging to the domains
Bacteria and Archaea, were identified using ribosomal oligonucleotide probes (Table 4-1),
as used in the ARISA experiments explained above, and the results are presented in Appendix
A2. In both HRT conditions, bacterial bands were much more obvious than archaeal bands,
suggesting that the bacteria represented the major microbial group while the archaea
constituted a minor but significant group of microorganisms, which is in agreement with
previous work (Raskin et al., 1994). The results of the flow cytometry studies on UCSD’s fully
operational industrial-scale anaerobic digesters suggest that the HRT setting has a significant
impact on the percentage of dead cells in the AD community. Previously published work has
shown that the methanogenic pathway and community structure in anaerobic
digesters are affected by HRT along with other parameters (Sasaki et al., 2011), and has
highlighted the importance of HRT as a significant hydrodynamic selection on mixed microbial
populations in establishing a stable microbial population (Zhang et al., 2006). In agreement
with this work, the comparison of live cell percentages in the two digesters shows that DIG 2
has a smaller fraction of live cells than DIG 1. All the published work in this domain is based
on ascertaining the species composition of the culture using 16S rDNA genes separated by
denaturing gradient gel electrophoresis (DGGE), which is labor intensive, time consuming and
suffers from biases as described previously (Müller and Nebe-von-Caron, 2010).
6.1.3. Role of autofluorescence in elucidating overfed vs underfed microbiome
The Alexa Fluor 488 mean frequency for DIG 1 is always greater than that for DIG 2 (p-value
< 0.001 for the paired two-sample test). In order to confirm the validity of this experiment, a
control monoculture experiment was performed with E. coli. Various molecular markers like
DAPI, SYBR Green, SYTO etc. have been used previously for discriminating microbial cells
on the basis of their DNA content (Koch et al., 2014a, 2013c; Müller and Nebe-von-Caron,
2010). In this work, no label is involved and discrimination is made on the basis of
autofluorescence, since it has proven to be a good discriminant based on the most commonly
observed autofluorescent molecules like NADPH and flavins (Monici, 2005).
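A paired comparison of the kind mentioned above can be sketched with a paired t statistic. This is an assumption made for illustration, since the text does not state which paired two-sample test was used. The values below are hypothetical placeholders for the five sampling days, not measurements from the digesters, and the function name is illustrative.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical autofluorescence readings on the five sampling days.
dig1 = [5.2, 5.0, 5.6, 5.4, 5.1]
dig2 = [4.1, 4.3, 4.4, 4.0, 4.2]

t_value, dof = paired_t_statistic(dig1, dig2)
print(f"t = {t_value:.2f} with {dof} degrees of freedom")
```

The p-value would then be read from a t distribution with the reported degrees of freedom, for example with a statistics package.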
Autofluorescence, as measured by the Alexa Fluor 488 channel in the flow cytometer, is
likely from flavin. This autofluorescence can also be used to detect changes in the microbial
community at different HRTs. The control monoculture experiment performed with E.
coli demonstrates 3 distinct curves for flavin autofluorescence (measured with the Alexa Fluor
488 channel in the flow cytometer) for 3 different LB concentrations, as shown in Appendix A3.
These results are in accordance with a previous study (Renggli et al., 2013) showing
that E. coli treated with different bactericidal antibiotics exhibits increased autofluorescence
when analyzed by flow cytometry, and hence this control experiment supports the presented
autofluorescence results from UCSD. Even though the flavin-based autofluorescence needs
further confirmation with protein characterization studies (Mukherjee et al., 2013), which
is outside the scope of the present work, we believe that the idea has the potential to be
transformed into a non-invasive, rapid, high-throughput tool for routine bioprocess monitoring.
Flow cytometry can provide this information in a high-throughput manner since it is
fast and requires small sample volumes and minimal sample preparation. All the steps can be
performed in less than an hour, and hence the method has the potential to be used for
controlling biogas reactors based on HRT. The ability of flow cytometry to rapidly detect
changes in microbiome composition and dynamics caused by varying HRTs in a fully functional
industrial-scale anaerobic digester is a major step towards its application in monitoring and
potentially controlling biodigesters in the future.
6.2. Applications in evaluating microbiome resilience
The widespread use of engineered nanomaterials faces a skeptical and
demanding public because of their potential environmental risks (Abdelsalam et al., 2016).
With the increasing diversity of engineered nanomaterials in commercial products, there is a
need to monitor potential consequences of their release into the environment. With the
increasing use of nanomaterials in daily activities, commercial products and industry,
wastewater treatment plants are the receptors and a possible gateway for release into the
environment (Brar et al., 2010). Because of the hydrophobic nature of most engineered
nanomaterials, they will strongly partition into the biomass in wastewater treatment plants
(Nyberg et al., 2008) and ultimately adsorb onto the waste activated sludge fed to anaerobic
digesters (Westerhoff et al., 2011).
As anaerobic digestion is the standard stream for energy generation (biogas) and waste
reduction, any perturbation in the waste stream contributes to perturbations in the
anaerobic digester. The functional stability of the anaerobic bioreactor is an emergent
property of the dynamic interaction within the microbial communities (J. J. Werner et al.,
2011). However, as discussed in Chapter 2, the strict anaerobic requirements are an
impediment to high-throughput monitoring of the underlying microbial communities. In this
section, an effort has been made to demonstrate the utility of the flow cytometry based method,
along with the machine learning algorithm, as a potential tool for monitoring the
syntrophic resilience of anaerobic bioreactor microbial communities. If successful, it will
be a real-time tool that is quick, effective and simple for plant operators to use.
6.2.1. Approach
The experimental set-up used for this section is similar to the one described in Chapter 3.
The deep learning algorithm described in Chapter 5 is used as the analytical tool for this
section as well. The chemical properties of the sludge used in this research are summarized in
Table 6-1. The sludge samples were mixed with various nanoparticles at low (L; 1 mg/g-TS)
and high (H; 10 mg/g-TS) concentrations, which are considered environmentally relevant for
assessing nanotoxicity (Mahapatra et al., 2015). Plain sludge with anaerobic inoculum from
active anaerobic digesters but with no nanoparticles added was set up as a positive control
(PCH), and anaerobic inoculum from the active anaerobic digesters without sludge or
nanoparticles was set up as a negative control (NCL). The alleged toxic impact of nanoparticles
on the anaerobic digestion process was quantified by monitoring biogas production on a daily
basis, along with gas composition, TCOD, SCOD and VFA production on a biweekly basis.
Drawing samples for flow cytometric analysis every week helped build the cytometric
fingerprints of the perturbed community, which form the basis for the subsequent machine
learning analysis. TINPs are titanium(IV) oxide, anatase nanopowder, <25 nm particle size,
99.7% trace metals basis, specific surface area 45-55 m2/g, obtained from Sigma Aldrich
(CAS No. 1317-70-0). FENPs are iron nanopowder, 25 nm average particle size, 99.5% trace
metals basis, from Sigma Aldrich (CAS No. 7439-89-6). CNTs are multi-walled carbon
nanotubes, described as thin and short, <5% metal oxide, in powder form, also from Sigma
Aldrich (CAS No. 308068-56-6).
Table 6-1. Chemical properties of wastewater sludge.
Property                                        Value
Total Chemical Oxygen Demand (TCOD) (mg/L)      42,308 ± 3721
Soluble Chemical Oxygen Demand (SCOD) (mg/L)    5028 ± 865
Total Solids (TS) (g/L)                         33.8 ± 0.9
Volatile Solids (VS) (g/L)                      29.3 ± 1.1
pH                                              6.2 ± 0.7
Ammonia (mg/L)                                  180 ± 12.8
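The nanoparticle doses stated per gram of total solids can be converted to a mass per litre of sludge using the TS value in Table 6-1. The sketch below is a worked calculation only; it assumes the sludge is dosed undiluted at the mean TS concentration, and the function name is hypothetical.

```python
TS_G_PER_L = 33.8  # mean total solids of the sludge, from Table 6-1


def dose_mg_per_litre(dose_mg_per_g_ts, ts_g_per_l=TS_G_PER_L):
    """Convert a dose in mg per g of total solids to mg per litre of sludge."""
    return dose_mg_per_g_ts * ts_g_per_l


low = dose_mg_per_litre(1.0)    # low (L) dose of 1 mg/g-TS
high = dose_mg_per_litre(10.0)  # high (H) dose of 10 mg/g-TS

print(f"low dose:  {low:.1f} mg nanoparticles per litre of sludge")
print(f"high dose: {high:.1f} mg nanoparticles per litre of sludge")
```

Under these assumptions the low and high doses correspond to about 33.8 mg and 338 mg of nanoparticles per litre of sludge, respectively.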
6.2.2. Flow cytometry based method can capture functionally redundant microbiome
None of the nanoparticles tested showed a toxic impact on the anaerobic digestion
process in terms of biogas production, methane composition, TCOD/SCOD reduction or VFA
accumulation with respect to either the positive or the negative control. These results agree
with most reported studies of nanotoxicity in waste processing (Chen et al., 2014; García et
al., 2012; Li et al., 2015; Yang et al., 2013a). Similarly, nano-TiO2 in doses up to 150
milligrams per gram of total suspended solids (mg/g-TSS) showed no inhibitory effect, and
methane generation was the same as in the control (Mu et al., 2011). Nyberg et al. (2008)
studied the exposure of biosolids from anaerobic wastewater treatment to fullerene (C60) in
order to model an environmentally relevant discharge scenario. Activity was assessed by
monitoring the production of CO2 and CH4. Their findings suggest that C60 fullerenes have no
significant effect on the anaerobic community over an exposure period of a few months. This
conclusion is based on the absence of toxicity, indicated by no change in methanogenesis
relative to untreated reference samples. DGGE results showed no evidence of substantial
community shifts due to treatment with C60 in any subset of the microbial community. However,
in the deep learning analysis of the flow cytometry data, the nanoparticle-perturbed
communities look quite different from one another, as shown in Figures 6-3 and 6-4.
Figure 6-3. Deep learning based predictions for nanoparticles perturbed community.
As evident from Figure 6-3, each nanoparticle-perturbed community looks different from
the others and was therefore predicted well. TIH and TIL, the communities exposed to TINPs at
high and low concentrations respectively, deviated slightly and were misclassified as the
FENP community in a few instances. This deviation might arise because these nanoparticles
affect the same trophic level in the microbiome (Ganzoury and Allam, 2015). The positive
control (PCH) and negative control (NCH) were classified better when the model was trained
with the nanoparticle data than with the carbon source data described in Chapter 5. As
reported in Chapter 5, the FSC variables are the most relevant in the deep learning algorithm
used in the proposed methodology. Because shifts in community morphology are more severe
under nanoparticles (Li et al., 2015; Zheng et al., 2015) than under carbon sources, the
controls were classified well.
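The misclassification pattern described above can be tabulated as a confusion matrix with
per-class recall. The sketch below is a minimal illustration that is independent of the deep
learning toolchain from Chapter 5. The labels FEH and CNH and all label values in the demo are
hypothetical placeholders, not results from this study.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    return Counter(zip(true_labels, predicted_labels))


def per_class_recall(true_labels, predicted_labels):
    """Fraction of samples of each true class that were predicted correctly."""
    counts = confusion_matrix(true_labels, predicted_labels)
    totals = Counter(true_labels)
    return {label: counts[(label, label)] / totals[label] for label in totals}


if __name__ == "__main__":
    # Hypothetical labels for illustration only; not data from this study.
    truth = ["TIH", "TIH", "TIL", "TIL", "FEH", "FEH", "CNH", "CNH"]
    preds = ["TIH", "FEH", "TIL", "TIL", "FEH", "FEH", "CNH", "CNH"]
    for (true, pred), n in sorted(confusion_matrix(truth, preds).items()):
        print(f"true={true} predicted={pred} count={n}")
    print(per_class_recall(truth, preds))
```

Off-diagonal entries, such as a TIH sample predicted as an iron-nanoparticle class, correspond
to the kind of deviation discussed for Figure 6-3.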
Figure 6-4. Autoencoder MSE reconstruction for nanoparticles perturbed community.
Figure 6-4 is similar to the corresponding figure in Chapter 5 and identifies which data
look the 'weirdest'. The community perturbed with the higher concentration of TINPs at both
time points, and the one perturbed with the higher concentration of CNTs at the earlier time
point, look 'different' from the normal community. As mentioned earlier, the nanotoxicity
experiments were designed around environmentally relevant concentrations. With the rapid
development of nanotechnology, nanoparticles (NPs) are now widely used in industrial
products such as antibacterial coatings, catalysts, biomedicine, skin creams and toothpastes
because of their unique physicochemical properties, including enhanced magnetic, electrical
and optical properties (Bolyard et al., 2013; Eduok et al., 2013). The present exercise, with
a limited dataset, demonstrates the potential of the flow cytometry based method for
monitoring the anaerobic bioprocess when the community shifts away from its normal dynamics.
These results may indicate that, despite functional redundancy in terms of biogas production,
methane generation, TCOD/SCOD reduction and VFA accumulation, the communities exposed to
TINPs and CNTs are on the verge of breaking down (Bolyard et al., 2013; Kaegi et al., 2013;
Yuan et al., 2015). Further analysis of long-term nanotoxicity experiments might resolve this
divergence (Chen et al., 2012; Otero-González et al., 2014), but the present exercise opens
the door to exploring flow cytometry and machine learning as a high-throughput tool for
assessing nanotoxicity, for both lab researchers and field operators.
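The idea of ranking samples by how poorly an autoencoder reconstructs them can be sketched as
follows. This is a minimal illustration that assumes the reconstructions have already been
produced by a trained autoencoder; it does not reproduce the model from Chapter 5, and the
function names and numbers are hypothetical.

```python
def reconstruction_mse(original, reconstructed):
    """Mean squared error between a fingerprint and its reconstruction."""
    if len(original) != len(reconstructed):
        raise ValueError("vectors must have the same length")
    squared = ((o - r) ** 2 for o, r in zip(original, reconstructed))
    return sum(squared) / len(original)


def rank_by_mse(samples):
    """Return (name, mse) pairs sorted from most to least anomalous.

    `samples` maps a sample name to an (original, reconstructed) pair.
    """
    scores = {name: reconstruction_mse(x, x_hat)
              for name, (x, x_hat) in samples.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical fingerprints and reconstructions, for illustration only.
    demo = {
        "sample_1": ([0.25, 0.25, 0.25, 0.25], [0.24, 0.26, 0.25, 0.25]),
        "sample_2": ([0.10, 0.40, 0.40, 0.10], [0.25, 0.25, 0.25, 0.25]),
    }
    for name, mse in rank_by_mse(demo):
        print(f"{name}: MSE = {mse:.5f}")
```

A sample with a high reconstruction error is one that the model, having learned the typical
community, cannot reproduce well, which is the sense in which a data point looks 'weird'.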
Even though the results from the present study and the associated literature review
suggest no quantifiable toxic effect of some nanoparticles on anaerobic digestion, some
perturbation of normal functions with respect to controls was observed in germination tests,
suggesting the need for further research in this field. At the same time, the effect of
NP-solvents was sometimes more significant than that of the NPs themselves, a point that is
of special interest for future nanotoxicological studies.
6.3. Applications in the proposed campus anaerobic digester
The University of Illinois produces 1500 lbs of paper waste per day that, in the best-case
scenario, is consolidated at the waste transfer station and sold back to the community. While
this is relatively sustainable in the context of recyclables, there are other options to
consider when dealing with this specific type of waste and with others that can be more
difficult to manage, including food waste, manure and other biodegradables. It is therefore
important to find ways to make the campus adaptable in managing different wastes, more
advanced in renewable applications, and technologically competitive with other universities
using progressive systems. Under a grant from the Student Sustainability Committee, the
feasibility of the anaerobic digestion process is being evaluated experimentally and at pilot
scale, and the proposed methodology has potential application in this project, which builds
on a widely used technology.
The feedstock used for this section was newsprint from day-old Daily Illini newspapers
obtained from the waste sorting facility. Newsprint samples were either ground to powder,
shredded to strips in an office paper shredder, or hand cut as whole pieces to the weight
required in each experiment. The experimental platform and the analysis pipeline are the same
as those described in Chapter 3 and Chapter 5 respectively. The goal was to predict the
newspaper cytometric fingerprints (abbreviated as NEWS) from the carbon source dataset
generated with the proposed flow cytometry methodology.
Figure 6-5 shows that the NEWS community is predicted very well and that the model still
gives the better predictions for GLUC and CELL observed previously in Chapter 5. The reported
composition of newspaper is cellulose (a glucose polymer) and wood fiber (with 65.8% glucose,
19.8% xylose, 12.5% galactose and 1.3% mannose) (Biedermann and Grob, 2010; Mahapatra et al.,
2015), which might suggest that NEWS would be misclassified as either CELL or GLUC. However,
the groups of hydrolyzers and acidogens involved in the initial degradation of newsprint
might differ from those feeding on pure cellulose or glucose (Biedermann and Grob, 2010;
Khanal, n.d.; Manyi-Loh et al., 2013). With its predictive power and its capability to
classify various groups of putative hydrolyzers and acidogens accurately, the proposed
methodology might form the basis of routine bioprocess monitoring of the campus anaerobic
digester in the near future.
Figure 6-5. Newsprint cytometric fingerprints predictions off carbon source data.
Autoencoder model reconstruction MSEs on the newsprint dataset are shown in Figure 6-6.
NEWS 1 through 4 represent four data points in the 2-week batch of lab-scale anaerobic
biodegradation of newsprint feedstock. As evident from the figure, NEWS 4 looks completely
different from the average community. The autoencoder model run on the newsprint dataset
also reconfirmed the result presented in Chapter 5 that the CELL 2 community is the
'weirdest' relative to the normal community.
Figure 6-6. Autoencoder reconstruction MSE on newsprint cytometric fingerprints.
These results from the autoencoder model would be crucial in designing and operating the
campus anaerobic digester, since the process would now be engineered considering NEWS 4
as the perturbed time point. The predictive power and the identification of the 'weirdest'
data point relative to the normal community would have applications in stochastic
optimization in classical chemical engineering process design. The approach has the
capability to transform the process engineering paradigm in which process designers are
currently trained on designs based on throughput rate, process yield and product purity
(Doran, 1995). With the novel combination of cytometric fingerprinting and a machine learning
based toolchain, weaknesses in designs could be identified in a more reliable, faster and
high-throughput manner, allowing process designers and process engineers to choose better
alternatives in real time, bypassing the conventional lab-pilot-operations route.
In addition to benefitting the campus with sustainability applications, the proposed tool
will help develop a new tier of digestion technology. Not only would the system be able to
degrade waste more efficiently for energy production, but it would also be enhanced with
system controls to be self-sustaining. As the model becomes stronger with more data and
further optimization of parameters, the campus anaerobic digester would be capable of
automation with sensors, so future maintenance and operation would be minimal. This
self-sustaining, scaled-up operation would not only generate renewable energy but also act as
an educational demonstration model for the campus.
7. Conclusions and Future Work
7.1. Major conclusion of this work
The original hypothesis of this work, that flow cytometry can classify microbial consortia
based on morphological and metabolic characteristics, held true, and the rationale that the
proposed methodology complements existing technologies in the rapid characterization of
microbiome dynamics proved to be justified. However, in order to make subsequent analysis
more efficient and reduce subjectivity, there is a need to develop an automated pipeline to
efficiently analyze the information in cytometric fingerprints.
The major conclusions of this work were as follows:
1. It is possible to quantitatively discriminate between divergent microbiomes using
anaerobic microbial consortia perturbed with the controlled addition of various carbon
sources. By exploiting the three-dimensional flow cytometry signals, namely cell size (FSC or
forward scatter), cell granularity/morphology (SSC or side scatter) and autofluorescence
(corresponding to the same excitation/emission wavelength as the AmCyan standard dye), it is
possible to monitor and rapidly characterize the dynamics of the complex anaerobic
microbiome associated with perturbations in external environmental factors.
2. Flow cytometry was found to be comparable to traditional culture-independent techniques
such as ARISA. While ARISA measures diversity at the genomic level and flow cytometry
measures diversity at the morphological level, there was an observed complementarity between
the two measures. The flow cytometry-ARISA based method was sensitive enough to detect
changes within 5 d of perturbation to the community.
3. With the limited dataset from the carbon source perturbed anaerobic microbiome, the
machine learning algorithms were found to be very fast and comparable to traditional
microbial ecology statistical analysis. The FSC variables depicting cell size and morphology
stood out as the most important variables for classification. Communities perturbed with the
controlled addition of glucose and cellulose looked different from the normal community and
from those perturbed with the controlled addition of other carbon sources. A model comparison
exercise based on predictive capabilities concluded that DL was best at predicting the
overall community, but DRF was best at predicting the most important microbial group in
anaerobic digesters, viz. methanogens.
4. The proposed flow cytometry based method was demonstrated to be directly applicable in a
fully functional industry-scale anaerobic digester to distinguish between microbiome
compositions caused by varying hydraulic retention time (HRT). Further, the proposed
methodology was demonstrated to be useful for monitoring the syntrophic resilience of the
anaerobic microbiome perturbed under nanotoxicity, extending its utility beyond
"monitoring". The proposed methodology has potential for better "designing" and
"operating" of bioprocesses.
7.2. Future research goals
The research presented here demonstrates the utility of the novel integration of flow
cytometry with machine learning in facilitating the rapid characterization of the dynamics of
microbial communities. As discussed in Chapter 3, the structure and function of microbial
communities manifest as unique combinations, or “cytometric fingerprints”, of characteristics
such as viability, metabolic activity and morphology. Integrating Cytometric Fingerprinting
with Machine Learning (CFML) would be a powerful approach to detect changes in the
structure and function of a microbial community in near-real time. The future research
direction would be to improve CFML as a tool that increases the accuracy and throughput of
microbiome characterization. The following specific goals could be targeted in the near
future:
1. Validate CFML as a methodology to allow rapid, high-throughput characterization of the
dynamics of microbial community evolution: A different model community, e.g. from the animal
gut microbiome, could be used for validation.
2. Develop advanced analytic tools to automate the task of identifying and classifying samples
based on cytometric fingerprints: Flow cytometry relies on manual gating for feature
extraction. We propose novel algorithms to adapt to the raw data and produce fast
and automated feature detection in flow cytometry samples, which can become the basis of
classification and searching algorithms in an eventual cytometric fingerprinting database.
3. Apply CFML and the automated analytical pipeline to facilitate early detection
of subclinical bovine mastitis: Cytometric fingerprints are expected to be sensitive enough to
detect and classify (type) subclinical mastitis, which could provide real-time, high-throughput
microbiological milk quality evaluation and inform appropriate antibiotic intervention
strategies on dairy farms.
7.2.1. Further validation of proposed methodology
In order to assess the correspondence between cytometric fingerprints and their genetic
composition, subpopulations could be sorted using various combinations of parameters, for
instance cell size (forward scatter), cell granularity/morphology (side scatter), flavin
autofluorescence, total DNA content (DAPI staining), metabolic activity, and fluorescence in
situ hybridization (FISH) rRNA probes. Cells could be sorted using the two-way sort option,
and the most accurate sort mode (single and one-drop mode; highest purity 99%) could be chosen
for separating up to 5000 cells/sec. Sorted cell populations could be amplified, barcoded and
sequenced with an Illumina MiSeq® to generate genomic information. Analysis of NGS data
for the purpose of characterizing microbial communities takes two approaches: 1) “What
groups of microorganisms seem enriched or attenuated under controlled perturbations?” and 2)
“Can the changes in these populations be justified based on the putative mechanism of the
perturbation?” The expectation behind such studies is that empirically identified
microorganisms can become ‘biomarkers’ for a specific perturbation or pathology of the
microbiome.
7.2.2. Dataset and analytical toolchain development
As discussed in Chapter 5, with the advancement of key technologies in flow cytometry, the
number of parameters that can be measured is increasing rapidly year after year. Hence there
is a need to develop automated algorithms that extract specific features from the cytometry
dataset, beyond simple distance-based algorithms. These “fingerprints” would form the basis
of further analysis, such as identification and classification of samples, as well as the
development of autocorrelation functions that could be used to quantify the rate of
perturbation over time. In essence, a ‘fingerprint’ is a partitioning of the multidimensional
flow cytometry dataset into hyperrectangular regions such that each partition has a similar
amount of informational entropy. In other words, it is a multidimensional histogram in which
the bin width is optimized to enclose equal amounts of Shannon’s information. Automated
partitioning tools such as flowFP exist; however, the problem with such tools is that they
produce partitions based on informational entropy alone, without considering their relevance
to the specific classification problem. Microbial ecosystems are highly dynamic and can show
large variation over short timescales. An important question to answer in this domain would
be, “How much dynamic variation is nominal to the treatment, and what specific variational
patterns signify perturbation?”
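The notion of a fingerprint as an equal-information partition can be illustrated with a
simplified recursive median split. The sketch below is written in Python in the spirit of
probability binning; it is not the flowFP implementation, and the choice to split on the
highest-variance axis, the synthetic three-parameter events and all function names are
assumptions made for illustration.

```python
import math
import random
from statistics import median, pvariance


def fit_partition(events, depth):
    """Build a tree of median splits over a reference set of events.

    At each level the events are split at the median of the axis with the
    largest variance, so each of the 2**depth leaf bins receives a nearly
    equal share of the reference events. A leaf is represented by None.
    """
    if depth == 0 or len(events) < 2:
        return None
    n_dims = len(events[0])
    axis = max(range(n_dims), key=lambda d: pvariance([e[d] for e in events]))
    threshold = median(e[axis] for e in events)
    left = [e for e in events if e[axis] <= threshold]
    right = [e for e in events if e[axis] > threshold]
    return (axis, threshold,
            fit_partition(left, depth - 1),
            fit_partition(right, depth - 1))


def bin_of(event, tree):
    """Return the leaf label (a string of L/R moves) that an event falls into."""
    path = ""
    while tree is not None:
        axis, threshold, left, right = tree
        if event[axis] <= threshold:
            path, tree = path + "L", left
        else:
            path, tree = path + "R", right
    return path


def fingerprint(events, tree):
    """Fraction of events that fall into each occupied leaf bin."""
    counts = {}
    for event in events:
        key = bin_of(event, tree)
        counts[key] = counts.get(key, 0) + 1
    return {key: n / len(events) for key, n in counts.items()}


def shannon_entropy(probabilities):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


if __name__ == "__main__":
    random.seed(1)
    # Synthetic events standing in for (FSC, SSC, autofluorescence) values.
    reference = [(random.gauss(0, 1), random.gauss(0, 2), random.gauss(0, 0.5))
                 for _ in range(1024)]
    shifted = [(x + 0.5, y, z) for x, y, z in reference]
    tree = fit_partition(reference, depth=3)
    print(shannon_entropy(fingerprint(reference, tree).values()))
    print(shannon_entropy(fingerprint(shifted, tree).values()))
```

When the reference sample is binned with its own partition and no values are tied, every bin
holds the same fraction of events, so the entropy reaches its maximum of log2(8) = 3 bits. A
perturbed sample binned with the same partition fills the bins unevenly, and that uneven
occupancy is what the fingerprint records.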
An automated pipeline could be developed that introduces an iterative algorithm to
solve for dataset partitions that optimize a machine classifier; it would help compress the
information, making subsequent analysis more computationally efficient. The flowFP
package in R/Bioconductor has the capability to generate partitions with a fixed number of
recursions. As the partitions get smaller, histograms of the within-sample and between-sample
RMSD could be generated and the Kolmogorov-Smirnov statistic could be used to rank order
the partitions based on their contrast. Repeating this procedure on random subsets of the data
could test the robustness of the optimal partitioning scheme. This optimal partitioning could
be used on the much larger dataset in order to develop efficient machine learning models to
identify the original cause of the perturbation, whether it be carbon sources, nanoparticles or
antibiotics.
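One possible reading of the contrast ranking described above is to compare the distribution of
within-treatment RMSDs with the distribution of between-treatment RMSDs using the two-sample
Kolmogorov-Smirnov statistic. The sketch below is in Python rather than R and makes that
assumption explicit; the function names and the fingerprints in the demo are hypothetical.

```python
import math
from bisect import bisect_right


def rmsd(fp_a, fp_b):
    """Root-mean-square deviation between two equal-length fingerprints."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have the same number of bins")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fp_a, fp_b)) / len(fp_a))


def pairwise_rmsd(group_a, group_b=None):
    """RMSDs for all unordered pairs within one group, or all pairs between two."""
    if group_b is None:
        return [rmsd(group_a[i], group_a[j])
                for i in range(len(group_a))
                for j in range(i + 1, len(group_a))]
    return [rmsd(a, b) for a in group_a for b in group_b]


def ks_statistic(sample_1, sample_2):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    s1, s2 = sorted(sample_1), sorted(sample_2)
    points = sorted(set(s1) | set(s2))
    return max(abs(bisect_right(s1, x) / len(s1) - bisect_right(s2, x) / len(s2))
               for x in points)


def contrast(treatment, other):
    """KS distance between within-treatment and between-treatment RMSDs."""
    return ks_statistic(pairwise_rmsd(treatment), pairwise_rmsd(treatment, other))


if __name__ == "__main__":
    # Hypothetical four-bin fingerprints for two treatments, illustration only.
    control = [[0.25, 0.25, 0.25, 0.25], [0.24, 0.26, 0.25, 0.25],
               [0.26, 0.24, 0.25, 0.25]]
    perturbed = [[0.40, 0.10, 0.30, 0.20], [0.42, 0.08, 0.30, 0.20]]
    print(contrast(control, perturbed))
```

Under this reading, a partitioning scheme with a larger KS statistic separates nominal
within-treatment variation from between-treatment differences more clearly, so candidate
partitions could be rank ordered by this value.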
This will reduce the subjectivity and time-consuming nature of manual data analysis of
cytometric fingerprints. With an algorithm for contrast enhancement and an automated
pipeline to find optimal partitions for a given objective function, the analysis of flow
cytometry data will become repeatable and automated. A direct benefit of this approach is
that this algorithm will be able to identify the bounds of nominal variation within treatments,
thereby reducing the likelihood of false positive biomarkers. This toolchain will permit life
science researchers to focus on the design and interpretation of the experiment itself rather
than on different analytical tools.
7.2.3. Applications of proposed methodology in early detection of subclinical bovine
mastitis
Bovine mastitis costs the US dairy industry $2 billion, an average of $200 per cow
annually. Mastitis is currently diagnosed based on lymphocyte counts in milk, which are a
non-specific marker of infection. The proposed methodology along with the toolchain
developed above can potentially detect and classify (type) subclinical mastitis, which can
provide high throughput microbiological milk quality evaluation and may be able to inform
appropriate antibiotic intervention strategies in dairy farms.
Cytometric fingerprints could be developed for the identification of main mastitis-
causing pathogens using the multiple flow cytometry parameters. Variations in the number,
function and viability of milk leukocytes determine the first-line immune defense of the
bovine mammary gland. The cytometric fingerprinting technique could also be applied to the
somatic cell count and the differential inflammatory cell count, along with quantification of
the proportion of viable, apoptotic and necrotic leukocytes isolated from milk samples
infected with different mastitis pathogens. The deep-learning based machine classifier
approach could be applied to the mastitis dataset to create a classification of main mastitis-
causing pathogens. The most actionable and desirable outcome of this exercise would be a
clear identification of the pathogen, and the number, function and viability of milk leukocytes.
The proposed exercise will demonstrate the bridge between characterizing animal
microbiomes and practical applications in agriculture. Furthermore, it would motivate the use
of data-driven and monitoring-based approaches for the proper and judicious use of
antibiotics in animal agriculture. In addition, this application overcomes a major practical
drawback of animal microbiome evaluation: While characterizing gut and superficial
microbiomes of animals does provide unique insights, it is difficult to implement a practical
microbiome collection protocol in the context of production agriculture. In contrast, using
tools to evaluate the microbial composition of an animal product itself (viz. the milk) would
make a persuasive case for adopting this technology in the field.
7.3. Last words
In addition to the goal of developing a breakthrough technology
for microbiome characterization in anaerobic systems, the research presented here has a
potential to benefit the broader microbial ecology research community by providing a rapid
screening and characterization tool to discover patterns in dynamic microbiome responses to
several stimuli. For instance, these tools could be used in the discovery of novel
antimicrobials, useful probiotics, and in the advancement of our understanding of
gastrointestinal infectious pathogens such as C. difficile in the context of healthy and
perturbed microbial communities. Aspects of this research, including experimental
methodologies and data analysis (R and Machine Learning) can be incorporated into the
existing courses in the university curriculum and experiments from this project can be adapted
as a laboratory exercise or term projects for undergraduate classes or a learning module
including hands-on tools. This project not only has the potential to be successful as a novel
methodology that increases the accuracy and throughput of microbiome characterization via
flow cytometry, but also opens a pathway to commercialize this tool.
LITERATURE CITED
Abdelsalam, E., Samer, M., Attia, Y.A., Abdel-Hadi, M.A., Hassan, H.E., Badr, Y., 2016.
Comparison of nanoparticles effects on biogas and methane production from anaerobic
digestion of cattle dung slurry. Renew. Energy 87, 592–598.
doi:10.1016/j.renene.2015.10.053
Ahring, B.K., Westermann, P., 1987. Kinetics of butyrate, acetate, and hydrogen metabolism
in a thermophilic, anaerobic, butyrate-degrading triculture. Appl. Environ. Microbiol. 53,
434–9.
Akunna, J.C., Bizeau, C., Moletta, R., 1993. Nitrate and nitrite reductions with anaerobic
sludge using various carbon sources: Glucose, glycerol, acetic acid, lactic acid and
methanol. Water Res. 27, 1303–1312. doi:10.1016/0043-1354(93)90217-6
Alsouleman, K., Linke, B., Klang, J., Klocke, M., Krakat, N., Theuerl, S., 2016.
Reorganisation of a mesophilic biogas microbiome as response to a stepwise increase of
ammonium nitrogen induced by poultry manure supply, Bioresource Technology.
doi:10.1016/j.biortech.2016.02.104
Amani, T., Nosrati, M., Sreekrishnan, T.R., 2010. Anaerobic digestion from the viewpoint of
microbiological, chemical, and operational aspects — a review. Environ. Rev. 18, 255–
278. doi:10.1139/A10-011
Amann, R.I., Binder, B.J., Olson, R.J., Chisholm, S.W., Devereux, R., Stahl, D.A., 1990.
Combination of 16S rRNA-targeted oligonucleotide probes with flow cytometry for
analyzing mixed microbial populations. Appl. Envir. Microbiol. 56, 1919–1925.
Angenent, L.T., Sung, S., Raskin, L., 2004. Formation of granules and Methanosaeta fibres in
an anaerobic migrating blanket reactor (AMBR). Environ. Microbiol. 6, 315–322.
doi:10.1111/j.1462-2920.2004.00597.x
APHA, AWWA, WEF, 1998. Standard methods for the examination of water and wastewater.
Ariesyady, H.D., Ito, T., Okabe, S., 2007a. Functional bacterial and archaeal community
structures of major trophic groups in a full-scale anaerobic sludge digester. Water Res.
41, 1554–1568. doi:10.1016/j.watres.2006.12.036
Ariesyady, H.D., Ito, T., Yoshiguchi, K., Okabe, S., 2007b. Phylogenetic and functional
diversity of propionate-oxidizing bacteria in an anaerobic digester sludge. Appl.
Microbiol. Biotechnol. 75, 673–683. doi:10.1007/s00253-007-0842-y
Aydin, S., Ince, B., Ince, O., 2015. Application of real-time PCR to determination of
combined effect of antibiotics on Bacteria, Methanogenic Archaea, Archaea in anaerobic
sequencing batch reactors. Water Res. 76, 88–98. doi:10.1016/j.watres.2015.02.043
Balskus, E.P., 2016. Addressing Infectious Disease Challenges by Investigating Microbiomes.
Barber, W.P.F., 2014. Influence of wastewater treatment on sludge production and processing.
Water Environ. J. 28, 1–10. doi:10.1111/wej.12044
Batstone, D.J., Tait, S., Starrenburg, D., 2009. Estimation of hydrolysis parameters in full-
scale anerobic digesters. Biotechnol. Bioeng. 102, 1513–1520. doi:10.1002/bit.22163
Berg, L. van den, Patel, G.B., Clark, D.S., Lentz, C.P., 1976. Factors affecting rate of methane
formation from acetic acid by enriched methanogenic cultures. Can. J. Microbiol. 22,
1312–1319. doi:10.1139/m76-194
Biedermann, M., Grob, K., 2010. Is recycled newspaper suitable for food contact materials?
Technical grade mineral oils from printing inks. Eur. Food Res. Technol. 230, 785–796.
doi:10.1007/s00217-010-1223-9
Doran, P.M., 1995. Bioprocess Engineering Principles. Academic Press.
Blasco, L., Kahala, M., Tampio, E., Ervasti, S., Paavola, T., Rintala, J., Joutsjoki, V., 2014.
Dynamics of microbial communities in untreated and autoclaved food waste anaerobic
digesters. Anaerobe 29, 3–9. doi:10.1016/j.anaerobe.2014.04.011
Bolyard, S.C., Reinhart, D.R., Santra, S., 2013. Behavior of engineered nanoparticles in
landfill leachate. Environ. Sci. Technol. 47, 8114–8122. doi:10.1021/es305175e
Bombach, P., Hübschmann, T., Fetzer, I., Kleinsteuber, S., Geyer, R., Harms, H., Müller, S.,
2011. Resolution of natural microbial community dynamics by community
fingerprinting, flow cytometry, and trend interpretation analysis. Adv. Biochem. Eng.
Biotechnol. 124, 151–81. doi:10.1007/10_2010_82
Boone, D.R., Chynoweth, D.P., Mah, R.A., Smith, P.H., Wilkie, A.C., 1993. Ecology and
microbiology of biogasification. Biomass and Bioenergy 5, 191–202. doi:10.1016/0961-
9534(93)90070-K
Brar, S.K., Verma, M., Tyagi, R.D., Surampalli, R.Y., 2010. Engineered nanoparticles in
wastewater and wastewater sludge – Evidence and impacts. Waste Manag. 30, 504–520.
doi:10.1016/j.wasman.2009.10.012
Brummeler, E. Ten, Horbach, H.C.J.M., Koster, I.W., 2007. Dry anaerobic batch digestion of
the organic fraction of municipal solid waste. J. Chem. Technol. Biotechnol. 50, 191–
209. doi:10.1002/jctb.280500206
Cardinale, M., Brusetti, L., Quatrini, P., Borin, S., Puglia, A.M., Rizzi, A., Zanardini, E.,
Sorlini, C., Corselli, C., Daffonchio, D., 2004. Comparison of different primer sets for
use in automated ribosomal intergenic spacer analysis of complex bacterial communities.
Appl. Environ. Microbiol. 70, 6147–56. doi:10.1128/AEM.70.10.6147-6156.2004
Caruana, R., Niculescu-Mizil, A., 2006. An empirical comparison of supervised learning
algorithms. Proc. 23rd Int. Conf. Mach. Learn. 161–168. doi:10.1145/1143844.1143865
Čater, M., Fanedl, L., Logar, R.M., 2013. Microbial Community Analyses in Biogas Reactors
by Molecular Methods. Acta Chim. Slov. 60, 243–255.
Chaparro, J.M., Sheflin, A.M., Manter, D.K., Vivanco, J.M., 2012. Manipulating the soil
microbiome to increase soil health and plant fertility. Biol. Fertil. Soils 48, 489–499.
doi:10.1007/s00374-012-0691-4
Chattopadhyay, P.K., Roederer, M., 2012. Cytometry: Today’s technology and tomorrow’s
horizons. Methods 57, 251–258. doi:10.1016/j.ymeth.2012.02.009
Chen, J.L., Ortiz, R., Steele, T.W.J., Stuckey, D.C., 2014. Toxicants inhibiting anaerobic
digestion: a review. Biotechnol. Adv. 32, 1523–34.
doi:10.1016/j.biotechadv.2014.10.005
Chen, Q. 1985-, 2010. Kinetics of anaerobic digestion of selected C1 to C4 organic acids.
Chen, Y., Wang, D., Zhu, X., Zheng, X., Feng, L., 2012. Long-Term Effects of Copper
Nanoparticles on Wastewater Biological Nutrient Removal and N2O Generation in the
Activated Sludge Process. Environ. Sci. Technol. 46, 12452–12458.
doi:10.1021/es302646q
Christgen, B., Yang, Y., Ahammad, S.Z., Li, B., Rodriquez, D.C., Zhang, T., Graham, D.W.,
2015. Metagenomics Shows That Low-Energy Anaerobic−Aerobic Treatment Reactors
Reduce Antibiotic Resistance Gene Levels from Domestic Wastewater.
Cohen, A., Breure, A.M., van Andel, J.G., van Deursen, A., 1982. Influence of phase
separation on the anaerobic digestion of glucose—II. Water Res. 16, 449–455.
doi:10.1016/0043-1354(82)90170-1
Daniels, L., 1984. Biological methanogenesis: physiological and practical aspects. Trends
Biotechnol. 2, 91–98. doi:10.1016/S0167-7799(84)80004-9
Dhoble, A., 2009. HIGH SOLIDS ANAEROBIC DIGESTION FOR THE LONG TERM
EXPLORATORY NASA LUNAR SPACE MISSIONS.
Dhoble, A.S., Pullammanappallil, P.C., 2014. Design and operation of an anaerobic digester
for waste management and fuel generation during long term lunar mission. Adv. Sp. Res.
54, 1502–1512. doi:10.1016/j.asr.2014.06.029
Eduok, S., Martin, B., Villa, R., Nocker, A., Jefferson, B., Coulon, F., 2013. Evaluation of
engineered nanoparticle toxic effect on wastewater microorganisms: current status and
challenges. Ecotoxicol. Environ. Saf. 95, 1–9. doi:10.1016/j.ecoenv.2013.05.022
EL AKKAD, I., HOBSON, P.N., Cahen, L., Delhal, J., Monteyne-Poulaert, G., 1966. Effect
of Antibiotics on Some Rumen and Intestinal Bacteria. Nature 209, 1046–1047.
doi:10.1038/2091046a0
Elefsiniotis, P., Oldham, W.K., 1994. Substrate degradation patterns in acid-phase anaerobic
digestion of municipal primary sludge. Environ. Technol. 15, 741–751.
doi:10.1080/09593339409385480
Faure, D., Vereecke, D., Leveau, J.H.J., 2009. Molecular communication in the rhizosphere.
Plant Soil 321, 279–303. doi:10.1007/s11104-008-9839-2
Fernandez, A.S., Hashsham, S.A., Dollhopf, S.L., Raskin, L., Glagoleva, O., Dazzo, F.B.,
Hickey, R.F., Criddle, C.S., Tiedje, J.M., 2000. Flexible Community Structure Correlates
with Stable Community Function in Methanogenic Bioreactor Communities Perturbed by
Glucose. Appl. Environ. Microbiol. 66, 4058–4067. doi:10.1128/AEM.66.9.4058-
4067.2000
Fisher, M.M., Triplett, E.W., 1999. Automated Approach for Ribosomal Intergenic Spacer
Analysis of Microbial Diversity and Its Application to Freshwater Bacterial
Communities. Appl. Envir. Microbiol. 65, 4630–4636.
Fulwyler, M.J., 1965. Electronic Separation of Biological Cells by Volume. Science 150.
Ganzoury, M.A., Allam, N.K., 2015. Impact of nanotechnology on biogas production: A mini-
review. Renew. Sustain. Energy Rev. 50, 1392–1404. doi:10.1016/j.rser.2015.05.073
García, A., Delgado, L., Torà, J.A., Casals, E., González, E., Puntes, V., Font, X., Carrera, J.,
Sánchez, A., 2012. Effect of cerium dioxide, titanium dioxide, silver, and gold
nanoparticles on the activity of microbial communities intended in wastewater treatment.
J. Hazard. Mater. 199–200, 64–72. doi:10.1016/j.jhazmat.2011.10.057
Geurts, P., Ernst, D., Wehenkel, L., 2006. Extremely randomized trees. Mach. Learn. 63, 3–
42. doi:10.1007/s10994-006-6226-1
Giesen, C., Wang, H.A.O., Schapiro, D., Zivanovic, N., Jacobs, A., Hattendorf, B., Schüffler,
P.J., Grolimund, D., Buhmann, J.M., Brandt, S., Varga, Z., Wild, P.J., Günther, D.,
Bodenmiller, B., 2014. Highly multiplexed imaging of tumor tissues with subcellular
resolution by mass cytometry. Nat. Methods 11, 417–422. doi:10.1038/nmeth.2869
Gray, J.W., Carrano, A. V, Steinmetz, L.L., Van Dilla, M.A., Moore, D.H., Mayall, B.H.,
Mendelsohn, M.L., 1975. Chromosome measurement and sorting by flow systems. Proc.
Natl. Acad. Sci. U. S. A. 72, 1231–4.
Günther, S., Faust, K., Schumann, J., Harms, H., Raes, J., Müller, S., 2016. Species-sorting
and mass-transfer paradigms control managed natural metacommunities. Environ.
Microbiol. 1–31. doi:10.1111/1462-2920.13402
Henderson, G., Cox, F., Ganesh, S., Jonker, A., Young, W., Janssen, P.H., 2015. Rumen
microbial community composition varies with diet and host, but a core microbiome is
found across a wide geographical range. Sci. Rep. 5, 14567. doi:10.1038/srep14567
Hirsch, A.M., Fujishige, N.A., 2012. Molecular Signals and Receptors: Communication
Between Nitrogen-Fixing Bacteria and Their Plant Hosts. Springer Berlin Heidelberg,
pp. 255–280. doi:10.1007/978-3-642-23524-5_14
Ito, T., Yoshiguchi, K., Ariesyady, H.D., Okabe, S., 2012. Identification and quantification of
key microbial trophic groups of methanogenic glucose degradation in an anaerobic
digester sludge. Bioresour. Technol. 123, 599–607. doi:10.1016/j.biortech.2012.07.108
Kaegi, R., Voegelin, A., Ort, C., Sinnet, B., Thalmann, B., Krismer, J., Hagendorfer, H.,
Elumelu, M., Mueller, E., 2013. Fate and transformation of silver nanoparticles in urban
wastewater systems. Water Res. 47, 3866–77. doi:10.1016/j.watres.2012.11.060
Khanal, S.K. (Ed.), 2008. Anaerobic Biotechnology for Bioenergy Production. Wiley-
Blackwell, Oxford, UK. doi:10.1002/9780813804545
Khanal, S.K., n.d. Microbiology and Biochemistry of Anaerobic Biotechnology, in: Anaerobic
Biotechnology for Bioenergy Production. Wiley-Blackwell, Oxford, UK, pp. 29–41.
doi:10.1002/9780813804545.ch2
Kinet, R., Dzaomuho, P., Baert, J., Taminiau, B., Daube, G., Nezer, C., Brostaux, Y., Nguyen,
F., Dumont, G., Thonart, P., Delvigne, F., 2016. Flow cytometry community
fingerprinting and amplicon sequencing for the assessment of landfill leachate
cellulolytic bioaugmentation. Bioresour. Technol. 214, 450–459.
doi:10.1016/j.biortech.2016.04.131
Koch, C., Fetzer, I., Harms, H., Müller, S., 2013a. CHIC-an automated approach for the
detection of dynamic variations in complex microbial communities. Cytometry. A 83,
561–7. doi:10.1002/cyto.a.22286
Koch, C., Fetzer, I., Schmidt, T., Harms, H., Müller, S., 2013b. Monitoring functions in
managed microbial systems by cytometric bar coding. Environ. Sci. Technol. 47, 1753–
1760. doi:10.1021/es3041048
Koch, C., Günther, S., Desta, A.F., Hübschmann, T., Müller, S., 2013c. Cytometric
fingerprinting for analyzing microbial intracommunity structure variation and identifying
subcommunity function. Nat. Protoc. 8, 190–202. doi:10.1038/nprot.2012.149
Koch, C., Harms, H., Müller, S., 2014a. Dynamics in the microbial cytome-single cell
analytics in natural systems. Curr. Opin. Biotechnol. 27, 134–41.
doi:10.1016/j.copbio.2014.01.011
Koch, C., Harms, H., Müller, S., 2014b. Dynamics in the microbial cytome-single cell
analytics in natural systems. Curr. Opin. Biotechnol. 27, 134–141.
doi:10.1016/j.copbio.2014.01.011
Koch, C., Harnisch, F., Schröder, U., Müller, S., 2014c. Cytometric fingerprints: evaluation of
new tools for analyzing microbial community dynamics. Front. Microbiol. 5, 273.
doi:10.3389/fmicb.2014.00273
Koch, C., Müller, S., Harms, H., Harnisch, F., 2014d. Microbiomes in bioenergy production:
From analysis to management. Curr. Opin. Biotechnol. 27, 65–72.
doi:10.1016/j.copbio.2013.11.006
Kolat, P., Kadlec, Z., 2013. Sewage sludge as a biomass energy source. Acta Univ. Agric.
Silvic. Mendelianae Brun. 61, 85–91. doi:10.11118/actaun201361010085
Le, Q. V, 2015. A Tutorial on Deep Learning Part 2: Autoencoders, Convolutional Neural
Networks and Recurrent Neural Networks.
Leitão, R.C., van Haandel, A.C., Zeeman, G., Lettinga, G., 2006. The effects of operational
and environmental variations on anaerobic wastewater treatment systems: a review.
Bioresour. Technol. 97, 1105–18. doi:10.1016/j.biortech.2004.12.007
Li, A., Chu, Y., Wang, X., Ren, L., Yu, J., Liu, X., Yan, J., Zhang, L., Wu, S., Li, S.,
2013. A pyrosequencing-based metagenomic study of methane-producing microbial
community in solid-state biogas reactor. Biotechnol. Biofuels 6, 3.
doi:10.1186/1754-6834-6-3
Li, L.-L., Tong, Z.-H., Fang, C.-Y., Chu, J., Yu, H.-Q., 2015. Response of anaerobic granular
sludge to single-wall carbon nanotube exposure. Water Res. 70, 1–8.
doi:10.1016/j.watres.2014.11.042
Li, Q., Song, X., Gu, H., Gao, F., 2016. Nitrogen deposition and management practices
increase soil microbial biomass carbon but decrease diversity in Moso bamboo
plantations. Sci. Rep. 6, 28235. doi:10.1038/srep28235
Li, Y.-F., Shi, J., Nelson, M.C., Chen, P.-H., Graf, J., Li, Y., Yu, Z., 2016. Impact of different
ratios of feedstock to liquid anaerobic digestion effluent on the performance and
microbiome of solid-state anaerobic digesters digesting corn stover. Bioresour. Technol.
200, 744–752. doi:10.1016/j.biortech.2015.10.078
Limam, R.D., Chouari, R., Mazéas, L., Wu, T.-D., Li, T., Grossin-Debattista, J., Guerquin-
Kern, J.-L., Saidi, M., Landoulsi, A., Sghir, A., Bouchez, T., 2014. Members of the
uncultured bacterial candidate division WWE1 are implicated in anaerobic digestion of
cellulose. Microbiologyopen 3, 157–167. doi:10.1002/mbo3.144
Lin, Y., Lü, F., Shao, L., He, P., 2013. Influence of bicarbonate buffer on the methanogenetic
pathway during thermophilic anaerobic digestion. Bioresour. Technol. 137, 245–253.
doi:10.1016/j.biortech.2013.03.093
Lyerly, D.M., Saum, K.E., MacDonald, D.K., Wilkins, T.D., 1985. Effects of Clostridium
difficile toxins given intragastrically to animals. Infect. Immun. 47, 349–52.
Mahapatra, I., Sun, T.Y., Clark, J.R.A., Dobson, P.J., Hungerbuehler, K., Owen, R., Nowack,
B., Lead, J., 2015. Probabilistic modelling of prospective environmental concentrations
of gold nanoparticles from medical applications as a basis for risk assessment. J.
Nanobiotechnology 13, 93. doi:10.1186/s12951-015-0150-0
Manyi-Loh, C.E., Mamphweli, S.N., Meyer, E.L., Okoh, A.I., Makaka, G., Simon, M., 2013.
Microbial anaerobic digestion (bio-digesters) as an approach to the decontamination of
animal wastes in pollution control and the generation of renewable energy. Int. J.
Environ. Res. Public Health 10, 4390–417. doi:10.3390/ijerph10094390
McCarty, P.L., Mosey, F.E., 1991. Modelling of Anaerobic Digestion Processes (A
Discussion of Concepts). Water Sci. Technol. 24, 17–33.
Melzer, S., Winter, G., Jäger, K., Hübschmann, T., Hause, G., Syrowatka, F., Harms, H.,
Tárnok, A., Müller, S., 2015. Cytometric patterns reveal growth states of Shewanella
putrefaciens. Microb. Biotechnol. 8, 379–391. doi:10.1111/1751-7915.12154
Merlin Christy, P., Gopinath, L.R., Divya, D., 2014. A review on anaerobic decomposition
and enhancement of biogas production through enzymes and microorganisms. Renew.
Sustain. Energy Rev. 34, 167–173. doi:10.1016/j.rser.2014.03.010
Monici, M., 2005. Cell and tissue autofluorescence research and diagnostic applications.
Biotechnol. Annu. Rev. 11, 227–56. doi:10.1016/S1387-2656(05)11007-2
Mu, H., Chen, Y., Xiao, N., 2011. Effects of metal oxide nanoparticles (TiO2, Al2O3, SiO2
and ZnO) on waste activated sludge anaerobic digestion. Bioresour. Technol. 102,
10305–11. doi:10.1016/j.biortech.2011.08.100
Mukherjee, A., Walker, J., Weyant, K.B., Schroeder, C.M., 2013. Characterization of flavin-
based fluorescent proteins: an emerging class of fluorescent reporters. PLoS One 8,
e64753. doi:10.1371/journal.pone.0064753
Müller, S., Nebe-von-Caron, G., 2010. Functional single-cell analyses: flow cytometry and
cell sorting of microbial populations and communities. FEMS Microbiol. Rev. 34, 554–
87. doi:10.1111/j.1574-6976.2010.00214.x
Nallathambi Gunaseelan, V., 1997. Anaerobic digestion of biomass for methane production:
A review. Biomass and Bioenergy 13, 83–114. doi:10.1016/S0961-9534(97)00020-2
Narihiro, T., Sekiguchi, Y., 2007. Microbial communities in anaerobic digestion processes for
waste and wastewater treatment: a microbiological update. Curr. Opin. Biotechnol. 18,
273–278. doi:10.1016/j.copbio.2007.04.003
Nelson, M.C., Morrison, M., Yu, Z., 2011. A meta-analysis of the microbial diversity
observed in anaerobic digesters. Bioresour. Technol. 102, 3730–3739.
doi:10.1016/j.biortech.2010.11.119
Nyberg, L., Turco, R.F., Nies, L., 2008. Assessing the Impact of Nanomaterials on Anaerobic
Microbial Communities. Environ. Sci. Technol. 42, 1938–1943. doi:10.1021/es072018g
Ong, S.H., Kukkillaya, V.U., Wilm, A., Lay, C., Ho, E.X.P., Low, L., Hibberd, M.L.,
Nagarajan, N., 2013. Species identification and profiling of complex microbial
communities using shotgun Illumina sequencing of 16S rRNA amplicon sequences.
PLoS One 8, e60811. doi:10.1371/journal.pone.0060811
Otero-González, L., Field, J.A., Sierra-Alvarez, R., 2014. Fate and long-term inhibitory
impact of ZnO nanoparticles during high-rate anaerobic wastewater treatment. J.
Environ. Manage. 135, 110–7. doi:10.1016/j.jenvman.2014.01.025
Owen, W.F., Stuckey, D.C., Healy, J.B., Young, L.Y., McCarty, P.L., 1979. Bioassay for
monitoring biochemical methane potential and anaerobic toxicity. Water Res. 13, 485–
492. doi:10.1016/0043-1354(79)90043-5
Pandit, P.D., Gulhane, M.K., Khardenavis, A.A., Vaidya, A.N., 2015. Technological
Advances for Treating Municipal Waste, in: Microbial Factories. Springer India, New
Delhi, pp. 217–229. doi:10.1007/978-81-322-2598-0_13
Park, H.S., Schumacher, R., Kilbane II, J.J., 2005. New method to characterize microbial
diversity using flow cytometry. J. Ind. Microbiol. Biotechnol. 32, 94–102.
doi:10.1007/s10295-005-0208-3
Pedreira, C.E., Costa, E.S., Lecrevisse, Q., van Dongen, J.J.M., Orfao, A., 2013. Overview of
clinical flow cytometry data analysis: recent advances and future challenges. Trends
Biotechnol. 31, 415–25. doi:10.1016/j.tibtech.2013.04.008
Perfetto, S.P., Chattopadhyay, P.K., Roederer, M., 2004. Innovation: Seventeen-colour flow
cytometry: unravelling the immune system. Nat. Rev. Immunol. 4, 648–655.
doi:10.1038/nri1416
Qu, X., Mazéas, L., Vavilin, V.A., Epissard, J., Lemunier, M., Mouchel, J.-M., He, P.,
Bouchez, T., 2009a. Combined monitoring of changes in δ13CH4 and archaeal
community structure during mesophilic methanization of municipal solid waste. FEMS
Microbiol. Ecol. 68.
Qu, X., Mazéas, L., Vavilin, V.A., Epissard, J., Lemunier, M., Mouchel, J.-M., He, P.,
Bouchez, T., 2009b. Combined monitoring of changes in delta13CH4 and archaeal
community structure during mesophilic methanization of municipal solid waste. FEMS
Microbiol. Ecol. 68, 236–45. doi:10.1111/j.1574-6941.2009.00661.x
Ranjard, L., Poly, F., Lata, J.-C., Mougel, C., Thioulouse, J., Nazaret, S., 2001.
Characterization of Bacterial and Fungal Soil Communities by Automated Ribosomal
Intergenic Spacer Analysis Fingerprints: Biological and Methodological Variability.
Appl. Environ. Microbiol. 67, 4479–4487. doi:10.1128/AEM.67.10.4479-4487.2001
Raskin, L., Amann, R.I., Poulsen, L.K., Rittmann, B.E., Stahl, D.A., 1995. Use of ribosomal
RNA-based molecular probes for characterization of complex microbial communities in
anaerobic biofilms. Water Sci. Technol. 31, 261–272.
Raskin, L., Stromley, J.M., Rittmann, B.E., Stahl, D.A., 1994. Group-specific 16S rRNA
hybridization probes to describe natural communities of methanogens. Appl. Environ.
Microbiol. 60, 1232–1240.
Renggli, S., Keck, W., Jenal, U., Ritz, D., 2013. Role of Autofluorescence in Flow Cytometric
Analysis of Escherichia coli Treated with Bactericidal Antibiotics. J. Bacteriol. 195,
4067–4073. doi:10.1128/JB.00393-13
Rezaeinejad, S., Ivanov, V., 2013. Assessment of correlation between physiological states of
Escherichia coli cells and their susceptibility to chlorine using flow cytometry. Water
Sci. Technol. Water Supply 13, 1056. doi:10.2166/ws.2013.083
Rincón, B., Borja, R., González, J.M., Portillo, M.C., Sáiz-Jiménez, C., 2008. Influence of
organic loading rate and hydraulic retention time on the performance, stability and
microbial communities of one-stage anaerobic digestion of two-phase olive mill solid
residue. Biochem. Eng. J. 40, 253–261. doi:10.1016/j.bej.2007.12.019
Robinson, J.P., Roederer, M., 2015. Flow cytometry strikes gold. Science 350.
Rogers, W.T., Holyst, H.A., 2009. FlowFP: A Bioconductor Package for Fingerprinting Flow
Cytometric Data. Adv. Bioinformatics 2009, 1–11. doi:10.1155/2009/193947
Saeys, Y., Gassen, S. Van, Lambrecht, B.N., 2016. Computational flow cytometry: helping to
make sense of high-dimensional immunology data. Nat. Rev. Immunol. 16, 449–462.
doi:10.1038/nri.2016.56
Safoniuk, M., 2004. Wastewater Engineering: Treatment and Reuse. Chem. Eng. 111, 10–12.
Sasaki, D., Hori, T., Haruta, S., Ueno, Y., Ishii, M., Igarashi, Y., 2011. Methanogenic
pathway and community structure in a thermophilic anaerobic digestion process of
organic solid waste. J. Biosci. Bioeng. 111, 41–6. doi:10.1016/j.jbiosc.2010.08.011
Siegert, I., Banks, C., 2005. The effect of volatile fatty acid additions on the anaerobic
digestion of cellulose and glucose in batch reactors. Process Biochem. 40, 3412–3418.
doi:10.1016/j.procbio.2005.01.025
Speece, R.E., 1996. Anaerobic biotechnology for industrial wastewaters.
Stappenbeck, T.S., Virgin, H.W., 2016. Accounting for reciprocal host–microbiome
interactions in experimental science. Nature 534, 191–199. doi:10.1038/nature18285
Stieritz, D.D., Holder, I.A., 1975. Experimental Studies of the Pathogenesis of Infections Due
to Pseudomonas aeruginosa: Description of a Burned Mouse Model. J. Infect. Dis. 131,
688–691. doi:10.1093/infdis/131.6.688
Tilman, D., Balzer, C., Hill, J., Befort, B.L., 2011. Global food demand and the sustainable
intensification of agriculture. Proc. Natl. Acad. Sci. 108, 20260–20264.
doi:10.1073/pnas.1116437108
Turker, G., Aydin, S., Akyol, Ç., Yenigun, O., Ince, O., Ince, B., 2016a. Changes in microbial
community structures due to varying operational conditions in the anaerobic digestion of
oxytetracycline-medicated cow manure. Appl. Microbiol. Biotechnol.
doi:10.1007/s00253-016-7469-9
Turker, G., Aydin, S., Akyol, Ç., Yenigun, O., Ince, O., Ince, B., 2016b. Changes in microbial
community structures due to varying operational conditions in the anaerobic digestion of
oxytetracycline-medicated cow manure. Appl. Microbiol. Biotechnol. 100, 6469–6479.
doi:10.1007/s00253-016-7469-9
Vanrolleghem, P.A., Lee, D.S., 2003. On-line monitoring equipment for wastewater treatment
processes: state of the art. Water Sci. Technol. 47.
Venkiteshwaran, K., Bocher, B., Maki, J., Zitomer, D., 2015. Relating Anaerobic Digestion
Microbial Community and Process Function. Microbiol. insights 8, 37–44.
doi:10.4137/MBI.S33593
Volmer, J., Schmid, A., Bühler, B., 2015. Guiding bioprocess design by microbial ecology.
Curr. Opin. Microbiol. 25, 25–32. doi:10.1016/j.mib.2015.02.002
Wei, W., 1990. Time series analysis.
Werner, J.J., Knights, D., Garcia, M.L., Scalfone, N.B., Smith, S., Yarasheski, K., Cummings,
T.A., Beers, A.R., Knight, R., Angenent, L.T., 2011. Bacterial community structures are
unique and resilient in full-scale bioenergy systems. Proc. Natl. Acad. Sci. U. S. A. 108,
4158–63. doi:10.1073/pnas.1015676108
Westerhoff, P., Song, G., Hristovski, K., Kiser, M.A., 2011. Occurrence and removal of
titanium at full scale wastewater treatment plants: implications for TiO2 nanomaterials. J.
Environ. Monit. 13, 1195–203. doi:10.1039/c1em10017c
Westerholm, M., Müller, B., Arthurson, V., Schnürer, A., 2011. Changes in the Acetogenic
Population in a Mesophilic Anaerobic Digester in Response to Increasing Ammonia
Concentration. Microbes Environ. 26, 347–353. doi:10.1264/jsme2.ME11123
Woese, C.R., 1987. Bacterial evolution. Microbiol. Rev. 51, 221–71.
Xue, Y., Wilkes, J.G., Moskal, T.J., Williams, A.J., Cooper, W.M., Nayak, R., Rafii, F.,
Buzatu, D.A., 2016. Development of a Flow Cytometry-Based Method for Rapid
Detection of Escherichia coli and Shigella Spp. Using an Oligonucleotide Probe. PLoS
One 11, e0150038. doi:10.1371/journal.pone.0150038
Yang, Y., Guo, J., Hu, Z., 2013a. Impact of nano zero valent iron (NZVI) on methanogenic
activity and population dynamics in anaerobic digestion. Water Res. 47, 6790–800.
doi:10.1016/j.watres.2013.09.012
Yang, Y., Zhang, C., Hu, Z., 2013b. Impact of metallic and metal oxide nanoparticles on
wastewater treatment and anaerobic digestion. Environ. Sci. Process. Impacts 15, 39–48.
doi:10.1039/C2EM30655G
Yu, B., Lou, Z., Zhang, D., Shan, A., Yuan, H., Zhu, N., Zhang, K., 2015. Variations of
organic matters and microbial community in thermophilic anaerobic digestion of waste
activated sludge with the addition of ferric salts. Bioresour. Technol. 179, 291–298.
doi:10.1016/j.biortech.2014.12.011
Yuan, Z.-H., Yang, X., Hu, A., Yu, C.-P., 2015. Long-term impacts of silver nanoparticles in
an anaerobic–anoxic–oxic membrane bioreactor system. Chem. Eng. J. 276, 83–90.
doi:10.1016/j.cej.2015.04.059
Zhang, Z.-P., Show, K.-Y., Tay, J.-H., Liang, D.T., Lee, D.-J., Jiang, W.-J., 2006. Effect of
hydraulic retention time on biohydrogen production and anaerobic microbial community.
Process Biochem. 41, 2118–2123. doi:10.1016/j.procbio.2006.05.021
Zheng, X., Wu, L., Chen, Y., Su, Y., Wan, R., Liu, K., Huang, H., 2015. Effects of titanium
dioxide and zinc oxide nanoparticles on methane production from anaerobic co-digestion
of primary and excess sludge. J. Environ. Sci. Heal. Part A.
APPENDIX A: Additional experimental results and output of model runs
A.1. FSC-SSC plots for all flow cytometry samples showing percentages of cells in
the prominent blobs.
Figure A-1. FSC-SSC plots of flow cytometry samples fed with different carbon sources.
A.2. Live vs Dead cells in UCSD digester
Figure A-2. FSC Express plots to elucidate the relative percentages of live vs dead cells in the samples obtained from UCSD.
Figure A-3. Gel electrophoregram showing that bacterial DNA bands (left) are much more distinct than archaeal DNA bands
(right) in the samples obtained from UCSD.
A.3. E. coli HB 101 growth curves
Figure A-4. (Top) E. coli HB 101 growth curves at three LB concentrations (A = 10 g/L, B = 20 g/L, C = 40 g/L).
(Bottom) Alexa Fluor 488 mean intensity curves showing that autofluorescence varies with LB concentration (dotted
lines represent duplicates).
[Figure A-4 plots. Top panel: "E coli HB 101 Growth Curve", OD versus Time (H), series A, B, C.
Bottom panel: "Alexa Fluor 488", Alexa Fluor 488 Mean Intensity versus Time (H), series A, A2, B, B2, C, C2.]
A.4. Flavin autofluorescence for UCSD samples
Figure A-5. Live cell percentages in UCSD’s Digester 1 (HRT = 13 days, blue) and Digester 2 (HRT = 26 days, red).
Figure A-6. Flavin autofluorescence, measured as mean fluorescence intensity in the Alexa Fluor 488 channel, can distinguish samples obtained from UCSD’s Digester 1 (HRT = 13 days, blue) from those of Digester 2 (HRT = 26 days, red).
[Figure A-6 plot: Alexa Fluor 488 Mean Intensity versus Time (D).]
A.5. Predictive model run output for UCSD samples
Model Details:
==============
H2OBinomialModel: deeplearning
Model Key: first_model
Status of Neuron Layers: predicting label, 2-class classification,
bernoulli distribution, CrossEntropy loss, 640,802 weights/biases,
7.9 MB, 2,066 training samples, mini-batch size 1
  layer units             type dropout       l1       l2 mean_rate rate_RMS momentum mean_weight
1     1  3000            Input 10.00 %
2     2   200 RectifierDropout 50.00 % 0.000000 0.000000  0.003909 0.002068 0.000000    0.000057
3     3   200 RectifierDropout 50.00 % 0.000000 0.000000  0.008427 0.024785 0.000000   -0.000398
4     4     2          Softmax         0.000000 0.000000  0.000899 0.000369 0.000000   -0.009646
  weight_RMS mean_bias bias_RMS
1
2   0.025627  0.499391 0.006595
3   0.069718  0.998175 0.011197
4   0.398366 -0.000081 0.016927
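The reported total of 640,802 weights and biases can be cross-checked against the layer sizes listed above (3000 inputs, two hidden layers of 200 units, 2 outputs). The following is a minimal Python sketch and not part of the original model run. The function name count_parameters is hypothetical, and the sketch assumes a fully connected network with one bias per non-input unit.

```python
def count_parameters(layer_sizes):
    """Count weights plus biases of a fully connected feed-forward network."""
    total = 0
    # Each pair of consecutive layers contributes n_in * n_out weights
    # and n_out biases.
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total


# Layer sizes taken from the "units" column of the table above.
layer_sizes = [3000, 200, 200, 2]
n_params = count_parameters(layer_sizes)
print(n_params)  # 640802
```

Under these assumptions the count agrees with the figure printed in the model summary.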
H2OBinomialMetrics: deeplearning
** Reported on training data. **
Description: Metrics reported on full training frame
MSE: 0.3220701
R^2: -0.2882805
LogLoss: 1.25982
AUC: 0.778114
Gini: 0.556228
Confusion Matrix for F1-optimal threshold:
E K Error Rate
E 513 487 0.487000 =487/1000
K 128 872 0.128000 =128/1000
Totals 641 1359 0.307500 =615/2000
Maximum Metrics: Maximum metrics at their respective thresholds
metric threshold value idx
1 max f1 0.007573 0.739296 386
2 max f2 0.000628 0.858992 397
3 max f0point5 0.028458 0.705271 362
4 max accuracy 0.022889 0.713500 367
5 max precision 0.999577 1.000000 0
6 max absolute_MCC 0.021389 0.432644 369
7 max min_per_class_accuracy 0.042678 0.698000 350
H2OBinomialMetrics: deeplearning
** Reported on validation data. **
Description: Metrics reported on temporary (load-balanced)
validation frame
MSE: 0.3958777
R^2: -0.5835108
LogLoss: 1.752724
AUC: 0.655518
Gini: 0.311036
Confusion Matrix for F1-optimal threshold:
E K Error Rate
E 191 309 0.618000 =309/500
K 71 429 0.142000 =71/500
Totals 262 738 0.380000 =380/1000
Maximum Metrics: Maximum metrics at their respective thresholds
metric threshold value idx
1 max f1 0.003129 0.693053 387
2 max f2 0.000125 0.838710 398
3 max f0point5 0.005741 0.625786 380
4 max accuracy 0.005741 0.626000 380
5 max precision 0.969332 0.800000 3
6 max absolute_MCC 0.003129 0.272899 387
7 max min_per_class_accuracy 0.025283 0.612000 343
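The F1, error-rate and MCC values reported above can be recomputed from the two confusion matrices. The sketch below is illustrative Python, not output of the original run. It assumes that K is treated as the positive class and that rows are actual classes and columns are predicted classes; the helper names binary_metrics and gini_from_auc are hypothetical. It also checks the relation Gini = 2 x AUC - 1 against the reported values.

```python
from math import sqrt


def binary_metrics(tn, fp, fn, tp):
    """Summary metrics for a 2x2 confusion matrix given as cell counts."""
    total = tn + fp + fn + tp
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "error": (fp + fn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


def gini_from_auc(auc):
    """Gini coefficient implied by an AUC value."""
    return 2 * auc - 1


# Assumption: K is the positive class; rows = actual, columns = predicted.
training = binary_metrics(tn=513, fp=487, fn=128, tp=872)
validation = binary_metrics(tn=191, fp=309, fn=71, tp=429)

for name, metrics in (("training", training), ("validation", validation)):
    print(name, {key: round(value, 6) for key, value in metrics.items()})
print("gini:", round(gini_from_auc(0.778114), 6), round(gini_from_auc(0.655518), 6))
```

The training MCC computed this way need not equal the reported maximum MCC, because that maximum occurs at a different threshold (0.021389) than the F1-optimal threshold (0.007573). For the validation frame both maxima fall at the same threshold (0.003129), so the two values can be compared directly.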
Scoring History:
  timestamp             duration training_speed  epochs iteration     samples training_MSE
1 2015-12-02 20:45:24  0.000 sec                0.00000         0    0.000000
2 2015-12-02 20:45:28  5.030 sec   219 rows/sec 0.09950         1  199.000000      0.32207
3 2015-12-02 20:45:41 20.065 sec   203 rows/sec 1.03300        10 2066.000000      0.04244
4 2015-12-02 20:45:47 23.609 sec   200 rows/sec 1.03300        10 2066.000000      0.32207
  training_r2 training_logloss training_AUC training_classification_error validation_MSE
1
2    -0.28828          1.25982      0.77811                       0.30750        0.39588
3     0.83025          0.17720      0.98715                       0.05050        0.34832
4    -0.28828          1.25982      0.77811                       0.30750        0.39588
  validation_r2 validation_logloss validation_AUC validation_classification_error
1
2      -0.58351            1.75272        0.65552                         0.38000
3      -0.39330            2.06235        0.66551                         0.35600
4      -0.58351            1.75272        0.65552                         0.38000
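The negative R^2 values in the scoring history are consistent with R^2 = 1 - MSE / Var(y), where Var(y) = p(1 - p) for a 0/1 label with positive fraction p. Both frames are balanced (1000 of 2000 and 500 of 1000 samples per class), so Var(y) = 0.25 and any MSE above 0.25 gives a negative R^2. The Python sketch below works under that assumption. The helper r2_from_mse is hypothetical, and whether H2O computes R^2 in exactly this way is assumed here rather than confirmed by the output.

```python
def r2_from_mse(mse, positive_fraction):
    """R^2 implied by an MSE for a binary 0/1 response."""
    # Variance of a Bernoulli label with the given positive fraction.
    label_variance = positive_fraction * (1.0 - positive_fraction)
    return 1.0 - mse / label_variance


# MSE values copied from the training and validation metric blocks.
training_r2 = r2_from_mse(0.3220701, positive_fraction=0.5)
validation_r2 = r2_from_mse(0.3958777, positive_fraction=0.5)

print(f"training R^2:   {training_r2:.6f}")
print(f"validation R^2: {validation_r2:.6f}")
```

Read this way, a negative R^2 means the predicted probabilities have a larger squared error than a constant prediction of 0.5 would.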
Variable Importances: (Extract with `h2o.varimp`)
=================================================
Variable Importances:
variable relative_importance scaled_importance percentage
1 V913 1.000000 1.000000 0.000381
2 V711 0.998849 0.998849 0.000380
3 V1772 0.998077 0.998077 0.000380
4 V2982 0.990547 0.990547 0.000377
5 V1019 0.987840 0.987840 0.000376
---
variable relative_importance scaled_importance percentage
2995 V1739 0.773655 0.773655 0.000295
2996 V1115 0.773630 0.773630 0.000295
2997 V846 0.765953 0.765953 0.000292
2998 V600 0.755322 0.755322 0.000288
2999 V842 0.744924 0.744924 0.000284
3000 V29 0.733244 0.733244 0.000279
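One way to read the percentage column is to compare each share with the uniform share of 1/3000 that every input would receive if all variables mattered equally. The Python sketch below does this for the top-ranked and bottom-ranked variables listed above. The helper name ratio_to_uniform is hypothetical, and this reading is a suggestion rather than a result of the original analysis.

```python
N_VARIABLES = 3000  # number of input variables in the importance table


def ratio_to_uniform(percentage, n_variables=N_VARIABLES):
    """Ratio of a variable's importance share to the uniform share 1/n."""
    return percentage * n_variables


top_ratio = ratio_to_uniform(0.000381)     # V913, rank 1
bottom_ratio = ratio_to_uniform(0.000279)  # V29, rank 3000

print(f"top: {top_ratio:.2f}x uniform, bottom: {bottom_ratio:.2f}x uniform")
```

Ratios this close to 1 at both ends of the ranking suggest that the importance is spread almost evenly across the 3000 inputs rather than concentrated in a few of them.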