Genomic and Transcriptomic Analysis ofLactobacillus rhamnosus LC705
Ilhan Cem Duru
Master’s thesisUniversity Of Helsinki
Faculty of Biological and Environmental Sciences Department of Biosciences
January 2017
Tiedekunta – Fakultet – FacultyFaculty of Biological and Environmental Sciences
Laitos – Institution – DepartmentDepartment of Biosciences
Tekijä – Författare – Author Ihan Cem DuruTyön nimi – Arbetets titel – TitleGenomic and transcriptomic analysis of Lactobacillus rhamnosus LC705Oppiaine – Läroämne – SubjectBiotechnologyTyön laji – Arbetets art – LevelMaster’s Thesis
Aika – Datum – Month and yearJanuary 2017
Sivumäärä – Sidoantal – Number of pages71
Tiivistelmä – Referat – Abstract
Lactobacilli are gram-positive lactic acid bacteria with wide beneficial properties for human health and foodproduction. Today most of the fermented products and probiotic foods are produced by lactobacilli species. Oneof the most using area of lactobacilli species is fermented products especially dairy products. Lactobacilli speciescan be used as starter or adjunct cultures in dairy products and play important role for preservation and quality,texture and flavor formation. Additionally, probiotic properties of lactobacilli species provide several health effectfor human by stimulation of immune system and protection against pathogens. Lactobacillus rhamnosus LC705 isa facultatively heterofermentative type lactobacilli which is used in production of dairy products as adjunct starterand protective culture. The complete and annotated genome sequence of L. rhamnosus strain LC705 published on2009. Known characteristics of L. rhamnosus strain LC705 are food preservation, toxin removal and healthbenefits when combined with other probiotic strains. However, molecular mechanism behind these characteristicsare not known or not clearly understood. To get further insight on these properties and roles in cheese ripening ofstrain LC705, we re-annotated genome of the LC705 with updated methods and databases, analyzed metabolicpathways of LC705, and performed RNA-seq experiment to determine gene expression changes of LC705 duringwarm room (25 °C) and cold room (5 °C) cheese ripening process.
Several un-characterized proteins of LC705 were annotated (77) and 1197 enzyme commission (EC) numbersare added to annotation file with re-annotation of genes of LC705. More importantly, re-annotation provided us 72new pathways of LC705 which is 35% of the entire collection of 201 pathways. Analyzes of pathways showedthat genome of LC705 has responsible genes for production of flavor compounds such as acetoin and diacetylwhich are provide buttery flavor to dairy products, and hydrogen sulfide which is a volatile sulfur compound thatcause unlikeable odor. Additionally to flavor compounds, we defined genes that produce anti-fungus compoundsand bacteriocin which provide food preservation characteristic to LC705. Determination of gene expressionrespond of LC705 during warm room and cold room cheese ripening process with RNA-Seq showed that centralmetabolism genes that responsible for lyase activity, degradation activity, disaccharides and monosaccharidesmetabolism are warm induced genes. The genes play role in citrate metabolism pathways were significantlydown-regulated during cold room, citrate degradation pathways are critical for buttery flavor products, thereforebuttery flavor compounds are produced by LC705 during warm room. Finally, during cold room ripening, thegenes of LC705 that produces ethanol and acetyl-CoA from pyruvate was up-regulated, so we may say thatLC705 uses pyruvate to produce ethanol and acetyl-CoA instead of lactic acid.
Avainsanat – Nyckelord – KeywordsLactic acid bacteria, RNA-seq, gene annotation, cheese, metabolic pathwayOhjaaja tai ohjaajat – Handledare – Supervisor or supervisors Olli-Pekka Smolander and Petri AuvinenSäilytyspaikka – Förvaringställe – Where depositedUniversity of Helsinki, Viikki Campus LibraryMuita tietoja – Övriga uppgifter – Additional information-
Table of Contents
Abbreviations.............................................................................................................................................5
1. Introduction............................................................................................................................................6
1.1. Lactobacillus...................................................................................................................................6
1.1.1. Role of Lactobacilli in Fermented Dairy Products..................................................................6
1.1.2. Probiotic strains........................................................................................................................9
1.2. Lactobacillus rhamnosus strain LC705........................................................................................10
1.3. Transcriptome................................................................................................................................14
1.4. RNA–Seq Analysis of the Transcriptome of the Bacteria.............................................................16
1.4.1. RNA Extraction......................................................................................................................16
1.4.2. Library Preparation................................................................................................................17
1.4.3. Sequencing of cDNA Library................................................................................................18
1.4.4. Analysis of Bacterial RNA-Seq Data for Differential Gene Expression Analysis.................21
1.4.4.1. Quality Check.................................................................................................................21
1.4.4.2. Filtering and Trimming...................................................................................................23
1.4.4.3. Mapping..........................................................................................................................23
1.4.4.4. Transcript Quantification................................................................................................24
1.4.4.5. Differential Gene Expression Analysis...........................................................................24
1.5. Functional Gene Annotation.........................................................................................................26
2. Aims of the Study.................................................................................................................................28
3. Materials and Methods.........................................................................................................................29
3.1. Cheese Sampling and RNA-sequencing.......................................................................................29
3.2. Gene Annotation Update and Pathway Analysis...........................................................................30
3.3. RNA-Seq Data Analysis................................................................................................................31
3.3.1. Filtering and Quality Control.................................................................................................31
3.3.2. Mapping and Counting Reads................................................................................................32
3.3.3. Differential gene expression..................................................................................................32
3.3.4. Gene enrichment analysis......................................................................................................33
4. Results..................................................................................................................................................35
4.1. Gene annotation update.................................................................................................................35
4.2. Metabolic pathways and metabolism of strain LC705 related to cheese......................................35
4.2.1. Fermentation Pathways..........................................................................................................37
4.2.2. Enzymatic Degradation of Amino Acids................................................................................37
3
4.2.3. Carboxylates Degradation......................................................................................................39
4.2.4. Carbohydrates Degradation....................................................................................................41
4.2.5. Production of Biogenic Amines.............................................................................................41
4.2.6. Vitamin Biosynthesis.............................................................................................................41
4.2.7. Lipolysis and fatty acid biosynthesis.....................................................................................43
4.2.8. Protective activities................................................................................................................43
4.3. RNA-Seq data analysis.................................................................................................................44
4.3.1. Read Filtering, Quality Check and Mapping.........................................................................44
4.3.2. Differential gene analysis.......................................................................................................44
4.2.3. Gene Ontology Enrichment Analysis.....................................................................................45
4.2.4. Pathway enrichment analysis.................................................................................................49
5. Discussion............................................................................................................................................52
6. Conclusions..........................................................................................................................................57
7. Acknowledgements..............................................................................................................................58
8. References............................................................................................................................................59
4
AbbreviationsBAM Binary Alignment/MapBBH bi-directional best hitBP biological processbp base pairCC cellular component EC enzyme commissionEMP Embden-Meyerhof-Parnas pathway FDR false discovery ratesGIT gastrointestinal tractGO Gene Ontology HboV human bocavirusIBS irritable bowel syndromeIFNγ interferon-γLFC log fold changeMF molecular function NSLAB non-starter lactic acid bacteriapadj adjusted P value PBMC peripheral blood mononuclear cellsPLP pyridoxal 5′-phosphate RPKM reads per kilobase per million mapped readsSAM Sequence Alignment/MapSMRT single-molecule real-timeTC Transporter Classification TMM trimmed mean of M-valuesUQ upper quartile VSC volatile sulfur compound ZMW zero-mode waveguide
5
1. Introduction
1.1. Lactobacillus
The genus Lactobacillus is one of the largest genus among the lactic acid bacteria and it
contains more than 150 species. (October 2016; http://www.ncbi.nlm.nih.gov/Taxonomy). This genus
contains rod-shaped, gram-positive bacteria that are acid-tolerant and has a genome with low GC
(guanine and cytosine)-content (Salvetti et al., 2012). Lactobacilli are mostly aero-tolerant anaerobic
organisms and they are strictly fermentative. They generally prefer to live in rich carbohydrate-
containing areas. They are found in mucosal flora of humans and animals (oral, intestinal and vaginal
microflora), in plants and in fermenting food (Merk et al., 2005). Based on the fermentation properties
of lactobacilli, there are three groups; obligately homofermentative, facultatively heterofermentative
and obligately heterofermentative (summarized in figure 1). Obligately homofermentative lactobacilli
principally convert hexoses to lactic acid through Embden-Meyerhof-Parnas pathway (EMP) or
glycolysis (Mcdonald et al., 1987). This lactobacilli group cannot ferment pentoses and gluconates.
Facultatively heterofermentative lactobacilli not only can convert hexoses to lactic acid by using the
EMP, but also it can additionally convert pentoses and gluconate to acetic acid, ethanol and formic acid
via the pentose phosphate pathway under glucose limitation (Salvetti et al., 2012). Obligately
heterofermentative lactobacilli primarily ferment pentoses and hexoses to lactic acid, ethanol and CO2
through phosphogluconate pathway (Salvetti et al., 2012).
1.1.1. Role of Lactobacilli in Fermented Dairy Products
Large number of lactobacilli species are used in the food industry, mainly in dairy products like
cheese, yogurt, fermented milk, and also in other fermented foods such as fermented meat sausages
(Fontana et al. 2016) and in bakery products (Garofalo et al., 2012). Reasons why lactobacilli are
commonly used in the food industry are the role of fermentation of lactobacilli, the food preservative
activity by acidification and production of bacteriocin and the anti-fungal compounds and the
enhancement of flavor, texture and nutrition (Giraffa et al., 2010).
6
Figure 1. Three different lactobacilli groups based on fermentation properties summarized.
The most common field that lactobacilli species are used in is cheese production. In cheese
production, the microbial enzymes are of main importance for the development of flavor. Cheese flavor
development is a complex biochemical process in which lactobacilli join this process by catabolize
milk compounds such as sugars (mostly lactose and citrate), proteins, amino acids and lipids.
Lactobacilli species like Lactobacillus helveticus can be used in cheese production as a starter
culture (O’Sullivan et al., 2016). However, lactobacilli species are mostly used as a non-starter lactic
acid bacteria (NSLAB). Starter cultures starts and accelerate the cheese ripening process in cheese by
producing intracellular enzymes such as peptidases, lipases and esterase (Hannon et al., 2003).
Breakdown of proteins and fats are mainly done by starter cultures, and these activities play an
important role in cheese flavor development (Lortal and Chapot-Chartier, 2005). Lysis of proteins by
starter cultures lead to better flavor in cheese and increase concentration of free amino acids (Lortal and
Chapot-Chartier, 2005). In cheese, casein is the main protein, and degradation of casein by peptidases
causes the reduction of bitterness taste (Fallico et al., 2005). Lipase activities in starter cultures cause
hydrolysis of milk fat into free fatty acids which are precursors of flavor compounds like esters
7
(Esteban-Torres et al., 2014). As a lactobacilli strain, Lactobacillus helveticus is highly autolytic strain
and it is used for accelerate the ripening process (Hannon et al., 2003).
NSLAB flora in cheese is dominated by facultative heterofermentative mesophilic lactobacilli
species such as Lactobacillus casei, Lactobacillus paracasei, Lactobacillus rhamnosus and
Lactobacillus plantarum (Gobbetti et al., 2015). NSLAB is used as secondary flora – as adjunct
culture- in cheese flavor production for different cheese types. Especially in Swiss-type cheeses, flavor
intensity depends to counts of facultatively heterofermentative lactobacilli and Propionibacterium
(Bouton et al., 2009). Several studies have showed the importance of adjunct lactobacilli on cheese
flavor development, and it is reported that adjunct lactobacilli species use free fatty acids, free amino
acids, lysis products of starter culture and some sugars. (Burns et al., 2012; Ortakci et al., 2015; Sgarbi
et al., 2013). Aminotransferase activity and glutamate dehydrogenase activity play key roles in
producing flavor products in lactobacilli (Peralta et al., 2016). Some NSLAB can utilize citrate in the
production of aroma compounds. As a result of citrate fermenting pathways in NSLAB, C4-
metabolites, which are acetoin, diacetyl and 2,3-butanediol, are synthesized. These compounds are
highly important for flavor formation (Martino et al., 2016). Selection of NSLAB is also a key step for
cheese production since properties of NSLAB vary a lot depending on species and strains. Some strains
can have dramatic effect on cheese flavor formation and some strains do not have any significant effect
(Crow et al., 2001).
Starter and adjunct lactobacilli microbiota play key roles in dairy product production, quality
and flavor formation. However, they can also produce undesirable compounds like biogenic amines
(Schirone et al., 2013). The production of biogenic amines, mostly histamine, tyramine, putrescine and
cadaverine, occurs from decarboxylation of amino acids such as histidine, tyrosine, lysine and ornithine
(Bunkova et al., 2010) (Summarized in figure 2). Biogenic amides are potential toxic compounds in
which consumption of high doses (>40 mg per meal) can cause several different types of diseases like
migraine, hypotension and allergic response (Shalaby, 1996). Recommended upper limit for the sum of
biogenic amines in cheese is 900 mg/kg (Valsamaki et al., 2000). Biogenic amine poisoning is mostly
reported from fish consumption, but cheese is also one of the reported causes (Linares et al., 2012).
Hence, controlling lactobacilli decarboxylation activities is important for food safety to avoid high
biogenic amine production.
8
1.1.2. Probiotic strains
Lactobacilli are also used in the health industries as probiotics. The definition of probiotics is
“live microorganisms which, when administered in adequate amounts, confer health benefit on the
host” defined by FAO/WHO (2002). Lactobacilli have been found to provide health benefits for
humans when 1 × 106 - 1 × 1010 viable cells per day were consumed (Bernardeau et al., 2006). The
most common lactobacilli species that are used in health industries due to many health benefits are
Lactobacillus rhamnosus, Lactobacillus acidophilus, Lactobacillus plantarum, and Lactobacillus casei
(Espino et al., 2015). Lactabacillus rhamnosus is the most studied lactobacilli species as a probiotics
(Douillard et al., 2013). Several studies show that lactobacilli provide health benefits to the host, such
as immunomodulatory effects (Johansson et al., 2016), protection effect against colon carcinogenesis
(Gamallat et al., 2016), reduction aggregation capacity of gastric pathogen Helicobacter pylori
(Bergonzelli et al., 2006), reduction of blood lipid concentration and heart disease (Liong and Shah
2005) and control of intestinal inflammation (Gao et al., 2015). Stimulation of immune system,
prevention of disease and protection against pathogens are the most common health impacts of
lactobacilli. Additionally lactobacilli species contribute to the nutritional support in humans by
producing several number of different enzymes that human cannot produce. These enzymes break
down important nutritional compounds for humans such as galactooligosaccharides (GOS), lactose and
some starches that cause abdominal pain (Turpin et al., 2010). Furthermore, vitamin synthesis from
lactobacilli is considered to be part of the nutritional benefit conferred from probiotic consumption
(Rossi et al., 2011). For example, vitamin B groups are produced by lactobacilli and vitamin B
complexes are essential for human diet (Kleerebezem and Hugenholtz, 2003). Lactobacilli attract
interest in markets of probiotic foods due to these health effects. Lactobacilli are species used around
the world in commercial products (table 1) and consumption of lactobacilli-containing probiotics is
high, especially in Europe (Saxelin, 2008). Hence, lactobacilli created huge probiotic industry accounts
for billions of Euros.
9
Figure 2. Decarboxylation activities in lactobacilli cause biogenic amines which are toxic compounds
for human. Produced biogenic amines from fermented food; histamine, cadaverine, tyramine and
putrescine are produced by histidine decarboxylase, lysine decarboxylase, tyrosine decarboxylase and
ornithine decarboxylase respectively.
1.2. Lactobacillus rhamnosus strain LC705
Lactobacillus rhamnosus strain LC705 belongs to Lactobacillus rhamnosus species which is
used in dairy products as protective and adjunct culture and probiotics. Strain LC705 is facultatively
heterofermentative type of lactobacilli, It can ferment sugars to lactic acid, ethanol, CO2 and citrate.
Strain LC705 is dairy adapted, allowing to be used as an adjunct starter culture in dairy products
(Kankainen et al., 2009). It also has properties of food preservation, toxin removal and health benefits
when combined with other probiotic strains (Espino et al., 2015).
10
Table 1. Lactobacilli strains which are used in commercial products and their health effect. (Saxelin et
al., 2005; Siezen and Wilson, 2010).
Species Strain Commercial Product Health effect
Lactobacillus rhamnosus
GG ActifitPlus®, Gefilus® Immune-modulatory effect, prevents diarrhoea
Lactobacillus helveticus& Lactobacillus rhamnosus
R0052 & R0011 A’Biotica® Helicobacter pyloriinhibition
Lactobacillus casei DN114001 Actimel®, DanActive® Acute diarrhoea treatment, infectionprevention
Lactobacillus plantarum 299v ProViva® Iron absorption
Lactobacillus paracasei St11 Lactobacillus fortis® Immune-modulatory effect, gut health
Lactobacillus casei Shirota Yakult® Alleviation of acute diarrhoea
Lactobacillus johnsonii NCC533 LC1 range® Immune-modulatory effect, pathogen inhibition
Complete and annotated genome sequence of strain LC705 was published in 2009
(Kankainen et al., 2009) has 3.03 Mbp genome and one plasmid. The genome contains 2992 genes and
the GC content is 47%. Compared to the Lactobacillus rhamnosus strain GG, strain LC705 has bad
adhesion capacity to human intestinal mucus (Myllyluoma et al., 2007). The comparative genomics
study between strain GG and strain LC705 showed that the reason for the poor adhesion of LC705 is
due to the lack of pilus gene cluster spaCBA (Kankainen et al., 2009). Within the Lactobacillus
rhamnosus species, only strain GG has unique pilus genes, which allows it to colonize the
gastrointestinal tract (GIT) efficiently.
Despite the poor adhesion adhesion capability of strain LC705 to GIT, it is used in
combination with other probiotic strains due to important health effects. Nine different studies show
health effects of probiotic therapy that contains strain LC705. When strain LC705 is used in probiotic
therapy with other probiotic strains which are lactobacillus rhamnosus GG, Bifidobacterium breve
11
Bb99 and Propionibacterium freudenreichii ssp. shermanii JS, It has been observed to alleviate
symptoms in irritable bowel syndrome (IBS) (Kajander et al., 2005; Kajander et al., 2008; Lyra et al.,
2010), decrease mast-cell activation and the effects of other allergy-related mediators (Oksaharju et al.,
2011), potentially reduces the presence of human bocavirus (HboV) (Lehtoranta et al., 2012). Another
study (Collado et al., 2007) with the same bacteria combination showed that probiotic bacteria
combinations have better synergistic effects against pathogens than the individual strains. When these
four strains are used with GOS, they stimulates the peripheral blood mononuclear cells (PBMC)
proliferation and interferon-γ (IFNγ) production, which are beneficial in allergic (Holma et al., 2011).
The same species combination was also tested for newborn infants, and the combination was safe for
the infants and also provided respiratory infection resistance (Kukkonen et al., 2008). LC705-
containing probiotic therapy in combination with L. rhamnosus GG, P. freudenreichii ssp. Shermanii
JS, and Bifidobacterium lactis Bb12 have other health benefits as well. This combination has beneficial
effects on the gastric mucosa in Helicobacter pylori-infected patients (Myllyluoma et al., 2007).
Additionally, previously it has been show that LC705 can activate the inflammasome and IFN response
in human macrophages better than strain GG (Miettinen et al., 2012). In conclusion, strain LC705 can
be used as probiotic strain with other probiotic strains in dairy products. Health effects of probiotic
therapies by using different combinations are summarized in table 2.
Toxin removal is one of the characteristics of LC705 (Kajander et al., 2007). The first study that
showed the ability of toxin removal of LC705 was published on 1998. This study showed that strain
LC705 has a significant effect in reducing levels of aflatoxin B1 (AFB1), which is a type of mycotoxin
in liquid media (El-Nezami et al., 1998). Aflatoxins are produced by the fungal species Aspergillus
flavus (A.flavus) and Aspergillus parasiticus, which are found in a variety of food products (Magnussen
et al., 2013). Within different aflatoxin types, aflatoxin B1 being the most common and toxic type, it is
also carcinogenic for animals and humans and increases the risk of liver cancer (Magnussen et al.,
2013). Removal of aflatoxin is important for foods which are contaminated with aflatoxin. Another
study (Nada et al., 2009) showed that strain LC705 not only removes the aflatoxin but also can inhibit
A.flavus growth which is responsible for aflatoxin production. Although LC705 is able to remove
aflatoxin by itself, combinations of lactic acid bacteria strains may work better in removing several
toxins (Halttunen et al., 2008). One clinical study in China showed that probiotic supplements
including strain LC705 reduced urinary excretion of aflatoxin B1-N7-guanine (AFB-N7-guanine), a
biomarker for increased risk of liver cancer (El-Nezami et al., 2006). Thus strain LC705 may be used
12
in diet to prevent the development of liver cancer. In addition to aflatoxin, LC705 can remove another
dangerous mycotoxin zearalenone, which can be found in foods such as maize, oats and wheat (El-
Nezami et al., 2002). More than 250 strains of lactic acid bacteria were tested for aflatoxins removal
and L. rhamnosus strains GG and LC705 are the most efficient for aflatoxin binding (El-Nezami et al.,
2006). Feature of toxin removal makes strain LC705 important for food and health industries.
Table 2. Probiotic therapy studies that contain strain LC705.
Probiotic strain
combination type
Health Effect(s) Reference(s)
Combination A Alleviates IBS.Decreases mast-cell activation and the effects of allergy-related mediators.Reduces the presence of HboV.Synergistic effects against pathogens.Consumption with GOS stimulates PBMC proliferation and IFNγ production.Increases resistance to respiratory infections for newborn infants.
Kajander et al., 2008, Lyra et al., 2010,Oksaharju et al., 2011,Lehtoranta et al., 2012,Collado et al., 2007,Holma et al., 2011,Kukkonen et al., 2008
Combination B Beneficial effect on gastric mucosa in Helicobacter pylori infected patients.
Myllyluoma et al., 2007,
Combination A = Lactobacillus rhamnosus GG, Lactobacillus rhamnosus GG, Bifidobacterium breve
Bb99 and Propionibacterium freudenreichii ssp. shermanii JS, Combination B = L. rhamnosus GG, L.
rhamnosus LC705, P. freudenreichii ssp. shermanii JS and Bifidobacterium lactis Bb12
L. rhamnosus LC705 is used in dairy products as a protective culture. The dairy products are a
very good growth medium for a great range of microorganisms, which allows microbial contamination
(Ruegg, 2013). In dairy products, food spoilage yeasts, molds and some pathogens must be controlled
for food safety. In cheese, common potential pathogenic bacteria species are Listeria monocytogenes
and Escherichia coli (Luukkonen et al., 2005), which may cause severe diseases. Several economic
losses may happen because of foodborne diseases because of these pathogens (Ruegg, 2013; Schlech
and Acheson, 2000). Hence food hygiene is critical for dairy products production. Protective cultures in
dairy products are very important sources for bio-preservative. Microorganisms that are used as bio-13
preservatives produce organic acids, hydrogen peroxide, diacetyl, antifungal peptides, and bacteriocins
as antimicrobial metabolites (Heredia-Castro et al., 2015). Due to restrictions on the use of chemical
additives and preservatives, it is important to use bio-preservatives in the food industry (Gao et al.,
2015). For that purpose, bacteriocin-based bio-preservation is common in the food industry (Gálvez et
al., 2007). Not much is known specifically about protective activities of the strain LC705. Suomalainen
and Mäyrä-Mäkinen (1999) developed a protective culture combination to replace chemical
preservatives like sorbic acid and acetic acid in fermented milk and bread. The developed protective
culture combination consists of two different bacteria: Lactobacillus rhamnosus LC705 and
Propionibacterium freudenreichii ssp. shermanii JS. That study showed that L. rhamnosus LC705 and
P. freudenreichii ssp. shermanii JS together inhibited the growht of food spoilage yeasts, molds and
Bacillus species. Another study (Luukkonen et al., 2005) showed that Lactobacillus rhamnosus LC705
can inhibit the growth of Listeria in Edam cheese. Neither of the studies explained the mechanism of
the inhibitory effect. This inhibitory mechanism requires further investigation in LC705.
1.3. Transcriptome
The transcriptome is the whole set of both coding and non-coding RNAs transcribed from the
genome of an organism at a specific stage. Transcription analysis is very common and important
because it provides functional characterization of the genome, an understanding of the expression of
genome at the transcription level and annotation of transcripts and operons. This information helps to
reconstruct gene networks that give the opportunity to understand cellular functions, regulatory
pathways and biological systems and also to understand disease processes and help in diagnosis and
drug discovery.
For transcriptome studies, different types of technologies have been developed: hybridization
techniques (microarrays) and sequencing techniques. For the hybridization techniques, mostly
annotated genomes are used to construct microarrays, and RNAs of the sample are converted to labeled
complementary DNA (cDNA). Incubation process provide hybridization of labeled cDNA and
microarrays which provide detection of transcript levels (Van Vliet, 2010). To get very high-resolution
results from hybridization techniques, genomic tiling microarrays were constructed. It uses overlapping
short sequence of probes that provide genome-wide analysis of RNA with high resolution (Selinger et
al., 2000). Genomic tiling microarray method is not commonly used because it is very expensive (Pinto
et al., 2011). For a long time, microarrays were mainly used in transcriptomic studies and thousands of
14
publications were published using microarrays. However, microarray technology has critical
limitations. It produces high rate of noise because of cross-hybridization, and transcription level
comparison is often difficult because it requires complicated normalization methods (Wang et al.,
2009).
Sequencing techniques for transcriptome studies are based on determining cDNA sequence.
Sequencing methods like Sanger sequencing of cDNA or expressed sequence tags (EST) libraries and
tag-based methods such as serial analysis of gene expression (SAGE), cap analysis of gene expression
(CAGE) and massively parallel signature sequencing (MPSS) were used for studying transcriptomes.
These methods are highly expensive and have some difficulties in mapping to the reference genome
(Wang et al., 2009). Due to these disadvantages, usage of traditional sequencing is limited in
transcription studies. New approach to sequencing which is next generation ultrahigh-throughput
sequencing, provided a new method for quantifying transcriptomes. This method is RNA-sequencing
(RNA-Seq). RNA-Seq has several advantages over other traditional methods. It is similar price than
other methods. It does not require probe sequences nor genome to be existing prior analysis. It can
analyse whole transcription, thereby it is possible to work with operons and non-coding RNAs.
Additionally, RNA-Seq data provides high resolution results, which improves the mapping of the
sequences to the reference genome. It does not produce high background noise and new transcripts can
be seen with RNA-Seq. Lastly, it is more sensitive in low levels of expression when sufficient
sequencing depth is reached. With these advantages, RNA-Seq became the gold standard for
transcriptome studies (Wang et al., 2009).
The basic working principle of RNA-Seq starts with the conversion of isolated RNAs to a
library of cDNA fragments with the addition of adapter sequences by library-construction methods.
Molecules which are obtained by library-construction are sequenced via ultrahigh-throughput
sequencing machine to get single-end sequencing or pair-end sequencing results. Any of the next-
generation sequencing platforms like Illumina, SOLiD, Roche 454, Ion Torrent, and Pacific biosciences
(Pacbio) can be used for RNA-sequencing. The size and quality of the reads from sequencing machine
vary depending on the DNA-sequencing machine and DNA-sequencing technology. Every platform
uses different library construction protocols and methods. After getting the reads from sequencing,
bioinformatic analysis is needed to process the huge data. The bioinformatic analysis is critical for
RNA-Seq and it is as challenging as the experimental process including isolation of RNAs, cDNA
library construction and sequencing. Bioinformatic analysis procedure of RNA-Seq can be vary
15
depending on experimental design, but it mostly includes quality control of RNA-Seq data and filtering
of the data, alignment of the filtered reads to a reference genome, calculation of expression levels of
genes, analysis of differentially expressed genes and their visualization.
1.4. RNA–Seq Analysis of the Transcriptome of the Bacteria
RNA transcripts in a population of bacterial cells can be studied by sequencing of cDNA
libraries with RNA-Seq technology. In comparison to eukaryotic genomes, bacterial genomes are
smaller and contain high rates of coding sequences (CDS). Nevertheless, compared to eukaryotic RNA
transcriptome studies, bacterial transcriptome studies are more challenging. Eukaryotic messenger
RNAs (mRNAs) have a long poly-A tail that provides easy mRNA isolation from other RNAs by
hybridization to immobilized poly-T residues. However, mostly prokaryotic mRNAs either do not
contain poly-A tails (Srinivasan et al., 1975) or contain a short poly-A tail (Sarkar et al., 1997).
Additionally, bacterial mRNA generally has shorter life time and high ribonuclease activity make
bacterial mRNA unstable (Deutscher, 2006). Due to these difficulties, RNA-Seq technology was first
used for eukaryotic cells (Van Vliet, 2010). These difficulties were overcame with new methods (Sorek
and Cossart, 2010) that include capturing and removing rRNAs, degrading rRNAs with magnetic
beads, or using artificially polyadenylated mRNAs. Currently, developing technologies are creating
methods and protocols (Avraham et al., 2016; Shishkin et al., 2015) that allow working on single cells
and studying host and pathogen transcriptomes with barcoding and pooling RNAs before library
preparation.
The following sub-chapters will cover the steps of RNA-Seq workflow (summarized in figure
3), RNA-Seq data analysis for transcriptome and determination of gene expression changes in bacterial
cells.
1.4.1. RNA Extraction
To generating cDNA libraries of bacterial RNA, the first step is to purify RNA from bacteria.
Generally RNA is extracted from bacteria by using commercially available kits like RNAEasy
(Qiagen), TRIZOL Max (Life Technologies), or RiboPure (Life Technologies). These kits extract RNA
by using specific lysis and filters or membranes. The kits provide easy, rapid and efficient purification
of high-quality RNA. It is important to choose a method or kit that does not bias the transcriptome and
16
is able to collect enough RNA for the library construction (Croucher and Thomson, 2010; Wilhelm and
Landry, 2009). Library construction requires approximately 0.1–10 μg of total RNA, depending on the
aim of the study and library construction protocols (Pease and Sooknanan, 2012). DNA contamination
is very common during the RNA extraction step, hence DNase treatment is highly recommended to
degrade genomic DNA. For bacterial cells, extracted RNA samples contain high-concentration of
rRNA. High amounts of rRNA can create wasted effort sequencing the same rRNA millions of times
(Wilhelm and Landry, 2009). Hence depletion of rRNA by hybridization using commercial kits like
Ribo-Zero Magnetic Kit (Bacteria) (Epicentre Biotechnologies) and MicrobExpress Bacterial mRNA
Purification Kit (Ambion) can be done to enrich mRNA and non-coding RNA (ncRNA). Nevertheless,
Lahens et al. (2014) showed that rRNA depletion creates critical biases in coverage. Therefore mRNA
enrichment step is optional. After RNA extraction, it is important to check quality of RNA for
degradation, quantity and purity.
1.4.2. Library Preparation
RNA samples are not directly sequenced. Next-generation sequencing technology prefers to
sequence DNA directly because of technical maturity and better stability of DNA (Korpelainen et al.,
2014). Therefore, the main aim of library construction for RNA-Seq is to convert single-stranded RNA
to a library of cDNA fragments by reverse transcription with adding adapter sequences. Adapter
sequences help sequences reads to combine with primers for amplification during sequencing. Library
preparation steps can vary depending on library protocols. These variations between protocols can
create differences in strand-specificity, library complexity, coverage and level of gene expression
(Levin et al., 2010). There are several different protocols for library preparation, but the main basic
steps are mRNA fragmentation, cDNA synthesis by reverse transcription, adapter ligation and PCR
amplification and purification of library.
There are two different main protocols for library construction: non-strand-specific protocol and
strand-specific protocol. Non-strand-specific protocols are cheaper and contains fewer steps compared
to strand-specific protocols. Non-strand-specific protocols work well in gene expression studies but
cannot determine original strand of the transcript which is important for overlapping transcripts, gene
regulation and antisense RNA studies. Non-strand-specific library generation is mostly based on one
standard protocol. The fragmentation step in non-strand-specific library generation can be done in two
ways: fragmentation of cDNA or fragmentation of RNA samples. Both methods create the same cDNA
17
library. However, RNA sample fragmentation has less bias compared to cDNA fragmentation (Wang et
al., 2009). After the RNA samples fragmentation, forward and reverse strands of cDNA are synthesized
by reverse transcription using random hexamers which bind to random complementary sites on a target
cDNA.
To determine the strand of the transcript, strand-specific protocols are used. Strand-specific
protocols are technically more challenging and more time consuming compared to non-strand-specific
protocols. There are two main classes of strand-specific RNA-Seq library construction protocols that
use different techniques. The first technique uses strand-specific adapters in a known orientation, hence
these protocols mark the 3’ end and 5’ ends of the mRNA when cDNAs are created. The second
technique is chemical modification of strands including bisulfide treatment and dUTP modification
(Mills et al., 2013). Every step in strand-specific protocols can change depending on the sequencing
platform used, and every sequencing platform has published its own protocols and kits for library
construction. It is recommended to use the library construction protocols of sequencing platforms, but it
can be expensive. Hence low-cost RNA sequencing protocols (Hou et al., 2015; Wang et al., 2011) are
quite common. Library preparation step is critical for the whole analysis because protocols are
generally biased processes. Several factors can create bias including favored specific sites in
fragmentation of RNA. PCR library amplification step can create critical bias including sequence
artifacts due to unequal amplification or amplification efficiency (van Dijk et al., 2014). Bias in the
library preparation step can cause low-complexity libraries, which is problematic for sequencing step.
Hence it is very important to choose protocol and kits carefully.
1.4.3. Sequencing of cDNA Library
Purified cDNA fragments can be sequenced by using a variety of next-generation sequencing
platforms such as Illumina, SOLiD, Ion Torrent and Pacific Biosciences. Each different sequencing
platform provides different lengths of reads, depth of coverage and quality by using different methods
to sequence. Illumina is the most commonly used high-throughput sequencing platform that sequences
via synthesis technique (Reuter et al., 2015). In this process, cDNA fragments are hybridized to glass
slide (flowcell) based on complementarity with adapter sequences. To create high amount of hybridized
cDNA fragments, they are amplified as a bridge and clusters are generated by polymerases. After the
bridge amplification, the reverse strand is removed by washing. For sequencing by synthesis,
nucleotides are differently labeled with fluorescent tags are added (Liu et al., 2012). Additional
18
sequencing reagent start polymerization reaction and only one base complement the template at a time.
Different fluorescent tags from the nucleotides produce unique emission wavelengths and imaging of
the emission wavelength are used to determine the base by a coupled-charge device (Reuter et al.,
2015). Sequencing can be performed at one end or from both ends (paired-end). If paired-end
sequencing is used, both ends of the cDNA fragment is sequenced. Illumina produces wide ranges of
instruments (Miniseq series, Miseq series, Hiseq series and NovaSeq series) with different features.
Depending on the instrument model, read lengths can be 100-300 nucleotides and run time of these
instruments can be vary between 4 hours and 3 days.
The SOLiD sequencing platform uses oligonucleotide ligation-based chemistry instead of
synthesis method. In SOLiD platform, cDNA fragments are attached to magnetic beads that contains
complementary oligonucleotides and emulsion PCR is performed for clonal sequencing as an
amplification step. Amplification products are selected and bound to a flowcell by covalent bonding.
On a SOLiD flowcell, sequencing is performed by a DNA ligase. Universal primer is bound to the
adapter sequence of amplified fragments to create a ligation site for specific octamers, which contain
di-base probes and fluorescent labels (Shendure and Ji, 2008). There are 16 possible di-base
combination with four fluorescent labels. If the di-base is complementary to the cDNA fragment,
ligation occurs and fluorescent signals are produced. In the second cycle, specific octamer is cleaved at
the 5th base and a ligation site is generated for the new specific octamer. Therefore ligation cycles
sequence every 5th base of amplified fragments. The number of cycles of ligation determine the read
length. For the 35 base-pair (bp) read length, seven cycles of ligation is performed. After cycles of
ligation, the first universal primer is removed from the amplified cDNA fragment and new primer is
bound to the adapter sequence of amplified fragment at the n-1 position for a new round of ligation
cycles. This process is repeated with new primer positions to get the whole read of the amplified
fragment. Due to the usage of dinucleotide probes, each base is interrogated twice. The di-base probe
approach from SOLiD provide a low error rate with accuracy of up to 99.99% (Korpelainen et al.,
2014). Depending on the sequencer model of the SOLiD platform, the read length can be vary from 35
bp to 85 bp and the run time can take up to 7 days (Liu et al., 2012).
Ion Torrent is another sequencing platform that uses sequencing by synthesis technique like the
Illumina sequencing platform. Instead of measuring light signals, Ion Torrent detects pH changes
caused by the production of hydrogen ions during DNA synthesis (Rothberg et al., 2011). To detect
hydrogen ions, Ion Torrent platform uses complementary metal-oxide semiconductor (CMOS)
19
technology with ion-sensitive field-effect transistor because high sensitivity of hydrogen ions
(Rothberg et al., 2011). The process in Ion Torrent starts with hybridization of cDNA fragments to
bead. cDNA libraries which contain adapter sequences are clonally amplified to magnetic bead by
using emulsion PCR. Template-carrying beads are loaded to sensor microwells where sequencing by
synthesis chemical reaction occurs. In the sensor microwells of the Ion Torrent platform, four different
nucleotides provided one by one. When a nucleotide is complementary to the template, it binds to
template by polymerase and hydrogen ion is released. Released hydrogen is detected by the
semiconductor sensor and converted to a voltage signal (Reuter et al., 2015). The Ion Torrent
sequencing platform offers high-speed, high-quality and low-cost sequencing performance without
fluorescence labeling and camera scanning. It has a limitation with homopolymer repeats.
Homopolymer of length longer than six base pairs create high error rates (Reuter et al., 2015). Ion
Torrent sequencing platform allow to use maximum 200 bp to 400 bp read length. The run time of the
sequencing can vary between two and eight hours depending on the machine model.
The Pacific Biosciences (PacBio) sequencing platform uses single-molecule real-time (SMRT)
technology which represents third-generation sequencing. The leading difference of SMRT sequencing
is that it requires only single molecules and enables real-time observation. As it works with single
molecules, clonal amplification by PCR is not needed; hence DNA preparation time is shorter than in
other sequencing platforms and can avoid bias and errors caused by PCR for whole genome sequencing
(Liu et al., 2012). For RNA-Seq, Pacbio platform uses the isoform sequencing (Iso-Seq) application.
Iso-Seq requires large scale PCR but it provides full length transcripts without assembly and isoforms
of transcripts. In SMRT, the sequencing template (SMRTbell) is prepared by ligation of hairpin
adapters to both end of cDNA molecules. Prepared SMRTbell is transferred to chip named SMRT cell
which contains zeptoliter-sized chambers called zero-mode waveguides (ZMW). DNA synthesis occurs
in the ZMW with polymerase at the bottom which binds to the hairpin structure of SMRTbell. During
DNA synthesis, four fluorescently-labeled nucleotides are used. Fluorescent signals is detected by
camera in real time (Reuter et al., 2015). One of the advantages of the PacBio platform is read length.
Compare to other platforms PacBio can generate much longer reads. The average read length of the
PacBio sequencing is over 10 Kbp and maximum reads can be longer than 60 Kbp (Rhoads and Au,
2015). Nevertheless, PacBio sequencing platform creates high error rate (up to 10-15%) compared to
other sequencing platforms (Rhoads and Au, 2015).
20
1.4.4. Analysis of Bacterial RNA-Seq Data for Differential Gene Expression Analysis
The huge amount of data is created from RNA-Seq needs to be analyzed. Interpretation of
RNA-Seq data requires robust computational programs, bioinformatic tools and powerful servers.
There are several types of applications for RNA-Seq experiments like fusion gene detection,
differential gene expression studies and RNA editing. There is no single pipeline for those applications.
Quality control, data filtering and trimming, alignment of reads (mapping) or de novo assembly and
indexing (quantification) of reads are common steps for all applications. For gene expression studies,
extra statistical steps are needed for identifying significance in genetic expression within the context of
experimental conditions.
1.4.4.1. Quality Check
As a result of RNA-Seq millions of raw short sequence reads are mostly generated as a FASTQ
file. During generation of these reads, critical quality problems may occur because of sequencing
errors, library preparation errors and PCR bias. These poor-quality sequences can badly affect mapping
and assembly results and gene expression estimates. Raw bases also contains adapter sequences that
need to be trimmed before mapping. To check the quality of raw reads, several tools have been
published, such as FastQC (Andrews, 2010), NGSQC (Dai et al., 2010) and PRINSEQ (Schmieder and
Edwards, 2011). With these tools, it is possible to determine base sequence qualities based on Phred
quality score, sequence GC content, read length distribution, sequence duplication and adapter
contents. Base sequence qualities based on Phred quality score measures the error probability for each
base (Ewing and Green, 1998). Quality values are encoded into the next-generation sequencing output
(FASTQ file format) with ASCII characters. Depending on the sequencing platform, different ASCII
offsets can be used. Offset of 64 and 33 (also named Sanger encoding) are mainly used. For files that
use Phred scores with an ASCII offset of 64, the 64 th character is used as zero. For files with an ASCII
ofset of 33, the 33rd character is used as zero. The Phred quality score calculation is logarithmic based.
For example, a Phred score of 20 means there is a 1 in 100 chance that the base is wrong, hence
accuracy is 99%. Similarly, a Phred score of 30 means that on average every 1000 bp read contains an
error, hence accuracy is 99.9%.
21
Figure 3. RNA-Seq workflow starts with sample preparation and RNA extraction. Total RNA from
samples can be extracted with some commercial kits. mRNA enrichment step is optional and can be
applied with commercial kits. After pure RNA extracted, library preparation can be done with RNA
fragmentation, reverse transcription, adapter ligation, and library amplification steps. Finally
sequencing can be done with next-generation sequencing platforms. Produced RNA-Seq data can be
analyzed with bioinformatic tools.
22
1.4.4.2. Filtering and Trimming
After the quality check, filtering and trimming the data is an important step to get rid of bad-
quality reads and adapter sequences, while improving the mappability. Filtering the data represents
removing the whole read, while trimming represents removing poor-quality ends of reads.
Trimmomatic (Bolger et al., 2014) and Cutadapt (Martin, 2011) can be used for both filtering and
trimming the data. For filtering the data, the mean quality of reads and read length are two key factors.
It is difficult to define best values for minimum quality and minimum length filtering. Filtering values
should be applied according to experimental design and sequencing platform. For example, for gene
expression profile studies, filtering values affect significantly the gene expression estimates (de Sá et
al., 2015; Williams et al., 2016). Ideally the recommended Phred quality score is 25 or higher
(Korpelainen et al., 2014). Therefore quality filtering should be applied for reads with scores less than
25 and for very short reads. Trimming the data is needed for RNA-seq studies for two main reasons:
adapter trimming and low-quality base trimming. For filtering the reads, the mean quality of reads is
considered. However, for trimming the reads, base quality is considered. For many sequencing
platforms, quality of reads decreases towards the 3’ end of reads (Conesa et al., 2016). The low-quality
ends need to be trimmed based on base quality. Adapter sequences that were added to reads during
library generation should also be trimmed. If adapter sequence is not known, some adapter prediction
algorithms (Schmieder et al., 2010) can be used. Bioinformatic tools that are used for filtering and
trimming the data may require knowing the type of data information like paired-end or single-end,
adapter location (3’end or 5’end) and type quality encoding type (ASCII 33 or 64). It is critical to
define these parameters to get proper results from filtering and trimming RNA-Seq data.
1.4.4.3. Mapping
After high-quality data is obtained from filtering and trimming, the next step is to align these
millions of reads to a reference genome or transcriptome to define the locations of the reads. Mapping
for bacterial RNA-Seq data is easier than eukaryotic RNA-Seq data, due to the fact that splicing
activities are very rare and that introns are very low. Due to the millions of reads that are aligned, speed
was a problem for mapping (Trapnell and Salzberg, 2009). Several published tools such as STAR
(Dobin et al., 2013), TopHat2 (Kim et al., 2013) and Bowtie2 (Langmead and Salzberg, 2012) provide
ultra-fast and memory-efficient read aligning. The algorithms that are used for alignment not only
23
provide ultra-fast alignment, but also cope with mismatches and indels very efficiently. The typical
created output file from aligners is Sequence Alignment/Map (SAM) files (Li et al., 2009). SAM files
contain several information about mapped read like mapping quality, reference sequence name, query
quality and read size. SAM files can be unsorted and require high amount of disk space. Hence it is
efficient to convert and sort SAM files to the Binary Alignment/Map (BAM) file format, which is the
binary representation of SAM files. After mapping, it is important to check the quality of mapping and
percentage of mapped reads. Additionally, visualizing the reads by using a reference genome can allow
quality check by human eye. Visualizing can be done with Integrative Genomics Viewer (Robinson et
al., 2011) and Tablet (Milne et al., 2010).
1.4.4.4. Transcript Quantification
After mapping, quantification is done to define the number of reads that are aligned to each
gene, transcript or exon. If a reference genome is already annotated, aligned reads can be easily
counted by using the coordinates of transcripts. This gene expression estimates can be done with
bioinformatic tools such as HTSeq (Anders et al., 2015) and featureCounts (Liao et al., 2014). For the
quantification step, reads mapped to multiple regions or overlapped with multiple genes can create
problems for differential gene expression studies. HTSeq and featureCounts offer the option to exclude
these multi-overlapped reads or count them for each overlapped gene. Generally for differential gene
expression analysis, excluding multi-overlapped reads is considered the better option (Liao et al., 2014;
Anders et al., 2015). If there is no reference genome, quantification can be done by using de novo
transcriptome assembly. Finally, transcript quantification can be done without mapping the reads by
using software such as Sailfish (Patro et al., 2014) and Kallisto (Bray et al., 2016). Transcript
quantification without mapping is much faster than other methods because of skipping mapping step.
Quantification without mapping method uses counts of k-mers across reads. Hence the k-mer length is
a critical parameter for these tools. It is important to choose proper parameters to get accurate results
(Patro et al., 2014; Bray et al., 2016).
1.4.4.5. Differential Gene Expression Analysis
Gene expression profiling identifies genes that are differentially expressed in different
conditions. Basically, differential gene expression analysis is a statistical power analysis for transcript
quantification. Several different R packages have been published for differential gene expression24
analysis such as DESeq2 (Love et al., 2014), edgeR (Robinson et al., 2010), EBSeq (Leng et al., 2013),
and ShrinkBayes (van de Wiel et al., 2014). There are three main steps for differentially expressed gene
analysis: normalization, dispersion estimation and significance testing to provide false discovery rates
(FDRs). Normalization is the first step for differential gene expression analysis, raw RNA-seq data has
some problems that originated from sequencing. Sequencing depth and library sizes are different for
each samples, hence raw counts cannot be used without proper normalization (Soneson et al., 2013).
For that purpose different normalization methods are exist like total count, upper quartile (UQ),
trimmed mean of M-values (TMM), reads per kilobase per million mapped reads (RPKM) and DESeq
normalization method. Dillies et al. (2013) showed that total count, RPKM and UQ methods are not
efficient, hence TMM and DESeq normalization methods are recommended to use for normalization
step. Most R packages use raw count data and perform internal normalization. Some packages like
ShrinkBayes require normalized data.
Fold change based on read counts is commonly used measure for differential gene expression
(Feng et al., 2012). It is a measure that determines quantity changes between samples. Using raw fold
change can create problems for low counts, hence dispersion estimation is needed before significance
testing. For example, to remove this problem, DESeq2 uses log fold change (LFC) shrinkage approach
(Love et al., 2014). As a result of LCF shrinkage approach, DESeq2 R package returns shrunken log2
fold change estimates for each gene. Shrunken LFC is divided by its standard error for the significance
testing (Wald test) in DESeq2 package. The Wald test gives P-values of gene expressions. As multiple
testing creates false positive results, adjustment for it is needed. Hence, estimation of FDR with the
procedure of Benjamini-Hochberg is used in DESeq2 to adjuste multiple testing (Love et al., 2014).
Different methods can be seen by other R packages, but as a result of statistical power analysis, FDR
and fold change data determine the significantly differentially expressed genes. Up-regulated and
down-regulated gene lists can be created by determining a proper threshold for FDR and fold change
data.
1.4.4.6. Downstream Analysis: Gene Set Enrichment and Pathway Analysis
Produced gene sets from differential gene expression analysis (up-regulated and down-regulated
genes) need to be biologically interpreted by enrichment analysis to determine molecular functions and
pathways of differentially expressed genes. For the gene enrichment analysis, transcriptome needs to be
annotated by functional categories. Gene Ontology (GO) (Gene Ontology Consortium et al., 2013) is
25
the most common annotation library for classification of genes. GO contains three different main
domains including biological process (BP), molecular function (MF) and cellular component (CC).
Annotated genes are assigned with descriptive GO definitions (GO terms) that show the process or
activity of a gene. Several tools have been published for GO classification and GO enrichment such as
DAVID (Huang et al., 2007), BiNGO (Maere et al., 2005) and LEGO (Dong et al., 2016). There are
different approaches for enrichment analysis statistical tests. Fisher's exact test and the hypergeometric
test are two main common tests for enrichment analysis (Rivals et al., 2007). Basically, these tools
calculate P-values with statistical tests for GO terms dependent on a list of up-regulated or down-
regulated genes.
Besides GO enrichment, other databases like KEGG (Kanehisa and Goto, 2000) and BioCyc
(Caspi et al., 2016) can be used for pathway analysis, pathway construction and defining enriched
pathways. Pathway analysis is critical for understanding the biological functions of transcriptome data.
It provides determination of function of genes and their roles in pathways. Pathway construction can be
done with automatic pathway servers such as KAAS (Moriya et al., 2007) and RAST (Aziz et al.,
2008). These servers are fully automatic, therefore functional annotation of proteins is also done by
servers with finding best similarity matches and through homology to reference genes. Additional to
fully automatic pathway construction servers, bioinformatic tools like Pathway Tools (Karp et al.,
2002) can be used to construct pathways. Unlike pathway construction servers, Pathway Tools does not
functionally annotate the genome. It requires annotation file which can include protein names, enzyme
commission (EC) number and GO terms. Annotated proteins are mapped to BioCyc pathway database
in Pathway Tools. As the annotation is not automatic, it is possible to construct more precise and clear
pathways in Pathway Tools by checking the annotation manually. Similar to GO enrichment analysis,
pathway enrichment analysis is done with statistical tests such as Fisher's exact test. By using separated
lists of up-regulated and down-regulated genes, it is possible to define which pathway is activated or
deactivated under specific condition.
1.5. Functional Gene Annotation
The prediction of functional role of genes is critical step for genomic studies. Functional
annotation is the first step that provide biological role of the gene. Without accurate annotation, there is
a big possibility to miss important results. The information of functional annotation of genes can be
26
vary depending on the databases such as UniProt (Boutet et al., 2016), RefSeq (O'Leary et al., 2016),
KEGG and GO. As a result of gene annotation, function of a gene can be defined with functional
description (protein name), EC number, Transporter Classification (TC) number (Saier et al., 2016) and
GO terms. EC and TC numbers are classification numbers, EC number is four digit numerical
classification for enzymes, and therefore it is more useful for defining enzyme functions, while TC
number is used for classification of membrane transport proteins. Automated functional gene
annotation is mostly done by sequence similarity and through homology. Sequence similarity method
uses sequence-sequence comparison against databases by using alignment algorithms by using
alignment tools like BLAST (Altschul et al., 1990). With best hit score of alignment, annotation
determined with finding homology to genes which function is experimentally known.
Automated gene annotation based on sequence similarity provide fast annotation, but error rate
can be quite high (Jones et al., 2007). Best matching sequence is not always give accurate results for
annotation (Promponas et al., 2015), and it requires manual curation to discard false positive
annotations. Due to it is very slow to check all annotation manually, bioinformatic tools which use
intelligent methods to decrease risk of incorrect annotation like PANNZER (Koskinen et al., 2015) can
be used. PANNZER reduces errors by using k-nearest neighbour clustering and novel weighted
statistical testing (Z-score) instead of best BLAST hit method. Additionally, the source of error and
misannotation can be also false positive annotations in databases, level of misannotation can reach 60%
for some databases (Schnoes et al., 2009). Therefore using multiple databases and choosing accurate
database is one of the critical steps for successful functional gene annotation.
27
2. Aims of the Study
The aim of this study was provide genomic and transcriptomic analysis of Lactobacillus
rhamnosus strain LC705 by using RNA-Seq technology and bioinformatic methods to increase
understanding of roles of L. rhamnosus strain LC705 during cheese ripening in different temperatures.
In this work, the specific focus was to:
I) Re-annotate gene annotations of strain LC705.
II) Analyze metabolic pathways of strain LC705 for cheese ripening by using updated gene
annotations.
III) Determine the gene expression responses of Strain LC705 during warm room and cold room cheese
ripening process.
28
3. Materials and Methods
3.1. Cheese Sampling and RNA-sequencing
RNA were extracted for RNA transcriptome analysis from three warm and three cold room
cheese samples during ripening process (Figure 4). Cheese type which was used for this research was
semi hard Maasdam-type cheese. For this study, semi hard Maasdam-type cheese cooked with
mesophilic cooking recipe by using mesophilic starter and thermophilic cultures; Lactococcus lactis
ssp. lactis, Lactococcus lactis ssp. cremoris, Lactobacillus rhamnosus strain LC705, Lactobacillus
helveticus and Propionibacterium freudenreichii ssp. shermanii JS. Three cooked cheeses named A, B
and C were cut into four parts and wrapped with ripening foil for the ripening step. First 30 days,
cheeses were ripened in warm room which was at +20°C. After 30 days in warm room ripening,
cheeses were transferred into cold room which was at +5°C, and ripening process was continued for
another 30 days. The first triplet samples were collected at warm room on day 12 of ripening and
samples were named A1, B1 and C1. The second triplet samples were collected at cold room on day 37
and samples were named A3, B3 and C3 (Figure 4).
Cell isolation process from cheese samples started by blending cheese samples with 2% tri-
sodium citrate in a Stomacher filter pouch and mixing with Stomacher blender for 2.5 minutes.
Obtained liquid was centrifuged for 7 minutes then liquid and fat were removed. Suspension of pellet
was done in 2 ml RNA Protect Bacteria Reagent (Qiagen). After second centrifuge process, liquid and
fat were removed again and cell pellet was obtained. For the RNA extraction from cell pellet, method
described by (Koskenniemi et al., 2011) was used.
RIBOZero kit (Epicentre) was used for removing rRNA by following manufacturer instructions.
By using Ovation RNA-Seq System V2 (NuGEN Technologies Inc., San Carlos, CA) RNA fractions
were amplified as cDNA and libraries were constructed. Libraries were sequenced by using the SOLiD
5500XL (Life technologies) sequencer with single-end sequencing method with non-strand-specific
protocol. SOLID 5500XL sequencer converted 306,557,758 reads to FASTQ-files. Created FASTQ-
files were transferred to computer servers for analyzing and interpreting the data.
29
Figure 4. Cooked semi hard Maasdam-type cheeses were ripened 60 days in total. 30 days in warm
room and 30 days in cold room. First triplet samples (A1, B1 and C1) collected at day 12, quarter of the
cheese was taken for RNA-sequencing. Second triplet samples (A3, B3 and C3) collected at day 37,
another quarter of the cheese was taken for RNA-sequencing.
3.2. Gene Annotation Update and Pathway Analysis
Before the RNA-Seq data analysis, as a first step of the project, the gene annotation of strain
LC705 (published on 2009 (Kankainen et al., 2009)) was updated. To update the gene annotation of
strain LC705, the bioinformatic tools PANNZER, Blast2GO (Conesa et al., 2005) and KEGG
automatic annotation server (KAAS) were used. For identification and visualization of the new
pathways with updated annotations, we used Pathway Tools. Gene annotation update was performed in
two steps, first step was updating conserved and putative proteins and the second was adding updated
enzyme code numbers to annotated genes.
30
For the conserved and putative proteins update, PANNZER was used. Proteins with Description
Similarity Measure scores higher than 0.7 were considered as significantly high score in PANNZER.
To obtain updated EC numbers, protein FASTA file, Blast2GO and KAAS tools were used.
With Blast2GO, we run NCBI BLASTp by using RefSeq and Swiss-Prot database by using parameters
of BLAST expectation value (E-value) 1.03E-3, number of BLAST hits 20 and taxonomy filter
bacteria. Thereafter annotation file contains EC numbers of 904 protein products was generated by the
tool with RefSeq database. KAAS tool was also used by uploading FASTA file of strain LC705 to
KAAS server and using BBH (bi-directional best hit) alignment method. KAAS gave us EC numbers
of 615 protein products as an output. Comparing Blast2GO and KAAS EC number annotation results
showed that 500 genes annotated commonly. Addition to 500 genes, Blast2GO annotated 404 unique
genes more while KAAS annotated 115 unique genes. Combination of these results gave us EC
numbers for 1019 genes. Obtained EC numbers and new protein descriptions from PANNZER were
added to annotation file of LC705 with manually written Python script and UNIX command-line
applications.
Finally, to see the difference between old protein products and new protein products, we
compared old annotation file and updated annotation file in Pathway Tools. For that purpose, we
converted annotation files to Genbank format with EMBOSS Seqret (Rice et al., 2000). Each different
pathway is checked manually by comparing two different annotation. New annotation file was also
used for metabolic pathway analysis of strain LC705 with Pathway Tools and important pathways
related to cheese ripening were reported.
3.3. RNA-Seq Data Analysis
RNA-Seq data analysis was performed to analyze differential gene expression analysis between
two different time points. Our RNA-Seq data analysis workflow was summarized in figure 5. Basically
workflow steps contain filtering and quality control, mapping (align the reads to genome), counting
reads per genes, differential expression analysis.
3.3.1. Filtering and Quality Control
Cutadapt tool was used to filter and trim sequences. SOLiD next generation platform uses adapter
sequence which is “CGCCTTGGCCGTACAGCAG” in library-construction step and create reads with
31
adapter sequence. This adapter sequence was removed from reads using Cutadapt. Addition to adapter
sequence trimming, reads that contain less than 30 base pairs and low-quality ends which have less
than 25 phred quality score were filtered by using “cutadapt -m 30 -q 25 -a
CGCCTTGGCCGTACAGCAG -o 3a_cutadapt.fastq 3a_combined.fastq” command for sample 3a, and
all samples were processed identically. Adapter sequence and quality trimming can create high number
of short reads. Short reads should be removed because short reads are hard to map unambiguously.
Next, quality check was done for reads by using FastQC.
3.3.2. Mapping and Counting Reads
Generated reads were mapped onto the genome of strain LC705 using Bowtie2. To check
mapping results by human eye, reads that aligned to the genome of strain LC705 are visualized by
Tablet software. Mapped reads were sorted and converted to BAM file format using SAMtools (Li et
al., 2009) with “samtools view -bS 3a_lc705_map.sam | samtools sort – 3a_lc705_map_sorted”
command for sample 3a, and all samples were processed identically. Sorted BAM files were used for
creating count data using HTSeq using “htseq-count -s no -f bam -t gene --idattr ID
3a_lc705_sorted.bam lc705.gff > 3a_lc705.counts” command. In more detail, with “-s no” command,
we specified non-strand-specific protocol was used for sequencing, “-f bam” used for specify format of
the input data, “-t gene” indicated feature type of annotation file, “--idattr ID” indicated feature ID that
is used in annotation file. The mode of HTSeq was not changed, default mode union which exclude
multi-overlapped reads was used.
3.3.3. Differential gene expression
The differential expressions between two time points were obtained with DESeq2 using R by creating
two conditions warm room and cold room ripening. Thus the design for differential expression analysis
was depending only on one condition. A1, B1 and C1 samples were assigned to condition warm room
ripening, and A3, B3, C3 samples were assigned to condition cold room ripening. By using this design
matrix, comparative analysis result table was created. For FDR values, DESeq2 software returns
adjusted P value (padj) for each genes. In this study genes with padj < 0.10 and log2 Fold Change > 1
were considered as significantly differentially expressed.
32
3.3.4. Gene enrichment analysis
Gene enrichment analysis performed with two different tools; Pathway Tools and Blast2GO.
Pathway Tools provided pathway enrichment analysis by using 0.10 P-value limiting and Fisher Exact
test. Blast2GO software assigned biological functions based on BLAST sequence homologies and
Gene ontology annotations of strain LC705. Gene enrichment analysis up- and down-regulated genes
over all strain LC705 genes were tested by using Blast2GO with 0.05 P-value filter mode, Fisher’s
Exact test.
33
Figure 5. For the beginning, adapter and low quality sequences were removed. Then the quality of
reads was checked. For the next step, origin of the reads was described by mapping to the reference
genome. Then reads were counted per genes using HTSeq with union mode and, finally differential
gene expression analysis was done by using statistical testing.
34
4. Results
4.1. Gene annotation update
We updated published gene annotations of the strain LC705 to obtain new and more accurate
gene annotations for our study. The published annotation file of strain LC705 contained approximately
850 proteins that did not have a specific gene annotation which were described as a putative or
conserved proteins in annotation file. Out of that 850 proteins that did not have a specific gene
annotation, we defined 77 proteins with PANNZER tools. Already annotated genes were updated by
assigning Enzyme Commission (EC) number. We used two different databases (RefSeq and Swiss-Prot)
for blasting. Annotation results from RefSeq database gave more proteins sequences with BLAST Hits
and GO annotation compared to annotation results from Swiss-Prot database (Figure 6). In more
detail, %75 of total protein sequences from LC 705 are annotated with Refseq database, while 52% of
total protein sequences annotated with Swiss-Prot database. Hence to creating EC numbers, we decided
to use RefSeq database annotation results. In total, we added 1197 EC numbers to annotation file of
strain LC705 by using Blast2GO and KAAS annotation results.
The comparison between old and updated annotation file in Pathway Tools showed that gene
annotation update was successful. With the updated annotation file we saw that there were 72 new
pathways were gained while 10 pathways were lost after the pathway analysis. Within these 72 new
pathways, there were several new pathways related to cheese ripening like sugar degradations, amino
acid degradations and fermentation pathways which are key pathways for cheese production and flavor
formation. The table 3 summarizes new pathways and pathways that got lost as a result of gene
annotation update.
4.2. Metabolic pathways and metabolism of strain LC705 related to cheese
Updated annotation file was analyzed to identify pathways and metabolism of strain LC705
related to cheese production. Pathways of enzymatic degradation of amino acids, carboxylates
degradation, carbohydrates degradation, production of biogenic amines, vitamin biosynthesis, lipolysis
and fatty acid biosynthesis and fermentation were pathways which were related to cheese development
35
and cheese flavor formation in LC705. In addition to cheese flavor pathways, we also saw pathways
which produce protective compounds for cheese.
Figure 6. Bar-plot illustration of gene annotation results by using RefSeq and Swiss-Prot databases.
RefSeq database results have more genes with BLAST Hits and GO annotation (The figure created
with Blast2GO).
36
4.2.1. Fermentation Pathways
There are two types of lactic acid fermentation; homolactic and heterolactic fermentation
(summarized in figure 7). According to pathway analysis strain LC705 can use both homolactic and
heterolactic fermentation.
In homolactic fermentation glucose which is obtained from disaccharides (lactose, maltose,
melibose, sucrose and trehalose) degradation is used to produce L-lactate in strain LC705. Glucose is
fermented to pyruvate, and then pyruvate is converted to L-lactate by L-lactate dehydrogenase enzyme.
In heterolactic fermentation fructose and glucose can be used as a source. Glucose is converted
to glucose 6-phosphate by glucokinase, fructose is first converted to fructose 6-phosphate by
fructokinase and then converted to glucose 6-phosphate by glucose-6-phosphate isomerase enzyme.
Strain LC705 does not have a gene that produces fructokinase which would convert fructose to fructose
6-phosphate. However, fructose 6-phosphate can be produced by mannitol, sorbitol, sorbose, sucrose
and mannose degradation in strain LC705. Therefore we can say that in heterolactic fermentation
fructose cannot be used by strain LC705 due to lack of fructokinase. Instead of fructose, some other
hexitols and hexoses are source of fructose 6-phosphate in strain LC705. Glucose and fructose 6-
phosphate are fermented to ethanol and pyruvate in heterolactic fermentation. Pyruvate is converted to
L-lactate and D-lactate by L-lactate dehydrogenase and D-lactate dehydrogenase.
4.2.2. Enzymatic Degradation of Amino Acids
For the enzymatic degradation of amino acids, we saw that LC705 has genes for degradation of
L-asparagine, L-aspartate, L-cysteine, L-glutamate, L-glutamine, L-ornithine, L-homocysteine, L-
lysine and L-serine. Altogether strain LC705 can degrade nine different amino acids. Within these
amino acids, degradation of L-aspartate, L-cysteine, L-serine and L-glutamate are important for cheese
flavor formation.
For the L-aspartate degradation pathway, strain LC705 was found to possess aminotransferases
and aspartate aminotransferase enzymes that convert L-aspartate to oxaloacetate which is an α-keto
acid. Oxaloacetate produced from L-aspartate degradation pathway can then be converted to pyruvate
by oxaloacetate decarboxylase enzyme.
37
Table 3. All gained and lost pathways after gene annotations updated.
38
Similarly, L-serine can also be converted to pyruvate and ammonium by L-serine dehydratase.
Produced pyruvate can be converted by LC705 into acetic acid, acetoin, ethanol and diacetyl which are
aroma compounds and contribute to flavor formation.
L-cysteine degradation occurs in strain LC705 by cystathionine gamma-synthase enzyme.
This enzyme converts L-cysteine to pyruvate, hydrogen sulfide, H2O and ammonium. Hydrogen sulfide
is a volatile sulfur compound (VSC). VSCs are very important flavor compounds in a variety of
cheeses (Fuchsmann et al., 2015).
The last important pathway of enzymatic amino acid degradation pathway related to cheese
flavor is L-glutamate degradation. Strain LC705 produces glutamate dehydrogenase enzyme which
converts L-glutamate to 2-oxoglutarate. 2-oxoglutarate is an α-ketoglutarate which is a good α-keto
donor for transaminases in lactic acid bacteria.
4.2.3. Carboxylates Degradation
For the carboxylates degradation, strain LC705 has potential to degrade nine different
carboxylates which are (S)-malate, 2-oxobutanoate, acetyl-CoA, glycolate, L-ascorbate, malonate,
citrate, D-gluconate, mevalonate. Within these nine different carboxylates, citrate degradation produces
important flavor compounds for cheese.
Our study showed that strain LC705 has citrate lyase genes that are responsible for citrate metabolism.
On the first step in this pathway, citrate lyase complex catalyzes citrate to acetate and oxaloacetate. On
the next step, oxaloacetate is decarboxylated by the oxaloacetate decarboxylase enzyme, pyruvate and
CO2 are generated as a result of this reaction. Acetoin and diacetyl are produced by pyruvate
degradation. Acetolactate synthase enzyme is responsible for converting 2 pyruvate to (S)-2-
acetolactate, and α-acetolactate decarbocylase enzyme is responsible for conversion of (S)-2-
acetolactate to (R)-acetoin in strain LC705. Diacetyl is produced by non-enzymatic oxidative
decarboxylation from (S)-2-acetolactate. However, strain LC705 does not have genes that are
responsible for the production of 2,3-butanediol from acetoin (Figure 8). Therefore strain LC705 can
produce only acetoin and diacetyl from citrate, but not 2,3-butanediol.
39
Figure 7. Heterolactic and homolactic pathways from LC705. Blue colored enzymes can be produced
by LC705. Red colored enzymes cannot be produced by LC705. Products of heterolactic fermentation
are CO2, ethanol, S-lactate and R-lactate. Product of homolactic fermentation is S-lactate.40
4.2.4. Carbohydrates Degradation
Carbohydrate sources for strain LC705 are galactose, L-rhamnose, lactose, sucrose, trehalose,
xylose, 2’-deoxy-a-D-ribose 1-phospate, N-acetylneuraminate, D-mannose, fructose, L-sorbose,
maltose, melibose, ribose. The genes responsible for the degradation of these carbohydrates are present
in the genome of the strain LC705.
Strain LC705 can catabolize lactose by using with two different pathways; lactose and galactose
degradation I and lactose degradation III. Lactose degradation III pathway converts lactose to galactose
and glucose by α- and β-galactosidase and glycosyl hydrolase. Lactose and galactose degradation I
pathway converts lactose 6’-phosphate to D-galactose 6’-phosphate and glucose by enzyme 6-phospho-
β-galactosidase. These glucose molecules which are produced by lactose degradation in strain LC705
are used in heterolactic and homolactic fermentation and converted to lactate.
4.2.5. Production of Biogenic Amines
Our pathway analysis showed that strain LC705 has a potential to produce biogenic amines
which are toxic compounds through two different pathways. The first pathway is conversion of L-
ornithine into putrescine by ornithine decarboxylase enzyme. The second pathway is degradation of L-
lysine to cadaverine and CO2 by lysine decarboxylase enzyme. Expression levels of decarboxylase
enzyme and lysine decarboxylase enzyme were also checked, and responsible genes were
expressedduring cheese ripening. Consequently, the genome of strain LC705 contains genes which are
responsible to produce putrescine and cadaverine.
4.2.6. Vitamin Biosynthesis
Results of pathway analysis indicated that vitamin B biosynthesis is also possible by strain
LC705. Cobalamin (vitamin B12), thiamine (vitamin B1), vitamin B6 biosynthesis pathways were found
in the genome.
The strain LC705 does not have the full genetic information required for biosynthesis of
adenosylcobalamin which is one of the active forms of vitamin B12. Biosynthetic pathways for
adenosylcobalamin require several genes (Rodionov et al., 2003). However, strain LC705 possesses
only Cob(I)alamin adenosyltransferase which produces adenosylcobinamide from cob(I)alamin. So that
we can say that in presence of cob(I)alamin, strain LC705 is able to produce vitamin B12.
41
Figure 8. Summary of the production of buttery flavor compounds (acetoin and diacetyl) in strain
LC705. Buttery flavor compounds are produced through citrate degradation. Responsible enzyme
names and NCBI protein numbers are given in picture. Lactic acid bacteria that have genes to produce
acetoin reductase can produce 2,3-butanediol. Nevertheless, strain LC705 does not have these genes.
Pathway analysis results showed that thiamine (vitamin B1) is produced via salvage pathways in
LC705. Five genes are present in the genome with predicted functions related to thiamine biosynthesis.
Pyridoxal 5′-phosphate (PLP) is the active form of vitamin B6. Pathway analysis showed that
strain LC705 can produce PLP by salvage pathway. Bacteria can synthesize PLP by two main
pathways; de novo pathway and a salvage pathway (El Qaidi et al., 2013). Strain LC705 does not have
genes to produce PLP by the de novo pathway, but it has genes for salvage pathway. In salvage
42
pathway vitamers which are pyridoxine, pyridoxamine and pyridoxal are phosphorylated by kinases in
strain LC705 and, phosphorylated compounds are then oxidized to PLP by pyridoxine 5'-phosphate
oxidase.
4.2.7. Lipolysis and fatty acid biosynthesis
Pathway analysis showed that strain LC705 does not have many pathways for fatty acid and
lipid degradation. According to pathway analysis, LC705 has only glycerol degradation I pathway. In
this pathway glycerol is degraded by glycerol kinase and α-glycerophosphate oxidase to glycerone
phosphate which is a simple carbohydrate. Produced glycerone phosphate is used then in
glyconeogenesis and glycolysis.
Addition to lipolysis, LC705 has pathways that produce several fatty acids. These fatty acids are
cyclopropane fatty acid, palmitate, stearate, vaccenate and gondoate. These fatty acids do not have
direct effect to the cheese flavors.
4.2.8. Protective activities
LC705 has a potential to produce anti-fungal compounds. During fermentation, strain LC705
produces organic acids, like lactic acid and acetic acid, which increase acidity of the environment. Not
only organic acids are produced from strain LC705, but also some specific anti-fungal compounds are
produced. There are 26 anti-fungal compounds published for Lactic acid bacteria (Brosnan et al.,
2012). LC705 has genes that responsible to produce Cytidine (CAR89628.1, CAR89573.1 and
CAR89636.1), succinic acid (CAR91792.1 and CAR89391.1), and hydrogen peroxide (CAR88894.1
and CAR89058.1) that belong to these anti-fungal compounds. In addition to the anti-fungal activities,
strain LC705 has prebacteriocin and bacteriocin transporter protein genes which could be related to the
inhibition of the growth of other bacterial strains.
43
4.3. RNA-Seq data analysis
4.3.1. Read Filtering, Quality Check and Mapping
We had approximately 152 and 154 million RNA-Seq reads obtained from the warm and cold
room samples, respectively. After filtering these reads, we observed that more than 80% reads passed
filters for each sample and replicate (Table 4). Quality control results were good to continue with this
reads. Mapping results showed that between 4.91% and 7.7% of reads from first time point samples
and between 8.7% and 11.1% of reads from the second time point samples mapped to the strain LC705
genome (Table 4).
Table 4. Filtering and mapping results for every sample including total reads and written reads.
4.3.2. Differential gene analysis
Our differential gene analysis showed that there were 327 significantly differentially expressed
genes between warm room ripening and cold room ripening (padj <0.10 | log2fold > 1). Figure 9 shows
the most significantly differentially expressed genes between cold room and warm room ripening
according to adjusted p-values (padj) of differentially expressed genes. Within 327 differentially
expressed genes 153 genes were up-regulated and 174 genes were down-regulated during cold room
cheese ripening. To analyze these differentially expressed genes, gene enrichment analysis was done as
a next step.
44
Figure 9. Heat map of the 60 most significantly differentially expressed genes between 2 time points.
Dark blue represents high expression (high number of reads), light blue represents low expression (low
number of reads).
4.2.3. Gene Ontology Enrichment Analysis
Gene ontology enrichment analysis for up-regulated genes (Figure 10) showed that gene ontology
terms belonging to biological process and molecular function while there are no cellular component
GO terms. In total we saw 102 different enriched GO terms.
45
Enriched GO terms for up-regulated genes during cold room cheese ripening are mostly related
to biosynthesis of purine, ribonucleoside, nucleobase and nucleoside. Transport GO term
(GO:0006810) was also highly enriched in biological process GO terms for up-regulated genes in cold
room. The GO terms of inorganic anion transmembrane transport, inorganic anion transport, phosphate
ion transport, sulfur compound transport, phosphate transmembrane transporter activity, sulfate
transmembrane transport, sulfate transport and inorganic ion transmembrane transport are main
transport related GO terms for cold-induced genes of strain LC705. Additionally, we saw both negative
and positive regulation GO terms in enrichment analysis for up-regulated genes in cold room.
Biological process GO terms contain 16 negative regulation GO terms. At the same time there are five
positive regulation GO terms which are same with negative regulations (RNA metabolic process,
transcription (DNA-templated), RNA biosynthetic process, nucleic acid-templated transcription,
nucleobase-containing compound metabolic process). Genes that cause both negative regulation and
positive regulation were up-regulated in cold room.
GO enrichment analysis for down-regulated genes during cold room cheese ripening (Figure
11) showed terms mostly in biological process and molecular function while there are only 3 terms in
cellular component GO terms. There were 28 GO terms in total.
The biological process GO terms indicated that carbohydrate metabolism genes expression was
significantly down-regulated during cold room ripening. Lactose, disaccharide and oligosaccharide
metabolic and catabolic processes were down-regulated during cold room ripening. Some other
metabolic processes like acetyl-CoA and nucleobase metabolic processes were also enriched for down-
regulated genes in cold room. On the molecular function GO terms, we saw that in general lyase
activity genes were down-regulated during cold room ripening. Particularly, carbon-carbon lyase,
citrate lysase and carboxy-lyase were significantly down-regulated during cold room ripening. Most of
the other molecular function GO terms were related to degradation and utilization of carbohydrates and
carboxylates like galactose-6-phosphate isomerase activity, glutaconyl-CoA decarboxylase activity,
alpha-glucosidase activity, tagatose-bisphosphate aldolase activity and oxaloacetate decarboxylase
activity. The GO term citrate lyase complex which is found in down-regulated cellular component GO
terms is also additional evidence that supports the results that lyase activity genes were down-regulated
during cold room ripening.
46
Figure 10.Barplots of differentially enriched GO terms for up-regulated genes in cold room ripening.
GO terms were sorted by p-value.
47
Figure 11. Barplots of differentially enriched GO terms for down-regulated genes in cold room
ripening. GO terms were sorted by p-value.
48
4.2.4. Pathway enrichment analysis
As a result of the pathway enrichment analysis of the significantly up-regulated and down-
regulated genes in cold room cheese ripening, we found 28 significantly enriched pathways (fisher test,
values lower than 0.10) for up-regulated genes (Table 5) and 20 significantly enriched pathway (fisher
test, values lower than 0.10) for down-regulated genes in cold room ripening (Table 6).
Pathway enrichment analysis for up-regulated genes in cold room verified that in cold room
ripening strain LC705 produces high amount of nucleosides and nucleotides, 13 of 28 significantly
enriched pathways were related to nucleosides and nucleotides biosynthesis. We also observed that in
cold room strain LC705 started to use some alternative sugars as carbon sources such as L-malate, S-
malate, Chitin and N-acetylglucosamine. Up-regulation of reductive monocarboxylic acid cycle
pathway is a good indicator that of increased formate biosynthesis in cold room ripening. In that
pathway pyruvate is converted to acetyl-CoA and formate. We saw that pyruvate fermentation to
ethanol pathway was significantly up-regulated during cold room ripening.
Pathway enrichment analysis for down-regulated genes in cold room showed that citrate
degradation and citrate lyase activation pathways had lowest p-value within other down-regulated
pathways. Following pathways are lactose and galactose degradation pathways. Pathway enrichment
analysis also showed that responsible genes for main pathways like sugar degradation, carbohydrate
degradation, glycolysis and homolactic fermentation were down-regulated in cold room ripening.
49
Table 5. List of pathways which were obtained from enrichment analysis for up-regulated genes in cold
room
50
Table 6. List of pathways which were obtained from enrichment analysis for down-regulated genes in
cold room.
51
5. Discussion
For this study, pathways of the strain LC705 were analyzed by Pathway Tools software. For
pathway analysis, using updated gene annotations is highly important to get accurate results. For our
project, before the gene annotation update step, published gene annotations were containing only name
of the gene products and around 850 non-annotated genes. To obtain better results in pathway analysis,
we updated gene annotations and we added EC numbers to annotations. For the EC number annotation,
the database selection was challenging. We tested different databases for NCBI BLASTp step and we
chose to use RefSeq database due to high number of annotations. KEGG database also provided EC
numbers for non-annotated proteins. As a result, 72 new pathways were discovered by Pathway Tools.
Pathway Tools software showed 201 different pathways for strain LC705 and 72 new pathways
approximately equal to 35% of the whole 201 pathways. These new pathways were important pathways
with critical roles in sugar degradation, amino acid degradation and fermentation. For example, as a
facultative heterofermentative lactobacilli, strain LC705 should have heterolactic and homolactic
fermentation pathways. Without updating the annotations we cannot see heterolactic fermentation
pathway for strain LC705 with Pathway Tools software but the pathway is among the 72 new
pathways. The gene re-annotation part was very important for this project, since due to this step we
gain the ability to characterize pathways for strain LC705 in more thorough way. To have right
pathways is critical for genome wide transcriptomic analysis in this project.
Lactobacilli species are one of the most important key factors in cheese production and cheese
flavor development. These species can be used in cheese ripening process as a starter culture or non-
starter adjunct culture to increase quality of the cheese. During cheese ripening, microbiological and
biochemical changes occur on cheese and used starter and adjunct cultures influence cheese quality and
flavor. For this study, we analyzed role of strain LC705 during cheese ripening. Strain LC705 was used
as an adjunct culture in this study. Adjunct cultures play important role in providing or enhancing the
characteristic flavors and textures of cheese (Settanni and Moschetti, 2010). Flavor formation which
occurs during cheese ripening is a complicated activity. Glycolysis, lipolysis and proteolysis are the
leading catabolic pathways that are involved to flavor formation (Yvon and Rijnen, 2001). It has also
been shown that amino acid catabolism has an important role in flavor formation in cheese by
producing aroma compounds (Yvon and Rijnen, 2001).
52
Our pathway analysis showed that strain LC705 contains at least 201 different pathways and
several number of pathways are related to cheese ripening, flavor and texture of the cheese.
Genome of the strain LC705 is equipped with important enzymes participating amino acid
degradation. By degrading L-aspartate and L-serine, buttery flavor products were produced by strain
LC705. However, buttery flavor is mostly produced through citrate degradation (Martino et al., 2016).
L-aspartate and L-serine degradation is a good alternative to produce diacetyl and acetoin for bacteria
which cannot degrade citrate like some Pediococcus strains (Irmler et al., 2013). We also saw that
strain LC705 can produce hydrogen sulfide which is a VSC. Mostly, VSCs are produced from the
degradation of the sulfur-containing amino acids like methionine and cysteine. VSCs are highly volatile
and reactive hence even with low quantities the flavor can be highly affected (Fuchsmann et al., 2015).
As a VSC, hydrogen sulfide produces unlikeable odor of rotten eggs (Landaud et al. 2008). Now we
have identified genes that are responsible for hydrogen sulfide production which may make strain
LC705 a strong flavor effector for cheese. L-glutamate degradation is important factor for amino acid
degradation activities with α-ketoglutarate end product. It is shown that deprivation of α-ketoglutarate
limits the amino acid degradation activities which enhance cheese aroma in lactic acid bacteria
(Amárita et al., 2006). Lactic acid bacteria which cannot produce α-ketoglutarate by glutamate
degradation, need to uptake α-ketoglutarate from external source by citrate transporter (Pudlik and
Lolkema, 2013). 2-oxoglutarate, an α-ketoglutarate, is produced by L-glutamate degradation in LC705.
Due to the genes that produce α-ketoglutarate compound LC705 does not need α-ketoglutarate from an
external source.
Genes responsible to citrate degradation were also found in the genome of strain LC705. Citrate
degradation is important for cheese flavor and limited number of LAB species are capable of utilizing
citrate (Smid and Kleerebezem, 2014). Citrate fermenting pathways are responsible for producing
acetoin, diacetyl and 2,3 butanediol which provide buttery flavor to cheese (Martino et al., 2016). Our
study showed that LC705 does not contain genes to produce 2,3 butanediol. Even the production of
diacetyl and acetoin is important for cheese, as it provides the buttery flavor. Citrate fermentation
pathways also produce CO2. This CO2 production by citrate metabolism can also be important for the
eye formation in cheese (O'Sullivan et al., 2016). Thus we can say that strain LC705 is responsible for
gas formation and flavor formation by having citrate degradation pathway.
Lactose and galactose are the two main simple carbohydrates in milk. In cheese ripening
process, mostly starter cultures are used for the lactose degradation. As an adjunct culture, LC705
53
possess lactose degradation genes, hence it can join lactose degradation process with starter cultures
and can be also named adjunct starter culture. Galactose degradation is important for cheese because
the existence of galactose was found to be related with lower qualities of the fermented products
(Neves et al., 2010). Additionally the existence if galactose is one of the major reasons why dairy
products turn brown when they are cooked or heated (Neves et al., 2010). Thus galactose utilization is
needed to develop higher-quality dairy products. Strain LC705 has galactose degradation genes.
Therefore, as an adjunct starter, strain LC705 contributes to cheese quality.
On the other hand adjunct cultures can be produce unwanted by-products like biogenic amines
(Weinrichter et al., 2004). Biogenic amines which pose a risk for consumer are formed by microbial
decarboxylation of the corresponding amino acids (Bunkova et al., 2010). Our pathway analysis results
showed that strain LC705 possesses genes to produce putrescine and cadaverine. Putrescine and
cadaverine enhance toxic effect of other amines by inhibiting of detoxifying enzymes (Bunkova et al.,
2010) and can cause some pharmacological effects like hypotension and bradycardia (Shalaby, 1996). It
is difficult to remove biogenic amines after they are formed. Therefore the formation of the biogenic
amines should be controlled (Valsamaki et al., 2000). Earlier studies (Bunkova et al., 2010; Valsamaki
et al., 2000; Pachlová et al., 2012) have showed that biogenic amines are produced in larger quantities
at higher temperature in cheese ripening. Interestingly, our transcription analysis between cold and
warm room ripening did not show any significant difference for genes that responsible for producing
putrescine and cadaverine. However, temperature is not only the factor that influence production of
biogenic amines. Bacterial density, pH and NaCl concentration have a significant effect on production
of biogenic amines (Valsamaki et al., 2000). Therefore we can say that these factors limited the
production of biogenic amines in warm room ripening.
Many industrial lactic acid bacteria can synthesize vitamin B complexes (Capozzi et al., 2012).
Our pathway analysis results showed that strain LC705 is equipped with genes to produce cobalamin
(vitamin B12), thiamine (vitamin B1), and vitamin B6. It is important to produce thiamine because it
plays a key role in energy metabolism. Thiamine is a fundamental cofactor for several types of
enzymes. It is required in alcohol fermentation and catabolism of carbohydrates (Falder et al., 2010). It
is also an advantage for cheese production as animals cannot produce thiamine and thus need to obtain
it from their diet (Sriram et al., 2012). Additionally, vitamin B6 is an essential cofactor and has several
key roles in amino acid and cellular metabolism (Toney, 2005).
54
Lipolysis is one of the major sources of cheese flavor and odor compounds. Though strain
LC705 does not have important genes for lipolysis activities. Therefore LC705 does not contribute to
flavor enhancement by lipolysis.
LAB and Propionibacterium are defined as bio-preservative micro-organisms because they have
a role in preservation of fermented food. It is shown that combination of Lactobacillus rhamnosus
strain LC705 and Propionibacterium freudenreichii ssp. shermanii strain JS is the most effective
bacterial combination against yeast and molds, but the actual mechanism behind this behavior was not
elucidated (Suomalainen and Mäyrä-Mäkinen, 1999). Our study showed that LC705 produces organic
acids and anti-fungal compounds. Anti-fungal activity occurs by synergistic effect of organic acids and
anti-fungal compounds (Le Lay et al., 2016). Moreover, it has been shown that acetic acid is the main
compound for anti-fungal activity especially against Penicillium corylophilum and Eurotium repens
(Le Lay et al., 2016). It is known that strain JS can produce acetate and propionic acid (Ojala et al.,
2016). Result that why the combination of LC705 and Propionibacterium freudenreichii ssp. shermanii
strain JS is the most effective bacterial combination against yeast and molds can be explained by
production of anti-fungal compounds, acetic acid and propionic acid from both strains. Synergistic
effect of these organic acids and anti-fungal compounds make these two bacteria effective anti-fungal
micro-organisms.
Lactobacillus rhamnosus species have a protective effects against Bacillus species
(Suomalainen and Mäyrä-Mäkinen, 1999) and Listeria species (Luukkonen et al., 2005). For strain
LC705, the mechanism of anti-bacterial effect was not known. In our research we found out that strain
LC705 has prebacteriocin and bacteriocin transporter protein genes which may explain why it represses
the growth of clostridia. Additionally we checked the expression levels of these genes with
transcriptomic analysis. We verified that prebacteriocin genes were expressed in LC705. These genes
have not been reported before.
Gene expression profiling showed that there were 327 significantly differentially expressed
genes between warm room and cold room ripening. 153 genes were are up-regulated, while 174 genes
were down-regulated during cold room cheese ripening.
Data from enrichment analysis for up-regulated genes in cold room cheese ripening indicated
that nucleobase and nucleoside biosynthesis genes are cold induced genes. We especially saw many of
the purine biosynthesis genes highly transcribed in cold room. Purine nucleotides have many key roles
in cellular metabolism like the structure of DNA and RNA, serving as enzyme cofactors, functioning in
55
cellular signaling, acting as phosphate group donors, and generating cellular energy (Zhao et al., 2015).
In addition, transporter activity was also increased for ions such as phosphate and sulfate. Increasing of
phosphate ion transporter expression can be explained by the increase of purine production, as during
purine synthesis high amount of phosphate ions are generated. This may indicate that strain LC705
produces high amount of phosphate ions and transport these during cold room cheese ripening. It is
interesting that we saw both negative regulation and positive regulation GO terms for this enrichment
analysis. Explanation of up-regulated negative regulation genes can be that the temperature change may
have activated the defense mechanism of LC705 against a heat-shock stress. In cold room, strain
LC705 started to use alternative sugars as a carbon sources. Finally in the cold room ripening, we saw
up-regulation for alternative pyruvate degradation activities for strain LC705. Pyruvate is degraded to
lactic acid, acetoin, diacetyl, acetyl-CoA and ethanol in strain LC705. Lactic acid and buttery
compounds (acetoin and diacetyl) production is critical for cheese ripening. However, in cold room
ripening, genes that convert pyruvate to acetyl-CoA and ethanol were up-regulated. Ethanol is a volatile
compound which gives dry and dusty odor to cheese. (Pogacic et al., 2016). It is interesting, because
volatile and aroma compound production is mostly up-regulated in warm room cheese ripening. In cold
room, instead of producing lactic acid and buttery flavor compounds, strain LC705 mostly produce
acetyl-CoA and ethanol.
Results of enrichment analysis for down-regulated genes in cold room cheese ripening showed
that central metabolism and lyase genes of strain LC705 are warm induced genes. Important main
carbohydrate degradation genes were down-regulated in cold room. Degradation activities are very
important for cheese production and cheese flavor. We also showed that in cold room alternative carbon
degradation genes of strain LC705 were up-regulated. Using alternative carbon sources instead of main
carbohydrates may indicate that other carbon sources were finished or decreased dramatically by the
day 37 of the cheese ripening or the main carbohydrate genes are warm induced genes. We clearly saw
that citrate degradation genes were down-regulated in cold room ripening. Citrate degradation is the
main pathway for producing buttery aroma in strain LC705. It is clear that citrate degradation genes are
warm induced genes, and strain LC705 produces buttery flavor to cheese during warm room ripening.
56
6. Conclusions
In this study, one of the main aims was to update the gene annotations of Lactobacillus
rhamnosus strain LC705. Annotation update provided new information about genome of the strain
LC705. By analyzing new gene annotations, this study provided understanding on the role of L.
rhamnosus strain LC705 in cheese ripening. Strain LC705 contributes to cheese ripening stage with
flavor formation and protective activities by producing lactic acid, buttery flavor compounds, volatile
sulfur compounds, biogenic amines, vitamins, some anti-fungus compounds and bacteriocin.
This study compared gene expression responses of strain LC705 during Maasdam-type cheese
ripening at warm temperature (25 °C) and cold temperature (5 °C). Central metabolism genes L.
rhamnosus LC705 that provide main carbohydrate degradation are warm induced genes. Buttery flavor
is provided to cheese by strain LC705 in warm room. In cold room strain LC705 preferably converts
pyruvate to acetyl-CoA and ethanol instead lactic acid and buttery flavor compounds.
57
7. Acknowledgements
First of all, I would like to thank my supervisors, Olli-Pekka Smolander and Petri Auvinen for
their encouragement, expert guidance, support and commenting my thesis through the study. I have
learned a lot from them.
I would also like to thank all members of DNA Sequencing and Genomics laboratory of the
Institute of Biotechnology, especially Juhana Kammonen and Margarita Andreevskaya for giving very
useful advices and providing comfortable and friendly working environment.
Special thanks to my family for supporting me all the time and covering all my expenses with
big sacrifices during my master’s degree studies and thesis. Warmly thanks to my lovely friend group
Mataramasuko. Last but not least, I would like to thank to Oona Nevanen for supportive and patient
listening and interest in my thesis.
58
8. References
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990). Basic local alignment search tool. J. Mol. Biol. 215, 403-410.
Amárita, F., de, l.P., Fernández, d.P., Requena, T., and Peláez, C. (2006). Cooperation between wild lactococcal strains for cheese aroma formation. Food Chem. 94, 240-246.
Anders, S., Pyl, P.T., and Huber, W. (2015). HTSeq--a Python framework to work with high-throughputsequencing data. Bioinformatics 31, 166-169.
Andrews S. (2010). FastQC: a quality control tool for high throughput sequence data. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc
Avraham, R., Haseley, N., Fan, A., Bloom-Ackermann, Z., Livny, J., and Hung, D.T. (2016). A highly multiplexed and sensitive RNA-seq protocol for simultaneous analysis of host and pathogen transcriptomes. Nat. Protocols 11, 1477-1491.
Aziz, R.K., Bartels, D., Best, A.A., DeJongh, M., Disz, T., Edwards, R.A., Formsma, K., Gerdes, S., Glass, E.M., Kubal, M., et al. (2008). The RAST Server: rapid annotations using subsystems technology. BMC Genomics 9, 75-2164-9-75.
Bergonzelli, G.E., Granato, D., Pridmore, R.D., Marvin-Guy, L.F., Donnicola, D., and Corthesy-Theulaz, I.E. (2006). GroEL of Lactobacillus johnsonii La1 (NCC 533) is cell surface associated: potential role in interactions with the host and the gastric pathogen Helicobacter pylori. Infect. Immun. 74, 425-434.
Bernardeau, M., Guguen, M., and Vernoux, J.P. (2006). Beneficial lactobacilli in food and feed: long-term use, biodiversity and proposals for specific and realistic safety assessments. FEMS Microbiol. Rev. 30, 487-513.
Bolger, A.M., Lohse, M., and Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114-2120.
Boutet, E., Lieberherr, D., Tognolli, M., Schneider, M., Bansal, P., Bridge, A.J., Poux, S., Bougueleret, L., and Xenarios, I. (2016). UniProtKB/Swiss-Prot, the Manually Annotated Section of the UniProt KnowledgeBase: How to Use the Entry View. Methods Mol. Biol. 1374, 23-54.
Bouton, Y., Buchin, S., Duboz, G., Pochet, S., and Beuvier, E. (2009). Effect of mesophilic lactobacilli and enterococci adjunct cultures on the final characteristics of a microfiltered milk Swiss-type cheese. Food Microbiol. 26, 183-191.
Bray, N.L., Pimentel, H., Melsted, P., and Pachter, L. (2016). Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525-527.
59
Brosnan, B., Coffey, A., Arendt, E.K., and Furey, A. (2012). Rapid identification, by use of the LTQ Orbitrap hybrid FT mass spectrometer, of antifungal compounds produced by lactic acid bacteria. Anal.Bioanal Chem. 403, 2983-2995.
Bunkova, L., Bunka, F., Mantlova, G., Cablova, A., Sedlacek, I., Svec, P., Pachlova, V., and Kracmar, S. (2010). The effect of ripening and storage conditions on the distribution of tyramine, putrescine and cadaverine in Edam-cheese. Food Microbiol. 27, 880-888.
Burns, P., Cuffia, F., Milesi, M., Vinderola, G., Meinardi, C., Sabbag, N., and Hynes, E. (2012). Technological and probiotic role of adjunct cultures of non-starter lactobacilli in soft cheeses. Food Microbiol. 30, 45-50.
Capozzi, V., Russo, P., Duenas, M.T., Lopez, P., and Spano, G. (2012). Lactic acid bacteria producing B-group vitamins: a great potential for functional cereals products. Appl. Microbiol. Biotechnol. 96, 1383-1394.
Caspi, R., Billington, R., Ferrer, L., Foerster, H., Fulcher, C.A., Keseler, I.M., Kothari, A., Krummenacker, M., Latendresse, M., Mueller, L.A., et al. (2016). The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 44, D471-80.
Collado, M.C., Meriluoto, J., and Salminen, S. (2007). In vitro analysis of probiotic strain combinationsto inhibit pathogen adhesion to human intestinal mucus. Food Res. Int. 40, 629-636.
Conesa, A., Gotz, S., Garcia-Gomez, J.M., Terol, J., Talon, M., and Robles, M. (2005). Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21, 3674-3676.
Conesa, A., Madrigal, P., Tarazona, S., Gomez-Cabrero, D., Cervera, A., McPherson, A., Szczesniak, M.W., Gaffney, D.J., Elo, L.L., Zhang, X., and Mortazavi, A. (2016). A survey of best practices for RNA-seq data analysis. Genome Biol. 17, 13-016-0881-8.
Croucher, N.J., and Thomson, N.R. (2010). Studying bacterial transcriptomes using RNA-seq. Curr. Opin. Microbiol. 13, 619-624.
Crow, V., Curry, B., and Hayes, M. (2001). The ecology of non-starter lactic acid bacteria (NSLAB) and their use as adjuncts in New Zealand Cheddar. Int. Dairy J. 11, 275-283.
Dai, M., Thompson, R.C., Maher, C., Contreras-Galindo, R., Kaplan, M.H., Markovitz, D.M., Omenn, G., and Meng, F. (2010). NGSQC: cross-platform quality analysis pipeline for deep sequencing data. BMC Genomics 11 Suppl 4, S7-2164-11-S4-S7.
de Sá, P.H.C.G., Veras, A.A.O., Carneiro, A.R., Pinheiro, K.C., Pinto, A.C., Soares, S.C., Schneider, M.P.C., Azevedo, V., Silva, A., and Ramos, R.T.J. (2015). The impact of quality filter for RNA-Seq. Gene 563, 165-171.
60
Deutscher, M.P. (2006). Degradation of RNA in bacteria: comparison of mRNA and stable RNA. Nucleic Acids Res. 34, 659-666.
Dillies, M.A., Rau, A., Aubert, J., Hennequet-Antier, C., Jeanmougin, M., Servant, N., Keime, C., Marot, G., Castel, D., Estelle, J., et al. (2013). A comprehensive evaluation of normalization methods for Illumina high-throughput RNA sequencing data analysis. Brief Bioinform 14, 671-683.
Dobin, A., Davis, C.A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., and Gingeras, T.R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15-21.
Dong, X., Hao, Y., Wang, X., and Tian, W. (2016). LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights. Sci. Rep. 6, 18871.
Douillard, F.P., Ribbera, A., Kant, R., Pietila, T.E., Jarvinen, H.M., Messing, M., Randazzo, C.L., Paulin, L., Laine, P., Ritari, J., et al. (2013). Comparative genomic and functional analysis of 100 Lactobacillus rhamnosus strains and their comparison with strain GG. PLoS Genet. 9, e1003683.
El Qaidi, S., Yang, J., Zhang, J.R., Metzger, D.W., and Bai, G. (2013). The vitamin B(6) biosynthesis pathway in Streptococcus pneumoniae is controlled by pyridoxal 5'-phosphate and the transcription factor PdxR and has an impact on ear infection. J. Bacteriol. 195, 2187-2196.
El-Nezami, H., Kankaanpaa, P., Salminen, S., and Ahokas, J. (1998). Ability of dairy strains of lactic acid bacteria to bind a common food carcinogen, aflatoxin B1. Food Chem. Toxicol. 36, 321-326.
El-Nezami, H., Polychronaki, N., Salminen, S., and Mykkanen, H. (2002). Binding rather than metabolism may explain the interaction of two food-Grade Lactobacillus strains with zearalenone and its derivative (')alpha-earalenol. Appl. Environ. Microbiol. 68, 3545-3549.
El-Nezami, H.S., Polychronaki, N.N., Ma, J., Zhu, H., Ling, W., Salminen, E.K., Juvonen, R.O., Salminen, S.J., Poussa, T., and Mykkanen, H.M. (2006). Probiotic supplementation reduces a biomarker for increased risk of liver cancer in young men from Southern China. Am. J. Clin. Nutr. 83, 1199-1203.
Espino, E., Koskenniemi, K., Mato-Rodriguez, L., Nyman, T.A., Reunanen, J., Koponen, J., Ohman, T.,Siljamaki, P., Alatossava, T., Varmanen, P., and Savijoki, K. (2015). Uncovering surface-exposed antigens of Lactobacillus rhamnosus by cell shaving proteomics and two-dimensional immunoblotting. J. Proteome Res. 14, 1010-1024.
Esteban-Torres, M., Mancheño, J.M., de, l.R., and Muñoz, R. (2014). Production and characterization of a tributyrin esterase from Lactobacillus plantarum suitable for cheese lipolysis. J. Dairy Sci. 97, 6737-6744.
Ewing, B., and Green, P. (1998). Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186-194.
61
Falder, S., Silla, R., Phillips, M., Rea, S., Gurfinkel, R., Baur, E., Bartley, A., Wood, F.M., and Fear, M.W. (2010). Thiamine supplementation increases serum thiamine and reduces pyruvate and lactate levels in burn patients. Burns 36, 261-269.
Fallico, V., McSweeney, P.L.H., Horne, J., Pediliggieri, C., Hannon, J.A., Carpino, S., and Licitra, G. (2005). Evaluation of Bitterness in Ragusano Cheese*. J. Dairy Sci. 88, 1288-1300.
FAO/WHO, (2012). Guidelines for the Evaluation of Probiotics in Food, Working group report Food and Agricultural Organization of the United Nations and World Health Organization.
Feng, J., Meyer, C.A., Wang, Q., Liu, J.S., Shirley Liu, X., and Zhang, Y. (2012). GFOLD: a generalized fold change for ranking differentially expressed genes from RNA-seq data. Bioinformatics 28, 2782-2788.
Fontana, C., Bassi, D., Lopez, C., Pisacane, V., Otero, M.C., Puglisi, E., Rebecchi, A., Cocconcelli, P.S., and Vignolo, G. (2016). Microbial ecology involved in the ripening of naturally fermented llama meat sausages. A focus on lactobacilli diversity. Int. J. Food Microbiol. 236, 17-25.
Fuchsmann, P., Stern, M.T., Brugger, Y.A., and Breme, K. (2015). Olfactometry Profiles and Quantitation of Volatile Sulfur Compounds of Swiss Tilsit Cheeses. J. Agric. Food Chem. 63, 7511-7521.
Gálvez, A., Abriouel, H., López, R.L., and Omar, N.B. (2007). Bacteriocin-based strategies for food biopreservation. Int. J. Food Microbiol. 120, 51-70.
Gamallat, Y., Meyiah, A., Kuugbee, E.D., Hago, A.M., Chiwala, G., Awadasseid, A., Bamba, D., Zhang, X., Shang, X., Luo, F., and Xin, Y. (2016). Lactobacillus rhamnosus induced epithelial cell apoptosis, ameliorates inflammation and prevents colon cancer development in an animal model. Biomed. Pharmacother. 83, 536-541.
Gao, C., Major, A., Rendon, D., Lugo, M., Jackson, V., Shi, Z., Mori-Akiyama, Y., and Versalovic, J. (2015). Histamine H2 Receptor-Mediated Suppression of Intestinal Inflammation by Probiotic Lactobacillus reuteri. Mbio 6, e01358-15.
Garofalo, C., Zannini, E., Aquilanti, L., Silvestri, G., Fierro, O., Picariello, G., and Clementi, F. (2012).Selection of sourdough lactobacilli with antifungal activity for use as biopreservatives in bakery products. J. Agric. Food Chem. 60, 7719-7728.
Gene Ontology Consortium, Blake, J.A., Dolan, M., Drabkin, H., Hill, D.P., Li, N., Sitnikov, D., Bridges, S., Burgess, S., Buza, T., et al. (2013). Gene Ontology annotations and resources. Nucleic Acids Res. 41, D530-5.
Giraffa, G., Chanishvili, N., and Widyastuti, Y. (2010). Importance of lactobacilli in food and feed biotechnology. Res. Microbiol. 161, 480-487.
62
Gobbetti, M., De Angelis, M., Di Cagno, R., Mancini, L., and Fox, P.F. (2015). Pros and cons for using non-starter lactic acid bacteria (NSLAB) as secondary/adjunct starters for cheese ripening. Trends FoodSci. Technol. 45, 167-178.
Halttunen, T., Collado, M.C., El-Nezami, H., Meriluoto, J., and Salminen, S. (2008). Combining strainsof lactic acid bacteria may reduce their toxin and heavy metal removal efficiency from aqueous solution. Lett. Appl. Microbiol. 46, 160-165.
Hannon, J.A., Wilkinson, M.G., Delahunty, C.M., Wallace, J.M., Morrissey, P.A., and Beresford, T.P. (2003). Use of autolytic starter systems to accelerate the ripening of Cheddar cheese. Int. Dairy J. 13, 313-323.
Heredia-Castro, P.Y., Mendez-Romero, J.I., Hernandez-Mendoza, A., Acedo-Felix, E., Gonzalez-Cordova, A.F., and Vallejo-Cordoba, B. (2015). Antimicrobial activity and partial characterization of bacteriocin-like inhibitory substances produced by Lactobacillus spp. isolated from artisanal Mexican cheese. J. Dairy Sci. 98, 8285-8293.
Holma, R., Kekkonen, R.A., Hatakka, K., Poussa, T., Vaarala, O., Adlercreutz, H., and Korpela, R. (2011). Consumption of Galactooligosaccharides together with Probiotics Stimulates the In Vitro Peripheral Blood Mononuclear Cell Proliferation and IFNγ Production in Healthy Men. ISRN Immunology 2011, 6.
Hou, Z., Jiang, P., Swanson, S.A., Elwell, A.L., Nguyen, B.K.S., Bolin, J.M., Stewart, R., and Thomson, J.A. (2015). A cost-effective RNA sequencing protocol for large-scale gene expression studies. Scientific Reports 5, 9570.
Huang, D.W., Sherman, B.T., Tan, Q., Collins, J.R., Alvord, W.G., Roayaei, J., Stephens, R., Baseler, M.W., Lane, H.C., and Lempicki, R.A. (2007). The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biol. 8, R183.
Irmler, S., Bavan, T., Oberli, A., Roetschi, A., Badertscher, R., Guggenbuhl, B., and Berthoud, H. (2013). Catabolism of serine by Pediococcus acidilactici and Pediococcus pentosaceus. Appl. Environ. Microbiol. 79, 1309-1315.
Johansson, M.A., Bjorkander, S., Mata Forsberg, M., Qazi, K.R., Salvany Celades, M., Bittmann, J., Eberl, M., and Sverremark-Ekstrom, E. (2016). Probiotic Lactobacilli Modulate Staphylococcus aureus-Induced Activation of Conventional and Unconventional T cells and NK Cells. Front. Immunol.7, 273.
Jones, C.E., Brown, A.L., and Baumann, U. (2007). Estimating the annotation error rate of curated GO database sequence annotations. BMC Bioinformatics 8, 170.
63
Kajander, K., Hatakka, K., Poussa, T., Farkkila, M., and Korpela, R. (2005). A probiotic mixture alleviates symptoms in irritable bowel syndrome patients: a controlled 6-month intervention. Aliment. Pharmacol. Ther. 22, 387-394.
Kajander, K., Krogius-Kurikka, L., Rinttila, T., Karjalainen, H., Palva, A., and Korpela, R. (2007). Effects of multispecies probiotic supplementation on intestinal microbiota in irritable bowel syndrome. Aliment. Pharmacol. Ther. 26, 463-473.
Kajander, K., Myllyluoma, E., Rajilic-Stojanovic, M., Kyronpalo, S., Rasmussen, M., Jarvenpaa, S., Zoetendal, E.G., de Vos, W.M., Vapaatalo, H., and Korpela, R. (2008). Clinical trial: multispecies probiotic supplementation alleviates the symptoms of irritable bowel syndrome and stabilizes intestinal microbiota. Aliment. Pharmacol. Ther. 27, 48-57.
Kanehisa, M., and Goto, S. (2000). KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27-30.
Kankainen, M., Paulin, L., Tynkkynen, S., von Ossowski, I., Reunanen, J., Partanen, P., Satokari, R., Vesterlund, S., Hendrickx, A.P., Lebeer, S., et al. (2009). Comparative genomic analysis of Lactobacillus rhamnosus GG reveals pili containing a human- mucus binding protein. Proc. Natl. Acad.Sci. U. S. A. 106, 17193-17198.
Karp, P.D., Paley, S., and Romero, P. (2002). The Pathway Tools software. Bioinformatics 18 Suppl 1, S225-32.
Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., and Salzberg, S.L. (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36-2013-14-4-r36.
Kleerebezem, M., and Hugenholtz, J. (2003). Metabolic pathway engineering in lactic acid bacteria. Curr. Opin. Biotechnol. 14, 232-237.
Korpelainen, L., Tuimala, J., Somervuo, P., Huss, M., and Wong, G. (2014). RNA-seq Data Analysis: APractical Approach (UK: Chapman & Hall/CRC Mathematical & Computational Biology).
Koskenniemi, K., Laakso, K., Koponen, J., Kankainen, M., Greco, D., Auvinen, P., Savijoki, K., Nyman, T.A., Surakka, A., Salusjarvi, T., et al. (2011). Proteomics and transcriptomics characterization of bile stress response in probiotic Lactobacillus rhamnosus GG. Mol. Cell. Proteomics 10, M110.002741.
Koskinen, P., Toronen, P., Nokso-Koivisto, J., and Holm, L. (2015). PANNZER: high-throughput functional annotation of uncharacterized proteins in an error-prone environment. Bioinformatics 31, 1544-1552.
Kukkonen, K., Savilahti, E., Haahtela, T., Juntunen-Backman, K., Korpela, R., Poussa, T., Tuure, T., and Kuitunen, M. (2008). Long-term safety and impact on infection rates of postnatal probiotic and
64
prebiotic (synbiotic) treatment: randomized, double-blind, placebo-controlled trial. Pediatrics 122, 8-12.
Lahens, N.F., Kavakli, I.H., Zhang, R., Hayer, K., Black, M.B., Dueck, H., Pizarro, A., Kim, J., Irizarry,R., Thomas, R.S., Grant, G.R., and Hogenesch, J.B. (2014). IVT-seq reveals extreme bias in RNA sequencing. Genome Biol. 15, R86.
Landaud, S., Helinck, S., and Bonnarme, P. (2008). Formation of volatile sulfur compounds and metabolism of methionine and other sulfur compounds in fermented food. Appl. Microbiol. Biotechnol.77, 1191-1205.
Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357-359.
Le Lay, C., Coton, E., Le Blay, G., Chobert, J.M., Haertle, T., Choiset, Y., Van Long, N.N., Meslet-Cladiere, L., and Mounier, J. (2016). Identification and quantification of antifungal compounds produced by lactic acid bacteria and propionibacteria. Int. J. Food Microbiol. 239, 79-85.
Lehtoranta, L., Soderlund-Venermo, M., Nokso-Koivisto, J., Toivola, H., Blomgren, K., Hatakka, K., Poussa, T., Korpela, R., and Pitkaranta, A. (2012). Human bocavirus in the nasopharynx of otitis-prone children. Int. J. Pediatr. Otorhinolaryngol. 76, 206-211.
Leng, N., Dawson, J.A., Thomson, J.A., Ruotti, V., Rissman, A.I., Smits, B.M., Haag, J.D., Gould, M.N., Stewart, R.M., and Kendziorski, C. (2013). EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics 29, 1035-1043.
Levin, J.Z., Yassour, M., Adiconis, X., Nusbaum, C., Thompson, D.A., Friedman, N., Gnirke, A., and Regev, A. (2010). Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nat Meth 7, 709-715.
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and 1000 Genome Project Data Processing Subgroup. (2009). The Sequence Alignment/Map formatand SAMtools. Bioinformatics 25, 2078-2079.
Liao, Y., Smyth, G.K., and Shi, W. (2014). featureCounts: an efficient general purpose program for assigning sequence reads to genomic features. Bioinformatics 30, 923-930.
Linares, D.M., Del Rio, B., Ladero, V., Martinez, N., Fernandez, M., Martin, M.C., and Alvarez, M.A. (2012). Factors influencing biogenic amines accumulation in dairy products. Front. Microbiol. 3, 180.
Liong, M.T., and Shah, N.P. (2005). Acid and bile tolerance and cholesterol removal ability of lactobacilli strains. J. Dairy Sci. 88, 55-66.
Liu, L., Li, Y., Li, S., Hu, N., He, Y., Pong, R., Lin, D., Lu, L., and Law, M. (2012). Comparison of next-generation sequencing systems. J. Biomed. Biotechnol. 2012, 251364.
65
Lortal, S., and Chapot-Chartier, M. (2005). Role, mechanisms and control of lactic acid bacteria lysis incheese. Int. Dairy J. 15, 857-871.
Love, M.I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550.
Luukkonen, J., Kemppinen, A., Kärki, M., Laitinen, H., Mäki, M., Sivelä, S., Taimisto, A., and Ryhänen, E. (2005). The effect of a protective culture and exclusion of nitrate on the survival of enterohemorrhagic E. coli and Listeria in Edam cheese made from Finnish organic milk. Int. Dairy J. 15, 449-457.
Lyra, A., Krogius-Kurikka, L., Nikkila, J., Malinen, E., Kajander, K., Kurikka, K., Korpela, R., and Palva, A. (2010). Effect of a multispecies probiotic supplement on quantity of irritable bowel syndrome-related intestinal microbial phylotypes. BMC Gastroenterol. 10, 110-230X-10-110.
Maere, S., Heymans, K., and Kuiper, M. (2005). BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics 21, 3448-3449.
Magnussen, A., and Parsi, M.A. (2013). Aflatoxins, hepatocellular carcinoma and public health. World J. Gastroenterol. 19, 1508-1512.
Martin, M. (2011). Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.Journal 17,
Martino, G.P., Quintana, I.M., Espariz, M., Blancato, V.S., and Magni, C. (2016). Aroma compounds generation in citrate metabolism of Enterococcus faecium: Genetic characterization of type I citrate gene cluster. Int. J. Food Microbiol. 218, 27-37.
Mato Rodriguez, L., Ritvanen, T., Joutsjoki, V., Rekonen, J., and Alatossava, T. (2011). The role of copper in the manufacture of Finnish Emmental cheese. J. Dairy Sci. 94, 4831-4842.
McDonald, L.C., McFeeters, R.F., Daeschel, M.A., and Fleming, H.P. (1987). A differential medium forthe enumeration of homofermentative and heterofermentative lactic Acid bacteria. Appl. Environ. Microbiol. 53, 1382-1384.
Merk, K., Borelli, C., and Korting, H.C. (2005). Lactobacilli – bacteria–host interactions with special regard to the urogenital tract. International Journal of Medical Microbiology 295, 9-18.
Miettinen, M., Pietila, T.E., Kekkonen, R.A., Kankainen, M., Latvala, S., Pirhonen, J., Osterlund, P., Korpela, R., and Julkunen, I. (2012). Nonpathogenic Lactobacillus rhamnosus activates the inflammasome and antiviral responses in human macrophages. Gut Microbes 3, 510-522.
Mills, J.D., Kawahara, Y., and Janitz, M. (2013). Strand-Specific RNA-Seq Provides Greater Resolution of Transcriptome Profiling. Curr. Genomics 14, 173-181.
66
Milne, I., Bayer, M., Cardle, L., Shaw, P., Stephen, G., Wright, F., and Marshall, D. (2010). Tablet--nextgeneration sequence assembly visualization. Bioinformatics 26, 401-402.
Moriya, Y., Itoh, M., Okuda, S., Yoshizawa, A.C., and Kanehisa, M. (2007). KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 35, W182-5.
Myllyluoma, E., Kajander, K., Mikkola, H., Kyronpalo, S., Rasmussen, M., Kankuri, E., Sipponen, P., Vapaatalo, H., and Korpela, R. (2007). Probiotic intervention decreases serum gastrin-17 in Helicobacter pylori infection. Dig. Liver Dis. 39, 516-523.
Nada, S., Amra, H., Deabes, M., and Omara, E. (2009). Saccharomyces Cerevisiae And Probiotic Bacteria Potentially Inhibit Aflatoxins Production In Vitro And In Vivo Studies. The Internet Journal of Toxicology 8,
Neves, A.R., Pool, W.A., Solopova, A., Kok, J., Santos, H., and Kuipers, O.P. (2010). Towards enhanced galactose utilization by Lactococcus lactis. Appl. Environ. Microbiol. 76, 7048-7060.
Ojala, T., Laine, P.K.S., Ahlroos, T., Tanskanen, J., Pitkänen, S., Salusjärvi, T., Kankainen, M., Tynkkynen, S., Paulin, L., and Auvinen, P. (2017). Functional genomics provides insights into the role of Propionibacterium freudenreichii ssp. shermanii JS in cheese ripening. Int. J. Food Microbiol. 241, 39-48.
Oksaharju, A., Kankainen, M., Kekkonen, R.A., Lindstedt, K.A., Kovanen, P.T., Korpela, R., and Miettinen, M. (2011). Probiotic Lactobacillus rhamnosus downregulates FCER1 and HRH4 expression in human mast cells. World J. Gastroenterol. 17, 750-759.
O'Leary, N.A., Wright, M.W., Brister, J.R., Ciufo, S., Haddad, D., McVeigh, R., Rajput, B., Robbertse, B., Smith-White, B., Ako-Adjei, D., et al. (2016). Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733-45.
Ortakci, F., Broadbent, J.R., Oberg, C.J., and McMahon, D.J. (2015). Growth and gas production of a novel obligatory heterofermentative Cheddar cheese nonstarter lactobacilli species on ribose and galactose. J. Dairy Sci. 98, 3645-3654.
O'Sullivan, D.J., McSweeney, P.L., Cotter, P.D., Giblin, L., and Sheehan, J.J. (2016). Compromised Lactobacillus helveticus starter activity in the presence of facultative heterofermentative Lactobacillus casei DPC6987 results in atypical eye formation in Swiss-type cheese. J. Dairy Sci. 99, 2625-2640.
Pachlová, V., Buňka, F., Flasarová, R., Válková, P., and Buňková, L. (2012). The effect of elevated temperature on ripening of Dutch type cheese. Food Chem. 132, 1846-1854.
Patro, R., Mount, S.M., and Kingsford, C. (2014). Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. Nat. Biotechnol. 32, 462-464.
Pease, J., and Sooknanan, R. (2012). A rapid, directional RNA-seq library preparation workflow for Illuminareg] sequencing. Nat Meth 9,
67
Peralta, G.H., Bergamini, C.V., and Hynes, E.R. (2016). Aminotransferase and glutamate dehydrogenase activities in lactobacilli and streptococci. Braz J. Microbiol. 47, 741-748.
Pinto, A.C., Melo-Barbosa, H.P., Miyoshi, A., Silva, A., and Azevedo, V. (2011). Application of RNA-seq to reveal the transcript profile in bacteria. Genet. Mol. Res. 10, 1707-1718.
Pogacic, T., Maillard, M.B., Leclerc, A., Herve, C., Chuat, V., Valence, F., and Thierry, A. (2016). Lactobacillus and Leuconostoc volatilomes in cheese conditions. Appl. Microbiol. Biotechnol. 100, 2335-2346.
Promponas, V.J., Iliopoulos, I., and Ouzounis, C.A. (2015). Annotation inconsistencies beyond sequence similarity-based function prediction - phylogeny and genome structure. Stand. Genomic Sci. 10, 108-015-0101-2. eCollection 2015.
Pudlik, A.M., and Lolkema, J.S. (2013). Uptake of alpha-ketoglutarate by citrate transporter CitP drivestransamination in Lactococcus lactis. Appl. Environ. Microbiol. 79, 1095-1101.
Reuter, J.A., Spacek, D.V., and Snyder, M.P. (2015). High-throughput sequencing technologies. Mol. Cell 58, 586-597.
Rhoads, A., and Au, K.F. (2015). PacBio Sequencing and Its Applications. Genomics Proteomics Bioinformatics 13, 278-289.
Rice, P., Longden, I., and Bleasby, A. (2000). EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 16, 276-277.
Rivals, I., Personnaz, L., Taing, L., and Potier, M.C. (2007). Enrichment or depletion of a GO category within a class of genes: which test? Bioinformatics 23, 401-407.
Robinson, J.T., Thorvaldsdottir, H., Winckler, W., Guttman, M., Lander, E.S., Getz, G., and Mesirov, J.P. (2011). Integrative genomics viewer. Nat. Biotechnol. 29, 24-26.
Robinson, M.D., McCarthy, D.J., and Smyth, G.K. (2010). edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140.
Rodionov, D.A., Vitreschak, A.G., Mironov, A.A., and Gelfand, M.S. (2003). Comparative genomics ofthe vitamin B12 metabolism and regulation in prokaryotes. J. Biol. Chem. 278, 41148-41159.
Rossi, M., Amaretti, A., and Raimondi, S. (2011). Folate production by probiotic bacteria. Nutrients 3, 118-134.
Rothberg, J.M., Hinz, W., Rearick, T.M., Schultz, J., Mileski, W., Davey, M., Leamon, J.H., Johnson, K., Milgrew, M.J., Edwards, M., et al. (2011). An integrated semiconductor device enabling non-optical genome sequencing. Nature 475, 348-352.
Ruegg, P.L. (2003). Practical Food Safety Interventions for Dairy Production. J. Dairy Sci. 86, Supplement, E1-E9.
68
Saier, M.H.,Jr, Reddy, V.S., Tsu, B.V., Ahmed, M.S., Li, C., and Moreno-Hagelsieb, G. (2016). The Transporter Classification Database (TCDB): recent advances. Nucleic Acids Res. 44, D372-9.
Salvetti, E., Torriani, S., and Felis, G.E. (2012). The Genus Lactobacillus: A Taxonomic Update. Probiotics and Antimicrobial Proteins 4, 217-226.
Sarkar, N. (1997). Polyadenylation of mRNA in Prokaryotes. Annu. Rev. Biochem. 66, 173-197.
Saxelin, M. (2008). Probiotic formulations and applications, the current probiotics market, and changesin the marketplace: a European perspective. Clin. Infect. Dis. 46 Suppl 2, S76-9; discussion S144-51.
Schirone, M., Tofalo, R., Fasoli, G., Perpetuini, G., Corsetti, A., Manetta, A.C., Ciarrocchi, A., and Suzzi, G. (2013). High content of biogenic amines in Pecorino cheeses. Food Microbiol. 34, 137-144.
Schlech, W.F., and Acheson, D. (2000). Foodborne Listeriosis. Clinical Infectious Diseases 31, 770-775.
Schmieder, R., and Edwards, R. (2011). Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863-864.
Schnoes, A.M., Brown, S.D., Dodevski, I., and Babbitt, P.C. (2009). Annotation error in public databases: misannotation of molecular function in enzyme superfamilies. PLoS Comput. Biol. 5, e1000605.
Selinger, D.W., Cheung, K.J., Mei, R., Johansson, E.M., Richmond, C.S., Blattner, F.R., Lockhart, D.J.,and Church, G.M. (2000). RNA expression analysis using a 30 base pair resolution Escherichia coli genome array. Nat Biotech 18, 1262-1268.
Settanni, L., and Moschetti, G. (2010). Non-starter lactic acid bacteria used to improve cheese quality and provide health benefits. Food Microbiol. 27, 691-697.
Sgarbi, E., Lazzi, C., Tabanelli, G., Gatti, M., Neviani, E., and Gardini, F. (2013). Nonstarter lactic acidbacteria volatilomes produced using cheese components. J. Dairy Sci. 96, 4223-4234.
Shalaby, A.R. (1996). Significance of biogenic amines to food safety and human health. Food Res. Int. 29, 675-690.
Shendure, J., and Ji, H. (2008). Next-generation DNA sequencing. Nat Biotech 26, 1135-1145.
Shishkin, A.A., Giannoukos, G., Kucukural, A., Ciulla, D., Busby, M., Surka, C., Chen, J., Bhattacharyya, R.P., Rudy, R.F., Patel, M.M., et al. (2015). Simultaneous generation of many RNA-seqlibraries in a single reaction. Nat Meth 12, 323-325.
Siezen, R.J., and Wilson, G. (2010). Probiotics genomics. Microb. Biotechnol. 3, 1-9.
Smid, E.J., and Kleerebezem, M. (2014). Production of aroma compounds in lactic fermentations. Annu. Rev. Food Sci. Technol. 5, 313-326.
69
Soneson, C., and Delorenzi, M. (2013). A comparison of methods for differential expression analysis ofRNA-seq data. BMC Bioinformatics 14, 91-2105-14-91.
Sorek, R., and Cossart, P. (2010). Prokaryotic transcriptomics: a new view on regulation, physiology and pathogenicity. Nat. Rev. Genet. 11, 9-16.
Srinivasan, P.R., Ramanarayanan, M., and Rabbani, E. (1975). Presence of polyriboadenylate sequences in pulse-labeled RNA of Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 72, 2910-2914.
Sriram, K., Manzanares, W., and Joseph, K. (2012). Thiamine in nutrition therapy. Nutr. Clin. Pract. 27,41-50.
Suomalainen, T., H., and Mäyrä-Makinen, A., M. (1999). Propionic acid bacteria as protective cultures in fermented milks and breads. Lait 79, 165-174.
Toney, M.D. (2005). Reaction specificity in pyridoxal phosphate enzymes. Arch. Biochem. Biophys. 433, 279-287.
Trapnell, C., and Salzberg, S.L. (2009). How to map billions of short reads onto genomes. Nat. Biotechnol. 27, 455-457.
Turpin, W., Humblot, C., Thomas, M., and Guyot, J. (2010). Lactobacilli as multifaceted probiotics with poorly disclosed molecular mechanisms. Int. J. Food Microbiol. 143, 87-102.
Valsamaki, K., Michaelidou, A., and Polychroniadou, A. (2000). Biogenic amine production in Feta cheese. Food Chem. 71, 259-266.
van de Wiel, M.A., Neerincx, M., Buffart, T.E., Sie, D., and Verheul, H.M. (2014). ShrinkBayes: a versatile R-package for analysis of count-based sequencing data in complex study designs. BMC Bioinformatics 15, 116-2105-15-116.
van Dijk, E.L., Jaszczyszyn, Y., and Thermes, C. (2014). Library preparation methods for next-generation sequencing: Tone down the bias. Exp. Cell Res. 322, 12-20.
Van Vliet, A.H.M. (2010). Next generation sequencing of microbial transcriptomes: challenges and opportunities. FEMS Microbiol. Lett. 302, 1.
Wang, L., Si, Y., Dedow, L.K., Shao, Y., Liu, P., and Brutnell, T.P. (2011). A Low-Cost Library Construction Protocol and Data Analysis Pipeline for Illumina-Based Strand-Specific Multiplex RNA-Seq. Plos One 6, e26426.
Wang, Z., Gerstein, M., and Snyder, M. (2009). RNA-Seq: a revolutionary tool for transcriptomics. Nat. Rev. Genet. 10, 57-63.
Weinrichter, B., Sollberger, H., Ginzinger, W., Jaros, D., and Rohm, H. (2004). Adjunct starter properties affect characteristic features of Swiss-type cheeses. Nahrung 48, 73-79.
70
Wilhelm, B.T., and Landry, J. (2009). RNA-Seq—quantitative measurement of expression through massively parallel RNA-sequencing. Methods 48, 249-257.
Williams, C.R., Baccarella, A., Parrish, J.Z., and Kim, C.C. (2016). Trimming of sequence reads alters RNA-Seq gene expression estimates. BMC Bioinformatics 17, 103.
Yvon, M., and Rijnen, L. (2001). Cheese flavour formation by amino acid catabolism. Int. Dairy J. 11, 185-201.
Zhao, H., Chiaro, C.R., Zhang, L., Smith, P.B., Chan, C.Y., Pedley, A.M., Pugh, R.J., French, J.B., Patterson, A.D., and Benkovic, S.J. (2015). Quantitative analysis of purine nucleotides indicates that purinosomes increase de novo purine biosynthesis. J. Biol. Chem. 290, 6705-6713.
71
Top Related