Post on 02-Aug-2020
The role of large sequence polymorphisms in generating genomic diversity in clinical
isolates of Mycobacterium tuberculosis and their utility in phylogenetic analysis.

David Alland(1)*, David W. Lacher(2), Manzour Hernando Hazbón(1), Alifiya S. Motiwala(1),
Weihong Qi(2), Robert D. Fleischmann(3) and Thomas S. Whittam(2).

1. Division of Infectious Disease, Department of Medicine and the Ruy V. Lourenço Center for
the Study of Emerging and Re-emerging Pathogens, New Jersey Medical School, University of
Medicine and Dentistry of New Jersey, Newark, New Jersey.
2. National Food Safety and Toxicology Center, Michigan State University, East Lansing,
Michigan.
3. The Institute for Genomic Research, Rockville, Maryland.

RUNNING TITLE: LSP distribution in M. tuberculosis.
KEYWORDS: Tuberculosis, SNP, phylogenetic, population, sequence polymorphism, genetic
markers.
* Corresponding author: Division of Infectious Disease, University of
Medicine and Dentistry of New Jersey, 185 South Orange Avenue, MSB A920C, Newark, NJ
07103. E-mail: allandda@umdnj.edu. Phone: (973) 972-2179. Fax: (973) 972-0713.
ACCEPTED
Copyright © 2006, American Society for Microbiology and/or the Listed Authors/Institutions. All Rights Reserved.J. Clin. Microbiol. doi:10.1128/JCM.02483-05 JCM Accepts, published online ahead of print on 1 November 2006
on October 14, 2020 by guest
http://jcm.asm
.org/D
ownloaded from
ABSTRACT
Mycobacterium tuberculosis strains contain different genomic insertions or deletions
called large sequence polymorphisms (LSPs). Distinguishing between LSPs that occur one time
versus ones that occur repeatedly in a genomic region may provide insights into the biological
roles of LSPs and identify useful phylogenetic markers. We analyzed 163 clinical M.
tuberculosis isolates for 17 LSPs identified in a genomic comparison of M. tuberculosis strains
H37Rv and CDC1551. LSPs were mapped onto a single nucleotide polymorphism (SNP)-based
phylogenetic tree created using nine novel SNP markers that were found to reproduce a
212-SNP-based phylogeny. Four “Group A LSPs” mapped to a single SNP-tree segment. Two
“Group B LSPs” and eleven “Group C LSPs” were inferred to have arisen independently in the
same genomic region either two or more than two times, respectively. None of the Group A
LSPs, but one Group B LSP and five Group C LSPs, were flanked by IS6110 sequences in the
reference strains. PE-PPE genes were only present in Group B or C LSPs. SNP-based and
LSP-based phylogenies were also compared. We classified the isolates into 58 “LSP types” using a
separate LSP-based phylogenetic analysis, and mapped the LSP types onto the SNP tree. LSPs
often assigned isolates to the correct phylogenetic lineage; however, significant mistakes
occurred for 6/58 (10%) of the LSP types. In conclusion, most LSPs occur in genomic regions
that are prone to repeated insertion/deletion events, and they were responsible for an unexpectedly
high degree of genomic variation in clinical M. tuberculosis. Group B and C LSPs may
represent polymorphisms that occur due to selective pressure and affect the phenotype of the
organism, while Group A LSPs are preferable phylogenetic markers.
INTRODUCTION
As pathogenic bacteria adapt to their host environments, virulence properties may change
through the insertion or deletion (indel) of chromosomal regions and the gain or loss of genetic
material (3, 4, 9, 11, 19, 25, 27, 29, 34). Mycobacterium tuberculosis is a major pathogen of
humans, and genomic deletions [also known as large sequence polymorphisms (LSPs) or regions
of difference] can also be detected in most clinical isolates of this species (16, 23, 32). Studies
examining the biological role of LSPs in M. tuberculosis have been inconclusive (26). LSPs
were demonstrated to be unique event polymorphisms (UEPs) in a study of 100 clinical M.
tuberculosis isolates (23). It was also possible to perform an informative analysis of a large
sample of clinical M. tuberculosis isolates using these LSPs as phylogenetic markers (17). UEP
refers to a mutation that has occurred once in the phylogeny of a species (i.e., is “unique”), is
irreversible, and does not display homoplasy (23). The observation that most LSPs were UEPs
suggested that LSPs were unlikely to have an important role in disease pathogenesis, because
mutations that confer an evolutionary advantage to M. tuberculosis should be selected for
repeatedly in the evolution of the species (2). Supporting this hypothesis, LSPs were found to
have a possible attenuating effect on clinical disease in one retrospective study (26). A study of
gene expression in ten clinical M. tuberculosis isolates also demonstrated that LSPs
predominantly encoded genes that were variably expressed or not expressed in broth cultures
(18). These results suggested that LSPs do not generally involve functionally important proteins.
Instead, a number of investigators have assumed that LSPs are selectively neutral and have used them as
phylogenetic markers for population and evolutionary investigations (17, 23, 24).
Other studies have suggested that LSPs do have a critical role in M. tuberculosis
pathogenesis. Clinical M. tuberculosis LSPs had low consistency indices when incorporated into
a phylogeny composed of SNPs, LSPs and clinical parameters (16). Furthermore, investigations
of clinical strains have detected three apparent genomic “hot spots” for insertion of IS6110 and
associated chromosomal deletions (13, 14, 24, 30). Genomic analysis indicates that LSPs almost
always include segments of open reading frames (16, 23), although this may be due to a paucity
of noncoding regions in the M. tuberculosis genome (7). Finally, Yang et al. (33) recently
showed that clinical M. tuberculosis isolates with deletions in the plcD gene (one of the known
deletion hot spots) are indeed phenotypically different, exhibiting a two-fold increased risk of
causing extrapulmonary tuberculosis. Taken together, these results indicate that some LSPs have
evolved repeatedly in the radiation of M. tuberculosis, and suggest that LSP-associated indels
provide a selective advantage to certain M. tuberculosis strains.
Unfortunately, the rates of indels underlying M. tuberculosis LSPs cannot be
conveniently measured in the laboratory. This makes it difficult to differentiate experimentally
between mutations that are UEPs and mutations that have a tendency to occur repeatedly.
Phylogenetic analysis makes an alternative approach available. Clinical strains containing a
particular indel can be mapped onto a phylogenetic tree and then examined to determine whether
or not they can be traced to a single ancestral event. Indels that have arisen independently
multiple times in the population may have significant biological roles. Mutations that appear to
have a single origin are more likely to represent UEPs that are evolutionarily neutral (2). A
variation of this approach was undertaken in prior LSP studies (16, 23). However, these
previous studies also used the LSPs themselves as markers in the phylogenetic tree construction.
Furthermore, the largest of these studies defined distinct LSPs quite strictly, choosing to analyze
LSPs separately if they had different deletion sites, even if the deletions mapped to identical or
overlapping genes. In investigating the biology of LSPs, we propose that it is more important to
categorize LSPs according to the gene or genes that are deleted (or inserted) rather than by the
exact location of the indel site. This is because the effect of an LSP on microbial phenotype is
more likely to be due to the genes that are disrupted or otherwise affected by the LSP than to
the exact indel sites where the LSP occurred. Therefore, we have favored a less restricted
definition of LSP that is based on the presence or absence of a gene region rather than on the
presence or absence of a specific deletion.
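The idea of mapping a presence/absence character onto a fixed tree and asking whether one ancestral event can explain it can be illustrated with a small sketch. The code below uses Fitch parsimony to count the minimum number of state changes; this is a swapped-in textbook technique chosen for illustration, not a method stated in the manuscript, and the five-isolate tree and isolate names are hypothetical.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes for one binary character (Fitch parsimony).

    tree   -- nested 2-tuples of leaf names, e.g. (("a", "b"), "c")
    states -- dict mapping each leaf name to 0 (LSP absent) or 1 (LSP present)
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):          # leaf: its observed state
            return {states[node]}
        left, right = (visit(child) for child in node)
        shared = left & right
        if shared:                         # children agree: no change needed
            return shared
        changes += 1                       # children disagree: one change
        return left | right

    visit(tree)
    return changes


# Hypothetical toy tree of five isolates (not the tree from the study).
toy_tree = ((("a", "b"), ("c", "d")), "e")

# Character confined to one clade: explained by a single event.
single_origin = {"a": 1, "b": 1, "c": 0, "d": 0, "e": 0}
# Character scattered over two clades: needs at least two events.
recurrent = {"a": 1, "b": 0, "c": 1, "d": 0, "e": 0}

print(fitch_changes(toy_tree, single_origin))
print(fitch_changes(toy_tree, recurrent))
```

A character that needs only one change on the tree is compatible with a unique event, whereas a count of two or more indicates that the same gene region was gained or lost independently in separate lineages.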
In this report, we present a phylogenetic analysis of gene deletions found in M.
tuberculosis LSPs, using an “unequivocal” phylogenetic tree constructed with synonymous SNP
markers. We present phylogenetic evidence that many of the gene regions contained within
LSPs have been deleted (or possibly inserted) multiple times as separate events in the history of
M. tuberculosis divergence, and identify several possible mechanisms for these genomic
changes. Our results suggest that LSPs represent an important mechanism of genetic variation in
M. tuberculosis and indicate that further investigations into the functional relevance of LSPs may
provide insights into M. tuberculosis pathogenesis and immunity. LSPs, as defined in our study,
that recur independently with high frequency may be precluded as phylogenetic markers.
MATERIALS AND METHODS
Study population. The study population has been described previously (16); it consisted
of consecutive patients with positive cultures for M. tuberculosis identified at Montefiore
Medical Center in the Bronx, N.Y., between 1989 and 1996. All isolates had been typed with
IS6110-based restriction fragment length polymorphism (RFLP) analysis, and a secondary typing
procedure if necessary (1). Of the 319 available cultures from that period, 169 of the samples
plus the M. tuberculosis reference strains H37Rv and CDC1551 were selected at random for
SNP and LSP analysis. Six clinical samples gave indeterminate SNP or LSP results, leaving
163 clinical isolates plus H37Rv and CDC1551 to be included in the present study. The
demographic and clinical characteristics of this subset were similar to those of the overall study
population and were generally reflective of the diverse nature of New York City residents (1).
This subset included M. tuberculosis isolates from a broad range of ethnicities and patients from
at least 19 different known countries of origin.
LSP identification. Eighty-six LSPs larger than ten base pairs were identified by
comparing the genomes of M. tuberculosis strains CDC1551 and H37Rv in a previous
investigation (16). Seventeen LSPs were further studied: LSPs 1 through 12 were selected from
sequences that were present in CDC1551 but absent from H37Rv; LSPs 13 through 17 were selected
from sequences that were present in H37Rv but absent from CDC1551. DNA probes were then
prepared for one gene in each LSP by PCR (16). We limited our study to 17 LSPs because of the
technical complexity of studying each LSP in large numbers of M. tuberculosis samples. The
coordinates for each probe and primer are described in this previous work. Approximately two
µg of genomic DNA from the clinical M. tuberculosis isolates or CDC1551 and H37Rv were
suspended in 2X SSC at a final volume of 200 µl. Each sample was boiled for 5 min and then
cooled on ice. A multi-slot hybridization apparatus (immunoblotter; Immunetics, MA) was
assembled as per the manufacturer's recommendations with the modification that the cushion was
replaced with five pieces of dry 3 mm Whatman paper underneath one piece of 1 mm Whatman
paper soaked in 2X SSC. A pre-wetted Biotrans Plus nylon membrane (ICN Pharmaceuticals,
CA) was placed on top of the thin Whatman paper. The apparatus was assembled, and the
cooled genomic DNA was bound in longitudinal strips onto the membrane by rapidly loading the
DNA mixture into the apparatus. Bubbles were avoided inside the apparatus by loading a slight
excess volume of DNA solution. The apparatus was then disassembled, and the membrane was
removed, rinsed in 2X SSC and then cross-linked with ultraviolet light. For identification of the
LSPs present in each DNA sample, the membrane was prehybridized for one hour in Rapid Hyb
buffer (Amersham, CT) at 69°C in a hybridization oven. The still wet membrane was then
reinserted into the multi-slot hybridization apparatus at 90° from its previous orientation, using
the manufacturer's cushion instead of Whatman paper (described above) to seal the apparatus.
Each slot was then loaded with approximately 200 µl of boiled and then rapidly ice-cooled
hybridization buffer containing γ-32P-labeled probes for the 17 LSPs. The openings of the
apparatus were sealed with parafilm and the apparatus was incubated at 69°C with occasional
gentle rocking for 2 hours. The parafilm was carefully removed, unhybridized probe was sucked
out of each hybridization well using a vacuum attached to the wash device supplied by the
manufacturer, and each slot was washed (again using the vacuum-wash device) with 2X SSC.
The apparatus was then disassembled; the membrane was washed one more time in 2X SSC, three
times in 0.1X SSC at 69°C, and then exposed on film. Using this protocol, 44 different genomic
DNA samples could be slotted in an array consisting of 44 lines extending across the membrane.
Hybridizing the probes for each LSP at 90° to this array permitted every probe to come into
contact with every genomic DNA sample. The presence of a particular LSP in a DNA sample
was determined by examining the developed autoradiogram for dark spots. An example of an LSP
blot has been shown previously (16).
SNP identification. We had previously identified six SNP markers that were sufficient
to classify a global M. tuberculosis collection into seven phylogenetically distinct “SNP cluster
groups” (SCGs) (15). For the current study, we selected a different set of nine SNP markers that
enabled us to further subdivide the SCGs into subgroups (SC-subgroups) for a total of seven
SCGs and five SC-subgroups (Table 1). All of the study samples were then tested at the nine
SNP loci using hairpin primer assays as described previously (22) (Table 2) and the alleles
determined.
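The assignment of an isolate to an SCG or SC-subgroup from its nine SNP alleles amounts to a table lookup. The following is a minimal sketch, assuming the allele patterns transcribed from Table 1; the function name `assign_scg` and the example isolate are hypothetical and are not part of the original assay software.

```python
# H37Rv genome positions of the nine SNP loci, in the column order of Table 1.
SNP_POSITIONS = (1977, 54394, 74092, 105139, 144390, 232574, 311613, 913274, 2154724)

# Allele pattern (one base per locus, same order as SNP_POSITIONS) -> SCG label.
SCG_PATTERNS = {
    "GGCCGGTGA": "1",
    "GGCAGGTCA": "2",
    "GGCCGGTCA": "3a",
    "GGCCGGTCC": "3b",
    "GGCCGTTCC": "3c",
    "GGCCATTCC": "4",
    "GACCGGTCC": "5",
    "AACCGGTCC": "6a",
    "AACCGGGCC": "6b",
    "GGTCGGTGA": "7",
}


def assign_scg(alleles):
    """Return the SCG/SC-subgroup for a dict {position: base}, or None if no pattern matches."""
    pattern = "".join(alleles[pos].upper() for pos in SNP_POSITIONS)
    return SCG_PATTERNS.get(pattern)


# Hypothetical isolate carrying the SC-subgroup 3c allele pattern.
example_isolate = dict(zip(SNP_POSITIONS, "GGCCGTTCC"))
print(assign_scg(example_isolate))
```

Returning `None` for an unlisted pattern keeps indeterminate isolates visible instead of forcing them into the nearest group.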
Phylogenetic analysis. Each isolate was assigned to an SCG or SC-subgroup according to
the allele pattern at the nine SNP loci (Table 1), and plotted on a neighbor-joining phylogenetic
tree previously created by analyzing a global M. tuberculosis collection using 212 SNP markers
(15) (Fig. 1). The presence or absence of each LSP was scored as a binary character, and the isolates were
also classified into 58 LSP types (LSP-Ts) as defined by the distinct patterns of present and
absent LSPs in each isolate.
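Defining LSP types from binary presence/absence scores can be sketched as grouping isolates by identical patterns. In the example below the isolate names and the four-LSP profiles are hypothetical (the study scored 17 LSPs per isolate), and the numbering of types by group size is an arbitrary choice that does not reproduce the numbering used in the figures.

```python
from collections import defaultdict


def lsp_types(profiles):
    """Group isolates that share the same presence/absence pattern.

    profiles -- dict {isolate: sequence of 0/1 scores, one per LSP}
    Returns a dict {type label: (pattern, sorted list of isolates)}.
    """
    groups = defaultdict(list)
    for isolate, pattern in profiles.items():
        groups[tuple(pattern)].append(isolate)
    # Deterministic, arbitrary numbering: largest groups first, then by pattern.
    ordered = sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))
    return {
        f"LSP-T{number}": (pattern, sorted(members))
        for number, (pattern, members) in enumerate(ordered, start=1)
    }


# Hypothetical scores for three isolates and four LSPs.
toy_profiles = {
    "iso1": (1, 0, 1, 1),
    "iso2": (1, 0, 1, 1),
    "iso3": (0, 0, 1, 1),
}
toy_types = lsp_types(toy_profiles)
for label, (pattern, members) in toy_types.items():
    print(label, pattern, members)
```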
RESULTS
LSPs occur repeatedly in the M. tuberculosis genome. In order to perform a
phylogenetic analysis of the distributions of M. tuberculosis LSPs, it was first necessary to
unambiguously establish the phylogeny of the 163 M. tuberculosis study isolates plus H37Rv
and CDC1551. Each isolate was tested for the presence of nine SNP markers, and an SCG or
SC-subgroup was assigned to each isolate based on the pattern of its SNP alleles. The LSP and SNP
alleles for each isolate are shown in Supplementary Table 1. The typed isolates were then
plotted onto a phylogenetic tree of M. tuberculosis established previously (15). The study set was
found to include members of all SCGs except for SCG 7 (which primarily contains
Mycobacterium bovis) (Fig. 1). Each M. tuberculosis SCG/SC-subgroup contained an average
of 18 isolates (range 0 to 43) and an average of 13 different strains (range 0 to 38) as defined by
the presence of distinct RFLP patterns.
We selected 17 M. tuberculosis LSPs from a larger set of previously identified LSPs (16)
to study their distribution on the strain phylogeny (Table 3). The distribution of these LSPs has
not been previously examined in a set of phylogenetically characterized clinical strains. Three
LSPs (LSPs 10, 11 and 13) were located near two IS1547 elements, known to be “hotspots” for
IS6110 insertions (17). Each M. tuberculosis isolate was examined for the presence or absence
of each of the 17 LSPs by probing for an internal DNA sequence. All of the LSPs were then
mapped onto the phylogenetic tree. We found that the majority of LSPs did not appear to be
UEPs. Unlike the distribution of the selectively neutral SNPs shown in a previous report (2),
only four of the 17 LSPs studied (LSPs 1, 9, 13 and 16) (Fig. 2A) were situated on the
phylogenetic tree such that their presence could be explained by a single event in a common
ancestor. We have called these LSPs “Group A LSPs” in subsequent discussions. Two other
LSPs (LSPs 12 and 14) appeared to have occurred independently at least two times (Fig. 2B).
We have called these LSPs “Group B LSPs.” The remaining 11 LSPs (LSPs 2, 3, 4, 5, 6, 7, 8, 10,
11, 15 and 17) were situated on the phylogenetic tree such that they could not have arisen from a
single common ancestor, and must have arisen independently multiple times (Fig. 3). These
LSPs were named “Group C LSPs”.
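The grouping described above can be restated as a rule on the number of independent events inferred for each LSP: one event for Group A, two for Group B, and more than two for Group C. The sketch below applies that rule to the event counts listed in Table 3 for nine of the LSPs; the function name and the exact cut-offs as coded here are an interpretation of the text, not code from the study.

```python
def lsp_group(independent_events):
    """Classify an LSP by the number of independent events inferred on the SNP tree."""
    if independent_events < 1:
        raise ValueError("an observed LSP requires at least one event")
    if independent_events == 1:
        return "A"   # explained by a single ancestral event
    if independent_events == 2:
        return "B"   # arose independently twice
    return "C"       # arose independently more than twice


# Independent-event counts for the LSPs whose rows appear in Table 3 above.
events_by_lsp = {1: 1, 9: 1, 13: 1, 16: 1, 12: 2, 14: 2, 2: 7, 3: 5, 4: 6}
groups = {lsp: lsp_group(count) for lsp, count in events_by_lsp.items()}
print(groups)
```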
The genes that corresponded to the probes for each LSP were then examined (Table 3).
We examined all of the genes that were deleted in each LSP as it was originally defined by the
CDC1551-H37Rv genomic comparisons, although some clinical strains may have smaller or
larger LSPs in each region. None of the Group A LSPs and only one of the Group B LSPs were
flanked by IS6110 elements in either CDC1551 or H37Rv, while five of the 11 Group C LSPs
were flanked by IS6110 elements. These results suggest that recombination between IS6110
elements is one of the mechanisms that generate LSPs that recur frequently. Indeed, we also
noted that the IS6110 elements adjacent to the locations of four of the Group C LSPs (LSPs 3, 4,
10 and 11) lack the characteristic 3 to 4 bp direct repeats, an absence that is indicative of recombination between
IS6110 elements (16). This adds further support to the hypothesis that IS6110 is an important
driving force for large sequence diversity in M. tuberculosis (5, 13, 24, 30, 31). PE, PE_PGRS
or PPE genes were not present in any of the Group A LSPs, while one of the Group B LSPs and
three Group C LSPs contained PPE genes. Recombination and deletion between these genes, which
have substantial sequence similarity, might represent a second mechanism for LSP generation.
However, the number of LSPs examined was too small to reasonably test for statistical
differences in PPE gene frequency among the LSP groups. Despite these two proposed
mechanisms for recurrent LSP generation, one of two Group B LSPs and three of 11 Group C
LSPs were not associated with either flanking IS6110 sequences or repetitive genes.
The LSPs associated with IS6110 in the reference strains did not occur at a higher
frequency than other LSPs. The phylogenetic analysis of each LSP (Figs. 2 and 3) suggested that
there were 65 independent LSP events in the 165 M. tuberculosis isolates (Table 3) (this
population contained many more LSPs, but a group of phylogenetically related isolates with the
same LSP was considered to constitute one LSP event). Approximately one-third (6/17) of the
LSPs studied were associated with IS6110, and these LSPs were associated with 27/65 (42%) of
the independent LSP events in the population. This did not differ significantly from the approximately
two-thirds (11/17) of the LSPs studied that were not associated with IS6110 in the reference
strains. These LSPs accounted for at least 38/65 (58%) of the independent LSP events.
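The proportions in this comparison can be reproduced with a few lines of arithmetic. The manuscript does not state which statistical test was used, so the exact two-sided binomial test below is only an assumed illustration: it asks whether 27 of 65 events is compatible with the 6/17 share expected if IS6110-associated LSPs recurred no more often than the others.

```python
from math import comb


def binomial_two_sided_p(k, n, p0):
    """Exact two-sided binomial P value: sum of outcomes no more likely than k."""
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(p for p in pmf if p <= cutoff))


is_lsps, total_lsps = 6, 17        # LSPs flanked by IS6110 in a reference strain
is_events, total_events = 27, 65   # independent events attributed to those LSPs

share_of_lsps = is_lsps / total_lsps
share_of_events = is_events / total_events
p_value = binomial_two_sided_p(is_events, total_events, share_of_lsps)

print(f"IS6110-associated LSPs: {share_of_lsps:.0%} of LSPs, {share_of_events:.0%} of events")
print(f"Other LSPs: {1 - share_of_lsps:.0%} of LSPs, {1 - share_of_events:.0%} of events")
print(f"Exact binomial P value (assumed test): {p_value:.2f}")
```

Under this assumed test the share of events does not depart significantly from the share of LSPs, which is consistent with the statement in the text.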
Twelve of the 17 LSPs in this study represent sequences that are absent in H37Rv but
present in CDC1551 [although LSP 6 appears to be present in some H37Rv isolates and must,
therefore, have been deleted recently in a subset of H37Rv isolates in experimental use (16)].
Each of these LSPs was also found to be missing in at least one clinical isolate, demonstrating
that the H37Rv LSPs did not include unique deletion events that might have occurred as a
consequence of prolonged in vitro culture.
Confirmation of LSP identification and variability within IS6110-defined clusters.
It was important to ensure that the results of this study were not due to artifacts of the LSP
identification process. Inconsistencies in detecting LSPs could make it falsely appear as if LSPs
were occurring repeatedly as independent events. Repeated probing of the same strain gave
identical LSP results, suggesting that the LSP identification process was sound. We also
examined strains that were identical by IS6110 RFLP analysis to determine if these closely
related strains contained the same LSPs. We found only six instances, in 17 clusters involving
66 isolates, where two isolates within a cluster did not have exactly the same LSP pattern. In
each of these cases, only one of the seventeen LSPs was discordant between the isolates.
Furthermore, all six of the mismatched LSPs were Group C LSPs (four were LSP 6, one was
LSP 2 and one was LSP 10). These results suggest that the small variation in LSP patterns that
we observed within isolates of a cluster is due to the propensity of M. tuberculosis to develop
independent deletions in these regions. The exact time frame of LSP generation cannot be
deduced from this study because the epidemiological connections among the clustered isolates
were not well characterized in our data set. Prior reports suggest that differences in LSP patterns
are not observed among RFLP-identical isolates with known epidemiological links (16).
However, these results do strongly suggest that different LSPs are generated at different rates.
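The within-cluster comparison described above can be sketched as a column-wise check of the presence/absence patterns of RFLP-identical isolates. The cluster names, isolate names and three-LSP patterns below are hypothetical and serve only to show the bookkeeping.

```python
def discordant_lsps(cluster):
    """Return the 1-based LSP indexes at which isolates of one cluster disagree.

    cluster -- dict {isolate: sequence of 0/1 scores, one per LSP}
    """
    patterns = list(cluster.values())
    return [
        index
        for index, column in enumerate(zip(*patterns), start=1)
        if len(set(column)) > 1
    ]


# Hypothetical RFLP clusters scored for three LSPs each.
toy_clusters = {
    "cluster_x": {"isoA": (1, 1, 0), "isoB": (1, 1, 0)},
    "cluster_y": {"isoC": (1, 0, 1), "isoD": (1, 1, 1)},
}
discordance = {name: discordant_lsps(members) for name, members in toy_clusters.items()}
print(discordance)
```

An empty list means every isolate in the cluster has the same LSP pattern; a single index corresponds to the situation reported here, in which only one LSP differed between clustered isolates.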
Phylogenetic analysis of M. tuberculosis populations using LSP markers. LSPs
appear to be useful phylogenetic markers for studies of M. tuberculosis (20, 23, 28), especially
when the specific identity of each LSP can be confirmed by sequencing the ends of each deletion
(23). End-sequencing makes it possible to identify which deletions within a similar genome
region are, in fact, independent deletions. However, large-scale sequencing of deletion sites is
not practical, and even PCR-based identification of specific deletion sites may be difficult if
LSPs of similar sizes occur near the same genomic locus. We studied the ability of the LSPs
identified in this study to accurately describe phylogenetic relationships among M. tuberculosis
isolates. Each of the 163 clinical isolates (H37Rv and CDC1551 were not included in this
analysis) was classified into one of 58 LSP types (LSP-Ts), based on the pattern of LSPs that
were present (Supplementary Table 1). Each LSP-T was then located on the SNP tree, and the
proximity of all of the isolates with the same LSP-T was examined. LSP-Ts that placed M.
tuberculosis isolates together in a manner that was consistent with the SNP tree would be
considered good phylogenetic assignments. LSP-Ts that conflicted with the SNP tree would
represent inaccurate assignments. Our results showed that LSP-Ts situated most of the M.
tuberculosis isolates on the same or adjoining branch of the SNP tree (Fig. 4). However, six of
the LSP-Ts incorrectly grouped isolates together that were more distantly related according to
the SNP tree (Fig. 4, LSP-Ts 3, 5, 7, 9, 14 and 15). Many of the LSP-Ts only contained a single
M. tuberculosis isolate. We performed a secondary analysis restricted to commonly occurring
LSP-Ts by eliminating LSP-Ts that contained fewer than two isolates. This analysis reduced the
study to 31 LSP-Ts and 137 isolates. We found that 6/31 (19%) of the LSP-Ts that contained
two or more isolates continued to produce important conflicts with the SNP tree. These results
confirm our findings with the total study sample.
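The comparison of LSP types against the SNP tree can be sketched as a check that all isolates sharing an LSP-T fall in the same or an adjoining SNP group. Everything in the example data below is hypothetical, including the isolate names, the type labels and the assumption that SC-subgroups 3b and 3c adjoin each other; only the final two percentages use counts reported in the text.

```python
def conflicting_types(type_members, scg_of, adjacent):
    """Return LSP types whose isolates span SNP groups that are neither equal nor adjoining.

    type_members -- dict {LSP type: list of isolates}
    scg_of       -- dict {isolate: SCG/SC-subgroup label}
    adjacent     -- set of frozensets naming pairs of adjoining groups
    """
    conflicts = []
    for lsp_type, isolates in type_members.items():
        groups = {scg_of[isolate] for isolate in isolates}
        compatible = all(
            a == b or frozenset((a, b)) in adjacent for a in groups for b in groups
        )
        if not compatible:
            conflicts.append(lsp_type)
    return sorted(conflicts)


# Hypothetical example data.
toy_adjacent = {frozenset(("3b", "3c"))}
toy_members = {"T1": ["i1", "i2"], "T2": ["i3", "i4"], "T3": ["i5"]}
toy_scg = {"i1": "3b", "i2": "3c", "i3": "2", "i4": "6a", "i5": "5"}
print(conflicting_types(toy_members, toy_scg, toy_adjacent))

# Conflict rates reported in the text.
print(f"All LSP-Ts: {6 / 58:.0%}; LSP-Ts with two or more isolates: {6 / 31:.0%}")
```

Because single-isolate types can never conflict with the tree, restricting the denominator to types with two or more isolates raises the conflict rate even though the number of conflicting types stays at six.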
DISCUSSION
This study suggests that LSPs are a substantial source of diversity within the M.
tuberculosis genome. While some LSPs appeared to represent rare events in the population, the
majority of LSPs appeared to have been generated multiple times in the divergence of M.
tuberculosis strains. The low frequency of Group A and B LSP events suggests that these LSPs
arose from random genomic events and have become associated with a particular phylogenetic
lineage. These LSPs may have occurred in the absence of a special mechanism for generating
genomic change at high frequency. We suspect that these LSPs are unlikely to result in a
selective advantage for the organism; however, this is very difficult to test without additional
data.
Group C LSPs are much more variable and appear to have been generated by at least two
mechanisms. Forty-five percent (5 of 11) of the Group C LSPs were flanked by IS6110 transposable
elements on at least one side in a reference strain. The presence of IS6110 in proximity to LSP
regions that are not present (and likely to be deleted) in other isolates suggests that
recombination between nearby IS6110 elements produced a deletion, creating the LSP. IS6110
transposition events may be advantageous, neutral or detrimental to the bacterial cell depending
on the genes involved. Yang et al. (33) have shown that plcD deletions (LSP 4, a Group C LSP
flanked by IS6110 in our study) do indeed affect bacterial phenotype, in this case showing a
strong association with extrapulmonary tuberculosis. This work supports the hypothesis that the
variation associated with Group C LSPs affects bacterial phenotype (although it is unclear if an
extrapulmonary phenotype should be considered selectively advantageous); it also provides
further evidence that IS6110 is a contributing force driving genetic diversity in the M.
tuberculosis complex. Indeed, as IS6110 may also be present in the clinical isolates at sites
where H37Rv and CDC1551 do not contain IS6110, this element may be playing an even more
pivotal role. We speculate that the Group C LSPs have occurred under positive selective
pressure, and that these deletions (LSPs) enhance transmission and other virulence features of M.
tuberculosis. Maurelli and colleagues have demonstrated similar events in Shigella strains,
where parallel loss of the cadA locus in different lineages of Shigella was found to be
pathoadaptive (8). An alternative hypothesis is that Group C LSPs represent highly unstable
genomic regions that are repeatedly deleted because the genes encompassed by these LSPs are
nonfunctional. Under these circumstances, the repeated loss of these genes could reflect a
selective advantage for loss of nonfunctional DNA. However, observations in other bacteria
suggest that deletion of nonfunctional DNA is a progressive occurrence that begins with
mutation of nonfunctional genes into pseudogenes, and is only later followed by a series of
deletion events (21). In M. tuberculosis, there is no evidence that any of the deleted genes have
mutated to pseudogenes. One of the Group B and three of the Group C LSPs were in PPE genes
that others have speculated may be involved in immune variation and evasion (6, 10, 12). Their
recurrent deletion in different TB lineages is consistent with the hypothesis that these are escape
mutants created by silencing these gene products during the course of infection of mammalian
hosts.
Our findings do not directly contradict the work of Hirsh et al. (23), which suggested that
virtually all LSPs were unique evolutionary events. First, this prior investigation excluded LSPs
originating or terminating in PPE genes, whereas these LSPs were included in our study.
Second, our investigation included regions that were deleted in H37Rv relative to the genome of
CDC1551. Hirsh et al. only examined regions that are missing in clinical isolates relative to the
genome of H37Rv. Finally, we used a hybridization-based approach to identify the presence or
absence of genomic regions known to be encompassed by LSPs. In contrast, Hirsh et al.
sequenced across each end of the LSP, confirming the exact deletion sites and distinguishing
among similar deletion events. It is likely that a reanalysis of this previous work would
demonstrate that many LSPs overlap, differing only at the specific deletion sites, and would confirm our
observation that many genomic regions were likely to be deleted independently.
Other investigators have suggested that LSPs can provide an accurate genetic marker
system for molecular epidemiological and evolutionary studies of M. tuberculosis (20, 28). Our
results suggest that LSPs may be informative markers in situations where discrimination of
strains is the main objective. However, phylogenetic inference will be complicated by the
multiple origins and parallel evolution of many LSPs, which will generate incompatibilities with
other phylogenetic markers such as SNP loci. The extent to which this problem can be alleviated
by directly sequencing LSP deletion sites requires further study.
In summary, this work demonstrates that LSPs are predominantly genomic deletions that
result in an unexpected degree of genomic plasticity in clinical M. tuberculosis isolates. At least
one-third of the plasticity in specific genomic regions appears to involve recombination between
IS6110 elements in the region. The repeated evolution of some LSPs suggests that these
polymorphisms are a critical source of genetic variation that is adaptive, and may underlie
variation in virulence among TB strains; however, this is difficult to test and warrants future
investigational studies of pathogenicity and immunity.
ACKNOWLEDGMENTS
This work was supported by Public Health Service grants AI-46669 and AI-49352 from the
National Institutes of Health.
Table 1. SNP set used to assign the SCGs and SC-subgroups.
     SNP position in H37Rv (a)
SCG  1977  54394  74092  105139  144390  232574  311613  913274  2154724
1    G     G      C      C       G       G       T       G       A
2    G     G      C      A       G       G       T       C       A
3a   G     G      C      C       G       G       T       C       A
3b   G     G      C      C       G       G       T       C       C
3c   G     G      C      C       G       T       T       C       C
4    G     G      C      C       A       T       T       C       C
5    G     A      C      C       G       G       T       C       C
6a   A     A      C      C       G       G       T       C       C
6b   A     A      C      C       G       G       G       C       C
7    G     G      T      C       G       G       T       G       A
a. GenBank accession number NC_000962.
Table 2. Genome locations and hairpin assay primers used for the nine SNP set.
Position   Hairpin-shaped primers^a                           Constant primer
in H37Rv
1977       RHP-1977: GTTCGTC gggactgccaacgacgaac              F1977: tacggttgttgttcgactgct
           RHP-G1977A: ATTCGTC gggactgccaacgacgaat
54394      FHP-54338: CGCCCA gatctggcccgggcg                  R54338: gttgggtcctttggtctgattct
           FHP-G54338A: TGCCCA gatctggcccgggca
74092      RHP-74073: CAGTACCGAT gcggtgaactcggtactg           F74073: cgacggtccgaattgcc
           RHP-C74073T: TAGTACCGAT gcggtgaactcggtacta
105139     FHP-105129: GGGCG gcactgTcaaagagcgccc              R105129: tcccttgtgtcacttcagtttcac
           FHP-C105129A: TGGCG gcactgTcaaagagcgcca
144390     RHP-144381: AGATGGG tgtcgtgcgAcccatct              F144381: cccgggtggtgctgatt
           RHP-A144381G: GGATGGG tgtcgtgcgAcccatcc
232574     RHP-232686: TCGGC ccgctgtaggcgccga                 F232686: gattcaaacagatccgtgataccc
           RHP-T232686G: GCGGC ccgctgtaggcgccgc
311613     RHP-311729: TACGGC ccgtgCacaccgccgta               F311729: cgcccagagccgttcgt
           RHP-T311729G: GACGGC ccgtgCacaccgccgtc
913274     FHP-0913183 (2): GGAGATTGG ctcgGtggacccaatctcc     R0913183: atcaggtcttcgatggccatg
           FHP-C0913183G (2): CGAGATTGG ctcgGtggacccaatctcg
2154724    FHP-KatG463: CGGATCT agcctttagagccagatccg          RKatG463: gagacagtcaatcccgatgc
           FHP-KatGR463L: AGGATCT gagcctttagagccagatcct
a. Sequences of the 5’-end tails added to the hairpin primers and the residues corresponding to
the secondary mutations are shown in capital letters.
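The relationship between a 5’-end tail and the primer it is attached to can be checked computationally. The sketch below assumes, following the hairpin-primer design described in reference 22, that the tail is intended to base-pair with the 3’ end of the same primer; that assumption, and the helper names reverse_complement and tail_mismatches, are not taken from Table 2 itself. The function only counts disagreements between the reverse complement of the tail and the primer’s 3’ end, so any primer for which the two differ can be inspected individually.

```python
# Translation table for complementing DNA bases; case is preserved.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tail_mismatches(tail, primer):
    """Count positions where the reverse complement of the 5' tail
    differs from the 3' end of the primer (case-insensitive)."""
    folded = reverse_complement(tail).upper()
    three_prime_end = primer[-len(tail):].upper()
    return sum(a != b for a, b in zip(folded, three_prime_end))


# Usage with the position 1977 primers from Table 2.
print(tail_mismatches("GTTCGTC", "gggactgccaacgacgaac"))  # RHP-1977
print(tail_mismatches("ATTCGTC", "gggactgccaacgacgaat"))  # RHP-G1977A
```

Pairing the RHP-1977 tail with the RHP-G1977A primer body in this check isolates the single 3’-terminal base that distinguishes the two alleles.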
Table 3. LSP Groups and their attributes.
Group | LSP^a | Locus^b | Coordinates^c | Gene | Other genes on deleted region in CDC1551 or H37Rv reference strains^d | Strain(s) containing adjacent IS6110 | Number of independent LSP deletions
A | 1 | MT0676 | 744149-744392 | Alpha-mannosidase | - | - | 1
A | 9 | MT2081, MT2082 | 2268479-2268702 | Hyp^e/Helicase | MT2080 (Hyp) and MT2080.1 (Hyp) | - | 1
A | 13 | Rv0793, Rv0794c | 886934-887397 | Hyp/dihydrolipoamide dehydrogenase | - | - | 1
A | 16 | Rv3519 | 3955704-3956104 | Hyp^e | - | - | 1
B | 12 | MT2423 | 2633331-2633746 | PPE | - | H37Rv | 2
B | 14 | Rv2124c | 2381785-2383193 | Methionine synthase | - | - | 2
C | 2 | MT1360 | 1481322-1481551 | Adenylate cyclase | - | - | 7
C | 3 | MT1802 | 198225-1982482 | Transporter | - | H37Rv and CDC1551 | 5
C | 4 | MT1799 | 1978754-1978931 | Phospholipase | MT1800 (glycosyl transferase), MT1801 (molybdopterin oxidoreductase), MT1802 (MmpL family) | H37Rv and CDC1551 | 6
C | 5 | MT1812 | 1994163-1994437 | Hyp^e | - | H37Rv and CDC1551 | 4
C | 6 | MT2420 | 2630855-2631147 | Hyp^e | MT2421 (Hyp) | - | 7
C | 7 | MT2619 | 2862884-2863033 | Membrane lipoprotein | - | - | 5
C | 8 | MT3248 | 3526018-3526304 | PPE | - | - | 4
C | 10 | MT3426 | 3705322-3705665 | moaB | MT3427 (moaA), MT3428 (transcript. regulator), MT3429 (Hyp), MT3430 (IS1547, transposase) | H37Rv | 6
C | 11 | MT3427 | 3707462-3707706 | moaA | MT3426 (pterin dehydratase), MT3428 - MT3430 (see above) | H37Rv | 4
C | 15 | Rv3135 | 3501335-3501499 | PPE | - | - | 3
C | 17 | Rv3343c | 3733083-3733353 | PPE | - | - | 6
a. Large sequence polymorphism.
b. The MT prefix indicates LSPs present in CDC1551 and absent in H37Rv, the Rv prefix indicates LSPs present in H37Rv but absent in CDC1551.
c. Coordinates of the LSP probes. Note: probes for LSP 1 to 12 are CDC1551 coordinates (GenBank accession number NC_002755), and probes for LSPs 13 to 17 are H37Rv coordinates (GenBank accession number NC_000962).
d. Coordinates for all LSPs in the reference strains have been described previously (16).
e. Hypothetical.
REFERENCES.

1. Alland, D., G. E. Kalkut, A. R. Moss, R. A. McAdam, J. A. Hahn, W. Bosworth, E.
Drucker, and B. R. Bloom. 1994. Transmission of tuberculosis in New York City. An
analysis by DNA fingerprinting and conventional epidemiologic methods. N Engl J Med
330:1710-6.
2. Alland, D., T. S. Whittam, M. B. Murray, M. D. Cave, M. H. Hazbon, K. Dix, M.
Kokoris, A. Duesterhoeft, J. A. Eisen, C. M. Fraser, and R. D. Fleischmann. 2003.
Modeling bacterial evolution with comparative-genome-based marker systems:
application to Mycobacterium tuberculosis evolution and pathogenesis. J Bacteriol
185:3392-9.
3. Baek, S. H., G. Rajashekara, G. A. Splitter, and J. P. Shapleigh. 2004. Denitrification
genes regulate Brucella virulence in mice. J Bacteriol 186:6025-31.
4. Blaser, M. J., and J. C. Atherton. 2004. Helicobacter pylori persistence: biology and
disease. J Clin Invest 113:321-33.
5. Brosch, R., W. J. Philipp, E. Stavropoulos, M. J. Colston, S. T. Cole, and S. V.
Gordon. 1999. Genomic analysis reveals variation between Mycobacterium tuberculosis
H37Rv and the attenuated M. tuberculosis H37Ra strain. Infect Immun 67:5768-74.
6. Choudhary, R. K., R. Pullakhandam, N. Z. Ehtesham, and S. E. Hasnain. 2004.
Expression and characterization of Rv2430c, a novel immunodominant antigen of
Mycobacterium tuberculosis. Protein Expr Purif 36:249-53.
7. Cole, S. T., R. Brosch, J. Parkhill, T. Garnier, C. Churcher, D. Harris, S. V.
Gordon, K. Eiglmeier, S. Gas, C. E. Barry, 3rd, F. Tekaia, K. Badcock, D. Basham,
D. Brown, T. Chillingworth, R. Connor, R. Davies, K. Devlin, T. Feltwell, S. Gentles,
N. Hamlin, S. Holroyd, T. Hornsby, K. Jagels, A. Krogh, J. McLean, S. Moule, L.
Murphy, K. Oliver, J. Osborne, M. A. Quail, M. A. Rajandream, J. Rogers, S.
Rutter, K. Seeger, J. Skelton, R. Squares, S. Squares, J. E. Sulston, K. Taylor, S.
Whitehead, and B. G. Barrell. 1998. Deciphering the biology of Mycobacterium
tuberculosis from the complete genome sequence. Nature 393:537-44.
8. Day, W. A., Jr., R. E. Fernandez, and A. T. Maurelli. 2001. Pathoadaptive mutations
that enhance virulence: genetic organization of the cadA regions of Shigella spp. Infect
Immun 69:7471-80.
9. de Visser, J. A., A. D. Akkermans, R. F. Hoekstra, and W. M. de Vos. 2004.
Insertion-sequence-mediated mutations isolated during adaptation to growth and
starvation in Lactococcus lactis. Genetics 168:1145-57.
10. Delogu, G., and M. J. Brennan. 2001. Comparative immune response to PE and
PE_PGRS antigens of Mycobacterium tuberculosis. Infect Immun 69:5606-11.
11. Ernst, R. K., D. A. D'Argenio, J. K. Ichikawa, M. G. Bangera, S. Selgrade, J. L.
Burns, P. Hiatt, K. McCoy, M. Brittnacher, A. Kas, D. H. Spencer, M. V. Olson, B.
W. Ramsey, S. Lory, and S. I. Miller. 2003. Genome mosaicism is conserved but not
unique in Pseudomonas aeruginosa isolates from the airways of young children with
cystic fibrosis. Environ Microbiol 5:1341-9.
12. Espitia, C., J. P. Laclette, M. Mondragon-Palomino, A. Amador, J. Campuzano, A.
Martens, M. Singh, R. Cicero, Y. Zhang, and C. Moreno. 1999. The PE-PGRS
glycine-rich proteins of Mycobacterium tuberculosis: a new family of fibronectin-binding
proteins? Microbiology 145 (Pt 12):3487-95.
13. Fang, Z., C. Doig, D. T. Kenna, N. Smittipat, P. Palittapongarnpim, B. Watt, and K.
J. Forbes. 1999. IS6110-mediated deletions of wild-type chromosomes of
Mycobacterium tuberculosis. J Bacteriol 181:1014-20.
14. Fang, Z., and K. J. Forbes. 1997. A Mycobacterium tuberculosis IS6110 preferential
locus (ipl) for insertion into the genome. J Clin Microbiol 35:479-81.
15. Filliol, I., A. S. Motiwala, M. Cavatore, W. Qi, M. H. Hazbon, M. Bobadilla del
Valle, J. Fyfe, L. Garcia-Garcia, N. Rastogi, C. Sola, T. Zozio, M. I. Guerrero, C. I.
Leon, J. Crabtree, S. Angiuoli, K. D. Eisenach, R. Durmaz, M. L. Joloba, A.
Rendon, J. Sifuentes-Osornio, A. Ponce de Leon, M. D. Cave, R. Fleischmann, T. S.
Whittam, and D. Alland. 2006. Global phylogeny of Mycobacterium tuberculosis based
on single nucleotide polymorphism (SNP) analysis: insights into tuberculosis evolution,
phylogenetic accuracy of other DNA fingerprinting systems, and recommendations for a
minimal standard SNP set. J Bacteriol 188:759-72.
16. Fleischmann, R. D., D. Alland, J. A. Eisen, L. Carpenter, O. White, J. Peterson, R.
DeBoy, R. Dodson, M. Gwinn, D. Haft, E. Hickey, J. F. Kolonay, W. C. Nelson, L. A.
Umayam, M. Ermolaeva, S. L. Salzberg, A. Delcher, T. Utterback, J. Weidman, H.
Khouri, J. Gill, A. Mikula, W. Bishai, W. R. Jacobs, Jr., J. C. Venter, and C. M.
Fraser. 2002. Whole-genome comparison of Mycobacterium tuberculosis clinical and
laboratory strains. J Bacteriol 184:5479-90.
17. Gagneux, S., K. DeRiemer, T. Van, M. Kato-Maeda, B. C. de Jong, S. Narayanan,
M. Nicol, S. Niemann, K. Kremer, M. C. Gutierrez, M. Hilty, P. C. Hopewell, and P.
M. Small. 2006. Variable host-pathogen compatibility in Mycobacterium tuberculosis.
Proc Natl Acad Sci U S A 103:2869-73.
18. Gao, Q., K. E. Kripke, A. J. Saldanha, W. Yan, S. Holmes, and P. M. Small. 2005.
Gene expression diversity among Mycobacterium tuberculosis clinical isolates.
Microbiology 151:5-14.
19. Goerke, C., S. Matias y Papenberg, S. Dasbach, K. Dietz, R. Ziebach, B. C. Kahl,
and C. Wolz. 2004. Increased frequency of genomic alterations in Staphylococcus aureus
during chronic infection is in part due to phage mobilization. J Infect Dis 189:724-34.
20. Goguet de la Salmoniere, Y. O., C. C. Kim, A. G. Tsolaki, A. S. Pym, M. S. Siegrist,
and P. M. Small. 2004. High-throughput method for detecting genomic-deletion
polymorphisms. J Clin Microbiol 42:2913-8.
21. Gomez-Valero, L., A. Latorre, and F. J. Silva. 2004. The evolutionary fate of
nonfunctional DNA in the bacterial endosymbiont Buchnera aphidicola. Mol Biol Evol
21:2172-81.
22. Hazbon, M. H., and D. Alland. 2004. Hairpin primers for simplified single-nucleotide
polymorphism analysis of Mycobacterium tuberculosis and other organisms. J Clin
Microbiol 42:1236-42.
23. Hirsh, A. E., A. G. Tsolaki, K. DeRiemer, M. W. Feldman, and P. M. Small. 2004.
Stable association between strains of Mycobacterium tuberculosis and their human host
populations. Proc Natl Acad Sci U S A 101:4871-6.
24. Ho, T. B., B. D. Robertson, G. M. Taylor, R. J. Shaw, and D. B. Young. 2000.
Comparison of Mycobacterium tuberculosis genomes reveals frequent deletions in a 20
kb variable region in clinical isolates. Yeast 17:272-82.
25. Israel, D. A., N. Salama, C. N. Arnold, S. F. Moss, T. Ando, H. P. Wirth, K. T.
Tham, M. Camorlinga, M. J. Blaser, S. Falkow, and R. M. Peek, Jr. 2001.
Helicobacter pylori strain-specific differences in genetic content, identified by
microarray, influence host inflammatory responses. J Clin Invest 107:611-20.
26. Kato-Maeda, M., J. T. Rhee, T. R. Gingeras, H. Salamon, J. Drenkow, N. Smittipat,
and P. M. Small. 2001. Comparing genomes within the species Mycobacterium
tuberculosis. Genome Res 11:547-54.
27. Kuipers, E. J., D. A. Israel, J. G. Kusters, M. M. Gerrits, J. Weel, A. van Der Ende,
R. W. van Der Hulst, H. P. Wirth, J. Hook-Nikanne, S. A. Thompson, and M. J.
Blaser. 2000. Quasispecies development of Helicobacter pylori observed in paired
isolates obtained years apart from the same host. J Infect Dis 181:273-82.
28. Mostowy, S., D. Cousins, J. Brinkman, A. Aranaz, and M. A. Behr. 2002. Genomic
deletions suggest a phylogeny for the Mycobacterium tuberculosis complex. J Infect Dis
186:74-80.
29. Pearson, B. M., C. Pin, J. Wright, K. I'Anson, T. Humphrey, and J. M. Wells. 2003.
Comparative genome analysis of Campylobacter jejuni using whole genome DNA
microarrays. FEBS Lett 554:224-30.
30. Sampson, S. L., M. Richardson, P. D. Van Helden, and R. M. Warren. 2004. IS6110-mediated
deletion polymorphism in isogenic strains of Mycobacterium tuberculosis. J
Clin Microbiol 42:895-8.
31. Sreevatsan, S., X. Pan, K. E. Stockbauer, N. D. Connell, B. N. Kreiswirth, T. S.
Whittam, and J. M. Musser. 1997. Restricted structural gene polymorphism in the
Mycobacterium tuberculosis complex indicates evolutionarily recent global
dissemination. Proc Natl Acad Sci U S A 94:9869-74.
32. Tsolaki, A. G., A. E. Hirsh, K. DeRiemer, J. A. Enciso, M. Z. Wong, M. Hannan, Y.
O. Goguet de la Salmoniere, K. Aman, M. Kato-Maeda, and P. M. Small. 2004.
Functional and evolutionary genomics of Mycobacterium tuberculosis: insights from
genomic deletions in 100 strains. Proc Natl Acad Sci U S A 101:4865-70.
33. Yang, Z., D. Yang, Y. Kong, L. Zhang, C. F. Marrs, B. Foxman, J. H. Bates, F.
Wilson, and M. D. Cave. 2005. Clinical relevance of Mycobacterium tuberculosis plcD
gene mutations. Am J Respir Crit Care Med 171:1436-42.
34. Zhong, S., A. Khodursky, D. E. Dykhuizen, and A. M. Dean. 2004. Evolutionary
genomics of ecological specialization. Proc Natl Acad Sci U S A 101:11719-24.
FIGURE LEGENDS.
Figure 1. Phylogeny of the M. tuberculosis study isolates. M. tuberculosis isolates were
assigned to each SCG or SC-subgroup based on SNP alleles at nine loci. The SCG and
SC-subgroup designations were defined in a previous work (15). The number of study strains,
as defined by identical RFLP patterns, and the number of clinical isolates are shown for each
location on the tree. The locations of the three M. tuberculosis reference strains (H37Rv,
CDC1551, and strain 210) and the M. bovis strain (AF 2122/97) with sequenced genomes
are also shown.
Figure 2. Distribution of Group A and Group B LSPs on the SNP tree. M. tuberculosis
strains containing each designated LSP are indicated next to each tree branch. Numbers refer to
the total number of strains with the indicated LSP / the total number of isolates with the indicated
LSP. Thick lines indicate the phylogenetic location of a hypothetical common
ancestor in which the LSP first occurred and its progeny. A: All Group A LSPs in the study. B:
All Group B LSPs in the study. The locations of the SCGs and SC-subgroups on these trees, as well
as the total numbers of strains and isolates present in each SCG and SC-subgroup, can be found
in Fig. 1.
Figure 3. Distribution of Group C LSPs on the SNP tree. M. tuberculosis strains containing
Group C LSPs in this study are shown. Numbers refer to the total number of strains with the
indicated LSP / the total number of isolates with the indicated LSP. Thick lines
indicate the phylogenetic location of a hypothetical common ancestor in which the LSP first
occurred and its progeny. The locations of the SCGs and SC-subgroups on these trees, as well as the
total numbers of strains and isolates present in each SCG and SC-subgroup, can be found in Fig.
1.
Figure 4. Location of LSP-Ts on the SNP tree. The locations of clinical M. tuberculosis strains
identified by LSP-T are shown relative to the location of each SCG and SC-subgroup on the SNP
tree. Colored LSP-Ts and connecting lines indicate LSP-Ts that are present on multiple SNP tree
branches. Tree not drawn to scale.
[Figure 1. SNP tree; node labels follow.]
SCG2 (Strain 210): 12 strains, 24 isolates
SCG6a: 12 strains, 12 isolates
SCG5: 38 strains, 43 isolates
SCG7 (M. bovis): 0 isolates
SCG4 (CDC1551): 6 strains, 7 isolates
SCG1: 9 strains, 11 isolates
SCG3a: 1 isolate
SCG3b: 27 strains, 29 isolates
SCG3c: 8 strains, 34 isolates
SCG6b (H37Rv): 4 strains, 4 isolates
[Figure 2. SNP trees annotated with strain/isolate counts per branch. Panel A: LSP1, LSP9, LSP13, LSP16. Panel B: LSP12, LSP14.]
[Figure 3. SNP trees annotated with strain/isolate counts per branch for the Group C LSPs: LSP2, LSP3, LSP4, LSP5, LSP6, LSP7, LSP8, LSP10, LSP11, LSP15, LSP17.]
[Figure 4. SNP tree; LSP-T labels for each SCG and SC-subgroup follow.]
SCG6a (LSP-T: 46, 47, 48, 49, 50, 51, 52, 53, 54)
SCG6b (LSP-T: 53, 55, 56, 57)
SCG5 (LSP-T: 14, 15, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45)
SCG3c (LSP-T: 5, 6, 7, 8, 9, 10)
SCG4 (LSP-T: 1, 2, 3, 4, 5)
SCG3b (LSP-T: 3, 5, 7, 10, 11, 12, 13, 14, 15, 16)
SCG3a (LSP-T: 20)
SCG2 (LSP-T: 17, 18, 19, 21, 24, 58)
SCG7
SCG1 (LSP-T: 7, 9, 22, 23)
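The LSP-T labels of Figure 4 listed above can be cross-tabulated to identify LSP-Ts that occur on more than one SNP tree branch, which the figure legend describes as the colored LSP-Ts. The Python sketch below is illustrative only and is not part of the original analysis: the dictionary is transcribed from the figure labels, and the names LSP_T_BY_SCG and shared_lsp_types are hypothetical.

```python
from collections import defaultdict

# LSP-T numbers per SCG / SC-subgroup, transcribed from the Figure 4 labels.
LSP_T_BY_SCG = {
    "SCG1": [7, 9, 22, 23],
    "SCG2": [17, 18, 19, 21, 24, 58],
    "SCG3a": [20],
    "SCG3b": [3, 5, 7, 10, 11, 12, 13, 14, 15, 16],
    "SCG3c": [5, 6, 7, 8, 9, 10],
    "SCG4": [1, 2, 3, 4, 5],
    "SCG5": [14, 15] + list(range(25, 46)),  # 14, 15, and 25 through 45
    "SCG6a": list(range(46, 55)),            # 46 through 54
    "SCG6b": [53, 55, 56, 57],
    "SCG7": [],                              # no LSP-T listed in the figure
}


def shared_lsp_types(by_scg):
    """Map each LSP-T found on more than one branch to the branches carrying it."""
    branches = defaultdict(list)
    for scg, lsp_types in by_scg.items():
        for lsp_t in lsp_types:
            branches[lsp_t].append(scg)
    return {t: sorted(scgs) for t, scgs in branches.items() if len(scgs) > 1}


# Usage: list every LSP-T that appears in two or more SCGs or SC-subgroups.
for lsp_t, scgs in sorted(shared_lsp_types(LSP_T_BY_SCG).items()):
    print(f"LSP-T {lsp_t}: {', '.join(scgs)}")
```

Inverting the table in this way makes the incompatibilities between LSP-based types and the SNP tree explicit, since any LSP-T returned by the function spans at least two SNP-defined groups.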