GENETIC CHARACTERIZATION OF CENTRAL AND SOUTH AMERICAN
POPULATIONS OF SCARLET MACAW (Ara macao)
Dissertation Prepared for the Degree of
DOCTOR OF PHILOSOPHY
UNIVERSITY OF NORTH TEXAS
May 2016
APPROVED:
Robert C. Benjamin, Major Professor
Michael S. Allen, Committee Member
Lee E. Hughes, Committee Member
Sarah A. McIntire, Committee Member
Douglas D. Root, Committee Member
Arthur J. Goven, Chair of the Department of
Biological Sciences
Costas Tsatsoulis, Dean of Toulouse
Graduate School
Tracy Ann Kim
Kim, Tracy Ann. Genetic Characterization of Central and South American Populations of
Scarlet Macaw (Ara macao). Doctor of Philosophy (Molecular Biology), May 2016, 171 pp., 15
tables, 35 figures, references, 123 titles.
The wild populations of the Scarlet Macaw subspecies native to southern Mexico and
Central America, A. m. cyanoptera, have been drastically reduced over the last half century and
are now a major concern to local governments and conservation groups. Programs to rebuild
these local populations using captive bred specimens must be careful to reintroduce the native A.
m. cyanoptera, as opposed to the South American nominate subspecies (A. m. macao) or hybrids
of the two subspecies. Molecular markers for comparative genomic analyses are needed for
definitive differentiation. Here I describe the isolation and sequence analysis of multiple loci
from 7 pedigreed A. m. macao and 14 pedigreed A. m. cyanoptera specimens. The loci analyzed
include the 18S rDNA genes, the complete mitogenome, and intronic regions of selected
autosomally encoded genes. Although the multicopy 18S gene sequences exhibited 10%
polymorphism within all A. macao genomes, no differences were observed between any of the
21 birds whose genomes were studied. In contrast, numerous polymorphic sites were observed
throughout the 16,993 bp mitochondrial genomes of both subspecies. Although much of the
polymorphism was observed in the genomes of both subspecies, subspecies-specific alleles were
observed at a number of mitochondrial loci, including 12S, 16S, CO2 and ND3. Evidence of
possible subspecies-specific alleles was also found in three of four screened nuclear loci.
Collectively, these mitochondrial and nuclear loci can be used as the basis to distinguish A. m.
cyanoptera from the nominate subspecies, A. m. macao, as well as identify many hybrids, and
most importantly will contribute to further reintroduction efforts.
Copyright 2016
by
Tracy Ann Kim
ACKNOWLEDGEMENTS
I would like to thank Dr. Robert Benjamin for all of his support as my major professor.
He has provided invaluable hours of counseling and problem-solving. I would also like to thank
my doctoral committee for their encouragement throughout my project. I would like to thank the
many amazing friends, colleagues, students, and professors I have known at UNT.
I am so very grateful to my colleagues who provided the macaw blood samples, making
this project possible. Xcaret Ecoparque and Nature Preserve in Playa del Carmen, Mexico and
private breeders from Cancun, Quintana Roo, Mexico provided many samples, as did Dr.
Patricia Escalante from the National Autonomous University of Mexico, who generously assisted
me in locating validated scarlet macaw blood samples.
Thank you to Dr. Allen, who has been so steadfast in his support and for the constant
access to the wondrous equipment in his lab. Dr. Yan Zhyang has been tremendously helpful and
patient when I encroached upon her lab. I would like to thank Dr. Ron Mittler for his contagious
inspiration and bravery over this last year. I would like to thank Dr. David Visi for his tireless
assistance with sequencing platforms and always answering when I call. A huge thank you to
Richard Donagen-Quick for his brilliant troubleshooting prowess and for being our scientist for
four years and to Laci Adolfo who has been there for me in so many aspects of this project,
discussing science and innovation. She has been so incredibly hard-working and supportive and
always readily available to spill the tea. She is definitely the best discovery from my years on
this research project. I would like to thank my incredible husband, Brian McCormack, who has
been tireless with his patience and energy in this scientific endeavor. Also, I will never be able to
thank my kids enough for enduring so many growing years in a biology building and
continuously showing me how to see the wonders of the world.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 Conservation Issues
1.2 Reintroduction Programs
1.3 Molecular Analysis to Differentiate Species
1.4 New World Parrots
CHAPTER 2 MATERIALS AND METHODS
2.1 Obtaining Samples
2.2 DNA Isolation from Liquid Blood
2.3 DNA Isolation from FTA® Cards
2.4 18S Ribosomal DNA
2.5 Column-based Purification of PCR Products
2.6 Next Generation Sequencing Using the Illumina MiSeq® Platform
2.7 Mitochondrial Genome Comparison
2.8 Next Generation Sequencing Using the Ion Torrent® PGM™ Sequencing System (Life Technologies)
2.9 Mitochondrial DNA Sequence Data Analysis
2.10 Alignments to Reference Sequence
2.11 Differentiation on the Sequence Level
2.12 Identification of Candidate Nuclear Sequences for Polymorphism Screening
2.13 Restriction Digestion Analysis
2.14 Next Generation Sequencing Using the Illumina MiSeq® Platform
CHAPTER 3 RESULTS AND CONCLUSIONS
3.1 18S Analysis of Results
3.2 Mitochondrial Sequences: Analysis of Results
3.3 Nuclear Results
3.4 Off-Instrument Use of MiSeq® Reporter
APPENDIX EXTENDED RESULTS
BIBLIOGRAPHY
LIST OF TABLES
Table 1: Mitochondrial Primer Pairs Used to Amplify the Mitogenome of A. macao
Table 2: Preliminary Assessment of Candidate Nuclear Loci by RFLP
Table 3: Restriction Enzymes Used for Initial Search for Variability
Table 4: Nuclear Sequence Comparison Between Subspecies: 18S
Table 5: Characteristics of Two Subspecies of Scarlet Macaw
Table 6: Mitochondrial Comparison of A. macao cyanoptera, A. macao macao, and hybrids
Table 7: Mitochondrial Sequence Comparison Between Subspecies: 16S
Table 8: Mitochondrial Sequence Comparison Between Subspecies: 12S
Table 9: Mitochondrial Sequence Comparison Between Subspecies: CO2
Table 10: Mitochondrial Sequence Comparison Between Subspecies: ND3
Table 11: Mitochondrial Sequence Comparison Between Subspecies: CytB
Table 12: Nuclear Sequence Comparison Between Subspecies: AK1
Table 13: Nuclear Sequence Comparison Between Subspecies: RGS4
Table 14: Nuclear Sequence Comparison Between Subspecies: Vim
Table 15: Nuclear Sequence Comparison Between Subspecies: RAG1
LIST OF FIGURES
Figure 1: Benefits of Using Nuclear and Mitochondrial DNA Sequences for Comparative Analyses
Figure 2: Homogenization Model of Multicopy Gene Arrays
Figure 3: Visual Comparison of A. m. macao and A. m. cyanoptera
Figure 4: Map of Estimated A. macao Populations in Southern Mexico and Central America
Figure 5: Arrangement of rDNA Clusters on the Genome
Figure 6: Amplified Nuclear Region from 18S rDNA – 1 of 2
Figure 7: Amplified Nuclear Region from 18S rDNA – 2 of 2
Figure 8: Tagmentation® Procedure Used by Nextera XT Library Prep Kit®
Figure 9: TruSeq® Index Plate Guide
Figure 10: Electrophoretic Analysis of Mitochondrial Segment 1 Amplicons
Figure 11: Electrophoretic Analysis of Mitochondrial Segment 2 Amplicons
Figure 12: Electrophoretic Analysis of Mitochondrial Segment 3 Amplicons
Figure 13: Electrophoretic Analysis of Mitochondrial Segment 4 Amplicons
Figure 14: Electrophoretic Analysis of Mitochondrial Segment 5 Amplicons
Figure 15: Electrophoretic Analysis of Mitochondrial Segment 6 Amplicons
Figure 16: Overlapping PCR Amplicons Encompassing Ara macao Mitochondrial Genome
Figure 17: Ion Torrent® Workflow for Sample Preparation
Figure 18: Ion Torrent® Sequencing Final Workflow
Figure 19: Amplified Nuclear Region from AK1 – 1 of 2
Figure 20: Amplified Nuclear Region from AK1 – 2 of 2
Figure 21: Amplified Nuclear Region from RAG1 – 1 of 2
Figure 22: Amplified Nuclear Region from RAG1 – 2 of 2
Figure 23: Amplified Nuclear Region from RGS4 – 1 of 2
Figure 24: Amplified Nuclear Region from RGS4 – 2 of 2
Figure 25: Amplified Nuclear Region from Vim – 1 of 2
Figure 26: Amplified Nuclear Region from Vim – 2 of 2
Figure 27: Electrophoretic Analysis of Restriction Digest of RGS4
Figure 28: Sequence Homology and Single Nucleotide Polymorphisms for 2 Individuals
Figure 29: Arrangement of Ribosomal DNA (rDNA) Clusters on the Genome
Figure 30: Ribosomal DNA Tandem Cluster Array
Figure 31: Illustration of Problem with Tandem Clusters of rDNA During Read Assembly
Figure 32: Phylogenetic Tree Showing Relatedness of Ara macao from Mitogenome Domain I
Figure 33: Illumina MiSeq® Reporter Workflow
Figure 34: Illumina MiSeq® Reporter Run Summary Interface
Figure 35: Illumina MiSeq® Reporter Detailed Sample Analysis Interface
CHAPTER 1
INTRODUCTION
1.1 Conservation Issues
Wildlife conservation addresses the preservation of species and their habitats throughout
the world. The continuing extinction of species during the last century has been occurring at an
alarming rate. It is currently estimated that 1,800 populations per hour, encompassing a wide
range of species, are being lost as a result of human-driven ecological change. This is an
unprecedented rate of biodiversity loss. While the current scientific concern over the loss of
biodiversity emphasizes preventing the loss of species, many of the benefits biodiversity confers
upon humanity are delivered through individual populations (Hughes and Hughes 2007). A
population is a group of individuals of the same species, living in a given location, which is
genetically different from other such groups. Population loss, therefore, is a more significant
measurement to be used when representing the changes to our world that directly affect us.
The International Union for Conservation of Nature (IUCN) is an internationally
recognized authority on the conservation status of the plant and animal species of the world.
Approximately every four years, the IUCN generates a revised “Red List” that documents the
differing threat levels faced by thousands of the world’s plant and animal species. The last such
report was issued in April 2015, and at that time the IUCN Red List assessed and reported on the
status of over 79,000 plant and animal species. The April 2015 Red List classified 41% of
amphibian species, 26% of mammalian species, and 13% of avian species as “under threat”
(International Union for Conservation of Nature 2015).
Biodiversity loss has effects beyond the loss of a particular species or subspecies. The
relationship between biodiversity and the overall health of an ecosystem is both closely
interwoven and multidimensional. Many human diseases originate in animal or environmental
reservoirs that are forced by habitat change to relocate to areas more densely populated by
humans and livestock. This sudden increase in habitat overlap causes humans to come into
contact with new bacteria and viruses not normally encountered. One example of the problems
caused by such an overlap occurred in 1998, when fires were used to clear rainforests for
agriculture in Indonesia. The extensive smoke that was produced forced the fruit bat populations
to relocate into northern Malaysia. Pigs from the large commercial farms located there became ill
after consuming fruit later discovered to be contaminated by the bats' saliva or urine (Goh, Tan
et al. 2000). A large majority of the farmers who came in contact with the ailing livestock began
to report a range of disorders from acute respiratory syndrome to fatal encephalitis. Although
the disease was extremely contagious among pigs, their mortality rate was very low. In humans,
however, 40% of the people who contracted the Nipah virus died. Since 1998, a dozen more
outbreaks have occurred in Bangladesh and India. These have resulted in more respiratory disease
and a fatality rate of up to 92%, which has led scientists to suspect a different strain of the
Nipah virus as the culprit (Iowa State University 2007).
In addition, many of the far-reaching effects of habitat destruction are often not seen until
long after the damage has been done. And even if these far-reaching effects can be reversed, such
reversal can generally be accomplished only through long, arduous, and scientifically sound
efforts. An example of the far-reaching effects of biodiversity loss can be seen in the story of the
gray wolf (Canis lupus) in the northern United States. During the first half of the 20th century,
the gray wolf was seen as a threat to livestock, and it became the target of a mass extermination
effort. This effort effectively wiped out the gray wolf in most of its former range within the
lower 48 states (Ripple and Beschta 2012). As the gray wolf population declined, there was an
increase in populations of the gray wolf’s prey species: elk, deer, moose, coyote, raccoon, and
beaver. With the corresponding proliferation of grazing species (such as elk, deer, and moose),
populations of the plants that these animals fed on were decimated. This reduction in
vegetation resulted in a loss of protective cover for birds, causing a decline in riparian bird
populations (Ripple and Beschta 2012). With the falling bird populations in wetland areas, the
number of insects, particularly mosquitoes, began to increase. Mosquitoes are a known disease
vector that directly affects humans—West Nile Virus (encephalitis), dengue fever, and yellow
fever are known mosquito-borne diseases in the United States (AMCA - American Mosquito
Control Association 2014). It seems unlikely that anyone would have predicted, when the wolf
eradication program was initiated, that the program’s success would result in an increased threat
to humans from mosquito-borne diseases. Such an indirect and unanticipated ecological effect is
just one example of the types of dangers posed by species extinction (or the local extinction of a
population of a species).
In the case of the gray wolf, which was an example of local population extinction, not all
was lost. Beginning in the mid-1990s, the U.S. National Park Service carried out a
program to reintroduce the gray wolf into the northern United States (National Park Service
2015). Canadian gray wolf populations were used to seed former U.S. ranges. The return of
wolves to Yellowstone has begun to curb the abundant elk population within the park, which in
turn has allowed young aspens and willows to grow where they had been previously decimated
by overbrowsing (Ripple and Beschta 2012). The increase in vegetation along streams and other
areas has provided improved habitats for beaver, birds, and fish populations. The enhanced
vegetation also increased the available food sources for bears and birds (Ripple and Beschta
2012). Although it is too early to observe a consequent decrease in insect populations as a result
of the increase in avian species populations, it is reasonable to suggest that this will likely follow.
1.2 Reintroduction Programs
Reintroduction programs, such as the gray wolf example given above, are often seen as
controversial conservation tools. Care needs to be taken to ensure the health of the members of
the introduced species as well as the current inhabitants of the introduction area. The loss of
genetic diversity also results in lower individual fitness and poor adaptability (Lande 1988). The
fate of small populations may be linked to collective genetic change in those populations.
Further, captive breeding of endangered wildlife is often necessary for its
conservation. However, this strategy may increase the chance of inbreeding, which causes poor
fitness of these populations (Ralls and Ballou 1983, Crnokrak and Roff 1999). Inbreeding is
known to decrease genetic diversity and reduce reproductive and survival rates, with these
problems leading to increased extinction risk (Saccheri, Kuussaari et al. 1998). Such genetically
impoverished populations often struggle and have to be crossed with individuals from other
populations to become genetically viable (Westemeier, Brawn et al. 1998). To address risks
associated with inbreeding, however, appropriate population management programs can be
developed using genetic studies (Snyder, Derrickson et al. 1996). Care must also be taken to
ensure the animals are being introduced into an appropriate, sufficiently sized and healthy
habitat.
An important consideration when introducing or reintroducing a species into a habitat is
the determination of whether the reintroduced population will be genetically sustainable. It is
therefore desirable to identify for reintroduction an appropriate indigenous population that is
genetically well-adapted for the chosen environment. It is additionally important to verify that
there is sufficient genetic diversity within the introduced population to maintain the healthy
genetic substructure necessary for long-term survival. Genetic substructure of a population is
diversity that is the result of the combined forces of mutation, gene flow, genetic drift, and
natural selection. Populations with high rates of gene flow tend to become more homogenized
and over time will exhibit less total substructure. Populations with low rates of gene flow will
have an increased likelihood of experiencing genetic drift, and the result is a decrease in
variability within the population and an increase in genetic divergence from other
populations/subpopulations of the same species. This in turn leads to an increase in the amount
of substructure within the larger/combined population. It is important to the success of the
reintroduction program that a genetically diverse population of suitably adapted individuals be
introduced into a suitably large and diverse environment that will preserve and expand the
genetic substructure of the newly introduced population.
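The relationship between gene flow, drift, and substructure described above can be illustrated
numerically. The following is a minimal sketch, not part of the analyses in this dissertation,
that computes Wright's fixation index (F_ST) for a single biallelic locus. The function names,
the assumption of equally sized subpopulations, and the example allele frequencies are all
hypothetical and chosen only for illustration.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2pq at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst(subpop_freqs):
    """F_ST = (H_T - H_S) / H_T for equally sized subpopulations (an assumption).

    subpop_freqs: frequency of one allele in each subpopulation.
    """
    n = len(subpop_freqs)
    # H_S: mean expected heterozygosity within subpopulations
    h_s = sum(expected_heterozygosity(p) for p in subpop_freqs) / n
    # H_T: expected heterozygosity of the pooled (total) population
    p_bar = sum(subpop_freqs) / n
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0  # locus is monomorphic overall; no substructure measurable
    return (h_t - h_s) / h_t


# Hypothetical frequencies: high gene flow keeps subpopulations similar,
# while drift under low gene flow pushes them apart.
high_gene_flow = fst([0.50, 0.50])
low_gene_flow = fst([0.20, 0.80])
fixed_differences = fst([0.00, 1.00])
```

In this sketch, identical frequencies give an F_ST of zero, and subpopulations fixed for
different alleles give an F_ST of one, matching the qualitative description in the text.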
Allopatric speciation can occur when a population separates to the extent that genetic
exchange effectively ceases between the separated groups. The resulting genetic divergence may
be due to differing selective pressures, independent genetic drift, or mutations that arise in
one group but not the other, and eventually the divergence can result in the formation of
recognized subspecies (Nosil and Schluter 2011). When a species chosen for restoration into the
wild comprises two subspecies, reintroduction of members of one of the two subspecies into a new
area can introduce alleles that are not necessarily favored in the new environment. These alleles,
although supported by the original environment where they arose and were selected for as
advantageous, are now subjected to conditions that select for a different, genetically distinct
group. Thus, when reintroducing a species into a range where it is no longer found, the choice of
the particular subspecies to be introduced can have a significant impact on the success or failure
of the restoration effort. Given the importance of choosing the right subspecies, verification of
the genetic background of individuals to be used for introduction is a high priority.
Efforts to select subspecies for introduction are further complicated by the possible
existence/creation of hybrid individuals. When two genetically-distinct populations are combined
as a result of a reintroduction effort, intermediate genotypes can be created in the subsequent
generations that lack vigor (Birchler, Yao et al. 2006). In hybrids with these intermediate
genotypes, there can be a breakdown of biochemical and physiological compatibility between the
alleles of interacting genes (gene products) from the different genetic backgrounds. Under
typical evolutionary conditions, individual alleles are propagated because of the advantage they
give to the individual in an environment. Because of non-additive gene action, however, these
alleles are often a detriment when appearing in a relatively new genetic background. Individuals having
these non-adaptive gene combinations are less likely to survive. A reintroduction program that
leads to the creation of such unfavorable hybridization can, counterintuitively, cause a reduction
in the overall population due to interbreeding and the subsequent poor survival of hybrid
offspring.
Based upon all of these considerations, determining the population/subspecies of the
species appropriate for the area into which they are to be introduced and having detailed
knowledge of the genetic makeup of each individual to be used for reintroduction are of
paramount importance to ensuring the success of a reintroduction program. Accomplishing these
objectives has been made much simpler and more definitive with the introduction of comparative
genomics as a means to differentiate groups of organisms taxonomically. The use of such
comparative genomic techniques for the differentiation of species and taxa has become both
quicker and less expensive in recent years, and is now considered the “gold standard” for such
determinations in most cases.
Recent advances in molecular research techniques such as high-throughput DNA
sequencing and computer programs for high-volume DNA sequence analysis have opened up
many new opportunities. Such techniques have allowed for the collection of large amounts of
sequence information that is now available to the research community in public databases.
Accordingly, analyses that previously would have been too complex or expensive to carry out with
available resources can now be used advantageously in reintroduction programs. With such
population-focused genomic data available, broader genomic analyses can and should be
considered as necessary first steps of a reintroduction program.
1.3 Molecular Analysis to Differentiate Species
The first molecular-based approach applied to the study of taxonomy and the
biogeography of populations came with the advent of protein sequencing. In the early 1960s,
amino acid sequence comparisons, primarily of cytochrome c, were used to determine the
relative divergence of taxa. The rate of the evolution of a protein is based on the occurrence of
mutations in the genome and by the probability that the resulting random change in amino acid
sequence will be tolerable in a functioning protein. The relatively slow rate of change of
cytochrome c at the amino acid level made this protein an excellent candidate to discern deep
phylogenetic relationships (Fitch and Margoliash 1967, Olson 1989). Protein sequencing
comparisons are still used today, but more often as a supplement to DNA sequence-level
analyses, and not as a stand-alone comparison method. Synonymous mutations (due to the
degeneracy of the genetic code) still yield the same amino acid sequence even though the
nucleotide sequences of the two alleles differ, causing divergence to be underestimated when
only proteins are compared. Therefore, studying a given length of DNA will yield more phylogenetic
information than is produced by studying the amino acid sequence of the corresponding encoded
polypeptide. Amino acid sequences are also subject to more selective pressure than DNA
sequences, and thus change more slowly over evolutionary time. They are therefore likely to be
less useful for comparisons of species/populations that have only recently diverged (Tourasse
2000).
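The effect of synonymous mutations on divergence estimates can be sketched with a short example.
The code below is an illustration only and is not drawn from the methods of this dissertation. It
translates two hypothetical alleles using the standard nuclear genetic code and counts differences
at the DNA and amino acid levels. The allele sequences and function names are invented for the
example, and the vertebrate mitochondrial code, which differs at a few codons, is not used here.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code; amino acids listed in TCAG codon order (TTT, TTC, TTA, ...).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical alleles differing only at third codon positions.
allele_a = "ATGGCTCTGAAAGGT"
allele_b = "ATGGCCCTAAAAGGC"

dna_differences = count_differences(allele_a, allele_b)
protein_differences = count_differences(translate(allele_a), translate(allele_b))
```

Here the two alleles differ at three nucleotide positions yet encode the same peptide, so a
protein-level comparison would record no divergence between them.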
From 1974 to 1986, the dominant molecular-based technique for phylogenetic analysis
involved DNA-DNA hybridization (Sibley and Ahlquist 1983). Purified genomic DNA samples
from two different species were denatured and then allowed to hybridize to form heteroduplexes
between the homologous DNA sequences. Heteroduplex molecules whose nucleotide
sequences had more complementarity exhibited a higher melting temperature (Tm) than those
with less complementarity. Thus, a higher thermostability of the heteroduplex was associated
with a closer phylogenetic relationship between the two organisms.
Advances in biotechnology in the 1970s allowed researchers to use a more direct
method to measure DNA sequence diversity. Frederick Sanger pioneered a DNA sequencing
method based on DNA replication. This process used chemically altered nucleotides to terminate
synthesis of DNA fragments during replication. The fragments were then size-separated by
electrophoresis and visualized on X-ray film to obtain the DNA sequence (Sanger, Nicklen
et al. 1977).
Phylogenetically useful markers often accumulate in the genome in the form of single
nucleotide polymorphisms (SNPs) and short indels (insertions/deletions) (see Figure 1).
Approximately 90% of genetic variation in the human genome is in the form of SNPs (Collins,
Brooks et al.). Comparisons of these polymorphisms have provided new insights into the evolution
of populations, and SNPs have rapidly become the molecular marker of choice for many applications
in ecology and conservation genetics. SNPs tend to be neither especially advantageous nor
disadvantageous from a survival perspective. They are not readily removed from the genome
over time by natural selection because they do not directly alter the transcriptome and/or
proteome of the individual. Polymorphisms of this type can often be used to more effectively
track recent and rapid evolutionary changes, and are therefore useful to differentiate between
highly related groups and shed light on the level of genetic relatedness and substructure within
and/or between populations. The original SNP analyses used in population studies were virtually
all carried out using the RFLP (restriction fragment length polymorphism) technique or through
RAPD (random amplified polymorphic DNA) studies. Today, SNPs are found and analyzed by
DNA sequencing of the genome and/or the locus under study (Bohle and Gabaldon 2012).
Figure 1: Benefits of using nuclear and mitochondrial DNA sequence for comparative
analyses (Morin, Luikart et al. 2004).
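As a minimal sketch of how SNPs and short indels might be tallied once two sequences have been
aligned, the example below scans a pairwise alignment column by column. It is illustrative only
and does not reproduce the software used in this study. The function name, the use of "-" as the
gap character, the 0-based column coordinates, and the example sequences are all assumptions.

```python
def find_variants(ref_aln, qry_aln):
    """Return (snps, indels) from two aligned, equal-length sequences.

    snps:   list of (column, reference_base, query_base)
    indels: list of (start_column, length, kind), where kind is "insertion"
            (gap in the reference) or "deletion" (gap in the query).
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    snps, indels = [], []
    col = 0
    while col < len(ref_aln):
        r, q = ref_aln[col], qry_aln[col]
        if r == "-" or q == "-":
            # Extend over the full run of gaps in the same sequence.
            kind = "insertion" if r == "-" else "deletion"
            gapped = ref_aln if kind == "insertion" else qry_aln
            start = col
            while col < len(gapped) and gapped[col] == "-":
                col += 1
            indels.append((start, col - start, kind))
            continue
        if r != q:
            snps.append((col, r, q))
        col += 1
    return snps, indels


# Hypothetical aligned fragments containing one SNP, one deletion, one insertion.
reference = "ACGTACGT--ACGT"
query = "ACGAACG-TTACGT"
snps, indels = find_variants(reference, query)
```

Consecutive gap columns are grouped into a single indel event, so a two-base insertion is
counted once rather than twice.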
SNPs do have limitations when being used to assess genetic variation. Ascertainment bias
is a distortion in the measurement of the frequency of a phenomenon due to the way the data are
collected. This bias is a larger potential drawback for some applications than for others. The
problem of ascertainment bias is less of an issue for individual identification, paternity analyses,
and assigning individuals to different populations because these are not based on allelic
frequency (Lachance and Tishkoff 2013).
Population substructure can be underestimated when measuring SNPs with a high
heterozygosity. Usually SNPs with higher heterozygosity tend to be older and therefore have had
time to be distributed across the population relative to SNPs with lower heterozygosity (Morin,
Luikart et al. 2004). Population bottlenecks are detected by loss of genetic variation within the
genome. Since SNPs typically have only two alleles per locus, it is more difficult to detect allelic
variety or the lack thereof within a population (Morin, Luikart et al.). Consequently, estimating
population size or demographic differences can be problematic unless ascertainment bias is taken
into consideration.
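The limited information carried by a single biallelic marker can be illustrated numerically. The following is a minimal sketch in Python, not part of the analyses performed in this study: the function name and the allele frequencies are illustrative assumptions. It computes expected heterozygosity, H = 1 - sum(p_i^2), and shows that a two-allele SNP cannot exceed H = 0.5, whereas a locus with more alleles can.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) for a single locus."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical allele frequencies, chosen only for illustration.
snp_even = expected_heterozygosity([0.5, 0.5])   # the maximum for a biallelic SNP
snp_rare = expected_heterozygosity([0.9, 0.1])   # a SNP with a rare minor allele
multi = expected_heterozygosity([0.25, 0.25, 0.25, 0.25])  # four equally common alleles
```

Under these assumed frequencies, the biallelic locus tops out at 0.5 while the four-allele locus reaches 0.75, which is one way to see why more SNP loci are generally needed to capture the same amount of variation.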
Biases also arise when using SNP markers across populations between different studies.
The diversity of the original study’s population can accordingly skew the diversity measurement
of the population being measured in the subsequent study causing a false increase or decrease to
be seen (Schlotterer and Harr). To help counteract this type of bias, the screened individuals
must originate from a wide geographic source and the protocol used initially to identify the
polymorphism must be recorded in detail along with the number of individuals screened (Akey).
Fourteen pedigreed A. m. cyanoptera and seven pedigreed A. m. macao (plus two known
hybrids) were studied in the completion of this project. Careful consideration was given to
ensure that the individual macaws studied were not related. Sequence analysis of the
hypervariable mitochondrial control region was completed to show that there was no matrilineal
relationship between the individual macaws studied (see Figure 32).
Small subunit ribosomal RNA gene (rDNA) sequences were the initial nucleotide
sequences used for phylogenetic comparison, and these are still used today. The homology at the
nucleotide sequence level of rDNA is well conserved between even distantly related organisms.
This allows for easier alignment, and therefore determination, of evolutionary relationships and
divergence rates. In particular, the 16S genes of prokaryotes and 18S genes of eukaryotes are
frequently used to determine phylogeny. The 16S rDNA sequence of prokaryotes was used by
Carl Woese to show evolutionary relatedness, and his work in the late 1970s became the basis of
the modern three-domain classification system of Archaea, Bacteria and Eukarya (Clarridge
2004). The 16S and 18S rRNA genes are multicopy gene families (making them easy to detect),
and comparisons using them effectively resolve deep phylogenetic relationships. However, due
to the high level of interspecies sequence conservation, they may not provide sufficient data for
differentiation at closer taxonomic levels. Despite this shortcoming, comparisons using these
sequences to identify species and assess interspecies relationships remain extremely valuable for
taxonomic studies.
Reiterated genes such as the ribosomal RNA genes are characterized by being composed
of hundreds of identical genes present in a tandem array (King and Stansfield 1990). This genetic
redundancy allows the cell to produce larger amounts of a product that is needed in greater
quantity than the products of less redundant genes. Concerted evolution is likely to be a correction
mechanism counteracting the undesired effects of mutations. Homogenization of sequences
across these regions provides stability by minimizing polymorphisms in gene products required
to be present uniformly in large quantities (Feliner and Rosselló 2012).
The individual units of ribosomal DNA arrays, consisting of these tandem clusters, show
much greater sequence similarity within a species than between species. The
clusters/transcription units of these arrays do not evolve independently of each other but through
concerted evolution. This causes cluster homogeneity that is the direct result of unequal crossing-
over events and gene conversion (Schlötterer and Tautz 1994). Novel variants arising by
mutation can spread relatively rapidly along the array or be quickly eliminated from the array of
any one species (Eickbush and Eickbush 2007). It has been suggested that homogenization is
favored by natural selection because it reduces mutational load (Ohta 2009). The high levels of
ITS and IGS polymorphisms in some species are likened to the creation and disappearance of the
duplicate gene copies in multicopy gene families as a result of natural selection (Nei and Rooney
2005).
The high copy number and sequence conservation of the tandemly arranged clusters
make detailed sequence analysis of the entire array difficult. Presently available DNA
sequencing technology cannot provide accurate sequence data across long stretches of tandemly
repeated and highly conserved sequences. Consequently, SNPs found in these rRNA gene
clusters have not been studied in any non-model organism with more than a few hundred cluster
copies (Matyasek, Renny-Byfield et al. 2012) on one chromosomal locus. Angiosperms are a
good example of the significance of this problem because they are one of many organisms that
possess tens of thousands of rRNA gene clusters distributed over several chromosomal loci
(Heslop-Harrison and Schwarzacher 2011).
Multicopy gene sequences that show homogenization within repeat arrays have been
found to follow three basic phases involved with concerted evolution (Barker, Benesh et al.
2012). During the first phase, mutations/SNPs tend to occur anywhere within the rDNA cluster.
Since the rDNA is highly redundant, there is no selective pressure acting on these mutations and
they can persist for some time (Hughes and Hughes 2007). These mutations are observed as low-
frequency polymorphisms located throughout the repeat unit. During the second phase,
recombination and unequal crossing-over events will result in mutations/SNPs being either
removed/deleted or duplicated. Duplicated clusters housing the mutation will increase in number
(if not removed during subsequent crossing-over or recombination events) until a certain
threshold is reached. At this point, natural selection acts on the variant according to whether
fitness is compromised by the change or strengthened by the novel function. During the third
phase, the mutant repeat completely replaces all of the previous repeats and the new variant
becomes fixed and homogenized within the array (see Figure 2) (Questiau, Eybert et al. 1998).
This mode of homogenization explains why some regions within the same repeat cluster
are highly polymorphic while others are highly conserved although the entire repeat is subject to
the identical homogenization process.
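The three phases described above can be sketched as a simple stochastic simulation. This is an illustrative toy model in Python, not the model of the cited authors: the array size, the random seed, and the function name `homogenize` are assumptions, and selection (the threshold step of the second phase) is deliberately omitted, so the variant spreads or disappears by chance alone.

```python
import random


def homogenize(array, rng, max_steps=100_000):
    """Repeatedly overwrite one randomly chosen repeat with a copy of another
    (a crude stand-in for gene conversion and unequal crossing-over) until all
    repeats are identical. Returns the final array and the number of steps."""
    array = list(array)  # work on a copy so the caller's list is untouched
    steps = 0
    while len(set(array)) > 1 and steps < max_steps:
        donor = rng.randrange(len(array))
        recipient = rng.randrange(len(array))
        array[recipient] = array[donor]
        steps += 1
    return array, steps


rng = random.Random(1)  # fixed seed so the run is repeatable
# Phase 1: a single new variant "B" arises among 20 otherwise identical repeats "A".
start = ["A"] * 19 + ["B"]
# Phases 2 and 3: copies are duplicated or lost until one variant occupies the array.
final, steps = homogenize(start, rng)
```

In most runs of such a model the rare variant is eliminated and the array returns to the original sequence; occasionally it replaces every copy, which corresponds to fixation in the third phase.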
Nucleotide sequences in the mitochondrial genome have been used extensively for
phylogenetic comparisons and parentage determinations. Each mitochondrion houses 2-10 copies
of the mitogenome, and each cell can possess dozens of mitochondria. This relatively high copy
number per cell facilitates obtaining sequence-level information from mtDNA-encoded loci. The
traditional approach using “alleles” for comparisons is routinely replaced by identifying mtDNA
haplotypes, each composed of a specific set of single nucleotide substitutions (SNPs) and/or
indels.
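As a minimal sketch of what identifying a haplotype means in practice, the Python example below records each sample as the set of sites at which it differs from a reference sequence and then groups samples that share the same set. The sequences, sample names, and function names are invented for illustration, and the sequences are assumed to be already aligned (a "-" marks an indel column).

```python
def haplotype(reference, sequence):
    """Return the differences between an aligned sequence and the reference as a
    tuple of (position, reference_base, observed_base); positions are 1-based."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must be aligned to the same length")
    return tuple(
        (i + 1, ref, obs)
        for i, (ref, obs) in enumerate(zip(reference, sequence))
        if ref != obs
    )


def group_by_haplotype(reference, samples):
    """Map each distinct haplotype to the list of sample names that carry it."""
    groups = {}
    for name, seq in samples.items():
        groups.setdefault(haplotype(reference, seq), []).append(name)
    return groups


# Toy aligned sequences, not real macaw data.
reference = "ACGTACGTAC"
samples = {
    "bird_1": "ACGTACGTAC",  # identical to the reference
    "bird_2": "ACGAACGTAC",  # substitution at position 4
    "bird_3": "ACGAACGTAC",  # same substitution, so same haplotype as bird_2
    "bird_4": "ACGTAC-TAC",  # single-base deletion at position 7
}
groups = group_by_haplotype(reference, samples)
```

Here bird_2 and bird_3 fall into one haplotype group, while bird_1 and bird_4 each form their own.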
Figure 2: Homogenization model of multicopy gene arrays. (Ganley and Kobayashi)
The mitochondrial genome mutation rate is estimated to be 10 times higher than the mutation
rate of equivalent regions of the nuclear genome (Avise 2000). This is believed to be because
mtDNA lacks histone-like proteins and is therefore more exposed to mutagenic events, the
organelle has no clear DNA repair capabilities, and mtDNA is exposed to a high steady-state level of
reactive oxygen species (ROS) and free radicals (Mikhed, Daiber et al. 2015). This, coupled with
a high replication rate, creates the signature increase of accrued polymorphisms in the
mitochondria. The avian mitochondrial genome is approximately 17,000 bp long and consists of
22 tRNA genes, 13 protein-coding open reading frames (ORFs), two rRNA coding regions and a
control region (Boore 1999). The level of sequence polymorphism varies across the mitogenome.
The control region, or D loop, is the most variable section of the mitogenome, and because it
does not code for an expressed product, it is able to undergo rapid change at the nucleotide
sequence level. The average level of sequence polymorphism across the different regions
of the mitogenome decreases in the order of control region > cytochrome b > cytochrome
oxidase 1 (CO1) > 12S rDNA > 16S rDNA (Arif).
As noted above, as new generations of improved molecular analysis techniques have
been introduced, and the cost and time requirements for such analyses have significantly
decreased, it has become possible to broadly apply such techniques for the identification and
differentiation of species/subspecies. The mitochondrial genome can provide abundant
information for evolutionary studies of many taxa, and can be used as a source of molecular
markers for conservation studies of endangered species (Nabholz, Uwimana et al. 2013). A
number of suggested subspecies have been shown to exhibit definitive differentiation based on
these techniques:
Stingless bee – Melipona quadrifasciata anthidioides; Melipona quadrifasciata
quadrifasciata (Waldschmidt, Barros et al. 2000);
Canada goose – Branta canadensis taverneri; Branta canadensis leucopareia
(Shields and Wilson 1987);
Sand lizard – Lacerta agilis exigua; Lacerta agilis boemica (Grechko, Fedorova
et al. 2006);
Willow Flycatcher – Empidonax traillii extimus; Empidonax traillii adastus
(Paxton, Sogge et al. 2008); and
Bluethroat – Luscinia svecica namnetum; Luscinia svecica svecica (Questiau,
Eybert et al. 1998).
In 2003, an international consortium began to compile a database of DNA sequences
from taxa all over the world in order to differentiate organisms at the nucleotide sequence level.
This identification system is based upon specific molecular markers. The primary marker used
for this species comparison is located at the 5’ end of the cytochrome c oxidase subunit 1 gene
(CO1), which is encoded within the mitochondrial genome. Taxonomic
assignments/differentiations using this 650 bp sequence have stood up remarkably well under the
scrutiny of taxonomists who initially declared this method of species identification to be “anti-
taxonomy” (Hebert and Gregory 2005).
Dr. Mark Stoeckle of Rockefeller University developed specific differentiating DNA
sequences for 260 species of North American birds based upon the CO1 gene. All of the birds,
with the exception of four species, could be uniquely differentiated by a species-specific CO1
gene sequence. In the case of the four “species” for which the approach failed to identify a single
species-specific DNA sequence, each of them showed the presence of two differentiated
sequences, suggesting that each of the single “understood” species might in fact consist of two
distinct species (Hebert, Stoeckle et al. 2004).
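Barcode comparisons of this kind rest on a simple measure of sequence divergence. The sketch below, in Python, computes the uncorrected proportion of differing sites (the p-distance) between two aligned sequences; it is offered only as an illustration of the principle. The short sequences are invented, the function name is an assumption, and published barcoding studies commonly apply a substitution model rather than this raw proportion.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences, skipping
    columns where either sequence has a gap or an ambiguous base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # ignore gaps ("-") and ambiguity codes such as "N"
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Invented 20-base fragments standing in for a 650 bp CO1 barcode.
barcode_x = "ACGTACGTACGTACGTACGT"
barcode_y = "ACGTACGTACGAACGTACGT"  # one substitution relative to barcode_x
barcode_z = "ACGTACGTACG-ACGTACGN"  # one gap and one ambiguous base

distance_xy = p_distance(barcode_x, barcode_y)
distance_xz = p_distance(barcode_x, barcode_z)
```

With these toy sequences, barcode_x and barcode_y differ at 1 of 20 comparable sites, while barcode_x and barcode_z show no differences at the 18 sites that can be compared.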
After analysis by groups using molecular markers, many original classifications that had
been made under the traditional biological species concept were determined to be incorrect. This
has been especially true in the case of many prokaryotic species. Using molecular analysis, many
higher eukaryotic populations formerly regarded as separate species were found to be a
single taxon and, conversely, many singular groupings were refined into two independent
classifications (Hebert and Gregory 2005).
Dr. Paul Hebert and Dr. Daniel H. Janzen of the University of Pennsylvania showed that
members of a single, traditionally classified Costa Rican butterfly species, Astraptes fulgerator,
possessed a total of 10 distinct CO1 gene sequences (Hebert, Penton et al. 2004). This finding
suggests that the butterflies were not a single species, as long assumed, but a complex of 10
different species occupying overlapping territories. The survival advantage imparted by the
original shared adult phenotypes (physical shape and wing color) was so great that the
descendant groups each retained these ancestral traits (Aitken 2006). However, although the
adults of different groups shared many physical and behavioral properties, the larval stages
(caterpillars) of these newly separated butterfly species look quite distinct and prefer
different foods. This is, of course, consistent with each species having diverged from a
common ancestral species with the majority of the present differences between them primarily
observable at the larval stage, and with many of the adult phenotypic properties retained by each
of the “new” species.
The mitochondrial genome only offers information from maternal inheritance; therefore,
delimiting subspecies hybrids also requires comparison of bi-parental nuclear sequence. Nuclear
loci, especially protein-coding regions, generally evolve more slowly than regions found in the
mitochondria. This slower rate of change often does not provide enough sequence variation to
differentiate between closely related species/populations (Dasmahapatra and Mallet 2006). Non-
protein coding sequences found in the nuclear genome, such as intergenic and intronic regions,
offer more variability between populations.
In order to use the nuclear DNA sequence as a successful phylogenetic tool, the sequence
chosen must be conserved among individuals within the population being studied but display
sufficient variation between the two populations. Sequences within gene families, as opposed to
single-copy regions, should be avoided in order to reduce the risk of including sequences that are
not true orthologues (De Mendonca Dantas, Godinho et al. 2009) but are variants of a similar
sequence in the same genome. Attempts should also be made to use intronic sequences found
on different chromosomes to minimize the possibility of linkage, allowing easier identification of
hybrids (Backstrom, Fagerberg et al. 2008).
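The selection criterion described above, conserved within each population but different between them, can be expressed as a search for fixed differences in an alignment. The following Python sketch is illustrative only: the sequences and the function name are assumptions, not data or software from this study, and the groups are assumed to be non-empty and already aligned.

```python
def diagnostic_sites(group_a, group_b):
    """Return (position, base_in_a, base_in_b) for every 1-based alignment column
    that is fixed within each group but differs between the two groups."""
    seqs = list(group_a) + list(group_b)
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for i in range(length):
        bases_a = {s[i] for s in group_a}
        bases_b = {s[i] for s in group_b}
        if len(bases_a) == 1 and len(bases_b) == 1 and bases_a != bases_b:
            sites.append((i + 1, bases_a.pop(), bases_b.pop()))
    return sites


# Invented intron fragments for two hypothetical populations.
population_1 = ["ACGTAC", "ACGTAC", "ACGTAT"]  # position 6 varies within the group
population_2 = ["ACCTAC", "ACCTAC", "ACCTAC"]

sites = diagnostic_sites(population_1, population_2)
```

Only position 3 qualifies in this toy alignment: position 6 is polymorphic within the first group and so cannot separate the two. At a nuclear site of this kind, a first-generation hybrid would be expected to carry both bases.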
1.4 New World Parrots
Parrots are the most threatened and endangered group of birds with more than 90 species
(Zierdt-Warshaw 2000). During the last quarter of the 20th century, over 21 parrot species have
become extinct. Three of these extinction events have occurred since 2000. The primary cause of
population declines and extinctions was overhunting in the 1960s and 1970s, but it is now more
often due to the loss of habitat as a result of deforestation and subsequent habitat fragmentation.
Fifty percent of the earth’s plants and animals can be found in the rainforest even though
rainforests cover only 5% of the earth’s land surface (Butler 2014). Eighty-thousand acres of
tropical rainforest are destroyed every day (Moss 2009). This is due, in large part, to the increase
in the global demand for beef and soybean production, and in order to make way for these
commercial crops, the forests are burned. The rainforests being destroyed range in age from the
2,000-year-old southern basin of the Amazon (Carson, Watling et al. 2015) to the
70-million-year-old forests of Southeast Asia (Benders-Hyde 2002). It is projected that most
of the primary rainforests of Southeast Asia will be destroyed in the next 10 years
(Benders-Hyde 2002).
The vitality of parrots as a group is heavily challenged. The IUCN Red List details 11
species of parrots as endangered, five species as threatened, and four species requiring special
protection. Mexico has a special connection to this group, having 22 indigenous species of
parrots, with six such species found only in Mexico (IUCN 2013). In the recent past, the rate of
illegal wild bird capture has still been estimated at 65,000 to 75,000 birds per year, and these
include many parrots. Greater than 75% of the captured birds die before reaching the purchaser,
meaning that supporting even a relatively small trade of this type at the consumer level requires a
substantial depletion in wild populations.
Parrots are considered an important group of birds not only because of the general need to
preserve all remaining species in our ecosystems but also because they have characteristics that
are not regularly seen in other avian groups. Their unique higher cognitive function is attributed
to an unusual rate of brain development and a much larger brain volume, allowing the parrots to
employ complex communication patterns and to mimic the sounds of other animals (Forshaw
2006). Parrots often live 50-75 years in the wild and up to 90 years in captivity. The genetic
factors supporting this incredible longevity are of much interest to the scientific community, as
are the genes associated with the parrot’s superior cardiovascular health, which enables them to
fly 15-20 miles per day, oftentimes at speeds of up to 35 miles per hour. These attributes are
seldom seen in other avian species, making parrots a useful point of reference for comparative
research.
Macaws, as a part of the New World parrot group, have the same unique characteristics,
and as seed dispersers they are an important element of South and Central American forest
ecosystems. In addition to their ecological importance, they are signature birds to the regions in
which they are found, making them a source of regional and national affection, as well as an
important resource for ecotourism. These and other factors have driven further study and
reintroduction of macaw species into regions within their historic ranges that might still support
healthy populations. One of these species is the Scarlet Macaw, Ara macao.
Scarlet macaws are one of the larger of the New World parrots. They are characterized by
their massive bills and strongly graduated tails. They have a bare, white patch on the sides of the
face with inconspicuous white, feathered lines. Their chest and head feathers are mostly red with
blue lower back coverts and tail coverts. The median and secondary wing coverts are yellow and
variably tipped with green (Juniper 1998). Scarlet macaws are considered to have the greatest
latitudinal distribution range of any bird in the genus Ara. Their native habitat presently extends
from southeastern Mexico through northern South America to eastern Bolivia and the Brazilian
Highlands. This species of macaw also includes a poorly defined subspecies denoted Ara macao
cyanoptera, also known as the Central American scarlet macaw. This subspecies is differentiated
from the South American scarlet macaw (nominate subspecies, Ara macao macao) by a
considerably larger body length and mass. The band of yellow on the secondary wing coverts is
much wider on the Central American scarlet macaw, and this macaw has feathers tipped with
blue and little or no green (see Figure 3) (Forshaw 2006).
Macaws reach sexual maturity at three to four years of age, depending on the availability
of a mate, and the species has a low annual reproductive rate. They are asynchronous egg layers
during the months of December, January, and February, and they will lay 1-4 eggs, with each
egg being laid approximately three days after the last (Vaughan Bremer Dear 2004). The eggs
will hatch after an approximate 22-day incubation period, and the chicks fledge approximately
75 days after hatching. Because they lay their eggs over a period of time, and there is variability
in fledging time among the chicks, collectively, there will be a period of more than 100 days
during which the parents will need to be particularly vigilant to protect the young (Myers and
Vaughan 2004). The parents will continue to contribute to the care of the young after fledging,
typically through the first year of life and will not lay eggs again until the young have left the
nest (Myers and Vaughan 2004).
Figure 3: Visual comparison of the nominate subspecies Ara macao macao (left) and
Ara macao cyanoptera (right). A. m. cyanoptera is differentiated by its larger
size and lack of green on the tips of its wing feathers. (A. m. macao photo
courtesy of Andy Hay. A. m. cyanoptera photo courtesy of John Perry.)
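The estimate of more than 100 days of heightened vigilance can be reproduced from the figures given in the text. The short Python sketch below does the arithmetic under one simplifying assumption that is not stated in the source, namely that incubation of each egg begins on the day it is laid; the function name is illustrative.

```python
def vulnerable_period_days(clutch_size, laying_interval=3, incubation=22, fledging=75):
    """Days from the first egg being laid to the last chick fledging, assuming
    each egg is incubated from the day it is laid (an assumption)."""
    last_egg_day = (clutch_size - 1) * laying_interval
    return last_egg_day + incubation + fledging


# Clutch sizes of 1 to 4 eggs, as reported in the text.
periods = {n: vulnerable_period_days(n) for n in range(1, 5)}
```

Under this assumption the period runs from 97 days for a single egg to 106 days for a four-egg clutch, before any variability in fledging time is taken into account.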
The nests are commonly located in tree cavities that are approximately 20 meters from
the ground. These tree cavities are specialized structures created from the natural rotting of lower
branches of the tree or branches that have broken off during the rainy season. The macaws will
wait until a nesting cavity becomes available before attempting to find a mate and starting a
clutch. Macaws have a much longer lifespan than the majority of other avian species, living 50-
70 years in the wild. The macaw remains reproductively active for approximately 20 years. The
presence of “seniors” in the population can mask an underlying lack of reproduction. People still
see birds in the wild, but it may simply be the same birds advancing in years and not successfully
bearing offspring (Marsden and Pilgrim 2002). Because the macaw has a long lifespan,
populations that are under extreme pressure are able to persist and mask the effects of habitat
destruction and other causes of decline. In a population dominated by older, non-reproducing
birds, a gradual decline in numbers is often followed by a sharp increase in mortality as the
older, non-reproductive birds reach the end of their lifespans.
In the wild, the Ara macao as a species is listed as a “species of least concern” by the
IUCN. This is due to the larger, yet also declining, numbers of the nominate subspecies, which
range from the extreme south of Nicaragua to Brazil and Bolivia. The desperately low numbers
of the subspecies A. m. cyanoptera, ranging from southern Mexico to southeast Nicaragua, are a
subject of concern for local governments, and the subspecies has been assessed by several
independent experts associated with the IUCN, the Convention on International Trade in
Endangered Species (Species. 2007) and the World Parrot Trust as meeting the criteria for
endangered status (Birdlife International 2013). This is a typical example of a larger, umbrella
species masking underlying
ecological health concerns of rarer regionally distinct subspecies that exist in smaller
populations, and which creates a false lack of concern based simply on a cursory evaluation of
reported population numbers for the species as a whole.
The surviving wild populations of A. m. cyanoptera are now found in remnant clusters
throughout their historic indigenous range. In 2014, there were approximately 250 individuals in
Mexico: 200 in the Lacandona Rainforest in Chiapas bordering Guatemala and 50 in the
Chimalapas Mountains in Oaxaca. Other isolated groups have been reported in a few localities in
Guatemala, Belize, Honduras, and Nicaragua. In El Salvador, A. m. cyanoptera is now
reported to be regionally extinct. Figure 4 shows current regional population estimates of A. m.
cyanoptera (O'neill 2013, Cantu 2014, Estrada 2014, Amaya-Villarreal, Estrada et
al. 2015).
In 2008, Mexico conducted a population viability analysis to assess the decline of this
once abundant bird. The analysis concluded that the rapid decline of A. m. cyanoptera
populations has been due to a combination of factors that include deforestation, local hunting,
and the illegal bird trade. While the illegal bird trade remains a problem, strides have been made
internationally to end wild bird imports in order to address this important issue. In Europe, for
example, in the early 1990s, a resolution by the European Parliament called for the European
Commission to end imports of wild birds. The U.S. federal government passed the Wild Bird
Conservation Act prohibiting the importation of most wild-caught birds—primarily parrots. And
in 2008, the Mexican government passed a law with tough legal penalties for poachers caught
illegally moving parrots out of the country.
Figure 4: Map of local population estimates for Ara macao cyanoptera in southern
Mexico and Central America.
The long-lasting, deleterious effects of poaching are often understated and, therefore, not
fully comprehended. When poachers find a nesting site containing pre-fledging chicks, the adult
macaws are often maimed or killed in an attempt to keep the macaw’s raucous warning cries
from alerting the authorities. In order to reach the chicks within the nests located high in the tree
in the upper rainforest canopy, the poachers will often burn or chop the base of the tree. This
results in killing the tree, removing a valuable nesting site from the population, and often
injuring or killing the chicks in the process.
The reintroduction of A. m. cyanoptera into historic habitats to increase population
numbers in the wild is a primary goal. When supplementing a depleted wild population with
captive bred stock, or reintroducing a species into a region that it formerly inhabited, efforts must
be focused on sponsoring individuals that are genetically matched to the region in question. Zoos
and breeding programs also need to maintain a level of genetic purity to prevent subspecies
hybridization. There are, however, reliability limitations to using only morphology for
subspecies determination, especially at such a critical level. The subtle morphologic distinctions
between allied subspecies are so complex that most taxonomists who attempt this specialize in a
single group of closely related organisms. As a result, finding appropriate experts and
distributing specimens in order to have a comprehensive morphological analysis can be a time-
consuming and expensive process (Stoeckle 2003, Hebert, Stoeckle et al. 2004). The possibility
of hybrids between two subspecies only complicates such efforts.
In the mid-1990s, a number of sponsored programs had begun to release confiscated,
smuggled, and cage-raised birds into the wild. Captive breeding and reintroduction programs
were a fairly new concept at this time and were deemed necessary to reestablish endangered
species. One such program from 1992 to 1995 involved releasing 20 scarlet macaws to their
historic range in the Tambopata Nature Reserve in Peru. Little was known at this time about the
numerous problems that arise during the reintroduction of a population. Historically, initial
programs are seen as a success if at least 50% of the original released group survives.
The released group of macaws lacked the adequate health screening to protect against the
introduction of diseases into the new environment. Parrots are especially susceptible to several
lethal and contagious diseases that are capable of lying dormant for years (Brightsmith, Hilburn
et al. 2005). The lack of adequate habitat and native resources also hindered the program’s
efforts, and at the time of this reintroduction event there was little to no effort to slow the rapid
deforestation in these regions. Due to the lack of support from the surrounding community and
continued deforestation, the released macaws were put at risk. Local communities were unaware
of the long-term benefits that ecotourism brings compared to the short-term returns of exploiting
the macaws for food and/or feathers. Further, it was not determined whether the reintroduced
birds were of the same, or at least similar, haplotype that had been indigenous to the release site,
and therefore, may not have been genetically well-adapted for the release environment. Of the
original group of scarlet macaws released, only 55% were still alive as of 2002. The lessons
learned from these early reintroduction efforts have since been used to design more successful
release programs.
Proper reintroduction of A. m. cyanoptera requires molecular-level distinction to enable the
introduction of the original population haplotype/subspecies at the release site(s). Historically,
molecular-level subspecies differentiation has not been done, as the differentiating gene
sequences between subspecies have not been known. But given the potential importance of
subspecies identification for reintroduction success, as well as to avoid subspecies hybridization,
this should be seen as an important part of these reintroduction programs.
The use of DNA-based testing is commonly required in order for two subspecies to be
reliably identified/distinguished. It is also important to identify likely species and subspecies
hybrids, since little or no effort has been taken to prevent homogenization, particularly in the pet
trade. It is necessary to obtain accurate genetic profiles of suitably polymorphic regions for each
group in order to allow for accurate differentiation. Using this approach, DNA-based testing
methods have been developed for a number of species that can quickly and consistently
distinguish between two subspecies and hybrids (Tavares, Baker et al. 2006). Closely related
sister-species delimited with independent evidence can often be differentiated by mitochondrial
sequence comparison using cytochrome oxidase subunit 1 (CO1) sequences. Molecular analysis
of combinations of multiple genes, including mitochondrial DNA (mtDNA) sequences, should
allow unbiased species differentiation of even closely related populations such as at the
subspecies level (Knowles and Carstens 2007, Dupuis, Roe et al. 2012).
Although not yet considered threatened by the IUCN, the scarlet macaw’s decline can be
seen as a symptom of a larger problem of global biodiversity loss, and many of the same problems
and solutions relating to species preservation can be studied and applied to assist in the
preservation of this species. There have been important successes in macaw reintroduction
programs that can serve as models for future programs with these principles in mind.
One such success story is that of the Palenque Rainforest in Mexico where the macaws
had been extinct for 70 years. Ninety-six pedigreed macaws were released as six small release
groups between April 2013 and June 2014. The reintroduction area is a national park famous for
its Mayan ruins and therefore already granted substantial protection by the Mexican government.
Preparations for these releases were multifaceted, and included important health screening and
local community involvement elements. The released individuals are closely monitored by
microchip and volunteer support staff. Three institutions are involved in this release program:
Aluxes Ecopark of Palenque provided the release site, Xcaret Ecopark and Nature Preserve
continues to provide the captive-bred macaws, and the Institute of Biology of the National
Autonomous University of Mexico (UNAM) continues to provide the scientific planning,
execution, and research for this project. This program has been extremely successful, reporting a
92% survival rate as of August 2014 (Estrada 2014).
CHAPTER 2
MATERIALS AND METHODS
2.1 Obtaining Samples
Eighty scarlet macaw blood samples were collected in 2003 and 2005 from Xcaret
Ecoparque and Nature Preserve in Playa del Carmen, Mexico. Of these samples, only 32 were from
pedigreed birds that had been raised and bred together to prevent subspecies
hybridization. Twenty-seven blood samples were also obtained from two private owners in
Cancun, Quintana Roo, Mexico. Eleven of the nominate A. m. macao blood samples were
obtained through the assistance of Dr. Patricia Escalante from the National Autonomous
University of Mexico (UNAM). Definitive blood samples of the nominate subspecies of A. m.
macao, and the endangered group, A. m. cyanoptera, as well as known hybrid blood samples of
the two subspecies were used for this study.
To obtain the blood samples, the macaws were first restrained using a towel in a sterile
environment. The blood was collected using a 25 gauge needle. The needle was inserted into the
basilic wing vein of the restrained macaws. The vein is located on the ulna of the birds, making it
easy to access (Harris, 2007). The collected blood was transferred to a vial containing lysis
buffer (0.01 M Tris, 0.01 M NaCl, 0.01 M EDTA, 1% n-lauroylsarcosine, at pH 7.5). The
n-lauroylsarcosine is a detergent that lyses the cells and the high salt concentration neutralizes
the nucleases in the blood (Seutin et al., 1991). Some of the blood samples were transported on
FTA® databasing cards (Whatman) rather than in lysis buffer. This mode of transport avoids
leakage risks and sample degradation due to temperature variations and spoilage (Smith and
Burgoyne 2004). Blood was added to the FTA® cards by placing blood droplets within the
printed circle on the card. Care was taken to avoid pooling the blood in one spot by dispersing
30
the blood drops over the entire area inside of the printed circle and allowing the card to fully dry.
The cards contain reagents that are designed to kill most pathogens by cell lysis and protein
denaturation, inhibit fungal growth, and guard against other contaminants with strong buffering and
free-radical trapping properties, much like the lysis buffer used in liquid blood sample transport.
2.2 DNA Isolation from Liquid Blood
The DNA was isolated from the blood samples by using a guanidinium thiocyanate
(GITC) extraction method, modified from Hammond et al. (Hammond, Spanswick et al. 1996).
A 10-20 µl quantity of blood suspended in lysis buffer (0.01 M Tris, 0.01 M NaCl, 0.01 M
EDTA, 1% n-lauroylsarcosine, pH 7.5) was added to 500 µl of extraction solution (0.5 M
guanidinium thiocyanate and 0.1 M EDTA) in a 1.5 ml microfuge tube and vortexed well. The
GITC irreversibly inactivates nucleases and other proteins by denaturation, making the proteins
insoluble. The EDTA in the solution acts as a metal ion chelator. Two hundred fifty microliters of ice-cold
7.5 M ammonium acetate was added to the solution and vortexed well. Ammonium acetate
makes the DNA much less soluble in water. The solution was incubated on ice for 10 min.
To solubilize proteins and lipids, 500 µl of 24:1 chloroform to isoamyl alcohol was added
and the solution was vortexed. The solution was then centrifuged at 10,000 rpm for 10 min in a
microcentrifuge at room temperature. The upper aqueous phase was carefully removed and
transferred to a new microfuge tube. The chloroform extraction was repeated. Six
hundred microliters of cold isopropanol was added to precipitate the DNA, and the solution was
vortexed well. The solution was centrifuged at 10,000 rpm for 20 min at 4 °C. Taking care not to
disturb the pellet, the supernatant was carefully removed with a pulled-out Pasteur pipette and
discarded. One milliliter of cold 70% ethanol was then added to remove any residual salts as well
as to further the precipitation of DNA. The tube was gently inverted 3 times to re-suspend the
DNA pellet. The solution was centrifuged at 10,000 rpm at 4 °C. Taking care not to disturb the
pellet, the supernatant was removed with a Pasteur pipette and discarded. The remaining pellet
and residual solution was placed in a Savant SpeedVac® Concentrator (ThermoScientific) for
approximately 3 min until dry. The pellet was re-suspended in 100 µl of molecular grade water
and stored at -20 °C.
2.3 DNA Isolation from FTA® Cards
The isolation of DNA from FTA® cards (Whatman 2015) involved using an autoclave-sterilized
single hole paper punch to remove a 6 mm diameter disk with dried blood. The paper
was placed in a 1.5 ml microfuge tube, and 1 ml of FTA® purification reagent (100 mM Tris,
0.1% SDS) was added. The solution was flash vortexed 5 times to mix and allowed to incubate
for 5 min at room temperature. All of the residual FTA® purification reagent was removed using
a pipette and discarded. The FTA® purification reagent wash steps were repeated twice.
One milliliter of TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) was added to the
microfuge tube and incubated for 5 min at room temperature. All residual TE buffer was
removed with a pipette and discarded. The addition of TE buffer, incubation, and removal were
repeated twice. At this point, the punch should appear white or near-white. A 140 µl aliquot of a
first solution (0.1 N NaOH, 0.3 mM EDTA, pH 13.0) was then added to the microfuge tube, which was
incubated for 5 min at 65 °C. A 260 µl aliquot of a second solution (0.1 N Tris-HCl, pH 7.0)
was added to the tube and again vortexed to mix. The solution was incubated for 10 min at room
temperature and then vortexed to mix (approximately 10 times). The punch was removed from
the solution with autoclave-sterilized flat-tipped tweezers and squeezed to recover any remaining
eluate from the matrix. This final eluate contained the genomic DNA in TE buffer (66 mM Tris-
HCl, 0.1 mM EDTA, pH 8.0).
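As a rough arithmetic check on the composition of the final eluate, the sketch below combines the two elution solutions at the stated volumes. It assumes that volumes are simply additive, treats the 0.1 N Tris-HCl solution as 0.1 M, and ignores any liquid retained by the punch; the function name is illustrative only and is not part of the published protocol.

```python
def mix_concentration(parts):
    """Final concentration after combining (volume_ul, concentration) pairs.

    Assumes volumes are additive; the concentration unit is carried through unchanged.
    """
    total_volume = sum(volume for volume, _ in parts)
    return sum(volume * conc for volume, conc in parts) / total_volume


# 140 µl of the first solution (0.3 mM EDTA, no Tris) plus
# 260 µl of the second solution (100 mM Tris-HCl, no EDTA)
tris_mM = mix_concentration([(140, 0.0), (260, 100.0)])
edta_mM = mix_concentration([(140, 0.3), (260, 0.0)])

print(f"Tris-HCl: {tris_mM:.1f} mM, EDTA: {edta_mM:.3f} mM")
```

Under these assumptions the mixture works out to roughly 65 mM Tris-HCl and 0.105 mM EDTA, which is close to the composition reported for the final eluate.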
2.4 18S Ribosomal DNA
Ribosomal DNA (rDNA) is an invaluable tool in evolutionary studies (Kurtzman and
Robnett 1998). It displays bi-parental inheritance and intergenic variability at the species
and genus level (Baldwin, Sanderson et al. 1995), and it is easy to amplify for more detailed study
due to its conserved nature. Because of these properties, this genomic region is commonly used for
species identification, molecular barcoding, and phylogenetic construction. The rDNA sequence
is located at the chromosomal region(s) around which the nucleoli form, and for this reason these
locations on chromosomes are called nucleolus organizer regions (NORs). Each region is
composed of several tandem clusters of the rRNA genes, including the 18S gene which codes for
the small subunit rRNA which is commonly the chosen sequence used for phylogenetic
comparison. The NORs of the scarlet macaw are found on three distinct chromosome pairs
(Seabury, Dowd et al. 2013), while the majority of species of Aves (such as the California
condor and the chicken) possess a NOR on only one pair of chromosomes.
Small subunit 18S rRNA genes are the standard reference sequences for the taxonomic
classification of organisms (Wang, Tian et al. 2014). The 18S rDNA sequence exhibits a low rate
of polymorphism within species, but still has a sufficient level of polymorphism between species
to make it very useful for determining interspecies phylogenetic relationships using only a few
specimens from each species. Certain sections of this gene are often very highly conserved, so
much so that a centrally located 15 nucleotide sequence shares 100% sequence and location
homology among mammals and 97% among mammals, marsupials, and birds (Coleman 2013).
This extreme homology reflects the importance of maintaining the rRNA secondary structure, which
must function correctly when mRNAs are translated into proteins.
Two internal transcribed spacer (ITS) regions connect the 18S, 5.8S, and 28S rRNA
genes in each cluster (see Figure 5). The ITS region has become widely used in phylogenetic
inference (Álvarez and Wendel 2003), and the variability in these regions helps to differentiate
closely related groups. The highly conserved nature of the flanking regions aids in consensus
primer design. Each rDNA cluster is nearly identical in sequence, although there is some
variation in size due to differences in the number of repeated DNA elements in the
non-transcribed spacer regions. These tandem arrays are created by unequal crossing-over during sister
chromatid exchange and/or gene conversion events (Eickbush 2002).
Figure 5: Arrangement of ribosomal DNA (rDNA) clusters on the genome. ITS1 – Internal
Transcribed Spacer 1; ITS2 – Internal Transcribed Spacer 2; ETS – External
Transcribed Spacer; NTS – Non-Transcribed Spacer (Holstein 2006).
Primers were designed using the published Gallus gallus 18S rDNA sequence and the
flanking ITS1 sequence to amplify this region for comparison between the A. macao subspecies
(see Figures 6 & 7). The sequence of 1815 bp covering the entire 18S rRNA gene and the
adjacent ITS1 was amplified with the following primers: 18S-F1
(5’-CCTGGTTGATCCTGCCAGTAGC-3’) and 18S-R1 (5’-TCCTTCCGCAGGTTCACCTACG-3’).
All polymerase chain reactions were carried out in 25 µl reaction volumes containing Q5®
Hot Start reaction buffer (New England Biolabs), 0.2 mM dNTPs, 0.5 µM of each primer, 1 U
Q5® Hot Start High-Fidelity polymerase (New England Biolabs), and 50 ng of the DNA
template. The thermal cycling profile consisted of one cycle of 45 sec at 98 ºC followed by 30
cycles of 10 sec at 98 ºC, 20 sec at 68 ºC, and 27 sec at 72 ºC, with a final step of 5 min at 72 ºC.
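The 18S rDNA cycling profile can be written down as a small data structure, which makes it easy to check the programmed run time. The following is a minimal sketch using only the step times and temperatures stated for this reaction; the variable and function names are illustrative, and ramping time between temperatures is deliberately ignored.

```python
# Each step is (label, temperature in °C, hold time in seconds).
INITIAL = [("initial denaturation", 98, 45)]
CYCLE = [("denaturation", 98, 10), ("annealing", 68, 20), ("extension", 72, 27)]
FINAL = [("final extension", 72, 300)]
N_CYCLES = 30


def total_hold_seconds(initial, cycle, n_cycles, final):
    """Sum the programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(seconds for _, _, seconds in cycle)
    fixed = sum(seconds for _, _, seconds in initial + final)
    return fixed + n_cycles * per_cycle


seconds = total_hold_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"Programmed hold time: {seconds} s ({seconds / 60:.1f} min)")
```

The actual run time on a thermal cycler will be longer than this sum, because the instrument needs time to move between temperatures.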
Figure 6: Amplified nuclear region from 18S rDNA – 1 of 2. Amplified Ara macao samples
were electrophoresed using a 1% sodium borate (SB) agarose gel with a 1 kb size
standard ladder (Invitrogen).
Figure 7: Amplified nuclear region from 18S rDNA – 2 of 2. Amplified Ara macao samples
were electrophoresed using a 1% sodium borate (SB) agarose gel with a 1 kb size
standard ladder (Invitrogen).
2.5 Column-based Purification of PCR Products
A Wizard® SV Gel and PCR Clean-Up System (Promega 2015) was used to purify PCR
products consisting of DNA fragments of 100 bp to 10 kb directly from a PCR amplification.
This system removes excess nucleotides and primers, and allows for downstream applications
such as DNA sequencing and restriction digestion. An equal volume of Membrane-Binding
Solution (guanidine isothiocyanate) was added to the PCR amplification product. The solution
was transferred to the mini-column assembly, which contains a silica membrane to bind the
DNA. The solution was allowed to incubate for 1 min at room temperature on the column and
then centrifuged at 16,000 x g for 1 min. The eluate was discarded. Seven hundred microliters of
membrane wash solution (10 mM potassium acetate, 80% ethanol, 60.7 µM EDTA) were added,
and the column was centrifuged at 16,000 x g for 1 min.
The resulting eluate was discarded. Five hundred microliters of membrane wash solution
were added and the column was centrifuged at 16,000 x g for 5 min. The resulting eluate was
discarded. The column was centrifuged for at least 1 more min to dry the membrane. It was important to allow all
of the ethanol to evaporate to obtain a maximum yield. The column was placed in a clean
microfuge tube, and 15 µl of molecular grade water were added to the center of the filter, being
careful not to touch the filter with the tip of the pipette. The solution was allowed to incubate at
room temperature for 1 min, and the column was then centrifuged at 16,000 x g for 1 min. The
amplicons were quantified and assessed for contamination on a NanoDrop® 2000C
Spectrophotometer (ThermoScientific). Absorbance ratios of 260/280 nm and 260/230 nm were
above 1.8, and the amplicons were therefore considered to have adequate nucleic acid purity for downstream
applications (ThermoScientific). The eluted DNA was stored at 4 °C for short periods of time or
at -20 °C for long-term storage.
2.6 Next Generation Sequencing Using the Illumina MiSeq® Platform
Illumina® sequencing technology is based on the creation of clusters by massively
parallel sequencing-by-synthesis (SBS), which enables the detection of the incorporation of
single bases into the growing strands of DNA. During cluster generation, single DNA molecules
are bound to the surface of the flow cell, and bridge-amplified to form growing clonal clusters.
Four types of reversible, fluorescently labeled, terminator bases are added and the growing
clusters are imaged as the DNA chains are extended one nucleotide at a time. Non-incorporated
nucleotides are washed away and the fluorescent label and 3’ terminal blocker are chemically
removed from the DNA to allow incorporation of the next nucleotide. Since all 4 dNTPs are
present during each cycle, natural competition minimizes incorporation bias. The image is
captured and the process is repeated for each cycle of sequencing. Following image analysis, the
sequencing software performs base calling, filtering, and quality scoring.
2.6.1 Quantification of dsDNA on Qubit® 2.0 Fluorometer
An enzymatic DNA fragmentation method was used to prepare and tag each sample for
sequencing and because of this, the exact quantification of the input DNA was critical. The
Qubit® 2.0 Fluorometer uses dyes which fluoresce when bound to double-stranded DNA
molecules. This allows for a more accurate quantification of the DNA target because it
disregards free nucleotides, degraded nucleic acids, protein, and RNA contaminants. In order for
the Qubit® to accurately quantify the samples, the concentration must be between 10 pg/µl and
100 ng/µl. Initial quantification was performed using a NanoDrop® 2000 Spectrophotometer
(ThermoScientific). Each PCR amplicon sample was diluted to approximately 10 ng/µl with
molecular grade water. The samples were then quantified using the Qubit® dsDNA HS assay kit.
Calibration was performed with two standards which were prepared by adding 190 µl of the
Qubit® Working Solution to 10 µl of each standard in disposable thin-walled 0.5 ml tubes. The
Qubit® Working Solution was made in a light-protected plastic container using a 1:200 ratio of
Qubit® dsDNA HS Reagent to Qubit® dsDNA HS Buffer.
The standards were then analyzed using the Qubit® by selecting the “DNA” option on
the home screen followed by the “dsDNA High Sensitivity” option. Once both standards were
measured, the sample tubes were prepared by adding 198 µl of Qubit® Working Solution to 2 µl
of each sample in disposable thin-walled 0.5 ml tubes. The sample solutions were incubated for 2
min at room temperature and then placed in the sample chamber. Each sample was quantified
in ng/ml by multiplying the Qubit® output reading by 100 (ThermoScientific). Using the
calculated concentrations, each PCR amplicon sample was diluted to a concentration of
0.2 ng/µl in a final volume of 10 µl.
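The quantification and dilution arithmetic described here can be sketched as follows. The factor of 100 comes from adding 2 µl of sample to 198 µl of working solution, and the dilution uses C1 x V1 = C2 x V2. The function names and the example reading are hypothetical and are not taken from the study.

```python
def stock_concentration_ng_per_ul(reading_ng_per_ml, sample_ul=2.0, assay_ul=200.0):
    """Convert an assay-tube reading (ng/ml) to the stock concentration (ng/µl)."""
    dilution_factor = assay_ul / sample_ul            # 200 µl / 2 µl = 100
    stock_ng_per_ml = reading_ng_per_ml * dilution_factor
    return stock_ng_per_ml / 1000.0                   # ng/ml -> ng/µl


def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=0.2, final_ul=10.0):
    """Return (stock_ul, water_ul) needed to reach the target, from C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul


# Hypothetical reading of 100 ng/ml in the assay tube (not a value from the study)
stock = stock_concentration_ng_per_ul(100.0)
stock_ul, water_ul = dilution_volumes(stock)
print(f"stock {stock:.1f} ng/µl: mix {stock_ul:.2f} µl DNA with {water_ul:.2f} µl water")
```

When the calculated stock volume is too small to pipette reliably, as in this hypothetical case, an intermediate dilution would be the practical route to the same final concentration.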
2.6.2 Nextera DNA Tagmentation
Each sample was simultaneously fragmented and tagged with the Nextera
DNA Sample Prep Kit, which uses a specially engineered transposome, for sequencing on the
Illumina MiSeq® Desktop Sequencer. A transposase cut the target DNA randomly, which created double-stranded breaks
with staggered ends. At the same time, the transposase attached adapter sequences to the ends of
the target DNA. A limited-cycle PCR reaction used the adapter sequences to amplify the insert
DNA. The PCR reaction also added index sequences on both ends of the insert DNA, which
enabled dual-index sequencing of pooled DNA libraries during the sequencing run (see Figure 8)
(Illumina, Syed). The average size of the generated fragments after “tagmentation” was
approximately 250 bp plus the length of the ligated primer and index sequence.
Figure 8: Tagmentation procedure used by Nextera XT Library Preparation Kit (Illumina).
A. Nextera XT transposome with adapters is combined with template DNA
B. Tagmentation to fragment and add adapters
C. Limited cycle PCR to add sequencing primer sequences and indices
A 96-well MIDI plate (Fisher Scientific, part # AB-0859) was labeled NTA (Nextera XT
Tagment Amplicon) and 10 µl of Tagment DNA Buffer were added to each well of the NTA
plate for sample preparation. Five microliters of 0.2 ng/µl of each sample were added to the
corresponding wells of the 96-well plate for a total of 1 ng of DNA per reaction. Five microliters
of Amplicon Tagment Mix were added to each sample and mixed by pipetting. The NTA plate
was sealed with foil and briefly centrifuged. The plate was then run on a thermal cycler for 5 min
at 55 °C with a heated lid and then held at 10 °C.
2.6.3 Neutralization of the Nextera XT Transposome Reaction
After the samples reached 10 °C, they were immediately neutralized in order to halt the
enzymatic reaction of the Nextera XT transposome. The foil on the NTA plate was removed and
5 µl of Neutralize Tagment Buffer were added to each sample well in the NTA plate. The
samples were mixed by pipetting. The NTA plate was sealed with foil, briefly centrifuged, and
incubated at room temperature for 5 min.
2.6.4 PCR Amplification
The Nextera XT sequencing kit uses a series of index primers for cluster formation. It
was crucial that each sample had a different pair of indices so that the samples could be parsed after
sequencing. To help ensure that primer pairs were not repeated, a TruSeq® index plate fixture
was used to organize the N7 (index 1 primer) and S5 (index 2 primer) pairs around the NTA
plate (see Figure 9).
Figure 9: TruSeq® index plate guide (Illumina).
The N7 primers have orange caps and were arranged horizontally along the index plate
fixture, while the S5 primers have white caps and were arranged vertically along the index plate
fixture. Once the NTA plate and the indices were arranged on the index plate fixture, the foil was
removed from the NTA plate and 15 µl of the Nextera PCR Master Mix were added to each well. Next,
5 µl of the S5 (white capped index 2) primers were added to each column and then 5 µl of the N7
(orange capped index 1) primers to each row. The index caps were changed directly after use to prevent
cross contamination. After the indices and Nextera PCR Master Mix were added to the sample
wells on the NTA plate, the wells were mixed by pipetting. The NTA plate was sealed with foil
and centrifuged. The NTA plate was then placed on a thermal
cycler programmed to run with a heated lid for 3 min at 72 °C, 30 sec at 95 °C, 12 cycles of 95 °C
for 10 sec, 55 °C for 30 sec, and 72 °C for 30 sec, and 5 min at 72 °C before holding at 10 °C.
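The reason for arranging the two index sets along the two axes of the plate is that every well then receives a unique combination. The sketch below illustrates this for a full 96-well plate. It assumes one N7 index per column and one S5 index per row, and the index names (N701 to N712, S501 to S508) are placeholders that should be checked against the kit actually used.

```python
# Placeholder index names; the real kit contents may differ.
N7_INDEXES = [f"N7{i:02d}" for i in range(1, 13)]   # 12 index 1 primers, one per column
S5_INDEXES = [f"S5{i:02d}" for i in range(1, 9)]    # 8 index 2 primers, one per row
ROWS = "ABCDEFGH"


def index_layout(n7, s5):
    """Map each well name (e.g. 'A1') to its (index 1, index 2) pair."""
    layout = {}
    for row_letter, s5_name in zip(ROWS, s5):
        for column, n7_name in enumerate(n7, start=1):
            layout[f"{row_letter}{column}"] = (n7_name, s5_name)
    return layout


layout = index_layout(N7_INDEXES, S5_INDEXES)
# No index pair may be used twice, otherwise reads cannot be assigned to a sample.
unique_pairs = len(set(layout.values()))
print(f"{len(layout)} wells, {unique_pairs} unique index pairs")
```

With fewer samples than wells, the same mapping can be restricted to the occupied wells; the uniqueness check still applies.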
2.6.5 PCR Cleanup of Indexed Samples
Agencourt AMPure® XP magnetic beads were used to purify the indexed samples by
removing unincorporated dNTPs, primers, primer dimers, salts, and other contaminants
(Beckman Coulter). Short indexed fragments were also removed during a size selection step.
The NTA plate was briefly centrifuged before the foil was removed and 50 µl was transferred
from the NTA plate to a new 96-well plate labeled CAA (Clean Amplified Plate). The magnetic
beads were vortexed to ensure that they were evenly dispersed before adding 30 µl of beads to
each sample well of the CAA plate. The sample wells were then mixed by pipetting. The plate
was incubated at room temperature for 5 min without being disturbed. The CAA plate was then
set on a magnetic rack to allow the supernatant to clear before the supernatant was removed and
discarded. With the plate still on the magnetic stand, 200 µl of 80% ethanol was carefully added
to the sample wells of the CAA plate to avoid disturbing the beads.
The CAA plate was incubated for 30 sec before the ethanol was removed. The ethanol
wash was repeated and, after another 30 sec incubation, the ethanol was removed. After the removal
of the ethanol from the second wash, any excess ethanol was removed with a fine tipped pipette.
The plate was air-dried for 15 min on the magnetic stand. After the plate was dry, it was removed
from the magnetic stand and 52.5 µl of Resuspension Buffer was added to each sample well in
the CAA plate. Each sample was mixed gently by pipetting and left undisturbed at room
temperature for 2 min before the plate was moved to the magnetic stand. Once the supernatant
had cleared, 50 µl of supernatant from the CAA plate was transferred to a new 96-well plate
labeled CAN (Clean Amplified NTA plate).
2.6.6 Accurate Determination of DNA Fragment Size
In order to ensure optimal size distribution within the sample library, an Agilent® 2100
Bioanalyzer (Agilent Technologies) was used along with an Agilent® DNA 1000 kit to
determine the correct size of the DNA fragments. This method is preferred to standard gel
electrophoresis due to its output data format that can be transferred via computer interface. Each
DNA sample was loaded onto an Agilent® DNA chip consisting of interconnected
microchannels and separated by size electrophoretically. A gel matrix was combined with a dye
concentrate and transferred to a spin filter. The solution was centrifuged at 2240 x g for 15 min.
Care was taken to protect the solution from light. An Agilent® DNA chip was placed on the chip
priming station, 9 µl of filtered gel-dye mix was pipetted into the well, marked with a “G,” and
the priming station cover was closed.
A priming syringe was clipped on the syringe stand and the plunger was depressed and
held in position to force the gel-dye mix into the chip. Three microliters of gel-dye solution mix
was pipetted into the designated well and 5 µl of marker solution was loaded into the sample and
ladder wells on the chip. One microliter of DNA size standard ladder was loaded into the ladder
well. One microliter of each sample was loaded into each sample well, and the chip was placed
in an adapter and vortexed for 1 min at 2400 rpm. The desired assay to be performed was
selected on the computer display, the chip was loaded on to the machine and the assay was
started.
2.6.7 Library Normalization
Library normalization balances the quantity of each library to ensure a more equal library
representation when the samples are combined to create the pooled sample library.
Twenty microliters of supernatant from the CAN plate was transferred to a new 96-well plate
labeled LNP (Library Normalization Plate). A quantity of 45.85 µl per sample of Library
Normalization Additives 1 (LNA1) was added to a 15 ml conical tube. Library Normalization
Beads 1 (LNB1) was mixed thoroughly by pipetting using a P1000 set to 1000 µl. Once the
LNB1 was mixed, 8.33 µl per sample was added to the 15 ml conical tube. A P200 with a cut tip
was used when the total volume of LNB1 was less than the minimum for a P1000 pipette.
Immediately after LNA1 and LNB1 were mixed, 45 µl of the LNA1/LNB1 solution was added
to each sample well of the LNP plate.
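Because the LNA1 and LNB1 volumes are specified per sample, the size of the bead master mix scales with the number of libraries. The sketch below uses the per-sample volumes given above; the function name is illustrative and the sample count of 21 is only an example, not the number of libraries in any particular run.

```python
def normalization_mix(n_samples, lna1_per_sample_ul=45.85, lnb1_per_sample_ul=8.33,
                      dispensed_per_well_ul=45.0):
    """Volumes (µl) of LNA1 and LNB1 to combine, and the excess left after dispensing."""
    lna1 = n_samples * lna1_per_sample_ul
    lnb1 = n_samples * lnb1_per_sample_ul
    total = lna1 + lnb1
    needed = n_samples * dispensed_per_well_ul
    return {"LNA1_ul": lna1, "LNB1_ul": lnb1, "total_ul": total, "excess_ul": total - needed}


mix = normalization_mix(21)   # example sample count only
for name, volume in mix.items():
    print(f"{name}: {volume:.2f}")
```

The per-sample volumes add up to more than the 45 µl dispensed into each well, so the mix includes a margin for pipetting losses.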
The LNP plate was sealed with foil and shaken for 30 min at 1800 rpm. Once thoroughly
mixed, the foil was removed and the LNP plate was placed on the magnetic rack until the
supernatant was clear. Eighty microliters of supernatant was discarded from each well of the
LNP plate while on the magnetic stand. The LNP plate was then removed from the magnetic
stand and 45 µl of Library Normalization Wash 1 (LNW1) was added. The LNP plate was sealed
with foil and shaken for 5 min at 1800 rpm. The plate was then placed back on the magnetic
stand until the supernatant was clear. The supernatant was removed and discarded from the
sample wells while on the magnetic stand. Again, the LNP plate was removed from the magnetic
stand and washed with LNW1.
After the second LNW1 wash, the LNP plate was removed from the magnetic stand and
30 µl of 0.1 N NaOH was added to each sample well. The plate was sealed with foil and shaken
for 5 min at 1800 rpm. Thirty microliters of Library Normalization Storage buffer 1 (LNS1) were
added to each sample well of a new 96-well plate labeled SGP (Storage Plate). The plate was
removed from the shaker and placed on the magnetic stand until the supernatant was clear.
Thirty microliters of the supernatant from LNP plate were transferred to the SGP plate. In
preparation for cluster generation and sequencing, equal volumes of normalized library were
combined, diluted in Hybridization Buffer, and denatured on a heat block at 96 ºC.
2.6.8 Library Pooling and MiSeq® Sample Loading
Once the reagent cartridge was thawed, a 1.5 ml microfuge tube was labeled PAL
(Pooled Amplicon Library) and 5 µl from each sample well was added to the tube. A microfuge
tube was labeled DAL (Diluted Amplicon Library) and 576 µl of Hybridization Buffer (HT1)
was added to the tube. Twenty-four microliters of PAL was transferred to the DAL tube. The
DAL solution was vortexed and incubated on a heat block for 2 min at 96 °C. After the
incubation, the DAL was mixed by inversion and placed in an ice-water bath. The DAL
remained in the ice water bath for 5 min before being loaded into the “load sample” well of the
MiSeq® cartridge.
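The pooling and dilution volumes can be checked with a few lines of arithmetic. The sketch below uses the volumes stated in this section; the function name is illustrative and the library count of 21 is an example only.

```python
import math


def pooling_plan(n_libraries, per_library_ul=5.0, pal_to_dilute_ul=24.0, ht1_ul=576.0):
    """Check that enough pooled library (PAL) exists and report the diluted library (DAL)."""
    pal_available_ul = n_libraries * per_library_ul
    if pal_available_ul < pal_to_dilute_ul:
        raise ValueError("not enough pooled library for the dilution step")
    final_volume_ul = pal_to_dilute_ul + ht1_ul
    return {
        "pal_available_ul": pal_available_ul,
        "final_volume_ul": final_volume_ul,
        "dilution_factor": final_volume_ul / pal_to_dilute_ul,
    }


# Smallest number of libraries that yields at least 24 µl of PAL at 5 µl each
min_libraries = math.ceil(24.0 / 5.0)
plan = pooling_plan(21)   # example library count only
print(min_libraries, plan)
```

The 24 µl of PAL in 576 µl of HT1 gives 600 µl of DAL, a 25-fold dilution of the pooled library.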
2.6.9 Prepping the Illumina MiSeq® Platform and Sequencing Reagents
The MiSeq® reagent cartridge, stored at -20 ºC, was placed in water at room temperature
for approximately 60 min to thaw. The reagent cartridge was then inverted ten times to mix the
thawed reagents and assure that they were free of precipitates. The foil seal covering the
reservoir labeled “Load Samples” was cleaned with a low-lint lab tissue and the foil seal was
pierced with a clean 1 ml pipette tip. Six hundred microliters of the prepared pooled library was
then loaded into the “Load Samples” reservoir.
While wearing gloves, the flow cell was carefully removed from the storage buffer with a
pair of plastic forceps. The flow cell was rinsed with molecular grade water to remove any
excess salts and then dried using a lint-free lens cleaning tissue. The flow cell glass was cleaned
with an alcohol wipe and the excess alcohol was removed with a lens tissue. The flow cell was
then loaded into the flow cell compartment and the “Next” option was selected. The reagents
were loaded. The “Next” option was selected and the flow cell was prepared and loaded onto the
MiSeq® Desktop Sequencer.
The wash buffer (PR2) bottle was inverted to mix and placed in the reagent compartment
between the reagent chiller and the waste bottle. The waste bottle was emptied and the sippers
were lowered into the PR2 bottle and the waste bottle. The reagent cartridge was loaded into the
reagent chiller compartment and the “next” option was selected.
The sample sheet was loaded and the run parameters verified. The “next” option was
selected and the system started the pre-run check. After the check was completed, the “start run”
option was selected and the sequencing run was initiated.
2.6.10 Sequence Run Monitoring
The MiSeq® Reporter software that runs in conjunction with the MiSeq® Desktop
Sequencer provides primary analysis and, when instructed by the user, re-queues a run for
further analysis to generate a different output format. The resulting sequencing and
analysis files from the primary run are saved on the Illumina® cloud-computing environment
BaseSpace and on-instrument using MiSeq® Reporter. Additional analyses were performed using
an off-instrument installation of MiSeq® Reporter.
2.7 Mitochondrial Genome Comparison
Recent advances in technology have made it easier and more affordable to sequence the
entire mitogenome (Duchene, Archer et al. 2011). The complete mitogenome sequence has been
reported for other genera of the tribe Arini (which includes macaws and parakeets): Eupsittula pertinax,
Psittacara brevipes, Thectocercus acuticaudatus, and Rhynchopsitta terrisi (Pacheco, Battistuzzi
et al. 2011, Urantowka, Strzala et al. 2014), but within the genus Ara only one mitogenome,
that of Ara glaucogularis (Urantowka 2014), and a partial mitogenome of the nominate subspecies
Ara macao macao (Seabury, Dowd et al. 2013) have been sequenced. No DNA sequence,
mitochondrial or nuclear, has been published or documented for the endangered subspecies, Ara
macao cyanoptera.
In order to obtain a set of DNA fragments that collectively encode the mitochondrial
genome, total genomic DNA was extracted from liquid blood using a guanidinium thiocyanate
(GITC) method and from FTA® cards using a “thorough buffer rinse method” (Smith and
Burgoyne). The final DNA pellet was then in each case rehydrated and quantified using a
NanoDrop® 2000C Spectrophotometer (ThermoScientific).
Six primer pairs were designed that collectively amplify an overlapping set of fragments
encompassing the entire mitogenome. Because the mitogenome is strongly conserved across
species, the published mitogenome sequence of Gallus gallus (chicken) in the National Center
for Biotechnology Information (NCBI) database was used to design the initial set of consensus
primers (Sorenson, Ast et al.) (Desjardins and Morais). Primer sets that failed to amplify
fragments were subsequently compared to the partial sequence of an Ara macao macao
individual which became available at that time (Seabury, Dowd et al.). Primer sets that had failed
to effectively amplify fragments were modified based upon this sequence.
The control region is inherently problematic during amplification, read assembly, and the
subsequent contig alignment after sequencing. Cytosine homopolymers, which are located near
the 5’ end of the locus, assist in forming regulatory hairpin-loop structures that serve an
important function during mitochondrial genome replication and transcription (Kilpert and
Podsiadlowski 2006). However, this structure creates difficulty in capturing all of the nucleotide
sequence detail because of the strand barrier formed by the folding at this conserved secondary
region. To add to the unusual sequence patterns in this area, adenine/thymine (AT) residues
form short tandem repeat sequences which are often found in abundance near the 3’ end of the
control region and often contain a thymine homopolymer approximately 15 bp long. This
sequence is suspected to be the cause of the presumed polymerase slippage problem during
amplification and/or sequencing. Our lab was able to overcome these problems through trial and
error of a number of methods such as altering temperatures, using different enzymes, and
designing a variety of primer sets to use for amplification of the control region of different
individual scarlet macaws.
The difficulty that the Ion Torrent® platform has in calling homopolymer DNA sequences
(Feinstein and Cracraft 2004, Reumers, De Rijk et al. 2012) and the abundance of secondary
structures in this region caused the base call quality score to be lower than desired and an
incorrect number of consecutive nucleotides to be reported. This is due to the use of non-terminating bases
during sequencing, which allows the addition of repeated nucleotides until the wash step removes the
nucleotides from the chip. This area of the mitogenome was sequenced a second time with the
Illumina MiSeq platform. The effort resulted in an increased level of coverage (due to the
different platform chemistry involved) as well as a much higher quality score. The entire
mitogenome was amplified in six segments ranging in size from 2910 bp to 4423 bp (Figure 16).
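Homopolymer tracts of the kind described for the control region can be located in an assembled sequence with a simple scan, which helps flag positions where base calls deserve a second look. The following is a minimal sketch using a regular expression; the function name, the minimum run length of 5, and the toy sequence are all assumptions made for illustration, and the toy sequence is not macaw data.

```python
import re


def homopolymer_runs(sequence, min_length=5):
    """Return (start, base, length) for each single-base run of at least min_length.

    Start positions are 0-based. Only the unambiguous bases A, C, G, and T are considered.
    """
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(sequence.upper())]


# Toy sequence with a C tract and a 15 bp T tract (illustrative only)
toy = "ACGT" + "C" * 8 + "GATTACA" + "T" * 15 + "GC"
runs = homopolymer_runs(toy)
print(runs)
```

Positions reported by such a scan could then be compared between platforms to see where the called run lengths disagree.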
Table 1: Mitochondrial Primer Pairs Used to Amplify the Mitogenome of Ara macao (Scarlet
Macaw)
Segment   Size (bp)   TA (°C)   Primer†       Primer Sequence (5’-3’)
1         3111        69        F2682 (a)     CCAACATCTTAGCGGATCTTAGCG
                                R5793 (a)     GAAGCTTGAAGAGAGGAGTAGG
2         4423        69        L2258 (b)     CGTAACAAGGTAAGTGTACCGGAAGG
                                H6681 (b)     GGTATAGGGTGCCGATGTCTTTGTG
3         2910        66        F7266 (c)     GCCTTCAAAGCCTTAAACAAGAG
                                R10176 (a)    AAGAAGGTTAGGATCATGGTCAAG
4         4110        69        F9869 (c)     GGCCAGTGCTCAGAAATCTGTGG
                                R13979 (c)    GATGGGTGGCTCCTAAGACCAGTG
5         3666        67        F13328 (a)    GCCTACTCCTCCGTAAGCCACATAGG
                                R0028 (c)     CTTCGTGTTTTGGTTTACAAGACC
6         3460        65        F16478 (c)    CACGAATCAGGATCAAACAACC
                                R2968 (a)     ACCTGTCTTGTTAGTGGGCTGT
†Primer source: (a) this study; (b) Sorenson et al. (2009); (c) modified from Sorenson et al. (2009).
Various protocols and DNA polymerases were used to amplify the mitogenome
segments. The majority of the mitochondrial segments were amplified in 25 µl reaction volumes
containing Phusion® High-Fidelity (HF) buffer (1.5 mM MgCl2), 0.2 mM dNTPs, 0.5 µM of
each primer, 0.5 U Phusion® HF polymerase (New England Biolabs), and 1 ng of template. The
thermal cycling profile used was one cycle of 1 min at 98 °C followed by 30 cycles each
consisting of 10 sec at 98 °C, 20 sec at the annealing temperature (TA) listed in Table 1, 80 sec at
72 °C, and then a final 7 min incubation at 72 °C. Due to the presence of extensive secondary
structure in the region, the amplification of the fragment that included the control region was
carried out using an elongation step of 70 sec instead of 80 sec and used GC Phusion® buffer
instead of HF Phusion® buffer with Phusion® HF polymerase. PCR amplification was
performed on a PTC-200 Peltier Thermal Cycler (MJ Research, Waltham, Massachusetts).
The PCR product was quantified using a NanoDrop® 2000C Spectrophotometer
(ThermoScientific). Each amplified sample was loaded onto a 1% sodium borate (SB) agarose
gel and stained with ethidium bromide. The gel was electrophoresed in 1X SB buffer at 120V.
One kilobase Quick-Load® ladder (Invitrogen) was used as a size standard (see Figures 10, 11,
12, 13, 14 & 15).
Figure 10: Electrophoretic analysis of mitochondrial segment 1 amplicons. Amplified Ara
macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel
with a 1 kb size standard ladder (Invitrogen).
Figure 11: Electrophoretic analysis of mitochondrial segment 2 amplicons. Amplified Ara
macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel
with a 1 kb size standard ladder (Invitrogen).
Figure 12: Electrophoretic analysis of mitochondrial segment 3 amplicons. Amplified Ara
macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel
with a 1 kb size standard ladder (Invitrogen).
Figure 13: Electrophoretic analysis of mitochondrial segment 4 amplicons. Amplified Ara
macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel
with a 1 kb size standard ladder (Invitrogen).
Figure 14: Electrophoretic analysis of mitochondrial segment 5 amplicons. Amplified Ara
macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel
with a 1 kb size standard ladder (Invitrogen).
Figure 15: Electrophoretic analysis of mitochondrial segment 6 amplicons. Amplified Ara
macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel
with a 1 kb size standard ladder (Invitrogen).
Figure 16: Overlapping PCR amplicons used to amplify the Ara macao mitogenome.
2.8 Next Generation Sequencing Using the Ion Torrent® PGM™ Sequencing System (Life
Technologies)
The Ion Torrent® PGM™ is a semiconductor-based sequencer. It was the first benchtop
sequencing technology that did not depend on a controlled light source. Once the
sample is fragmented and ligated to adapters, multiple samples can be combined into a library.
The library is attached to beads and clonally amplified. Each time a complementary base is
added to the growing strand, a proton is released, altering the pH. This sequencing-by-synthesis
method records each base incorporation and filters out low-accuracy reads (see Figure 17).
2.8.1 Combining and Standardization of Mitochondrial DNA Samples
Each PCR amplicon was column-cleaned with the Wizard® SV Gel and PCR Clean-Up
System (Promega®). The amplicon products were quantified and checked for purity using a
NanoDrop® 2000C Spectrophotometer (ThermoScientific). As a rule, 260/280 nm and 260/230
nm absorbance ratios of 1.8 or greater are considered acceptable indications of adequate purity.
Four hundred nanograms of each of the six amplified mitochondrial segments for each individual
sample were combined in equimolar amounts to ensure equal representation across the
mitogenome during fragmentation and sequencing. The initial amount of total DNA in each
sample was between 0.1 µg and 1.0 µg, the range of starting material required by the
fragmentation protocol. These samples were then further diluted to obtain 500 ng in
15.5 µl of molecular grade water.
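Because the six segments differ in length, equal masses of DNA do not contain equal numbers of molecules. The following minimal sketch illustrates the conversion between mass and molar amount, assuming the common approximation of 660 g/mol per base pair of double-stranded DNA. The function names and the segment lengths are hypothetical placeholders and are not the values used in this study.

```python
# Average molar mass of one base pair of double-stranded DNA
# (a common approximation, assumed here).
G_PER_MOL_PER_BP = 660.0


def ng_to_pmol(mass_ng, length_bp):
    """Convert a mass of dsDNA in nanograms to picomoles of molecules."""
    return mass_ng * 1000.0 / (length_bp * G_PER_MOL_PER_BP)


def equimolar_masses(lengths_bp, total_ng):
    """Split total_ng among fragments so each contributes the same number of molecules.

    For equal molar amounts, the mass of each fragment must be
    proportional to its length.
    """
    total_length = sum(lengths_bp.values())
    return {name: total_ng * bp / total_length for name, bp in lengths_bp.items()}


# Hypothetical segment lengths, for illustration only.
example_lengths = {"segment_A": 3000, "segment_B": 2000}
example_masses = equimolar_masses(example_lengths, total_ng=500.0)
# segment_A receives 300 ng and segment_B receives 200 ng.
```

Under these assumptions, a longer segment needs proportionally more mass to be represented by the same number of molecules as a shorter one.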
2.8.2 DNA Fragmentation and End Repair
The NEBNext® Fast DNA Fragmentation and Library Prep Set for the Ion Torrent®
(New England Biolabs) was used to prepare the DNA libraries. A 15.5 µl aliquot of the
standardized DNA was combined with 2 µl of NEBNext® DNA Fragmentation Reaction Buffer
and 1 µl of 100 µM MgCl2. This DNA/buffer mix was vortexed for 3 sec, briefly centrifuged,
and placed on ice. The NEBNext® DNA Fragmentation Master Mix was vortexed for 3 sec and
1.5 µl of the mix was added to the DNA/buffer solution. The solution was then incubated on a
thermal cycler for 20 min at 25 °C, followed by 10 min at 70 °C and finally held at 4 °C. The
microfuge tube was then placed on ice.
2.8.3 Ligation of Adapters to DNA
Barcode adapters with distinct sequences were ligated to each individual sample so that
several samples could be pooled (multiplexed) in one sequencing run. A P1 adapter was
ligated to the end of each sample sequence to allow for the attachment to paramagnetic
AMPure® XP beads during size selection and sequencing. One microliter of sterile water, 4 µl of
T4 DNA Ligase Buffer for Ion Torrent®, 5 µl of P1 adapter solution, 1 µl of Bst 2.0
WarmStart® DNA polymerase preparation, 4 µl of T4 DNA ligase solution, and 5 µl of a
barcode solution were added to the fragment solution and mixed by pipetting. The combined
solution was incubated in a thermal cycler for 15 min at 25 °C, followed by 5 min at 65 °C, and
was then held at 4 °C. Five microliters of stop buffer was added to the microfuge tube, which was
then vortexed. Sixty microliters of 0.1X TE buffer (1X TE: 10 mM Tris-HCl, 1 mM EDTA,
pH 8.0) was added to the microfuge tube.
2.8.4 Bead-Based Size Selection of Amplified DNA
Agencourt AMPure® XP Beads (Beckman Coulter) were used to selectively isolate the
200 bp fragments from the unwanted larger and smaller fragments. Seventy microliters of
AMPure® XP magnetic beads were added to the DNA library solution and mixed by pipetting.
The solution was allowed to incubate at room temperature for 5 min, and then placed on a
magnetic rack to separate the beads, which bind to the larger, unwanted fragments, from the
supernatant. After the solution was clear, the supernatant was transferred to a new microfuge
tube and the beads were discarded. Fifteen microliters of re-suspended AMPure® XP beads were
added to the supernatant, mixed, and allowed to incubate at room temperature for 5 min. The
tube was then placed on the magnetic rack to separate out the beads that were now bound to the
desired fragment size.
Once the solution was clear, the supernatant was removed and discarded. Fifty microliters
of fresh 80% ethanol was added to the tube while on the magnetic rack. The solution was
allowed to incubate at room temperature for 20 sec, and the supernatant was removed and
discarded. Five hundred microliters of ethanol was added to the tube and allowed to incubate at
room temperature for 30 sec, and the supernatant was discarded. The tubes were air dried while
on the magnetic rack with the caps open for 5 min. The DNA target was then eluted from the
beads with 45 µl of 0.1X TE (1X TE: 10 mM Tris-HCl, 1 mM EDTA, pH 8.0). The solution was
mixed by vortexing and placed on the magnetic rack until clear. Forty microliters of solution was
transferred to a new microfuge tube.
2.8.5 PCR Amplification of Adaptor Ligated DNA
Fifty microliters of NEBNext® High-Fidelity 2X PCR Master Mix, 2 µl of equalizer
primers, and 8 µl of sterile water were added to the 40 µl adaptor-ligated 200 bp fragment
solution. The solution was amplified under the following conditions: initial denaturation at 98 °C
for 30 sec, six cycles of denaturation at 98 °C for 10 sec, annealing at 50 °C for 30 sec, and
elongation at 72 °C for 30 sec, followed by a final elongation at 72 °C for 5 min.
2.8.6 Equalizing the Library for Multiplexing
To prepare the samples for multiplexing, each sample in the combined library
needed to be represented in equal amounts during sequencing. This was accomplished using
equalizer beads and buffers. Ten microliters of Equalizer Capture Solution was added to the
amplified samples and mixed by pipetting. The solution was allowed to incubate at room
temperature for 5 min. Six microliters of washed Equalizer Beads were added to the microfuge
tube. The solution was mixed by pipetting and allowed to incubate at room temperature for 5
min. The tube was placed in the magnetic rack until the solution was clear.
The supernatant was removed and discarded and 50 µl of Equalizer Wash Buffer was
added to the reaction. The tube was removed from the magnetic rack, gently mixed by pipetting,
and then returned to the magnetic rack. The supernatant was removed and discarded, and 150 µl
of Equalizer Wash Buffer was added to the reaction. The tube was again removed from the
magnetic rack and gently mixed by pipetting. The tube was then placed in the magnetic rack and
the supernatant was removed and discarded. The tube was then removed from the magnetic rack
and 100 µl of Equalizer Elution Buffer was added to the pellet. The solution was mixed by
pipetting and the tube was sealed and placed in a thermal cycler at 35 °C for 5 min. The tube was
again placed in the magnetic rack and allowed to incubate at room temperature for 5 min. The
supernatant, which contained the equalized library, was then removed and retained. The final
concentration of each equalized library was approximately 100 pM.
2.8.7 Assessment of DNA Library
An Agilent® 2100 Bioanalyzer (Agilent Technologies) was used along with an Agilent®
DNA 1000 kit to determine the sample fragment size of the amplified DNA fragments as well as
the sample concentration. Each DNA sample was loaded onto an Agilent® DNA chip consisting
of interconnected microchannels and separated by size electrophoretically. This method was
preferred over standard gel electrophoresis because its output is in a data format that can be
transferred via computer interface. A gel matrix was combined with a dye concentrate and
transferred to a spin
filter. The solution was centrifuged at 2240 x g for 15 min. Care was taken to protect the mixture
from light. An Agilent® DNA chip was placed on the chip priming station, 9 µl of filtered gel-
dye mix was pipetted into the G well, and the priming station cover was closed.
A priming syringe was clipped on the syringe stand and the plunger was depressed and
held in position to force the gel-dye mix into the chip. Three microliters of gel-dye solution mix
was pipetted into the designated well, and 5 µl of marker was loaded into the sample and ladder
wells on the chip. One microliter of DNA size standard ladder was loaded into the ladder well.
One microliter of each sample was loaded into each sample well, and the chip was placed in an
adapter and vortexed for 1 min at 2400 rpm. The desired assay to be performed was selected on
the computer display, the chip was loaded onto the machine, and the assay was started.
Each sample was then diluted to 23 pM using molecular grade water. Twenty microliters
of each sample was transferred to a LoBind microfuge tube and this was stored at 4 °C.
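The dilution step follows the relationship C1 × V1 = C2 × V2. As a minimal sketch, the calculation below uses the approximate 100 pM concentration of the equalized library and the 23 pM target stated above; the 100 µl final volume and the function name are assumptions made only for illustration.

```python
def dilution_volumes(stock_pM, target_pM, final_ul):
    """Return (stock volume, diluent volume) in microliters using C1*V1 = C2*V2."""
    if target_pM > stock_pM:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = target_pM * final_ul / stock_pM
    return stock_ul, final_ul - stock_ul


# Hypothetical final volume of 100 µl: 23 µl of library plus 77 µl of water.
library_ul, water_ul = dilution_volumes(stock_pM=100.0, target_pM=23.0, final_ul=100.0)
```

In practice the stock concentration would be the value measured for each library rather than the nominal 100 pM.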
2.8.8 Preparation of Ion-Sphere Particles for Emulsion PCR & Enrichment
The samples were combined to create a multiplex library. The sample templates were
attached to Ion Sphere Particles (ISPs) in a concentrated reaction to regulate the ratio of template
to particles. This was performed on an Ion Torrent® OneTouch 2 system in an aqueous medium
composed of nanoliters of PCR reagents in individual droplets. The individual aqueous droplets
of this emulsion are referred to as microreactors. Each microreactor should contain a single
library fragment, a single bead, and enough PCR reagent (primer, dNTPs, and polymerase) for
clonal amplification to occur. Multiple copies of the same template were clonally amplified on
the bead.
The Ion Torrent® OneTouch ES enriches the solution after emulsion PCR by removing
non-templated ISPs. Dynabeads® MyOne Streptavidin C1 Beads separate the biotinylated
primers and free template from the templated ISPs.
Figure 17: Ion Torrent sequencing workflow for preparation (Kiesler 2014).
2.8.9 Ion Torrent® Run Plan
A run plan was created in the Torrent Suite & Ion PGM™ System software on the server
connected to the Ion PGM™. The Ion PGM™ was prepared and initialized, and the enriched
ISPs were loaded onto an Ion 316™ chip. The chip was centrifuged repeatedly to seat individual
ISPs into signaling positions.
Each microcell position on the chip should contain only one template-positive ISP. Once
the sequencing run was initiated, sequencing-by-synthesis began, and the chip was flooded
with one dNTP at a time. When a nucleotide was incorporated into a growing strand of
complementary DNA, hydrogen ions were released. The resulting change in pH was detected
by the chip, and the unit recorded this data as a digital signal.
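The principle of flowing one dNTP at a time can be illustrated with a simplified simulation. In the sketch below, the signal for each flow is taken to be the number of consecutive bases incorporated, so a homopolymer run produces a proportionally larger signal. The fixed cyclic flow order "TACG", the noise-free signal, and the function names are assumptions made for illustration and do not describe the instrument's actual flow order or base caller.

```python
def flow_signals(read, flow_order="TACG"):
    """Simulate ideal flow signals for the strand being synthesized.

    Each flow offers a single nucleotide; the signal is the number of
    consecutive bases of that type incorporated (0 if none).
    """
    read = read.upper()
    if not set(read) <= set(flow_order):
        raise ValueError("read contains bases that are never flowed")
    signals = []
    pos = 0
    flow = 0
    while pos < len(read):
        nucleotide = flow_order[flow % len(flow_order)]
        run = 0
        # A homopolymer run is incorporated within a single flow.
        while pos < len(read) and read[pos] == nucleotide:
            run += 1
            pos += 1
        signals.append((nucleotide, run))
        flow += 1
    return signals


def call_bases(signals):
    """Reconstruct the read from ideal (noise-free) flow signals."""
    return "".join(nucleotide * run for nucleotide, run in signals)


example = flow_signals("TTAG")
# [('T', 2), ('A', 1), ('C', 0), ('G', 1)]
```

Because the signal for a homopolymer must be resolved into an integer run length, long homopolymers are a typical source of uncertainty in flow-based sequencing.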
Figure 18: Ion Torrent sequencing final workflow (Kiesler 2014).
2.9 Mitochondrial DNA Sequence Data Analysis
The Ion Torrent® Server and Ion Torrent® Suite software recorded and processed the
sequences from the sequenced library. The library was demultiplexed into individual
samples using the barcodes attached during library preparation.
The Torrent Suite™ software generated FASTQ output in the form of 200 bp reads. The
reads were aligned preliminarily on-instrument to create contigs. Base calls were scored by
local cluster call stringencies programmed at the beginning of the run. The mitochondrial
sequence contigs were downloaded from the instrument in FASTQ file format (nucleotide
sequence including quality scores) and were further analyzed in Geneious® version 8.0.4
(see Figure 18) (Kearse, Moir et al. 2012).
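A FASTQ record stores each read as four lines: a header beginning with "@", the nucleotide sequence, a separator line beginning with "+", and a quality string with one character per base. The following minimal sketch reads such records and decodes the quality characters, assuming the standard Phred+33 encoding and unwrapped four-line records. The function names and the example record are hypothetical, and this is not the parser used by Torrent Suite™ or Geneious®.

```python
import io


def read_fastq(handle):
    """Yield (read name, sequence, list of Phred scores) from a FASTQ handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line) ends parsing
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # separator line starting with '+'
        quality = handle.readline().rstrip("\n")
        # Phred+33: the quality score is the character code minus 33.
        scores = [ord(char) - 33 for char in quality]
        yield header[1:], sequence, scores


def mean_quality(scores):
    """Arithmetic mean of the Phred scores of one read."""
    return sum(scores) / len(scores)


# Hypothetical record for illustration; 'I' encodes a Phred score of 40.
example_text = "@read1\nACGT\n+\nIIII\n"
records = list(read_fastq(io.StringIO(example_text)))
```

A Phred score Q corresponds to an estimated base-call error probability of 10^(-Q/10), so a score of 20 represents an estimated 1% chance of error.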
All of the mitochondrial alignments, assemblies, and analyses were conducted off-instrument
in Geneious®. Comparison analysis was conducted with ClustalW using default
parameters as implemented, and the alignments were adjusted by eye. Initial alignment of each
sample was performed de novo. This gave a basis for true alignment before being matched back
to a reference sequence. Uncorrected pairwise sequence divergence between samples was
calculated for all genomic features based on MUSCLE alignments executed in Geneious®
version 8.0.4.
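Uncorrected pairwise divergence (the p-distance) is the proportion of compared sites at which two aligned sequences differ. The sketch below shows the calculation for two sequences taken from the same alignment. Excluding gapped columns from the comparison is an assumption made here, the function name is hypothetical, and the Geneious® implementation may treat gaps and ambiguous bases differently.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns in which either sequence has a gap are skipped (an assumption
    of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == gap or base_b == gap:
            continue
        compared += 1
        if base_a != base_b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


# One difference in eight compared sites gives a divergence of 0.125.
example = p_distance("ACGTACGT", "ACGTACGA")
```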
2.10 Alignments to Reference Sequence
The reference sequence used in this alignment was from a nominate Ara macao macao
taken from the wild in Brazil and now housed at the Blank Park Zoo in Des Moines, Iowa. A wild-
caught bird was chosen as the reference sequence to ensure the sample originated from a pure
nominate subspecies and not from a hybrid of the two subspecies (Drees 2010).
During alignment, the consensus sequence and reference sequence were aligned by
allowing gaps to make the most matches between the two sequences, and then various analysis
matrices were employed to compare the sequences. By choosing the optimal alignment,
sequences of different lengths can be compared. Scoring methods were used in which scores were
assigned for matching bases, for differing bases, and for gaps in the sequence. In this way,
different alignments can be tried, and the most probable alignment will have the highest
alignment score. The scoring function must be based on an evolutionary model that accounts for
insertions, deletions, and substitutions. The substitution score matrix contained an entry for
every possible pair of residues. Algorithms were used to maximize similarity between sequences,
minimize distance between sequences, and apply percent accepted mutation (PAM) criteria.
Residues that were aligned but did not match represent substitutions, that is, single nucleotide
polymorphisms. Residues that were aligned with a gap in the sequence (in one direction or the
other) represent insertions and deletions. Matrices evaluated the likelihood that alignments
were significant rather than random. Many different matrices can be employed to analyze
sequence similarity.
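The idea that the best alignment is the one with the highest score can be illustrated with a basic global alignment (Needleman-Wunsch) score calculation. This is a minimal sketch under stated assumptions: the match, mismatch, and linear gap scores are arbitrary illustrative values, the function name is hypothetical, and this is not the scoring scheme used by ClustalW or Geneious®.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score under a simple linear gap penalty.

    Dynamic programming over prefixes: each cell holds the best score for
    aligning the first i bases of seq_a with the first j bases of seq_b.
    """
    previous = [j * gap for j in range(len(seq_b) + 1)]
    for i in range(1, len(seq_a) + 1):
        current = [i * gap]
        for j in range(1, len(seq_b) + 1):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            current.append(max(
                previous[j - 1] + pair,  # align the two bases
                previous[j] + gap,       # gap in seq_b
                current[j - 1] + gap,    # gap in seq_a
            ))
        previous = current
    return previous[-1]


# ACGT aligned with AC-T: three matches and one gap score 3 - 2 = 1.
example = global_alignment_score("ACGT", "ACT")
```

Changing the relative cost of gaps and mismatches changes which alignment is optimal, which is why the choice of scoring scheme matters when sequences of different lengths are compared.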
2.11 Differentiation on the Sequence Level
Once the alignment was complete, the software, in this case ClustalW, used a computer
model algorithm to detect single nucleotide polymorphisms (SNPs), insertions, and deletions. On
average, SNPs account for 90% of sequence variation in the genome, and they are a major source
of heterozygosity. SNPs rarely affect the function of a protein and often occur in non-protein-
coding areas or intronic regions. They can also occur in the third base position of a codon and
result in the same amino acid. SNPs can also cause a change to a chemically similar amino acid,
or to an amino acid that does not affect the function of the protein because it is not
structurally important. Other polymorphisms
such as insertions and deletions (indels) are also taken into account when measuring variation
between sequences. This type of change occurs at a much lower frequency because of the more
disruptive shift it could cause within the coding sequence.
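Once two sequences are aligned, each column can be classified as a match, a substitution (SNP), an insertion, or a deletion relative to the reference. The following minimal sketch shows this classification for a pairwise alignment. The function name and example sequences are hypothetical, each gapped column is reported separately rather than merged into multi-base indels, and this is not the algorithm implemented in ClustalW or Geneious®.

```python
def call_variants(reference_aln, sample_aln, gap="-"):
    """List differences between an aligned reference and an aligned sample.

    Positions are 1-based coordinates in the ungapped reference. An
    insertion is reported at the position of the preceding reference base.
    """
    if len(reference_aln) != len(sample_aln):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    ref_pos = 0
    for ref_base, sample_base in zip(reference_aln, sample_aln):
        if ref_base != gap:
            ref_pos += 1
        if ref_base == sample_base:
            continue
        if ref_base == gap:
            variants.append((ref_pos, "insertion", sample_base))
        elif sample_base == gap:
            variants.append((ref_pos, "deletion", ref_base))
        else:
            variants.append((ref_pos, "SNP", f"{ref_base}>{sample_base}"))
    return variants


example = call_variants("ACG-TA", "ATGCT-")
# [(2, 'SNP', 'C>T'), (3, 'insertion', 'C'), (5, 'deletion', 'A')]
```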
2.12 Identification of Candidate Nuclear Sequences for Polymorphism Screening
As noted above, mitochondrial sequences can have a much higher degree of variability
than equivalent nuclear sequences, but because they are maternal in origin only, they cannot
differentiate a hybrid of the two subspecies from A. m. macao or A. m. cyanoptera. In order to
assist in determining a more complete lineage assessment, sequences encoded by the nuclear
genome were also investigated. The nuclear genome offers bi-parental inheritance but lacks a
high degree of variability in many areas. This slower rate of change often does not provide
sufficient sequence variation to differentiate between closely related species (Dasmahapatra and
Mallet 2006).
Most protein-coding regions are highly constrained because most amino acid-altering
mutations are deleterious and become selectively eliminated (Li 1997). Intronic regions,
intergenic sequences, and the third nucleotide of a codon are often significantly less susceptible
to natural selection and fitness interference than protein-coding genes and are thus often more
variable between individuals. Therefore, these sequences are expected to have a higher number
of polymorphic sites and to evolve/change faster, making them potentially useful sequences to
explore for interspecies nucleotide variability when working with closely related species
(Watanabe, Nishida et al. 2005, Aranishi 2006).
Candidate nuclear regions which have historically displayed higher sequence variability
in other organisms were evaluated for comparative analysis in this study (Table 2). Sequences
from genes in gene families were avoided in order to increase the chance that intronic regions
would represent an unlinked locus on an individual chromosome. However, using DNA as a
successful phylogenetic tool also relies on the assumption that sequence variation among
individuals within a subspecies is much smaller than variation between two subspecies.
Therefore, sequences with too much variability needed to be avoided as candidates for sequence
comparison as well.
Nuclear primer sets were ordered from Biosynthesis, Inc., Lewisville, Texas. Primers
were designed in exonic regions when attempting to amplify intronic sequences to allow for
future use in other studies. Each lyophilized primer was hydrated to a stock concentration of
100 µM with TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0). A working stock concentration of
10 µM of each primer was then made by adding molecular grade water. A variety of PCR
protocols were used for amplifying the candidate nuclear regions and the regions that were
believed to show possible differentiation were further analyzed.
Table 2: Candidate Nuclear Regions Used for Differentiation of Subspecies (De Mendonca
Dantas, Godinho et al. 2009)
Nuclear region | Short name | Reference length (bp) | Primers (5’-3’)
--- | --- | --- | ---
ATP citrate lyase | ACL | 469 | GCTCTGCTTATGACAGCACT; CAGCAATAATGGCAATGGTG
Myelin proteolipid protein | MPLPR | 356 | ACATCTACTTTAACACCTGGACCACCTG; TTGCAGATGGAGAGCAGGTGGGAGCC
CEPU gene | CEPU | 627 | CGAGTCAAAGTCACCGTCAA; CTCTTCGCATCCGAGATGTA
β-fibrinogen | FIB | 908 | CAGGACAATGACAATTCAC; GTAGTATCTGCCATTAGG
Laminin receptor precursor P40 | LRPP40 | 362 | GGGCCTGATGTGGTGGATGCTGGC; GCTTTCTCAGCAGCAGCCTGCTC
V-raf murine sarcoma viral oncogene | C-RMIL | 576 | TCAATCATCCACAGAGACC; TGATGAGATCCACTCCATCG
Transforming growth factor | TGFβ2 | 630 | GAAGCGTGCTCTAGATGCTG; AGGCAGCAATTATCCTGCAC
Tropomyosin | TROP | 533 | AATGGCTGCAGAGGATTA; TCCTCTTCAAGCTCAGCACA
Axin protein | AXIN | 1280 | GATCTCCTGAAGACGTGG; AAGGCTGGACGACGTTCC
Aldolase | ADL | 356 | CTTATGTTGAAGCTGAACGACTG; GCACGTAGCCATAGTGCGTAGTC
Myoglobin | MG | 627 | GCTCAGGGTCTCTAGGTCCA; CTAGGCAGCCTAAGTATGCC
Histone 2 | H2AF | 908 | GCACGACGAGCATGCTAC; AGGTATTCCTGGCACTGG
Adenylate kinase 1 | AK1 | 851 | TGCAAGCCATCATCGAGAAGG; TGATGGTCTCCTCGTTGTCG
Recombination activating gene 1 | RAG1 | 1124 | CCTCCTGCTGGTATCCCTGC; GAATGTTCTCAGGATGCGTCC
Regulator of G-protein signaling 4 | RGS4 | 868 | TCGCTGGAAAACTTGATCC; GTAGTCCTCACAACTGACC
Vimentin gene | Vim | 446 | TGCTTCTTTGAACCTGAGAG; GTGTCCTCTTCGAGTGAGTG
The adenylate kinase 1 (AK1) gene fragment was amplified by PCR with the primers
listed in Table 2 (see Figures 19 & 20). Reactions were carried out in 25 µl reaction volumes
containing One Taq® Standard Reaction buffer, 0.2 mM dNTPs, 0.2 µM each primer, 1.25 U
One Taq® Hot Start polymerase (New England Biolabs), and 200 ng DNA template using a
thermal cycler protocol of one cycle of 45 sec at 94 °C followed by 35 cycles of 30 sec at 94 °C,
45 sec at 62 °C, and 1 min at 68 °C, with a final cycle of 5 min at 68 °C. PCR amplification was
performed on a PTC-200 Peltier Thermal Cycler (MJ Research, Waltham, Massachusetts).
Figure 19: Amplified nuclear region from adenylate kinase 1 (AK1) – 1 of 2. Amplified Ara
macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel
with a 1 kb size standard ladder (Invitrogen).
Figure 20: Amplified nuclear region from adenylate kinase 1 (AK1) – 2 of 2. Amplified Ara
macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel
with a 1 kb size standard ladder (Invitrogen).
The recombination activating gene 1 (RAG1) gene fragment was amplified by PCR with
the primers listed in Table 2 (see Figures 21 & 22). Reactions were carried out in 25 µl reaction
volumes containing Taq buffer, 0.2 mM dNTPs, 0.2 µM each primer, 1.25 U Taq polymerase
(New England Biolabs), and 10 ng DNA template using a thermal cycler protocol of one cycle of
45 sec at 94 °C followed by 30 cycles of 30 sec at 94 °C, 45 sec at 61 °C, and 1 min at 68 °C, with
a final cycle of 5 min at 68 °C. PCR amplification was performed on a PTC-200 Peltier Thermal
Cycler (MJ Research, Waltham, Massachusetts).
Figure 21: Amplified nuclear region from recombination activating gene 1 (RAG1) – 1 of
2. Amplified Ara macao samples were electrophoresed using a 1% sodium
borate (SB) agarose gel with a 1 kb size standard ladder (Invitrogen).
Figure 22: Amplified nuclear region from recombination activating gene 1 (RAG1) – 2 of
2. Amplified Ara macao samples were electrophoresed using a 1% sodium
borate (SB) agarose gel with a 1 kb size standard ladder (Invitrogen).
The regulator of G-protein signaling 4 (RGS4) gene fragment was amplified by PCR with
the primers listed in Table 2 (see Figures 23 & 24). Reactions were carried out in 25 µl reaction
volumes containing Q5® Hot Start buffer, 0.2 mM dNTPs, 0.5 µM each primer, 1.0 U Q5® Hot
Start polymerase (New England Biolabs), and 50 ng DNA template using a thermal cycler
protocol of one cycle of 30 sec at 98 °C followed by 30 cycles of 10 sec at 98 °C, 20 sec at 63 °C,
and 10 sec at 72 °C, with a final cycle of 5 min at 72 °C. PCR amplification was performed on a
PTC-200 Peltier Thermal Cycler (MJ Research, Waltham, Massachusetts).
Figure 23: Amplified nuclear region from regulator of G-protein signaling 4 (RGS4) – 1 of 2.
Amplified Ara macao samples were electrophoresed using a 1% sodium borate (SB)
agarose gel with a 1 kb size standard ladder (Invitrogen).
Figure 24: Amplified nuclear region from regulator of G-protein signaling 4 (RGS4) – 2 of 2.
Amplified Ara macao samples were electrophoresed using a 1% sodium borate (SB)
agarose gel with a 1 kb size standard ladder (Invitrogen).
The vimentin (Vim) gene fragment was amplified by PCR using the primers listed in
Table 2 (see Figures 25 & 26). Reactions were carried out in 25 µl reaction volumes containing
Q5® Hot Start buffer, 0.2 mM dNTPs, 0.5 µM each primer, 1.0 U Q5® Hot Start polymerase
(New England Biolabs), and 50 ng DNA template using a thermal cycler protocol of one cycle of
30 sec at 98 °C followed by 30 cycles of 10 sec at 98 °C, 20 sec at 63 °C, and 10 sec at 72 °C, with
a final cycle of 5 min at 72 °C. PCR amplification was performed on a PTC-200 Peltier Thermal
Cycler (MJ Research, Waltham, Massachusetts).
Figure 25: Amplified nuclear region from vimentin gene (Vim) – 1 of 2. Amplified Ara
macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel
with a 1 kb size standard ladder (Invitrogen).
Figure 26: Amplified nuclear region from vimentin gene (Vim) – 2 of 2. Amplified Ara
macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel
with a 1 kb size standard ladder (Invitrogen).
2.13 Restriction Digestion Analysis
DNA fragmentation by restriction enzyme digestion is a technique used to differentiate
homologous DNA sequences using the restriction enzyme’s ability to recognize and cut at
precise sequences. This method has proven to be a powerful tool to distinguish sequence
polymorphisms between species based on the recognition of specific short nucleotide
sequences/restriction sites and has regularly been used for species identification (Haig,
Wennerberg et al. 2004). Given the significant cost of next generation sequencing, an initial
search for possible sequence variability within the amplified nuclear regions was performed.
Restriction digestion was performed on the nuclear amplicons to determine if the intronic
fragments had some degree of sequence variability between A. m. macao and A. m. cyanoptera.
Restriction enzymes were chosen that recognize commonly occurring recognition
sequences and are active under standard conditions (Table 3).
Table 3: Restriction Enzymes
Restriction Enzyme | Recognition Sequence
--- | ---
EcoRI | G/AATTC
HindIII | A/AGCTT
HaeIII | GG/CC
BstEII | G/GTNACC
BamHI | G/GATCC
NcoI | C/CATGG
One microliter of 10X EcoRI buffer (100 mM Tris-HCl, 50 mM NaCl, 10 mM MgCl2,
0.025% Triton X-100, pH 7.5), and 2 µl of amplified template DNA (0.5 µg) were added to 6.5
µl of distilled water with 0.5 µl of EcoRI restriction enzyme (20 U/ µl) in a 1.5 ml microfuge
tube. Care was taken to avoid using more restriction enzyme preparation than 10% of the total
reaction volume due to the reaction inhibiting effects of glycerol mixed with the enzyme. The
solution was mixed by pipette and incubated in a thermal cycler for 1 h at 37 °C.
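The reaction set-up and the 10% limit on enzyme volume can be checked with a short calculation. The sketch below reproduces the volumes of the EcoRI digest described above (10 µl total, 10X buffer, 2 µl of DNA, 0.5 µl of enzyme); the function name and the returned field names are hypothetical and are used only for illustration.

```python
def digest_mix(total_ul, buffer_x, dna_ul, enzyme_ul, max_enzyme_fraction=0.10):
    """Compute buffer and water volumes for a restriction digest.

    Raises ValueError if the components exceed the total volume or if the
    enzyme preparation exceeds the allowed fraction of the reaction
    (glycerol in the enzyme storage buffer can inhibit the reaction).
    """
    buffer_ul = total_ul / buffer_x
    water_ul = total_ul - buffer_ul - dna_ul - enzyme_ul
    if water_ul < 0:
        raise ValueError("components exceed the total reaction volume")
    enzyme_fraction = enzyme_ul / total_ul
    if enzyme_fraction > max_enzyme_fraction:
        raise ValueError("enzyme volume exceeds the allowed fraction of the reaction")
    return {
        "buffer_ul": buffer_ul,
        "water_ul": water_ul,
        "enzyme_fraction": enzyme_fraction,
    }


# EcoRI digest as described: 1 µl buffer, 6.5 µl water, enzyme at 5% of volume.
ecori_mix = digest_mix(total_ul=10.0, buffer_x=10, dna_ul=2.0, enzyme_ul=0.5)
```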
One microliter of 10X HindIII reaction buffer (10 mM Tris-HCl, 50 mM NaCl, 10 mM
MgCl2, 1 mM dithiothreitol, pH 7.9), and 2 µl of amplified template DNA (0.5 µg) were added to
6.5 µl of distilled water with 0.5 µl of HindIII restriction enzyme (20 U/µl) in a 1.5 ml
microfuge tube. The solution was mixed by pipette and incubated in a thermal cycler for 1 h at
37 °C.
Five microliters of 10X BstEII reaction buffer (100 mM Tris-HCl, 50 mM NaCl, 10 mM
MgCl2, 0.025% Triton X-100, pH 7.5), and 2 µl of amplified template DNA (1.0 µg) were added
to 7.5 µl of distilled water with 0.5 µl of BstEII restriction enzyme (10 U/µl) in a 1.5 ml
microfuge tube. The solution was mixed by pipette and incubated in a thermal cycler for up to 1
h at 37 °C. All other enzymes used in restriction digest followed the above basic protocol.
Double digestions were prepared by following the recipes for the individual digestions
after determining that the buffering conditions of each enzyme were compatible. The stop
solution used to terminate the enzymatic reactions (2.5% Ficoll®-400, 10 mM EDTA, 3.3 mM
Tris-HCl, 0.08% SDS, pH 8.0 at 25 °C) was added at 10 µl per 50 µl reaction.
The digested fragments were then loaded onto a 2% sodium borate (SB) agarose gel
stained with ethidium bromide and electrophoresed at 120V (see Figure 27). If the banding
patterns differed between the subspecies, the nuclear amplicons were included in the sequencing
library for further analysis.
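The reasoning behind this screen is that a single nucleotide change within a recognition site removes (or creates) a cut, which changes the fragment sizes seen on the gel. The following minimal sketch performs an in silico digest of a linear sequence to illustrate this. The function name and both example sequences are hypothetical, only the given strand is searched (which is sufficient for palindromic sites such as that of EcoRI), and no macaw sequence data are used.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases of the site that remain on the
    upstream fragment (1 for EcoRI, which cuts G/AATTC).
    """
    sequence = sequence.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical amplicons differing by one base within the EcoRI site.
with_site = "A" * 10 + "GAATTC" + "T" * 10
without_site = "A" * 10 + "GAGTTC" + "T" * 10

fragments_with_site = digest(with_site, "GAATTC", 1)        # [11, 15]
fragments_without_site = digest(without_site, "GAATTC", 1)  # [26]
```

In this illustration, the first amplicon would give two bands and the second a single uncut band, which is the kind of difference in banding pattern the screen was designed to detect.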
Figure 27: Electrophoretic analysis of restriction digest of RGS4. Restriction enzymes
EcoRI, BstEII, and HindIII were used to digest the nuclear amplicon RGS4 and
create comparative banding patterns to screen for subspecies unique sequence
polymorphisms.
2.14 Next Generation Sequencing Using the Illumina MiSeq® Platform
Illumina® sequencing technology is based on massively parallel sequencing-by-synthesis
(SBS), which detects the incorporation of single bases into the growing DNA strands of clonal
clusters. Single DNA molecules are bound to the surface of the flow cell and
bridge-amplified to form clonal clusters. Four types of reversible, fluorescently-labeled,
terminator bases are added and the growing clusters are imaged as the DNA chains are extended
one nucleotide at a time. Non-incorporated nucleotides are washed away and the fluorescent
label and 3’ terminal blocker are chemically removed from the DNA to allow incorporation of
the next base. Since all 4 dNTPs are present during each cycle, natural competition minimizes
incorporation bias. The image of the new base as part of the strand is captured and the process is
repeated for each cycle of sequencing. Following image analysis, the sequencing software
performs base calling, filtering, and quality scoring (Figure 28). Requeuing analyses allow for
homology, SNP, indel, and variant calling on- or off-instrument.
Figure 28: Sequence homology and a single nucleotide polymorphism (SNP) seen between
mitochondrial sequence for macaw 065 and 1022 (Illumina MiSeq® Reporter).
CHAPTER 3
RESULTS AND CONCLUSIONS
3.1 18S Analysis of Results
Ribosomal DNA sequences have long been an invaluable tool used in evolutionary
studies. This region is phylogenetically informative due to the combination of the highly
conserved coding sequences of the 18S, 5.8S, and 28S rRNA genes flanked by the more highly
variable sequences found in the non-coding internal transcribed spacer (ITS) regions. A map
of this region is shown below (see Figure 29; see also Figure 5, supra):
Figure 29: Arrangement of ribosomal DNA (rDNA) clusters on the genome. ITS1 –
Internal Transcribed Spacer 1; ITS2 – Internal Transcribed Spacer 2; ETS –
External Transcribed Spacer; NTS – Non-Transcribed Spacer (Holstein 2006).
This variability allows for phylogenetic inferences across a broad range of evolutionary
time scales (Baldwin, Sanderson et al. 1995) but is still easy to align across divergent taxa. The
rRNA gene regions are highly conserved within species, whereas the ITS regions exhibit
divergence sufficient to resolve relationships within species or between closely related species of
most genera (Álvarez and Wendel 2003). The transcription start site and elements that influence
the regulation of the downstream genes are within the external transcribed spacer (ETS) region,
which diverges more rapidly, and therefore displays more variability, than even the ITS region
(Kovarik, Dadejova et al. 2008).
In the scarlet macaw, the 18S – 5.8S – 28S nuclear ribosomal DNA (rDNA) clusters
occur in tandem arrays on three pairs of microchromosomes (see Figure 30) (Seabury, Dowd et
al. 2013). Each array is estimated to comprise hundreds to thousands of repetitive clusters
(transcriptional units) separated by non-transcribed spacer (NTS) regions. Each cluster encodes
three different rRNA genes (18S, 5.8S, and 28S), which are separated by two internal transcribed
spacer (ITS1 and ITS2) regions (Matyasek, Renny-Byfield et al. 2012). The highly conserved
nature of this region allows for the design of “universal” PCR primers which enable
amplification from a wide variety of taxa, and the high copy number of the rDNA clusters also
allows for easier amplification/analysis.
Figure 30: Ribosomal DNA tandem cluster array. Adapted from photograph by O. L. Miller,
Jr. (Gilbert). NTS – non-transcribed spacer; ETS – external transcribed spacer;
ITS1 – internal transcribed spacer 1; ITS2 – internal transcribed spacer 2; rRNA
genes 18S, 5.8S, and 28S
Table 4: 18S rDNA Sequence Comparison Between Subspecies
Sample | Locus 1172 | Freq. | Locus 1190 | Freq. | Locus 1214 | Freq. | Locus 1287 | Freq. | Locus 1582 | Freq.
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
A. m. cyanoptera | | | | | | | | | |
SL8 | T/A | 0.89/0.11 | A/T | 0.86/0.14 | T/C | 0.90/0.10 | G/A | 0.91/0.09 | G/A | 0.90/0.10
033 | T/A | 0.89/0.11 | A/T | 0.85/0.15 | T/C | 0.90/0.10 | G/A | 0.91/0.09 | G/A | 0.90/0.10
19s/m | T/A | 0.90/0.10 | A/T | 0.88/0.12 | T/C | 0.91/0.09 | G/A | 0.93/0.07 | G/A | 0.89/0.11
ZM19 | T/A | 0.90/0.10 | A/T | 0.87/0.13 | T/C | 0.91/0.09 | G/A | 0.93/0.07 | G/A | 0.92/0.08
028 | T/A | 0.89/0.11 | A/T | 0.86/0.14 | T/C | 0.90/0.10 | G/A | 0.92/0.08 | G/A | 0.90/0.10
4531 | T/A | 0.89/0.11 | A/T | 0.86/0.14 | T/C | 0.90/0.10 | G/A | 0.92/0.08 | G/A | 0.90/0.10
9021 | T/A | 0.89/0.11 | A/T | 0.86/0.14 | T/C | 0.89/0.11 | G/A | 0.91/0.09 | G/A | 0.90/0.10
062 | T/A | 0.90/0.10 | A/T | 0.87/0.13 | T/C | 0.91/0.09 | G/A | 0.92/0.08 | G/A | 0.88/0.12
065 | T/A | 0.89/0.11 | A/T | 0.86/0.14 | T/C | 0.91/0.09 | G/A | 0.92/0.08 | G/A | 0.87/0.13
9780 | T/A | 0.89/0.11 | A/T | 0.86/0.14 | T/C | 0.89/0.11 | G/A | 0.90/0.10 | G/A | 0.90/0.10
ZM16 | T/A | 0.90/0.10 | A/T | 0.85/0.15 | T/C | 0.93/0.07 | G/A | 0.93/0.07 | G/A | 0.90/0.10
101 | T/A | 0.90/0.10 | A/T | 0.86/0.14 | T/C | 0.89/0.11 | G/A | 0.91/0.09 | G/A | 0.90/0.10
119 | T/A | 0.89/0.11 | A/T | 0.87/0.13 | T/C | 0.91/0.09 | G/A | 0.90/0.10 | G/A | 0.90/0.10
363 | T/A | 0.89/0.11 | A/T | 0.86/0.14 | T/C | 0.90/0.10 | G/A | 0.90/0.10 | G/A | 0.89/0.11
A. m. macao | | | | | | | | | |
ZM13 | T/A | 0.91/0.09 | A/T | 0.88/0.12 | T/C | 0.92/0.08 | G/A | 0.93/0.07 | G/A | 0.90/0.10
CC6 | T/A | 0.89/0.11 | A/T | 0.88/0.12 | T/C | 0.91/0.09 | G/A | 0.93/0.07 | G/A | 0.91/0.09
SL5 | T/A | 0.89/0.11 | A/T | 0.87/0.13 | T/C | 0.90/0.10 | G/A | 0.89/0.11 | G/A | 0.88/0.12
ZM10 | T/A | 0.89/0.11 | A/T | 0.87/0.13 | T/C | 0.91/0.09 | G/A | 0.92/0.08 | G/A | 0.90/0.10
CC7 | T/A | 0.88/0.12 | A/T | 0.87/0.13 | T/C | 0.90/0.10 | G/A | 0.92/0.08 | G/A | 0.90/0.10
046 | T/A | 0.90/0.10 | A/T | 0.90/0.10 | T/C | 0.90/0.10 | G/A | 0.89/0.11 | G/A | 0.89/0.11
024 | T/A | 0.89/0.11 | A/T | 0.86/0.14 | T/C | 0.90/0.10 | G/A | 0.91/0.09 | G/A | 0.88/0.12
Hybrids | | | | | | | | | |
043 | T/A | 0.90/0.10 | A/T | 0.87/0.13 | T/C | 0.91/0.09 | G/A | 0.92/0.08 | G/A | 0.90/0.10
1022 | T/A | 0.89/0.11 | A/T | 0.86/0.14 | T/C | 0.91/0.09 | G/A | 0.92/0.08 | G/A | 0.90/0.10
The 1815 bp fragment encompassing the 18S rDNA and the 5’ end of the flanking
internal transcribed spacer 1 (ITS1) regions of both the nominate A. macao macao and A. macao
cyanoptera subspecies were sequenced on the Illumina MiSeq® Desktop Sequencer (Table 4).
The rDNA amplicons from a total of 7 individuals of the A. macao macao subspecies, 14 of the
A. macao cyanoptera subspecies, and two known hybrids were sequenced. The 18S rDNA
sequence is regularly used to resolve deep phylogenetic relationships and therefore was not
expected to show much variation, if any, between these two subspecies. However, since rDNA is a
highly conserved repeated sequence, the obtained nucleotide sequence of the rDNA amplicon is
the “sum” of the sequences of all of the units represented within the PCR product. If they are
identical, then a single, compiled sequence is obtained. If one or more of the repeat units
exhibits a polymorphism, then the obtained sequence will show two nucleotides at that position
with the relative proportions reflecting the ratio of the two units within the total number of rDNA
clusters/units. In this case, the results of the rDNA sequence analysis indicated the existence of a
clear sequence heterogeneity across the A. macao 18S rRNA genes as well as the 5’ region of
ITS1. A total of 5 points of polymorphism were identified that occurred at approximately a 90/10
ratio in the amplified product. This is consistent with the “minor sequence” likely comprising a
cluster that represents about 10% of the total rDNA units in these birds. Unfortunately, the same
polymorphism was exhibited in all 23 A. macao specimens tested, indicating that the origin of
the polymorphism predates the separation of the two subspecies and that it has not further
diverged since that time. Thus, although this is an interesting finding in itself, it is of limited
usefulness for the present study.
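The link between repeat-unit composition and the base ratio seen in the pooled amplicon can be illustrated with a minimal sketch. It assumes a hypothetical array of ten rDNA units, nine carrying the major base and one carrying the minor base at a single position, and it assumes that every unit is amplified and sequenced equally. The function name and the unit count are illustrative only and are not part of the analysis pipeline used in this study.

```python
from collections import Counter


def pooled_base_frequencies(unit_bases):
    """Return the fraction of each base at one position across all repeat units.

    Assumes every repeat unit contributes equally to the amplicon pool.
    """
    counts = Counter(unit_bases)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


# Hypothetical array: nine units with T and one unit with A at one position.
units = ["T"] * 9 + ["A"]
freqs = pooled_base_frequencies(units)
print(freqs)  # {'T': 0.9, 'A': 0.1}
```

Under these assumptions, a single divergent unit in ten would yield the approximately 90/10 ratio reported in Table 4. The real number of rDNA units per genome is not known from these data.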
Sequence heterogeneity within multicopy gene families has previously posed a problem
in phylogenetic analyses of many species, in particular within the internal transcribed spacer
regions (Buckler-Iv, Ippolito et al. 1997, Álvarez and Wendel 2003, Kiss 2012). Surprisingly,
the number of polymorphisms (only one identified in ITS1) was remarkably low in the A. macao
birds analyzed and again this one was observed in all 23 birds used in this study.
The rDNA sequencing reads produced from our Illumina MiSeq sequencing run from the
A. macao samples were each assembled into an rDNA sequence alignment. The high similarity
of the repeats inherent in rDNA arrays makes it impossible to tell which repeat unit a given
read came from. The reads from the whole array are instead aligned as one single
rDNA (transcription unit) with a sequence polymorphism frequency that is the product of the
sequencing coverage of the genome and the number of rDNA clusters in the array (see Figure
31). These results demonstrate that rDNA evolves via concerted evolution and that
homogenization is highly efficient at maintaining the rDNA with near-identical repeats (Ganley
and Kobayashi 2007). It seems highly likely that homogenization is favored by natural selection
because it reduces mutational load (Ohta 2009). Otherwise, it would seem inconceivable that
variation is seen at such a low frequency, especially because the ITS region is known to evolve
very rapidly.
To summarize my findings from rDNA analyses, interesting sequence polymorphisms
were observed between rDNA clusters within each of the macaws studied, but no polymorphisms
were observed that were useful as subspecies-specific indicators. Again, although this was
disappointing, at least in some sense, it is very much in line with expectations given the close
phylogenetic relationship between these two groups of macaws. For this reason, we focused on
what should be more polymorphic regions of the A. macao genome to find loci which would
allow subspecies identifications to be carried out. These regions included the complete
mitochondrial DNA sequence and four additional nuclear autosomally-encoded loci.
3.2 Mitochondrial Sequences: Analysis of Results
Figure 31: Inherent problem with tandem clusters of rDNA during read assembly. 18S
rDNA/ITS1 sequenced reads from tandem clusters will assemble together as one
cluster. ETS: external transcribed spacer; ITS1: internal transcribed spacer 1;
ITS2: internal transcribed spacer 2
Fourteen pedigreed A. m. cyanoptera and seven pedigreed A. m. macao (plus two known
hybrids) were studied in the completion of this project. The complete mtDNA sequence was
determined for each of these birds. The mitogenome of Ara macao was found to be 16,993 bp
long and to contain 13 protein-coding genes, 2 rRNA-encoding genes, 22 tRNA-encoding genes,
and a control region of approximately 1494 bp. Comparison of these mtDNA sequences
clearly reveal that the total amount of polymorphism within the scarlet macaw mitogenome is
more than sufficient to allow for subspecies identification using mitochondrial loci. Upon
completing DNA sequence analysis and polymorphism assessment, a total of 74 SNPs (0.43% of
entire mitogenome) and nine indels (0.05%) out of the 16,993 bp were discovered within the
endangered subspecies, Ara macao cyanoptera. Fifty-eight SNPs (0.34%) and six indels
(0.035%) were found within the nominate subspecies, Ara macao macao, out of the 16,993 bp. A
total of nineteen loci (0.1%) show a different nucleotide (allele) between the two subspecies.
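The percentages quoted above follow directly from the site counts and the 16,993 bp mitogenome length. The short sketch below recomputes them as a check. The dictionary labels are illustrative names chosen here, and the printed values should agree closely with the reported figures, allowing for rounding.

```python
MITOGENOME_BP = 16_993  # mitogenome length reported in the text


def percent_of_mitogenome(n_sites, genome_bp=MITOGENOME_BP):
    """Express a count of variable sites as a percentage of the genome length."""
    return 100.0 * n_sites / genome_bp


# Site counts as reported in the text.
counts = {
    "cyanoptera SNPs": 74,
    "cyanoptera indels": 9,
    "macao SNPs": 58,
    "macao indels": 6,
    "between-subspecies loci": 19,
}
percentages = {label: percent_of_mitogenome(n) for label, n in counts.items()}
for label, pct in percentages.items():
    print(f"{label}: {pct:.3f}%")
```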
A number of loci also displayed evidence suggestive of heteroplasmy, the co-existence of
multiple mitochondrial DNA variants within a single organelle (see Table 6, blue cells).
Heteroplasmy is the result of mutations that affect only a portion of the mitochondria within a
cell. The proportion of the variant genome can increase over time as the rarer allele becomes
more common within the organelle or cell. A heteroplasmic organelle or genome can also be
inherited and passed through successive generations until the heteroplasmy is lost, which occurs
when a descendant inherits only one or the other of the alleles.
The observed and abundant mtDNA polymorphism in A. macao provides an ideal source
of molecular markers for conservation studies of endangered species (Nabholz, Uwimana et al.
2013). The majority of previous mitochondrial DNA analyses have focused on sequencing only
select fragments of the mitogenome and, historically, the most common region to sequence has
been the hypervariable, non-coding control region (CR) (Duchene, Archer et al. 2011). This
region is followed in use by cytochrome B (CytB), which is slightly more conserved than the
CR, as well as cytochrome C oxidase 1 (CO1) (Hebert, Ratnasingham et al. 2003). We chose to
sequence the entire mitogenome to increase the likelihood that we would find regions with the
“appropriate” levels of polymorphism to allow for differentiation between the subspecies at the
mtDNA level.
The mitogenome of the nominate subspecies, Ara macao macao, was published as a
partial sequence in 2013 (Seabury, Dowd et al. 2013). The cytochrome B gene was incomplete,
missing 23 nucleotides at the 3’ end of the locus. Also, the reported sequence of the control
region (D-loop region) had many differences from the sequences we report here for seven birds
of the same subspecies. Both the CytB and control regions were sequenced and assembled to
completion by our lab for each of our reference subspecies individuals. Once we had isolated,
amplified, and sequenced the entire mitogenome of the first A. m. macao specimen, we compared
the mtDNA sequences of six additional individuals to assess the variability within the nominate
subspecies. There were many SNPs and indels observed in the control region for these seven
individuals. Unfortunately, the degree of polymorphism was so high as to classify the region as
hyperpolymorphic, as expected from studies of other species. Because of the hypervariable
nature of this locus, its usefulness for definitive DNA comparisons between the two
subspecies is, at best, complicated and thus greatly reduced.
The control region of the mitochondria consists of three distinct domains. Domain I is the
C-rich region that contains the H-strand synthesis terminus. This domain contains the most
variability within the control region. Heteroplasmic repeats that aid in hairpin formation for
tighter regulatory control are found in this domain as well as in domain III. Due to this
comparable variability, 600 bp from this region of the mitochondria were chosen to assess the
relatedness of our sample group (see Appendix). Domain II is a highly conserved and G-rich
segment of the control region. Among the individual macaw samples we collected, variability
between subspecies was the lowest in this domain as would be expected. Domain III is an AT-
rich and an extremely G-poor segment. The origin of replication of the H-strand is found in this
domain.
Our 21 samples were obtained from birds collected from many regions inhabited by the
two scarlet macaw subspecies, and therefore different haplotypes are expected to be
represented within our group. Our primary focus was ultimately to locate regions of the
mitogenome with appropriate/useful levels of polymorphism between the two subspecies and, if
possible, with subspecies-specific alleles/haplotypes. Analysis of the mtDNA sequences obtained
from the 21 birds revealed considerable polymorphism across a number of regions of the
mitochondrial genome. For example, the 16S rDNA region has ten loci that exhibit
polymorphism within the A. m. macao subspecies and a total of 17 loci that exhibit
polymorphism within the A. macao species, six of which display variability between the
subspecies (alleles found only in one or the other subspecies, see Table 6, peach columns). The
16S rDNA region therefore shows great promise as a tool for subspecies differentiation at the
mtDNA level and will be discussed in more detail below.
Figure 32: Phylogenetic tree showing relatedness of Ara macao from mitogenome domain I.
Table 5: Characteristics of the mtDNA of Two Subspecies of Scarlet Macaw Ara macao macao and Ara macao cyanoptera
Columns: Gene; Coding Strand; Position From; Position To; Spacer (+) or Overlap (-); Size; Start Codon; Stop Codon
tRNAPHE H (R) 1 66 0 66
12S rRNA L (F) 67 1038(1036) 0 972(970)
tRNAVAL H (R) 1039(1037) 1109(1107) +1 71
16S rRNA L (F) 1111(1109) 2679(2680) 0 1569(1572)
tRNALEU H (R) 2680(2681) 2754(2755) +6 75
ND1 H (R) 2761(2762) 3741(3742) -1 981 ATG AGG
tRNAILE H (R) 3740(3741) 3811(3812) +5 72
tRNAGLN L (F) 3817(3818) 3887(3888) 0 71
tRNAMET H (R) 3888(3889) 3955(3956) 0 68
ND2 H (R) 3956(3957) 4995(4996) 0 1040 ATA TA--
tRNATRP H (R) 4996(4997) 5066(5067) +1 71
tRNAALA L (F) 5068(5069) 5136(5137) +2 69
tRNAASN L (F) 5139(5140) 5212(5213) +2 74
tRNACYS L (F) 5215(5216) 5281(5282) 0 67
tRNATYR L (F) 5282(5283) 5351(5352) +9 70
CO1 H (R) 5361(5362) 6908(6909) 0 1548 GTG AGG
tRNASER L (F) 6909(6910) 6975(6976) +4 67
tRNAASP H (R) 6980(6981) 7048(7049) +2 69
CO2 H (R) 7051(7052) 7734(7735) +1 684 ATG TAA
tRNALYS H (R) 7736(7737) 7803(7804) +1 68
ATP8 H (R) 7805(7806) 7972(7973) -10 168 ATG TAA
ATP6 H (R) 7963(7964) 8645(8646) 0(-1) 683 ATG TA-
CO3 H (R) 8646 9429(9427) 0 784(782) ATG T--
tRNAGLY H (R) 9430(9428) 9498(9496) 0 69
ND3 H (R) 9499(9497) 9848(9846) 0 350 ATA TA-
tRNAARG H (R) 9849(9847) 9918(9916) +1 70
ND4L H (R) 9920(9918) 10216(10214) -7 297 ATG TAA
ND4 H (R) 10210(10208) 11602(11600) 0 1393 ATG T--
tRNAHIS H (R) 11603(11601) 11671(11669) 0 69
tRNASER H (R) 11672(11670) 11737(11735) 0 66
tRNALEU H (R) 11738(11736) 11807(11805) 0 70
ND5 H (R) 11808(11806) 13622(13620) +11 1815 GTG TAG
CytB H (R) 13634(13632) 14773(14771) 0 1140 ATG TAA
tRNATHR H (R) 14774(14772) 14841(14839) +4 68
tRNAPRO L (F) 14846(14844) 14915(14913) +3 70
ND6 L (F) 14919(14917) 15431 +1 513(515) ATG TAG
tRNAGLU L (F) 15433 15502 0 70
CR H (R) 15503 16993+3* 0 1491+3*
Table 5 shows locations of mitogenome features, such as genes and other regions of
interest, for both of the Ara macao subspecies. Values displayed are for the nominate subspecies,
Ara macao macao, while values in parentheses are for homologous features found in Ara macao
cyanoptera that differed from the nominate features. When only one value is present, the features
are identical in the two subspecies. An asterisk (*) indicates where the control region sequence
overlaps the sequence for the phenylalanine tRNA gene. Negative values represent other
overlapping nucleotides whereas positive values represent intergenic spacer regions. Four codons
contain “--” or “-”, which indicate incomplete termination codons that are completed via
polyadenylation during post transcriptional mRNA modification. Both subspecies are missing a
cytosine residue 171 nucleotides into the NADH dehydrogenase subunit 3 gene (ND3) (Mindell,
Sorenson et al. 1998). The previously unknown termination codon for the cytochrome B gene
(CytB) is TAA (see Table 5, blue).
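The size and spacer/overlap columns of Table 5 can be derived from the start and end coordinates. The sketch below shows one way to do so, assuming inclusive 1-based coordinates and assuming that the spacer/overlap value describes the gap between a feature and the one that follows it. The helper names are hypothetical, and the formula is a reading of the table that reproduces the listed values for the features shown here, not a documented convention.

```python
def feature_size(start, end):
    """Length in bp of a feature with inclusive 1-based coordinates."""
    return end - start + 1


def gap_to_next(end, next_start):
    """Positive: intergenic spacer (bp); negative: overlap (bp); zero: adjacent."""
    return next_start - end - 1


# (start, end) coordinates for A. m. macao, transcribed from Table 5.
features = {
    "tRNAVAL": (1039, 1109),
    "16S rRNA": (1111, 2679),
    "ATP8": (7805, 7972),
    "ATP6": (7963, 8645),
}

print(feature_size(*features["16S rRNA"]))                           # 1569
print(gap_to_next(features["tRNAVAL"][1], features["16S rRNA"][0]))  # 1
print(gap_to_next(features["ATP8"][1], features["ATP6"][0]))         # -10
```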
Comparison between the two subspecies showed significant variability across the
mitogenome as well (see Table 6). The 16S rDNA region shows six loci that, based upon the
sequences for the 21 reference birds, can differentiate between the two subspecies (see Table 6,
peach columns). For these loci, each subspecies appears to not exhibit intrasubspecies
polymorphism, and so the alleles could be considered subspecies indicative. For example, at
locus 1992, all 14 of the cyanoptera subspecies show a guanine residue while all seven of the
nominate subspecies show an adenine residue. Two polymorphic loci within the 16S rDNA
region are indels and both appear to have subspecies-specific alleles. Locus 2075 of the
cyanoptera subspecies shows an adenine followed by an adenine, cytosine, and two thymine
residues while the nominate subspecies shows only the adenine. Locus 1832 of the nominate
subspecies shows an adenine followed by a cytosine while cyanoptera only shows an adenine at
the locus.
There are four additional loci which appear to have subspecies-specific alleles, in
addition to alleles that are shared by both subspecies (see Table 6, grey columns). Therefore, at
this time, certain alleles appear to be associated with only one subspecies while others at these
loci have no predictive value for subspecies identification. For example, all fourteen of the
cyanoptera birds show a thymine residue at locus 1514 (see Table 6), while 57% of the nominate
birds show a cytosine residue and 43% show a thymine residue at this locus.
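The distinction drawn above between fully subspecies-indicative loci and loci with only some subspecies-specific alleles can be expressed as a comparison of allele sets. The following is a minimal sketch of that idea. The category labels and the function name are illustrative choices, and the allele strings are transcribed from Table 6 for three loci (one character per bird).

```python
def classify_locus(alleles_a, alleles_b):
    """Classify a locus by comparing the allele sets seen in two groups."""
    set_a, set_b = set(alleles_a), set(alleles_b)
    if set_a.isdisjoint(set_b):
        return "diagnostic"         # no allele is shared between the groups
    if set_a == set_b:
        return "uninformative"      # both groups show the same alleles
    return "partially informative"  # a private allele exists alongside shared ones


# One allele per bird, from Table 6 (cyanoptera n = 14, macao n = 7).
loci = {
    1992: ("G" * 14, "A" * 7),
    1514: ("T" * 14, "CCCCTTT"),
    1611: ("AAGGAAAAAAGAAA", "AAAGAAA"),
}
calls = {locus: classify_locus(cy, ma) for locus, (cy, ma) in loci.items()}
print(calls)
```

With only 21 reference birds, a locus called diagnostic here could still prove polymorphic in a larger sample, so the calls are provisional.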
Table 6: Mitochondrial Sequence Comparison Within and Between Subspecies: 16S rDNA
Sample Locus
1181 1514 1611 1621 1746 1777 1785 1801 1802 1832 1879 1992 2075 2197 2206 2359 2618
A. m. cyanoptera
SL8 A T A G A T C C A A T G AACTT G G T C
033 A T A A A T C C G A T G AACTT G G T C
19s/m A T G A A C C C A A T G AACTT G G T C
ZM19 A T G A G T T T G A T G AACTT G G T T
028 A T A A A T T T G A T G AACTT G G T T
4531 A T A G A C C C A A T G AACTT G G T C
9021 A T A A A C C C A A T G AACTT G G T T
062 A T A A G C C C A A T G AACTT G G T C
065 A T A G G C C C A A T G AACTT G G T C
9780 A T A G G C C C A A T G AACTT G G T T
ZM16 A T G A A T T T G A T G AACTT G G T T
101 A T A G G C T T A A T G AACTT G G T T
119 A T A G G C T T A A T G AACTT G G T T
363 A T A G G C C C A A T G AACTT G G T T
A. m. macao
ZM13 G C A G A T T C A AC C A A G A C C
CC6 G C A G A T T T G AC C A A A/G A C C
SL5 G C A A A T T T G AC C A A G G C T
ZM10 G C G A G T T C A AC C A A G A C C
CC7 G T A G A T T T A AC C A A A/G A C T
046 G T A G A C T T A AC C A A G G C T
024 G T A G A C T T A AC C A A A/G A C C
Hybrids
043 A T A G G C C C A A T G AACTT G G T T
1022 G T A G A C T T A AC C A A G A C C
Therefore, we selected five of the coding regions of the mitogenome for definitive
sequence analysis (see Tables 7-11). The genes chosen were 12S rDNA, cytochrome B (CytB),
16S rDNA, NADH dehydrogenase subunit 3 (ND3), and cytochrome C oxidase subunit 2 (CO2).
(The entire mitogenome sequence and all observed polymorphisms are included in the
appendix.)
Table 7: Mitochondrial Sequence Comparison Between Subspecies: 16S rDNA
Sample Locus 1181 1832 1879 1992 2075 2359
A. m. cyanoptera
SL8 A A T G AACTT T
033 A A T G AACTT T
19s/m A A T G AACTT T
ZM19 A A T G AACTT T
028 A A T G AACTT T
4531 A A T G AACTT T
9021 A A T G AACTT T
062 A A T G AACTT T
065 A A T G AACTT T
9780 A A T G AACTT T
ZM16 A A T G AACTT T
101 A A T G AACTT T
119 A A T G AACTT T
363 A A T G AACTT T
A. m. macao
ZM13 G AC C A A C
CC6 G AC C A A C
SL5 G AC C A A C
ZM10 G AC C A A C
ZM16 G AC C A A C
CC7 G AC C A A C
024 G AC C A A C
Hybrids
043 A A T G AACTT T
1022 G AC C A A C
Table 8: Mitochondrial Sequence Comparison Between Subspecies: 12S rDNA
Sample Locus 383 515
A. m. cyanoptera
SL8 G A
033 G A
19s/m G A
ZM19 G A
028 G A
4531 G A
9021 G A
062 G A
065 G A
9780 G A
ZM16 G A
101 G A
119 G A
363 G A
A. m. macao
ZM13 A AG
CC6 A AG
SL5 A AG
ZM10 A AG
CC7 A AG
046 A AG
024 A AG
Hybrids
043 G A
1022 A AG
Table 9: Mitochondrial Sequence Comparison Between Subspecies: Cytochrome C Oxidase Subunit II (COII)
Sample Locus 2012 2141 2171
A. m. cyanoptera
SL8 T G A
033 T G A
19s/m T A A
ZM19 T G A
028 T G G
4531 T A A
9021 T A A
062 T G A
065 T G A
9780 T G A
ZM16 T G A
101 T G A
119 T G A
363 T G A
A. m. macao
ZM13 C G A
CC6 T G A
SL5 T G A
ZM10 C G A
ZM16 C G A
CC7 T G A
024 C G A
Hybrids
043 T G G
1022 C G A
Table 10: Mitochondrial Sequence Comparison Between Subspecies: NADH Dehydrogenase 3 (ND3)
Sample Locus 1921 2010 2045 2155
A. m. cyanoptera
SL8 A A TC G
033 A A TC G
19s/m A A TC G
ZM19 A A TC G
028 A A TC G
4531 A A T G
9021 A A T G
062 A A T G
065 A A T G
9780 A A T G
ZM16 A A TC G
101 A A TC G
119 A A TC G
363 A A TC G
A. m. macao
ZM13 G G TC A
CC6 G G TC A
SL5 G G TC A
ZM10 G G TC A
ZM16 G G TC A
CC7 G G TC A
024 G G TC A
Hybrids
043 A A TC G
1022 G G TC A
Table 11: Mitochondrial Sequence Comparison Between Subspecies: Cytochrome B (CytB)
Sample Locus 2947 2953 2962
A. m. cyanoptera
SL8 C A T
033 C A T
19s/m C A T
ZM19 C A T
028 C G C
4531 C A T
9021 C A T
062 C A T
065 C A T
9780 C G T
ZM16 C G C
101 C A T
119 C A C
363 C A T
A. m. macao
ZM13 T A T
CC6 T A T
SL5 T A T
ZM10 T A T
CC7 T A T
046 T A T
024 T A T
Hybrids
043 C A T
1022 T A T
3.3 Nuclear Results
Although the mitochondrial genome is an extremely powerful phylogenetic tool, the
uniparental mode of inheritance is limiting when dealing with the differentiation of hybrids. It
has the disadvantage of being inherited as a single linkage group so that regardless of the number
of genes sequenced, it can be inferred as only one linked haplotype (Prychitko and Moore 2000).
The uniparental nature of its inheritance also means that hybrids will never show evidence of
hybridization in the mtDNA sequences. For this reason, we turned to the nuclear genome to
further assess the differentiation between the subspecies of the scarlet macaw.
We analyzed a total of 3266 bp of autosomally encoded nuclear DNA sequence (including
RAG1) from a total of four loci. The 1104 bp RAG1 sequence showed no subspecies-specific
polymorphisms (see Table 15), although many polymorphic sites were shared by the two
subspecies. Nine polymorphic sites with subspecies-specific alleles were identified in the
remaining 2162 bp (0.4%) of nuclear DNA sequence analyzed as part of this study. The intronic
sequence from adenylate kinase 1 (AK1) covers 851 bp and includes four loci with A. m. macao and
A. m. cyanoptera specific alleles (see Table 12). Although none of the four loci exhibited two
alleles which were each subspecies-specific (one allele/subspecies), in each case one subspecies
was polymorphic at that locus and the “second allele” was unique to that subspecies. Thus,
although one allele has limited predictive value for subspecies identification, the presence of the
other can be considered indicative of a single subspecies (based upon our present subspecies
sequence database of 21 birds, 42 alleles). As an example, at AK1 locus 725, a cytosine residue
accounted for 11 of 14 alleles in A. m. macao but for none of the 28 alleles of A. m.
cyanoptera (see Table 12).
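Allele counts such as "11 of 14" are obtained by splitting each diploid genotype into its two alleles and tallying them per subspecies. The sketch below illustrates this for AK1 locus 725, using genotypes transcribed from Table 12. The function name is a hypothetical helper, not part of any software used in this study.

```python
from collections import Counter


def allele_counts(genotypes):
    """Count alleles across diploid genotypes written as 'X/Y'."""
    counts = Counter()
    for genotype in genotypes:
        counts.update(genotype.split("/"))
    return counts


# AK1 locus 725 genotypes, transcribed from Table 12.
macao = ["C/A", "C/C", "C/C", "C/A", "C/A", "C/C", "C/C"]
cyanoptera = ["A/A"] * 14

macao_counts = allele_counts(macao)
cyanoptera_counts = allele_counts(cyanoptera)
print(macao_counts)       # Counter({'C': 11, 'A': 3})
print(cyanoptera_counts)  # Counter({'A': 28})
```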
Table 12: Nuclear Sequence Comparison Between Subspecies : Adenylate Kinase 1 (AK1)
Sample Locus 249 260 725 742
A. m. cyanoptera
SL8 C/C T/T A/A G/A
033 C/C T/T A/A G/G
19s/m C/T T/T A/A G/A
ZM19 C/C T/T A/A G/G
028 C/T T/T A/A G/G
4531 C/C T/T A/A G/G
9021 C/T T/T A/A G/G
062 C/T T/T A/A G/A
065 C/C T/T A/A G/G
9780 C/T T/T A/A G/G
ZM16 C/C T/T A/A G/G
101 C/C T/T A/A G/G
119 C/C T/T A/A G/G
363 C/C T/T A/A G/G
A. m. macao
ZM13 C/C C/T C/A G/G
CC6 C/C C/T C/C G/G
SL5 C/C C/T C/C G/G
ZM10 C/C C/T C/A G/G
CC7 C/C C/T C/A G/G
046 C/C C/C C/C G/G
024 C/C C/C C/C G/G
Hybrids
043 C/C T/T C/A G/A
1022 C/C T/T A/A G/G
The intronic sequence, regulator of G-protein signaling 4 (RGS4), covers 865 bp and
includes three loci with A. m. macao and A. m. cyanoptera specific alleles (see Table 13).
Although none of the three loci exhibited two different alleles which were each subspecies-
specific (one allele per subspecies), in each case one subspecies was polymorphic at that locus
and a “second allele” was unique to that subspecies. Thus, although one allele has limited
predictive value for subspecies identification, the presence of the other can be considered
indicative of a single subspecies. This of course is based upon our present subspecies sequence
database of 21 birds, 42 alleles at this locus. As an example, at RGS4 locus 744, a cytosine
residue was found as eleven of 14 alleles in A. m. macao but as none of 28 alleles of A. m.
cyanoptera.
Table 13: Nuclear Sequence Comparison Between Subspecies: Regulator of G-Protein Signaling 4 (RGS4)
Sample Locus 556 700 744
A. m. cyanoptera
SL8 C/C C/C A/A
033 C/CCT C/C A/A
19s/m C/CCT C/C A/A
ZM19 C/CCT C/C A/A
028 C/CCT C/T A/A
4531 C/C C/C A/A
9021 C/C C/C A/A
062 C/CCT C/T A/A
065 C/C C/C A/A
9780 C/C C/C A/A
ZM16 C/CCT C/C A/A
101 C/C C/C A/A
119 C/CCT C/T A/A
363 C/C C/C A/A
A. m. macao
ZM13 C/C C/C C/C
CC6 C/C C/C C/A
SL5 C/C C/C C/A
ZM10 C/C C/C C/C
CC7 C/C C/C C/C
046 C/C C/C C/C
024 C/C C/C C/A
Hybrids
043 C/C C/T C/A
1022 C/C C/T C/A
An intronic sequence found within the vimentin gene (VIM) covers 446 bp and includes
two loci with A. m. macao and A. m. cyanoptera specific alleles (see Table 14). Although neither
of the two loci exhibited two alleles which were each subspecies-specific (one allele/subspecies),
in each case, one subspecies was polymorphic at that locus and the “second allele” was unique to
that subspecies. Again, one allele has limited predictive value for subspecies
identification, but the presence of the other allele can be considered
indicative of a specific subspecies when found at this locus. For example, at VIM locus 105, an
adenine residue was found as eleven of 14 alleles in A. m. macao but was not found as any of the
28 alleles of A. m. cyanoptera. This locus allows a fairly definitive subspecies prediction,
but because three of the 14 A. m. macao alleles are not adenine, it is not completely subspecies
indicative.
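The reasoning above, in which one allele is shared and the other is private to a single subspecies, can be sketched as a set difference over the reference genotypes. The example below uses the VIM locus 105 genotypes from Table 14. The function names are hypothetical, and the reference panel is limited to the 21 birds in this study, so a "private" allele is private only with respect to that panel.

```python
def alleles(genotypes):
    """Return the set of alleles present in a list of 'X/Y' genotypes."""
    return {allele for genotype in genotypes for allele in genotype.split("/")}


def private_alleles(group, other):
    """Alleles observed in `group` but never in `other`."""
    return alleles(group) - alleles(other)


def carries_private_allele(genotype, private):
    """True if a sample genotype includes any allele from `private`."""
    return bool(set(genotype.split("/")) & private)


# VIM locus 105 genotypes, transcribed from Table 14.
macao = ["A/G", "A/G", "A/A", "A/A", "A/A", "A/A", "A/G"]
cyanoptera = ["G/G"] * 14

macao_private = private_alleles(macao, cyanoptera)       # {'A'}
cyanoptera_private = private_alleles(cyanoptera, macao)  # set()

print(carries_private_allele("A/G", macao_private))  # True
print(carries_private_allele("G/G", macao_private))  # False
```

A `False` result is uninformative rather than evidence for A. m. cyanoptera, because G is shared by both subspecies at this locus.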
Table 14: Nuclear Sequence Comparison Between Subspecies: Vimentin (Vim)
Sample Locus 105 390
A. m. cyanoptera
SL8 G/G T/T
033 G/G T/T
19s/m G/G T/T
ZM19 G/G T/T
028 G/G T/T
4531 G/G T/T
9021 G/G T/T
062 G/G T/T
065 G/G T/T
9780 G/G T/T
ZM16 G/G T/T
101 G/G T/T
119 G/G T/T
363 G/G T/T
A. m. macao
ZM13 A/G C/C
CC6 A/G C/T
SL5 A/A C/C
ZM10 A/A C/T
CC7 A/A C/T
046 A/A C/C
024 A/G C/T
Hybrids
043 G/G C/C
1022 G/G C/T
As previously mentioned, the intronic region that was further analyzed in recombination
activating gene 1 (RAG1) did not show subspecies-specific alleles (see Table 15).
Table 15: Nuclear Sequence Comparison Between Subspecies: Recombination Activating Gene 1 (RAG1)
Sample Locus 929
A. m. cyanoptera
SL8 C/C
033 C/C
19s/m C/T
ZM19 C/C
028 C/C
4531 C/T
9021 C/C
062 C/C
065 C/C
9780 C/C
ZM16 C/C
101 C/C
119 C/C
363 C/C
A. m. macao
ZM13 C/C
CC6 C/T
SL5 C/C
ZM10 C/C
CC7 C/C
046 C/C
024 C/C
Hybrids
043 C/T
1022 C/C
Advances in sequencing technology and bioinformatics tools over the last decade have
changed the power of molecular marker-based methods. The increase in the number of available
genome-wide molecular markers has dramatically enhanced the resolution and the reliability of
taxonomic conclusions (Steiner, Putnam et al. 2013), such as assessing the impact of genetic
variation on patterns of gene expression and measuring the responses to environmental
change. In order to understand the underlying patterns of genetic variations in individuals and
populations, we have to better understand the underlying processes involved and their relevance
to conservation. Structural rearrangements, copy number variations, insertions and deletions,
single nucleotide polymorphisms, and sequence repeats will likely become the standard units for
all future assessments of natural populations (Ellegren 2014). Studying selectively important
variation strongly relies upon the availability of annotated sequence data to assist in identifying
the functional genomic regions of interest. These molecular markers are invaluable in the
classification of individuals and conservation programs (Romanov, Tuttle et al. 2009),
identifying the substructures within populations, and in making strong conservation decisions
(Frankham, Ballou et al. 2010).
We were able to verify the close genetic relationship of A. m. macao and A. m.
cyanoptera through sequence level analysis of the entire 18S ribosomal DNA region as well as
the 5’ segment of the flanking internal transcribed spacer 1 (ITS1). However, comparative
analysis revealed that there was not sufficient polymorphism at this locus for subspecies-specific
differentiation, which had been anticipated as a possible outcome given the close phylogenetic
relationship between these two groups of macaws.
Sequencing the entire mitogenome of both the nominate and endangered subspecies
enabled us to accomplish several goals. The published mitochondrial genome of the nominate
A. m. macao was completed and corrected at some sites. The first mitogenome sequence of
the cyanoptera subspecies was determined and annotated. Distinct differences between the two
mitogenomes of these subspecies were noted. We were able to identify numerous regions within
this sequence to definitively distinguish between the two subspecies of macaw. Unfortunately,
the mitogenome is uniparental in origin and cannot fully assess the entire lineage.
Our lab conducted a preliminary assessment of four autosomal loci encompassing 3266
bp within 23 individual macaws. Although we found some indications of subspecies-specific
alleles, which taken together allow reasonable inference of subspecies, evaluated separately,
these alleles do not have the power to definitively distinguish between the subspecies. More
nuclear loci and a larger dataset need to be evaluated to enable definitive determination of
hybrids.
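The nuclear totals quoted in Section 3.3 can be cross-checked from the per-locus figures. The sketch below sums the reported sequence lengths and the reported numbers of loci with subspecies-specific alleles. The dictionary names are illustrative only.

```python
# Sequence lengths (bp) of the nuclear loci, as reported in Section 3.3.
locus_bp = {"RAG1": 1104, "AK1": 851, "RGS4": 865, "VIM": 446}
# Loci with subspecies-specific alleles reported per region (RAG1 had none).
specific_loci = {"RAG1": 0, "AK1": 4, "RGS4": 3, "VIM": 2}

total_bp = sum(locus_bp.values())
non_rag1_bp = total_bp - locus_bp["RAG1"]
n_specific = sum(specific_loci.values())
density_percent = 100 * n_specific / non_rag1_bp

print(total_bp, non_rag1_bp, n_specific, round(density_percent, 1))  # 3266 2162 9 0.4
```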
3.4 Off-Instrument Use of MiSeq® Reporter
The Illumina MiSeq® Desktop Sequencer includes MiSeq® Reporter, an on-instrument
software program that performs sequence analysis (see Figure 33). MiSeq® Reporter is
typically run in the lab via computers that are locally networked with the MiSeq®,
and it launches automatically after the MiSeq® completes its primary analysis. After performing
the initial sequencing and analysis runs on the MiSeq®, it is of value for the research team to be
able to perform further analyses, “re-sequencing” the initially obtained primary run data. The
MiSeq® sequencing and analysis workflow is illustrated below:
Figure 33: Illumina MiSeq® reporter workflow (Illumina)
To allow for distributed analysis by the research team, as well as the primary researcher,
MiSeq® Reporter can be installed “off-instrument,” i.e., on other computers
not co-located with the MiSeq® itself. To facilitate this, the MiSeq® can
upload its run outputs to the BaseSpace platform. For each of our runs, this was
approximately a 12 gigabyte set of files, so there was a fairly significant amount of data to be
transferred and stored in a registered BaseSpace account. Being able to download these files to
an off-instrument instance of MiSeq® Reporter was invaluable for re-sequencing and further
data analysis.
Running MiSeq® Reporter off-instrument required a Windows 7 machine with the
following components:
64-bit Windows OS (Vista, Windows 7, Windows Server 2008 64-bit, English-US)
≥ 8 GB RAM minimum; ≥ 16 GB RAM recommended
≥ 1 TB disk space
Quad core processor (2.8 GHz or higher)
Microsoft .NET 4
Given that the typical machine now runs Windows 8 or higher, this may require downgrading
a Windows 8 or Windows 10 machine to Windows 7, as well as ensuring that the other
requirements are met. The off-instrument MiSeq® Reporter software was available for
download from the Illumina® website.
Next, runs are downloaded from BaseSpace using an open-source downloader written in the
Python programming language. We therefore installed Python and obtained an
Illumina-provided script file to be executed by Python in accordance with
https://support.basespace.illumina.com/knowledgebase/articles/403618-python-run-downloader.
Execution of this file using the “token” and “RunID” provided on BaseSpace (as instructed by
the support article above) allowed all of the files associated with the run to be downloaded in
a single batch. Given the 12 GB size of our download, however, the download took several hours
even on a high-speed connection.
Once the files had been downloaded, they were available for access by the off-instrument
MiSeq® Reporter software. A “run” folder (C:\Illumina\MiSeq®Analysis) was created to have
the original analysis files in it. To provide “re-queue” instructions, the first step was to delete the
original “SampleSheet.csv” file that had the original run instructions in it. A genome folder was
created that had “.fa” files, which are ASCII text files having gene sequences to use for further
analyses against the original run files. The .fa files having these sequences essentially had to be
copied into three folders -- one folder associated with the Illumina® Experiment Manager (IEM),
which is used to create the re-sequence manifest file and the new re-sequence sample sheet.
Using the IEM, the sample sheet (SampleSheet.csv) is created, and the genome manifest is
accessed by first accessing the .fa file that was originally stored in the “genome” folder under the
IEM data folder. Once the IEM accesses the .fa file for the first time, it creates a full genome
manifest in the corresponding genome folder. This genome folder is then copied into the
MiSeq® Analysis folder (which is the run folder for the MiSeq® reporter).
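Because the “.fa” reference files are plain ASCII text, their contents can be inspected before they are copied into the genome folders. The following is a minimal sketch of a FASTA reader, assuming the common layout in which each record starts with a “>” header line followed by one or more sequence lines. The function name and the demonstration file are hypothetical, and this reader is not part of the Illumina® software.

```python
import os
import tempfile


def read_fasta(path):
    """Parse a FASTA (.fa) file into a dict of {record name: sequence}."""
    records = {}
    name = None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                header = line[1:].split()
                name = header[0] if header else ""
                records[name] = []
            elif name is not None:
                records[name].append(line)
    return {rec: "".join(parts) for rec, parts in records.items()}


# Demonstration with a small made-up file (not real macaw sequence).
with tempfile.TemporaryDirectory() as folder:
    demo_path = os.path.join(folder, "demo.fa")
    with open(demo_path, "w", encoding="ascii") as out:
        out.write(">locus1 example\nACGT\nTTGA\n>locus2\nGG\n")
    demo_records = read_fasta(demo_path)

print(demo_records)  # {'locus1': 'ACGTTTGA', 'locus2': 'GG'}
```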
MiSeq® Reporter itself operates as a “service” on the Windows 7 computer, and that
service is accessed through Internet Explorer or another compatible browser by entering the
following address in the address bar of the browser: “https://localhost:8042.” With this, the
Reporter interface appears in the browser, but execution of a resequencing
run will not begin until the “QueuedforAnalysis” file is deleted from the MiSeq®Analysis “run
folder.” Once the run begins, this file will reappear, and the run execution status can be
monitored from the associated browser window. Depending on processor and memory capability
and on the run particulars, the re-queue analyses will take several hours each. Example Reporter
graphical outputs are as follows (see Figures 34 & 35).
Figure 34: Illumina MiSeq® reporter run summary interface (Illumina)
The MiSeq® Reporter screens not only provided intuitive insight into the sequence
data; the Illumina® plots also allowed the data represented in a given plot (for example, SNP data)
to be exported as formatted “comma separated value” (“csv”) text files. Thus, many of the
spreadsheets discussed and presented elsewhere in this paper were generated by exporting data
directly from the MiSeq® Reporter screens.
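As a rough illustration of how such an exported csv file can be read back for tabulation, the sketch below parses a small SNP table. The column names `Position`, `Reference`, and `Variant` are assumptions made for this example (the actual export headers are not given here), and the two data rows are made-up values.

```python
import csv
import io


def load_variants(handle):
    """Return sorted (position, reference, variant) tuples from a csv export.

    The column names are hypothetical; adjust them to match the real file.
    """
    rows = []
    for record in csv.DictReader(handle):
        rows.append(
            (int(record["Position"]), record["Reference"], record["Variant"])
        )
    return sorted(rows)


# Made-up example content standing in for an exported file.
example = io.StringIO("Position,Reference,Variant\n250,T,C\n100,G,A\n")
print(load_variants(example))  # → [(100, 'G', 'A'), (250, 'T', 'C')]
```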
Figure 35: Illumina MiSeq® reporter detailed sample analysis interface (Illumina)
APPENDIX
EXTENDED RESULTS
Relatedness of Ara macao subjects from mitogenome domain I
SL8 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT
033 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCCCCCCACTGAGTC-GT-GGGCTGGTCTGCT
19s/m 1 CCCATACCCCTAAGGGTAGCCCCCCCTACC--CCCACTGAGTC-GT-GGGCTGGTCTGCT
ZM19 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT
028 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTCCGT-GGGCTGGTCTGCT
4531 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT
9021 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCCCCCCACTGAGTC-GT--GGCTGGTCTGCT
062 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT
065 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT
9780 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCCCCCCACTGAGTC-GT--GGCTGGTCTGCT
ZM16 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTGGGGCTGGTCTGCT
101 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCCCCCCACTGAGTC-GT-GGGCTGGTCTGCT
119 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTCCGT-GGGCTGGTCTGCT
363 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCT-CT
ZM13 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT
CC6 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT
SL5 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT
ZM10 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT
CC7 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT
046 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT
024 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT
043 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT
1022 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT
SL8 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA
033 59 TCGCCTATC-CAGGCTATGTATATCGTACATT-TA-ATAATT-GTACCTTAT--ACATTA
19s/m 57 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA--TAATTGGTACCTTAT--ACATTA
ZM19 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTATATACATTA
028 59 TCGCTTA-C-CAGGCTATGTATATCGTACATTTTA-ATAATTGGTACCTTAT--ACATTA
4531 58 TCGCTTATCCCAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA
9021 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA
062 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTATA-ACATTA
065 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTATATACATTA
9780 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAA-TGGTACCTTAT--ACATTA
ZM16 59 TCGCT-ATC-CAGGCTATGTATATCGTACATT-TAAATAATTGGTACCTTATA-ACATTA
101 59 TCGCCTATC-CAGGCTATGTATATCGTACATT-TAAATAATTGGTACCTTATA-ACATTA
119 59 TCGCTTA-C-CAGGCTATGTATATCGTACATTTTA-ATAATTGGTACCTT-T--ACATTA
363 57 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTATATACATTA
ZM13 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA
CC6 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAA-TGGTACCTTAT--ACATTA
SL5 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA
ZM10 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TAAATAATTGGTACCTTATA-ACATTA
CC7 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA
046 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAA-TGGTACCTTAT--ACATTA
024 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA
043 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TAAATAATTGGTACCTTATA-ACATTA
1022 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA
SL8 113 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG
033 113 TATTATA----TTATTAGGGACTAAATAATTCATGC-TCAATGACATATTGGTATTGTGG
19s/m 111 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTG-TATTGGGG
ZM19 115 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG
028 114 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG
4531 114 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG
9021 113 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG
062 114 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG
065 115 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG
9780 112 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG
ZM16 115 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG
101 116 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCTA--ATATATTGGTATTG-GG
119 113 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG
363 114 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATG--ATATTGGTATTG-GG
ZM13 114 TATTATAA-TGTTATTAGGGACTAAATAATTCATGCCTCAATG--ATATTGGTATTG-GG
CC6 113 TATTATAA-TGTTATTAGGGACTAAATAATTCATGCCTCAATGA-ATATTGGTATTG-GG
SL5 114 TATTATAAATGTTATTAGGGACTAAATAATTCATGCCTCAAT---ATATTGGTATTG-GG
ZM10 116 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCTAAT--ATATTGGTATTG-GG
CC7 114 TATTATAAATGTTATTAGGGACTAAATAATTCATGCCTCAAT---ATATTGGTATTG-GG
046 113 TATTATAA-TGTTATTAGGGACTAAATAATTCATGCCTCAATGA-ATATTGGTATTG-GG
024 114 TATTATAAATGTTATTAGGGACTAAATAATTCATGCCTCAAT---ATATTGGTATTG-GG
043 115 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCTA--ATATATTGGTATTG-GG
1022 114 TATTATAA-TGTTATTAGGGACTAAATAATTCATGCCTCAATG--ATATTGGTATTG-GG
SL8 169 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
033 168 ACTAATCTCTGGT-CTAGTTCGGTCCTACCACAGGGGT-TGGAA-AACTCCATG-GCACG
19s/m 167 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
ZM19 171 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAAAAACTCCATG-GCACG
028 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATGGGCACG
4531 170 ACTAATCTCTGGTCCTAGTTCGGTCCTA-CACAGGGGTTTGGAA-AACTCCATG-GCACG
9021 169 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
062 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
065 171 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
9780 168 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
ZM16 171 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
101 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
119 169 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATGGGCACG
363 168 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
ZM13 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
CC6 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
SL5 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
ZM10 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
CC7 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
046 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
024 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
043 169 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
1022 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG
SL8 224 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
033 224 ATAA-GCA-AGCTTCATGGT--TCTGGCCAAGGCATTGTTATCTTTAACTCTACTG-GT-
19s/m 222 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTG-GT-
ZM19 227 ATAA-GCA-AGCTTCATGTT--TCTGGCCAAGGCATTGTATCCTTTAACTCTACTGGGT-
028 226 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTG-GT-
4531 227 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
9021 224 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTG-GT-
062 225 ATAA-G---AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
065 226 ATAAAGCA-AGCTTCATGGT--TCTGGCCAAGGCATTGTTATCTTTAACTCTACTG-GT-
9780 223 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTG-GT-
ZM16 226 ATAA-G---AGCTTCATGGTT-TCTGGCCAAGGCATTG-ATCTTTTAACTCTACTGGGT-
101 225 ATAA-GCAGAGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
119 225 ATAA-GGA--ACTTCATGCTT--TAGGTCA-TGCTTTGTACCCTTTAATTTCATAG-CTC
363 223 ATAA-GCA-AGCTTCATGGT--TCTGGCCAAGGCATTGTTATCTTTAACTCTACTG-GT-
ZM13 225 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
CC6 225 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
SL5 225 ATAA-GCAGAGCTTCATGGTTTTCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
ZM10 225 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
CC7 225 ATAA-GCAGAGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
046 225 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
024 225 ATAA-GCAGAGCTTCATGGTTTTCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
043 224 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
1022 225 ATAA-GCAGAGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-
SL8 280 ACAGT-ATACGGAAGTGCCCCTAGTA-TAAG-AAACTTCATCCTTTAGGTCATGC-TTTT
033 278 ACAGT-ATACGGAAGTGCC-CTAGTA-TAA--GAACTTCATGCTTTAGGTCATGC-T-TT
19s/m 277 ACAGT-ATACGGAAGTGCCCCTAGTA-TAAG-AAACTTCATCCTTTAGGTCATGC-TTTT
ZM19 282 ACAGT-ATACGGAAGTGCCCCTAGTA-TAAGAAAACTTCATCCTTTAGGTCATGC-TTTT
028 281 ACAGT-ATACGGAAGGCCC-CTAGTA-TAAGGAAACTTCATCCTTTAGGTCATGC-T-TT
4531 283 ACAGT-ATACGGAAGTGCCCCTAGTA-TAAG-AAACTTCATCCTTTAGGTCATGC-TTTT
9021 279 ACAGT-ATACGGAAGTGCCCCTAGTA-TAA-GAAACTTCATCCTTTAGGTCATGCTTTTT
062 279 ACAGTAATACGGAAGTGCCCCTAGTA-TAAG-AAACTTCATCCTTTAGGTCATGC-TTTT
065 281 ACAGT-ATACGGAAGTGCC-CTAGTAATAAGAAAACTTCATCCTTTAGGTCATGC-TTTT
9780 278 ACAGT-ATACGGAAGTGCCCCTAGTA-TAAAGAAACTTCATCCTTTAGGTCATGCT-TTT
ZM16 279 ACAGTAATACGGAAGTGCCCCTAG----A---GAACTTCATGCTTTAGGTCATGC-T-TT
101 282 ACAGTAATACGGAAGTGCCCCTAGTA-TAAAGAAACTTCATCCTTTAGGTCATGC-TTTT
119 278 TAAGT-ATACGGAAGTGCT-CTAGTA-CAAG-AAACTTCATCCTTTAGGTCATGC---TT
363 277 ACAGT-ATACGGAAGTGCC-CTAGTA-TAA--GAACTTCATGCTTTAGGTCATGC-T-TT
ZM13 281 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT
CC6 281 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT
SL5 283 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT
ZM10 281 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT
CC7 282 ACAGTAATACGGAAGTGCCCCTAGTA-TTA--AAACTTCATCCTTTAGGTCATGCCTTTT
046 281 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT
024 283 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT
043 280 ACAGTAATACGGAAGTGCCCCTAGTA-TA-G-AAACTTCATCCTTTAGGTCATGCCTTTT
1022 282 ACAGTAATACGGAAGTGCCCCTAGTA-TA----AACTTCATCCTTTAGGTCATGCCTTTT
SL8 336 GATACCCT-TAATTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT
033 331 GATACCCTTTAATTTCATAGCTCTAAGTATA-CGGAAGT--GCTCTAGTACAAAGGACTT
19s/m 333 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT
ZM19 339 GATACCCTTAATTTTCATAGCT-TAAGTA-ACCGGAAGT--GCTCTAGTACAAAGGACTT
028 336 GATACCCT-TAATTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT
4531 339 GATACCCTTAATTTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT
9021 336 GATACCC-TTAATTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT
062 336 GATACCCTTAATTTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAG-------GACTT
065 338 GATACCCTTAATTTTCATAGCTCTAAGTA-ACCGGAAGT--GCTCTAG-------GACTT
9780 335 GATACCT-TAATTTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAG-------GACTT
ZM16 330 GATACCCTTTAATTTCATAGCTCTAAGTATA-CGGAAGT--GCTCTAGTACAAAGGACTT
101 340 GATACCCTTAATTTTCATAGCTCT-AGTA-A-CGGAAGT--GCTCTG--------GACTT
119 331 GATACCCT-TAATTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT
363 330 GATACCCTTTAATTTCATAGCTCTAAGTATA-CGGAAGT--GCTCTAGTACA-AGGACTT
ZM13 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA-AGGACTT
CC6 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA--AGACTT
SL5 339 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTA---AGGACTT
ZM10 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA--AGACTT
CC7 339 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA-AGGACTT
046 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACCAAGGACTT
024 339 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA-AGGACTT
043 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAG---GCTCTAGTACAA--GACTT
1022 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA-AGGACTT
SL8 391 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC
033 388 ATCGGTCACCGCCCATAATTGGCCCG-GGAACT-TTCT-TATCCCGTAA-TGCTCGTTTC
19s/m 389 ATCG-TCACCGCCCATAATTGGC-CG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTT-
ZM19 395 A-CG-TCACCGCCCATAATTGGCCCGCGGAACT-TTCTTTATCCCGTA--T-CT-GTTTC
028 391 ATCG-TCACCGC-CATAATTGGCCCG-GGAACCTTTCTTT-ACCCGTA--TGCT-GTTTC
4531 395 ATCGTTCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC
9021 391 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAATTGCT-GTTTC
062 385 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC
065 388 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC
9780 383 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCT-TATCCCGTAA-TGCTCGTTTC
ZM16 387 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC
101 387 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC
119 386 ATCG-TCACCGC-CATAAT-GGCCCG-GGAACCTTTCTTT-ACCCGTA--TGCT-GTTTC
363 386 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC
ZM13 394 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC
CC6 393 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC
SL5 394 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC
ZM10 393 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC
CC7 396 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC
046 395 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC
024 396 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC
043 390 ATCG-TCACCGC-CATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA---GCT-GTTTC
1022 394 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC
SL8 445 AGG-GCCCGGTTATTTAT------TGTCTGACTACTCACG-AG-GATCACCAATCC---T
033 444 AGGGGCCCGGTTATAT-------TAGTCTGACTACTCACG-AGAGATCACCAATCCGG-T
19s/m 441 AGG-GCCCGGTTATTTAT------TGTCTGACT-CTCACG-AG-GATCACCAATCC---T
ZM19 448 AGG-GCCCGGT-ATTTAT------TGTCTGACTACTC-CG-AG-GATCACCAATCC---T
028 444 AGG-GCCCGGTTATTTAT------TGTCTGACTACTCACG-AG-GATCACCAATCC---T
4531 450 AGG-GCCCGGTTATTTAT------TGTCTGACTACTCACG-AG-GATCACCAATCC---T
9021 447 AGG-GCCCGGTTATTTAT------TGTCTGACTACTCACC-AG-GATCACCAATCC---T
062 439 AGG-GCCCGGTTATTTATT-TA-TTGTCTGACTACTCACCGAG-GATCACCAATCC---T
065 442 AGG-GCCCGGTTATTTAT-------GTCTGACTACTCACG-AG-GATCACCAATCC---T
9780 438 AGGGGCCCGGTTATTTAT------TGTCTGACTACTCACC-AG-GATCACCAATCC---T
ZM16 441 AGG-GCCCGGTTATAT-------TAGTCTGACTACTCACG-AGAGATCACCAATCCGG-T
101 441 AGG-GCCCGGTTATTTATTTTA-TTGTCTGACTACTCACCGAG-GATCACCAATCC---T
119 438 AGG-GCCCGGTTATTTAT------TGTCTGACTACTCACG-AG-GATCACCAATCC---T
363 440 AGG-GCCCGGTTATTTATAT---TAGTCTGACTACTCACG-AGAGATCACCAATCCCG--
ZM13 448 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATCACCAATCCCGGT
CC6 447 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATCACCAATCCCGGT
SL5 448 AGG-GCCCGGTTATTTATAT-ATTAGTCTGACTACTCACG-AGAGATC-CCAATCCCGGT
ZM10 447 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATCACCAATCCCGGT
CC7 450 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATC-CCAATCCCGGT
046 449 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATC-CCAATCCCGGT
024 450 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATC-CCAATCCCGGT
043 442 AGG-GCCCGGTTATTTAT--------TCTGACTACTCACCGAG-GATCACCAATCC---T
1022 448 AGG-GCCCGGTTATTTAT---A-TAGTCTGACTACTCACC-GAGGATCACCAATCCCGGT
SL8 493 GT-AAGTAAGGTTCGG-CCTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT
033 495 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTCTT-TCCCCCTAC-ACCCTT
19s/m 488 GTAAAGTAAGGTTCGGTCCTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT
ZM19 494 GTAAAGTAAGGTTCGG-CCTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT
028 492 GT-AAGTAAGGTTCGG-CCTCCCTAGCGTCCAGGTCCATTTCTTTCCCCCTAC-ACCCTT
4531 498 GTAAAGTAAGGTTCGG-CCTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT
9021 495 GT-AAGTAAGGTTCG---CTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT
062 492 GT-AAGTAAGGTTCG---CTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT
065 489 GT-AAGTAAGGTTCGG-CCTCCCTAGCGTCCAGGTCCATTTCT-TTCCCCTACCACCCTT
9780 487 GT-AAGTAAGGTTCG---CTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACC-TT
ZM16 491 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT
101 495 GT-AAGTAAGGTTCG---CTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCC-T
119 486 GT-AAGTAAGGTTCGG-CCTCCCTAGCGTCCAGGTCCATTTCTTTCCCCCTAC-ACCCTT
363 493 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT
ZM13 503 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC-CTACCAC-CTT
CC6 502 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CACCAC-CCT
SL5 504 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CACCAC-CTT
ZM10 502 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CACCAC-C-T
CC7 504 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CA-CACCCTT
046 503 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CACCAC-CTT
024 504 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CA-CAC-CTT
043 489 GT-AAGTAAGGTTCG---CTCCCTAGCGT-CAGGTCCATTCTT-TCC-CCTACCACCC-T
1022 502 GT-AAGTAAGGTTCGG-CCTCCCTAGCGT-CAGGTCCATTCTT-TCCC--CACCAC-CTT
SL8 549 ACACTCCTTGCGCTTTTCGCCTCTG-GGTTCCTCGGTCAGGCACATAAC
033 550 ACACTCCTTGCGCTTT-TGCGCCTCTGGTTCCTCGGTCAGGCACATAAC
19s/m 546 ACACTCCTTGCGCTTTTCGCCTCTGGGCTTCCTCGGTCAGGCACATAAC
ZM19 551 ACACTCCTTG-GCTTTTCGCCTCTG-GGTTCCTCGGTCAGGCACATAAC
028 549 ACACTCCTTGCGCTTTTCGCCTCTG-GGTTCCTCGGTCAGGCACATAAC
4531 555 ACACTCCTTGCGCTTTTCGCCTCTGGGTTTCCTCGGTCAGGCACATAAC
9021 549 ACACTCCTTGCGCTTTTCGCCT--CGGGTTCCTCGGTCAGGCACATAAC
062 546 ACACTCCTTGCGCTTTTCGCCTCTCGGGTTCCTCGGTCAGGCACATAAC
065 546 ACACTCCTTG-GCT-TTCGCCTCTG-GGTTCCTCGGTCAGGCACATAAC
9780 540 ACACTCCTTGCGCTTTTCGCCT--CGGGTTCCTCGGTCAGGCACATAAC
ZM16 547 ACACTCCTT-CGCTTT-CGCCTCTCGGGTTCCTCGGTCAGGCACATAAC
101 548 ACACTCCTTGCGCTTTTCGCCTCTCGGGTTCCTCGGTCAGGCACATAAC
119 543 ACACTCCTTGCGCCTTTCGCCTCTG-GGTTCCTCGGTCAGGCACATAAC
363 549 AC-CTCCTTGCGCTTT--TCGCCTCGGGTTCCTCGGTCAGGCACATAAC
ZM13 557 ACACTCCTTGCGCTT-TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC
CC6 555 ACACTCCTTGCGCT--TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC
SL5 557 A-ACTCCTTGCGCT--TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC
ZM10 554 ACACTCCTTGCGCTT-TCGCCTCTCGGGTTCCTCGGT--GGCACATAAC
CC7 557 A-ACTCCTTGCGCT--TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC
046 556 A-ACTCCTTGCGCT--TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC
024 556 A-ACTCCTTGCGCT--TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC
043 541 ACACTCCTTGCGCTTTTCGCCTCTCGGGTTCCTCGGTCAGGCACATAAC
1022 555 ACACTCCTTGCGCTT-TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC
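One simple way to read an alignment such as the one above is to list the columns at which the specimens do not all agree. The sketch below does this for a 13-column excerpt taken from the first alignment block (specimens SL8, 033, and 19s/m); the function name `variable_columns` is hypothetical, gap characters are treated as a state of their own, and this is not the relatedness analysis itself.

```python
def variable_columns(rows):
    """Return 1-based alignment columns where the rows do not all agree."""
    lengths = {len(seq) for seq in rows.values()}
    if len(lengths) != 1:
        raise ValueError("aligned rows must have equal length")
    columns = zip(*rows.values())
    return [i for i, col in enumerate(columns, start=1) if len(set(col)) > 1]


# Excerpt from the first alignment block (same 13 columns in each row).
excerpt = {
    "SL8": "TACCC-CCCACTG",
    "033": "TACCCCCCCACTG",
    "19s/m": "TACC--CCCACTG",
}
print(variable_columns(excerpt))  # → [5, 6]
```

Columns 5 and 6 of the excerpt differ only by gap placement, which illustrates why indel-rich stretches of domain I need to be inspected before they are counted as informative sites.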
Adenylate Kinase 1
SNPs and Indels of Intronic Nuclear Regions
[Figure panels, labeled by specimen: 19S/M, CC6, CC7, SL5, SL8, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, ZM10, ZM13, ZM16, ZM19]
18S rRNA gene (18S)
SNPs and Indels for intronic nuclear regions
[Figure panels, labeled by specimen: 19S/M, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, CC6, CC7, 024, SL5, SL8, ZM10, ZM13, ZM16, ZM19]
Recombination activating gene 1 (RAG1)
SNPs and Indels of nuclear Intronic regions
[Figure panels, labeled by specimen: 19S/M, CC6, CC7, SL5, SL8, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, ZM10, ZM13, ZM16, ZM19]
Regulator of G-protein signaling 4 (RGS4)
SNPs and Indels for nuclear intronic regions
[Figure panels, labeled by specimen: 19S/M, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, CC6, CC7, 024, SL5, SL8, ZM10, ZM13, ZM16, ZM19]
Vimentin Gene (Vim)
SNPs and Indels for nuclear Intronic regions
[Figure panels, labeled by specimen: 19S/M, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, CC6, CC7, SL8, ZM10, ZM13, ZM16, ZM19]
Mitogenome
SNPs and Indels
[Figure panels, labeled by specimen: 19S/M, CC6, CC7, SL5, SL8, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, ZM10, ZM13, ZM16, ZM19]
EXAMPLE RESTRICTION DIGEST ELECTROPHORETIC GELS
MiSeq® Reporter graphical outputs
Below is a series of graphical outputs from the MiSeq® Reporter, each showing the
“Sample/Genetic Region” under analysis. The vertical axis shows the depth of
coverage (number of reads) at each position, and the horizontal axis shows the base position in
the sequence. The green curve represents the overall depth of coverage. The red
spikes mark the positions at which a SNP or an indel occurs relative to the reference
model.
The first group of graphs shows nuclear genome introns, whereas the second group
(which contains many more SNPs and indels) shows the mitochondrial genome.
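For readers unfamiliar with the quantity plotted on the vertical axis, the sketch below shows one way per-position depth of coverage can be computed from aligned read intervals. It is a minimal illustration, not the MiSeq® Reporter's own method; the function name `coverage_depth` and the toy read coordinates are made up, and reads are assumed to be 1-based, inclusive intervals lying within the reference.

```python
def coverage_depth(reads, length):
    """Per-base depth for 1-based, inclusive (start, end) read intervals."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (length + 2)
    for start, end in reads:
        diff[start] += 1
        diff[end + 1] -= 1

    # A running sum of the differences gives the depth at each position.
    depth, running = [], 0
    for pos in range(1, length + 1):
        running += diff[pos]
        depth.append(running)
    return depth


# Three toy reads on an 8-base reference.
print(coverage_depth([(1, 4), (3, 6), (4, 8)], 8))  # → [1, 1, 2, 3, 2, 2, 1, 1]
```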
MISEQ PLATFORM OUTPUT NUCLEAR SNP AND INDEL GRAPHS
MISEQ PLATFORM OUTPUT MITOGENOME SNP AND INDEL GRAPHS
Attached in “soft copy” form are two tables (in MS Excel format) of the mitochondrial
genome polymorphisms (SNPs and indels). To receive a copy via email, please send an
email to the author at [email protected]. (To open these files, right-click on the picture,
select “Worksheet Object”, then select “Open”.)
MET(68bp)
REGION --> 12S(970bp) VAL(71bp) 16S(1572bp) ND1(981bp) ILE(72bp) 3889-3956 ND2(1040bp) CO1(1548bp) ASP(69bp) CO2(684bp) LYS(68bp) ATP8(168bp) ATP6(683bp) CO3(782bp) ND3(350bp) ND4L(297bp) ND4(1393bp) ND5(1815bp) CytB(1140bp) THR(68bp) ND6(515bp)
67-1036 1037-1107 1109-2680 2762-3742 3741-3812 3957-4996 5362-6909 6981-7049 7052-7735 7737-7804 7806-7973 7964-8646 8646-9427 9497-9846 9918-10214 10208-11600 11806-13620 13632-14771 14772-14839 14917-15431
LOCUS --> 772 789 823 880 955 1094 1181 1514 1611 1621 1746 1777 1785 1801 1802 1832 1879 1992 2075 2197 2206 2359 2618 2784 2802 2914 2944 3142 3166 3199 3282 3298 3299 3436 3439 3511 3553 3602 3644 3798 3939 4061 4136 4193 4205 4247 4373 4385 4419 4432 4446 4489 4493 4517 4531 4539 4593 4620 4670 4697 4739 4778 4820 4938 4962 5433 5478 5490 5622 5655 5709 5748 5794 5931 5940 5952 5953 6078 6156 6207 6468 6474 6552 6592 6651 6756 6981 7033 7162 7192 7312 7413 7429 7576 7591 7616 7642 7649 7775 7785 7786 7914 7970 8057 8087 8110 8195 8219 8306 8516 8797 8830 8836 8883 9002 9028 9220 9265 9510 9539 9543 9579 9581 9632 9667 9701 9777 9987 9990 10047 10056 10108 10122 10143 10293 10295 10362 10430 10766 10850 10859 10898 10931 10969 10983 10984 11057 11087 11102 11177 11222 11327 11370 11387 11411 11441 11472 11481 11531 11836 11851 11896 11956 12085 12103 12106 12163 12361 12505 12709 12718 12739 12859 12922 12929 12967 13042 13135 13177 13201 13254 13319 13424 13480 13579 13743 13756 13776 13887 14022 14025 14031 14040 14097 14139 14224 14317 14373 14466 14541 14616
mito1 332 349 383 440 515 654 741 1074 1171 1181 1306 1337 1345 1361 1362 1392 1439 1552 1635 1757 1766 1919 2178 2344 2363 2474 2504 2702 2726 2759 2842 2858 2859 2996 2999 3071 3113
mito2 95 182 515 612 622 747 778 786 802 803 833 880 993 1076 1198 1207 1360 1619 1785 1804 1915 1945 2143 2167 2200 2283 2299 2300 2437 2440 2512 2554 2603 2645 2799 2940 3062 3137 3194 3206 3248 3374 3386 3420 3433 3447 3490 3494 3518 3532 3540 3594 3621 3671 3698 3740 3779 3821 3939 3963
mito3 412 457 469 601 634 688 727 773 910 919 931 932 1057 1135 1186 1447 1453 1531 1571 1630 1735 1960 2012 2141 2171 2291 2392 2408 2555 2570 2595 2621 2628 2754 2764 2765 2893
mito4 7622- mito4 27 153 163 164 292 348 435 465 488 573 597 684 894 1175 1208 1214 1261 1380 1406 1598 1643 1888 1917 1921 1957 1959 2010 2045 2079 2155 2365 2368 2425 2434 2486 2500 2521 2671 2673 2740 2808 3144 3228 3237 3276 3309 3347 3361 3362 3435 3465 3480 3555 3600 3705 3748 3765 3789 3819 3850 3859 3909
mito5 144 249 292 309 333 363 394 403 453 758 773 818 878 1007 1025 1028 1085 1283 1427 1631 1640 1661 1781 1844 1851 1889 1964 2057 2099 2123 2176 2241 2346 2402 2501 2665 2678 2698 2809 2944 2947 2953 2962 3019 3061 3146 3239 3295 3388 3463 3538
mito6
Reference
Seabury G A A A AG C G T A G G C T T A AC C C A A A A C T A T C A T A C T T G G T C A C A T C T A A G C C A T T T G T T A G G C G G G A T T A A A C C A T G A G C C C G A A C C G A C G C G A A T C A G G C G T A A A C A G T A C A A G G A G G A C T T C G A T G TC G A A T A A G C A C T A T A G G A T G A G G C A T T G G T A G G C A A G T G C A C C A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A
A. m. cyanoptera
SL8 A G G A A T A T A G A T C C A A T G AACTT G G T C C G C T A C G T C T A A T T A T C C C C G A A T C G T C C A C T G A A T G A G A C T G G A T T G C G T A T C T G G G C T A G C A T G A A T T A A A C G T AG G A C G G T G C G A G A G G G A T T T C A A C A TC G G A C A A A T G C C G T G A A G T G A A A C G T C A A C A A A C G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C T A C T T A A
033 A G G A/G A T A T A A A T C C G A T G AACTT G G T C C G C T A C G T C T G A T T A C C C C C G G A C C G T C T A C T G A A T G A G A C T G G A T T G C G T A T C T G G G T T A G C A T G A G T T G A G C A T A G C C A A T G T G A G A G G A A T T T C A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C G A A T G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C T A C T T A A
19s/m G G G A/G A T A T G A A C C C A A T G AACTT G G T C C G C T A C G T C T G A T T A C C C C C G G A C T G T C T A C T A/G A G T A A G G C C G G A T T G C G T A C T T G G G T T A G C A C A A G T T A A G T G T AG G C C A A T/C G T G A G A G A A A T G T T A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C G A A C G A G C G T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C C A C T T A A
ZM19 A G G A A T A T G A G T T T G A T G AACTT G G T T C G C T G C G T C C A A T T A C C C T C G G A T C G T C T A C C G A G T G G G A C C G G A T T G C G T A C T T G G G T T A G C A T G A G C T G A G T G C A G C C A A T G T G A G A G G A A T G T T A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C A A A C G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C C A C T T A A
028 A A G A A T A T A A A T T T G A T G AACTT G G T T C G C T G C G T C C A A T T A T C C T C G G A T C G T C T A C C G A A T A A A A C C G G G T T G C A A A T T C G A A T T A G C A C G G G T T G G G C G T AG G A C A G T/C G C G A G A G G G G T G C C A A C A TC G G A C A A A T G C C G T G A A G T G A A A C G C C A A C A A A T G G A C A T G T T G G T C A T C A A A G G T A A A C A A A C T T C G C C T A T C T A A
4531 G G G A A C A T A G A C C C A A T G AACTT G G T C C G C T A C G T C C G A T T A C C C C C G A A T T G T C T A C C G A G T G A G G C C G G G T T A C G A A C C T G G G T T A G T A T A A A T T G A G C A T AG G A C G G T G T G A G A G A G A T G C T A G C A T A G A T A A G T A C C G T G A G G C A T A A C G T C A A C A A A C G G A C A T G T C G A A T G T C A A A G G C A A A C A A A C T T C A T C T A C T T A A
9021 A G G A A C A T A A A C C C A A T G AACTT G G T T C G C T A C G T C T G A T T A C C C C C G G A T C G T C T A C T G A A T G A A G C T G G G T T A C A A A C C C G G A T T A G C A C A A A T T A A G C A T AG G A C G G T G C G A G A G A G A T G C C A A C A T G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G G A C A T G T T G A A T G T C A A A G G C A A A C A A G C C T C A T C T G C T T G G
062 A G G A A C A T A A G C C C A A T G AACTT G G T C C G C T G C G T C T A A T T A C C C C C G A A C C G T C T A C T G A G T A A A A C T G G G T T A C A T A T C C G A A T T A G C A T G G A T T A G G C A T AG G A T G G T G C G G A A G G G G T G C C A A C A T G G A T A A G T A T C G C G A G G T G T A A T G T C A A C A A A C G G A C A T G T T G G T C A T C A A A G G T A A A C A A A C T T C A T C T G T C T A A
065 A G G A A C A T A G G C C C A A T G AACTT G G T C C G C T A C G T C T A A T T A C C C C C G A A C C G T C T A C T G A G T G A G A C T G G G T T G C G T A C C T G A G C T A G T A T G A A T T A A G C A T A G A C A G T/C G C G A G A G G G A T G C C A A C A T G G G C A A/G A T G C C G T G A G G T G A A A C G T C A A C A A A C G G A C A T G T T G A T T G T C A A A G G C A A A C A G G C C C C A T C T G C T T A G
9780 G G G A A C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T A C C C C C G A A C C G T C T A C T G A G T G A G A C T G G A T T G C G T A T C C G A A C T A G C A T G A A C T A A G C G T A G A C A G T G C G A G A G G G A T G C C A A C A T G G G T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G G A C A T G T C G A T T G T C A G A G A C A A G C G A A C T T C G T C T A C T T A A
ZM16 A G G A A T A T G A A T T T G A T G AACTT G G T T C G C T A T G T C T A G T C G C C C T C G G A C T G T C T A C C A/G A G T G A A A C C G G A T T G C A A A T C C A A G C T A G C A C G A A T T A A G T G T AG G A T A G T G C G G A A G G G A T G C C A A C A TC G G A T A A G T A T C G C G A G G T G T A A T G C C A A C A A A C G A G C A T G T T G G T T G T C A A A G G C A A A C A G G C C C C G C C T A C T T A G
101 A A G A A C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T G G A T T A C G A A C C C G A A C T A G C A T G A A T T A G G C A T A G A C A G T G C G A G A G G G A T T T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A T C T A T C T A A
119 A A G A A C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T G G A T T A C G A A C C C G A A C T A G C A T G A A T T A G G C G T AG G A C A G T G C G A G A G G G A T G T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A C C T A T C T A A
363 A A G A A C A T A G G C C C A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T G G A T T A C G A A C C C G A A C T A G C A C G A A T T A G G C G T A G A C A G T G C G A G A G G G A T G T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A T C T A T C T A A
A. m. macao
ZM13 G G A A AG T G C A G A T T C A AC C A A G A C C C G T C G T A C T T A G C C A C A T C T A A A C T A T T T G T T A G G C G G G A T C G G A T T G C G A G T C T G A A C C G A T G T G A A T C A G G T G T AG A A C A G T A C A A G G A G G A C G T C G A T G TC G A A C A A A C G C T A T A G A A T G A G A C A C T A A C A A G T A G A T G C A C T A A T T G C T G A G A G C C T A T A A G T C T T A T T C G C T C A A
CC6 A G A A/G AG T G C A G A T T T G AC C A A A/G A C C T A T C G T A C T T A G T C G C A T C T A A A C T A C T T G T T A/G G G C G G A G T C A A A C C A T G A G T C C G A A C C G A C G C G A G T T A G G T G T A A A C A G T/C A C A A A G A G G A C T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G A C A T T G G T A G G C A A G T G C A C C A A T T G C T G A G A G C C T A T A A A T T T T A T T T A C C C A A
SL5 G A A A AG T G C A A A T T T G AC C A A G G C T T A T C/T A T A C T T G G T C A T A C C C A A A C T A T T T G T T A G A C A G G G T C G G A T T G C G T G C C C G A G C C G A T G C G A G T T A A G C G T A A A C A G T A T A A G G A G A A C G T T G G T G TC A A G T A A G C A C T A T A G A A C A T G A C A C T A A C A A G C A A A T G C A C T A A T T G C T G A G A G C C T A T A A G T C T T A T T T A C T C A G
ZM10 G G A A/G AG C G C G A G T T C A AC C A A G A C C C A T C A T A C T T G G C C A C C T C T A A A C T A C T T G T T A/G G A C A G G A T T G G A T T G C G A G C C C G A A C C G A T G C G A A T C A G G T G T AG A A C A G T A C A A G G A G G A T G C C G A T G TC G A G T A A G C A C T A T A G A A T G A G A C A C T A G T A G A C A A G T A C A C T A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C G A
CC7 A A A A AG T G T A G A T T T A AC C A A A/G A C T T A T C/T A T A C T T A G T C A C A C C C A A A C T A T T T G T T A G G C G G G A T T G G A T T G C G T G C C T G A G C C G A T G T G A G T C A A G C G T A A A C A G T A C A A G G A G A A C G C C G A T G TC G A A C A A/G A C G C T A T A G A A T G A G G C A C T A A T A A A C A G G T A C A C C A A T T G C T G G G A A C C T G T G A A T T T T A T T T A C T C G A
046 A A A A AG C G T A G A C T T A AC C A A G G C T C A T C A T A C T T G G T C A C A T C T A A G C C A T T T G T T A G G C G G G A T T A A A C C A T G A G C C C G A A C C G A C G C G A A T T A G G C G T A A A C A G T A C A A G G A G A A T T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G A C A T T G G T A G G C A A G T G C A C C A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A
024 G G A A AG C G T A G A C T T A AC C A A A/G A C C C G T C/T G T A C T T G G T C A C A T C T A A A C T A T T T G T T A G G C G G G G T T A A A C C A T G A G C C C G A A C C G A C G C G A A T C A G G C G T AG A A C A G T A C A A G G A G A A C T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G G C A T T G G T A G G C A G A T A C A C C A A T T G C T G A G A G C C T A T A A A T T T T A T T T A C T C A A
Hybrid
043 A A G A A C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T A C C C C C G A A C C G T C T A C C G A G T G G G G C T G G G T T A C G T A C C C G G A C T A G T A T G G G C T G G G C A T AG G A C G G T G C G A G A G G G G T G C C A A C A TC G G A T A A G T A C C G T G A G G T G T A A C G T C G G T G G G C A G A C A T G T T G A A T G T C A A A G G C A A A C A A G C C T C A T T C G C T C G A
1022 G G A A/G AG C G T A G A C T T A AC C A A G A C C T A T C G T A C T T G G C C A C A T C T A A A C T A T T T G T T A G G C G G G A T T A A A C C A T G A G C C C G A G C C G A C G C G A A T C A G G C G T A A A T A G T A T A A G G A G G A C G T T G A T G TC G A G C A A/G A C G C T A T A G G A T A A G G C G T T A A C A A A C A A A T A C A C C A A T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A
12S(970bp) VAL(71bp) 16S(1572bp) ND1(981bp)
67-1036 1037-1107 1109-2680 2762-3742
440- 772 789 823 880 955 1094 1181 1514 1611 1621 1746 1777 1785 1801 1802 1832 1879 1992 2075 2197 2206 2359 2618 2784 2802 2914 2944 3142 3166 3199 3282 3298 3299 3436 3439 3511 3553 (Cont)
mito1 332 349 383 440 515 654 741 1074 1171 1181 1306 1337 1345 1361 1362 1392 1439 1552 1635 1757 1766 1919 2178 2344 2363 2474 2504 2702 2726 2759 2842 2858 2859 2996 2999 3071 3113
Ref G A A A AG C G T A G G C T T A AC C C A A A A C T A T C A T A C T T G G T C
A. m. cyanoptera
SL8 A G G A A T A T A G A T C C A A T G AACTT G G T C C G C T A C G T C T A A T T
033 A G G A/G A T A T A A A T C C G A T G AACTT G G T C C G C T A C G T C T G A T T
19s/m G G G A/G A T A T G A A C C C A A T G AACTT G G T C C G C T A C G T C T G A T T
ZM19 A G G A A T A T G A G T T T G A T G AACTT G G T T C G C T G C G T C C A A T T
028 A A G A A T A T A A A T T T G A T G AACTT G G T T C G C T G C G T C C A A T T
4531 G G G A A C A T A G A C C C A A T G AACTT G G T C C G C T A C G T C C G A T T
9021 A G G A A C A T A A A C C C A A T G AACTT G G T T C G C T A C G T C T G A T T
062 A G G A A C A T A A G C C C A A T G AACTT G G T C C G C T G C G T C T A A T T
065 A G G A A C A T A G G C C C A A T G AACTT G G T C C G C T A C G T C T A A T T
9780 G G G A A C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T
ZM16 A G G A A T A T G A A T T T G A T G AACTT G G T T C G C T A T G T C T A G T C
101 A A G A A C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C
119 A A G A A C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C
363 A A G A A C A T A G G C C C A A T G AACTT G G T T C G C T A T G T C T G G T C
A. m. macao
ZM13 G G A A AG T G C A G A T T C A AC C A A G A C C C G T C G T A C T T A G C C
CC6 A G A A/G AG T G C A G A T T T G AC C A A A/G A C C T A T C G T A C T T A G T C
SL5 G A A A AG T G C A A A T T T G AC C A A G G C T T A T C/T A T A C T T G G T C
ZM10 G G A A/G AG C G C G A G T T C A AC C A A G A C C C A T C A T A C T T G G C C
CC7 A A A A AG T G T A G A T T T A AC C A A A/G A C T T A T C/T A T A C T T A G T C
046 A A A A AG C G T A G A C T T A AC C A A G G C T C A T C A T A C T T G G T C
024 G G A A AG C G T A G A C T T A AC C A A A/G A C C C G T C/T G T A C T T G G T C
Hybrids
043 A A G A A C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T
1022 G G A A/G AG C G T A G A C T T A AC C A A G A C C T A T C G T A C T T G G C C
VAL(71bp) 16S(1572bp) ND1(981bp) ILE(72bp) MET(68bp) ND2(1040bp)
1037-1107 1109-2680 2762-3742 3741-3812 3889-3956 3957-4996
1094 1181 1514 1611 1621 1746 1777 1785 1801 1802 1832 1879 1992 2075 2197 2206 2359 2618 2784 2802 2914 2944 3142 3166 3199 3282 3298 3299 3436 3439 3511 3553 3602 3644 3798 3939 4061 4136 4193 4205 4247 4373 4385 4419 4432 4446 4489 4493 4517 4531 4539 4593 4620 4670 4697 4739 4778 4820 4938 4962
mito2 mito2 95 182 515 612 622 747 778 786 802 803 833 880 993 1076 1198 1207 1360 1619 1785 1804 1915 1945 2143 2167 2200 2283 2299 2300 2437 2440 2512 2554 2603 2645 2799 2940 3062 3137 3194 3206 3248 3374 3386 3420 3433 3447 3490 3494 3518 3532 3540 3594 3621 3671 3698 3740 3779 3821 3939 3963
Ref Ref C G T A G G C T T A AC C C A A A A C T A T C A T A C T T G G T C A C A T C T A A G C C A T T T G T T A G G C G G G A T T
A. m. cyanoptera A. m. cyanoptera
SL8 SL8 T A T A G A T C C A A T G AACTT G G T C C G C T A C G T C T A A T T A T C C C C G A A T C G T C C A C T G A A T G A G A C T
033 033 T A T A A A T C C G A T G AACTT G G T C C G C T A C G T C T G A T T A C C C C C G G A C C G T C T A C T G A A T G A G A C T
19s/m 19s/m T A T G A A C C C A A T G AACTT G G T C C G C T A C G T C T G A T T A C C C C C G G A C T G T C T A C T A/G A G T A A G G C C
ZM19 ZM19 T A T G A G T T T G A T G AACTT G G T T C G C T G C G T C C A A T T A C C C T C G G A T C G T C T A C C G A G T G G G A C C
028 028 T A T A A A T T T G A T G AACTT G G T T C G C T G C G T C C A A T T A T C C T C G G A T C G T C T A C C G A A T A A A A C C
4531 4531 C A T A G A C C C A A T G AACTT G G T C C G C T A C G T C C G A T T A C C C C C G A A T T G T C T A C C G A G T G A G G C C
9021 9021 C A T A A A C C C A A T G AACTT G G T T C G C T A C G T C T G A T T A C C C C C G G A T C G T C T A C T G A A T G A A G C T
062 062 C A T A A G C C C A A T G AACTT G G T C C G C T G C G T C T A A T T A C C C C C G A A C C G T C T A C T G A G T A A A A C T
065 065 C A T A G G C C C A A T G AACTT G G T C C G C T A C G T C T A A T T A C C C C C G A A C C G T C T A C T G A G T G A G A C T
9780 9780 C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T A C C C C C G A A C C G T C T A C T G A G T G A G A C T
ZM16 ZM16 T A T G A A T T T G A T G AACTT G G T T C G C T A T G T C T A G T C G C C C T C G G A C T G T C T A C C A/G A G T G A A A C C
101 101 C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T
119 119 C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T
363 363 C A T A G G C C C A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T
A. m. macao A. m. macao
ZM13 ZM13 T G C A G A T T C A AC C A A G A C C C G T C G T A C T T A G C C A C A T C T A A A C T A T T T G T T A G G C G G G A T C
CC6 CC6 T G C A G A T T T G AC C A A A/G A C C T A T C G T A C T T A G T C G C A T C T A A A C T A C T T G T T A/G G G C G G A G T C
SL5 SL5 T G C A A A T T T G AC C A A G G C T T A T C/T A T A C T T G G T C A T A C C C A A A C T A T T T G T T A G A C A G G G T C
ZM10 ZM10 C G C G A G T T C A AC C A A G A C C C A T C A T A C T T G G C C A C C T C T A A A C T A C T T G T T A/G G A C A G G A T T
CC7 CC7 T G T A G A T T T A AC C A A A/G A C T T A T C/T A T A C T T A G T C A C A C C C A A A C T A T T T G T T A G G C G G G A T T
046 046 C G T A G A C T T A AC C A A G G C T C A T C A T A C T T G G T C A C A T C T A A G C C A T T T G T T A G G C G G G A T T
024 024 C G T A G A C T T A AC C A A A/G A C C C G T C/T G T A C T T G G T C A C A T C T A A A C T A T T T G T T A G G C G G G G T T
Hybrids Hybrids
043 043 C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T A C C C C C G A A C C G T C T A C C G A G T G G G G C T
1022 1022 C G T A G A C T T A AC C A A G A C C T A T C G T A C T T G G C C A C A T C T A A A C T A T T T G T T A G G C G G G A T T
CO1(1548bp) ASP(69bp) CO2(684bp) LYS(68bp) ATP8(168bp)
5362-6909 6981-7049 7052-7735 7737-7804 7806-7973
5433 5478 5490 5622 5655 5709 5748 5794 5931 5940 5952 5953 6078 6156 6207 6468 6474 6552 6592 6651 6756 6981 7033 7162 7192 7312 7413 7429 7576 7591 7616 7642 7649 7775 7785 7786 7914 (Cont)
5021-
mito3 mito3 412 457 469 601 634 688 727 773 910 919 931 932 1057 1135 1186 1447 1453 1531 1571 1630 1735 1960 2012 2141 2171 2291 2392 2408 2555 2570 2595 2621 2628 2754 2764 2765 2893
Ref Ref A A A C C A T G A G C C C G A A C C G A C G C G A A T C A G G C G T A A A
A. m. cyanoptera A. m. cyanoptera
SL8 SL8 G G A T T G C G T A T C T G G G C T A G C A T G A A T T A A A C G T AG G A
033 033 G G A T T G C G T A T C T G G G T T A G C A T G A G T T G A G C A T A G C
19s/m 19s/m G G A T T G C G T A C T T G G G T T A G C A C A A G T T A A G T G T AG G C
ZM19 ZM19 G G A T T G C G T A C T T G G G T T A G C A T G A G C T G A G T G C A G C
028 028 G G G T T G C A A A T T C G A A T T A G C A C G G G T T G G G C G T AG G A
4531 4531 G G G T T A C G A A C C T G G G T T A G T A T A A A T T G A G C A T AG G A
9021 9021 G G G T T A C A A A C C C G G A T T A G C A C A A A T T A A G C A T AG G A
062 062 G G G T T A C A T A T C C G A A T T A G C A T G G A T T A G G C A T AG G A
065 065 G G G T T G C G T A C C T G A G C T A G T A T G A A T T A A G C A T A G A
9780 9780 G G A T T G C G T A T C C G A A C T A G C A T G A A C T A A G C G T A G A
ZM16 ZM16 G G A T T G C A A A T C C A A G C T A G C A C G A A T T A A G T G T AG G A
101 101 G G A T T A C G A A C C C G A A C T A G C A T G A A T T A G G C A T A G A
119 119 G G A T T A C G A A C C C G A A C T A G C A T G A A T T A G G C G T AG G A
363 363 G G A T T A C G A A C C C G A A C T A G C A C G A A T T A G G C G T A G A
A. m. macao A. m. macao
ZM13 ZM13 G G A T T G C G A G T C T G A A C C G A T G T G A A T C A G G T G T AG A A
CC6 CC6 A A A C C A T G A G T C C G A A C C G A C G C G A G T T A G G T G T A A A
SL5 SL5 G G A T T G C G T G C C C G A G C C G A T G C G A G T T A A G C G T A A A
ZM10 ZM10 G G A T T G C G A G C C C G A A C C G A T G C G A A T C A G G T G T AG A A
CC7 CC7 G G A T T G C G T G C C T G A G C C G A T G T G A G T C A A G C G T A A A
046 046 A A A C C A T G A G C C C G A A C C G A C G C G A A T T A G G C G T A A A
024 024 A A A C C A T G A G C C C G A A C C G A C G C G A A T C A G G C G T AG A A
Hybrids Hybrids
043 043 G G G T T A C G T A C C C G G A C T A G T A T G G G C T G G G C A T AG G A
1022 1022 A A A C C A T G A G C C C G A G C C G A C G C G A A T C A G G C G T A A A
CO2(684bp) LYS(68bp) ATP8(168bp) ATP6(683bp) CO3(782bp) ND3(350bp) ND4L(297bp) ND4(1393bp)
7052-7735 7737-7804 7806-7973 7964-8646 8646-9427 9497-9846 9918-10214 10208-11600
7649 7775 7785 7786 7914 7970 8057 8087 8110 8195 8219 8306 8516 8797 8830 8836 8883 9002 9028 9220 9265 9510 9539 9543 9579 9581 9632 9667 9701 9777 9987 9990 10047 10056 10108 10122 10143 10293 10295 10362 10430 10766 10850 10859 10898 10931 10969 10983 10984 11057 11087 11102 11177 11222 11327 11370 11387 11411 11441 11472 11481 11531
mito4 7622- mito4 27 153 163 164 292 348 435 465 488 573 597 684 894 1175 1208 1214 1261 1380 1406 1598 1643 1888 1917 1921 1957 1959 2010 2045 2079 2155 2365 2368 2425 2434 2486 2500 2521 2671 2673 2740 2808 3144 3228 3237 3276 3309 3347 3361 3362 3435 3465 3480 3555 3600 3705 3748 3765 3789 3819 3850 3859 3909
Ref Ref G T A A A C A G T A C A A G G A G G A C T T C G A T G T G A A T A A G C A C T A T A G G A T G A G G C A T T G G T A G G C A
A. m. cyanoptera A. m. cyanoptera
SL8 SL8 G T AG G A C G G T G C G A G A G G G A T T T C A A C A TC G G A C A A A T G C C G T G A A G T G A A A C G T C A A C A A A C G
033 033 A T A G C C A A T G T G A G A G G A A T T T C A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C G A A T G
19s/m 19s/m G T AG G C C A A T/C G T G A G A G A A A T G T T A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C G A A C G
ZM19 ZM19 G C A G C C A A T G T G A G A G G A A T G T T A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C A A A C G
028 028 G T AG G A C A G T/C G C G A G A G G G G T G C C A A C A TC G G A C A A A T G C C G T G A A G T G A A A C G C C A A C A A A T G
4531 4531 A T AG G A C G G T G T G A G A G A G A T G C T A G C A T A G A T A A G T A C C G T G A G G C A T A A C G T C A A C A A A C G
9021 9021 A T AG G A C G G T G C G A G A G A G A T G C C A A C A T G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G
062 062 A T AG G A T G G T G C G G A A G G G G T G C C A A C A T G G A T A A G T A T C G C G A G G T G T A A T G T C A A C A A A C G
065 065 A T A G A C A G T/C G C G A G A G G G A T G C C A A C A T G G G C A A/G A T G C C G T G A G G T G A A A C G T C A A C A A A C G
9780 9780 G T A G A C A G T G C G A G A G G G A T G C C A A C A T G G G T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G
ZM16 ZM16 G T AG G A T A G T G C G G A A G G G A T G C C A A C A TC G G A T A A G T A T C G C G A G G T G T A A T G C C A A C A A A C G
101 101 A T A G A C A G T G C G A G A G G G A T T T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G
119 119 G T AG G A C A G T G C G A G A G G G A T G T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G
363 363 G T A G A C A G T G C G A G A G G G A T G T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G
A. m. macao A. m. macao
ZM13 ZM13 G T AG A A C A G T A C A A G G A G G A C G T C G A T G TC G A A C A A A C G C T A T A G A A T G A G A C A C T A A C A A G T A
CC6 CC6 G T A A A C A G T/C A C A A A G A G G A C T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G A C A T T G G T A G G C A
SL5 SL5 G T A A A C A G T A T A A G G A G A A C G T T G G T G TC A A G T A A G C A C T A T A G A A C A T G A C A C T A A C A A G C A
ZM10 ZM10 G T AG A A C A G T A C A A G G A G G A T G C C G A T G TC G A G T A A G C A C T A T A G A A T G A G A C A C T A G T A G A C A
CC7 CC7 G T A A A C A G T A C A A G G A G A A C G C C G A T G TC G A A C A A/G A C G C T A T A G A A T G A G G C A C T A A T A A A C A
046 046 G T A A A C A G T A C A A G G A G A A T T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G A C A T T G G T A G G C A
024 024 G T AG A A C A G T A C A A G G A G A A C T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G G C A T T G G T A G G C A
Hybrid Hybrid
043 043 A T AG G A C G G T G C G A G A G G G G T G C C A A C A TC G G A T A A G T A C C G T G A G G T G T A A C G T C G G T G G G C A
1022 1022 G T A A A T A G T A T A A G G A G G A C G T T G A T G TC G A G C A A/G A C G C T A T A G G A T A A G G C G T T A A C A A A C A
ND4(1393bp) ND5(1815bp) CytB(1140bp)
10208-11600 11806-13620 13632-14771
11079- 11222 11327 11370 11387 11411 11441 11472 11481 11531 11836 11851 11896 11956 12085 12103 12106 12163 12361 12505 12709 12718 12739 12859 12922 12929 12967 13042 13135 13177 13201 13254 13319 13424 13480 13579 13743 13756 13776 13887 14022 14025 14031 14040 14097 14139 14224 14317 14373 14466 14541 14616 (Cont)
mito5 mito5 144 249 292 309 333 363 394 403 453 758 773 818 878 1007 1025 1028 1085 1283 1427 1631 1640 1661 1781 1844 1851 1889 1964 2057 2099 2123 2176 2241 2346 2402 2501 2665 2678 2698 2809 2944 2947 2953 2962 3019 3061 3146 3239 3295 3388 3463 3538
Ref Ref T G G T A G G C A A G T G C A C C A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A
A. m. cyanoptera A. m. cyanoptera
SL8 SL8 C A A C A A A C G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C T A C T T A A
033 033 C A A C G A A T G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C T A C T T A A
19s/m 19s/m C A A C G A A C G A G C G T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C C A C T T A A
ZM19 ZM19 C A A C A A A C G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C C A C T T A A
028 028 C A A C A A A T G G A C A T G T T G G T C A T C A A A G G T A A A C A A A C T T C G C C T A T C T A A
4531 4531 C A A C A A A C G G A C A T G T C G A A T G T C A A A G G C A A A C A A A C T T C A T C T A C T T A A
9021 9021 C A A C A A A C G G A C A T G T T G A A T G T C A A A G G C A A A C A A G C C T C A T C T G C T T G G
062 062 C A A C A A A C G G A C A T G T T G G T C A T C A A A G G T A A A C A A A C T T C A T C T G T C T A A
065 065 C A A C A A A C G G A C A T G T T G A T T G T C A A A G G C A A A C A G G C C C C A T C T G C T T A G
9780 9780 C A A C A A A C G G A C A T G T C G A T T G T C A G A G A C A A G C G A A C T T C G T C T A C T T A A
ZM16 ZM16 C A A C A A A C G A G C A T G T T G G T T G T C A A A G G C A A A C A G G C C C C G C C T A C T T A G
101 101 C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A T C T A T C T A A
119 119 C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A C C T A T C T A A
363 363 C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A T C T A T C T A A
A. m. macao A. m. macao
ZM13 ZM13 T A A C A A G T A G A T G C A C T A A T T G C T G A G A G C C T A T A A G T C T T A T T C G C T C A A
CC6 CC6 T G G T A G G C A A G T G C A C C A A T T G C T G A G A G C C T A T A A A T T T T A T T T A C C C A A
SL5 SL5 T A A C A A G C A A A T G C A C T A A T T G C T G A G A G C C T A T A A G T C T T A T T T A C T C A G
ZM10 ZM10 T A G T A G A C A A G T A C A C T A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C G A
CC7 CC7 T A A T A A A C A G G T A C A C C A A T T G C T G G G A A C C T G T G A A T T T T A T T T A C T C G A
046 046 T G G T A G G C A A G T G C A C C A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A
024 024 T G G T A G G C A G A T A C A C C A A T T G C T G A G A G C C T A T A A A T T T T A T T T A C T C A A
Hybrid Hybrid
043 043 C G G T G G G C A G A C A T G T T G A A T G T C A A A G G C A A A C A A G C C T C A T T C G C T C G A
1022 1022 T A A C A A A C A A A T A C A C C A A T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A
THR(68bp) ND6(515bp)
14772-14839 14917-15431
14234- 14373 14466 14541 14616
mito6
Ref C C A A
A. m. cyanoptera
SL8 T T A A
033 T T A A
19s/m T T A A
ZM19 T T A A
028 C T A A
4531 T T A A
9021 T T G G
062 C T A A
065 T T A G
9780 T T A A
ZM16 T T A G
101 C T A A
119 C T A A
363 C T A A
A. m. macao
ZM13 T C A A
CC6 C C A A
SL5 T C A G
ZM10 C C G A
CC7 T C G A
046 C C A A
024 T C A A
Hybrids
043 T C G A
1022 C C A A
BIBLIOGRAPHY
Agilent Technologies. "2100 Bioanalyzer Instruments." from
http://www.genomics.agilent.com/en/product.jsp?cid=AG-PT-106&_requestid=27387.
Aitken, S. (2006). "DNA Barcoding: Fast-Tracking Species Identification." Biodiversity 7(3-4):
71-79.
Akey, J. M. (2003). "The Effect of Single Nucleotide Polymorphism Identification Strategies on
Estimates of Linkage Disequilibrium." Molecular Biology and Evolution 20(2): 232-242.
Álvarez, I. and J. F. Wendel (2003). "Ribosomal ITS sequences and plant phylogenetic
inference." Molecular Phylogenetics and Evolution 29(3): 417-434.
Amaya-Villarreal, A. M., A. Estrada and N. Vargas-Ramirez (2015). "Use of wild foods during
the rainy season by a reintroduced population of scarlet macaws (Ara macao cyanoptera) in
Palenque, Mexico." Tropical Conservation Science 8(2): 455-478.
AMCA - American Mosquito Control Association (2014). "Mosquito-Borne Diseases."
Aranishi, F. (2006). "A novel mitochondrial intergenic spacer reflecting population structure of
Pacific oyster." J Appl Genet 47(2): 119-123.
Arif, I. A. and H. A. Khan (2009). "Molecular markers for biodiversity analysis of wildlife
animals: a brief review." Animal Biodiversity and Conservation 32(1): 9-17.
Avise, J. C. (2000). Phylogeography: the history and formation of species, Harvard University
Press.
Backstrom, N., S. Fagerberg and H. Ellegren (2008). "Genomics of natural bird populations: a
gene-based set of reference markers evenly spread across the avian genome." Mol Ecol 17(4):
964-980.
Baldwin, B. G., M. J. Sanderson, J. M. Porter, M. F. Wojciechowski, C. S. Campbell and M. J.
Donoghue (1995). "The ITS Region of Nuclear Ribosomal DNA: A Valuable Source of Evidence
on Angiosperm Phylogeny." Annals of the Missouri Botanical Garden 82(2): 247.
Barker, F. K., M. K. Benesh, A. J. Vandergon and S. M. Lanyon (2012). "Contrasting
evolutionary dynamics and information content of the avian mitochondrial control region and
ND2 gene." PLoS One 7(10): e46403.
Beckman Coulter. "Agencourt AMPure XP Beads." from
https://www.beckmancoulter.com/wsrportal/bibliography?docname=AMPureXPvsAMPure.pdf.
Beckman Coulter (2013). Agencourt AMPure XP PCR Purification. Beckman Coulter.
Benders-Hyde, E. (2002). "Southeast Asian Forest."
Birchler, J. A., H. Yao and S. Chudalayandi (2006). "Unraveling the genetic basis of hybrid
vigor." Proc Natl Acad Sci U S A 103(35): 12957-12958.
BirdLife International (2013). "Ara macao." IUCN Red List of Threatened Species.
International Union for Conservation of Nature. Retrieved 26 November, 2013.
Bohle, H. M. and T. Gabaldon (2012). "Selection of marker genes using whole-genome DNA
polymorphism analysis." Evol Bioinform Online 8: 161-169.
Boore, J. L. (1999). "Animal mitochondrial genomes." Nucleic Acids Research 27(8): 1767-
1780.
Brightsmith, D., J. Hilburn, A. del Campo, J. Boyd, M. Frisius, R. Frisius, D. Janik and F.
Guillen (2005). "The use of hand-raised psittacines for reintroduction: a case study of scarlet
macaws (Ara macao) in Peru and Costa Rica." Biological Conservation 121(3): 465-472.
Buckler, E. S., IV, A. Ippolito and T. P. Holtsford (1997). "The Evolution of Ribosomal DNA:
Divergent Paralogues and Phylogenetic Implications." Genetics 145(3): 821-832.
Butler, R. (2014). "Tropical Rainforests of the World." from
http://rainforests.mongabay.com/0101.htm.
Cantu, J. C. (2014). "The Scarlet Macaw is Back in the Gulf of Mexico!", from
http://www.defendersblog.org/2014/07/scarlet-macaw-back-gulf-mexico/.
Carson, J. F., J. Watling, F. E. Mayle, B. S. Whitney, J. Iriarte, H. Prumers and J. D. Soto (2015).
"Pre-Columbian land use in the ring-ditch region of the Bolivian Amazon." The Holocene 25(8):
1285-1300.
Clarridge, J. E., 3rd (2004). "Impact of 16S rRNA gene sequence analysis for identification of
bacteria on clinical microbiology and infectious diseases." Clin Microbiol Rev 17(4): 840-862.
Coleman, A. W. (2013). "Analysis of mammalian rDNA internal transcribed spacers." PLoS One
8(11): e79122.
Collins, F. S., L. D. Brooks and A. Chakravarti (1998). "A DNA polymorphism discovery
resource for research on human genetic variation." Genome Res 8(12): 1229-1231.
Crnokrak, P. and D. A. Roff (1999). "Inbreeding depression in the wild." Heredity 83(3): 260-
270.
Dasmahapatra, K. K. and J. Mallet (2006). "Taxonomy: DNA barcodes: recent successes and
future prospects." Heredity (Edinb) 97(4): 254-255.
De Mendonca Dantas, G. P., R. Godinho, J. S. Morgante and N. Ferrand (2009). "Development
of new nuclear markers and characterization of single nucleotide polymorphisms in kelp gull
(Larus dominicanus)." Mol Ecol Resour 9(4): 1159-1161.
Desjardins, P. and R. Morais (1990). "Sequence and gene organization of the chicken
mitochondrial genome. A novel gene order in higher vertebrates." J Mol Biol 212(4): 599-634.
Drees, K. (2010). "Zoo Blog: Scarlet Macaws." from
http://www.blankparkzoo.com/index.cfm/18193/1202/scarlet_macaws.
Duchene, S., F. I. Archer, J. Vilstrup, S. Caballero and P. A. Morin (2011). "Mitogenome
phylogenetics: the impact of using single regions and partitioning schemes on topology,
substitution rate and divergence time estimation." PLoS One 6(11): e27138.
Dupuis, J. R., A. D. Roe and F. A. Sperling (2012). "Multi-locus species delimitation in closely
related animals and fungi: one marker is not enough." Mol Ecol 21(18): 4422-4436.
Eickbush, T. H. (2002). "R2 and Related Site-Specific Non-Long Terminal Repeat
Retrotransposons." 813-835.
Eickbush, T. H. and D. G. Eickbush (2007). "Finely orchestrated movements: evolution of the
ribosomal RNA genes." Genetics 175(2): 477-485.
Ellegren, H. (2014). "Genome sequencing and population genomics in non-model organisms."
Trends Ecol Evol 29(1): 51-63.
Estrada, A. (2014). "Reintroduction of the scarlet macaw (Ara macao cyanoptera) in the tropical
rainforests of Palenque, Mexico: project design and first year progress." Tropical Conservation
Science 7(3): 342-364.
Feinstein, J. and J. Cracraft (2004). "Solving a sequencing problem in the vertebrate
mitochondrial control region using phylogenetic comparisons." DNA Seq 15(5-6): 374-377.
Feliner, G. N. and J. A. Rosselló (2012). "Concerted Evolution of Multigene Families and
Homoeologous Recombination." 171-193.
Fitch, W. M. and E. Margoliash (1967). "Construction of phylogenetic trees." Science
155(3760): 279-284.
Forshaw, J. M. and W. Cooper (2006). Parrots of the World, Avian Publications.
Frankham, R., J. Ballou and D. Briscoe (2010). Introduction to Conservation Genetics.
Ganley, A. R. and T. Kobayashi (2007). "Highly efficient concerted evolution in the ribosomal
DNA repeats: total rDNA repeat variation revealed by whole-genome shotgun sequence data."
Genome Res 17(2): 184-191.
Gilbert, S. F. (2013). Developmental Biology, Sinauer Associates.
Goh, K. J., C. T. Tan, N. K. Chew, P. S. Tan, A. Kamarulzaman, S. A. Sarji, K. T. Wong, B. J.
Abdullah, K. B. Chua and S. K. Lam (2000). "Clinical features of Nipah virus encephalitis
among pig farmers in Malaysia." N Engl J Med 342(17): 1229-1235.
Grechko, V. V., L. V. Fedorova, D. M. Riabinin, D. G. Chobanu, S. A. Kosushkhin and I. S.
Darevskii (2006). "Molecular markers of nuclear DNA in the study of evolution and speciation
process in an example of "Lacerta agilis complex" (Sauria: Lacertidae)." Mol. Biol. (Mosk)
40(1): 61-73.
Haig, S. M., L. Wennerberg, T. D. Mullins, E. D. Forsman and P. Trail (2004). "Genetic
identification of spotted owls, barred owls, and their hybrids: Legal implications of hybrid
identity." Conservation Biology 18(5): 1347-1357.
Hammond, J. B. W., G. Spanswick and J. A. Mawn (1996). "Extraction of DNA from preserved
animal specimens for use in randomly amplified polymorphic DNA analysis." Analytical
Biochemistry 240: 298-300.
Hebert, P. D. and T. R. Gregory (2005). "The promise of DNA barcoding for taxonomy." Syst
Biol 54(5): 852-859.
Hebert, P. D., E. H. Penton, J. M. Burns, D. H. Janzen and W. Hallwachs (2004). "Ten species in
one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes
fulgerator." Proc Natl Acad Sci U S A 101(41): 14812-14817.
Hebert, P. D., S. Ratnasingham and J. R. deWaard (2003). "Barcoding animal life: cytochrome c
oxidase subunit 1 divergences among closely related species." Proc Biol Sci 270 Suppl 1: S96-
99.
Hebert, P. D., M. Y. Stoeckle, T. S. Zemlak and C. M. Francis (2004). "Identification of Birds
through DNA Barcodes." PLoS Biol 2(10): e312.
Heslop-Harrison, J. S. and T. Schwarzacher (2011). "Organisation of the plant genome in
chromosomes." Plant J 66(1): 18-33.
Holstein, N. (2006). "Eucaryot rdna.png."
Hughes, A. L. and M. A. Hughes (2007). "Coding sequence polymorphism in avian
mitochondrial genomes reflects population histories." Mol Ecol 16(7): 1369-1376.
Illumina, I. "MiSeq Reporter Software Documentation."
Illumina, I. "MiSeq Reporter workflow."
Illumina, I. "Nextera XT DNA Library Preparation Kit." from
http://www.illumina.com/products/nextera_xt_dna_library_prep_kit.html.
Illumina, I. (2015). "Nextera XT DNA Library Preparation Kit." from
http://www.illumina.com/products/nextera_xt_dna_library_prep_kit.html.
International Union for Conservation of Nature. (2015). "Conservation successes overshadowed
by more species declines – IUCN Red List update." from
http://www.iucn.org/news_homepage/?21561/Conservation-successes-overshadowed-by-more-
species-declines--IUCN-Red-List-update.
IUCN (2013). IUCN Red List of Threatened Species. Version 2013.2. www.iucnredlist.org.
Juniper, T. and M. Parr (1998). Parrots: A Guide to Parrots of the World, Yale University Press.
Kearse, M., R. Moir, A. Wilson, S. Stones-Havas, M. Cheung, S. Sturrock, S. Buxton, A.
Cooper, S. Markowitz, C. Duran, T. Thierer, B. Ashton, P. Meintjes and A. Drummond (2012).
"Geneious Basic: an integrated and extendable desktop software platform for the organization
and analysis of sequence data." Bioinformatics 28(12): 1647-1649.
Kiesler, K. (2014). Next Generation Sequencing on the Ion Torrent PGM: New SNP Typing
Applications, National Institute of Standards and Technology.
Kilpert, F. and L. Podsiadlowski (2006). "The complete mitochondrial genome of the common
sea slater, Ligia oceanica (Crustacea, Isopoda) bears a novel gene order and unusual control
region features." BMC Genomics 7: 241.
King, R. C. and W. D. Stansfield (1990). Dictionary of Genetics, Oxford University Press.
Kiss, L. (2012). "Limits of nuclear ribosomal DNA internal transcribed spacer (ITS) sequences
as species barcodes for Fungi." Proc Natl Acad Sci U S A 109(27): E1811; author reply E1812.
Knowles, L. L. and B. C. Carstens (2007). "Delimiting species without monophyletic gene
trees." Systematic Biology 56(6): 887-895.
Kovarik, A., M. Dadejova, Y. K. Lim, M. W. Chase, J. J. Clarkson, S. Knapp and A. R. Leitch
(2008). "Evolution of rDNA in Nicotiana allopolyploids: a potential link between rDNA
homogenization and epigenetics." Ann Bot 101(6): 815-823.
Kurtzman, C. P. and C. J. Robnett (1998). "Identification and phylogeny of ascomycetous yeasts
from analysis of nuclear large subunit (26S) ribosomal DNA partial sequences." Antonie van
Leeuwenhoek 73(4): 331-371.
Lachance, J. and S. A. Tishkoff (2013). "SNP ascertainment bias in population genetic analyses:
why it is important, and how to correct it." Bioessays 35(9): 780-786.
Lande, R. (1988). "Genetics and demography in biological conservation." Science 241(4872):
1455-1460.
Li, W.-H. (1997). Molecular Evolution. Sunderland, Massachusetts, Sinauer Associates.
Life Technologies. "Life Technologies -- Ion Torrent." from
http://www.biomedcentral.com/1471-2148/12/46.
Marsden, S. J. and J. D. Pilgrim (2002). "Factors influencing the abundance of parrots and
hornbills in pristine and disturbed forests on New Britain, PNG." Ibis 145(1): 45-53.
Matyasek, R., S. Renny-Byfield, J. Fulnecek, J. Macas, M. A. Grandbastien, R. Nichols, A.
Leitch and A. Kovarik (2012). "Next generation sequencing analysis reveals a relationship
between rDNA unit diversity and locus number in Nicotiana diploids." BMC Genomics 13: 722.
Mikhed, Y., A. Daiber and S. Steven (2015). "Mitochondrial Oxidative Stress, Mitochondrial
DNA Damage and Their Role in Age-Related Vascular Dysfunction." Int J Mol Sci 16(7):
15918-15953.
Mindell, D. P., M. D. Sorenson and D. E. Dimcheff (1998). "An Extra Nucleotide is not
Translated in Mitochondrial ND3 of Some Birds and Turtles." Molecular Biology and Evolution
15(11): 1568-1571.
Morin, P. A., G. Luikart, R. K. Wayne and the SNP workshop group (2004). "SNPs in ecology,
evolution and conservation." Trends in Ecology & Evolution 19(4): 208-216.
Scheer, R. and D. Moss (2009). "Earth Talk." from
http://www.emagazine.com/earth-talk-archive/week-of-10-11-09.
Myers, M. C. and C. Vaughan (2004). "Movement and behavior of scarlet macaws (Ara macao)
during the post-fledging dependence period: implications for in situ versus ex situ management."
Biological Conservation 118(3): 411-420.
Nabholz, B., N. Uwimana and N. Lartillot (2013). "Reconstructing the phylogenetic history of
long-term effective population size and life-history traits using patterns of amino acid
replacement in mitochondrial genomes of mammals and birds." Genome Biol Evol 5(7): 1273-
1290.
National Park Service. (2015). "Wolf Restoration." from
http://www.nps.gov/yell/learn/nature/wolf-restoration.htm.
Nei, M. and A. P. Rooney (2005). "Concerted and birth-and-death evolution of multigene
families." Annu Rev Genet 39: 121-152.
New England Biolabs. "NEBNext Fast DNA Fragmentation & Library Prep Set for Ion
Torrent." from https://www.neb.com/products/e6285-nebnext-fast-dna-fragmentation-and-
library-prep-set-for-ion-torrent.
Nosil, P. and D. Schluter (2011). "The genes underlying the process of speciation." Trends Ecol
Evol 26(4): 160-167.
O'Neill, P. (2013). "Flying rainbows: the scarlet macaw returns to Mexico." from
http://news.mongabay.com/2013/06/flying-rainbows-the-scarlet-macaw-returns-to-mexico/.
Ohta, T. (2009). "The mutational load of a multigene family with uniform members." Genetical
Research 53(02): 141.
Olson, S. (1989). Shaping the Future: Biology and Human Values.
Pacheco, M. A., F. U. Battistuzzi, M. Lentino, R. F. Aguilar, S. Kumar and A. A. Escalante
(2011). "Evolution of modern birds revealed by mitogenomics: timing the radiation and origin of
major orders." Mol Biol Evol 28(6): 1927-1942.
Paxton, E. H., M. K. Sogge, T. C. Theimer, J. Girard and P. Keim (2008). "Using molecular
markers to resolve a subspecies boundary: the northern boundary of the Southwestern Willow
Flycatcher in the four-corner states." U.S. Geological Survey Open-File Report 2007-1117, 20 p.
Promega. (2015). "Wizard SV Gel and PCR Clean-Up System." from
https://www.promega.com/products/dna-and-rna-purification/dna-fragment-purification/wizard-sv-gel-and-pcr-clean_up-system/.
Prychitko, T. M. and W. S. Moore (2000). "Comparative Evolution of the Mitochondrial
Cytochrome b Gene and Nuclear β-Fibrinogen Intron 7 in Woodpeckers." Molecular Biology
and Evolution 17(7): 1101-1111.
Questiau, S., M. C. Eybert, A. R. Gaginskaya, L. Gielly and P. Taberlet (1998). "Recent
divergence between two morphologically differentiated subspecies of bluethroat (Aves:
Muscicapidae: Luscinia svecica) inferred from mitochondrial DNA sequence variation."
Molecular Ecology 7(2): 239-245.
Ralls, K. and J. Ballou (1983). "Extinction: Lessons from zoos." BIOL. CONSERV. SER.: 164-
184.
Reumers, J., P. De Rijk, H. Zhao, A. Liekens, D. Smeets, J. Cleary, P. Van Loo, M. Van Den
Bossche, K. Catthoor, B. Sabbe, E. Despierre, I. Vergote, B. Hilbush, D. Lambrechts and J. Del-
Favero (2012). "Optimized filtering reduces the error rate in detecting genomic variants by short-
read sequencing." Nat Biotechnol 30(1): 61-68.
Ripple, W. J. and R. L. Beschta (2012). "Trophic cascades in Yellowstone: The first 15 years
after wolf reintroduction." Biological Conservation 145(1): 205-213.
Romanov, M. N., E. M. Tuttle, M. L. Houck, W. S. Modi, L. G. Chemnick, M. L. Korody, E. M.
Mork, C. A. Otten, T. Renner, K. C. Jones, S. Dandekar, J. C. Papp, Y. Da, NISC Comparative
Sequencing Program, E. D. Green, V. Magrini, M. T. Hickenbotham, J. Glasscock, S. McGrath,
E. R. Mardis and O. A. Ryder (2009). "The value of avian genomics to the conservation of
wildlife." BMC Genomics 10 Suppl 2: S10.
Saccheri, I., M. Kuussaari, M. Kankare, P. Vikman, W. Fortelius and I. Hanski (1998).
"Inbreeding and extinction in a butterfly metapopulation." Nature 392(6675): 491-494.
Sanger, F., S. Nicklen and A. R. Coulson (1977). "DNA sequencing with chain-terminating
inhibitors." Proc Natl Acad Sci U S A 74(12): 5463-5467.
Schlötterer, C. and B. Harr (2002). "Single nucleotide polymorphisms derived from ancestral
populations show no evidence for biased diversity estimates in Drosophila melanogaster."
Molecular Ecology 11(5): 947-950.
Schlötterer, C. and D. Tautz (1994). "Chromosomal homogeneity of Drosophila ribosomal DNA
arrays suggests intrachromosomal exchanges drive concerted evolution." Current Biology 4(9):
777-783.
Seabury, C. M., S. E. Dowd, P. M. Seabury, T. Raudsepp, D. J. Brightsmith, P. Liboriussen, Y.
Halley, C. A. Fisher, E. Owens, G. Viswanathan and I. R. Tizard (2013). "A multi-platform draft
de novo genome assembly and comparative analysis for the Scarlet Macaw (Ara macao)." PLoS
One 8(5): e62415.
Shields, G. F. and A. C. Wilson (1987). "Subspecies of the Canada Goose (Branta canadensis)
Have Distinct Mitochondrial DNA's." Evolution 41(3): 662.
Sibley, C. G. and J. E. Ahlquist (1983). "Phylogeny and classification of birds based on the data
of DNA-DNA hybridization." Current Ornithology 1: 245-292.
Smith, L. M. and L. A. Burgoyne (2004). "Collecting, archiving and processing DNA from
wildlife samples using FTA databasing paper." BMC Ecol 4: 4.
Snyder, N. F. R., S. R. Derrickson, S. R. Beissinger, J. W. Wiley, T. B. Smith, W. D. Toone and
B. Miller (1996). "Limitations of Captive Breeding in Endangered Species Recovery."
Conservation Biology 10(2): 338-348.
Sorenson, M. D., J. C. Ast, D. E. Dimcheff, T. Yuri and D. P. Mindell (1999). "Primers for a
PCR-based approach to mitochondrial genome sequencing in birds and other vertebrates." Mol
Phylogenet Evol 12(2): 105-114.
CITES-Listed Species (2007). Scarlet Macaw. UNEP-World Conservation Monitoring Centre.
Steiner, C. C., A. S. Putnam, P. E. Hoeck and O. A. Ryder (2013). "Conservation genomics of
threatened animal species." Annu Rev Anim Biosci 1: 261-281.
Stoeckle, M. Y. (2003). "Taxonomy, DNA, and the Bar Code of Life." BioScience 53(9): 2-3.
Syed, F., H. Grunenwald and N. Caruccio (2009). "Next-generation sequencing library preparation:
simultaneous fragmentation and tagging using in vitro transposition." Nature Methods.
Tavares, E. S., A. J. Baker, S. L. Pereira and C. Y. Miyaki (2006). "Phylogenetic relationships
and historical biogeography of neotropical parrots (Psittaciformes: Psittacidae: Arini) inferred
from mitochondrial and nuclear DNA sequences." Syst Biol 55(3): 454-470.
ThermoScientific. "NanoDrop 2000 UV-Vis Spectrophotometer." from
http://www.nanodrop.com/Productnd2000overview.aspx.
ThermoScientific. "Qubit 2.0 Fluorometer." from
https://tools.thermofisher.com/content/sfs/manuals/mp32866.pdf.
ThermoScientific. "Savant™ SPD131DDA SpeedVac™ Concentrator." from
http://www.thermoscientific.com/content/tfs/en/product/savant-spd131dda-speedvac-concentrator.html.
ThermoScientific (2008). T009-Technical Bulletin (NanoDrop). New York, Worth Publishers.
Tourasse, N. J. (2000). "Selective Constraints, Amino Acid Composition, and the Rate of
Protein Evolution."
Iowa State University (2007). Nipah Virus Infection. The Center for Food Security and Public Health.
Urantowka, A. D. (2014). "Complete mitochondrial genome of Critically Endangered Blue-
throated Macaw (Ara glaucogularis): its comparison with partial mitogenome of Scarlet Macaw
(Ara macao)." Mitochondrial DNA.
Urantowka, A. D., T. Strzala and K. A. Grabowski (2014). "Complete mitochondrial genome of
endangered Maroon-fronted Parrot (Rhynchopsitta terrisi) - conspecific relation of the species
with Thick-billed Parrot (Rhynchopsitta pachyrhyncha)." Mitochondrial DNA 25(6): 424-426.
Waldschmidt, A. M., E. G. d. Barros and L. A. O. Campos (2000). "A molecular marker
distinguishes the subspecies Melipona quadrifasciata quadrifasciata and Melipona quadrifasciata
anthidioides (Hymenoptera: Apidae, Meliponinae)." Genetics and Molecular Biology 23(3): 609-
611.
Wang, Y., R. M. Tian, Z. M. Gao, S. Bougouffa and P. Y. Qian (2014). "Optimal eukaryotic 18S
and universal 16S/18S ribosomal RNA primers and their application in a study of symbiosis."
PLoS One 9(3): e90053.
Watanabe, T., M. Nishida, K. Watanabe, D. S. Wewengkang and M. Hidaka (2005).
"Polymorphism in Nucleotide Sequence of Mitochondrial Intergenic Region in Scleractinian
Coral (Galaxea fascicularis)." Mar Biotechnol (NY) 7(1): 33-39.
Westemeier, R. L., J. D. Brawn, S. A. Simpson, T. L. Esker, R. W. Jansen, J. W. Walk, E. L.
Kershner, J. L. Bouzat and K. N. Paige (1998). "Tracking the Long-Term Decline and Recovery
of an Isolated Population." Science 282(5394): 1695-1698.
Whatman, GE Healthcare Life Sciences (2015). "FTA/FTA Elute Sample Collection Cards and Kits." from
http://www.gelifesciences.com/webapp/wcs/stores/servlet/catalog/en/GELifeSciences-us/products/AlternativeProductStructure_17096/.
Zierdt-Warshaw, L. (2000). Encyclopedia of Environmental Science, Greenwood.