GENETIC CHARACTERIZATION OF CENTRAL AND SOUTH AMERICAN POPULATIONS OF SCARLET MACAW (Ara macao)

Dissertation Prepared for the Degree of

DOCTOR OF PHILOSOPHY

UNIVERSITY OF NORTH TEXAS

May 2016

APPROVED:

Robert C. Benjamin, Major Professor

Michael S. Allen, Committee Member

Lee E. Hughes, Committee Member

Sarah A. McIntire, Committee Member

Douglas D. Root, Committee Member

Arthur J. Goven, Chair of the Department of Biological Sciences

Costas Tsatsoulis, Dean of Toulouse Graduate School

Tracy Ann Kim

Kim, Tracy Ann. Genetic Characterization of Central and South American Populations of

Scarlet Macaw (Ara macao). Doctor of Philosophy (Molecular Biology), May 2016, 171 pp., 15

tables, 35 figures, references, 123 titles.

The wild populations of the Scarlet Macaw subspecies native to southern Mexico and

Central America, A. m. cyanoptera, have been drastically reduced over the last half century and

are now a major concern to local governments and conservation groups. Programs to rebuild

these local populations using captive bred specimens must be careful to reintroduce the native A.

m. cyanoptera, as opposed to the South American nominate subspecies (A. m. macao) or hybrids

of the two subspecies. Molecular markers for comparative genomic analyses are needed for

definitive differentiation. Here I describe the isolation and sequence analysis of multiple loci

from 7 pedigreed A. m. macao and 14 pedigreed A. m. cyanoptera specimens. The loci analyzed

include the 18S rDNA genes, the complete mitogenome as well as intronic regions of selected

autosomally-encoded genes. Although the multicopy 18S gene sequences exhibited 10%

polymorphism within all A. macao genomes, no differences were observed between any of the

21 birds whose genomes were studied. In contrast, numerous polymorphic sites were observed

throughout the 16,993 bp mitochondrial genomes of both subspecies. Although much of the

polymorphism was observed in the genomes of both subspecies, subspecies-specific alleles were

observed at a number of mitochondrial loci, including 12S, 16S, CO2 and ND3. Evidence of

possible subspecies-specific alleles was also found in three of four screened nuclear loci.

Collectively, these mitochondrial and nuclear loci can be used as the basis to distinguish A. m.

cyanoptera from the nominate subspecies, A. m. macao, as well as identify many hybrids, and

most importantly will contribute to further reintroduction efforts.

Copyright 2016

by

Tracy Ann Kim

ACKNOWLEDGEMENTS

I would like to thank Dr. Robert Benjamin for all of his support as my major professor.

He has provided invaluable hours of counseling and problem-solving. I would also like to thank

my doctoral committee for their encouragement throughout my project. I would like to thank the

many amazing friends, colleagues, students, and professors I have known at UNT.

I am so very grateful to my colleagues who provided the macaw blood samples, making

this project possible. Xcaret Ecoparque and Nature Preserve in Playa del Carmen, Mexico, and

private breeders from Cancun, Quintana Roo, Mexico, provided many samples, as did Dr.

Patricia Escalante of the National Autonomous University of Mexico, who generously assisted

me in locating validated scarlet macaw blood samples.

Thank you to Dr. Allen, who has been so steadfast in his support and for the constant

access to the wondrous equipment in his lab. Dr. Yan Zhyang has been tremendously helpful and

patient when I encroach upon her lab. I would like to thank Dr. Ron Mittler for his contagious

inspiration and bravery over this last year. I would like to thank Dr. David Visi for his tireless

assistance with sequencing platforms and always answering when I call. A huge thank you to

Richard Donagen-Quick for his brilliant troubleshooting prowess and for being our scientist for

four years and to Laci Adolfo who has been there for me in so many aspects of this project,

discussing science and innovation. She has been so incredibly hard-working and supportive and

always readily available to spill the tea. She is definitely the best discovery from my years on

this research project. I would like to thank my incredible husband, Brian McCormack, who has

been tireless with his patience and energy in this scientific endeavor. Also, I will never be able to

thank my kids enough for enduring so many growing years in a biology building and

continuously showing me how to see the wonders of the world.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER 1 INTRODUCTION

1.1 Conservation Issues

1.2 Reintroduction Programs

1.3 Molecular Analysis to Differentiate Species

1.4 New World Parrots

CHAPTER 2 MATERIALS AND METHODS

2.1 Obtaining Samples

2.2 DNA Isolation from Liquid Blood

2.3 DNA Isolation from FTA® Cards

2.4 18S Ribosomal DNA

2.5 Column-based Purification of PCR Products

2.6 Next Generation Sequencing Using the Illumina MiSeq® Platform

2.7 Mitochondrial Genome Comparison

2.8 Next Generation Sequencing Using the Ion Torrent® PGM™ Sequencing System (Life Technologies)

2.9 Mitochondrial DNA Sequence Data Analysis

2.10 Alignments to Reference Sequence

2.11 Differentiation on the Sequence Level

2.12 Identification of Candidate Nuclear Sequences for Polymorphism Screening

2.13 Restriction Digestion Analysis

2.14 Next Generation Sequencing Using the Illumina MiSeq® Platform

CHAPTER 3 RESULTS AND CONCLUSIONS

3.1 18S Analysis of Results

3.2 Mitochondrial Sequences: Analysis of Results

3.3 Nuclear Results

3.4 Off-Instrument Use of MiSeq® Reporter

APPENDIX EXTENDED RESULTS

BIBLIOGRAPHY

LIST OF TABLES

Table 1: Mitochondrial Primer Pairs Used to Amplify the Mitogenome of A. macao

Table 2: Preliminary Assessment of Candidate Nuclear Loci by RFLP

Table 3: Restriction Enzymes Used for Initial Search for Variability

Table 4: Nuclear Sequence Comparison Between Subspecies: 18S

Table 5: Characteristics of Two Subspecies of Scarlet Macaw

Table 6: Mitochondrial Comparison of A. macao cyanoptera, A. macao macao, and hybrids

Table 7: Mitochondrial Sequence Comparison Between Subspecies: 16S

Table 8: Mitochondrial Sequence Comparison Between Subspecies: 12S

Table 9: Mitochondrial Sequence Comparison Between Subspecies: CO2

Table 10: Mitochondrial Sequence Comparison Between Subspecies: ND3

Table 11: Mitochondrial Sequence Comparison Between Subspecies: CytB

Table 12: Nuclear Sequence Comparison Between Subspecies: AK1

Table 13: Nuclear Sequence Comparison Between Subspecies: RGS4

Table 14: Nuclear Sequence Comparison Between Subspecies: Vim

Table 15: Nuclear Sequence Comparison Between Subspecies: RAG1

LIST OF FIGURES

Figure 1: Benefits of Using Nuclear and Mitochondrial DNA Sequences for Comparative Analyses

Figure 2: Homogenization Model of Multicopy Gene Arrays

Figure 3: Visual Comparison of A. m. macao and A. m. cyanoptera

Figure 4: Map of Estimated A. macao Populations in Southern Mexico and Central America

Figure 5: Arrangement of rDNA Clusters on the Genome

Figure 6: Amplified Nuclear Region from 18S rDNA – 1 of 2

Figure 7: Amplified Nuclear Region from 18S rDNA – 2 of 2

Figure 8: Tagmentation® Procedure Used by Nextera XT Library Prep Kit®

Figure 9: TruSeq® Index Plate Guide

Figure 10: Electrophoretic Analysis of Mitochondrial Segment 1 Amplicons






Figure 16: Overlapping PCR Amplicons Encompassing Ara macao Mitochondrial Genome

Figure 17: Ion Torrent® Workflow for Sample Preparation

Figure 18: Ion Torrent® Sequencing Final Workflow

Figure 19: Amplified Nuclear Region from AK1 – 1 of 2

Figure 20: Amplified Nuclear Region from AK1 – 2 of 2

Figure 21: Amplified Nuclear Region from RAG1 – 1 of 2

Figure 22: Amplified Nuclear Region from RAG1 – 2 of 2

Figure 23: Amplified Nuclear Region from RGS4 – 1 of 2

Figure 24: Amplified Nuclear Region from RGS4 – 2 of 2

Figure 25: Amplified Nuclear Region from Vim – 1 of 2

Figure 26: Amplified Nuclear Region from Vim – 2 of 2

Figure 27: Electrophoretic Analysis of Restriction Digest of RGS4

Figure 28: Sequence Homology and Single Nucleotide Polymorphisms for 2 Individuals

Figure 29: Arrangement of Ribosomal DNA (rDNA) Clusters on the Genome

Figure 30: Ribosomal DNA Tandem Cluster Array

Figure 31: Illustration of Problem with Tandem Clusters of rDNA During Read Assembly

Figure 32: Phylogenetic tree showing relatedness of Ara macao from mitogenome domain I

Figure 33: Illumina MiSeq® Reporter Workflow

Figure 34: Illumina MiSeq® Reporter Run Summary Interface

Figure 35: Illumina MiSeq® Reporter Detailed Sample Analysis Interface

CHAPTER 1

INTRODUCTION

1.1 Conservation Issues

Wildlife conservation addresses the preservation of species and their habitats throughout

the world. Over the last century, species have continued to go extinct at an

alarming rate. It is currently estimated that 1,800 populations per hour, encompassing a wide

range of species, are being lost as a result of human-driven ecological change. This is an

unprecedented rate of biodiversity loss. While the current scientific concern over the loss of

biodiversity emphasizes preventing the loss of species, many of the benefits biodiversity confers

upon humanity are delivered through individual populations (Hughes and Hughes 2007). A

population is a group of individuals of the same species, living in a given location, which is

genetically different from other such groups. Population loss, therefore, is a more significant

measurement to be used when representing the changes to our world that directly affect us.

The International Union for Conservation of Nature (IUCN) is an internationally

recognized authority on the conservation status of the plant and animal species of the world.

Approximately every four years, the IUCN generates a revised “Red List” that documents the

differing threat levels faced by thousands of the world’s plant and animal species. The last such

report was issued in April 2015, and at that time the IUCN Red List assessed and reported on the

status of over 79,000 plant and animal species. The April 2015 Red List classified 41% of

amphibian species, 26% of mammalian species, and 13% of avian species as “under threat”

(International Union for Conservation of Nature 2015).

Biodiversity loss has effects beyond the loss of a particular species or subspecies. The

relationship between biodiversity and the overall health of an ecosystem is both closely

interwoven and multidimensional. Many human diseases originate in animal or environmental

reservoirs that are forced by habitat change to relocate to areas more densely populated by

humans and livestock. This sudden increase in habitat overlap causes humans to come into

contact with new bacteria and viruses not normally encountered. One example of the problems

caused by such an overlap occurred in 1998, when fires were used to clear rainforests for

agriculture in Indonesia. The extensive smoke forced fruit bat populations to relocate into

northern Malaysia. Pigs on the large commercial farms located there became ill after

consuming fruit later discovered to be contaminated by the bats’ saliva or urine (Goh, Tan et al.

2000). A large majority of the farmers who came into contact with the ailing livestock began to

report a range of disorders, from acute respiratory syndrome to fatal encephalitis. Although the

disease was extremely contagious among pigs, their mortality rate was very low. In humans,

however, 40% of the people who contracted the Nipah virus died. Since 1998, a dozen more

outbreaks have occurred in Bangladesh and India. These outbreaks have involved more

respiratory disease and a fatality rate of up to 92%, which has led scientists to suspect a

different strain of the Nipah virus as the culprit (Iowa State University 2007).

In addition, many of the far-reaching effects of habitat destruction are often not seen until

long after the damage has been done. And even if these far-reaching effects can be reversed, such

reversal can generally be accomplished only through long, arduous, and scientifically sound

efforts. An example of the far-reaching effects of biodiversity loss can be seen in the story of the

gray wolf (Canis lupus) in the northern United States. During the first half of the 20th century,

the gray wolf was seen as a threat to livestock, and it became the target of a mass extermination

effort. This effort effectively wiped out the gray wolf in most of its former range within the

lower 48 states (Ripple and Beschta 2012). As the gray wolf population declined, there was an

increase in populations of the gray wolf’s prey species: elk, deer, moose, coyote, raccoon, and

beaver. With the corresponding proliferation of grazing species (such as elk, deer, and moose),

populations of the plants that these animals fed on were decimated. This reduction in

vegetation resulted in a loss of protective cover for birds, causing a decline in riparian bird

populations (Ripple and Beschta 2012). With the falling bird populations in wetland areas, the

number of insects, particularly mosquitoes, began to increase. Mosquitoes are known disease

vectors that directly affect humans: West Nile virus (encephalitis), dengue fever, and yellow

fever are all mosquito-borne diseases found in the United States (AMCA - American Mosquito

Control Association 2014). It seems unlikely that anyone would have predicted, when the wolf

eradication program was initiated, that the program’s success would result in an increased threat

to humans from mosquito-borne diseases. Such an indirect and unanticipated ecological effect is

just one example of the types of dangers posed by species extinction (or the local extinction of a

population of a species).

In the case of the gray wolf, which was an example of local population extinction, not all

was lost. Beginning in the mid-1990s, the U.S. National Park Service carried out a

program to reintroduce the gray wolf into the northern United States (National Park Service

2015). Canadian gray wolf populations were used to seed former U.S. ranges. The return of

wolves to Yellowstone has begun to curb the abundant elk population within the park, which in

turn has allowed young aspens and willows to grow where they had been previously decimated

by overbrowsing (Ripple and Beschta 2012). The increase in vegetation along streams and other

areas has provided improved habitats for beaver, birds, and fish populations. The enhanced

vegetation also increased the available food sources for bears and birds (Ripple and Beschta

2012). Although it is too early to observe a consequent decrease in insect populations as a result

of the increase in avian species populations, it is reasonable to suggest that this will likely follow.

1.2 Reintroduction Programs

Reintroduction programs, such as the gray wolf example given above, are often seen as

controversial conservation tools. Care needs to be taken to ensure the health of the members of

the introduced species as well as the current inhabitants of the introduction area. The loss of

genetic diversity also results in lower individual fitness and poor adaptability (Lande 1988). The

fate of small populations may be linked to collective genetic change in those populations.

Further, captive breeding of endangered animals is often necessary for their

conservation. However, this strategy may increase the chance of inbreeding, which causes poor

fitness of these populations (Ralls and Ballou 1983, Crnokrak and Roff 1999). Inbreeding is

known to decrease genetic diversity and reduce reproductive and survival rates, with these

problems leading to increased extinction risk (Saccheri, Kuussaari et al. 1998). Such genetically

impoverished populations often struggle and have to be crossed with individuals from other

populations to become genetically viable (Westemeier, Brawn et al. 1998). To address risks

associated with inbreeding, however, appropriate population management programs can be

developed using genetic studies (Snyder, Derrickson et al. 1996). Care must also be taken to

ensure the animals are being introduced into an appropriate, sufficiently sized and healthy

habitat.

An important consideration when introducing or reintroducing a species into a habitat is

the determination of whether the reintroduced population will be genetically sustainable. It is

therefore desirable to identify for reintroduction an appropriate indigenous population that is

genetically well-adapted for the chosen environment. It is additionally important to verify that

there is sufficient genetic diversity within the introduced population to maintain the healthy

genetic substructure necessary for long-term survival. The genetic substructure of a population is

the pattern of diversity that results from the combined forces of mutation, gene flow, genetic drift, and

natural selection. Populations with high rates of gene flow tend to become more homogenized

and over time will exhibit less total substructure. Populations with low rates of gene flow are

more strongly affected by genetic drift, and the result is a decrease in

variability within the population and an increase in genetic divergence from other

populations/subpopulations of the same species. This in turn leads to an increase in the amount

of substructure within the larger/combined population. It is important to the success of the

reintroduction program that a genetically diverse population of suitably adapted individuals be

introduced into a suitably large and diverse environment that will preserve and expand the

genetic substructure of the newly introduced population.
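
The relationship between gene flow, drift, and substructure described above can be illustrated numerically. The following is a minimal sketch and was not part of the analyses in this study. It uses Wright's fixation index (F_ST), a standard measure of substructure that the text above does not name, and it assumes equally sized subpopulations and a single biallelic locus. The function name and the allele frequencies are hypothetical and were chosen only for illustration.

```python
def fst(subpop_freqs):
    """Wright's F_ST at one biallelic locus for equally sized subpopulations.

    subpop_freqs: frequency of one allele in each subpopulation
    (hypothetical values, not data from this study).
    """
    n = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / n
    # Expected heterozygosity if all subpopulations were pooled into one.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # locus is fixed everywhere, so there is no variation to partition
    # Mean expected heterozygosity within the individual subpopulations.
    h_within = sum(2.0 * p * (1.0 - p) for p in subpop_freqs) / n
    return (h_total - h_within) / h_total


# High gene flow: allele frequencies stay nearly identical between groups.
high_gene_flow = fst([0.52, 0.48])
# Low gene flow: drift has pushed the two groups toward different alleles.
low_gene_flow = fst([0.90, 0.10])

print("F_ST with high gene flow:", round(high_gene_flow, 4))
print("F_ST with low gene flow: ", round(low_gene_flow, 4))
```

Values near 0 correspond to homogenized subpopulations with little substructure, while values approaching 1 correspond to subpopulations that have lost internal variability and diverged from one another.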

Allopatric speciation can occur when a population separates to the extent that genetic

exchange effectively ceases between the separated groups. The resulting genetic divergence may

be due to differing selective pressures, independent genetic drift, or mutations that arise in

one group but not the other, and eventually the divergence can result in the formation of

recognized subspecies (Nosil and Schluter 2011). When a species chosen for restoration into

the wild comprises two subspecies, reintroduction of members of one subspecies into a new

area can introduce alleles that are not necessarily favored in the new environment. These alleles,

although supported by the original environment where they arose and were selected for as

advantageous, are now subjected to conditions that select for a different, genetically distinct

group. Thus, when reintroducing a species into a range where it is no longer found, the choice of

the particular subspecies to be introduced can have a significant impact on the success or failure

of the restoration effort. Given the importance of choosing the right subspecies, verification of

the genetic background of individuals to be used for introduction is a high priority.

Efforts to select subspecies for introduction are further complicated by the possible

existence/creation of hybrid individuals. When two genetically-distinct populations are combined

as a result of a reintroduction effort, intermediate genotypes can be created in the subsequent

generations that lack vigor (Birchler, Yao et al. 2006). In hybrids with these intermediate

genotypes, there can be a breakdown of biochemical and physiological compatibility between the

alleles of interacting genes (gene products) from the different genetic backgrounds. Under

typical evolutionary conditions, individual alleles are propagated because of the advantage they

give to the individual in an environment. Due to non-additive gene action, these alleles, however,

are often a detriment when appearing in a relatively new genetic background. Individuals having

these non-adaptive gene combinations are less likely to survive. A reintroduction program that

leads to the creation of such unfavorable hybridization can, counterintuitively, cause a reduction

in the overall population due to interbreeding and the subsequent poor survival of hybrid

offspring.

Based upon all of these considerations, determining the population/subspecies of the

species appropriate for the area into which they are to be introduced and having detailed

knowledge of the genetic makeup of each individual to be used for reintroduction are of

paramount importance to ensuring the success of a reintroduction program. Accomplishing these

objectives has been made much simpler and more definitive with the introduction of comparative

genomics as a means to differentiate groups of organisms taxonomically. The use of such

comparative genomic techniques for the differentiation of species and taxa has become both

quicker and less expensive in recent years, and is now considered the “gold standard” for such

determinations in most cases.

Recent advances in molecular research techniques such as high-throughput DNA

sequencing and computer programs for high-volume DNA sequence analysis have opened up

many new opportunities. Such techniques have allowed for the collection of large amounts of

sequence information that is now available to the research community in public databases.

Accordingly, analyses that previously would have been too complex or expensive to carry out with

available resources can now be used advantageously in reintroduction programs. With such

population-focused genomic data available, broader genomic analyses can and should be

considered as necessary first steps of a reintroduction program.

1.3 Molecular Analysis to Differentiate Species

The first molecular-based approach applied to the study of taxonomy and the

biogeography of populations came with the advent of protein sequencing. In the early 1960s,

amino acid sequence comparisons, primarily of cytochrome c, were used to determine the

relative divergence of taxa. The rate of evolution of a protein depends on the occurrence of

mutations in the genome and on the probability that the resulting random change in amino acid

sequence will be tolerable in a functioning protein. The relatively slow rate of change of

cytochrome c at the amino acid level made this protein an excellent candidate to discern deep

phylogenetic relationships (Fitch and Margoliash 1967, Olson 1989). Protein sequencing

comparisons are still used today, but more often as a supplement to DNA sequence-level

analyses, and not as a stand-alone comparison method. Because of the degeneracy of the

genetic code, synonymous mutations yield the same amino acid sequence even though the

two alleles differ at the DNA sequence level, which causes divergence to be underestimated

when only proteins are compared. Therefore, studying a given length of DNA will yield more phylogenetic

information than is produced by studying the amino acid sequence of the corresponding encoded

polypeptide. Amino acid sequences are also subject to more selective pressure than DNA

sequences, and thus change more slowly over evolutionary time. They are therefore likely to be

less useful for comparisons of species/populations that have only recently diverged (Tourasse

2000).
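
The effect of synonymous mutations described above can be shown with a short example. This is a minimal sketch under stated assumptions: the two 9-bp alleles are hypothetical, the function names are illustrative, and the standard nuclear genetic code is used (the vertebrate mitochondrial code differs at a few codons). It compares the number of differences seen at the DNA level with the number seen after translation.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence, ignoring any incomplete trailing codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_differences(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical alleles that differ only at third codon positions.
allele_a = "GCTCTGAAA"
allele_b = "GCCCTAAAG"

dna_diffs = count_differences(allele_a, allele_b)
protein_diffs = count_differences(translate(allele_a), translate(allele_b))

print("Protein A:", translate(allele_a))
print("Protein B:", translate(allele_b))
print("DNA differences:", dna_diffs, "| protein differences:", protein_diffs)
```

Both alleles encode the same tripeptide, so a protein-level comparison would score them as identical even though they differ at three nucleotide positions.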

From 1974 to 1986, the dominant molecular-based technique for phylogenetic analysis

involved DNA-DNA hybridization (Sibley and Ahlquist 1983). Purified genomic DNA samples

from two different species were denatured and then allowed to hybridize to form heteroduplexes

between the homologous DNA sequences. Heteroduplex molecules whose nucleotide

sequences had more complementarity exhibited a higher melting temperature (Tm) than those

with less complementarity. Thus, a higher thermostability of the heteroduplex was associated

with a closer phylogenetic relationship between the two organisms.

Advances in biotechnology during the 1970s allowed researchers to use a more direct

method to measure DNA sequence diversity. Frederick Sanger pioneered a DNA sequencing

method based on DNA replication. This process used chemically altered nucleotides

(dideoxynucleotides) to terminate synthesis of DNA fragments during replication. The fragments

were then size-separated by electrophoresis and visualized by autoradiography on X-ray film to obtain the DNA sequence (Sanger, Nicklen

et al. 1977).

Phylogenetically useful markers often accumulate in the genome in the form of single

nucleotide polymorphisms (SNPs) and short indels (insertions/deletions) (see Figure 1).

Approximately 90% of genetic variation in the human genome is in the form of SNPs (Collins,

Brooks et al.). Comparisons of SNPs have provided new insights into the evolution of

populations, and SNPs have rapidly become the molecular marker of choice for many applications in

ecology and conservation genetics. Most SNPs tend to be neither especially advantageous nor

disadvantageous from a survival perspective. They are not readily removed from the genome

over time by natural selection because they generally do not directly alter the transcriptome and/or

proteome of the individual. Polymorphisms of this type can often be used to more effectively

track recent and rapid evolutionary changes, and are therefore useful to differentiate between

highly related groups and shed light on the level of genetic relatedness and substructure within

and/or between populations. The original SNP analyses used in population studies were virtually

all carried out using the RFLP (restriction fragment length polymorphism) technique or through

RAPD (random amplified polymorphic DNA) studies. Today, SNPs are found and analyzed by

DNA sequencing of the genome and/or the locus under study (Bohle and Gabaldon 2012).

Figure 1: Benefits of using nuclear and mitochondrial DNA sequence for comparative analyses (Morin, Luikart et al. 2004).
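
Once two sequences from the same locus have been aligned, SNPs and short indels of the kind described above can be tallied directly. The following is a minimal sketch and is not the pipeline used in this study. It assumes the two sequences are already aligned to equal length with "-" marking gaps, and the function name and example sequences are hypothetical.

```python
def classify_variants(ref, alt):
    """List SNPs and indel runs between two pre-aligned sequences.

    Returns (snps, indels), where snps holds (position, ref_base, alt_base)
    and indels holds (start_position, length). Positions are 0-based
    alignment columns. Adjacent insertion and deletion columns are merged
    into one run, which is a simplification made for this sketch.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must be aligned to the same length")

    snps, indels = [], []
    run_start = None
    for i, (r, a) in enumerate(zip(ref, alt)):
        gap_in_one = (r == "-") != (a == "-")
        if gap_in_one:
            if run_start is None:
                run_start = i  # a new indel run begins here
            continue
        if run_start is not None:
            indels.append((run_start, i - run_start))
            run_start = None
        if r != a:
            snps.append((i, r, a))
    if run_start is not None:
        indels.append((run_start, len(ref) - run_start))
    return snps, indels


# Hypothetical aligned fragments from two individuals.
reference = "ACGTAC--GTTA"
sample = "ACCTACGGGT-A"

snps, indels = classify_variants(reference, sample)
print("SNPs:", snps)
print("Indels:", indels)
```

In this example the sample carries one substitution, one 2-bp insertion, and one 1-bp deletion relative to the reference.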

SNPs do have limitations when being used to assess genetic variation. Ascertainment bias

is a distortion in the measurement of the frequency of a phenomenon due to the way the data is

collected. This bias is a larger potential drawback for some applications than for others. The

problem of ascertainment bias is less of an issue for individual identification, paternity analyses,

and assigning individuals to different populations because these are not based on allelic

frequency (Lachance and Tishkoff 2013).

Population substructure can be underestimated when measuring SNPs with a high

heterozygosity. Usually SNPs with higher heterozygosity tend to be older and therefore have had

time to be distributed across the population relative to SNPs with lower heterozygosity (Morin,

Luikart et al. 2004). Population bottlenecks are detected by loss of genetic variation within the

genome. Because SNPs typically have only two alleles per locus, it is more difficult to detect allelic variety

or lack thereof within a population (Morin, Luikart et al.). Estimation of population size or

demographic differences can therefore be a problem unless ascertainment bias is taken into consideration.
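
The limited information carried by a two-allele locus can be made concrete with expected heterozygosity (gene diversity), H = 1 - sum(p_i^2). The following is a minimal sketch. The comparison with a multi-allelic marker is an assumption added for illustration, as are the function name and all allele frequencies, none of which come from this study.

```python
def gene_diversity(allele_freqs):
    """Expected heterozygosity, 1 - sum(p_i^2), for one locus."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# A biallelic SNP is most informative when both alleles are equally common.
snp_before = gene_diversity([0.5, 0.5])
# Hypothetical multi-allelic marker with ten equally common alleles.
multi_before = gene_diversity([0.1] * 10)

# After a hypothetical bottleneck: the SNP keeps both alleles at shifted
# frequencies, while the multi-allelic marker retains only three alleles.
snp_after = gene_diversity([0.7, 0.3])
multi_after = gene_diversity([0.5, 0.3, 0.2])

print("SNP diversity before/after:          ", round(snp_before, 2), round(snp_after, 2))
print("Multi-allelic diversity before/after:", round(multi_before, 2), round(multi_after, 2))
```

Under these assumed frequencies the diversity of the single SNP changes only slightly, whereas the multi-allelic marker shows a much larger drop. This is one reason why many SNP loci are needed to detect a loss of allelic variety.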

Biases also arise when using SNP markers across populations between different studies.

The diversity of the original study’s population can accordingly skew the diversity measurement

of the population being measured in the subsequent study causing a false increase or decrease to

be seen (Schlotterer and Harr). To help counteract this type of bias, the screened individuals

must originate from a wide geographic source and the protocol used initially to identify the

polymorphism must be recorded in detail along with the number of individuals screened (Akey).

Fourteen pedigreed A. m. cyanoptera and seven pedigreed A. m. macao (plus two known

hybrids) were studied in the completion of this project. Careful consideration was given to

ensure that the individual macaws studied were not related. Sequence analysis of the

hypervariable mitochondrial control region was completed to show that there was no matrilineal

relationship between the individual macaws studied (see Figure 32).

Small subunit ribosomal RNA gene (rDNA) sequences were the initial nucleotide

sequences used for phylogenetic comparison, and these are still used today. The homology at the

nucleotide sequence level of rDNA is well conserved between even distantly related organisms.

This allows for easier alignment, and therefore determination, of evolutionary relationships and

divergence rates. In particular, the 16S genes of prokaryotes and 18S genes of eukaryotes are

frequently used to determine phylogeny. The 16S rDNA sequence of prokaryotes was used by

Carl Woese to show evolutionary relatedness, and his work in the late 1970s became the basis of

the modern three-domain classification system of Archaea, Bacteria and Eukarya (Clarridge

2004). The 16S and 18S rRNA genes are multicopy gene families (making them easy to detect),

and comparisons using them effectively resolve deep phylogenetic relationships. However, due

to the high level of interspecies sequence conservation, they may not provide sufficient data for

differentiation at closer taxonomic levels. Despite this shortcoming, comparisons using these

sequences to identify species and assess interspecies relationships remain extremely valuable for

taxonomic studies.

Reiterated genes such as the ribosomal RNA genes are characterized by being composed

of hundreds of identical genes present in a tandem array (King and Stansfield 1990). This genetic

redundancy allows the cell to produce larger amounts of a specific product that is needed in

greater quantity than the products of less redundant genes. Concerted evolution is likely to be a correction

mechanism counteracting the undesired effects of mutations. Homogenization of sequences

across these regions provides stability by minimizing polymorphisms in gene products required

to be present uniformly in large quantities (Feliner and Rosselló 2012).

The individual units of ribosomal DNA arrays, consisting of these tandem clusters, show

much greater sequence similarity within a species than between species. The

clusters/transcription units of these arrays do not evolve independently of each other but through

concerted evolution. This causes cluster homogeneity that is the direct result of unequal crossing-

over events and gene conversion (Schlötterer and Tautz 1994). Novel variants arising by

mutation can spread relatively rapidly along the array or be quickly eliminated from the array of

any one species (Eickbush and Eickbush 2007). It has been suggested that homogenization is

favored by natural selection because it reduces mutational load (Ohta 2009). The high levels of

ITS and IGS polymorphisms in some species are likened to the creation and disappearance of the

duplicate gene copies in multicopy gene families as a result of natural selection (Nei and Rooney

2005).

The high copy number and sequence conservation of the tandemly arranged clusters

make detailed sequence analysis of the entire array difficult. Presently available DNA

sequencing technology cannot provide accurate sequence data across long stretches of tandemly

repeated and highly conserved sequences. Consequently, SNPs found in these rRNA gene

clusters have not been studied in any non-model organism with more than a few hundred cluster

copies on one chromosomal locus (Matyasek, Renny-Byfield et al. 2012). Angiosperms are a

good example of the significance of this problem because they are one of many organisms that

possess tens of thousands of rRNA gene clusters distributed over several chromosomal loci

(Heslop-Harrison and Schwarzacher 2011).

Multicopy gene sequences that show homogenization within repeat arrays have been

found to follow three basic phases involved with concerted evolution (Barker, Benesh et al.

2012). During the first phase, mutations/SNPs tend to occur anywhere within the rDNA cluster.

Since the rDNA is highly redundant, there is no selective pressure acting on these mutations and

they can persist for some time (Hughes and Hughes 2007). These mutations are observed as low-

frequency polymorphisms located throughout the repeat unit. During the second phase,

recombination and unequal crossing-over events will result in mutations/SNPs being either

removed/deleted or duplicated. Duplicated clusters housing the mutation will increase in number

(if not removed during subsequent crossing-over or recombination events) until a certain

threshold is reached. At this point, natural selection acts on the variant according to whether fitness is compromised

by this functional constraint or strengthened by this novel functional change. During the third

phase, the mutant repeat completely replaces all of the previous repeats and the new variant

becomes fixed and homogenized within the array (see Figure 2) (Questiau, Eybert et al. 1998).

This mode of homogenization explains why some regions within the same repeat cluster

are highly polymorphic while others are highly conserved although the entire repeat is subject to

the identical homogenization process.
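The three phases described above can be illustrated with a deliberately simplified simulation. The sketch below is an assumption-laden toy model, not the model of the cited authors: it treats every homogenization event as one randomly chosen repeat being overwritten by a copy of another, ignores selection entirely, and uses an arbitrary array of 20 repeats with a single new variant.

```python
import random


def homogenize(array, rng, max_steps=100_000):
    """Overwrite one randomly chosen repeat with a copy of another (a crude
    stand-in for gene conversion or unequal crossing-over) until every repeat
    in the array carries the same variant or max_steps is reached."""
    array = list(array)
    steps = 0
    while len(set(array)) > 1 and steps < max_steps:
        donor, recipient = rng.sample(range(len(array)), 2)
        array[recipient] = array[donor]
        steps += 1
    return array, steps


# Phase 1: a single new variant ("G") arises in an array of 20 repeats.
rng = random.Random(1)  # fixed seed so the run is repeatable
start = ["A"] * 19 + ["G"]

# Phases 2 and 3: the variant spreads or is lost until the array is uniform.
final, steps = homogenize(start, rng)
print(final[0], steps)
```

In this toy model the new variant is usually lost and only occasionally replaces the original sequence, but in either case the array ends up homogeneous, which is the outcome the homogenization model predicts.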

Nucleotide sequences in the mitochondrial genome have been used extensively for

phylogenetic comparisons and parentage determinations. Each mitochondrion houses 2-10 copies

of the mitogenome, and each cell can possess dozens of mitochondria. This relatively high copy

number per cell facilitates obtaining sequence-level information from mtDNA-encoded loci. The

traditional approach using “alleles” for comparisons is routinely replaced by identifying mtDNA

haplotypes, each composed of a specific set of single nucleotide substitutions (SNPs) and/or

indels.
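The idea of a haplotype as a specific set of substitutions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the sequences are already aligned to equal length, the function name is hypothetical, and the toy sequences are invented for the example rather than taken from any macaw sample.

```python
from collections import defaultdict


def call_haplotypes(aligned):
    """Group samples by their nucleotides at the variable columns of an alignment."""
    names = list(aligned)
    seqs = [aligned[name] for name in names]
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # A column is variable if more than one character occurs in it.
    variable = [i for i in range(len(seqs[0])) if len({seq[i] for seq in seqs}) > 1]
    groups = defaultdict(list)
    for name, seq in zip(names, seqs):
        groups["".join(seq[i] for i in variable)].append(name)
    return variable, dict(groups)


# Invented toy sequences, not data from this study.
toy = {"bird1": "ACGTACGT", "bird2": "ACGTACGT", "bird3": "ACATACGC"}
sites, haplotypes = call_haplotypes(toy)
print(sites, haplotypes)
```

Here the two variable positions define two haplotypes, one shared by two samples and one unique to the third. Gap characters in an alignment would be treated as an additional state, so indels also contribute to the haplotype definition in this sketch.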

Figure 2: Homogenization model of multicopy gene arrays. (Ganley and Kobayashi)

The mitochondrial genome mutation rate is estimated to be 10x higher than the mutation

rate of equivalent regions of the nuclear genome (Avise 2000). It is believed this is because

mtDNA lacks histone-like proteins and is therefore more exposed to mutagenic events, the

organelle has no clear DNA repair capabilities, and mtDNA is exposed to a high steady-state level of

reactive oxygen species (ROS) and free radicals (Mikhed, Daiber et al. 2015). This, coupled with

a high replication rate, creates the signature increase of accrued polymorphisms in the

mitochondria. The avian mitochondrial genome is approximately 17,000 bp long and consists of

22 tRNA genes, 13 protein-coding open reading frames (ORFs), two rRNA coding regions and a

control region (Boore 1999). The level of sequence polymorphism varies across the mitogenome.

The control region, or D loop, is the most variable section of the mitogenome, and because it

does not code for an expressed product, it is able to undergo rapid change at the nucleotide

sequence level. The average level of sequence polymorphism across the different regions

of the mitogenome decreases in the order of control region > cytochrome b > cytochrome

oxidase 1 (CO1) > 12S rDNA > 16S rDNA (Arif).

As noted above, as new generations of improved molecular analysis techniques have

been introduced, and the cost and time requirements for such analyses have significantly

decreased, it has become possible to broadly apply such techniques for the identification and

differentiation of species/subspecies. The mitochondrial genome can provide abundant

information for evolutionary studies of many taxa, and can be used as a source of molecular

markers for conservation studies of endangered species (Nabholz, Uwimana et al. 2013). A

number of suggested subspecies have been shown to exhibit definitive differentiation based on

these techniques:

Stingless bee – Melipona quadrifasciata anthidioides; Melipona quadrifasciata

quadrifasciata (Waldschmidt, Barros et al. 2000);

Canada goose – Branta canadensis taverneri; Branta canadensis leucopareia

(Shields and Wilson 1987);

Sand lizard – Lacerta agilis exigua; Lacerta agilis boemica (Grechko, Fedorova

et al. 2006);

Willow Flycatcher – Empidonax traillii extimus; Empidonax traillii adastus

(Paxton, Sogge et al. 2008); and

Bluethroat – Luscinia svecica namnetum; Luscinia svecica svecica (Questiau,

Eybert et al. 1998).

In 2003, an international consortium began to compile a database of DNA sequences

from taxa all over the world in order to differentiate organisms at the nucleotide sequence level.

This identification system is based upon specific molecular markers. The primary marker used

for this species comparison is located at the 5’ end of the cytochrome c oxidase subunit 1 gene

(CO1), which is encoded within the mitochondrial genome. Taxonomic

assignments/differentiations using this 650 bp sequence have stood up remarkably well under the

scrutiny of taxonomists who initially declared this method of species identification to be

“anti-taxonomy” (Hebert and Gregory 2005).
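The comparison underlying this kind of identification can be illustrated with the simplest possible distance measure, the uncorrected proportion of differing sites between two aligned sequences. The sketch below is an assumption for illustration only: published barcoding analyses often use model-corrected distances instead, the function name is hypothetical, and the short sequences are invented rather than real CO1 data.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites, excluding gaps and ambiguity codes,
    at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in compared) / len(compared)


# Invented 10-base toy sequences that differ at one site.
toy_distance = p_distance("ACGTACGTAC", "ACGTACGTTC")
print(toy_distance)
```

For a real 650 bp barcode region the same calculation would simply run over more sites; the threshold at which a distance is taken to indicate separate species is a matter for the cited literature and is not assumed here.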

Dr. Mark Stoeckle of Rockefeller University developed specific differentiating DNA

sequences for 260 species of North American birds based upon the CO1 gene. All of the birds,

with the exception of four species, could be uniquely differentiated by a species-specific CO1

gene sequence. In the case of the four “species” for which the approach failed to identify a single

species-specific DNA sequence, each of them showed the presence of two differentiated

sequences, suggesting that each of the single “understood” species might in fact consist of two

distinct species (Hebert, Stoeckle et al. 2004).

After analysis by groups using molecular markers, many original classifications that had

been made under the traditional biological species concept were determined to be incorrect. This

has been especially true in the case of many prokaryotic species. Using molecular analysis, many

higher eukaryotic populations formerly regarded as separate species were found to be a

single taxon and, conversely, many singular groupings were refined into two independent

classifications (Hebert and Gregory 2005).

Dr. Paul Hebert of the University of Guelph and Dr. Daniel H. Janzen of the University of Pennsylvania showed that

members of a single, traditionally classified Costa Rican butterfly species, Astraptes fulgerator,

possessed a total of 10 distinct CO1 gene sequences (Hebert, Penton et al. 2004). This finding

suggests that the butterflies were not a single species, as long assumed, but a complex of 10

different species occupying overlapping territories. The survival advantage imparted by the

original shared adult phenotypes (physical shape and wing color) was so great that the

descendant groups each retained these ancestral traits (Aitken 2006). However, although the

adults of different groups shared many physical and behavioral properties, the larvae

(caterpillars) of each of these newly separated butterfly species look quite distinct and prefer

different foods. This is, of course, consistent with each species having diverged from a

common ancestral species with the majority of the present differences between them primarily

observable at the larval state, and with many of the adult phenotypic properties retained by each

of the “new” species.

The mitochondrial genome only offers information from maternal inheritance; therefore,

delimiting subspecies hybrids also requires comparison of bi-parental nuclear sequence. Nuclear

loci, especially protein-coding regions, generally evolve more slowly than regions found in the

mitochondria. This slower rate of change often does not provide enough sequence variation to

differentiate between closely related species/populations (Dasmahapatra and Mallet 2006). Non-

protein coding sequences found in the nuclear genome, such as intergenic and intronic regions,

offer more variability between populations.

In order to use the nuclear DNA sequence as a successful phylogenetic tool, the sequence

chosen must be conserved among individuals within the population being studied but display

sufficient variation between the two populations. Sequences within gene families, as opposed to

single-copy regions, should be avoided in order to reduce the risk of including sequences that are

not true orthologues (De Mendonca Dantas, Godinho et al. 2009) but are variants of a similar

sequence within the same genome. Attempts should also be made to use intronic sequences found

on different chromosomes to minimize the possibility of linkage, allowing easier identification of

hybrids (Backstrom, Fagerberg et al. 2008).

1.4 New World Parrots

Parrots are the most threatened and endangered group of birds with more than 90 species

(Zierdt-Warshaw 2000). During the last quarter of the 20th century, over 21 parrot species have

become extinct. Three of these extinction events have occurred since 2000. The primary cause of

population declines and extinctions was overhunting in the 1960s and 1970s, but it is now more

often due to the loss of habitat as a result of deforestation and subsequent habitat fragmentation.

Fifty percent of the earth’s plants and animals can be found in the rainforest even though

rainforests cover only 5% of the earth’s land surface (Butler 2014). Eighty thousand acres of

tropical rainforest are destroyed every day (Moss 2009). This is due, in large part, to the increase

in the global demand for beef and soybean production, and the forests are burned to make way

for this commercial production. The rainforests that are being destroyed range in age from the

2000-year-old southern basin of the Amazon (Carson, Watling et al. 2015) to the

70-million-year-old ancient forests of Southeast Asia (Benders-Hyde 2002). It is projected that most

of the primary rainforests of Southeast Asia will be destroyed in the next 10 years

(Benders-Hyde 2002).

The vitality of parrots as a group is heavily challenged. The IUCN Red List details 11

species of parrots as endangered, five species as threatened, and four species requiring special

protection. Mexico has a special connection to this group, having 22 indigenous species of

parrots, with six such species found only in Mexico (IUCN 2013). Even in the recent past, the rate of

illegal wild bird capture has been estimated at 65,000 to 75,000 birds per year, and these

include many parrots. More than 75% of the captured birds die before reaching the purchaser,

meaning that supporting even a relatively small trade of this type at the consumer level requires a

substantial depletion in wild populations.

Parrots are considered an important group of birds not only because of the general need to

preserve all remaining species in our ecosystems but also because they have characteristics that

are not regularly seen in other avian groups. Their unique higher cognitive function is attributed

to an unusual rate of brain development and a much larger brain volume, allowing the parrots to

employ complex communication patterns and to mimic the sounds of other animals (Forshaw

2006). Parrots often live 50-75 years in the wild and up to 90 years in captivity. The genetic

factors supporting this incredible longevity are of much interest in the scientific community, as

are the genes associated with parrots’ superior cardiovascular health, which enables them to

fly 15-20 miles per day, oftentimes at speeds of up to 35 miles per hour. These attributes are

seldom seen in other Aves species, providing a range of different references for research

comparison.

Macaws, as a part of the New World parrot group, have the same unique characteristics,

and as seed dispersers they are an important element of South and Central American forest

ecosystems. In addition to their ecological importance, they are signature birds to the regions in

which they are found, making them a source of regional and national affection, as well as an

important resource for ecotourism. These and other factors have driven further study and

reintroduction of macaw species into regions within their historic ranges that might still support

healthy populations. One of these species is the Scarlet Macaw, Ara macao.

Scarlet macaws are among the larger of the New World parrots. They are characterized by

their massive bills and strongly graduated tails. They have a bare, white patch on the sides of the

face with inconspicuous white, feathered lines. Their chest and head feathers are mostly red with

blue lower back coverts and tail coverts. The median and secondary wing coverts are yellow and

variably tipped with green (Juniper 1998). Scarlet macaws are considered to have the greatest

latitudinal distribution range of any bird in the genus Ara. Their native habitat presently extends

from southeastern Mexico through northern South America to eastern Bolivia and the Brazilian

Highlands. This species of macaw also includes a poorly defined subspecies denoted Ara macao

cyanoptera, also known as the Central American scarlet macaw. This subspecies is differentiated

from the South American scarlet macaw (nominate subspecies, Ara macao macao) by a

considerably larger body length and mass. The band of yellow on the secondary wing coverts is

much wider on the Central American scarlet macaw, and this macaw has feathers tipped with

blue and little or no green (see Figure 3) (Forshaw 2006).

Macaws reach sexual maturity at three to four years of age, depending on the availability

of a mate, and the species has a low annual reproductive rate. They are asynchronous egg layers

during the months of December, January, and February, and they will lay 1-4 eggs, with each

egg being laid approximately three days after the last (Vaughan Bremer Dear 2004). The eggs

will hatch after an approximate 22-day incubation period, and the chicks fledge approximately

75 days after hatching. Because they lay their eggs over a period of time, and there is variability

in fledging time among the chicks, collectively, there will be a period of more than 100 days

during which the parents will need to be particularly vigilant to protect the young (Myers and

Vaughan 2004). The parents will continue to contribute to the care of the young after fledging,

typically through the first year of life and will not lay eggs again until the young have left the

nest (Myers and Vaughan 2004).

Figure 3: Visual comparison of the nominate subspecies Ara macao macao (left) and

Ara macao cyanoptera (right). A. m. cyanoptera is differentiated by its larger

size and lack of green on the tips of its wing feathers. (A. m. macao photo

courtesy of Andy Hay. A. m. cyanoptera photo courtesy of John Perry.)
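The length of the nesting period described above follows from simple arithmetic on the stated intervals. The sketch below is a rough illustration that assumes every egg takes the same approximate 22 days to hatch and every chick the same approximate 75 days to fledge; the function name is hypothetical, and the variability in fledging time mentioned in the text is not modeled.

```python
def vigilance_window_days(clutch_size, laying_interval=3, incubation=22, fledging=75):
    """Days from the first egg being laid to the last chick fledging."""
    if clutch_size < 1:
        raise ValueError("clutch size must be at least 1")
    # The last egg is laid (clutch_size - 1) intervals after the first.
    return (clutch_size - 1) * laying_interval + incubation + fledging


windows = {n: vigilance_window_days(n) for n in range(1, 5)}
print(windows)
```

Under these assumptions a single-egg clutch occupies the parents for 97 days and a four-egg clutch for 106 days, consistent with a period of more than 100 days for the larger clutches.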

The nests are commonly located in tree cavities that are approximately 20 meters from

the ground. These tree cavities are specialized structures created from the natural rotting of lower

branches of the tree or branches that have broken off during the rainy season. The macaws will

wait until a nesting cavity becomes available before attempting to find a mate and starting a

clutch. Macaws have a much longer lifespan than the majority of other avian species, living 50-

70 years in the wild. The macaw remains reproductively active for approximately 20 years. The

presence of “seniors” in the population can mask an underlying lack of reproduction. People still

see birds in the wild, but it may simply be the same birds advancing in years and not successfully

bearing offspring (Marsden and Pilgrim 2002). Because the macaw has a long lifespan,

populations that are under extreme pressure are able to persist and mask the effects of habitat

destruction and other causes of decline. In a population dominated by older, non-reproducing birds, this gradual

decline in population numbers is often followed by a drastic increase in mortality rates as the

older, non-reproductive birds reach the end of their lifespans.

In the wild, Ara macao as a species is listed as a “species of least concern” by the

IUCN. This is due to the larger, yet also declining, numbers of the nominate subspecies, which

ranges from the extreme south of Nicaragua to Brazil and Bolivia. The desperately low numbers

of the subspecies A. m. cyanoptera, which ranges from southern Mexico to southeast Nicaragua, are a

subject of concern for local governments, and the subspecies has been assessed by several independent experts

associated with the IUCN, the Convention on International Trade in Endangered Species

(Species. 2007) and the World Parrot Trust as meeting the criteria for endangered status (Birdlife

International 2013). This is a typical example of a larger, umbrella species masking underlying

ecological health concerns of rarer regionally distinct subspecies that exist in smaller

populations, and which creates a false lack of concern based simply on a cursory evaluation of

reported population numbers for the species as a whole.

The surviving wild populations of A. m. cyanoptera are now found in remnant clusters

throughout their historic indigenous range. In 2014, there were approximately 250 individuals in

Mexico: 200 in the Lacandona Rainforest in Chiapas bordering Guatemala and 50 in the

Chimalapas Mountains in Oaxaca. Other isolated groups have been reported in a few localities in

Guatemala, Belize, Honduras, and Nicaragua. In El Salvador, the A. m. cyanoptera is now

reported to be regionally extinct. Figure 4 reflects current regional population estimates of A. m.

cyanoptera (O'neill 2013, Cantu 2014, Estrada 2014, Amaya-Villarreal, Estrada et

al. 2015).

In 2008, Mexico conducted a population viability analysis to assess the decline of this

once abundant bird. The analysis concluded that the rapid decline of A. m. cyanoptera

populations has been due to a combination of factors that include deforestation, local hunting,

and the illegal bird trade. While the illegal bird trade remains a problem, strides have been made

internationally to end wild bird imports in order to address this important issue. In Europe, for

example, in the early 1990s, a resolution by the European Parliament called for the European

Commission to end imports of wild birds. The U.S. federal government passed the Wild Bird

Conservation Act prohibiting the importation of most wild-caught birds—primarily parrots. And

in 2008, the Mexican government passed a law with tough legal penalties for poachers caught

illegally moving parrots out of the country.

Figure 4: Map of local population estimates for Ara macao cyanoptera in southern

Mexico and Central America.

The long-lasting, deleterious effects of poaching are often understated and, therefore, not

fully comprehended. When poachers find a nesting site containing pre-fledging chicks, the adult

macaws are often maimed or killed in an attempt to keep the macaw’s raucous warning cries

from alerting the authorities. In order to reach the chicks within the nests located high in the tree

in the upper rainforest canopy, the poachers will often burn or chop the base of the tree. This

results in killing the tree, removing a valuable nesting site from the population, and often

injuring or killing the chicks in the process.

The reintroduction of A. m. cyanoptera into historic habitats to increase population

numbers in the wild is a primary goal. When supplementing a depleted wild population with

captive bred stock, or reintroducing a species into a region that it formerly inhabited, efforts must

be focused on sponsoring individuals that are genetically matched to the region in question. Zoos

and breeding programs also need to maintain a level of genetic purity to prevent subspecies

hybridization. There are, however, reliability limitations to using only morphology for

subspecies determination, especially at such a critical level. The subtle morphologic distinctions

between allied subspecies are so complex that most taxonomists who attempt this specialize in a

single group of closely related organisms. As a result, finding appropriate experts and

distributing specimens in order to have a comprehensive morphological analysis can be a time-

consuming and expensive process (Stoeckle 2003, Hebert, Stoeckle et al. 2004). The possibility

of hybrids between two subspecies only complicates such efforts.

In the mid-1990s, a number of sponsored programs had begun to release confiscated,

smuggled, and cage-raised birds into the wild. Captive breeding and reintroduction programs

were a fairly new concept at this time and were deemed necessary to reestablish endangered

species. One such program from 1992 to 1995 involved releasing 20 scarlet macaws to their

historic range in the Tambopata Nature Reserve in Peru. Little was known at this time about the

numerous problems that arise during the reintroduction of a population. Historically, initial

programs are seen as a success if at least 50% of the original released group survives.

The released group of macaws lacked adequate health screening to protect against the

introduction of diseases into the new environment. Parrots are especially susceptible to several

lethal and contagious diseases that are capable of lying dormant for years (Brightsmith, Hilburn

et al. 2005). The lack of adequate habitat and native resources also hindered the program’s

efforts, and at the time of this reintroduction event there was little to no effort to slow the rapid

deforestation in these regions. Due to the lack of support from the surrounding community and

continued deforestation, the released macaws were put at risk. Local communities were unaware

of the long-term benefits that ecotourism brings compared to the short-term returns of exploiting

the macaws for food and/or feathers. Further, it was not determined whether the reintroduced

birds were of the same, or at least similar, haplotype that had been indigenous to the release site,

and therefore, may not have been genetically well-adapted for the release environment. Of the

original group of scarlet macaws released, only 55% were still alive as of 2002. The lessons

learned from these early reintroduction efforts have since been used to design more successful

release programs.

Proper reintroduction of A. m. cyanoptera needs molecular level distinction to enable the

introduction of the original population haplotype/subspecies at the release site(s). Historically,

molecular-level subspecies differentiation has not been done, as the differentiating gene

sequences between subspecies have not been known. But given the potential importance of

subspecies identification for reintroduction success, as well as to avoid subspecies hybridization,

this should be seen as an important part of these reintroduction programs.

The use of DNA-based testing is commonly required in order for two subspecies to be

reliably identified/distinguished. It is also important to identify likely species and subspecies

hybrids, since little or no effort has been made to prevent homogenization, particularly in the pet

trade. It is necessary to obtain accurate genetic profiles of suitably polymorphic regions for each

group in order to allow for accurate differentiation. Using this approach, DNA-based testing

methods have been developed for a number of species that can quickly and consistently

distinguish between two subspecies and hybrids (Tavares, Baker et al. 2006). Closely related

sister-species delimited with independent evidence can often be differentiated by mitochondrial

sequence comparison using cytochrome oxidase subunit 1 (CO1) sequences. Molecular analysis

of combinations of multiple genes, including mitochondrial DNA (mtDNA) sequences, should

allow unbiased species differentiation of even closely related populations such as at the

subspecies level (Knowles and Carstens 2007, Dupuis, Roe et al. 2012).

Although not yet considered threatened by the IUCN, the scarlet macaw’s decline can be

seen as a symptom of a larger problem of global biodiversity loss, and many of the same problems

and solutions relating to species preservation can be studied and applied to assist in the

preservation of this species. There have been important successes in macaw reintroduction

programs that can serve as models for future programs with these principles in mind.

One such success story is that of the Palenque Rainforest in Mexico where the macaws

had been extinct for 70 years. Ninety-six pedigreed macaws were released as six small release

groups between April 2013 and June 2014. The reintroduction area is a national park famous for

its Mayan ruins and therefore already granted substantial protection by the Mexican government.

Preparations for these releases were multifaceted, and included important health screening and

local community involvement elements. The released individuals are closely monitored by

microchip and volunteer support staff. Three institutions are involved in this release program:

Aluxes Ecopark of Palenque provided the release site, Xcaret Ecopark and Nature Preserve

continues to provide the captive-bred macaws, and the Institute of Biology of the National

Autonomous University of Mexico (UNAM) continues to provide the scientific planning,

execution, and research for this project. This program has been extremely successful, reporting a

92% survival rate as of August 2014 (Estrada 2014).

CHAPTER 2

MATERIALS AND METHODS

2.1 Obtaining Samples

Eighty scarlet macaw blood samples were collected in 2003 and 2005 from Xcaret

Ecoparque and Nature Preserve in Playa del Carmen, Mexico. Of the samples, only 32 were from

pedigreed birds that had been raised and bred together to prevent subspecies

hybridization. Twenty-seven blood samples were also obtained from two private owners in

Cancun, Quintana Roo, Mexico. Eleven of the nominate A. m. macao blood samples were

obtained through the assistance of Dr. Patricia Escalante from the National Autonomous

University of Mexico (UNAM). Definitive blood samples of the nominate subspecies of A. m.

macao, and the endangered group, A. m. cyanoptera, as well as known hybrid blood samples of

the two subspecies were used for this study.

To obtain the blood samples, the macaws were first restrained using a towel in a sterile

environment. The blood was collected using a 25 gauge needle. The needle was inserted into the

basilic wing vein of the restrained macaws. The vein is located on the ulna of the birds, making it

easy to access (Harris 2007). The collected blood was transferred to a vial containing lysis

buffer (0.01 M Tris, 0.01 M NaCl, 0.01 M EDTA, 1% n-lauroylsarcosine, at pH 7.5). The

n-lauroylsarcosine is a detergent that lyses the cells and the high salt concentration neutralizes

the nucleases in the blood (Seutin et al. 1991). Some of the blood samples were transported on

FTA® databasing cards (Whatman) rather than in lysis buffer. This mode of transport avoids

leakage risks and sample degradation due to temperature variations and spoilage (Smith and

Burgoyne 2004). Blood was added to the FTA® cards by placing blood droplets within the

printed circle on the card. Care was taken to avoid pooling the blood in one spot by dispersing

the blood drops over the entire area inside of the printed circle and allowing the card to fully dry.

The cards contain reagents that are designed to kill most pathogens by cell lysis and protein

denaturation, inhibit fungal growth, and guard against other contaminants through strong buffering and

free-radical properties, much like the lysis buffer used in liquid blood sample transport.

2.2 DNA Isolation from Liquid Blood

The DNA was isolated from the blood samples by using a guanidinium thiocyanate

(GITC) extraction method, modified from Hammond et al. (Hammond, Spanswick et al. 1996).

A 10-20 µl quantity of blood suspended in lysis buffer (0.01 M Tris, 0.01 M NaCl, 0.01 M

EDTA, 1% n-lauroylsarcosine, pH 7.5) was added to 500 µl of extraction solution (0.5 M

guanidinium thiocyanate and 0.1 M EDTA) in a 1.5 ml microfuge tube and vortexed well. The

GITC irreversibly inactivated nucleases and other proteins by denaturation, making the proteins

insoluble. The EDTA in the solution is an ion chelator. Two hundred fifty microliters of ice-cold

7.5 M ammonium acetate was added to the solution and vortexed well. Ammonium acetate

makes the DNA much less soluble in water. The solution was incubated on ice for 10 min.

To solubilize proteins and lipids, 500 µl of 24:1 chloroform to isoamyl alcohol was added

and the solution was vortexed. The solution was then centrifuged at 10,000 rpm for 10 min in a

microcentrifuge at room temperature. The upper aqueous phase was carefully removed and

transferred to a new microfuge tube. The chloroform extraction was repeated. Six

hundred microliters of cold isopropanol was added to precipitate the DNA, and the solution was

vortexed well. The solution was centrifuged at 10,000 rpm for 20 min at 4 °C. Taking care not to

disturb the pellet, the supernatant was carefully removed with a pulled-out Pasteur pipette and

discarded. One milliliter of cold 70% ethanol was then added to remove any residual salts and

to further the precipitation of DNA. The tube was gently inverted 3 times to re-suspend the

DNA pellet. The solution was centrifuged at 10,000 rpm at 4 °C. Taking care not to disturb the

pellet, the supernatant was removed with a Pasteur pipette and discarded. The remaining pellet

and residual solution were placed in a Savant SpeedVac® Concentrator (ThermoScientific) for

approximately 3 min until dry. The pellet was re-suspended in 100 µl of molecular grade water

and stored at -20 °C.

2.3 DNA Isolation from FTA® Cards

The isolation of DNA from FTA® cards (Whatman 2015) involved using an autoclave-sterilized

single-hole paper punch to remove a 6 mm diameter disk with dried blood. The paper

was placed in a 1.5 ml microfuge tube, and 1 ml of FTA® purification reagent (100 mM Tris,

0.1% SDS) was added. The solution was flash vortexed 5 times to mix and allowed to incubate

for 5 min at room temperature. All of the residual FTA® purification reagent was removed using

a pipette and discarded. The FTA® purification reagent wash steps were repeated twice.

One milliliter of TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0) was added to the

microfuge tube and incubated for 5 min at room temperature. All residual TE buffer was

removed with a pipette and discarded. The addition of TE buffer, incubation, and removal were

repeated twice. At this point, the punch should appear white or near-white. A 140 µl aliquot of a

first solution (0.1 N NaOH, 0.3 mM EDTA, pH 13.0) was then added to the microfuge tube and

then incubated for 5 min at 65 °C. A 260 µl aliquot of a second solution (0.1 M Tris-HCl, pH 7.0)

was added to the tube and again vortexed to mix. The solution was incubated for 10 min at room

temperature and then vortexed to mix (approximately 10 times). The punch was removed from

the solution with autoclave-sterilized flat-tipped tweezers and squeezed to recover any remaining

eluate from the matrix. This final eluate contained the genomic DNA in TE buffer (66 mM Tris-

HCl, 0.1 mM EDTA, pH 8.0).
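
The stated composition of the final eluate can be checked with a simple mixing calculation. The following is a minimal sketch, assuming the second solution is 0.1 M (100 mM) Tris-HCl and ignoring both the volume held by the punch and the neutralization chemistry; the function name is illustrative only. It gives roughly 65 mM Tris-HCl and 0.1 mM EDTA in 400 µl, close to the values reported above.

```python
def mix_concentration(stock_conc, stock_vol_ul, total_vol_ul):
    """Concentration of one component after dilution into the combined volume."""
    return stock_conc * stock_vol_ul / total_vol_ul


SOLUTION_1_UL = 140.0  # first solution: 0.1 N NaOH, 0.3 mM EDTA
SOLUTION_2_UL = 260.0  # second solution: Tris-HCl, assumed to be 100 mM
total_ul = SOLUTION_1_UL + SOLUTION_2_UL

# Tris comes only from the second solution, EDTA only from the first.
tris_mm = mix_concentration(100.0, SOLUTION_2_UL, total_ul)
edta_mm = mix_concentration(0.3, SOLUTION_1_UL, total_ul)

print(f"Total volume: {total_ul:.0f} µl")
print(f"Tris-HCl: {tris_mm:.1f} mM")
print(f"EDTA: {edta_mm:.3f} mM")
```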

2.4 18S Ribosomal DNA

Ribosomal DNA (rDNA) is an invaluable tool in evolutionary studies (Kurtzman and

Robnett 1998). It is able to display bi-parental inheritance, intergenic variability at the species

and genus level (Baldwin, Sanderson et al. 1995), and is easy to amplify for more detailed study

due to its conserved nature. Because of these abilities, this genomic region is commonly used for

species identification, molecular barcoding, and phylogenetic construction. The rDNA sequence

is located at the chromosomal region(s) around which the nucleoli form, and for this reason these

locations on chromosomes are called nucleolus organizer regions (NORs). Each region is

composed of several tandem clusters of the rRNA genes, including the 18S gene which codes for

the small subunit rRNA which is commonly the chosen sequence used for phylogenetic

comparison. The NORs of the scarlet macaw are found on three distinct chromosome pairs

(Seabury, Dowd et al. 2013), while the majority of species of Aves (such as the California

condor and the chicken) possess a NOR on only one pair of chromosomes.

Small subunit 18S rRNA genes are the standard reference sequences for the taxonomic

classification of organisms (Wang, Tian et al. 2014). The 18S rDNA sequence exhibits a low rate

of polymorphism within species, but still has a sufficient level of polymorphism between species

to make it very useful for determining interspecies phylogenetic relationships using only a few

specimens from each species. Certain sections of this gene are often very highly conserved, so

much so that a centrally located 15 nucleotide sequence shares 100% sequence and location

homology among mammals and 97% among mammals, marsupials, and birds (Coleman 2013).

This extreme homology is due to the importance of maintaining the secondary structure to

function correctly when translating mRNAs to proteins.

Two internal transcribed spacer (ITS) regions connect the 18S, 5.8S, and 28S rRNA

genes in each cluster (see Figure 5). The ITS region has become widely used in phylogenetic

inference (Álvarez and Wendel 2003), and the variability in these regions helps to differentiate

closely related groups. The highly conserved nature of the flanking regions aids in consensus

primer design. Each rDNA cluster is nearly identical in sequence although there is some

variation in size due to the difference in the number of repeated DNA elements in the non-

translated spacer regions. These tandem arrays are created by unequal crossing-over during sister

chromatid exchange and/or gene conversion events (Eickbush 2002).

Primers were designed using the published Gallus gallus 18S rDNA sequence and the

flanking ITS1 sequence to amplify this region for comparison between the A. macao subspecies

(see Figures 6 & 7). The sequence of 1815 bp covering the entire 18S rRNA gene and the

adjacent ITS1 was amplified with the following primers: 18S-F1

(5’-CCTGGTTGATCCTGCCAGTAGC-3’) and 18S-R1 (5’-TCCTTCCGCAGGTTCACCTACG-3’).

All polymerase chain reactions were carried out in 25 µl reaction volumes containing Q5®

Hot Start reaction buffer (New England Biolabs), 0.2 mM dNTPs, 0.5 µM of each primer, 1 U

Q5® Hot Start High-Fidelity polymerase (New England Biolabs) and 50 ng of the DNA

template. The thermal cycling profile consisted of one cycle of 45 sec at 98 °C followed by 30

cycles of 10 sec at 98 °C, 20 sec at 68 °C, and 27 sec at 72 °C, with a final step of 5 min at 72 °C.

Figure 5: Arrangement of ribosomal DNA (rDNA) clusters on the genome. ITS1 – Internal

Transcribed Spacer 1; ITS2 – Internal Transcribed Spacer 2; ETS – External

Transcribed Spacer; NTS – Non-Transcribed Spacer (Holstein 2006).
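
As a quick sanity check on the two 18S primers, their length and GC content can be computed directly from the sequences given above. This is only an illustrative sketch; the helper names are not part of the original protocol, and no melting temperature is estimated because that depends on the polymerase buffer and the calculator used.

```python
# Primer sequences as listed in the text (5'-3').
PRIMERS = {
    "18S-F1": "CCTGGTTGATCCTGCCAGTAGC",
    "18S-R1": "TCCTTCCGCAGGTTCACCTACG",
}


def gc_fraction(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def primer_summary(primers):
    """Map each primer name to (length in nt, GC fraction)."""
    return {name: (len(seq), gc_fraction(seq)) for name, seq in primers.items()}


summary = primer_summary(PRIMERS)
for name, (length, gc) in summary.items():
    print(f"{name}: {length} nt, GC = {gc:.1%}")
```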

Figure 6: Amplified nuclear region from 18S rDNA – 1 of 2. Amplified Ara macao samples

were electrophoresed using a 1% sodium borate (SB) agarose gel with a 1 kb size

standard ladder (Invitrogen).

2.5 Column-based Purification of PCR Products

A Wizard® SV Gel and PCR Clean-Up System (Promega 2015) was used to purify PCR

products consisting of DNA fragments of 100 bp to 10 kb directly from a PCR amplification.

This system removes excess nucleotides and primers, and allows for downstream applications

such as DNA sequencing and restriction digestion. An equal volume of Membrane-Binding

Solution (guanidine isothiocyanate) was added to the PCR amplification product. The solution

was transferred to the mini-column assembly, which contains a silica membrane to bind the

DNA. The solution was allowed to incubate for 1 min at room temperature on the column and

then centrifuged at 16,000 × g for 1 min. The eluate was discarded. Seven hundred microliters of

membrane wash solution (10 mM potassium acetate, 80% ethanol, 60.7 µM EDTA) were added,

and the column was centrifuged at 16,000 × g for 1 min.

Figure 7: Amplified nuclear region from 18S rDNA – 2 of 2. Amplified Ara macao samples

were electrophoresed using a 1% sodium borate (SB) agarose gel with a 1 kb size

standard ladder (Invitrogen).

The resulting eluate was discarded. Five hundred microliters of membrane wash solution

were added and the column was centrifuged at 16,000 × g for 5 min. The resulting eluate was

discarded. The column was centrifuged to air dry for at least 1 min. It was important to allow all

of the ethanol to evaporate to obtain a maximum yield. The column was placed in a clean

microfuge tube, and 15 µl of molecular grade water were added to the center of the filter, being

careful not to touch the filter with the tip of the pipette. The solution was allowed to incubate at

room temperature for 1 min, and the column was then centrifuged at 16,000 × g for 1 min. The

amplicons were quantified and assessed for contamination on a NanoDrop® 2000C

Spectrophotometer (ThermoScientific). The 260/280 nm and 260/230 nm absorbance ratios were

above 1.8, so the samples were considered to have adequate nucleic acid purity for downstream

applications (ThermoScientific). The eluted DNA was stored at 4 °C for short periods of time or

at -20 °C for long-term storage.

2.6 Next Generation Sequencing Using the Illumina MiSeq® Platform

Illumina® sequencing technology is based on the creation of clusters by massively

parallel sequencing-by-synthesis (SBS), which enables the detection of the incorporation of

single bases into the growing strands of DNA. During cluster generation, single DNA molecules

are bound to the surface of the flow cell, and bridge-amplified to form growing clonal clusters.

Four types of reversible, fluorescently labeled, terminator bases are added and the growing

clusters are imaged as the DNA chains are extended one nucleotide at a time. Non-incorporated

nucleotides are washed away and the fluorescent label and 3’ terminal blocker are chemically

removed from the DNA to allow incorporation of the next nucleotide. Since all 4 dNTPs are

present during each cycle, natural competition minimizes incorporation bias. The image is

captured and the process is repeated for each cycle of sequencing. Following image analysis, the

sequencing software performs base calling, filtering, and quality scoring.

2.6.1 Quantification of dsDNA on Qubit® 2.0 Fluorometer

An enzymatic DNA fragmentation method was used to prepare and tag each sample for

sequencing and because of this, the exact quantification of the input DNA was critical. The

Qubit® 2.0 Fluorometer uses dyes which fluoresce when bound to double-stranded DNA

molecules. This allowed for a more accurate quantification of the DNA target because it

disregards free nucleotides, degraded nucleic acids, protein, and RNA contaminants. In order for

the Qubit® to accurately quantify the samples, the concentration must be between 10 pg/µl and

100 ng/µl. Original quantification was performed using a NanoDrop® 2000 Spectrophotometer

(ThermoScientific). Each PCR amplicon sample was diluted to approximately 10 ng/µl with

molecular grade water. The samples were then quantified using the Qubit® dsDNA HS assay kit.

Calibration was performed with two standards which were prepared by adding 190 µl of the

Qubit® Working Solution to 10 µl of each standard in disposable thin-walled 0.5 ml tubes. The

Qubit® Working Solution was made in a light-protected plastic container using a 1:200 ratio of

Qubit® dsDNA HS Reagent to Qubit® dsDNA HS Buffer.

The standards were then analyzed using the Qubit® by selecting the “DNA” option on

the home screen followed by the “dsDNA High Sensitivity” option. Once both standards were

measured, the sample tubes were prepared by adding 198 µl of Qubit® Working Solution to 2 µl

of each sample in disposable thin-walled 0.5 ml tubes. The sample solutions were incubated for 2

min at room temperature and then placed in the sample chamber. Each sample was quantified

in ng/ml by multiplying the Qubit® output reading by 100 (ThermoScientific). Using the

calculated concentrations, each PCR amplicon sample was brought to a volume of 10 µl at a

concentration of 0.2 ng/µl.
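
The arithmetic behind this step can be sketched as follows. The example assumes the instrument reading is reported in ng/ml for the assay tube, applies the 100-fold factor described above (2 µl of sample in a 200 µl assay), and then uses C1V1 = C2V2 to reach 10 µl at 0.2 ng/µl. The reading of 50 ng/ml and the function names are hypothetical.

```python
def sample_conc_ng_per_ul(qubit_reading_ng_per_ml, dilution_factor=100):
    """Convert an assay-tube reading (ng/ml) to the sample concentration (ng/µl)."""
    return qubit_reading_ng_per_ml * dilution_factor / 1000.0


def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=0.2, final_ul=10.0):
    """Return (sample µl, water µl) needed to reach the target concentration."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is more dilute than the target concentration")
    sample_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return sample_ul, final_ul - sample_ul


stock = sample_conc_ng_per_ul(50.0)  # hypothetical Qubit reading in ng/ml
sample_ul, water_ul = dilution_volumes(stock)
print(f"Sample: {stock:.2f} ng/µl -> {sample_ul:.2f} µl sample + {water_ul:.2f} µl water")
```

Where the computed sample volume is too small to pipette reliably, an intermediate dilution would presumably be made first; that choice is not specified in the text.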

2.6.2 Nextera DNA Tagmentation

Each sample was simultaneously fragmented and tagged with the Nextera DNA Sample Prep

Kit, which uses a specially engineered transposome, in preparation for sequencing on the Illumina

MiSeq® Desktop Sequencer. A transposase cut the target DNA randomly, which created double-stranded breaks

with staggered ends. At the same time, the transposase attached adapter sequences to the ends of

the target DNA. A limited-cycle PCR reaction used the adapter sequences to amplify the insert

DNA. The PCR reaction also added index sequences on both ends of the insert DNA, which

enabled dual-index sequencing of pooled DNA libraries during the sequencing run (see Figure 8)

(Illumina, Syed). The average size of the fragments generated by “tagmentation” was

approximately 250 bp plus the length of the ligated primer and index sequence.

Figure 8: Tagmentation procedure used by Nextera XT Library Preparation Kit (Illumina).

A. Nextera XT transposome with adapters is combined with template DNA

B. Tagmentation to fragment and add adapters

C. Limited cycle PCR to add sequencing primer sequences and indices

A 96-well MIDI plate (Fisher Scientific, part # AB-0859) was labeled NTA (Nextera XT

Tagment Amplicon) and 10 µl of Tagment DNA Buffer were added to each well of the NTA

plate for sample preparation. Five microliters of 0.2 ng/µl of each sample were added to the

corresponding wells of the 96-well plate for a total of 1 ng of DNA per reaction. Five microliters

of Amplicon Tagment Mix were added to each sample and mixed by pipetting. The NTA plate

was sealed with foil and briefly centrifuged. The plate was then run on a thermal cycler for 5 min

at 55 °C with a heated lid and then held at 10 °C.

2.6.3 Neutralization of the Nextera XT Transposome Reaction

After the samples reached 10 °C, they were immediately neutralized in order to halt the

enzymatic reaction of the Nextera XT transposome. The foil on the NTA plate was removed and

5 µl of Neutralize Tagment Buffer were added to each sample well in the NTA plate. The

samples were mixed by pipetting. The NTA plate was sealed with foil, briefly centrifuged, and

incubated at room temperature for 5 min.

2.6.4 PCR Amplification

The Nextera XT sequencing kit uses a series of index primers for cluster formation. It

was crucial that each sample had a different pair of indices so that the samples could be parsed after

sequencing. To help ensure that primer pairs were not repeated, a TruSeq® index plate fixture

was used to organize the N7 (index 1 primer) and S5 (index 2 primer) pairs around the NTA

plate (see Figure 9).

The N7 primers have orange caps and were arranged horizontally along the index plate

fixture, while the S5 primers have white caps and were arranged vertically along the index plate

fixture. Once the NTA plate and the indices were arranged on the index plate fixture, the foil

was removed from the NTA plate and 15 µl of the Nextera PCR Master Mix was added to each

sample well. Next, 5 µl of the S5 (white-capped index 2) primers was added to each column and

then 5 µl of the N7 (orange-capped index 1) primers to each row. The index caps were changed

directly after use to prevent cross contamination. After the indices and Nextera PCR Master Mix

were added to the sample wells on the NTA plate, the wells were mixed by pipetting. The NTA

plate was sealed with foil and centrifuged. The NTA plate was then placed on a thermal cycler

programmed to run with a heated lid for 3 min at 72 °C, 30 sec at 95 °C, 12 cycles of 95 °C

for 10 sec, 55 °C for 30 sec, and 72 °C for 30 sec, and 5 min at 72 °C before holding at 10 °C.

Figure 9: TruSeq® index plate guide (Illumina).
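
The requirement that no two samples share the same index pair can be illustrated with a small bookkeeping sketch. The sample names and index labels below are placeholders rather than the identifiers used in this study, and the layout logic is an assumption; the point is only that each sample receives a distinct (index 1, index 2) combination.

```python
from itertools import product


def assign_index_pairs(samples, index1_primers, index2_primers):
    """Give every sample a unique (index 1, index 2) primer combination."""
    pairs = list(product(index2_primers, index1_primers))
    if len(samples) > len(pairs):
        raise ValueError("not enough unique index combinations for all samples")
    return {
        sample: {"index1": i1, "index2": i2}
        for sample, (i2, i1) in zip(samples, pairs)
    }


# Placeholder labels: 21 samples, six index 1 (N7) and four index 2 (S5) primers.
samples = [f"bird_{n:02d}" for n in range(1, 22)]
n7 = [f"N7_{c}" for c in "ABCDEF"]
s5 = [f"S5_{c}" for c in "ABCD"]

layout = assign_index_pairs(samples, n7, s5)
used = {(v["index1"], v["index2"]) for v in layout.values()}
print(f"{len(layout)} samples, {len(used)} distinct index pairs")
```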

2.6.5 PCR Cleanup of Indexed Samples

Agencourt AMPure® XP magnetic beads were used to purify the indexed samples by

removing unincorporated dNTPs, primers, primer dimers, salts, and other contaminants

(Beckman Coulter). Short indexed fragments were also removed during a size selection step.

The NTA plate was briefly centrifuged before the foil was removed and 50 µl was transferred

from the NTA plate to a new 96-well plate labeled CAA (Clean Amplified Plate). The magnetic

beads were vortexed to ensure that they were evenly dispersed before adding 30 µl of beads to

each sample well of the CAA plate. The sample wells were then mixed by pipetting. The plate

was incubated at room temperature for 5 min without being disturbed. The CAA plate was then

set on a magnetic rack to allow the supernatant to clear before the supernatant was removed and

discarded. With the plate still on the magnetic stand, 200 µl of 80% ethanol was carefully added

to the sample wells of the CAA plate to avoid disturbing the beads.

The CAA plate was incubated for 30 sec before the ethanol was removed. The ethanol

wash was repeated and, after another 30 sec incubation, the ethanol was removed. After the removal

of the ethanol from the second wash, any excess ethanol was removed with a fine tipped pipette.

The plate was air-dried for 15 min on the magnetic stand. After the plate was dry, it was removed

from the magnetic stand and 52.5 µl of Resuspension Buffer was added to each sample well in

the CAA plate. Each sample was mixed gently by pipetting and left undisturbed at room

temperature for 2 min before the plate was moved to the magnetic stand. Once the supernatant

had cleared, 50 µl of supernatant from the CAA plate was transferred to a new 96-well plate

labeled CAN (Clean Amplified NTA plate).

2.6.6 Accurate Determination of DNA Fragment Size

In order to ensure optimal size distribution within the sample library, an Agilent® 2100

Bioanalyzer (Agilent Technologies) was used along with an Agilent® DNA 1000 kit to

determine the correct size of the DNA fragments. This method is preferred to standard gel

electrophoresis due to its output data format that can be transferred via computer interface. Each

DNA sample was loaded onto an Agilent® DNA chip consisting of interconnected

microchannels and separated by size electrophoretically. A gel matrix was combined with a dye

concentrate and transferred to a spin filter. The solution was centrifuged at 2240 x g for 15 min.

Care was taken to protect the solution from light. An Agilent® DNA chip was placed on the chip

priming station, 9 µl of filtered gel-dye mix was pipetted into the well, marked with a “G,” and

the priming station cover was closed.

A priming syringe was clipped on the syringe stand and the plunger was depressed and

held in position to force the gel-dye mix into the chip. Three microliters of gel-dye solution mix

was pipetted into the designated well and 5 µl of marker solution was loaded into the sample and

ladder wells on the chip. One microliter of DNA size standard ladder was loaded into the ladder

well. One microliter of each sample was loaded into each sample well, and the chip was placed

in an adapter and vortexed for 1 min at 2400 rpm. The desired assay was

selected on the computer display, the chip was loaded onto the instrument, and the assay was

started.

2.6.7 Library Normalization

Library normalization balances the quantity of each library to ensure a more equal library

representation when the samples are combined to create the pooled sample library.

Twenty microliters of supernatant from the CAN plate was transferred to a new 96-well plate

labeled LNP (Library Normalization Plate). A quantity of 45.85 µl per sample of Library

Normalization Additives 1 (LNA1) was added to a 15 ml conical tube. Library Normalization

Beads 1 (LNB1) was mixed thoroughly by pipetting using a P1000 set to 1000 µl. Once the

LNB1 was mixed, 8.33 µl per sample was added to the 15 ml conical tube. A P200 with a cut tip

was used when the total volume of LNB1 was less than the minimum for a P1000 pipette.

Immediately after LNA1 and LNB1 were mixed, 45 µl of the LNA1/LNB1 solution was added

to each sample well of the LNP plate.
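
Because LNA1 and LNB1 are combined in a single tube before dispensing, the volumes scale with the number of samples. A minimal sketch of that scaling is shown below, using the per-sample volumes given above and, as an assumption for illustration, a run of 21 samples; the function name is hypothetical.

```python
LNA1_PER_SAMPLE_UL = 45.85  # Library Normalization Additives 1, per sample
LNB1_PER_SAMPLE_UL = 8.33   # Library Normalization Beads 1, per sample
DISPENSED_PER_WELL_UL = 45.0


def normalization_mix(n_samples):
    """Return (LNA1 µl, LNB1 µl, surplus µl left after dispensing 45 µl per well)."""
    lna1 = n_samples * LNA1_PER_SAMPLE_UL
    lnb1 = n_samples * LNB1_PER_SAMPLE_UL
    surplus = lna1 + lnb1 - n_samples * DISPENSED_PER_WELL_UL
    return round(lna1, 2), round(lnb1, 2), round(surplus, 2)


lna1_ul, lnb1_ul, surplus_ul = normalization_mix(21)  # 21 samples assumed
print(f"LNA1: {lna1_ul} µl, LNB1: {lnb1_ul} µl, surplus: {surplus_ul} µl")
```

The surplus reflects that the per-sample volumes sum to more than the 45 µl dispensed per well, which leaves a margin for pipetting losses.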

The LNP plate was sealed with foil and shaken for 30 min at 1800 rpm. Once thoroughly

mixed, the foil was removed and the LNP plate was placed on the magnetic rack until the

supernatant was clear. Eighty microliters of supernatant was discarded from each well of the

LNP plate while on the magnetic stand. The LNP plate was then removed from the magnetic

stand and 45 µl of Library Normalization Wash 1 (LNW1) was added. The LNP plate was sealed

with foil and shaken for 5 min at 1800 rpm. The plate was then placed back on the magnetic

stand until the supernatant was clear. The supernatant was removed and discarded from the

sample wells while on the magnetic stand. Again, the LNP plate was removed from the magnetic

stand and washed with LNW1.

After the second LNW1 wash, the LNP plate was removed from the magnetic stand and

30 µl of 0.1 N NaOH was added to each sample well. The plate was sealed with foil and shaken

for 5 min at 1800 rpm. Thirty microliters of Library Normalization Storage buffer 1 (LNS1) were

added to each sample well of a new 96-well plate labeled SGP (Storage Plate). The plate was

removed from the shaker and placed on the magnetic stand until the supernatant was clear.

Thirty microliters of the supernatant from LNP plate were transferred to the SGP plate. In

preparation for cluster generation and sequencing, equal volumes of normalized library were

combined, diluted in Hybridization Buffer, and denatured on a heat block at 96 ºC.

2.6.8 Library Pooling and MiSeq® Sample Loading

Once the reagent cartridge was thawed, a 1.5 ml microfuge tube was labeled PAL

(Pooled Amplicon Library) and 5 µl from each sample well was added to the tube. A microfuge

tube was labeled DAL (Diluted Amplicon Library) and 576 µl of Hybridization Buffer (HT1)

was added to the tube. Twenty-four microliters of PAL was transferred to the DAL tube. The

DAL solution was vortexed and incubated on a heat block for 2 min at 96 °C. After the

incubation, the DAL was mixed by inversion and placed in an ice-water bath. The DAL

remained in the ice water bath for 5 min before being loaded into the “load sample” well of the

MiSeq® cartridge.
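
The dilution performed in this step can be verified with a short calculation. The sketch below uses only the volumes stated above and shows that the pooled library is diluted 25-fold into a final volume of 600 µl; the variable names are illustrative.

```python
PAL_UL = 24.0   # pooled amplicon library transferred to the DAL tube
HT1_UL = 576.0  # Hybridization Buffer already in the DAL tube

dal_total_ul = PAL_UL + HT1_UL
fold_dilution = dal_total_ul / PAL_UL

print(f"DAL volume: {dal_total_ul:.0f} µl ({fold_dilution:.0f}-fold dilution of the PAL)")
```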

2.6.9 Prepping the Illumina MiSeq® Platform and Sequencing Reagents

The MiSeq® reagent cartridge, stored at -20 ºC, was placed in water at room temperature

for approximately 60 min to thaw. The reagent cartridge was then inverted ten times to mix the

thawed reagents and assure that they were free of precipitates. The foil seal covering the

reservoir labeled “Load Samples” was cleaned with a low-lint lab tissue and the foil seal was

pierced with a clean 1 ml pipette tip. Six hundred microliters of the prepared pooled library was

then loaded into the “Load Samples” reservoir.

While wearing gloves, the flow cell was carefully removed from the storage buffer with a

pair of plastic forceps. The flow cell was rinsed with molecular grade water to remove any

excess salts and then dried using a lint-free lens cleaning tissue. The flow cell glass was cleaned

with an alcohol wipe and the excess alcohol was removed with a lens tissue. The flow cell was

then loaded into the flow cell compartment and the “Next” option was selected. The reagents

were loaded. The “Next” option was selected and the flow cell was prepared and loaded onto the

MiSeq® Desktop Sequencer.

The wash buffer (PR2) bottle was inverted to mix and placed in the reagent compartment

between the reagent chiller and the waste bottle. The waste bottle was emptied and the sippers

were lowered into the PR2 bottle and the waste bottle. The reagent cartridge was loaded into the

reagent chiller compartment and the “Next” option was selected.

The sample sheet was loaded and the run parameters verified. The “Next” option was

selected and the system started the pre-run check. After the check was completed, the “start run”

option was selected and the sequencing run was initiated.

2.6.10 Sequence Run Monitoring

The MiSeq® Reporter software that runs in conjunction with the MiSeq® Desktop

Sequencer provides primary analysis and, when instructed by the user, re-queues a run for

further analysis to generate a different output format. The resulting sequencing and

analysis files from the primary run are saved on the Illumina® cloud-computing environment

BaseSpace and on-instrument using MiSeq® Reporter. Additional analyses were performed using

an off-instrument installation of MiSeq® Reporter.

2.7 Mitochondrial Genome Comparison

Recent advances in technology have made it easier and more affordable to sequence the

entire mitogenome (Duchene, Archer et al. 2011). The complete mitogenome sequence has been

reported for members of other genera in the tribe Arini (which includes macaws and parakeets):

Eupsittula pertinax, Psittacara brevipes, Thectocercus acuticaudatus, and Rhynchopsitta terrisi

(Pacheco, Battistuzzi et al. 2011, Urantowka, Strzala et al. 2014). Within the genus Ara, however,

only one mitogenome, that of Ara glaucogularis (Urantowka 2014), and a partial mitogenome of the

nominate subspecies Ara macao macao (Seabury, Dowd et al. 2013) have been sequenced. There is not any published

or documented DNA sequence, mitochondrial or nuclear, of the endangered subspecies, Ara

macao cyanoptera.

In order to obtain a set of DNA fragments that collectively encode the mitochondrial

genome, total genomic DNA was extracted from liquid blood using a guanidinium thiocyanate

(GITC) method and from FTA® cards using a “thorough buffer rinse method” (Smith and

Burgoyne). The final DNA pellet was then in each case rehydrated and quantified using a

NanoDrop® 2000C Spectrophotometer (ThermoScientific).

Six primer pairs were designed that collectively amplify an overlapping set of fragments

encompassing the entire mitogenome. Because the mitogenome is strongly conserved across

species, the published mitogenome sequence of Gallus gallus (chicken) in the National Center

for Biotechnology Information (NCBI) database was used to design the initial set of consensus

primers (Sorenson, Ast et al.) (Desjardins and Morais). Primer sets that failed to amplify

fragments were subsequently compared to the partial sequence of an Ara macao macao

individual which became available at that time (Seabury, Dowd et al.). Primer sets that had failed

to effectively amplify fragments were modified based upon this sequence.

The control region is inherently problematic during amplification, read assembly, and the

subsequent contig alignment after sequencing. Cytosine homopolymers, which are located near

the 5’ end of the locus, assist in forming regulatory hairpin-loop structures that serve an

important function during mitochondrial genome replication and transcription (Kilpert and

Podsiadlowski 2006). However, this structure creates difficulty in capturing all of the nucleotide

sequence detail because of the strand barrier formed by the folding at this conserved secondary

region. To add to the unusual sequence patterns in this area, adenine/thymine (AT) residues

form short tandem repeat sequences which are often found in abundance near the 3’ end of the

control region and often contain a thymine homopolymer approximately 15 bp long. This

sequence is suspected to be the cause of the presumed polymerase slippage problem during

amplification and/or sequencing. Our lab was able to overcome these problems through trial and

error of a number of methods such as altering temperatures, using different enzymes, and

designing a variety of primer sets to use for amplification of the control region of different

individual scarlet macaws.

The difficulty the Ion Torrent® platform has in calling homopolymer DNA sequences

(Feinstein and Cracraft 2004, Reumers, De Rijk et al. 2012) and the abundance of secondary

structures in this region caused the base call quality scores to be lower than desired and the

platform to report an incorrect number of consecutive nucleotides. This is due to the use of non-terminating bases

during sequencing allowing the addition of repetitive nucleotides until the wash step removes the

nucleotides from the chip. This area of the mitogenome was sequenced a second time with the

Illumina MiSeq platform. The effort resulted in an increased level of coverage (due to the

different platform chemistry involved) as well as a much higher quality score. The entire

mitogenome was amplified in six segments ranging in size from 2910 bp to 4423 bp (Figure 16).
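
Homopolymer tracts of the kind described above can be located in an assembled sequence before deciding which regions to re-sequence. The following is an illustrative sketch only: the example sequence is synthetic, the minimum run length of 5 is an arbitrary assumption, and the function is not part of the analysis pipeline used in this study.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=5):
    """Return (start position, base, run length) for runs of at least min_len."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


# Synthetic example: a cytosine run and a 15 bp thymine run, 0-based positions.
example = "ACGT" + "C" * 8 + "GATTACA" + "T" * 15 + "GC"
runs = homopolymer_runs(example)
for start, base, length in runs:
    print(f"{base} x {length} starting at position {start}")
```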

Table 1: Mitochondrial Primer Pairs Used to Amplify the Mitogenome of Ara macao (Scarlet Macaw)

Segment   Size (bp)   TA (°C)   Primer    Source†   Primer Sequence (5’-3’)

1         3111        69        F2682     a         CCAACATCTTAGCGGATCTTAGCG
                                R5793     a         GAAGCTTGAAGAGAGGAGTAGG

2         4423        69        L2258     b         CGTAACAAGGTAAGTGTACCGGAAGG
                                H6681     b         GGTATAGGGTGCCGATGTCTTTGTG

3         2910        66        F7266     c         GCCTTCAAAGCCTTAAACAAGAG
                                R10176    a         AAGAAGGTTAGGATCATGGTCAAG

4         4110        69        F9869     c         GGCCAGTGCTCAGAAATCTGTGG
                                R13979    c         GATGGGTGGCTCCTAAGACCAGTG

5         3666        67        F13328    a         GCCTACTCCTCCGTAAGCCACATAGG
                                R0028     c         CTTCGTGTTTTGGTTTACAAGACC

6         3460        65        F16478    c         CACGAATCAGGATCAAACAACC
                                R2968     a         ACCTGTCTTGTTAGTGGGCTGT

†Primer source: a, this study; b, Sorenson et al. (2009); c, modified from Sorenson et al. (2009).
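
The degree of redundancy built into the six amplicons can be estimated from the segment sizes in Table 1. The sketch below assumes a circular mitogenome of 16,993 bp (the length given in the abstract) and six junctions between neighboring segments; it reports only the total and mean overlap, since the exact overlap at each junction depends on primer coordinates in the macaw sequence.

```python
SEGMENT_SIZES_BP = [3111, 4423, 2910, 4110, 3666, 3460]  # from Table 1
MITOGENOME_BP = 16993  # length reported in the abstract

total_amplified_bp = sum(SEGMENT_SIZES_BP)
total_overlap_bp = total_amplified_bp - MITOGENOME_BP
# A circular genome covered by six segments has six segment-to-segment junctions.
mean_overlap_bp = total_overlap_bp / len(SEGMENT_SIZES_BP)

print(f"Total amplified: {total_amplified_bp} bp")
print(f"Total overlap:   {total_overlap_bp} bp")
print(f"Mean overlap per junction: {mean_overlap_bp:.0f} bp")
```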

Various protocols and DNA polymerases were used to amplify the mitogenome

segments. The majority of the mitochondrial segments were amplified in 25 µl reaction volumes

containing Phusion® High-Fidelity (HF) buffer (1.5 mM MgCl2), 0.2 mM dNTPs, 0.5 µM of

each primer, 0.5 U Phusion® HF polymerase (New England Biolabs), and 1 ng of template. The

thermal cycling profile used was one cycle of 1 min at 98 °C followed by 30 cycles each

consisting of 10 sec at 98 °C, 20 sec at the annealing temperature (TA) listed in Table 1, 80 sec at

72 °C, and then a final 7 min incubation at 72 °C. Due to the presence of extensive secondary

structure in the region, the amplification of the fragment that included the control region was

carried out using an elongation step of 70 sec instead of 80 sec and used GC Phusion® buffer

instead of HF Phusion® buffer with Phusion® HF polymerase. PCR amplification was

performed on a PTC-200 Peltier Thermal Cycler (MJ Research, Waltham, Massachusetts).

The PCR product was quantified using a NanoDrop® 2000C Spectrophotometer

(ThermoScientific). Each amplified sample was loaded onto a 1% sodium borate (SB) agarose

gel and stained with ethidium bromide. The gel was electrophoresed in 1X SB buffer at 120 V.

One kilobase Quick-Load® ladder (Invitrogen) was used as a size standard (see Figures 10, 11,

12, 13, 14 & 15).

Figure 10: Electrophoretic analysis of mitochondrial segment 1 amplicons. Amplified Ara

macao samples were electrophoresed using a 1% sodium borate (SB) agarose gel

with a 1 kb size standard ladder (Invitrogen).

Figure 16: Overlapping PCR amplicons used to amplify the Ara macao mitogenome.

2.8 Next Generation Sequencing Using the Ion Torrent® PGM™ Sequencing System (Life

Technologies)

The Ion Torrent® PGM™ is a semiconductor-based sequencer. It was the first

benchtop sequencing technology that did not depend on a controlled light source. Once each

sample is fragmented and ligated to adapters, multiple samples can be combined into a library.

The library is attached to beads and clonally amplified. With each addition of a complementary

base to the growing strand, a proton is released, altering the pH. This sequencing-by-synthesis

method records each base incorporation and filters out low-accuracy reads (see Figure 17).

2.8.1 Combining and Standardization of Mitochondrial DNA Samples

Each PCR amplicon was column-cleaned with the Wizard® SV Gel and PCR Clean-Up

System (Promega®). The amplicon products were quantified and checked for purity using a

NanoDrop® 2000C Spectrophotometer (ThermoScientific). As a rule, 260/280 nm and 260/230

nm absorbance ratios of 1.8 or greater are considered acceptable indications of adequate purity.

Four hundred nanograms of each of the six amplified mitochondrial segments for each individual

sample were combined in equimolar amounts to ensure equal representation across the

mitogenome during fragmentation and sequencing. The initial amount of total DNA in each

sample was between 0.1 µg and 1.0 µg. These samples were then further diluted to obtain 500 ng in

15.5 µl of molecular grade water. The fragmentation protocol required 0.1 to 1.0 µg of DNA as

starting material.
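
The text does not give the arithmetic behind the equimolar pooling, so the following Python sketch shows one common way to reason about it: for equal numbers of molecules, the mass contributed by each amplicon is proportional to its length. The 660 g/mol average mass per base pair is a standard approximation, the function names are illustrative, and the 500 ng total is used only as an example; the two lengths are those listed for segments 5 and 6 in Table 1.

```python
BP_MASS = 660.0  # approximate average mass of one base pair, g/mol (assumption)


def ng_to_pmol(mass_ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules."""
    return mass_ng * 1000.0 / (length_bp * BP_MASS)


def equimolar_masses(lengths_bp, total_ng):
    """Split total_ng so every amplicon contributes the same number of molecules.

    For equal molar amounts, the mass of each amplicon is proportional to its length.
    """
    total_length = sum(lengths_bp.values())
    return {name: total_ng * length / total_length
            for name, length in lengths_bp.items()}


# Segment 5 and 6 lengths come from Table 1; the 500 ng total is illustrative.
masses = equimolar_masses({"segment5": 3666, "segment6": 3460}, total_ng=500.0)
```

Converting each mass back with `ng_to_pmol` gives the same picomole value for both segments, which is a quick check that the pool is balanced.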

2.8.2 DNA Fragmentation and End Repair

The NEBNext® Fast DNA Fragmentation and Library Prep Set for the Ion Torrent®

(New England Biolabs) was used to prepare the DNA libraries. A 15.5 µl aliquot of the

standardized DNA was combined with 2 µl of NEBNext® DNA Fragmentation Reaction Buffer

and 1 µl of 100 µM MgCl2. This DNA/buffer mix was vortexed for 3 sec, briefly centrifuged,

and placed on ice. The NEBNext® DNA Fragmentation Master Mix was vortexed for 3 sec and

1.5 µl of the mix was added to the DNA/Buffer solution. The solution was then incubated on a

thermal cycler for 20 min at 25 °C, followed by 10 min at 70 °C and finally held at 4 °C. The

microfuge tube was then placed on ice.

2.8.3 Ligation of Adapters to DNA

Barcode adapters with distinct sequences were ligated to each individual

sample so that several samples could be pooled (multiplexed) in one sequencing run. A P1 adapter was

ligated to the end of each sample sequence to allow for the attachment to paramagnetic

AMPure® XP beads during size selection and sequencing. One microliter of sterile water, 4 µl of

T4 DNA Ligase Buffer for Ion Torrent®, 5 µl of P1 adaptor solution, 1 µl of Bst 2.0

WarmStart® DNA polymerase preparation, 4 µl of T4 DNA ligase solution, and 5 µl of a

barcode solution were added to the fragment solution and mixed by pipetting. The combined

solution was incubated in a thermal cycler for 15 min at 25 °C, followed by 5 min at 65 °C and

was then held at 4 °C. Five microliters of stop buffer was added to the microfuge tube, which was

then vortexed. Sixty microliters of 0.1X TE buffer (1X TE: 10 mM Tris-HCl, 1 mM EDTA, pH 8.0) was

added to the microfuge tube.

2.8.4 Bead-Based Size Selection of Amplified DNA

Agencourt AMPure® XP Beads (Beckman Coulter) were used to selectively isolate the

200 bp fragments from the unwanted larger and smaller fragments. Seventy microliters of

AMPure® XP magnetic beads were added to the DNA library solution and mixed by pipetting.

The solution was allowed to incubate at room temperature for 5 min, and then placed on a

magnetic rack to separate the beads, which bind the larger, unwanted fragments, from the

supernatant. After the solution was clear, the supernatant was transferred to a new microfuge

tube and the beads were discarded. Fifteen microliters of re-suspended AMPure® XP beads were

added to the supernatant, mixed, and allowed to incubate at room temperature for 5 min. The

tube was then placed on the magnetic rack to separate out the beads that were now bound to the

desired fragment size.

Once the solution was clear, the supernatant was removed and discarded. Fifty microliters

of fresh 80% ethanol was added to the tube while on the magnetic rack. The solution was

allowed to incubate at room temperature for 20 sec, and the supernatant was removed and

discarded. Five hundred microliters of ethanol was added to the tube and allowed to incubate at

room temperature for 30 sec, and the supernatant was discarded. The tubes were air dried while

on the magnetic rack with the caps open for 5 min. The DNA target was then eluted from the

beads with 45 µl of 0.1X TE (1.0 M Tris HCl, 0.5 M EDTA, pH 8.0). The solution was mixed by

vortexing and placed on the magnetic rack until clear. Forty microliters of solution was

transferred to a new microfuge tube.

2.8.5 PCR Amplification of Adaptor Ligated DNA

Fifty microliters of NEBNext® High-Fidelity 2X PCR Master Mix, 2 µl of equalizer

primers, and 8 µl of sterile water were added to the 40 µl adaptor-ligated 200 bp fragment

solution. The solution was amplified under the following conditions: initial denaturation at 98 °C

for 30 sec, six cycles of denaturation at 98 °C for 10 sec, annealing at 50 °C for 30 sec, and

elongation at 72 °C for 30 sec, followed by a final elongation at 72 °C for 5 min.

2.8.6 Equalizing the Library for Multiplexing

In order to prepare the samples for multiplexing, each sample in the combined library

needed to be represented in equal amounts during sequencing. This was accomplished using special

equalizer beads and buffer. Ten microliters of Equalizer Capture Solution was added to the

amplified samples and mixed by pipetting. The solution was allowed to incubate at room

temperature for 5 min. Six microliters of washed Equalizer Beads were added to the microfuge

tube. The solution was mixed by pipetting and allowed to incubate at room temperature for 5

min. The tube was placed in the magnetic rack until the solution was clear.

The supernatant was removed and discarded and 50 µl of Equalizer Wash Buffer was

added to the reaction. The tube was removed from the magnetic rack, gently mixed by pipetting,

and then returned to the magnetic rack. The supernatant was removed and discarded, and 150 µl of

Equalizer Wash Buffer was added to the reaction. The tube was again removed from the

magnetic rack and gently mixed by pipetting. The tube was then placed in the magnetic rack and

the supernatant was removed and discarded. The tube was then removed from the magnetic rack

and 100 µl of Equalizer Elution Buffer was added to the pellet. The solution was mixed by

pipetting and the tube was sealed and placed in a thermal cycler at 35 °C for 5 min. The tube was

again placed in the magnetic rack and allowed to incubate at room temperature for 5 min. The

supernatant, which contained the equalized library, was then removed. The final concentration of each

equalized library was approximately 100 pM.

2.8.7 Assessment of DNA Library

An Agilent® 2100 Bioanalyzer (Agilent Technologies) was used along with an Agilent®

DNA 1000 kit to determine the fragment size of the amplified DNA as well as

the sample concentration. Each DNA sample was loaded onto an Agilent® DNA chip consisting

of interconnected microchannels and separated by size electrophoretically. This method is

preferred to standard gel electrophoresis because its output is digital and can be transferred via a

computer interface. A gel matrix was combined with a dye concentrate and transferred to a spin

filter. The solution was centrifuged at 2240 x g for 15 min. Care was taken to protect the mixture

from light. An Agilent® DNA chip was placed on the chip priming station, 9 µl of filtered gel-

dye mix was pipetted into the G well, and the priming station cover was closed.

A priming syringe was clipped on the syringe stand and the plunger was depressed and

held in position to force the gel-dye mix into the chip. Three microliters of gel-dye solution mix

was pipetted into the designated well, and 5 µl of marker was loaded into the sample and ladder

wells on the chip. One microliter of DNA size standard ladder was loaded into the ladder well.

One microliter of each sample was loaded into each sample well, and the chip was placed in an

adapter and vortexed for 1 min at 2400 rpm. The desired assay was selected on

the computer display, the chip was loaded onto the instrument, and the assay was started.

Each sample was then diluted to 23 pM using molecular grade water. Twenty microliters

of each sample was transferred to a LoBind microfuge tube and this was stored at 4 °C.
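
The dilution step above follows the usual C1·V1 = C2·V2 relationship. The short Python sketch below illustrates the arithmetic; the function name is hypothetical, and the 100 pM starting concentration and 20 µl final volume are taken from the approximate values stated in the text purely as an example.

```python
def dilution_volumes(stock_conc, target_conc, final_volume):
    """Return (stock volume, diluent volume) from C1*V1 = C2*V2.

    Concentrations share one unit (e.g. pM) and volumes share another (e.g. µl).
    """
    if stock_conc <= 0 or target_conc > stock_conc:
        raise ValueError("target concentration must not exceed a positive stock concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


# Illustrative: a 100 pM equalized library diluted to 23 pM in 20 µl.
stock_ul, water_ul = dilution_volumes(100.0, 23.0, 20.0)
```

Under these assumed inputs, about 4.6 µl of library would be brought to 20 µl with 15.4 µl of water.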

2.8.8 Preparation of Ion-Sphere Particles for Emulsion PCR & Enrichment

The samples were combined to create a multiplex library. The sample templates were

attached to Ion Sphere Particles (ISPs) in a concentrated reaction to regulate the ratio of template

to particles. This was performed on an Ion Torrent® OneTouch 2 system in an aqueous medium

composed of nanoliters of PCR reagents in individual droplets. The individual aqueous droplets

of this emulsion are referred to as microreactors. Each microreactor should contain a single

library fragment, a single bead, and enough PCR reagent (primer, dNTPs, and polymerase) for a

clonal amplification to occur. Multiple copies of the same template were clonally amplified on

the bead.

The Ion Torrent® OneTouch ES enriches the solution after emulsion PCR by removing

non-templated ISPs. Dynabeads® MyOne Streptavidin C1 Beads separate the biotinylated

primers and free template from the templated ISPs.

Figure 17: Ion Torrent sequencing work flow for preparation (Kiesler 2014)

2.8.9 Ion Torrent® Run Plan

A run plan was created in the Torrent Suite & Ion PGM™ System software on the server

connected to the Ion PGM™. The Ion PGM™ was prepared and initialized, and the enriched

ISPs were loaded onto an Ion 316™ chip. The chip was centrifuged repeatedly to seat individual

ISPs into signaling positions.

Each microcell position on the chip should contain only one template-positive ISP. Once

the sequencing run was initiated, sequencing-by-synthesis began, and the chip was flooded

with one dNTP at a time. When a nucleotide was incorporated into a growing strand of

complementary DNA, hydrogen ions were released. The chip underwent a change in pH when

this occurred and the unit recorded this data as a digital signal.

Figure 18: Ion Torrent sequencing final workflow (Kiesler 2014).

2.9 Mitochondrial DNA Sequence Data Analysis

The Ion Torrent® Server and Ion Torrent® Suite software record and process the

sequences from the sequenced library. The pooled library was demultiplexed into

individual samples using the barcodes attached during library preparation.
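
Demultiplexing was carried out by the instrument software; the following Python sketch only illustrates the idea of sorting reads by a leading barcode. The barcode sequences, sample names, and reads are made up, exact matching at the 5' end is an assumption, and the function is not the Torrent Suite implementation.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by the barcode found at their 5' end and trim it off.

    barcodes maps barcode sequence -> sample name. Reads whose start matches
    no barcode exactly are collected under "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        for barcode, sample in barcodes.items():
            if read.startswith(barcode):
                bins[sample].append(read[len(barcode):])
                break
        else:
            bins["unassigned"].append(read)
    return dict(bins)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ACGTAC": "sample_A", "TGCATG": "sample_B"}
reads = ["ACGTACGGATTC", "TGCATGCCAATG", "ACGTACTTTGGA", "GGGGGGAAACCC"]
binned = demultiplex(reads, barcodes)
```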

The Ion Torrent® support software, Torrent Suite™, processes the run data and produces FASTQ output

in the form of 200 bp reads. The reads were aligned preliminarily on-instrument to create

contigs. Base calls were scored by local cluster call stringencies programmed at the beginning of

the run. The mitochondrial sequence contigs were downloaded from the instrument in FASTQ

file format (nucleotide sequence including quality scores) and were further analyzed by

Geneious® version 8.0.4 (see Figure 18) (Kearse, Moir et al. 2012).
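
Because FASTQ files carry a quality string alongside each sequence, a small Python sketch of how those scores can be decoded may be helpful. It assumes Phred+33 (Sanger) encoding and unwrapped four-line records; the record shown is invented, and the helper names are illustrative rather than part of Geneious® or Torrent Suite™.

```python
def parse_fastq(text):
    """Yield (read_id, sequence, phred_scores) from four-line FASTQ records.

    Assumes Phred+33 (Sanger) quality encoding and unwrapped records.
    """
    lines = [line for line in text.splitlines() if line]
    for i in range(0, len(lines) - 3, 4):
        header, seq, _, qual = lines[i:i + 4]
        yield header[1:], seq, [ord(ch) - 33 for ch in qual]


def mean_quality(scores):
    """Arithmetic mean of the per-base Phred scores of one read."""
    return sum(scores) / len(scores)


# Made-up record: 'I' encodes Q40 and '5' encodes Q20 under Phred+33.
example = "@read1\nACGT\n+\nII55\n"
records = list(parse_fastq(example))
```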

The off-instrument program, Geneious®, conducted all of the mitochondrial alignments,

assemblies, and analyses. Comparative analysis was conducted with ClustalW using default

parameters as implemented, and the alignments were adjusted by eye. Initial alignment of each sample was performed

de novo. This gave a basis for true alignment before being matched back to a reference sequence.

Uncorrected pairwise sequence divergence between samples was calculated for all genomic

features based on MUSCLE alignments executed in Geneious® version 8.0.4.
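
Uncorrected pairwise divergence (the p-distance) is the proportion of compared alignment columns at which two sequences differ. The Python sketch below shows one way to compute it from two rows of an existing alignment; skipping columns that contain a gap or an ambiguous base is an assumption made here, and the exact treatment in Geneious® may differ.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected pairwise divergence between two aligned sequences.

    Columns with a gap ('-') or ambiguous base ('N') in either sequence are
    skipped; the result is differences / compared sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared
```

For example, two aligned 8-base sequences that differ at one position have a p-distance of 1/8.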

2.10 Alignments to Reference Sequence

The reference sequence used in this alignment was from a nominate Ara macao macao

taken from the wild in Brazil and now housed at the Blank Park Zoo in Des Moines, Iowa. A wild-caught

bird was chosen as the reference to ensure the sample originated from a pure

nominate subspecies and not from a hybrid of the two subspecies (Drees 2010).

During alignment, the consensus sequence and reference sequence were aligned by

allowing gaps to make the most matches between the two sequences, and then various analysis

matrices were employed to compare the sequences. By choosing the optimal alignment,

sequences of different length can be compared. Scoring methods were used and scores were

assigned for gaps in the sequence or different bases. In this way, different alignments can be tried

and the most probable alignment will have the highest alignment score. The appropriate scoring

function must be used based on an evolutionary model with insertions, deletions, and

substitutions. The substitution score matrix contained an entry for every amino acid pair.

Algorithms were used to maximize similarity between sequences, minimize distance between

sequences, and apply percent-acceptable mutation. Residues that were aligned but did not match

represent substitutions, i.e., single nucleotide polymorphisms. Residues that were aligned

with a gap in the sequence (in one direction or the other) represent insertions and deletions.

Matrices evaluated the likelihood that alignments were significant rather than random. Many

different matrices can be employed to analyze sequence similarity.
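
The gap-and-substitution scoring described above is the basis of dynamic-programming alignment. As an illustration only, the following Python sketch computes a Needleman-Wunsch global alignment score for two nucleotide sequences. The match, mismatch, and gap values are arbitrary choices, a simple linear gap penalty stands in for the scoring matrices mentioned in the text, and this is not the algorithm or parameter set used by Geneious® or ClustalW.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch score of the best global alignment of a and b.

    Uses a linear gap penalty; only the score is returned, not the alignment.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]
```

With these illustrative parameters, aligning ACGT to AGT scores three matches and one gap (3 - 2 = 1), which is higher than any alignment that opens more gaps.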

2.11 Differentiation on the Sequence Level

Once the alignment was complete, the software, in this case ClustalW, used a computer

model algorithm to detect single nucleotide polymorphisms (SNPs), insertions, and deletions. On

average, SNPs account for 90% of sequence variation in the genome, and they are a major source of

heterozygosity. SNPs rarely affect the function of a protein and often occur in non-protein

coding areas or intronic regions. They can also occur in the third base position of a codon and

result in the same amino acid. SNPs can cause a change in a protein residue that results in a

chemically similar amino acid or a change in amino acid that does not affect the function of that

protein because it is not an important structural amino acid of the protein. Other polymorphisms

such as insertions and deletions (indels) are also taken into account when measuring variation

between sequences. This type of change occurs at a much lower frequency because of the more

disruptive shift it could cause within the coding sequence.
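
To make the distinction between SNPs and indels concrete, the following Python sketch classifies the differences between two rows of a pairwise alignment. It is an illustrative helper with hypothetical names, not the detection routine of the software used in this study; positions are reported in 1-based reference coordinates, and an insertion is reported at the last reference base preceding it.

```python
def call_variants(ref_aln, qry_aln):
    """List (ref_position, kind, ref_base, query_base) for two aligned rows.

    '-' marks a gap. A gap in the reference row is an insertion in the query;
    a gap in the query row is a deletion; any other mismatch is a SNP.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("rows must have equal aligned length")
    variants = []
    ref_pos = 0  # 1-based position of the last reference base consumed
    for r, q in zip(ref_aln.upper(), qry_aln.upper()):
        if r != "-":
            ref_pos += 1
        if r == q:
            continue
        if r == "-":
            variants.append((ref_pos, "insertion", "-", q))
        elif q == "-":
            variants.append((ref_pos, "deletion", r, "-"))
        else:
            variants.append((ref_pos, "SNP", r, q))
    return variants
```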

2.12 Identification of Candidate Nuclear Sequences for Polymorphism Screening

As noted above, mitochondrial sequences can have a much higher degree of variability

than equivalent nuclear sequences, but because they are maternal in origin only, they cannot

differentiate a hybrid of the two subspecies from A. m. macao or A. m. cyanoptera. In order to

assist in determining a more complete lineage assessment, sequences encoded by the nuclear

genome were also investigated. The nuclear genome offers bi-parental inheritance but lacks a

high degree of variability in many areas. This slower rate of change often does not provide

sufficient sequence variation to differentiate between closely related species (Dasmahapatra and

Mallet 2006).

Most protein-coding regions are highly constrained because most amino acid-altering

mutations are deleterious and become selectively eliminated (Li 1997). Intronic regions,

intergenic sequences, and the third nucleotide of a codon are often significantly less susceptible

to natural selection and fitness interference than protein-coding genes and are thus often more

variable between individuals. Therefore, these sequences are expected to have a higher number

of polymorphic sites and to evolve/change faster, making them potentially useful sequences to

explore for interspecies nucleotide variability when working with closely related species

(Watanabe, Nishida et al. 2005, Aranishi 2006).

Candidate nuclear regions which have historically displayed higher sequence variability

in other organisms were evaluated for comparative analysis in this study (Table 2). Sequences

from genes in gene families were avoided in order to increase the chance that intronic regions

would represent an unlinked locus on an individual chromosome. However, using DNA as a

successful phylogenetic tool also relies on the assumption that sequence variation among

individuals within a subspecies is much smaller than variation between two subspecies.

Therefore, sequences with too much variability needed to be avoided as candidates for sequence

comparison as well.

Nuclear primer sets were ordered from Biosynthesis, Inc., Lewisville, Texas. Primers

were designed in exonic regions when attempting to amplify intronic sequences to allow for

future use in other studies. Each lyophilized primer was hydrated to a stock concentration of

100 µM with TE buffer (10 mM Tris-HCl, 1 mM EDTA, pH 8.0). A working stock concentration of

10 µM of each primer was then made by adding molecular grade water. A variety of PCR

protocols were used for amplifying the candidate nuclear regions, and the regions that were

believed to show possible differentiation were further analyzed.

Table 2: Candidate Nuclear Regions Used for Differentiation of Subspecies (De Mendonca

Dantas, Godinho et al. 2009)

Nuclear region                        Short name   Reference length (bp)   Primers (5’-3’)

ATP citrate lyase                     ACL          469                     GCTCTGCTTATGACAGCACT
                                                                           CAGCAATAATGGCAATGGTG

Myelin proteolipid protein            MPLPR        356                     ACATCTACTTTAACACCTGGACCACCTG
                                                                           TTGCAGATGGAGAGCAGGTGGGAGCC

CEPU gene                             CEPU         627                     CGAGTCAAAGTCACCGTCAA
                                                                           CTCTTCGCATCCGAGATGTA

β-fibrinogen                          FIB          908                     CAGGACAATGACAATTCAC
                                                                           GTAGTATCTGCCATTAGG

Laminin receptor precursor P40        LRPP40       362                     GGGCCTGATGTGGTGGATGCTGGC
                                                                           GCTTTCTCAGCAGCAGCCTGCTC

V-raf murine sarcoma viral oncogene   C-RMIL       576                     TCAATCATCCACAGAGACC
                                                                           TGATGAGATCCACTCCATCG

Transforming growth factor            TGFβ2        630                     GAAGCGTGCTCTAGATGCTG
                                                                           AGGCAGCAATTATCCTGCAC

Tropomyosin                           TROP         533                     AATGGCTGCAGAGGATTA
                                                                           TCCTCTTCAAGCTCAGCACA

Axin protein                          AXIN         1280                    GATCTCCTGAAGACGTGG
                                                                           AAGGCTGGACGACGTTCC

Aldolase                              ADL          356                     CTTATGTTGAAGCTGAACGACTG
                                                                           GCACGTAGCCATAGTGCGTAGTC

Myoglobin                             MG           627                     GCTCAGGGTCTCTAGGTCCA
                                                                           CTAGGCAGCCTAAGTATGCC

Histone 2                             H2AF         908                     GCACGACGAGCATGCTAC
                                                                           AGGTATTCCTGGCACTGG

Adenylate kinase 1                    AK1          851                     TGCAAGCCATCATCGAGAAGG
                                                                           TGATGGTCTCCTCGTTGTCG

Recombination activating gene 1       RAG1         1124                    CCTCCTGCTGGTATCCCTGC
                                                                           GAATGTTCTCAGGATGCGTCC

Regulator of G-protein signaling 4    RGS4         868                     TCGCTGGAAAACTTGATCC
                                                                           GTAGTCCTCACAACTGACC

Vimentin gene                         Vim          446                     TGCTTCTTTGAACCTGAGAG
                                                                           GTGTCCTCTTCGAGTGAGTG

The adenylate kinase 1 (AK1) gene fragment was amplified by PCR with the primers

listed in Table 2 (see Figures 19 & 20). Reactions were carried out in 25 µl reaction volumes

containing One Taq® Standard Reaction buffer, 0.2 mM dNTPs, 0.2 µM each primer, 1.25 U

One Taq® Hot Start polymerase (New England Biolabs), and 200 ng DNA template using a

thermal cycler protocol of one cycle of 45 sec at 94 °C followed by 35 cycles of 30 sec at 94 °C,

45 sec at 62 °C, and 1 min at 68 °C, with a final cycle of 5 min at 68 °C. PCR amplification was

performed on a PTC-200 Peltier Thermal Cycler (MJ Research, Waltham, Massachusetts).

Figure 19: Amplified nuclear region from adenylate kinase 1 (AK1) – 1 of 2. Amplified Ara



Figure 20: Amplified nuclear region from adenylate kinase 1 (AK1) – 2 of 2. Amplified Ara



The recombination activating gene 1 (RAG1) gene fragment was amplified by PCR with

the primers listed in Table 2 (see Figures 21 & 22). Reactions were carried out in 25 µl reaction

volumes containing Taq buffer, 0.2 mM dNTPs, 0.2 µM each primer, 1.25 U Taq polymerase

(New England Biolabs), and 10 ng DNA template using a thermal cycler protocol of one cycle of 45

sec at 94 °C followed by 30 cycles of 30 sec at 94 °C, 45 sec at 61 °C, and 1 min at 68 °C, with a

final cycle of 5 min at 68 °C. PCR amplification was performed on a PTC-200 Peltier Thermal

Cycler (MJ Research, Waltham, Massachusetts).

Figure 21: Amplified nuclear region from recombination activating gene 1 (RAG1) – 1 of

2. Amplified Ara macao samples were electrophoresed using a 1% sodium

borate (SB) agarose gel with a 1 kb size standard ladder (Invitrogen).

Figure 22: Amplified nuclear region from recombination activating gene 1 (RAG1) – 2 of

2. Amplified Ara macao samples were electrophoresed using a 1% sodium

borate (SB) agarose gel with a 1 kb size standard ladder (Invitrogen).

The regulator of G-protein signaling 4 (RGS4) gene fragment was amplified by PCR with

the primers listed in Table 2 (see Figures 23 & 24). Reactions were carried out in 25 µl reaction

volumes containing Q5® Hot Start buffer, 0.2 mM dNTPs, 0.5 µM each primer, 1.0 U Q5® Hot

Start polymerase (New England Biolabs), and 50 ng DNA template using a thermal cycler protocol

of one cycle of 30 sec at 98 °C followed by 30 cycles of 10 sec at 98 °C, 20 sec at 63 °C, and 10

sec at 72 °C, with a final cycle of 5 min at 72 °C. PCR amplification was performed on a PTC-

200 Peltier Thermal Cycler (MJ Research, Waltham, Massachusetts).

Figure 23: Amplified nuclear region from regulator of G-protein signaling 4 (RGS4) – 1 of 2.

Amplified Ara macao samples were electrophoresed using a 1% sodium borate (SB)

agarose gel with a 1 kb size standard ladder (Invitrogen).

Figure 24: Amplified nuclear region from regulator of G-protein signaling 4 (RGS4) – 2 of 2.

Amplified Ara macao samples were electrophoresed using a 1% sodium borate (SB)

agarose gel with a 1 kb size standard ladder (Invitrogen).

The Vimentin (Vim) gene fragment was amplified by PCR using the primers listed in

Table 2 (see Figures 25 & 26). Reactions were carried out in 25 µl reaction volumes containing

Q5® Hot Start buffer, 0.2 mM dNTPs, 0.5 µM each primer, 1.0 U Q5® Hot Start polymerase

(New England Biolabs), and 50 ng DNA template using a thermal cycler protocol of one cycle of 30

sec at 98 °C followed by 30 cycles of 10 sec at 98 °C, 20 sec at 63 °C, and 10 sec at 72 °C, with

a final cycle of 5 min at 72 °C. PCR amplification was performed on a PTC-200 Peltier Thermal

Cycler (MJ Research, Waltham, Massachusetts).

Figure 25: Amplified nuclear region from vimentin gene (Vim) – 1 of 2. Amplified Ara



Figure 26: Amplified nuclear region from vimentin gene (Vim) – 2 of 2. Amplified Ara



2.13 Restriction Digestion Analysis

DNA fragmentation by restriction enzyme digestion is a technique used to differentiate

homologous DNA sequences using the restriction enzyme’s ability to recognize and cut at

precise sequences. This method has proven to be a powerful tool to distinguish sequence

polymorphisms between species based on the recognition of specific short nucleotide

sequences/restriction sites and has regularly been used for species identification (Haig,

Wennerberg et al. 2004). Given the significant cost of next generation sequencing, an initial

search for possible sequence variability within the amplified nuclear regions was performed.

Restriction digestion was performed on the nuclear amplicons to determine if the intronic

fragments had some degree of sequence variability between A. m. macao and A. m. cyanoptera.

Restriction enzymes with commonly occurring recognition sequences that are active under

standard conditions were chosen (Table 3).

Table 3: Restriction Enzymes

Restriction Enzyme Recognition Sequence

EcoRI G/AATTC

HindIII A/AGCTT

HaeIII GG/CC

BstEII G/GTNACC

BamHI G/GATCC

NcoI C/CATGG

One microliter of 10X EcoRI buffer (100 mM Tris-HCl, 50 mM NaCl, 10 mM MgCl2,

0.025% Triton X-100, pH 7.5), and 2 µl of amplified template DNA (0.5 µg) were added to 6.5

µl of distilled water with 0.5 µl of EcoRI restriction enzyme (20 U/ µl) in a 1.5 ml microfuge

tube. Care was taken to keep the restriction enzyme preparation to no more than 10% of the total

reaction volume because of the reaction-inhibiting effects of the glycerol mixed with the enzyme. The

solution was mixed by pipette and incubated in a thermal cycler for 1 h at 37 °C.

One microliter of 10X HindIII reaction buffer (10 mM Tris-HCl, 50 mM NaCl, 10 mM

MgCl2, 1 mM dithiothreitol, pH 7.9), and 2 µl of amplified template DNA (0.5 µg) were added to

6.5 µl of distilled water with 0.5 µl of HindIII restriction enzyme (20 U/ µl) in a 1.5 ml

microfuge tube. The solution was mixed by pipette and incubated in a thermal cycler for 1 h at

37 °C.

Five microliters of 10X BstEII reaction buffer (100 mM Tris-HCl, 50 mM NaCl, 10 mM

MgCl2, 0.025% Triton X-100, pH 7.5), and 2 µl of amplified template DNA (1.0 µg) were added

to 7.5 µl of distilled water with 0.5 µl of BstEII restriction enzyme (10 U/µl) in a 1.5 ml

microfuge tube. The solution was mixed by pipette and incubated in a thermal cycler for up to 1

h at 37 °C. All other enzymes used in restriction digest followed the above basic protocol.

Double digestions were prepared by following the recipes for the individual digestions

after determining that the buffering conditions of each enzyme were compatible. The stop

solution used to terminate the enzymatic reactions (10 µl per 50 µl reaction) contained 2.5% Ficoll®-400,

10 mM EDTA, 3.3 mM Tris-HCl, and 0.08% SDS, pH 8.0 at 25 °C.

The digested fragments were then loaded onto a 2% sodium borate (SB) agarose gel

stained with ethidium bromide and electrophoresed at 120 V (see Figure 27). If the banding

patterns differed between the subspecies, the nuclear amplicons were included in the sequencing

library for further analysis.
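
The screening logic above, in which a polymorphism that creates or destroys a recognition site changes the banding pattern, can be illustrated with a simple in silico digest. The Python sketch below is illustrative only: it covers the enzymes in Table 3 that have unambiguous six-base sites, treats the DNA as linear, and uses invented test sequences rather than any amplicon from this study.

```python
# Recognition site and cut offset (bases from the start of the site) per enzyme.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G/AATTC
    "HindIII": ("AAGCTT", 1),  # A/AGCTT
    "BamHI": ("GGATCC", 1),    # G/GATCC
    "NcoI": ("CCATGG", 1),     # C/CATGG
}


def digest(seq, enzyme):
    """Return fragment lengths (in order) after cutting a linear sequence."""
    site, offset = ENZYMES[enzyme]
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Invented 12-base sequences differing by one base inside the EcoRI site.
with_site = digest("AAAGAATTCTTT", "EcoRI")
without_site = digest("AAAGAGTTCTTT", "EcoRI")
```

In this toy case the first sequence is cut into two fragments while the second remains intact, which is the kind of difference that would appear as distinct banding patterns on a gel.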

Figure 27: Electrophoretic analysis of restriction digest of RGS4. Restriction enzymes

EcoRI, BstEII, and HindIII were used to digest the nuclear amplicon RGS4 and

create comparative banding patterns to screen for subspecies unique sequence

polymorphisms.

2.14 Next Generation Sequencing Using the Illumina MiSeq® Platform

Illumina® sequencing technology is based on massively parallel sequencing-by-synthesis (SBS)

of clonal clusters, which detects the incorporation of single bases into the

growing DNA strands. Single DNA molecules are bound to the surface of the flow cell, and

bridge-amplified to form growing clonal clusters. Four types of reversible, fluorescently-labeled,

terminator bases are added and the growing clusters are imaged as the DNA chains are extended

one nucleotide at a time. Non-incorporated nucleotides are washed away and the fluorescent

label and 3’ terminal blocker are chemically removed from the DNA to allow incorporation of

the next base. Since all 4 dNTPs are present during each cycle, natural competition minimizes

incorporation bias. The image of the new base as part of the strand is captured and the process is

repeated for each cycle of sequencing. Following image analysis, the sequencing software

performs base calling, filtering, and quality scoring (Figure 28). Requeuing analyses allow for

homology, SNP, indel, and variant calling on- or off-instrument.

Figure 28: Sequence homology and a single nucleotide polymorphism (SNP) seen between

mitochondrial sequence for macaw 065 and 1022 (Illumina MiSeq® Reporter).

CHAPTER 3

RESULTS AND CONCLUSIONS

3.1 18S Analysis of Results

Ribosomal DNA sequences have long been an invaluable tool used in evolutionary

studies. This region is phylogenetically informative due to the combination of the highly

conserved coding sequences of the 18S, 5.8S, and 28S rRNA genes flanked by the more highly

variable sequences found in the non-coding internal transcribed spacer (ITS) regions. A map

of this region is shown below (see Figure 29; see also Figure 5, supra):

Figure 29: Arrangement of ribosomal DNA (rDNA) clusters on the genome. ITS1 –

Internal Transcribed Spacer 1; ITS2 – Internal Transcribed Spacer 2; ETS –

External Transcribed Spacer; NTS – Non-Transcribed Spacer (Holstein 2006).

This variability allows for phylogenetic inferences across a broad range of evolutionary

time scales (Baldwin, Sanderson et al. 1995) but is still easy to align across divergent taxa. The

rRNA gene regions are highly conserved within species, whereas the ITS regions exhibit

divergence sufficient to resolve relationships within species or between closely related species of

most genera (Álvarez and Wendel 2003). The transcription start site and elements that influence

the regulation of the downstream genes are within the external transcribed spacer (ETS) region,

which diverges more rapidly, and therefore displays more variability, than even the ITS region

(Kovarik, Dadejova et al. 2008).

In the scarlet macaw, the 18S – 5.8S – 28S nuclear ribosomal DNA (rDNA) clusters

occur in tandem arrays on three pairs of microchromosomes (see Figure 30) (Seabury, Dowd et

al. 2013). There are estimated to be hundreds to thousands of repetitive clusters (transcriptional units)

separated by non-transcribed spacer (NTS) regions comprising each array. Each cluster encodes

three different rRNA genes (18S, 5.8S, and 28S), which are separated by two internal transcribed

spacer (ITS1 and ITS2) regions (Matyasek, Renny-Byfield et al. 2012). The highly conserved

nature of this region allows for the design of “universal” PCR primers which enable

amplification of a wide variety of taxa and the high copy number of the rDNA clusters also

allows for easier amplification/analysis.

Figure 30: Ribosomal DNA tandem cluster array. Adapted from photograph by O. L. Miller,

Jr. (Gilbert). NTS – non-transcribed spacer; ETS – external transcribed spacer;

ITS1 – internal transcribed spacer 1; ITS2 – internal transcribed spacer 2; rRNA

genes 18S, 5.8S, and 28S

Table 4: 18S rDNA Sequence Comparison Between Subspecies

Sample  Locus 1172  Freq.  Locus 1190  Freq.  Locus 1214  Freq.  Locus 1287  Freq.  Locus 1582  Freq.

A. m. cyanoptera

SL8 T/A 0.89/0.11 A/T 0.86/0.14 T/C 0.90/0.10 G/A 0.91/0.09 G/A 0.90/0.10

033 T/A 0.89/0.11 A/T 0.85/0.15 T/C 0.90/0.10 G/A 0.91/0.09 G/A 0.90/0.10

19s/m T/A 0.90/0.10 A/T 0.88/0.12 T/C 0.91/0.09 G/A 0.93/0.07 G/A 0.89/0.11

ZM19 T/A 0.90/0.10 A/T 0.87/0.13 T/C 0.91/0.09 G/A 0.93/0.07 G/A 0.92/0.08

028 T/A 0.89/0.11 A/T 0.86/0.14 T/C 0.90/0.10 G/A 0.92/0.08 G/A 0.90/0.10

4531 T/A 0.89/0.11 A/T 0.86/0.14 T/C 0.90/0.10 G/A 0.92/0.08 G/A 0.90/0.10

9021 T/A 0.89/0.11 A/T 0.86/0.14 T/C 0.89/0.11 G/A 0.91/0.09 G/A 0.90/0.10

062 T/A 0.90/0.10 A/T 0.87/0.13 T/C 0.91/0.09 G/A 0.92/0.08 G/A 0.88/0.12

065 T/A 0.89/0.11 A/T 0.86/0.14 T/C 0.91/0.09 G/A 0.92/0.08 G/A 0.87/0.13

9780 T/A 0.89/0.11 A/T 0.86/0.14 T/C 0.89/0.11 G/A 0.90/0.10 G/A 0.90/0.10

ZM16 T/A 0.90/0.10 A/T 0.85/0.15 T/C 0.93/0.07 G/A 0.93/0.07 G/A 0.90/0.10

101 T/A 0.90/0.10 A/T 0.86/0.14 T/C 0.89/0.11 G/A 0.91/0.09 G/A 0.90/0.10

119 T/A 0.89/0.11 A/T 0.87/0.13 T/C 0.91/0.09 G/A 0.90/0.10 G/A 0.90/0.10

363 T/A 0.89/0.11 A/T 0.86/0.14 T/C 0.90/0.10 G/A 0.90/0.10 G/A 0.89/0.11

A. m. macao

ZM13 T/A 0.91/0.09 A/T 0.88/0.12 T/C 0.92/0.08 G/A 0.93/0.07 G/A 0.90/0.10

CC6 T/A 0.89/0.11 A/T 0.88/0.12 T/C 0.91/0.09 G/A 0.93/0.07 G/A 0.91/0.09

SL5 T/A 0.89/0.11 A/T 0.87/0.13 T/C 0.90/0.10 G/A 0.89/0.11 G/A 0.88/0.12

ZM10 T/A 0.89/0.11 A/T 0.87/0.13 T/C 0.91/0.09 G/A 0.92/0.08 G/A 0.90/0.10

CC7 T/A 0.88/0.12 A/T 0.87/0.13 T/C 0.90/0.10 G/A 0.92/0.08 G/A 0.90/0.10

046 T/A 0.90/0.10 A/T 0.90/0.10 T/C 0.90/0.10 G/A 0.89/0.11 G/A 0.89/0.11

024 T/A 0.89/0.11 A/T 0.86/0.14 T/C 0.90/0.10 G/A 0.91/0.09 G/A 0.88/0.12

Hybrids

043 T/A 0.90/0.10 A/T 0.87/0.13 T/C 0.91/0.09 G/A 0.92/0.08 G/A 0.90/0.10

1022 T/A 0.89/0.11 A/T 0.86/0.14 T/C 0.91/0.09 G/A 0.92/0.08 G/A 0.90/0.10

The 1815 bp fragment encompassing the 18S rDNA and the 5’ end of the flanking

internal transcribed spacer 1 (ITS1) regions of both the nominate A. macao macao and A. macao

cyanoptera subspecies were sequenced on the Illumina MiSeq® Desktop Sequencer (Table 4).

The rDNA amplicons from a total of 7 individuals of the A. macao macao subspecies, 14 of the

A. macao cyanoptera subspecies, and two known hybrids were sequenced. The 18S rDNA

sequence is regularly used to resolve deep phylogenetic relationships and therefore was not

expected to show much variation, if any, between these two subspecies. However, since rDNA is a

highly conserved repeated sequence, the obtained nucleotide sequence of the rDNA amplicon is

the “sum” of the sequences of all of the units represented within the PCR product. If they are

identical, then a single, compiled sequence is obtained. If one or more of the repeat units

exhibits a polymorphism, then the obtained sequence will show two nucleotides at that position

with the relative proportions reflecting the ratio of the two units within the total number of rDNA

clusters/units. In this case, the results of the rDNA sequence analysis indicated the existence of a

clear sequence heterogeneity across the A. macao 18S rRNA genes as well as the 5’ region of

ITS1. A total of 5 points of polymorphism were identified that occurred at approximately a 90/10

ratio in the amplified product. This is consistent with the “minor sequence” likely comprising a

cluster that represents about 10% of the total rDNA units in these birds. Unfortunately, the same

polymorphism was exhibited in all 23 A. macao specimens tested, indicating that the origin of

the polymorphism predates the separation of the two subspecies and that it has not further

diverged since that time. Thus, although this is an interesting finding in itself, it is of limited

usefulness for the present study.
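
The 90/10 reading described above can be illustrated with a small calculation. This is a minimal sketch that assumes per-position base counts are available from the read alignment; the function name and the counts are hypothetical, chosen only to mimic a roughly 90/10 split, and are not data from this study.

```python
from collections import Counter

def allele_fractions(base_counts):
    """Return (base, fraction) pairs for one alignment column, most frequent first."""
    total = sum(base_counts.values())
    if total == 0:
        raise ValueError("no reads cover this position")
    ranked = Counter(base_counts).most_common()
    return [(base, count / total) for base, count in ranked]

# Hypothetical read counts at one polymorphic position (not data from this study).
example_counts = {"T": 890, "A": 110, "C": 0, "G": 0}

fractions = allele_fractions(example_counts)
major, minor = fractions[0], fractions[1]
print(f"{major[0]}/{minor[0]} {major[1]:.2f}/{minor[1]:.2f}")
```

If the minor fraction stays near 0.10 at every polymorphic position and in every bird, as in Table 4, that is the pattern expected when one variant repeat type makes up about a tenth of the rDNA units.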

Sequence heterogeneity within multicopy gene families has previously posed a problem

in phylogenetic analyses of many species, in particular within the internal transcribed spacer


regions (Buckler-Iv, Ippolito et al. 1997, Álvarez and Wendel 2003, Kiss 2012). Surprisingly,

the number of polymorphisms (only one identified in ITS1) was remarkably low in the A. macao

birds analyzed and again this one was observed in all 23 birds used in this study.

The rDNA sequencing reads produced from our Illumina MiSeq sequencing run from the

A. macao samples were each assembled into an rDNA sequence alignment. The high similarity

of the repeats inherent in rDNA arrays makes it impossible to distinguish reads that originate

from different repeat units. The array sequencing reads are instead aligned as one single

rDNA (transcription unit) with a sequence polymorphism frequency that is the product of the

sequencing coverage of the genome and the number of rDNA clusters in the array (see Figure

31). These results demonstrate that rDNA evolves via concerted evolution and that

homogenization is highly efficient at maintaining the rDNA with near-identical repeats (Ganley

and Kobayashi 2007). It seems highly likely that homogenization is favored by natural selection

because it reduces mutational load (Ohta 2009). Otherwise, it would seem inconceivable that

variation is seen at such a low frequency, especially because the ITS region is known to evolve

very rapidly.


To summarize my findings from rDNA analyses, interesting sequence polymorphisms

were observed between rDNA clusters within each of the macaws studied, but no polymorphisms

were observed that were useful as subspecies-specific indicators. Again, although this was

disappointing, at least in some sense, it is very much in line with expectations given the close

phylogenetic relationship between these two groups of macaws. For this reason, we focused on

what should be more polymorphic regions of the A. macao genome to find loci which would

allow subspecies identifications to be carried out. These regions included the complete

mitochondrial DNA sequence and four additional nuclear autosomally-encoded loci.

3.2 Mitochondrial Sequences: Analysis of Results

Figure 31: Inherent problem with tandem clusters of rDNA during read assembly. 18S

rDNA/ITS1 sequenced reads from tandem clusters will assemble together as one

cluster. ETS: external transcribed spacer; ITS1: internal transcribed spacer 1;

ITS2: internal transcribed spacer 2


Fourteen pedigreed A. m. cyanoptera and seven pedigreed A. m. macao (plus two known

hybrids) were studied in the completion of this project. The complete mtDNA sequence was

determined for each of these birds. The mitogenome of Ara macao was found to be 16,993 bp

long and to contain 13 protein-coding genes, 2 rRNA-encoding genes, 22 tRNA-encoding genes,

and a control region of approximately 1494 bp. The comparisons of these mtDNA sequences

clearly reveal that the total amount of polymorphism within the scarlet macaw mitogenome is

more than sufficient to allow for subspecies identification using mitochondrial loci. Upon

completing DNA sequence analysis and polymorphism assessment, a total of 74 SNPs (0.43% of

entire mitogenome) and nine indels (0.05%) out of the 16,993 bp were discovered within the

endangered subspecies, Ara macao cyanoptera. Fifty-eight SNPs (0.34%) and six indels

(0.035%) were found within the nominate subspecies, Ara macao macao, out of the 16,993 bp. A

total of nineteen loci (0.1%) show a different nucleotide (allele) between the two subspecies.
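
The percentages quoted above follow directly from the site counts and the 16,993 bp mitogenome length. The short sketch below reproduces that arithmetic; the function name and dictionary labels are illustrative only, and small last-digit differences from the quoted figures come down to rounding.

```python
MITOGENOME_LENGTH = 16_993  # bp, as reported in the text

def percent_of_mitogenome(site_count, length=MITOGENOME_LENGTH):
    """Express a count of variable sites as a percentage of the mitogenome length."""
    return 100.0 * site_count / length

# Counts taken from the paragraph above.
site_counts = {
    "A. m. cyanoptera SNPs": 74,
    "A. m. cyanoptera indels": 9,
    "A. m. macao SNPs": 58,
    "A. m. macao indels": 6,
    "loci differing between subspecies": 19,
}

for label, n in site_counts.items():
    print(f"{label}: {n} sites = {percent_of_mitogenome(n):.3f}%")
```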

A number of loci also displayed evidence suggestive of heteroplasmy, the co-existence of

multiple mitochondrial DNA variants within a single organelle (see Table 6, blue cells).

Heteroplasmy is the result of mutations that affect only a portion of the mitochondria within a

cell. The level of heterogeneity can increase over time as the rarer allele becomes

more common within the organelle/cell. A heteroplasmic organelle/genome can also be

inherited and passed through successive generations until the heteroplasmy is lost

when a descendant inherits only one or the other of the alleles.

The observed and abundant mtDNA polymorphism in A. macao provides an ideal source

of molecular markers for conservation studies of endangered species (Nabholz, Uwimana et al.

2013). The majority of previous mitochondrial DNA analyses have focused on sequencing only

select fragments of the mitogenome and, historically, the most common region to sequence has


been the hypervariable, non-coding control region (CR) (Duchene, Archer et al. 2011). This

region is followed in use by cytochrome B (CytB), which is slightly more conserved than the

CR, as well as cytochrome C oxidase 1 (CO1) (Hebert, Ratnasingham et al. 2003). We chose to

sequence the entire mitogenome to increase the likelihood that we would find regions with the

“appropriate” levels of polymorphism to allow for differentiation between the subspecies at the

mtDNA level.

The mitogenome of the nominate subspecies, Ara macao macao, was published as a

partial sequence in 2013 (Seabury, Dowd et al. 2013). The cytochrome B gene was incomplete,

missing 23 nucleotides at the 3’ end of the locus. Also, the reported sequence of the control

region (D-loop region) had many differences from the sequences we report here for seven birds

of the same subspecies. Both the CytB and control regions were sequenced and assembled to

completion by our lab for each of our reference subspecies individuals. Once we had isolated,

amplified, and sequenced the entire mitogenome of the first A. m. macao specimen, we compared

the mtDNA sequences of six additional individuals to assess the variability within the nominate

subspecies. There were many SNPs and indels observed in the control region for these seven

individuals. Unfortunately, the degree of polymorphism was so high as to classify the region as

hyperpolymorphic, as expected from studies of other species. Because of this hypervariability,

the usefulness of this locus for definitive DNA comparisons between the two

subspecies is, at best, complicated and thus greatly reduced.

The control region of the mitochondria consists of three distinct domains. Domain I is the

C-rich region that contains the H-strand synthesis terminus. This domain contains the most

variability within the control region. Heteroplasmic repeats that aid in hairpin formation for

tighter regulatory control are found in this domain as well as in domain III. Due to this


comparable variability, 600 bp from this region of the mitochondria were chosen to assess the

relatedness of our sample group (see Appendix). Domain II is a highly conserved and G-rich

segment of the control region. Among the individual macaw samples we collected, variability

between subspecies was the lowest in this domain as would be expected. Domain III is an AT-

rich and an extremely G-poor segment. The origin of replication of the H-strand is found in this

domain.
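
One simple way to quantify relatedness from an alignment such as the domain I alignment in the Appendix is to count substitutions and indel columns between pairs of aligned sequences. The sketch below is an illustration only: the text does not state which method was used to build the tree, and the function name and the choice to tally indels separately from substitutions are assumptions. The two sequences are the first alignment block for birds SL8 and ZM13, transcribed from the Appendix.

```python
def compare_aligned(seq_a, seq_b, gap="-"):
    """Count substitutions and indel columns between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    substitutions = indel_columns = compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # gap in both sequences: nothing to compare
        if a == gap or b == gap:
            indel_columns += 1  # gap in exactly one sequence
            continue
        compared += 1
        if a != b:
            substitutions += 1
    # p-distance: proportion of differing sites among ungapped columns
    p_distance = substitutions / compared if compared else 0.0
    return substitutions, indel_columns, p_distance

# First alignment block for two birds, transcribed from the Appendix.
sl8 = "CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT"
zm13 = "CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT"

subs, indels, p = compare_aligned(sl8, zm13)
print(subs, indels, p)
```

Applying the same comparison to every pair of birds would give a distance matrix of the kind a tree-building method takes as input.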

Figure 32: Phylogenetic tree showing relatedness of Ara macao from mitogenome domain I.

Our 21 samples were obtained from birds collected from many regions inhabited by the

two scarlet macaw subspecies and therefore it is expected that different haplotypes should be

represented within our group. It was our primary focus to ultimately locate regions of the

mitogenome with appropriate/useful levels of polymorphism between the two subspecies and, if

possible, with subspecies-specific alleles/haplotypes. Analysis of the mtDNA sequences obtained

from the 21 birds revealed considerable polymorphisms seen across a number of regions on the

mitochondrial genome. For example, the 16S rDNA region has ten loci that exhibit

polymorphism within the A. m. macao subspecies and a total of 17 loci that exhibit

polymorphism within the A. macao species, six of which display intersubspecies variability (alleles

found only in one or the other subspecies, see Table 6, peach columns). The 16S rDNA region

therefore shows great promise as a tool for subspecies differentiation at the mtDNA level and

will be discussed in more detail below.


Table 5: Characteristics of the mtDNA of Two Subspecies of Scarlet Macaw Ara macao macao and Ara macao cyanoptera

Gene      Strand  From          To            Spacer(+)/Overlap(-)  Size        Start  Stop
tRNAPHE   H (R)   1             66            0                     66
12S rRNA  L (F)   67            1038(1036)    0                     972(970)
tRNAVAL   H (R)   1039(1037)    1109(1107)    +1                    71
16S rRNA  L (F)   1111(1109)    2679(2680)    0                     1569(1572)
tRNALEU   H (R)   2680(2681)    2754(2755)    +6                    75
ND1       H (R)   2761(2762)    3741(3742)    -1                    981         ATG    AGG
tRNAILE   H (R)   3740(3741)    3811(3812)    +5                    72
tRNAGLN   L (F)   3817(3818)    3887(3888)    0                     71
tRNAMET   H (R)   3888(3889)    3955(3956)    0                     68
ND2       H (R)   3956(3957)    4995(4996)    0                     1040        ATA    TA--
tRNATRP   H (R)   4996(4997)    5066(5067)    +1                    71
tRNAALA   L (F)   5068(5069)    5136(5137)    +2                    69
tRNAASN   L (F)   5139(5140)    5212(5213)    +2                    74
tRNACYS   L (F)   5215(5216)    5281(5282)    0                     67
tRNATYR   L (F)   5282(5283)    5351(5352)    +9                    70
CO1       H (R)   5361(5362)    6908(6909)    0                     1548        GTG    AGG
tRNASER   L (F)   6909(6910)    6975(6976)    +4                    67
tRNAASP   H (R)   6980(6981)    7048(7049)    +2                    69
CO2       H (R)   7051(7052)    7734(7735)    +1                    684         ATG    TAA
tRNALYS   H (R)   7736(7737)    7803(7804)    +1                    68
ATP8      H (R)   7805(7806)    7972(7973)    -10                   168         ATG    TAA
ATP6      H (R)   7963(7964)    8645(8646)    0(-1)                 683         ATG    TA-
CO3       H (R)   8646          9429(9427)    0                     784(782)    ATG    T--
tRNAGLY   H (R)   9430(9428)    9498(9496)    0                     69
ND3       H (R)   9499(9497)    9848(9846)    0                     350         ATA    TA-
tRNAARG   H (R)   9849(9847)    9918(9916)    +1                    70
ND4L      H (R)   9920(9918)    10216(10214)  -7                    297         ATG    TAA
ND4       H (R)   10210(10208)  11602(11600)  0                     1393        ATG    T--
tRNAHIS   H (R)   11603(11601)  11671(11669)  0                     69
tRNASER   H (R)   11672(11670)  11737(11735)  0                     66
tRNALEU   H (R)   11738(11736)  11807(11805)  0                     70
ND5       H (R)   11808(11806)  13622(13620)  +11                   1815        GTG    TAG
CytB      H (R)   13634(13632)  14773(14771)  0                     1140        ATG    TAA
tRNATHR   H (R)   14774(14772)  14841(14839)  +4                    68
tRNAPRO   L (F)   14846(14844)  14915(14913)  +3                    70
ND6       L (F)   14919(14917)  15431         +1                    513(515)    ATG    TAG
tRNAGLU   L (F)   15433         15502         0                     70
CR        H (R)   15503         16993+3*      0                     1491+3*


Table 5 shows locations of mitogenome features, such as genes and other regions of

interest, for both of the Ara macao subspecies. Values displayed are for the nominate subspecies,

Ara macao macao, while values in parentheses are for homologous features found in Ara macao

cyanoptera that differed from the nominate features. When only one value is present, the features

are identical in the two subspecies. An asterisk (*) indicates where the control region sequence

overlaps the sequence for the phenylalanine tRNA gene. Negative values represent other

overlapping nucleotides whereas positive values represent intergenic spacer regions. Four codons

contain “--” or “-”, which indicate incomplete termination codons that are completed via

polyadenylation during post transcriptional mRNA modification. Both subspecies are missing a

cytosine residue 171 nucleotides into the NADH dehydrogenase subunit 3 gene (ND3) (Mindell,

Sorenson et al. 1998). The previously unknown termination codon for the cytochrome B gene

(CytB) is TAA (see Table 5, blue).

Comparison between the two subspecies showed significant variability across the

mitogenome as well (see Table 6). The 16S rDNA region shows six loci that, based upon the

sequences for the 21 reference birds, can differentiate between the two subspecies (see Table 6,

peach columns). For these loci, each subspecies appears to not exhibit intrasubspecies

polymorphism, and so the alleles could be considered subspecies indicative. For example, at

locus 1992, all 14 of the cyanoptera subspecies show a guanine residue while all seven of the

nominate subspecies show an adenine residue. Two polymorphic loci within the 16S rDNA

region are indels and both appear to have subspecies-specific alleles. Locus 2075 of the

cyanoptera subspecies shows an adenine followed by an adenine, cytosine, and two thymine

residues while the nominate subspecies shows only the adenine. Locus 1832 of the nominate


subspecies shows an adenine followed by a cytosine while cyanoptera only shows an adenine at

the locus.

There are four additional loci which appear to have subspecies-specific alleles, in

addition to alleles that are shared by both subspecies (see Table 6, grey columns). Therefore, at

this time, certain alleles appear to be associated with only one subspecies while others at these

loci have no predictive value for subspecies identification. For example, all fourteen of the

cyanoptera subspecies show a thymine residue at locus 1514 (see Table 6), while 57% of the nominate

subspecies show a cytosine residue and 43% show a thymine residue at this locus.
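
The distinction drawn above between loci with subspecies-specific alleles only, loci with a mix of private and shared alleles, and loci with shared alleles only can be expressed as a simple set comparison. The sketch below is illustrative: the function name and category labels are assumptions, and the allele strings are transcribed from four columns of Table 6 (one character per bird, in table order).

```python
def classify_locus(alleles_a, alleles_b):
    """Classify a locus by comparing the allele sets seen in two groups."""
    set_a, set_b = set(alleles_a), set(alleles_b)
    if set_a.isdisjoint(set_b):
        return "fixed difference"        # no allele is shared between groups
    if set_a == set_b:
        return "shared alleles only"     # no diagnostic value
    return "private allele present"      # some alleles shared, some unique

# Alleles per bird, transcribed from Table 6 (14 cyanoptera, 7 macao).
cyanoptera = {
    1181: "AAAAAAAAAAAAAA",
    1514: "TTTTTTTTTTTTTT",
    1611: "AAGGAAAAAAGAAA",
    1992: "GGGGGGGGGGGGGG",
}
macao = {
    1181: "GGGGGGG",
    1514: "CCCCTTT",
    1611: "AAAGAAA",
    1992: "AAAAAAA",
}

results = {locus: classify_locus(cyanoptera[locus], macao[locus]) for locus in cyanoptera}
for locus, label in results.items():
    print(locus, label)
```

Any such classification holds only for the 21 reference birds; sequencing additional birds could move a locus from one category to another.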


Table 6: Mitochondrial Sequence Comparison Within and Between Subspecies: 16S rDNA

Sample  1181  1514  1611  1621  1746  1777  1785  1801  1802  1832  1879  1992  2075  2197  2206  2359  2618

A. m. cyanoptera
SL8     A     T     A     G     A     T     C     C     A     A     T     G     AACTT G     G     T     C
033     A     T     A     A     A     T     C     C     G     A     T     G     AACTT G     G     T     C
19s/m   A     T     G     A     A     C     C     C     A     A     T     G     AACTT G     G     T     C
ZM19    A     T     G     A     G     T     T     T     G     A     T     G     AACTT G     G     T     T
028     A     T     A     A     A     T     T     T     G     A     T     G     AACTT G     G     T     T
4531    A     T     A     G     A     C     C     C     A     A     T     G     AACTT G     G     T     C
9021    A     T     A     A     A     C     C     C     A     A     T     G     AACTT G     G     T     T
062     A     T     A     A     G     C     C     C     A     A     T     G     AACTT G     G     T     C
065     A     T     A     G     G     C     C     C     A     A     T     G     AACTT G     G     T     C
9780    A     T     A     G     G     C     C     C     A     A     T     G     AACTT G     G     T     T
ZM16    A     T     G     A     A     T     T     T     G     A     T     G     AACTT G     G     T     T
101     A     T     A     G     G     C     T     T     A     A     T     G     AACTT G     G     T     T
119     A     T     A     G     G     C     T     T     A     A     T     G     AACTT G     G     T     T
363     A     T     A     G     G     C     C     C     A     A     T     G     AACTT G     G     T     T

A. m. macao
ZM13    G     C     A     G     A     T     T     C     A     AC    C     A     A     G     A     C     C
CC6     G     C     A     G     A     T     T     T     G     AC    C     A     A     A/G   A     C     C
SL5     G     C     A     A     A     T     T     T     G     AC    C     A     A     G     G     C     T
ZM10    G     C     G     A     G     T     T     C     A     AC    C     A     A     G     A     C     C
CC7     G     T     A     G     A     T     T     T     A     AC    C     A     A     A/G   A     C     T
046     G     T     A     G     A     C     T     T     A     AC    C     A     A     G     G     C     T
024     G     T     A     G     A     C     T     T     A     AC    C     A     A     A/G   A     C     C

Hybrids
043     A     T     A     G     G     C     C     C     A     A     T     G     AACTT G     G     T     T
1022    G     T     A     G     A     C     T     T     A     AC    C     A     A     G     A     C     C


Therefore, we selected five of the coding regions of the mitogenome for definitive

sequence analysis (see Tables 7-11). The genes chosen were 12S rDNA, cytochrome B (CytB),

16S rDNA, NADH dehydrogenase subunit 3 (ND3), and cytochrome C oxidase subunit 2 (CO2).

(The entire mitogenome sequence and all observed polymorphisms are included in the

appendix.)

Table 7: Mitochondrial Sequence Comparison Between Subspecies: 16S rDNA

Sample  1181  1832  1879  1992  2075  2359

A. m. cyanoptera
SL8     A     A     T     G     AACTT T
033     A     A     T     G     AACTT T
19s/m   A     A     T     G     AACTT T
ZM19    A     A     T     G     AACTT T
028     A     A     T     G     AACTT T
4531    A     A     T     G     AACTT T
9021    A     A     T     G     AACTT T
062     A     A     T     G     AACTT T
065     A     A     T     G     AACTT T
9780    A     A     T     G     AACTT T
ZM16    A     A     T     G     AACTT T
101     A     A     T     G     AACTT T
119     A     A     T     G     AACTT T
363     A     A     T     G     AACTT T

A. m. macao
ZM13    G     AC    C     A     A     C
CC6     G     AC    C     A     A     C
SL5     G     AC    C     A     A     C
ZM10    G     AC    C     A     A     C
ZM16    G     AC    C     A     A     C
CC7     G     AC    C     A     A     C
024     G     AC    C     A     A     C

Hybrids
043     A     A     T     G     AACTT T
1022    G     AC    C     A     A     C


Table 8: Mitochondrial Sequence Comparison Between Subspecies: 12S rDNA

Sample

Locus

383 515

A. m. cyanoptera

SL8 G A

033 G A

19s/m G A

ZM19 G A

028 G A

4531 G A

9021 G A

062 G A

065 G A

9780 G A

ZM16 G A

101 G A

119 G A

363 G A

A. m. macao

ZM13 A AG

CC6 A AG

SL5 A AG

ZM10 A AG

CC7 A AG

046 A AG

024 A AG

Hybrids

043 G A

1022 A AG


Table 9: Mitochondrial Sequence Comparison Between Subspecies: Cytochrome C Oxidase Subunit II (COII)

Sample

Locus

2012 2141 2171

A. m. cyanoptera

SL8 T G A

033 T G A

19s/m T A A

ZM19 T G A

028 T G G

4531 T A A

9021 T A A

062 T G A

065 T G A

9780 T G A

ZM16 T G A

101 T G A

119 T G A

363 T G A

A. m. macao

ZM13 C G A

CC6 T G A

SL5 T G A

ZM10 C G A

ZM16 C G A

CC7 T G A

024 C G A

Hybrids

043 T G G

1022 C G A


Table 10: Mitochondrial Sequence Comparison Between Subspecies: NADH Dehydrogenase 3 (ND3)

Sample

Locus

1921 2010 2045 2155

A. m. cyanoptera

SL8 A A TC G

033 A A TC G

19s/m A A TC G

ZM19 A A TC G

028 A A TC G

4531 A A T G

9021 A A T G

062 A A T G

065 A A T G

9780 A A T G

ZM16 A A TC G

101 A A TC G

119 A A TC G

363 A A TC G

A. m. macao

ZM13 G G TC A

CC6 G G TC A

SL5 G G TC A

ZM10 G G TC A

ZM16 G G TC A

CC7 G G TC A

024 G G TC A

Hybrids

043 A A TC G

1022 G G TC A


Table 11: Mitochondrial Sequence Comparison Between Subspecies: Cytochrome B (CytB)

Sample

Locus

2947 2953 2962

A. m. cyanoptera

SL8 C A T

033 C A T

19s/m C A T

ZM19 C A T

028 C G C

4531 C A T

9021 C A T

062 C A T

065 C A T

9780 C G T

ZM16 C G C

101 C A T

119 C A C

363 C A T

A. m. macao

ZM13 T A T

CC6 T A T

SL5 T A T

ZM10 T A T

CC7 T A T

046 T A T

024 T A T

Hybrids

043 C A T

1022 T A T


3.3 Nuclear Results

Although the mitochondrial genome is an extremely powerful phylogenetic tool, the

uniparental mode of inheritance is limiting when dealing with the differentiation of hybrids. It

has the disadvantage of being inherited as a single linkage group so that regardless of the number

of genes sequenced, it can be inferred as only one linked haplotype (Prychitko and Moore 2000).

The uniparental nature of its inheritance also means that hybrids will never show evidence of

hybridization in the mtDNA sequences. For this reason, we turned to the nuclear genome to

further assess the differentiation between the subspecies of the scarlet macaw.

We analyzed a total of 3266 bp of autosomally encoded nuclear DNA sequence (including

RAG1) from a total of four loci. The 1104 bp RAG1 sequence showed no subspecies-specific

polymorphisms (see Table 15), although many polymorphic sites were shared by the two

subspecies. Nine polymorphic sites with subspecies-specific alleles were identified in the

remaining 2162 bp (0.4%) of nuclear DNA sequence analyzed as part of this study. The intron

sequence of adenylate kinase 1 (AK1) covers 851 bp and includes four loci with A. m. macao- or

A. m. cyanoptera-specific alleles (see Table 12). Although none of the four loci exhibited two

alleles which were each subspecies-specific (one allele/subspecies), in each case one subspecies

was polymorphic at that locus and the “second allele” was unique to that subspecies. Thus,

although one allele has limited predictive value for subspecies identification, the presence of the

other can be considered indicative of a single subspecies (based upon our present subspecies

sequence database of 21 birds, 42 alleles). As an example, at AK1 locus 725, a cytosine residue

was found as 11 of 14 alleles in A. m. macao but as none of the 28

alleles of A. m. cyanoptera (see Table 12).
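
The allele tallies used in this kind of comparison can be reproduced by splitting each diploid genotype into its two alleles and counting. The sketch below is a minimal illustration using the AK1 locus 725 genotypes as listed in Table 12; the function and variable names are hypothetical.

```python
from collections import Counter

def allele_counts(genotypes):
    """Tally alleles across diploid genotypes written as 'X/Y'."""
    counts = Counter()
    for genotype in genotypes:
        counts.update(genotype.split("/"))
    return counts

# AK1 locus 725 genotypes transcribed from Table 12.
macao_725 = ["C/A", "C/C", "C/C", "C/A", "C/A", "C/C", "C/C"]
cyanoptera_725 = ["A/A"] * 14

macao_counts = allele_counts(macao_725)
cyanoptera_counts = allele_counts(cyanoptera_725)

# Alleles seen in A. m. macao but never in A. m. cyanoptera.
private_to_macao = set(macao_counts) - set(cyanoptera_counts)

print(dict(macao_counts), dict(cyanoptera_counts), private_to_macao)
```

The same tally applies to the RGS4 and VIM tables, since their genotypes are written in the same form.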


Table 12: Nuclear Sequence Comparison Between Subspecies : Adenylate Kinase 1 (AK1)

Sample

Locus

249 260 725 742

A. m. cyanoptera

SL8 C/C T/T A/A G/A

033 C/C T/T A/A G/G

19s/m C/T T/T A/A G/A

ZM19 C/C T/T A/A G/G

028 C/T T/T A/A G/G

4531 C/C T/T A/A G/G

9021 C/T T/T A/A G/G

062 C/T T/T A/A G/A

065 C/C T/T A/A G/G

9780 C/T T/T A/A G/G

ZM16 C/C T/T A/A G/G

101 C/C T/T A/A G/G

119 C/C T/T A/A G/G

363 C/C T/T A/A G/G

A. m. macao

ZM13 C/C C/T C/A G/G

CC6 C/C C/T C/C G/G

SL5 C/C C/T C/C G/G

ZM10 C/C C/T C/A G/G

CC7 C/C C/T C/A G/G

046 C/C C/C C/C G/G

024 C/C C/C C/C G/G

Hybrids

043 C/C T/T C/A G/A

1022 C/C T/T A/A G/G


The intronic sequence of regulator of G-protein signaling 4 (RGS4) covers 865 bp and

includes three loci with A. m. macao- or A. m. cyanoptera-specific alleles (see Table 13).

Although none of the three loci exhibited two different alleles which were each subspecies-

specific (one allele per subspecies), in each case one subspecies was polymorphic at that locus

and a “second allele” was unique to that subspecies. Thus, although one allele has limited

predictive value for subspecies identification, the presence of the other can be considered

indicative of a single subspecies. This of course is based upon our present subspecies sequence

database of 21 birds, 42 alleles at this locus. As an example, at RGS4 locus 744, a cytosine

residue was found as eleven of 14 alleles in A. m. macao but as none of 28 alleles of A. m.

cyanoptera.


Table 13: Nuclear Sequence Comparison Between Subspecies: Regulator of G-Protein Signaling 4 (RGS4)

Sample

Locus

556 700 744

A. m. cyanoptera

SL8 C/C C/C A/A

033 C/CCT C/C A/A

19s/m C/CCT C/C A/A

ZM19 C/CCT C/C A/A

028 C/CCT C/T A/A

4531 C/C C/C A/A

9021 C/C C/C A/A

062 C/CCT C/T A/A

065 C/C C/C A/A

9780 C/C C/C A/A

ZM16 C/CCT C/C A/A

101 C/C C/C A/A

119 C/CCT C/T A/A

363 C/C C/C A/A

A. m. macao

ZM13 C/C C/C C/C

CC6 C/C C/C C/A

SL5 C/C C/C C/A

ZM10 C/C C/C C/C

CC7 C/C C/C C/C

046 C/C C/C C/C

024 C/C C/C C/A

Hybrids

043 C/C C/T C/A

1022 C/C C/T C/A


An intronic sequence found within the vimentin gene (VIM) covers 446 bp and includes

two loci with A. m. macao- or A. m. cyanoptera-specific alleles (see Table 14). Although neither

of the two loci exhibited two alleles which were each subspecies-specific (one allele/subspecies),

in each case, one subspecies was polymorphic at that locus and the “second allele” was unique to

that subspecies. Again, one allele has limited predictive value for subspecies

identification, but the presence of the other allele can be considered

indicative of a single subspecies when found at this locus. For example, at VIM locus 105, an

adenine residue was found as eleven of 14 alleles in A. m. macao but was not found as any of the

28 alleles of A. m. cyanoptera. This locus therefore gives a fairly reliable subspecies prediction,

but because three of the 14 A. m. macao alleles lack the adenine, it is not completely subspecies

indicative.


Table 14: Nuclear Sequence Comparison Between Subspecies: Vimentin (Vim)

Samples

Locus

105 390

A. m. cyanoptera

SL8 G/G T/T

033 G/G T/T

19s/m G/G T/T

ZM19 G/G T/T

028 G/G T/T

4531 G/G T/T

9021 G/G T/T

062 G/G T/T

065 G/G T/T

9780 G/G T/T

ZM16 G/G T/T

101 G/G T/T

119 G/G T/T

363 G/G T/T

A. m. macao

ZM13 A/G C/C

CC6 A/G C/T

SL5 A/A C/C

ZM10 A/A C/T

CC7 A/A C/T

046 A/A C/C

024 A/G C/T

Hybrids

043 G/G C/C

1022 G/G C/T


As previously mentioned, the intronic region that was further analyzed in recombination

activating gene 1 (RAG1) did not show subspecies-specific alleles (see Table 15).

Table 15: Nuclear Sequence Comparison Between Subspecies: Recombination Activating Gene 1 (RAG1)

Sample

Locus

929

A. m. cyanoptera

SL8 C/C

033 C/C

19s/m C/T

ZM19 C/C

028 C/C

4531 C/T

9021 C/C

062 C/C

065 C/C

9780 C/C

ZM16 C/C

101 C/C

119 C/C

363 C/C

A. m. macao

ZM13 C/C

CC6 C/T

SL5 C/C

ZM10 C/C

CC7 C/C

046 C/C

024 C/C

Hybrids

043 C/T

1022 C/C


Advances in sequencing technology and bioinformatics tools over the last decade have

changed the power of molecular marker-based methods. The increase in the number of available

genome-wide molecular markers has dramatically enhanced the resolution and the reliability of

taxonomic conclusions (Steiner, Putnam et al. 2013), such as assessing the impact of genetic

variation on patterns of gene expression and measuring the responses to environmental

change. In order to understand the underlying patterns of genetic variations in individuals and

populations, we have to better understand the underlying processes involved and their relevance

to conservation. Structural rearrangements, copy number variations, insertions and deletions,

single nucleotide polymorphisms, and sequence repeats will likely become the standard units for

all future assessments of natural populations (Ellegren 2014). Studying selectively important

variation strongly relies upon the availability of annotated sequence data to assist in identifying

the functional genomic regions of interest. These molecular markers are invaluable in the

classification of individuals and conservation programs (Romanov, Tuttle et al. 2009),

identifying the substructures within populations, and in making strong conservation decisions

(Frankham, Ballou et al. 2010).

We were able to verify the close genetic relationship of A. m. macao and A. m.

cyanoptera through sequence level analysis of the entire 18S ribosomal DNA region as well as

the 5’ segment of the flanking internal transcribed spacer 1 (ITS1). However, comparative

analysis revealed there was not sufficient polymorphism at this locus for subspecies-specific

differentiation, which had been anticipated as a possible outcome given the close phylogenetic

relationship between these two groups of macaws.

Sequencing the entire mitogenome of both the nominate and the endangered subspecies

enabled us to accomplish a few goals. The published mitochondrial genome of the nominate

A. m. macao was completed as well as corrected at some sites. The first mitogenome sequence of

the cyanoptera subspecies was determined and annotated. Distinct differences between the two

mitogenomes of these subspecies were noted. We were able to identify numerous regions within

this sequence to definitively distinguish between the two subspecies of macaw. Unfortunately,

the mitogenome is uniparental in origin and cannot fully assess the entire lineage.

Our lab conducted a preliminary assessment of four autosomal loci encompassing 3266

bp within 23 individual macaws. Although we found some indications of subspecies-specific

alleles, which taken together allow reasonable inference of subspecies, evaluated separately,

these alleles do not have the power to definitively distinguish between the subspecies. More

nuclear loci and a larger dataset need to be evaluated to enable definitive determination of

hybrids.


3.4 Off-Instrument Use of MiSeq® Reporter

The Illumina MiSeq® Desktop Sequencer includes MiSeq® Reporter, which is an on-instrument

software program that performs sequence analysis (see Figure 33). The MiSeq®

Reporter is typically run in the lab via computers that are locally networked with the MiSeq®,

and it launches automatically after the MiSeq® completes its primary analysis. After performing

the initial sequencing and analysis runs on the MiSeq®, it is of value for the research team to be

able to perform further analyses, “re-sequencing” the initially obtained primary run data. The

MiSeq® sequencing and analysis workflow is illustrated below:

Figure 33: Illumina MiSeq® reporter workflow (Illumina)


To allow for distributed analysis by the research team, as well as the primary researcher,

the MiSeq® Reporter is available to be installed “off-instrument,” i.e., on other computers

not co-located with the MiSeq® itself. To facilitate this, the MiSeq® provides an ability to

upload its run outputs to the BaseSpace platform, which is a well. For each of our runs, this was

approximately a 12 gigabyte set of files, so there was a fairly significant amount of data to be

transferred and stored into a registered BaseSpace account. Being able to download these files to

an off-instrument instance of MiSeq® Reporter was invaluable for re-sequencing and further

data analysis.

Running MiSeq® Reporter required a Windows 7 machine with the

following components:

64-bit Windows OS (Vista, Windows 7, Windows Server 2008 64-bit, English-US)

≥ 8 GB RAM minimum; ≥ 16 GB RAM recommended

≥ 1 TB disk space

Quad core processor (2.8 GHz or higher)

Microsoft .NET 4

Given that the typical machine now runs Windows 8 or higher, this may require downgrading

a Windows 8 or Windows 10 machine back to Windows 7, along with ensuring that the other

requirements are met. The off-instrument MiSeq® Reporter software was available as a

download from the Illumina® website.

Next, the interface for downloading a run from BaseSpace is an open-source downloader

written in the Python programming language. We therefore installed Python and downloaded an

Illumina-provided script file to be executed by Python in accordance with

https://support.basespace.illumina.com/knowledgebase/articles/403618-python-run-downloader.

Execution of this file using the “token” and “RunID” provided on BaseSpace (as instructed by

the support article above) allowed for all of the files associated with the run to be downloaded in

a single batch. But given the 12 GB size of our download, this download took several hours on a

high-speed connection.

Once the files had been downloaded, they were available for access by the off-instrument

MiSeq® Reporter software. A “run” folder (C:\Illumina\MiSeq®Analysis) was created to have

the original analysis files in it. To provide “re-queue” instructions, the first step was to delete the

original “SampleSheet.csv” file that had the original run instructions in it. A genome folder was

created that had “.fa” files, which are ASCII text files having gene sequences to use for further

analyses against the original run files. The .fa files having these sequences essentially had to be

copied into three folders -- one folder associated with the Illumina® Experiment Manager (IEM),

which is used to create the re-sequence manifest file and the new re-sequence sample sheet.

Using the IEM, the sample sheet (SampleSheet.csv) is created, and the genome manifest is

accessed by first accessing the .fa file that was originally stored in the “genome” folder under the

IEM data folder. Once the IEM accesses the .fa file for the first time, it creates a full genome

manifest in the corresponding genome folder. This genome folder is then copied into the

MiSeq® Analysis folder (which is the run folder for the MiSeq® reporter).

The MiSeq® Reporter itself operates as a “service” on the Windows 7 computer, and that

service is accessed through Internet Explorer or another compatible browser by entering the

following address in the address bar of the browser: “https://localhost:8042.” The Reporter

interface then appears, but execution of a resequencing

run will not begin until the “QueuedforAnalysis” file is deleted from the MiSeq®Analysis “run

folder.” Once the run begins, this file will reappear, and the run execution status can be

monitored from the associated browser window. Depending on processor and memory capability

and on the run particulars, the re-queue analyses will take several hours each. Example reporter

graphical outputs are as follows (see Figures 33 & 34).

Figure 34: Illumina MiSeq® reporter run summary interface (Illumina)


Not only did the MiSeq® Reporter screens provide intuitive insights into the sequence data, but the Illumina® plots also allow the data represented in a given plot (for example, SNP data) to be exported as “comma separated value” (“csv”) text files. Thus, many of the spreadsheets discussed and presented elsewhere in this paper were generated by exporting data directly from the MiSeq® Reporter screens.

Figure 35: Illumina MiSeq® reporter detailed sample analysis interface (Illumina)
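As an illustration of how such csv exports can be tallied outside a spreadsheet, the following minimal sketch counts SNPs and indels in a small table. The column names (`Position`, `Reference`, `Variant`) and the three rows are hypothetical toy values, not the actual MiSeq® Reporter export layout or data from this study.

```python
import csv
import io

# Hypothetical export; the real column names and contents may differ.
EXAMPLE_CSV = """Position,Reference,Variant
10,G,A
25,A,AG
40,TT,T
"""


def classify_variants(csv_text: str) -> dict:
    """Count single-base substitutions (SNPs) and length-changing variants (indels)."""
    counts = {"SNP": 0, "indel": 0}
    for row in csv.DictReader(io.StringIO(csv_text)):
        ref, alt = row["Reference"], row["Variant"]
        kind = "SNP" if len(ref) == len(alt) == 1 else "indel"
        counts[kind] += 1
    return counts


counts = classify_variants(EXAMPLE_CSV)
```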


APPENDIX

EXTENDED RESULTS


Relatedness of Ara macao subjects from mitogenome domain I

SL8 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT

033 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCCCCCCACTGAGTC-GT-GGGCTGGTCTGCT

19s/m 1 CCCATACCCCTAAGGGTAGCCCCCCCTACC--CCCACTGAGTC-GT-GGGCTGGTCTGCT

ZM19 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT

028 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTCCGT-GGGCTGGTCTGCT

4531 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT

9021 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCCCCCCACTGAGTC-GT--GGCTGGTCTGCT



9780 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCCCCCCACTGAGTC-GT--GGCTGGTCTGCT

ZM16 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTGGGGCTGGTCTGCT

101 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCCCCCCACTGAGTC-GT-GGGCTGGTCTGCT

119 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTCCGT-GGGCTGGTCTGCT

363 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCT-CT

ZM13 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT

CC6 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT

SL5 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT

ZM10 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT

CC7 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT

046 1 CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT





SL8 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA

033 59 TCGCCTATC-CAGGCTATGTATATCGTACATT-TA-ATAATT-GTACCTTAT--ACATTA

19s/m 57 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA--TAATTGGTACCTTAT--ACATTA

ZM19 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTATATACATTA

028 59 TCGCTTA-C-CAGGCTATGTATATCGTACATTTTA-ATAATTGGTACCTTAT--ACATTA

4531 58 TCGCTTATCCCAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA

9021 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA

062 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTATA-ACATTA

065 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTATATACATTA

9780 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAA-TGGTACCTTAT--ACATTA

ZM16 59 TCGCT-ATC-CAGGCTATGTATATCGTACATT-TAAATAATTGGTACCTTATA-ACATTA

101 59 TCGCCTATC-CAGGCTATGTATATCGTACATT-TAAATAATTGGTACCTTATA-ACATTA

119 59 TCGCTTA-C-CAGGCTATGTATATCGTACATTTTA-ATAATTGGTACCTT-T--ACATTA

363 57 TCGCTTATC-CAGGCTATGTATATCGTACATT-TA-ATAATTGGTACCTTATATACATTA

ZM13 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA

CC6 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAA-TGGTACCTTAT--ACATTA

SL5 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA

ZM10 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TAAATAATTGGTACCTTATA-ACATTA

CC7 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA

046 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAA-TGGTACCTTAT--ACATTA

024 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA

043 58 TCGCTTATC-CAGGCTATGTATATCGTACATT-TAAATAATTGGTACCTTATA-ACATTA

1022 59 TCGCTTATCCCAGGCTAT-TATATCGTACATT-TA-ATAATTGGTACCTTAT--ACATTA


SL8 113 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG

033 113 TATTATA----TTATTAGGGACTAAATAATTCATGC-TCAATGACATATTGGTATTGTGG

19s/m 111 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTG-TATTGGGG

ZM19 115 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG

028 114 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG






ZM16 115 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATGACATATTGGTATTG-GG

101 116 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCTA--ATATATTGGTATTG-GG


363 114 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCAATG--ATATTGGTATTG-GG

ZM13 114 TATTATAA-TGTTATTAGGGACTAAATAATTCATGCCTCAATG--ATATTGGTATTG-GG

CC6 113 TATTATAA-TGTTATTAGGGACTAAATAATTCATGCCTCAATGA-ATATTGGTATTG-GG

SL5 114 TATTATAAATGTTATTAGGGACTAAATAATTCATGCCTCAAT---ATATTGGTATTG-GG

ZM10 116 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCTAAT--ATATTGGTATTG-GG

CC7 114 TATTATAAATGTTATTAGGGACTAAATAATTCATGCCTCAAT---ATATTGGTATTG-GG

046 113 TATTATAA-TGTTATTAGGGACTAAATAATTCATGCCTCAATGA-ATATTGGTATTG-GG

024 114 TATTATAAATGTTATTAGGGACTAAATAATTCATGCCTCAAT---ATATTGGTATTG-GG

043 115 TATTATA---GTTATTAGGGACTAAATAATTCATGCCTCTA--ATATATTGGTATTG-GG

1022 114 TATTATAA-TGTTATTAGGGACTAAATAATTCATGCCTCAATG--ATATTGGTATTG-GG

SL8 169 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG

033 168 ACTAATCTCTGGT-CTAGTTCGGTCCTACCACAGGGGT-TGGAA-AACTCCATG-GCACG

19s/m 167 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG

ZM19 171 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAAAAACTCCATG-GCACG


028 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATGGGCACG

4531 170 ACTAATCTCTGGTCCTAGTTCGGTCCTA-CACAGGGGTTTGGAA-AACTCCATG-GCACG

9021 169 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG




ZM16 171 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG


119 169 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATGGGCACG



CC6 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG

SL5 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG


CC7 170 ACTAATCTCTGGT-CTAGTTCGGTCCTA-CACAGGGGT-TGGAA-AACTCCATG-GCACG






SL8 224 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-

033 224 ATAA-GCA-AGCTTCATGGT--TCTGGCCAAGGCATTGTTATCTTTAACTCTACTG-GT-

19s/m 222 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTG-GT-

ZM19 227 ATAA-GCA-AGCTTCATGTT--TCTGGCCAAGGCATTGTATCCTTTAACTCTACTGGGT-

028 226 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTG-GT-

4531 227 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-


062 225 ATAA-G---AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-

065 226 ATAAAGCA-AGCTTCATGGT--TCTGGCCAAGGCATTGTTATCTTTAACTCTACTG-GT-


ZM16 226 ATAA-G---AGCTTCATGGTT-TCTGGCCAAGGCATTG-ATCTTTTAACTCTACTGGGT-

101 225 ATAA-GCAGAGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-

119 225 ATAA-GGA--ACTTCATGCTT--TAGGTCA-TGCTTTGTACCCTTTAATTTCATAG-CTC

363 223 ATAA-GCA-AGCTTCATGGT--TCTGGCCAAGGCATTGTTATCTTTAACTCTACTG-GT-

ZM13 225 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-

CC6 225 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-

SL5 225 ATAA-GCAGAGCTTCATGGTTTTCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-

ZM10 225 ATAA-GCA-AGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-

CC7 225 ATAA-GCAGAGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-


024 225 ATAA-GCAGAGCTTCATGGTTTTCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-


1022 225 ATAA-GCAGAGCTTCATGGTT-TCTGGCCAAGGCATTGTATCTTTTAACTCTACTGGGT-

SL8 280 ACAGT-ATACGGAAGTGCCCCTAGTA-TAAG-AAACTTCATCCTTTAGGTCATGC-TTTT

033 278 ACAGT-ATACGGAAGTGCC-CTAGTA-TAA--GAACTTCATGCTTTAGGTCATGC-T-TT

19s/m 277 ACAGT-ATACGGAAGTGCCCCTAGTA-TAAG-AAACTTCATCCTTTAGGTCATGC-TTTT

ZM19 282 ACAGT-ATACGGAAGTGCCCCTAGTA-TAAGAAAACTTCATCCTTTAGGTCATGC-TTTT


028 281 ACAGT-ATACGGAAGGCCC-CTAGTA-TAAGGAAACTTCATCCTTTAGGTCATGC-T-TT

4531 283 ACAGT-ATACGGAAGTGCCCCTAGTA-TAAG-AAACTTCATCCTTTAGGTCATGC-TTTT

9021 279 ACAGT-ATACGGAAGTGCCCCTAGTA-TAA-GAAACTTCATCCTTTAGGTCATGCTTTTT

062 279 ACAGTAATACGGAAGTGCCCCTAGTA-TAAG-AAACTTCATCCTTTAGGTCATGC-TTTT

065 281 ACAGT-ATACGGAAGTGCC-CTAGTAATAAGAAAACTTCATCCTTTAGGTCATGC-TTTT

9780 278 ACAGT-ATACGGAAGTGCCCCTAGTA-TAAAGAAACTTCATCCTTTAGGTCATGCT-TTT

ZM16 279 ACAGTAATACGGAAGTGCCCCTAG----A---GAACTTCATGCTTTAGGTCATGC-T-TT

101 282 ACAGTAATACGGAAGTGCCCCTAGTA-TAAAGAAACTTCATCCTTTAGGTCATGC-TTTT

119 278 TAAGT-ATACGGAAGTGCT-CTAGTA-CAAG-AAACTTCATCCTTTAGGTCATGC---TT

363 277 ACAGT-ATACGGAAGTGCC-CTAGTA-TAA--GAACTTCATGCTTTAGGTCATGC-T-TT

ZM13 281 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT

CC6 281 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT

SL5 283 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT

ZM10 281 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT

CC7 282 ACAGTAATACGGAAGTGCCCCTAGTA-TTA--AAACTTCATCCTTTAGGTCATGCCTTTT

046 281 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT

024 283 ACAGTAATACGGAAGTGCCCCTAGTA-TA---AAACTTCATCCTTTAGGTCATGCCTTTT

043 280 ACAGTAATACGGAAGTGCCCCTAGTA-TA-G-AAACTTCATCCTTTAGGTCATGCCTTTT

1022 282 ACAGTAATACGGAAGTGCCCCTAGTA-TA----AACTTCATCCTTTAGGTCATGCCTTTT


SL8 336 GATACCCT-TAATTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT

033 331 GATACCCTTTAATTTCATAGCTCTAAGTATA-CGGAAGT--GCTCTAGTACAAAGGACTT

19s/m 333 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT

ZM19 339 GATACCCTTAATTTTCATAGCT-TAAGTA-ACCGGAAGT--GCTCTAGTACAAAGGACTT

028 336 GATACCCT-TAATTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT

4531 339 GATACCCTTAATTTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT

9021 336 GATACCC-TTAATTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT

062 336 GATACCCTTAATTTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAG-------GACTT

065 338 GATACCCTTAATTTTCATAGCTCTAAGTA-ACCGGAAGT--GCTCTAG-------GACTT

9780 335 GATACCT-TAATTTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAG-------GACTT

ZM16 330 GATACCCTTTAATTTCATAGCTCTAAGTATA-CGGAAGT--GCTCTAGTACAAAGGACTT

101 340 GATACCCTTAATTTTCATAGCTCT-AGTA-A-CGGAAGT--GCTCTG--------GACTT

119 331 GATACCCT-TAATTTCATAGCTCTAAGTA-A-CGGAAGT--GCTCTAGTACAAAGGACTT

363 330 GATACCCTTTAATTTCATAGCTCTAAGTATA-CGGAAGT--GCTCTAGTACA-AGGACTT

ZM13 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA-AGGACTT

CC6 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA--AGACTT

SL5 339 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTA---AGGACTT

ZM10 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA--AGACTT

CC7 339 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA-AGGACTT

046 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACCAAGGACTT

024 339 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA-AGGACTT

043 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAG---GCTCTAGTACAA--GACTT

1022 337 GATACCCCTTAATTTCATAGCTCTAAGTA-A-CGGAAGTTGGCTCTAGTACA-AGGACTT

SL8 391 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC

033 388 ATCGGTCACCGCCCATAATTGGCCCG-GGAACT-TTCT-TATCCCGTAA-TGCTCGTTTC

19s/m 389 ATCG-TCACCGCCCATAATTGGC-CG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTT-

ZM19 395 A-CG-TCACCGCCCATAATTGGCCCGCGGAACT-TTCTTTATCCCGTA--T-CT-GTTTC


028 391 ATCG-TCACCGC-CATAATTGGCCCG-GGAACCTTTCTTT-ACCCGTA--TGCT-GTTTC

4531 395 ATCGTTCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC

9021 391 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAATTGCT-GTTTC

062 385 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC


9780 383 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCT-TATCCCGTAA-TGCTCGTTTC

ZM16 387 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA--TGCT-GTTTC


119 386 ATCG-TCACCGC-CATAAT-GGCCCG-GGAACCTTTCTTT-ACCCGTA--TGCT-GTTTC


ZM13 394 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC

CC6 393 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC

SL5 394 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC

ZM10 393 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC

CC7 396 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC

046 395 ATCG-TCACCGCCCATAATTGGCCCG-GGAACT-TTCTTTATCCCGTAT--GCT-GTTTC


043 390 ATCG-TCACCGC-CATAATTGGCCCG-GGAACT-TTCTTTATCCCGTA---GCT-GTTTC



SL8 445 AGG-GCCCGGTTATTTAT------TGTCTGACTACTCACG-AG-GATCACCAATCC---T

033 444 AGGGGCCCGGTTATAT-------TAGTCTGACTACTCACG-AGAGATCACCAATCCGG-T

19s/m 441 AGG-GCCCGGTTATTTAT------TGTCTGACT-CTCACG-AG-GATCACCAATCC---T

ZM19 448 AGG-GCCCGGT-ATTTAT------TGTCTGACTACTC-CG-AG-GATCACCAATCC---T

028 444 AGG-GCCCGGTTATTTAT------TGTCTGACTACTCACG-AG-GATCACCAATCC---T


9021 447 AGG-GCCCGGTTATTTAT------TGTCTGACTACTCACC-AG-GATCACCAATCC---T

062 439 AGG-GCCCGGTTATTTATT-TA-TTGTCTGACTACTCACCGAG-GATCACCAATCC---T

065 442 AGG-GCCCGGTTATTTAT-------GTCTGACTACTCACG-AG-GATCACCAATCC---T

9780 438 AGGGGCCCGGTTATTTAT------TGTCTGACTACTCACC-AG-GATCACCAATCC---T

ZM16 441 AGG-GCCCGGTTATAT-------TAGTCTGACTACTCACG-AGAGATCACCAATCCGG-T

101 441 AGG-GCCCGGTTATTTATTTTA-TTGTCTGACTACTCACCGAG-GATCACCAATCC---T


363 440 AGG-GCCCGGTTATTTATAT---TAGTCTGACTACTCACG-AGAGATCACCAATCCCG--

ZM13 448 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATCACCAATCCCGGT

CC6 447 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATCACCAATCCCGGT

SL5 448 AGG-GCCCGGTTATTTATAT-ATTAGTCTGACTACTCACG-AGAGATC-CCAATCCCGGT

ZM10 447 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATCACCAATCCCGGT

CC7 450 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATC-CCAATCCCGGT

046 449 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATC-CCAATCCCGGT

024 450 AGG-GCCCGGTTATTTAT---ATTAGTCTGACTACTCACG-AGAGATC-CCAATCCCGGT

043 442 AGG-GCCCGGTTATTTAT--------TCTGACTACTCACCGAG-GATCACCAATCC---T

1022 448 AGG-GCCCGGTTATTTAT---A-TAGTCTGACTACTCACC-GAGGATCACCAATCCCGGT

SL8 493 GT-AAGTAAGGTTCGG-CCTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT

033 495 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTCTT-TCCCCCTAC-ACCCTT

19s/m 488 GTAAAGTAAGGTTCGGTCCTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT

ZM19 494 GTAAAGTAAGGTTCGG-CCTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT


028 492 GT-AAGTAAGGTTCGG-CCTCCCTAGCGTCCAGGTCCATTTCTTTCCCCCTAC-ACCCTT

4531 498 GTAAAGTAAGGTTCGG-CCTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT

9021 495 GT-AAGTAAGGTTCG---CTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT

062 492 GT-AAGTAAGGTTCG---CTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT

065 489 GT-AAGTAAGGTTCGG-CCTCCCTAGCGTCCAGGTCCATTTCT-TTCCCCTACCACCCTT

9780 487 GT-AAGTAAGGTTCG---CTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACC-TT

ZM16 491 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT

101 495 GT-AAGTAAGGTTCG---CTCCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCC-T

119 486 GT-AAGTAAGGTTCGG-CCTCCCTAGCGTCCAGGTCCATTTCTTTCCCCCTAC-ACCCTT

363 493 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTCTT-TCCCCCTACCACCCTT

ZM13 503 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC-CTACCAC-CTT

CC6 502 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CACCAC-CCT

SL5 504 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CACCAC-CTT

ZM10 502 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CACCAC-C-T

CC7 504 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CA-CACCCTT

046 503 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CACCAC-CTT

024 504 GT-AAGTAAGGTTCGG-CCTTCCTAGCGT-CAGGTCCATTTCT-TCCC--CA-CAC-CTT

043 489 GT-AAGTAAGGTTCG---CTCCCTAGCGT-CAGGTCCATTCTT-TCC-CCTACCACCC-T

1022 502 GT-AAGTAAGGTTCGG-CCTCCCTAGCGT-CAGGTCCATTCTT-TCCC--CACCAC-CTT


SL8 549 ACACTCCTTGCGCTTTTCGCCTCTG-GGTTCCTCGGTCAGGCACATAAC

033 550 ACACTCCTTGCGCTTT-TGCGCCTCTGGTTCCTCGGTCAGGCACATAAC

19s/m 546 ACACTCCTTGCGCTTTTCGCCTCTGGGCTTCCTCGGTCAGGCACATAAC

ZM19 551 ACACTCCTTG-GCTTTTCGCCTCTG-GGTTCCTCGGTCAGGCACATAAC

028 549 ACACTCCTTGCGCTTTTCGCCTCTG-GGTTCCTCGGTCAGGCACATAAC

4531 555 ACACTCCTTGCGCTTTTCGCCTCTGGGTTTCCTCGGTCAGGCACATAAC

9021 549 ACACTCCTTGCGCTTTTCGCCT--CGGGTTCCTCGGTCAGGCACATAAC

062 546 ACACTCCTTGCGCTTTTCGCCTCTCGGGTTCCTCGGTCAGGCACATAAC

065 546 ACACTCCTTG-GCT-TTCGCCTCTG-GGTTCCTCGGTCAGGCACATAAC

9780 540 ACACTCCTTGCGCTTTTCGCCT--CGGGTTCCTCGGTCAGGCACATAAC

ZM16 547 ACACTCCTT-CGCTTT-CGCCTCTCGGGTTCCTCGGTCAGGCACATAAC


119 543 ACACTCCTTGCGCCTTTCGCCTCTG-GGTTCCTCGGTCAGGCACATAAC

363 549 AC-CTCCTTGCGCTTT--TCGCCTCGGGTTCCTCGGTCAGGCACATAAC

ZM13 557 ACACTCCTTGCGCTT-TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC

CC6 555 ACACTCCTTGCGCT--TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC

SL5 557 A-ACTCCTTGCGCT--TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC

ZM10 554 ACACTCCTTGCGCTT-TCGCCTCTCGGGTTCCTCGGT--GGCACATAAC

CC7 557 A-ACTCCTTGCGCT--TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC

046 556 A-ACTCCTTGCGCT--TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC

024 556 A-ACTCCTTGCGCT--TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC


1022 555 ACACTCCTTGCGCTT-TCGCCTCTCGGGTTCCT-GGTC-GGCACATAAC
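One simple way to read relatedness off an alignment like the one above is to count, for each pair of birds, the columns at which the aligned rows differ. The sketch below does this for the first 60-column block of three birds (SL8, 033 and ZM13), copied from the rows above. The function name `compare` and the choice to report substitutions and gap columns separately are illustrative assumptions, not the method used to produce the alignment.

```python
from itertools import combinations

# First alignment block for three birds, copied from the rows above.
ROWS = {
    "SL8":  "CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GT-GGGCTGGTCTGCT",
    "033":  "CCCATACCCCTAAGGGTAGCCCCCCCTACCCCCCCACTGAGTC-GT-GGGCTGGTCTGCT",
    "ZM13": "CCCATACCCCTAAGGGTAGCCCCCCCTACCC-CCCACTGAGTC-GTCGGGCTGGTCTGCT",
}


def compare(a: str, b: str) -> tuple:
    """Return (substitutions, gap_columns) between two aligned rows."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    substitutions = gap_columns = 0
    for x, y in zip(a, b):
        if x == y:
            continue
        if x == "-" or y == "-":
            gap_columns += 1  # one row has a base where the other has a gap
        else:
            substitutions += 1
    return substitutions, gap_columns


pairwise = {(p, q): compare(ROWS[p], ROWS[q]) for p, q in combinations(ROWS, 2)}
```

In this block the three rows differ only at gap columns: SL8 differs from each of 033 and ZM13 at one column, and 033 differs from ZM13 at two. A full comparison would repeat the count over every block of the alignment.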


Adenylate Kinase 1

SNPs and Indels of Intronic Nuclear Regions

[Figure panels, one per sample: 19S/M, CC6, CC7, SL5, SL8, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, ZM10, ZM13, ZM16, ZM19]

18S rRNA gene (18S)

SNPs and Indels for intronic nuclear regions

[Figure panels, one per sample: 19S/M, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, CC6, CC7, 024, SL5, SL8, ZM10, ZM13, ZM16, ZM19]

Recombination activating gene 1 (RAG1)

SNPs and Indels of nuclear Intronic regions

[Figure panels, one per sample: 19S/M, CC6, CC7, SL5, SL8, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, ZM10, ZM13, ZM16, ZM19]

Regulator of G-protein signaling 4 (RGS4)

SNPs and Indels for nuclear intronic regions

[Figure panels, one per sample: 19S/M, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, CC6, CC7, 024, SL5, SL8, ZM10, ZM13, ZM16, ZM19]

Vimentin Gene (Vim)

SNPs and Indels for nuclear Intronic regions

[Figure panels, one per sample: 19S/M, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, CC6, CC7, SL8, ZM10, ZM13, ZM16, ZM19]

Mitogenome

SNPs and Indels

[Figure panels, one per sample: 19S/M, CC6, CC7, SL5, SL8, 028, 033, 043, 062, 065, 1022, 4531, 9021, 9780, ZM10, ZM13, ZM16, ZM19]

EXAMPLE RESTRICTION DIGEST ELECTROPHORETIC GELS


MiSeq® Reporter graphical outputs

Below is a series of graphical outputs from the MiSeq® Reporter, each showing one “Sample/Genetic Region” under analysis. The vertical axis shows the depth of coverage (number of reads) at each position, and the horizontal axis indicates the base position in the sequence. The green curve represents the overall depth of coverage. The red spikes mark the positions at which a SNP or an indel occurs relative to the reference model.

The first group of graphs shows nuclear genome introns, whereas the second group (which contains many more SNPs and indels) shows the mitochondrial genome.
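To make the vertical axis concrete, the following minimal sketch computes depth of coverage, that is, the number of reads overlapping each base position, from a list of read intervals. The three reads and the 10 bp reference length are hypothetical toy values, and the function is an illustration of the quantity plotted, not a description of how the MiSeq® Reporter computes it.

```python
def depth_of_coverage(reads, length):
    """Per-position read depth for reads given as 1-based inclusive (start, end)."""
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (length + 2)
    for start, end in reads:
        diff[start] += 1
        diff[end + 1] -= 1

    depth, running = [], 0
    for pos in range(1, length + 1):
        running += diff[pos]  # running sum gives the depth at this position
        depth.append(running)
    return depth


# Toy reads on a 10 bp reference (hypothetical values).
reads = [(1, 5), (3, 8), (4, 10)]
depth = depth_of_coverage(reads, 10)
# depth == [1, 1, 2, 3, 3, 2, 2, 2, 1, 1]
```

Positions 4 and 5 are covered by all three toy reads and therefore have the greatest depth, which corresponds to the highest point of the green curve in such a plot.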


MISEQ PLATFORM OUTPUT NUCLEAR SNP AND INDEL GRAPHS


MISEQ PLATFORM OUTPUT MITOGENOME SNP AND INDEL GRAPHS


Attached in “soft copy” form are two tables (in MS Excel format) of the mitochondrial genome polymorphisms (SNPs and indels). To receive a copy via email, please send an email to the author at [email protected]. (To open these files, right-click on the picture, select “Worksheet Object”, then select “Open”.)

REGION --> (region, length, mitogenome positions)

12S(970bp)    67-1036
VAL(71bp)     1037-1107
16S(1572bp)   1109-2680
ND1(981bp)    2762-3742
ILE(72bp)     3741-3812
MET(68bp)     3889-3956
ND2(1040bp)   3957-4996
CO1(1548bp)   5362-6909
ASP(69bp)     6981-7049
CO2(684bp)    7052-7735
LYS(68bp)     7737-7804
ATP8(168bp)   7806-7973
ATP6(683bp)   7964-8646
CO3(782bp)    8646-9427
ND3(350bp)    9497-9846
ND4L(297bp)   9918-10214
ND4(1393bp)   10208-11600
ND5(1815bp)   11806-13620
CytB(1140bp)  13632-14771
THR(68bp)     14772-14839
ND6(515bp)    14917-15431

LOCUS --> 772 789 823 880 955 1094 1181 1514 1611 1621 1746 1777 1785 1801 1802 1832 1879 1992 2075 2197 2206 2359 2618 2784 2802 2914 2944 3142 3166 3199 3282 3298 3299 3436 3439 3511 3553 3602 3644 3798 3939 4061 4136 4193 4205 4247 4373 4385 4419 4432 4446 4489 4493 4517 4531 4539 4593 4620 4670 4697 4739 4778 4820 4938 4962 5433 5478 5490 5622 5655 5709 5748 5794 5931 5940 5952 5953 6078 6156 6207 6468 6474 6552 6592 6651 6756 6981 7033 7162 7192 7312 7413 7429 7576 7591 7616 7642 7649 7775 7785 7786 7914 7970 8057 8087 8110 8195 8219 8306 8516 8797 8830 8836 8883 9002 9028 9220 9265 9510 9539 9543 9579 9581 9632 9667 9701 9777 9987 9990 10047 10056 10108 10122 10143 10293 10295 10362 10430 10766 10850 10859 10898 10931 10969 10983 10984 11057 11087 11102 11177 11222 11327 11370 11387 11411 11441 11472 11481 11531 11836 11851 11896 11956 12085 12103 12106 12163 12361 12505 12709 12718 12739 12859 12922 12929 12967 13042 13135 13177 13201 13254 13319 13424 13480 13579 13743 13756 13776 13887 14022 14025 14031 14040 14097 14139 14224 14317 14373 14466 14541 14616

mito1 332 349 383 440 515 654 741 1074 1171 1181 1306 1337 1345 1361 1362 1392 1439 1552 1635 1757 1766 1919 2178 2344 2363 2474 2504 2702 2726 2759 2842 2858 2859 2996 2999 3071 3113

mito2 95 182 515 612 622 747 778 786 802 803 833 880 993 1076 1198 1207 1360 1619 1785 1804 1915 1945 2143 2167 2200 2283 2299 2300 2437 2440 2512 2554 2603 2645 2799 2940 3062 3137 3194 3206 3248 3374 3386 3420 3433 3447 3490 3494 3518 3532 3540 3594 3621 3671 3698 3740 3779 3821 3939 3963

mito3 412 457 469 601 634 688 727 773 910 919 931 932 1057 1135 1186 1447 1453 1531 1571 1630 1735 1960 2012 2141 2171 2291 2392 2408 2555 2570 2595 2621 2628 2754 2764 2765 2893

mito4 7622- mito4 27 153 163 164 292 348 435 465 488 573 597 684 894 1175 1208 1214 1261 1380 1406 1598 1643 1888 1917 1921 1957 1959 2010 2045 2079 2155 2365 2368 2425 2434 2486 2500 2521 2671 2673 2740 2808 3144 3228 3237 3276 3309 3347 3361 3362 3435 3465 3480 3555 3600 3705 3748 3765 3789 3819 3850 3859 3909

mito5 144 249 292 309 333 363 394 403 453 758 773 818 878 1007 1025 1028 1085 1283 1427 1631 1640 1661 1781 1844 1851 1889 1964 2057 2099 2123 2176 2241 2346 2402 2501 2665 2678 2698 2809 2944 2947 2953 2962 3019 3061 3146 3239 3295 3388 3463 3538

mito6

Reference

Seabury G A A A AG C G T A G G C T T A AC C C A A A A C T A T C A T A C T T G G T C A C A T C T A A G C C A T T T G T T A G G C G G G A T T A A A C C A T G A G C C C G A A C C G A C G C G A A T C A G G C G T A A A C A G T A C A A G G A G G A C T T C G A T G TC G A A T A A G C A C T A T A G G A T G A G G C A T T G G T A G G C A A G T G C A C C A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A

A. m. cyanoptera

SL8 A G G A A T A T A G A T C C A A T G AACTT G G T C C G C T A C G T C T A A T T A T C C C C G A A T C G T C C A C T G A A T G A G A C T G G A T T G C G T A T C T G G G C T A G C A T G A A T T A A A C G T AG G A C G G T G C G A G A G G G A T T T C A A C A TC G G A C A A A T G C C G T G A A G T G A A A C G T C A A C A A A C G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C T A C T T A A

033 A G G A/G A T A T A A A T C C G A T G AACTT G G T C C G C T A C G T C T G A T T A C C C C C G G A C C G T C T A C T G A A T G A G A C T G G A T T G C G T A T C T G G G T T A G C A T G A G T T G A G C A T A G C C A A T G T G A G A G G A A T T T C A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C G A A T G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C T A C T T A A

19s/m G G G A/G A T A T G A A C C C A A T G AACTT G G T C C G C T A C G T C T G A T T A C C C C C G G A C T G T C T A C T A/G A G T A A G G C C G G A T T G C G T A C T T G G G T T A G C A C A A G T T A A G T G T AG G C C A A T/C G T G A G A G A A A T G T T A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C G A A C G A G C G T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C C A C T T A A

ZM19 A G G A A T A T G A G T T T G A T G AACTT G G T T C G C T G C G T C C A A T T A C C C T C G G A T C G T C T A C C G A G T G G G A C C G G A T T G C G T A C T T G G G T T A G C A T G A G C T G A G T G C A G C C A A T G T G A G A G G A A T G T T A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C A A A C G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C C A C T T A A

028 A A G A A T A T A A A T T T G A T G AACTT G G T T C G C T G C G T C C A A T T A T C C T C G G A T C G T C T A C C G A A T A A A A C C G G G T T G C A A A T T C G A A T T A G C A C G G G T T G G G C G T AG G A C A G T/C G C G A G A G G G G T G C C A A C A TC G G A C A A A T G C C G T G A A G T G A A A C G C C A A C A A A T G G A C A T G T T G G T C A T C A A A G G T A A A C A A A C T T C G C C T A T C T A A

4531 G G G A A C A T A G A C C C A A T G AACTT G G T C C G C T A C G T C C G A T T A C C C C C G A A T T G T C T A C C G A G T G A G G C C G G G T T A C G A A C C T G G G T T A G T A T A A A T T G A G C A T AG G A C G G T G T G A G A G A G A T G C T A G C A T A G A T A A G T A C C G T G A G G C A T A A C G T C A A C A A A C G G A C A T G T C G A A T G T C A A A G G C A A A C A A A C T T C A T C T A C T T A A

9021 A G G A A C A T A A A C C C A A T G AACTT G G T T C G C T A C G T C T G A T T A C C C C C G G A T C G T C T A C T G A A T G A A G C T G G G T T A C A A A C C C G G A T T A G C A C A A A T T A A G C A T AG G A C G G T G C G A G A G A G A T G C C A A C A T G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G G A C A T G T T G A A T G T C A A A G G C A A A C A A G C C T C A T C T G C T T G G

062 A G G A A C A T A A G C C C A A T G AACTT G G T C C G C T G C G T C T A A T T A C C C C C G A A C C G T C T A C T G A G T A A A A C T G G G T T A C A T A T C C G A A T T A G C A T G G A T T A G G C A T AG G A T G G T G C G G A A G G G G T G C C A A C A T G G A T A A G T A T C G C G A G G T G T A A T G T C A A C A A A C G G A C A T G T T G G T C A T C A A A G G T A A A C A A A C T T C A T C T G T C T A A

065 A G G A A C A T A G G C C C A A T G AACTT G G T C C G C T A C G T C T A A T T A C C C C C G A A C C G T C T A C T G A G T G A G A C T G G G T T G C G T A C C T G A G C T A G T A T G A A T T A A G C A T A G A C A G T/C G C G A G A G G G A T G C C A A C A T G G G C A A/G A T G C C G T G A G G T G A A A C G T C A A C A A A C G G A C A T G T T G A T T G T C A A A G G C A A A C A G G C C C C A T C T G C T T A G

9780 G G G A A C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T A C C C C C G A A C C G T C T A C T G A G T G A G A C T G G A T T G C G T A T C C G A A C T A G C A T G A A C T A A G C G T A G A C A G T G C G A G A G G G A T G C C A A C A T G G G T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G G A C A T G T C G A T T G T C A G A G A C A A G C G A A C T T C G T C T A C T T A A

ZM16 A G G A A T A T G A A T T T G A T G AACTT G G T T C G C T A T G T C T A G T C G C C C T C G G A C T G T C T A C C A/G A G T G A A A C C G G A T T G C A A A T C C A A G C T A G C A C G A A T T A A G T G T AG G A T A G T G C G G A A G G G A T G C C A A C A TC G G A T A A G T A T C G C G A G G T G T A A T G C C A A C A A A C G A G C A T G T T G G T T G T C A A A G G C A A A C A G G C C C C G C C T A C T T A G

101 A A G A A C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T G G A T T A C G A A C C C G A A C T A G C A T G A A T T A G G C A T A G A C A G T G C G A G A G G G A T T T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A T C T A T C T A A

119 A A G A A C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T G G A T T A C G A A C C C G A A C T A G C A T G A A T T A G G C G T AG G A C A G T G C G A G A G G G A T G T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A C C T A T C T A A

363 A A G A A C A T A G G C C C A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T G G A T T A C G A A C C C G A A C T A G C A C G A A T T A G G C G T A G A C A G T G C G A G A G G G A T G T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A T C T A T C T A A

A. m. macao

ZM13 G G A A AG T G C A G A T T C A AC C A A G A C C C G T C G T A C T T A G C C A C A T C T A A A C T A T T T G T T A G G C G G G A T C G G A T T G C G A G T C T G A A C C G A T G T G A A T C A G G T G T AG A A C A G T A C A A G G A G G A C G T C G A T G TC G A A C A A A C G C T A T A G A A T G A G A C A C T A A C A A G T A G A T G C A C T A A T T G C T G A G A G C C T A T A A G T C T T A T T C G C T C A A

CC6 A G A A/G AG T G C A G A T T T G AC C A A A/G A C C T A T C G T A C T T A G T C G C A T C T A A A C T A C T T G T T A/G G G C G G A G T C A A A C C A T G A G T C C G A A C C G A C G C G A G T T A G G T G T A A A C A G T/C A C A A A G A G G A C T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G A C A T T G G T A G G C A A G T G C A C C A A T T G C T G A G A G C C T A T A A A T T T T A T T T A C C C A A

SL5 G A A A AG T G C A A A T T T G AC C A A G G C T T A T C/T A T A C T T G G T C A T A C C C A A A C T A T T T G T T A G A C A G G G T C G G A T T G C G T G C C C G A G C C G A T G C G A G T T A A G C G T A A A C A G T A T A A G G A G A A C G T T G G T G TC A A G T A A G C A C T A T A G A A C A T G A C A C T A A C A A G C A A A T G C A C T A A T T G C T G A G A G C C T A T A A G T C T T A T T T A C T C A G

ZM10 G G A A/G AG C G C G A G T T C A AC C A A G A C C C A T C A T A C T T G G C C A C C T C T A A A C T A C T T G T T A/G G A C A G G A T T G G A T T G C G A G C C C G A A C C G A T G C G A A T C A G G T G T AG A A C A G T A C A A G G A G G A T G C C G A T G TC G A G T A A G C A C T A T A G A A T G A G A C A C T A G T A G A C A A G T A C A C T A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C G A

CC7 A A A A AG T G T A G A T T T A AC C A A A/G A C T T A T C/T A T A C T T A G T C A C A C C C A A A C T A T T T G T T A G G C G G G A T T G G A T T G C G T G C C T G A G C C G A T G T G A G T C A A G C G T A A A C A G T A C A A G G A G A A C G C C G A T G TC G A A C A A/G A C G C T A T A G A A T G A G G C A C T A A T A A A C A G G T A C A C C A A T T G C T G G G A A C C T G T G A A T T T T A T T T A C T C G A

046 A A A A AG C G T A G A C T T A AC C A A G G C T C A T C A T A C T T G G T C A C A T C T A A G C C A T T T G T T A G G C G G G A T T A A A C C A T G A G C C C G A A C C G A C G C G A A T T A G G C G T A A A C A G T A C A A G G A G A A T T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G A C A T T G G T A G G C A A G T G C A C C A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A

024 G G A A AG C G T A G A C T T A AC C A A A/G A C C C G T C/T G T A C T T G G T C A C A T C T A A A C T A T T T G T T A G G C G G G G T T A A A C C A T G A G C C C G A A C C G A C G C G A A T C A G G C G T AG A A C A G T A C A A G G A G A A C T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G G C A T T G G T A G G C A G A T A C A C C A A T T G C T G A G A G C C T A T A A A T T T T A T T T A C T C A A

Hybrid

043 A A G A A C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T A C C C C C G A A C C G T C T A C C G A G T G G G G C T G G G T T A C G T A C C C G G A C T A G T A T G G G C T G G G C A T AG G A C G G T G C G A G A G G G G T G C C A A C A TC G G A T A A G T A C C G T G A G G T G T A A C G T C G G T G G G C A G A C A T G T T G A A T G T C A A A G G C A A A C A A G C C T C A T T C G C T C G A

1022 G G A A/G AG C G T A G A C T T A AC C A A G A C C T A T C G T A C T T G G C C A C A T C T A A A C T A T T T G T T A G G C G G G A T T A A A C C A T G A G C C C G A G C C G A C G C G A A T C A G G C G T A A A T A G T A T A A G G A G G A C G T T G A T G TC G A G C A A/G A C G C T A T A G G A T A A G G C G T T A A C A A A C A A A T A C A C C A A T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A
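The table above lists each polymorphic site twice: once as a mitogenome position (the LOCUS row) and once as a position within an amplicon (the mito1 to mito5 rows). The two differ by a constant offset per amplicon. The sketch below converts between them and looks up which annotated region(s) contain a locus. The region boundaries are taken from the REGION rows; the offsets for mito2, mito3 and mito5 are inferred here by subtracting amplicon positions from the corresponding LOCUS entries (for example 1094 - 95 = 999) and should be treated as assumptions, as should the function names.

```python
# Region boundaries from the REGION rows of the table (1-based, inclusive).
REGIONS = [
    ("12S", 67, 1036), ("VAL", 1037, 1107), ("16S", 1109, 2680),
    ("ND1", 2762, 3742), ("ILE", 3741, 3812), ("MET", 3889, 3956),
    ("ND2", 3957, 4996), ("CO1", 5362, 6909), ("ASP", 6981, 7049),
    ("CO2", 7052, 7735), ("LYS", 7737, 7804), ("ATP8", 7806, 7973),
    ("ATP6", 7964, 8646), ("CO3", 8646, 9427), ("ND3", 9497, 9846),
    ("ND4L", 9918, 10214), ("ND4", 10208, 11600), ("ND5", 11806, 13620),
    ("CytB", 13632, 14771), ("THR", 14772, 14839), ("ND6", 14917, 15431),
]

# Offset = LOCUS entry minus amplicon entry. 440 and 7622 appear in the table;
# the other three values are inferred and are assumptions of this sketch.
OFFSETS = {"mito1": 440, "mito2": 999, "mito3": 5021, "mito4": 7622, "mito5": 11078}


def to_locus(amplicon: str, position: int) -> int:
    """Convert a position within an amplicon to a mitogenome position."""
    return position + OFFSETS[amplicon]


def regions_at(locus: int) -> list:
    """Names of all annotated regions containing the locus (may be empty or overlap)."""
    return [name for name, start, end in REGIONS if start <= locus <= end]


first_site = to_locus("mito1", 332)      # first entry of the mito1 row
overlap_site = to_locus("mito4", 348)    # a site inside the ATP8/ATP6 overlap
```

Returning a list rather than a single name reflects the table itself: adjacent regions such as ATP8 and ATP6 overlap, and some loci fall between annotated regions.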



12S(970bp) VAL(71bp) 16S(1572bp) ND1(981bp)

67-1036 1037-1107 1109-2680 2762-3742

440- 772 789 823 880 955 1094 1181 1514 1611 1621 1746 1777 1785 1801 1802 1832 1879 1992 2075 2197 2206 2359 2618 2784 2802 2914 2944 3142 3166 3199 3282 3298 3299 3436 3439 3511 3553 (Cont)

mito1 332 349 383 440 515 654 741 1074 1171 1181 1306 1337 1345 1361 1362 1392 1439 1552 1635 1757 1766 1919 2178 2344 2363 2474 2504 2702 2726 2759 2842 2858 2859 2996 2999 3071 3113

Ref G A A A AG C G T A G G C T T A AC C C A A A A C T A T C A T A C T T G G T C

A. m. cyanoptera

SL8 A G G A A T A T A G A T C C A A T G AACTT G G T C C G C T A C G T C T A A T T

033 A G G A/G A T A T A A A T C C G A T G AACTT G G T C C G C T A C G T C T G A T T

19s/m G G G A/G A T A T G A A C C C A A T G AACTT G G T C C G C T A C G T C T G A T T

ZM19 A G G A A T A T G A G T T T G A T G AACTT G G T T C G C T G C G T C C A A T T

028 A A G A A T A T A A A T T T G A T G AACTT G G T T C G C T G C G T C C A A T T

4531 G G G A A C A T A G A C C C A A T G AACTT G G T C C G C T A C G T C C G A T T

9021 A G G A A C A T A A A C C C A A T G AACTT G G T T C G C T A C G T C T G A T T

062 A G G A A C A T A A G C C C A A T G AACTT G G T C C G C T G C G T C T A A T T

065 A G G A A C A T A G G C C C A A T G AACTT G G T C C G C T A C G T C T A A T T

9780 G G G A A C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T

ZM16 A G G A A T A T G A A T T T G A T G AACTT G G T T C G C T A T G T C T A G T C

101 A A G A A C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C

119 A A G A A C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C

363 A A G A A C A T A G G C C C A A T G AACTT G G T T C G C T A T G T C T G G T C

A. m. macao

ZM13 G G A A AG T G C A G A T T C A AC C A A G A C C C G T C G T A C T T A G C C

CC6 A G A A/G AG T G C A G A T T T G AC C A A A/G A C C T A T C G T A C T T A G T C

SL5 G A A A AG T G C A A A T T T G AC C A A G G C T T A T C/T A T A C T T G G T C

ZM10 G G A A/G AG C G C G A G T T C A AC C A A G A C C C A T C A T A C T T G G C C

CC7 A A A A AG T G T A G A T T T A AC C A A A/G A C T T A T C/T A T A C T T A G T C

046 A A A A AG C G T A G A C T T A AC C A A G G C T C A T C A T A C T T G G T C

024 G G A A AG C G T A G A C T T A AC C A A A/G A C C C G T C/T G T A C T T G G T C

Hybrids

043 A A G A A C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T

1022 G G A A/G AG C G T A G A C T T A AC C A A G A C C T A T C G T A C T T G G C C

VAL(71bp) 16S(1572bp) ND1(981bp) ILE(72bp) MET(68bp) ND2(1040bp)

1037-1107 1109-2680 2762-3742 3741-3812 3889-3956 3957-4996

1094 1181 1514 1611 1621 1746 1777 1785 1801 1802 1832 1879 1992 2075 2197 2206 2359 2618 2784 2802 2914 2944 3142 3166 3199 3282 3298 3299 3436 3439 3511 3553 3602 3644 3798 3939 4061 4136 4193 4205 4247 4373 4385 4419 4432 4446 4489 4493 4517 4531 4539 4593 4620 4670 4697 4739 4778 4820 4938 4962

mito2 95 182 515 612 622 747 778 786 802 803 833 880 993 1076 1198 1207 1360 1619 1785 1804 1915 1945 2143 2167 2200 2283 2299 2300 2437 2440 2512 2554 2603 2645 2799 2940 3062 3137 3194 3206 3248 3374 3386 3420 3433 3447 3490 3494 3518 3532 3540 3594 3621 3671 3698 3740 3779 3821 3939 3963

Ref C G T A G G C T T A AC C C A A A A C T A T C A T A C T T G G T C A C A T C T A A G C C A T T T G T T A G G C G G G A T T

A. m. cyanoptera

SL8 T A T A G A T C C A A T G AACTT G G T C C G C T A C G T C T A A T T A T C C C C G A A T C G T C C A C T G A A T G A G A C T

033 T A T A A A T C C G A T G AACTT G G T C C G C T A C G T C T G A T T A C C C C C G G A C C G T C T A C T G A A T G A G A C T

19s/m T A T G A A C C C A A T G AACTT G G T C C G C T A C G T C T G A T T A C C C C C G G A C T G T C T A C T A/G A G T A A G G C C

ZM19 T A T G A G T T T G A T G AACTT G G T T C G C T G C G T C C A A T T A C C C T C G G A T C G T C T A C C G A G T G G G A C C

028 T A T A A A T T T G A T G AACTT G G T T C G C T G C G T C C A A T T A T C C T C G G A T C G T C T A C C G A A T A A A A C C

4531 C A T A G A C C C A A T G AACTT G G T C C G C T A C G T C C G A T T A C C C C C G A A T T G T C T A C C G A G T G A G G C C

9021 C A T A A A C C C A A T G AACTT G G T T C G C T A C G T C T G A T T A C C C C C G G A T C G T C T A C T G A A T G A A G C T

062 C A T A A G C C C A A T G AACTT G G T C C G C T G C G T C T A A T T A C C C C C G A A C C G T C T A C T G A G T A A A A C T

065 C A T A G G C C C A A T G AACTT G G T C C G C T A C G T C T A A T T A C C C C C G A A C C G T C T A C T G A G T G A G A C T

9780 C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T A C C C C C G A A C C G T C T A C T G A G T G A G A C T

ZM16 T A T G A A T T T G A T G AACTT G G T T C G C T A T G T C T A G T C G C C C T C G G A C T G T C T A C C A/G A G T G A A A C C

101 C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T

119 C A T A G G C T T A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T

363 C A T A G G C C C A A T G AACTT G G T T C G C T A T G T C T G G T C A C C C C C G A A C C G T C T A C T G A G T G A G A C T

A. m. macao

ZM13 T G C A G A T T C A AC C A A G A C C C G T C G T A C T T A G C C A C A T C T A A A C T A T T T G T T A G G C G G G A T C

CC6 T G C A G A T T T G AC C A A A/G A C C T A T C G T A C T T A G T C G C A T C T A A A C T A C T T G T T A/G G G C G G A G T C

SL5 T G C A A A T T T G AC C A A G G C T T A T C/T A T A C T T G G T C A T A C C C A A A C T A T T T G T T A G A C A G G G T C

ZM10 C G C G A G T T C A AC C A A G A C C C A T C A T A C T T G G C C A C C T C T A A A C T A C T T G T T A/G G A C A G G A T T

CC7 T G T A G A T T T A AC C A A A/G A C T T A T C/T A T A C T T A G T C A C A C C C A A A C T A T T T G T T A G G C G G G A T T

046 C G T A G A C T T A AC C A A G G C T C A T C A T A C T T G G T C A C A T C T A A G C C A T T T G T T A G G C G G G A T T

024 C G T A G A C T T A AC C A A A/G A C C C G T C/T G T A C T T G G T C A C A T C T A A A C T A T T T G T T A G G C G G G G T T

Hybrids

043 C A T A G G C C C A A T G AACTT G G T T C G C T A C G T C C G A T T A C C C C C G A A C C G T C T A C C G A G T G G G G C T

1022 C G T A G A C T T A AC C A A G A C C T A T C G T A C T T G G C C A C A T C T A A A C T A T T T G T T A G G C G G G A T T

CO1(1548bp) ASP(69bp) CO2(684bp) LYS(68bp) ATP8(168bp)

5362-6909 6981-7049 7052-7735 7737-7804 7806-7973

5021- 5433 5478 5490 5622 5655 5709 5748 5794 5931 5940 5952 5953 6078 6156 6207 6468 6474 6552 6592 6651 6756 6981 7033 7162 7192 7312 7413 7429 7576 7591 7616 7642 7649 7775 7785 7786 7914 (Cont)

mito3 412 457 469 601 634 688 727 773 910 919 931 932 1057 1135 1186 1447 1453 1531 1571 1630 1735 1960 2012 2141 2171 2291 2392 2408 2555 2570 2595 2621 2628 2754 2764 2765 2893

Ref A A A C C A T G A G C C C G A A C C G A C G C G A A T C A G G C G T A A A

A. m. cyanoptera

SL8 G G A T T G C G T A T C T G G G C T A G C A T G A A T T A A A C G T AG G A

033 G G A T T G C G T A T C T G G G T T A G C A T G A G T T G A G C A T A G C

19s/m G G A T T G C G T A C T T G G G T T A G C A C A A G T T A A G T G T AG G C

ZM19 G G A T T G C G T A C T T G G G T T A G C A T G A G C T G A G T G C A G C

028 G G G T T G C A A A T T C G A A T T A G C A C G G G T T G G G C G T AG G A

4531 G G G T T A C G A A C C T G G G T T A G T A T A A A T T G A G C A T AG G A

9021 G G G T T A C A A A C C C G G A T T A G C A C A A A T T A A G C A T AG G A

062 G G G T T A C A T A T C C G A A T T A G C A T G G A T T A G G C A T AG G A

065 G G G T T G C G T A C C T G A G C T A G T A T G A A T T A A G C A T A G A

9780 G G A T T G C G T A T C C G A A C T A G C A T G A A C T A A G C G T A G A

ZM16 G G A T T G C A A A T C C A A G C T A G C A C G A A T T A A G T G T AG G A

101 G G A T T A C G A A C C C G A A C T A G C A T G A A T T A G G C A T A G A

119 G G A T T A C G A A C C C G A A C T A G C A T G A A T T A G G C G T AG G A

363 G G A T T A C G A A C C C G A A C T A G C A C G A A T T A G G C G T A G A

A. m. macao

ZM13 G G A T T G C G A G T C T G A A C C G A T G T G A A T C A G G T G T AG A A

CC6 A A A C C A T G A G T C C G A A C C G A C G C G A G T T A G G T G T A A A

SL5 G G A T T G C G T G C C C G A G C C G A T G C G A G T T A A G C G T A A A

ZM10 G G A T T G C G A G C C C G A A C C G A T G C G A A T C A G G T G T AG A A

CC7 G G A T T G C G T G C C T G A G C C G A T G T G A G T C A A G C G T A A A

046 A A A C C A T G A G C C C G A A C C G A C G C G A A T T A G G C G T A A A

024 A A A C C A T G A G C C C G A A C C G A C G C G A A T C A G G C G T AG A A

Hybrids

043 G G G T T A C G T A C C C G G A C T A G T A T G G G C T G G G C A T AG G A

1022 A A A C C A T G A G C C C G A G C C G A C G C G A A T C A G G C G T A A A

CO2(684bp) LYS(68bp) ATP8(168bp) ATP6(683bp) CO3(782bp) ND3(350bp) ND4L(297bp) ND4(1393bp)

7052-7735 7737-7804 7806-7973 7964-8646 8646-9427 9497-9846 9918-10214 10208-11600

7622- 7649 7775 7785 7786 7914 7970 8057 8087 8110 8195 8219 8306 8516 8797 8830 8836 8883 9002 9028 9220 9265 9510 9539 9543 9579 9581 9632 9667 9701 9777 9987 9990 10047 10056 10108 10122 10143 10293 10295 10362 10430 10766 10850 10859 10898 10931 10969 10983 10984 11057 11087 11102 11177 11222 11327 11370 11387 11411 11441 11472 11481 11531

mito4 27 153 163 164 292 348 435 465 488 573 597 684 894 1175 1208 1214 1261 1380 1406 1598 1643 1888 1917 1921 1957 1959 2010 2045 2079 2155 2365 2368 2425 2434 2486 2500 2521 2671 2673 2740 2808 3144 3228 3237 3276 3309 3347 3361 3362 3435 3465 3480 3555 3600 3705 3748 3765 3789 3819 3850 3859 3909

Ref G T A A A C A G T A C A A G G A G G A C T T C G A T G T G A A T A A G C A C T A T A G G A T G A G G C A T T G G T A G G C A

A. m. cyanoptera

SL8 G T AG G A C G G T G C G A G A G G G A T T T C A A C A TC G G A C A A A T G C C G T G A A G T G A A A C G T C A A C A A A C G

033 A T A G C C A A T G T G A G A G G A A T T T C A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C G A A T G

19s/m G T AG G C C A A T/C G T G A G A G A A A T G T T A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C G A A C G

ZM19 G C A G C C A A T G T G A G A G G A A T G T T A G C A TC G G G T G A G T A T C G C G A G G T G T A A T G T C A A C A A A C G

028 G T AG G A C A G T/C G C G A G A G G G G T G C C A A C A TC G G A C A A A T G C C G T G A A G T G A A A C G C C A A C A A A T G

4531 A T AG G A C G G T G T G A G A G A G A T G C T A G C A T A G A T A A G T A C C G T G A G G C A T A A C G T C A A C A A A C G

9021 A T AG G A C G G T G C G A G A G A G A T G C C A A C A T G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G

062 A T AG G A T G G T G C G G A A G G G G T G C C A A C A T G G A T A A G T A T C G C G A G G T G T A A T G T C A A C A A A C G

065 A T A G A C A G T/C G C G A G A G G G A T G C C A A C A T G G G C A A/G A T G C C G T G A G G T G A A A C G T C A A C A A A C G

9780 G T A G A C A G T G C G A G A G G G A T G C C A A C A T G G G T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G

ZM16 G T AG G A T A G T G C G G A A G G G A T G C C A A C A TC G G A T A A G T A T C G C G A G G T G T A A T G C C A A C A A A C G

101 A T A G A C A G T G C G A G A G G G A T T T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G

119 G T AG G A C A G T G C G A G A G G G A T G T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G

363 G T A G A C A G T G C G A G A G G G A T G T C A A C A TC G G A T A A G T A C C G T G A G G T G A A A C G T C A A C A A A C G

A. m. macao

ZM13 G T AG A A C A G T A C A A G G A G G A C G T C G A T G TC G A A C A A A C G C T A T A G A A T G A G A C A C T A A C A A G T A

CC6 G T A A A C A G T/C A C A A A G A G G A C T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G A C A T T G G T A G G C A

SL5 G T A A A C A G T A T A A G G A G A A C G T T G G T G TC A A G T A A G C A C T A T A G A A C A T G A C A C T A A C A A G C A

ZM10 G T AG A A C A G T A C A A G G A G G A T G C C G A T G TC G A G T A A G C A C T A T A G A A T G A G A C A C T A G T A G A C A

CC7 G T A A A C A G T A C A A G G A G A A C G C C G A T G TC G A A C A A/G A C G C T A T A G A A T G A G G C A C T A A T A A A C A

046 G T A A A C A G T A C A A G G A G A A T T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G A C A T T G G T A G G C A

024 G T AG A A C A G T A C A A G G A G A A C T T C G A T G TC G A A T A A G C A C T A T A G A A T G A G G C A T T G G T A G G C A

Hybrids

043 A T AG G A C G G T G C G A G A G G G G T G C C A A C A TC G G A T A A G T A C C G T G A G G T G T A A C G T C G G T G G G C A

1022 G T A A A T A G T A T A A G G A G G A C G T T G A T G TC G A G C A A/G A C G C T A T A G G A T A A G G C G T T A A C A A A C A

ND4(1393bp) ND5(1815bp) CytB(1140bp)

10208-11600 11806-13620 13632-14771

11079- 11222 11327 11370 11387 11411 11441 11472 11481 11531 11836 11851 11896 11956 12085 12103 12106 12163 12361 12505 12709 12718 12739 12859 12922 12929 12967 13042 13135 13177 13201 13254 13319 13424 13480 13579 13743 13756 13776 13887 14022 14025 14031 14040 14097 14139 14224 14317 14373 14466 14541 14616 (Cont)

mito5 144 249 292 309 333 363 394 403 453 758 773 818 878 1007 1025 1028 1085 1283 1427 1631 1640 1661 1781 1844 1851 1889 1964 2057 2099 2123 2176 2241 2346 2402 2501 2665 2678 2698 2809 2944 2947 2953 2962 3019 3061 3146 3239 3295 3388 3463 3538

Ref T G G T A G G C A A G T G C A C C A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A

A. m. cyanoptera

SL8 C A A C A A A C G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C T A C T T A A

033 C A A C G A A T G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C T A C T T A A

19s/m C A A C G A A C G A G C G T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C C A C T T A A

ZM19 C A A C A A A C G G A C A T G T C G A T T A T C A G A G G C A A A C G A A C T T C A T C C A C T T A A

028 C A A C A A A T G G A C A T G T T G G T C A T C A A A G G T A A A C A A A C T T C G C C T A T C T A A

4531 C A A C A A A C G G A C A T G T C G A A T G T C A A A G G C A A A C A A A C T T C A T C T A C T T A A

9021 C A A C A A A C G G A C A T G T T G A A T G T C A A A G G C A A A C A A G C C T C A T C T G C T T G G

062 C A A C A A A C G G A C A T G T T G G T C A T C A A A G G T A A A C A A A C T T C A T C T G T C T A A

065 C A A C A A A C G G A C A T G T T G A T T G T C A A A G G C A A A C A G G C C C C A T C T G C T T A G

9780 C A A C A A A C G G A C A T G T C G A T T G T C A G A G A C A A G C G A A C T T C G T C T A C T T A A

ZM16 C A A C A A A C G A G C A T G T T G G T T G T C A A A G G C A A A C A G G C C C C G C C T A C T T A G

101 C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A T C T A T C T A A

119 C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A C C T A T C T A A

363 C A A C A A A C G A G C G T G T C G G T C A T C A A A G G T A A A C A A A C T T C A T C T A T C T A A

A. m. macao

ZM13 T A A C A A G T A G A T G C A C T A A T T G C T G A G A G C C T A T A A G T C T T A T T C G C T C A A

CC6 T G G T A G G C A A G T G C A C C A A T T G C T G A G A G C C T A T A A A T T T T A T T T A C C C A A

SL5 T A A C A A G C A A A T G C A C T A A T T G C T G A G A G C C T A T A A G T C T T A T T T A C T C A G

ZM10 T A G T A G A C A A G T A C A C T A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C G A

CC7 T A A T A A A C A G G T A C A C C A A T T G C T G G G A A C C T G T G A A T T T T A T T T A C T C G A

046 T G G T A G G C A A G T G C A C C A G T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A

024 T G G T A G G C A G A T A C A C C A A T T G C T G A G A G C C T A T A A A T T T T A T T T A C T C A A

Hybrids

043 C G G T G G G C A G A C A T G T T G A A T G T C A A A G G C A A A C A A G C C T C A T T C G C T C G A

1022 T A A C A A A C A A A T A C A C C A A T C A C T G A G A G T C T A T A A A T T T T A T T T A T C C A A

THR(68bp) ND6(515bp)

14772-14839 14917-15431

14234- 14373 14466 14541 14616

mito6

Ref C C A A

A. m. cyanoptera

SL8 T T A A

033 T T A A

19s/m T T A A

ZM19 T T A A

028 C T A A

4531 T T A A

9021 T T G G

062 C T A A

065 T T A G

9780 T T A A

ZM16 T T A G

101 C T A A

119 C T A A

363 C T A A

A. m. macao

ZM13 T C A A

CC6 C C A A

SL5 T C A G

ZM10 C C G A

CC7 T C G A

046 C C A A

024 T C A A

Hybrids

043 T C G A

1022 C C A A

BIBLIOGRAPHY

Agilent Technologies. "2100 Bioanalyzer Instruments." from

http://www.genomics.agilent.com/en/product.jsp?cid=AG-PT-106&_requestid=27387.

Aitken, S. (2006). "DNA Barcoding: Fast-Tracking Species Identification." Biodiversity 7(3-4):

71-79.

Akey, J. M. (2003). "The Effect of Single Nucleotide Polymorphism Identification Strategies on

Estimates of Linkage Disequilibrium." Molecular Biology and Evolution 20(2): 232-242.

Álvarez, I. and J. F. Wendel (2003). "Ribosomal ITS sequences and plant phylogenetic

inference." Molecular Phylogenetics and Evolution 29(3): 417-434.

Amaya-Villarreal, A. M., A. Estrada and N. Vargas-Ramirez (2015). "Use of wild foods during

the rainy season by a reintroduced population of scarlet macaws (Ara macao cyanoptera) in

Palenque, Mexico." Tropical Conservation Science 8(2): 455-478.

AMCA - American Mosquito Control Association (2014). "Mosquito-Borne Diseases."

Aranishi, F. (2006). "A novel mitochondrial intergenic spacer reflecting population structure of

Pacific oyster." J Appl Genet 47(2): 119-123.

Arif, I. A. and H. A. Khan (2009). "Molecular markers for biodiversity analysis of wildlife animals:

a brief review." Animal Biodiversity and Conservation 32(1): 9-17.

Avise, J. C. (2000). Phylogeography: the history and formation of species, Harvard University

Press.

Backstrom, N., S. Fagerberg and H. Ellegren (2008). "Genomics of natural bird populations: a

gene-based set of reference markers evenly spread across the avian genome." Mol Ecol 17(4):

964-980.

Baldwin, B. G., M. J. Sanderson, J. M. Porter, M. F. Wojciechowski, C. S. Campbell and M. J.

Donoghue (1995). "The ITS Region of Nuclear Ribosomal DNA: A Valuable Source of Evidence

on Angiosperm Phylogeny." Annals of the Missouri Botanical Garden 82(2): 247.

Barker, F. K., M. K. Benesh, A. J. Vandergon and S. M. Lanyon (2012). "Contrasting

evolutionary dynamics and information content of the avian mitochondrial control region and

ND2 gene." PLoS One 7(10): e46403.

Beckman Coulter. "Agencourt AMPure XP Beads." from

https://www.beckmancoulter.com/wsrportal/bibliography?docname=AMPureXPvsAMPure.pdf.

Beckman Coulter (2013). Agencourt AMPure XP PCR Purification. Beckman Coulter.

Benders-Hyde, E. (2002). "Southeast Asian Forest."

Birchler, J. A., H. Yao and S. Chudalayandi (2006). "Unraveling the genetic basis of hybrid

vigor." Proc Natl Acad Sci U S A 103(35): 12957-12958.

BirdLife International (2013). "Ara macao." IUCN Red List of Threatened Species.

International Union for Conservation of Nature. Retrieved 26 November, 2013.

Bohle, H. M. and T. Gabaldon (2012). "Selection of marker genes using whole-genome DNA

polymorphism analysis." Evol Bioinform Online 8: 161-169.

Boore, J. L. (1999). "Animal mitochondrial genomes." Nucleic Acids Research 27(8): 1767-

1780.

Brightsmith, D., J. Hilburn, A. del Campo, J. Boyd, M. Frisius, R. Frisius, D. Janik and F.

Guillen (2005). "The use of hand-raised psittacines for reintroduction: a case study of scarlet

macaws (Ara macao) in Peru and Costa Rica." Biological Conservation 121(3): 465-472.

Buckler-Iv, E. S., A. Ippolito and T. P. Holtsford (1997). "The Evolution of Ribosomal DNA:

Divergent Paralogues and Phylogenetic Implications." Genetics 145(3): 821-832.

Butler, R. (2014). "Tropical Rainforests of the World." from

http://rainforests.mongabay.com/0101.htm.

Cantu, J. C. (2014). "The Scarlet Macaw is Back in the Gulf of Mexico!", from

http://www.defendersblog.org/2014/07/scarlet-macaw-back-gulf-mexico/.

Carson, J. F., J. Watling, F. E. Mayle, B. S. Whitney, J. Iriarte, H. Prumers and J. D. Soto (2015).

"Pre-Columbian land use in the ring-ditch region of the Bolivian Amazon." The Holocene 25(8):

1285-1300.

Clarridge, J. E., 3rd (2004). "Impact of 16S rRNA gene sequence analysis for identification of

bacteria on clinical microbiology and infectious diseases." Clin Microbiol Rev 17(4): 840-862.

Coleman, A. W. (2013). "Analysis of mammalian rDNA internal transcribed spacers." PLoS One

8(11): e79122.

Collins, F. S., L. D. Brooks and A. Chakravarti (1998). "A DNA polymorphism discovery

resource for research on human genetic variation." Genome Res 8(12): 1229-1231.

Crnokrak, P. and D. A. Roff (1999). "Inbreeding depression in the wild." Heredity 83(3): 260-

270.

Dasmahapatra, K. K. and J. Mallet (2006). "Taxonomy: DNA barcodes: recent successes and

future prospects." Heredity (Edinb) 97(4): 254-255.

De Mendonca Dantas, G. P., R. Godinho, J. S. Morgante and N. Ferrand (2009). "Development

of new nuclear markers and characterization of single nucleotide polymorphisms in kelp gull

(Larus dominicanus)." Mol Ecol Resour 9(4): 1159-1161.

Desjardins, P. and R. Morais (1990). "Sequence and gene organization of the chicken

mitochondrial genome. A novel gene order in higher vertebrates." J Mol Biol 212(4): 599-634.

Drees, K. (2010). "Zoo Blog: Scarlet Macaws." from

http://www.blankparkzoo.com/index.cfm/18193/1202/scarlet_macaws.

Duchene, S., F. I. Archer, J. Vilstrup, S. Caballero and P. A. Morin (2011). "Mitogenome

phylogenetics: the impact of using single regions and partitioning schemes on topology,

substitution rate and divergence time estimation." PLoS One 6(11): e27138.

Dupuis, J. R., A. D. Roe and F. A. Sperling (2012). "Multi-locus species delimitation in closely

related animals and fungi: one marker is not enough." Mol Ecol 21(18): 4422-4436.

Eickbush, T. H. (2002). "R2 and Related Site-Specific Non-Long Terminal Repeat

Retrotransposons." 813-835.

Eickbush, T. H. and D. G. Eickbush (2007). "Finely orchestrated movements: evolution of the

ribosomal RNA genes." Genetics 175(2): 477-485.

Ellegren, H. (2014). "Genome sequencing and population genomics in non-model organisms."

Trends Ecol Evol 29(1): 51-63.

Estrada, A. (2014). "Reintroduction of the scarlet macaw (Ara macao

cyanoptera) in the tropical rainforests of Palenque,

Mexico: project design and first year progress." Tropical Conservation Science 7(3): 342-364.

Feinstein, J. and J. Cracraft (2004). "Solving a sequencing problem in the vertebrate

mitochondrial control region using phylogenetic comparisons." DNA Seq 15(5-6): 374-377.

Feliner, G. N. and J. A. Rosselló (2012). "Concerted Evolution of Multigene Families and

Homoeologous Recombination." 171-193.

Fitch, W. M. and E. Margoliash (1967). "Construction of phylogenetic trees." Science

155(3760): 279-284.

Forshaw, J. M. and W. Cooper (2006). Parrots of the World, Avian Publications.

Frankham, R., J. Ballou and D. Briscoe (2010). Introduction to Conservation Genetics.

Ganley, A. R. and T. Kobayashi (2007). "Highly efficient concerted evolution in the ribosomal

DNA repeats: total rDNA repeat variation revealed by whole-genome shotgun sequence data."

Genome Res 17(2): 184-191.

Gilbert, S. F. (2013). Developmental Biology, Sinauer Associates.

Goh, K. J., C. T. Tan, N. K. Chew, P. S. Tan, A. Kamarulzaman, S. A. Sarji, K. T. Wong, B. J.

Abdullah, K. B. Chua and S. K. Lam (2000). "Clinical features of Nipah virus encephalitis

among pig farmers in Malaysia." N Engl J Med 342(17): 1229-1235.

Grechko, V. V., L. V. Fedorova, D. M. Riabinin, D. G. Chobanu, S. A. Kosushkhin and I. S.

Darevskii (2006). "Molecular markers of nuclear DNA in the study of evolution and speciation

process in an example of "Lacerta agilis complex" (Sauria: Lacertidae)." Mol. Biol. (Mosk)

40(1): 61-73.

Haig, S. M., L. Wennerberg, T. D. Mullins, E. D. Forsman and P. Trail (2004). "Genetic

identification of spotted owls, barred owls, and their hybrids: Legal implications of hybrid

identity." Conservation Biology 18(5): 1347-1357.

Hammond, J. B. W., G. Spanswick and J. A. Mawn (1996). "Extraction of DNA from preserved

animal specimens for use in randomly amplified polymorphic DNA analysis." Analytical

Biochemistry 240: 298-300.

Hebert, P. D. and T. R. Gregory (2005). "The promise of DNA barcoding for taxonomy." Syst

Biol 54(5): 852-859.

Hebert, P. D., E. H. Penton, J. M. Burns, D. H. Janzen and W. Hallwachs (2004). "Ten species in

one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes

fulgerator." Proc Natl Acad Sci U S A 101(41): 14812-14817.

Hebert, P. D., S. Ratnasingham and J. R. deWaard (2003). "Barcoding animal life: cytochrome c

oxidase subunit 1 divergences among closely related species." Proc Biol Sci 270 Suppl 1: S96-

99.

Hebert, P. D., M. Y. Stoeckle, T. S. Zemlak and C. M. Francis (2004). "Identification of Birds

through DNA Barcodes." PLoS Biol 2(10): e312.

Heslop-Harrison, J. S. and T. Schwarzacher (2011). "Organisation of the plant genome in

chromosomes." Plant J 66(1): 18-33.

Holstein, N. (2006). "Eucaryot rdna.png."

Hughes, A. L. and M. A. Hughes (2007). "Coding sequence polymorphism in avian

mitochondrial genomes reflects population histories." Mol Ecol 16(7): 1369-1376.

Illumina, I. "MiSeq Reporter Software Documentation."

Illumina, I. "MiSeq Reporter workflow."

Illumina, I. "Nextera XT DNA Library Preparation Kit." from

http://www.illumina.com/products/nextera_xt_dna_library_prep_kit.html.

Illumina, I. (2015). "Nextera XT DNA Library Preparation Kit." from

http://www.illumina.com/products/nextera_xt_dna_library_prep_kit.html.

International Union for Conservation of Nature. (2015). "Conservation successes overshadowed

by more species declines – IUCN Red List update." from

http://www.iucn.org/news_homepage/?21561/Conservation-successes-overshadowed-by-more-species-declines--IUCN-Red-List-update.

IUCN (2013). IUCN Red List of Threatened Species. Version 2013.2. www.iucnredlist.org.

Juniper, T. and M. Parr (1998). Parrots: A Guide to Parrots of the World, Yale University Press.

Kearse, M., R. Moir, A. Wilson, S. Stones-Havas, M. Cheung, S. Sturrock, S. Buxton, A.

Cooper, S. Markowitz, C. Duran, T. Thierer, B. Ashton, P. Meintjes and A. Drummond (2012).

"Geneious Basic: an integrated and extendable desktop software platform for the organization

and analysis of sequence data." Bioinformatics 28(12): 1647-1649.

Kiesler, K. (2014). Next Generation Sequencing on the Ion Torrent PGM: New SNP Typing

Applications, National Institute of Standards and Technology.

Kilpert, F. and L. Podsiadlowski (2006). "The complete mitochondrial genome of the common

sea slater, Ligia oceanica (Crustacea, Isopoda) bears a novel gene order and unusual control

region features." BMC Genomics 7: 241.

King, R. C. and W. D. Stansfield (1990). Dictionary of Genetics, Oxford University Press.

Kiss, L. (2012). "Limits of nuclear ribosomal DNA internal transcribed spacer (ITS) sequences

as species barcodes for Fungi." Proc Natl Acad Sci U S A 109(27): E1811; author reply E1812.

Knowles, L. L. and B. C. Carstens (2007). "Delimiting species without monophyletic gene

trees." Systematic Biology 56(6): 887-895.

Kovarik, A., M. Dadejova, Y. K. Lim, M. W. Chase, J. J. Clarkson, S. Knapp and A. R. Leitch

(2008). "Evolution of rDNA in Nicotiana allopolyploids: a potential link between rDNA

homogenization and epigenetics." Ann Bot 101(6): 815-823.

Kurtzman, C. P. and C. J. Robnett (1998). "Identification and phylogeny of ascomycetous yeasts

from analysis of nuclear large subunit (26S) ribosomal DNA partial sequences." Antonie van

Leeuwenhoek 73(4): 331-371.

Lachance, J. and S. A. Tishkoff (2013). "SNP ascertainment bias in population genetic analyses:

why it is important, and how to correct it." Bioessays 35(9): 780-786.

Lande, R. (1988). "Genetics and demography in biological conservation." Science 241(4872):

1455-1460.

Li, W.-H. (1997). Molecular Evolution. Sunderland, Massachusetts, Sinauer Associates.

Life Technologies. "Life Technologies -- Ion Torrent." from

http://www.biomedcentral.com/1471-2148/12/46.

Marsden, S. J. and J. D. Pilgrim (2002). "Factors influencing the abundance of parrots and

hornbills in pristine and disturbed forests on New Britain, PNG." Ibis 145(1): 45-53.

Matyasek, R., S. Renny-Byfield, J. Fulnecek, J. Macas, M. A. Grandbastien, R. Nichols, A.

Leitch and A. Kovarik (2012). "Next generation sequencing analysis reveals a relationship

between rDNA unit diversity and locus number in Nicotiana diploids." BMC Genomics 13: 722.

Mikhed, Y., A. Daiber and S. Steven (2015). "Mitochondrial Oxidative Stress, Mitochondrial

DNA Damage and Their Role in Age-Related Vascular Dysfunction." Int J Mol Sci 16(7):

15918-15953.

Mindell, D. P., M. D. Sorenson and D. E. Dimcheff (1998). "An Extra Nucleotide is not

Translated in Mitochondrial ND3 of Some Birds and Turtles." Molecular Biology and Evolution

15(11): 1568-1571.

Morin, P. A., G. Luikart, R. K. Wayne and the SNP Workshop Group (2004). "SNPs in ecology, evolution

and conservation." Trends in Ecology & Evolution 19(4): 208-216.

Moss, R. S. a. D. (2009). "Earth Talk." from http://www.emagazine.com/earth-talk-archive/week-of-10-11-09.

Myers, M. C. and C. Vaughan (2004). "Movement and behavior of scarlet macaws (Ara macao)

during the post-fledging dependence period: implications for in situ versus ex situ management."

Biological Conservation 118(3): 411-420.

Nabholz, B., N. Uwimana and N. Lartillot (2013). "Reconstructing the phylogenetic history of

long-term effective population size and life-history traits using patterns of amino acid

replacement in mitochondrial genomes of mammals and birds." Genome Biol Evol 5(7): 1273-

1290.

National Park Service. (2015). "Wolf Restoration." from

http://www.nps.gov/yell/learn/nature/wolf-restoration.htm.

Nei, M. and A. P. Rooney (2005). "Concerted and birth-and-death evolution of multigene

families." Annu Rev Genet 39: 121-152.

New England Biolabs. "NEBNext Fast DNA Fragmentation & Library Prep Set for Ion

Torrent." from https://www.neb.com/products/e6285-nebnext-fast-dna-fragmentation-and-library-prep-set-for-ion-torrent.

Nosil, P. and D. Schluter (2011). "The genes underlying the process of speciation." Trends Ecol

Evol 26(4): 160-167.

O'Neill, P. (2013). "Flying rainbows: the scarlet macaw returns to Mexico." from

http://news.mongabay.com/2013/06/flying-rainbows-the-scarlet-macaw-returns-to-mexico/.

Ohta, T. (2009). "The mutational load of a multigene family with uniform members." Genetical

Research 53(02): 141.

Olson, S. (1989). Shaping the Future: Biology and Human Values.

Pacheco, M. A., F. U. Battistuzzi, M. Lentino, R. F. Aguilar, S. Kumar and A. A. Escalante

(2011). "Evolution of modern birds revealed by mitogenomics: timing the radiation and origin of

major orders." Mol Biol Evol 28(6): 1927-1942.

Paxton, E. H., M. K. Sogge, T. C. Theimer, J. Girard and P. Keim (2008). "Using molecular

markers to resolve a subspecies boundary: the northern boundary of the Southwestern Willow

Flycatcher in the four-corner states." U.S. Geological Survey Open-File Report 2007-1117, 20 p.

Promega. (2015). "Wizard SV Gel and PCR Clean-Up System." from

https://www.promega.com/products/dna-and-rna-purification/dna-fragment-purification/wizard-sv-gel-and-pcr-clean_up-system/.

Prychitko, T. M. and W. S. Moore (2000). "Comparative Evolution of the Mitochondrial

Cytochrome b Gene and Nuclear β-Fibrinogen Intron 7 in Woodpeckers." Molecular Biology

and Evolution 17(7): 1101-1111.

Questiau, S., M. C. Eybert, A. R. Gaginskaya, L. Gielly and P. Taberlet (1998). "Recent

divergence between two morphologically differentiated subspecies of bluethroat (Aves:

Muscicapidae: Luscinia svecica) inferred from mitochondrial DNA sequence variation."

Molecular Ecology 7(2): 239-245.

Ralls, K. and J. Ballou (1983). "Extinction: Lessons from zoos." BIOL. CONSERV. SER.: 164-

184.

Reumers, J., P. De Rijk, H. Zhao, A. Liekens, D. Smeets, J. Cleary, P. Van Loo, M. Van Den

Bossche, K. Catthoor, B. Sabbe, E. Despierre, I. Vergote, B. Hilbush, D. Lambrechts and J. Del-

Favero (2012). "Optimized filtering reduces the error rate in detecting genomic variants by short-

read sequencing." Nat Biotechnol 30(1): 61-68.

Ripple, W. J. and R. L. Beschta (2012). "Trophic cascades in Yellowstone: The first 15 years

after wolf reintroduction." Biological Conservation 145(1): 205-213.

Romanov, M. N., E. M. Tuttle, M. L. Houck, W. S. Modi, L. G. Chemnick, M. L. Korody, E. M.

Mork, C. A. Otten, T. Renner, K. C. Jones, S. Dandekar, J. C. Papp, Y. Da, N. C. S. Program, E.

D. Green, V. Magrini, M. T. Hickenbotham, J. Glasscock, S. McGrath, E. R. Mardis and O. A.


Ryder (2009). "The value of avian genomics to the conservation of wildlife." BMC Genomics 10

Suppl 2: S10.

Saccheri, I., M. Kuussaari, M. Kankare, P. Vikman, W. Fortelius and I. Hanski (1998).

"Inbreeding and extinction in a butterfly metapopulation." Nature 392(6675): 491-494.

Sanger, F., S. Nicklen and A. R. Coulson (1977). "DNA sequencing with chain-terminating

inhibitors." Proc Natl Acad Sci U S A 74(12): 5463-5467.

Schlotterer, C. and B. Harr (2002). "Single nucleotide polymorphisms derived from ancestral

populations show no evidence for biased diversity estimates in Drosophila melanogaster."

Molecular Ecology 11(5): 947-950.

Schlötterer, C. and D. Tautz (1994). "Chromosomal homogeneity of Drosophila ribosomal DNA

arrays suggests intrachromosomal exchanges drive concerted evolution." Current Biology 4(9):

777-783.

Seabury, C. M., S. E. Dowd, P. M. Seabury, T. Raudsepp, D. J. Brightsmith, P. Liboriussen, Y.

Halley, C. A. Fisher, E. Owens, G. Viswanathan and I. R. Tizard (2013). "A multi-platform draft

de novo genome assembly and comparative analysis for the Scarlet Macaw (Ara macao)." PLoS

One 8(5): e62415.

Shields, G. F. and A. C. Wilson (1987). "Subspecies of the Canada Goose (Branta canadensis)

Have Distinct Mitochondrial DNA's." Evolution 41(3): 662.

Sibley, C. G. and J. E. Ahlquist (1983). "Phylogeny and classification of birds based on the data

of DNA-DNA hybridization." Current Ornithology 1: 245-292.

Smith, L. M. and L. A. Burgoyne (2004). "Collecting, archiving and processing DNA from

wildlife samples using FTA databasing paper." BMC Ecol 4: 4.


Snyder, N. F. R., S. R. Derrickson, S. R. Beissinger, J. W. Wiley, T. B. Smith, W. D. Toone and

B. Miller (1996). "Limitations of Captive Breeding in Endangered Species Recovery."

Conservation Biology 10(2): 338-348.

Sorenson, M. D., J. C. Ast, D. E. Dimcheff, T. Yuri and D. P. Mindell (1999). "Primers for a

PCR-based approach to mitochondrial genome sequencing in birds and other vertebrates." Mol

Phylogenet Evol 12(2): 105-114.

Species., C.-L. (2007). Scarlet Macaw. U.-W. C. M. Center.

Steiner, C. C., A. S. Putnam, P. E. Hoeck and O. A. Ryder (2013). "Conservation genomics of

threatened animal species." Annu Rev Anim Biosci 1: 261-281.

Stoeckle, M. Y. (2003). "Taxonomy, DNA, and the Bar Code of Life." BioScience 53(9): 2-3.

Syed, F., H. Grunenwald and N. Caruccio (2009). "Next-generation sequencing library preparation:

simultaneous fragmentation and tagging using in vitro transposition." Nature Methods.

Tavares, E. S., A. J. Baker, S. L. Pereira and C. Y. Miyaki (2006). "Phylogenetic relationships

and historical biogeography of neotropical parrots (Psittaciformes: Psittacidae: Arini) inferred

from mitochondrial and nuclear DNA sequences." Syst Biol 55(3): 454-470.

ThermoScientific. "NanoDrop 2000 UV-Vis Spectrophotometer." from

http://www.nanodrop.com/Productnd2000overview.aspx.

ThermoScientific. "Qubit 2.0 Fluorometer." from

https://tools.thermofisher.com/content/sfs/manuals/mp32866.pdf.

ThermoScientific. "Savant™ SPD131DDA SpeedVac™ Concentrator." from

http://www.thermoscientific.com/content/tfs/en/product/savant-spd131dda-speedvac-concentrator.html.

ThermoScientific (2008). T009-Technical Bulletin (NanoDrop). New York, Worth Publishers.


Tourasse, N. J. (2000). "Selective Constraints, Amino Acid Composition, and the Rate of

Protein Evolution."

Iowa State University (2007). Nipah Virus Infection. The Center for Food Security and Public Health.

Urantowka, A. D. (2014). "Complete mitochondrial genome of Critically Endangered Blue-

throated Macaw (Ara glaucogularis): its comparison with partial mitogenome of Scarlet Macaw

(Ara macao)." Mitochondrial DNA.

Urantowka, A. D., T. Strzala and K. A. Grabowski (2014). "Complete mitochondrial genome of

endangered Maroon-fronted Parrot (Rhynchopsitta terrisi) - conspecific relation of the species

with Thick-billed Parrot (Rhynchopsitta pachyrhyncha)." Mitochondrial DNA 25(6): 424-426.

Waldschmidt, A. M., E. G. d. Barros and L. A. O. Campos (2000). "A molecular marker

distinguishes the subspecies Melipona quadrifasciata quadrifasciata and Melipona quadrifasciata

anthidioides (Hymenoptera: Apidae, Meliponinae)." Genetics and Molecular Biology 23(3): 609-

611.

Wang, Y., R. M. Tian, Z. M. Gao, S. Bougouffa and P. Y. Qian (2014). "Optimal eukaryotic 18S

and universal 16S/18S ribosomal RNA primers and their application in a study of symbiosis."

PLoS One 9(3): e90053.

Watanabe, T., M. Nishida, K. Watanabe, D. S. Wewengkang and M. Hidaka (2005).

"Polymorphism in Nucleotide Sequence of Mitochondrial Intergenic Region in Scleractinian

Coral (Galaxea fascicularis)." Mar Biotechnol (NY) 7(1): 33-39.

Westemeier, R. L., J. D. Brawn, S. A. Simpson, T. L. Esker, R. W. Jansen, J. W. Walk, E. L.

Kershner, J. L. Bouzat and K. N. Paige (1998). "Tracking the Long-Term Decline and Recovery

of an Isolated Population." Science 282(5394): 1695-1698.


Whatman, G. H. L. S. (2015). "FTA/FTA Elute Sample Collection Cards and Kits." from

http://www.gelifesciences.com/webapp/wcs/stores/servlet/catalog/en/GELifeSciences-us/products/AlternativeProductStructure_17096/.

Zierdt-Warshaw, L. (2000). Encyclopedia of Environmental Science, Greenwood.
