ACTA UNIVERSITATIS UPSALIENSIS
UPPSALA 2013

Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1078

Neutral and Adaptive Processes Shaping Genetic Variation in Spruce Species

MICHAEL STOCKS

ISSN 1651-6214
ISBN 978-91-554-8760-7
urn:nbn:se:uu:diva-207714


Dissertation presented at Uppsala University to be publicly examined in Lindahlsalen, Evolutionary Biology Centre, Uppsala, Thursday, October 31, 2013 at 09:30 for the degree of Doctor of Philosophy. The examination will be conducted in English.

Abstract

Stocks, M. 2013. Neutral and Adaptive Processes Shaping Genetic Variation in Spruce Species. Acta Universitatis Upsaliensis. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1078. 30 pp. Uppsala. ISBN 978-91-554-8760-7.

Population genetic analyses can provide information about both neutral and selective evolutionary processes shaping genetic variation. In this thesis, extensive population genetic methods were used to make inferences about genetic drift and selection in spruce species. In paper I we studied four species from the Qinghai-Tibetan Plateau (QTP): Picea likiangensis, P. purpurea, P. wilsonii and P. schrenkiana. Large differences in estimates of genetic diversity and Ne were observed between the more restricted species, P. schrenkiana, and the other, more widely distributed species. Furthermore, P. purpurea appears to be a hybrid between P. likiangensis and P. wilsonii. In paper II we used Approximate Bayesian Computation (ABC) to find that the data support a drastic reduction of Ne in Taiwan spruce around 300-500 kya, in line with evidence from the pollen records. The split from P. wilsonii was dated to 4-8 mya, around the time that Taiwan was formed. These analyses relied on a small sample size, and so in paper III we investigated the impact of small datasets on the power to distinguish between models in ABC. We found that when genetic diversity is low there is little power to distinguish between simple coalescent models and this can determine the number of samples and loci required.

In paper IV we studied the relative importance of genetic drift and selection in four spruce species with differing Ne: P. abies, P. glauca, P. jezoensis and P. breweriana. P. breweriana, which has a low Ne, exhibits a low fraction of adaptive substitutions, while P. abies has a high Ne and a high fraction of adaptive substitutions. The other two spruce species, however, do not support this pattern, suggesting that other factors are more important. In paper V we find that several SNPs correlate with both a key adaptive trait (budset) and latitude. The expression of one gene in particular (PoFTL2) correlates with budset and was previously identified in P. abies. These studies have helped characterise the importance of different population genetic processes in shaping genetic variation in spruce species and have laid some solid groundwork for future studies of spruce.

Keywords: spruce, population genetics, adaptation, evolution, Picea, approximate Bayesian computation, cline

Michael Stocks, Uppsala University, Department of Ecology and Genetics, Plant Ecology and Evolution, Norbyvägen 18 D, SE-752 36 Uppsala, Sweden.

© Michael Stocks 2013

ISSN 1651-6214
ISBN 978-91-554-8760-7
urn:nbn:se:uu:diva-207714 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-207714)

List of papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Li*, Y., Stocks*, M., Hemmilä*, S., Källman*, T., Hongtao, Z., Yongfeng, Z., Chen, J., Liu, J. & Lascoux, M. (2010). Demographic histories of four spruce (Picea) species of the Qinghai-Tibetan Plateau and neighboring areas inferred from multiple nuclear loci. Molecular Biology and Evolution 27(5): 1001-1014.

II Bodare*, S., Stocks*, M., Yang, J-C. & Lascoux, M. (2013). Origin and demographic history of the endemic Taiwanese spruce (Picea morrisonicola). Ecology and Evolution 3(10): 3320-3333.

III Stocks, M., Siol, M., Lascoux, M. & De Mita, S. (2013). Amount of information needed for model choice in Approximate Bayesian Computation. Manuscript.

IV Stocks, M., Chen, J., Källman, T., Bousquet, J. & Lascoux, M. (2013). Molecular adaptation in spruce species. Manuscript.

V Chen*, J., Tsuda*, Y., Stocks*, M., Källman*, T., Xu, N., Semerikov, V., Vendramin, G. & Lascoux, M. (2013). Clinal variation in allele frequency and gene expression at photoperiodic and circadian genes in Siberian spruce: parallel evolution in FTL2 and Gigantea? Manuscript.

Reprints were made with permission from the publishers.

* - These authors contributed equally to this paper.

The following papers were written during the course of my doctoral studiesbut are not part of the present dissertation:

Dean, R., Stocks, M., Rogell, B. & Friberg, U. (2013). The X chromosome trans-regulates sexually dimorphic gene expression. Manuscript.

Contents

1 Introduction
1.1 Inferring Neutral Processes
1.2 The Fate of New Mutations Under Selection
1.3 Adaptation in Natural Populations
1.4 Spruce as a Study Species

2 Results and Discussion
2.1 Paper I - Speciation and Hybridisation on the QTP
2.2 Paper II - Population Decline in Taiwan Spruce
2.3 Paper III - Model Choice in ABC
2.4 Paper IV - Rate of Molecular Adaptation in Spruce
2.5 Paper V - Adaptation to a Latitudinal Cline

3 Conclusions

4 Svensk sammanfattning (Swedish summary)

5 Acknowledgements

References

1. Introduction

“Mere chance ... alone would never account for so habitual and large an amount of difference as that between varieties of the same species.”

- Charles Darwin. On the Origin of Species. IV: Natural Selection [7]

Genetic variation amongst individuals in a population is created through the accumulation of mutations. Mutations are often assumed to occur randomly with regard to their location in the genome and to occur at some constant rate through time. Each new mutation may be broadly defined as having a beneficial, deleterious or neutral effect with regard to the fitness of an individual. Natural selection, in the absence of other population genetic forces, will increase the frequency of beneficial mutations in a population and decrease the frequency of deleterious mutations.

Natural populations, however, are finite in size, and this introduces a random sampling of individuals each generation with the consequence that some individuals will, by chance, leave more than one offspring and others none. This process, known as genetic drift and described by the Wright-Fisher model [12, 33], results in the random loss of alleles and genetic variation from the population. The consequence of this neutral process is that genetic drift will, purely by chance, lead to the fixation of some deleterious mutations and the loss of some beneficial mutations from the population. The relative prominence of these neutral and adaptive processes in natural populations is therefore of considerable interest.
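The random sampling described above can be illustrated with a small simulation. The following is a minimal sketch of drift at a single biallelic locus, assuming a diploid population of constant size with no mutation or selection; the function name `wright_fisher` and the population sizes used are illustrative choices, not taken from the papers.

```python
import random

def wright_fisher(n_diploid, p0, generations, rng):
    """Track one allele's frequency under pure drift (no selection, no mutation)."""
    copies = 2 * n_diploid            # number of gene copies in a diploid population
    freq = p0
    trajectory = [freq]
    for _ in range(generations):
        # Each gene copy in the next generation picks a parental copy at random,
        # so the new allele count is a binomial draw with success probability freq.
        count = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = count / copies
        trajectory.append(freq)
        if freq in (0.0, 1.0):        # allele lost or fixed: no variation is left
            break
    return trajectory

rng = random.Random(1)
for n in (10, 100, 1000):             # hypothetical population sizes
    traj = wright_fisher(n, 0.5, 200, rng)
    print(n, len(traj) - 1, round(traj[-1], 3))
```

Smaller populations tend to lose or fix the allele within fewer generations, which is the sense in which drift is stronger when the population is small.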

This thesis explores the role that genetic drift and natural selection have had in shaping genetic diversity in spruce species. Papers I and II look at the role that neutral processes have played in spruce evolution by inferring the demographic history of a group of Asian spruce species. Paper III looks in more detail at the power afforded by different datasets when choosing between different population genetic models using Approximate Bayesian Computation. Finally, papers IV and V focus on the role of natural selection and adaptation in spruce species on two different time-scales.


1.1 Inferring Neutral Processes

Neutral genetic variation within a population of size N can be thought of as a balance between mutation and genetic drift. Mutations enter a population with rate µ and drift removes variation at a rate of 1/(2N) each generation. This balance is represented in population genetics by the population-scaled mutation rate, θ = 4Nµ. Due to a number of simplifying assumptions, however, the Wright-Fisher model never truly holds in most (if not all) natural populations, and there are other processes or features of the organism’s life history that can affect genetic variation. These effects are contained within a term called the effective population size (Ne), and genetic variation present in natural, rather than idealized, populations is usually defined relative to this term: θ = 4Neµ. The effective population size is therefore a measure of the amount of genetic drift affecting levels of genetic variation in a population and can be influenced by many biologically reasonable factors, such as spatial population structure, mating system or changes in the census population size through time. The effective population size therefore contains a great deal of information about the population genetic processes operating within a study species and is consequently of great interest to population geneticists.
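As a small worked illustration of the relationship θ = 4Neµ, the sketch below rearranges it to obtain Ne from an estimate of θ and an assumed mutation rate. The numerical values are hypothetical and chosen only to show the arithmetic; they are not estimates from the papers.

```python
def effective_size(theta, mu):
    """Solve theta = 4 * Ne * mu for Ne (diploid population)."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return theta / (4.0 * mu)

theta_per_site = 0.005   # hypothetical diversity estimate, per site
mu_per_site = 1e-8       # hypothetical mutation rate, per site per generation
print(effective_size(theta_per_site, mu_per_site))
```

Because Ne is obtained by dividing by µ, any uncertainty in the assumed mutation rate carries over directly to the estimate of Ne.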

Estimates of the effective population size have been greatly aided by the development of another classic population genetic model. The coalescent [21, 22, 20] is a backwards-in-time approximation to the Wright-Fisher model. Rather than tracking all individuals forward in time, the coalescent models the ancestral process backwards in time just from those individuals sampled in the current generation. This makes simulations far more efficient, and population genetic analyses have greatly benefited from this. The coalescent also provides very useful predictions about the patterns expected to be seen as a result of neutral population genetic processes in contemporary genetic data and has proved useful due to the intuitive way in which the shapes of the genealogies generated under the model can impact genetic variation ("genealogical thinking") (reviewed in [30]).

Key insights into how temporal fluctuations in the effective population size affect modern day genetic variation can be gained by performing coalescent simulations, calculating some statistic and comparing this with the observed data. For example, a number of programs (e.g. IMa [16], MIMAR [2]) implement an Isolation-with-Migration (IM) model to infer a number of important speciation parameters. An IM model simulates an ancestral population of effective size NA that splits at some time in the past, t (measured in coalescent time units), into two descendant populations with effective sizes N1 and N2, with some rate of gene-flow (m) continuing after divergence. These IM-based methods take note of the fact that the segregating sites occurring in two closely-related species can be partitioned into four distinct categories [31].


Mutations can be polymorphic only within one of the two species (private polymorphisms: Sx1, Sx2), they can be polymorphic in both species (shared polymorphisms: SS) or they can occur in all individuals of one species and none in the other (fixed differences: SF). Each of these summary statistics gives information about the parameters (NA, N1, N2, t, m) of the IM model and so, by performing coalescent simulations under this model, and comparing the summary statistics calculated from these simulations with those calculated from the observed data, one can generate estimates of these key divergence parameters.
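The four categories of segregating sites can be counted directly from an alignment. The sketch below is one simple way to do this, assuming aligned sequences of equal length, biallelic sites and no missing data; the function name `classify_sites` and the toy sequences are hypothetical.

```python
def classify_sites(sample1, sample2):
    """Count private, shared and fixed sites between two aligned samples.

    sample1 and sample2 are lists of equal-length sequences (one per chromosome).
    Returns a dict with keys Sx1, Sx2, SS and SF.
    """
    counts = {"Sx1": 0, "Sx2": 0, "SS": 0, "SF": 0}
    length = len(sample1[0])
    for pos in range(length):
        alleles1 = {seq[pos] for seq in sample1}
        alleles2 = {seq[pos] for seq in sample2}
        poly1 = len(alleles1) > 1
        poly2 = len(alleles2) > 1
        if poly1 and poly2:
            counts["SS"] += 1          # polymorphic in both species
        elif poly1:
            counts["Sx1"] += 1         # polymorphic only in species 1
        elif poly2:
            counts["Sx2"] += 1         # polymorphic only in species 2
        elif alleles1 != alleles2:
            counts["SF"] += 1          # monomorphic in each, but for different alleles
    return counts

# Toy alignment, three chromosomes per species (hypothetical data).
species1 = ["AAGTC", "AAGTT", "ACGTC"]
species2 = ["TAGAC", "TAGAT", "TAGTC"]
print(classify_sites(species1, species2))
```

Sites that are identical and monomorphic in both samples fall into none of the four categories and are simply skipped.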

A more recent statistical advance, Approximate Bayesian Computation (ABC) [1], has allowed a great number of models to be fit to observed data and compared with each other in a more flexible and efficient manner. The steps involved in performing an ABC analysis can be described one at a time. A prior distribution of parameter values is randomly generated for a model of choice by sampling from some pre-determined distribution. For example, for a coalescent model of constant effective population size, 10^5 values between 0 and 5 of the population-scaled mutation rate, θ, may be randomly drawn from a uniform distribution. These values are then fed into a coalescent simulator (such as ms [17]), from which a select number of summary statistics are calculated, to form a joint distribution. This joint distribution pairs each randomly chosen parameter value (θ_i) with its associated summary statistic (s_i). The equivalent summary statistic is also calculated from the observed data (s̄) so that these simulated values can be compared to the observed data, either by taking the Euclidean distance (|s_i − s̄|) or by using a more complex function. The proportion ε of simulated values closest to the observed statistics is kept, and regression is performed on these retained values to fit them to the observed data. The approximate nature of ABC means that it is far more computationally efficient than other methods (e.g. MIMAR, IMa) and is not restricted to just one model. As a result, a number of models of varying complexity can be compared against one another using, for example, Bayes factors.
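The ABC steps described above can be sketched in a few lines. This is a minimal rejection-only illustration under stated assumptions: instead of calling ms, it uses a small hand-written simulator of the number of segregating sites under a constant-size coalescent, it uses that single statistic with an absolute distance, and it omits the regression step. The function names, the observed value and the number of simulations are all hypothetical.

```python
import random

def simulate_segregating_sites(n_samples, theta, rng):
    """Simulate the number of segregating sites S under a constant-size coalescent."""
    total_length = 0.0
    for k in range(n_samples, 1, -1):
        # With k lineages, the waiting time to the next coalescence is exponential
        # with rate k(k-1)/2 (time in units of 2N generations).
        total_length += k * rng.expovariate(k * (k - 1) / 2.0)
    # Mutations fall on the tree as a Poisson process with rate theta/2 per unit
    # of branch length; count events by summing exponential gaps along the tree.
    mutations, position = 0, rng.expovariate(theta / 2.0)
    while position < total_length:
        mutations += 1
        position += rng.expovariate(theta / 2.0)
    return mutations

def abc_rejection(observed_s, n_samples, n_sims, tolerance, rng):
    """Return the theta values whose simulated S lies closest to the observed S."""
    draws = []
    for _ in range(n_sims):
        # Uniform prior between 0 and 5; the tiny lower bound avoids a zero rate.
        theta = rng.uniform(1e-9, 5.0)
        s = simulate_segregating_sites(n_samples, theta, rng)
        draws.append((abs(s - observed_s), theta))
    draws.sort(key=lambda pair: pair[0])          # smallest distance first
    n_keep = max(1, int(tolerance * n_sims))      # keep the closest fraction
    return [theta for _, theta in draws[:n_keep]]

rng = random.Random(42)
accepted = abc_rejection(observed_s=10, n_samples=15, n_sims=10000,
                         tolerance=0.01, rng=rng)
print(len(accepted), sum(accepted) / len(accepted))
```

The accepted values approximate the posterior distribution of θ. To compare models, the same procedure would be run under each model and the proportions of accepted simulations compared.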


1.2 The Fate of New Mutations Under Selection

Genetic differences between species arise from the fixation of mutations segregating in a population. New mutations, occurring at a rate of µ per site per generation, enter a population of N diploid individuals at a rate of 2Nµ per generation. Each new mutation will initially exist in a single copy so that, in the absence of selection, the fixation probability of this mutation is equal to its frequency in the population, 1/(2N). Over long time periods the effect of population size on the appearance and fixation of new mutations is cancelled out, such that the net rate of neutral substitutions,

$$\kappa = \frac{1}{2N} \cdot 2N\mu = \mu, \tag{1.1}$$

is equal to the mutation rate, and is therefore independent of the population size [34, 19].

However, it has long been hypothesized [7] that a sizeable number of new mutations are not neutral, and that new mutations likely consist of mostly deleterious, many neutral and few beneficial mutations (e.g. [28]). A key question, therefore, regards the fate of these different types of mutations. Are the substitutions that we observe between species the result of positive selection? Or are most of them neutral mutations that have simply drifted to fixation? A selectionist view would state that, while the majority of new mutations are deleterious, the vast majority of substitutions between species are due to the fixation of beneficial mutations. The opposing neutralist view, which sits at the foundation of Motoo Kimura’s neutral theory of molecular evolution [18, 19] and Tomoko Ohta’s nearly neutral theory [26], states that, while some beneficial mutations are fixed, the vast majority of differences between species are effectively neutral. The neutral and nearly neutral theories have come to dominate evolutionary thinking in population genetics and molecular evolution, but empirical evidence from population genomic studies often raises significant questions regarding their relevance in natural populations [15].

Whether the neutral theory holds or not, it does give useful neutral predictions about sequence diversity within and between species. For example, in neutrally evolving populations amino acid changing (non-synonymous) mutations are just as likely to fix in the population as silent (synonymous) mutations. That is, the ratio of the number of non-synonymous substitutions (dN) to the number of synonymous substitutions (dS), dN/dS, should equal one. Values of dN/dS greater than one indicate an excess of amino acid changing substitutions and the putative presence of positive selection, while dN/dS values of less than one indicate a dearth of non-synonymous substitutions that could point to the presence of purifying selection. The McDonald-Kreitman (MK) test [24] goes one step further and includes within-species polymorphism data for non-synonymous (P_N) and synonymous (P_S) sites to test individual loci, in the form of a 2×2 contingency table, for departures from neutrality.

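The central premise of the MK test is that the ratio given by the Neutrality Index [27]:

$$NI = \frac{P_N / D_N}{P_S / D_S}, \tag{1.2}$$

is equal to one under neutrality, where D_N and D_S are the numbers of non-synonymous and synonymous substitutions between species. An excess of non-synonymous substitutions, indicative of positive selection, gives NI < 1, while an excess of non-synonymous polymorphism, indicative of purifying selection acting on segregating deleterious mutations, gives NI > 1. This leads quite naturally into a potential estimator for the proportion of amino acid changing substitutions fixed due to positive selection [4]:

$$\alpha = 1 - NI = 1 - \frac{D_S P_N}{D_N P_S} \tag{1.3}$$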
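Equations 1.2 and 1.3 can be evaluated directly from the four cells of the MK table. The sketch below does so for a single locus; the counts are hypothetical and are not taken from any of the papers, and the significance test on the 2×2 table is left out.

```python
def neutrality_index(p_n, p_s, d_n, d_s):
    """NI = (PN / DN) / (PS / DS), from the four cells of the MK table."""
    return (p_n * d_s) / (d_n * p_s)

def alpha(p_n, p_s, d_n, d_s):
    """Estimated proportion of adaptive amino acid substitutions, alpha = 1 - NI."""
    return 1.0 - neutrality_index(p_n, p_s, d_n, d_s)

# Hypothetical counts for one locus: polymorphisms (p) and substitutions (d),
# each split into non-synonymous (n) and synonymous (s).
counts = {"p_n": 4, "p_s": 40, "d_n": 10, "d_s": 20}
print(round(neutrality_index(**counts), 3), round(alpha(**counts), 3))
```

With these counts there are relatively more non-synonymous substitutions than non-synonymous polymorphisms, so NI falls below one and α is positive. Note that both functions fail with a division by zero if D_N or P_S is zero, which is one practical reason why counts are often pooled across loci.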

Given the extensive debate over the contributions of neutral and beneficial mutations to the number of substitutions between species, this estimator promises a great deal. What should we expect? What does the rate of adaptation depend on? Estimates early on, performed predominantly in humans (e.g. [3]) and Drosophila (e.g. [9]), suggested that differences in α were due to differences in the effective population size. This is for two reasons. First, a larger effective population size means that genetic drift is weaker, and so the efficacy of selection is stronger. Second, a large population size means that there is more opportunity for new beneficial mutations to enter the population. Subsequent studies, however, have not strongly reinforced this expectation. For example, a large-scale study of plant species [14] failed to find a significant relationship between α and Ne.

Some of these ambiguities may be accounted for by considering problems in estimating both the effective population size and the proportion of adaptive substitutions. However, there may be other factors that influence the rate of molecular adaptation. One suggestion [13] is that the rate of environmental change has a far greater influence than the effective population size, and recent simulations support this claim [23]. These models are often based on Fisher’s Geometric Model [12]. Fisher’s Geometric Model imagines an organism with a number of traits that control its fitness in a given environment. Each trait lies at a distance from the phenotypic optimum of its current environment. Each new mutation can either move the trait closer to or further away from the optimum, and can have a random effect on how much the phenotype is changed. Simulating under Fisher’s Geometric Model, it is suggested that while there is a weak correlation with the effective population size, the rate of environmental change is of more importance in determining the proportion of adaptive substitutions [23]. The causes and consequences of adaptation at the molecular level are clearly at a point where there is much still to understand, and further empirical data can help to shed light on some of these questions.
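The geometry behind Fisher’s Geometric Model can be illustrated with a short Monte Carlo sketch. It is not the simulation used in [23]: it assumes a static optimum at the origin, a phenotype at a fixed distance from it, and mutations of fixed size in a uniformly random direction, and it only estimates how often such a mutation moves the phenotype closer to the optimum. The function name and parameter values are hypothetical.

```python
import math
import random

def fraction_beneficial(n_traits, distance, step, trials, rng):
    """Estimate the share of random mutations of size `step` that approach the optimum."""
    beneficial = 0
    for _ in range(trials):
        # Draw a uniformly random direction in trait space by normalising a Gaussian vector.
        direction = [rng.gauss(0.0, 1.0) for _ in range(n_traits)]
        norm = math.sqrt(sum(x * x for x in direction))
        # The phenotype starts at (distance, 0, ..., 0); the optimum is the origin.
        new_point = [step * x / norm for x in direction]
        new_point[0] += distance
        new_distance = math.sqrt(sum(x * x for x in new_point))
        if new_distance < distance:
            beneficial += 1
    return beneficial / trials

rng = random.Random(3)
for step in (0.05, 0.5, 1.0, 2.5):    # mutation sizes relative to a distance of 1
    print(step, fraction_beneficial(10, 1.0, step, 4000, rng))
```

Small mutations are beneficial almost half of the time, the share falls as mutations become larger, and a mutation larger than twice the distance to the optimum can never be beneficial. A moving optimum, as in models of environmental change, would correspond to changing the starting distance over time.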

1.3 Adaptation in Natural Populations

Natural populations are, however, more complex than this. Evaluating the amount of adaptation in any given species based solely on those mutations fixing within the population misses some of the more nuanced ways that natural selection can mould a species to its environment. In reality the natural world is highly heterogeneous and traits that are fit in one environment may often be detrimental in another. This can be understood by thinking in terms of a population made up of a number of smaller populations or demes, with a number of migrants m moving between them [33]. Gene-flow between populations causes a homogenization of the genetic material among the populations that works counter to the gradual differentiation between demes caused by genetic drift. If migration is free amongst all populations then all populations resemble one large population or, at the other extreme, if gene-flow is restricted then each deme evolves independently and the actions of genetic drift lead to differentiation between demes.

Natural populations likely sit somewhere between these two extremes, with genetic variation within and between demes being maintained at some migration-drift equilibrium. If a beneficial mutation is introduced into this scenario then, assuming that it is not eliminated by drift, the mutation will establish itself within its current deme before spreading to the other demes. This advance can spread quite fast, proceeding on the order of $\sqrt{2s\sigma^2}$ per generation [11], where s refers to the selection coefficient and σ² is the rate of diffusion of genes. This assumes that the selection coefficient is consistent across all demes, but there are striking examples that illustrate that this is not always the case. This spatial variation in the fitness of an allele is often manifested in nature as clines that form changes in trait values or allele frequencies across a geographic gradient. A regression between genotype and latitude can be used to identify SNPs that could be showing adaptation to some environmental variable.
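A regression of allele frequency on latitude, as described above, can be sketched with ordinary least squares. The example below is a minimal illustration under simplifying assumptions: it uses population allele frequencies rather than individual genotypes, ignores population structure, and the latitudes and frequencies are hypothetical rather than data from paper V.

```python
import math

def linear_regression(x, y):
    """Ordinary least squares fit of y on x; returns slope, intercept and Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical populations: latitude in degrees north and frequency of one SNP allele.
latitude = [52.0, 55.0, 58.0, 61.0, 64.0, 67.0]
allele_freq = [0.10, 0.22, 0.35, 0.41, 0.58, 0.70]
slope, intercept, r = linear_regression(latitude, allele_freq)
print(f"slope per degree: {slope:.4f}, r: {r:.3f}")
```

A steep slope with a high correlation marks a SNP as a candidate for clinal adaptation, although neutral population structure along the same gradient can produce similar patterns and would need to be accounted for in a real analysis.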

Adaptation is defined as the approach of an organism to a phenotype that best suits its current environment [12]. While the study of adaptation is as old as the study of evolution itself, there are many details that remain to be understood. The empirical evidence for clinal variation in traits and allele frequencies suggests that there is a significant role for local adaptation in natural populations, but there are other lingering questions as well. For example, is the adaptive walk to an optimum characterised by many large effect substitutions or many of small effect? Is adaptation the result of the rapid fixation of highly beneficial alleles, or is it governed more by small shifts in allele frequency at many loci? These are questions that will attract a great deal of attention as the amount and availability of empirical data increases.

1.4 Spruce as a Study Species

Species within the conifer genus spruce (Picea) are wind-pollinated, and their pollen is capable of mediating long distance gene-flow. As a result, population genetic structure can remain relatively low over sometimes quite large areas. Many of these large areas exist in the boreal forests that carpet much of the northern hemisphere. Across Eurasia, NORWAY SPRUCE (Picea abies) occupies ranges in southern Europe and northern Scandinavia before merging and hybridizing into SIBERIAN SPRUCE (P. obovata), which begins in the Urals and extends across Russia to the Bering Strait. Here, at the eastern end of Russia and extending into the Korean peninsula, Kamchatka and Japan, exist several species including KOREAN SPRUCE (P. koraiensis), GLEHN’S SPRUCE (P. glehnii) and JEZO SPRUCE (P. jezoensis). In North America, WHITE SPRUCE (P. glauca) and BLACK SPRUCE (P. mariana) extend from the north-east coast to Alaska in the west, with ranges overlapping with several other spruce species such as RED SPRUCE (P. rubens) in the east and SITKA (P. sitchensis) and ENGELMANN SPRUCE (P. engelmannii) in the west.

A great deal of the species diversity, however, exists in China and areas in and around the Qinghai-Tibetan Plateau (QTP). High altitude species, such as SIKKIM SPRUCE (P. spinulosa), MORINDA SPRUCE (P. smithiana) and BURMESE SPRUCE (P. farreri), exist at the southern edge of the QTP, while at the eastern edge, LIKIANG SPRUCE (P. likiangensis) extends south towards Bhutan, WILSON’S SPRUCE (P. wilsonii) is distributed towards the northeast along the Yellow River and species such as PURPLE SPRUCE (P. purpurea) and DRAGON SPRUCE (P. asperata) co-exist at the eastern edge of the QTP. While spruce species are generally associated with having large distributions and population sizes, there are many species that exist in small fragmented populations. Those of particular note are BREWER SPRUCE (P. breweriana), which exists in a number of small populations in northern California, SCHRENK’S SPRUCE (P. schrenkiana), which is scattered through parts of central Asia, and TAIWAN SPRUCE (P. morrisonicola), which is a vulnerable species endemic to Taiwan.

These differences in distribution are often reflected in the genetic data. Widely distributed boreal species, such as NORWAY SPRUCE, WHITE SPRUCE and BLACK SPRUCE, tend to have higher levels of genetic diversity than more restricted species such as BREWER SPRUCE [5]. Estimates of the effective population size also reflect differences in the distribution of spruce species. Estimates of the effective population size, for example, in four species obtained using Isolation-with-Migration models [5] range from around 12,000 in BREWER SPRUCE up to around 150,000 in NORWAY SPRUCE. These estimates are no doubt influenced by past demographic events. The boreal regions in the northern hemisphere, in particular, have been subject to repeated glaciations, and repeated bottlenecks may have lowered estimates of effective population size. In NORWAY SPRUCE, coalescent simulations suggested that values of Tajima’s D [29] and Fay & Wu’s H [10] were consistent with an old, strong bottleneck, whilst evidence for population growth has also been found in other studies using Isolation-with-Migration models [5].

These large effective population sizes suggest that selection may be more effective in spruce than in species with smaller effective sizes. Evidence for adaptation has often come from studies of traits with high heritability that are known to play an important role in adapting to seasonal fluctuations. Traits such as budset show significant correlations with latitude and with the expression patterns of genes that putatively form part of the photoperiodic pathway and circadian clock [6]. However, due to the size (~20 Gb) of the genome, and the huge amount of repetitive DNA, genetic studies in spruce have often been more restricted than in model species. The recent sequencing of the NORWAY SPRUCE genome [25], however, promises to open up the species to more extensive genetic studies and lead to further insights into the processes shaping spruce evolution.


2. Results and Discussion

In the following papers, I look to shed some light on the population genetic processes operating within and between spruce species. In papers I, II and III we concentrate on the inference of neutral processes in a number of Asian spruce species and describe temporal changes in the effective population size. In paper IV we quantify the relative importance of drift and selection and calculate the proportion of adaptive substitutions in a number of spruce species with different effective population sizes. Finally, in paper V we study a more complex example of adaptation with a look at the SNPs and expression associated with a latitudinal cline.

2.1 Paper I - Speciation and Hybridisation on the QTP

The region known today as the Qinghai-Tibetan Plateau (QTP) has undergone considerable geological change as a result of the collision, some 70 million years ago, of the Indo-Australian and Eurasian Plates. While these changes have been gradual, their impact on the region’s flora and fauna is exemplified by changes in the abundance of spruce and conifer species in the area. Conifers were present in the region 50 mya but spruce in particular began to become more common in the pollen record from around 38 mya [8]. The abundance decreased from around 20 mya and a further significant uplift around 7 mya [32] has resulted in a number of fragmented spruce species existing in deep valleys on the edges of the QTP.

We looked to address how this complex, dynamic environment has impacted genetic variation using four spruce species: LIKIANG SPRUCE, PURPLE SPRUCE, WILSON’S SPRUCE and SCHRENK’S SPRUCE. We sequenced 12-16 loci in populations of each of these species and used a number of parametric and nonparametric methods to infer information on the level of genetic diversity, the population histories and relatedness of the species. In addition we also investigated levels of introgression in PURPLE SPRUCE.

In three of the four species (LIKIANG, PURPLE and WILSON’S SPRUCE) levels of synonymous genetic diversity were high, and similar to estimates obtained previously in WHITE SPRUCE but higher than estimates obtained in a number of other species (such as NORWAY SPRUCE and BLACK SPRUCE) that have large distributions. In SCHRENK’S SPRUCE, however, estimates were more on a par with other species with restricted distributions such as BREWER SPRUCE and TAIWAN SPRUCE. The genetic structure within the species indicated that, while LIKIANG, SCHRENK’S and WILSON’S SPRUCE appear to form distinct populations, PURPLE SPRUCE likely has a more complex origin. The species appears to be admixed with contributions coming from all species. Estimates of divergence time also indicate that PURPLE SPRUCE diverged most recently from SCHRENK’S SPRUCE, suggesting a scenario whereby PURPLE SPRUCE’S modern day genetic variation is formed partly from variation shared with SCHRENK’S SPRUCE, along with variation obtained from both LIKIANG and WILSON’S SPRUCE in admixture events occurring some time after divergence.

The recent upheaval of the QTP has had a remarkably diverse impact on the species included in this study. SCHRENK’S SPRUCE has been left isolated and existing in fragmented populations in and around the Tian Shan mountains where low levels of genetic variation mean that evolutionary processes are influenced more heavily by genetic drift. This is in contrast to LIKIANG, PURPLE and WILSON’S SPRUCE, whose levels of genetic variation are high, even compared to boreal spruce species with continent-wide distributions. There may be a number of reasons for this, but species this far south may not have been subject to the more drastic bottlenecks caused by past ice ages that are known to have affected boreal species. It could also be, due to the fragmented landscape found in the deep valleys and high mountains of the QTP, that the species are subject to cycles of differentiation and admixture. This has likely been the case in the admixed species PURPLE SPRUCE, and could also be a more common phenomenon than previously thought amongst plant species in and around the QTP.

2.2 Paper II - Population Decline in Taiwan Spruce

Pollen records, taken from Jih-Yueh Tan basin in Central Taiwan, show that the endemic conifer TAIWAN SPRUCE covered, around 50-60 thousand years ago (kya), a larger range in central Taiwan and occurred at lower altitudes than it does today. Since then the climate has warmed and TAIWAN SPRUCE has retreated to the higher altitudes of the island, with temperate and subtropical species taking its place. Today, the species has a more restricted distribution and is listed as vulnerable by the International Union for Conservation of Nature and Natural Resources. TAIWAN SPRUCE is therefore an interesting study species as we are in a position to assess the impact that its decline in population size has had on levels of genetic diversity.

In this study, 15 nuclear loci were sequenced in 15 individuals from populations of TAIWAN SPRUCE and supplemented with data from species from mainland China (LIKIANG, PURPLE, WILSON’S and SCHRENK’S SPRUCE), which were obtained from previous studies and through additional sequencing. Levels of genetic variation were low, and similar to other restricted species such as BREWER and SCHRENK’S SPRUCE. Analysis using the program Structure revealed very little genetic structure within the sample of TAIWAN SPRUCE studied here and showed that it forms a distinct cluster separate from species in mainland China. The analysis also suggested that WILSON’S SPRUCE is the most closely related of these species, and so divergence times were estimated between TAIWAN SPRUCE and WILSON’S SPRUCE using MIMAR and ABC.

[Figure 2.1: schematic of four coalescent models. Panels: Constant Effective Population Size (N); Bottleneck (N, αN, t0, t1); Population Structure (N, M); Decline (N, αN, t).]

Figure 2.1. The different within-species coalescent models compared using ABC, where N is the effective population size, α is the severity of the bottleneck or decline, M is the migration rate and t is the time in coalescent time units.

Tajima’s D values were positive, indicating an excess of intermediate-frequency variants in the observed data. Therefore, we decided to concentrate on models, such as a population structure or recent bottleneck model, that are expected to cause these types of patterns in the data. For both the within-species and between-species analyses, model comparison using ABC indicated that TAIWAN SPRUCE underwent a drastic reduction in the effective population size around 300-500 kya. We dated the split from WILSON’S SPRUCE to around 4 million years ago (mya) using MIMAR and to a more recent time of around 1 mya when using ABC to take account of the reduction in the effective population size (assuming a generation time of 25 years). The selection of models using ABC has the potential to be influenced by the size of the sample and the low level of genetic diversity. We performed an analysis of the power and the false positive rate and found that, while there is low power to reject a null model of constant effective population size, the false positive rate is low, meaning that the high Bayes factors that we observe are not likely to represent false positives.
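The sign of Tajima’s D referred to above can be made concrete with a small calculation. The sketch below computes D from the sample size, the number of segregating sites and the mean number of pairwise differences, using the standard constants from Tajima [29]. The function name and the input values are illustrative assumptions and are not data from the study.

```python
import math

def tajimas_d(n, seg_sites, pi):
    """Tajima's D from sample size n, segregating sites S and pairwise diversity pi.

    Illustrative helper (not from the thesis); constants follow Tajima (1989).
    The variance term is zero for n < 4, so such samples are rejected.
    """
    if n < 4 or seg_sites < 1:
        raise ValueError("need n >= 4 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    # Numerator: pi minus Watterson's estimate S / a1.
    return (pi - seg_sites / a1) / math.sqrt(variance)

# Hypothetical inputs: 10 sequences with 16 segregating sites.
print(tajimas_d(10, 16, 3.89))  # pi below S/a1: negative D, excess of rare variants
print(tajimas_d(10, 16, 8.00))  # pi above S/a1: positive D, excess of intermediate variants
```

A positive value arises whenever the pairwise diversity exceeds Watterson's estimate, which is the pattern expected after a recent bottleneck or under population structure.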

The drastic decline inferred here from multiple nuclear loci for the vulnerable, endemic conifer TAIWAN SPRUCE is consistent with the patterns of abundance inferred from the pollen record. The species is protected in Taiwan today, but the long-term prospects for TAIWAN SPRUCE are unclear, as a warming climate would push the species even higher in altitude unless it can adapt. The low effective population size inferred here unfortunately means that the number of new beneficial mutations entering the population is limited and the efficacy of selection is weak. However, the fact that the species is still present after its tumultuous past provides some hope that the ability of TAIWAN SPRUCE to adapt to the changing environment is greater than that predicted by theory.

2.3 Paper III - Model Choice in ABC

Simple, tractable models that describe the natural world are of enormous importance in evolutionary biology. One of the advantages of Approximate Bayesian Computation (ABC) is that it is efficient and flexible to the extent that numerous models can be compared and contrasted with one another. There are scenarios, however, when the amount of information available for an analysis is limited. This may be when the number of samples or loci is limited, or when levels of genetic variation are low. We explored the power and false positive rate of model choice in ABC when the amount of information is limited by comparing a simple coalescent model of constant effective population size with a bottleneck model.
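The comparison described above can be sketched as a rejection-based ABC model choice. The code below is a minimal illustration under stated assumptions and is not the pipeline used in the paper: it uses a small hand-written coalescent simulator for non-recombining loci, hypothetical uniform priors, and only two summary statistics (mean number of segregating sites and mean pairwise diversity). All function names and parameter ranges are invented for the example.

```python
import math
import random
import statistics

MODELS = ("constant", "bottleneck")

def poisson(rng, lam):
    """Poisson draw by Knuth's multiplication method (fine for small means)."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def simulate_sfs(rng, n, theta, epochs):
    """Site frequency spectrum of one non-recombining locus.

    epochs: [(start_time, relative_size), ...] going back in time, first start 0,
    with time in units of 2N generations and theta = 4*N*mu per locus.
    """
    lineages = [1] * n          # number of sampled descendants of each lineage
    sfs = [0] * n               # sfs[i]: sites whose derived allele is in i copies
    t, e = 0.0, 0
    while len(lineages) > 1:
        k = len(lineages)
        size = epochs[e][1]
        end = epochs[e + 1][0] if e + 1 < len(epochs) else math.inf
        wait = rng.expovariate(k * (k - 1) / (2.0 * size))
        step = min(wait, end - t)
        # Mutations fall on the k current branches at rate theta/2 each.
        for _ in range(poisson(rng, 0.5 * theta * k * step)):
            sfs[rng.choice(lineages)] += 1
        if wait < end - t:      # coalescence inside this epoch
            t += wait
            i, j = rng.sample(range(k), 2)
            merged = lineages[i] + lineages[j]
            lineages = [c for x, c in enumerate(lineages) if x not in (i, j)]
            lineages.append(merged)
        else:                   # epoch boundary reached first; redraw (memoryless)
            t, e = end, e + 1
    return sfs

def summaries(rng, n, loci, theta, epochs):
    """Mean number of segregating sites and mean pairwise diversity over loci."""
    pairs = n * (n - 1) / 2.0
    s_sum = pi_sum = 0.0
    for _ in range(loci):
        sfs = simulate_sfs(rng, n, theta, epochs)
        s_sum += sum(sfs)
        pi_sum += sum(i * (n - i) * c for i, c in enumerate(sfs)) / pairs
    return (s_sum / loci, pi_sum / loci)

def simulate_model(rng, model, n, loci):
    """Draw parameters from hypothetical priors and return summary statistics."""
    theta = rng.uniform(1.0, 10.0)
    epochs = [(0.0, 1.0)]
    if model == "bottleneck":
        start = rng.uniform(0.0, 0.5)
        alpha = rng.uniform(0.01, 0.5)
        epochs += [(start, alpha), (start + 0.1, 1.0)]
    return summaries(rng, n, loci, theta, epochs)

def abc_model_choice(rng, observed, n, loci, sims_per_model, tolerance):
    """Rejection ABC: posterior model probability = share of accepted simulations."""
    sims = [(m, simulate_model(rng, m, n, loci))
            for m in MODELS for _ in range(sims_per_model)]
    scales = [statistics.pstdev([s[i] for _, s in sims]) or 1.0 for i in range(2)]
    def distance(s):
        return math.hypot(*[(s[i] - observed[i]) / scales[i] for i in range(2)])
    sims.sort(key=lambda item: distance(item[1]))
    kept = sims[:max(1, int(tolerance * len(sims)))]
    return {m: sum(1 for name, _ in kept if name == m) / len(kept) for m in MODELS}

if __name__ == "__main__":
    rng = random.Random(42)
    # Pseudo-observed data simulated under a strong, recent bottleneck.
    observed = summaries(rng, 10, 15, 5.0, [(0.0, 1.0), (0.05, 0.05), (0.15, 1.0)])
    print(abc_model_choice(rng, observed, n=10, loci=15,
                           sims_per_model=500, tolerance=0.02))
```

Repeating this procedure over many pseudo-observed datasets simulated under each model gives estimates of power and of the false positive rate. The tolerance here is far looser than would be used in a real analysis, because the example runs only a few hundred simulations per model.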

[Figure 2.2: four panels (Small dataset, low genetic variation; Small dataset, high genetic variation; Large dataset, low genetic variation; Large dataset, high genetic variation), each plotting Tajima’s D (−1.5 to 1.0) against Prob. BNM (0.2 to 1.0) for NB = N, NB = 0.2N, NB = 0.1N and NB = 0.01N.]

Figure 2.2. The effect of bottleneck strength on the value of Tajima’s D and model probabilities for both small (n = 10, l = 15) and large (n = 20, l = 30) datasets with low (θ = 0.0015) and high (θ = 0.005) genetic variation. Each point represents the rejection step of an ABC analysis when the TPH+DH set of statistics is used with a tolerance of 0.001. The effective population size during the bottleneck (NB) is defined relative to the recovered effective population size (N).

We began by using correlation coefficients and Principal Component Analysis (PCA) to explore the relationship between the parameters of the models and the summary statistics. PCA in particular revealed that θ-based statistics capture a great deal of the signal; however, examination of principal components 2 and 3 revealed signals associated with the site frequency spectrum (through Tajima’s D and Fay & Wu’s H). Generally, combinations of statistics such as Watterson’s theta, the average number of pairwise nucleotide differences, haplotype diversity, Tajima’s D and Fay & Wu’s H perform particularly well, as they combine information about the population-scaled mutation rate (θ), the site frequency spectrum and haplotypic information.
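To illustrate what three of the statistics named above measure, the following sketch computes Watterson's theta, the average number of pairwise differences and haplotype diversity from a set of aligned haplotypes. The function names and the toy alignment are assumptions made for the example; values are per locus rather than per site, and haplotype diversity uses the common n/(n − 1) sample correction.

```python
from collections import Counter
from itertools import combinations

def segregating_sites(haps):
    """Number of alignment columns with more than one allele."""
    return sum(1 for column in zip(*haps) if len(set(column)) > 1)

def watterson_theta(haps):
    """Watterson's estimator: S divided by the harmonic number a1."""
    a1 = sum(1.0 / i for i in range(1, len(haps)))
    return segregating_sites(haps) / a1

def nucleotide_diversity(haps):
    """Average number of pairwise differences between sequences."""
    pairs = list(combinations(haps, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / len(pairs)

def haplotype_diversity(haps):
    """Probability that two sampled haplotypes differ, with sample correction."""
    n = len(haps)
    freq_sq = sum((count / n) ** 2 for count in Counter(haps).values())
    return n / (n - 1) * (1.0 - freq_sq)

# Toy alignment of four haplotypes (hypothetical data).
haps = ["AAAA", "AAAT", "AATT", "AATT"]
print(segregating_sites(haps), watterson_theta(haps),
      nucleotide_diversity(haps), haplotype_diversity(haps))
```

The first two statistics both estimate θ but weight the site frequency spectrum differently, which is why their difference (Tajima's D) carries information about demography, while haplotype diversity adds information on linkage between sites.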

We also explored the power available for detecting bottlenecks of varying severity. There is almost no power to detect weak bottlenecks, whereas stronger bottlenecks, in which the population size is reduced to 10% of the current population size, are more easily detected given a more sizeable dataset. We note that in species with low genetic variation (θ = 0.0015), the power is severely restricted and a greater number of samples and loci are required to give a sufficient level of power. The power of model choice in ABC, then, is highly dependent on both the quality of the dataset and the choice of summary statistics. What is particularly noticeable, however, is that levels of genetic variation in the study species have a considerable impact on the power of ABC to separate the models, and sampling strategies should be considered that take this information into account.

[Figure 2.3: power (0.0 to 1.0) plotted against the relative effective population size during the bottleneck (0.5 to 0.0) for the TPH+DH statistics. Series: n = 20, l = 30, θ = 0.005; n = 20, l = 30, θ = 0.0015; n = 10, l = 15, θ = 0.005; n = 10, l = 15, θ = 0.0015.]

Figure 2.3. The effect of bottleneck strength on the value of Tajima’s D and model probabilities for both small (n = 10, l = 15) and large (n = 20, l = 30) datasets with low (θ = 0.0015) and high (θ = 0.005) genetic variation. Each point represents the rejection step of an ABC analysis when the TPH+DH set of statistics is used with a tolerance of 0.001. The effective population size during the bottleneck (NB) is defined relative to the recovered effective population size (N).

2.4 Paper IV - Rate of Molecular Adaptation in Spruce

The rate of substitution of novel beneficial mutations is generally thought to correlate with the effective population size. This is due both to the weaker influence of genetic drift at large effective population sizes and to the increased number of beneficial mutations entering larger populations. However, the empirical evidence for this relationship has been inconsistent in the literature. Part of the problem in comparing these different studies is that the species often have quite contrasting life histories. For example, the differences in the proportion of adaptive substitutions between humans and Drosophila could be due to a number of reasons other than the effective population size.

To circumvent this problem we analysed population genetic data from four spruce species of differing effective population sizes: NORWAY, WHITE, JEZO and BREWER SPRUCE. We calculated a number of within- and between-species statistics from synonymous and non-synonymous sites that give information on the relative contributions of genetic drift and selection to both polymorphism and divergence. In particular, we used several methods to calculate the proportion of amino acid changing substitutions fixed due to positive selection and compared this to estimates of the effective population size calculated from synonymous sites.
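As a rough illustration of how polymorphism and divergence at the two classes of site can be combined, the sketch below implements the simplest estimator in the McDonald-Kreitman framework [24], α = 1 − (Ds·Pn)/(Dn·Ps), together with the neutrality index of Rand and Kann [27]. This is only one of several possible methods and is not necessarily the estimator behind the αω values reported below; the function names and the counts are illustrative assumptions.

```python
def neutrality_index(dn, ds, pn, ps):
    """Ratio of polymorphism (Pn/Ps) to divergence (Dn/Ds) at the two site classes."""
    return (pn / ps) / (dn / ds)

def alpha_mk(dn, ds, pn, ps):
    """Simple McDonald-Kreitman style estimate of the proportion of adaptive
    non-synonymous substitutions; equals 1 minus the neutrality index."""
    return 1.0 - (ds * pn) / (dn * ps)

# Illustrative counts: non-synonymous and synonymous fixed differences (Dn, Ds)
# and non-synonymous and synonymous polymorphisms (Pn, Ps).
dn, ds, pn, ps = 7, 17, 2, 42
print(neutrality_index(dn, ds, pn, ps))
print(alpha_mk(dn, ds, pn, ps))
```

This simple form assumes that non-synonymous polymorphisms are neutral; slightly deleterious mutations segregating at low frequency bias the estimate downwards, which is the motivation for more elaborate methods such as that of Eyre-Walker and Keightley [9].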

Estimates of the effective population size, calculated using Watterson’s theta at synonymous sites and assuming a mutation rate of 1 × 10⁻⁸, were highest in NORWAY SPRUCE (234,250) and WHITE SPRUCE (217,500), moderate in JEZO SPRUCE (101,250) and low in BREWER SPRUCE (8,500). Levels of constraint at non-synonymous sites were generally high in species with high effective population sizes, particularly in JEZO SPRUCE, but low in BREWER SPRUCE, whose effective size is considerably lower. These results are consistent with Tajima’s D values calculated for synonymous and non-synonymous sites. Non-synonymous values of D are more negative than those at synonymous sites in the three larger spruce species, but similar in BREWER SPRUCE.
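The conversion between Watterson's theta and the effective population size can be written out explicitly. The sketch below assumes the standard diploid relation θ = 4Neμ and that the quoted mutation rate is per site per generation; both are assumptions, as the text does not spell them out. The theta values it prints are back-calculated from the reported Ne values and are not quoted from the study.

```python
MU = 1e-8  # mutation rate quoted in the text (assumed: per site per generation)

def effective_size(theta_per_site, mu=MU):
    """Ne implied by a per-site theta under the assumed relation theta = 4*Ne*mu."""
    return theta_per_site / (4.0 * mu)

def implied_theta(ne, mu=MU):
    """Per-site theta implied by a given Ne under the same assumption."""
    return 4.0 * ne * mu

# Ne values as reported in the text.
reported_ne = {"NORWAY": 234250, "WHITE": 217500, "JEZO": 101250, "BREWER": 8500}
for species, ne in reported_ne.items():
    print(f"{species}: Ne = {ne}, implied theta_W = {implied_theta(ne):.5f}")
```

Because Ne scales inversely with the assumed mutation rate, any uncertainty in μ carries over directly to the absolute values of Ne, although the ranking of the species is unaffected.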

Table 2.1. Estimates of the effective population size (Ne), the fraction of neutral replacement substitutions (f) and the proportion of adaptive substitutions (αω).

Species    Ne        f      αω
NORWAY     234,250   0.148  0.221
WHITE      217,500   0.218  0.147
JEZO       101,250   0.066  0.295
BREWER     8,500     0.455  0.114

The proportion of adaptive substitutions relative to synonymous substitutions (αω), however, does not show a clear correlation with the effective population size. The species with the lowest effective population size (BREWER SPRUCE) does indeed have the lowest estimate of αω, but any other patterns are concealed by the high estimate for JEZO SPRUCE, which has the highest rate of adaptive substitutions despite having less than half the effective population size of NORWAY and WHITE SPRUCE. There could be a number of reasons for this. Firstly, we use estimates of the effective population size that were calculated directly from Watterson’s theta, whereas there may be demographic processes at work that distort the estimates reported here. Secondly, it is entirely possible that there is a weak underlying relationship between the effective population size and the proportion of adaptive substitutions, but that other factors play a far greater role in determining αω. A compelling alternative, and one which has been suggested before [23, 13], is that the rate of environmental change is of greater importance. This explanation fits well given the long generation times in spruce, where only a handful of generations can span hundreds of years and considerable environmental change.

2.5 Paper V - Adaptation to a Latitudinal Cline

Spruce species respond to seasonal fluctuations in temperature and light by adjusting their yearly growth rhythms. In spruce, growth cessation during the autumn and winter is triggered by the shortening of the days towards the end of the summer, and is a trait which shows high heritability and correlates with latitude in NORWAY SPRUCE. Furthermore, SNPs within putative clock genes show a significant correlation with latitude. We analysed populations of SIBERIAN SPRUCE that form a latitudinal cline along the Yenisei River in Russia parallel to that studied in NORWAY SPRUCE.

[Figure: allele frequency (0.0 to 1.0) plotted against latitude (54 to 68 °N) for the SNPs PoGI_F4_638 and PoGI_F6_51.]

We find a complete lack of population structure, but in spite of this the length of the growth period correlated significantly with latitude. On average, northern populations ceased growth after around 20 days, whereas the two southernmost populations grew for more than 60 days. Significant correlations with latitude were identified for SNPs in three genes putatively involved in the circadian clock and photoperiodic pathway, one of which shows the same pattern in a parallel cline in NORWAY SPRUCE. Furthermore, the expression of one of these genes, PoFTL2, increases with latitude. The patterns of genetic variation at candidate loci were also assessed against a set of control loci to test for departures from neutrality. No departure was detected from a simple model of constant effective population size, but two loci, PoPRR3 and PoGI, showed significant departures from neutrality for Tajima’s D and Fay & Wu’s H respectively.
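A clinal pattern of the kind described above can be summarised by the correlation between population allele frequency and latitude. The sketch below computes a Pearson correlation coefficient and a least-squares slope; the allele frequencies are invented for illustration and the function names are assumptions. It does not reproduce the statistical tests used in the study, which would also need to account for sampling error and population history.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ls_slope(x, y):
    """Least-squares slope of y on x (change in frequency per degree latitude)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Hypothetical population latitudes (degrees N) and allele frequencies at one SNP.
latitude = [54, 56, 58, 60, 62, 64, 66, 68]
frequency = [0.10, 0.15, 0.30, 0.35, 0.50, 0.60, 0.80, 0.85]
print(pearson_r(latitude, frequency), ls_slope(latitude, frequency))
```

In practice the significance of such a correlation is usually judged against control loci or a permutation of population labels, since neutral processes alone can also generate clines.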

Studies of latitudinal clines provide information on spatial variation in the fitness of alleles across a geographic gradient. Given the local variation in environmental conditions we observe in the natural world, it seems inevitable that local adaptation is a common phenomenon in nature. This study provides new insights into the process of local adaptation in natural populations and provides a compelling example of parallel adaptation to latitudinal clines.

3. Conclusions

Evolution is driven by processes that change the frequency of alleles in a population. Patterns of genetic variation produced by neutral processes can provide valuable information on the evolutionary history of species. The studies of Asian spruce species in papers I and II suggest that spruce species are linked closely to their surrounding environment. A dynamically changing landscape can have a profound and striking impact on levels of genetic diversity. These impacts range from range fragmentation and admixture in paper I to a severe population decline in paper II. These inferences have been aided by the development of ABC, which has proven extremely useful for demographic analyses provided that the power afforded by the dataset is sufficient (paper III).

Adaptation in natural populations takes on a number of different forms. This thesis looked at adaptation on both short and long time-scales and revealed a great deal about the way selection acts in spruce species. The factors affecting the rate of adaptive substitution are yet to be fully understood. In paper IV we did not see a clear relationship between the effective population size of the species and the rate of adaptive substitution. The reasons for this are a topic of ongoing research, but it seems likely that the rate of environmental change and the complexity of the phenotype play a part. However, our basic understanding of adaptation should be shifted somewhat away from the simple selective sweep view, whereby a new beneficial mutation sweeps to fixation in the population. There are several alternative modes of adaptation that are biologically appealing and that go against the intuition that adaptation leads to adaptive substitutions. Adaptation in natural populations may be defined more by subtle shifts in the allele frequencies of many small-effect loci. It may also resemble the scenario observed in paper V, where the fitness of alleles differs spatially across the landscape, so that local adaptation is a better description of the real world. Further insights into the nature of adaptation will come as the amount of empirical data increases.

4. Swedish summary

The genetic variation found in populations today is the result of various evolutionary processes. A rough classification of these evolutionary processes usually distinguishes variants that are affected only by random inheritance (genetic drift) from variants that have been affected by natural selection. The goal of many population genetic analyses is to identify how these evolutionary forces have shaped the variation we can observe today. A central parameter in this context is the effective population size (Ne), which, put simply, is a measure of how much genetic variation is found in a population. In papers I and II we use various population genetic methods and models to estimate the effective population size and its changes over time in several species of spruce. Because some populations and species harbour only limited genetic variation, in paper III we used computer simulations to better understand what effect this has on the estimation of parameters such as Ne. In the last two manuscripts we examined more directly what role selection has played in different spruce species. Paper IV compares four species with respect to how much of their genetic variation has been fixed by selection and how much is the result of random genetic drift, while in paper V we focus on genetic variation within one species and how it may have been affected by local adaptation.

In paper I we studied the demographic history, population structure and gene flow among four spruce species originating around the Tibetan Plateau. The four species have different but partly overlapping ranges, so we expected differences in both Ne and population history. As expected, the species with the smallest range (Picea schrenkiana) had a considerably smaller Ne than the three other species (P. likiangensis, P. purpurea and P. wilsonii). Furthermore, analyses of population structure showed that P. purpurea is a hybrid species resulting from hybridisation between P. likiangensis and P. wilsonii. Taken together, our analyses revealed a complex demographic history in these four species, which can probably be explained by the complex geological history of this part of the world.

In paper II we studied a further Asian spruce species, P. morrisonicola. Unlike the species described above, it is found only on the island of Taiwan and has one of the smallest ranges of any present-day spruce species. Earlier studies of pollen from Taiwan suggest that the species was formerly found across large parts of Taiwan, but today it occurs only at higher altitudes. This reduction in range has left a clear pattern in the genetic variation that we see in the species today. Using advanced models and computer simulations we can show that the species is most closely related to P. wilsonii and that the two species diverged some 4-8 million years ago. After that, around 300-500 thousand years ago, its population size was drastically reduced, so that the species today has very low genetic variation. These analyses were all carried out on a limited amount of data. To better understand the precision with which we can estimate the effective population size and changes in this parameter over time, we carried out an extensive simulation study in paper III. The results of this study show that when the amount of variation is low it is often difficult to distinguish between different demographic models, but part of this problem can be overcome by using more genetic markers and/or more individuals. The study also shows that the exact parameters used in the computer simulations matter, and that including more parameters can sometimes markedly increase the ability to distinguish between different demographic scenarios.

The number of mutations favoured by natural selection is assumed to be considerably higher when Ne is large. This is mainly due to two factors: a large Ne means that more beneficial mutations arise in a given period of time, and that the random variation due to drift is smaller than in small populations. Because many other factors also affect how many mutations will be fixed by either selection or genetic drift, in paper IV we examined whether four spruce species that differ mainly in Ne also show differences in the number of genetic variants fixed by selection. Two of the species (P. abies and P. glauca) were expected, because of their large ranges and previous genetic studies, to have a large effective population size, whereas P. jezoensis was expected to have a somewhat smaller one and P. breweriana an extremely small one. The number of mutations fixed by selection differs among the four species, and the lowest frequency was, as expected, found in P. breweriana. The three other species show a higher proportion of positively selected mutations but, surprisingly, the estimated effective population size does not appear to be what governs the amount of selected variants, which shows that factors other than Ne determine, at the genetic level, how effective natural selection is.

In the final paper, V, we look more closely at the effect local adaptation has had on genetic variation in P. obovata. Bud set is central to survival for many temperate trees, since early bud set can mean that an important part of the growing season is missed, whereas late bud set increases the risk of frost damage. Just like P. abies and P. glauca, P. obovata has a large range and is found from about 53° N to 68° N across large parts of western Asia. By studying bud set, gene expression and genetic variation in populations originating from the southern to the northern part of the Siberian river Yenisei, we were able to show that bud set and the expression of certain genes correlate with the latitude from which the population comes. Similar observations have previously been reported for P. abies in western Europe. Moreover, despite a general lack of population structure, there were genetic variants in genes involved in controlling the growth rhythm of spruce that show the patterns we expect from genes directly involved in local adaptation.

These studies contribute to our understanding of how different evolutionary processes have affected the genetic variation we see in spruce species today. Not only the current population sizes but also the historical sizes vary greatly among these closely related species, and we have found evidence of past and ongoing adaptation. In summary, these studies can be seen as a good foundation for future studies of this fascinating group of plants.

5. Acknowledgements

A PhD and a thesis cannot be completed without the help of a great many other people. I’d first like to thank Martin, for his supervision over the course of the PhD. I’d also like to thank my co-supervisors Ulf and Mattias for their advice along the way. I’d also like to thank my unofficial co-supervisor Thomas, whose broad knowledge has been invaluable over the course of my time in the department.

I’d like to thank the current and former members of the now defunct functional genomics department: Yoshiaki, Jun, Hanna, Kerstin, Rose-Marie, Kalle, Sofia, Laura and also my great mentor Xiao-Fei Ma, who taught me so much about ping-pong (and some lab stuff as well...). I would like to thank my collaborators, Stéphane and Mathieu, in Montpellier for making my short stay there so welcoming and ping-pong rich, and also the people who have given me advice along the way, such as Yves and Urban.

I’d also like to thank Pádraic and Pontus for endless scientific and non-scientific discussions. I still remember the day we all met on the first day of the masters, 6 years ago, when Pádraic came up to me and immediately started talking about Ernst Mayr and Theodosius Dobzhansky. There are a whole bunch of people in and around the EBC, both past and present, that have made it such a fun place to be. I’m just going to list a bunch of people off the top of my head: Elle, Jenny, Jelmar, Roger, Jurg, Benoit, Lucie, Rob, Sen, Oddny, Jamie, Eva and far too many other people to mention here. And there have been so many moments that stick in the memory: the EBC pub, Oddny’s brunches, beating Rob, losing to Rob, seeing Sen Li in a blond wig...

I’d like to thank my mum, dad and sister Katie, and particularly my parents for funding and supporting me throughout my education. There were certainly times when it seemed unlikely that this day would ever come...

And finally I’d like to thank Becky for the endless support, encouragement and guacamole dip needed to write this thesis.

References

[1] Mark A Beaumont, Wenyang Zhang, and David J Balding. Approximate Bayesian computation in population genetics. Genetics, 162(4):2025–2035, 2002.

[2] Celine Becquet and Molly Przeworski. A new approach to estimate parameters of speciation models with application to apes. Genome Research, 17(10):1505–1519, 2007.

[3] Adam R Boyko, Scott H Williamson, Amit R Indap, Jeremiah D Degenhardt, Ryan D Hernandez, Kirk E Lohmueller, Mark D Adams, Steffen Schmidt, John J Sninsky, Shamil R Sunyaev, et al. Assessing the evolutionary impact of amino acid mutations in the human genome. PLoS Genetics, 4(5):e1000083, 2008.

[4] Brian Charlesworth. The effect of background selection against deleterious mutations on weakly selected, linked variants. Genetical Research, 63(03):213–227, 1994.

[5] J Chen, T Källman, N Gyllenstrand, and M Lascoux. New insights on the speciation history and nucleotide diversity of three boreal spruce species and a Tertiary relict. Heredity, 104(1):3–14, 2009.

[6] Jun Chen, Thomas Källman, Xiaofei Ma, Niclas Gyllenstrand, Giusi Zaina, Michele Morgante, Jean Bousquet, Andrew Eckert, Jill Wegrzyn, David Neale, et al. Disentangling the roles of history and local selection in shaping clinal variation of allele frequencies and gene expression in Norway spruce (Picea abies). Genetics, 191(3):865–881, 2012.

[7] Charles Darwin. On the origin of species by means of natural selection. 1859.

[8] G Dupont-Nivet, C Hoorn, and M Konert. Tibetan uplift prior to the Eocene-Oligocene climate transition: Evidence from pollen analysis of the Xining Basin. Geology, 36(12):987–990, 2008.

[9] Adam Eyre-Walker and Peter D Keightley. Estimating the rate of adaptive molecular evolution in the presence of slightly deleterious mutations and population size change. Molecular Biology and Evolution, 26(9):2097–2108, 2009.

[10] Justin C Fay and Chung-I Wu. Hitchhiking under positive Darwinian selection. Genetics, 155(3):1405–1413, 2000.

[11] Ronald Aylmer Fisher. The wave of advance of advantageous genes. Annals of Eugenics, 7(4):355–369, 1937.

[12] Ronald Aylmer Fisher. The genetical theory of natural selection: a complete variorum edition. Oxford University Press, 1999.

[13] John H Gillespie. Population genetics: a concise guide. JHU Press, 2010.

[14] Toni I Gossmann, Bao-Hua Song, Aaron J Windsor, Thomas Mitchell-Olds, Christopher J Dixon, Maxim V Kapralov, Dmitry A Filatov, and Adam Eyre-Walker. Genome wide analyses reveal little evidence for adaptive evolution in many plant species. Molecular Biology and Evolution, 27(8):1822–1832, 2010.

[15] Matthew W Hahn. Toward a selection theory of molecular evolution. Evolution, 62(2):255–265, 2008.

[16] Jody Hey. The divergence of chimpanzee species and subspecies as revealed in multipopulation isolation-with-migration analyses. Molecular Biology and Evolution, 27(4):921–933, 2010.

[17] Richard R Hudson. Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics, 18(2):337–338, 2002.

[18] Motoo Kimura. The neutral theory of molecular evolution. Cambridge University Press, 1984.

[19] Motoo Kimura et al. Evolutionary rate at the molecular level. Nature, 217(5129):624–626, 1968.

[20] JFC Kingman. Exchangeability and the evolution of large populations. 1982.

[21] John FC Kingman. The coalescent. Stochastic Processes and their Applications, 13(3):235–248, 1982.

[22] John FC Kingman. On the genealogy of large populations. Journal of Applied Probability, pages 27–43, 1982.

[23] João M Lourenço, Sylvain Glémin, and Nicolas Galtier. The rate of molecular adaptation in a changing environment. Molecular Biology and Evolution, 30(6):1292–1301, 2013.

[24] John H McDonald, Martin Kreitman, et al. Adaptive protein evolution at the Adh locus in Drosophila. Nature, 351(6328):652–654, 1991.

[25] Björn Nystedt, Nathaniel R Street, Anna Wetterbom, Andrea Zuccolo, Yao-Cheng Lin, Douglas G Scofield, Francesco Vezzi, Nicolas Delhomme, Stefania Giacomello, Andrey Alexeyenko, et al. The Norway spruce genome sequence and conifer genome evolution. Nature, 2013.

[26] Tomoko Ohta. Slightly deleterious mutant substitutions in evolution. Nature, 246(5428):96–98, 1973.

[27] David M Rand and Lisa M Kann. Excess amino acid polymorphism in mitochondrial DNA: contrasts among genes from Drosophila, mice, and humans. Molecular Biology and Evolution, 13(6):735–748, 1996.

[28] Daniel R Schrider, David Houle, Michael Lynch, and Matthew W Hahn. Rates and genomic consequences of spontaneous mutational events in Drosophila melanogaster. Genetics, 2013.

[29] Fumio Tajima. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics, 123(3):585–595, 1989.

[30] John Wakeley. Coalescent theory: an introduction, volume 1. Roberts & Company Publishers, 2009.

[31] John Wakeley and Jody Hey. Estimating ancestral population parameters. Genetics, 145(3):847–855, 1997.

[32] Yang Wang, Tao Deng, and Dana Biasatti. Ancient diets indicate significant uplift of southern Tibet after ca. 7 Ma. Geology, 34(4):309–312, 2006.

[33] Sewall Wright. Evolution in Mendelian populations. Genetics, 16(2):97, 1931.

[34] Sewall Wright. The distribution of gene frequencies under irreversible mutation. Proceedings of the National Academy of Sciences of the United States of America, 24(7):253, 1938.

Acta Universitatis Upsaliensis
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1078

Editor: The Dean of the Faculty of Science and Technology

A doctoral dissertation from the Faculty of Science and Technology, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology.

Distribution: publications.uu.se
urn:nbn:se:uu:diva-207714
