Germ banks affect the inference of past demographic events
DANIEL ŽIVKOVIC* and AURELIEN TELLIER*†
*Section of Evolutionary Biology, Department of Biology II, BioCenter, LMU Munich, Grosshaderner Strasse 2, 82152 Planegg-Martinsried, Germany, †Section of Population Genetics, Center of Life and Food Sciences Weihenstephan, Technische Universität München, 85354 Freising, Germany
Abstract
Continuous progress in empirical population genetics based on the whole-genome
polymorphism data requires the theoretical analysis of refined models in order to
interpret the evolutionary history of populations with adequate accuracy. Recent stud-
ies focus prevalently on the aspects of demography and adaptation, whereas age
structure (for example, in plants via the maintenance of seed banks) has attracted less
attention. Germ banking, that is, seed or egg dormancy, is a prevalent and important
life-history trait in plants and invertebrates, which buffers against environmental vari-
ability and modulates species extinction in fragmented habitats. Within this study,
we investigate the combined effect of germ banking and time-varying population size
on the neutral coalescent and particularly derive the allele frequency spectrum under
some simplifying assumptions. We then perform an ABC analysis using two simple
demographic scenarios—a population expansion and an instantaneous decline. We
demonstrate the appreciable influence of seed banks on the estimation of demo-
graphic parameters depending on the germination rate with biases scaled by the
square of the germination rate. In the more complex case of a population bottleneck,
which comprises an instantaneous decline and an expansion phase, ignoring information
on the germination rate precludes reliable estimates of the bottleneck parameters
via the allelic spectrum. In particular, when seeds remain in the bank over several
generations, recent expansions may remain invisible in the frequency spectrum,
whereas ancient declines leave signatures much longer than in the absence of a seed
bank.
Keywords: allele frequency spectrum, approximate Bayesian computation, coalescent, seed and
germ bank, time-varying population size
Received 10 May 2012; revision received 10 August 2012; accepted 21 August 2012
Introduction
Since the beginning of the 20th century, molecular data
have been used to estimate the recent evolutionary his-
tory of populations (e.g. Hirschfeld & Hirschfeld 1919).
This has been made largely feasible by theoretical
advances demonstrating the possibility to detect depar-
tures from equilibrium conditions (e.g. panmictic popu-
lation, mutation–drift equilibrium), as for instance
deviations from demographic stationarity (e.g. Watter-
son 1984). Understanding which demographic events or
selective forces shape the patterns of polymorphism is
fundamental in evolutionary genetics (e.g. Stephan
2010) and of practical relevance for conservation biol-
ogy (e.g. Olivieri et al. 2008). Genetic data are indeed
increasingly used to reconstruct the demographic his-
tory of species or populations, such as past bottlenecks
due to hunting, the introduction of alien species or hab-
itat loss and fragmentation for endangered species (e.g.
Olivieri et al. 2008). It therefore becomes important to
quantify which ecological factors or life-history traits
affect the precision of these inferences and thus potentially
the conclusions of these studies. For example, the
existence of metapopulation structure may create spurious
bottleneck signals depending on the genetic
differentiation (or gene flow), genetic diversity and
sampling scheme. As a result, several conservation biology
studies may have overestimated or incorrectly
detected bottlenecks (discussed in Chikhi et al. 2010).
Correspondence: Aurelien Tellier, Fax: +49 89 2180 74 104; E-mail: tellier@wzw.tum.de
© 2012 Blackwell Publishing Ltd
Molecular Ecology (2012) doi: 10.1111/mec.12039
Theoretical studies in population genetics were prev-
alently based on the continuous-time approximation of
the Wright–Fisher model (Fisher 1930; Wright 1931),
which inter alia assumes a constant population size, ran-
domly mating individuals without structure and nonov-
erlapping generations in discrete time. Kimura (1955)
introduced the time rescaling, which generalizes several
results that were originally derived for the basic model
to deterministic changes in population size. Thereafter,
studies often considered particular demographic scenar-
ios such as instantaneous changes (e.g. Watterson 1984)
or exponential growth (e.g. Slatkin & Hudson 1991),
before Griffiths & Tavare (1994) investigated the coales-
cent for deterministic changes in population size. Fur-
thermore, the allele frequency spectrum that serves as
an essential statistic to find agreements between demo-
graphic models and samples of DNA sequences has
been derived (e.g. Griffiths & Tavare 1998; Živkovic &
Stephan 2011). Although deterministic changes in popu-
lation size are arguably more popular regarding the
interpretation of biological data, studies have also
focussed on stochastically varying population sizes (e.g.
Kaj & Krone 2003).
Departure from the Wright–Fisher assumptions
arises also if species show age-structured populations
due to specific life-history traits (Charlesworth 1994)
such as created by overlapping generations or germ
banks. Germ banks spanning several generations
are a ubiquitous characteristic of many species (Evans
& Dennehy 2005) encompassing seed dormancy in
plants (Templeton & Levin 1979; Nunney 2002; Evans
et al. 2007), resting eggs, for example, in pond sedi-
ments of Daphnia (Decaestecker et al. 2007) and sur-
vival of spores in bacteria (e.g. Lennon & Jones 2011).
It has been suggested theoretically (e.g. Templeton &
Levin 1979) and shown empirically (Evans et al. 2007;
Tielbörger et al. 2012) that adaptation for dormancy is a
bet-hedging strategy to magnify the evolutionary effect
of ‘good’ years and to dampen the effect of ‘bad’
years, that is, to buffer environmental variability.
Importantly, germ dormancy generates an increase in
the effective population size compared to the census
size of the observable population by (i) promoting the
storage of genetic diversity (Templeton & Levin 1979;
Nunney 2002) and (ii) counter-acting habitat fragmen-
tation by buffering against the extinction of small
and isolated populations—a phenomenon known as
‘temporal rescue effect’ (Brown & Kodric-Brown 1977;
Honnay et al. 2008). Seed banks are also key for the
conservation of endangered plant species as a life-his-
tory trait modulating habitat fragmentation. We denote
here the observable population as a general term
describing the census size of reproductive individuals
such as the above-ground plants, and the free living
individuals of invertebrates and bacteria. For simplic-
ity, we interchangeably refer to seed and germ banks
hereafter.
Based on the theoretical work of Kaj et al. (2001), seed
banking leads to an increase in the effective population
size because a coalescent event of two lineages can only
occur in a reproducing individual in the observable
population. Assuming a population of constant size in
time as Kaj et al. (2001), the relative allele frequencies
within a sample remain unchanged in the seed bank
model. This is because the underlying mean waiting
times to coalescence are equivalently stretched com-
pared to a population without a bank. However, we
reason that if a population undergoes size changes in
time, long germ banks with small germination rates
may buffer or enhance the effect of the demography.
The polymorphism signature, that is, the observed
genetic variability and the allelic spectrum, of a given
past demographic event may then be affected compared
to a population without banks. Germ banks would thus
create spurious signatures of a past population expan-
sion or a bottleneck, or make these signatures nondiffer-
entiable based on SNP data.
In this study, we analyse the effect of germ banks
on the detection of past population size changes under
neutrality. First, we derive the frequency spectrum for
a Wright–Fisher-type dynamics with germ bank and
for widely used models of population size changes
based on the work of Kaj et al. (2001), Tavare (1984),
Griffiths & Tavare (1998) and Živkovic & Stephan
(2011). By means of a simple expansion model, we
exemplify that numerous parameter combinations of
rate and time of growth lead to equivalent relative fre-
quency spectra depending on the germination rate.
Second, we investigate if this confounding effect of
seed banks is likely to strongly impede inference of
past demographic events. We simulate two simple
demographic scenarios—a population expansion and
an instantaneous decline—with variable parameters
assuming the presence of banks with different germi-
nation rates. To mimic current studies, we infer with
approximate Bayesian computation (ABC) (e.g. Beau-
mont et al. 2002) the past demography ignoring the
effect of seed banks on the polymorphism data. We
show that the model choice procedure of the ABC method
robustly differentiates between these two demographic
models. However, the parameter inference procedure
presents strong biases in the estimates. Finally, we
show that more complex demographic histories such as
bottlenecks, which include a decline and an expansion,
can even lead to an excess of low-, intermediate- or
high-frequency derived alleles depending on the usu-
ally unknown germination rate. The characterization of
certain demographic tendencies based on the frequency
spectrum is in these cases cumbersome. On the posi-
tive side, if knowledge on the germination rate is
available, relatively older changes in population size
remain detectable in the polymorphism data in com-
parison with the model without seed bank due to the
enlarged coalescent tree.
Methods and results
The coalescent for seed bank models with constant population size
Our model is based on the elegant urn model by Kaj
et al. (2001) describing the neutral seed bank dynamics
for a haploid population of constant size. In each generation,
the population consists of N individuals with proportion
$b_i$ originating from seeds produced $i = 1,\dots,m$
generations ago, where m denotes the maximum number
of generations a seed may spend in the bank. In a
given generation, each individual is randomly drawn
independently from the others with probabilities
$b_1,\dots,b_m$ from the appropriate generation. The population
of a new generation is thus formed via multinomial
sampling from the previous m generations. Viewing
time retrospectively, each individual in a given genera-
tion is assigned randomly to an ancestor from the
previous m generations according to the above probabil-
ities. This procedure can be seen as a process in which
a sample of balls of initial size n at present is relocated
across the previous generations by sliding a window
that comprises m consecutive generations as cells in a
stepwise manner. When the window is slid one generation
backwards, all balls from the first cell of the previous
window are relocated into one of the m cells of
the current window. More precisely, each ball is relocated
into one of the N slots of a given cell. Each slot
represents an individual of the population in the
respective generation. During the relocation process,
two types of coalescent events may occur: either two
balls are relocated into the same slot of the same cell or
a ball is relocated into a previously occupied slot. It has
been shown that more than one coalescent event happens
at a time only with the negligible probability $O(1/N^2)$
(Kaj et al. 2001). The probability of one coalescent
event is O(1/N) at each step, so that coalescences occur
in O(N) steps. In contrast, the configuration process
describing the distribution of balls across the cells of
the windows over time offers transitions between the
states at each step. So the configuration process has
time to reach an equilibrium between coalescent events
for large N (Kaj et al. 2001). This separation of time-
scales into a slow and a fast process has been applied
to several population genetic models (e.g. Nordborg &
Krone 2002).
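The urn scheme described above can be mimicked for two lineages with a few lines of Python. This is only an illustrative sketch and not the simulator used in this study: the function names, the example bank distribution (m = 2 with equal weights) and the choice N = 20 are assumptions made here. Each lineage jumps backwards by a randomly drawn seed age, and the two lineages can only merge when they occupy the same generation, in which case they share a slot with probability 1/N. If the limiting result of Kaj et al. (2001) applies, the mean pairwise coalescence time should be close to $N/b^2$ generations.

```python
import random


def pairwise_coalescence_time(N, bank, rng):
    """Generations back to the common ancestor of two sampled individuals.

    bank[i-1] is the probability b_i that an individual stems from a seed
    produced i generations earlier (the entries must sum to 1).
    """
    ages = list(range(1, len(bank) + 1))
    t1 = t2 = 0  # generation (counted backwards) occupied by each lineage
    while True:
        if t1 == t2:
            # Two distinct individuals of the same generation:
            # each draws the age of its seed independently.
            t1 += rng.choices(ages, weights=bank)[0]
            t2 += rng.choices(ages, weights=bank)[0]
        elif t1 < t2:
            t1 += rng.choices(ages, weights=bank)[0]
        else:
            t2 += rng.choices(ages, weights=bank)[0]
        # A merger requires the same generation (cell) and the same slot.
        if t1 == t2 and rng.random() < 1.0 / N:
            return t1


def mean_pairwise_time(N, bank, replicates=5000, seed=1):
    """Monte Carlo average of the pairwise coalescence time."""
    rng = random.Random(seed)
    total = sum(pairwise_coalescence_time(N, bank, rng)
                for _ in range(replicates))
    return total / replicates


if __name__ == "__main__":
    bank = [0.5, 0.5]  # hypothetical bank: m = 2, E(B) = 1.5
    b = 1.0 / sum(i * p for i, p in enumerate(bank, start=1))
    N = 20
    print("simulated mean time:", mean_pairwise_time(N, bank))
    print("N / b^2            :", N / b ** 2)
```

The simulated mean is a stochastic estimate, so it will only agree with $N/b^2$ up to Monte Carlo error.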
The ancestral process of the seed bank model is
denoted by $(A_n^N(k))_{k \ge 0}$, where $A_n^N(k)$ is the number of
ancestors at step k with population size N and initial
sample size n. Let $b = 1/E(B)$, where $E(B) = \sum_{i=1}^{m} i\,b_i$
is the expected value of the seed bank age distribution
$P(B = j) = b_j$, $j = 1,\dots,m$, or simply the mean time a
seed will spend in the bank. Biologically, b is approximately
the germination rate (Tellier et al.
2011a), so that we will refer to b as the germination rate
throughout. The main result of Kaj et al. (2001) states
that the time-rescaled ancestral process $(A_n^N([Nt]))_{t \ge 0}$
converges as $N \to \infty$ to the continuous-time Markov
chain $(A_n(t))_{t \ge 0}$ with infinitesimal generator matrix
$Q = (q_{ij})_{i,j \in \{1,\dots,n\}}$ defined by
$$
\begin{aligned}
q_{ii} &= -b^2 \binom{i}{2}, && 2 \le i \le n,\\
q_{i,i-1} &= b^2 \binom{i}{2}, && 2 \le i \le n,\\
q_{ij} &= 0, && \text{otherwise.}
\end{aligned}
\tag{1}
$$
So the limiting process of the seed bank model is the n-
coalescent (Kingman 1982) run on a slower timescale.
From eqn 1, it is straightforward to derive the probability
that the process $A_n(t)$ is in a certain state, in
which there are $j = n,\dots,2$ ancestors, at time t via the
matrix method (e.g. Tavare 1984; Živkovic & Stephan
2011). After some algebra, one obtains
$$
P(A_n(t) = j) = \sum_{k=j}^{n} c_{nk}\, r_{kj} \exp\!\left(-b^2 \binom{k}{2} t\right), \tag{2}
$$
where $c_{nk} = \binom{n}{k} k_{(k)} / n_{(k)}$ and $r_{kj} = (-1)^{k-j} \binom{k}{j} j_{(k-1)} / k_{(k-1)}$
are the elements of the matrices of column
and row eigenvectors of Q, respectively, and
$a_{(b)} = a(a+1)\cdots(a+b-1)$, $a_{(0)} = 1$ (in this rising-factorial
notation only, the subscript b is an integer and not the germination rate). The mean waiting
times between coalescent events are given by
$$
E(T_j) = \int_0^{\infty} P(A_n(t) = j)\,dt = \cdots = \left(b^2 \binom{j}{2}\right)^{-1}, \tag{3}
$$
as the inverse of the coalescent rate. The germination
rate, b, is bounded as $1/m \le b \le 1$. The lower and upper
bounds result from the scenarios in which all seeds rest,
respectively, m generations and one generation in the bank. So
the expected coalescent tree can be up to $m^2$ times
longer in the seed bank model than in the
usual Wright–Fisher model.
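Equations 2 and 3 are easy to check numerically. The following Python sketch is an illustration under stated assumptions rather than the authors' code: the helper names (`rising`, `state_probability`, `mean_waiting_time`) are hypothetical, and time is measured in units of N generations as in the text. It evaluates eqn 2 directly and integrates it term by term, which should reproduce the closed form of eqn 3.

```python
from math import comb, exp


def rising(a, m):
    """Rising factorial a_(m) = a (a+1) ... (a+m-1), with a_(0) = 1."""
    out = 1
    for i in range(m):
        out *= a + i
    return out


def c(n, k):
    """Coefficient c_nk of eqn 2."""
    return comb(n, k) * rising(k, k) / rising(n, k)


def r(k, j):
    """Coefficient r_kj of eqn 2."""
    return (-1) ** (k - j) * comb(k, j) * rising(j, k - 1) / rising(k, k - 1)


def state_probability(n, j, t, b):
    """P(A_n(t) = j) according to eqn 2, with germination rate b."""
    return sum(c(n, k) * r(k, j) * exp(-b ** 2 * comb(k, 2) * t)
               for k in range(j, n + 1))


def mean_waiting_time(n, j, b):
    """E(T_j) for j >= 2: term-by-term integral of eqn 2 over t >= 0."""
    return sum(c(n, k) * r(k, j) / (b ** 2 * comb(k, 2))
               for k in range(j, n + 1))


if __name__ == "__main__":
    n, b = 6, 0.5
    # The state probabilities over j = 1..n should sum to one at any time.
    print(sum(state_probability(n, j, 0.8, b) for j in range(1, n + 1)))
    # Compare the integrated series with the closed form of eqn 3.
    for j in range(2, n + 1):
        print(j, mean_waiting_time(n, j, b), 1.0 / (b ** 2 * comb(j, 2)))
```

The alternating sums lose precision for large samples, so this direct evaluation is only advisable for moderate n.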
The coalescent for seed bank models with variable population size
When population size changes occur, plants and seeds
of all age classes are assumed to be equivalently
affected such that the relative proportions of all types of
seeds remain constant over time. Then the probabilities
$b_1,\dots,b_m$ and therefore the germination rate b remain
constant over time as well, such that in the urn model a
change in population size solely alters the number of
slots in the corresponding cell. One may also think of a
more complex neutral model, in which an environmen-
tal change affects solely the plants but not the seeds of
the corresponding generation, such that subsequently
the proportions of seeds of different age classes could
very well change. However, for mathematical conve-
nience, we focus on the simplified setting.
In discrete time, let $q_N(i+k-1) = N(i+k-1)/N$ be the ratio of the population size, $N(i+k-1)$, at the ith
cell of the kth m-window, relative to the population
size, N, at time of sampling. As usual, the population
size is assumed to be large in each generation, which
will particularly allow the configuration process to
reach an equilibrium between coalescence events as in
the case of constant population size. The demographic
changes here occur on the coalescent time scale and not
generation-wise (as in Nunney 2002). Moreover, we
require that the population size remains approximately
constant over a given m-window $k_0$ as determined
by the population size of the first cell, that is,
$q_N(i+k_0-1) \approx q_N(k_0)$. This simplification holds in
particular for a geometrically growing population, if the
growth rate and m are chosen realistically small. In the
case of an instantaneous population decline, this relationship
is violated just for $m-1$ generations, so that for
small m instantaneous changes within a window can be
neglected due to the small corresponding coalescence
probability for large population sizes. In summary,
models encompassing these forms of demographic
changes can be approximately treated as the usual
Wright–Fisher model regarding changes in population
size (e.g. Griffiths & Tavare 1994) as being determined
by the first cell of each window.
In continuous time, let the function q(t), which arises
from $q_N([Nt])$ as $N \to \infty$ and time being measured in
units of N generations, be piecewise continuous and
bounded. The time-rescaling argument for the coalescent
approximation of the usual Wright–Fisher model
(e.g. Griffiths & Tavare 1994), $t \mapsto \int_0^t q(s)^{-1}\,ds$, can be
applied to the ancestral process $(A_n(t))_{t \ge 0}$ to obtain the
process with time-varying population size $(A_n^q(t))_{t \ge 0}$
according to the convention in discrete time. Therefore,
the corresponding results to eqns 2 and 3 are given by
$$
P(A_n^q(t) = j) = \sum_{k=j}^{n} c_{nk}\, r_{kj} \exp\!\left(-b^2 \binom{k}{2} \int_0^t q(s)^{-1}\,ds\right) \tag{4}
$$
and
$$
E(T_j) = \int_0^{\infty} P(A_n^q(t) = j)\,dt, \tag{5}
$$
respectively. It might be worth mentioning that all of
the above equations hold for a diploid population of
size N and scaling time in units of 2N generations. For
simplicity, we will mostly consider demographies that
comprise exponential growths and instantaneous
declines. As a first example, we illustrate in Fig. 1 the
effect of different germination rates on the genealogy of
a sample from an exponentially growing population.
The ratio of the means of external and total branch
lengths, which are both simply obtained from eqns 4
and 5, is used as a tree measure. These ratios are clearly
elevated compared with the basic model of a constant
population size (without dormancy). Equivalent curves
are found for different values of b, as the time of expan-
sion, te, is shifted to the past, and the growth rate, R, is
adequately reduced to keep the ratio of the ancestral
and the actual population size, d, constant. The strong-
est accumulation of coalescent events as represented by
the peaks of this measure is shifted to the past with
decreasing b, as seeds remaining longer in the bank
compensate the demographic coalescence pressure.
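The tree measure discussed above can be sketched numerically for the expansion model $q(t) = \exp(-Rt)$, $0 \le t < t_e$, $q(t) = d$, $t \ge t_e$. The Python code below is a minimal illustration and not the procedure used for Fig. 1: the function names are hypothetical, the growth phase is integrated with a composite Simpson rule chosen here for simplicity, and the expected external branch length is taken as $\sum_k k\,\frac{k-1}{n-1}\,E(T_k)$, which assumes that external branches correspond to the singleton class of the frequency spectrum. The code requires R > 0.

```python
from math import comb, exp


def rising(a, m):
    """Rising factorial a_(m) = a (a+1) ... (a+m-1), with a_(0) = 1."""
    out = 1
    for i in range(m):
        out *= a + i
    return out


def coefficient(n, k, j):
    """Product c_nk * r_kj appearing in eqns 2 and 4."""
    c_nk = comb(n, k) * rising(k, k) / rising(n, k)
    r_kj = (-1) ** (k - j) * comb(k, j) * rising(j, k - 1) / rising(k, k - 1)
    return c_nk * r_kj


def survival_integral(rate, R, te, steps=2000):
    """Integral over t >= 0 of exp(-rate * L(t)), L(t) = int_0^t q(s)^-1 ds."""
    def L(t):  # rescaled time during the growth phase, 0 <= t <= te
        return (exp(R * t) - 1.0) / R
    width = te / steps  # composite Simpson rule; steps must be even
    total = exp(-rate * L(0.0)) + exp(-rate * L(te))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * exp(-rate * L(i * width))
    growth_phase = total * width / 3.0
    d = exp(-R * te)  # ancestral size relative to the present size
    constant_phase = exp(-rate * L(te)) * d / rate  # closed form for t >= te
    return growth_phase + constant_phase


def mean_waiting_times(n, b, R, te):
    """E(T_j), j = 2..n, from eqns 4 and 5 for the expansion model."""
    integrals = {k: survival_integral(b ** 2 * comb(k, 2), R, te)
                 for k in range(2, n + 1)}
    return {j: sum(coefficient(n, k, j) * integrals[k]
                   for k in range(j, n + 1))
            for j in range(2, n + 1)}


def external_to_total_ratio(n, b, R, te):
    """Ratio of expected external branch length to expected tree length."""
    times = mean_waiting_times(n, b, R, te)
    external = sum(k * (k - 1) / (n - 1) * t for k, t in times.items())
    total = sum(k * t for k, t in times.items())
    return external / total


if __name__ == "__main__":
    n = 10
    # Same value of te * b^2 and of R / b^2, hence the same d.
    print(external_to_total_ratio(n, 1.0, 20.0, 0.115))
    print(external_to_total_ratio(n, 0.2, 0.8, 2.875))
    # Constant size without dormancy gives 1 / (1 + 1/2 + ... + 1/(n-1)).
    print(1.0 / sum(1.0 / i for i in range(1, n)))
```

Both parameter combinations in the usage example share the same values of $t_e b^2$ and $R/b^2$, so the two ratios are expected to coincide up to rounding error.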
Fig. 1 The ratio of the expected external branch length, $E(T_e)$, and the expected total tree length, $E(T_c)$, is plotted over the
time of expansion, $t_e$, for a sample of size n = 20 and various
values of b, which is the inverse of the mean time a seed will
spend in the bank. The underlying demography is a population
expansion from previously constant size, that is,
$q(t) = \exp(-Rt)$ for $0 \le t < t_e$ and $q(t) = d$ for $t \ge t_e$. R is determined by
$t_e$ as $R = -\log(d)/t_e$, and d = 0.1. The basic model refers to a
constant population size without seed bank.
Adding mutations to the genealogy
Kaj et al. (2001) modelled neutral mutations as being age-
dependent, so that older seeds accumulate more muta-
tions than younger ones. However, to shorten notation,
we assume the mutation rate to be identical for seeds of
all age classes, as (i) following Kaj et al. (2001) the differ-
ent rates are summarized into an overall mutation rate,
so that the results below are applicable in their age-
dependent model as well, and (ii) evidence for such a
dependency is scarce (Vitalis et al. 2004; Honnay et al.
2008; but see Levin 1990). Neglecting seed banks, muta-
tions are commonly assumed to occur independently
according to Poisson processes of rate h/2 along the
edges of the coalescent tree, where the population mutation
rate is $h = \lim_{N \to \infty} 2Nl$, with N being the population
size at the time of sampling and l being the mutation
probability per sequence per generation. The seed bank
model requires in addition that mutations are solely
superimposed on the above-ground edges of the coalescent
tree. Similarly to Kaj et al. (2001), we thus introduce
the scaled mutation rate for seed bank models, $h_b$, referred to as the b-scaled mutation rate, via the relationship
$h_b = bh$, as seeds germinate on average every $1/b$ generations. This relationship particularly holds for time-varying
population size, as each ancestral line on average
remains above-ground an equivalent amount of time, and
mutations occur along the above-ground edges condi-
tional on their lengths. Therefore, and assuming an infi-
nitely many sites mutation model (Kimura 1969), where
each mutation arises at a previously monomorphic site,
results regarding allele frequency spectra of general bin-
ary coalescent trees (Griffiths & Tavare 1998) are applica-
ble and given in eqns 6 and 7. The allele or site frequency
spectrum, hereafter denoted as frequency spectrum, is
the distribution of the number of derived alleles in a sam-
ple of size n over a large number of polymorphic sites. As
mutations can be either counted absolutely or relative to
the total number of segregating sites, we note and use
both results. The absolute and relative frequency spectra,
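$f_i$ and $r_i$, $1 \le i \le n-1$, are, respectively, given by
$$
f_i = \frac{h_b}{2} \sum_{k=2}^{n-i+1} k\, \frac{\binom{n-i-1}{k-2}}{\binom{n-1}{k-1}}\, E(T_k), \tag{6}
$$
and
$$
r_i = \sum_{k=2}^{n-i+1} k\, \frac{\binom{n-i-1}{k-2}}{\binom{n-1}{k-1}}\, E(T_k) \Bigg/ \sum_{k=2}^{n} k\, E(T_k). \tag{7}
$$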
Again, these equations are applicable in the diploid
case as well, where time is scaled in units of 2N generations
using $h_b = bh$ and $h = \lim_{N \to \infty} 4Nl$. For constant
population size, the number of mutations is elevated by
a factor of $1/b$ in each class of the absolute frequency spectrum, $f_i$,
compared to the model without dormancy, whereas the
relative frequency spectrum, $r_i$, is equivalent with and
without dormancy. We revisit the demographic example
of Fig. 1 in terms of relative allele frequencies, $r_i$, by
applying eqn 5 to eqn 7. The expansions (Fig. 2a) are
chosen so that the corresponding frequency spectra, $r_i$,
are equivalent for the different germination rates
(Fig. 2b), which holds for arbitrary values of b, $t_e$ and R
as long as the values of $t_e b^2$, d and $R/b^2$ remain the
same. The amount of singletons corresponds to the
maximum value of $E(T_e)/E(T_c)$ in Fig. 1. This example
particularly shows that a recent ($t_e = 0.12$) and strong
(R = 20) expansion without seed bank has the same
frequency spectrum as an old ($t_e = 2.88$) and weak
(R = 0.8) expansion with a small germination rate
(b = 0.2). Thus, expansions will be dated as too recent
and growth rates overestimated, when seed banks are
not taken into account. The impact of the germination
rate, b, on the estimation of demographic changes is
studied in more detail in the next section.
Fig. 2 Three different parameter combinations of the rate and
time of growth (a) leading to equivalent relative frequency
spectra, $r_i$, depending on the germination rate (b). The parameter
combinations of the curves from left to right (a) are given
top down in the legend of (b).
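The statement on constant population size can be verified directly from eqns 3, 6 and 7. The Python sketch below is illustrative only; the function names and the example values (n = 8, h = 12.5, b = 0.5) are assumptions made here. It computes the absolute and relative frequency spectra from a set of mean waiting times and compares the cases with and without dormancy.

```python
from math import comb


def constant_size_times(n, b):
    """E(T_k), k = 2..n, for constant population size (eqn 3)."""
    return {k: 1.0 / (b ** 2 * comb(k, 2)) for k in range(2, n + 1)}


def frequency_spectra(n, expected_times, h_b):
    """Absolute (eqn 6) and relative (eqn 7) frequency spectra.

    expected_times[k] holds E(T_k); h_b is the b-scaled mutation rate.
    """
    weights = {}
    for i in range(1, n):
        weights[i] = sum(
            k * comb(n - i - 1, k - 2) / comb(n - 1, k - 1) * expected_times[k]
            for k in range(2, n - i + 2))
    tree_length = sum(k * expected_times[k] for k in range(2, n + 1))
    absolute = {i: 0.5 * h_b * w for i, w in weights.items()}
    relative = {i: w / tree_length for i, w in weights.items()}
    return absolute, relative


if __name__ == "__main__":
    n, h, b = 8, 12.5, 0.5
    f_bank, r_bank = frequency_spectra(n, constant_size_times(n, b), b * h)
    f_none, r_none = frequency_spectra(n, constant_size_times(n, 1.0), h)
    for i in range(1, n):
        # f_bank / f_none should equal 1 / b; r_bank should equal r_none.
        print(i, f_bank[i] / f_none[i], r_bank[i], r_none[i])
```

For other demographies, the same function can be applied to mean waiting times obtained from eqn 5.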
Simulation procedure and pseudo-observed datasets
On the basis of two simple models of time-varying pop-
ulation size, we study how well the demographic history
of a given single population can be retrieved in a species
with seed bank. The estimation is based on approximate
Bayesian computation (ABC) (e.g. Beaumont et al. 2002)
using the mean of the absolute frequency spectrum
across loci as the set of summary statistics. In the follow-
ing, we use the absolute instead of the relative frequency
spectrum, as it captures information on the number of
segregating sites. Two situations are modelled. First, we
mimic the common situation where no information on
the seed bank is available, that is, seed banks are ignored
and the demographic coalescent inference rests on the
simple Wright–Fisher model with past population size
changes. Second, and for comparison, we assume that
the germination rate b is known, which corresponds to a
few rare cases (Tellier et al. 2011a). Here, the estimation
procedure is conducted taking the existence of seed
banks into account using a coalescence model with seed
banks and known values of b. The third possibility of
simultaneously estimating the demography and b is not
taken into account due to the various combinations of
demographic parameters and b-values, which result in
equivalent frequency spectra (Fig. 2).
The studied population experiences either an expo-
nential growth from previously constant population size
or an instantaneous population decline. The sequences
sampled from the population are hereafter denoted as
pseudo-observed data sets. Exponential growth is modelled
as above with three parameters: the time, $t_e$, at
which the population expansion starts, the growth rate,
R, and the population mutation rate, h. The ratio, d, of
the ancestral population size before the expansion and
the current population size is determined by $t_e$ and R
(Fig. 1). The decline model also has three parameters:
the time of decline, $t_d$, the ratio of the ancestral and the
current population size, d, and the population mutation
rate, h. Simulations are performed using a modified
version of the coalescent program ms (Hudson 2002) as
previously developed in Tellier et al. (2011a). The
coalescent simulator (C++ code available in the asso-
ciated Dryad patch) follows the expectations of the the-
oretical model as described in eqns 4–7 previously.
The pseudo-observed data sets are composed of a
sample of 20 chromosomes, sequenced at 1000 neutral
and independent loci without intra-locus recombination.
Such data sets capture the signatures of past demogra-
phy on the genome at multiple independent and
neutral loci and represent typically next-generation
sequencing or whole-genome data. These data sets are
simulated under the expansion model with parameters
$t_{e,\mathrm{obs}}$, $R_{\mathrm{obs}}$ and $h_{\mathrm{obs}}$ or under the decline model with
parameters $t_{d,\mathrm{obs}}$, $d_{\mathrm{obs}}$ and $h_{\mathrm{obs}}$, assuming a seed bank
with b-values of 0.1, 0.2, 0.3, 0.4, 0.5, 0.75 and 0.95,
mimicking long to very short times of dormancy. For
each b-value, 500 pseudo-observed data sets are gener-
ated by drawing values of the three model parameters
randomly from a uniform distribution (Table 1). Note
that the population mutation rate is identical for all
data sets ($h_{\mathrm{obs}} = 12.5$). This value is chosen (i) to generate
a sufficient number of segregating sites at each locus
to perform the statistical estimation procedure and (ii)
based on the very high observed genetic diversity in
wild tomato species exhibiting seed banks (Tellier et al.
2011a). Note that data sets with a smaller number of segregating
sites and/or loci will yield more ambiguous
results. Furthermore, the highly idealized simulation
conditions, which assume known mutation and germi-
nation rates, are chosen to narrow down the effect of
seed banks on the estimation of the demographic
parameters.

Table 1 Ranges of values for the set of pseudo-observed data under population expansion and decline. The b-scaled mutation rate,
$h_{b,\mathrm{obs}}$, is chosen so that the population mutation rate, $h_{\mathrm{obs}} = h_{b,\mathrm{obs}}/b$, has a fixed value of 12.5 per locus. The ranges of $t_{e,\mathrm{obs}}$ and
$t_{d,\mathrm{obs}}$ are obtained by dividing the ranges without germ bank (b = 1) by $b^2$

                    Expansion                 Decline
b       h_b,obs     R_obs    t_e,obs          d_obs    t_d,obs
0.1     1.25        0–5      20–400           1–20     20–1000
0.2     2.5         0–5      5–100            1–20     5–250
0.3     3.75        0–5      2.22–44.44       1–20     2.22–111.11
0.4     5           0–5      1.25–25          1–20     1.25–62.5
0.5     6.25        0–5      0.8–16           1–20     0.8–40
0.75    9.375       0–5      0.3556–7.11      1–20     0.3556–17.778
0.95    11.875      0–5      0.2216–4.432     1–20     0.2216–11.0803
1       12.5        0–5      0.2–4            1–20     0.2–10

Practically, the b-scaled mutation rate
varies for different b-values (Table 1) via the relationship
$h_{b,\mathrm{obs}} = b\,h_{\mathrm{obs}}$. This allows for a statistical comparison
of the results for the various b-values based on a
comparable average number of segregating sites across
all data sets. The growth rate, $R_{\mathrm{obs}}$, of the expansion
model ranges conservatively from 0 (no expansion) to 5
for all values of b. The time of expansion, $t_{e,\mathrm{obs}}$, varies
depending on the b-values with larger ranges for lower
values of b (Table 1). The rationale for the choice of these
ranges derives from the previous rescaling argument.
For the decline model, the ratio of the ancestral and the
present population size, $d_{\mathrm{obs}}$, ranges from 1 (no decline)
to 20. The time of decline, $t_{d,\mathrm{obs}}$, is also chosen depending
on the b-values (Table 1). Each pseudo-observed
data set is summarized as the absolute frequency spectrum
across 1000 loci using a combination of R and C++ code and the libsequence library (Thornton 2003).
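The ranges in Table 1 follow from the rescaling argument and can be regenerated with a short script. The Python sketch below is only an illustration of that bookkeeping, not the code used to produce the pseudo-observed data sets; the function names and dictionary keys are hypothetical. It scales the time ranges for b = 1 by $1/b^2$, sets the b-scaled mutation rate to $b \cdot 12.5$ and draws the two free parameters of each model uniformly.

```python
import random

H_OBS = 12.5                # population mutation rate per locus (fixed)
TE_RANGE_B1 = (0.2, 4.0)    # time of expansion without germ bank (b = 1)
TD_RANGE_B1 = (0.2, 10.0)   # time of decline without germ bank (b = 1)


def table1_row(b):
    """Parameter ranges of Table 1 for a given germination rate b."""
    scale = 1.0 / b ** 2
    return {
        "b": b,
        "h_b_obs": b * H_OBS,
        "R_obs": (0.0, 5.0),
        "te_obs": (TE_RANGE_B1[0] * scale, TE_RANGE_B1[1] * scale),
        "d_obs": (1.0, 20.0),
        "td_obs": (TD_RANGE_B1[0] * scale, TD_RANGE_B1[1] * scale),
    }


def draw_expansion_parameters(b, rng):
    """One uniform draw of (R, te) for the expansion model."""
    row = table1_row(b)
    return {"R": rng.uniform(*row["R_obs"]),
            "te": rng.uniform(*row["te_obs"]),
            "h_b": row["h_b_obs"]}


def draw_decline_parameters(b, rng):
    """One uniform draw of (d, td) for the decline model."""
    row = table1_row(b)
    return {"d": rng.uniform(*row["d_obs"]),
            "td": rng.uniform(*row["td_obs"]),
            "h_b": row["h_b_obs"]}


if __name__ == "__main__":
    rng = random.Random(2012)
    for b in (0.1, 0.2, 0.3, 0.4, 0.5, 0.75, 0.95, 1.0):
        print(table1_row(b))
    print(draw_expansion_parameters(0.5, rng))
```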
In the following, we estimate for each pseudo-
observed data set generated under population expan-
sion or decline (i) the demographic scenario, that is
expansion, constant size or decline, using the ABC
model choice procedure (e.g. Beaumont et al. 2002) and
(ii) the demographic parameters by means of the local
regression algorithm of the ABC (Excoffier et al. 2005).
Simulation step of the ABC
The simulation step of the ABC comprises 1 000 000
data sets without and with seed banks (for each of the
various b-values as noted previously), respectively, for
each demographic model (expansion, constant size and
decline) and the same number of sampled individuals
and independent neutral loci as for the pseudo-observed
data sets. Each demographic scenario is simulated
given a set of three parameters ($t_{e,\mathrm{sim}}$ or $t_{d,\mathrm{sim}}$, $R_{\mathrm{sim}}$
or $d_{\mathrm{sim}}$, $h_{\mathrm{sim}}$) each, randomly chosen from uniform prior
distributions (Table S1, Supporting information). The
ABCest program (Excoffier et al. 2005) is used to
retrieve the 2000 simulations with the smallest Euclid-
ean distance to the pseudo-observed data sets regarding
the absolute frequency spectrum across loci. The prior
distributions (Table S1, Supporting information) encom-
pass the range of values of the pseudo-observed data
sets (Table 1). To avoid a potential variability in the
number of segregating sites, the prior range of $h_{b,\mathrm{sim}}$ (Table S1, Supporting information) is adjusted in the
seed bank models according to the respective values of
$h_{b,\mathrm{obs}}$ in the pseudo-observed data sets (Table 1).
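The retention step, that is, keeping the simulations whose summary statistics lie closest to the pseudo-observed ones, can be sketched as follows. This Python fragment is a generic illustration and does not reproduce the ABCest program: the function names and the representation of a simulation as a (parameters, spectrum) pair are assumptions made here.

```python
import heapq
from math import sqrt


def euclidean(x, y):
    """Euclidean distance between two frequency spectra of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def closest_simulations(observed, simulations, keep=2000):
    """Return the `keep` simulations nearest to the observed spectrum.

    simulations is an iterable of (parameters, spectrum) pairs, where
    spectrum is the mean absolute frequency spectrum across loci.
    """
    return heapq.nsmallest(
        keep, simulations, key=lambda sim: euclidean(observed, sim[1]))


if __name__ == "__main__":
    observed = [3.0, 1.0]
    simulations = [({"te": 1}, [3.0, 1.0]),
                   ({"te": 2}, [0.0, 0.0]),
                   ({"te": 3}, [3.0, 2.0])]
    for parameters, spectrum in closest_simulations(observed, simulations, keep=2):
        print(parameters, spectrum)
```

Using `heapq.nsmallest` avoids sorting all simulations when only a small fraction is retained.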
Model choice procedure and performance for the chosen demographic scenarios
The model choice procedure rests on a weighted
multinomial logistic regression computed on the 2000
simulations closest to the pseudo-observed data
(Beaumont et al. 2002). Bayes factors are calculated as
the ratio of the posterior probabilities for an expansion
or a decline against the respective other models (Kass
& Raftery 1995). We record the Bayes factors for the
500 pseudo-observed data sets for both demographic
models with and without seed banks. The correct
model is considered to be chosen conservatively if
its Bayes factor is higher than five (Kass & Raftery
1995).
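The decision rule can be written down compactly once posterior model probabilities are available. The Python sketch below is one possible reading of the rule and not the implementation used here: it takes the posterior probabilities as given (however they were estimated), assumes equal prior probabilities for the three models so that ratios of posterior probabilities can be read as Bayes factors, and requires the ratio against each competing model to exceed the threshold. All function names are hypothetical.

```python
def bayes_factors(model, posteriors):
    """Ratios of the posterior probability of `model` to each other model.

    posteriors maps model names to posterior probabilities. With equal
    prior model probabilities (an assumption of this sketch), these
    ratios correspond to Bayes factors.
    """
    p = posteriors[model]
    return {other: (p / q if q > 0 else float("inf"))
            for other, q in posteriors.items() if other != model}


def chosen_conservatively(model, posteriors, threshold=5.0):
    """True if `model` beats every competing model by more than threshold."""
    return all(bf > threshold
               for bf in bayes_factors(model, posteriors).values())


if __name__ == "__main__":
    posteriors = {"expansion": 0.90, "constant": 0.08, "decline": 0.02}
    print(bayes_factors("expansion", posteriors))
    print(chosen_conservatively("expansion", posteriors))
```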
In at least 95% of the pseudo-observed data sets, the
occurrence of a past demographic expansion (against
constant size and decline models, Table S2, Supporting
information) is correctly estimated irrespective of taking
seed banks into account or not. On the other hand, for
a population decline, the correct model is recovered
well, when seed banks are ignored, but not as well
when the seed bank parameter is known (Table S2,
Supporting information). The genomic signature of a
past population expansion, such as an excess of low-
frequency derived polymorphisms in the frequency
spectrum, in a species with seed bank is thus not con-
founded with decline or constant population size mod-
els without seed bank. Similarly, the signature of a past
decline in the frequency spectrum appears to be distin-
guishable from that of the other two demographic mod-
els. In other words, even when ignoring the effect of a
seed bank, it is mostly possible to distinguish the signa-
tures of population expansion and decline with reason-
able certainty.
Method for parameter inference
The parameter estimation procedure is based on the
generated data from the simulation step of the ABC.
The parameters of the expansion model with a certain
seed bank parameter $b_0$, for example, are estimated
under an expansion model both without a germ bank and
with a germ bank characterized by $b_0$. Estimates of the
posterior distributions (mode and 95%-credibility intervals) of each of the
three model parameters ($t_{e,\mathrm{est}}$ or $t_{d,\mathrm{est}}$, $R_{\mathrm{est}}$ or $d_{\mathrm{est}}$ and
$h_{\mathrm{est}}$) are obtained by applying the locally weighted
multivariate regression method implemented in the ABCest
program (Beaumont et al. 2002) based on the 2000 data
sets closest to the 500 pseudo-observed data. We sum-
marize the accuracy of the parameter estimates by cal-
culating the relative error (RE), and the root mean
square error (RMSE), for each of the 500 pseudo-
observed data sets (e.g. Tellier et al. 2011b). The relative
error of the time of expansion, for example, is given by
$\mathrm{RE}_{t_e} = (t_{e,\mathrm{est}} - t_{e,\mathrm{obs}})/t_{e,\mathrm{obs}}$ with a negative and positive
value indicating that the parameter is under- and over-
estimated, respectively. The RMSE is the square root of
the average squared relative errors over #sim (here:
500) data sets. For the time of expansion, for example, it
is given by
$$\mathrm{RMSE}_{t_e} = \sqrt{\frac{1}{\#\mathrm{sim}} \sum \left( \frac{t_{e,\mathrm{est}} - t_{e,\mathrm{obs}}}{t_{e,\mathrm{obs}}} \right)^{2}},$$
where higher values of RMSE indicate a greater estima-
tion inaccuracy. For the presentation of the results and
the calculation of the RMSEs, the 10 most extreme val-
ues of RE are removed for each model parameter. This
allows us to reject unrealistic overestimates of the
observed value, which can be due to a boundary effect
in the prior or in the observed parameter values (Tellier
et al. 2011b).
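The two accuracy measures can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the article: the function names are ours, and we assume that "most extreme" means largest absolute relative error, which the text does not state explicitly.

```python
import math


def relative_errors(estimates, observed):
    """RE = (estimate - observed) / observed for each data set."""
    return [(est - obs) / obs for est, obs in zip(estimates, observed)]


def trim_extremes(re_values, n_remove=10):
    """Drop the n_remove values with the largest |RE|.
    Assumption: extremeness is judged by absolute value."""
    n_keep = max(len(re_values) - n_remove, 0)
    return sorted(re_values, key=abs)[:n_keep]


def rmse(re_values):
    """Square root of the mean squared relative error."""
    return math.sqrt(sum(r * r for r in re_values) / len(re_values))


# Toy example with three data sets (made-up numbers).
observed = [1.0, 2.0, 4.0]
estimates = [0.5, 2.0, 6.0]
re_values = relative_errors(estimates, observed)
print(re_values)        # negative: underestimate, positive: overestimate
print(rmse(re_values))
```

With 500 pseudo-observed data sets one would call `trim_extremes(re_values, 10)` before `rmse`, so that the RMSE is computed on the remaining 490 values.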
We compare here the accuracies of models with and
without germ banks in estimating the demographic
parameters from pseudo-observed sequence data with a
given germ bank. Note that the demographic model,
that is, a past decline or an expansion, of the pseudo-
observed data is assumed to be known, which allows
us to focus on the impact of the germ bank in the
parameter estimation procedure.
Inference of past expansion
When seed banks are disregarded, the expansion rate,
R, is overestimated (Fig. 3b) compared to the estimates
assuming the correct b-values (Fig. 3a). The values of
RE are higher for lower values of b and proportional to
$1/b^2$ (Fig. 3a, b), as also indicated by higher values of
RMSE for smaller b (Table S3, Supporting information).
Conversely, the time of expansion, te, is strongly under-
estimated when ignoring seed banks (Fig. 4b) compared
to the estimates with known seed bank (Fig. 4a). Esti-
mates of h are similarly accurate irrespective of taking
seed banks into account or not (Fig. S1, Table S3, Sup-
porting information), as this parameter has been scaled
by b, so that $h_b$ provides a similar number of segregating
sites for all models.
Based on our previous theoretical argument that
equivalent frequency spectra are obtained for arbitrary
values of b, te and R, when $t_e b^2$ and $R/b^2$ are fixed
(Fig. 1), the overestimation of the growth rate R, as
measured by RE and proportional to $1/b^2$, is expected
when seed banks are ignored. After rescaling the estimated
parameter values in Figs 3b and 4b, that is, multiplying
the estimated growth rate, $R_{\mathrm{est}}$, and dividing the time
of expansion, $t_{e,\mathrm{est}}$, by $b^2$, the values of RE and RMSE
are similar to those obtained with seed banks (Figs 3c
and 4c, Table S3, Supporting information). We confirm
therefore (i) that ignoring seed banks when estimating
the demographic parameters leads to an overestimation
of the growth rate by a factor of $1/b^2$ and an
underestimation of the time of expansion by a factor of
$b^2$ and that (ii) these errors are
not due to our statistical method or the choice of the
absolute frequency spectrum as the summary statistics.
Fig. 3 Relative errors for the growth rate, R, of a past population
expansion. The x-axis indicates the b-values under which
the pseudo-observed data are generated, that is, for a model
with germ bank. The relative error distribution over the 500
data sets is shown assuming (a) a model with germ bank and
parameters b equal to that of the pseudo-observed data set, (b)
a model without germ bank and (c) a model without germ
bank with values from (b) multiplied by $b^2$.
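The rescaling between the two parameterizations of an expansion can be written out as a short Python sketch. The function names are hypothetical and the numbers are made up; the sketch only encodes the mapping stated in the text, namely that the growth rate scales with $1/b^2$ and the time of expansion with $b^2$ when the seed bank is ignored.

```python
def _check_b(b):
    """Germination rate must lie in (0, 1]; b = 1 means no seed bank."""
    if not 0.0 < b <= 1.0:
        raise ValueError("germination rate b must be in (0, 1]")


def equivalent_without_seed_bank(R, te, b):
    """Expansion parameters that a model without seed bank would
    attribute to a seed bank model with rate R, time te and rate b."""
    _check_b(b)
    return R / b**2, te * b**2


def rescale_to_seed_bank(R_est, te_est, b):
    """Undo the bias: multiply the growth rate and divide the time
    of expansion by b^2, as done for panels (c) of Figs 3 and 4."""
    _check_b(b)
    return R_est * b**2, te_est / b**2


# Illustrative values (not taken from the article).
R_ns, te_ns = equivalent_without_seed_bank(R=10.0, te=0.2, b=0.5)
print(R_ns, te_ns)
print(rescale_to_seed_bank(R_ns, te_ns, b=0.5))
```

For b = 0.5 the growth rate seen by a model without seed bank is four times too high and the time of expansion four times too recent, and the second call recovers the original values.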
Inference of past decline
Demographic parameters for decline models are mises-
timated as shown in Figs S2, S3 and Table S4, Sup-
porting information. Note that the RMSE-values are
low because the decline ratio, d, and more so the time
of decline, td, are strongly underestimated irrespective
of the value of b in models with and without seed
banks. The population mutation rate, h, is most accu-
rately estimated, and in the model with seed bank h is
slightly more overestimated for lower b-values (Fig. S4a, Supporting information). Finally, rescaling
the parameter estimates of the model without seed
bank via $b^2$ does not improve the results in contrast to
the expansion model. This points out the insufficiency
of the allelic spectrum for the estimation of decline
parameters, as (i) declines show a large variance in
polymorphism patterns among loci in contrast to
expansion scenarios (Živkovic & Wiehe 2008) and (ii)
relatively old events become barely distinguishable
from the basic model.
A more complex model
We define here a model of a population bottleneck that
unifies a population expansion and an instantaneous
decline with fixed parameters (Fig. 5a). We study the
signatures of such a bottleneck on the relative allele
frequencies, ri, depending on the b-values (Fig. 5b).
Without seed dormancy (b = 1), the expansion phase
appears more evident than the rather old decline by an
excess of low-frequency alleles relative to the basic
model of a constant population size. Both demographic
events—the expansion and the instantaneous decline—
are visible in terms of the relative allele frequencies, ri,
for b = 0.6, as there is an excess of low- and high-
frequency derived alleles relative to the basic model. As
the probability of coalescence decreases as b becomes
smaller, the expansion phase is not detectable anymore
in the frequency spectrum for b = 0.2, and only the
instantaneous decline can be observed by an excess of
alleles in intermediate to high frequencies. This effect is
enhanced with larger ancestral population sizes. Simi-
larly, if one decreases the duration of the bottleneck for
small germination rates, b, even the decline becomes
harder to detect, as the relative allele frequencies, ri,
approach those of the basic model again. In conclusion,
for small values of b, recent or short-lived demographic
changes may be barely accessible to estimation, or not
at all, whereas old events leave signatures for longer
than in models without seed bank.
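For concreteness, the piecewise population size function of the bottleneck model can be sketched in Python using the parameter values given in the caption of Fig. 5. The function names are ours; time is measured backward from the present in the units of the article, and the returned size is relative to the present size.

```python
import math


def expansion_onset(R=20.0, d1=0.1):
    """Time te at which exponential growth (looking backward) has
    shrunk the relative size from 1 to d1: te = -log(d1) / R."""
    return -math.log(d1) / R


def bottleneck_size(t, R=20.0, d1=0.1, d2=0.5, td=0.6):
    """Relative population size at time t in the past:
    exp(-R t) for 0 <= t < te, d1 for te <= t < td, d2 for t >= td."""
    if t < 0:
        raise ValueError("t is measured backward in time and must be >= 0")
    te = expansion_onset(R, d1)
    if t < te:
        return math.exp(-R * t)
    if t < td:
        return d1
    return d2


print(round(expansion_onset(), 2))  # te for R = 20 and d1 = 0.1
for t in (0.0, 0.05, 0.3, 1.0):
    print(t, bottleneck_size(t))
```

Read forward in time, the population declines instantaneously from relative size 0.5 to 0.1 at td = 0.6, stays small until te, and then grows exponentially to its present size.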
Fig. 4 Relative errors for the time, te, of a past population
expansion. The x-axis indicates the b-values under which the
pseudo-observed data are generated, that is, for a model with
germ bank. The relative error distribution over the 500 data
sets is shown assuming (a) a model with germ bank and
parameters b equal to that of the pseudo-observed data set, (b)
a model without germ bank and (c) a model without germ
bank with values from (b) divided by $b^2$.
Discussion
We study here the impact of the simultaneous occur-
rence of germ dormancy and simple deterministic mod-
els of time-varying population size on neutral
polymorphism patterns. First, based on the results of
Kaj et al. (2001), we obtain the probability distribution
of the ancestral process for a sample of size n, the mean
waiting times between coalescence events and the fre-
quency spectra for various demographic models. We
show then in our simulation study that the model
choice procedure of the ABC retrieves quite well the
correct model between a population expansion and an
instantaneous decline, irrespective of whether the seed
bank is included. However, the germination rate, b, is
shown to have a substantial impact on the parameter
estimation. This follows our theoretical result for the
case of a past population expansion that a seed bank
model with germination rate, b, and growth from time
te at rate R leaves an equivalent polymorphism pattern
as the case without seed banks, where growth starts
more recently at $b^2 t_e$ at a higher rate $R/b^2$. We
conclude when studying a more complex bottleneck model
that for small values of b, recent or short-lived
demographic changes may be barely accessible to
estimation, or not at all, whereas old events leave
signatures for longer than in models without seed bank.
Moreover, if complex
demographic scenarios such as population bottlenecks
appear biologically relevant for many species, their
inference may be impossible without a priori informa-
tion on the germination rate (Fig. 5).
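The equivalence for the expansion model can be made plausible with a short calculation. The following is a sketch under a stated assumption, not a restatement of the article's derivation: we assume that, following Kaj et al. (2001), a seed bank with germination rate b multiplies the pairwise coalescence rate by $b^2$, so that the rate at time t in the past is $b^2/q(t)$ for relative population size $q(t) = \exp(-Rt)$. The symbols $\Lambda_b$, $R'$ and $t'$ are introduced here for illustration only.

```latex
% Cumulative coalescence intensity with seed bank (assumed rate b^2 / q(u)):
\Lambda_b(t) = b^2 \int_0^t \frac{du}{q(u)}
             = b^2 \int_0^t e^{Ru}\, du
             = \frac{b^2}{R}\left(e^{Rt} - 1\right).

% Without seed bank (b = 1), growth rate R' = R / b^2, evaluated at t' = b^2 t:
\Lambda_1(t') = \frac{1}{R'}\left(e^{R' t'} - 1\right)
              = \frac{b^2}{R}\left(e^{Rt} - 1\right)
              = \Lambda_b(t).
```

Under this assumption the genealogies of the two models coincide once time is rescaled by $b^2$, so the onset of growth maps from $t_e$ to $b^2 t_e$ and the rate from $R$ to $R/b^2$. Because a constant rescaling of time multiplies all branch lengths by the same factor, the relative frequency spectrum is unchanged.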
The recent burst of genomic data available for many
taxa is prompting the need for further refinements of
the population genetics theory based on the classic
Wright–Fisher model. A general problem is to quantify
the extent to which we can violate assumptions, that is,
ignore the ecological reality, when estimating past
demography. Germ banking or overlapping of genera-
tions, changes in population size and spatial structuring
are realistic assumptions common to many species with
potential consequences for inference. For example, over-
lapping generations in combination with varying popu-
lation size are shown to generate deviations of the
molecular clock from expected patterns of neutrality
(Balloux & Lehmann 2012). We show here that age
structure due to seed banks is a common factor to
account for in population genetics analysis, when seed
dormancy is a bet-hedging strategy and the germination
rate is lower than 0.5 (Figs 3 and 4; Evans et al. 2007;
Tielborger et al. 2012). Germ banks and spatial structur-
ing lead in principle to a similar departure from the
assumption of random mating of the usual Wright–
Fisher model, as there is a separation of individuals
either into different age classes (Charlesworth 1994; Kaj
et al. 2001; Nunney 2002) or into different spatial demes
(Charlesworth et al. 2003). Furthermore, ignoring the
effect of seed banks in spatially structured populations
may lead to misinterpretations of the amount of genetic
differentiation among demes, local effective population
sizes (Vitalis et al. 2004) and demographic changes
within the metapopulations. Another possible ecologi-
cally realistic assumption, but further complication, is
that some species may exhibit long-term seed banks
only in parts of their range. Examples and exhaustive
studies are so far lacking, but in Arabidopsis species,
long-term seed banks may only be prevalent in north-
ern European populations (Lundemo et al. 2009; Falah-
ati-Anbaran et al. 2011). Seed banks may thus not affect
estimates of the whole species’ past demography
(Francois et al. 2008), but may be important for
understanding the demography and local adaptation in
northern populations. Finally, we suggest that the statistical
inference based on SNP data of evolutionary parameters
in speciation scenarios, for example, the so-called
isolation with migration model (Wakeley & Hey 1997;
Tellier et al. 2011a), may also be affected by long-term
seed banks.
Fig. 5 The underlying demography (a) is a bottleneck model,
that is, $q(t) = \exp(-Rt)$ for $0 \le t < t_e$, $q(t) = d_1$ for
$t_e \le t < t_d$ and $q(t) = d_2$ for $t_d \le t$, with parameter
combination $R = 20$, $d_1 = 0.1$, $t_d = 0.6$ and $d_2 = 0.5$. $t_e$ is
determined by R as $t_e = -\log(d_1)/R$, such that $t_e \approx 0.12$.
Relative frequency spectra, $r_i$, (b) for this demographic model
and germination rates as given in the legend box are illustrated.
In some cases, the inferred neutral null model of
demography with seed banks may serve as a basis to
study the rate of genetic adaptation to biotic and abiotic
environments and the genes under natural selection.
We advocate here also that taking seed banks into
account may be essential to infer the existence of selec-
tion. In the case of balancing selection, the interaction
of a persistent seed bank and temporally fluctuating
selection promotes the maintenance of stable polymor-
phism (Turelli et al. 2001; Tellier & Brown 2009). For
example, seed banks explained the stable single-locus
polymorphism for flower colour found in Linanthus par-
ryae (Turelli et al. 2001), whereas ignoring their effect
led to erroneous evolutionary inference (Schemske &
Bierzychudek 2001). Concerning positive selection, seed
banks slow down the rate of selection (Hairston & De
Stasio 1988), that is, positively selected alleles take
longer to fix, and seed banks decrease the rate of local
adaptation in spatially structured populations. In plant
species, recent positive selection may then not be
detectable in sequence data due to the existence of seed
banks (e.g. Gossmann et al. 2010) similarly as in the
case of a recent strong population expansion.
For simplicity, we have estimated in this study the
past demography of a population assuming a known
germination parameter, b, or the known absence of seed
banks. However, in reality, the presence of seed banks
and values of the germination rates are often unknown.
The first option to perform statistical inference of past
events lies in the simultaneous estimation of past
demography and b based, for example, on the fre-
quency spectrum at numerous neutral or reference loci.
However, we show that different combinations of
demographic parameters and b-values may result in
similar genomic signatures (Fig. 2). To circumvent this
difficulty, information on the above-ground population
based on ecological observations is needed and has to
be integrated to define the priors of the population size.
This follows from the results that smaller values of b
increase the observed nucleotide diversity (Kaj et al.
2001; Nunney 2002; Tellier et al. 2011a). A corollary is
that all hypotheses, which could explain an increase in
the observed nucleotide diversity compared to expecta-
tions based on population sizes, should be accounted
for in the model. It is, for example, crucial to incorpo-
rate spatial structuring of populations and limited gene
flow among demes in the population models because (i)
spatial structure may increase genetic diversity com-
pared to expectations based on census sizes and num-
ber of demes (e.g. Charlesworth et al. 2003) and (ii) low
germination rates decrease the genetic differentiation
among demes (Vitalis et al. 2004). The usefulness of
such an approach was recently demonstrated in wild
tomato species, where metapopulation structure is a
key evolutionary factor (Tellier et al. 2011a).
The second option relies on field observations and
measurement of germination rates, which can be used
to define priors on b-values in the model for inference.
The germination rate and dormancy of seeds are, how-
ever, determined by the interactions of genetic (Bent-
sink et al. 2010) as well as physical, climatic and
ecological factors (Fenner & Thompson 2004). Disentan-
gling the influence of these factors on population
dynamics is a key requirement to demonstrate that seed
banks are bet-hedging strategies (e.g. Evans et al. 2007;
Tielborger et al. 2012). Such studies have thus generated
collections of plants and seeds at different points in
time and ecological surveys on population census sizes
(Honnay et al. 2008). We suggest that these data can be
combined with nucleotide sequences and analysed with
new statistical methods of inference as for instance the
ABC procedure, to reveal the evolutionary importance
of long-term germ banks in plants as well as in inverte-
brate and bacterial species.
Acknowledgements
The authors would like to thank Wolfgang Stephan for valu-
able comments on this article. This research was supported by
grant I/84232 from the Volkswagen Foundation to D.Z. and
grant HU1776/1 from the Deutsche Forschungsgemeinschaft to
Stephan Hutter and A.T.
References
Balloux F, Lehmann L (2012) Substitution rates at neutral genes
depend on population size under fluctuating demography
and overlapping generations. Evolution, 66, 605–611.
Beaumont MA, Zhang W, Balding DJ (2002) Approximate
Bayesian computation in population genetics. Genetics, 162,
2025–2035.
Bentsink L, Hanson J, Hanhart C, et al. (2010) Natural variation for
seed dormancy in Arabidopsis is regulated by additive genetic
and molecular pathways. Proceedings of the National Academy of
Sciences of the United States of America, 107, 4264–4269.
Brown JH, Kodric-Brown A (1977) Turnover rates in insular
biogeography: effect of immigration on extinction. Ecology,
58, 445–449.
Charlesworth B (1994) Evolution in Age-Structured Popula-
tions. Cambridge University Press, Cambridge, UK.
Charlesworth B, Charlesworth D, Barton NH (2003) The effects
of genetic and geographic structure on neutral variation.
Annual Review of Ecology, Evolution and Systematics, 34, 99–125.
Chikhi L, Sousa VC, Luisi P, Goossens B, Beaumont MA
(2010) The confounding effects of population structure,
genetic diversity and the sampling scheme on the detection
and quantification of population size changes. Genetics, 186,
983–995.
Decaestecker E, Gaba S, Raeymaekers JAM, et al. (2007) Host–
parasite ‘red queen’ dynamics archived in pond sediment.
Nature, 450, 870–873.
Evans MEK, Dennehy JJ (2005) Germ banking: bet-hedging and
variable release from egg and seed dormancy. The Quarterly
Review of Biology, 80, 431–451.
Evans MEK, Ferriere R, Kane MJ, Venable DL (2007) Bet hedging
via seed banking in desert evening primroses (Oenothera,
Onagraceae): demographic evidence from natural populations.
American Naturalist, 169, 184–194.
Excoffier L, Estoup A, Cornuet JM (2005) Bayesian analysis of
an admixture model with mutations and arbitrarily linked
markers. Genetics, 169, 1727–1738.
Falahati-Anbaran M, Lundemo S, Agren J, Stenøien H (2011)
Genetic consequences of seed banks in the perennial herb
Arabidopsis lyrata subsp. petraea (Brassicaceae). American Jour-
nal of Botany, 98, 1475–1485.
Fenner M, Thompson K (2004) The Ecology of Seeds. Cam-
bridge University Press, Cambridge, UK.
Fisher RA (1930) The Genetical Theory of Natural Selection.
Clarendon Press, Oxford.
Francois O, Blum M, Jakobsson M, Rosenberg N (2008) Demo-
graphic history of European populations of Arabidopsis thali-
ana. PLoS Genetics, 4, e1000075.
Gossmann TI, Song BH, Windsor AJ, et al. (2010) Genome wide
analyses reveal little evidence for adaptive evolution in many
plant species. Molecular Biology and Evolution, 27, 1822–1832.
Griffiths RC, Tavare S (1994) Sampling theory for neutral
alleles in a varying environment. Philosophical Transactions of
the Royal Society B: Biological Sciences, 344, 403–410.
Griffiths RC, Tavare S (1998) The age of a mutation in a gen-
eral coalescent tree. Stochastic Models, 14, 273–295.
Hairston Jr NG, De Stasio Jr BT (1988) Rate of evolution slo-
wed by a dormant propagule pool. Nature, 336, 239–242.
Hirschfeld L, Hirschfeld H (1919) Serological differences
between the blood of different races - the result of researches
on the Macedonian front. Lancet, 2, 675–679.
Honnay O, Bossuyt B, Jacquemyn H, Shimono A, Uchiyama K
(2008) Can a seed bank maintain the genetic variation in the
above ground plant population? Oikos, 117, 1–5.
Hudson RR (2002) Generating samples under a Wright–Fisher
neutral model of genetic variation. Bioinformatics, 18, 337–338.
Kaj I, Krone SM (2003) The coalescent process in a population
of stochastically varying size. Journal of Applied Probability,
40, 33–48.
Kaj I, Krone SM, Lascoux M (2001) Coalescent theory for seed
bank models. Journal of Applied Probability, 38, 285–300.
Kass RE, Raftery AE (1995) Bayes factors. Journal of the American
Statistical Association, 90, 773–795.
Kimura M (1955) Random genetic drift in multi-allelic locus.
Evolution, 9, 419–435.
Kimura M (1969) The number of heterozygous nucleotide sites
maintained in a finite population due to steady flux of muta-
tions. Genetics, 61, 893–903.
Kingman JFC (1982) On the genealogy of large populations.
Journal of Applied Probability, 19A, 27–43.
Lennon JT, Jones SE (2011) Microbial seed banks: the ecological
and evolutionary implications of dormancy. Nature Reviews
Microbiology, 9, 119–130.
Levin DA (1990) The seed bank as a source of genetic novelty
in plants. American Naturalist, 135, 563–572.
Lundemo S, Falahati-Anbaran M, Stenøien H (2009) Seed banks
cause elevated generation times and effective population
sizes of Arabidopsis thaliana in northern Europe. Molecular
Ecology, 18, 2798–2811.
Nordborg M, Krone S (2002) Separation of time scales and
convergence to the coalescent in structured populations. In:
Modern Developments in Theoretical Population Genetics: The
Legacy of Gustave Malecot (eds Slatkin M and Veuille M), pp.
194–232. Oxford University Press, Oxford, UK.
Nunney L (2002) The effective size of annual plant popula-
tions: the interaction of a seed bank with fluctuating popula-
tion size in maintaining genetic variation. American
Naturalist, 160, 195–204.
Olivieri GL, Sousa V, Chikhi L, Radespiel U (2008) From
genetic diversity and structure to conservation: genetic sig-
nature of recent population declines in three mouse lemur
species (Microcebus spp.). Biological Conservation, 141,
1257–1271.
Schemske DW, Bierzychudek P (2001) Evolution of flower color
in the desert annual Linanthus parryae: Wright revisited. Evo-
lution, 55, 1269–1282.
Slatkin M, Hudson RR (1991) Pairwise comparisons of
mitochondrial DNA sequences in stable and exponentially
growing populations. Genetics, 129, 555–562.
Stephan W (2010) Detecting strong positive selection in the
genome. Molecular Ecology Resources, 10, 863–872.
Tavare S (1984) Line-of-descent and genealogical processes,
and their application in population genetics model.
Theoretical Population Biology, 26, 119–164.
Tellier A, Brown JKM (2009) The influence of perenniality and
seed banks on polymorphism in plant-parasite interactions.
American Naturalist, 174, 769–779.
Tellier A, Laurent SJY, Lainer H, Pavlidis P, Stephan W (2011a)
Inference of seed bank parameters in two wild tomato spe-
cies using ecological and genetic data. Proceedings of the
National Academy of Sciences of the United States of America,
108, 17052–17057.
Tellier A, Pfaffelhuber P, Haubold B, et al. (2011b) Estimating
parameters of speciation models based on refined summa-
ries of the joint site-frequency spectrum. PLoS ONE, 6,
e18155.
Templeton AR, Levin DA (1979) Evolutionary consequences of
seed pools. American Naturalist, 114, 232–249.
Thornton K (2003) libsequence: a C++ class library for
evolutionary genetic analysis. Bioinformatics, 19, 2325–2327.
Tielborger K, Petrů M, Lampei C (2012) Bet-hedging germination
in annual plants: a sound empirical test of the theoretical
foundations. Oikos, doi:10.1111/j.1600-0706.2011.20236.x.
Turelli M, Schemske DW, Bierzychudek P (2001) Stable two-
allele polymorphisms maintained by fluctuating fitnesses
and seed banks: protecting the blues in Linanthus parryae.
Evolution, 55, 1283–1298.
Vitalis R, Glemin S, Olivieri I (2004) When genes go to
sleep: the population genetic consequences of seed dor-
mancy and monocarpic perenniality. American Naturalist,
163, 295–311.
Wakeley J, Hey J (1997) Estimating ancestral population
parameters. Genetics, 145, 847–855.
Watterson GA (1984) Allele frequencies after a bottleneck. Theo-
retical Population Biology, 26, 387–407.
Wright S (1931) Evolution in Mendelian populations. Genetics,
16, 97–159.
Živkovic D, Stephan W (2011) Analytical results on the neutral
non-equilibrium allele frequency spectrum based on diffu-
sion theory. Theoretical Population Biology, 79, 184–191.
Živkovic D, Wiehe T (2008) Second-order moments of segregating
sites under variable population size. Genetics, 180, 341–357.
D.Z. is a postdoctoral researcher at LMU Munich. His research
focuses on the enhancement of coalescent and diffusion theory
regarding the inclusion of ecologically realistic assumptions
such as varying population size, seed banks and natural selec-
tion. A.T. is a professor of population genetics at TUM and
focuses on combining theoretical approaches with the use of
new sequencing technologies, for studying seed banks and
plant-pathogen coevolution.
Data accessibility
The codes used in this article are archived on Dryad
(doi:10.5061/dryad.7kp90).
Supporting information
Additional Supporting Information may be found in the online ver-
sion of this article.
Table S1 Ranges of values for the ABC priors under popula-
tion expansion and decline, which encompass the correspond-
ing ranges of the pseudo-observed datasets (Table 1).
Table S2 Results of the model choice procedure for population
expansion and decline.
Table S3 Root mean square errors for the estimates of the
expansion model parameters h, R and te assuming a model
without or with germ banks and the correct b-value.
Table S4 Root mean square errors for the estimates of the
decline model parameters h, d and td assuming a model without
or with germ banks and the correct b-value.
Fig. S1 Relative errors for the population mutation rate, h, of
a past population expansion. The x-axis indicates the b-values
under which the observed data were generated, that is, for a
model with germ bank.
Fig. S2 Relative errors for the decline ratio, d, of a past population
decline. The x-axis indicates the b-value under which the
observed data were generated, that is, for a model with germ
bank.
Fig. S3 Relative errors for the time of decline, td, of a past population
decline.
Fig. S4 Relative errors for the population mutation rate, h, of
a past population decline.
Please note: Wiley-Blackwell are not responsible for the content
or functionality of any supporting materials supplied by the
authors. Any queries (other than missing material) should be
directed to the corresponding author for the article.