Linear Algebra of
Eigen’s Quasispecies Model
Artem Novozhilov
Department of Mathematics
North Dakota State University
Midwest Mathematical Biology Conference,
University of Wisconsin — La Crosse
May 17, 2014
Artem Novozhilov (NDSU) May 17, 2014 1 / 29
Collaborators:
Yuri S. Semenov
Moscow State University of Railway Engineering, Moscow, Russia
Alexander S. Bratus
Lomonosov Moscow State University, Moscow, Russia
Outline:
◮ Motivation and historic outline
◮ Mathematical problem
◮ Known results
◮ New results
M. Eigen, Naturwissenschaften, 58(10), 1971: 465–523
Quasispecies theory:
Manfred Eigen, born 1927
◮ M. Eigen, Naturwissenschaften, 58(10), 1971: 465–523
◮ M. Eigen, P. Schuster, The Hypercycle, Springer, 1979
◮ M. Eigen, J. McCaskill, P. Schuster, J Phys Chem, 92(24), 1988: 6881-6891
Mathematical side of the story:
Ref: M. Eigen, J. McCaskill, P. Schuster, J Phys Chem, 92(24), 1988: 6881-6891
Model statement:
Consider a population of sequences of fixed length $N$ over a two-letter alphabet, say $\{0, 1\}$; there are therefore $2^N$ different sequences.
The population is subject to two evolutionary forces. The first evolutionary force is selection, which is included in the system through the Malthusian fitness, defined here for simplicity as
$$m(\text{particular sequence } \sigma) = m(H_\sigma),$$
where $H_\sigma$ is the Hamming norm of this sequence, i.e., the number of 1s in the sequence $\sigma$. In this way we do not distinguish between sequences with the same number of 1s and hence reduce the dimensionality of the problem from $2^N \times 2^N$ to $(N+1) \times (N+1)$. Hence, we consider at this point only permutation invariant fitness landscapes
$$M = \operatorname{diag}(m_0, \ldots, m_N) \quad \text{or} \quad m = (m_0, \ldots, m_N)^\top.$$
Model statement:
The second evolutionary force is mutation.
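In particular, assuming $N + 1$ classes of sequences, the mutation rates $\mu_{ij}$ (i.e., the mutation rate from class $j$ to class $i$) can be described by the matrix
$$\mathcal{M} = (\mu_{ij}) = \mu Q = \mu
\begin{pmatrix}
-N & 1 & 0 & 0 & \cdots & \cdots & 0 \\
N & -N & 2 & 0 & \cdots & \cdots & 0 \\
0 & N-1 & -N & 3 & \cdots & \cdots & 0 \\
0 & 0 & N-2 & -N & \cdots & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & 0 \\
0 & 0 & \cdots & \cdots & 2 & -N & N \\
0 & 0 & \cdots & \cdots & 0 & 1 & -N
\end{pmatrix},$$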
where µ is the mutation rate per site per sequence per replication event.
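The tridiagonal structure of $Q$ can be made concrete with a short sketch. The function name `build_Q` and the choice $N = 4$ are illustrative assumptions, not part of the talk; the entries follow the matrix displayed above (diagonal $-N$, entry $j$ above the diagonal in column $j$, entry $N - j$ below it).

```python
import numpy as np

def build_Q(N):
    """Mutation matrix Q_N of size (N+1) x (N+1) acting on Hamming classes 0..N.

    Hypothetical helper for illustration; column j describes the flow out of class j.
    """
    Q = np.zeros((N + 1, N + 1))
    for j in range(N + 1):
        Q[j, j] = -N                 # total outflow from class j (N sites can mutate)
        if j > 0:
            Q[j - 1, j] = j          # one of the j ones mutates to zero
        if j < N:
            Q[j + 1, j] = N - j      # one of the N - j zeros mutates to one
    return Q

Q = build_Q(4)
print(Q)
print(Q.sum(axis=0))                 # every column sums to zero
```

The zero column sums express that mutation only moves sequences between classes, which is what keeps the frequencies normalized in the model stated next.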
Model statement:
Let $p(t)$ denote the vector of frequencies of the different classes of sequences. Then, assuming uncoupled reproduction and mutation events, we arrive at
$$\dot{p}(t) = (M + \mu Q)\,p(t) - \bar{m}(t)\,p(t),$$
where
$$\bar{m}(t) = m \cdot p(t) = \sum_{i=0}^{N} m_i\,p_i(t)$$
is the mean population fitness.
This model is often called the paramuse or Crow–Kimura quasispecies model with a permutation invariant fitness landscape.
Ref: Baake and Gabriel, Annual Reviews of Computational Physics VII, 1999: 203–264
Ref: Crow and Kimura, An introduction to population genetics theory, 1970
Elementary results:
The asymptotic behavior of the quasispecies model is determined by the equilibrium $p = \lim_{t\to\infty} p(t)$, which solves the eigenvalue problem
$$(M + \mu Q)\,p = \bar{m}\,p,$$
where $\bar{m} = m \cdot p$.
By the Perron–Frobenius theorem, there is a unique positive solution $p > 0$, which is the right eigenvector of $M + \mu Q$ corresponding to the simple real dominant eigenvalue $\lambda = \bar{m}$.
This vector $p$ was called by Eigen the quasispecies. It is globally stable for the quasispecies system. We are mostly interested in the properties of $\bar{m}$ and $p$ as functions of the fitness landscape $M$ and the mutation rate $\mu$; therefore, we use the notation $\bar{m} = \bar{m}(\mu)$ and $p = p(\mu)$ for the mean fitness and the equilibrium distribution.
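As a minimal numerical sketch of this eigenvalue problem, the quasispecies can be computed as the eigenvector of $M + \mu Q$ belonging to the eigenvalue with the largest real part. The helper names, the landscape `m`, and the value of `mu` below are hypothetical choices for illustration only.

```python
import numpy as np

def build_Q(N):
    # Tridiagonal mutation matrix: -N on the diagonal, j above it, N - j below it.
    j = np.arange(N + 1)
    return np.diag(-float(N) * np.ones(N + 1)) + np.diag(j[1:], 1) + np.diag(N - j[:-1], -1)

def quasispecies(m, mu):
    """Return (mean fitness, equilibrium p) as the dominant eigenpair of M + mu*Q."""
    m = np.asarray(m, dtype=float)
    A = np.diag(m) + mu * build_Q(len(m) - 1)
    w, V = np.linalg.eig(A)
    k = np.argmax(w.real)            # dominant eigenvalue
    p = V[:, k].real
    p = p / p.sum()                  # fixes both the sign and the normalization
    return w[k].real, p

m = [3.0, 1.0, 1.0, 1.0, 1.0]        # hypothetical landscape with N = 4
mbar, p = quasispecies(m, mu=0.2)
print(mbar, np.dot(m, p))            # the eigenvalue equals the mean fitness m . p
print(p)
```

The two printed numbers agree because the columns of $Q$ sum to zero, so summing the rows of the eigenvalue equation gives $m \cdot p = \lambda$ for a normalized $p$.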
Known results:
◮ Thompson and McBride, Math Biosciences, 21: 127–142, 1974
The quasispecies model is essentially linear.
◮ Rumschitzki, J of Math Biol, 24: 667–680, 1987
For zero epistasis fitness landscapes the spectral properties of the Eigen evolutionary matrices can be inferred from the representation of the fitness matrix $M$ and the mutation matrix $\mathcal{M}$ using tensor products.
◮ Swetina and Schuster, Bioph Chem, 16: 329–345, 1982
Numerical analysis of the single peaked fitness landscape yields the error threshold.
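The first item, that the model is essentially linear, can be illustrated as follows: if $z(t)$ solves the linear system $\dot{z} = (M + \mu Q)z$, then $p(t) = z(t)/\sum_i z_i(t)$ solves the quasispecies equation. The sketch below assumes a small hypothetical example ($N = 8$, $m_i = i$, $\mu = 0.3$) and solves the linear system through an eigendecomposition, which is an implementation choice of this example rather than anything prescribed by the cited paper.

```python
import numpy as np

def build_Q(N):
    # Tridiagonal mutation matrix: -N on the diagonal, j above it, N - j below it.
    j = np.arange(N + 1)
    return np.diag(-float(N) * np.ones(N + 1)) + np.diag(j[1:], 1) + np.diag(N - j[:-1], -1)

def flow(A, p0, t):
    """Solve z' = A z via the eigendecomposition of A, then normalize z to get p(t)."""
    w, V = np.linalg.eig(A)
    coeffs = np.linalg.solve(V, p0)
    # A common rescaling of all modes cancels after normalization and avoids overflow.
    z = (V * np.exp((w - w.real.max()) * t)) @ coeffs
    z = z.real
    return z / z.sum()

N, mu = 8, 0.3                        # hypothetical parameters
m = np.arange(N + 1, dtype=float)     # hypothetical additive landscape m_i = i
A = np.diag(m) + mu * build_Q(N)
p0 = np.zeros(N + 1)
p0[0] = 1.0                           # start with all sequences in class 0
p_late = flow(A, p0, 40.0)

w, V = np.linalg.eig(A)
p_eq = V[:, np.argmax(w.real)].real
p_eq = p_eq / p_eq.sum()
print(np.abs(p_late - p_eq).max())    # distance between p(40) and the dominant eigenvector
```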
Known results: The error threshold
Ref: Swetina and Schuster, Bioph Chem, 16: 329–345, 1982
Consider the single peaked fitness landscape
M = diag(m0, 0, . . . , 0), m0 > 0.
[Figure: $p$ versus $\mu$ for the single peaked fitness landscape, with $\mu$ ranging from 0 to 0.60.]
Known results: Statistical Physics
◮ Leuthäusser, I. (1986). An exact correspondence between Eigen's evolution model and a two-dimensional Ising system. The Journal of Chemical Physics, 84(3), 1884-1885.
◮ Tarazona, P. (1992). Error thresholds for molecular quasispecies as phase transitions: From simple landscapes to spin-glass models. Physical Review A, 45(8), 6038.
◮ Baake, E., Baake, M., & Wagner, H. (1997). Ising quantum chain is equivalent to a model of biological evolution. Physical Review Letters, 78(3), 559-562.
◮ Galluccio, S. (1997). Exact solution of the quasispecies model in a sharply peaked fitness landscape. Physical Review E, 56(4), 4526.
◮ Baake, E., & Wagner, H. (2001). Mutation-selection models solved exactly with methods of statistical mechanics. Genetical Research, 78(01), 93-117.
◮ Saakian, D. B., & Hu, C. K. (2006). Exact solution of the Eigen model with general fitness functions and degradation rates. Proceedings of the National Academy of Sciences of the USA, 103(13), 4935-4939.
Known results: Maximum principle
◮ Hermisson, J., Redner, O., Wagner, H., & Baake, E. (2002). Mutation-selection balance: Ancestry, load, and maximum principle. Theoretical Population Biology, 62(1), 9-46.
◮ Baake, E., & Georgii, H. O. (2007). Mutation, selection, and ancestry in branching models: a variational approach. Journal of Mathematical Biology, 54(2), 257-303.
Assume that $m_i = N r_i = N r(x_i)$, $x_i = i/N \in [0, 1]$, and define
$$g(x) = \mu\left(1 - 2\sqrt{x(1 - x)}\right).$$
Then the mean fitness $\bar{m}(\mu) = N\bar{r}$ is given by
$$\bar{r} \approx r_\infty = \sup_{x\in[0,1]} \bigl(r(x) - g(x)\bigr).$$
Here $r(x)$ may have only a finite number of discontinuities and must be either left or right continuous at every point.
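The maximum principle can be compared with a direct eigenvalue computation. The sketch below uses the hypothetical linear fitness function $r(x) = 1 - x$ with $N = 100$ and $\mu = 0.4$, chosen here only because the two numbers should agree closely for it. The dominant eigenvalue is obtained from a symmetric tridiagonal matrix similar to $M + \mu Q$; this symmetrization is an implementation choice of the example, not something stated on the slides.

```python
import numpy as np

def dominant_eigenvalue(m, mu):
    """Largest eigenvalue of M + mu*Q, computed from a similar symmetric matrix."""
    m = np.asarray(m, dtype=float)
    N = len(m) - 1
    i = np.arange(N)
    # sqrt of the product of the (i, i+1) and (i+1, i) entries of mu*Q
    off = mu * np.sqrt((i + 1.0) * (N - i))
    S = np.diag(m - mu * N) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(S)[-1]          # eigenvalues are returned in ascending order

def max_principle(r, mu, grid=20001):
    """Approximate sup over [0, 1] of r(x) - g(x) on a uniform grid."""
    x = np.linspace(0.0, 1.0, grid)
    g = mu * (1.0 - 2.0 * np.sqrt(x * (1.0 - x)))
    return np.max(r(x) - g)

def r(x):
    return 1.0 - x                            # hypothetical fitness function

N, mu = 100, 0.4                              # hypothetical parameters
x_i = np.arange(N + 1) / N
r_bar = dominant_eigenvalue(N * r(x_i), mu) / N
r_inf = max_principle(r, mu)
print(r_bar, r_inf)
```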
Main idea:
We consider the eigenvalue problem
$$(M + \mu Q)\,p = \bar{m}\,p, \qquad \bar{m} = m \cdot p,$$
where $p = p(\mu)$, $\bar{m} = \bar{m}(\mu)$ with a fixed fitness landscape $m$.
We claim that this problem simplifies in the coordinates of the basis composed of the eigenvectors of the matrix $Q = Q_N$.
Proposition: For the matrix $Q = Q_N$:
1. The eigenvalues of $Q_N$ are simple (all have algebraic multiplicity one) and are given by
$$q_k = -2k, \quad k = 0, \ldots, N.$$
2. Let $v_k^\top = (c_{0k}, \ldots, c_{Nk})$ be the right eigenvector of $Q_N$ corresponding to $q_k$ and normalized such that $c_{0k} = 1$, and let $C = C_N = (c_{ik})_{(N+1)\times(N+1)}$ be the matrix composed of the $v_k$ ($v_k$ is the $k$-th column of $C_N$). Then the generating function for the elements of the $k$-th column has the form
$$P_k(t) = \sum_{i=0}^{N} c_{ik}\,t^i = (1 - t)^k (1 + t)^{N-k}, \quad k = 0, \ldots, N.$$
3. $C^2 = 2^N I$, where $I$ is the identity matrix, or, equivalently,
$$C^{-1} = 2^{-N} C.$$
4. The 1-norm of $C$ is
$$\|C\|_1 = \max_{0\le k\le N} \sum_{i=0}^{N} |c_{ik}| = 2^N.$$
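The four statements of the proposition can be checked numerically for a small $N$. In the sketch below, the helper names `build_Q` and `build_C` and the choice $N = 6$ are illustrative assumptions; the columns of $C$ are built directly from the coefficients of the generating function $(1-t)^k(1+t)^{N-k}$.

```python
import numpy as np

def build_Q(N):
    # Tridiagonal mutation matrix: -N on the diagonal, j above it, N - j below it.
    j = np.arange(N + 1)
    return np.diag(-float(N) * np.ones(N + 1)) + np.diag(j[1:], 1) + np.diag(N - j[:-1], -1)

def build_C(N):
    """Column k holds the coefficients of (1 - t)^k (1 + t)^(N - k), lowest power first."""
    C = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        coeffs = np.array([1.0])
        for _ in range(k):
            coeffs = np.convolve(coeffs, [1.0, -1.0])   # multiply by (1 - t)
        for _ in range(N - k):
            coeffs = np.convolve(coeffs, [1.0, 1.0])    # multiply by (1 + t)
        C[:, k] = coeffs
    return C

N = 6                                                    # hypothetical size
Q, C = build_Q(N), build_C(N)
q = -2.0 * np.arange(N + 1)

print(np.allclose(Q @ C, C * q))                         # columns of C are eigenvectors, q_k = -2k
print(np.allclose(C @ C, 2.0**N * np.eye(N + 1)))        # C^2 = 2^N I
print(np.abs(C).sum(axis=0).max(), 2**N)                 # 1-norm of C equals 2^N
```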
Behavior for µ → ∞:
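Let $x = C^{-1}p$ and $\bar{x} = 2^{-N}(1, 0, \ldots, 0)$. Then
$$\|x - \bar{x}\|_1 \le \frac{1}{\mu}\,\|M\|_1,$$
and
$$\|x'(\mu)\|_1 \le \frac{2^N}{2\mu - (2^{N+1} + 1)\|M\|_1}.$$
Therefore,
$$p(\mu) \to \bar{p} = 2^{-N}\left(\binom{N}{0}, \ldots, \binom{N}{N}\right).$$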
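The convergence of $p(\mu)$ to the binomial distribution for large $\mu$ can be observed numerically. The sketch below uses a hypothetical landscape with $N = 6$ and prints the 1-norm distance to $\bar{p}$ for increasing $\mu$; the helper names are assumptions of this example.

```python
import math
import numpy as np

def build_Q(N):
    # Tridiagonal mutation matrix: -N on the diagonal, j above it, N - j below it.
    j = np.arange(N + 1)
    return np.diag(-float(N) * np.ones(N + 1)) + np.diag(j[1:], 1) + np.diag(N - j[:-1], -1)

def quasispecies(m, mu):
    """Dominant eigenpair of M + mu*Q, with p normalized to sum to one."""
    m = np.asarray(m, dtype=float)
    A = np.diag(m) + mu * build_Q(len(m) - 1)
    w, V = np.linalg.eig(A)
    k = np.argmax(w.real)
    p = V[:, k].real
    return w[k].real, p / p.sum()

N = 6
m = np.array([5.0, 0.0, 2.0, 1.0, 0.0, 3.0, 1.0])       # hypothetical landscape
p_bar = np.array([math.comb(N, i) for i in range(N + 1)]) / 2.0**N

dist = {}
for mu in (1.0, 10.0, 100.0, 1000.0):
    _, p = quasispecies(m, mu)
    dist[mu] = np.abs(p - p_bar).sum()                   # 1-norm distance to the binomial limit
    print(mu, dist[mu])
```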
Parametric solution:
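Consider a fitness landscape such that $m_j > 0$ for some $j$, and all the rest $m_i = 0$ for $i \ne j$. Then
$$x_k(s) = 2^{-N}\,\frac{c_{kj}}{1 + ks},$$
$$p_i(s) = 2^{-N} \sum_{k=0}^{N} \frac{c_{ik}\,c_{kj}}{1 + ks},$$
$$\bar{m}(s) = m_j\,p_j(s),$$
$$\mu = \frac{s}{2}\,\bar{m}(s).$$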
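A quick way to see how the parametric formulas work is to pick a value of the parameter $s$, evaluate $p(s)$, $\bar{m}(s)$ and $\mu(s)$, and then compare with a direct eigenvector computation at that $\mu$. The values $N = 6$, $j = 2$, $m_j = 3$ and $s = 0.7$ below are hypothetical, as are the helper names.

```python
import numpy as np

def build_Q(N):
    # Tridiagonal mutation matrix: -N on the diagonal, j above it, N - j below it.
    j = np.arange(N + 1)
    return np.diag(-float(N) * np.ones(N + 1)) + np.diag(j[1:], 1) + np.diag(N - j[:-1], -1)

def build_C(N):
    """Column k holds the coefficients of (1 - t)^k (1 + t)^(N - k), lowest power first."""
    C = np.zeros((N + 1, N + 1))
    for k in range(N + 1):
        coeffs = np.array([1.0])
        for _ in range(k):
            coeffs = np.convolve(coeffs, [1.0, -1.0])
        for _ in range(N - k):
            coeffs = np.convolve(coeffs, [1.0, 1.0])
        C[:, k] = coeffs
    return C

def parametric(N, j, mj, s):
    """Evaluate the parametric solution for a landscape with a single nonzero m_j."""
    C = build_C(N)
    k = np.arange(N + 1)
    x = C[:, j] / (2.0**N * (1.0 + k * s))   # x_k = 2^{-N} c_{kj} / (1 + k s)
    p = C @ x                                # p_i = sum_k c_{ik} x_k
    mbar = mj * p[j]
    mu = 0.5 * s * mbar
    return mu, mbar, p

N, j, mj, s = 6, 2, 3.0, 0.7                 # hypothetical values
mu, mbar, p = parametric(N, j, mj, s)

m = np.zeros(N + 1)
m[j] = mj
w, V = np.linalg.eig(np.diag(m) + mu * build_Q(N))
top = np.argmax(w.real)
p_direct = V[:, top].real
p_direct = p_direct / p_direct.sum()
print(mu, mbar, w[top].real)
print(np.abs(p - p_direct).max())
```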
Example 1:
Let $N = 2A$ and
$$m = (0, \ldots, 0, N, 0, \ldots, 0),$$
where $N$ is exactly at the $A$-th position. Consider the scaled fitness landscape $Nr = m$ and $\bar{r}(\mu) = \bar{m}(\mu)/N$. Then, using the properties of the mutation matrix $Q$ and the parametric formulas from the previous slide, it can be proved that
$$\bar{r} \approx r_\infty = \lim_{N\to\infty} \bar{r}(\mu) = \sqrt{\mu^2 + 1} - \mu.$$
[Figure: $r(\mu)$ versus $\mu$ for $0 \le \mu \le 3$, in two panels: $N = 100$ and $N = 200$.]
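The limit formula can be compared with finite $N$ computations along the lines of the two panels. The sketch below evaluates $\bar{r}(\mu)$ at the hypothetical value $\mu = 1$ for $N = 100$ and $N = 200$; the symmetrized eigenvalue routine is an implementation choice of this example.

```python
import numpy as np

def dominant_eigenvalue(m, mu):
    """Largest eigenvalue of M + mu*Q, computed from a similar symmetric matrix."""
    m = np.asarray(m, dtype=float)
    N = len(m) - 1
    i = np.arange(N)
    off = mu * np.sqrt((i + 1.0) * (N - i))   # symmetrized off-diagonal entries
    S = np.diag(m - mu * N) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(S)[-1]

def r_bar(N, mu):
    """Scaled mean fitness for the landscape with the single value N in the middle class."""
    A = N // 2
    m = np.zeros(N + 1)
    m[A] = N
    return dominant_eigenvalue(m, mu) / N

mu = 1.0                                      # hypothetical mutation rate
r_inf = np.sqrt(mu**2 + 1.0) - mu
r100, r200 = r_bar(100, mu), r_bar(200, mu)
print(r100, r200, r_inf)
```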
Example 1:
Let $N = 2A$ and
$$m = (0, \ldots, 0, N, 0, \ldots, 0),$$
where $N$ is exactly at the $A$-th position. Consider the scaled fitness landscape $Nr = m$ and $\bar{r}(\mu) = \bar{m}(\mu)/N$. Denoting $r_\infty = \sqrt{\mu^2 + 1} - \mu$, we can prove that
$$\lim_{N\to\infty} p_{A\pm k}(\mu) = r_\infty \left(\frac{1 - r_\infty}{1 + r_\infty}\right)^k, \quad k = 0, \ldots, A.$$
[Figure: two panels of $p$ versus $\mu$ for $0 \le \mu \le 4$, with $p$ ranging from 0 to 1.]
Example 2: Single peaked landscape
Let
$$m = (N, 0, \ldots, 0).$$
Consider the scaled fitness landscape $Nr = m$ and $\bar{r}(\mu) = \bar{m}(\mu)/N$. Then, using the properties of the mutation matrix $Q$ and the parametric formulas, we can prove that
$$r_\infty = \lim_{N\to\infty} p_0 = 1 - \mu, \qquad \lim_{N\to\infty} p_i = (1 - \mu)\mu^i, \quad i \ge 1,$$
if $\mu < 1$, and $p_j = 0$ for all $j$ if $\mu \ge 1$.
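These limits can be compared with a finite $N$ eigenvector computation. The sketch below uses $N = 40$, a conservative choice made for this example so that a plain double-precision eigenvector routine is adequate, and the hypothetical mutation rates $\mu = 0.5$ and $\mu = 1.5$ on either side of the threshold.

```python
import numpy as np

def build_Q(N):
    # Tridiagonal mutation matrix: -N on the diagonal, j above it, N - j below it.
    j = np.arange(N + 1)
    return np.diag(-float(N) * np.ones(N + 1)) + np.diag(j[1:], 1) + np.diag(N - j[:-1], -1)

def quasispecies(m, mu):
    """Dominant eigenpair of M + mu*Q, with p normalized to sum to one."""
    m = np.asarray(m, dtype=float)
    A = np.diag(m) + mu * build_Q(len(m) - 1)
    w, V = np.linalg.eig(A)
    k = np.argmax(w.real)
    p = V[:, k].real
    return w[k].real, p / p.sum()

N = 40                                        # hypothetical sequence length
m = np.zeros(N + 1)
m[0] = N                                      # single peaked landscape m = (N, 0, ..., 0)

mbar_below, p_below = quasispecies(m, 0.5)    # below the threshold mu = 1
mbar_above, p_above = quasispecies(m, 1.5)    # above the threshold

mu = 0.5
print(p_below[:4])                            # finite-N values of p_0, ..., p_3
print([(1 - mu) * mu**i for i in range(4)])   # limiting values (1 - mu) mu^i
print(mbar_above / N)                         # scaled mean fitness above the threshold
```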
Example 2: Single peaked landscape
Comparison with numerical computations for N = 100.
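$$r_\infty = \lim_{N\to\infty} p_0 = 1 - \mu, \qquad \lim_{N\to\infty} p_i = (1 - \mu)\mu^i, \quad i \ge 1$$
[Figure: two panels. First panel: $r(\mu)$ versus $\mu$ for $0 \le \mu \le 1.5$, with $r(\mu)$ ranging from 0 to 1. Second panel: $p$ versus $\mu$ for $0 \le \mu \le 1$, with $p$ ranging from 0 to 0.30.]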
Epsilon stabilization:
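Definition: The dominant eigenvalue $\bar{m}(\mu)$ admits epsilon stabilization if for a small enough $\varepsilon > 0$ there exist constants $\bar{m}^*_\varepsilon$ and $\mu^*_\varepsilon$ such that for all $\mu > \mu^*_\varepsilon$ we have
$$|\bar{m}(\mu) - \bar{m}^*_\varepsilon| < \varepsilon, \qquad |\bar{m}'(\mu)| < \varepsilon.$$
We know that epsilon stabilization is a sure event, due to the previous results. The question is how to determine $\mu^*_\varepsilon$ given $\varepsilon > 0$.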
Epsilon stabilization:
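Let $m = (m_0, m_1, \ldots, m_N)$ be such that $m_0 > m_1 \ge m_2 \ge \ldots \ge m_N$. Then we have
$$\mu^* = \frac{m_0 - m_1}{N}, \qquad \text{(Classical Result)}$$
$$\mu^*_\varepsilon = (m_0 - m_1)\left(1 - \sqrt{1 - \frac{2(m_0 - 1)}{(m_0 - m_1)N}}\right), \quad \text{if } \delta = 2^{-N} \sum_{k=0}^{N} (m_k - 1)\binom{N}{k} < \varepsilon, \qquad \text{(Our result 1)}$$
$$\bar{m}(\mu^*) = m_{\min} + 2\mu^*. \qquad \text{(Our result 2)}$$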
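The three estimates can be evaluated for a concrete landscape. The sketch below assumes $N = 30$ and $m = (20, 1, \ldots, 1)$; these values are not stated on the slides and were chosen for this example because they give numbers close to the three estimates quoted for the single peaked example that follows. The third estimate is obtained by reading the last formula as an implicit equation for $\mu^*$ and solving it by bisection, which is an interpretation made for this sketch.

```python
import numpy as np

def dominant_eigenvalue(m, mu):
    """Largest eigenvalue of M + mu*Q, computed from a similar symmetric matrix."""
    m = np.asarray(m, dtype=float)
    N = len(m) - 1
    i = np.arange(N)
    off = mu * np.sqrt((i + 1.0) * (N - i))   # symmetrized off-diagonal entries
    S = np.diag(m - mu * N) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(S)[-1]

N = 30                                        # assumed value, not from the slides
m = np.ones(N + 1)
m[0] = 20.0                                   # assumed single peaked landscape
m0, m1, m_min = m[0], m[1], m.min()

mu_classical = (m0 - m1) / N
mu_eps = (m0 - m1) * (1.0 - np.sqrt(1.0 - 2.0 * (m0 - 1.0) / ((m0 - m1) * N)))

def gap(mu):
    # Positive while the mean fitness is still above m_min + 2*mu.
    return dominant_eigenvalue(m, mu) - (m_min + 2.0 * mu)

lo, hi = 1e-6, 2.0                            # gap(lo) > 0 and gap(hi) < 0 for this landscape
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if gap(mid) > 0:
        lo = mid
    else:
        hi = mid
mu_implicit = 0.5 * (lo + hi)

print(round(mu_classical, 3), round(mu_eps, 3), round(mu_implicit, 3))
```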
Example:
[Figure: two panels. First panel: $\hat{p}$ versus $\mu$ for $0 \le \mu \le 1.25$, with $\hat{p}$ ranging from 0 to 0.30. Second panel: $m(\mu)$ versus $\mu$ for $0 \le \mu \le 1.25$, with $m(\mu)$ ranging from 0 to 20.]
Figure: Error threshold in the quasispecies model with the single peaked fitness landscape
Estimates from the previous slide:
$\mu^*_1 = 0.633$, $\mu^*_2 = 0.644$, $\mu^*_3 = 0.613$
Example:
[Figure: two panels. First panel: $\hat{p}$ versus $\mu$ for $\mu$ up to 0.5, with $\hat{p}$ ranging from 0 to 0.30. Second panel: $m(\mu)$ versus $\mu$ for $0 \le \mu \le 0.5$, with $m(\mu)$ ranging from 1.7 to 2.3.]
Figure: Error threshold in the quasispecies model with mk = kα log(1 − s)
Estimates from the previous slide:
$\mu^*_1 = 0.09$, $\mu^*_2 = 0.19$, $\mu^*_3 = 0.33$
Acknowledgements:
Startup grant from Department of Mathematics, NDSU
ND EPSCoR and NSF grant # EPS-0814442
Thank you for your attention!
e-mail: [email protected]
site: https://www.ndsu.edu/pubweb/ novozhil/
References:
◮ Bratus, A. S., Novozhilov, A. S., & Semenov, Y. S. (2013). Linear algebra of the permutation invariant Crow–Kimura model of prebiotic evolution. arXiv:1306.0111.
◮ Semenov, Y. S., Bratus, A. S., & Novozhilov, A. S. (2014). On the behavior of the leading eigenvalue of the Eigen evolutionary matrices, in preparation.