Linear Algebraof Eigen’s Quasispecies Modelnovozhil/Data/linear...Artem Novozhilov (NDSU) May 17,...

Linear Algebra of

Eigen’s Quasispecies Model

Artem Novozhilov

Department of Mathematics

North Dakota State University

Midwest Mathematical Biology Conference,

University of Wisconsin — La Crosse

May 17, 2014

Artem Novozhilov (NDSU) May 17, 2014 1 / 29

Collaborators:

Yuri S. SemenovMoscow State University of RailwayEngineering, Moscow, Russia

Alexander S. BratusLomonosov Moscow State University

Moscow, Russia


Outline:

◮ Motivation and historic outline

◮ Mathematical problem

◮ Known results

◮ New results


M. Eigen, Naturwisenschaften, 58(10), 1971: 465–523


Quasispecies theory:

Manfred Eigen, born 1927

◮ M. Eigen, Naturwisenschaften, 58(10), 1971:465–523

◮ M. Eigen, P. Schuster, The Hypercycle, Springer,1979

◮ M. Eigen, J. McCaskill, P. Schuster, J Phys Chem,92(24), 1988:6881-6891


Mathematical side of the story:

Ref: M. Eigen, J. McCaskill, P. Schuster, J Phys Chem, 92(24), 1988: 6881-6891


Model statement:

Consider a population of sequences of fixed length N composed of two-letteralphabet, say, {0, 1}, therefore 2N different sequences.

The population is subject to two evolutionary forces. First evolutionary forceis selection, which is included in the system through the Malthusian fitness,defined here for simplicity as

m(particular sequence σ) = m(Hσ),

where Hσ is the Hamming norm of this sequence, i.e., number of 1s insequence σ. In this way we do not distinguish between sequences with thesame number of 1s and hence reduce the dimensionality of the problem from2N × 2N to (N + 1)× (N + 1). Hence, we consider at this point onlypermutation invariant fitness landscapes

M = diag(m0, . . . ,mN ) or m = (m0, . . . ,mN)⊤.


Model statement:

The second evolutionary force is mutation.

In particular, assuming N + 1 classes of sequences, we have that the mutationsµij (i.e., the mutation rate from class j to class i) can be described by thematrix

M = (µij) = µQ =

−N 1 0 0 . . . . . . 0N −N 2 0 . . . . . . 00 N − 1 −N 3 . . . . . . 00 0 N − 2 −N . . . . . . 0. . . . . . . . . . . . . . . . . . 00 0 . . . . . . 2 −N N0 0 . . . . . . 0 1 −N

,

where µ is the mutation rate per site per sequence per replication event.


Model statement:

Let p(t) denote the vector of frequencies of different classes of sequences, then,assuming uncoupled reproduction and mutation events, we arrive at

p(t) = (M + µQ)p(t)−m(t)p(t),

where

m(t) = m · p(t) =

N∑

i=0

mipi(t)

is the mean population fitness.

This model is often called a paramuse of Crow–Kimura quasispecies modelwith permutation invariant fitness landscape.

Ref: Baake and Gabriel, Annual Reviews of Computational Physics VII, 1999: 203–264

Ref: Crow and Kimura, An introduction to population genetics theory, 1970


Elementary results:

The asymptotic behavior of the quasispecies model is determined by theequilibrium p = limt→∞ p(t), which solves the eigenvalue problem

(M + µQ)p = mp,

wherem = m · p.

By Perron–Frobenius theorem it follows that there is a unique positivesolution p > 0, which is the right eigenvector of M + µQ corresponding to thesimple real dominant eigenvalue λ = m.

This vector p was called by Eigen the quasispecies. It is globally stable for thequasispecies system. We are mostly interested in properties of m and p

depending on the fitness landscape M and mutation rate µ, therefore, we usethe notation m = m(µ) and p = p(µ) for the mean fitness and equilibriumdistribution.


Known results:

◮ Thompson and McBride, Math Biosciences, 21: 127–142, 1974

The quasispecies model is essentially linear.

◮ Rumschitzki, J of Math Biol, 24: 667–680, 1987

For zero epistasis fitness landscape the spectral properties of the Eigenevolutionary matrices can be inferred with the representation of M andM using tensor products.

◮ Swetina and Schuster, Bioph Chem, 16: 329–345, 1982

Numerical analysis of single peaked fitness landscape yields the error

threshold.


Known results: The error threshold

Ref: Swetina and Schuster, Bioph Chem, 16: 329–345, 1982

Consider the single peaked fitness landscape

M = diag(m0, 0, . . . , 0), m0 > 0.


Known results: The error threshold

Consider the single peaked fitness landscape

M = diag(m0, 0, . . . , 0), m0 > 0.

0. 0.12 0.24 0.36 0.48 0.60

0.1

0.2

0.3

µ

p


Known results: Statistical Physics

◮ Leuthausser, I. (1986). An exact correspondence between Eigen’s evolutionmodel and a two-dimensional Ising system. The Journal of Chemical Physics,84(3), 1884-1885.

◮ Tarazona, P. (1992). Error thresholds for molecular quasispecies as phasetransitions: From simple landscapes to spin-glass models. Physical Review A,45(8), 6038.

◮ Baake, E., Baake, M., & Wagner, H. (1997). Ising quantum chain is equivalentto a model of biological evolution. Physical Review Letters, 78(3), 559-562.

◮ Galluccio, S. (1997). Exact solution of the quasispecies model in a sharplypeaked fitness landscape. Physical Review E, 56(4), 4526.

◮ Baake, E., & Wagner, H. (2001). Mutation-selection models solved exactly withmethods of statistical mechanics. Genetical Research, 78(01), 93-117.

◮ Saakian, D. B., & Hu, C. K. (2006). Exact solution of the Eigen model withgeneral fitness functions and degradation rates. Proceedings of the NationalAcademy of Sciences of the USA, 103(13), 4935-4939.


Known results: Maximum principle

◮ Hermisson, J., Redner, O., Wagner, H., & Baake, E. (2002). Mutation-selectionbalance: Ancestry, load, and maximum principle. Theoretical PopulationBiology, 62(1), 9-46.

◮ Baake, E., & Georgii, H. O. (2007). Mutation, selection, and ancestry inbranching models: a variational approach. Journal of Mathematical Biology,54(2), 257-303.

Assume that mi = Nri = Nr(xi), xi =iN

∈ [0, 1] and define

g(x) = µ(

1− 2√

x(1 − x))

. Then the mean fitness m(µ) = Nr is given by

r ≈ r∞ = supx∈[0,1]

(

r(x) − g(x))

.

r(x) may have only finite number of discontinuities and be either left or rightcontinuous at every point.


Main idea:

We consider the eigenvalue problem

(M + µQ)p = mp, m = m · p,

where p = p(µ), m = m(µ) with a fixed fitness landscape m.

We claim that this problem simplifies in the coordinates of the basis composedof the eigenvectors of the matrix Q = QN .


Proposition: For the matrix Q = QN :

1. The eigenvalues of QN

are simple (all have algebraic multiplicities one) andgiven by

qk = −2k, k = 0, . . . , N.

2. Let v⊤k = (c0k, . . . , cNk) be the right eigenvector of Q

Ncorresponding to qk and

normalized such that c0k = 1, C = CN = (cik)(N+1)×(N+1) be the matrixcomposed of vk (vk is the k-th column of CN). Then the generating functionfor the elements of the k-th column has the form

Pk(t) =N∑

i=0

cik ti = (1− t)k(1 + t)N−k

, k = 0, . . . , N.

3. C2 = 2NI, where I is the identity matrix, or, equivalently,

C−1 = 2−N

C.

4. 1-norm of C is

‖C‖1 = max0≤k≤N

N∑

i=0

|cik| = 2N .


Behavior for µ → ∞:

Let x = C−1p and x = 2−N (1, 0, . . . , 0). Then

‖x− x‖1 ≤1

µ‖M‖1,

and

‖x′(µ)‖1 ≤2N

2µ− (2N+1 + 1)‖M‖1.

Therefore,

p(µ) → p = 2−N

((

N

0

)

, . . . ,

(

N

N

))

.


Parametric solution:

Consider fitness landscape such that mj > 0 for some j, and all the restmi = 0 for i 6= j. Then

xk(s) = 2−N ckj1 + ks

,

pi(s) = 2−N

N∑

k=0

cikckj1 + ks

,

m(s) = mjpj(s),

µ =s

2m(s).


Example 1:

Let N = 2A andm = (0, . . . , 0, N, 0, . . . , 0),

where N is exactly on A-th position. Consider a scaled fitness landscapeNr = m and r(µ) = m(µ)/N . Then, using the properties of the mutationmatrix Q and the parametric formulas from the previous slide, it can beproved that

r ≈ r∞ = limN→∞

r(µ) =√

µ2 + 1− µ.

0 3

1

Μ

rHΜL

N=100

0 3

1

Μ

rHΜL

N=200


Example 1:Let N = 2A and

m = (0, . . . , 0, N, 0, . . . , 0),

where N is exactly on A-th position. Consider a scaled fitness landscapeNr = m and r(µ) = m(µ)/N . Denoting r∞ =

√

µ2 + 1− µ we can prove that

limN→∞

pA±k(µ) = r∞

(

1− r∞1 + r∞

)k

, k = 0, . . . , A.

0. 1. 2. 3. 4.0.0

0.2

0.4

0.6

0.8

1.0

µ

p

0. 1. 2. 3. 4.0.0

0.2

0.4

0.6

0.8

1.0

µ

p


Example 2: Single peaked landscape

Letm = (N, 0, . . . , 0).

Consider a scaled fitness landscape Nr = m and r(µ) = m(µ)/N . Then, usingthe properties of the mutation matrix Q and the parametric formulas we canprove that

r∞ = limN→∞

p0 = 1− µ,

limN→∞

pi = (1− µ)µi, i ≥ 1.

if µ < 1 and pj = 0 for all j if µ ≥ 1.


Example 2: Single peaked landscape

Comparison with numerical computations for N = 100.

r∞ = limN→∞

p0 = 1− µ, limN→∞

pi = (1− µ)µi, i ≥ 1

0. 0.3 0.6 0.9 1.2 1.50.0

0.2

0.4

0.6

0.8

1.0

µ

r(µ)

0. 0.2 0.4 0.6 0.8 1.0.00

0.05

0.10

0.15

0.20

0.25

0.30

µ

p


Epsilon stabilization:

Definition: Dominant eigenvalue m(µ) admits epsilon stabilization if for asmall enough ε > 0 there exists constant m∗

ε and µ∗ε such that for all µ > µ∗

ε

we have|m(µ)−m∗

ǫ | < ε, |m′(µ)| < ε.

We know that epsilon stabilization is a sure event, due to the previous. Thequestion is how to determine µ∗

ε given ε > 0.


Epsilon stabilization:

Let m = (m0,m1, . . . ,mN ) such that m0 > m1 ≥ m2 ≥ . . . ≥ mN . Then wehave

µ∗ =m0 −m1

N, (Classical Result)

µ∗ε = (m0 −m1)

(

1−

√

1−2(m0 − 1)

(m0 −m1)N

)

,

if δ = 2−N

N∑

k=0

(mk − 1)

(

N

k

)

< ε,

(Our result 1)

m(µ∗) = mmin + 2µ∗. (Our result 2)


Example:

0. 0.3125 0.625 0.9375 1.250.00

0.05

0.10

0.15

0.20

0.25

0.30

Μ

p`

0. 0.3125 0.625 0.9375 1.250

5

10

15

20

Μ

mHΜL

Figure : Error threshold in the quasispecies model with the single picked fitnesslandscape

Estimates from the previous slide:

µ∗1 = 0.633,

µ∗2 = 0.644,

µ∗3 = 0.613


Example:

0.125 0.25 0.375 0.50.00

0.05

0.10

0.15

0.20

0.25

0.30

Μ

p`

0. 0.125 0.25 0.375 0.5

1.7

1.8

1.9

2.0

2.1

2.2

2.3

Μ

mHΜL

Figure : Error threshold in the quasispecies model with mk = kα log(1− s)

Estimates from the previous slide:

µ∗1 = 0.09,

µ∗2 = 0.19,

µ∗3 = 0.33


Acknowledgements:

Startup grant from Department of Mathematics, NDSU

ND EPSCoR and NSF grant # EPS-0814442


Thank you for your attention!

e-mail: [email protected]

site: https://www.ndsu.edu/pubweb/ novozhil/

References:

◮ Bratus, A. S., Novozhilov, A. S., & Semenov, Y. S. (2013). Linear algebraof the permutation invariant Crow–Kimura model of prebiotic evolution.arXiv:1306.0111.

◮ Semenov, Y. S., Bratus, A. S., & Novozhilov, A. S. (2014). On thebehavior of the leading eigenvalue of the Eigen evolutionary matrices, inpreparation.


Linear Algebraof Eigen’s Quasispecies Modelnovozhil/Data/linear...Artem Novozhilov (NDSU) May 17,...

Documents

Transcript of Linear Algebraof Eigen’s Quasispecies Modelnovozhil/Data/linear...Artem Novozhilov (NDSU) May 17,...