The Biological ESTEEM Project: Linear Algebra, Population Genetics, and Microsoft Excel


Page 1: The Biological ESTEEM Project: Linear Algebra, Population Genetics, and Microsoft  Excel

[Figure: curve plotted against Generation (horizontal axis, 0 to 100); vertical axis from 0 to 1]

$p' = p\,(p W_{AA} + q W_{AS}) / \bar{W}$

Anton E. Weisstein, Truman State University

Page 2

BIO 2010: Transforming Undergraduate Education for Future Research Biologists

National Research Council (2003)

Recommendation #1: “Those selecting the new approaches should consider the importance of mathematics...”

Recommendation #2: “Concepts, examples, and techniques from mathematics…should be included in biology courses. …Faculty in biology, mathematics, and physical sciences must work collaboratively to find ways of integrating mathematics…into life science courses…”

Page 3

BIO 2010: Transforming Undergraduate Education for Future Research Biologists

National Research Council (2003)

Specific strategies:

• A strong interdisciplinary curriculum that includes physical science, information technology, and math.

• Meaningful laboratory experiences.

Page 4

Biological Topics

Spread of infectious diseases

Tree growth

Enzyme kinetics

Population genetics

Page 5

Mathematical Topics

[Figure: allele frequency p (0 to 1) versus Generation (0 to 100)]

Random walks

Optimization

Linear algebra

Graph theory

Page 6

Unpacking “ESTEEM”

• Excel: ubiquitous, easy, flexible, non-intimidating

• Exploratory: apply to real-world data; extend & improve

• Experiential: students engage directly with the math

Page 7

Three Boxes

Black box: Hide the model

? $y = ax^b$

Glass box: Study the model

$y = ax^b$

No box: Build the model!

How do students interact with the mathematical model underlying the biology?

Page 8

Copyleft

Users may freely download, use, modify, and share the software, with proper attribution.

More information is available at the Free Software Foundation website.

Page 9

Synthesizing and Applying Math Concepts Using Biological Cases

1. Intro to Population Genetics: Hardy-Weinberg Equilibrium and the Binomial Theorem

2. Evolutionary Analysis: Microevolution, Statistics, and Stability Analysis

3. Survival of the Slightly Better: Exploring an Evolutionary Paradox with Linear Algebra

$\hat{p} = \dfrac{W_{SS} - W_{AS}}{W_{AA} - 2W_{AS} + W_{SS}}$

Page 10

Definitions

Allele: One variant of a specific gene.

Genotype: The set of alleles carried by an individual.

Phenotype: The detectable manifestations of a specific genotype.

Example: ABO blood type

Alleles: $I^A$, $I^B$, $i$

Genotype     Phenotype
$I^A I^A$    Type A
$I^A i$      Type A
$I^B I^B$    Type B
$I^B i$      Type B
$I^A I^B$    Type AB
$ii$         Type O

Page 11

Life Cycle

Gametes (eggs & sperm)

Zygotes (fertilized eggs)

Juveniles (reproductively immature)

Adults (reproductively mature)


Page 15

Recursion Equations

Let
x = # AA adults;
y = # Aa adults;
z = # aa adults.

Define
p = # A gametes = x + y/2;
q = # a gametes = y/2 + z.

Determine the expected # of adults of each genotype in the next generation.

(For now, feel free to make any simplifying assumptions.)

Page 16

Hardy-Weinberg Equilibrium

Genotypes reach ratios

$p^2 : 2pq : q^2$

in one generation, then stay there forever!

Assumptions?

• Gametes combine at random

• All individuals have equal chance of survival

• Each generation is a perfectly representative sample of the previous one
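The one-generation claim can be checked numerically. The following is a minimal Python sketch, assuming the three conditions listed above hold. The function names and the starting counts are hypothetical, and p and q are normalized to frequencies so that p + q = 1.

```python
def allele_freqs(x, y, z):
    """Return (p, q) from counts or proportions x = AA, y = Aa, z = aa."""
    total = x + y + z
    p = (x + y / 2) / total
    q = (y / 2 + z) / total
    return p, q


def next_generation(x, y, z):
    """Expected genotype proportions after one round of random mating."""
    p, q = allele_freqs(x, y, z)
    return p * p, 2 * p * q, q * q


# Hypothetical starting population, far from Hardy-Weinberg ratios.
gen0 = (60, 0, 40)
gen1 = next_generation(*gen0)
gen2 = next_generation(*gen1)

print("generation 1:", gen1)
print("generation 2:", gen2)
```

Generations 1 and 2 should agree up to floating-point rounding, which is the "stay there forever" part of the statement.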

Page 17

Synthesizing and Applying Math Concepts Using Biological Cases

1. Intro to Population Genetics: Hardy-Weinberg Equilibrium and the Binomial Theorem

2. Evolutionary Analysis: Microevolution, Statistics, and Stability Analysis

3. Survival of the Slightly Better: Exploring an Evolutionary Paradox with Linear Algebra

$\hat{p} = \dfrac{W_{SS} - W_{AS}}{W_{AA} - 2W_{AS} + W_{SS}}$

Page 18

The Case of the Sickled Cell

• The S allele for sickle-cell anemia has a frequency of ~11% in some African populations.

• Why is it so common?

• If it provides a selective advantage, why isn’t its frequency 100%?

Page 19

Definitions

Reproductive fitness: The average number of offspring produced by an organism in a specific environment.

Examples:

• Antibiotic resistance

• Camouflage

• Resistance to infectious diseases

Natural selection: An evolutionary mechanism that tends to increase the frequency of traits that increase an organism’s fitness.

Source: Jeffrey Jeffords, DiveGallery.com

Page 20

Selection and Sickle-Cell

Alleles:

A: “normal” hemoglobin

S: sickle-cell hemoglobin

Genotype   Fitness          Natural selection
AA         $W_{AA} = 0.9$   Malaria susceptibility: ~90% survive to reproductive age
AS         $W_{AS} = 1.0$
SS         $W_{SS} = 0.2$   Sickle-cell anemia: ~20% survive to reproductive age

Page 21

Recursion Equations

p = # A gametes; q = # S gametes.

Life stage   SS (W = 0.2)             AS (W = 1.0)             AA (W = 0.9)
Zygote       $q^2$                    $2pq$                    $p^2$
Juvenile     $q^2$                    $2pq$                    $p^2$
Adult        $q^2 W_{SS} / \bar{W}$   $2pq W_{AS} / \bar{W}$   $p^2 W_{AA} / \bar{W}$

Normalization: $\bar{W} = p^2 W_{AA} + 2pq W_{AS} + q^2 W_{SS}$

$p' = p\,(p W_{AA} + q W_{AS}) / \bar{W}$
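To see what this recursion does over time, it can be iterated directly. The sketch below is an illustrative Python version, assuming p and q are frequencies with q = 1 - p. The function names and the starting frequency of 0.5 are hypothetical, and the fitness defaults are the sickle-cell values used in this deck.

```python
def mean_fitness(p, w_aa, w_as, w_ss):
    """Normalization term: mean fitness of the population."""
    q = 1.0 - p
    return p * p * w_aa + 2 * p * q * w_as + q * q * w_ss


def next_p(p, w_aa, w_as, w_ss):
    """One generation of selection: p' = p (p W_AA + q W_AS) / mean fitness."""
    q = 1.0 - p
    return p * (p * w_aa + q * w_as) / mean_fitness(p, w_aa, w_as, w_ss)


def trajectory(p0, generations, w_aa=0.9, w_as=1.0, w_ss=0.2):
    """Return the list [p_0, p_1, ..., p_generations]."""
    ps = [p0]
    for _ in range(generations):
        ps.append(next_p(ps[-1], w_aa, w_as, w_ss))
    return ps


traj = trajectory(p0=0.5, generations=100)
for gen in (0, 10, 50, 100):
    print(gen, round(traj[gen], 4))
```

The same loop is what a column of spreadsheet cells would compute, one row per generation.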

Page 22

Selection and Sickle-Cell

Genotype   Fitness
AA         $W_{AA} = 0.9$
AS         $W_{AS} = 1.0$
SS         $W_{SS} = 0.2$

Biological Question:

• How will this population evolve over time?

$p' = p\,(p W_{AA} + q W_{AS}) / \bar{W}$

Mathematical Question:

What are the equilibria for this recursion equation?

Page 23

Solving for Equilibria

Set p’ = p and solve:

$p = \dfrac{p\,(p W_{AA} + q W_{AS})}{\bar{W}}$

$p = 0$ or $p W_{AA} + q W_{AS} = \bar{W} = p^2 W_{AA} + 2pq W_{AS} + q^2 W_{SS}$

Substitute q = 1 – p and factor:

$0 = (1 - p)\,[W_{SS} - W_{AS} - p\,(W_{AA} - 2W_{AS} + W_{SS})]$

$\hat{p} = 0$ or $\hat{p} = 1$ or $\hat{p} = \dfrac{W_{SS} - W_{AS}}{W_{AA} - 2W_{AS} + W_{SS}}$

Nontrivial solution:

$\hat{p} = \dfrac{0.2 - 1.0}{0.9 - 2 \cdot 1.0 + 0.2} = \dfrac{-0.8}{-0.9} = 0.889$
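The factoring step can be spelled out. The lines below are a worked version under the same symbols, moving everything to one side of the second equilibrium condition and using q = 1 - p throughout.

```latex
\begin{align*}
0 &= p^2 W_{AA} + 2pq W_{AS} + q^2 W_{SS} - p W_{AA} - q W_{AS} \\
  &= -pq\,W_{AA} + q\,(2p - 1)\,W_{AS} + q^2 W_{SS} \\
  &= q\,\bigl[-p W_{AA} + (2p - 1) W_{AS} + (1 - p) W_{SS}\bigr] \\
  &= (1 - p)\,\bigl[W_{SS} - W_{AS} - p\,(W_{AA} - 2W_{AS} + W_{SS})\bigr]
\end{align*}
```

The factor (1 - p) gives the equilibrium at p = 1, and setting the bracket to zero gives the nontrivial solution.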

Page 24

Stability Analysis: NatSelDiffEqns (Tim Comar, Benedictine College)

Is q = 0.11 stable or unstable?
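One way to approach this question without the NatSelDiffEqns workbook is to check the slope of the recursion map at the equilibrium: for a one-dimensional map, an equilibrium is locally stable when that slope has magnitude below 1. The sketch below is a hedged Python illustration of that standard test, not the workbook's method. The finite-difference step `h` and the function name are assumptions.

```python
W_AA, W_AS, W_SS = 0.9, 1.0, 0.2


def next_p(p):
    """Selection recursion p' = p (p W_AA + q W_AS) / mean fitness."""
    q = 1.0 - p
    w_bar = p * p * W_AA + 2 * p * q * W_AS + q * q * W_SS
    return p * (p * W_AA + q * W_AS) / w_bar


# Nontrivial equilibrium from the closed-form expression.
p_hat = (W_SS - W_AS) / (W_AA - 2 * W_AS + W_SS)

# Central-difference estimate of the slope of the map at p_hat.
h = 1e-6
slope = (next_p(p_hat + h) - next_p(p_hat - h)) / (2 * h)

print("p_hat =", round(p_hat, 3), " q_hat =", round(1 - p_hat, 3))
print("slope at p_hat =", round(slope, 3))
print("stable" if abs(slope) < 1 else "unstable")
```

A slope between 0 and 1 means that small displacements shrink each generation without overshooting, so the population returns smoothly toward q ≈ 0.11.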

Page 25

The Case of the Protective Protein

• HIV docks with the CCR5 surface protein present on some cells of the immune system

• The CCR5 Δ32 allele partially protects against HIV infection

Peterson 1999. JYI 2: ?

Page 26

The Case of the Protective Protein

• Based on genetic evidence, Δ32 arose ~700 years ago.

• Present in ~10% of Caucasians; largely absent in other groups. Why?

Hypothesis: May also have protected vs. plague and/or smallpox.

Biological Question: How much selective advantage must Δ32 have given to become so common in only 700 years?

Mathematical Question: For what fitness values does 700 years lie within the 95% CI of Δ32’s age?
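A rough deterministic first pass at the biological question is sketched below in Python. It ignores drift entirely, so it says nothing about the 95% CI in the mathematical question. Every number other than 700 years and 10% is an assumption made for illustration: a 25-year generation time, a hypothetical population size N = 5000, a single starting copy, selection acting multiplicatively on allele copies, and the 10% figure read as an allele frequency.

```python
import math

GENERATION_YEARS = 25   # assumption: human generation time
N = 5000                # assumption: hypothetical population size
TARGET_FREQ = 0.10      # assumption: ~10% read as an allele frequency


def required_advantage(years, n=N, target=TARGET_FREQ):
    """Per-generation advantage s that lifts one copy to the target frequency."""
    t = years / GENERATION_YEARS
    q0 = 1.0 / (2 * n)
    odds_ratio = (target / (1 - target)) / (q0 / (1 - q0))
    return math.exp(math.log(odds_ratio) / t) - 1.0


def frequency_after(q0, s, generations):
    """Iterate q' = q (1 + s) / (1 + q s) for the given number of generations."""
    q = q0
    for _ in range(generations):
        q = q * (1 + s) / (1 + q * s)
    return q


s = required_advantage(700)
print("required s per generation:", round(s, 3))
print("frequency after 28 generations:", round(frequency_after(1 / (2 * N), s, 28), 3))
```

Under this recursion the odds q / (1 - q) grow by a factor of (1 + s) each generation, which is why the required s can be solved in closed form. Changing the assumed N or generation time changes the answer.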

Page 27

Definitions

Genetic drift: An evolutionary mechanism by which allele frequencies change due to chance alone, independent of those alleles’ effects on fitness.

Examples:

• Absence of blood type B in Native Americans

• Northern elephant seal: virtually no genetic variation 100 years after near-extinction

Page 28

Modeling Genetic Drift

Let N = population size (constant).

Assume this population produces ∞ gametes: f(A) = p, f(B) = q.

But only 2N of those gametes (chosen at random) combine to form the zygotes that develop into the next generation!

$p' = \dfrac{1}{2N}\,B(2N, p) \approx \mathcal{N}\!\left(p, \dfrac{pq}{2N}\right)$
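The sampling step and its normal approximation can be checked by simulation. The sketch below is a minimal Python illustration that uses only the standard library. The starting frequency, population size, seed, and number of replicates are hypothetical choices.

```python
import random
import statistics


def sample_next_p(p, n, rng):
    """Draw 2N gametes at random and return the new frequency of allele A."""
    copies = sum(1 for _ in range(2 * n) if rng.random() < p)
    return copies / (2 * n)


rng = random.Random(1)   # fixed seed so the run is repeatable
p, n = 0.5, 50           # hypothetical starting frequency and population size
draws = [sample_next_p(p, n, rng) for _ in range(20000)]

print("sample mean     :", round(statistics.mean(draws), 4))
print("sample variance :", round(statistics.pvariance(draws), 5))
print("pq / 2N         :", p * (1 - p) / (2 * n))
```

The sample mean should sit near p and the sample variance near pq/2N, the two parameters of the approximating normal distribution.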

Page 29

Genetic Drift as a Random Walk

$p' = \dfrac{1}{2N}\,B(2N, p) \approx \mathcal{N}\!\left(p, \dfrac{pq}{2N}\right)$

[Figure: three panels of allele frequency p (0 to 1) versus Generation (0 to 100), for N = 2000, N = 200, and N = 20]

• Fluctuations are largest in small populations

• p = 0 and p = 1 are absorbing states
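Trajectories like the ones in the three panels can be generated by repeating the binomial sampling step. The sketch below is an illustrative Python version, not the workbook used in the talk. The function name, the seed, and the starting frequency of 0.5 are assumptions.

```python
import random


def drift_trajectory(p0, n, generations, rng):
    """Random walk in p: resample 2N gametes from the current frequency each generation."""
    ps = [p0]
    for _ in range(generations):
        p = ps[-1]
        copies = sum(1 for _ in range(2 * n) if rng.random() < p)
        ps.append(copies / (2 * n))
    return ps


rng = random.Random(42)
for n in (2000, 200, 20):
    traj = drift_trajectory(0.5, n, 100, rng)
    spread = max(traj) - min(traj)
    print(f"N = {n:4d}  final p = {traj[-1]:.3f}  range = {spread:.3f}")
```

Any single run is random, but across runs the range of p tends to widen as N shrinks. Once a trajectory reaches 0 or 1 it stays there, because sampling from a population with only one allele can only return that allele.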

Page 30

Modeling Microevolution: Deme

Page 31

Synthesizing and Applying Math Concepts Using Biological Cases

1. Intro to Population Genetics: Hardy-Weinberg Equilibrium and the Binomial Theorem

2. Evolutionary Analysis: Microevolution, Statistics, and Stability Analysis

3. Survival of the Slightly Better: Exploring an Evolutionary Paradox with Linear Algebra

$\hat{p} = \dfrac{W_{SS} - W_{AS}}{W_{AA} - 2W_{AS} + W_{SS}}$

Page 32

Sickle Cell Strikes Back!

• In addition to the A and S alleles, there is also a C allele for hemoglobin!

• C confers even stronger malaria resistance than AS but with no anemia!

• But C is found only in a few isolated populations. Why might this happen?

Extend previous analysis to 3 alleles: some surprising results!

Page 33

Selection and Sickle-Cell

Hemoglobin alleles: A, S, C

Genotype   Fitness   Effect
AA         0.9       Malaria susceptibility
AS         1.0
AC         0.9       Malaria susceptibility
SS         0.2       Sickle-cell anemia
SC         0.7       Mild anemia
CC         1.3       Strong malaria resistance

C is beneficial only when common!

Page 34

Selection and Sickle-Cell

Recursion Equations:

$p' = p\,(p W_{AA} + q W_{AS} + r W_{AC}) / \bar{W}$

$q' = q\,(p W_{AS} + q W_{SS} + r W_{SC}) / \bar{W}$

$r' = r\,(p W_{AC} + q W_{SC} + r W_{CC}) / \bar{W}$

Equilibria: $p = D_A / D$, $q = D_S / D$, $r = D_C / D$

where

$D_A = (W_{AS} - W_{SS})(W_{AC} - W_{CC}) - (W_{AS} - W_{SC})(W_{AC} - W_{SC})$

$D_S = (W_{AS} - W_{AA})(W_{SC} - W_{CC}) - (W_{AS} - W_{AC})(W_{SC} - W_{AC})$

$D_C = (W_{AC} - W_{AA})(W_{SC} - W_{SS}) - (W_{AC} - W_{AS})(W_{SC} - W_{AS})$

$D = D_A + D_S + D_C$
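These expressions can be evaluated for the three-allele fitness table and checked against the recursion. The sketch below is a hedged Python illustration: p, q, and r are read as the frequencies of A, S, and C, and the helper names `w`, `d_term`, and `step` are hypothetical.

```python
W = {
    ("A", "A"): 0.9, ("A", "S"): 1.0, ("A", "C"): 0.9,
    ("S", "S"): 0.2, ("S", "C"): 0.7, ("C", "C"): 1.3,
}


def w(a, b):
    """Genotype fitness, independent of allele order."""
    return W[(a, b)] if (a, b) in W else W[(b, a)]


def d_term(i, j, k):
    """D_i for allele i, with j and k the other two alleles."""
    return ((w(i, j) - w(j, j)) * (w(i, k) - w(k, k))
            - (w(i, j) - w(j, k)) * (w(i, k) - w(j, k)))


d_a = d_term("A", "S", "C")
d_s = d_term("S", "A", "C")
d_c = d_term("C", "A", "S")
d = d_a + d_s + d_c
p, q, r = d_a / d, d_s / d, d_c / d


def step(p, q, r):
    """One generation of the three-allele selection recursion."""
    w_a = p * w("A", "A") + q * w("A", "S") + r * w("A", "C")
    w_s = p * w("A", "S") + q * w("S", "S") + r * w("S", "C")
    w_c = p * w("A", "C") + q * w("S", "C") + r * w("C", "C")
    w_bar = p * w_a + q * w_s + r * w_c
    return p * w_a / w_bar, q * w_s / w_bar, r * w_c / w_bar


print("equilibrium:", round(p, 4), round(q, 4), round(r, 4))
print("after one step:", tuple(round(v, 4) for v in step(p, q, r)))
```

Applying `step` to the computed frequencies should return them unchanged up to rounding, which confirms that the point is a fixed point of the recursion. It does not by itself say whether that point is stable.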

Page 35

Plotting the Adaptive Landscape

2 alleles: Landscape $\bar{W}(p)$ is a curve in $\mathbb{R}^2$

3 alleles: Landscape $\bar{W}(p, q, r)$ is a sheet in $\mathbb{R}^3$

Constraint: p + q + r = 1
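For the two-allele case, the connection between the landscape and the equilibrium of the selection recursion can be made explicit. The short derivation below takes W(p) to be the mean fitness, writes q = 1 - p, and differentiates.

```latex
\begin{align*}
\bar{W}(p) &= p^2 W_{AA} + 2p(1-p)\,W_{AS} + (1-p)^2 W_{SS} \\
\frac{d\bar{W}}{dp} &= 2\,\bigl[p\,(W_{AA} - 2W_{AS} + W_{SS}) + W_{AS} - W_{SS}\bigr] \\
\frac{d\bar{W}}{dp} = 0 &\iff p = \frac{W_{SS} - W_{AS}}{W_{AA} - 2W_{AS} + W_{SS}} \\
\frac{d^2\bar{W}}{dp^2} &= 2\,(W_{AA} - 2W_{AS} + W_{SS})
\end{align*}
```

With the sickle-cell values (0.9, 1.0, 0.2) the second derivative is 2(0.9 - 2.0 + 0.2) = -1.8, so the interior critical point at p = 0.889 is a peak of the curve.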

Page 36

Stability Analysis

1. Re-express $\bar{W}(p, q, r)$ as $\bar{W}(x, y)$

2. Calculate Hessian matrix:

$\begin{bmatrix} \dfrac{\partial^2 \bar{W}}{\partial x^2} & \dfrac{\partial^2 \bar{W}}{\partial x\,\partial y} \\ \dfrac{\partial^2 \bar{W}}{\partial y\,\partial x} & \dfrac{\partial^2 \bar{W}}{\partial y^2} \end{bmatrix} = \begin{bmatrix} T & U \\ U & V \end{bmatrix}$

where

$T = 2\,(W_{AA} - 2W_{AS} + W_{SS}),$

$U = \tfrac{2\sqrt{3}}{3}\,(W_{AA} - W_{SS} - 2W_{AC} + 2W_{SC}),$

$V = \tfrac{2}{3}\,(W_{AA} + W_{SS} + 4W_{CC} + 2W_{AS} - 4W_{AC} - 4W_{SC}).$

3. Take the determinant and apply the 2nd derivative test:

$TV > U^2$, $T < 0$, $T + V < 0$: Local max

$TV > U^2$, $T > 0$, $T + V > 0$: Local min

$TV < U^2$: Saddle point

$TV = U^2$: Higher-order tests needed
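The Hessian can also be obtained numerically, which gives an independent check on the algebra. The Python sketch below rests on an assumption: the slides do not state the change of variables, so it uses equilateral-triangle coordinates x = q + r/2 and y = (√3/2) r, with A at the left vertex, S at the right, and C at the top. It applies the standard convention that a negative-definite Hessian marks a local maximum. The function names are hypothetical.

```python
import math

W_AA, W_AS, W_AC = 0.9, 1.0, 0.9
W_SS, W_SC, W_CC = 0.2, 0.7, 1.3


def mean_fitness_xy(x, y):
    """Mean fitness in the assumed triangle coordinates."""
    r = 2 * y / math.sqrt(3)
    q = x - y / math.sqrt(3)
    p = 1 - q - r
    return (p * p * W_AA + q * q * W_SS + r * r * W_CC
            + 2 * p * q * W_AS + 2 * p * r * W_AC + 2 * q * r * W_SC)


def hessian(f, x, y, h=1e-3):
    """Central finite differences; exact up to rounding for a quadratic f."""
    t = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    v = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    u = (f(x + h, y + h) - f(x + h, y - h)
         - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return t, u, v


def classify(t, u, v, tol=1e-6):
    """Second-derivative test on the 2x2 Hessian [[t, u], [u, v]]."""
    det = t * v - u * u
    if det < -tol:
        return "saddle point"
    if det > tol:
        return "local max" if t < 0 else "local min"
    return "higher-order tests needed"


t, u, v = hessian(mean_fitness_xy, 0.3, 0.2)
print("T, U, V =", round(t, 4), round(u, 4), round(v, 4))
print(classify(t, u, v))
```

Because the mean fitness is quadratic in x and y, the Hessian is the same at every point, so the evaluation point (0.3, 0.2) is arbitrary.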

Page 37

Survival of the Slightly Better: DeFinetti

Global maximum: only C allele present

Local maximum: C allele eliminated

Saddle point

Page 38

Cases & Mathematics: Explicit Connections

• Binomial & Normal Distributions

• Equilibria & Stability Analysis

• Recursion & Difference Equations

• Geometry of Curves & Solids

• Partial Derivatives

• Combinatorics

• Normalization

• Stochasticity

• Matrix & Linear Algebra