Recombination Histories & Global Pedigrees


Acknowledgements Yun Song - Rune Lyngsø - Mike Steel

Finding Minimal Recombination Histories

Global Pedigrees

Finding Common Ancestors

[Figure: leaves labelled 1 2 3 4 in each panel; time axis marked NOW]

Basic Evolutionary Events

•Recombination

•Gene Conversion

•Coalescent/Duplication

•Mutation

Infinite site assumption ?

Hudson & Kaplan's $R_M$

If you equate $R_M$ with the expected number of recombinations, it could be used as an estimator. Unfortunately, $R_M$ is a gross underestimate of the real number of recombinations.

0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 00 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 00 0 0 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 10 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 11 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 11 1 1 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1
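A minimal sketch of how Hudson & Kaplan's bound can be computed, assuming binary haplotypes given as equal-length strings of 0/1 and at most one mutation per site. The function names (`incompatible`, `hudson_kaplan_rm`) and the toy data are illustrative assumptions, not taken from the slides. Every pair of sites showing all four gametes (00, 01, 10, 11) defines an interval that must contain at least one recombination; the bound is the largest number of such intervals that do not overlap, picked greedily by right endpoint.

```python
from itertools import combinations


def incompatible(col_a, col_b):
    """Four-gamete test: True if 00, 01, 10 and 11 all occur at the two sites."""
    return len(set(zip(col_a, col_b))) == 4


def hudson_kaplan_rm(haplotypes):
    """Lower bound on the number of recombinations (sketch, toy scale only)."""
    n_sites = len(haplotypes[0])
    # Transpose rows (sequences) into columns (sites).
    cols = [[h[s] for h in haplotypes] for s in range(n_sites)]
    # Each incompatible site pair (i, j) needs a breakpoint between i and j.
    intervals = [(i, j) for i, j in combinations(range(n_sites), 2)
                 if incompatible(cols[i], cols[j])]
    # Greedy choice of non-overlapping intervals, smallest right end first.
    intervals.sort(key=lambda ij: ij[1])
    rm, last_right = 0, 0
    for i, j in intervals:
        if i >= last_right:      # starts at or after the last chosen interval ends
            rm += 1
            last_right = j
    return rm


# Hypothetical toy sample: sites 0/1 and sites 1/2 are incompatible.
toy = ["000", "010", "101", "111"]
rm_toy = hudson_kaplan_rm(toy)
```

Two intervals that only share an end site, such as (0, 1) and (1, 2), count as non-overlapping because breakpoints fall between sites.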

Local Inference of Recombinations

Two sites observed in seven sequences:

0 0 0 0 1 1 1

0 0 0 1 1 0 1

Four combinations (00, 10, 01, 11) at a pair of sites: incompatibility.

Myers-Griffiths (2003): the number of recombinations in a sample, $N_R$, the number of types, $N_T$, and the number of mutations, $N_M$, obey:

$N_R \ge N_T - N_M - 1$
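As a rough illustration of this inequality, the sketch below counts distinct types and segregating sites in a binary sample and returns the number of types minus the number of mutations minus one. It assumes at most one mutation per column, so each segregating site counts as one mutation. The function name `haplotype_bound` is hypothetical, and this is only the basic bound on a whole sample, not the full Myers-Griffiths procedure.

```python
def haplotype_bound(haplotypes):
    """Basic bound: (number of types) - (number of mutations) - 1, floored at 0."""
    n_types = len(set(haplotypes))
    n_sites = len(haplotypes[0])
    # With at most one mutation per column, mutations = segregating sites.
    n_mut = sum(1 for s in range(n_sites)
                if len({h[s] for h in haplotypes}) > 1)
    return max(0, n_types - n_mut - 1)


# The two sites 0000111 and 0001101 from the slide, read as seven sequences.
sites = ["0000111", "0001101"]
sample = ["".join(site[i] for site in sites) for i in range(7)]
bound = haplotype_bound(sample)  # 4 types, 2 mutations
```

For the two-site example, four types and two mutations force at least one recombination, which agrees with the four-combination incompatibility.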

Recoding

T . . . G  →  0 0
T . . . C  →  0 1
A . . . G  →  1 0
A . . . C  →  1 1

•At most 1 mutation per column

•0 ancestral state, 1 derived state
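The recoding step could look roughly like the sketch below, which assumes the ancestral sequence is known and supplied by the caller. The function name `recode` and its arguments are hypothetical. A column with more than two states would need more than one mutation, so it is rejected.

```python
def recode(sequences, ancestral):
    """Recode aligned sequences to 0 (ancestral state) / 1 (derived state)."""
    n_sites = len(ancestral)
    for s in range(n_sites):
        states = {seq[s] for seq in sequences} | {ancestral[s]}
        if len(states) > 2:
            # More than two states cannot arise from a single mutation.
            raise ValueError(f"site {s} has more than two states: {sorted(states)}")
    return ["".join("0" if seq[s] == ancestral[s] else "1"
                    for s in range(n_sites))
            for seq in sequences]


# The two variable sites of the slide example; ancestral states T and G
# are assumed here so that the output matches the 0/1 columns shown.
recoded = recode(["TG", "TC", "AG", "AC"], ancestral="TG")
```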

Minimal Number of Recombinations

Last Local Tree Algorithm:

[Figure: Last Local Tree Algorithm. Data positions 1, 2, ..., i-1, i, ..., L on one axis; trees 1, 2, ..., n on the other.]

The Kreitman data (1983): 11 sequences, 3200 bp, 43 segregating sites (28 after recoding), 9 distinct sequences

How many neighbors?

$\frac{(2n-2)!}{2^{n-1}(n-1)!}$

$\frac{n!\,(n-1)!}{2^{n-1}}$

$3n^2 - 13n + 14$

$\sim n^3$

Bi-partitions

How many local trees?

• Unrooted

• Coalescent

Metrics on Trees based on subtree transfers.

Pretending that the easy problem (unrooted) is the real problem (age-ordered) causes violations of the triangle inequality:

Tree topologies with age ordered internal nodes

Rooted tree topologies

Unrooted tree topologies

Trees including branch lengths

Observe that the size of the unit-neighbourhood of a tree does not grow nearly as fast as the number of trees
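To make this observation concrete, the sketch below evaluates three of the counts quoted on these slides: rooted tree topologies, $(2n-2)!/(2^{n-1}(n-1)!)$; topologies with age-ordered internal nodes, $n!\,(n-1)!/2^{n-1}$; and the Allen & Steel (2001) unit-neighbourhood size for unrooted trees, $2(n-3)(2n-7)$. The function names are hypothetical and chosen only for this illustration.

```python
from math import factorial


def rooted_topologies(n):
    """Rooted binary tree topologies on n labelled leaves."""
    return factorial(2 * n - 2) // (2 ** (n - 1) * factorial(n - 1))


def age_ordered_topologies(n):
    """Tree topologies with age-ordered internal nodes on n labelled leaves."""
    return factorial(n) * factorial(n - 1) // 2 ** (n - 1)


def unrooted_spr_neighbours(n):
    """Unit-neighbourhood size for unrooted trees (Allen & Steel 2001)."""
    return 2 * (n - 3) * (2 * n - 7)


# Tabulate tree counts against neighbourhood size for a few sample sizes.
for n in (4, 8, 11):
    print(n, rooted_topologies(n), age_ordered_topologies(n),
          unrooted_spr_neighbours(n))
```

The tree counts grow factorially while the neighbourhood grows quadratically in n, which is what makes neighbourhood-based search feasible.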

Song (2003+)

Due to Yun Song

Tree Combinatorics and Neighborhoods

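$(2n-3)!! = \frac{(2n-2)!}{2^{n-1}(n-1)!}$

Allen & Steel (2001): $2(n-3)(2n-7)$

$3n^2 - 13n + 14$

$4(n-2)^2 - 2\sum_{m=1}^{n-2} \lfloor \log_2(m+1) \rfloor$

$\frac{n!\,(n-1)!}{2^{n-1}}$

$\frac{1}{3}\left(2n^3 - 3n^2 - 20n + 39\right)$

[Figure: example tree with leaves labelled 1 to 7]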

Methods                                                                              # of rec events obtained
Hudson & Kaplan (1985)                                                               5
Myers & Griffiths (2003)                                                             6
Song & Hein (2004). Set theory based approach.                                       7
Song & Hein (2003). Tree scanning using DP                                           7
Lyngsø, Song & Hein (2006). Massive Acceleration using Branch and Bound Algorithm.   7
Lyngsø, Song & Hein (2006). Minimal number of Gene Conversions (in prep.)            5-2/6-1

The Minimal Recombination History for the Kreitman Data

- recombination: 27 ACs (ancestral configurations)

[Figure: minimal recombination history for the Kreitman data; axis ticks 0 to 8]

The Griffiths-Ethier-Tavaré Recursions

No recombination: Infinite Site Assumption

Ancestral State Known

History Graph: Recursions Exist

No cycles

Possible Histories without Recombination for simple data example

+ recombination: 3*10^8 ACs (ancestral configurations)

1st

2nd

Ancestral configurations to 2 sequences with 2 segregating sites

mid-point heuristic

Counting + Branch and Bound Algorithm

[Figure labels: ?, Exact length, Lower bound, Upper bound]

k      k-recombination neighborhood
0      3
1      91
2      1314
3      8618
4      30436
5      62794
6      78970
7      63049
8      32451
9      10467
10     1727
Total  289920

minARGs: Recombination Events & Local Trees

True ARG

Reconstructed ARG

1 2 3 4 5

1 2 3 4 5

((1,2),(1,2,3))

((1,3),(1,2,3))

n=7, =10, =75

Minimal ARG

True ARG

0 4 Mb

Hudson-Kaplan

Myers-Griffiths

Song-Hein

n=8, =40

n=8, =15

Mutation information on only one side

Mutation information on both sides

Reconstructing global pedigrees: Superpedigrees

Steel and Hein, 2005

The gender-labeled pedigrees for all pairs define the global pedigree

k

Gender-unlabeled pedigrees don't!!

•All embedded phylogenies are observable

•Do they determine the pedigree?

Genomes with recombination rate and mutation rate --> infinity

Benevolent Mutation and Recombination Process

Counter example: Embedded phylogenies: