A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael...

45
A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 1 / 45

Transcript of A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael...

Page 1: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

A likelihood method for cophylogenetics

Michael Charleston & Rob Little

School of ITUniversity of Sydney

Phylomania,2009.10.29-30

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 1 / 45

Page 2: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Tracing history

I am, in point of fact, a particularly haughty and exclusiveperson, of pre-Adamite ancestral descent. You will understandthis when I tell you that I can trace my ancestry back to aprotoplasmic primordial atomic globule.

[The Mikado, Gilbert and Sullivan]

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 2 / 45

Page 3: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Cophylogenetics

Cophylogeny is the reconstruction of ancient relationships amongecologically linked groups of organisms, from their phylogeneticinformation.

Cophylogeny has broad applications throughout biological science.

Essentially the problem is that of how to relate phylogenies fromecologically related groups of organisms or habitats, given their(estimated) phylogenies and the known associations between the two.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 3 / 45

Page 4: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Motivation

An estimated 75% of emergent human diseases are zoonoses, that is,they switched hosts from other species into humans.

If we can pinpoint episodes of codivergence (when two ecologicallylinked phylogenetic lineages split together) then we can compare ratesof evolution.

Understanding where an organism came from (e.g., invading pests)tells us how better to combat them.

Discovery of linked divergent histories enables us to find “missing”species (e.g., New Zealand Beech trees Nothofagus).

Possibly undiscovered pathogens?

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 4 / 45

Page 5: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

. . . and anyway,

It’s an interesting problem

It’s hard

Hardly anyone else is doing it.

The ARC now funds me to study it. (Woohoo!)

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 5 / 45

Page 6: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Different systems coevolve

For example,

hosts and their parasites or pathogens;

whole organisms and their genes;

codivergence/

cospeciationduplication/

independent speciation

duplication/

independent speciation

loss/

extinction

horizontal transfer/

host switch

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 6 / 45

Page 7: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

geographical areas and the species which inhabit them.

vicariant speciation

vicariant speciation

invasion

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 7 / 45

Page 8: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

General processes

codivergence

extinctionmiss the boat

host-switch

unsuccessful

host switch

duplication

Host

Pathogen

Untraceable

ghost

failure to diverge

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 8 / 45

Page 9: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Terminology of events

genes parasites pathogens biogeography words

codivergence cospeciation / codivergence vicariance codivergenceduplication independent

speciationemergence sympatric spe-

ciationduplication

horizontaltransfer

host switch host switch /zoonosis

migration borrowing

loss (drift) extinction extinction /eradication

extinction loss

missing the boat / lineage sorting loss

Loss arises from multiple processes, which cannot be distinguished fromthe phylogenies and associations.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 9 / 45

Page 10: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

codivergence & duplication

Definition

A codivergence event occurs when internal vertices p ∈ V (P) andh ∈ V (H) are coincident, and the children of p diversify on the children ofh.

A duplication occurs when p is associated with an arc of H rather than avertex; this corresponds to a speciation or divergence of p that isindependent of a divergence event in the host.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 10 / 45

Page 11: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

host switch & loss

Definition

A host switch occurs for some arc (p, q) ∈ A(P) where p is associatedwith a location in H that is contemprary with, but not ancestral to, thelocation in H with which q is associated. In my approach in order for ahost switch to occur p must duplicate and a successful invasion into a newhost lineage must occur.

A loss occurs as the result of one of three things, which given a probleminstance (below) are indistinguishable: extinction of some p, failure totrack both hosts after a host divergence event (“missing the boat”) andsimple failure to sample the pathogen p.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 11 / 45

Page 12: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Circoviruses

A virus of major economic importance is porcine circovirus 2 (PCV2).

PCV2 is a single-stranded, non-enveloped, circular DNA virus about1.76kb

PCV2 is highly infectious and its prevalence is reported up to 40-60%,even up to 100% in some regions.

PCV2 causes post-weaning multisystemic wasting syndrome(PMWS), which in Australia can lead to 50% mortality [8]

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 12 / 45

Page 13: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Tanglegram of circoviruses

Duck_malard Duck_mulCV

Raven Raven_CV

Starling

Starling_CV

Pig PCV

Goose Goose_CV

Finch Finch_CV

Pigeon Pigeon_CV

Canary

Canary_CV

Duck_muscovy Duckmusc_CV

Swan Muteswan_CV

Gull Gull_CV

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 13 / 45

Page 14: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Consensus-like map of relationships

Raven

Starling

Finch

Canary

Duck

Goose

Swan

Gull

Pigeon

Pig

maps1-6,10,12-15

maps5-10

all maps

maps1,2,4,6,13-15

ma

ps3

,4,1

2,1

3

maps3,5,12

maps7-11

7,8

8 7

9

maps3-10,12,13

Ramdsen et al., Mol. Biol. Evol. 2009Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 14 / 45

Page 15: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Basic terminology

Here all trees are phylogenies: each vertex has 0 or 2 children, and eachtree has a defined internal root, from which all edges are directed.

The edges of T are E (T ); the vertices are V (T ).

Each vertex represents an operational taxonomic unit (OTU) or simplytaxon.

A leaf is a vertex with out-degree 0. The leaves of T are L(T ).

The root represents the common ancestor of all the taxa in the tree.

If the edges are weighted then weights represent some measure ofevolutionary distance (sequence divergence, time, morphological orDNA-DNA hybridisation distance etc.).

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 15 / 45

Page 16: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Problem Statement

Given

a host phylogeny H

an associate phylogeny P

known associations ϕ of the tips of P with those of H

The object is to find out the ancestral relationships between P and H.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 16 / 45

Page 17: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

What we can recover

Ronquist confirmed in 2002 [9] that there are only four types ofrecoverable event for this problem:

1 codivergence,

2 duplication,

3 loss, and

4 host switching

All methods (attempt to) recover codivergence, but not all can recoverhost switching. Some only recover codivergence, duplication and loss.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 17 / 45

Page 18: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Existing methods

There are three basic approaches to answering questions aboutcophylogeny:

parsimony / event-based methods

tree metrics

mapping / reconcilation

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 18 / 45

Page 19: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Parsimony/event-based methods

Brooks came up with an early solution, coined Brooks’ Parsimony Analysis(BPA) [1].

BPA recodes the known associations ϕ and the parasite tree P as binarycharacters and then puts them into a parsimony-based tree reconstructionmethod.

This suffers from many difficulties, including

Requirement for post hoc reinterpretation (generally by Brookshimself).

Inability to cope correctly with host switching without furtherpost-hoc fiddling.

Treatment of non-independent tree characters as independent binarycharacters.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 19 / 45

Page 20: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Tree Metrics

Huelsenbeck and Rannala [4] came up with a method to test codivergence,but it was a metric: it treats P and H as equivalent, whereas we considerP to be “tracking” H.

Their method answered the question “did these trees evolve according tothe same model of cladogenesis”?

Another (better?) question is “how strongly was the evolution of thisparasite or pathogen governed by that of its hosts?”

A later method was maximum-likelihood based and asymmetric [5], but itwas rather simplistic: one major constraint was that it permitted only oneparasite lineage to exist per host lineage.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 20 / 45

Page 21: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Mapping

The most intuitive (and computationally hardest) method is to reconciletrees using a mapping from P into H.

This requires no post hoc interpretation and can recover all events,depending on how the mapping method is implemented.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 21 / 45

Page 22: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

The Cophylogeny Mapping Problem

Given H, P and the map ϕ : L(P) 7→ L(H),

We want to recover a mapping

Φ : V (P) 7→ V (P) × ξ,

where ξ = V (H) ∪ A(H).

An element x of ξ is a location, and an element (p, x) of V (P) × ξ is aj -vertex.

The relationships among associations of V (P) and ξ define the historicalevents that took place.

Thus we need a set of j-vertices (associations), one for each vertexp ∈ V (P), which could account for the coevolutionary history of P and H.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 22 / 45

Page 23: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Definition

Given a host phylogeny H and associate phylogeny P, and associationsgiven by the mapping ϕ : V (P) 7→ V (H) ∪ A(H), a reconstruction is acollection Φ of associations with certain properties:

1 Φ extends ϕ, that is, Φ|L(P) ≡ ϕ (“lifting” condition);

2 If P ∼= H and ϕ preserves the isomorphism, then Φ preserves theisomorphism also (“consistency” condition);

3 Φ is an isomorphism of P (“isomorphism” condition);

4 If p ∈ V (P) \ L(P) is mapped by Φ to ` in H, then at least one childof p must also be mapped by Φ to a descendant of ` (“traceable”condition);

5 If p ∈ V (P) \ L(P) is mapped to ` ∈ V (H), corresponding to acodivergence event, then all the children of p must be mapped todescendants of ` (“interpretable” condition).

(Adapted from [3].)

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 23 / 45

Page 24: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Complexity

The cophylogenetic network reconcilation problem is NP-complete, even ifthere are only two possible locations in time for each host node [6];

In fact it has recently been shown that the tree-specific form (where Hand P are trees) is also NP-complete (Libeskind-Hadas et al., submittedto Jour. Comp. Biol.).

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 24 / 45

Page 25: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Jungles

One solution to the mapping problem is to construct a jungle, whichcontains as subgraphs all the (Pareto) maps [2].

Treating parasite lineages as independent of each other and ignoringproblems with time-travelling parasites we can traverse this junglerelatively quickly.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 25 / 45

Page 26: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

a:T.wardi

b:T.minor

c:G.thomomyus

d:G.actuosi

e:G.ewingi

f:G.chapini

g:G.panamensis

h:G.setzeri

i:G.cherriei

j:G.costaricensis

k

l

m

n

o

p

q

r

s

louse tree1:T. talpoides

2:T. bottae

3:G. busarius

4:O. hispidus

5:O. cav ator

6:O. underw oodi

7:O. cherriei

8:O. heterodus

9

10

11

12

13

14

15

gopher tree

from Charleston, Math. Biosc. 1998

Page 27: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Jungle

(k:9)*

(r:4)

(m:6)

(l:15)*

(d:2)

(e:3)

(l:2)

(l:9)

(l:3)

(c:1)

(o:1)

(g:5)

(h:6)

(m:5)

(i:7)(j:8)

(n:10)*

(p:5)

(f:4)

(r:1)

(r:9)

(r:15)

(a:1)

(b:2)

(s:9)

(s:11)

(s:15)*

(s:3)

(s:4)

(s:13)

(s:5)

(s:12)

(r:15)*

(r:3)

(r:14)

(r:14)*

(r:5)

(r:12)

(r:11)

(o:9)

(o:15)

(o:14)

(o:3)

(q:4)

(q:11)(p:12)*

(p:12)

(p:11)*

(m:12)*

(q:5)

(o:9)*

(l:14)

(o:15)*

(s:15)

(r:13)

(q:13)*

from Charleston, Math. Biosc. 1998Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 27 / 45

Page 28: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Jungle Fowl

(k:9)*

(r:4)

(m:6)

(l:15)*

(d:2) (e:3)

(l:2)

(l:9)

(l:3)

(c:1)

(o:1)

(g:5)

(h:6)

(m:5)

(i:7)

(j:8)

(n:10)*

(p:5)

(f:4)

(r:1)

(r:9)

(r:15)

(a:1)

(b:2)

(s:9)

(s:11)(s:15)*

(s:3)

(s:4)

(s:13)

(s:5)

(s:12)

(r:15)*

(r:3)

(r:14)

(r:14)*

(r:5)

(r:12)

(r:11)

(o:9)

(o:15) (o:14)

(o:3)

(q:4)

(q:11)

(p:12)*

(p:12)

(p:11)*

(m:12)*

(q:5)(o:9)*

(l:14)

(o:15)*

(s:15)

(r:13)

(q:13)*

from Charleston, Math. Biosc. 1998Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 28 / 45

Page 29: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Untraceable EventsNot all coevolutionary events are traceable.

A parasite lineage that leaves no trace of itself on a host lineage, either bybecoming extinct or by leaving that host, cannot be linked with that hostunless in a probabilistic way.

These events are untraceable.

H

P P

H

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 29 / 45

Page 30: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Event Likelihoods and Costs

The basic probabilistic model of codivergence has these ingredients:

pc : The conditional probability that if the host lineage of a parasite(or pathogen) diverges, so does that parasite.

λP : The birth rate of the parasite tree, instantaneous rate of anyextant lineage diverging independently of its host.

µP : The death rate of the parasite tree, instantaneous rate of anyextant lineage going extinct.

pw : The conditional probability that if a duplication event took place,one nascent parasite lineage switches hosts.

These can be roughly ‘scored’ with values c, d , e and w respectively, withc < d , e, w .

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 30 / 45

Page 31: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Definition

A history A is a collection of associations of parasite/pathogen verticesand host locations that contains as a subgraph some Φ.We say that A presents Φ if there is such a Φ that satisfies the conditionsdefined earlier. Note that every mapping Φ is a history.Note that a history may contain untraceable events

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 31 / 45

Page 32: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Combinatorics of segmented histories

So how many histories are there?

Consider a location ` in a segmented history, with first and seconddescendants `′, `′′ (if they exist). If a single parasite lineage p is associatedwith `, we have the recurrence

F (`) =

1 + F (`′) + F (`′)∑

ai

F(

a′i)

, if ` is an arc of H

1 + F (`′) + F (`′′) + F (`′) F (`′′) , if ` is a vertex of H

for the total number of possible histories generated by the parasite lineage,where the ai are all of the arcs in the segment containing `.

This number increases alarmingly!

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 32 / 45

Page 33: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Combinatorics of segmented histories

It is also useful to the consider number of histories presenting a specificreconstruction

Can count histories in much the same way, with special cases forreconstructed associations

I Reconstructed null event may become duplication with or withoutswitch

I Reconstructed lineage sort may become a codivergenceI Other reconstructed associations must be preserved

H (`, p) =

H (`′, p′) H (`′′, p′′) , `, p verticesH (`′, p′) (1 + F (`′′)) , `, p reconstructed as sort

H (`′, p′)(

1 +∑

aiF (a′i )

)

, `, p reconstructed as null

. . .

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 33 / 45

Page 34: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Philosophy

We can approximate time because

1 dating estimates tend to have very large confidence intervals

2 speciation does not happen in an instant

3 given small enough time increments we can approximate thecontinuous case

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 34 / 45

Page 35: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Discretization

1 Given an instance of the problem

2 If there are branch lengths, use them to segment the times intochunks (e.g., 10My) (as many as needed for the whole P)

3 Begin with a map such as the reconciled tree, sensu Page [7]

4 Embed the map in a history

5 Search histories that present the original map via Markov chainMonte Carlo (MCMC) to estimate a likelihood of a given map

6 or Search all histories to find the maximum likelihood map

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 35 / 45

Page 36: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

(a): map1a (b): map1bMap 1 and its two discretizations

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 36 / 45

Page 37: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Probabilities

Pr(map1a) = (p2c ) · (pnphs/2) · (p

3n) · (p

3s )

= p2c · p4

n · phs · p3s /2 (1)

Pr(map1b) = (p2c ) · (p

2n) · (pnphs/2) · (p

3s )

= p2c · p3

n · phs · p3s /2 (2)

Where

pc probability of codivergence

pd probability of duplication

pw probability of host switch, given a duplication

px probability of going extinct

ps probability of sampling extant taxon

Also define pn as (1 − pd − px) and phs ≡ p2dpw (1 − pw ) for brevity.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 37 / 45

Page 38: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

H P

HP

HP

(a) (b)

Figure: A simple tanglegram and two Pareto-optimal solutions

Page 39: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Histories

a b c

a b c

(a) (b)

≤≤

a b c d e f

≤ †

a b c d e f

(c) (d)

Key: •codivergence; ◦duplication; ≤miss the boat; †extinction; �sampling failure

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 39 / 45

Page 40: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Probability distributionspc= 0.75, pw= 0.2, ps= 0.9

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 40 / 45

Page 41: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Outcomes, we hope

Estimates of model parameters, e.g., rates of zoonosis

Easily extensible to include competition, cross-immunity etc.

Fast heuristics for finding most likely maps, even on large problems

Allowance for phylogenetic uncertainty

Doesn’t require either phylogeny to be a tree so can be implementedon DAGs

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 41 / 45

Page 42: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

TreeMap 3

3

In Java now (woo! GUI!Cross-platform!)

Likelihoods and heuristics inprogress

Doesn’t require phylogenies tobe trees (just DAGs)

Hasn’t solved computationalcomplexity by converting toJava (but some things muchfaster)

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 42 / 45

Page 43: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Job opportunities

ARC-funded (DP1094891) post-doc: 3 years starting early 2010

PhD top-up scholarships & projects

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 43 / 45

Page 44: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

Acknowledgements

Support:This work is supported by the Australian Research Council: parts ofTreeMap3 by ARC DP0770991 and subsequent research by ARCDP1094891.

Collaborateurs:

Eddie Holmes

Cadhla Firth (Nee Ramsden)

Rod Page

Iain Beeston

Susan Perkins

. . . and many more.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 44 / 45

Page 45: A likelihood method for cophylogenetics · A likelihood method for cophylogenetics Michael Charleston & Rob Little School of IT University of Sydney Phylomania, 2009.10.29-30 Charleston

ReferencesD. R. Brooks.

How to do BPA, really.Jour. Biog., 28:345–358, 2001.

M. A. Charleston.

Jungles: a new solution to the host/parasite phylogeny reconciliation problem.Math. Biosci., 149(2):191–223, May 1998.

Michael A. Charleston.

Principles of cophylogeny maps.In Michael Lassig and Angelo Valleriani, editors, Biological Evolution and Statistical Physics. Springer-Verlag, 2002.

J. P. Huelsenbeck and B. Rannala.

Phylogenetic methods come of age: testing hypotheses in an evolutionary context.Science, 276:227–232, 1997.

John P. Huelsenbeck, Bruce Rannala, and Ziheng Yang.

Statistical tests of host-parasite cospeciation.Evolution, 51(2):410–419, 1997.

Ran Libeskind-Hadas and Michael Charleston.

On the computational complexity of the reticulate cophylogeny reconstruction problem.Journal of Computational Biology, 16(1):05–117, 2009.doi:10.1089/cmb.2008.0084.

R. D. Page and M. A. Charleston.

From gene to organismal phylogeny: reconciled trees and the gene tree/species tree problem.Mol. Phyl. Evol., 7(2):231–240, Apr 1997.

C. Ramsden, M. A. Charleston, S Duffy, B. Shapiro, and E. C. Holmes.

Insights into the evolutionary history of an emerging livestock pathogen: Porcine circovirus 2.Journal of Virology.(in press).

Fredrik Ronquist.

Parsimony analysis of coevolving species associations.In Roderic D. M. Page, editor, Tangled trees: phylogeny, cospeciation, and coevolution, pages 22–64. Chicago UniversityPress, Chicago, 2002.

Charleston & Little (USyd) A likelihood method for cophylogenetics Phylomania, 2009.10.29-30 45 / 45