WPGMA Input: Distance matrix D ij; Initially each element is a cluster. n r - size of cluster r Find...


Page 1

WPGMA

• Input: distance matrix $D_{ij}$. Initially each element is a cluster; $n_r$ is the size of cluster $r$.

• Find the minimum element $D_{rs}$ in $D$ and merge clusters $r$ and $s$.

• Delete elements $r$ and $s$, and add a new element $t$ with

$$D_{it} = D_{ti} = \frac{n_r}{n_r+n_s}\,D_{ir} + \frac{n_s}{n_r+n_s}\,D_{is}$$

• Repeat until a single cluster remains.
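The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `cluster`, the dictionary-of-frozensets representation of D, and the bracketed labels for merged clusters are all assumptions made for the example. Note that many references call this size-weighted update UPGMA and reserve the name WPGMA for the variant with fixed weights of 1/2; the sketch follows the formula as given on the slide.

```python
from itertools import combinations


def cluster(names, dist):
    """Size-weighted average-linkage clustering following the slide's steps.

    names: leaf labels. dist: dict mapping frozenset({a, b}) to D_ab.
    Returns one (r, s, t, height) tuple per merge, in merge order.
    """
    size = {name: 1 for name in names}  # n_r for each current cluster
    d = dict(dist)                      # working copy of the matrix D
    merges = []
    while len(size) > 1:
        # Step 1: find the minimum element D_rs among current clusters.
        r, s = min(combinations(size, 2), key=lambda pair: d[frozenset(pair)])
        d_rs = d[frozenset((r, s))]
        n_r, n_s = size[r], size[s]
        t = f"({r},{s})"                # hypothetical label for the new cluster
        # Step 2: D_it = n_r/(n_r+n_s) * D_ir + n_s/(n_r+n_s) * D_is.
        for i in size:
            if i != r and i != s:
                d_ir, d_is = d[frozenset((i, r))], d[frozenset((i, s))]
                d[frozenset((i, t))] = (n_r * d_ir + n_s * d_is) / (n_r + n_s)
        # Step 3: delete r and s, add t.
        del size[r], size[s]
        size[t] = n_r + n_s
        merges.append((r, s, t, d_rs / 2))  # node height is half of D_rs
    return merges


# Toy input (made up for illustration, not taken from the slides).
toy = {frozenset("ab"): 2, frozenset("ac"): 6, frozenset("bc"): 8}
for step in cluster(["a", "b", "c"], toy):
    print(step)
```

On the toy input, a and b merge first at height 1.0, and the merged cluster then joins c at height 3.5.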

Page 2

The distance table

              dog     bear  raccoon   weasel     seal sea lion      cat    chimp
dog             0       32       48       51       50       48       98      148
bear                     0       26       34       29       33       84      136
raccoon                           0       42       44       44       92      152
weasel                                     0       44       38       86      142
seal                                                0       24       89      142
sea lion                                                     0       90      142
cat                                                                   0      148
chimp                                                                          0

Page 3

The distance table

The smallest entry in the table is D(seal, sea lion) = 24, so seal and sea lion are merged first.
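Finding the minimum entry of the distance table can be sketched as below. The values are the ones from the table; the variable names and the dictionary layout keyed by taxon pairs are assumptions made for this illustration.

```python
names = ["dog", "bear", "raccoon", "weasel", "seal", "sea lion", "cat", "chimp"]

# Upper triangle of the distance table, row by row, without the zero diagonal.
upper = [
    [32, 48, 51, 50, 48, 98, 148],  # dog
    [26, 34, 29, 33, 84, 136],      # bear
    [42, 44, 44, 92, 152],          # raccoon
    [44, 38, 86, 142],              # weasel
    [24, 89, 142],                  # seal
    [90, 142],                      # sea lion
    [148],                          # cat
]

dist = {}
for i, row in enumerate(upper):
    for offset, value in enumerate(row, start=1):
        dist[(names[i], names[i + offset])] = value

closest = min(dist, key=dist.get)
print(closest, dist[closest])  # ('seal', 'sea lion') 24
```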

Page 4

Starting tree

We call the parent node of seal and sea lion “ss”.

Distance between these two taxa was 24, so each branch has a length of 12.

[Figure: seal and sea lion joined at node ss, each by a branch of length 12.]

Page 5

Removing the seal and sea lion rows and columns, and adding the ss row and column

              dog     bear  raccoon   weasel       ss      cat    chimp
dog             0       32       48       51        ?       98      148
bear                     0       26       34        ?       84      136
raccoon                           0       42        ?       92      152
weasel                                     0        ?       86      142
ss                                                  0        ?        ?
cat                                                          0      148
chimp                                                                 0

Page 6

Computing dog-ss distance

              dog     bear  raccoon   weasel     seal sea lion      cat    chimp
dog             0       32       48       51       50       48       98      148

$$D((ij),k) = \frac{n(i)}{n(i)+n(j)}\,D(i,k) + \frac{n(j)}{n(i)+n(j)}\,D(j,k)$$

Here, i = seal, j = sea lion, k = dog.

n(i) = n(j) = 1.

D(ss,dog) = 0.5 D(seal,dog) + 0.5 D(sea lion,dog) = 0.5·50 + 0.5·48 = 49.
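The same update fills in every entry of the new ss row. A small sketch, with the seal and sea lion distances copied from the distance table; the helper name `merged_distance` and the dictionaries are assumptions made for this example.

```python
def merged_distance(d_ik, d_jk, n_i=1, n_j=1):
    """D((ij),k) as the size-weighted average of D(i,k) and D(j,k)."""
    return (n_i * d_ik + n_j * d_jk) / (n_i + n_j)


# Distances from seal and from sea lion to every other taxon (from the table).
seal = {"dog": 50, "bear": 29, "raccoon": 44, "weasel": 44, "cat": 89, "chimp": 142}
sea_lion = {"dog": 48, "bear": 33, "raccoon": 44, "weasel": 38, "cat": 90, "chimp": 142}

ss = {k: merged_distance(seal[k], sea_lion[k]) for k in seal}
print(ss)
# {'dog': 49.0, 'bear': 31.0, 'raccoon': 44.0, 'weasel': 41.0, 'cat': 89.5, 'chimp': 142.0}
```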

Page 7

The new table. Starting second iteration…

              dog     bear  raccoon   weasel       ss      cat    chimp
dog             0       32       48       51       49       98      148
bear                     0       26       34       31       84      136
raccoon                           0       42       44       92      152
weasel                                     0       41       86      142
ss                                                  0     89.5      142
cat                                                          0      148
chimp                                                                 0

Page 8

Starting tree

We call the parent node of bear and raccoon “br”.

Distance between bear and raccoon was 26, so each branch has a length of 13.

[Figure: two subtrees so far. seal and sea lion are joined at node ss by branches of length 12; bear and raccoon are joined at node br by branches of length 13.]

Page 9

Computing br-ss distance

              dog     bear  raccoon   weasel       ss      cat    chimp
ss             49       31       44       41        0     89.5      142

$$D((ij),k) = \frac{n(i)}{n(i)+n(j)}\,D(i,k) + \frac{n(j)}{n(i)+n(j)}\,D(j,k)$$

Here, i = raccoon, j = bear, k = ss.

n(i) = n(j) = 1.

D(br,ss) = 0.5 D(bear,ss) + 0.5 D(raccoon,ss) = 0.5·31 + 0.5·44 = 37.5.

Page 10

The new table. Starting third iteration…

              dog       br   weasel       ss      cat    chimp
dog             0       40       51       49       98      148
br                       0       38     37.5       88      144
weasel                            0       41       86      142
ss                                         0     89.5      142
cat                                                 0      148
chimp                                                        0

Page 11

Starting tree

Distance between br and ss was 37.5, so their parent node brss lies at height 18.75. This is the distance from brss to the leaves. The branch from brss to ss therefore has length 18.75 - 12 = 6.75, and the branch from brss to br has length 18.75 - 13 = 5.75.

[Figure: node brss joins ss (branch of length 6.75) and br (branch of length 5.75). seal and sea lion hang from ss by branches of length 12; bear and raccoon hang from br by branches of length 13.]
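The branch-length rule used here (node height is half the merged distance, and each branch is the new height minus the child's height) can be written as a short helper. The function name `branch_lengths` and its argument order are assumptions made for this sketch.

```python
def branch_lengths(d_rs, height_r, height_s):
    """Return (height of new node, branch to r, branch to s) for a merge at distance d_rs."""
    height_t = d_rs / 2
    return height_t, height_t - height_r, height_t - height_s


print(branch_lengths(24, 0, 0))      # seal + sea lion -> (12.0, 12.0, 12.0)
print(branch_lengths(37.5, 13, 12))  # br + ss         -> (18.75, 5.75, 6.75)
```

Leaves have height 0, which is why the first merge gives two branches of equal length.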

Page 12

Computing dog-brss distance

              dog       br   weasel       ss      cat    chimp
dog             0       40       51       49       98      148

$$D((ij),k) = \frac{n(i)}{n(i)+n(j)}\,D(i,k) + \frac{n(j)}{n(i)+n(j)}\,D(j,k)$$

Here, i = br, j = ss, k = dog.

n(i) = n(j) = 2.

D(brss,dog) = 0.5 D(br,dog) + 0.5 D(ss,dog) = 0.5·40 + 0.5·49 = 44.5.

Page 13

The new table. Starting fourth iteration…

              dog     brss   weasel      cat    chimp
dog             0     44.5       51       98      148
brss                     0     39.5    88.75      143
weasel                            0       86      142
cat                                        0      148
chimp                                               0

Page 14

Starting tree

Distance between brss and weasel was 39.5, so wbrss is mapped to the line 19.75. The distance from wbrss to brss is thus 19.75 - 18.75 = 1.

[Figure: tree drawn against a height axis with marks at 0 (leaves), 12 (ss), 13 (br), 18.75 (brss) and 19.75 (wbrss). weasel joins brss at node wbrss.]

Page 15

Computing dog-wbrss distance

              dog     brss   weasel      cat    chimp
dog             0     44.5       51       98      148

$$D((ij),k) = \frac{n(i)}{n(i)+n(j)}\,D(i,k) + \frac{n(j)}{n(i)+n(j)}\,D(j,k)$$

Here, i = brss, j = weasel, k = dog.

n(i) = 4, n(j) = 1.

D(wbrss,dog) = 0.8 D(brss,dog) + 0.2 D(weasel,dog) = 44.5·8/10 + 51·2/10 = (356+102)/10 = 45.8.
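This is the first merge in the example where the two clusters differ in size, so the weights are no longer 0.5 each. The sketch below recomputes them with exact fractions to avoid rounding noise; the helper name `weights` is an assumption made for this illustration.

```python
from fractions import Fraction


def weights(n_i, n_j):
    """Return n(i)/(n(i)+n(j)) and n(j)/(n(i)+n(j)) as exact fractions."""
    total = n_i + n_j
    return Fraction(n_i, total), Fraction(n_j, total)


w_brss, w_weasel = weights(4, 1)  # brss holds 4 taxa, weasel holds 1
d_wbrss_dog = w_brss * Fraction("44.5") + w_weasel * Fraction(51)
print(w_brss, w_weasel, float(d_wbrss_dog))  # 4/5 1/5 45.8
```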

Page 16

The new table. Starting fifth iteration…

              dog    wbrss      cat    chimp
dog             0     45.8       98      148
wbrss                    0     88.2    142.8
cat                               0      148
chimp                                      0

Page 17

Starting tree

Distance between wbrss and dog was 45.8, so dwbrss is mapped to the line 22.9. The distance from dwbrss to wbrss is thus 22.9 - 19.75 = 3.15.

[Figure: tree drawn against a height axis with marks at 0 (leaves), 12 (ss), 13 (br), 18.75 (brss), 19.75 (wbrss) and 22.9 (dwbrss). dog joins wbrss at node dwbrss.]

Page 18

The new table. Starting sixth iteration…

           dwbrss      cat    chimp
dwbrss          0   89.833  143.667
cat                      0      148
chimp                             0

Page 19

Starting tree

Distance between dwbrss and cat was 89.833, so cdwbrss is mapped to the line 44.9165. The distance from cdwbrss to dwbrss is thus 44.9165 - 22.9 = 22.0165.

[Figure: tree drawn against a height axis with marks at 0 (leaves), 12 (ss), 13 (br), 18.75 (brss), 19.75 (wbrss), 22.9 (dwbrss) and 44.9165 (cdwbrss). cat joins dwbrss at node cdwbrss.]

Page 20

The new table. Starting seventh and final iteration…

          cdwbrss    chimp
cdwbrss         0 144.2857
chimp                    0

Page 21

Starting tree

Distance between cdwbrss and chimp was 144.2857, so the root is mapped to the line 72.14285. The distance from the root to cdwbrss is thus 72.14285 - 44.9165 = 27.22635.

[Figure: the complete tree drawn against a height axis with marks at 0 (leaves), 12 (ss), 13 (br), 18.75 (brss), 19.75 (wbrss), 22.9 (dwbrss), 44.9165 (cdwbrss) and 72.14 (root). chimp joins cdwbrss at the root.]

Page 22

Neighbor Joining Algorithm (Saitou & Nei, 1987)

• Input: distance matrix $D_{ij}$. Initially each element is a cluster.

• Find the minimum element $D_{rs}$ in $D$ and merge clusters $r$ and $s$.

• Delete elements $r$ and $s$, and add a new element $t$ with $D_{it} = D_{ti} = (D_{ir} + D_{is} - D_{rs})/2$.

• Repeat.

• Present the hierarchy as a tree with similar elements near each other.
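The distance update in the third bullet can be sketched as a one-line helper. The function name `nj_update` is an assumption, and the sketch covers only the update step, not pair selection. As far as I know, Saitou and Nei's method chooses the pair to merge with a criterion that also accounts for each element's total distance to all the others, a detail the bullets above abbreviate. The numbers are borrowed from the distance table purely as an illustration; the slides do not run neighbor joining on it.

```python
def nj_update(d_ir, d_is, d_rs):
    """Distance from an existing element i to the new element t that replaces r and s."""
    return (d_ir + d_is - d_rs) / 2


# Illustration only: i = dog, r = seal, s = sea lion.
print(nj_update(50, 48, 24))  # 37.0
```

For comparison, the averaging update used in the worked example gives 49 for the same three distances.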