Multiple Sequence Alignment - Wellesley CS (cs.wellesley.edu/~cs303/slides/7_MSA.pdf)

Multiple Sequence Alignment

Sequences

> Yeast YOR020c
mstllksaksivplmdrvlvqrikaqaktasglylpeknveklnqaevvavgpgftdangnkvvpqvkvgdqvlipqfggstiklgnddevilfrdaeilakiakd

> Neurospora crassa
mattvrsvksliplldrvlvqrvkaeaktasgiflpessvkdlneakvlavgpgaldkdgkrlpmgvnagdrvlipqyggspvkvgeeeytlfrdseilakiae

> Aspergillus nidulans
msllrnvknlaplldrvlvqrvkpeaktasgiflpessvkeqneakvlavgpgavdrngqripmgvaagdrvlvpqfggsplkigeeeyhlfrdseilakine

> Schizosaccharomyces pombe (fission yeast)
matklksaksivplldrilvqrikadtktasgiflpeksveklsegrvisvgkggynkegklaqpsvavgdrvllpayggsnikvgeeeyslyrdhellaiike

> Mortierella alpina
masritkfsktivpmmdrvlvqrikpqqktasgiyipekaqealnegyvvavgkglttqegkvvpselaegdkvllppyggsvvkvdneelilfreseilakiq

> Crypthecodinium cohnii
matgiakrftplldrvlvqrlkpeaktasglflpesaakapnyatvlavgpggrtrdgdilpmnvkvgdkvvvpeyggmtlkfedeefqvfrdadimgilne

> Drosophila melanogaster
maaaikkiipmldriliqraealtktkggivlpekavgkvlegtvlavgpgtrnastgnhipigvkegdrvllpefggtkvnlegdqkelflfresdilakle

> Homo sapiens
agqafrkflplfdrvlversaaetvtkggimlpeksqgkvlqatvvavgsgskgkggeiqpvsvkvgdkvllpeyggtkvvlddkdyflfrdgxilgky

> Geobacillus stearothermophilus
vlkplgdrvvievieteektasgivlpdtakekpqegrvvavgkgrvldsgervapevevgdriifskyagtevkydgkeylilresdilavig

> Mycobacterium tuberculosis
makvnikpledkilvqaneaetttasglvipdtakekpqegtvvavgpgrwdedgekripldvaegdtviyskyggteikyngeeylilsardvlavvsk

> Mus musculus (house mouse)
magqafrkflllfdrvlversaaetvtkggimlpeksqgkvlqatvvavgsggkgksgeiepvsvkvgdkvllpeyggtkvvlddkdyflfrdsdilgkyvn

Multiple Sequence Alignment (MSA)

Why MSA?

• Proteins are often related to a larger group (i.e., a family) of proteins

• Multiple sequence alignment is more sensitive than pairwise alignment for detecting homologs

• MSAs can elucidate conserved residues, motifs, or other functional regions in a protein

• MSA is critical for phylogenetic analysis

– Selection of sequences

– Multiple sequence alignment of sequences

– Tree building

– Tree evaluation

Pairwise Alignment

[Figure: dynamic-programming table for a pairwise alignment]

3-sequence Alignment

[Figure: the alignment table extended to three sequences (AGA, AGT, TCC)]
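
The table behind a pairwise alignment can be sketched in a few lines. This is a minimal Needleman-Wunsch scorer, not the scheme used on the slide: the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of the best global alignment of a and b (Needleman-Wunsch).

    The scoring values are illustrative assumptions, not taken from the slides.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # table[i][j] holds the best score for aligning a[:i] with b[:j].
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, cols):
        table[0][j] = j * gap          # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            table[i][j] = max(
                table[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                table[i - 1][j] + gap,       # gap in b
                table[i][j - 1] + gap,       # gap in a
            )
    return table[-1][-1]


print(global_alignment_score("AGT", "AGA"))
```

The table has one axis per sequence, so a third sequence turns the grid into a cube and the number of cells grows as the product of the sequence lengths. That growth is why exact dynamic programming is rarely used for many sequences.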

Multiple Sequence Alignment

[Figure: multiple sequence alignment of the eleven sequences above]

Pairwise Alignment Scores

[Table: triangular matrix of pairwise alignment scores. Rows and columns, in order: Yeast, Neurospora, Aspergillus, Schizosaccharomyces, Mortierella, Crypthecodinium, Drosophila, Homo, Geobacillus, Mycobacterium, Mus]

49 46 78 45 55 54 44 38 37 42 52 41 40 43 46 44 41 39 43

43 48 45 45 40 40 38 39 42 53 55 41 41 40 40

43 46 40 43 38 39 61 43 34 36 45

49 42 36 49 37 32 93

59 38 32

Guide Tree

[Figure: guide tree with leaves, in order: Neurospora, Aspergillus, Yeast, Schizosaccharomyces, Crypthecodinium, Drosophila, Geobacillus, Mycobacterium, Mortierella, Homo, Mus]

Constructing a Guide Tree

• Unweighted pair group method with arithmetic mean (UPGMA)

• Neighbor joining (NJ)

Unweighted Pair Group Method with Arithmetic mean (UPGMA)

• Assume each organism is its own group

• Repeat the following step

– Merge together the two closest groups
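
The UPGMA steps can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `upgma`, the nested-tuple tree representation, and the four-taxon distance table are all hypothetical. It works on distances (smaller means closer), so similarity scores like those in the slides would first have to be converted to distances, a step that is left out here.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a guide tree as nested tuples.

    dist maps frozenset({x, y}) to the distance between leaves x and y.
    """
    # Start with every organism in its own group; track group sizes.
    sizes = {name: 1 for name in names}
    d = dict(dist)
    while len(sizes) > 1:
        # Find the two closest groups.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        # Distance from the merged group to any other group is the
        # size-weighted average, i.e. the mean over all leaf pairs.
        for c in sizes:
            if c not in (a, b):
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))]
                    + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes.pop(a) + sizes.pop(b)
    return next(iter(sizes))


# Hypothetical distances between four taxa.
names = ["A", "B", "C", "D"]
dist = {
    frozenset("AB"): 2, frozenset("AC"): 6, frozenset("AD"): 10,
    frozenset("BC"): 6, frozenset("BD"): 10, frozenset("CD"): 10,
}
tree = upgma(names, dist)
print(tree)
```

In this example A and B merge first, C joins them next, and D joins last, so the printed tuple nests A and B most deeply.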

[Figure sequence: UPGMA run step by step on the pairwise score matrix of the eleven organisms, merging groups until the guide tree is complete]

Neighbor Joining (NJ)

• Generate full tree with starlike structure

• Repeat the following step

– Connect two closest groups (i.e., neighbors) through a single node
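
A rough sketch of neighbor joining follows, with some assumptions spelled out. The function name, the nested-tuple output, and the five-taxon distance table are hypothetical. The standard algorithm does not pick the pair with the smallest raw distance; it picks the pair minimizing the Q criterion, which also rewards being far from every other group. Branch lengths are omitted to keep the example short.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return an unrooted tree topology as nested tuples.

    dist maps frozenset({x, y}) to the distance between x and y.
    """
    nodes = list(names)  # all nodes start attached to one central node (star)
    d = dict(dist)
    while len(nodes) > 3:
        n = len(nodes)
        # Total distance from each node to all the others.
        total = {
            x: sum(d[frozenset((x, y))] for y in nodes if y != x) for x in nodes
        }

        def q(pair):
            # Small Q: the pair is close together and far from the rest.
            x, y = pair
            return (n - 2) * d[frozenset(pair)] - total[x] - total[y]

        a, b = min(combinations(nodes, 2), key=q)
        joined = (a, b)  # connect the two neighbors through a single new node
        for c in nodes:
            if c not in (a, b):
                d[frozenset((joined, c))] = (
                    d[frozenset((a, c))] + d[frozenset((b, c))] - d[frozenset((a, b))]
                ) / 2
        nodes = [c for c in nodes if c not in (a, b)] + [joined]
    return tuple(nodes)


# Hypothetical distances between five taxa.
names = ["a", "b", "c", "d", "e"]
pairs = {
    "ab": 5, "ac": 9, "ad": 9, "ae": 8, "bc": 10,
    "bd": 10, "be": 9, "cd": 8, "ce": 7, "de": 3,
}
dist = {frozenset(k): v for k, v in pairs.items()}
tree = neighbor_joining(names, dist)
print(tree)
```

The loop stops at three nodes because an unrooted tree ends in a single internal node with three subtrees. Here a and b are joined first, then c is attached to that pair, leaving d and e on the other side.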

[Figure sequence: neighbor joining run step by step on the eleven organisms (Yeast, Neurospora, Aspergillus, Schizosaccharomyces, Mortierella, Crypthecodinium, Drosophila, Homo, Geobacillus, Mycobacterium, Mus)]

Multiple Sequence Alignment

[Figure: the resulting multiple sequence alignment]

What can phylogeny do for you?

Why do we care about evolution and the evolutionary history of organisms?

OR

How do we benefit from phylogeny?

AND

How is bioinformatics related to any of this?

What are the goals of phylogeny?

1) Deduce correct trees of life for all species

2) Infer or estimate divergence times

All life forms share a common origin and are part of the Tree of Life

How can we use phylogenetic analyses?

Revolutionizing the Tree of Life

Carl Woese: rRNA identifies Archaea as a separate branch of the Tree of Life

Discovering new life forms

Developing effective snakebite antivenins

Identifying emergent diseases

Protecting ecosystems from invasive species

Caulerpa taxifolia

Purple loosestrife

Eurasian water milfoil

Common phylogenetic tree terminology

[Figure: example tree with terminal nodes A, B, C, D, and E]

• Ancestral node or ROOT of the tree

• Internal nodes: hypothetical taxonomic units (HTUs)

• Branches or lineages

• Terminal nodes: operational taxonomic units (OTUs). Represent the TAXA (genes, populations, species, etc.) used to infer the phylogeny

Phylogenetic trees can be drawn many ways

[Figure: one tree with taxa A, B, C, D, and E drawn in several layouts]

Clade: a group made up of a single common ancestor and all of its descendants

“B-C clade”

“D-E clade”

“A-B-C clade”
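
The clade definition can be checked mechanically. The sketch below assumes the tree topology ((A,(B,C)),(D,E)), which is inferred from the three clades named on the slide rather than stated there. The nested-tuple representation and function names are hypothetical.

```python
def leaves(tree):
    """Set of leaf names under a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Every group of two or more taxa that forms a clade in the tree."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}          # this node plus all of its descendants
    for child in tree:
        found |= clades(child)
    return found


# Assumed topology, consistent with the B-C, D-E, and A-B-C clades.
tree = (("A", ("B", "C")), ("D", "E"))
for group in sorted(clades(tree), key=lambda g: (len(g), sorted(g))):
    print("-".join(sorted(group)))
```

A set such as {C, D} is not a clade in this tree, because the smallest subtree containing both C and D also contains A, B, and E.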

Phylogenetic trees can be rooted or unrooted

[Figure: a rooted and an unrooted tree of taxa A, B, C, and D]

• Rooted

– Specifies evolutionary path

– Root node is most recent common ancestor of all TUs; specifies time flow

• Unrooted

– Shows degree of kinship

– Doesn’t make assumptions or require knowledge of common ancestor

Phylogenetic trees can be scaled or unscaled

[Figure: a cladogram and a phylogram of taxa A, B, C, and D]

• Unscaled (cladogram): branch length not proportional to number of changes/distance

• Scaled (phylogram): branch length proportional to number of changes/distance

Phylogenetic trees diagram evolutionary relationships

No meaning to the spacing between the taxa, or to the order in which they appear from top to bottom.

1) No scale (cladograms)

2) Proportional to genetic distance (phylograms)

3) Proportional to time (ultrametric trees)

Rotating clades: same meanings

[Figure: two drawings of one tree with taxa A, B, C, D, and E, the second with clades rotated, marked as equal]
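
One way to see that rotating clades changes nothing is to sort the children at every node before comparing trees. The sketch below is illustrative only: the nested-tuple representation, the function name `canonical`, and the example topologies are assumptions, not taken from the slides.

```python
def canonical(tree):
    """Sort the children at every node so rotated drawings compare equal."""
    if isinstance(tree, str):
        return tree
    # repr gives a string for both leaves and subtrees, so mixed
    # children can be ordered consistently.
    return tuple(sorted((canonical(child) for child in tree), key=repr))


original = (("A", ("B", "C")), ("D", "E"))
rotated = (("E", "D"), (("C", "B"), "A"))     # same tree, clades rotated
different = ((("A", "B"), "C"), ("D", "E"))   # B now groups with A, not C

print(canonical(original) == canonical(rotated))
print(canonical(original) == canonical(different))
```

The first comparison is true because only the drawing order changed. The second is false because the grouping itself changed.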

Interpreting phylogenetic trees

Is the frog more closely related to the fish or the human?
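
Relatedness on a tree is read from the most recent common ancestor, not from how close two names sit on the page. The sketch below assumes the usual textbook topology in which a ray-finned fish branches off before the frog and human lineages split. That tree, the nested-tuple representation, and the function names are assumptions for illustration.

```python
def leaves(tree):
    """Set of leaf names under a node; a leaf is a plain string."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def mrca_size(tree, x, y):
    """Number of leaves under the most recent common ancestor of x and y."""
    if not isinstance(tree, str):
        for child in tree:
            # Descend while a single child still contains both taxa.
            if {x, y} <= leaves(child):
                return mrca_size(child, x, y)
    return len(leaves(tree))


# Assumed topology: fish splits off first, then frog and human diverge.
tree = ("fish", ("frog", "human"))
print(mrca_size(tree, "frog", "human"))
print(mrca_size(tree, "frog", "fish"))
```

A smaller value means a more recent common ancestor. Under this assumed tree the frog shares a more recent ancestor with the human than with the fish.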

How are phylogenetic trees built?

Traditionally: use homologous structures

Caveats:

- Closely related organisms don’t always look similar

- Similar looking organisms are not always closely related

- How do you decide importance of traits?

Structural analogy can result from convergent evolution

Classification based on traits can be tricky

cell number

organelles

Molecular phylogenetic trees

• Eliminates analogy and trait selection issues

• Large molecular data sets: Bioinformatics!

• Result: great improvement on classical phylogenies

Caveat:

• Gene divergence may not correlate with species divergence

• Molecular clock vs. punctuated equilibrium

Molecular phylogenies can be constructed using different elements

• Reasonably well conserved, present in common ancestors

• Nuclear genes

• Mitochondrial DNA

• Genome structure

• Usually integrate analyses of multiple different genes

Molecular comparisons vs. body plans

Which species are the closest living relatives of modern humans?

MitoDNA, most nuclear genes, and DNA hybridization

[Figure: tree of chimpanzees, orangutans, humans, bonobos, and gorillas; time axis from 14 to 0 MYA]

Bonobos and chimpanzees are related more closely to humans than either is to gorillas.

Pre-molecular view

[Figure: tree of humans, bonobos, gorillas, orangutans, and chimpanzees; time axis from 15-30 to 0 MYA]

Great apes (chimpanzees, gorillas and orangutans) formed a clade separate from humans.

What is the closest living relative of whales?

Phylogenetic trees are hypotheses

How do you construct phylogenetic trees?

How do you test the robustness of hypotheses?

What computational strategies are used?