Evolution of Proteins and Genomes select subset of slides

Biochemistry and Molecular Genetics, Computational Bioscience Program, Consortium for Comparative Genomics, University of Colorado School of Medicine
[email protected]
www.EvolutionaryGenomics.com


Page 1: Evolution of Proteins and Genomes select subset of slides

Biochemistry and Molecular Genetics
Computational Bioscience Program

Consortium for Comparative Genomics
University of Colorado School of Medicine

[email protected]

Evolution of Proteins and Genomes

select subset of slides

Page 2: Evolution of Proteins and Genomes select subset of slides

Evolution of Proteins

Jason de Koning

Page 3: Evolution of Proteins and Genomes select subset of slides

Description

Focus on protein structure, sequence, and functional evolution

Subjects

structural comparison and prediction, biochemical adaptation, evolution of protein complexes, probabilistic methods for detecting patterns of sequence evolution, effects of population structure on protein evolution, lattice and other computational models of protein evolution, protein folding and energetics, mutagenesis experiments, directed evolution, coevolutionary interactions within and between proteins, and detection of adaptation, diversifying selection and functional divergence.

Page 4: Evolution of Proteins and Genomes select subset of slides

Reconstruction of Ancestral Function

Page 5: Evolution of Proteins and Genomes select subset of slides
Page 6: Evolution of Proteins and Genomes select subset of slides

Comparative Sequence Analysis
Looking at sets of sequences

Mouse:  …TLSPGLKIVSNPL…
Rat:    …TLTPGLKLVSDTL…
Baboon: …TVSPGLRIVSDGV…
Chimp:  …TISPGLVIVSENL…

Conserved proline

Variable

“High entropy”

A common but wrong assumption: sequences are a random sample from the set of all possible sequences
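The "conserved" and "high entropy" labels can be made concrete with a per-column Shannon entropy. The following is a minimal sketch, not the lecturer's method: the function names are made up for illustration, and the calculation treats the four sequences as independent samples, which is exactly the assumption the slide flags as wrong.

```python
import math
from collections import Counter

# The four aligned fragments shown on the slide.
ALIGNMENT = {
    "Mouse": "TLSPGLKIVSNPL",
    "Rat": "TLTPGLKLVSDTL",
    "Baboon": "TVSPGLRIVSDGV",
    "Chimp": "TISPGLVIVSENL",
}


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())


def alignment_entropies(sequences):
    """Entropy of every column; sequences must all have the same length."""
    return [column_entropy(col) for col in zip(*sequences)]


entropies = alignment_entropies(list(ALIGNMENT.values()))
for position, column in enumerate(zip(*ALIGNMENT.values()), start=1):
    print(position, "".join(column), f"{entropies[position - 1]:.2f}")
```

A fully conserved column, such as the proline, scores 0 bits, while a column with four different residues scores 2 bits, the maximum for four sequences.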

Page 7: Evolution of Proteins and Genomes select subset of slides

Comparative Sequence Analysis
Looking at sets of sequences

In reality, proteins are related by an evolutionary process

Page 8: Evolution of Proteins and Genomes select subset of slides

Selection

[Diagram labels: Selective Pressure; Stability; Folding; Function; A, B, C; Fitness; Stochastic Realizations]

Mouse:  …TLSPGLKIVSNPL…
Rat:    …TLTPGLKLVSDTL…
Baboon: …TVSPGLRIVSDGV…
Chimp:  …TISPGLVIVSENL…

Page 9: Evolution of Proteins and Genomes select subset of slides

Model

[Diagram labels: Selective Pressure; Stability; Folding; Function; A, B, C; Data; Understanding]

Data

Mouse:  …TLSPGLKIVSNPL…
Rat:    …TLTPGLKLVSDTL…
Baboon: …TVSPGLRIVSDGV…
Chimp:  …TISPGLVIVSENL…

Page 10: Evolution of Proteins and Genomes select subset of slides

Mutations result in genetic variation

Page 11: Evolution of Proteins and Genomes select subset of slides

Genetic changes

Original:     …UGUACAAAG…
Substitution: …UGUAUAAAG…
Insertion:    …UGUUACAAAG…
Deletion:     …UGUAAAAG…
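The three kinds of change can be reproduced as simple string operations. This is only an illustrative sketch: the helper names and the 0-based positions are assumptions chosen to regenerate the example sequences on the slide.

```python
ORIGINAL = "UGUACAAAG"


def substitute(seq, index, base):
    """Replace the base at a 0-based index with another base."""
    return seq[:index] + base + seq[index + 1:]


def insert(seq, index, bases):
    """Insert one or more bases before the given 0-based index."""
    return seq[:index] + bases + seq[index:]


def delete(seq, index, length=1):
    """Remove `length` bases starting at the given 0-based index."""
    return seq[:index] + seq[index + length:]


print("Substitution:", substitute(ORIGINAL, 4, "U"))
print("Insertion:   ", insert(ORIGINAL, 3, "U"))
print("Deletion:    ", delete(ORIGINAL, 4))
```

Only the substitution keeps the sequence length unchanged. The single-base insertion and deletion shift every downstream codon.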

Page 12: Evolution of Proteins and Genomes select subset of slides

Substitutions Can Be:

Purines: A, G

Pyrimidines: C, T

Transitions: purine ↔ purine (A ↔ G) or pyrimidine ↔ pyrimidine (C ↔ T)

Transversions: purine ↔ pyrimidine
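A substitution can be classified by checking whether both bases belong to the same chemical class. The sketch below is a minimal illustration. The function name is hypothetical, and treating U as T so that RNA input also works is an added assumption.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(old, new):
    """Return 'transition' or 'transversion' for a single-base change."""
    # Assumption: RNA input is accepted by mapping U to T.
    old = old.upper().replace("U", "T")
    new = new.upper().replace("U", "T")
    valid = PURINES | PYRIMIDINES
    if old not in valid or new not in valid:
        raise ValueError("bases must be one of A, C, G, T (or U)")
    if old == new:
        raise ValueError("not a substitution: the bases are identical")
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


for pair in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
    print(pair, substitution_type(*pair))
```

Each base has one possible transition and two possible transversions, so transversions would be twice as common as transitions if all changes were equally likely.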

Page 13: Evolution of Proteins and Genomes select subset of slides

Substitutions in coding regions can be:

Original: UGU/AGA/AAG   Cys Arg Lys
Silent:   UGU/CGA/AAG   Cys Arg Lys
Nonsense: UGU/UGA/AAG   Cys STOP Lys
Missense: UGU/GGA/AAG   Cys Gly Lys

First position: 4% of all changes silent
Second position: no changes silent
Third position: 70% of all changes silent (wobble position)
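The codon examples and the per-position percentages can be checked against the standard genetic code. This is a sketch under stated assumptions: the function names are made up, the standard code is used, and the silent fraction is counted over all single-base changes starting from the 61 sense codons, with changes to a stop codon counted as non-silent. The slide does not say which counting convention it used.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_change(old_codon, new_codon):
    """Classify a change from a sense codon as silent, nonsense, or missense."""
    old_aa, new_aa = CODON_TABLE[old_codon], CODON_TABLE[new_codon]
    if new_aa == old_aa:
        return "silent"
    if new_aa == "*":
        return "nonsense"
    return "missense"


def silent_fraction(position):
    """Fraction of single-base changes at a codon position (0, 1, 2) that are silent."""
    silent = total = 0
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # start only from sense codons
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            silent += CODON_TABLE[mutant] == aa
    return silent / total


for new in ("CGA", "UGA", "GGA"):
    print("AGA ->", new, classify_codon_change("AGA", new))
for pos in range(3):
    print(f"position {pos + 1}: {silent_fraction(pos):.1%} silent")
```

Under this convention the fractions come out at about 4%, 0%, and 69% for the first, second, and third positions, in line with the figures on the slide.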

Page 14: Evolution of Proteins and Genomes select subset of slides

Uneven crossover leading to gene deletion and duplication

Homologous crossover

Gene conversion

Page 15: Evolution of Proteins and Genomes select subset of slides

Fate of a duplicated gene

Keep on doing whatever it originally was doing

Lose ability to do anything (become a pseudogene)

Learn to do something new (neofunctionalization)

Split old functions among new genes (subfunctionalization)

Page 16: Evolution of Proteins and Genomes select subset of slides

Homologies

[Figure: tree of rat and mouse hemoglobin (Hb) genes, with events labeled "Gene duplication" and "Speciation" and gene pairs labeled "Orthologs" and "Paralogs"]

Page 17: Evolution of Proteins and Genomes select subset of slides

Probability of fixation = \frac{1 - e^{-2s}}{1 - e^{-4Ns}}

= 1/(2N) when |s| < 1/(2N)

= 2s (large, positive s, large N)

[Plot: fixation probability (log scale, 10^-14 to 1) versus selective advantage s (-0.01 to 0.02), with curves for N = 10, N = 100, N = 1000, and N = 10,000]
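The fixation probability curves can be evaluated numerically. The sketch below assumes the diffusion approximation for a new mutation in a diploid population of size N, with denominator 1 - e^{-4Ns}. That form is chosen because it reproduces both limits stated on the slide, 1/(2N) and 2s. The function name and the overflow guard are illustrative choices, not part of the lecture.

```python
import math


def fixation_probability(s, n):
    """Fixation probability of a new mutation with selective advantage s.

    Assumes (1 - exp(-2s)) / (1 - exp(-4Ns)) for a diploid population of size n.
    """
    if s == 0:
        return 1.0 / (2 * n)  # neutral limit
    exponent = -4.0 * n * s
    if exponent > 700:
        return 0.0  # strongly deleterious: avoid floating-point overflow
    # expm1(x) = exp(x) - 1 stays accurate for very small |x|.
    return math.expm1(-2.0 * s) / math.expm1(exponent)


for n in (10, 100, 1000, 10_000):
    row = [fixation_probability(s, n) for s in (-0.01, 0.0, 0.01, 0.02)]
    print(n, ["%.3g" % p for p in row])
```

For small populations the probability stays close to 1/(2N) across this range of s. For large populations, advantageous mutations approach 2s and deleterious ones become vanishingly unlikely to fix.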

Page 18: Evolution of Proteins and Genomes select subset of slides

The Rate of Evolution Depends on Constraints

Human vs. Rodent Comparison

Highest substitution rates
  pseudogenes
  introns
  3’ flanking (not transcribed to mature mRNA)
  4-fold degenerate sites

Intermediate substitution rates
  5’ flanking (contains promoter)
  3’, 5’ untranslated (transcribed to mRNA)
  2-fold degenerate sites

Lowest substitution rates
  Nondegenerate sites

Page 19: Evolution of Proteins and Genomes select subset of slides

Selection of Species for DNA comparisons

Human versus | Chimpanzee | Mouse | Opossum | Pufferfish
Time since divergence | ~5 MYA | ~65 MYA | ~150 MYA | ~450 MYA
Size (Gbp) | 3.0 | 2.5 | 4.2 | 0.4
Sequence conservation (in coding regions) | >99% | ~80% | ~70-75% | ~65%
Aids identification of… | Recently changed sequences and genomic rearrangements | Both coding and non-coding sequences | Both coding and non-coding sequences | Primarily coding sequences

Page 20: Evolution of Proteins and Genomes select subset of slides


UCSC Genome Browser

Page 21: Evolution of Proteins and Genomes select subset of slides

Comparative analysis of multi-species sequences from targeted genomic regions

Nature, 2003

Page 22: Evolution of Proteins and Genomes select subset of slides

Looking backward from the human genome

How much is still there after 450 my (Fugu)?

Page 23: Evolution of Proteins and Genomes select subset of slides

Transposable Elements Gone Wild!

Page 24: Evolution of Proteins and Genomes select subset of slides

Identifying Functionally Important Regions

How many comparative genomes do we need? Can’t we just use the mouse?

Using 12 species, 561 Multi-Species Conserved Sequences (MCSs) were found

How many can be found using just the mouse genome (rather than all 12)?

[Figure labels: True Pos., False Pos., False Neg.]
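One way to make the true positive, false positive, and false negative labels concrete is to compare two sets of genomic intervals. This is a hypothetical sketch: the slide does not say how matches were scored, so the "any overlap counts" rule, the function names, and all coordinates below are invented for illustration.

```python
def overlaps(a, b):
    """True if half-open intervals a = (start, end) and b share any position."""
    return a[0] < b[1] and b[0] < a[1]


def compare_regions(reference, predicted):
    """Score predicted regions against a reference set by interval overlap."""
    true_pos = sum(any(overlaps(r, p) for p in predicted) for r in reference)
    false_neg = len(reference) - true_pos
    false_pos = sum(not any(overlaps(p, r) for r in reference) for p in predicted)
    return {"true_pos": true_pos, "false_neg": false_neg, "false_pos": false_pos}


# Invented coordinates: a multi-species reference set versus a mouse-only set.
reference_mcs = [(100, 200), (500, 650), (900, 1000)]
mouse_only = [(120, 180), (300, 350), (950, 1100)]

result = compare_regions(reference_mcs, mouse_only)
sensitivity = result["true_pos"] / len(reference_mcs)
print(result, f"sensitivity = {sensitivity:.2f}")
```

In this scheme, regions conserved between human and mouse but absent from the reference set count as false positives, and reference regions missed by the mouse-only comparison count as false negatives.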

Page 25: Evolution of Proteins and Genomes select subset of slides

Interpreting Evolutionary Changes Requires a Model

…IGTLS…

…IGRLS…

In evolution: what is the rate R(T → R) at which Ts become Rs? (e.g. 0.00005 / my)

20 x 20 Substitution Matrix
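A rate such as 0.00005 per million years can be turned into a probability over a given time span. The sketch below is deliberately simplified and its names are hypothetical. It assumes substitutions of this one type arrive as a Poisson process at a constant rate, and it ignores back-substitutions and routes through other amino acids, which a full 20 x 20 substitution matrix would account for.

```python
import math

RATE_T_TO_R = 0.00005  # substitutions per site per million years (slide example)


def prob_at_least_one_event(rate_per_my, time_my):
    """Probability that at least one event occurs in time_my million years."""
    return 1.0 - math.exp(-rate_per_my * time_my)


for time_my in (5, 65, 150, 450):
    p = prob_at_least_one_event(RATE_T_TO_R, time_my)
    print(f"{time_my:>4} my: {p:.5f}")
```

For short times the probability is close to rate × time. The approximation becomes less accurate as the expected number of events grows.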