Modelling proteomes
Ram Samudrala
Department of Microbiology
How does the genome of an organism specify its behaviour and characteristics?
Proteome – all proteins of a particular system
~60,000 in human
~60,000 in rice
~4500 in bacteria like Salmonella and E. coli
Several thousand distinct sequence families
Modelling proteomes – understand the structure of individual proteins
A few thousand distinct structural folds
Modelling proteomes – understand their individual functions
Thousands of possible functions
Modelling proteomes – understand their expression
Different expression patterns based on time and location
Modelling proteomes – understand their interactions
Interactions and expression patterns are interdependent with structure and function
Protein folding
DNA: …-CUA-AAA-GAA-GGU-GUU-AGC-AAG-GUU-… (each triplet encodes one amino acid)
protein sequence: …-L-K-E-G-V-S-K-D-…
unfolded protein: not unique, mobile, inactive, expanded, irregular
spontaneous self-organisation (~1 second)
native state: unique shape, precisely ordered, stable/functional, globular/compact, helices and sheets
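The codon-to-residue step on this slide can be checked with a minimal sketch. The table below is deliberately partial (only the codons printed on the slide, taken from the standard genetic code), and the names `CODON_TABLE` and `translate` are illustrative, not from the talk. Translating the triplets as printed gives V for the final GUU codon, whereas the protein line ends in D (the aspartate codons are GAU and GAC), so the last triplet on the slide looks like a typo.

```python
# Partial codon table: only the codons that appear on the slide (standard genetic code).
CODON_TABLE = {
    "CUA": "L", "AAA": "K", "GAA": "E", "GGU": "G",
    "GUU": "V", "AGC": "S", "AAG": "K",
}

def translate(codons):
    """Translate a list of RNA codons into a one-letter amino acid string."""
    return "".join(CODON_TABLE[codon] for codon in codons)

# Codons exactly as printed on the slide.
slide_codons = "CUA-AAA-GAA-GGU-GUU-AGC-AAG-GUU".split("-")
protein = translate(slide_codons)
print(protein)
```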
De novo prediction of protein structure
sample conformational space such that native-like conformations are found
astronomically large number of conformations: 5 states per residue over 100 residues gives $5^{100} \approx 10^{70}$
select
hard to design functions that are not fooled by non-native conformations (“decoys”)
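The size estimate above can be reproduced in a few lines. This is only a check of the arithmetic on the slide (5 states per residue, 100 residues); the variable names are illustrative.

```python
import math

states_per_residue = 5
residues = 100

# Exact count of discrete conformations under the slide's assumption.
conformations = states_per_residue ** residues

# Same quantity as a power of ten: 100 * log10(5).
exponent = residues * math.log10(states_per_residue)

print(f"5^100 has {len(str(conformations))} digits, about 10^{exponent:.1f}")
```

The exponent comes out just under 70, which is why the count is quoted as roughly 10^70.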
Semi-exhaustive segment-based folding
EFDVILKAAGANKVAVIKAVRGATGLGLKEAKDLVESAPAALKEGVSKDDAEALKKALEEAGAEVEVK
generate: continuous distributions, local and global moves
minimise: Monte Carlo with simulated annealing, conformational space annealing, GA
filter: all-atom pairwise interactions, bad contacts, compactness, secondary structure, density of generated conformations
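As a rough illustration of the "minimise" step, here is a toy Monte Carlo search with simulated annealing. It is a sketch under stated assumptions, not the method used in the talk: the energy function `toy_energy`, the target angles, the geometric cooling schedule, and all parameter values are hypothetical stand-ins for a real scoring function and move set.

```python
import math
import random

def toy_energy(angles, targets):
    # Hypothetical score: zero when every angle matches its target, larger otherwise.
    return sum(1.0 - math.cos(a - t) for a, t in zip(angles, targets))

def anneal(targets, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule (steps must be > 1)."""
    rng = random.Random(seed)
    angles = [rng.uniform(-math.pi, math.pi) for _ in targets]
    energy = toy_energy(angles, targets)
    start_energy = energy
    best_angles, best_energy = list(angles), energy
    for step in range(steps):
        # Temperature falls smoothly from t_start to t_end.
        temperature = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Local move: perturb one randomly chosen angle.
        i = rng.randrange(len(angles))
        old_value = angles[i]
        angles[i] = old_value + rng.gauss(0.0, 0.5)
        new_energy = toy_energy(angles, targets)
        delta = new_energy - energy
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            energy = new_energy
            if energy < best_energy:
                best_angles, best_energy = list(angles), energy
        else:
            angles[i] = old_value  # reject the move
    return start_energy, best_energy, best_angles

start, best, _ = anneal([0.5] * 10)
print(f"start energy {start:.3f}, best energy {best:.3f}")
```

Accepting some uphill moves at high temperature is what lets the search escape local minima; a real protocol would add global moves and the filters listed on the slide.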
2.52 Å 5.06 Å
Model 1
CASP6 prediction for T0215
Ling-Hong Hung/Shing-Chung Ngan
3.63 Å 5.42 Å
Model 5
CASP6 prediction for T0236
Ling-Hong Hung/Shing-Chung Ngan
2.25 Å 4.31 Å
Model 1
CASP6 prediction for T0281
Ling-Hong Hung/Shing-Chung Ngan
Comparative modelling of protein structure
KDHPFGFAVPTKNPDGTMNLMNWECAIP
KDPPAGIGAPQDN----QNIMLWNAVIP
** * *   *  *     * * *   **
scan
align
refine
physical functions
build initial model
minimum perturbation
construct non-conserved side chains and main chains
graph theory, semfold
de novo simulation
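The "align" step produces a pairwise alignment such as the one on this slide, and a common first summary is its sequence identity. The sketch below assumes the flattened slide text splits into two 28-column rows (the split point is an inference from the gap characters and the number of match stars); the function name `identity_stats` is illustrative.

```python
def identity_stats(seq_a, seq_b, gap="-"):
    """Return (identical, aligned, columns) for two rows of a pairwise alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("alignment rows must have the same length")
    # Columns where both rows hold the same residue.
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != gap)
    # Columns where neither row has a gap.
    aligned = sum(1 for a, b in zip(seq_a, seq_b) if a != gap and b != gap)
    return identical, aligned, len(seq_a)

# Rows as read from the slide (assumed split into target and template).
row_a = "KDHPFGFAVPTKNPDGTMNLMNWECAIP"
row_b = "KDPPAGIGAPQDN----QNIMLWNAVIP"

identical, aligned, columns = identity_stats(row_a, row_b)
print(f"{identical} identities over {aligned} aligned positions "
      f"({100.0 * identical / aligned:.1f}%), {columns} columns")
```

Whether identity is reported over aligned positions or over all columns is a convention choice, so both counts are returned.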
T0247       RAPDF    TM-score   RMSD    MaxSub
cf-model    -30.14   0.8448     4.055   0.6563
parent 1    -27.09   0.8391     4.108   0.6446
parent 2    -26.68   0.8318     4.194   0.625
parent 3    -26.59   0.8252     4.197   0.6051
parent 4    -26.25   0.839      3.981   0.6281
parent 5    -18.51   0.8422     3.979   0.6416
CASP6 prediction for T0247
Tianyun Liu
[Figure panels: Model 1, Parent 1, Parent 2, Parent 3]
T0271       RAPDF    TM-score   RMSD    MaxSub
cf-model    -37.44   0.8718     2.166   0.7911
parent 1    -34.87   0.8662     2.233   0.7789
parent 2    -33.99   0.8248     2.166   0.7402
parent 3    -36.83   0.8254     2.139   0.7456
CASP6 prediction for T0271
Tianyun Liu
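The RMSD and TM-score columns in the tables above compare a model with the experimental structure. A minimal sketch of both measures follows, assuming the two coordinate sets are already superposed and matched residue by residue. The TM-score function uses the standard d0 length scaling but evaluates a single fixed superposition, whereas the real TM-score maximises over superpositions, so this value is only a lower bound; the 0.5 floor on d0 and the function names are assumptions of this sketch.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched, pre-superposed coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def tm_score_fixed(coords_model, coords_native):
    """TM-score-style sum for one fixed superposition (no search over rotations)."""
    length = len(coords_native)
    if length != len(coords_model) or length <= 15:
        raise ValueError("need matched coordinate lists with more than 15 residues")
    # Length-dependent distance scale; the 0.5 floor is an assumption for short chains.
    d0 = max(1.24 * (length - 15) ** (1.0 / 3.0) - 1.8, 0.5)
    terms = (1.0 / (1.0 + (math.dist(p, q) / d0) ** 2)
             for p, q in zip(coords_model, coords_native))
    return sum(terms) / length

# Hypothetical coordinates: a straight 20-residue trace and a copy shifted by 3 Å.
native = [(float(i), 0.0, 0.0) for i in range(20)]
model = [(x, y + 3.0, z) for x, y, z in native]
print(f"RMSD {rmsd(model, native):.3f}, TM-score (fixed) {tm_score_fixed(model, native):.3f}")
```

RMSD grows without bound as a few residues drift, while each TM-score term saturates, which is one reason the two columns can rank parents differently.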
[Figure: TM-scores (y-axis, 0.45 to 1.05) by target ID, comparing CF-models with the average of parent models. Targets in plot order: T0246, T0268, T0233, T0231, T0277, T0266, T0271, T0247, T0267, T0276, T0274, T0269, T0282, T0244, T0211, T0234, T0232, T0243, T0264, T0229, T0200, T0213, T0279]
CASP6 overall summaries
Tianyun Liu
Similar global sequence or structure does not imply similar function
Qualitative function classification
Kai Wang
Prediction of HIV-1 protease-inhibitor binding energies with MD
[Figure: correlation coefficient (y-axis, 0.5 to 1.0) against MD simulation time in ps (x-axis, 0 to 1.0), with MD and without MD]
Ekachai Jenwitheesuk
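The plotted quantity is a correlation coefficient between predicted and experimental binding energies. The slide does not say which coefficient was used, so the sketch below assumes Pearson's r, and the energy values are made-up placeholders rather than data from the talk.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical binding energies in kcal/mol (illustrative only).
predicted = [-9.1, -10.4, -11.0, -12.3, -13.5]
experimental = [-9.5, -10.0, -11.4, -12.0, -13.9]

r = pearson(predicted, experimental)
print(f"r = {r:.3f}")
```

Repeating this calculation at each MD simulation time, with and without MD, would give the two curves described in the figure.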
Prediction of inhibitor resistance/susceptibility
Kai Wang / Ekachai Jenwitheesuk
http://protinfo.compbio.washington.edu/pirspred/
Integrated structural and functional annotation of proteomes
structure based methods
microenvironment analysis
zinc binding site?
structure comparison
homology
function?
sequence based methods
sequence comparison, motif searches
phylogenetic profiles, domain fusion analyses
experimental data: single molecule + genomic/proteomic
+ EXPRESSION
+ INTERACTION
Bioverse
Assign function to entire protein space: the key paradigm is the use of homology to transfer information across organisms
Bioverse – explore relationships among molecules and systems
Jason McDermott/Michal Guerquin/Zach Frazier
http://bioverse.compbio.washington.edu
Bioverse – prediction of protein interaction networks
Jason McDermott
Interacting protein database: protein α and protein β, experimentally determined interaction
Target proteome: protein A (85%) and protein B (90%), predicted interaction
Assign confidence based on similarity and strength of interaction
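The transfer of a known interaction onto a target proteome can be sketched as follows. The slide only says that confidence depends on similarity and on the strength of the interaction; the specific rule used here (product of the two similarities and the strength) is an assumption for illustration, as are the class and function names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KnownInteraction:
    protein_a: str
    protein_b: str
    strength: float  # hypothetical 0..1 reliability of the experimental interaction

def predict_interactions(known, similarity):
    """Transfer known interactions to target proteins.

    `similarity` maps (target_protein, database_protein) to a fractional similarity.
    Confidence = similarity_1 * similarity_2 * strength (an assumed scoring rule).
    """
    predictions = []
    for interaction in known:
        for (target_1, db_1), sim_1 in similarity.items():
            if db_1 != interaction.protein_a:
                continue
            for (target_2, db_2), sim_2 in similarity.items():
                if db_2 != interaction.protein_b:
                    continue
                confidence = sim_1 * sim_2 * interaction.strength
                predictions.append((target_1, target_2, confidence))
    return predictions

# Values from the slide: A is 85% similar to alpha, B is 90% similar to beta.
known = [KnownInteraction("alpha", "beta", 1.0)]
similarity = {("A", "alpha"): 0.85, ("B", "beta"): 0.90}
predictions = predict_interactions(known, similarity)
print(predictions)
```

With these inputs the A-B interaction is predicted with a confidence of about 0.77 under the assumed rule.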
Bioverse – E. coli predicted protein interaction network
Jason McDermott
Bioverse – M. tuberculosis predicted protein interaction network
Jason McDermott
Bioverse – C. elegans predicted protein interaction network
Jason McDermott
Bioverse – H. sapiens predicted protein interaction network
Jason McDermott
Bioverse – network-based annotation for C. elegans
Jason McDermott
Articulation point proteins
Jason McDermott
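An articulation point is a node whose removal disconnects a network. A minimal sketch of finding such nodes with a depth-first search (the standard low-link approach) is shown below; the toy network and protein names P1 to P5 are hypothetical, and the recursive form is only suitable for small graphs.

```python
def build_graph(edges):
    """Build an undirected adjacency list from a list of (u, v) edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, []).append(u)
    return graph

def articulation_points(graph):
    """Return the set of nodes whose removal disconnects the undirected graph."""
    discovery, low, result = {}, {}, set()
    counter = [0]

    def dfs(node, parent):
        discovery[node] = low[node] = counter[0]
        counter[0] += 1
        children = 0
        for neighbour in graph[node]:
            if neighbour == parent:
                continue
            if neighbour in discovery:
                # Back edge: node can reach an earlier-discovered ancestor.
                low[node] = min(low[node], discovery[neighbour])
            else:
                children += 1
                dfs(neighbour, node)
                low[node] = min(low[node], low[neighbour])
                # Non-root node: the subtree cannot reach above it without it.
                if parent is not None and low[neighbour] >= discovery[node]:
                    result.add(node)
        # Root node: articulation point only if it has several DFS children.
        if parent is None and children > 1:
            result.add(node)

    for node in graph:
        if node not in discovery:
            dfs(node, None)
    return result

# Toy network: a triangle P1-P2-P3 with a tail P3-P4-P5.
network = build_graph([("P1", "P2"), ("P2", "P3"), ("P3", "P1"),
                       ("P3", "P4"), ("P4", "P5")])
key_proteins = articulation_points(network)
print(sorted(key_proteins))
```

In this toy network P3 and P4 are the articulation points: removing either one cuts P5 (and, for P3, P4 as well) off from the triangle.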
Bioverse – identifying key proteins on the anthrax predicted network
Jason McDermott
Bioverse – identification of virulence factors
Bioverse - Integrator
Aaron Chang
Take home message
Prediction of protein structure, function, and networks may be used to model whole genomes to understand organismal function and evolution
Acknowledgements
Aaron Chang, Chuck Mader, David Nickle, Ekachai Jenwitheesuk, Gong Cheng, Jason McDermott, Kai Wang, Ling-Hong Hung, Mike Inouye, Michal Guerquin, Stewart Moughon, Shing-Chung Ngan, Tianyun Liu, Zach Frazier
National Institutes of Health
National Science Foundation
Searle Scholars Program (Kinship Foundation)
UW Advanced Technology Initiative in Infectious Diseases
http://bioverse.compbio.washington.edu
http://protinfo.compbio.washington.edu