A Genetic Programming Challenge: Evolving the Energy Function for Protein Structure Prediction
Evolving energy function for protein structure prediction
Paweł Widera
Natalio Krasnogor, Jonathan Garibaldi
Department of Computer Science, Ben-Gurion University of the Negev, Beer Sheva, Israel
2009-06-30
Outline
1 Introduction
2 Protein energy models
3 Genetic Programming problem formulation
4 Results
5 Conclusions
Paweł Widera Evolving energy function for PSP 2009-06-30 2 / 26
Protein structure prediction: From 1D sequence to 3D structure
LFSKELRCMMYGFGDDQNPYTESVDILEDLVIEFITEMTHKAMSIFSEEQLNRYEMYRRSAFPKAAIKRLIQSITGTSVSQNVVIAMSGISKVFVGEVVEEALDVCEKWGEMPPLQPKHMREAVRRLKSKGQIP
Protein basics
- 20 amino acid alphabet
- sequence encodes structure
- structure determines activity
International prediction contest: Critical Assessment of techniques for protein Structure Prediction (CASP)
CASP facts
- biennial competition started in 1994
- parallel prediction and experimental verification
- model assessment by human experts
Prediction difficulty
- comparative modelling (sequence similarity)
- fold recognition (new or existing)
- ab initio modelling (first principles)
Ab initio predictor schema: From sequence to the final model
Stages shown in the diagram
- Target sequence
- Secondary structure prediction
- Fold recognition and threading
- Initial ab initio prediction
- Optimisation
- Clustering
- Final models
Tools named in the diagram: PSIPRED, SAM-T02, JUFO, PSI-BLAST
The algorithm of folding: Anfinsen's thermodynamic hypothesis [Anfinsen, 1973]
Refolding experiment
- folds to the same native state
- native state is energetically stable
Energy funnel [Dill and Chan, 1997]
- roll down free energy hill
- avoid local minima traps
Model assessment: Correlation between energy and similarity to native
Similarity measure

$$\mathrm{RMSD} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \delta_i^2}$$

Decoys generated by
- I-TASSER [Wu et al., 2007]
- Robetta [Rohl et al., 2004]
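
The RMSD similarity measure can be sketched in a few lines of Python. This is a minimal illustration, assuming the two structures are already superimposed and given as equal-length lists of 3D coordinates; the function names `rmsd` and `pairwise_deviations` are hypothetical and do not come from the slides.

```python
import math


def pairwise_deviations(model, native):
    """Distances delta_i between corresponding atoms of two superimposed structures."""
    if len(model) != len(native):
        raise ValueError("structures must have the same number of atoms")
    return [math.dist(a, b) for a, b in zip(model, native)]


def rmsd(deviations):
    """Root-mean-square deviation: sqrt of the mean of the squared delta_i."""
    n = len(deviations)
    if n == 0:
        raise ValueError("need at least one deviation")
    return math.sqrt(sum(d * d for d in deviations) / n)


# Toy data (made up for illustration): two atoms, the second displaced by 5 units.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (4.0, 4.0, 0.0)]
print(rmsd(pairwise_deviations(model, native)))
```

A lower RMSD means the decoy is closer to the native structure, which is the ordering the reference ranking relies on.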
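All-atom force field: Folding simulation

$$\sum_{\text{bonds}} \frac{k_i^{l}}{2} (l - l_i^{0})^2 + \sum_{\text{angles}} \frac{k_i^{\theta}}{2} (\theta - \theta_i^{0})^2 + \sum_{\text{torsions}} \frac{V_i^{\omega}}{2} \left[ 1 + \cos(n_i \omega_i - \phi_i) \right] + \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \left\{ 4 \varepsilon_{ij} \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right] + \frac{q_i q_j}{4 \pi \varepsilon_0 r_{ij}} \right\}$$

Intermolecular forces
- bond forces (stretching, bending, rotating)
- short range forces (Pauli repulsion, van der Waals' interactions)
- electrostatic forces (Coulomb's law)
Rosetta@home in CASP7: 140k computers (37 TFLOPS), 500k CPU hours per domain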
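
To make the cost of the pairwise part of the force field concrete, here is a rough Python sketch of the double sum over atom pairs (Lennard-Jones plus Coulomb). It assumes a single `epsilon` and `sigma` for every pair instead of the per-pair values in the formula, and folds 1 / (4 pi eps0) into a hypothetical constant `COULOMB_K`; both are simplifications for illustration only.

```python
import math

COULOMB_K = 1.0  # assumed stand-in for 1 / (4 * pi * eps0); depends on the unit system


def lennard_jones(r, epsilon, sigma):
    """Short range term: 4 * eps * [(sigma / r)^12 - (sigma / r)^6]."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def coulomb(r, qi, qj, k=COULOMB_K):
    """Electrostatic term: k * qi * qj / r."""
    return k * qi * qj / r


def nonbonded_energy(coords, charges, epsilon, sigma):
    """Sum both terms over all pairs i < j, which is O(N^2) in the number of atoms."""
    total = 0.0
    n = len(coords)
    for i in range(n - 1):
        for j in range(i + 1, n):
            r = math.dist(coords[i], coords[j])
            total += lennard_jones(r, epsilon, sigma)
            total += coulomb(r, charges[i], charges[j])
    return total


# Two opposite unit charges one sigma apart: the Lennard-Jones term vanishes.
print(nonbonded_energy([(0, 0, 0), (1, 0, 0)], [1.0, -1.0], epsilon=1.0, sigma=1.0))
```

The quadratic number of pairs, evaluated at every simulation step, is one reason all-atom folding simulations need the kind of computing resources quoted above.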
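Simplified knowledge-based potential: Protein structure prediction
[Figure: local frame at residue i, with neighbours i − 1 and i + 1 and unit vectors n̂_i, b̂_i, v̂_i]
Example

$$E_{\text{stiff}} = \sum_i \left( -\lambda\, \hat{v}_i \cdot \hat{v}_{i+4} - \lambda \left| \hat{b}_i \cdot \hat{b}_{i+2} \right| - \lambda\, \Theta_1(i) + \Theta_2(i) + \Theta_3(i) \right)$$

$$E_{\text{env}} = \sum_i V(NP_i, NA_i, NO_i, A_i)$$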
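Energy function: Weighted sum of terms vs. evolved function
Weighted sum [Zhang et al., 2003]

$$F(\vec{T}) = w_1 \cdot T_1 + \ldots + w_n \cdot T_n$$

Evolved function (example)

$$F(\vec{T}) = \frac{T_1 \cdot T_3}{w_1 \cdot \log(T_2)} + \sin\left( \frac{T_4 - w_2 \cdot T_1}{T_5 \cdot \exp(\cos(w_1 \cdot T_3))} \right)$$

GP input
- terminals: T1, ..., T8
- functions: add, sub, mul, div, sin, cos, exp, log
- random ephemerals in range [0,1]
GP tree example
- size = 60
- depth = 17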
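
The difference between the two forms can be illustrated with a small expression-tree evaluator. This is a sketch only: trees are encoded as nested tuples, and the protected versions of div, log and exp (returning a fallback value instead of raising) are a common GP convention assumed here, not something the slides specify. The weights `W1` and `W2` are arbitrary example constants.

```python
import math


def protected_div(a, b):
    """Division that returns 1.0 when the denominator is (almost) zero."""
    return a / b if abs(b) > 1e-9 else 1.0


def protected_log(a):
    """Logarithm of |a|, returning 0.0 for arguments close to zero."""
    return math.log(abs(a)) if abs(a) > 1e-9 else 0.0


def protected_exp(a):
    """Exponential with the argument capped to avoid overflow."""
    return math.exp(min(a, 50.0))


BINARY = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "div": protected_div,
}
UNARY = {"sin": math.sin, "cos": math.cos, "exp": protected_exp, "log": protected_log}


def evaluate(node, terms):
    """Evaluate a tree: numbers are constants, strings are terminals T1..T8."""
    if isinstance(node, (int, float)):
        return float(node)
    if isinstance(node, str):
        return terms[node]
    op, *args = node
    values = [evaluate(arg, terms) for arg in args]
    return BINARY[op](*values) if op in BINARY else UNARY[op](*values)


def weighted_sum(weights, terms):
    """Linear baseline: sum of w_i * T_i over the supplied terms."""
    return sum(w * terms[name] for name, w in weights.items())


W1, W2 = 0.5, 0.25  # example ephemeral constants in [0, 1]
EXAMPLE_TREE = (
    "add",
    ("div", ("mul", "T1", "T3"), ("mul", W1, ("log", "T2"))),
    ("sin", ("div",
             ("sub", "T4", ("mul", W2, "T1")),
             ("mul", "T5", ("exp", ("cos", ("mul", W1, "T3")))))),
)

terms = {f"T{i}": 2.0 for i in range(1, 9)}
print(evaluate(EXAMPLE_TREE, terms), weighted_sum({"T1": W1, "T2": W2}, terms))
```

`EXAMPLE_TREE` encodes the evolved example function shown above; the weighted sum is just the special case where the tree contains only add and mul nodes.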
Fitness evaluation: Evolutionary objective
1 construction of the reference ranking RR (decoys sorted by similarity to native)
2 ranking decoys using evolved energy function RE (decoys sorted by energy)
3 rankings comparison - RR vs. RE
4 fitness = average distance for all proteins
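
The four steps above can be sketched as one function. The sketch assumes each decoy is a pair of (energy terms, RMSD to native), uses the unweighted Spearman footrule as the distance, and normalises it by its maximum value so that 0 means identical rankings and 1 means maximally different; the normalisation and all names are assumptions made for illustration.

```python
def ranking(values):
    """Indices ordered by increasing value (ties keep their original order)."""
    return sorted(range(len(values)), key=lambda i: values[i])


def positions(order):
    """Invert a ranking: positions[i] is the rank of item i."""
    pos = [0] * len(order)
    for rank, index in enumerate(order):
        pos[index] = rank
    return pos


def footrule(order_a, order_b):
    """Spearman footrule: sum of absolute rank differences per item."""
    return sum(abs(x - y) for x, y in zip(positions(order_a), positions(order_b)))


def fitness(energy_fn, proteins):
    """Average normalised ranking distance over all proteins (lower is better here)."""
    total = 0.0
    for decoys in proteins:
        reference = ranking([rmsd for _, rmsd in decoys])            # step 1
        evolved = ranking([energy_fn(terms) for terms, _ in decoys])  # step 2
        max_distance = (len(decoys) ** 2) // 2                        # footrule maximum
        total += footrule(reference, evolved) / max_distance          # step 3
    return total / len(proteins)                                      # step 4


# Toy protein with three decoys; the energy is taken from a single term T1.
protein = [({"T1": 1.0}, 1.0), ({"T1": 2.0}, 2.0), ({"T1": 3.0}, 3.0)]
print(fitness(lambda t: t["T1"], [protein]), fitness(lambda t: -t["T1"], [protein]))
```

An energy function that orders decoys exactly as RMSD does scores 0 in this sketch, while one that reverses the order scores 1.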
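Reference ranking construction: Correlation between energy and similarity to native

| R0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RMSD | 3.2 | 2.1 | 5.2 | 1.2 | 3.5 | 2.1 | 4.8 | 3.5 |
| R1 | 3 | 1 | 5 | 0 | 4 | 7 | 6 | 2 |
| R2 | 3.0 | 1.5 | 7.0 | 0.0 | 4.5 | 1.5 | 6.0 | 4.5 |

Ranking types
- R1 - permutation of indices
- R2 - averaged ranks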
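
Both ranking types can be reproduced from the RMSD values on this slide. The sketch below assumes that R1 breaks ties by decoy index and that R2 gives tied decoys the mean of the ranks they occupy; the function names are hypothetical.

```python
def permutation_ranking(values):
    """R1: decoy indices ordered by increasing value; ties keep index order."""
    return sorted(range(len(values)), key=lambda i: values[i])


def averaged_ranks(values):
    """R2: one rank per decoy; tied decoys share the mean of their ranks."""
    order = permutation_ranking(values)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next decoy has the same value.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


rmsd_values = [3.2, 2.1, 5.2, 1.2, 3.5, 2.1, 4.8, 3.5]
print(permutation_ranking(rmsd_values))  # [3, 1, 5, 0, 4, 7, 6, 2]
print(averaged_ranks(rmsd_values))       # [3.0, 1.5, 7.0, 0.0, 4.5, 1.5, 6.0, 4.5]
```

Decoys 1 and 5 (RMSD 2.1) occupy ranks 1 and 2 and therefore both receive 1.5; decoys 4 and 7 (RMSD 3.5) share 4.5 in the same way.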
Rankings comparison: Measure of distance between rankings
Example
- rankings 4 3 2 1 5 and 3 4 1 5 2 give position-wise differences 1 1 1 4 3 → 10
- with weights 1, 4/5, 3/5, 2/5, 1/5 applied to the differences → 4.6
Distance functions
- Levenshtein edit distance - O(n)
- Kendall Tau distance - O(n(n−1)/2)
- Spearman footrule distance - O(n²/2)
Ranks weighting
- linear
- sigmoid
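
The numbers in the example can be checked with a short sketch of the Spearman footrule with optional linear weighting, plus a Kendall Tau distance for comparison. The linear weights 1, (n−1)/n, ..., 1/n follow the example; the exact shape of the sigmoid weighting is not given on the slide, so it is left out. Function names are hypothetical.

```python
def linear_weights(n):
    """Weights 1, (n-1)/n, ..., 1/n: differences at the top positions count most."""
    return [(n - i) / n for i in range(n)]


def footrule(a, b, weights=None):
    """Spearman footrule: (weighted) sum of position-wise absolute differences."""
    if len(a) != len(b):
        raise ValueError("rankings must have the same length")
    if weights is None:
        weights = [1.0] * len(a)
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))


def kendall_tau(a, b):
    """Kendall Tau distance: number of pairs that the two rankings order differently."""
    n = len(a)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if (a[i] - a[j]) * (b[i] - b[j]) < 0
    )


first = [4, 3, 2, 1, 5]
second = [3, 4, 1, 5, 2]
print(footrule(first, second))                               # 10.0
print(round(footrule(first, second, linear_weights(5)), 6))  # 4.6
print(kendall_tau(first, second))
```

With linear weights the two large disagreements near the end of the ranking (4 and 3) contribute only 1.6 and 0.6, which is how the distance drops from 10 to 4.6.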
Decoys sampling: Selection vs. noise reduction
Simple selection
- top
- uniform
- random
Bin based selection
- equal size
- equal distance
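
The slide only names the sampling strategies, so the following Python sketch is one possible reading of them and not the authors' procedure. It assumes decoys are represented by their RMSD values, already sorted in increasing order: "top" takes the decoys closest to native, "uniform" takes evenly spaced ones, "equal size" bins hold the same number of decoys, and "equal distance" bins cover RMSD intervals of the same width.

```python
import random


def select_top(sorted_decoys, k):
    """The k decoys most similar to the native structure."""
    return sorted_decoys[:k]


def select_uniform(sorted_decoys, k):
    """k decoys at evenly spaced positions of the sorted list."""
    n = len(sorted_decoys)
    return [sorted_decoys[(i * n) // k] for i in range(k)]


def select_random(sorted_decoys, k, seed=0):
    """k decoys drawn without replacement (seeded for repeatability)."""
    return random.Random(seed).sample(sorted_decoys, k)


def equal_size_bins(sorted_decoys, bins):
    """Split the sorted list into bins with (nearly) the same number of decoys."""
    n = len(sorted_decoys)
    return [sorted_decoys[(b * n) // bins:((b + 1) * n) // bins] for b in range(bins)]


def equal_distance_bins(sorted_decoys, bins):
    """Split the RMSD range into intervals of equal width; bins may be empty."""
    low, high = sorted_decoys[0], sorted_decoys[-1]
    width = (high - low) / bins or 1.0  # avoid a zero width when all values match
    result = [[] for _ in range(bins)]
    for value in sorted_decoys:
        index = min(int((value - low) / width), bins - 1)
        result[index].append(value)
    return result


decoys = [0.0, 1.0, 2.0, 9.0, 10.0]
print(equal_size_bins(decoys, 2))      # [[0.0, 1.0], [2.0, 9.0, 10.0]]
print(equal_distance_bins(decoys, 2))  # [[0.0, 1.0, 2.0], [9.0, 10.0]]
```

The two binning schemes differ when decoys cluster at certain RMSD values: equal-size bins follow the density of the decoy set, while equal-distance bins follow the RMSD scale.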
Experiment design
Evolutionary progress: Fitness throughout generations
[Figure: fitness vs. generation (0 to 1000); panels: levenshtein-steadystate, spearman-generational, kendall-generational]
[Figure: fitness vs. generation (0 to 1000); panels: spearman-linear-ts8-generational, spearman-sigmoid-ts8-elitism, kendall-ts4-generational]
Observations
- Round I - early saturation
- Round II - small but constant improvement
Landscape analysis: Fitness distribution for the random walk
[Figure: fitness (axis 0.3 to 0.8) for decoy sets d-100, d-58, f-100, f-42, random-100, top-100, uniform-100, all]
Population diversity analysis: Genotype - Phenotype - Fitness mapping
Diversity measures
- F - fitness entropy (frequency of duplicates)
- P - root mean square distance between rankings
- G - number of unique trees <#T, #NT, depth>
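
Two of these measures are easy to sketch. The example below assumes that "fitness entropy" means the Shannon entropy of the distribution of fitness values in the population (so many duplicates give low entropy), and that a tree is identified by its <#T, #NT, depth> signature; both readings are assumptions, as are the function names.

```python
import math
from collections import Counter


def fitness_entropy(fitness_values, digits=6):
    """Shannon entropy (in bits) of the rounded fitness values in a population."""
    counts = Counter(round(f, digits) for f in fitness_values)
    n = len(fitness_values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def unique_trees(signatures):
    """Number of distinct (terminals, non-terminals, depth) signatures."""
    return len(set(signatures))


print(fitness_entropy([0.1, 0.2, 0.3, 0.4]))  # 2.0
print(unique_trees([(3, 2, 2), (3, 2, 2), (5, 4, 3)]))  # 2
```

A population in which every individual has the same fitness has entropy 0, while four equally frequent fitness values give 2 bits.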
Improvement over random walk: Is the evolution any good?

| decoys set | improvement (avg) | best |
| --- | --- | --- |
| all | 0.78% | 0.710 |
| uniform-100 | 0.96% | 0.711 |
| random-100 | 1.28% | 0.713 |
| top-100 | 1.93% | 0.702 |
| s-42 | 7.76% | 0.713 |
| s-100 | 7.64% | 0.772 |
| d-58 | 8.21% | 0.780 |
| d-100 | 10.88% | 0.804 |
Best evolved energy functions: Comparison to naive combination of energy terms

| energy function | correlation to RMSD |
| --- | --- |
| d-100 (generational+ADF) | 0.76 |
| all decoys (steady-state+elitism) | 0.30 |
| best single term | 0.24 |
| worst single term | -0.20 |
| naive combination of terms | 0.12 |
| original I-TASSER energy | 0.44 (0.51/0.65) |
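Comparison to weighted sum of terms: Nelder-Mead downhill simplex optimisation

| method | spearman-sigmoid (d-100) | spearman-sigmoid (all) | correlation (d-100) | correlation (all) |
| --- | --- | --- | --- | --- |
| simplex | 0.734 | 0.638 | 0.650 | 0.166 |
| GP | 0.835 | *0.714 | 0.740 | *0.200 |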
Distribution of terminals and operators: Did the evolution discover any knowledge?

| energy term | correlation |
| --- | --- |
| T1 (E13) | 0.03 ± 0.11 |
| T2 (E14) | 0.20 ± 0.17 |
| T3 (E15) | 0.15 ± 0.15 |
| T4 (Estiff) | 0.24 ± 0.22 |
| T5 (EHB) | −0.16 ± 0.20 |
| T6 (Epair) | 0.01 ± 0.14 |
| T7 (Eelectro) | −0.20 ± 0.23 |
| T8 (Eenv) | 0.04 ± 0.16 |
| average | 0.06 |

Use of energy terms
- most frequent: T4, T5
- least frequent: T1, T6
Use of operators
- most frequent: add, div
- least frequent: sin, cos, log
Summary
Conclusions
- GP evolved function outperforms the weighted linear combination of terms
- GP choice of energy terms reflects their correlation to RMSD
- decoys from real prediction process are more difficult to assess
- bloat control is necessary to evolve more compact functions
Ideas for the future
- more complex total fitness
- distance measured using ProCKSI consensus
- Rosetta generated decoys
- additional energy terms (SA, RCH)
Thank you!
Acknowledgements
This work was supported by Marie Curie Action MEST-CT-2004-7597 under the Sixth Framework Programme of the European Community.
Ben Gurion University of the Negev's Distinguished Scientists Visitor Program and Prof. Moshe Sipper.
Publications
1 P. Widera, J.M. Garibaldi, N. Krasnogor. Evolutionary design of the energy function for protein structure prediction. In IEEE Congress on Evolutionary Computation, CEC'09, p. 1305–1312, Trondheim, Norway, May 2009
2 P. Widera, J.M. Garibaldi, N. Krasnogor. GP challenge: evolving the energy function for protein structure prediction. Submitted to Genetic Programming and Evolvable Machines, 2008
References
Anfinsen, C. (1973). Principles that Govern the Folding of Protein Chains. Science, 181(4096):223–30.
Dill, K. A. and Chan, H. S. (1997). From Levinthal to pathways to funnels. Nat Struct Mol Biol, 4(1):10–19.
Rohl, C. A., Strauss, C. E. M., Misura, K. M. S., and Baker, D. (2004). Protein Structure Prediction Using Rosetta. In Brand, L. and Johnson, M. L., editors, Numerical Computer Methods, Part D, volume 383 of Methods in Enzymology, pages 66–93. Academic Press.
Wu, S., Skolnick, J., and Zhang, Y. (2007). Ab initio modeling of small proteins by iterative TASSER simulations. BMC Biol, 5(1):17.
Zhang, Y., Kolinski, A., and Skolnick, J. (2003). TOUCHSTONE II: A New Approach to Ab Initio Protein Structure Prediction. Biophys. J., 85(2):1145–1164.