MULTICOM – A Combination Pipeline for Protein Structure Prediction Jianlin Cheng Computer Science...
Date posted: 21-Dec-2015
MULTICOM – A Combination Pipeline for Protein Structure Prediction
Jianlin Cheng
Computer Science Department & Informatics Institute
University of Missouri, Columbia, MO, USA
MULTICOM Structure Prediction Pipeline
Query Sequence
Server Predictor
Human Predictor
Output
Query-template alignments
• PSI-BLAST
• HHSearch
• COMPASS
• FOLDpro + SPEM
Find a set of good templates / fragments; generate alternative query-template alignments.
Combination
1. Combine the top-ranked query-template alignment (QTA) with other significant QTAs
2. Take fragments from less significant QTAs (template-free)
Don’t try to find the best template; instead, combine multiple good templates / fragments.
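The two combination steps above can be sketched roughly as follows. This is only an illustration, not the MULTICOM implementation: the `QTA` record, the e-value cutoff `e_cutoff`, and the minimum fragment length `min_fragment` are all hypothetical, and the slides do not say how significance is measured.

```python
from dataclasses import dataclass


@dataclass
class QTA:
    """A query-template alignment (hypothetical record for illustration)."""
    template: str   # template identifier
    e_value: float  # significance score; smaller means more significant
    start: int      # first aligned query residue (1-based, inclusive)
    end: int        # last aligned query residue (inclusive)


def combine_qtas(qtas, query_len, e_cutoff=1e-3, min_fragment=5):
    """Return (significant QTAs, fragments from less significant QTAs).

    e_cutoff and min_fragment are assumed values, not taken from the slides.
    """
    ranked = sorted(qtas, key=lambda q: q.e_value)
    top, rest = ranked[0], ranked[1:]

    # Step 1: the top-ranked QTA plus every other significant QTA.
    significant = [top] + [q for q in rest if q.e_value <= e_cutoff]

    # Mark query residues already covered by the significant QTAs.
    covered = [False] * (query_len + 1)
    for q in significant:
        for i in range(q.start, q.end + 1):
            covered[i] = True

    # Step 2: less significant QTAs only contribute fragments that fill
    # query regions not yet covered.
    fragments = []
    for q in rest:
        if q.e_value <= e_cutoff:
            continue
        run_start = None
        for i in range(q.start, q.end + 2):  # one extra step closes the last run
            free = i <= q.end and not covered[i]
            if free and run_start is None:
                run_start = i
            elif not free and run_start is not None:
                if i - run_start >= min_fragment:
                    fragments.append((q.template, run_start, i - 1))
                    for j in range(run_start, i):
                        covered[j] = True
                run_start = None
    return significant, fragments


# Made-up alignments for a 120-residue query.
alignments = [
    QTA("tmplA", 1e-30, 1, 60),
    QTA("tmplB", 1e-8, 40, 90),
    QTA("tmplC", 0.5, 80, 120),
]
significant, fragments = combine_qtas(alignments, query_len=120)
print([q.template for q in significant])  # ['tmplA', 'tmplB']
print(fragments)                          # [('tmplC', 91, 120)]
```

In this toy case the weak hit `tmplC` is not used as a template, but its alignment still supplies a fragment for the query region that no significant template covers.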
Integrative Model Generation
1. Modeller
2. Rosetta for template-free small domains
Domain-level combination of template-based and template-free approaches
Model Ranking by ModelEvaluator
ModelEvaluator
Inputs: 3D Model; Ab initio Sequence-Based Structural Feature Prediction
• Secondary Structure (e.g. EEEECCEEEHHHHHHHHHHHHEEEECCEEEHHHH)
• Relative Solvent Accessibility (e.g. eeee-----eeeee----------eeeee------eeeee---eeeeeeee)
• Contact Map
• Beta-Sheet Pairing
Comparison → Input Features → Predicted GDT-TS score
Good models are ranked at the top. Very effective for template-free models.
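The comparison step can be sketched as below, under stated assumptions. The slides do not specify how the features are computed or how the score is learned, so the per-residue match fractions and the function names (`agreement`, `contact_agreement`, `comparison_features`) are hypothetical. Beta-sheet pairing is left out for brevity.

```python
def agreement(predicted: str, observed: str) -> float:
    """Fraction of positions where the sequence-based prediction matches
    the state derived from the 3D model (illustrative feature only)."""
    if len(predicted) != len(observed):
        raise ValueError("strings must have the same length")
    if not predicted:
        return 0.0
    return sum(p == o for p, o in zip(predicted, observed)) / len(predicted)


def contact_agreement(predicted_contacts, model_contacts) -> float:
    """Fraction of predicted residue-residue contacts found in the model."""
    predicted = {tuple(sorted(pair)) for pair in predicted_contacts}
    model = {tuple(sorted(pair)) for pair in model_contacts}
    if not predicted:
        return 0.0
    return len(predicted & model) / len(predicted)


def comparison_features(pred_ss, model_ss, pred_rsa, model_rsa,
                        pred_contacts, model_contacts):
    """Assemble a feature vector that a trained regressor (not shown)
    could map to a predicted GDT-TS score."""
    return [
        agreement(pred_ss, model_ss),     # secondary structure (H/E/C)
        agreement(pred_rsa, model_rsa),   # solvent accessibility (e / -)
        contact_agreement(pred_contacts, model_contacts),
    ]


# Made-up 10-residue example.
features = comparison_features(
    pred_ss="EEEECCHHHH", model_ss="EEECCCHHHH",
    pred_rsa="ee---eee--", model_rsa="ee---ee---",
    pred_contacts=[(1, 8), (2, 7), (3, 9), (4, 10)],
    model_contacts=[(8, 1), (2, 7), (5, 9)],
)
print(features)  # [0.9, 0.9, 0.5]
```

Because the features depend only on the query sequence and the model itself, a scorer of this kind needs no template, which is consistent with the slide's remark that the approach works well for template-free models.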
Global-Local Model Combination
1. Start from a top-ranked model
2. Combine it with other models having global similarity (80%, 4Å)
3. Combine it with the longest similar model fragments
Modeller Iterative Modeling
Average Model
Don’t try to find the best model. Instead, combine multiple good models / fragments (2-3% improvement).
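A minimal sketch of the global selection and averaging steps follows. It reads "(80%, 4Å)" as "at least 80% of residues lie within 4 Å of the top-ranked model", which is an assumption rather than something the slides state. It also assumes the models are already superposed and reduced to one (x, y, z) point per residue. The slides use Modeller for the iterative modeling itself; the plain coordinate average here only illustrates the "Average Model" idea.

```python
import math


def fraction_within(a, b, cutoff=4.0):
    """Fraction of residue pairs closer than `cutoff` (in Å).
    Assumes a and b are equal-length, already-superposed coordinate lists."""
    close = sum(1 for p, q in zip(a, b) if math.dist(p, q) <= cutoff)
    return close / len(a)


def select_similar(top, others, min_fraction=0.8, cutoff=4.0):
    """Keep models globally similar to the top-ranked model
    (assumed reading of the slide's "80%, 4Å" criterion)."""
    return [m for m in others
            if fraction_within(top, m, cutoff) >= min_fraction]


def average_model(models):
    """Per-residue arithmetic mean of the coordinates of several models."""
    n = len(models)
    return [
        tuple(sum(m[i][k] for m in models) / n for k in range(3))
        for i in range(len(models[0]))
    ]


# Toy 5-residue models: `close` is shifted by 1 Å everywhere,
# `far` deviates by 10 Å on its last two residues.
top = [(3.8 * i, 0.0, 0.0) for i in range(5)]
close = [(x, y + 1.0, z) for x, y, z in top]
far = top[:3] + [(x, y, z + 10.0) for x, y, z in top[3:]]

similar = select_similar(top, [close, far])
print(len(similar))                           # 1 (only `close` passes)
print(average_model([top] + similar)[0])      # (0.0, 0.5, 0.0)
```

The local step (combining the longest similar fragments) would apply the same distance test over contiguous residue windows instead of the whole chain.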
Good Template-Free Example: T0416_2
Structure
MULTICOM (GDT = 0.66, RMSD = 2.5)
Superposition (red: model) (Courtesy of Prof. Joel Sussman)
Combination of 20 models: Zhang-Server, Robetta, TASSER, MULTICOM, YASARA, forecast
Success: rank very good models at top.
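For readers unfamiliar with the two quality measures quoted on these example slides, here is a simplified sketch of how they are commonly computed from a model and the experimental structure. It assumes both are already superposed and given as equal-length lists of per-residue (x, y, z) coordinates. The official GDT-TS calculation searches over many superpositions for each cutoff, so this single-superposition version is only an approximation.

```python
import math


def rmsd(model, native):
    """Root-mean-square deviation (Å) between paired residue positions."""
    squared = [math.dist(p, q) ** 2 for p, q in zip(model, native)]
    return math.sqrt(sum(squared) / len(squared))


def gdt_ts(model, native, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Mean fraction of residues within 1, 2, 4 and 8 Å of the native
    position, reported on a 0-1 scale as in the slides."""
    distances = [math.dist(p, q) for p, q in zip(model, native)]
    fractions = [
        sum(d <= c for d in distances) / len(distances) for c in cutoffs
    ]
    return sum(fractions) / len(cutoffs)


# Toy 4-residue example with deviations of 0.5, 1.5, 3.0 and 9.0 Å.
native = [(3.8 * i, 0.0, 0.0) for i in range(4)]
model = [(x, y + dev, z)
         for (x, y, z), dev in zip(native, (0.5, 1.5, 3.0, 9.0))]

print(gdt_ts(model, native))          # 0.5625
print(round(rmsd(model, native), 2))  # 4.81
```

The toy numbers show why the two measures can disagree: one badly placed residue dominates RMSD, while GDT-TS still credits the residues that are placed well.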
Good Template-Free Example: T0513_2
Structure
MULTICOM (GDT = 0.73, RMSD = 2.1)
Combine Robetta models: better than each one of them
Superposition (blue: model)
Success: rank very good models at top and combination improves modeling.
Not Good Template-Free Example: T0405_1
Structure (Helix Bundle)
MULTICOM (GDT = 0.41)
Superposition (by Prof. Sussman) (gray: structure, yellow: best model, green: MULTICOM model)
Failure: ModelEvaluator fails to identify correct helix orientations.
Concluding Remarks
• The CASP community can sometimes generate good template-free models (e.g. Rosetta-based tools)
• ModelEvaluator can rank good template-free models at the top
• Iterative global-local combination of models can improve template-free modeling
• Blending of template-free and template-based modeling
Blending of Template-Free and Template-Based Modeling
Protein Modeling Spectrum: 100% TBM | 50% TBM + 50% FM | 100% FM
Acknowledgements
• CASP8 organizers and assessors
• CASP8 participants
• MU colleagues: Dong Xu, Toni Kazic
• My group: Zheng Wang, Allison Tegge, Xin Deng