correlating graph-theoretical centrality indices with interface residue propensity
1
correlating graph-theoretical centrality indices with interface residue propensity
or: where do things stick together?
Stefan Maetschke
Teasdale Group
2
…a bit more specific
Prediction of interface residues
Protein-RNA interfaces
Machine learning methods
Structural information
Graph-topological features
3
something for the visual cortex
[Terribilini et al. 2006] [JMol, 1R3E_A] [Jung Library]
Protein-RNA complex
Binding site
Contact graph
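A contact graph of this kind can be derived from residue coordinates. The slides do not state how it was constructed, so the following is only a minimal sketch under assumptions: one representative point per residue (for example the C-alpha atom) and a hypothetical 8 Å distance cutoff. The function name `contact_graph` and the toy coordinates are illustrative.

```python
from itertools import combinations
from math import dist

def contact_graph(coords, cutoff=8.0):
    """Adjacency sets: residues i and j are linked if their
    representative points lie within `cutoff` (assumed value, in Å)."""
    graph = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if dist(coords[i], coords[j]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph

# Toy chain: three residues close together, one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
graph = contact_graph(coords)
```

Residues 0 to 2 end up mutually connected, while residue 3 stays isolated because it is more than 8 Å from every other point.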
4
questions
Most predictors are sequence-based:
What impact does structural information have on prediction accuracy?
Which features are predictive of interface residues?
5
obvious features
Interface residue…
is on surface => Accessible surface area
has to bind => Physico-chemical properties
must be stabilized => Contact graph topology
prefers flat surface => not really
is conserved => maybe not that much
6
accessible surface area (ASA)
http://www.see.ed.ac.uk/~tduren/research/surface_area/
http://www.ysbl.york.ac.uk/~ccp4mg/ccp4mg_help/analysis.html
7
physico-chemical properties
Hydrophobicity
Inside/Outside
Partition Coefficient
Conformation
AAIndex database: approx. 400 indices
AUC over 144 protein chains
4304 binding and 27932 non-binding residues
sequence similarity < 30%
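The AUC reported here can be read as the probability that a randomly chosen binding residue receives a higher index value than a randomly chosen non-binding one. How the scores were computed in the study is not shown on the slide; the sketch below only illustrates that rank-based definition, and the function name and toy scores are made up.

```python
def auc(scores, labels):
    """Fraction of (binding, non-binding) pairs in which the binding
    residue has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: 1 = binding, 0 = non-binding.
scores = [0.9, 0.8, 0.3, 0.2]
labels = [1, 0, 1, 0]
toy_auc = auc(scores, labels)  # 3 of the 4 pairs are ranked correctly
```

An AUC of 0.5 corresponds to a random ranking, which makes the measure usable despite the strong class imbalance in the data set.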
8
patch types
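This slide consists of a figure only. Later slides name three patch types: sequential, topological and spatial. The definitions below are assumptions made for illustration, not the ones confirmed by the talk: a sequence window, the nearest residues in space, and the residues reached first by breadth-first search in the contact graph. All function names and the toy data are hypothetical.

```python
from collections import deque
from math import dist

def sequential_patch(i, n, size):
    """Window of `size` residues centred on i, clipped at the chain ends."""
    half = size // 2
    return list(range(max(0, i - half), min(n, i + half + 1)))

def spatial_patch(i, coords, size):
    """The `size` residues closest to residue i in space (i included)."""
    order = sorted(range(len(coords)), key=lambda j: dist(coords[i], coords[j]))
    return order[:size]

def topological_patch(i, graph, size):
    """The first `size` residues reached by breadth-first search from i."""
    seen, queue, patch = {i}, deque([i]), []
    while queue and len(patch) < size:
        v = queue.popleft()
        patch.append(v)
        for w in sorted(graph[v]):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return patch

# Toy data: five residues on a line, each in contact with its neighbours.
coords = [(0.0, 0, 0), (4.0, 0, 0), (8.0, 0, 0), (12.0, 0, 0), (16.0, 0, 0)]
graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

seq = sequential_patch(2, len(coords), 3)
spa = spatial_patch(2, coords, 3)
top = topological_patch(2, graph, 3)
```

On a straight toy chain all three patches contain the same residues; on a folded protein they differ, because residues far apart in sequence can be close in space or in the contact graph.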
9
patch type comparison
Naïve Bayes
PSI-BLAST profiles
AUC
5-fold x-validation
RB144 data set
10
features over patches
11
betweenness-centrality (BC)
http://en.wikipedia.org/wiki/Image:Graph_betweenness.svg
Figure labels: s, t, v
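The betweenness centrality of a node v is the sum, over all node pairs s and t different from v, of the fraction of shortest s-t paths that pass through v. A minimal sketch using Brandes' algorithm on an unweighted, undirected graph follows; whether the talk used this algorithm or any normalisation is not stated, and the three-node example graph is made up.

```python
from collections import deque

def betweenness(graph):
    """Unnormalised betweenness centrality for adjacency sets."""
    bc = dict.fromkeys(graph, 0.0)
    for s in graph:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = dict.fromkeys(graph, 0)
        sigma[s] = 1
        depth = {s: 0}
        preds = {v: [] for v in graph}
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in depth:
                    depth[w] = depth[v] + 1
                    queue.append(w)
                if depth[w] == depth[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate dependencies from the farthest nodes inwards.
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Every unordered pair (s, t) was counted from both ends.
    return {v: b / 2 for v, b in bc.items()}

# Path s - v - t: the only shortest path between s and t runs through v.
path = {"s": {"v"}, "v": {"s", "t"}, "t": {"v"}}
bc = betweenness(path)
```

In the example, v obtains a centrality of 1 and the two end nodes obtain 0.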
12
BC for contact graph
1FJG_K
AUC = 0.71
Red: interface residue
Size: betweenness centrality
Histogram: binned BC over RB144
13
combined features
WRC: distance-weighted retention coefficient
BC: betweenness centrality
ASA: accessible surface area
5-fold x-validation, RB144
Patch sizes: sequential -> 11, topological -> 19, spatial -> 19
14
summary
Patch size is critical for sequential patches
Spatial/topological patches perform better
Structural information helps – but not much: +5%
Novelty: centrality indices as predictors
SVM superior to NB
Top prediction accuracy – as far as one can tell
Accuracy in general is still low (MCC < 0.4)
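The Matthews correlation coefficient (MCC) quoted above is computed from the four cells of the confusion matrix. A minimal sketch follows; the counts in the example are invented to show the behaviour on imbalanced data and are not results from the study.

```python
from math import sqrt

def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; defined as 0 when a
    row or column of the confusion matrix is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical predictor that labels every residue as non-binding:
# 90 % of the labels are correct, yet the MCC is 0.
all_negative = mcc(tp=0, fp=0, tn=90, fn=10)
perfect = mcc(tp=10, fp=0, tn=90, fn=0)
```

Unlike plain accuracy, the MCC does not reward a predictor for simply favouring the majority class, which matters when non-binding residues outnumber binding ones.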
15
what’s next…
Prediction of disease-associated SNPs
Graph-spectral methods
Protein function prediction
16
acknowledgments
Zheng Yuan – Data sets and much more …
Karin Kassahn – Aminoacyl-tRNA synthetases
http://en.wikipedia.org/wiki/Aminoacyl_tRNA_synthetase
17
questions