TEXTAL: A System for Automated Model Building Based on Pattern Recognition
Thomas R. Ioerger
Department of Computer Science
Texas A&M University
Main Stages of TEXTAL
• electron density map → CAPRA → C-alpha chains
• C-alpha chains → LOOKUP → model (initial coordinates)
– LOOKUP: build in side-chain and main-chain atoms locally around each CA
• model (initial coordinates) → post-processing routines → model (final coordinates)
– example: real-space refinement
• also shown in the diagram: reciprocal-space refinement/ML DM; human crystallographer (editing)
F=<1.72,-0.39,1.04,1.55...> F=<1.58,0.18,1.09,-0.25...>
F=<0.90,0.65,-1.40,0.87...> F=<1.79,-0.43,0.88,1.52...>
CAPRA: C-Alpha Pattern Recognition Algorithm
Overview of CAPRA
• goal: predict CA chains from density map
• not just “tracing” - more than Bones
• desire 1:1 correspondence, ~3.8 Å apart
• based on principles of pattern recognition
– use neural net to estimate which pseudo-atoms in trace “look” closest to true C-alphas
– use feature extraction to capture 3D patterns in density for input to neural net
– use other heuristics for “linking” together into chains, including geometric analysis (s.s.)
CAPRA: C-Alpha Pattern-Recognition Algorithm
• Tracer - remove lattice points from map (lowest density first) without breaking connectivity
• Neural network - for each pseudo-atom, extract features, input to network, predict distances to CAs (1:10 in trace), trained on example points in real maps
• Linking - desire long chains, good CA predictions (not in side-chains), “structurally plausible” (e.g. linear, helical)
map → Density Trace → pseudo-atoms → Neural Network → predictions of distance to true CA → Linking into C-alpha chains → C-alpha coordinates
Steps in CAPRA
Examples of CAPRA Steps
Tracer
[Figure: lattice points of the map, each marked “+”]
Neural Network
Feature Extraction
• characterize 3D patterns in local density
• must be “rotation invariant”
• examples:
– average density in region
– standard deviation, kurtosis...
– distance to center of mass
– moments of inertia, ratios of moments
– “spoke angles”
• calculated over spheres of 3 Å and 4 Å radius
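To make the idea of rotation-invariant features concrete, here is a minimal Python sketch that computes a few of the listed statistics over a sphere of grid points. It is an illustration only, not the TEXTAL code: the function name `sphere_features`, the input layout (a list of `((x, y, z), density)` pairs) and the choice to clamp negative density to zero when weighting the center of mass are all assumptions. Moments of inertia and spoke angles are left out.

```python
import math

def sphere_features(points, center, radius):
    """Rotation-invariant statistics of density within `radius` of `center`.

    points: list of ((x, y, z), density) pairs (hypothetical layout).
    Assumes at least one point falls inside the sphere.
    """
    inside = [(p, rho) for p, rho in points if math.dist(p, center) <= radius]
    dens = [rho for _, rho in inside]
    n = len(dens)
    mean = sum(dens) / n
    var = sum((d - mean) ** 2 for d in dens) / n
    std = math.sqrt(var)
    # Kurtosis as the normalized fourth central moment (0 if density is flat).
    kurt = sum((d - mean) ** 4 for d in dens) / n / var ** 2 if var > 0 else 0.0
    # Density-weighted center of mass; negative density is clamped to zero
    # (an assumption made here so the weights are well defined).
    weights = [max(rho, 0.0) for rho in dens]
    total = sum(weights)
    if total > 0:
        com = tuple(
            sum(p[axis] * w for (p, _), w in zip(inside, weights)) / total
            for axis in range(3)
        )
    else:
        com = tuple(center)
    return {
        "mean": mean,
        "std": std,
        "kurtosis": kurt,
        "com_dist": math.dist(com, center),
    }
```

Each quantity depends only on density values and on distances from the sphere center, so rotating the map about that center leaves the feature vector unchanged. Calling the function once with `radius=3.0` and once with `radius=4.0` would give the two sets of features mentioned on the slide.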
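Forward Propagation:
$$act_j = \sum_i w_{i,j}\, out_i + bias_j$$
$$out_j = \frac{1}{1 + e^{-act_j}}$$
Backward Propagation:
$$\delta_j = out_j\,(1 - out_j) \sum_k \delta_k\, w_{j,k}$$
$$\Delta w_{j,k} = \eta\, \delta_k\, out_j$$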
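The forward and backward propagation rules on this slide are those of a standard feed-forward network with sigmoid units. The sketch below, in plain Python, shows one layer of each step. The function names, the nested-list weight layout (`weights[i][j]` links unit `i` to unit `j`) and the learning-rate parameter `eta` are assumptions for illustration; the slide does not specify the network size or how the output-layer deltas are formed.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, weights, biases):
    """Outputs of one layer: out_j = sigmoid(sum_i w[i][j] * out_i + bias_j)."""
    outs = []
    for j, bias in enumerate(biases):
        act = sum(weights[i][j] * inputs[i] for i in range(len(inputs))) + bias
        outs.append(sigmoid(act))
    return outs

def hidden_deltas(outs, w_next, deltas_next):
    """delta_j = out_j * (1 - out_j) * sum_k delta_k * w_next[j][k]."""
    return [
        outs[j] * (1.0 - outs[j])
        * sum(deltas_next[k] * w_next[j][k] for k in range(len(deltas_next)))
        for j in range(len(outs))
    ]

def weight_updates(outs, deltas_next, eta):
    """Update for each weight: eta * delta_k * out_j, indexed [j][k]."""
    return [[eta * dk * oj for dk in deltas_next] for oj in outs]
```

For example, `forward([0.0, 0.0], [[0.0], [0.0]], [0.0])` returns `[0.5]`, since a sigmoid unit with zero activation outputs one half.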
Selection of Candidate C-alphas
• method:
– pick candidates in order of lowest predicted distance first,
– among all pseudo-atoms in trace,
– as long as not closer than 2.5 Å to a candidate already picked
• notes:
– no 3.8 Å constraint; distance can be as high as 5 Å
– don’t rely on branch points (though often near)
– picked in random order throughout map
– initially covers whole map, including side-chains and disconnected regions (e.g. noise in solvent)
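The selection rule above is a greedy pick with an exclusion radius. A minimal Python sketch follows, assuming each pseudo-atom is given as a `((x, y, z), predicted_distance)` pair and that the 2.5 Å limit is measured against candidates already picked; the function name `select_candidates` is hypothetical.

```python
import math

def select_candidates(pseudo_atoms, min_sep=2.5):
    """Greedily pick CA candidates, lowest predicted distance first.

    pseudo_atoms: list of ((x, y, z), predicted_distance_to_CA).
    A pseudo-atom is skipped if it lies closer than `min_sep` (in Å)
    to any candidate that has already been picked.
    """
    picked = []
    for pos, pred in sorted(pseudo_atoms, key=lambda atom: atom[1]):
        if all(math.dist(pos, other) >= min_sep for other, _ in picked):
            picked.append((pos, pred))
    return picked
```

Because the order follows the network's predictions rather than position, consecutive picks can land anywhere in the map, which matches the note that candidates are picked in random order throughout the map.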
Linking into Chains
• initial connectivity of CA candidates based on the trace
• “over-connected” graph - branches, cycles...
• start by computing connected components (islands, or clusters)
• two strategies:
– for small clusters (<=20 candidates), find longest internal chain with “good” atoms
– for large clusters (>20 candidates), incrementally clip branch points using heuristics
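Computing the connected components and routing each cluster to one of the two strategies can be sketched in Python as follows. The adjacency-dictionary input and the function names are assumptions for illustration; only the 20-candidate threshold comes from the slide.

```python
from collections import deque

def connected_components(links):
    """Split an undirected graph into clusters.

    links: dict mapping each candidate to an iterable of its neighbours.
    Returns a list of clusters, each a list of candidates.
    """
    seen, clusters = set(), []
    for start in links:
        if start in seen:
            continue
        seen.add(start)
        queue, cluster = deque([start]), []
        while queue:  # breadth-first walk over one island
            node = queue.popleft()
            cluster.append(node)
            for nb in links[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        clusters.append(cluster)
    return clusters

def split_by_size(clusters, limit=20):
    """Separate clusters into small (<= limit) and large (> limit)."""
    small = [c for c in clusters if len(c) <= limit]
    large = [c for c in clusters if len(c) > limit]
    return small, large
```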
Extracting Chains from Small Clusters
• exhaustive depth-first search of all paths
• scoring function:
– length
– penalty for inclusion of points with high predicted distance to true CA by neural net
– preference for following secondary structure (locally straight or helical)
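An exhaustive depth-first search over all simple paths in a small cluster might look like the Python sketch below. The scoring here covers only the first two terms (length, and a penalty for candidates with a high predicted distance); the cutoff `bad_cutoff`, the weight `penalty` and the function name are invented for illustration, and the secondary-structure preference is omitted.

```python
def best_chain(links, pred_dist, bad_cutoff=3.0, penalty=2.0):
    """Return the best-scoring simple path in a small cluster.

    links: dict node -> list of neighbours (undirected graph).
    pred_dist: dict node -> predicted distance to a true CA.
    Score = path length - penalty * (number of nodes above bad_cutoff).
    """
    best_path, best_score = [], float("-inf")

    def score(path):
        bad = sum(1 for n in path if pred_dist[n] > bad_cutoff)
        return len(path) - penalty * bad

    def dfs(path, visited):
        nonlocal best_path, best_score
        s = score(path)
        if s > best_score:
            best_path, best_score = list(path), s
        for nb in links[path[-1]]:
            if nb not in visited:
                visited.add(nb)
                path.append(nb)
                dfs(path, visited)
                path.pop()          # backtrack
                visited.remove(nb)

    for start in links:             # try every node as a chain end
        dfs([start], {start})
    return best_path, best_score
```

The number of simple paths grows quickly with cluster size, which is presumably why this exhaustive search is reserved for clusters of at most 20 candidates.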
Secondary Structure Analysis
• generate all 7-mers (connected fragments of candidate CAs of length 7)
• evaluate “straightness”
– ratio of end-to-end distance to sum of link lengths (at most 1)
– straightness>0.8 ==> potential beta-strand
• evaluate “helicity”
– average absolute deviation of angles and torsions along 7-mer from ideal values (95° and 50°)
– helicity<20 ==> potential alpha-helix
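The two measures can be sketched in Python as below. Some choices here are assumptions rather than details from the slide: straightness is taken as end-to-end distance divided by summed link length (so that it cannot exceed 1 and the 0.8 threshold is meaningful), the bond-angle and torsion deviations are pooled into a single average, and torsion differences are wrapped into the range ±180°.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def straightness(cas):
    """End-to-end distance divided by the summed link lengths."""
    path = sum(math.dist(cas[i], cas[i + 1]) for i in range(len(cas) - 1))
    return math.dist(cas[0], cas[-1]) / path

def bond_angle(p0, p1, p2):
    """Angle at p1 in degrees."""
    a, b = _sub(p0, p1), _sub(p2, p1)
    cos_t = _dot(a, b) / (_norm(a) * _norm(b))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def torsion(p0, p1, p2, p3):
    """Signed dihedral angle about the p1-p2 axis, in degrees."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    return math.degrees(math.atan2(_norm(b2) * _dot(b1, n2), _dot(n1, n2)))

def helicity(cas, ideal_angle=95.0, ideal_torsion=50.0):
    """Mean absolute deviation of angles and torsions from ideal values."""
    devs = [abs(bond_angle(*cas[i:i + 3]) - ideal_angle)
            for i in range(len(cas) - 2)]
    for i in range(len(cas) - 3):
        diff = torsion(*cas[i:i + 4]) - ideal_torsion
        devs.append(abs((diff + 180.0) % 360.0 - 180.0))  # wrap to +/-180
    return sum(devs) / len(devs)
```

With the thresholds from the slide, a 7-mer would be flagged as a potential beta-strand when `straightness(cas) > 0.8` and as a potential alpha-helix when `helicity(cas) < 20`.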
Handling Large Clusters
• start by breaking cycles (near “bad” atoms)
• clip links at branch points till only linear chains remain
• clip the most “obvious” links first, e.g.
– if other two links are part of sec. struct.
– if clipped branch has “bad” atom nearby
– if clipped branch is small and other 2 are large
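As a rough illustration of incremental clipping, the Python sketch below applies only the last heuristic: at each step it removes the link leading to the smallest branch at any branch point, until no candidate has more than two links. It assumes cycles have already been broken and that node labels are comparable (e.g. integers); the function names are hypothetical, and the secondary-structure and “bad” atom heuristics are not modelled.

```python
def branch_size(links, root, first):
    """Count nodes reachable from `first` without passing through `root`."""
    seen, stack = {root, first}, [first]
    while stack:
        node = stack.pop()
        for nb in links[node]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) - 1  # exclude the root itself

def clip_branches(links):
    """Clip links at branch points until only linear chains remain.

    links: dict node -> set of neighbours (undirected, assumed acyclic).
    Returns a new adjacency dict; the input is not modified.
    """
    links = {n: set(nbs) for n, nbs in links.items()}
    while True:
        branch_points = [n for n in links if len(links[n]) > 2]
        if not branch_points:
            return links
        # Pick the smallest branch over all branch points and clip it.
        _, node, nb = min(
            (branch_size(links, n, nb), n, nb)
            for n in branch_points
            for nb in links[n]
        )
        links[node].discard(nb)
        links[nb].discard(node)
```

Clipping the globally smallest branch first is one way to read "most obvious first"; a fuller version would rank candidate clips by combining all three heuristics.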
Example of CA-chains for CzrA fit by CAPRA
Results for MVK
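Results

| protein | PDB id | final res | method | res used | sec. str. | size |
|---|---|---|---|---|---|---|
| CzrA | | 2.3 Å | MAD/MR | 2.8 Å | | 94/104 |
| IF5a | 1bkb | 1.75 Å | MAD | 2.8 Å | | 136/139 |
| MVK | 1kkh | 2.4 Å | MAD | 2.4 Å | | 317/317 |
| PCAa | 1l1e | 2.0 Å | MAD | 2.8 Å | | 262/287 |
| P2 Myelin | 1pmp | 2.7 Å | MIR | 2.7 Å | | 131/131 |

| protein | % built | RMS error | # chains | longest | # ins/del cross-overs |
|---|---|---|---|---|---|
| CzrA | 84/104 (81%) | 1.08 Å | 5 | 53 | 0 |
| IF5a | 127/136 (93%) | 0.78 Å | 4 | 52 | 0 |
| MVK | 298/317 (95%) | 0.83 Å | 6 | 101 | 0 |
| PCAa | 212/262 (81%) | 0.89 Å | 11 | 50 | 1 |
| P2 Myelin | 111/131 (85%) | 0.91 Å | 6 | 63 | 2 |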
Availability
• Textal web site:
– http://textal.tamu.edu:12321
– server-side processing
– free access to Capra
– beta-testing of Textal
• To contact us, email: [email protected]
Acknowledgements
• Funding
– National Institutes of Health
– Welch Foundation
• People
– Dr. James C. Sacchettini
– The rest of the TEXTAL Group:
• Tod Romo
• Kreshna Gopal
• Reetal Pai