Model Database. Scene Recognition. Lamdan, Schwartz, Wolfson, “Geometric Hashing”, 1988.
Date posted: 22-Dec-2015
Inexact Alignment.
Simple case – two closely related proteins with the same number of amino acids.
Question: how to measure alignment error?
Superposition: best least-squares fit (RMSD, Root Mean Square Deviation).
Given two sets of 3-D points $P=\{p_i\}$, $Q=\{q_i\}$, $i=1,\dots,n$:
$$\mathrm{rmsd}(P,Q) = \sqrt{\frac{1}{n}\sum_{i=1}^{n} |p_i - q_i|^2}$$
Find a 3-D rigid transformation $T^*$ such that:
$$\mathrm{rmsd}(T^*(P), Q) = \min_T \sqrt{\frac{1}{n}\sum_{i=1}^{n} |T(p_i) - q_i|^2}$$
A closed-form solution exists for this task. It can be computed in $O(n)$ time.
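One common closed-form solution is the SVD-based (Kabsch) method; the slides do not say which method they have in mind, so the choice here is an assumption. The sketch below uses NumPy, and the function names `rmsd` and `best_superposition` are illustrative. Building the 3x3 covariance matrix takes O(n) time and its SVD takes constant time, which matches the stated bound.

```python
import numpy as np

def rmsd(P, Q):
    """RMSD between paired point sets P and Q (n x 3 arrays), no fitting."""
    diff = P - Q
    return np.sqrt((diff ** 2).sum() / len(P))

def best_superposition(P, Q):
    """Rigid transform (R, t) minimizing rmsd(P @ R.T + t, Q).

    SVD-based (Kabsch) method; assumed here, not named in the slides.
    """
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cP).T @ (Q - cQ)              # 3x3 covariance, O(n) to build
    U, _, Vt = np.linalg.svd(H)            # constant-size SVD
    # Force a proper rotation (determinant +1), never a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cQ - R @ cP
    return R, t

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    P = rng.normal(size=(10, 3))
    c, s = np.cos(0.7), np.sin(0.7)
    R0 = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    Q = P @ R0.T + np.array([1.0, -2.0, 0.5])   # rotated, shifted copy of P
    R, t = best_superposition(P, Q)
    print(rmsd(P, Q), rmsd(P @ R.T + t, Q))     # before and after fitting
```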
Structure Alignment (Straightforward Algorithm)
• For each pair of triplets, one from each molecule, that define ‘almost’ congruent triangles, compute the rigid transformation that superimposes them.
• Count the number of point pairs that are ‘almost’ superimposed, and sort the hypotheses by this number.
• For the highest-ranking hypotheses, improve the transformation by replacing it with the best RMSD transformation for all the matching pairs.
• Complexity: assuming on the order of n points in both molecules, $O(n^8)$; $O(n^4)$ if one exploits protein backbone geometry.
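The steps above can be sketched in a few lines for very small point sets. This is a brute-force illustration under assumptions, not the authors' implementation: the names `fit`, `sides`, `matched_pairs`, `align` and the parameters `eps` and `top` are hypothetical, ‘almost’ is taken to mean "within `eps`", and matching is nearest-neighbour without enforcing a one-to-one correspondence.

```python
from itertools import combinations, permutations
import numpy as np

def fit(P, Q):
    """Least-squares rigid transform (R, t) taking P onto Q (SVD-based)."""
    cP, cQ = P.mean(axis=0), Q.mean(axis=0)
    U, _, Vt = np.linalg.svd((P - cP).T @ (Q - cQ))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, cQ - R @ cP

def sides(T):
    """Side lengths of the triangle with vertices T[0], T[1], T[2]."""
    return np.array([np.linalg.norm(T[0] - T[1]),
                     np.linalg.norm(T[1] - T[2]),
                     np.linalg.norm(T[2] - T[0])])

def matched_pairs(A, B, R, t, eps):
    """Pairs (i, j) where transformed A[i] lies within eps of its nearest B[j]."""
    moved = A @ R.T + t
    dist = np.linalg.norm(moved[:, None, :] - B[None, :, :], axis=2)
    nearest = dist.argmin(axis=1)
    return [(i, int(nearest[i])) for i in range(len(A))
            if dist[i, nearest[i]] < eps]

def align(A, B, eps=0.5, top=3):
    """Return up to `top` hypotheses as (match count, R, t), best first."""
    hypotheses = []
    for ta in combinations(range(len(A)), 3):
        for tb in permutations(range(len(B)), 3):
            ta_pts, tb_pts = A[list(ta)], B[list(tb)]
            # Step 1: keep only 'almost' congruent triangles.
            if np.abs(sides(ta_pts) - sides(tb_pts)).max() > eps:
                continue
            R, t = fit(ta_pts, tb_pts)
            # Step 2: score the hypothesis by its number of matched pairs.
            hypotheses.append((len(matched_pairs(A, B, R, t, eps)), R, t))
    hypotheses.sort(key=lambda h: h[0], reverse=True)
    refined = []
    for _, R, t in hypotheses[:top]:
        # Step 3: refit on all matching pairs of the hypothesis.
        pairs = matched_pairs(A, B, R, t, eps)
        if len(pairs) >= 3:
            ia, ib = (list(x) for x in zip(*pairs))
            R, t = fit(A[ia], B[ib])
        refined.append((len(pairs), R, t))
    return refined
```

The double loop over triplets is what makes the approach expensive, so this sketch is only practical for a handful of points.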
Accuracy improvement during detection of 3D transformation.
Instead of 3 points use more. How many?
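Align every possible pair of fragments $F_{ij}(k)$: residues $i, \dots, i+k-1$ of one molecule against residues $j, \dots, j+k-1$ of the other.
Accept $F_{ij}(k)$ if $\mathrm{rmsd}(F_{ij}(k))$ is below a preset threshold.
Complexity: $O(n^3 \cdot n)$ (for each $F_{ij}(k)$ we need to compute its rmsd).
This can be reduced to $O(n^3)$.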
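For one fixed fragment length, the scan over all fragment pairs might look like the sketch below. It assumes that rmsd of a fragment pair means the RMSD after optimal rigid superposition, and it computes that value directly from the singular values of the covariance matrix. The names `fitted_rmsd` and `similar_fragments` and the threshold parameter `eps` are illustrative. Each pair is scored from scratch here, so the sketch does not include the reduction to $O(n^3)$.

```python
import numpy as np

def fitted_rmsd(P, Q):
    """RMSD of P and Q (k x 3 arrays) after optimal rigid superposition."""
    P0, Q0 = P - P.mean(axis=0), Q - Q.mean(axis=0)
    H = P0.T @ Q0
    s = np.linalg.svd(H, compute_uv=False)   # singular values, descending
    if np.linalg.det(H) < 0:
        s[-1] = -s[-1]                       # exclude reflections
    e = (P0 ** 2).sum() + (Q0 ** 2).sum() - 2.0 * s.sum()
    return np.sqrt(max(e, 0.0) / len(P))     # clamp tiny negative round-off

def similar_fragments(A, B, k, eps):
    """All (i, j) whose length-k fragments A[i:i+k], B[j:j+k] fit within eps."""
    return [(i, j)
            for i in range(len(A) - k + 1)
            for j in range(len(B) - k + 1)
            if fitted_rmsd(A[i:i + k], B[j:j + k]) < eps]
```

Calling `similar_fragments(A, B, k, eps)` for every `k` reproduces the exhaustive search described above.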
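Improvement: the BLAST idea. Detect short similar fragments, then extend them as much as possible.
[Figure: a seed pairing $a_{i-1}, a_i, a_{i+1}$ with $b_{j-1}, b_j, b_{j+1}$, extended to a fragment pair covering positions $k, \dots, k+l-1$ and $t, \dots, t+l-1$.]
Extend while $\mathrm{rmsd}(F_{ij}(k))$ stays below the threshold.
Complexity: $O(n^2)$.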
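A minimal seed-and-extend sketch, under assumptions: seeds are fragment pairs of a fixed short length, the fit is measured by RMSD after optimal superposition, and extension is greedy (first to the right, then to the left). The names `fitted_rmsd`, `seed_and_extend`, `seed_len` and `eps` are hypothetical. The rmsd is recomputed from scratch at every extension step, so this version illustrates the idea rather than the stated complexity.

```python
import numpy as np

def fitted_rmsd(P, Q):
    """RMSD of P and Q (k x 3 arrays) after optimal rigid superposition."""
    P0, Q0 = P - P.mean(axis=0), Q - Q.mean(axis=0)
    H = P0.T @ Q0
    s = np.linalg.svd(H, compute_uv=False)
    if np.linalg.det(H) < 0:
        s[-1] = -s[-1]                       # exclude reflections
    e = (P0 ** 2).sum() + (Q0 ** 2).sum() - 2.0 * s.sum()
    return np.sqrt(max(e, 0.0) / len(P))

def seed_and_extend(A, B, seed_len, eps):
    """Return sorted (start_in_A, start_in_B, length) of extended fragment pairs."""
    found = set()
    for i in range(len(A) - seed_len + 1):
        for j in range(len(B) - seed_len + 1):
            # Seed: a short fragment pair that already superimposes well.
            if fitted_rmsd(A[i:i + seed_len], B[j:j + seed_len]) >= eps:
                continue
            si, sj, length = i, j, seed_len
            # Extend to the right while the fit stays below the threshold.
            while (si + length < len(A) and sj + length < len(B) and
                   fitted_rmsd(A[si:si + length + 1],
                               B[sj:sj + length + 1]) < eps):
                length += 1
            # Then extend to the left under the same condition.
            while (si > 0 and sj > 0 and
                   fitted_rmsd(A[si - 1:si + length],
                               B[sj - 1:sj + length]) < eps):
                si, sj, length = si - 1, sj - 1, length + 1
            found.add((si, sj, length))
    return sorted(found)
```

Several seeds on the same diagonal usually grow into the same fragment pair, which is why the results are collected in a set.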
Sequence Based Structure Alignment
• Run pairwise sequence alignment.
• Based on the sequence correspondence, compute the 3D transformation (a least-squares fit can be applied).
• Iteratively improve the structural superposition.
Motivation
• Proteins are flexible. One would like to align proteins modulo the flexibility.
• Hinge and shear protein domain motions (Gerstein, Lesk, Chothia).
• Conformational flexibility in drugs.
Flexible protein alignment without prior hinge knowledge
FlexProt - algorithm
– automatically detects flexible regions
– exploits amino acid sequence order