Structure Alignment in Polynomial Time
![Page 1: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/1.jpg)
Structure Alignment in Polynomial Time
Rachel Kolodny, Stanford University
Nati Linial, The Hebrew University of Jerusalem
![Page 2: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/2.jpg)
Problem Statement
• Two structures in $\mathbb{R}^3$: $A=\{a_1,a_2,\dots,a_n\}$, $B=\{b_1,b_2,\dots,b_m\}$
• Find subsequences $s_a$ and $s_b$ such that the substructures $\{a_{s_a(1)},a_{s_a(2)},\dots,a_{s_a(l)}\}$ and $\{b_{s_b(1)},b_{s_b(2)},\dots,b_{s_b(l)}\}$ are similar
![Page 3: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/3.jpg)
Motivation
• Structure is better conserved than amino acid sequence
– Structure similarity can give hints to common functionality/origin
• Allows automatic classification of protein structure
![Page 4: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/4.jpg)
Correspondence → Position
• Given a correspondence, the rotation and translation that minimize the cRMS distance can be calculated
Kabsch, W. (1978).
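For reference, the minimization on this slide can be written out as follows. This is a sketch of the standard SVD form of the Kabsch solution, not taken from the slides; the symbols $x_i$, $y_i$ (the $l$ corresponding atoms of A and B), $H$, $U$, $V$ and $d$ are notation introduced here for illustration.

```latex
\begin{align*}
\mathrm{cRMS} &= \min_{R,\,t}\ \sqrt{\frac{1}{l}\sum_{i=1}^{l}\lVert R\,x_i + t - y_i\rVert^{2}},
  \qquad x_i = a_{s_a(i)},\quad y_i = b_{s_b(i)} \\
\bar{x} &= \frac{1}{l}\sum_{i=1}^{l} x_i, \qquad \bar{y} = \frac{1}{l}\sum_{i=1}^{l} y_i \\
H &= \sum_{i=1}^{l} (x_i-\bar{x})(y_i-\bar{y})^{\mathsf T} = U\,\Sigma\,V^{\mathsf T}
  \quad\text{(singular value decomposition)} \\
d &= \operatorname{sign}\det\!\left(V U^{\mathsf T}\right), \qquad
R = V \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & d \end{pmatrix} U^{\mathsf T}, \qquad
t = \bar{y} - R\,\bar{x}
\end{align*}
```

The factor $d$ guards against returning a reflection instead of a proper rotation.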
![Page 5: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/5.jpg)
Position → Correspondence
• Given a rotation and translation, one can calculate the alignment that optimizes a (separable) score
– Using dynamic programming
– Essentially similar to sequence alignment
• Example score:
$$\sum_{i \in \text{correspondence}} \frac{20}{1 + d(A_i, B_i)^2/5} \;-\; 10 \cdot \#\text{gaps}$$
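The dynamic program can be sketched as follows, for two structures that are already placed in a common frame. This is a minimal illustration, not the authors' code: the function name `align_fixed_position` is hypothetical, and it assumes that every skipped atom counts as one gap (a linear gap penalty), which may differ from how gaps are counted in the score above.

```python
def align_fixed_position(A, B, gap_penalty=10.0):
    """Best-scoring order-preserving correspondence between point lists A and B.

    Assumption: each skipped atom is charged gap_penalty (linear gaps).
    Returns (score, list of matched index pairs).
    """
    n, m = len(A), len(B)

    def pair_score(a, b):
        # Per-pair term of the example score: 20 / (1 + d^2 / 5).
        d2 = sum((x - y) ** 2 for x, y in zip(a, b))
        return 20.0 / (1.0 + d2 / 5.0)

    # best[i][j]: best score for aligning A[:i] with B[:j].
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0], move[i][0] = -gap_penalty * i, "skipA"
    for j in range(1, m + 1):
        best[0][j], move[0][j] = -gap_penalty * j, "skipB"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (best[i - 1][j - 1] + pair_score(A[i - 1], B[j - 1]), "match"),
                (best[i - 1][j] - gap_penalty, "skipA"),
                (best[i][j - 1] - gap_penalty, "skipB"),
            ]
            best[i][j], move[i][j] = max(options, key=lambda o: o[0])

    # Trace back from the corner to recover the matched pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == "skipA":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


# Toy data: B is A with one far-away atom inserted.
A = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
B = [(0, 0, 0), (1, 0, 0), (50, 50, 50), (2, 0, 0)]
score, pairs = align_fixed_position(A, B)
print(score, pairs)
```

The table has (n+1)(m+1) cells and each cell is filled in constant time, which is the same cost profile as sequence alignment.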
![Page 6: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/6.jpg)
Score cRMS
• We want to give “bonus points” for longer correspondences
– e.g. corresponding ONE atom from each structure has 0 cRMS
• Even better scores?
– Vary gap penalty depending on position in structure
– Incorporate sequence information
![Page 7: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/7.jpg)
Score cRMS
A specific correspondence
![Page 8: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/8.jpg)
Previous Work
Distance matrices:
• DALI [Holm and Sander 93]
• CONGENEAL [Yee & Dill 93]
• SSAP [Taylor & Orengo 89]
• Nussinov-Wolfson [89, 93]
• Godzik [93]
• …
Heuristics in rotation and translation space:
• STRUCTAL [Subbiah et al 93]
• COMPARER [Sali & Blundell 90]
• LOCK [Singh & Brutlag 97]
• CE [Shindyalov & Bourne 98]
• Taylor (??) [93]
• Zu-Kang & Sippl 96 (?)
• …
* Most data taken from Orengo 94
![Page 9: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/9.jpg)
“…It can be proved that, for these reasons, finding an optimal structural alignment between two protein structures is an NP hard problem and thus there are no fast structural alignment algorithms that are guaranteed to be optimal within any given similarity measure…”
Adam Godzik, ‘The structural alignment between two proteins: Is there a unique answer?’, 1996
“There is no exact solution to the protein structure alignment problem, only the best solution for the heuristics used in the calculation.”
Shindyalov & Bourne, ‘Protein Structure Alignment by Incremental Combinatorial Extension (CE) of the Optimal Path’, 1998
![Page 10: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/10.jpg)
Exponentially many
Focus on Scoring Functions
![Page 12: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/12.jpg)
Exponentially many
All maxima are interesting
Noisy data!!
![Page 13: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/13.jpg)
Good scoring functions
• Each of the functions is well-behaved
– Satisfies Lipschitz condition
• Thus, the maximum over a finite set is well-behaved
• In each dimension, two points a given distance apart have function values that vary by O(n)
• Need O(n) samples in every dimension
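The step from "each function is Lipschitz" to "the maximum is well-behaved" can be spelled out as below. This is the standard argument, given as a sketch; $f_c$ (the score of a fixed correspondence $c$ as a function of the rotation and translation), the finite set $C$, the constant $L$ and the distance $\rho$ are notation introduced here.

```latex
\begin{align*}
&\text{Assume } \lvert f_c(x) - f_c(y)\rvert \le L\,\rho(x,y) \text{ for every } c \in C,
  \text{ and let } F(x) = \max_{c \in C} f_c(x). \\
&\text{Pick } c^{*} \text{ with } F(x) = f_{c^{*}}(x). \text{ Then }
  F(y) \;\ge\; f_{c^{*}}(y) \;\ge\; f_{c^{*}}(x) - L\,\rho(x,y) \;=\; F(x) - L\,\rho(x,y). \\
&\text{Swapping } x \text{ and } y \text{ gives the other direction, so }
  \lvert F(x) - F(y)\rvert \le L\,\rho(x,y).
\end{align*}
```

So a sample point close to the true optimum has a score close to the optimal score, which is what makes a finite grid of samples sufficient.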
![Page 14: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/14.jpg)
Sampling is Sufficient
![Page 15: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/15.jpg)
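Polynomial Algorithm
• Sample in rotation and translation space
– Compute best score (and alignment) for each sample point
• Return maximum score
• Need $O(n^6 \cdot n^2)$ time and $O(n^2)$ space (O(n) samples in each of the 6 dimensions, $O(n^2)$ dynamic programming per sample)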
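The overall loop can be sketched as follows. This is an illustration under loud assumptions, not the authors' implementation: the function names are hypothetical, rotations are sampled on a crude Euler-angle grid (which is not a uniform sampling of rotation space), translations on a cubic grid of half-width `radius`, and each skipped atom is charged as one gap.

```python
import itertools
import math


def rotate(p, angles):
    """Rotate point p about the z, then y, then x axis by the given angles."""
    x, y, z = p
    a, b, c = angles
    x, y = x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a)
    x, z = x * math.cos(b) + z * math.sin(b), -x * math.sin(b) + z * math.cos(b)
    y, z = y * math.cos(c) - z * math.sin(c), y * math.sin(c) + z * math.cos(c)
    return (x, y, z)


def dp_score(A, B, gap=10.0):
    """Best alignment score for fixed positions (assumed linear gap penalty)."""
    prev = [-gap * j for j in range(len(B) + 1)]
    for i, a in enumerate(A, start=1):
        cur = [-gap * i]
        for j, b in enumerate(B, start=1):
            d2 = sum((u - v) ** 2 for u, v in zip(a, b))
            cur.append(max(prev[j - 1] + 20.0 / (1.0 + d2 / 5.0),
                           prev[j] - gap,
                           cur[j - 1] - gap))
        prev = cur
    return prev[-1]


def sampled_alignment(A, B, n_rot=4, half_steps=1, radius=1.0):
    """Try every grid rotation/translation of A; keep the best DP score."""
    rot_grid = [2 * math.pi * k / n_rot for k in range(n_rot)]
    trans_grid = [radius * k / half_steps
                  for k in range(-half_steps, half_steps + 1)]
    best = (-math.inf, None, None)
    for angles in itertools.product(rot_grid, repeat=3):
        rotated = [rotate(p, angles) for p in A]
        for t in itertools.product(trans_grid, repeat=3):
            moved = [(x + t[0], y + t[1], z + t[2]) for x, y, z in rotated]
            score = dp_score(moved, B)
            if score > best[0]:
                best = (score, angles, t)
    return best


# Toy data: B is A shifted by one unit along x.
A = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
B = [(1, 0, 0), (2, 0, 0), (3, 0, 0)]
print(sampled_alignment(A, B))
```

The six nested sampling dimensions (three angles, three offsets) times the quadratic dynamic program per sample give the running time stated on the slide; only one DP table is alive at a time, which keeps the memory at the size of that table.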
![Page 16: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/16.jpg)
Internal Distance Matrices
• Invariant to position and rotation of structures, so they can be compared directly
• Find largest common sub-matrices (LCM) whose distances are roughly the same
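A small sketch of both bullets: building the internal distance matrix and searching for a largest common sub-matrix by brute force. The function names and the tolerance `tol` (standing in for "roughly the same") are assumptions made for illustration; the exhaustive search is exponential and only usable on toy inputs.

```python
import itertools
import math


def distance_matrix(points):
    """Entry [i][j] is the Euclidean distance between atoms i and j."""
    return [[math.dist(p, q) for q in points] for p in points]


def largest_common_submatrix(DA, DB, tol=0.5):
    """Largest order-preserving index subsets whose sub-matrices agree within tol.

    Brute force over all subsets, from the largest size down.
    """
    n, m = len(DA), len(DB)
    for k in range(min(n, m), 0, -1):
        for ia in itertools.combinations(range(n), k):
            for ib in itertools.combinations(range(m), k):
                if all(abs(DA[ia[p]][ia[q]] - DB[ib[p]][ib[q]]) <= tol
                       for p in range(k) for q in range(p + 1, k)):
                    return ia, ib
    return (), ()


# A, and a copy of A rotated 90 degrees about z, shifted by (5, 5, 5),
# with one unrelated far-away atom inserted at index 2.
A = [(0, 0, 0), (1, 0, 0), (2, 0, 0)]
B = [(5, 5, 5), (5, 6, 5), (40, 40, 40), (5, 7, 5)]
print(largest_common_submatrix(distance_matrix(A), distance_matrix(B)))
```

No superposition is computed anywhere: the rotation and shift applied to B do not change its internal distances, so the matching atoms are found directly from the two matrices.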
![Page 17: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/17.jpg)
LCM is NP-complete
• Harder than MAX-CLIQUE
• Matrices encode distances that are positive, symmetric and obey triangle inequality

0 1 1 1 1 1
1 0 1 1 1 1
1 1 0 1 1 1
1 1 1 0 1 1
1 1 1 1 0 1
1 1 1 1 1 0

0 1 2 3 2 3 3 4 5 2
1 0 1 2 1 1 2 3 4 1
2 1 0 3 2 2 3 4 5 2
3 2 3 0 1 2 3 4 5 2
2 1 2 1 0 1 2 3 4 1
3 1 2 2 1 0 1 2 3 1
3 2 3 3 2 1 0 1 2 2
4 3 4 4 3 2 1 0 1 3
5 4 5 5 4 3 2 1 0 4
2 1 2 2 1 1 2 3 4 0
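The three properties named on this slide can be checked mechanically. The helper below is a hypothetical illustration (the name `is_metric` and the tolerance `eps` are assumptions), applied to the 6×6 all-ones matrix from the slide and to a small matrix that violates the triangle inequality.

```python
def is_metric(D, eps=1e-9):
    """True if D has a zero diagonal, positive off-diagonal entries,
    is symmetric, and satisfies the triangle inequality."""
    n = len(D)
    for i in range(n):
        if abs(D[i][i]) > eps:
            return False
        for j in range(n):
            if i != j and D[i][j] <= 0:
                return False
            if abs(D[i][j] - D[j][i]) > eps:
                return False
            for k in range(n):
                if D[i][j] > D[i][k] + D[k][j] + eps:
                    return False
    return True


# The 6x6 matrix from the slide: all off-diagonal distances equal 1.
clique = [[0 if i == j else 1 for j in range(6)] for i in range(6)]
# Not a metric: 5 > 1 + 1 breaks the triangle inequality.
broken = [[0, 1, 5], [1, 0, 1], [5, 1, 0]]
print(is_metric(clique), is_metric(broken))
```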
![Page 18: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/18.jpg)
Example
1dme: 28 amino acids
1jjd: 51 amino acids
Best STRUCTAL score: 149
Best score found by exhaustive search: 197
![Page 19: Structure Alignment in Polynomial Time](https://reader036.fdocuments.in/reader036/viewer/2022062811/56816059550346895dcf83be/html5/thumbnails/19.jpg)
Heuristic
• Consider only translations that position an atom from protein A on an atom of protein B
• $O(m \cdot n)$ instead of $O((n+m)^3)$
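The restricted set of translations is easy to enumerate. A minimal sketch, with a hypothetical function name and toy coordinates:

```python
def atom_to_atom_translations(A, B):
    """All translations that move some atom of A exactly onto some atom of B."""
    return [tuple(bc - ac for ac, bc in zip(a, b)) for a in A for b in B]


A = [(0, 0, 0), (1, 0, 0)]
B = [(3, 0, 0), (3, 1, 0), (3, 2, 0)]
candidates = atom_to_atom_translations(A, B)
print(len(candidates))  # one candidate per (atom of A, atom of B) pair
```

Each candidate would then replace the translation grid in the sampling loop, so the translation part of the search visits one point per atom pair rather than a full three-dimensional grid.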