The Side-Chain Positioning Problem
Joint work with Bernard Chazelle and Mona Singh
Carl Kingsford
Princeton University
Proteins
Many functions: Structural, messaging, catalytic, …
Sequence of amino acids strung together on a backbone
Each amino acid has a flexible side-chain
Proteins fold. Function depends highly on 3D shape
Protein Structure
[Figure: a protein structure with the backbone and the side-chains labeled]
Side-chain Positioning Problem
Given:
• fixed backbone
• amino acid sequence
Find the 3D positions for the side-chains that minimize the energy of the structure
Assume lowest energy is best
Example sequence: IILVPACW…
Side-chain Positioning Applications
Homology-modeling: Use known backbone of similar protein to predict new structure
Unknown: KNVACKNGQTNCYQSYSTMSITDCRETGSSKYPNCAYKTTQANKHII
Match:    NV CKNG  NCY S S + ITDCR  G+SKYPNC YKT+   KHII
Known:   ENVTCKNGKKNCYKSTSALHITDCRLKGNSKYPNCDYKTSDYQKHII
Rotamers
Each amino acid has some number of statistically preferred side-chain positions
These are called rotamers
Continuum of positions is well approximated by rotamers
3 rotamers of Arginine
An Equivalent Graph Problem
For protein with p side-chains:
p-partite graph:
• part Vi for each side-chain i
• node u for each rotamer
• edge {u,v} if u interacts with v
Weights:
• E(u) = self-energy
• E(u,v) = interaction energy
n nodes in total
[Figure: p-partite graph with positions V1 and V2; each node is a rotamer, each edge an interaction]
Feasible Solution
Feasible solution: one node from each part
cost(feasible) = cost of induced subgraph
Hard to approximate within a factor of cn, where n is the number of nodes
[Figure: the same graph with positions V1 and V2, showing rotamers, positions, and interactions]
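The definition of a feasible solution and its cost can be sketched in a few lines of Python. This is only an illustration on a made-up toy instance: the rotamer names, the energies, and the helper names `cost` and `brute_force` are hypothetical, and the exhaustive search is shown purely to make the objective concrete, not as a practical method.

```python
from itertools import product

# Hypothetical toy instance: 3 positions, rotamers named by strings.
positions = [["a1", "a2"], ["b1", "b2"], ["c1"]]
self_energy = {"a1": 1.0, "a2": 0.5, "b1": 0.2, "b2": 0.9, "c1": 0.0}
# Interaction energies E(u, v) on edges between rotamers in different positions.
pair_energy = {
    frozenset(("a1", "b1")): -1.0,
    frozenset(("a2", "b1")): 2.0,
    frozenset(("a2", "b2")): -0.3,
    frozenset(("b2", "c1")): 0.4,
}

def cost(choice):
    """Cost of the subgraph induced by one chosen rotamer per position."""
    total = sum(self_energy[u] for u in choice)
    for i in range(len(choice)):
        for j in range(i + 1, len(choice)):
            # Missing edges mean the two rotamers do not interact.
            total += pair_energy.get(frozenset((choice[i], choice[j])), 0.0)
    return total

def brute_force(parts):
    """Enumerate every feasible solution; exponential in the number of positions."""
    return min(product(*parts), key=cost)

best = brute_force(positions)
print(best, cost(best))
```

The search space is the product of the part sizes, which is why the rest of the talk turns to relaxations instead of enumeration.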
Determining the Energy
• Energy of a protein conformation is the sum of several energy terms: van der Waals, electrostatics, bond lengths, bond angles, dihedral angles, hydrogen bonds
• No triangle inequality
Plan of Attack
1. Formulate as a quadratic integer program
2. Relax into a semidefinite program
3. Solve the SDP in polynomial time
4. Round solution vectors to choice of rotamers
Quadratic Integer Program
minimize the self-energy plus interaction energy of the chosen nodes
subject to
• a constraint for each posn j
• a constraint for each posn j, node v
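One quadratic integer program consistent with the graph formulation is sketched below. It is a reconstruction under assumptions, not the slide's own formula: the 0/1 variable $x_u$ (1 if node $u$ is chosen) and the exact form of the second constraint family are assumed. The second family is implied by the first for 0/1 solutions (multiply the first by $x_v$), so adding it does not change the integer optimum.

```latex
\begin{align*}
\text{minimize}\quad
  & \sum_{u} E(u)\,x_u \;+\; \sum_{\{u,v\}} E(u,v)\,x_u x_v \\
\text{subject to}\quad
  & \sum_{u \in V_j} x_u = 1
      && \text{for each position } j \\
  & \sum_{u \in V_j} x_u x_v = x_v
      && \text{for each position } j,\ \text{node } v \\
  & x_u \in \{0,1\}
      && \text{for each node } u
\end{align*}
```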
Relax Into Vector Program
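Use $x_u = x_u^2$ for $x_u \in \{0,1\}$ to write as a pure quadratic program
Variables become n-dimensional vectors
minimize the same objective, with products of variables replaced by inner products of vectors
subject to
• a constraint for each posn j
• a constraint for each node v, posn j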
Rewrite As Semidefinite Program
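$X = (x_{uv})$ is PSD ⟺ $x_{uv} = x_u^T x_v$
minimize the objective, now linear in the entries $x_{uv}$
subject to
• position constraints, for each posn j
• flow constraints, for each node v, posn j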
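Written out, the semidefinite program plausibly has the following shape. This is a sketch assembled from the verbal descriptions elsewhere in the talk (node variables in each position sum to 1, edge variables at a node sum to that node's variable, and the $x_{uv} \ge 0$ constraints mentioned under future work); the exact indexing is an assumption.

```latex
\begin{align*}
\text{minimize}\quad
  & \sum_{u} E(u)\,x_{uu} \;+\; \sum_{\{u,v\}} E(u,v)\,x_{uv} \\
\text{subject to}\quad
  & \sum_{u \in V_j} x_{uu} = 1
      && \text{for each position } j \\
  & \sum_{u \in V_j} x_{uv} = x_{vv}
      && \text{for each node } v,\ \text{position } j \\
  & x_{uv} \ge 0
      && \text{for all nodes } u, v \\
  & X = (x_{uv}) \succeq 0
\end{align*}
```

An integral solution corresponds to $X = xx^T$ for a 0/1 vector $x$, in which case $x_{uu} = x_u$ and $x_{uv} = x_u x_v$.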
Constraints & Dummy Position
Position constraints: the sum of the node variables $x_{vv}$ in each position $V_i$ is 1
Flow constraints: the sum of the edge variables $x_{uv}$ adjacent to a node, taken over a position $V_j$, equals that node variable
Dummy position $V_0$: insert a new position with a single node $u_0$. No edges, no node cost.
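As a sanity check on these constraints, the sketch below builds the matrix $X = xx^T$ for a 0/1 choice of nodes and tests the position and flow constraints, including a dummy position. The function names and the node label `"u0"` are hypothetical, and applying the flow constraint to every (node, position) pair is an assumption about the intended indexing.

```python
def integral_matrix(positions, chosen):
    """X = x x^T for the 0/1 indicator x of `chosen`, plus a dummy node "u0"."""
    nodes = ["u0"] + [u for part in positions for u in part]
    x = {u: 1 if (u == "u0" or u in chosen) else 0 for u in nodes}
    return {(u, v): x[u] * x[v] for u in nodes for v in nodes}

def satisfies_constraints(X, positions):
    """Check position and flow constraints on a matrix stored as a dict."""
    parts = [["u0"]] + positions  # dummy position V0 holds the single node u0
    nodes = [u for part in parts for u in part]
    # Position constraints: node variables in each position sum to 1.
    position_ok = all(sum(X[u, u] for u in part) == 1 for part in parts)
    # Flow constraints: for each node v and each position, the edge
    # variables between v and that position sum to the node variable x_vv.
    flow_ok = all(
        sum(X[u, v] for u in part) == X[v, v]
        for v in nodes
        for part in parts
    )
    return position_ok and flow_ok

positions = [["a1", "a2"], ["b1", "b2"]]
print(satisfies_constraints(integral_matrix(positions, {"a1", "b2"}), positions))
```

Choosing exactly one node per position passes both checks; choosing two nodes in the same position violates the position constraint.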
Geometry of the Solution Vectors
Lemma. $\sum_{u \in V_i} x_u = x_{u_0}$ for each position $i$.
Proof. Let $y = \sum_{u \in V_i} x_u$. Simple algebra shows that:
• Length of $y$ is 1
• Length of $x_{u_0}$ is 1
• Length of projection of $y$ onto $x_{u_0}$ is 1
Hence $y = x_{u_0}$.
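The "simple algebra" can be spelled out as follows. This is a sketch under assumptions: $y$ is taken to be the sum of the solution vectors in one position $V_i$, the lemma is taken to claim $y = x_{u_0}$ for the dummy node $u_0$, and the position and flow constraints are used in the form $\sum_{u \in V_j} x_{uu} = 1$ and $\sum_{u \in V_j} x_{uv} = x_{vv}$.

```latex
\begin{align*}
\|y\|^2 &= \sum_{v \in V_i} \sum_{u \in V_i} x_{uv}
         = \sum_{v \in V_i} x_{vv} = 1
         && \text{(flow, then position constraint)} \\
\|x_{u_0}\|^2 &= x_{u_0 u_0} = 1
         && \text{(position constraint on } V_0\text{)} \\
y^T x_{u_0} &= \sum_{u \in V_i} x_{u u_0} = x_{u_0 u_0} = 1
         && \text{(flow constraint at node } u_0\text{)} \\
\|y - x_{u_0}\|^2 &= \|y\|^2 - 2\,y^T x_{u_0} + \|x_{u_0}\|^2 = 1 - 2 + 1 = 0
\end{align*}
```

Two unit vectors whose inner product is 1 must coincide, which is what the three bullet points express.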
Solution Vectors Lie on a Sphere
[Figure: solution vectors $x_{u_0}$ and $x_u$ drawn from the origin O, with a length a marked]
Each solution vector lies on a sphere of radius ½ centered at $x_{u_0}/2$, because:
Note. Length of projection of $x_u$ onto $x_{u_0}$ is the length of vector $x_u$ squared.
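The sphere claim follows from the note in one line. The sketch below assumes the figure's $a$ is the distance from $x_u$ to the center $x_{u_0}/2$, and uses only the two facts stated on the slides: $x_u^T x_{u_0} = \|x_u\|^2$ and $\|x_{u_0}\| = 1$.

```latex
\begin{align*}
a^2 = \left\| x_u - \tfrac{1}{2} x_{u_0} \right\|^2
    &= \|x_u\|^2 - x_u^T x_{u_0} + \tfrac{1}{4}\,\|x_{u_0}\|^2 \\
    &= \|x_u\|^2 - \|x_u\|^2 + \tfrac{1}{4}
     = \tfrac{1}{4}
\end{align*}
```

So every $x_u$ is at distance ½ from $x_{u_0}/2$, regardless of how long $x_u$ itself is.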
How do we round the solution of the SDP relaxation?
Convert fractional solutions into feasible 0/1 solutions
• Projection rounding
• Perron-Frobenius rounding
Projection Rounding
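Since $\sum_{u \in V_j} x_{uu} = 1$, the $x_{uu}$ give a probability distribution at each position.
Pick node u with probability $x_{uu}$
$x_{uu}$ = length of the projection of $x_u$ onto $x_{u_0}$
[Figure: vectors $x_u$ and $x_v$ and their projections onto $x_{u_0}$, drawn from the origin O]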
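A minimal sketch of projection rounding, assuming the diagonal entries $x_{uu}$ of the SDP solution are already available as a dictionary. The function name, the `diag` argument, and the choice to sample each position independently are assumptions made for illustration.

```python
import random

def projection_round(positions, diag, rng=random):
    """Pick one node per position, choosing node u with probability diag[u] = x_uu.

    `positions` is a list of lists of node names; `diag` maps each node to x_uu.
    Positions are sampled independently of one another (an assumption).
    """
    choice = []
    for part in positions:
        weights = [diag[u] for u in part]
        choice.append(rng.choices(part, weights=weights, k=1)[0])
    return choice

positions = [["a1", "a2"], ["b1", "b2"]]
diag = {"a1": 0.7, "a2": 0.3, "b1": 0.4, "b2": 0.6}
print(projection_round(positions, diag, random.Random(0)))
```

Because the weights in each position sum to 1, every call returns a feasible solution: exactly one node per position.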
Drift for Projection Rounding
Drift = expected difference between fractional & rounded solutions.
Comes entirely from pairwise interactions. In fact, each edge contributes
$\delta_{uv} = E(u,v)\,(x_{uv} - \Pr[uv])$
The drift is then bounded by Cauchy-Schwarz, and further because the $x_u$ are on a sphere.
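The per-edge drift can be computed directly from a fractional solution. The sketch below assumes the positions are rounded independently, so that $\Pr[uv] = x_{uu}\,x_{vv}$; the function name and the dictionary layout of `X` and `pair_energy` are hypothetical.

```python
def drift(pair_energy, X):
    """Sum over edges of E(u,v) * (x_uv - Pr[uv]).

    Assumes independent rounding per position, so Pr[uv] = x_uu * x_vv.
    `pair_energy` maps (u, v) to E(u, v); `X` maps (u, v) to x_uv.
    """
    total = 0.0
    for (u, v), energy in pair_energy.items():
        pr_uv = X[u, u] * X[v, v]
        total += energy * (X[u, v] - pr_uv)
    return total

# Fractional example: both nodes half-chosen and perfectly correlated.
X = {("a", "a"): 0.5, ("b", "b"): 0.5, ("a", "b"): 0.5}
print(drift({("a", "b"): 2.0}, X))
```

Node terms do not appear because node u is picked with probability exactly $x_{uu}$, so the self-energy part of the objective is preserved in expectation.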
Perron-Frobenius Rounding
$\chi$ = 0/1 characteristic n-vector of the optimal solution
Optimal integral $X^* = \chi\chi^T$ ⇒ rank($X^*$) = 1
Idea: Approximate fractional X by a rank 1 matrix $qq^T$
Want to sample from $\chi$, but settle for q
[Figure: $\chi$ has a single 1 in each position; the entries of q sum to 1 within each position]
q needs to contain a probability distribution for each position. How do we choose q?
Possible Choices for q
Lemma. Any nonnegative vector q with $L_1$-norm p in the image space of X contains the required set of probability distributions.
Proof. $X = W^T W$, where $W = [x_1\ x_2\ \dots\ x_n]$.
Let $1_i$ = characteristic vector for position i.
Suppose q = Xy for some y.
Then the sum $1_i^T q$ of the entries of q in position i has a final value that is independent of i ⇒ each position sums to 1.
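The missing step of this proof can be sketched as follows. It assumes the earlier geometry lemma says that the solution vectors in every position sum to the same vector $x_{u_0}$, so that $W 1_i = x_{u_0}$ for each $i$.

```latex
\begin{align*}
1_i^T q = 1_i^T X y = 1_i^T W^T W y
        = (W 1_i)^T W y
        = \Big(\sum_{u \in V_i} x_u\Big)^{T} W y
        = x_{u_0}^T W y
\end{align*}
```

The right-hand side does not depend on $i$, so every position has the same sum $c$. Since $q \ge 0$ has $L_1$-norm $p$ and there are $p$ positions, $p = \sum_i 1_i^T q = p\,c$, which gives $c = 1$.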
A Choice for q
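By spectral decomposition, $X = \sum_i \lambda_i z_i z_i^T$, where $\lambda_1$ is the largest eigenvalue and $z_1$ its eigenvector.
Take q to be $z_1$ scaled to $L_1$-norm p.
By Perron-Frobenius theorem for nonnegative matrices, q ≥ 0.
$z_1$ is in the image space of X.
By Lemma, q contains the needed probability distributions.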
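One way to obtain such a q numerically is power iteration on X, sketched below in plain Python. This is an illustrative stand-in rather than the method used in the talk: the function names are hypothetical, X is assumed to be a symmetric matrix with nonnegative entries stored as a list of lists, and the iteration is assumed to converge (it needs a gap between the two largest eigenvalues).

```python
def principal_eigenvector(X, iterations=200):
    """Power iteration; returns the dominant eigenvector scaled to L1-norm 1.

    Starting from the all-ones vector keeps every iterate nonnegative
    when X has nonnegative entries.
    """
    n = len(X)
    z = [1.0] * n
    for _ in range(iterations):
        w = [sum(X[i][j] * z[j] for j in range(n)) for i in range(n)]
        norm = sum(abs(c) for c in w)
        z = [c / norm for c in w]
    return z

def perron_frobenius_q(X, p):
    """Scale the principal eigenvector so that its L1-norm equals p."""
    return [p * c for c in principal_eigenvector(X)]

# Toy X = 0.75 * c1 c1^T + 0.25 * c2 c2^T with c1 = (1,0,1,0), c2 = (0,1,0,1):
# nodes 0,1 form one position and nodes 2,3 another, so p = 2.
X = [
    [0.75, 0.0, 0.75, 0.0],
    [0.0, 0.25, 0.0, 0.25],
    [0.75, 0.0, 0.75, 0.0],
    [0.0, 0.25, 0.0, 0.25],
]
q = perron_frobenius_q(X, 2)
print(q)
```

In this toy case q concentrates on the heavier of the two integral solutions mixed into X, and the entries of q within each position sum to 1, so they can be used as sampling probabilities in the same way as the $x_{uu}$ in projection rounding.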
Computational Results
Compare solutions from:
• Simple LP
• SDP fractional
• Projection rounded
• Perron-Frobenius rounded
30 random graphs
60 nodes, 15 positions
edge probability ½
weights uniformly from [0,1]
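A test instance of this kind could be generated roughly as below. Several details are assumptions, since the slide does not give them: positions are taken to be equal in size (4 nodes each), edges are only placed between nodes in different positions, and both node and edge weights are drawn uniformly from [0,1]. The function name and return layout are hypothetical.

```python
import random

def random_instance(num_nodes=60, num_positions=15, edge_prob=0.5, seed=None):
    """Random p-partite instance with uniform [0,1] weights."""
    rng = random.Random(seed)
    per_position = num_nodes // num_positions  # assumption: equal-size positions
    positions = [
        list(range(j * per_position, (j + 1) * per_position))
        for j in range(num_positions)
    ]
    position_of = {u: j for j, part in enumerate(positions) for u in part}
    self_energy = {u: rng.uniform(0.0, 1.0) for u in position_of}
    pair_energy = {}
    nodes = sorted(position_of)
    for a in range(len(nodes)):
        for b in range(a + 1, len(nodes)):
            u, v = nodes[a], nodes[b]
            # Only rotamers at different positions can interact.
            if position_of[u] != position_of[v] and rng.random() < edge_prob:
                pair_energy[(u, v)] = rng.uniform(0.0, 1.0)
    return positions, self_energy, pair_energy

positions, self_energy, pair_energy = random_instance(seed=1)
print(len(positions), len(self_energy), len(pair_energy))
```

Calling it 30 times with different seeds would give a batch matching the size of the experiment described here.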
Future Work
Can the rounding schemes be applied to other problems?
Can the semidefinite program be sped up?
─ Can only routinely solve graphs with ≤ 120 nodes (reasonable protein problems contain 1000 to 5000 nodes)
─ xuv ≥ 0 constraints are the bottleneck
Can the requirement of a fixed backbone be relaxed?
We’ve worked quite a bit with real proteins using an LP approach. It seems an SDP formulation might be useful.
More Information
B. Chazelle, C. Kingsford, M. Singh. The Side-Chain Positioning Problem: A Semidefinite Programming Formulation with New Rounding Schemes. Proc. ACM FCRC'2003, Principles of Computing and Knowledge: Paris Kanellakis Memorial Workshop (2003).
http://www.cs.princeton.edu/~carlk/papers.html