1
Energy Maintenance for Molecular Simulation
kinematics + energy motion + structure
Main computational issue: Proximity computation
2
Energy
[Figure: energy plotted over the conformation parameters q1, q2, ..., qi, ..., qj, ..., qN-1, qN]
Function defined over a high-dimensional conformation space
4
Energy Function
$E = E_S + E_\theta + E_{S\theta} + E_{Tor} + E_{vdW} + E_{dipole}$
Bonded terms (linear in number)
Non-bonded terms, e.g. $E_{vdW}$ (quadratic in number)
5
Role of vdW Terms
vdW terms create a maze in conformational space
Other terms steer the molecule in this maze
6
Heuristic Energy Terms (e.g., Gō Models)
7
Interaction with Solvent
Explicit solvent models: 100s or 1000s of discrete solvent molecules
Implicit solvent models: solvent as continuous medium, interface is solvent-accessible surface
8
Energy Function
E = bonded terms + non-bonded terms + solvation terms
Bonded terms
- Relatively few
Non-bonded terms
- Depend on distances between pairs of atoms
- Quadratic in number, so expensive to compute
Solvation terms
- May require computing the molecular surface
10
Uses of Energy Function
Generate energetically plausible conformations: sample (at random), minimize, cluster
Generate meaningful distributions (e.g., Boltzmann) of conformations: Monte Carlo simulation
Generate motion pathways to study molecular kinetics: molecular dynamics, MC simulation
11
Monte Carlo Simulation (MCS)
Popular approach to study thermodynamic and kinetic properties of proteins
Random walk through conformation space. At each cycle:
– Perturb current conformation at random
– Accept step with probability $P(\text{accept}) = \min\{1, e^{-\Delta E / kT}\}$ (Metropolis acceptance criterion)
The conformations generated by an arbitrarily long MCS are Boltzmann distributed, i.e.,
#conformations in V ~ $\int_V e^{-E/kT}\, dV$
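The acceptance rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `metropolis_accept` and `monte_carlo`, and the caller-supplied `energy` and `perturb` functions, are assumptions made for the example, and `kT` is assumed to be expressed in the same units as the energy.

```python
import math
import random

def metropolis_accept(delta_e, kT, rng=random):
    """Accept with probability min(1, exp(-delta_e / kT))."""
    if delta_e <= 0.0:
        return True  # downhill or neutral moves are always accepted
    return rng.random() < math.exp(-delta_e / kT)

def monte_carlo(x0, energy, perturb, kT, n_steps, rng=random):
    """Random walk; energy and perturb are hypothetical user-supplied callables."""
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        y = perturb(x, rng)               # perturb current conformation at random
        e_new = energy(y)                 # full re-evaluation at every step
        if metropolis_accept(e_new - e, kT, rng):
            x, e = y, e_new
        samples.append(x)                 # a rejected step repeats the old state
    return samples
```

Note that the energy is re-evaluated at every attempted step, which is why the cost of one evaluation dominates long runs.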
12
Uses of Energy Function
One issue in common: Energy must be evaluated frequently
E.g., MD and MC simulation runs may consist of millions of steps, each
13
Uses of Energy Function
Problem: How to efficiently compute and update the energy during minimization and simulation?
14
Non-Bonded Energy Terms
Quadratic number of pairs of atoms
Energy terms go to 0 when distance increases, so use a cutoff distance (6 - 12 Å)
vdW forces prevent atoms from bunching up, so there are only O(n) interacting pairs [Halperin & Overmars 98]
Problems:
- How can we find the interacting pairs without enumerating all atom pairs?
- How can we detect atomic clashes quickly?
Main computational issue: Proximity computation
15
Grid Method
[Figure: grid of cubic cells, labeled with the cutoff distance d_cutoff]
Subdivide 3-space into cubic cells
Compute cell that contains each atom center
Represent grid as hash table
16
Grid Method
[Figure label: d_cutoff]
O(n) time to build grid
O(1) time to find interacting pairs for each atom
Θ(n) to find all interacting pairs of atoms [Halperin & Overmars, 98]
Asymptotically optimal in the worst case
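The grid method described on these slides can be sketched as follows. This is a minimal Python sketch under stated assumptions: the cell side is taken equal to the cutoff distance (the slides only show the label d_cutoff), atoms are given as a list of 3-D center coordinates, and the function names `build_grid` and `interacting_pairs` are illustrative.

```python
import math
from collections import defaultdict
from itertools import product

def build_grid(centers, cutoff):
    """Hash each atom index into the cubic cell (side = cutoff) containing its center."""
    grid = defaultdict(list)
    for i, (x, y, z) in enumerate(centers):
        cell = (math.floor(x / cutoff), math.floor(y / cutoff), math.floor(z / cutoff))
        grid[cell].append(i)
    return grid

def interacting_pairs(centers, cutoff):
    """Return all index pairs (i, j), i < j, whose centers are within cutoff."""
    grid = build_grid(centers, cutoff)
    pairs = set()
    for (cx, cy, cz), members in grid.items():
        # Only the 27 cells around a cell can hold atoms within the cutoff.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbors = grid.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbors:
                    if i < j and math.dist(centers[i], centers[j]) <= cutoff:
                        pairs.add((i, j))
    return pairs
```

Because atoms cannot bunch up, each cell holds a bounded number of atoms, which is what makes the per-atom lookup constant time.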
17
Energy Update
Compare the interacting pairs at new step with those at previous step
For every pair that has disappeared, subtract the corresponding energy term from energy value
For every new pair, add the corresponding energy term to energy value
Takes Θ(n) time, even if very few pairs have changed
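The update rule above can be sketched as a difference between two pair lists. This is a hedged Python illustration: `update_energy` and the dictionaries mapping an atom pair to its energy term are assumptions for the example, and the last branch (persisting pairs whose term changed) is an addition that the slide does not mention.

```python
def update_energy(total, old_terms, new_terms):
    """old_terms / new_terms: dict mapping an atom pair to its energy term
    at the previous / new step. Returns the updated total energy."""
    for pair, term in old_terms.items():
        if pair not in new_terms:
            total -= term                      # pair has disappeared
    for pair, term in new_terms.items():
        if pair not in old_terms:
            total += term                      # new pair
        elif term != old_terms[pair]:
            total += term - old_terms[pair]    # persisting pair, changed distance (not on the slide)
    return total
```

Both loops visit every pair, so the cost stays linear even when only a handful of pairs differ, which is the limitation the slide points out.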
18
Conservation of partial energy sums
The grid method is unable to recognize and re-use such partial sums
19
Grid Method
Θ(n) to find all interacting pairs of atoms, asymptotically optimal in the worst case [Halperin & Overmars, 98]
But:
- Energy partial sums?
- Atomic clashes? [second grid with small cutoff distance]
20
Grid Method: Surface [Halperin and Shelton, 97]
Each sphere intersects O(1) spheres
Computing each atom’s contribution to molecular surface takes O(1) time
Computation of molecular surface takes Θ(n) time
implicit solvation term in Θ(n) time
21
General Problem
Molecules form geometrically complex objects that deform and move relative to each other
- (Self-)collision detection
- Distance computation
Several computational approaches:
- Space occupancy: grid, octree
- Tracking pairs of closest features
- Polynomial equation
- Bounding-volume hierarchies (BVH)
- Spanners
22
Bounding Volume Hierarchies (BVHs)
Outline:
Case of rigid objects:
- Bounding volume (BV)
- BV hierarchy (BVH)
- Types of BVs
- Collision detection with BVHs
- Distance computation
Application to deformable objects
Application to protein simulation
http://www.cs.unc.edu/~geom/collide/index.shtml
24
Basic Problem
Given the geometric models and relative positions of two objects, determine whether they overlap
distance = 0 means collision
25
Applications
- Computer graphics & simulation
- Robotics
- Haptics
26
27
Basic Idea of Solution
- Enclose objects into bounding volumes (spheres or boxes)
- Check the bounding volumes first
- Decompose an object into two
- Proceed hierarchically
31
Bounding Volume Hierarchy (BVH)
• BVH is pre-computed for each object
• BVH is typically a balanced binary tree
32
BVH in 3D
33
Collision Detection
Two objects described by their precomputed BVHs
[Figure: two BVH trees, each with root A, internal nodes B and C, and leaves D, E, F, G]
34
Collision Detection
[Figure: search tree rooted at the pair AA of BVH roots; pairs whose BVs do not overlap are pruned]
35
Collision Detection
[Figure: search tree in which the pair AA expands into the pairs CC, CB, BC, BB]
36
Collision Detection
[Figure: search tree with pairs AA, CC, CB, BC, BB; some pairs are pruned]
37
Collision Detection
If two leaves of the BVHs overlap (here, G and D), check their content for collision
[Figure: search tree with pairs AA, CC, CB, BC, BB and leaf pairs GE, GD, FE, FD; the leaf pair GD overlaps]
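Variant
[Figure: two search trees. In the first, the pair AA expands into CC, CB, BC, BB. In the variant, only one node of the pair is expanded, so AA expands into CA and BA]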
39
Collision Detection
Pruning discards subsets of the two objects that are separated by the BVs
Each path is followed until pruning or until two leaves overlap
When two leaves overlap, their contents are tested for overlap
40
Search Strategy and Heuristics
If there is no collision, all paths must eventually be followed down to pruning or a leaf node
But if there is collision, it is desirable to detect it as quickly as possible
Greedy best-first search strategy with f(N) = d/(rX+rY)
[Expand the node XY with largest relative overlap (most likely to contain a collision)]
[Figure: two BVs X and Y with radii rX and rY, separated by distance d]
41
Recursive (Depth-First) Collision Detection Algorithm
Test(A,B) [returns 1 if no collision is found, 0 if a collision is found]
1. If A and B do not overlap, then return 1
2. If A and B are both leaves, then return 0 if their contents overlap and 1 otherwise
3. Switch A and B if A is a leaf, or if B is bigger and not a leaf
4. Set A1 and A2 to be A’s children
5. If Test(A1,B) = 1 then return Test(A2,B) else return 0
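The recursion above can be sketched in Python with spheres as bounding volumes. This is an illustration under assumptions, not the code behind the slides: the `Node` class, the `parent` helper, and the choice that each leaf's content is simply its own sphere are all made up for the example. `collision_free` returns True where Test returns 1.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    center: Tuple[float, float, float]   # center of the bounding sphere
    radius: float
    left: Optional["Node"] = None        # internal nodes have both children
    right: Optional["Node"] = None

    @property
    def is_leaf(self):
        return self.left is None and self.right is None

def parent(l, r):
    """Bounding sphere enclosing the spheres of two child nodes."""
    center = tuple((p + q) / 2 for p, q in zip(l.center, r.center))
    radius = math.dist(l.center, r.center) / 2 + max(l.radius, r.radius)
    return Node(center, radius, l, r)

def overlap(a, b):
    return math.dist(a.center, b.center) <= a.radius + b.radius

def collision_free(a, b):
    """Mirror of Test(A,B): True plays the role of 1, False of 0."""
    if not overlap(a, b):
        return True                      # step 1: prune this pair
    if a.is_leaf and b.is_leaf:
        return False                     # step 2: leaf content is the sphere itself
    if a.is_leaf or (b.radius > a.radius and not b.is_leaf):
        a, b = b, a                      # step 3: make A the node to split
    # steps 4-5: both halves of A must be free of collision with B
    return collision_free(a.left, b) and collision_free(a.right, b)
```

The short-circuit `and` reproduces step 5: the second child is only tested when the first one is collision-free.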
42
Performance
Several thousand collision checks per second for 2 three-dimensional objects each described by 500,000 triangles, on a 1-GHz PC
43
Greedy Distance Computation (same recursion as collision detection)
Greedy-Distance(A,B)
1. If dist(A,B) > 0, then return dist(A,B)
2. If A and B are both leaves, then return distance between their contents
3. Switch A and B if A is a leaf, or if B is bigger and not a leaf
4. Set A1 and A2 to be A’s children
5. d1 ← Greedy-Distance(A1,B)
6. If d1 > 0 then
   a. d2 ← Greedy-Distance(A2,B)
   b. If d2 > 0 then return Min(d1,d2)
7. Return 0
44
Exact Distance Computation
Distance(A,B)
1. If dist(A,B) > M, then return M
2. If A and B are both leaves, then
   a. d ← distance between their contents
   b. Return Min(d,M)
3. Switch A and B if A is a leaf, or if B is bigger and not a leaf
4. Set A1 and A2 to be A’s children
5. M ← Distance(A1,B)
6. If M > 0 then return Distance(A2,B)
7. Else return 0
M (upper bound on distance) is initialized to a very large number
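A runnable version of the exact-distance recursion might look like the following Python sketch. Several choices here are assumptions made for illustration: bounding volumes are spheres, each leaf's content is its own sphere, the `build` helper constructs a simple balanced tree from a list of `(center, radius)` pairs, and the bound M is passed as the parameter `m` instead of being a global.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    center: Tuple[float, float, float]
    radius: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(spheres):
    """Balanced tree of bounding spheres over a non-empty list of (center, radius)."""
    if len(spheres) == 1:
        return Node(*spheres[0])
    mid = len(spheres) // 2
    l, r = build(spheres[:mid]), build(spheres[mid:])
    center = tuple((p + q) / 2 for p, q in zip(l.center, r.center))
    radius = math.dist(l.center, r.center) / 2 + max(l.radius, r.radius)
    return Node(center, radius, l, r)

def is_leaf(n):
    return n.left is None

def bv_dist(a, b):
    """Distance between two bounding spheres (0 if they overlap)."""
    return max(0.0, math.dist(a.center, b.center) - a.radius - b.radius)

def distance(a, b, m=math.inf):
    """Return min(m, exact distance between the leaf spheres under a and b)."""
    if bv_dist(a, b) > m:
        return m                          # step 1: this pair cannot improve the bound
    if is_leaf(a) and is_leaf(b):
        return min(bv_dist(a, b), m)      # step 2
    if is_leaf(a) or (b.radius > a.radius and not is_leaf(b)):
        a, b = b, a                       # step 3
    m = distance(a.left, b, m)            # steps 4-5
    if m > 0:
        return distance(a.right, b, m)    # step 6
    return 0.0                            # step 7
```

The pruning in step 1 is valid because the distance between two bounding volumes is a lower bound on the distance between anything they contain.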
45
Approximate Distance Computation
Approx-Distance(A,B) [returns d_a such that d_a ≤ d_e and d_e − d_a ≤ ε d_e, where d_e is the exact distance]
1. If dist(A,B) > M, then return M
2. If A and B are both leaves, then
   a. d ← distance between their contents
   b. If d < M then return (1−ε)d else return M
3. Switch A and B if A is a leaf, or if B is bigger and not a leaf
4. Set A1 and A2 to be A’s children
5. M ← Approx-Distance(A1,B)
6. If M > 0 then return Approx-Distance(A2,B)
7. Return 0
M (upper bound on distance) is initialized to a very large number
Guaranteed to return an approximate distance between (1−ε)d and d
47
Collision detection < Greedy distance computation < 0.5-approximate distance computation << Exact distance computation
< : slightly faster
<< : much faster
48
Desirable Properties of BVs and BVHs
BVs:
- Tightness
- Efficient testing
- Invariance
BVH:
- Separation
- Balanced tree
50
Spheres
- Invariant
- Efficient to test
- But tight?
51
Axis-Aligned Bounding Box (AABB)
52
Axis-Aligned Bounding Box (AABB)
- Not invariant
- Efficient to test
- Not tight
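To make "efficient to test" concrete for the two simplest BV types, here is a short Python sketch of their overlap tests. The function names and the representation (a sphere as center plus radius, an AABB as its minimum and maximum corners) are assumptions for the example.

```python
def spheres_overlap(c1, r1, c2, r2):
    """Two spheres overlap iff the center distance is at most r1 + r2.
    Squared quantities are compared to avoid a square root."""
    d2 = sum((p - q) ** 2 for p, q in zip(c1, c2))
    return d2 <= (r1 + r2) ** 2

def aabbs_overlap(lo1, hi1, lo2, hi2):
    """Two axis-aligned boxes overlap iff their intervals overlap on every axis."""
    return all(lo1[k] <= hi2[k] and lo2[k] <= hi1[k] for k in range(3))
```

Each test costs a handful of arithmetic operations. The AABB test depends on the coordinate axes, so the box must be recomputed when the object rotates, which is the lack of invariance noted on the slide.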
53
Oriented Bounding Box (OBB)
[Gottschalk, Lin, and Manocha, 96]
54
Oriented Bounding Box (OBB)
- Invariant
- Less efficient to test
- Tight
55
Rectangle Swept Spheres (RSS)
Similar to OBBs
Efficient distance computation
56
Computation of Distance Between Two RSS’s
Compute the distance between the two underlying rectangles
Subtract the growing radius
57
Comparison of BVs
             Sphere   AABB   OBB   RSS
Tightness    -        --     +     +
Testing      +        +      -     -+
Invariance   yes      no     yes   yes
No type of BV is optimal for all situations
59
Computation of an OBB [Gottschalk, Lin, and Manocha, 96]
N points $a_i = (x_i, y_i, z_i)^T$, i = 1, …, N
SVD of $A = (a_1\ a_2\ \dots\ a_N)$: $A = U D V^T$, where
- $D = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)$ such that $\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge 0$
- U is a 3x3 rotation matrix that defines the principal axes of variance of the a_i’s; these are the OBB’s directions
The OBB is defined by max and min coordinates of the a_i’s along these directions
Possible improvements: use vertices of convex hull of the a_i’s or dense uniform sampling of convex hull
[Figure: original axes x, y and principal axes X, Y; the rotation between them is described by matrix U]
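The link between the SVD and the box itself can be written out as a short derivation. It assumes N ≥ 3 and the thin SVD, so that D is 3x3 and $V^T V = I$; the symbols $u_j$, $l_j$, $h_j$ are introduced here for illustration. The slide does not say whether the points are centered; reading the axes as axes of variance in the statistical sense assumes the centroid has been subtracted from the $a_i$ first.

```latex
% Columns of U are eigenvectors of the 3x3 scatter matrix of the points:
A A^T = U D V^T V D U^T = U D^2 U^T,
\qquad A A^T = \sum_{i=1}^{N} a_i a_i^T .

% With U = (u_1\ u_2\ u_3), the extent of the box along direction u_j is
l_j = \min_{i} u_j^T a_i, \qquad h_j = \max_{i} u_j^T a_i, \qquad j = 1, 2, 3,

% and the OBB is the set
\mathrm{OBB} = \Big\{ \textstyle\sum_{j=1}^{3} c_j u_j \;:\; l_j \le c_j \le h_j \Big\}.
```

So the box directions can also be obtained from an eigen-decomposition of a 3x3 matrix, whose cost does not grow with N once the scatter matrix has been accumulated.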
60
OBB of a Collection of Spheres
Compute the OBB of the centers
Grow the OBB by moving each of its faces outward by the atom radius
[Figure: atom centers with original axes x, y and OBB axes X, Y]
61
Computation of an RSS [Larsen, Gottschalk, Lin, and Manocha, 00]
Similar to OBB. Compute the two principal axes of variance of the ai’s (atom centers)
Project all ai’s into the plane P defined by these two directions
Compute minimum enclosing rectangle R contained in P and aligned with these directions
Grow R by half the length of the interval spanned by the ai’s along the direction perpendicular to P increased by the atom radius
63
Desirable Properties of BVs and BVHs
BVs:
- Tightness
- Efficient testing
- Invariance
BVH:
- Separation
- Balanced tree
Group pieces that are close together, not pieces that are far apart
66
Subdivision of an OBB/RSS
Split longest axis at mid or median point
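The subdivision rule can be sketched as a recursive tree build. To stay short, this Python sketch swaps in axis-aligned boxes over points for the OBBs or RSSs of the slides, and it splits at the median; `build_bvh`, `depth`, and the dictionary layout are assumptions made for the example.

```python
def build_bvh(points, leaf_size=1):
    """Build a binary BVH over a non-empty list of 3-D points.
    Each node stores an axis-aligned box (a simplification of OBB/RSS)."""
    lo = tuple(min(p[k] for p in points) for k in range(3))
    hi = tuple(max(p[k] for p in points) for k in range(3))
    node = {"lo": lo, "hi": hi, "points": None, "children": None}
    if len(points) <= leaf_size:
        node["points"] = list(points)
        return node
    # Split along the longest axis of the box, at the median point.
    axis = max(range(3), key=lambda k: hi[k] - lo[k])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    node["children"] = (build_bvh(pts[:mid], leaf_size),
                        build_bvh(pts[mid:], leaf_size))
    return node

def depth(node):
    """Number of edges on the longest root-to-leaf path."""
    if node["children"] is None:
        return 0
    return 1 + max(depth(c) for c in node["children"])
```

Splitting at the median keeps the two halves equal in size, which gives the balanced tree listed among the desirable properties; splitting at the midpoint of the axis instead favors spatial separation.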
68
Application to Deformable Objects
The BVH computed for some initial or nominal geometry may become useless
Group pieces hierarchically based on topological rather than geometric proximity
Topological proximity is invariant and implies geometric proximity (the converse is not true)
69
Particular Case: Long Chain
70
Application to Deformable Objects
The BVH computed for some initial or nominal geometry may become useless
Group pieces hierarchically based on topological rather than geometric proximity
Topological proximity is invariant and implies geometric proximity (the converse is not true)
BVH with fixed topology, but BVs must still be adjusted in size and position
Self-collision detection is done by testing a BVH against itself
71
Particular Case: Long Chain
A chain of spheres is well-behaved iff:
1. The ratio of the radii of the largest and smallest spheres is less than some constant
2. The distance between any two sphere centers is greater than some constant
Complexity for updating the BVH and testing self-collision of a well-behaved chain of spheres
72
Application to Monte Carlo Simulation of Proteins (ChainTree)
[I. Lotan, D. Halperin, F. Schwarzer and J.C. Latombe. Algorithm and Data Structures for Efficient Energy Maintenance During Monte Carlo Simulation of Proteins, J. Computational Biology, 2004]
73
Monte Carlo Simulation (MCS)
Random walk through conformation space. At each cycle:
– Perturb current conformation at random
– Accept step with probability $P(\text{accept}) = \min\{1, e^{-\Delta E / k_b T}\}$
Problem: Update energy value
75
Non-Bonded Energy Terms
They go to 0 when distance increases, so use a cutoff distance (6 - 12 Å)
vdW forces prevent atoms from bunching up, so there are only O(n) interacting pairs [Halperin & Overmars 98]
Problem: How to find these interacting pairs without enumerating all atom pairs?
76
Can We Do Better on Average than Grid Method?
• Few DOFs are changed at each MC step
[Figure: plot against the number k of DOF changes (axis ticks 0, 5, 10, 20, 30), from a simulation of 100,000 attempted steps]
78
Can We Do Better on Average?
Few DOFs are changed at each MC step
Proteins are long kinematic chains
Long sub-chains stay rigid at each step
Many partial energy sums remain constant
Problem: How to retrieve the unchanged partial sums?
79
ChainTree (Twofold Hierarchy: BVs + Transforms)
[Figure: chain of links]
80
ChainTree (Twofold Hierarchy: BVs + Transforms)
[Figure: chain of joints with a hierarchy of transforms labeled TAB, TJK, TNO]
81
Updating the ChainTree
Update path to root:
– Recompute transforms that “shortcut” the DOF change
– Recompute BVs that contain the DOF change
– O(k (log(n/k)+1)) work for k changes
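The path-to-root update can be illustrated with a much simplified stand-in for the transform hierarchy. In this Python sketch, all of which is an assumption for illustration and not the ChainTree code, each joint transform is a 1-D affine map `(a, b)` meaning x → a·x + b instead of a 3-D rigid transform, the number of joints is a power of two, the tree is stored as an implicit heap, and BVs are omitted.

```python
def compose(f, g):
    """Affine map 'apply f, then g', with maps stored as (a, b) for x -> a*x + b."""
    return (g[0] * f[0], g[0] * f[1] + g[1])

class TransformTree:
    def __init__(self, transforms):
        n = len(transforms)                   # assumed to be a power of two
        self.n = n
        self.node = [None] * n + list(transforms)
        for i in range(n - 1, 0, -1):         # internal node = "shortcut" over its sub-chain
            self.node[i] = compose(self.node[2 * i], self.node[2 * i + 1])

    def update(self, changes):
        """changes: dict joint index -> new transform.
        Returns the number of internal nodes recomputed."""
        dirty = set()
        for j, t in changes.items():
            i = self.n + j
            self.node[i] = t
            i //= 2
            while i >= 1 and i not in dirty:  # stop once paths to the root merge
                dirty.add(i)
                i //= 2
        for i in sorted(dirty, reverse=True): # children have larger indices than parents
            self.node[i] = compose(self.node[2 * i], self.node[2 * i + 1])
        return len(dirty)

    def root(self):
        return self.node[1]
```

Only nodes on the paths from the changed joints to the root are recomputed, and paths that merge are shared, which is the intuition behind the O(k (log(n/k)+1)) bound on the slide.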
82
Finding Interacting Pairs
- Do not search inside rigid sub-chains (unmarked nodes)
- Do not test two nodes with no marked node between them
86
EnergyTree
[Figure: tree whose nodes store partial energy sums, with labels E(N,N), E(J,L), E(K,L), E(L,L), E(M,M)]
88
Computational Complexity
n : total number of DOFs
k : number of DOF changes at each MCS step
k << n
Complexity of:
- updating ChainTree: O(k (log(n/k)+1))
- finding interacting pairs: O(n^(4/3)), but performs much better in practice
89
Experimental Setup
Energy function:
- Van der Waals
- Electrostatic
- Attraction between native contacts
- Cutoff at 12 Å
300,000 steps MCS with Grid and ChainTree
- Steps are the same with both methods
- Early rejection for large vdW terms
90
Results: 1-DOF change
[Figure: speedup for proteins of 68, 144, 374, and 755 amino acids; speedup values shown: 3.5, 5.8, 7.8, 12.5]
91
Results: 5-DOF change
[Figure: speedup for proteins of 68, 144, 374, and 755 amino acids; speedup values shown: 2.2, 3.4, 4.5, 5.9]
92
Two-Pass ChainTree (ChainTree+)
1st pass: small cutoff distance to detect steric clashes
2nd pass: normal cutoff distance
Tests around native state
[Figure label: >5]
93
Interaction with Solvent
Explicit solvent models: 100s or 1000s of discrete solvent molecules
Implicit solvent models: solvent as continuous medium, interface is solvent-accessible surface
E. Eyal, D. Halperin. Dynamic Maintenance of Molecular Surfaces under Conformational Changes. http://www.give.nl/movie/publications/telaviv/EH04.pdf
94
Conclusion
ChainTree significantly reduces average time of MCS for proteins (vs. grid)
It exploits:
- Atomic exclusion
- Cutoff distance on potentials
- Chain kinematics of protein
- Small # of DOF changes at each MC step
(the first two are already exploited by the grid method)
Larger speed-up for bigger proteins and smaller # of simultaneous DOF changes
Extension to updating protein surface
http://robotics.stanford.edu/~itayl/mcs