Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps
![Page 1: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/1.jpg)
Molecular Motion Pathways: Computation of Ensemble
Properties with Probabilistic Roadmaps
1) A.P. Singh, J.C. Latombe, and D.L. Brutlag. A Motion Planning Approach to Flexible Ligand Binding. Proc. 7th Int. Conf. on Intelligent Systems for Molecular Biology (ISMB), AAAI Press, Menlo Park, CA, pp. 252-261, 1999.
2) N.M. Amato, K.A. Dill, and G. Song. Using Motion Planning to Map Protein Folding Landscapes and Analyze Folding Kinetics of Known Native Structures. J. Comp. Biology, 10(2):239-255, 2003.
3) M.S. Apaydin, D.L. Brutlag, C. Guestrin, D. Hsu, J.C. Latombe, and C. Varma. Stochastic Roadmap Simulation: An Efficient Representation and Algorithm for Analyzing Molecular Motion. J. Comp. Biology, 10(3-4):257-281, 2003.
4) N. Singhal, C.D. Snow, and V.S. Pande. Using Path Sampling to Build Better Markovian State Models: Predicting the Folding Rate and Mechanism of a Tryptophan Zipper Beta Hairpin, J. Chemical Physics, 121(1):415-425, 2004.
5) J. Cortés, T. Siméon, M. Renaud-Siméon, and V. Tran. Geometric Algorithms for the Conformational Analysis of Long Protein Loops. J. Comp. Chemistry, 25:956-967, 2004.
![Page 2: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/2.jpg)
Molecular motion is an essential process of life
Mad cow disease is caused by misfolding
Drug molecules act by binding to proteins
![Page 3: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/3.jpg)
So, studying molecular motion is of critical importance in
molecular biology
Stanford BioX cluster
NMR spectrometer
However, few tools are available
Computer simulation:
- Monte Carlo simulation
- Molecular Dynamics
![Page 4: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/4.jpg)
[Figure: many pathways lead from the unfolded (denatured) state through intermediate states to the folded (native) state]
Two Major Drawbacks of MD and MC Simulation
1) Each simulation run yields a single pathway, while molecules tend to move along many different pathways
Interest in ensemble properties
![Page 5: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/5.jpg)
Example of Ensemble Property:
Probability of Folding pfold
[Figure: from a given conformation, the molecule reaches the folded state first with probability pfold and the unfolded state first with probability 1 - pfold]
Measures kinetic distance to folded state
Du, Pande, Grosberg, Tanaka, and Shakhnovich. On the Transition Coordinate for Protein Folding. Journal of Chemical Physics (1998).
![Page 6: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/6.jpg)
Other Examples of Ensemble Properties
Folding:
• Order of formation of SSE’s
• Folding rate / Mean first passage time
• Key intermediates
Binding:
• Average time to escape from active site
• Average energy barrier
![Page 7: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/7.jpg)
Two Major Drawbacks of MD and MC Simulation
1) Each simulation run yields a single pathway, while molecules tend to move along many different pathways
2) Each simulation run tends to waste much time in local minima
![Page 8: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/8.jpg)
Roadmap-Based Representation
Compact representation of many motion pathways
Coarse resolution relative to MC and MD simulation
Efficient algorithms for analyzing multiple pathways
![Page 9: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/9.jpg)
Roadmaps for Robot Motion Planning
free space
[Kavraki, Svestka, Latombe, Overmars, 96]
![Page 10: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/10.jpg)
Initial Work
A.P. Singh, J.C. Latombe, and D.L. Brutlag. A Motion Planning Approach to Flexible Ligand Binding. Proc. 7th ISMB, pp. 252-261, 1999
Study of ligand-protein binding
The ligand is a small flexible molecule, but the protein is assumed rigid
A fixed coordinate system P is attached to the protein and a moving coordinate system L is defined using three bonded atoms in the ligand
A conformation of the ligand is defined by the position and orientation of L relative to P and the torsional angles of the ligand
![Page 11: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/11.jpg)
![Page 12: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/12.jpg)
Roadmap Construction (Node Generation)
The nodes of the roadmap are generated by sampling conformations of the ligand uniformly at random in the parameter space (around the protein)
The energy E at each sampled conformation is computed: E = Einteraction + Einternal
Einteraction = electrostatic + van der Waals potential
Einternal = electrostatic + van der Waals potential over non-bonded pairs of atoms
A sampled conformation is retained as a node of the roadmap with probability:
$$P = \begin{cases} 0 & \text{if } E > E_{max} \\ \dfrac{E_{max} - E}{E_{max} - E_{min}} & \text{if } E_{min} \le E \le E_{max} \\ 1 & \text{if } E < E_{min} \end{cases}$$
Denser distribution of nodes in low-energy regions of conformational space
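The retention rule can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `retention_probability` and `keep_node` are made up here, and the sketch assumes Emin < Emax.

```python
import random


def retention_probability(energy, e_min, e_max):
    """Probability of keeping a sampled conformation as a roadmap node.

    Assumes e_min < e_max (hypothetical helper, not from the paper's code).
    """
    if energy > e_max:
        return 0.0
    if energy < e_min:
        return 1.0
    # Linear ramp from 1 at e_min down to 0 at e_max.
    return (e_max - energy) / (e_max - e_min)


def keep_node(energy, e_min, e_max, rng=random):
    """Randomly decide whether to retain the conformation."""
    return rng.random() < retention_probability(energy, e_min, e_max)
```

Because the probability decreases linearly with energy, low-energy regions end up with more nodes, which is the effect the slide describes.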
![Page 13: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/13.jpg)
Roadmap Construction (Edge Generation)
Each node is connected to its closest neighbors by straight edges
Each edge is discretized so that between q_i and q_{i+1} no atom moves by more than some ε (= 1 Å)
If any E(q_i) > Emax, then the edge is rejected
[Figure: energy E along the discretized edge from q to q’ (points q_i, q_{i+1}), compared with the threshold Emax]
![Page 14: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/14.jpg)
Roadmap Construction (Edge Generation)
Any two nodes closer than some threshold distance are connected by a straight edge
Each edge is discretized so that between q_i and q_{i+1} no atom moves by more than some ε (= 1 Å)
If all E(q_i) ≤ Emax, then the edge is retained and is assigned two weights w(q→q’) and w(q’→q), a heuristic measure of the energetic difficulty of moving from q to q’, where:
$$w(q \to q') = \sum_i -\ln\left(P[q_i \to q_{i+1}]\right)$$
$$P[q_i \to q_{i+1}] = \frac{e^{-(E_{i+1} - E_i)/kT}}{e^{-(E_{i+1} - E_i)/kT} + e^{-(E_{i-1} - E_i)/kT}}$$
(P[q_i → q_{i+1}] is the probability that the ligand moves from q_i to q_{i+1} when it is constrained to move along the edge)
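As a rough sketch, the weight w(q→q’) = Σ_i −ln P[q_i → q_{i+1}] could be computed from the energies sampled along a discretized edge as follows. The function names are illustrative, and the treatment of the first point (where q_{i-1} does not exist) is an assumption made here: the missing backward neighbor is given the same energy as the first point.

```python
import math


def step_probability(e_prev, e_curr, e_next, kT):
    """P[q_i -> q_{i+1}] when motion is constrained to the edge."""
    forward = math.exp(-(e_next - e_curr) / kT)
    backward = math.exp(-(e_prev - e_curr) / kT)
    return forward / (forward + backward)


def edge_weight(energies, kT):
    """Weight of the directed edge whose discretization has these energies.

    energies[0] is E(q) and energies[-1] is E(q').
    Assumption (not from the slides): at i = 0 the backward neighbor is
    taken to have the same energy as q_0.
    """
    weight = 0.0
    for i in range(len(energies) - 1):
        e_prev = energies[i - 1] if i > 0 else energies[i]
        p = step_probability(e_prev, energies[i], energies[i + 1], kT)
        weight += -math.log(p)
    return weight
```

Calling `edge_weight` on the reversed energy list gives the weight of the opposite direction, so an uphill edge is heavier than the same edge traversed downhill.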
![Page 15: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/15.jpg)
Querying the Roadmap
For a given goal node qg (e.g., binding conformation), Dijkstra’s single-source algorithm computes the lowest-weight paths from qg to each node (in either direction) in O(N log N) time, where N = number of nodes
Various quantities can then be easily computed in O(N) time, e.g., average weights of all paths entering qg and of all paths leaving qg (~ binding and dissociation rates Kon and Koff)
Protein: Lactate dehydrogenase
Ligand: Oxamate (7 degrees of freedom)
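A possible sketch of this query, using a binary heap from the Python standard library. The roadmap is assumed to be stored as a dictionary mapping each node to a list of (neighbor, weight) pairs for its outgoing directed edges; this layout and the function names are illustrative choices. Paths entering qg are obtained by running the same search on the reversed graph.

```python
import heapq


def dijkstra(adjacency, source):
    """Lowest path weight from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adjacency.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def reverse_graph(adjacency):
    """Flip every directed edge, keeping its weight."""
    rev = {}
    for u, edges in adjacency.items():
        for v, w in edges:
            rev.setdefault(v, []).append((u, w))
    return rev


def average_path_weights(adjacency, goal):
    """(average weight entering goal, average weight leaving goal)."""
    leaving = dijkstra(adjacency, goal)
    entering = dijkstra(reverse_graph(adjacency), goal)

    def mean(dist):
        others = [w for node, w in dist.items() if node != goal]
        return sum(others) / len(others) if others else 0.0

    return mean(entering), mean(leaving)
```

Keeping the two directions separate matters because the edge weights are asymmetric: w(q→q’) generally differs from w(q’→q).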
![Page 16: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/16.jpg)
Experiments on 3 Complexes
1) PDB ID: 1ldm
   Receptor: Lactate Dehydrogenase (2386 atoms, 309 residues)
   Ligand: Oxamate (6 atoms, 7 dofs)
2) PDB ID: 4ts1
   Receptor: Mutant of tyrosyl-transfer-RNA synthetase (2423 atoms, 319 residues)
   Ligand: L-leucyl-hydroxylamine (13 atoms, 9 dofs)
3) PDB ID: 1stp
   Receptor: Streptavidin (901 atoms, 121 residues)
   Ligand: Biotin (16 atoms, 11 dofs)
![Page 17: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/17.jpg)
Computation of Potential Binding Conformations
1) Sample many (several 1000’s) ligand conformations at random around protein
2) Repeat several times:
   - Select lowest-energy conformations that are close to protein surface
   - Resample around them
3) Retain k (~10) lowest-energy conformations whose centers of mass are at least 5 Å apart
[Figure: lactate dehydrogenase and its active site]
![Page 18: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/18.jpg)
Results for 1ldm
Some potential binding sites have slightly lower energy than the active site
→ Energy is not a discriminating factor
Average path weights (energetic difficulty) to enter and leave binding site are significantly greater for the active site
→ Indicates that the active site is surrounded by an energy barrier that “traps” the ligand
![Page 19: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/19.jpg)
[Figure: energy versus conformation, showing the active site and two potential binding sites]
![Page 20: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/20.jpg)
Application of Roadmaps to Protein Folding
N.M. Amato, K.A. Dill, and G. Song. Using Motion Planning to Map Protein Folding Landscapes and Analyze Folding Kinetics of Known Native Structures. J. Comp. Biology, 10(2):239-255, 2003
Known native state
Degrees of freedom: φ-ψ angles
Energy: van der Waals, hydrogen bonds, hydrophobic effect
New idea: Sampling strategy
Application: Finding order of SSE formation
![Page 21: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/21.jpg)
Sampling Strategy (Node Generation)
High dimensionality → non-uniform sampling
Conformations are sampled using Gaussian distribution around native state
Conformations are sorted into bins by number of native contacts (pairs of Cα atoms that are close together in the native structure)
Sampling ends when all bins have minimum number of conformations → “good” coverage of conformational space
![Page 22: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/22.jpg)
Application: Order of Formation of Secondary Structures
The lowest-weight path is extracted from each denatured conformation to the folded one
The order of formation of SSE’s is computed along each path
The formation order that appears the most often over all paths is considered the SSE formation order of the protein
![Page 23: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/23.jpg)
Method
1) The contact matrix showing the time step when each native contact appears is built
![Page 24: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/24.jpg)
Protein CI2 (1α + 4β)
![Page 25: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/25.jpg)
Protein CI2 (1α + 4β)
[Figure: contact matrix, with the entry for residues 5 and 60 highlighted]
The native contact between residues 5 and 60 appears at step 216
![Page 26: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/26.jpg)
Method
1) The contact matrix showing the time step when each native contact appears is built
2) The time step at which a structure appears is approximated as the average of the appearance time steps of its contacts
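The two steps of the method, together with the majority vote over paths described earlier, can be sketched as follows. The data layout is an assumption for illustration: `contact_step` maps a residue pair to the step at which that native contact appears along one path, and `sse_contacts` maps a structure name to the list of contacts that define it.

```python
from collections import Counter


def sse_formation_steps(contact_step, sse_contacts):
    """Average appearance step of the native contacts of each structure."""
    return {
        name: sum(contact_step[c] for c in contacts) / len(contacts)
        for name, contacts in sse_contacts.items()
    }


def formation_order(steps):
    """Structure names sorted by their (average) appearance step."""
    return tuple(sorted(steps, key=steps.get))


def most_frequent_order(orders):
    """Formation order that appears most often over all paths."""
    return Counter(orders).most_common(1)[0][0]
```

Each path contributes one tuple from `formation_order`, and `most_frequent_order` picks the tuple seen most often across paths.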
![Page 27: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/27.jpg)
Protein CI2 (1α + 4β)
α forms at time step 122 (II)
β3 and β4 come together at 187 (V)
β2 and β3 come together at 210 (IV)
β1 and β4 come together at 214 (I)
α and β4 come together at 214 (III)
![Page 28: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/28.jpg)
![Page 29: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/29.jpg)
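Comparison with Experimental Data
[Table residue: columns “SSE’s” and “roadmap size”; CI2 is among the proteins listed; SSE entries (Greek letters lost): 1+5, 31+4, 1+4; roadmap sizes: 5126, 70k; 5471, 104k; 7975, 104k; 8357, 119k]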
![Page 30: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/30.jpg)
Stochastic Roadmaps
M.S. Apaydin, D.L. Brutlag, C. Guestrin, D. Hsu, J.C. Latombe and C. Varma. Stochastic Roadmap Simulation: An Efficient Representation and Algorithm for Analyzing Molecular Motion. J. Comp. Biol., 10(3-4):257-281, 2003
New Idea: Capture the stochastic nature of molecular motion by assigning probabilities to edges
[Figure: edge from node v_i to node v_j labeled with probability P_ij]
![Page 31: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/31.jpg)
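Edge probabilities
Follow the Metropolis criterion:
$$P_{ij} = \begin{cases} \dfrac{1}{N_i} \exp(-\Delta E_{ij}/kT) & \text{if } \Delta E_{ij} > 0 \\ \dfrac{1}{N_i} & \text{otherwise} \end{cases}$$
where ΔE_ij = E_j − E_i and N_i is the number of neighbors of node v_i
Self-transition probability:
$$P_{ii} = 1 - \sum_{j \ne i} P_{ij}$$
[Figure: node v_i with self-transition P_ii and edge to v_j with probability P_ij]
[Roadmap nodes are sampled uniformly at random and energy profile along edges is not considered]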
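A minimal sketch of these Metropolis edge probabilities, P_ij = (1/N_i) exp(−ΔE_ij/kT) for uphill moves and 1/N_i otherwise, with the leftover mass assigned to the self-transition. The dictionary layout and the function name are assumptions for illustration, and neighbor lists are assumed not to contain the node itself.

```python
import math


def transition_probabilities(energies, neighbors, kT):
    """Build the transition probabilities of the stochastic roadmap.

    energies:  dict node -> energy E_i
    neighbors: dict node -> list of neighboring nodes (excluding itself)
    Returns a dict of dicts P with P[i][j], including the self-loop P[i][i].
    """
    P = {}
    for i, nbrs in neighbors.items():
        row = {}
        n_i = len(nbrs)
        for j in nbrs:
            delta_e = energies[j] - energies[i]
            accept = math.exp(-delta_e / kT) if delta_e > 0 else 1.0
            row[j] = accept / n_i
        # Rejected moves keep the walker at node i.
        row[i] = 1.0 - sum(row.values())
        P[i] = row
    return P
```

Each row sums to one by construction, so the result can be used directly as a Markov chain.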
![Page 32: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/32.jpg)
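Stochastic Roadmap Simulation
[Figure: random walk on the roadmap with transition probabilities P_ij; V is a region of conformational space]
Stochastic roadmap simulation and Monte Carlo simulation converge to the Boltzmann distribution, i.e., the relative number of times SRS is at a node in V converges, up to normalization, toward
$$\int_V e^{-E/kT}\, dV$$
when the number of nodes grows (and they are uniformly distributed)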
![Page 33: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/33.jpg)
Roadmap as Markov Chain
Transition probability Pij depends only on i and j
[Figure: edge from node i to node j labeled P_ij]
![Page 34: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/34.jpg)
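Example #1: Probability of Folding pfold
[Figure: from a given conformation, the folded state is reached first with probability pfold and the unfolded state with probability 1 - pfold]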
![Page 35: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/35.jpg)
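First-Step Analysis
F: Folded state
U: Unfolded state
[Figure: node i with self-transition P_ii and edges to neighbors j, k, l, m with probabilities P_ij, P_ik, P_il, P_im]
Let f_i = pfold(i)
After one step: f_i = P_ii f_i + P_ij f_j + P_ik f_k + P_il f_l + P_im f_m
with f = 1 for nodes in F (marked “=1” in the figure) and f = 0 for nodes in U
One linear equation per node
Solution gives pfold for all nodes
No explicit simulation run
All pathways are taken into account
Sparse linear system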
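One way to solve the resulting system f_i = Σ_j P_ij f_j (with f = 1 on F and f = 0 on U) is a simple Gauss-Seidel iteration, which only touches the nonzero entries of each row. The sketch below is an assumption about how one might do it, not the solver used by the authors; it expects `P` as a dict of dicts in which every node referenced in a row is also a key.

```python
def pfold(P, folded, unfolded, tol=1e-12, max_iter=100000):
    """Probability of reaching a folded node before an unfolded one.

    P: dict node -> dict neighbor -> transition probability (rows sum to 1)
    folded, unfolded: sets of nodes forming F and U
    """
    f = {i: (1.0 if i in folded else 0.0) for i in P}
    free = [i for i in P if i not in folded and i not in unfolded]
    for _ in range(max_iter):
        delta = 0.0
        for i in free:
            stay = P[i].get(i, 0.0)
            if stay >= 1.0:
                continue  # isolated node: it never leaves, keep f = 0
            # Solve row i for f_i, moving the self-loop to the left side.
            new = sum(p * f[j] for j, p in P[i].items() if j != i) / (1.0 - stay)
            delta = max(delta, abs(new - f[i]))
            f[i] = new
        if delta < tol:
            break
    return f
```

For a symmetric walk on the chain U - a - b - F, the sketch returns pfold(a) = 1/3 and pfold(b) = 2/3, the values expected from the first-step equations.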
![Page 36: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/36.jpg)
Number of Self-Avoiding Walks on a 2D Grid
1, 2, 12, 184, 8512, 1262816, 575780564, 789360053252, 3266598486981642,
(10x10) 41044208702632496804,
(11x11) 1568758030464750013214100,
(12x12) 182413291514248049241470885236 > 10^28
http://mathworld.wolfram.com/Self-AvoidingWalk.html
![Page 37: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/37.jpg)
In contrast …
Computing pfold with MC simulation requires:
For every conformation q of interest
Perform many MC simulation runs from q
Count number of times F is attained first
![Page 38: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/38.jpg)
Computational Tests
• 1ROP (repressor of primer)
  • 2 helices
  • 6 DOF
• 1HDD (Engrailed homeodomain)
  • 3 helices
  • 12 DOF
H-P energy model with steric clash exclusion [Sun et al., 95]
![Page 39: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/39.jpg)
1ROP
Correlation with MC Approach
![Page 40: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/40.jpg)
pfold for β hairpin
Immunoglobulin binding protein (Protein G)
Last 16 amino acids
Cα based representation
Go model energy function
42 DOFs
[Zhou and Karplus, ’99]
![Page 41: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/41.jpg)
Computation Times (β hairpin)
Monte Carlo (30 simulations):
• 1 conformation
• ~10 hours of computer time
• Over 10^7 energy computations
Roadmap:
• 2000 conformations
• 23 seconds of computer time
• ~50,000 energy computations
~6 orders of magnitude speedup!
![Page 42: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/42.jpg)
Example #2: Ligand-Protein Interaction
Computation of escape time from funnels of attraction around potential binding sites
Funnel of attraction = ball of 10 Å rmsd around bound state [Camacho and Vajda, 01]
![Page 43: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/43.jpg)
Computation Through Simulation
[Sept, Elcock and McCammon ’99]
10K to 30K independent simulations
![Page 44: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/44.jpg)
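Computing Escape Time with Roadmap
[Figure: node i inside the funnel of attraction, with self-transition P_ii and edges to neighbors j, k, l, m with probabilities P_ij, P_ik, P_il, P_im]
Let τ_i be the expected escape time from node i:
τ_i = 1 + P_ii τ_i + P_ij τ_j + P_ik τ_k + P_il τ_l + P_im τ_m
with τ = 0 for nodes outside the funnel of attraction
(escape time is measured as number of steps of stochastic simulation)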
![Page 45: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/45.jpg)
Distinguishing Active Site
Given several potential binding sites,
which one is the active one?
Energy: electrostatic + van der Waals + solvation free energy terms
![Page 46: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/46.jpg)
Complexes Studied
ligand                   protein   # random nodes   # DOFs
Oxamate                  1ldm      8000             7
Biotin                   1stp      8000             11
L-leucyl-hydroxylamine   4ts1      8000             9
COT                      1cjw      8000             21
THK                      1aid      8000             14
IPM                      1ao5      8000             10
PTI                      3tpi      8000             13
![Page 47: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/47.jpg)
Distinction Using Escape Time
Escape time (# steps):
protein   bound state   best potential binding site
1stp      3.4E+9        1.1E+7
4ts1      3.8E+10       1.8E+6
3tpi      1.3E+11       5.9E+5
1ldm      8.1E+5        3.4E+6
1cjw      5.4E+8        4.2E+6
1aid      9.7E+5        1.6E+8
1ao5      6.6E+7        5.7E+6
Slide annotations grouping the rows: “Able to distinguish catalytic site” / “Not able”
![Page 48: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/48.jpg)
Using Path Sampling to Construct Roadmaps
N. Singhal, C.D. Snow, and V.S. Pande. Using Path Sampling to Build Better Markovian State Models: Predicting the Folding Rate and Mechanism of a Tryptophan Zipper Beta Hairpin, J. Chemical Physics, 121(1):415-425, 2004
New idea: Paths computed with Molecular Dynamics simulation techniques are used to create the nodes of the roadmap
→ More pertinent/better distributed nodes
→ Edges are labeled with the time needed to traverse them
![Page 49: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/49.jpg)
Sampling Nodes from Computed Paths (Path Shooting)
[Figure: computed paths between the unfolded state U and the folded state F; axis label t]
![Page 50: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/50.jpg)
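Sampling Nodes from Computed Paths (Path Shooting)
[Figure: computed paths between U and F; the edge from node i to node j is labeled with time t_ij and probability p_ij]
Example: the Langevin dynamics equation of motion is
$$F_{ext} - m\gamma \frac{dx}{dt} + R = 0$$
where R is a Gaussian random force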
![Page 51: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/51.jpg)
![Page 52: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/52.jpg)
Node Merging
If two nodes are closer than some threshold distance, they are merged into one and merging rules are applied to update edge probabilities and times
[Figure: nodes 2 and 4 are merged into node 2’; the edges (P12, t12) and (P14, t14) leaving node 1 are replaced by a single edge (P12’, t12’)]
P12’ = P12 + P14
t12’ = P12 x t12 + P14 x t14
→ Approximately uniform distribution of nodes over the reachable subset of conformational space
![Page 53: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/53.jpg)
Application: Computation of MFPT
Mean First Passage Time: the average time at which a protein first reaches its folded state
First-Step Analysis yields:
$$\mathrm{MFPT}(i) = \sum_j P_{ij}\,\bigl(t_{ij} + \mathrm{MFPT}(j)\bigr), \qquad \mathrm{MFPT}(i) = 0 \text{ if } i \in F$$
Assuming first-order kinetics, the probability that a protein has folded by time t is:
$$P_f(t) = 1 - e^{-rt}$$
where r is the folding rate, so that
$$\mathrm{MFPT} = \int_0^\infty t\, \frac{dP_f}{dt}\, dt = \frac{1}{r}$$
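The first-step equations MFPT(i) = Σ_j P_ij (t_ij + MFPT(j)), with MFPT(i) = 0 for i in F, again form a sparse linear system. A hedged sketch using Gauss-Seidel iteration is shown below; the dict-of-dicts layout for `P` and `t` and the function name are illustrative assumptions, and the folded set is assumed reachable from every node.

```python
def mean_first_passage_times(P, t, folded, tol=1e-12, max_iter=1000000):
    """Mean first passage time to the folded set from every node.

    P: dict i -> dict j -> transition probability P_ij
    t: dict i -> dict j -> traversal time t_ij of edge (i, j)
    folded: set of nodes forming F
    """
    m = {i: 0.0 for i in P}
    free = [i for i in P if i not in folded]
    for _ in range(max_iter):
        delta = 0.0
        for i in free:
            stay = P[i].get(i, 0.0)
            # Expected time spent on the next transition out of i.
            step_time = sum(p * t[i][j] for j, p in P[i].items())
            onward = sum(p * m[j] for j, p in P[i].items() if j != i)
            new = (step_time + onward) / (1.0 - stay)
            delta = max(delta, abs(new - m[i]))
            m[i] = new
        if delta < tol:
            break
    return m
```

Setting every t_ij to 1 turns the same routine into a step count, which is the form of the escape-time computation used for ligand-protein funnels.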
![Page 54: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/54.jpg)
Computational Test
12-residue tryptophan zipper beta hairpin (TZ2)
Folding@Home used to generate trajectories (fully atomistic simulation) ranging from 10 to 450 ns
1750 trajectories (14 reaching folded state)
22,400-node roadmap
MFPT ~ 2-9 μs, which is similar to experimental measurements (from fluorescence and IR)
![Page 55: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/55.jpg)
Conformational Analysis of Protein Loops
J. Cortés, T. Siméon, M. Renaud-Siméon, and V. Tran. Geometric Algorithms for the Conformational Analysis of Long Protein Loops. J. Comp. Chemistry, 25:956-967, 2004
New idea: Explore the clash-free subset of the conformational space of a loop, by building a tree-shaped roadmap
Kinematic model: φ-ψ angles on the backbone + χ_i torsional angles in side-chains
![Page 56: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/56.jpg)
Amylosucrase (AS)
- Only enzyme in its family that acts on sucrose substrate
- The 17-residue loop (named loop 7) between Gly433 and Gly449 is believed to play a pivotal role
![Page 57: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/57.jpg)
Roadmap Construction
A tree-shaped roadmap is created from a start conformation qstart
At each step of the roadmap construction, a conformation qrand of the loop is picked at random, and a new roadmap node is created by iteratively pulling toward it the existing node that is closest to qrand
![Page 58: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/58.jpg)
![Page 59: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/59.jpg)
![Page 60: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/60.jpg)
![Page 61: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/61.jpg)
Roadmap Construction
[Figure: conformational space C with subsets Cfree and Cclosed; the tree grows from qstart toward qrand]
Expansion stops when one can’t get closer to qrand or a clash is detected
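The expansion step can be illustrated with a toy sketch in a Euclidean configuration space. This is a simplification under stated assumptions: conformations are plain tuples of floats, `is_clash_free` is a hypothetical user-supplied predicate, and the loop-closure constraint (Cclosed) handled by the actual method is not modeled here.

```python
import math


def extend_tree(tree, q_rand, is_clash_free, step=0.1):
    """Pull the node nearest to q_rand toward it and add the end point.

    tree: dict mapping each node (tuple of floats) to its parent node.
    Returns the conformation where pulling stopped.
    """
    nearest = min(tree, key=lambda q: math.dist(q, q_rand))
    current = nearest
    while True:
        d = math.dist(current, q_rand)
        if d < 1e-9:
            break  # cannot get any closer to q_rand
        s = min(step, d)
        candidate = tuple(c + s * (r - c) / d for c, r in zip(current, q_rand))
        if not is_clash_free(candidate):
            break  # a clash is detected: stop before it
        current = candidate
    if current != nearest:
        tree[current] = nearest  # one new node per expansion step
    return current


def build_tree(q_start, sample, is_clash_free, iterations, step=0.1):
    """Grow a tree-shaped roadmap from q_start using random targets."""
    tree = {q_start: None}
    for _ in range(iterations):
        extend_tree(tree, sample(), is_clash_free, step)
    return tree
```

Here `sample` is any zero-argument function returning a random conformation, for example `lambda: (random.uniform(0, 1), random.uniform(0, 1))` in a two-dimensional toy space.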
![Page 62: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/62.jpg)
Computational Results
Surprisingly, loop 7 can’t move much
Main bottleneck is residue Asp231
[Figure: positions of the Cα atom of the middle residue (Ser441)]
![Page 63: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/63.jpg)
![Page 64: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/64.jpg)
Computational Results
If residue Asp231 is “removed”, then loop 7’s mobility increases dramatically. The Cα atom of Ser441 can be displaced by more than 9 Å from its crystallographic position
![Page 65: Molecular Motion Pathways: Computation of Ensemble Properties with Probabilistic Roadmaps](https://reader033.fdocuments.in/reader033/viewer/2022051215/568149ab550346895db6ea4b/html5/thumbnails/65.jpg)
Conclusion
Probabilistic roadmaps are a recent but promising tool for exploring conformational space and computing ensemble properties of molecular pathways
Current/future research:
• Better sampling strategies able to handle more complex molecular models (protein-protein binding)
• More work to include time information in roadmaps
• More thorough experimental validation to compare computed and measured quantitative properties