IMA Summer Program
Classical and Quantum Approaches in Molecular Modeling
Lecture 5: Molecular Sampling
Robert D. Skeel
Department of Computer Science, and of Mathematics
Purdue University
http://bionum.cs.purdue.edu/2007July24.pdf
Acknowledgment: NIH
Sampling
Create a stochastic (or deterministic!) ergodic Markov chain to generate $x_1, x_2, \ldots, x_n, \ldots$ having the desired distribution.
• Monte Carlo methods are unbiased in the limit $N_{\mathrm{trials}} \to \infty$ but waste information.
• Molecular dynamics with extended Hamiltonians and/or stochastic terms has a bias due to finite $\Delta t$.
Sampling can be accelerated by techniques such as replica exchange and multicanonical/Wang-Landau sampling.
Canonical ensemble
Let $\beta = 1/(k_B T)$. The probability density function factors as
$$\mathrm{const}\; e^{-\beta p^T M^{-1} p/2} \cdot \rho(x) \quad \text{where} \quad \rho(x) = \mathrm{const}\; e^{-\beta U(x)}.$$
The constant in $\rho(x)$ is unknown.
Outline
I. Markov chain Monte Carlo methods
II. Hybrid Monte Carlo methods
III. Molecular dynamics
IV. Replica exchange method
V. Multicanonical/Wang-Landau sampling
VI. Practicalities
A simple Markov chain Monte Carlo method
Given $x$, one step is as follows:
Pick an atom $i$ at random.
Generate i.i.d. values $\Delta x, \Delta y, \Delta z$ uniform on $[-\tfrac12\Delta, \tfrac12\Delta]$.
Set $x' = x$ with $[\Delta x, \Delta y, \Delta z]^T$ added to $\vec r_i$.
Accept $x'$ with probability
$$\min\{1, \rho(x')/\rho(x)\}$$
(the Metropolis acceptance criterion).
Otherwise, choose $x$ as the new value.
Note. A rejected move counts as a step in the chain.
These trial moves have a symmetric conditional p.d.f.: $\rho_t(x|x') = \rho_t(x'|x)$.
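The single-atom Metropolis step can be sketched in a few lines of Python. This is a minimal sketch, not the lecturer's code: the function names, the toy harmonic potential, and the parameter values are all assumptions chosen for illustration. Only the ratio $\rho(x')/\rho(x) = e^{-\beta(U(x') - U(x))}$ is needed, so the unknown constant in $\rho$ never appears.

```python
import numpy as np

def metropolis_step(x, U, beta, delta, rng):
    """One single-atom Metropolis step; x has shape (N, 3)."""
    i = rng.integers(x.shape[0])                     # pick an atom at random
    x_new = x.copy()
    x_new[i] += rng.uniform(-0.5 * delta, 0.5 * delta, size=3)
    # rho(x')/rho(x) = exp(-beta * dU); the normalizing constant cancels
    dU = U(x_new) - U(x)
    if dU <= 0.0 or rng.random() < np.exp(-beta * dU):
        return x_new, True
    return x, False                                  # rejected: keep x, still a step

def run_chain(x0, U, beta, delta, n_steps, seed=0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    samples = np.empty((n_steps,) + x.shape)
    n_accept = 0
    for n in range(n_steps):
        x, accepted = metropolis_step(x, U, beta, delta, rng)
        n_accept += accepted
        samples[n] = x                               # rejected moves are recorded too
    return samples, n_accept / n_steps

# Toy potential (hypothetical): independent harmonic wells, U = 0.5 * sum |r_i|^2
def harmonic(x):
    return 0.5 * np.sum(x * x)

samples, acc_rate = run_chain(np.zeros((2, 3)), harmonic,
                              beta=1.0, delta=2.0, n_steps=40000)
mean_sq = np.mean(samples[4000:] ** 2)   # should approach 1/beta per coordinate
```

For this toy potential each coordinate should have variance $1/\beta$, which gives a quick sanity check on the chain. Recomputing the full $U(x)$ at every step keeps the sketch short; a production code would evaluate only the energy change of atom $i$.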
MCMC Convergence
Each configuration $x_n$ of the Markov chain has a density $\rho_n(x)$. Only in the limit $n \to \infty$ might $\rho_n(x) \to \rho(x)$: convergence.
Proposition. convergence $\Leftrightarrow$ stationarity & ergodicity.
Stationarity means $\rho_n(x) = \rho(x) \Rightarrow \rho_{n+1}(x) = \rho(x)$.
Ergodicity means that all subsets of positive measure will be visited with probability > 0 in finite time.
The Markov chain given previously is ergodic.
Stationarity
Proposition. detailed balance $\Rightarrow$ stationarity.
Detailed balance means $\rho(x|x')\rho(x') = \rho(x'|x)\rho(x)$, where $\rho(x'|x)$ is the conditional p.d.f. for a complete step.
Proposition. symmetric trial moves & Metropolis criterion $\Rightarrow$ detailed balance.
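Both propositions follow in a line or two. The derivation below is a sketch added for illustration; it treats the step density $\rho(x'|x)$ for $x' \ne x$ only and ignores the point mass at $x' = x$ that comes from rejections, which satisfies detailed balance trivially.

```latex
% Detailed balance implies stationarity: assume rho_n = rho and integrate over x'
\begin{aligned}
\rho_{n+1}(x) &= \int \rho(x|x')\,\rho(x')\,dx'
               = \int \rho(x'|x)\,\rho(x)\,dx'
               = \rho(x)\int \rho(x'|x)\,dx' = \rho(x). \\[4pt]
% Symmetric trial moves + Metropolis imply detailed balance (x' \ne x)
\rho(x'|x)\,\rho(x) &= \rho_t(x'|x)\,\min\{1, \rho(x')/\rho(x)\}\,\rho(x)
                     = \rho_t(x'|x)\,\min\{\rho(x), \rho(x')\}.
\end{aligned}
```

The last expression is unchanged when $x$ and $x'$ are interchanged, provided $\rho_t(x'|x) = \rho_t(x|x')$, so it also equals $\rho(x|x')\rho(x')$.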
Metropolis-Hastings criterion
accommodates nonsymmetric trial moves by accepting with probability
$$\min\left\{1, \frac{\rho(x')\,\rho_t(x|x')}{\rho(x)\,\rho_t(x'|x)}\right\}.$$
An example is a trial move given by
$$x' = x + \Delta t\, D\, \nabla \log \rho(x) + \sqrt{2\Delta t}\; D^{1/2} Z$$
where $D$ is a constant diagonal matrix and $Z$ is a set of independent standard Gaussian random numbers.
I would call this scheme a . . .
“Brownian dynamics sampler”
The basic idea is similar to
force-bias MC (1978)
and almost identical to
smart MC (1978).
The abstracted idea has been called the
Metropolis-adjusted Langevin algorithm (1994).
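A minimal Python sketch of this sampler follows, assuming the diagonal of $D$ is stored as a vector and that the target is supplied through two hypothetical callables, `log_rho` and `grad_log_rho`. The trial density is Gaussian with mean $x + \Delta t\, D \nabla \log\rho(x)$ and covariance $2\Delta t\, D$, and because it is not symmetric the Metropolis-Hastings ratio is used.

```python
import numpy as np

def mala_step(x, log_rho, grad_log_rho, D, dt, rng):
    """One Metropolis-adjusted Langevin step; D holds the diagonal of the matrix D."""
    def log_q(x_to, x_from):
        # log rho_t(x_to | x_from) up to a constant that is the same in both directions
        mean = x_from + dt * D * grad_log_rho(x_from)
        return -np.sum((x_to - mean) ** 2 / D) / (4.0 * dt)

    z = rng.standard_normal(x.shape)
    x_new = x + dt * D * grad_log_rho(x) + np.sqrt(2.0 * dt * D) * z
    # log of rho(x') rho_t(x|x') / (rho(x) rho_t(x'|x))
    log_ratio = (log_rho(x_new) + log_q(x, x_new)) - (log_rho(x) + log_q(x_new, x))
    if rng.random() < np.exp(min(0.0, log_ratio)):
        return x_new, True
    return x, False

# Toy target (hypothetical): standard normal in 2-D, so log rho = -|x|^2 / 2
rng = np.random.default_rng(1)
x = np.zeros(2)
D = np.ones(2)
chain = np.empty((20000, 2))
n_accept = 0
for n in range(chain.shape[0]):
    x, accepted = mala_step(x, lambda y: -0.5 * np.sum(y * y), lambda y: -y, D, 0.5, rng)
    n_accept += accepted
    chain[n] = x
variance = chain[2000:].var()   # should be close to 1 for this target
```

Without the accept/reject step the same update would sample a slightly wrong distribution for any finite $\Delta t$; the Metropolis-Hastings test removes that bias.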
II. Hybrid Monte Carlo methods
Hybrid Monte Carlo
Hybrid Monte Carlo uses MD to generate possible moves. Given $x$:
(1) Generate $p$ from $\mathrm{const}\, \exp(-\tfrac12 \beta p^T M^{-1} p)$.
(2) Obtain $x'$, $p'$ from a short MD trajectory.
(3) Accept $x'$ with probability
$$\min\left\{1, \frac{\exp(-\beta H(x', p'))}{\exp(-\beta H(x, p))}\right\},$$
where $H(x, p) = \tfrac12 p^T M^{-1} p + U(x)$.
(4) If rejected, choose $x$.
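Steps (1) to (4) can be sketched as follows. The choice of the leapfrog integrator, the diagonal mass vector `m`, and the toy harmonic potential are assumptions made for illustration; the lecture only requires a short MD trajectory.

```python
import numpy as np

def leapfrog(x, p, grad_U, m, dt, n_steps):
    """Leapfrog trajectory (reversible and volume-preserving)."""
    p = p - 0.5 * dt * grad_U(x)                 # initial half kick
    for step in range(n_steps):
        x = x + dt * p / m                       # drift
        if step < n_steps - 1:
            p = p - dt * grad_U(x)               # full kick between drifts
    p = p - 0.5 * dt * grad_U(x)                 # final half kick
    return x, p

def hmc_step(x, U, grad_U, m, beta, dt, n_steps, rng):
    p = rng.standard_normal(x.shape) * np.sqrt(m / beta)    # (1) fresh momenta
    x_new, p_new = leapfrog(x, p, grad_U, m, dt, n_steps)   # (2) short MD trajectory
    H_old = 0.5 * np.sum(p * p / m) + U(x)
    H_new = 0.5 * np.sum(p_new * p_new / m) + U(x_new)
    if rng.random() < np.exp(min(0.0, -beta * (H_new - H_old))):   # (3) accept
        return x_new, True
    return x, False                                          # (4) rejected: keep x

# Toy system (hypothetical): 3 harmonic degrees of freedom, U = |x|^2 / 2
rng = np.random.default_rng(0)
x = np.zeros(3)
m = np.ones(3)
chain = np.empty((5000, 3))
n_accept = 0
for n in range(chain.shape[0]):
    x, accepted = hmc_step(x, lambda y: 0.5 * np.sum(y * y), lambda y: y,
                           m, beta=1.0, dt=0.2, n_steps=8, rng=rng)
    n_accept += accepted
    chain[n] = x
variance = chain[500:].var()   # should be close to 1/beta
```

The trajectory length `dt * n_steps` is a tuning choice; here it is set to roughly a quarter period of the toy oscillator so that successive samples decorrelate quickly.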
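Convergence of HMC
It is enough that the integrator be reversible and volume-preserving.
Counterintuitively, $\langle \Delta H \rangle > 0$, where $\Delta H = H(x', p') - H(x, p)$. Indeed, $\langle \Delta H \rangle \propto \Delta t^{2p} N$ (here $p$ is the order of accuracy of the integrator, not the momentum).
To get a 50% acceptance rate, we need $\langle \Delta H \rangle = 0.9099\, k_B T$, which implies
$$\Delta t \propto N^{-1/(2p)}.$$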
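The figure 0.9099 is consistent with the commonly used large-$N$ estimate of the acceptance rate, $P_{\mathrm{acc}} \approx \mathrm{erfc}\big(\tfrac12\sqrt{\beta\langle\Delta H\rangle}\big)$. That formula is an assumption brought in here for illustration; it is not stated on the slide. The short check below evaluates it, and also shows what the scaling law means for a second-order integrator such as leapfrog ($p = 2$).

```python
import math

def hmc_acceptance(beta_mean_dH):
    """Large-N estimate of the HMC acceptance rate, erfc(sqrt(beta * <dH>) / 2)."""
    return math.erfc(0.5 * math.sqrt(beta_mean_dH))

rate = hmc_acceptance(0.9099)            # <dH> = 0.9099 kB T

# Scaling dt ~ N^(-1/(2p)): with p = 2, a 16-fold larger system
# needs a time step smaller by the factor 16^(-1/4).
order = 2
dt_factor = 16.0 ** (-1.0 / (2 * order))
```

Under these assumptions a 16-fold increase in system size only halves the usable time step, which is why HMC remains practical for moderately large systems.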
III. Molecular dynamics
Deterministic MD
Instantaneous temperature $T(p) = p^T M^{-1} p / (k_B N_d)$, where $N_d$ = number of DOFs.
Nosé-Hoover augments Newton’s equations with a thermostat:
$$\frac{d}{dt}x = M^{-1}p, \qquad \frac{d}{dt}p = -\nabla U(x) - \frac{p_s}{Q}\,p, \qquad \frac{d}{dt}p_s = N_d k_B\,(T(p) - T),$$
where $Q$ = thermal inertia.
If $T(p) > T$, the value of $p_s$ will increase, and eventually $p_s$ will be positive, causing $p$, and hence $T(p)$, to decrease.
(And the opposite happens if $T(p) < T$.)
Nosé-Hoover generates the canonical ensemble via
$$\langle A(x, p)\rangle = \lim_{t\to\infty} \frac{1}{t}\int_0^t A(x(t'), p(t'))\, dt'.$$
It is not Hamiltonian but has a conserved quantity
$$\tfrac12 p^T M^{-1} p + U(x) + \frac{1}{2Q}\,p_s^2 + N_d k_B T \ln s, \qquad \text{where} \quad \frac{d}{dt}s = \frac{1}{Q}\,p_s\, s.$$
Drift in this “extended energy” may be excessive, however.
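The conserved quantity gives a convenient correctness check for a Nosé-Hoover implementation. The sketch below integrates the thermostat equations for a one-dimensional harmonic oscillator with a classical fourth-order Runge-Kutta step and monitors the extended energy. The integrator, the reduced units, and all parameter values are assumptions for illustration. A single harmonic oscillator is also a poor test of canonical sampling, so only conservation is checked here.

```python
import numpy as np

# Illustrative parameters (assumed): 1-D harmonic oscillator in reduced units
m, k_spring, kT, Q, Nd = 1.0, 1.0, 1.0, 1.0, 1

def rhs(y):
    x, p, ps, ln_s = y
    return np.array([
        p / m,                          # dx/dt    = M^-1 p
        -k_spring * x - (ps / Q) * p,   # dp/dt    = -grad U(x) - (ps/Q) p
        p * p / m - Nd * kT,            # dps/dt   = Nd kB (T(p) - T)
        ps / Q,                         # d(ln s)/dt = ps/Q, i.e. ds/dt = ps s / Q
    ])

def extended_energy(y):
    x, p, ps, ln_s = y
    return (0.5 * p * p / m + 0.5 * k_spring * x * x
            + 0.5 * ps * ps / Q + Nd * kT * ln_s)

def rk4_step(y, dt):
    k1 = rhs(y)
    k2 = rhs(y + 0.5 * dt * k1)
    k3 = rhs(y + 0.5 * dt * k2)
    k4 = rhs(y + dt * k3)
    return y + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

y = np.array([1.0, 0.0, 0.0, 0.0])      # x, p, ps, ln s
E0 = extended_energy(y)
for _ in range(4000):
    y = rk4_step(y, 0.005)
drift = abs(extended_energy(y) - E0)    # should stay tiny for a correct right-hand side
```

A sign error in any of the three thermostat equations shows up immediately as a large drift, which makes this a cheap unit test before running a long simulation.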
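Nosé-Poincaré
It is defined by the extended Hamiltonian
$$H(x, s, p, p_s) = s\left(\tfrac12 s^{-2} p^T M^{-1} p + U(x) + \frac{1}{2Q}\,p_s^2 + N_d k_B T \ln s - E\right)$$
where $E$ is chosen to make $H$ initially zero.
Averages are computed as
$$\langle A\rangle = \lim_{t\to\infty} \frac{1}{t}\int_0^t A\big(x(t'),\, s(t')^{-1} p(t')\big)\, dt'.$$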
Stochastic MD
Recall Langevin dynamics
$$M\frac{d^2}{dt^2}x = F(x) - CM\frac{d}{dt}x + (2 k_B T\, C M)^{1/2}\,\frac{d}{dt}W(t)$$
where
$C$ is a diagonal matrix of damping constants, e.g., 5 ps$^{-1}$, and
$W(t)$ is a set of $3N$ independent canonical Wiener processes.
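One way to discretize the Langevin equation is the BAOAB splitting, sketched below. The choice of BAOAB is an assumption, since the lecture does not name an integrator, and the toy harmonic force and parameter values are likewise illustrative. The damping constant `gamma` plays the role of a diagonal entry of $C$.

```python
import numpy as np

def baoab_step(x, p, force, m, gamma, kT, dt, rng):
    """One BAOAB step for M x'' = F(x) - C M x' + (2 kB T C M)^(1/2) dW/dt."""
    p = p + 0.5 * dt * force(x)                   # B: half kick
    x = x + 0.5 * dt * p / m                      # A: half drift
    c = np.exp(-gamma * dt)                       # O: exact friction + noise update
    p = c * p + np.sqrt((1.0 - c * c) * m * kT) * rng.standard_normal(x.shape)
    x = x + 0.5 * dt * p / m                      # A: half drift
    p = p + 0.5 * dt * force(x)                   # B: half kick
    return x, p

# Toy test (hypothetical): 2000 independent 1-D harmonic oscillators, reduced units
rng = np.random.default_rng(2)
m, k_spring, kT, gamma, dt = 1.0, 1.0, 1.0, 5.0, 0.1
x = np.zeros(2000)
p = np.zeros(2000)
n_steps, n_burn = 2000, 1000
x2_sum = p2_sum = 0.0
for n in range(n_steps):
    x, p = baoab_step(x, p, lambda y: -k_spring * y, m, gamma, kT, dt, rng)
    if n >= n_burn:
        x2_sum += np.mean(x * x)
        p2_sum += np.mean(p * p)
mean_x2 = x2_sum / (n_steps - n_burn)   # expect about kT / k_spring
mean_p2 = p2_sum / (n_steps - n_burn)   # expect about m * kT, up to O(dt^2) bias
```

The small residual error in `mean_p2` illustrates the finite-$\Delta t$ bias of molecular dynamics sampling mentioned at the start of the lecture.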
IV. Replica exchange method
Temperature replica exchange
Large conformational barriers can be surmounted by raising the temperature, and this can be done while maintaining Boltzmann-Gibbs sampling using replica exchange, also known as parallel tempering (1995).
Method
Simulate an ensemble of systems at temperatures $T_1 < T_2 < \cdots < T_\mu$, where $T_1$ is the desired temperature.
Periodically, choose $\nu$ at random from $1, 2, \ldots, \mu - 1$, and consider swapping configurations $x^{(\nu)}$ and $x^{(\nu+1)}$.
The probability of the swapped state relative to the unswapped state is
$$r = \frac{\rho_\nu(x^{(\nu+1)})}{\rho_\nu(x^{(\nu)})} \cdot \frac{\rho_{\nu+1}(x^{(\nu)})}{\rho_{\nu+1}(x^{(\nu+1)})}$$
where $\rho_\nu(x)$ is the probability density for temperature $T_\nu$.
Accept the exchange with probability
$$\min\{1, r\}.$$
Then continue sampling.
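With $\rho_\nu(x) \propto e^{-\beta_\nu U(x)}$ the normalizing constants cancel and the ratio reduces to $r = \exp\big((\beta_\nu - \beta_{\nu+1})(U(x^{(\nu)}) - U(x^{(\nu+1)}))\big)$. The sketch below implements the swap test in that form. The function names and the list-based bookkeeping are assumptions for illustration.

```python
import math
import random

def swap_log_ratio(beta_a, beta_b, U_a, U_b):
    """log r for swapping the configurations held at beta_a and beta_b."""
    return (beta_a - beta_b) * (U_a - U_b)

def attempt_swap(betas, energies, configs, rng=random):
    """Pick a random neighbouring pair (nu, nu+1) and swap with probability min(1, r)."""
    nu = rng.randrange(len(betas) - 1)
    log_r = swap_log_ratio(betas[nu], betas[nu + 1], energies[nu], energies[nu + 1])
    if log_r >= 0.0 or rng.random() < math.exp(log_r):
        configs[nu], configs[nu + 1] = configs[nu + 1], configs[nu]
        energies[nu], energies[nu + 1] = energies[nu + 1], energies[nu]
        return nu, True
    return nu, False

# Example: the hotter replica (smaller beta) currently holds the lower-energy
# configuration, so the swap is always accepted.
betas = [1.0, 0.5]
energies = [2.0, 1.0]
configs = ["a", "b"]
nu, accepted = attempt_swap(betas, energies, configs)
```

Working with $\log r$ avoids overflow when the energy gap is large, and only the potential energies of the two replicas are needed, not the configurations themselves.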
Shortcoming
Probability of rejection increases with system size $N$.
Number of replicas needed to prevent excessive rejections increases as $N^{1/2}$.
Hamiltonian replica exchange
A generalization of replica exchange.
As an example, write $U = U^{pp} + U^{pw} + U^{ww}$, where the splitting represents protein–protein, protein–water, and water–water interactions, respectively.
Then consider
$$U_\nu = \gamma_\nu U^{pp} + U^{pw} + U^{ww}, \qquad 1 = \gamma_1 > \gamma_2 > \cdots > \gamma_\mu$$
(hot solute, cold solvent).
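For this choice of $U_\nu$ the swap ratio simplifies considerably. The worked step below is a sketch that assumes every replica runs at the same temperature, so that $\rho_\nu(x) \propto e^{-\beta U_\nu(x)}$ with a single $\beta$; that assumption is not stated on the slide.

```latex
\begin{aligned}
r &= \frac{\rho_\nu(x^{(\nu+1)})\,\rho_{\nu+1}(x^{(\nu)})}
          {\rho_\nu(x^{(\nu)})\,\rho_{\nu+1}(x^{(\nu+1)})} \\
  &= \exp\!\Big(-\beta\big[U_\nu(x^{(\nu+1)}) - U_\nu(x^{(\nu)})
        + U_{\nu+1}(x^{(\nu)}) - U_{\nu+1}(x^{(\nu+1)})\big]\Big) \\
  &= \exp\!\Big(-\beta\,(\gamma_\nu - \gamma_{\nu+1})
        \big[U^{pp}(x^{(\nu+1)}) - U^{pp}(x^{(\nu)})\big]\Big).
\end{aligned}
```

The $U^{pw}$ and $U^{ww}$ terms cancel, so only the protein–protein energy enters the acceptance test.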
V. Multicanonical/Wang-Landau sampling
Energy distribution for canonical ensemble
Spread is $\propto \sqrt{N}\, k_B T$.
Multicanonical sampling
Energy barriers can be overcome by sampling from a density for which $U(x)$ has a flat distribution.
For a suitable range $E_{\min} \le U(x) \le E_{\max}$, there is such a density,
$$\rho_{\mathrm{multi}}(x) = \mathrm{const}/g(U(x)),$$
where $g(E)$ is the (configurational) density of states,
$$g(E) = \mathrm{const}\int \delta(U(x) - E)\, dx.$$
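That this density gives a flat energy distribution can be seen in one line. In the sketch below, $P_{\mathrm{multi}}(E)$ is notation introduced here for the probability density of the energy under $\rho_{\mathrm{multi}}$.

```latex
P_{\mathrm{multi}}(E)
  = \int \rho_{\mathrm{multi}}(x)\,\delta(U(x) - E)\,dx
  = \frac{\mathrm{const}}{g(E)} \int \delta(U(x) - E)\,dx
  = \frac{\mathrm{const}}{g(E)} \cdot \frac{g(E)}{\mathrm{const}'}
  = \mathrm{const}'' \qquad (E_{\min} \le E \le E_{\max}).
```

The delta function lets $1/g(U(x))$ be pulled out of the integral as $1/g(E)$, and what remains is, up to a constant, the definition of $g(E)$.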
Reweighting
If we have an approximation to $g(E)$, canonical ensemble averages can be calculated by reweighting
$$\langle A(x)\rangle = \frac{\big\langle A(x)\, e^{-\beta U(x)} / g(U(x))^{-1} \big\rangle_{\mathrm{multi}}}{\big\langle e^{-\beta U(x)} / g(U(x))^{-1} \big\rangle_{\mathrm{multi}}},$$
valid independent of the accuracy of $g(E)$. (If $g(E)$ has good accuracy, there is an alternative.)
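In code, the reweighting formula is a weighted average with weights $e^{-\beta U(x)}\, g(U(x))$, best evaluated in log space. The sketch below is illustrative only: the function name and argument layout are assumptions, and `log_g` is taken to be the same approximation of $\log g$ that defined the sampling density, evaluated at each sample's energy.

```python
import numpy as np

def reweight(A, U, log_g, beta):
    """Canonical average of A from samples drawn with density proportional to 1/g(U(x)).

    A, U, log_g are arrays with one entry per sample.
    """
    log_w = -beta * U + log_g            # log of e^{-beta U} / g(U)^{-1}
    w = np.exp(log_w - log_w.max())      # shift by the max for numerical stability
    return np.sum(w * A) / np.sum(w)

# Degenerate check (assumed for testing only): points placed uniformly in x
# correspond to a constant g, so log_g = 0 and the formula reduces to a
# Boltzmann-weighted average over a grid.
x = np.linspace(-10.0, 10.0, 20001)
U = 0.5 * x * x
avg_x2 = reweight(x * x, U, np.zeros_like(x), beta=2.0)   # expect 1/beta
```

Any constant factor in $g$ cancels between numerator and denominator, so `log_g` only needs to be known up to an additive constant.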
Wang-Landau scheme
Discretize using histograms:
partition $E_{\min} \le E \le E_{\max}$ into $J$ subintervals,
define basis functions
$$1_j(E) = \begin{cases} 1, & \text{in the } j\text{th subinterval}, \\ 0, & \text{elsewhere}. \end{cases}$$
Construct approximations $g^{(1)}(E), g^{(2)}(E), \ldots$ of decreasing granularity:
$$\log g^{(1)}(E) = \sum_{j=1}^{J} N_j^{(1)}\, 1_j(E),$$
$$\log g^{(2)}(E) = \log g^{(1)}(E) + \tfrac12 \sum_{j=1}^{J} N_j^{(2)}\, 1_j(E),$$
et cetera.
Inner loop
Initialize: $g_0(E) = g^{(k)}(E)$.
Inner loop: for $n = 1, 2, \ldots$
    Generate proposal $\tilde x_n$ from $x_{n-1}$.
    Choose $x_n = \tilde x_n$ or $x_{n-1}$ based on the Metropolis criterion for density $= \mathrm{const}/g_{n-1}(U(x))$.
    # increment histogram value
    $\log g_n(E) = \log g_{n-1}(E) + \left(\tfrac12\right)^k \sum_{j=1}^{J} 1_j(U(x_n))\, 1_j(E)$,
until well sampled.
Set $g^{(k+1)}(E) = g_n(E)$.
This is not a Markov chain.
The scheme tries to create a flat energy distribution; the resistance encountered is a correction to $g^{(k)}(E)$.
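The outer and inner loops can be tried on a toy system whose density of states is known exactly. In the sketch below the "configuration" is a pair of dice and the "energy" is their sum, so $g(E)$ is the number of ways to roll $E$. The dice system is a hypothetical example, and the fixed number of steps per stage is a simplifying assumption that stands in for the "until well sampled" test.

```python
import math
import numpy as np

def wang_landau(n_stages=16, steps_per_stage=20000, seed=0):
    """Estimate log g(E) for E = sum of two dice (bins E = 2..12)."""
    rng = np.random.default_rng(seed)
    log_g = [0.0] * 11                    # start from a constant g
    a, b = 1, 1                           # current configuration
    for k in range(n_stages):
        f = 0.5 ** k                      # histogram increment (1/2)^k
        which = rng.integers(0, 2, size=steps_per_stage).tolist()
        value = rng.integers(1, 7, size=steps_per_stage).tolist()
        u = rng.random(steps_per_stage).tolist()
        for n in range(steps_per_stage):
            # symmetric proposal: re-roll one randomly chosen die
            ta, tb = (value[n], b) if which[n] == 0 else (a, value[n])
            # Metropolis criterion for density const / g(U(x))
            diff = log_g[a + b - 2] - log_g[ta + tb - 2]
            if u[n] < math.exp(min(0.0, diff)):
                a, b = ta, tb
            log_g[a + b - 2] += f         # increment histogram at the current energy
    return np.array(log_g)

log_g = wang_landau()
exact = np.log(np.array([1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1], dtype=float))
# g is only determined up to a constant factor, so compare after centering
error = np.max(np.abs((log_g - log_g.mean()) - (exact - exact.mean())))
```

Because the sampling density changes at every step, the sequence of configurations is not a Markov chain, as noted above; the estimate settles down only as the increment $(1/2)^k$ becomes small.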
VI. Practicalities
Practicalities
The practicalities of doing such calculations involve three steps:
structure building: Setting up the input files is best done interactively with scripts and visual feedback.
    visualization programs: RasMol, VMD, PyMOL, . . .
simulation: Generating dynamics or sampling trajectories is best done in background or remotely.
    simulation programs: CHARMM, Amber, Gromacs, NAMD, LAMMPS, NWChem, Tinker, . . .
analysis: Analyzing trajectory data.
Simulation specifications
• Specify molecular system & surroundings
• Specify computational tasks
• Select computational model: uncontrolled approximations and error tolerances
  - internal forces
  - external forces, e.g., temperature and pressure control
  - dynamics (sampling or real)
• (Override defaults for performance parameters)
• Design simulation protocol