Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

bioexcel.eu Partners Funding Multiple timescales in atomistic simulations Presenter: Simone Meloni Host: Adam Carter BioExcel Webinar Series 15th November, 2017

Transcript of Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Page 1: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

bioexcel.eu

Partners Funding

Multiple timescales in atomistic simulations

Presenter: Simone Meloni

Host: Adam Carter

BioExcel Webinar Series

15th November, 2017

Page 2: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

This webinar is being recorded

Page 3: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

BioExcel Overview

• Excellence in Biomolecular Software
- Improve the performance, efficiency and scalability of key codes

• Excellence in Usability
- Devise efficient workflow environments with associated data integration

• Excellence in Consultancy and Training
- Promote best practices and train end users

[Diagram: workflow architecture linking a Portal / Workbench, DMI Gateways, DMI Planner, DMI Optimiser, DMI Validator, DMI Enactors, DMI Executor and DMI Monitor with a Repository, a Registry, a Data Source and a Data Delivery Point. Arrows denote monitoring flow, data flow and service invocation. Roles shown: Domain Expert, DMI Expert, DADC Engineer.]

Page 4: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Interest Groups

• Integrative Modeling IG
• Free Energy Calculations IG
• Hybrid methods for biomolecular systems IG
• Biomolecular simulations entry level users IG
• Practical applications for industry IG
• Training
• Workflows

Support platforms: http://bioexcel.eu/contact

Forums, Code Repositories, Chat channel, Video Channel

Page 5: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

BioExcel Community Forum

Lloyd Hotel & Cultural Embassy, Amsterdam, 22nd-23rd November, 2017

A networking event with a programme of small interactive working groups aligned with the BioExcel Interest Groups

Free. Find out more and register at http://bioexcel.eu/events/bioexcel-community-forum-22-23-november-2017/

Page 6: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Audience Q&A session

Please use the Questions function in the GoToWebinar application

Any other questions or points to discuss after the live webinar? Join the discussion at http://ask.bioexcel.eu.

Page 7: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Today’s Presenter

Dr Simone Meloni is a research professor at the Department of Mechanical and Aerospace Engineering at the University of Roma "La Sapienza".

He works on applications of computational statistical mechanics to fundamental and technological problems, especially those related to energy technologies: materials for 3rd generation solar cells, energy storage, energy scavenging, etc.

He developed techniques for the simulation of rare events, i.e. events which are too infrequent to be observed on the timescale accessible to brute-force simulations. He has also developed non-equilibrium techniques to compute (in principle exactly) statistical quantities in atomistic and molecular systems out of equilibrium.

Page 8: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Multiple timescales in atomistic simulations

Simone Meloni, Department of Mechanical and Aerospace Engineering, Sapienza University of Rome ([email protected])

BioExcel Webinar, November 15, 2017

Page 9: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Outline

• Need for rare event techniques in high-dimensional spaces

• Reconstruction of high-dimensional free energy landscapes

• Finding paths on a 10-105D free energy landscape

Page 10: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Rare events

• Isomerization of alanine dipeptide
– Coarse-grained description of the process: focus on the degrees of freedom governing isomerization (CVs)
– The statistics, dynamics and kinetics of the system are determined by some dihedral angles and their evolution

Page 11: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Rare events

p_\theta(z) = \int dr \, m(r) \prod_i \delta(\theta_i(r) - z_i)

F_\theta(z) = -k_B T \log p_\theta(z)

\tau = 1/\nu = \tau_\infty \exp[\Delta F^\dagger / k_B T]
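As a minimal sketch of the free energy definition on this slide, F_θ(z) can be estimated from a trajectory by histogramming the sampled values of one CV and taking −k_B T log of the normalized histogram. The function name, the bin width and the choice of kcal/mol units are illustrative assumptions, not something stated in the talk.

```python
import math
from collections import Counter

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def free_energy_profile(samples, bin_width, temperature):
    """Estimate F(z) = -k_B T log p(z) from samples of a single CV.

    Returns a dict {bin_center: F}, shifted so that the lowest value is zero.
    Bins that were never visited are simply absent (their F is unknown).
    """
    counts = Counter(math.floor(s / bin_width) for s in samples)
    total = len(samples)
    kT = K_B * temperature
    profile = {}
    for b, n in counts.items():
        density = n / (total * bin_width)  # normalized probability density
        profile[(b + 0.5) * bin_width] = -kT * math.log(density)
    f_min = min(profile.values())
    return {z: f - f_min for z, f in sorted(profile.items())}
```

A bin visited nine times less often than the most populated one sits k_B T log 9 higher in free energy, which is why barriers of many k_B T are essentially never sampled by a plain histogram.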

Page 12: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Rare events

• Histogram along the MD trajectory
– The event is thermally activated

• Acceleration techniques
– US, Constrained MD, Metadynamics, Boxed MD, FFS, …
– Can be used in conjunction with only a few CVs, typically 1 or 2

[Plot: free energy profile F(\theta), with k_B T marked]

\tau \gg t_{MD}

\tau = 1/\nu = \tau_\infty \exp[\Delta F^\dagger / k_B T]
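To make the gap between τ and the simulation time concrete, the activated timescale can be evaluated for a given barrier. This is a sketch that assumes the Arrhenius-like form with a positive exponent, so that τ grows with the barrier; the prefactor of 1 ps and the 10 kcal/mol barrier in the usage note are hypothetical numbers chosen only for illustration.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def escape_time(barrier, temperature, tau_inf):
    """Return tau = tau_inf * exp(barrier / (k_B T)).

    barrier is in kcal/mol, temperature in K; the result has the units of tau_inf.
    """
    return tau_inf * math.exp(barrier / (K_B * temperature))
```

With a hypothetical prefactor of 1 ps, `escape_time(10.0, 300.0, 1.0)` is of order 10^7 ps, i.e. tens of microseconds between events, which is why a brute-force trajectory of a few nanoseconds would almost never show the transition.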

Page 13: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Rare events

• Committor: probability to reach the product first
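The committor definition can be illustrated by direct shooting: start many short trajectories from the same configuration and count the fraction that reaches the product before the reactant. The sketch below is a toy model, not the protocol of the work cited on this slide: it assumes a one-dimensional double well V(x) = (x^2 - 1)^2 with overdamped Langevin dynamics, and all parameter values are hypothetical.

```python
import math
import random


def force(x):
    """Minus the gradient of the toy double-well potential V(x) = (x^2 - 1)^2."""
    return -4.0 * x * (x * x - 1.0)


def committor(x0, n_traj=200, kT=0.3, dt=1e-3, boundary=0.9,
              seed=0, max_steps=200_000):
    """Estimate the probability of reaching x >= boundary (product)
    before x <= -boundary (reactant), starting from x0.

    Trajectories that hit neither state within max_steps are discarded.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * kT * dt)  # noise amplitude of the overdamped step
    reached_product = 0
    finished = 0
    for _ in range(n_traj):
        x = x0
        for _ in range(max_steps):
            x += force(x) * dt + sigma * rng.gauss(0.0, 1.0)
            if x >= boundary:
                reached_product += 1
                finished += 1
                break
            if x <= -boundary:
                finished += 1
                break
    return reached_product / finished
```

By symmetry the estimate at the barrier top, `committor(0.0)`, should be close to 1/2, while a point well inside the product basin gives a value close to 1. Repeating this for many configurations drawn from a candidate dividing surface gives the committor distributions discussed on the following slides.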

ous string is resolved at discrete level. Since this accuracy is O(Δα) because of the way we discretize P_ij in (47), this means that we must have s = O(Δα). This also implies that the statistical error must be below this threshold; in other words, consistent with the error estimate derived in Appendix B, in step 1 we must take T = O(Δα^{-2}) in the estimators (41) and (42) to compute ∇_z F(z) and M(z). We must also take k = O(Δα^{-1}) to not introduce a bias above the accuracy threshold due to the finiteness of k.

Step 4. Reparametrize the string. At discrete level, this amounts to interpolating a curve through the images z_i^{m,**}, then distributing new images along this interpolated curve. In the present implementation, it suffices to use piecewise linear interpolation: this is first order accurate in Δα, which is also the accuracy at which we compute the projector in (47). The reparametrization step goes as follows. Denote by L(k) the length of the string up to image k, i.e., L(1) = 0 and

L(k) = \sum_{m=2}^{k} |z^{m,**} - z^{m-1,**}|, \quad k = 2, \ldots, R, \qquad (49)

and for m = 2, ..., R let s(m) = (m−1) L(R) / (R−1). Then, set z^1(t+Δt) = z^{1,**}, z^R(t+Δt) = z^{R,**}, and for m = 2, ..., R−1 take

z^m(t+\Delta t) = z^{k-1,**} + (s(m) - L(k-1)) \frac{z^{k,**} - z^{k-1,**}}{|z^{k,**} - z^{k-1,**}|}, \qquad (50)

where k = 2, ..., R is such that L(k−1) < s(m) ≤ L(k). It can be readily checked that the z^m(t+Δt) are then the points along the piecewise linear interpolated path which are at equal distance along this path. Notice that the new images z^m(t+Δt) only satisfy (45) to order O(Δα^2): this is consistent with the overall order of accuracy of our scheme, but, if needed, arbitrarily high order of accuracy on (45) can be obtained by iterating (50).
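The reparametrization in (49) and (50) can be sketched in a few lines. This is an illustrative pure-Python version, not the authors' code: the function name is hypothetical, indices are 0-based (so the first image has arc length 0), and it assumes the images are not all coincident.

```python
import math


def reparametrize(images):
    """Redistribute images at equal arc length along the piecewise-linear
    curve through them, keeping the two end points fixed.

    images: list of R >= 2 points, each a tuple of collective-variable values.
    """
    R = len(images)
    # Cumulative arc length up to each image, as in (49).
    L = [0.0]
    for prev, cur in zip(images, images[1:]):
        L.append(L[-1] + math.dist(prev, cur))

    new_images = [tuple(images[0])]
    k = 1
    for m in range(1, R - 1):
        s = m * L[-1] / (R - 1)        # target arc length of new image m
        while L[k] < s:                # find segment with L[k-1] < s <= L[k]
            k += 1
        t = (s - L[k - 1]) / (L[k] - L[k - 1])
        # Linear interpolation inside segment k, equivalent to (50).
        new_images.append(tuple(a + t * (b - a)
                                for a, b in zip(images[k - 1], images[k])))
    new_images.append(tuple(images[-1]))
    return new_images
```

For example, the three images (0, 0), (1, 0), (1, 3) span a curve of length 4, so the middle image is moved to arc length 2, which is the point (1, 1).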

Steps 1–4 leave us with a new update of the string, z^m(t+Δt), and these steps can be repeated until convergence to the MFEP.

More sophisticated methods to discretize, evolve, smooth, and reparametrize the string can be used to achieve higher order of accuracy (see, e.g., Refs. 6, 14, and 27), but in Sec. V we will simply use the scheme discussed above. Finally, let us note that we use the string method here because it is simple, robust, and efficient, but other techniques such as the nudged elastic band could be used as well [28].

V. EXAMPLE: ALANINE DIPEPTIDE

As an example to illustrate the procedure that we propose, we analyze the isomerization transition of the alanine dipeptide molecule at 300 K in vacuum. We have studied the transition between the two metastable conformers usually named C7eq and C7ax. This transition has been extensively studied in the literature with different methods [12, 14–20]. The two conformers are usually defined as local minima in the space of the two dihedral angles φ and ψ. Figure 1 shows a pictorial description of the molecule and of the two conformers. All the dihedral angles here used as collective variables are also shown, and the atoms involved in their definitions are listed in the caption. For what follows, we recall that the central carbon atom in the figure is usually called Cα. We have looked first for minimum free energy paths using the two angles φ and ψ as collective variables [i.e., (z1, z2) = (φ, ψ)], then the four angles φ, ψ, θ, and ζ as collective variables [i.e., (z1, z2, z3, z4) = (φ, ψ, θ, ζ)]. The angles φ and ψ describe the rotations around the N−Cα and Cα−C bonds, respectively, while θ and ζ describe the rotations around the C−N bonds (i.e., the peptide bonds). The atoms O, C, N, and H are usually collectively referred to as the peptide group. In the alanine dipeptide molecule, two such groups are present, one on the left and one on the right of the Cα atom in Fig. 1. Along simulated trajectories of the molecule, these atoms stay on average on a plane. Different torsion angles can be used to monitor the conservation of this structure [29]. Among these, we decided to work with θ and ζ because the two molecular isomers C7eq and C7ax are both characterized by the presence of a hydrogen bond between the O and H atoms involved in the definition of these angles (see Fig. 1).

The simulations using the two angles φ and ψ as collective variables were done mainly to benchmark the method, since in this case the full free energy profile in φ and ψ can also be computed by umbrella sampling. On the other hand, the results reported below clearly indicate that using only φ and ψ as collective variables is not enough to describe the mechanism of transition in alanine dipeptide in vacuum. However, using the four angles φ, ψ, θ, and ζ as collective variables is sufficient: in particular, it will be shown that working with these angles makes it possible to determine the isocommittor 1/2 surface, whereas working with the two angles φ and ψ does not.

In all simulations, we used the full-atom representation of the molecule in the CHARMM force field [30]. For the dynamics, the Nosé-Hoover integrator [25] in the CHARMM code [31] was used. All chemical bonds in the system were represented

FIG. 1. Ball-and-stick representation of the alanine dipeptide molecule: (CH3)−(CO)−(NH)−(CαHCH3)−(CO)−(NH)−(CH3). The central carbon atom is referred to as Cα. The dihedral angles used in the calculations are shown. They are defined through the following quadruples of atoms: (O, C, N, Cα) for θ, (C, N, Cα, C) for φ, (N, Cα, C, N) for ψ, and (Cα, C, N, H) for ζ. In the bottom row are shown the two metastable conformers of the molecule in vacuum. The dashed line represents a hydrogen bond.

024106-8 Maragliano et al., J. Chem. Phys. 125, 024106 (2006)

not says that (φ, ψ, θ, ζ) is a large enough set, though this is indeed the case as will be verified below].

The free energy profile along the MFEP can be calculated from (44). Figure 4 shows the results obtained for the MFEPs in (φ, ψ) and (φ, ψ, θ, ζ). It can be seen that the free energy along the MFEP in (φ, ψ, θ, ζ) is different from the one along the MFEP in (φ, ψ). In the two angles case, the profile shows the flat region through which the string goes (see Fig. 2). Free energy differences calculated between points along the two angles profile are in very good agreement with results previously obtained with different methods, all of which used φ and ψ only as collective variables [15–19]. Moreover, it is interesting to note that the free energy profile along the MFEP in (φ, ψ, θ, ζ) is in fact closer to the one calculated in Ref. 14 with the string method using all the variables in Cartesian space.

To further prove the importance of the θ and ζ variables, we generated configurations from the transition ensemble. By definition, the transition ensemble is the ensemble of points on the isocommittor 1/2 surface in state space (x, v) equipped with the Boltzmann-Gibbs probability density function restricted to this surface. From the results in Sec. III D, we know that if one assumes that the collective variables used are a good set, then the isocommittor 1/2 surface is locally approximated by the lift up in state space (x, v) of the plane defined by (51) associated with the point of maximum free energy along the MFEP. Denoting by α_s the point along the MFEP where F(z(α)) is maximum, i.e., F(z(α_s)) = max_{α∈[0,1]} F(z(α)), the equation for this hypersurface is

\sum_{j=1}^{N} n_j(\alpha_s) (\theta_j(x) - z_j(\alpha_s)) = 0. \qquad (52)

In order to sample points on the hypersurface defined by (52) distributed according to the Boltzmann-Gibbs probability density function restricted to this surface, we added the following expression to the potential used in the CHARMM code:

V_{\alpha,k_p}(x) = \frac{k_p}{2} \left( \sum_{j=1}^{N} n_j(\alpha) (\theta_j(x) - z_j(\alpha)) \right)^2, \qquad (53)

and used the Nosé-Hoover integrator. The constant k_p was chosen to be 1000 kcal/(mol rad^2). Figure 5 shows an enlarged view of the major saddle point region of Fig. 2, together with the projections on the (φ, ψ) plane of the points in the transition ensemble. The gray cloud shows the points obtained by working in the two angles (φ, ψ) only. [In this case, in this projected view, the transition ensemble should lie on the gray line; the latter corresponds to (51) with (z1, z2) = (φ, ψ) and α = α_s; the finite width of this cloud away from the line is due to the finiteness of k_p in (53)]. The black cloud contains the points from the transition ensemble obtained by working in the four angles (φ, ψ, θ, ζ). The fact that the black cloud covers a more extended region than the gray cloud reveals that the transition ensemble obtained by working with (φ, ψ, θ, ζ) is actually bent in the θ and ζ variables, which is a clear indication that these two variables are important for the transition under study.
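The restraint in (53) is a harmonic penalty on the signed distance from the hyperplane (52), measured along the normal n. A minimal sketch of its energy and its gradient with respect to the collective variables is given below; the function names are hypothetical, and wrapping the angle differences to account for the periodicity of dihedral angles is an added assumption that the excerpt does not spell out.

```python
import math


def wrap(angle):
    """Map an angle difference in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def plane_restraint(theta, z, n, kp=1000.0):
    """Energy and gradient of V = kp/2 * (sum_j n_j (theta_j - z_j))^2.

    theta: current collective-variable values (radians)
    z:     point on the string defining the hyperplane
    n:     normal vector of the hyperplane
    kp:    force constant; 1000 kcal/(mol rad^2) is the value quoted in the text
    Returns (energy, gradient with respect to theta).
    """
    d = sum(nj * wrap(tj - zj) for nj, tj, zj in zip(n, theta, z))
    energy = 0.5 * kp * d * d
    grad = [kp * d * nj for nj in n]
    return energy, grad
```

A displacement of 0.1 rad along the normal costs 0.5 × 1000 × 0.01 = 5 kcal/mol, whereas any displacement within the plane costs nothing, so the sampled configurations stay close to the hyperplane but remain free to spread within it.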

As a conclusive and definitive test on the role of the full set of collective variables, we computed committor value distributions of the transition ensembles on the hypersurface (52), since this surface is supposed to be a local approximation of the isocommittor 1/2 surface. We first calculated the committor distribution associated with the surface (52) when only the angles φ and ψ were used. One hundred different configurations were extracted from the transition ensemble on this surface, and their committor value was computed from 200 trajectories generated from each of these configurations by assigning initial random velocities. The resulting distribution is reported in Fig. 6. The distribution is almost uniform, indicating that the corresponding surface is not a good approximation of the isocommittor 1/2 surface, i.e., φ and ψ alone are not a good set of collective variables.

FIG. 4. Free energy profiles along the MFEPs in the variables (φ, ψ) (gray curve) and (φ, ψ, θ, ζ) (black curve). The profile of the free energy along the MFEP in (φ, ψ) indicates that this MFEP goes by a flat region of the free energy, consistent with what is seen in Fig. 2. However, there is no such region along the MFEP in (φ, ψ, θ, ζ). The free energy along the MFEP in (φ, ψ, θ, ζ) is more similar to the one obtained in Ref. 14 using the string method in the full Cartesian space, which is already an indication that the collective variables φ, ψ, θ, and ζ are a set large enough to describe the transition between C7eq and C7ax.

FIG. 5. The projections on the (φ, ψ) plane of points sampled in the transition ensemble [that is, the ensemble of points on the isocommittor 1/2 surface in state space (x, v) equipped with the Boltzmann-Gibbs probability density function restricted to this surface]. The gray cloud is slightly scattered around the gray dashed line due to the sampling procedure used; the black cloud is more scattered because the isocommittor 1/2 surface associated with the MFEP in (φ, ψ, θ, ζ) bends in the direction of θ and ζ. This reveals the importance of these variables in describing the mechanism of the transition.

024106-10 Maragliano et al., J. Chem. Phys. 125, 024106 (2006)

The situation improves markedly if one uses the four dihedral angles, φ, ψ, θ, and ζ. Figure 7 shows the committor distributions calculated for the hypersurfaces (52) corresponding to images number 13 and 14 along the string, corresponding to α = 0.63 and α = 0.68, respectively. One hundred points were extracted from the ensemble restricted on these hypersurfaces, and 200 trajectories were launched from each of these points. The resulting distributions are peaked, which is an indication that the set of collective variables properly describes the reaction since it shows that the hypersurfaces (52) with α = 0.63 and α = 0.68 do indeed approximate the isocommittor surfaces. Figure 8 shows the distribution obtained for the transition ensemble on the hypersurface labeled by α_s, with α_s = 0.66, corresponding to committor value 1/2, obtained by linear interpolation between those at α = 0.63 and α = 0.68. The distribution is peaked at 1/2, indicating that we have chosen the correct surface.

The results above (namely, that the set (φ, ψ) is not sufficient to describe the reaction but the enlarged set (φ, ψ, θ, ζ) is) are consistent with those in Refs. 12 and 20. Two points are worth noting, however. First, the results in Refs. 12 and 20 were obtained by identifying a set of reactive trajectories first, then analyzing these trajectories to identify the transition state regions. This second step is quite a tedious one in practice, and by our technique we avoid it completely since we identify the isocommittor surfaces directly (i.e., without running reactive trajectories beforehand). Even though the validity of these surfaces (i.e., the validity of the collective variables chosen) must then be assessed a posteriori, the method that we propose is still substantially cheaper than the one in Refs. 12 and 20. The second point worth noting is a difference with Refs. 12 and 20 in terms of the protocol used to compute the committor distributions. In Ref. 20, these were calculated for ensembles constrained in a point of the collective variables space, while here we only restrict the system to be on the hypersurface defined in (52). This leads to a more severe test than the one used in Ref. 20, but also a more appropriate one, since the isocommittor surface is a dividing surface in the original state space which therefore cannot be reduced to a point in the collective variables space (it should be a surface of dimension N−1 in this space if the number of collective variables is N).

VI. CONCLUDING REMARKS

Let us summarize the highlights of this paper. We have shown that the minimum free energy path (MFEP) in a given set of collective variables is relevant to describe a reaction: the MFEP is the reaction pathway of maximum likelihood in the space of these collective variables. We have also shown how to compute the MFEP by combining the string method with a restrained sampling technique to calculate the mean force and the metric tensor [see the definition of the MFEPs in (7)].

The main advantage of this computational approach is that the update of the string involves a local calculation along the path. As a result, the cost of the technique presented here scales linearly with the number of images along

FIG. 6. Committor distribution of the transition ensemble in the hypersurface in the two angles MFEP with putative committor value 1/2. The flatness of this distribution indicates that this surface is not a good approximation of the isocommittor 1/2 surface, i.e., (φ, ψ) are not good collective variables to describe the reaction.

FIG. 7. Committor distributions for the hypersurfaces at the images along the string with α = 0.63 (dashed line) and α = 0.68 (solid line), which are the closest to the maximum of the free energy along the MFEP in the four angle MFEP.

FIG. 8. Committor distribution for the interpolated hypersurface with committor value 1/2 along the MFEP in four angles. The fact that this distribution is peaked at 1/2 indicates that the set (φ, ψ, θ, ζ) is a good set of collective variables to describe the reaction.

024106-11 String method in collective variables, J. Chem. Phys. 125, 024106 (2006)

L. Maragliano, A. Fischer, E. Vanden-Eijnden and G. Ciccotti, JCP 125, 024106 (2006)

Panels: 2 CVs, 4 CVs

Page 14: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Rare events

Figure 9 Four different potential or free-energy landscapes V(q, s). Alongside each are plotted the corresponding free energy, F(q*, s), and committor distribution, P(p_A), for the ensemble of microstates with q = q*. For landscape (a), the reaction coordinate is adequately described by q, and P(p_A) is peaked at p_A = 1/2. For landscape (b), the reaction coordinate has a significant component along s, as indicated by the barrier in F(q*, s) and the bimodal shape of P(p_A). In (c), s is again an important dynamical variable. In this case P(p_A) is nearly constant, suggesting that motion along s is diffusive when q is near q*. Finally, for landscape (d), the reaction coordinate is orthogonal to q, reflected by the single peak of P(p_A) near p_A = 0. In this case, almost none of the configurations belonging to the constrained ensemble with q = q* lie on the transition state surface.

Averaging variables over many examples of a transition does not provide equivalent information. Day et al., for example, have demonstrated that certain hydrogen bond angles change on average during transfer of an excess proton in liquid water (41). But in order to establish that the proton transfer mechanism can be described using only these coordinates, it will be necessary to compute the appropriate distribution of committors. Similarly, determining only the mean of a committor distribution does not provide information about the possible importance of orthogonal coordinates. In remarkable experimental studies of colloidal crystallization, Gasser et al. have, in effect, determined ⟨p_A⟩_R for various crystallite sizes R (50). The monotonic decrease of ⟨p_A⟩_R with increasing R, passing through ⟨p_A⟩_{R_c} ≃ 1/2 for a critical size R_c, indicates that cluster size is indeed correlated with the progress of nucleation. But it does not guarantee that the ensemble of configurations with R = R_c coincides with the transition state surface for nucleation.


P. G. Bolhuis, D. Chandler, C. Dellago and P. L. Geissler, Ann. Rev. Phys. Chem. 53, 291 (2002)

Page 15: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Rare events


P. G. Bolhuis, D. Chandler, C. Dellago and P. L. Geissler, Ann. Rev. Phys. Chem. 53, 291 (2002)

Techniques for the reconstruction of the free energy or identification of the most likely path in high dimensional spaces

Page 16: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Acceleration techniques in nD

• (Improved) Single Sweep Method (iSS)

• String method

Page 17: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

(Improved) Single Sweep

Strategy: separate the exploration of space from reconstruction of the free energy landscape

with springs. For the calculation of the mean force, harmonic potentials were added involving the dihedral angles, as requested from Eq. (35), with force constants k = 1000 kcal/(mol rad²). The CHARMM force subroutine was modified in order to obtain the quantities needed for the computation of the tensor defined in Eq. (6).

To start the computation of the MFEP with the string method, an initial condition is needed. First, we computed the end points by minimization using (33). The end points so obtained are (φ, ψ)_C7eq = (−83.2, 74.5) and (φ, ψ)_C7ax = (70, −70) for the two angle simulation, and (φ, ψ, θ, ζ)_C7eq = (−82.7, 73.5, 1.6, −4.3), (φ, ψ, θ, ζ)_C7ax = (70.5, −69.1, −0.8, 5.7) in the four angle case. We keep the names C7eq and C7ax for these end points, no matter the dimension of the space in which they are defined. Once the end points are identified, we take as starting condition for the string 20 images along a linear path connecting these end points, and then we follow the procedure described in Sec. IV C to update this string till convergence to the MFEP. To compute the mean force ∇_z F and the tensor M (step 1) we use the Nosé-Hoover integrator in CHARMM with a time step of 0.5 fs for a total of 500 000 steps. From the resulting trajectories, ∇_z F and M are calculated for each image using (41) and (42). The string is then updated using (46) with Δt = 0.02 (step 2). For the smoothing step (step 3), we use (48) with s = 0.1. Finally, the reparametrization step is performed as described in step 4 at every update. In order for the string to converge to the MFEP, about 100 updates were needed in two angles and about 250 in four angles. At the last update, simulations of up to 10^6 steps in the Nosé-Hoover integrator are performed to minimize the statistical error in ∇_z F and M.

Figure 2 shows the result of our MFEP calculation using (φ, ψ) and (φ, ψ, θ, ζ) as collective variables. In the latter case, the MFEP identified by the string is a curve in the four dimensional space (φ, ψ, θ, ζ) ∈ [0, 2π]^4. The projection of this MFEP in the space (φ, ψ) is shown in the figure. For graphical purposes, the paths are overimposed on the so-called adiabatic energy landscape of alanine dipeptide for the (φ, ψ) variables. This is defined as the surface of minimum potential energy of the system V(x) at fixed values of φ and ψ. The two angles path goes through the major saddle point in the landscape. It also passes through a flat region located around (φ, ψ) = (−60, −40). As for the four angles path, it can be seen that its projection on the (φ, ψ) plane differs from the MFEP obtained by working with the variables (φ, ψ) only. This indicates that the MFEP in (φ, ψ, θ, ζ) is curved in the θ and ζ directions. In order to visualize the variation in θ space, Fig. 3 shows the projections of the four dimensional MFEP in the space of (φ, θ) and (ψ, θ) variables. Also shown in Figs. 2 and 3 are the hyperplanes P(α) (see Sec. III D) associated with the images with maximum free energy along the MFEPs. These planes are defined by the equation

$$\sum_{j=1}^{N} n_j(\alpha)\,\bigl(z_j - z_j(\alpha)\bigr) = 0, \qquad (51)$$

meaning that they are lines when two angles are used and (z_1, z_2) = (φ, ψ), and three dimensional hyperplanes when four angles are used and (z_1, z_2, z_3, z_4) = (φ, ψ, θ, ζ) [in the latter case, what is shown in the figures is the intersection of the hyperplane with the planes φ-ψ (Fig. 2) or φ-θ and φ-ζ (Fig. 3)]. Figure 2 shows that the plane associated with the MFEP in two angles is very different from the one associated with the MFEP in four angles, and this already indicates that (φ, ψ) is not a set of collective variables large enough to describe the mechanism of the reaction [of course, this does

FIG. 2. Minimum free energy paths obtained by the string method using the two dihedral angles (φ, ψ) (gray curve) and the four angles (φ, ψ, θ, ζ) (black curve). The latter is the projection into the (φ, ψ) plane. The dots along the curves represent the images of the discretized strings. Notice that even though the MFEPs look close in this projected view, the MFEP in (φ, ψ, θ, ζ) is essentially four dimensional and cannot be represented using the variables φ and ψ alone (see Fig. 3). Notice also that the hyperplanes defined by (51) are very different in the two cases: the gray dashed line is the one associated with the MFEP in two angle (φ, ψ) space going through the image with maximum free energy along the MFEP (see Fig. 4), whereas the black dashed line is the intersection between the (φ, ψ) plane and the hyperplane associated with the MFEP in four angle (φ, ψ, θ, ζ) space going through the image with maximum free energy along the MFEP (see Fig. 4).

FIG. 3. Minimum free energy path obtained by the string method using the four dihedral angles (φ, ψ, θ, ζ) projected into the (φ, θ) plane (top panel) and the (ψ, θ) plane (bottom panel). The intersections of the hyperplane associated to the image with maximum free energy along the MFEP (see Fig. 4) with the planes (φ, θ) and (ψ, θ) are shown as dashed lines.

String method in collective variables, J. Chem. Phys. 125, 024106 (2006), p. 024106-9
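The update, smoothing and reparametrization steps quoted in the excerpt can be sketched in a few lines of Python. This is a deliberately simplified variant under stated assumptions, not the procedure of the paper: the string evolves on an assumed analytic 2D potential `V` rather than on a free energy estimated from restrained MD, the tensor M is replaced by the identity, the full gradient is used (with no perpendicular projection), and the smoothing rule is a plain neighbour average. Only the values 20 images, Δt = 0.02 and s = 0.1 are borrowed from the text.

```python
import numpy as np

def V(z):
    """Assumed toy potential: minima at (-1, 0) and (1, 0), saddle at (0, 0.5)."""
    x, y = z[..., 0], z[..., 1]
    return (x**2 - 1.0)**2 + 2.0 * (y - 0.5 * (1.0 - x**2))**2

def grad_V(z):
    x, y = z[..., 0], z[..., 1]
    u = y - 0.5 * (1.0 - x**2)            # offset from the curved valley floor
    return np.stack([4.0 * x * (x**2 - 1.0) + 4.0 * u * x, 4.0 * u], axis=-1)

def reparametrize(z):
    """Redistribute the images at equal arc length along the current string."""
    seg = np.linalg.norm(np.diff(z, axis=0), axis=1)
    s = np.concatenate([[0.0], np.cumsum(seg)])
    s_new = np.linspace(0.0, s[-1], len(z))
    return np.stack([np.interp(s_new, s, z[:, k]) for k in range(z.shape[1])],
                    axis=1)

def string_method(z_start, z_end, n_images=20, dt=0.02, s=0.1, n_updates=500):
    # Initial condition: images on a straight line between the two end points.
    t = np.linspace(0.0, 1.0, n_images)[:, None]
    z = (1.0 - t) * np.asarray(z_start, float) + t * np.asarray(z_end, float)
    for _ in range(n_updates):
        z = z - dt * grad_V(z)                                       # update
        z[1:-1] = (1.0 - s) * z[1:-1] + 0.5 * s * (z[:-2] + z[2:])   # smoothing
        z = reparametrize(z)                                         # reparametrization
    return z
```

Calling `string_method((-1.0, 0.0), (1.0, 0.0))` returns the converged images; the image with the largest `V` approximates the saddle region, which is where the hyperplane of Eq. (51) would be placed.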


Page 18: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

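(Improved) Single Sweep: step I

• Determining the relevant region of the CV space

$$T' \gg T$$

$$M\,\ddot{z} = -\nabla_{z}\Bigl(V(r) + \sum_i \frac{\kappa_i}{2}\,\bigl(\theta_i(r) - z_i\bigr)^2\Bigr) + \text{(thermostat @ } T'\text{)}$$

$$m\,\ddot{r} = -\nabla_{r}\Bigl(V(r) + \sum_i \frac{\kappa_i}{2}\,\bigl(\theta_i(r) - z_i\bigr)^2\Bigr) + \text{(thermostat @ } T\text{)}$$

L. Maragliano and E. Vanden-Eijnden, JCP 128, 184110 (2008)
M. Monteferrante, S. Bonella, SM and G. Ciccotti, Mol. Sim. 35, 1116 (2009)

$$\tau' \gg \tau$$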
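As an illustration of the two coupled equations, the following Python sketch integrates them for a one-dimensional toy system. All choices are assumptions made for the example rather than anything stated on the slide: V(x) = (x² − 1)², the collective variable θ(x) = x, Langevin thermostats integrated with a simple Euler-Maruyama scheme, and every numerical value. The extended variable z is heavier than x (M > m) and is thermostatted at the higher temperature T'.

```python
import numpy as np

def tamd(n_steps=50_000, dt=1e-3, kT=0.1, kT_z=1.0, kappa=100.0,
         m=1.0, M=10.0, gamma=1.0, seed=0):
    """Toy TAMD run; returns the trajectories of x (physical) and z (extended)."""
    rng = np.random.default_rng(seed)
    dV = lambda x: 4.0 * x * (x * x - 1.0)   # derivative of V(x) = (x^2 - 1)^2
    x, v = -1.0, 0.0                          # physical coordinate and velocity
    z, w = -1.0, 0.0                          # extended variable and velocity
    xs, zs = np.empty(n_steps), np.empty(n_steps)
    for i in range(n_steps):
        fx = -dV(x) - kappa * (x - z)         # -d/dx [V + kappa/2 (theta(x) - z)^2]
        fz = kappa * (x - z)                  # -d/dz [kappa/2 (theta(x) - z)^2]
        nx, nz = rng.standard_normal(2)
        # Langevin thermostats: x is coupled to T, z to the higher T'.
        v += dt * (fx / m - gamma * v) + np.sqrt(2.0 * gamma * kT / m * dt) * nx
        w += dt * (fz / M - gamma * w) + np.sqrt(2.0 * gamma * kT_z / M * dt) * nz
        x += dt * v
        z += dt * w
        xs[i], zs[i] = x, z
    return xs, zs
```

The intent of the construction is that z, being hot, explores the CV space beyond the barriers that would trap it at temperature T, while x, being light and stiffly coupled, follows z adiabatically.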

Page 19: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

(Improved) Single Sweep: step I

since no new center will be deposited. The accuracy of the reconstruction depends on the number of centers and the accuracy at which the mean force is computed in Eq. (11) much more than the precise locations where the centers are deposited. An important practical consequence is that it is rather straightforward to pick the parameters κ and γ in Eq. (8) since the final result is robust against variations in these parameters.

III. TWO-DIMENSIONAL ILLUSTRATIVE EXAMPLE

Since, given the location of the centers z_k, the mean force estimation at these points is quite standard, as a first illustration we use a two-dimensional example for which θ(x) ≡ x = (x, y) and A(z) ≡ V(x), where V(x) is the Mueller potential.24 In this case, there is no need to extend the system as in Eq. (8), and the temperature accelerated dynamics simply reduces to (setting γ = 1 by appropriate rescaling of time)

$$\dot{x} = -\nabla V(x) + \sqrt{2\beta^{-1}}\,\eta(t). \qquad (12)$$

Figure 1 shows a TAMD trajectory generated by solving Eq. (12) by forward Euler with the initial condition (x(0), y(0)) = (1, 0) and a time step of Δt = 2×10^−5 for 2×10^4 time steps at 1/β = 40 (for comparison the energy barrier between the two main minima of the Mueller potential is about 100). Also shown are the centers z_k ≡ (x_k, y_k) obtained by depositing a new center along the trajectory each time the trajectory reaches a point which is d = 0.175 away from all the previous centers. In this run, 174 centers were deposited. At the centers, we used −∇V(x_k, y_k) = f_k as estimate of the "mean force" (i.e., there is no sampling error in the present example). We then used these data to reconstruct the free energy as explained in Sec. II A. Figure 2 shows the residual per center e_2(σ) defined in Eq. (7). The optimal σ for this run was σ* = 0.398 and the condition number at this σ* was 7×10^6. The level sets of the reconstructed potential are shown in Fig. 3 and compared to those of the original Mueller potential, while Fig. 4 compares the values of the original and reconstructed Mueller potential along the TAMD trajectory shown in Fig. 1. As a simple estimate of the error, we used

$$e_1 = \frac{\int_\Omega \bigl|\tilde{V}(x) - V(x)\bigr|\,dx}{\int_\Omega \bigl|V(x)\bigr|\,dx}, \qquad (13)$$

where Ṽ(x) denotes the reconstructed potential and Ω is the domain in which the original potential remains less than 180 above its minimum value. The error defined in Eq. (13) for this calculation was e_1 = 4.2×10^−3.

These results, which are already very good, can be improved by diminishing d and thereby increasing the number of centers without having to increase the length of the TAMD trajectory. For example, by taking d = 0.12, we obtained 351 centers in a trajectory still 2×10^4 steps long.

FIG. 1. (Color) A trajectory generated by simulating Eq. (12) by forward Euler with a time step Δt = 2×10^−5 for 2×10^4 steps (white curve) shown above the contour plot of the Mueller potential (with 29 level sets evenly distributed between V = 0 and V = 180 in a scale where the minimum of the potential is V = 0). The red circles are the locations of the centers deposited along the trajectory using d = 0.175. In this run, 174 centers were deposited.

FIG. 2. Residual per center e_2(σ) defined in Eq. (7) for the reconstruction of the Mueller potential with the 2×10^4 steps single-sweep trajectory shown in Fig. 1. The optimal σ for this run was σ* = 0.398.

FIG. 3. (Color) Comparison between the level sets of the original Mueller potential (red curves) and the reconstructed potential using Eq. (2) (black curve). Here we use the 174 centers shown in Fig. 1. The optimal σ is σ* = 0.398 (see Fig. 2). We show 29 level sets evenly distributed between V = 0 and V = 180. The level sets of the reconstructed potential and the original one are in so close agreement that they can only be distinguished in some localized regions (e.g., near the saddle point between the two minima in the lower right corner).

L. Maragliano and E. Vanden-Eijnden, J. Chem. Phys. 128, 184110 (2008), p. 184110-4
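The sweep described in the excerpt (forward Euler on Eq. (12) plus deposition of centers by a distance criterion) can be sketched as follows. The Mueller potential coefficients below are the commonly quoted Mueller-Brown values, which the excerpt does not list, so treat them as an assumption; placing the first center at the initial condition is also an assumption. The time step, 1/β, d and the starting point are the ones quoted in the text. The number of centers obtained depends on the random seed, so this sketch makes no claim to reproduce the 174 centers of the paper.

```python
import numpy as np

# Commonly quoted Mueller-Brown parameters (assumed, not given in the excerpt).
A = np.array([-200.0, -100.0, -170.0, 15.0])
a = np.array([-1.0, -1.0, -6.5, 0.7])
b = np.array([0.0, 0.0, 11.0, 0.6])
c = np.array([-10.0, -10.0, -6.5, 0.7])
X0 = np.array([1.0, 0.0, -0.5, -1.0])
Y0 = np.array([0.0, 0.5, 1.5, 1.0])

def mueller(x, y):
    dx, dy = x - X0, y - Y0
    return float(np.sum(A * np.exp(a * dx * dx + b * dx * dy + c * dy * dy)))

def grad_mueller(x, y):
    dx, dy = x - X0, y - Y0
    e = A * np.exp(a * dx * dx + b * dx * dy + c * dy * dy)
    return np.array([np.sum(e * (2.0 * a * dx + b * dy)),
                     np.sum(e * (b * dx + 2.0 * c * dy))])

def sweep(n_steps=20_000, dt=2e-5, inv_beta=40.0, d=0.175, seed=0):
    """Forward-Euler integration of Eq. (12) with center deposition."""
    rng = np.random.default_rng(seed)
    x = np.array([1.0, 0.0])                 # initial condition (x(0), y(0)) = (1, 0)
    centers = [x.copy()]                     # assumption: first center at the start
    for _ in range(n_steps):
        x = (x - dt * grad_mueller(*x)
             + np.sqrt(2.0 * inv_beta * dt) * rng.standard_normal(2))
        # Deposit a new center once the walker is at least d from all old ones.
        if np.min(np.linalg.norm(np.asarray(centers) - x, axis=1)) >= d:
            centers.append(x.copy())
    return np.asarray(centers)
```

In this example the "mean force" at each center is simply `-grad_mueller(*center)`, mirroring the remark in the text that there is no sampling error.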


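$$M\,\ddot{z} = -\nabla_{z}\Bigl(V(r) + \sum_i \frac{\kappa_i}{2}\,\bigl(\theta_i(r) - z_i\bigr)^2\Bigr) + \text{(thermostat @ } T'\text{)}$$

$$m\,\ddot{r} = -\nabla_{r}\Bigl(V(r) + \sum_i \frac{\kappa_i}{2}\,\bigl(\theta_i(r) - z_i\bigr)^2\Bigr) + \text{(thermostat @ } T\text{)}$$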

Page 20: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

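(Improved) Single Sweep: step II

1. Expansion of the free energy on a (localized) basis set

– Basis set: Multidimensional gaussian

$$F_\theta(z) = \sum_k c_k\,\phi_k(z - z_k) = \sum_k c_k\,g_k(z - z_k;\Sigma_k)$$

$$g_k(z - z_k;\Sigma_k) = \sqrt{\frac{\det(\Sigma_k^{-1})}{(2\pi)^n}}\;\exp\Bigl(-\tfrac{1}{2}\,(z - z_k)^T\,\Sigma_k^{-1}\,(z - z_k)\Bigr)$$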
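The expansion can be evaluated directly once centers, coefficients and covariance matrices are given. The short Python sketch below does just that; the function names and the use of one full covariance matrix per center are illustrative assumptions.

```python
import numpy as np

def gaussian(z, zk, Sigma):
    """Normalized n-dimensional Gaussian g_k(z - z_k; Sigma_k)."""
    z, zk = np.asarray(z, float), np.asarray(zk, float)
    n = z.size
    Sinv = np.linalg.inv(np.asarray(Sigma, float))
    dz = z - zk
    norm = np.sqrt(np.linalg.det(Sinv) / (2.0 * np.pi) ** n)
    return norm * np.exp(-0.5 * dz @ Sinv @ dz)

def free_energy(z, centers, coeffs, Sigmas):
    """F(z) = sum_k c_k g_k(z - z_k; Sigma_k)."""
    return sum(ck * gaussian(z, zk, Sk)
               for ck, zk, Sk in zip(coeffs, centers, Sigmas))
```

For instance, with a single center at the origin, unit coefficient and Σ equal to the 2×2 identity, `free_energy([0, 0], [[0, 0]], [1.0], [np.eye(2)])` evaluates the normalization constant 1/(2π).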


Page 21: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

(Improved) Single Sweep: step II

2. (weighted) Least square fitting of the unknown coefficient:
   b. minimization of an objective function based on the mean force

• Mean forces computed by standard techniques, e.g. constrained MD, restrained MD, etc.

$$(c^*, \Sigma^*) = \operatorname{argmin} E(c,\Sigma) = \operatorname{argmin}\Bigl\{\sum_i \mu_i\,\bigl\| f(z_i) - \bigl(-\nabla_z F_\theta(z_i)\bigr) \bigr\|^2\Bigr\}$$


$$\mu_i = \frac{1}{\bigl|f(z_i) + \delta\bigr|^2}$$
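For fixed Gaussian widths the objective is quadratic in the coefficients, so the fit reduces to a weighted linear least-squares problem. The Python sketch below shows this under several assumptions that are not on the slide: isotropic Gaussians with a single shared width `sigma`, the weight read as μ_i = 1/(|f_i| + δ)² with a small regularizer `delta`, and hypothetical function names. The optimization over the widths themselves is left out.

```python
import numpy as np

def grad_basis(points, centers, sigma):
    """G[i, d, k] = d g_k / d z_d at points[i], for isotropic normalized Gaussians."""
    diff = points[:, None, :] - centers[None, :, :]        # (n_pts, n_centers, dim)
    dim = points.shape[1]
    norm = (2.0 * np.pi * sigma**2) ** (-dim / 2.0)
    g = norm * np.exp(-0.5 * np.sum(diff**2, axis=2) / sigma**2)
    grad = -(diff / sigma**2) * g[:, :, None]               # gradient of each Gaussian
    return np.transpose(grad, (0, 2, 1))                    # (n_pts, dim, n_centers)

def fit_coefficients(points, forces, centers, sigma, delta=1e-3):
    """Minimize sum_i mu_i |f_i + grad F(z_i)|^2 over the coefficients c."""
    G = grad_basis(points, centers, sigma)
    mu = 1.0 / (np.linalg.norm(forces, axis=1) + delta) ** 2   # assumed weight form
    w = np.sqrt(mu)
    lhs = (w[:, None, None] * G).reshape(-1, centers.shape[0])
    rhs = (-w[:, None] * forces).reshape(-1)                # f = -grad F  =>  G c = -f
    c, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return c
```

Setting all μ_i to 1 in this sketch gives the unweighted variant; the weighted form gives relatively more influence to points where the mean force is small.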

Page 22: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

(Improved) Single Sweep: step II

2. (weighted) Least square fitting of the unknown coefficient:
   b. Simulated annealing:

1. $(c', \Sigma') \xrightarrow{\text{random}} (c'', \Sigma'')$

2. $\alpha(' \to '') = \min\bigl(1, \exp(-\Delta E/\epsilon)\bigr)$
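The two steps above (random trial move, then acceptance with probability min(1, exp(−ΔE/ε))) amount to a Metropolis-style simulated annealing loop. A minimal Python sketch follows; the Gaussian trial moves, the geometric schedule for ε and all default values are assumptions made for illustration, and `objective` stands for any function returning E for a parameter vector.

```python
import numpy as np

def anneal(objective, p0, n_steps=5000, step=0.1,
           eps_start=1.0, eps_end=1e-3, seed=0):
    """Simulated annealing with acceptance min(1, exp(-dE / eps))."""
    rng = np.random.default_rng(seed)
    p = np.asarray(p0, float).copy()
    e = objective(p)
    best_p, best_e = p.copy(), e
    for i in range(n_steps):
        # Assumed geometric cooling of the control parameter eps.
        eps = eps_start * (eps_end / eps_start) ** (i / max(n_steps - 1, 1))
        trial = p + step * rng.standard_normal(p.shape)     # step 1: random move
        d_e = objective(trial) - e
        # Step 2: downhill moves always pass, uphill ones with exp(-dE / eps).
        if d_e <= 0.0 or rng.random() < np.exp(-d_e / eps):
            p, e = trial, e + d_e
            if e < best_e:
                best_p, best_e = p.copy(), e
    return best_p, best_e
```

In the single-sweep setting the parameter vector would collect the coefficients and widths, and `objective` would be E(c, Σ).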

Page 23: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

(Improved) Single Sweep: results

aluminium Al1 is defective. The values given above for the locations of both the minima and of the saddle point are averaged over the different reconstructions (the values were quite stable in the different reconstructions). Table 4 shows a considerable reduction of the residual when the flexibility of the radial basis is increased. The convergence of the free-energy barriers with increasing number of centres is quite fast both for the standard and the relative objective functions and the values obtained with different methods are always within 0.03 eV. This value (≈ k_B T in the process) is of the order of the typical thermal energy of the system and lower than the accuracy of a DFT calculation, so all our results lead to the same qualitative conclusions on the main features of the activated event. The variations of the standard method, however, introduce some improvements in the reconstruction. Let us focus on the 1 × CV runs that, in the examples considered so far, offer the best compromise among improvements in the reconstruction and added numerical cost of the calculation. Figure 6 shows contour lines of the free energy obtained for 37 (black curves) and 55 (red curves) centres. The upper panel is the result of the standard single sweep reconstruction, while in the lower panel the figures are obtained from 1 × CV calculations performed with the standard (right) and relative (left) objective functions, respectively. The reconstruction that employs the relative objective function shows a better trend to convergence in all relevant regions of the landscape. In particular, both the rise of the free energy from C2 ≈ (4.9, 5.7) to the saddle point and the region around the saddle converge faster and more regularly in this reconstruction. This regularity reflects a more accurate reconstruction of the gradient of the free energy at a set of important centres, some of which are shown in Figure 6. In the figure, the mean force calculated via Equation (6) (green arrows) is compared to that obtained using the gradient of the landscape reconstructed with 37 centres (blue arrows) when the standard (figure on the right) and relative (figure on the left) objective functions are used. The relative objective function consistently provides a more precise estimate of the direction and magnitude of gradients of smaller modulus. These gradients are in the regions where the contour lines obtained with the standard objective function and 37 centres show more relevant differences with those obtained with a larger set of centres. As for the Muller potential, the accurate estimate of the contribution of small gradients proves more important than the accuracy of the reconstruction at points of a large gradient. These are captured more accurately by the standard objective function (see gradient at (4.6, 4.8) in the figure), but this fact does not seem to improve the global reconstruction. Note also that the contour lines obtained with the relative objective function and 37 centres are parallel to the lines obtained with the larger set of centres

Figure 5. Scaled Muller potential (λ = 3): Contour lines of the potential from 0 to 140. The red curves are the exact contour lines, the black curves have been reconstructed using the standard single sweep method (upper panel) and 1 × CV runs (lower panel). In the bottom panel, the plot on the left shows the reconstruction with the standard objective function, the plot on the right the reconstruction with the relative objective function. In all figures, the potential was reconstructed using the same 92 centres (d = 0.09) deposed along a Langevin trajectory with T = 26.

M. Monteferrante et al., p. 1126


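$$g_k(z - z_k;\Sigma_k) = \sqrt{\frac{\det(\Sigma_k^{-1})}{(2\pi)^n}}\;\exp\Bigl(-\tfrac{1}{2}\,(z - z_k)^T\,\Sigma_k^{-1}\,(z - z_k)\Bigr)$$

$$(c^*, \Sigma^*) = \operatorname{argmin} E(c,\Sigma) = \operatorname{argmin}\Bigl\{\sum_i \mu_i\,\bigl\| f(z_i) - \bigl(-\nabla_z F_\theta(z_i)\bigr) \bigr\|^2\Bigr\}$$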

Page 24: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

(Improved) Single Sweep: V'H diffusion in alanates


clustering of the points. Since we verified that our centers were nonetheless placed in meaningful regions of the free energy landscape, we decided not to repeat the expensive calculation of the mean force on a set of new, equally spaced, points but rather to improve the characteristics of the available set. First, we added to the 43 TAMD centers 36 new points placed by hand, then we extracted subsets of points that respected the distance criterion for given choices of d. Three subsets were obtained this way using d = 0.1 (which selected 55 centers), d = 0.15 (37 centers), d = 0.2 (25 centers). The free energy landscape was then reconstructed with the three different subsets following the single-sweep procedure described in Sect. 3.2. The gradient of the free energy at the centers was computed using Eq. (14) with restrained ab initio molecular dynamics runs. The equilibrium averages were obtained with runs of T ≃ 2.2 ps, that ensured an error on the measure, as estimated by the variance associated to the average, of about 10^−2 eV. With this information, the linear system (16) was solved and the variance of the Gaussians in the basis was then optimized.

The free energy profile reconstructed with 55 centers is shown in Fig. 8. To test the accuracy of this calculation, the free energy surface reconstructed with increasing number of centers, and the results for the free energy barriers, were compared. In Fig. 9 we show the absolute value of the difference in the reconstructed free energies with 25 and 37 centers, and with 37 and 55 centers, respectively. As the profiles are defined within a constant, the figure was obtained by shifting the reconstructed surfaces so that the values at the minimum corresponding to an hexa-coordinated Al2 coincided. As it can be seen, the convergence with number of centers is quite fast, and the difference among the free energy calculated with 37 and 55 centers is less than 0.02 eV in the regions involved in the non-local diffusion, giving an accuracy sufficient for comparison with the experimental data. As a further test of convergence, we evaluated the relative residual, defined in terms of the objective function (13) and the mean forces (14) as

Fig. 8 Contour plot of the free energy F [eV] reconstructed with 55 centers as a function of the coordination numbers of alumina Al1 and Al2 (axes C_Al1 and C_Al2). The white circles superimposed to the plot are the positions of the centers used in the single-sweep reconstruction. The white curve is the converged steepest descent path computed using the string method


Fig. 9 Contour plot of the absolute value of the difference of the free energy profiles reconstructed via the single-sweep method with increasing number of centers (panels "ΔF [eV] 55 vs 37" and "ΔF [eV] 37 vs 25"; axes C_Al1 and C_Al2). The free energy scale is the same as in Fig. 8 for easy comparison, but the colormap has been inverted (here, white is low and dark is high whereas it is the opposite in Fig. 8) for esthetical purposes. In the bottom panel, the difference among the profiles obtained with 25 and 37 centers is shown. The upper panel presents the difference among the reconstructions with 37 and 55 centers

$$\mathrm{res} = \sqrt{\frac{E(a,\sigma)}{\sum_{j=1}^{J} |f_j|^2}} \qquad (19)$$

where, as before, a_j and σ denote the optimized parameters. With J = 55 centers, the residual was res = 0.2.
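Eq. (19) is straightforward to evaluate once the mean forces and the gradient of the fitted surface are known at the centers. The sketch below assumes that E is the unweighted objective, E = Σ_j |f_j + ∇F(z_j)|², which the excerpt refers to only as "objective function (13)"; the function name is hypothetical.

```python
import numpy as np

def relative_residual(forces, grad_F):
    """res = sqrt(E / sum_j |f_j|^2), with E = sum_j |f_j + grad F(z_j)|^2 (assumed)."""
    forces = np.asarray(forces, float)
    grad_F = np.asarray(grad_F, float)
    E = np.sum((forces + grad_F) ** 2)      # the mean force estimates -grad F
    return float(np.sqrt(E / np.sum(forces ** 2)))
```

A perfect fit gives res = 0, while a flat fitted surface (zero gradient everywhere) gives res = 1, which sets the scale for reading values such as the 0.2 quoted above.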


Fig. 7 Typical non-local diffusion event in the TAMD trajectory. The transferring hydrogen is tagged as thered sphere. The sequence, to be read from left to right, follows the event from the initial configuration in whichthe hydrogen is bound to the donor aluminum group to the accepting defective aluminum group

configuration of the system may not exist in general, although in our case it was found by setting λ = 10 Å^−1 and r_0 = 2 Å. This choice places the inflection point of the step function around the equilibrium Al-H distance and the force decays over a range of about 1 Å around this distance. Consequently, the boundaries of the coordination spheres of neighboring Al overlap, so that the force itself or the thermal fluctuations in the system can always drive the hydrogen exchange. Furthermore, the function is smooth enough for numerical integration.

With this choice of the parameters, the accelerated dynamics produced a considerable number of hydrogen hops between the aluminum atoms and, during the overall TAMD run, seven out of eight alumina became penta-coordinated at least once. Figure 7 shows a typical event. Reading the cartoon from left to right, the initial configuration of the defective Al group is roughly a square pyramid. The donor AlH6 group has one hydrogen, tagged as the red sphere in the figure, pointing in the direction of the base of the pyramid. It is this hydrogen that will hop between the two Al groups. As the hop proceeds there is very little rearrangement of the surrounding sodium atoms or of the bond structure of the two groups. The putative transition state, shown in the middle panel of the figure, is in fact quite symmetric with the transferring H midway between the groups. As the diffusion event proceeds, the hydrogen is captured by the acceptor that then rearranges very slightly to assume the hexa-coordinated octahedron geometry while the donor is now in a square base pyramid conformation. The analysis of the hydrogen transfer events in the TAMD trajectory shows three main features: (1) the hopping involves only two Al atoms per event with no apparent cooperative effects among different alumina; (2) there are no appreciable differences in the transfer dynamics based on the identity of the pair of aluminum atoms participating in the event; (3) most of the reactive events occur between a specific pair of aluminum atoms, involved in 18 hops over a total of 44. The average number of hops among other active pairs is 3. Thus we are led to focus on the pair of alumina among which the largest number of hydrogen hops occur since they can be considered as representative of all hops. Therefore, we reduced the number of collective variables from the eight coordination numbers of all Al in the cell to the two coordination numbers of the pair that participated in most hops. In the following, these alumina are called Al1 and Al2. Given the computational cost of the estimates of the gradient of the free energy in ab initio molecular dynamics, this reduction is important to make the analysis affordable. We now proceed to apply single-sweep to reconstruct a two dimensional free energy surface.

4.2.2 Radial basis reconstruction of the free energy

Due to an error in our implementation of the distance criterion described in the Method section, we produced a biased set of centers along the TAMD trajectory that suffered from


M. Monteferrante, S. Bonella, SM, E. Vanden-Eijnden, and G. Ciccotti. SMNS 15, 187 (2008)

Page 25: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

(Improved) Single Sweep: pathways of CO diffusion in Myoglobin

cavities and a few more cavities also found in recent studies.15,17,37

In two of the simulations, the ligand exited to the solvent directly from the distal pocket, passing close to histidine 64 (H64), i.e., through the so-called histidine gate.

Mean Force Calculations and PMF Reconstruction. By joining together all the TAMD trajectories, we obtained a list of values for the CO center of mass coordinates inside Mb. To reconstruct the PMF map, a set of values for the mean forces is needed, for example, computed at centers extracted from the list. As already done in previous studies,29-31 we used a distance criterion to choose the centers; i.e., we took z1 as the first element in the list and then ran along the list and extracted a new center each time an element was more than a prescribed distance d away from all the previous centers. By setting d = 2.5 Å, we obtained 239 centers. Figure 1 shows as red spheres the locations inside Mb of the 239 centers; the protein backbone is represented as ribbons, the heme residue is represented as sticks, and the locations of the xenon binding cavities are represented as yellow spheres. This and all other images in the paper were produced using the VMD program.57
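The distance criterion described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `select_centers` and the toy one-dimensional trajectory are assumptions made here for the example.

```python
import numpy as np

def select_centers(trajectory, d):
    """Greedy distance criterion: keep a point if it lies more than d
    away from every center selected so far (hypothetical helper)."""
    trajectory = np.asarray(trajectory, dtype=float)
    centers = [trajectory[0]]                      # z1 is the first element
    for z in trajectory[1:]:
        dists = np.linalg.norm(np.asarray(centers) - z, axis=1)
        if dists.min() > d:                        # far from all previous centers
            centers.append(z)
    return np.asarray(centers)

# Toy trajectory moving along x in steps of 1 (arbitrary units).
toy_traj = np.array([[float(i), 0.0, 0.0] for i in range(7)])
toy_centers = select_centers(toy_traj, d=2.5)
```

On the toy trajectory the selected centers are the points at x = 0, 3 and 6; with real TAMD output the list would be the joined CO center-of-mass coordinates.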

Mean forces were computed at these centers by simulating eq 2 with z(t) = z_k fixed and κ = 200 kcal/(mol Å²) and using eq 3. All other simulation settings were as described in the section “TAMD Simulations”. Each mean force calculation lasted 500 ps, at which length we observed convergence of the mean forces and of the PMF barriers after reconstruction. The cumulative simulation time is hence 120 ns, although we stress again that, all mean force calculations being independent of each other, they were distributed on different processing nodes. The PMF reconstruction was performed using Gaussian RBFs. The value of the optimal σ was 2.97 Å, at which the relative residual, defined as $[E(a,\sigma)/\sum_k |f_k|^2]^{1/2}$, is 0.64.
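As a rough illustration of the reconstruction step, the sketch below fits Gaussian RBF coefficients to mean forces in one dimension and evaluates the relative residual. Several things are assumptions made for this example and are not taken from the excerpt: the sign convention f_k = −∇A(z_k), the form E(a,σ) = Σ_k |∇A_rec(z_k) + f_k|², the function names, and the toy surface A(z) = z²/2.

```python
import numpy as np

def fit_rbf(centers, forces, sigma):
    """Least-squares RBF coefficients such that -dA/dz at each center
    matches the mean force there (assumed convention f = -dA/dz)."""
    diff = centers[:, None] - centers[None, :]      # z_k - z_j
    phi = np.exp(-diff**2 / (2.0 * sigma**2))       # Gaussian RBF values
    dphi = -diff / sigma**2 * phi                   # d(phi_j)/dz evaluated at z_k
    a, *_ = np.linalg.lstsq(dphi, -forces, rcond=None)
    residual = np.sum((dphi @ a + forces) ** 2)     # assumed form of E(a, sigma)
    rel_residual = np.sqrt(residual / np.sum(forces**2))
    return a, rel_residual

def free_energy(z, centers, a, sigma):
    """Reconstructed A(z), defined up to an additive constant."""
    return np.sum(a * np.exp(-(z - centers) ** 2 / (2.0 * sigma**2)))

# Toy data: mean forces of A(z) = z^2 / 2 sampled at nine centers.
centers = np.linspace(-2.0, 2.0, 9)
forces = -centers
a, rel_residual = fit_rbf(centers, forces, sigma=0.75)
delta_A = free_energy(1.0, centers, a, 0.75) - free_energy(0.0, centers, a, 0.75)
```

In practice σ would be scanned and the value minimizing the residual retained, as the text describes; here it is simply fixed.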

MFEPs as Ligand Migration Paths. Once the PMF surface is known, pathways for CO diffusion inside Mb are identified as MFEPs on the surface.25 An MFEP is defined as the curve whose tangent is always parallel to M∇A(z), where M is a metric tensor25 which, for the case of linear collective variables chosen here, is constant and diagonal. Hence, in the present case, the MFEPs coincide with the steepest descent paths from saddle points on the PMF surface. Since we have obtained an analytical approximation of this surface, such paths can be efficiently computed with the zero temperature string (ZTS) method.33,34 Given an initial guess for a curve on the PMF surface, the ZTS method finds the closest MFEP by moving a discrete set of points on the curve by steepest descent on the PMF landscape, at the same time keeping the points at constant distance from each other. The procedure requires the first derivatives of the underlying PMF surface, which can be easily obtained from eq 4, with the optimal a_k set and σ determined by the minimization procedure. Note that no more MD calculations are required at this stage.

Results and Discussion

PMF Map for CO Diffusion in Mb. Figure 2 shows four isosurfaces of the three-dimensional PMF map as obtained from our calculations. For illustrative purposes, we superpose the isosurfaces on a representative structure of Mb, extracted from one of our mean force simulations, and hence aligned consistently with the PMF map. A movie with a 360° view of the map at different energy levels (from 0.5 to 10 kcal/mol) is available online as a Web-enhanced object. The global minimum of the map is in correspondence with the Xe4 cavity. Energy levels in Figure 2 are, from left to right and top to bottom, 1.5, 3.5, 5.5, and 8.5 kcal/mol with respect to the global minimum. The overall shape of the isosurfaces of our map is in good qualitative agreement with similar maps computed in refs 37 and 38 and with the dissociated CO trajectories computed in refs 6, 15, 19, and 20.

Other local minima are in the DP and in the Xe1, Xe2, and Xe3 cavities. The energy of the DP minimum is 0.65 kcal/mol higher than that of Xe4. Xe1 and Xe3 are almost at the same energy as the DP, and Xe2 is 1.75 kcal/mol higher than Xe4. To check the quality of our reconstruction in the DP, we ran an unbiased simulation of the dissociated Mb:CO system with the CO in the distal pocket, with all simulation settings as described in the section “Simulation Setup”. After about 2 ns, the distributions of the CO center of mass coordinates are monomodal, with peaks at x = 6.25, y = 2.08, and z = -0.49, to be compared with the location of the local minimum in the DP of the PMF map, x = 7.85, y = 1.86, z = 0.04.

(57) Humphrey, W.; Dalke, A.; Schulten, K. J. Mol. Graphics 1996, 14, 33–38.

Figure 1. Set of 239 positions of the CO center of mass (red spheres) inside Mb, obtained from TAMD simulations, and used as locations to compute the mean forces. The protein’s backbone is represented as ribbons, the heme residue is represented as sticks, and the locations of the xenon cavities are represented as yellow spheres.

Figure 2. Four isosurfaces of the three-dimensional PMF map of the CO center of mass inside Mb, superposed on the protein structure. Energy levels are, from left to right and top to bottom, 1.5, 3.5, 5.5, and 8.5 kcal/mol with respect to the global minimum in the Xe4 cavity. The protein’s backbone is represented as ribbons, the heme residue is represented as sticks, and the locations of the xenon binding cavities are represented as yellow spheres.

1014 J. AM. CHEM. SOC. VOL. 132, NO. 3, 2010

A R T I C L E S Maragliano et al.


The locations of the minima in our map correlate well with results from previous experimental and theoretical work. Indeed, the CO molecule was found in the Xe1 cavity by time-resolved crystallography, in native Mb, in the pioneering study of ref 8 and in the L29W Mb mutant in ref 9. Other crystallographic studies of the photolyzed L29W10 mutant found the CO trapped in the Xe4 site. In a more recent study on Mb crystals at low temperature17 the CO was found to populate, at different times after photolysis, all xenon binding cavities. Previous theoretical studies also found CO PMF minima (or density peaks) in xenon cavities and the DP.6,15,19,20,28,37,38,58,59

In addition to the xenon binding sites and DP, our map shows other features that are similar to those obtained in other computational studies.15,20,37 In particular, we observe two local minima above Xe3 that are located in correspondence to cavities visited by the dissociated CO trajectory in refs 15 and 20 (they were called Ph1 and Ph2 in ref 15) and were also present in the map of ref 37. The ring-shaped structure above Xe3 is very similar to that obtained in ref 37. The shape of the isosurfaces in the DP region matches the plot of the positions occupied by dissociated CO in the extensive simulations of ref 20, where the ligand also visited the largest of the two minima we find in the proximity of the heme, close to the protein surface. Finally, the region above Xe4 toward the solvent was visited by the dissociated trajectories in refs 15, 19, and 20.

Pathways of CO Migration inside Mb. To accurately locate pathways and compute energy barriers for CO migration inside Mb, we use the string method to calculate MFEPs (i.e., the curves whose tangent is always parallel to ∇A(z), see the section “MFEPs as Ligand Migration Paths”) on the reconstructed PMF surface. These MFEPs are identified as migration pathways for CO inside Mb. Figure 3 shows our results. MFEPs are shown as yellow curves. Two isosurfaces of the PMF map are represented (red, 2 kcal/mol; blue, 5 kcal/mol; with respect to Xe4). White and black spheres represent, respectively, the locations of energy barriers and local minima along the pathways. Starting from the DP, a network of possible pathways is accessible to the dissociated CO molecule. The locations of the pathways we found are in excellent agreement with previous studies. In the massive simulations of dissociated CO in ref 20, nine different gates were identified for the ligand to exit/enter Mb. Each one of these gates is connected with one of our pathways. Moreover, most of the trajectories followed by the CO molecules in ref 20 are along some of our pathways.

On the experimental side, many of the residues that were found by random mutagenesis60 to affect the ligand-binding kinetics lie along our paths. In particular, all of the energy barriers we find are close to at least one of the residues identified in ref 60. This result, as already observed in refs 37 and 20, is an important step toward a more atomistically based interpretation of the mutagenesis results. The largest cluster of such residues is around the DP, where indeed we observe three different possible escape routes toward the solvent, all of them in proximity of at least one of the kinetically relevant residues. Among these paths is the one toward the so-called histidine gate, which for many years has been considered the only one possible for CO to enter/exit Mb.61 From our results, this is the shortest path connecting the DP and the solvent. It is also the only direct one, i.e., without intermediate minima along it. This might reflect the importance of this path in the escape process. Figure 4 shows a detailed view of the histidine gate path from our calculations. The yellow curve is the MFEP. In orange, we show the position of the H64 side chain in the crystal of Mb with CO bound, while in blue we show its position as we find it in the dissociated system configurations. It can be seen that, in the deoxy Mb configuration, the rotation of the side chain toward the solvent creates room for the pathway, opening the gate. This same mechanism has been proposed by several authors.61,62

It is important to compare the different PMF barriers that the CO molecule has to cross when moving along the paths. Figure 5 shows again the migration pathways of Figure 3, this time colored according to the value of the PMF along them. Energy barriers and local minima are represented as spheres,

(58) Kiyota, Y.; Hiraoka, R.; Yoshida, N.; Maruyama, Y.; Imai, T.; Hirata, F. J. Am. Chem. Soc. 2009, 131, 3852–3853.

(59) Nutt, D. R.; Meuwly, M. Proc. Natl. Acad. Sci. U.S.A. 2004, 101, 5998–6002.

(60) Huang, X.; Boxer, S. G. Nat. Struct. Biol. 1994, 1, 226–229.

(61) Scott, E. E.; Gibson, Q. H.; Olson, J. S. J. Biol. Chem. 2001, 276, 5177–5188.

(62) Perutz, M. F. Trends Biochem. Sci. 1989, 14, 42–44.

Figure 3. CO migration pathways inside Mb. The yellow curves are MFEPs computed with the string method on the CO PMF map and identify possible CO routes. Two isosurfaces of the PMF map are shown (red, 2.0 kcal/mol; blue, 5.0 kcal/mol; with respect to Xe4). The white and black spheres represent, respectively, the locations of the energy barriers and the local minima along the pathways. The yellow arrows represent the locations of the CO exits to the solvent as observed in the TAMD simulations. The protein’s backbone is represented as ribbons and the heme as sticks.

Figure 4. Illustration of the histidine gate as it results from our calculations. The yellow curve is the MFEP. Orange and blue sticks represent, respectively, the side chain of H64 in the crystal of Mb with CO bound and in a deoxy-Mb configuration from our simulations.


Network of Pathways of CO Diffusion in Myoglobin A R T I C L E S

L. Maragliano, G. Cottone, G. Ciccotti and E. Vanden-Eijnden, JACS 132, 1010 (2010)

Page 26: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

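String method

• When the number of CV is very large ($10$ to $10^5$) one could not reconstruct the free energy and must focus on the most likely path

• The system evolves according to

$$m\ddot{r} = -\nabla_r v(r) + @T$$

• The most likely parametric path $\{z(\alpha)\}_{\alpha=0,1}$

$$M_{i,j} = \langle \nabla_r \theta_i \cdot \nabla_r \theta_j \rangle_{\theta = z}$$

$$\left[ M \left( -\nabla_z F_\theta(z(\alpha)) \right) \right]^{\perp} = 0$$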

L. Maragliano, A. Fischer, E. Vanden-Eijnden and G. Ciccotti, JCP 125, 024126 (2006)

Page 27: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

String method

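where $\lambda_l$ and $\lambda_r$ are Lagrange multipliers determined by the constraints

$$|\varphi_l - \varphi_s| = |\varphi_r - \varphi_s| = h. \tag{30}$$

(29) is a discretized version of (27) because

$$\nabla V(\varphi_l) = H(\varphi_s)(\varphi_l - \varphi_s) + O(h^2) \tag{31}$$

and similarly for $\nabla V(\varphi_r)$: here we used $\nabla V(\varphi_s) = 0$ and $|\varphi_l - \varphi_s| = h$.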

In practice, (29) can be solved by a two-step procedure. At each time step, $\varphi_r$ and $\varphi_l$ are first evolved by the potential force to give intermediate values,

$$\varphi_l^{*} = \varphi_l^{n} - \Delta t\, \nabla V(\varphi_l^{n}), \tag{32}$$

and similarly for $\varphi_r^{*}$; then the constraints in (30) are enforced by projecting $\varphi_l^{*}$ and $\varphi_r^{*}$ to the sphere $S_{\varphi_s,h}$ with center $\varphi_s$ and radius $h$,

$$\varphi_l^{n+1} = \varphi_s + h\, \frac{\varphi_l^{*} - \varphi_s}{|\varphi_l^{*} - \varphi_s|} \tag{33}$$

and similarly for $\varphi_r^{*}$. The steady-state solution of the procedure above is used in (28) to calculate the tangent vector $\tau_s$. The parameter $h$ in (28) should be chosen as small as possible without impeding the accuracy with round-off errors: if the digital precision is $\mathrm{TOL}_{\min}$, one should choose $h = \mathrm{TOL}_{\min}^{1/2}$, in which case the error due to finite difference in (28) remains $O(h^2) = O(\mathrm{TOL}_{\min})$.

Notice that the time step $\Delta t$ in (32) can be chosen independently of $h$ without impeding on the accuracy because (31) implies that $\nabla V(\varphi_l) = O(h)$ and $\nabla V(\varphi_r) = O(h)$. As a result $|\varphi_l^{*} - \varphi_l^{n}| = O(h)$ and $|\varphi_r^{*} - \varphi_r^{n}| = O(h)$ and the two steps in the procedure above do not interfere with the accuracy regardless of what $\Delta t$ is. Since the convergence of the solution of (29) is exponential in time, the number of steps $n_{\mathrm{step}}$ required to achieve a given accuracy TOL on $\tau_s$ scales as in (23).

Note that the above procedure brings $\varphi_r$ and $\varphi_l$ to the minima of the potential energy $V$ on the sphere $S_{\varphi_s,h}$ by steepest descent dynamics. More efficient constrained optimization methods can be used as well to improve the convergence rate and save computational cost.15
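The two-step procedure of (32) and (33) can be tried on a toy quadratic saddle where the answer is known from the Hessian. The sketch below is an illustration under stated assumptions: eq (28) is not reproduced in this excerpt, so the tangent is taken here as the normalized difference $\varphi_r - \varphi_l$; the saddle location, the Hessian, the starting direction and all function names are hypothetical.

```python
import numpy as np

def relax_on_sphere(grad_V, phi, phi_s, h, dt):
    """One iteration: steepest descent step (32), then projection (33)."""
    star = phi - dt * grad_V(phi)
    return phi_s + h * (star - phi_s) / np.linalg.norm(star - phi_s)

def unstable_direction(grad_V, phi_s, start_dir, h=1e-4, dt=0.1, n_steps=500):
    d = np.asarray(start_dir, dtype=float)
    d = d / np.linalg.norm(d)
    phi_l, phi_r = phi_s - h * d, phi_s + h * d
    for _ in range(n_steps):
        phi_l = relax_on_sphere(grad_V, phi_l, phi_s, h, dt)
        phi_r = relax_on_sphere(grad_V, phi_r, phi_s, h, dt)
    diff = phi_r - phi_l                      # assumed form of the tangent estimate
    return diff / np.linalg.norm(diff)

# Toy saddle: V = 0.5 (phi - phi_s)^T H (phi - phi_s), eigenvalues -1 and 2.
angle = np.pi / 6.0
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
H = R @ np.diag([-1.0, 2.0]) @ R.T
phi_s = np.array([0.3, -0.2])
grad_V = lambda phi: H @ (phi - phi_s)

tau_s = unstable_direction(grad_V, phi_s, start_dir=[1.0, 0.0])
expected = np.linalg.eigh(H)[1][:, 0]         # eigenvector of the negative eigenvalue
```

The estimate `tau_s` should agree with `expected` up to sign. For a general potential, the choice of `h` trades finite-difference error against round-off as discussed in the text; the value used here is arbitrary.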

C. Illustrative example

In this example, we calculate the MEP, one of the saddle points, and the associated unstable direction for the Mueller potential.13

In the calculation, we first identify an approximation of the MEP using the improved string method with N = 10 images. Cubic splines were used in the reparametrization and the forward Euler method with $\Delta t = 4.5 \times 10^{-4}$ was used in the integration. After 70 time steps when d defined in (18) is less than 0.1, we stop the string calculation, and identify the image of maximum energy along the string, $\varphi_s^0$, and the corresponding $\tau_s^0$. Then we switch to the climbing image algorithm described in Sec. V A to improve $\varphi_s^0$, using again $\Delta t = 4.5 \times 10^{-4}$ in (22). The numerical result is shown in the upper panel of Fig. 2. The figure shows the initial string (dashed line) and the calculated MEP (filled circles). The background shows the contour lines of the Mueller potential. There is an intermediate metastable state along the MEP, and accordingly there are two saddle points. The empty circle on the MEP indicates the location of the saddle point $\varphi_s$ with higher energy, obtained by the climbing image technique. After convergence, the norm of the potential force at $\varphi_s$, $|\nabla V(\varphi_s)|$, is smaller than $10^{-12}$. It takes 188 time steps to reach this accuracy. The convergence history for the calculation of the saddle point is shown in the lower panel of Fig. 2. The error decays exponentially with the iteration number or time step n.

We then proceeded to calculate the unstable direction at $\varphi_s$ using the algorithm described in Sec. V B. We compared the accuracy of the numerical results for different choices of h.2,3,5,15 The numerical result is shown in the upper panel of Fig. 3. Here the error is calculated by

FIG. 2. Upper panel: Initial string and calculated MEP using the string method with ten images (the images are shown as filled circles; the lines are the curves interpolated across these images; the vertical line is the initial string and the other one is the calculated MEP). The empty circle indicates the saddle point identified by combining the string method with the climbing image technique. The norm of the residual potential force at $\varphi_s$ is smaller than $10^{-12}$, $|\nabla V(\varphi_s)| < 10^{-12}$. The background shows the contour lines of the Mueller potential. Lower panel: The norm of the force on the climbing image $|\nabla V(\varphi_s)|$ vs the number n of iterations or time steps. The convergence is exponential in time.

164103-6 E, Ren, and Vanden-Eijnden J. Chem. Phys. 126, 164103 (2007)



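$$\left[ M \left( -\nabla_z F_\theta(z(\alpha)) \right) \right]^{\perp} = 0$$

$$\frac{dz(\alpha,\tau)}{d\tau} = -\left[ M \nabla_z F(z(\alpha,\tau)) \right]^{\perp}$$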
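The perpendicular projection in the evolution equation above can be made concrete with a short sketch. It assumes that $[\cdot]^{\perp}$ removes the component along the unit tangent of the discretized path, and that tangents are estimated by finite differences along the image index; the names `unit_tangents` and `string_velocity`, the toy path and the toy gradient are hypothetical.

```python
import numpy as np

def unit_tangents(z):
    """Finite-difference tangents along the image index, normalized."""
    t = np.gradient(z, axis=0)
    return t / np.linalg.norm(t, axis=1, keepdims=True)

def string_velocity(z, grad_F, M):
    """Right-hand side -[M grad F]^perp for every image of the string."""
    t = unit_tangents(z)
    v = np.einsum("nij,nj->ni", M, grad_F)             # M applied image by image
    v_perp = v - np.sum(v * t, axis=1, keepdims=True) * t
    return -v_perp

# Toy string in two collective variables with an identity metric.
alpha = np.linspace(0.0, 1.0, 8)
z = np.stack([alpha, alpha**2], axis=1)
grad_F = np.stack([2.0 * z[:, 0], np.ones(len(z))], axis=1)   # made-up gradient
M = np.tile(np.eye(2), (len(z), 1, 1))
dz_dtau = string_velocity(z, grad_F, M)
```

By construction the returned velocity has no component along the tangent, so images only move transversally; a separate reparametrization step would still be needed to keep them evenly spaced.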

Page 28: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

String method


where !l and !r are Lagrange multipliers determined by theconstraints

!"l − "s! = !"r − "s! = h. " 30#

"29# is a discretized version of "27# because

!V" "l# = H" "s#" "l − "s# + O" h2# " 31#

and similarly for !V" "r#: here we used !V" "s# and !"l−"s!=h.

In practice, (29) can be solved by a two-step procedure. At each time step, $\varphi_r$ and $\varphi_l$ are first evolved by the potential force to give intermediate values,

$$\varphi_l^{*} = \varphi_l^{n} - \Delta t\, \nabla V(\varphi_l^{n}), \tag{32}$$

and similarly for $\varphi_r^{*}$; then the constraints in (30) are enforced by projecting $\varphi_l^{*}$ and $\varphi_r^{*}$ to the sphere $S_{\varphi_s,h}$ with center $\varphi_s$ and radius $h$,

$$\varphi_l^{n+1} = \varphi_s + h\,\frac{\varphi_l^{*} - \varphi_s}{|\varphi_l^{*} - \varphi_s|}, \tag{33}$$

and similarly for $\varphi_r^{*}$. The steady-state solution of the procedure above is used in (28) to calculate the tangent vector $\tau_s$.

The parameter $h$ in (28) should be chosen as small as possible without impeding the accuracy with round-off errors: if the digital precision is $\mathrm{TOL}_{\min}$, one should choose $h = \mathrm{TOL}_{\min}^{1/2}$, in which case the error due to finite difference in (28) remains $O(h^2) = O(\mathrm{TOL}_{\min})$.
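As a small worked illustration of this rule of thumb, the sketch below assumes that the digital precision $\mathrm{TOL}_{\min}$ is the machine epsilon of IEEE double-precision floats. That identification is an assumption made here for illustration, not a statement taken from the paper.

```python
import math
import sys

# Assumption: TOL_min is taken to be the double-precision machine epsilon.
tol_min = sys.float_info.epsilon   # 2**-52, about 2.2e-16
h = math.sqrt(tol_min)             # 2**-26, about 1.5e-8

# The finite-difference error then scales like h**2, i.e. like TOL_min itself.
fd_error_scale = h * h
print(f"TOL_min = {tol_min:.3e}, h = {h:.3e}, h^2 = {fd_error_scale:.3e}")
```

With this choice the sphere radius is roughly eight orders of magnitude smaller than a unit length scale, while the finite-difference error stays at the level of the arithmetic precision.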

Notice that the time step $\Delta t$ in (32) can be chosen independently of $h$ without compromising the accuracy because (31) implies that $\nabla V(\varphi_l) = O(h)$ and $\nabla V(\varphi_r) = O(h)$. As a result $|\varphi_l^{*} - \varphi_l^{n}| = O(h)$ and $|\varphi_r^{*} - \varphi_r^{n}| = O(h)$, and the two steps in the procedure above do not interfere with the accuracy regardless of what $\Delta t$ is. Since the convergence of the solution of (29) is exponential in time, the number of steps $n_{\text{step}}$ required to achieve a given accuracy TOL on $\tau_s$ scales as in (23).

Note that the above procedure brings $\varphi_r$ and $\varphi_l$ to the minima of the potential energy $V$ on the sphere $S_{\varphi_s,h}$ by steepest descent dynamics. More efficient constrained optimization methods can be used as well to improve the convergence rate and save computational cost [15].
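The evolve-then-project iteration of Eqs. (32) and (33) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy potential $V(x,y) = (x^2-1)^2 + y^2$ (saddle at the origin, unstable direction along $x$) is chosen here for testability and is not from the paper; Eq. (28), which is not reproduced in this excerpt, is assumed to be the normalized difference $(\varphi_r - \varphi_l)/|\varphi_r - \varphi_l|$; the values of `dt`, `n_steps`, and `start_angle` are arbitrary.

```python
import math
import sys

def grad_V(p):
    """Gradient of the toy potential V(x, y) = (x^2 - 1)^2 + y^2 (hypothetical test case)."""
    x, y = p
    return (4.0 * x * (x * x - 1.0), 2.0 * y)

def project_to_sphere(p, center, h):
    """Eq. (33): rescale p - center to length h, keeping its direction."""
    dx, dy = p[0] - center[0], p[1] - center[1]
    norm = math.hypot(dx, dy)
    return (center[0] + h * dx / norm, center[1] + h * dy / norm)

def unstable_direction(phi_s, h, dt, n_steps, start_angle=1.0):
    """Iterate Eqs. (32)-(33) for phi_l and phi_r, then form the tangent."""
    c, s = math.cos(start_angle), math.sin(start_angle)
    phi_r = (phi_s[0] + h * c, phi_s[1] + h * s)   # start on opposite sides
    phi_l = (phi_s[0] - h * c, phi_s[1] - h * s)   # of the sphere
    for _ in range(n_steps):
        updated = []
        for p in (phi_l, phi_r):
            g = grad_V(p)
            star = (p[0] - dt * g[0], p[1] - dt * g[1])        # Eq. (32)
            updated.append(project_to_sphere(star, phi_s, h))  # Eq. (33)
        phi_l, phi_r = updated
    # Assumed form of Eq. (28): normalized difference of the two points.
    dx, dy = phi_r[0] - phi_l[0], phi_r[1] - phi_l[1]
    norm = math.hypot(dx, dy)
    return (dx / norm, dy / norm)

h = math.sqrt(sys.float_info.epsilon)  # h = TOL_min^(1/2) for double precision
tau_s = unstable_direction((0.0, 0.0), h, dt=0.05, n_steps=200)
print(tau_s)
```

For this toy saddle the Hessian is diagonal with eigenvalues $-4$ and $2$, so the two points slide along the sphere toward $(\pm h, 0)$ and the returned tangent aligns with the $x$ axis, independently of the particular `dt` used.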

C. Illustrative example

In this example, we calculate the MEP, one of the saddle points, and the associated unstable direction for the Mueller potential [13].

In the calculation, we first identify an approximation of the MEP using the improved string method with $N = 10$ images. Cubic splines were used in the reparametrization and the forward Euler method with $\Delta t = 4.5 \times 10^{-4}$ was used in the integration. After 70 time steps, when $d$ defined in (18) is less than 0.1, we stop the string calculation and identify the image of maximum energy along the string, $\varphi_s^{0}$, and the corresponding $\tau_s^{0}$. Then we switch to the climbing image algorithm described in Sec. V A to improve $\varphi_s^{0}$, using again $\Delta t = 4.5 \times 10^{-4}$ in (22). The numerical result is shown in the upper panel of Fig. 2. The figure shows the initial string (dashed line) and the calculated MEP (filled circles). The background shows the contour lines of the Mueller potential. There is an intermediate metastable state along the MEP, and accordingly there are two saddle points. The empty circle on the MEP indicates the location of the saddle point $\varphi_s$ with higher energy, obtained by the climbing image technique. After convergence, the norm of the potential force at $\varphi_s$, $|\nabla V(\varphi_s)|$, is smaller than $10^{-12}$. It takes 188 time steps to reach this accuracy. The convergence history for the calculation of the saddle point is shown in the lower panel of Fig. 2. The error decays exponentially with the iteration number or time step $n$.
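A rough sketch of the first stage of this calculation (string evolution plus reparametrization, without the climbing image refinement) is given below. Several choices are assumptions made for illustration rather than details from the paper: the Mueller potential coefficients are the values commonly quoted in the literature, the initial string is a straight line between two hand-picked points (the excerpt does not give the coordinates of the paper's vertical initial string), linear interpolation replaces the cubic splines, and the number of steps is arbitrary.

```python
import math

# Mueller potential coefficients (commonly quoted values; not listed in the excerpt).
A = (-200.0, -100.0, -170.0, 15.0)
a = (-1.0, -1.0, -6.5, 0.7)
b = (0.0, 0.0, 11.0, 0.6)
c = (-10.0, -10.0, -6.5, 0.7)
X0 = (1.0, 0.0, -0.5, -1.0)
Y0 = (0.0, 0.5, 1.5, 1.0)

def V(x, y):
    """Mueller potential energy at (x, y)."""
    total = 0.0
    for k in range(4):
        dx, dy = x - X0[k], y - Y0[k]
        total += A[k] * math.exp(a[k] * dx * dx + b[k] * dx * dy + c[k] * dy * dy)
    return total

def grad_V(x, y):
    """Analytic gradient of the Mueller potential."""
    gx = gy = 0.0
    for k in range(4):
        dx, dy = x - X0[k], y - Y0[k]
        e = A[k] * math.exp(a[k] * dx * dx + b[k] * dx * dy + c[k] * dy * dy)
        gx += e * (2.0 * a[k] * dx + b[k] * dy)
        gy += e * (b[k] * dx + 2.0 * c[k] * dy)
    return gx, gy

def reparametrize(images):
    """Place images at equal arc length along the polyline (linear interpolation)."""
    n = len(images)
    s = [0.0]  # cumulative arc length at each image
    for (x0, y0), (x1, y1) in zip(images, images[1:]):
        s.append(s[-1] + math.hypot(x1 - x0, y1 - y0))
    new_images = [images[0]]
    j = 0
    for i in range(1, n - 1):
        target = s[-1] * i / (n - 1)
        while s[j + 1] < target:
            j += 1
        w = (target - s[j]) / (s[j + 1] - s[j])
        (x0, y0), (x1, y1) = images[j], images[j + 1]
        new_images.append((x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
    new_images.append(images[-1])
    return new_images

def string_method(images, dt, n_steps):
    """Alternate a forward Euler step on every image with a reparametrization."""
    for _ in range(n_steps):
        moved = []
        for x, y in images:
            gx, gy = grad_V(x, y)
            moved.append((x - dt * gx, y - dt * gy))
        images = reparametrize(moved)
    return images

N = 10
start, end = (-0.5, 1.5), (0.6, 0.0)  # hypothetical end points near two minima
initial = [(start[0] + (end[0] - start[0]) * i / (N - 1),
            start[1] + (end[1] - start[1]) * i / (N - 1)) for i in range(N)]
final = string_method(initial, dt=4.5e-4, n_steps=200)
energies = [V(x, y) for x, y in final]
i_max = max(range(N), key=lambda i: energies[i])
print("highest-energy image:", final[i_max], energies[i_max])
```

The highest-energy image found this way plays the role of the starting guess $\varphi_s^{0}$ for a subsequent saddle point refinement; with only ten images and linear interpolation it should be read as a coarse estimate.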

We then proceeded to calculate the unstable direction at $\varphi_s$ using the algorithm described in Sec. V B. We compared the accuracy of the numerical results for different choices of $h$ [2,3,5,15]. The numerical result is shown in the upper panel of Fig. 3. Here the error is calculated by

FIG. 2. Upper panel: Initial string and calculated MEP using the string method with ten images (the images are shown as filled circles; the lines are the curves interpolated across these images; the vertical line is the initial string and the other one is the calculated MEP). The empty circle indicates the saddle point identified by combining the string method with the climbing image technique. The norm of the residual potential force at $\varphi_s$ is smaller than $10^{-12}$, $|\nabla V(\varphi_s)| < 10^{-12}$. The background shows the contour lines of the Mueller potential. Lower panel: The norm of the force on the climbing image $|\nabla V(\varphi_s)|$ vs the number $n$ of iterations or time steps. The convergence is exponential in time.


dz(�, τ)/dτ = −M ∇_z F(z(�, τ))

Computed by CMD or RMD at present

⟨M [−∇_z F_θ(z(�))]⟩ ≟ 0

Page 29: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

String method: step I

dz(�, τ)/dτ = −M ∇_z F(z(�, τ))

Page 30: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

String method: step I

Page 31: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

String method: step II

Reparametrization step

Page 32: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

String method: wetting of hydrophobic textured surfaces

Activated wetting of nanostructured surfaces: reaction coordinates, finite size effects, and simulation pitfalls

FIG. 1. (Left) Atomistic systems formed by 3 × 3 and 1 × 1 pillars. (Right) Discretization of the space corresponding to the different sets of CVs for the 3 × 3 (1, 9, and 864 CVs) and 1 × 1 (1 and 96 CVs) pillar systems. The coarse-grained density field corresponds to the collection of local densities computed within the yellow parallelepipeds shown in the figure, which have size Δx × Δy × Δz = ?? nm × ?? nm × ?? nm [insert dimensions], respectively.

CVs and from the size of the computational sample, and the correlations between these two simulation parameters. The systems investigated are shown in Fig. 1.

Understanding the effect of the coarse graining and system size on simulation results, whether or not they introduce any sizable artifact, is crucial to assess theories on the wetting and recovery mechanism and energetics and to provide a practical guideline for future simulations of this process on even more complex surface topographies. This will help moving forward toward the in silico design of superhydrophobic surfaces with tailored properties. Such properties indeed depend on the surface characteristics (topography and chemistry) via the wetting/dewetting path, which should therefore be captured accurately in simulations.

The article is organized as follows. In Sec. II the atomistic model, the string method, and the restrained molecular dynamics method are introduced. In Sec. III the results are presented and discussed. Finally, Sec. IV is left for conclusions.


FIG. 3. a) Configurations of the meniscus along the wetting process for 1 (red frame), 9 (blue frame) and 864 (green frame) CVs. The color code (bottom left) helps identifying the distance of the meniscus from the bottom wall: blue when the meniscus is at the pillars top, red when it touches the bottom, and green in between. b) Nondimensional Euclidean distance Δρ between density fields at successive points along the string (see Eq. (8)). Consistently with the top panel, red, blue and green curves refer to 1, 9, and 864 CVs, respectively.

The meniscus is identified using the Gibbs dividing surface, i.e., the locus of points where the fluid density coincides with the mean value between the liquid $\rho_l$ and vapor $\rho_v$ bulk values, $(\rho_l + \rho_v)/2 \approx 0.375$. Thus, computing the meniscus from an ensemble of atomistic configurations requires computing a coarse-grained density field, $\rho(x)$. For this (computationally inexpensive) post-processing operation we use a finer level of coarse graining than that used for the CVs, namely a 60 × 60 × 30 points grid (cell ≈ 1 × 1 × 1 σ).
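The post-processing step described here can be sketched as follows. This is only an illustration under stated assumptions: the density is a plain histogram averaged over frames, the liquid is assumed to sit above the vapor along z, and the function names, box and grid sizes are hypothetical rather than those of the original study.

```python
import numpy as np

def coarse_grained_density(frames, box, n_cells):
    """Number density on a regular grid, averaged over a list of (N, 3) position arrays."""
    edges = [np.linspace(0.0, box[d], n_cells[d] + 1) for d in range(3)]
    cell_volume = np.prod(np.asarray(box, dtype=float) / np.asarray(n_cells))
    counts = sum(np.histogramdd(f, bins=edges)[0] for f in frames)
    return counts / (len(frames) * cell_volume)

def meniscus_height(density, box_z, rho_liquid, rho_vapor):
    """Gibbs dividing surface z(x, y): lower edge of the first cell, scanning upward,
    whose density reaches the mean of the bulk liquid and vapor values.
    Caveat: a column with no such cell is reported at z = 0."""
    threshold = 0.5 * (rho_liquid + rho_vapor)
    z_edges = np.linspace(0.0, box_z, density.shape[2] + 1)
    first_liquid = np.argmax(density >= threshold, axis=2)
    return z_edges[first_liquid]

# Synthetic check: one particle at the centre of every cell with z index >= 3.
ii, jj, kk = np.meshgrid(np.arange(4), np.arange(4), np.arange(3, 8), indexing="ij")
frame = np.stack([ii, jj, kk], axis=-1).reshape(-1, 3) + 0.5
density = coarse_grained_density([frame], box=(4.0, 4.0, 8.0), n_cells=(4, 4, 8))
height = meniscus_height(density, 8.0, rho_liquid=1.0, rho_vapor=0.0)
```

In a real trajectory the cell counts fluctuate, so the average over many frames matters, and interpolating the threshold crossing between neighbouring cells would give a smoother surface than the cell-edge value returned here.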

The analysis of the meniscus shapes along the wetting process reveals that the collapse mechanism consists of three steps (A-C) (Fig. 3a). In step A, the liquid, initially pinned at the pillars top, starts to progressively fill the inter-pillar space, with the meniscus assuming an essentially flat conformation parallel to the bottom wall. This step is similar for all the


FIG. 4. a) Free-energy profiles, Ω/(k_B T), for the 1, 9, and 864 CVs sets. Error bars are computed via the procedure described in the Supplementary Materials. In the inset the free-energy profile for the 1 CV case is reported to underscore the change of slope of this curve associated to the morphological transitions observed in step B. b) Entropy, S/k_B, along the wetting path for the same sets. The reference is chosen such that the entropy is zero in the Cassie state. The entropy is computed from the free energy by subtracting the enthalpy, which, in turn, is computed as the sum of the expectation value of the Hamiltonian ⟨H⟩ and the −PV term of the liquid and vapor phases. The entropy profile is noisy due to the corresponding noisy signal of ⟨H⟩. The dashed line represents the entropy of an ideal gas of density 0.045, which linearly decreases with the reduction of the amount of vapor in the corrugations.

followed by the liquid wetting the surface textures.

The difference in the wetting paths is reflected in the difference among the free energy profiles (Fig. 4). In the case of 1 CV we observe an extended, relatively flat domain of the free energy profile in step B. Indeed, careful analysis shows that this region is composed of two linear segments with a different slope. The discontinuity of the first derivative is observed in correspondence of the transitions from one morphology to another (inset of Fig. 4a), namely i) from the flat meniscus to the liquid partly wetting the bottom wall (transition from step


M. Amabili, A. Giacomello, SM, and C. M. Casciola, Phys. Rev. Fluids 2, 034202 (2017)

Page 33: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

String method: hydrophobic collapse of proteins

restrained to be empty, ensure that the reference distribution is recovered in detail. Here, $k_BT$ denotes Boltzmann's constant times temperature $T$.

Defining the MFEP. We used the coarse-graining procedure to obtain collective variables that describe the position of the hydrophobic chain and the water density field. Specifically, let $x = (x_c, w)$ be the position vector of length $n = 3 \times 12 + 3 \times 3 \times N$ for the atomistic representation of the entire system, where $x_c$ is the position vector for the hydrophobes in the chain and $w$ is the position vector for the atoms in the water molecules. Then, $z(x) = (x_c, P)$ is the vector of length $\mathcal{N} = 3 \times 12 + 48 \times 48 \times 56$ for the collective variable representation of the system, where the elements of $P$ are defined in Eq. 1.

The MFEP is a curve in the space of collective variables. It is represented by $z^*(\alpha)$, where $\alpha = 0$ corresponds to the collapsed chain and $\alpha = 1$ corresponds to the extended chain. For intermediate values $\alpha \in (0, 1)$, the MFEP obeys the condition

$$\frac{dz_i^*(\alpha)}{d\alpha} \;\text{parallel to}\; \sum_{j=1}^{\mathcal{N}} M_{ij}(z^*(\alpha))\,\frac{\partial F(z^*(\alpha))}{\partial z_j}, \qquad [3]$$

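where $F(z) = -\beta^{-1} \ln \left\langle \prod_{i=1}^{\mathcal{N}} \delta(z_i - z_i(x)) \right\rangle$ is the free-energy surface defined in the collective variables, and

$$M_{ij}(z) = \exp[\beta F(z)] \left\langle \sum_{k=1}^{n} m_k^{-1}\,\frac{\partial z_i(x)}{\partial x_k}\,\frac{\partial z_j(x)}{\partial x_k} \prod_{i=1}^{\mathcal{N}} \delta(z_i - z_i(x)) \right\rangle. \qquad [4]$$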

Here, angle brackets indicate equilibrium expectation values, $\beta^{-1} = k_BT$, and $m_k$ is the mass of the atom corresponding to coordinate $x_k$. The matrix $M_{ij}(z)$, which arises from projecting the dynamics of the atomistic coordinates onto the collective variables (and, thus, in general, curving the coordinate space) (3, 19), ensures that the various collective variables evolve on a consistent time scale. As is seen in Eq. 10, the matrix scales the relative diffusion coefficients for the collective variables.

If the collective variables used are adequate to describe the mechanism of the reaction (here, the hydrophobic collapse), then it can be shown that the MFEP is the path of maximum likelihood for reactive MD trajectories that are monitored in the collective variables (3). For the current application, we checked the adequacy of the collective variables a posteriori by running MD trajectories that are initiated from the presumed rate-limiting step along the MFEP (i.e., the configuration of maximum free energy) and verified that these trajectories led with approximately equal probability to either the collapsed or extended configurations of the chain (see Hydrophobic Collapse of a Hydrated Chain: "The Committor Function and a Proof of Principle for Coarse-Graining").
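The a posteriori check described above is a committor test: many short trajectories are launched from the presumed transition state, and the fraction that reaches each basin is counted. The sketch below illustrates the idea on a toy model only, namely overdamped Langevin dynamics in the one-dimensional double well $V(x) = (x^2 - 1)^2$. The model, the basin boundaries at $x = \pm 0.8$, the inverse temperature and the time step are all assumptions made for illustration and have nothing to do with the solvated chain.

```python
import numpy as np

def committor_estimate(x0, n_traj=400, dt=1e-3, beta=3.0, max_steps=20000, seed=0):
    """Fraction of trajectories started at x0 that reach basin B (x > 0.8)
    before basin A (x < -0.8), for V(x) = (x**2 - 1)**2."""
    rng = np.random.default_rng(seed)
    x = np.full(n_traj, float(x0))
    outcome = np.zeros(n_traj, dtype=int)  # 0 undecided, +1 reached B, -1 reached A
    for _ in range(max_steps):
        active = outcome == 0
        if not active.any():
            break
        xa = x[active]
        force = -4.0 * xa * (xa**2 - 1.0)  # -dV/dx
        noise = rng.standard_normal(xa.size)
        x[active] = xa + force * dt + np.sqrt(2.0 * dt / beta) * noise
        outcome[(outcome == 0) & (x > 0.8)] = 1
        outcome[(outcome == 0) & (x < -0.8)] = -1
    decided = outcome != 0
    return np.mean(outcome[decided] == 1)

p_top = committor_estimate(0.0)   # started at the barrier top
p_side = committor_estimate(0.6)  # started well on the B side
```

A value close to 1/2 at the free-energy maximum is what the authors report for their collective variables. A value far from 1/2 would indicate that some slow degree of freedom is missing from the chosen set.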

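String Method in Collective Variables. The string method yields the MFEP by evolving a parameterized curve (i.e., a string) according to the dynamics (3, 20)

$$\frac{\partial z_i^*(\alpha, t)}{\partial t} = -\sum_{j=1}^{\mathcal{N}} M_{ij}(z^*(\alpha, t))\,\frac{\partial F(z^*(\alpha, t))}{\partial z_j} + \lambda(\alpha, t)\,\frac{\partial z_i^*(\alpha, t)}{\partial \alpha}, \qquad [5]$$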

where the term $\lambda(\alpha, t)\,\partial z_i^*(\alpha, t)/\partial \alpha$ enforces the constraint that the string remain parameterized by normalized arc length. The endpoints of the string evolve by steepest descent on the free-energy surface,

$$\frac{\partial z_i^*(\alpha, t)}{\partial t} = -\frac{\partial F(z^*(\alpha, t))}{\partial z_i}, \qquad [6]$$

for $\alpha = 0$ and $\alpha = 1$. These artificial dynamics of the string yield the MFEP, which satisfies Eq. 3.

In practice, the string is discretized by using $N_d$ configurations of the system in the collective variable representation. The dynamics in Eqs. 5 and 6 are then accomplished in a three-step cycle, where (i) the endpoint configurations of the string are evolved according to Eq. 6 and the rest of the configurations are evolved according to the first term in Eq. 5, (ii) the string is (optionally) smoothed, and (iii) the string is reparameterized to maintain equidistance of the configurations in the discretization. This cycle is repeated until the discretized version of Eq. 3 is satisfied. Step i requires evaluation of the mean force elements $\partial F(z)/\partial z_i$ and the tensor elements $M_{ij}(z)$ at each configuration. These terms are obtained by using restrained atomistic MD simulations of the sort illustrated for the solvent degrees of freedom in Fig. 1C. The details of the string calculation are provided in SI Text: "String Method in Collective Variables."

Hydrophobic Collapse of a Hydrated Chain

MFEP. Fig. 2 shows the MFEP for the hydrophobic collapse of the hydrated chain. It was obtained by using the string method in the collective variables for the chain atom positions and the grid-based solvent density field. The converged MFEP was discretized by using $N_d = 40$ configurations of the system, and we shall hereafter refer to these configurations by their index number, $s = 1, \ldots, N_d$. The free-energy profile was obtained by integrating the projection of the mean force along the MFEP, using

$$F^*(\alpha) = \int_0^{\alpha} \nabla F(z^*(\alpha')) \cdot dz^*(\alpha'). \qquad [7]$$

The resolution of $F^*(\alpha)$ in Fig. 2 could be improved by employing a larger $N_d$ but at larger computational cost.
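On a discretized string, the line integral in Eq. 7 reduces to a cumulative sum over consecutive images. The sketch below uses a trapezoidal rule, which is an assumption; the source does not state which quadrature was used. The function and argument names are hypothetical.

```python
import numpy as np

def free_energy_profile(images, grad_F):
    """Cumulative line integral of grad F along a discretized string.

    images : (Nd, n_cv) array of string configurations z*_s
    grad_F : (Nd, n_cv) array of mean-force elements dF/dz_i at each image
    Returns an (Nd,) array with F* = 0 at the first image.
    """
    dz = np.diff(images, axis=0)                 # displacement between images
    mid_grad = 0.5 * (grad_F[1:] + grad_F[:-1])  # trapezoidal average of the gradient
    increments = np.sum(mid_grad * dz, axis=1)   # grad F . dz for each segment
    return np.concatenate([[0.0], np.cumsum(increments)])

# Check on F(z) = |z|^2 / 2, whose gradient is z, along a straight line of 40 images.
images = np.linspace([0.0, 0.0], [1.0, 2.0], 40)
profile = free_energy_profile(images, grad_F=images)
```

For this quadratic test function the trapezoidal rule is exact, so the last entry equals $F(1, 2) - F(0, 0) = 2.5$. With mean forces estimated from restrained MD, both the statistical error of the forces and the spacing between images limit the accuracy, which is the resolution issue the text mentions.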

Fig. 2. The minimum free-energy path obtained by using the string method. (Upper) The free-energy profile exhibits a single peak at configuration 22. (Lower) The configurations of the path in the vicinity of the free-energy peak are shown with configuration numbers indicated in white text.

Miller et al. PNAS | September 11, 2007 | vol. 104 | no. 37 | 14561


mass units. These are ideal hydrophobes, because they exert purely repulsive interactions with water that expel the centers of the oxygen atoms from the volume within 0.5 nm of the center of the hydrophobe. The volume and mass are typical of amino acid residues (12, 13). Consecutive hydrophobes in the unbranched chain interact via harmonic bonds, and the chain is made semirigid by a potential energy term that penalizes its curvature. The rigidity is chosen so that the extended configurations of the chain are dominant in vacuum. It is only solvent (and thus the hydrophobic interaction) that stabilizes the collapsed globule configurations. The chain is hydrated with 33,912 rigid water molecules interacting with the SPC/E (simple point charges/extended model) potential (14) in an orthorhombic simulation box with periodic boundary conditions. Electrostatic interactions were included by using the smoothed particle mesh Ewald method (15), and all simulations were performed at 300 K. Full details of the simulation protocol are provided in supporting information (SI) Text: "Details of the System."

To describe hydrophobic collapse, it is necessary to choose an appropriate thermodynamic ensemble for the simulations. Nanometer-scale fluctuations in solvent density are suspected to play a key role in these dynamics. Use of the NVT ensemble (with a 1-g/cm³ density of water) might suppress these density fluctuations and bias the calculated mechanism. We avoided this problem with a simple technique that is based on the fact that, under ambient conditions, liquid water is close to phase coexistence. Indeed, it is this proximity that leads to the possibility of large-length-scale hydrophobicity (6, 16). By placing a fixed number of water molecules at 300 K in a volume that corresponds to an average density of <1 g/cm³, we obtained a fraction of the system at the density of water vapor and the majority at a density of bulk water. Because we are not concerned with solvent fluctuations on macroscopic length scales, the difference between simulating bulk water at its own vapor pressure compared with atmospheric pressure is completely negligible. This strategy has been used to study the dewetting transition between solvophobic surfaces (17, 18). To ensure that the liquid–vapor interface remains both flat and well distanced from the location of the chain, we repelled particles from a thin layer at the top edge of the simulation box, as is discussed in SI Text: "Details of the System."

Coarse-Graining of Solvent. Throughout this study, we simulated water and the hydrated chain by using atomistic MD. To apply the string method to the dynamics of this system, we employed a choice of collective variables that describe the density field of water. In particular, a coarse-graining algorithm was developed to connect the atomistic and collective variable representations of the solvent. By following tWC (4), the simulation box was partitioned into a three-dimensional lattice (48 × 48 × 56) of cubic cells with a side length of $l = 2.1$ Å. We labeled the cells with the vector $k = (k_x, k_y, k_z)$, where each $k_\alpha$ takes on integer values bounded as follows: $1 \le k_x \le 48$, $1 \le k_y \le 48$, and $1 \le k_z \le 56$. On this grid, the molecular density $\rho(r)$, as determined by the positions of the water oxygen atoms, is coarse-grained into the field $P_k$, where

$$P_k = \int dr\, \rho(r) \prod_{\alpha = x, y, z} \Phi_{k_\alpha}(r \cdot \mathbf{1}_\alpha). \qquad [1]$$

Here, the integral extends over the volume of the system, and $\mathbf{1}_\alpha$ denotes the unit vector in the $\alpha$th Cartesian direction. The coarse-graining function, $\Phi_{k_\alpha}(x)$, is normalized, ensuring that $\sum_k P_k = N$ is the total number of water molecules. The particular function that we have chosen to use is

$$\Phi_j(x) = \int dy\, \phi(x - y) \left[ h_j(x)\, h_j(y) + h_{j+1}(x) \sum_{i \le j} h_i(y) + h_{j-1}(x) \sum_{i \ge j} h_i(y) \right], \qquad [2]$$

where $\phi(x) = (2\pi\sigma^2)^{-1/2} \exp(-x^2/2\sigma^2)$, $\sigma = 1$ Å, and $h_j(x)$ is unity when $x$ is in the $j$th interval and zero otherwise. In effect, this choice spreads the atomistic density field $\rho(r)$ over the length scale $\sigma$ and bins it into a grid of length scale $l$ in such a way as to preserve normalization. While we have found this choice of coarse-graining function to be convenient, others are possible.
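Read term by term, Eq. 2 says that a particle sitting in cell $m$ contributes to three cells only: the Gaussian mass falling inside cell $m$ stays there, everything below the cell goes to cell $m-1$, and everything above goes to cell $m+1$. The one-dimensional sketch below follows that reading. It uses 0-based cell indices and ignores box boundaries and periodicity, both of which are simplifying assumptions, and the function names are hypothetical.

```python
import math

def gaussian_cdf(y, center, sigma):
    """Cumulative distribution of a normal density centred at `center`."""
    return 0.5 * (1.0 + math.erf((y - center) / (sigma * math.sqrt(2.0))))

def coarse_graining_weights(x, cell=2.1, sigma=1.0):
    """Weights {cell index: Phi_j(x)} for a particle at 1D position x (lengths in angstrom)."""
    m = math.floor(x / cell)                   # cell that contains the particle
    lo, hi = m * cell, (m + 1) * cell
    below = gaussian_cdf(lo, x, sigma)         # Gaussian mass at y < lo, assigned to m - 1
    above = 1.0 - gaussian_cdf(hi, x, sigma)   # Gaussian mass at y >= hi, assigned to m + 1
    return {m - 1: below, m: 1.0 - below - above, m + 1: above}

weights = coarse_graining_weights(1.05)  # particle at the centre of cell 0
```

The three weights always sum to one, which is the normalization property the text relies on for $\sum_k P_k = N$. In three dimensions the contribution of a particle to cell $k$ is the product of the three one-dimensional weights, as in Eq. 1.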

Fig. 1 A and B illustrates the coarse-graining procedure. In Fig. 1A, the solvent is schematically shown before and after coarse-graining. In Fig. 1B, the same mapping is shown for the actual system considered here. Cells are shaded white when their solvent occupation, $P_k$, is less than half of the bulk average value of $\langle P \rangle_{\text{bulk}} \equiv c = 0.3$ molecules. Small local density fluctuations are seen throughout the simulation box, as is expected for an instantaneous solvent configuration.

In addition to visualizing solvent density, the coarse-graining algorithm is useful for controlling the solvent density in MD simulations. For example, if it is desired that a particular cell $k$ exhibit a solvent occupation $P_k^*$, the potential energy term $\kappa (P_k - P_k^*)^2/2$ can be used to derive the appropriate forces on the atoms in the simulation, where $\kappa$ is a force constant. This technique is illustrated in Fig. 1C. The leftmost image shows a density distribution that is exceedingly unlikely for a simulation of ambient liquid water. Images to the right show the average solvent density distribution from MD simulations that are restrained to this unlikely reference distribution with increasing $\kappa$. Force constant values of $\ge 2 k_BT/c^2$, in which the simulation incurs an energetic penalty of at least $k_BT$ for placing the average bulk density in a cell that is

Fig. 1. Solvent coarse graining. (A) The coarse-graining procedure is schematically shown to project the atomistic solvent density onto a discrete grid. (B) The same procedure is shown for an instantaneous solvent configuration of the actual system. Grid cells containing less solvent density than $c/2$ are colored white; the remaining cells are left transparent against the blue background. The hydrophobic chain is shown in red. (C) Atomistic solvent density is restrained to the reference distribution at the far left. With larger restraint force constant $\kappa$, reported numerically in units of $2 k_BT/c^2$, the average atomic solvent density reproduces the reference distribution in detail (see text for notation).
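The harmonic restraint on the cell occupations can be written in a few lines. The sketch below sums the per-cell term $\kappa (P_k - P_k^*)^2/2$ over all restrained cells, which is an assumption about how the per-cell terms are combined. It returns the derivative with respect to the occupations only; the atomic forces would follow from the chain rule through the coarse-graining function, which is not shown. Energies are in units of $k_BT$ and the names are hypothetical.

```python
import numpy as np

def restraint_energy_and_gradient(P, P_ref, kappa):
    """Harmonic restraint on coarse-grained cell occupations.

    Returns the total energy and dE/dP_k for every cell.
    """
    diff = np.asarray(P, dtype=float) - np.asarray(P_ref, dtype=float)
    energy = 0.5 * kappa * np.sum(diff**2)
    return energy, kappa * diff

c = 0.3              # bulk average occupation per cell, in molecules (from the text)
kappa = 2.0 / c**2   # force constant of 2 kBT / c^2, with kBT = 1
energy, dE_dP = restraint_energy_and_gradient(P=[c], P_ref=[0.0], kappa=kappa)
```

With this force constant, placing the bulk occupation $c$ in a cell that is restrained to be empty costs exactly $k_BT$, which is the threshold quoted in the text.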

14560 | www.pnas.org/cgi/doi/10.1073/pnas.0705830104 Miller et al.

130000 CV

T. F. Miller III, E. Vanden-Eijnden, D. Chandler, PNAS 104, 14559 (2007)

Page 34: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

Summary

• Often few CVs are not sufficient to describe a chemical reaction or physical process
– Committor analysis

• Need of techniques to deal with a high dimensional CV space
– Single Sweep for reconstructing the free energy
– String method to determine the most likely path

Page 35: Bioexcel Webinar Series #18: Simone Meloni "Multiple timescales in atomistic simulations"

bioexcel.eu

Audience Q&A session

Please use the Questions function in the GoToWebinar application

Any other questions or points to discuss after the live webinar? Join the discussion at http://ask.bioexcel.eu.