Institute for Advanced Simulation

Multiscale Methods for the Description of Chemical Events in Biological Systems

Marcus Elstner and Qiang Cui

published in

Multiscale Simulation Methods in Molecular Sciences, J. Grotendorst, N. Attig, S. Blügel, D. Marx (Eds.), Institute for Advanced Simulation, Forschungszentrum Jülich, NIC Series, Vol. 42, ISBN 978-3-9810843-8-2, pp. 421-444, 2009.

© 2009 by John von Neumann Institute for Computing. Permission to make digital or hard copies of portions of this work for personal or classroom use is granted provided that the copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise requires prior specific permission by the publisher mentioned above.

http://www.fz-juelich.de/nic-series/volume42




Multiscale Methods for the Description of Chemical Events in Biological Systems

Marcus Elstner1,2 and Qiang Cui3

1 Department of Physical and Theoretical Chemistry
Technische Universität Braunschweig
D-38106 Braunschweig, Germany

2 Department of Molecular Biophysics
German Cancer Research Center
D-69115 Heidelberg, Germany
E-mail: [email protected]

3 Department of Chemistry and Theoretical Chemistry Institute
University of Wisconsin-Madison
1101 University Avenue, Madison, WI 53706, USA
E-mail: [email protected]

Computational methods for the description of chemical events in biological structures have to take into account the key features of biomolecules: their high degree of structural flexibility and the long-range nature of electrostatic forces. In the last decade, a multitude of approaches have been developed to combine computational methods that span different length- and time-scales. These multiscale approaches incorporate a quantum mechanical description of the active site in combination with an empirical force field method for the immediate protein environment and a continuum treatment of the regions further away. To study reactive events, efficient sampling techniques have to be applied, which can become computationally intense and therefore require efficient quantum methods. In this contribution, we describe the various options to combine different methods, where the specific combination depends very much on the nature of the problem at hand.

1 Introduction

The simulation of structure and dynamics of biological systems can nowadays be routinely performed using empirical force fields, which have become robust and reliable tools over the last decades [1, 2]. These Molecular Mechanics (MM) force fields [3, 4] model chemical bonds by harmonic springs, i.e., they describe the bonded energy using harmonic (or Fourier) potentials for the bond lengths, bond angles and dihedral angles. In addition to these bonded terms, the force fields contain non-bonded contributions, modeled by the interaction of fixed atomic point charges and by van der Waals interactions, usually described by the 12-6 Lennard-Jones potential. Polarizable force fields [5] that allow the partial charges to vary depending on their environment have also been developed, although their applications have been much more limited due to the higher computational expense.
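The individual terms of such a force field can be sketched in a few lines. The following is only an illustrative sketch, not the functional form of any particular force field: the prefactor conventions (no factor 1/2 in the harmonic term, the 4ε form of the Lennard-Jones potential), the units (kcal/mol, Å, elementary charges) and the approximate Coulomb constant are assumptions made for this example.

```python
import math

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2); individual
# force fields use slightly different values (assumption for this sketch).
COULOMB_CONST = 332.0637


def harmonic_term(x, x0, k):
    """Harmonic energy for a bond length or bond angle x around x0."""
    return k * (x - x0) ** 2


def fourier_dihedral(phi, k, n, delta):
    """Periodic (Fourier) dihedral term with multiplicity n and phase delta."""
    return k * (1.0 + math.cos(n * phi - delta))


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy with well depth epsilon and diameter sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, qi, qj):
    """Interaction of two fixed atomic point charges at distance r."""
    return COULOMB_CONST * qi * qj / r
```

With these conventions the Lennard-Jones minimum lies at r = 2^(1/6) σ with depth -ε, which is a quick sanity check for any implementation.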

Biological structures host a multitude of chemical events like chemical reactions (biocatalysis), photochemical processes, long-range proton transfers (e.g., in bioenergetics), and electron and energy (excitation) transfers, which can only be described using quantum mechanical (QM) techniques and not with MM. The description of these processes is very challenging for computational chemistry due to the large size of biological systems and the presence of multiple time-scales. Indeed, biological structures take the middle ground between solids and more disordered materials like polymers. On the one hand, they have a highly ordered structure from a functional perspective; e.g., specific functional amino acids with pre-organized orientations are found in the immediate vicinity of the active site, which is one important reason that chemical events in the enzyme active site are more efficient than the corresponding processes in solution [6]. On the other hand, biomolecules are highly flexible, and entropic contributions to the reaction free energy can be as important as potential energy contributions. Therefore, modeling chemical events in biological systems requires both accurate potential functions and access to sufficient conformational sampling and long time-scales.

None of the existing methods alone is up to the task in general. For example, standard QM methods like Hartree-Fock (HF), Density-Functional (DFT) or Semi-Empirical (SE) theory alone cannot handle several thousands of atoms with sufficient sampling. As a consequence, many studies in the past focused only on small parts of the system, such as the active site of the protein where the reaction occurs. This, however, has been shown to be insufficient due to the long-range nature of the electrostatic forces and the steric interactions of the active site with the environment [6-8]. The development of linear scaling methods extended the applicability of QM significantly. However, their application to large systems is still costly, not viable for many interesting systems with 10,000-100,000 atoms, and not helpful when dynamical or thermodynamical properties are required, which is the case in many biological applications. Evidently, methods from different computational levels have to be combined effectively, which has been explored for the past few decades.

In the quantum chemistry community, efforts have largely been focussed on the combination of QM methods with continuum electrostatic theories, starting from the Born and Onsager theories that aimed at computing the solvation free energy of charges in a polar environment. These methods have been refined over the years and can now give a reasonable description of solvation properties in an isotropic and homogeneous medium [9, 10]. In this context, MM force field methods have also been combined with continuum electrostatics methods [11, 12], since the number of water molecules that has to be included in explicit solvent simulations with periodic boundary conditions often far exceeds the number of atoms in the biological molecule itself. Most of these methods are based on the Poisson-Boltzmann theory [13] and the Generalized Born model [14], although more sophisticated integral equation and density functional theories [13] have also been explored for small biomolecules.
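As a rough numerical illustration of the Born picture mentioned above, the solvation free energy of a point charge q in a spherical cavity of radius a, transferred from vacuum into a dielectric continuum with dielectric constant ε, is -(q²/2a)(1 - 1/ε) in Gaussian units. The sketch below evaluates this textbook expression; the units (kcal/mol, Å, elementary charges), the approximate Coulomb constant and the example radius are assumptions made for this example and are not taken from the text.

```python
# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2) (assumption).
COULOMB_CONST = 332.0637


def born_solvation_energy(charge, radius, eps_solvent):
    """Born solvation free energy (kcal/mol) of a charged sphere.

    charge      -- ion charge in elementary charges
    radius      -- Born cavity radius in Angstrom (hypothetical input)
    eps_solvent -- dielectric constant of the continuum
    """
    return -0.5 * COULOMB_CONST * charge ** 2 / radius * (1.0 - 1.0 / eps_solvent)


# A monovalent ion with a 2 Angstrom cavity radius in a water-like
# continuum (eps = 80) gives roughly -82 kcal/mol.
dg = born_solvation_energy(1.0, 2.0, 80.0)
```

The magnitude of this number, comparable to covalent bond energies, illustrates why electrostatic solvation cannot be neglected when charges are created or moved during a reaction.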

These continuum models (CM), however, are by no means appropriate to represent the electrostatic and steric interactions of the structured environment with the active site. Therefore, Warshel and Levitt [15] proposed in 1976 to combine QM methods for the active site with MM methods for the remainder of the system. An appropriate QM-MM coupling term describes the polarization of the QM region by the charges on the MM atoms and mediates the steric interactions via covalent bonds and van der Waals contacts. Up to now, such QM/MM methods have been developed to combine many QM methods (post-HF, HF, DFT, SE) with various force fields (e.g., CHARMM, AMBER, GROMOS, ...) and have become a powerful tool for analyzing chemical events in biomolecules.

It has long been envisioned that a multiscale model can be developed for complex molecular systems in which QM/MM methods are further augmented by a continuum electrostatic model. Indeed, although efficient Ewald summation has been implemented with QM/MM potential functions [16, 17], the high cost and the sampling challenge associated with explicit solvent simulations become even more apparent for QM/MM simulations, especially those using high-level QM methods. Practical implementations that integrate QM/MM potentials with continuum electrostatics models, however, have only become available in recent years [18-20]. The major focus of this review is to summarize the key components of such QM/MM/CM models and to discuss a few relevant examples that best illustrate their value and limitations.

2 QM/MM Methods

The development of QM/MM methods in recent years has turned them into a powerful predictive tool, and many research groups are involved in the process; most of the recent developments have been nicely summarized in a comprehensive review [21] (see the contribution of W. Thiel). There is not one single QM/MM method, and the multitude of different implementations can be characterized by several main distinctions:

• Additive and subtractive methods: Subtractive models [22] apply the QM method to the active site and the MM method to the entire system, including the active site. Since the active site region is treated by both methods, the MM contribution for the active site has to be subtracted out:

$$E = E_{\mathrm{MM}}^{\mathrm{tot}} + E_{\mathrm{QM}}^{\mathrm{active\ site}} - E_{\mathrm{MM}}^{\mathrm{active\ site}} \tag{1}$$

The advantage of this method is that it allows, in a simple way, the combination of two different QM methods in a QM/QM' scheme or of multiple methods in a QM/QM'/MM scheme, where high-level (e.g., DFT) and low-level (SE) QM methods are combined [23, 24]. The disadvantage is that the MM method also has to treat the active site, which may not be straightforward when the active site has a complex electronic structure (e.g., transition metal centers). The additive scheme [25], by contrast, applies the MM method only to the environment of the active site, and the two regions are then coupled by a QM/MM coupling term:

$$E = E_{\mathrm{QM}}^{\mathrm{active\ site}} + E_{\mathrm{MM}}^{\mathrm{environment}} + E_{\mathrm{QM/MM}} \tag{2}$$

Here, no force field parameters are needed for the active site, but the description of the boundary is conceptually more involved.

• The treatment of the QM/MM boundary: In many applications, this boundary dissects a covalent bond. In the simplest link atom approach [25], the dangling bond of the QM region is saturated by an additional hydrogen. Other approaches avoid the introduction of this artificial hydrogen. The pseudoatom/bond approach [26] treats the frontier functional group as a pseudo-atom with an effective one-electron potential. In most cases, a C-C single bond has to be cut, and the CH2 group at the QM boundary is then substituted by a pseudo-fluorine, parametrized using a pseudo-potential, which models the properties of the C-C bond. The hybrid-orbital approach [27] does not substitute the boundary CH2 group but freezes the occupation of the orbital that represents the dangling bond. These are the most common approaches to deal with the QM/MM boundary, and several variants have also been proposed [28]. Systematic studies indicate that most schemes give comparable results as long as the charges at the QM/MM boundary are carefully treated [29-31].


• Mechanical, electrostatic and polarizable embedding: This concerns the QM/MM coupling term and the nature of the force field. In mechanical embedding [22, 23], the MM point charges are not allowed to polarize the QM region. The interaction of the QM and MM regions is simply given by the Coulomb and van der Waals interactions [32] between the QM and MM subsystems and the interactions at the boundary; thus the QM density is not perturbed by the MM charges. Since biological systems are often highly charged, this method should not be used for biological applications. Electrostatic embedding [25] includes the MM charges as external point charges in the determination of the QM charge density, i.e., the QM region is polarized by the MM charges. This sounds conceptually simple, but can be an intricate matter in practice. First of all, the QM density can become too delocalized due to interactions with the point charges, which is referred to as the "spill-out problem", in particular when large basis sets or plane-wave bases are used. This problem can be alleviated by modifying the 1/r interaction at short distances [33]. Further, large point charges close to the QM region can overpolarize the QM density due to the artificial concentration of the MM charge at one point. Here, a charge smearing scheme can be used [28]. Finally, in the polarizable embedding scheme, a polarizable force field is used instead of the fixed point charge model. In some cases, polarization effects from the environment can have a significant impact on the result, as shown, for example, by the calculation of excitation energies in retinal proteins [34, 35] (see below).
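The bookkeeping behind the subtractive (Eq. 1) and additive (Eq. 2) schemes can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the component energies are supplied by the caller, and the point-charge coupling shown corresponds to a mechanical-embedding style term in which the QM region is represented by fixed charges (units of kcal/mol, Å and elementary charges are assumed). In electrostatic embedding the MM charges would instead enter the QM Hamiltonian, which is not shown here.

```python
import math

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2) (assumption).
COULOMB_CONST = 332.0637


def subtractive_energy(e_mm_total, e_qm_active, e_mm_active):
    """Eq. 1: MM on the whole system, QM on the active site,
    minus the doubly counted MM energy of the active site."""
    return e_mm_total + e_qm_active - e_mm_active


def additive_energy(e_qm_active, e_mm_environment, e_qmmm):
    """Eq. 2: QM active site plus MM environment plus coupling term."""
    return e_qm_active + e_mm_environment + e_qmmm


def point_charge_coupling(qm_sites, mm_sites):
    """Coulomb interaction between fixed QM-region charges and MM charges.

    Each site is a tuple (charge, (x, y, z)). The QM density is NOT
    polarized here (mechanical-embedding style electrostatics).
    """
    total = 0.0
    for q_i, r_i in qm_sites:
        for q_j, r_j in mm_sites:
            total += COULOMB_CONST * q_i * q_j / math.dist(r_i, r_j)
    return total
```

A call such as `subtractive_energy(-120.0, -50.0, -10.0)` (hypothetical numbers) simply returns -160.0; the point of the sketch is that the subtractive scheme needs MM parameters for the active site, while the additive scheme needs an explicit coupling term.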

3 Sampling Reactive Events

For chemical reactions, the calculation of free energy changes and activation free energies is of ultimate interest and is still a challenge. There are several categories of techniques available.

• Direct MD: The most straightforward way is to perform MD simulations by integrating Newton's equations of motion in either the microcanonical or the canonical ensemble [36]. Current computational technology and algorithms, however, place severe limits on the accessible time scales. As a rule of thumb, HF and DFT methods allow MD simulations in the ps regime (≈ 10-50 ps for 'small' QM regions of 10-50 atoms), while SE methods allow for simulation times roughly three orders of magnitude longer (≈ 10-100 ns for 'small' QM regions). Therefore, direct MD simulations only allow overcoming small free energy barriers of several $k_B T$, such as in the sampling of various conformers of very short peptides in water (see below). Many chemical reactions of interest have barriers on the order of 10-25 kcal/mol and cannot be meaningfully addressed with direct MD simulations, even with SE methods. Direct MD simulations, therefore, are mostly useful for equilibrating configurations of protein active sites and for the qualitative exploration of structural features relevant to chemistry, such as water distributions along potential proton transfer pathways.

• Reaction path techniques: These methods determine the Minimum Energy Path (MEP) [37] between a reactant and a product state; in particular, they locate the transition state (a saddle point on the potential energy surface). For enthalpy-driven processes, this path contains most of the relevant information for describing the chemical reaction of interest, in particular the relative energies of reactant, product and transition state. As a starting point, reactant and product states have to be available. For simple reactions, an approximate MEP can be determined by the adiabatic mapping procedure [38], in which a reaction coordinate is chosen and partial optimizations are carried out with the reaction coordinate set to a number of values. For example, consider a proton transfer from an oxygen atom to a nitrogen atom and denote the O-H distance by $d_1$ and the H-N distance by $d_2$; the reaction coordinate $d = d_1 - d_2$ can then be used to describe the reaction. For more complex reaction processes that actively involve many degrees of freedom, however, more sophisticated techniques are required. One technique available in CHARMM is Conjugate Peak Refinement (CPR) [39], which starts from a straight-line interpolation between reactant and product. At the line search maximum, all degrees of freedom perpendicular ('conjugate') to the initial search direction are optimized until a minimum is found. This minimum is then connected to the reactant and product, and the optimization process is iterated. A popular alternative is the Nudged Elastic Band method (NEB) [40], where images of the system are distributed along the search line between reactant and product and are connected by springs; the related dimer method [41] is also widely used, though more in the solid state and surface physics communities.

For enthalpy-driven processes, MEP-based techniques can provide valuable mechanistic information. The limitations of the methods, however, are also obvious. First, the straight-line interpolation is not guaranteed to find the pathway with the lowest energy (see footnote a). Therefore, chemical intuition is necessary to include various different intermediate states, as illustrated in our study of the first proton transfer event in Bacteriorhodopsin [42] (see below). Moreover, entropic contributions are completely neglected. For example, Klahn et al. [43] showed for the reaction of a phosphate ion in the Ras-GAP complex that the total energies of reactant and product fluctuate on the order of 30 kcal/mol and the reaction barrier on the order of 6 kcal/mol when different protein conformations generated by classical MD simulations are used. In other words, the thermal motion of the protein environment makes the use of total energies in the MEP framework meaningless, which highlights the point that pursuing high accuracy in the QM method may not be the bottleneck for meaningful QM/MM studies of many biological problems.

• Free energy computations along the reaction path: One approach for improving upon MEP results is to calculate the free energy (potential of mean force) along the MEP. For example, the MM free energy contribution along the MEP can be estimated using free energy perturbation calculations in which the QM region is frozen (or treated in a harmonic fashion) while the MM region samples the proper thermal distribution orthogonal to the MEP [44]. In a more elaborate scheme developed recently [45], the path itself can be refined based on the derivatives of the potential of mean force, which ultimately converges to a minimum free energy path. The cost of such calculations, however, can be rather high, especially if high-level QM methods are used; one practical approximation is to replace the QM region by effective (or even polarizable) charges when sampling the MM degrees of freedom [46].

Footnote a: Imagine connecting Munich and Milano by a rope, which will arrange itself along the valleys connecting Munich and Milano; however, depending on the initial placement of the rope, different pathways can be found.

• Umbrella sampling and meta-dynamics: When the reaction can be described by a number of pre-chosen "reaction coordinates", umbrella sampling techniques [47] can be used to generate the relevant potential of mean force curve or surface. The most basic technique is to add harmonic umbrella potentials at a discrete set of reaction coordinate values to overcome barriers on the potential energy surface, and various schemes have been proposed to make the process automated (adaptive) and to converge quickly. For example, meta-dynamics methods [48] are adaptive umbrella sampling methods in which successive Gaussians are added to avoid revisiting configurations during the sampling, which speeds up the convergence; the width and height of the added Gaussian functions, as well as the frequency at which they are added, can be tuned for optimal convergence [49-51]. Finally, the energy can be used as a collective reaction coordinate to enhance sampling when it is difficult to determine a priori a set of geometrical parameters that describe the reaction [52-54].

• Other advanced techniques: Finally, there are transition path sampling (TPS) techniques that aim to directly sample the ensemble of reactive trajectories [55]. These are in principle the most rigorous framework for understanding reaction mechanisms in the condensed phase and generally do not require specifying the reaction coordinates a priori; it is well known that environmental degrees of freedom can be an essential part of the kinetic bottleneck for many reactions in solution and in biological systems. TPS has been applied in several studies of enzyme reactions [56, 57], and the cost of such calculations highlights the importance of developing accurate SE methods. It should be noted that TPS techniques can in principle also suffer from sampling issues in path space and can therefore also benefit from using different initial guesses.
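Two of the ingredients discussed in the list above, the antisymmetric-stretch coordinate d = d1 - d2 for a proton transfer and the biasing potentials used in umbrella sampling and meta-dynamics, can be sketched as follows. This is a minimal sketch, not the implementation of any simulation package: the function names, the force constant convention (factor 1/2 included) and the Gaussian parameters are assumptions made for this example.

```python
import math


def proton_transfer_coordinate(donor, proton, acceptor):
    """Reaction coordinate d = d1 - d2 (donor-H minus H-acceptor distance)."""
    d1 = math.dist(donor, proton)
    d2 = math.dist(proton, acceptor)
    return d1 - d2


def umbrella_bias(xi, xi_center, k):
    """Harmonic umbrella potential restraining xi near one window center."""
    return 0.5 * k * (xi - xi_center) ** 2


def window_centers(xi_min, xi_max, n_windows):
    """Evenly spaced window centers covering [xi_min, xi_max]."""
    step = (xi_max - xi_min) / (n_windows - 1)
    return [xi_min + i * step for i in range(n_windows)]


def metadynamics_bias(xi, deposited_centers, height, width):
    """History-dependent bias: sum of Gaussians deposited at visited values."""
    return sum(
        height * math.exp(-((xi - c) ** 2) / (2.0 * width ** 2))
        for c in deposited_centers
    )


# Hypothetical O-H...N geometry along the z axis (Angstrom).
d = proton_transfer_coordinate((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, 2.7))
```

In an umbrella sampling run, one simulation would be carried out per window center with `umbrella_bias` added to the potential; recovering the unbiased potential of mean force from the windows requires a reweighting step that is not shown here.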

4 Semi-Empirical Methods

While adiabatic mapping calculations can be readily applied in conjunction with HF and DFT methods, more elaborate reaction path techniques and most free energy and TPS techniques overstretch the possibilities of ab initio methods and are mostly applied using SE methods. The great promise of DFT methods on the one hand, and the lower accuracy and limited transferability of SE methods like MNDO, AM1 or PM3 on the other hand, seemed to devalue the latter; in the late 1990s they appeared set to become obsolete in the eyes of many quantum chemists. However, the limitations and the quite involved empirical parametrization of modern DFT methods have also changed the view of SE methods [58]. The desire to study increasingly complex (bio)molecules and the importance of entropic contributions and sampling in studying soft matter have brought a renewed interest in SE methods, especially if they can be made more robust and transferable.

Most SE methods are derived from Hartree-Fock theory by applying various approximations, resulting in, for example, the Neglect of Diatomic Differential Overlap (NDDO) type of methods, the most well-known being the MNDO, AM1 and PM3 models [59]. In these methods certain integrals are omitted and the remaining ones are treated as parameters, which are either pre-calculated from first principles or fitted to experimental data. SE methods usually have a lower overall accuracy than DFT, although this can be reversed for specific systems. In the so-called specific reaction parametrization (SRP) scheme [60], a SE method (e.g., PM3) is specifically re-parametrized for the particular system under study, which may provide a very accurate description of the reaction of interest, at a level unmatched even by popular DFT methods. However, parametrization of an SRP that works well for condensed phase simulations is not as straightforward as for gas-phase applications, and a large number of carefully constructed benchmark calculations are needed [61-63]. Therefore, it remains an interesting challenge to develop generally robust SE methods that properly balance computational efficiency and accuracy. More recent developments include the orthogonalization corrections in the OMx models [59], the PDDG/PM3 model [64], and the NO-MNDO model, all of which yielded encouraging improvements over traditional NDDO methods.

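SE methods can also be derived from DFT, a development that we have focussed on over the last decade. The so-called Self-Consistent Charge Density Functional Tight Binding (SCC-DFTB) method [65, 66] is derived by expanding the DFT total energy functional up to second order with respect to the charge density fluctuations $\delta\rho$ around the reference density $\rho_0$ [66] (with $\rho_0' = \rho_0(\vec{r}\,')$ and $\int' = \int d\vec{r}\,'$):

$$E = \sum_i^{\mathrm{occ}} \langle \Phi_i | H^0 | \Phi_i \rangle + \frac{1}{2} \int\!\!\int' \left( \frac{1}{|\vec{r} - \vec{r}\,'|} + \left. \frac{\delta^2 E_{xc}}{\delta\rho\, \delta\rho'} \right|_{\rho_0} \right) \delta\rho\, \delta\rho' - \frac{1}{2} \int\!\!\int' \frac{\rho_0'\, \rho_0}{|\vec{r} - \vec{r}\,'|} + E_{xc}[\rho_0] - \int V_{xc}[\rho_0]\, \rho_0 + E_{cc} \tag{3}$$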

$H^0 = H[\rho_0]$ is the effective Kohn-Sham Hamiltonian evaluated at the reference density $\rho_0$, and the $\Phi_i$ are Kohn-Sham orbitals. $E_{xc}$ and $V_{xc}$ are the exchange-correlation energy and potential, respectively, and $E_{cc}$ is the core-core repulsion energy (an extension up to third order has been presented recently [67, 68]).

The (artificial) reference density $\rho_0$ is chosen as a superposition of the densities $\rho_0^\alpha$ of the neutral atoms $\alpha$ constituting the molecular system,

$$\rho_0 = \sum_\alpha \rho_0^\alpha \tag{4}$$

and a density fluctuation $\delta\rho$, also built up from atomic contributions,

$$\delta\rho = \sum_\alpha \delta\rho_\alpha, \tag{5}$$

is added in order to represent the ground state density

$$\rho = \rho_0 + \delta\rho. \tag{6}$$

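Approximations to the three energy contributions in Eq. 3 result in the final expression of the SCC-DFTB model [66]:

$$E = \sum_{i\mu\nu}^{\mathrm{occ}} c_\mu^i c_\nu^i \langle \eta_\mu | H^0 | \eta_\nu \rangle + \frac{1}{2} \sum_{\alpha,\beta} U_{\alpha\beta}(R_{\alpha\beta}) + \frac{1}{2} \sum_{\alpha\beta} \Delta q_\alpha \Delta q_\beta \gamma_{\alpha\beta} \tag{7}$$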
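The last term of Eq. 7, the second-order contribution from the atomic charge fluctuations Δq, can be sketched numerically. The sketch below is an illustration only: the actual SCC-DFTB γ function is not reproduced here, and a Klopman-Ohno-like interpolation is swapped in as a stand-in that shares the two limits one expects of such a function (an on-site value equal to an atomic Hubbard-like parameter and Coulomb 1/R behavior at large distance). The function names, the Hubbard-like parameters and the use of atomic units are assumptions made for this example.

```python
import math


def gamma_klopman_ohno(r_ab, u_a, u_b):
    """Stand-in for gamma_ab (NOT the SCC-DFTB form): equals u for r = 0
    and u_a == u_b, and approaches 1/r at large separation (atomic units)."""
    a = 0.5 * (1.0 / u_a + 1.0 / u_b)
    return 1.0 / math.sqrt(r_ab ** 2 + a ** 2)


def second_order_energy(delta_q, hubbard_u, coords):
    """E2 = 1/2 * sum_ab dq_a * dq_b * gamma_ab, including on-site terms."""
    n = len(delta_q)
    energy = 0.0
    for a in range(n):
        for b in range(n):
            r_ab = math.dist(coords[a], coords[b])
            gamma = gamma_klopman_ohno(r_ab, hubbard_u[a], hubbard_u[b])
            energy += 0.5 * delta_q[a] * delta_q[b] * gamma
    return energy
```

For a single atom carrying a unit charge fluctuation and a hypothetical parameter u = 0.4, the function returns 0.5 × 0.4 = 0.2 (atomic units); for vanishing charge fluctuations the term is zero and Eq. 7 reduces to its first two terms.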


SCC-DFTB has been tested in detail for atomization energies, geometries and vibrational frequencies using a large set of molecules [69-71]. In terms of atomization energies, modern NDDO type methods like PDDG/PM3 or OM2 have been shown to be superior to SCC-DFTB, while SCC-DFTB is excellent in reproducing geometries and also predicts reasonable vibrational frequencies. It is worth emphasizing again that SE methods are likely less accurate than modern DFT functionals on average, although this situation can be reversed in specific cases (see footnote b). Moreover, as discussed above, the errors introduced by neglecting the effects of dynamics and entropy can become larger than the intrinsic error of the respective electronic structure method. Nanoseconds of MD simulation are readily feasible with SE methods, while impossible with HF and DFT. Therefore, SE methods can be used in various ways to improve the quality of the computational model: (i) they can be applied as the main QM method for the initial exploration of possible reaction mechanisms after careful testing and refinement for relevant model systems; (ii) they can be used to estimate the entropic contributions of a particular mechanism while the accurate potential energy is evaluated with a higher-level method [77]; (iii) they can be used as the lower-level QM method in an ONIOM type multi-level scheme [23, 24] or to guide the sampling in a multi-level free energy calculation.

5 The Continuum Component

While continuum approaches applied in computational materials science mostly model mechanical properties, those applied in biological simulations mainly model the dielectric response of the environment to the charge distribution of the molecule (see footnote c). The most popular continuum electrostatics models in the biological context are based on the Poisson-Boltzmann (PB) framework [11, 13]. The Poisson equation allows one to compute the electrostatic potential and the electrostatic free energy for the charge distribution of the solute in the presence of a dielectric continuum (representing the dielectric response of the solvent molecules). The PB equation further includes the mobile charge distribution originating from the surrounding ions, which respond to and modulate the electrostatic potentials of the solute charges:

$$-\nabla \cdot \left[ \epsilon(x) \nabla \phi(x) \right] - \sum_{k=1}^{N} c_k q_k\, e^{-q_k \phi(x) - V_k(x)} = \frac{4\pi e^2}{kT}\, \rho(x) \tag{8}$$

$\rho(x)$ describes the solute charge distribution, $q_k$ the charge of the mobile ion species $k$, $V_k(x)$ the steric interaction of the solute with the mobile ion species $k$, $\epsilon(x)$ the space-dependent dielectric function and $\phi(x)$ the resulting electrostatic potential. As discussed extensively in the literature [13], the correlation between the mobile ions is ignored in the PB approach; thus PB is most reliable for monovalent ions, which fortunately fits most biological applications.
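To give a feeling for the length scale on which the mobile ions in Eq. 8 screen the solute charges, one can consider the standard weak-potential (Debye-Hückel) limit, in which the exponential is linearized and the potential in a uniform dielectric decays over the Debye length. This limit is not discussed in the text above; the sketch below is added only as a numerical illustration, and the SI constants, the dielectric constant of water and the example salt concentration are assumptions made for this example.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_length_nm(ionic_strength_molar, temperature=298.15, eps_r=78.4):
    """Debye screening length (nm) from the linearized PB equation.

    ionic_strength_molar -- ionic strength in mol/L (for a 1:1 salt this
                            equals the salt concentration)
    eps_r                -- relative dielectric constant (assumed water-like)
    """
    ion_density = ionic_strength_molar * 1000.0 * N_A  # per cubic meter
    kappa_sq = 2.0 * E_CHARGE ** 2 * ion_density / (EPS0 * eps_r * K_B * temperature)
    return 1.0e9 / math.sqrt(kappa_sq)


# Roughly physiological 1:1 salt concentration (hypothetical example value).
length = debye_length_nm(0.15)
```

At 0.15 mol/L the screening length comes out a little below one nanometer, i.e., shorter than the dimensions of a typical protein, which is one way to see why salt effects on distant charges can matter in the continuum region.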

Footnote b: Even well established methods like the hybrid DFT method B3LYP show deficiencies that may not be widely recognized, e.g., problems with the description of extended electronic π systems [72, 73], dispersion interactions [74] or electronically excited states with significant charge separation [75, 76]. These examples show that careful testing is obligatory before application to new systems, even for DFT methods.

Footnote c: Generally, the work for cavity formation and the van der Waals interactions at the surface, the 'apolar' components of the solute-solvent interaction, need to be included as well; see [78].


In the most straightforward conceptual scheme, QM/MM/CM treats the active site with QM, the entire biomolecule with MM and the solvent with CM. In many practical applications, however, it is sufficient to treat only atoms very close to the active site (e.g., within 20 Å) with discrete MM degrees of freedom that are fully flexible during the simulation; this would include explicit solvent molecules in or near the active site, which helps alleviate some of the limitations of continuum electrostatics models at the solute/solvent interface. To deal properly and efficiently with protein atoms in the continuum region, we have adapted the Generalized Solvent Boundary Potential (GSBP) scheme developed by Roux and co-workers for classical simulations [79]. Briefly, if we refer to the discrete QM/MM region as the inner region and to the continuum as the outer region, the total effective potential (potential of mean force) of the system can be written as

$$W_{\mathrm{GSBP}} = U^{(ii)} + U_{\mathrm{int}}^{(io)} + U_{\mathrm{LJ}}^{(io)} + \Delta W_{\mathrm{np}} + \Delta W_{\mathrm{elec}}^{(io)} + \Delta W_{\mathrm{elec}}^{(ii)}, \tag{9}$$

where $U^{(ii)}$ is the complete inner-inner potential energy, $U_{\mathrm{int}}^{(io)}$ and $U_{\mathrm{LJ}}^{(io)}$ are the inner-outer internal (bonds, angles, and dihedrals) and Lennard-Jones potential energies, respectively, and $\Delta W_{\mathrm{np}}$ is the non-polar confining potential. The last two terms in Eq. 9 are the core of GSBP, representing the long-range electrostatic interaction between the outer and inner regions. The contribution from distant protein charges (screened by the bulk solvent) in the outer region, $\Delta W_{\mathrm{elec}}^{(io)}$, is represented in terms of the corresponding electrostatic potential in the inner region, $\phi_s^{(o)}(r_\alpha)$,

$$\Delta W_{\mathrm{elec}}^{(io)} = \sum_{\alpha \in \mathrm{inner}} q_\alpha\, \phi_s^{(o)}(r_\alpha) \tag{10}$$

The dielectric effect on the interactions among inner region atoms is represented through a reaction field term,

$$\Delta W_{\mathrm{elec}}^{(ii)} = \frac{1}{2} \sum_{mn} Q_m M_{mn} Q_n \tag{11}$$

where $M$ and $Q$ are the generalized reaction field matrix and the generalized multipole moments, respectively, in a basis set expansion [79].

The advantage of the GSBP method lies in its ability to include these contributionsexplicitly while sampling configurational space of the reaction region during a simulationat minimal additional cost. The static field potential,φ

(o)s (r), and the generalized reaction

field matrixM are computed only once based on PB calculations and stored for subsequentsimulations. The only quantities that need to be updated during the simulation are thegeneralized multipole moments,Qn,

$$Q_n = \sum_{\alpha \in \mathrm{inner}} q_\alpha \, b_n(\mathbf{r}_\alpha) \qquad (12)$$

where $b_n(\mathbf{r}_\alpha)$ is the $n$th basis function at nuclear position $\mathbf{r}_\alpha$.

As described in Ref. 18, the implementation of GSBP into a combined QM/MM framework is straightforward and involves the QM-QM and QM-MM reaction field terms and the QM-static field term. For GSBP combined with SCC-DFTB, these terms take on a simple form because $\rho_{\mathrm{QM}}(\mathbf{r})$ is expressed in terms of Mulliken charges [66]. Although the formulation of GSBP is self-consistent, the validity of the approach depends on many factors, especially the size of the inner region and the choice of the dielectric “constant” for the outer region. Therefore, for any specific application, the simulation protocol has to be carefully tested using relevant benchmarks such as the pKa of key residues [17, 80].
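To make the bookkeeping of Eqs. 10-12 concrete, the following is a minimal Python sketch for classical point charges in the inner region. It is not the implementation of Ref. 18: the four-function basis in `basis_functions`, the matrix `M` and the callable `phi_static` are hypothetical placeholders standing in for the basis set and the stored PB results, and the QM-specific terms are not covered.

```python
import numpy as np

def basis_functions(r):
    """Hypothetical basis b_n(r): a constant and the three Cartesian monomials.

    A production setup would use a larger basis; this small set only serves
    to make Eq. 12 concrete.
    """
    x, y, z = r
    return np.array([1.0, x, y, z])

def gsbp_electrostatics(charges, positions, phi_static, M):
    """Return (dW_io, dW_ii, Q) for point charges in the inner region.

    dW_io = sum_a q_a * phi_s(r_a)        (Eq. 10)
    Q_n   = sum_a q_a * b_n(r_a)          (Eq. 12)
    dW_ii = 0.5 * sum_mn Q_m M_mn Q_n     (Eq. 11)
    """
    charges = np.asarray(charges, dtype=float)
    positions = np.asarray(positions, dtype=float)
    # Static outer-region potential evaluated at each inner-region charge.
    phi = np.array([phi_static(r) for r in positions])
    dW_io = float(charges @ phi)
    # Generalized multipole moments: the only quantity updated at every step.
    B = np.array([basis_functions(r) for r in positions])  # shape (N, n_basis)
    Q = charges @ B
    dW_ii = 0.5 * float(Q @ M @ Q)
    return dW_io, dW_ii, Q

# Placeholder inputs (assumptions, not PB results).
M = -0.1 * np.eye(4)                 # stands in for the stored reaction field matrix
phi_static = lambda r: 0.05 * r[0]   # stands in for the stored static potential
q = [1.0, -1.0]
r = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
dW_io, dW_ii, Q = gsbp_electrostatics(q, r, phi_static, M)
print(dW_io, dW_ii, Q)
```

Because `M` and `phi_static` are fixed inputs, each step costs one evaluation of the basis functions at the current coordinates plus one small matrix-vector product, which is the reason for the low overhead noted above.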

An economical alternative to the GSBP approach is the charge-scaling protocol, in which the partial charges of charged residues in the MM region are scaled down based on the electrostatic potentials calculated (with PB) when the biomolecule is in vacuum vs. in solution; the scaled partial charges are then used in the QM/MM simulations with the system in vacuum. In the end, PB calculations are carried out with scaled and full partial charges to complete the thermodynamic cycle. The charge-scaling approach has been used successfully in several QM/MM studies [7, 81, 82], although several numerical issues (e.g., the treatment of residues very close to the QM region and the cancellation of large contributions) render the protocol less robust than GSBP.

6 Polarizable Force Field Models

Continuum electrostatic approaches take into account a majority of the dielectric responses of the solute. The electronic polarization of the environment in response to changes in the QM density during a chemical reaction, however, is missing when non-polarizable force fields are used for the MM region. This electronic polarization may contribute significantly when electrons or ions are transported over long distances, and for excitation energies when the dipole moment of the QM region changes significantly upon excitation. In the last decade, many research groups have been actively developing polarizable force fields, for which several good overviews are available (see the thematic issue that follows J. Chem. Theory Comput. 2007, 3, 1877).

Common approaches to describe electronic polarization effects use models based on atomic polarizabilities, a method that we have implemented to estimate the protein polarization effects on electronic excitation energies [35]. Here, the Coulomb interaction is described using atomic charges $q_A$ and atomic polarizabilities $\alpha_A$, where the induced atomic dipoles $\mu_A$ can be calculated as:

$$\mu_A = \alpha_A \left( \sum_B T_{AB} \, q_B + \sum_C T_{AC} \, \mu_C \right) \qquad (13)$$

The first term contains the Coulomb interaction with the fixed atomic point charges, which gives rise to the induced dipoles. The second term describes the interaction between the induced dipoles. Note that the induced dipole moments appear on both sides of the equation. For small systems, these equations can be solved by matrix inversion; for large systems they are usually solved iteratively. The tensors $T$ contain a damped Coulomb interaction, since at small distances the bare Coulomb $1/r$ and $1/r^3$ terms for the charge-charge and charge-dipole interactions would lead to over-polarization. Effectively, this damping is introduced by smearing out the charges, i.e., by describing the atomic charge with an exponential charge distribution. A new parameter, the width ‘a’ of this charge distribution, is therefore introduced and has to be determined during the fitting procedure.
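The two solution strategies for Eq. 13 can be compared in a short Python sketch. The damping used here is a Thole-type exponential form, which is one common choice and is assumed for illustration; it is not necessarily the functional form used in Ref. 35. The width `a = 0.39`, the coordinates, charges and polarizabilities are likewise hypothetical, and units are simply taken to be consistent (charges in e, distances in Å, polarizabilities in Å³).

```python
import numpy as np

def thole_factors(r, alpha_i, alpha_j, a):
    """Damping factors for an exponentially smeared charge density (assumed form)."""
    u = r / (alpha_i * alpha_j) ** (1.0 / 6.0)
    e = np.exp(-a * u**3)
    lam3 = 1.0 - e
    lam5 = 1.0 - (1.0 + a * u**3) * e
    return lam3, lam5

def build_system(pos, q, alpha, a=0.39):
    """Return the field of the fixed charges E0 (N, 3) and the dipole tensor T (3N, 3N)."""
    pos = np.asarray(pos, dtype=float)
    n = len(q)
    E0 = np.zeros((n, 3))
    T = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rij = pos[i] - pos[j]
            r = np.linalg.norm(rij)
            lam3, lam5 = thole_factors(r, alpha[i], alpha[j], a)
            # First term of Eq. 13: damped field at site i from charge j.
            E0[i] += lam3 * q[j] * rij / r**3
            # Second term of Eq. 13: damped dipole-dipole interaction tensor.
            T[3*i:3*i+3, 3*j:3*j+3] = (lam5 * 3.0 * np.outer(rij, rij) / r**5
                                       - lam3 * np.eye(3) / r**3)
    return E0, T

def solve_direct(E0, T, alpha):
    """Solve (1 - alpha T) mu = alpha E0 as one linear system."""
    a3 = np.repeat(alpha, 3)
    A = np.eye(len(a3)) - a3[:, None] * T
    return np.linalg.solve(A, a3 * E0.ravel()).reshape(-1, 3)

def solve_iterative(E0, T, alpha, tol=1e-10, max_iter=500):
    """Fixed-point iteration mu <- alpha (E0 + T mu)."""
    a3 = np.repeat(alpha, 3)
    mu = a3 * E0.ravel()
    for _ in range(max_iter):
        mu_new = a3 * (E0.ravel() + T @ mu)
        if np.max(np.abs(mu_new - mu)) < tol:
            return mu_new.reshape(-1, 3)
        mu = mu_new
    raise RuntimeError("induced dipoles did not converge")

# Hypothetical three-site test system.
pos = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
q = np.array([0.5, -0.5, 0.0])
alpha = np.array([1.0, 1.0, 0.5])
E0, T = build_system(pos, q, alpha)
mu_direct = solve_direct(E0, T, alpha)
mu_iter = solve_iterative(E0, T, alpha)
print(np.max(np.abs(mu_direct - mu_iter)))
```

The direct solve scales cubically with the number of polarizable sites, whereas each iteration needs only a matrix-vector product, which is why the iterative route is usually preferred for large systems.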

Atomic polarizabilities can be calculated or taken from experiment; we have used values from the literature [35], where typical parameters are around 0.5 Å³ for H and about 0.8-1.5 Å³ for the first-row atoms C, N and O. However, to achieve high accuracy, the atomic parameters have been made dependent on the atomic hybridization state; e.g., the parameters for sp2 and sp3 carbon differ by about 0.5 Å³. This makes it possible to account for the different polarizabilities of sp3 carbon structures such as alkanes compared with sp2 carbon structures such as polyenes or benzene.

Common force field charge models are parametrized to account for the effects of solvation implicitly. This can be done by fitting the charges to experimental data, or by calculating them using HF/6-31G*, which is known to overestimate the magnitude of charges, thereby implicitly taking the effect of solvent polarization into account. Therefore, as a first step a new charge model has to be developed that is consistent with an explicit treatment of polarization. We computed ‘polarization free’ charges by performing B3LYP/6-311G(2d,2p) calculations and fitting the charges to reproduce the electrostatic potential at certain points at the molecular boundary (RESP) [35] (see footnote d). These ‘polarization free’ charges are computed for certain molecular fragments in the gas phase, i.e., for certain chemical groups like amino acid side chains. The charges therefore already contain the mutual polarization within these fragments. Accordingly, the polarization model is restricted to interactions between these fragments and is not applied within a fragment, to avoid double counting.

Critical tests include the calculation of polarizability tensors of amino acid side chains in comparison with DFT and MP2 data, and the evaluation of the polarization energy of such side chains due to a probe charge in the vicinity. The polarization model is able to reproduce the QM data with high precision [35], which allows for meaningful calculations on larger systems such as entire proteins.
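As an illustration of the first type of test, a molecular polarizability tensor can be assembled from atomic polarizabilities in a point-dipole interaction model. The Python sketch below uses the undamped dipole tensor and treats all atoms as mutually interacting; both are simplifying assumptions, since the model described above uses damping and excludes interactions within a fragment. The two-atom geometry and the values of `alpha` are hypothetical.

```python
import numpy as np

def molecular_polarizability(pos, alpha):
    """Return the 3x3 molecular polarizability of interacting point polarizabilities.

    Solves mu_i / alpha_i - sum_j T_ij mu_j = E for a uniform field E, so that
    the total dipole is (sum_ij B_ij) E with B the inverse of the coupled matrix.
    """
    pos = np.asarray(pos, dtype=float)
    n = len(alpha)
    A = np.zeros((3 * n, 3 * n))
    for i in range(n):
        A[3*i:3*i+3, 3*i:3*i+3] = np.eye(3) / alpha[i]
        for j in range(n):
            if i == j:
                continue
            rij = pos[i] - pos[j]
            r = np.linalg.norm(rij)
            # Undamped dipole-dipole tensor (damping omitted in this sketch).
            T = 3.0 * np.outer(rij, rij) / r**5 - np.eye(3) / r**3
            A[3*i:3*i+3, 3*j:3*j+3] = -T
    B = np.linalg.inv(A)
    # Sum all 3x3 blocks B_ij to obtain the molecular tensor.
    return B.reshape(n, 3, n, 3).sum(axis=(0, 2))

# Hypothetical diatomic: two identical sites 3 Å apart along z.
alpha_mol = molecular_polarizability([[0, 0, 0], [0, 0, 3.0]], [1.0, 1.0])
print(np.round(alpha_mol, 4))
```

For two identical sites the result can be checked against the closed forms 2α/(1 − 2α/r³) along the bond and 2α/(1 + α/r³) perpendicular to it, which shows how mutual polarization makes the tensor anisotropic even for isotropic atoms.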

7 Applications

In this section, we discuss three applications to illustrate the various methodologies discussed above.

7.1 Direct QM/MM MD with periodic boundary conditions: Dynamics of peptides and proteins

The conformations of peptides and proteins depend sensitively on the proper inclusion of solvent. The conformations of small peptides in the gas phase are very different from those in solution, and it is challenging to model them properly with a QM description of the peptide augmented by an implicit solvent model. One possible approach is to include the first solvation shell explicitly [83], although finite temperature effects still need to be included, which can be problematic with a small “microsolvation” model. A physically more transparent model is to surround the peptide, treated with QM, by a box of MM water molecules and to apply periodic boundary conditions [84]. The main degrees of freedom in these peptides are the backbone torsions (φ, ψ), which exhibit rotational barriers of a few kcal/mol (Fig. 1). To sample the energy landscape of such systems, MD simulations on the order of 10-100 nanoseconds have to be performed, which is clearly only possible using SE methods. This also illustrates the limits of direct MD simulations, which can handle only systems with small barriers of a few kcal/mol. Linear scaling methods in combination with SE methods make it possible to simulate the dynamics of small proteins over several 100 ps [85]. However, this is still quite costly, and there are few applications in which a QM treatment of the entire protein is necessary and the dynamics on these short time scales are the quantities of key interest.

Footnote d: Diffuse basis functions should be avoided, since they would allow the charge density to extend into regions far away from the molecule, which are not accessible in the condensed phase due to the environment.
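The restriction of direct MD to barriers of a few kcal/mol, discussed in Section 7.1 above, can be made plausible with a rough transition state theory estimate. The Python sketch below assumes the simple rate expression k = (k_B T/h) exp(−ΔG‡/k_B T) with a transmission coefficient of one; this estimate, the temperature of 300 K and the list of barrier heights are illustrative assumptions, not part of the original analysis.

```python
import math

KB_KCAL = 0.0019872041   # Boltzmann constant in kcal/(mol K)
KB_J = 1.380649e-23      # Boltzmann constant in J/K
H_J = 6.62607015e-34     # Planck constant in J s

def tst_waiting_time(barrier_kcal, temperature=300.0):
    """Mean waiting time in seconds for crossing a free energy barrier (simple TST)."""
    prefactor = KB_J * temperature / H_J
    rate = prefactor * math.exp(-barrier_kcal / (KB_KCAL * temperature))
    return 1.0 / rate

# Hypothetical barrier heights spanning torsional to proton-transfer regimes.
for barrier in (2.0, 5.0, 10.0, 13.0):
    print(f"{barrier:5.1f} kcal/mol -> {tst_waiting_time(barrier):.2e} s")
```

Under these assumptions each additional 1.4 kcal/mol of barrier height lengthens the waiting time by roughly a factor of ten at 300 K, so barriers near 10 kcal/mol already lie well beyond the nanosecond range accessible to direct sampling.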

Figure 1. The lowest energy conformation $C_7^{eq}$ of the alanine dipeptide model in the gas phase. The main degrees of freedom consist of the φ and ψ dihedral angles, i.e., rotations around the central C-C and C-N single bonds.

7.2 Proton transfer

Proton transfer reactions are involved in many key biological problems, most notably in acid-base catalysis and bioenergetic processes. The breaking and formation of many chemical bonds in these problems and the significant reorganization of the environment in response to the transport of charges pose great challenges to theoretical studies. Although more specialized techniques such as MS-EVB can be extremely valuable in the study of certain proton transfer problems [86], a QM/MM framework is required to introduce more flexibility in the potential energy surface, especially when the reaction involves species with complex electronic structure (e.g., transition metal ions). The diversity of proton transfer reactions also makes them ideal for illustrating the value and limitations of various QM/MM techniques.

7.2.1 Bacteriorhodopsin (bR): MEP results

For relatively localized proton transfers, for which the entropic contribution is likely small, reaction path methods can be applied. An example is the first proton transfer step in bacteriorhodopsin, where the active site involves a well-connected hydrogen-bonding network as shown in Fig. 2. It is known from experiment that entropy does not contribute to this step; we have therefore simulated the process using SCC-DFTB QM/MM in combination with the CPR approach discussed above [42, 87]. The computed barriers of 11.5-13.6 kcal/mol for different low-energy pathways are in good agreement with the experimental value of 13 kcal/mol. However, to interpret this agreement properly one has to be aware of the intrinsic error compensation in these calculations: as discussed in detail in Ref. 88, popular DFT methods tend to underestimate proton transfer barriers by 1-4 kcal/mol. On the other hand, the nuclear quantum effects neglected here, such as zero point energies, would lower proton transfer barriers by roughly the same amount; the two errors therefore tend to cancel each other for a wide range of proton transfer systems.

Figure 2. The active site of bacteriorhodopsin in its ground state. The first proton transfer occurs between the retinal Schiff base and the side chain of Asp85.

7.2.2 Carbonic Anhydrase II: MEP vs. PMF

For many long-range proton transfers in biomolecules, however, the MEP results are likely very sensitive to the protein structure used in the calculation. More seriously, the collective structural response of the protein is likely missing in the MEP calculations, which may lead to qualitatively incorrect mechanistic conclusions. A useful example in this context is the long-range proton transfer in carbonic anhydrase II (CAII), where the rate-limiting step of the catalytic cycle is a proton transfer between a zinc-bound water/hydroxide and the neutral/protonated His64 residue close to the protein/solvent interface. Since this proton transfer spans at least 8-10 Å, the transfer is believed to be mediated by the water molecules in the active site [89] (see Fig. 3). Since there are multiple water wires of different lengths in the active site that connect the donor/acceptor groups (zinc-bound water, His64), a question of interest is whether a specific length of water wire dominates the proton transfer or all wires make comparable contributions.

First, a large number of MEPs were collected starting from different snapshots taken from equilibrium MD simulations at the SCC-DFTB/MM level. Since essentially a positive charge is transferred over a long distance, the MEP energetics were expected to depend sensitively on the starting structure, which was indeed observed. For example, when the starting structure came from a CHOH (zinc-bound water, neutral His64) trajectory, the proton transfer from the zinc-bound water to His64 was largely endothermic (on average by as much as ∼13 kcal/mol). By contrast, when the starting structure came from a COHH (zinc-bound hydroxide, protonated His64) trajectory, the same proton transfer reaction was found to be largely exothermic. In an attempt to capture the “intrinsic barrier” for the proton transfer reaction, which is known experimentally to be close to thermoneutral [90], we generated configurations from equilibrium MD simulations in which protons along a specific type of water wire were restrained to be equidistant from nearby heavy atoms (e.g., oxygen in water or Nε in His64). In this way, the charge distribution associated with the reactive components is midway between the CHOH and COHH states; the active-site configuration was thus expected to facilitate a thermoneutral proton transfer process, which was indeed confirmed by MEP calculations using such configurations as the starting structure. An interesting observation is that the barriers in such “TS-reorganized” MEPs showed a steep dependence on the length of the water wire: the barrier was small (∼6.8±2.2 kcal/mol) with short wires but substantially higher than the experimental value (∼10 kcal/mol) with longer water wires (e.g., 17.4±2.0 kcal/mol for four-water wires).

Figure 3. The active site of CAII rendered from the crystal structure (PDB ID: 2CBA [89]). All dotted lines correspond to hydrogen-bonding interactions with distances ≤ 3.5 Å. The proton acceptor, His64, is resolved to partially occupy both the “in” and “out” rotameric states.

This steep wire-length dependence is in striking contrast with the more rigorous PMF calculations [91, 92]. In the PMF calculations, a collective coordinate [93] was used to monitor the progress of the proton transfer without enforcing a specific sequence of events involving individual protons along the wire; the use of a collective coordinate is important because it allows averaging over different water wire configurations, which is appropriate since the lifetime of the various water wires is on the picosecond time scale [18, 20], much shorter than the time scale of the proton transfer (µs) [90]. In the PMF calculations, the wire-length dependence was examined by comparing results with different His64 orientations (“in” and “out”, about 8 and 11 Å from the zinc, respectively); both configurations were found to involve multiple lengths of water wires but with different relative populations. The two sets of PMF calculations produced barriers of very similar values, which suggested that the length of the water wire (or the orientation of the acceptor group) is unlikely to be essential to the proton transfer rate. Further analysis of the configurations sampled in the MEP simulations suggested that the MEP results artificially favored concerted proton transfers, which correlate with a significant distance dependence. As discussed above, to generate the “TS-reorganized” configurations, all transferring protons along the wire were constrained to be half-way between the neighboring heavy atoms; therefore, the protein/solvent configurations sampled in this way favor concerted over step-wise proton transfers. Although all atoms in the inner region are allowed to move in the MEP searches, the local nature of MEPs does not allow collective reorganization of the active-site residues and solvent molecules; thus the “memory” of the sampling procedure is not erased.
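The difference between a single MEP and a PMF comes down to averaging: a PMF along a coordinate ξ follows from the equilibrium probability of ξ, W(ξ) = −k_B T ln P(ξ), with all other degrees of freedom (for example the water wire configurations) averaged out. The Python sketch below illustrates this relation on synthetic, unbiased samples of a one-dimensional coordinate. It is a generic illustration only: the Gaussian test data, the bin layout and the function name `pmf_from_samples` are assumptions, and production calculations on reactions with sizable barriers require biased sampling and reweighting, which are not shown.

```python
import numpy as np

def pmf_from_samples(xi, bins, kT=1.0):
    """Estimate W(xi) = -kT ln P(xi) from unbiased samples, shifted so min(W) = 0."""
    hist, edges = np.histogram(xi, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mask = hist > 0                      # skip empty bins, where ln P is undefined
    W = -kT * np.log(hist[mask])
    return centers[mask], W - W.min()

# Synthetic data: samples of a harmonic coordinate with unit variance,
# for which the exact PMF is 0.5 * xi**2 in units of kT.
rng = np.random.default_rng(0)
xi = rng.normal(0.0, 1.0, size=200_000)
centers, W = pmf_from_samples(xi, bins=np.linspace(-4.0, 4.0, 42))
```

Comparing `W` with `0.5 * centers**2` shows agreement near the minimum and growing statistical noise in the rarely visited tails, which is the practical reason why high-barrier regions need enhanced sampling.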

Therefore, the CAII example clearly illustrates that care must be exercised when using MEPs to probe the mechanism of chemical reactions in biomolecules, especially when collective rearrangements in the environment are expected (e.g., reactions involving charge transport). Along the same lines, the GSBP-based QM/MM/CM framework was found to be particularly attractive in the CAII studies for maintaining the proper solvent configurations and side chain orientations in the active site, as compared to Ewald-based SCC-DFTB/MM simulations [18, 80], at a fraction of the computational cost. Ignoring the bulk solvation effect, for example, was found to lead to unphysical orientations of the functionally important His64 residue.

7.2.3 Membrane proteins

A particularly exciting area for which the multiscale QM/MM/CM approach is suited concerns proton translocation across membrane proteins such as bacteriorhodopsin and cytochrome c oxidase, where a proper and efficient treatment of the heterogeneous protein/solvent/membrane environment is particularly important. The GSBP framework also allows one to incorporate the effect of the membrane potential [94], which plays a major role in bioenergetics, in a numerically efficient manner. Using the SCC-DFTB/MM/GSBP protocol with a relatively small inner region (∼30 Å × 30 Å × 50 Å) and a dielectric membrane model [93], we were able to reproduce the water wire configurations in the interior of aquaporin in good agreement with much more elaborate MD simulations using four copies of aquaporin embedded in an explicit lipid bilayer. Ignoring the GSBP contributions, however, led to very different water distributions, which highlights the importance and reliability of the multiscale framework. In a recent study [95], the same framework was also found to be semi-quantitatively successful in predicting the pKa of titratable groups in the interior of bacteriorhodopsin and cytochrome c oxidase, which are extremely challenging and relevant benchmarks for studying proton transfer systems in general [96]. Finally, the SCC-DFTB/MM/GSBP studies of the proton release group (PRG) in bacteriorhodopsin [97] led to the key insight that the PRG is not a protonated water cluster as proposed in a series of recent IR studies [98, 99]; rather, the PRG is a pair of conserved glutamate residues bonded together by a delocalized proton (see Fig. 4), and it is the delocalization of this “intermolecular proton bond” that leads to the unusual IR signature found in experiments [98, 99].


Figure 4. SCC-DFTB/MM-GSBP simulations indicate that the stored proton in the proton pump bacteriorhodopsin is delocalized (green spheres) between a pair of conserved glutamate residues rather than among the active site water molecules.

7.3 Excited states properties

The accurate determination of excited state properties is a challenging task for quantum chemical methods in general. This holds true in particular for the chromophore in retinal proteins (like bR), a polyene chain linked via a Schiff-base (NH) group to the protein backbone [73, 76, 100] (see Fig. 5). Due to its extended and highly correlated π-electron system, retinal is highly polarizable and undergoes a large change in dipole moment upon excitation; therefore, protein polarization effects may become important for an accurate description of excited state properties.

Standard QM/MM calculations using only an electrostatic embedding scheme do not take the (electronic) polarization response of the protein environment into account, which is different for ground and excited states due to the change of the dipole moment upon excitation (see footnote e). In the case of retinal, the dipole moment in the excited state is about 10 Debye larger than in the ground state; therefore, MM polarization stabilizes the excited state more than the ground state, leading to an effective red-shift in excitation energies.
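The sign of this effect can be seen in a toy model: a single isotropic polarizable site placed on the axis of a point dipole gains the induction energy E_pol = −½ α |E|², with the on-axis dipole field E = 2p/r³. The Python sketch below evaluates this for two dipole moments that differ by 10 Debye. Only that difference is taken from the text; the ground-state dipole of 12 D, the polarizability of 1 Å³ and the distance of 6 Å are hypothetical values chosen for illustration, and a real protein involves many sites, mutual polarization and damping.

```python
DEBYE_TO_AU = 0.393430               # 1 Debye in atomic units (e * bohr)
HARTREE_TO_EV = 27.2114
ANGSTROM_TO_BOHR = 1.0 / 0.529177

def polarization_energy(dipole_debye, alpha_A3, distance_A):
    """Induction energy in eV of one polarizable site on the axis of a point dipole."""
    p = dipole_debye * DEBYE_TO_AU
    alpha = alpha_A3 * ANGSTROM_TO_BOHR**3
    r = distance_A * ANGSTROM_TO_BOHR
    field = 2.0 * p / r**3               # on-axis field of a point dipole
    return -0.5 * alpha * field**2 * HARTREE_TO_EV

# Hypothetical chromophore dipoles: excited state 10 D larger than ground state.
ground = polarization_energy(12.0, 1.0, 6.0)
excited = polarization_energy(22.0, 1.0, 6.0)
shift = excited - ground                 # negative value = red-shift
print(f"ground {ground:.4f} eV, excited {excited:.4f} eV, shift {shift:.4f} eV")
```

Because the induction energy grows with the square of the field, the state with the larger dipole is always stabilized more in this model, so the contribution of each polarizable site to the excitation energy is negative.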

Indeed, QM/MM electrostatic embedding calculations tend to overestimate the excitation energy. While the experimental absorption maximum is at 2.18 eV, MRCI QM/MM calculations estimate it to be 2.34 eV, and other methods predict even more blue-shifted values [73]. Many factors contribute to the computational uncertainty, one of which is the intrinsic accuracy of the applied QM method. Other factors are related to the QM/MM coupling and the electrostatic treatment of the environment. For example, different force field models (like AMBER and CHARMM) use different point charge models, which can lead to differences in the excitation energies on the order of 0.05 eV [35]. In many applications, only the protein is included in the MM treatment, and the larger environment including membrane and bulk water is neglected. This effect can be estimated with a linearized version of the Poisson-Boltzmann equation (Eq. 8) in the charge scaling [81] approach discussed above. Estimating excitation energies with and without charge scaling again results in differences of about 0.05 eV.

Footnote e: Such calculations can of course take the ‘ionic’ response into account, i.e., the relaxation of the protein structure, which also leads to a change in the electrostatic field from the MM environment.

Figure 5. The retinal chromophore in the all-trans conformation, as in the bR ground state. The blue color indicates the Schiff base (NH) group, from which the proton is transferred in the first step to Asp85.

Using a polarizable model, the ground state charge distribution in the MM region is determined using Eq. 13. The resulting charges may be different from those in the regular force field models, because they are computed in response to the actual electrostatic field of the protein with retinal in the ground state. This charge distribution leads to vertical excitation energies about 0.07 eV red-shifted compared to those from the CHARMM force field [35]. In the same way, a different set of MM charges can be determined for the case where retinal is in its excited state. This change in the electrostatic environment leads to a further red-shift of 0.07 eV, which is due to the different MM polarization in the ground and excited states. The total red-shift with respect to the CHARMM charges is 0.14 eV, showing that protein polarization can have a significant impact on excitation energies in cases where the dipole moment of the chromophore changes significantly upon excitation.

A different approach to estimate the effect of polarization is to use a low-level QM method instead of the polarizable MM region. We have used such a QM/QM’/MM approach, applying charge scaling, an MRCI method for the QM region containing the retinal chromophore and a DFT method for 300 atoms around the chromophore in the QM’ region, to benchmark the polarizable MM model [34]. This study showed that the well-calibrated polarizable MM model gives nearly the same results as the QM’ region. However, the 300 atom QM’ region accounts for only roughly 50% of the red-shift, showing that a large MM region contributes to the polarization effect.

8 Summary

In the last decade, many variants of multiscale methods have been developed to study chemical events in complex environments in materials science, chemistry and biology. The specific design of such methods depends very much on the properties of the investigated system and the problem at hand. Biological systems are characterized by their high degree of structural flexibility and the long-range nature of the electrostatic forces, which are essential to the understanding of biological functions. Therefore, the main emphasis in methods development in the biological context lies in the accurate representation of electrostatics and in algorithms to tackle the sampling problem. In this article, we have discussed QM/MM algorithms embedded into an implicit electrostatic environment, which is modeled based on the Poisson-Boltzmann equation. For many applications, the representation of the MM environment by fixed point charges may be appropriate; however, in cases where the electrostatic properties in the QM region change significantly, a polarizable MM representation is likely required. Thermal fluctuations, on the other hand, can make a significant contribution to the free energies that characterize the chemical reaction. Accordingly, expensive QM methods often have to be replaced by more efficient, although less accurate ones. We have described applications using various approximations for the QM region. For the determination of excitation energies, high-level QM methods have to be applied, while for the study of proton transfer events, DFT and approximate SCC-DFTB can provide a balanced treatment that allows meaningful conclusions to be drawn about the reaction mechanism and energetics. In some cases, the neglect of thermal fluctuations would even lead to much larger errors than the use of lower accuracy QM methods. Therefore, studying biological systems requires applying a multitude of methods and calculating multiple experimental observables to reach reliable mechanistic conclusions.

Acknowledgments

We are indebted to our collaborators for their contributions, without which the work described here could not have been accomplished. Support from the National Science Foundation, National Institutes of Health, Alfred P. Sloan Foundation and DFG, and computational resources from the National Center for Supercomputing Applications at the University of Illinois, are greatly appreciated.

References

1. M. Karplus and J. A. McCammon, Molecular dynamics simulations of biomolecules, Nat. Struct. Mol. Biol., 9, 646–652, 2002.

2. W. F. van Gunsteren, D. Bakowies, R. Baron, I. Chandrasekhar, M. Christen, X. Daura, P. Gee, D. P. Geerke, A. Glättli, P. H. Hünenberger, M. A. Kastenholz, C. Oostenbrink, M. Schenk, D. Trzesniak, N. F. A. van der Vegt, and H. B. Yu, Biomolecular modeling: Goals, problems, perspectives, Angew. Chem. Int. Ed., 45, 4064–4092, 2006.

3. A. D. MacKerell Jr., D. Bashford, M. Bellot, R. L. Dunbrack Jr., J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher III, B. Roux, M. Schlenkrich, J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin, and M. Karplus, All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins, J. Phys. Chem. B, 102, 3586–3616, 1998.

4. W. F. van Gunsteren, S. R. Billeter, A. A. Eising, P. H. Hünenberger, P. Krüger, A. E. Mark, W. R. P. Scott, and I. G. Tironi, Biomolecular Simulation: The GROMOS Manual and User Guide, vdf Hochschulverlag, ETH Zürich, Switzerland, 1996.


5. J. W. Ponder and D. A. Case, Force fields for protein simulations, Adv. Prot. Chem., 66, 27, 2003.

6. A. Warshel, Computer simulations of enzyme catalysis: Methods, Progress and Insights, Annu. Rev. Biophys. Biomol. Struct., 32, 425–443, 2003.

7. Q. Cui and M. Karplus, Catalysis and specificity in enzymes: A study of triosephosphate isomerase (TIM) and comparison with methylglyoxal synthase (MGS), Adv. Prot. Chem., 66, 315–372, 2003.

8. J. L. Gao, S. H. Ma, D. T. Major, K. Nam, J. Z. Pu, and D. G. Truhlar, Mechanisms and free energies of enzymatic reactions, Chem. Rev., 106, 3188–3209, 2006.

9. M. Cossi, V. Barone, R. Cammi, and J. Tomasi, Ab initio study of solvated molecules: A new implementation of the polarizable continuum model, Chem. Phys. Lett., 255, 327–335, 1996.

10. C. J. Cramer and D. G. Truhlar, Implicit solvation models: Equilibria, structure, spectra, and dynamics, Chem. Rev., 99, 2161–2200, 1999.

11. N. A. Baker, D. Sept, S. Joseph, M. J. Holst, and J. A. McCammon, Electrostatics of nanosystems: Application to microtubules and the ribosome, Proc. Natl. Acad. Sci. USA, 98, 10037–10041, 2001.

12. M. Feig and C. L. Brooks, Recent advances in the development and application of implicit solvent models in biomolecule simulations, Curr. Opin. Struct. Biol., 14, 217–224, 2004.

13. J. P. Hansen and I. R. McDonald, Theory of simple liquids, 3rd Ed., Academic Press, London, UK, 2006.

14. D. Bashford and D. A. Case, Generalized born models of macromolecular solvation effects, Annu. Rev. Phys. Chem., 51, 129–152, 2000.

15. A. Warshel and M. Levitt, Theoretical Studies of Enzymic Reactions, J. Mol. Biol., 1976.

16. K. Nam, J. Gao, and D. M. York, An efficient linear-scaling Ewald method for long-range electrostatic interactions in combined QM/MM calculations, J. Chem. Theo. Comp., 1, 2–13, 2005.

17. D. Riccardi, P. Schaefer, and Q. Cui, pKa calculations in solution and proteins with QM/MM free energy perturbation simulations: A quantitative test of QM/MM protocols, J. Phys. Chem. B, 109, 17715–17733, 2005.

18. P. Schaefer, D. Riccardi, and Q. Cui, Reliable treatment of electrostatics in combined QM/MM simulation of macromolecules, J. Chem. Phys., 123, 014905, 2005.

19. B. A. Gregersen and D. M. York, Variational Electrostatic projection (VEP) methods for efficient modeling of the macromolecular electrostatic and solvation environment in activated dynamics simulations, J. Phys. Chem. B, 109, 536–556, 2005.

20. D. Riccardi, P. Schaefer, Y. Yang, H. Yu, N. Ghosh, X. Prat-Resina, P. König, G. Li, D. Xu, H. Guo, M. Elstner, and Q. Cui, Development of effective quantum mechanical/molecular mechanical (QM/MM) methods for complex biological processes (Feature Article), J. Phys. Chem. B, 110, 6458–6469, 2006.

21. H. M. Senn and W. Thiel, QM/MM studies of enzymes, Curr. Opin. Chem. Biol., 11, 182–187, 2007.

22. F. Maseras and K. Morokuma, IMOMM - A new integrated ab initio plus molecular mechanics geometry optimization scheme of equilibrium structures and transition states, J. Comp. Chem., 16, 1170–1179, 1995.


23. M. Svensson, S. Humbel, R. D. J. Froese, T. Matsubara, S. Sieber, and K. Mo-rokuma,ONIOM: A multilayered integrated MO+MM method for geometryopti-mizations and single point energy predictions. A test for Diels-Alder reactions andPt(P(t-Bu)(3))(2)+H-2 oxidative addition, J. Phys. Chem.,100, 19357–19363, 1996.

24. Q. Cui, H. Guo, and M. Karplus,Combining ab initio and density functional theorieswith semiempirical methods, J. Chem. Phys.,117, 5617–5631, 2002.

25. M. J. Field, P. A. Bash, and M. Karplus,A combined quantum mechanical and molec-ular mechanical potential for molecular dynamics simulations, J. Comput. Chem.,11, 700–733, 1990.

26. Y. Zhang, T.S. Lee, and W. Yang,A pseudobond approach to combining quantummechanical and molecular mechanical methods, J. Chem. Phys.,110, 46–54, 1999.

27. J. Gao, P. Amara, C. Alhambra, and M. J. Field,A Generalized Hybrid Orbital (GHO)Method for the Treatment of Boundary Atoms in Combined QM/MMCalculations, J.Phys. Chem. A,102, 4714–4721, 1998.

28. D. Das, K. P. Eurenius, E. M. Billings, P. Sherwood, D. C. Chattfield, M. Hodoscek,and B. R. Brooks,Optimization of quantum mechanical molecular mechanical parti-tioning schemes: Gaussian delocalization of molecular mechanical charges and thedouble link atom method, J. Chem. Phys.,117, 10534–10547, 2002.

29. N. Reuter, A. Dejaegere, B. Maigret, and M. Karplus,Frontier Bonds in QM/MMMethods: A Comparison of Different Approaches, J. Phys. Chem. A,104, 1720–1735, 2000.

30. I. Antes and W. Thiel,Adjusted Connection Atoms for Combined Quantum Mechani-cal and Molecular Mechanical Methods., J. Phys. Chem. A,103, 9290, 1999.

31. P. H. König, M. Hoffmann, Th. Frauenheim, and Q. Cui, A critical evaluation of different QM/MM frontier treatments using SCC-DFTB as the QM method, J. Phys. Chem. B, 109, 9082–9095, 2005.

32. D. Riccardi, G. Li, and Q. Cui, The importance of van der Waals interactions in QM/MM simulations, J. Phys. Chem. B, 108, 6467–6478, 2004.

33. A. Laio, J. VandeVondele, and U. Röthlisberger, A Hamiltonian electrostatic coupling scheme for hybrid Car-Parrinello molecular dynamics simulations, J. Chem. Phys., 116, 6941–6947, 2002.

34. M. Wanko, M. Hoffmann, T. Frauenheim, and M. Elstner, Effect of polarization on the opsin shift in rhodopsins. 1. A combined QM/QM/MM model for bacteriorhodopsin and pharaonis sensory rhodopsin II, J. Phys. Chem. B, 112, 11462–11467, 2008.

35. M. Wanko, M. Hoffmann, J. Frähmcke, T. Frauenheim, and M. Elstner, Effect of polarization on the opsin shift in rhodopsins. 2. Empirical polarization models for proteins, J. Phys. Chem. B, 112, 11468–11478, 2008.

36. D. Frenkel and B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, Academic Press, San Diego, London, 2002.

37. D. J. Wales, Energy Landscapes, Cambridge University Press, 2003.

38. T. Simonson, Computational biochemistry and biophysics, Marcel Dekker, Inc., 2001.

39. S. Fischer and M. Karplus, Conjugate Peak Refinement: an algorithm for finding reaction paths and accurate transition states in systems with many degrees of freedom, Chem. Phys. Lett., 194, 252–261, 1992.

40. G. Henkelman, B. P. Uberuaga, and H. Jónsson, Climbing image nudged elastic band method for finding saddle points and minimum energy paths, J. Chem. Phys., 113, 9901–9904, 2000.

41. G. Henkelman and H. Jónsson, A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives, J. Chem. Phys., 111, 7010–7022, 1999.

42. A. Bondar, S. Fischer, J. C. Smith, M. Elstner, and S. Suhai, Key role of electrostatic interactions in bacteriorhodopsin proton transfer, J. Am. Chem. Soc., 126, 14668–14677, 2004.

43. M. Klähn, S. Braun-Sand, E. Rosta, and A. Warshel, On possible pitfalls in ab initio quantum mechanics/molecular mechanics minimization approaches for studies of enzymatic reactions, J. Phys. Chem. B, 109, 15645, 2005.

44. Y. K. Zhang, H. Y. Liu, and W. T. Yang, Free energy calculation on enzyme reactions with an efficient iterative procedure to determine minimum energy paths on a combined ab initio QM/MM potential energy surface, J. Chem. Phys., 112, 3483–3492, 2000.

45. H. Hu, Z. Y. Lu, and W. T. Yang, QM/MM minimum free-energy path: Methodology and application to triosephosphate isomerase, J. Chem. Theo. Comp., 3, 390–406, 2007.

46. H. Hu and W. T. Yang, Free Energies of Chemical Reactions in Solution and in Enzymes with Ab Initio Quantum Mechanics/Molecular Mechanics Methods, Annu. Rev. Phys. Chem., 59, 573–601, 2008.

47. G. M. Torrie and J. P. Valleau, Nonphysical sampling distributions in Monte Carlo free-energy estimation: Umbrella sampling, J. Comput. Phys., 23, 187–199, 1977.

48. A. Laio and M. Parrinello, Escaping free energy minima, Proc. Natl. Acad. Sci. USA, 99, 12562–12566, 2002.

49. A. Laio, A. Rodriguez-Fortea, F. L. Gervasio, M. Ceccarelli, and M. Parrinello, Assessing the accuracy of metadynamics, J. Phys. Chem. B, 109, 6714–6721, 2005.

50. A. Barducci, G. Bussi, and M. Parrinello, Well-tempered metadynamics: A smoothly converging and tunable free-energy method, Phys. Rev. Lett., 100, 020603, 2008.

51. D. H. Min, Y. S. Liu, I. Carbone, and W. Yang, On the convergence improvement in the metadynamics simulations: A Wang-Landau recursion approach, J. Chem. Phys., 126, 194104, 2007.

52. H. Li, D. Min, Y. Liu, and W. Yang, Essential energy space random walk via energy space metadynamics method to accelerate molecular dynamics simulations, J. Chem. Phys., 127, 094101, 2007.

53. Y. Q. Gao, An integrate-over-temperature approach for enhanced sampling, J. Chem. Phys., 128, 064105, 2008.

54. D. Hamelberg, J. Mongan, and J. A. McCammon, Accelerated molecular dynamics: A promising and efficient simulation method for biomolecules, J. Chem. Phys., 120, 11919–11929, 2004.

55. P. G. Bolhuis, D. Chandler, C. Dellago, and P. L. Geissler, Transition path sampling: Throwing ropes over rough mountain passes, in the dark, Annu. Rev. Phys. Chem., 53, 291–318, 2002.

56. R. Crehuet and M. J. Field, A transition path sampling study of the reaction catalyzed by the enzyme chorismate mutase, J. Phys. Chem. B, 111, 5708–5718, 2007.

57. S. Saen-oon, S. Quaytman-Machleder, V. L. Schramm, and S. D. Schwartz, Atomic detail of chemical transformation at the transition state of an enzymatic reaction, Proc. Natl. Acad. Sci. USA, 105, 16543–16548, 2008.

58. M. Elstner, T. Frauenheim, J. McKelvey, and G. Seifert, Density functional tight binding: Contributions from the American Chemical Society symposium, J. Phys. Chem. A, 111, 5607–5608, 2007.

59. W. Thiel, Perspectives on semiempirical molecular orbital theory, Adv. Chem. Phys., 93, 703–757, 1996.

60. I. Rossi and D. G. Truhlar, Parameterization of NDDO wavefunctions using genetic algorithm, Chem. Phys. Lett., 233, 231–236, 1995.

61. Q. Cui and M. Karplus, QM/MM Studies of the Triosephosphate Isomerase (TIM) Catalyzed Reactions: Verification of Methodology and Analysis of the Reaction Mechanisms, J. Phys. Chem. B, 106, 1768–1798, 2002.

62. K. Nam, Q. Cui, J. Gao, and D. M. York, A specific reaction parameterization for the AM1/d Hamiltonian for transphosphorylation reactions, J. Chem. Theo. Comp., 3, 486–504, 2007.

63. Y. Yang, H. Yu, D. M. York, M. Elstner, and Q. Cui, Description of phosphate hydrolysis reactions with the Self-Consistent-Charge Tight-Binding-Density-Functional (SCC-DFTB) theory 1. Parameterization, J. Chem. Theo. Comp., In press, 2008.

64. M. P. Repasky, J. Chandrasekhar, and W. L. Jorgensen, PDDG/PM3 and PDDG/MNDO: Improved semiempirical methods, J. Comput. Chem., 23, 1601–1622, 2002.

65. D. Porezag, T. Frauenheim, T. Köhler, G. Seifert, and R. Kaschner, Construction of tight-binding-like potentials on the basis of density functional theory - application to carbon, Phys. Rev. B, 51, 12947–12957, 1995.

66. M. Elstner, D. Porezag, G. Jungnickel, J. Elstner, M. Haugk, Th. Frauenheim, S. Suhai, and G. Seifert, Self-consistent-charge density-functional tight-binding method for simulations of complex materials properties, Phys. Rev. B, 58, 7260–7268, 1998.

67. M. Elstner, SCC-DFTB: What is the proper degree of self-consistency, J. Phys. Chem. A, 111, 5614–5621, 2007.

68. Y. Yang, H. Yu, D. York, Q. Cui, and M. Elstner, Extension of the Self-Consistent-Charge Density-Functional Tight-Binding Method: Third-Order Expansion of the Density Functional Theory Total Energy and Introduction of a Modified Effective Coulomb Interaction, J. Phys. Chem. A, 111, 10861–10873, 2007.

69. T. Krüger, M. Elstner, P. Schiffels, and Th. Frauenheim, Validation of the density functional based tight-binding approximation method for the calculation of reaction energies and other data, J. Chem. Phys., 122, 114110, 2005.

70. K. W. Sattelmeyer, J. Tirado-Rives, and W. L. Jorgensen, Comparison of SCC-DFTB and NDDO-based semiempirical molecular orbital methods for organic molecules, J. Phys. Chem. A, 110, 13551–13559, 2006.

71. N. Otte, M. Scholten, and W. Thiel, Looking at self-consistent-charge density functional tight binding from a semiempirical perspective, J. Phys. Chem. A, 111, 5751–5755, 2007.

72. A. Bondar, S. Suhai, S. Fischer, J. C. Smith, and M. Elstner, Suppression of the back proton-transfer from Asp85 to the retinal Schiff base in bacteriorhodopsin: A theoretical analysis of structural elements, J. Struct. Biol., 157, 454–469, 2007.

73. M. Wanko, M. Hoffmann, P. Strodel, A. Koslowski, W. Thiel, F. Neese, T. Frauenheim, and M. Elstner, Calculating absorption shifts for retinal proteins: Computational challenges, J. Phys. Chem. B, 109, 3606–3615, 2005.

74. M. Elstner, P. Hobza, T. Frauenheim, S. Suhai, and E. Kaxiras, Hydrogen bonding and stacking interactions of nucleic acid base pairs: A density-functional-theory based treatment, J. Chem. Phys., 114, 5149–5155, 2001.

75. A. Dreuw, J. L. Weisman, and M. Head-Gordon, Long-range charge-transfer excited states in time-dependent density functional theory require non-local exchange, J. Chem. Phys., 119, 2943–2946, 2003.

76. M. Wanko, M. Garavelli, F. Bernardi, T. A. Niehaus, T. Frauenheim, and M. Elstner, A global investigation of excited state surfaces within time-dependent density-functional response theory, J. Chem. Phys., 120, 1674–1692, 2004.

77. F. Claeyssens, J. N. Harvey, F. R. Manby, R. A. Mata, A. J. Mulholland, K. E. Ranaghan, M. Schütz, S. Thiel, W. Thiel, and H. J. Werner, High-accuracy computation of reaction barriers in enzymes, Angew. Chem. Int. Ed., 45, 6856–6859, 2006.

78. J. A. Wagoner and N. A. Baker, Assessing implicit models for nonpolar mean solvation forces: The importance of dispersion and volume terms, Proc. Natl. Acad. Sci. USA, 103, 8331–8336, 2006.

79. W. Im, S. Bernèche, and B. Roux, Generalized solvent boundary potential for computer simulations, J. Chem. Phys., 114, 2924–2937, 2001.

80. D. Riccardi and Q. Cui, pKa analysis for the zinc-bound water in Human Carbonic Anhydrase II: benchmark for “multi-scale” QM/MM simulations and mechanistic implications, J. Phys. Chem. A, 111, 5703–5711, 2007.

81. A. R. Dinner, X. Lopez, and M. Karplus, A charge-scaling method to treat solvent in QM/MM simulations, Theor. Chem. Acc., 109, 118–124, 2003.

82. G. Li, X. Zhang, and Q. Cui, Free Energy Perturbation Calculations with Combined QM/MM Potentials Complications, Simplifications, and Applications to Redox Potential Calculations, J. Phys. Chem. B, 107, 8643–8653, 2003.

83. W. G. Han, K. J. Jalkanen, M. Elstner, and S. Suhai, Theoretical study of aqueous N-acetyl-L-alanine N'-methylamide: Structures and Raman, VCD, and ROA spectra, J. Phys. Chem. B, 102, 2587–2602, 1998.

84. H. Hu, M. Elstner, and J. Hermans, Comparison of a QM/MM force field and molecular mechanics force fields in simulations of alanine and glycine "dipeptides" (Ace-Ala-Nme and Ace-Gly-Nme) in water in relation to the problem of modeling the unfolded peptide backbone in solution, Proteins: Structure, Function and Genetics, 50, 451–463, 2003.

85. H. Liu, M. Elstner, E. Kaxiras, T. Frauenheim, J. Hermans, and W. Yang, Quantum mechanics simulation of protein dynamics on long timescale, Proteins: Structure, Function and Genetics, 44, 484–489, 2001.

86. J. M. J. Swanson, C. M. Maupin, H. Chen, M. K. Petersen, J. Xu, Y. Wu, and G. A. Voth, Proton solvation and transport in aqueous and biomolecular systems: insights from computer simulations, J. Phys. Chem. B, 111, 4300–4314, 2007.

87. A. Bondar, M. Elstner, S. Suhai, J. C. Smith, and S. Fischer, Mechanism of primary proton transfer in bacteriorhodopsin, Structure, 12, 1281–1288, 2004.


88. M. Elstner, The SCC-DFTB method and its application to biological systems, Theor. Chem. Acc., 116, 316–325, 2006.

89. K. Håkansson, M. Carlsson, L. A. Svensson, and A. Liljas, Structure of native and apo carbonic anhydrase II and structure of some of its anion-ligand complexes, J. Mol. Biol., 227, 1192–1204, 1992.

90. D. N. Silverman, Proton transfer in carbonic anhydrase measured by equilibrium isotope exchange, Methods in Enzymology, 249, 479–503, 1995.

91. D. Riccardi, P. König, X. Prat-Resina, H. Yu, M. Elstner, T. Frauenheim, and Q. Cui, “Proton holes” in long-range proton transfer reactions in solution and enzymes: A theoretical analysis, J. Am. Chem. Soc., 128, 16302–16311, 2006.

92. D. Riccardi, P. König, H. Guo, and Q. Cui, Proton Transfer in Carbonic Anhydrase Is Controlled by Electrostatics Rather than the Orientation of the Acceptor, Biochem., 47, 2369–2378, 2008.

93. P. H. König, N. Ghosh, M. Hoffmann, M. Elstner, E. Tajkhorshid, Th. Frauenheim, and Q. Cui, Toward theoretical analysis of long-range proton transfer kinetics in biomolecular pumps, J. Phys. Chem. A, 110, 548–563, 2006.

94. B. Roux, Influence of the membrane potential on the free energy of an intrinsic protein, Biophys. J., 73, 2980–2989, 1997.

95. N. Ghosh, X. Prat-Resina, M. Gunner, and Q. Cui, Microscopic pKa analysis of Glu286 in Cytochrome c Oxidase (Rhodobacter sphaeroides): towards a calibrated molecular model, Biochem., Submitted, 2008.

96. M. Kato, A. V. Pisliakov, and A. Warshel, The barrier for proton transport in Aquaporins as a challenge for electrostatic models: The role of protein relaxation in mutational calculations, Proteins: Struct. Funct. Bioinfor., 64, 829–844, 2006.

97. P. Phatak, N. Ghosh, H. Yu, M. Elstner, and Q. Cui, Amino acids with an intermolecular proton bond as proton storage site in bacteriorhodopsin, Proc. Natl. Acad. Sci. USA, In press, 2008.

98. F. Garczarek, L. S. Brown, J. K. Lanyi, and K. Gerwert, Proton binding within a membrane protein by a protonated water cluster, Proc. Natl. Acad. Sci. USA, 102, 3633–3638, 2005.

99. F. Garczarek and K. Gerwert, Functional waters in intraprotein proton transfer monitored by FTIR difference spectroscopy, Nature, 439, 109–112, 2006.

100. M. Hoffmann, M. Wanko, P. Strodel, P. H. König, T. Frauenheim, K. Schulten, W. Thiel, E. Tajkhorshid, and M. Elstner, Color tuning in rhodopsins: The mechanism for the spectral shift between bacteriorhodopsin and sensory rhodopsin II, J. Am. Chem. Soc., 128, 10808–10818, 2006.
