Molecular dynamics and applications to amyloidogenic sequences


Page 1: Molecular dynamics and applications to amyloidogenic sequences

Molecular dynamics and applications to amyloidogenic sequences

Nurit Haspel, David Zanuy, Ruth Nussinov

(in cooperation with Ehud Gazit’s group)

Page 2: Molecular dynamics and applications to amyloidogenic sequences

Contents

Molecular dynamics: goals, applications and basic principles:
  Newton’s equations of motion.
  Energy conservation and equations.
  The force field.
  Solvation models.
  Periodic boundary conditions.

A molecular dynamics protocol:
  Energy minimization.

Page 3: Molecular dynamics and applications to amyloidogenic sequences

Contents (cont.)

MD protocol (cont.):
  Assignment of initial velocities.
  Equilibration.
  Methods of integration.

Case study: human calcitonin hormone:
  Basic background.
  Simulated models.
  Initial results and future work plans.

Page 4: Molecular dynamics and applications to amyloidogenic sequences

The goal of MD

Predicting the structure and energy of molecular systems (in our case, short peptide structures).

Simulating the behavior of the molecules in solution (by solving the energy equations at every time interval).

Trying to find a model that explains the behavior of the system.

Page 5: Molecular dynamics and applications to amyloidogenic sequences

Applications of MD

Sampling the conformational space over time. Important for ligand docking, for example.

Determining equilibrium averages and the structural and motional properties of the system.

Studying the time development of the system.

Today, most (if not all) biomolecular structures obtained by X-ray crystallography or NMR are MD refined.

Page 6: Molecular dynamics and applications to amyloidogenic sequences

Types of simulated systems

Peptidic systems
Micelle formation
Nucleotides
Small molecules
Ligand docking
…

Note: Each type of system has its own unique parameters and equations.

Page 7: Molecular dynamics and applications to amyloidogenic sequences

The basic principle

Solving the classical mechanics equations (Newton’s equations) over the pairs of atom distances, angles, dihedrals, VdW interactions and electrostatics in small time intervals (other parameters can be added).

Classical equations are usually sufficient for large-scale systems. Quantum mechanical modifications are extremely costly and are used only on small-scale systems or where more accuracy is needed.

Page 8: Molecular dynamics and applications to amyloidogenic sequences

Newton’s mechanical equation (based on Newton’s second law)

$$F = Ma = M\frac{dv}{dt} = M\frac{d^2 r}{dt^2}$$

Or, with a small enough time interval Δt:

$$\Delta V = \frac{F}{M}\,\Delta t \;\rightarrow\; V_2 = V_1 + \frac{F}{M}\,\Delta t$$

Page 9: Molecular dynamics and applications to amyloidogenic sequences

Newton equations (cont.)

The new position, r2, is determined by the old position, r1, and the velocity v2 over the time Δt (which should be very small!):

$$r_2 = r_1 + v_2\,dt = r_1 + v_1\,dt + \frac{F}{M}\,dt^2$$

The above equation describes the changes in the positions of the atoms over time.
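A minimal Python sketch of this update rule for a single particle in one dimension. The function name `euler_step` and the constant-force example are assumptions made for illustration; they do not come from the slides.

```python
def euler_step(r1, v1, force, mass, dt):
    """Advance one particle by one small time step dt.

    v2 = v1 + (F/M) * dt
    r2 = r1 + v2 * dt = r1 + v1 * dt + (F/M) * dt**2
    """
    v2 = v1 + (force / mass) * dt
    r2 = r1 + v2 * dt
    return r2, v2


# Hypothetical example: a constant force of 2.0 acting on a unit mass at rest.
r, v = 0.0, 0.0
for _ in range(3):
    r, v = euler_step(r, v, force=2.0, mass=1.0, dt=0.1)
```

In a real simulation the force is recomputed from the potential energy at every step rather than held constant.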

Page 10: Molecular dynamics and applications to amyloidogenic sequences

The process of MD

The simulation is the numerical integration of the Newton equations over time.

Positions and velocities at time t → positions and velocities at time t+dt.

Positions + velocities = trajectory.

Page 11: Molecular dynamics and applications to amyloidogenic sequences

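The connection between force and energy

U = the energy (scalar).

r = the position vector.

$$F = -\frac{dU}{dr} \;\rightarrow\; U = -\int F\,dr = -\tfrac{1}{2}Mv^2 + \text{const}$$

The constant is the total energy: whatever the potential energy loses is gained as kinetic energy.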

Page 12: Molecular dynamics and applications to amyloidogenic sequences

Conservation of energy

$$\tfrac{1}{2}\sum_i M_i V_i^2 + \sum_i E_{pot,i} = \text{const}$$

The potential energy is taken from the force field parameters.
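As a rough illustration, the conserved quantity can be evaluated as follows. The name `total_energy` and the use of scalar (one-dimensional) velocities are assumptions made for brevity.

```python
def total_energy(masses, velocities, potential_energies):
    """Return 1/2 * sum(M_i * V_i**2) + sum(E_pot_i)."""
    kinetic = 0.5 * sum(m * v * v for m, v in zip(masses, velocities))
    return kinetic + sum(potential_energies)
```

Monitoring this sum along a trajectory is a common sanity check: a steady drift usually points to a time step that is too large.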

Page 13: Molecular dynamics and applications to amyloidogenic sequences

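The potential energy equations: bonded interactions

$$
\begin{aligned}
U(R) ={} & \sum_{\text{bonds}} K_r\,(r - r_{eq})^2 && \text{(bond)} \\
 & + \sum_{\text{angles}} K_\theta\,(\theta - \theta_{eq})^2 && \text{(angle)} \\
 & + \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr] && \text{(dihedral)}
\end{aligned}
$$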
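The three bonded terms can be sketched as small Python functions. This assumes the common form with harmonic bond and angle terms and a cosine dihedral term; the function and parameter names are illustrative, and real force fields supply the constants.

```python
import math


def bond_energy(r, k_r, r_eq):
    """Harmonic bond stretch: K_r * (r - r_eq)**2."""
    return k_r * (r - r_eq) ** 2


def angle_energy(theta, k_theta, theta_eq):
    """Harmonic angle bend: K_theta * (theta - theta_eq)**2 (angles in radians)."""
    return k_theta * (theta - theta_eq) ** 2


def dihedral_energy(phi, v_n, n, gamma):
    """Periodic torsion: (V_n / 2) * [1 + cos(n * phi - gamma)]."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))
```

The total bonded energy is the sum of these terms over all bonds, angles and dihedrals of the molecule.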

Page 14: Molecular dynamics and applications to amyloidogenic sequences

The potential energy equations (cont., non-bonded interactions)

$$
\begin{aligned}
 & + \sum_{i<j}^{\text{atoms}} \left(\frac{A_{ij}}{R_{ij}^{12}} - \frac{B_{ij}}{R_{ij}^{6}}\right) && \text{(Van der Waals)} \\
 & + \sum_{i<j}^{\text{atoms}} \frac{q_i\,q_j}{\varepsilon R_{ij}} && \text{(electrostatic)}
\end{aligned}
$$

Etc…

The energy parameters are defined in the force field.
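A minimal Python sketch of the two non-bonded sums. It assumes a single pair of A and B coefficients for all atom pairs, a constant dielectric `epsilon`, and no unit-conversion factor; these simplifications and all names are illustrative only.

```python
import math


def lennard_jones(r_ij, a_ij, b_ij):
    """Van der Waals term: A_ij / R_ij**12 - B_ij / R_ij**6."""
    return a_ij / r_ij ** 12 - b_ij / r_ij ** 6


def coulomb(r_ij, q_i, q_j, epsilon=1.0):
    """Electrostatic term: q_i * q_j / (epsilon * R_ij)."""
    return q_i * q_j / (epsilon * r_ij)


def nonbonded_energy(coords, charges, a, b, epsilon=1.0):
    """Sum both terms over all atom pairs with i < j."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            total += lennard_jones(r, a, b)
            total += coulomb(r, charges[i], charges[j], epsilon)
    return total
```

The double loop costs on the order of N² operations, which is why the non-bonded terms dominate the run time of a simulation.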

Page 15: Molecular dynamics and applications to amyloidogenic sequences

The force field definition

All the equations and the adjusted parameters that allow a quantitative description of the energy of the chemical system.

Note that mixing equations and parameters from different systems always results in errors!

Force field examples: FF2, FF3, Sybyl, CHARMM etc.

Page 16: Molecular dynamics and applications to amyloidogenic sequences

Solvation models

No solvent: constant dielectric.

Continuum: referring to the solvent as a bulk. No explicit representation of atoms (saving time).

Explicit: representing each water molecule explicitly (accurate, but expensive).

Mixed: mixing two models (for example, explicit + continuum, to save time).

Page 17: Molecular dynamics and applications to amyloidogenic sequences

Periodic boundary conditions

Problem: Only a small number of molecules can be simulated, and the molecules at the surface experience different forces than those at the inner side.

The simulation box is replicated infinitely in three dimensions (to integrate the boundaries of the box).

When the molecule moves, the images move in the same fashion.

The assumption is that the behavior of the infinitely replicated box is the same as a macroscopic system.
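In code, periodic boundaries usually come down to two small operations per coordinate, sketched here in Python for a cubic box of side `box`. The minimum image convention used in the second function is a common companion technique that the slides do not name; both function names are illustrative.

```python
def wrap(x, box):
    """Map a coordinate back into the primary box [0, box)."""
    return x % box


def minimum_image(dx, box):
    """Shortest displacement between two atoms, taking periodic images into account."""
    return dx - box * round(dx / box)
```

For example, in a box of side 10, two atoms at x = 0.5 and x = 9.5 are separated by 1 through the boundary, not by 9.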

Page 18: Molecular dynamics and applications to amyloidogenic sequences

Periodic boundary conditions

Page 19: Molecular dynamics and applications to amyloidogenic sequences

A sample MD protocol

Read the force field data and parameters.

Read the coordinates and the solvent molecules.

Slightly minimize the coordinates (the created model may contain collisions): a few SD steps followed by some ABNR steps.

Warm to the desired temperature (assign initial velocities).

Equilibrate the system.

Start the dynamics and save the trajectories every 1 ps (trajectory = the collection of structures at any given time step).

Page 20: Molecular dynamics and applications to amyloidogenic sequences

Why is minimization required?

Most of the coordinates are obtained using X-ray diffraction or NMR.

Those methods do not map the hydrogen atoms of the system.

Those are added later using modeling programs (such as Insight), which are not 100% accurate.

Minimization is therefore required to resolve the clashes that may “blow up” the energy function.

Page 21: Molecular dynamics and applications to amyloidogenic sequences

Common minimization protocols

First order algorithms:
  Steepest descent
  Conjugate gradient

Second order algorithms:
  Newton-Raphson
  Adopted basis Newton-Raphson (ABNR)

Page 22: Molecular dynamics and applications to amyloidogenic sequences

Steepest descent

This is the simplest minimization method:

The first directional derivative (gradient) of the potential is calculated, and a displacement is added to every coordinate in the opposite direction (the direction of the force).

The step is increased if the new conformation has a lower energy.

Advantages: simple and fast.

Disadvantages: inaccurate, usually does not converge.
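The procedure can be sketched in Python as follows. The growth factor 1.2 and the shrink factor 0.5 for the step size are arbitrary illustrative choices, as are the function name and its parameters; production codes tune these differently.

```python
def steepest_descent(energy, gradient, x, step=0.01, n_steps=100):
    """Minimize energy(x) by moving against the gradient with an adaptive step."""
    e_old = energy(x)
    for _ in range(n_steps):
        g = gradient(x)
        # Displace every coordinate opposite to the gradient (along the force).
        x_new = [xi - step * gi for xi, gi in zip(x, g)]
        e_new = energy(x_new)
        if e_new < e_old:
            # Accept the move and increase the step.
            x, e_old = x_new, e_new
            step *= 1.2
        else:
            # Reject the move and try again with a smaller step.
            step *= 0.5
    return x, e_old
```

Here `energy` and `gradient` are user-supplied callables that take a list of coordinates.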

Page 23: Molecular dynamics and applications to amyloidogenic sequences

Conjugate gradient

Uses first derivative information + information from previous steps: the weighted average of the current gradient and the previous step direction.

The weight factor is calculated from the ratio of the previous and current steps.

This method converges much better than SD.

Page 24: Molecular dynamics and applications to amyloidogenic sequences

Newton-Raphson algorithm

Uses both first derivative (slope) and second derivative (curvature) information.

In the one-dimensional case:

$$x_{k+1} = x_k - \frac{F'(x_k)}{F''(x_k)}$$

In the multi-dimensional case it is much more complicated (the inverse of a Hessian [curvature] matrix is calculated at each step).

Advantage: accurate and converges well.

Disadvantage: computationally expensive; for convergence, it should start near a minimum.
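The one-dimensional iteration is short enough to sketch directly in Python. The function name, the stopping tolerance and the iteration cap are assumptions for illustration; `d1` and `d2` are callables returning the first and second derivative of F.

```python
def newton_raphson_1d(d1, d2, x, n_steps=20, tol=1e-10):
    """Iterate x_{k+1} = x_k - F'(x_k) / F''(x_k) until the step is below tol."""
    for _ in range(n_steps):
        step = d1(x) / d2(x)
        x -= step
        if abs(step) < tol:
            break
    return x
```

For a quadratic function the iteration lands on the minimum in a single step, which is why the method works so well close to a minimum, where the energy surface is nearly quadratic.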

Page 25: Molecular dynamics and applications to amyloidogenic sequences

Adopted basis Newton-Raphson (ABNR)

An adaptation of the NR method that is especially suitable for large systems.

Instead of using a full matrix, it uses a basis that represents the subspace in which the system made the most progress in the past.

Advantage: second derivative information, convergence, faster than the regular NR method.

Disadvantages: still quite expensive, less accurate than NR.

Page 26: Molecular dynamics and applications to amyloidogenic sequences

Assignment of initial velocities

At the beginning the only information available is the desired temperature. Initial velocities are assigned randomly according to the Maxwell-Boltzmann distribution:

$$P(v)\,dv = 4\pi \left(\frac{m}{2\pi k_b T}\right)^{3/2} v^2\, e^{-\frac{m v^2}{2 k_b T}}\,dv$$

P(v)dv: the probability of finding a molecule with velocity between v and v+dv.

Note that:
1. The velocity has x, y, z components.
2. Each velocity component exhibits a Gaussian distribution.
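Because each Cartesian component of a Maxwell-Boltzmann velocity is Gaussian with zero mean and variance k_b·T/m, initial velocities can be drawn one component at a time. The Python sketch below assumes SI units (kg, K, m/s); the function name and the optional `rng` argument are illustrative.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant in J/K


def initial_velocities(masses, temperature, rng=None):
    """Draw one (vx, vy, vz) tuple per atom from the Maxwell-Boltzmann distribution."""
    rng = rng or random.Random()
    velocities = []
    for m in masses:
        # Standard deviation of each velocity component for this atom.
        sigma = math.sqrt(K_B * temperature / m)
        velocities.append(tuple(rng.gauss(0.0, sigma) for _ in range(3)))
    return velocities
```

Passing a seeded `random.Random` instance makes the assignment reproducible, which helps when comparing runs.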

Page 27: Molecular dynamics and applications to amyloidogenic sequences

Bond and angle constraints (SHAKE algorithm)

Constrain some bond lengths and/or angles to fixed values using a restraining force G_i:

$$m_i a_i = F_i + G_i$$

Solve the equations once with no constraint force.

Determine the magnitude of the force (using Lagrange multipliers) and correct the positions accordingly.

Iteratively adjust the positions of the atoms until the constraints are satisfied.
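For a single bond between two atoms, the iterative correction can be sketched in Python as below. This is a simplified illustration under stated assumptions: one constraint only, corrections applied along the bond vector from before the step, and the quantity `g` playing the role of the Lagrange multiplier. All names are hypothetical.

```python
def shake_bond(r_i_old, r_j_old, r_i_new, r_j_new, m_i, m_j, d,
               tol=1e-10, max_iter=50):
    """Correct two unconstrained positions until |r_i - r_j| equals d."""
    r_i, r_j = list(r_i_new), list(r_j_new)
    # Bond vector before the step; the constraint force acts along it.
    old = [a - b for a, b in zip(r_i_old, r_j_old)]
    for _ in range(max_iter):
        s = [a - b for a, b in zip(r_i, r_j)]
        diff = sum(c * c for c in s) - d * d
        if abs(diff) < tol:
            break
        dot = sum(a * b for a, b in zip(s, old))
        # Multiplier that removes the constraint violation to first order.
        g = diff / (2.0 * (1.0 / m_i + 1.0 / m_j) * dot)
        r_i = [a - g / m_i * o for a, o in zip(r_i, old)]
        r_j = [b + g / m_j * o for b, o in zip(r_j, old)]
    return r_i, r_j
```

The corrections are weighted by the inverse masses, so the center of mass of the pair is left unchanged.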

Page 28: Molecular dynamics and applications to amyloidogenic sequences

Equilibrating the system

The velocity distribution may change during the simulation, especially if the system is far from equilibrium.

Perform a simulation, scaling the velocities occasionally to reach the desired temperature.

The system is at equilibrium if:
  The quantities fluctuate around an average value.
  The average remains constant over time.
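Velocity scaling can be sketched in Python as follows. The sketch assumes the instantaneous temperature is taken from the kinetic energy with 3N degrees of freedom (constraints and center-of-mass motion are ignored); the function names and the `k_b` argument are illustrative.

```python
import math


def instantaneous_temperature(masses, velocities, k_b=1.380649e-23):
    """T = 2 * E_kin / (3 * N * k_b), ignoring constrained degrees of freedom."""
    kinetic = 0.5 * sum(m * (vx * vx + vy * vy + vz * vz)
                        for m, (vx, vy, vz) in zip(masses, velocities))
    return 2.0 * kinetic / (3.0 * len(masses) * k_b)


def rescale_velocities(masses, velocities, target_temperature, k_b=1.380649e-23):
    """Scale all velocities by sqrt(T_target / T_current)."""
    current = instantaneous_temperature(masses, velocities, k_b)
    factor = math.sqrt(target_temperature / current)
    return [(vx * factor, vy * factor, vz * factor) for vx, vy, vz in velocities]
```

After rescaling, the instantaneous temperature equals the target exactly; during equilibration this is applied only occasionally so that the system can relax between corrections.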

Page 29: Molecular dynamics and applications to amyloidogenic sequences

The Verlet integration method

Taylor expansion about r(t):

$$
\begin{aligned}
r(t + \delta t) &= r(t) + v(t)\,\delta t + \tfrac{1}{2}\,a(t)\,\delta t^2 + \dots \\
r(t - \delta t) &= r(t) - v(t)\,\delta t + \tfrac{1}{2}\,a(t)\,\delta t^2 - \dots
\end{aligned}
$$

Combining the equations results in:

$$r(t + \delta t) = 2\,r(t) - r(t - \delta t) + \delta t^2\,a(t)$$

Which is velocity independent.

The error is of order δt⁴ (the next term of the series).

Page 30: Molecular dynamics and applications to amyloidogenic sequences

The Verlet method (cont.)

The velocities can be calculated using the derivation formula:

$$v(t) = \frac{r(t + \delta t) - r(t - \delta t)}{2\,\delta t}$$

Here the error is of order δt².

Note: the time interval is on the order of 1 fs (10⁻¹⁵ s).

Page 31: Molecular dynamics and applications to amyloidogenic sequences

The Verlet algorithm

Start with r(t) and r(t-δt).

Calculate a(t) from the Newton equation: a(t) = f_i(t)/m_i.

Calculate r(t+δt) according to the aforementioned equation.

Calculate v(t).

Replace r(t-δt) with r(t) and r(t) with r(t+δt).

Repeat as desired.
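The loop can be sketched in Python for one coordinate. The function name `verlet`, the `accel` callable and the harmonic-oscillator example are assumptions for illustration; a real MD code applies the same update to every atom coordinate, with forces taken from the force field.

```python
import math


def verlet(r_prev, r, accel, dt, n_steps):
    """Return a list of (r(t), v(t)) pairs produced by the Verlet scheme."""
    trajectory = []
    for _ in range(n_steps):
        a = accel(r)                                 # a(t) = f(t) / m
        r_next = 2.0 * r - r_prev + dt * dt * a      # r(t + dt)
        v = (r_next - r_prev) / (2.0 * dt)           # v(t)
        trajectory.append((r, v))
        r_prev, r = r, r_next                        # shift the time window
    return trajectory


# Hypothetical example: unit-mass harmonic oscillator, a = -r, r(0) = 1, v(0) = 0.
dt = 0.01
traj = verlet(r_prev=math.cos(-dt), r=1.0, accel=lambda x: -x, dt=dt, n_steps=101)
```

Note that the velocity at time t becomes available only after r(t+δt) has been computed, since it is derived from the positions on either side.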

Page 32: Molecular dynamics and applications to amyloidogenic sequences

Amyloid fibril formation

Associated with a large number of degenerative diseases such as Alzheimer’s, Parkinson’s etc.

Associated with a structural change in the protein structure, resulting in the formation of stable fibrils.

The fibrils are richer in β-sheets (although their tertiary arrangements are usually undetermined).

Amyloid-forming proteins do not share sequence homology, but the fibrillar structures exhibit similar physicochemical and structural characteristics.

Page 33: Molecular dynamics and applications to amyloidogenic sequences

The human calcitonin (hCT)

A 32 amino acid polypeptide hormone, produced by the C-cells of the thyroid and involved in calcium homeostasis.

Fibrillation of hCT was found to be associated with carcinoma of the thyroid.

Synthetic hCT can form amyloid fibrils in vitro with similar morphology to the deposits found in the thyroid.

The in vitro process is affected by the pH of the system.

Page 34: Molecular dynamics and applications to amyloidogenic sequences

The structure of hCT

In the monomeric state, hCT has little ordered secondary structure at room temperature.

Fibrillated hCT has both helical and sheet components.

In DMSO/H2O a short double-stranded anti-parallel β-sheet is formed in the region of residues 16-21.

Previous research indicated a critical role for residues 18-19.

Page 35: Molecular dynamics and applications to amyloidogenic sequences

The sequence of hCT

NH2-CGNLSTCMLGTYQDFNKFHTFPQTAIGVGAP-COOH

(The DFNKF region is highlighted; the − and + signs mark the charged residues D and K.)

Page 36: Molecular dynamics and applications to amyloidogenic sequences

Experimental data regarding the fibril forming region

The DFNKF area was found to form fibrils rich in anti-parallel β-sheets.

The spectrum observed with the DFNK tetrapeptide is less typical of β-sheets, but may be interpreted as such.

The FNKF tetrapeptide exhibits a spectrum that is typical of a non-ordered structure.

The DFN tripeptide seems to be a mixture of β-sheet and non-ordered structure.

Page 37: Molecular dynamics and applications to amyloidogenic sequences

The effect of F→A mutation

The DANKA mutant does not exhibit a spectrum typical of the β-sheet structure, although it exhibits a certain degree of order.

This suggests a role for the Phe aromatic residues in the fibrillation process.

Page 38: Molecular dynamics and applications to amyloidogenic sequences

Tested models

Combinations of parallel/anti-parallel arrangements within a sheet and between sheets. So far, about 20 models.

Each model is simulated for 4 ns (each such simulation takes about 5 days on a powerful cluster…).

The tested parameters for model stability: distance within/between sheets, aromatic interactions, HB contact conservation etc.

Page 39: Molecular dynamics and applications to amyloidogenic sequences

Topologically different models

Page 40: Molecular dynamics and applications to amyloidogenic sequences

An example of a model

Page 41: Molecular dynamics and applications to amyloidogenic sequences

Initial results (trajectory analysis)

A model that’s totally unstable:

[Figure: the model before and after the simulation]

Page 42: Molecular dynamics and applications to amyloidogenic sequences

Average intra-sheet distance analysis

Page 43: Molecular dynamics and applications to amyloidogenic sequences

Percentage of conserved H-bonds over time

Page 44: Molecular dynamics and applications to amyloidogenic sequences

Future work plans

Test mutations once we focus on the correct model.

Make more analyses and find out what causes the fibril formation (suspicion: the aromatic ring π-stacking, salt bridges between the oppositely charged residues D and K).

… ???