AMBER14 & GPUs
SAN DIEGO SUPERCOMPUTER CENTER
AMBER 14 and GPUs: Creating the World's Fastest MD Package on Commodity Hardware
Ross Walker, Associate Research Professor and NVIDIA CUDA Fellow, San Diego Supercomputer Center and Department of Chemistry & Biochemistry, UC San Diego; Adrian Roitberg, Professor, Quantum Theory Project, Department of Chemistry, University of Florida
Presenters
- Ross Walker, San Diego Supercomputer Center & Department of Chemistry and Biochemistry, UC San Diego
- Adrian Roitberg, Quantum Theory Project, Department of Chemistry, University of Florida
- Scott Le Grand, Senior Engineer, Amazon Web Services
Webinar Overview
Part 1:
- Overview of Amber & AmberTools
- New Features in Amber v14
- Overview of the Amber GPU Project
- New GPU Features in Amber v14
- Recommended Hardware Configurations
- Performance
Part 2:
- Hydrogen Mass Repartitioning
- Constant pH MD
- Multi-dimensional Replica Exchange MD
What is AMBER?
An MD simulation package:
- Latest version is 14.0 as of April 2014.
- Distributed in two parts:
  - AmberTools: preparatory and analysis programs, free under the GPL.
  - Amber: the main simulation programs, under academic licensing.
- Independent of the accompanying force fields.
A set of MD force fields:
- Fixed-charge biomolecular force fields: ff94, ff99, ff99SB, ff03, ff11, ff12SB, ...
- Experimental polarizable force fields, e.g. ff02EP.
- Parameters for general organic molecules, solvents, carbohydrates (GLYCAM), etc.
- In the public domain.
Development
AMBER is developed by a loose community of academic groups, led by Dave Case at Rutgers University, New Jersey, building on the earlier work of Peter Kollman at UCSF.

The AMBER Development Team: A Multi-Institutional Research Collaboration
Principal contributors to the current codes: David A. Case (Rutgers), Tom Darden (OpenEye), Thomas E. Cheatham III (Utah), Carlos Simmerling (Stony Brook), Adrian Roitberg (Florida), Junmei Wang (UT Southwestern Medical Center), Robert E. Duke (NIEHS and UNC-Chapel Hill), Ray Luo (UC Irvine), Daniel R. Roe (Utah), Ross C. Walker (SDSC, UCSD), Scott Le Grand (Amazon), Jason Swails (Rutgers), David Cerutti (Rutgers), Joe Kaus (UCSD), Robin Betz (UCSD), Romain M. Wolf (Novartis), Kenneth M. Merz (Michigan State), Gustavo Seabra (Recife, Brazil), Pawel Janowski (Rutgers), Andreas W. Götz (SDSC, UCSD), István Kolossváry (Budapest and D.E. Shaw), Francesco Paesani (UCSD), Jian Liu (Berkeley), Xiongwu Wu (NIH), Thomas Steinbrecher (Karlsruhe), Holger Gohlke (Düsseldorf), Nadine Homeyer (Düsseldorf), Qin Cai (UC Irvine), Wes Smith (UC Irvine), Dave Mathews (Rochester), Romelia Salomon-Ferrer (SDSC, UCSD), Celeste Sagui (North Carolina State), Volodymyr Babin (North Carolina State), Tyler Luchko (Rutgers), Sergey Gusarov (NINT), Andriy Kovalenko (NINT), Josh Berryman (U. of Luxembourg), Peter A. Kollman (UC San Francisco).
Availability
- AmberTools v14: available as a free download via the Amber website.
- Amber v14: available for download after purchase from the Amber website.
http://ambermd.org/
Distribution
v14 released April 15, 2014.
- AmberTools v14: free, open source (BSD, LGPL, etc.).
- Amber v14: Academic ($500), Industry ($20,000 / $15,000), HPC Center ($2,000).
Amber v14 adds the PMEMD MD engine on top of AmberTools v14.
AmberTools v14: New Features
- The sander module, our workhorse simulation program, is now part of AmberTools.
- Greatly expanded and improved cpptraj program for analyzing trajectories.
- New documentation and tools for inspecting and modifying Amber parameter files.
- Improved workflow for setting up and analyzing simulations.
- New capability for semi-empirical Born-Oppenheimer molecular dynamics.
- EMIL: a new absolute free energy method using TI.
- New Free Energy Workflow (FEW) tool automates free energy calculations (LIE, TI, and MM/PBSA-type calculations).
- QM/MM improvements: external interface to many QM packages, including TeraChem (for GPU-accelerated QM/MM).
- Completely reorganized Reference Manual.
Amber v14: New Features
- Major updates to PMEMD GPU support.
- Expanded methods for free energy calculations.
- Soft-core TI supported in PMEMD.
- Support for Intel MIC (experimental).
- Support for Intel MPI.
- ScaledMD.
- Improved aMD support.
- Self-guided Langevin support.
GPU Accelerated MD
The Project
Develop a GPU-accelerated version of AMBER's PMEMD.
- San Diego Supercomputer Center: Ross C. Walker
- University of Florida: Adrian Roitberg
- NVIDIA (now AWS): Scott Le Grand
Funded in part by the NSF SI2-SSE Program.
Project Info
AMBER website: http://ambermd.org/gpus/

Publications
1. Salomon-Ferrer, R.; Goetz, A.W.; Poole, D.; Le Grand, S.; Walker, R.C. "Routine microsecond molecular dynamics simulations with AMBER - Part II: Particle Mesh Ewald", J. Chem. Theory Comput., 2013, in press, DOI: 10.1021/ct400314y
2. Goetz, A.W.; Williamson, M.J.; Xu, D.; Poole, D.; Le Grand, S.; Walker, R.C. "Routine microsecond molecular dynamics simulations with AMBER - Part I: Generalized Born", J. Chem. Theory Comput., 2012, 8 (5), pp 1542-1555, DOI: 10.1021/ct200909j
3. Pierce, L.C.T.; Salomon-Ferrer, R.; de Oliveira, C.A.F.; McCammon, J.A.; Walker, R.C. "Routine access to millisecond timescale events with accelerated molecular dynamics", J. Chem. Theory Comput., 2012, 8 (9), pp 2997-3002, DOI: 10.1021/ct300284c
4. Salomon-Ferrer, R.; Case, D.A.; Walker, R.C. "An overview of the Amber biomolecular simulation package", WIREs Comput. Mol. Sci., 2012, in press, DOI: 10.1002/wcms.1121
5. Le Grand, S.; Goetz, A.W.; Walker, R.C. "SPFP: Speed without compromise - a mixed precision model for GPU accelerated molecular dynamics simulations", Comput. Phys. Commun., 2013, 184, pp 374-380, DOI: 10.1016/j.cpc.2012.09.022
Motivations
Maximizing Impact
Supercomputers:
- Require expensive low-latency interconnects.
- Large node counts needed.
- Vast power consumption.
- Politics controls access.
Desktops and small clusters:
- Cheap, ubiquitous.
- Easily maintained.
- Simple to use.
Better Science?
- Bringing the tools a researcher needs into their own lab.
- Can we make a researcher's desktop look like a small cluster (remove the queue wait)?
- Can we make MD truly interactive (real-time feedback / experimentation)?
- Can we find a way for a researcher to cheaply increase the power of all their graduate students' workstations?
  - Without having to worry about available power (power cost)?
  - Without having to worry about applying for cluster time.
  - Without having to have a full-time 'student' to maintain the group's clusters.
- GPUs offer a possible cost-effective solution.
Design Goals
Overriding design goal: sampling for the 99% (not the <0.0001% or the 1.0%).
- Focus on < 2 million atoms.
- Maximize single-workstation performance.
- Focus on minimizing costs:
  - Be able to use very cheap nodes.
  - Both gaming and Tesla cards.
- Transparent to the user (same input, same output).
Version History
- AMBER 10 - released Apr 2008.
  - Implicit solvent (GB) GPU support released as a patch, Sept 2009.
- AMBER 11 - released Apr 2010.
  - Implicit and explicit solvent supported internally on a single GPU.
  - Oct 2010: Bugfix.9 doubled performance on a single GPU and added multi-GPU support.
- AMBER 12 - released Apr 2012.
  - Added umbrella sampling support, 1D-REMD, simulated annealing, aMD, IPS, and extra points.
  - Aug 2012: Bugfix.9 added the new SPFP precision model, support for Kepler I, GPU-accelerated NMR restraints, and improved performance.
  - Jan 2013: Bugfix.14 added support for CUDA 5.0, Jarzynski on GPU, GBSA, and Kepler II support.
New in AMBER 14 (GPU)
- ~20-30% performance improvement for single-GPU runs.
- Peer-to-peer support for multi-GPU runs, providing enhanced multi-GPU scaling.
- Hybrid bitwise-reproducible fixed-point precision model as standard (SPFP).
- Support for extra points in multi-GPU runs.
- Jarzynski sampling.
- GBSA support.
- Support for off-diagonal modifications to VDW parameters.
- Multi-dimensional replica exchange (temperature and Hamiltonian).
- Support for CUDA 5.0, 5.5, and 6.0.
- Support for latest-generation GPUs.
- Monte Carlo barostat support, providing NPT performance equivalent to NVT.
- ScaledMD support.
- Improved accelerated MD (aMD) support.
- Explicit solvent constant pH support.
- NMR restraint support on multiple GPUs.
- Improved error messages and checking.
- Hydrogen mass repartitioning support (4 fs time steps).
Reproducibility: Critical for Debugging Software and Hardware
- The SPFP precision model is bitwise reproducible: the same simulation from the same random seed gives the same result.
- This is used to validate hardware and catch misbehaving GPUs (Exxact AMBER Certified machines).
- Two GPU models with underlying hardware issues have been successfully identified this way.
Reproducibility
Final energy after 10^6 MD steps (~45 mins per run):

  Run    Good GPU Etot    Bad GPU Etot
  0.0    -58229.3134      -58229.3134
  0.1    -58229.3134      -58227.1072
  0.2    -58229.3134      -58229.3134
  0.3    -58229.3134      -58218.9033
  0.4    -58229.3134      -58217.2088
  0.5    -58229.3134      -58229.3134
  0.6    -58229.3134      -58228.3001
  0.7    -58229.3134      -58229.3134
  0.8    -58229.3134      -58229.3134
  0.9    -58229.3134      -58231.6743
  0.10   -58229.3134      -58229.3134
Supported GPUs
Hardware version 3.0 / 3.5 (Kepler I / Kepler II):
- Tesla K20 / K20X / K40
- Tesla K10
- GTX Titan / GTX Titan Black
- GTX 770 / 780 / 780Ti
- GTX 670 / 680 / 690
- Quadro cards supporting SM 3.0 or 3.5
Hardware version 2.0 (Fermi):
- Tesla M2090
- Tesla C2050 / C2070 / C2075 (and M variants)
- GTX 560 / 570 / 580 / 590
- GTX 465 / 470 / 480
- Quadro cards supporting SM 2.0
Supported System Sizes
[Slide table: maximum supported atom counts for explicit solvent (PME) and implicit solvent (GB) by GPU memory size; the numbers are not recoverable from the transcript.]
Running on GPUs
Details provided at http://ambermd.org/gpus/

Compile (assuming nvcc >= 5.0 is installed):
  cd $AMBERHOME
  ./configure -cuda gnu
  make install
  make test

Running on a GPU: just replace the executable pmemd with pmemd.cuda:
  $AMBERHOME/bin/pmemd.cuda -O -i mdin ...

If the GPUs are set to process-exclusive mode, pmemd just 'does the right thing'.
Running on Multiple GPUs
Compile (assuming nvcc >= 5.0 is installed):
  cd $AMBERHOME
  ./configure -cuda -mpi gnu
  make install

Running on multiple GPUs:
  export CUDA_VISIBLE_DEVICES=0,1
  mpirun -np 2 $AMBERHOME/bin/pmemd.cuda.MPI -O -i mdin ...

If the selected GPUs can 'talk' to each other via peer to peer, the code will automatically use it (look for "Peer to Peer: ENABLED" in mdout).
Recommended Hardware Configurations
- DIY systems.
- Exxact AMBER Certified systems:
  - Come with AMBER preinstalled.
  - Tesla or GeForce, 3-year warranty.
  - Free test drives available.
- See the following page for updates: http://ambermd.org/gpus/recommended_hardware.htm#hardware
DIY 2-GPU System
- Antec P280 Black ATX Mid Tower Case, $109.99 (http://tinyurl.com/lh4d96s)
- Corsair RM Series 1000 Watt ATX/EPS 80PLUS Gold-Certified Power Supply, $184.99 (http://tinyurl.com/l3lchqu)
- Intel Core i7-4820K LGA 2011 CPU BX80633I74820K, $324.99 (http://tinyurl.com/l3tql8m)
- Corsair Vengeance 16GB (4x4GB) DDR3 1866 MHz (CMZ16GX3M4X1866C9), $182.99 (http://tinyurl.com/losocqr)
- Asus DDR3 2400 Intel LGA 2011 Motherboard P9X79-E WS, $460.00 (http://tinyurl.com/kbqknvv)
- Seagate Barracuda 3 TB 7200RPM Internal Bare Drive ST3000DM001, $107.75 (http://tinyurl.com/m9nyqtk)
- 2x EVGA GeForce GTX780 SuperClocked 3GB GDDR5 384-bit, $509.99 each (http://tinyurl.com/m5e6c43), or K20C or K40C
- Intel Thermal Solution Air, $21.15 (http://tinyurl.com/mjwdp5e)
Total price: $2,411.84
AMBER Certified Desktops
Joint program with Exxact Inc.: http://exxactcorp.com/index.php/solution/solu_list/65
- Custom solutions available.
- Free test drives.
- GSA compliant.
- Sole-source compliant.
- Peer-to-peer ready.
- Turn-key AMBER GPU solutions.
Contact Ross Walker ([email protected]) or Mike Chen ([email protected]) for more info.
AMBER Exxact Certified Clusters
Contact Ross Walker ([email protected]) or Mike Chen ([email protected]) for more info.
Performance: AMBER 14
A realistic benchmark suite is available from http://ambermd.org/gpus/benchmarks.htm
DHFR (NVE) HMR 4fs, 23,558 atoms. Performance (ns/day):

  2x E5-2660v2 CPU (16 cores)    30.21
  1x C2075                       81.26
  2x C2075                      129.79
  1x GTX 680                    184.67
  2x GTX 680                    270.99
  1x GTX 780                    251.43
  2x GTX 780                    361.33
  1x GTX 780Ti                  280.01
  2x GTX 780Ti                  389.63
  1x GTX Titan Black            280.54
  2x GTX Titan Black            383.32
  1x K20                        196.99
  2x K20                        263.85
  1x K40                        266.07
  2x K40                        364.67
DHFR (NPT) HMR 4fs, 23,558 atoms. Performance (ns/day):

  2x E5-2660v2 CPU (16 cores)    29.01
  1x C2075                       79.49
  2x C2075                      126.26
  1x GTX 680                    179.45
  2x GTX 680                    261.23
  1x GTX 780                    239.14
  2x GTX 780                    334.29
  1x GTX 780Ti                  271.68
  2x GTX 780Ti                  366.51
  1x GTX Titan Black            270.39
  2x GTX Titan Black            359.91
  1x K20                        190.01
  2x K20                        248.40
  1x K40                        245.00
  2x K40                        335.25
DHFR (NVE) 2fs, 23,558 atoms. Performance (ns/day):

  2x E5-2660v2 CPU (16 cores)    15.81
  1x C2075                       41.66
  2x C2075                       66.84
  1x GTX 680                     95.45
  2x GTX 680                    142.37
  1x GTX 780                    128.37
  2x GTX 780                    181.96
  1x GTX 780Ti                  146.46
  2x GTX 780Ti                  204.13
  1x GTX Titan Black            146.12
  2x GTX Titan Black            200.63
  1x K20                        102.67
  2x K20                        137.67
  1x K40                        138.33
  2x K40                        189.41
Cellulose (NVE) 2fs, 408,609 atoms. Performance (ns/day):

  2x E5-2660v2 CPU (16 cores)     0.85
  1x C2075                        2.39
  2x C2075                        3.46
  1x GTX 680                      5.72
  2x GTX 680                      8.19
  1x GTX 780                      7.40
  2x GTX 780                     10.47
  1x GTX 780Ti                    9.59
  2x GTX 780Ti                   13.64
  1x GTX Titan Black              9.60
  2x GTX Titan Black             13.36
  1x K20                          6.73
  2x K20                          8.80
  1x K40                          9.46
  2x K40                         13.13
Cumulative Performance (the real differentiator)
More than 1.1 microseconds per day on a <$6,000 workstation: four independently-running GPUs give > 1,100 ns/day aggregate performance on the DHFR NVE HMR benchmark. CPUs are irrelevant to AMBER GPU performance, so buy $100 CPUs.

Amber cumulative performance, DHFR (NVE) HMR 4fs, 23,558 atoms (ns/day):

  2x E5-2660v2 CPU (16 cores)        30.21
  1x GTX Titan Black                280.54
  2x individual GTX Titan Black     561.08 (2 x 280.54)
  3x individual GTX Titan Black     841.62 (3 x 280.54)
  4x individual GTX Titan Black    1122.16 (4 x 280.54)
Historical Single Node / Single GPU Performance (DHFR Production NVE 2fs)
[Chart: PMEMD performance (ns/day, axis 0-120) from 2007 to 2014 for GPU vs CPU, with linear trend lines for each.]
Cost Comparison
Scenario: 4 simultaneous simulations, 23,000 atoms, 250 ns each, 5 days maximum time to solution.

Traditional cluster:
  Nodes required: 12
  Interconnect: QDR IB
  Time to complete simulations: 4.98 days
  Power consumption: 5.7 kW (681.3 kWh)
  System cost (per day): $96,800 ($88.40)
  Simulation cost: (681.3 x 0.18) + (88.40 x 4.98) = $562.87
Cost Comparison
Scenario: 4 simultaneous simulations, 25,000 atoms, 250 ns each, 5 days maximum time to solution.

                                Traditional Cluster              GPU Workstation
  Nodes required                12                               1 (4 GPUs)
  Interconnect                  QDR IB                           None
  Time to complete simulations  4.98 days                        2.25 days
  Power consumption             5.7 kW (681.3 kWh)               1.0 kW (54.0 kWh)
  System cost (per day)         $96,800 ($88.40)                 $5,200 ($4.75)
  Simulation cost               (681.3 x 0.18) + (88.40 x 4.98)  (54.0 x 0.18) + (4.75 x 2.25)
                                = $562.87                        = $20.41

More than 25x cheaper, AND the solution is obtained in less than half the time.
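The simulation-cost figures above are just electricity plus hardware cost amortized over the run. A minimal sketch of that arithmetic (the function name is illustrative, not part of any Amber tooling; the electricity rate of $0.18/kWh is taken from the slide):

```python
def simulation_cost(energy_kwh, rate_per_kwh, system_cost_per_day, days):
    """Total cost of a run: electricity consumed plus the amortized
    per-day cost of the hardware over the wall-clock time."""
    return energy_kwh * rate_per_kwh + system_cost_per_day * days

# Numbers from the slide:
cluster = simulation_cost(681.3, 0.18, 88.40, 4.98)    # ~$562.87
workstation = simulation_cost(54.0, 0.18, 4.75, 2.25)  # ~$20.41
```

Both values reproduce the slide's totals, and their ratio is where the ">25x cheaper" claim comes from.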
Q&A
Replica Exchange Molecular Dynamics in the Age of Heterogeneous Architectures: The Stuff Dreams Are Made Of...
Adrian E. Roitberg, Quantum Theory Project, Department of Chemistry, University of Florida
- Conformational space has many local minima.
- Barriers hinder rapid escape from high-energy minima.
- How can we search and score all conformations? [Figure: energy vs. coordinate.]
Most of chemistry, and certainly all of biology, happens at temperatures substantially above 0 K.
Why is convergence so slow?
- We want the average of observable properties, so we need to include in that weighted average all of the important values each observable might take.
- Structural averages can be slow to converge: you need to wait for the structure to change, and to do so many, many times.
- Sampling an event only once doesn't tell you the correct probability (or weight) for that event.

What is the proper weight? Boltzmann's equation:

  P_i = exp(-E_i / k_B T) / sum_j exp(-E_j / k_B T)
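Boltzmann's equation is easy to evaluate directly; a minimal sketch, assuming energies in kcal/mol and k_B in kcal/(mol K) (the function name is illustrative):

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def boltzmann_weights(energies, temperature):
    """Normalized populations P_i = exp(-E_i/kT) / sum_j exp(-E_j/kT)."""
    beta = 1.0 / (KB * temperature)
    # Shift by the minimum energy before exponentiating, for numerical stability.
    e_min = min(energies)
    factors = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(factors)
    return [f / z for f in factors]

# Two states separated by 1 kcal/mol at 300 K: the lower state dominates.
weights = boltzmann_weights([0.0, 1.0], 300.0)
```

This is exactly why rare, high-energy events carry small but non-zero weight, and why sampling them correctly matters.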
We need to sample phase space for a biomolecule. Our tool of choice is molecular dynamics: loop over time steps, and at each step compute forces and energies. Within regular MD methodology, there is only one way to sample more conformational space: run longer. The usual (old) way to port code to new architectures was to focus on bottlenecks; in MD, this meant the non-bonded interactions (vdW + electrostatics), which scale as O(N^2) (potentially O(N ln N)) and take ~90% of the time.

With GPUs we can run MUCH faster, for lower cost. So what? It is still a microsecond every 2 days, not enough to sample conformational space except locally. So we have code that runs FAST on one or two GPUs, and we have the emergence of large machines (Keeneland, Blue Waters, Stampede, etc.) with many GPUs. Are there methods we can use to leverage these two components? The simple way is to run MANY independent simulations, each fast: the probability of something "interesting" happening increases linearly with the number of runs. Good, but not good enough.
Can we run independent simulations, have them 'talk' to each other, and increase overall sampling? Yes! Can we do it according to the laws of statistical mechanics? Yes!

Temperature Replica Exchange (parallel tempering): replicas run in parallel at a ladder of temperatures, e.g. 300 K, 325 K, 350 K, 375 K, and periodically attempt to swap configurations between neighbors (REMD; Hansmann, U., CPL 1997; Sugita & Okamoto, CPL 1999).

You do not simply EXCHANGE: you ask PERMISSION to exchange and then decide.
An exchange between replicas i and j is accepted with probability

  rho = min(1, exp{(beta_i - beta_j)[E(q_m) - E(q_n)]})

and, on acceptance, velocities are rescaled to the new temperature:

  v_new = v_old * sqrt(T_new / T_old)

Impose the desired weighting (limiting distribution) on the exchange calculation, and impose reversibility / detailed balance:

  P(X) w(X -> X') = P(X') w(X' -> X),  with P(X) = exp(-beta * E_X)

Rewrite using only potential energies, by rescaling the kinetic energy of both replicas.
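The acceptance test and velocity rescaling above can be sketched in a few lines. This is a toy illustration of the Metropolis exchange criterion, not the pmemd implementation; function names are hypothetical and energies are assumed to be in kcal/mol:

```python
import math
import random

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def attempt_exchange(E_i, T_i, E_j, T_j, rng=random.random):
    """Metropolis test for swapping the configurations held by two replicas.

    E_i is the potential energy of the configuration currently in replica i
    (at temperature T_i); accept with probability
    min(1, exp[(beta_i - beta_j) * (E_i - E_j)]).
    """
    beta_i = 1.0 / (KB * T_i)
    beta_j = 1.0 / (KB * T_j)
    delta = (beta_i - beta_j) * (E_i - E_j)
    return delta >= 0.0 or rng() < math.exp(delta)

def rescale_velocities(velocities, T_old, T_new):
    """After an accepted swap, rescale velocities by sqrt(T_new / T_old)."""
    scale = math.sqrt(T_new / T_old)
    return [v * scale for v in velocities]
```

When the colder replica holds the higher-energy configuration, delta >= 0 and the swap is always accepted; otherwise it is accepted with probability exp(delta).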
Replica exchange methods = treasure hunt + bartering. This only works when the replicas explore phase space 'well' and exchange configurations; the only gain comes from exchanging configurations.

REMD gives large gains in sampling efficiency and is perfect for heterogeneous architectures: you need FAST execution within a replica, while the exchanges involve very little data (communication overhead is effectively zero), so scaling is very close to perfect. Important questions: How many replicas? Where do I put the replicas? How often do I try to exchange? How do I measure convergence and speedup?
"Exchange frequency in replica exchange molecular dynamics", Daniel Sindhikara, Yilin Meng, and Adrian E. Roitberg, Journal of Chemical Physics, 128, 024103 (2008).
"Optimization of Umbrella Sampling Replica Exchange Molecular Dynamics by Replica Positioning", Sabri-Dashti, D.; Roitberg, A. E., Journal of Chemical Theory and Computation, 9, 4692 (2013).
REMD does not have to be restricted to T-space! H-REMD is a powerful technique that uses a different Hamiltonian (H1, H2, H3, H4, ...) for each replica.

Within H-REMD, there is a further split:
A. Replicas in 'space' == umbrella sampling: each replica has a different umbrella, and replicas exchange. You can also use different restraints, or different restraint weights for NMR refinement, tighter restraints... Be creative.
B. Replicas in true Hamiltonian space: each replica is a different molecule, or has small changes in protonation, or a different force field. Imagine replicas where the rotational barriers (bottlenecks for sampling in polymers) are reduced from their regular values.
  pKa(protein) = pKa(model) + [ DeltaG(protein-AH -> protein-A-) - DeltaG(AH -> A-) ] / (2.303 k_B T)
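Once the two deprotonation free energies are in hand, the relation above is a one-line calculation; a sketch, assuming free energies in kcal/mol (the function name is hypothetical):

```python
KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def pka_protein(pka_model, dG_protein, dG_model, T=300.0):
    """pKa(protein) = pKa(model) + [dG_protein - dG_model] / (2.303 kB T).

    dG_protein: free energy of AH -> A- inside the protein environment.
    dG_model:   free energy of AH -> A- for the model compound in solution.
    """
    return pka_model + (dG_protein - dG_model) / (2.303 * KB * T)
```

At 300 K, a free energy difference of 2.303 k_B T (about 1.37 kcal/mol) shifts the pKa by exactly one unit.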
Application to a 'hard' pKa prediction: Asp 26 in thioredoxin, pKa = 7.5 (a regular Asp has a pKa of 4.0).
Meng, Yilin; Dashti, Danial Sabri; Roitberg, Adrian E., Journal of Chemical Theory and Computation, 7, 2721-2727 (2011).
REMD methods lend themselves to Lego-style mix and match; there is no restriction to only one replica ladder. One could do multidimensional REMD (multiscale REMD), e.g.:
- T + umbrella
- umbrella + umbrella + umbrella
- radius of gyration + T
- T + umbrella (QM/MM) (get enthalpy/entropy!)
- accelerated MD
Computational Studies of RNA Dynamics
"Multi-dimensional Replica Exchange Molecular Dynamics Yields a Converged Ensemble of an RNA Tetranucleotide", Bergonzo, C.; Henriksen, N.; Roe, D.; Swails, J.; Roitberg, A. E.; Cheatham III, T., Journal of Chemical Theory and Computation, 10, 492-499 (2014).

Experimental rGACC ensemble:
- NMR spectroscopy for single-stranded rGACC.
- Major and minor A-form-like structures (relative to an A-form reference): NMR major 70-80%, NMR minor 20-30%.
Yildirim et al., J. Phys. Chem. B, 2011, 115, 9261.
Complete sampling is difficult:
- Conformational variability is high, even though the structure is small.
- The populations of different structures (shown blue, green, red, and yellow) are still changing, even after > 5 µs of simulation.
Henriksen et al., 2013, J. Phys. Chem. B, 117, 4914. (Slide: Niel Henriksen)
Enhanced Sampling using Multi-Dimensional Replica Exchange MD (M-REMD)
- Parallel simulations ("replicas") run at different temperatures (277 K - 400 K) and Hamiltonians (full dihedrals down to 30% dihedrals).
- Exchanges are attempted at specific intervals.
- 192 replicas, each run on its own GPU.
Sugita, Y.; Kitao, A.; Okamoto, Y., 2000, J. Chem. Phys., 113, 6042. Henriksen et al., J. Phys. Chem. B, 2013, 117, 4014-27. Bergonzo et al., 2014, JCTC, 10, 492.
Criteria for Convergence
- Independent runs with different starting structures sample the same conformations with the same probabilities.
- The Kullback-Leibler divergence of the final probability distributions (of any simulation observable) between 2 runs reaches zero.
Bergonzo et al., 2014, JCTC, 10, 492.
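The second criterion can be evaluated with a few lines; a minimal sketch of a discrete Kullback-Leibler divergence between, say, the cluster-population histograms of two independent runs (the function name is illustrative, and the small epsilon guards against empty bins):

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) for two discrete probability distributions,
    given as equal-length lists of probabilities."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0.0)
```

Two converged runs give nearly identical distributions and a divergence near zero; any persistent gap between the runs keeps the divergence positive.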
Cyan indicates a 95% confidence level, where slope and y-intercept ranges are denoted by α and β.

Top 3 clusters (population with ff12, population with DAC):
- NMR Major: 18.1%, 31.7%
- NMR Minor: 24.6%, 23.4%
- Intercalated anti: 15.9%, 10.7%
NMR Major, 1-syn*: 48.6% (*This structure is never sampled in Amber FF simulations.)
pH-Replica Exchange Molecular Dynamics in proteins using a discrete protonation method: each replica runs MD at a different pH (talk to me if interested), with exchanges between replicas. One can titrate biomolecules very efficiently.
Acid-range HEWL titration:

  Residue    Calculated pKa    Exp. pKa
  Glu 7       3.6               2.6
  His 15      6.0               5.5
  Asp 18      2.3               2.8
  Glu 35      5.7               6.1
  Asp 48      1.2               1.4
  Asp 52      2.7               3.6
  Asp 66     -18.1              1.2
  Asp 87      2.5               2.2
  Asp 101     3.7               4.5
  Asp 119     2.3               3.5
  RMSE        6.5               ---

(1) Webb, H.; Tynan-Connolly, B. M.; Lee, G. M.; Farrell, D.; O'Meara, F.; Sondergaard, C. R.; Teilum, K.; Hewage, C.; McIntosh, L. P.; Nielsen, J. E. Proteins 2011, 79, 685-702.
Constant pH MD, implicit solvent. In pH-Replica Exchange MD, replicas run at a ladder of pH values (e.g. pH = 2, 3, 4, 5) and attempt exchanges between neighboring pH values.
Itoh, S. G.; Damjanovic, A.; Brooks, B. R. Proteins 2011, 79, 3420-3436.
Protonation State Sampling (CpHMD vs. pH-REMD):
- Example 1: <RSS> = 2.33 x 10^-2 (CpHMD) vs. <RSS> = 3.60 x 10^-4 (pH-REMD). REMD wins, though not by much.
- Example 2: <RSS> = 1.33 x 10^-1 (CpHMD) vs. <RSS> = 2.51 x 10^-4 (pH-REMD). REMD wins by a lot.
  Residue    CpHMD pKa    pH-REMD pKa    Experimental pKa
  Glu 7       3.6          3.8            2.6
  His 15      6.0          5.9            5.5
  Asp 18      2.3          2.0            2.8
  Glu 35      5.7          5.0            6.1
  Asp 48      1.2          2.0            1.4
  Asp 52      2.7          2.3            3.6
  Asp 66     -18.1         1.5            1.2
  Asp 87      2.5          2.7            2.2
  Asp 101     3.7          3.6            4.5
  Asp 119     2.3          2.4            3.5
  RMSE        6.5          0.89           ---

Substantially better results with pH-REMD.
Acknowledgements: David Case, Andy McCammon, John Mongan, Ross Walker, Tom Cheatham III, Dario Estrin, Marcelo Marti, Leo Boechi, Dr. Yilin Meng, Natali Di Russo, Dr. Jason Swails, Dr. Danial Sabri-Dashti.
$ + computer time: NSF/Blue Waters PRAC, NSF SI2, Keeneland Director's allocation, Titan, NVIDIA.
Mass Repartitioning to Increase MD Timestep (double the fun, double the pleasure)
Chad Hopkins, Adrian Roitberg
Background
- Work based on "Improving Efficiency of Large Time-scale Molecular Dynamics Simulations of Hydrogen-rich Systems", Feenstra, Hess, Berendsen, JCC 1999.
- They recommended increasing the H mass by a factor of 4, repartitioning heavy-atom masses to keep the total system mass constant.
- For technical reasons with the Amber code, we increase by a factor of 3.
- Waters are not repartitioned (unneeded, due to different constraint handling).
Repartitioning Example (masses in amu)
- Each hydrogen: 1.008 -> 3.024
- A carbon bonded to three H: 12.01 -> 5.962
- A carbon bonded to one H: 12.01 -> 9.994
- A nitrogen bonded to one H: 14.01 -> 11.99
- Atoms with no bonded hydrogens (e.g. carbonyl C 12.01, O 16.00) are unchanged.
The total mass of the molecule is unchanged.
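The repartitioning rule (scale each H mass by 3, subtract the added mass from its bonded heavy atom, leave waters alone) can be sketched as follows. This is an illustrative toy, not the Amber/ParmEd implementation; function and parameter names are hypothetical:

```python
def repartition_masses(masses, bonds, h_scale=3.0, is_water=None):
    """Hydrogen mass repartitioning.

    masses: list of atomic masses (amu); bonds: list of (i, j) index pairs.
    Each hydrogen (mass < 2 amu) is scaled by h_scale, and the added mass is
    subtracted from its bonded heavy atom, so the total mass is unchanged.
    Water atoms (flagged via is_water) are skipped.
    """
    new = list(masses)
    is_water = is_water or [False] * len(masses)
    for i, j in bonds:
        # Pick the lighter atom as the candidate hydrogen.
        h, x = (i, j) if masses[i] < 2.0 else (j, i)
        if masses[h] >= 2.0 or is_water[h]:
            continue  # not an X-H bond, or water (left unrepartitioned)
        added = (h_scale - 1.0) * masses[h]
        new[h] += added
        new[x] -= added
    return new

# A C-H fragment reproduces the slide's numbers: 1.008 -> 3.024, 12.01 -> 9.994.
ch = repartition_masses([12.01, 1.008], [(0, 1)])
```

Slowing the fastest motions (X-H vibrations) this way is what permits the 4 fs time step quoted in the timings above.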
Small peptide sampling (capped 3-ALA), 10 x 20 (40) ns simulations:
- Normal topology, 2 fs timestep: middle phi/psi populations.
- Repartitioned topology, 4 fs timestep: middle phi/psi populations.

Average timings (Tesla M2070s):
- Normal topology, 2 fs timestep: 160 ns/day
- Repartitioned topology, 4 fs timestep: 305 ns/day

Dynamics test (peptide): free energy map generated from the MD trajectory for phi/psi near the N-terminus. Counting hops (for equal numbers of frames in each case) across the barrier between the minima centered at (-60, 150) and (-60, -40):

                 Norm 2fs    Repart 2fs    Repart 4fs
  Average hops   112         114           200
Additional validation (figures not shown): per-residue RMSDs (HEWL); C-N termini distance profile; free energy calculation with mass repartitioning (PMF from umbrella sampling for the disulfide linker dihedral in the MTS spin label used in EPR experiments).
Acknowledgements
- San Diego Supercomputer Center, University of California San Diego
- University of Florida
- National Science Foundation, NSF SI2-SSE Program
- NVIDIA Corporation (hardware + people)
People: Scott Le Grand, Duncan Poole, Mark Berger, Sarah Tariq, Romelia Salomon, Andreas Goetz, Robin Betz, Ben Madej, Jason Swails, Chad Hopkins.
TEST DRIVE K40 GPU - WORLD'S FASTEST GPU
"I am very glad that NVIDIA has provided the GPU Test Drive. Both my professor and I appreciate the opportunity as it profoundly helped our research." (Mehran Maghoumi, Brock Univ., Canada)
www.nvidia.com/GPUTestDrive

SPECIAL OFFER FOR AMBER USERS: buy 3 K40s, get a 4th GPU free. Contact your HW vendor or NVIDIA representative for more details.
UPCOMING GTC EXPRESS WEBINARS
- May 14: CUDA 6: Performance Overview
- May 20: Using GPUs to Accelerate Orthorectification, Atmospheric Correction, and Transformations for Big Data
- May 28: An Introduction to CUDA Programming
- June 3: The Next Steps for Folding@home
www.gputechconf.com/gtcexpress
NVIDIA GLOBAL IMPACT AWARD
Recognizing groundbreaking work with GPUs in tackling key social and humanitarian problems.
- $150,000 annual award
- Categories include: disease research, automotive safety, weather prediction
- Submission deadline: Dec. 12, 2014
- Winner announced at GTC 2015
impact.nvidia.com