Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations
![Page 1: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/1.jpg)
Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations
David Gobaud
Computational Drug Discovery
Stanford University
7 March 2006
![Page 2: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/2.jpg)
Outline
- Overview
- Background
  - Delft Molecular Dynamics Processor
  - GRAPE
- Protein Explorer Summary
- MDGRAPE-3 Chip
  - Force Calculation Pipeline
  - J-Particle Memory and Control Units
- System Architecture
- Software
- Cost
- Questions
![Page 3: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/3.jpg)
Overview
- Protein Explorer
  - Petaflops special-purpose computer system for molecular dynamics simulations
  - High-precision screening for drug design
  - Large-scale simulations of huge proteins/complexes
- PC cluster with special-purpose engines to perform the most time-consuming calculations
- Dedicated LSI MDGRAPE-3 chip performs force calculations at 165 Gflops or higher
- ETA 2006
![Page 4: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/4.jpg)
Background
- PCs are universal machines
  - Various applications
  - Hardware can be designed independently of applications
- Obstacles to high performance
  - Memory bandwidth bottleneck
  - Heat dissipation problem
  - Can be overcome by developing specialized architectures
![Page 5: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/5.jpg)
Delft Molecular Dynamics Processor (DMDP)
- Pioneered high-performance special-purpose systems
- Not able to achieve effective cost-performance
  - Demanded too much time and money in the development stage
  - Speed of development is a crucial factor affecting cost-performance because electronic device technology continues to develop rapidly
  - Almost all calculations were performed by the DMDP, making the hardware very complex
![Page 6: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/6.jpg)
GRAPE (GRAvity PipE)
- One of the most successful attempts to develop high-performance special-purpose systems
- Specialized for simulations of classical particles
- Most time is spent on the calculation of long-range forces (gravitational, Coulomb, and van der Waals)
  - Thus the special hardware performs only these calculations
  - Hardware is very simple and cost-effective
![Page 7: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/7.jpg)
GRAPE (GRAvity PipE)
- In 1995, first machine to break the teraflops barrier in nominal peak performance
- Since 2001, the leader in performance has been the Molecular Dynamics Machine at RIKEN, at 78 Tflops
- In 2002, a 64-Tflops GRAPE-6 was completed at the University of Tokyo
- Protein Explorer was launched based on the 2002 University of Tokyo success
![Page 8: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/8.jpg)
Protein Explorer Summary
- Host PC cluster with special-purpose boards attached
- Boards calculate only non-bonded forces
  - Very simple hardware and software
  - No detailed knowledge of the hardware is needed to write programs
- Communication time between host and boards is proportional to the number of particles, N
- Calculation time is proportional to
  - N^2 for direct summation of long-range forces
  - N*Nc for short-range forces, where Nc is the average number of particles within the cutoff radius
- 0.25 byte/1000 operations
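The scaling argument on this slide can be sketched as a small cost model. This is only an illustration: the function names and the sample values of N and Nc are assumptions, not figures from the talk.

```python
def direct_sum_interactions(n: int) -> int:
    """Pairwise interactions for direct summation of long-range forces (~N^2)."""
    return n * n


def cutoff_interactions(n: int, n_c: int) -> int:
    """Pairwise interactions for short-range forces with a cutoff (~N * Nc)."""
    return n * n_c


def work_per_particle_transferred(n: int, interactions: int) -> float:
    """Communication grows like N, so this ratio is the board-side work
    obtained for each particle sent between host and boards."""
    return interactions / n


if __name__ == "__main__":
    n_c = 200  # hypothetical average neighbor count inside the cutoff
    for n in (10_000, 100_000, 1_000_000):
        direct = direct_sum_interactions(n)
        cutoff = cutoff_interactions(n, n_c)
        print(n,
              work_per_particle_transferred(n, direct),
              work_per_particle_transferred(n, cutoff))
```

Under these assumptions the work per transferred particle grows with N for direct summation and stays at Nc for the cutoff case, which is why a modest host-to-board link can be sufficient.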
![Page 9: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/9.jpg)
MDGRAPE-3 Chip - Force Calculation Pipeline
- 3 subtractor units
- 6 adder units
- 8 multiplier units
- 1 function-evaluation unit
- Can perform ~33 equivalent operations per cycle when it calculates the Coulomb force
![Page 10: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/10.jpg)
MDGRAPE-3 Chip - Force Calculation Pipeline
![Page 11: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/11.jpg)
MDGRAPE-3 Chip - Force Calculation Pipeline
- Most operations are done in 32-bit single-precision floating-point format
- Force accumulation uses 80-bit fixed-point format
  - Can be converted to 64-bit double-precision floating point
- Coordinates are stored in 40-bit fixed-point format
  - Makes implementation of periodic boundary conditions easy
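One way to see why fixed-point coordinates help with periodic boundaries: if the box length is mapped onto the full 40-bit integer range, a wraparound subtraction gives the minimum-image displacement directly. The sketch below assumes that mapping; the helper names are hypothetical and this is not a description of the chip's actual data path.

```python
BITS = 40
MOD = 1 << BITS          # number of representable positions across the box
HALF = 1 << (BITS - 1)   # half the box, in fixed-point units


def to_fixed(x: float, box: float) -> int:
    """Map a coordinate to an unsigned 40-bit integer; positions outside
    [0, box) wrap around automatically."""
    return int(round((x / box) * MOD)) % MOD


def displacement(xi_fixed: int, xj_fixed: int, box: float) -> float:
    """Minimum-image x_i - x_j using modular (wraparound) subtraction."""
    d = (xi_fixed - xj_fixed) % MOD   # unsigned wraparound difference
    if d >= HALF:                     # reinterpret as a signed 40-bit value
        d -= MOD
    return d * box / MOD


if __name__ == "__main__":
    box = 10.0
    a = to_fixed(9.5, box)
    b = to_fixed(0.5, box)
    # The two particles are 1.0 apart across the boundary, not 9.0 apart.
    print(displacement(a, b, box))
```

No explicit image search or conditional shift by the box length is needed; the wraparound of the integer arithmetic does the work.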
![Page 12: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/12.jpg)
MDGRAPE-3 Chip - Force Calculation Pipeline
- Function Evaluator
  - Most important part of the pipeline
  - Allows calculation of an arbitrary smooth function
  - Has a memory unit, which contains a table of polynomial coefficients and exponents, and a hardwired pipeline for fourth-order polynomial evaluation
  - Interpolates an arbitrary smooth function g(x) using segmented fourth-order polynomials evaluated by Horner's method
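A minimal software sketch of a table-driven evaluator of this kind follows. It makes several simplifying assumptions that are not from the talk: segments are uniform in x rather than selected by exponent, g(x) = x^(-3/2) is used as the example function, and the per-segment coefficients come from a Taylor expansion about the segment center rather than from a fitted interpolant.

```python
def horner(coeffs, dx):
    """Evaluate c0 + c1*dx + ... + c4*dx**4 with Horner's method.
    `coeffs` is ordered from c0 up to c4."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * dx + c
    return acc


def build_table(x_min, x_max, n_segments):
    """Per-segment fourth-order Taylor coefficients of g(x) = x**-1.5."""
    width = (x_max - x_min) / n_segments
    table = []
    for s in range(n_segments):
        center = x_min + (s + 0.5) * width
        coeffs = []
        factor = 1.0  # running value of (-1.5)(-2.5)...(-1.5-(k-1)) / k!
        for k in range(5):
            coeffs.append(factor * center ** (-1.5 - k))
            factor *= (-1.5 - k) / (k + 1)
        table.append((center, coeffs))
    return table, width


def evaluate(table, width, x_min, x):
    """Look up the segment containing x, then run the quartic through Horner."""
    s = min(int((x - x_min) / width), len(table) - 1)
    center, coeffs = table[s]
    return horner(coeffs, x - center)


if __name__ == "__main__":
    table, width = build_table(1.0, 4.0, 64)
    for x in (1.0, 2.3, 3.999):
        print(x, evaluate(table, width, 1.0, x), x ** -1.5)
```

Horner's form needs only four multiply-add steps per evaluation, which maps naturally onto a fixed hardware pipeline; changing the table contents changes the function without changing the pipeline.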
![Page 13: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/13.jpg)
MDGRAPE-3 Chip - J-Particle Memory and Control Units
- 20 force calculation pipelines
- j-Particle Memory Unit
  - 32,768 bodies
  - “Main Memory”
  - 6.6 Mbits, constructed from static RAM
- Cell-Index Controller
  - Controls the j-particle memory (generates addresses)
- Force Summation Unit
- Master Controller
  - Manages timings and inputs/outputs of the chip
![Page 14: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/14.jpg)
MDGRAPE-3 Chip
- 2 virtual pipelines per physical pipeline
- Physical bandwidth of the j-particle unit is 2.5 Gbytes/sec, but the virtual bandwidth will reach 100 Gbytes/sec
- 340 arithmetic units and 20 function-evaluator units, which work simultaneously
- 165 Gflops at 250 MHz
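The chip-level figures can be cross-checked with a few lines of arithmetic. This sketch assumes that the ~33 equivalent operations are counted per pipeline per clock cycle, and that the virtual bandwidth comes from each j-particle word being shared by all 20 pipelines and both virtual pipelines; both are readings that reproduce the quoted numbers, not statements from the talk.

```python
PIPELINES = 20
OPS_PER_CYCLE = 33               # assumed: equivalent operations per pipeline per cycle
CLOCK_GHZ = 0.25                 # 250 MHz
UNITS_PER_PIPELINE = 3 + 6 + 8   # subtractors + adders + multipliers
VIRTUAL_PER_PHYSICAL = 2
PHYSICAL_BW_GBYTES = 2.5

# Peak rate: pipelines x operations per cycle x cycles per nanosecond.
peak_gflops = PIPELINES * OPS_PER_CYCLE * CLOCK_GHZ

# Arithmetic units, excluding the one function evaluator in each pipeline.
arithmetic_units = PIPELINES * UNITS_PER_PIPELINE

# Assumed reuse factor: every word read serves all physical and virtual pipelines.
virtual_bw_gbytes = PHYSICAL_BW_GBYTES * PIPELINES * VIRTUAL_PER_PHYSICAL

if __name__ == "__main__":
    print(peak_gflops, arithmetic_units, virtual_bw_gbytes)
```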
![Page 15: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/15.jpg)
MDGRAPE-3 Chip
![Page 16: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/16.jpg)
MDGRAPE-3 Chip
- Chip made by Hitachi
- 6M gates
- 10M bits of memory
- Chip size is ~220 mm^2
- Dissipates 20 watts at a core voltage of +1.2 V
- 0.12 W/Gflops, much better than a 3 GHz P4, which is 14 W/Gflops
![Page 17: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/17.jpg)
System Architecture
- Host PC cluster will use Itanium or Opteron CPUs
- 256 nodes with 512 CPUs in total
- Performance of each node is 3.96 Tflops
  - Total reaches a petaflops
- Requires a 10 Gbit/sec network
  - InfiniBand, 10G Ethernet, or future Myrinet
- Network topology will be a 2D hyper-crossbar
- Each node has 24 MDGRAPE-3 chips
- MDGRAPE-3 chips are connected via 2 PCI-X buses at 133 MHz
- A 19-inch rack can house 6 nodes
  - 43 racks in total
  - Power dissipation ~150 kW
  - Occupies 100 m^2
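The system-level numbers follow from the per-chip figure by simple multiplication. The sketch below only recombines values quoted on the slides; the variable names are illustrative.

```python
import math

CHIP_GFLOPS = 165      # per MDGRAPE-3 chip
CHIPS_PER_NODE = 24
NODES = 256
NODES_PER_RACK = 6

node_gflops = CHIP_GFLOPS * CHIPS_PER_NODE       # per-node special-purpose peak
system_gflops = node_gflops * NODES              # whole-system peak
racks = math.ceil(NODES / NODES_PER_RACK)        # a partly filled rack still counts

if __name__ == "__main__":
    print(f"node:   {node_gflops / 1e3:.2f} Tflops")
    print(f"system: {system_gflops / 1e6:.3f} Pflops")
    print(f"racks:  {racks}")
```

The node figure matches the quoted 3.96 Tflops, and the system total lands just above one petaflops.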
![Page 18: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/18.jpg)
System Architecture
![Page 19: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/19.jpg)
Protein Explorer Board
![Page 20: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/20.jpg)
Software
- Very easy to create programs for
- All computational abilities are provided in a library
- No special knowledge of the device is needed
![Page 21: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/21.jpg)
Cost
- $20 million including labor
- Less than $10/Gflops
  - At least ten times better than general-purpose computers, even when compared with the relatively cheap BlueGene/L ($140/Gflops)
![Page 22: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/22.jpg)
Questions
- What is Myrinet?
- What is a two-dimensional hyper-crossbar network topology?
- How does this compare to massive distributed computing such as Folding@Home?
  - Advantages?
  - Disadvantages?
![Page 23: Protein Explorer: A Petaflops Special Purpose Computer System for Molecular Dynamics Simulations](https://reader035.fdocuments.in/reader035/viewer/2022062806/56814e5e550346895dbbfbe5/html5/thumbnails/23.jpg)