Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and...

26
FCCM 2004 Reconfigurable Molecular Dynamics Simulator Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and Paul Chow University of Toronto

Transcript of Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and...

Page 1: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabihaand Paul ChowUniversity of Toronto

Page 2: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

2

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Why is Molecular Dynamics interesting? Simulates interaction of atoms over time

Many possible applications– Biomolecules

Computationally intensive to handle >1000 atoms

Large computer clusters used in the past

Can a reconfigurable simulation system do this better?

Page 3: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

3

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

What is Molecular Dynamics?

Simulate using classical Newtonian mechanicsF = m a

Integrate acceleration to get position and velocity changes

Use a very small timestep ~ 1 femtosecond

Page 4: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

4

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Molecular Dynamics Background

Simulation Procedure per timestep– Sum force over all interacting atoms

– Calculate acceleration

– Integrate acceleration to update atom position and velocity

– Repeat for all atoms

Page 5: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

5

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Background - Forces

Two types of forces–Bonded – O(n)

–No hardware acceleration required–Non Bonded – O(n2)

–Needs hardware acceleration

Page 6: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

6

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

-2.5E-11

0.0E+00

2.5E-11

5.0E-11

7.5E-11

1.0E-10

Distance

Pote

ntia

lBackground – Force Calculation

Lennard-Jones (LJ) potential models interaction

Force on a atom is the gradient of potential

( )⎥⎥⎦

⎢⎢⎣

⎡⎟⎠⎞

⎜⎝⎛−⎟

⎠⎞

⎜⎝⎛=

612

4rr

rLJσσεφ

Page 7: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

7

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Background – Simulating Large VolumesAny interesting volume has far too many atoms to simulate

Solution – Periodic Boundary Conditions

ReplicatedBox

ReplicatedBox

ReplicatedBox

ReplicatedBox

ReplicatedBox

ReplicatedBox

ReplicatedBox

ReplicatedBox

Box beingsimulated

BA

B’A’

B’A’

B’A’

B’A’

B’A’

B’A’

B’A’

B’A’

Page 8: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

8

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Architectural Design – System Overview

PairGen

VerletUpdate

ForceComputer

AccelerationUpdate

SunWorkstation

SystemControl

ParticleMemory

FunctionValue

Memory

SlopeMemory

Page 9: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

9

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Architectural Design – Force Computer

r2 from PG used for function lookup

Interpolate to obtain a more accurate force magnitude

PairG

enForce

Computer

AccelerationUpdate

FunctionValue

Memory

SlopeMemory

Page 10: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

10

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Architectural Design – Force Computerr2 is larger than 18-bits

–Look up table has a 18-bit address

-1.00E+09

0.00E+00

1.00E+09

2.00E+09

3.00E+09

4.00E+09

5.00E+09

6.00E+09

0 5E-19 1E-18 1.5E-18 2E-18

Separation2

Psue

do-A

ccel

erat

ion

Page 11: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

11

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Architectural Design – Force Computer

r2

FF...FF

low bits

middlebits

highbits

Residual forinterpolation

Address tolookup tables

Page 12: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

12

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Precision and Scaling Factors

Architecture uses integer operations to reduce complexity

Precision: number of bits used to represent a valueScaling Factor: the weight of the least significant bit of the value

Precision

ScalingFactor

Page 13: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

13

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Calculating the Precision and Scaling FactorsCalculations made with atoms at varying distances

Scaling Factor = the minimum valuePrecision = log2 of the difference between minimum and maximum

372-64Acceleration

512-15Velocity

382-64Position

PrecisionScaling Factor

Quantity

Page 14: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

14

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Simulation Environment is ConfigurableSimulation reconfigurability

– Change precision, scaling factors, number of atoms. forces

– No wasted hardware

– No time overhead when precision is reduced

Entire process is automated– One input file controls entire process

– Hardware – C program creates appropriate VHDL

– Software interface, Software initialization– Always match the hardware

Page 15: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

15

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

ImplementationUsed the Transmogrifier 3

–4 interconnected Virtex-E 2000’s

–2MB memories connected to each Virtex-E

–Slow by today’s standards

Page 16: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

16

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

VerificationTested accuracy of implementation

–Compared TM3 results with software

0

500

1000

1500

2000

2500

0 100 200 300 400 500 600 700

Timestep

Ener

gy (k

J/m

ol)

TM3 -Total EnergySoftware - Total EnergyTM3 - Kinetic EnergySoftware - Kinetic EnergyTM3 - Potential EnergySoftware - Potential Energy

Page 17: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

17

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

System Performance

For a 8192 atom MD system running on the TM3

–Frequency: 26 MHz–Timestep Duration: 37 sec

For a 8192 atom software system running on a 2.4GHz Pentium 4

–Timestep Duration: 10.8 sec

MD system is 3.4X slower than software

Page 18: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

18

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

How to improve this?Memory

New FPGA

Parallelism

Page 19: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

19

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Memory Requirements of MD System

Acceleration Array (8192 atom system uses 0.17 MB)

Velocity Array (8192 atom system uses 0.17 MB)

Position Array (8192 atom system uses 0.34 MB)

Lookup Tables: 2 MB

Page 20: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

20

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Improving Performance – Memory Organization On TM3 there is only one external SRAM per FPGA

Single SRAM for all atom information causes large slowdowns

– Handshaking

– Serial reads for x, y and z

– Hardware issues

Better memory system

2.1 seconds/timestep (5X faster than software)

Page 21: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

21

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Improving Performance – Clock Speed

Run on modern FPGA

All possible improvements for clock speed not explored

Expect a factor of 4 increase to a 100MHz

Better memory system + Faster Clock Speed

0.51 seconds/timestep (21X faster than software)

Page 22: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

22

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Improving Performance – Parallel Architecture

Better memory system + Faster Clock Speed + Parallelize

0.51/n seconds/timestep (21n X faster than software)

PairGen

AU(1)AA

AU(0)AA

AU(n-1)AA

VU(1)VA

VU(0)VA

VU(n-1)VA

PAi

PA

iP

AiPA

jPA

jPA

j

LJFC(0)

LJFC(1)

LJFC(n-1)

Slope Mem

Value Mem

Value Mem

Slope Mem

Value Mem

Slope Mem

Page 23: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

23

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Cost, Power Benefits

Performance– MD system can deliver a 21X performance benefit over

software

– Assume a conservative 10X performance advantage

Cost– Microprocessor motherboard-sized board can fit 4

FPGAs

– 4 FPGAs (each $200) + board + SRAM + misc. ~ $1500

– Microprocessor Motherboard + CPU + DRAM ~ $1500

Page 24: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

24

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Comparison of MD Simulator and Supercomputers

40W106WPower~$1500~$1500Cost40X1XPerformance

4 FPGAs + 24 MB SRAM

1 Pentium + 1 GB DRAM

Per Board

40X ImprovementPerformance/Space40X ImprovementPerformance/Cost

100X ImprovementPerformance/Power

Improvement of MD System

Metric

Page 25: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

25

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

ConclusionsEasily reconfigurable MD System designed

Molecular dynamics simulation can be done on FPGAs

Simple enhancements will improve speed– Power, Cost and Space savings over software

Page 26: Navid Azizi, Ian Kuon, Aaron Egier, Ahmad Darabiha and ...pc/research/publications/azizi.fccm2004... · Psuedo-Acceleration. 11 FCCM 2004 ... – C program creates appropriate VHDL

26

FCCM 2004

Reconfigurable Molecular Dynamics Simulator

Future WorkImprove accuracy

Target newer FPGA platform

Support new forces

Funding for the TM3 Project was provided by Micronet and Xilinx

Acknowledgements