LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux...

24
IBM Deep Computing Group June, 2003 | IBM Deep Computing Group © 2002 IBM Corporation LS-DYNA: CAE Simulation Software on Linux Clusters Guangye Li ([email protected]) IBM Deep Computing Team

Transcript of LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux...

Page 1: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

IBM Deep Computing Group

June, 2003 | IBM Deep Computing Group © 2002 IBM Corporation

LS-DYNA: CAE Simulation Software on Linux Clusters

Guangye Li ([email protected])IBM Deep Computing Team

Page 2: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

2

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Topics

Introduction to LS-DYNALS-DYNA ApplicationsTwo versions of LS-DYNA: SMP and MPPAn example

Performance of LS-DYNA on clustersPerformance Improvement with Faster ProcessorsInterconnect Options: Gigabit Ethernet or MyrinetOne or two process nodesComparison of LAM/MPI and MPICH PerformanceSpeedup from Compiler OptionsSpeedup from Faster 533 MHz Front side Bus

Chrysler experience

Page 3: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

3

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

LS-DYNA: A general purpose transient dynamic finite

element program capable of simulating complex real world problems

Software Vendor: Livermore Software Technology Corp. (LSTC)

Largest application in CAE

Large customer base

Page 4: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

4

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

LS-DYNA applications include:

Occupant safety Metal Forming Metal Cutting Biomedical Blast loading Fluid-structure interaction Earthquake engineering …

Page 5: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

5

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Two parallel versions of LS-DYNA

SMP (OpenMP) for shared memory multiple processors.• Parallelized from a serial code• Scalable up to 16 CPUs

MPP (Distributed memory version)• Using the domain decomposition technique• Using MPI for communications between subdomains

(processors)• Scalable up to more than 100 CPUs. • Suitable for both shared memory multiple

processors and clusters• MPP-DYNA on clusters dramatically reduced the

turnaround time and the simulation cost

Page 6: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

6

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Comparison of SMP and MPP

05000

100001500020000250003000035000

1-CPU 2-CPU 4-CPU 8-CPU 16-CPU 32-CPU

ElapsedTime (sec)

SMP MPP1.3 GHz IBM p690November 2002 LS-DYNArefined Neon-535k elements

Page 7: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

7

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

An Example: The Neon Model

Frontal crash with initial speed at 31.5 miles/hour

Model sizenumber of shell elements: 269,249number of nodal points: 285,832

Simulation length: 150 msvehicle bounce back observed at 70 ms

Model created by National Crash Analysis Center (NCAC) at GeorgeWashington University

one of the few publicly available model for vehicle crash analysisbased on 1996 Plymouth Neon

Page 8: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

8

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

1996 Plymouth Neon

Page 9: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

9

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

The model

Page 10: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

10

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

The mesh

Page 11: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

11

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Domain decomposition

The whole mesh is decomposed into NCPU subdomains.

Each domain has about the same number of elementsEach link cut corresponding to communications between two

nodes. The decomposition should minimize the link cuts

Each CPU processes elements in its subdomain

CPUs exchange boundary data using message passing (MPI)

Page 12: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

12

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Page 13: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

13

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Simulation results

Page 14: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

14

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Performance Improvement with Faster Processors

0

5000

10000

15000

20000

25000

2-CPU 4-CPU 8-CPU 16-CPU 32-CPU 64-CPU

ElapsedTime (sec)

2.4 GHz 2.8 GHz

V960 r1488 LS-DYNAXeon, 2 CPUs per nodeGigabit EthernetJan-March 2003 LAM/MPIrefined Neon-535k elements

Page 15: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

15

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Configuring Each Node with One Processor

02000400060008000

10000120001400016000

2-CPU 4-CPU 8-CPU 16-CPU 32-CPU 64-CPU

ElapsedTime (sec)

2 CPUs per node 1 CPU per nodeV960 r1488 LS-DYNAGigabit Ethernet x335 2.8 GHzMarch 2003 LAM/MPIFront crash model 430k elements

Page 16: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

16

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Interconnect – Effect on Performance

0

5000

10000

15000

20000

25000

2-CPU 4-CPU 8-CPU 16-CPU 32-CPU

ElapsedTime (sec)

Fast Ethernet Gigabit Ethernet Myrinet2.2 GHz IntelliStation ClusterJune 2002 MPI LS-DYNArefined Neon-535k elements

Page 17: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

17

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Interconnect Performance Compared

0

5

10

15

20

25

30

2-CPU 4-CPU 8-CPU 16-CPU 32-CPU

ParallelSpeedup

x335+Fast Ethernet x335+Gigabit Ethernetx335+Myrinet p655+SP Switch2

V960 LS-DYNAJan 2003 Refined Neon 535k Elements

Page 18: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

18

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Comparison of LAM/MPI and MPICH Performance

0500

100015002000250030003500

16-CPU 32-CPU

ElapsedTime (sec)

MPICH-1.2.4 LAM/MPI-6.5.62.8 GHz x335 (Xeon) ClusterGigabit EthernetMarch 2003 LS-DYNArefined Neon-535k elements

Page 19: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

19

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Speedup from Compiler Options

25110No_SSE20781SSE

Elapsed time (sec)Intel Compiler Option

V960 r1106 MPP-DYNAFeb 2002 LAM/MPI 6.5.22.2 GHz IntelliStation node – 12 processor runs

Page 20: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

20

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Speedup from Faster 533 MHz Frontside Bus

1.184300001.201550001.08320001.1012000

Speedup: 400MHz to 533 MHz Frontside Bus

Model Size (elements)

V960 r1488 LS-DYNAMarch 2003 LAM/MPI2.8 GHz x335 node – 2 processor runs

Page 21: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

21

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Performance Improvement with Version 970

02000400060008000

100001200014000160001800020000

2-CPU 4-CPU 8-CPU 16-CPU 32-CPU

Elap

sed

Tim

e (s

ec) version 960 r1488 version 970 r3535

2.8 GHz x335 (Xeon) ClusterGigabit EthernetMarch 2003 LAM/MPI MPP-DYNArefined Neon-535k elements

1.20

1.151.14

1.10

Page 22: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

22

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Chrysler experience

Customer requirements• Reduced turn around time

• Price/performance

• Good accuracy, i.e., The numerical results should match the results on those from the current 64 bit machines

A team work• Chrysler

• LSTC

• IBM

• IntelEventually all 22 QA models passed the accuracy requirements and Chrysler bought 108 Xeon based IBM Linux cluster nodes for car crash simulation

Page 23: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

23

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Chrysler is happy with the IBM Linux cluster solution

Without parallel processing, we never would have achieved 5* (NCAP) and “good” (IIHS) on our new Chrysler Sebring and Dodge Stratus within the current product development time. --SubhasShetty, Chrysler

Page 24: LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux Clusters ... V960 r1488 LS-DYNA 2 CPUs per node 1 CPU per node Gigabit Ethernet x335

24

IBM Deep Computing Team – 2003 Cluster World Conference

Cluster World Conference | June 2003 © 2002 IBM Corporation

Summary

MPI based MPP-DYNA has better scalabilityLinux clusters reduced the turn around time for car crash simulationLinux clusters reduced the simulation costThe accuracy is satisfactoryUsers today can customize their system in order to pick the features which serve them best

Processors

Operating system

Interconnect