LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux...
Transcript of LS-DYNA: CAE Simulation Software on Linux · PDF fileLS-DYNA: CAE Simulation Software on Linux...
IBM Deep Computing Group
June, 2003 | IBM Deep Computing Group © 2002 IBM Corporation
LS-DYNA: CAE Simulation Software on Linux Clusters
Guangye Li ([email protected])IBM Deep Computing Team
2
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Topics
Introduction to LS-DYNALS-DYNA ApplicationsTwo versions of LS-DYNA: SMP and MPPAn example
Performance of LS-DYNA on clustersPerformance Improvement with Faster ProcessorsInterconnect Options: Gigabit Ethernet or MyrinetOne or two process nodesComparison of LAM/MPI and MPICH PerformanceSpeedup from Compiler OptionsSpeedup from Faster 533 MHz Front side Bus
Chrysler experience
3
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
LS-DYNA: A general purpose transient dynamic finite
element program capable of simulating complex real world problems
Software Vendor: Livermore Software Technology Corp. (LSTC)
Largest application in CAE
Large customer base
4
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
LS-DYNA applications include:
Occupant safety Metal Forming Metal Cutting Biomedical Blast loading Fluid-structure interaction Earthquake engineering …
5
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Two parallel versions of LS-DYNA
SMP (OpenMP) for shared memory multiple processors.• Parallelized from a serial code• Scalable up to 16 CPUs
MPP (Distributed memory version)• Using the domain decomposition technique• Using MPI for communications between subdomains
(processors)• Scalable up to more than 100 CPUs. • Suitable for both shared memory multiple
processors and clusters• MPP-DYNA on clusters dramatically reduced the
turnaround time and the simulation cost
6
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Comparison of SMP and MPP
05000
100001500020000250003000035000
1-CPU 2-CPU 4-CPU 8-CPU 16-CPU 32-CPU
ElapsedTime (sec)
SMP MPP1.3 GHz IBM p690November 2002 LS-DYNArefined Neon-535k elements
7
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
An Example: The Neon Model
Frontal crash with initial speed at 31.5 miles/hour
Model sizenumber of shell elements: 269,249number of nodal points: 285,832
Simulation length: 150 msvehicle bounce back observed at 70 ms
Model created by National Crash Analysis Center (NCAC) at GeorgeWashington University
one of the few publicly available model for vehicle crash analysisbased on 1996 Plymouth Neon
8
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
1996 Plymouth Neon
9
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
The model
10
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
The mesh
11
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Domain decomposition
The whole mesh is decomposed into NCPU subdomains.
Each domain has about the same number of elementsEach link cut corresponding to communications between two
nodes. The decomposition should minimize the link cuts
Each CPU processes elements in its subdomain
CPUs exchange boundary data using message passing (MPI)
12
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
13
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Simulation results
14
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Performance Improvement with Faster Processors
0
5000
10000
15000
20000
25000
2-CPU 4-CPU 8-CPU 16-CPU 32-CPU 64-CPU
ElapsedTime (sec)
2.4 GHz 2.8 GHz
V960 r1488 LS-DYNAXeon, 2 CPUs per nodeGigabit EthernetJan-March 2003 LAM/MPIrefined Neon-535k elements
15
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Configuring Each Node with One Processor
02000400060008000
10000120001400016000
2-CPU 4-CPU 8-CPU 16-CPU 32-CPU 64-CPU
ElapsedTime (sec)
2 CPUs per node 1 CPU per nodeV960 r1488 LS-DYNAGigabit Ethernet x335 2.8 GHzMarch 2003 LAM/MPIFront crash model 430k elements
16
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Interconnect – Effect on Performance
0
5000
10000
15000
20000
25000
2-CPU 4-CPU 8-CPU 16-CPU 32-CPU
ElapsedTime (sec)
Fast Ethernet Gigabit Ethernet Myrinet2.2 GHz IntelliStation ClusterJune 2002 MPI LS-DYNArefined Neon-535k elements
17
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Interconnect Performance Compared
0
5
10
15
20
25
30
2-CPU 4-CPU 8-CPU 16-CPU 32-CPU
ParallelSpeedup
x335+Fast Ethernet x335+Gigabit Ethernetx335+Myrinet p655+SP Switch2
V960 LS-DYNAJan 2003 Refined Neon 535k Elements
18
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Comparison of LAM/MPI and MPICH Performance
0500
100015002000250030003500
16-CPU 32-CPU
ElapsedTime (sec)
MPICH-1.2.4 LAM/MPI-6.5.62.8 GHz x335 (Xeon) ClusterGigabit EthernetMarch 2003 LS-DYNArefined Neon-535k elements
19
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Speedup from Compiler Options
25110No_SSE20781SSE
Elapsed time (sec)Intel Compiler Option
V960 r1106 MPP-DYNAFeb 2002 LAM/MPI 6.5.22.2 GHz IntelliStation node – 12 processor runs
20
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Speedup from Faster 533 MHz Frontside Bus
1.184300001.201550001.08320001.1012000
Speedup: 400MHz to 533 MHz Frontside Bus
Model Size (elements)
V960 r1488 LS-DYNAMarch 2003 LAM/MPI2.8 GHz x335 node – 2 processor runs
21
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Performance Improvement with Version 970
02000400060008000
100001200014000160001800020000
2-CPU 4-CPU 8-CPU 16-CPU 32-CPU
Elap
sed
Tim
e (s
ec) version 960 r1488 version 970 r3535
2.8 GHz x335 (Xeon) ClusterGigabit EthernetMarch 2003 LAM/MPI MPP-DYNArefined Neon-535k elements
1.20
1.151.14
1.10
22
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Chrysler experience
Customer requirements• Reduced turn around time
• Price/performance
• Good accuracy, i.e., The numerical results should match the results on those from the current 64 bit machines
A team work• Chrysler
• LSTC
• IBM
• IntelEventually all 22 QA models passed the accuracy requirements and Chrysler bought 108 Xeon based IBM Linux cluster nodes for car crash simulation
23
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Chrysler is happy with the IBM Linux cluster solution
Without parallel processing, we never would have achieved 5* (NCAP) and “good” (IIHS) on our new Chrysler Sebring and Dodge Stratus within the current product development time. --SubhasShetty, Chrysler
24
IBM Deep Computing Team – 2003 Cluster World Conference
Cluster World Conference | June 2003 © 2002 IBM Corporation
Summary
MPI based MPP-DYNA has better scalabilityLinux clusters reduced the turn around time for car crash simulationLinux clusters reduced the simulation costThe accuracy is satisfactoryUsers today can customize their system in order to pick the features which serve them best
Processors
Operating system
Interconnect