Amazon Web Services: Performance Analysis of High Performance Computing Applications on the Amazon Web Services Cloud
Summarized by: Michael Riera
9/17/2011
University of Central Florida – CDA5532
Agenda
• Purpose
• Benchmarks used
• Machine Setups (including EC2)
• Experiment Setup
• Results
• Conclusions
Introduction
• The purpose of this paper is to compare Amazon EC2 service performance against industry-standard benchmarks for High Performance Computing data centers.
• The paper draws comparisons between known supercomputers, an HPC data center, and AWS EC2.
Benchmarks
• NERSC benchmarking framework
– Workload includes:
• Areas of climate
• Materials science
• Fusion
• Accelerator modeling
• Astrophysics
• Quantum Chromodynamics
• Integrated Performance Monitoring (IPM)
– Used to quantify the computation and the communication performed through MPI interfaces.
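The computation/communication split that IPM reports can be sketched as a simple wall-clock breakdown. This is a minimal illustration, not IPM itself; the two lambda arguments are hypothetical stand-ins for real compute kernels and MPI calls.

```python
import time

def profile_phases(compute_fn, comm_fn):
    # Time each phase and return the fraction of wall-clock time spent
    # in each, mirroring the computation/communication split IPM reports
    # for MPI applications.
    t0 = time.perf_counter()
    compute_fn()
    t1 = time.perf_counter()
    comm_fn()
    t2 = time.perf_counter()
    total = t2 - t0
    return {"compute": (t1 - t0) / total, "comm": (t2 - t1) / total}

# Hypothetical stand-ins for real work; an actual profile would wrap
# the application's compute loops and MPI communication calls.
report = profile_phases(lambda: sum(range(200000)),
                        lambda: time.sleep(0.01))
```

A workload whose `comm` fraction dominates is the kind that suffers most on a slow interconnect, which is the pattern the paper observes on EC2.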
Machine Setup
• Carver
– National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory
– 400 nodes
• Quad-core Intel Nehalem processors, 2.67 GHz
• Dual-socket nodes with a single Quad Data Rate (QDR) InfiniBand link
• Each node has 24 GB of RAM (3 GB per core)
Machine Setup
• Franklin
– National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory
– 9,660 nodes
• Cray XT4 supercomputer
• Single quad-core 2.3 GHz AMD Opteron “Budapest” processor per node
• 6.4 GB/s node interconnect
• Each node has 8 GB of RAM (2 GB per core)
Machine Setup
• Lawrencium
– Information Technology Division at Berkeley
– 198 nodes (1584 core)
• Dell PowerEdge 1950 servers
• Two quad-core 64-bit Intel Xeon “Harpertown” processors, 2.66 GHz
• DDR InfiniBand network
• Each node has 16 GB of RAM (2 GB per core)
Machine Setup
• Amazon EC2
– Virtual configuration
• CPU capacity is defined in terms of an abstract Amazon EC2 Compute Unit (CU).
• One EC2 CU is approximately equivalent to a 1.0–1.2 GHz 2007-era Opteron or Xeon processor.
• The large instance has:
– 4 EC2 Compute Units
– 2 virtual cores
– 7.5 GB of memory
– Interconnect: Gigabit Ethernet
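The CU-to-GHz equivalence above is only a rough range, but it makes the large instance's aggregate capacity easy to estimate. The helper below is a hypothetical illustration of that arithmetic, not an AWS API.

```python
def ec2_cu_to_ghz(compute_units, per_cu=(1.0, 1.2)):
    # Map a number of EC2 Compute Units to an approximate aggregate
    # clock-equivalent range in GHz, using the stated 1.0-1.2 GHz
    # per-CU equivalence. Purely illustrative arithmetic.
    return (compute_units * per_cu[0], compute_units * per_cu[1])

# The "large" instance: 4 CUs spread over 2 virtual cores.
low, high = ec2_cu_to_ghz(4)  # roughly 4.0-4.8 GHz aggregate
```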
Machine Setup
• Underlying hardware identified via /proc/cpuinfo
• Different combinations observed (no control over assignment):
– Intel Xeon E5430 2.66 GHz quad-core processors
– AMD Opteron 270 2.0 GHz dual-core processors
– AMD Opteron 2218 HE 2.6 GHz dual-core processors
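Discovering which physical CPU an instance landed on amounts to scanning /proc/cpuinfo for the distinct "model name" fields. A minimal sketch of that check, run here against a sample cpuinfo excerpt rather than a live system:

```python
def cpu_models(cpuinfo_text):
    # Collect the distinct "model name" entries from /proc/cpuinfo text;
    # more than one distinct model across a set of instances means the
    # cloud assigned heterogeneous hardware.
    models = set()
    for line in cpuinfo_text.splitlines():
        if line.strip().startswith("model name"):
            models.add(line.split(":", 1)[1].strip())
    return models

# Sample excerpt in /proc/cpuinfo format (illustrative, not captured
# from a real EC2 instance).
sample = (
    "processor\t: 0\n"
    "model name\t: Intel(R) Xeon(R) CPU E5430 @ 2.66GHz\n"
    "processor\t: 1\n"
    "model name\t: Intel(R) Xeon(R) CPU E5430 @ 2.66GHz\n"
)
```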
Experiment Setup
• CAM
– The Community Atmosphere Model (CAM) is the atmospheric component of the Community Climate System Model (CCSM)
• GAMESS
– Uses socket-based communication
– Dominated by stride-1 memory access, which stresses memory bandwidth and interconnect collective performance
Experiment Setup
• GTC
– Fully self-consistent, gyrokinetic 3-D particle-in-cell (PIC) code with a non-spectral Poisson solver
• IMPACT-T
– Integrated Map and Particle Accelerator Tracking Time
– Uses Hockney's FFT
• MAESTRO
– Used to simulate astrophysical flows, such as those leading up to ignition in Type Ia supernovae
• MILC
– Represents the lattice computation used to study Quantum Chromodynamics
• PARATEC
– Performs Density Functional Theory quantum-mechanical total-energy calculations using pseudopotentials
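PARATEC's core operation is a 3-D FFT; when the transform is distributed across nodes, each transpose step becomes an all-to-all exchange, which is why this benchmark is the most sensitive to interconnect quality. A small single-node round trip of that operation, as a sketch using NumPy:

```python
import numpy as np

# A 3-D FFT round trip on a small grid, illustrating the operation at
# the heart of PARATEC's total-energy calculation. Grid size and data
# are arbitrary choices for illustration.
rng = np.random.default_rng(0)
grid = rng.random((32, 32, 32))

spectrum = np.fft.fftn(grid)            # forward 3-D transform
recovered = np.fft.ifftn(spectrum).real  # inverse transform round trip
```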
Results
On GAMESS, Franklin, Lawrencium, and EC2 are 1.4x, 2.6x, and 2.7x slower than Carver, respectively. The worst case is PARATEC: EC2 is more than 50x slower than Carver. PARATEC performs a 3-D FFT, and on it EC2 ran 52x slower than Carver.
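The slowdown factors above are simply runtime ratios relative to Carver, the baseline machine. A sketch of that calculation, using hypothetical runtimes chosen only to reproduce the reported GAMESS ratios (not measured values from the paper):

```python
# Slowdown relative to Carver = runtime(platform) / runtime(Carver).
# These runtimes are illustrative placeholders, not real measurements.
runtimes = {"Carver": 100.0, "Franklin": 140.0,
            "Lawrencium": 260.0, "EC2": 270.0}

slowdown = {name: t / runtimes["Carver"] for name, t in runtimes.items()}
# Franklin -> 1.4x, Lawrencium -> 2.6x, EC2 -> 2.7x
```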
Results: AWS Cloud HW Variance
CONCLUSION
• Users cannot control which type of hardware their instances run on in the cloud
• Near-supercomputer performance is nonetheless becoming accessible to every household