![Page 1: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/1.jpg)
Towards Petascale Computing for Science
Horst Simon, Lenny Oliker, David Skinner, and Erich Strohmaier
Lawrence Berkeley National Laboratory
The Salishan Conference on High-Speed Computing
April 19, 2005
![Page 2: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/2.jpg)
Outline
• Science Driven Architecture
• Performance on today’s (2004 - 2005) platforms
• Challenges with scaling to the Petaflop/s level
• Two tools that can help: IPM and APEX/MAP
![Page 3: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/3.jpg)
Scientific Applications and Underlying Algorithms Drive Architectural Design
• 50 Tflop/s - 100 Tflop/s sustained performance on applications of national importance
• Process:
  – identify applications
  – identify computational methods used in these applications
  – identify architectural features most important for performance of these computational methods
Reference: Creating Science-Driven Computer Architecture: A New Path to Scientific Leadership, (Horst D. Simon, C. William McCurdy, William T.C. Kramer, Rick Stevens, Mike McCoy, Mark Seager, Thomas Zacharia, Jeff Nichols, Ray Bair, Scott Studham, William Camp, Robert Leland, John Morrison, Bill Feiereisen), Report LBNL-52713, May 2003. (see www.nersc.gov/news/reports/HECRTF-V4-2003.pdf)
![Page 4: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/4.jpg)
Capability Computing Applications in DOE/SC
• Accelerator modeling
• Astrophysics
• Biology
• Chemistry
• Climate and Earth Science
• Combustion
• Materials and Nanoscience
• Plasma Science/Fusion
• QCD
• Subsurface Transport
![Page 5: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/5.jpg)
Capability Computing Applications in DOE/SC (cont.)
These applications and their computing needs have been well-studied in the past years:
• “A Science-Based Case for Large-scale Simulation”, David Keyes, Sept. 2004 (http://www.pnl.gov/scales).
• “Validating DOE’s Office of Science “Capability” Computing Needs”, E. Barsis, P. Mattern, W. Camp, R. Leland, SAND2004-3244, July 2004.
![Page 6: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/6.jpg)
Science Breakthroughs Enabled by Leadership Computing Capability
• Astrophysics
  – Goals: Determine through simulations and analysis of observational data the origin, evolution and fate of the universe, the nature of matter and energy, galaxy and stellar evolutions
  – Computational methods: Multi-physics, multi-scale; dense linear algebra; parallel 3D FFTs; spherical transforms; particle methods; adaptive mesh refinement
  – Breakthrough target (50-100 Tflop/s): Simulate the explosion of a supernova with a full 3D model
• Climate
  – Goals: Accurately detect and attribute climate change, predict future climate and engineer mitigation strategies
  – Computational methods: Finite difference methods; FFTs; regular and irregular access; simulation ensembles
  – Breakthrough target (50-100 Tflop/s): Perform a full ocean/atmosphere climate model with 0.125 degree spacing, with an ensemble of 8-10 runs
• Fusion
  – Goals: Understand high-energy density plasmas and develop an integrated simulation of a fusion reactor
  – Computational methods: Multi-physics, multi-scale; particle methods; regular and irregular access; nonlinear solvers; adaptive mesh refinement
  – Breakthrough target (50-100 Tflop/s): Simulate the ITER reactor
• Combustion
  – Goals: Predict combustion processes to provide efficient, clean and sustainable energy
  – Computational methods: Explicit finite difference; implicit finite difference; zero-dimensional physics; adaptive mesh refinement; Lagrangian particle methods
  – Breakthrough target (50-100 Tflop/s): Simulate laboratory scale flames with high fidelity representations of governing physical processes
• Nanoscience
  – Goals: Simulate the synthesis and predict the properties of multi-component nanosystems
  – Computational methods: Quantum molecular dynamics; quantum Monte Carlo; iterative eigensolvers; dense linear algebra; parallel 3D FFTs
  – Breakthrough target (50-100 Tflop/s): Simulate nanostructures with hundreds to thousands of atoms as well as transport and optical properties and other parameters
![Page 7: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/7.jpg)
Opinion Slide
One reason why we have failed so far to make a good case for increased funding in supercomputing is that we have not yet made a compelling science case.
A better example: “The Quantum Universe”
“It describes a revolution in particle physics and a quantum leap in our understanding of the mystery and beauty of the universe.”
http://interactions.org/quantumuniverse/
![Page 8: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/8.jpg)
How Science Drives Architecture
State-of-the-art computational science requires increasingly diverse and complex algorithms
Only balanced systems that can perform well on a variety of problems will meet future scientists’ needs!
Data-parallel and scalar performance are both important
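[Table: algorithmic requirements by science area. Rows: Astrophysics, Climate, Fusion, Combustion, Nanoscience. Columns: Multi-Physics and Multi-Scale, Dense Linear Algebra, FFTs, Particle Methods, AMR, Data Parallelism, Irregular Control Flow. An X marks each requirement of an area; the positions of the individual marks are not recoverable.]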
![Page 9: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/9.jpg)
Phil Colella’s “Seven Dwarfs”
Algorithms that consume the bulk of the cycles of current high-end systems in DOE:
• Structured Grids
• Unstructured Grids
• Fast Fourier Transform
• Dense Linear Algebra
• Sparse Linear Algebra
• Particles
• Monte Carlo
(Should also include optimization / solution of nonlinear systems, which at the high end is something one uses mainly in conjunction with the other seven)
![Page 10: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/10.jpg)
“Evaluation of Leading Superscalar and Vector Architectures for Scientific Computations”
Leonid Oliker, Andrew Canning, Jonathan Carter (LBNL)
Stephane Ethier (PPPL)
(see SC04 paper at http://crd.lbl.gov/~oliker/ )
![Page 11: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/11.jpg)
Material Science: PARATEC
• PARATEC performs first-principles quantum mechanical total energy calculations using pseudopotentials and a plane wave basis set
• Density Functional Theory (DFT) is used to calculate the structure and electronic properties of new materials
• DFT calculations are one of the largest consumers of supercomputer cycles in the world
• PARATEC uses an all-band conjugate gradient (CG) approach to obtain the wavefunctions of the electrons
• Part of the calculation is done in real space and the rest in Fourier space, using a specialized 3D FFT to transform the wavefunctions
• Generally obtains a high percentage of peak on different platforms
• Developed with Louie and Cohen’s groups (UCB, LBNL), Raczkowski
![Page 12: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/12.jpg)
PARATEC: Code Details
• Code written in F90 and MPI (~50,000 lines)
• 33% 3D FFT, 33% BLAS3, 33% hand-coded F90
• Global communications in 3D FFT (transpose)
• 3D FFT is handwritten to minimize communication and reduce latency (written on top of vendor-supplied 1D complex FFT)
• Code has a setup phase, then performs many (~50) CG steps to converge the charge density of the system (speed data is for 5 CG steps and does not include setup)
![Page 13: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/13.jpg)
PARATEC: 3D FFT
– 3D FFT done via 3 sets of 1D FFTs and 2 transposes
– Most communication is in the global transpose (b) to (c); little communication in (d) to (e)
– Many FFTs done at the same time to avoid latency issues
– Only non-zero elements communicated/calculated
– Much faster than vendor-supplied 3D FFT
[Figure: panels (a) through (f) showing the stages of the parallel 3D FFT]
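The decomposition named on this slide (three sets of 1D FFTs separated by two transposes) can be sketched serially as follows. This is only an illustration of the data movement, not PARATEC's implementation: the function names are hypothetical, the 1D transform is a plain radix-2 FFT standing in for the vendor-supplied routine, each dimension is assumed to be a power of two, and the transposes are in-memory axis rotations rather than MPI communication.

```python
import cmath


def fft1d(x):
    """Radix-2 Cooley-Tukey FFT of a sequence whose length is a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def fft_last_axis(t):
    """One 'set' of 1D FFTs: transform every line along the last axis."""
    return [[fft1d(line) for line in plane] for plane in t]


def rotate_axes(t):
    """Transpose t[a][b][c] into r[c][a][b], so a new axis becomes the last one."""
    na, nb, nc = len(t), len(t[0]), len(t[0][0])
    return [[[t[a][b][c] for b in range(nb)] for a in range(na)]
            for c in range(nc)]


def fft3d(data):
    """3D FFT of data[x][y][z]; the result is laid out as out[ky][kz][kx]."""
    t = fft_last_axis(data)   # 1D FFTs along z -> [x][y][kz]
    t = rotate_axes(t)        # transpose 1     -> [kz][x][y]
    t = fft_last_axis(t)      # 1D FFTs along y -> [kz][x][ky]
    t = rotate_axes(t)        # transpose 2     -> [ky][kz][x]
    return fft_last_axis(t)   # 1D FFTs along x -> [ky][kz][kx]
```

Leaving the output in the permuted layout `[ky][kz][kx]` is what keeps the count at two transposes; a third would be needed to restore the original axis order. In a distributed setting the axis rotations are where the global communication would occur.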
![Page 14: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/14.jpg)
PARATEC: Performance
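| Data size | P | Power 3 Gflops/P | Power 3 %peak | Power4 Gflops/P | Power4 %peak | Altix Gflops/P | Altix %peak | ES Gflops/P | ES %peak | X1 Gflops/P | X1 %peak |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 432 atom | 32 | 0.95 | 63% | 2.0 | 39% | 3.7 | 62% | 4.7 | 60% | 3.0 | 24% |
| 432 atom | 64 | 0.85 | 57% | 1.7 | 33% | 3.2 | 54% | 4.7 | 59% | 2.6 | 20% |
| 432 atom | 128 | 0.74 | 49% | 1.5 | 29% | --- | --- | 4.7 | 59% | 1.9 | 15% |
| 432 atom | 256 | 0.57 | 38% | 1.1 | 21% | --- | --- | 4.2 | 52% | --- | --- |
| 432 atom | 512 | 0.41 | 28% | --- | --- | --- | --- | 3.4 | 42% | --- | --- |
| 686 atom | 128 | | | | | | | 4.9 | 62% | 3.0 | 24% |
| 686 atom | 256 | | | | | | | 4.6 | 57% | 1.3 | 10% |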
![Page 15: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/15.jpg)
Magnetic Fusion: GTC
• Gyrokinetic Toroidal Code: transport of thermal energy (plasma microturbulence)
• Goal of magnetic fusion is a burning plasma power plant producing cleaner energy
• GTC solves the gyroaveraged gyrokinetic system with a particle-in-cell (PIC) approach
• PIC scales as N instead of N²: particles interact with the electromagnetic field on a grid
• Allows solving the equations of particle motion with ODEs (instead of nonlinear PDEs)
• Main computational tasks:
  – Scatter: deposit particle charge to nearest grid points
  – Solve the Poisson equation to get the potential at each grid point
  – Gather: calculate the force on each particle based on the potential at neighboring grid points
  – Move particles by solving the equations of motion along the characteristics
  – Find particles that moved outside the local domain and update
• Developed at Princeton Plasma Physics Laboratory, vectorized by Stephane Ethier
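The five computational tasks listed above can be illustrated with a minimal one-dimensional, periodic, electrostatic PIC step. This is a toy sketch under stated assumptions, not GTC: there is no gyroaveraging and no toroidal geometry, the domain is a single periodic box (so "particles that left the domain" are simply wrapped around), charge is deposited with linear weighting, and the Poisson solve uses a naive DFT with a uniform neutralizing background. All function and parameter names are hypothetical.

```python
import cmath
import math


def scatter(xs, charge, n, dx):
    """Deposit each particle's charge on its two nearest grid points."""
    rho = [0.0] * n
    for x in xs:
        s = x / dx
        i = int(s)
        w = s - i                       # fractional distance past grid point i
        rho[i % n] += charge * (1.0 - w) / dx
        rho[(i + 1) % n] += charge * w / dx
    return rho


def solve_poisson(rho, dx):
    """Solve -phi'' = rho - mean(rho) on a periodic grid with a naive DFT."""
    n = len(rho)
    mean = sum(rho) / n                 # uniform neutralizing background
    phi_k = [0j] * n
    for k in range(1, n):
        rho_k = sum((rho[j] - mean) * cmath.exp(-2j * math.pi * k * j / n)
                    for j in range(n))
        # eigenvalue of the negative second-difference operator for mode k
        eig = (2.0 - 2.0 * math.cos(2.0 * math.pi * k / n)) / dx ** 2
        phi_k[k] = rho_k / eig
    return [sum(phi_k[k] * cmath.exp(2j * math.pi * k * j / n)
                for k in range(n)).real / n for j in range(n)]


def gather(phi, xs, dx):
    """Interpolate the field E = -dphi/dx from the grid to each particle."""
    n = len(phi)
    field = [-(phi[(j + 1) % n] - phi[(j - 1) % n]) / (2.0 * dx)
             for j in range(n)]
    out = []
    for x in xs:
        s = x / dx
        i = int(s)
        w = s - i
        out.append(field[i % n] * (1.0 - w) + field[(i + 1) % n] * w)
    return out


def pic_step(xs, vs, charge, mass, n, dx, dt):
    """Advance all particles by one time step; returns the charge density used."""
    length = n * dx
    rho = scatter(xs, charge, n, dx)
    phi = solve_poisson(rho, dx)
    fields = gather(phi, xs, dx)
    for p in range(len(xs)):
        vs[p] += (charge / mass) * fields[p] * dt   # equation of motion (an ODE per particle)
        xs[p] = (xs[p] + vs[p] * dt) % length       # wrap particles that left the domain
    return rho
```

The cost per step is linear in the number of particles apart from the field solve, which is the N-versus-N² point made on the slide: particles never interact pairwise, only through the grid.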
![Page 16: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/16.jpg)
GTC: Performance
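| Number of particles | P | Power 3 Gflops/P | Power 3 %peak | Power4 Gflops/P | Power4 %peak | Altix Gflops/P | Altix %peak | ES Gflops/P | ES %peak | X1 Gflops/P | X1 %peak |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 10/cell, 20M | 32 | 0.13 | 9% | 0.29 | 5% | 0.29 | 5% | 1.15 | 14% | 1.00 | 8% |
| 10/cell, 20M | 64 | 0.13 | 9% | 0.32 | 5% | 0.26 | 4% | 1.00 | 13% | 0.80 | 6% |
| 100/cell, 200M | 32 | 0.13 | 9% | 0.29 | 5% | 0.33 | 6% | 1.62 | 20% | 1.50 | 12% |
| 100/cell, 200M | 64 | 0.13 | 9% | 0.29 | 5% | 0.31 | 5% | 1.56 | 20% | 1.36 | 11% |
| | 1024 | 0.06 | 4% | | | | | | | | |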
GTC is now scaling to 2048 processors on the ES for a total of 3.7 Tflop/s
![Page 17: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/17.jpg)
Issues in Applications Scaling
Applications Status in 2005
• A few Teraflop/s sustained performance
• Scaled to 512 - 1024 processors
Applications on Petascale Systems need to deal with
• 100,000 processors (assume nominal Petaflop/s system with 100,000 processors of 10 Gflop/s each)
• Multi-core processors
• Topology sensitive interconnection network
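The nominal system size quoted above is simple arithmetic, worked through here for concreteness. The helper names are hypothetical, and the 5% fraction of peak in the usage note is an illustrative input chosen only to show the calculation, not a prediction for any machine.

```python
def nominal_peak_pflops(processors, gflops_per_processor):
    """Aggregate peak in Pflop/s (1 Pflop/s = 1e6 Gflop/s)."""
    return processors * gflops_per_processor / 1e6


def sustained_tflops(processors, gflops_per_processor, fraction_of_peak):
    """Sustained rate in Tflop/s for a given fraction of peak (1 Tflop/s = 1e3 Gflop/s)."""
    return processors * gflops_per_processor * fraction_of_peak / 1e3


def per_processor_gflops(total_tflops, processors):
    """Average sustained Gflop/s per processor for a measured aggregate rate."""
    return total_tflops * 1e3 / processors
```

With the slide's assumption, `nominal_peak_pflops(100_000, 10)` gives 1 Pflop/s. A code sustaining, say, 5% of peak on such a system would deliver about 50 Tflop/s via `sustained_tflops(100_000, 10, 0.05)`. Going from the 512 - 1024 processors of 2005 to 100,000 is roughly a hundredfold increase in concurrency.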
![Page 18: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/18.jpg)
Integrated Performance Monitoring (IPM)
• brings together multiple sources of performance metrics into a single profile that characterizes the overall performance and resource usage of the application
• maintains low overhead by using a unique hashing approach which allows a fixed memory footprint and minimal CPU usage
• open source, relies on portable software technologies and is scalable to thousands of tasks
• developed by David Skinner at NERSC (see http://www.nersc.gov/projects/ipm/ )
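The idea of a hashed profile with a fixed memory footprint can be sketched as below. This is a hypothetical illustration of the general technique, not IPM's code or API: the class name, the choice of event signature (call name, message size, partner rank), and the open-addressing scheme are all assumptions made for the example.

```python
class FixedProfile:
    """Event profile stored in a fixed number of slots (open addressing)."""

    def __init__(self, slots=1024):
        self.slots = slots
        self.keys = [None] * slots
        self.counts = [0] * slots
        self.seconds = [0.0] * slots
        self.dropped = 0            # events that found no free slot

    def record(self, call, nbytes, partner, elapsed):
        """Fold one event into the slot for its (call, nbytes, partner) signature."""
        key = (call, nbytes, partner)
        start = hash(key) % self.slots
        for probe in range(self.slots):
            j = (start + probe) % self.slots
            if self.keys[j] is None:
                self.keys[j] = key          # claim the first free slot for a new signature
            if self.keys[j] == key:
                self.counts[j] += 1
                self.seconds[j] += elapsed
                return True
        self.dropped += 1           # table full: memory stays fixed, event is only counted
        return False

    def report(self):
        """Return {signature: (count, total seconds)} for the occupied slots."""
        return {k: (c, s)
                for k, c, s in zip(self.keys, self.counts, self.seconds)
                if k is not None}
```

Because repeated events with the same signature update one slot in place, memory use depends on the number of distinct signatures rather than on how long the application runs, which is what makes a fixed table workable for a profile (as opposed to a full trace).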
![Page 19: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/19.jpg)
Scaling Portability: Profoundly Interesting
A high level description of the performance of a well known cosmology code on four well known architectures.
![Page 20: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/20.jpg)
16 Way for 4 seconds
(About 20 timestamps per second per task) *( 1…4 contextual variables)
![Page 21: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/21.jpg)
64 way for 12 seconds
![Page 22: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/22.jpg)
256 Way for 36 Seconds
![Page 23: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/23.jpg)
Application Topology
1024 way MILC
1024 way MADCAP
336 way FVCAM
If the interconnect is topology sensitive, mapping will become an issue (again)
![Page 24: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/24.jpg)
Interconnect Topology
![Page 25: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/25.jpg)
Interconnect Topology
![Page 26: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/26.jpg)
HPCS Program Goals & The HPCchallenge Benchmarks
[Figure: HPCchallenge benchmarks (RandomAccess, STREAM, HPL, PTRANS, FFT) and Mission Partner Applications placed on axes of spatial locality and temporal locality, each running from low to high]
DARPA HPCS will characterize applications
![Page 27: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/27.jpg)
APEX-Map: A Synthetic Benchmark to Explore the Space of Application Performances
Erich Strohmaier, Hongzhang Shan
Future Technology Group, LBNL
Co-sponsored by DOE/SC and NSA
![Page 28: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/28.jpg)
Apex-MAP characterizes architectures through a synthetic benchmark
[Figure: Apex-MAP parameter space with temporal locality (1/Re-use, 0 = high, 1 = low) and spatial locality (1/L, 0 = high, 1 = low) as axes; labelled regions: "HPL", "Global Streams", "Short indirect", "Small working set"]
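A synthetic kernel with two locality knobs can be sketched as follows. The slides only name the parameters (a for temporal locality, L for spatial locality); the power-law rule used here to pick block start addresses, start = (M - L) · r^(1/a) with r uniform in [0, 1), is an assumption of this sketch, as are the function names. It illustrates the idea only and is not the Apex-MAP source.

```python
import random


def apex_map_starts(memory_size, block_length, a, accesses, seed=0):
    """Draw block start indices; small a concentrates them near index 0."""
    rng = random.Random(seed)
    span = memory_size - block_length          # starts fall in [0, span]
    return [int(span * rng.random() ** (1.0 / a)) for _ in range(accesses)]


def apex_map_kernel(data, block_length, a, accesses, seed=0):
    """Sum `accesses` contiguous blocks of `block_length` elements from data."""
    total = 0.0
    for start in apex_map_starts(len(data), block_length, a, accesses, seed):
        for offset in range(block_length):     # spatial locality: stride-1 run of length L
            total += data[start + offset]
    return total
```

With a = 1 the starts are uniformly random over memory (low temporal locality); as a approaches 0 they cluster near the beginning, so the same addresses are revisited often. Timing `apex_map_kernel` over a grid of (L, a) values is what would produce a performance surface of the kind plotted on the following slides.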
![Page 29: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/29.jpg)
Apex-Map Sequential
[Figure: Seaborg, sequential: cycles (log scale, 0.1 to 1000) as a function of L (1 to 65536) and a (0.001 to 1.000)]
![Page 30: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/30.jpg)
Apex-Map Sequential
[Figure: Power4, sequential: cycles (log scale, 0.1 to 1000) as a function of L (1 to 65536) and a (0.001 to 1.000)]
![Page 31: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/31.jpg)
Apex-Map Sequential
[Figure: X1, sequential: cycles (log scale, 0.1 to 100) as a function of L (1 to 65536) and a]
![Page 32: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/32.jpg)
Apex-Map Sequential
[Figure: SX6, sequential: cycles (log scale, 0.1 to 1000) as a function of L (1 to 65536) and a]
![Page 33: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/33.jpg)
Parallel APEX-Map
[Figure: Seaborg, 256 processors: MB/s (log scale, up to 10000) as a function of L (1 to 65536) and a (0.001 to 1.000)]
![Page 34: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/34.jpg)
Parallel APEX-Map
[Figure: Power4, 256 processors: MB/s (log scale, 0.1 to 10000) as a function of L (1 to 65536) and a (0.001 to 1.000)]
![Page 35: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/35.jpg)
Parallel APEX-Map
[Figure: Altix, 256 processors: MB/s (log scale, 0.1 to 10000) as a function of L (1 to 65536) and a (0.001 to 1.000)]
![Page 36: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/36.jpg)
Parallel APEX-Map
[Figure: X1, 256 processors: MB/s (log scale, 0.1 to 100000) as a function of L (1 to 65536) and a (0.001 to 1.000)]
![Page 37: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/37.jpg)
Parallel APEX-Map
[Figure: ES, 256 processors: MB/s (log scale, 0.1 to 100000) as a function of L (1 to 65536) and a (0.001 to 1.000)]
![Page 38: Towards Petascale Computing for Science · Towards Petascale Computing for Science Horst Simon Lenny Oliker, David Skinner, and Erich Strohmaier Lawrence Berkeley National Laboratory](https://reader033.fdocuments.in/reader033/viewer/2022052007/601bf42b28d89660bc0afd12/html5/thumbnails/38.jpg)
Summary
• Three sets of tools (application benchmarks, performance monitoring, quantitative architecture characterization) have been shown to provide critical insight into application performance
• Need better quantitative data and measurements (like the ones discussed here) to help applications scale to the next generation of platforms