The HPC Challenge Benchmark
Transcript of The HPC Challenge Benchmark
The HPC Challenge Benchmark
Jack Dongarra
Innovative Computing Laboratory, University of Tennessee
and Computer Science and Mathematics Division
Oak Ridge National Laboratory
http://icl.cs.utk.edu/hpcc/
Phases I - III
[Program timeline diagram, fiscal years 2002-2010: industry procurements, research prototypes and pilot systems, early software tools, early pilot platforms, metrics and benchmarks, and products.]
Phase I - Industry Concept Study: 5 companies, $10M each.
Phase II - R&D: 3 companies, ~$50M each.
Phase III - Full Scale Development: ~$100M (?), with HPCS capability or products commercially ready in the 2007 to 2010 timeframe.
Critical program milestones: concept reviews, PDR, DDR, system design review, application analysis, technology assessments, requirements and metrics, performance assessments, Phase II readiness reviews, and the Phase III readiness review.
Productivity team (Lincoln Lab lead), industry, and mission partners; participants include MIT Lincoln Lab, LCS, and Ohio State; PIs: Kepner, Koester, Vetter, Lusk, Post, Bailey, Gilbert, Edelman, Ahalt, Mitchell, Lucas, Basili, Benson, Snavely, Dongarra.
Motivation for Additional Benchmarks
♦ From Linpack Benchmark and Top500: “no single number can reflect overall performance”
♦ Without the Linpack benchmark (HPL), only peak performance would be reported
♦ Clearly need something more than Linpack
♦ HPC Challenge Benchmark
Goals of the HPC Challenge Benchmark
♦ Stress CPU, memory system, and interconnect
♦ Allow for optimizations; record the effort needed for tuning
♦ Provide verification of results
♦ Archive results
♦ Requires: MPI and the BLAS
HPC Challenge Benchmark: Initial Release 11/03
Consists of five benchmarks; think of it as a framework or harness for adding benchmarks of interest (illustrative kernel and communication sketches follow the list below).
1. HPL (LINPACK) ― MPI on whole system (Ax = b)
2. STREAM ― single CPU; *STREAM ― embarrassingly parallel on the whole system
3. PTRANS (A = A + B^T) ― MPI on whole system
4. RandomAccess ― single CPU; *RandomAccess ― embarrassingly parallel; RandomAccess ― MPI on whole system
5. BW and Latency – MPI
Coming soon: FFT and Matrix Multiply
[Diagram: RandomAccess ― random integer read, update, and write between proc i and proc k]
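To make the character of these kernels concrete, here is a minimal C sketch of the single-CPU STREAM triad and the RandomAccess update loop. It is illustrative only: the array sizes, the POLY constant, and the function names are assumptions for the sketch, not the HPCC source itself.

```c
/* Illustrative sketches of the STREAM triad and RandomAccess update kernels.
 * Sizes, names, and the POLY constant are for illustration, not taken
 * verbatim from the HPCC source. */
#include <stddef.h>
#include <stdint.h>

#define N        (1 << 22)                   /* STREAM vector length (illustrative)       */
#define TABLE_SZ (1ULL << 22)                /* RandomAccess table size, a power of two   */
#define POLY     0x0000000000000007ULL       /* feedback polynomial for the update stream */

/* STREAM triad: unit-stride, bandwidth-bound memory traffic. */
void stream_triad(double *a, const double *b, const double *c, double scalar) {
    for (size_t i = 0; i < N; i++)
        a[i] = b[i] + scalar * c[i];
}

/* RandomAccess: read-modify-write of random table locations (GUPS). */
void random_update(uint64_t *table, uint64_t nupdate) {
    uint64_t ran = 1;
    for (uint64_t i = 0; i < nupdate; i++) {
        ran = (ran << 1) ^ (((int64_t)ran < 0) ? POLY : 0);  /* next pseudo-random value */
        table[ran & (TABLE_SZ - 1)] ^= ran;                  /* update one random word   */
    }
}
```

The triad streams through memory with near-perfect spatial locality, while the random updates touch the table with essentially none, which is why these two benchmarks bracket the memory-access space shown on the following slides.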
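The bandwidth and latency component (item 5) is, to a first approximation, message-passing ping-pong timing; the HPCC test itself (derived from Rolf Rabenseifner's effective-bandwidth work) measures several patterns over many process pairs. The sketch below shows only a basic two-rank ping-pong with an illustrative message size and repetition count.

```c
/* Minimal MPI ping-pong sketch (illustrative; not the HPCC bandwidth/latency code). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int nbytes = 1 << 20;        /* 1 MiB message (illustrative) */
    const int reps   = 100;
    char *buf = malloc(nbytes);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double dt = MPI_Wtime() - t0;

    if (rank == 0) {
        printf("avg round trip: %.2f us\n", dt / reps * 1e6);
        printf("bandwidth:      %.1f MB/s\n", 2.0 * reps * nbytes / dt / 1e6);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}
```

Run with at least two ranks (e.g. mpirun -np 2); any additional ranks simply idle in this sketch.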
Applications
[Figure: Memory Access Patterns ― applications (Computational Fluid Dynamics, Digital Signal Processing, Traveling Salesperson, Radar Cross Section) placed on axes of spatial locality (low to high) versus temporal locality (low to high)]
Applications / Signatures
[Figure: Memory Access Patterns ― the same spatial vs. temporal locality plot, now with benchmark signatures (HPL Linpack, STREAM, PTRANS, RandomAccess, and FFT coming soon) mapped onto the application regions from the previous slide]
How Will the Benchmarking Work?
♦ Single program to download and run
Simple input file, similar to the HPL input (an illustrative sketch follows this list)
♦ Base Run and Optimized Run
Base run must be made; the user supplies MPI and the BLAS
Optimized run allowed to replace certain routines; the user specifies what was done
♦ Results uploaded via the website
♦ HTML table and Excel spreadsheet generated with the performance results
♦ Intentionally, we are not providing a single figure of merit (no overall ranking)
♦ Goal: no more than 2x the time it takes to execute HPL.
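For reference, the base-run input file follows the familiar HPL.dat layout. The fragment below is a minimal sketch with illustrative values only; the actual HPCC input file carries a few additional lines (for example, extra PTRANS problem and block sizes) documented with the distribution.

```
HPLinpack benchmark input file                 (illustrative values)
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout, 7=stderr, file)
1            # of problem sizes (N)
10000        Ns
1            # of NBs
80           NBs
0            PMAP process mapping (0=Row-, 1=Column-major)
1            # of process grids (P x Q)
2            Ps
2            Qs
16.0         threshold
...          (remaining HPL tuning lines, plus the HPCC-specific additions)
```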
http://icl.cs.utk.edu/hpcc/
Coming soon: FFT and Matrix Multiply
Go to…
♦ http://icl.cs.utk.edu/hpcc/
Example of Output: http://icl.cs.utk.edu/hpcc/
http://icl.cs.utk.edu/hpcc/
Expanded Set of Benchmarks
♦ Constructing a framework for benchmarks
♦ Developing machine signatures
♦ Plans are to expand the benchmark collection
♦ Currently working on: DGEMM and *DGEMM; FFT (1-D complex)
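As a rough picture of the matrix-multiply addition, the sketch below times one double-precision GEMM through the CBLAS interface. It is not the HPCC harness: the problem size, the timing with clock(), and the absence of result verification are simplifications for illustration.

```c
/* Illustrative DGEMM timing sketch via CBLAS (not the HPCC harness). */
#include <cblas.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    const int n = 1000;                            /* illustrative problem size */
    double *A = malloc((size_t)n * n * sizeof *A);
    double *B = malloc((size_t)n * n * sizeof *B);
    double *C = malloc((size_t)n * n * sizeof *C);
    for (int i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 2.0; C[i] = 0.0; }

    /* C = 1.0 * A * B + 0.0 * C, all matrices n x n, column-major. */
    clock_t t0 = clock();
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, A, n, B, n, 0.0, C, n);
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    printf("DGEMM: %.2f Gflop/s\n", 2.0 * (double)n * n * n / secs / 1e9);
    free(A); free(B); free(C);
    return 0;
}
```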
Future Directions
♦ Port to new systems
♦ Provide more implementations
Languages (Fortran, UPC, Co-Array Fortran); environments; paradigms
♦ Other basic operations: sparse matrix, I/O
Collaborators
♦ Piotr Łuszczek, U of Tennessee
♦ David Bailey, NERSC/LBL
♦ Jeremy Kepner, MIT Lincoln Lab
♦ David Koester, MITRE
♦ Bob Lucas, ISI/USC
♦ John McCalpin, IBM, Austin
♦ Rolf Rabenseifner, HLRS Stuttgart
http://icl.cs.utk.edu/hpcc/