The HPC Challenge Benchmark
Transcript of The HPC Challenge Benchmark
The HPC Challenge Benchmark
Jack Dongarra
Innovative Computing Laboratory, University of Tennessee
and Computer Science and Mathematics Division
Oak Ridge National Laboratory
http://icl.cs.utk.edu/hpcc/
Phases I - III
[Program timeline diagram, fiscal years 2002-2010: industry procurements, research prototypes and pilot systems, early software tools, early pilot platforms, metrics and benchmarks, and products.]
Phase I - Industry Concept Study: 5 companies, $10M each.
Phase II - R&D: 3 companies, ~$50M each.
Phase III - Full Scale Development: ~$100M (?), with HPCS capability or products commercially ready in the 2007 to 2010 timeframe.
Critical program milestones: concept reviews, PDR, DDR, system design review, application analysis, technology assessments, requirements and metrics, performance assessments, Phase II readiness reviews, and the Phase III readiness review.
Productivity team (Lincoln Lab lead), industry, and mission partners; participants include MIT Lincoln Lab, LCS, and Ohio State; PIs: Kepner, Koester, Vetter, Lusk, Post, Bailey, Gilbert, Edelman, Ahalt, Mitchell, Lucas, Basili, Benson, Snavely, Dongarra.
Motivation for Additional Benchmarks
♦ From Linpack Benchmark and Top500: “no single number can reflect overall performance”
♦ Without the Linpack benchmark (HPL), only peak performance would be reported
♦ Clearly need something more than Linpack
♦ HPC Challenge Benchmark
Goals of the HPC Challenge Benchmark
♦ Stress CPU, memory system, and interconnect
♦ Allow for optimizations; record the effort needed for tuning
♦ Provide verification of results
♦ Archive results
♦ Requires: MPI and the BLAS
HPC Challenge Benchmark: Initial Release 11/03
Consists of five benchmarks; think of it as a framework or harness for adding benchmarks of interest (illustrative kernel and communication sketches follow the list below).
1. HPL (LINPACK) ― MPI on whole system (Ax = b)
2. STREAM ― single CPU; *STREAM ― embarrassingly parallel on the whole system
3. PTRANS (A = A + B^T) ― MPI on whole system
4. RandomAccess ― single CPU; *RandomAccess ― embarrassingly parallel; RandomAccess ― MPI on whole system
5. BW and Latency – MPI
Coming soon: FFT and Matrix Multiply
[Diagram: RandomAccess ― random integer read, update, and write between proc i and proc k]
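To make the character of these kernels concrete, here is a minimal C sketch of the single-CPU STREAM triad and the RandomAccess update loop. It is illustrative only: the array sizes, the POLY constant, and the function names are assumptions for the sketch, not the HPCC source itself.

```c
/* Illustrative sketches of the STREAM triad and RandomAccess update kernels.
 * Sizes, names, and the POLY constant are for illustration, not taken
 * verbatim from the HPCC source. */
#include <stddef.h>
#include <stdint.h>

#define N        (1 << 22)                   /* STREAM vector length (illustrative)       */
#define TABLE_SZ (1ULL << 22)                /* RandomAccess table size, a power of two   */
#define POLY     0x0000000000000007ULL       /* feedback polynomial for the update stream */

/* STREAM triad: unit-stride, bandwidth-bound memory traffic. */
void stream_triad(double *a, const double *b, const double *c, double scalar) {
    for (size_t i = 0; i < N; i++)
        a[i] = b[i] + scalar * c[i];
}

/* RandomAccess: read-modify-write of random table locations (GUPS). */
void random_update(uint64_t *table, uint64_t nupdate) {
    uint64_t ran = 1;
    for (uint64_t i = 0; i < nupdate; i++) {
        ran = (ran << 1) ^ (((int64_t)ran < 0) ? POLY : 0);  /* next pseudo-random value */
        table[ran & (TABLE_SZ - 1)] ^= ran;                  /* update one random word   */
    }
}
```

The triad streams through memory with near-perfect spatial locality, while the random updates touch the table with essentially none, which is why these two benchmarks bracket the memory-access space shown on the following slides.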
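The bandwidth and latency component (item 5) is, to a first approximation, message-passing ping-pong timing; the HPCC test itself (derived from Rolf Rabenseifner's effective-bandwidth work) measures several patterns over many process pairs. The sketch below shows only a basic two-rank ping-pong with an illustrative message size and repetition count.

```c
/* Minimal MPI ping-pong sketch (illustrative; not the HPCC bandwidth/latency code). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int nbytes = 1 << 20;        /* 1 MiB message (illustrative) */
    const int reps   = 100;
    char *buf = malloc(nbytes);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double dt = MPI_Wtime() - t0;

    if (rank == 0) {
        printf("avg round trip: %.2f us\n", dt / reps * 1e6);
        printf("bandwidth:      %.1f MB/s\n", 2.0 * reps * nbytes / dt / 1e6);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}
```

Run with at least two ranks (e.g. mpirun -np 2); any additional ranks simply idle in this sketch.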
Applications
[Figure: Memory Access Patterns ― applications (Computational Fluid Dynamics, Digital Signal Processing, Traveling Salesperson, Radar Cross Section) placed on axes of spatial locality (low to high) versus temporal locality (low to high)]
Applications / Signatures
[Figure: Memory Access Patterns ― the same spatial vs. temporal locality plot, now with benchmark signatures (HPL Linpack, STREAM, PTRANS, RandomAccess, and FFT coming soon) mapped onto the application regions from the previous slide]
How Will the Benchmarking Work?
♦ Single program to download and run
Simple input file, similar to the HPL input (an illustrative sketch follows this list)
♦ Base Run and Optimized Run
Base run must be made; the user supplies MPI and the BLAS
Optimized run allowed to replace certain routines; the user specifies what was done
♦ Results uploaded via the website
♦ HTML table and Excel spreadsheet generated with the performance results
♦ Intentionally, we are not providing a single figure of merit (no overall ranking)
♦ Goal: no more than 2x the time it takes to execute HPL.
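For reference, the base-run input file follows the familiar HPL.dat layout. The fragment below is a minimal sketch with illustrative values only; the actual HPCC input file carries a few additional lines (for example, extra PTRANS problem and block sizes) documented with the distribution.

```
HPLinpack benchmark input file                 (illustrative values)
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout, 7=stderr, file)
1            # of problem sizes (N)
10000        Ns
1            # of NBs
80           NBs
0            PMAP process mapping (0=Row-, 1=Column-major)
1            # of process grids (P x Q)
2            Ps
2            Qs
16.0         threshold
...          (remaining HPL tuning lines, plus the HPCC-specific additions)
```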
http://icl.cs.utk.edu/hpcc/
Coming soon: FFT and Matrix Multiply
Go to…
♦ http://icl.cs.utk.edu/hpcc/
Example of Output: http://icl.cs.utk.edu/hpcc/
http://icl.cs.utk.edu/hpcc/
Expanded Set of Benchmarks
♦ Constructing a framework for benchmarks
♦ Developing machine signatures
♦ Plans are to expand the benchmark collection
♦ Currently working on: DGEMM and *DGEMM; FFT (1-D complex)
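As a rough picture of the matrix-multiply addition, the sketch below times one double-precision GEMM through the CBLAS interface. It is not the HPCC harness: the problem size, the timing with clock(), and the absence of result verification are simplifications for illustration.

```c
/* Illustrative DGEMM timing sketch via CBLAS (not the HPCC harness). */
#include <cblas.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    const int n = 1000;                            /* illustrative problem size */
    double *A = malloc((size_t)n * n * sizeof *A);
    double *B = malloc((size_t)n * n * sizeof *B);
    double *C = malloc((size_t)n * n * sizeof *C);
    for (int i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 2.0; C[i] = 0.0; }

    /* C = 1.0 * A * B + 0.0 * C, all matrices n x n, column-major. */
    clock_t t0 = clock();
    cblas_dgemm(CblasColMajor, CblasNoTrans, CblasNoTrans,
                n, n, n, 1.0, A, n, B, n, 0.0, C, n);
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    printf("DGEMM: %.2f Gflop/s\n", 2.0 * (double)n * n * n / secs / 1e9);
    free(A); free(B); free(C);
    return 0;
}
```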
Future Directions
♦ Port to new systems
♦ Provide more implementations
Languages (Fortran, UPC, Co-Array Fortran); environments; paradigms
♦ Other basic operations: sparse matrix, I/O
Collaborators
♦ Piotr Łuszczek, U of Tennessee
♦ David Bailey, NERSC/LBL
♦ Jeremy Kepner, MIT Lincoln Lab
♦ David Koester, MITRE
♦ Bob Lucas, ISI/USC
♦ John McCalpin, IBM, Austin
♦ Rolf Rabenseifner, HLRS Stuttgart
http://icl.cs.utk.edu/hpcc/