A Three Step Blind Approach for Improving HPC Systems' Energy

A Three Step Blind Approach for Improving HPC Systems’ Energy Performance. Ghislain Landry Tsafack, Laurent Lefevre, Patricia Stolf. EE-LSDS, April 2013. This work is supported through Hemera. Outline: Motivations; Runtime energy performance optimization; Evaluation and preliminary results; Summary.

Transcript of A Three Step Blind Approach for Improving HPC Systems' Energy

Page 1: A Three Step Blind Approach for Improving HPC Systems' Energy


A Three Step Blind Approach for Improving HPC Systems’ Energy Performance

Ghislain Landry Tsafack, Laurent Lefevre, Patricia Stolf

EE-LSDS, April 2013

This work is supported through Hemera.

Page 2: A Three Step Blind Approach for Improving HPC Systems' Energy


1 Motivations
    High Performance Computing (HPC) systems’ design

2 Runtime energy performance optimization
    On-the-fly system adaptation

3 Evaluation and preliminary results
    Experimental platform description

4 Summary

Page 3: A Three Step Blind Approach for Improving HPC Systems' Energy

High Performance Computing (HPC) systems’ design

place great emphasis on a few components: processor architecture, memory subsystems, storage subsystems and communication subsystems, management framework, platform architecture

the performance of CPU-intensive workloads depends on the processor architecture

guarantee good performance on average over a wide variety of workloads

can result in wasted power for some workloads or for some execution phases of a specific workload

most of them allow system reconfiguration (at least of the processor)

Page 4: A Three Step Blind Approach for Improving HPC Systems' Energy

On-the-fly system adaptation

On-line system reconfiguration

Overview of the methodology

phase detection

phase characterization

phase identification and system reconfiguration

reuse of configuration information for recurring phases

Definition: a phase is a region of execution of the program/system that is relatively stable with respect to a given metric

Page 5: A Three Step Blind Approach for Improving HPC Systems' Energy

On-line system reconfiguration (cont.)

Example:

Figure: Summary of the methodology on a system which successively runs five different workloads. The diagram shows three stages. Phase detection: detect new phases. Phase characterization: characterization of detected phases (compute intensive, memory intensive, or compute & memory intensive). Identification and decision making: match any new EV with a known phase, if any; if identification is successful, get the characteristics of the new vector and make the appropriate decision (for example, memory intensive: slow down the processor).

Page 6: A Three Step Blind Approach for Improving HPC Systems' Energy

Phase changes detection

Execution Vectors (EV) based approach

column vector whose entries are sensors, including hardware performance counters, network bytes sent/received and disk read/write counts

example:

\[
\begin{pmatrix}
\text{cache ref} \\
\text{branch ins} \\
\vdots \\
\text{byteSent}
\end{pmatrix}
\]
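As an illustration of how an EV might be assembled, the following Python sketch turns two consecutive readings of cumulative counters into one vector per sampling interval. The sensor names, their order and the `build_ev` helper are assumptions made for this example; the slides only list cache references, branch instructions and bytes sent as sample entries.

```python
from typing import Dict, List, Sequence

# Hypothetical sensor list and order (not taken from the slides).
SENSORS: Sequence[str] = (
    "cache_ref", "branch_ins", "bytes_recv", "bytes_sent",
    "disk_read", "disk_write",
)


def build_ev(previous: Dict[str, int], current: Dict[str, int],
             sensors: Sequence[str] = SENSORS) -> List[int]:
    """Build one execution vector from two readings of cumulative counters.

    Each entry is the increase of one sensor over the sampling interval.
    """
    return [current[name] - previous[name] for name in sensors]


# Usage with made-up counter values.
before = {"cache_ref": 100, "branch_ins": 5000, "bytes_recv": 0,
          "bytes_sent": 10, "disk_read": 3, "disk_write": 1}
after = {"cache_ref": 160, "branch_ins": 9000, "bytes_recv": 0,
         "bytes_sent": 50, "disk_read": 3, "disk_write": 4}
ev = build_ev(before, after)
```

Keeping the sensor order fixed matters, because distances between EVs are only meaningful when the same entry always refers to the same sensor.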

Page 7: A Three Step Blind Approach for Improving HPC Systems' Energy

Phase changes detection (cont.)

Similarity/resemblance between EVs is used for phase detection

the Manhattan distance between consecutive EVs is the resemblance metric

phase changes occur when the distance between consecutive EVs exceeds a threshold (which varies throughout the system’s life-cycle)

Figure: EVs represented as points in the 2-dimensional space generated by sensor-1 and sensor-2 (the two axes). EVs whose distance is at most the threshold TH belong to the same phase.
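The detection rule can be sketched in a few lines of Python. This is a minimal illustration, assuming a fixed threshold supplied by the caller; the slides state that the threshold varies over the system's life-cycle but do not say how, so that part is left out. The function names are hypothetical.

```python
from typing import List, Sequence


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two EVs of equal length."""
    if len(a) != len(b):
        raise ValueError("EVs must have the same number of entries")
    return sum(abs(x - y) for x, y in zip(a, b))


def detect_phase_changes(evs: Sequence[Sequence[float]],
                         threshold: float) -> List[int]:
    """Return every index i such that EV i starts a new phase.

    A phase change is reported when the distance between EV i-1 and
    EV i exceeds the threshold (assumed constant in this sketch).
    """
    changes = []
    for i in range(1, len(evs)):
        if manhattan(evs[i - 1], evs[i]) > threshold:
            changes.append(i)
    return changes


# Usage with made-up two-sensor EVs: the third sample starts a new phase.
samples = [[1.0, 1.0], [1.1, 0.9], [5.0, 4.0], [5.1, 4.2]]
boundaries = detect_phase_changes(samples, threshold=1.0)
```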

Page 8: A Three Step Blind Approach for Improving HPC Systems' Energy

Phase characterization or labelling

Each phase is represented by a reference vector

the closest EV to the centroid of the group of EVs belonging to the phase

Characterization relies on the last-level cache (LLC) references per instruction ratio (LLCRIR)

Table: Order of magnitude of LLC references per instruction ratio and associated labels.

Workload label                                 Order of magnitude of LLCRIR
compute intensive                              10^-4
memory bound                                   ≥ 10^-2
mixed (both memory and compute intensive)      10^-3
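A rough Python sketch of both characterization steps follows. It rests on two assumptions that the slides do not confirm: the Manhattan distance is reused to find the EV closest to the centroid, and the label boundaries sit exactly at 10^-3 and 10^-2, reading the memory bound row of the table as "at least 10^-2". All function names are hypothetical.

```python
from typing import List, Sequence


def centroid(evs: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty group of EVs."""
    if not evs:
        raise ValueError("a phase needs at least one EV")
    return [sum(column) / len(evs) for column in zip(*evs)]


def reference_vector(evs: Sequence[Sequence[float]]) -> Sequence[float]:
    """Return the EV of the phase that lies closest to the centroid."""
    center = centroid(evs)
    return min(evs, key=lambda ev: sum(abs(x - c) for x, c in zip(ev, center)))


def label_from_llcrir(llc_references: float, instructions: float) -> str:
    """Label a phase from its LLC references per instruction ratio."""
    if instructions <= 0:
        raise ValueError("instruction count must be positive")
    ratio = llc_references / instructions
    if ratio >= 1e-2:      # assumed boundary
        return "memory bound"
    if ratio >= 1e-3:      # assumed boundary
        return "mixed"
    return "compute intensive"


# Usage with made-up values.
ref = reference_vector([[0.0, 0.0], [2.0, 2.0], [10.0, 10.0]])
label = label_from_llcrir(llc_references=30, instructions=10_000)
```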

Page 9: A Three Step Blind Approach for Improving HPC Systems' Energy

On-the-fly system adaptation

Key idea: reuse of configuration information for recurring phases/workloads

instead of identifying complete phases, match each newly sampled EV with known phases and make the reconfiguration decision for the next sampling interval

basic principle: if at time t the system is in a phase with a given label, it is likely to be running the same type of phase at time t + 1
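The matching step might look like the Python sketch below. It assumes that known phases are stored as (reference vector, label) pairs and that an EV matches a phase when its Manhattan distance to the reference vector stays within a caller-supplied threshold. Both the storage layout and the matching rule are assumptions for illustration, as is the `identify_phase` name.

```python
from typing import List, Optional, Sequence, Tuple

KnownPhase = Tuple[Sequence[float], str]  # (reference vector, label)


def manhattan(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two EVs of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def identify_phase(ev: Sequence[float], known_phases: List[KnownPhase],
                   threshold: float) -> Optional[str]:
    """Return the label of the closest known phase, or None if none matches.

    The returned label is meant to drive the reconfiguration decision
    for the next sampling interval.
    """
    best_label: Optional[str] = None
    best_distance: Optional[float] = None
    for reference, label in known_phases:
        distance = manhattan(ev, reference)
        if distance <= threshold and (best_distance is None
                                      or distance < best_distance):
            best_label, best_distance = label, distance
    return best_label


# Usage with made-up reference vectors.
known = [([1.0, 1.0], "compute intensive"),
         ([10.0, 10.0], "memory intensive")]
next_label = identify_phase([9.5, 10.2], known, threshold=2.0)
unknown = identify_phase([50.0, 50.0], known, threshold=2.0)
```

A result of `None` means the EV resembles no known phase, in which case the system would presumably fall back to detection and characterization.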

Page 10: A Three Step Blind Approach for Improving HPC Systems' Energy

On-the-fly system adaptation (cont.)

Table: Phase labels and associated energy reduction schemes.

Phase label: possible reconfiguration decisions

compute intensive: switch off memory banks; send disks to sleep; scale the processor up; put NICs into LPI mode

memory intensive: scale the processor down; decrease disks or send them to sleep; switch on memory banks

mixed: switch on memory banks; scale the processor up; send disks to sleep; put NICs into LPI mode

communication intensive: switch off memory banks; scale the processor down; switch on disks

I/O intensive: switch on memory banks; scale the processor down; increase disks (if needed)
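One simple way to make such a table usable at run time is a lookup keyed by phase label, as in the Python sketch below. The dictionary transcribes the table; the `decisions_for` helper and the choice to return an empty list for unknown labels are assumptions for illustration, not part of the presented system.

```python
from typing import Dict, List

# Transcription of the table of phase labels and possible decisions.
RECONFIGURATION_DECISIONS: Dict[str, List[str]] = {
    "compute intensive": [
        "switch off memory banks", "send disks to sleep",
        "scale the processor up", "put NICs into LPI mode",
    ],
    "memory intensive": [
        "scale the processor down", "decrease disks or send them to sleep",
        "switch on memory banks",
    ],
    "mixed": [
        "switch on memory banks", "scale the processor up",
        "send disks to sleep", "put NICs into LPI mode",
    ],
    "communication intensive": [
        "switch off memory banks", "scale the processor down",
        "switch on disks",
    ],
    "I/O intensive": [
        "switch on memory banks", "scale the processor down",
        "increase disks (if needed)",
    ],
}


def decisions_for(label: str) -> List[str]:
    """Return the candidate decisions for a label (empty if unknown)."""
    return list(RECONFIGURATION_DECISIONS.get(label, []))


# Usage.
mixed_decisions = decisions_for("mixed")
```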

Page 11: A Three Step Blind Approach for Improving HPC Systems' Energy

Experimental platform description

25-node cluster of Intel Xeon X3440 set up on Grid5000

Linux kernel 2.6.35 runs on each node, where sensor values are collected once per second

Association between labels and processor frequencies:
CPU intensive: 2.53 GHz
mixed: 2 GHz
memory intensive: 1.87 GHz

benchmarks considered: MG, EP, BT, SP, FT, CG, and IS from the NPB-3.2 benchmark suite
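The association between labels and frequencies can be written as a small Python lookup. This sketch only computes the target value in kHz, the unit used by the Linux cpufreq sysfs interface, and does not touch the system; the `target_frequency_khz` helper is hypothetical, and how the authors actually apply the frequency is not stated in the slides.

```python
from typing import Dict

# Frequencies in GHz, as listed for the experimental platform.
LABEL_TO_GHZ: Dict[str, float] = {
    "CPU intensive": 2.53,
    "mixed": 2.0,
    "memory intensive": 1.87,
}


def target_frequency_khz(label: str) -> int:
    """Return the processor frequency for a label, in kHz.

    Raises KeyError for labels that have no associated frequency.
    """
    return int(round(LABEL_TO_GHZ[label] * 1_000_000))


# Usage.
khz = target_frequency_khz("memory intensive")
```

On Linux, a value like this would typically be written to the `scaling_setspeed` file of each CPU while the `userspace` governor is active, which requires root privileges.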

Page 12: A Three Step Blind Approach for Improving HPC Systems' Energy

Methodology and results

randomly run each workload and let the system react accordingly

Table: Summary of the system’s decisions for 5 instances of the above workloads.

                                      Instances
Workload category        Workload   1st   2nd   3rd   4th   5th
compute intensive (CI)   MG         CI    CI    CI    CI    CI
                         EP         CI    MI    MI    CI    CI
mixed (MIX)              BT         CI    MIX   MIX   MIX   MIX
                         SP         CI    MIX   MIX   MIX   MIX
                         FT         CI    MIX   MIX   MIX   MIX
memory intensive (MI)    CG         CI    MI    MI    MI    MI
                         IS         CI    CI    MI    MI    MI

Page 13: A Three Step Blind Approach for Improving HPC Systems' Energy

Impact on performance

Figure: Variation of energy consumption and execution time of recurring workloads with respect to system reconfiguration decisions. Panel (a), energy consumption: energy consumption (J × 10^6) of EP, MG, BT, SP, FT, CG and IS for instances 1 to 5. Panel (b), execution time: execution time (s) of BT, CG, EP, FT, MG, IS and SP against the workload instance (1 for the 1st instance, 2 for the 2nd, and so on).

Page 14: A Three Step Blind Approach for Improving HPC Systems' Energy

Summary

introduces an on-line, general-purpose methodology for improving the energy performance of HPC systems

covers the processor, disk, and network interconnect (experiments in the paper concentrate on the processor)

the approach can easily be extended to a large number of energy-aware clusters

does not require any specific knowledge of the application or any user intervention

future directions: investigating more power saving schemes; extending the number of instances of workloads in the experiments

Page 15: A Three Step Blind Approach for Improving HPC Systems' Energy

Summary

Thank you for your attention!

Questions?
