PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Yoni Aizik, Muhammad K. Mhameed. Design Technology Solution Group, Intel Corporation. DAC 2009 User Track.


Page 1: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Yoni Aizik, Muhammad K. Mhameed
Design Technology Solution Group, Intel Corporation

DAC 2009 User Track

Page 2: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Agenda

• Power Management - Motivation
• Overview
• Usage Examples
• Summary

Page 3: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Power Management

• Power and performance
• Improved performance → higher power
• How can we save power and improve performance?
– Shut down cores when they are not needed
– Wake up cores when the workload increases
– Reduce frequency
• While making sure that:
– The chip does not exceed the threshold temperature
– The power is sufficiently low

Page 4: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Sleep States

• Shut down unneeded cores
• Idle cores:
– Busy resources
– No jobs
• To save power, idle cores are sent to sleep

[Figure: Power and Wakeup Time plotted against Sleep state; a second label reads "Idle percentage"]

Page 5: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Frequency Scaling

• DVFS:
– Higher frequency → higher performance, higher power
– Lower frequency ≠ lower performance
• Decrease frequency when the memory is the bottleneck
• To save power, the frequency of the core is reduced to the point that ensures minimum performance degradation

[Figure: plot against frequency]
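The point that a lower frequency need not mean lower performance can be illustrated with a small sketch. It assumes a two-part runtime model (core time scales with 1/f, memory stall time does not), which is an illustrative assumption and not taken from the slides; the names `runtime` and `pick_frequency` and the 5% slowdown limit are hypothetical.

```python
def runtime(core_cycles, mem_stall_s, freq_hz):
    """Runtime = frequency-dependent core time + frequency-independent memory stalls."""
    return core_cycles / freq_hz + mem_stall_s


def pick_frequency(core_cycles, mem_stall_s, freqs_hz, max_slowdown=0.05):
    """Return the lowest frequency whose slowdown vs. the top frequency stays within the limit."""
    best = runtime(core_cycles, mem_stall_s, max(freqs_hz))
    for f in sorted(freqs_hz):
        if runtime(core_cycles, mem_stall_s, f) <= best * (1 + max_slowdown):
            return f
    return max(freqs_hz)


freqs = [1e9, 2e9, 3e9]  # hypothetical supported frequencies
# Memory-bound phase: almost all time is memory stalls, so 1 GHz is good enough.
memory_bound = pick_frequency(core_cycles=3e7, mem_stall_s=1.0, freqs_hz=freqs)
# Compute-bound phase: runtime scales with 1/f, so only the top frequency qualifies.
compute_bound = pick_frequency(core_cycles=3e9, mem_stall_s=0.0, freqs_hz=freqs)
```

Under this model the memory-bound phase is assigned the lowest frequency and the compute-bound phase the highest, which is the behavior the DVFS bullets describe.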

Page 6: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Challenges

• System architects have to:
– Implement power management algorithms
– Consider their mutual influence
– Evaluate different implementation options
• Evaluation of system level power management requires long benchmarks
– Minutes-long applications: thermal effects, global optimization (not local)
– RTL model is not a viable option, due to slow simulation speed
• Need an early evaluation method
– That can run long benchmarks
– That enables assessment of different power management algorithms

Page 7: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Petra Objectives

• Enable early assessment of
– Power management algorithms/configurations w.r.t. power/performance on real workloads (~minutes)
– Various OS policies
• Power architects are the target users
• Provide a flexible high-level modeling and simulation framework for power management algorithms and hardware

Page 8: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Petra Overview

• Petra reads application traces
• Processes the traces through power management algorithms implemented by the user
• Takes into account dynamic thermal behavior
• Reports the power and performance of the simulated application

Page 9: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Input Traces

Each workload has to be prepared ahead of time

The application is run on a previous-generation CPU

Data is collected at all supported frequencies

[Figure: workload traces, labeled 1 GHz and 3 GHz]

Page 10: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Input Traces

Workload trace:

#Inst.  Dur.   Enrg.
2M      5300   ..
2M      4210   ..
0.7M    1540   ..
0       10300  ..
1.3M    1500   ..

Information is arranged in buckets

Page 11: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Input Traces

Workload trace:

#Inst.  Dur.   Enrg.
2M      5300   ..
2M      4210   ..
0.7M    1540   ..
0       10300  ..
1.3M    1500   ..

Energy = a1 · (# cache misses) + a2 · (# int add) + …

Monitor activity of uArch events, map it to energy
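The linear mapping from uArch event counts to energy can be sketched as follows. The event names, the per-event weights (the a_i coefficients) and the counts are hypothetical placeholders; the slides do not give actual values.

```python
def bucket_energy(event_counts, energy_per_event):
    """Energy = a1 * count1 + a2 * count2 + ... over the monitored uArch events."""
    return sum(energy_per_event[name] * count for name, count in event_counts.items())


# Hypothetical per-event energies in joules (the a_i weights).
weights = {"cache_miss": 2e-9, "int_add": 1e-10}
# Hypothetical event counts observed during one trace bucket.
counts = {"cache_miss": 1000, "int_add": 2_000_000}

energy_j = bucket_energy(counts, weights)  # 1000 * 2e-9 + 2e6 * 1e-10 joules
```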

Page 12: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Input Traces

Workload trace:

#Inst.  Dur.   Enrg.  Cdyn
2M      5300   ..     ..
2M      4210   ..     ..
0.7M    1540   ..     ..
0       10300  ..     ..
1.3M    1500   ..     ..

Energy = a1 · (# cache misses) + a2 · (# int add) + …

Cdyn = (Energy / Duration) · 1 / (Vcc² · f)  [F]
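As a worked instance of the Cdyn formula, the sketch below converts a bucket's energy and duration into an effective dynamic capacitance, and back into power at another working point. The numeric values are hypothetical, and the units (joules, seconds, volts, hertz) are an assumption, since the trace columns on the slide carry no units.

```python
def cdyn(energy_j, duration_s, vcc_v, freq_hz):
    """Cdyn = (Energy / Duration) / (Vcc^2 * f), in farads."""
    power_w = energy_j / duration_s
    return power_w / (vcc_v ** 2 * freq_hz)


def dynamic_power(cdyn_f, vcc_v, freq_hz):
    """Inverse relation: dynamic power = Cdyn * Vcc^2 * f, in watts."""
    return cdyn_f * vcc_v ** 2 * freq_hz


# Hypothetical bucket: 0.02 J consumed over 1 ms at 1.0 V and 2 GHz (20 W average).
c = cdyn(energy_j=0.02, duration_s=0.001, vcc_v=1.0, freq_hz=2e9)
# The same activity replayed at a hypothetical lower working point.
p_low = dynamic_power(c, vcc_v=0.8, freq_hz=1e9)
```

Storing Cdyn rather than energy makes the trace independent of the voltage and frequency at which it was recorded, so power can be recomputed for whatever working point the simulation selects.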

Page 13: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Dynamic Simulation

[Figure: timeline (Time axis) across the 1 GHz, 2 GHz and 3 GHz traces. Labels: New Cdyn, Freq Change (1 GHz), Idle Period, Active Period, Power, OS]

Page 14: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Dynamic Simulation

[Figure: timeline (Time axis) across the 1 GHz, 2 GHz and 3 GHz traces. Each bucket is labeled with Power; the remaining labels are execution time and ∑energy]
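The accumulation suggested by the figure labels (per-bucket power, execution time, ∑energy) can be sketched as below. This is a minimal illustration assuming power follows Cdyn · Vcc² · f for each bucket; the `Bucket` type and all values are hypothetical and not part of Petra.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    duration_s: float  # time spent in this bucket
    cdyn_f: float      # effective dynamic capacitance of the bucket
    vcc_v: float       # voltage of the working point in effect
    freq_hz: float     # frequency of the working point in effect


def accumulate(buckets):
    """Return (execution time, total energy) summed over all buckets."""
    total_time_s = 0.0
    total_energy_j = 0.0
    for b in buckets:
        power_w = b.cdyn_f * b.vcc_v ** 2 * b.freq_hz
        total_energy_j += power_w * b.duration_s
        total_time_s += b.duration_s
    return total_time_s, total_energy_j


# Two hypothetical buckets: 10 W for 1 ms, then 20 W for 2 ms.
trace = [Bucket(0.001, 1e-8, 1.0, 1e9), Bucket(0.002, 1e-8, 1.0, 2e9)]
exec_time_s, energy_j = accumulate(trace)
```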

Page 15: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Input Traces

[Figure: four trace tables with columns #Inst., Dur., Cdyn (first row: 2M, 5300, ..), one per frequency (1 GHz, 2 GHz, 3 GHz, 4 GHz), connected to a selector with inputs in1, in2, in3, in4 and a sel input. Other labels: freq, PM, data]
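One way to read the selector in the figure is as a lookup that returns the pre-recorded trace matching the frequency chosen by power management. The sketch below assumes exactly that; the class name, the tuple layout and the bucket values are hypothetical, and how Petra keeps its position in the workload when switching traces is not described on the slide.

```python
class TraceSelector:
    """Holds one pre-recorded trace per frequency and returns the requested one."""

    def __init__(self, traces_by_freq_ghz):
        self.traces = dict(traces_by_freq_ghz)

    def select(self, freq_ghz):
        # Only frequencies that were traced ahead of time can be simulated.
        if freq_ghz not in self.traces:
            raise KeyError(f"no trace recorded at {freq_ghz} GHz")
        return self.traces[freq_ghz]


# Hypothetical buckets: (instructions, duration, Cdyn in farads).
selector = TraceSelector({
    1: [(2_000_000, 6000, 1.0e-8)],
    2: [(2_000_000, 3200, 1.1e-8)],
})
current = selector.select(2)
```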

Page 16: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Building Blocks

[Figure: block diagram. Blocks: Trace Feeder, OS Agent, Working Point Calculator, Working Point Dispatcher, Power Model, Thermal Model, System Configuration, Outputs. Signals and labels: Traces, Activity, utilization, F/S Request, WP Request, System Frequency, Power / WP (Freq, Voltage, Sleep States), Cdyn, Freq, Voltage, Power, Temp, HS Temperature, Env, System Info, floorplan]

Page 17: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Building Blocks

[Figure: block diagram with the same blocks and signals as on page 16; in this build the Working Point Calculator label is listed first]

Page 18: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Building Blocks

[Figure: block diagram with the same blocks and signals as on page 16; the labels grouped last in this build are Traces, Trace Feeder, Env and System Frequency]

Page 19: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Building Blocks

[Figure: block diagram with the same blocks and signals as on page 16; the labels grouped last in this build are OS Agent, utilization and F/S Request]

Page 20: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Building Blocks

[Figure: block diagram with the same blocks and signals as on page 16; the labels grouped last in this build are Cdyn, Power Model, Freq, Voltage, Temp and Power]

Page 21: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Building Blocks

[Figure: block diagram with the same blocks and signals as on page 16; the labels grouped last in this build are Temp, Power, HS Temperature and Thermal Model]

Page 22: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Building Blocks

[Figure: block diagram with the same blocks and signals as on page 16; the labels grouped last in this build are Working Point Dispatcher, Freq, Voltage, WP Request, System Frequency and Power / WP (Freq, Voltage, Sleep States)]

Page 23: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Building Blocks

[Figure: the complete block diagram, with the same blocks and signals as on page 16]
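The blocks in the diagram could be wired into a simulation loop along the following lines. This is only a sketch under several assumptions that do not come from the slides: a first-order thermal model, an OS agent that always requests the top working point, a throttling rule that steps down one working point when the temperature limit is exceeded, fixed bucket durations regardless of frequency, and made-up working points and constants. All function names are hypothetical.

```python
import math

# Hypothetical working points: (frequency in Hz, voltage in V).
WORKING_POINTS = [(1e9, 0.8), (2e9, 0.9), (3e9, 1.0)]


def power_model(cdyn_f, freq_hz, vcc_v):
    """Dynamic power from Cdyn and the working point."""
    return cdyn_f * vcc_v ** 2 * freq_hz


def thermal_model(temp_c, power_w, dt_s, ambient_c=40.0, r_c_per_w=1.0, tau_s=0.05):
    """Assumed first-order model: temperature relaxes toward ambient + R * P."""
    steady_c = ambient_c + r_c_per_w * power_w
    return steady_c + (temp_c - steady_c) * math.exp(-dt_s / tau_s)


def working_point_calculator(requested_idx, current_idx, temp_c, limit_c):
    """Step down when too hot, otherwise move one step toward the OS request."""
    if temp_c > limit_c:
        return max(current_idx - 1, 0)
    return min(requested_idx, current_idx + 1)


def run(buckets, limit_c=70.0):
    idx, temp_c, energy_j, log = 0, 40.0, 0.0, []
    for duration_s, cdyn_f in buckets:            # trace feeder
        requested = len(WORKING_POINTS) - 1       # OS agent: ask for top WP
        idx = working_point_calculator(requested, idx, temp_c, limit_c)
        freq_hz, vcc_v = WORKING_POINTS[idx]      # dispatcher applies the WP
        power_w = power_model(cdyn_f, freq_hz, vcc_v)
        temp_c = thermal_model(temp_c, power_w, duration_s)
        energy_j += power_w * duration_s
        log.append((freq_hz, power_w, temp_c))    # outputs
    return energy_j, log


# Twenty hypothetical 50 ms buckets with constant activity.
energy_j, log = run([(0.05, 1.5e-8)] * 20)
```

With these made-up constants the loop climbs to 3 GHz, exceeds the limit, and then alternates between 2 GHz and 3 GHz, which is the kind of interaction between power management and thermal behavior the framework is meant to expose.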

Page 24: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Usage Example: Performance Impact of Thermal Estimation Errors

• Question: How do guard bands used in thermal readings affect the performance of a thermally limited system?
• Guard bands are the result of:
– Thermal sensor errors
– Thermal sensor location (proximity to the hotspot)
• Each guard band adds a performance penalty (over-design)
• Petra can evaluate this price

Page 25: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Performance Impact of Thermal Estimation Errors

• Thermally-stressed system
• Proprietary DVFS
• Different thermal guard band values
• Spec2k
• Even a few degrees of thermal guard band lead to a significant performance loss
• The data is application dependent
• Petra analyzes the tradeoff between the thermal guard bands and performance loss

[Figure: Performance Penalty [%] (axis 0% to 30%) versus Guardband [C] (5, 10, 15) for spec2k galgel and spec2k equake]
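Why a guard band costs performance in a thermally limited system can be seen from a back-of-the-envelope model. The sketch below assumes a steady-state thermal relation (T = ambient + R · P), dynamic power of Cdyn · Vcc² · f at a fixed voltage, and performance proportional to frequency. These are simplifying assumptions, the constants are made up, and the output is not meant to reproduce the Petra results in the chart.

```python
def max_sustainable_freq(limit_c, guard_band_c, ambient_c, r_c_per_w, cdyn_f, vcc_v):
    """Highest frequency whose steady-state temperature stays under limit - guard band."""
    power_budget_w = (limit_c - guard_band_c - ambient_c) / r_c_per_w
    return power_budget_w / (cdyn_f * vcc_v ** 2)


def guard_band_penalty(guard_band_c, **system):
    """Relative frequency loss caused by the guard band (0.0 means no loss)."""
    base = max_sustainable_freq(guard_band_c=0.0, **system)
    limited = max_sustainable_freq(guard_band_c=guard_band_c, **system)
    return 1.0 - limited / base


# Hypothetical system parameters.
system = dict(limit_c=100.0, ambient_c=40.0, r_c_per_w=1.0, cdyn_f=2e-8, vcc_v=1.0)
penalties = {gb: guard_band_penalty(gb, **system) for gb in (5.0, 10.0, 15.0)}
```

In this simplified model the penalty reduces to guard band / (limit − ambient), so it grows linearly with the guard band; real workloads and DVFS policies, as the slide notes, make the effect application dependent.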

Page 26: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Cost of Thread Migration

• Thread migration:
– Move execution between cores
– Spread power density
– Decrease temperature
• Efficiently reduces temperature, but adds a penalty:
– Turning on one core
– Transferring the µArch state from one core to another
– Turning off the inactive core
• High frequency thread migration:
– Better thermal conditions, but
– Increasing performance overhead
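The overhead side of this tradeoff is simple arithmetic: if each migration costs a fixed amount of time and one migration happens per cycle, the lost fraction is cost / cycle time. The sketch below assumes a hypothetical migration cost of 200 µs; the slides do not state the actual cost.

```python
def migration_overhead_fraction(migration_cost_s, cycle_time_s):
    """Fraction of each migration cycle spent migrating (one migration per cycle)."""
    return migration_cost_s / cycle_time_s


cost_s = 200e-6  # hypothetical cost: core wake-up + state transfer + core shutdown
overheads = {
    cycle_ms: migration_overhead_fraction(cost_s, cycle_ms * 1e-3)
    for cycle_ms in (5, 10, 20)
}
```

Halving the cycle time doubles the overhead fraction, which is why very frequent migration eventually costs more than its thermal benefit returns.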

Page 27: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Cost of Thread Migration

• Dual-core system
• Running a high power application
• TM: Temp > 100ºC
• DVFS: Temp > 110ºC
• Migration frequency is varied
• When cycle time > 20 mSec, TM is not efficient (the application run-time is constant)
• When cycle time < 5 mSec, the overhead of the migration is greater than the thermal benefit
• The optimal working point: TM cycle time of 10 mSec
– Balances the thermal benefit and the migration overhead

[Figure: Application Execution Time [Sec] (axis 30 to 34) versus Migration Cycle Time [mSec] (axis 0 to 50)]

Page 28: PETRA: A Framework for System Level Dynamic Thermal and Power Management Exploration

Summary

• Petra is a novel simulation framework that estimates the effect of the PM algorithms on real workloads
• The abstraction level of the traces enables a reasonable (similar to real) simulation time of target applications
• Our solution provides:
– Scalability to long benchmark runs
– Time accuracy to reflect real system behavior
– Separation of algorithm implementation (user provided) and their scheduling (part of infrastructure)