Energy Modelling and Optimization A Critical Assessment with two Case Studies Norbert Wehn ...


Page 1: Energy Modelling and Optimization A Critical Assessment with two Case Studies Norbert Wehn  wehn@eit.uni-kl.de MPSoC‘09 July 2009,

Energy Modelling and Optimization

A Critical Assessment with two Case Studies

Norbert Wehn
http://ems.eit.uni-kl.de

wehn@eit.uni-kl.de

MPSoC'09

July 2009, Savannah, Georgia

Page 2: Motivation

Energy efficiency is a key challenge in embedded system design

Optimization techniques rely on energy models
  Commonly used models can yield wrong optimization strategies

Two case studies
  Short hop versus long hop in wireless sensor networks
  DRAM power management

Page 3: Short Hops Versus Long Hops in WSN

[Figure: nodes A, B, C, D in a line; each of the three hops covers distance d/3 and costs E(d/3)]

Transmission energy grows polynomially with distance d: $E(d) \sim d^{\alpha}$, with path loss exponent $\alpha$ ($1 < \alpha < 4$)

Multi-hop with n hops: $n \cdot E(d/n) \sim n \cdot (d/n)^{\alpha} = \frac{1}{n^{\alpha-1}} d^{\alpha} \sim \frac{1}{n^{\alpha-1}} E(d)$

Theory favors many short hops
  Forward Error Correction is inefficient since E(FEC) > E(d) for small d
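
The idealized scaling on this slide can be checked numerically. The following is a minimal sketch, assuming E(d) = k * d**alpha with no other costs; the function name, the constant k, and the example values for d and alpha are illustrative and not taken from the talk.

```python
def multi_hop_energy(d: float, n: int, alpha: float, k: float = 1.0) -> float:
    """Total transmit energy for n equal hops over distance d, with E(d) = k * d**alpha."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * k * (d / n) ** alpha


if __name__ == "__main__":
    d, alpha = 90.0, 3.0  # hypothetical distance and path loss exponent
    single = multi_hop_energy(d, 1, alpha)
    for n in (1, 2, 3):
        ratio = multi_hop_energy(d, n, alpha) / single
        # In the idealized model the ratio equals 1 / n**(alpha - 1).
        print(f"n={n}: relative energy {ratio:.4f}, 1/n^(alpha-1) = {1 / n ** (alpha - 1):.4f}")
```

Under this model every additional hop lowers the total, which is why the theory favors many short hops.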

Page 4: A More Realistic View

Node placement is hardly ever in a perfect line, and estimating node distance is difficult

Routing and link management add energy overhead and latency

Constant base cost for frame transmission independent of hop length, e.g., base current of power amplifier, PLL

High end-to-end reliability requires Automatic Repeat reQuest (ARQ)

Many applications: single-hop asymmetric structure
  Central powerful node for information aggregation

Use of FEC to trade off communication versus computation energy?
  Theoretical investigations: Rabaey [2005], Schlegel [2006], Balakrishnan [2008]
  Practical results are missing
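
The effect of a constant per-frame base cost can be sketched as follows. This is an illustration under stated assumptions, not the model used in the talk: each hop is assumed to cost e_base + k * (d/n)**alpha, and all names and numbers are hypothetical.

```python
def route_energy(d, n, alpha, k, e_base):
    """Energy for n equal hops: each hop pays a constant base cost plus k * (d/n)**alpha."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * (e_base + k * (d / n) ** alpha)


def best_hop_count(d, alpha, k, e_base, max_hops=10):
    """Hop count in 1..max_hops with the lowest total energy."""
    return min(range(1, max_hops + 1),
               key=lambda n: route_energy(d, n, alpha, k, e_base))


if __name__ == "__main__":
    # Hypothetical values: distance 30, alpha 2, k 1, in arbitrary energy units.
    for e_base in (0, 100, 1000):
        n = best_hop_count(30, 2, 1, e_base)
        print(f"base cost {e_base}: best hop count {n}")
```

With no base cost the sketch picks as many hops as allowed; once the base cost dominates the distance-dependent term, a single long hop is cheapest.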

Page 5: State-Based MICAz Power Model

[State diagram: state currents and transition times as labeled on the slide. Hardware: Atmel ATMega 128L, Chipcon [email protected]]

MICAz Standby                                     0.021 mA

Microcontroller operations
  Microcontroller Active                          8 mA
  Microcontroller Ext. Standby                    0.21 mA
  Microcontroller Idle                            4 mA
  Microcontroller ADC                             1 mA
  Transition time labels: ~0.00135 ms, ~2.17 ms, ~0.00055 ms, ~0.00055 ms, startup 20 ms ***

Transceiver operations
  Transceiver Receive                             26.8 mA
  Transceiver Receive (w microc. Ext. Standby)    19 mA
  Transceiver Idle                                8.4 mA
  Transceiver Transmit, by output power:
    Output power   Transmit   Transmit (w microcontroller Ext. Standby)
    0 dBm          25.4 mA    17.6 mA
    -1 dBm         24.5 mA    16.7 mA
    -3 dBm         23.2 mA    15.4 mA
    -5 dBm         21.9 mA    14.1 mA
    -7 dBm         20.5 mA    12.7 mA
    -10 dBm        19.2 mA    11.4 mA
    -15 dBm        17.9 mA    10.1 mA
    -25 dBm        16.5 mA    8.7 mA
  Transition time labels: 1 ms, ~0.00135 ms, ~0.00135 ms, 0.192 ms, 0.192 ms

Logger-flash operations
  Flash Read                                      12 mA
  Flash Write                                     23 mA
  Flash Write (w microcontroller Ext. Standby)    15.2 mA
  Transition time labels: ~0.02 ms **, ~0.02 ms **, ~0.00135 ms

Additional current label without a recoverable state name: 9 mA
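
A state-based model of this kind is evaluated by summing current times voltage times duration over the states a node visits. The sketch below uses four of the currents from the slide; the 3.0 V supply voltage, the state keys, and the example schedule are assumptions, and transition costs are ignored for brevity.

```python
SUPPLY_VOLTAGE_V = 3.0  # assumption: the supply voltage is not stated on the slide

# Currents in mA, taken from the state diagram on this slide.
STATE_CURRENT_MA = {
    "standby": 0.021,
    "mcu_active": 8.0,
    "tx_0dbm_mcu_ext_standby": 17.6,
    "rx_mcu_ext_standby": 19.0,
}


def energy_uj(schedule):
    """Sum I * V * t over (state, duration_ms) pairs; the result is in microjoules."""
    total = 0.0
    for state, duration_ms in schedule:
        # mA * V = mW, and mW * ms = microjoule
        total += STATE_CURRENT_MA[state] * SUPPLY_VOLTAGE_V * duration_ms
    return total


if __name__ == "__main__":
    # Hypothetical one-second duty cycle: compute, transmit, listen, then sleep.
    schedule = [
        ("mcu_active", 1.0),
        ("tx_0dbm_mcu_ext_standby", 2.0),
        ("rx_mcu_ext_standby", 5.0),
        ("standby", 992.0),
    ]
    print(f"energy per cycle: {energy_uj(schedule):.3f} uJ")
```

In this hypothetical schedule the 5 ms of receive time costs more than the 992 ms of standby, which illustrates why receiver on-time matters so much in the measurements that follow.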

Page 6: FEC in Single Hop Asymmetric Topology

CRC checksum (error detection only) built into MICAz
  Error correction by retransmission (ARQ)

Repetition codes 1/3 and 1/6
  Detecting and correcting errors by majority voting

Convolutional code
  Asymmetrical code: easy to encode, hard to decode

Turbo code (TC, UMTS standard, rate 1/3)
  Due to platform limitations only hard-decision decoding

Turbo code (TC*, UMTS standard, rate 1/3)
  Soft information by repeating the codeword (rate 1/6)

Measurements in lab environment: [email protected] (WLAN)
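
Majority voting for a repetition code can be sketched in a few lines. This is a generic illustration, not the implementation measured on the MICAz; the function names are hypothetical, and for even repetition factors such as 6 a tie is decoded as 0 here by assumption.

```python
def rep_encode(bits, n=3):
    """Repeat every bit n times (rate 1/n repetition code)."""
    return [b for b in bits for _ in range(n)]


def rep_decode(coded, n=3):
    """Majority vote over each group of n received bits; ties decode to 0."""
    if len(coded) % n != 0:
        raise ValueError("coded length must be a multiple of n")
    return [1 if 2 * sum(coded[i:i + n]) > n else 0
            for i in range(0, len(coded), n)]


if __name__ == "__main__":
    data = [1, 0, 1, 1]
    coded = rep_encode(data)
    coded[0] ^= 1  # flip one bit in the first group
    coded[5] ^= 1  # flip one bit in the second group
    print(rep_decode(coded) == data)
```

A rate 1/3 code corrects one flipped bit per group of three; two flips in the same group defeat the vote.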

Page 7: Measured Effectiveness of Coding

20 bytes of user data
No ARQ at all, i.e., each bit error produces a frame error if no FEC is used
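
Why uncoded frames are so fragile follows from a short calculation. The sketch assumes independent bit errors and counts only the 20 payload bytes (header and CRC bits are ignored), so it is a lower bound on the real frame length; the function name is hypothetical.

```python
def frame_error_rate(ber: float, n_bits: int) -> float:
    """Probability that at least one of n_bits is wrong, assuming independent bit errors."""
    return 1.0 - (1.0 - ber) ** n_bits


if __name__ == "__main__":
    payload_bits = 20 * 8  # 20 bytes of user data
    for ber in (0.001, 0.01):
        fer = frame_error_rate(ber, payload_bits)
        print(f"BER {ber:.3f}: FER without FEC about {fer:.2f}")
```

At a bit error rate of 1% roughly four out of five uncoded frames would contain at least one error under these assumptions.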

Page 8: Energy Measurement Results

High number of retransmissions requires a large on-time in receive mode
Overhead for encoding is more than compensated for by higher reliability

Method          # of sent messages   Ø frames / succ. message (#ARQ)   Energy / succ. message [µJ]
Only ARQ        431,906              2.34                              346,049
ARQ + Rep 1/3   431,737              1.18                              16,842
ARQ + TC*       431,728              1.12                              11,798

Only ARQ: battery fully depleted after 48 hours; extrapolated to a runtime of 120 h

Improvement in energy of one order of magnitude compared to only ARQ

3 MICAz nodes running in parallel, ~1% BER in noisy WLAN environment
1 frame/sec sent, measured left-over battery capacity after 120 hours runtime
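
The improvement factors implied by the table can be recomputed directly. The energy values are those from the slide; the dictionary and function names are illustrative.

```python
# Energy per successfully delivered message in microjoules, from the table on this slide.
ENERGY_PER_MESSAGE_UJ = {
    "Only ARQ": 346_049,
    "ARQ + Rep 1/3": 16_842,
    "ARQ + TC*": 11_798,
}


def improvement_factors(energies, baseline="Only ARQ"):
    """Ratio of the baseline energy to the energy of every other method."""
    base = energies[baseline]
    return {method: base / energy
            for method, energy in energies.items() if method != baseline}


if __name__ == "__main__":
    for method, factor in improvement_factors(ENERGY_PER_MESSAGE_UJ).items():
        print(f"{method}: {factor:.1f}x less energy per successful message")
```

Both FEC variants come out at factors above 20, consistent with the order-of-magnitude claim on the slide.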

Page 9: Lessons Learnt

Short hop is the conclusion of wrong power models

Reality is much more complex

Forward error correction can be very efficient in star-shaped networks

Page 10: DRAM Power Models

DRAMs are crucial components in embedded systems

Power modeling/optimization well elaborated for CPU cores & SRAMs

DRAM models
  SDRAM power model from manufacturer Micron available
    State-based model, but transitions not modeled
    Worst-case assumptions
  RDRAM power model from manufacturer Rambus
    Very similar to Micron models

These models are the basis of existing simulators and optimizations

Power model suggests aggressive use of DRAM low-power modes

Page 11: DRAM Model Evaluation & Test Setup

Three benchmarks from the CSiBe benchmark set
  Minigzip: high memory activity
  Djpeg: medium memory activity
  Vam: very low memory activity

ADI 80200EVB evaluation board
  Intel XScale 80200 processor, 266 to 733 MHz in 66 MHz steps
  32 MByte Micron SDRAM @ 100 MHz
  Measurement taps for CPU core current, IO current, system peripherals, DRAMs

Page 12: Power Measurements

Switching SDRAM to low-power mode after 10 idle cycles on memory bus

[Figure: minigzip system power consumption (measurement @600 MHz); system power (W), axis range 2 to 3, over time [ms]; trace: SDRAM always on]

Measured power consumption of the minigzip benchmark

Page 13: Aggressive Use of Low Power Modes

[Figure: minigzip system power consumption (measurement @600 MHz); system power (W), axis range 2 to 3, over time [ms]; traces: SDRAM SREF 10 idle cycles, SDRAM always on]

Predicted reduction of average power (Micron model): 173 mW

Increase in program runtime due to transition time between active and low-power mode

Average power consumption rises by 100 mW (prediction: -173 mW)

Page 14: Power Analysis

[Figure: SDRAM power consumption; axes: Volt (V) and Current (A) over sample number (1 sample = 1 ns); traces: Clock Enable, SDRAM core current]

When switching to memory power-down mode, a power peak is observed due to a refresh; this is valid for all DDRx too (JEDEC standard)
  Not modeled in Micron's power model
  Not taken into account in any previous publication we know of
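
How an unmodeled transition cost can turn a predicted saving into a loss is illustrated below. This is a toy sketch, not the model proposed in the talk: the idle-gap traces, the per-cycle power values, and the transition energy are all hypothetical, and the runtime increase caused by wake-up latency is not modeled.

```python
def idle_energy(gaps, threshold, p_idle, p_low, e_transition):
    """Energy spent in idle gaps under a 'power down after `threshold` idle cycles' policy.

    gaps: idle gap lengths in cycles; threshold=None means the memory is always on.
    p_idle, p_low: energy per cycle while idle-active and while in low-power mode.
    e_transition: energy for entering and leaving low-power mode (e.g. the entry peak).
    """
    total = 0.0
    for gap in gaps:
        if threshold is None or gap <= threshold:
            total += gap * p_idle               # never powers down in this gap
        else:
            total += threshold * p_idle         # wait out the threshold
            total += e_transition               # pay the transition cost
            total += (gap - threshold) * p_low  # remainder in low-power mode
    return total


if __name__ == "__main__":
    params = dict(p_idle=1.0, p_low=0.1, e_transition=30.0)  # hypothetical units
    for name, gaps in (("many short gaps", [15] * 100), ("few long gaps", [1000] * 2)):
        always_on = idle_energy(gaps, None, **params)
        aggressive = idle_energy(gaps, 10, **params)
        print(f"{name}: always on {always_on:.0f}, threshold 10 {aggressive:.0f}")
```

A model that sets e_transition to zero always favors the aggressive threshold; with a non-zero transition cost the memory-intensive trace with many short gaps becomes more expensive than leaving the memory on.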

Page 15: Computation Dominant Benchmark

[Figure: vam system power consumption (measurement @600 MHz); system power [W], axis range 2 to 3, over time [ms], 0 to 950; traces: SDRAM always on, SDRAM SREF after 10 idle cycles; annotation: 180 mW reduction]

"vam" runs entirely from the cache
Micron's model predicts reduction by 407 mW

Page 16: New SDRAM Model

benchmark                          threshold [clock cycles]   avg. power [mW]   new model [mW]   Micron model [mW]
minigzip (high memory activity)    off                        452               463 (+2.4%)      892 (+97.3%)
                                   10                         552               551 (-0.2%)      719 (+30.3%)
                                   20                         488               490 (+0.4%)      794 (+62.7%)
                                   50                         455               455 (+0.0%)      855 (+87.9%)
djpeg (medium memory activity)     off                        314               313 (-0.3%)      589 (+87.6%)
                                   10                         211               219 (+3.8%)      281 (+33.2%)
                                   20                         230               221 (-3.9%)      305 (+32.6%)
                                   50                         211               207 (-1.9%)      335 (+58.8%)
vam (low memory activity)          off                        255               250 (-2.0%)      476 (+86.7%)
                                   10                         76                62 (-18.4%)      69 (-9.2%)
                                   20                         62                63 (+1.6%)       72 (+16.1%)
                                   50                         66                63 (-4.6%)       78 (+18.2%)
Average error                                                                   3.3%             51.7%

minigzip, threshold 10 versus off: +100 mW (avg. power), +88 mW (new model), -173 mW (Micron model)
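
The average errors and the minigzip deltas can be recomputed from the table values. The sketch assumes that "average error" means the mean of the absolute percentage errors relative to the measured average power; the helper names are illustrative.

```python
# (benchmark, threshold, measured avg. power, new model, Micron model), powers in mW
ROWS = [
    ("minigzip", "off", 452, 463, 892),
    ("minigzip", "10", 552, 551, 719),
    ("minigzip", "20", 488, 490, 794),
    ("minigzip", "50", 455, 455, 855),
    ("djpeg", "off", 314, 313, 589),
    ("djpeg", "10", 211, 219, 281),
    ("djpeg", "20", 230, 221, 305),
    ("djpeg", "50", 211, 207, 335),
    ("vam", "off", 255, 250, 476),
    ("vam", "10", 76, 62, 69),
    ("vam", "20", 62, 63, 72),
    ("vam", "50", 66, 63, 78),
]
MEASURED, NEW_MODEL, MICRON = 2, 3, 4  # column indices into ROWS


def mean_abs_pct_error(rows, column):
    """Mean absolute error of a model column relative to the measured power, in percent."""
    errors = [abs(row[column] - row[MEASURED]) / row[MEASURED] * 100.0 for row in rows]
    return sum(errors) / len(errors)


def delta_vs_off(rows, benchmark, threshold, column):
    """Change in power when enabling a threshold, relative to low-power mode off."""
    by_threshold = {row[1]: row[column] for row in rows if row[0] == benchmark}
    return by_threshold[threshold] - by_threshold["off"]


if __name__ == "__main__":
    print(f"new model:    {mean_abs_pct_error(ROWS, NEW_MODEL):.1f}% average error")
    print(f"Micron model: {mean_abs_pct_error(ROWS, MICRON):.1f}% average error")
    for label, column in (("measured", MEASURED), ("new model", NEW_MODEL), ("Micron", MICRON)):
        print(f"minigzip, threshold 10 vs. off, {label}: {delta_vs_off(ROWS, 'minigzip', '10', column):+d} mW")
```

The sign of the minigzip delta is the key point: the measurement and the new model both show an increase, while the Micron model predicts a reduction.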

Page 17: Lessons Learnt

Theoretical power models for SDRAM are misleading
  Overestimate power consumption
  Overestimate energy saving potential
  Neglect important effects (transition energy!)
  Not only wrong absolute numbers but also wrong trends

We proposed a new SDRAM power model
  Cycle-accurate behavior
  Only 3.3% average error on power

Aggressive SDRAM power management is not always beneficial!

Page 18:

Thank you for your attention!

For more information please visit

http://ems.eit.uni-kl.de

Page 19: Comparison of Different Codes

Energy consumed does not scale linearly with payload length
All programs for encoding easily fit into ROM

Code            Rate   Energy per frame [μJ]   Delay Encoding+Trans. [ms]   ROM [bytes]   Max. BER with FER < 10%   Decoding on MICAz
No FEC          1      104.73                  1.34                         4136          0%                        Yes
Repetition      1/3    128.97                  2.90                         4245          1%                        Yes
Repetition      1/6    233.18                  5.18                         4245          3%                        Yes
Convolutional   1/3    254.88                  7.80                         4696          5%                        No
TC              1/3    282.79                  9.00                         4798          7%                        No
TC*             1/6    387.49                  11.30                        4798          21%                       No
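
The non-linear scaling can be read off the table by comparing how much each code expands the transmitted data with how much the energy per frame grows. The sketch treats the inverse code rate as the expansion factor, which is an assumption for illustration; the names are hypothetical.

```python
# (code, expansion factor = 1 / rate, energy per frame in microjoules) from the table
CODES = [
    ("No FEC", 1, 104.73),
    ("Repetition 1/3", 3, 128.97),
    ("Repetition 1/6", 6, 233.18),
    ("Convolutional 1/3", 3, 254.88),
    ("TC 1/3", 3, 282.79),
    ("TC* 1/6", 6, 387.49),
]


def relative_cost(codes, baseline="No FEC"):
    """Map each code to (expansion factor, energy relative to the baseline)."""
    base = {name: energy for name, _, energy in codes}[baseline]
    return {name: (expansion, energy / base) for name, expansion, energy in codes}


if __name__ == "__main__":
    for name, (expansion, ratio) in relative_cost(CODES).items():
        print(f"{name}: {expansion}x coded bits, {ratio:.2f}x energy per frame")
```

The rate 1/3 repetition code sends three times the bits for well under 1.3 times the energy, which fits the constant per-frame base cost discussed earlier in the talk.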