Energy Modelling and Optimization A Critical Assessment with two Case Studies Norbert Wehn ...
Energy Modelling and Optimization
A Critical Assessment with two Case Studies
Norbert Wehn
http://ems.eit.uni-kl.de
[email protected]
MPSoC'09
July 2009, Savannah, Georgia
Motivation
Energy efficiency is a key challenge in embedded system design
Optimization techniques rely on energy models
  Commonly used models can yield wrong optimization strategies
Two case studies
  Short hop versus long hop in wireless sensor networks
  DRAM power management
Short Hops Versus Long Hops in WSN
[Figure: four nodes A, B, C, D in a line; each of the three hops covers d/3 and costs E(d/3)]
Transmission energy grows as a power of the distance d: $E(d) \sim d^{\alpha}$, with $\alpha$ the path loss exponent ($1 < \alpha < 4$)
Multi-hop with n hops: $n \cdot E(d/n) \sim n \left(\frac{d}{n}\right)^{\alpha} = \frac{d^{\alpha}}{n^{\alpha-1}}$, i.e., a fraction $\frac{1}{n^{\alpha-1}}$ of the single-hop energy $E(d)$
Theory favors many short hops
Forward Error Correction (FEC) is inefficient since E(FEC) > E(d) for small d
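The idealized argument for short hops can be checked numerically. The sketch below assumes the pure path-loss model E(d) = k · d^α from this slide; the constant `k` and the example values for `d` and `alpha` are illustrative assumptions, not values from the talk.

```python
def single_hop_energy(d, alpha, k=1.0):
    """Idealized transmission energy E(d) = k * d**alpha (k is a hypothetical constant)."""
    return k * d ** alpha


def multi_hop_energy(d, n, alpha, k=1.0):
    """Total energy of n equal hops, each covering the distance d / n."""
    return n * single_hop_energy(d / n, alpha, k)


# Illustrative values only: total distance 90, path loss exponent 3.
d, alpha = 90.0, 3.0
for n in (1, 2, 3):
    ratio = multi_hop_energy(d, n, alpha) / single_hop_energy(d, alpha)
    # In this model the ratio equals 1 / n**(alpha - 1).
    print(f"n={n}: multi-hop / single-hop energy = {ratio:.4f}")
```

In this model every additional hop lowers the total energy, which is exactly the conclusion the rest of the talk calls into question.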
A More Realistic View
Node placement is hardly ever in a perfect line, and estimating node distance is difficult
Routing and link management add energy overhead and latency
Constant base cost for frame transmission, independent of hop length, e.g., base current of power amplifier, PLL
High end-to-end reliability requires Automatic Repeat reQuest (ARQ)
Many applications: single-hop asymmetric structure
  Central powerful node for information aggregation
Use of FEC to trade off communication versus computation energy?
  Theoretical investigations: Rabaey [2005], Schlegel [2006], Balakrishnan [2008]
  Practical results are missing
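The effect of a constant per-frame base cost can be illustrated by adding it to the idealized model. This is a minimal sketch under stated assumptions: the cost function n · (e_base + k · (d/n)^α), the constants `k` and `e_base`, and the search limit `max_hops` are all hypothetical and chosen only to show the trend.

```python
def multi_hop_energy_with_base(d, n, alpha, k, e_base):
    """Energy of n equal hops when each frame also pays a distance-independent base cost."""
    return n * (e_base + k * (d / n) ** alpha)


def best_hop_count(d, alpha, k, e_base, max_hops=20):
    """Hop count in 1..max_hops with the lowest total energy in this model."""
    return min(
        range(1, max_hops + 1),
        key=lambda n: multi_hop_energy_with_base(d, n, alpha, k, e_base),
    )


# Without a base cost the model always prefers the maximum number of hops.
print(best_hop_count(d=10.0, alpha=2.0, k=1.0, e_base=0.0))
# With a base cost that dominates the distance-dependent part, one long hop wins.
print(best_hop_count(d=10.0, alpha=2.0, k=1.0, e_base=1000.0))
```

Routing overhead, retransmissions, and receive energy at the relays are ignored here; they would shift the balance further toward fewer hops.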
State-Based MICAz Power Model
[Figure: state diagram of the MICAz node (Atmel ATMega 128L, Chipcon [email protected]) showing the current drawn in each state and the transition times between states]
MICAz Standby: 0.021 mA
Microcontroller operations
  Startup: 20 ms ***
  Microcontroller Active: 8 mA
  Microcontroller Idle: 4 mA
  Microcontroller ADC: 1 mA
  Microcontroller Ext. Standby: 0.21 mA
  Transition times in the diagram: ~0.00055 ms (twice), ~0.00135 ms, ~2.17 ms
Transceiver operations
  Transceiver Idle: 8.4 mA
  Transceiver Receive: 26.8 mA
  Transceiver Receive (with microcontroller Ext. Standby): 19 mA
  Transceiver Transmit at 0 dBm: 25.4 mA; -1 dBm: 24.5 mA; -3 dBm: 23.2 mA; -5 dBm: 21.9 mA; -7 dBm: 20.5 mA; -10 dBm: 19.2 mA; -15 dBm: 17.9 mA; -25 dBm: 16.5 mA
  Transceiver Transmit (with microcontroller Ext. Standby) at 0 dBm: 17.6 mA; -1 dBm: 16.7 mA; -3 dBm: 15.4 mA; -5 dBm: 14.1 mA; -7 dBm: 12.7 mA; -10 dBm: 11.4 mA; -15 dBm: 10.1 mA; -25 dBm: 8.7 mA
  Transition times in the diagram: ~0.00135 ms (twice), 0.192 ms (twice), 1 ms
Logger flash operations
  Flash Read: 12 mA
  Flash Write: 23 mA
  Flash Write (with microcontroller Ext. Standby): 15.2 mA
  Transition times in the diagram: ~0.00135 ms, ~0.02 ms ** (twice)
Additional diagram label without a recoverable state name: 9 mA
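A state-based model of this kind turns a schedule of states into an energy figure by summing current × voltage × time. The sketch below uses a few of the currents from the diagram; the supply voltage of 3.0 V is an assumption (the slide does not state it), the state names are made up for this example, and transition costs are left out.

```python
SUPPLY_VOLTAGE_V = 3.0  # assumption: the slide does not give the supply voltage

# Currents in mA taken from the state diagram; the dictionary keys are hypothetical names.
CURRENT_MA = {
    "standby": 0.021,
    "mcu_active": 8.0,
    "rx": 26.8,
    "tx_0dbm": 25.4,
    "tx_0dbm_mcu_ext_standby": 17.6,
}


def energy_uj(schedule, voltage_v=SUPPLY_VOLTAGE_V):
    """Energy in microjoules for a list of (state, duration_ms) pairs.

    mA * V gives mW, and mW * ms gives microjoules. Transitions are ignored.
    """
    return sum(CURRENT_MA[state] * voltage_v * duration_ms
               for state, duration_ms in schedule)


# Hypothetical schedule: compute for 1 ms, transmit for 2 ms, listen for 5 ms.
schedule = [("mcu_active", 1.0), ("tx_0dbm", 2.0), ("rx", 5.0)]
print(f"{energy_uj(schedule):.1f} uJ")
```

The example makes one point of the diagram visible: receiving draws more current than transmitting at full power, so time spent listening is expensive.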
FEC in Single-Hop Asymmetric Topology
CRC checksum (error detection only) built into the MICAz
  Error correction by retransmission (ARQ)
Repetition codes 1/3 and 1/6
  Detecting and correcting errors by majority voting
Convolutional code
  Asymmetrical code: easy to encode, hard to decode
Turbo code (TC, UMTS standard, rate 1/3)
  Due to platform limitations, only hard-decision decoding
Turbo code (TC*, UMTS standard, rate 1/3)
  Soft information by repeating the codeword (rate 1/6)
Measurements in lab environment: [email protected] (WLAN)
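Of the codes listed above, the repetition code is simple enough to sketch in a few lines. The following is an illustrative implementation of repetition coding with majority voting, not the code used on the MICAz; the function names are made up, and ties (possible for even repetition factors such as 6) are resolved to 0 as an arbitrary assumption.

```python
def rep_encode(bits, n=3):
    """Repeat every bit n times (rate 1/n repetition code)."""
    return [b for b in bits for _ in range(n)]


def rep_decode(coded, n=3):
    """Majority vote over each group of n received bits.

    Ties, which can occur for even n, decode to 0 (an arbitrary choice).
    """
    if len(coded) % n != 0:
        raise ValueError("coded length must be a multiple of n")
    return [1 if 2 * sum(coded[i:i + n]) > n else 0
            for i in range(0, len(coded), n)]


message = [1, 0, 1, 1]
received = rep_encode(message)
received[0] ^= 1   # one bit error in the first group
received[5] ^= 1   # one bit error in the second group
print(rep_decode(received) == message)
```

With rate 1/3, any single error per group of three is corrected; two errors in the same group flip the decoded bit.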
Measured Effectiveness of Coding
20 bytes of user data
No ARQ at all, i.e., each bit error produces a frame error if no FEC is used
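Why uncoded frames are so fragile follows from a short calculation. The sketch assumes independent bit errors and counts only the 20 bytes of user data (160 bits); header and CRC bits are ignored, so it is a rough lower bound rather than a model of the measurement.

```python
def frame_error_rate(ber, n_bits):
    """Probability that at least one of n_bits is wrong, for independent bit errors."""
    return 1.0 - (1.0 - ber) ** n_bits


PAYLOAD_BITS = 20 * 8  # 20 bytes of user data; protocol overhead is ignored here

for ber in (0.001, 0.01, 0.03):
    fer = frame_error_rate(ber, PAYLOAD_BITS)
    print(f"BER {ber:.3f} -> FER {fer:.3f}")
```

Under these assumptions a bit error rate of 1% already corrupts roughly 80% of the uncoded frames.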
Energy Measurement Results
A high number of retransmissions requires a large on-time in receive mode
The overhead for encoding is more than compensated for by higher reliability
Method          # of sent messages   Ø frames / succ. message (#ARQ)   Energy / succ. message [µJ]
Only ARQ        431,906              2.34                              346,049
ARQ + Rep 1/3   431,737              1.18                              16,842
ARQ + TC*       431,728              1.12                              11,798
Only ARQ: battery fully depleted after 48 hours; values extrapolated to a runtime of 120 h
Improvement in energy of one order of magnitude compared to only ARQ
3 MICAz nodes running in parallel, ~1% BER in a noisy WLAN environment
1 frame/sec sent, left-over battery capacity measured after 120 hours of runtime
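The size of the improvement can be read off directly from the table. The snippet below only divides the measured energy per successful message of the ARQ-only run by that of the two FEC variants; the dictionary keys are labels chosen for this example.

```python
# Energy per successful message in microjoules, copied from the table above.
ENERGY_UJ = {
    "Only ARQ": 346_049,
    "ARQ + Rep 1/3": 16_842,
    "ARQ + TC*": 11_798,
}


def improvement_factor(method, baseline="Only ARQ"):
    """How many times less energy a method needs per successful message."""
    return ENERGY_UJ[baseline] / ENERGY_UJ[method]


for method in ("ARQ + Rep 1/3", "ARQ + TC*"):
    print(f"{method}: {improvement_factor(method):.1f}x less energy than only ARQ")
```

Both factors lie between 20 and 30, which is the order-of-magnitude gain stated on the slide.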
Lessons Learnt
The short-hop preference is the conclusion of wrong power models
  Reality is much more complex
Forward error correction can be very efficient in star-shaped networks
DRAM Power Models
DRAMs are crucial components in embedded systems
Power modeling/optimization is well elaborated for CPU cores & SRAMs
DRAM models
  SDRAM power model available from manufacturer Micron
    State-based model, but transitions are not modeled
    Worst-case assumptions
  RDRAM power model from manufacturer Rambus
    Very similar to the Micron models
These models are the basis of existing simulators and optimizations
The power model suggests aggressive use of the DRAM's low-power modes
DRAM Model Evaluation & Test Setup
Three benchmarks from the CSiBe benchmark set
  Minigzip: high memory activity
  Djpeg: medium memory activity
  Vam: very low memory activity
ADI 80200EVB evaluation board
  Intel XScale 80200 processor, 266 to 733 MHz in 66 MHz steps
  32 MByte Micron SDRAM @ 100 MHz
  Measurement taps for CPU core current, IO current, system peripherals, DRAMs
Power Measurements
Switching SDRAM to low-power mode after 10 idle cycles on the memory bus
[Figure: minigzip system power consumption (measurement @ 600 MHz); x-axis: time [ms]; y-axis: system power (W), 2 to 3; series: SDRAM always on]
Measured power consumption of the minigzip benchmark
Aggressive Use of Low Power Modes
[Figure: minigzip system power consumption (measurement @ 600 MHz); x-axis: time [ms]; y-axis: system power (W), 2 to 3; series: SDRAM SREF after 10 idle cycles, SDRAM always on]
Predicted reduction of average power (Micron model): 173 mW
Increase in program runtime due to the transition time between active and low-power mode
Average power consumption rises by 100 mW (prediction: -173 mW)
Power Analysis
[Figure: SDRAM Power Consumption; x-axis: sample no. (1 sample = 1 ns), 1 to about 2400; y-axes: voltage (V), -2 to 5, and current (A), -0.04 to 0.1; series: Clock Enable, SDRAM core current]
When switching to memory power-down mode, a power peak is observed due to a refresh; this is valid for all DDRx too (JEDEC standard)
Not modeled in Micron's power model
Not taken into account in any previous publication we know of
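Once entering a low-power mode has an energy cost of its own, switching pays off only for sufficiently long idle periods. The sketch below is a generic break-even estimate, not the model proposed in the talk; all parameter names and numbers are hypothetical, and the runtime penalty of waking up is not included.

```python
def net_saving_nj(idle_us, p_idle_mw, p_low_mw, e_transition_nj):
    """Energy saved (nJ) by spending idle_us in low-power mode instead of idling.

    mW * microseconds gives nJ. A negative result means the switch wasted energy.
    """
    return (p_idle_mw - p_low_mw) * idle_us - e_transition_nj


def break_even_idle_us(p_idle_mw, p_low_mw, e_transition_nj):
    """Idle duration at which the transition energy is exactly recovered."""
    if p_low_mw >= p_idle_mw:
        raise ValueError("low-power mode must draw less power than active idle")
    return e_transition_nj / (p_idle_mw - p_low_mw)


# Hypothetical numbers: 100 mW when idle, 10 mW in low-power mode, 900 nJ per transition.
print(break_even_idle_us(100.0, 10.0, 900.0))   # idle time needed to break even, in microseconds
print(net_saving_nj(5.0, 100.0, 10.0, 900.0))   # short idle period: energy is lost
print(net_saving_nj(20.0, 100.0, 10.0, 900.0))  # long idle period: energy is saved
```

A model that sets the transition energy to zero predicts a saving for every idle period, however short, which is the kind of over-optimistic trend reported here for memory-intensive code.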
Computation Dominant Benchmark
[Figure: vam system power consumption (measurement @ 600 MHz); x-axis: time [ms], 0 to 950; y-axis: system power [W], 2 to 3; series: SDRAM always on, SDRAM SREF after 10 idle cycles; annotation on the measured curves: 180 mW reduction]
"vam" runs entirely from the cache
Micron's model predicts a reduction by 407 mW
New SDRAM Model
benchmark                          threshold [clock cycles]   avg. power [mW]   new model [mW]   Micron model [mW]
minigzip (high memory activity)    off                        452               463 (+2.4%)      892 (+97.3%)
minigzip                           10                         552               551 (-0.2%)      719 (+30.3%)
minigzip                           20                         488               490 (+0.4%)      794 (+62.7%)
minigzip                           50                         455               455 (+0.0%)      855 (+87.9%)
djpeg (medium memory activity)     off                        314               313 (-0.3%)      589 (+87.6%)
djpeg                              10                         211               219 (+3.8%)      281 (+33.2%)
djpeg                              20                         230               221 (-3.9%)      305 (+32.6%)
djpeg                              50                         211               207 (-1.9%)      335 (+58.8%)
vam (low memory activity)          off                        255               250 (-2.0%)      476 (+86.7%)
vam                                10                         76                62 (-18.4%)      69 (-9.2%)
vam                                20                         62                63 (+1.6%)       72 (+16.1%)
vam                                50                         66                63 (-4.6%)       78 (+18.2%)
Average error                                                                   3.3%             51.7%
Change for minigzip from off to threshold 10: avg. power +100 mW, new model +88 mW, Micron model -173 mW
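The summary figures can be recomputed from the rows of the table. The sketch below assumes that "average error" means the mean of the absolute relative errors against the avg. power column; with that reading, the raw mW values give 3.3% and 51.7% after rounding to one decimal.

```python
# (benchmark, threshold, avg_power_mw, new_model_mw, micron_model_mw), copied from the table.
ROWS = [
    ("minigzip", "off", 452, 463, 892),
    ("minigzip", "10", 552, 551, 719),
    ("minigzip", "20", 488, 490, 794),
    ("minigzip", "50", 455, 455, 855),
    ("djpeg", "off", 314, 313, 589),
    ("djpeg", "10", 211, 219, 281),
    ("djpeg", "20", 230, 221, 305),
    ("djpeg", "50", 211, 207, 335),
    ("vam", "off", 255, 250, 476),
    ("vam", "10", 76, 62, 69),
    ("vam", "20", 62, 63, 72),
    ("vam", "50", 66, 63, 78),
]


def mean_abs_error_percent(rows, model_index):
    """Mean absolute relative error (%) of a model column against the avg. power column."""
    errors = [abs(row[model_index] - row[2]) / row[2] * 100.0 for row in rows]
    return sum(errors) / len(errors)


def delta_mw(rows, benchmark, threshold, column):
    """Change of one column (0: avg. power, 1: new model, 2: Micron) relative to 'off'."""
    lookup = {(b, t): values for b, t, *values in rows}
    return lookup[(benchmark, threshold)][column] - lookup[(benchmark, "off")][column]


print(round(mean_abs_error_percent(ROWS, 3), 1), round(mean_abs_error_percent(ROWS, 4), 1))
print([delta_mw(ROWS, "minigzip", "10", c) for c in range(3)])
```

The second line of output reproduces the +100 mW, +88 mW, and -173 mW annotations: only the Micron model gets the sign of the change wrong.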
Lessons Learnt
Theoretical power models for SDRAM are misleading
  Overestimate power consumption
  Overestimate energy saving potential
  Neglect important effects (transition energy!)
  Not only wrong absolute numbers but also wrong trends
We proposed a new SDRAM power model
  Cycle-accurate behavior
  Only 3.3% average error on power
Aggressive SDRAM power management is not always beneficial!
Thank you for your attention!
For more information please visit
http://ems.eit.uni-kl.de
Comparison of Different Codes
Energy consumed does not scale linearly with payload length
All programs for encoding easily fit into ROM
Code            Rate   Energy per frame [µJ]   Delay Encoding+Trans. [ms]   ROM [bytes]   Max. BER with FER < 10%   Decoding on MICAz
No FEC          1      104.73                  1.34                         4136          0%                        Yes
Repetition      1/3    128.97                  2.90                         4245          1%                        Yes
Repetition      1/6    233.18                  5.18                         4245          3%                        Yes
Convolutional   1/3    254.88                  7.80                         4696          5%                        No
TC              1/3    282.79                  9.00                         4798          7%                        No
TC*             1/6    387.49                  11.30                        4798          21%                       No
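The statement that energy does not scale linearly with payload length can be checked against the table. The sketch compares the payload expansion of each code, taken here as 1/rate (an assumption about how the coded payload grows), with the measured energy per frame relative to the uncoded frame.

```python
# (code, rate, energy_per_frame_uj), copied from the table above.
CODES = [
    ("No FEC", 1.0, 104.73),
    ("Repetition 1/3", 1 / 3, 128.97),
    ("Repetition 1/6", 1 / 6, 233.18),
    ("Convolutional", 1 / 3, 254.88),
    ("TC", 1 / 3, 282.79),
    ("TC*", 1 / 6, 387.49),
]


def relative_cost(codes):
    """Map each code to (payload expansion, energy relative to the first entry)."""
    base_energy = codes[0][2]
    return {name: (1.0 / rate, energy / base_energy) for name, rate, energy in codes}


for name, (expansion, energy) in relative_cost(CODES).items():
    print(f"{name}: payload x{expansion:.0f}, energy x{energy:.2f}")
```

The rate 1/3 repetition code triples the coded payload yet costs only about a quarter more energy per frame, which is consistent with a large constant base cost per frame transmission.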