Runtime Software Power Estimation and Minimization Tao Li
![Page 1: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/1.jpg)
Runtime Software Power Estimation and Minimization
Tao Li
![Page 2: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/2.jpg)
Power-aware Computing
![Page 3: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/3.jpg)
Power: Software Perspective & Impact
Power estimation: the first step to power management & optimization
Software contributes to & largely impacts power consumption
![Page 4: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/4.jpg)
Power: Software Perspective & Impact (Contd.)
It is crucial to model power from the perspective of software
Evaluate software energy in early design stage
Understand impact of software optimizations on energy
Support run-time power management and optimizations
![Page 5: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/5.jpg)
Software Power Estimation: Current Techniques
Instruction level modeling: computation intensive
High level macro-modeling: difficult to apply to general code
Event counting based modeling: impacted by the availability of performance counters
Architecture level simulation: large slowdown
![Page 6: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/6.jpg)
Challenges in Run-time Power Estimation
High fidelity & fast speed
On-the-fly estimation capability, non-intrusive & low overhead
Simplicity, availability and generality
![Page 7: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/7.jpg)
Experimental Methodology
SoftWatt: cycle-accurate & full-system power simulation framework
SimOS infrastructure, Wattch power model
Commercial OS & real applications
Out-of-order superscalar processor
Caches & memory hierarchy
Low-power disk
![Page 8: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/8.jpg)
Experimental Methodology (Contd.)
Applications
E-mail and file management (sendmail, fileman)
Java (SPECjvm98: db, jess, javac, jack, mtrt, compress)
SPECint95 (gcc, vortex)
Database (Postgres: select, update, join)
Miscellaneous (pmake, osboot)
![Page 9: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/9.jpg)
OS Power Characterization
OS power varies from one application to another
29 W (gcc) to 66 W (fileman)
Variance of power consumption in OS service routines & invocations
[Figure: chart with two axes, Avg. Power (W) from 0 to 60 and Std. Dev. (%) from 0 to 16]
![Page 10: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/10.jpg)
OS Power Characterization (Contd.)
OS routine power correlates with its performance
Circuits used to exploit ILP burn a significant portion of power
The number of in-flight instructions flowing through these circuits impacts their switching activity
For a given OS routine, similar IPC indicates similar circuit switching activity and therefore, similar power
![Page 11: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/11.jpg)
OS Routine Power-Performance Correlation
[Figure panels: SCSI Disk Interrupt Handler; Read File System Call]
![Page 12: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/12.jpg)
Routine Level OS Power Model
Idea: use a linear regression model, $P_{\text{routine}} = k_1 \cdot \mathrm{IPC}_{\text{routine}} + k_0$, to track the OS routine power showing different performance
$E_{\text{OS}} = \sum E_{\text{routine}} = \sum P_{\text{routine}} \cdot T_{\text{routine}}$
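The model and the energy sum above can be sketched in a few lines of Python. This is only an illustration: the `(k1, k0)` pairs for `utlb` and `read` are copied from the regression table in this deck, power is assumed to be in watts, and the `RoutineInvocation` type, the function names, and the IPC and time values in `trace` are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class RoutineInvocation:
    name: str       # OS routine name
    ipc: float      # instructions per cycle observed for this invocation
    seconds: float  # execution time of this invocation


# (k1, k0) per routine, taken from the deck's regression table.
COEFFS = {
    "utlb": (23.6, 6.2),
    "read": (29.6, 4.7),
}


def routine_power(name: str, ipc: float) -> float:
    """Evaluate P_routine = k1 * IPC_routine + k0 (assumed to be in watts)."""
    k1, k0 = COEFFS[name]
    return k1 * ipc + k0


def os_energy(invocations) -> float:
    """Energy(OS) = sum over invocations of Power(routine) * Time(routine)."""
    return sum(routine_power(i.name, i.ipc) * i.seconds for i in invocations)


# Hypothetical trace: the IPC and time values are made up for illustration.
trace = [
    RoutineInvocation("utlb", ipc=1.0, seconds=0.002),
    RoutineInvocation("read", ipc=0.5, seconds=0.010),
]

print(f"Estimated OS energy: {os_energy(trace):.4f} J")
```

Because each invocation carries its own IPC, two calls to the same routine that perform differently get different power estimates, which is the point of the per-routine model.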
![Page 13: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/13.jpg)
Routine Level OS Power Model (Contd.)
Regression model: $P = k_1 \cdot \mathrm{IPC} + k_0$

| OS Service | k1 | k0 | ε | Comment |
|---|---|---|---|---|
| utlb | 23.6 | 6.2 | 0.17% | TLB miss handler |
| COW_fault | 32.1 | 1.1 | 0.19% | copy-on-write fault |
| simscsi_intr | 33.9 | 1.3 | 1.94% | SCSI disk I/O interrupt |
| clock | 36.4 | 0.6 | 2.68% | clock interrupts |
| read | 29.6 | 4.7 | 4.53% | read file |
| write | 34.3 | 1.5 | 1.27% | write file |
| open | 34.3 | 1.2 | 0.41% | open a file or serial port |
| close | 30.4 | 3.9 | 2.61% | close an open channel |

ε: model fitting error
![Page 14: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/14.jpg)
Routine Level OS Power Modeling
Pre-characterization: low level energy simulation, model fitting
Run-time estimation: OS routine boundaries, evaluation using counter values
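The two stages above can be sketched as follows. The slides do not say which fitting method was used, so ordinary least squares is an assumption here. The function names are hypothetical, and the sample points are synthetic (generated from a made-up line), not simulation results.

```python
def fit_linear(ipc_samples, power_samples):
    """Pre-characterization: least-squares fit of P = k1 * IPC + k0."""
    n = len(ipc_samples)
    if n < 2 or n != len(power_samples):
        raise ValueError("need at least two (IPC, power) pairs of equal length")
    mean_x = sum(ipc_samples) / n
    mean_y = sum(power_samples) / n
    sxx = sum((x - mean_x) ** 2 for x in ipc_samples)
    if sxx == 0:
        raise ValueError("IPC samples must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(ipc_samples, power_samples))
    k1 = sxy / sxx
    k0 = mean_y - k1 * mean_x
    return k1, k0


def estimate_power(k1, k0, instructions, cycles):
    """Run-time estimation: derive IPC from counter deltas taken at the
    routine's entry and exit, then evaluate the fitted model."""
    if cycles <= 0:
        raise ValueError("cycle count must be positive")
    return k1 * (instructions / cycles) + k0


# Synthetic samples lying exactly on P = 30 * IPC + 2 (illustrative only).
ipc = [0.5, 1.0, 1.5, 2.0]
power = [17.0, 32.0, 47.0, 62.0]
k1, k0 = fit_linear(ipc, power)
print(f"k1={k1:.2f}, k0={k0:.2f}")
print(f"estimate: {estimate_power(k1, k0, instructions=1200, cycles=1000):.2f}")
```

The fit runs once offline per routine. At run time only two counter reads and one multiply-add are needed per routine invocation, which is what keeps the estimation overhead low.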
![Page 15: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/15.jpg)
Cumulative Estimation Error
[Figure, two panels: estimation error (%) for sendmail, fileman, db, jess, postgres.select, postgres.update, osboot. Panel 1: routine based regression model, $P_{\text{routine}} = k_1 \cdot \mathrm{IPC}_{\text{routine}} + k_0$, axis from -2% to 2%. Panel 2: flat regression model, $P_{\text{OS}} = g_1 \cdot \mathrm{IPC}_{\text{OS}} + g_0$, axis from -15% to 15%.]
![Page 16: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/16.jpg)
Per-routine Estimation Error
Flat regression model: $P_{\text{OS}} = g_1 \cdot \mathrm{IPC}_{\text{OS}} + g_0$
![Page 17: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/17.jpg)
Per-routine Estimation Error (Contd.)
Routine based regression model: $P_{\text{routine}} = k_1 \cdot \mathrm{IPC}_{\text{routine}} + k_0$
![Page 18: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/18.jpg)
OS Energy Dissipation
[Figure: chart of % of OS cycles and % of OS energy, axis from 0% to 60%; annotated values: 92%, 89%]
![Page 19: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/19.jpg)
Phases in Programs (8-issue machine)
[Figure: IPC (axis from 0 to 6) versus execution time (0.00 to 0.26 seconds). Benchmark: SPECjvm98 jess]
Resources are utilized differently during different phases of program execution
Average IPC - User: 2.1, OS: 1.1
![Page 20: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/20.jpg)
Power Minimization via Processor Resource Adaptations
Adapt processor resources to program needs
What can be adapted?
Bandwidth of fetch/decode/issue/retire…
Size of instruction window, re-order buffer, load store queue…
Reduce power, retain performance
![Page 21: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/21.jpg)
Effects of Tuning Processor Resource for the OS
8-issue -> 4-issue
OS Performance degradation: 4%
OS Power savings: 50%
| | 1-issue | 2-issue | 4-issue | 6-issue | 8-issue |
|---|---|---|---|---|---|
| OS IPC | 0.88 | 1.09 | 1.15 | 1.19 | 1.21 |
| OS Power (W) | 6.4 | 12.2 | 21.7 | 31.1 | 42.8 |
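The trade-off in the table can be checked with a little arithmetic. The sketch below uses only the IPC and power values shown on the slide; the function names are hypothetical. The energy-delay comparison additionally assumes that every configuration runs the same instruction count at the same clock frequency, so that execution time scales as 1/IPC. The slide does not state that assumption.

```python
# Values copied from the slide's table, keyed by issue width.
OS_IPC = {1: 0.88, 2: 1.09, 4: 1.15, 6: 1.19, 8: 1.21}
OS_POWER_W = {1: 6.4, 2: 12.2, 4: 21.7, 6: 31.1, 8: 42.8}


def relative_change(width, baseline=8):
    """Return (IPC loss, power saving) of `width` relative to `baseline`."""
    perf_loss = 1 - OS_IPC[width] / OS_IPC[baseline]
    power_saving = 1 - OS_POWER_W[width] / OS_POWER_W[baseline]
    return perf_loss, power_saving


def relative_edp(width, baseline=8):
    """Energy-delay product relative to the baseline.

    Assumption: fixed instruction count and clock, so T is proportional to
    1/IPC and EDP = P * T^2 is proportional to P / IPC^2.
    """
    def edp(w):
        return OS_POWER_W[w] / OS_IPC[w] ** 2
    return edp(width) / edp(baseline)


loss, saving = relative_change(4)
print(f"8-issue -> 4-issue: IPC loss {loss:.1%}, power saving {saving:.1%}")
print(f"relative EDP at 4-issue: {relative_edp(4):.2f}")
```

With the rounded table entries this gives an IPC loss of roughly 5% and a power saving of roughly 49%, close to the 4% and 50% quoted on the slide. The small gap is consistent with rounding in the tabulated values.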
![Page 22: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/22.jpg)
Previous Approach for Adaptations
[Figure: IPC (instructions per cycle) plotted over cycles and divided into sampling windows A, B, C, D, E, F; each sampling step is followed by an adaptation step]
![Page 23: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/23.jpg)
Problems with Sampling based Adaptations (Contd.)
OS executions: short-lived
Adaptation overhead
[Figure: two timelines of interleaved User and OS execution. The first shows sampling windows A, B, C with intervals labeled Ta, Ts and Th; the second shows a smaller sampling window]
![Page 24: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/24.jpg)
OS-aware Routine based Adaptations
OS-aware: identify OS executions via processor execution modes
Just-in-time & full coverage of OS activities
Routine-based: adapt processor resources at OS routine boundaries
Precise exceptions: drained pipeline
Achieve minimum adaptation overhead
[Figure: timeline of interleaved User and OS execution, annotated "OS routine-based optimal adaptations" and "Adaptations with minimum overhead"]
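The control flow described above can be sketched as a small software model. This is not the authors' hardware mechanism. The class, its method names, and the per-routine issue widths in `PER_ROUTINE_WIDTH` are hypothetical placeholders, chosen only to show that reconfiguration happens at OS entry and exit rather than at fixed sampling intervals.

```python
class RoutineAdaptationController:
    """Toy model: pick an issue width when an OS routine is entered and
    restore the default width when control returns to user mode."""

    def __init__(self, per_routine_width, default_width=8):
        self.per_routine_width = dict(per_routine_width)
        self.default_width = default_width
        self.current_width = default_width
        self.reconfigurations = 0

    def _set_width(self, width):
        # Only count a reconfiguration when the width actually changes.
        if width != self.current_width:
            self.current_width = width
            self.reconfigurations += 1

    def on_os_entry(self, routine):
        # Entry happens via an exception or system call; with precise
        # exceptions the pipeline is already drained at this point.
        self._set_width(self.per_routine_width.get(routine, self.default_width))

    def on_os_exit(self):
        self._set_width(self.default_width)


# Hypothetical per-routine settings; real values would come from profiling.
PER_ROUTINE_WIDTH = {"read": 4, "clock": 2}

controller = RoutineAdaptationController(PER_ROUTINE_WIDTH)
widths = []
for routine in ["read", "utlb", "clock"]:
    controller.on_os_entry(routine)
    widths.append(controller.current_width)
    controller.on_os_exit()
    widths.append(controller.current_width)

print(widths, controller.reconfigurations)
```

Routines without an entry in the table (here `utlb`) fall back to the default width, so an unprofiled routine never triggers a reconfiguration.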
![Page 25: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/25.jpg)
OS-aware Routine based Adaptations (Contd.)
Apply optimal adaptation for individual OS routine
Exploit the routine level Energy-Delay Product variance
[Figure: normalized energy-delay product (axis from 0 to 1) for the OS services clock, COW_fault and read, at 1-issue, 2-issue, 4-issue, 6-issue and 8-issue]
![Page 26: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/26.jpg)
Routine based Adaptations: OS Power
[Figure: normalized power (axis from 0 to 0.6) for pmake, gcc, vortex, sendmail, fileman, db, jess, javac, jack, postgres.select, postgres.update, osboot and AVG. Series: sampling based adaptation (window size: 2048 cycles); sampling based adaptation (window size: 128 cycles); routine based adaptation]
![Page 27: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/27.jpg)
OS Performance
[Figure: normalized IPC (axis from 0.7 to 1) for pmake, gcc, vortex, sendmail, fileman, db, jess, javac, jack, postgres.select, postgres.update, osboot and AVG. Series: sampling based adaptation (window size: 2048 cycles); sampling based adaptation (window size: 128 cycles); routine based adaptation]
![Page 28: Runtime Software Power Estimation and Minimization Tao Li](https://reader036.fdocuments.in/reader036/viewer/2022062409/56814a02550346895db73419/html5/thumbnails/28.jpg)
OS Power & Performance Tradeoff
[Figure: normalized energy-delay (axis from 0 to 0.8). Series: sampling based adaptation (window size: 2048 cycles); sampling based adaptation (window size: 128 cycles); routine based adaptation]