Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States •...
Transcript of Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States •...
![Page 1: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/1.jpg)
NanoCAD Lab
Evaluating and Exploiting Impacts of Dynamic Power Management Schemes on System Reliability
Liangzhen Lai, Vikas Chandra* and Puneet Gupta UCLA Electrical Engineering Department
ARM Research*
![Page 2: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/2.jpg)
NanoCAD Lab
Radiation-Induced Soft Error • Caused by Alpha particles or neutrons
– Results in flipping logic states or glitches • Memory
– Well-protected if with ECC • Flip-Flops (FFs)
– Expensive to fix • Soft error rate (SER) is measured by failures in
time (FIT) – Failures per billion device-hours
2
![Page 3: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/3.jpg)
NanoCAD Lab
What Can Affect System Reliability? • Component FIT rate depends on:
– Circuit properties, e.g., schematic, layout – Operating conditions, i.e., voltage, location/altitude
• System FIT rate can be affected by: – Ambient – Dynamic power management (DPM)
3
SystemSER
Workload
Ambient(al8tude)
DPM
Managementpolicy
Hardwareconfigura8on
![Page 4: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/4.jpg)
NanoCAD Lab
Hardware Support for Dynamic Power Management • Voltage scaling
– L1 cache tie to Vcore • Higher performance
– L1 cache tie to VL2 • More power saving
• Power gating – Dumping states to memory
• Simpler – Use state-retention FF
• Faster
4
Processorcore
L1Cache
Vcore
VL2
L2Cache
VL2orVcore?
![Page 5: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/5.jpg)
NanoCAD Lab
Impacts of Hardware Power States • Voltage states
– Affects core SER – Affects mem SER if
L1 is tied to the core • Sleep states
– Mem SER • Increase if SRAM
goes to retention mode
– Core SER • Zero if power gated • Increase if using
retention FFs 5
0
0.5
1
1.5
0.7 0.8 0.9
Normalized
SER
Supplyvoltage
![Page 6: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/6.jpg)
NanoCAD Lab
JIT vs RFTS • Two power management approaches:
– Run-Fast-Then-Stop (RFTS): complete the workload with nominal performance and then go to certain sleep states for the remaining slack to reduce power
– Just-In-Time (JIT): adjust the peak performance to elongate the runtime with lower power consumption
6
Ac8veIdle
t
P
t
P
Ac8ve
???
???
t
FIT
???
![Page 7: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/7.jpg)
NanoCAD Lab
Outline
• Motivation • Evaluating DPM on System Reliability
– Evaluation Platform – Results
• Exploiting DPM for System Reliability – Virtual SER monitor – Forbidden state based policy – Results
• Conclusion
7
![Page 8: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/8.jpg)
NanoCAD Lab
Evaluation Platform Choices • Running system software (i.e., OS)
– Simulators are too slow • Interaction with power management policies
– Need accurate timing models • Targeting different hardware configuration
– Designing hardware for each configuration is infeasible
• Using hardware development board – Running applications with OS – Emulating the power/SER
8
![Page 9: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/9.jpg)
NanoCAD Lab
Reliability Evaluation • Hardware:
• Software power management schemes – Linux governors
9
CPUFreq CPUIdle
![Page 10: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/10.jpg)
NanoCAD Lab
Building An Evaluation Platform • BeagleBone Black
– Cortex-A8 – Standard Linux kernel 3.12 – Supports both CPUFreq and CPUIdle – Power/SER models based on 45nm technology
10
CPUFreqstates(P-states)
CPUIdlestates(C-states)
![Page 11: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/11.jpg)
NanoCAD Lab
Embedding Bookkeeping Module
11
Availableat:hZps://github.com/nanocad-lab/JIT-RFTS
![Page 12: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/12.jpg)
NanoCAD Lab
Reliability Evaluation
12
![Page 13: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/13.jpg)
NanoCAD Lab
Outline
• Motivation • Evaluating DPM on System Reliability
– Evaluation Platform – Results
• Exploiting DPM for System Reliability – Virtual SER monitor – Forbidden state based policy – Results
• Conclusion
13
![Page 14: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/14.jpg)
NanoCAD Lab
Monitoring SER • Soft error rate (SER) depends on the altitude
• Sources of altitude difference – Same device used at different locations – Same types of device used at different locations
• Hardware SER monitor is infeasible – Rare occurrence ( < 1 per month per MB)
• System-level Virtual SER monitor – Altitude/Location sensors (GPS, barometer etc.) – Cloud-based services (network capabilities)
14
![Page 15: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/15.jpg)
NanoCAD Lab
Embedding Reliability Into DPM • Considerations:
– Effective – Non-intrusive
• Forbidden-state based policy – Preserve existing structures – Enabling/Disabling power states depending on
current SER
15
VirtualSER
monitors
ReliabilityControlmodule
CPUFreqfrequency
table
CPUIdlestatetable
request
update
Enabling/disabling
Powermanagementgovernors
![Page 16: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/16.jpg)
NanoCAD Lab
Powerstate2
Static/Dynamic Policy • Forbidden-state based policy
– Enable/Disable state based on current SER – Overly pessimistic to limit instantaneous SER
• Dynamic policy – Limit accumulated SER over a fixed period – Dynamically enable/disable power states
16
SERtarget
t
SER
Powerstate1
![Page 17: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/17.jpg)
NanoCAD Lab
Experimental Results
17
2.2X
power
![Page 18: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/18.jpg)
NanoCAD Lab
Policy Effectiveness Evaluation • Calculating optimal P-state and C-state selection
– Offline – Formulated and solved as ILP
• Compare the power difference between our policy and optimal P-state and C-state selections
18
Poweroverheadcomparedtoop8malstateselec8on
![Page 19: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/19.jpg)
NanoCAD Lab
Conclusion • Evaluating DPM on System Reliability
– Building a evaluation Platform • Exploiting DPM for System Reliability
– JIT-RFTS trade-offs • 2.2X SER reduction with minimal overhead
– Simple yet effective policies • <3% overhead compared to offline optimal
• Future work will integrate more reliability phenomena
• The kernel module code is available for download at https://github.com/nanocad-lab/JIT-RFTS
19
![Page 20: Evaluating and Exploiting Impacts of Dynamic Power ... · Impacts of Hardware Power States • Voltage states – Affects core SER – Affects mem SER if L1 is tied to the core •](https://reader034.fdocuments.in/reader034/viewer/2022052101/603b0bee5a5bd80bd43da84d/html5/thumbnails/20.jpg)
NanoCAD Lab
Q&A
20