Self Tuning Power Aware Replacement in Caching
05/01/23
PB-LRU: A Self-Tuning Power Aware Storage Cache Replacement Algorithm for Conserving Disk Energy
Qingbo Zhu, Asim Shankar and Yuanyuan Zhou
Presented by: Hang Zhao, Chiu Tan
PB-LRU: Partition-Based LRU
Storage is a major energy consumer, accounting for 27% of the power budget in a data center.
PB-LRU is a power-aware, online cache management algorithm.
PB-LRU dynamically partitions the cache at run time to find the energy-optimal cache size per disk.
It is a practical algorithm that dynamically adapts to workload changes with little tuning.
Outline
Motivation
Background
Why need PB-LRU?
Main Idea
Energy Estimation at Run Time
Solving MCKP
Evaluation & Simulation
Conclusion
Motivation
Why is power conservation important? Data centers are an important component of the Internet infrastructure. Power needs for a data center are increasing at 25% a year, with storage taking up 27%.
How to reduce power in storage? Simple: spin down the disk when it is not in use.
Motivation (II)
But … there is a performance and energy penalty when a disk moves from low to high power mode. Data center request volume is high, so idle periods are small, which makes spinning up and down impractical.
Solution: multi-speed disk architecture. PB-LRU targets multi-speed disks.
Background
Break-even time: the minimum length of idle time needed to justify spinning down and back up.
Oracle DPM: knows the length of the next idle period and uses it to regulate power modes.
Practical DPM: uses thresholds to decide when to power up or down.
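As a rough sketch, the break-even time falls out of equating the cost of staying idle with the cost of a spin-down/spin-up cycle. The numbers below are illustrative only, not the paper's measured disk parameters:

```python
def break_even_time(transition_energy_j, idle_power_w, low_power_w):
    """Minimum idle period (seconds) for which spinning down pays off.

    Staying up for T seconds costs idle_power_w * T joules; spinning
    down costs transition_energy_j (down + up combined) plus
    low_power_w * T. The break-even point is where the two are equal.
    """
    return transition_energy_j / (idle_power_w - low_power_w)

# Illustrative values: 135 J transition cost, 10.2 W idle, 2.5 W standby
print(break_even_time(135.0, 10.2, 2.5))  # about 17.5 seconds
```

A 2-competitive practical DPM can set its idle threshold equal to this break-even time, which bounds its energy at twice that of the oracle.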
Why need PB-LRU?
Earlier work: PA-LRU. Idea: keep blocks from less active disks in the cache, thus extending their idle periods. Cost: more misses to active disks. Justification: since active disks are already spinning, serving those misses is cheaper in terms of power consumption.
However …
PA-LRU requires complicated parameter tuning: 4 parameters are needed, with no intuitive connection between the parameters and disk power consumption or I/O times.
This makes it difficult to adopt simple extensions or heuristics for a real-world implementation.
PB-LRU is a practical implementation!
PB-LRU: Main Idea
Divide the cache into partitions, one for each disk. Each partition is managed individually. Resize the partitions periodically, since workloads are not equally distributed across the different disks.
Main Idea (II)
So what do we need?
Estimate, for each disk, the energy consumed for a particular cache size (the estimation problem).
Use these estimates to find the partitioning that minimizes total energy consumption across all disks (the MCKP problem).
Estimation Problem
Q: How do we estimate energy consumption per disk for different cache sizes at run time?
Naive approach: use simulators, one (multi-disk) simulator for every cache size. This requires NumCacheSizes × NumDisks simulators. Impractical!
Estimation Problem (II)
Mattson’s stack takes advantage of the inclusion property: the contents of an LRU cache of k blocks are a subset of the contents of a cache of k+1 blocks.
Accessing a block at stack position i means a miss for every cache smaller than size i (and a hit for every cache of size i or larger).
PB-LRU uses Mattson’s stack to predict a hit or a miss for different partition sizes.
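The stack update can be sketched as follows. This is a simplified list-based version for illustration; a real implementation would use a more efficient data structure:

```python
def stack_access(stack, block):
    """Record an access in a Mattson stack and return its stack distance.

    The returned distance d (1-based) means the access is a hit for
    every cache of size >= d and a miss for every smaller cache; a
    first-time access returns infinity (a miss at every size). The
    block is then moved to the MRU position (front of the list).
    """
    if block in stack:
        d = stack.index(block) + 1
        stack.remove(block)
    else:
        d = float('inf')
    stack.insert(0, block)
    return d
```

After accesses 5, 4, 3, 2, 1, re-accessing block 4 returns distance 4: a miss for partition sizes 1–3 and a hit for sizes 4 and 5, matching the worked example that follows.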
Estimation Problem (III)
In addition, PB-LRU keeps track of the previous access time and the previous energy consumption for each candidate cache size. With these pieces of information, the energy consumption of various cache sizes is estimated.
Estimation example (Before):
Cache accesses: Time T1 T2 T3 T4 T5; Access 5 4 3 2 1
Mattson stack (MRU to LRU): 1 2 3 4 5
Existing cache (real, size 3): 1 2 3
Per-size bookkeeping for the 5 possible cache sizes:
Cache Size  Prev_miss  Energy
1           T5         E5
2           T5         E5
3           T5         E5
4           T5         E5
5           T5         E5
Estimation example (After — T6: access block 4):
Cache accesses: Time T1 T2 T3 T4 T5 T6; Access 5 4 3 2 1 4
Mattson stack after the access, MRU to LRU (position in parentheses): 4 (1), 1 (2), 2 (3), 3 (4), 5 (5)
Existing cache (real, size 3, after LRU replacement): 4 2 3
Block 4 was the 4th element of the stack, so the access is a miss for every cache size < 4 and a hit for sizes 4 and 5:
Cache Size  Prev_miss  Energy
1           T6         E6
2           T6         E6
3           T6         E6
4           T5         E5
5           T5         E5
For the sizes that miss, the energy estimate is updated as
E6 = E5 + E(T6 − T5) + ActivePower × 10 ms
where E(T6 − T5) is the energy the disk would consume between the two accesses under the simulated DPM, and ActivePower × 10 ms accounts for servicing the miss at the disk.
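The per-access update can be written out as a short routine. The constants below are illustrative placeholders, not the paper's measured disk parameters:

```python
ACTIVE_POWER_W = 13.5   # illustrative active power (watts)
MISS_TIME_S = 0.010     # illustrative average disk access time (10 ms)

def update_energy(prev_energy_j, interval_energy_j, is_miss):
    """Update the running energy estimate for one (disk, cache size) pair.

    interval_energy_j: energy the disk is estimated to consume between
    the previous access and this one under the simulated DPM policy
    (the E(T6 - T5) term). On a miss the request goes to disk, adding
    roughly ActivePower x access time on top.
    """
    e = prev_energy_j + interval_energy_j
    if is_miss:
        e += ACTIVE_POWER_W * MISS_TIME_S
    return e
```

On a hit, only the interval energy is added; the table entries for sizes 4 and 5 above therefore keep their previous values until their next miss.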
Solving MCKP
MCKP is NP-hard, but a modified version of the problem is solvable using dynamic programming.
General result: increase the cache size for less active disks, decrease the cache size for active disks.
Why? The penalty for reducing the cache size of an active disk is small, while the energy saved by increasing the cache size of an inactive disk is large.
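A minimal sketch of the dynamic program, assuming the per-disk, per-size energy estimates are already available (sizes are in cache-block units; the function and variable names are mine, not the paper's):

```python
def solve_mckp(energy, total_size):
    """energy[i][s]: estimated energy of disk i with partition size s,
    for s = 0..total_size. Returns (minimum total energy, partition
    size per disk). A pseudo-polynomial DP over the cache budget.
    """
    INF = float('inf')
    n = len(energy)
    # best[c] = minimum energy using cache budget c for disks so far
    best = [0.0] * (total_size + 1)
    choice = []
    for i in range(n):
        new = [INF] * (total_size + 1)
        pick = [0] * (total_size + 1)
        for c in range(total_size + 1):
            for s in range(c + 1):           # size given to disk i
                cand = best[c - s] + energy[i][s]
                if cand < new[c]:
                    new[c], pick[c] = cand, s
        best = new
        choice.append(pick)
    # Backtrack to recover the chosen partition sizes.
    sizes = [0] * n
    c = total_size
    for i in range(n - 1, -1, -1):
        sizes[i] = choice[i][c]
        c -= sizes[i]
    return best[total_size], sizes
```

With estimates that fall steeply for an inactive disk and stay flat for an active one, the backtracked sizes reproduce the general result above: the inactive disk receives most of the budget.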
Evaluation Methodology
The integrated simulator: disk power model + CacheSim + DiskSim.
Multi-speed disk model: similar to the IBM Ultrastar 36Z15, with 4 lower-speed modes added: 12k, 9k, 6k and 3k RPM.
Power model: 2-competitive thresholds.
Evaluation Methodology (cont.): The Traces
Real system traces:
OLTP – database storage system (21 disks, 128MB cache)
Cello96 – Cello file server from HP (19 disks, 32MB cache)
Synthetic traces generated based on storage system workloads:
Zipf distribution to distribute requests among 24 disks and among the blocks in each disk; a “hill” shape to reflect temporal locality; inter-request arrival distributions: exponential, Pareto.
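The request-distribution part of the synthetic generator can be sketched roughly like this; the exponent and the sampling helper are assumptions for illustration, not the paper's exact generator:

```python
import random

def zipf_weights(n, alpha=1.0):
    """Zipf probabilities for ranks 1..n: P(rank r) proportional to 1/r^alpha."""
    raw = [1.0 / (r ** alpha) for r in range(1, n + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def pick_rank(weights, rng=random.random):
    """Sample an index 0..n-1 according to the given probabilities."""
    u = rng()
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if u < acc:
            return i
    return len(weights) - 1

# Distribute requests among 24 disks: low-ranked disks get most traffic,
# so some disks stay active while others see long idle periods.
weights = zipf_weights(24)
```

The same sampler can distribute requests among the blocks within a disk; a “hill”-shaped reordering of the ranks would then model temporal locality.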
Simulation Results: Algorithms
Compared: infinite cache, LRU, PA-LRU, PB-LRU.
Savings are limited by a high cold-miss rate (64%).
PB-LRU saves 9%; it outperforms LRU by 22%.
Simulation Results (cont.)
Oracle DPM does not slow down the average response time, because it always spins the disk up in time for a request.
All PB-LRU results are insensitive to the epoch length.
PB-LRU has a 5% better response time; it saves 40% in response time.
Accuracy of Energy Estimation (OLTP, 21 disks, with Practical DPM)
The largest deviation of the estimated energy from the real energy is 1.8%.
Cache Partition Sizes
The MCKP partitioning tends to give small sizes (around 1MB) to disks that remain active, and to increase the sizes assigned to relatively inactive disks (to 11–12MB).
Effects of Spin-up Cost
As the spin-up cost grows, the break-even time increases and disks stay longer in low-power mode.
Sensitivity Analysis on Epoch Length
The epoch length just needs to be large enough to accommodate the “warm-up” period after re-partitioning.
Conclusion
PB-LRU is an online storage cache replacement algorithm that partitions the total system cache among individual disks.
It focuses on multiple disks with data center workloads.
It achieves similar or better energy savings and response time improvements with significantly less parameter tuning.
Future work
Take pre-fetching into consideration to investigate the role of cache management in energy conservation.
Optimally divide the total cache between the cache and the pre-fetching buffers.
Implement the disk power modeling component in a real storage system.
Impact of PB-LRU
5 citations found on Google Scholar:
Energy Conservation Techniques for Disk Array-Based Servers (ICS’04)
Performance Directed Energy Management for Main Memory and Disks (ASPLOS’04)
Power Aware Storage Cache Management
Power and Energy Management for Server Systems
Management Issues