Caching for Bursts (C-Burst): Let Hard Disks Sleep Well and Work Energetically Feng Chen and...
-
Upload
sydney-porter -
Category
Documents
-
view
217 -
download
0
Transcript of Caching for Bursts (C-Burst): Let Hard Disks Sleep Well and Work Energetically Feng Chen and...
Caching for Bursts (C-Burst): Let Hard Disks Sleep Well and Work Energetically
Feng Chen and Xiaodong ZhangDept. of Computer Science and
EngineeringThe Ohio State University
Power Management in Hard Disk Power management is a requirement in computer
system design Maintenance cost Reliability & Durability Environmental effects Battery life
Hard disk drive is a big energy consumer, for I/O intensive jobs e.g. hard disk drives account for 86% of total energy
consumption in EMC Symmetrix 3000 storage systems. As multi-core CPUs become more energy efficient, disks
are less. 2
Standard Power Management Dynamic Power Management (DPM)
When disk is idle, spin it down to save energy When a request arrives, spin it up to service the
request Frequently spin up/down a disk incurs substantial
penalty: high latency and energy Disk energy consumption is highly dependent on
the pattern of disk accesses (periodically sleep and work)
3
Ideal Access Patterns for Power Saving
4
Ideal Disk Power Saving Condition Requests to disk form a periodic and burst pattern Hard disk can sleep well and work energetically
timeDisk accesses
Disk Accesses in bursts
Long Disk Sleep Interval
Increasing Burstiness of disk accesses is the key to saving disk energy
Buffer Caches Affect Disk Access Patterns
5
Buffer caches in DRAM are part of hard disk service Disk data are cached in main memory (buffer cache) Hits in buffer caches are fast, and avoid disk accesses The buffer cache is able to filter and change disk
access streams
Applications
Buffer Cache
Hard Disk
requests
disk accesses
Prefetching
Caching
Existing Solution for Burst in Disks
6
Forming Burst Disk Accesses with Prefetching Predict the data that are likely to be accessed in the future Preload the to-be-used data into memory Directly condense disk accesses into a sequence of I/O bursts
Both energy efficiency and performance could be improved
Limitations Buffer caching share the same buffer space. Energy-unaware replacement can easily change burst patterns created
by prefetching Aggressive prefetching shrinks available caching space and demands
highly effective caching Energy-aware caching policy can effectively complement prefetching No work has been done.
Reference – Papathanasiou and Scott, USENIX’04
Caching can Easily Affect Access Patterns
7
An example
Original Disk Accesses
time
time
BurstyDisk Accessesorganized by Prefetching
a b c
Buffer Cache
abc
3 blocks to be evicted byEnergy-Unaware CachingUnfortunately, these blocks
to be accessed in a non-bursty way
Disk still cannot sleep well
Solely relying on prefetching is sub-optimal,Energy-aware caching policyis needed to create burst
disk accesses
Caching Policy is Designed for Locality
8
Standard Caching Policies Identify the data that are unlikely to be accessed in the
future (LRU) Evict the not-to-be-used data out from memory They are performance-oriented and energy-unaware
Most are locality-based algorithms, e.g. LRU (clock), LIRS (clock-pro),
Designed for reducing the number of disk accesses No consideration of creating burst disk access pattern
C-Burst (Caching for Bursts) Our objectives: effective buffer caching
To create burst disk accesses for disk energy saving To retain the performance (high hit ratios in buffer cache)
Outlines
9
Motivation Scheme Design
History-based C-Burst (HC-Burst) Prediction-based C-Burst (PC-Burst) Memory Regions Performance Loss Control
Performance Evaluation Programming Multimedia Multi-role Server
Conclusion
Restructuring Buffer Caches
10
Buffer cache is segmented into two regions Priority region (PR)
Hot blocks are managed using LRU-based scheme Blocks w/ strong locality are protected in PR Overwhelming memory misses can be avoided Retain the performance
Energy-aware region (EAR) Cached blocks are managed using C-burst schemes
Non-burst accessed blocks are kept here Re-accessed block (strong locality) is promoted into PR
Region size is dynamically adjusted Both performance and energy saving are considered
Priority Region(LRU)
Energy Aware Region
(C-Burst)
Buffer Cache
BufferCache(LRU)
Our focus in this talk
History-based C-Burst (HC-Burst)
11
Distinguish different streams of disk accesses Multiple tasks run simultaneously in practice History record can help us to distinguish them
Various tasks feature very different access patterns Burst – e.g. grep, CVS, etc. Non-Burst – e.g. make, mplayer, etc.
Accesses reaching the hard disk is a mixture of both burst and non-burst accesses
In aggregate, the disk access pattern is determined by the most non-burst one
Basic Idea of History-based C-Burst (HC-Burst)
12
time
grep
timemake
time
Grep+ make
bursty access long disk Idle intervals
non-bursty accessshort disk Idle Period
Aggregate Results:non-bursty access
A B C D E F G H I J K L
a b c d e f g
A B C Da
b I J K Lc de
E F G H f g
Buffer Cache
To cache the blocks being accessed in a non-burst pattern, to reshape disk accesses to a burst pattern
History-based C-Burst (HC-Burst)
14
Epoch Application’s access pattern may change over time Execution is broken into epochs, say T seconds for each
Too small or too large are both undesired Our choice – T = (Disk Time-out Threshold ) / 2
No disk spin-down happens during one epoch with disk accesses
Distribution of disk accesses during one epoch can be ignored
time
grep
timemake
A B C D E F G H I J K L
a b c d e f g
epoch
History-based C-Burst (HC-Burst)
15
Block Group Blocks accessed during one epoch by the same IOC are grouped
into a block group Each block group is identified by an process ID and an epoch
time The size of a block group indicates the burtiness of data access
pattern of one application The larger a block group is, the more bursty disk accesses are
time
grep
timemake
A B C D E F G H I J K L
a b c d e f g
epochBlock Groups
History-based C-Burst (HC-Burst)
16
HC-Burst Replacement Policy Two types of blocks should be evicted
Data blocks that are unlikely to be re-accessed Blocks with weak locality (e.g. LRU blocks)
Data blocks that can be re-accessed with little energy Blocks being accessed in bursty pattern
Victim block group – the largest block group Blocks that are frequently accessed would be promoted
into PR Large block group often holds infrequently accessed blocks
Blocks that are accessed in a bursty pattern stay in a large BG Large block group holds blocks being accessed in bursts
Level 10
History-based C-Burst (HC-Burst)
17
Multi-level Queues of Block Groups 32-level queues of block groups A block group of N blocks stays in queue Block groups on one queue are linked in the order of their epoch times Block groups may move upwards/downwards, if # of blocks changes The victim block group is always the LRU block group on the top queue w/ valid block groups
Level 0
Level 1
Level 9
Grep# of blks= 1024Epoch ID = 10
Block is promoted to PRGrep# of blks= 1023Epoch ID = 10
Victim block groupLRU + MB
make# of blks= 1023Epoch # 8
Block is demoted to EARMake# of blks= 1024Epoch # 8
Epoch TimeLeast Recent Used (LRU) Most Recent Used (MRU)
Burtiness
Least Bursty (LRU)
Most Bursty (MB)
Prediction-based C-Burst (PC-Burst)
18
Main idea Certain disk access events are known and can be predicted. Evicting a block that is to be accessed during a short interval and close to a
deterministic disk access
time
deterministicdisk
accesses
short disk idle interval
Block A Block B
long disk idle interval
predicted block
accesses
With deterministic disk accesses and block reaccess time , selectively evicting blocks to be accessed in a short
intervals and holding blocks to be accessed in long idle intervals
Holding Block B can avoidBreaking a long idle interval
Prediction-based C-Burst (PC-Burst)
22
Multi-level Queues of Block Groups in PC-Burst 32-level queues of block groups – each level has two queues
Prediction Block Group (PBG) – blocks to be accessed in the same future epoch time History Block Group (HBG) – blocks being accessed in the same history epoch time
Reference Points (RP) - resent deterministic disk accesses Victim block group
The PBG on the top level, in the shortest interval, closest to a RP, to be accessed in the furthest future
If no PBG is found, search the same level queue of HBG
Level 0
Level 10
Shortest Interval
Victim block group
long Interval
Performance Loss Control
23
Why there is performance loss? Increase of memory misses due to energy-oriented caching policy
How to control performance loss Basic Rule
Control the size of Energy Aware Region
Estimating performance loss Ghost buffer (GB)
LRU replacement policy The increase of memory misses (M)
Blocks not found in EAR, but found in GB The average memory miss penalty (P)
Observed average I/O latency Performance loss
L = M x P
Automatic tune Energy Aware Region size L < Tolerable Performance Loss
Enlarge EAR region size L > Tolerable Performance Loss
Shrink EAR region size
Priority Region
Energy Aware Region
Main Memory
Ghost Buffer (LRU)
A memory missPriority Region
Energy Aware Region
Perof. Loss > Tolerable RateShrink Region Size
Priority Region
Energy Aware Region
Perf. Loss < Tolerable RateEnlarge Region size
Check GhostBuffer
Hit in Ghost BufferPerf. Loss ++
Performance Evaluation
24
Implementation Linux Kernel 2.6.21.5 5,500 lines of code in buffer cache management
and generic block layer
Experimental Setup Intel Pentium 4 3.0GHz 1024 MB memory Western Digital WD160GB 7200RPM hard disk drive RedHat Linux WS4 Linux 2.6.21.5 kernel Ext3 file system
Performance Evaluation
25
Methodology Workloads run on experiment machine Disk activities are collected in Kernel on experiment machine Disk events are sent via netconsole to an monitor machine Disk energy consumption is calculated based on collected log of disk
events using disk power models off line
Experiment machine
Monitoringmachine
disk activitieslog
Gigabit LAN
Performance Evaluation
26
Emulated Disk Models Hitachi DK23DA Laptop Disk IBM UltraStar 36Z15 SCSI Disk
Hitachi DK23DA IBM UltraStar 36Z15
Capacity 30GB 18.4GB
Cache 2MB 4MB
RPM 4200 15000
B/W 35MB/sec 53MB/sec
Active Power 2 watt 13.5 watt
Idle Power 1.6 watt 10.2 watt
Standby Power 0.15 watt 2.5 watt
Spin up 1.6 sec / 5 J 10.9 sec / 135 J
Spin down 2.3 sec / 2.94 J 1.5 sec / 13 J
Performance Evaluation
27
Eight applications 3 applications w/ bursty data accesses 5 applications w/ non-bursty data accesses
Three Case Studies Programming Multi-media Processing Multi-role servers
Name Description MB/ epoch
Request/epoch
Make Linux kernel builder
1.98 119.7
Vim Text editor 0.006 0.395
Mpg123 Mp3 player 0.15 3.69
Transcode Video converter 3.2-6.5 10.9-19.1
TPC-H Database query #17
7.3 476.7
Grep Textual search tool
102.2 10186.6
Scp Remote copy tool
51.5-53.8 135-139
CVS Version control tool
19.9 1705.7
Performance Evaluation
28
Case Study I – programming Applications: grep, make, and vim
Grep – bursty workload Make, vim – non-bursty workload
C-Burst schemes protects data set of make from being evicted by grep Disk idle intervals are effectively extended
Performance Evaluation
29
Case Study I – programming Applications: grep, make, and vim
Grep – bursty workload Make, vim – non-bursty workload
C-Burst schemes protects data set of make from being evicted by grep Disk idle intervals are effectively extended
over 30% energy saving Nearly 0% intervals > 16 sec
Over 50% intervals > 16 sec
Performance Evaluation
30
Case Study II – Multi-media Processing Applications: transcode, mpg123, and scp
Mpg123 – its disk accesses serve as deterministic accesses Scp – bursty disk accesses Transcode – non-bursty accesses
PC-Burst achieves better performance than HC-Burst PC-Burst can accommodate deeper prefetching by efficiently using caching space
Around 30% energy savingOver 70% intervals > 16 sec
Performance Evaluation
31
Case Study III – Multi-role Server Applications: TPC-H # 17, and CVS
TPC-H – non-bursty disk accesses ( very random I/O ) CVS – bursty disk accesses
Dataset of TPC-H is protected in memory Performance of TPC-H is significantly improved
No improvement on disk idle interval
Reduced I/O latency
over 35% energy saving
Our Contributions
32
Design a set of comprehensive energy-aware caching policies, called C-Burst, which leverages the filtering effect of buffer cache to manipulate disk accesses
Our scheme does not rely on complicated disk power models and requires no disk specification data, which means high compatibility to different hardware
Our scheme does not assume using any specific disk hardware, such as multi-speed disks, which means our scheme is beneficial to existing hardware
Our scheme provides flexible performance guarantees to avoid unacceptable performance degradation
Our scheme is fully implemented in Linux kernel 2.6.21.5, and experiments under realistic scenarios show up to 35% energy saving with minimal performance loss
Conclusion
33
Energy efficiency is a critical issue for computer system design
Increasing disk access burstiness is the key to achieving disk energy conservation
Leveraging filtering effect of buffer cache can effectively shape the disk accesses to an expected pattern
HC-Burst scheme can distinguish different access pattern of tasks and create a bursty stream of disk accesses
PC-burst scheme can further predict blocks’ re-access time and manipulate the timing of future disk accesses
Our implementation of C-Burst schemes in Linux kernel 2.6.21.5 and experiments show that C-Burst schemes can achieve up to 35% energy saving with minimal performance loss
References
35
[USENIX04] A. E. Papathanasiou and M. L. Scott, Energy Efficient prefetching and caching. In Proc. of USENIX’04
[EMC99] EMC Symmetrix 3000 and 5000 enterprise storage systems product description guide. http://www.emc.com, 1999.
Memory Regions
36
Buffer cache is segmented into two regions Priority region (PR)
Hot blocks are managed using LRU-based scheme Blocks w/ strong locality are protected Overwhelming memory misses can be avoided
Energy-aware region (EAR) Cold blocks are managed using C-burst schemes Victim blocks are always reclaimed from EAR Re-accessed block is promoted into PR Accordingly, the coldest block in PR is demoted
Region size is tuned on line Both performance and energy saving are considered
Priority
Region
Energy Aware Region
Buffer Cache
Evict a cold block
Insert a new block
Demote a hot block
Promote a cold block
Motivation
37
Limitations Energy-unaware caching policy can significantly affect
periodic bursty patternscreated by prefetching Improperly evicting a block may easily break a long disk idle
interval
Aggressive prefetching shrinks available caching space and demands highly effective caching Effective prefetching needs large volume of memory space,
which raises high memory contention for caching
Energy-aware caching policy can effectively complement prefetching When prefetching works unsatisfactorily, caching can give a
hand by carefully selecting blocks for eviction
Motivation
38
Prefetching Predict the data that are likely to be accessed in the future Preload the to-be-used data into memory Directly condense disk accesses into a sequence of I/O
bursts Both energy efficiency and performance could be improved
Caching Identify the data that are unlikely to be accessed in the
future Evict the not-to-be-used data out from memory Traditional caching policies are performance-oriented
Designed for reducing the number of disk accesses No consideration of creating bursty disk access pattern
Prediction-based C-Burst (PC-Burst)
39
Prediction of Deterministic Disk Accesses Track each task’s access history and offer each task credits of [-32,
32]
Feed-back based Prediction Compare observed interval with predicted interval
If prediction is proved wrong, reduce a task’s credits If prediction is proved right, increase a task’s credits
Task w/ credit less than 0 is unpredictable
Repeated mis-prediction increases the charge of credits exponentially Occasional system dynamics only charge a task’s credit slightly Real patter change quickly decreases a task’s credits