Storage Networks
How to Handle Heterogeneity
Bálint Miklós
January 24th, 2005
ETH Zürich
External Memory Algorithms and Data Structures

What Are Storage Networks?
• Persistent storage: hard disks
• Device capacity doubles every 14-18 months, but data grows faster
• Use many disks
• Need to protect, access, and manage the ever-growing volume of storage assets
Storage Networks – Motivation

Hardware Failures
• disk failure: 42%
• others: 26%
• disk subsystem: 10%
• disk error: 10%
• power supply: 6%
• FS error: 6%
Trace collected from the Internet Archive (March 2003), courtesy of David Pease (UCSC) & Kelly Gottlib

Heterogeneous Storage Networks
• To increase system speed and capacity, new disks are added.
• New disks usually have different characteristics than the older disks in the system.
• Many modern storage systems are distributed: Ethernet, FibreChannel.
• How to exploit this heterogeneity?

Goal
• Storage system requirements:
– space and access balance
– availability
– resource efficiency
– access efficiency
– heterogeneity
– adaptivity
– locality
• Very difficult to meet ALL requirements.

Outline
• Model
• AdaptRaid
• HERA
• RIO
• Conclusions

What Model to Use?
• Why not use the layout of external memory algorithms?
– We would need a solution for all the (sub)problems
– One has to bypass the operating system: a complex task
• Therefore a different abstraction level:
– Set of disks characterized by capacity and bandwidth
– Connection network is unrestricted: e.g. SCSI, P2P
Storage Networks – Model

Model assumptions
• Disk access patterns are generated by the file system (OS)
• These patterns are difficult to predict and can change
• We assume a uniform pattern; our goal is to distribute data evenly

Heterogeneous Storage Networks
• Straightforward solution:
– Cluster disks according to their characteristics
– We can have many clusters
– Easy to extend
– New, faster disks do not improve overall response time
• Randomized batched solution [Sanders]:
– Map data randomly to disks
– Schedule a batch of accesses by solving a network flow problem
– Infeasible for large systems: many flow problems to be solved
– Batch-like behavior is a disadvantage
Storage Networks – Heterogeneity

RAID
• Redundant Array of Inexpensive Disks
• RAID level 0:
– Striping data across a set of disks
• RAID level 5:
– Add a redundancy block per stripe
– Distribute redundancy information evenly on every disk
www.raidrecoveryguide.com
Storage Networks – AdaptRaid

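The redundancy block of RAID level 5 can be illustrated with a minimal Python sketch. It assumes the usual bytewise XOR parity; the names `xor_parity` and `rebuild_block` are illustrative and do not come from the slides.

```python
from functools import reduce

def xor_parity(blocks):
    """Return the bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_block(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_parity(list(surviving_blocks) + [parity])

# One stripe with three data blocks (hypothetical contents).
stripe = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(stripe)

# Pretend the disk holding stripe[1] failed and recover its block.
recovered = rebuild_block([stripe[0], stripe[2]], parity)
```

Because XOR is its own inverse, the same function that computes the parity also recovers any one lost block of the stripe.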
AdaptRaid 0
• Extends the RAID layout to heterogeneous disks [Cortes, Labarta]
• Basic idea:
– Load each disk depending on its characteristics
• First solution:
– Use all disks as in RAID 0 until the smallest disk is full
– Then discard the full disks and continue the same way
– Distribution continues until all disks are full
• The lower portion of the address space has better access times, because it is striped over more disks

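The "first solution" for AdaptRaid 0 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions (capacities counted in blocks, one block per disk per stripe); the function name `adaptraid0_layout` is hypothetical.

```python
def adaptraid0_layout(capacities):
    """Map logical blocks to (disk, offset) pairs.

    capacities[i] is the number of blocks on disk i. Each stripe places one
    block on every disk that still has free space; full disks drop out.
    """
    layout = []
    offset = 0  # all active disks are filled up to the same offset
    while True:
        active = [d for d, cap in enumerate(capacities) if cap > offset]
        if not active:
            break
        layout.extend((d, offset) for d in active)
        offset += 1
    return layout

# Three disks with 1, 2 and 3 blocks: stripes of width 3, 2 and 1.
example = adaptraid0_layout([1, 2, 3])
```

The first logical blocks are spread over all three disks, while the last block lives on the largest disk alone, which is why the low part of the address space is accessed faster.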
AdaptRaid 0 – Reducing Variance
• Reduce variance:
– The algorithm temporarily assumes that the disks are smaller
– The resulting pattern is repeated several times
• Stripes in a Pattern (SIP) defines the size of the pattern and the degree of variance
• Each disk holds the same number of blocks as before

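The pattern idea can be sketched as follows. This is a simplified illustration, not the published algorithm: it assumes that the largest disk contributes exactly `sip` blocks per pattern and that all capacities scale to whole numbers; `pattern_layout` is a hypothetical name.

```python
def pattern_layout(capacities, sip):
    """Lay out blocks by repeating a small pattern of `sip` stripes.

    Assumes max(capacities) is a multiple of sip and that every capacity,
    scaled by sip / max(capacities), is a whole number of blocks.
    """
    largest = max(capacities)
    if largest % sip or any((cap * sip) % largest for cap in capacities):
        raise ValueError("capacities must scale to whole blocks per pattern")
    per_pattern = [cap * sip // largest for cap in capacities]  # shrunken disks
    layout = []
    for rep in range(largest // sip):          # repeat the pattern
        for stripe in range(sip):              # stripes inside one pattern
            for disk, blocks in enumerate(per_pattern):
                if blocks > stripe:            # disk not yet full in this pattern
                    layout.append((disk, rep * blocks + stripe))
    return layout

# Disks with 2 and 4 blocks, patterns of 2 stripes.
example = pattern_layout([2, 4], 2)
```

Compared with filling the disks once from start to end, the narrow stripes are now interleaved with the wide ones over the whole address space, while every disk still stores the same number of blocks.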
AdaptRaid 5
• Similar idea, but one block per stripe is used for parity information
• Difference: a write implies updating the parity
• If not all the blocks in the stripe are written, a write needs an additional read: the small-write problem

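The small-write problem can be made concrete with a short Python sketch. It assumes XOR parity and the common read-modify-write update; the helper names are illustrative.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

def small_write_parity(old_data, old_parity, new_data):
    """Return the new parity after overwriting a single data block.

    The old data block and the old parity block must be read first,
    which is the extra I/O that makes small writes expensive.
    """
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)

# Hypothetical stripe: data blocks 01 02, 10 20, ff 00 with parity ee 22.
new_parity = small_write_parity(b"\x10\x20", b"\xee\x22", b"\x00\x00")
```

When a whole stripe is written at once, the parity can be computed from the new data alone and no extra read is needed.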
AdaptRaid 5 – Small-write Solution
• Reference stripe: the stripe that the OS assumes to be a full stripe
• The size of every stripe is a divisor of the reference stripe
• Logically three steps:
– Decrease the stripe size
– Distribute the empty space evenly on all disks
– Apply a Tetris-like method to fill the empty blocks

AdaptRaid 5 – variance reduction
• We can use a similar variance reduction as in AdaptRaid 0:
– Repeat a smaller pattern several times

AdaptRaid – generalization
• What if the bigger disks are not the faster ones?
• Until now we tried to use all blocks of a disk; now we want to use fewer blocks on slow disks
• Utilization Factor (UF):
– A value between 0 and 1 per disk
• UF can be set based on:
– disk size (until now)
– performance

AdaptRaid – summary
• Decide UF for every disk:
– How much we want to load a disk
• Decide SIP for the system:
– How big the pattern is
• Performance (measured by simulators):
– AdaptRaid 0: adaptivity: RAID 0; speedup: 8%-35%
– AdaptRaid 5: adaptivity: ?; speedup: < 30%

Heterogeneous Extension of RAID (HERA)
• Disk merging technique
• Disks are partitioned into logical disks
• Logical disks have the same bandwidth and capacity
• We group logical disks into G parity groups
• We have G homogeneous systems
Storage Networks – HERA

Heterogeneous Extension of RAID
• Constraint: [formula not recoverable; surviving symbols: G, D, p with indices i and l]
• Each logical disk in a parity group should map to a different physical disk

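The mapping constraint can be illustrated with a small Python sketch. It is not HERA's configuration planner: the greedy strategy, the function name `assign_parity_groups` and the parameter `group_size` (logical disks per parity group) are assumptions made for this example.

```python
def assign_parity_groups(logical_per_disk, group_size):
    """Group logical disks so that no group uses a physical disk twice.

    logical_per_disk[i] is the number of logical disks carved out of
    physical disk i. Returns a list of groups, each a sorted list of
    physical disk ids. Raises ValueError if the greedy choice fails.
    """
    remaining = list(logical_per_disk)
    total = sum(remaining)
    if total % group_size:
        raise ValueError("logical disks do not divide into full groups")
    groups = []
    for _ in range(total // group_size):
        # Prefer physical disks with the most unassigned logical disks.
        candidates = sorted(
            (d for d, left in enumerate(remaining) if left > 0),
            key=lambda d: (-remaining[d], d),
        )
        if len(candidates) < group_size:
            raise ValueError("a group would need the same physical disk twice")
        chosen = candidates[:group_size]
        for d in chosen:
            remaining[d] -= 1
        groups.append(sorted(chosen))
    return groups

# Four physical disks split into 2, 2, 1 and 1 logical disks, groups of 3.
groups = assign_parity_groups([2, 2, 1, 1], 3)
```

With this property, the failure of one physical disk removes at most one logical disk from each parity group, so every group can still be reconstructed.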
Heterogeneous Extension of RAID
• Read: an online load-balancing algorithm directs the request for a block to the least loaded disk.
• Every disk has a queue with all reads and their deadlines.
• Requested blocks are delivered based on deadline and location on disk (to minimize seek-time overhead).

Heterogeneous Extension of RAID
• Availability is almost as good as in the homogeneous case (RAID 5).
• But HERA is much more flexible than RAID 5.
• Performance relies on the logical disk distribution, which is the task of the administrator.
• The authors recently proposed a configuration planning algorithm which optimizes for bandwidth and storage: [Zimmermann, Ghandeharizadeh: Highly Available and Heterogeneous Continuous Media Storage Systems], December 2004

Random I/O Mediaserver (RIO)
• Randomized distribution strategy
• Concentrates on delivering multimedia objects
• Optimized for real-time reading:
– Video on demand
– 3D interactive virtual world navigation
– Interactive scientific visualization
• Idea: place each data unit on a random disk at a random position. This ensures long-term load balance.
Storage Networks – RIO

Homogeneous RIO – Data Placement
• A multimedia object is composed of a sequence of constant-size data blocks.
• Each data block is placed on a random disk at a random location -> long-term load balancing
• By replicating a fraction of the data blocks, we allow short-term balancing

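A minimal Python sketch of this placement, under the assumption that a replicated block gets exactly two copies on distinct disks; the position on the disk is omitted, and `place_object` with its parameters is a hypothetical name.

```python
import random

def place_object(num_blocks, num_disks, replicated_fraction, rng):
    """Return, for every block of an object, the list of disks holding it.

    Each block goes to a uniformly random disk. With probability
    `replicated_fraction` it also gets a second copy on another disk.
    """
    placement = []
    for _ in range(num_blocks):
        copies = 2 if rng.random() < replicated_fraction else 1
        # sample() draws distinct disks, so replicas never share a disk
        placement.append(rng.sample(range(num_disks), copies))
    return placement

rng = random.Random(2005)  # fixed seed only to make the example repeatable
placement = place_object(num_blocks=8, num_disks=4, replicated_fraction=0.25, rng=rng)
```

A read for a replicated block can be served by either copy, which is what gives the scheduler room for short-term balancing.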
Homogeneous RIO – Read Scheduler
• All reads have a deadline. Non-real-time requests have an infinite deadline.
• A request for a block is routed to the disk with the least load
• A disk serves several block requests per cycle:
– A number of requests are selected from the disk request queue
– The selected requests are reordered according to their location on disk, to minimize the seek-time overhead, and then serviced

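The two scheduling steps can be sketched in Python as follows. Measuring load by queue length and selecting the requests with the earliest deadlines are assumptions of this example; the function names are illustrative.

```python
import heapq

def route_request(replica_disks, queue_lengths):
    """Pick the replica disk with the shortest queue (ties: lowest disk id)."""
    return min(replica_disks, key=lambda d: (queue_lengths[d], d))

def next_cycle(queue, batch_size):
    """Choose the requests one disk serves in its next cycle.

    queue holds (deadline, position) tuples; non-real-time requests use
    float("inf") as deadline. The batch_size most urgent requests are
    selected and then ordered by disk position to keep seeks short.
    """
    urgent = heapq.nsmallest(batch_size, queue)
    return sorted(urgent, key=lambda request: request[1])

# A block stored on disks 0 and 2; disk 2 has the shorter queue.
target = route_request([0, 2], queue_lengths=[5, 1, 3])

# One disk queue with three real-time reads and one background read.
batch = next_cycle([(10, 700), (5, 300), (float("inf"), 100), (7, 900)], 3)
```

The background request stays in the queue until no more urgent request competes for a slot in the cycle.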
Heterogeneous RIO – Data Placement
• Place data on a disk with probability proportional to its size
• Probability to place data on disk $j$: $d_j = \frac{C_j}{S}$ ($S$: total capacity of all disks)
• Note that: $\sum_{j=1}^{n} d_j = 1$
• Disk capacity is increasing faster than disk bandwidth -> faster, bigger disks are going to be the bottleneck

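Capacity-proportional placement can be sketched with Python's standard library. The function names are illustrative, and capacities may be given in any common unit.

```python
import random

def placement_probabilities(capacities):
    """Return the probability of each disk receiving a block."""
    total = sum(capacities)
    return [c / total for c in capacities]

def place_block(capacities, rng):
    """Pick a disk index with probability proportional to its capacity."""
    return rng.choices(range(len(capacities)), weights=capacities, k=1)[0]

probabilities = placement_probabilities([100, 300])
disk = place_block([100, 300], random.Random(0))
```

In the long run all disks fill up at the same relative rate, so a disk with three times the capacity also receives three times the requests, regardless of its bandwidth.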
Heterogeneous RIO – BSR
• $n$ disks ($D_i$):
– Capacity: $C_i$
– Bandwidth: $B_i$
• Total capacity: $C = \sum_{i=1}^{n} C_i$
• Total bandwidth: $B = \sum_{i=1}^{n} B_i$
• Bandwidth space ratio (BSR): $bs_i = \frac{B_i / B}{C_i / C}$
• The BSR is a hint of how much load a disk can take

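As a small worked example, the BSR of each disk can be computed as the disk's share of the total bandwidth divided by its share of the total capacity. The function name and the sample numbers are made up for illustration.

```python
def bandwidth_space_ratios(capacities, bandwidths):
    """Return the normalized bandwidth space ratio of every disk."""
    total_capacity = sum(capacities)
    total_bandwidth = sum(bandwidths)
    return [
        (b / total_bandwidth) / (c / total_capacity)
        for c, b in zip(capacities, bandwidths)
    ]

# A small and a large disk with equal bandwidth (hypothetical units).
ratios = bandwidth_space_ratios(capacities=[100, 300], bandwidths=[50, 50])
```

The small disk offers half of the bandwidth but holds only a quarter of the data, so its ratio is above 1; the large disk has a ratio below 1 and becomes the bottleneck under capacity-proportional placement.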
Heterogeneous RIO – Clusters
• Goal: redirect load from low-BSR disks to higher-BSR disks.
• Group disks in clusters based on their BSRs.
• Low-BSR clusters would have high load.
• How much replication do we need to sustain a certain load?

Heterogeneous RIO – Replication Factor
• We want to sustain a maximum load of $B_{\max} \le B$
• Data without replicas, for replication factor $r$: $D = \frac{C}{1 + r}$
• Maximum load on a cluster is: $B_{\max,i} = B_{\max} \frac{C_i}{D} = B_{\max} (1 + r) \frac{C_i}{C}$
• To use all bandwidth ($B_{\max} = B$) we need: $B_{\max,i} \ge B_i$
-> $1 + r \ge \frac{B_i / B}{C_i / C} = bs_i$, i.e. $r \ge \max_i(bs_i) - 1$

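The bound can be checked numerically with a short Python sketch. It assumes the reading of this slide in which the required replication factor is the largest BSR minus one and the maximum load on cluster $i$ is $B (1 + r) C_i / C$; function names and numbers are illustrative.

```python
def min_replication(capacities, bandwidths):
    """Smallest r with 1 + r >= bs_i for every cluster i."""
    total_c = sum(capacities)
    total_b = sum(bandwidths)
    ratios = [(b / total_b) / (c / total_c) for c, b in zip(capacities, bandwidths)]
    return max(ratios) - 1

def max_cluster_loads(capacities, bandwidths, r):
    """Load that can be directed to each cluster at full system bandwidth."""
    total_c = sum(capacities)
    total_b = sum(bandwidths)
    return [total_b * (1 + r) * c / total_c for c in capacities]

capacities, bandwidths = [100, 300], [50, 50]   # hypothetical clusters
r = min_replication(capacities, bandwidths)
loads = max_cluster_loads(capacities, bandwidths, r)
```

In this example the small cluster has a BSR of 2, so every block needs a replica on average before that cluster can be kept busy at its full bandwidth. If all clusters have the same BSR, the bound gives no replication at all.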
Heterogeneous RIO – Summary
• Randomized data placement
• Read scheduler optimizes read bandwidth
• Depending on the disk characteristics, a different replication factor is needed to sustain a certain bandwidth
• The authors claim that within a few years 10% to 40% replication will be sufficient to use the full aggregate bandwidth of the network

Conclusions
• All three methods concentrate on optimizing bandwidth and space utilization. Adaptivity is hard to achieve.
• AdaptRaid and HERA
– Deterministic
– Extend homogeneous RAID
– AdaptRaid 5 wastes space?
• RIO
– Randomized
– How fast is the read scheduler?
– The only one where the authors showed a real-life implementation (Virtual World Data Center)
Storage Networks – Conclusions

Storage Networks
Thank You!
Questions?
Bálint Miklós