Reliability vs Availability
-
Upload
sage-gibbs -
Category
Documents
-
view
40 -
download
0
description
Transcript of Reliability vs Availability
![Page 1: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/1.jpg)
Lecture 4 1
Reliability vs Availability
• Reliability: Is anything broken?
• Availability: Is the system still available to the user?
![Page 2: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/2.jpg)
Lecture 4 2
RAID
• Redundant Array of Inexpensive Disks– Higher throughput– Ability to recover from failures
• Disk Array (Availability reduced to 1/N)• Data Integrity• Striping• MTTR• MTTF
![Page 3: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/3.jpg)
Lecture 4 3
Manufacturing Advantages of Disk Arrays
14”10”5.25”3.5”
3.5”
Disk Array: 1 disk design
Conventional: 4 disk designs
Low End High End
Disk Product Families
![Page 4: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/4.jpg)
Lecture 4 4
Replace Small # of Large Disks with Large # of Small Disks! (1988 Disks)
Data Capacity
Volume
Power
Data Rate
I/O Rate
MTTF
Cost
IBM 3390 (K)
20 GBytes
97 cu. ft.
3 KW
15 MB/s
600 I/Os/s
250 KHrs
$250K
IBM 3.5" 0061
320 MBytes
0.1 cu. ft.
11 W
1.5 MB/s
55 I/Os/s
50 KHrs
$2K
x70
23 GBytes
11 cu. ft.
1 KW
120 MB/s
3900 IOs/s
??? Hrs
$150K
Disk Arrays have potential for
large data and I/O rates
high MB per cu. ft., high MB per KW
reliability?
![Page 5: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/5.jpg)
Lecture 4 5
Array Reliability
• Reliability of N disks = Reliability of 1 Disk / N
50,000 Hours / 70 disks = 700 hours
Disk system MTTF: Drops from 6 years to 1 month!
• Arrays (without redundancy) too unreliable to be useful!
Hot spares support reconstruction in parallel with access: very high media availability can be achieved
![Page 6: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/6.jpg)
Lecture 4 6
Redundant Arrays of Disks• Files are "striped" across multiple spindles• Redundancy yields high data availability
Disks will fail
Contents reconstructed from data redundantly stored in the arrayCapacity penalty to store it
Bandwidth penalty to update
Mirroring/Shadowing (high capacity cost)
Horizontal Hamming Codes (overkill)
Parity & Reed-Solomon Codes
Failure Prediction (no capacity overhead!)VaxSimPlus — Technique is controversial
Techniques:
![Page 7: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/7.jpg)
Lecture 4 7
Redundant Arrays of DisksRAID 1: Disk Mirroring/Shadowing
• Each disk is fully duplicated onto its "shadow" Very high availability can be achieved
• Bandwidth sacrifice on write: Logical write = two physical writes
• Reads may be optimized
• Most expensive solution: 100% capacity overheadTargeted for high I/O rate , high availability environments
recoverygroup
![Page 8: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/8.jpg)
Lecture 4 8
Redundant Arrays of Disks RAID 3: Parity Disk
P100100111100110110010011
. . .logical record 1
0010011
11001101
10010011
00110000
Striped physicalrecords
• Parity computed across recovery group to protect against hard disk failures 33% capacity cost for parity in this configuration wider arrays reduce capacity costs, decrease expected availability, increase reconstruction time• Arms logically synchronized, spindles rotationally synchronized logically a single high capacity, high transfer rate diskTargeted for high bandwidth applications: Scientific, Image Processing
![Page 9: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/9.jpg)
Lecture 4 9
Redundant Arrays of Disks RAID 5+: High I/O Rate Parity
A logical writebecomes fourphysical I/Os
Independent writespossible because ofinterleaved parity
Reed-SolomonCodes ("Q") forprotection duringreconstruction
D0 D1 D2 D3 P
D4 D5 D6 P D7
D8 D9 P D10 D11
D12 P D13 D14 D15
P D16 D17 D18 D19
D20 D21 D22 D23 P...
.
.
.
.
.
.
.
.
.
.
.
.Disk Columns
IncreasingLogical
Disk Addresses
Stripe
StripeUnit
Targeted for mixedapplications
![Page 10: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/10.jpg)
Lecture 4 10
Problems of Disk Arrays: Small Writes
D0 D1 D2 D3 PD0'
+
+
D0' D1 D2 D3 P'
newdata
olddata
old parity
XOR
XOR
(1. Read) (2. Read)
(3. Write) (4. Write)
RAID-5: Small Write Algorithm1 Logical Write = 2 Physical Reads + 2 Physical Writes
![Page 11: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/11.jpg)
Lecture 4 11
Subsystem Organization
host arraycontroller
single boarddisk
controller
single boarddisk
controller
single boarddisk
controller
single boarddisk
controller
hostadapter
manages interfaceto host, DMA
control, buffering,parity logic
physical devicecontrol
often piggy-backedin small format devices
striping software off-loaded from host to array controller
no applications modifications
no reduction of host performance
![Page 12: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/12.jpg)
Lecture 4 12
Summary: A Little Queuing Theory
• Queuing models assume state of equilibrium: input rate = output rate
• Notation: r average number of arriving customers/second
Tser average time to service a customer (tradtionally ต = 1/ Tser )u server utilization (0..1): u = r x Tser Tq average time/customer in queue Tsys average time/customer in system: Tsys = Tq + TserLq average length of queue: Lq = r x Tq Lsys average length of system : Lsys = r x Tsys
• Little’s Law: Lengthsystem = rate x Timesystem (Mean number customers = arrival rate x mean service time)
Proc IOC Device
Queue serverSystem
![Page 13: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/13.jpg)
Lecture 4 13
Summary: Redundant Arrays of Disks (RAID) Techniques
• Disk Mirroring, Shadowing (RAID 1)
Each disk is fully duplicated onto its "shadow" Logical write = two physical writes
100% capacity overhead
• Parity Data Bandwidth Array (RAID 3)
Parity computed horizontally
Logically a single high data bw disk
• High I/O Rate Parity Array (RAID 5)Interleaved parity blocks
Independent reads and writes
Logical write = 2 reads + 2 writes
Parity + Reed-Solomon codes
10010011
11001101
10010011
00110010
10010011
10010011
![Page 14: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/14.jpg)
Lecture 4 14
Homework #3
• Problem 6.5 and 6.6
• Due date: June 26, 2001
![Page 15: Reliability vs Availability](https://reader036.fdocuments.in/reader036/viewer/2022083015/56812b97550346895d8fb9c5/html5/thumbnails/15.jpg)
Lecture 4 15
Assignment #1
• Explain RAID 0-6 in greater detail.
• Due date: July 3, 2001