Copyright © Curt Hill, 2003-2010
RAID
What every server wants!
History I
• In the late 1970s there were two types of disks:
  – Hard
  – Floppy
• Hard drives were used only on mainframes or minicomputers
  – Quite expensive
• Floppies were mostly used on personal computers
• In about 1984 hard drives started appearing on IBM personal computers
History II
• Now there were two types of hard drives:
  – Professional
    • Large: 100-500 MB
    • Expensive
  – Amateur
    • Small: 5-50 MB
    • Inexpensive
History III
• At this point economics takes charge
• The large disks were a small market, typically tens of thousands of units
• The small disks were a mass market, typically tens of millions of units
• What happens to the prices?
History IV
• Someone gets a bright idea:
  – Should I buy one 500 MB professional disk for $50,000 or ten 50 MB amateur disks at $100 each?
• RAID is born: Redundant Array of Inexpensive Disks
  – Later: Redundant Array of Independent Disks
• There might even be a performance improvement if I can access the disks in the array independently
• However, we need some type of controller to treat the ten disks as if they were one
Issues
• Redundancy
  – Needed if the small disks are less reliable, or if the data is so important that we need multiple copies for safety
• Speed
  – Increase speed by writing part of the data to one drive and another part to another drive at the same time
• Versions
  – There have been many, termed levels
  – Currently level 0 through level 6
Redundancy
• Multiple copies allow one disk to crash without losing data
• Two forms:
  – Mirroring
    • AKA shadowing, duplexing
  – Error correction codes
Mirroring
• Write all the data twice
  – Once to a disk and once to its mirror
  – This is done simultaneously, so there is no extra delay
• A read only has to wait for the faster of the two
• If a disk crashes, nothing has to be recomputed: the surviving disk is simply copied
• High disk overhead: needs two disks to store one disk's worth of data
• Can read different parts from the two disks at the same time to increase speed
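The mirroring behaviour above can be sketched in a few lines of Python. This is only an illustrative in-memory model: the class name `MirroredPair` and its methods are made up for this example, and a real controller would issue the two writes in parallel rather than in a loop.

```python
class MirroredPair:
    """Toy mirror: two in-memory 'disks' that hold identical blocks."""

    def __init__(self, num_blocks):
        # Each disk is a list of blocks; None means "never written".
        self.disks = [[None] * num_blocks, [None] * num_blocks]
        self.failed = [False, False]

    def write(self, block_no, data):
        # Every write goes to both disks.
        for d in (0, 1):
            if not self.failed[d]:
                self.disks[d][block_no] = data

    def read(self, block_no):
        # Any surviving disk can satisfy a read.
        for d in (0, 1):
            if not self.failed[d]:
                return self.disks[d][block_no]
        raise IOError("both disks have failed")

    def fail(self, d):
        # Simulate a crash: the disk's contents are gone.
        self.failed[d] = True
        self.disks[d] = [None] * len(self.disks[d])

    def rebuild(self, d):
        # Rebuilding is a plain copy from the surviving mirror; nothing is computed.
        other = 1 - d
        if self.failed[other]:
            raise IOError("no surviving mirror to copy from")
        self.disks[d] = list(self.disks[other])
        self.failed[d] = False


pair = MirroredPair(4)
pair.write(0, b"alpha")
pair.write(1, b"beta")
pair.fail(0)                    # disk 0 crashes
survivor_read = pair.read(1)    # still served, by disk 1
pair.rebuild(0)                 # replacement disk is filled by copying
```

After `rebuild(0)` the two lists are identical again, which is the "just a copy" property the slide describes.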
Error Correction Code
• Even more history and background is needed here
• See the ECC.ppt presentation
• Many of the error correction codes were studied by Hamming of Bell Labs
  – Thus known as Hamming codes
ECC
• Instead of mirroring, which requires double the disk space, use an ECC
• Conceptually:
  – The eight data bits are placed on eight separate drives and the four ECC bits on additional drives
  – Any one disk that fails may be recreated from the rest
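As a rough illustration of the eight-data-bit, four-ECC-bit idea, the sketch below uses a standard Hamming layout in which the check bits sit at positions 1, 2, 4 and 8 of a 12-bit word, with one bit per "drive". The function names and the dictionary representation are assumptions made for this example, not something taken from the slides.

```python
from functools import reduce

PARITY_POSITIONS = (1, 2, 4, 8)                 # 1-based positions of the 4 ECC bits
DATA_POSITIONS = (3, 5, 6, 7, 9, 10, 11, 12)    # positions of the 8 data bits


def syndrome(word):
    """XOR together the positions of all 1 bits; this is 0 for a valid word."""
    return reduce(lambda a, b: a ^ b,
                  (pos for pos, bit in word.items() if bit), 0)


def encode(data_bits):
    """Spread 8 data bits and 4 check bits over 12 'drives' (position -> bit)."""
    word = {pos: 0 for pos in range(1, 13)}
    for pos, bit in zip(DATA_POSITIONS, data_bits):
        word[pos] = bit
    s = syndrome(word)
    # Turning on check bit p wherever the syndrome has bit p set cancels it to 0.
    for p in PARITY_POSITIONS:
        word[p] = 1 if s & p else 0
    return word


def recover(word, failed_pos):
    """Recompute the bit of a drive known to have failed from the other eleven."""
    rest = {pos: bit for pos, bit in word.items() if pos != failed_pos}
    # With the lost bit left out, the syndrome equals failed_pos iff that bit was 1.
    return 1 if syndrome(rest) == failed_pos else 0


data = [1, 0, 1, 1, 0, 0, 1, 0]
word = encode(data)
recovered = [recover(word, pos) for pos in range(1, 13)]
```

Each entry of `recovered` is computed without looking at the drive it replaces, so the list matching the original word shows that any single failed drive, data or ECC, can be recreated from the rest.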
Speed
• Speed requires parallelism
  – Do two things at once
• With a mirrored disk, read the front half from one and the back half from the other
• Transfer time is cut in half
• This leads to a more general approach: stripes
Stripes
• Cut the data into stripes
• If you have N disks
  – Partition into N pieces
  – Read or write N pieces at a time
• Best if each piece goes to a separate controller
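A minimal sketch of striping, assuming fixed-size chunks dealt round-robin across the disks. The function names and the `chunk_size` parameter are illustrative choices; the slides do not specify a chunk size.

```python
def split_into_stripes(data, num_disks, chunk_size):
    """Deal fixed-size chunks of `data` round-robin onto `num_disks` lists."""
    disks = [[] for _ in range(num_disks)]
    for offset in range(0, len(data), chunk_size):
        chunk_no = offset // chunk_size
        disks[chunk_no % num_disks].append(data[offset:offset + chunk_size])
    return disks


def reassemble(disks):
    """Read the chunks back in the same round-robin order."""
    out = []
    longest = max(len(d) for d in disks)
    for row in range(longest):
        for d in disks:
            if row < len(d):
                out.append(d[row])
    return b"".join(out)


data = bytes(range(26))
disks = split_into_stripes(data, num_disks=4, chunk_size=3)
restored = reassemble(disks)
```

In a real array the N chunks of one row would be transferred at the same time, one per disk, which is where the speedup comes from; this model only shows where each piece lands.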
Controllers
• Controllers may be hardware or software or both
• Some levels make a software controller a problem
RAID Levels
• Originally specified with five levels
  – 1-5; level 0 was added later
• Some of these are seldom used and new ones have been added
Level 0
• Striped disk array
• Minimum of two disks
• No redundancy or error checking
  – If a drive is lost, all the data is lost
  – Only RAID level that does not protect data
• Real RAID or not?
Level 0 - Striping
A1 A2 A3 A4
B1 B2 B3 B4
C1 C2 C3 C4
Four stripes
Each block of data is written in four pieces
No redundancy
Level 1
• Mirroring
• Minimum of two disks
• 100% redundancy
  – A rebuilt disk is just a copy, not computed
• Simple controller
• Cannot be expanded on the fly
Level 1 - Mirroring
A A E E
B B F F
C C G G
Four disks, each is a mirror of another
Each block of data is written twice
Need twice as much space
Level 2
• Striped disk array at the bit level
• ECC provides the redundancy
• Minimum of seven disks for storing a 4-bit word
• Not commercially viable
• Number of parity disks is proportional to the log of the number of data disks
• Not very flexible
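The seven-disk minimum and the logarithmic growth both follow from the usual condition for a single-error-correcting Hamming code: r check bits can cover m data bits when 2^r >= m + r + 1. The helper below (its name is made up for this example) finds the smallest such r.

```python
def check_disks_needed(data_disks):
    """Smallest r with 2**r >= data_disks + r + 1 (single-error-correcting Hamming code)."""
    r = 1
    while 2 ** r < data_disks + r + 1:
        r += 1
    return r


# Data disks -> (check disks, total disks)
table = {m: (check_disks_needed(m), m + check_disks_needed(m)) for m in (4, 8, 32)}
```

For a 4-bit word this gives three check disks, hence seven disks in total; going from 4 to 32 data disks only raises the number of check disks from three to six.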
Level 2 – Striping with ECC
A1 A2 A3 A4 EAx EAy EAz
B1 B2 B3 B4 EBx EBy EBz
C1 C2 C3 C4 ECx ECy ECz
Four stripes of data protected by three of ECC
ECC is computed on the fly
Level 3
• Striped disk array with one ECC disk
• Each block is striped across disks
• One disk may fail without diminishing throughput
• Minimum of three disks
• High data rates
• Controller should be in hardware, not just software
Level 3 – Striping with ECC
A1 A2 A3 A4 ECCA
B1 B2 B3 B4 ECCB
C1 C2 C3 C4 ECCC
Four stripes of data protected by one ECC disk
ECC is computed on the fly
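The slides do not say which code the single ECC disk holds. Assuming it is plain XOR parity, the usual choice when the failed disk is known, a lost block can be rebuilt as in this sketch; the function names and the sample blocks are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """XOR of all surviving blocks (data and parity) yields the missing block."""
    return xor_blocks(surviving_blocks)


stripe = [b"A1A1", b"A2A2", b"A3A3", b"A4A4"]   # one row of the array
parity = xor_blocks(stripe)                      # contents of the ECC disk

lost = 2                                         # pretend the third disk failed
survivors = [blk for i, blk in enumerate(stripe) if i != lost] + [parity]
rebuilt = rebuild_missing(survivors)
```

Because XOR is its own inverse, the same routine rebuilds the parity block if the ECC disk is the one that fails.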
Level 4
• Similar to level 3 except
  – Blocks are not subdivided
  – Different blocks are written to different disks
• Minimum of three disks
• Controller is complex
  – Should be hardware
• Not easy to rebuild in case of failure
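Because whole blocks go to individual disks, a write touches one data disk plus the ECC disk. Assuming the ECC is XOR parity (an assumption, as the slides do not name the code), the parity can be patched from the old and new contents of that one block without reading the other data disks. The names below are illustrative.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(blocks):
    """Parity computed from scratch over every data block."""
    parity = bytes(len(blocks[0]))
    for blk in blocks:
        parity = xor_bytes(parity, blk)
    return parity


def small_write(data_disks, parity, disk_no, new_block):
    """Replace one block and return the patched parity."""
    old_block = data_disks[disk_no]
    # new parity = old parity XOR old data XOR new data
    new_parity = xor_bytes(xor_bytes(parity, old_block), new_block)
    data_disks[disk_no] = new_block
    return new_parity


data_disks = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = full_parity(data_disks)
parity = small_write(data_disks, parity, 2, b"cccc")
```

The patched parity equals what a full recomputation would give. Note that every such write still has to update the one ECC disk.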
Level 5
• Striped disk array with distributed ECC blocks
• Each disk stores both data and ECC
  – A disk stores ECC data for other disks
• Minimum of three disks
• Any single disk failure will result in no loss of data
• Most complex controller design
Level 5 – Striping with Distributed ECC
A1   A2   A3   A4   ECC5
B1   B2   B3   ECC4 B5
C1   C2   ECC3 C4   C5
D1   ECC2 D3   D4   D5
ECC1 E2   E3   E4   E5
Four stripes of data
Each ECC protects other four
No disks are only data or only ECC
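One way to generate a rotating ECC placement like the one in this diagram is to move the ECC block one disk to the left on each successive row. The function below is a sketch of that rule only; real controllers use several different rotation schemes, and the block labels simply mimic the diagram.

```python
def raid5_layout(num_disks, num_rows):
    """Return rows of block labels with the ECC block rotating right to left."""
    rows = []
    for r in range(num_rows):
        ecc_disk = (num_disks - 1 - r) % num_disks   # 0-based disk holding ECC
        row_letter = chr(ord("A") + r)
        row = []
        for d in range(num_disks):
            if d == ecc_disk:
                row.append(f"ECC{d + 1}")
            else:
                row.append(f"{row_letter}{d + 1}")
        rows.append(row)
    return rows


layout = raid5_layout(num_disks=5, num_rows=5)
for row in layout:
    print(" ".join(f"{label:<4}" for label in row))
```

With five disks and five rows every disk holds exactly one ECC block, so no disk is only data or only ECC.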
Level 6
• Striped disk array
  – Similar to level 5 except with two independent ECC schemes
  – ECC blocks distributed among data disks
• Minimum of four disks
• May have up to two disk failures without loss of data
Level 6 – Striping with Two Distributed ECCs
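[Figure: five-disk layout. Rows A-D each hold three data blocks (A1-A3, B1-B3, C1-C3, D1-D3) and two ECC blocks, one from ECC1-ECC4 and one from ECCa-ECCd; the ECC positions shift from row to row]
Four stripes of data
Two types of ECC, each of which protects a different group
No disks are only data or only ECC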
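The slides do not name the two ECC schemes. One common pairing, assumed here purely for illustration, is an XOR parity P plus a second Reed-Solomon-style check Q computed in the finite field GF(2^8). The sketch below shows how two lost data blocks can be recovered from the survivors plus P and Q; all function names are made up for this example.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8), reducing by x^8 + x^4 + x^3 + x^2 + 1 (0x11d)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
        b >>= 1
    return result


def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result


def gf_inv(a):
    # a**255 == 1 for any nonzero a, so a**254 is its inverse.
    return gf_pow(a, 254)


def pq(blocks):
    """P is the XOR of the blocks; Q weights block i by 2**i in GF(2^8)."""
    size = len(blocks[0])
    p, q = bytearray(size), bytearray(size)
    for i, blk in enumerate(blocks):
        coeff = gf_pow(2, i)
        for k, byte in enumerate(blk):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(blocks, i, j, p, q):
    """Rebuild data blocks i and j (given as None) from the rest plus P and Q."""
    size = len(p)
    known = [blk if blk is not None else bytes(size) for blk in blocks]
    p_part, q_part = pq(known)          # contribution of the surviving blocks
    gi, gj = gf_pow(2, i), gf_pow(2, j)
    denom_inv = gf_inv(gi ^ gj)
    di, dj = bytearray(size), bytearray(size)
    for k in range(size):
        p_delta = p[k] ^ p_part[k]      # equals d_i XOR d_j
        q_delta = q[k] ^ q_part[k]      # equals g^i*d_i XOR g^j*d_j
        di[k] = gf_mul(gf_mul(gj, p_delta) ^ q_delta, denom_inv)
        dj[k] = p_delta ^ di[k]
    return bytes(di), bytes(dj)


blocks = [b"A1A1", b"A2A2", b"A3A3"]
p, q = pq(blocks)
damaged = [None, blocks[1], None]       # two data disks lost at once
rebuilt = recover_two(damaged, 0, 2, p, q)
```

Two checks give two equations per byte, which is why two unknown blocks can be solved for; a single XOR parity supplies only one equation.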
Others
• Several combinations of these exist
• 0 + 1
  – Mirroring a striped disk
• 10
  – Striping a mirrored disk
• 53
  – Level 0 and 3
  – Level 0 whose stripes are level 3 arrays
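The order of nesting matters for fault tolerance. The sketch below counts which two-disk failures a four-disk array survives under each arrangement, using a simplified model in which a striped set is unusable as soon as any member fails. The function names and the disk grouping are assumptions made for this example.

```python
from itertools import combinations


def raid10_survives(failed, pairs):
    # Striping over mirrored pairs: data survives if every pair keeps one disk.
    return all(not set(pair) <= failed for pair in pairs)


def raid01_survives(failed, stripe_sets):
    # Mirroring of striped sets: data survives if one striped set is fully intact.
    return any(not (set(s) & failed) for s in stripe_sets)


groups = [(0, 1), (2, 3)]     # read as mirrored pairs (10) or striped sets (0 + 1)
two_disk_failures = [set(c) for c in combinations(range(4), 2)]

raid10_ok = sum(raid10_survives(f, groups) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(f, groups) for f in two_disk_failures)
```

Under this model, striping a mirrored disk survives four of the six possible two-disk failures, while mirroring a striped disk survives only two.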
What is coming?
• As disk arrays get larger the likelihood of a double failure becomes significant
• Although there is no product yet, it seems likely that a triple parity scheme is inevitable