The concept of RAID in Databases By Junaid Ali Siddiqui.

Transcript of The concept of RAID in Databases By Junaid Ali Siddiqui.

Page 1: The concept of RAID in Databases By Junaid Ali Siddiqui.

The concept of RAID in Databases

By

Junaid Ali Siddiqui

Page 2: The concept of RAID in Databases By Junaid Ali Siddiqui.

Some Key Terms

Concatenated array: An array in which multiple disk drives or arrays are logically connected together, end to end.

Data drive: A disk drive that is dedicated to storing data, as opposed to parity, Hamming code, or a hot standby.

Page 3: The concept of RAID in Databases By Junaid Ali Siddiqui.

Logical disk: This is what a RAID array is. Although the RAID array consists of multiple disks, it appears to the operating system as a single disk.

Physical disk: An actual disk drive. The term is sometimes used to distinguish it from a logical disk.

Page 4: The concept of RAID in Databases By Junaid Ali Siddiqui.

FIGURE 1-1 Logical Drive Including Multiple Physical Drives

Page 5: The concept of RAID in Databases By Junaid Ali Siddiqui.

Segment size: The number of blocks (sometimes expressed in bytes) that are written to one disk drive before moving on to the next disk drive in the array.

Stripe size: Similar to segment size, except that it is only valid for RAID-0 arrays. Many manufacturers use this term when they mean segment size.

Stripe width: The number of blocks that must be written to the array so that every data drive has had a complete segment written.
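The relationship between segment size and stripe width can be sketched as a one-line calculation. This is a minimal illustration, assuming the segment size is given in blocks; the function name `stripe_width` and the example numbers are hypothetical and do not come from any vendor tool.

```python
def stripe_width(segment_size_blocks: int, data_drives: int) -> int:
    """Blocks that must be written so every data drive holds one full segment."""
    if segment_size_blocks <= 0 or data_drives <= 0:
        raise ValueError("segment size and drive count must be positive")
    # One complete segment per data drive; parity or standby drives are not counted.
    return segment_size_blocks * data_drives


# Hypothetical array: 128-block segments spread over 4 data drives.
example_width = stripe_width(128, 4)
```

With these example values, the array needs 512 blocks written before every data drive has received a complete segment.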

Page 6: The concept of RAID in Databases By Junaid Ali Siddiqui.

What is RAID?

Redundant Array of Inexpensive Disks (RAID) is a storage technology used to improve the processing capability of storage systems. It is designed to provide reliability in disk array systems and to take advantage of the performance gains offered by an array of multiple disks over single-disk storage.

RAID's two primary underlying concepts are:

Distributing data over multiple hard drives improves performance.

Using multiple drives properly allows any one drive to fail without loss of data and without system downtime.

Page 7: The concept of RAID in Databases By Junaid Ali Siddiqui.

Types or levels of RAID

RAID 0
RAID 1
RAID 2
RAID 3
RAID 4
RAID 5
Compound RAID levels

Page 8: The concept of RAID in Databases By Junaid Ali Siddiqui.

RAID 0

In RAID Level 0 (also called striping), each segment is written to a different disk, until all drives in the array have been written to.
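Striping can be illustrated by mapping a logical block number to a drive and an offset on that drive. The sketch below assumes segments are handed out to drives in simple round-robin order, wrapping back to the first drive once every drive has received a segment; the function name `locate_block` and its parameters are illustrative only.

```python
def locate_block(logical_block: int, segment_size: int, num_drives: int):
    """Return (drive index, block offset on that drive) for a logical block."""
    segment = logical_block // segment_size   # which segment the block falls in
    drive = segment % num_drives              # segments rotate across the drives
    stripe = segment // num_drives            # full passes over all drives so far
    offset = stripe * segment_size + logical_block % segment_size
    return drive, offset


# Hypothetical array: 3 drives, 4 blocks per segment.
first = locate_block(0, 4, 3)     # start of the first segment
second = locate_block(4, 4, 3)    # start of the second segment
wrapped = locate_block(13, 4, 3)  # fourth segment, back on the first drive
```

Consecutive segments land on different drives, which is why a large request can keep several drives busy at once.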

Page 9: The concept of RAID in Databases By Junaid Ali Siddiqui.

Using RAID 0

Advantages

The I/O performance of a RAID-0 array is significantly better than that of a single disk. This is true for small I/O requests, as several can be processed simultaneously, and for large requests, as multiple disk drives can become involved in the operation.

Disadvantages

This level of RAID is the only one with no redundancy. If one disk in the array fails, data is lost.

Page 10: The concept of RAID in Databases By Junaid Ali Siddiqui.

RAID 1

In RAID Level 1 (also called mirroring), each disk is an exact duplicate of all other disks in the array. When a write is performed, it is sent to all disks in the array. When a read is performed, it is sent to only one disk. This is the least space-efficient of the RAID levels.
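The write-to-all, read-from-one behavior can be sketched with a small in-memory model. This is only an illustration: the class name `MirroredArray` is hypothetical, disks are modeled as Python lists, and the round-robin choice of which mirror serves a read is an assumption about one possible controller policy.

```python
class MirroredArray:
    """Toy RAID-1 model: every disk holds the same blocks."""

    def __init__(self, num_disks: int, num_blocks: int):
        if num_disks < 2:
            raise ValueError("mirroring needs at least two disks")
        self.disks = [[b""] * num_blocks for _ in range(num_disks)]
        self._next_reader = 0  # index of the disk that serves the next read

    def write(self, block: int, data: bytes) -> None:
        # A write is sent to every disk in the array.
        for disk in self.disks:
            disk[block] = data

    def read(self, block: int):
        # A read is sent to one disk only; alternate between mirrors.
        disk_index = self._next_reader
        self._next_reader = (self._next_reader + 1) % len(self.disks)
        return self.disks[disk_index][block], disk_index


array = MirroredArray(num_disks=2, num_blocks=8)
array.write(3, b"row-42")
```

Because every disk holds the same data, usable capacity equals the capacity of one disk, no matter how many mirrors are added.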

Page 11: The concept of RAID in Databases By Junaid Ali Siddiqui.

Advantages

RAID-1 arrays with multiple mirrors are often used to improve performance in situations where the data on the disks is being read by multiple programs at the same time. Because reads can be served from multiple mirrors at once, data throughput increases. The most common use of RAID-1 with multiple mirrors is to improve the performance of databases.

The read performance of RAID-1 will be no worse than that of a single drive. If the RAID controller is intelligent enough to send read requests to alternate disk drives, RAID-1 can significantly improve read performance.

RAID-1 is also known as a "mirrored set without parity" or simply "mirroring". It provides fault tolerance against disk errors and against the failure of all but one of the drives. Two (or more) disks each store exactly the same data at all times. Data is not lost as long as one disk survives.

Disadvantage

This is the least space-efficient of the RAID levels. The total capacity of the array is simply the capacity of one disk.

Page 12: The concept of RAID in Databases By Junaid Ali Siddiqui.

RAID 2

RAID Level 2 is an intellectual curiosity and has never been widely used. It is more space-efficient than RAID-1, but less space-efficient than other RAID levels.

Instead of using simple parity to validate the data, it uses a much more complex algorithm, called a Hamming code.
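To give a feel for how a Hamming code differs from simple parity, here is a sketch of the textbook Hamming(7,4) code: four data bits are protected by three check bits, and a single flipped bit can be located and corrected. The choice of the (7,4) code and the function names are assumptions for illustration; the slides do not say which code or layout a RAID-2 array would use. In a RAID-2 style layout, each of the seven bits would sit on its own drive.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as the 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, 1-based position of the bad bit, or 0)."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # binary position of the flipped bit
    if syndrome:
        c[syndrome - 1] ^= 1
    return c, syndrome


def hamming74_data(codeword):
    """Extract the 4 data bits from a 7-bit codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


word = hamming74_encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1  # flip one bit, as if one drive returned bad data
repaired, bad_position = hamming74_correct(damaged)
```

Three check bits per four data bits is the space overhead the slides refer to: more than a single parity drive, less than a full mirror.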

Page 13: The concept of RAID in Databases By Junaid Ali Siddiqui.

Advantages

A Hamming code is larger than a parity value, so it takes up more disk space, but, with proper code design, it is capable of recovering from multiple drives being lost. Apart from RAID-1 with more than two mirrors, RAID-2 is the only simple RAID level covered here that can retain data when multiple drives fail.

Disadvantages

The primary problem with this RAID level is that the amount of CPU power required to generate the Hamming code is much higher than that required to generate parity.

In general, all data blocks in the stripe modified by the write must be read in and used to generate new Hamming code data. Also, on large writes, the CPU time needed to generate the Hamming code is much higher than that needed to generate parity, possibly slowing down even large writes.

Page 14: The concept of RAID in Databases By Junaid Ali Siddiqui.

RAID 3

RAID Level 3 is defined as bytewise (or bitwise) striping with parity. Every I/O to the array will access all drives in the array, regardless of the type of access (read/write) or the size of the I/O request.

During a write, RAID-3 stores a portion of each block on each data disk. It also computes the parity for the data and writes it to the parity drive.

In some implementations, when the data is read back in, the parity is also read and compared to a newly computed parity, to ensure that there were no errors.
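The parity used by RAID-3 (and by RAID-4 and RAID-5) is a bytewise XOR across the data drives. The sketch below splits one block bytewise over several data drives, computes the parity, and rebuilds the portion of a lost drive from the survivors. The helper names and the interleaving scheme are illustrative assumptions, not a description of any particular controller.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def stripe_bytewise(block: bytes, data_drives: int) -> list:
    """Give byte i of the block to drive i % data_drives."""
    if len(block) % data_drives:
        raise ValueError("block length must be a multiple of the drive count")
    return [block[i::data_drives] for i in range(data_drives)]


def parity_of(portions) -> bytes:
    """Parity is the XOR of the portions held by all data drives."""
    return reduce(xor_bytes, portions)


def rebuild(surviving_portions, parity: bytes) -> bytes:
    """XOR of the parity and every surviving portion yields the lost one."""
    return reduce(xor_bytes, surviving_portions, parity)


portions = stripe_bytewise(b"ABCDEFGH", 4)
parity = parity_of(portions)
lost = portions[2]  # pretend the third data drive failed
recovered = rebuild(portions[:2] + portions[3:], parity)
```

The read-time check mentioned above amounts to recomputing `parity_of(portions)` and comparing it with the stored parity.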

Page 15: The concept of RAID in Databases By Junaid Ali Siddiqui.

RAID 3

Page 16: The concept of RAID in Databases By Junaid Ali Siddiqui.

Advantages

RAID-3 provides a level of reliability similar to that of RAID-4 and RAID-5.

RAID-3 is also described as a striped set with dedicated parity, bit-interleaved parity, or byte-level parity. This mechanism provides improved performance and fault tolerance similar to RAID 5, but with a dedicated parity disk.

One minor benefit is that, with a dedicated parity disk, the parity drive can fail and operation will continue, without parity protection but with no performance penalty.

Disadvantages

RAID-3 has configuration limitations. The number of data drives in a RAID-3 configuration must be a power of two. The most common configurations have four or eight data drives.

Unfortunately, it is not possible to have multiple operations being performed on the array at the same time, because all drives are involved in every operation.

Page 17: The concept of RAID in Databases By Junaid Ali Siddiqui.

RAID 4

RAID Level 4 is defined as blockwise striping with parity. The parity is always written to the same disk drive. This can create a great deal of contention for the parity drive during write operations.

Page 18: The concept of RAID in Databases By Junaid Ali Siddiqui.

Advantages

For reads and large writes, RAID-4 performance will be similar to that of a RAID-0 array containing an equal number of data disks.

Error detection is achieved through dedicated parity, which is stored on a separate, single disk.

Disadvantages

For small writes, performance decreases considerably. To understand the cause, consider a one-block write:

1. A write request for one block is issued by a program.
2. The RAID software determines which disks contain the data and parity, and which blocks they are in.
3. The disk controller reads the data block from disk.
4. The disk controller reads the corresponding parity block from disk.
5. The data block just read is XORed with the parity block just read, which removes the old data's contribution from the parity.
6. The data block to be written is XORed with the resulting parity block, which produces the updated parity.
7. The data block and the updated parity block are both written to disk.

It can be seen from the above example that a one-block write results in two blocks being read from disk and two blocks being written to disk.
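The parity update in the one-block write can be sketched in a few lines. The function names are hypothetical; the point is that the new parity is derived from the old data, the old parity, and the new data alone, without reading the other data drives.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def small_write_parity(old_data: bytes, old_parity: bytes, new_data: bytes) -> bytes:
    """Read-modify-write parity update for a one-block write."""
    # XOR the old data with the old parity: cancels the old data's contribution.
    parity_without_old = xor_bytes(old_data, old_parity)
    # XOR the new data with the result: yields the updated parity.
    return xor_bytes(new_data, parity_without_old)


# Hypothetical stripe with three data blocks and one parity block.
d0, d1, d2 = b"\x11\x22", b"\x33\x44", b"\x55\x66"
parity = xor_bytes(xor_bytes(d0, d1), d2)

new_d1 = b"\xaa\xbb"
updated_parity = small_write_parity(d1, parity, new_d1)
```

The result matches the parity that would be obtained by recomputing it over the whole stripe, which is why two reads and two writes are enough.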

Page 19: The concept of RAID in Databases By Junaid Ali Siddiqui.

RAID 5

RAID Level 5 is defined as blockwise striping with parity. It differs from RAID-4 in that the parity data is not always written to the same disk drive.
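One way to picture distributed parity is to rotate the parity block to a different drive on each stripe. The rotation order below (starting on the last drive and moving one drive to the left per stripe) is an assumption chosen for illustration; real implementations use several different layouts, and the function names are hypothetical.

```python
def parity_drive(stripe: int, num_drives: int) -> int:
    """Index of the drive that holds parity for the given stripe."""
    # Start on the last drive and move one drive to the left per stripe.
    return (num_drives - 1 - stripe) % num_drives


def stripe_layout(stripe: int, num_drives: int) -> list:
    """'P' marks the parity drive, 'D' marks a data drive."""
    p = parity_drive(stripe, num_drives)
    return ["P" if drive == p else "D" for drive in range(num_drives)]


# Hypothetical 4-drive array: print the layout of the first four stripes.
for s in range(4):
    print(s, " ".join(stripe_layout(s, 4)))
```

Over any run of consecutive stripes equal to the number of drives, each drive holds parity exactly once, so parity writes are spread across the array instead of queuing on one drive.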

Page 20: The concept of RAID in Databases By Junaid Ali Siddiqui.

Advantages

RAID-5 has all the performance issues and benefits that RAID-4 has, except as follows:

Since there is no dedicated parity drive, there is no single point where contention will be created. This speeds up multiple small writes.

Multiple small reads are slightly faster. This is because data resides on all drives in the array, so it is possible to get all drives involved in the read operation.

Distributed parity requires all drives but one to be present to operate. A drive failure requires replacement, but the array is not destroyed by a single drive failure.

Disadvantages

The array will lose data in the event of a second drive failure, and it is vulnerable until the data that was on the failed drive has been rebuilt onto a replacement drive. A single drive failure in the set will result in reduced performance of the entire set until the failed drive has been replaced and rebuilt.

Page 21: The concept of RAID in Databases By Junaid Ali Siddiqui.

Compound RAID levels

There are times when more than one type of RAID must be combined in order to achieve the desired effect. In general, this consists of RAID-0 combined with another RAID level (RAID-1, RAID-3, and RAID-5 are often used with RAID-0).

The primary reason for combining multiple RAID architectures is to get either a very large or a very fast logical disk.

Page 22: The concept of RAID in Databases By Junaid Ali Siddiqui.

Any questions? [email protected]

Page 23: The concept of RAID in Databases By Junaid Ali Siddiqui.

Message from the presenter

We spend our days waiting for the ideal path to appear in front of us, but what we forget is that paths are made by walking, not waiting. So always keep yourself on the right path.

Thank you for your attention

References:
http://www.accs.com/p_and_p/RAID/Recovery.html
http://www.wikipedia.com
http://docs.sun.com/source/817-3711-10/ch01_basics.html#14202