Failure Correction Techniques for Large Disk Array


Failure Correction Techniques for Large Disk Array. Garth A. Gibson, Lisa Hellerstein et al., University of California at Berkeley. What is the problem? Disk arrays can increase I/O bandwidth and access parallelism, but the chance of data loss increases as the number of disks in an array grows.

Transcript of Failure Correction Techniques for Large Disk Array

  • Failure Correction Techniques for Large Disk Array

    Garth A. Gibson, Lisa Hellerstein et al., University of California at Berkeley

  • What is the problem?

    Disk arrays can increase I/O bandwidth and access parallelism.

    The chance of data loss increases as the number of disks in an array grows.

    Figure 1. The mean time to data loss (MTTDL) in a single-erasure-correcting array.

  • Types of data failure

    Transient or noise-related errors: corrected by repeating the offending operation or by applying per-sector error-correction facilities.

    Media defects: detected and masked at the factory.

    Catastrophic failures: head crashes, or failures of the read/write or controller electronics.

  • The goal of this paper

    Avoid loss of user data.

    Recover from catastrophic disk failures.

    Make disk arrays as reliable as an individual disk.

  • Concept 1 -- erasure-correcting codes and error-correcting codes

    Erasure-correcting codes are designed to recover erased bits in a message word. An unreadable bit is called an erasure, and the positions of the erased bits are known.

    For a catastrophic disk failure, the bits on the failed disk can be designated as unreadable, that is, as erasures.

    Error-correcting codes are designed to correct messages in which some of the bits may have been flipped, but the positions of those bits are unknown.

  • Concept 2 -- Redundancy metrics

    Each disk is treated as a stack of bits: the i-th bit of every disk together forms the i-th codeword of the redundancy encoding.

    Mean time to data loss (MTTDL): the measure of reliability.

    Check disk overhead: the ratio of check disks to data disks.

    Update penalty: the number of check disks that must be updated when a data disk is updated.

    Group size: the information and check disks that must be accessed during the reconstruction of a failed disk form a group.

  • 1d-parity

    Single-erasure-correction scheme. For G data disks, one check disk stores the parity of all G disks.

    Example: G = 4. Check disk overhead: 1/G. Update penalty: 1. Group size: G + 1.
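The 1d-parity scheme can be illustrated with a minimal Python sketch. It assumes each disk is modeled as a short `bytes` block and that parity is bitwise XOR; the names `parity` and the sample data are hypothetical and are not taken from the slides.

```python
from functools import reduce

def parity(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# G = 4 data disks, each modeled as a 2-byte block (illustrative values).
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01", b"\x10\x20"]
check = parity(data)  # contents of the single check disk

# Simulate the loss of data disk 2. Its position is known, so it is an erasure.
lost = 2
survivors = [block for i, block in enumerate(data) if i != lost] + [check]

# XOR of the G surviving disks in the group (G - 1 data disks plus the check disk)
# gives back the lost block.
recovered = parity(survivors)
assert recovered == data[lost]
```

Reconstruction reads the other G disks of the group, which is why the group size is G + 1.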

  • 2d-parity

    Double-erasure-correction scheme. $G^2$ data disks are arranged in a 2-dimensional array. For each row and each column, one check disk stores the parity for that row or column.

    Example: G = 4. Check disk overhead: $2G/G^2 = 2/G$. Update penalty: 2. Group size: G + 1.

  • N-dimensional parity (Nd-parity)

    N-erasure-correction scheme. Check disk overhead: $N G^{N-1} / G^N = N/G$. Update penalty: N. Group size: G + 1.
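A small Python sketch of 2d-parity reconstruction follows. It makes simplifying assumptions that are not in the slides: each disk holds a single integer, only data disks fail (the check disks stay readable), and repair proceeds by repeatedly finding a row or column with exactly one missing disk. The function names are hypothetical.

```python
def xor_all(values):
    """XOR a sequence of integers together."""
    acc = 0
    for v in values:
        acc ^= v
    return acc

def make_checks(data):
    """Return (row_checks, col_checks) for a G x G grid of data disks."""
    g = len(data)
    row_checks = [xor_all(data[r]) for r in range(g)]
    col_checks = [xor_all(data[r][c] for r in range(g)) for c in range(g)]
    return row_checks, col_checks

def reconstruct(data, row_checks, col_checks):
    """Repair a grid in which failed data disks are marked None.

    Entries that cannot be repaired this way are left as None.
    """
    g = len(data)
    grid = [row[:] for row in data]
    progress = True
    while progress:
        progress = False
        for r in range(g):
            missing = [c for c in range(g) if grid[r][c] is None]
            if len(missing) == 1:
                known = xor_all(v for v in grid[r] if v is not None)
                grid[r][missing[0]] = row_checks[r] ^ known
                progress = True
        for c in range(g):
            missing = [r for r in range(g) if grid[r][c] is None]
            if len(missing) == 1:
                known = xor_all(grid[r][c] for r in range(g) if r != missing[0])
                grid[missing[0]][c] = col_checks[c] ^ known
                progress = True
    return grid

# G = 4: two data disks fail in the same row; each is repaired through its column.
data = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
row_checks, col_checks = make_checks(data)
damaged = [row[:] for row in data]
damaged[1][1] = None
damaged[1][3] = None
assert reconstruct(damaged, row_checks, col_checks) == data
```

Because every data disk belongs to one row group and one column group, two failures never block both groups of the same disk, which is what makes the scheme double-erasure-correcting.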
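The three metrics for Nd-parity can be tabulated with a short Python sketch. The helper name `nd_parity_metrics` is hypothetical; the formulas are the ones stated on the slides (N·G^(N-1) check disks for G^N data disks).

```python
from fractions import Fraction

def nd_parity_metrics(n, g):
    """Metrics for N-dimensional parity with G data disks along each dimension."""
    data_disks = g ** n
    check_disks = n * g ** (n - 1)
    return {
        "data_disks": data_disks,
        "check_disks": check_disks,
        "overhead": Fraction(check_disks, data_disks),  # simplifies to N/G
        "update_penalty": n,   # one check disk per dimension is rewritten
        "group_size": g + 1,   # G data disks plus one check disk
    }

for n in (1, 2, 3):
    print(n, nd_parity_metrics(n, 4))
```

For a fixed G, each extra dimension buys one more correctable erasure at the cost of a higher overhead and a higher update penalty.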

  • Linear codes

    Linear codes contain the original information unmodified within each codeword and compute the check bits of each codeword as the parity of subsets of the information bits.

    Codeword = 1 1 1 1 Parity

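  • Parity check matrix H = [P | I] (Fig. 4)

    How is a check (parity) bit computed? Every codeword X satisfies H·X = 0, with arithmetic modulo 2.

    First row of H = [1 0 0 1 0 1 | 1 0 0], where the first six entries belong to P and the last three to I.

    X = [1 1 1 0 1 0 x1 x2 x3]

    H·X = 1 + 0 + 0 + 0 + 0 + 0 + x1 + 0 + 0 = 0 (mod 2), so x1 = 1.

  • Parity check matrix for 1d-parity and 2d-parity (Fig. 5)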
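The check-bit computation can be sketched in Python. Since H = [P | I] and H·X = 0 modulo 2, the check bits equal P times the information bits modulo 2. Only the first row of P below comes from the slide; the second and third rows are hypothetical values added so that the example has three check bits.

```python
def check_bits(P, info):
    """Check bits c = P * info (mod 2) for a parity check matrix H = [P | I]."""
    return [sum(p * d for p, d in zip(row, info)) % 2 for row in P]

def syndrome(P, codeword):
    """H * X (mod 2) for H = [P | I]; all zeros means X is a valid codeword."""
    k = len(P[0])
    info, checks = codeword[:k], codeword[k:]
    return [(sum(p * d for p, d in zip(row, info)) + c) % 2
            for row, c in zip(P, checks)]

P = [
    [1, 0, 0, 1, 0, 1],  # first row of P, as on the slide
    [0, 1, 0, 1, 1, 0],  # hypothetical row, for illustration only
    [0, 0, 1, 0, 1, 1],  # hypothetical row, for illustration only
]
info = [1, 1, 1, 0, 1, 0]          # information bits of X from the slide

checks = check_bits(P, info)       # [x1, x2, x3]
assert checks[0] == 1              # matches x1 = 1 on the slide
assert syndrome(P, info + checks) == [0, 0, 0]
```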

  • Properties of the parity check matrix

    The properties are expressed in terms of a parameter t, whose value is between 0 and c (the number of check bits). The following statements are equivalent ways of saying that H achieves t:

    H allows any t erasures to be corrected.

    H allows any t errors to be detected.

    The minimum number of bits in which any two codewords differ, known as the distance of the code, is at least t + 1.

    Any set of t columns selected from H is linearly independent.
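The column-independence property can be checked by brute force, as in the Python sketch below. The matrix used is an assumption: a 2d-parity array with G = 2, written out from the row and column check disk description rather than copied from Fig. 5. Columns are stored as integer bitmasks and rank is computed over GF(2).

```python
from itertools import combinations

def columns_of(H):
    """Turn each column of H (a list of 0/1 rows) into an integer bitmask."""
    return [int("".join(str(row[j]) for row in H), 2) for j in range(len(H[0]))]

def gf2_rank(vectors):
    """Rank over GF(2) of vectors given as integer bitmasks."""
    basis = []  # reduced vectors, each with a distinct leading bit
    for v in vectors:
        for b in basis:
            v = min(v, v ^ b)  # clears b's leading bit from v if it is set
        if v:
            basis.append(v)
    return len(basis)

def corrects_any_t_erasures(H, t):
    """True if every set of t columns of H is linearly independent over GF(2)."""
    return all(gf2_rank(subset) == t for subset in combinations(columns_of(H), t))

# Assumed layout: data disks d00 d01 d10 d11, then check disks r0 r1 c0 c1.
H_2d = [
    [1, 1, 0, 0, 1, 0, 0, 0],  # row 0 group
    [0, 0, 1, 1, 0, 1, 0, 0],  # row 1 group
    [1, 0, 1, 0, 0, 0, 1, 0],  # column 0 group
    [0, 1, 0, 1, 0, 0, 0, 1],  # column 1 group
]
H_1d = [[1, 1, 1, 1, 1]]       # 1d-parity with G = 4: one group

assert corrects_any_t_erasures(H_1d, 1) and not corrects_any_t_erasures(H_1d, 2)
assert corrects_any_t_erasures(H_2d, 2) and not corrects_any_t_erasures(H_2d, 3)
```

The exhaustive search grows quickly with the number of disks, so this is only practical for small illustrative matrices.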

  • Implementing reconstruction

    Fig. 6(a). When 4 disks fail in a 2d-parity array with 16 information disks, the controllers allow us to identify which disks need to be repaired and reconstructed.

  • Implementing reconstruction, cont.

    Fig. 6(b). Apply elementary row operations (the essence of Gaussian elimination) to find a matrix M such that the product MB has the 4×4 identity matrix in its first four rows.

  • Elementary operations

    Interchanging two equations gives a system equivalent to the original one.

    Multiplying an equation by a nonzero number gives an equivalent system.

    Replacing one equation with the sum of two equations gives an equivalent system.

    Example:

    x + y + z = 0 (1)
    x - 2y + 2z = 4 (2)
    x + 2y - z = 2 (3)

    (3) - (1) replaces (3):

    x + y + z = 0 (1)
    x - 2y + 2z = 4 (2)
    y - 2z = 2 (4)

    (2) - (1) replaces (2):

    x + y + z = 0 (1)
    -3y + z = 4 (5)
    y - 2z = 2 (4)

    (5) + 3·(4) replaces (4):

    x + y + z = 0 (1)
    -3y + z = 4 (5)
    -5z = 10 (6)

    Result: x = 4, y = -2, z = -2

  • Gaussian elimination

    Definition: apply elementary operations so that at every step the new matrix is exactly the augmented matrix of the new, equivalent system. Once a triangular matrix is obtained, write out the associated linear system and solve it.

    Example:

    x + y + z = 0 (1)
    x - 2y + 2z = 4 (2)
    x + 2y - z = 2 (3)

    Augmented matrix:

    1  1  1 |  0
    1 -2  2 |  4
    1  2 -1 |  2

    (3) - (1) replaces (3):

    1  1  1 |  0
    1 -2  2 |  4
    0  1 -2 |  2

    (2) - (1) replaces (2):

    1  1  1 |  0
    0 -3  1 |  4
    0  1 -2 |  2

    (5) + 3·(4) replaces (4):

    1  1  1 |  0
    0 -3  1 |  4
    0  0 -5 | 10

    The resulting linear system:

    x + y + z = 0
    -3y + z = 4
    -5z = 10
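The elimination steps can be reproduced with a short Python sketch. It assumes a square system with a unique solution and uses exact rational arithmetic from the standard `fractions` module; the function name `solve` is hypothetical.

```python
from fractions import Fraction

def solve(augmented):
    """Solve a square linear system given as an augmented matrix [A | b]."""
    m = [[Fraction(v) for v in row] for row in augmented]
    n = len(m)
    # Forward elimination: reduce the matrix to upper triangular form.
    for col in range(n):
        # Assumes a nonzero pivot exists (unique solution); swap it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    # Back substitution on the triangular system.
    x = [Fraction(0)] * n
    for r in reversed(range(n)):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

x, y, z = solve([[1, 1, 1, 0],
                 [1, -2, 2, 4],
                 [1, 2, -1, 2]])
assert (x, y, z) == (4, -2, -2)
```

The row operations chosen by the code differ slightly in order and scaling from the hand-worked steps, but they are the same kind of elementary operations and give the same solution.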

  • Implementing reconstruction, cont.

    Fig. 6(c). The first 4 rows of MA describe the operations that must be performed to reconstruct the 4 failed disks.

  • Where t-erasure-correcting codes are implemented

    They are implemented in software.

    The software runs in an I/O processor.

    The software learns of failures directly from the disk controllers.

  • Conclusion

    Implement redundancy codes for disk arrays.

    Minimize the number of check disks that must be updated whenever an information disk is updated.

    Improve the reliability of disk arrays.

  • Questions

    What is a codeword in a redundant disk array?

    List three redundancy metrics.

    What are the 1d-parity and 2d-parity schemes?

    What mathematical operation is used to recover a failed disk?

    An erasure is an unreadable bit with a known position. A catastrophic disk failure is one kind of erasure.

    The frame shows 1d-parity. G is the number of data disks in one group. There is one check disk storing the parity of the G data disks.

    The figure shows 2d-parity. There are $G^2$ data disks arranged in a two-dimensional array. One check disk stores the parity for each row or column. In 2d-parity, each data disk contributes to two groups.

    1d-parity and 2d-parity can be extended to N-dimensional parity.

    Linear codes consist of the information disk bits and the check disk bits, which are computed as the parity of information disk bits. A linear code can be defined by a parity check matrix. c is the number of check bits and k is the number of information bits. The vector x represents a codeword.

    Every column of the matrix represents the corresponding disk. Each column contributes to the groups of the rows in which it has a 1. For example, disk 5 contributes to this group in 1d-parity,

    The shaded columns represent the failed disks. Matrix H can be split into two parts, A and B: A holds the columns of the good information and check disks, and B holds the columns of the failed disks. Vector x can likewise be divided into two parts, d and y: d denotes the good disks and y denotes the failed ones.
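The split of H into A and B can be turned into a small Python sketch. Since H·x = 0 modulo 2, the failed bits y satisfy B·y = A·d (mod 2), and Gaussian elimination over GF(2) solves for y. This is an illustrative reading of the notes, not the authors' implementation: the function name, the use of `None` to mark failed positions, and the G = 2 2d-parity matrix are all assumptions.

```python
def reconstruct(H, word):
    """Fill in the erased bits (None) of one codeword using parity check matrix H."""
    failed = [j for j, bit in enumerate(word) if bit is None]
    t = len(failed)
    # Build the augmented matrix [B | s], where s = A * d (mod 2).
    rows = []
    for h in H:
        s = sum(h[j] * word[j] for j in range(len(word)) if word[j] is not None) % 2
        rows.append([h[j] for j in failed] + [s])
    # Gauss-Jordan elimination over GF(2): addition and subtraction are both XOR.
    for col in range(t):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            raise ValueError("erasure pattern is not correctable by H")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    repaired = list(word)
    for i, j in enumerate(failed):
        repaired[j] = rows[i][t]
    return repaired

# Assumed layout: data disks d00 d01 d10 d11, then check disks r0 r1 c0 c1.
H = [
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0, 0, 1],
]
codeword = [1, 0, 1, 1, 1, 0, 0, 1]
damaged = [None, None, 1, 1, 1, 0, 0, 1]   # d00 and d01 have failed
assert reconstruct(H, damaged) == codeword
```

In a disk array the same elimination is done once per failure pattern, and the resulting combination of good disks is then applied to every bit position.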