HEINZ NIXDORF INSTITUTE, University of Paderborn
Algorithms and Complexity
Erasure Codes for Reading and Writing
Mario Vodisek (joint work with AG Schindelhauer)
Agenda
• Erasure (Resilient) Codes in storage networks
• The Read-Write-Coding-System
  - A Lower Bound and Perfect Codes
  - Requirements and Techniques
Erasure (Resilient) Coding
• n-symbol message $x$ with symbols from alphabet $\Sigma$
• m-symbol encoding $y$ with symbols from $\Sigma$ ($m > n$)
• erasure coding provides a mapping $\Sigma^n \to \Sigma^m$ such that
  - reading any $r$ symbols of $y$, with $n \le r < m$, is sufficient for recovery
  - mostly $r = n$ ($\Rightarrow$ optimal for reading)
• advantages:
  - $m - r$ erasures can be tolerated
  - storage overhead is a factor of $m/n$
• Generally, erasure codes are used to guarantee information recovery for data transmission over unreliable channels (RS, Turbo, LT codes, …)
• Lots of research on code properties such as
  - scalability
  - encoding/decoding speed-up
  - ratelessness
• Also attractive for storage networks: downloads (P2P) and fault tolerance
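As a minimal sketch of the simplest case, assume a single XOR parity symbol, so that m = n + 1 and r = n. This particular code and the helper names `encode` and `recover` are illustrative assumptions, not taken from the slides. Any n of the m symbols then suffice to recover the message:

```python
from functools import reduce
from operator import xor

def encode(x):
    """Append one XOR parity symbol: n symbols -> m = n + 1 symbols."""
    return list(x) + [reduce(xor, x, 0)]

def recover(y_known):
    """Recover x from a codeword in which at most one symbol is erased (None)."""
    missing = [i for i, s in enumerate(y_known) if s is None]
    if len(missing) > 1:
        raise ValueError("this code tolerates only m - r = 1 erasure")
    y = list(y_known)
    if missing:
        # The XOR of all surviving symbols equals the erased symbol.
        y[missing[0]] = reduce(xor, (s for s in y if s is not None), 0)
    return y[:-1]  # drop the parity symbol

x = [5, 9, 12]
y = encode(x)                       # n = 3, m = 4, overhead m/n = 4/3
damaged = [y[0], None, y[2], y[3]]  # one erasure
print(recover(damaged))             # [5, 9, 12]
```

Here one erasure is tolerated at a storage overhead of 4/3; codes with larger m - r tolerate more erasures at a higher overhead.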
Erasure Codes for Storage (Area) Networks
SANs require
• high system availability
  - disks may fail or be blocked (probability $\leftrightarrow$ size)
• efficient modification handling
  - slow devices $\Rightarrow$ expensive I/O operations
Properties:
• a fixed set $E$ of existing errors can be considered at encoding time
• $E$ may have changed to $E'$ by decoding time
Additional requirements for erasure codes:
• tolerate a certain number of erasures
• allow modification of the codeword even if erasures occur
• consider $E$ at encoding time and $E'$ at decoding time
The Read-Write-Coding-System
An $(n, r, w, m)_b$-Read-Write-Coding System (RWC) is defined as follows:
• The base $b$: a $b$-symbol alphabet $\Sigma_b$, the set of all used items
• $n \ge 1$ blocks of information $x_1, \dots, x_n \in \Sigma_b$
• $m \ge n$ code blocks $y_1, \dots, y_m \in \Sigma_b$
• any $r$ code blocks, $n \le r \le m$, are sufficient to read the information
• any $w$ code blocks, $n \le w \le m$, are sufficient to change the information by $\delta_1, \dots, \delta_n$
In the language of coding theory: given $m, n, r, w$, our RW-Codes provide
• a (linear) code of dimension $n$ and block length $m$ such that for $n \le r, w \le m$:
  - the minimum distance of the code is at least $m - r + 1$
  - any two codewords $y_1, y_2$ are within a distance of at most $w$ from one another
  - $\mathrm{distance}(x, y) := |\{1 \le i \le m : x_i \ne y_i\}|$
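The distance used in the coding-theory formulation counts the positions in which two vectors differ. A minimal sketch follows, with the helper `distance_range` and the small binary parity code chosen here purely for illustration:

```python
from itertools import combinations

def distance(x, y):
    """Number of positions i in which x_i and y_i differ."""
    if len(x) != len(y):
        raise ValueError("vectors must have equal length")
    return sum(1 for xi, yi in zip(x, y) if xi != yi)

def distance_range(codewords):
    """Smallest and largest pairwise distance over a set of codewords."""
    dists = [distance(a, b) for a, b in combinations(codewords, 2)]
    return min(dists), max(dists)

# Binary single-parity code with n = 2, m = 3 (third bit = XOR of the first two).
code = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
print(distance_range(code))  # (2, 2)
```

For this code the minimum distance 2 equals m - r + 1 with r = n = 2.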
A Lower Bound for RW-Codes
Theorem: For $r + w < n + m$ and any base $b$ there does not exist any $(n, r, w, m)_b$-RWC system!
• We know: $n \le r, w \le m$
• Assume: $r = w = n$ and $m \ge n + 1$
Proof:
• Consider a write and a subsequent read
• Index sets $(W, R)$ with $|W| = w$ and $|R| = r$
• $|S| = |W \cap R| \in \{n, n-1\}$
• Assume $|S| = n$ $\Rightarrow$ there are $b^n$ possible change vectors to be encoded by `write` into $S$; $S$ is the only basis for reading with $r = n$ (notice: the code words in $R \setminus S$ remain unchanged)
• Assume $|S| < n$, i.e. $|S| = n - 1$ $\Rightarrow$ at most $b^{n-1}$ possible change vectors for $S$ can be encoded by `write` $\Rightarrow$ `read` will produce faulty output
Codes at Lower Bound: Perfect Codes
• In the best case, $(n, r, w, m)_b$-RWC have parameters $r + w = n + m$ (perfect codes)
• Unfortunately, perfect RWC do not always exist!
  - E.g. there is no $(1, 2, 2, 3)_2$-RWC, but there exists a $(1, 2, 2, 3)_3$-RWC!
• But: all perfect RW-Codes exist if the alphabet is sufficiently large!
Note on RAID:
• The definition of parity RAID (RAID 4/5) corresponds to an $(n, n, n+1, n+1)_2$-RWC
• From the lower bound it follows: there is no $(n, n, n, n+1)_2$-RWC
  $\Rightarrow$ there is no RAID system with improved access properties!
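The parameter condition can be checked mechanically. The sketch below only compares r + w with n + m; the function name `classify` and its labels are assumptions made for illustration, and the check says nothing about whether a code actually exists for a given base b:

```python
def classify(n, r, w, m):
    """Compare (n, r, w, m) against the lower bound r + w >= n + m."""
    if not (1 <= n <= r <= m and n <= w <= m):
        raise ValueError("parameters must satisfy 1 <= n <= r, w <= m")
    slack = (r + w) - (n + m)
    if slack < 0:
        return "below bound (no RWC for any base)"
    if slack == 0:
        return "at bound (perfect parameters)"
    return "above bound"

n = 4
print(classify(n, n, n + 1, n + 1))  # parity RAID: at bound (perfect parameters)
print(classify(n, n, n, n + 1))      # below bound (no RWC for any base)
print(classify(1, 2, 2, 3))          # at bound (perfect parameters)
```

The last case shows why the bound alone is not enough: the parameters (1, 2, 2, 3) sit exactly at the bound, yet a code exists for base 3 and not for base 2.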
The Model: Operations
Given:
• $X = x_1, \dots, x_n$: the n-symbol information vector over a finite alphabet $\Sigma$
• $Y = y_1, \dots, y_m$: the m-symbol code over $\Sigma$
• $b = |\Sigma|$
• $P(M)$: the power set of $M$; $P_k(M) := \{S \in P(M) : |S| = k\}$
• Define $[m] := \{1, \dots, m\}$
An $(n, r, w, m)_b$-RWC system consists of the following operations:
• Initial state: $X_0 \in \Sigma^n$, $Y_0 \in \Sigma^m$
• Read function: $f: P_r([m]) \times \Sigma^r \to \Sigma^n$
• Write function: $g: P_r([m]) \times \Sigma^r \times P_w([m]) \times \Sigma^n \to \Sigma^w$
• Differential write function: $\delta: P_w([m]) \times \Sigma^n \to \Sigma^w$
Initialization: Compute the Encoding $Y_0$
Given (in general):
• the information vector $X = x_1, \dots, x_n \in \Sigma_b$
• the encoded vector $Y = y_1, \dots, y_m \in \Sigma_b$
• internal variables $V = v_1, \dots, v_k$ for $k = m - w = r - n$, carrying no particular information
• a set of functions $M = M_1, \dots, M_m$ for encoding
Compute $y_i$ from $X$ and $V$ by function $M_i$; define $M_i$ as a linear combination of $X$ and $V$:
$$y_i = M_i(x_1, \dots, x_n, v_1, \dots, v_k) = \sum_{j=1}^{n} x_j M_{i,j} + \sum_{l=1}^{k} v_l M_{i,n+l}$$
(Define $M$ as some $m \times r$ matrix with the $M_i$ as rows. It follows: $M (X \mid V)^T = Y$)
• RW-Codes are closely related to Reed-Solomon codes!
The Matrix Approach: $(n, r, w, m)_b$-RWC
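Consider:
• the information vector $X = x_1, \dots, x_n \in \Sigma_b$
• the encoded vector $Y = y_1, \dots, y_m \in \Sigma_b$
• internal slack variables $V = v_1, \dots, v_k$ for $k = m - w = r - n$
Further:
• an $m \times r$ generator matrix $M$ with $M_{i,j} \in \Sigma_b$
• the submatrix $(M_{i,j})_{i \in [m],\, j \in \{n+1, \dots, r\}}$ is called the variable matrix
$$M \cdot (x_1, \dots, x_n, v_1, \dots, v_k)^T = (y_1, \dots, y_m)^T$$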
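The matrix form of the encoding can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the prime field F[257], the function name `encode`, the parameters (n, r, w, m) = (2, 4, 3, 5), and the generator matrix with rows (1, a, a², a³) for a = 1, …, 5:

```python
P = 257  # assumed prime, so the integers modulo P form the field F[P]

def encode(M, x, v):
    """Return Y = M (X | V)^T over F[P]; M is a list of m rows of length r."""
    xv = list(x) + list(v)
    if any(len(row) != len(xv) for row in M):
        raise ValueError("every row of M needs r = n + k coefficients")
    return [sum(c * s for c, s in zip(row, xv)) % P for row in M]

# Hypothetical (n, r, w, m) = (2, 4, 3, 5) example, so k = m - w = r - n = 2.
M = [[1, a, a * a % P, a ** 3 % P] for a in range(1, 6)]
X = [10, 20]   # information vector
V = [7, 3]     # slack variables, carrying no information
print(encode(M, X, V))  # [40, 102, 214, 137, 146]
```

Each code symbol y_i depends only on row i of M, so each storage device can hold one row and one symbol.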
Efficient Encoding: $\Sigma_b = F[b]$ (Finite Fields)
• RWC requires efficient arithmetic on elements of $\Sigma_b$ for encoding
  $\Rightarrow$ set $\Sigma_b = F[b]$ (finite field with $b$ elements, formerly $GF(b)$)
• $b = p^n$ for some prime number $p$ and integer $n$ $\Rightarrow$ $F[p^n]$ always exists
• Computation on binary words of length $v$: $b = 2^v$, $F[2^v] = \{0, \dots, 2^v - 1\}$
Features:
• $F[b]$ is closed under addition and multiplication
  $\Rightarrow$ exact computation on field elements $\Rightarrow$ no more than $v$ bits to represent results
• Addition and subtraction via XOR (avoids rounding, no carry-over)
• Multiplication and division via mapping tables (analogous to logarithm tables for real numbers)
  - $T$: table mapping an integer to its logarithm in $F[2^v]$
  - $IT$: table mapping an integer to its inverse logarithm in $F[2^v]$
  $\Rightarrow$ multiplication and division by
  - adding/subtracting the logs
  - taking the inverse log
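The table-based arithmetic can be sketched for v = 8. The slides do not fix a field representation, so the primitive polynomial x⁸ + x⁴ + x³ + x² + 1 (0x11D) with generator 2 is an assumption made here, as are the function names:

```python
V = 8
PRIM_POLY = 0x11D        # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1
ORDER = (1 << V) - 1     # number of nonzero field elements (255)

T = [0] * (1 << V)       # T[a]  = logarithm of a (T[0] is unused)
IT = [0] * ORDER         # IT[e] = inverse logarithm, i.e. 2^e in F[2^v]

value = 1
for exponent in range(ORDER):
    IT[exponent] = value
    T[value] = exponent
    value <<= 1                  # multiply by the generator 2
    if value & (1 << V):         # reduce modulo the primitive polynomial
        value ^= PRIM_POLY

def gf_add(a, b):
    """Addition and subtraction coincide: bitwise XOR."""
    return a ^ b

def gf_mul(a, b):
    """Multiply by adding the logs and taking the inverse log."""
    if a == 0 or b == 0:
        return 0
    return IT[(T[a] + T[b]) % ORDER]

def gf_div(a, b):
    """Divide by subtracting the logs and taking the inverse log."""
    if b == 0:
        raise ZeroDivisionError("division by zero in F[2^v]")
    if a == 0:
        return 0
    return IT[(T[a] - T[b]) % ORDER]

print(gf_mul(3, 7))              # 9, i.e. (x + 1)(x^2 + x + 1) = x^3 + 1
print(gf_div(gf_mul(3, 7), 7))   # 3
```

Zero has no logarithm, which is why both functions treat it as a special case before touching the tables.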
The Vandermonde Matrix
Consider $M$ as an $m \times r$ Vandermonde matrix with $M_{i,j} = j^{i-1}$:
• $X, Y, V$ are vectors over $F[b]$
• $M_{i,j} \in F[b]$ and all elements are different
• The Vandermonde matrix is non-singular $\Rightarrow$ invertible
• Any $k' \times k'$ submatrix $M'$ is also invertible
Consider: each device $i$ in the SAN corresponds to a row of $M$ and to the element $y_i$
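The independence properties the scheme relies on can be checked by brute force for small parameters. The sketch below makes several assumptions that are not in the slides: it works over the prime field F[257], and it builds each row as (1, a, a², …) for the distinct nonzero elements a = 1, …, m, an index convention chosen here so that any r rows are independent. The helper names are hypothetical.

```python
from itertools import combinations

P = 257  # assumed prime field size

def vandermonde(m, r):
    """m x r matrix whose row for element a = 1..m is (a^0, a^1, ..., a^(r-1))."""
    return [[pow(a, j, P) for j in range(r)] for a in range(1, m + 1)]

def det_mod_p(A):
    """Determinant over F[P] by Gaussian elimination."""
    A = [row[:] for row in A]
    size = len(A)
    det = 1
    for col in range(size):
        pivot = next((i for i in range(col, size) if A[i][col] % P), None)
        if pivot is None:
            return 0                      # singular matrix
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det                    # a row swap flips the sign
        det = det * A[col][col] % P
        inv = pow(A[col][col], -1, P)     # modular inverse (Python 3.8+)
        for i in range(col + 1, size):
            factor = A[i][col] * inv % P
            for j in range(col, size):
                A[i][j] = (A[i][j] - factor * A[col][j]) % P
    return det % P

n, r, m = 2, 4, 5
k = r - n
M = vandermonde(m, r)

# Reading: every choice of r rows must give an invertible r x r matrix.
all_r_rows_independent = all(
    det_mod_p([M[i] for i in rows]) != 0 for rows in combinations(range(m), r))

# Differential write: every k rows of the variable matrix (columns n+1..r).
all_k_variable_rows_independent = all(
    det_mod_p([M[i][n:] for i in rows]) != 0 for rows in combinations(range(m), k))

print(all_r_rows_independent, all_k_variable_rows_independent)  # True True
```

The exhaustive check is only practical for small m; it illustrates the properties and is not a proof for general parameters.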
Reading (or Recovery)
Read: Given any $r$ code entries from $Y$, compute $X$
• Rearrange the rows of $M$ and $Y$ such that the first $r$ entries of $Y$ are the available ones
  - (any $r$ rows of $M$ are linearly independent in a Vandermonde matrix)
• $M \to M'$ and $Y \to Y'$
• The first $r$ rows of $M'$ form an invertible $r \times r$ matrix $M''$; let $Y''$ denote the first $r$ entries of $Y'$
• $X$ is computed by: $(X \mid V)^T = (M'')^{-1} Y''$
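Reading amounts to solving an r × r linear system over the field. The sketch below assumes the prime field F[257], a generator matrix with rows (1, a, a², a³) for a = 1, …, 5, and hypothetical helper names; it solves the system directly instead of forming the inverse of M'' explicitly:

```python
P = 257  # assumed prime field size

def solve_mod_p(A, b):
    """Solve A z = b over F[P] by Gauss-Jordan elimination (A must be invertible)."""
    size = len(A)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(size):
        pivot = next(i for i in range(col, size) if aug[i][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)          # modular inverse (Python 3.8+)
        aug[col] = [e * inv % P for e in aug[col]]
        for i in range(size):
            if i != col and aug[i][col]:
                f = aug[i][col]
                aug[i] = [(e - f * p) % P for e, p in zip(aug[i], aug[col])]
    return [row[size] for row in aug]

def read(M, available, n):
    """Recover X from r available code entries given as {row index: y value}."""
    r = len(M[0])
    if len(available) != r:
        raise ValueError("exactly r code entries are required")
    rows = sorted(available)
    m2 = [M[i] for i in rows]          # the r x r matrix M''
    y2 = [available[i] for i in rows]  # the matching entries of Y
    return solve_mod_p(m2, y2)[:n]     # (X | V) with the slack part dropped

M = [[pow(a, j, P) for j in range(4)] for a in range(1, 6)]
# Y = [40, 102, 214, 137, 146] encodes X = [10, 20] with slack V = [7, 3];
# the entry with index 1 is treated as unavailable.
print(read(M, {0: 40, 2: 214, 3: 137, 4: 146}, n=2))  # [10, 20]
```

Any other choice of four available entries returns the same X, since every four rows of this M are independent.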
Differential Write
Given:
- The change vector $\delta = \delta_1, \dots, \delta_n$ and $w$ code entries from $Y$
- $X' = X + \delta$ is the new information vector $\Rightarrow$ change $X$ without reading entries (XOR)
- Compute the difference for the $w$ code entries of $Y$
Further:
- Only choices $w < r$ make sense
- Rearrange the $m \times r$ matrix $M$ and $Y$ such that the $w$ entries to be written come first: $y_1, \dots, y_w$ (denote the results $M'$ and $Y'$)
- $k = r - n$ (slack vector $V$)
Differential Write (cont'd)
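Define the following sub-matrices:
- $M^{\leftarrow\uparrow} = (M'_{i,j})_{i \in [w],\, j \in [n]}$
- $M^{\uparrow\rightarrow} = (M'_{i,j})_{i \in [w],\, j \in \{n+1, \dots, r\}}$
- $M^{\leftarrow\downarrow} = (M'_{i,j})_{i \in \{w+1, \dots, m\},\, j \in [n]}$
- $M^{\downarrow\rightarrow} = (M'_{i,j})_{i \in \{w+1, \dots, m\},\, j \in \{n+1, \dots, r\}}$
$$M' = \begin{pmatrix} M^{\leftarrow\uparrow} & M^{\uparrow\rightarrow} \\ M^{\leftarrow\downarrow} & M^{\downarrow\rightarrow} \end{pmatrix}$$
(rows $1, \dots, w$ on top and $w+1, \dots, m$ below; columns $1, \dots, n$ on the left and $n+1, \dots, r$ on the right)
• $M^{\downarrow\rightarrow}$ is a $k \times k = (m-w) \times (r-n)$ matrix $\Rightarrow$ $M^{\downarrow\rightarrow}$ is invertible
• For a change vector $\delta = \delta_1, \dots, \delta_n$, the vector $Y$ can then be updated by a vector $\gamma = \gamma_1, \dots, \gamma_w$:
$$\gamma = \left( M^{\leftarrow\uparrow} - M^{\uparrow\rightarrow} (M^{\downarrow\rightarrow})^{-1} M^{\leftarrow\downarrow} \right) \cdot \delta$$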
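The update can be sketched as follows: first solve for the slack change that keeps the untouched rows fixed, then apply the written rows of the matrix to the combined change. The prime field F[257], the generator matrix with rows (1, a, a², a³), the parameters (n, r, w, m) = (2, 4, 3, 5), and all function names are assumptions made for illustration:

```python
P = 257  # assumed prime field size

def solve_mod_p(A, b):
    """Solve A z = b over F[P] by Gauss-Jordan elimination (A must be invertible)."""
    size = len(A)
    aug = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(size):
        pivot = next(i for i in range(col, size) if aug[i][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)
        aug[col] = [e * inv % P for e in aug[col]]
        for i in range(size):
            if i != col and aug[i][col]:
                f = aug[i][col]
                aug[i] = [(e - f * p) % P for e, p in zip(aug[i], aug[col])]
    return [row[size] for row in aug]

def mat_vec(A, x):
    """Matrix-vector product over F[P]."""
    return [sum(a * s for a, s in zip(row, x)) % P for row in A]

def differential_write(M, W, n, delta):
    """Return ({row index: change of y_i for i in W}, change of the slack vector)."""
    m, r = len(M), len(M[0])
    rest = [i for i in range(m) if i not in W]     # rows that stay untouched
    if len(rest) != r - n:
        raise ValueError("need m - w = r - n")
    lower_left = [M[i][:n] for i in rest]
    lower_right = [M[i][n:] for i in rest]
    # Untouched rows must not change: lower_right * rho = -(lower_left * delta).
    rho = solve_mod_p(lower_right, [(-t) % P for t in mat_vec(lower_left, delta)])
    change = list(delta) + rho
    gamma = {i: sum(c * s for c, s in zip(M[i], change)) % P for i in W}
    return gamma, rho

M = [[pow(a, j, P) for j in range(4)] for a in range(1, 6)]
Y = [40, 102, 214, 137, 146]     # encodes X = [10, 20] with slack V = [7, 3]
gamma, rho = differential_write(M, W=[0, 1, 3], n=2, delta=[1, 2])
new_Y = [(y + gamma.get(i, 0)) % P for i, y in enumerate(Y)]
print(gamma)   # {0: 51, 1: 128, 3: 206}
print(new_Y)   # [91, 230, 214, 86, 146]
```

Only the three entries in W are written, no entry is read, and the result equals the encoding of X + δ = [11, 22] with the new slack vector [8, 50].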
Differential Write: Proof
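Use:
• Vector $\rho = \rho_1, \dots, \rho_k$: the change of vector $V$
• Vector $\gamma = \gamma_1, \dots, \gamma_w$: the change of vector $Y$
• $X' = X + \delta$, $V' = V + \rho$, $Y' = Y + \gamma$ (on the $w$ written entries)
$$M = \begin{pmatrix} M^{\leftarrow\uparrow} & M^{\uparrow\rightarrow} \\ M^{\leftarrow\downarrow} & M^{\downarrow\rightarrow} \end{pmatrix}$$
Correctness follows by combining:
$$M (X' \mid V')^T = M (X + \delta \mid V + \rho)^T = M (X \mid V)^T + M (\delta \mid \rho)^T = Y + (\gamma \mid 0)^T$$
This equation is equivalent to:
$$M^{\downarrow\rightarrow} \rho + M^{\leftarrow\downarrow} \delta = 0, \qquad M^{\leftarrow\uparrow} \delta + M^{\uparrow\rightarrow} \rho = \gamma$$
Since $\delta$ is given, $\rho$ is obtained as follows:
$$\rho = (M^{\downarrow\rightarrow})^{-1} (-M^{\leftarrow\downarrow}) \cdot \delta$$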
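For completeness, the remaining step can be written out. The symbols are assumed as follows: $\delta$ is the change of $X$, $\rho$ the change of the slack vector $V$, $\gamma$ the change of the $w$ written entries of $Y$, and the four arrow-indexed blocks are the upper-left, upper-right, lower-left and lower-right sub-matrices of the rearranged generator matrix. Substituting the slack change into the equation for the written rows gives the update formula:

```latex
\begin{aligned}
\gamma &= M^{\leftarrow\uparrow}\,\delta + M^{\uparrow\rightarrow}\,\rho \\
       &= M^{\leftarrow\uparrow}\,\delta
          + M^{\uparrow\rightarrow}\,(M^{\downarrow\rightarrow})^{-1}\,(-M^{\leftarrow\downarrow})\,\delta \\
       &= \left( M^{\leftarrow\uparrow}
          - M^{\uparrow\rightarrow}\,(M^{\downarrow\rightarrow})^{-1}\,M^{\leftarrow\downarrow} \right)\delta
\end{aligned}
```

The bracketed $w \times n$ matrix depends only on the generator matrix and the choice of written rows, so it can be computed once per write set.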
Heinz Nixdorf Institute & Computer Science Institute
University of Paderborn
Fürstenallee 11
33102 Paderborn, Germany
Tel.: +49 (0) 52 51/60 64 51
Fax: +49 (0) 52 51/62 64 82
E-Mail: [email protected]
Web: www.upb.de/cs/ag-madh
Thank you for your attention!