Erasure Coding Costs and Benefits
John D. Cook (1), Robert Primmer (2), Ab de Kwant (2)
(1) Singular Value Consulting
(2) Hitachi Data Systems
March 28, 2014
Two ways to store a 12 GB video
Replication
Store an extra copy.
Disk usage: 24 GB, 100% overhead.
Simplest example of erasure coding
Split into two halves, A and B.
Store A, B, and A ⊕ B.
Disk usage: 18 GB, 50% overhead.
Can recover B, for example, with (A ⊕ B) ⊕ A.
NB: ⊕ denotes bitwise XOR.
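A minimal Python sketch of this simplest scheme, using a short byte string as a stand-in for the video. The helper name `xor_bytes` and the sample data are illustrative assumptions, not part of the talk.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

data = b"12 GB video!"            # stand-in for the real file (even length)
half = len(data) // 2
A, B = data[:half], data[half:]
parity = xor_bytes(A, B)          # the third stored fragment, A xor B

# Suppose the disk holding B is lost: rebuild it from the parity and A.
recovered_B = xor_bytes(parity, A)
```

The same call with the roles swapped, `xor_bytes(parity, B)`, rebuilds A, so any single lost fragment is recoverable.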
Another way to store a 12 GB video
Split into three equal parts: A, B, C.
Store A, B, C, and A ⊕ B ⊕ C.
Disk usage: 16 GB, 33% overhead.
Can recover B, for example, with (A ⊕ B ⊕ C) ⊕ A ⊕ C.
Call this 3 + 1 encoding.
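The 3 + 1 scheme generalizes to m data fragments plus one XOR parity fragment. The sketch below is one possible way to write it. The function names `xor_all`, `encode_m_plus_1`, and `recover` are hypothetical, and the input is assumed to be already padded to a multiple of m bytes.

```python
from functools import reduce

def xor_all(fragments):
    """XOR a list of equal-length byte strings together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), fragments)

def encode_m_plus_1(data: bytes, m: int):
    """Split data into m equal fragments and append one XOR parity fragment."""
    if len(data) % m != 0:
        raise ValueError("pad data to a multiple of m first")
    size = len(data) // m
    fragments = [data[i * size:(i + 1) * size] for i in range(m)]
    return fragments + [xor_all(fragments)]

def recover(stored, missing: int) -> bytes:
    """Rebuild the fragment at index `missing` by XORing all the others."""
    return xor_all([f for i, f in enumerate(stored) if i != missing])

stored = encode_m_plus_1(b"ABCDEFGHIJKL", 3)   # A, B, C, and A ^ B ^ C
overhead = (len(stored) - 3) / 3               # one extra fragment per three
```

Here `recover(stored, 1)` plays the role of (A ⊕ B ⊕ C) ⊕ A ⊕ C, and `overhead` works out to one third.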
Another look at 3 + 1 encoding
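\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}
\]
To recover, erase the missing element of y and the corresponding row of the matrix. Solve the linear system.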
Addition is ⊕ (XOR). Multiplication by 1 is identity.
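The strike-a-row recipe can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: `encode` and `decode` are hypothetical names, and each x_i is treated as a small integer (for example one byte) so that addition is bitwise XOR.

```python
G = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1],
     [1, 1, 1]]   # 3 + 1 encoding matrix

def encode(x):
    """y = G x, with addition replaced by XOR."""
    y = []
    for row in G:
        acc = 0
        for coeff, xi in zip(row, x):
            if coeff:
                acc ^= xi
        y.append(acc)
    return y

def decode(y, missing):
    """Strike row `missing`, then solve the remaining 3x3 system over GF(2)."""
    # Augmented rows [coefficients | right-hand side]; y[missing] is never read.
    rows = [G[i][:] + [y[i]] for i in range(4) if i != missing]
    n = 3
    for col in range(n):
        # Find a row with a 1 in this column and move it into pivot position.
        pivot = next(r for r in range(col, n) if rows[r][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Clear the column in every other row (row addition is XOR).
        for r in range(n):
            if r != col and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

x = [0x41, 0x42, 0x43]
y = encode(x)
```

Calling `decode(y, k)` for any single missing index k returns the original x.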
2 + 2 encoding
Split 12 GB file into equal halves A and B.
Store A, B, (3 ⊗ A) ⊕ (2 ⊗ B), and (2 ⊗ A) ⊕ (3 ⊗ B).
Can recover from the loss of any two disks.
Total disk: 24 GB. Overhead: 100%.
Same disk use as replication, but probability of loss O(p^3) rather than O(p^2).
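The orders of magnitude follow from a short calculation. It assumes that each disk fails independently with probability p, which the slide does not state explicitly. Two-copy replication loses data only when both disks fail, while the 2 + 2 code loses data only when at least three of its four disks fail:

```latex
\begin{aligned}
P_{\text{replication}} &= p^2, \\
P_{2+2} &= \binom{4}{3} p^3 (1-p) + \binom{4}{4} p^4 = 4p^3 - 3p^4 = O(p^3).
\end{aligned}
```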
2 + 2 encoding again
\[
\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 3 & 2 \\ 2 & 3 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}
\]
Operate on pairs of bits.
Arithmetic carried out in GF(2^2).
Addition ⊕ is XOR.
Multiplication ⊗ is more complicated.
Recover by striking missing elements of y and corresponding rows of matrix.
You multiply matrix times vector in the usual way, but with GF(4) arithmetic. More on that soon.
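A small sketch of the 2 + 2 code on a single pair of 2-bit symbols. It assumes GF(4) is built with the reducing polynomial x^2 + x + 1, the only irreducible quadratic over GF(2). The names `MUL`, `INV`, `encode`, and `decode` are hypothetical, and decoding uses Cramer's rule on the two surviving rows, which is one of several ways to solve the 2x2 system.

```python
from itertools import combinations

# Multiplication table for GF(4): elements 0..3 are 2-bit polynomials,
# reduced modulo x^2 + x + 1.
MUL = [[0, 0, 0, 0],
       [0, 1, 2, 3],
       [0, 2, 3, 1],
       [0, 3, 1, 2]]
INV = {1: 1, 2: 3, 3: 2}               # multiplicative inverses

G = [[1, 0], [0, 1], [3, 2], [2, 3]]   # encoding matrix

def encode(x1, x2):
    """Return y = G x for one pair of 2-bit symbols."""
    return [MUL[a][x1] ^ MUL[b][x2] for a, b in G]

def decode(y, keep):
    """Recover (x1, x2) from the two surviving rows listed in `keep`."""
    (a, b), (c, d) = G[keep[0]], G[keep[1]]
    u, v = y[keep[0]], y[keep[1]]
    det = MUL[a][d] ^ MUL[b][c]         # subtraction is also XOR
    inv_det = INV[det]                  # det is nonzero for every pair of rows
    # Cramer's rule over GF(4).
    x1 = MUL[inv_det][MUL[u][d] ^ MUL[b][v]]
    x2 = MUL[inv_det][MUL[a][v] ^ MUL[u][c]]
    return x1, x2

y = encode(2, 1)
survivors = {keep: decode(y, keep) for keep in combinations(range(4), 2)}
```

Every one of the six possible pairs of surviving fragments decodes back to the original symbols, which is what "can recover from the loss of any two disks" means in practice.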
m + n erasure coding
Divide data into m fragments and add n parity fragments.
Construct an m + n encoding scheme using Reed-Solomon codes.
Can recover from the loss of any n fragments.
There is an m + n code for all positive m and n.
Construct an (m + n) × m matrix. Encode and decode as before.
Arithmetic carried out in GF(q) where q = 2^r ≥ m + n.
Encode data in blocks of r bits.
RAID 5 = m + 1; RAID 6 = m + 2.
Next: What is GF(q), and where does the matrix come from?
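As a small illustration of the condition q = 2^r ≥ m + n, the sketch below finds the smallest block size r for a given code. The function name `field_bits` is hypothetical. The condition is the one stated above for this construction, so a particular code may work with a smaller field, as the XOR-only 3 + 1 example does.

```python
def field_bits(m: int, n: int) -> int:
    """Smallest r with 2**r >= m + n."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive")
    r = 1
    while 2 ** r < m + n:
        r += 1
    return r

r_2_plus_2 = field_bits(2, 2)   # GF(4), pairs of bits
r_6_plus_3 = field_bits(6, 3)   # GF(16), blocks of 4 bits
overhead_6_plus_3 = 3 / 6       # n / m
```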
Finite fields
GF(q) has q elements.
There exists a field of order q if and only if q = p^r for some prime p and positive integer r.
Elements are polynomials of degree at most r − 1 in one variable, with coefficients modulo p.
Addition ⊕ is polynomial addition with coefficients modulo p.
Multiplication is polynomial multiplication modulo an irreducible polynomial g(x) of degree r.
In computer applications, p is nearly always 2.
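For p = 2 these rules map directly onto bit operations. The sketch below is one common shift-and-XOR way to write them, offered as an illustration rather than a tuned implementation. The names `gf_add` and `gf_mul` are hypothetical, elements are stored as integers whose bit k is the coefficient of x^k, and the caller is assumed to supply an irreducible g of degree r in the same encoding.

```python
def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^r): coefficientwise addition mod 2, i.e. XOR."""
    return a ^ b

def gf_mul(a: int, b: int, g: int, r: int) -> int:
    """Multiply a and b in GF(2^r), reducing modulo the polynomial g.

    Elements and g are bit masks: bit k is the coefficient of x^k.
    """
    product = 0
    while b:
        if b & 1:
            product ^= a          # add the current multiple of a
        b >>= 1
        a <<= 1                   # multiply a by x
        if a & (1 << r):          # degree reached r: subtract (XOR) g
            a ^= g
    return product

# GF(2^3) with g(x) = x^3 + x + 1, encoded as 0b1011.
total = gf_add(3, 5)
product = gf_mul(5, 6, 0b1011, 3)
```

Production libraries usually replace the loop with lookup tables, but the arithmetic is the same.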
Finite field example: GF(2^3)
Elements are 3-bit numbers
Choose irreducible polynomial, e.g. g(x) = x^3 + x + 1
3 = 011_two ↦ x + 1
5 = 101_two ↦ x^2 + 1
3 ⊕ 5 ↦ x^2 + x ↦ 6
5 ⊗ 6 ↦ (x^2 + 1)(x^2 + x) ≡ x + 1 mod g(x) ↦ 3
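The reduction in the last line can be written out step by step. Coefficients are taken modulo 2, so like terms cancel in pairs, and g(x) = x^3 + x + 1 gives the substitution x^3 ≡ x + 1:

```latex
\begin{aligned}
(x^2 + 1)(x^2 + x) &= x^4 + x^3 + x^2 + x \\
x^3 &\equiv x + 1 \pmod{g(x)} \\
x^4 = x \cdot x^3 &\equiv x^2 + x \pmod{g(x)} \\
x^4 + x^3 + x^2 + x &\equiv (x^2 + x) + (x + 1) + x^2 + x = x + 1 \pmod{g(x)}
\end{aligned}
```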
How to construct matrix for m + n code
Choose q = p^r ≥ m + n, e.g. work in GF(2^4) for a 6 + 3 code.
Start with an (m + n) × m Vandermonde matrix.
Do Gaussian elimination by columns until the top of the matrix equals the identity. Call this B.
Theorem: Every subset of m rows of B forms an invertible matrix.
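The construction can be sketched for the 6 + 3 example as follows. Several choices here are assumptions made for illustration: the reducing polynomial x^4 + x + 1 for GF(2^4), the evaluation points 0, 1, ..., m + n − 1 for the Vandermonde rows, and all helper names. The final check tests every square submatrix formed from m of the m + n rows.

```python
from itertools import combinations

R, POLY = 4, 0b10011     # GF(2^4) with g(x) = x^4 + x + 1 (assumed choice)

def gf_mul(a, b):
    """Shift-and-XOR multiplication in GF(2^R)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & (1 << R):
            a ^= POLY
    return p

def gf_inv(a):
    """Brute-force inverse; fine for a 16-element field."""
    return next(b for b in range(1, 1 << R) if gf_mul(a, b) == 1)

def gf_pow(a, k):
    result = 1
    for _ in range(k):
        result = gf_mul(result, a)
    return result

def coding_matrix(m, n):
    """Vandermonde matrix reduced by column operations to identity on top."""
    B = [[gf_pow(i, j) for j in range(m)] for i in range(m + n)]
    for i in range(m):
        if B[i][i] == 0:                       # swap in a nonzero pivot
            j = next(c for c in range(i + 1, m) if B[i][c])
            for row in B:
                row[i], row[j] = row[j], row[i]
        inv = gf_inv(B[i][i])                  # scale column i: pivot -> 1
        for row in B:
            row[i] = gf_mul(row[i], inv)
        for c in range(m):                     # clear the rest of row i
            if c != i and B[i][c]:
                factor = B[i][c]
                for row in B:
                    row[c] ^= gf_mul(factor, row[i])
    return B

def invertible(M):
    """Row-reduce a square matrix over GF(2^R); True if it has full rank."""
    M = [row[:] for row in M]
    size = len(M)
    for col in range(size):
        pivot = next((r for r in range(col, size) if M[r][col]), None)
        if pivot is None:
            return False
        M[col], M[pivot] = M[pivot], M[col]
        inv = gf_inv(M[col][col])
        M[col] = [gf_mul(v, inv) for v in M[col]]
        for r in range(col + 1, size):
            if M[r][col]:
                f = M[r][col]
                M[r] = [a ^ gf_mul(f, b) for a, b in zip(M[r], M[col])]
    return True

B = coding_matrix(6, 3)
all_ok = all(invertible([B[i] for i in rows])
             for rows in combinations(range(9), 6))
```

Column operations multiply the Vandermonde matrix on the right by an invertible matrix, so they preserve the invertibility of every m-row submatrix while making the first m stored fragments plain copies of the data.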
How to choose n
Given a number of data fragments m and an acceptable probability of unavailability ε, you can solve for the smallest value of n such that
\[
\sum_{i=n+1}^{m+n} \binom{m+n}{i} \, p_L^{\,i} \, (1 - p_L)^{m+n-i} < \varepsilon
\]
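A direct way to solve for n is to evaluate the sum for increasing n until it drops below ε. The sketch below takes p_L to be the probability that a single fragment is unavailable and treats fragments as independent. Both are interpretations, since the slide does not define p_L. The function names and the search cap `n_max` are hypothetical.

```python
from math import comb

def unavailability(m: int, n: int, p_L: float) -> float:
    """Probability that more than n of the m + n fragments are unavailable."""
    total = m + n
    return sum(comb(total, i) * p_L**i * (1 - p_L)**(total - i)
               for i in range(n + 1, total + 1))

def smallest_n(m: int, p_L: float, eps: float, n_max: int = 64) -> int:
    """Smallest n with unavailability(m, n, p_L) < eps (search up to n_max)."""
    for n in range(0, n_max + 1):
        if unavailability(m, n, p_L) < eps:
            return n
    raise ValueError("no n <= n_max meets the target")

n = smallest_n(8, 0.005, 1e-6)   # 3
```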
Example of choosing n
Suppose disk failure probability is 0.005 and you want to build a system with probability of failure < 10^-6.
Could use triple replication. Probability of failure = 0.005^3 = 1.25 × 10^-7.
If m = 8, can solve for n = 3, which gives probability of failure 1.99 × 10^-7.
8 GB object encodes to 11 GB rather than 24 GB.
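As a check on the arithmetic, assuming independent disk failures: the m = 8, n = 3 figure is dominated by the case of exactly four failed fragments out of eleven. The cases with five or more failures add only about 1.4 × 10^-9. The storage figures are simple ratios:

```latex
\begin{aligned}
\binom{11}{4}\,(0.005)^4\,(0.995)^7
  &\approx 330 \times 6.25\times10^{-10} \times 0.9655
   \approx 1.99\times10^{-7}, \\
\frac{8+3}{8} &= 1.375 \quad\text{(11 GB for an 8 GB object)}, \qquad
\frac{3 \times 8}{8} = 3 \quad\text{(24 GB with triple replication)}.
\end{aligned}
```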
Choosing m
Can increase reliability for fixed disk space by increasing m and n proportionally.
In theory, EC can make both the probability of loss and the overhead as small as you like.
So why not always use EC, and why not choose m + n as large as possible?
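The first point can be illustrated numerically. The sketch below holds the overhead n / m at 50% and scales m and n together, reusing the disk failure probability 0.005 from the earlier example and assuming independent failures. The particular schemes and the function name `loss_probability` are chosen only for illustration.

```python
from math import comb

def loss_probability(m: int, n: int, p: float) -> float:
    """Probability that more than n of the m + n fragments have failed."""
    total = m + n
    return sum(comb(total, i) * p**i * (1 - p)**(total - i)
               for i in range(n + 1, total + 1))

p = 0.005                                    # per-disk failure probability
schemes = [(4, 2), (8, 4), (16, 8)]          # all have n / m = 50% overhead
results = [(m, n, loss_probability(m, n, p)) for m, n in schemes]
```

Under these assumptions each doubling of m and n lowers the loss probability by several orders of magnitude at the same disk cost, which is what makes the questions above worth asking.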
Keeping track of fragments
Some system needs to keep track of where to find EC fragments.
Want database in memory, so must be on the order of GB in size.
Kilobytes of data per fragment ⇒ millions of fragments (GB / KB ≈ 10^6).
In practice, EC systems have on the order of millions of fragments.
This places an upper limit on m + n and a lower limit on fragment size.
Lost versus unavailable data
Data Loss (DL): You’re never getting the data back.
Data unavailability (DU): The data is recoverable, but you can’t get to it right now.
DL = permanently unavailable
DU = temporarily unavailable
Dying versus being dead
If a disk can be replaced in time, it’s as if it came back to life.
Data loss happens when too many disks are dead at the sametime.
Probability of (eventual) failure is 1.
Probability of being in dead state is most important.
Note that probability of a disk being dead depends on replacement time, not just disk MTBF.
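One common way to put a number on the last point, offered as a sketch rather than as the authors' model: if a disk alternates between working and being replaced, the long-run fraction of time it spends dead is replacement time / (MTBF + replacement time). The function name and the figures below are illustrative assumptions, not values from the talk.

```python
def prob_dead(mtbf_hours: float, replace_hours: float) -> float:
    """Long-run fraction of time a disk is dead in an alternating up/down model."""
    return replace_hours / (mtbf_hours + replace_hours)


# Hypothetical numbers: the same disk under two replacement policies.
mtbf = 100_000.0
slow = prob_dead(mtbf, replace_hours=72.0)
fast = prob_dead(mtbf, replace_hours=8.0)

print(f"72 h replacement: P(dead) = {slow:.6f}")
print(f" 8 h replacement: P(dead) = {fast:.6f}")
```

With the MTBF held fixed, shortening the replacement time lowers the probability of being dead almost proportionally, which is why replacement time matters as much as the disk's own reliability.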
Allocating disks to data centers
Location of disks matters for DU more than DL.
Very unlikely that all disks in a data center will crash simultaneously, but data centers go offline.
Common to have a replica in every data center.
Companies usually have fewer than m + n data centers, so some fragments are co-located.
EC fragments have correlated risk of unavailability (and failure, to a lesser extent).
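The co-location point can be illustrated with a small sketch. It assumes an m + n code in which any m of the m + n fragments rebuild the object, and it assumes fragments are spread as evenly as possible across data centers; the function names are hypothetical.

```python
from math import ceil


def max_fragments_per_site(m: int, n: int, sites: int) -> int:
    """Most fragments held by any one data center under an even spread."""
    return ceil((m + n) / sites)


def survives_site_outage(m: int, n: int, sites: int) -> bool:
    """True if the object stays readable when the fullest data center is offline."""
    return max_fragments_per_site(m, n, sites) <= n


# 8 + 3 code spread over a varying number of data centers.
for sites in (2, 3, 4, 11):
    print(sites, max_fragments_per_site(8, 3, sites), survives_site_outage(8, 3, sites))
```

Under these assumptions an 8 + 3 object spread over three data centers becomes unavailable when a data center holding four fragments goes offline, even though no disk has failed.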
Latency
In replicated system, object returned from one data center.
In EC system, may need to combine fragments from two data centers to reconstruct object.
Recoverable failure may require EC to access fragments further away than normal read.
Recovery
EC systems have a lower probability of catastrophic failure but a higher probability of recoverable failure.
For example, assume disks have 0.995 probability of being alive.
Triple replication and 8+3 EC have comparable probability of data loss.
But there is a 1.5% chance of recoverable failure in the former and 5.36% in the latter.
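The figures above can be checked with a binomial calculation. This sketch assumes disks fail independently, each dead with probability 0.005; triple replication is treated as 3 disks tolerating 2 dead, and 8+3 EC as 11 disks tolerating 3 dead. The function names are illustrative.

```python
from math import comb


def prob_exactly_dead(total: int, k: int, p_dead: float) -> float:
    """Binomial probability that exactly k of `total` disks are dead."""
    return comb(total, k) * p_dead**k * (1 - p_dead) ** (total - k)


def failure_probs(total: int, tolerated: int, p_dead: float) -> tuple[float, float]:
    """Return (P(recoverable failure), P(data loss)) for one stored object."""
    recoverable = sum(prob_exactly_dead(total, k, p_dead) for k in range(1, tolerated + 1))
    loss = sum(prob_exactly_dead(total, k, p_dead) for k in range(tolerated + 1, total + 1))
    return recoverable, loss


p_dead = 1 - 0.995
rep_rec, rep_loss = failure_probs(total=3, tolerated=2, p_dead=p_dead)
ec_rec, ec_loss = failure_probs(total=11, tolerated=3, p_dead=p_dead)

print(f"Triple replication: recoverable {rep_rec:.2%}, data loss {rep_loss:.2e}")
print(f"8+3 EC:             recoverable {ec_rec:.2%}, data loss {ec_loss:.2e}")
```

Both data-loss probabilities come out on the order of 10^-7, while the chance of at least one dead disk is several times higher for the 11-disk EC layout simply because more disks are involved.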
Local reconstruction codes
Microsoft Azure 6 + 2 + 2 example system: 3 data disks, 1 local parity disk, and 1 global parity disk in each of two data centers.
Can recover from any single disk failure locally.
Can recover from the loss of any 3 disks.
Can recover from the loss of 4 disks in 86% of cases.
More reliable than 6 + 3, but not quite as reliable as 6 + 4.
Actual Azure implementation uses 12 + 2 + 2.
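The 3-disk and 4-disk claims can be reproduced by brute-force enumeration. This is a sketch under a stated assumption: the code recovers every failure pattern that is decodable in principle, which reduces to a counting rule. Each local parity can repair one failure in its own group, and the two global parities must cover whatever remains. The counting rule and the disk numbering are a paraphrase for illustration, not taken from the slides.

```python
from itertools import combinations

# Disk numbering is hypothetical: each local group is 3 data disks + 1 local parity.
GROUP_X = {0, 1, 2, 3}
GROUP_Y = {4, 5, 6, 7}
GLOBAL = {8, 9}  # two global parity disks
N_DISKS = 10


def recoverable(failed) -> bool:
    """Counting rule: failures left after local repair must fit the global parities."""
    failed = set(failed)
    local_repairs = int(bool(failed & GROUP_X)) + int(bool(failed & GROUP_Y))
    return len(failed) - local_repairs <= len(GLOBAL)


def fraction_recoverable(k: int) -> float:
    """Fraction of all k-disk failure patterns that are recoverable."""
    patterns = list(combinations(range(N_DISKS), k))
    return sum(recoverable(p) for p in patterns) / len(patterns)


for k in (1, 2, 3, 4, 5):
    print(k, f"{fraction_recoverable(k):.1%}")
```

Under this rule the unrecoverable 4-disk patterns are exactly those that leave one local group untouched: 30 of the 210 patterns, which gives the 86% figure.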
Summary
EC implemented via Reed-Solomon codes, finite fields.
EC reduces hardware costs and increases reliability.
In theory, overhead and probability of data loss → 0 as m + n → ∞.
In practice, m + n is usually on the order of 10.
EC decreases probability of data loss, increases probability of recoverable failure.
EC can be vulnerable to data center availability.
EC increases latency.
Resources
James Westall and James Martin. An Introduction to Galois Fields and Reed-Solomon Coding. http://bit.ly/1oL4NRt
Cheng Huang et al. Erasure Coding in Windows Azure Storage. http://bit.ly/ZPISui
John D. Cook, Robert Primmer, Ab de Kwant. Comparing Cost and Performance of Replication and Erasure Coding. To appear in Hitachi Review. http://bit.ly/1eyvfu9
Contact info: http://JohnDCook.com