Erasure Coding Costs and Benefits

Erasure Coding Costs and Benefits. John D. Cook (Singular Value Consulting), Robert Primmer and Ab de Kwant (Hitachi Data Systems). March 28, 2014.

Transcript of Erasure Coding Costs and Benefits

Page 1: Erasure Coding Costs and Benefits

Erasure Coding Costs and Benefits

John D. Cook (1), Robert Primmer (2), Ab de Kwant (2)

(1) Singular Value Consulting

(2) Hitachi Data Systems

March 28, 2014

John D. Cook, Robert Primmer, Ab de Kwant Erasure Coding Costs and Benefits

Page 2: Erasure Coding Costs and Benefits

Two ways to store a 12 GB video

Replication

Store an extra copy.

Disk usage: 24 GB, 100% overhead.

Simplest example of erasure coding

Split into two halves, A and B.

Store A, B, and A ⊕ B.

Disk usage: 18 GB, 50% overhead.

Can recover B, for example, with (A ⊕ B) ⊕ A.

NB: ⊕ denotes XOR.
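
The two-halves scheme can be sketched in a few lines of Python. This is only an illustration: the byte string stands in for the 12 GB video, and the helper name `xor_bytes` is made up for this example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

# Stand-in for the video; the length is even so it splits into equal halves.
data = b"12 GB of video, in miniature"
half = len(data) // 2
A, B = data[:half], data[half:]

# Third stored fragment: A xor B. Total storage is 1.5x the original.
parity = xor_bytes(A, B)

# Pretend the disk holding B is lost: rebuild it from the parity and A.
recovered_B = xor_bytes(parity, A)
```

The same call with the roles swapped, `xor_bytes(parity, B)`, rebuilds A, since XOR is its own inverse.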

Page 12: Erasure Coding Costs and Benefits

Another way to store a 12 GB video

Split into three equal parts: A, B, C.

Store A, B, C, and A ⊕ B ⊕ C.

Disk usage: 16 GB, 33% overhead.

Can recover B, for example, with (A ⊕ B ⊕ C) ⊕ A ⊕ C.

Call this 3 + 1 encoding.
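
The same idea extends to any number of data fragments with a single parity fragment. The sketch below assumes the data length divides evenly by m; the function names `encode` and `recover` are illustrative, not from the slides.

```python
from functools import reduce

def xor_all(fragments):
    """XOR together a list of equal-length byte strings."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), fragments)

def encode(data: bytes, m: int):
    """Split data into m equal fragments and append one XOR parity fragment."""
    if len(data) % m != 0:
        raise ValueError("this sketch assumes the length divides evenly by m")
    size = len(data) // m
    fragments = [data[i * size:(i + 1) * size] for i in range(m)]
    return fragments + [xor_all(fragments)]

def recover(stored, missing: int):
    """Rebuild the fragment at index `missing` by XORing all the others."""
    return xor_all([f for i, f in enumerate(stored) if i != missing])

stored = encode(b"AAAABBBBCCCC", 3)    # A, B, C, and A xor B xor C
overhead = (len(stored) - 3) / 3        # one parity fragment per three data fragments
```

Here `recover(stored, 1)` plays the role of (A ⊕ B ⊕ C) ⊕ A ⊕ C, and `overhead` comes out to one third.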

Page 18: Erasure Coding Costs and Benefits

Another look at 3 + 1 encoding

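\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
=
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}
\]

To recover, erase the missing element of y and the corresponding row of the matrix. Solve the resulting linear system.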

Addition is ⊕ (XOR). Multiplication by 1 is identity.
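
A minimal sketch of the matrix view, assuming each x_i is a small integer whose bits are combined with XOR. The solver is a plain Gauss-Jordan elimination over GF(2); the names `encode`, `solve_gf2`, and `recover` are chosen for this example.

```python
# Coding matrix for 3 + 1: identity on top, a row of ones for the parity.
G = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1],
     [1, 1, 1]]

def encode(x):
    """y = G x, where addition is XOR and multiplying by 1 is the identity."""
    y = []
    for row in G:
        acc = 0
        for g, xi in zip(row, x):
            if g:
                acc ^= xi
        y.append(acc)
    return y

def solve_gf2(rows, rhs):
    """Gauss-Jordan elimination over GF(2); rhs entries are XOR-able ints."""
    rows = [r[:] for r in rows]
    rhs = rhs[:]
    size = len(rows)
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(size):
            if r != col and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
                rhs[r] ^= rhs[col]
    return rhs

def recover(y, missing):
    """Erase the missing element of y and the matching row of G, then solve."""
    keep = [i for i in range(len(G)) if i != missing]
    return solve_gf2([G[i] for i in keep], [y[i] for i in keep])

x = [0b1010, 0b0110, 0b1111]
y = encode(x)
```

Whichever single element of `y` is dropped, `recover(y, missing)` returns the original `x`.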

Page 22: Erasure Coding Costs and Benefits

2 + 2 encoding

Split 12 GB file into equal halves A and B.

Store A, B, (3 ⊗ A) ⊕ (2 ⊗ B), and (2 ⊗ A) ⊕ (3 ⊗ B).

Can recover from the loss of any two disks.

Total disk: 24 GB. Overhead: 100%.

Same disk use as replication, but probability of loss is O(p^3) rather than O(p^2), where p is the probability that a single disk fails.
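
The difference in loss probability can be checked numerically. The sketch below assumes disks fail independently and uses a made-up per-disk failure probability of 0.01; neither assumption comes from the slides.

```python
from math import comb

def loss_probability(total: int, tolerated: int, p: float) -> float:
    """Probability that more than `tolerated` of `total` disks fail,
    assuming independent failures with probability p each."""
    return sum(comb(total, i) * p**i * (1 - p)**(total - i)
               for i in range(tolerated + 1, total + 1))

p = 0.01                                   # hypothetical per-disk failure probability
replication = loss_probability(2, 1, p)    # two copies: lost only if both fail
two_plus_two = loss_probability(4, 2, p)   # four fragments: survives any two losses

print(f"replication:  {replication:.3e}")
print(f"2 + 2 coding: {two_plus_two:.3e}")
```

Replication loses data with probability p^2, while 2 + 2 coding needs at least three failures, which gives 4p^3(1 − p) + p^4.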

Page 28: Erasure Coding Costs and Benefits

2 + 2 encoding again

\[
\begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 3 & 2 \\ 2 & 3 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}
=
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix}
\]

Operate on pairs of bits.

Arithmetic carried out in GF(2^2).

Addition ⊕ is XOR.

Multiplication ⊗ is more complicated.

Recover by striking missing elements of y and corresponding rows of matrix.

You multiply matrix times vector in the usual way, but with GF(4) arithmetic. More on that soon.
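
A sketch of 2 + 2 encoding and recovery on 2-bit symbols. It assumes GF(4) is built with the irreducible polynomial x^2 + x + 1 (the slides do not say which polynomial is used, though this is the only irreducible quadratic over GF(2)), and it solves the surviving 2 × 2 system with Cramer's rule. All function names are illustrative.

```python
from itertools import combinations

def gf4_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(4), reducing modulo x^2 + x + 1 (0b111)."""
    product = 0
    for bit in range(2):
        if (b >> bit) & 1:
            product ^= a << bit       # carry-less multiplication
    if product & 0b100:               # a degree-2 term is present: reduce it
        product ^= 0b111
    return product

def gf4_inv(a: int) -> int:
    """Multiplicative inverse by brute-force search over the nonzero elements."""
    return next(b for b in range(1, 4) if gf4_mul(a, b) == 1)

# Coding matrix from the slide.
B = [[1, 0], [0, 1], [3, 2], [2, 3]]

def encode(x1: int, x2: int):
    """Matrix times vector in the usual way, with GF(4) arithmetic."""
    return [gf4_mul(r[0], x1) ^ gf4_mul(r[1], x2) for r in B]

def recover(y, keep):
    """Solve the 2x2 system given by the two surviving rows (Cramer's rule).
    In characteristic 2, subtraction is the same as addition (XOR)."""
    (a, b), (c, d) = B[keep[0]], B[keep[1]]
    u, v = y[keep[0]], y[keep[1]]
    det_inv = gf4_inv(gf4_mul(a, d) ^ gf4_mul(b, c))
    x1 = gf4_mul(det_inv, gf4_mul(d, u) ^ gf4_mul(b, v))
    x2 = gf4_mul(det_inv, gf4_mul(a, v) ^ gf4_mul(c, u))
    return x1, x2

# Every pair of surviving fragments recovers every possible pair of symbols.
ok = all(recover(encode(x1, x2), keep) == (x1, x2)
         for x1 in range(4) for x2 in range(4)
         for keep in combinations(range(4), 2))
```

Each symbol here is one pair of bits; a real file would be processed two bits at a time with the same matrix.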

Page 36: Erasure Coding Costs and Benefits

m + n erasure coding

Divide data into m fragments and add n parity fragments.

Construct an m + n encoding scheme using Reed-Solomon codes.

Can recover from the loss of any n fragments.

There is an m + n code for all positive m and n.

Construct an (m + n) × m matrix. Encode and decode as before.

Arithmetic carried out in GF(q) where q = 2^r ≥ m + n.

Encode data in blocks of r bits.

RAID 5 = m + 1; RAID 6 = m + 2.

Next: What is GF(q) and where does the matrix come from?
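
As a small illustration of the sizing rule, the helper below finds the smallest r with 2^r ≥ m + n and reports the storage overhead n/m. The function names are made up for this sketch, and the rule is the sufficient condition stated on the slide; simple parity (n = 1) can get by with plain XOR, as in the 3 + 1 example.

```python
def field_bits(m: int, n: int) -> int:
    """Smallest r such that 2**r >= m + n."""
    r = 1
    while 2**r < m + n:
        r += 1
    return r

def overhead(m: int, n: int) -> float:
    """Extra storage as a fraction of the original data size."""
    return n / m

for m, n in [(2, 2), (6, 3), (8, 3)]:
    print(f"{m} + {n}: r = {field_bits(m, n)} bits per block, "
          f"overhead = {overhead(m, n):.0%}")
```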

Page 45: Erasure Coding Costs and Benefits

Finite fields

GF(q) has q elements.

There exists a field of order q if and only if q = p^r for some prime p and positive integer r.

Elements are polynomials of degree at most r − 1 in one variable, with coefficients in GF(p).

Addition ⊕ is polynomial addition with coefficients modulo p.

Multiplication is polynomial multiplication modulo an irreducible polynomial g(x) of degree r.

In computer applications, p is nearly always 2.

Page 52: Erasure Coding Costs and Benefits

Finite field example: GF(2^3)

Elements are 3-bit numbers

Choose irreducible polynomial, e.g. g(x) = x^3 + x + 1

3 = 11 (base two) ↦ x + 1

5 = 101 (base two) ↦ x^2 + 1

3 ⊕ 5 ↦ x^2 + x ↦ 6

5 ⊗ 6 ↦ (x^2 + 1)(x^2 + x) ≡ x + 1 mod g(x) ↦ 3
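
The arithmetic on this slide can be reproduced with a short bit-twiddling sketch. Bits of an integer are read as polynomial coefficients, so 0b1011 stands for x^3 + x + 1; the function names are illustrative.

```python
def gf_add(a: int, b: int) -> int:
    """Addition in GF(2**r): coefficientwise addition modulo 2, i.e. XOR."""
    return a ^ b

def gf_mul(a: int, b: int, g: int = 0b1011, r: int = 3) -> int:
    """Multiply a and b in GF(2**r), reducing modulo the polynomial g."""
    product = 0
    for bit in range(r):
        if (b >> bit) & 1:
            product ^= a << bit             # polynomial product, coefficients mod 2
    for degree in range(2 * r - 2, r - 1, -1):
        if (product >> degree) & 1:
            product ^= g << (degree - r)    # cancel the leading term with a shifted g
    return product

print(gf_add(3, 5))   # x + 1 plus x^2 + 1
print(gf_mul(5, 6))   # (x^2 + 1)(x^2 + x) mod g(x)
```

The two printed values match the slide: 3 ⊕ 5 gives 6 and 5 ⊗ 6 gives 3.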

Page 59: Erasure Coding Costs and Benefits

How to construct matrix for m + n code

Choose q = p^r ≥ m + n, e.g. work in GF(2^4) for a 6 + 3 code.

Start with an (m + n) × m Vandermonde matrix.

Do Gaussian elimination by columns until the top of the matrix equals the identity. Call this B.

Theorem: Every subset of m rows of B forms an invertible matrix.
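
A sketch of the construction for a 6 + 3 code in GF(2^4). Two choices here are assumptions rather than anything the slides specify: the field is built with g(x) = x^4 + x + 1, and row i of the Vandermonde matrix holds the powers of field element i. The final check confirms that any m surviving rows give an invertible square matrix, which is what recovery needs.

```python
from itertools import combinations

G_POLY, R = 0b10011, 4    # assumed irreducible polynomial x^4 + x + 1 for GF(2^4)

def mul(a, b):
    """Multiply in GF(2^4): carry-less product reduced modulo G_POLY."""
    p = 0
    for bit in range(R):
        if (b >> bit) & 1:
            p ^= a << bit
    for d in range(2 * R - 2, R - 1, -1):
        if (p >> d) & 1:
            p ^= G_POLY << (d - R)
    return p

def inv(a):
    """Multiplicative inverse by brute-force search."""
    return next(b for b in range(1, 2**R) if mul(a, b) == 1)

def power(a, e):
    result = 1
    for _ in range(e):
        result = mul(result, a)
    return result

def coding_matrix(m, n):
    """Vandermonde matrix reduced by column operations to identity on top."""
    B = [[power(i, j) for j in range(m)] for i in range(m + n)]
    for i in range(m):
        j = next(c for c in range(i, m) if B[i][c])   # pivot column for row i
        for row in B:
            row[i], row[j] = row[j], row[i]
        scale = inv(B[i][i])
        for row in B:
            row[i] = mul(row[i], scale)               # make the pivot equal 1
        for c in range(m):
            if c != i and B[i][c]:
                f = B[i][c]
                for row in B:
                    row[c] ^= mul(f, row[i])          # clear the rest of row i
    return B

def invertible(M):
    """Row-reduce a square matrix over GF(2^4); False if a pivot is missing."""
    M = [row[:] for row in M]
    for c in range(len(M)):
        pivot = next((r for r in range(c, len(M)) if M[r][c]), None)
        if pivot is None:
            return False
        M[c], M[pivot] = M[pivot], M[c]
        scale = inv(M[c][c])
        for r in range(c + 1, len(M)):
            if M[r][c]:
                f = mul(M[r][c], scale)
                M[r] = [a ^ mul(f, b) for a, b in zip(M[r], M[c])]
    return True

m, n = 6, 3
B = coding_matrix(m, n)
all_ok = all(invertible([B[i] for i in rows])
             for rows in combinations(range(m + n), m))
```

Column operations multiply the matrix on the right by an invertible matrix, so they do not change which row subsets are invertible; they only put the code in systematic form, with the data fragments stored unmodified.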

Page 64: Erasure Coding Costs and Benefits

How to choose n

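Given a number of data fragments m and an acceptable probability of unavailability ε, you can solve for the smallest value of n such that

\[
\sum_{i=n+1}^{m+n} \binom{m+n}{i} \, p_L^{\,i} \, (1 - p_L)^{m+n-i} < \varepsilon
\]

Here p_L is the probability of losing any one fragment, with losses assumed independent.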
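
Solving for n is a short search, since the sum shrinks as n grows. In this sketch `p_loss` stands for p_L and `eps` for ε; the cap `n_max` is a hypothetical safety limit added for the example.

```python
from math import comb

def unavailability(m: int, n: int, p_loss: float) -> float:
    """Probability that more than n of the m + n fragments are lost,
    assuming independent losses with probability p_loss each."""
    total = m + n
    return sum(comb(total, i) * p_loss**i * (1 - p_loss)**(total - i)
               for i in range(n + 1, total + 1))

def smallest_n(m: int, p_loss: float, eps: float, n_max: int = 64) -> int:
    """Smallest n whose unavailability falls below eps."""
    for n in range(1, n_max + 1):
        if unavailability(m, n, p_loss) < eps:
            return n
    raise ValueError("no n up to n_max meets the target")
```

For instance, `smallest_n(8, 0.005, 1e-6)` tries n = 1, 2, 3 in turn and stops at the first value that meets the target.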

Page 66: Erasure Coding Costs and Benefits

Example of choosing n

Suppose disk failure probability is 0.005 and you want to build a system with probability of failure < 10^-6.

Could use triple replication. Probability of failure = 0.005^3 = 1.25 × 10^-7.

If m = 8, can solve for n = 3, which gives probability of failure 1.99 × 10^-7.

8 GB object encodes to 11 GB rather than 24 GB.
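
The figures in this example can be reproduced roughly as follows, assuming independent disk failures. The variable names are illustrative.

```python
from math import comb

p = 0.005                        # per-disk failure probability from the slide
m, n = 8, 3

# Triple replication loses data only if all three copies fail.
triple_replication = p**3

# 8 + 3 coding loses data if more than 3 of the 11 fragments fail.
erasure_coded = sum(comb(m + n, i) * p**i * (1 - p)**(m + n - i)
                    for i in range(n + 1, m + n + 1))

object_gb = 8
replicated_gb = 3 * object_gb              # three full copies
encoded_gb = object_gb * (m + n) / m       # data plus parity fragments

print(f"triple replication: {triple_replication:.3e}, {replicated_gb} GB")
print(f"8 + 3 coding:       {erasure_coded:.3e}, {encoded_gb:.0f} GB")
```

Both schemes land near 10^-7, comfortably under the 10^-6 target, but the erasure-coded object takes less than half the disk space.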

Page 71: Erasure Coding Costs and Benefits

Choosing m

Can increase reliability for fixed disk space by increasing m and n proportionally.

In theory, EC can make both the probability of loss and the overhead as small as you like.

So why not always use EC, and why not choose m + n as large as possible?
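
To see the effect of scaling m and n together, the sketch below holds the overhead at 50% and doubles both numbers. The failure probability 0.005 is borrowed from the earlier example, and the three schemes are arbitrary choices for illustration.

```python
from math import comb

def loss_probability(m: int, n: int, p: float) -> float:
    """Probability that more than n of m + n fragments fail (independent failures)."""
    total = m + n
    return sum(comb(total, i) * p**i * (1 - p)**(total - i)
               for i in range(n + 1, total + 1))

p = 0.005
schemes = [(4, 2), (8, 4), (16, 8)]    # all have n / m = 50% overhead
results = {(m, n): loss_probability(m, n, p) for m, n in schemes}

for (m, n), prob in results.items():
    print(f"{m} + {n}: overhead {n / m:.0%}, loss probability {prob:.2e}")
```

Each doubling drops the loss probability by several orders of magnitude while the disk usage stays the same.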

Page 75: Erasure Coding Costs and Benefits

Keeping track of fragments

Some system needs to keep track of where to find EC fragments.

Want database in memory, so must be on the order of GB in size.

Kilobytes of data per fragment ⇒ millions of fragments.

In practice, EC systems have on the order of millions of fragments.

This places an upper limit on m + n and a lower limit on fragment size.
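
A back-of-the-envelope version of this argument, reading "kilobytes of data per fragment" as the bookkeeping the database holds for each fragment. The 4 GB index, 1 KB per fragment, and 1 PB of stored data are assumptions chosen only to match the orders of magnitude on the slide.

```python
index_bytes = 4 * 10**9              # assumed in-memory database size (4 GB)
metadata_per_fragment = 10**3        # assumed bookkeeping per fragment (1 KB)
max_fragments = index_bytes // metadata_per_fragment

stored_bytes = 10**15                # assumed total stored data, parity included (1 PB)
min_fragment_bytes = stored_bytes / max_fragments

print(f"at most {max_fragments:,} fragments")
print(f"fragments must average at least {min_fragment_bytes / 10**6:.0f} MB")
```

With the fragment count capped, a larger m + n means more fragments per object, so fewer objects fit; smaller fragments have the same effect.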

Page 81: Erasure Coding Costs and Benefits

Lost versus unavailable data

Data Loss (DL): You’re never getting the data back.

Data unavailability (DU): The data is recoverable, but you can’t get to it right now.

DL = permanently unavailable

DU = temporarily unavailable

Page 86: Erasure Coding Costs and Benefits

Dying versus being dead

If a disk can be replaced in time, it’s as if it came back to life.

Data loss happens when too many disks are dead at the same time.

Probability of (eventual) failure is 1.

Probability of being in dead state is most important.

Note that probability of a disk being dead depends on replacement time, not just disk MTBF.
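
One common way to make this concrete is a steady-state two-state model, in which a disk alternates between alive and dead and the long-run fraction of time spent dead is (replacement time) / (MTBF + replacement time). The model and the figures below are assumptions for illustration, not taken from the talk.

```python
def dead_probability(mtbf_hours, replace_hours):
    """Long-run fraction of time a disk spends dead (two-state model)."""
    return replace_hours / (mtbf_hours + replace_hours)

mtbf = 100_000                               # assumed MTBF in hours
fast = dead_probability(mtbf, 24)            # replaced within a day
slow = dead_probability(mtbf, 24 * 21)       # replaced within three weeks

print(f"1-day replacement:  {fast:.5f}")
print(f"3-week replacement: {slow:.5f}")
```

The same disk is roughly twenty times more likely to be found dead under the slower replacement schedule, even though its MTBF has not changed.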

Page 92: Erasure Coding Costs and Benefits

Allocating disks to data centers

Location of disks matters for DU more than DL.

Very unlikely that all disks in a data center will crash simultaneously, but data centers go offline.

Common to have a replica in every data center.

Companies usually have fewer than m + n data centers, so some fragments are co-located.

EC fragments have correlated risk of unavailability (and failure, to a lesser extent).
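
The effect of co-location can be checked with a small calculation. The sketch below assumes fragments are spread as evenly as possible across data centers and that an object stays readable as long as at least m fragments are reachable. The helper names are hypothetical.

```python
def fragments_per_center(m, n, centers):
    """Spread m + n fragments as evenly as possible over the data centers."""
    base, extra = divmod(m + n, centers)
    return [base + 1] * extra + [base] * (centers - extra)

def survives_one_outage(m, n, centers):
    """True if the object is still readable when the fullest data center is offline."""
    worst = max(fragments_per_center(m, n, centers))
    return (m + n) - worst >= m

for centers in (2, 3, 4):
    layout = fragments_per_center(8, 3, centers)
    print(centers, layout, survives_one_outage(8, 3, centers))
```

Under these assumptions an 8 + 3 object spread over three data centers becomes unavailable when a data center holding four fragments goes offline; four data centers are needed to ride out a single outage.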

Page 98: Erasure Coding Costs and Benefits

Latency

In replicated system, object returned from one data center.

In EC system, may need to combine fragments from two data centers to reconstruct object.

Recoverable failure may require EC to access fragments further away than normal read.
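
A toy model of the difference, ignoring decode time and assuming fragments are fetched in parallel: a replicated read finishes when the nearest copy arrives, while an EC read finishes when the m-th fastest fragment arrives. The three sites and their latencies are made up for illustration.

```python
def replicated_read_ms(replica_latencies):
    """A replicated read is served by the nearest copy."""
    return min(replica_latencies)

def ec_read_ms(fragment_latencies, m):
    """An EC read completes when the m-th fastest fragment has arrived."""
    return sorted(fragment_latencies)[m - 1]

LOCAL, NEAR, FAR = 5, 30, 80                 # assumed round-trip times in ms

replica_read = replicated_read_ms([LOCAL, NEAR, FAR])

healthy = [LOCAL] * 4 + [NEAR] * 4 + [FAR] * 3    # 8 + 3 fragments over three sites
normal_read = ec_read_ms(healthy, m=8)

degraded = [LOCAL] * 3 + [NEAR] * 4 + [FAR] * 3   # one local fragment lost
degraded_read = ec_read_ms(degraded, m=8)

print(replica_read, normal_read, degraded_read)
```

In this model the loss of a single local fragment is fully recoverable, yet it pushes the read out to the farthest site.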

Page 102: Erasure Coding Costs and Benefits

Recovery

EC systems have a lower probability of catastrophic failure but a higher probability of recoverable failure.

For example, assume disks have 0.995 probability of being alive.

Triple replication and 8+3 EC have comparable probability of data loss.

But there is a 1.5% chance of recoverable failure in the former and 5.36% in the latter.
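
These percentages follow from the independent-failure model, taking a recoverable failure to mean that at least one disk is dead but the object can still be reconstructed. A sketch, with illustrative function names:

```python
from math import comb

def failure_count_probability(total, k, p):
    """P(exactly k of `total` independent disks are dead)."""
    return comb(total, k) * p**k * (1 - p)**(total - k)

def recoverable_probability(total, tolerated, p):
    """P(between 1 and `tolerated` disks are dead): degraded but recoverable."""
    return sum(failure_count_probability(total, k, p)
               for k in range(1, tolerated + 1))

p = 0.005                                        # probability a disk is dead
replication = recoverable_probability(3, 2, p)   # triple replication
ec = recoverable_probability(11, 3, p)           # 8 + 3 erasure coding

print(f"replication: {replication:.2%}, 8 + 3 EC: {ec:.2%}")
```

The EC figure is larger simply because eleven disks are involved rather than three, so it is more likely that at least one of them is dead at any given moment.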

Page 107: Erasure Coding Costs and Benefits

Local reconstruction codes

Microsoft Azure 6 + 2 + 2 example system: 3 data disks, 1 local parity disk, and 1 global parity disk in each of two data centers.

Can recover from any single disk failure locally.

Can recover from the loss of any 3 disks.

Can recover from the loss of 4 disks in 86% of cases.

More reliable than 6 + 3, but not quite as reliable as 6 + 4.

Actual Azure implementation uses 12 + 2 + 2.
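
The 3-disk and 4-disk figures can be reproduced by counting, assuming the code recovers every failure pattern that equation counting allows (the property the Azure paper calls maximally recoverable). In the sketch, each local parity equation can repair one lost fragment in its own group, and each global parity equation can repair one more anywhere. The fragment labels are hypothetical.

```python
from itertools import combinations

GROUP_X = {"x0", "x1", "x2", "px"}    # 3 data fragments + local parity
GROUP_Y = {"y0", "y1", "y2", "py"}    # 3 data fragments + local parity
GLOBALS = {"p0", "p1"}                # 2 global parities
ALL = sorted(GROUP_X | GROUP_Y | GLOBALS)

def decodable(failed):
    """Equation-counting test for a 6 + 2 + 2 failure pattern."""
    failed = set(failed)
    fx = len(failed & GROUP_X)
    fy = len(failed & GROUP_Y)
    fg = len(failed & GLOBALS)
    # Each local equation absorbs one failure in its group; what is left,
    # plus any lost global parities, must fit within the two global equations.
    leftover = max(fx - 1, 0) + max(fy - 1, 0) + fg
    return leftover <= len(GLOBALS)

def fraction_decodable(k):
    """Fraction of k-fragment failure patterns that can be decoded."""
    patterns = list(combinations(ALL, k))
    return sum(decodable(pat) for pat in patterns) / len(patterns)

print(fraction_decodable(3), fraction_decodable(4))
```

Under this counting rule 180 of the 210 four-failure patterns are decodable, which is the 86% quoted on the slide; the 30 exceptions concentrate too many losses in one local group together with the global parities.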

Page 113: Erasure Coding Costs and Benefits

Summary

EC implemented via Reed-Solomon codes, finite fields.

EC reduces hardware costs and increases reliability.

In theory, overhead and probability of data loss → 0 as m + n → ∞.

In practice, m + n is usually on the order of 10.

EC decreases probability of data loss, increases probability of recoverable failure.

EC can be vulnerable to data center availability.

EC increases latency.

Page 121: Erasure Coding Costs and Benefits

Resources

James Westall and James Martin. An Introduction to Galois Fields and Reed-Solomon Coding. http://bit.ly/1oL4NRt

Cheng Huang et al. Erasure Coding in Windows Azure Storage. http://bit.ly/ZPISui

John D. Cook, Robert Primmer, Ab de Kwant. Comparing Cost and Performance of Replication and Erasure Coding. To appear in Hitachi Review. http://bit.ly/1eyvfu9

Contact info: http://JohnDCook.com
