erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite...

22
Ethan L. Miller Symantec Presidential Chair in Storage & Security Center for Research in Storage Systems University of California, Santa Cruz The Basics of Erasure Codes for Archival Storage

Transcript of erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite...

Page 1: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

Ethan L. MillerSymantec Presidential Chair in Storage & Security

Center for Research in Storage SystemsUniversity of California, Santa Cruz

The Basics of Erasure Codesfor Archival Storage

Page 2: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S1 S2S0

Protecting archival data

2

❖ Mirroring: keep multiple copies• All copies are identical• Read any copy• Write all copies

❖ Erasure codes: use multiple “chunks” for redundancy• Read from desired chunk(s)• Write is more complex

• Write an entire stripe, or• Read-Modify-Write to update data

and parity• Lower storage overhead!

D0

D1

D2

D3

D0

D1

D2

D3

D1

D0

D3

D2

S0

D0

D4

S1

D1

P4

S2

D2

D5

S3

D3

Q4

S4

P0

D6

S5

Q0

D7

Page 3: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S1 S2S0

Protecting archival data

2

❖ Mirroring: keep multiple copies• All copies are identical• Read any copy• Write all copies

❖ Erasure codes: use multiple “chunks” for redundancy• Read from desired chunk(s)• Write is more complex

• Write an entire stripe, or• Read-Modify-Write to update data

and parity• Lower storage overhead!

D0

D1

D2

D3

D0

D1

D2

D3

D1

D0

D3

D2

S0

D0

D4

S1

D1

P4

S2

D2

D5

S3

D3

Q4

S4

P0

D6

S5

Q0

D7

Page 4: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S1 S2S0

Protecting archival data

2

❖ Mirroring: keep multiple copies• All copies are identical• Read any copy• Write all copies

❖ Erasure codes: use multiple “chunks” for redundancy• Read from desired chunk(s)• Write is more complex

• Write an entire stripe, or• Read-Modify-Write to update data

and parity• Lower storage overhead!

D0

D1

D2

D3

D0

D1

D2

D3

D1

D0

D3

D2

S0

D0

D4

S1

D1

P4

S2

D2

D5

S3

D3

Q4

S4

P0

D6

S5

Q0

D7

Page 5: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S1 S2S0

Protecting archival data

2

❖ Mirroring: keep multiple copies• All copies are identical• Read any copy• Write all copies

❖ Erasure codes: use multiple “chunks” for redundancy• Read from desired chunk(s)• Write is more complex

• Write an entire stripe, or• Read-Modify-Write to update data

and parity• Lower storage overhead!

D0

D1

D2

D3

D0

D1

D2

D3

D1

D0

D3

D2

S0

D0

D4

S1

D1

P4

S2

D2

D5

S3

D3

Q4

S4

P0

D6

S5

Q0

D7

Page 6: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S1 S2S0

Protecting archival data

2

❖ Mirroring: keep multiple copies• All copies are identical• Read any copy• Write all copies

❖ Erasure codes: use multiple “chunks” for redundancy• Read from desired chunk(s)• Write is more complex

• Write an entire stripe, or• Read-Modify-Write to update data

and parity• Lower storage overhead!

D0

D1

D2

D3

D0

D1

D2

D3

D1

D0

D3

D2

S0

D0

D4

S1

D1

P4

S2

D2

D5

S3

D3

Q4

S4

P0

D6

S5

Q0

D7

Page 7: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S0 S1 S2 S3 S4 S5

How do erasure codes work?

3

❖ Error correcting code: find the error and rebuild it❖ Erasure correcting code: given the broken (erased location), rebuild it

• Most archival storage systems are this type• Use hashing to figure out which chunks are broken

❖ Erasure correcting codes use linear equations to rebuild missing data • Each symbol in the “row” has its own equation• n data values & m erasure correcting symbols• n data values ➡ need n equations to rebuild

D0 D1 D2 D3 P Q

D0 = D0D1 = D1D2 = D2D3 = D3D0 ⊕ D1 ⊕ D2 ⊕ D3 = PD0 ⊕ 2⊗D1 ⊕ 4⊗D2 ⊕ 8⊗D3 = QStripe

Page 8: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S0 S1 S2 S3 S4 S5

How do erasure codes work?

3

❖ Error correcting code: find the error and rebuild it❖ Erasure correcting code: given the broken (erased location), rebuild it

• Most archival storage systems are this type• Use hashing to figure out which chunks are broken

❖ Erasure correcting codes use linear equations to rebuild missing data • Each symbol in the “row” has its own equation• n data values & m erasure correcting symbols• n data values ➡ need n equations to rebuild

D0 D1 D2 D3 P Q

D0 = D0D1 = D1D2 = D2D3 = D3D0 ⊕ D1 ⊕ D2 ⊕ D3 = PD0 ⊕ 2⊗D1 ⊕ 4⊗D2 ⊕ 8⊗D3 = QStripe

Page 9: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S0 S1 S2 S3 S4 S5

How do erasure codes work?

3

❖ Error correcting code: find the error and rebuild it❖ Erasure correcting code: given the broken (erased location), rebuild it

• Most archival storage systems are this type• Use hashing to figure out which chunks are broken

❖ Erasure correcting codes use linear equations to rebuild missing data • Each symbol in the “row” has its own equation• n data values & m erasure correcting symbols• n data values ➡ need n equations to rebuild

D0 D1 D2 D3 P Q

D0 = D0D1 = D1D2 = D2D3 = D3D0 ⊕ D1 ⊕ D2 ⊕ D3 = PD0 ⊕ 2⊗D1 ⊕ 4⊗D2 ⊕ 8⊗D3 = Q

Solve for D1 and D3

Stripe

Page 10: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

❖ Erasure codes rely on finite fields (also called Galois fields)• Add and multiply defined on fixed-width elements (usually 8 or 16 bits for

erasure codes)• Normal “arithmetic” rules apply

❖ This math can be done very quickly• Addition is XOR• Multiplication is more complex, but runs at gigabytes per second on

modern CPUs❖ Number of multiplications is usually the limiting factor

• There are usually shortcuts for creating the original symbols• Rebuilding missing symbols (data) usually needs more multiplications

The math behind erasure codes

4

Page 11: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

❖ Long-term survival of data depends on several factors• How many failures the erasure code can survive• How much data is impacted by a failure• How long it takes to restore protection after a failure

❖ Systems can vary any of these parameters to decrease the likelihood of data loss• Fast rebuild ➡ less likely that too many failures will accumulate• Tolerate more failures ➡ can rebuild more slowly and still be safe• Decluster (spread data around) ➡ multiple independent failures unlikely to

place large amounts of data in jeopardy❖ Estimates of data reliability based on

• Analytical modeling• Simulations

Reliability and erasure codes

5

Page 12: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

❖ Erasure codes can differ in many ways• Number of data storage devices in a “stripe”

• Often, different stripes span different subsets of storage devices• Number and type of failures that can be tolerated• Amount of overhead• Number of devices that must be written (or read) for

• Normal case• Single device failure• Larger-scale failures

• Complexity of generating the parity symbols• Complexity of rebuilding missing data from surviving information

❖ Evaluating erasure codes is all about tradeoffs• If there were such a thing as a perfect erasure code, we’d all use it!• Choice of erasure code depends on what you want from it

Evaluating erasure codes

6

Page 13: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

❖ Reed-Solomon is the most common erasure code• RAID-5 (N+1 parity) is a special case of Reed-Solomon• RAID-6 (P+Q parity) is also a type of Reed-Solomon

❖ Reed-Solomon can be generated very quickly for 1–2 parities• XOR is essentially• Multiplication by 2 is extremely fast, and is all that’s needed for Q

❖ RS with more parity is a bit slower• Each successive parity symbol is a linear combination of all of the data

symbols in the stripe❖ Rebuilding missing data is more expensive

• Invert an N×N matrix (once, so not so bad)• Rebuild missing symbol using dot product of N existing symbols and one

row of the matrix

Reed-Solomon codes

7

Page 14: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S7

S6

S5S1 S2 S3 S4S0

XOR-based codes

8

❖ XOR is cheap ➡ just use XOR!❖ Typically use bigger basic

chunks on each device• Example: run XORs across and

diagonally❖ There can be arbitrarily complex

approaches to this• Paper on STAIR codes in FAST

2014• Survive combinations of failed

devices and failed individual chunks

• May combine XOR and multiplication

D0 D4D1 D2

D5

D3

D6 D7 D8 D9

D10 D11 D12 D13 D14

D15 D16 D17 D18 D19

D20 D21 D22 D23 D24

P0

P1

P2

P3

P4

Q0 Q1 Q2 Q3 Q4

Page 15: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S7

S6

S5S1 S2 S3 S4S0

XOR-based codes

8

❖ XOR is cheap ➡ just use XOR!❖ Typically use bigger basic

chunks on each device• Example: run XORs across and

diagonally❖ There can be arbitrarily complex

approaches to this• Paper on STAIR codes in FAST

2014• Survive combinations of failed

devices and failed individual chunks

• May combine XOR and multiplication

D0 D4D1 D2

D5

D3

D6 D7 D8 D9

D10 D11 D12 D13 D14

D15 D16 D17 D18 D19

D20 D21 D22 D23 D24

P0

P1

P2

P3

P4

Q0 Q1 Q2 Q3 Q4

R0

Page 16: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S7

S6

S5S1 S2 S3 S4S0

XOR-based codes

8

❖ XOR is cheap ➡ just use XOR!❖ Typically use bigger basic

chunks on each device• Example: run XORs across and

diagonally❖ There can be arbitrarily complex

approaches to this• Paper on STAIR codes in FAST

2014• Survive combinations of failed

devices and failed individual chunks

• May combine XOR and multiplication

D0 D4D1 D2

D5

D3

D6 D7 D8 D9

D10 D11 D12 D13 D14

D15 D16 D17 D18 D19

D20 D21 D22 D23 D24

P0

P1

P2

P3

P4

Q0 Q1 Q2 Q3 Q4

R0 R1

Page 17: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S7

S6

S5S1 S2 S3 S4S0

XOR-based codes

8

❖ XOR is cheap ➡ just use XOR!❖ Typically use bigger basic

chunks on each device• Example: run XORs across and

diagonally❖ There can be arbitrarily complex

approaches to this• Paper on STAIR codes in FAST

2014• Survive combinations of failed

devices and failed individual chunks

• May combine XOR and multiplication

D0 D4D1 D2

D5

D3

D6 D7 D8 D9

D10 D11 D12 D13 D14

D15 D16 D17 D18 D19

D20 D21 D22 D23 D24

P0

P1

P2

P3

P4

Q0 Q1 Q2 Q3 Q4

R0 R1 R2 R3 R4

Page 18: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S4S1 S2 S3S0

Hierarchical (pyramid, LRC) codes

9

❖ Differential RAID coverage• N+1 RAID (usually) for smallish

groups• More parity covering multiple

N+1 RAID groups❖ Fast recovery from small

failures❖ Ability to recover (more slowly)

from larger-scale failures• Additional parity uses multiplication• Additional parity is across a larger

set of data elements❖ Survives more failures with

lower overhead

D0 D1 D2 D3

D4 D5 D6 D7

P0

P1

D8 D9 D10 D12 P2

Page 19: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

S4S1 S2 S3S0

Hierarchical (pyramid, LRC) codes

9

❖ Differential RAID coverage• N+1 RAID (usually) for smallish

groups• More parity covering multiple

N+1 RAID groups❖ Fast recovery from small

failures❖ Ability to recover (more slowly)

from larger-scale failures• Additional parity uses multiplication• Additional parity is across a larger

set of data elements❖ Survives more failures with

lower overhead

D0 D1 D2 D3

D4 D5 D6 D7

P0

P1

Q R

D8 D9 D10 D12 P2

Distributed around the system

Page 20: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

❖ Combine multiple data symbols on a single device ➡ rebuilding requires reading fewer devices• Less network traffic for rebuilding• Still survives same number of failures• May need to read more from a device just to return data• Need to read more from each device, in most cases!

❖ Can require more multiplications for most operations• Basic (non-failure) data reads• Reconstruction

❖ May be useful in deep archive: access fewer devices to rebuild• Performance of reads is often less important

Regenerating codes

10

Page 21: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

❖ Common cases:• How long does a “regular” data read take?• How long does it take if a device has failed and the data is on the device?• How much does rebuilding impact performance even for non-affected data?

❖ How is data distributed across the devices?• Typically declustered: each stripe uses a different set of devices

❖ How many device failures can the erasure code withstand?• How long does it take to rebuild all of the data from a lost device, and where

does it get rebuilt?• How likely is it that data will be lost, and how much data would be lost?

❖ Is the bottleneck due to computation, network or I/O?• Typically, computation is easiest to overcome• I/O is often the hardest, especially for archival systems

❖ There are a lot more variations on erasure codes than we could cover today!

So what should you ask?

11

Page 22: erasure codes basics - National Digital Information ... Code... · Erasure codes rely on finite fields (also called Galois fields) • Add and multiply defined on fixed-width

Questions?

12

We’re happy to answer questions about erasure codes!

[email protected]

http://www.ssrc.ucsc.edu/