Network Coding for Distributed Storage Systems

1


IEEE TRANSACTIONS ON INFORMATION THEORY, SEPTEMBER 2010

Alexandros G. Dimakis, Brighten Godfrey, Yunnan Wu, Martin J. Wainwright, Kannan Ramchandran

2

Outline

• Introduction
• Background
• Analysis
• Evaluation
• Conclusion

3

Introduction

• Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes.

• Storing data in distributed storage systems
  - The encoded data are spread across nodes.
  - Erasure coding requires less redundancy than replication.
  - Stored data must be replaced periodically.

4

Introduction

• Key issues in distributed storage systems
  - Repair bandwidth
  - Storage space

• How can encoded data be generated in a distributed way while transferring as little data as possible?

5

MDS Codes

• A common practice for repairing a single node failure in an erasure-coded system:
  1. A new node reconstructs the whole encoded data object.
  2. It then generates just one encoded block.

• Maximum Distance Separable (MDS) codes
  - (n, k)-MDS property
  - The original file can be recovered from any k of the n encoded blocks.
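The (n, k)-MDS property can be illustrated with a small sketch. It assumes a Reed-Solomon-style construction over the prime field GF(257); the field size and the names `encode` and `decode` are choices made for this illustration only and are not taken from the slides. The k data symbols are used as polynomial coefficients, each node stores one evaluation of the polynomial, and any k evaluations recover the coefficients by Lagrange interpolation.

```python
P = 257  # small prime field, chosen only for this illustration


def encode(data, n):
    """Use the k data symbols as polynomial coefficients; evaluate at x = 1..n."""
    return [(x, sum(c * pow(x, j, P) for j, c in enumerate(data)) % P)
            for x in range(1, n + 1)]


def decode(shares, k):
    """Recover the k coefficients from any k (x, y) shares by Lagrange interpolation."""
    shares = shares[:k]
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(shares):
        basis = [1]  # coefficients of the i-th basis polynomial, lowest degree first
        denom = 1
        for j, (xj, _) in enumerate(shares):
            if j == i:
                continue
            # Multiply the basis polynomial by (x - xj).
            basis = [(a - xj * b) % P for a, b in zip([0] + basis, basis + [0])]
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, -1, P) % P  # modular inverse needs Python 3.8+
        for deg, b in enumerate(basis):
            coeffs[deg] = (coeffs[deg] + scale * b) % P
    return coeffs
```

For example, `encode([42, 7], 4)` produces four shares, and `decode` applied to any two of them returns `[42, 7]`.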

6

MDS Codes

[Figure: the file is divided into k pieces of size M/k, MDS-encoded, and the encoded pieces are stored at n nodes.]

7

Introduction

• Redundancy must be continually refreshed as nodes fail in distributed storage systems.
  - This causes large data transfers across the network.

8

Introduction

• Erasure codes can be repaired without communicating the whole data object.

• (4, 2)-MSR example when a node fails:
  - Each surviving node generates a smaller parity packet from its data.
  - The surviving nodes forward these packets to the newcomer.
  - The newcomer mixes the packets to generate two new packets.

[Figure: repair in the (4, 2) example; each packet in the diagram is labeled 0.5.]

9

Introduction

• This paper identifies an optimal tradeoff curve between storage and repair bandwidth.
  - Smaller storage space => less redundancy => more repair bandwidth

• This paper calls codes that lie on this optimal tradeoff curve regenerating codes.

10

Introduction

• Minimum-Storage Regenerating (MSR) codes
  - They can be repaired efficiently.

• Minimum-Bandwidth Regenerating (MBR) codes
  - Each storage node stores slightly more than M/k.
  - In return, the repair bandwidth can be reduced.

11

Outline


12

Erasure Codes

• Classical coding theory focuses on the tradeoff between redundancy and error tolerance.

• In terms of the redundancy-reliability tradeoff, Maximum Distance Separable (MDS) codes are optimal.
  - The best-known example is Reed-Solomon codes.

13

Network Coding

• Network coding allows
  - intermediate nodes to generate output data by encoding previously received input data.
  - information to be "mixed" at intermediate nodes.

• This paper investigates the application of network coding to the repair problem in distributed storage.
  - Tradeoff between storage and repair network bandwidth

14

Distributed Storage Systems

• Erasure codes can reduce bandwidth use by an order of magnitude compared with replication.

• Hybrid strategy:
  - One special storage node maintains one full replica.
  - The other nodes store multiple erasure-encoded fragments.
  - The replica node transfers only M / k bytes to create a new encoded fragment.
  - The scheme has a problem when the replica is lost.

15

Outline


16

Information Flow Graph

17

Storage-Bandwidth Tradeoff

• Maintaining the normal redundancy requires n active storage nodes.
  - Each node stores α bits.
  - A newcomer downloads β bits each from any d surviving nodes.
  - The total repair bandwidth is γ = d β.

• For each set of parameters (n, k, d, α, γ), there is a family of information flow graphs, each of which corresponds to a particular evolution of node failures / repairs.

18


• Denote this family of directed acyclic graphs by G(n, k, d, α, γ).

• The tuple (n, k, d, α, γ) = (4, 2, 3, 1 Mb, 1.5 Mb) is feasible.
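The feasibility of a tuple can be checked numerically with the min-cut bound from the paper: a point is feasible when the sum over i = 0..k−1 of min{(d − i) β, α} is at least the file size M, with β = γ / d. The sketch below assumes a file size of M = 2 Mb for the example tuple, which is inferred from α = M / k = 1 Mb with k = 2; the function names are illustrative.

```python
def min_cut_bound(k, d, alpha, gamma):
    """Lower bound on the min-cut to a data collector that contacts k nodes."""
    beta = gamma / d  # each of the d helper nodes sends beta to a newcomer
    return sum(min((d - i) * beta, alpha) for i in range(k))


def is_feasible(M, k, d, alpha, gamma, tol=1e-9):
    """A point is feasible when the min-cut bound is at least the file size M."""
    return min_cut_bound(k, d, alpha, gamma) + tol >= M


# Example tuple from the slide, with M = 2 Mb assumed.
example_ok = is_feasible(M=2.0, k=2, d=3, alpha=1.0, gamma=1.5)
```

Lowering γ to 1.2 Mb while keeping α = 1 Mb gives a bound of 1.8 Mb, which is below M, so that point is not feasible.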

19


• Theorem 1: For any α ≥ α*(n, k, d, γ), the points (n, k, d, α, γ) are feasible.
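For reference, the threshold function α* in Theorem 1 has the following closed form in the paper by Dimakis et al., where M is the file size. This block is added as a reference sketch and is not a reproduction of the slide.

```latex
\alpha^{*}(n,k,d,\gamma) =
\begin{cases}
\dfrac{M}{k}, & \gamma \in [f(0), +\infty), \\[2ex]
\dfrac{M - g(i)\,\gamma}{k - i}, & \gamma \in [f(i), f(i-1)), \quad i = 1, \dots, k-1,
\end{cases}
\qquad
f(i) = \frac{2Md}{(2k - i - 1)\,i + 2k(d - k + 1)}, \quad
g(i) = \frac{(2d - 2k + i + 1)\,i}{2d}.
```

As a check against the (4, 2, 3, 1 Mb, 1.5 Mb) example with M = 2 Mb assumed: f(0) = 12/8 = 1.5, so γ = 1.5 Mb gives α* = M/k = 1 Mb.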

20

Theorem Proof (1/4)

21

Theorem Proof (2/4)


22

Theorem Proof (3/4)


23

Theorem Proof (4/4)


24


• Code repair can be achieved if and only if the underlying information flow graph has sufficiently large min-cuts.

25


• Optimal tradeoff curve between storage α and repair bandwidth γ
  - (γ = 1, α = 0.2), (γ = 1, α = 0.1)
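The tradeoff curve can be traced with the closed form of α* from the paper. The sketch below assumes M = 1, k = 5 and d = 9 (for instance n = 10 with d = n − 1). These values are an assumption, chosen so that M / k matches the α = 0.2 of the first point above. The function names are illustrative.

```python
def f(i, M, k, d):
    """Breakpoints of the piecewise-linear tradeoff curve."""
    return 2 * M * d / ((2 * k - i - 1) * i + 2 * k * (d - k + 1))


def g(i, k, d):
    """Slope factor of the i-th linear segment."""
    return (2 * d - 2 * k + i + 1) * i / (2 * d)


def alpha_star(M, k, d, gamma):
    """Minimum storage per node for repair bandwidth gamma, or None if infeasible."""
    if gamma >= f(0, M, k, d):
        return M / k
    for i in range(1, k):
        if f(i, M, k, d) <= gamma < f(i - 1, M, k, d):
            return (M - g(i, k, d) * gamma) / (k - i)
    return None  # gamma is below f(k - 1): no storage size makes repair possible


M, k, d = 1.0, 5, 9
gamma_msr = f(0, M, k, d)      # smallest gamma that still allows alpha = M / k
gamma_mbr = f(k - 1, M, k, d)  # smallest feasible gamma overall
```

Under these assumptions, downloading the whole file (γ = 1) needs α = 0.2, but the same α = 0.2 is already reachable at γ = 0.36, and the minimum-bandwidth end of the curve sits at γ = α = 18/70 ≈ 0.257.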

26

Special Cases (1/2)

• Minimum-Storage Regenerating (MSR) Codes
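For reference, the minimum-storage end of the tradeoff curve, as given in the paper by Dimakis et al. (M is the file size, d the number of helper nodes per repair), is:

```latex
(\alpha_{\mathrm{MSR}}, \gamma_{\mathrm{MSR}})
  = \left( \frac{M}{k},\; \frac{M d}{k\,(d - k + 1)} \right)
```

For the (4, 2) example with d = 3 and M = 2 Mb assumed, this gives α = 1 Mb and γ = 1.5 Mb.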

27

Special Cases (2/2)

• Minimum-Bandwidth Regenerating (MBR) Codes
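For reference, the minimum-bandwidth end of the tradeoff curve, as given in the paper by Dimakis et al., is:

```latex
(\alpha_{\mathrm{MBR}}, \gamma_{\mathrm{MBR}})
  = \left( \frac{2 M d}{2 k d - k^{2} + k},\; \frac{2 M d}{2 k d - k^{2} + k} \right)
```

At this point a newcomer downloads exactly as much as it stores. With k = 2, d = 3 and M = 2 Mb assumed, both values are 1.2 Mb.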

28

Outline

• Introduction
• Background
• Analysis
• Evaluation
  - Node Dynamics and Objectives
  - Model
  - Quantitative Results
• Conclusion

29

Node Dynamics and Objectives (1/2)

• A permanent failure
  - The permanent departure of a node from the system
  - A disk failure resulting in loss of the data stored on the node

• A transient failure
  - Node reboot
  - Temporary network disconnection

30

Node Dynamics and Objectives (2/2)

• A file is available
  - if it can be reconstructed from the data stored on currently available nodes.

• A file is durable
  - if, after permanent node failures, it may be available at some point in the future.

31

Model (1/5)

• The model has two key parameters, f and a.
  - A fraction f of the nodes storing file data fail permanently per unit time.
  - At any given time, a node storing data is available with some probability a.

• The expected availability and maintenance bandwidth of various redundancy schemes for maintaining a file of M bytes can then be computed.

32

Model (2/5)

• Replication
  - Redundancy: R replicas
  - Stores R M bytes in total
  - Replaces f R M bytes per unit time
  - The file is unavailable if no replica is available.
    - Probability: (1 − a)^R

• Ideal Erasure Codes
  - n = k R, redundancy R = n / k
  - Transfers just M / k bytes per packet
  - Replaces f R M bytes per unit time
  - Unavailability probability: Σ_{i=0}^{k−1} C(n, i) a^i (1 − a)^(n−i), the probability that fewer than k of the n packets are available

33

Model (3/5)

• Hybrid
  - n = k (R − 1)
  - Stores R M bytes in total
  - Transfers f R M bytes per unit time
  - The file is unavailable if the replica is unavailable and fewer than k erasure-coded packets are available.
    - Probability: (1 − a) · Σ_{i=0}^{k−1} C(n, i) a^i (1 − a)^(n−i)
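The unavailability probabilities of replication, ideal erasure codes and the hybrid scheme can be sketched as follows. The sketch assumes that nodes are available independently, each with probability a; the function names are illustrative.

```python
from math import comb


def unavail_replication(a, R):
    """No replica is up: each of the R replicas is down with probability 1 - a."""
    return (1 - a) ** R


def unavail_erasure(a, n, k):
    """Fewer than k of the n packets are up (lower tail of a binomial)."""
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(k))


def unavail_hybrid(a, n, k):
    """The replica is down and fewer than k erasure-coded packets are up."""
    return (1 - a) * unavail_erasure(a, n, k)
```

For instance, with a = 0.5, two replicas are unavailable with probability 0.25, while a (4, 2) erasure code is unavailable with probability 5/16.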

34

Model (4/5)

• Minimum-Storage Regenerating Codes
  - Store R M bytes in total
  - Redundancy R = n / k
  - Replace f R M bytes per unit time
  - Extra amount of information
  - Unavailability

35

Model (5/5)

• Minimum-Bandwidth Regenerating Codes
  - Store M n bytes in total
  - Redundancy R = n / k
  - Replace f M n bytes per unit time
  - Extra amount of information
  - Unavailability
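The extra amount of information for the two regenerating schemes can be estimated from the MSR and MBR points of the tradeoff curve. The sketch below assumes that every repair contacts d = n − 1 helper nodes and that f n nodes are repaired per unit time. The factor names and function names are hypothetical and chosen for this illustration.

```python
def msr_repair_factor(n, k):
    """gamma_MSR / (M / k) with d = n - 1: download per repair vs. fragment size."""
    return (n - 1) / (n - k)


def mbr_factor(n, k):
    """alpha_MBR / (M / k) with d = n - 1: fragment growth, also download per repair."""
    return 2 * (n - 1) / (2 * n - k - 1)


def msr_bandwidth(f, M, n, k):
    """Bytes transferred per unit time when f * n MSR nodes are repaired."""
    return f * n * (M / k) * msr_repair_factor(n, k)


def mbr_bandwidth(f, M, n, k):
    """Bytes transferred per unit time when f * n MBR nodes are repaired."""
    return f * n * (M / k) * mbr_factor(n, k)
```

For n = 4 and k = 2 the factors are 1.5 and 1.2. They match the 1.5 Mb repair download for 1 Mb fragments in the earlier example and an MBR fragment of 1.2 Mb, if M = 2 Mb is assumed.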

36

Estimating f and a

37

Quantitative Results (1/2)

38

Quantitative Results (2/2)

39

Quantitative Comparison

• Comparison with Hybrid
  - Disadvantage: asymmetric design

• MBR codes
  - Disadvantage:
    - Reconstructing the entire file requires communication with n − 1 nodes.
    - If the reading frequency of a file is sufficiently high and k is sufficiently small, this inefficiency could become unacceptable.

40

Outline


41

Conclusion

• This paper presented a general theoretic framework that can determine the information that must be communicated to repair failures in encoded systems.
  - It identifies a tradeoff between storage and repair bandwidth.

• One potential application area for the proposed regenerating codes is distributed archival storage or backup.
  - Regenerating codes can potentially offer desirable tradeoffs in terms of redundancy, reliability, and repair bandwidth.
