Network Coding for Distributed Storage Systems (Group Meeting Talk)
Network Coding for Distributed Storage Systems*
Presented by Jayant Apte, ASPITRG
7/9/13 & 7/11/13
*Dimakis, A.G.; Godfrey, P.B.; Wu, Y.; Wainwright, M.J.; Ramchandran, K., "Network Coding for Distributed Storage Systems," Information Theory, IEEE Transactions on, vol.56, no.9, pp.4539-4551, Sept. 2010
Outline
● Part 1: Single-source multicast linear network coding
● Part 2: The repair problem
– Reduction of the repair problem to a single-source multicast network
– Family of single-source multicast networks arising from the reduction
– A lower bound on min-cuts (the min-cut equals the max-flow and hence bounds the coding capacity of the network)
– Minimization of storage and repair bandwidth subject to this lower bound
Some background on single source multi-cast network coding
*Koetter, R.; Medard, M., "An algebraic approach to network coding," Networking, IEEE/ACM Transactions on, vol.11, no.5, pp.782-795, Oct. 2003
Max-Flow-Min-Cut Theorem
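A minimal sketch of the theorem in action: the maximum flow from a source to a sink equals the capacity of the smallest cut separating them. The code computes max flow with the Edmonds-Karp algorithm on the classic butterfly network. The butterfly topology, the node names (S, A, B, C, D, T1, T2) and the unit capacities are assumptions chosen for illustration; they are not taken from the slides.

```python
from collections import deque, defaultdict

def max_flow(edges, s, t):
    """Edmonds-Karp max flow; edges is a list of (u, v, capacity)."""
    cap = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        cap[u][v] += c
        cap[v][u] += 0  # make sure the reverse residual edge exists
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow  # no augmenting path left, so flow is maximal
        # Walk back from t to s to collect the path edges.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
        flow += bottleneck

# Hypothetical example network: the butterfly, every link has capacity 1.
BUTTERFLY = [
    ("S", "A", 1), ("S", "B", 1),
    ("A", "T1", 1), ("A", "C", 1),
    ("B", "T2", 1), ("B", "C", 1),
    ("C", "D", 1),
    ("D", "T1", 1), ("D", "T2", 1),
]

flow_t1 = max_flow(BUTTERFLY, "S", "T1")
flow_t2 = max_flow(BUTTERFLY, "S", "T2")
```

Each sink on its own has max flow 2, matching the cut formed by the two links leaving S. Both sinks can only reach that rate at the same time if node C codes across its inputs, which is the motivation for network coding.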
Basic Network Model
Local coding coefficients
Global coding coefficients
Matrix formulation
The transfer matrix
Proof of Theorem 2
Proof of Theorem 3
Extension to multicast
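A small sketch of how local coding choices produce global coding vectors and a per-sink transfer matrix, again on the butterfly network over GF(2). The topology, the function names and the choice of GF(2) are assumptions made for illustration. The check at the end uses the criterion from the algebraic framework of Koetter and Medard that a multicast connection is feasible when every sink's transfer matrix is invertible, i.e. the product of the determinants is nonzero.

```python
from itertools import product

def butterfly_multicast(b1, b2):
    """Send source bits b1, b2 to both sinks of the butterfly network.

    Local coding: A and B forward their bit, C adds its two inputs
    over GF(2) (XOR), D forwards what it receives.
    """
    a_out = b1                     # carried on A->T1 and A->C
    b_out = b2                     # carried on B->T2 and B->C
    coded = a_out ^ b_out          # carried on C->D, D->T1, D->T2
    t1 = (a_out, coded ^ a_out)    # T1 sees (b1, b1+b2) and solves for b2
    t2 = (coded ^ b_out, b_out)    # T2 sees (b1+b2, b2) and solves for b1
    return t1, t2

# Global coding vectors (rows) of the two links entering each sink,
# written in the basis (b1, b2).
T1_MATRIX = [[1, 0], [1, 1]]
T2_MATRIX = [[0, 1], [1, 1]]

def det2_gf2(m):
    """Determinant of a 2x2 matrix over GF(2)."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % 2

multicast_feasible = det2_gf2(T1_MATRIX) * det2_gf2(T2_MATRIX) != 0
all_decoded = all(
    butterfly_multicast(b1, b2) == ((b1, b2), (b1, b2))
    for b1, b2 in product((0, 1), repeat=2)
)
```

With routing alone the single link C->D could serve only one sink per use; the XOR at C lets both sinks receive two bits per use.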
Part 2: Outline
● Introduction
● The repair problem
● Reduction of the repair problem to a single-source multicast network
● Family of single-source multicast networks arising from the reduction
● A lower bound on min-cuts (the min-cut equals the max-flow and hence bounds the coding capacity of the network)
● Minimization of storage and repair bandwidth subject to this lower bound
Distributed storage
● We are living in an internet age
● Demand for large scale data storage has increased significantly
● Social networks, file and video sharing require seamless storage, access and security for massive amounts of data
● Storage media (viz. hard drives) are individually unreliable
● Hence we introduce redundancy via the use of erasure codes to improve reliability
A storage code ((4,2) MDS)
Figure: a file is split into Fragment 1 (A1, A2) and Fragment 2 (B1, B2) and stored on four disks:
Disk 1: A1, A2
Disk 2: B1, B2
Disk 3: A1+B1, A2+B2
Disk 4: A2+B1, A1+A2+B2
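The (4,2) MDS property of the code in the figure (any two disks suffice to recover A1, A2, B1, B2) can be checked mechanically. The sketch below reads "+" as addition over GF(2), i.e. bitwise XOR; that reading is an assumption, since the slide does not name the field. Each stored block is written as a 4-bit mask over (A1, A2, B1, B2), and the helper names are hypothetical.

```python
from itertools import combinations

# One bit per source block: a stored block is the XOR of the blocks in its mask.
A1, A2, B1, B2 = 0b1000, 0b0100, 0b0010, 0b0001

DISKS = {
    1: [A1, A2],
    2: [B1, B2],
    3: [A1 ^ B1, A2 ^ B2],
    4: [A2 ^ B1, A1 ^ A2 ^ B2],
}

def gf2_rank(rows):
    """Rank over GF(2) of 4-bit row vectors, by Gaussian elimination."""
    rows = list(rows)
    rank = 0
    for bit in (0b1000, 0b0100, 0b0010, 0b0001):
        pivot = next((r for r in rows if r & bit), None)
        if pivot is None:
            continue
        rows.remove(pivot)
        # Clear this bit from every remaining row.
        rows = [r ^ pivot if r & bit else r for r in rows]
        rank += 1
    return rank

# A pair of disks can rebuild the file iff its four blocks have full rank 4.
recoverable_pairs = {
    pair: gf2_rank(DISKS[pair[0]] + DISKS[pair[1]]) == 4
    for pair in combinations(DISKS, 2)
}
is_mds = all(recoverable_pairs.values())
```

All six disk pairs give rank 4, so the file survives any two disk failures while each disk stores only half of the file.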
Problem Definition
● Storage nodes are distributed and connected in a network
● Together they represent some storage code (MDS, or approximately MDS, like LDPC)
● The issue of repairing a node arises when a storage node of the system fails
● The still-functioning nodes are called active nodes
● A newcomer node, called the repair node, must connect to a subset of active nodes, obtain information from them and reconstruct the storage code, i.e., repair the code
● The objective is to minimize the amount of information transferred in this process
Notation
The repair problem
Figure: data fragments y1, y2 are encoded into storage nodes x1, x2, x3, x4; x5 is the newcomer.
Example: A (4,2) MDS code (link labels = repair bandwidth per node)
The repair problem
● Data object (2 Mb) is divided into two fragments: y1, y2 (1 Mb each)
● 4 encoded fragments are generated: x1, x2, x3, x4 (1 Mb each)
● x4 fails; the newcomer x5 needs to communicate with the existing nodes and create a new encoded packet
● Any two out of x1, x2, x3, x5 must suffice to recover the original data object
The repair problem
● What (and how much) should x1, x2, x3 communicate to x5 such that the per-node repair bandwidths are minimized?
Figure: data fragments y1, y2; storage nodes x1, x2, x3, x4; newcomer x5.
Example 1: A (4,2) MDS code
Variants of the repair problem
● Exact repair: Failed blocks are exactly regenerated, i.e., the newcomer node must reconstruct an exact replica of the encoded block in the failed node
● Functional repair: The newly generated data block need not be an exact replica of the encoded block on the failed node
● Exact repair of the systematic part: Only the systematic part is repaired exactly, so there is always an uncoded copy of the original file available
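Functional repair example (using RLNC)
Figure (each box is 0.5 Mb):
File fragments: a1, b1, a2, b2
Encoded data blocks (two per storage node):
– a1, b1
– a2, b2
– a1+b1+a2+b2, a1+2b1+a2+2b2
– a1+2b1+3a2+b2, 3a1+2b1+2a2+3b2 (the node being repaired; it sends no repair packet)
Encoded repair packets (one per surviving node, with the link coefficients used):
– p1 = a1+2b1 (coefficients 1, 2)
– p2 = 2a2+b2 (coefficients 2, 1)
– p3 = 4a1+5b1+4a2+5b2 (coefficients 3, 1)
Repair node (mixes p1, p2, p3):
– 5a1+7b1+8a2+7b2 (coefficients 1, 2, 1)
– 6a1+9b1+6a2+6b2 (coefficients 2, 1, 1)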
Flow across this cut is the repair bandwidth
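The arithmetic of the functional repair example can be replayed in a few lines. The sketch below writes every packet as a coefficient vector over (a1, b1, a2, b2), rebuilds p1, p2, p3 and the two packets of the repair node from the coefficients in the figure, and checks that the repaired system still lets any two nodes recover the file. Working over the rationals instead of a finite field, and the helper names `combine` and `rank`, are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

def combine(coeffs, vectors):
    """Linear combination sum_i coeffs[i] * vectors[i], as a tuple."""
    return tuple(sum(c * v[j] for c, v in zip(coeffs, vectors))
                 for j in range(len(vectors[0])))

def rank(rows):
    """Rank over the rationals by Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

# Surviving nodes, basis order (a1, b1, a2, b2); two 0.5 Mb packets each.
node_a = [(1, 0, 0, 0), (0, 1, 0, 0)]
node_b = [(0, 0, 1, 0), (0, 0, 0, 1)]
node_c = [(1, 1, 1, 1), (1, 2, 1, 2)]

# Each surviving node sends ONE 0.5 Mb repair packet.
p1 = combine((1, 2), node_a)
p2 = combine((2, 1), node_b)
p3 = combine((3, 1), node_c)

# The repair node mixes the three packets into its two stored packets.
newcomer = [combine((1, 2, 1), (p1, p2, p3)),
            combine((2, 1, 1), (p1, p2, p3))]

repair_bandwidth_mb = 3 * 0.5   # three packets of 0.5 Mb cross the cut
naive_bandwidth_mb = 4 * 0.5    # downloading the whole 2 Mb file instead

system = [node_a, node_b, node_c, newcomer]
any_two_suffice = all(rank(x + y) == 4 for x, y in combinations(system, 2))
```

The repaired node differs from the one it replaces, which is what makes this functional rather than exact repair, yet the any-two-nodes property is preserved with 1.5 Mb of traffic instead of 2 Mb.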
An attempt at solution
Figure: data fragments y1, y2; storage nodes x1, x2, x3, x4; newcomer x5.
Example 1: A (4,2) MDS code
x5 recovers the original data object and creates a new independent linear combination
Can we do better than this?
YES!
Reduction to information flow graph
Example
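Figure: information flow graph with source S, storage nodes x1 to x4 each split into an in-node and an out-node (x1in/x1out, ..., x4in/x4out), newcomer x5 (x5in, x5out) and data collector DC.
Information flow graph corresponding to Example 1: A (4,2) MDS code
Node 4 has failed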
Dynamic nature of information flow graph due to given failure pattern
Figure: information flow graph with source S, storage nodes x1 to x4 (in- and out-nodes), newcomer x5 (x5in, x5out) and data collector DC.
Information flow graph corresponding to Example 1: A (4,2) MDS code
Node 4 has failed
Family of information flow graphs
Figure: the same information flow graph (source S, nodes x1 to x5 with in- and out-nodes, data collector DC), extended with a second newcomer x6 (x6in, x6out).
Information flow graph corresponding to Example 1: A (4,2) MDS code
Node 3 also failed, say a few minutes later
Lemma 1
Information flow graph
Proof
WLOG
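The lower bound on min-cuts can be checked numerically on a small instance. The sketch below assumes the bound in the form given in [1], mincut(S, DC) >= sum over i = 0..min(d,k)-1 of min((d-i)β, α), where α is the storage per node and β the amount downloaded from each of the d helper nodes. It builds one worst-case information flow graph (k successive newcomers, each helped by all earlier newcomers, and a data collector that contacts exactly those k newcomers) and compares its max flow with the formula. The graph builder, the node naming and the integer capacities are assumptions made for illustration, and d <= n-1 is assumed.

```python
from collections import deque, defaultdict

INF = 10**9  # stands in for an infinite-capacity edge

def max_flow(edges, s, t):
    """Edmonds-Karp max flow; edges is a list of (u, v, capacity)."""
    cap = defaultdict(lambda: defaultdict(int))
    for u, v, c in edges:
        cap[u][v] += c
        cap[v][u] += 0
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
        flow += bottleneck

def worst_case_edges(n, k, d, alpha, beta):
    """Information flow graph: nodes 1..k fail in turn, replaced by n+1..n+k."""
    edges = []
    for i in range(1, n + 1):
        edges.append(("S", ("in", i), INF))
        edges.append((("in", i), ("out", i), alpha))
    for j in range(1, k + 1):
        new = n + j
        # Helpers: every earlier newcomer first, then surviving initial nodes.
        helpers = [n + t for t in range(1, j)] + list(range(j + 1, n + 1))
        for h in helpers[:d]:
            edges.append((("out", h), ("in", new), beta))
        edges.append((("in", new), ("out", new), alpha))
        edges.append((("out", new), "DC", INF))  # DC reads the k newcomers
    return edges

def mincut_bound(k, d, alpha, beta):
    return sum(min((d - i) * beta, alpha) for i in range(min(d, k)))

# (n, k, d) = (4, 2, 3), alpha = 2 and beta = 1 in units of 0.5 Mb,
# i.e. the setting of the functional repair example.
flow_small = max_flow(worst_case_edges(4, 2, 3, 2, 1), "S", "DC")
flow_large = max_flow(worst_case_edges(5, 3, 4, 3, 1), "S", "DC")
```

On both instances the computed max flow equals the formula, which illustrates that the bound is met with equality by this failure pattern. A file of size M can only be guaranteed recoverable if the bound is at least M.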
Minimize subject to the lower bound
Nature of constraint
LHS of constraint as function of
Solution to the optimization
Simplification of solution
Solution
Minimum repair bandwidth
Storage-Bandwidth Tradeoff: Relationship between storage per node and repair bandwidth [1]
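A sketch of the optimization behind the tradeoff curve, under stated assumptions. With total repair bandwidth γ = dβ, the min-cut constraint reads: sum over i = 0..k-1 of min((1 - i/d)γ, α) >= M. The left-hand side is piecewise linear and nondecreasing in α, so the smallest feasible α for a given γ can be found segment by segment. The two closed-form corner points, minimum storage (MSR) and minimum bandwidth (MBR), are written as given in [1]. The function names are hypothetical and k <= d is assumed.

```python
from fractions import Fraction

def min_storage(M, k, d, gamma):
    """Smallest alpha meeting the min-cut constraint, or None if infeasible."""
    M, gamma = Fraction(M), Fraction(gamma)
    # Breakpoints (1 - i/d) * gamma of the piecewise-linear LHS, ascending.
    caps = sorted((1 - Fraction(i, d)) * gamma for i in range(k))
    if sum(caps) < M:
        return None  # even alpha = infinity cannot reach M
    prefix = Fraction(0)
    for j, cap in enumerate(caps):
        # On this segment the LHS equals prefix + (k - j) * alpha.
        alpha = (M - prefix) / (k - j)
        if alpha <= cap:
            return alpha
        prefix += cap
    return None  # not reached when sum(caps) >= M

def msr_point(M, k, d):
    """(alpha, gamma) at the minimum-storage end of the tradeoff."""
    return Fraction(M, k), Fraction(M * d, k * (d - k + 1))

def mbr_point(M, k, d):
    """(alpha, gamma) at the minimum-bandwidth end of the tradeoff."""
    g = Fraction(2 * M * d, 2 * k * d - k * k + k)
    return g, g

# The running example: M = 2 Mb, k = 2, d = 3.
alpha_msr, gamma_msr = msr_point(2, 2, 3)   # storage 1 Mb, repair 1.5 Mb
alpha_mbr, gamma_mbr = mbr_point(2, 2, 3)   # storage 1.2 Mb, repair 1.2 Mb
```

For the running example the MSR point reproduces the 1.5 Mb repair bandwidth of the functional repair example at 1 Mb of storage per node, while the MBR point lowers the repair bandwidth to 1.2 Mb at the price of storing 1.2 Mb per node.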
References
● [1] Dimakis, A.G.; Godfrey, P.B.; Wu, Y.; Wainwright, M.J.; Ramchandran, K., "Network Coding for Distributed Storage Systems," Information Theory, IEEE Transactions on, vol.56, no.9, pp.4539-4551, Sept. 2010
● [2] Koetter, R.; Medard, M., "An algebraic approach to network coding," Networking, IEEE/ACM Transactions on, vol.11, no.5, pp.782-795, Oct. 2003
● [3] Ho, T.; Lun, D., Network Coding: An Introduction, Cambridge University Press, New York, NY, USA, 2008
● [4] Dimakis, A.G.; Ramchandran, K.; Wu, Y.; Suh, C., "A Survey on Network Codes for Distributed Storage," Proceedings of the IEEE, vol.99, no.3, pp.476-489, March 2011