Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5....

18
Repairable Fountain Codes Megasthenis Asteris Alex Dimakis

Transcript of Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5....

Page 1: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Repairable Fountain Codes

Megasthenis Asteris Alex Dimakis

Page 2: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Distributed Storage

Cluster of machines running Hadoop at Yahoo! (Source: Yahoo!)

•  Failures are the norm. •  We need to protect the data: Introduce redundancy •  Already using Erasure Codes (e.g. Reed Solomon)

Page 3: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Problem Description

•  Systematic form Input is part of the output

Create a linear code with the following properties:

Page 4: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Problem Description

•  Rateless property Columns created independently

...

→ ∞

Page 5: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Problem Description

•  MDS property Any k columns are full rank = 0

= 0

Page 6: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Problem Description

•  Good Locality A column is a linear combination

of at most l other columns.

x1

x2

x3

x4

x5

p1p2

Page 7: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Problem Description

•  Systematic form

•  Rateless property

•  MDS property

•  Good Locality

Summarizing…

...

→ ∞

Page 8: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Approach •  Systematic

•  Sparsity is a measure of locality: The sparser the parities the better

the locality.

Q How sparse can parities be?

•  Sparse Parities

•  Good Locality

+

=

Page 9: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Approach

However... •  Systematic MDS codes can afford no zero in the parity columns.

•  MDS cannot have Sparse Parities.

•  Relax MDS property: •  (1+ε) k symbols should suffice for decoding.

Page 10: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Approach

•  Systematic

•  Rateless

•  MDS property (1+ε)k suffice for decoding

Fountain Codes

ε should be arbitrarily small.

• Sparse Parities

In light of the observations, we want:

...

→ ∞

Page 11: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Prior work

•  [Gummadi]– Systematic LT/Raptor codes – ε cannot be arbitrarily small.

•  [Shokrollahi] -Raptor Codes

- Systematic version - Parities no longer sparse in the input

Page 12: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Our construction

...d(k)

...

...ik

j

×ci,j

...

...i

• k input symbols

• Systematic part

• For each parity:

- Choose d(k) symbols.

- Choose cij ∼ U [0 . . . q)

How small can the degree d(k), of the encoded symbols be?

Page 13: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Results

...d(k)

...

...ik

j

×ci,j

...

...i

d(k) = c ln (k) (c ∝ 1/)

⇒(1 + ) k random columns

of G are lin. indep.w.p. k/q close to 1.

Theorem 2

Decoding w.h.p.⇒

d(k) = Ω (ln (k))

Theorem 1

Coupon Collector

Page 14: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Proving Theorem 2

[Kovalenko & Levitskaya, Cooper, Karp, …. ]

• Analyze the rank of a random matrix.

•  But here, we have an arbitrary number of systematic columns and a random part (parities)

(1 + )k

Page 15: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Proof – Part 1

No

rank(GK) < k

Yes

Yes No rank(GK) = kBad Coeffs?

∃M?

•  Focus on k x k submatrix.

•  Full rank corresponds to a perfect matching on bipartite graph

• (whp using Edmond’s theorem and Schwartz-Zippel

[ Ho et al. ]

Schwart-Zippel

Page 16: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Proof – Part 2

- Erdos-Reyni Random matrix result, Hall’s marriage theorem, first moment method.

Key technical result: - There is a perfect matching whp.

Page 17: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Conclusions •  We introduced a new family of fountain codes:

–  Systematic –  Near MDS –  With logarithmic locality (easy repair of a single

symbol failure - very useful in distributed storage systems)

•  Our proof involved analyzing a new family of random matrices.

•  An interesting open problem: can we use belief propagation decoding for this ensemble?

Page 18: Megasthenis Asteris Alex Dimakismegasthenis.github.io/repository/ISIT2012-Repairable... · 2016. 5. 17. · Megasthenis Asteris Alex Dimakis . Distributed Storage Cluster of machines

Simulation

ε , q = 0.01, 17

ε , q = 0.13, 17

ε , q = 0.25, 17

ε , q = 0.01, 113

100 120 140 160 180 200 220 240 2600

0.2

0.4

0.6

0.8

1

k

Pr(

decod

ing)

P r(decoding) vs k (R = 0 .5 , c = 5)