Shifted Codes
Sachin Agarwal
Deutsche Telekom A.G., Laboratories
Ernst-Reuter-Platz 7, 10587 Berlin, Germany
Joint work with Andrew Hagedorn and Ari Trachtenberg at Boston University
S. Agarwal, [email protected], January 2008
Outline
1. Motivation & Problem Definition
2. Background
   a. Rateless Codes
   b. Digital Fountain Codes
3. Shifted Codes
   a. Motivation – Inefficiency of LT codes
   b. Construction of Shifted Codes
   c. Analysis – Communication and Computation Complexity
4. Experimental Comparison
   a. LT vs. Shifted Codes
   b. Constrained Sensors – Deployment on TMotes
5. Discussion and Round-up
Partial Information
[Figure: a transmitter sends input symbols to a receiver over a transmission channel with erasures; the receiver holds only the received symbols.]
Partial Information
Multiple receivers may have different erasures.
[Figure: one transmitter broadcasting to Receiver 1, Receiver 2, and Receiver 3.]
Given multiple receivers that each hold partial information, how can all of them be updated to full information efficiently over a broadcast channel?
Partial Information: Another Example
Multiple mobile devices may have out-dated information
a. Mobile databases
b. Sensor network information aggregation
c. RSS updates for devices
[Figure: a broadcaster holding the latest version of the information, and Mobile device 1, Mobile device 2, and Mobile device 3.]
Problem Definition
Given an encoding host with k input symbols and a decoding host that already holds n of those k input symbols, the goal is to efficiently determine the remaining k − n input symbols at the decoding host.
The encoding host does not know which k − n input symbols are missing at the decoding host.
Different decoding hosts may be missing different input symbols.
Efficiency
1. Communication complexity – The information transmitted from the encoding host to the decoding host should be close in size to the k − n missing input symbols themselves.
2. Computational complexity – The algorithm must be computationally tractable.
Information-Theoretic Lower Bound
Known Result
At a minimum, the encoding host has to send nearly as much as the exact contents of the missing input symbols to the decoding host.
Intuition
The decoding host is missing k − n input symbols.
This is a special case of set reconciliation.
[Equation garbled in extraction: a lower bound on the communication C in terms of k − n, b, and a lg(·) term.]
k – Number of input symbols
n – Number of symbols known a priori at the decoding host
b – Field size of each symbol
Rateless Codes
Definition
“A class of erasure codes with the property that a potentially limitless sequence of encoding symbols can be generated from a given set of source symbols such that the original source symbols can be recovered from any subset of the encoding symbols of size equal to or only slightly larger than the number of source symbols.” (Wikipedia.org)
Examples
1. Random Linear Codes
2. LT Codes
3. Raptor Codes
4. Shifted Codes
5. …
Rateless Codes - Encoding
Used for content distribution over error-prone channels.
Edges are chosen at random based on a probability density function.
[Figure: k input symbols (A, B, C) connected to at least k encoded symbols:]
1 = A+B
2 = B
3 = A+B+C
4 = A+C
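The encoding step pictured above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: it assumes symbols are small integers and that "+" means bitwise XOR, and the names `sample_degree`, `encode_symbol`, and the toy degree distribution `toy_pmf` are hypothetical.

```python
import random
from functools import reduce
from operator import xor


def sample_degree(pmf, rng):
    """Draw a degree d with probability pmf[d] (pmf maps degree -> probability)."""
    degrees = list(pmf)
    weights = [pmf[d] for d in degrees]
    return rng.choices(degrees, weights=weights, k=1)[0]


def encode_symbol(inputs, pmf, rng):
    """Return (neighbor_indices, value) for one encoded symbol.

    The value is the XOR of d distinct input symbols chosen uniformly
    at random, where the degree d is drawn from pmf.
    """
    d = sample_degree(pmf, rng)
    neighbors = sorted(rng.sample(range(len(inputs)), d))
    value = reduce(xor, (inputs[i] for i in neighbors))
    return neighbors, value


# Toy example: k = 3 input symbols A, B, C and an assumed degree distribution.
inputs = [0x41, 0x42, 0x43]
toy_pmf = {1: 0.3, 2: 0.5, 3: 0.2}
rng = random.Random(0)
encoded = [encode_symbol(inputs, toy_pmf, rng) for _ in range(4)]
```

Because each encoded symbol is generated independently, the encoder can keep producing symbols for as long as receivers need them, which is what makes the code rateless.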
Rateless Codes - Decoding
Used for content distribution over error-prone channels.
[Figure: at least k encoded symbols form a system of linear equations, which is solved by Gaussian elimination or belief propagation to recover the k input symbols (A, B, C):]
1 = A+B
2 = B
3 = A+B+C
4 = A+C
Irrespective of which encoded symbols are lost in the communication channel, as long as sufficiently many encoded symbols are received, the decoder can retrieve all k input symbols.
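The four-equation example above can be solved with a simple belief-propagation ("peeling") decoder. The sketch below assumes XOR as the "+" operation and integer-valued symbols; the function name `peel_decode` is hypothetical and the code is only an illustration of the technique.

```python
def peel_decode(k, encoded):
    """Peeling decoder for XOR-based rateless codes.

    encoded is a list of (neighbor_indices, value) pairs. Returns a list of
    length k with the decoded input symbols, or None where decoding stalled.
    """
    decoded = [None] * k
    # Mutable working copies: unresolved neighbors plus the running value.
    pending = [[set(nbrs), val] for nbrs, val in encoded]
    progress = True
    while progress:
        progress = False
        for item in pending:
            nbrs, val = item
            # XOR out every neighbor that has already been decoded.
            for i in [i for i in nbrs if decoded[i] is not None]:
                nbrs.discard(i)
                val ^= decoded[i]
            item[1] = val
            # Exactly one unresolved neighbor: the value reveals that input.
            if len(nbrs) == 1:
                i = nbrs.pop()
                decoded[i] = val
                progress = True
    return decoded


# The example from the slide: 1 = A+B, 2 = B, 3 = A+B+C, 4 = A+C.
A, B, C = 0x41, 0x42, 0x43
received = [([0, 1], A ^ B), ([1], B), ([0, 1, 2], A ^ B ^ C), ([0, 2], A ^ C)]
result = peel_decode(3, received)
```

Here symbol 2 releases B, which reduces symbol 1 to A, which in turn reduces symbols 3 and 4 to C. Dropping symbol 3 from `received` still leaves enough equations to recover all three inputs.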
Decoding Using Belief Propagation
[Figure: the decoding host decodes the k input symbols from k+ encoded symbols; some of the received encoded symbols are redundant.]
Digital Fountain Codes: LT Codes
1. Class of rateless erasure codes invented by Michael Luby [1]
2. Computationally practical (as compared to Random Linear Codes)
3. Fast decoding algorithm based on belief propagation instead of Gaussian elimination
4. Form the outer code for Raptor Codes [3], which have linear decoding computational complexity
5. Designed for the case when no input symbols are available at the decoding host initially
Asymptotic Properties [2]
Expected number of encoded symbols required for successful decoding: $k + O(\sqrt{k}\,\ln^2 k)$
Expected decoding computational complexity: $O(k \ln k)$
k: number of input symbols
[1] Michael Luby, “LT codes,” in The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002, pp. 271–282.
[2] Assuming a constant probability of failure.
[3] Amin Shokrollahi, “Raptor codes,” IEEE Transactions on Information Theory, vol. 52, no. 6, 2006, pp. 2551–2567.
Digital Fountain Codes: LT Codes' Robust Soliton Probability Distribution
Robust Soliton probability distribution: $\mu_k$
The probability of an encoded symbol having degree d is $\mu_k(d)$.
The distribution has the property of releasing degree-1 symbols at a controlled, near-constant rate throughout the decoding process.
[Figure: log10(Probability) (y-axis, −6 to 0) versus Degree (x-axis, 0 to 1000) for the LT code (Robust Soliton) distribution, with parameters k = 1000, c = 0.01, δ = 0.5.]
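For readers who want to reproduce the plotted curve, the Robust Soliton distribution can be computed as follows. This sketch follows the standard definition from Luby's LT codes paper (ideal soliton ρ plus correction τ, normalized by β). Rounding k/R to the nearest integer, clamping the spike position to at most k, and clamping the spike value at zero are assumptions made here so the function works for any parameters; the function name `robust_soliton` is hypothetical.

```python
import math


def robust_soliton(k, c=0.01, delta=0.5):
    """Return a list mu where mu[d] is the probability of degree d (mu[0] = 0)."""
    R = c * math.log(k / delta) * math.sqrt(k)
    # Assumption: k/R is rounded and clamped into 1..k to get the spike position.
    spike = min(max(int(round(k / R)), 1), k)
    rho = [0.0] * (k + 1)  # ideal soliton component
    tau = [0.0] * (k + 1)  # robust correction component
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))
    for d in range(1, spike):
        tau[d] = R / (d * k)
    # Assumption: keep the spike nonnegative when R < delta.
    tau[spike] = max(0.0, R * math.log(R / delta) / k)
    beta = sum(rho) + sum(tau)
    return [(rho[d] + tau[d]) / beta for d in range(k + 1)]


mu = robust_soliton(1000, c=0.01, delta=0.5)
```

With the slide's parameters, most of the probability mass sits on small degrees (degree 2 is the most likely), with a secondary spike at a degree near k/R.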
Inefficiency of LT Codes for our Problem
[Figure: the decoding host decodes the input symbols from k+ encoded symbols. Because n out of the k input symbols are known a priori at the decoding host, many of the encoded symbols are redundant.]
The number of these redundant encoded symbols grows with the ratio of input symbols known at the decoder (n) to the total input symbols (k).
If n input symbols are known a priori, then an additional LT-encoded symbol will provide no new information to the decoding host with probability
$$\sum_{d=1}^{k} \mu_k(d) \prod_{i=0}^{d-1} \frac{n-i}{k-i}$$
which quickly approaches 1 as n → k.
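The redundancy probability can be evaluated numerically. The sketch below sums, over all degrees d, the probability of drawing degree d times the probability that all d neighbors fall among the n known symbols, which is C(n, d) / C(k, d). It reuses the standard Robust Soliton definition; the rounding and clamping of the spike are assumptions, and the names `robust_soliton` and `prob_redundant` are hypothetical.

```python
import math


def robust_soliton(k, c=0.01, delta=0.5):
    """Standard Robust Soliton distribution; mu[d] for d = 0..k."""
    R = c * math.log(k / delta) * math.sqrt(k)
    spike = min(max(int(round(k / R)), 1), k)  # assumption: round and clamp
    rho = [0.0, 1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
    tau = [0.0] * (k + 1)
    for d in range(1, spike):
        tau[d] = R / (d * k)
    tau[spike] = max(0.0, R * math.log(R / delta) / k)  # assumption: clamp at 0
    beta = sum(rho) + sum(tau)
    return [(rho[d] + tau[d]) / beta for d in range(k + 1)]


def prob_redundant(k, n, mu):
    """Probability that a fresh LT symbol covers only already-known inputs."""
    total, ratio = 0.0, 1.0
    for d in range(1, k + 1):
        # After this update, ratio == prod_{i=0}^{d-1} (n - i) / (k - i).
        ratio *= (n - (d - 1)) / (k - (d - 1))
        if ratio == 0.0:  # d > n: not all neighbors can be known
            break
        total += mu[d] * ratio
    return total


mu = robust_soliton(1000)
p_900 = prob_redundant(1000, 900, mu)
```

Evaluating `prob_redundant` for increasing n shows the probability rising from 0 at n = 0 to 1 at n = k, which is the inefficiency the shifted distribution is meant to address.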
Intuitive Fix
The n known input symbols serve the function of degree-1 encoded symbols, disproportionately skewing the degree distribution for LT encoding.
We thus propose to shift the Robust Soliton distribution to the right in order to compensate for the additional, functionally degree-1 symbols.
Questions
1) How?
2) By how much?
[Figure: log10(Probability) (y-axis, −6 to 0) versus Degree (x-axis, 0 to 1000) for the LT code (Robust Soliton) distribution.]
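Shifted Code Construction
Definition
The shifted robust soliton distribution is given by
$$\gamma_{k,n}(j) = \begin{cases} \mu_{k-n}(i) & \text{for } j = \operatorname{round}\!\left(\dfrac{i}{1 - n/k}\right) \\ 0 & \text{otherwise} \end{cases}$$
Intuition
The n known input symbols at the decoding host reduce the degree of each encoding symbol to an expected fraction $1 - \frac{n}{k}$ of its original degree.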
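The construction can be sketched as code: take the Robust Soliton distribution over the k − n unknown symbols and move the mass at degree i to degree round(i / (1 − n/k)). This is an illustrative reading of the definition, not the authors' code. The names `robust_soliton` and `shifted_robust_soliton` are hypothetical, halves are rounded up, and the spike rounding and clamping in the Robust Soliton helper are assumptions.

```python
import math


def robust_soliton(k, c=0.01, delta=0.5):
    """Standard Robust Soliton distribution; mu[d] for d = 0..k."""
    R = c * math.log(k / delta) * math.sqrt(k)
    spike = min(max(int(round(k / R)), 1), k)  # assumption: round and clamp
    rho = [0.0, 1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
    tau = [0.0] * (k + 1)
    for d in range(1, spike):
        tau[d] = R / (d * k)
    tau[spike] = max(0.0, R * math.log(R / delta) / k)  # assumption: clamp at 0
    beta = sum(rho) + sum(tau)
    return [(rho[d] + tau[d]) / beta for d in range(k + 1)]


def shifted_robust_soliton(k, n, c=0.01, delta=0.5):
    """gamma[j] = mu_{k-n}(i) for j = round(i / (1 - n/k)), and 0 elsewhere."""
    if not 0 <= n < k:
        raise ValueError("need 0 <= n < k")
    m = k - n
    mu = robust_soliton(m, c, delta)
    gamma = [0.0] * (k + 1)
    for i in range(1, m + 1):
        # round(i * k / m) in exact integer arithmetic (halves round up)
        j = (2 * i * k + m) // (2 * m)
        gamma[j] += mu[i]
    return gamma


gamma = shifted_robust_soliton(1000, 900)
```

For k = 1000 and n = 900 the scale factor is 10, so only degrees 10, 20, ..., 1000 receive nonzero probability; with n = 0 the function reduces to the ordinary Robust Soliton distribution.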
Shifted Code Distribution
[Figure: log10(Probability) (y-axis, −6 to 0) versus Degree (x-axis, 0 to 1000) for the LT code (Robust Soliton) and the Shifted code.]
LT code distribution and proposed Shifted code distribution, with parameters k = 1000, c = 0.01, δ = 0.5. The number of known input symbols at the decoding host is set to n = 900 for the Shifted code distribution. Under the shifted code distribution, encoded symbols of some degrees occur with probability 0.
Shifted Code – Communication Complexity
Lemma IV.2
A decoder that knows n of k input symbols needs
$$m = (k-n) + O\!\left(\sqrt{k-n}\,\ln^2\frac{k-n}{\delta}\right)$$
encoding symbols under the shifted distribution to decode all k input symbols with probability at least 1 − δ.
Proof
After the n known input symbols are removed from the decoding graph, the encoded symbols comprise only the k − n unknown input symbols. The expression follows from Luby's analysis.
Shifted Code – Average Degree of Encoded Symbol
Lemma IV.3
The average degree of an encoding node under the $\gamma_{k,n}$ distribution is given by
$$O\!\left(\frac{k}{k-n}\,\ln(k-n)\right)$$
Proof
The proof follows from the definitions, since a node with degree d in the $\mu_{k-n}$ distribution corresponds to a node with degree roughly $\frac{d}{1 - n/k}$ in the shifted code distribution. From Luby's analysis, the average degree of an LT-encoded symbol is $O(\ln k)$.
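As a worked instance of the scaling used in the proof, take the parameters from the figures, k = 1000 and n = 900 (the degree d = 3 is an arbitrary illustrative choice):

$$\frac{1}{1 - n/k} = \frac{k}{k-n} = \frac{1000}{100} = 10, \qquad d = 3 \;\mapsto\; \operatorname{round}(3 \cdot 10) = 30.$$

Of these 30 neighbors, an expected $30 \cdot (1 - n/k) = 3$ are unknown at the decoding host, so once the known symbols are removed, the symbol behaves on average like a degree-3 LT symbol over the k − n = 100 unknown input symbols.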
Shifted Codes – Computational Complexity
Lemma IV.4*
For a fixed δ, the expected number of edges R removed from the decoding graph upon knowledge of n input symbols at the decoding host is given by
R = O(n ln(k − n))
Theorem IV.5
For a fixed probability of decoding failure δ, the number of operations needed to decode using a shifted code is
O(k ln(k − n))
Proof
Sum Lemma IV.4 and the computational complexity of LT decoding for the k − n unknown input symbols: O(n ln(k − n)) + O((k − n) ln(k − n)) = O(k ln(k − n)).
*Proof described in: S. Agarwal, A. Hagedorn and A. Trachtenberg, “Rateless Codes Under Partial Information”, Information Theory and Applications Workshop, UCSD, San Diego, 2008
Experimental Comparison: LT Codes vs. Shifted Codes
[Figure: required encoding symbols at the decoding host (y-axis, 0 to 1200) versus known input symbols n (x-axis, 0 to 1000), comparing LT codes ("Without Invention") with Shifted codes ("With Invention").]
Y-axis: number of encoded symbols required at the mobile device to obtain the whole data set
X-axis: number of input symbols n available a priori at the mobile device
The experiment was repeated 100 times; error bars show the standard deviation.
Benefit
For k = 1000, n = 900, the decoding host needs to download about 700 encoded symbols using conventional LT codes, whereas only about 180 encoded symbols are required using shifted codes.
Experimental Comparison: Constrained Sensors – Deployment on TMotes
[Figure: total time to encode (a measure of computational complexity): Time to Encode (s) (y-axis, 0 to 3) versus Number of Input Symbols (x-axis, 100 to 400), for the LT (Robust Soliton) and Shifted Code distributions.]
[Figure: total time to decode (a measure of computational complexity): Time to Decode (s) (y-axis, 0 to 12) versus Number of Input Symbols (x-axis, 100 to 400), for the LT (Robust Soliton) and Shifted Code distributions.]
More Data: Communication Savings
[Figure: required encoded symbols for successful decoding (y-axis, 0 to 1200) versus n, the number of known input symbols at the decoding host, for LT Robust Soliton and Shifted Code; k = 1000 input symbols, 20 randomized trials.]
More Data: Communication Savings, Normalized
[Figure: encoded symbols required, normalized with LT-RS (y-axis, 0 to 1), versus n, the number of known input symbols at the decoding host (x-axis, 100 to 900), for LT Robust Soliton and Shifted Code; k = 1000 input symbols, 20 randomized trials.]
More Data: Time Savings, Normalized
[Figure: time taken to decode, normalized with LT-RS (y-axis, 0 to 1), versus n, the number of known input symbols at the decoding host (x-axis, 100 to 900), for LT Robust Soliton and Shifted Code; k = 1000 input symbols, 20 randomized trials.]
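Distribution Shifting
When the estimate of n at the encoding host is not accurate
[Equation garbled in extraction: definition of the Theta distribution; the surviving fragments involve k, p, i, j, and a round(·) condition.]
Decoding with the Theta distribution shifting recovers input symbols much more quickly than with standard LT codes.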
Many Applications
1. Broadcasting coded updates to synchronize databases
2. Adapting LT codes when partial information has been delivered
   a. Continuous shifting of the distribution
   b. Using the partial information in case of unsuccessful decoding (when only some of the input symbols were decoded)
3. Efficient erasure correction when channel characteristics are already known
   a. For example, input symbols can first be sent as plain text; then, depending on the estimated number of lost input symbols, shifted-coded symbols can be transmitted
4. Heterogeneous channel data delivery
5. Application in gossip protocols, particularly in later iterations
6. Sensor networks - data aggregation, routing information, etc.
7. Restoring storage media that are partially erased
…
Conclusions & Future Work
Conclusions
a. Generalization of LT codes to the case where some of the input symbols are already available at the decoding host
b. Many applications
Future Work
a. By adopting Raptor Code concepts (inner code), Shifted codes can be made more efficient
b. Analytical expressions for Distribution Shifting
c. Application-specific shifted code design
d. “Shifting” other rateless codes
Further Reading
1. S. Agarwal, A. Hagedorn and A. Trachtenberg, “Rateless Codes Under Partial Information”, Information Theory and Applications Workshop, UCSD, San Diego, 2008
2. S. Agarwal (Deutsche Telekom A.G.), “Method and System for Constructing and Decoding Rateless Codes with Partial Information”, European Patent Application EP 07 023 243.4
3. Michael Luby, “LT codes,” in The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002, pp. 271–282.
4. Amin Shokrollahi, “Raptor codes,” IEEE Transactions on Information Theory, vol. 52, no. 6, 2006, pp. 2551–2567.