On error correction for networks and deadlines
Tracey Ho, Caltech
INC, 8/5/12
Introduction
Classical error correction
• Errors in some bits, locations unknown → code across bits
[Figure: point-to-point link from s to t]
Network error correction [Yeung & Cai 06]
• Errors in some links/packets, locations unknown → code across links/packets
[Figure: network with sources s1, s2, sinks t1, t2 and unknown erroneous links]
Problem statement
• Given a network and error model
− What communication rates are feasible? (info theory)
− How to achieve with practical codes? (coding theory)
[Figure: network with sources s1, s2 (rates r1, r2), sinks t1, t2 and unknown erroneous links]
Background – network error correction
• Rich literature on single-source multicast with uniform errors
− All sinks demand the same information
− Equal capacity network links/packets, any z can be erroneous
− Various capacity-achieving codes, e.g. [Cai & Yeung 06, Jaggi et al. 08, Koetter & Kschischang 08], and results on code properties, e.g. [Yang & Yeung 07, Balli, Yan & Zhang 07, Prasad & Sundar Rajan 09]
This talk
• Generalizations
− Non-multicast demands, multiple sources, rateless codes, non-uniform link capacities
− New capacity bounding and coding techniques
• Applications
− Streaming (non-multicast nested network)
− Distribution of keys/decentralized data (multi-source network)
− Computationally limited networks (rateless codes)
Outline
• Non-multicast nested networks, streaming communication
• Multiple-source multicast, key distribution
• Rateless codes, computationally limited networks
• Non-uniform link capacities
Outline
• Non-multicast nested networks, streaming communication
O. Tekin, S. Vyetrenko, T. Ho and H. Yao, "Erasure correction for nested receivers," Allerton 2011.
O. Tekin, T. Ho, H. Yao and S. Jaggi, “On erasure correction coding for streaming,” ITA 2012.
D. Leong and T. Ho, “Erasure coding for real-time streaming,” ISIT 2012.
• Multiple-source multicast, key distribution
• Rateless codes, computationally limited networks
• Non-uniform link capacities
www.mobileapptesting.com/testing-the-alien-hunting-android-app/2011/03/et-phone-home
Background - non-multicast
• Not all sinks demand the same information
• Capacity even without errors is an open problem
− May need to code across different sinks’ data (inter-session coding)
− Not known in general when intra-session coding suffices
• Non-multicast network error correction
− Capacity bounds from analyzing three-layer networks (Vyetrenko, Ho & Dikaliotis 10)
− We build on this work to analyze coding for streaming of stored and online content
Streaming stored content
[Figure: a source sends forward error corrected packets over a packet erasure link (unit size packets); after an initial play-out delay, the demanded information I1, I2, I3 must be decoded by the decoding deadlines m1, m2, m3; x marks packet erasures]
Nested network model
• Each sink sees a subset of the info received by the next (nested structure)
• Non-multicast demands
[Figure: the temporal coding problem (unit size packets, deadlines m1, m2, m3) drawn as a spatial network problem (unit capacity links from the source); sinks t1, t2, t3 demand I1; I1, I2; and I1, I2, I3 respectively; x marks erasures]
Temporal coding problem: packet error/erasure correction streaming code, capacity outer bound
Spatial network problem: finite blocklength network error/erasure correction code, capacity outer bound
Problem and results
Problem
• Given an erasure model and deadlines m1, m2, …, what rate vectors u1, u2, … are achievable?
Results
• We find the capacity and a simple optimal coding scheme for a uniform erasure model
− At most z erasures, locations unknown a priori
• We show this scheme achieves at least a guaranteed fraction of the capacity region for a sliding window erasure model
− Constraints on number of erasures in sliding windows of certain length
− Exact optimal scheme is sensitive to model parameters
z erasures – upper bounding capacity
• Want to find the capacity region of achievable rates u1, u2, …, un
• We can write a cut-set bound for each sink:
u1 ≤ m1 − z
u1 + u2 ≤ m2 − z
…
u1 + u2 + … + un ≤ mn − z
• Can we combine bounds for multiple erasure patterns and sinks to obtain tighter bounds?
[Figure: sinks t1, t2, t3 demanding I1; I1, I2; and I1, I2, I3]
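A minimal sketch of the per-sink cut-set check, assuming deadlines and rates are given as plain Python sequences (the function name is illustrative, not from the talk). The second test point uses the talk's example m = (3, 5, 7, 11), z = 1: it passes every cut-set bound but violates the bound 3u1 + 2u2 ≤ 8 from the capacity region of that example, which is why tighter combined bounds are needed.

```python
from itertools import accumulate

def satisfies_cut_set_bounds(m, u, z):
    """Return True if u_1 + ... + u_j <= m_j - z holds for every sink j."""
    return all(total <= m_j - z for total, m_j in zip(accumulate(u), m))

m, z = (3, 5, 7, 11), 1
print(satisfies_cut_set_bounds(m, (2, 1, 1, 0), z))  # every prefix sum fits
print(satisfies_cut_set_bounds(m, (2, 2, 0, 0), z))  # also fits the cut-set bounds
print(3 * 2 + 2 * 2 <= 8)                            # but fails 3*u1 + 2*u2 <= 8
```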
Cut-set combining procedure
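• Obtain bounds involving progressively more links and rates ui, by iteratively applying steps:
Extend: $H(X \mid I_1^{i-1}) + |Y| \ge H(X, Y \mid I_1^{i-1}) = H(X, Y \mid I_1^{i}) + u_i$, where X, Y is a decoding set for Ii and $I_1^{i}$ denotes I1, …, Ii
Combine: [inequality not recoverable from this transcript; the surviving fragments suggest it relates conditional entropies of the form $H(\cdot, \cdot \mid I_1^{i})$ over sets X ⊆ A of size k]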
Example: m1=3, m2=5, m3=7, m4=11, z=1
[Figure: upper bound derivation graph; nodes are labeled by sets of links (12, 13, 23; 1245, 1345, 2345, 1234, 1235; 124567, 134567, 234567, …; 1234567)]
Bounds at nodes 12, 13, 23:
u1 + H(X1X2|I1) ≤ 2
u1 + H(X1X3|I1) ≤ 2
u1 + H(X2X3|I1) ≤ 2
Upper bound derivation graph
• Different choices of links at each step give different upper bounds
• Exponentially large number of bounds
• Only some are tight – how to find them?
• We use an achievable scheme as a guide and show a matching upper bound
Example: m1=3, m2=5, m3=7, m4=11, z=1
Capacity region:
u1 ≤ 2
3u1 + 2u2 ≤ 8
3u1 + 2u2 + u3 ≤ 9
6u1 + 5u2 + 4u3 ≤ 24
6u1 + 4u2 + 2u3 + u4 ≤ 20
9u1 + 6u2 + 4u3 + 3u4 ≤ 36
6u1 + 5u2 + 4u3 + 2u4 ≤ 28
6u1 + 4.5u2 + 4u3 + 3u4 ≤ 30
9u1 + 7.5u2 + 7u3 + 6u4 ≤ 54
[Figure: upper bound derivation graph for this example]
Intra-session Coding
• A rate vector (u1, u2, …, un) is achieved if and only if for every unerased set P:

      link 1   link 2   …   link mn   Σ over P
I1    y1,1     y1,2     …   y1,mn     ≥ u1
I2    y2,1     y2,2     …   y2,mn     ≥ u2
…     …        …        …   …         …
In    yn,1     yn,2     …   yn,mn     ≥ un
Σ     ≤ 1      ≤ 1      …   ≤ 1

• Separate erasure coding over each sink’s data
• Code design → capacity allocation problem
• yj,k: capacity on kth link allocated to jth sink’s data
• We may assume yj,k = 0 for k > mj
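A minimal sketch of this achievability condition, assuming the allocation is given as a list of rows y[j][k] over all mn links and that at most z links are erased (the function name and the float tolerance are illustrative choices). For each sink the worst unerased set P is the one missing the z links that carry most of that sink's data, so it suffices to check that single case.

```python
def achieves(y, m, u, z, tol=1e-9):
    """Check an intra-session allocation y[j][k] against rates u under z erasures."""
    n_links = m[-1]
    # Each unit capacity link carries at most one unit in total.
    if any(sum(row[k] for row in y) > 1 + tol for k in range(n_links)):
        return False
    for row, m_j, u_j in zip(y, m, u):
        visible = sorted(row[:m_j])  # sink j only sees the first m_j links
        # Worst case: the z links carrying the most of this sink's data are erased.
        guaranteed = sum(visible[: max(m_j - z, 0)])
        if guaranteed < u_j - tol:
            return False
    return True

# Allocation for m = (10, 14, 18, 22), z = 2 (four link groups of sizes 10, 4, 4, 4).
y = [
    [0.75] * 10 + [0.0] * 12,
    [0.25] * 14 + [0.0] * 8,
    [0.0] * 10 + [0.5] * 8 + [0.0] * 4,
    [0.0] * 10 + [0.25] * 4 + [0.5] * 8,
]
print(achieves(y, (10, 14, 18, 22), (6, 3, 3, 4), 2))
```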
“As uniform as possible” intra-session coding scheme
• For a given rate vector, fill each row as uniformly as possible subject to constraints from previous rows
• Example: m1 = 10, m2 = 14, m3 = 18, m4 = 22, u1 = 6, u2 = 3, u3 = 3, u4 = 4, z = 2

Links   1,2,…,9,10   11,12,13,14   15,16,17,18   19,20,21,22
I1      0.75         0             0             0
I2      0.25         0.25          0             0
I3      0            0.5           0.5           0
I4      0            0.25          0.5           0.5

I1: u1/(m1 − z) = 6/(10 − 2) = 0.75
I2: u2/(m2 − z) = 3/(14 − 2) = 0.25
I3: u3/(m3 − z) = 3/(18 − 2) = 0.1875, but links 1 to 10 are already full, so use u3/(m3 − m1 − z) = 3/(18 − 10 − 2) = 0.5
I4: u4/(m4 − z) = 4/(22 − 2) = 0.2, but links 1 to 10 are full; u4/(m4 − m1 − z) = 4/(22 − 10 − 2) = 0.4, but links 11 to 14 have only 0.25 left; so use (u4 − 0.25·4)/(m4 − m2 − z) = 3/(22 − 14 − 2) = 0.5
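The row-filling rule can be sketched as a water-filling procedure. This is one reading of "as uniform as possible", not a construction spelled out in the talk: each row takes min(residual capacity, level) on every link it can use, with the level chosen so that the row still delivers u_j after the z largest shares are erased. The function names are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def fill_row(residual, u, z):
    """Water-fill one row; return None if rate u cannot survive z erasures."""
    n = len(residual)
    if u == 0:
        return [Fraction(0)] * n
    order = sorted(residual)
    saturated_sum = Fraction(0)
    # Assume the s smallest residual capacities are fully used (saturated).
    for s in range(n - z):
        level = (u - saturated_sum) / (n - s - z)
        if level <= order[s]:
            return [min(c, level) for c in residual]
        saturated_sum += order[s]
    return None

def as_uniform_as_possible(m, u, z):
    """Fill rows I1, I2, ... in order over unit capacity links 1..m[-1]."""
    residual = [Fraction(1)] * m[-1]
    table = []
    for m_j, u_j in zip(m, u):
        row = fill_row(residual[:m_j], Fraction(u_j), z)
        if row is None:
            return None
        row += [Fraction(0)] * (m[-1] - m_j)  # y[j][k] = 0 for k > m_j
        residual = [c - y for c, y in zip(residual, row)]
        table.append(row)
    return table

table = as_uniform_as_possible((10, 14, 18, 22), (6, 3, 3, 4), 2)
# One representative link from each of the four link groups.
for row in table:
    print([str(row[k]) for k in (0, 10, 14, 18)])
```

On the example above this reproduces the table (0.75; 0.25, 0.25; 0, 0.5, 0.5; 0, 0.25, 0.5, 0.5). A return value of None means the procedure could not fit the requested rate vector.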
Can we do better?
Capacity region
Theorem: The z-erasure (or error) correction capacity region is achieved by the “as uniform as possible” coding scheme.
• Characterization of the capacity region in a form that is simple to specify and calculate
• Intra-session coding is also relatively simple to implement
Proof Idea
• Consider any given rate vector (u1, u2, …, un) and let Ti,j denote its corresponding “as uniform as possible” allocation:

Links   1,2,…,m1   m1+1,…,m2   m2+1,…,m3   …   mn-1+1,…,mn
I1      T1,1
I2      T2,1       T2,2
…       …          …           …           …
In      Tn,1       Tn,2        Tn,3        …   Tn,n

• Show inductively: the conditional entropy of any set of unerased links given messages I1, …, Ik matches the residual capacity from the table
• Use Ti,j values to find the appropriate path through upper bound derivation graph
Streaming online content
• Messages arrive every c time steps at the source, and must be decoded within d time steps
[Figure: packet erasure link (unit size packets); message creation times (c=3) and message decoding deadlines (d=8)]
Problem and results
Problem
• Given an erasure model and parameters c and d, what is the maximum size of independent uniform messages?
Results
• We find the capacity and a simple coding scheme that is asymptotically optimal for the following erasure models:
− #1: Limited number of erasures per sliding window
− #2: Erasure bursts and guard intervals of certain lengths
• For other values of burst length and guard interval, optimal inter-session convolutional code constructions [Martinian & Trott 07, Leong & Ho 12]
Code construction
• Divide each packet evenly among current messages
• Intra-session coding within each message
• When d is a multiple of c: constant number of current messages at each time step (e.g. messages 2, 3, 4 are current at t = 12)
• When d is not a multiple of c: variable number of current messages at each time step (e.g. messages 3, 4 are current at t = 12; messages 3, 4, 5 are current at t = 13)
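A small sketch of which messages share a packet at time t. The timing convention is an assumption: message i is created at time (i − 1)c + 1 and must be decoded by time (i − 1)c + d. With c = 3 this matches the slide examples for d = 8, and for d = 9 as a guess at the "d is a multiple of c" case. The function names are illustrative.

```python
from fractions import Fraction

def current_messages(t, c, d):
    """Messages that have been created by time t and are not yet past their deadline."""
    newest = (t - 1) // c + 1  # last message created at or before time t
    return [i for i in range(1, newest + 1) if t <= (i - 1) * c + d]

def packet_shares(t, c, d):
    """Split the unit size packet sent at time t evenly among current messages."""
    active = current_messages(t, c, d)
    return {i: Fraction(1, len(active)) for i in active}

print(current_messages(12, 3, 9))  # d is a multiple of c
print(current_messages(12, 3, 8))  # d is not a multiple of c
print(current_messages(13, 3, 8))
print(packet_shares(13, 3, 8))
```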
Capacity result
• Like the previous case, converse obtained by
− combining bounds for multiple erasure patterns and sinks (deadlines)
− inductively obtaining upper bounds on the entropy of sets of unerased packets, conditioned on previous messages
• The converse bound coincides with the rate achieved by our coding scheme asymptotically in the number of messages n
• Gap for small n corresponds to underutilization of capacity at the start and end by the time-invariant coding scheme
Outline
• Non-multicast nested networks, streaming communication
• Multiple-source multicast, key distribution
T. Dikaliotis, T. Ho, S. Jaggi, S. Vyetrenko, H. Yao, M. Effros, J. Kliewer and E. Erez, "Multiple-access Network Information-flow and Correction Codes," IT Transactions 2011.
H. Yao, T. Ho and C. Nita-Rotaru, "Key Agreement for Wireless Networks in the Presence of Active Adversaries," Asilomar 2011.
• Rateless codes, computationally limited networks
• Non-uniform link capacities
www.fanpop.com/spots/the-usual-suspects
Multiple-source multicast, uniform z errors
• Sources with independent information
• We could partition network capacity among different sources…
• But could rate be improved by coding across different sources? To what extent can different sources share network capacity?
• Challenge: owing to the need for coding across sources in the network and independent encoding at sources, straightforward extensions of single-source codes are suboptimal
• Related work: code construction in (Siavoshani, Fragouli & Diggavi 08) achieves capacity for C1+C2=C
• Coherent (known topology) and noncoherent (unknown topology) cases
[Figure: sources s1, s2 and sink t]
Capacity region
• Theorem: The coherent and non-coherent capacity region under any z link errors is given by the cut set bounds
$\sum_{i \in S} r_i \le m_S - 2z, \quad \forall S \subseteq U$
− U = set of source nodes
− mS = min cut capacity between sources in subset S of U and each sink
− ri = rate from the ith source
• Redundant capacity can be fully shared via coding
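A minimal sketch of checking a rate vector against these cut set bounds, assuming the min cut values mS are already known for every nonempty subset S of sources (taken as the minimum over sinks). The two-source numbers below are hypothetical, chosen only to show redundant capacity being shared.

```python
from itertools import combinations

def in_capacity_region(rates, min_cut, z):
    """Check sum_{i in S} r_i <= m_S - 2z for every nonempty subset S of sources."""
    sources = sorted(rates)
    for size in range(1, len(sources) + 1):
        for subset in combinations(sources, size):
            if sum(rates[s] for s in subset) > min_cut[frozenset(subset)] - 2 * z:
                return False
    return True

# Hypothetical network: each source alone has min cut 5, together they have min cut 6.
min_cut = {
    frozenset({"s1"}): 5,
    frozenset({"s2"}): 5,
    frozenset({"s1", "s2"}): 6,
}
print(in_capacity_region({"s1": 3, "s2": 1}, min_cut, z=1))
print(in_capacity_region({"s1": 2, "s2": 2}, min_cut, z=1))
print(in_capacity_region({"s1": 3, "s2": 2}, min_cut, z=1))
```

With z = 1 the bounds are r1 ≤ 3, r2 ≤ 3 and r1 + r2 ≤ 4, so the first two points are inside the region and the third is not.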
Capacity-achieving non-coherent code constructions
1. Probabilistic construction
− Joint decoding of sources, using injection distance metric
− Subspace distance metric used in single-source case is insufficient in multi-source case
2. Lifted Gabidulin rank metric codes over nested fields
− Successive decoding of sources
− Linear transformation to separate out other sources’ interference increases the field size of errors
− Sources encode over nested extension fields
An application: key distribution
• Robust distribution of keys from a pool (or other decentralized data)
• Nodes hold subsets of keys, some pre-distributed
• Further exchange of keys among nodes
• Want to protect against some number of corrupted nodes
• Questions:
− How many redundant transmissions are needed?
− Can coding help?
[Figure: nodes V1, V2, V3 hold k1, k2; V4, V5, V6 hold k1, k3; V7, V8, V9 hold k2, k3; receiver R wants k1, k2, k3]
An application: key distribution
• Problem is equivalent to multi-source network error correction
• Coding across keys strictly outperforms forwarding in general
[Figure: the key distribution network (nodes V1 to V9 holding k1, k2; k1, k3; k2, k3; receiver R wants k1, k2, k3) next to a multi-source network with sources S1, S2, S3, nodes V1 to V9 and receiver R]
Outline
• Non-multicast nested networks, streaming communication
• Multiple-source multicast, key distribution
• Rateless codes, computationally limited networks
S. Vyetrenko, A. Khosla & T. Ho, “On combining information-theoretic and cryptographic approaches to network coding security against the pollution attack,” Asilomar 2009.
W. Huang, T. Ho, H. Yao & S. Jaggi, “Rateless resilient network coding against Byzantine adversaries,” 2012.
• Non-uniform link capacities
Background – adversarial errors in multicast
• Information theoretic network error correction
− Prior codes designed for a given mincut and max no. of errors zU
− Achieve mincut − 2zU, e.g. [Cai and Yeung 06, Jaggi et al. 08, Koetter & Kschischang 08]
− No computational assumptions on adversaries
− Use network diversity and redundant capacity as resources
• Cryptographic signatures with rateless network codes
− Signatures for checking network coded packets, e.g. [Charles et al. 06, Zhao et al. 07, Boneh et al. 09]
− Achieve realized value of mincut after erasures
− Use computation, key infrastructure as resources
Motivation
Cryptographic approach
+ Does not require a priori estimates of network capacity and errors (rateless)
+ Achieves higher rate
− Performing signature checks requires significant computation; checking all packets at all nodes can limit throughput if nodes are computationally weak, e.g. low-power wireless nodes
Questions:
• Can we achieve the rateless benefits without the computational drawback?
• Can we use both network diversity as well as computation as resources, to do better than with each separately?
Rateless network error correction codes
• Incrementally send redundancy until decoding succeeds
• Without an a priori bound on the number of errors, need a means to verify decoding
• We give code constructions using respectively:
1. Shared secret randomness between source and sink (small compared to message size)
2. Cryptographic signatures
• These constructions are asymptotically optimal:
− Decoding succeeds w.h.p. once received information/errors satisfy cut set bound
− Overhead becomes negligible with increasing packet length
Rateless code using shared secret
• Shared secret is random and independent of the message
• Non-rateless case [Nutman and Langberg 08]
− Redundancy Y added to message W so as to satisfy a matrix hash equation [Y W I]V = H defined by shared secret (V, H)
− Hash is used to extract [Y W I] from received subspace
• Challenges in the rateless case:
1. Calculate redundancy incrementally such that it is cumulatively useful for decoding
2. Send redundancy incrementally
• Growth in dimension of subspace to be recovered in turn necessitates more redundancy
• Each adversarial error packet can correspond to an addition (of erroneous information) and/or an erasure (of good information)
• Code structure:
• $y_k = w V^{(k)} + h_k$, where w is the vectorized message, $V^{(k)}_{ij} = a_k^{ij}$, and $h_k$ and $a_k$ are shared secrets
[Figure: packet structure for the shared-secret rateless code. Labels: message W with I and B; long packets (C1 W, C2 W, C3 W, C4 W); short packets (y1, y2, y3 with coefficients d11, d21, d22, d31, d32, d33); linearly dependent redundancy for erasures; linearly independent redundancy for additions]
Rateless code using signatures
• Each adversarial error packet can correspond to an addition (of erroneous information) and/or an erasure (of good information)
• Code structure:
• yi = w Si, where w is the vectorized message and Si is a generic known matrix
[Figure: packet structure for the signature-based rateless code. Labels: message W with I and B; Y1, Y2; C1 W + D11 Y1; C2 W + D21 Y1 + D22 Y2; linearly dependent redundancy for erasures; linearly independent redundancy for additions]
Example: simple hybrid strategy on wireless butterfly network
• Node D has limited computation and outgoing capacity → probabilistically checks/codes a fraction of packets
− Proportion of packets checked/coded chosen to maximize expected information rate subject to computational budget
[Figure: wireless butterfly network with source, node D, sink 1 and sink 2]
[Figure: plot with parameters (cost of checking)/(cost of coding) = 40, mincut = 200, z = 20]
Outline
• Non-multicast nested networks, streaming communication
• Multiple-source multicast, key distribution
• Rateless codes, computationally limited networks
• Non-uniform link capacities
S. Kim, T. Ho, M. Effros and S. Avestimehr, "Network error correction with unequal link capacities," IT Transactions 2011.
T. Ho, S. Kim, Y. Yang, M. Effros and A. S. Avestimehr, "On network error correction with limited feedback capacity," ITA 2011.
http://www.geekosystem.com/tag/starbucks/
Uniform and non-uniform links
• Adversarial errors on any z fixed but unknown links
• Uniform links:
− Multicast error correction capacity = min cut − 2z
− Worst-case errors occur on the min cut
• Non-uniform links:
− Not obvious what are worst-case errors
• Cut size versus link capacities
• Feedback across cuts matters (can provide information about errors on upstream links)
− Related work: Adversarial nodes (Kosut, Tong & Tse 09)
Tighter cut set bounding approach
• The classical cut set bound is equivalent to adding reliable, infinite-capacity bidirectional links between each pair of nodes on each side of the cut
• Tighter bounds can be obtained by taking into account which forward links affect or are affected by which feedback links
• Equivalent to adding a link (i,j) only if there is a directed path from node i to node j that does not cross the cut
[Figure: zigzag network]
New cut-set bound
For any cut Q,
• adversary can erase a set of k ≤ z forward links
• adversary then chooses two sets Z1, Z2 of z−k links s.t. decoder cannot distinguish which set is adversarial:
− no feedback links downstream of Z1, Z2,
− downstream feedback links are included in Zi, or
− downstream feedback links Wi that are not in Zi have relatively small capacity s.t. distinct codewords have the same feedback link values
• sum of capacities of remaining forward links + capacities of links in W1, W2 is an upper bound
Bound is tight on some families of zigzag networks
[Figure: example network with z=1; achieve rate 3 using new code construction]
Achievability - example
• For z=1, upper bound = 5
• Without feedback link, capacity = 2
• Can we use feedback link to achieve rate 5?
[Figure: example network with z=1 and two infinite-capacity (∞) links; achieve rate 3 using new code construction]
“Detect and forward” coding strategy
• Some network capacity is allocated to redundancy enabling partial error detection at intermediate nodes
• Nodes that detect errors forward additional information allowing the sink to locate errors
• Use feedback capacity to increase the number of symbols transmitted with error detection
• Remaining network capacity carries an MDS error correction code over all information symbols
[Figure: example network with z=1, capacity = 5, two infinite-capacity (∞) links, and labels a, b, c, d, e, r1, r2]
Conclusion
• Network error correction
− New coding and outer bounding techniques for non-multicast demands, multiple sources, non-uniform errors
− A model for analysis and code design in various applications, e.g. robust streaming, key distribution
− Rateless and hybrid codes for computationally limited networks with adversaries
Thank you