The Price of Uncertainty in Communication
Brendan Juba (Washington U., St. Louis) with Mark Braverman (Princeton)
"Since we all agree on a probability distribution over what I might say, I can compress it to: 'the 9,232,142,124,214,214,123,845th most likely message.' Thank you!"
1. Encodings and communication across different priors
2. Near-optimal lower bounds for different-priors coding
Coding schemes
[Figure: messages (Bird, Chicken, Cat, Dinner, Pet, Lamb, Duck, Cow, Dog) connected to their encodings]
Ambiguity
[Figure: the same messages; an encoding may be shared by several messages]
Prior distributions
[Figure: the same messages, weighted by a prior distribution]
Decode to a maximum-likelihood message.
Source coding (compression)
• Assume encodings are binary strings.
• Given a prior distribution P and a message m, choose the minimum-length encoding that decodes to m.
For example: Huffman codes and Shannon-Fano (arithmetic) codes.
Note: the above schemes depend on the prior.
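To illustrate how such a code depends on the prior, here is a minimal Python sketch that computes Huffman codeword lengths from a prior (the messages and probabilities are toy examples, not from the talk):

```python
import heapq

def huffman_lengths(prior):
    """Compute Huffman codeword lengths for a prior {message: probability}.

    Illustrates that the code is determined by the prior: change the
    probabilities and the codeword lengths change with them.
    """
    # Each heap entry: (probability, tiebreak index, {message: depth so far}).
    heap = [(p, i, {m: 0}) for i, (m, p) in enumerate(sorted(prior.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least likely subtrees; every message inside them
        # moves one level deeper, i.e. gains one codeword bit.
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        merged = {m: d + 1 for m, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

prior = {"cat": 0.5, "dog": 0.25, "bird": 0.125, "cow": 0.125}
print(huffman_lengths(prior))  # the most likely message gets the shortest codeword
```

With this dyadic prior the lengths match the ideal log 1/P(m): cat gets 1 bit, dog 2, bird and cow 3.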
Suppose Alice and Bob share the same encoding scheme, but don't share the same prior…
[Figure: Alice with prior P, Bob with prior Q]
Can they communicate? How efficiently?
"The cat. The orange cat. The orange cat without a hat."
Closeness and communication
• Priors P and Q are α-close (α ≥ 1) if for every message m, αP(m) ≥ Q(m) and αQ(m) ≥ P(m).
• Disambiguation and closeness together suffice for communication: if for every m′ ≠ m, P[m|e] > α²P[m′|e] (m is "α²-disambiguated"), then
Q[m|e] ≥ (1/α)P[m|e] > αP[m′|e] ≥ Q[m′|e].
So, if Alice sends e, then maximum-likelihood decoding gives Bob m and not m′.
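The α-closeness condition can be checked directly. A small Python sketch, with toy priors and hypothetical messages (the numbers below are illustrative, not from the talk):

```python
def alpha_close(P, Q, alpha):
    """Check the slide's definition: P and Q are alpha-close iff
    alpha*P(m) >= Q(m) and alpha*Q(m) >= P(m) for every message m."""
    msgs = set(P) | set(Q)
    return all(alpha * P.get(m, 0) >= Q.get(m, 0) and
               alpha * Q.get(m, 0) >= P.get(m, 0) for m in msgs)

# Toy priors: Q is within a multiplicative factor 2 of P everywhere.
P = {"cat": 0.6, "dog": 0.3, "bird": 0.1}
Q = {"cat": 0.4, "dog": 0.4, "bird": 0.2}
print(alpha_close(P, Q, 2))    # True
print(alpha_close(P, Q, 1.2))  # False: P("bird") and Q("bird") differ by a factor 2
```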
Construction of a coding scheme
(J-Kalai-Khanna-Sudan'11, inspired by B-Rao'11)
Pick an infinite random string R_m for each m; put (m,e) ∈ E ⇔ e is a prefix of R_m.
Alice encodes m by sending a prefix of R_m such that m is α²-disambiguated under P.
This gives an expected encoding length of at most H(P) + 2 log α + 2.
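A toy simulation of this construction, assuming shared randomness and truncating the "infinite" strings to 64 bits (priors and messages are the illustrative ones from above, not from the talk):

```python
import random

def make_shared_strings(messages, length, seed=0):
    """Shared randomness: one random bit string R_m per message
    (truncated to `length` bits for this simulation)."""
    rng = random.Random(seed)
    return {m: [rng.randint(0, 1) for _ in range(length)] for m in messages}

def encode(m, P, R, alpha):
    """Alice: send the shortest prefix of R_m under which m is
    alpha^2-disambiguated with respect to her prior P."""
    for k in range(1, len(R[m]) + 1):
        e = R[m][:k]
        # Messages still consistent with e (e is a prefix of their string).
        live = [x for x in P if R[x][:k] == e]
        if all(P[m] > alpha**2 * P[x] for x in live if x != m):
            return e
    raise RuntimeError("shared strings too short for this simulation")

def decode(e, Q, R):
    """Bob: maximum-likelihood decoding under his prior Q,
    among messages consistent with the received encoding e."""
    live = [x for x in Q if R[x][:len(e)] == e]
    return max(live, key=lambda x: Q[x])

P = {"cat": 0.6, "dog": 0.3, "bird": 0.1}
Q = {"cat": 0.4, "dog": 0.4, "bird": 0.2}   # 2-close to P
R = make_shared_strings(P, length=64, seed=1)
for m in P:
    assert decode(encode(m, P, R, alpha=2), Q, R) == m  # always decodes correctly
```

The assertion holds by the lemma on the previous slide: α²-disambiguation under P plus α-closeness forces maximum-likelihood decoding under Q to return m.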
Remark
Mimicking the disambiguation property of natural language provided an efficient strategy for communication.
1. Encodings and communication across different priors
2. Near-optimal lower bounds for different-priors coding
Our results
1. The JKKS'11/BR'11 encoding is near-optimal:
– H(P) + 2 log α − 3 log log α − O(1) bits are necessary
(cf. the achieved H(P) + 2 log α + 2 bits).
2. Analysis of the positive-error setting [Haramaty-Sudan'14]: if incorrect decoding with probability ε is allowed,
– one can achieve H(P) + log α + log 1/ε bits;
– H(P) + log α + log 1/ε − (9/2) log log α − O(1) bits are necessary for ε > 1/α.
An ε-error coding scheme
(Inspired by J-Kalai-Khanna-Sudan'11, B-Rao'11)
Pick an infinite random string R_m for each m; put (m,e) ∈ E ⇔ e is a prefix of R_m.
Alice encodes m by sending the prefix of R_m of length log 1/P(m) + log α + log 1/ε.
Analysis
Claim. m is decoded correctly with probability 1 − ε.
Proof. There are at most 1/Q(m) messages with Q-probability greater than Q(m) ≥ P(m)/α. The probability that R_{m′}, for any one of these m′, agrees with the first log 1/P(m) + log α + log 1/ε ≥ log 1/Q(m) + log 1/ε bits of R_m is at most εQ(m). By a union bound, the probability that any of these agree with R_m (and hence could be wrongly chosen) is at most ε.
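A small Monte Carlo sketch of the ε-error scheme and the claim above, under assumed toy priors and with the "infinite" shared strings truncated to 64 bits (all names and numbers are illustrative):

```python
import math
import random

def run_trial(P, Q, alpha, eps, rng):
    """One run of the eps-error scheme: encode a P-random message m by the
    fixed-length prefix of R_m, then decode by maximum likelihood under Q."""
    n = 64  # truncation of the infinite shared strings
    R = {m: [rng.randint(0, 1) for _ in range(n)] for m in P}
    m = rng.choices(list(P), weights=list(P.values()))[0]
    # Prefix length from the slide: log 1/P(m) + log alpha + log 1/eps.
    k = math.ceil(math.log2(1 / P[m]) + math.log2(alpha) + math.log2(1 / eps))
    e = R[m][:k]
    live = [x for x in Q if R[x][:k] == e]
    return max(live, key=lambda x: Q[x]) == m

P = {"cat": 0.6, "dog": 0.3, "bird": 0.1}
Q = {"cat": 0.4, "dog": 0.4, "bird": 0.2}   # 2-close to P
rng = random.Random(0)
trials = 2000
errors = sum(not run_trial(P, Q, alpha=2, eps=0.05, rng=rng) for _ in range(trials))
print(errors / trials)  # empirically well below eps = 0.05
```

Errors occur only when some other message's random string happens to share the whole sent prefix, which the chosen prefix length makes rare, matching the union-bound argument.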
Length lower bound 1—reduction to deterministic encodings
• Min-max theorem: it suffices to exhibit a distribution over priors for which deterministic encodings must be long.
Length lower bound 2—hard priors
[Figure: the hard priors, drawn by log-probability level: a designated message m* at ≈ 0, a set S of messages at −log α, and the remaining messages at −2 log α; the paired priors are α-close. Lemma 1: H(P) = O(1). Lemma 2 and the labels α, α² refer to the construction in the figure.]
Length lower bound 3—short encodings have collisions
• Encodings of expected length < 2 log α − 3 log log α encode m1 ≠ m2 identically with nonzero probability.
• With nonzero probability over the choice of P & Q, m1, m2 ∈ S and m* ∈ {m1, m2}.
• Hence decoding errs with nonzero probability.
☞ Errorless encodings have expected length ≥ 2 log α − 3 log log α = H(P) + 2 log α − 3 log log α − O(1).
Length lower bound 4—very short encodings often collide
• If the encoding has expected length < log α + log 1/ε − (9/2) log log α, then m* collides with ∼(ε log α)∙α other messages.
• The probability that our α draws for S miss all of these messages is < 1 − 2ε.
• Hence decoding errs with probability > ε.
☞ Error-ε encodings have expected length ≥ H(P) + log α + log 1/ε − (9/2) log log α − O(1).
Recap. We saw a variant of source coding for which (near-)optimal solutions resemble natural languages in interesting ways.
The problem. Design a coding scheme E so that, for any sender and receiver with α-close prior distributions, the communication length is minimized.
(In expectation w.r.t. the sender's distribution.)
Questions?