
Communication Complexity

Jaikumar Radhakrishnan

School of Technology and Computer Science, Tata Institute of Fundamental Research

Mumbai

1 June 2012, IIT Bombay


Plan

1 Examples, the model, the set disjointness problem

2 Lower bounds for set disjointness, application to streaming

3 Round elimination, lower bounds for data structure problems

4 Remote generation of random variables, correlated sampling ⇐


Entropy of a random variable

Definition (Shannon entropy)

Claude Shannon (1916-2001)

Let X be a random variable taking values in the set [n] = {1, 2, . . . , n}, with p_i = Pr[X = i]. Then, its entropy is given by

H[X] = − ∑_{i=1}^{n} p_i log2 p_i.

H[X] measures the uncertainty in X.
H[X] is a function of the distribution of X, not of the actual values it takes.
H[X] ≤ log2 n.
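As a quick illustration (an addition to these notes, not from the original slides), here is a minimal Python sketch of the entropy computation; the example distributions are arbitrary:

```python
import math

def entropy(p):
    """Shannon entropy H[X] in bits for a distribution p = (p_1, ..., p_n)."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# A fair coin carries one bit of uncertainty; a biased coin carries less.
# Both are at most log2(2) = 1, matching H[X] <= log2 n.
print(entropy([0.5, 0.5]))   # 1.0
print(entropy([0.9, 0.1]))   # about 0.469
```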


Pragmatics

Alice: observes X; sends Bob a message M.

⇐⇒

Bob: recovers X from M.

Goal
Alice and Bob exchange bits. Bob must recover X exactly. Minimize the (expected) total number of bits transmitted.

Transmission cost
Let T[X] denote the minimum cost of transmitting X.


Entropy and transmission

Theorem

H[X] ≤ T[X] ≤ H[X] + 1.

Kraft's inequality
Let ℓ_1, ℓ_2, . . . , ℓ_n be positive integers. Then,

there is a binary tree whose i-th leaf is at height ℓ_i
⟺
∑_i 2^{−ℓ_i} ≤ 1.
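As a concrete check (an added sketch, not part of the slides): choosing code lengths ℓ_i = ⌈log2(1/p_i)⌉ satisfies Kraft's inequality, so a prefix tree with those leaf depths exists, and the expected length lies between H[X] and H[X] + 1, which is one way to see the theorem above.

```python
import math

def shannon_code_lengths(p):
    """Code lengths l_i = ceil(log2(1/p_i)) for a distribution p."""
    return [math.ceil(math.log2(1.0 / q)) for q in p]

p = [0.5, 0.25, 0.125, 0.125]
lengths = shannon_code_lengths(p)
kraft_sum = sum(2.0 ** -l for l in lengths)              # <= 1, so a prefix tree exists
expected_length = sum(q * l for q, l in zip(p, lengths))
H = -sum(q * math.log2(q) for q in p)

print(lengths, kraft_sum)        # [1, 2, 3, 3] 1.0
print(H, expected_length)        # 1.75 1.75   (H <= E[length] <= H + 1)
```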


Long years ago . . . 1948

Shannon's source coding theorem
Let p be a probability distribution on [n]. For ε > 0 and positive integer k, let

N(k, ε) = min { |A| : A ⊆ [n]^k, p^k(A) ≥ 1 − ε }.

Theorem (Shannon)
For all ε ∈ (0, 1),

lim_{k→∞} (1/k) log2 N(k, ε) = H(p).

. . . not wholly or in full measure, but very substantially!
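To make the statement concrete, here is a small Python experiment (an addition, with the source specialized to a Bernoulli(p) distribution): N(k, ε) is computed exactly by taking the most probable strings first, and (1/k) log2 N(k, ε) slowly approaches H(p) as k grows.

```python
import math
from math import comb

def smallest_set_size(p, k, eps):
    """N(k, eps): fewest length-k strings from Bernoulli(p)^k with total mass >= 1 - eps."""
    # Strings with the same number of ones are equally likely; sort those classes
    # by probability, most likely first.
    classes = sorted(range(k + 1),
                     key=lambda ones: p ** ones * (1 - p) ** (k - ones),
                     reverse=True)
    mass, count = 0.0, 0
    for ones in classes:
        prob = p ** ones * (1 - p) ** (k - ones)
        size = comb(k, ones)
        if mass + size * prob >= 1 - eps:
            # Only part of this class is needed to cross the 1 - eps threshold.
            return count + math.ceil(((1 - eps) - mass) / prob)
        mass += size * prob
        count += size
    return count

p, eps = 0.1, 0.05
H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))    # about 0.469
for k in (10, 100, 1000):
    print(k, math.log2(smallest_set_size(p, k, eps)) / k, H)
```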


Conditional entropy

Definition
(X, Y): a pair of random variables with some joint distribution.

H[Y | X] = ∑_i p_X(i) H[Y | X = i].

Fact
Conditioning reduces uncertainty: H[Y | X] ≤ H[Y].
H[XY] = H[X] + H[Y | X] (the chain rule).
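A quick numerical sanity check of the chain rule (an added sketch; the joint distribution below is an arbitrary example):

```python
import numpy as np

def H(p):
    """Entropy in bits of a probability vector (zero entries are skipped)."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Joint distribution of (X, Y) as a matrix: rows indexed by x, columns by y.
pxy = np.array([[0.30, 0.10],
                [0.05, 0.55]])
px = pxy.sum(axis=1)

H_XY = H(pxy.ravel())
H_X = H(px)
# H[Y | X] = sum_x p_X(x) H[Y | X = x]
H_Y_given_X = sum(px[x] * H(pxy[x] / px[x]) for x in range(len(px)))

print(H_XY, H_X + H_Y_given_X)            # equal: H[XY] = H[X] + H[Y | X]
print(H_Y_given_X, H(pxy.sum(axis=0)))    # conditioning reduces uncertainty: first <= second
```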


The noisy channel

Specification
Input alphabet: [m]
Output alphabet: [n]
Characteristics: Pr[output = j | input = i] = p_{j|i}.

Code of conduct
Encoding: {0, 1}^k → [m]^t
Decoding: [n]^t → {0, 1}^k

Goal
Error: Pr[input ≠ output] ≤ ε.
Rate: k/t should be as large as possible.


Capacity

Input to the channel: X ∈ [m]

Output of the channel: Y ∈ [n].

Definition (Capacity of a channel E)

C(E) = max_X ( H[X] + H[Y] − H[XY] ),

where the maximum is over distributions of the input X.
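For instance (an added sketch, assuming a binary symmetric channel with crossover probability f, which is not one of the slides' examples), a brute-force search over input distributions recovers the familiar value C = 1 − H2(f):

```python
import numpy as np

def mutual_information(px, channel):
    """I[X : Y] = H[X] + H[Y] - H[XY] for input distribution px and channel p(y|x)."""
    def H(p):
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())
    pxy = px[:, None] * channel               # joint distribution of (input, output)
    return H(px) + H(pxy.sum(axis=0)) - H(pxy.ravel())

f = 0.1                                        # crossover probability of the BSC
bsc = np.array([[1 - f, f],
                [f, 1 - f]])
capacity = max(mutual_information(np.array([q, 1 - q]), bsc)
               for q in np.linspace(0.001, 0.999, 999))
h2 = -(f * np.log2(f) + (1 - f) * np.log2(1 - f))
print(capacity, 1 - h2)                        # both about 0.531
```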


Jaane kya toone kahi . . . jaane kya meine suni (who knows what you said . . . who knows what I heard)

Theorem (Shannon)
Let C be the capacity of the channel. Then, for all ε > 0 and all k, there exist encoders and decoders such that

Encoding rate: k/t ≥ C − ε.
Error: Pr[error] → 0 as k → ∞.
Optimality: C − ε cannot be replaced by C + δ for any δ > 0.

. . . baat kuchch ban hi gayee! (something did come of it after all!)


Mutual information

Definition
For random variables X and Y with some joint probability distribution, their mutual information is

I[X : Y] = H[X] + H[Y] − H[XY]
         = H[X] − H[X | Y]
         = H[Y] − H[Y | X].
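A short numerical check that the three expressions agree (an added sketch; the joint distribution is arbitrary):

```python
import numpy as np

def H(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

pxy = np.array([[0.25, 0.25],
                [0.05, 0.45]])        # joint distribution of (X, Y)
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

H_X, H_Y, H_XY = H(px), H(py), H(pxy.ravel())
H_X_given_Y = sum(py[y] * H(pxy[:, y] / py[y]) for y in range(len(py)))
H_Y_given_X = sum(px[x] * H(pxy[x] / px[x]) for x in range(len(px)))

print(H_X + H_Y - H_XY)       # the three ways of writing I[X : Y] ...
print(H_X - H_X_given_Y)      # ... all give
print(H_Y - H_Y_given_X)      # ... the same number
```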


Today: generating random variables remotely

Pair of random variables
Let (X, Y) be a pair of not necessarily independent random variables taking values in the set [m] × Y.

Alice: receives x ∈ [m]

⇒⇐

Bob: generates y ∈ Y

Goal
Pr[Bob's output = y | X = x] = Pr[Y = y | X = x].
Minimize the average number of bits sent by Alice. Let T[X : Y] be this quantity.


When X and Y are independent. . .

Bob can generate y on his own. No message from Alice is required.


When X and Y are highly correlated. . .

The case X = Y
Then

H[X] ≤ T[X : Y] ≤ H[X] + 1,

where H[X] is the Shannon entropy of X.


In general, . . .

A lower bound

I[X : Y] ≤ T[X : Y].

Proof
Let M be Alice's message to Bob. Then, X and Y are conditionally independent given M. So,

I[X : Y] ≤ I[X : YM] ≤ I[X : M] + I[X : Y | M] ≤ H[M] ≤ E[|M|],

since I[X : Y | M] = 0 by the conditional independence.


Bad News

A pair of random variables

X ∈_U ([n] choose n/2), i.e., X is a uniformly random subset of [n] of size n/2.
Y ∈_U X, i.e., Y is a uniformly random element of X.

I[X : Y] = 1 (since H[Y] = lg n and H[Y | X] = lg(n/2) = lg n − 1), yet T[X : Y] ≥ c lg n for some constant c > 0.


Not new . . .

Wyner's common information (1975)
Definition:

C[X : Y] = lim inf_{λ→0} [ lim_{m→∞} T_λ[X^m : Y^m] / m ].

Theorem:

C[X : Y] = min_W I[XY : W],

where the minimum is taken over all random variables W such that X and Y are conditionally independent given W.


Common information versus mutual information

T[X : Y] ≥ C[X : Y] ≥ I[X : Y].

There exist random variables where both inequalities are loose.

Example
X, Y ∈ {0, 1}^n. Let W = (i, b) ∈ [n] × {0, 1} be uniformly distributed, set X[i] = Y[i] = b, and generate the other 2(n − 1) bits uniformly and independently.

I[X : Y] = O(n^{−1/3});
C[X : Y] = 2 − o(1);
T[X : Y] = Θ(log n).


The right question?

Suppose Alice and Bob are allowed to share a random variable R generated independently of Alice's input.

Alice: receives x ∈ [m] ⇒

Bob: generates y ∈ [n]

Alice generates her message to Bob based on her input x, the random string R, and some of her own randomness.
Bob generates his output based on Alice's message, the random string R, and some of his own randomness.
Let T_R[X : Y] denote the minimum expected number of bits communicated (by Alice) in the best strategy for generating (X, Y) with shared randomness.


The first example revisited

A pair of random variables

X ∈_U ([n] choose n/2), i.e., X is a uniformly random subset of [n] of size n/2.
Y ∈_U X, i.e., Y is a uniformly random element of X.

Note that I[X : Y] = 1.

A strategy
Randomness. A random permutation R of [n].
Alice's message. The index i of the first element of this permutation that occurs in her set X.
Bob's output. The element at position i of R (which, given X, is uniformly distributed over X).
Communication cost. Note that E[i] ≤ 2, so i can be sent using O(1) expected bits.
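A small simulation of this strategy (an added sketch; the values of n and the number of trials are arbitrary), checking that the index Alice sends is at most 2 on average; by the symmetry of the random permutation, Bob's output is a uniformly random element of X, as required:

```python
import random

def run_protocol(n):
    X = set(random.sample(range(n), n // 2))        # Alice's uniformly random (n/2)-subset
    R = list(range(n))
    random.shuffle(R)                                # shared random permutation
    i = next(j for j, e in enumerate(R) if e in X)   # first position of R that lands in X
    return i + 1, R[i], X                            # (1-based index, Bob's output, X)

n, trials = 16, 20000
total_index, in_X = 0, 0
for _ in range(trials):
    i, y, X = run_protocol(n)
    total_index += i
    in_X += (y in X)

print(total_index / trials)    # expected index sent by Alice, at most 2
print(in_X == trials)          # Bob's output always lies in Alice's set X
```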


The main result

Theorem

I[X : Y] ≤ T_R[X : Y] ≤ I[X : Y] + O(lg(I[X : Y] + 1)).

Proof
Lower bound. A minor variation of the previous proof:

I[X : Y] ≤ I[X : MRY]
         ≤ I[X : MR] + I[X : Y | MR]
         ≤ I[X : M | R] + I[X : R]
         ≤ H[M | R]
         ≤ H[M]
         ≤ E[|M|],

using that X and Y are conditionally independent given (M, R), and that R is independent of X.


Upper bound

Definition
Given two distributions P and Q on a set Y, their relative entropy is

D(P‖Q) = ∑_{y∈Y} P(y) lg ( P(y) / Q(y) ).

Connection to mutual information

I[X : Y] = E_{x←X}[ D(Q_x ‖ Q) ],

where Q_x is the conditional distribution of Y given X = x, and Q is the marginal distribution of Y.

The idea
Randomness. R = ⟨y_1, y_2, . . . , y_i, . . .⟩, sampled independently from Q.
Alice's message. An index i∗.
Bob's output. The sample y_{i∗}.
Cost. Approximately D(Q_x ‖ Q).


Choosing the right index

Main lemma
Let P and Q be two distributions such that D(P‖Q) is finite. There is a procedure that, on input a sequence

R = ⟨y_1, y_2, . . . , y_i, . . .⟩

of independently drawn samples from Q, outputs an index i∗ such that
y_{i∗} has distribution P, and
E[length(i∗)] ≤ D(P‖Q) + 2 lg(D(P‖Q) + 1) + O(1),
where length(i∗) denotes the number of bits used to encode i∗ (with a prefix-free code for the integers).


Proof by example

Q = ⟨1/2, 1/8, 1/8, 1/4⟩;   Q_x = ⟨1/4, 3/4, 0, 0⟩.

R = ⟨y_1, y_2, . . . , y_i, . . .⟩ (samples drawn from Q)

Step 1: If y_1 = 1, accept with probability 1/2.
If y_1 = 2, accept with probability 1.
Otherwise, reject.

Step 2 onwards: Accept iff y_i = 2.
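The example follows a greedy rejection-sampling scheme. Below is a Python sketch of one such procedure consistent with the steps above (an added illustration, not the paper's exact pseudocode): at each stage the acceptance probability is chosen so that exactly the right amount of probability mass of each outcome is accounted for, which makes the accepted sample distributed exactly according to P. Sending the accepted index i∗ with a prefix-free code for the integers then gives the communication bound of the main lemma.

```python
import random
from collections import Counter

def choose_index(P, Q, rng=random):
    """Scan i.i.d. samples y_1, y_2, ... from Q and return (i_star, y_{i_star})
    so that the returned sample is distributed according to P."""
    covered = [0.0] * len(P)     # mass of each outcome already accounted for
    total = 0.0                  # total mass accounted for so far
    i = 0
    while True:
        i += 1
        y = rng.choices(range(len(Q)), weights=Q)[0]      # the shared sample y_i
        # Mass of outcome y that this stage is allowed to account for.
        alpha = min(P[y] - covered[y], (1.0 - total) * Q[y])
        if rng.random() < alpha / ((1.0 - total) * Q[y]):
            return i, y
        # On rejection, update the bookkeeping for every outcome before moving on.
        new_covered = [c + min(P[z] - c, (1.0 - total) * Q[z])
                       for z, c in enumerate(covered)]
        covered, total = new_covered, sum(new_covered)

Q = [1/2, 1/8, 1/8, 1/4]
P = [1/4, 3/4, 0.0, 0.0]      # Q_x from the example (outcomes are 0-indexed here)
counts = Counter(choose_index(P, Q)[1] for _ in range(20000))
print({y: counts[y] / 20000 for y in sorted(counts)})     # close to {0: 0.25, 1: 0.75}
```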


Analysis

Claim

E[lg i∗] ≤ D(P‖Q) + O(1).

Proof

D(P‖Q) = ∑_{y∈[n]} P(y) lg ( P(y) / Q(y) ).

Idea: the element y never needs to be generated after the first ⌈P(y)/Q(y)⌉ stages. So, when the output is y, lg i∗ ≤ lg(P(y)/Q(y)) + O(1), and taking the expectation over y ∼ P gives the claim.


Summary

Entropy, channel capacity, mutual information.

The problem of generating correlated random variables.

A ‘characterization’ of mutual information in terms of shared randomness.

Proof via a rejection sampling procedure based on relative entropy.


An application of this result

Theorem (One-shot Reverse Shannon Theorem)

C(P) ≤ T(P) ≤ C(P) + 2 lg(C(P) + 1) + O(1)


The Braverman-Rao protocol

Alice does not know Q
Alice: a distribution P on [n].
Bob: a distribution Q on [n].

Guarantee: D(P‖Q) ≤ k.
Goal: Alice and Bob agree on a value M whose distribution is P (or within distance ε of P). Minimize the communication T_ε(P‖Q).

Theorem (Braverman and Rao, 2010)

T_ε(P‖Q) = O( log(1/ε) · D(P‖Q) ).


Remarks

The first result was joint work with Prahladh Harsha, Rahul Jain, and David McAllester (2010).

The asymptotic versions of the main result and the one-shot Reverse Shannon Theorem were shown earlier by Winter (2002) and by Bennett, Shor, Smolin and Thapliyal (2002).

The one-shot version implies the asymptotic versions by a routine application of the law of large numbers.

We don't know if the 'extra' log term is necessary.
Precise asymptotic tradeoffs between communication and shared randomness were obtained by Bennett & Winter and by Paul Cuff.


Plan

1 Examples, the model, the set disjointness problem

2 Lower bounds for set disjointness, application to streaming

3 Round elimination, lower bounds for data structure problems

4 Remote generation of random variables, correlated sampling ⇐


Thank you
