
Communication Complexity

Jaikumar Radhakrishnan

School of Technology and Computer Science, Tata Institute of Fundamental Research

Mumbai

1 June 2012, IIT Bombay


Plan

1 Examples, the model, the set disjointness problem

2 Lower bounds for set disjointness, application to streaming

3 Round elimination, lower bounds for data structure problems

4 Remote generation of random variables, correlated sampling ⇐


Entropy of a random variable

Definition (Shannon entropy)

Claude Shannon (1916-2001)

Let X be a random variable taking values in the set [n] = {1, 2, . . . , n}, with p_i = Pr[X = i]. Then, its entropy is given by

H[X] = − ∑_{i=1}^{n} p_i log2 p_i.

H[X] measures the uncertainty in X.
H[X] is a function of the distribution of X, not of the actual values it takes.
H[X] ≤ log2 n.
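As a quick illustration (an addition to these notes, not from the original slides), here is a minimal Python sketch of the entropy computation; the example distributions are arbitrary:

```python
import math

def entropy(p):
    """Shannon entropy H[X] in bits for a distribution p = (p_1, ..., p_n)."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# A fair coin carries one bit of uncertainty; a biased coin carries less.
# Both are at most log2(2) = 1, matching H[X] <= log2 n.
print(entropy([0.5, 0.5]))   # 1.0
print(entropy([0.9, 0.1]))   # about 0.469
```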


Pragmatics

Alice: observes X; sends Bob a message M.

⇐⇒

Bob: recovers X from M.

Goal
Alice and Bob exchange bits. Bob must recover X exactly. Minimize the (expected) total number of bits transmitted.

Transmission cost
Let T[X] denote the minimum cost of transmitting X.


Entropy and transmission

Theorem

H[X] ≤ T[X] ≤ H[X] + 1.

Kraft's inequality
Let ℓ_1, ℓ_2, . . . , ℓ_n be positive integers. Then,

there is a binary tree whose i-th leaf is at height ℓ_i
⟺
∑_i 2^{−ℓ_i} ≤ 1.
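As a concrete check (an added sketch, not part of the slides): choosing code lengths ℓ_i = ⌈log2(1/p_i)⌉ satisfies Kraft's inequality, so a prefix tree with those leaf depths exists, and the expected length lies between H[X] and H[X] + 1, which is one way to see the theorem above.

```python
import math

def shannon_code_lengths(p):
    """Code lengths l_i = ceil(log2(1/p_i)) for a distribution p."""
    return [math.ceil(math.log2(1.0 / q)) for q in p]

p = [0.5, 0.25, 0.125, 0.125]
lengths = shannon_code_lengths(p)
kraft_sum = sum(2.0 ** -l for l in lengths)              # <= 1, so a prefix tree exists
expected_length = sum(q * l for q, l in zip(p, lengths))
H = -sum(q * math.log2(q) for q in p)

print(lengths, kraft_sum)        # [1, 2, 3, 3] 1.0
print(H, expected_length)        # 1.75 1.75   (H <= E[length] <= H + 1)
```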


Long years ago . . . 1948

Shannon's source coding theorem
Let p be a probability distribution on [n]. For ε > 0 and positive integer k, let

N(k, ε) = min { |A| : A ⊆ [n]^k, p^k(A) ≥ 1 − ε }.

Theorem (Shannon)
For all ε ∈ (0, 1),

lim_{k→∞} (1/k) log2 N(k, ε) = H(p).

. . . not wholly or in full measure, but very substantially!
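To make the statement concrete, here is a small Python experiment (an addition, with the source specialized to a Bernoulli(p) distribution): N(k, ε) is computed exactly by taking the most probable strings first, and (1/k) log2 N(k, ε) slowly approaches H(p) as k grows.

```python
import math
from math import comb

def smallest_set_size(p, k, eps):
    """N(k, eps): fewest length-k strings from Bernoulli(p)^k with total mass >= 1 - eps."""
    # Strings with the same number of ones are equally likely; sort those classes
    # by probability, most likely first.
    classes = sorted(range(k + 1),
                     key=lambda ones: p ** ones * (1 - p) ** (k - ones),
                     reverse=True)
    mass, count = 0.0, 0
    for ones in classes:
        prob = p ** ones * (1 - p) ** (k - ones)
        size = comb(k, ones)
        if mass + size * prob >= 1 - eps:
            # Only part of this class is needed to cross the 1 - eps threshold.
            return count + math.ceil(((1 - eps) - mass) / prob)
        mass += size * prob
        count += size
    return count

p, eps = 0.1, 0.05
H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))    # about 0.469
for k in (10, 100, 1000):
    print(k, math.log2(smallest_set_size(p, k, eps)) / k, H)
```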


Conditional entropy

Definition
(X, Y): a pair of random variables with some joint distribution.

H[Y | X] = ∑_i p_X(i) H[Y | X = i].

Fact
Conditioning reduces uncertainty: H[Y | X] ≤ H[Y].
H[XY] = H[X] + H[Y | X] (the chain rule).
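A quick numerical sanity check of the chain rule (an added sketch; the joint distribution below is an arbitrary example):

```python
import numpy as np

def H(p):
    """Entropy in bits of a probability vector (zero entries are skipped)."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Joint distribution of (X, Y) as a matrix: rows indexed by x, columns by y.
pxy = np.array([[0.30, 0.10],
                [0.05, 0.55]])
px = pxy.sum(axis=1)

H_XY = H(pxy.ravel())
H_X = H(px)
# H[Y | X] = sum_x p_X(x) H[Y | X = x]
H_Y_given_X = sum(px[x] * H(pxy[x] / px[x]) for x in range(len(px)))

print(H_XY, H_X + H_Y_given_X)            # equal: H[XY] = H[X] + H[Y | X]
print(H_Y_given_X, H(pxy.sum(axis=0)))    # conditioning reduces uncertainty: first <= second
```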


The noisy channel

Specification
Input alphabet: [m]
Output alphabet: [n]
Characteristics: Pr[output = j | input = i] = p_{j|i}.

Code of conduct
Encoding: {0, 1}^k → [m]^t
Decoding: [n]^t → {0, 1}^k

Goal
Error: Pr[input ≠ output] ≤ ε.
Rate: k/t should be as large as possible.


Capacity

Input to the channel: X ∈ [m]

Output of the channel: Y ∈ [n].

Definition (Capacity of a channel E)

C(E) = max_X ( H[X] + H[Y] − H[XY] ),

where the maximum is over distributions of the input X.
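For instance (an added sketch, assuming a binary symmetric channel with crossover probability f, which is not one of the slides' examples), a brute-force search over input distributions recovers the familiar value C = 1 − H2(f):

```python
import numpy as np

def mutual_information(px, channel):
    """I[X : Y] = H[X] + H[Y] - H[XY] for input distribution px and channel p(y|x)."""
    def H(p):
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())
    pxy = px[:, None] * channel               # joint distribution of (input, output)
    return H(px) + H(pxy.sum(axis=0)) - H(pxy.ravel())

f = 0.1                                        # crossover probability of the BSC
bsc = np.array([[1 - f, f],
                [f, 1 - f]])
capacity = max(mutual_information(np.array([q, 1 - q]), bsc)
               for q in np.linspace(0.001, 0.999, 999))
h2 = -(f * np.log2(f) + (1 - f) * np.log2(1 - f))
print(capacity, 1 - h2)                        # both about 0.531
```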


Jaane kya toone kahi . . . jaane kya meine suni (who knows what you said . . . who knows what I heard)

Theorem (Shannon)
Let C be the capacity of the channel. Then, for all ε > 0 and all k, there exist encoders and decoders such that

Encoding rate: k/t ≥ C − ε.
Error: Pr[error] → 0 as k → ∞.
Optimality: C − ε cannot be replaced by C + δ for any δ > 0.

. . . baat kuchch ban hi gayee! (something did come of it after all!)


Mutual information

Definition
For random variables X and Y with some joint probability distribution, their mutual information is

I[X : Y] = H[X] + H[Y] − H[XY]
         = H[X] − H[X | Y]
         = H[Y] − H[Y | X].
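A short numerical check that the three expressions agree (an added sketch; the joint distribution is arbitrary):

```python
import numpy as np

def H(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

pxy = np.array([[0.25, 0.25],
                [0.05, 0.45]])        # joint distribution of (X, Y)
px, py = pxy.sum(axis=1), pxy.sum(axis=0)

H_X, H_Y, H_XY = H(px), H(py), H(pxy.ravel())
H_X_given_Y = sum(py[y] * H(pxy[:, y] / py[y]) for y in range(len(py)))
H_Y_given_X = sum(px[x] * H(pxy[x] / px[x]) for x in range(len(px)))

print(H_X + H_Y - H_XY)       # the three ways of writing I[X : Y] ...
print(H_X - H_X_given_Y)      # ... all give
print(H_Y - H_Y_given_X)      # ... the same number
```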


Today: generating random variables remotely

Pair of random variables
Let (X, Y) be a pair of not necessarily independent random variables taking values in the set [m] × Y.

Alice: receives x ∈ [m]

⇒⇐

Bob: generates y ∈ Y

Goal
Pr[Bob's output = y | X = x] = Pr[Y = y | X = x].
Minimize the average number of bits sent by Alice. Let T[X : Y] be this quantity.


When X and Y are independent. . .

Bob can generate y on his own. No message from Alice is required.


When X and Y are highly correlated. . .

The case X = Y
Then

H[X] ≤ T[X : Y] ≤ H[X] + 1,

where H[X] is the Shannon entropy of X.


In general, . . .

A lower bound

I[X : Y] ≤ T[X : Y].

Proof
Let M be Alice's message to Bob. Then, X and Y are conditionally independent given M. So,

I[X : Y] ≤ I[X : YM] ≤ I[X : M] + I[X : Y | M] ≤ H[M] ≤ E[|M|],

since I[X : Y | M] = 0 by the conditional independence.


Bad News

A pair of random variables

X ∈_U ([n] choose n/2), i.e., X is a uniformly random subset of [n] of size n/2.
Y ∈_U X, i.e., Y is a uniformly random element of X.

I[X : Y] = 1 (since H[Y] = lg n and H[Y | X] = lg(n/2) = lg n − 1), yet T[X : Y] ≥ c lg n for some constant c > 0.


Not new . . .

Wyner's common information (1975)
Definition:

C[X : Y] = lim inf_{λ→0} [ lim_{m→∞} T_λ[X^m : Y^m] / m ].

Theorem:

C[X : Y] = min_W I[XY : W],

where the minimum is taken over all random variables W such that X and Y are conditionally independent given W.


Common information versus mutual information

T[X : Y] ≥ C[X : Y] ≥ I[X : Y].

There exist random variables where both inequalities are loose.

Example
X, Y ∈ {0, 1}^n. Let W = (i, b) ∈ [n] × {0, 1} be uniformly distributed, set X[i] = Y[i] = b, and generate the other 2(n − 1) bits uniformly and independently.

I[X : Y] = O(n^{−1/3});
C[X : Y] = 2 − o(1);
T[X : Y] = Θ(log n).


The right question?

Suppose Alice and Bob are allowed to share a random variable R generated independently of Alice's input.

Alice: receives x ∈ [m] ⇒

Bob: generates y ∈ [n]

Alice generates her message to Bob based on her input x, the random string R, and some of her own randomness.
Bob generates his output based on Alice's message, the random string R, and some of his own randomness.
Let T_R[X : Y] denote the minimum expected number of bits communicated (by Alice) in the best strategy for generating (X, Y) with shared randomness.


The first example revisited

A pair of random variables

X ∈_U ([n] choose n/2), i.e., X is a uniformly random subset of [n] of size n/2.
Y ∈_U X, i.e., Y is a uniformly random element of X.

Note that I[X : Y] = 1.

A strategy
Randomness. A random permutation R of [n].
Alice's message. The index i of the first element of this permutation that occurs in her set X.
Bob's output. The element at position i of R (which, given X, is uniformly distributed over X).
Communication cost. Note that E[i] ≤ 2, so i can be sent using O(1) expected bits.
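A small simulation of this strategy (an added sketch; the values of n and the number of trials are arbitrary), checking that the index Alice sends is at most 2 on average; by the symmetry of the random permutation, Bob's output is a uniformly random element of X, as required:

```python
import random

def run_protocol(n):
    X = set(random.sample(range(n), n // 2))        # Alice's uniformly random (n/2)-subset
    R = list(range(n))
    random.shuffle(R)                                # shared random permutation
    i = next(j for j, e in enumerate(R) if e in X)   # first position of R that lands in X
    return i + 1, R[i], X                            # (1-based index, Bob's output, X)

n, trials = 16, 20000
total_index, in_X = 0, 0
for _ in range(trials):
    i, y, X = run_protocol(n)
    total_index += i
    in_X += (y in X)

print(total_index / trials)    # expected index sent by Alice, at most 2
print(in_X == trials)          # Bob's output always lies in Alice's set X
```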


The main result

Theorem

I[X : Y] ≤ T_R[X : Y] ≤ I[X : Y] + O(lg(I[X : Y] + 1)).

Proof
Lower bound. A minor variation of the previous proof:

I[X : Y] ≤ I[X : MRY]
         ≤ I[X : MR] + I[X : Y | MR]
         ≤ I[X : M | R] + I[X : R]
         ≤ H[M | R]
         ≤ H[M]
         ≤ E[|M|],

using that X and Y are conditionally independent given (M, R), and that R is independent of X.


Upper bound

Definition
Given two distributions P and Q on a set Y, their relative entropy is

D(P‖Q) = ∑_{y∈Y} P(y) lg ( P(y) / Q(y) ).

Connection to mutual information

I[X : Y] = E_{x←X}[ D(Q_x ‖ Q) ],

where Q_x is the conditional distribution of Y given X = x, and Q is the marginal distribution of Y.

The idea
Randomness. R = ⟨y_1, y_2, . . . , y_i, . . .⟩, sampled independently from Q.
Alice's message. An index i∗.
Bob's output. The sample y_{i∗}.
Cost. Approximately D(Q_x ‖ Q).


Choosing the right index

Main lemma
Let P and Q be two distributions such that D(P‖Q) is finite. There is a procedure that, on input a sequence

R = ⟨y_1, y_2, . . . , y_i, . . .⟩

of independently drawn samples from Q, outputs an index i∗ such that
y_{i∗} has distribution P, and
E[length(i∗)] ≤ D(P‖Q) + 2 lg(D(P‖Q) + 1) + O(1),
where length(i∗) denotes the number of bits used to encode i∗ (with a prefix-free code for the integers).


Proof by example

Q = ⟨1/2, 1/8, 1/8, 1/4⟩;   Q_x = ⟨1/4, 3/4, 0, 0⟩.

R = ⟨y_1, y_2, . . . , y_i, . . .⟩ (samples drawn from Q)

Step 1: If y_1 = 1, accept with probability 1/2.
If y_1 = 2, accept with probability 1.
Otherwise, reject.

Step 2 onwards: Accept iff y_i = 2.
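The example follows a greedy rejection-sampling scheme. Below is a Python sketch of one such procedure consistent with the steps above (an added illustration, not the paper's exact pseudocode): at each stage the acceptance probability is chosen so that exactly the right amount of probability mass of each outcome is accounted for, which makes the accepted sample distributed exactly according to P. Sending the accepted index i∗ with a prefix-free code for the integers then gives the communication bound of the main lemma.

```python
import random
from collections import Counter

def choose_index(P, Q, rng=random):
    """Scan i.i.d. samples y_1, y_2, ... from Q and return (i_star, y_{i_star})
    so that the returned sample is distributed according to P."""
    covered = [0.0] * len(P)     # mass of each outcome already accounted for
    total = 0.0                  # total mass accounted for so far
    i = 0
    while True:
        i += 1
        y = rng.choices(range(len(Q)), weights=Q)[0]      # the shared sample y_i
        # Mass of outcome y that this stage is allowed to account for.
        alpha = min(P[y] - covered[y], (1.0 - total) * Q[y])
        if rng.random() < alpha / ((1.0 - total) * Q[y]):
            return i, y
        # On rejection, update the bookkeeping for every outcome before moving on.
        new_covered = [c + min(P[z] - c, (1.0 - total) * Q[z])
                       for z, c in enumerate(covered)]
        covered, total = new_covered, sum(new_covered)

Q = [1/2, 1/8, 1/8, 1/4]
P = [1/4, 3/4, 0.0, 0.0]      # Q_x from the example (outcomes are 0-indexed here)
counts = Counter(choose_index(P, Q)[1] for _ in range(20000))
print({y: counts[y] / 20000 for y in sorted(counts)})     # close to {0: 0.25, 1: 0.75}
```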


Analysis

Claim

E[lg i∗] ≤ D(P‖Q) + O(1).

Proof

D(P‖Q) = ∑_{y∈[n]} P(y) lg ( P(y) / Q(y) ).

Idea: the element y never needs to be generated after the first ⌈P(y)/Q(y)⌉ stages. So, when the output is y, lg i∗ ≤ lg(P(y)/Q(y)) + O(1), and taking the expectation over y ∼ P gives the claim.


Summary

Entropy, channel capacity, mutual information.

The problem of generating correlated random variables.

A ‘characterization’ of mutual information in terms of shared randomness.

Proof via a rejection sampling procedure based on relative entropy.


An application of this result

Theorem (One-shot Reverse Shannon Theorem)

C(P) ≤ T(P) ≤ C(P) + 2 lg(C(P) + 1) + O(1)


The Braverman-Rao protocol

Alice does not know Q
Alice: a distribution P on [n].
Bob: a distribution Q on [n].

Guarantee: D(P‖Q) ≤ k.
Goal: Alice and Bob agree on a value M whose distribution is P (or within distance ε of P). Minimize the communication T_ε(P‖Q).

Theorem (Braverman and Rao, 2010)

T_ε(P‖Q) = O( log(1/ε) · D(P‖Q) ).


Remarks

The first result was joint work with Prahladh Harsha, Rahul Jain, and David McAllester (2010).

The asymptotic versions of the main result and the one-shot Reverse Shannon Theorem were shown earlier by Winter (2002) and by Bennett, Shor, Smolin and Thapliyal (2002).

The one-shot version implies the asymptotic versions by a routine application of the law of large numbers.

We don't know if the 'extra' log term is necessary.
Precise asymptotic tradeoffs between communication and shared randomness were obtained by Bennett & Winter and by Paul Cuff.


Plan

1 Examples, the model, the set disjointness problem

2 Lower bounds for set disjointness, application to streaming

3 Round elimination, lower bounds for data structure problems

4 Remote generation of random variables, correlated sampling ⇐


Thank you
