Page 1

Hoeffding’s Bound

Theorem

Let $X_1, \dots, X_n$ be independent random variables with $E[X_i] = \mu_i$ and $\Pr(B_i \le X_i \le B_i + c_i) = 1$. Then

$$\Pr\left(\left|\sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \mu_i\right| \ge \varepsilon\right) \le 2e^{-2\varepsilon^2 / \sum_{i=1}^{n} c_i^2}$$

Do we need independence?
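For intuition, here is a minimal Python sketch (ours, not part of the deck; parameter values are illustrative) that estimates the tail probability by simulation for i.i.d. uniform $X_i$ on $[0,1]$, so $B_i = 0$, $c_i = 1$, $\mu_i = 1/2$, and compares it with the bound:

```python
import math
import random

# Empirical check of Hoeffding's bound for X_i ~ Uniform[0,1]
# (B_i = 0, c_i = 1, mu_i = 1/2); function name and parameters are our own.
def hoeffding_check(n=100, eps=10.0, trials=20000, seed=0):
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        s = sum(rng.random() for _ in range(n))
        if abs(s - n / 2) >= eps:
            exceed += 1
    empirical = exceed / trials
    bound = 2 * math.exp(-2 * eps**2 / n)  # sum of c_i^2 equals n here
    return empirical, bound

print(hoeffding_check())  # the empirical tail should not exceed the bound
```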

Page 2

Martingales

Definition

A sequence of random variables $Z_0, Z_1, \dots$ is a martingale with respect to the sequence $X_0, X_1, \dots$ if, for all $n \ge 0$, the following hold:

1. $Z_n$ is a function of $X_0, X_1, \dots, X_n$;

2. $E[|Z_n|] < \infty$;

3. $E[Z_{n+1} \mid X_0, X_1, \dots, X_n] = Z_n$.

Definition

A sequence of random variables $Z_0, Z_1, \dots$ is a martingale when it is a martingale with respect to itself, that is:

1. $E[|Z_n|] < \infty$;

2. $E[Z_{n+1} \mid Z_0, Z_1, \dots, Z_n] = Z_n$.

Page 3

Conditional Expectation

Definition

$$E[Y \mid Z = z] = \sum_y y \Pr(Y = y \mid Z = z),$$

where the summation is over all $y$ in the range of $Y$.

Lemma

For any random variables $X$ and $Y$,

$$E[X] = E_Y[E_X[X \mid Y]] = \sum_y \Pr(Y = y)\, E[X \mid Y = y],$$

where the sum is over all values in the range of $Y$.

Page 4

Lemma

For any random variables $X$ and $Y$,

$$E[X] = E_Y[E_X[X \mid Y]] = \sum_y \Pr(Y = y)\, E[X \mid Y = y],$$

where the sum is over all values in the range of $Y$.

Proof.

$$\sum_y \Pr(Y = y)\, E[X \mid Y = y] = \sum_y \Pr(Y = y) \sum_x x \Pr(X = x \mid Y = y)$$

$$= \sum_x \sum_y x \Pr(X = x \mid Y = y) \Pr(Y = y)$$

$$= \sum_x \sum_y x \Pr(X = x \cap Y = y) = \sum_x x \Pr(X = x) = E[X].$$

Page 5

Example

Consider a two-phase game:

• Phase I: roll one die. Let X be the outcome.

• Phase II: Flip X fair coins, let Y be the number of HEADs.

• You receive a dollar for each HEAD.

$Y$ is distributed $B(X, \frac{1}{2})$, so

$$E[Y \mid X = a] = \frac{a}{2}$$

$$E[Y] = \sum_{i=1}^{6} E[Y \mid X = i]\, \Pr(X = i) = \sum_{i=1}^{6} \frac{i}{2}\, \Pr(X = i) = \frac{7}{4}$$
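A quick simulation sketch (ours; parameters illustrative) confirming $E[Y] = 7/4$:

```python
import random

# Two-phase game: roll a die to get X, then flip X fair coins; Y = number of HEADs.
def average_heads(trials=200000, seed=1):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = rng.randint(1, 6)                              # Phase I: die roll
        total += sum(rng.randint(0, 1) for _ in range(x))  # Phase II: X coin flips
    return total / trials

print(average_heads())  # close to 7/4 = 1.75
```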

Page 6

Conditional Expectation as a Random variable

Definition

The expression $E[Y \mid Z]$ is a random variable $f(Z)$ that takes on the value $E[Y \mid Z = z]$ when $Z = z$.

Consider the outcome of rolling two dice: $X_1$, $X_2$, and $X = X_1 + X_2$.

$$E[X \mid X_1] = \sum_x x \Pr(X = x \mid X_1) = \sum_{x = X_1 + 1}^{X_1 + 6} x \cdot \frac{1}{6} = X_1 + \frac{7}{2}.$$

Consider the two-phase game:

$$E[Y \mid X] = \frac{X}{2}$$
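A small sketch (ours) estimating $E[X \mid X_1 = a]$ for each die value $a$ and comparing it with $a + \frac{7}{2}$:

```python
import random
from collections import defaultdict

# Estimate E[X | X1 = a] for the sum X = X1 + X2 of two dice.
rng = random.Random(2)
sums, counts = defaultdict(float), defaultdict(int)
for _ in range(300000):
    x1, x2 = rng.randint(1, 6), rng.randint(1, 6)
    sums[x1] += x1 + x2
    counts[x1] += 1

for a in range(1, 7):
    print(a, round(sums[a] / counts[a], 3), a + 3.5)  # simulated value vs X1 + 7/2
```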

Page 7

If E[Y | Z ] is a random variable, it has an expectation.

Theorem

$$E[Y] = E[E[Y \mid Z]].$$

$$E[X \mid X_1] = X_1 + \frac{7}{2}.$$

Thus

$$E[E[X \mid X_1]] = E\left[X_1 + \frac{7}{2}\right] = \frac{7}{2} + \frac{7}{2} = 7.$$

Page 8

Martingales

Definition

A sequence of random variables $Z_0, Z_1, \dots$ is a martingale with respect to the sequence $X_0, X_1, \dots$ if, for all $n \ge 0$, the following hold:

1. $Z_n$ is a function of $X_0, X_1, \dots, X_n$;

2. $E[|Z_n|] < \infty$;

3. $E[Z_{n+1} \mid X_0, X_1, \dots, X_n] = Z_n$.

Definition

A sequence of random variables $Z_0, Z_1, \dots$ is a martingale when it is a martingale with respect to itself, that is:

1. $E[|Z_n|] < \infty$;

2. $E[Z_{n+1} \mid Z_0, Z_1, \dots, Z_n] = Z_n$.

Page 9

Martingale Example

A series of fair games ($E[\text{gain}] = 0$), not necessarily independent.

Game 1: bet $1.

Game i > 1: bet 2i if won in round i − 1; bet i otherwise.

Xi = amount won in ith game. (Xi < 0 if ith game lost).

Zi = total winnings at end of ith game.

Page 10

Example

Xi = amount won in ith game. (Xi < 0 if ith game lost).

Zi = total winnings at end of ith game.

$Z_1, Z_2, \dots$ is a martingale with respect to $X_1, X_2, \dots$:

$E[X_i] = 0$.

$E[|Z_i|] \le \sum_{j=1}^{i} E[|X_j|] < \infty$, since each bet is bounded.

$E[Z_{i+1} \mid X_1, X_2, \dots, X_i] = Z_i + E[X_{i+1} \mid X_1, \dots, X_i] = Z_i$.

Page 11

Gambling Strategies

I play a series of fair games (each won with probability 1/2).

Game 1: bet $1.

Game i > 1: bet 2i if I won in round i − 1; bet i otherwise.

Xi = amount won in ith game. (Xi < 0 if ith game lost).

Zi = total winnings at end of ith game.

Assume that (before starting to play) I decide to quit after k games: what are my expected winnings?

Page 12

Lemma

Let $Z_0, Z_1, Z_2, \dots$ be a martingale with respect to $X_0, X_1, \dots$. For any fixed $n$,

$$E_{X[1:n]}[Z_n] = E[Z_0].$$

(Here $X[1:i] = X_1, \dots, X_i$.)

Proof.

Since $Z_0, Z_1, \dots$ is a martingale, $Z_{i-1} = E_{X_i}[Z_i \mid X_0, X_1, \dots, X_{i-1}]$.

Then

$$E_{X[1:i-1]}[Z_{i-1}] = E_{X[1:i-1]}\big[E_{X_i}[Z_i \mid X_0, X_1, \dots, X_{i-1}]\big].$$

But

$$E_{X[1:i-1]}\big[E_{X_i}[Z_i \mid X_0, X_1, \dots, X_{i-1}]\big] = E_{X[1:i]}[Z_i].$$

Thus,

$$E_{X[1:n]}[Z_n] = E_{X[1:n-1]}[Z_{n-1}] = \dots = E[Z_0].$$

Page 13

Gambling Strategies

I play a series of fair games (each won with probability 1/2).

Game 1: bet $1.

Game i > 1: bet 2i if I won in round i − 1; bet i otherwise.

Xi = amount won in ith game. (Xi < 0 if ith game lost).

Zi = total winnings at end of ith game.

Assume that (before starting to gamble) I decide to quit after k games: what are my expected winnings?

E[Zk ] = E[Z1] = 0.
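A simulation sketch (ours) of this betting rule, reading "bet 2i" as a bet of $2i$ dollars; the average of $Z_k$ over many runs should be close to 0:

```python
import random

# Betting rule from the slides: game 1 bet 1; game i > 1 bet 2*i after a win, i after a loss.
def average_winnings(k=20, trials=100000, seed=3):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        z, won_last = 0, False
        for i in range(1, k + 1):
            bet = 1 if i == 1 else (2 * i if won_last else i)
            won_last = rng.random() < 0.5
            z += bet if won_last else -bet
        total += z
    return total / trials

print(average_winnings())  # close to 0, matching E[Z_k] = 0
```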

Page 14

A Different Strategy

Same gambling game. What happens if I:

• play a random number of games?

• decide to stop only when I have won $1000?

Page 15

Stopping Time

Definition

A non-negative, integer random variable $T$ is a stopping time for the sequence $Z_0, Z_1, \dots$ if the event "$T = n$" depends only on the value of the random variables $Z_0, Z_1, \dots, Z_n$.

Intuition: a stopping time corresponds to a strategy for determining when to stop a sequence based only on the values seen so far.

In the gambling game:

• first time I win 10 games in a row: is a stopping time;

• the last time when I win: is not a stopping time.

Page 16

Consider again the gambling game: let T be a stopping time.

Zi = total winnings at end of ith game.

What are my winnings at the stopping time, i.e. E[ZT ]?

Fair game: E[ZT ] = E[Z0] = 0?

“T = first time my total winnings are at least 1000 dollars” is a stopping time, and $E[Z_T] \ge 1000$...

Page 17

Martingale Stopping Theorem

Theorem

If $Z_0, Z_1, \dots$ is a martingale with respect to $X_1, X_2, \dots$ and if $T$ is a stopping time for $X_1, X_2, \dots$, then

$$E[Z_T] = E[Z_0]$$

whenever one of the following holds:

• there is a constant $c$ such that, for all $i$, $|Z_i| \le c$;

• $T$ is bounded;

• $E[T] < \infty$, and there is a constant $c$ such that $E\big[\,|Z_{i+1} - Z_i|\,\big|\,X_1, \dots, X_i\big] < c$.

Page 18

Proof of Martingale Stopping Theorem (Sketch)

We need to show that $E[|Z_T|] < \infty$, so that we can use

$$E[Z_T] = E[Z_0] + \sum_{i \le T} E\big[E[(Z_i - Z_{i-1}) \mid X_1, \dots, X_{i-1}]\big].$$

• There is a constant $c$ such that, for all $i$, $|Z_i| \le c$: the sum is bounded.

• $T$ is bounded: the sum has a finite number of terms.

• $E[T] < \infty$, and there is a constant $c$ such that $E\big[\,|Z_{i+1} - Z_i|\,\big|\,X_1, \dots, X_i\big] < c$:

$$E[|Z_T|] \le E[|Z_0|] + \sum_{i=1}^{\infty} E\Big[E\big[\,|Z_{i+1} - Z_i|\,\big|\,X_1, \dots, X_i\big]\,\mathbf{1}_{i \le T}\Big] \le E[|Z_0|] + c \sum_{i=1}^{\infty} \Pr(T \ge i) \le E[|Z_0|] + c\,E[T] < \infty.$$

Page 19

Martingale Stopping Theorem Applications

We play a sequence of fair games with the following stopping rules:

1. $T$ is chosen from a distribution with finite expectation: $E[Z_T] = E[Z_0]$.

2. $T$ is the first time we have won 1000 dollars: $E[T]$ is unbounded.

3. We double the bet until the first win: $E[T] = 2$, but $E\big[\,|Z_{i+1} - Z_i|\,\big|\,X_1, \dots, X_i\big]$ is unbounded.

Page 20

Example: The Gambler’s Ruin

• Consider a sequence of independent, fair two-player gambling games.

• In each round, each player wins or loses one dollar with probability $\frac{1}{2}$.

• $X_i$ = amount won by player 1 in the $i$th round.

• If player 1 has lost in round $i$: $X_i < 0$.

• $Z_i$ = total amount won by player 1 after $i$ rounds.

• $Z_0 = 0$.

• The game ends when one player runs out of money:

• Player 1 must stop when she loses a net $\ell_1$ dollars ($Z_t = -\ell_1$).

• Player 2 must stop when she loses a net $\ell_2$ dollars ($Z_t = \ell_2$).

• $q$ = probability that the game ends with player 1 winning $\ell_2$ dollars.

Page 21

Example: The Gambler’s Ruin

• $T$ = first time player 1 wins $\ell_2$ dollars or loses $\ell_1$ dollars.

• $T$ is a stopping time for $X_1, X_2, \dots$.

• $Z_0, Z_1, \dots$ is a martingale.

• The $Z_i$'s are bounded.

• Martingale Stopping Theorem: $E[Z_T] = E[Z_0] = 0$.

$$E[Z_T] = q\ell_2 - (1 - q)\ell_1 = 0 \quad\Longrightarrow\quad q = \frac{\ell_1}{\ell_1 + \ell_2}$$
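A simulation sketch (ours) with illustrative values $\ell_1 = 4$, $\ell_2 = 9$, comparing the empirical probability with $\ell_1/(\ell_1 + \ell_2)$:

```python
import random

# Gambler's ruin: player 1 stops on reaching -l1 or +l2.
def ruin_probability(l1=4, l2=9, trials=100000, seed=4):
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        z = 0
        while -l1 < z < l2:
            z += 1 if rng.random() < 0.5 else -1
        wins += (z == l2)
    return wins / trials, l1 / (l1 + l2)

print(ruin_probability())  # the two numbers should be close
```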

Page 22

Example: A Ballot Theorem

• Candidates A and B run in an election.

• Candidate A gets a votes.

• Candidate B gets b votes.

• a > b.

• Votes are counted in a random order:

• chosen uniformly from all permutations of the $n = a + b$ votes.

• What is the probability that A is always ahead in the count?

Page 23

Example: A Ballot Theorem

• Si = number of votes A is leading by after i votes counted

• If A is trailing: $S_i < 0$.

• $S_n = a - b$.

• For $0 \le k \le n - 1$: $X_k = \frac{S_{n-k}}{n-k}$.

• Consider $X_0, X_1, \dots, X_{n-1}$.

• This sequence goes backward in time!

E[Xk |X0,X1, . . . ,Xk−1] = ?

Page 24

Example: A Ballot Theorem

E[Xk |X0,X1, . . . ,Xk−1] = ?

• Conditioning on $X_0, X_1, \dots, X_{k-1}$ is equivalent to conditioning on $S_n, S_{n-1}, \dots, S_{n-k+1}$.

• $a_i$ = number of votes for A after the first $i$ votes are counted.

• The $(n-k+1)$th counted vote is a uniformly random vote among the first $n-k+1$ votes.

$$S_{n-k} = \begin{cases} S_{n-k+1} + 1 & \text{if the } (n-k+1)\text{th vote is for } B\\ S_{n-k+1} - 1 & \text{if the } (n-k+1)\text{th vote is for } A\end{cases}$$

$$S_{n-k} = \begin{cases} S_{n-k+1} + 1 & \text{with prob. } \frac{n-k+1-a_{n-k+1}}{n-k+1}\\[4pt] S_{n-k+1} - 1 & \text{with prob. } \frac{a_{n-k+1}}{n-k+1}\end{cases}$$

Page 25

$$E[S_{n-k} \mid S_{n-k+1}] = (S_{n-k+1} + 1)\,\frac{n-k+1-a_{n-k+1}}{n-k+1} + (S_{n-k+1} - 1)\,\frac{a_{n-k+1}}{n-k+1} = S_{n-k+1}\,\frac{n-k}{n-k+1}$$

(since $2a_{n-k+1} - (n-k+1) = S_{n-k+1}$).

$$E[X_k \mid X_0, X_1, \dots, X_{k-1}] = E\left[\frac{S_{n-k}}{n-k}\,\middle|\,S_n, \dots, S_{n-k+1}\right] = \frac{S_{n-k+1}}{n-k+1} = X_{k-1}$$

$\Longrightarrow X_0, X_1, \dots, X_{n-1}$ is a martingale.

Page 26

Example: A Ballot Theorem

$$T = \begin{cases} \min\{k : X_k = 0\} & \text{if such a } k \text{ exists}\\ n - 1 & \text{otherwise}\end{cases}$$

• T is a stopping time.

• T is bounded.

• Martingale Stopping Theorem:

$$E[X_T] = E[X_0] = \frac{E[S_n]}{n} = \frac{a-b}{a+b}.$$

Two cases:

1 A leads throughout the count.

2 A does not lead throughout the count.

Page 27

1 A leads throughout the count.

For 0 ≤ k ≤ n − 1: Sn−k > 0, then Xk > 0.

T = n − 1.

XT = Xn−1 = S1.

A gets the first vote in the count: S1 = 1, then XT = 1.

2 A does not lead throughout the count.

For some $m \ge 2$: $S_m = 0$, hence $X_{n-m} = 0$.

$T \le n - m < n - 1$.

XT = 0.

Page 28

Example: A Ballot Theorem

Putting it all together:

1 A leads throughout the count: XT = 1.

2 A does not lead throughout the count: XT = 0

$$E[X_T] = \frac{a-b}{a+b} = 1 \cdot \Pr(\text{Case 1}) + 0 \cdot \Pr(\text{Case 2}).$$

That is,

$$\Pr(\text{A leads throughout the count}) = \frac{a-b}{a+b}.$$
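A simulation sketch (ours; vote totals illustrative) of the random counting order:

```python
import random

# Ballot theorem: probability that A stays strictly ahead throughout the count.
def ballot_probability(a=12, b=7, trials=100000, seed=5):
    rng = random.Random(seed)
    votes = [1] * a + [-1] * b      # +1 for a vote for A, -1 for a vote for B
    always_ahead = 0
    for _ in range(trials):
        rng.shuffle(votes)
        lead, ahead = 0, True
        for v in votes:
            lead += v
            if lead <= 0:
                ahead = False
                break
        always_ahead += ahead
    return always_ahead / trials, (a - b) / (a + b)

print(ballot_probability())  # empirical frequency vs (a-b)/(a+b)
```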

Page 29

A Different Gambling Game

Two stages:

1 roll one die; let X be the outcome;

2 roll X standard dice; your gain Z is the sum of the outcomes of the X dice.

What is your expected gain?

Page 30

Wald’s Equation

Theorem

Let $X_1, X_2, \dots$ be nonnegative, independent, identically distributed random variables with distribution $X$. Let $T$ be a stopping time for this sequence. If $T$ and $X$ have bounded expectation, then

$$E\left[\sum_{i=1}^{T} X_i\right] = E[T]\, E[X].$$

Note that $T$ is not independent of $X_1, X_2, \dots$. This is a corollary of the martingale stopping theorem.

Page 31

Proof.

For $i \ge 1$, let $Z_i = \sum_{j=1}^{i} (X_j - E[X])$.

The sequence $Z_1, Z_2, \dots$ is a martingale with respect to $X_1, X_2, \dots$; $E[Z_1] = 0$, $E[T] < \infty$, and since the $X_i$ are nonnegative,

$$E\big[\,|Z_{i+1} - Z_i|\,\big|\,X_1, \dots, X_i\big] = E[\,|X_{i+1} - E[X]|\,] \le 2E[X].$$

Hence we can apply the martingale stopping theorem to compute

$$E[Z_T] = E[Z_1] = 0.$$

We now find

$$0 = E[Z_T] = E\left[\sum_{j=1}^{T} (X_j - E[X])\right] = E\left[\sum_{j=1}^{T} X_j - T\,E[X]\right] = E\left[\sum_{j=1}^{T} X_j\right] - E[T]\cdot E[X],$$

which gives $E\left[\sum_{j=1}^{T} X_j\right] = E[T]\,E[X]$.

Page 32

A Different Gambling Game

Two stages:

1 roll one die; let X be the outcome;

2 roll X standard dice; your gain Z is the sum of the outcomes of the X dice.

What is your expected gain?

Yi = outcome of ith die in second stage.

$$E[Z] = E\left[\sum_{i=1}^{X} Y_i\right].$$

$X$ is a stopping time for $Y_1, Y_2, \dots$.

By Wald's equation:

$$E[Z] = E[X]\, E[Y_i] = \left(\frac{7}{2}\right)^2.$$
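A simulation sketch (ours) of the two-stage game, comparing the empirical mean with $E[X]\,E[Y_i] = (7/2)^2 = 12.25$:

```python
import random

# Two-stage dice game: roll one die (X), then roll X dice; Z = sum of those X outcomes.
def average_gain(trials=200000, seed=6):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        x = rng.randint(1, 6)
        total += sum(rng.randint(1, 6) for _ in range(x))
    return total / trials

print(average_gain(), (7 / 2) ** 2)  # empirical mean vs 12.25
```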

Page 33

Example: a k-run

• We flip a fair coin until we get a consecutive sequence of $k$ HEADs.

• What is the expected number of times we flip the coin?

• A SWITCH is a HEAD followed by a TAIL.

• Let $X_1$ be the number of flips till $k$ HEADs or the first SWITCH.

• Let $X_i$ be the number of flips following the $(i-1)$st SWITCH till $k$ HEADs or the next SWITCH ($X_i$ includes the last HEAD or TAIL).

• Let $T$ be the first $i$ such that $X_i$ ends with $k$ HEADs.

$$E[X_i] = \sum_{j \ge 1} j\,2^{-j} + \sum_{j=1}^{k-1} j\,2^{-j} + (k-1)\,2^{-(k-1)}, \qquad E[T] = 2^{k-1}$$

• By Wald's equation, the expected number of coin flips is $E[X_i]\,E[T]$.

Page 34

• Let $X_i$ be the number of flips following the $(i-1)$st SWITCH till $k$ HEADs or the next SWITCH ($X_i$ includes the last HEAD or TAIL).

• Let $T$ be the first $i$ such that $X_i$ ends with $k$ HEADs.

• $X_i$ = number of flips till (and including) the first HEAD, plus either up to $k-2$ further HEADs followed by a TAIL, or $k-1$ further HEADs.

$$E[X_i] = \sum_{j \ge 1} j\,2^{-j} + \sum_{j=1}^{k-1} j\,2^{-j} + (k-1)\,2^{-(k-1)}$$

• The probability that $X_i$ ends with $k$ HEADs is $2^{-(k-1)}$: a sequence of $k-1$ HEADs following the first one. Hence

$$E[T] = 2^{k-1}$$

• By Wald's equation, the expected number of coin flips is $E[X_i]\,E[T] = 2^{k+1} - 2$.
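A simulation sketch (ours) for $k = 5$; the product $E[X_i]\,E[T]$ above simplifies to $2^{k+1} - 2$:

```python
import random

# Expected number of fair-coin flips until k HEADs in a row.
def average_flips(k, trials=50000, seed=7):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        run = flips = 0
        while run < k:
            flips += 1
            run = run + 1 if rng.random() < 0.5 else 0
        total += flips
    return total / trials

k = 5
print(average_flips(k), 2 ** (k + 1) - 2)  # simulated mean vs 2^(k+1) - 2
```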

Page 35

Hoeffding’s Bound

Theorem

Let $X_1, \dots, X_n$ be independent random variables with $E[X_i] = \mu_i$ and $\Pr(B_i \le X_i \le B_i + c_i) = 1$. Then

$$\Pr\left(\left|\sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \mu_i\right| \ge \varepsilon\right) \le 2e^{-2\varepsilon^2 / \sum_{i=1}^{n} c_i^2}$$

Do we need independence?

Page 36

Tail Inequalities

Theorem (Azuma-Hoeffding Inequality)

Let $Z_0, Z_1, \dots, Z_n$ be a martingale (with respect to $X_1, X_2, \dots$) such that $|Z_k - Z_{k-1}| \le c_k$. Then, for all $t \ge 0$ and any $\lambda > 0$,

$$\Pr(|Z_t - Z_0| \ge \lambda) \le 2e^{-\lambda^2 / (2\sum_{k=1}^{t} c_k^2)}.$$

The following corollary is often easier to apply.

Corollary

Let $X_0, X_1, \dots$ be a martingale such that for all $k \ge 1$,

$$|X_k - X_{k-1}| \le c.$$

Then for all $t \ge 1$ and $\lambda > 0$,

$$\Pr\big(|X_t - X_0| \ge \lambda c \sqrt{t}\big) \le 2e^{-\lambda^2 / 2}.$$

Page 37

Tail Inequalities: A More General Form

Theorem (Azuma-Hoeffding Inequality)

Let $Z_0, Z_1, \dots$ be a martingale with respect to $X_0, X_1, X_2, \dots$ such that

$$B_k \le Z_k - Z_{k-1} \le B_k + c_k,$$

for some constants $c_k$ and for some random variables $B_k$ that may be functions of $X_0, X_1, \dots, X_{k-1}$. Then, for any $t \ge 0$ and $\lambda > 0$,

$$\Pr(|Z_t - Z_0| \ge \lambda) \le 2e^{-2\lambda^2 / \sum_{k=1}^{t} c_k^2}.$$

Page 38

Proof.

Let $X^k = (X_0, \dots, X_k)$ and $Y_i = Z_i - Z_{i-1}$.

Since $E[Z_i \mid X^{i-1}] = Z_{i-1}$,

$$E[Y_i \mid X^{i-1}] = E[Z_i - Z_{i-1} \mid X^{i-1}] = 0.$$

Since $\Pr(B_i \le Y_i \le B_i + c_i \mid X^{i-1}) = 1$, by Hoeffding's Lemma:

$$E[e^{\beta Y_i} \mid X^{i-1}] \le e^{\beta^2 c_i^2 / 8}.$$

Lemma (Hoeffding's Lemma)

Let $X$ be a random variable such that $\Pr(X \in [a, b]) = 1$ and $E[X] = 0$. Then for every $\lambda > 0$,

$$E[e^{\lambda X}] \le e^{\lambda^2 (b-a)^2 / 8}.$$

Page 39

Proof of the Lemma

Since $f(x) = e^{\lambda x}$ is a convex function, any $x \in [a, b]$ can be written as $x = \alpha a + (1-\alpha) b$ with $\alpha = \frac{b-x}{b-a} \in [0, 1]$, and then

$$f(x) \le \alpha f(a) + (1 - \alpha) f(b),$$

that is,

$$e^{\lambda x} \le \frac{b - x}{b - a}\, e^{\lambda a} + \frac{x - a}{b - a}\, e^{\lambda b}.$$

Taking expectations and using $E[X] = 0$, we have

$$E\big[e^{\lambda X}\big] \le \frac{b}{b - a}\, e^{\lambda a} - \frac{a}{b - a}\, e^{\lambda b} \le e^{\lambda^2 (b-a)^2 / 8}.$$

Page 40

Proof of Azuma-Hoeffding Inequality

$$E\big[e^{\beta Y_i} \,\big|\, X^{i-1}\big] \le e^{\beta^2 c_i^2 / 8}.$$

$$E_{X^n}\big[e^{\beta \sum_{i=1}^{n} Y_i}\big] = E_{X^{n-1}}\Big[E_{X_n}\big[e^{\beta \sum_{i=1}^{n} Y_i} \,\big|\, X^{n-1}\big]\Big]$$

$$= E_{X^{n-1}}\Big[e^{\beta \sum_{i=1}^{n-1} Y_i}\, E_{X_n}\big[e^{\beta Y_n} \mid X^{n-1}\big]\Big] \le e^{\beta^2 c_n^2 / 8}\, E_{X^{n-1}}\big[e^{\beta \sum_{i=1}^{n-1} Y_i}\big] \le e^{\beta^2 \sum_{i=1}^{n} c_i^2 / 8}$$

Page 41

$$E\big[e^{\beta \sum_{i=1}^{n} Y_i}\big] \le e^{\beta^2 \sum_{i=1}^{n} c_i^2 / 8}$$

$$\Pr(Z_t - Z_0 \ge \lambda) = \Pr\left(\sum_{i=1}^{t} Y_i \ge \lambda\right) \le \frac{E\big[e^{\beta \sum_{i=1}^{t} Y_i}\big]}{e^{\beta \lambda}} \le e^{-\lambda \beta}\, e^{\beta^2 \sum_{i=1}^{t} c_i^2 / 8} = e^{-2\lambda^2 / \sum_{k=1}^{t} c_k^2},$$

for $\beta = \frac{4\lambda}{\sum_{i=1}^{t} c_i^2}$.

Applying the same argument to the martingale $-Z_0, -Z_1, \dots$ and adding the two tails gives

$$\Pr(|Z_t - Z_0| \ge \lambda) \le 2e^{-2\lambda^2 / \sum_{k=1}^{t} c_k^2}.$$

Page 42

Example

Assume that you play a sequence of $n$ fair games, where the bet $b_i$ in game $i$ may depend on the outcomes of the previous games. Let $B = \max_i b_i$. Since each increment lies in $[-b_i, b_i]$ (a range of length $2b_i$), Azuma-Hoeffding bounds the probability of winning or losing more than $\lambda$ by

$$\Pr(|Z_n| \ge \lambda) \le 2e^{-\lambda^2 / (2nB^2)}$$

$$\Pr(|Z_n| \ge \lambda B \sqrt{n}) \le 2e^{-\lambda^2 / 2}$$

$$\Pr\left(|Z_n| \ge \lambda \sqrt{\sum_{i=1}^{n} b_i^2}\right) \le 2e^{-\lambda^2 / 2}$$
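A sketch (ours) with a simple adaptive rule (bet 2 after a win, 1 after a loss, so $B = 2$), comparing the empirical tail with the first bound:

```python
import math
import random

# Adaptive betting: the bet in game i depends on earlier outcomes.
def adaptive_betting_tail(n=100, lam=25.0, trials=50000, seed=8):
    rng = random.Random(seed)
    B = 2.0
    exceed = 0
    for _ in range(trials):
        z, won_last = 0.0, False
        for _ in range(n):
            bet = 2.0 if won_last else 1.0
            won_last = rng.random() < 0.5
            z += bet if won_last else -bet
        exceed += abs(z) >= lam
    return exceed / trials, 2 * math.exp(-lam**2 / (2 * n * B**2))

print(adaptive_betting_tail())  # empirical tail probability vs the bound
```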

Page 43

Doob Martingale

Let $X_1, X_2, \dots, X_n$ be a sequence of random variables. Let $Y = f(X_1, \dots, X_n)$ be a random variable with $E[|Y|] < \infty$.

For $i = 0, 1, \dots, n$, let

$$Z_0 = E[Y] = E_{X[1,n]}[f(X_1, \dots, X_n)]$$

$$Z_i = E_{X[i+1,n]}[Y \mid X_1 = x_1, X_2 = x_2, \dots, X_i = x_i]$$

$$Z_n = E[Y \mid X_1 = x_1, X_2 = x_2, \dots, X_n = x_n] = f(x_1, \dots, x_n)$$

Theorem

$Z_0, Z_1, \dots, Z_n$ is a martingale with respect to $X_1, X_2, \dots, X_n$.

Page 44

Proof.

We use:

Fact

$$E\big[E[V \mid U, W] \,\big|\, W\big] = E[V \mid W].$$

Here $Y = f(X_1, \dots, X_n)$, $Z_0 = E[Y]$, and $Z_i = E_{X[i+1,n]}[Y \mid X_1 = x_1, \dots, X_i = x_i]$.

$Z_0, Z_1, \dots, Z_n$ is a martingale if

$$E_{X_{i+1}}[Z_{i+1} \mid X_1 = x_1, \dots, X_i = x_i] = Z_i.$$

$$E_{X_{i+1}}[Z_{i+1} \mid x_1, x_2, \dots, x_i] = E_{X_{i+1}}\big[E_{X[i+2,n]}[Y \mid X_1, \dots, X_{i+1}] \,\big|\, x_1, \dots, x_i\big] = E_{X[i+1,n]}[Y \mid x_1, x_2, \dots, x_i] = Z_i.$$

Page 45

Simple Example

$Y = f(X_1, \dots, X_n) = \sum_{i=1}^{n} X_i$, where the $X_i$'s are independent and uniformly distributed on $[0, 1]$.

$$Z_0 = E[Y] = E_{X[1,n]}[f(X_1, \dots, X_n)] = E\left[\sum_{i=1}^{n} X_i\right] = \frac{n}{2}$$

$$Z_i = E_{X[i+1,n]}[Y \mid x_1, \dots, x_i] = \sum_{j=1}^{i} x_j + E\left[\sum_{j=i+1}^{n} X_j\right] = \sum_{j=1}^{i} x_j + \frac{n-i}{2}$$

$$Z_n = E[Y \mid x_1, \dots, x_n] = f(x_1, \dots, x_n) = \sum_{j=1}^{n} x_j$$

$$E_{X_{i+1}}[Z_{i+1} \mid x_1, \dots, x_i] = E_{X_{i+1}}\left[\sum_{j=1}^{i+1} X_j + \frac{n-i-1}{2}\,\middle|\,x_1, \dots, x_i\right] = \sum_{j=1}^{i} x_j + \frac{n-i}{2} = Z_i$$

Page 46

Example: Polya’s Urn

• Start with M balls, R red, M − R blue.

• Repeat $n$ times: pick a ball uniformly at random; if it is red, add a red ball, else add a blue ball.

• $X_i = 1$ if we add a red ball in step $i$, else $X_i = 0$.

• We want to estimate $S_n(R/M) = \sum_{i=1}^{n} X_i = f(X_1, \dots, X_n)$.

• Claim: $E[S_n(R/M)] = nR/M$. Proof by induction on $t$ that $E[S_t] = tR/M$:

$$E[S_{t+1} \mid S_t] = S_t + \frac{R + S_t}{M + t}$$

$$E[S_{t+1}] = E[E[S_{t+1} \mid S_t]] = E\left[S_t + \frac{R + S_t}{M + t}\right] = \frac{tR}{M} + \frac{R + tR/M}{M + t} = (t+1)\frac{R}{M}$$
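A simulation sketch (ours; urn parameters illustrative) checking $E[S_n] = nR/M$:

```python
import random

# Polya's urn: start with R red and M - R blue balls; each step add a ball of the drawn color.
def average_red_additions(M=10, R=3, n=50, trials=50000, seed=9):
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        red, size, added = R, M, 0
        for _ in range(n):
            if rng.random() < red / size:
                red += 1
                added += 1
            size += 1
        total += added
    return total / trials, n * R / M

print(average_red_additions())  # empirical E[S_n] vs nR/M
```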

Page 47

Example: Polya’s Urn

Start with $M$ balls, $R$ red and $M - R$ blue. Repeat $n$ times: pick a ball uniformly at random; if it is red, add a red ball, else add a blue ball. $X_i = 1$ if a red ball is added in step $i$, else $X_i = 0$; $S_n(R/M) = \sum_{i=1}^{n} X_i$, and $E[S_n(R/M)] = nR/M$.

Let $Z_i = E[S_n \mid X_1 = x_1, \dots, X_i = x_i]$. We prove that $Z_1, \dots, Z_n$ is a martingale.

$$Z_i = E[S_n \mid X_1 = x_1, \dots, X_i = x_i] = \sum_{j=1}^{i} x_j + E\left[S_{n-i}\!\left(\frac{R + \sum_{j=1}^{i} x_j}{M + i}\right)\right] = \sum_{j=1}^{i} x_j + (n-i)\,\frac{R + \sum_{j=1}^{i} x_j}{M + i}$$

$$E[Z_{i+1} \mid X_1, \dots, X_i] = E\big[E[S_n \mid X_1, X_2, \dots, X_{i+1}] \,\big|\, X_1 = x_1, \dots, X_i = x_i\big]$$

$$= E\left[\sum_{j=1}^{i} x_j + X_{i+1} + S_{n-i-1}\!\left(\frac{R + \sum_{j=1}^{i} x_j + X_{i+1}}{M + i + 1}\right)\right]$$

Page 48

$$Z_i = E[S_n \mid X_1 = x_1, \dots, X_i = x_i] = \sum_{j=1}^{i} x_j + (n-i)\,\frac{R + \sum_{j=1}^{i} x_j}{M + i}$$

$$E[Z_{i+1} \mid X_1, \dots, X_i] = E\left[\sum_{j=1}^{i} x_j + X_{i+1} + S_{n-i-1}\!\left(\frac{R + \sum_{j=1}^{i} x_j + X_{i+1}}{M + i + 1}\right)\right]$$

$$= E\left[\sum_{j=1}^{i} x_j + X_{i+1} + (n-i-1)\,\frac{R + \sum_{j=1}^{i} x_j + X_{i+1}}{M + i + 1}\right]$$

$$= \sum_{j=1}^{i} x_j + \frac{R + \sum_{j=1}^{i} x_j}{M + i} + (n-i-1)\,\frac{R + \sum_{j=1}^{i} x_j + \frac{R + \sum_{j=1}^{i} x_j}{M + i}}{M + i + 1}$$

$$= \sum_{j=1}^{i} x_j + \frac{R + \sum_{j=1}^{i} x_j}{M + i} + (n-i-1)\,\frac{\frac{M+i+1}{M+i}\left(R + \sum_{j=1}^{i} x_j\right)}{M + i + 1} = Z_i$$

Page 49

Example: Edge Exposure Martingale

Let $G$ be a random graph from $G_{n,p}$. Consider the $m = \binom{n}{2}$ possible edges in an arbitrary order.

$$X_i = \begin{cases} 1 & \text{if the } i\text{th edge is present}\\ 0 & \text{otherwise}\end{cases}$$

F (G ) = size of maximum clique in G .

Z0 = E[F (G )]

Zi = E[F (G )|X1,X2, . . . ,Xi ], for i = 1, . . . ,m.

Z0,Z1, . . . ,Zm is a Doob martingale.

(F (G ) could be any finite-valued function on graphs.)

Page 50

Tail Inequalities: Doob Martingales

Let $X_1, \dots, X_n$ be a sequence of random variables.

Random variable Y :

• Y is a function of X1,X2, . . . ,Xn;

• E[|Y |] <∞.

Let Zi = E[Y |X1, . . . ,Xi ], i = 0, 1, . . . , n.

$Z_0, Z_1, \dots, Z_n$ is a martingale with respect to $X_1, \dots, X_n$.

If we can use Azuma-Hoeffding inequality:

Pr(|Zn − Z0| ≥ λ) ≤ .....

then we have,

Pr(|Y − E[Y ]| ≥ λ) ≤ ......

We need a bound on |Zi − Zi−1|.

Page 51

McDiarmid Bound

Theorem

Assume that $f(X_1, X_2, \dots, X_n)$ satisfies

$$|f(x_1, \dots, x_i, \dots, x_n) - f(x_1, \dots, y_i, \dots, x_n)| \le c_i,$$

and that $X_1, \dots, X_n$ are independent. Then

$$\Pr\big(|f(X_1, \dots, X_n) - E[f(X_1, \dots, X_n)]| \ge \lambda\big) \le 2e^{-2\lambda^2 / \sum_{k=1}^{n} c_k^2}.$$

[Changing the value of $X_i$ changes the value of the function by at most $c_i$.]

Page 52

Proof

Define a Doob martingale Z0,Z1, . . . ,Zn:

• $Z_0 = E[f(X_1, \dots, X_n)] = E[f(\bar{X})]$

• $Z_i = E[f(X_1, \dots, X_n) \mid X_1, \dots, X_i] = E[f(\bar{X}) \mid X^i]$

• $Z_n = f(X_1, \dots, X_n) = f(\bar{X})$

We want to prove that this martingale satisfies the conditions of

Theorem (Azuma-Hoeffding Inequality)

Let $Z_0, Z_1, \dots$ be a martingale with respect to $X_0, X_1, X_2, \dots$ such that

$$B_k \le Z_k - Z_{k-1} \le B_k + c_k,$$

for some constants $c_k$ and for some random variables $B_k$ that may be functions of $X_0, X_1, \dots, X_{k-1}$. Then, for all $t \ge 0$ and any $\lambda > 0$,

$$\Pr(|Z_t - Z_0| \ge \lambda) \le 2e^{-2\lambda^2 / \sum_{k=1}^{t} c_k^2}.$$

Page 53

Lemma

If $X_1, \dots, X_n$ are independent, then for some random variable $B_k$,

$$B_k \le Z_k - Z_{k-1} \le B_k + c_k.$$

$$Z_k - Z_{k-1} = E[f(\bar{X}) \mid X^k] - E[f(\bar{X}) \mid X^{k-1}].$$

Hence $Z_k - Z_{k-1}$ is bounded above by

$$\sup_x E[f(\bar{X}) \mid X^{k-1}, X_k = x] - E[f(\bar{X}) \mid X^{k-1}]$$

and bounded below by

$$\inf_y E[f(\bar{X}) \mid X^{k-1}, X_k = y] - E[f(\bar{X}) \mid X^{k-1}].$$

Thus, we need to show

$$\sup_x E[f(\bar{X}) \mid X^{k-1}, X_k = x] - \inf_y E[f(\bar{X}) \mid X^{k-1}, X_k = y] \le c_k.$$

Page 54

$$Z_k - Z_{k-1} \le \sup_{x,y}\, E\big[f(\bar{X}, x) - f(\bar{X}, y) \,\big|\, X^{k-1}\big],$$

where $f(\bar{X}, x)$ denotes $f$ with the $k$th coordinate fixed to $x$.

Because the $X_i$ are independent, the values of $X_{k+1}, \dots, X_n$ do not depend on the values of $X_1, \dots, X_k$. Hence, for any values $\bar{z}, x, y$ we have $\Pr(\bar{z}, x) = \Pr(\bar{z}, y)$, and therefore

$$\sup_{x,y}\, E\big[f(\bar{X}, x) - f(\bar{X}, y) \,\big|\, X_1 = z_1, \dots, X_{k-1} = z_{k-1}\big] = \sup_{x,y} \sum_{z_{k+1}, \dots, z_n} \Pr\big((X_{k+1} = z_{k+1}) \cap \dots \cap (X_n = z_n)\big)\,\big(f(\bar{z}, x) - f(\bar{z}, y)\big).$$

But

$$f(\bar{z}, x) - f(\bar{z}, y) \le c_k,$$

and therefore

$$E\big[f(\bar{X}, x) - f(\bar{X}, y) \,\big|\, X^{k-1}\big] \le c_k.$$

Page 55

Example: Pattern Matching

Given a string and a pattern: is the pattern interesting?

Does it appear more often than is expected in a random string?

Is the number of occurrences of the pattern concentrated around the expectation?

Page 56

$A = (a_1, a_2, \dots, a_n)$ is a string of characters, each chosen independently and uniformly at random from $\Sigma$, with $m = |\Sigma|$.

pattern: B = (b1, . . . , bk) fixed string, bi ∈ Σ.

F = number of occurrences of B in random string A.

E[F ] = ?

Page 57

$A = (a_1, a_2, \dots, a_n)$ is a string of characters, each chosen independently and uniformly at random from $\Sigma$, with $m = |\Sigma|$.

Pattern: $B = (b_1, \dots, b_k)$ is a fixed string, $b_i \in \Sigma$.

$F$ = number of occurrences of $B$ in the random string $A$.

$$E[F] = (n - k + 1)\left(\frac{1}{m}\right)^{k}.$$

Can we bound the deviation of F from its expectation?

Page 58

F = number of occurrences of B in the random string A.

Z0 = E[F ]

Zi = E[F |a1, . . . , ai ], for i = 1, . . . , n.

Z0,Z1, . . . ,Zn is a Doob martingale.

Zn = F .

Page 59

F = number of occurrences of B in the random string A.

Z0 = E[F ]

Zi = E[F |a1, . . . , ai ], for i = 1, . . . , n.

Z0,Z1, . . . ,Zn is a Doob martingale.

Zn = F .

Each character in $A$ can participate in no more than $k$ occurrences of $B$:

$$|Z_i - Z_{i+1}| \le k.$$

Azuma-Hoeffding inequality (version 1):

$$\Pr(|F - E[F]| \ge \lambda) \le 2e^{-\lambda^2 / (2nk^2)}.$$
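A simulation sketch (ours; the pattern, alphabet size, and string length are illustrative) estimating $E[F]$:

```python
import random

# Occurrences of a fixed pattern B in a uniformly random string A over an alphabet of size m.
def pattern_occurrences(n=200, B="aba", m=3, trials=20000, seed=10):
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz"[:m]
    k = len(B)
    total = 0
    for _ in range(trials):
        A = "".join(rng.choice(alphabet) for _ in range(n))
        total += sum(A[i:i + k] == B for i in range(n - k + 1))
    return total / trials, (n - k + 1) * (1 / m) ** k

print(pattern_occurrences())  # empirical E[F] vs (n-k+1)(1/m)^k
```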

Page 60

Application: Balls and Bins

We throw $m$ balls independently and uniformly at random into $n$ bins.

Let $X_i$ = the bin that the $i$th ball falls into.

Let $F$ be the number of empty bins after the $m$ balls are thrown. Then the sequence

$$Z_i = E[F \mid X_1, \dots, X_i]$$

is a Doob martingale.

$F = f(X_1, X_2, \dots, X_m)$ satisfies the Lipschitz condition with bound 1, thus $|Z_{i+1} - Z_i| \le 1$.

We therefore obtain

$$\Pr(|F - E[F]| \ge \varepsilon) \le 2e^{-2\varepsilon^2 / m}$$

Here

$$E[F] = n\left(1 - \frac{1}{n}\right)^{m},$$

but we could obtain the concentration result without knowing $E[F]$.
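A simulation sketch (ours; parameters illustrative) for the empty-bins count:

```python
import math
import random

# Empty bins after throwing m balls into n bins; F is 1-Lipschitz in the ball placements.
def empty_bins_concentration(n=100, m=300, eps=15.0, trials=20000, seed=11):
    rng = random.Random(seed)
    expected = n * (1 - 1 / n) ** m
    exceed = 0
    for _ in range(trials):
        occupied = set(rng.randrange(n) for _ in range(m))
        F = n - len(occupied)
        exceed += abs(F - expected) >= eps
    return exceed / trials, 2 * math.exp(-2 * eps**2 / m)

print(empty_bins_concentration())  # empirical deviation probability vs the McDiarmid bound
```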

Page 61

Application: Chromatic Number

Given a random graph $G$ in $G_{n,p}$, the chromatic number $\chi(G)$ is the minimum number of colors required to color all vertices of the graph so that no adjacent vertices have the same color.

We use the vertex exposure martingale defined below: let $G_i$ be the random subgraph of $G$ induced by the set of vertices $1, \dots, i$, let $Z_0 = E[\chi(G)]$, and let

$$Z_i = E[\chi(G) \mid G_1, \dots, G_i].$$

Since a vertex uses no more than one new color, again the gap between $Z_i$ and $Z_{i-1}$ is at most 1. We conclude

$$\Pr\big(|\chi(G) - E[\chi(G)]| \ge \lambda \sqrt{n}\big) \le 2e^{-2\lambda^2}.$$

This result holds even without knowing E[χ(G )].

Page 62

Example: Edge Exposure Martingale

Let $G$ be a random graph from $G_{n,p}$. Consider the $m = \binom{n}{2}$ possible edges in an arbitrary order.

$$X_i = \begin{cases} 1 & \text{if the } i\text{th edge is present}\\ 0 & \text{otherwise}\end{cases}$$

$F(G)$ = size of the maximum clique in $G$.

Z0 = E[F (G )]

Zi = E[F (G )|X1,X2, . . . ,Xi ], for i = 1, . . . ,m.

Z0,Z1, . . . ,Zm is a Doob martingale.

(F (G ) could be any finite-valued function on graphs.)