Hoeffding’s Bound
Theorem
Let X_1, \ldots, X_n be independent random variables with E[X_i] = \mu_i and \Pr(B_i \le X_i \le B_i + c_i) = 1. Then
\[ \Pr\left(\left|\sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \mu_i\right| \ge \varepsilon\right) \le 2e^{-2\varepsilon^2 / \sum_{i=1}^{n} c_i^2}. \]
Do we need independence?
Martingales
Definition
A sequence of random variables Z_0, Z_1, \ldots is a martingale with respect to the sequence X_0, X_1, \ldots if for all n \ge 0 the following hold:
1. Z_n is a function of X_0, X_1, \ldots, X_n;
2. E[|Z_n|] < \infty;
3. E[Z_{n+1} \mid X_0, X_1, \ldots, X_n] = Z_n.
Definition
A sequence of random variables Z_0, Z_1, \ldots is a martingale when it is a martingale with respect to itself, that is:
1. E[|Z_n|] < \infty;
2. E[Z_{n+1} \mid Z_0, Z_1, \ldots, Z_n] = Z_n.
Conditional Expectation
Definition
\[ E[Y \mid Z = z] = \sum_y y \Pr(Y = y \mid Z = z), \]
where the summation is over all y in the range of Y.
Lemma
For any random variables X and Y ,
\[ E[X] = E_Y[E_X[X \mid Y]] = \sum_y \Pr(Y = y) E[X \mid Y = y], \]
where the sum is over all values y in the range of Y.
Proof.
\[ \sum_y \Pr(Y = y) E[X \mid Y = y] = \sum_y \Pr(Y = y) \sum_x x \Pr(X = x \mid Y = y) \]
\[ = \sum_x \sum_y x \Pr(X = x \mid Y = y) \Pr(Y = y) = \sum_x \sum_y x \Pr(X = x \cap Y = y) = \sum_x x \Pr(X = x) = E[X]. \]
Example
Consider a two phase game:
• Phase I: roll one die. Let X be the outcome.
• Phase II: Flip X fair coins, let Y be the number of HEADs.
• You receive a dollar for each HEAD.
Y is distributed B(X, 1/2), so
\[ E[Y \mid X = a] = \frac{a}{2}, \]
\[ E[Y] = \sum_{i=1}^{6} E[Y \mid X = i] \Pr(X = i) = \sum_{i=1}^{6} \frac{i}{2} \Pr(X = i) = \frac{7}{4}. \]
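A quick sanity check by simulation (a minimal Python sketch, not part of the original slides):

    import random

    # Two-phase game: roll a die (X), then flip X fair coins; Y = #HEADs.
    # By the law of total expectation, E[Y] = 7/4 = 1.75.
    def play_once():
        x = random.randint(1, 6)  # Phase I: the die roll
        return sum(random.random() < 0.5 for _ in range(x))  # Phase II: HEADs

    trials = 200_000
    print(sum(play_once() for _ in range(trials)) / trials)  # ~1.75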
Conditional Expectation as a Random Variable
Definition
The expression E[Y \mid Z] is a random variable f(Z) that takes on the value E[Y \mid Z = z] when Z = z.
Consider the outcome of rolling two dice X_1, X_2, with X = X_1 + X_2.
\[ E[X \mid X_1] = \sum_x x \Pr(X = x \mid X_1) = \sum_{x = X_1 + 1}^{X_1 + 6} x \cdot \frac{1}{6} = X_1 + \frac{7}{2}. \]
Consider the two-phase game:
\[ E[Y \mid X] = \frac{X}{2}. \]
If E[Y | Z ] is a random variable, it has an expectation.
Theorem
E[Y ] = E[E[Y | Z ]] .
\[ E[X \mid X_1] = X_1 + \frac{7}{2}. \]
Thus
\[ E[E[X \mid X_1]] = E\left[X_1 + \frac{7}{2}\right] = \frac{7}{2} + \frac{7}{2} = 7. \]
Martingales
Definition
A sequence of random variables Z_0, Z_1, \ldots is a martingale with respect to the sequence X_0, X_1, \ldots if for all n \ge 0 the following hold:
1. Z_n is a function of X_0, X_1, \ldots, X_n;
2. E[|Z_n|] < \infty;
3. E[Z_{n+1} \mid X_0, X_1, \ldots, X_n] = Z_n.
Definition
A sequence of random variables Z_0, Z_1, \ldots is a martingale when it is a martingale with respect to itself, that is:
1. E[|Z_n|] < \infty;
2. E[Z_{n+1} \mid Z_0, Z_1, \ldots, Z_n] = Z_n.
Martingale Example
A series of fair games (E[gain] = 0), not necessarily independent.
Game 1: bet $1.
Game i > 1: bet 2i if you won round i - 1; bet i otherwise.
X_i = amount won in the ith game (X_i < 0 if the ith game is lost).
Z_i = total winnings at the end of the ith game.
Example
X_i = amount won in the ith game (X_i < 0 if the ith game is lost).
Z_i = total winnings at the end of the ith game.
Z_1, Z_2, \ldots is a martingale with respect to X_1, X_2, \ldots:
\[ E[X_i] = 0, \qquad E[Z_i] = \sum_j E[X_j] = 0 < \infty, \]
\[ E[Z_{i+1} \mid X_1, X_2, \ldots, X_i] = Z_i + E[X_{i+1}] = Z_i. \]
Gambling Strategies
I play a series of fair games (each won with probability 1/2).
Game 1: bet $1.
Game i > 1: bet 2i if I won round i - 1; bet i otherwise.
X_i = amount won in the ith game (X_i < 0 if the ith game is lost).
Z_i = total winnings at the end of the ith game.
Assume that (before starting to play) I decide to quit after k games: what are my expected winnings?
Lemma
Let Z_0, Z_1, Z_2, \ldots be a martingale with respect to X_0, X_1, \ldots. For any fixed n,
\[ E_{X[1:n]}[Z_n] = E[Z_0]. \]
(Here X[1:i] denotes X_1, \ldots, X_i.)

Proof.
Since Z_0, Z_1, \ldots is a martingale, Z_{i-1} = E_{X_i}[Z_i \mid X_0, X_1, \ldots, X_{i-1}]. Then
\[ E_{X[1:i-1]}[Z_{i-1}] = E_{X[1:i-1]}\big[E_{X_i}[Z_i \mid X_0, X_1, \ldots, X_{i-1}]\big] = E_{X[1:i]}[Z_i]. \]
Thus,
\[ E_{X[1:n]}[Z_n] = E_{X[1:n-1]}[Z_{n-1}] = \cdots = E[Z_0]. \]
Gambling Strategies
I play a series of fair games (each won with probability 1/2).
Game 1: bet $1.
Game i > 1: bet 2i if I won round i - 1; bet i otherwise.
X_i = amount won in the ith game (X_i < 0 if the ith game is lost).
Z_i = total winnings at the end of the ith game.
Assume that (before starting to gamble) I decide to quit after k games: what are my expected winnings?
\[ E[Z_k] = E[Z_1] = 0. \]
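A minimal Python sketch (not in the original slides) that simulates this strategy with a fixed horizon k; the particular bet sizes are immaterial, only that each game is fair:

    import random

    # Fair-game strategy: bet $1 in game 1; in game i > 1 bet 2i after a win
    # and i after a loss. Quitting after a fixed k games, E[Z_k] = 0.
    def winnings(k):
        z, won_last = 0, False
        for i in range(1, k + 1):
            bet = 1 if i == 1 else (2 * i if won_last else i)
            won_last = random.random() < 0.5
            z += bet if won_last else -bet
        return z

    trials = 200_000
    print(sum(winnings(10) for _ in range(trials)) / trials)  # ~0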
A Different Strategy
Same gambling game. What happens if I:
• play a random number of games?
• decide to stop only when I have won $1000?
Stopping Time
Definition
A non-negative integer random variable T is a stopping time for the sequence Z_0, Z_1, \ldots if the event "T = n" depends only on the values of the random variables Z_0, Z_1, \ldots, Z_n.
Intuition: a stopping time corresponds to a strategy for determining when to stop a sequence based only on the values seen so far.
In the gambling game:
• the first time I win 10 games in a row: is a stopping time;
• the last time I win: is not a stopping time.
Consider again the gambling game: let T be a stopping time.
Zi = total winnings at end of ith game.
What are my winnings at the stopping time, i.e. E[ZT ]?
Fair game: E[ZT ] = E[Z0] = 0?
"T = first time my total winnings are at least $1000" is a stopping time, and E[Z_T] \ge 1000...
Martingale Stopping Theorem
Theorem
If Z_0, Z_1, \ldots is a martingale with respect to X_1, X_2, \ldots and T is a stopping time for X_1, X_2, \ldots, then
\[ E[Z_T] = E[Z_0] \]
whenever one of the following holds:
• there is a constant c such that |Z_i| \le c for all i;
• T is bounded;
• E[T] < \infty, and there is a constant c such that E[|Z_{i+1} - Z_i| \mid X_1, \ldots, X_i] < c.
Proof of Martingale Stopping Theorem (Sketch)
We need to show that E[|Z_T|] < \infty, so that we can use
\[ E[Z_T] = E[Z_0] + \sum_{i \le T} E\big[E[Z_i - Z_{i-1} \mid X_1, \ldots, X_{i-1}]\big]. \]
• If there is a constant c such that |Z_i| \le c for all i: the sum is bounded.
• If T is bounded: the sum has a finite number of terms.
• If E[T] < \infty and there is a constant c such that E[|Z_{i+1} - Z_i| \mid X_1, \ldots, X_i] < c:
\[ E[|Z_T|] \le E[|Z_0|] + \sum_{i=1}^{\infty} E\big[E[|Z_{i+1} - Z_i| \mid X_1, \ldots, X_i]\,\mathbf{1}_{i \le T}\big] \le E[|Z_0|] + c\sum_{i=1}^{\infty} \Pr(T \ge i) \le E[|Z_0|] + c\,E[T] < \infty. \]
Martingale Stopping Theorem Applications
We play a sequence of fair games with the following stopping rules:
1. T is chosen from a distribution with finite expectation: E[Z_T] = E[Z_0].
2. T is the first time we have made $1000: E[T] is unbounded.
3. We double the bet until the first win: E[T] = 2, but E[|Z_{i+1} - Z_i| \mid X_1, \ldots, X_i] is unbounded.
Example: The Gambler’s Ruin
• Consider a sequence of independent, fair two-player gambling games.
• In each round, each player wins or loses $1 with probability 1/2.
• X_i = amount won by player 1 in the ith round; if player 1 loses round i, X_i < 0.
• Z_i = total amount won by player 1 after i rounds; Z_0 = 0.
• The game ends when one player runs out of money:
  • player 1 must stop when she has lost a net \ell_1 dollars (Z_T = -\ell_1);
  • player 2 must stop when she has lost a net \ell_2 dollars (Z_T = \ell_2).
• q = probability the game ends with player 1 winning \ell_2 dollars.
Example: The Gambler’s Ruin
• T = first time player 1 has won \ell_2 dollars or lost \ell_1 dollars.
• T is a stopping time for X_1, X_2, \ldots.
• Z_0, Z_1, \ldots is a martingale, and the Z_i's are bounded.
• Martingale Stopping Theorem: E[Z_T] = E[Z_0] = 0.
\[ E[Z_T] = q\ell_2 - (1 - q)\ell_1 = 0 \implies q = \frac{\ell_1}{\ell_1 + \ell_2}. \]
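A minimal Python sketch (not in the original slides) estimating q by simulation:

    import random

    # Gambler's ruin: fair +/-$1 rounds until player 1 is down l1 or up l2.
    # The stopping theorem gives q = Pr(Z_T = l2) = l1 / (l1 + l2).
    def player1_wins(l1, l2):
        z = 0
        while -l1 < z < l2:
            z += 1 if random.random() < 0.5 else -1
        return z == l2

    l1, l2, trials = 3, 7, 100_000
    print(sum(player1_wins(l1, l2) for _ in range(trials)) / trials)  # ~0.3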
Example: A Ballot Theorem
• Candidate A and candidate B run for an election.
• Candidate A gets a votes.
• Candidate B gets b votes.
• a > b.
• Votes are counted in a random order:
  • chosen uniformly from all permutations of the n = a + b votes.
• What is the probability that A is always ahead in the count?
Example: A Ballot Theorem
• S_i = number of votes A is leading by after i votes are counted (if A is trailing, S_i < 0).
• S_n = a - b.
• For 0 \le k \le n - 1: X_k = \frac{S_{n-k}}{n-k}.
• Consider X_0, X_1, \ldots, X_{n-1}.
• This sequence goes backward in time!
\[ E[X_k \mid X_0, X_1, \ldots, X_{k-1}] = \;? \]
Example: A Ballot Theorem
\[ E[X_k \mid X_0, X_1, \ldots, X_{k-1}] = \;? \]
• Conditioning on X_0, X_1, \ldots, X_{k-1} is equivalent to conditioning on S_n, S_{n-1}, \ldots, S_{n-k+1}.
• a_i = number of votes for A after the first i votes are counted.
• The (n - k + 1)th vote is a uniformly random vote among the first n - k + 1 votes.
\[ S_{n-k} = \begin{cases} S_{n-k+1} + 1 & \text{if the } (n-k+1)\text{th vote is for B} \\ S_{n-k+1} - 1 & \text{if the } (n-k+1)\text{th vote is for A} \end{cases} \]
\[ S_{n-k} = \begin{cases} S_{n-k+1} + 1 & \text{with prob. } \frac{n-k+1-a_{n-k+1}}{n-k+1} \\ S_{n-k+1} - 1 & \text{with prob. } \frac{a_{n-k+1}}{n-k+1} \end{cases} \]
\[ E[S_{n-k} \mid S_{n-k+1}] = (S_{n-k+1} + 1)\frac{n-k+1-a_{n-k+1}}{n-k+1} + (S_{n-k+1} - 1)\frac{a_{n-k+1}}{n-k+1} = S_{n-k+1}\,\frac{n-k}{n-k+1} \]
(since 2a_{n-k+1} - (n - k + 1) = S_{n-k+1}).
\[ E[X_k \mid X_0, X_1, \ldots, X_{k-1}] = E\left[\frac{S_{n-k}}{n-k} \,\middle|\, S_n, \ldots, S_{n-k+1}\right] = \frac{S_{n-k+1}}{n-k+1} = X_{k-1} \]
\[ \implies X_0, X_1, \ldots, X_{n-1} \text{ is a martingale.} \]
Example: A Ballot Theorem
\[ T = \begin{cases} \min\{k : X_k = 0\} & \text{if such } k \text{ exists} \\ n - 1 & \text{otherwise} \end{cases} \]
• T is a stopping time.
• T is bounded.
• Martingale Stopping Theorem:
\[ E[X_T] = E[X_0] = \frac{E[S_n]}{n} = \frac{a-b}{a+b}. \]
Two cases:
1. A leads throughout the count.
2. A does not lead throughout the count.

Case 1: A leads throughout the count.
For 0 \le k \le n - 1: S_{n-k} > 0, so X_k > 0. Then T = n - 1 and X_T = X_{n-1} = S_1. A gets the first vote in the count, so S_1 = 1 and X_T = 1.

Case 2: A does not lead throughout the count.
Then S_j \le 0 for some j \ge 1; since the S_i change by \pm 1 and S_n = a - b > 0, there is some j with S_j = 0, so X_{n-j} = 0. Hence T = \min\{k : X_k = 0\} < n - 1 and X_T = 0.
Example: A Ballot Theorem
Putting it all together:
1. A leads throughout the count: X_T = 1.
2. A does not lead throughout the count: X_T = 0.
\[ E[X_T] = \frac{a-b}{a+b} = 1 \cdot \Pr(\text{Case 1}) + 0 \cdot \Pr(\text{Case 2}). \]
That is,
\[ \Pr(\text{A leads throughout the count}) = \frac{a-b}{a+b}. \]
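A minimal Python sketch (not in the original slides) estimating this probability directly:

    import random

    # Ballot problem: count a votes for A and b for B in uniformly random
    # order; the theorem gives Pr(A always strictly ahead) = (a-b)/(a+b).
    def always_ahead(a, b):
        votes = ['A'] * a + ['B'] * b
        random.shuffle(votes)
        lead = 0
        for v in votes:
            lead += 1 if v == 'A' else -1
            if lead <= 0:
                return False
        return True

    a, b, trials = 6, 3, 100_000
    print(sum(always_ahead(a, b) for _ in range(trials)) / trials)  # ~1/3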
A Different Gambling Game
Two stages:
1. roll one die; let X be the outcome;
2. roll X standard dice; your gain Z is the sum of the outcomes of the X dice.
What is your expected gain?
Wald’s Equation
Theorem
Let X_1, X_2, \ldots be nonnegative, independent, identically distributed random variables with distribution X. Let T be a stopping time for this sequence. If T and X have bounded expectation, then
\[ E\left[\sum_{i=1}^{T} X_i\right] = E[T]\,E[X]. \]
Note that T is not necessarily independent of X_1, X_2, \ldots. Wald's equation is a corollary of the martingale stopping theorem.
Proof.
For i \ge 1, let Z_i = \sum_{j=1}^{i}(X_j - E[X]).
The sequence Z_1, Z_2, \ldots is a martingale with respect to X_1, X_2, \ldots. E[Z_1] = 0, E[T] < \infty, and since the X_i are nonnegative,
\[ E[|Z_{i+1} - Z_i| \mid X_1, \ldots, X_i] = E[|X_{i+1} - E[X]|] \le 2E[X]. \]
Hence we can apply the martingale stopping theorem to compute
\[ E[Z_T] = E[Z_1] = 0. \]
We now find
\[ 0 = E[Z_T] = E\left[\sum_{j=1}^{T}(X_j - E[X])\right] = E\left[\sum_{j=1}^{T} X_j - T\,E[X]\right] = E\left[\sum_{j=1}^{T} X_j\right] - E[T] \cdot E[X], \]
and the result follows.
A Different Gambling Game
Two stages:
1. roll one die; let X be the outcome;
2. roll X standard dice; your gain Z is the sum of the outcomes of the X dice.
What is your expected gain?
Y_i = outcome of the ith die in the second stage.
\[ E[Z] = E\left[\sum_{i=1}^{X} Y_i\right]. \]
X is a stopping time for Y_1, Y_2, \ldots.
By Wald's equation:
\[ E[Z] = E[X]\,E[Y_i] = \left(\frac{7}{2}\right)^2. \]
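A minimal Python sketch (not in the original slides) checking E[Z] = (7/2)^2 = 12.25:

    import random

    # Two-stage dice game: roll one die (X), then roll X dice and sum them.
    # Wald's equation gives E[Z] = E[X] * E[Y_i] = 3.5^2.
    def gain():
        x = random.randint(1, 6)
        return sum(random.randint(1, 6) for _ in range(x))

    trials = 200_000
    print(sum(gain() for _ in range(trials)) / trials)  # ~12.25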
Example: a k-run
• We flip a fair coin until we get a sequence of k consecutive HEADs.
• What is the expected number of times we flip the coin?
• A SWITCH is a HEAD followed by a TAIL.
• Let X_1 be the number of flips until k HEADs or the first SWITCH.
• Let X_i be the number of flips following the (i-1)st SWITCH until k HEADs or the next SWITCH (X_i includes the last HEAD or TAIL).
• Let T be the first i that ends with k HEADs.
• X_i = number of flips up to and including the first HEAD, plus either up to k - 2 further HEADs followed by a TAIL, or k - 1 further HEADs:
\[ E[X_i] = \sum_{j \ge 1} j 2^{-j} + \sum_{j=1}^{k-1} j 2^{-j} + (k-1)2^{-(k-1)}. \]
• The probability that X_i ends with k HEADs is 2^{-(k-1)}: a sequence of k - 1 HEADs following the first one. Hence
\[ E[T] = 2^{k-1}. \]
• By Wald's equation, the expected number of coin flips is E[X_i]\,E[T].
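A minimal Python sketch (not in the original slides); simplifying the sums in E[X_i]\,E[T] gives the closed form 2^{k+1} - 2 (my simplification, not stated in the slides), which the simulation matches:

    import random

    # Flip a fair coin until k HEADs in a row; E[#flips] = 2^(k+1) - 2.
    def flips_until_run(k):
        flips = run = 0
        while run < k:
            flips += 1
            run = run + 1 if random.random() < 0.5 else 0
        return flips

    k, trials = 4, 100_000
    print(sum(flips_until_run(k) for _ in range(trials)) / trials)  # ~30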
Hoeffding’s Bound
Theorem
Let X_1, \ldots, X_n be independent random variables with E[X_i] = \mu_i and \Pr(B_i \le X_i \le B_i + c_i) = 1. Then
\[ \Pr\left(\left|\sum_{i=1}^{n} X_i - \sum_{i=1}^{n} \mu_i\right| \ge \varepsilon\right) \le 2e^{-2\varepsilon^2 / \sum_{i=1}^{n} c_i^2}. \]
Do we need independence?
Tail Inequalities
Theorem (Azuma-Hoeffding Inequality)
Let Z_0, Z_1, \ldots, Z_n be a martingale (with respect to X_1, X_2, \ldots) such that |Z_k - Z_{k-1}| \le c_k. Then, for all t \ge 0 and any \lambda > 0,
\[ \Pr(|Z_t - Z_0| \ge \lambda) \le 2e^{-\lambda^2/(2\sum_{k=1}^{t} c_k^2)}. \]
The following corollary is often easier to apply.
Corollary
Let X_0, X_1, \ldots be a martingale such that for all k \ge 1,
\[ |X_k - X_{k-1}| \le c. \]
Then for all t \ge 1 and \lambda > 0,
\[ \Pr\big(|X_t - X_0| \ge \lambda c\sqrt{t}\big) \le 2e^{-\lambda^2/2}. \]
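A minimal Python sketch (not in the original slides) checking the corollary on a \pm 1 fair random walk, a martingale with c = 1:

    import math, random

    # Pr(|Z_t - Z_0| >= lam * sqrt(t)) <= 2 * exp(-lam^2 / 2) for a fair walk.
    t, lam, trials = 400, 2.0, 50_000
    hits = 0
    for _ in range(trials):
        z = sum(1 if random.random() < 0.5 else -1 for _ in range(t))
        hits += abs(z) >= lam * math.sqrt(t)
    print(hits / trials, "<=", 2 * math.exp(-lam ** 2 / 2))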
Tail Inequalities: A More General Form
Theorem (Azuma-Hoeffding Inequality)
Let Z_0, Z_1, \ldots be a martingale with respect to X_0, X_1, X_2, \ldots, such that
\[ B_k \le Z_k - Z_{k-1} \le B_k + c_k, \]
for some constants c_k and some random variables B_k that may be functions of X_0, X_1, \ldots, X_{k-1}. Then, for any t \ge 0 and \lambda > 0,
\[ \Pr(|Z_t - Z_0| \ge \lambda) \le 2e^{-2\lambda^2/\sum_{k=1}^{t} c_k^2}. \]
Proof.
Let X^k = X_0, \ldots, X_k and Y_i = Z_i - Z_{i-1}.
Since E[Z_i \mid X^{i-1}] = Z_{i-1},
\[ E[Y_i \mid X^{i-1}] = E[Z_i - Z_{i-1} \mid X^{i-1}] = 0. \]
Since \Pr(B_i \le Y_i \le B_i + c_i \mid X^{i-1}) = 1, by Hoeffding's Lemma:
\[ E[e^{\beta Y_i} \mid X^{i-1}] \le e^{\beta^2 c_i^2/8}. \]
Lemma
(Hoeffding's Lemma) Let X be a random variable such that \Pr(X \in [a, b]) = 1 and E[X] = 0. Then for every \lambda > 0,
\[ E[e^{\lambda X}] \le e^{\lambda^2 (b-a)^2/8}. \]
Proof of the Lemma
Since f(x) = e^{\lambda x} is a convex function, for any \alpha \in (0, 1) and x \in [a, b],
\[ f(\alpha a + (1 - \alpha)b) \le \alpha f(a) + (1 - \alpha)f(b). \]
Thus, writing x = \alpha a + (1 - \alpha)b with \alpha = \frac{b-x}{b-a} \in (0, 1),
\[ e^{\lambda x} \le \frac{b-x}{b-a}e^{\lambda a} + \frac{x-a}{b-a}e^{\lambda b}. \]
Taking expectations and using E[X] = 0, we have
\[ E\big[e^{\lambda X}\big] \le \frac{b}{b-a}e^{\lambda a} - \frac{a}{b-a}e^{\lambda b} \le e^{\lambda^2(b-a)^2/8}. \]
Proof of Azuma-Hoeffding Inequality
\[ E\big[e^{\beta Y_i} \mid X^{i-1}\big] \le e^{\beta^2 c_i^2/8}. \]
\[ E_{X^n}\big[e^{\beta \sum_{i=1}^{n} Y_i}\big] = E_{X^{n-1}}\big[E_{X_n}\big[e^{\beta \sum_{i=1}^{n} Y_i} \mid X^{n-1}\big]\big] = E_{X^{n-1}}\big[e^{\beta \sum_{i=1}^{n-1} Y_i}\,E_{X_n}\big[e^{\beta Y_n} \mid X^{n-1}\big]\big] \le e^{\beta^2 c_n^2/8}\,E_{X^{n-1}}\big[e^{\beta \sum_{i=1}^{n-1} Y_i}\big] \le e^{\beta^2 \sum_{i=1}^{n} c_i^2/8}. \]
Hence
\[ E\big[e^{\beta \sum_{i=1}^{n} Y_i}\big] \le e^{\beta^2 \sum_{i=1}^{n} c_i^2/8}. \]
For \beta = 4\lambda/\sum_{i=1}^{t} c_i^2,
\[ \Pr(Z_t - Z_0 \ge \lambda) = \Pr\left(\sum_{i=1}^{t} Y_i \ge \lambda\right) \le \frac{E\big[e^{\beta \sum_{i=1}^{t} Y_i}\big]}{e^{\beta\lambda}} \le e^{-\lambda\beta}e^{\beta^2 \sum_{i=1}^{t} c_i^2/8} = e^{-2\lambda^2/\sum_{k=1}^{t} c_k^2}. \]
Applying the same argument to the martingale -Z_0, -Z_1, \ldots and summing the two tails:
\[ \Pr(|Z_t - Z_0| \ge \lambda) \le 2e^{-2\lambda^2/\sum_{k=1}^{t} c_k^2}. \]
Example
Assume that you play a sequence of n fair games, where the bet b_i in game i may depend on the outcomes of the previous games. Let B = \max_i b_i. The probability of winning or losing more than \lambda is bounded by
\[ \Pr(|Z_n| \ge \lambda) \le 2e^{-2\lambda^2/(nB^2)} \]
\[ \Pr(|Z_n| \ge \lambda B\sqrt{n}) \le 2e^{-2\lambda^2} \]
\[ \Pr\left(|Z_n| \ge \lambda\sqrt{\sum_{i=1}^{n} b_i^2}\right) \le 2e^{-2\lambda^2} \]
Doob Martingale
Let X_1, X_2, \ldots, X_n be a sequence of random variables. Let Y = f(X_1, \ldots, X_n) be a random variable with E[|Y|] < \infty.
For i = 0, 1, \ldots, n, let
\[ Z_0 = E[Y] = E_{X[1,n]}[f(X_1, \ldots, X_n)] \]
\[ Z_i = E_{X[i+1,n]}[Y \mid X_1 = x_1, X_2 = x_2, \ldots, X_i = x_i] \]
\[ Z_n = E[Y \mid X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n] = f(x_1, \ldots, x_n) \]

Theorem
Z_0, Z_1, \ldots, Z_n is a martingale with respect to X_1, X_2, \ldots, X_n.

Proof.
We use:

Fact
E[E[V \mid U, W] \mid W] = E[V \mid W].

Y = f(X_1, \ldots, X_n), Z_0 = E[Y], Z_i = E_{X[i+1,n]}[Y \mid X_1 = x_1, \ldots, X_i = x_i].
Z_0, Z_1, \ldots, Z_n is a martingale if
\[ E_{X_{i+1}}[Z_{i+1} \mid X_1 = x_1, \ldots, X_i = x_i] = Z_i: \]
\[ E_{X_{i+1}}[Z_{i+1} \mid x_1, x_2, \ldots, x_i] = E_{X_{i+1}}\big[E_{X[i+2,n]}[Y \mid X_1, \ldots, X_{i+1}] \mid x_1, \ldots, x_i\big] = E_{X[i+1,n]}[Y \mid x_1, x_2, \ldots, x_i] = Z_i. \]
Simple Example
Y = f(X_1, \ldots, X_n) = \sum_{i=1}^{n} X_i, where the X_i's are independent and uniformly distributed on [0, 1].
\[ Z_0 = E[Y] = E_{X[1,n]}[f(X_1, \ldots, X_n)] = E\left[\sum_{i=1}^{n} X_i\right] = n/2 \]
\[ Z_i = E_{X[i+1,n]}[Y \mid x_1, \ldots, x_i] = \sum_{j=1}^{i} x_j + E\left[\sum_{j=i+1}^{n} X_j\right] = \sum_{j=1}^{i} x_j + (n-i)/2 \]
\[ Z_n = E[Y \mid x_1, \ldots, x_n] = f(x_1, \ldots, x_n) = \sum_{j=1}^{n} x_j \]
\[ E_{X_{i+1}}[Z_{i+1} \mid x_1, \ldots, x_i] = E_{X_{i+1}}\left[\sum_{j=1}^{i+1} X_j + \frac{n-i-1}{2} \,\middle|\, x_1, \ldots, x_i\right] = \sum_{j=1}^{i} x_j + \frac{n-i}{2} = Z_i \]
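A minimal Python sketch (not in the original slides) of one path of this Doob martingale:

    import random

    # Doob martingale for Y = sum of n U[0,1] variables:
    # Z_i = x_1 + ... + x_i + (n - i)/2, so Z_0 = n/2 and Z_n = Y, and each
    # step satisfies |Z_{i+1} - Z_i| = |x_{i+1} - 1/2| <= 1/2.
    n = 10
    xs = [random.random() for _ in range(n)]
    zs = [sum(xs[:i]) + (n - i) / 2 for i in range(n + 1)]
    print(zs[0], zs[-1], sum(xs))  # n/2, then Z_n equals Y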
Example: Polya’s Urn
• Start with M balls, R red and M - R blue.
• Repeat n times: pick a ball uniformly at random; if it is red, add a red ball, else add a blue ball.
• X_i = 1 if we add a red ball in step i, else X_i = 0.
• We want to estimate S_n(R/M) = \sum_{i=1}^{n} X_i = f(X_1, \ldots, X_n).
• Claim: E[S_n(R/M)] = nR/M.
Proof by induction on t that E[S_t] = tR/M. After t steps the urn holds M + t balls, R + S_t of them red, so
\[ E[S_{t+1} \mid S_t] = S_t + \frac{R + S_t}{M + t} \]
\[ E[S_{t+1}] = E[E[S_{t+1} \mid S_t]] = E\left[S_t + \frac{R + S_t}{M + t}\right] = t\frac{R}{M} + \frac{R + tR/M}{M + t} = (t+1)\frac{R}{M} \]
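A minimal Python sketch (not in the original slides) checking the claim E[S_n] = nR/M:

    import random

    # Polya's urn: M balls, R red; each step sample a ball uniformly and add
    # a new ball of the sampled color. S_n counts the red balls added.
    def red_additions(M, R, n):
        red, total, s = R, M, 0
        for _ in range(n):
            if random.random() < red / total:
                red, s = red + 1, s + 1
            total += 1
        return s

    M, R, n, trials = 10, 3, 20, 50_000
    print(sum(red_additions(M, R, n) for _ in range(trials)) / trials)  # ~nR/M = 6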
Example: Polya’s Urn
Start with M balls, R red and M - R blue. Repeat n times: pick a ball uniformly at random; if it is red, add a red ball, else add a blue ball. X_i = 1 if a red ball is added in step i, else X_i = 0; S_n(R/M) = \sum_{i=1}^{n} X_i, and E[S_n(R/M)] = nR/M.

Let Z_i = E[S_n \mid X_1 = x_1, \ldots, X_i = x_i]. We prove that Z_1, \ldots, Z_n is a martingale. After i steps the urn holds M + i balls, R + \sum_{j=1}^{i} x_j of them red, so
\[ Z_i = E[S_n \mid X_1 = x_1, \ldots, X_i = x_i] = \sum_{j=1}^{i} x_j + E\left[S_{n-i}\left(\frac{R + \sum_{j=1}^{i} x_j}{M + i}\right)\right] = \sum_{j=1}^{i} x_j + (n-i)\,\frac{R + \sum_{j=1}^{i} x_j}{M + i} \]
\[ E[Z_{i+1} \mid X_1, \ldots, X_i] = E\big[E[S_n \mid X_1, X_2, \ldots, X_{i+1}] \mid X_1 = x_1, \ldots, X_i = x_i\big] \]
\[ = E\left[\sum_{j=1}^{i} x_j + X_{i+1} + (n-i-1)\,\frac{R + \sum_{j=1}^{i} x_j + X_{i+1}}{M + i + 1} \,\middle|\, x_1, \ldots, x_i\right] \]
Using E[X_{i+1} \mid x_1, \ldots, x_i] = \frac{R + \sum_{j=1}^{i} x_j}{M + i},
\[ = \sum_{j=1}^{i} x_j + \frac{R + \sum_{j=1}^{i} x_j}{M + i} + (n-i-1)\,\frac{R + \sum_{j=1}^{i} x_j + \frac{R + \sum_{j=1}^{i} x_j}{M + i}}{M + i + 1} \]
\[ = \sum_{j=1}^{i} x_j + \frac{R + \sum_{j=1}^{i} x_j}{M + i} + (n-i-1)\,\frac{\frac{M+i+1}{M+i}\left(R + \sum_{j=1}^{i} x_j\right)}{M + i + 1} = \sum_{j=1}^{i} x_j + (n-i)\,\frac{R + \sum_{j=1}^{i} x_j}{M + i} = Z_i \]
Example: Edge Exposure Martingale
Let G be a random graph from G_{n,p}. Consider the m = \binom{n}{2} possible edges in an arbitrary order.
\[ X_i = \begin{cases} 1 & \text{if the ith edge is present} \\ 0 & \text{otherwise} \end{cases} \]
F(G) = size of the maximum clique in G.
Z_0 = E[F(G)]
Z_i = E[F(G) \mid X_1, X_2, \ldots, X_i], for i = 1, \ldots, m.
Z_0, Z_1, \ldots, Z_m is a Doob martingale.
(F(G) could be any finite-valued function on graphs.)
Tail Inequalities: Doob Martingales
Let X_1, \ldots, X_n be a sequence of random variables.
Random variable Y:
• Y is a function of X_1, X_2, \ldots, X_n;
• E[|Y|] < \infty.
Let Z_i = E[Y \mid X_1, \ldots, X_i], i = 0, 1, \ldots, n.
Z_0, Z_1, \ldots, Z_n is a martingale with respect to X_1, \ldots, X_n.
If we can apply the Azuma-Hoeffding inequality:
\[ \Pr(|Z_n - Z_0| \ge \lambda) \le \ldots \]
then we have
\[ \Pr(|Y - E[Y]| \ge \lambda) \le \ldots \]
We need a bound on |Zi − Zi−1|.
McDiarmid Bound
Theorem
Assume that f(X_1, X_2, \ldots, X_n) satisfies
\[ |f(x_1, \ldots, x_i, \ldots, x_n) - f(x_1, \ldots, y_i, \ldots, x_n)| \le c_i, \]
and that X_1, \ldots, X_n are independent. Then
\[ \Pr(|f(X_1, \ldots, X_n) - E[f(X_1, \ldots, X_n)]| \ge \lambda) \le 2e^{-2\lambda^2/\sum_{k=1}^{n} c_k^2}. \]
[Changing the value of X_i changes the value of the function by at most c_i.]
Proof
Define a Doob martingale Z_0, Z_1, \ldots, Z_n:
• Z_0 = E[f(X_1, \ldots, X_n)] = E[f(\bar X)]
• Z_i = E[f(X_1, \ldots, X_n) \mid X_1, \ldots, X_i] = E[f(\bar X) \mid X^i]
• Z_n = f(X_1, \ldots, X_n) = f(\bar X)
We want to prove that this martingale satisfies the conditions of:

Theorem (Azuma-Hoeffding Inequality)
Let Z_0, Z_1, \ldots be a martingale with respect to X_0, X_1, X_2, \ldots, such that
\[ B_k \le Z_k - Z_{k-1} \le B_k + c_k, \]
for some constants c_k and some random variables B_k that may be functions of X_0, X_1, \ldots, X_{k-1}. Then, for all t \ge 0 and any \lambda > 0,
\[ \Pr(|Z_t - Z_0| \ge \lambda) \le 2e^{-2\lambda^2/\sum_{k=1}^{t} c_k^2}. \]
Lemma
If X_1, \ldots, X_n are independent then, for some random variable B_k,
\[ B_k \le Z_k - Z_{k-1} \le B_k + c_k. \]

\[ Z_k - Z_{k-1} = E[f(\bar X) \mid X^k] - E[f(\bar X) \mid X^{k-1}]. \]
Hence Z_k - Z_{k-1} is bounded above by
\[ \sup_x E[f(\bar X) \mid X^{k-1}, X_k = x] - E[f(\bar X) \mid X^{k-1}] \]
and bounded below by
\[ \inf_y E[f(\bar X) \mid X^{k-1}, X_k = y] - E[f(\bar X) \mid X^{k-1}]. \]
Thus, it suffices to show that
\[ \sup_x E[f(\bar X) \mid X^{k-1}, X_k = x] - \inf_y E[f(\bar X) \mid X^{k-1}, X_k = y] \le c_k, \]
where, writing f(\bar X, x) for f with X_k set to x, the left-hand side equals
\[ \sup_{x,y} E[f(\bar X, x) - f(\bar X, y) \mid X^{k-1}]. \]
Because the X_i are independent, the values of X_{k+1}, \ldots, X_n do not depend on the values of X_1, \ldots, X_k. Hence, for any values \bar z, x, y we have \Pr(\bar z, x) = \Pr(\bar z, y), and therefore
\[ \sup_{x,y} E[f(\bar X, x) - f(\bar X, y) \mid X_1 = z_1, \ldots, X_{k-1} = z_{k-1}] = \sup_{x,y} \sum_{z_{k+1}, \ldots, z_n} \Pr\big((X_{k+1} = z_{k+1}) \cap \cdots \cap (X_n = z_n)\big) \cdot \big(f(\bar z, x) - f(\bar z, y)\big). \]
But
\[ f(\bar z, x) - f(\bar z, y) \le c_k, \]
and therefore
\[ E[f(\bar X, x) - f(\bar X, y) \mid X^{k-1}] \le c_k. \]
Example: Pattern Matching
Given a string and a pattern: is the pattern interesting? Does it appear more often than expected in a random string? Is the number of occurrences of the pattern concentrated around its expectation?

A = (a_1, a_2, \ldots, a_n): a string of characters, each chosen independently and uniformly at random from an alphabet \Sigma with m = |\Sigma|.
Pattern: B = (b_1, \ldots, b_k), a fixed string with b_i \in \Sigma.
F = number of occurrences of B in the random string A.
\[ E[F] = (n - k + 1)\left(\frac{1}{m}\right)^k. \]
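A minimal Python sketch (not in the original slides; the pattern and alphabet below are arbitrary illustrative choices):

    import random

    # Count occurrences of a fixed pattern in a uniform random string over an
    # alphabet of size m; compare against E[F] = (n - k + 1) / m^k.
    def occurrences(n, pattern, alphabet):
        a = ''.join(random.choice(alphabet) for _ in range(n))
        k = len(pattern)
        return sum(a[i:i + k] == pattern for i in range(n - k + 1))

    n, pattern, alphabet, trials = 200, 'ab', 'abcd', 20_000
    mean = sum(occurrences(n, pattern, alphabet) for _ in range(trials)) / trials
    print(mean, (n - len(pattern) + 1) / 4 ** len(pattern))  # both ~12.44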
Can we bound the deviation of F from its expectation?
F = number of occurrences of B in the random string A.
Z_0 = E[F]
Z_i = E[F \mid a_1, \ldots, a_i], for i = 1, \ldots, n.
Z_0, Z_1, \ldots, Z_n is a Doob martingale, with Z_n = F.
Each character in A can participate in no more than k occurrences of B:
\[ |Z_i - Z_{i+1}| \le k. \]
Azuma-Hoeffding inequality (version 1):
\[ \Pr(|F - E[F]| \ge \lambda) \le 2e^{-\lambda^2/(2nk^2)}. \]
Application: Balls and Bins
We throw m balls independently and uniformly at random into n bins. Let X_i be the bin that the ith ball falls into, and let F be the number of empty bins after the m balls are thrown. Then the sequence
\[ Z_i = E[F \mid X_1, \ldots, X_i] \]
is a Doob martingale. F = f(X_1, X_2, \ldots, X_m) satisfies the Lipschitz condition with bound 1, thus |Z_{i+1} - Z_i| \le 1. We therefore obtain
\[ \Pr(|F - E[F]| \ge \varepsilon) \le 2e^{-2\varepsilon^2/m}. \]
Here
\[ E[F] = n\left(1 - \frac{1}{n}\right)^m, \]
but we could obtain the concentration result without knowing E[F].
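A minimal Python sketch (not in the original slides) comparing the empirical tail with the bound:

    import math, random

    # Empty bins after throwing m balls into n bins; F is 1-Lipschitz in the
    # ball placements, so Pr(|F - E[F]| >= eps) <= 2 * exp(-2 * eps^2 / m).
    def empty_bins(m, n):
        return n - len({random.randrange(n) for _ in range(m)})

    m, n, eps, trials = 100, 50, 10, 20_000
    mean = n * (1 - 1 / n) ** m
    tail = sum(abs(empty_bins(m, n) - mean) >= eps for _ in range(trials)) / trials
    print(tail, "<=", 2 * math.exp(-2 * eps ** 2 / m))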
Application: Chromatic Number
Given a random graph G in G_{n,p}, the chromatic number \chi(G) is the minimum number of colors required to color all vertices of the graph so that no adjacent vertices have the same color.
We use the vertex exposure martingale defined below: let G_i be the random subgraph of G induced by the vertices 1, \ldots, i, let Z_0 = E[\chi(G)], and let
\[ Z_i = E[\chi(G) \mid G_1, \ldots, G_i]. \]
Since a vertex uses no more than one new color, the gap between Z_i and Z_{i-1} is again at most 1. We conclude
\[ \Pr(|\chi(G) - E[\chi(G)]| \ge \lambda\sqrt{n}) \le 2e^{-2\lambda^2}. \]
This result holds even without knowing E[\chi(G)].