
Problems in Discrete Probability

Byron Schmuland

October 16, 2017

MANY STUDENTS believe that every mathematical problem has a unique solution: either in the back of the textbook, the instructor’s solution manual, or hidden away in the faculty lounge.

To counter this false belief, our problem-based course encourages exploration, multiple solutions, false starts and dead ends. We aim to hone your problem solving skills on discrete probability problems ranging from the basic to the devilishly complex. All you need is a flexible attitude, an open mind, and the willingness to work hard!

Contents

1 Counting
(a) Words, sets of sets, and perfect matchings
(b) Secret Santa
(c) Stars and bars
(d) Non consecutive values
(e) Non consecutive values on a circle

2 Symmetry
(a) Sampling with replacement
(b) Sampling without replacement
(c) Formal definitions of symmetry and exchangeability
(d) Sampling questions revisited
(e) Random permutations
(f) An urn model
(g) The German tank problem
(h) Drawing straws
(i) Color run symmetry
(j) The superb symmetry of seven

3 Inclusion-exclusion
(a) One in a million
(b) Another one in a million
(c) Points when throwing several dice
(d) A poker question
(e) Coupling
(f) Inclusion-exclusion approximation

4 Classical probability
(a) Coincidence or Rencontre
(b) Occupancy
(c) Double Occupancy
(d) Ménage
(e) Banach’s matchbox problem
(f) Socks in the dryer

5 Zero-one random variables
(a) Moments and factorial moments
(b) One in a million
(c) Random student cards
(d) Where’s the ace?
(e) Numbers and colours
(f) Alphabet soup
(g) Almost disjoint
(h) Proof of the inclusion-exclusion formula

6 Patterns
(a) HT
(b) HH
(c) HH again
(d) HT again
(e) Patterns with a head start

7 First step analysis
(a) Three players
(b) Another three players
(c) Classical random walk
(d) Number of walks until no shoes

8 Generating functions
(a) Five urn problem
(b) Adding random numbers
(c) Ten dice again
(d) The pattern HH again
(e) Crazy Dice
(f) Moments
(g) Three consecutive dice values
(h) Triangle tango
(i) Other generating functions
(i1) Even Odds
(i2) Non-consecutive values
(i3) Strings without HHH
(i4) Red runs

9 Random walks
(a) Definitions and notation
(b) Hitting times
(c) Random walk on a cube
(d) More random walk h functions
(e) Tennis
(f) The problem of points
(g) Gambler’s ruin
(h) One sided boundaries
(i) Return to the start

10 Counting paths
(a) The reflection principle
(b) Catalan numbers
(c) The ballot theorem
(d) Theater lineup
(e) The cycle lemma
(f) Applications

11 Random permutations
(a) Color runs
(b) The Mississippi problem
(c) Secret Santa

12 Appendices
(a) Binomial coefficients
(b) Stirling numbers
(c) Finite calculus
(d) Discrete probability

13 References


1 Counting

(a) Words, sets of sets, and perfect matchings

(C1) How many distinct "words" can be made with the letters a, a, b, b, c, c?

For example, abbacc, babacc, cbabac.

(C2) How many distinct sets of two disjoint 2-sets can be made from $\{1, 2, 3, 4, 5, 6\}$?

For example, $\{\{1, 4\}, \{2, 3\}\}$, $\{\{1, 3\}, \{2, 4\}\}$, $\{\{3, 5\}, \{2, 4\}\}$.

(C3) How many distinct ways can you match up six people into three pairs?

For example,

[Figure: three example matchings of the people 1, 2, 3, 4, 5, 6 into pairs.]

(b) Secret Santa

How many of the words made using the alphabet a, a, b, b, c, c have no a’s in the first two places, no b’s in the middle two places, and no c’s in the last two places?

Let $k$ be the number of b’s in the first two places. Argue that this is equal to the number of c’s in the middle two places, and the number of a’s in the last two places.

Therefore the total number of these words is $\sum_{k=0}^{2} \binom{2}{k}^3 = 10$.
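The count can be checked by listing every word. The sketch below is only an illustration; the function name `count_restricted_words` is made up for this example and is not part of the notes.

```python
from itertools import permutations
from math import comb

def count_restricted_words():
    """Count distinct words on a,a,b,b,c,c with no a in places 1-2,
    no b in places 3-4, and no c in places 5-6."""
    words = set(permutations("aabbcc"))  # set() collapses repeated arrangements
    good = [w for w in words
            if "a" not in w[0:2] and "b" not in w[2:4] and "c" not in w[4:6]]
    return len(good)

brute = count_restricted_words()
formula = sum(comb(2, k) ** 3 for k in range(3))  # sum over k of C(2,k)^3
print(brute, formula)
```

Both numbers should agree, which confirms the argument by cases on $k$.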


(c) Stars and bars

How many sequences $(x_1, x_2, \dots, x_k)$ of positive integers have sum $n$? This problem has an easy answer once we set up a bijection:

$$(x_1, x_2, \dots, x_k) \longleftrightarrow \underbrace{* \cdots *}_{x_1} \,|\, \underbrace{* * * \cdots *}_{x_2} \,|\, \cdots \,|\, \underbrace{* * * \cdots *}_{x_k}.$$

On the right, begin with a line of $n$ stars and then insert $k-1$ bars into the $n-1$ spaces between the stars. The number of ways is $\binom{n-1}{k-1}$.

What if we allow zero, that is, $x_j \ge 0$? We could use the bijection

$$(x_1, x_2, \dots, x_k) \longleftrightarrow (x_1 + 1, x_2 + 1, \dots, x_k + 1)$$

where the $x_j + 1$’s are positive and add up to $n + k$. The number of such sequences is therefore $\binom{n+k-1}{k-1}$.
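Both stars-and-bars counts can be confirmed by brute force for small $n$ and $k$. This is a minimal sketch; `count_compositions` is a hypothetical helper introduced only for the check.

```python
from itertools import product
from math import comb

def count_compositions(n, k, minimum):
    """Brute-force count of sequences (x_1..x_k) with every x_j >= minimum and sum n."""
    values = range(minimum, n + 1)
    return sum(1 for xs in product(values, repeat=k) if sum(xs) == n)

for n, k in [(5, 2), (6, 3), (7, 4)]:
    positive = count_compositions(n, k, 1)      # x_j >= 1
    nonnegative = count_compositions(n, k, 0)   # x_j >= 0
    print(n, k, positive, comb(n - 1, k - 1), nonnegative, comb(n + k - 1, k - 1))
```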

(d) Non consecutive values

What’s the chance of no consecutive values in the Lotto 6-49 draw?

Solution: The number of possibilities is $\binom{49}{6}$. How many of these consist of non-consecutive values? The answer, explained below, is $\binom{44}{6}$, so the probability is $\binom{44}{6} / \binom{49}{6} = .5048$.

In general, how many sets of $k$ numbers $x_j$ from $\{1, 2, \dots, n\}$ with $x_1 < x_2 < \cdots < x_k$ have no consecutive values? We can use the bijection

$$(x_1, x_2, \dots, x_k) \longleftrightarrow (x_1, x_2, \dots, x_k) - (0, 1, \dots, k-1)$$

where the $x_j - (j-1)$’s are distinct values from $\{1, 2, \dots, n-(k-1)\}$. The number of different ways is therefore $\binom{n-k+1}{k}$.

Proof by picture: There are $\binom{5}{2}$ ways to choose two non-consecutive numbers from $\{1, 2, 3, 4, 5, 6\}$.
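As a sanity check, the formula $\binom{n-k+1}{k}$ can be compared with a direct enumeration for small cases, and the Lotto 6-49 probability evaluated numerically. The helper name `count_nonconsecutive` is an assumption made for this sketch.

```python
from itertools import combinations
from math import comb

def count_nonconsecutive(n, k):
    """Count k-subsets of {1..n} with no two consecutive values (brute force)."""
    return sum(1 for xs in combinations(range(1, n + 1), k)
               if all(b - a > 1 for a, b in zip(xs, xs[1:])))

print(count_nonconsecutive(6, 2), comb(5, 2))    # the "proof by picture" case
print(count_nonconsecutive(10, 3), comb(8, 3))

lotto = comb(44, 6) / comb(49, 6)                # chance of no consecutive values
print(round(lotto, 4))
```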


(e) Non consecutive values on a circle

How many sets of $k$ numbers from $\{1, 2, \dots, n\}$ arranged in a circle have no consecutive values?

Solution 1: We can take all linear arrangements without consecutive values, and then subtract out those that include both 1 and $n$:

$$\binom{n-k+1}{k} - \binom{n-k-1}{k-2}.$$

Solution 2: It is possible to make a direct bijection, similar to the previous problem. The bijection is between patterns: on the right is an arbitrary pattern on the small circle; on the left is a pattern on the big circle without consecutive values. To go from the small circle to the big circle, move clockwise and add a black segment after every red segment.

For each pattern, the ratio of distinct numberings on the big circle compared to the small circle is $n/(n-k)$. From this we deduce that the number of arrangements is

$$\frac{n}{n-k} \binom{n-k}{k}.$$
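The two answers look different, so it is reassuring to compare them with each other and with a brute-force count. The sketch below assumes $k \ge 2$ and $n \ge 2k$; the function names are illustrative only.

```python
from itertools import combinations
from math import comb

def count_circular(n, k):
    """Count k-subsets of {1..n} on a circle with no two adjacent values
    (n and 1 count as adjacent), by brute force."""
    def ok(xs):
        chosen = set(xs)
        return all((x % n) + 1 not in chosen for x in xs)  # successor of n is 1
    return sum(1 for xs in combinations(range(1, n + 1), k) if ok(xs))

def formula_1(n, k):
    return comb(n - k + 1, k) - comb(n - k - 1, k - 2)

def formula_2(n, k):
    return n * comb(n - k, k) // (n - k)

for n, k in [(6, 2), (7, 3), (9, 3), (10, 4)]:
    print(n, k, count_circular(n, k), formula_1(n, k), formula_2(n, k))
```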


2 Symmetry

(a) Sampling with replacement

Pull 52 cards from an ordinary deck, replacing each card drawn and reshuffling thoroughly. For $1 \le i \le 52$ let $A'_i$ be the event that the $i$th card drawn was red.

These events are independent with $P(A'_i) = 1/2$ for all $i$. In particular,

$$P(A'_{i_1} A'_{i_2} \cdots A'_{i_j}) = \left(\frac{1}{2}\right)^j. \tag{1}$$

The $2^{52}$ possible outcomes, e.g. $\underbrace{(B, R, R, \dots, B)}_{52 \text{ places}}$, are equally likely.

(b) Sampling without replacement

Pull all 52 cards from an ordinary deck without replacement. For $1 \le i \le 52$ let $A_i$ be the event that the $i$th card drawn was red.

The $\binom{52}{26}$ possible outcomes, e.g. $\underbrace{(B, R, R, \dots, B)}_{26 \text{ Rs and } 26 \text{ Bs}}$, are equally likely.

(c) Formal definitions of symmetry and exchangeability

Definition 2-1 By symmetry we mean a probability preserving map between two probability spaces. That is, $\phi : \Omega \to \Omega'$ is a bijection with the property that for any $A \subseteq \Omega$ we have $P'(\phi(A)) = P(A)$.

Example 2-1 Let $N_{HHH}$ be the number of fair coin tosses until we get three heads in a row, and $N_{TTT}$ the corresponding random variable for three tails in a row. By symmetry, these two random variables have the same distribution.

More formally, define

$$\Omega = \{HHH, THHH, TTHHH, \dots\}$$

$$\Omega' = \{TTT, HTTT, HHTTT, \dots\}.$$


Let $\phi : \Omega \to \Omega'$ be the map that takes any finite string of H and T, and swaps them. For example $\phi(THHH) = HTTT$. Since the coin is fair, all strings of the same length are equally likely and so the map $\phi$ is probability preserving.

Example 2-2 If $X$ is the total when you roll 5 fair dice, then by symmetry $P(X = 8) = P(X = 27)$. More formally, define the sample space $\Omega = \{(d_1, d_2, d_3, d_4, d_5) : 1 \le d_j \le 6 \text{ for } j = 1, \dots, 5\}$ and a map $\phi : \Omega \to \Omega$ so that $\phi((d_1, d_2, d_3, d_4, d_5)) = (7, 7, 7, 7, 7) - (d_1, d_2, d_3, d_4, d_5)$.

Since all outcomes are equally likely and $\phi$ is a bijection, the map $\phi$ is probability preserving. Also note that $\phi$ maps those outcomes with sum 8 onto those outcomes with sum $35 - 8 = 27$. It follows that $P(X = 8) = P(X = 27)$.

Definition 2-2 Events $A_1, A_2, \dots, A_n$ are called exchangeable if $P(A_{i_1} A_{i_2} \cdots A_{i_j})$ is the same for any set of $j$ distinct indices. Here $A_{i_1} A_{i_2} \cdots A_{i_j}$ means the intersection of these $j$ events.

Similarly, a sequence $X_1, X_2, \dots, X_n$ of random variables is called exchangeable if the vector $(X_{i_1}, X_{i_2}, \dots, X_{i_j})$ has the same distribution for every sequence of $j$ indices.

An alternative definition: The sequence $X_1, X_2, \dots, X_n$ of random variables is exchangeable if and only if $(X_1, X_2, \dots, X_n)$ and $(X_{\pi(1)}, X_{\pi(2)}, \dots, X_{\pi(n)})$ have the same distribution for any permutation $\pi$. That is, the map $\phi(x_1, \dots, x_n) = (x_{\pi(1)}, \dots, x_{\pi(n)})$ is probability preserving.


Example 2-3

Joint mass function of $(X, Y)$:

          X = 0   X = 1   X = 2
Y = 2      0/7     2/7     0/7
Y = 1      2/7     0/7     1/7
Y = 0      0/7     1/7     1/7

Exchangeability implies being identically distributed, but not the reverse. An exchangeable sequence has extra symmetry.

For instance, the diagram above is the joint mass function of $(X, Y)$, where $X$ and $Y$ take values in $\{0, 1, 2\}$. Checking the marginals shows that they are identically distributed, but

$$P(X = 0, Y = 2) = 0/7 \ne P(Y = 0, X = 2) = 1/7.$$

That is, the random variables $X$ and $Y$ have the same distribution, but the random vectors $(X, Y)$ and $(Y, X)$ don’t.
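The check of the marginals can be written out explicitly. In the sketch below the placement of the nine masses on the grid is an assumption: it is the arrangement of the values 0/7, 1/7, 2/7 from Example 2-3 that agrees with the two probabilities quoted in the text.

```python
from fractions import Fraction

# Assumed layout: joint[(x, y)] = 7 * P(X = x, Y = y).
joint = {(0, 0): 0, (1, 0): 1, (2, 0): 1,
         (0, 1): 2, (1, 1): 0, (2, 1): 1,
         (0, 2): 0, (1, 2): 2, (2, 2): 0}

def marginal(axis):
    """Marginal mass function of X (axis=0) or of Y (axis=1)."""
    out = {v: Fraction(0) for v in range(3)}
    for point, weight in joint.items():
        out[point[axis]] += Fraction(weight, 7)
    return out

same_marginals = marginal(0) == marginal(1)                         # identically distributed?
symmetric = all(joint[(x, y)] == joint[(y, x)] for x, y in joint)   # exchangeable?
print(same_marginals, symmetric)
```

Equal marginals together with a non-symmetric joint mass function is exactly the situation described: identically distributed but not exchangeable.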

(d) Sampling questions revisited

The independent events $A'_i$ from example (a) are exchangeable, because of formula (1). For the events $A_i$ from example (b), is the exchangeability "obvious for reasons of symmetry"?

Well, I’m not sure. Let’s calculate the probability that the 3rd card drawn is red taking all cases into account:

$$\begin{aligned} P(A_3) &= P(RRR) + P(RBR) + P(BRR) + P(BBR) \\ &= \frac{26}{52} \cdot \frac{25}{51} \cdot \frac{24}{50} + \frac{26}{52} \cdot \frac{26}{51} \cdot \frac{25}{50} + \frac{26}{52} \cdot \frac{26}{51} \cdot \frac{25}{50} + \frac{26}{52} \cdot \frac{25}{51} \cdot \frac{26}{50} \\ &= \frac{1}{2}. \end{aligned}$$

Details: There is a permutation of $\{1, 2, \dots, 52\}$ that sends $(i_1, i_2, \dots, i_j)$ to $(1, 2, \dots, j)$.


Note that $P(A_1) = 1/2$ and $P(A_1 A_2) = (1/2) \cdot (25/51) = .245 \ne 1/4$ so these events are not independent.
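Exchangeability without independence can be seen concretely on a small deck, where every outcome can be listed. The sketch below uses a hypothetical deck of 3 red and 3 black cards (an illustrative choice, not from the text) and then evaluates $P(A_1 A_2)$ for the full deck.

```python
from fractions import Fraction
from itertools import combinations

def red_probability(positions, reds=3, blacks=3):
    """P(all the given draw positions are red) for a shuffled deck, found by
    listing every equally likely placement of the red cards."""
    n = reds + blacks
    placements = list(combinations(range(1, n + 1), reds))
    hits = sum(1 for p in placements if set(positions) <= set(p))
    return Fraction(hits, len(placements))

# The same value for every single position and for every pair of positions.
singles = {red_probability([i]) for i in range(1, 7)}
pairs = {red_probability(list(c)) for c in combinations(range(1, 7), 2)}
print(singles, pairs)

# Full deck of 52: P(A_1 A_2) as in the text, compared with 1/4.
p12 = Fraction(26, 52) * Fraction(25, 51)
print(p12, float(p12))
```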

(e) Random permutations

Randomly permute the numbers $1, 2, \dots, n$ and let $A_i$ be the event that number $i$ is in the $i$th place. Prove that these events are exchangeable.

(f) An urn model

An urn contains $w$ white balls and $b$ black balls. Draw the balls one at a time, without replacement, until the urn is empty. Let $A_i$ be the event that the $i$th ball drawn is white.

Here is an exercise that is trivial if you use exchangeability:

$$P(\text{all black balls are drawn before the last white ball}) = \frac{w}{w+b}.$$

(g) The German tank problem

Suppose that $x_1, x_2, \dots, x_n$ is a random sample of size $n$, taken without replacement, from the set $\{1, 2, \dots, m\}$. Putting $M = \max\{x_1, x_2, \dots, x_n\}$, we want to calculate $E(M)$.

Solution 1: For $n \le k \le m$, we have

$$P(M = k) = \frac{\binom{k-1}{n-1}}{\binom{m}{n}}.$$

An Aside. Since the probabilities must add to 1, a simple change of variables gives us the hockey stick formula:

$$\sum_{n \le k \le m} \binom{k}{n} = \binom{m+1}{n+1}.$$


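The expected value $E(M)$ is therefore

$$\sum_{n \le k \le m} k\, \frac{\binom{k-1}{n-1}}{\binom{m}{n}} = \frac{n}{\binom{m}{n}} \sum_{n \le k \le m} \binom{k}{n} = \frac{n}{\binom{m}{n}} \binom{m+1}{n+1} = \frac{n}{n+1}(m+1).$$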

Solution 2: Here is an argument that uses the natural symmetry of the problem. For $1 \le j \le n$, define $x_{(j)}$ to be the $j$th order statistic of the set $\{x_1, x_2, \dots, x_n\}$. To fill out these values, define $x_{(0)} = 0$ and $x_{(n+1)} = m + 1$.

For instance, if $m = 6$, $n = 3$, and $\{x_1, x_2, x_3\} = \{2, 3, 6\}$, then $x_{(1)} = 2$, $x_{(2)} = 3$, and $x_{(3)} = 6$. We also have $x_{(0)} = 0$ and $x_{(4)} = 7$.

[Figure: the points 0, 1, ..., 7 on a line, with the values 0, 2, 3, 6, 7 marked and separated by gaps of 2, 1, 3, 1.]

Compositions of 7 into four parts are in 1-1 correspondence with subsets of size 3 from $\{1, 2, 3, 4, 5, 6\}$. For example, $2 + 1 + 3 + 1 \longleftrightarrow \{2, 3, 6\}$.

The random variables $x_{(j+1)} - x_{(j)}$ for $j = 0, 1, \dots, n$ are exchangeable; in particular, they are identically distributed. Now

$$\sum_{j=0}^{n} (x_{(j+1)} - x_{(j)}) = m + 1,$$

so we deduce that $E(x_{(j+1)} - x_{(j)}) = \frac{m+1}{n+1}$. Writing $x_{(j)} = \sum_{i=0}^{j-1} (x_{(i+1)} - x_{(i)})$ we see that $E(x_{(j)}) = \frac{j}{n+1}(m+1)$.

In particular, we have $M = x_{(n)}$ so $E(M) = \frac{n}{n+1}(m+1)$.
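For small $m$ and $n$ every sample can be listed, which gives an exact check of $E(M)$ and of the order-statistic formula. A minimal sketch; the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations

def expected_order_statistic(m, n, j):
    """Exact E(x_(j)) for a size-n sample drawn without replacement from 1..m.
    combinations() yields each sample already sorted, so s[j-1] is x_(j)."""
    samples = list(combinations(range(1, m + 1), n))
    return Fraction(sum(s[j - 1] for s in samples), len(samples))

def expected_max(m, n):
    return expected_order_statistic(m, n, n)

print(expected_max(6, 3), Fraction(3 * 7, 4))        # n(m+1)/(n+1) with m=6, n=3
print(expected_order_statistic(6, 3, 1), Fraction(7, 4))
```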

(h) Drawing straws

The dirty dozen are a tough bunch of soldiers, and two of them have to go on a dangerous mission. In order to decide who goes, they draw straws. They begin with 10 long straws and two short straws, and draw successively until the short ones are chosen. The soldiers who get short straws have to go.


For $1 \le i \le 12$, let $A_i$ be the event that soldier $i$ draws a short straw. The argument in section (d) says that these events are exchangeable. In particular, $P(A_i) = 2/12$ for all $i$; drawing straws is a completely fair way to make the decision.

(i) Color run symmetry

The distribution of the number of color runs in a well shuffled deck of $n$ red cards and $n$ black cards is symmetric around the mean $n + 1$.

[Figure: proof by picture, with steps labelled "separate colours", "opposite pattern" (once for each colour), and "recombine colours".]


(j) The superb symmetry of seven

The seven digits 1, 2, 3, 4, 5, 6, 7 are written down in a random order. What is the chance that the resulting number is divisible by 7? For instance, 1234576 works, but 1234567 doesn’t.

Solution: Consider the map that takes any order and adds 1 to each digit modulo 7,

$$(d_1, d_2, \dots, d_7) \to (d_1 + 1, d_2 + 1, \dots, d_7 + 1).$$

For example, $7451632 \to 1562743$. Mathematically, this amounts to changing the digit ’7’ to a zero, and then adding 1111111. The first step does not change the value modulo 7, while the second step adds $1111111 \equiv 1 \pmod 7$. Let’s look closely at what happens if we repeat this process:

$$7451632 \to 1562743 \to 2673154 \to 3714265 \to 4125376 \to 5236417 \to 6347521 \to 7451632.$$

Each permutation is in exactly one of these cycles of length seven, and the seven members of a cycle have seven different remainders modulo 7, so the chance of being divisible by 7 is 1/7.
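With only $7! = 5040$ orders, the claim is easy to confirm by machine. The sketch below counts the multiples of 7 directly and also walks once around the cycle that starts at 7451632; `shift` and `to_int` are hypothetical helper names.

```python
from itertools import permutations

def shift(digits):
    """Add 1 to every digit modulo 7, writing the residue 0 as the digit 7."""
    return tuple(d % 7 + 1 for d in digits)

def to_int(digits):
    return int("".join(map(str, digits)))

# Direct count of the orders that give a multiple of 7.
divisible = sum(1 for p in permutations(range(1, 8)) if to_int(p) % 7 == 0)
print(divisible, 5040 // 7)

# One full cycle: the remainders modulo 7 should all be different.
cycle = [(7, 4, 5, 1, 6, 3, 2)]
for _ in range(6):
    cycle.append(shift(cycle[-1]))
residues = sorted(to_int(p) % 7 for p in cycle)
print([to_int(p) for p in cycle], residues)
```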


3 Inclusion-exclusion

(a) One in a million

Choose one of the million numbers 000000, 000001, . . . , 999999 at random. Find the probability that at least one of the digits 0, 1, . . . , 9 appears exactly twice.

Solution 1:

Pattern   Choose values                        Order them            Number
11112     $10 \cdot \binom{9}{4}$              $6!/2!$               1260 × 360 = 453,600
1122      $\binom{10}{2} \cdot \binom{8}{2}$   $6!/(2!\,2!)$         1260 × 180 = 226,800
123       $10 \cdot 9 \cdot 8$                 $6!/(3!\,2!)$         720 × 60 = 43,200
222       $\binom{10}{3}$                      $6!/(2!\,2!\,2!)$     120 × 90 = 10,800
24        $10 \cdot 9$                         $6!/(4!\,2!)$         90 × 15 = 1,350
Total                                                                735,750

This event has a probability of 0.73575.

Note: The values in the second and third columns of the table above may seem a bit ad hoc. The pattern becomes visible if you write them as multinomial coefficients. For instance, the two values in the first row can be written

$$\binom{10}{5, 4, 1} \binom{6}{1, 1, 1, 1, 2}.$$

Under the 10 we have 5, 4, 1 meaning that, for this pattern, out of the ten digits 5 of them will appear zero times, 4 of them will appear once, and 1 of them will appear twice. Under the six, we just put in the pattern with commas.


Inclusion-Exclusion I. Let $A_1, A_2, \dots, A_n$ be events. The probability that exactly $k$ of them occur ($k \ge 0$) is

$$p_n(k) = \sum_{j \ge k} (-1)^{j-k} \binom{j}{k} S_j$$

where $S_0 = 1$, $S_1 = \sum_i P(A_i)$, $S_2 = \sum_{i<j} P(A_i A_j)$, $S_3 = \sum_{i<j<k} P(A_i A_j A_k)$, etc.

Inclusion-Exclusion II. Let $A_1, A_2, \dots, A_n$ be events. The probability that $k$ or more of them occur ($k \ge 1$) is

$$P_n(k) = \sum_{j \ge k} (-1)^{j-k} \binom{j-1}{k-1} S_j$$
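Both formulas translate directly into code once the sums $S_0, S_1, \dots, S_n$ are known. The sketch below is a hedged illustration: the function names and the toy example (three fair dice, with $A_i$ the event that die $i$ shows a six) are assumptions chosen so that the answers can be checked by hand.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb

def exactly(k, S):
    """Inclusion-Exclusion I: P(exactly k events occur), given S = [S_0, ..., S_n]."""
    return sum((-1) ** (j - k) * comb(j, k) * S[j] for j in range(k, len(S)))

def at_least(k, S):
    """Inclusion-Exclusion II: P(k or more events occur), for k >= 1."""
    return sum((-1) ** (j - k) * comb(j - 1, k - 1) * S[j] for j in range(k, len(S)))

# Toy example: roll 3 fair dice, A_i = "die i shows a six".
outcomes = list(product(range(1, 7), repeat=3))

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# S_j = sum over all j-subsets of the probability of the intersection.
S = [sum(prob(lambda o, idx=idx: all(o[i] == 6 for i in idx))
         for idx in combinations(range(3), j))
     for j in range(4)]

print(S)
print(exactly(1, S), at_least(1, S), at_least(2, S))
```

Here exactly one six has probability $3 \cdot 25/216$ and at least one six has probability $1 - (5/6)^3$, which is what the two functions should return.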

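Solution 2: For $i = 0, 1, \dots, 9$ define $A_i$ as the event that the digit $i$ appears exactly twice. Using binomial and multinomial probabilities we get

$$P_1 = P(A_0) = \frac{6!}{2!\,4!} \left(\frac{1}{10}\right)^2 \left(\frac{9}{10}\right)^4 = \frac{98{,}415}{1{,}000{,}000}$$

$$P_2 = P(A_0 A_1) = \frac{6!}{2!\,2!\,2!} \left(\frac{1}{10}\right)^2 \left(\frac{1}{10}\right)^2 \left(\frac{8}{10}\right)^2 = \frac{5{,}760}{1{,}000{,}000}$$

$$P_3 = P(A_0 A_1 A_2) = \frac{6!}{2!\,2!\,2!} \left(\frac{1}{10}\right)^2 \left(\frac{1}{10}\right)^2 \left(\frac{1}{10}\right)^2 = \frac{90}{1{,}000{,}000}$$

For exchangeable events: $S_j = \binom{n}{j} P_j$.

Plugging into the second inclusion-exclusion formula with $n = 10$, $k = 1$ shows the probability that at least one digit appears exactly twice to be

$$\binom{10}{1} P_1 - \binom{10}{2} P_2 + \binom{10}{3} P_3 = \frac{2{,}943}{4{,}000} = .73575.$$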

Solution 3: For every pair of distinct indices $i, j \in \{1, 2, \dots, 6\}$ define $A_{i,j}$ to be the event that the $i$th and $j$th value are the same, and the rest are all different from this value.

There are $\binom{6}{2}$ such events with probability $P(A_{i,j}) = (10 \cdot 9^4)/10^6$.

These events are not exchangeable, but their double intersections $A_{i,j} A_{k,\ell}$ come in two types:

$$P(A_{i,j} A_{k,\ell}) = \begin{cases} 0 & \text{if } \{i, j\} \cap \{k, \ell\} \ne \emptyset \\ (10 \cdot 9 \cdot 8^2)/10^6 & \text{if } \{i, j\} \cap \{k, \ell\} = \emptyset. \end{cases}$$

The triple intersection $A_{i,j} A_{k,\ell} A_{m,n}$ has non-zero probability only if the pairs $\{i, j\}$, $\{k, \ell\}$, and $\{m, n\}$ are disjoint, in which case

$$P(A_{i,j} A_{k,\ell} A_{m,n}) = (10 \cdot 9 \cdot 8)/10^6.$$

The probability that at least one of these events occurs is

$$P_n(1) = \binom{6}{2} \frac{10 \cdot 9^4}{10^6} - \binom{6}{2} \binom{4}{2} \frac{1}{2!} \frac{10 \cdot 9 \cdot 8^2}{10^6} + \binom{6}{2} \binom{4}{2} \frac{1}{3!} \frac{10 \cdot 9 \cdot 8}{10^6} = \frac{2{,}943}{4{,}000} = .73575.$$
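Since there are only a million numbers, the count 735,750 can also be checked by looking at every one of them. This is a brute-force sketch rather than any of the three solutions above; `count_numbers` is a hypothetical helper, and the loop takes on the order of a second.

```python
from collections import Counter
from itertools import product

def count_numbers(predicate, length=6):
    """Count digit strings of the given length whose multiset of digit counts
    satisfies the predicate (brute force over all 10**length strings)."""
    return sum(1 for digits in product(range(10), repeat=length)
               if predicate(Counter(digits).values()))

# "At least one digit appears exactly twice" means some count equals 2.
some_digit_twice = count_numbers(lambda counts: 2 in counts)
print(some_digit_twice, some_digit_twice / 10 ** 6)
```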

(b) Another one in a million

Choose one of the million numbers 000000, 000001, . . . , 999999 at random. Find the probability that at least two of the digits 0, 1, . . . , 9 appear exactly once.


Solution 1:

Pattern   Choose values                    Order them                     Number
111111    $\binom{10}{4, 6}$               $\binom{6}{1, 1, 1, 1, 1, 1}$  210 × 720 = 151,200
114       $\binom{10}{7, 2, 0, 0, 1}$      $\binom{6}{1, 1, 4}$           360 × 30 = 10,800
11112     $\binom{10}{5, 4, 1}$            $\binom{6}{1, 1, 1, 1, 2}$     1260 × 360 = 453,600
1113      $\binom{10}{6, 3, 0, 1}$         $\binom{6}{1, 1, 1, 3}$        840 × 120 = 100,800
1122      $\binom{10}{6, 2, 2}$            $\binom{6}{1, 1, 2, 2}$        1260 × 180 = 226,800
Total                                                                     943,200

This event has a probability of 0.9432.

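Solution 2: For $i = 0, 1, \dots, 9$ define $A_i$ as the event that the digit $i$ appears exactly once. Using binomial and multinomial probabilities we get

$$P_2 = P(A_0 A_1) = \frac{6!}{4!} \left(\frac{1}{10}\right)^2 \left(\frac{8}{10}\right)^4 = \frac{122{,}880}{10^6}$$

$$P_3 = P(A_0 A_1 A_2) = \frac{6!}{3!} \left(\frac{1}{10}\right)^3 \left(\frac{7}{10}\right)^3 = \frac{41{,}160}{10^6}$$

$$P_4 = P(A_0 A_1 A_2 A_3) = \frac{6!}{2!} \left(\frac{1}{10}\right)^4 \left(\frac{6}{10}\right)^2 = \frac{12{,}960}{10^6}$$

$$P_5 = P(A_0 A_1 A_2 A_3 A_4) = 6! \left(\frac{1}{10}\right)^5 \left(\frac{5}{10}\right)^1 = \frac{3{,}600}{10^6}$$

$$P_6 = P(A_0 A_1 A_2 A_3 A_4 A_5) = 6! \left(\frac{1}{10}\right)^6 \left(\frac{4}{10}\right)^0 = \frac{720}{10^6}$$

Notice that $P_7 = P_8 = P_9 = P_{10} = 0$. Now using the formula with $k = 2$ and $n = 10$ we get the desired probability

$$\begin{aligned} & P(\text{at least two digits appear exactly once}) \\ &= \binom{1}{1}\binom{10}{2} P_2 - \binom{2}{1}\binom{10}{3} P_3 + \binom{3}{1}\binom{10}{4} P_4 - \binom{4}{1}\binom{10}{5} P_5 + \binom{5}{1}\binom{10}{6} P_6 \\ &= [5{,}529{,}600 - 9{,}878{,}400 + 8{,}164{,}800 - 3{,}628{,}800 + 756{,}000]/10^6 \\ &= 943{,}200/10^6. \end{aligned}$$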


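Solution 3: For $i = 1, \dots, 6$ define $A'_i$ as the event that the digit in position $i$ appears exactly once. These are exchangeable events with

$$P'_2 = P(A'_1 A'_2) = \frac{10 \cdot 9 \cdot 8^4}{10^6} = \frac{368{,}640}{10^6}$$

$$P'_3 = P(A'_1 A'_2 A'_3) = \frac{10 \cdot 9 \cdot 8 \cdot 7^3}{10^6} = \frac{246{,}960}{10^6}$$

$$P'_4 = P(A'_1 A'_2 A'_3 A'_4) = \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6^2}{10^6} = \frac{181{,}440}{10^6}$$

$$P'_5 = P(A'_1 A'_2 A'_3 A'_4 A'_5) = \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5}{10^6} = \frac{151{,}200}{10^6}$$

$$P'_6 = P(A'_1 A'_2 A'_3 A'_4 A'_5 A'_6) = \frac{10 \cdot 9 \cdot 8 \cdot 7 \cdot 6 \cdot 5}{10^6} = \frac{151{,}200}{10^6}$$

Using the formula with $k = 2$ and $n = 6$ we get the desired probability

$$\begin{aligned} & P(\text{at least two digits appear exactly once}) \\ &= \binom{1}{1}\binom{6}{2} P'_2 - \binom{2}{1}\binom{6}{3} P'_3 + \binom{3}{1}\binom{6}{4} P'_4 - \binom{4}{1}\binom{6}{5} P'_5 + \binom{5}{1}\binom{6}{6} P'_6 \\ &= [5{,}529{,}600 - 9{,}878{,}400 + 8{,}164{,}800 - 3{,}628{,}800 + 756{,}000]/10^6 \\ &= 943{,}200/10^6. \end{aligned}$$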
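The digit-based and position-based calculations can both be run through the second inclusion-exclusion formula in exact arithmetic. In this sketch the closed forms for $P_j$ and $P'_j$ are read off from the displayed values; the function and variable names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb, perm

def at_least(k, n, P):
    """Inclusion-Exclusion II for exchangeable events, using S_j = C(n, j) * P[j]."""
    return sum((-1) ** (j - k) * comb(j - 1, k - 1) * comb(n, j) * P[j]
               for j in range(k, n + 1))

# Digit-based events: P_j = 6!/(6-j)! * (1/10)^j * ((10-j)/10)^(6-j), and 0 for j > 6.
P_digit = {j: (Fraction(perm(6, j) * (10 - j) ** (6 - j), 10 ** 6) if j <= 6
               else Fraction(0))
           for j in range(2, 11)}

# Position-based events: P'_j = 10*9*...*(11-j) * (10-j)^(6-j) / 10^6.
P_position = {j: Fraction(perm(10, j) * (10 - j) ** (6 - j), 10 ** 6)
              for j in range(2, 7)}

print(at_least(2, 10, P_digit), at_least(2, 6, P_position))
```

Both calls should return the same fraction, equal to 943,200/10^6.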

(c) Points when throwing several dice

Ten fair dice are thrown at random. What is the chance P that they add up to 27?

[Figure: ten dice.]

There are $6^{10} = 60{,}466{,}176$ equally likely outcomes. We need to count how many of these add to 27. We will solve this with inclusion-exclusion. Let’s start with the set $S = \{(y_1, y_2, \dots, y_{10})\}$ where $y_k \ge 1$ and $y_1 + y_2 + \cdots + y_{10} = 27$.

By "stars and bars", the number of such strings is $\binom{27-1}{10-1} = 3{,}124{,}550$.

For $1 \le i \le 10$, let $A_i$ be the subset of $S$ such that $y_i \ge 7$. Subtracting 6 from the $i$th coordinate puts $A_i$ in one-to-one correspondence with the set $\{(w_1, w_2, \dots, w_{10})\}$ where $w_k \ge 1$ and $w_1 + w_2 + \cdots + w_{10} = 21$. By the argument above, there are $\binom{21-1}{10-1} = 167{,}960$ such strings.

For $i \ne j$, the set $A_i A_j$ corresponds to strings $(v_1, v_2, \dots, v_{10})$ where $v_k \ge 1$ and $v_1 + v_2 + \cdots + v_{10} = 15$. By the argument above, there are $\binom{15-1}{10-1} = 2{,}002$ such strings.

Triple intersections (and higher) are all empty, so the number of elements in $S$ with all values less than or equal to 6 is

$$|S| - \sum_i |A_i| + \sum_{i<j} |A_i A_j| = \binom{26}{9} - \binom{10}{1}\binom{20}{9} + \binom{10}{2}\binom{14}{9} = 1{,}535{,}040.$$

The required probability is $P = 1{,}535{,}040/6^{10} \approx 0.02539$.
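The inclusion-exclusion count can be cross-checked by a different method: build up the distribution of the total one die at a time. This is a sketch under that swapped-in technique (a simple dynamic program, not the method of the text); `ways_to_total` is a hypothetical name.

```python
from math import comb

def ways_to_total(num_dice, total, faces=6):
    """Count outcomes of num_dice dice with the given total, adding one die at a time."""
    counts = {0: 1}                      # counts[s] = number of ways to reach sum s
    for _ in range(num_dice):
        nxt = {}
        for s, c in counts.items():
            for face in range(1, faces + 1):
                nxt[s + face] = nxt.get(s + face, 0) + c
        counts = nxt
    return counts.get(total, 0)

direct = ways_to_total(10, 27)
by_inclusion_exclusion = (comb(26, 9) - comb(10, 1) * comb(20, 9)
                          + comb(10, 2) * comb(14, 9))
probability = direct / 6 ** 10
print(direct, by_inclusion_exclusion, probability)
```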

(d) A poker question

If you were to get dealt five cards, what are the chances that none of them would be a 2, 3, or 4 AND no combinations of 87, 86, 85, 76, 75, or 65?

Solution: Define the following sets of hands:

Set   Without              Number
A     2, 3, 4, 5, 6, 7     $\binom{28}{5}$
B     2, 3, 4, 5, 6, 8     $\binom{28}{5}$
C     2, 3, 4, 5, 7, 8     $\binom{28}{5}$
D     2, 3, 4, 6, 7, 8     $\binom{28}{5}$

These are exchangeable events, and the "good" hands are exactly those in the union $A \cup B \cup C \cup D$. We want

$$\binom{4}{1} P_1 - \binom{4}{2} P_2 + \binom{4}{3} P_3 - \binom{4}{4} P_4.$$

But in this problem, $P_2 = P_3 = P_4$, so this simplifies to

$$4P_1 - 3P_2 = 4\, \frac{\binom{28}{5}}{\binom{52}{5}} - 3\, \frac{\binom{24}{5}}{\binom{52}{5}},$$

to give the solution

$$P(\text{good hand}) = \frac{265608}{2598960} = \frac{93}{910} = .1021978022.$$
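As a check on $4P_1 - 3P_2$, the good hands can also be counted directly: a good hand uses no 2, 3, or 4 and at most one of the ranks 5, 6, 7, 8. The sketch below compares the two counts; the direct decomposition is an alternative argument added here for illustration, not the solution given in the text.

```python
from fractions import Fraction
from math import comb

total_hands = comb(52, 5)

# Inclusion-exclusion count from the text.
by_formula = 4 * comb(28, 5) - 3 * comb(24, 5)

# Direct count: either all five cards come from the 24 cards ranked 9 through ace,
# or j >= 1 cards share a single rank among 5, 6, 7, 8 and the other 5 - j cards
# come from those 24 cards.
direct = comb(24, 5) + 4 * sum(comb(4, j) * comb(24, 5 - j) for j in range(1, 5))

probability = Fraction(by_formula, total_hands)
print(by_formula, direct, probability, float(probability))
```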

(e) Coupling

A set of 200 people, consisting of 100 men and 100 women, is randomly divided into 100 pairs. Find the probability that exactly 50 of these pairs will consist of a man and a woman.

Solution 1: For $1 \le i \le 100$, define $A_i$ to be the event that woman $i$ is paired with a man. These events are exchangeable, and

$$P(A_1 A_2 \cdots A_j) = \frac{100}{199} \times \frac{99}{197} \times \cdots \times \frac{101-j}{201-2j}.$$

Writing $X$ for the number of man-woman pairs, we therefore get

$$P(X = 50) = \sum_{j=50}^{100} (-1)^{j-50} \binom{j}{50} \binom{100}{j} P(A_1 A_2 \cdots A_j),$$

which works out to

$$P(X = 50) = \frac{2682490635867360565640252171955859065705332736}{16915402546484587405948993803982403353855810915} = .1585827253.$$

Solution 2: Let’s begin by counting the total number of possible pairings. Write out the 200 people in any order and underline them two by two: for example

$$\underline{78\ \ 45}\ \ \underline{12\ \ 105}\ \ \underline{137\ \ 88}\ \cdots\ \underline{67\ \ 7}.$$

Now, a lot of orders give the same pairings. I could write the pairs in any order, and within each pair I can swap the two partners without changing the pairing. For example,

$$\underline{105\ \ 12}\ \ \underline{78\ \ 45}\ \ \underline{137\ \ 88}\ \cdots\ \underline{67\ \ 7},$$

gives the same pairing as the other order. The total number of pairings is therefore $\frac{200!}{100!\, 2^{100}}$.

This value can be rewritten as

$$\frac{200!}{100!\, 2^{100}} = 199 \cdot 197 \cdot 195 \cdots 3 \cdot 1 = 199!!,$$

where $n!!$ is called the double factorial.

To count how many pairings have exactly 50 man-woman couples, we first choose 50 men from 100, then 50 women from 100, then count how many man-woman couples we can get. The answer is $\binom{100}{50} \binom{100}{50}\, 50!$.

Finally we must make pairings among the 50 remaining men, and then among the 50 remaining women. The number of ways this can be done is $49!! \cdot 49!!$, and so the final probability is

$$P(X = 50) = \frac{\binom{100}{50}^2\, 50!\, (49!!)^2}{199!!} = .1585827253.$$

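Solution 3: Write out the 200 men and women in any order and underline them two by two: for example

$$\underline{mm}\ \ \underline{mw}\ \ \underline{wm}\ \cdots\ \underline{mw}.$$

There are $\binom{200}{100}$ such orderings.

Let’s call the number of the four different kinds of pairs $n(mm)$, $n(mw)$, $n(wm)$, and $n(ww)$. Of course $n(mm) + n(mw) + n(wm) + n(ww) = 100$, and you should check that $n(mm) = n(ww)$. Counting the orderings with $n(wm) + n(mw) = 50$, so that $n(mm) = n(ww) = 25$, and dividing by the total number of orderings gives

$$\begin{aligned} P(X = 50) &= \frac{1}{\binom{200}{100}} \sum_{n(mw)=0}^{50} \frac{100!}{25!\, 25!\, n(mw)!\, (50 - n(mw))!} \\ &= \frac{\binom{100}{50} \binom{50}{25} \sum_{n(mw)} \binom{50}{n(mw)}}{\binom{200}{100}} = \frac{\binom{100}{50} \binom{50}{25}\, 2^{50}}{\binom{200}{100}} = .1585827253. \end{aligned}$$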
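The three solutions give expressions that look quite different, so an exact comparison is worthwhile. The sketch below evaluates each one with rational arithmetic; `double_factorial` and `p_first_j_women_with_men` are hypothetical helper names.

```python
from fractions import Fraction
from math import comb, factorial

def double_factorial(n):
    """n!! = n (n-2) (n-4) ... down to 1 or 2."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def p_first_j_women_with_men(j):
    """P(A_1 ... A_j) = (100/199)(99/197)...((101-j)/(201-2j))."""
    p = Fraction(1)
    for i in range(1, j + 1):
        p *= Fraction(101 - i, 201 - 2 * i)
    return p

# Solution 1: inclusion-exclusion for exactly 50 of the 100 exchangeable events.
solution_1 = sum((-1) ** (j - 50) * comb(j, 50) * comb(100, j)
                 * p_first_j_women_with_men(j) for j in range(50, 101))

# Solution 2: counting pairings with double factorials.
solution_2 = Fraction(comb(100, 50) ** 2 * factorial(50) * double_factorial(49) ** 2,
                      double_factorial(199))

# Solution 3: counting orderings of m's and w's.
solution_3 = Fraction(comb(100, 50) * comb(50, 25) * 2 ** 50, comb(200, 100))

print(float(solution_1), float(solution_2), float(solution_3))
```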

(f) Inclusion-exclusion approximation

The inclusion-exclusion formula for $P_n(1)$ says

$$P := P(A_1 \cup A_2 \cup \dots \cup A_n) = S_1 - S_2 + S_3 - \cdots + (-1)^{n-1} S_n.$$

It is worth knowing that the partial sums $S_1$, $S_1 - S_2$, $S_1 - S_2 + S_3$, etc. alternate between upper and lower bounds for $P$.

Example 3-1 If you roll a fair die ten times, what is the chance that there are at least three consecutive rolls with the same value? The sequence on page 20 is an example.

Solution: For $1 \le i \le 8$, define $A_i$ to be the event that rolls $i$, $i+1$, and $i+2$ have the same value.

Since $P(A_i) = 1/6^2$ for all $i$, we have $S_1 = 8/36 = 2/9$.

For $P(A_i A_j)$ it gets more complicated, since the probability depends on whether the overlap between $\{i, i+1, i+2\}$ and $\{j, j+1, j+2\}$ is zero, one, or two elements. For example, $A_1 \cap A_2$ means that rolls 1, 2 and 3 are the same and also that rolls 2, 3 and 4 are the same. This amounts to saying that rolls 1, 2, 3, and 4 are all the same; the chance that this happens is $1/6^3$. On the other hand, since there is no overlap, events $A_1$ and $A_4$ are independent, so $P(A_1 \cap A_4) = (1/6^2)(1/6^2)$.

Accounting for all the different cases we calculate $S_2 = 7/144$ so that

$$.17361 = 2/9 - 7/144 \le P \le 2/9 = .22222.$$

With more patience, you can calculate further terms and get better approximations.
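The "more patience" can be delegated to a computer. The sketch below computes every $S_j$ by merging overlapping triples of rolls into blocks, sums the full alternating series, and compares the result with an independent calculation that tracks the length of the current run. All names are illustrative, and the run-length recursion is a swapped-in check rather than a method from the text.

```python
from fractions import Fraction
from itertools import combinations

ROLLS, RUN = 10, 3
STARTS = range(1, ROLLS - RUN + 2)          # i = 1, ..., 8

def intersection_probability(starts):
    """P(A_i for every i in starts). Overlapping triples merge into one block of
    equal rolls; each roll added to a block after its first costs a factor 1/6."""
    p = Fraction(1)
    block_end = 0                            # last roll of the current block
    for i in sorted(starts):
        if i <= block_end:                   # overlaps the current block: extend it
            new_rolls = i + RUN - 1 - block_end
        else:                                # new block: its first roll is free
            new_rolls = RUN - 1
        p /= 6 ** new_rolls
        block_end = i + RUN - 1
    return p

S = [sum(intersection_probability(c) for c in combinations(STARTS, j))
     for j in range(len(STARTS) + 1)]
exact = sum((-1) ** (j - 1) * S[j] for j in range(1, len(S)))

def no_run_probability(rolls=ROLLS, run=RUN):
    """P(no `run` equal rolls in a row), tracking the length of the current run."""
    state = {1: Fraction(1)}                 # after the first roll the run length is 1
    for _ in range(rolls - 1):
        nxt = {}
        for length, p in state.items():
            nxt[1] = nxt.get(1, 0) + p * Fraction(5, 6)          # a different value
            if length + 1 < run:                                 # same value, still short
                nxt[length + 1] = nxt.get(length + 1, 0) + p * Fraction(1, 6)
        state = nxt
    return sum(state.values())

print(S[1], S[2], float(S[1] - S[2]), float(exact), float(S[1]))
```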


4 Classical probability

(a) Coincidence or Rencontre

Montmort 1708 The cards on two identical, well-shuffled decks of size $n$ are turned over one at a time. What is the chance to see the same card simultaneously?

Solution: For $1 \le i \le n$, let $A_i$ be the event that card $i$ is in the same position in both decks. If $i_1, i_2, \dots, i_j$ are distinct indices, then

$$P(A_{i_1} \cdots A_{i_j}) = \frac{(n-j)!}{n!}.$$

By inclusion-exclusion we get

$$P(\text{no coincidence}) = \sum_{j \ge 0} (-1)^j \binom{n}{j} \frac{(n-j)!}{n!} = \sum_{j=0}^{n} \frac{(-1)^j}{j!}.$$
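For small decks the formula can be compared with a direct count of the orderings with no fixed point. A minimal sketch, with function names chosen only for this example; one deck is held fixed and the other is permuted.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def no_coincidence_brute(n):
    """Fraction of orderings of 0..n-1 in which no card keeps its position."""
    perms = list(permutations(range(n)))
    good = sum(1 for p in perms if all(p[i] != i for i in range(n)))
    return Fraction(good, len(perms))

def no_coincidence_formula(n):
    """Sum of (-1)^j / j! for j = 0..n."""
    return sum(Fraction((-1) ** j, factorial(j)) for j in range(n + 1))

for n in range(1, 8):
    print(n, no_coincidence_brute(n), no_coincidence_formula(n))
```

The values settle quickly near $1/e \approx 0.3679$, so the chance of at least one coincidence is about 0.632 for any deck of moderate size.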

(b) Occupancy

de Moivre 1712 A symmetrical die with $r$ faces is thrown $n$ times. What is the chance that all faces appear at least once?

Solution: For $1 \le i \le r$, let $A_i$ be the event that the $i$th face does not occur. These are exchangeable events with $P_j = (1 - \frac{j}{r})^n$. Writing $X$ for the number of missing faces, the probability that there are no missing faces is

$$P(\text{no missing faces}) = P(X = 0) = \sum_{j \ge 0} (-1)^j \binom{r}{j} \left(1 - \frac{j}{r}\right)^n.$$

More generally,

$$P(X = k) = \sum_{j \ge k} (-1)^{j-k} \binom{j}{k} \binom{r}{j} \left(1 - \frac{j}{r}\right)^n.$$

If we let $Y = r - X$ be the number of faces that occur, then the above formula can be rewritten as

$$P(Y = k) = \binom{r}{k} \left(\frac{k}{r}\right)^n q_{kn},$$

where $q_{kn} = \sum_j (-1)^j \binom{k}{j} \left(1 - \frac{j}{k}\right)^n$. Note that $q_{kn}$ is the chance that all faces occur when you throw $n$ dice with $k$ sides.

Stirling numbers of the second kind Let’s define ${n \brace k}$ to be the number of ways to divide $n$ objects into $k$ non-empty groups. For instance, ${4 \brace 2} = 7$ because of the seven ways shown here:

$$\{1, 2, 3\} \cup \{4\}, \quad \{1, 2, 4\} \cup \{3\}, \quad \{1, 4, 3\} \cup \{2\}, \quad \{4, 2, 3\} \cup \{1\},$$

$$\{1, 2\} \cup \{3, 4\}, \quad \{1, 4\} \cup \{2, 3\}, \quad \{1, 3\} \cup \{2, 4\}.$$

The Stirling numbers can be used to give an alternative solution to the occupancy problem. We can get all $k$ faces in the following way: partition the index set $\{1, 2, \dots, n\}$ into $k$ non-empty sets, then assign the values $1, 2, \dots, k$ to these sets.

For instance, let $n = 4$ and $k = 2$ (let’s use a coin). We can construct an outcome where all $k$ values occur by choosing $\{1, 2, 3\} \cup \{4\}$ and then assigning H to the first set and T to the second. This results in the outcome HHHT. We had 7 ways to make the first choice and 2 ways to make the second choice. There are 14 outcomes where both values occur, so the probability is

$$P(\text{both values occur}) = \frac{14}{16}.$$

Generalizing this argument, we get

$$P(\text{all } k \text{ values occur in } n \text{ throws}) = q_{kn} = {n \brace k} \frac{k!}{k^n}.$$

Example 4-1 A Formula Let $Y$ be the number of distinct faces that occur in $n$ throws of an $r$-sided die. Then

$$P(Y = k) = \binom{r}{k} \left(\frac{k}{r}\right)^n {n \brace k} \frac{k!}{k^n}.$$

Since these must add to 1, we deduce that

$$\sum_k {n \brace k}\, r(r-1) \cdots (r-k+1) = r^n.$$
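These identities are easy to test numerically. The sketch below computes Stirling numbers of the second kind with the standard recurrence ${n \brace k} = k {n-1 \brace k} + {n-1 \brace k-1}$ (a well-known fact that is not derived in this section), then compares the Stirling form of $q_{kn}$ with the inclusion-exclusion form. The function names are assumptions made for this example.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial, perm

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to split n labelled objects into k non-empty groups."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def q(k, n):
    """Chance that all k faces occur in n throws, by inclusion-exclusion."""
    return sum((-1) ** j * comb(k, j) * Fraction(k - j, k) ** n for j in range(k + 1))

def distinct_faces_pmf(r, n):
    """P(Y = k) for k = 1..r; the factors (k/r)^n and k!/k^n combine to k!/r^n."""
    return {k: Fraction(comb(r, k) * stirling2(n, k) * factorial(k), r ** n)
            for k in range(1, r + 1)}

print(stirling2(4, 2), q(2, 4))
print(sum(distinct_faces_pmf(6, 4).values()))
# Falling factorial r(r-1)...(r-k+1) is perm(r, k).
print(sum(stirling2(5, k) * perm(3, k) for k in range(6)), 3 ** 5)
```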

Example 4-2 Coupon collecting. Let us sample with replacement from {1, 2, . . . , k}, and let N be the number of trials needed to observe every value. By the occupancy problem, we see that

\[ P(N \le n) = k^{-n}\, k!\, {n \brace k}, \]

and

\[ P(N = n) = k^{-n}\, k!\, {n-1 \brace k-1}. \]

For instance, if I throw a die until I see all six values, then the chance that I stop at the 10th throw is

\[ P(N = 10) = 6^{-10}\, 6!\, {9 \brace 5} = .08277. \]

In the coupon collecting problem, we are usually more interested in the easier task of determining E(N). Write $N = Y_1 + Y_2 + \cdots + Y_k$ where $Y_j$ is the number of trials needed to increase the collection from j − 1 to j different values. The $Y_j$s are geometric random variables with $p_j = 1 - \frac{j-1}{k}$. It follows that $E(Y_j) = \frac{1}{p_j} = \frac{k}{k-j+1}$ and hence

\[ E(N) = k\left(\frac{1}{k} + \frac{1}{k-1} + \cdots + \frac{1}{2} + 1\right) \approx k(\log(k) + \gamma). \]

For the die throwing problem, we have

\[ E(N) = 6\left(\frac{1}{6} + \frac{1}{5} + \cdots + \frac{1}{2} + 1\right) = 14.7. \]
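The coupon collecting formulas are easy to check numerically. The sketch below is only an illustration: the function names (stirling2, prob_stop_at, expected_trials) are our own, and the Stirling numbers are built from the standard recurrence S(n, k) = k S(n−1, k) + S(n−1, k−1), which is assumed here rather than taken from the text.

```python
from fractions import Fraction
from math import factorial

def stirling2(n, k):
    """Number of ways to divide n labelled objects into k non-empty groups."""
    row = [1] + [0] * k                  # row for n = 0
    for _ in range(n):
        new = [0] * (k + 1)
        for j in range(1, k + 1):
            # the new object joins one of j groups, or starts a group of its own
            new[j] = j * row[j] + row[j - 1]
        row = new
    return row[k]

def prob_stop_at(n, k):
    """P(N = n): all k values are seen for the first time at trial n."""
    return Fraction(factorial(k) * stirling2(n - 1, k - 1), k ** n)

def expected_trials(k):
    """E(N) = k (1/k + 1/(k-1) + ... + 1/2 + 1)."""
    return k * sum(Fraction(1, j) for j in range(1, k + 1))

print(float(prob_stop_at(10, 6)))   # compare with the .08277 quoted above
print(float(expected_trials(6)))    # compare with the 14.7 quoted above
```

Exact fractions are used so that the comparison with the quoted decimals is not clouded by rounding.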

Example 4-3 Birthdays. What is the chance that at least two out of n randomly selected people have the same birthday? We assume that there are r days on the calendar.

We could treat this as a missing faces problem, by plugging k = r − n into the inclusion-exclusion formula. However, it is easier to attack the problem directly. If we consider checking the n people one at a time, we see that

\[ p_n = P(n \text{ distinct birthdays}) = \frac{r}{r} \cdot \frac{r-1}{r} \cdot \frac{r-2}{r} \cdots \frac{r-n+1}{r} = \binom{r}{n} \frac{n!}{r^n}. \]

For instance with r = 365 and n = 23, this gives a probability of $p_n = .4927$. The chance of a shared birthday is therefore $1 - p_n = .5073$.

We may also be interested in the number of people N that we need to check before we get the first shared birthday. The formula is

\[ E(N) = \sum_{n=0}^{\infty} P(N > n) = \sum_{n=0}^{\infty} \binom{r}{n} \frac{n!}{r^n}, \]

which unfortunately does not simplify. With r = 365 this gives E(N) = 24.617. For large values of r, we have $E(N) \approx \sqrt{r\pi/2} + \frac{2}{3}$.

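Where’d that approximation come from?

\begin{align*}
\sum_{n=0}^{\infty} p_n
  &= \sum_{n=0}^{\infty} \prod_{j=0}^{n-1} \left(1 - \frac{j}{r}\right)
   \approx \sum_{n=0}^{\infty} \prod_{j=0}^{n-1} e^{-j/r}
   = \sum_{n=0}^{\infty} e^{-\sum_{j=0}^{n-1} j/r} \\
  &= \sum_{n=0}^{\infty} e^{-\binom{n}{2}/r}
   \approx \int_0^{\infty} e^{-n^2/2r}\, dn
   = \sqrt{\frac{r\pi}{2}}.
\end{align*}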
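As a rough numerical companion to the birthday calculations, here is a minimal sketch that evaluates $p_n$ and the sum for E(N) in floating point and sets the latter beside the large-r approximation. The function names and the default r = 365 are illustrative choices, not part of the notes.

```python
from math import pi, sqrt

def prob_distinct(n, r=365):
    """p_n: chance that n people have n distinct birthdays."""
    p = 1.0
    for j in range(n):
        p *= (r - j) / r
    return p

def expected_first_match(r=365):
    """E(N) = sum over n >= 0 of p_n; the terms vanish once n > r."""
    total, p = 0.0, 1.0
    for n in range(r + 1):
        total += p            # here p equals p_n
        p *= (r - n) / r      # update to p_{n+1}
    return total

def large_r_approximation(r=365):
    return sqrt(r * pi / 2) + 2 / 3

print(1 - prob_distinct(23))          # chance of a shared birthday, n = 23
print(expected_first_match())         # exact sum for E(N)
print(large_r_approximation())        # approximation for comparison
```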

(c) Double Occupancy

Find the probability that every face appears twice or more when you roll 20 fair dice.

Solution 1: Every vector of outcomes $(d_1, d_2, \ldots, d_{20})$ has equal probability and there are $6^{20}$ such vectors. We need the number of such vectors that have at least two of each possible value. Brute force gives the table below from which we calculate

\[ 1340469610880400 / 6^{20} = .3666333. \]


Pattern         Choose values           Order them                     Count
2 2 2 2 2 10    $\binom{6}{5,1}$        $\binom{20}{10,2,2,2,2,2}$     6 × 20951330400 = 125707982400
2 2 2 2 3 9     $\binom{6}{4,1,1}$      $\binom{20}{9,3,2,2,2,2}$      30 × 69837768000 = 2095133040000
2 2 2 2 4 8     $\binom{6}{4,1,1}$      $\binom{20}{8,4,2,2,2,2}$      30 × 157134978000 = 4714049340000
2 2 2 2 5 7     $\binom{6}{4,1,1}$      $\binom{20}{7,5,2,2,2,2}$      30 × 251415964800 = 7542478944000
2 2 2 2 6 6     $\binom{6}{4,2}$        $\binom{20}{6,6,2,2,2,2}$      15 × 293318625600 = 4399779384000
2 2 2 3 3 8     $\binom{6}{3,2,1}$      $\binom{20}{8,3,3,2,2,2}$      60 × 209513304000 = 12570798240000
2 2 2 3 4 7     $\binom{6}{3,1,1,1}$    $\binom{20}{7,4,3,2,2,2}$      120 × 419026608000 = 50283192960000
2 2 2 3 5 6     $\binom{6}{3,1,1,1}$    $\binom{20}{6,5,3,2,2,2}$      120 × 586637251200 = 70396470144000
2 2 2 4 4 6     $\binom{6}{3,2,1}$      $\binom{20}{6,4,4,2,2,2}$      60 × 733296564000 = 43997793840000
2 2 2 4 5 5     $\binom{6}{3,2,1}$      $\binom{20}{5,5,4,2,2,2}$      60 × 879955876800 = 52797352608000
2 2 3 3 3 7     $\binom{6}{3,2,1}$      $\binom{20}{7,3,3,3,2,2}$      60 × 558702144000 = 33522128640000
2 2 3 3 4 6     $\binom{6}{2,2,1,1}$    $\binom{20}{6,4,3,3,2,2}$      180 × 977728752000 = 175991175360000
2 2 3 3 5 5     $\binom{6}{2,2,2}$      $\binom{20}{5,5,3,3,2,2}$      90 × 1173274502400 = 105594705216000
2 2 3 4 4 5     $\binom{6}{2,2,1,1}$    $\binom{20}{5,4,4,3,2,2}$      180 × 1466593128000 = 263986763040000
2 2 4 4 4 4     $\binom{6}{4,2}$        $\binom{20}{4,4,4,4,2,2}$      15 × 1833241410000 = 27498621150000
2 3 3 3 3 6     $\binom{6}{4,1,1}$      $\binom{20}{6,3,3,3,3,2}$      30 × 1303638336000 = 39109150080000
2 3 3 3 4 5     $\binom{6}{3,1,1,1}$    $\binom{20}{5,4,3,3,3,2}$      120 × 1955457504000 = 234654900480000
2 3 3 4 4 4     $\binom{6}{3,2,1}$      $\binom{20}{4,4,4,3,3,2}$      60 × 2444321880000 = 146659312800000
3 3 3 3 3 5     $\binom{6}{5,1}$        $\binom{20}{5,3,3,3,3,3}$      6 × 2607276672000 = 15643660032000
3 3 3 3 4 4     $\binom{6}{4,2}$        $\binom{20}{4,4,3,3,3,3}$      15 × 3259095840000 = 48886437600000

Total                                                                  1340469610880400

Solution 2: For 1 ≤ i ≤ r, define $A_i$ to be the event that face i shows up one or fewer times. These events are exchangeable and we will use inclusion-exclusion to find the chance that none of them occur. We want

\[ P(\text{double occupancy}) = \sum_{j \ge 0} (-1)^j \binom{r}{j} P(A_1 \cdots A_j). \]

Let’s start with $P(A_1)$. The number $X_1$ of "1"s that occur is a binomial random variable, so

\[ P(A_1) = P(X_1 = 0) + P(X_1 = 1) = \left(1 - \frac{1}{r}\right)^n + n \left(\frac{1}{r}\right) \left(1 - \frac{1}{r}\right)^{n-1}. \]

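For j > 1 the vector $(X_1, X_2, \ldots, X_j, Z)$ records the number of "1"s, "2"s, . . ., "j"s, plus the number of "others". This is a multinomial vector so

\begin{align*}
P(A_1 \cdots A_j)
  &= \sum_{0 \le x_1, \ldots, x_j \le 1} \binom{n}{x_1, \ldots, x_j,\, n - \sum x_k}
     \left(\frac{1}{r}\right)^{x_1} \cdots \left(\frac{1}{r}\right)^{x_j}
     \left(1 - \frac{j}{r}\right)^{n - \sum x_k} \\
  &= \sum_{s=0}^{j} \binom{j}{s} \binom{n}{0, \ldots, 0, \underbrace{1, \ldots, 1}_{s},\, n - s}
     \left(\frac{1}{r}\right)^{s} \left(1 - \frac{j}{r}\right)^{n-s} \\
  &= \sum_{s=0}^{j} \binom{j}{s} \binom{n}{s}\, s!
     \left(\frac{1}{r}\right)^{s} \left(1 - \frac{j}{r}\right)^{n-s}.
\end{align*}

This gives the double sum

\[ P(\text{double occupancy}) = \sum_{j \ge 0} (-1)^j \binom{r}{j} \sum_{s=0}^{j} \binom{j}{s} \binom{n}{s}\, s! \left(\frac{1}{r}\right)^{s} \left(1 - \frac{j}{r}\right)^{n-s}. \]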

Solution 3: We know that ${n \brace r}$ counts the number of ways to divide the index set {1, 2, . . . , n} into r non-empty subsets.

Now, for every 1 ≤ j ≤ n define $A_j$ to be the subset of those ways where j appears as a singleton. These are exchangeable, and the size of the intersection $A_1 A_2 \cdots A_s$ is ${n-s \brace r-s}$, so that inclusion-exclusion gives us the number of ways not in any of the $A_j$’s to be $\sum_{s=0}^{r} (-1)^s \binom{n}{s} {n-s \brace r-s}$. In other words, this sum gives the number of ways to divide the index set {1, 2, . . . , n} into r subsets each of size greater than or equal to two.

Therefore

\[ P(\text{double occupancy}) = \frac{r!}{r^n} \sum_{s=0}^{r} (-1)^s \binom{n}{s} {n-s \brace r-s}. \]

Solution 4: The coefficient of $s^{20}$ in $20!\,(e^s - 1 - s)^6$ is 1340469610880400.
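The count 1340469610880400 can be cross-checked without a computer algebra system. The sketch below handles one face at a time and chooses which of the remaining positions it occupies, which amounts to multiplying out the six factors in Solution 4 term by term. The function name and the `least` parameter are our own illustrative choices.

```python
from math import comb

def count_min_occupancy(n, r, least=2):
    """Number of length-n sequences over r faces in which every face
    appears at least `least` times."""
    # ways[m] = number of ways to fill m labelled positions using the faces
    # handled so far, each of them appearing at least `least` times
    ways = [1] + [0] * n
    for _ in range(r):
        new = [0] * (n + 1)
        for m in range(n + 1):
            for c in range(least, m + 1):
                # the new face takes c of the m positions
                new[m] += comb(m, c) * ways[m - c]
        ways = new
    return ways[n]

count = count_min_occupancy(20, 6)
print(count, count / 6 ** 20)
```

With `least=1` the same routine counts outcomes in which every face occurs, so `count_min_occupancy(4, 2, least=1)` reproduces the 14 coin outcomes found earlier.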


(d) Ménage

Touchard 1934. n couples are seated at a round table with men and women alternating. What is the probability that none of the men has his wife next to him?

Solution: We imagine the seats numbered from 1 to 2n in a circle. Let’s put the women in the odd seats and the men in the even seats. For 1 ≤ i ≤ 2n define $A_i$ to be the event that a married couple occupies seats i and i + 1. Because it’s a circle, we consider seat 2n + 1 to be seat 1.

Note that $A_1A_2$ is impossible, but $A_1A_3$ is possible. In fact, for indices $i_1, \ldots, i_j$ between 1 and 2n, we have

\[ P(A_{i_1} \cdots A_{i_j}) = \frac{(n-j)!}{n!}, \]

if no two indices are consecutive, measured on the circle. Otherwise

\[ P(A_{i_1} \cdots A_{i_j}) = 0. \]

By the inclusion-exclusion formula, the required probability is

\[ p_{2n}(0) = \sum_{j \ge 0} (-1)^j \sum_{i_1, \ldots, i_j} \frac{(n-j)!}{n!}, \]

where we only count sets of non-consecutive indices. So the question becomes: how many ways can we place j non-consecutive values on the circle? From section 1(e) we know that the answer is $\frac{2n}{2n-j} \binom{2n-j}{j}$. Plugging this in gives

\[ p_{2n}(0) = \sum_{j \ge 0} (-1)^j \frac{2n}{2n-j} \binom{2n-j}{j} \frac{(n-j)!}{n!}. \]
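The ménage sum is short enough to evaluate exactly. The following is a minimal sketch with an illustrative function name; it simply transcribes the final formula using exact fractions.

```python
from fractions import Fraction
from math import comb, factorial

def menage_probability(n):
    """Chance that no man sits next to his wife, for n >= 2 couples."""
    total = Fraction(0)
    for j in range(n + 1):
        # number of ways to pick j non-consecutive seat indices on a circle of 2n
        placements = Fraction(2 * n, 2 * n - j) * comb(2 * n - j, j)
        total += (-1) ** j * placements * Fraction(factorial(n - j), factorial(n))
    return total

for n in range(3, 8):
    print(n, menage_probability(n))
```

For three couples the sum is 1 − 2 + 3/2 − 1/3 = 1/6, which is a convenient hand check.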

(e) Banach’s matchbox problem

Professor Stefan Banach has, in each of his two pockets, a box with n matches. Now and then he randomly uses a match from a randomly chosen box. Find E(R) where R is the number of remaining matches when one box becomes empty.

This problem is equivalent to tossing a fair coin until either n heads, or n tails comes up. If N is the number of coin tosses, then N = n + (n − R). Thus, E(R) = 2n − E(N).

Let $p_k = P(N = k)$ for n ≤ k ≤ 2n − 1. We get N = k by either n − 1 heads in the first k − 1 tosses followed by a head, or n − 1 tails in the first k − 1 tosses followed by a tail. By symmetry, we find

\[ p_k = 2 \binom{k-1}{n-1} \left(\frac{1}{2}\right)^{k-1} \left(\frac{1}{2}\right) = \binom{k-1}{n-1} \left(\frac{1}{2}\right)^{k-1}. \]

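An Aside. $\sum_{n \le k \le 2n} \binom{k}{n} \left(\frac{1}{2}\right)^k = 1$.

Time to calculate!

\begin{align*}
E(N) &= \sum_{n \le k \le 2n-1} k \binom{k-1}{n-1} \left(\frac{1}{2}\right)^{k-1} \\
     &= \sum_{n \le k \le 2n-1} n \binom{k}{n} \left(\frac{1}{2}\right)^{k-1} \\
     &= 2n \sum_{n \le k \le 2n-1} \binom{k}{n} \left(\frac{1}{2}\right)^{k} \\
     &= 2n \left[ 1 - \binom{2n}{n} \left(\frac{1}{2}\right)^{2n} \right].
\end{align*}

Therefore

\[ E(R) = 2n - E(N) = 2n \binom{2n}{n} \left(\frac{1}{2}\right)^{2n}. \]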
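A quick way to gain confidence in the closed form is to compute E(N) directly from the probabilities $p_k$ and compare. The sketch below does this with exact fractions; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def expected_remaining(n):
    """E(R) = 2n - E(N), with E(N) summed directly from p_k."""
    e_n = sum(k * comb(k - 1, n - 1) * Fraction(1, 2 ** (k - 1))
              for k in range(n, 2 * n))
    return 2 * n - e_n

def closed_form(n):
    """2n * C(2n, n) * (1/2)^(2n)."""
    return 2 * n * comb(2 * n, n) * Fraction(1, 4 ** n)

for n in (1, 2, 5, 50):
    print(n, float(expected_remaining(n)), float(closed_form(n)))
```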


(f) Socks in the dryer

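There are n distinct pairs of socks in the dryer, and we randomly pull them out one at a time. Let N be the number of draws needed to get a matching pair. Calculate E(N).

It is easy to see that $P(N > k) = \binom{n}{k} 2^k \big/ \binom{2n}{k}$ so that

\[ E(N) = \sum_{k=0}^{n} P(N > k) = \sum_{k=0}^{n} \frac{\binom{n}{k} 2^k}{\binom{2n}{k}}. \]

Using super absorption and the "aside" from Banach’s matchbox problem, we simplify this sum

\begin{align*}
\sum_{k=0}^{n} \frac{\binom{n}{k} 2^k}{\binom{2n}{k}}
  &= \frac{1}{\binom{2n}{n}} \sum_{k=0}^{n} \binom{2n-k}{n-k} 2^k \\
  &= \frac{4^n}{\binom{2n}{n}} \sum_{k=0}^{n} \binom{2n-k}{n} \left(\frac{1}{2}\right)^{2n-k} \\
  &= \frac{4^n}{\binom{2n}{n}} \sum_{n \le k \le 2n} \binom{k}{n} \left(\frac{1}{2}\right)^{k} \\
  &= \frac{4^n}{\binom{2n}{n}}.
\end{align*}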
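The simplification can be verified for small n by summing the tail probabilities exactly. This is a minimal sketch with an illustrative function name.

```python
from fractions import Fraction
from math import comb

def expected_draws(n):
    """E(N) as the sum of P(N > k) for k = 0, ..., n."""
    return sum(Fraction(comb(n, k) * 2 ** k, comb(2 * n, k)) for k in range(n + 1))

for n in (1, 2, 3, 10):
    print(n, expected_draws(n), Fraction(4 ** n, comb(2 * n, n)))
```

For a single pair the answer is 2, as it must be: the second sock drawn always completes the pair.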


5 Zero-one random variables

(a) Moments and factorial moments

If $A_1, \ldots, A_n$ are events we introduce zero-one random variables $U_1, \ldots, U_n$ such that $U_i = 1$ if $A_i$ occurs and $U_i = 0$ if $A_i$ does not. The sum $X = U_1 + \cdots + U_n$ is the total number of events that occur.

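[Figure: The random variable X. The regions of the diagram are labelled 0, 1, 1, 1, 2, 2, 2, 3 according to the number of events that occur.]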

We have

\[ E(U_i) = 1 \cdot P(A_i) + 0 \cdot [1 - P(A_i)] = P(A_i), \]

so that

\[ E(X) = \sum_i P(A_i) = S_1. \]

In a similar way, $\binom{X}{2} = \sum_{i<j} U_i U_j$ is the number of pairs of events that occur, so that $E\left[\binom{X}{2}\right] = \sum_{i<j} P(A_i A_j) = S_2$. In general, for j ≥ 0,

\[ E\left[\binom{X}{j}\right] = S_j. \]


(b) One in a million

Randomly choose a 6-digit number 000000, 000001, . . . , 999999. What is the average number of digits that appear exactly once?

Solution: Let $A_i$ be the event that the digit i appears exactly once.

\[ E(X) = \sum_{i=0}^{9} P(A_i) = 10\, P(A_0) = 10 \left[ \binom{6}{1} \frac{1}{10} \left(\frac{9}{10}\right)^5 \right] = 3.54294. \]

(c) Random student cards

In a class of n students, everyone’s student card is put in a pile and randomly given back to the students. What is the expected number of students that get their own cards back?

Solution: Let $A_i$ be the event that student i gets his or her own card back. Then,

\[ E(X) = \sum_{i=1}^{n} P(A_i) = \sum_{i=1}^{n} \frac{1}{n} = 1. \]

(d) Where’s the ace?

On average, where is the first ace in a well-shuffled deck?

Solution: For the 48 non-aces, define $A_i$, 1 ≤ i ≤ 48, to be the event that card i appears before all the aces. The position of the first ace is $X = 1 + \sum_{i=1}^{48} U_i$, and since $P(A_i) = 1/5$ we get

\[ \text{average position of the first ace} = E(X) = 1 + \sum_i P(A_i) = 53/5 = 10.6. \]

(e) Numbers and colours

An urn contains n numbered balls and b blue balls. Balls are drawn and replaced until a blue ball is obtained, at which point the experiment stops. A repetition is noted each time a numbered ball is drawn that had been previously drawn. What is the expected number of repetitions?

Solution: Let N be the number of numbered balls drawn. When you add the terminating blue ball, the total number N + 1 of draws is a geometric random variable with p = b/(b + n). Therefore, on average, we draw E(N + 1) = 1/p = 1 + n/b balls. Subtracting the blue ball, we get E(N) = n/b.

For 1 ≤ i ≤ n, let $U_i$ be 1 if ball i appears before a blue ball and 0 otherwise. Then $E(U_i) = 1/(b+1)$ and the average number of repetitions is

\[ E\left(N - \sum_{i=1}^{n} U_i\right) = n \left(\frac{1}{b} - \frac{1}{b+1}\right) = \frac{n}{b(b+1)}. \]

(f) Alphabet soup

A sequence $(a_1, a_2, \ldots, a_n)$ over the alphabet {1, 2, . . . , m} is chosen uniformly at random among the $m^n$ possibilities. What is the expected size of the set $\{a_1, a_2, \ldots, a_n\}$?

Solution: Let $A_i$ be the event that the letter i is present in the random sequence. The size of the set is just $X = \sum_{i=1}^{m} U_i$ which has expected value

\[ E(X) = \sum_{i=1}^{m} P(A_i) = m \left[ 1 - \left(1 - \frac{1}{m}\right)^n \right]. \]

(g) Almost disjoint

Prove that if $P(\cup_i A_i) = \sum_i P(A_i)$, then $P(A_i A_j) = 0$ for i ≠ j.

Solution: Let $X_n = U_1 + \cdots + U_n$ as in section 5(a). Let $X_0$ be one on the set $\cup_i A_i$ and zero elsewhere. Then $X_n - X_0 \ge 0$, but the equation $\sum_i P(A_i) = P(\cup_i A_i)$ says that $E(X_n - X_0) = 0$. Therefore $P(X_n - X_0 = 0) = 1$, which implies $P(X_n - X_0 \ge 1) = 0$.

Now for i ≠ j, we have $A_i A_j \subseteq (X_n - X_0 \ge 1)$, so that $P(A_i A_j) = 0$.


(h) Proof of the inclusion-exclusion formula

Let’s prove the wonderful inclusion-exclusion formula. We start by noting that for non-negative integers x ≥ k and any real number y we have

\[ \sum_i \binom{i}{k} \binom{x}{i} y^{i-k} = \binom{x}{k} \sum_i \binom{x-k}{i-k} y^{i-k} = \binom{x}{k} (1 + y)^{x-k}. \]

Substituting y = −1 gives us

\[ \sum_i (-1)^{i-k} \binom{i}{k} \binom{x}{i} = \binom{x}{k}\, 0^{x-k}. \]

Since the sum on the left hand side is also zero when x < k, we get for any non-negative integers x, k:

\[ \sum_i (-1)^{i-k} \binom{i}{k} \binom{x}{i} = 1_{[x=k]}. \]

Substituting the random variable X for x and taking expectations gives

\[ \sum_i (-1)^{i-k} \binom{i}{k} E\left[\binom{X}{i}\right] = P(X = k). \]


6 Patterns

(a) HT

How many fair coin tosses on average are needed to see the pattern "HT"?

Length:  2     3      4       5        6         · · ·

         HT    HHT    HHHT    HHHHT    HHHHHT    · · ·
               THT    THHT    THHHT    THHHHT    · · ·
                      TTHT    TTHHT    TTHHHT    · · ·
                              TTTHT    TTTHHT    · · ·
                                       TTTTHT    · · ·

There are n − 1 outcomes of length n, all of the form

\[ \underbrace{T \cdots T}_{t \ge 0}\ \underbrace{H \cdots H}_{h \ge 0}\ HT \]

with t + h = n − 2.

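An Aside. Since $P(N_{HT} = n) = (n-1)/2^n$ and the probabilities must add to 1, we conclude that $\sum_{n=2}^{\infty} \frac{n-1}{2^n} = 1$.

Using the identity n = (n − 2) + 2 we find that

\begin{align*}
E(N_{HT}) &= \sum_{n=2}^{\infty} n\, P(N_{HT} = n) = \sum_{n=2}^{\infty} \frac{n(n-1)}{2^n} \\
          &= \sum_{n=2}^{\infty} \frac{(n-2)(n-1)}{2^n} + 2 \sum_{n=2}^{\infty} \frac{n-1}{2^n} \\
          &= \frac{1}{2} E(N_{HT}) + 2,
\end{align*}

which shows that $E(N_{HT}) = 4$.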


(b) HH

How many fair coin tosses on average are needed to see the pattern "HH"?

Length:  2     3      4       5        6         · · ·

         HH    THH    TTHH    TTTHH    TTTTHH    · · ·
                      HTHH    THTHH    TTHTHH    · · ·
                              HTTHH    THTTHH    · · ·
                                       HTTTHH    · · ·
                                       HTHTHH    · · ·

The outcomes of length n are formed in two ways: by sticking a T in front of an outcome of length n − 1 or sticking HT in front of an outcome of length n − 2. This way we find that the number of outcomes of length n is F(n − 1) where F refers to the Fibonacci numbers.

Fibonacci numbers. They are defined by F(0) = 0, F(1) = 1, and F(n) = F(n − 1) + F(n − 2) for n ≥ 2. We have proved that $\sum_{n=2}^{\infty} \frac{F(n-1)}{2^n} = 1$.

The solution to our problem is

\[ E(N_{HH}) = \sum_{n=2}^{\infty} \frac{n\, F(n-1)}{2^n} = 6. \]

(c) HH again

How many fair coin tosses on average are needed to see the pattern "HH"?

Conditioning on the first two tosses, there are three cases T, HT, or HH giving us

\[ E(N_{HH}) = \frac{1}{2}[1 + E(N_{HH})] + \frac{1}{4}[2 + E(N_{HH})] + \frac{1}{4}[2 + 0], \]

which gives $E(N_{HH}) = 6$.


(d) HT again

How many fair coin tosses on average are needed to see the pattern "HT"? Conditioning on the first toss, there are two cases T and H giving us

\[ E(N_{HT}) = \frac{1}{2}[1 + E(N_{HT})] + \frac{1}{2}[1 + E(N^{H}_{HT})], \]

where $N^{H}_{HT}$ is the number of further tosses needed to see HT, given an initial H.

Conditioning on the next toss, there are two cases T and H giving us

\[ E(N^{H}_{HT}) = \frac{1}{2}[1 + 0] + \frac{1}{2}[1 + E(N^{H}_{HT})], \]

which implies $E(N^{H}_{HT}) = 2$. Plug this back into the previous equation to get $E(N_{HT}) = 4$.

(e) Patterns with a head start

Example 6-1. Given two heads, how many further fair coin tosses on average are needed to see the pattern "HH" again?

Length:  1       3         4          5           6            · · ·

         HH|H    HH|THH    HH|TTHH    HH|TTTHH    HH|TTTTHH    · · ·
                                      HH|THTHH    HH|TTHTHH    · · ·
                                                  HH|THTTHH    · · ·

Conditioning on the first toss, the two cases T and H give

\[ E(N^{HH}_{HH}) = \frac{1}{2}[1 + E(N_{HH})] + \frac{1}{2}[1 + 0], \]

which implies $E(N^{HH}_{HH}) = 4$.

Example 6-2. Given HT, how many further fair coin tosses on average are needed to see the pattern "HT" again?

Length:  2        3         4          5           6            · · ·

         HT|HT    HT|HHT    HT|HHHT    HT|HHHHT    HT|HHHHHT    · · ·
                  HT|THT    HT|THHT    HT|THHHT    HT|THHHHT    · · ·
                            HT|TTHT    HT|TTHHT    HT|TTHHHT    · · ·
                                       HT|TTTHT    HT|TTTHHT    · · ·
                                                   HT|TTTTHT    · · ·

The head start is useless so we get $E(N^{HT}_{HT}) = E(N_{HT}) = 4$.

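Example 6-3. For the pattern HTHT things get a bit complicated. We solve it by introducing new notation and using linear algebra. For j = 0, 1, 2, 3, let $e_j$ be the expected number of steps to see the pattern HTHT given a head start of its first j symbols.

Arguing as before, we get the following relations

\begin{align*}
e_0 &= \tfrac{1}{2}(1 + e_1) + \tfrac{1}{2}(1 + e_0) \\
e_1 &= \tfrac{1}{2}(1 + e_1) + \tfrac{1}{2}(1 + e_2) \\
e_2 &= \tfrac{1}{2}(1 + e_3) + \tfrac{1}{2}(1 + e_0) \\
e_3 &= \tfrac{1}{2}(1 + e_1) + \tfrac{1}{2}(1 + 0).
\end{align*}

These relations give the vector equation

\[
\begin{pmatrix} e_0 \\ e_1 \\ e_2 \\ e_3 \end{pmatrix}
=
\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}
+
\begin{pmatrix}
\tfrac12 & \tfrac12 & 0 & 0 \\
0 & \tfrac12 & \tfrac12 & 0 \\
\tfrac12 & 0 & 0 & \tfrac12 \\
0 & \tfrac12 & 0 & 0
\end{pmatrix}
\begin{pmatrix} e_0 \\ e_1 \\ e_2 \\ e_3 \end{pmatrix},
\]

which can be solved to give

\[ (e_0, e_1, e_2, e_3) = (20, 18, 16, 10). \]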
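The same first step analysis can be automated for any pattern of H’s and T’s. The sketch below is one possible implementation, not the method of the notes: the helper names are ours, the "state" after each toss is taken to be the longest prefix of the pattern that matches the end of the tosses so far, and the linear system is solved by plain Gauss-Jordan elimination over exact fractions.

```python
from fractions import Fraction

def next_state(pattern, j, toss):
    """Length of the longest prefix of `pattern` that ends pattern[:j] + toss."""
    s = pattern[:j] + toss
    for k in range(min(len(s), len(pattern)), -1, -1):
        if s.endswith(pattern[:k]):
            return k

def head_start_times(pattern):
    """Return [e_0, ..., e_{m-1}] for a fair coin and a pattern of length m."""
    m = len(pattern)
    half = Fraction(1, 2)
    # Augmented matrix for (I - M) e = 1, where M holds the transition
    # probabilities between the non-finished states 0, ..., m-1.
    A = [[Fraction(int(i == j)) for j in range(m)] + [Fraction(1)] for i in range(m)]
    for j in range(m):
        for toss in "HT":
            k = next_state(pattern, j, toss)
            if k < m:                      # state m means the pattern is complete
                A[j][k] -= half
    # Gauss-Jordan elimination
    for col in range(m):
        pivot = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(m):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[j][m] for j in range(m)]

print(head_start_times("HTHT"))
print(head_start_times("HH"), head_start_times("HT"))
```

The first entry of each result is the expected waiting time from scratch, so the same routine recovers the answers for HH and HT found earlier in this section.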


7 First step analysis

(a) Three players

Huygens 1657. Three persons A, B, and C play a game. A bowl contains w white and b black marbles. The players successively draw a marble (with replacement) in the order ABCABC . . . until someone gets a white marble. He is the winner. Find the winning chances for the three players.

Solution: Letting $\alpha = \frac{w}{w+b}$ and $\beta = \frac{b}{w+b}$, first step analysis shows that $p_A = \alpha + \beta^3 p_A$. Similarly $p_B = \beta\alpha + \beta^3 p_B$, and $p_C = \beta^2\alpha + \beta^3 p_C$. Therefore

\[ p_A = \frac{\alpha}{1 - \beta^3}, \qquad p_B = \frac{\beta\alpha}{1 - \beta^3}, \qquad p_C = \frac{\beta^2\alpha}{1 - \beta^3}. \]

(b) Another three players

Three evenly matched players A, B, and C hold a tournament. Players A and B play first, and at each subsequent round the winner plays again while the loser sits out. The first player who wins two consecutive rounds, wins the tournament. Find the probabilities $p_A$, $p_B$, $p_C$ that players A, B, or C win the tournament.

Brute force solution: Here are the outcomes where player C wins the tournament: ACC, BCC, ACBACC, BCABCC, . . . Therefore the probability that player C wins is

\[ p_C = \sum_{k=1}^{\infty} \frac{2}{8^k} = \frac{2}{8} \cdot \frac{1}{1 - 1/8} = \frac{2}{7}. \]

The remaining probability must be divided evenly between the other two players.

First step analysis: Define $p_1, p_2, p_3, p_4$ to be the chances that you, player A, will win the tournament starting in one of these situations:

1. You’ve just won a round

2. The start of the tournament

3. Your opponent has just won a round

4. You are sitting out

Logic dictates the following relationships

\[ p_1 = \frac{1}{2} + \frac{p_4}{2}, \qquad p_2 = \frac{p_1}{2} + \frac{p_4}{2}, \qquad p_3 = \frac{p_1}{2} + \frac{0}{2}, \qquad p_4 = \frac{p_3}{2} + \frac{0}{2}, \]

which you solve to get $(p_1, p_2, p_3, p_4) = (4/7, 5/14, 2/7, 1/7)$. Thus $p_A = p_B = 5/14$ and $p_C = 2/7$.

(c) Classical random walk

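[Figure: simple random walk on the integers between a and b; from an interior state x the walk steps to x − 1 or x + 1, each with probability 1/2.]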

Place. Let a < b and consider the simple random walk with boundary points a and b. Define $p_x$ as the probability that the random walk first hits the boundary at a, starting at state x. These probabilities satisfy

\[ p_a = 1, \qquad p_b = 0, \qquad p_x = \frac{p_{x-1}}{2} + \frac{p_{x+1}}{2} \quad \text{for } a < x < b. \]

This implies that $x \mapsto p_x$ is a straight line function; that is, $p_x = \frac{b-x}{b-a}$.

[Figure: graph of $x \mapsto p_x$ between a and b.]

Details: Write the equation as

\[ \frac{1}{2}(p_x - p_{x+1}) = \frac{1}{2}(p_{x-1} - p_x). \]

We have $p_x - p_{x+1} = c$ for some constant c, and since $p_b = 0$ this leads to

\[ p_x = p_x - p_b = \sum_{j=x}^{b-1} (p_j - p_{j+1}) = (b - x)c. \]

Plugging in x = a gives $1 = p_a = (b - a)c$ so that c = 1/(b − a), and hence

\[ p_x = \frac{b-x}{b-a}. \]

Time. Define $e_x$ to be the expected number of steps until the walk hits the boundary, starting at state x. These satisfy

\[ e_a = e_b = 0, \qquad e_x = 1 + \frac{e_{x-1}}{2} + \frac{e_{x+1}}{2} \quad \text{for } a < x < b. \]

This forces $x \mapsto e_x$ to be a quadratic, $e_x = -x^2 + c_1 x + c_2$, and since we know the two roots a and b, we conclude that $e_x = (b - x)(x - a)$.

[Figure: graph of $x \mapsto e_x$ between a and b.]

(d) Number of walks until no shoes

Someone lives in a house with two doors. He begins with n pairs of shoes at each door. Every day he randomly selects a door, puts on shoes, goes for a walk, and returns to a randomly chosen door. He leaves his shoes when he comes in the house.

What is the expected number of walks until he discovers no shoes available when he wants to take a walk?

For 0 ≤ i ≤ 2n, let $e_i$ be the expected number of completed walks starting with i pairs of shoes at the front door. For 1 ≤ i ≤ 2n − 1, considering the next walk, we get

\[ e_i = 1 + \frac{e_{i-1}}{4} + \frac{e_i}{2} + \frac{e_{i+1}}{4}. \tag{1} \]

Some details: Equation (1) gives $\Delta^2 e_j = -4$ for 0 ≤ j < 2n − 1. Summing gives us $\Delta e_i = \Delta e_0 - 4i$ for 0 ≤ i < 2n. Summing again gives

\[ e_i = e_0 + [2 + \Delta e_0]\, i - 2i^2 \quad \text{for } 0 \le i \le 2n. \tag{2} \]

For i = 0, a first step analysis gives

\[ e_0 = \frac{1}{2} \cdot 0 + \frac{1}{4}[1 + e_1] + \frac{1}{4}[1 + e_0], \]

so that

\[ e_0 = \frac{2 + e_1}{3}. \]

Plugging this into (2) gives us

\[ e_i = e_0 + 2 e_0 i - 2i^2. \tag{3} \]

Finally, by symmetry, we have $e_0 = e_{2n}$. Plugging this into (3) gives us the answer

\[ e_i = 2n + 4ni - 2i^2. \tag{4} \]
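A derivation with this many summations is worth a sanity check. The sketch below plugs formula (4) back into equation (1), the first step equation at i = 0, and the symmetry condition, all in integer arithmetic. The function names are illustrative.

```python
def expected_walks(i, n):
    """Formula (4): expected completed walks with i pairs at the front door."""
    return 2 * n + 4 * n * i - 2 * i * i

def check_shoe_equations(n):
    e = [expected_walks(i, n) for i in range(2 * n + 1)]
    # equation (1), multiplied through by 4 to stay in integers
    interior = all(4 * e[i] == 4 + e[i - 1] + 2 * e[i] + e[i + 1]
                   for i in range(1, 2 * n))
    # first step analysis at i = 0, multiplied through by 3
    boundary = 3 * e[0] == 2 + e[1]
    symmetric = e[0] == e[2 * n]
    return interior and boundary and symmetric

print(all(check_shoe_equations(n) for n in range(1, 20)))
print(expected_walks(3, 3))   # three pairs at each door to begin with
```

Since the man starts with n pairs at each door, the quantity asked for in the problem is the value at i = n, which formula (4) evaluates to 2n + 2n².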


8 Generating functions

(a) Five urn problem

Suppose we have 5 urns with black and white balls as in the diagram.

[Figure: five urns, labelled Urn 1 to Urn 5, each containing black and white balls.]

We randomly select one ball from each urn. Let W be the total number of white balls drawn. What is P(W = 2)?

We could just add the probabilities of the ten disjoint events

w1w2b3b4b5, w1b2w3b4b5, . . . , b1b2b3w4w5.

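Each of these comes with a different probability, but this can be automatically taken into account by expanding the product

\[ \left(\tfrac{1}{3} w_1 + \tfrac{2}{3} b_1\right) \left(\tfrac{1}{2} w_2 + \tfrac{1}{2} b_2\right) \left(\tfrac{3}{4} w_3 + \tfrac{1}{4} b_3\right) \left(\tfrac{2}{7} w_4 + \tfrac{5}{7} b_4\right) \left(\tfrac{3}{5} w_5 + \tfrac{2}{5} b_5\right), \]

and adding the coefficients of the ten events $w_1w_2b_3b_4b_5, \ldots$ etc.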

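In fact, since we are only interested in white balls, and only the number of them (and not what urns they came from) we may as well consider the polynomial

\begin{align*}
G(w) &= \left(\tfrac{1}{3} w + \tfrac{2}{3}\right) \left(\tfrac{1}{2} w + \tfrac{1}{2}\right) \left(\tfrac{3}{4} w + \tfrac{1}{4}\right) \left(\tfrac{2}{7} w + \tfrac{5}{7}\right) \left(\tfrac{3}{5} w + \tfrac{2}{5}\right) \\
     &= \tfrac{3}{140} w^5 + \tfrac{39}{280} w^4 + \tfrac{137}{420} w^3 + \tfrac{283}{840} w^2 + \tfrac{16}{105} w + \tfrac{1}{42}.
\end{align*}

This function is called the probability generating function of W, and we can read off the answer P(W = 2) = 283/840 = .33690.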
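Expanding such a product is a mechanical task. Here is a minimal sketch that multiplies the five linear factors one at a time, keeping the coefficients as exact fractions. The function name is our own; the white-ball probabilities are read off from the factors of G(w).

```python
from fractions import Fraction

def pgf_of_white_count(white_probs):
    """Coefficients [P(W=0), P(W=1), ...] of the product of (p w + (1 - p))."""
    coeffs = [Fraction(1)]
    for p in white_probs:
        new = [Fraction(0)] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] += c * (1 - p)      # black ball from this urn
            new[k + 1] += c * p        # white ball from this urn
        coeffs = new
    return coeffs

urns = [Fraction(1, 3), Fraction(1, 2), Fraction(3, 4), Fraction(2, 7), Fraction(3, 5)]
G = pgf_of_white_count(urns)
print(G[2], float(G[2]))
```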


Definition 8-1. For a non-negative integer valued random variable X the probability generating function of X is

\[ G_X(s) = E(s^X) = \sum_{k=0}^{\infty} s^k\, P(X = k). \]

(b) Adding random numbers

Randomly choose one number from each of the sets {1, 2}, {1, 2, 3}, and {1, 2, 3, 4}. What is the chance that they add up to 7?

Solution: The probability generating function for the sum is

\begin{align*}
G_X(s) &= (s/2 + s^2/2)(s/3 + s^2/3 + s^3/3)(s/4 + s^2/4 + s^3/4 + s^4/4) \\
       &= s^3 (1 + 3s + 5s^2 + 6s^3 + 5s^4 + 3s^5 + s^6)/24.
\end{align*}

Taking the coefficient of $s^7$ we get P(X = 7) = 5/24.

(c) Ten dice again

We revisit an old friend. Ten fair dice are thrown at random. What is the chance that they add up to 27?

Solution 1: The probability generating function for the sum of ten fair dice is

\[ G_X(s) = [(s + s^2 + s^3 + s^4 + s^5 + s^6)/6]^{10}. \]

Expanding this with a computer and extracting the coefficient of $s^{27}$ we get P(X = 27) = 2,665/104,976.

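Solution 2: We use the fact that $s + s^2 + s^3 + s^4 + s^5 + s^6 = s(1 - s^6)/(1 - s)$. Dividing by 6 and taking the tenth power gives

\begin{align*}
G_X(s) &= \frac{1}{6^{10}}\, s^{10} (1 - s^6)^{10} (1 - s)^{-10} \\
       &= \frac{1}{6^{10}}\, s^{10} \sum_j \binom{10}{j} (-s^6)^j \sum_k \binom{-10}{k} (-s)^k.
\end{align*}

Thus the coefficient of $s^{27}$ is

\begin{align*}
P &= \frac{1}{6^{10}} \left[ -\binom{10}{0}\binom{-10}{17} + \binom{10}{1}\binom{-10}{11} - \binom{10}{2}\binom{-10}{5} \right] \\
  &= \frac{1}{6^{10}} \left[ \binom{26}{9} - 10 \binom{20}{9} + 45 \binom{14}{9} \right] = \frac{2{,}665}{104{,}976}.
\end{align*}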

The general formula to get a sum of k when you throw n m-sided dice is

\[ P(X = k) = \frac{1}{m^n} \sum_{j=0}^{\lfloor (k-n)/m \rfloor} (-1)^j \binom{n}{j} \binom{k - jm - 1}{n - 1}. \]
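The general formula can be compared with a brute-force count obtained by adding one die at a time. Both routines below are sketches with illustrative names; the second one is the "expand with a computer" approach of Solution 1 in miniature.

```python
from fractions import Fraction
from math import comb

def dice_sum_prob(k, n, m=6):
    """P(sum of n fair m-sided dice equals k), by the alternating sum formula."""
    total = sum((-1) ** j * comb(n, j) * comb(k - j * m - 1, n - 1)
                for j in range((k - n) // m + 1))
    return Fraction(total, m ** n)

def dice_sum_counts(n, m=6):
    """Number of outcomes for each possible sum, built up one die at a time."""
    counts = {0: 1}
    for _ in range(n):
        new = {}
        for total, ways in counts.items():
            for face in range(1, m + 1):
                new[total + face] = new.get(total + face, 0) + ways
        counts = new
    return counts

print(dice_sum_prob(27, 10))
print(Fraction(dice_sum_counts(10)[27], 6 ** 10))
```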

(d) The pattern HH again

Find $p(k) = P(N_{HH} = k)$ where $N_{HH}$ is the number of tosses of a fair coin needed to see the pattern HH.

Solution: We note that p(0) = p(1) = 0, p(2) = 1/4, and that for k > 2 we have

\[ p(k) = \frac{1}{2}\, p(k-1) + \frac{1}{4}\, p(k-2). \]

Thus the probability generating function satisfies

\begin{align*}
G(s) &= \sum_{k \ge 2} p(k) s^k \\
     &= \frac{1}{4} s^2 + \sum_{k > 2} \frac{1}{2}\, p(k-1) s^k + \sum_{k > 2} \frac{1}{4}\, p(k-2) s^k \\
     &= \frac{1}{4} s^2 + \frac{1}{2} s\, G(s) + \frac{1}{4} s^2 G(s).
\end{align*}

Solving for G(s) we get

\[ G(s) = \frac{s^2}{4 - 2s - s^2}. \]


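How can we extract coefficients from this expression? First we factor the quadratic into $4 - 2s - s^2 = -(r_1 - s)(r_2 - s)$ where $r_2 = -1 + \sqrt{5}$ and $r_1 = -1 - \sqrt{5}$ and then write

\[ \frac{1}{-(r_1 - s)(r_2 - s)} = \frac{1}{r_2 - r_1} \left[ \frac{1}{r_2 - s} - \frac{1}{r_1 - s} \right]. \]

Secondly, we use the geometric series to write $1/(r - s) = (1/r) \sum_{k \ge 0} (s/r)^k$. Therefore,

\[ G(s) = \frac{1}{2\sqrt{5}} \sum_{k \ge 2} \left[ \left(\frac{1}{r_2}\right)^{k-1} - \left(\frac{1}{r_1}\right)^{k-1} \right] s^k. \]

Extracting the probabilities, for k ≥ 2 we get

\begin{align*}
p(k) &= \frac{1}{2\sqrt{5}} \left[ \left(\frac{1}{-1 + \sqrt{5}}\right)^{k-1} - \left(\frac{1}{-1 - \sqrt{5}}\right)^{k-1} \right] \\
     &= \sum_j \binom{k-1}{2j+1}\, 5^j / 4^{k-1}.
\end{align*}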
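The closed form can be checked against the recurrence it came from. This sketch (illustrative names, exact fractions) compares the two for the first few dozen values of k; the binomial-sum version is used because it avoids square roots entirely.

```python
from fractions import Fraction
from math import comb

def p_recurrence(kmax):
    """p(0), ..., p(kmax) from p(k) = p(k-1)/2 + p(k-2)/4 with p(2) = 1/4."""
    p = [Fraction(0), Fraction(0), Fraction(1, 4)]
    for k in range(3, kmax + 1):
        p.append(p[k - 1] / 2 + p[k - 2] / 4)
    return p

def p_closed(k):
    """Binomial-sum form of p(k), valid for k >= 2."""
    numerator = sum(comb(k - 1, 2 * j + 1) * 5 ** j for j in range(k // 2))
    return Fraction(numerator, 4 ** (k - 1))

table = p_recurrence(30)
print(all(table[k] == p_closed(k) for k in range(2, 31)))
print(float(sum(k * table[k] for k in range(len(table)))))   # partial sum of E(N_HH)
```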

(e) Crazy Dice

The sum of 2 normal dice has a triangular distribution over 2, 3, . . . , 11, 12 as seen using generating functions:

\[ \left( (s + s^2 + s^3 + s^4 + s^5 + s^6)/6 \right)^2 = (s^2 + 2s^3 + 3s^4 + 4s^5 + 5s^6 + 6s^7 + 5s^8 + 4s^9 + 3s^{10} + 2s^{11} + s^{12})/36. \]

Question: can we put different positive numbers on a pair of dice so that the distribution of the sum stays the same? We will call these crazy dice.

Mathematically, we want positive integers $a_i$, $b_i$ for 1 ≤ i ≤ 6 so that

\[ G_a(s)\, G_b(s) = \frac{1}{6^2} (s + s^2 + s^3 + s^4 + s^5 + s^6)^2, \]

where $G_a(s) = \frac{1}{6} \sum_{i=1}^{6} s^{a_i}$ and $G_b(s) = \frac{1}{6} \sum_{i=1}^{6} s^{b_i}$. Let’s factor

\[ s + s^2 + s^3 + s^4 + s^5 + s^6 = s(1 + s + s^2)(1 + s)(1 - s + s^2), \]

so that

\[ G_a(s)\, G_b(s) = \frac{s^2 (1 + s + s^2)^2 (1 + s)^2 (1 - s + s^2)^2}{6^2}. \]


How to split this product into $G_a$ and $G_b$? Since the $a_i$’s and $b_i$’s are strictly positive, both $G_a$ and $G_b$ have the factor s. Both functions evaluate to 1 at s = 1, so that the factor $(1 + s + s^2)(1 + s)$ also appears in both. If we give $G_a$ and $G_b$ one factor of $1 - s + s^2$ we get back ordinary dice.

Alternatively, we could try to put both factors of $1 - s + s^2$ into $G_a$. That gives

\[ G_a(s) = \frac{1}{6}\, s (1 + s + s^2)(1 + s)(1 - s + s^2)^2 = \frac{1}{6} (s + s^3 + s^4 + s^5 + s^6 + s^8), \]

and

\[ G_b(s) = \frac{1}{6}\, s (1 + s + s^2)(1 + s) = \frac{1}{6} (s + 2s^2 + 2s^3 + s^4). \]

Translating back, the crazy dice are 1,3,4,5,6,8 and 1,2,2,3,3,4.
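It takes only a few lines to confirm that the crazy dice really do reproduce the usual distribution of the sum. The sketch below tabulates all 36 face pairs for each pair of dice; the helper name is illustrative.

```python
from collections import Counter
from itertools import product

def sum_distribution(die_a, die_b):
    """How many of the face pairs give each possible total."""
    return Counter(a + b for a, b in product(die_a, die_b))

normal = [1, 2, 3, 4, 5, 6]
crazy_a = [1, 3, 4, 5, 6, 8]
crazy_b = [1, 2, 2, 3, 3, 4]

print(sorted(sum_distribution(crazy_a, crazy_b).items()))
print(sum_distribution(crazy_a, crazy_b) == sum_distribution(normal, normal))
```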

(f) Moments

Suppose X is a random variable taking non-negative integer values, and let $G_X(s) = E(s^X)$. Differentiating once and setting s = 1 we get $G'_X(1) = E(X)$.

For instance, the probability generating function for the pattern HH problem is $G(s) = s^2/(4 - 2s - s^2)$. Differentiating, we get $G'(s) = 2s(4 - s)/(4 - 2s - s^2)^2$ and setting s = 1 we get $E(N_{HH}) = 2 \cdot 1 \cdot (4 - 1)/1^2 = 6$.

In fact, for any non-negative integer i if we differentiate i times and set s = 1 we get $G_X^{(i)}(1) = E[X(X - 1) \cdots (X - i + 1)]$. Consequently, we can expand $G_X$ as a power series around the point s = 1 to obtain

\[ G_X(s) = \sum_i E\left[\binom{X}{i}\right] (s - 1)^i. \]

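Using the binomial theorem to expand the powers of s − 1 we get

\begin{align*}
G_X(s) &= \sum_{i \ge 0} E\left[\binom{X}{i}\right] (s - 1)^i \\
       &= \sum_{i \ge 0} E\left[\binom{X}{i}\right] \sum_{k \ge 0} \binom{i}{k} (-1)^{i-k} s^k \\
       &= \sum_{k \ge 0} s^k \sum_{i \ge 0} (-1)^{i-k} \binom{i}{k} E\left[\binom{X}{i}\right].
\end{align*}

Extracting coefficients we get

\[ P(X = k) = \sum_{i \ge 0} (-1)^{i-k} \binom{i}{k} E\left[\binom{X}{i}\right], \]

another proof of the inclusion-exclusion formula.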

(g) Three consecutive dice values

If you roll a die ten times, what is the chance that three consecutive values are the same?

Let N denote the first time when three consecutive values are the same. For every |s| ≤ 1 and k in {0, 1, 2}, let $u_k(s) = E_k(s^N)$ where k is the size of the "head start", i.e., the number of consecutive, identical values just produced. We want to calculate $P_0(N \le 10)$.

The usual one-step analysis yields the linear system $u_0 = s u_1$, $u_1 = s\left(\frac{1}{6} u_2 + \frac{5}{6} u_1\right)$ and $u_2 = s\left(\frac{1}{6} + \frac{5}{6} u_1\right)$. Solving this system gives us

\begin{align*}
u_0(s) &= \frac{s^3}{36 - 30s - 5s^2} \\
       &= \frac{1}{36} s^3 + \frac{5}{216} s^4 + \frac{5}{216} s^5 + \frac{175}{7776} s^6 + \frac{1025}{46656} s^7 + \frac{125}{5832} s^8 + \frac{35125}{1679616} s^9 + \frac{205625}{10077696} s^{10} + \cdots.
\end{align*}

The chance of getting three consecutive values the same by the tenth roll is

\[ P_0(N \le 10) = \frac{1}{36} + \frac{5}{216} + \cdots + \frac{205625}{10077696} = .1812984833. \]
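The same probability can be obtained without generating functions by following the length of the current run from roll to roll. The sketch below is an independent check under that formulation; the function name and the `sides` parameter are illustrative.

```python
from fractions import Fraction

def prob_run_of_three(rolls, sides=6):
    """P(some three consecutive rolls are equal within the first `rolls` rolls)."""
    same, diff = Fraction(1, sides), Fraction(sides - 1, sides)
    # state[k] = P(no run of three so far and the current run has length k)
    state = {1: Fraction(1), 2: Fraction(0)}      # situation after the first roll
    done = Fraction(0)
    for _ in range(rolls - 1):
        done += state[2] * same                    # a run of two is extended to three
        state = {1: (state[1] + state[2]) * diff,  # the run is broken
                 2: state[1] * same}               # a run of one is extended to two
    return done

print(prob_run_of_three(3), prob_run_of_three(4))
print(float(prob_run_of_three(10)))
```

The differences between successive values of `prob_run_of_three` are the coefficients of $u_0(s)$, so the first two printed values should be 1/36 and 1/36 + 5/216.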

(h) Triangle tango

Three independent random walkers start at different corners of a triangle. At every time point, each walker jumps either to the left or to the right at random. Eventually, all three walkers will meet at one corner and the game is finished. How long on average until this happens?

Solution: Define "state A" to be when all walkers are at different corners, "state B" when two walkers are at one corner, and the third walker is at a different corner, and "state C" when they are all together. The diagram below illustrates how we might move from state A to state B.

[Figure: two triangles showing the walkers’ positions before and after a move from state A to state B.]

We are interested in M, the number of steps needed to finish the game, starting in state A. To solve this problem, we also need the random variable N, the number of steps needed to finish starting in state B.

To begin, imagine starting in state A. There are eight equally likely possibilities depending on how the walkers jump. In two of the cases we remain in state A, and in six cases we end up in state B. Hence

\[ M = \begin{cases} 1 + M' & \text{with probability } 1/4 \\ 1 + N & \text{with probability } 3/4, \end{cases} \]

where M′ has the same distribution as M.

Now imagine starting in state B. There are eight equally likely possibilities depending on how the walkers jump. In one case all the walkers go to the same corner and the game is finished. Otherwise, in five cases we stay in state B, and in two cases we go to state A.

Therefore

\[ N = \begin{cases} 1 & \text{with probability } 1/8 \\ 1 + N' & \text{with probability } 5/8 \\ 1 + M & \text{with probability } 1/4, \end{cases} \]

where N′ has the same distribution as N.

From these relationships, we find that the probability generating functions satisfy

\begin{align*}
G_M(s) &= \frac{s}{4}\, G_M(s) + \frac{3s}{4}\, G_N(s), \\
G_N(s) &= \frac{s}{8} + \frac{5s}{8}\, G_N(s) + \frac{s}{4}\, G_M(s).
\end{align*}

Solving this system of equations shows that

\[ G_M(s) = \frac{3s^2}{32 - 28s - s^2} = \frac{3}{32} s^2 + \frac{21}{256} s^3 + \frac{153}{2048} s^4 + \frac{1113}{16384} s^5 + \cdots. \]

Differentiating $G_M$ and setting s = 1 gives $E(M) = G'_M(1) = 12$.


(i) Other generating functions

Definition 8-2. If $(a_n)$ is a sequence of numbers, we call $H(s) = \sum_{n=0}^{\infty} a_n s^n$ the generating function of the sequence.

(i1) Even Odds

Let $a_n$ be the probability that n Bernoulli trials result in an even number of successes. This occurs if an initial failure is followed by an even number of successes, or an initial success is followed by an odd number of successes. Therefore $a_0 = 1$ and for n ≥ 1

\[ a_n = q\, a_{n-1} + p\,(1 - a_{n-1}). \]

Multiplying by $s^n$ and adding over n we see that the generating function satisfies

\[ H(s) = 1 + qs\, H(s) + ps\,(1 - s)^{-1} - ps\, H(s) \]

or

\[ 2H(s) = [1 - s]^{-1} + [1 - (q - p)s]^{-1}. \]

Expanding the right hand side using geometric series we find that the coefficients satisfy

\[ a_n = \frac{1}{2} + \frac{(q - p)^n}{2}. \]

(i2) Non-consecutive values

Any set of k non-consecutive positive integers can be obtained by reading off the "b"s in a sequence of the form:

\[ \mathrm{seq}(a)\, \left( b\ \mathrm{seq}_{\ge 1}(a) \right)^{k-1} b\ \mathrm{seq}(a). \]

The generating function for such sequences is

\[ H(s) = \frac{1}{1-s} \left( s\, \frac{s}{1-s} \right)^{k-1} s\, \frac{1}{1-s} = \frac{s^{2k-1}}{(1-s)^{k+1}}. \]

In other words,

\[ H(s) = s^{2k-1} \sum_{m=0}^{\infty} \binom{m+k}{k} s^m, \]

so the nth coefficient of H is $\binom{n-k+1}{k}$.

Similarly, we can find sets of k positive integers with a minimum gap of size d using these sequences:

\[ \mathrm{seq}(a)\, \left( b\ \mathrm{seq}_{\ge d}(a) \right)^{k-1} b\ \mathrm{seq}(a). \]

The generating function for such sequences is

\[ H(s) = \frac{1}{1-s} \left( s\, \frac{s^d}{1-s} \right)^{k-1} s\, \frac{1}{1-s} = \frac{s^{d(k-1)+k}}{(1-s)^{k+1}}. \]

In other words,

\[ H(s) = s^{d(k-1)+k} \sum_{m=0}^{\infty} \binom{m+k}{k} s^m, \]

so the nth coefficient of H is $\binom{n-d(k-1)}{k}$.

(i3) Strings without HHH

The set of all finite strings of heads and tails without 3 heads in a row can be expressed as

\[ \sum_{k=0}^{\infty} (T + HT + HHT)^k\ \mathrm{seq}_{\le 2}(H). \]

The counting generating function for such sequences is

\[ C(h, t) = \frac{1 + h + h^2}{1 - t - ht - h^2 t}. \]

We bring probability into the picture by replacing each h and t by s/2, to obtain

\[ H(s) = \frac{8 + 4s + 2s^2}{8 - 4s - 2s^2 - s^3} = 1 + s + s^2 + \frac{7}{8} s^3 + \frac{13}{16} s^4 + \cdots \]

so the kth coefficient of H is P(no HHH in the first k trials). If N is the number of coin tosses needed to see HHH, then

\[ H(s) = \sum_{k=0}^{\infty} P(N > k)\, s^k \]

so letting s → 1 we get E(N) = 14.


(i4) Red runs

The set of red-black binary strings without a run of 6 consecutive reds is

\[ \sum_{k=0}^{\infty} \left( \left[ \sum_{j=0}^{5} R^j \right] B \right)^k \mathrm{seq}_{\le 5}(R). \]

Transforming this into a counting generating function gives

\[ \frac{1 - r^6}{1 - r - b + b r^6} = 1 + r + b + r^2 + 2rb + b^2 + \cdots + 352588773866202\, b^{26} r^{26} + \cdots. \]

The chance of getting a red run of length 6 or more with an ordinary deck of cards is

\[ 1 - \frac{352588773866202}{\binom{52}{26}} = .2890187592. \]
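The coefficient of $b^{26} r^{26}$ can also be found by a direct count. The sketch below uses a slightly different viewpoint from the generating function above: the 26 black cards cut the deck into 27 gaps, and each gap may hold between 0 and 5 reds. The function name and the `max_run` parameter are illustrative.

```python
from math import comb

def count_no_long_red_run(reds, blacks, max_run=5):
    """Arrangements of the given numbers of reds and blacks in which no run
    of reds is longer than `max_run`."""
    # ways[m] = number of ways to place m reds into the gaps handled so far
    ways = [1] + [0] * reds
    for _ in range(blacks + 1):                    # blacks + 1 gaps in total
        new = [0] * (reds + 1)
        for m in range(reds + 1):
            new[m] = sum(ways[m - c] for c in range(min(max_run, m) + 1))
        ways = new
    return ways[reds]

good = count_no_long_red_run(26, 26)
print(good, 1 - good / comb(52, 26))
```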


9 Random walks

(a) Definitions and notation

Let $X_0, X_1, \ldots$ be a random walk taking values in some state space S. For x ∈ S, the notation $P_x$ and $E_x$ will refer to conditional probability and expectation given that we start at x, i.e., that $X_0 = x$. The transition probability from state x to y is defined as $p_{xy} := P_x(X_1 = y)$.

If f : S → R is a real valued function on S, we let Pf : S → R be the function that gives the average f value after taking one step, that is,

\[ (Pf)(x) = \sum_{y \in S} p_{xy}\, f(y). \]

If B is a subset of S, we define the hitting time of B to be

\[ N_B = \inf\{ n \ge 0 : X_n \in B \}. \]

For future reference, let us also define the return time of B to be

\[ R_B = \inf\{ n \ge 1 : X_n \in B \}. \]

Notice the subtle, but important, difference.

(b) Hitting times

A lot of interesting questions can be answered using the following functions. Let A ⊆ B ⊆ S and define

\[ f(x) = P_x(N_A = N_B), \quad \text{and} \quad h(x) = E_x(N_B). \]

These two functions can often be calculated explicitly as they solve the following boundary value problems:

\[ f(x) = \begin{cases} 0 & \text{if } x \in B \setminus A \\ 1 & \text{if } x \in A \\ (Pf)(x) & \text{if } x \notin B. \end{cases} \qquad h(x) = \begin{cases} 0 & \text{if } x \in B \\ 1 + (Ph)(x) & \text{if } x \notin B. \end{cases} \]

These formulas are a consequence of first step analysis.


Example 9-1

[Figure: a graph on the five vertices a, b, c, d, e.]

Starting at state c, a random walk moves on this graph until it hits the boundary {d, e}. What is the chance that it ends up at state d?

Solution: We let B = {d, e}, and A = {d}. By the formula above the function f satisfies

\begin{align*}
f(a) &= \tfrac{1}{3} \left( 1 + f(b) + f(c) \right) \\
f(b) &= \tfrac{1}{4} \left( f(a) + f(c) + 0 + 1 \right) \\
f(c) &= \tfrac{1}{3} \left( 0 + f(b) + f(a) \right).
\end{align*}

It’s not too hard to solve this system using linear algebra, but we may as well exploit the symmetry in the problem by noting that f(b) = 1/2 and f(a) + f(c) = 1. Therefore the third equation can be rewritten as

\[ f(c) = \tfrac{1}{3} \left( 0 + 1/2 + 1 - f(c) \right), \]

which gives f(c) = 3/8.


(c) Random walk on a cube

On average, how long does it take a random walk on the vertices of a cube to reach theopposite corner?

Solution:

start

nearest

further

opposite

Here we have a random walk with eight states, but by symmetry we can reduce this tofour states: start, nearest neighbors, further neighbors, and opposite. In this case, theboundary is the single state o.

s n f o

1

1/3

2/3 1/3

2/3 1

The formula for h says

$$\begin{aligned} h(s) &= 1 + h(n) \\ h(n) &= 1 + \tfrac13 h(s) + \tfrac23 h(f) \\ h(f) &= 1 + \tfrac13 \cdot 0 + \tfrac23 h(n). \end{aligned}$$

From here it is not too hard to work out that $h(s) = 10$. By the way, we also get $h(n) = 9$ and $h(f) = 7$.
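As a sanity check on the symmetry reduction, the sketch below goes back to all eight vertices of the cube and confirms that the values 10, 9, 7, 0 satisfy the boundary value problem for h. Coding the vertices as 3-bit integers is our own convention, not something taken from the notes.

```python
from fractions import Fraction

OPPOSITE = 0b111  # the start is 0b000; the opposite corner differs in all three bits

def neighbors(v):
    """The three vertices that differ from v in exactly one coordinate."""
    return [v ^ (1 << bit) for bit in range(3)]

# Candidate values indexed by distance from the start: s, n, f, o.
h_by_distance = [Fraction(10), Fraction(9), Fraction(7), Fraction(0)]

def h(v):
    return h_by_distance[bin(v).count("1")]

def satisfies_boundary_value_problem():
    for v in range(8):
        if v == OPPOSITE:
            if h(v) != 0:                      # h = 0 on the boundary
                return False
        else:
            average = sum(h(w) for w in neighbors(v)) / 3
            if h(v) != 1 + average:            # h = 1 + Ph off the boundary
                return False
    return True

print(satisfies_boundary_value_problem())  # True
```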


(d) More random walk h functions

[Figure: example graphs with each vertex labeled by its value of h; the boundary vertices carry the label 0.]

Example 9-2

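[Figure: the states 0, 1, …, b in a row. From an interior state x the walk steps to x − 1 or x + 1 with probability 1/2 each; from 0 it steps to 1 with probability 1.]

Consider a symmetric random walk that reflects back to one, from zero. We want to find out how long, on average, it takes to reach the state $b$. In other words, what is the $h$ function for the boundary $B = \{b\}$?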

For $0 \le x < b$ define $r_x = E_x(N_{x+1})$, the average number of steps to hit the state on the right. We know that $r_0 = 1$, and that for $0 < x < b$ first step analysis gives $r_x = 1 + \frac12(r_{x-1} + r_x)$. This equation gives $r_x = 2 + r_{x-1}$, so that $r_1 = 3$, $r_2 = 5$, etc. Finally,

$$h(x) = \sum_{j=x}^{b-1} (2j+1) = b^2 - x^2.$$


Here is an alternative proof: Let $(X_n)$ be the usual random walk on $\{-b, \dots, -1, 0, 1, \dots, b\}$. Then $Y_n := |X_n|$ is a random walk on $\{0, 1, \dots, b\}$ that reflects at zero, and

$$h_Y(x) = h_X(|x|) = b^2 - x^2.$$
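The recursion for the one-step times is easy to run. The following sketch (the function name `hitting_times` is ours) builds r_0, r_1, … from r_x = 2 + r_{x−1} and adds them up to get h.

```python
def hitting_times(b):
    """Return [h(0), ..., h(b)] for the walk reflected at zero with target b >= 1."""
    r = [1]                        # r_0 = 1: from 0 the walk must step to 1
    for x in range(1, b):
        r.append(2 + r[x - 1])     # r_x = 2 + r_{x-1}
    # h(x) is the time to climb from x to b, one level at a time.
    return [sum(r[x:]) for x in range(b + 1)]

print(hitting_times(4))  # [16, 15, 12, 7, 0]
```

Each entry equals b² − x², in agreement with the closed form.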

(e) Tennis

James Bernoulli, 1713. Players A and B win a point in tennis with probabilities $p$ and $q = 1 - p$, respectively. Find the winning probabilities of each player.

Solution: Here is a picture of the states of the process. We begin at (0, 0) and move right with probability $p$ or up with probability $q$. Player A wins if we first hit the lower boundary, while player B wins if we first hit the upper boundary.

[Figure: lattice of states (i, j) with both axes labeled 0 to 6; three states where f is known are circled, and the two boundaries are marked "player A" and "player B".]

Let $f(i, j)$ be the probability that player A wins the game, starting at state $(i, j)$. Start at the three circled states, where $f$ is known, and work your way back to state $(0, 0)$. You will find that

$$f(0,0) = \frac{p^4(3-2p)(4p^2-8p+5)}{2p^2-2p+1}.$$
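Working backwards by hand is tedious, so here is a hedged sketch that does it by recursion. It assumes the usual scoring of a tennis game (first to four points, and from deuce a player must get two points ahead); the deuce value p²/(p² + q²) is the standard first-step result and is our addition, since the figure with the circled states did not survive.

```python
from fractions import Fraction
from functools import lru_cache

def win_probability(p):
    """Chance that player A wins a game from (0, 0), by first step analysis."""
    p = Fraction(p)
    q = 1 - p

    @lru_cache(maxsize=None)
    def f(i, j):
        if i == 4:                  # A has four points and B at most two
            return Fraction(1)
        if j == 4:                  # B has four points and A at most two
            return Fraction(0)
        if i == 3 and j == 3:       # deuce: A needs two points in a row before B does
            return p * p / (p * p + q * q)
        return p * f(i + 1, j) + q * f(i, j + 1)

    return f(0, 0)

def closed_form(p):
    p = Fraction(p)
    return p**4 * (3 - 2 * p) * (4 * p**2 - 8 * p + 5) / (2 * p**2 - 2 * p + 1)

p = Fraction(3, 5)
print(win_probability(p) == closed_form(p))  # True
```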


(f) The problem of points

John Bernoulli, 1710. Two persons participate in a series of games. In each game, A wins with probability $p$ and B wins with probability $q = 1 - p$. The series ends when one of the players has won $r$ games; he wins a pile of money. For some reason, the series is interrupted. At this moment, A has won $r - a$ games and B has won $r - b$ games. What is a fair way to divide the pile?

Solution 1: This is the previous problem with a different boundary. Let’s illustrate the case of a best-of-seven series, that is, $r = 4$. We will label the axes by the number of games needed to win the series.

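[Figure: lattice of states for r = 4, with both axes labeled 0 to 4 by the number of games each player still needs; the two boundaries are marked "player A" and "player B".]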

If the series is tied 2-2, then the chance that player A will win the series is $3p^2 - 2p^3$.

Solution 2: Imagine playing all of the $a + b - 1$ remaining games. Player A wins the series if he wins at least $a$ of these games, so

$$f(a,b) = \sum_{i=a}^{a+b-1} \binom{a+b-1}{i} p^i q^{(a+b-1)-i}.$$

Just to double check, let’s plug in $a = b = 2$ and see what we get:

$$\sum_{i=2}^{3} \binom{3}{i} p^i q^{3-i} = 3p^2 q + p^3 = 3p^2 - 2p^3.$$


Solution 3: Imagine playing all of the $a + b - 1$ remaining games. Player A wins the series if his $a$th win occurs before player B wins $b$ games, so

$$f(a,b) = \sum_{i=0}^{b-1} \binom{a+i-1}{i} p^a q^i.$$

Just to double check, let’s plug in $a = b = 2$ and see what we get:

$$\sum_{i=0}^{1} \binom{1+i}{i} p^2 q^i = p^2(1 + 2q) = 3p^2 - 2p^3.$$
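The three solutions can be compared numerically. The sketch below (function names are ours) computes f(a, b) by first step analysis on the lattice, by the binomial sum of Solution 2, and by the sum of Solution 3, all with exact fractions.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

def by_recursion(a, b, p):
    """Solution 1: first step analysis, where a and b count the games still needed."""
    q = 1 - p

    @lru_cache(maxsize=None)
    def f(a, b):
        if a == 0:
            return Fraction(1)      # A has already won the series
        if b == 0:
            return Fraction(0)      # B has already won the series
        return p * f(a - 1, b) + q * f(a, b - 1)

    return f(a, b)

def by_binomial(a, b, p):
    """Solution 2: A wins at least a of the a + b - 1 remaining games."""
    q, n = 1 - p, a + b - 1
    return sum(comb(n, i) * p**i * q**(n - i) for i in range(a, n + 1))

def by_waiting_time(a, b, p):
    """Solution 3: A's a-th win arrives after at most b - 1 wins for B."""
    q = 1 - p
    return sum(comb(a + i - 1, i) * p**a * q**i for i in range(b))

p = Fraction(2, 5)
print(by_recursion(2, 2, p), by_binomial(2, 2, p), by_waiting_time(2, 2, p))
```

With a = b = 2 all three return 3p² − 2p³ evaluated at the chosen p.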

(g) Gambler’s ruin

James Bernoulli, 1713. Players A and B begin with $a$ and $b$ counters to start. They play several rounds of a game where player A has probability $p$ of winning. The winner of the round gets a counter from the other player, and the first player to get all the counters wins. Find the winning probabilities of each player.

Solution: Consider the random walk that tracks player A’s gains. It moves forward with probability $p$ and backward with probability $q = 1 - p$. The boundary consists of two points $B = \{-a, b\}$. Set $A = \{-a\}$ so that the probability that A is ruined is $f(0)$.

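[Figure: the states −a, …, b in a row. From an interior state x the walk steps to x + 1 with probability p and to x − 1 with probability q.]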

The function $f$ satisfies $f(-a) = 1$, $f(b) = 0$, and in between $f(x) = qf(x-1) + pf(x+1)$. We rewrite this last equation as

$$q\,\Delta f(x-1) = p\,\Delta f(x).$$

It’s not hard to show that

$$\Delta f(x) = \left(\frac{q}{p}\right)^{x+a} \Delta f(-a),$$

and then to use the boundary conditions to get

$$f(x) = \begin{cases} \dfrac{(q/p)^b - (q/p)^x}{(q/p)^b - (q/p)^{-a}} & \text{if } p \ne q \\[2ex] \dfrac{b-x}{b+a} & \text{if } p = q. \end{cases}$$
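It is worth checking that the closed form really solves the boundary value problem. The sketch below (the names `ruin_probability` and `is_solution` are ours) evaluates the formula with exact fractions and tests both boundary conditions and the interior equation.

```python
from fractions import Fraction

def ruin_probability(x, a, b, p):
    """f(x): chance of hitting -a before b when starting from x, for 0 < p < 1."""
    p = Fraction(p)
    q = 1 - p
    if p == q:
        return Fraction(b - x, b + a)
    rho = q / p
    return (rho**b - rho**x) / (rho**b - rho**(-a))

def is_solution(a, b, p):
    """Check f(-a) = 1, f(b) = 0 and f(x) = q f(x-1) + p f(x+1) in between."""
    p = Fraction(p)
    q = 1 - p
    f = {x: ruin_probability(x, a, b, p) for x in range(-a, b + 1)}
    interior_ok = all(f[x] == q * f[x - 1] + p * f[x + 1] for x in range(-a + 1, b))
    return f[-a] == 1 and f[b] == 0 and interior_ok

print(ruin_probability(0, 3, 4, Fraction(1, 2)))  # 4/7
print(is_solution(3, 4, Fraction(2, 3)))          # True
```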

(h) One sided boundaries

Imagine a random walk on the integers that jumps to the left with probability $q$, or to the right with probability $p$, where $p + q = 1$.

We want to calculate $P_0(N_{-1} < \infty)$, the probability that the random walk will ever hit the state to the left of its starting position.

Solution 1: Let $z = P_0(N_{-1} < \infty)$; then a first step analysis gives

$$z = q + pz^2,$$

which has two roots, 1 and $q/p$.
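The two roots can be read off by factoring the quadratic. The step is spelled out here because it uses p + q = 1 in a way that is easy to miss:

```latex
pz^2 - z + q \;=\; pz^2 - (p+q)\,z + q \;=\; (z - 1)(pz - q),
\qquad\text{so}\qquad z = 1 \quad\text{or}\quad z = \frac{q}{p}.
```

Since z is a probability, a root larger than one can be discarded, which is what happens to q/p when p < q.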

If $p \le q$, we must have $z = 1$, since the other root $q/p$ is at least 1 and $z$ is a probability. Such a random walk is guaranteed to hit every state to the left of its start. If $p = q$, then the random walk will hit every state; in fact, it will hit every state infinitely often!

If $p > q$, it turns out that $z = q/p$. Accepting this for a moment, we combine both cases into one formula $z = 1 \wedge \frac{q}{p}$.

Solution 2: Let’s try another approach to the problem. The hitting time $N_{-1}$ is a random variable that takes values in $\{1, \dots, \infty\}$. Its probability generating function is $G(s) = E_0(s^{N_{-1}})$ for $0 < s < 1$. First step analysis gives

$$G(s) = qs + ps\,G(s)^2,$$

which means that

$$G(s) = \frac{1 \pm \sqrt{1 - 4pqs^2}}{2ps}.$$

Since $|G(s)| \le 1$, we must have

$$G(s) = \frac{1 - \sqrt{1 - 4pqs^2}}{2ps}.$$

Letting $s \to 1$, we get $P_0(N_{-1} < \infty) = 1 \wedge \frac{q}{p}$.

Solution 3: We will start with the two boundary problem $-a < x < b$, and then let $b \to \infty$, that is,

$$P_x(N_{-a} < \infty) = \lim_{b\to\infty} P_x(N_{-a} < N_b) = \lim_{b\to\infty} \frac{(q/p)^b - (q/p)^x}{(q/p)^b - (q/p)^{-a}}.$$

When $p > q$ this limit is $(q/p)^{x+a}$, but when $p \le q$ it is 1.

(i) Return to the start

The probability that the random walk returns to its starting position is

$$P_x(\text{hit } x \text{ after time } 0) = q\left(1 \wedge \frac{p}{q}\right) + p\left(1 \wedge \frac{q}{p}\right) = q \wedge p + p \wedge q = 1 - |p - q|.$$


10 Counting paths

Example 10-1 We can prove the formula from "socks in the dryer"

$$\sum_{k=0}^{n} \binom{2n-k}{n-k}\, 2^k = 4^n,$$

by counting paths. Do you see why?

[Figure: lattice paths over times 0 to 2n, with the time 2n − k and the levels k and −k marked.]

(a) The reflection principle

For the time being, we consider paths on the integers whose increments are either plus or minus one. For $m < n$, define $N_{(m,a)(n,b)}$ to be the number of such paths between $(m,a)$ and $(n,b)$. Each such path has $u$ steps up and $d$ steps down. Since $u + d = n - m$ and $u - d = b - a$ we deduce that

$$u = \tfrac12 (n - m + b - a),$$

and therefore that

$$N_{(m,a)(n,b)} = \begin{cases} \binom{n-m}{u} & \text{if $u$ is an integer} \\ 0 & \text{otherwise.} \end{cases}$$

The value $x + y$ is always odd, or always even, for the points $(x, y)$ along any path. The "zero otherwise" part of the formula above counts impossible paths.

[Figure: a path from (0, 1) to (8, 3).]

The number of paths from (0, 1) to (8, 3) is $\binom{8-0}{\frac12[(8-0)+(3-1)]} = \binom{8}{5} = 56$.

Now we want to count the number $N^x_{(m,a)(n,b)}$ of paths from $(m,a)$ to $(n,b)$ that touch the x-axis. The reflection principle says that, if $ab \ge 0$, then $N^x_{(m,a)(n,b)} = N_{(m,-a)(n,b)}$.

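[Figure: a path from (0, 1) to (8, 3) that touches the x-axis, together with its reflection starting from (0, −1).]

The number of paths from (0, 1) to (8, 3) that touch the x-axis = the number of paths from (0, −1) to (8, 3) = $\binom{8}{6} = 28$.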
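Both counts are small enough to confirm by listing every path. The sketch below (function names are ours) implements the binomial formula and the reflection principle, and compares them with a brute-force enumeration of all ±1 step sequences.

```python
from itertools import product
from math import comb

def num_paths(m, a, n, b):
    """Number of +-1 paths from (m, a) to (n, b)."""
    steps, rise = n - m, b - a
    if (steps + rise) % 2 or abs(rise) > steps:
        return 0                       # u is not an integer, or is out of range
    return comb(steps, (steps + rise) // 2)

def num_paths_touching_axis(m, a, n, b):
    """Reflection principle; valid when a and b are on the same side (ab >= 0)."""
    return num_paths(m, -a, n, b)

def brute_force(m, a, n, b, must_touch=False):
    count = 0
    for increments in product((1, -1), repeat=n - m):
        heights = [a]
        for step in increments:
            heights.append(heights[-1] + step)
        if heights[-1] == b and (not must_touch or 0 in heights):
            count += 1
    return count

print(num_paths(0, 1, 8, 3), num_paths_touching_axis(0, 1, 8, 3))  # 56 28
```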


Example 10-2 Symmetric random walk. Let $(S_n)$ be a symmetric random walk starting at $S_0 = 0$ and let $b$ be a negative integer. By reflecting the path at the level $b$, where $b + n$ is odd, we find that

$$P(S_n \le b) = \tfrac12 P(N_b \le n).$$

[Figure: reflecting the path when it hits level b, at time $N_b$.]

This can be rewritten as

$$P(N_b > n) = P(b < S_n < -b).$$

Plugging in $b = -1$ at time $2n$ gives

$$P(N_{-1} > 2n) = P(-1 < S_{2n} < 1).$$

In other words,

$$P(S_0 \ge 0, S_1 \ge 0, \dots, S_{2n} \ge 0) = P(S_{2n} = 0).$$

Example 10-3 Now let’s look at the paths that don’t hit zero from time 1 to time $2n$. We have

$$\begin{aligned} P(S_1 \ne 0, S_2 \ne 0, \dots, S_{2n} \ne 0) &= P(S_1 = \pm 1,\ R_0 > 2n) \\ &= 2P(S_1 = 1,\ R_0 > 2n) \\ &= 2P(S_1 = 1)\,P(R_0 > 2n \mid S_1 = 1) \\ &= P(N_{-1} > 2n - 1) \\ &= P(S_{2n} = 0). \end{aligned}$$

Example 10-4 We can use these path counting arguments to derive a binomial identity. There are a total of $4^n$ paths up to time $2n$. If we let $2r$ stand for the last time that the path hits zero, then by considering cases and using Example 10-3, we get

$$4^n = \sum_{r=0}^{n} \binom{2r}{r} \binom{2n-2r}{n-r}.$$

(b) Catalan numbers

The Catalan numbers $C_n$ count the number of non-negative paths from $(0, 0)$ to $(2n, 0)$. The first few values are $C_0 = 1$, $C_1 = 1$, $C_2 = 2$, $C_3 = 5$, and $C_4 = 14$.

Example 10-5 The five paths that demonstrate $C_3 = 5$

[Figure: the five non-negative paths from (0, 0) to (6, 0), each drawn over times 0 to 6.]


We will use the reflection principle to find a formula for the Catalan numbers. Lifting the path by one unit, we find that

$$\begin{aligned} C_n &= N_{(0,1)(2n,1)} - N^x_{(0,1)(2n,1)} \\ &= N_{(0,1)(2n,1)} - N_{(0,-1)(2n,1)} \\ &= \binom{2n}{n} - \binom{2n}{n+1} = \frac{1}{n+1}\binom{2n}{n}. \end{aligned}$$
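Here is a short sketch, with function names of our choosing, that computes the Catalan numbers from the difference of binomial coefficients and checks them against a direct count of non-negative paths.

```python
from itertools import product
from math import comb

def catalan(n):
    """C_n as the difference of two path counts from the reflection principle."""
    return comb(2 * n, n) - comb(2 * n, n + 1)

def catalan_brute_force(n):
    """Count +-1 paths of length 2n that stay non-negative and end at zero."""
    count = 0
    for increments in product((1, -1), repeat=2 * n):
        height, ok = 0, True
        for step in increments:
            height += step
            if height < 0:
                ok = False
                break
        if ok and height == 0:
            count += 1
    return count

print([catalan(n) for n in range(5)])  # [1, 1, 2, 5, 14]
```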

Example 10-6 What’s the chance that a random walk first returns to zero at time $2n$?

Solution: We need to consider all paths from $(0,0)$ to $(2n,0)$ that do not touch the x-axis from time 1 to time $2n-1$. The positive ones consist of an upstep, followed by a path from $(1,1)$ to $(2n-1,1)$ which doesn’t hit the x-axis, followed by a downstep. The number of such paths is $C_{n-1}$. Similarly, there are $C_{n-1}$ such negative paths. Every path from $(0,0)$ to $(2n,0)$ has probability $p^n q^n$, so we get the answer

$$P(R_0 = 2n) = 2C_{n-1}\, p^n q^n.$$

Example 10-7 Draw cards one at a time from a well-shuffled deck and put them on the red pile or the black pile. What is the chance that the piles become equal for the first time at the tenth draw?

Solution: The card colors can be in $\binom{52}{26}$ different orders.

The number of "good" orders is $2C_4\binom{42}{21}$, so the required probability is

$$\frac{2C_4\binom{42}{21}}{\binom{52}{26}} = \frac{65780}{2164491} \approx 0.0304.$$
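The arithmetic is quick to reproduce. The sketch below generalizes slightly (the function name and its `deck` parameter are ours): the first 2n cards must form one of 2C_{n−1} first-return patterns, and the remaining cards can be in any order with the right number of reds.

```python
from fractions import Fraction
from math import comb

def first_equal_at(draws, deck=52):
    """Chance that the red and black piles first become equal at an even draw."""
    n = draws // 2
    catalan = comb(2 * n - 2, n - 1) // n                  # C_{n-1}
    good = 2 * catalan * comb(deck - draws, deck // 2 - n)  # pattern, then the rest
    return Fraction(good, comb(deck, deck // 2))

answer = first_equal_at(10)
print(answer, float(answer))
```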

(c) The ballot theorem

In an election, candidate A gets $a$ votes and candidate B gets $b$ votes, with $a > b$. What is the chance that A holds the lead throughout the election?

Solution: Count a vote for A as an upstep and a vote for B as a downstep. We want to count all paths from $(0,0)$ to $(a+b, a-b)$ that only touch the x-axis at $(0,0)$. The number of such paths is

$$\begin{aligned} N_{(1,1)(a+b,a-b)} - N^x_{(1,1)(a+b,a-b)} &= N_{(1,1)(a+b,a-b)} - N_{(1,-1)(a+b,a-b)} \\ &= \binom{a+b-1}{a-1} - \binom{a+b-1}{a} = \frac{a-b}{a+b}\binom{a+b}{a}. \end{aligned}$$

Dividing by the total number of paths, we get $P = \dfrac{a-b}{a+b}$.
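For small elections the ballot theorem can be checked by listing every order in which the votes might be counted. This is only a brute-force sketch, with a function name of our choosing.

```python
from fractions import Fraction
from itertools import combinations

def lead_throughout_probability(a, b):
    """Brute force over all placements of B's b votes among the a + b ballots."""
    n = a + b
    good = total = 0
    for b_positions in combinations(range(n), b):
        b_set = set(b_positions)
        lead, ok = 0, True
        for i in range(n):
            lead += -1 if i in b_set else 1
            if lead <= 0:               # A must be strictly ahead after every ballot
                ok = False
                break
        good += ok
        total += 1
    return Fraction(good, total)

print(lead_throughout_probability(5, 3))  # 1/4
```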

(d) Theater lineup

$f + t$ people are in line at a theatre; $f$ with five dollar bills and $t$ with ten dollar bills. The tickets cost five dollars each, and the till is empty to start. If each customer buys one ticket, what is the chance that nobody will have to wait for change?

Solution: If $t > f$, then the chance is zero, so let’s assume that $t \le f$.

Put the $f + t$ people in random order. Start at zero and take a step up for a five dollar bill and a step down for a ten dollar bill. The height of this sample path is the number of five dollar bills that the cashier has. We want this path to stay non-negative. This is similar to the "ballot problem", except that we allow the path to touch the x-axis.

The good sample paths for our problem are in one-to-one correspondence with the good sample paths for the ballot problem with one extra five dollar bill. (Add the new person to the front of the line.)

The ballot theorem says that the number of such paths is

$$\frac{f+1-t}{f+1+t}\binom{f+1+t}{t}.$$

Dividing by $\binom{f+t}{t}$ gives the desired probability

$$\frac{f+1-t}{f+1}.$$
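The same kind of enumeration works for the theatre line; the only change from the ballot problem is that the till may be empty without anyone having to wait. The function name below is ours.

```python
from fractions import Fraction
from itertools import combinations

def no_waiting_probability(f, t):
    """Brute force over all positions of the t customers holding ten dollar bills."""
    n = f + t
    good = total = 0
    for ten_positions in combinations(range(n), t):
        tens = set(ten_positions)
        fives_in_till, ok = 0, True
        for i in range(n):
            fives_in_till += -1 if i in tens else 1
            if fives_in_till < 0:       # a ten arrived and there was no change
                ok = False
                break
        good += ok
        total += 1
    return Fraction(good, total)

print(no_waiting_probability(4, 2))  # 3/5
```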


(e) The cycle lemma

In the next two sections, we will consider paths whose steps are integers less than or equal to one. Let $\langle x_1, x_2, \dots, x_n\rangle$ be a sequence of integers $\le 1$, and put $s_0 = 0$ and $s_j = \sum_{i=1}^{j} x_i$ for $1 \le j \le n$. We consider the path $(j, s_j)$ as $j$ runs from 0 to $n$.

A rotation of the path is the path formed using a rotated sequence of increments

$$\langle x_r, x_{r+1}, \dots, x_n, x_1, \dots, x_{r-1}\rangle.$$

Cycle lemma. If $s_n > 0$, then there are exactly $s_n$ rotations whose partial sums are strictly positive.

Corollary. If $s_n > 0$, then there are exactly $s_n$ rotations so that $s_j < s_n$ for $0 \le j < n$.

Proof: Apply the cycle lemma to the reversed sequence

$$x^{\mathrm{rev}} = \langle x_n, x_{n-1}, \dots, x_1\rangle.$$

We have

$$s_n - s_j = s^{\mathrm{rev}}_{n-j}, \quad 0 \le j < n.$$


Example 10-8

[Figure: the five rotations of a path, each plotted for times 0 to 5 with heights from −1 to 2. The increment sequences are listed below.]

x = 〈1, 1, −1, 1, 0〉

x = 〈1, −1, 1, 0, 1〉

x = 〈−1, 1, 0, 1, 1〉

x = 〈1, 0, 1, 1, −1〉

x = 〈0, 1, 1, −1, 1〉
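To see the cycle lemma at work on this example, the sketch below (the function name is ours) lists the rotations whose partial sums stay strictly positive. The sequence has s_n = 2, so the lemma predicts exactly two of them.

```python
def positive_rotations(x):
    """Return the rotations of x whose partial sums are all strictly positive."""
    good = []
    for r in range(len(x)):
        rotated = x[r:] + x[:r]
        total, ok = 0, True
        for step in rotated:
            total += step
            if total <= 0:
                ok = False
                break
        if ok:
            good.append(rotated)
    return good

print(positive_rotations([1, 1, -1, 1, 0]))
# [[1, 1, -1, 1, 0], [1, 0, 1, 1, -1]]
```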


(f) Applications

Example 10-9 The ballot theorem revisited. Let candidate A get $a$ votes and candidate B get $b$ votes, where $a \ge mb$ for some integer $m \ge 1$. The probability that A maintains more than $m$ times as many votes as B throughout the election is $\frac{a-mb}{a+b}$.

Proof: For any arrangement of votes

AAAABAAAB · · · AB,

replace each A with 1 and each B with $-m$. There are $a - mb$ rotations that maintain strictly positive partial sums. Since all $a + b$ rotations are equally likely we get the result.

Let $X_j$ be random variables taking values in $\{\dots, -2, -1, 0, 1\}$ such that the distribution of $(X_1, X_2, \dots, X_n)$ is invariant under rotations.

Example 10-10 The hitting time lemma. For $k \ge 1$, we have

$$P(N_k = n) = \frac{k}{n}\, P(S_n = k).$$

Example 10-11 For $k, n \ge 1$, the cycle lemma implies

$$P(S_j > 0 \text{ for } 1 \le j \le n,\ S_n = k) = \frac{k}{n}\, P(S_n = k).$$

Example 10-12 If all the steps are in $\{-1, 0, 1\}$, then we can generalize the above to all integers $k$:

$$P(R_0 > n,\ S_n = k) = \frac{|k|}{n}\, P(S_n = k).$$

Adding over $k$ gives us $P(R_0 > n) = \frac1n E(|S_n|)$. In particular, letting $n \to \infty$ gives

$$P(R_0 = \infty) = |\mu|,$$

where $\mu$ is the average step size. This generalizes the result from the section Return to the start.


11 Random permutations

(a) Color runs

Example 11-1 Take a thoroughly shuffled deck of cards and turn them over one at a time. What is the chance that, as you go through all 52 cards, you will find a run of 6 or more consecutive red cards?

Solution: Define the indices with black cards to be $1 \le i_1 < i_2 < \cdots < i_{26} \le 52$, and define the gaps between the black cards to be

$$g_1 = i_1, \quad g_j = i_j - i_{j-1} \text{ for } 2 \le j \le 26, \quad\text{and}\quad g_{27} = 53 - i_{26}.$$

As in the German tank solution, these gaps form a composition of the integer 53 into 27 parts, and there is a one-to-one correspondence between orderings of the colors and such compositions.

Now those particular orders where the runs of red cards all have size less than 6 are exactly those where the black gaps are at most 6. But we can count these with the same formula that tells us how many ways we can roll 27 dice and get a total of 53, i.e.,

$$S := \sum_{j=0}^{4} (-1)^j \binom{27}{j}\binom{52-6j}{26}.$$

The probability of at least one run of 6 or more consecutive red cards is $P = 1 - S/\binom{52}{26}$, which works out to $P = .2890187592$.
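The sum is easy to evaluate exactly. In the sketch below the function names and the generalization to other deck sizes are ours; the small-deck brute force is there only as an independent check of the gap argument.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def prob_red_run(run=6, reds=26, blacks=26):
    """Chance of at least one run of `run` or more consecutive red cards."""
    parts, total = blacks + 1, reds + blacks + 1    # 27 gaps adding up to 53
    # Compositions of `total` into `parts` parts, each between 1 and `run`.
    no_long_run = sum(
        (-1) ** j * comb(parts, j) * comb(total - 1 - run * j, parts - 1)
        for j in range(parts + 1)
        if total - 1 - run * j >= parts - 1
    )
    return 1 - Fraction(no_long_run, comb(reds + blacks, reds))

def brute_force(run, reds, blacks):
    """Enumerate every color order of a small deck and look for a long red run."""
    n = reds + blacks
    hits = total = 0
    for red_positions in combinations(range(n), reds):
        red = set(red_positions)
        longest = current = 0
        for i in range(n):
            current = current + 1 if i in red else 0
            longest = max(longest, current)
        hits += longest >= run
        total += 1
    return Fraction(hits, total)

print(float(prob_red_run()))
```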

(b) The Mississippi problem

If we randomly scramble the letters in the word "MISSISSIPPI", what is the chance that no consecutive letters are the same?

This is a pretty challenging problem, so let’s start with some easier words. We will call a permutation "good" if no consecutive letters are the same.

Example 11-2 "MISS" The number of permutations of these four letters is $\frac{4!}{1!\,1!\,2!} = 12$.


Now let’s find the number of "bad" permutations. To do this, imagine an alphabet of three letters {M, I, SS}; notice the letter double-ess "SS". The number of permutations of this set is $\frac{3!}{1!\,1!\,1!} = 6$.

The number of good permutations is $12 - 6 = 6$, so

$$P(\text{good permutation}) = 6/12 = 1/2.$$

Example 11-3 "SUCCESS" The number of permutations of these letters is $\frac{7!}{3!\,2!} = 420$.

Now let’s first find the number of permutations with no consecutive "S"s. To do this, write four stars where the non-S letters will go and five spaces where S’s might go:

_ ∗ _ ∗ _ ∗ _ ∗ _

The number of ways to put the S’s into the blank spaces is $\binom{5}{3} = 10$, and the number of ways to arrange U, E, C, C in the starred places is $\frac{4!}{2!} = 12$. There are $10 \times 12 = 120$ permutations with no consecutive S’s.

Similarly, the number of permutations with no consecutive S’s and with the C’s together uses three non-S letters U, E, CC to get $\binom{4}{3} \times 3! = 24$.

_ ∗ _ ∗ _ ∗ _

The number of permutations with the S’s apart and the C’s apart is $120 - 24 = 96$, so

$$P(\text{good permutation}) = 96/420 = 8/35 \approx .22857.$$

Example 11-4 "BILLCLINTON" Using the values from the table below, we see that the number of permutations with no consecutive letters of the same type is

$$846720 - 141120 - 141120 + 25200 = 589680.$$

$$\begin{array}{ll} \text{alphabet} & \text{\#permutations with L's separate} \\ \hline \text{L,L,L,N,N,I,I,B,C,T,O} & \binom{9}{3} \times \frac{8!}{2!\,2!} = 846720 \\ \text{L,L,L,NN,I,I,B,C,T,O} & \binom{8}{3} \times \frac{7!}{2!} = 141120 \\ \text{L,L,L,N,N,II,B,C,T,O} & \binom{8}{3} \times \frac{7!}{2!} = 141120 \\ \text{L,L,L,NN,II,B,C,T,O} & \binom{7}{3} \times 6! = 25200 \end{array}$$

Therefore we get $P(\text{good permutation}) = \frac{589680}{1663200} = \frac{39}{110} \approx .35454$.

Example 11-5 "MISSISSIPPI" Using the same strategy as in the BILLCLINTON example, we start by calculating the number of permutations with the S’s separate over various alphabets.

$$\begin{array}{ll} \text{alphabet} & \text{\#permutations with S's separate} \\ \hline \text{M,P,P,S,S,S,S,I,I,I,I} & \binom{8}{4} \times \frac{7!}{4!\,2!} = 7350 \\ \text{M,P,P,S,S,S,S,II,I,I} & \binom{7}{4} \times \frac{6!}{2!\,2!} = 6300 \\ \text{M,P,P,S,S,S,S,II,II} & \binom{6}{4} \times \frac{5!}{2!\,2!} = 450 \\ \text{M,P,P,S,S,S,S,III,I} & \binom{6}{4} \times \frac{5!}{2!} = 900 \\ \text{M,P,P,S,S,S,S,IIII} & \binom{5}{4} \times \frac{4!}{2!} = 60 \end{array}$$

The number of permutations with the S’s separate and the I’s separate is $7350 - 6300 + 450 + 900 - 60 = 2340$.

Unfortunately, this 2340 includes permutations where the P’s are together. To get rid of those, we go through the entire exercise again, replacing the two P’s with one double PP.

$$\begin{array}{ll} \text{alphabet} & \text{\#permutations with S's separate} \\ \hline \text{M,PP,S,S,S,S,I,I,I,I} & \binom{7}{4} \times \frac{6!}{4!} = 1050 \\ \text{M,PP,S,S,S,S,II,I,I} & \binom{6}{4} \times \frac{5!}{2!} = 900 \\ \text{M,PP,S,S,S,S,II,II} & \binom{5}{4} \times \frac{4!}{2!} = 60 \\ \text{M,PP,S,S,S,S,III,I} & \binom{5}{4} \times 4! = 120 \\ \text{M,PP,S,S,S,S,IIII} & \binom{4}{4} \times 3! = 6 \end{array}$$

The number of permutations with the S’s separate, the I’s separate, and the P’s together is $1050 - 900 + 60 + 120 - 6 = 324$.

Finally the number of good permutations is $2340 - 324 = 2016$, and the corresponding probability is $\frac{2016}{34650} = \frac{16}{275} \approx .05818$.

A sophisticated solution: These problems quickly get out of hand if the words are long and there are lots of multiple letters. Here is an alternative solution that uses ideas from algebraic combinatorics.

Define polynomials for $k \ge 1$ by $q_k(x) = \sum_{i=1}^{k} \frac{(-1)^{i-k}}{i!}\binom{k-1}{i-1} x^i$. Here are the first few polynomials:

$$q_1(x) = x, \quad q_2(x) = x^2/2 - x, \quad q_3(x) = x^3/6 - x^2 + x.$$

The number of permutations with no equal neighbors, using an alphabet with frequencies $k_1, k_2, \dots$ is:

$$\int_0^\infty \prod_j q_{k_j}(x)\, e^{-x}\, dx.$$

Example 11-6 For the MISSISSIPPI problem, the product of the $q$ functions is

$$\begin{aligned} q_4(x)^2\, q_2(x)\, q_1(x) &= (x^4/24 - x^3/2 + 3x^2/2 - x)^2\, (x^2/2 - x)\, (x) \\ &= x^{11}/1152 - 13x^{10}/576 + 11x^9/48 - 7x^8/6 + 77x^7/24 - 19x^6/4 + 7x^5/2 - x^4. \end{aligned}$$

Substituting $j!$ for $x^j$ above, we get

$$\begin{aligned} &11!/1152 - 13 \cdot 10!/576 + 11 \cdot 9!/48 - 7 \cdot 8!/6 + 77 \cdot 7!/24 - 19 \cdot 6!/4 + 7 \cdot 5!/2 - 4! \\ &\quad = 34650 - 81900 + 83160 - 47040 + 16170 - 3420 + 420 - 24 \\ &\quad = 2016, \end{aligned}$$

the same number we calculated from scratch. Amazing!
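The polynomial recipe is mechanical enough to hand to a computer. The sketch below builds each q_k with exact fractions, multiplies them, and replaces x^j by j!. The second function is an independent check that builds words letter by letter; both function names, and the choice to represent a word by its list of letter frequencies, are ours.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial

def q_poly(k):
    """Coefficients of q_k(x); index i holds the coefficient of x^i."""
    return [Fraction(0)] + [
        Fraction((-1) ** (k - i) * comb(k - 1, i - 1), factorial(i))
        for i in range(1, k + 1)
    ]

def multiply(u, v):
    out = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out

def good_permutations(frequencies):
    """Integrate the product of the q polynomials against e^{-x}: x^j becomes j!."""
    prod = [Fraction(1)]
    for k in frequencies:
        prod = multiply(prod, q_poly(k))
    return sum(c * factorial(j) for j, c in enumerate(prod))

def good_permutations_direct(frequencies):
    """Independent check: extend the word one letter at a time, never repeating."""
    @lru_cache(maxsize=None)
    def count(remaining, last):
        if sum(remaining) == 0:
            return 1
        total = 0
        for letter, left in enumerate(remaining):
            if left and letter != last:
                rest = remaining[:letter] + (left - 1,) + remaining[letter + 1:]
                total += count(rest, letter)
        return total
    return count(tuple(frequencies), -1)

# MISSISSIPPI has one M, four I's, four S's and two P's.
print(good_permutations([1, 4, 4, 2]))         # 2016
print(good_permutations_direct([1, 4, 4, 2]))  # 2016
```

The same two functions reproduce the earlier examples: frequencies [1, 1, 2] for MISS give 6, and [3, 2, 1, 1] for SUCCESS give 96.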

(c) Secret Santa

Imagine three couples who put their six names in a hat, and randomly draw one name each to buy a Christmas gift. We would like the probability that nobody gets their own name or the name of their partner.

In general, randomly permuting the values $1, 2, \dots, N$ corresponds to randomly selecting $N$ dots on an $N \times N$ chessboard, with one dot in each row and column.

For instance, the permutation $\langle 4, 2, 1, 6, 5, 3\rangle$ corresponds to the diagram below.

[Figure: a 6 × 6 board with one dot in each row and column, placed according to the permutation.]

Now suppose there are some forbidden positions.

[Figure: a 6 × 6 board with two forbidden squares, marked ×, in each row.]

For each forbidden position $i$, let $A_i$ be the event that a spot lands there. Then $S_j = \sum P(A_{i_1} \cdots A_{i_j})$, where each $P(A_{i_1} \cdots A_{i_j})$ is either equal to zero, or to $(N-j)!/N! = 1/(N)_j$.

Here $(N)_j$ is defined to be $N(N-1)(N-2)\cdots(N-j+1)$. The probability is zero precisely when two of the positions $i_1, \dots, i_j$ are in the same row or column.

Therefore we have $S_j = R_j/(N)_j$, where $R_j$ is the number of ways to place $j$ non-taking rooks in the forbidden zone. The probability that none of the spots lands in a forbidden place is

$$P(X = 0) = \sum_{j=0}^{N} (-1)^j \frac{R_j}{(N)_j}. \tag{1}$$

Example 11-1 When there are no forbidden places, the rook numbers are $R_j = 1_{\{0\}}(j)$, so plugging into (1) gives $P(X = 0) = (-1)^0 R_0/(N)_0 = 1$, as expected.


These "rook numbers" $R_j$ are often calculated using the generating function $R(x) = \sum_j R_j x^j$, called the rook polynomial. Equation (1) can be rewritten in terms of $R$ as

$$P(X = 0) = \frac{1}{N!}\int_0^\infty x^N R(-1/x)\, e^{-x}\, dx. \tag{2}$$

Equation (2) is useful for calculating $P(X = 0)$ using a computer.

Example 11-2 The rook polynomial for a completely filled $m \times n$ rectangle is

$$R(x) = \sum_j \binom{m}{j}\binom{n}{j}\, j!\, x^j.$$

To see this, imagine placing $j$ non-taking rooks in this rectangle. You can freely choose $j$ rows and $j$ columns to use. Now you have a $j \times j$ square, and there are $j!$ ways to place the rooks there.

[Figure: a filled rectangle of forbidden squares; choosing 2 rows and 2 columns reduces it to a 2 × 2 square.]

Example 11-3 Consider the forbidden zone for the secret santa problem.

[Figure: the 6 × 6 board with three 2 × 2 blocks of forbidden squares, one for each couple.]

Each of the three blocks shown has its own rook polynomial, which, using the previous example, is $1 + 4x + 2x^2$. The overall rook polynomial for the forbidden zone illustrated above is

$$R(x) = (1 + 4x + 2x^2)^3 = 1 + 12x + 54x^2 + 112x^3 + 108x^4 + 48x^5 + 8x^6.$$

Substituting the rook numbers into (1) above, we get

$$P(X=0) = 1 - \frac{12}{6} + \frac{54}{6\cdot5} - \frac{112}{6\cdot5\cdot4} + \frac{108}{6\cdot5\cdot4\cdot3} - \frac{48}{6\cdot5\cdot4\cdot3\cdot2} + \frac{8}{6\cdot5\cdot4\cdot3\cdot2\cdot1} = \frac19.$$

Of course, with only three couples it is much easier to calculate this by brute force, as seen in the section on counting.

Example 11-4 For five couples, you can compute with Wolfram Alpha (for example)

integrate (1-4/x+2/x^2)^5*x^10*exp(-x)/10! x=0..infinity

to get the answer $\frac{3439}{28350} \approx 0.1213051146$.
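If you would rather not integrate, equation (1) can be evaluated directly from the rook numbers. The following is a small sketch in Python; the helper names are ours, and polynomials are stored as coefficient lists with index j holding R_j.

```python
from fractions import Fraction
from math import factorial

def poly_mul(u, v):
    out = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out

def poly_pow(base, exponent):
    result = [1]
    for _ in range(exponent):
        result = poly_mul(result, base)
    return result

def prob_no_forbidden(rook_numbers, N):
    """Equation (1): the sum of (-1)^j R_j / (N)_j, written as (N - j)! / N!."""
    return sum(
        Fraction((-1) ** j * R_j * factorial(N - j), factorial(N))
        for j, R_j in enumerate(rook_numbers)
    )

couple_block = [1, 4, 2]          # rook polynomial 1 + 4x + 2x^2 of one 2 x 2 block
three = prob_no_forbidden(poly_pow(couple_block, 3), 6)
five = prob_no_forbidden(poly_pow(couple_block, 5), 10)
print(three, five)                # 1/9 3439/28350
```

Multiplying the block polynomials works here because the blocks share no rows or columns, so rooks in different blocks never attack each other.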

Example 11-5 In the secret santa problem, if we only forbid people from drawing their own name then the forbidden zone consists of the main diagonal. In other words, this is the "rencontre" problem.

[Figure: the 6 × 6 board with forbidden squares along the main diagonal.]

The rook polynomial for this zone is $R(x) = (1 + x)^N$, which means that the rook numbers are $R_j = \binom{N}{j}$. Therefore,

$$P(X = 0) = \sum_{j=0}^{N} (-1)^j \frac{\binom{N}{j}}{(N)_j} = \sum_{j=0}^{N} \frac{(-1)^j}{j!}.$$

Example 11-6 Banana. The letters in the word "banana" are written down in random order. What is the chance that none of the first 3 letters is correct?

For example, "a b a n n a" is OK, but not "b n n a a a".

Solution: We will solve this using rook polynomials. The forbidden zone for this problem is

[Figure: a 6 × 6 board with rows and columns both labeled b, a, n, a, n, a, and the forbidden squares marked ×.]

If we spell the word "bnnaaa" the pattern is clearer.

[Figure: the same board with rows and columns reordered as b, n, n, a, a, a.]

The rook polynomial for this zone is $R(x) = (1 + x)(1 + 2x)(1 + 3x)$ and plugging this into the formula and integrating we get the answer: 19/60.


Example 11-7 Secret santa in a circle. Six people are seated at a round table, and they put their names in a hat for a gift draw. What is the chance that nobody draws their own name or the name of the person seated to their right?

Solution: For convenience I will mark the forbidden zone using numbers instead of ×. You can see that two rooks in this zone will be non-taking exactly when they land on non-consecutive numbers. Non-consecutive here means taken in a circle, so that 12 is beside both 11 and 1.

[Figure: the 6 × 6 board with its twelve forbidden squares numbered 1 to 12.]

From section 4(d) we know that these rook numbers are

$$R_j = \frac{12}{12-j}\binom{12-j}{j},$$

so that

$$P(X=0) = \sum_{j=0}^{6} (-1)^j\, \frac{12}{12-j}\binom{12-j}{j}\, \frac{(6-j)!}{6!} = \frac19.$$

This formula is the same as "ménage" with 6 couples.


Example 11-8 Las Vegas. The authors of [10] write "... the problem is equivalent to a card game played in Las Vegas: The cards of a shuffled deck are dealt one at a time and face up. At the same time the player calls the denominations in the order ace, two, three, ..., queen, king; ace, two, ... A match occurs when the player calls the same denomination as the card dealt; the suits need not match. The player wins if no match occurs." What is the chance that the player wins?

Solution: In the forbidden zone illustrated below, I have reordered the outcomes to make the pattern clearer. You can think of this as changing the game so the dealer calls "ace, ace, ace, ace, two, two, two, two, three, . . . , king, king, king, king."

[Figure: the 52 × 52 board with thirteen 4 × 4 blocks of forbidden squares, one block for each denomination.]

The rook polynomial for this forbidden zone is $R(x) = (1 + 16x + 72x^2 + 96x^3 + 24x^4)^{13}$. Plugging into the formula (and using a computer!) we find the chance of a win is

$$\frac{1}{52!}\int_0^\infty x^{52} R(-1/x)\, e^{-x}\, dx = .01623272747.$$
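Here is one way the computer calculation might go, as a sketch rather than the method used in [10]. Instead of the integral it uses the equivalent sum of (−1)^j R_j (52 − j)!/52!, with the block polynomial built from the rectangle formula. All function names are ours.

```python
from fractions import Fraction
from math import comb, factorial

def rectangle_rook_numbers(m, n):
    """Rook numbers of a completely filled m x n rectangle."""
    return [comb(m, j) * comb(n, j) * factorial(j) for j in range(min(m, n) + 1)]

def poly_mul(u, v):
    out = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out

def prob_no_forbidden(rook_numbers, N):
    """The sum of (-1)^j R_j (N - j)! / N!, computed exactly."""
    return sum(
        Fraction((-1) ** j * R_j * factorial(N - j), factorial(N))
        for j, R_j in enumerate(rook_numbers)
    )

block = rectangle_rook_numbers(4, 4)   # [1, 16, 72, 96, 24]
rook_numbers = [1]
for _ in range(13):                    # thirteen denominations, disjoint blocks
    rook_numbers = poly_mul(rook_numbers, block)

win = prob_no_forbidden(rook_numbers, 52)
print(float(win))
```

Exact integer arithmetic matters here: the terms of the alternating sum are enormous compared with the answer, so floating point would lose all precision.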


12 Appendices

(a) Binomial coefficients

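For any integers $n, k$, we define

$$\binom{n}{k} = \begin{cases} \dfrac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots(1)} & \text{if } k > 0 \\[1.5ex] 1 & \text{if } k = 0 \\ 0 & \text{if } k < 0. \end{cases}$$

Basic identities:

$$\begin{array}{lll} \text{if } n \ge 0 & \binom{n}{k} = \binom{n}{n-k} & \text{symmetry} \\[1ex] & \binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1} & \text{recursion} \\[1ex] & \binom{n}{k} = (-1)^k \binom{k-n-1}{k} & \text{inversion} \\[1ex] & k\binom{n}{k} = n\binom{n-1}{k-1} & \text{absorption} \\[1ex] & \binom{k}{m}\binom{n}{k} = \binom{n}{m}\binom{n-m}{k-m} & \text{super absorption} \end{array}$$

Some Sums:

$$\begin{array}{lll} \text{if } n \ge 0 & \sum_{n \le k \le m} \binom{k}{n} = \binom{m+1}{n+1} & \text{hockey stick} \\[1ex] \text{if } n \ge 0 \text{ or } |x| < 1 & \sum_k \binom{n}{k} x^k = (1+x)^n & \text{binomial} \\[1ex] \text{if } |x| < 1 & \sum_k \binom{n+k}{k} x^k = \dfrac{1}{(1-x)^{n+1}} & \text{inverse binomial} \\[1ex] \text{if } n \ge 0 \text{ or } |x| < 1 & \sum_k \binom{k}{m}\binom{n}{k} x^k = x^m \binom{n}{m}(1+x)^{n-m} & \text{super binomial} \end{array}$$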


(b) Stirling numbers

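Recurrences:

$$\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}, \qquad \left\{{n \atop k}\right\} = k\left\{{n-1 \atop k}\right\} + \left\{{n-1 \atop k-1}\right\}.$$

Power conversion:

$$x^n = \sum_k \left\{{n \atop k}\right\} x(x-1)\cdots(x-k+1).$$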

(c) Finite calculus

If $f$ is defined on the integers, then we let $\Delta f(x) = f(x+1) - f(x)$. The following telescoping sum is the analogue of the fundamental theorem of calculus:

$$f(x) = f(a) + \sum_{a \le y < x} \Delta f(y).$$

We can also take second differences:

$$\begin{aligned} \Delta^2 f(x) &= \Delta f(x+1) - \Delta f(x) \\ &= [f(x+2) - f(x+1)] - [f(x+1) - f(x)] \\ &= f(x+2) - 2f(x+1) + f(x). \end{aligned}$$

In fact, for any integer $n \ge 0$ we have

$$\Delta^n f(x) = \sum_k \binom{n}{k} (-1)^{n-k} f(x+k).$$


(d) Discrete probability

Let $\Omega$ be a finite or countable set of all possible outcomes of a random experiment. Suppose that $p : \Omega \to [0,1]$ satisfies $\sum_{\omega\in\Omega} p(\omega) = 1$; then we can define a probability measure $P(A) := \sum_{\omega \in A} p(\omega)$ on all subsets $A$ of $\Omega$.

A subset $A \subseteq \Omega$ is called an event, and we say that the event occurs if our randomly chosen $\omega$ belongs to $A$.

For instance, for a fair die roll we have $\Omega = \{1, 2, 3, 4, 5, 6\}$. An event $A$ might be "the die shows an even number", i.e., $A = \{2, 4, 6\}$. For a fair die, $p(\omega) = 1/6$ for all $\omega$, and therefore $P(A) = 3/6 = 1/2$.

A random variable is any mapping $X : \Omega \to \mathbb{R}$, and its expectation is defined as

$$E(X) = \sum_{\omega\in\Omega} X(\omega)\,p(\omega) = \sum_{x\in\mathbb{R}} x\,P(X = x).$$

Intuitively, $E(X)$ is the average value of $X$ that would be observed in many repeated random experiments. For instance, if $X$ is the value thrown on a fair die, then

$$E(X) = (1+2+3+4+5+6)\,\tfrac16 = 3.5.$$


13 References

1. Problems and Snapshots From the World of Probability, Gunnar Blom, Lars Holst, and Dennis Sandell, Springer-Verlag (1994).

2. Challenging Mathematical Problems with Elementary Solutions Volume 1: Combinatorial Analysis and Probability Theory, Akiva M. Yaglom and Isaak M. Yaglom, Dover Publishing (1987).

3. Fifty Challenging Problems in Probability, Frederick Mosteller, Dover Publishing (1987).

4. An Introduction to the Analysis of Algorithms (2nd edition), Philippe Flajolet and Robert Sedgewick, Addison-Wesley (2013).

5. Concrete Mathematics (2nd edition), Ronald L. Graham, Donald E. Knuth, and Oren Patashnik, Addison-Wesley (1994).

6. Problems column, The American Mathematical Monthly, published ten times per year, www.maa.org/pubs/monthly.html.

7. Stack Exchange – Mathematics, math.stackexchange.com

8. Four Proofs of the Ballot Theorem, Marc Renault, Mathematics Magazine, Vol. 80, No. 5, December 2007, 345–352.

9. Integrals Don’t Have Anything to Do with Discrete Math, Do They?, P. Mark Kayll, Mathematics Magazine, Vol. 84, No. 2, April 2011, 108–119.

10. On the Asymptotic Solution of a Card-Matching Problem, Finn F. Knudsen and Ivar Skau, Mathematics Magazine, Vol. 69, No. 3, June 1996, 190–197.
