Foundations of Probability Theory
Chapter 1 – Lecture 3

Yiren Ding

Shanghai Qibao Dwight High School

February 25, 2016

Yiren Ding Foundations of Probability Theory 1 / 33

Outline

1 Continuity of Probability
    Theorem 1 (Continuity Theorem)
    Proof of Theorem 1
    Borel-Cantelli lemma

2 Compound Chance Experiments
    Definition of Compound Chance Experiment
    Examples

3 Basic Rules of Probability
    Basic Rules 1–4
    Proofs of Basic Rules
    Examples


Continuity of Probability Theorem 1 (Continuity Theorem)

Introduction

Recall that the probability \(p : \Omega \to [0, 1]\) is a function and the probability measure \(P : \mathcal{P}(\Omega) \to [0, 1]\) is a set function.

We now show that \(P\) is also a continuous set function.

Consider a non-decreasing sequence of sets \(E_1, E_2, \ldots\), that is, \(E_n \subseteq E_{n+1}\) for all \(n \in \mathbb{N}\).

We define \(E = \lim_{n\to\infty} E_n\). For a non-decreasing sequence this limit is the union \(E = \bigcup_{i=1}^{\infty} E_i\).

Theorem 1 (Continuity of Probability).

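Let \(P : \mathcal{P}(\Omega) \to [0, 1]\) be a probability measure defined on the sample space \(\Omega\) that satisfies Axioms 1–3, and let \(E_1, E_2, \ldots\) be a non-decreasing sequence of events with \(E = \lim_{n\to\infty} E_n\). Then \(P\) is continuous, i.e.,

\[ \lim_{n\to\infty} P(E_n) = P\left(\lim_{n\to\infty} E_n\right), \quad \text{or simply} \quad \lim_{E_n \to E} P(E_n) = P(E). \]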


Continuity of Probability Proof of Theorem 1

Proof of Theorem 1.

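Define \(F_1 = E_1\), \(F_2 = E_2 \setminus E_1\), ..., and in general \(F_{n+1} = E_{n+1} \setminus E_n\) for \(n \ge 1\).

The sets \(F_1, F_2, \ldots\) are pairwise disjoint and satisfy

\[ \bigcup_{i=1}^{n} F_i = \bigcup_{i=1}^{n} E_i \ (= E_n) \ \text{for all } n \ge 1 \quad \text{and} \quad \bigcup_{i=1}^{\infty} F_i = \bigcup_{i=1}^{\infty} E_i. \]

Thus

\[
\begin{aligned}
P\left(\lim_{n\to\infty} E_n\right) &= P\left(\bigcup_{i=1}^{\infty} E_i\right) = P\left(\bigcup_{i=1}^{\infty} F_i\right) = \sum_{i=1}^{\infty} P(F_i) && \text{(by Axiom 3)} \\
&= \lim_{n\to\infty} \sum_{i=1}^{n} P(F_i) = \lim_{n\to\infty} P\left(\bigcup_{i=1}^{n} F_i\right) && \text{(by Axiom 3 again)} \\
&= \lim_{n\to\infty} P\left(\bigcup_{i=1}^{n} E_i\right) = \lim_{n\to\infty} P(E_n).
\end{aligned}
\]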


Remark

Theorem 1 also holds true for a non-increasing sequence of sets \(E_1, E_2, \ldots\), that is, a sequence with \(E_{n+1} \subseteq E_n\) for all \(n \in \mathbb{N}\).

So we can replace “non-decreasing” with “monotonic” in Theorem 1.

That is, the result

\[ \lim_{n\to\infty} P(E_n) = P\left(\lim_{n\to\infty} E_n\right) \]

holds true for both non-decreasing and non-increasing sequences \(E_n\). For a non-increasing sequence, the set \(\lim_{n\to\infty} E_n\) denotes the intersection \(\bigcap_{i=1}^{\infty} E_i\) of the sets \(E_1, E_2, \ldots\).

The proof of Theorem 1 for a non-increasing sequence \(E_i\) is very similar, so we leave it as a homework exercise.

The continuity property of probability is crucial in the proof of the Law of Large Numbers.


Continuity of Probability Borel-Cantelli lemma

Example 1.

Using the axioms we can prove the following results.

\(P(A) \le P(B)\) if \(A \subseteq B\).

Boole’s inequality: \(P\left(\bigcup_{k=1}^{\infty} A_k\right) \le \sum_{k=1}^{\infty} P(A_k)\) for any sequence of subsets \(A_1, A_2, \ldots\).

Borel-Cantelli lemma: Let \(A_1, A_2, \ldots\) be an infinite sequence of subsets with \(\sum_{k=1}^{\infty} P(A_k) < \infty\). Define the set \(C\) as

\[ C = \{\omega : \omega \in A_k \text{ for infinitely many } k\}. \]

Then \(P(C) = 0\).

The Borel-Cantelli lemma is one of the essential tools in proving the Theoretical Law of Large Numbers.

The proof of the first part is left as a simple homework exercise.


Proof of Boole’s inequality.

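Define pairwise disjoint sets \(B_1, B_2, \ldots\) such that \(\bigcup_{k=1}^{\infty} A_k = \bigcup_{k=1}^{\infty} B_k\).

Let \(B_1 = A_1\) and let \(B_2 = A_2 \setminus A_1\). In general, let

\[ B_k = A_k \setminus (A_1 \cup \cdots \cup A_{k-1}) \quad \text{for } k = 2, 3, \ldots \]

By induction, \(B_1 \cup \cdots \cup B_k = A_1 \cup \cdots \cup A_k\) for \(k \ge 1\). Also the sets \(B_1, \ldots, B_k\) are pairwise disjoint.

Hence \(\bigcup_{k=1}^{\infty} B_k = \bigcup_{k=1}^{\infty} A_k\) and the sets \(B_1, B_2, \ldots\) are pairwise disjoint. By Axiom 3, it follows that

\[ P\left(\bigcup_{k=1}^{\infty} A_k\right) = P\left(\bigcup_{k=1}^{\infty} B_k\right) = \sum_{k=1}^{\infty} P(B_k). \]

Since \(B_k \subseteq A_k\), we have \(P(B_k) \le P(A_k)\) for every \(k\), so the sum on the right is at most \(\sum_{k=1}^{\infty} P(A_k)\) and the proof is complete.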


Proof of Borel-Cantelli lemma.

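Let \(B_n = \bigcup_{k=n}^{\infty} A_k\) for \(n \ge 1\).

Then \(B_1, B_2, \ldots\) is a non-increasing sequence of sets.

Note that \(\omega \in C\) if and only if \(\omega \in B_n\) for all \(n \ge 1\).

This implies that the set \(C\) equals the intersection of all sets \(B_n\).

Using the continuity of probabilities,

\[ P(C) = P\left(\lim_{n\to\infty} B_n\right) = \lim_{n\to\infty} P(B_n). \]

By Boole’s inequality,

\[ P(C) = \lim_{n\to\infty} P\left(\bigcup_{k=n}^{\infty} A_k\right) \le \lim_{n\to\infty} \sum_{k=n}^{\infty} P(A_k) = 0. \]

The latter limit is 0 since \(\sum_{k=1}^{\infty} P(A_k) < \infty\). (Why? Homework.)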


Compound Chance Experiments Definition of Compound Chance Experiment

Compound Chance Experiments

A chance experiment \(\varepsilon\) is called a compound experiment if it consists of several independent elementary chance experiments \((\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)\).

Assume that each elementary experiment \(\varepsilon_k\) has a probability space \((\Omega_k, P_k)\), where \(\Omega_k\) is finite or countably infinite and \(P_k\) is defined such that the probability \(p_k(\omega_k)\) is assigned to each element \(\omega_k \in \Omega_k\) and that for \(E \subseteq \Omega_k\), \(P_k(E) = \sum_{\omega_k \in E} p_k(\omega_k)\).

The sample space for \(\varepsilon\) is naturally defined as the set

\[ \Omega = \{\omega : \omega = (\omega_1, \ldots, \omega_n), \text{ where } \omega_k \in \Omega_k \text{ for } k = 1, \ldots, n\}. \]

The probability measure \(P\) arises naturally by assigning the probability \(p(\omega)\) via the product rule:

\[ p(\omega) = p_1(\omega_1) \times p_2(\omega_2) \times \cdots \times p_n(\omega_n). \]

We can prove that it is the only probability measure satisfying the property \(P(AB) = P(A)P(B)\) for every pair of events \(A\) and \(B\) that are generated by different (independent) elementary experiments.


Compound Chance Experiments Examples

Example 2.

In the “Seven Virtues Tea House,” it normally costs $3.50 to drink a cup of tea. On Friday afternoons, however, teachers are in a good mood, so customers pay $0.25, $1.00 or $2.50 for the first cup. In order to determine how much they will pay, customers must throw a dart at a dartboard divided into eight segments of equal size. Two of the segments read $0.25, four of the segments read $1, and two more read $2.50. Two friends, Frank and Sunmous, each throw a dart at the board. What is the probability that the two friends will have to pay no more than $2 total for their first cup of tea?


Example 2 solution

Here \(\Omega = \{(\omega_F, \omega_S) : \omega_F \in \Omega_F, \omega_S \in \Omega_S\}\), where \(\Omega_F\) denotes the sample space for Frank’s throw and \(\Omega_S\) the sample space for Sunmous’ throw.

Let L = $0.25, M = $1 and H = $2.50. So the sample space consists of the nine outcomes

(L, L), (L, M), (M, L), (L, H), (H, L), (M, M), (M, H), (H, H), (H, M).

By the product rule, the probability \(\frac{1}{4} \times \frac{1}{4} = \frac{1}{16}\) is assigned to each of the outcomes (L, L), (L, H), (H, L), and (H, H).

The probability \(\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}\) is assigned to the outcome (M, M).

And the probability \(\frac{1}{2} \times \frac{1}{4} = \frac{1}{8}\) is assigned to each of the outcomes (L, M), (M, L), (H, M), and (M, H).

The two best friends will have to pay no more than $2 if one of the four outcomes (L, L), (L, M), (M, L), (M, M) occurs.

Hence, the probability of the event is \(\frac{1}{16} + \frac{1}{8} + \frac{1}{8} + \frac{1}{4} = \frac{9}{16}\).
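The product-rule calculation can be checked by listing the compound sample space directly. The following is a minimal sketch, assuming only the segment layout stated in Example 2; the names `segments` and `prob_total_at_most` are illustrative and not part of the lecture.

```python
from fractions import Fraction
from itertools import product

# Price shown on each of the eight equally likely dartboard segments
# (two at $0.25, four at $1, two at $2.50), as stated in Example 2.
segments = [Fraction(1, 4)] * 2 + [Fraction(1)] * 4 + [Fraction(5, 2)] * 2


def prob_total_at_most(limit):
    """Probability that two independent throws cost at most `limit` in total."""
    p_segment = Fraction(1, len(segments))
    total = Fraction(0)
    for frank, sunmous in product(segments, repeat=2):
        if frank + sunmous <= limit:
            total += p_segment * p_segment  # product rule for the pair
    return total


print(prob_total_at_most(2))  # 9/16
```

Working with `Fraction` keeps the arithmetic exact, so the result can be compared with the hand computation without rounding.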


Example 3.

David and Matilda play Russian roulette in which they take turns pulling the trigger of a six-cylinder revolver loaded with one bullet (after each pull of the trigger, the magazine is spun to randomly select a new cylinder to fire). David pulls the trigger first. What is the probability that he is dead?


Example 3 – Simulation Approach

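import random

numTrials = 10000
count = 0  # number of games in which David fires the fatal shot

for i in range(numTrials):
    while True:
        # David pulls first; the value 1 stands for the loaded cylinder.
        if random.randint(1, 6) == 1:
            count += 1
            break
        # David survived this round, so Matilda pulls next.
        if random.randint(1, 6) == 1:
            break

prob = count / numTrials
print(prob)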


Example 3 solution

The sample space for this chance experiment is the set

\[ \Omega = \{F, MF, MMF, \ldots\} \cup \{MM\ldots\}, \]

where the element \(M \ldots MF\) with the first \(n - 1\) letters all being \(M\) represents the event that the first \(n - 1\) times that the trigger is pulled, no shot is fired, and that the fatal shot is fired on the \(n\)th attempt.

The element \(MM\ldots\) represents the event that no fatal shot is fired and the two best friends live happily ever after. (In reality this is impossible, but formally it must be included in the sample space.)

For notational convenience (a technique you should use often), let’s express the sample space as

\[ \Omega = \{1, 2, \ldots\} \cup \{\infty\}. \]


Example 3 solution cont’d

Since the outcomes of the pulls of the trigger are independent of one another, and on each pull the probability that the fatal shot will be fired is \(\frac{1}{6}\), it is reasonable to assign the probability

\[ p(n) = \left(\frac{5}{6}\right)^{n-1} \frac{1}{6} \]

to the element \(n \in \{1, 2, \ldots\}\). What about the probability \(p(\infty)\)? (Homework, use Axiom 2.)

If we define \(A\) as the event that the fatal shot is fired by David, then

\[ P(A) = \sum_{n=0}^{\infty} p(2n + 1) = \sum_{n=0}^{\infty} \left(\frac{5}{6}\right)^{2n} \frac{1}{6} = \frac{1}{6} \sum_{n=0}^{\infty} \left(\frac{25}{36}\right)^{n} = \frac{1}{6} \left(\frac{1}{1 - \frac{25}{36}}\right) = \frac{6}{11} \approx 0.5455. \]
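As a cross-check of the geometric series, the sketch below evaluates the closed form exactly and compares it with a partial sum of \(p(2n+1)\). The helper names `p` and `david_partial_sum` are illustrative choices, not part of the lecture.

```python
from fractions import Fraction


def p(n):
    """Probability that the fatal shot is fired on the n-th pull."""
    return Fraction(5, 6) ** (n - 1) * Fraction(1, 6)


def david_partial_sum(terms):
    """Sum of p(2n + 1) for n = 0, ..., terms - 1 (David pulls on odd turns)."""
    return sum(p(2 * n + 1) for n in range(terms))


# Closed form of the geometric series with ratio 25/36.
closed_form = Fraction(1, 6) / (1 - Fraction(25, 36))

print(closed_form)                   # 6/11
print(float(david_partial_sum(50)))  # close to 0.5455
```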


Basic Rules of Probability Basic Rules 1–4

Basic Rules of Probability

Let \(A \cup B\) denote the union of two events \(A\) and \(B\), that is, the event that at least one of the events \(A\) or \(B\) occurs.

Let \(AB\) (also written \(A \cap B\)) denote the intersection of two events \(A\) and \(B\), that is, the event that both \(A\) and \(B\) occur.

Theorem 2.

Rule 1: For any finite number of mutually exclusive events \(A_1, \ldots, A_n\),

\[ P(A_1 \cup \cdots \cup A_n) = P(A_1) + \cdots + P(A_n). \]

Rule 2: For any event \(A\),

\[ P(A) = 1 - P(A^c). \]

Rule 3: For any two events \(A\) and \(B\),

\[ P(A \cup B) = P(A) + P(B) - P(AB). \]

Rule 4: For any finite number of events \(A_1, \ldots, A_n\),

\[ P\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} P(A_i) - \sum_{i<j} P(A_i A_j) + \sum_{i<j<k} P(A_i A_j A_k) - \cdots + (-1)^{n-1} P(A_1 A_2 \cdots A_n). \]
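To see Rule 4 at work on a concrete case, the sketch below compares both sides of the formula on a small sample space. The fair-die sample space and the three events are arbitrary illustrative choices, not taken from the lecture.

```python
from fractions import Fraction
from itertools import combinations

# Illustrative sample space: one roll of a fair die.
omega = range(1, 7)
p = {w: Fraction(1, 6) for w in omega}


def P(event):
    """Probability of an event, i.e. the sum of p(w) over its outcomes."""
    return sum(p[w] for w in event)


def inclusion_exclusion(events):
    """Right-hand side of Rule 4: alternating sum over all k-fold intersections."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = (-1) ** (k - 1)
        for group in combinations(events, k):
            total += sign * P(set.intersection(*group))
    return total


events = [{1, 2, 3}, {2, 4, 6}, {3, 6}]
lhs = P(set.union(*events))
rhs = inclusion_exclusion(events)
print(lhs, rhs)  # 5/6 5/6
```

With two events the same function reduces to Rule 3, and with pairwise disjoint events every intersection term vanishes, which recovers Rule 1.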


Basic Rules of Probability Proofs of Basic Rules

Proof of Rule 1

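Let \(\emptyset\) denote the empty set of outcomes (null event). We first prove that \(P(\emptyset) = 0\).

Applying Axiom 3 with \(A_i = \emptyset\) for \(i = 1, 2, \ldots\) gives \(P(\emptyset) = \sum_{i=1}^{\infty} a_i\), where \(a_i = P(\emptyset)\) for each \(i\). A series of equal nonnegative terms has a finite sum only if every term is zero, so this implies that \(P(\emptyset) = 0\).

Let \(A_1, \ldots, A_n\) be any finite sequence of pairwise disjoint sets.

Let’s augment this sequence by \(A_{n+1} = \emptyset, A_{n+2} = \emptyset, \ldots\). Then by Axiom 3,

\[ P\left(\bigcup_{i=1}^{n} A_i\right) = P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i) = \sum_{i=1}^{n} P(A_i). \]

Notice that for a finite sample space, Axiom 3 can be replaced by Rule 1. The added generality of Axiom 3 is necessary only when the sample space is infinite.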


Proof of Rule 3

The proof of Rule 2 is left as a simple homework exercise.

To prove Rule 3, let \(A_1 = A \setminus B\), let \(B_1 = B \setminus A\), and let \(C = AB\).

The sets \(A_1\), \(B_1\), and \(C\) are pairwise disjoint. Moreover,

\[ A \cup B = A_1 \cup B_1 \cup C, \quad A = A_1 \cup C, \quad \text{and} \quad B = B_1 \cup C. \]

By Rule 1,

\[ P(A \cup B) = P(A_1) + P(B_1) + P(C). \]

Also since \(P(A) = P(A_1) + P(C)\) and \(P(B) = P(B_1) + P(C)\), by substituting into the expression above, we have

\[
\begin{aligned}
P(A \cup B) &= P(A) - P(C) + P(B) - P(C) + P(C) \\
&= P(A) + P(B) - P(AB).
\end{aligned}
\]


Proof of Rule 4

We only prove Rule 4 for the special case that the sample space is finite or countably infinite.

In this case \(P(A) = \sum_{\omega \in A} p(\omega)\).

For a fixed \(\omega\), if \(\omega \notin \bigcup_{i=1}^{n} A_i\), then \(p(\omega)\) contributes to neither the LHS nor the RHS.

Assume now that \(\omega \in \bigcup_{i=1}^{n} A_i\). Then, there is at least one set \(A_i\) to which \(\omega\) belongs. Let \(s\) be the number of sets \(A_i\) to which \(\omega\) belongs.

In the LHS, \(p(\omega)\) contributes only once. In the first term of the RHS, \(p(\omega)\) contributes \(s\) times, in the second term \(\binom{s}{2}\) times, in the third term \(\binom{s}{3}\) times, and so on. Thus, the coefficient of \(p(\omega)\) in the RHS is

\[ s - \binom{s}{2} + \binom{s}{3} - \cdots + (-1)^{s-1} \binom{s}{s}. \]

We leave it as a homework exercise for you to verify that this coefficient is equal to 1. (Hint: Use the binomial theorem.)
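Before attempting the proof, it can help to confirm the claim numerically. This is a quick sanity check for small values of s, not a substitute for the binomial-theorem argument; the function name `coefficient` is an illustrative choice.

```python
from math import comb


def coefficient(s):
    """Coefficient of p(omega) on the RHS of Rule 4 when omega lies in exactly s sets."""
    return sum((-1) ** (j - 1) * comb(s, j) for j in range(1, s + 1))


print([coefficient(s) for s in range(1, 11)])  # ten ones
```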


Basic Rules of Probability Examples

Example 4.

How many rolls of a fair die are required to have at least a 50% chance of rolling at least one six? How many rolls of two fair dice are required to have at least a 50% chance of rolling at least one double six?

For rolling a single die \(r\) times, the sample space consists of all elements \((i_1, i_2, \ldots, i_r)\), where \(i_k\) denotes the outcome of the \(k\)th roll of the die.

The sample space has \(6^r\) elements, each equally likely.

Let \(A\) be the event that at least one 6 is obtained in \(r\) rolls. To compute \(P(A)\), it is easier to compute the probability of the complementary event \(A^c\) that no 6 is obtained in the \(r\) rolls. The set \(A^c\) has \(5^r\) elements.

Hence \(P(A^c) = 5^r / 6^r\), so

\[ P(A) = 1 - \frac{5^r}{6^r}. \]

How do we find the smallest value \(r\) such that this probability is at least 50%?

Just plug in some numbers! For \(r = 3\), the probability is 0.4213, and for \(r = 4\) this probability is 0.5177. So the answer is 4.


Example 4 solution cont’d

For rolling two dice \(r\) times, the sample space consists of all elements

\[ ((i_1, j_1), (i_2, j_2), \ldots, (i_r, j_r)), \]

where \(i_k\) and \(j_k\) denote the outcomes of the two dice on the \(k\)th roll.

The sample space has \(36^r\) elements, all equally likely.

Let \(A\) be the event that at least one double six is obtained in \(r\) rolls. The complementary event \(A^c\) of rolling no double six in \(r\) rolls can occur in \(35^r\) ways.

Hence \(P(A^c) = 35^r / 36^r\) and so

\[ P(A) = 1 - \frac{35^r}{36^r}. \]

This probability has the value 0.4914 for \(r = 24\) and the value 0.5055 for \(r = 25\). Hence 25 rolls are required.


Example 5.

A single card is randomly drawn from a thoroughly shuffled deck of 52 cards. What is the probability that the drawn card will be either a heart or an ace?

A poker deck has 52 cards, so our sample space is:

Hearts ♥A, ..., ♥2; Spades ♠A, ..., ♠2; Clubs ♣A, ..., ♣2; Diamonds ♦A, ..., ♦2.

Each outcome has the probability \(\frac{1}{52}\).

Let \(A\) be the event that a heart is drawn, and \(B\) the event that an ace is drawn.

We have \(P(A) = \frac{13}{52}\), \(P(B) = \frac{4}{52}\), and \(P(AB) = \frac{1}{52}\).

Applying Rule 3, we have

\[ P(A \cup B) = P(A) + P(B) - P(AB) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52}. \]
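The same answer can be obtained by building the deck and counting. This is a small sketch; the string labels for ranks and suits are arbitrary illustrative choices.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "spades", "clubs", "diamonds"]
deck = list(product(suits, ranks))  # 52 equally likely outcomes


def P(event):
    """Probability of an event under the uniform measure on the deck."""
    return Fraction(len(event), len(deck))


hearts = {card for card in deck if card[0] == "hearts"}
aces = {card for card in deck if card[1] == "A"}

# Rule 3: P(A or B) = P(A) + P(B) - P(AB).
rule3 = P(hearts) + P(aces) - P(hearts & aces)
print(P(hearts | aces), rule3)  # 4/13 4/13
```

`Fraction` reduces 16/52 to lowest terms, which is why the output reads 4/13.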


Example 6 (continued in Example 9).

The eight soccer teams which have reached the quarterfinals of the Champions League are formed by two teams from each of the countries England, Germany, Italy and Spain. The four matches to be played in the quarterfinals are determined by drawing lots.

(a) What is the probability that the two teams from the same country play against each other in each of the four matches?

(b) What is the probability that there is a match between the two teams from England or between the two teams from Germany?


Example 6 (a) solution

Number the teams 1, ..., 8 and take as the sample space all possible permutations \((t_1, \ldots, t_8)\), where each permutation corresponds to the four matches \((t_1, t_2)\), \((t_3, t_4)\), \((t_5, t_6)\) and \((t_7, t_8)\).

The total number of elements in the sample space is \(8!\).

Each element gets assigned the same probability \(\frac{1}{8!}\).

Let \(A\) be the event that the two teams from the same country play against each other in each of the four matches.

The event \(A\) corresponds to \(8 \times 1 \times 6 \times 1 \times 4 \times 1 \times 2 \times 1 = 384\) elements.

So the probability of event \(A\) is

\[ P(A) = \frac{384}{8!} = 0.0095. \]


Example 6 (b) solution

Let \(E\) be the event that there is a match between the two teams from England and let \(G\) be the event that there is a match between the two teams from Germany.

The desired probability \(P(E \cup G)\) satisfies

\[ P(E \cup G) = P(E) + P(G) - P(EG). \]

\(E\) and \(G\) each correspond to \(4 \times 2 \times 6! = 5760\) elements,

and \(EG\) corresponds to \(4 \times 2 \times 3 \times 2 \times 4! = 1152\) elements.

So \(P(E) = P(G) = \frac{5760}{8!}\) and \(P(EG) = \frac{1152}{8!}\).

This gives

\[ P(E \cup G) = \frac{5760}{8!} + \frac{5760}{8!} - \frac{1152}{8!} = 0.2571. \]
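Because the sample space has only 8! = 40320 elements, the counts in parts (a) and (b) can be confirmed by brute force. The sketch below assumes a hypothetical labelling in which teams 0 and 1 are English, 2 and 3 German, 4 and 5 Italian, and 6 and 7 Spanish; any labelling that pairs compatriots would serve equally well.

```python
from itertools import permutations


def country(team):
    """Hypothetical labelling: teams 2c and 2c + 1 come from country c."""
    return team // 2


def matches(perm):
    """The four matches (t1, t2), (t3, t4), (t5, t6), (t7, t8)."""
    return [(perm[i], perm[i + 1]) for i in range(0, 8, 2)]


def has_match(perm, c):
    """True if the two teams of country c meet in one of the matches."""
    return any(country(a) == c and country(b) == c for a, b in matches(perm))


total = count_A = count_E = count_G = count_EG = 0
for perm in permutations(range(8)):
    total += 1
    e = has_match(perm, 0)  # England
    g = has_match(perm, 1)  # Germany
    count_E += e
    count_G += g
    count_EG += e and g
    count_A += all(country(a) == country(b) for a, b in matches(perm))

print(total, count_A, count_E, count_G, count_EG)
print(round((count_E + count_G - count_EG) / total, 4))
```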


Example 7 (Modified Birthday Problem).

What is the probability that in a class of 23 students exactly two children have birthdays on the same day?

Take as the sample space the set of all ordered 23-tuples of numbers selected from the integers 1, 2, ..., 365. The size of the sample space is \(365^{23}\), with each element having the same probability \(\frac{1}{365^{23}}\).

For each of the \(\binom{23}{2} = 253\) possible “places” of the two children with the same birthday, define \(A_i\) as the event that the two children from combination \(i\) are the only two children having the same birthday.

Kindergarten instincts should tell you that \(A_1, \ldots, A_{253}\) are mutually exclusive events each having the probability

\[ P(A_i) = \frac{365 \times 1 \times 364 \times 363 \times \cdots \times 344}{365^{23}} = 0.0014365. \]

By Rule 1, \(P(A_1 \cup \cdots \cup A_{253}) = P(A_1) + \cdots + P(A_{253}) = 253 \times 0.0014365 = 0.3634\). The case for having exactly two days on which exactly two children have the same birthday is left as a homework exercise.
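The counting argument translates directly into an exact computation. In this sketch the function name and its default arguments are illustrative; the formula is the one derived above, with 253 mutually exclusive events of equal probability.

```python
from fractions import Fraction
from math import comb, prod


def prob_exactly_one_pair(n=23, days=365):
    """P(exactly two of n children share a birthday, all other birthdays distinct)."""
    # For a fixed pair: `days` choices for the shared day, 1 choice for the
    # second child, then distinct days for the other n - 2 children,
    # i.e. 364 x 363 x ... x 344 when n = 23 and days = 365.
    favourable = days * prod(range(days - 1, days - 1 - (n - 2), -1))
    p_fixed_pair = Fraction(favourable, days ** n)
    # Rule 1: the comb(n, 2) pair events are mutually exclusive.
    return comb(n, 2) * p_fixed_pair


print(round(float(prob_exactly_one_pair()), 4))  # 0.3634
```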


Example 7 – Simulation Approach

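import random

classSize = 23
numTrials = 1000
dupeCount = 0  # trials in which exactly two children share a birthday

for trial in range(numTrials):
    # Draw a birthday for each child in the class.
    bdayList = []
    for i in range(classSize):
        newBDay = random.randint(1, 365)
        bdayList.append(newBDay)
    # Count the children whose birthday also occurs elsewhere in the list.
    dateCount = 0
    for num in bdayList:
        if bdayList.count(num) > 1:
            dateCount += 1
    # Exactly two such children: one pair shares a day and nobody else does.
    if dateCount == 2:
        dupeCount += 1

prob = dupeCount / numTrials
print("The probability that exactly two children share a birthday is", prob)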


Example 8 (Envelope problem).

Letters to \(n\) different persons are randomly put into \(n\) pre-addressed envelopes. What is the probability that at least one person receives the correct letter? What is the probability that exactly \(j\) persons receive a correct letter?

Here the sample space is the set of all permutations \((e_1, \ldots, e_n)\) of the ordered set \((1, \ldots, n)\), where \(e_i\) is the letter given to the \(i\)th person.

The total number of possible outcomes is naturally \(n!\).

For a fixed \(i\), let \(A_i\) denote the event that the letter for person \(i\) is put into the envelope with label \(i\) (that is, \(e_i = i\)).

The probability that at least one person receives the correct letter is given by \(P(A_1 \cup \cdots \cup A_n)\).


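For fixed \(i\), the total number of orderings \((e_1, \ldots, e_n)\) with \(e_i = i\) is equal to \((n-1)!\), so \(P(A_i) = \frac{(n-1)!}{n!}\) for \(i = 1, \ldots, n\).

Next fix \(i\) and \(j\) with \(i \ne j\). The number of orderings \((e_1, \ldots, e_n)\) with \(e_i = i\) and \(e_j = j\) is equal to \((n-2)!\).

Hence, \(P(A_i A_j) = \frac{(n-2)!}{n!}\) for all \(i, j\) such that \(i \ne j\).

Using the inclusion-exclusion principle (Rule 4),

\[
\begin{aligned}
P(A_1 \cup \cdots \cup A_n) &= \binom{n}{1} \frac{(n-1)!}{n!} - \binom{n}{2} \frac{(n-2)!}{n!} + \binom{n}{3} \frac{(n-3)!}{n!} - \cdots + (-1)^{n-1} \binom{n}{n} \frac{1}{n!} \\
&= \frac{1}{1!} - \frac{1}{2!} + \frac{1}{3!} - \cdots + (-1)^{n-1} \frac{1}{n!} \\
&= 1 - \sum_{k=0}^{n} \frac{(-1)^k}{k!} \approx 1 - \frac{1}{e} \approx 0.6321,
\end{aligned}
\]

where the approximation holds for large \(n\).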
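For small n the inclusion-exclusion result can be compared with a direct count over all n! permutations. This is a minimal sketch with illustrative function names.

```python
from fractions import Fraction
from itertools import permutations
from math import e, factorial


def p_at_least_one_match(n):
    """Alternating sum 1/1! - 1/2! + ... + (-1)^(n-1)/n! from Rule 4."""
    return sum(Fraction((-1) ** (k - 1), factorial(k)) for k in range(1, n + 1))


def p_brute_force(n):
    """Fraction of permutations of 0..n-1 with at least one fixed point."""
    hits = sum(
        any(perm[i] == i for i in range(n)) for perm in permutations(range(n))
    )
    return Fraction(hits, factorial(n))


for n in range(1, 8):
    assert p_at_least_one_match(n) == p_brute_force(n)

print(p_at_least_one_match(5))                     # 19/30
print(float(p_at_least_one_match(10)), 1 - 1 / e)  # both close to 0.6321
```

The convergence to \(1 - 1/e\) is fast: the two printed values already agree to several decimal places at n = 10.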


Furthermore, we can easily verify that

\[ P(\text{exactly } j \text{ persons receive a correct letter}) = \frac{1}{j!} \sum_{k=0}^{n-j} \frac{(-1)^k}{k!}. \]

To verify this, let \(N_m\) denote the number of permutations of the set \(\{1, \ldots, m\}\) in which no integer remains in its original position.

The probability that a random permutation has this property is \(\sum_{k=0}^{m} (-1)^k / k!\). (Why?) It follows that \(N_m / m! = \sum_{k=0}^{m} (-1)^k / k!\).

Thus, from kindergarten math olympiad,

\[ P(\text{exactly } j \text{ persons receive a correct letter}) = \frac{\binom{n}{j} N_{n-j}}{n!}. \]

After simple substitutions of \(N_{n-j} / (n-j)!\) and \(\binom{n}{j}\), the result follows.

This probability tends to the so-called Poisson probability \(e^{-1}/j!\) as the number of envelopes becomes large. We will get there eventually.
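The formula for exactly j correct letters can also be checked against an exhaustive count for a small n. As before, this is a sketch with illustrative function names.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def p_exactly(n, j):
    """(1/j!) * sum_{k=0}^{n-j} (-1)^k / k!, the formula stated above."""
    tail = sum(Fraction((-1) ** k, factorial(k)) for k in range(n - j + 1))
    return tail / factorial(j)


def p_exactly_brute_force(n, j):
    """Fraction of permutations of 0..n-1 with exactly j fixed points."""
    hits = sum(
        1
        for perm in permutations(range(n))
        if sum(perm[i] == i for i in range(n)) == j
    )
    return Fraction(hits, factorial(n))


n = 5
for j in range(n + 1):
    print(j, p_exactly(n, j), p_exactly_brute_force(n, j))
```

Note that the case j = n - 1 gives probability 0, as it must: if all but one letter are correct, the last one is correct too.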


Example 8 – Simulation Approach

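import random

numTrials = 1000
numEnvelopes = 5
count = 0  # trials in which at least one letter lands in its own envelope

a = list(range(numEnvelopes))
for num in range(numTrials):
    # random.sample with k = len(a) returns a uniformly random permutation of a.
    b = random.sample(a, len(a))
    # A match is a position i where letter b[i] sits in envelope a[i].
    match = False
    for i in range(numEnvelopes):
        if a[i] == b[i]:
            match = True
    if match:
        count += 1

prob = count / numTrials
print(prob)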


Example 9 (Example 6 cont’d).

What is the probability that in none of the four matches two teams from the same country play against each other?

Let \(A_i\), \(i = 1, 2, 3, 4\), denote the event that two teams from the same country play against each other in the \(i\)th match. By Rule 2, the desired probability is \(1 - P(A_1 \cup A_2 \cup A_3 \cup A_4)\).

Previously, we already found that

\[ P(A_i) = \frac{5760}{8!} \ \text{for all } i \quad \text{and} \quad P(A_i A_j) = \frac{1152}{8!} \ \text{for all } i < j. \]

We just need to find \(P(A_1 A_2 A_3)\) and \(P(A_1 A_2 A_3 A_4)\).

The event \(A_1 A_2 A_3\) has \(8 \times 1 \times 6 \times 1 \times 4 \times 1 \times 2! = 384\) elements.

The event \(A_1 A_2 A_3 A_4\) has \(8 \times 1 \times 6 \times 1 \times 4 \times 1 \times 2 \times 1 = 384\) elements.

The rest is left as a homework exercise.


Example 10.

Suppose \(n = 10\) married couples are invited to a dance party. Dance partners are chosen at random, without regard to gender. What is the probability that no one will be paired with his or her spouse?

Denote by \(A_i\) the event that couple \(i\) is paired as dance partners.

The sample space consists of all possible permutations of the integers \(1, \ldots, 2n\), where the integers \(2i - 1\) and \(2i\) represent couple \(i\).

We leave it to you as a homework exercise to verify that the complementary probability \(P(A_1 \cup \cdots \cup A_n)\) of at least one couple being paired as dance partners is given by

\[ \sum_{k=1}^{n} (-1)^{k+1} \binom{n}{k} \frac{n \times (n-1) \times \cdots \times (n-k+1) \times 2^k \times (2n-2k)!}{(2n)!}. \]

Setting \(n = 10\) and using a computer algebra system like Maple, we find that this complementary probability has the value 0.4088, so the probability that no one is paired with his or her spouse is \(1 - 0.4088 = 0.5912\).
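A computer algebra system is not strictly needed; the sum can be evaluated exactly with Python's standard library. The sketch below implements the displayed inclusion-exclusion sum as stated; the function name `p_at_least_one_couple` is an illustrative choice.

```python
from fractions import Fraction
from math import comb, factorial, perm


def p_at_least_one_couple(n):
    """Inclusion-exclusion sum for P(A_1 or ... or A_n) with n couples."""
    total = Fraction(0)
    for k in range(1, n + 1):
        # perm(n, k) = n (n - 1) ... (n - k + 1): ordered choice of the k
        # pair positions occupied by the k chosen couples.
        numerator = comb(n, k) * perm(n, k) * 2 ** k * factorial(2 * n - 2 * k)
        total += (-1) ** (k + 1) * Fraction(numerator, factorial(2 * n))
    return total


p_some = p_at_least_one_couple(10)
print(round(float(p_some), 4))      # at least one couple paired: 0.4088
print(round(float(1 - p_some), 4))  # no one paired with a spouse: 0.5912
```

As a sanity check, n = 2 gives 1/3: four people can be split into pairs in three ways, and only one of them pairs the spouses.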
