Bayes Theorem, Independence, and Discrete Random Variables

Bruce A Craig

Department of Statistics, Purdue University

STAT 511 Feb 15 1

Outline

Monty Hall Problem

Independence

Random Variables

Probability Distributions

Monty Hall Problem

Suppose you’re on “Let’s Make a Deal” and you’re given the choice of three curtains: behind one is a new car and behind each of the other two is a goat. You pick curtain #3 and then the host (who knows where the car is) reveals the goat behind curtain #1. He then asks if you’d like to switch to curtain #2 or stay with #3. What should you do?

Question made famous after it was posted in the “Ask Marilyn” column in Parade magazine (1990). She correctly answered the question, but many PhD mathematicians wrote in telling her she was wrong.

Correct answer: Switch to #2

Solution Approach #1

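Interested in comparing the probability that curtain #3 is correct (A) versus the probability that curtain #2 is correct (A′), given that curtain #1 is shown (B). Here A′ stands for "car behind #2", not the full complement of A.

Prior to event B, we know P(A) = P(A′) = 1/3

The remaining case, car behind #1, contributes nothing to P(B) because Monty never reveals the car.

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(A')\,P(B \mid A')} = \frac{\tfrac{1}{3}\cdot\tfrac{1}{2}}{\tfrac{1}{3}\cdot\tfrac{1}{2} + \tfrac{1}{3}\cdot 1} = \frac{1}{3}$$

so

$$P(A' \mid B) = \frac{2}{3}$$

Because Monty only shows a curtain with a goat, the conditional probabilities above differ: P(B|A) = 1/2 but P(B|A′) = 1.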
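
The Bayes calculation above can be checked with exact fractions. This is a minimal sketch, not part of the original slides: the names `prior`, `likelihood`, and `posterior` are illustrative, and it assumes Monty picks at random between #1 and #2 when the car is behind the contestant's curtain #3.

```python
from fractions import Fraction

# Prior: the car is equally likely to be behind each curtain.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# P(B | car location), where B = "Monty opens curtain #1" and the
# contestant has picked #3.
# Assumption: if the car is behind #3, Monty opens #1 or #2 at random.
likelihood = {1: Fraction(0), 2: Fraction(1), 3: Fraction(1, 2)}

def posterior(prior, likelihood):
    """Bayes' theorem over a finite set of mutually exclusive hypotheses."""
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}

post = posterior(prior, likelihood)
print(post[3], post[2])  # stay with #3 vs. switch to #2: 1/3 2/3
```

Listing all three hypotheses makes the zero contribution of "car behind #1" explicit in the denominator.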

Solution Approach #2

Consider the following table of possible scenarios:

Initial choice         Action   Result
Curtain with a goat    Stay     Lose
                       Switch   Win
Curtain with a goat    Stay     Lose
                       Switch   Win
Curtain with the car   Stay     Win
                       Switch   Lose

Win 2/3 of time with switch, 1/3 of time without
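
The 2/3 versus 1/3 split can also be approximated by simulation. The sketch below is illustrative only: the function names `play` and `win_rate`, the seed, and the trial count are assumptions, not part of the slides.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    car = rng.randrange(3)
    choice = rng.randrange(3)
    # Monty opens a curtain that is neither the contestant's pick nor the car.
    opened = rng.choice([c for c in range(3) if c != choice and c != car])
    if switch:
        # Move to the one remaining unopened curtain.
        choice = next(c for c in range(3) if c != choice and c != opened)
    return choice == car

def win_rate(switch, trials=100_000, seed=511):
    """Fraction of simulated rounds won under the given strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print(win_rate(switch=True), win_rate(switch=False))
```

With this many trials the two rates should land close to 2/3 and 1/3, matching the table.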

Independence

If A and B are independent, then P(A|B) = P(A)

This says that the occurrence of one event does not affect the probability of the other event occurring

If A and B are independent, then

P(A ∩ B) = P(A)P(B) (multiplication rule and definition)

A and B′ are independent
A′ and B are independent
A′ and B′ are independent

Independence is a common assumption in statistical tests and probability models

Should not be blindly assumed though

Example: Determining independence

Consider a gas station with six pumps numbered 1, 2, ..., 6, and let Ei denote the event that pump i is in use at a randomly chosen time. Suppose that

P(E1) = P(E6) = .10, P(E2) = P(E5) = .15, P(E3) = P(E4) = .25

Define events A, B , C by

A = {E2,E4,E6}, B = {E1,E2,E3}, C = {E2,E3,E4,E5}

We then have P(A) = .50, P(A|B) = .30, and P(A|C) = .50. That is, events A and B are dependent, whereas events A and C are independent.
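
The three probabilities above can be reproduced directly from the pump probabilities. This is a sketch under one assumption that the arithmetic in the example implies: the Ei are treated as mutually exclusive outcomes whose probabilities sum to 1. The helper names `prob`, `cond`, and `independent` are illustrative.

```python
from fractions import Fraction

# P(E_i) from the example, kept exact with Fraction.
p = {1: Fraction(10, 100), 2: Fraction(15, 100), 3: Fraction(25, 100),
     4: Fraction(25, 100), 5: Fraction(15, 100), 6: Fraction(10, 100)}

def prob(event):
    """P(event), where an event is a set of pump numbers."""
    return sum(p[i] for i in event)

def cond(a, b):
    """P(a | b) = P(a and b) / P(b)."""
    return prob(a & b) / prob(b)

def independent(a, b):
    """Check the product rule P(a and b) == P(a) P(b)."""
    return prob(a & b) == prob(a) * prob(b)

A, B, C = {2, 4, 6}, {1, 2, 3}, {2, 3, 4, 5}
print(float(prob(A)), float(cond(A, B)), float(cond(A, C)))  # 0.5 0.3 0.5
print(independent(A, B), independent(A, C))                  # False True
```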

Example: Using multiplication rule

Suppose I roll seven six-sided dice. What is the probability that I get at least one 6?

For a single die, P(rolling a six) = 1/6

For a single die, P(not rolling a six) = 5/6

Complement of “at least one” is “none”

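$$P(\text{no sixes}) = \tfrac{5}{6}\times\tfrac{5}{6}\times\tfrac{5}{6}\times\tfrac{5}{6}\times\tfrac{5}{6}\times\tfrac{5}{6}\times\tfrac{5}{6} = \left(\tfrac{5}{6}\right)^{7} \approx 0.279$$

$$P(\text{at least one six}) = 1 - P(\text{no sixes}) = 1 - 0.279 = 0.721$$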
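
The complement calculation is easy to verify numerically. In this minimal sketch the function name `p_at_least_one` is illustrative, and the rolls are assumed independent, as in the example.

```python
from fractions import Fraction

def p_at_least_one(p_single, n):
    """P(at least one success in n independent trials), via the complement."""
    return 1 - (1 - p_single) ** n

p = p_at_least_one(Fraction(1, 6), 7)
print(round(float(1 - p), 3), round(float(p), 3))  # 0.279 0.721
```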

Random Variables (RVs)

Definition: For a given sample space S, a random variable (RV) is any rule that associates a number with each outcome in S.

In mathematical language, a random variable is a function whose domain is the sample space and whose range is the set of real numbers.

Conventions:

Upper case letters: X, Y, Z for RVs
Lower case letters: x, y, and z for a specific value of an RV
X(E) = x means that the outcome E is associated with the value x by the RV X.

Example I

When a student calls a university help desk for technical support, he/she will either immediately be able to speak to someone (S) or be placed on hold (F).

With S = {S, F}, define RV X by

X(S) = 1, X(F) = 0

The RV X indicates whether or not the student can immediately speak to someone.

NOTE: Any random variable whose only possible values are 0 and 1 is called a Bernoulli random variable.

Example II

Let’s expand on the earlier gas pump example and consider that there are two six-pump gas stations. Define the following RVs:

X = the total number of pumps in use at the two stations
Y = the difference between the number of pumps in use at station 1 and the number in use at station 2
U = the maximum of the numbers of pumps in use at the two stations

What are the corresponding values when E = (3, 2)?

X = 3 + 2 = 5, Y = 3 − 2 = 1, and U = max(3, 2) = 3
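
Since a random variable is just a function on the sample space, the three RVs above can be written as ordinary functions. This is an illustrative sketch: the outcome is represented as a tuple (pumps in use at station 1, pumps in use at station 2), which is an assumed encoding.

```python
from itertools import product

# Sample space: every pair of counts, each from 0 to 6 pumps in use.
sample_space = list(product(range(7), repeat=2))

# Each random variable maps an outcome to a number.
def X(e):
    return e[0] + e[1]   # total pumps in use

def Y(e):
    return e[0] - e[1]   # station 1 minus station 2

def U(e):
    return max(e)        # larger of the two counts

E = (3, 2)
print(X(E), Y(E), U(E))  # 5 1 3

# The possible values of X are the distinct outputs over the sample space.
print(sorted({X(e) for e in sample_space}))
```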

Types of Random Variables

Definition

A discrete random variable is an rv whose possible values either constitute a finite set or else can be listed in an infinite sequence in which there is a first element, a second element, and so on (“countably” infinite).

Definition

A random variable is continuous if both of the following apply:

1. Its set of possible values consists either of all numbers in a single interval or all numbers in a disjoint union of such intervals (e.g., [0, 10] ∪ [20, 30]).

2. No possible value of the variable has positive probability, that is, P(X = c) = 0 for any possible value c.

Probability Distribution

Describes a discrete random variable
Associates each numeric value with a probability
Describes how the total probability of 1 is distributed among the possible numeric values

P(X = x) = P(all s ∈ S : X(s) = x)

Example: Roll a fair die
X = numeric value on face of die
x = 1, 2, 3, 4, 5, 6
The probability distribution is P(X = x) = 1/6

Example: A business has just purchased four laser printers; let X be the number among these that require service during the warranty period.

x = 0, 1, 2, 3, 4

Distribution could be

$$P(X = x) = \begin{cases} 1/4 & x = 0, 1, 2 \\ 1/8 & x = 3, 4 \end{cases}$$

Definition

The probability distribution or probability mass function (pmf) of a discrete rv is defined for every number x by p(x) = P(X = x) = P(all s ∈ S : X(s) = x).

For every possible value x of the random variable, the pmf specifies the probability of observing that value when the experiment is performed.

The conditions p(x) ≥ 0 and ∑ p(x) = 1 (sum over all possible x) are required of any pmf.
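
The two pmf conditions can be checked mechanically for any finite table of probabilities. In this sketch the function name `is_valid_pmf` and the tolerance are assumptions. The fair-die and laser-printer pmfs come from the earlier slide, and `bad` is a made-up counterexample.

```python
import math
from fractions import Fraction

def is_valid_pmf(pmf, tol=1e-9):
    """True if every p(x) >= 0 and the probabilities sum to 1."""
    nonnegative = all(p >= 0 for p in pmf.values())
    total = float(sum(pmf.values()))
    return nonnegative and math.isclose(total, 1.0, abs_tol=tol)

die = {x: Fraction(1, 6) for x in range(1, 7)}
printers = {0: 1 / 4, 1: 1 / 4, 2: 1 / 4, 3: 1 / 8, 4: 1 / 8}
bad = {0: 0.5, 1: 0.6}  # hypothetical table that sums to 1.1

print(is_valid_pmf(die), is_valid_pmf(printers), is_valid_pmf(bad))
```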

Graphical Representation

Will often display distribution graphically

[Figure: (left) pmf for the roll of a fair die; (right) pmf for the number of printers needing repair]

Example: Probability Distribution

The Cal Poly Department of Statistics has a lab with six computers reserved for statistics majors. Let X denote the number of these computers that are in use at a particular time of day.

Suppose that the probability distribution of X is as given in the following table; the first row of the table lists the possible X values and the second row gives the probability of each such value.

x      0    1    2    3    4    5    6
p(x)  .05  .10  .15  .25  .20  .15  .10

What is the probability that at most 2 computers are in use?

P(X ≤ 2) = p(0) + p(1) + p(2) = 0.30

What is the probability that at least 3 computers are in use?

P(X ≥ 3) = 1 − 0.30 = 0.70, or p(3) + p(4) + p(5) + p(6) = 0.70

What is the probability that between 2 and 5 computers (inclusive) are in use?

P(2 ≤ X ≤ 5) = p(2) + p(3) + p(4) + p(5) = 0.75
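
Each of these answers is a sum of pmf values over the x values that satisfy a condition. A minimal sketch, with the helper name `prob` chosen for illustration and "between 2 and 5" read as inclusive:

```python
# pmf of X = number of lab computers in use, from the table.
pmf = {0: 0.05, 1: 0.10, 2: 0.15, 3: 0.25, 4: 0.20, 5: 0.15, 6: 0.10}

def prob(pmf, condition):
    """P(condition holds): add p(x) over the values x that satisfy it."""
    return sum(p for x, p in pmf.items() if condition(x))

at_most_2 = prob(pmf, lambda x: x <= 2)
at_least_3 = prob(pmf, lambda x: x >= 3)
between_2_and_5 = prob(pmf, lambda x: 2 <= x <= 5)

print(round(at_most_2, 2), round(at_least_3, 2), round(between_2_and_5, 2))
```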

Another PMF Example

Y = # of rolls until a 6 is obtained
The outcomes are y = 1, 2, 3, 4, ...
To determine the distribution, consider the sequence of rolls and use the product rule/independence

P(Y = 1) = P(roll a six) = 1/6

P(Y = 2) = P(don’t roll a six)P(roll a six) = 5/6 × 1/6

P(Y = 3) = P(don’t roll a six)P(don’t roll a six)P(roll a six) = 5/6 × 5/6 × 1/6

...

$$P(Y = y) = \left(\frac{5}{6}\right)^{y-1} \times \frac{1}{6}$$

PMF Parameter

In the previous example, we assumed the die is fair, so p(6) = 1/6

Suppose instead that p(6) = α where 0 < α < 1

The pmf (family form) of this RV is then

$$P(Y = y) = \alpha(1 - \alpha)^{y-1} \quad \text{for } y = 1, 2, 3, \ldots$$

A different α results in a different probability distribution

The α is called a parameter

This distribution family is called a geometric distribution
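
The role of the parameter is easy to see when the pmf is written as a function of both y and α. This is a hedged sketch: the function name `geometric_pmf` is illustrative, and it follows the slide's convention that Y counts rolls up to and including the first six.

```python
def geometric_pmf(y, alpha):
    """P(Y = y) = alpha * (1 - alpha)**(y - 1) for y = 1, 2, 3, ..."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if y < 1 or y != int(y):
        return 0.0  # values outside 1, 2, 3, ... have probability 0
    return alpha * (1 - alpha) ** (y - 1)

# Fair die (alpha = 1/6): matches 1/6, 5/36, 25/216 from the previous slide.
print([round(geometric_pmf(y, 1 / 6), 4) for y in (1, 2, 3)])

# A different alpha gives a different member of the same family.
print([round(geometric_pmf(y, 0.5), 4) for y in (1, 2, 3)])

# Partial sums over y = 1..n equal 1 - (1 - alpha)**n, which tends to 1.
print(sum(geometric_pmf(y, 1 / 6) for y in range(1, 201)))
```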

Bernoulli Distribution

Suppose that 20% of the customers coming to your computer store buy a desktop computer. Let X indicate whether a customer buys a desktop computer.

The pmf of this Bernoulli rv X is p(0) = .8 and p(1) = .2

At another store, it may be the case that p(0) = .9 and p(1) = .1.

More generally, the pmf of any Bernoulli rv can be expressed in the form p(1) = α and p(0) = 1 − α, where 0 < α < 1. Because the pmf depends on the particular value of α we often write p(x; α) rather than just p(x):

$$p(x; \alpha) = \begin{cases} 1 - \alpha & \text{if } x = 0 \\ \alpha & \text{if } x = 1 \\ 0 & \text{otherwise} \end{cases}$$
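
The piecewise definition translates line by line into code. A small sketch, with `bernoulli_pmf` as an illustrative name and the two α values taken from the store examples above:

```python
def bernoulli_pmf(x, alpha):
    """p(x; alpha): alpha at x = 1, 1 - alpha at x = 0, and 0 otherwise."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if x == 1:
        return alpha
    if x == 0:
        return 1 - alpha
    return 0.0

for store, alpha in [("your store", 0.2), ("another store", 0.1)]:
    print(store, bernoulli_pmf(0, alpha), bernoulli_pmf(1, alpha))
```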

The Cumulative Distribution Function

For some fixed value x, we often wish to compute the probability that the observed value of X will be at most x.

$$p(x) = \begin{cases} 0.500 & x = 0 \\ 0.167 & x = 1 \\ 0.333 & x = 2 \\ 0 & \text{otherwise} \end{cases}$$

The probability that X is at most 1 is then

P(X ≤ 1) = p(0) + p(1) = .500 + .167 = .667

What about P(X ≤ 1.5), P(X ≤ 0), P(X ≤ 2), P(X ≤ 3.7), and P(X ≤ 20.5)?

Note that P(X < x) ≤ P(X ≤ x).

The Cumulative Distribution Function

Definition

The cumulative distribution function (cdf) F(x) of a discrete rv X with pmf p(x) is defined for every number x by

$$F(x) = P(X \le x) = \sum_{y \,:\, y \le x} p(y)$$

For any number x, F(x) is the probability that the observed value of X will be at most x.
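
The definition is a direct sum, so it can be evaluated at any real x, not only at the possible values of X. The sketch below applies it to the three-value pmf from the preceding slide; the function name `cdf` is illustrative.

```python
# pmf from the preceding slide: p(0) = .500, p(1) = .167, p(2) = .333.
pmf = {0: 0.500, 1: 0.167, 2: 0.333}

def cdf(pmf, x):
    """F(x) = P(X <= x): add p(y) over all possible values y <= x."""
    return sum(p for y, p in pmf.items() if y <= x)

for x in (0, 1, 1.5, 2, 3.7, 20.5):
    print(x, round(cdf(pmf, x), 3))
```

Note that F(1.5) equals F(1) and F(3.7) equals F(2): the cdf changes only at possible values of X.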

Example

A store carries flash drives with either 1 GB, 2 GB, 4 GB, 8 GB, or 16 GB of memory. The accompanying table gives the distribution of Y = the amount of memory in a purchased drive:

y      1    2    4    8    16
p(y)  .05  .10  .35  .40  .10

Calculate F(4), F(8), and F(16). What about F(2.7) and F(7.999)?

Example

For any number y, F(y) will equal the value of F at the closest possible value of Y to the left of y. The cumulative distribution function in this example is

$$F(y) = \begin{cases} 0 & y < 1 \\ .05 & 1 \le y < 2 \\ .15 & 2 \le y < 4 \\ .50 & 4 \le y < 8 \\ .90 & 8 \le y < 16 \\ 1 & 16 \le y \end{cases}$$
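
The step-function form suggests a lookup: precompute the running totals, then find the largest possible value not exceeding y. This is one possible sketch using the standard-library `bisect` and `itertools.accumulate`; the names `values`, `cum`, and `F` are illustrative.

```python
from bisect import bisect_right
from itertools import accumulate

values = [1, 2, 4, 8, 16]                # possible memory sizes (GB), sorted
probs = [0.05, 0.10, 0.35, 0.40, 0.10]   # p(y) from the table
cum = list(accumulate(probs))            # F(1), F(2), F(4), F(8), F(16)

def F(y):
    """Step-function cdf: F at the largest possible value <= y, else 0."""
    k = bisect_right(values, y)          # how many possible values are <= y
    return 0.0 if k == 0 else cum[k - 1]

for y in (0.5, 2.7, 4, 7.999, 8, 16):
    print(y, round(F(y), 2))
```

The printed values should agree with the piecewise cdf: F(2.7) = .15 and F(7.999) = .50, since neither 2.7 nor 7.999 has reached the next jump.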

Graphical Representation
