Probability, Bayes’ Theorem and the Monty Hall Problem.

PSYC 6130, PROF. J. ELDER 2

Probability Distributions

• A random variable is a variable whose value is uncertain.

• For example, the height of a randomly selected person in this class is a random variable – I won’t know its value until the person is selected.

• Note that we are not completely uncertain about most random variables.

– For example, we know that height will probably be in the 5’-6’ range.

– In addition, 5’6” is more likely than 5’0” or 6’0” (for women).

• The function that describes the probability of each possible value of the random variable is called a probability distribution.

Probability Distributions

• Probability distributions are closely related to frequency distributions.

Probability Distributions

• Dividing each frequency by the total number of scores and multiplying by 100 yields a percentage distribution.

Probability Distributions

• Dividing each frequency by the total number of scores yields a probability distribution.

Probability Distributions

• For a discrete distribution, the probabilities over all possible values of the random variable must sum to 1.
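As a minimal sketch of the normalizations described above, the following Python snippet turns a frequency table into a percentage distribution and a probability distribution, then checks that the probabilities sum to 1. The category names and counts are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical frequency table (made-up counts, not survey data).
frequencies = {"A": 12, "B": 30, "C": 18}

total = sum(frequencies.values())

# Percentage distribution: frequency / total * 100.
percentages = {k: 100 * f / total for k, f in frequencies.items()}

# Probability distribution: frequency / total.
probabilities = {k: f / total for k, f in frequencies.items()}

# For a discrete distribution the probabilities must sum to 1
# (up to floating-point rounding).
assert math.isclose(sum(probabilities.values()), 1.0)
print(percentages)
print(probabilities)
```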

Probability Distributions

• For a discrete distribution, we can talk about the probability of a particular score occurring, e.g., p(Province = Ontario) = 0.36.

• We can also talk about the probability of any one of a subset of scores occurring, e.g., p(Province = Ontario or Quebec) = 0.50.

• In general, we refer to these occurrences as events.
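One way to picture an event is as a set of mutually exclusive outcomes whose probabilities are added. The sketch below uses p(Ontario) = 0.36 from the slide; p(Quebec) = 0.14 is inferred from p(Ontario or Quebec) = 0.50, and the "Other" category is a hypothetical lump for all remaining provinces. The helper name `p_event` is illustrative.

```python
import math

# p(Ontario) = 0.36 is from the slide; p(Quebec) = 0.14 is implied by
# p(Ontario or Quebec) = 0.50. "Other" lumps all remaining provinces
# together (a simplification made for this sketch).
p_province = {"Ontario": 0.36, "Quebec": 0.14, "Other": 0.50}

def p_event(event, dist):
    """Probability of an event, i.e. a set of mutually exclusive outcomes."""
    return sum(dist[outcome] for outcome in event)

print(p_event({"Ontario"}, p_province))
print(p_event({"Ontario", "Quebec"}, p_province))

# The event containing every outcome is certain.
assert math.isclose(p_event(set(p_province), p_province), 1.0)
```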

Probability Distributions

• For a continuous distribution, the probabilities over all possible values of the random variable must integrate to 1 (i.e., the area under the curve must be 1).

• Note that the height of a continuous distribution can exceed 1!

[Figure: three panels, with shaded area = 0.683, 0.954 and 0.997 respectively]
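The three shaded areas match the probability that a normally distributed variable falls within 1, 2 and 3 standard deviations of its mean. The sketch below reproduces them with Python's standard library, and also illustrates that a density value can exceed 1; the narrow distribution with sigma = 0.1 is an arbitrary choice made for this illustration.

```python
from statistics import NormalDist

standard = NormalDist(mu=0.0, sigma=1.0)

# Area under the standard normal curve within k standard deviations of the mean.
for k in (1, 2, 3):
    area = standard.cdf(k) - standard.cdf(-k)
    print(f"within {k} sd: {area:.3f}")

# A density value is not a probability, so it can exceed 1:
# a narrow normal (sigma = 0.1) has a peak height of about 3.99,
# yet the area under its curve is still 1.
narrow = NormalDist(mu=0.0, sigma=0.1)
print(narrow.pdf(0.0))
```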

Continuous Distributions

• For continuous distributions, it does not make sense to talk about the probability of an exact score.

– e.g., what is the probability that your height is exactly 65.485948467… inches?

[Figure: Normal approximation to the probability distribution for height of Canadian females (parameters from General Social Survey, 1991). Horizontal axis: Height (in), 55 to 75. Vertical axis: Probability p, 0 to 0.16. Annotations: 5'3.8" and 2.6" (presumably the mean and standard deviation), and "?".]

Continuous Distributions

• It does make sense to talk about the probability of observing a score that falls within a certain range.

– e.g., what is the probability that you are between 5’3” and 5’7”?

– e.g., what is the probability that you are less than 5’10”?

[Figure: Normal approximation to the probability distribution for height of Canadian females (parameters from General Social Survey, 1991). Horizontal axis: Height (in), 55 to 75. Vertical axis: Probability p, 0 to 0.16. Annotations: 5'3.8" and 2.6" (presumably the mean and standard deviation). Shaded ranges are labelled "Valid events".]
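The two range questions above can be answered as areas under the curve. This is a rough sketch that assumes the figure's annotations give a mean of 5'3.8" (63.8 in) and a standard deviation of 2.6 in, and that height is exactly normal; both are assumptions of this illustration.

```python
from statistics import NormalDist

# Assumed parameters, read from the figure annotations: mean 63.8 in, sd 2.6 in.
height = NormalDist(mu=63.8, sigma=2.6)

# p(5'3" < height < 5'7"), i.e. between 63 and 67 inches.
p_between = height.cdf(67) - height.cdf(63)

# p(height < 5'10"), i.e. below 70 inches.
p_below = height.cdf(70)

print(f"p(63 in < height < 67 in) = {p_between:.3f}")
print(f"p(height < 70 in)         = {p_below:.3f}")

# An exact score corresponds to a range of zero width, which has zero area.
print(height.cdf(65.485948467) - height.cdf(65.485948467))
```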

Probability of Combined Events

Let $p(A)$ represent the probability of event $A$.

$$0 \le p(A) \le 1$$

If $A$ and $B$ are disjoint (mutually exclusive) events, then

$$p(A \text{ or } B) = p(A) + p(B)$$

Example: in the context of the Community Health Survey:

Let $A$ represent the event that the respondent lives in Alberta.

Let $B$ represent the event that the respondent lives in BC.

Then

$$p(A) = 0.087, \quad p(B) = 0.106, \quad p(A \text{ or } B) = 0.193$$

Probability of Combined Events

More generally, if $A$ and $B$ are not mutually exclusive,

$$p(A \text{ or } B) = p(A) + p(B) - p(A \text{ and } B)$$

Example: Canadian Community Health Survey, Sleeping Habits

Let $A$ = event that respondent sleeps less than 6 hours per night.

Let $B$ = event that respondent reports trouble sleeping most or all of the time.

$$p(A) = 0.139, \quad p(B) = 0.152, \quad p(A \text{ and } B) = 0.061$$

Thus

$$p(A \text{ or } B) = 0.139 + 0.152 - 0.061 = 0.230$$
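The addition rule can be checked numerically against both survey examples. In this small sketch the function name `p_or` is illustrative; the overlap term defaults to 0, which recovers the rule for disjoint events.

```python
def p_or(p_a, p_b, p_a_and_b=0.0):
    """p(A or B) by the addition rule; p_a_and_b is 0 for disjoint events."""
    return p_a + p_b - p_a_and_b

# Sleeping habits example: the events overlap, so p(A and B) is subtracted
# to avoid counting respondents in both groups twice.
print(round(p_or(0.139, 0.152, 0.061), 3))  # 0.23

# Provinces example: living in Alberta and living in BC are disjoint.
print(round(p_or(0.087, 0.106), 3))  # 0.193
```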

Exhaustive Events

• Two or more events are said to be exhaustive if at least one of them must occur.

• For example, if A is the event that the respondent sleeps less than 6 hours per night and B is the event that the respondent sleeps at least 6 hours per night, then A and B are exhaustive.

• (Although A is probably the more exhausted!!)

Independence

Two events are independent if the occurrence of one in no way affects the probability of the other.

If events $A$ and $B$ are independent, then

$$p(A \text{ and } B) = p(A)\,p(B)$$

If events $A$ and $B$ are not independent, then

$$p(A \text{ and } B) = p(A)\,p(B \mid A)$$

Example: pick a card, any card.
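The slide does not spell out the card example, so the following is one possible reading rather than the lecturer's own. It enumerates a standard 52-card deck to show that the rank and suit of a single card are independent, and then applies the product rule for dependent events to drawing two aces without replacement.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))  # 52 equally likely cards

def p(event):
    """Probability that one randomly drawn card satisfies `event`."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_ace(card):
    return card[0] == "A"

def is_heart(card):
    return card[1] == "hearts"

# Independent: knowing the suit says nothing about the rank.
p_ace_and_heart = p(lambda c: is_ace(c) and is_heart(c))
assert p_ace_and_heart == p(is_ace) * p(is_heart)  # 1/52 == 1/13 * 1/4

# Not independent: two aces in a row without replacement.
# p(A and B) = p(A) * p(B | A) = 4/52 * 3/51
p_two_aces = p(is_ace) * Fraction(3, 51)
print(p_ace_and_heart, p_two_aces)  # 1/52 1/221
```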

An Example: The Monty Hall Problem

Problem History

• When the problem first appeared in Parade, approximately 10,000 readers, including 1,000 PhDs, wrote claiming the solution was wrong.

• In a study of 228 subjects, only 13% chose to switch.

Intuition

• Before Monty opens any doors, there is a 1/3 probability that the car lies behind the door you selected (Door 1), and a 2/3 probability it lies behind one of the other two doors.

• Thus with 2/3 probability, Monty will be forced to open a specific door (e.g., the car lies behind Door 2, so Monty must open Door 3).

• This concentrates all of the 2/3 probability in the remaining door (e.g., Door 2).
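A quick Monte Carlo check of this intuition, as a sketch under the usual assumptions: the car is placed uniformly at random, Monty always opens a goat door other than the player's pick, and he chooses at random when he has two options. The function names, seed and number of games are arbitrary choices for this illustration.

```python
import random

def play(switch, rng):
    """Play one game under the standard rules; return True if the player wins the car."""
    doors = [1, 2, 3]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Monty opens a door that is neither the player's pick nor the car.
    monty = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != monty)
    return pick == car

def win_rate(switch, n_games=100_000, seed=0):
    """Fraction of simulated games won with the given strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(n_games)) / n_games

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

With enough games, the switching strategy should win close to 2/3 of the time and the staying strategy close to 1/3.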

Analysis

Player initially picks Door 1.

• Car hidden behind Door 1: host opens either Door 2 or 3.

– Host opens Door 2: switching loses with probability 1/6.

– Host opens Door 3: switching loses with probability 1/6.

• Car hidden behind Door 2: host must open Door 3. Switching wins with probability 1/3.

• Car hidden behind Door 3: host must open Door 2. Switching wins with probability 1/3.

• Overall: switching wins with probability 2/3 and loses with probability 1/3.
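The branches of this analysis can also be enumerated exactly. The sketch below walks the same tree with exact fractions, assuming (as the 1/6 branches imply) that the host chooses between Doors 2 and 3 with equal probability when the car is behind Door 1.

```python
from fractions import Fraction

pick = 1  # the player initially picks Door 1, as on the slide
doors = (1, 2, 3)

p_switch_wins = Fraction(0)
p_switch_loses = Fraction(0)

for car in doors:
    p_car = Fraction(1, 3)  # the car is equally likely to be behind each door
    # Doors the host may open: not the player's pick, not the car.
    options = [d for d in doors if d != pick and d != car]
    for opened in options:
        # Assumption: the host chooses uniformly at random when he has a choice.
        p_branch = p_car * Fraction(1, len(options))
        switched_to = next(d for d in doors if d not in (pick, opened))
        if switched_to == car:
            p_switch_wins += p_branch
        else:
            p_switch_loses += p_branch

print(p_switch_wins, p_switch_loses)  # 2/3 1/3
```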

Notes

• It is important that

– Monty must open a door that reveals a goat

– Monty cannot open the door you selected

• These rules mean that your choice may constrain what Monty does.

– If you initially selected a door concealing a goat, then there is only one door Monty can open.

• One can rigorously account for the Monty Hall problem using a Bayesian analysis

End of Lecture 2

Sept 17, 2008

Conditional Probability

• To understand Bayesian inference, we first need to understand the concept of conditional probability.

• What is the probability I will roll a 12 with a pair of (fair) dice?

• What if I first roll one die and get a 6? What now is the probability that when I roll the second die they will sum to 12?

Let $A$ be the state of die 1

Let $B$ be the state of die 2

Let $C$ be the sum of die 1 and 2

$p(A = 6) =$ __?

$p(B = 6) =$ __?

$p(C = 12) =$ __?

$p(C = 12 \mid A = 6) =$ __?

“Probability of C given A”
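One way to fill in the blanks is to count equally likely outcomes. The sketch below enumerates all 36 rolls of two fair dice; the helper `p` and its `given` argument are illustrative names, and conditioning is done by restricting the count to outcomes where the condition holds.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (a, b) for die 1 and die 2.
outcomes = list(product(range(1, 7), repeat=2))

def p(event, given=lambda o: True):
    """p(event | given), computed by counting equally likely outcomes."""
    conditioned = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in conditioned if event(o)), len(conditioned))

print(p(lambda o: o[0] == 6))                          # p(A = 6)          -> 1/6
print(p(lambda o: o[1] == 6))                          # p(B = 6)          -> 1/6
print(p(lambda o: sum(o) == 12))                       # p(C = 12)         -> 1/36
print(p(lambda o: sum(o) == 12, lambda o: o[0] == 6))  # p(C = 12 | A = 6) -> 1/6
```

Knowing that die 1 shows a 6 raises the probability of a total of 12 from 1/36 to 1/6, so C and A are not independent.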

Conditional Probability

• The conditional probability of A given B is the joint probability of A and B, divided by the marginal probability of B:

$$p(A \mid B) = \frac{p(A, B)}{p(B)}$$

• Thus if A and B are statistically independent,

$$p(A \mid B) = \frac{p(A, B)}{p(B)} = \frac{p(A)\,p(B)}{p(B)} = p(A).$$

• However, if A and B are statistically dependent, then

$$p(A \mid B) \ne p(A).$$

Bayes’ Theorem

• Bayes’ Theorem is simply a consequence of the definition of conditional probabilities:

$$p(A \mid B) = \frac{p(A, B)}{p(B)} \;\Rightarrow\; p(A, B) = p(A \mid B)\,p(B)$$

$$p(B \mid A) = \frac{p(A, B)}{p(A)} \;\Rightarrow\; p(A, B) = p(B \mid A)\,p(A)$$

Thus $p(A \mid B)\,p(B) = p(B \mid A)\,p(A)$, and so

$$p(A \mid B) = \frac{p(B \mid A)\,p(A)}{p(B)} \qquad \text{(Bayes’ Equation)}$$

Bayes’ Theorem

• Bayes’ theorem is most commonly used to estimate the state of a hidden, causal variable H based on the measured state of an observable variable D:

$$p(H \mid D) = \frac{p(D \mid H)\,p(H)}{p(D)}$$

Posterior: $p(H \mid D)$. Likelihood: $p(D \mid H)$. Prior: $p(H)$. Evidence: $p(D)$.

Bayesian Inference

• Whereas the posterior p(H|D) is often difficult to estimate directly, reasonable models of the likelihood p(D|H) can often be formed. This is typically because H causes D.

• Thus Bayes’ theorem provides a means for estimating the posterior probability of the causal variable H based on observations D.

Marginalizing

• To calculate the evidence p(D) in Bayes’ equation, we typically have to marginalize over all possible states of the causal variable H.

$$p(H \mid D) = \frac{p(D \mid H)\,p(H)}{p(D)}$$

$$
\begin{aligned}
p(D) &= p(D, H_1) + p(D, H_2) + \cdots + p(D, H_n) \\
     &= p(D \mid H_1)\,p(H_1) + p(D \mid H_2)\,p(H_2) + \cdots + p(D \mid H_n)\,p(H_n)
\end{aligned}
$$
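Bayes' equation with a marginalized evidence term can be sketched as a small function. The function name `posterior` and the two-hypothesis numbers are hypothetical, chosen only to show the mechanics; exact fractions are used so the results are easy to verify by hand.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Return p(H_i | D) for every hypothesis H_i.

    prior:      dict mapping each hypothesis H_i to p(H_i)
    likelihood: dict mapping each hypothesis H_i to p(D | H_i)
    """
    # Evidence: p(D) = sum over i of p(D | H_i) p(H_i)
    evidence = sum(likelihood[h] * prior[h] for h in prior)
    return {h: likelihood[h] * prior[h] / evidence for h in prior}

# Hypothetical two-hypothesis example with made-up numbers.
prior = {"H1": Fraction(1, 4), "H2": Fraction(3, 4)}
likelihood = {"H1": Fraction(1, 2), "H2": Fraction(1, 10)}

result = posterior(prior, likelihood)
print(result["H1"], result["H2"])  # 5/8 3/8
```

Because the evidence is the sum of the numerators over all hypotheses, the posteriors always sum to 1.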

The Full Monty

• Let’s get back to The Monty Hall Problem.

• Let’s assume you initially select Door 1.

• Suppose that Monty then opens Door 2 to reveal a goat.

• We want to calculate the posterior probability that a car lies behind Door 1 after Monty has provided these new data.

The Full Monty

Let $C_i$ represent the state that the car lies behind Door $i$, $i \in \{1, 2, 3\}$.

Let $M_i$ represent the event that Monty opens Door $i$, $i \in \{1, 2, 3\}$, revealing a goat.

We seek

$$p(C_1 \mid M_2) = \frac{p(M_2 \mid C_1)\,p(C_1)}{p(M_2)}$$

The Full Monty

Since $p(C_2 \mid M_2) = 0$, we can obtain $p(C_3 \mid M_2)$ by subtracting $p(C_1 \mid M_2)$ from 1.

(Remember that the probabilities of mutually exclusive and exhaustive events add to 1!)

However, we can also calculate $p(C_3 \mid M_2)$ directly:

$$p(C_3 \mid M_2) = \frac{p(M_2 \mid C_3)\,p(C_3)}{p(M_2)}$$
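The posteriors can be worked through numerically. This sketch assumes you picked Door 1, that the car is equally likely to be behind each door beforehand, and that Monty chooses between Doors 2 and 3 with equal probability when both hide goats; the variable names are illustrative.

```python
from fractions import Fraction

# Prior p(Ci): the car is equally likely to be behind each door.
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Likelihood p(M2 | Ci), given that you picked Door 1:
#   car behind Door 1: Monty opens Door 2 or 3 (assumed equally likely) -> 1/2
#   car behind Door 2: Monty never reveals the car                      -> 0
#   car behind Door 3: Monty is forced to open Door 2                   -> 1
likelihood = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(1)}

# Evidence p(M2), marginalizing over the car's location.
p_m2 = sum(likelihood[i] * prior[i] for i in prior)

# Posterior p(Ci | M2) by Bayes' theorem.
posterior = {i: likelihood[i] * prior[i] / p_m2 for i in prior}
for i, p in posterior.items():
    print(f"p(C{i} | M2) = {p}")
```

Under these assumptions the evidence is p(M2) = 1/2, the posterior for Door 1 stays at 1/3, and the posterior for Door 3 rises to 2/3, in agreement with the subtraction argument.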

But we’re not on Let’s Make a Deal!

• Why is the Monty Hall Problem Interesting?

– It reveals limitations in human cognitive processing of uncertainty

– It provides a good illustration of many concepts of probability

– It gets us to think more carefully about how we deal with and express uncertainty as scientists.

• What else is Bayes’ theorem good for?

Clinical Example

• Christiansen et al. (2000) studied the mammogram results of 2,227 women at health centers of Harvard Pilgrim Health Care, a large HMO in the Boston metropolitan area.

• The women received a total of 9,747 mammograms over 10 years. Their ages ranged from 40 to 80. Ninety-three different radiologists read the mammograms, and overall they diagnosed 634 mammograms as suspicious that turned out to be false positives.

• This is a false positive rate of 6.5%.

• The false negative rate has been estimated at 10%.

Clinical Example

• There are about 58,500,000 women between the ages of 40 and 80 in the US

• The incidence of breast cancer in the US is about 184,200 per year, i.e., roughly 1 in 318.

Clinical Example

Let $C_0$ represent the absence of cancer.

Let $C_1$ represent the presence of cancer.

Let $M_0$ represent a negative mammogram result.

Let $M_1$ represent a positive mammogram result.

Remember: $p(C_1 \mid M_1) \ne p(M_1 \mid C_1)$!

Suppose your friend receives a positive mammogram result.

What quantity do you want to compute?
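The quantity of interest is presumably the posterior p(C1 | M1), the probability of cancer given a positive mammogram. The sketch below is a rough calculation using the figures quoted on the previous slides. It makes two simplifying assumptions that the slides do not state: the annual incidence of 1 in 318 is used as the prior p(C1), and the 6.5% false positive rate is treated as p(M1 | C0).

```python
p_cancer = 1 / 318           # prior p(C1), from the incidence figure (assumed prior)
p_pos_given_healthy = 0.065  # false positive rate, treated as p(M1 | C0)
p_neg_given_cancer = 0.10    # false negative rate p(M0 | C1)
p_pos_given_cancer = 1 - p_neg_given_cancer  # p(M1 | C1)

# Evidence p(M1), marginalizing over cancer status.
p_pos = p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer)

# Posterior p(C1 | M1) by Bayes' theorem.
p_cancer_given_pos = p_pos_given_cancer * p_cancer / p_pos
print(f"p(C1 | M1) = {p_cancer_given_pos:.3f}")
```

Under these assumptions the posterior comes out at roughly 4%, far below p(M1 | C1) = 90%, which is the point of the reminder above.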