Chapter 6: Probability Distributions


Page 1: Chapter 6: Probability Distributions

1

Chapter 6: Probability Distributions

Section 6.1: How Can We Summarize Possible Outcomes and Their

Probabilities?

Page 2: Chapter 6: Probability Distributions

2

Learning Objectives

1. Random variable

2. Probability distributions for discrete random variables

3. Mean of a probability distribution

4. Summarizing the spread of a probability distribution

5. Probability distribution for continuous random variables

Page 3: Chapter 6: Probability Distributions

3

Learning Objective 1:Randomness

The numerical values that a variable assumes are the result of some random phenomenon:

Selecting a random sample from a population, or

Performing a randomized experiment

Page 4: Chapter 6: Probability Distributions

4

Learning Objective 1:Random Variable

A random variable is a numerical measurement of the outcome of a random phenomenon.

Page 5: Chapter 6: Probability Distributions

5

Learning Objective 1:Random Variable

Use letters near the end of the alphabet, such as x, to symbolize variables and their particular values.

Use a capital letter, such as X, to refer to the random variable itself.

Use a lowercase letter, such as x, to refer to a particular value of the random variable.

Example: Flip a coin three times.

X = number of heads in the 3 flips; this defines the random variable.

x = 2 represents a possible value of the random variable.

Page 6: Chapter 6: Probability Distributions

6

Learning Objective 2:Probability Distribution

The probability distribution of a random variable specifies its possible values and their probabilities.

Note: It is the randomness of the variable that allows us to specify probabilities for the outcomes

Page 7: Chapter 6: Probability Distributions

7

Learning Objective 2:Probability Distribution of a Discrete Random Variable

A discrete random variable X has separate values (such as 0,1,2,…) as its possible outcomes

Its probability distribution assigns a probability P(x) to each possible value x:

For each x, the probability P(x) falls between 0 and 1.

The sum of the probabilities for all the possible x values equals 1.

Page 8: Chapter 6: Probability Distributions

8

Learning Objective 2:Example

What is the estimated probability of at least three home runs?

P(3) + P(4) + P(5) = 0.13 + 0.03 + 0.01 = 0.17

Page 9: Chapter 6: Probability Distributions

9

Learning Objective 3:The Mean of a Discrete Probability Distribution

The mean of a probability distribution for a discrete random variable is

$\mu = \sum x \, P(x)$

where the sum is taken over all possible values of x.

The mean of a probability distribution is denoted by the parameter, µ.

The mean is a weighted average; values of x that are more likely receive greater weight P(x)

Page 10: Chapter 6: Probability Distributions

10

Learning Objective 3:Expected Value of X

The mean of a probability distribution of a random variable X is also called the expected value of X.

The expected value reflects not what we’ll observe in a single observation, but rather what we expect for the average in a long run of observations.

It is not unusual for the expected value of a random variable to equal a number that is NOT a possible outcome.

Page 11: Chapter 6: Probability Distributions

11

Learning Objective 3:Example

Find the mean of this probability distribution.

The mean:

$\mu = \sum x \, P(x) = 0(0.23) + 1(0.38) + 2(0.22) + 3(0.13) + 4(0.03) + 5(0.01) = 1.38$
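The two example calculations (the probability of at least three home runs, and the mean) can be checked with a short script. This is an illustrative sketch rather than part of the original slides: it assumes the distribution is the one implied by the terms of the sum in the mean calculation, and the variable names are made up for the example.

```python
# Home-run distribution, read off the terms of the mean calculation
# (assumed: P(0)=0.23, P(1)=0.38, ..., P(5)=0.01).
dist = {0: 0.23, 1: 0.38, 2: 0.22, 3: 0.13, 4: 0.03, 5: 0.01}

# A valid discrete distribution: each P(x) is in [0, 1] and they sum to 1.
assert all(0 <= p <= 1 for p in dist.values())
assert abs(sum(dist.values()) - 1) < 1e-9

# P(at least three home runs) = P(3) + P(4) + P(5)
p_at_least_3 = sum(p for x, p in dist.items() if x >= 3)

# Mean (expected value): each value weighted by its probability.
mean = sum(x * p for x, p in dist.items())

print(round(p_at_least_3, 2))  # 0.17
print(round(mean, 2))          # 1.38
```

Note that the mean, 1.38, is not itself a possible number of home runs, which illustrates the earlier remark about expected values.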

Page 12: Chapter 6: Probability Distributions

12

Learning Objective 4:The Standard Deviation of a Probability Distribution

The standard deviation of a probability distribution, denoted by the parameter σ, measures its spread.

Larger values of σ correspond to greater spread.

Roughly, σ describes how far the random variable falls, on the average, from the mean of its distribution.
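The slides describe σ only informally. As a hedged illustration, the sketch below uses the standard definition for a discrete distribution, σ = sqrt(Σ (x − µ)² P(x)), which is an addition and not stated on the slides. It is applied to the home-run distribution from the Learning Objective 3 example; the variable names are illustrative.

```python
import math

# Home-run distribution from the Learning Objective 3 example.
dist = {0: 0.23, 1: 0.38, 2: 0.22, 3: 0.13, 4: 0.03, 5: 0.01}

mean = sum(x * p for x, p in dist.items())

# Variance: probability-weighted average of squared distances from the mean.
variance = sum((x - mean) ** 2 * p for x, p in dist.items())
sd = math.sqrt(variance)

print(round(sd, 2))  # 1.12
```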

Page 13: Chapter 6: Probability Distributions

13

Learning Objective 5:Continuous Random Variable

A continuous random variable has an infinite continuum of possible values in an interval.

Examples are: time, age and size measures such as height and weight.

Continuous variables are measured in a discrete manner because of rounding.

Page 14: Chapter 6: Probability Distributions

14

Learning Objective 5: Probability Distribution of a Continuous Random Variable

A continuous random variable has possible values that form an interval.

Its probability distribution is specified by a curve.

Each interval has probability between 0 and 1.

The interval containing all possible values has probability equal to 1.

Page 15: Chapter 6: Probability Distributions

15

Chapter 6: Probability Distributions

Section 6.2: How Can We Find Probabilities for Bell-Shaped

Distributions?

Page 16: Chapter 6: Probability Distributions

16

Learning Objectives

1. Normal Distribution

2. 68-95-99.7 Rule for normal distributions

3. Z-Scores and the Standard Normal Distribution

4. The Standard Normal Table: Finding Probabilities

5. Using the TI-calculator: find probabilities

Page 17: Chapter 6: Probability Distributions

17

Learning Objectives

6. Using the Standard Normal Table in Reverse

7. Using the TI-calculator: find z-scores

8. Probabilities for Normally Distributed Random Variables

9. Percentiles for Normally Distributed Random Variables

10. Using Z-scores to Compare Distributions

Page 18: Chapter 6: Probability Distributions

18

Learning Objective 1:Normal Distribution

The normal distribution is symmetric, bell-shaped and characterized by its mean µ and standard deviation σ.

The normal distribution is the most important distribution in statistics:

Many distributions have an approximate normal distribution.

It approximates many discrete distributions well when there are a large number of possible outcomes.

Many statistical methods use it even when the data are not bell shaped.

Page 19: Chapter 6: Probability Distributions

19

Learning Objective 1:Normal Distribution

Normal distributions are

Bell shaped

Symmetric around the mean

The mean (µ) and the standard deviation (σ) completely describe the density curve:

Increasing/decreasing µ moves the curve along the horizontal axis.

Increasing/decreasing σ controls the spread of the curve.

Page 20: Chapter 6: Probability Distributions

20

Learning Objective 1:Normal Distribution

Within what interval do almost all of the men’s heights fall? Women’s height?

Page 21: Chapter 6: Probability Distributions

21

Learning Objective 2:68-95-99.7 Rule for Any Normal Curve

68% of the observations fall within one standard deviation of the mean

95% of the observations fall within two standard deviations of the mean

99.7% of the observations fall within three standard deviations of the mean

Page 22: Chapter 6: Probability Distributions

22

Learning Objective 2: Example : 68-95-99.7% Rule

Heights of adult women can be approximated by a normal distribution with µ = 65 inches and σ = 3.5 inches.

68-95-99.7 Rule for women’s heights:

68% are between 61.5 and 68.5 inches

[ µ ± σ = 65 ± 3.5 ]

95% are between 58 and 72 inches

[ µ ± 2σ = 65 ± 2(3.5) = 65 ± 7 ]

99.7% are between 54.5 and 75.5 inches

[ µ ± 3σ = 65 ± 3(3.5) = 65 ± 10.5 ]
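The intervals and percentages for women's heights can be reproduced numerically. The following is a minimal sketch, not part of the slides, using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Women's heights: normal with mean 65 inches and standard deviation 3.5 inches.
heights = NormalDist(mu=65, sigma=3.5)

intervals = []
probs = []
for k in (1, 2, 3):
    low = heights.mean - k * heights.stdev
    high = heights.mean + k * heights.stdev
    # Probability of falling within k standard deviations of the mean.
    prob = heights.cdf(high) - heights.cdf(low)
    intervals.append((low, high))
    probs.append(prob)
    print(f"within {k} sd: {low} to {high} inches, probability {prob:.4f}")
```

The computed probabilities are about 0.6827, 0.9545 and 0.9973, which the rule rounds to 68%, 95% and 99.7%.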

Page 23: Chapter 6: Probability Distributions

23

Learning Objective 2:Example : 68-95-99.7% Rule

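What proportion of women are less than 69 inches tall?

[Figure: normal curve of women's heights centered at 65 inches, with 68.5 inches marked one standard deviation above the mean (z = +1) and 61.5 inches one standard deviation below (z = -1).]

By the 68-95-99.7 Rule, 68% of heights fall within one standard deviation of the mean, leaving 16% in each tail.

The proportion below 68.5 inches (approximately 69 inches) is therefore 16% + 68% = 84%.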

Page 24: Chapter 6: Probability Distributions

24

Learning Objective 3:Z-Scores and the Standard Normal Distribution

The z-score for a value x of a random variable is the number of standard deviations that x falls from the mean:

$z = \frac{x - \mu}{\sigma}$

A negative (positive) z-score indicates that the value is below (above) the mean

z-scores can be used to calculate the probabilities of a normal random variable using the normal tables in the back of the book

Page 25: Chapter 6: Probability Distributions

25

Learning Objective 3:Z-Scores and the Standard Normal Distribution

A standard normal distribution has mean µ=0 and standard deviation σ=1

When a random variable has a normal distribution and its values are converted to z-scores by subtracting the mean and dividing by the standard deviation, the z-scores have the standard normal distribution.

Page 26: Chapter 6: Probability Distributions

26

Learning Objective 4:Table A: Standard Normal Probabilities

Table A enables us to find normal probabilities.

It tabulates the normal cumulative probabilities falling below the point µ + zσ.

To use the table:

Find the corresponding z-score.

Look up the closest standardized score (z) in the table.

The first column gives z to the first decimal place.

The first row gives the second decimal place of z.

The corresponding probability found in the body of the table gives the probability of falling below the z-score.

Page 27: Chapter 6: Probability Distributions

27

Learning Objective 4:Example: Using Table A

Find the probability that a normal random variable takes a value less than 1.43 standard deviations above µ: P(z<1.43)=.9236

TI Calculator: normalcdf(-1e99,1.43,0,1)=.9236

Page 28: Chapter 6: Probability Distributions

28

Learning Objective 4:Example: Using Table A

Find the probability that a normal random variable takes a value greater than 1.43 standard deviations above µ: P(z>1.43)=1-.9236=.0764

TI Calculator: normalcdf(1.43,1e99,0,1)=.0764

Page 29: Chapter 6: Probability Distributions

29

Learning Objective 4:Example:

Find the probability that a normal random variable assumes a value within 1.43 standard deviations of µ.

Probability below 1.43σ = .9236

Probability below -1.43σ = .0764 (1-.9236)

P(-1.43<z<1.43) = .9236-.0764 = .8472

TI Calculator: normalcdf(-1.43,1.43,0,1)=.8473 (the table-based answer, .8472, differs in the last digit because the table values are rounded)

Page 30: Chapter 6: Probability Distributions

30

Learning Objective 5:Using the TI Calculator

To calculate the cumulative probability:

2nd DISTR; 2:normalcdf(lower bound, upper bound, mean, sd)

Use -1E99 for negative infinity and 1E99 for positive infinity.

Page 31: Chapter 6: Probability Distributions

31

Learning Objective 5:Find Probabilities Using TI Calculator

Find probability to the left of -1.64

P(z<-1.64)=normalcdf(-1e99,-1.64,0,1)=.0505

Find probability to the right of 1.56

P(z>1.56)=normalcdf(1.56,1e99,0,1)=.0594

Find probability between -.50 and 2.25

P(-.5<z<2.25)=normalcdf(-.5,2.25,0,1)=.6792 (Table A gives .9878-.3085=.6793 because its entries are rounded)
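For readers without a TI calculator, the same cumulative probabilities can be computed in software. The sketch below is an illustration, not part of the slides: `normalcdf` here is a hypothetical helper written to mirror the calculator's argument order, built on the standard-library `statistics.NormalDist`. Results can differ from table-based answers in the fourth decimal place, because tables round each cumulative probability before subtracting.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """P(lower < X < upper) for X ~ Normal(mean, sd).

    Hypothetical helper that mirrors the calculator's argument order.
    """
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

# As on the calculator, a very large magnitude stands in for infinity.
left = normalcdf(-1e99, -1.64)    # area to the left of -1.64
right = normalcdf(1.56, 1e99)     # area to the right of 1.56
between = normalcdf(-0.5, 2.25)   # area between -0.50 and 2.25

print(round(left, 4), round(right, 4), round(between, 4))
# 0.0505 0.0594 0.6792
```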

Page 32: Chapter 6: Probability Distributions

32

Learning Objective 6: How Can We Find the Value of z for a Certain Cumulative Probability?

To solve some of our problems, we will need to find the value of z that corresponds to a certain normal cumulative probability.

To do so, we use Table A in reverse:

Rather than starting from z in the first column (value of z up to one decimal) and the first row (second decimal of z), find the probability in the body of the table.

The z-score is given by the corresponding values in the first column and row.

Page 33: Chapter 6: Probability Distributions

33

Learning Objective 6:How Can We Find the Value of z for a Certain Cumulative Probability?

Example: Find the value of z for a cumulative probability of 0.025.

Look up the cumulative probability of 0.025 in the body of Table A.

A cumulative probability of 0.025 corresponds to z = -1.96.

Thus, the probability that a normal random variable falls at least 1.96 standard deviations below the mean is 0.025.

Page 34: Chapter 6: Probability Distributions

34

Learning Objective 6: How Can We Find the Value of z for a Certain Cumulative Probability?

Example: Find the value of z for a cumulative probability of 0.975.

Look up the cumulative probability of 0.975 in the body of Table A.

A cumulative probability of 0.975 corresponds to z = 1.96.

Thus, the probability that a normal random variable takes a value no more than 1.96 standard deviations above the mean is 0.975.

Page 35: Chapter 6: Probability Distributions

35

Learning Objective 7: Using the TI Calculator to Find Z-Scores for a Given Probability

2nd DISTR; 3:invNorm

Enter invNorm(percentile,mean,sd)

Percentile is the probability under the curve from negative infinity to the z-score

Press ENTER

Page 36: Chapter 6: Probability Distributions

36

Learning Objective 7:Examples

The probability that a standard normal random variable assumes a value that is ≤ z is 0.975. What is z?

invNorm(.975,0,1)=1.96

The probability that a standard normal random variable assumes a value that is > z is 0.025. What is z?

invNorm(1-.025,0,1)=invNorm(.975,0,1)=1.96

The probability that a standard normal random variable assumes a value that is ≥ z is 0.881. What is z?

invNorm(1-.881,0,1)=-1.18

The probability that a standard normal random variable assumes a value that is < z is 0.119. What is z?

invNorm(.119,0,1)=-1.18

Page 37: Chapter 6: Probability Distributions

37

Learning Objective 7:Example

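Find the z-score z such that the probability within z standard deviations of the mean is 0.50.

A central probability of 0.50 leaves 0.25 in each tail, so z is the 75th percentile and -z is the 25th percentile.

invNorm(.75,0,1)=.67

invNorm(.25,0,1)=-.67

Probability = P(-.67<Z<.67)=.5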
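The reverse lookups in these examples can also be done in software. This is a minimal sketch, not part of the slides, that uses `NormalDist.inv_cdf` from Python's standard library in the role that invNorm plays on the calculator; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# P(Z <= z) = 0.975: the cumulative probability is given directly.
z_le = std_normal.inv_cdf(0.975)

# P(Z >= z) = 0.881: convert to the cumulative probability 1 - 0.881 first.
z_ge = std_normal.inv_cdf(1 - 0.881)

# Central probability 0.50 leaves 0.25 in each tail, so use the 75th percentile.
z_mid = std_normal.inv_cdf(0.75)

print(round(z_le, 2), round(z_ge, 2), round(z_mid, 2))  # 1.96 -1.18 0.67
```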

Page 38: Chapter 6: Probability Distributions

38

Learning Objective 8: Finding Probabilities for Normally Distributed Random Variables

1. State the problem in terms of the observed random variable X, i.e., P(X<x)

2. Standardize X to restate the problem in terms of a standard normal variable Z:

$P(X < x) = P\left(Z < z = \frac{x - \mu}{\sigma}\right)$

3. Draw a picture to show the desired probability under the standard normal curve

4. Find the area under the standard normal curve using Table A

Page 39: Chapter 6: Probability Distributions

39

Learning Objective 8:P(X<x)

Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure less than 100?

$P(X < 100) = P\left(Z < \frac{100 - 120}{20}\right) = P(Z < -1.00) = 0.1587$

normalcdf(-1E99,100,120,20)=.1587

15.9% of adults have systolic blood pressure less than 100

Page 40: Chapter 6: Probability Distributions

40

Learning Objective 8:P(X>x)

Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure greater than 100?

$P(X > 100) = 1 - P(X < 100) = 1 - P\left(Z < \frac{100 - 120}{20}\right) = 1 - P(Z < -1.00) = 1 - 0.1587 = 0.8413$

normalcdf(100,1e99,120,20)=.8413

84.1% of adults have systolic blood pressure greater than 100

Page 41: Chapter 6: Probability Distributions

41

Learning Objective 8:P(X>x)

Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure greater than 133?

$P(X > 133) = 1 - P(X < 133) = 1 - P\left(Z < \frac{133 - 120}{20}\right) = 1 - P(Z < 0.65) = 1 - 0.7422 = 0.2578$

normalcdf(133,1E99,120,20)=.2578

25.8% of adults have systolic blood pressure greater than 133

Page 42: Chapter 6: Probability Distributions

42

Learning Objective 8: P(a<X<b)

Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure between 100 and 133?

$P(100 < X < 133) = P(X < 133) - P(X < 100) = P\left(Z < \frac{133 - 120}{20}\right) - P\left(Z < \frac{100 - 120}{20}\right)$

$= P(Z < 0.65) - P(Z < -1.00) = 0.7422 - 0.1587 = 0.5835$

normalcdf(100,133,120,20)=.5835

58% of adults have systolic blood pressure between 100 and 133
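The blood-pressure probabilities from these examples can be verified in a few lines. This is an illustrative sketch, not part of the slides, using the standard-library `statistics.NormalDist`; it works directly with X, so no manual standardization is needed, but the z-score is computed as well to connect with the table method.

```python
from statistics import NormalDist

# Systolic blood pressure: normal with mean 120 and standard deviation 20.
bp = NormalDist(mu=120, sigma=20)

p_below_100 = bp.cdf(100)                # P(X < 100)
p_above_100 = 1 - bp.cdf(100)            # P(X > 100), by the complement
p_above_133 = 1 - bp.cdf(133)            # P(X > 133)
p_between = bp.cdf(133) - bp.cdf(100)    # P(100 < X < 133)

# z-score used in the table lookup for x = 133.
z_133 = (133 - bp.mean) / bp.stdev

print(round(p_below_100, 4))  # 0.1587
print(round(p_above_100, 4))  # 0.8413
print(round(p_above_133, 4))  # 0.2578
print(round(p_between, 4))    # 0.5835
print(z_133)                  # 0.65
```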

Page 43: Chapter 6: Probability Distributions

43

Learning Objective 9:Find X Value Given Area to Left

Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What is the 1st quartile?

P(X<x)=.25, find x:

Look up .25 in the body of Table A to find z = -0.67

Solve equation to find x:

$x = \mu + z\sigma = 120 + (-0.67)(20) = 106.6$

Check: P(X<106.6) = P(Z<-0.67) = 0.25

TI Calculator: invNorm(.25,120,20)=106.5 (the calculator uses the unrounded z = -0.6745, so it differs slightly from the table-based 106.6)

Page 44: Chapter 6: Probability Distributions

44

Learning Objective 9:Find X Value Given Area to Right

Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. 10% of adults have systolic blood pressure above what level?

P(X>x)=.10, find x.

P(X>x)=1-P(X<x)

Look up 1-0.1=0.9 in the body of Table A to find z = 1.28

Solve equation to find x:

$x = \mu + z\sigma = 120 + (1.28)(20) = 145.6$

Check: P(X>145.6) = P(Z>1.28) = 0.10

TI Calculator: invNorm(.9,120,20)=145.6
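Both percentile problems (area to the left and area to the right) reduce to a single inverse lookup on the cumulative probability. The sketch below is an illustration, not part of the slides, using `NormalDist.inv_cdf` from the standard library. Because software does not round z to two decimals, the first quartile comes out as about 106.5, whereas using the table value z = -0.67 gives 106.6.

```python
from statistics import NormalDist

# Systolic blood pressure: normal with mean 120 and standard deviation 20.
bp = NormalDist(mu=120, sigma=20)

# 1st quartile: the value x with P(X < x) = 0.25.
q1 = bp.inv_cdf(0.25)

# Level exceeded by 10% of adults: P(X > x) = 0.10 means P(X < x) = 0.90.
top_10 = bp.inv_cdf(1 - 0.10)

# Same answer via the slide's formula x = mu + z * sigma.
z_90 = NormalDist().inv_cdf(0.90)
top_10_via_z = bp.mean + z_90 * bp.stdev

print(round(q1, 1))      # 106.5
print(round(top_10, 1))  # 145.6
```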

Page 45: Chapter 6: Probability Distributions

45

Learning Objective 10:Using Z-scores to Compare Distributions

Z-scores can be used to compare observations from different normal distributions

Example: You score 650 on the SAT, which has µ = 500 and σ = 100, and 30 on the ACT, which has µ = 21.0 and σ = 4.7. On which test did you perform better?

Compare z-scores:

SAT: $z = \frac{650 - 500}{100} = 1.5$

ACT: $z = \frac{30 - 21}{4.7} = 1.91$

Since your z-score is greater for the ACT, you performed better on this exam.
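The comparison can be expressed as a small function. This is a hedged sketch rather than part of the slides; `z_score` is a hypothetical helper name, and the figures are the ones given in the example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x falls from the mean."""
    return (x - mean) / sd

sat_z = z_score(650, mean=500, sd=100)
act_z = z_score(30, mean=21.0, sd=4.7)

# The larger z-score marks the better performance relative to other test takers.
better = "ACT" if act_z > sat_z else "SAT"

print(round(sat_z, 2), round(act_z, 2), better)  # 1.5 1.91 ACT
```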

Page 46: Chapter 6: Probability Distributions

46

Chapter 6: Probability Distributions

Section 6.3: How Can We Find Probabilities When Each Observation Has

Two Possible Outcomes?

Page 47: Chapter 6: Probability Distributions

47

Learning Objectives

1. The Binomial Distribution

2. Conditions for a Binomial Distribution

3. Probabilities for a Binomial Distribution

4. Factorials

5. Examples using Binomial Distribution

6. Do the Binomial Conditions Apply?

7. Mean and Standard Deviation of the Binomial Distribution

8. Normal Approximation to the Binomial

Page 48: Chapter 6: Probability Distributions

48

Learning Objective 1:The Binomial Distribution

Each observation is binary: it has one of two possible outcomes.

Examples:

Accept, or decline, an offer from a bank for a credit card.

Have, or do not have, health insurance.

Vote yes or no on a referendum.

Page 49: Chapter 6: Probability Distributions

49

Learning Objective 2:Conditions for the Binomial Distribution

Each of n trials has two possible outcomes: “success” or “failure”.

Each trial has the same probability of success, denoted by p.

The n trials are independent.

The binomial random variable X is the number of successes in the n trials.

Page 50: Chapter 6: Probability Distributions

50

Learning Objective 3:Probabilities for a Binomial Distribution

Denote the probability of success on a trial by p.

For n independent trials, the probability of x successes equals:

$P(x) = \frac{n!}{x!\,(n - x)!} \, p^x (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n$

Page 51: Chapter 6: Probability Distributions

51

Learning Objective 4:Factorials

Rules for factorials:

n! = n*(n-1)*(n-2)*…*2*1

1! = 1

0! = 1

For example, 4!=4*3*2*1=24

Page 52: Chapter 6: Probability Distributions

52

Learning Objective 5:Example: Finding Binomial Probabilities

John Doe claims to possess ESP. An experiment is conducted:

A person in one room picks one of the integers 1, 2, 3, 4, 5 at random.

In another room, John Doe identifies the number he believes was picked.

Three trials are performed for the experiment. Doe got the correct answer twice.

Page 53: Chapter 6: Probability Distributions

53

Learning Objective 5:Example 1

If John Doe does not actually have ESP and is actually guessing the number, what is the probability that he’d make a correct guess on two of the three trials?

The three ways John Doe could make two correct guesses in three trials are: SSF, SFS, and FSS.

Each of these has probability: (0.2)^2(0.8) = 0.032.

The total probability of two correct guesses is 3(0.032)=0.096.

Page 54: Chapter 6: Probability Distributions

54

Learning Objective 5:Example 1

The probability of exactly 2 correct guesses is the binomial probability with n = 3 trials, x = 2 correct guesses and p = 0.2 probability of a correct guess.

$P(2) = \frac{3!}{2!\,1!}(0.2)^2(0.8)^1 = 3(0.04)(0.8) = 0.096$

TI Calculator: 2nd VARS; 0:binompdf(n,p,x)

binompdf(3,.2,2)=0.096

Page 55: Chapter 6: Probability Distributions

55

Learning Objective 5:Binomial Example 2

1000 employees, 50% female

None of the 10 employees chosen for management training were female.

The probability that no females are chosen is:

$P(0) = \frac{10!}{0!\,10!}(0.50)^0(0.50)^{10} = 0.001$

binompdf(10,.5,0)=9.765625E-4

It is very unlikely (one chance in a thousand) that none of the 10 selected for management training would be female if the employees were chosen randomly.
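The binomial formula translates directly into code. The sketch below is an illustration, not part of the slides: `binom_pmf` is a hypothetical helper name whose argument order follows the calculator's binompdf(n,p,x), and `math.comb(n, x)` computes n!/(x!(n-x)!).

```python
from math import comb

def binom_pmf(n, p, x):
    """P(x successes in n independent trials, each with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Example 1: two correct guesses in three trials, p = 0.2 per guess.
esp = binom_pmf(3, 0.2, 2)

# Example 2: no females among 10 employees chosen, p = 0.5 per selection.
training = binom_pmf(10, 0.5, 0)

print(round(esp, 3))  # 0.096
print(training)       # 0.0009765625

# Sanity check: the probabilities over x = 0, 1, ..., n sum to 1.
total = sum(binom_pmf(3, 0.2, x) for x in range(4))
```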

Page 56: Chapter 6: Probability Distributions

56

Learning Objective 6:Do the Binomial Conditions Apply?

Before using the binomial distribution, check that its three conditions apply:

Binary data (success or failure).

The same probability of success for each trial (denoted by p).

Independent trials.

Page 57: Chapter 6: Probability Distributions

57

Learning Objective 6:Do the Binomial Conditions Apply to Example 2?

The data are binary (male, female).

If employees are selected randomly, the probability of selecting a female on a given trial is 0.50.

With random sampling of 10 employees from a large population, the outcome for one trial does not depend on the outcome of another trial.

Page 58: Chapter 6: Probability Distributions

58

Learning Objective 7:Binomial Mean and Standard Deviation

The binomial probability distribution for n trials with probability p of success on each trial has mean µ and standard deviation σ given by:

$\mu = np, \quad \sigma = \sqrt{np(1 - p)}$

Page 59: Chapter 6: Probability Distributions

59

Learning Objective 7: Example: Racial Profiling?

Data:

262 police car stops in Philadelphia in 1997.

207 of the drivers stopped were African-American.

In 1997, Philadelphia’s population was 42.2% African-American.

Does the number of African-Americans stopped suggest possible bias, being higher than we would expect (other things being equal, such as the rate of violating traffic laws)?

Page 60: Chapter 6: Probability Distributions

60

Learning Objective 7:Example: Racial Profiling?

Assume:

262 car stops represent n = 262 trials.

Successive police car stops are independent.

P(driver is African-American) is p = 0.422.

Calculate the mean and standard deviation of this binomial distribution:

$\mu = np = 262(0.422) \approx 111$

$\sigma = \sqrt{np(1 - p)} = \sqrt{262(0.422)(0.578)} \approx 8$

Page 61: Chapter 6: Probability Distributions

61

Learning Objective 7: Example: Racial Profiling?

Recall: Empirical Rule

When a distribution is bell-shaped, close to 100% of the observations fall within 3 standard deviations of the mean.

$\mu - 3\sigma = 111 - 3(8) = 87$

$\mu + 3\sigma = 111 + 3(8) = 135$

Page 62: Chapter 6: Probability Distributions

62

Learning Objective 7:Example: Racial Profiling?

If there is no racial profiling, we would not be surprised if between about 87 and 135 of the 262 drivers stopped were African-American.

The actual number stopped (207) is well above these values.

The number of African-Americans stopped is too high, even taking into account random variation.

Limitation of the analysis: Different people do different amounts of driving, so we don’t really know that 42.2% of the potential stops were African-American.
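The arithmetic behind this example can be reproduced as follows. This is an illustrative sketch, not part of the slides; it works with unrounded values, which round to the same 111, 8, 87 and 135 used above. The final z-score is an addition for illustration and is not computed on the slides.

```python
import math

n, p = 262, 0.422   # number of stops, assumed probability per stop
observed = 207      # African-American drivers actually stopped

mean = n * p                      # binomial mean: np
sd = math.sqrt(n * p * (1 - p))   # binomial standard deviation: sqrt(np(1-p))

# Range containing nearly all outcomes for a bell-shaped distribution.
low, high = mean - 3 * sd, mean + 3 * sd

# Addition (not on the slides): how many standard deviations above the mean
# the observed count falls.
z = (observed - mean) / sd

print(round(mean), round(sd))    # 111 8
print(round(low), round(high))   # 87 135
print(round(z, 1))               # 12.1
```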

Page 63: Chapter 6: Probability Distributions

63

Learning Objective 8:Approximating the Binomial Distribution with the Normal Distribution

The binomial distribution can be well approximated by the normal distribution when the expected number of successes, np, and the expected number of failures, n(1-p), are both at least 15.
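The rule of thumb is easy to apply as a check before using the normal approximation. The sketch below is an illustration, not part of the slides; `normal_approx_ok` is a hypothetical helper name, and it is applied to the three binomial examples from this section.

```python
def normal_approx_ok(n, p, threshold=15):
    """True when expected successes np and expected failures n(1-p) are both >= threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

stops = normal_approx_ok(262, 0.422)   # about 111 successes, 151 failures
training = normal_approx_ok(10, 0.5)   # only 5 expected successes
esp = normal_approx_ok(3, 0.2)         # only 0.6 expected successes

print(stops, training, esp)  # True False False
```

Under this rule, the bell-shaped reasoning used in the racial profiling example is reasonable, while the two small-sample examples call for exact binomial probabilities.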