Chapter 8: Sampling Distributions (Mean and Proportion)

Slide 1

Chapter 8: Sampling Distributions

Mean and Proportion

Slide 2

Goal of Statistical Analysis: Find Parameters of a Population from Statistics on a Sample

[Diagram labels: Sample, Population]

• Random sampling: every unit in the population has an equal chance of being chosen.

• The quality of all statistical analysis depends on the quality of the sample data.

Slide 3

Parameter: A number describing a population. Statistic: A number describing a sample.

1. A random sample should represent the population well, so sample statistics from a random sample should provide reasonable estimates of population parameters.

2. All sample statistics have some error in estimating population parameters.

3. If repeated samples are taken from a population and the same statistic (e.g. mean) is calculated from each sample, the statistics will vary, that is, they will have a distribution.

4. A larger sample provides more information than a smaller sample so a statistic from a large sample should have less error than a statistic from a small sample.

Slide 4

Sampling distributions for:

Mean (8.1): the mean of a variable in a population, or EVave

Proportion (8.2): the percentage of a population with a given characteristic, or EV%

Slide 5

8.1 Distribution of the Sample Mean

Slide 6

Statistics such as $\bar{x}$ are random variables, since their values vary from sample to sample.

So they have probability distributions associated with them. In this chapter we focus on the shape, center, and spread of statistics such as $\bar{x}$.

Slide 7

The sampling distribution of a statistic is a probability distribution for all possible values of the statistic computed from a sample of size n.

The sampling distribution of the sample mean $\bar{x}$ is the probability distribution of all possible values of the random variable $\bar{x}$ computed from a sample of size n from a population with mean $\mu$ and standard deviation $\sigma$.

Slide 8

Example 1: Sampling Distribution of the Sample Mean (Normal Population)

The weights of pennies minted after 1982 are approximately normally distributed with mean 2.46 grams and standard deviation 0.02 grams.

Approximate the sampling distribution of the sample mean by obtaining 200 simple random samples of size n = 5 from this population.

Slide 9

The data on the following slide represent the sample means for the 200 simple random samples of size n = 5.

For example, the first sample of n = 5 had the following data:

2.493 2.466 2.473 2.492 2.471

Note: $\bar{x} = 2.479$ for this sample.

Slide 10

[Table: Sample Means for Samples of Size n = 5]

Slide 11

The mean of the 200 sample means is 2.46, the same as the mean of the population.

The standard deviation of the sample means is 0.0086, which is smaller than the standard deviation of the population.

The next slide shows the histogram of the sample means.
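The penny experiment can be approximated with a short simulation. This is a minimal sketch, assuming Python 3.8+ and its standard library; the function name `simulate_sample_means` and the fixed seed are illustrative choices, and the simulated values will not match the slide's 200 samples exactly.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from Normal(mu, sigma)
    and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]

# 200 samples of size n = 5 from the penny population
means = simulate_sample_means(mu=2.46, sigma=0.02, n=5, num_samples=200)

print("mean of the sample means:   ", round(statistics.fmean(means), 4))
print("std dev of the sample means:", round(statistics.stdev(means), 4))
```

The mean of the simulated sample means should land close to 2.46, and their standard deviation should be well below the population value of 0.02.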

Slide 12

[Figure: histogram of the 200 sample means]

Slide 14

What role does n, the sample size, play in the standard deviation of the distribution of the sample mean?

As the size of the sample gets larger, we do not expect as much spread in the sample means, since larger observations will offset smaller observations.

Slide 15

The Mean and Standard Deviation of the Sampling Distribution of $\bar{x}$

Suppose that a simple random sample of size n is drawn from a large population with mean $\mu$ and standard deviation $\sigma$. The sampling distribution of $\bar{x}$ will have mean $\mu_{\bar{x}} = \mu$ and standard deviation $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$.

The standard deviation of the sampling distribution is called the standard error of the mean and is denoted $\sigma_{\bar{x}}$.

Slide 16

Notation

$\mu_{\bar{x}} = \mu$: the mean of the sample means

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$: the standard deviation of the sample means (often called the standard error of the mean)
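To see how the standard error shrinks as n grows, the formula can be evaluated for a few sample sizes. This is a small sketch using the penny population's standard deviation (0.02 grams); the helper name `standard_error` is an illustrative assumption.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

# Penny population: sigma = 0.02 grams
for n in (5, 10, 100):
    print(f"n = {n:3d}  standard error = {standard_error(0.02, n):.5f}")
```

Quadrupling the sample size halves the standard error, because n enters through its square root.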

Slide 17

Sampling from Normal Populations

Slide 18

Example: Weight of pennies

The weights of pennies minted after 1982 are approximately normally distributed with mean 2.46 grams and standard deviation 0.02 grams.

What is the probability that in a simple random sample of 10 pennies minted after 1982, we obtain a sample mean of at least 2.465 grams?

Slide 19

Solution

• $\bar{x}$ is normally distributed with $\mu_{\bar{x}} = 2.46$ and $\sigma_{\bar{x}} = \dfrac{0.02}{\sqrt{10}} \approx 0.0063$.

• $Z = \dfrac{2.465 - 2.46}{0.0063} \approx 0.79$.

• $P(Z > 0.79) = 1 - 0.7852 = 0.2148$.

On CALCULATOR:

$P(\bar{x} > 2.465)$ = normalcdf(2.465, 10^99, 2.46, 0.0063) ≈ 0.214 (the table value 0.2148 differs slightly because z is rounded to 0.79)
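The same probability can be checked in software. This is a minimal sketch, assuming Python 3.8+ and using the standard library's `statistics.NormalDist` as a stand-in for the calculator's normalcdf; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2.46, 0.02, 10

# Sampling distribution of the sample mean: Normal(mu, sigma / sqrt(n))
x_bar_dist = NormalDist(mu, sigma / sqrt(n))

# P(sample mean >= 2.465) is the upper-tail area beyond 2.465
p = 1 - x_bar_dist.cdf(2.465)
print(f"P(x-bar >= 2.465) is about {p:.3f}")
```

Because the standard error is not rounded here, the result can differ from the table answer in the third decimal place.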

Slide 20

Another Example: Water Taxi (work on your own)

Given that the population of passengers has normally distributed weights with a mean of 172 lb and a standard deviation of 29 lb,

a) if one man is randomly selected, find the probability that his weight is greater than 175 lb.

b) if 20 different men are randomly selected, find the probability that their mean weight is greater than 175 lb.

Slide 21

Answer

a) If one man is randomly selected, find the probability that his weight is greater than 175 lb:

CALCULATOR: P(X > 175) = normalcdf(175, 10^99, 172, 29)

Or use the table: $z = \dfrac{175 - 172}{29} \approx 0.10$

Slide 22

Answer (continued)

b) If 20 different men are randomly selected, find the probability that their mean weight is greater than 175 lb.

CALCULATOR: $P(\bar{x} > 175)$ = normalcdf(175, 10^99, 172, 29/√20)

Or use the table: $z = \dfrac{175 - 172}{29/\sqrt{20}} \approx 0.46$

Slide 23

Answer (conclusion)

a) If one man is randomly selected, the probability that his weight is greater than 175 lb is

P(x > 175) = 0.4602

b) If 20 different men are randomly selected, the probability that their mean weight is greater than 175 lb is

$P(\bar{x} > 175) = 0.3228$

It is much easier for an individual to deviate from the mean than it is for a group of 20 to deviate from the mean.
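Both water taxi probabilities can be computed side by side. This is a sketch assuming Python 3.8+ and the standard library's `statistics.NormalDist`; the helper `prob_mean_exceeds` is a hypothetical name introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_exceeds(threshold, mu, sigma, n=1):
    """P(mean of n independent Normal(mu, sigma) values > threshold).
    With n = 1 this is the probability for a single individual."""
    return 1 - NormalDist(mu, sigma / sqrt(n)).cdf(threshold)

p_one = prob_mean_exceeds(175, mu=172, sigma=29)           # one man
p_twenty = prob_mean_exceeds(175, mu=172, sigma=29, n=20)  # mean of 20 men

print(f"one man:        {p_one:.3f}")
print(f"mean of 20 men: {p_twenty:.3f}")
```

The results sit within a few thousandths of the table answers (0.4602 and 0.3228), since the table method rounds z to two decimals.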

Slide 24

Sampling from a Population that is not Normal

Slide 25

EXAMPLE: Sampling from a Population that is Not Normal

The following table and histogram give the probability distribution for rolling a fair die:

Face on Die    Relative Frequency
1              0.1667
2              0.1667
3              0.1667
4              0.1667
5              0.1667
6              0.1667

$\mu = 3.5$, $\sigma = 1.708$

Note that the population distribution is NOT normal.

Slide 26

Estimate the sampling distribution of $\bar{x}$ (the average of n tosses of the die) by obtaining 200 simple random samples of size n = 4 and calculating the sample mean for each of the 200 samples.

Repeat for n = 10 and 30.

Histograms of the sampling distribution of the sample mean for each sample size are given on the next slides.
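The die experiment is easy to reproduce in code. This is a minimal sketch, assuming Python 3.8+ and a seeded pseudo-random generator in place of physical tosses; `die_sample_means` is an illustrative name, and the printed numbers depend on the seed.

```python
import random
import statistics

def die_sample_means(n, num_samples=200, seed=0):
    """Return num_samples sample means, each the average of n fair-die rolls."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.randint(1, 6) for _ in range(n))
        for _ in range(num_samples)
    ]

SIGMA = 1.708  # population standard deviation of a fair die

for n in (4, 10, 30):
    means = die_sample_means(n)
    print(
        f"n = {n:2d}  mean = {statistics.fmean(means):.3f}  "
        f"sd = {statistics.stdev(means):.3f}  "
        f"sigma/sqrt(n) = {SIGMA / n ** 0.5:.3f}"
    )
```

For each n the simulated mean should sit near 3.5 and the simulated standard deviation near $\sigma/\sqrt{n}$; plotting a histogram of `means` would show the shape becoming more bell-like as n increases.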

Slides 27-29

[Figures: histograms of the sampling distribution of the sample mean for n = 4, 10, and 30]

Slide 30

Central Limit Theorem

Given:

1. The random variable x has a distribution (which may or may not be normal) with mean µ and standard deviation σ.

2. Simple random samples all of size n are selected from the population. (The samples are selected so that all possible samples of the same size n have the same chance of being selected.)

Conclusions:

1. The distribution of the sample means $\bar{x}$ will, as the sample size increases, approach a normal distribution.

2. The mean of the sample means is the population mean µ.

3. The standard deviation of all sample means is $\sigma/\sqrt{n}$.

Slide 31

Key Points

The mean of the sampling distribution of the sample mean is equal to the mean of the parent population, and its standard deviation is $\sigma/\sqrt{n}$; both results hold regardless of the sample size.

The shape of the distribution of the sample mean becomes approximately normal as the sample size n increases, regardless of the shape of the population. This is a result of the Central Limit Theorem.

As the sample size increases, the sampling distribution of sample means approaches a normal distribution.

Slide 32

Practical Rules

1. For samples of size n larger than 30, the distribution of the sample means can be approximated reasonably well by a normal distribution. The approximation gets better as the sample size n becomes larger.

2. If the original population is itself normally distributed, then the sample means will be normally distributed for any sample size n (not just the values of n larger than 30).

Slide 34

Example: Using the Central Limit Theorem

Suppose that the mean time for an oil change at a “10-minute oil change joint” is 11.4 minutes with a standard deviation of 3.2 minutes.

(a) If a random sample of n = 35 oil changes is selected, describe the sampling distribution of the sample mean.

Solution: $\bar{x}$ is approximately normally distributed with mean = 11.4 and standard deviation $\dfrac{3.2}{\sqrt{35}} \approx 0.5409$.

(b) If a random sample of n = 35 oil changes is selected, what is the probability the mean oil change time is less than 11 minutes?

Solution: $Z = \dfrac{11 - 11.4}{0.5409} \approx -0.74$, so $P(Z < -0.74) \approx 0.23$.
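The oil change calculation can be verified numerically. This is a sketch assuming Python 3.8+ and the standard library's `statistics.NormalDist`; it relies on the Central Limit Theorem approximation (n = 35 is larger than 30), and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 11.4, 3.2, 35

# By the Central Limit Theorem, x-bar is approximately Normal(mu, sigma/sqrt(n))
standard_error = sigma / sqrt(n)
x_bar_dist = NormalDist(mu, standard_error)

# P(sample mean < 11) is the lower-tail area below 11
p_less_than_11 = x_bar_dist.cdf(11)

print(f"standard error: {standard_error:.4f}")
print(f"P(x-bar < 11) is about {p_less_than_11:.2f}")
```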

Slide 35

8.2 Distribution of the Sample Proportion

Slide 36

Point Estimate of a Population Proportion

Suppose that a random sample of size n is obtained from a population in which each individual either does or does not have a certain characteristic. The sample proportion, denoted $\hat{p}$ (read “p-hat”), is given by

$\hat{p} = \dfrac{x}{n}$

where x is the number of individuals in the sample with the specified characteristic. The sample proportion is a statistic that estimates the population proportion, p.

Slide 38

Example 1: Computing a Sample Proportion

In a Quinnipiac University Poll conducted in May of 2008, 1,745 registered voters nationwide were asked whether they approved of the way George W. Bush was handling the economy. 349 responded “yes”. Obtain a point estimate for the proportion of registered voters who approved of the way George W. Bush was handling the economy.

Solution:

$\hat{p} = \dfrac{349}{1745} = 0.2$

Slide 39

Example 2: Using Simulation to Describe the Distribution of the Sample Proportion

According to a Time poll conducted in June of 2008, 42% of registered voters believed that gay and lesbian couples should be allowed to marry.

Describe the sampling distribution of the sample proportion for samples of size n = 10, 50, 100.

Slides 40-42

[Figures: simulated distributions of the sample proportion for n = 10, 50, and 100]

Slide 43

Key Points from Example 2

Shape: As the size of the sample, n, increases, the shape of the sampling distribution of the sample proportion becomes approximately normal.

Center: The mean of the sampling distribution of the sample proportion equals the population proportion, p.

Spread: The standard deviation of the sampling distribution of the sample proportion decreases as the sample size, n, increases.
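The three key points can be observed in a quick simulation with p = 0.42. This is a minimal sketch, assuming Python 3.8+; the function name `simulate_sample_proportions`, the choice of 1,000 repetitions, and the seed are illustrative assumptions rather than the settings used for the slides.

```python
import random
import statistics

def simulate_sample_proportions(p, n, num_samples=1000, seed=0):
    """Return num_samples sample proportions, each computed from n
    independent yes/no responses with success probability p."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for _ in range(n)) / n
        for _ in range(num_samples)
    ]

p = 0.42
for n in (10, 50, 100):
    props = simulate_sample_proportions(p, n)
    theoretical_sd = (p * (1 - p) / n) ** 0.5
    print(
        f"n = {n:3d}  mean = {statistics.fmean(props):.3f}  "
        f"sd = {statistics.stdev(props):.3f}  "
        f"sqrt(p(1-p)/n) = {theoretical_sd:.3f}"
    )
```

The simulated means stay near 0.42 for every n, while the simulated standard deviations shrink as n grows.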

Slide 44

Sampling Distribution of $\hat{p}$

For a simple random sample of size n with population proportion p:

• The shape of the sampling distribution of $\hat{p}$ is approximately normal provided np(1-p) ≥ 10.

• The mean of the sampling distribution of $\hat{p}$ is $\mu_{\hat{p}} = p$.

• The standard deviation of the sampling distribution of $\hat{p}$ is $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$.

Slide 45

Sampling Distribution of $\hat{p}$

• The model on the previous slides requires that the sampled values are independent. When sampling from finite populations, this assumption is checked by verifying that the sample size n is no more than 5% of the population size N (n ≤ 0.05N).

• Regardless of whether np(1-p) ≥ 10 or not, the mean of the sampling distribution of $\hat{p}$ is p, and the standard deviation is $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$.

Slide 46

Example 3:

According to a Time poll conducted in June of 2008, 42% of registered voters believed that gay and lesbian couples should be allowed to marry. Suppose that we obtain a simple random sample of 50 voters and determine which believe that gay and lesbian couples should be allowed to marry. Describe the sampling distribution of the sample proportion.

Slide 47

Solution

The sample of n = 50 is smaller than 5% of the population size (all registered voters in the U.S.).

Also, np(1-p) = 50(0.42)(0.58) = 12.18 ≥ 10.

The sampling distribution of the sample proportion is therefore approximately normal with mean = 0.42 and standard deviation

$\sqrt{\dfrac{0.42(1 - 0.42)}{50}} \approx 0.0698$
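The checks in this solution can be bundled into a small helper. This is a sketch, assuming the np(1-p) ≥ 10 rule stated on the earlier slide; the function name `describe_p_hat` is hypothetical, and the 5% population-size condition is not checked because the population size is not given numerically.

```python
from math import sqrt

def describe_p_hat(p, n):
    """Return (mean, standard deviation, normal_ok) for the sampling
    distribution of the sample proportion.
    normal_ok is True when n*p*(1-p) >= 10."""
    mean = p
    sd = sqrt(p * (1 - p) / n)
    normal_ok = n * p * (1 - p) >= 10
    return mean, sd, normal_ok

mean, sd, normal_ok = describe_p_hat(p=0.42, n=50)
print(f"mean = {mean}, sd = {sd:.4f}, approximately normal: {normal_ok}")
```

Calling `describe_p_hat(0.42, 10)` returns `normal_ok = False`, since 10(0.42)(0.58) = 2.436 is below 10.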

Slide 48

Example 4: Compute Probabilities of a Sample Proportion

According to the Centers for Disease Control and Prevention, 18.8% of school-aged children, aged 6-11 years, were overweight in 2004.

(a) In a random sample of 90 school children aged 6-11 years, what is the probability that at least 19% are overweight?

(b) Suppose in one random sample of 90 school children aged 6-11 years there were 24 overweight children. What might you conclude?

Slide 49

Solution

(a) In a random sample of 90 school-aged children, aged 6-11 years, what is the probability that at least 19% are overweight?

• n = 90 is less than 5% of the population size.

• np(1-p) = 90(0.188)(1 - 0.188) ≈ 13.7 ≥ 10.

• $\hat{p}$ is approximately normal with mean = 0.188 and standard deviation $\sqrt{\dfrac{(0.188)(1 - 0.188)}{90}} \approx 0.0412$.

$Z = \dfrac{0.19 - 0.188}{0.0412} \approx 0.0485 \approx 0.05$, so $P(Z > 0.05) = 1 - 0.5199 = 0.4801$.

Or (CALCULATOR):

$P(\hat{p} \ge 0.19)$ = normalcdf(0.19, 10^99, 0.188, 0.0412) ≈ 0.48

Slide 50

Solution

(b) Suppose in one random sample of 90 school children aged 6-11 years there were 24 overweight children. What might you conclude?

• $\hat{p}$ is approximately normal with mean = 0.188 and standard deviation = 0.0412.

$\hat{p} = \dfrac{24}{90} \approx 0.2667$,

$P(\hat{p} \ge 0.2667)$ = normalcdf(0.2667, 10^99, 0.188, 0.0412) = 0.028.

We would expect only about 3 samples in 100 to result in a sample proportion of 0.2667 or more. This is an unusual sample if the true population proportion is 0.188.
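Both parts of Example 4 can be computed together. This is a minimal sketch, assuming Python 3.8+ and the standard library's `statistics.NormalDist` in place of normalcdf; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.188, 90

# Normal approximation is reasonable here: n*p*(1-p) is about 13.7, which is >= 10
p_hat_dist = NormalDist(p, sqrt(p * (1 - p) / n))

# (a) P(p-hat >= 0.19)
p_a = 1 - p_hat_dist.cdf(0.19)

# (b) P(p-hat >= 24/90), the chance of a sample at least this extreme
p_b = 1 - p_hat_dist.cdf(24 / 90)

print(f"(a) about {p_a:.2f}")
print(f"(b) about {p_b:.3f}")
```

A small value in part (b) indicates that 24 overweight children out of 90 would be unusual if the true proportion were 0.188.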

Slide 51

Next: Practice problems, answers included (re-work on your own):

Problem 1: Sample mean and probability for the sample mean

Problem 2: Sample proportion and probability for the sample proportion

Slide 52

Problem 1 (Sample mean): Flight search processing time

Web application for a flight search: An investigator takes a sample of 100 flight searches and notes the web response time. Assume that the population average of ALL web searches is 15 sec with a standard deviation of 5 sec. Here are the summary statistics calculated by statistical software for the sample of 100:

Summary of web processing time

The MEANS Procedure

Analysis Variable : time

N      Mean          Std Dev      Minimum      Maximum
---------------------------------------------------------------
100    14.9955626    5.2117790    2.2461204    25.7383955
---------------------------------------------------------------

The estimated processing time is about 15.00 seconds (sample average 14.9956). The standard error is equal to $\sigma/\sqrt{n} = 5/\sqrt{100} = 0.5$.

Slide 53

What is the probability that in 100 flight searches, the average time to process the requests is less than 14 seconds?

We can use the normal approximation: the sample average is approximately normally distributed with mean equal to 15 and standard deviation equal to the standard error = 0.5.

[Figure: normal curve for the sample mean time, with 14 and 15 marked on the axis]

$P(\bar{x} < 14)$ = normalcdf(-10^99, 14, 15, 0.5) = 0.0228

There is only about a 2.3% chance that the average time to process 100 flight requests is less than 14 seconds.
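Problem 1 can be checked in a few lines. This is a sketch assuming Python 3.8+ and the standard library's `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 15, 5, 100

standard_error = sigma / sqrt(n)  # 5 / 10 = 0.5
x_bar_dist = NormalDist(mu, standard_error)

# P(sample mean < 14): 14 is two standard errors below the mean
p_below_14 = x_bar_dist.cdf(14)
print(f"P(x-bar < 14) is about {p_below_14:.4f}")
```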

Slide 54

Problem 2 (Sample proportion): Polygraph percentages

A study by a Federal Agency in 1983 concluded that polygraph (lie detector) tests given to truthful people have probability 0.2 of suggesting that the person is deceptive. A firm asks 20 job applicants about thefts from previous employers, using a polygraph to assess their truthfulness. All applicants were truthful. What is the chance that at least one will fail the test?

Compute the mean and standard error for the sample proportion:

The sampling distribution of the sample proportion is centered at p = 0.2 (the population proportion).

The standard error is sqrt(p*(1-p)/n) = sqrt(0.2*0.8/20) = 0.09.

Thus the sample percentage is treated as approximately normal with mean 0.2 and standard deviation 0.09.

[Figure: normal curve for the sample proportion, with 0.05 and 0.2 marked on the axis]

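Slide 55

What is the chance that at least 1 in 20 will fail the test?

Answer:

First figure out what proportion 1 in 20 constitutes: 1/20 = 0.05

So the probability is: $P(\hat{p} \ge 0.05)$ = normalcdf(0.05, 10^99, 0.2, 0.09) = 0.952

[Figure: normal curve for the sample percentage, with the area to the right of 0.05 shaded: 95.2%]

Conclusion:

Under the normal approximation, the chance that at least one applicant out of 20 will fail the polygraph test is 95.2%. That is extremely high! Note that np(1-p) = 20(0.2)(0.8) = 3.2 is below 10, so the normal approximation is rough here; the exact binomial probability is $1 - 0.8^{20} \approx 0.988$, which is higher still.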
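The normal-approximation answer can be compared with an exact binomial calculation. This is a minimal sketch, assuming Python 3.8+ and independent tests with failure probability 0.2 for each truthful applicant; the variable names are illustrative, and the exact calculation is an added cross-check rather than part of the slide's method.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.2, 20

# Normal approximation used on the slide: P(p-hat >= 1/20)
standard_error = sqrt(p * (1 - p) / n)
p_normal = 1 - NormalDist(p, standard_error).cdf(1 / n)

# Exact binomial: P(at least one failure) = 1 - P(no failures)
p_exact = 1 - (1 - p) ** n

print(f"normal approximation: {p_normal:.3f}")
print(f"exact binomial:       {p_exact:.3f}")
```

The two answers differ by a few percentage points because the sample is small (n·p·(1-p) = 3.2), but both show that at least one failure is very likely.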