Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.


Transcript of Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Page 1: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Sampling Distribution of a Sample Proportion

Lecture 26

Sections 8.1 – 8.2

Wed, Mar 8, 2006

Page 2: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Preview of the Central Limit Theorem

We looked at the distribution of the sum of 1, 2, and 3 uniform random variables U(0, 1).

We saw that the shapes of their distributions were moving towards the shape of the normal distribution.

If we replace “sum” with “average,” we will obtain the same phenomenon, but on the scale from 0 to 1 each time.

Page 3: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Preview of the Central Limit Theorem

[Figure: density curve; horizontal axis from 0 to 1, vertical axis ticks at 1 and 2]

Page 4: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Preview of the Central Limit Theorem

[Figure: density curve; horizontal axis from 0 to 1, vertical axis ticks at 1 and 2]

Page 5: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Preview of the Central Limit Theorem

[Figure: density curve; horizontal axis from 0 to 1, vertical axis ticks at 1 and 2]

Page 6: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Preview of the Central Limit Theorem

Some observations:

Each distribution is centered at the same place, ½.

The distributions are being “drawn in” towards the center.

That means that their standard deviation is decreasing.

Can we quantify this?

Page 7: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Preview of the Central Limit Theorem

[Figure: density curve; horizontal axis from 0 to 1, vertical axis ticks at 1 and 2]

μ = ½, σ² = 1/12

Page 8: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Preview of the Central Limit Theorem

[Figure: density curve; horizontal axis from 0 to 1, vertical axis ticks at 1 and 2]

μ = ½, σ² = 1/24

Page 9: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Preview of the Central Limit Theorem

[Figure: density curve; horizontal axis from 0 to 1, vertical axis ticks at 1 and 2]

μ = ½, σ² = 1/36

Page 10: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Preview of the Central Limit Theorem

This tells us that a mean based on three observations is much more likely to be close to the population mean than is a mean based on only one or two observations.
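The shrinking variance can be checked with a quick simulation. This is only a sketch, not part of the lecture: the function name `simulate_mean_of_uniforms`, the trial count, and the seed are assumptions made for illustration. It averages n draws from U(0, 1) many times and compares the observed variance with 1/(12n), which gives the values 1/12, 1/24, and 1/36 for n = 1, 2, 3.

```python
import random
import statistics


def simulate_mean_of_uniforms(n, trials=100_000, seed=0):
    """Return `trials` simulated averages of n independent U(0, 1) draws."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.random() for _ in range(n)) / n for _ in range(trials)]


for n in (1, 2, 3):
    means = simulate_mean_of_uniforms(n)
    observed_mean = statistics.fmean(means)
    observed_var = statistics.pvariance(means)
    theoretical_var = 1 / (12 * n)  # variance of the average of n uniforms
    print(f"n={n}: mean={observed_mean:.3f}, "
          f"variance={observed_var:.4f}, theory={theoretical_var:.4f}")
```

The simulated means should all sit near ½, while the simulated variances should track 1/(12n) closely.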

Page 11: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Parameters and Statistics

THE PURPOSE OF A STATISTIC IS TO ESTIMATE A POPULATION PARAMETER.

A sample mean is used to estimate the population mean.

A sample proportion is used to estimate the population proportion.

Sample statistics, by their very nature, are variable.

Population parameters are fixed.

Page 12: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Some Questions

We hope that the sample proportion is close to the population proportion.

How close can we expect it to be?

Would it be worth it to collect a larger sample?

If the sample were larger, would we expect the sample proportion to be closer to the population proportion?

How much closer?

Page 13: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

The Sampling Distribution of a Statistic

Sampling Distribution of a Statistic – The distribution of values of the statistic over all possible samples of size n from that population.

Page 14: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

The Sample Proportion

Let p be the population proportion.

Then p is a fixed value (for a given population).

Let p̂ (“p-hat”) be the sample proportion.

Then p̂ is a random variable; it takes on a new value every time a sample is collected.

The sampling distribution of p̂ is the probability distribution of all the possible values of p̂.

Page 15: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Example

Suppose that this class is 3/4 freshmen.

Suppose that we take a sample of 2 students, selected with replacement.

Find the sampling distribution of p̂.

Page 16: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

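Example

[Tree diagram: the first student is F (freshman) with probability 3/4 or N (not a freshman) with probability 1/4; each branch then splits the same way for the second student.]

P(FF) = 9/16

P(FN) = 3/16

P(NF) = 3/16

P(NN) = 1/16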
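The tree can also be walked mechanically. The following is a minimal sketch, assuming independent draws (sampling with replacement) and reading N as "not a freshman"; the names `P` and `outcome_probabilities` are made up for this illustration. It multiplies the branch probabilities along every path of the tree.

```python
from fractions import Fraction
from itertools import product

# Branch probabilities from the example: 3/4 freshmen, 1/4 non-freshmen.
P = {"F": Fraction(3, 4), "N": Fraction(1, 4)}


def outcome_probabilities(n):
    """Map each ordered outcome of n draws (e.g. 'FN') to its probability."""
    result = {}
    for outcome in product("FN", repeat=n):
        prob = Fraction(1)
        for student in outcome:
            prob *= P[student]  # independent draws: multiply along the path
        result["".join(outcome)] = prob
    return result


for outcome, prob in outcome_probabilities(2).items():
    print(outcome, prob)
```

Exact fractions are used so the results can be compared directly with the values on the slide.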

Page 17: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Example

Let X be the number of freshmen in the sample.

The probability distribution of X is

x    P(x)
0    1/16
1    6/16
2    9/16

Page 18: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Example

Let p̂ be the proportion of freshmen in the sample. (p̂ = X/n.)

The sampling distribution of p̂ is

x      P(p̂ = x)
0      1/16
1/2    6/16
1      9/16

Page 19: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Samples of Size n = 3

If we sample 3 people (with replacement) from a population that is 3/4 freshmen, then the proportion of freshmen in the sample has the following distribution.

x      P(p̂ = x)
0      1/64 = .02
1/3    9/64 = .14
2/3    27/64 = .42
1      27/64 = .42

Page 20: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Samples of Size n = 4

If we sample 4 people (with replacement) from a population that is 3/4 freshmen, then the proportion of freshmen in the sample has the following distribution.

x      P(p̂ = x)
0      1/256 = .004
1/4    12/256 = .05
2/4    54/256 = .21
3/4    108/256 = .42
1      81/256 = .32
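Tables like these can be generated for any sample size. The sketch below assumes sampling with replacement, so the number of freshmen follows a binomial distribution; the function name `sampling_distribution` is an illustrative choice, not something from the lecture.

```python
from fractions import Fraction
from math import comb


def sampling_distribution(n, p):
    """Return {p_hat: probability} for samples of size n drawn with replacement.

    p should be a Fraction so that the probabilities stay exact.
    """
    q = 1 - p
    return {
        Fraction(k, n): comb(n, k) * p**k * q ** (n - k)  # binomial probability
        for k in range(n + 1)
    }


dist = sampling_distribution(4, Fraction(3, 4))
for p_hat, prob in dist.items():
    print(f"{str(p_hat):>4}  {str(prob):>7}  {float(prob):.3f}")
```

Note that `Fraction` reduces automatically, so 2/4 prints as 1/2 and 12/256 prints as 3/64; the decimal column is the easiest one to compare with the slides.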

Page 21: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

The Parameters of the Sampling Distributions

When n = 1, the sampling distribution is

p̂    P(p̂)
0    1/4
1    3/4

The mean and variance are μ = 3/4 = 0.75 and σ² = 3/16 = 0.1875.

Page 22: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

The Parameters of the Sampling Distributions

When n = 2, the sampling distribution is

p̂      P(p̂)
0      1/16
1/2    6/16
1      9/16

The mean and variance are μ = 3/4 = 0.75 and σ² = 3/32 = 0.09375.

Page 23: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

The Parameters of the Sampling Distributions

When n = 3, the sampling distribution is

p̂      P(p̂)
0      1/64 = .02
1/3    9/64 = .14
2/3    27/64 = .42
1      27/64 = .42

The mean and variance are μ = 3/4 = 0.75 and σ² = 3/48 = 0.0625.

Page 24: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

The Parameters of the Sampling Distributions

When n = 4, the sampling distribution is

p̂      P(p̂)
0      1/256 = .004
1/4    12/256 = .05
2/4    54/256 = .21
3/4    108/256 = .42
1      81/256 = .32

The mean and variance are μ = 3/4 = 0.75 and σ² = 3/64 = 0.046875.
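The means and variances for n = 1 through 4 can be recomputed directly from the distributions. This is a sketch under the same assumption as the example (independent draws with p = 3/4); the helper name `mean_and_variance` is hypothetical. It also checks the pattern the values suggest, namely that the variance equals p(1 – p)/n.

```python
from fractions import Fraction
from math import comb


def mean_and_variance(n, p):
    """Exact mean and variance of p-hat for samples of size n (with replacement)."""
    q = 1 - p
    dist = {Fraction(k, n): comb(n, k) * p**k * q ** (n - k) for k in range(n + 1)}
    mean = sum(x * prob for x, prob in dist.items())
    variance = sum((x - mean) ** 2 * prob for x, prob in dist.items())
    return mean, variance


p = Fraction(3, 4)
for n in (1, 2, 3, 4):
    mean, variance = mean_and_variance(n, p)
    matches = variance == p * (1 - p) / n  # compare with p(1 - p)/n
    print(f"n={n}: mean={mean}, variance={variance}, equals p(1-p)/n: {matches}")
```

The mean stays at 3/4 for every n, while the variance is divided by n each time.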

Page 25: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Sampling Distributions

Run the program Central Limit Theorem for Proportions.exe.

Use n = 30 and p = 0.75; generate 100 samples.

Page 26: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

100 Samples of Size n = 30

[Figure: distribution of p̂ over the 100 samples]

μ = 0.75

σ = 0.079

Page 27: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Observations and Conclusions

Observation #1: The values of p̂ are clustered around p.

Conclusion #1: p̂ is probably close to p.

Page 28: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Larger Sample Size

Now we will select 100 samples of size 120 instead of size 30.

Run the program Central Limit Theorem for Proportions.exe.

Pay attention to the spread (standard deviation) of the distribution.

Page 29: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

100 Samples of Size n = 120

[Figure: distribution of p̂ over the 100 samples]

μ = 0.75

σ = 0.0395

Page 30: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Observations and Conclusions

Observation #2: As the sample size increases, the clustering is tighter.

Conclusion #2A: Larger samples give more reliable estimates.

Conclusion #2B: For sample sizes that are large enough, we can make very good estimates of the value of p.
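For readers without access to the lecture's program, the experiment can be approximated with a few lines of code. This is a stand-in sketch, not the program itself: the function name `simulate_p_hats`, the seed, and the use of Python's `random` module are all assumptions. It draws repeated samples with p = 0.75 and compares the spread of p̂ for n = 30 and n = 120.

```python
import random
import statistics
from math import sqrt


def simulate_p_hats(n, p, num_samples, seed=0):
    """Return num_samples simulated values of p-hat, each from a sample of size n."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.random() < p for _ in range(n)) / n  # share of "successes"
        for _ in range(num_samples)
    ]


p = 0.75
for n in (30, 120):
    p_hats = simulate_p_hats(n, p, num_samples=100)
    print(f"n={n}: mean of p-hat={statistics.fmean(p_hats):.3f}, "
          f"sd of p-hat={statistics.pstdev(p_hats):.4f}, "
          f"sqrt(p(1-p)/n)={sqrt(p * (1 - p) / n):.4f}")
```

With only 100 samples the simulated standard deviations will wobble around 0.079 and 0.0395, but the n = 120 run should be visibly tighter than the n = 30 run.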

Page 31: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Larger Sample Size

Now we will select 10000 samples of size 120 instead of only 100 samples.

Run the program Central Limit Theorem for Proportions.exe.

Pay attention to the shape of the distribution.

Page 32: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

10,000 Samples of Size n = 120

[Figure: distribution of p̂ over the 10,000 samples]

μ = 0.75

σ = 0.0395

Page 33: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

10,000 Samples of Size n = 126

Page 34: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

More Observations and Conclusions

Observation #3: The distribution of p̂ appears to be approximately normal.

Page 35: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

One More Conclusion

Conclusion #3: We can use the normal distribution to calculate just how close to p we can expect p̂ to be.

However, we must know the values of μ and σ for the distribution of p̂.

That is, we have to quantify the sampling distribution of p̂.

Page 36: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

The Sampling Distribution of p̂

It turns out that the sampling distribution of p̂ is approximately normal with the following parameters.

Mean of p̂: $\mu_{\hat{p}} = p$

Variance of p̂: $\sigma^2_{\hat{p}} = \dfrac{p(1-p)}{n}$

Standard deviation of p̂: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$

This is the Central Limit Theorem for Proportions, summarized on page 519.
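A short sketch of where these parameters come from, assuming the sample is drawn with replacement so that the count X of successes is binomial with parameters n and p (X is the same count used in the freshman example):

```latex
\begin{align*}
\hat{p} &= \frac{X}{n}, \qquad X \sim \mathrm{Binomial}(n, p) \\
\mu_{\hat{p}} &= \frac{E[X]}{n} = \frac{np}{n} = p \\
\sigma^2_{\hat{p}} &= \frac{\mathrm{Var}(X)}{n^2} = \frac{np(1-p)}{n^2} = \frac{p(1-p)}{n} \\
\sigma_{\hat{p}} &= \sqrt{\frac{p(1-p)}{n}}
\end{align*}
```

With p = 3/4 this gives a variance of 3/(16n), which matches the values 3/16, 3/32, 3/48, and 3/64 found for n = 1, 2, 3, 4.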

Page 37: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

The Sampling Distribution of p̂

The approximation to the normal distribution is excellent if

np ≥ 5 and n(1 – p) ≥ 5.
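The rule of thumb is easy to encode as a check. A minimal sketch follows; the function name `normal_approx_ok` and the `threshold` parameter are illustrative assumptions, with the default of 5 taken from the condition above.

```python
def normal_approx_ok(n, p, threshold=5):
    """Return True when both np and n(1 - p) reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


# The small samples from the freshman example fail the check,
# while the simulated sample sizes pass it.
for n in (4, 30, 120):
    print(n, normal_approx_ok(n, 0.75))
```

For p = 0.75 the binding constraint is n(1 – p), so the sample needs at least 20 observations before the check passes.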

Page 38: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Why Surveys Work

Suppose 51% of the population plan to vote for candidate X, i.e., p = 0.51.

What is the probability that an exit survey of 1000 people would show candidate X with less than 45% support, i.e., p̂ < .45?

Page 39: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Why Surveys Work

First, describe the sampling distribution of p̂ if the sample size is n = 1000 and p = 0.51.

Check: np = 510 ≥ 5 and n(1 – p) = 490 ≥ 5.

p̂ is approximately normal.

$\mu_{\hat{p}} = 0.51$

$\sigma_{\hat{p}} = \sqrt{\dfrac{(0.51)(0.49)}{1000}} = 0.01581$

Page 40: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Why Surveys Work

The z-score of 0.45 is z = (0.45 – 0.51)/0.01581 = -3.795.

P(p̂ < 0.45) = P(Z < -3.795) = 0.00007385 (not likely!)

Or use normalcdf(-E99, 0.45, 0.51, 0.01581).
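The same calculation can be reproduced without a calculator. This sketch uses `statistics.NormalDist` from the Python standard library as a substitute for `normalcdf`; the function name `prob_p_hat_below` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist


def prob_p_hat_below(cutoff, p, n):
    """Normal-approximation probability that p-hat falls below cutoff."""
    sigma = sqrt(p * (1 - p) / n)  # standard deviation of p-hat
    return NormalDist(mu=p, sigma=sigma).cdf(cutoff)


sigma = sqrt(0.51 * 0.49 / 1000)
z = (0.45 - 0.51) / sigma
print(f"sigma = {sigma:.5f}, z = {z:.3f}")
print(f"P(p-hat < 0.45) = {prob_p_hat_below(0.45, 0.51, 1000):.8f}")
```

The result may differ from the slide in the last digit or two because the slide rounds σ to 0.01581 before computing the z-score.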

Page 41: Sampling Distribution of a Sample Proportion Lecture 26 Sections 8.1 – 8.2 Wed, Mar 8, 2006.

Why Surveys Work

Perform the same calculation, but with a smaller sample size, say n = 50.

The probability turns out to be 0.1980, nearly a 20% chance.

By symmetry, there is also about a 20% chance that the sample proportion is greater than 57%.

Thus, there is about a 40% chance that the sample proportion is off by at least 6 percentage points.
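The two-sided figure can be checked the same way for both sample sizes. This is a sketch under the normal approximation used above; `prob_off_by_at_least` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def prob_off_by_at_least(margin, p, n):
    """Normal-approximation probability that |p-hat - p| is at least margin."""
    dist = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))
    lower_tail = dist.cdf(p - margin)      # p-hat too low
    upper_tail = 1 - dist.cdf(p + margin)  # p-hat too high
    return lower_tail + upper_tail


for n in (50, 1000):
    print(f"n={n}: P(off by at least 6 points) = "
          f"{prob_off_by_at_least(0.06, 0.51, n):.5f}")
```

For n = 50 the total comes out just under 40%, while for n = 1000 it drops to roughly 0.015%, which is the contrast the slides are drawing.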