Chapter 15

45
1 Copyright © 2014, 2012, 2009 Pearson Education, Inc. Chapter 15 Sampling Distribution Models

description

Statistics Powerpoint

Transcript of Chapter 15

DeVeaux Intro Stats, 4e

Chapter 15Sampling Distribution Models#Copyright 2014, 2012, 2009 Pearson Education, Inc.ObjectivesState and apply the conditions and uses of the Central Limit Theorem.Determine the mean and standard deviation (standard error) for a sampling distribution of proportions or means.Apply the sampling distribution of a proportion or a mean to application problems. #Copyright 2014, 2012, 2009 Pearson Education, Inc.15.1Sampling Distribution of a Proportion#Copyright 2014, 2012, 2009 Pearson Education, Inc.Slide 1- 4Sample Proportions and Sampling DistributionsThe Harris poll found that of 889 U.S. adults, 40% said they believe in ghosts. CBS News found that of 808 U.S. adults, 48% said they believe in ghosts.Why are these two sample proportions different?What is the true population proportion (of ALL U.S. adults)?Well denote the population proportion p, and the sample proportion p^Consider all possible samples of size 808 if we made a histogram of the number of samples having a given p^ what might that look like?#Copyright 2014, 2012, 2009 Pearson Education, Inc.4Slide 1- 5Rather than showing real repeated samples, imagine what would happen if we were to actually draw many samples and look at their proportions.The histogram wed get if we could see all the proportions from all possible samples is called the sampling distribution of the proportions.What would the histogram of all the sample proportions look like?The Central Limit Theorem for Sample Proportions#Copyright 2014, 2012, 2009 Pearson Education, Inc.5Sampling About EvolutionAccording to a Gallup poll, 43% believe in evolution. Assume this is true of all Americans.If many surveys were done of 1007 Americans, we could calculate the sample proportion for each.

The histogram shows the distribution of a simulation of 2000 sample proportions.

The distribution of all possiblesample proportions from samples with the same sample size is called the sampling distribution.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Sampling DistributionsSampling Distribution for ProportionsSymmetricUnimodalCentered at pThe sampling distribution follows the Normal model:

What does the sampling distribution tell us?The sampling distribution allows us to make statements about where we think the corresponding population parameter is and how precise these statements are likely to be.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Another way of saying thisSample statistics are random variables themselves Sample proportion (for categorical data) Sample mean (for quantitative data)

They have a probability distribution, mean, standard deviation, etc.#Copyright 2014, 2012, 2009 Pearson Education, Inc.

Mean and Standard DeviationSampling Distribution for Proportions

#Copyright 2014, 2012, 2009 Pearson Education, Inc.The Normal Model for EvolutionPopulation: p = 0.43, n = 1007. Sampling Distribution:Mean = 0.43

Standard deviation =

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Slide 1- 11Assumptions and ConditionsMost models are useful only when specific assumptions are true.There are two assumptions in the case of the model for the distribution of sample proportions:The Independence Assumption: The sampled values must be independent of each other.The Sample Size Assumption: The sample size, n, must be large enough.#Copyright 2014, 2012, 2009 Pearson Education, Inc.11Slide 1- 12Assumptions and Conditions (cont.)Assumptions are hardoften impossibleto check. Thats why we assume them. Still, we need to check whether the assumptions are reasonable by checking conditions that provide information about the assumptions.The corresponding conditions to check before using the Normal to model the distribution of sample proportions are the Randomization Condition,10% Condition and the Success/Failure Condition.#Copyright 2014, 2012, 2009 Pearson Education, Inc.12Slide 1- 13Assumptions and Conditions (cont.)Randomization Condition: The sample should be a simple random sample of the population.10% Condition: If sampling has not been made with replacement, then the sample size, n, must be no larger than 10% of the population.Success/Failure Condition: The sample size has to be big enough so that both np and nq are at least 10.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.13Slide 1- 14Because we have a Normal model, for example, we know that 95% of Normally distributed values fall within two standard deviations of the mean. So we should not be surprised if 95% of various polls gave results that were near the mean but varied above and below that by no more than two standard deviations. This is what we mean by sampling error. Its not really an error at all, but just variability youd expect to see from one sample to another.The Central Limit Theorem for Sample Proportions (cont)#Copyright 2014, 2012, 2009 Pearson Education, Inc.14Solving Sampling Distribution Problems(Proportions)First identify what sampling distribution is involved. Hint: you must know the underlying population p and there must be a sample proportion involved.The sampling distribution is given by

Check the conditions, to be sure the sampling distribution applies.

Draw a picture of the Sampling Distribution (Normal curve)

Find where p^ falls on this distribution and use NormalCdf to solve for the probability of seeing p^ or something more extreme (shade from p^ to the nearest tail)

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Practice12) Public Health statistics indicate that 26.4% of American adults smoke cigarettes. Describe the sampling distribution model for the proportion of smokers among a randomly selected group of 50 adults. What are your assumptions and conditions?

15) Based on past experience, a bank believes that 7% of the people who receive loans will not make payments on time. The bank has recently approved 200 loans. What are the mean and standard deviation of the proportion of clients in this group who may not make timely payments?What assumptions underlie your model? Are the conditions met?What is the probability that over 10% of these clients will not make timely payments?Slide 1- 16#Copyright 2014, 2012, 2009 Pearson Education, Inc.Practice16) Assume that 30% of students at a university wear contact lenses.We randomly pick 100 students. Let p^ represent the proportion of students who wear contact lenses. Whats the appropriate model for the distribution of p^? Specify the name of the distribution, the mean, and the standard deviation.Be sure the verify that the conditions are met.Whats the approximate probability that more than one third of this sample wear contacts?Slide 1- 17#Copyright 2014, 2012, 2009 Pearson Education, Inc.Enough Lefty Seats?13% of all people are left handed.A 200-seat auditorium has 15 lefty seats.What is the probability that there will not be enough lefty seats for a class of 90 students?ThinkPlan: p^=15/90 0.167, Want Model:Independence Assumption: With respect to lefties, the students are independent.10% Condition: This is out of all people.Success/Failure Condition: 15 10, 75 10

#Copyright 2014, 2012, 2009 Pearson Education, Inc.

Enough Lefty Seats?ThinkModel: p = 0.13, n=90 The model is: N(0.13, 0.035)ShowPlotMechanics:

Or normalcdf(0.167, 1E99, 0.13, 0.035)

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Enough Lefty Seats?Tell Conclusion: There is about a 14.5% chance that there will not be enough seats for the left handed students in the class.#Copyright 2014, 2012, 2009 Pearson Education, Inc.15.3The Sampling Distribution of Other Statistics#Copyright 2014, 2012, 2009 Pearson Education, Inc.The Sampling Distribution for OthersThere is a sampling distribution for any statistic, but the Normal model may not fit.Below are histograms showing results of simulations of sampling distributions.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.The Sampling Distribution For OthersThe medians seem to be approximately Normal.

The variances seem somewhat skewed right.

The minimums are all over the place.

In this course, we will focus on the proportions and the means.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Sampling Distribution of the MeansImagine we roll a number of dice and take the average of the rolls over and over again.

For 1 die, the distribution is Uniform.

For 3 dice, the sampling distribution for the means is closer to Normal.

For 20 dice, the sampling distribution for the means is very close to normal. The standard deviation is much smaller.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.2415.4The Central Limit Theorem: The Fundamental Theorem of Statistics#Copyright 2014, 2012, 2009 Pearson Education, Inc.The Central Limit TheoremThe Central Limit TheoremThe sampling distribution of any mean becomes nearly Normal as the sample size grows.

RequirementsIndependentRandomly collected sample

The sampling distribution of the means is close to Normal if either:Large sample sizePopulation close to Normal

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Video on the Central Limit Theoremhttp://www.nytimes.com/video/science/100000002452709/bunnies-dragons-and-the-normal-world.html?playlistId=100000002438160

#Copyright 2014, 2012, 2009 Pearson Education, Inc.

How Normal?

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Population Distribution and Sampling Distribution of the MeansPopulation Distribution

Normal

Uniform

Bimodal

Skewed Sampling Distribution for the Means Normal (any sample size)

Normal (large sample size)

Normal (larger sample size)

Normal (larger sample size)

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Standard Deviation of the MeansWhich would be more unusual: a student who is 69 tall in the class or a class that has mean height of 69?

The sample means have a smaller standard deviation than the individuals.

The standard deviation of the sample means goes down by the square root of the sample size:

#Copyright 2014, 2012, 2009 Pearson Education, Inc.The Sampling Distribution Model for a MeanWhen a random sample is drawn from a population with mean m and standard deviation s, the sampling distribution has:Mean: m

Standard Deviation:

For large sample size, the distribution is approximately normal regardless of the population the random sample comes from.

The larger the sample size, the closer to Normal.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Solving Sampling Distribution Problems(Means)

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Caution!Pay attention to how the sampling distribution of means differs depending on the size of the sample.

Be careful to distinguish between the underlying distribution of the population (which may or may not be normal) and the sampling distribution of means (which depends on sample size n).#Copyright 2014, 2012, 2009 Pearson Education, Inc.38) Statistics indicate that Ithaca, NY gets an average rainfall of 35.4 of rain each year, with a standard deviation of 4.2. Assume that a Normal model applies During what percentage of years does Ithaca get more than 40 of rain?Less than how much rain falls in the driest 20% of all years? A Cornell student is in Ithaca for 4 years. Let y(bar) represent the mean amount of rain for those 4 years. Describe the sampling distribution model of this sample mean y(bar). Whats the probability that those 4 years average less than 30 of rain?Slide 1- 34#Copyright 2014, 2012, 2009 Pearson Education, Inc.Low BMI RevisitedThe 200 college women with the low BMI reported a mean weight of only 140 pounds. For all 18-year-old women, m = 143.74 and s = 51.54. Does the mean weight seem exceptionally low?

Randomization Condition: The women were a random sample with weights independent.

Sample size Condition: Weights are approximately Normal. 200 is large enough.#Copyright 2014, 2012, 2009 Pearson Education, Inc.

Low BMI RevisitedMean and Standard Deviation of the sampling distribution

The 68-95-99.7 rule suggeststhat the mean is low but not that unusual. Suchvariability is not extraordinary for samples of this size.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Too Heavy for the Elevator?Mean weight of US men is 190 lb, thestandard deviation is 59 lb. An elevator has a weight limit of 10 persons or 2500 lb. Find the probability that 10 men in the elevator will overload the weight limit.

Think Plan: 10 over 2500 lb same as their mean over 250.

Model: Independence Assumption: Not random, but probably independent.

Sample Size Condition: Weight approx. Normal.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Too Heavy for the ElevatorThink Model: m = 190, s = 59 By the CLT, the sampling distribution of is approximately Normal:

ShowPlot:

#Copyright 2014, 2012, 2009 Pearson Education, Inc.

Too Heavy for the Elevator?Mechanics:

Tell Conclusion: There is only a 0.0007 chance that the 10 men will exceed the elevators weight limit.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.43) The College Board reported the score distribution shown in the table for all students who took the 2006 AP Statistics Exam:Find the mean and standard deviation of the scoresIf we select a random sample of 40 AP students would we expect their scores to follow a Normal Model? Consider the mean scores of random samples of 40 AP stats students. Describe the sampling model for these meansAn AP stats teacher had 63 students preparing to take the AP exam. He considers his students to be typical of all the national students. Whats the probability that his students will achieve an average score of at least 3?

Slide 1- 40ScorePercent of Students512.6422.2325.3218.3121.6#Copyright 2014, 2012, 2009 Pearson Education, Inc.48) The weight of potato chips in a bag is stated to be 10 ounces. The amount that the machine puts in these bags is believed to have a normal model with mean 10.2 oz and standard deviation of 0.12 oz. What fraction of all bags are underweight?Some of the chips are sold in bargain packs of 3 bags. What is the probability that none of the 3 is underweight?Whats the probability that the mean weight of the 3 bags is below 10 oz.Whats the probability that the mean weight of a 24-bag case is below 10 oz?

Slide 1- 41#Copyright 2014, 2012, 2009 Pearson Education, Inc.15.5Sampling Distributions: A Summary#Copyright 2014, 2012, 2009 Pearson Education, Inc.Sample Size and Standard Deviation

Larger sample size Smaller standard deviation

Multiply n by 4 Divide the standard deviation by 2.

Need a sample size of 100 to reduce the standard deviation by a factor of 10.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Billion Dollar MisunderstandingBill and Melinda Gates Foundation found that the 12% of the top 50 performing schools were from the smallest 3%. They funded a transformation to small schools.

Small schools have a smaller n, thus a higher standard deviation.

Likely to see both higher and lower means.

18% of the bottom 50 were also from the smallest 3%.

#Copyright 2014, 2012, 2009 Pearson Education, Inc.Distribution of the Sample vs. the Sampling DistributionDont confuse the distribution of the sample and the sampling distribution.If the populations distribution is not Normal, then the samples distribution will not be normal even if the sample size is very large.

For large sample sizes, the sampling distribution, which is the distribution of all possible sample means from samples of that size, will be approximately Normal.#Copyright 2014, 2012, 2009 Pearson Education, Inc.Two Truths About Sampling DistributionsSampling distributions arise because samples vary. Each random sample will contain different cases and, so, a different value of the statistic.

Although we can always simulate a sampling distribution, the Central Limit Theorem saves us the trouble for proportions and means. This is especially important when we do not know the populations distribution.#Copyright 2014, 2012, 2009 Pearson Education, Inc.What Can Go Wrong?Dont confuse the sampling distribution with the distribution of the sample.A histogram of the data shows the samples distribution. The sampling distribution is more theoretical.Beware of observations that are not independent.The CLT fails for dependent samples. A good survey design can ensure independence.Watch out for small samples from skewed or bimodal populations.The CLT requires large samples or a Normal population or both.#Copyright 2014, 2012, 2009 Pearson Education, Inc.