SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The...

35
SAMPLING DISTRIBUTIONS Chapter 7

Transcript of SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The...

Page 1: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

SAMPLING DISTRIBUTIONS

Chapter 7

Page 2: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.
Page 3: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution

Page 4: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Statistic and Parameter

Statistic – numerical summary of sample data: p-hat or xbar

Parameter – numerical summary of a population: µ for example.

In practice, we seldom know parameters, which are estimated using sample data: statistics estimate parameters

Page 5: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Sampling Distributions: Gray Davis

Before counting votes, the proportion in favor of recalling Governor Gray Davis was an unknown parameter

Exit poll of 3160 voters had sample proportion in favor of a recall as 0.54

Different random sample of 3000 voters would have different sample proportion

Sampling distribution of sample proportion shows all possible values and probabilities for those values

Page 6: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Sampling Distributions

Sampling distribution of statistic is probability distribution that specifies probabilities for possible values the statistic can take

Describe variability that occurs from study to study using statistics to estimate parameters

Help predict how close statistic falls to parameter it estimates

Page 7: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Mean and SD of Sampling Distribution for Proportion

For random sample of size n from population with proportion p in a category, the sampling distribution of the proportion of the sample in that category has:

n

p)-p(1deviation standard

pMean

Page 8: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

The Standard Error

To distinguish standard deviation of a sampling distribution from standard deviation of ordinary probability distribution, we refer to it as a standard error

Page 9: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

2006 California Election

If population proportion supporting reelection of Schwarzenegger was 0.50, would it have been unlikely to observe the exit-poll sample proportion of 0.565?

Would you be willing to predict that Schwarzenegger would win the election?

Page 10: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Given exit poll had 2705 people and assuming 50% support, estimate of population proportion and standard error:

0096.2705

)5.1(*5.

5.

p

2006 California Election

Page 11: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Sample proportion of 0.565 is more than six standard errors from expected value of 0.50

Sample proportion of 0.565 voting for reelection of Schwarzenegger would be very unlikely if population proportion were p = 0.50 or p < 0.50

6.8 0.0096

0.50) - (0.565 z

2006 California Election

Page 12: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Population Distribution

Population distribution: probability distribution from which we take sample

Values of its parameters are usually unknown – what we’d like to learn about

Page 13: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Data Distribution

Distribution of the sample data that we actually see in practice

Described by statistics With random sampling, the larger n is,

the more closely the data distribution resembles the population distribution

Page 14: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Sampling Distribution

Probability distribution of a sample statistic With random sampling, provides

probabilities for all the possible values of statistic

Key for telling us how close sample statistic falls to corresponding unknown parameter

Standard deviation is called standard error

Page 15: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Clinton vs. Spencer: Senatorial Seat

2006 U.S. Senate election in NYAn exit poll of 1336 voters showed

67% (895) voted for Clinton 33% (441) voted for Spencer

When 4.1 million votes tallied 68% voted for Clinton 32% voted for Spencer

Let X= vote outcome with x=1 for Clinton and x=0 for Spencer

Page 16: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Population distribution is 4.1 million x-values, 32% are 0, and 68% are 1.

Data distribution is 1336 x-values from exit poll, 33% are 0, and 67% are 1.

Sampling distribution of sample proportion is approximately normal with p=0.68 and

Only sampling distribution is bell-shaped; others are discrete and concentrated at two values 0 and 1

0.68(1 0.68) /1336 0.013

Clinton vs. Spencer: Senatorial Seat

Page 17: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

7.2 How Close Are Sample Means to Population Means?

Page 18: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Sampling Distribution of Sample Mean

The sample mean, x, is a random variable that varies from sample to sample, whereas the population mean, µ, is fixed.

Page 19: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Sampling distribution of sample mean for random samples of size n from a population with mean µ and standard deviation σ, has: Center and mean is

same mean, µ Spread is standard

error of

x n

Sampling Distribution of Sample Mean

Page 20: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Pizza Sales

Daily sales at a restaurant vary around a mean, µ = $900, with a standard deviation of σ = $300.What are the center and spread of the sampling distribution?

Page 21: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Effect of n on the Standard Error

The standard error of the sample mean =

As n increases, denominator increases, so s.e. decreases

With larger samples, the sample mean is more likely to be close to the population mean

n

Page 22: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Central Limit Theorem

How does the sampling distribution of the sample mean relate with respect to shape, center, and spread to the probability distribution from which the samples were taken?

Page 23: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Central Limit Theorem (CLT)

For random sampling with a large sample size n, sampling distribution of sample mean is approximately normal, no matter what the shape of the original probability distribution

Page 24: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Sampling Distribution of Sample Means

More bell-shaped as n increases

The more skewed, the larger n must be to get close to normal

Usually close to normal when n is 30

Always approximately normal for approximately normal populations

Page 25: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

CLT: Making Inferences

For large n, sampling distribution is approximately normal even if population distribution is not

Enables inferences about population means regardless of shape of population distribution

Page 26: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Calculating Probabilities of Sample Means

Distribution of milk bottle weights is normally distributed with a mean of 1.1 lbs and σ = 0.20

What is the probability that the mean of a random sample of 5 bottles will be greater than 0.99 lbs?

Page 27: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Calculating Probabilities of Sample Means

Closing prices of stocks have a right skewed distribution with a mean (µ) of $25 and σ= $20.

What is the probability that the mean of a random sample of 40 stocks will be less than $20?

Page 28: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Calculating Probabilities of Sample Means

An automobile insurer found repair claims have a mean of $920 and a standard deviation of $870. Suppose the next 100 claims can be regarded as a random sample. What is the probability that the average of the 100 claims is larger than $900?

Page 29: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Distribution of actual weights of 8 oz. wedges of cheddar cheese is normal with mean =8.1 oz and standard deviation of 0.1 oz

Find x such that there is only a 10% chance that the average weight of a sample of five wedges will be above x

Calculating Probabilities of Sample Means

Page 30: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Distribution of 8 oz. wedges have mean = 8.1 oz. and standard deviation = 0.1 oz.

Find x such that there is only a 5% chance the average weight of a sample of five wedges will be below x

Calculating Probabilities of Sample Means

Page 31: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

7.3 How Can We Make Inferences About a Population?

Page 32: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Using the CLT to Make Inferences

Implications of the CLT:1. For large n, sampling

distribution of is approximately normal despite population shape

2. When approximately normal, is within 2 standard errors of µ 95% of the time and almost certainly within 3

x

x

Page 33: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Standard Errors in Practice

Standard error have exact values that depend on parameters:

In practice, parameters are unknown so we approximate with p-hat and s

p(1 p) n

n

Page 34: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Sampling Distribution for a Proportion

Binomial probability distribution is a sampling distribution with x as # of successes in n independent trials and y as probability

Sample proportion (not #) of successes is usually reported, but proportions use the same formulas for the mean and standard deviation of the sampling distribution

Page 35: SAMPLING DISTRIBUTIONS Chapter 7. 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution.

Sampling Distribution for a Proportion