Agresti/Franklin Statistics, 1 of 87 Section 7.2 How Can We Construct a Confidence Interval to...
Section 7.2
How Can We Construct a Confidence Interval to Estimate a Population Proportion?
We symbolize a population proportion by p
The point estimate of the population proportion is the sample proportion
We symbolize the sample proportion by $\hat{p}$
Finding the 95% Confidence Interval for a Population Proportion
A 95% confidence interval uses a margin of error = 1.96(standard errors)
[point estimate ± margin of error] = $\hat{p} \pm 1.96(\text{standard errors})$
Finding the 95% Confidence Interval for a Population Proportion
The exact standard error of a sample proportion equals: $\sqrt{p(1-p)/n}$
This formula depends on the unknown population proportion, p
In practice, we don’t know p, and we need to estimate the standard error
In practice, we use an estimated standard error: $se = \sqrt{\hat{p}(1-\hat{p})/n}$
A 95% confidence interval for a population proportion p is: $\hat{p} \pm 1.96(se)$, with $se = \sqrt{\hat{p}(1-\hat{p})/n}$
Example: Would You Pay Higher Prices to Protect the Environment?
In 2000, the GSS asked: “Are you willing to pay much higher prices in order to protect the environment?”
Of n = 1154 respondents, 518 were willing to do so
Example: Would You Pay Higher Prices to Protect the Environment?
Find and interpret a 95% confidence interval for the population proportion of adult Americans willing to do so at the time of the survey
Example: Would You Pay Higher Prices to Protect the Environment?
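The worked computation for this example can be sketched in a few lines of Python. This is only an illustration that plugs the counts quoted on the slide (518 of 1154) into the 95% formula for a proportion; the variable names are our own.

```python
import math

# Counts quoted on the slide: 518 of n = 1154 respondents were willing.
n = 1154
successes = 518

p_hat = successes / n                      # sample proportion (point estimate)
se = math.sqrt(p_hat * (1 - p_hat) / n)    # estimated standard error
margin = 1.96 * se                         # 95% margin of error
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat = {p_hat:.3f}, se = {se:.4f}, margin = {margin:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```

The interval comes out at roughly (0.42, 0.48): we can be 95% confident that between about 42% and 48% of adult Americans were willing to pay much higher prices at the time of the survey.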
Sample Size Needed for Large-Sample Confidence Interval for a Proportion
For the 95% confidence interval for a proportion p to be valid, you should have at least 15 successes and 15 failures: $n\hat{p} \geq 15$ and $n(1-\hat{p}) \geq 15$
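As a small sketch, the 15-successes-and-15-failures guideline can be checked before computing the interval. The helper name `large_sample_ok` and the second, small sample are hypothetical and only serve the illustration.

```python
def large_sample_ok(successes: int, n: int, minimum: int = 15) -> bool:
    """Return True when both the success and failure counts reach the minimum."""
    failures = n - successes
    return successes >= minimum and failures >= minimum

# GSS example from the slides: 518 successes, 636 failures.
print(large_sample_ok(518, 1154))

# Hypothetical small sample: only 4 successes, so the guideline is not met.
print(large_sample_ok(4, 30))
```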
“95% Confidence”
With probability 0.95, a sample proportion value occurs such that the confidence interval contains the population proportion, p
With probability 0.05, the method produces a confidence interval that misses p
How Can We Use Confidence Levels Other than 95%?
In practice, the confidence level 0.95 is the most common choice
But, some applications require greater confidence
To increase the chance of a correct inference, we use a larger confidence level, such as 0.99
Different Confidence Levels
In using confidence intervals, we must compromise between the desired margin of error and the desired confidence of a correct inference
As the desired confidence level increases, the margin of error gets larger
What is the Error Probability for the Confidence Interval Method?
The general formula for the confidence interval for a population proportion is:
Sample proportion ± (z-score)(std. error)
which in symbols is $\hat{p} \pm z(se)$, with $se = \sqrt{\hat{p}(1-\hat{p})/n}$
What is the Error Probability for the Confidence Interval Method?
A confidence interval for a population proportion p is: $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$
Summary: Effects of Confidence Level and Sample Size on Margin of Error
The margin of error for a confidence interval:
Increases as the confidence level increases
Decreases as the sample size increases
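Both effects can be seen numerically. The sketch below reuses the sample proportion from the GSS example and takes the z-score from Python's standard-library `statistics.NormalDist`; the function name `margin_of_error` and the choice to quadruple the sample size are our own assumptions for illustration.

```python
import math
from statistics import NormalDist

def margin_of_error(p_hat: float, n: int, confidence: float) -> float:
    """Margin of error z * se for a proportion at the given confidence level."""
    # z-score leaving (1 - confidence) / 2 probability in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 518 / 1154  # sample proportion from the GSS example

for confidence in (0.90, 0.95, 0.99):
    for n in (1154, 4 * 1154):
        m = margin_of_error(p_hat, n, confidence)
        print(f"confidence = {confidence:.2f}, n = {n:5d}, margin = {m:.4f}")
```

Because n enters through a square root, quadrupling the sample size only halves the margin of error.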
What Does It Mean to Say that We Have “95% Confidence”?
If we used the 95% confidence interval method to estimate many population proportions, then in the long run about 95% of those intervals would give correct results, containing the population proportion
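The long-run interpretation can be illustrated with a small simulation. This is a sketch under assumed values: a hypothetical true proportion of 0.45, samples of size 1000, 2000 repetitions, and a fixed random seed, none of which come from the slides.

```python
import math
import random

def coverage(p: float, n: int, trials: int, seed: int = 1) -> float:
    """Fraction of simulated 95% intervals that contain the true proportion p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of n binary outcomes with success probability p.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - 1.96 * se <= p <= p_hat + 1.96 * se:
            hits += 1
    return hits / trials

rate = coverage(p=0.45, n=1000, trials=2000)
print(f"Share of intervals containing p: {rate:.3f}")
```

The printed share should land close to 0.95, although any single run will differ slightly because of simulation noise.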
Section 7.3
How Can We Construct a Confidence Interval To Estimate a Population Mean?
How to Construct a Confidence Interval for a Population Mean
Point estimate ± margin of error
The sample mean is the point estimate of the population mean
The exact standard error of the sample mean is $\sigma/\sqrt{n}$
In practice, we estimate σ by the sample standard deviation, s
How to Construct a Confidence Interval for a Population Mean
For large n…
For small n from an underlying population that is normal…
The confidence interval for the population mean is: $\bar{x} \pm z(\sigma/\sqrt{n})$
How to Construct a Confidence Interval for a Population Mean
In practice, we don’t know the population standard deviation
Substituting the sample standard deviation s for σ to get $se = s/\sqrt{n}$ introduces extra error
To account for this increased error, we replace the z-score by a slightly larger score, the t-score
How to Construct a Confidence Interval for a Population Mean
In practice, we estimate the standard error of the sample mean by $se = s/\sqrt{n}$
Then, we multiply se by a t-score from the t-distribution to get the margin of error for a confidence interval for the population mean
Properties of the t-distribution
The t-distribution is bell-shaped and symmetric about 0
The probabilities depend on the degrees of freedom, df
The t-distribution has thicker tails and is more spread out than the standard normal distribution
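To make the "thicker tails" property concrete, the sketch below evaluates the standard formula for the t density (written with `math.gamma`) and compares it with the standard normal density at the center and in the tail. The evaluation point 3.0 and the df values 2, 6, and 30 are arbitrary choices for illustration.

```python
import math
from statistics import NormalDist

def t_pdf(x: float, df: int) -> float:
    """Density of the t-distribution with df degrees of freedom at x."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

normal = NormalDist()  # standard normal: mean 0, standard deviation 1

for df in (2, 6, 30):
    print(
        f"df = {df:2d}: density at 0 = {t_pdf(0.0, df):.4f}, "
        f"density at 3 = {t_pdf(3.0, df):.4f}"
    )
print(f"normal: density at 0 = {normal.pdf(0.0):.4f}, density at 3 = {normal.pdf(3.0):.4f}")
```

For each df the t density is lower than the normal density at 0 and higher at 3, and the gap shrinks as df grows.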
A 95% confidence interval for the population mean µ is: $\bar{x} \pm t_{.025}(se)$, with $se = s/\sqrt{n}$ and $df = n - 1$
To use this method, you need:
Data obtained by randomization
Example: eBay Auctions of Palm Handheld Computers
Do you tend to get a higher, or a lower, price if you give bidders the “buy-it-now” option?
Example: eBay Auctions of Palm Handheld Computers
Consider some data from sales of the Palm M515 PDA (personal digital assistant)
During the first week of May 2003, 25 of these handheld computers were auctioned off, 7 of which had the “buy-it-now” option
“Buy-it-now” option:
Bidding only: 250 249 255 200 199 240 228 255 232 246 210 178 246 240 245 225 246 225
Summary of selling prices for the two types of auctions:
buy_now   N   Mean    StDev  Minimum  Q1      Median  Q3      Maximum
no        18  231.61  21.94  178.00   221.25  240.00  246.75  255.00
yes       7   233.57  14.64  210.00   225.00  235.00  250.00  250.00
Example: eBay Auctions of Palm Handheld Computers
To construct a confidence interval using the t-distribution, we must assume a random sample from an approximately normal population of selling prices
Let µ denote the population mean for the “buy-it-now” option
The estimate of µ is the sample mean:
$\bar{x} = \$233.57$
$s = \$14.64$
The 95% confidence interval for the “buy-it-now” option is: $\bar{x} \pm t_{.025}(s/\sqrt{n}) = 233.57 \pm 2.447(14.64/\sqrt{7})$
which is 233.57 ± 13.54 or (220.03, 247.11)
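The arithmetic behind this interval can be reproduced from the summary statistics. In the sketch below, the t-score 2.447 for df = 6 is taken from a standard t-table rather than from the slides, and the variable names are our own.

```python
import math

# Summary statistics for the "buy-it-now" auctions, as given in the slides.
n = 7
x_bar = 233.57
s = 14.64

# t-score with 0.025 in the upper tail for df = n - 1 = 6 (t-table lookup).
t_025 = 2.447

se = s / math.sqrt(n)      # estimated standard error of the sample mean
margin = t_025 * se        # 95% margin of error
lower, upper = x_bar - margin, x_bar + margin

print(f"df = {n - 1}, se = {se:.2f}, margin = {margin:.2f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```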
Example: eBay Auctions of Palm Handheld Computers
The 95% confidence interval for the mean sales price for the bidding only option is:
(220.70, 242.52)
Notice that the two intervals overlap a great deal:
“Buy-it-now”: (220.03, 247.11)
Bidding only: (220.70, 242.52)
There is not enough information for us to conclude that one probability distribution clearly has a higher mean than the other
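The bidding-only interval can be recomputed from the 18 selling prices listed earlier, and the overlap with the buy-it-now interval checked directly. This is a sketch: the t-score 2.110 for df = 17 comes from a standard t-table, not from the slides, and the overlap test is just the informal comparison the slide makes.

```python
import math
import statistics

# The 18 bidding-only selling prices listed in the slides.
bidding_only = [250, 249, 255, 200, 199, 240, 228, 255, 232,
                246, 210, 178, 246, 240, 245, 225, 246, 225]

n = len(bidding_only)
x_bar = statistics.mean(bidding_only)
s = statistics.stdev(bidding_only)   # sample standard deviation (n - 1 divisor)
t_025 = 2.110                        # t-table value for df = 17

margin = t_025 * s / math.sqrt(n)
bid_ci = (x_bar - margin, x_bar + margin)
buy_now_ci = (220.03, 247.11)        # interval reported on the slide

# Two intervals overlap when each one starts before the other ends.
overlap = bid_ci[0] <= buy_now_ci[1] and buy_now_ci[0] <= bid_ci[1]

print(f"mean = {x_bar:.2f}, s = {s:.2f}")
print(f"Bidding only 95% CI: ({bid_ci[0]:.2f}, {bid_ci[1]:.2f})")
print(f"Intervals overlap: {overlap}")
```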
How Do We Find a t Confidence Interval for Other Confidence Levels?
The 95% confidence interval uses $t_{.025}$ since 95% of the probability falls between $-t_{.025}$ and $t_{.025}$
For 99% confidence, the error probability is 0.01 with 0.005 in each tail and the appropriate t-score is $t_{.005}$
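As an illustration, the buy-it-now interval can be recomputed at several confidence levels. The t-scores below for df = 6 are copied from a standard t-table (they are not given in the slides), and the 90% level is added only for comparison.

```python
import math

# Summary statistics for the "buy-it-now" auctions, as given in the slides.
n, x_bar, s = 7, 233.57, 14.64
se = s / math.sqrt(n)

# t-table values for df = 6: t_.05, t_.025, and t_.005 respectively.
t_scores = {0.90: 1.943, 0.95: 2.447, 0.99: 3.707}

intervals = {}
for level, t in t_scores.items():
    margin = t * se
    intervals[level] = (x_bar - margin, x_bar + margin)
    print(f"{level:.0%} CI: ({x_bar - margin:.2f}, {x_bar + margin:.2f}), "
          f"margin = {margin:.2f}")
```

Moving from 95% to 99% confidence widens the interval from about ±13.5 to about ±20.5 dollars, the price paid for the smaller error probability.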
If the Population is Not Normal, is the Method “Robust”?
A basic assumption of the confidence interval using the t-distribution is that the population distribution is normal
Many variables have distributions that are far from normal
If the Population is Not Normal, is the Method “Robust”?
How problematic is it if we use the t confidence interval even if the population distribution is not normal?
If the Population is Not Normal, is the Method “Robust”?
For large random samples, it’s not problematic
The Central Limit Theorem applies: for large n, the sampling distribution is bell-shaped even when the population is not
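A simulation can make this claim tangible. The sketch below draws repeated samples from a strongly right-skewed population (an exponential distribution with mean 1, chosen by us as an assumption) and counts how often the 95% interval covers the true mean. With n = 200 the t-score is nearly 1.96, so the z-score is used here for simplicity; the seed and trial count are arbitrary.

```python
import math
import random
import statistics

def coverage_skewed(n: int, trials: int, seed: int = 2) -> float:
    """Share of 95% intervals covering the mean of a skewed population."""
    rng = random.Random(seed)
    true_mean = 1.0  # mean of an exponential distribution with rate 1
    hits = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        # Sample standard deviation with the n - 1 divisor.
        s = math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        se = s / math.sqrt(n)
        if abs(x_bar - true_mean) <= 1.96 * se:
            hits += 1
    return hits / trials

rate = coverage_skewed(n=200, trials=2000)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

Even though individual observations are far from normal, the share of intervals that capture the true mean should be close to the nominal 95%.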
If the Population is Not Normal, is the Method “Robust”?
What about a confidence interval using the t-distribution when n is small?
Even if the population distribution is not normal, confidence intervals using t-scores usually work quite well
We say the t-distribution is a robust method in terms of the normality assumption
With binary data
The 2002 GSS asked: “What do you think is the ideal number of children in a family?”
The 497 females who responded had a median of 2, mean of 3.02, and standard deviation of 1.81. What is the point estimate of the population mean?
497
2
3.02
1.81