Lectures 7 - 8 Confidence Interval Estimation
Chapter Outline
• z-Based Confidence Intervals for a Population Mean: σ Known
• t-Based Confidence Intervals for a Population Mean: σ Unknown
• Confidence Intervals for a Population Proportion
• Sample Size Determination
Population vs. Sample
[Figure: a parameter describes the population; a statistic describes the sample. Source: https://onlinecourses.science.psu.edu/stat414/node/18]
CLT (Sampling distributions)
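$$\bar{X} \sim N\left(\mu_{\bar{X}} = \mu,\ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\right)$$
$$\hat{p} \sim N\left(\mu_{\hat{p}} = p,\ \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\right)$$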
[Figure: standard normal curve with the critical values −Z* and Z* marked. Source: https://www.statisticshowto.datasciencecentral.com/confidence-level/]
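$$\begin{aligned}
1-\alpha &= P\left(\mu_{\bar{X}} - z^*\sigma_{\bar{X}} \le \bar{X} \le \mu_{\bar{X}} + z^*\sigma_{\bar{X}}\right)\\
&= P\left(\mu - z^*\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + z^*\frac{\sigma}{\sqrt{n}}\right)\\
&= P\left(-z^*\frac{\sigma}{\sqrt{n}} \le \bar{X} - \mu \le z^*\frac{\sigma}{\sqrt{n}}\right)\\
&= P\left(-\bar{X} - z^*\frac{\sigma}{\sqrt{n}} \le -\mu \le -\bar{X} + z^*\frac{\sigma}{\sqrt{n}}\right)\\
&= P\left(\bar{X} - z^*\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z^*\frac{\sigma}{\sqrt{n}}\right)
\end{aligned}$$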
For a given confidence level $1-\alpha$, the above calculation suggests a form of CI for $\mu$:
$$\left(\bar{x} - z^*\frac{\sigma}{\sqrt{n}},\ \bar{x} + z^*\frac{\sigma}{\sqrt{n}}\right)$$
A confidence interval
A confidence interval gives a range of plausible values for a parameter at a specified confidence level. It has the following structure:
estimate ± margin of error
where margin of error = critical value × standard error of the estimate, that is,
estimate ± critical value × standard error
Confidence Interval for μ (σ Known)
• Assumptions:
  • Population standard deviation σ is known.
  • Population is normally distributed.
  • If population is not normal, use large sample (n > 30).
• Confidence interval estimate:
$$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
where $\bar{X}$ is the point estimate, $Z_{\alpha/2}$ is the normal distribution critical value for a probability of $\alpha/2$ in each tail, and $\sigma/\sqrt{n}$ is the standard error.
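Finding the Critical Value, Zα/2
• Consider a 95% confidence interval: $1-\alpha = 0.95$, so $\alpha = 0.05$ and $\alpha/2 = 0.025$ in each tail. The critical values are $-Z_{\alpha/2} = -1.96$ and $Z_{\alpha/2} = 1.96$.
[Figure: normal curve with area 0.025 in each tail. Z units: −1.96, 0, 1.96. X units: Lower Confidence Limit, Point Estimate, Upper Confidence Limit.]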
One-sample z Confidence Interval for μ:
$$\bar{x} \pm z^*\frac{\sigma}{\sqrt{n}}$$
where z* is the appropriate critical value of the N(0,1) distribution for the confidence level C.
This interval requires a random sample from a normal population. If the sample size is large (n > 30), the assumption of normality is less crucial and the result is approximate.
Confidence Interval Example (continued)
Sample #   X̄        Lower Limit   Upper Limit   Contains μ?
1          362.30   356.42        368.18        Yes
2          369.50   363.62        375.38        Yes
3          360.00   354.12        365.88        No
4          362.12   356.24        368.00        Yes
5          373.88   368.00        379.76        Yes
Example
• Find z* for a 90% confidence interval. z* = 1.645
• Find z* for a 95% confidence interval. z* = 1.96
• Find z* for a 99% confidence interval. z* = 2.576
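These critical values can be reproduced numerically rather than read from a table. The following is a minimal sketch using Python's standard library; the function name `z_star` is an illustrative choice, not something from the lecture.

```python
from statistics import NormalDist


def z_star(confidence: float) -> float:
    """Two-sided critical value z* with P(-z* <= Z <= z*) = confidence."""
    alpha = 1.0 - confidence
    # Leave alpha/2 in each tail: z* is the (1 - alpha/2) quantile of N(0, 1).
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(level, round(z_star(level), 3))
```

Rounded to three decimals, the 99% value comes out as 2.576; tables that interpolate between entries sometimes list 2.575 or 2.58 instead.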
Example
Determine a 95% confidence interval for the mean of a normal distribution with variance = 9, using a sample of 100 with the sample mean = 5.
95% CI for μ is (with σ = √9 = 3)
$$\bar{x} \pm z^*\frac{\sigma}{\sqrt{n}} = 5 \pm 1.96\cdot\frac{3}{\sqrt{100}} = (4.412,\ 5.588)$$
We are 95% confident that the true mean is between 4.412 and 5.588.
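The same arithmetic can be checked in a few lines of code. This is a sketch under the σ-known assumptions stated earlier; `z_interval` and its parameter names are hypothetical helpers introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """z-based CI for a mean: xbar +/- z* * sigma / sqrt(n), sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z*
    margin = z * sigma / sqrt(n)                        # margin of error
    return xbar - margin, xbar + margin


# Variance 9 means sigma = 3; sample of 100 with sample mean 5.
low, high = z_interval(xbar=5, sigma=3, n=100)
print(round(low, 3), round(high, 3))
```

The function uses the unrounded critical value, so results can differ from hand calculations with 1.96 in the last decimal place.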
Example
• A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.
• Determine a 95% confidence interval for the true mean resistance of the population.
• Solution:
$$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\left(0.35/\sqrt{11}\right) = 2.20 \pm 0.2068$$
$$1.9932 \le \mu \le 2.4068$$
Interpretation
• We are 95% confident that the true mean
resistance is between 1.9932 and 2.4068
ohms.
• Although the true mean may or may not be in
this interval, 95% of intervals formed in this
manner will contain the true mean.
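The "95% of intervals" statement can be illustrated by simulation. The sketch below repeatedly draws samples of 11 from a normal population and counts how often the interval covers the mean. Treating 2.20 as the true mean is purely an assumption for the simulation (in the example the true mean is unknown), and the function name, trial count, and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, confidence=0.95, trials=10_000, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and form the interval around its mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials


rate = coverage(mu=2.20, sigma=0.35, n=11)
print(rate)
```

The printed rate should land close to 0.95; any single interval either contains the mean or does not.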
Example
A survey of 100 executives revealed that their mean salary was $80,000. Assume the population standard deviation is $15,000. Determine the 95% confidence interval estimate of the mean salary for all executives.
[Diagram: Confidence Intervals for the Population Mean (σ Known, σ Unknown) and for the Population Proportion]
One-sample t Confidence Interval for μ (σ unknown):
$$\bar{x} \pm t^*\frac{s}{\sqrt{n}}$$
where t* is the appropriate critical value of the t distribution with n − 1 degrees of freedom for the confidence level C.
This interval requires a random sample from a normal population.
Defining the t Random Variable: σ Unknown
• Define a new random variable t:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
• The sampling distribution of this random variable is a t distribution with n − 1 degrees of freedom.
• The t distribution converges to the standard normal distribution as n goes to infinity.
[Figure: Example of t-distributions]
Confidence Interval for μ (σ Unknown)
• Assumptions:
  • Population standard deviation is unknown.
  • Population is normally distributed.
  • If population is not normal, use large sample (n > 30).
• Use Student's t Distribution.
• Confidence Interval Estimate:
$$\bar{X} \pm t_{\alpha/2}\frac{S}{\sqrt{n}}$$
(where $t_{\alpha/2}$ is the critical value of the t distribution with n − 1 degrees of freedom and an area of α/2 in each tail.)
Student's t Distribution
• The t is a family of distributions.
• The $t_{\alpha/2}$ value depends on degrees of freedom (d.f.):
  • Number of observations that are free to vary after the sample mean has been calculated.
  • d.f. = n − 1
Student's t Distribution
[Figure: density curves centered at 0 for t (df = 5), t (df = 13), and the standard normal (t with df = ∞).]
t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal.
Note: t → Z as n increases
Selected t distribution values
With comparison to the Z value
Confidence Level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   Z (∞ d.f.)
0.80               1.372         1.325         1.310         1.28
0.90               1.812         1.725         1.697         1.645
0.95               2.228         2.086         2.042         1.96
0.99               3.169         2.845         2.750         2.58
Note: t → Z as n increases
Example of t distribution confidence interval
A random sample of n = 25 has X̄ = 50 and S = 8. Form a 95% confidence interval for μ.
• d.f. = n − 1 = 24, so $t_{\alpha/2} = t_{0.025} = 2.0639$
The confidence interval is
$$\bar{X} \pm t_{\alpha/2}\frac{S}{\sqrt{n}} = 50 \pm (2.0639)\frac{8}{\sqrt{25}}$$
$$46.698 \le \mu \le 53.302$$
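A short sketch of the t-based interval follows. Python's standard library has no t distribution, so the critical value is passed in as read from a t table (2.0639 for 24 degrees of freedom, as in the example). The helper name `t_interval` is hypothetical; with SciPy installed, `scipy.stats.t.ppf(0.975, 24)` is one way to obtain the same quantile.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """t-based CI for a mean: xbar +/- t* * s / sqrt(n), sigma unknown.

    t_crit is the table value for n - 1 degrees of freedom.
    """
    margin = t_crit * s / sqrt(n)  # critical value times standard error
    return xbar - margin, xbar + margin


# n = 25, sample mean 50, sample standard deviation 8, t(0.025, 24) = 2.0639
low, high = t_interval(xbar=50, s=8, n=25, t_crit=2.0639)
print(round(low, 3), round(high, 3))
```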
Example of t distribution confidence interval (continued)
• Interpreting this interval requires the assumption that the population you are sampling from is approximately a normal distribution (especially since n is only 25).
• This condition can be checked by creating a:
  • Normal probability plot or
  • Boxplot.
Example
• Find t* for a 90% confidence interval, n = 10 (d.f. = 9). t* = 1.833
• Find t* for a 95% confidence interval, n = 40 (d.f. = 39). t* = 2.023
• Find t* for a 99% confidence interval, n = 35 (d.f. = 34).
Example
Determine a 95% confidence interval for the mean of a normal distribution with sample variance = 9, using a sample of 100 with the sample mean = 5.
95% CI for μ is (with s = √9 = 3 and d.f. = 99)
$$\bar{x} \pm t^*\frac{s}{\sqrt{n}} = 5 \pm 1.984\cdot\frac{3}{\sqrt{100}} = (4.405,\ 5.595)$$
We are 95% confident that the true mean is between 4.405 and 5.595.
Example
A group of 100 patients with end-stage renal disease were given the drug epoetin. The mean hemoglobin level of the patients was 10.3 g/dl, with sd of 0.9. Construct a 95% confidence interval for the population mean.
Example
In an experiment to assess the effectiveness of hormone replacement therapy, researchers gave conjugated equine estrogen (CEE) to a sample of 25 women between the ages of 45 and 64. After the women had taken the medication for 36 months, bone mineral density was measured for each of the 25 women. The average density was 0.878 g/cm², with a standard deviation of 0.126 g/cm². Find a 95% confidence interval for the mean bone mineral density.
Example
During a natural gas shortage, a gas company randomly sampled residential gas meters in order to monitor daily gas consumption. On a particular day, a sample of 100 meters showed a sample mean of 250 cubic feet and a sample standard deviation of 50 cubic feet. Provide a 90% confidence interval estimate of the mean gas consumption for the population.
90% CI for μ is
$$\bar{x} \pm t^*\frac{s}{\sqrt{n}} = 250 \pm 1.66\cdot\frac{50}{\sqrt{100}} = (241.7,\ 258.3)$$
We are 90% confident that the true mean is between 241.7 and 258.3.
One-sample z Confidence Interval for p (for n large):
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
where z* is the appropriate critical value of the N(0,1) distribution for the confidence level C.
This interval requires that the sample size be large.
Example
• A random sample of 100 people shows that 25 are left-handed.
• Form a 95% confidence interval for the true proportion of left-handers.
$$p \pm Z_{\alpha/2}\sqrt{p(1-p)/n} = 25/100 \pm 1.96\sqrt{0.25(0.75)/100} = 0.25 \pm 1.96(0.0433)$$
$$= (0.1651,\ 0.3349)$$
Interpretation
• We are 95% confident that the true percentage of left-handers in the population is between
16.51% and 33.49%.
• Although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.
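The large-sample interval for a proportion can be sketched in code as follows. The function `proportion_interval` and its parameter names are illustrative assumptions, and the normal approximation is only appropriate when n is large.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Large-sample z interval for a proportion: p_hat +/- z* * sqrt(p_hat(1 - p_hat)/n)."""
    p_hat = successes / n                               # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z*
    margin = z * sqrt(p_hat * (1 - p_hat) / n)          # margin of error
    return p_hat - margin, p_hat + margin


# 25 left-handers in a random sample of 100 people
low, high = proportion_interval(successes=25, n=100)
print(round(low, 4), round(high, 4))
```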
Example
A random sample of n = 500 American teenagers resulted in 54% stating they get along very well with their parents. What is the corresponding 95% confidence interval for the population proportion of teenagers that get along very well with their parents?
95% CI for p is
$$\hat{p} \pm z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.54 \pm 1.96\sqrt{\frac{0.54(1-0.54)}{500}} = (0.496,\ 0.584)$$
We are 95% confident that the true proportion is between 0.496 and 0.584.
Example
In a random sample of n = 400 adults, 300 stated they believe in love at first sight. Estimate the population proportion of adults that believe in love at first sight. Find an interval estimate for the population proportion p, with 95% confidence.
Activity: The Dominant Eye
Suppose our classroom constitutes a random sample. Each of us holds, at arm's length with both hands, a piece of paper with a 1 inch by 1 inch square cut in the middle. The person looks through the square at a relatively small object across the room, then closes one eye. If he or she can still see the object, the open eye is the dominant eye. If not, the closed eye is the dominant eye.
Estimate the proportion of people who are right-eye dominant with 95% confidence.
Sample size calculation
The sample size n needed to estimate the population mean when the population variance is known. (E is the margin of error. You may use s from a pilot study if the population sd is unknown.)
$$E = z^*\frac{\sigma}{\sqrt{n}} \quad\Longrightarrow\quad n = \left(\frac{z^*\sigma}{E}\right)^2$$
Sample size calculation
The sample size n needed to estimate the population proportion p (where p is taken from a pilot study and E is the margin of error):
$$E = z^*\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \quad\Longrightarrow\quad n = p(1-p)\left(\frac{z^*}{E}\right)^2$$
The sample size n needed to estimate the population proportion p (conservative formula, which takes p = 0.5; E is the margin of error):
$$n = \frac{1}{4}\left(\frac{z^*}{E}\right)^2$$
Determining Sample Size for the Mean
The sampling error (margin of error) e is the half-width of the interval $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$:
$$e = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Now solve for n to get
$$n = \frac{Z_{\alpha/2}^2\,\sigma^2}{e^2}$$
Determining Sample Size (continued)
• To determine the required sample size for the mean, you must know:
  • The desired level of confidence (1 − α), which determines the critical value, $Z_{\alpha/2}$.
  • The acceptable sampling error, e.
  • The standard deviation, σ.
Required Sample Size Example
If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?
$$n = \frac{Z^2\sigma^2}{e^2} = \frac{(1.645)^2(45)^2}{5^2} = 219.19$$
So the required sample size is n = 220 (always round up).
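The sample size formula for the mean, including the round-up step, can be sketched as below. The function name `sample_size_mean` is a hypothetical helper, and it uses the unrounded critical value instead of the table value 1.645.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(sigma, e, confidence):
    """Smallest n with z* * sigma / sqrt(n) <= e, i.e. n = (z* sigma / e)^2 rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z*
    return ceil((z * sigma / e) ** 2)                   # always round up


# sigma = 45, margin of error 5, 90% confidence
n_required = sample_size_mean(sigma=45, e=5, confidence=0.90)
print(n_required)
```

Rounding up rather than to the nearest integer ensures the margin of error does not exceed e.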
If σ is unknown
• If unknown, σ can be estimated when using the required sample size formula.
  • Use a value for σ that is expected to be at least as large as the true σ.
  • Select a pilot sample and estimate σ with the sample standard deviation, S.
Determining Sample Size for the Proportion
$$e = Z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}}$$
Now solve for n to get
$$n = \frac{Z_{\alpha/2}^2\,\pi(1-\pi)}{e^2}$$
Determining Sample Size (continued)
• To determine the required sample size for the proportion, you must know:
  • The desired level of confidence (1 − α), which determines the critical value, $Z_{\alpha/2}$.
  • The acceptable sampling error, e.
  • The true proportion of events of interest, π.
    • π can be estimated with a pilot sample if necessary (or conservatively use 0.5 as an estimate of π).
Required Sample Size Example
How large a sample would be necessary to estimate the true proportion of sales invoices containing errors in a large population within ±7%, with 95% confidence? (Assume a pilot sample yields p = 0.15.)
Solution:
For 95% confidence, use $Z_{\alpha/2}$ = 1.96
e = 0.07
p = 0.15, so use this to estimate π.
$$n = \frac{Z_{\alpha/2}^2\,\pi(1-\pi)}{e^2} = \frac{(1.96)^2(0.15)(1-0.15)}{(0.07)^2} = 99.96$$
So use n = 100
Example
Someone would like to estimate a population mean to within 40 units with 99% confidence given that the population standard deviation is 200. What sample size should be used?
$$n = \left(\frac{z^*\sigma}{E}\right)^2 = \left(\frac{2.576\times 200}{40}\right)^2 = 165.89, \text{ rounded up to } 166$$
Example
A research firm has been contracted to estimate the mean weekly family expenditure on clothes. From a pilot survey of small sample size of 10, it was found that the sample standard deviation of the weekly expenditure is $125. Determine with 99% confidence the number of families that must be sampled to estimate the mean weekly family expenditure on clothes to within $15.
Example
As a manufacturer of guitars, a major corporation wants to estimate the proportion of guitar players who are right-handed.
(a) How many guitar players must be surveyed if they want to be within 0.02, with a 95% confidence?
$$n = \frac{1}{4}\left(\frac{z^*}{E}\right)^2 = \frac{1}{4}\left(\frac{1.96}{0.02}\right)^2 = 2401$$
Example
(b) Assume that the manufacturer has an estimate of p from a previous study, which suggests that 75% of guitar players are right-handed. How many guitar players must be surveyed if they want to be within 0.02, with 95% confidence?
$$n = p(1-p)\left(\frac{z^*}{E}\right)^2 = 0.75(1-0.75)\left(\frac{1.96}{0.02}\right)^2 = 1800.75, \text{ rounded up to } 1801$$
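Both versions of the proportion sample size formula can be covered by one small function. This is a sketch rather than a definitive implementation; `sample_size_proportion` is a hypothetical name, and its default `p=0.5` corresponds to the conservative formula.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(e, confidence=0.95, p=0.5):
    """n = p(1 - p)(z*/e)^2, rounded up; p = 0.5 gives the conservative n = (1/4)(z*/e)^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z*
    return ceil(p * (1 - p) * (z / e) ** 2)             # always round up


n_conservative = sample_size_proportion(e=0.02)      # no prior estimate of p
n_prior = sample_size_proportion(e=0.02, p=0.75)     # prior estimate p = 0.75
print(n_conservative, n_prior)
```

A prior estimate away from 0.5 reduces the required sample size, because p(1 − p) is largest at p = 0.5.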
Review Question
A randomly selected sample of 12 students at a university is asked, “How much did you spend on textbooks this semester?” The responses, in dollars, are
200, 200, 175, 450, 300, 350, 250, 200, 175, 450, 300, 350
Compute a 90% CI for the population mean and interpret the CI.
Review Question
Following are data for a group of individuals on whether or not each individual exercises regularly (N = NO, Y = YES):
N, N, N, N, Y, N, N, Y, N, Y, N, Y, Y, N, N, Y, Y, N, N, N, N, N, N, Y, Y, N, Y, Y, N, N, N, Y, N, Y, N, N, Y
Compute a 99% CI for the population proportion of the individuals who regularly exercise and interpret the CI.
Review Question
What sample size is needed to estimate with 95 percent confidence the mean intake of calcium within 20 units of the true mean if the intake is normal with a population variance of 1900 units?
Review Question
What sample size is needed to estimate with 95 percent confidence the mean intake of calcium within 20 units of the true mean if the intake is normal and, from a pilot study with sample size of 20, the sample variance is found to be 1900 units?
Review Question
A car insurance company would like to determine the proportion of accident claims covered by the company. How large a sample should be taken to estimate the proportion of accident claims covered by the company if we want to be 98% confident that the sample percentage is within ±3% of the actual percentage of the accidents covered by the insurance company?
Review Question
A car insurance company would like to determine the proportion of accident claims covered by the company. According to a preliminary estimate 60% of the claims are covered. How large a sample should be taken to estimate the proportion of accident claims covered by the company if we want to be 98% confident that the sample percentage is within ±3% of the actual percentage of the accidents covered by the insurance company?
Review Question