Use of Chebyshev’s Theorem to Determine Confidence Intervals
natalie-romero -
1
Use of Chebyshev’s Theorem to Determine Confidence Intervals
• Use when the sample comes from a relatively small population (n > 0.05N)
• Population is not normally distributed
• $1 - 1/k^2 = C$, where C is the confidence level
• Interval: $\bar{X} \pm k \cdot SE$
2
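Find an 80% and a 90% confidence interval for a population of N = 500 with a sample size n = 100, a sample mean $\bar{X} = 45$ and a standard deviation s = 12.
Use Chebyshev’s Theorem because 100 > 0.05(500) = 25.
80%: $1 - 1/k^2 = 0.8$, so $k = \sqrt{5} \approx 2.236$
$\bar{X} \pm k \cdot SE = 45 \pm 2.236 \cdot \frac{12}{\sqrt{100}} = 45 \pm 2.68$
[42.32, 47.68]
90%: $1 - 1/k^2 = 0.9$, so $k = \sqrt{10} \approx 3.162$
$\bar{X} \pm k \cdot SE = 45 \pm 3.162 \cdot \frac{12}{\sqrt{100}} = 45 \pm 3.79$
[41.21, 48.79]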
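The calculation on this slide can be sketched in a few lines of Python. The function name `chebyshev_interval` is hypothetical, and the sketch assumes SE = s/√n with no finite-population correction, which is what the slide's arithmetic uses. It carries k at full precision, so the limits can differ from hand-rounded values in the last digit.

```python
import math

def chebyshev_interval(mean, s, n, confidence):
    """Chebyshev-style interval: mean +/- k * SE, where 1 - 1/k**2 = confidence."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    k = math.sqrt(1.0 / (1.0 - confidence))  # solve 1 - 1/k^2 = C for k
    se = s / math.sqrt(n)                     # standard error of the mean (assumed s / sqrt(n))
    return k, (mean - k * se, mean + k * se)

# Values from the slide: sample mean 45, s = 12, n = 100.
for level in (0.80, 0.90):
    k, (low, high) = chebyshev_interval(mean=45, s=12, n=100, confidence=level)
    print(f"{level:.0%}: k = {k:.3f}, interval = [{low:.2f}, {high:.2f}]")
```

Because Chebyshev’s bound makes no assumption about the shape of the distribution, k is larger than the matching z value, and the interval is wider than a normal-based one at the same confidence level.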
3
Confidence Intervals for Population Proportions
Assumptions
1. The sample is a simple random sample.
2. The conditions for the binomial distribution are satisfied.
3. The normal distribution can be used to approximate the distribution of sample proportions because np ≥ 5 and n(1 – p) ≥ 5 are both satisfied.
5
Notation for Proportions
π = population proportion
$p = \frac{x}{n}$ = sample proportion of x successes in a sample of size n
7
Definition: Point Estimate
The sample proportion p is the best point estimate of the population proportion π.
8
Standard Error
$SE = \sqrt{\frac{p(1-p)}{n}}$
10
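Confidence Interval for Population Proportion
$p \pm z \cdot SE$, where z is the critical value for the chosen confidence level
A poll of 100 students found that 60 prefer Mrs. Peloquin as a teacher compared to Mr. Roesler. Find 95% and 85% confidence intervals for the proportion of students who prefer Mrs. Peloquin.
p = .6 (need at least 80 see pg 340), 1 - p = .4
$SE = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(.6)(.4)}{100}} \approx .049$
95%: $.6 \pm 1.96 \cdot .049$ = [.504, .696]
85%: $.6 \pm 1.44 \cdot .049$ = [.529, .671]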
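As a rough check on the poll example, the interval p ± z · SE can be computed with the Python standard library. The function name `proportion_interval` is hypothetical. The critical value is taken from `statistics.NormalDist().inv_cdf` rather than a printed table, so the sketch keeps z and SE unrounded and the limits may differ slightly from hand calculations.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Normal-approximation interval p +/- z * SE for a population proportion."""
    p = successes / n
    se = sqrt(p * (1 - p) / n)
    # Two-sided critical value: leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p - z * se, p + z * se

# Poll from the slide: 60 of 100 students.
for level in (0.95, 0.85):
    low, high = proportion_interval(successes=60, n=100, confidence=level)
    print(f"{level:.0%}: [{low:.3f}, {high:.3f}]")
```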
11
Round-Off Rule for Confidence Interval Estimates of p
Round the confidence interval limits to
three significant digits.
15
Determining Sample Size
$ME = z\sqrt{\frac{p(1-p)}{n}}$
(solve for n by algebra)
$n = \frac{z^2 \, p(1-p)}{ME^2}$
16
Sample Size for Estimating Proportion p
When an estimate of p is known:
$n = \frac{z^2 \, p(1-p)}{ME^2}$
When no estimate of p is known:
$n = \frac{z^2 \, (0.25)}{ME^2}$
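The "solve for n by algebra" step, and the reason 0.25 appears when p is unknown, can be written out as follows. The symbols are the slide's own (ME for margin of error, z for the critical value). The bound p(1 - p) ≤ 0.25 is a standard fact that the slides do not state, so treat the last line as an added explanation rather than part of the original deck.

```latex
\begin{align*}
ME &= z\sqrt{\frac{p(1-p)}{n}} \\
ME^2 &= \frac{z^2\,p(1-p)}{n} \\
n &= \frac{z^2\,p(1-p)}{ME^2} \\[6pt]
p(1-p) &= \tfrac14 - \left(p - \tfrac12\right)^2 \le \tfrac14
\quad\Longrightarrow\quad n \le \frac{z^2\,(0.25)}{ME^2}
\end{align*}
```

Using 0.25 therefore gives the largest sample size any value of p could require, which is why it is the safe choice when no estimate is available.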
20
Example: We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? A 1997 study indicates 16.9% of U.S. households used e-mail.
$n = \frac{[z_{\alpha/2}]^2 \, p(1-p)}{ME^2} = \frac{[1.645]^2 (0.169)(0.831)}{0.04^2} = 237.51965$, which rounds up to 238 households
To be 90% confident that our sample percentage is within four percentage points of the true percentage for all households, we should randomly select and survey 238 households.
24
Example: We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? There is no prior information suggesting a possible value for the sample percentage.
$n = \frac{[z_{\alpha/2}]^2 (0.25)}{ME^2} = \frac{(1.645)^2 (0.25)}{0.04^2} = 422.81641$, which rounds up to 423 households
With no prior information, we need a larger sample to achieve the same results with 90% confidence and an error of no more than 4%.
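Both e-mail examples use the same formula and differ only in the value used for p(1 - p). A minimal sketch follows, with a hypothetical function name `sample_size_for_proportion`. It always rounds up, as the slides do, since a fractional household cannot be surveyed.

```python
import math

def sample_size_for_proportion(z, margin_of_error, p=None):
    """n = z^2 * p(1-p) / ME^2, rounded up; uses p(1-p) = 0.25 when p is unknown."""
    pq = 0.25 if p is None else p * (1 - p)
    return math.ceil(z ** 2 * pq / margin_of_error ** 2)

# 90% confidence (z = 1.645), margin of error of four percentage points.
print(sample_size_for_proportion(z=1.645, margin_of_error=0.04, p=0.169))  # prior estimate of p
print(sample_size_for_proportion(z=1.645, margin_of_error=0.04))           # no prior estimate
```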
25
26
Small Samples: Assumptions
If
1) n ≤ 30
2) The sample is a simple random sample.
3) The sample is from a normally distributed population.
Case 1 (σ is known): Largely unrealistic; use z-scores
Case 2 (σ is unknown): Use Student t distribution
28
Student t Distribution
If the distribution of a population is essentially normal, then the distribution of
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
is essentially a Student t Distribution for all samples of size n.
It is used to find critical values denoted by $t_{\alpha/2}$.
30
Definition: Degrees of Freedom (df)
corresponds to the number of sample values that can vary after certain restrictions have been imposed on all data values
df = n - 1 in this section
31
Definition: Degrees of Freedom (df) = n - 1
corresponds to the number of sample values that can vary after certain restrictions have been imposed on all data values
Example: n = 10, df = 10 - 1 = 9. Nine of the values can be any number; the remaining value must be one specific number so that $\bar{x} = 80$.
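The idea of nine free values and one forced value can be illustrated numerically. The nine numbers below are made up purely for illustration, and `forced_last_value` is a hypothetical helper name. Only the sample size (10) and the required mean (80) come from the slide.

```python
def forced_last_value(free_values, target_mean):
    """Return the one value that makes the full sample average to target_mean."""
    n = len(free_values) + 1
    return target_mean * n - sum(free_values)

# Hypothetical sample: nine freely chosen values (df = 9), required mean of 80.
free = [72, 95, 81, 60, 88, 79, 83, 90, 77]
last = forced_last_value(free, target_mean=80)
sample = free + [last]
print(last, sum(sample) / len(sample))
```

Whatever nine values are chosen, the tenth is determined once the mean is fixed, which is why only n - 1 values are free to vary.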
32
Table A-3 t Distribution
Degrees of   .005 (one tail)   .01 (one tail)    .025 (one tail)   .05 (one tail)    .10 (one tail)    .25 (one tail)
freedom      .01 (two tails)   .02 (two tails)   .05 (two tails)   .10 (two tails)   .20 (two tails)   .50 (two tails)
1            63.657            31.821            12.706            6.314             3.078             1.000
2            9.925             6.965             4.303             2.920             1.886             .816
3            5.841             4.541             3.182             2.353             1.638             .765
4            4.604             3.747             2.776             2.132             1.533             .741
5            4.032             3.365             2.571             2.015             1.476             .727
6            3.707             3.143             2.447             1.943             1.440             .718
7            3.500             2.998             2.365             1.895             1.415             .711
8            3.355             2.896             2.306             1.860             1.397             .706
9            3.250             2.821             2.262             1.833             1.383             .703
10           3.169             2.764             2.228             1.812             1.372             .700
11           3.106             2.718             2.201             1.796             1.363             .697
12           3.054             2.681             2.179             1.782             1.356             .696
13           3.012             2.650             2.160             1.771             1.350             .694
14           2.977             2.625             2.145             1.761             1.345             .692
15           2.947             2.602             2.132             1.753             1.341             .691
16           2.921             2.584             2.120             1.746             1.337             .690
17           2.898             2.567             2.110             1.740             1.333             .689
18           2.878             2.552             2.101             1.734             1.330             .688
19           2.861             2.540             2.093             1.729             1.328             .688
20           2.845             2.528             2.086             1.725             1.325             .687
21           2.831             2.518             2.080             1.721             1.323             .686
22           2.819             2.508             2.074             1.717             1.321             .686
23           2.807             2.500             2.069             1.714             1.320             .685
24           2.797             2.492             2.064             1.711             1.318             .685
25           2.787             2.485             2.060             1.708             1.316             .684
26           2.779             2.479             2.056             1.706             1.315             .684
27           2.771             2.473             2.052             1.703             1.314             .684
28           2.763             2.467             2.048             1.701             1.313             .683
29           2.756             2.462             2.045             1.699             1.311             .683
Large (z)    2.575             2.327             1.960             1.645             1.282             .675
33
Given a variable that has a t-distribution with the specified degrees of freedom, what percentage of the time will it be in the indicated region?
a. 10 df, between -1.81 and 1.81. Answer: 90%
b. 10 df, between -2.23 and 2.23. Answer: 95%
c. 24 df, between -2.06 and 2.06. Answer: 95%
d. 24 df, between -2.80 and 2.80. Answer: 99%
e. 24 df, outside the interval from -2.80 to 2.80. Answer: 1%
f. 24 df, to the right of 2.80. Answer: .5%
g. 10 df, to the left of -1.81. Answer: 5%
34
What are the appropriate t critical values for each of the confidence intervals?
a. 95% confidence, n = 17: 2.12
b. 90% confidence, n = 12: 1.80
c. 99% confidence, n = 14: 3.01
d. 90% confidence, n = 25: 1.71
e. 90% confidence, n = 13: 1.78
f. 95% confidence, n = 10: 2.262
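The lookups in this exercise follow one rule: take df = n - 1 and read the column whose two-tail area equals 1 minus the confidence level. A small sketch of that rule is below. The dictionary `T_TABLE` and the function `t_critical` are hypothetical names, and the numbers are copied from three columns of Table A-3 in this deck rather than computed, so the sketch only covers df = 1 to 29 and those three confidence levels.

```python
# Two-tailed critical values copied from Table A-3 for df = 1..29.
T_TABLE = {
    0.90: [6.314, 2.920, 2.353, 2.132, 2.015, 1.943, 1.895, 1.860, 1.833, 1.812,
           1.796, 1.782, 1.771, 1.761, 1.753, 1.746, 1.740, 1.734, 1.729, 1.725,
           1.721, 1.717, 1.714, 1.711, 1.708, 1.706, 1.703, 1.701, 1.699],
    0.95: [12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
           2.201, 2.179, 2.160, 2.145, 2.132, 2.120, 2.110, 2.101, 2.093, 2.086,
           2.080, 2.074, 2.069, 2.064, 2.060, 2.056, 2.052, 2.048, 2.045],
    0.99: [63.657, 9.925, 5.841, 4.604, 4.032, 3.707, 3.500, 3.355, 3.250, 3.169,
           3.106, 3.054, 3.012, 2.977, 2.947, 2.921, 2.898, 2.878, 2.861, 2.845,
           2.831, 2.819, 2.807, 2.797, 2.787, 2.779, 2.771, 2.763, 2.756],
}

def t_critical(confidence, n):
    """Look up the two-sided t critical value with df = n - 1 in Table A-3."""
    df = n - 1
    if not 1 <= df <= 29:
        raise ValueError("this sketch only covers df = 1..29")
    return T_TABLE[confidence][df - 1]

# The six cases from the exercise, in order a through f.
for confidence, n in [(0.95, 17), (0.90, 12), (0.99, 14), (0.90, 25), (0.90, 13), (0.95, 10)]:
    print(f"{confidence:.0%} confidence, n = {n}: t = {t_critical(confidence, n)}")
```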
35
Margin of Error ME for Estimate of µ Based on an Unknown σ and a Small Simple Random Sample from a Normally Distributed Population
$ME = t_{\alpha/2} \frac{s}{\sqrt{n}}$
where $t_{\alpha/2}$ has n - 1 degrees of freedom
36
Confidence Interval for the Estimate of µ
Based on an Unknown σ and a Small Simple Random Sample from a Normally Distributed Population
$\bar{x} - ME < \mu < \bar{x} + ME$
where $ME = t_{\alpha/2} \frac{s}{\sqrt{n}}$
38
Important Properties of the Student t Distribution
1. The Student t distribution is different for different sample sizes (see Figure 6-5 for the cases n = 3 and n = 12).
2. The Student t distribution has the same general symmetric bell shape as the normal distribution but it reflects the greater variability (with wider distributions) that is expected with small samples.
3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0).
4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has σ = 1).
5. As the sample size n gets larger, the Student t distribution gets closer to the normal distribution. For values of n > 30, the differences are so small that we can use the critical z values instead of developing a much larger table of critical t values. (The values in the bottom row of Table A-3 are equal to the corresponding critical z values from the standard normal distribution.)
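Properties 4 and 5 can be made concrete with the closed-form standard deviation of the t distribution, $\sqrt{df/(df-2)}$ for df > 2. This formula is a standard result that the slides do not state, so it is brought in here as an assumption for illustration; the function name `t_standard_deviation` is hypothetical.

```python
import math

def t_standard_deviation(n):
    """Standard deviation of Student t with df = n - 1 (finite only when df > 2)."""
    df = n - 1
    if df <= 2:
        return math.inf  # the variance is not finite for df <= 2
    return math.sqrt(df / (df - 2))

# The value stays above 1 and shrinks toward 1 as the sample size grows.
for n in (4, 12, 31, 1001):
    print(n, round(t_standard_deviation(n), 4))
```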
39
[Figure: Student t Distributions for n = 3 and n = 12. Curves shown, all centered at 0: standard normal distribution, Student t distribution with n = 12, Student t distribution with n = 3.]
47
Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of µ, the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.)
$\bar{x}$ = 26,227
s = 15,873
$\alpha/2$ = 0.05/2 = 0.025, so with df = 12 - 1 = 11, $t_{\alpha/2}$ = 2.201
$ME = t_{\alpha/2} \frac{s}{\sqrt{n}} = \frac{(2.201)(15{,}873)}{\sqrt{12}} = 10{,}085.3$
$\bar{x} - ME < \mu < \bar{x} + ME$
26,227 - 10,085.3 < µ < 26,227 + 10,085.3
$16,141.7 < µ < $36,312.3
We are 95% confident that this interval contains the mean cost of repairing a Dodge Viper involved in a collision.
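The arithmetic in this example can be reproduced with a short sketch. The function name `t_interval` is hypothetical, and the critical value 2.201 is passed in by hand because it is read from Table A-3 (df = 11, .05 in two tails) rather than computed.

```python
import math

def t_interval(mean, s, n, t_crit):
    """Return (ME, lower, upper) for mean +/- t_crit * s / sqrt(n)."""
    me = t_crit * s / math.sqrt(n)
    return me, mean - me, mean + me

# Dodge Viper example: mean 26,227, s = 15,873, n = 12, t from Table A-3.
me, low, high = t_interval(mean=26227, s=15873, n=12, t_crit=2.201)
print(f"ME = {me:,.1f}; {low:,.1f} < mu < {high:,.1f}")
```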
48
One Sided Confidence Intervals
A one sided confidence interval is a confidence interval that establishes either a likely minimum or a likely maximum value for µ, but not both.
Likely maximum: $\bar{X} + Z \cdot SE$
Likely minimum: $\bar{X} - Z \cdot SE$
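A minimal sketch of the two one-sided bounds follows. The slide gives no worked numbers, so the inputs below are borrowed from the earlier example (mean 45, s = 12, n = 100) purely for illustration. The function name `one_sided_bounds` is hypothetical. The sketch also assumes the usual convention that a one-sided interval puts the whole error probability in a single tail, so Z is the value with the full confidence level below it (about 1.645 at 95%).

```python
from math import sqrt
from statistics import NormalDist

def one_sided_bounds(mean, s, n, confidence):
    """Return (likely minimum, likely maximum); each is a separate one-sided bound."""
    se = s / sqrt(n)
    z = NormalDist().inv_cdf(confidence)  # all of alpha sits in a single tail
    return mean - z * se, mean + z * se

# Illustrative inputs only, reused from the earlier slide.
likely_min, likely_max = one_sided_bounds(mean=45, s=12, n=100, confidence=0.95)
print(f"likely minimum: {likely_min:.3f}")
print(f"likely maximum: {likely_max:.3f}")
```

Each bound is reported on its own: one states "µ is likely at least this much" or "µ is likely at most this much", not both at the stated confidence level.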