Use of Chebyshev’s Theorem to Determine Confidence Intervals


Page 1: Use of Chebyshev’s Theorem to Determine Confidence Intervals

• Use when the sample comes from a relatively small population (n > 0.05N)
• The population is not normally distributed
• $1 - 1/k^2 = C\%$

$\bar{X} \pm k\,(SE)$

Page 2: Use of Chebyshev’s Theorem to Determine Confidence Intervals

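Find an 80% and a 90% confidence interval for a population of 500 with a sample size n = 100, $\bar{X} = 45$, and a standard deviation s = 12.

Use Chebyshev’s Theorem because 100 > 0.05(500) = 25.

80%: $1 - 1/k^2 = .8$, so $k \approx 2.23$

$\bar{X} \pm k\,(SE) = 45 \pm 2.23 \cdot \frac{12}{\sqrt{100}}$

[42.324, 47.676]

90%: $1 - 1/k^2 = .9$, so $k \approx 3.16$

$\bar{X} \pm k\,(SE) = 45 \pm 3.16 \cdot \frac{12}{\sqrt{100}}$

[41.208, 48.792]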
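The Chebyshev calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `chebyshev_interval` is an assumption, and the code carries k unrounded, so its limits differ slightly from the slide values, which use k to two decimal places.

```python
import math

def chebyshev_interval(mean, s, n, confidence):
    """Return (k, (low, high)) from 1 - 1/k^2 = confidence."""
    # Solve 1 - 1/k^2 = C for k.
    k = 1 / math.sqrt(1 - confidence)
    # Standard error of the mean, s / sqrt(n).
    se = s / math.sqrt(n)
    return k, (mean - k * se, mean + k * se)

# Slide example: n = 100, mean = 45, s = 12.
k80, ci80 = chebyshev_interval(45, 12, 100, 0.80)
k90, ci90 = chebyshev_interval(45, 12, 100, 0.90)
print(round(k80, 3), [round(v, 3) for v in ci80])
print(round(k90, 3), [round(v, 3) for v in ci90])
```

With unrounded k the 80% limits come out near 42.317 and 47.683, a little wider than the slide's [42.324, 47.676].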

Page 3: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Confidence Intervals for Population Proportions

Assumptions

1. The sample is a simple random sample.

2. The conditions for the binomial distribution are satisfied.

3. The normal distribution can be used to approximate the distribution of sample proportions because np ≥ 5 and n(1 - p) ≥ 5 are both satisfied.


Page 5: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Notation for Proportions

π = population proportion

p = x/n = sample proportion of x successes in a sample of size n


Page 7: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Definition: Point Estimate

The sample proportion p is the best point estimate of the population proportion π.

Page 8: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Standard Error

$SE = \sqrt{\frac{p(1-p)}{n}}$


Page 10: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Confidence Interval for Population Proportion

$p \pm z \cdot SE$

A poll of 100 students found that 60 prefer Mrs. Peloquin as a teacher compared to Mr. Roesler. Find 95% and 85% confidence intervals for the proportion of students who prefer Mrs. Peloquin.

p = .6 (need at least 80 see pg 340)
1 - p = .4

$SE = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(.6)(.4)}{100}} \approx .049$

95%: .6 ± 1.96 * .049 = [.504, .696]

85%: .6 ± 1.44 * .049 = [.529, .671]
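As a check on the poll example, here is a minimal Python sketch of the interval p ± z · SE. The function name `proportion_interval` is an assumption, and the critical value is taken from the standard library's `statistics.NormalDist` rather than from a printed table. With SE carried unrounded (about .049), the limits come out near [.504, .696] for 95% and [.529, .671] for 85%.

```python
import math
from statistics import NormalDist

def proportion_interval(x, n, confidence):
    """Normal-approximation confidence interval for a proportion."""
    p = x / n
    se = math.sqrt(p * (1 - p) / n)
    # Two-sided critical value: area (1 - confidence)/2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p - z * se, p + z * se

ci95 = proportion_interval(60, 100, 0.95)
ci85 = proportion_interval(60, 100, 0.85)
print([round(v, 3) for v in ci95])
print([round(v, 3) for v in ci85])
```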

Page 11: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Round-Off Rule for Confidence Interval Estimates of p

Round the confidence interval limits to three significant digits.


Page 15: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Determining Sample Size

$ME = z\sqrt{\frac{p(1-p)}{n}}$

(solve for n by algebra)

$n = \frac{z^2\,p(1-p)}{ME^2}$
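The algebra step is not written out on the slide; one way to fill it in, using the same symbols, is:

```latex
\begin{align*}
ME &= z\sqrt{\frac{p(1-p)}{n}} \\
ME^2 &= \frac{z^2\,p(1-p)}{n} && \text{(square both sides)} \\
n \cdot ME^2 &= z^2\,p(1-p) && \text{(multiply both sides by } n\text{)} \\
n &= \frac{z^2\,p(1-p)}{ME^2} && \text{(divide both sides by } ME^2\text{)}
\end{align*}
```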

Page 16: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Sample Size for Estimating Proportion p

When an estimate of p is known:

$n = \frac{z^2\,p(1-p)}{ME^2}$

When no estimate of p is known:

$n = \frac{z^2\,(0.25)}{ME^2}$


Page 20: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Example: We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? A 1997 study indicates 16.9% of U.S. households used e-mail.

$n = \frac{[z_{\alpha/2}]^2\,p(1-p)}{ME^2} = \frac{[1.645]^2\,(0.169)(0.831)}{0.04^2} = 237.51965$

Rounded up: n = 238 households

To be 90% confident that our sample percentage is within four percentage points of the true percentage for all households, we should randomly select and survey 238 households.


Page 24: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Example: We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? There is no prior information suggesting a possible value for the sample percentage.

$n = \frac{[z_{\alpha/2}]^2\,(0.25)}{ME^2} = \frac{(1.645)^2\,(0.25)}{0.04^2} = 422.81641$

Rounded up: n = 423 households

With no prior information, we need a larger sample to achieve the same results with 90% confidence and an error of no more than 4%.
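Both versions of the e-mail example (with and without a prior estimate of p) can be reproduced with a short Python sketch. The function name `sample_size_for_proportion` and its `p=None` convention are assumptions made for illustration; the result is always rounded up, since a fractional household cannot be surveyed.

```python
import math

def sample_size_for_proportion(z, margin, p=None):
    """n = z^2 * p(1-p) / ME^2, using p(1-p) = 0.25 when p is unknown."""
    pq = 0.25 if p is None else p * (1 - p)
    # Round up so the margin of error is not exceeded.
    return math.ceil(z ** 2 * pq / margin ** 2)

n_known = sample_size_for_proportion(1.645, 0.04, p=0.169)
n_unknown = sample_size_for_proportion(1.645, 0.04)
print(n_known, n_unknown)
```

The value 0.25 is the largest that p(1 - p) can take (at p = 0.5), which is why the no-prior-information case always gives the larger sample size.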


Page 26: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Small Samples: Assumptions

If
1) n ≤ 30
2) The sample is a simple random sample.
3) The sample is from a normally distributed population.

Case 1 (σ is known): Largely unrealistic; use z-scores

Case 2 (σ is unknown): Use Student t distribution


Page 28: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Student t Distribution

If the distribution of a population is essentially normal, then the distribution of

$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$

is essentially a Student t distribution for all samples of size n. It is used to find critical values denoted by $t_{\alpha/2}$.


Page 31: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Definition: Degrees of Freedom (df) = n - 1

corresponds to the number of sample values that can vary after certain restrictions have been imposed on all data values

[Figure: ten sample values; nine can be any number, and the tenth must be a specific number so that $\bar{x} = 80$]

n = 10, df = 10 - 1 = 9
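To make the idea concrete, here is a small Python sketch: once nine values are chosen freely and the mean is fixed at 80, the tenth value is forced. The nine numbers below are hypothetical and chosen only for illustration, as is the function name `last_value_for_mean`.

```python
def last_value_for_mean(free_values, target_mean):
    """Return the one remaining value that makes the sample mean equal target_mean."""
    n = len(free_values) + 1
    # The total must equal n * mean, so the last value is whatever is left over.
    return target_mean * n - sum(free_values)

free = [72, 95, 80, 61, 88, 79, 84, 90, 77]  # hypothetical: any nine numbers
last = last_value_for_mean(free, 80)
print(last)  # the tenth value is not free to vary
```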

Page 32: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Table A-3 t Distribution

Degrees of   .005      .01       .025      .05       .10       .25      (one tail)
freedom      .01       .02       .05       .10       .20       .50      (two tails)

1            63.657    31.821    12.706    6.314     3.078     1.000
2            9.925     6.965     4.303     2.920     1.886     .816
3            5.841     4.541     3.182     2.353     1.638     .765
4            4.604     3.747     2.776     2.132     1.533     .741
5            4.032     3.365     2.571     2.015     1.476     .727
6            3.707     3.143     2.447     1.943     1.440     .718
7            3.500     2.998     2.365     1.895     1.415     .711
8            3.355     2.896     2.306     1.860     1.397     .706
9            3.250     2.821     2.262     1.833     1.383     .703
10           3.169     2.764     2.228     1.812     1.372     .700
11           3.106     2.718     2.201     1.796     1.363     .697
12           3.054     2.681     2.179     1.782     1.356     .696
13           3.012     2.650     2.160     1.771     1.350     .694
14           2.977     2.625     2.145     1.761     1.345     .692
15           2.947     2.602     2.132     1.753     1.341     .691
16           2.921     2.584     2.120     1.746     1.337     .690
17           2.898     2.567     2.110     1.740     1.333     .689
18           2.878     2.552     2.101     1.734     1.330     .688
19           2.861     2.540     2.093     1.729     1.328     .688
20           2.845     2.528     2.086     1.725     1.325     .687
21           2.831     2.518     2.080     1.721     1.323     .686
22           2.819     2.508     2.074     1.717     1.321     .686
23           2.807     2.500     2.069     1.714     1.320     .685
24           2.797     2.492     2.064     1.711     1.318     .685
25           2.787     2.485     2.060     1.708     1.316     .684
26           2.779     2.479     2.056     1.706     1.315     .684
27           2.771     2.473     2.052     1.703     1.314     .684
28           2.763     2.467     2.048     1.701     1.313     .683
29           2.756     2.462     2.045     1.699     1.311     .683
Large (z)    2.575     2.327     1.960     1.645     1.282     .675

Page 33: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Given a variable that has a t-distribution with the specified degrees of freedom, what percentage of the time will it be in the indicated region?

a. 10 df, between -1.81 and 1.81: 90%

b. 10 df, between -2.23 and 2.23: 95%

c. 24 df, between -2.06 and 2.06: 95%

d. 24 df, between -2.80 and 2.80: 99%

e. 24 df, outside the interval from -2.80 to 2.80: 1%

f. 24 df, to the right of 2.80: .5%

g. 10 df, to the left of -1.81: 5%
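These answers come from reading the table, but they can also be checked numerically. The sketch below is one possible check using only the Python standard library: it integrates the Student t density with Simpson's rule. The function names and the step count are assumptions, and the results match the table answers only approximately because the table values are rounded.

```python
import math

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_prob_between(lo, hi, df, steps=2000):
    """Area under the t density between lo and hi (composite Simpson's rule)."""
    h = (hi - lo) / steps  # steps must be even for Simpson's rule
    total = t_pdf(lo, df) + t_pdf(hi, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(lo + i * h, df)
    return total * h / 3

a = t_prob_between(-1.81, 1.81, 10)   # part a, close to 90%
d = t_prob_between(-2.80, 2.80, 24)   # part d, close to 99%
print(round(a, 3), round(d, 3))
print(round((1 - d) / 2, 4))          # part f: one tail beyond 2.80
```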

Page 34: Use of Chebyshev’s Theorem to Determine Confidence Intervals

What are the appropriate t critical values for each of the confidence intervals?

a. 95% confidence, n = 17: 2.12
b. 90% confidence, n = 12: 1.80
c. 99% confidence, n = 14: 3.01
d. 90% confidence, n = 25: 1.71
e. 90% confidence, n = 13: 1.78
f. 95% confidence, n = 10: 2.262
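The lookup behind these answers is mechanical: take df = n - 1, then read the two-tails column for 1 minus the confidence level. A minimal Python sketch follows, with the needed rows copied from Table A-3. The dictionary `T_TABLE` and the function `t_critical` are illustrative names, and only the six rows used in this exercise are included.

```python
# Rows copied from Table A-3, keyed by degrees of freedom, then by confidence level.
T_TABLE = {
    9:  {0.90: 1.833, 0.95: 2.262, 0.99: 3.250},
    11: {0.90: 1.796, 0.95: 2.201, 0.99: 3.106},
    12: {0.90: 1.782, 0.95: 2.179, 0.99: 3.054},
    13: {0.90: 1.771, 0.95: 2.160, 0.99: 3.012},
    16: {0.90: 1.746, 0.95: 2.120, 0.99: 2.921},
    24: {0.90: 1.711, 0.95: 2.064, 0.99: 2.797},
}

def t_critical(n, confidence):
    """Look up the two-sided critical t value for sample size n (df = n - 1)."""
    return T_TABLE[n - 1][confidence]

for n, conf in [(17, 0.95), (12, 0.90), (14, 0.99), (25, 0.90), (13, 0.90), (10, 0.95)]:
    print(n, conf, t_critical(n, conf))
```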

Page 35: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Margin of Error ME for Estimate of µ Based on an Unknown σ and a Small Simple Random Sample from a Normally Distributed Population

$ME = t_{\alpha/2}\,\frac{s}{\sqrt{n}}$

where $t_{\alpha/2}$ has n - 1 degrees of freedom

Page 36: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Confidence Interval for the Estimate of µ

Based on an Unknown σ and a Small Simple Random Sample from a Normally Distributed Population

$\bar{x} - ME < \mu < \bar{x} + ME$

where $ME = t_{\alpha/2}\,\frac{s}{\sqrt{n}}$


Page 38: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Important Properties of the Student t Distribution

1. The Student t distribution is different for different sample sizes (see Figure 6-5 for the cases n = 3 and n = 12).

2. The Student t distribution has the same general symmetric bell shape as the normal distribution but it reflects the greater variability (with wider distributions) that is expected with small samples.

3. The Student t distribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0).

4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has σ = 1).

5. As the sample size n gets larger, the Student t distribution gets closer to the normal distribution. For values of n > 30, the differences are so small that we can use the critical z values instead of developing a much larger table of critical t values. (The values in the bottom row of Table A-3 are equal to the corresponding critical z values from the standard normal distribution.)

Page 39: Use of Chebyshev’s Theorem to Determine Confidence Intervals

[Figure: Student t Distributions for n = 3 and n = 12, each centered at 0, compared with the standard normal distribution]


Page 47: Use of Chebyshev’s Theorem to Determine Confidence Intervals

Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of µ, the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.)

$\bar{x}$ = 26,227

s = 15,873

α/2 = 0.05/2 = 0.025, so with n - 1 = 11 degrees of freedom, $t_{\alpha/2}$ = 2.201

$ME = t_{\alpha/2}\,\frac{s}{\sqrt{n}} = \frac{(2.201)(15,873)}{\sqrt{12}} = 10,085.3$

$\bar{x} - ME < \mu < \bar{x} + ME$

26,227 - 10,085.3 < µ < 26,227 + 10,085.3

$16,141.7 < µ < $36,312.3

We are 95% confident that this interval contains the average cost of repairing a Dodge Viper.
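The Dodge Viper calculation can be reproduced with a short Python sketch. The critical value 2.201 is read from Table A-3 (11 degrees of freedom, .05 in two tails) rather than computed, and the function name `t_interval` is an assumption.

```python
import math

def t_interval(mean, s, n, t_crit):
    """Return (ME, (low, high)) using ME = t * s / sqrt(n)."""
    me = t_crit * s / math.sqrt(n)
    return me, (mean - me, mean + me)

# n = 12 cars, so df = 11; t = 2.201 for 95% confidence from Table A-3.
me, (low, high) = t_interval(26227, 15873, 12, 2.201)
print(round(me, 1), round(low, 1), round(high, 1))
```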

Page 48: Use of Chebyshev’s Theorem to Determine Confidence Intervals

One Sided Confidence Intervals

A one sided confidence interval is a confidence interval that establishes either a likely minimum or a likely maximum value for µ, but not both.

Likely maximum: $\bar{X} + Z\,(SE_{\bar{X}})$

Likely minimum: $\bar{X} - Z\,(SE_{\bar{X}})$
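The slide gives no worked example, so the Python sketch below borrows the numbers from the earlier Chebyshev example (mean 45, s = 12, n = 100) purely for illustration. It assumes a one-tailed z value, so all of the error probability sits on one side; the function name `one_sided_bounds` is also an assumption.

```python
import math
from statistics import NormalDist

def one_sided_bounds(mean, s, n, confidence):
    """Return (likely minimum, likely maximum); each is a separate one-sided bound."""
    # One-tailed critical value: all of (1 - confidence) is in a single tail.
    z = NormalDist().inv_cdf(confidence)
    se = s / math.sqrt(n)
    return mean - z * se, mean + z * se

likely_min, likely_max = one_sided_bounds(45, 12, 100, 0.95)
print(round(likely_min, 3), round(likely_max, 3))
```

Only one of the two bounds is reported in any given problem: the likely maximum when an upper limit on µ is wanted, the likely minimum when a lower limit is wanted.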
