Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Page 1: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Confidence Intervals for µ1 - µ2 and p1 - p2

Page 2: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Inference about Two Populations

• We are interested in:

  • Confidence intervals for the difference between two means.

  • Confidence intervals for the difference between two proportions.

Page 3: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Confidence Intervals for the Difference between Two Population Means µ1 - µ2: Independent Samples

• Two random samples are drawn from the two populations of interest.

• Because we compare two population means, we use the statistic $\bar{x}_1 - \bar{x}_2$.

Page 4: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Population 1                                  Population 2

Parameters: $\mu_1$ and $\sigma_1^2$          Parameters: $\mu_2$ and $\sigma_2^2$
(values are unknown)                          (values are unknown)

Sample size: $n_1$                            Sample size: $n_2$

Statistics: $\bar{x}_1$ and $s_1^2$           Statistics: $\bar{x}_2$ and $s_2^2$

Estimate $\mu_1 - \mu_2$ with $\bar{x}_1 - \bar{x}_2$.

Page 5: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Confidence Interval for µ1 – µ2

Confidence interval:

$$(\bar{x}_1 - \bar{x}_2) \pm z^* \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

where $z^*$ is the value from the z-table that corresponds to the confidence level.

Note: when the values of $\sigma_1^2$ and $\sigma_2^2$ are unknown, the sample variances $s_1^2$ and $s_2^2$ computed from the data can be used.
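The z-table lookup for $z^*$ can also be done in software. The following is a minimal sketch, assuming Python 3.8 or later and its standard `statistics.NormalDist` class; the helper name `z_star` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist

def z_star(confidence: float) -> float:
    """Two-sided critical value z* for a confidence level such as 0.95."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Leave (1 - confidence) / 2 of the probability in each tail
    # of the standard normal distribution.
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)

# Common confidence levels and their critical values.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```

For a 95% confidence level this gives a value that rounds to the familiar 1.96.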

Page 6: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: confidence interval for µ1 – µ2

• Do people who eat high-fiber cereal for breakfast consume, on average, fewer calories for lunch than people who do not eat high-fiber cereal for breakfast?

• A sample of 150 people was randomly drawn. Each person was identified as a consumer or a non-consumer of high-fiber cereal.

• For each person the number of calories consumed at lunch was recorded.

Page 7: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: confidence interval for µ1 – µ2

Calories consumed at lunch (first rows of the data):

Consumers   Non-consumers
568         705
498         819
589         706
681         509
540         613
646         582
636         601
739         608
539         787
596         573
607         428
529         754
637         741
617         628
633         537
555         748
...         ...

Solution:

• The parameter to be tested is the difference between two means.

• The claim to be tested is: The mean caloric intake of consumers ($\mu_1$) is less than that of non-consumers ($\mu_2$).

• Use $s_1^2 = 4{,}103$ for $\sigma_1^2$ and $s_2^2 = 10{,}670$ for $\sigma_2^2$.

$n_1 = 43,\ \bar{x}_1 = 604.02;\quad n_2 = 107,\ \bar{x}_2 = 633.239$

Page 8: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

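Example: confidence interval for µ1 – µ2

• The confidence interval estimator for the difference between two means is

$$(\bar{x}_1 - \bar{x}_2) \pm z^* \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (604.02 - 633.239) \pm 1.96 \sqrt{\frac{4103}{43} + \frac{10670}{107}} = -29.21 \pm 27.38 = (-56.59, -1.83)$$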
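The arithmetic in this example can be reproduced with a few lines of Python. This is a sketch under the slide's own assumptions (sample variances standing in for the population variances, $z^* = 1.96$); the function name `two_mean_z_interval` is hypothetical.

```python
from math import sqrt

def two_mean_z_interval(xbar1, var1, n1, xbar2, var2, n2, z=1.96):
    """Return (lower, upper) for (xbar1 - xbar2) +/- z * sqrt(var1/n1 + var2/n2)."""
    diff = xbar1 - xbar2
    margin = z * sqrt(var1 / n1 + var2 / n2)
    return diff - margin, diff + margin

# Summary statistics from the cereal example.
# Note that 4103 and 10670 are variances, not standard deviations.
lower, upper = two_mean_z_interval(604.02, 4103, 43, 633.239, 10670, 107)
print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```

The endpoints may differ from the slide's figures in the last decimal place, because the slide rounds the difference of means to -29.21 before adding and subtracting the margin.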

Page 9: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Interpretation

• The 95% CI is (-56.59, -1.83).

• We are 95% confident that the interval (-56.59, -1.83) contains the true but unknown difference µ1 – µ2.

• Since the interval is entirely negative (that is, does not contain 0), there is evidence from the data that µ1 is less than µ2. We estimate that non-consumers of high-fiber cereal consume, on average, between 1.83 and 56.59 more calories at lunch than consumers.

Page 10: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Does smoking damage the lungs of children exposed to parental smoking?

Forced vital capacity (FVC) is the volume (in milliliters) of air that an individual can exhale in 6 seconds.

FVC was obtained for a sample of children not exposed to parental smoking and a group of children exposed to parental smoking.

We want to know whether parental smoking decreases children’s lung capacity as measured by the FVC test.

Is the mean FVC lower in the population of children exposed to parental smoking?

Parental smoking   Mean FVC ($\bar{x}$)   s      n
Yes                75.5                   9.3    30
No                 88.2                   15.1   30

Page 11: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

$\mu_1$ = mean FVC of children with a smoking parent; $\mu_2$ = mean FVC of children without a smoking parent.

95% confidence interval for (µ1 − µ2):

$$(\bar{x}_1 - \bar{x}_2) \pm z^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (75.5 - 88.2) \pm 1.96 \sqrt{\frac{9.3^2}{30} + \frac{15.1^2}{30}} = -12.7 \pm 1.96 \times 3.24 = -12.7 \pm 6.35 = (-19.05, -6.35)$$

We are 95% confident that mean lung capacity (FVC) in children of smoking parents is between 6.35 and 19.05 milliliters LESS than in children without a smoking parent.
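In this example the table reports standard deviations rather than variances, so each value has to be squared before it enters the formula. A short sketch of that step, with the hypothetical helper name `mean_diff_interval_from_sd` and $z^* = 1.96$ assumed:

```python
from math import sqrt

def mean_diff_interval_from_sd(xbar1, s1, n1, xbar2, s2, n2, z=1.96):
    """Interval for mu1 - mu2 when s1 and s2 are standard deviations."""
    # Square the standard deviations to obtain the sample variances.
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - z * se, diff + z * se, se

# Group 1: children with a smoking parent; group 2: children without.
low, high, se = mean_diff_interval_from_sd(75.5, 9.3, 30, 88.2, 15.1, 30)
print(f"standard error = {se:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Passing the standard deviations unsquared would understate the standard error and make the interval far too narrow.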

Page 12: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Bunny Rabbits and Pirates on the Box

• The data below show the sugar content (as a percentage of weight) of 10 brands of cereal randomly selected from a supermarket shelf that is at a child’s eye level and 8 brands selected from the top shelf.

Eye level: 40.3  55  45.7  43.3  50.3  45.9  53.5  43  44.2  44

Top:       20  2.2  7.5  4.4  22.2  16.6  14.5  10

Create and interpret a 95% confidence interval for the difference $\mu_1 - \mu_2$ in mean sugar content, where $\mu_1$ is the mean sugar content of cereal at a child’s eye level and $\mu_2$ is the mean sugar content of cereal on the top shelf.

Page 13: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Eye level: $\bar{x}_1 = 46.52$, $s_1^2 = 23.24$, $n_1 = 10$

Top: $\bar{x}_2 = 12.18$, $s_2^2 = 53.32$, $n_2 = 8$

95% confidence interval:

$$(\bar{x}_1 - \bar{x}_2) \pm 1.96 \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (46.52 - 12.18) \pm 1.96 \sqrt{\frac{23.24}{10} + \frac{53.32}{8}} = 34.34 \pm 5.88 = (28.46, 40.22)$$
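Because the raw sugar-content data are given, the summary statistics and the interval can be computed directly from them. The sketch below assumes Python's standard `statistics` module, whose `variance` function uses the sample (n − 1) divisor; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Sugar content (% of weight) as listed in the example.
eye_level = [40.3, 55.0, 45.7, 43.3, 50.3, 45.9, 53.5, 43.0, 44.2, 44.0]
top_shelf = [20.0, 2.2, 7.5, 4.4, 22.2, 16.6, 14.5, 10.0]

# Sample mean, sample variance (n - 1 divisor), and sample size per shelf.
xbar1, var1, n1 = mean(eye_level), variance(eye_level), len(eye_level)
xbar2, var2, n2 = mean(top_shelf), variance(top_shelf), len(top_shelf)

diff = xbar1 - xbar2
margin = 1.96 * sqrt(var1 / n1 + var2 / n2)
lower, upper = diff - margin, diff + margin

print(f"eye level: mean={xbar1:.2f}, variance={var1:.2f}, n={n1}")
print(f"top shelf: mean={xbar2:.2f}, variance={var2:.2f}, n={n2}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Small differences from the slide's endpoints in the second decimal place can arise because the slide rounds the means before subtracting.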

Page 14: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Interpretation

• We are 95% confident that the interval (28.46, 40.22) contains the true but unknown value of $\mu_1 - \mu_2$.

• Note that the interval is entirely positive (does not contain 0); therefore, it appears that the mean amount of sugar $\mu_1$ in cereal on the shelf at a child’s eye level is larger than the mean amount $\mu_2$ on the top shelf.

Page 15: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Do left-handed people have a shorter life expectancy than right-handed people?

Some psychologists believe that the stress of being left-handed in a right-handed world leads to earlier deaths among left-handers.

Several studies have compared the life expectancies of left-handers and right-handers. One such study resulted in the data shown in the table.

We will use the data to construct a confidence interval for the difference in mean life expectancies for left-handers and right-handers.

Is the mean life expectancy of left-handers less than the mean life expectancy of right-handers?

Handedness   Mean age at death ($\bar{x}$)   s      n
Left         66.8                            25.3   99
Right        75.2                            15.1   888

[Images: left-handed presidents; star left-handed quarterback Steve Young]

Page 16: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

$\mu_1$ = mean life expectancy of left-handers; $\mu_2$ = mean life expectancy of right-handers.

95% confidence interval for (µ1 − µ2):

$$(\bar{x}_1 - \bar{x}_2) \pm z^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (66.8 - 75.2) \pm 1.96 \sqrt{\frac{(25.3)^2}{99} + \frac{(15.1)^2}{888}} = -8.4 \pm 1.96 \times 2.59 = -8.4 \pm 5.08 = (-13.48, -3.32)$$

We are 95% confident that the mean life expectancy for left-handers is between 3.32 and 13.48 years LESS than the mean life expectancy for right-handers.

[Image: The “Bambino”, left-handed hitter Babe Ruth, baseball’s all-time best hitter]
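As a check on the standard error of 2.59 used in the left-hander interval, the two variance terms can be evaluated separately (values rounded to three decimals):

$$\frac{25.3^2}{99} = \frac{640.09}{99} \approx 6.466, \qquad \frac{15.1^2}{888} = \frac{228.01}{888} \approx 0.257, \qquad \sqrt{6.466 + 0.257} = \sqrt{6.723} \approx 2.59$$

Nearly all of the uncertainty comes from the left-handed sample, which is both smaller (n = 99) and more variable (s = 25.3) than the right-handed sample.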

Page 17: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: confidence interval for µ1 – µ2

• Example

  • An ergonomic chair can be assembled using two different sets of operations (Method A and Method B).

  • The operations manager would like to know whether the assembly times under the two methods differ.

Page 18: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: confidence interval for µ1 – µ2

• Example

  • Two samples are randomly and independently selected.

    • A sample of 25 workers assembled the chair using method A.

    • A sample of 25 workers assembled the chair using method B.

  • The assembly times were recorded.

  • Do the assembly times of the two methods differ?

Page 19: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: confidence interval for µ1 – µ2

Assembly times in minutes (first rows of the data):

Method A   Method B
6.8        5.2
5.0        6.7
7.9        5.7
5.2        6.6
7.6        8.5
5.0        6.5
5.9        5.9
5.2        6.7
6.5        6.6
...        ...

Solution

• The parameter of interest is the difference between two population means.

• The claim to be tested is whether a difference between the two methods exists.

• Use $s_1^2 = .848$ for $\sigma_1^2$ and $s_2^2 = 1.303$ for $\sigma_2^2$.

Page 20: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

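Example: confidence interval for µ1 – µ2

A 95% confidence interval for µ1 - µ2 is calculated as follows:

$$(\bar{x}_1 - \bar{x}_2) \pm z^* \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (6.288 - 6.016) \pm 1.96 \sqrt{\frac{.848}{25} + \frac{1.303}{25}} = 0.272 \pm 0.5749 = [-0.3029, 0.8469]$$

We are 95% confident that the interval (-0.3029, 0.8469) contains the true but unknown µ1 - µ2.

Notice: “Zero” is included in the confidence interval, so at the 95% confidence level the data do not show a difference between the mean assembly times of the two methods.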
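Checking whether zero lies inside the interval is easy to automate. A minimal sketch using the summary statistics of the assembly-time example (sample means 6.288 and 6.016, sample variances .848 and 1.303, 25 workers per method); the variable names are illustrative.

```python
from math import sqrt

# Summary statistics for Method A and Method B.
xbar_a, var_a, n_a = 6.288, 0.848, 25
xbar_b, var_b, n_b = 6.016, 1.303, 25

diff = xbar_a - xbar_b
margin = 1.96 * sqrt(var_a / n_a + var_b / n_b)
lower, upper = diff - margin, diff + margin

# If 0 is a plausible value for mu1 - mu2, the data do not
# show a difference between the two means.
contains_zero = lower <= 0.0 <= upper

print(f"95% CI: ({lower:.4f}, {upper:.4f})")
print("zero inside interval:", contains_zero)
```

An interval that straddles zero is consistent with either method being faster on average, so no direction can be concluded from these data.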

Page 21: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Confidence Intervals for the difference p1 – p2 between two population proportions

• In this section we deal with two populations whose data are qualitative.

• For qualitative data we compare the population proportions of the occurrence of a certain event.

• Examples

  • Comparing the effectiveness of a new drug versus an older one

  • Comparing market share before and after an advertising campaign

  • Comparing defective rates between two machines

Page 22: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Parameter and Statistic

• Parameter

  • When the data are qualitative, we can only count the occurrences of a certain event in the two populations, and calculate proportions.

  • The parameter we want to estimate is p1 – p2.

• Statistic

  • An estimator of p1 – p2 is $\hat{p}_1 - \hat{p}_2$ (the difference between the sample proportions).

Page 23: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Point Estimator: $\hat{p}_1 - \hat{p}_2$

• Two random samples are drawn from two populations.

• The number of successes in each sample is recorded.

• The sample proportions are computed.

Sample 1                                        Sample 2
Sample size: $n_1$                              Sample size: $n_2$
Number of successes: $x_1$                      Number of successes: $x_2$
Sample proportion: $\hat{p}_1 = x_1 / n_1$      Sample proportion: $\hat{p}_2 = x_2 / n_2$

Page 24: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Confidence Interval for p1 – p2

$$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}$$

where $z^*$ is the appropriate value from the z-table that depends on the confidence level.

Page 25: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: confidence interval for p1 – p2

• Estimating the cost of life saved

• Two drugs are used to treat heart attack victims:

  • Streptokinase (available since 1959, costs $460)

  • t-PA (genetically engineered, costs $2900).

• The maker of t-PA claims that its drug outperforms Streptokinase.

• An experiment was conducted in 15 countries.

  • 20,500 patients were given t-PA

  • 20,500 patients were given Streptokinase

  • The number of deaths by heart attacks was recorded.

Page 26: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: confidence interval for p1 – p2 (cont.)

• Solution

  • The problem objective: Compare the outcomes of two treatments.

  • The data are qualitative (a patient lived or died).

  • The parameter to be estimated is p1 – p2.

    • p1 = death rate with Streptokinase

    • p2 = death rate with t-PA

Page 27: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: confidence interval for p1 – p2 (cont.)

• Experiment results

  • A total of 1497 patients treated with Streptokinase died.

  • A total of 1292 patients treated with t-PA died.

• Estimate the difference in the death rates when using Streptokinase and when using t-PA.

Page 28: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: confidence interval for p1 – p2 (cont.)

• Compute: Manually

  • Sample proportions:

$$\hat{p}_1 = \frac{1497}{20500} = .0730, \qquad \hat{p}_2 = \frac{1292}{20500} = .0630$$

  • The 95% confidence interval estimate is

$$(\hat{p}_1 - \hat{p}_2) \pm 1.96 \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}} = .0730 - .0630 \pm 1.96 \sqrt{\frac{.0730 (1 - .0730)}{20500} + \frac{.0630 (1 - .0630)}{20500}} = .0100 \pm .0049 = (.0051, .0149)$$
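The manual computation can be cross-checked from the raw counts. This is a sketch only: the function name `two_proportion_z_interval` is hypothetical and $z^* = 1.96$ is assumed for the 95% level.

```python
from math import sqrt

def two_proportion_z_interval(x1, n1, x2, n2, z=1.96):
    """Return (lower, upper) for p1 - p2 from success counts and sample sizes."""
    p1, p2 = x1 / n1, x2 / n2
    # Standard error of the difference between two sample proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Deaths among 20,500 Streptokinase patients and 20,500 t-PA patients.
lower, upper = two_proportion_z_interval(1497, 20500, 1292, 20500)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

Working from counts instead of rounded proportions avoids carrying rounding error into the standard error; here the result agrees with the manual interval to four decimal places.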

Page 29: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: confidence interval for p1 – p2 (cont.)

• Interpretation

  • The interval (.0051, .0149) for p1 – p2 does not contain 0; it is entirely positive, which indicates that p1, the death rate for streptokinase, is greater than p2, the death rate for t-PA.

  • We estimate that the death rate for streptokinase is between .51 and 1.49 percentage points higher than the death rate for t-PA.

Page 30: Confidence Intervals for µ 1 - µ 2 and p 1 - p 2 1.

Example: 95% confidence interval for p1 – p2

The age at which a woman gives birth to her first child may be an important factor in the risk of later developing breast cancer. An international study conducted by WHO selected women with at least one birth and recorded whether they had breast cancer and whether they had their first child before their 30th birthday or after.

                           Cancer   Sample size   Proportion with cancer
Age at first birth > 30    683      3220          21.2%
Age at first birth <= 30   1498     10,245        14.6%

The parameter to be estimated is p1 – p2.

p1 = cancer rate when age at 1st birth > 30

p2 = cancer rate when age at 1st birth <= 30

$$(\hat{p}_1 - \hat{p}_2) \pm 1.96 \sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}} = (.212 - .146) \pm 1.96 \sqrt{\frac{.212 (.788)}{3220} + \frac{.146 (.854)}{10{,}245}} = .066 \pm 1.96 (.008) \text{ or } .066 \pm .016 = (.05, .082)$$

We estimate that the cancer rate when age at first birth > 30 is between .05 and .082 higher than when age <= 30.
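The table reports proportions rounded to one decimal place, so it can be useful to recompute them from the counts and express the interval in percentage points. A brief sketch with illustrative variable names, assuming $z^* = 1.96$:

```python
from math import sqrt

# Counts from the table: (women with cancer, sample size).
x1, n1 = 683, 3220      # age at first birth > 30
x2, n2 = 1498, 10245    # age at first birth <= 30

p1, p2 = x1 / n1, x2 / n2
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
lower = (p1 - p2) - 1.96 * se
upper = (p1 - p2) + 1.96 * se

print(f"p1 = {p1:.1%}, p2 = {p2:.1%}")
# Multiply by 100 to report the difference in percentage points.
print(f"95% CI: {100 * lower:.1f} to {100 * upper:.1f} percentage points")
```

The unrounded proportions reproduce the 21.2% and 14.6% shown in the table, and the interval matches (.05, .082) after rounding.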