Chapter 11Inferences on Two Samples Ch 11.1 Inference ... 227 Fa… · We must verify the...

StatCrunch review:

Confidence intervals and hypothesis tests (test statistic, p-value): Stat menu

Critical values: Stat --> Calculators

Parameter: Stat menu; calculator distribution

𝑝 (proportion, percent): Proportion Stats; Normal

𝜇 (means): T Stats; T

𝜎 (standard deviation & variance): Variance Stats; Chi-square for one variance/SD, F for two variances/SD

Chapter 11 Inferences on Two Samples

Ch 11.1 Inference about Two Population Proportions

Objective A: Distinguish between Independent and Dependent Sampling

Example 1: Determine whether each sampling method is independent or dependent.

(a) Test scores of the same students in English and Math.

Dependent: The English and Math scores are recorded for the same individuals.

(b) The effectiveness of two different diets on two different groups of

individuals.

Independent: The two different diet results are recorded for different individuals.

Objective B: Test Hypotheses or Confidence Intervals Regarding Two Proportions from Independent Samples

Example 1:

The drug Prevnar is a vaccine meant to prevent certain types of bacterial meningitis. It is

typically administered to infants starting around 2 months of age. In randomized, double-blind

clinical trials of Prevnar, infants were randomly divided into two groups. Subjects in group 1

received Prevnar, while subjects in group 2 received a control vaccine (also referred to as a

placebo). After the second dose, 137 of 452 subjects in the experimental group (group 1)

experienced drowsiness as a side effect. After the second dose, 31 of 99 subjects in the control

group (group 2) experienced drowsiness as a side effect. Does the evidence suggest that a

lower proportion of subjects in group 1 experienced drowsiness as a side effect than subjects in

group 2 at the 0.05 level of significance?

Note: Double-blind means that the subjects and doctors were not aware which treatment was

being assigned in order to avoid bias in behavior.

Dependent samples or Independent Samples for Two Proportions? Independent

Group 1: sample proportion (infants who received the vaccine Prevnar)

$\hat{p}_1 = \frac{x_1}{n_1} = \frac{137}{452} = 0.3031$

Group 2: sample proportion (control group)

$\hat{p}_2 = \frac{x_2}{n_2} = \frac{31}{99} = 0.3131$

(a) Setup

Ho: The proportion of infants who experienced drowsiness is the same for both groups. (There

is no difference between the two groups.)

H1: The proportion of infants who experienced drowsiness for the Prevnar group is less than

the control group.

Or

Ho: $p_1 = p_2$ ---> $p_1 - p_2 = 0$

H1: $p_1 < p_2$ ---> $p_1 - p_2 < 0$

(b) P value using Statcrunch

We must verify the requirements to perform the hypothesis test between

two population proportions.

Stat --> Proportion Stats --> Two Sample --> With Summary -->

Input the following, --> Compute

StatCrunch Results:

P-value = 0.4221, which is not unusual: it is not less than 0.05 (5%). Do not reject the null hypothesis, Ho.
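The reported p-value can be checked by hand with the pooled two-proportion z-test. The following is a minimal Python sketch, assuming the usual large-sample normal approximation; the function name `two_prop_z_test` is ours for illustration and is not a StatCrunch feature.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Left-tailed pooled z-test of H0: p1 = p2 against H1: p1 < p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both groups share one proportion, so the counts are pooled.
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = NormalDist().cdf(z)  # area to the left of z
    return z, p_value

# Group 1 (Prevnar): 137 of 452; group 2 (control): 31 of 99
z, p_value = two_prop_z_test(137, 452, 31, 99)
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

The test statistic is only slightly negative, which is why the left-tail area is close to one half and far above 0.05.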

(c) Conclusion

There is not enough evidence to support the claim that a lower proportion of infants who took Prevnar experienced drowsiness as a side effect compared to infants who took the placebo. Thus, there is no significant difference between the groups: taking the vaccine Prevnar made no significant difference in the proportion who experienced drowsiness.

(d) Find the 95% confidence interval. Does the CI support the same conclusion as the hypothesis test?

Options – edit – select confidence interval – compute

95% confidence interval for $p_1 - p_2$: (-0.111, 0.091). We are 95% confident that the true difference in proportions between the two groups is between -0.111 and 0.091.

Since a difference of zero is in the CI, this supports the hypothesis test's conclusion that there is no significant difference between the two groups.

Example 2: The body mass index (BMI) of an individual is one measure that is used to

judge whether an individual is at a healthy weight. A BMI between 20 and

25 indicates that one is at a normal weight. In a survey of 750 men and 750

women, the Gallup organization found that 203 men and 270 women were

normal weight. Construct a 90% confidence interval to gauge whether there

is a difference in the proportion of men and women who are normal weight.

Interpret the interval.

We must verify the requirements for constructing a confidence interval for the

difference between two population proportions.

a. Is this a dependent or an independent sampling of two proportions?

Independent

Group 1 Men: $\hat{p}_m = \frac{x_m}{n_m} = \frac{203}{750} = 0.271$

Group 2 Women: $\hat{p}_w = \frac{x_w}{n_w} = \frac{270}{750} = 0.360$

b. Find the C.I. for $p_1 - p_2$ at a 90% confidence level.

Stat --> Proportion Stats --> Two Sample --> With Summary --> Input the following --> Compute

90% confidence interval for $p_m - p_w$ is (-0.129, -0.050).

We are 90% confident that the true difference in the proportion of men and women who are normal weight is between -12.9 and -5.0 percentage points.

Since a difference of zero is not in the CI, we can conclude that there is a significant difference between the two groups.

What do the negative values in the CI mean? Every difference in the CI is negative, which means that the proportion of women who were at a normal weight was higher than the proportion of men (which we already knew from calculating the sample proportions).
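The interval for the difference in proportions can also be reproduced without StatCrunch. Below is a minimal Python sketch, assuming the standard normal-approximation interval with an unpooled standard error; the helper name `two_prop_ci` is hypothetical. It is applied to the men/women data and, as a second check, to the Prevnar data from Example 1.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_ci(x1, n1, x2, n2, level):
    """Confidence interval for p1 - p2 (normal approximation, unpooled SE)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # A confidence interval does not assume p1 = p2, so nothing is pooled here.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Critical value that leaves (1 - level) / 2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p1_hat - p2_hat
    return diff - z_crit * se, diff + z_crit * se

# Example 2: men (203 of 750) minus women (270 of 750), 90% confidence
men_low, men_high = two_prop_ci(203, 750, 270, 750, 0.90)
print(f"90% CI for p_m - p_w: ({men_low:.3f}, {men_high:.3f})")

# Example 1: Prevnar (137 of 452) minus control (31 of 99), 95% confidence
ex1_low, ex1_high = two_prop_ci(137, 452, 31, 99, 0.95)
print(f"95% CI for p_1 - p_2: ({ex1_low:.3f}, {ex1_high:.3f})")
```

Swapping the order of the two groups only flips the signs of the endpoints, so the conclusion about whether zero is inside the interval does not change.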

c) Conduct a hypothesis test. How does this support the conclusion from the CI?

Set up:

Ho: There is no difference in proportions between genders who are at a normal weight.

H1: There is a difference in proportions between genders who are at a normal weight.

Or

Ho: $p_1 = p_2$ ---> $p_1 - p_2 = 0$

H1: $p_1 \neq p_2$ ---> $p_1 - p_2 \neq 0$

P - value

Options-edit-select hypothesis test ≠ - compute

p-value = 0.0002, which is unusual since it is less than 0.10 (the significance level that matches 90% confidence).

If the p-value is low, the null must go. Reject the null hypothesis, Ho.

Conclusion: There is enough evidence to support the claim that there is a difference in

proportions between genders who are at a normal weight.

This is the same conclusion obtained from the confidence interval.
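For a two-sided alternative the p-value is the area in both tails, so the one-tail area is doubled. A short Python sketch of that step for the men/women data, again assuming the pooled large-sample z-test (the variable names are ours):

```python
from math import sqrt
from statistics import NormalDist

x_m, n_m = 203, 750   # men at normal weight, sample size
x_w, n_w = 270, 750   # women at normal weight, sample size

# Pooled proportion, because H0 says the two proportions are equal.
p_pool = (x_m + x_w) / (n_m + n_w)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_m + 1 / n_w))
z = (x_m / n_m - x_w / n_w) / se

# Two-sided test: double the area beyond |z| in one tail.
p_value = 2 * NormalDist().cdf(-abs(z))
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

A test statistic more than three standard errors from zero leaves very little area in the tails, which is consistent with rejecting Ho at the 0.10 level.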

Hw 11.1#10: a) stat-table-frequency-select columns: gender response-response: group by

gender-compute Write down proportions and ‘yes’ out of ‘total’

Then do a regular two sample proportion test with summary of data

Interpreting the p-value: If the population proportions are equal, one would expect a difference in sample proportions at least as large (in absolute value) as the one observed in about ___ out of 100 repetitions of this experiment.

Ch 11.2 Inference about Two Means: Dependent Samples

Objective A: Test Hypotheses or Confidence Intervals about the Population Mean Difference of Matched-Pairs Data

Example 1: In an experiment conducted online at the University of Mississippi, study

participants are asked to react to a stimulus. In one experiment, the

participant must press a key on seeing a blue screen. Reaction time (in

seconds) to press the key is measured. The same person is then asked

to press a key on seeing a red screen, again with reaction time measured.

The results for six randomly sampled study participants are as follows:

(a) Why are these matched-pairs data?

(Dependent) The same participants are used for comparison.

(b) Is the reaction time to the blue stimulus different from the reaction

time to the red stimulus at the 0.01 level of significance?

Note: A normal probability plot and boxplot of the data indicate that

the differences are approximately normally distributed with no outliers.

Setup:

Ho: There is no difference in reaction time to the blue and red stimulus.

H1: There is a difference in reaction time to the blue and red stimulus.

or

Ho: $\mu_1 = \mu_2$ ---> $\mu_d = 0$

H1: $\mu_1 \neq \mu_2$ ---> $\mu_d \neq 0$

P – value:

Input reaction time for seeing a blue screen in column 1 and reaction time for seeing a red

screen in column 2

stat--> T Stats --> Paired --> sample 1 var1, sample 2 var2. Select Save: differences -->Compute

(see below)

Hw: 11.2 #2 asks for differences so if you select differences you won’t have to do by hand. The

mean difference will show up in the hypothesis test under ‘mean’. The standard deviation you

can get from summary stats for the difference column.
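What "Save: differences" does can be sketched in a few lines of Python: subtract within each pair, then treat the differences as a single sample. The reaction times below are HYPOTHETICAL values made up for illustration; they are not the six participants' data from Example 1, so the output will not match the StatCrunch results.

```python
from math import sqrt
from statistics import mean, stdev

# HYPOTHETICAL reaction times in seconds (illustration only, not the example's data).
blue = [0.60, 0.55, 0.70, 0.65, 0.50, 0.62]
red = [0.58, 0.60, 0.66, 0.70, 0.48, 0.55]

# Matched pairs: one difference per participant (blue minus red).
diffs = [b - r for b, r in zip(blue, red)]

n = len(diffs)
d_bar = mean(diffs)     # sample mean of the differences
s_d = stdev(diffs)      # sample standard deviation (divides by n - 1)

# One-sample t statistic for H0: mu_d = 0, with n - 1 degrees of freedom.
t_stat = d_bar / (s_d / sqrt(n))
df = n - 1
print(f"mean diff = {d_bar:.4f}, sd = {s_d:.4f}, t = {t_stat:.3f}, df = {df}")
```

The p-value would then come from a t distribution with `df` degrees of freedom, which is the step StatCrunch's paired test performs; the Python standard library has no t distribution, so the sketch stops at the test statistic.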

StatCrunch Results:

P – value is 0.2466 which is not unusual. 0.2466 is not less than 0.01 (1% significance level).

Cannot reject the null hypothesis, Ho.

Conclusion:

There is not enough evidence to support the claim that the reaction time is different between

the blue and red stimulus.

(c) Construct a 99% confidence interval about the population mean difference. Interpret your

results.

Option – edit Stat –select confidence interval, input 0.99 --> Compute

99% confidence interval for $\mu_d$ is (-0.193, 0.379).

We are 99% confident that the true mean difference in reaction time between the blue stimulus and the red stimulus is between -0.193 seconds and 0.379 seconds.

d) How does the confidence interval support the conclusion from the hypothesis test?

Since the confidence interval includes the difference of zero, then there is no significant

difference in reaction time to the two stimuli. This is the same conclusion as before for the

hypothesis test.

Homework:

11.2 #2 raw data given part a) stat-t stats-paired-select ‘save differences’. This way statcrunch

computes the differences

11.2 #3 summary of data given b) Since the differences have already been calculated then we

only have one sample of differences. Stat-t stats-one sample- with summary

If time do 11.2 #4 together

Ch 11.3 Inference about Two Means: Independent Samples

Objective A: Test Hypotheses or Confidence Intervals regarding the Difference of Two Independent Means

If we can assume $\sigma_1 = \sigma_2$, use the t distribution with the POOLED standard error.

In general, we use the t distribution without the POOLED standard error unless instructed otherwise.
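For reference, the two versions of the test statistic differ only in the standard error and the degrees of freedom. These are the standard textbook formulas, written here in our own notation ($\bar{x}_i$, $s_i$, $n_i$ are the sample mean, standard deviation, and size of sample $i$); that software uses the Welch–Satterthwaite degrees of freedom for the unpooled case is our assumption and should be confirmed against the StatCrunch output.

```latex
% Without pooling (Welch's t-test), testing H_0: \mu_1 = \mu_2
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
df \approx \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
                {\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}

% With pooling (only if \sigma_1 = \sigma_2 can be assumed)
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},
\qquad
df = n_1 + n_2 - 2
```

When the two sample standard deviations and sample sizes are similar, the two approaches give nearly the same answer; they diverge when one sample is both smaller and more variable.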

Project: Is there a difference in the means between males and females?

Ho: There is no difference in the mean……. between genders.

H1: There is a difference in the mean ………. between genders.

or

Ho: $\mu_f = \mu_m$ ---> $\mu_f - \mu_m = 0$

H1: $\mu_f \neq \mu_m$ ---> $\mu_f - \mu_m \neq 0$

Example 1:

(a) The normal probability plots indicated the samples came from the populations that

are normally distributed. The boxplots indicated the samples had no outliers. Assuming

the samples were randomly selected and each sample size is no more than 5% of the

population size, Welch's t-test can be used.

(b) Hypothesis test: Independent Samples for Two Means

Setup:

Ho: There is no difference between genders in mean reaction times.

H1: There is a difference between genders in mean reaction times.

Or

Ho: $\mu_f = \mu_m$ ---> $\mu_f - \mu_m = 0$

H1: $\mu_f \neq \mu_m$ ---> $\mu_f - \mu_m \neq 0$

P – value

Enter raw data into column 1 (female) and column 2 (male).

Stat --> T Stats --> Two Sample --> With Data --> sample 1 (female, var1), sample 2 (male, var2), hypothesis test --> Compute

P – value = 0.5568 is not unusual. 0.5568 is not less than the significance level of 0.05 (or 5%). Cannot

reject the null hypothesis Ho.

Conclusion

There is not enough evidence to support the claim that there is a significant difference in the

reaction times between genders.

Conduct the CI: options – edit – select confidence interval 0.95 – compute

(-0.61, 0.110)

We are 95% confident that the true difference in mean reaction times between the genders is between -0.61 and 0.110 seconds.

How does this support the conclusion from the hypothesis test?

Since a difference of zero is in the confidence interval, there is no significant difference in reaction times between the genders. This supports the conclusion from the hypothesis test.

(c) Graph --> Boxplot --> Select Female Students, press the control key then select Male Students-->

Input the following --> Compute

The medians are similar for both genders. However, the female data had more variability whereas the male data were more consistent. No outliers were present for either gender. The overall spread is similar as well.

Example 2:

a) What is the typical time it takes to chill glass and aluminum?

Glass: 133.8 ± 9.9 minutes

Glass takes between 123.9 and 143.7 minutes to chill a bottle of beer.

Aluminum: 92.4 ± 7.3 minutes

Aluminum takes between 85.1 and 99.7 minutes to chill a bottle of beer.

b) Is this a dependent or independent sampling? Independent

c) Construct and interpret a 90% confidence interval for $\mu_G - \mu_A$.

Stat --> T Stats --> Two Sample --> With Summary --> Input the following

and select confidence interval 0.90--> Compute

CI: (38.1, 44.7)

One can be 90% confident that the true difference in mean cooling time between glass and aluminum is between 38.1 and 44.7 minutes.

Since a difference of zero is not in the confidence interval, we can conclude that there is a significant difference in mean cooling times between glass and aluminum. It appears that glass takes longer to chill.

d) Perform a hypothesis test:

set up:

Ho: There is no difference between cooling times for glass and aluminum.

H1: There is a difference between cooling times for glass and aluminum.

Or

Ho: $\mu_G = \mu_A$ ---> $\mu_G - \mu_A = 0$

H1: $\mu_G \neq \mu_A$ ---> $\mu_G - \mu_A \neq 0$

p-value:

options – edit – hypothesis test – compute

p-value < 0.0001, which is unusual. It is less than the 0.10 significance level (less than 0.01% compared with 10%).

If the p is low, the null must go. Reject the null hypothesis.

Conclusion:

There is evidence to support the claim that there is a significant difference between cooling

times for glass and aluminum.

This supports the conclusion from the CI.

Homework 11.3 #8: Hypothesis test conclusions cannot imply causation; that is, we cannot claim that X caused Y to happen.

Ch 11.4 Inference about Two Population Standard Deviations

Objective A: Fisher’s F distribution

Objective B: Test Hypotheses regarding Two Population Standard Deviations

Example 1:

Assume that the populations are normally distributed.

Concepts for Hypothesis Tests with Two Variances

$F = \frac{s_1^2}{s_2^2}$, where $s_1^2$ is the larger of the two sample variances. Sample 1 is the larger, sample 2 is the smaller.

If the two populations really do have equal variances, then the ratio $\frac{s_1^2}{s_2^2}$ should be close to 1 because $s_1^2$ and $s_2^2$ tend to be close in value. If the two populations do not have equal variances, then the ratio $\frac{s_1^2}{s_2^2}$ will be bigger than 1, since $s_1^2$ is chosen to be the larger sample variance. Consequently, a large value of F will be evidence against $\sigma_1^2 = \sigma_2^2$.

First, we need to identify which of the two given standard deviations will be used for $s_1^2$ ($s_1^2$ is the larger of the two sample variances). Take the standard deviations and square them to get the variances.

In this problem, $s_1^2 = 9.2^2 = 84.64$ and $s_2^2 = 8.6^2 = 73.96$. (9.2 is the SD; 84.64 is the variance.)
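The squaring step and the F ratio are simple arithmetic. A minimal Python sketch, where the helper `f_statistic` is our own name and puts the larger variance on top regardless of the order in which the standard deviations are supplied:

```python
def f_statistic(sd_a, sd_b):
    """Return (F, larger variance, smaller variance) from two sample SDs."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    # By convention s1^2 is the larger sample variance, so F >= 1.
    larger, smaller = max(var_a, var_b), min(var_a, var_b)
    return larger / smaller, larger, smaller

# Standard deviations from this problem: 9.2 and 8.6
f_stat, s1_sq, s2_sq = f_statistic(8.6, 9.2)
print(f"s1^2 = {s1_sq:.2f}, s2^2 = {s2_sq:.2f}, F = {f_stat:.4f}")
```

The p-value additionally depends on the two sample sizes (through the degrees of freedom $n_1 - 1$ and $n_2 - 1$), which is why those must also be entered in StatCrunch.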

Stat --> Variance Stats --> Two Sample --> With Summary ---> Input the following

---> Compute

StatCrunch Results:

Since the P-value (0.7659) is not less than $\alpha = 0.1$, do not reject Ho: $\sigma_1^2 = \sigma_2^2$.

There is not sufficient evidence to conclude that $\sigma_1^2 \neq \sigma_2^2$ <--> $\sigma_1 \neq \sigma_2$

Or

There is not sufficient evidence to conclude that the standard deviations are not the same.

Example 2:

Identify $s_1^2$: $s_1^2$ is the larger of the two sample variances. Take the standard deviations and find the variances (by squaring the SDs).

$s_1^2 = 53^2 = 2809$ ---> $n_1 = 260$

$s_2^2 = 34^2 = 1156$ ---> $n_2 = 269$

Stat --> Variance Stats --> Two Sample --> With Summary ---> Input the following

---> Compute

Since the P-value (<0.0001) is less than $\alpha = 0.05$, reject Ho: $\sigma_1^2 = \sigma_2^2$.

There is sufficient evidence to conclude that the standard deviation of walking speed is different between the two groups.
