
Chapter 11

Two-Sample Tests of Hypothesis

Goals

1. Conduct a test of hypothesis about the difference between two independent population means when both population standard deviations are known (z)

2. Conduct a test of hypothesis about the difference between two independent population means when the population standard deviations are not known (t)

3. Conduct a test of hypothesis about the difference between two population proportions (z)


Compare The Means From Two Populations

Is there a difference in the mean number of defects produced on the day and the night shifts at Furniture Manufacturing Inc.?
• Comparing two means from two different populations

Is there a difference in the proportion of males from urban areas and males from rural areas who suffer from high blood pressure?
• Comparing two proportions from two different populations

Two Populations

Two Independent Populations
• E-Trade index funds
• Merrill Lynch index funds

Two random samples, two sample means
• Mean rate of return for E-Trade index funds (10.4%)
• Mean rate of return for Merrill Lynch index funds (11%)

Are the means different?
• If they are different, is the difference due to chance (sampling error) or is it really a difference?
• If the population means are the same, the difference between the two sample means should be close to zero ("No difference")

Two Populations

Two Independent Populations
• Plumbers in central Florida
• Electricians in central Florida

Two samples, two sample means
• Mean hourly wage rate for plumbers ($30)
• Mean hourly wage rate for electricians ($29)

Are the means different?
• If they are different, is the difference due to chance (sampling error) or is it really a difference?
• If the population means are the same, the difference between the two sample means should be close to zero ("No difference")

Distribution Of Differences In The Sample Means

Each row is one pair of samples (n ≥ 30 in each sample). Values are mean hourly wages in dollars.

Sample  Plumbers  Electricians  Difference
 1      29.80     28.76          1.04
 2      30.32     29.40          0.92
 3      30.57     29.94          0.63
 4      30.04     28.93          1.11
 5      30.09     29.78          0.31
 6      30.02     28.66          1.36
 7      29.60     29.13          0.47
 8      29.63     29.42          0.21
 9      30.17     29.29          0.88
10      30.81     29.75          1.06
11      30.09     28.05          2.04
12      29.35     29.07          0.28
13      29.42     28.79          0.63
14      29.78     29.54          0.24
15      29.60     29.60          0.00
16      30.60     30.19          0.41
17      30.79     28.65          2.14
18      29.14     29.95         -0.81
19      29.91     28.75          1.16
20      28.74     29.21         -0.47
21      29.50     29.45          0.05
22      29.15     29.19         -0.04
23      28.75     28.40          0.35
24      30.20     30.20          0.00
25      29.78     29.70          0.08
26      28.20     28.27         -0.07
27      30.70     30.50          0.20
28      30.01     30.00          0.01
29      29.50     29.01          0.49
30      28.99     28.82          0.17

Theory Of Two Sample Tests:

• Take several pairs of samples
• Compute the mean of each
• Determine the difference between the sample means
• Study the distribution of the differences in the sample means

If the mean of the distribution of differences is zero:
• This implies that there is no difference between the two populations

If the mean of the distribution of differences is not equal to zero:
• We conclude that the two populations do not have the same population parameter (example: mean or proportion)

Theory Of Two Sample Tests:

If the means of the two populations are equal:
• The mean of the distribution of differences in the sample means should be zero

If the means of the two populations are not equal:
• The mean of the distribution of differences should be either greater than zero or less than zero

Normal Distributions

Remember from chapter 9:
• The distribution of sample means tends to approximate the normal distribution when n ≥ 30

For independent populations, it can be shown mathematically that:
• The distribution of the difference between two normally distributed variables is also normal
• The variance of the distribution of the differences is the sum of the two individual variances, so its standard deviation is the square root of that sum (the standard deviations themselves do not add)
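The rule for the spread of the difference between two independent sample means can be checked with a small simulation. The sketch below is only an illustration: the population means, standard deviations, sample sizes, number of replications, and seed are hypothetical values chosen for the demonstration, not figures from the slides. It uses only the Python standard library.

```python
import random
import statistics

def simulate_differences(mu1, sigma1, n1, mu2, sigma2, n2, reps=2000, seed=11):
    """Return `reps` simulated values of (sample mean 1 - sample mean 2)
    for two independent, normally distributed populations."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        mean1 = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        mean2 = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(mean1 - mean2)
    return diffs

# Hypothetical populations (illustrative values only)
diffs = simulate_differences(mu1=30.0, sigma1=6.0, n1=40,
                             mu2=29.0, sigma2=7.0, n2=35)

# Theory: Var(mean1 - mean2) = sigma1^2/n1 + sigma2^2/n2
theoretical_var = 6.0**2 / 40 + 7.0**2 / 35
simulated_var = statistics.variance(diffs)

print(f"theoretical variance of the difference: {theoretical_var:.3f}")
print(f"simulated variance of the difference:   {simulated_var:.3f}")
print(f"simulated mean of the difference:       {statistics.fmean(diffs):.3f}")
```

The simulated variance should land near the sum of the two variances of the sample means, and the simulated mean near the difference between the two population means.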

Test Of Hypothesis About The Difference Between Two Independent Population Means (population standard deviation known)

Same five steps:
• Step 1: State null and alternate hypotheses
• Step 2: Select a level of significance
• Step 3: Identify the test statistic (z) and draw
• Step 4: Formulate a decision rule
• Step 5: Take a random sample, compute the test statistic, compare it to the critical value, and make the decision to reject or not reject the null hypothesis
  • Fail to reject null
  • Reject null and accept alternate

Assumptions & Formulas

Formulas: Two Independent Population Means (population standard deviation known)

Assumptions:
1. Two populations must be independent (unrelated)
2. Population standard deviation known for both
3. Both distributions are normally distributed

$$ z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} $$

where
• $\bar{X}_1$ = sample mean of sample 1
• $\bar{X}_2$ = sample mean of sample 2
• $\sigma_1$ = standard deviation of population 1
• $\sigma_2$ = standard deviation of population 2
• $n_1$ = sample size of sample 1
• $n_2$ = sample size of sample 2
• $z$ = test statistic for the difference between two sample means
• The denominator is the standard deviation of the distribution of differences

Example 1: Comparing Two Populations

Two cities, Bradford and Kane, are separated only by the Conewango River

The local paper recently reported that the mean household income in Bradford is $38,000 from a sample of 40 households. The population standard deviation (past data) is $6,000.

The same article reported the mean income in Kane is $35,000 from a sample of 35 households. The population standard deviation (past data) is $7,000.

At the .01 significance level, can we conclude the mean income in Bradford is more?

Sample  City      Mean household income  Population standard deviation  Sample size (n)
1       Bradford  $38,000.00             $6,000.00                      40
2       Kane      $35,000.00             $7,000.00                      35

We wish to know whether the distribution of the differences in sample means has a mean of 0

• The samples are from independent populations
• Both population standard deviations are known

Step 1: State null and alternate hypotheses
H0: µB ≤ µK (or µB = µK)
H1: µB > µK

Step 2: Select a level of significance
α = .01 (one-tail test to the right)

Step 3: Identify the test statistic and draw
Both population standard deviations are known, so we can use z as the test statistic
Critical value: α = .01 leaves an area of .5000 - .0100 = .4900 between 0 and z, so z = 2.33


Step 4: Formulate a decision rule
If z is greater than 2.33, we reject H0 and accept H1, otherwise we fail to reject H0

Step 5: Take a random sample, compute the test statistic, compare it to the critical value, and make the decision to reject or not reject the null hypothesis

$$ z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{\$38{,}000 - \$35{,}000}{\sqrt{\dfrac{(\$6{,}000)^2}{40} + \dfrac{(\$7{,}000)^2}{35}}} = 1.98 $$

Because 1.98 is not greater than 2.33, we fail to reject H0

We cannot conclude that the mean income in Bradford is more than the mean income in Kane

The p-value (table method) is: P(z ≥ 1.98) = .5000 - .4761 = .0239
• This tail area is larger than α = .01, the area beyond the critical value z = 2.33
• We conclude from the p-value that H0 should not be rejected

However, there is some evidence that H0 is not true
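The Example 1 calculation can be reproduced in a few lines. This is a minimal sketch using Python's standard library; the function name `two_sample_z` is an illustrative choice, and the critical value and p-value come from `statistics.NormalDist` rather than the printed z table, so they may differ from table values in the last digit.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sigma1, n1, mean2, sigma2, n2):
    """z statistic for the difference between two independent sample means
    when both population standard deviations are known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (mean1 - mean2) / standard_error

alpha = 0.01
z = two_sample_z(38_000, 6_000, 40, 35_000, 7_000, 35)  # Bradford vs. Kane
z_critical = NormalDist().inv_cdf(1 - alpha)             # right-tail critical value
p_value = 1 - NormalDist().cdf(z)                        # P(Z >= z)
reject_h0 = z > z_critical

print(f"z = {z:.2f}, critical value = {z_critical:.2f}, p-value = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Because the unrounded z is used, the computed p-value can differ slightly from the table-method value, which looks up z rounded to two decimals.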


Test Of Hypothesis About The Difference Between Two Independent Population Means (population standard deviation not known)

Same five steps:
• Step 1: State null and alternate hypotheses
• Step 2: Select a level of significance
• Step 3: Identify the test statistic (t) and draw
• Step 4: Formulate a decision rule
• Step 5: Take a random sample, compute the test statistic, compare it to the critical value, and make the decision to reject or not reject the null hypothesis

Assumptions & Two Formulas

Assumptions necessary:
1. Sampled populations must follow the normal distribution
2. Two samples must be from independent (unrelated) populations
3. The variances & standard deviations of the two populations are equal

Two Formulas
1. Pooled variance
2. t test statistic

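Formulas: Two Independent Population Means (population standard deviation not known)

$$ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} $$

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} $$

where
• $\bar{X}_1$ = sample mean of sample 1
• $\bar{X}_2$ = sample mean of sample 2
• $s_1$ = standard deviation of sample 1
• $s_2$ = standard deviation of sample 2
• $n_1$ = sample size of sample 1
• $n_2$ = sample size of sample 2
• $s_p^2$ = pooled estimate of population variance
• $t$ = test statistic for the difference between two sample means

Degrees of freedom = $n_1 + n_2 - 2$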

Pooled Variance

• The two sample variances are pooled to form a single estimate of the unknown population variance
• The pooled variance is a weighted mean of the two sample variances
• The weights are the degrees of freedom of each sample (n1 - 1 and n2 - 1)

Why pool?
• Because if we assume the population variances are equal, the best estimate will come from a weighted mean of the two variances from the two samples
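To see the "weighted mean" idea in action, here is a small sketch. The sample standard deviations and sample sizes are hypothetical values picked so the arithmetic is easy to follow, and `pooled_variance` is an illustrative helper name.

```python
def pooled_variance(s1, n1, s2, n2):
    """Weighted mean of two sample variances; the weights are the
    degrees of freedom (n - 1) of each sample."""
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * s1**2 + df2 * s2**2) / (df1 + df2)

# Hypothetical samples with variances 2^2 = 4 and 4^2 = 16
equal_sizes = pooled_variance(s1=2.0, n1=11, s2=4.0, n2=11)    # weights 10 and 10
unequal_sizes = pooled_variance(s1=2.0, n1=11, s2=4.0, n2=31)  # weights 10 and 30

print(equal_sizes)    # midway between the two sample variances
print(unequal_sizes)  # pulled toward the variance of the larger sample
```

With equal sample sizes the pooled variance is the simple average of the two variances; with unequal sizes the larger sample carries more weight.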



A recent EPA study compared the highway fuel economy of domestic and imported passenger cars

A sample of 15 domestic cars revealed a mean of 33.7 MPG with a standard deviation of 2.4 MPG

A sample of 12 imported cars revealed a mean of 35.7 MPG with a standard deviation of 3.9 MPG

At the .05 significance level, can the EPA conclude that the MPG is higher on the imported cars?

Assume:

1. Samples are independent

2. Population standard deviations are equal

3. Both populations are normally distributed

Step 1: State null and alternate hypotheses
H0: µD ≥ µI
H1: µD < µI

Step 2: Select a level of significance
α = .05 (one-tail test to the left)

Step 3: Identify the test statistic and draw
Population standard deviations are not known, so we use the t distribution
df = 15 + 12 - 2 = 25
One-tail test with α = .05
Critical value = -1.708


Step 4: Formulate a decision rule
If our t < -1.708, we reject H0 and accept H1, otherwise we fail to reject H0

Step 5: Take a random sample, compute the test statistic, compare it to the critical value, and make the decision to reject or not reject the null hypothesis
We must make the calculations for:
1. Pooled variance
2. t-value test statistic


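Example 2: Step 5: Compute

$$ s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(15 - 1)(2.4)^2 + (12 - 1)(3.9)^2}{15 + 12 - 2} = 9.918 $$

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{33.7 - 35.7}{\sqrt{9.918\left(\dfrac{1}{15} + \dfrac{1}{12}\right)}} = -1.640 $$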

Example 2: Step 5: Conclude

Because -1.640 is not less than our critical value of -1.708, we fail to reject H0

There is insufficient sample evidence to claim a higher MPG on the imported cars

The EPA cannot conclude that the MPG is higher on the imported cars
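The Example 2 arithmetic can be checked with a short script. This is a sketch using only the standard library; `pooled_t` is an illustrative function name, and because the standard library has no t distribution, the critical value -1.708 is simply copied from the t table used in the example rather than computed.

```python
from math import sqrt

def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (pooled variance, t statistic, degrees of freedom) for two
    independent samples, assuming equal population variances."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    t = (mean1 - mean2) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, t, df

# Domestic cars (sample 1) vs. imported cars (sample 2)
sp2, t, df = pooled_t(33.7, 2.4, 15, 35.7, 3.9, 12)

t_critical = -1.708          # table value for df = 25, alpha = .05, left tail
reject_h0 = t < t_critical

print(f"pooled variance = {sp2:.3f}, t = {t:.3f}, df = {df}")
print("reject H0" if reject_h0 else "fail to reject H0")
```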

Sample Proportion

Proportion: the fraction, ratio, or percent indicating the part of the sample or the population having a particular trait of interest

Example:
• A recent survey of Highline students indicated that 98 out of 100 surveyed thought that textbooks were too expensive
• The sample proportion is 98/100 = .98 = 98%
• The sample proportion is our best estimate of our population proportion

Number of successes (number possessing the trait) = X
Sample size (number of observations) = n

$$ p = \frac{X}{n} $$

Two Sample Tests of Proportions

We investigate whether two samples came from populations with an equal proportion of successes (πU = πM)

Assumptions:

1. The two populations must be independent of each other

2. Each experiment must meet all the binomial conditions (fixed number of trials, two possible outcomes, constant probability of success, independent trials)

Test Of Hypothesis About The Difference Between Two Population Proportions

Same five steps:
• Step 1: State null and alternate hypotheses
• Step 2: Select a level of significance
• Step 3: Identify the test statistic (z) and draw
• Step 4: Formulate a decision rule
• Step 5: Take a random sample, compute the test statistic, compare it to the critical value, and make the decision to reject or not reject the null hypothesis

Two Formulas

Formulas (Two Sample Tests of Proportions)

$$ p_c = \frac{X_1 + X_2}{n_1 + n_2} $$

$$ z = \frac{p_1 - p_2}{\sqrt{\dfrac{p_c(1 - p_c)}{n_1} + \dfrac{p_c(1 - p_c)}{n_2}}} $$

where
• $X_1$ = number possessing the trait in sample 1
• $X_2$ = number possessing the trait in sample 2
• $n_1$ = number of observations in sample 1
• $n_2$ = number of observations in sample 2
• $p_1$ = proportion possessing the trait in sample 1
• $p_2$ = proportion possessing the trait in sample 2
• $p_c$ = pooled proportion possessing the trait in the combined samples
• $z$ = test statistic (compare against critical value)

Example 3: Two Sample Tests of Proportions

Are unmarried workers more likely to be absent from work than married workers (πm < πu)?

A sample of 250 married workers showed 22 missed more than 5 days last year

A sample of 300 unmarried workers showed 35 missed more than 5 days last year

α = .05. Assume all binomial conditions are met

Step 1: State null and alternate hypotheses
H0: πm ≥ πu
H1: πm < πu

Step 2: Select a level of significance
α = .05

Step 3: Identify the test statistic and draw
Because the binomial assumptions are met, we use the z standard normal distribution
α = .05 leaves an area of .4500 between z and 0, so z = -1.65 (one-tail test to the left)

Step 4: Formulate a decision rule
If the test statistic is less than -1.65, we reject H0 and accept H1, otherwise we fail to reject H0


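Step 5: Take a random sample, compute the test statistic, compare it to the critical value, and make the decision to reject or not reject the null hypothesis

$$ p_c = \frac{X_1 + X_2}{n_1 + n_2} = \frac{22 + 35}{250 + 300} = .1036 $$

$$ z = \frac{\dfrac{22}{250} - \dfrac{35}{300}}{\sqrt{\dfrac{.1036(1 - .1036)}{250} + \dfrac{.1036(1 - .1036)}{300}}} = -1.10 $$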

Step 5: Conclude
Because -1.10 is not less than -1.65, we fail to reject H0

We cannot conclude that a higher proportion of unmarried workers miss more than 5 days in a year than do the married workers

The p-value (table method) is: P(z ≤ -1.10) = .5000 - .3643 = .1357

.1357 > .05, thus: fail to reject H0
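The two-proportion test in Example 3 can be sketched the same way. The function name `two_proportion_z` is an illustrative choice. The critical value comes from `statistics.NormalDist`, which gives about -1.645 rather than the table-rounded -1.65, and the pooled proportion is kept unrounded, so intermediate values may differ from hand calculations in the last digit.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (pooled proportion, z statistic) for the difference
    between two sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pc = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pc * (1 - pc) / n1 + pc * (1 - pc) / n2)
    return pc, (p1 - p2) / standard_error

# Married workers (sample 1) vs. unmarried workers (sample 2)
pc, z = two_proportion_z(22, 250, 35, 300)

alpha = 0.05
z_critical = NormalDist().inv_cdf(alpha)  # left-tail critical value
p_value = NormalDist().cdf(z)             # P(Z <= z) for a left-tail test
reject_h0 = z < z_critical

print(f"pooled proportion = {pc:.4f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

The left-tail p-value is well above .05, which agrees with the decision not to reject H0.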


Goals

4. Understand the difference between dependent and independent samples

5. Conduct a test of hypothesis about the mean difference between paired or dependent observations


Understand The Difference Between Dependent And Independent Samples

Independent samples are samples that are not related in any way

Dependent samples are samples that are paired or related in some fashion

Dependent Samples

The samples are paired or related in some fashion:

1st measurement of item, 2nd measurement of item
• If you wished to buy a car you would look at the same car at two (or more) different dealerships and compare the prices (1st price, 2nd price)

Before & After
• If you wished to measure the effectiveness of a new diet you would weigh the dieters at the start and at the finish of the program (1st weight, 2nd weight)

Distribution Of The Differences In The Paired Values

• Follows the normal distribution
• If the paired values are dependent, be sure to use the dependent formula, because it is a more accurate statistical test than formula 11-3 in our textbook
• Formula 11-7 helps to reduce variation in the sampling distribution:
  • Two kinds of variation (variation between the 1st & 2nd categories for paired values, and variation between values) are reduced to only one (variation between the 1st & 2nd categories for paired values)
  • Reduced variation leads to a smaller standard error, which leads to a larger test statistic and a greater chance of rejecting H0
• DF will be smaller (n - 1 for n pairs, instead of n1 + n2 - 2)

Test Of Hypothesis About The Mean Difference Between Paired Or Dependent Observations

Make the following calculations when the samples are dependent:

$$ t = \frac{\bar{d}}{s_d / \sqrt{n}} $$

$$ s_d = \sqrt{\frac{\sum d^2 - \dfrac{(\sum d)^2}{n}}{n - 1}} $$

where
• $d$ is the difference between the paired values
• $\bar{d}$ is the mean of the differences
• $s_d$ is the standard deviation of the differences
• $n$ is the number of pairs (differences)

EXAMPLE 4

An independent testing agency is comparing the daily rental cost for renting a compact car from Hertz and Avis

At the .05 significance level, can the testing agency conclude that there is a difference in the rental charge?

A random sample of eight cities revealed the following information

City         Hertz ($)  Avis ($)
Atlanta      42         40
Chicago      56         52
Cleveland    45         43
Denver       48         48
Honolulu     37         32
Kansas City  45         48
Miami        41         39
Seattle      46         50

EXAMPLE 4 continued

Step 1: H0: µd = 0, H1: µd ≠ 0

Step 2: α = .05

Step 3: Use t because n < 30, df = n - 1 = 7, critical value = 2.365

Step 4: If t < -2.365 or t > 2.365, H0 is rejected and H1 accepted, otherwise we fail to reject H0

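Example 4 continued

City         Hertz  Avis   d   d²
Atlanta      42     40     2    4
Chicago      56     52     4   16
Cleveland    45     43     2    4
Denver       48     48     0    0
Honolulu     37     32     5   25
Kansas City  45     48    -3    9
Miami        41     39     2    4
Seattle      46     50    -4   16
Totals                     8   78

$$ \bar{d} = \frac{\sum d}{n} = \frac{8.0}{8} = 1.00 $$

$$ s_d = \sqrt{\frac{\sum d^2 - \dfrac{(\sum d)^2}{n}}{n - 1}} = \sqrt{\frac{78 - \dfrac{8^2}{8}}{8 - 1}} = 3.1623 $$

$$ t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{1.00}{3.1623 / \sqrt{8}} = 0.894 $$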


Step 5: Because 0.894 is less than the critical value of 2.365, do not reject the null hypothesis

The sample does not provide evidence of a difference in the mean amount charged by Hertz and Avis
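The paired calculation for Example 4 can be reproduced directly from the eight city prices. This is a minimal standard-library sketch; as before, the critical value 2.365 is copied from the t table (df = 7, two-tailed, α = .05) rather than computed, because the standard library has no t distribution.

```python
from math import sqrt
from statistics import fmean, stdev

# Daily rental cost in dollars, same eight cities in the same order
hertz = [42, 56, 45, 48, 37, 45, 41, 46]
avis = [40, 52, 43, 48, 32, 48, 39, 50]

d = [h - a for h, a in zip(hertz, avis)]  # paired differences
n = len(d)
d_bar = fmean(d)                          # mean of the differences
s_d = stdev(d)                            # sample SD (n - 1 in the denominator)
t = d_bar / (s_d / sqrt(n))

t_critical = 2.365                        # table value for df = 7, two-tailed
reject_h0 = abs(t) > t_critical

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.4f}, t = {t:.3f}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

`statistics.stdev` uses the usual definitional form of the sample standard deviation, which is algebraically the same as the computational formula with Σd² and (Σd)²/n.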

Summarize Chapter 11

1. Conduct a test of hypothesis about the difference between two independent population means when both population standard deviations are known (z)

2. Conduct a test of hypothesis about the difference between two independent population means when the population standard deviations are not known (t)

3. Conduct a test of hypothesis about the difference between two population proportions (z)

4. Understand the difference between dependent and independent samples

5. Conduct a test of hypothesis about the mean difference between paired or dependent observations