
Quantitative Methods

Compare Means Test (T-test)

Independent samples and

Paired samples

Keiko Ono, Ph.D.

(2005)

Two-Samples Compare Means Test

(aka Independent-Samples Compare Means Test)

Compare X by (Group)

• Compare group 1 to group 2

• Ha: X1 bar > X2 bar OR X1 bar < X2 bar

Ho: X1 bar = X2 bar

• Assumption: two subsamples were drawn independently

• Sample size equality not necessary

• Test variable: must be interval

• Group variable: categorical, ordinal, interval



• Compare Democrats and Republicans

• Compare Men and Women

• Compare Experimental group and Control group

• Compare drug and placebo

• Compare Brand A and Brand B

• Compare ___ and ___



• Compare between two groups when there are more than two groups

→ e.g. Liberal (coded “1”), Moderate (coded “3”), Conservative (coded “5”).

To compare liberals and conservatives, specify Group 1 = 1, Group 2 = 5



• Compare between groups based on an interval-level variable

e.g. Feeling thermometer score for feminists

Group 1: respondents 30 years or older

Group 2: respondents under 30

(“cutoff” value would be 30).
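The two ways of defining groups (specified code values, or a cutoff on an interval variable) can be sketched in Python. The records below are made-up illustrative values, not NES data, and the field order (ideology code, age, thermometer score) is an assumption of this sketch.

```python
# Hypothetical records: (ideology code, age, thermometer score).
# Illustrative values only, not taken from the NES.
records = [
    (1, 25, 70), (3, 42, 55), (5, 61, 40),
    (1, 33, 65), (5, 29, 45), (3, 50, 60),
]

# Specified values: Group 1 = code 1 (liberal), Group 2 = code 5 (conservative).
# Moderates (code 3) are left out of the comparison.
liberals = [score for ideology, age, score in records if ideology == 1]
conservatives = [score for ideology, age, score in records if ideology == 5]

# Cutoff: Group 1 = 30 or older, Group 2 = under 30.
CUTOFF = 30
older = [score for ideology, age, score in records if age >= CUTOFF]
younger = [score for ideology, age, score in records if age < CUTOFF]

print(liberals, conservatives)
print(older, younger)
```

With a cutoff, every case falls into one of the two groups; with specified values, cases with any other code are dropped from the test.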

Group Statistics: Thermometer feminists

Respondent age  N     Mean   Std. Deviation  Std. Error Mean
>= 30           1207  53.62  22.060          .635
< 30            213   58.97  20.954          1.436

Independent Samples Test: Thermometer feminists

Levene's Test for Equality of Variances: F = .040, Sig. = .842

t-test for Equality of Means (Lower and Upper are the 95% Confidence Interval of the Difference)

                             t       df       Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower   Upper
Equal variances assumed      -3.283  1418     .001             -5.342           1.627                  -8.535  -2.150
Equal variances not assumed  -3.403  301.023  .001             -5.342           1.570                  -8.432  -2.253

#  PID  Bush FT
1  D    20
2  R    70
3  D    15
4  D    35
5  R    85
6  R    70
7  D    50
8  R    90
9  R    65

n=9




Sample 1 (Democrats)

n1=4

X1 bar= 30

s1=15.8




Sample 2 (Republicans)

n2=5

X2 bar= 76

s2 = 10.8






Equal variance assumed (Tomlinson)
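As a worked illustration of the equal-variance case, the sketch below runs the nine Bush FT scores through the standard pooled-variance t formula. This is the textbook formula and is assumed here; the exact notation of the cited source is not reproduced.

```python
import math
from statistics import mean, stdev

# Bush feeling thermometer scores from the nine respondents above.
democrats = [20, 15, 35, 50]
republicans = [70, 85, 70, 90, 65]

n1, n2 = len(democrats), len(republicans)
x1, x2 = mean(democrats), mean(republicans)      # 30 and 76
s1, s2 = stdev(democrats), stdev(republicans)    # about 15.8 and 10.8

# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)

# Standard error of the difference between the two means.
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t = (x1 - x2) / se_diff
df = n1 + n2 - 2

print(f"t = {t:.2f}, df = {df}")
```

With 7 degrees of freedom the two-tailed critical t at the .05 level is 2.365 (from a t table), so a t of about -5.19 would lead us to reject Ho for this small sample.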


Example 1. Do men and women feel differently about the military?

(NES 2000)

Group Statistics: Post:Thermometer military

Gender     N    Mean   Std. Deviation  Std. Error Mean
1. Male    665  73.25  20.860          .809
2. Female  852  72.19  20.130          .690





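Independent Samples Test: Post:Thermometer military

Levene's Test for Equality of Variances: F = .272, Sig. = .602

t-test for Equality of Means (Lower and Upper are the 95% Confidence Interval of the Difference)

                             t      df        Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower   Upper
Equal variances assumed      1.001  1515      .317             1.060            1.058                  -1.016  3.136
Equal variances not assumed  .997   1402.109  .319             1.060            1.063                  -1.025  3.145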






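Confidence Interval: the 95% confidence interval of the difference runs from -1.016 to 3.136. Because it includes zero, we cannot reject Ho.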





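Critical value of t = 1.96 (two-tailed, 95% confidence level, large sample)

t = Mean difference / S.E. of difference = 1.060 / 1.058 ≈ 1.00

Because |t| ≈ 1.00 is smaller than 1.96, we cannot reject Ho.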
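The "equal variances assumed" row for Example 1 can be approximately reproduced from the Group Statistics alone. This is a minimal sketch using the pooled-variance formula and the large-sample critical value 1.96; SPSS uses the exact t critical value, so the interval differs slightly in the third decimal.

```python
import math

# Summary statistics from the Group Statistics table (NES 2000).
n_male, mean_male, sd_male = 665, 73.25, 20.860
n_female, mean_female, sd_female = 852, 72.19, 20.130

# Equal variances assumed: pool the two sample variances.
df = n_male + n_female - 2
pooled_var = ((n_male - 1) * sd_male**2 + (n_female - 1) * sd_female**2) / df
se_diff = math.sqrt(pooled_var * (1 / n_male + 1 / n_female))

mean_diff = mean_male - mean_female
t = mean_diff / se_diff

# 95% confidence interval, using the large-sample critical value 1.96.
T_CRIT = 1.96
ci_lower = mean_diff - T_CRIT * se_diff
ci_upper = mean_diff + T_CRIT * se_diff

reject_h0 = abs(t) > T_CRIT
print(f"t = {t:.3f}, SE = {se_diff:.3f}, CI = ({ci_lower:.3f}, {ci_upper:.3f})")
print("Reject Ho" if reject_h0 else "Fail to reject Ho")
```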





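P-value

At the 95% confidence level, reject Ho if p (2-tailed) is below .05. For a one-tailed test, divide the reported two-tailed p-value by two before comparing it with .05 (this applies only when the difference is in the hypothesized direction).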
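To see where a two-tailed p-value comes from, the sketch below converts the Example 1 t statistic into a p-value. It uses the standard normal distribution as a stand-in for the t distribution, which is an assumption that is reasonable only because df is large here (1515).

```python
import math

def two_tailed_p(t):
    """Two-tailed p-value using the standard normal as a large-sample
    approximation to the t distribution (an assumption of this sketch)."""
    return math.erfc(abs(t) / math.sqrt(2))

t = 1.001                      # Example 1, equal variances assumed
p_two = two_tailed_p(t)        # comparable to SPSS "Sig. (2-tailed)"
p_one = p_two / 2              # one-tailed, if the difference is in the predicted direction

ALPHA = 0.05
print(f"two-tailed p = {p_two:.3f}, one-tailed p = {p_one:.3f}")
print("Reject Ho" if p_two < ALPHA else "Fail to reject Ho")
```

For small samples the normal approximation understates p, so a t table or statistical software should be used instead.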







Example 2. Do men and women feel differently about environmentalists? (NES 2000)

Group Statistics: D2t. Thermometer environmentalists

Gender     N    Mean   Std. Deviation  Std. Error Mean
1. Male    659  61.93  22.127          .862
2. Female  825  66.20  20.482          .713

Independent Samples Test: D2t. Thermometer environmentalists

Levene's Test for Equality of Variances: F = 1.430, Sig. = .232

t-test for Equality of Means (Lower and Upper are the 95% Confidence Interval of the Difference)

                             t       df        Sig. (2-tailed)  Mean Difference  Std. Error Difference  Lower   Upper
Equal variances assumed      -3.852  1482      .000             -4.272           1.109                  -6.447  -2.096
Equal variances not assumed  -3.819  1358.693  .000             -4.272           1.119                  -6.466  -2.077




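P-value: Sig. (2-tailed) is reported as .000, meaning p < .001. We reject Ho: on average, women rate environmentalists higher than men do (66.20 vs. 61.93).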
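The "equal variances not assumed" row of Example 2 can be approximately reproduced from the Group Statistics. This sketch assumes that row uses the unpooled standard error with the Welch-Satterthwaite degrees of freedom, which is the usual approach; small differences from the SPSS output come from rounding in the reported means.

```python
import math

# Summary statistics from the Group Statistics table (NES 2000).
n1, m1, s1 = 659, 61.93, 22.127   # 1. Male
n2, m2, s2 = 825, 66.20, 20.482   # 2. Female

# Squared standard error of each group mean.
v1 = s1**2 / n1
v2 = s2**2 / n2

# Equal variances not assumed: no pooling of the variances.
se_diff = math.sqrt(v1 + v2)
t = (m1 - m2) / se_diff

# Welch-Satterthwaite approximation for the degrees of freedom.
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

print(f"t = {t:.3f}, df = {df:.1f}, SE = {se_diff:.3f}")
```

The non-integer degrees of freedom in the output table are a sign that this approximation, rather than n1 + n2 - 2, was used.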

Compare Means test – Paired Samples

The Independent Samples Compare Means Test compares two independent groups (men vs. women, Democrat vs. Republican, etc.).

The Paired Samples test compares traits/characteristics of matched observations.

What does this mean?

• Compare before and after new treatment, new drug, new law, etc.

• Compare t1 and t2 (e.g. GDP in 2000 and GDP in 2005 across countries; the unit of analysis is the country)

• Compare average male and female test scores across schools (the unit of analysis: school)

• Compare opinions and evaluations of different objects (the unit of analysis: survey respondent)

Paired-Samples Compare Means Test

Independent Samples Compare Means Test

On average, are men more (or less) favorable toward Clinton than women?

Respondent Sex Clinton Gore

Andy M 70 50

Matt M 60 45

Richard M 30 50

Dave M 25 40

Elaine F 50 60

Julia F 70 65

Rachel F 50 40

Margaret F 35 55

Male Mean

Female Mean

What’s wrong with this picture?

Respondent  Clinton Rating  Gore Rating
Andy        70
Elaine                      60

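Can we conclude that Clinton is more popular based on this set of data (even with increased n)?

No, because of natural variability: Andy and Elaine are different people, so the gap between their ratings may reflect differences between the respondents rather than between the candidates.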

Paired Samples Compare Means Test

On average, is the Clinton rating different from the Gore rating?

Respondent Sex Clinton Gore

Andy M 70 50

Matt M 60 45

Richard M 30 50

Dave M 25 40

Elaine F 50 60

Julia F 70 65

Rachel F 50 40

Margaret F 35 55

To test the hypothesis, we use the difference between the two original variables.

Respondent Sex Clinton Gore C-G

Andy M 70 50 20

Matt M 60 45 15

Richard M 30 50 -20

Dave M 25 40 -15

Elaine F 50 60 -10

Julia F 70 65 5

Rachel F 50 40 10

Margaret F 35 55 -20


Ho: X bar = 0

Ha: X bar < 0 or X bar > 0

(X bar is the mean of the differences, C-G.)
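The paired test on the eight respondents can be sketched as follows: build one difference score per respondent, then test whether the mean difference is zero. This is a minimal illustration using the standard paired t formula.

```python
import math
from statistics import mean, stdev

# Thermometer ratings from the eight respondents in the table above.
clinton = [70, 60, 30, 25, 50, 70, 50, 35]
gore = [50, 45, 50, 40, 60, 65, 40, 55]

# One difference score (C-G) per respondent.
diff = [c - g for c, g in zip(clinton, gore)]

n = len(diff)
mean_diff = mean(diff)
se_diff = stdev(diff) / math.sqrt(n)   # standard error of the mean difference

t = mean_diff / se_diff                # tests Ho: mean difference = 0
df = n - 1

print(diff)
print(f"mean difference = {mean_diff}, t = {t:.3f}, df = {df}")
```

Once the differences are computed, the paired test is simply a one-sample test on the C-G column.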

The Logic of paired samples test

Say someone has developed a new kind of hand cream. She claims the new cream is far superior to the conventional one. How do we test this proposition?

[Figure: OLD cream and NEW cream]

One way is to assign the old cream to one group of experimental subjects and give the new one to another group.

However, there is natural variability due to skin differences among research subjects. In other words, our hands are different from our neighbors’.

So, a better way to test the difference between the two brands is…

Randomly assign the two brands to each subject’s right and left hands! This eliminates variability due to skin differences.


Example

Which gives better mileage, Gasoline A or Gasoline B?



Taxi # Gasoline mileage

1 A 25.6

2 A 32.4

3 A 28.6

4 A 31.2

5 A 29.8

6 A 27.9

7 B 24.9

8 B 26.7

9 B 30.6

10 B 29.8

11 B 30.7

12 B 28.4

One method is to randomly assign gasoline A or B to cars and compare the means.

Gasoline A Mean

Gasoline B Mean



Problem: Natural variability in driving habits and conditions of the car

A better method is to assign gasoline A and B to the same cars and compare the means.

Taxi # Gasoline A Gasoline B

1 25.6 24.9

2 32.4 26.7

3 28.6 31.2

4 31.2 30.7

5 29.8 29.5

6 27.9 28.7

7 25.9 30.6

8 26.5 28.4

9 31.3 25.7

10 29.5 29.4

11 31.2 32.8

12 28.8 25.2


We hypothesize that if there were no difference between Gasoline A and Gasoline B, the difference would, on average, be zero (this is the null hypothesis).

Taxi # Gasoline A Gasoline B Difference (A-B)

1 25.6 24.9 0.7

2 32.4 26.7 5.7

3 28.6 31.2 -2.6

4 31.2 30.7 0.5

5 29.8 29.5 0.3

6 27.9 28.7 -0.8

7 25.9 30.6 -4.7

8 26.5 28.4 -1.9

9 31.3 25.7 5.6

10 29.5 29.4 0.1

11 31.2 32.8 -1.6

12 28.8 25.2 3.6



Ho: X bar = 0

Ha: X bar < 0 or X bar > 0

(X bar is the mean of the differences, A-B.)
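The taxi data can be run through the same paired logic. The sketch below computes the mean A-B difference, its t statistic, and a 95% confidence interval; the critical value 2.201 (two-tailed, .05 level, df = 11) is taken from a standard t table.

```python
import math
from statistics import mean, stdev

# Mileage for the same 12 taxis on each gasoline (table above).
gas_a = [25.6, 32.4, 28.6, 31.2, 29.8, 27.9, 25.9, 26.5, 31.3, 29.5, 31.2, 28.8]
gas_b = [24.9, 26.7, 31.2, 30.7, 29.5, 28.7, 30.6, 28.4, 25.7, 29.4, 32.8, 25.2]

diff = [a - b for a, b in zip(gas_a, gas_b)]   # A-B for each taxi

n = len(diff)
mean_diff = mean(diff)
se_diff = stdev(diff) / math.sqrt(n)
t = mean_diff / se_diff

# Two-tailed critical t at the .05 level for df = 11, from a t table.
T_CRIT = 2.201
ci = (mean_diff - T_CRIT * se_diff, mean_diff + T_CRIT * se_diff)

print(f"mean difference = {mean_diff:.3f}, t = {t:.3f}, df = {n - 1}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Here |t| is well below 2.201 and the interval includes zero, so these twelve taxis give no evidence of a mileage difference between the two gasolines.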


Example

Paired Samples Statistics

Pair 1                        Mean   N     Std. Deviation  Std. Error Mean
Pre:Thermometer Bill Clinton  55.43  1771  29.675          .705
Pre:Thermometer Al Gore       57.57  1771  25.663          .610

Paired Samples Correlations

Pair 1                                                  N     Correlation  Sig.
Pre:Thermometer Bill Clinton & Pre:Thermometer Al Gore  1771  .720         .000

Paired Samples Test

Pair 1: Pre:Thermometer Bill Clinton - Pre:Thermometer Al Gore

Paired Differences (Lower and Upper are the 95% Confidence Interval of the Difference)

Mean    Std. Deviation  Std. Error Mean  Lower   Upper   t       df    Sig. (2-tailed)
-2.141  21.042          .500             -3.121  -1.160  -4.281  1770  .000





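Confidence Interval: -3.121 to -1.160 (does not include zero)

P-value: .000 (p < .001), so we reject Ho

t = mean of difference / (S.D. of difference / √n) = -2.141 / (21.042 / √1771) ≈ -4.28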
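The t formula can be checked against the Paired Samples Test output with a few lines of arithmetic. This sketch uses the large-sample critical value 1.96 for the interval, which is an approximation to the exact t value SPSS uses.

```python
import math

# Values from the Paired Samples Test table (Clinton - Gore, NES 2000).
mean_diff = -2.141
sd_diff = 21.042
n = 1771

se_diff = sd_diff / math.sqrt(n)   # Std. Error Mean, about .500
t = mean_diff / se_diff            # about -4.28
df = n - 1

# 95% confidence interval with the large-sample critical value 1.96.
ci_lower = mean_diff - 1.96 * se_diff
ci_upper = mean_diff + 1.96 * se_diff

print(f"SE = {se_diff:.3f}, t = {t:.3f}, df = {df}")
print(f"95% CI = ({ci_lower:.3f}, {ci_upper:.3f})")
```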

• Compare before and after new treatment, new drug, new law, etc.

• Compare t1 and t2 (e.g. GDP in 2000 and GDP in 2005 across countries; the unit of analysis is the country)

• Compare average male and female test scores across schools (the unit of analysis: school)

• Compare opinions and evaluations of different objects (the unit of analysis: survey respondent)


GDP 2000 GDP 2005 D

Chile

Mexico

Argentina

Ecuador

Peru

Colombia

Venezuela

Cholesterol Level

Before drug After drug D

Patient A

Patient B

Patient C

Patient D

Patient E

Patient F

Patient G

Patient H


Average SAT scores

Female Male D

OU

Texas

Kansas

Missouri

Nebraska

Texas A&M

Rice

Arkansas