Psych 5500/6500

Page 1: Psych 5500/6500

Psych 5500/6500

t Test for Dependent Groups

(aka ‘Paired Samples’ Design)

Fall, 2008

Page 2: Psych 5500/6500

Experimental Designs

The t test for dependent groups is used in the following two experimental designs:

1. Repeated measures (a.k.a. within-subjects) design.

2. Matched pairs design.

Page 3: Psych 5500/6500

Repeated Measures (Within-subjects) Design

Measure each participant twice, once in ‘Condition A’ and once in ‘Condition B’. The scores in the two groups are no longer independent as they come from the same participants.

I like to use ‘Condition A’ and ‘Condition B’ rather than ‘Group 1’ and ‘Group 2’ as the latter terms seem to imply (at least to me) that there are different subjects in each group.

Page 4: Psych 5500/6500

Design

Subject   Condition A   Condition B
1         S1            S1
2         S2            S2
3         S3            S3
etc.      etc.          etc.

Each subject’s two scores are dependent, but are independent of the other subjects’ scores.

Page 5: Psych 5500/6500

Matched Pairs

The two paired scores don’t have to come from the same person; there are other ways the scores within a pair can be associated (dependent). For example, we might measure marital satisfaction within married couples (a static group design).

Page 6: Psych 5500/6500

Design

Couple    Wife    Husband
1         S1.W    S1.H
2         S2.W    S2.H
3         S3.W    S3.H
etc.      etc.    etc.

The scores within each couple are dependent, but each couple’s scores are independent of the other couples’ scores.

Page 7: Psych 5500/6500

Example: Repeated Measures or Within-Subjects Design

You are interested in whether attending a mixed-race day camp affects children’s racial prejudice. Six children attending the day camp were given a test to measure racial prejudice (higher scores = more prejudice) when they first arrived at camp. The same six children were given the same test seven days later when they left the camp.

Page 8: Psych 5500/6500

Data

Subject   Before   After
1         85       80
2         30       26
3         52       50
4         10       11
5         98       94
6         39       36
Mean      52.33    49.5

Page 9: Psych 5500/6500

Getting Rid of the Nonindependence

Because we have two scores per person, the scores are not all independent of each other, which means we can’t run a t test on them as they stand. The solution is simple: we turn those two scores per person into just one score per person, a score that reflects the difference between each person’s score in Condition A and their score in Condition B.

Page 10: Psych 5500/6500

Difference Scores

Subject   Before  -  After  =  Difference
1           85    -   80    =     5
2           30    -   26    =     4
3           52    -   50    =     2
4           10    -   11    =    -1
5           98    -   94    =     4
6           39    -   36    =     3

For each subject we now have just one score, their ‘difference’ score. 1) The difference scores measure how much the subject’s score differed between the two conditions. 2) The difference scores are independent of each other, so we can now perform the t test for a single group of scores on the difference scores.
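If you want to reproduce the difference scores yourself, here is a minimal Python sketch. It is not part of the original slides, and the variable names (`before`, `after`, `d_scores`) are just illustrative choices.

```python
# Prejudice scores for the six children in the day camp example.
before = [85, 30, 52, 10, 98, 39]
after = [80, 26, 50, 11, 94, 36]

# One difference score per child: Before minus After.
d_scores = [b - a for b, a in zip(before, after)]

print(d_scores)  # [5, 4, 2, -1, 4, 3]
```

Subtracting in the order Before minus After means a positive score indicates lower prejudice after the camp.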

Page 11: Psych 5500/6500

Difference Scores

Subject   Before  -  After  =  D
1           85    -   80    =   5
2           30    -   26    =   4
3           52    -   50    =   2
4           10    -   11    =  -1
5           98    -   94    =   4
6           39    -   36    =   3

To simplify, let’s call the difference scores ‘d scores’. The mean of the d scores is a measure of the average difference between the scores in condition A and the scores in condition B.

Page 12: Psych 5500/6500

Difference Scores

Subject   Before  -  After  =  D
1           85    -   80    =   5
2           30    -   26    =   4
3           52    -   50    =   2
4           10    -   11    =  -1
5           98    -   94    =   4
6           39    -   36    =   3

In our sample, the prejudice scores were on average 2.83 higher before the day camp than they were after the day camp.

$$\bar{D} = \frac{\sum D}{N} = \frac{17}{6} = 2.83$$

Page 13: Psych 5500/6500

Difference Scores

Subject   D
1          5
2          4
3          2
4         -1
5          4
6          3

$$\bar{D} = \frac{\sum D}{N} = \frac{17}{6} = 2.83$$

Mean D is a statistic; it reflects what we found in those six kids. Our hypotheses will concern the larger population these six kids represent (2-tailed):

H0: μD = 0   Ha: μD ≠ 0

Page 14: Psych 5500/6500

Same Thing

What we are about to do is exactly the same thing as performing a t test for a single group of scores; we have simply relabeled our variable as ‘D’ (to stand for ‘difference scores’) rather than ‘Y’.

This is not really a third t test; it is just another context in which we can use the t test for a single group of scores.

Page 15: Psych 5500/6500

Sampling Distribution

All the results we could get for mean D assuming H0 were true.

Page 16: Psych 5500/6500

df and tc

$$df = N - 1 = 6 - 1 = 5$$

so tc = ±2.571

Page 17: Psych 5500/6500

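est. standard error

$$SS_D = \sum D^2 - \frac{(\sum D)^2}{N} = 71 - \frac{17^2}{6} = 22.83$$

$$\text{est. } \sigma_D^2 = \frac{SS_D}{N - 1} = \frac{22.83}{5} = 4.57$$

$$\text{est. } \sigma_D = \sqrt{\text{est. } \sigma_D^2} = \sqrt{4.57} = 2.14$$

$$\text{est. } \sigma_{\bar{D}} = \frac{\text{est. } \sigma_D}{\sqrt{N}} = \frac{2.14}{\sqrt{6}} = 0.87$$

(Compare to the t test for a single group mean.)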
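The chain of calculations for the estimated standard error can be checked with a short Python sketch. This is only an illustration under the definitions used in these slides (SS, then SS/(N - 1), then the square root, then division by the square root of N); the variable names are made up for the example.

```python
import math

d_scores = [5, 4, 2, -1, 4, 3]
n = len(d_scores)

sum_d = sum(d_scores)                          # 17
sum_d_squared = sum(d * d for d in d_scores)   # 71

ss_d = sum_d_squared - sum_d ** 2 / n          # sum of squares of D
est_var_d = ss_d / (n - 1)                     # est. population variance of D
est_sd_d = math.sqrt(est_var_d)                # est. population standard deviation of D
est_se_mean_d = est_sd_d / math.sqrt(n)        # est. standard error of mean D

print(round(ss_d, 2), round(est_var_d, 2), round(est_sd_d, 2), round(est_se_mean_d, 2))
# prints: 22.83 4.57 2.14 0.87
```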

Page 18: Psych 5500/6500

Page 19: Psych 5500/6500

tobt

You should be able to guess what this formula is.

Page 20: Psych 5500/6500

t(5)=3.25, p=.023
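One reasonable guess at the formula for the obtained t, by analogy with the single-group t test, is to divide the distance between mean D and the value of μD proposed by H0 by the estimated standard error of mean D. The Python sketch below works under that assumption; the names are illustrative, and the critical value is simply copied from the df and tc slide rather than computed.

```python
import math
import statistics

d_scores = [5, 4, 2, -1, 4, 3]
n = len(d_scores)

mean_d = statistics.mean(d_scores)
# statistics.stdev divides by N - 1, so it is the est. population standard deviation.
est_se_mean_d = statistics.stdev(d_scores) / math.sqrt(n)

mu_d_h0 = 0  # value of mu_D proposed by H0
t_obt = (mean_d - mu_d_h0) / est_se_mean_d

t_crit = 2.571  # two-tailed critical value for df = 5, taken from the slides
print(round(t_obt, 2), abs(t_obt) > t_crit)  # 3.25 True
```

Because the obtained t is beyond the critical value, H0 is rejected in the two-tailed test.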

Page 21: Psych 5500/6500

Difference Scores

Subject   Before  -  After  =  Difference
1           85    -   80    =     5
2           30    -   26    =     4
3           52    -   50    =     2
4           10    -   11    =    -1
5           98    -   94    =     4
6           39    -   36    =     3

If we analyze the difference scores to see whether the mean of their population differs from zero, we get t(5)=3.248, p=.023. We can conclude that there is a statistically significant difference between the before and after scores (i.e. μD ≠ 0). If we have no serious confounding variables, then we conclude that the day camp affected prejudice scores.

Page 22: Psych 5500/6500

One-Tail Tests

If we are testing a theory which predicts that prejudice should be less after the day camp then that would imply that the mean of the difference scores should be greater than zero (write Ha to express the prediction).

H0: μD ≤ 0   Ha: μD > 0

This is indeed the direction the results fell, so the p value would be p=.023/2=.012. So the results are t(5)=3.248, p=.012

Page 23: Psych 5500/6500

One-Tail Tests

If we are testing a theory which predicts that prejudice should be greater after the day camp then that would imply that the mean of the difference scores should be less than zero (write Ha to express the prediction).

H0: μD ≥ 0   Ha: μD < 0

This is opposite from the direction the results fell, so the p value would be p=1-.023/2=.988. So the results are t(5)=3.248, p=.988
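The rule used on the two one-tail slides (halve the two-tailed p when the result falls in the predicted direction, otherwise take one minus that half) can be written as a small helper. This is a sketch only: `one_tailed_p` is a hypothetical function name, and it assumes the obtained t is not exactly zero.

```python
def one_tailed_p(two_tailed_p, t_obt, predicted_direction):
    """Convert a two-tailed p value to a one-tailed p value.

    predicted_direction is +1 if Ha predicts mu_D > 0
    and -1 if Ha predicts mu_D < 0.
    """
    half = two_tailed_p / 2
    same_direction = (t_obt > 0) == (predicted_direction > 0)
    return half if same_direction else 1 - half


# Theory predicts less prejudice after camp (mu_D > 0).
print(round(one_tailed_p(0.023, 3.248, +1), 4))
# Theory predicts more prejudice after camp (mu_D < 0).
print(round(one_tailed_p(0.023, 3.248, -1), 4))
```

The values printed carry one more decimal place than the slides, which report p to three decimals.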

Page 24: Psych 5500/6500

Matched Pairs Design

This type of design is analyzed exactly the same way as a repeated measures design: you analyze the difference scores.

Couple   Wife  -  Husband  =  Difference
1              -           =
2              -           =
3              -           =
4              -           =
5              -           =
6              -           =

Page 25: Psych 5500/6500

Lowering Variance

Since the beginning of the semester I’ve been making the point that lowering the variance of the data is a good thing: it leads to more representative data and thus makes it easier to draw conclusions about the population from which the sample was drawn. Lowering variance increases power. I have been promising we would look at a way of accomplishing that other than simply sampling from a more homogeneous population. Here it is...

Page 26: Psych 5500/6500

Look again at our original data. If these scores came from an independent groups design (e.g. a random half of the kids measured before the day camp and the other half measured after the day camp) we would be in trouble. Look at how much the scores vary within each group; the kids really differed in prejudice levels. This variance would kill the power of our experiment.

Subject   Before   After
1         85       80
2         30       26
3         52       50
4         10       11
5         98       94
6         39       36
Mean      52.33    49.5

Page 27: Psych 5500/6500

But with a repeated measures design we are just looking at the effect of the independent variable (attending the camp) on each kid (how much they differed before and after rather than at how prejudiced they are). The independent variable had fairly similar effects on the kids (from –1 to 5), and thus the difference scores don’t have nearly as much variability as the prejudice levels of the various kids.

Subject   Before  -  After  =  Difference
1           85    -   80    =     5
2           30    -   26    =     4
3           52    -   50    =     2
4           10    -   11    =    -1
5           98    -   94    =     4
6           39    -   36    =     3
Mean        52.33     49.5        2.83 (mean D)
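To put numbers on how much less the difference scores vary than the raw prejudice scores, here is a small Python sketch. It is an illustration added to the slides, using the N - 1 variance estimate from the standard library; the labels and variable names are arbitrary.

```python
import statistics

before = [85, 30, 52, 10, 98, 39]
after = [80, 26, 50, 11, 94, 36]
d_scores = [b - a for b, a in zip(before, after)]

# statistics.variance divides the sum of squares by N - 1.
for label, scores in [("Before", before), ("After", after), ("Difference", d_scores)]:
    print(label, round(statistics.variance(scores), 2))
# prints:
# Before 1124.27
# After 1025.5
# Difference 4.57
```

The estimated variance of the difference scores is a tiny fraction of the variance within either condition, which is where the extra power comes from.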

Page 28: Psych 5500/6500

Subject   Before  -  After  =  Difference
1           85    -   80    =     5
2           30    -   26    =     4
3           52    -   50    =     2
4           10    -   11    =    -1
5           98    -   94    =     4
6           39    -   36    =     3
Mean        52.33     49.5        2.83 (mean D)

Analyzed as t for independent groups:

$$t = \frac{(52.33 - 49.5) - 0}{18.93} = \frac{2.83}{18.93} = 0.15$$

t(10) = 0.15, p = .884

Analyzed as t for dependent groups:

$$t = \frac{2.83}{0.87} = 3.25$$

t(5) = 3.25, p = .023
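Both analyses can be reproduced from the raw scores with the Python sketch below. It assumes the usual pooled-variance formula for the independent groups t test, which is consistent with the standard error of 18.93 shown on this slide; the variable names are illustrative.

```python
import math
import statistics

before = [85, 30, 52, 10, 98, 39]
after = [80, 26, 50, 11, 94, 36]
n = len(before)

# Independent groups: pooled variance estimate, df = n + n - 2 = 10.
pooled_var = ((n - 1) * statistics.variance(before)
              + (n - 1) * statistics.variance(after)) / (2 * n - 2)
se_independent = math.sqrt(pooled_var * (1 / n + 1 / n))
t_independent = (statistics.mean(before) - statistics.mean(after)) / se_independent

# Dependent groups: single-group t test on the difference scores, df = n - 1 = 5.
d_scores = [b - a for b, a in zip(before, after)]
se_dependent = statistics.stdev(d_scores) / math.sqrt(n)
t_dependent = statistics.mean(d_scores) / se_dependent

print(round(se_independent, 2), round(t_independent, 2))  # 18.93 0.15
print(round(se_dependent, 2), round(t_dependent, 2))      # 0.87 3.25
```

The numerator is the same 2.83 in both tests; only the standard error in the denominator changes.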

Page 29: Psych 5500/6500

Variability and Designs

Which t test you use is based upon how you run the study.

In deciding how to run the study:

1. If you think the effect of the independent variable will be rather similar for each subject and that the subjects’ actual scores will vary quite a bit, then use a paired samples design (repeated measures or matched pairs design).

2. If you think the effect of the independent variable will vary quite a bit and that the subjects’ actual scores will be rather similar, then use an independent groups design (true experiment, quasi-experiment, static group design).

A repeated measures design is usually more powerful than an independent groups design.

Page 30: Psych 5500/6500

Effect Size

The direct measure of effect size in this t test is simply the mean of the difference scores. This value represents the effect of the independent variable on the participants, and it also happens to equal the mean of the first group minus the mean of the second group (making it the same as the measure of effect size in the t test for independent groups).

$$\bar{D}$$

Page 31: Psych 5500/6500

Standardized Effect Size

Cohen’s δ, for the population:

$$\delta = \frac{\mu_D}{\sigma_D}$$

Cohen’s d, for the sample:

$$d = \frac{\bar{D}}{S_D}$$

Hedges’ g:

$$g = \frac{\bar{D}}{\text{est. } \sigma_D}$$

Page 32: Psych 5500/6500

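Manual Calculations

$$\bar{D} = \frac{\sum D}{N} = \frac{17}{6} = 2.83$$

$$SS_D = \sum D^2 - \frac{(\sum D)^2}{N} = 71 - \frac{17^2}{6} = 22.83$$

$$S_D^2 = \frac{SS_D}{N} = \frac{22.83}{6} = 3.81 \qquad S_D = \sqrt{3.81} = 1.95$$

$$\text{est. } \sigma_D^2 = \frac{SS_D}{N - 1} = \frac{22.83}{5} = 4.57 \qquad \text{est. } \sigma_D = \sqrt{4.57} = 2.14$$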

Page 33: Psych 5500/6500

From SPSS

When doing a ‘Paired Samples t Test’ (what SPSS calls what I call ‘t test for correlated groups’) the analysis will provide the following under the title ‘Paired Samples Test’:

Mean = 2.83333   Std Deviation = 2.13698

In our use of symbols these would be represented as:

$$\bar{D} = 2.83333 \qquad \text{est. } \sigma_D = 2.13698$$

Which is enough to compute Hedges’ g; for Cohen’s d we need the standard deviation of the sample, which can be found by:

$$S_D = \text{est. } \sigma_D \sqrt{\frac{N - 1}{N}} = 2.13698 \sqrt{\frac{5}{6}} = 1.95$$
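As a quick check of that conversion, here is a minimal Python sketch. The only inputs are the ‘Std Deviation’ value reported by SPSS and the number of pairs; the variable names are illustrative.

```python
import math

n = 6                 # number of pairs of scores
est_sd_d = 2.13698    # 'Std Deviation' from SPSS (based on N - 1)

# Rescale to the standard deviation of the sample (based on N).
s_d = est_sd_d * math.sqrt((n - 1) / n)

print(round(s_d, 2))  # 1.95
```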

Page 34: Psych 5500/6500

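Effect Size as Mean Difference:

$$\bar{D} = 2.83$$

Standardized Effect Size:

$$\text{Cohen's } d = \frac{\bar{D}}{S_D} = \frac{2.83}{1.95} = 1.45$$

$$\text{Hedges' } g = \frac{\bar{D}}{\text{est. } \sigma_D} = \frac{2.83}{2.14} = 1.32$$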
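Both standardized effect sizes can be computed directly from the difference scores. The Python sketch below follows the definitions used in these slides (Cohen’s d divides by the sample standard deviation, Hedges’ g by the estimated population standard deviation); other sources define these terms differently, so treat the naming as an assumption of this lecture.

```python
import statistics

d_scores = [5, 4, 2, -1, 4, 3]
mean_d = statistics.mean(d_scores)

# pstdev divides by N (sample standard deviation, S_D in the slides).
cohens_d = mean_d / statistics.pstdev(d_scores)
# stdev divides by N - 1 (est. population standard deviation).
hedges_g = mean_d / statistics.stdev(d_scores)

print(round(cohens_d, 2), round(hedges_g, 2))  # 1.45 1.33
```

With unrounded inputs g comes out as 1.33 rather than the 1.32 on this slide, which was computed from the rounded values 2.83 and 2.14.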

Page 35: Psych 5500/6500

Warning...

There is some controversy about the correct calculations for standardized effect size. The shortcuts provided in the earlier lecture on a single group t test (repeated below) don’t work in this context:

$$d = \frac{2t}{\sqrt{df}} \qquad g = \frac{2t}{\sqrt{N}}$$

If we were to use those formulas we would get larger effect sizes:

$$d = \frac{2t}{\sqrt{df}} = \frac{(2)(3.25)}{\sqrt{5}} = 2.91 \qquad g = \frac{2t}{\sqrt{N}} = \frac{(2)(3.25)}{\sqrt{6}} = 2.65$$
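To see the inflation for yourself, the shortcut formulas can be evaluated in a few lines of Python. This sketch just plugs the obtained t from this example into the two shortcuts; the variable names are illustrative.

```python
import math

t_obt = 3.25
n = 6
df = n - 1

shortcut_d = 2 * t_obt / math.sqrt(df)
shortcut_g = 2 * t_obt / math.sqrt(n)

print(round(shortcut_d, 2), round(shortcut_g, 2))  # 2.91 2.65
```

Both values are about twice the effect sizes obtained from the difference scores directly (1.45 and 1.32).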

Page 36: Psych 5500/6500

GPower 3.0

In GPower this t test is called the t test for “Means: Difference between two dependent means (matched pairs)”. If you give it mean D and the standard deviation of D (‘SD’) it will compute Cohen’s d (big deal, as we have seen that is a simple formula). The ‘Total sample size’ is the number of pairs of scores (6 in our example). By the way, the post hoc analysis shows that this example had a power of 0.80! This was due to my having the mean D be rather large compared to the SD.

Page 37: Psych 5500/6500

Carry-Over Effect

Carry-Over Effect: A confounding variable that may arise from measuring the same person more than once; it can therefore only happen in a repeated-measures design.

Practice effect: the general term for when a carry-over effect leads to an increase in performance over subsequent measures.

Fatigue effect: the general term for when a carry-over effect leads to a decrease in performance over subsequent measures.

Page 38: Psych 5500/6500

Options for Controlling Carry-Over Effects

1. If your independent variable is a carry-over effect (e.g. the effect of practice) then you do not need or want to control it. Otherwise....

2. If applicable, use different forms of the same test.

3. Minimize the carry-over effect (e.g. increase the time between first measure and second measure).

4. Counterbalance the order of conditions.

Page 39: Psych 5500/6500

Counterbalancing the Order of Conditions

Half the participants are in Condition A first and in Condition B second.

The other half of the participants are in Condition B first and Condition A second.

Subject   Condition A   Condition B
S1        1st           2nd
S2        2nd           1st
S3        1st           2nd
S4        2nd           1st
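The alternating pattern in the table can be generated with a short Python sketch. `counterbalance` is a hypothetical helper written for this illustration; it simply alternates the order down the list of subjects, as the table does, and does not randomize which subjects get which order.

```python
def counterbalance(subjects):
    """Alternate the order of conditions A and B across subjects."""
    orders = {}
    for index, subject in enumerate(subjects):
        # Even positions get A first; odd positions get B first.
        orders[subject] = ("A", "B") if index % 2 == 0 else ("B", "A")
    return orders


for subject, order in counterbalance(["S1", "S2", "S3", "S4"]).items():
    print(subject, "first:", order[0], "second:", order[1])
```

With an even number of subjects, exactly half receive each order.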