Research Methods in Psychology Quasi-Experimental Designs and Program Evaluation.

Applied Research

Goal
• to improve the conditions in which people live and work

Natural settings
• messy, “real world,” hard to establish experimental control

Quasi-experiments
• procedures that approximate the conditions of highly controlled laboratory experiments

Program evaluation
• applied research to learn whether real-world treatments work

Characteristics of True Experiments

Manipulate an independent variable (IV)
• treatment, control conditions
• high degree of control, especially random assignment to conditions

Unambiguous outcome regarding effect of IV on the dependent variable (DV)
• internal validity

Obstacles to Conducting True Experiments in Natural Settings

Permission
• difficult to gain permission to conduct true experiments in natural settings
• difficult to gain access to participants

Random assignment perceived as unfair
• people want a “treatment”
• random assignment is the best way to determine whether a treatment is effective
• use a “waiting-list” control group
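As a minimal sketch of random assignment with a waiting-list control group, the Python below shuffles a list of participant IDs and splits it in half. The function name `assign_waiting_list`, the participant IDs, and the fixed seed are illustrative assumptions, not part of the original slides.

```python
import random

def assign_waiting_list(participants, seed=None):
    """Randomly split participants into treatment and waiting-list groups."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's data is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "waiting_list": shuffled[half:]}

# Hypothetical participant IDs, for illustration only
groups = assign_waiting_list(range(1, 21), seed=42)
print(len(groups["treatment"]), len(groups["waiting_list"]))
```

Because chance alone decides who is treated first, the two groups should be comparable at the start, and the waiting-list group can still receive the treatment once the study ends.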

Advantage of True Experiments

Threats to internal validity are controlled
• confoundings (alternative explanations for findings) are controlled
• rule out alternative explanations to make a causal inference about effect of IV on DV
• 8 general classes of threats to internal validity
  History
  Maturation
  Testing
  Instrumentation
  Regression
  Selection
  Subject attrition
  Additive effects with Selection

Threats to Internal Validity

History
• When an event occurs at the same time as the treatment and changes participants’ behavior
• participants’ “history” includes events other than treatment
• difficult to distinguish whether treatment has an effect

History Threat, continued

Does an AIDS awareness campaign on campus influence condom sales in campus vending machines?

History threat: Suppose at week 4 (X = treatment) a celebrity announces he is HIV+

Can you conclude the awareness campaign was effective?

[Figure: condom sales (y-axis, 0 to 70) by week (x-axis, weeks 1 to 8), with treatment X marked between weeks 4 and 5]

Threats to Internal Validity, continued

Maturation
• Participants naturally change over time.
• These maturational changes, not treatment, may explain any changes in participants during an experiment.

Maturation Threat, continued

Does a new reading program improve 2nd graders’ reading comprehension?

Reading comprehension improves naturally as children mature over the year.

Can you conclude the reading program was effective?

[Figure: mean comprehension (y-axis, 0 to 90) at pretest and posttest]


Testing
• Taking a test generally affects subsequent testing.
• Participants’ performance on a measure at the end of a study may differ from an initial testing because of their familiarity with the measures.

Testing Threat, continued

Does teaching people a new problem solving strategy influence their ability to solve problems quickly?

If similar problems are used in the pretest, faster problem solving may be due to familiarity with the test.

Can we conclude that the new strategy improves problem-solving ability?

[Figure: mean minutes to solve problems (y-axis, 0 to 14) at pretest and posttest]


Instrumentation
• Instruments used to measure participants’ performance may change over time.
  example: observers may become bored or tired
• Changes in participants’ performance may be due to changes in instruments used to measure performance, not to a treatment.

Instrumentation, continued

Suppose that a police protection program is implemented to decrease incidence of rape.

At the same time the program is implemented (X), reporting laws change such that what constitutes rape is broadened.

Can we conclude the program was effective (or ineffective)?

[Figure: reports of rape (y-axis, 0 to 50) by month (x-axis, months 1 to 8), with the program X marked between months 4 and 5]


Regression
• Participants sometimes perform very well or very poorly on a measure because of chance factors (e.g., luck).
• These chance factors are not likely to be present during a second testing, so their scores will not be so extreme.
• The scores will “regress” (go toward) the mean.
• Regression effects, not treatment, may account for changes in participants’ performance over time.

Regression, continued

A test score = true score + error (e.g., chance)

Definition of an unreliable test or measure:

• it measures with a lot of error

If people score very high or very low on a test, it’s possible that chance factors produced the extreme score.

On a second testing, those chance factors are less likely to be present (that’s why they’re “chance”)
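The relation test score = true score + error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the normal distributions, the means and standard deviations, the top-5% selection rule, and the fixed seed are all hypothetical choices, not values from the slides.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

N = 10_000
TRUE_MEAN, TRUE_SD = 50, 10  # assumed distribution of true scores
ERROR_SD = 10                # assumed chance error (an unreliable test)

true_scores = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
# Each testing adds fresh, independent chance error to the same true score
test1 = [t + rng.gauss(0, ERROR_SD) for t in true_scores]
test2 = [t + rng.gauss(0, ERROR_SD) for t in true_scores]

# Select the people with the most extreme (top 5%) scores on the first test
cutoff = sorted(test1)[int(0.95 * N)]
selected = [i for i in range(N) if test1[i] >= cutoff]

mean_test1 = mean(test1[i] for i in selected)
mean_test2 = mean(test2[i] for i in selected)
print(f"selected group, test 1 mean: {mean_test1:.1f}")
print(f"selected group, test 2 mean: {mean_test2:.1f}")
```

No treatment occurs between the two testings, yet the selected group's second mean falls back toward the overall mean, because the favorable chance error that helped produce the extreme first scores is not repeated.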

Regression, continued

Suppose that students were selected for an accelerated enrichment program because of their very high scores on a brief test.

Regression: to the extent the test is an unreliable measure of ability, we can expect their scores to regress to the mean at the 2nd testing.

Can we conclude the enrichment program was effective?

[Figure: mean test scores (y-axis, 0 to 100) at pretest and posttest]


Subject attrition
• When participants are lost from the study (attrition), the group equivalence formed at the start of the study may be destroyed.
• Differences between treatment and control groups at the end of the study may be due to differences in those who remain in each group.

Subject Attrition, continued

Suppose that an exercise program is offered to employees who would like to lose weight.

At Time 1, N = 50, M weight = 225 pounds

At Time 2, N = 25 (25 drop out of study)

Suppose the 25 who stayed in the program weighed, on average, 150 pounds at Time 1

Did the exercise program help people to lose weight?

[Figure: mean weight in pounds (y-axis, 0 to 250) at Time 1 and Time 2]
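The arithmetic behind this example can be worked through in a few lines of Python. The numbers come from the example above; the variable names are illustrative.

```python
n_start, mean_start = 50, 225        # everyone at Time 1
n_stayed, mean_stayed_t1 = 25, 150   # Time 1 mean of those who stayed

n_dropped = n_start - n_stayed
total_weight_t1 = n_start * mean_start
# The dropouts must account for the rest of the Time 1 total
mean_dropped_t1 = (total_weight_t1 - n_stayed * mean_stayed_t1) / n_dropped
print(mean_dropped_t1)  # 300.0
```

The 25 people who dropped out averaged 300 pounds at Time 1. If the 25 who stayed did not lose a single pound, the group mean would still appear to fall from 225 to 150 between Time 1 and Time 2, purely because of who remained.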


Selection
• occurs when differences exist between individuals in treatment and control groups at the start of a study

• these differences become alternative explanations for any differences observed at the end of the study

• random assignment controls the selection threat

Selection, continued

Suppose that a community recycling program is tested. Individuals who are interested in recycling are encouraged to participate.

Evaluation: Compare the weight of garbage from participants in the program with weight of garbage from those not in the new program.

Can we tell if the new recycling program is effective?

[Figure: mean pounds of garbage per week (y-axis, 0 to 40) for recycling-program participants and nonparticipants]


Additive effects with selection
• When one group of participants in an experiment
  responds differently to an external event (history)
  matures at a different rate (maturation)
  is measured more sensitively by a test (instrumentation)
• these threats (rather than treatment) may account for any group differences at the end of a study

Additive Effects with Selection, continued

Does an AIDS awareness campaign at School A affect condom sales compared to a control school with no awareness campaign (School B)?

History threat: Suppose a celebrity announces at week 4 that he is HIV+

Can you conclude the awareness campaign at School A is effective?

Yes, both groups should have experienced the same history threat equally.

[Figure: condom sales (y-axis, 0 to 70) by week (x-axis, weeks 1 to 8, X marked between weeks 4 and 5) for School A and School B]



Additive effect of Selection and History: Suppose at week 4 (X), the student newspaper at School A reports about students who are HIV+ (not part of the awareness campaign).

Can you conclude the awareness campaign was effective?

[Figure: condom sales (y-axis, 0 to 70) by week (x-axis, weeks 1 to 8, X marked between weeks 4 and 5) for School A and School B]



Additive effect of Selection and History: Suppose at week 4 (X), the student newspaper at School B reports about students who are HIV+

Can you conclude whether the awareness campaign at School A was effective?

[Figure: condom sales (y-axis, 0 to 70) by week (x-axis, weeks 1 to 8, X marked between weeks 4 and 5) for School A and School B]


Important points to remember
• When there is no comparison group in a study, the following threats to internal validity must be ruled out:
  history, maturation, testing, instrumentation, regression, subject attrition, selection

• When a comparison group is added, the following threats must be ruled out: selection, additive effects with selection

• Adding a comparison group helps researchers to rule out many threats to internal validity


Threats that even true experiments may not eliminate
• contamination
• experimenter expectancy effects
• novelty effects (including Hawthorne effect)

Threats to external validity
• occur when treatment effects may not be generalized beyond the particular people, setting, treatment, and outcome of an experiment
• best way to assess external validity: replication


Contamination
• occurs when there is communication about the experiment between groups of participants
• three possible outcomes
  resentment
  rivalry
  diffusion of treatments


Expectancy effects
• occur when an experimenter unintentionally influences the results of an experiment
• two types
  expectations lead to systematic errors in interpretation of participants’ performance
  expectations lead to errors in recording data


Novelty effects
• refer to changes in people’s behaviors simply because an innovation (e.g., a treatment) produces excitement, energy, enthusiasm
• Hawthorne effect: a special case
  performance changes when people know “significant others” (e.g., researchers, employers) are interested in them or care about their living or work conditions

Because of contamination, expectancy and novelty effects, researchers may have trouble concluding that a treatment was effective

Quasi-Experiments

“Quasi-” (resembling) experiments
• an important alternative when true experiments are not possible
• lack the high degree of control found in true experiments
• researchers must seek additional evidence to eliminate threats to internal validity

The One-Group Pretest-Posttest Design

“Bad experiment” or “preexperimental design”
• an intact group is selected to receive a treatment
  e.g., a classroom of children, a group of employees

• pretest records participants’ performance before treatment (O1)

• treatment is implemented (X)

• posttest records performance following treatment (O2)

O1 X O2

One-Group Pretest-Posttest Design, cont.

O1 X O2

• None of the threats to internal validity are controlled.

• Any change between pretest (O1) and posttest (O2) may be due to treatment (X) or
  history (some other event coincided with treatment)
  testing (effects of repeated testing)
  maturation (natural changes in participants over time)
  or instrumentation, regression, subject attrition

Quasi-Experimental Designs

Nonequivalent Control Group Design
• a group similar to the treatment group serves as a comparison group
• obtain pretest and posttest measures for individuals in both groups
• random assignment to groups is not used
• pretest scores are used to determine whether the groups are equivalent
  equivalent only on this dimension

Nonequivalent Control Group Design, continued

O1   X   O2   treatment group
------------------
O1       O2   nonequivalent control group

(O1 = pretest, X = treatment, O2 = posttest)


Example: Does taking a research methods course improve reasoning ability?

Compare students in research methods and developmental psychology courses

DV: 7-item test of methodological and statistical reasoning ability

Suppose group differences are observed at the posttest


By adding a comparison group, rule out these threats to internal validity:
• history
• maturation
• testing
• instrumentation
• regression

Assume that these threats affect both groups in the same way; therefore, they can’t be used to explain posttest differences

[Figure: mean reasoning score (y-axis, 0 to 6) at pretest and posttest for developmental psychology and research methods students]
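One common way to summarize a nonequivalent control group design is to compare each group's pretest-to-posttest gain. The sketch below assumes that approach; the mean scores are hypothetical values chosen for illustration, not data from the slides, and the names `means` and `gain` are likewise illustrative.

```python
# Hypothetical mean reasoning scores on the 7-item test (not data from the slides)
means = {
    "methods":       {"pre": 3.0, "post": 4.5},
    "developmental": {"pre": 3.1, "post": 3.3},
}

def gain(group):
    """Posttest mean minus pretest mean for one group."""
    return means[group]["post"] - means[group]["pre"]

# Pretest gap: are the groups similar on this one dimension before treatment?
pretest_gap = means["methods"]["pre"] - means["developmental"]["pre"]
# Difference in gains: change in the treated group beyond the comparison group
gain_difference = gain("methods") - gain("developmental")

print(f"pretest gap: {pretest_gap:.2f}")
print(f"difference in gains: {gain_difference:.2f}")
```

A small pretest gap shows the groups are similar only on the pretest measure, and a larger gain in the treatment group does not by itself rule out selection or additive effects with selection.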


What threats are not ruled out?

• Selection

Without random assignment to conditions, the two groups are probably not equivalent on many dimensions

These preexisting differences may account for group differences at the posttest


• Additive effects with selection

The two groups
• may have different experiences (selection X history)
• may mature at different rates (selection X maturation)
• may be measured more or less sensitively by the instrument (selection X instrumentation)
• may drop out of the study (courses) at different rates (differential subject attrition)
• may differ in terms of regression to the mean (differential regression)

Quasi-Experiments, continued

Simple Interrupted Time-Series Design
• Observe a DV for some time before and after a treatment is introduced.
• Archival data are often used.
• Look for clear discontinuity in the time-series data for evidence of treatment effectiveness.

O1 O2 O3 O4 X O5 O6 O7 O8

Simple Interrupted Time-Series Design, continued

Example: Study habits
• intervention: an instructional course to change students’ study habits
  implemented during the summer following the sophomore year (after semester 4)
• DV: semester GPA
• Suppose that a discontinuity is observed when the treatment (X) is introduced
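One simple way to look for a discontinuity is to fit a straight line to the pre-treatment observations, project it forward, and compare the projection with what was observed after X. The sketch below assumes that approach; the GPA values are hypothetical, and the helper `linear_fit` is an illustrative name, not a method given in the slides.

```python
from statistics import mean

# Hypothetical mean semester GPAs; treatment X falls between semesters 4 and 5
pre_semesters, pre_gpa = [1, 2, 3, 4], [2.4, 2.5, 2.4, 2.5]
post_semesters, post_gpa = [5, 6, 7, 8], [3.0, 3.1, 3.0, 3.1]

def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar

# Trend estimated from the pre-treatment observations only
slope, intercept = linear_fit(pre_semesters, pre_gpa)

# Project that trend past X and compare it with the observed GPAs
projected = [intercept + slope * s for s in post_semesters]
jump = mean(obs - proj for obs, proj in zip(post_gpa, projected))

print(f"pre-treatment slope per semester: {slope:.3f}")
print(f"mean observed minus projected GPA after X: {jump:.2f}")
```

A gradual maturational trend would show up in the pre-treatment slope, so an abrupt gap between projected and observed values is the kind of discontinuity the design looks for. A history event coinciding with X could still produce the same pattern.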

Simple Interrupted Time-Series Design, continued

What threats can be ruled out?
• maturation: assume maturational changes are gradual, not abrupt
• testing (GPA): if testing influences performance, these effects are likely to show up in initial observations (before X)
  testing effects less likely with archival data
• regression: if scores regress to the mean, they will do so in initial observations

[Figure: mean GPA (y-axis, 0 to 4) by semester (x-axis, semesters 1 to 8)]

Quasi-Experiments, continued

Time-Series with Nonequivalent Control Group Design
• Add a comparison group to the simple time-series design

O1 O2 O3 O4 X O5 O6 O7 O8

--------------------------------------------------------------

O1 O2 O3 O4 O5 O6 O7 O8

Time Series with Nonequivalent Control Group Design, continued

Example: Study habits
• Suppose that a nonequivalent control group is added; these students don’t participate in the study habits course

• Who could be in the comparison group?

• What threats would you be able to rule out?

[Figure: mean GPA (y-axis, 0 to 4) by semester (x-axis, semesters 1 to 8) for the study-habits group and the control group]

Program Evaluation

Goal
• provide feedback to administrators of human service organizations in order to help them decide
  what services to provide
  whom to provide services to
  how to provide services most effectively and efficiently

Big growth area (especially health care)

Program evaluators assess
• needs, process, outcomes, efficiency of social services

Four Questions of Program Evaluation

Needs
• Is an agency or organization meeting the needs of the people it serves?
  survey research designs

Process
• How is a program being implemented (is it going as planned)?
  observational research designs

Four Questions of Program Evaluation, cont.

Outcome
• Has a program been effective in meeting its stated goals?
  experimental, quasi-experimental research designs; archival data

Efficiency
• Is a program cost-efficient relative to alternative programs?
  experimental, quasi-experimental research designs; archival data

Basic Research and Applied Research

Program evaluation is the most extreme case of applied research
• goal is practical, not theoretical

Relationship between basic and applied research is reciprocal
• basic research provides scientifically based principles about behavior and mental processes
• these principles are applied in the complex, real world
• new complexities are recognized and new hypotheses must be tested using basic research
