Research Methods in Psychology Quasi-Experimental Designs and Program Evaluation.
Uploaded by frank-francis
Applied Research
Goal
• to improve the conditions in which people live and work
Natural settings
• messy, “real world,” hard to establish experimental control
Quasi-experiments
• procedures that approximate the conditions of highly controlled laboratory experiments
Program evaluation
• applied research to learn whether real-world treatments work
Characteristics of True Experiments
manipulate an Independent Variable (IV)
• treatment, control conditions
high degree of control
• especially random assignment to conditions
unambiguous outcome regarding effect of IV on DV
• internal validity
Obstacles to Conducting True Experiments in Natural Settings
Permission
• difficult to gain permission to conduct true experiments in natural settings
• difficult to gain access to participants
Random assignment perceived as unfair
• people want a “treatment”
• random assignment is the best way to determine whether a treatment is effective
• use “waiting-list” control group
Advantage of True Experiments
Threats to internal validity are controlled
• confoundings (alternative explanations for findings) are controlled
• rule out alternative explanations to make a causal inference about effect of IV on DV
• 8 general classes of threats to internal validity
  History
  Maturation
  Testing
  Instrumentation
  Regression
  Selection
  Subject attrition
  Additive effects with Selection
Threats to Internal Validity
History
• When an event occurs at the same time as the treatment and changes participants’ behavior
• participants’ “history” includes events other than treatment
• difficult to distinguish whether treatment has an effect
History Threat, continued
Does an AIDS awareness campaign on campus influence condom sales in campus vending machines?
History threat: Suppose at week 4 (X = treatment) a celebrity announces he is HIV+
Can you conclude the awareness campaign was effective?
[Figure: condom sales (y-axis, 0 to 70) by week (x-axis, weeks 1 to 8; X marks the treatment between weeks 4 and 5)]
Threats to Internal Validity, continued
Maturation
• Participants naturally change over time.
• These maturational changes, not treatment, may explain any changes in participants during an experiment.
Maturation Threat, continued
Does a new reading program improve 2nd graders’ reading comprehension?
Reading comprehension improves naturally as children mature over the year.
Can you conclude the reading program was effective?
[Figure: mean comprehension (y-axis, 0 to 90) at pretest and posttest]
Threats to Internal Validity, continued
Testing
• Taking a test generally affects subsequent testing
• Participants’ performance on a measure at the end of a study may differ from an initial testing because of their familiarity with the measures
Testing Threat, continued
Does teaching people a new problem solving strategy influence their ability to solve problems quickly?
If similar problems are used in the pretest, faster problem solving may be due to familiarity with the test.
Can we conclude that the new strategy improves problem-solving ability?
[Figure: mean minutes to solve problems (y-axis, 0 to 14) at pretest and posttest]
Threats to Internal Validity, continued
Instrumentation
• Instruments used to measure participants’ performance may change over time
  example: observers may become bored or tired
• Changes in participants’ performance may be due to changes in instruments used to measure performance, not to a treatment
Instrumentation, continued
Suppose that a police protection program is implemented to decrease incidence of rape.
At the same time the program is implemented (X), reporting laws change such that what constitutes rape is broadened.
Can we conclude the program was effective (or ineffective)?
[Figure: reports of rape (y-axis, 0 to 50) by month (x-axis, months 1 to 8; X marks the program between months 4 and 5)]
Threats to Internal Validity, continued
Regression
• Participants sometimes perform very well or very poorly on a measure because of chance factors (e.g., luck).
• These chance factors are not likely to be present during a second testing, so their scores will not be so extreme.
• The scores will “regress” (go toward) the mean.
• Regression effects, not treatment, may account for changes in participants’ performance over time.
Regression, continued
A test score = true score + error (e.g., chance)
Definition of an unreliable test or measure:
• it measures with a lot of error
If people score very high or very low on a test, it’s possible that chance factors produced the extreme score.
On a second testing, those chance factors are less likely to be present (that’s why they’re “chance”)
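The relation test score = true score + error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population mean of 50, the standard deviations, the 5% selection cutoff, and the function name `simulate_retest` are all hypothetical choices, not values from the slides. No treatment is applied between the two testings, so any drop in the selected group's mean reflects regression alone.

```python
import random
import statistics

def simulate_retest(n=10_000, error_sd=10.0, top_fraction=0.05, seed=0):
    """Select the highest scorers on test 1; return their mean on test 1 and test 2."""
    rng = random.Random(seed)
    # Hypothetical population of true scores (assumed mean 50, SD 10).
    true_scores = [rng.gauss(50, 10) for _ in range(n)]
    # Observed score = true score + chance error, drawn fresh for each testing.
    test1 = [t + rng.gauss(0, error_sd) for t in true_scores]
    test2 = [t + rng.gauss(0, error_sd) for t in true_scores]
    # Keep only people with extreme (very high) scores on the first test.
    cutoff = sorted(test1)[int(n * (1 - top_fraction))]
    chosen = [i for i in range(n) if test1[i] >= cutoff]
    mean1 = statistics.mean(test1[i] for i in chosen)
    mean2 = statistics.mean(test2[i] for i in chosen)
    return mean1, mean2

mean_first, mean_second = simulate_retest()
print(f"selected group, test 1 mean: {mean_first:.1f}")
print(f"selected group, test 2 mean: {mean_second:.1f}")
```

Raising `error_sd` (a less reliable test) should make the second mean fall further toward the population mean; setting it to 0 removes the effect.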
Regression, continued
Suppose that students were selected for an accelerated enrichment program because of their very high scores on a brief test.
Regression: to the extent the test is an unreliable measure of ability, we can expect their scores to regress to the mean at the 2nd testing.
Can we conclude the enrichment program was effective?
[Figure: mean test scores (y-axis, 0 to 100) at pretest and posttest]
Threats to Internal Validity, continued
Subject attrition
• When participants are lost from the study (attrition), the group equivalence formed at the start of the study may be destroyed.
• Differences between treatment and control groups at the end of the study may be due to differences in those who remain in each group.
Subject Attrition, continued
Suppose that an exercise program is offered to employees who would like to lose weight.
At Time 1, N = 50
• M weight = 225 pounds
At Time 2, N = 25 (25 drop out of study)
Suppose the 25 who stayed in the program weighed, on average, 150 pounds at Time 1
Did the exercise program help people to lose weight?
[Figure: mean weight in pounds (y-axis, 0 to 250) at Time 1 and Time 2]
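The arithmetic behind the exercise example can be worked through directly. The four input numbers come from the example; the variable names are illustrative, and the final step assumes, purely for illustration, that the 25 stayers do not change weight at all.

```python
n_start = 50             # participants at Time 1 (from the example)
mean_start = 225.0       # Time 1 mean weight of all 50, in pounds
n_stayed = 25            # participants still in the study at Time 2
mean_stayers_t1 = 150.0  # Time 1 mean weight of the 25 who stayed

# The Time 1 total weight splits into stayers plus dropouts.
total_t1 = n_start * mean_start
n_dropped = n_start - n_stayed
mean_dropouts_t1 = (total_t1 - n_stayed * mean_stayers_t1) / n_dropped
print(mean_dropouts_t1)  # 300.0

# Assumption for illustration: the stayers' weight does not change at all.
# Comparing the full-group Time 1 mean with the stayers-only Time 2 mean
# still shows a large apparent "loss" that is due to attrition alone.
apparent_loss = mean_start - mean_stayers_t1
print(apparent_loss)  # 75.0
```

A fairer comparison would use only the 25 stayers at both times (150 pounds at Time 1 versus their Time 2 mean).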
Threats to Internal Validity, continued
Selection
• occurs when differences exist between individuals in treatment and control groups at the start of a study
• these differences become alternative explanations for any differences observed at the end of the study
• random assignment controls the selection threat
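A minimal sketch of random assignment, assuming a simple two-group study with a treatment group and a waiting-list control group. The function name `randomly_assign`, the participant labels, and the seed are hypothetical. Because chance alone decides who lands in which group, preexisting differences are expected to balance out on average.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is left unchanged
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant labels P01 to P20.
people = [f"P{i:02d}" for i in range(1, 21)]
treatment, waiting_list = randomly_assign(people, seed=42)
print("treatment:   ", sorted(treatment))
print("waiting list:", sorted(waiting_list))
```

Contrast this with the recycling example, where interested individuals select themselves into the program.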
Selection, continued
Suppose that a community recycling program is tested. Individuals who are interested in recycling are encouraged to participate.
Evaluation: Compare the weight of garbage from participants in the program with weight of garbage from those not in the new program.
Can we tell if the new recycling program is effective?
[Figure: mean lbs/week of garbage (y-axis, 0 to 40) for recycling participants (Recyc.) and nonparticipants (Not)]
Threats to Internal Validity, continued
Additive effects with selection
• When one group of participants in an experiment
  responds differently to an external event (history)
  matures at a different rate (maturation)
  is measured more sensitively by a test (instrumentation)
• these threats (rather than treatment) may account for any group differences at the end of a study
Additive Effects with Selection, continued
Does an AIDS awareness campaign at School A affect condom sales compared to a control school with no awareness campaign (School B)?
History threat: Suppose a celebrity announces at week 4 that he is HIV+
Can you conclude the awareness campaign at School A is effective?
Yes, both groups should have experienced the same history threat equally.
[Figure: condom sales (y-axis, 0 to 70) by week (x-axis, weeks 1 to 8; X between weeks 4 and 5), School A vs. School B]
Additive Effects with Selection, continued
Does an AIDS awareness campaign at School A affect condom sales compared to a control school with no awareness campaign (School B)?
Additive effect of Selection and History: Suppose at week 4 (X), the student newspaper at School A reports about students who are HIV+ (not part of the awareness campaign).
Can you conclude the awareness campaign was effective?
[Figure: condom sales (y-axis, 0 to 70) by week (x-axis, weeks 1 to 8; X between weeks 4 and 5), School A vs. School B]
Additive Effects with Selection, continued
Does an AIDS awareness campaign at School A affect condom sales compared to a control school with no awareness campaign (School B)?
Additive effect of Selection and History: Suppose at week 4 (X), the student newspaper at School B reports about students who are HIV+
Can you conclude whether the awareness campaign at School A was effective?
[Figure: condom sales (y-axis, 0 to 70) by week (x-axis, weeks 1 to 8; X between weeks 4 and 5), School A vs. School B]
Threats to Internal Validity, continued
Important points to remember
• When there is no comparison group in a study, the following threats to internal validity must be ruled out:
  history, maturation, testing, instrumentation, regression, subject attrition, selection
• When a comparison group is added, the following threats must be ruled out: selection, additive effects with selection
• Adding a comparison group helps researchers to rule out many threats to internal validity
Threats to Internal Validity, continued
Threats that even true experiments may not eliminate
• contamination
• experimenter expectancy effects
• novelty effects (including Hawthorne effect)
Threats to external validity
• occur when treatment effects may not be generalized beyond the particular people, setting, treatment, and outcome of an experiment.
• best way to assess external validity: replication
Threats to Internal Validity, continued
Contamination
• occurs when there is communication about the experiment between groups of participants
• three possible outcomes
  resentment
  rivalry
  diffusion of treatments
Threats to Internal Validity, continued
Expectancy effects
• occur when an experimenter unintentionally influences the results of an experiment
• two types
  expectations lead to systematic errors in interpretation of participants’ performance
  expectations lead to errors in recording data
Threats to Internal Validity, continued
Novelty effects
• refer to changes in people’s behaviors simply because an innovation (e.g., a treatment) produces excitement, energy, enthusiasm
• Hawthorne effect: a special case
  performance changes when people know “significant others” (e.g., researchers, employers) are interested in them or care about their living or work conditions
Because of contamination, expectancy and novelty effects, researchers may have trouble concluding that a treatment was effective
Quasi-Experiments
“Quasi-” (resembling) experiments
• an important alternative when true experiments are not possible
• lack the high degree of control found in true experiments
• researchers must seek additional evidence to eliminate threats to internal validity
The One-Group Pretest-Posttest Design
“bad experiment” or “preexperimental design”
• an intact group is selected to receive a treatment
  e.g., a classroom of children, a group of employees
• pretest records participants’ performance before treatment: observation 1 (O1)
• treatment is implemented (X)
• posttest records performance following treatment (O2)
O1 X O2
One-Group Pretest-Posttest Design, cont.
O1 X O2
• None of the threats to internal validity are controlled.
• Any change between pretest (O1) and posttest (O2) may be due to treatment (X) or
  history (some other event coincided with treatment)
  testing (effects of repeated testing)
  maturation (natural changes in participants over time)
  or instrumentation, regression, subject attrition
Quasi-Experimental Designs
Nonequivalent Control Group Design
• a group similar to the treatment group serves as a comparison group
• obtain pretest and posttest measures for individuals in both groups
• random assignment to groups is not used
• pretest scores are used to determine whether the groups are equivalent
  equivalent only on this dimension
Nonequivalent Control Group Design, continued
O1   X   O2     treatment group
------------------
O1       O2     nonequivalent control group
O1 = pretest, X = treatment, O2 = posttest
Nonequivalent Control Group Design, continued
Example: Does taking a research methods course improve reasoning ability?
Compare students in research methods and developmental psychology courses
DV: 7-item test of methodological and statistical reasoning ability
Suppose group differences are observed at the posttest
Nonequivalent Control Group Design, continued
By adding a comparison group, rule out these threats to internal validity:
• history
• maturation
• testing
• instrumentation
• regression
Assume that these threats affect both groups equally; therefore, they can’t be used to explain posttest differences
[Figure: mean reasoning score (y-axis, 0 to 6) at pretest and posttest, developmental psychology (Develop) vs. research methods (Methods) students]
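One way to summarize this comparison is to contrast the pretest-to-posttest gain in each group. The sketch below uses made-up scores on the 7-item test (they are not data from the study), and the helper `mean_gain` is a hypothetical name. A larger gain in the methods group fits a treatment effect, but it does not by itself rule out selection or additive effects with selection.

```python
from statistics import mean

# Hypothetical scores on the 7-item reasoning test (illustration only).
methods_pre = [3, 4, 2, 3, 4]
methods_post = [5, 6, 4, 5, 5]
develop_pre = [3, 3, 4, 2, 3]
develop_post = [3, 4, 4, 3, 3]

def mean_gain(pre, post):
    """Change in the group mean from pretest (O1) to posttest (O2)."""
    return mean(post) - mean(pre)

methods_gain = mean_gain(methods_pre, methods_post)
develop_gain = mean_gain(develop_pre, develop_post)

# Threats assumed to act equally on both groups (history, maturation, testing,
# instrumentation, regression) cancel out when the two gains are subtracted.
difference_in_gains = methods_gain - develop_gain
print(f"methods gain: {methods_gain:.2f}")
print(f"develop gain: {develop_gain:.2f}")
print(f"difference in gains: {difference_in_gains:.2f}")
```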
Nonequivalent Control Group Design, continued
What threats are not ruled out?
• Selection
  Without random assignment to conditions, the two groups are probably not equivalent on many dimensions
  These preexisting differences may account for group differences at the posttest
Nonequivalent Control Group Design, continued
• Additive effects with selection
The two groups
• may have different experiences (selection × history)
• may mature at different rates (selection × maturation)
• may be measured more or less sensitively by the instrument (selection × instrumentation)
• may drop out of the study (courses) at different rates (differential subject attrition)
• may differ in terms of regression to the mean (differential regression)
Quasi-Experiments, continued
Simple Interrupted Time-Series Design
• Observe a DV for some time before and after a treatment is introduced.
• Archival data are often used.
• Look for clear discontinuity in the time-series data for evidence of treatment effectiveness.
O1 O2 O3 O4 X O5 O6 O7 O8
Simple Interrupted Time-Series Design, continued
Example: Study habits
• intervention: An instructional course to change students’ study habits
  implemented during the summer following the sophomore year (after semester 4)
• DV: semester GPA
• Suppose that a discontinuity is observed when the treatment (X) is introduced
Simple Interrupted Time-Series Design, continued
What threats can be ruled out?
• maturation: assume maturational changes are gradual, not abrupt
• testing (GPA): if testing influences performance, these effects are likely to show up in initial observations (before X)
  testing effects less likely with archival data
• regression: if scores regress to the mean, they will do so in initial observations
[Figure: mean GPA (y-axis, 0 to 4) by semester (x-axis, semesters 1 to 8)]
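A rough way to check for a discontinuity is to fit a straight line to the pre-treatment observations (O1 to O4), project it forward, and compare the projection with what was observed after X. The sketch below assumes invented GPA values and a hypothetical helper `fit_line`. It is a crude illustration, not a full time-series analysis, and it cannot rule out a history threat that coincides with X.

```python
from statistics import mean

# Hypothetical mean semester GPAs; the treatment X falls between semesters 4 and 5.
semesters = [1, 2, 3, 4, 5, 6, 7, 8]
gpa = [2.5, 2.6, 2.5, 2.6, 3.1, 3.2, 3.1, 3.2]
n_pre = 4  # number of observations before X

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Gradual change (e.g., maturation) is captured by the pre-treatment trend.
slope, intercept = fit_line(semesters[:n_pre], gpa[:n_pre])
projected = [intercept + slope * s for s in semesters[n_pre:]]
observed = gpa[n_pre:]

# Average amount by which post-treatment GPAs exceed the projected trend.
jump = mean(o - p for o, p in zip(observed, projected))
print(f"pre-treatment slope per semester: {slope:.3f}")
print(f"average jump above projected trend: {jump:.2f}")
```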
Quasi-Experiments, continued
Time-Series with Nonequivalent Control Group Design
• Add a comparison group to the simple time-series design
O1 O2 O3 O4 X O5 O6 O7 O8
--------------------------------------------------------------
O1 O2 O3 O4 O5 O6 O7 O8
Time Series with Nonequivalent Control Group Design, continued
Example: Study habits
• Suppose that a nonequivalent control group is added; these students don’t participate in the study habits course
• Who could be in the comparison group?
• What threats would you be able to rule out?
[Figure: mean GPA (y-axis, 0 to 4) by semester (x-axis, semesters 1 to 8), study habits group (Study) vs. control group (Control)]
Program Evaluation
Goal
• provide feedback to administrators of human service organizations in order to help them decide
  what services to provide
  who to provide services to
  how to provide services most effectively and efficiently
Big growth area (especially health care)
Program evaluators assess
• needs, process, outcomes, efficiency of social services
Four Questions of Program Evaluation
Needs
• Is an agency or organization meeting the needs of the people it serves?
  survey research designs
Process
• How is a program being implemented (is it going as planned)?
  observational research designs
Four Questions of Program Evaluation, cont.
Outcome
• Has a program been effective in meeting its stated goals?
  experimental, quasi-experimental research designs; archival data
Efficiency
• Is a program cost-efficient relative to alternative programs?
  experimental, quasi-experimental research designs; archival data
Basic Research and Applied Research
Program evaluation is the most extreme case of applied research
• goal is practical, not theoretical
Relationship between basic and applied research is reciprocal
• basic research provides scientifically based principles about behavior and mental processes
• these principles are applied in the complex, real world
• new complexities are recognized and new hypotheses must be tested using basic research