Post on 31-Mar-2015
NON-EXPERIMENTAL METHODS
Shwetlena Sabarwal
(thanks to Markus Goldstein for the slides)
OBJECTIVE
• Find a plausible counterfactual
• Every non-experimental method is associated with an assumption
• The stronger the assumption the weaker the estimate
TEST ASSUMPTIONS
Reality check
PROGRAM TO EVALUATE
Hopetown HIV/AIDS Program (2008-2012)
Objective: Reduce HIV transmission
Intervention: Peer education
Target group: Youth 15-24
Outcome indicator: Pregnancy rate (proxy for unprotected sex)
I. BEFORE-AFTER IDENTIFICATION STRATEGY
Counterfactual:
Rate of pregnancy observed before program started
EFFECT = After minus Before
Year         Number of areas   Teen pregnancy rate (per 1000)
2008         70                62.90
2012         70                66.37
Difference                     +3.47
COUNTERFACTUAL ASSUMPTION: No change over time
[Figure: Teen pregnancy rate (per 1000), 2008 vs. 2012: 62.9 before and 66.37 after the intervention. Effect = +3.47]
Question: what else might have happened in 2008-2012 to affect teen pregnancy?
Number of areas: 70
Year                        2004    2008    2012
Teen pregnancy (per 1000)   54.96   62.90   66.37
TEST ASSUMPTION with prior data
REJECT counterfactual hypothesis of no change over time
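The before-after estimate and its check against prior data can be reproduced in a few lines. This is a minimal sketch using the rates reported above for the 70 program areas; the variable names are illustrative and not part of the original slides.

```python
# Teen pregnancy rates (per 1000) in the 70 program areas, as reported above.
rates = {2004: 54.96, 2008: 62.90, 2012: 66.37}

# Before-after estimate: the 2008 rate stands in for the counterfactual.
effect = rates[2012] - rates[2008]

# Check of the "no change over time" assumption with pre-program data:
# if nothing changed over time, the 2004-2008 difference would be near zero.
pre_program_change = rates[2008] - rates[2004]

print(f"Before-after effect (2008-2012): {effect:+.2f}")
print(f"Pre-program change (2004-2008):  {pre_program_change:+.2f}")
```

The rate was already rising by 7.94 per 1000 before the program started, which is why the "no change over time" counterfactual is rejected.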
II. NON-PARTICIPANT IDENTIFICATION STRATEGY
Counterfactual:
Rate of pregnancy among non-participants
Teen pregnancy rate (per 1000) in 2012
Participants 66.37
Non-participants 57.50
Difference +8.87
COUNTERFACTUAL ASSUMPTION: Without intervention, participants have the same pregnancy rate as non-participants
[Figure: Teen pregnancy rate (per 1000) in 2012: participants 66.4, non-participants 57.5. Effect = +8.87]
Question: how might participants differ from non-participants?
TEST ASSUMPTION WITH PRE-PROGRAM DATA
[Figure: Teen pregnancy rate (per 1000), 2008 and 2012: participants 62.9 (2008) and 66.4 (2012); non-participants 46.37 (2008) and 57.5 (2012)]
REJECT counterfactual hypothesis of same pregnancy rates
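The participant versus non-participant comparison, and the pre-program check that rejects it, can be sketched as follows. The numbers are the rates shown on the slides; the dictionary layout and names are assumptions made for illustration.

```python
# Teen pregnancy rates (per 1000), as shown on the slides.
participants = {2008: 62.90, 2012: 66.37}
non_participants = {2008: 46.37, 2012: 57.50}

# Non-participant estimate: compare the two groups after the program (2012).
effect = participants[2012] - non_participants[2012]

# Check with pre-program data: the same comparison in 2008, before any
# intervention. A gap far from zero means the groups differed at baseline.
baseline_gap = participants[2008] - non_participants[2008]

print(f"Participant vs. non-participant effect (2012): {effect:+.2f}")
print(f"Gap before the program (2008):                 {baseline_gap:+.2f}")
```

A baseline gap of 16.53 per 1000 shows the two groups were not comparable even before the intervention.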
III. DIFFERENCE-IN-DIFFERENCE IDENTIFICATION STRATEGY
Counterfactual:
1. Non-participant rate of pregnancy, purging pre-program differences in participants/nonparticipants
2. “Before” rate of pregnancy, purging before-after change for nonparticipants
1 and 2 are equivalent
Average rate of teen pregnancy in:
                        2008    2012    Difference (2008-2012)
Participants (P)        62.90   66.37   3.47
Non-participants (NP)   46.37   57.50   11.13
Difference (P - NP)     16.53   8.87    -7.66
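The claim that the two counterfactuals are equivalent can be verified directly from the table. The sketch below computes the difference-in-difference both ways and also the participant rate implied for 2012 without the program; names such as `did_by_time` are illustrative only.

```python
# Teen pregnancy rates (per 1000) from the table above.
part = {2008: 62.90, 2012: 66.37}      # participants
nonpart = {2008: 46.37, 2012: 57.50}   # non-participants

# Before-after change of participants, purged of the change
# observed among non-participants.
did_by_time = (part[2012] - part[2008]) - (nonpart[2012] - nonpart[2008])

# Participant/non-participant gap in 2012, purged of the gap
# that already existed in 2008.
did_by_group = (part[2012] - nonpart[2012]) - (part[2008] - nonpart[2008])

# Implied counterfactual: participants' 2008 rate plus the non-participant trend.
counterfactual_2012 = part[2008] + (nonpart[2012] - nonpart[2008])

print(f"DiD (differences over time):   {did_by_time:+.2f}")
print(f"DiD (differences over groups): {did_by_group:+.2f}")
print(f"Counterfactual 2012 rate:      {counterfactual_2012:.2f}")
```

Both orderings are the same four numbers rearranged, so they agree at -7.66, and the observed 66.37 sits 7.66 below the implied counterfactual of 74.03.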
III. DIFFERENCE-IN-DIFFERENCE IDENTIFICATION STRATEGY
[Figure: Teen pregnancy rate (per 1000), 2008 and 2012, participants vs. non-participants]
Non-participants: 57.50 - 46.37 = 11.13
Participants: 66.37 - 62.90 = 3.47
Effect = 3.47 - 11.13 = -7.66
III. DIFFERENCE-IN-DIFFERENCE (1) Nonparticipants purging before/after
[Figure: Teen pregnancy rate (per 1000), 2008 and 2012, participants vs. non-participants]
After: 66.37 - 57.50 = 8.87
Before: 62.90 - 46.37 = 16.53
Effect = 8.87 - 16.53 = -7.66
III. DIFFERENCE-IN-DIFFERENCE (2) Before purging participation
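[Figure: Teen pregnancy rate (per 1000), 2008 and 2012, participants vs. non-participants. Chart labels: 74.0 (participants' 2008 rate plus the non-participant change, 62.90 + 11.13) and 16.5 (gap between the groups in 2008)]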
III. DIFFERENCE-IN-DIFFERENCE
[Figure: Teen pregnancy rate (per 1000), 2008 and 2012, participants vs. non-participants. Chart labels: 74.0 and -7.6 (the difference-in-difference effect)]
III. DIFFERENCE-IN-DIFFERENCE
COUNTERFACTUAL ASSUMPTION:
Without intervention, participants' and nonparticipants' pregnancy rates follow the same trends
Question: why might participants’ trends differ from those of nonparticipants?
TEST ASSUMPTION WITH PRE-PROGRAM DATA
Average rate of teen pregnancy in:
                        2004    2008    Difference (2004-2008)
Participants (P)        54.96   62.90   7.94
Non-participants (NP)   39.96   46.37   6.41
Difference (P - NP)     15.00   16.53   +1.53 ?
REJECT counterfactual hypothesis of same trends
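The test above amounts to running the same difference-in-difference on a period when no program existed. A minimal sketch with the 2004 and 2008 rates, using illustrative variable names:

```python
# Pre-program teen pregnancy rates (per 1000), as reported above.
part = {2004: 54.96, 2008: 62.90}      # participants
nonpart = {2004: 39.96, 2008: 46.37}   # non-participants

trend_part = part[2008] - part[2004]
trend_nonpart = nonpart[2008] - nonpart[2004]

# "Placebo" difference-in-difference: under the same-trends assumption
# this would be close to zero, because no program ran in 2004-2008.
placebo_did = trend_part - trend_nonpart

print(f"Participant trend 2004-2008:     {trend_part:+.2f}")
print(f"Non-participant trend 2004-2008: {trend_nonpart:+.2f}")
print(f"Placebo DiD:                     {placebo_did:+.2f}")
```

The sketch only reproduces the point estimate of +1.53. Whether that gap is statistically distinguishable from zero would need area-level data, which the slides do not report.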
IV. MATCHING WITH DIFFERENCE-IN-DIFFERENCE IDENTIFICATION STRATEGY
Counterfactual:
Comparison group is constructed by pairing each program participant with a “similar” nonparticipant
Minimize differences in the vector of observed characteristics between participant and nonparticipant
• Parametrically (propensity score matching)
• Nonparametrically
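To make the pairing step concrete, here is a minimal sketch of nonparametric nearest-neighbor matching combined with difference-in-difference. Everything in it is assumed for illustration: the `Area` record, the two observed characteristics, and all data values are hypothetical and are not the Hopetown data. It is also not propensity score matching, which would first model the probability of participation.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Area:
    name: str
    treated: bool
    x: tuple          # observed characteristics (hypothetical, pre-scaled)
    y_before: float   # outcome before the program
    y_after: float    # outcome after the program


def nearest_match(target, pool):
    """Return the nonparticipant closest to `target` in observed characteristics."""
    return min(pool, key=lambda a: dist(a.x, target.x))


def matched_did(areas):
    """Average difference-in-difference over matched pairs (with replacement)."""
    treated = [a for a in areas if a.treated]
    controls = [a for a in areas if not a.treated]
    diffs = []
    for t in treated:
        c = nearest_match(t, controls)
        diffs.append((t.y_after - t.y_before) - (c.y_after - c.y_before))
    return sum(diffs) / len(diffs)


# Hypothetical example data, invented for this sketch.
areas = [
    Area("A", True, (0.30, 12.0), 60.0, 64.0),
    Area("B", True, (0.55, 9.0), 66.0, 68.0),
    Area("C", False, (0.32, 11.5), 58.0, 69.0),
    Area("D", False, (0.50, 9.5), 63.0, 72.0),
    Area("E", False, (0.90, 4.0), 40.0, 45.0),
]

print(matched_did(areas))  # -7.0
```

Here A is paired with C and B with D, while E is never used because it resembles no participant. In practice the characteristics would be standardized before computing distances, since Euclidean distance is dominated by whichever variable has the largest scale.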
COUNTERFACTUAL ASSUMPTION
Unobserved characteristics do not affect outcomes of interest
Question: how might a participant differ from a matched nonparticipant?
[Figure: Teen pregnancy rate (per 1000), 2008 and 2012: matched nonparticipant 73.36, participant 66.37]
Effect = -7.01
COUNTERFACTUAL ASSUMPTION
TEST ASSUMPTION WITH EXPERIMENTAL DATA
Meta-analysis of studies using experimental data to estimate bias in impact estimates using matching:
• Unobservables matter!
• Direction of bias is unpredictable!
REJECT counterfactual hypothesis
V. REGRESSION DISCONTINUITY IDENTIFICATION STRATEGY
Applicability:
When strict quantitative criteria determine eligibility
Counterfactual:
Nonparticipants just below the eligibility cutoff are the comparison for participants just above the eligibility cutoff
COUNTERFACTUAL ASSUMPTION
Nonparticipants just below the eligibility cutoff are the same as participants just above the eligibility cutoff
Question: Is the distribution around the cutoff smooth?
If so, the assumption is reasonable
However, we can only estimate impact around the cutoff, not for the whole program
EXAMPLE: EFFECT OF SCHOOL INPUTS ON TEST SCORES
• Target transfer to poorest schools
• Construct poverty index from 1 to 100
• Schools with a score <=50 are in
• Schools with a score >50 are out
• Inputs transfer to poor schools
• Measure outcomes (i.e. test scores) before and after transfer
[Figure: Regression Discontinuity Design - Baseline. Outcome plotted against Score (20 to 80)]
[Figure: Regression Discontinuity Design - Baseline, with Poor and Non-Poor schools marked]
[Figure: Regression Discontinuity Design - Post Intervention. Outcome plotted against Score (20 to 80)]
[Figure: Regression Discontinuity Design - Post Intervention, with the Treatment Effect marked]
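The treatment effect in the school example is the jump in the outcome at a score of 50. One common way to estimate it is to fit a separate line on each side of the cutoff within a bandwidth and compare the two fits at the cutoff. The sketch below does this on hypothetical, noise-free data with an assumed 5-point effect; the function names, the bandwidth, and the data are all invented for illustration.

```python
def fit_line(points):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def rd_effect(data, cutoff, bandwidth):
    """Jump in the outcome at the cutoff, from separate linear fits per side.

    Units with score <= cutoff are treated, as in the school example.
    """
    treated = [(s, y) for s, y in data if cutoff - bandwidth <= s <= cutoff]
    untreated = [(s, y) for s, y in data if cutoff < s <= cutoff + bandwidth]
    a_t, b_t = fit_line(treated)
    a_u, b_u = fit_line(untreated)
    # Evaluate both fitted lines at the cutoff and take the difference.
    return (a_t + b_t * cutoff) - (a_u + b_u * cutoff)


# Hypothetical data: the outcome rises with the poverty score, and schools
# with score <= 50 receive an assumed 5-point boost from the transfer.
data = [(s, 60 + 0.2 * s + (5 if s <= 50 else 0)) for s in range(20, 81)]

effect = rd_effect(data, cutoff=50, bandwidth=10)
print(round(effect, 2))  # 5.0
```

Only schools within 10 points of the cutoff enter the estimate, which is why the result speaks to impact near the cutoff rather than for the whole program.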
SUMMARY
• Gold standard is randomization – fewest assumptions needed, unbiased estimates
• Non-experimental requires assumptions – can you live with them?
• We did not cover:
  – Encouragement design
  – Instrumental variables
  – Pipeline comparisons