Introduction to Impact evaluation: Part II- Methods & Examples - World...
![Page 1: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/1.jpg)
Introduction to Impact Evaluation:
Part II - Methods & Examples
Emmanuel Skoufias, Vincenzo Di Maro
The World Bank, PREM Learning Events
April 26, 2011
![Page 2: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/2.jpg)
Outline
1. The Evaluation Problem & Selection Bias
2. Solutions to the evaluation problem
   • Cross-Sectional Estimator
   • Before and After Estimator
   • Double Difference Estimator
3. Experimental Designs
4. Quasi-Experimental Designs: PSM, RDD
5. Instrumental Variables and IE
6. How to implement an Impact Evaluation
2
![Page 3: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/3.jpg)
1. Evaluation Problem and Selection Bias
3
![Page 4: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/4.jpg)
How to assess impact
What is beneficiary’s test score with program compared to without program?
Formally, program impact is:
E(Y | T=1) - E(Y | T=0)
Compare same individual with & without programs at same point in time
So what’s the Problem?
4
![Page 5: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/5.jpg)
Solving the evaluation problem
Problem: we never observe the same individual with and without program at same point in time
Observe: E(Y | T=1) & E(Y | T=0)? NO!
Solution: estimate what would have happened if beneficiary had not received benefits
Observe: E(Y | T=1) YES!
Estimate: E(Y | T=0) YES!!
5
![Page 6: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/6.jpg)
Solving the evaluation problem
Counterfactual: what would have happened without the program
Estimated impact is difference between treated observation and counterfactual
Never observe same individual with and without program at same point in time
Need to estimate counterfactual
Counterfactual is key to impact evaluation
6
![Page 7: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/7.jpg)
Finding a good counterfactual
Treated & counterfactual have identical characteristics, except for benefiting from the intervention
No other reason for differences in outcomes of treated and counterfactual
Only reason for the difference in outcomes is due to the intervention
7
![Page 8: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/8.jpg)
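Having the “ideal” counterfactual……
[Figure: outcome Y plotted against time (t=0, t=1), with the Intervention marked on the time axis. Labels: Y (observed) rising from Y0 to Y1, Y* (counterfactual) ending at Y1*, and IMPACT as the gap between them at t=1.]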
8
![Page 9: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/9.jpg)
…allows us to estimate the true impact
Impact = Y1 - Y1*
[Figure: the same plot of Y against time (t=0, t=1), starting at Y0, with Y1 and Y1* marked at t=1.]
9
![Page 10: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/10.jpg)
Comparison Group Issues
Two central problems:
Programs are targeted
  Program areas will differ in observable and unobservable ways precisely because the program intended this
Individual participation is (usually) voluntary
  Participants will differ from non-participants in observable and unobservable ways (selection based on observable variables such as age and education and unobservable variables such as ability, motivation, drive)
Hence, a comparison of participants and an arbitrary group of non-participants can lead to heavily biased results
10
![Page 11: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/11.jpg)
Basic formulation of the evaluation problem
IMPACT = Average Treatment Effect (ATE), or ATT = Average Treatment Effect on the Treated:
$\text{ATT} = E(Y^T \mid D=1) - E(Y^C \mid D=1)$
(first term: Observed; second term: Not observed)
$E(Y^T \mid D=1) - E(Y^C \mid D=0)$
(first term: Observed; second term: Observed)
11
![Page 12: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/12.jpg)
Basic formulation of the evaluation problem
IMPACT:
$\text{ATT} = E(Y^T \mid D=1) - E(Y^C \mid D=1)$
Observed comparison: $E(Y^T \mid D=1) - E(Y^C \mid D=0)$
Selection bias if: $E(Y^C \mid D=1) - E(Y^C \mid D=0) = \text{BIAS} \neq 0$
Selection bias = difference in mean outcomes (in the absence of the intervention) between participants and non-participants
12
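The step from the observed comparison to the ATT can be written out explicitly. The sketch below uses only the slide's own symbols ($Y^T$, $Y^C$, $D$) and adds and subtracts the unobserved term $E(Y^C \mid D=1)$; the labels under the braces are the slide's ATT and BIAS.

```latex
\begin{aligned}
E(Y^T \mid D=1) - E(Y^C \mid D=0)
  &= \underbrace{E(Y^T \mid D=1) - E(Y^C \mid D=1)}_{\text{ATT}}
   + \underbrace{E(Y^C \mid D=1) - E(Y^C \mid D=0)}_{\text{BIAS}}
\end{aligned}
```

The observed difference therefore equals the ATT only when the BIAS term is zero.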
![Page 13: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/13.jpg)
Two sources of selection bias
• Selection on observables
  Data
  Linearity in controls?
• Selection on unobservables
  Participants have latent attributes that yield higher/lower outcomes
• One cannot judge if exogeneity is plausible without knowing whether one has dealt adequately with observable heterogeneity.
• That depends on program, setting and data.
13
![Page 14: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/14.jpg)
Alternative solutions 1
Experimental evaluation (“Social experiment”)
Program is randomly assigned, so that everyone has the same probability of receiving the treatment.
In theory, this method is assumption free, but in practice many assumptions are required.
Pure randomization is rare for anti-poverty programs in practice, since randomization precludes purposive targeting.
Although it is sometimes feasible to partially randomize.
14
![Page 15: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/15.jpg)
Randomization
“Randomized out” group reveals counterfactual
• As long as the assignment is genuinely random, mean impact is revealed: $E(Y^C \mid D=1) = E(Y^C \mid D=0)$
• ATE is consistently estimated (nonparametrically) by the difference between sample mean outcomes of participants and non-participants.
• Pure randomization is the theoretical ideal for ATE, and the benchmark for non-experimental methods.
• More common: randomization conditional on ‘X’
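A small simulation can illustrate why a coin-toss assignment makes the "randomized out" group a valid counterfactual. This is a minimal sketch with made-up numbers: the function name, the sample size, the normal distribution for the no-program outcome and the constant impact of 2.0 are all assumptions for illustration, not part of the slides.

```python
import random
import statistics

def simulate_randomized_trial(n=20000, true_impact=2.0, seed=0):
    """Simulate coin-toss assignment; return (difference in means, gap in Y^C)."""
    rng = random.Random(seed)
    treated_y, control_y = [], []      # observed outcomes by group
    treated_yc, control_yc = [], []    # no-program outcomes by group
    for _ in range(n):
        yc = rng.gauss(10.0, 1.0)      # hypothetical outcome without the program (Y^C)
        yt = yc + true_impact          # hypothetical outcome with the program (Y^T)
        assigned = rng.random() < 0.5  # assignment is independent of Y^C
        if assigned:
            treated_y.append(yt)
            treated_yc.append(yc)
        else:
            control_y.append(yc)
            control_yc.append(yc)
    estimate = statistics.mean(treated_y) - statistics.mean(control_y)
    bias = statistics.mean(treated_yc) - statistics.mean(control_yc)
    return estimate, bias

estimate, bias = simulate_randomized_trial()
print(f"difference in means: {estimate:.3f}")
print(f"E(Y^C|D=1) - E(Y^C|D=0) in the sample: {bias:.3f}")
```

In the simulation the gap in no-program outcomes between the two groups is close to zero, so the simple difference in sample means lands near the assumed impact.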
15
![Page 16: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/16.jpg)
Alternative solutions 2
Non-experimental evaluation (“Quasi-experimental”; “observational studies”)
One of two (non-nested) conditional independence assumptions:
1. Placement is independent of outcomes given X
   => Single difference methods assuming conditionally exogenous placement
   OR placement is independent of outcome changes
   => Double difference methods
2. A correlate of placement Z is independent of outcomes given D and X
   => Instrumental variables estimator
16
![Page 17: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/17.jpg)
2. Impact Evaluation methods
Differ in how they construct the counterfactual
• Cross sectional Differences
• Before and After (Reflexive comparisons)
• Difference in Difference (Dif in Dif)
• Experimental methods/Randomization
• Quasi-experimental methods
  • Propensity score matching (PSM) (not discussed)
  • Regression discontinuity design (RDD)
• Econometric methods
  • Instrumental variables/Encouragement designs
17
![Page 18: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/18.jpg)
Cross-Sectional Estimator
Counterfactual for participants: Non-participants in the same village or households in similar villages
But then: Measured Impact = E(Y | T=1) - E(Y | T=0) = True Impact + MSB
where MSB = Mean Selection Bias = MA(T=1) - MA(T=0)
If MA(T=1) > MA(T=0) then MSB > 0 and measured impact > true impact
Note: An Experimental or Randomized Design assigns individuals into T=1 and T=0 groups randomly.
Consequence: MA(T=1) = MA(T=0), so MSB = 0 and Measured Impact = True Impact
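The decomposition Measured Impact = True Impact + MSB can be checked on a toy data set. The sketch below is illustrative only: the six households, the impact of 5 and the field name `ma` are hypothetical, and reading MA as a pre-existing attribute that raises outcomes even without the program is an assumption, since the slides do not spell the acronym out.

```python
from statistics import mean

TRUE_IMPACT = 5.0  # hypothetical program effect

# Hypothetical households: "ma" stands in for the slide's MA term.
# Households with higher "ma" are the ones that end up in the program (T=1).
households = [
    {"ma": 10.0, "T": 0}, {"ma": 12.0, "T": 0}, {"ma": 14.0, "T": 0},
    {"ma": 16.0, "T": 1}, {"ma": 18.0, "T": 1}, {"ma": 20.0, "T": 1},
]

def outcome(h):
    """Observed outcome: the attribute plus the program effect if treated."""
    return h["ma"] + TRUE_IMPACT * h["T"]

def group_mean(value_fn, t):
    """Mean of value_fn over households with treatment status t."""
    return mean(value_fn(h) for h in households if h["T"] == t)

measured_impact = group_mean(outcome, 1) - group_mean(outcome, 0)
msb = group_mean(lambda h: h["ma"], 1) - group_mean(lambda h: h["ma"], 0)

print("measured impact:", measured_impact)
print("mean selection bias:", msb)
print("measured impact - MSB:", measured_impact - msb)
```

With these numbers the cross-sectional comparison more than doubles the assumed impact, and subtracting the MSB recovers it.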
18
![Page 19: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/19.jpg)
Before and After Estimator
Counterfactual for participants: the participants themselves before the start of the program
Steps:
Collect baseline data on potential participants before the program
Compare with data on the same individuals (villages) after the program
Take the difference (after – before) or use a regression with a dummy variable identifying round 2 obs
This allows for the presence of selection bias assuming it is time invariant and enters additively in the model
19
![Page 20: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/20.jpg)
Before and After Estimator
[Figure: Y (observed) plotted against time, rising from Y0 at t=0 to Y1 at t=1, with the Intervention marked on the time axis.]
20
![Page 21: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/21.jpg)
Shortcomings of Before and After (BA) comparisons
Not different from “Results Based” Monitoring
Overestimates impacts (whenever outcomes would have improved anyway)
Measured Impact = True Impact + Trend
Attributes all changes over time to the program (i.e. assumes that there would have been no trend, or no changes in outcomes, in the absence of the program)
Note: Difference in difference may be thought of as a method that tries to improve upon the BA method
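The relation Measured Impact = True Impact + Trend is easy to see with numbers. In this minimal sketch the baseline values, the impact of 3 and the trend of 2 are hypothetical, and `before_after` is an illustrative helper rather than a function from any library.

```python
def before_after(y_before, y_after):
    """Before-and-after (BA) estimate: mean change for the same participants."""
    changes = [after - before for before, after in zip(y_before, y_after)]
    return sum(changes) / len(changes)

TRUE_IMPACT = 3.0  # hypothetical program effect
TREND = 2.0        # hypothetical change that would have happened anyway

baseline = [40.0, 45.0, 50.0, 55.0]                     # round 1 (t=0)
followup = [y + TREND + TRUE_IMPACT for y in baseline]  # round 2 (t=1)

measured = before_after(baseline, followup)
print("BA estimate:", measured)
print("true impact + trend:", TRUE_IMPACT + TREND)
```

The BA estimate cannot separate the two components, because the participants' own past is the only counterfactual it uses.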
21
![Page 22: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/22.jpg)
Difference-in-difference (DiD):
Counterfactual for participants: Observed changes over time for non-participants
Steps:
Collect baseline data on non-participants and (probable) participants before the program.
  Note: there is no particular assumption about how the non-participants are selected.
  Could use arbitrary comparison group
  Or could use comparison group selected via PSM/RDD
Compare with data after the program.
Subtract the two differences, or use a regression with a dummy variable for participant.
This allows for selection bias but it must be time-invariant and additive.
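The steps above can be sketched as a short calculation on group means. All values are hypothetical: participants start 10 points above the comparison group (an additive, time-invariant gap), both groups share a trend of 4, and the assumed impact is 6. The function name `did` is illustrative.

```python
from statistics import mean

def did(treat_before, treat_after, comp_before, comp_after):
    """Difference-in-difference: change for participants minus change for non-participants."""
    change_treat = mean(treat_after) - mean(treat_before)
    change_comp = mean(comp_after) - mean(comp_before)
    return change_treat - change_comp

# Hypothetical outcomes for the two groups in the two survey rounds.
participants_before = [50.0, 54.0, 58.0]
participants_after = [60.0, 64.0, 68.0]   # baseline + trend (4) + impact (6)
comparison_before = [40.0, 44.0, 48.0]
comparison_after = [44.0, 48.0, 52.0]     # baseline + trend (4)

estimate = did(participants_before, participants_after,
               comparison_before, comparison_after)
print("DiD estimate:", estimate)
```

The level gap between the groups cancels because it appears in both rounds, and the common trend cancels because it appears in both groups.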
22
![Page 23: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/23.jpg)
Difference-in-difference (DiD): Interpretation 1
Dif-in-Dif removes the trend effect from the estimate of impact using the BA method
  True impact = Measured Impact in Treatment group (or BA) – Trend
The change in the control group provides an estimate of the trend. Subtracting the “trend” from the change in the treatment group yields the true impact of the program
  The above assumes that the trend in the C group is an accurate representation of the trend that would have prevailed in the T group in the absence of the program. That is an assumption that cannot be tested (or is very hard to test).
  What if the trend in the C group is not an accurate representation of the trend that would have prevailed in the T group in the absence of the program? Need observations on Y one period before the baseline period.
$(Y^T - Y^C)_{t=1} - (Y^T - Y^C)_{t=0} = (Y^T_{t=1} - Y^T_{t=0}) - (Y^C_{t=1} - Y^C_{t=0}) = \text{Measured Impact} - \text{Trend}$
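One common way to use an extra pre-baseline round is a "placebo" comparison: compute the same double difference between the two rounds in which no program existed. The sketch below assumes hypothetical group outcomes at t=-1, t=0 and t=1; `change_gap` and the data are illustrative, and a zero gap before the program is supporting evidence for the common-trend assumption rather than proof of it.

```python
def mean(values):
    return sum(values) / len(values)

def change_gap(t_early, t_late, c_early, c_late):
    """(Change in the T group) minus (change in the C group) between two rounds."""
    return (mean(t_late) - mean(t_early)) - (mean(c_late) - mean(c_early))

# Hypothetical outcomes by round: t=-1 (pre-baseline), t=0 (baseline), t=1 (follow-up).
T = {-1: [48.0, 52.0], 0: [50.0, 54.0], 1: [58.0, 62.0]}
C = {-1: [38.0, 42.0], 0: [40.0, 44.0], 1: [42.0, 46.0]}

pre_trend_gap = change_gap(T[-1], T[0], C[-1], C[0])  # no program in either round
did_estimate = change_gap(T[0], T[1], C[0], C[1])     # program starts after t=0

print("pre-program trend gap:", pre_trend_gap)
print("DiD estimate:", did_estimate)
```

If the pre-program gap were clearly different from zero, the C group's change would be a doubtful estimate of the T group's trend.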
23
![Page 24: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/24.jpg)
Difference-in-difference (DiD): Interpretation 2
Dif-in-Dif estimator eliminates selection bias under the assumption that selection bias enters additively and does not change over time
$(Y^T - Y^C)_{t=1} - (Y^T - Y^C)_{t=0} = \text{True impact} + (MSB_{t=1} - MSB_{t=0})$. The latter term drops out if $MSB_{t=1} = MSB_{t=0}$, i.e. MSB is time invariant
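The result follows from applying the cross-sectional relation (Measured Impact = True Impact + MSB) in each round. As a sketch, and assuming the program is not yet operating at t=0 so that the baseline gap is pure selection bias:

```latex
\begin{aligned}
(Y^T - Y^C)_{t=1} &= \text{True impact} + MSB_{t=1} \\
(Y^T - Y^C)_{t=0} &= MSB_{t=0} \\
(Y^T - Y^C)_{t=1} - (Y^T - Y^C)_{t=0} &= \text{True impact} + \left(MSB_{t=1} - MSB_{t=0}\right)
\end{aligned}
```

When $MSB_{t=1} = MSB_{t=0}$ the bracketed term is zero and the double difference equals the true impact.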
24
![Page 25: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/25.jpg)
Selection bias
[Figure: outcomes plotted against time (t=0, t=1). Labels: Y0, Y1, Y1*, Impact, and Selection bias marked as the gap between the groups.]
25
![Page 26: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/26.jpg)
Diff-in-diff requires that the bias is additive and time-invariant
[Figure: outcomes plotted against time (t=0, t=1). Labels: Y0, Y1, Y1*, Impact.]
26
![Page 27: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/27.jpg)
The method fails if the comparison group is on a different trajectory
[Figure: outcomes plotted against time (t=0, t=1). Labels: Y0, Y1, Y1*, Impact?]
27
![Page 28: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/28.jpg)
3. Experimental Designs
28
![Page 29: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/29.jpg)
The experimental/randomized design
• In a randomized design the control group (randomly assigned out of the program) provides the counterfactual (what would have happened to the treatment group without the program)
• Can apply CSDIFF estimator (ex-post observations only)
• Or DiD (if have data in baseline and after start of program)
• Randomization equalizes the mean selection bias between T and C groups
• Note: An Experimental or Randomized Design assigns individuals into T=1 and T=0 groups randomly. Consequence: MA(T=1) = MA(T=0), so MSB = 0 and Measured Impact = True Impact
29
![Page 30: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/30.jpg)
Lessons from practice--1
Ethical objections and political sensitivities
• Deliberately denying a program to those who need it and providing the program to some who do not.p g p g
• Yes, too few resources to go around. But is randomization the fairest solution to limited resources?
• What does one condition on in conditional randomizations?
• Intention-to-treat helps alleviate these concerns
  => randomize assignment, but free to not participate
• But even then, the “randomized out” group may include people in great need.
=> Implications for design
• Choice of conditioning variables.
• Sub-optimal timing of randomization
• Selective attrition + higher costs
30
![Page 31: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/31.jpg)
Lessons from practice--2
Internal validity: Selective compliance
• Some of those assigned the program choose not to participate.
• Impacts may only appear if one corrects for selective take-up.
• Randomized assignment as IV for participation
• Proempleo example: impacts of training only appear if one corrects for selective take-up
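Using randomized assignment as an instrument for participation can be illustrated with the Wald ratio: the intention-to-treat effect divided by the difference in take-up between the assigned and non-assigned groups. This is one standard way to make the correction, not necessarily the procedure used in the Proempleo evaluation, and every number below is hypothetical (those who take up start at 10 and gain 5; those who decline start at 20).

```python
from statistics import mean

# Hypothetical records: z = randomly assigned to the program,
# d = actually participated, y = outcome.
records = (
    [{"z": 1, "d": 1, "y": 15.0}] * 50    # assigned, took up the program
    + [{"z": 1, "d": 0, "y": 20.0}] * 50  # assigned, declined
    + [{"z": 0, "d": 0, "y": 10.0}] * 50  # not assigned, would have taken up
    + [{"z": 0, "d": 0, "y": 20.0}] * 50  # not assigned, would have declined
)

def mean_where(field, key, value):
    """Mean of `field` over records where `key` equals `value`."""
    return mean(r[field] for r in records if r[key] == value)

# Naive comparison of participants with everyone else (selective take-up).
naive = mean_where("y", "d", 1) - mean_where("y", "d", 0)
# Intention-to-treat: compare by random assignment, ignoring actual take-up.
itt = mean_where("y", "z", 1) - mean_where("y", "z", 0)
# Difference in participation rates caused by assignment.
take_up = mean_where("d", "z", 1) - mean_where("d", "z", 0)
# Wald / IV estimate of the impact on those who participate because assigned.
iv_estimate = itt / take_up

print("naive:", round(naive, 3))
print("ITT:", itt, "take-up difference:", take_up, "IV estimate:", iv_estimate)
```

In this example the naive comparison is negative because participants start from a lower level, while the IV estimate recovers the assumed gain of 5.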
31
![Page 32: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/32.jpg)
Lessons from practice--3
External validity: inference for scaling up
• Systematic differences between characteristics of people normally attracted to a program and those randomly assigned (“randomization bias”: Heckman-Smith)
• One ends up evaluating a different program to the one actually implemented
=> Difficulty in extrapolating results from a pilot experiment to the whole population
32
![Page 33: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/33.jpg)
PROGRESA/Oportunidades
What is PROGRESA?
Targeted cash transfer program conditioned on families visiting health centers regularly and on children attending school regularly.
Cash transfer: alleviates short-term poverty
Human capital investment: alleviates poverty in the long-term
By the end of 2004: program (renamed Oportunidades) covered nearly 5 million families, in 72,000 localities in all 31 states (budget of about US$2.5 billion).
33
![Page 34: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/34.jpg)
CCT programs (like PROGRESA)
Expanding
Brazil: Bolsa Familia
Colombia: Familias en Acción
Honduras: Programa de Asignación Familiar (PRAF)
Jamaica: Program of Advancement through Health and Education (PATH)
Nicaragua: Red de Protección Social (RPS)
Turkey
Ecuador: Bono Solidario
Philippines, Indonesia, Peru
Bangladesh: Food for Education
34
![Page 35: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/35.jpg)
Program Description & Benefits
Education component
A system of educational grants (details below)
Monetary support for the acquisition of school materials/supplies
(The above benefits are tied to enrollment and regular (85%) school attendance)
Improved schools and quality of education (teacher salaries)
35
![Page 36: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/36.jpg)
PROGRESA/OPORTUNIDADES: Evaluation Design
EXPERIMENTAL DESIGN: Program randomized at the locality level (Pipeline experimental design)
IFPRI not present at time of selection of T and C localities
Report examined differences between T and C for more than 650 variables at the locality level (comparison of locality means) and at the household level (comparison of household means)
Sample of 506 localities
– 186 control (no program)
– 320 treatment (receive program)
24,077 Households (hh)
78% beneficiaries
Differences between eligible hh and actual beneficiaries receiving benefits
Densification (initially 52% of hh classified as eligible)
36
![Page 37: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/37.jpg)
PROGRESA Evaluation Surveys/Data
BEFORE initiation of program:
– Oct/Nov 97: Household census to select beneficiaries
– March 98: consumption, school attendance, health
AFTER initiation of program:
– Nov 98
– June 99
– Nov/Dec 99
Included survey of beneficiary households regarding operations
37
![Page 38: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/38.jpg)
Table: A Decomposition of the Sample of All Households in Treatment and Control Villages

| Household Eligibility Status | Discriminant Score (‘puntaje’) | TREATMENT LOCALITY where PROGRESA is in operation (T=1). Localities: 320, Households: 14,856 | CONTROL LOCALITY where PROGRESA operations are delayed (T=0). Localities: 186, Households: 9,221 |
| --- | --- | --- | --- |
| Eligible for PROGRESA benefits (B=1) | Low (Below Threshold) | A: B=1, T=1 | B: B=1, T=0 |
| Non-Eligible for PROGRESA benefits (B=0) | High (Above Threshold) | C: B=0, T=1 | D: B=0, T=0 |
38
![Page 39: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/39.jpg)
[Figure: E(Y) for the Treatment and Control groups, Before and After the program, with the “2DIF impact estimate” marked.]
39
![Page 40: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/40.jpg)
Using regressions to get 2DIF estimates:
Limit sample to eligible households in treatment and control and run regression:
$Y(i,t) = \alpha + \beta_T T(i) + \beta_R R2 + \beta_{TR} \left(T(i) * R2\right) + \sum_j \theta_j X_j + \eta(i,v,t)$
• Y(i,t) denotes the value of the outcome indicator in household (or individual) i in period t,
• alpha, beta and theta are fixed parameters to be estimated,
• T(i) is a binary variable taking the value of 1 if the household belongs in a treatment community and 0 otherwise (i.e., for control communities),
• R2 is a binary variable equal to 1 for the second round of the panel (or the round after the initiation of the program) and equal to 0 for the first round (the round before the initiation of the program),
• X is a vector of household (and possibly village) characteristics;
• last term is an error term summarizing the influence of random disturbances.
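A regression of this form can be sketched without external libraries. The example below is a simplified illustration, not the evaluation's actual code: it drops the X covariates, uses a tiny hypothetical panel with two households per cell, and solves the normal equations with a hand-written helper (`solve` and `ols` are illustrative names).

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical panel: (T, R2, outcome) for eligible households.
data = [
    (0, 0, 39.0), (0, 0, 41.0),   # control, round 1
    (0, 1, 43.0), (0, 1, 45.0),   # control, round 2
    (1, 0, 49.0), (1, 0, 51.0),   # treatment, round 1
    (1, 1, 59.0), (1, 1, 61.0),   # treatment, round 2
]
rows = [[1.0, t, r2, t * r2] for t, r2, _ in data]   # constant, T, R2, T*R2
y = [obs[2] for obs in data]

alpha, beta_t, beta_r, beta_tr = ols(rows, y)
print("alpha, beta_T, beta_R, beta_TR:",
      [round(v, 6) for v in (alpha, beta_t, beta_r, beta_tr)])
```

Because the model is saturated in T and R2, the coefficient on the interaction equals the double difference of the four cell means (here 40, 44, 50 and 60), which is the 2DIF estimate.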
40
![Page 41: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/41.jpg)
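$\text{CSDIF} = E(Y \mid T=1, R2=1, X) - E(Y \mid T=0, R2=1, X) = \beta_T + \beta_{TR}$
$\text{BADIF} = E(Y \mid T=1, R2=1, X) - E(Y \mid T=1, R2=0, X) = \beta_R + \beta_{TR}$
$\text{DIF2} = \beta_{TR} = \left[E(Y \mid T=1, R2=1, X) - E(Y \mid T=1, R2=0, X)\right] - \left[E(Y \mid T=0, R2=1, X) - E(Y \mid T=0, R2=0, X)\right]$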
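These expressions follow from the four conditional means implied by the regression on T, R2, their interaction and X. As a sketch, and assuming the error term has zero mean conditional on T, R2 and X:

```latex
\begin{aligned}
E(Y \mid T=1, R2=1, X) &= \alpha + \beta_T + \beta_R + \beta_{TR} + \textstyle\sum_j \theta_j X_j \\
E(Y \mid T=1, R2=0, X) &= \alpha + \beta_T + \textstyle\sum_j \theta_j X_j \\
E(Y \mid T=0, R2=1, X) &= \alpha + \beta_R + \textstyle\sum_j \theta_j X_j \\
E(Y \mid T=0, R2=0, X) &= \alpha + \textstyle\sum_j \theta_j X_j
\end{aligned}
```

Subtracting the third line from the first gives the cross-sectional difference, subtracting the second from the first gives the before-after difference, and the difference between (first minus second) and (third minus fourth) leaves only $\beta_{TR}$.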
41
![Page 42: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/42.jpg)
Evaluation Tools
Formal surveys
(Semi)-structured observations and interviews
Focus groups with stakeholders (beneficiaries, local leaders, local PROGRESA officials, doctors, nurses, school teachers, promotoras)
42
![Page 43: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/43.jpg)
All Boys 12-17 Years Old
[Figure: Percent Attending School Last Week (vertical axis, 0.2 to 0.7) by Survey Round (Nov-97, Nov-98, Jun-99, Nov-99), Treatment vs. Control.]
43
![Page 44: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/44.jpg)
All Girls 12-17 Years Old
[Figure: Percent Attending School Last Week (vertical axis, 0.2 to 0.7) by Survey Round (Nov-97, Nov-98, Jun-99, Nov-99), Treatment vs. Control.]
44
![Page 45: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/45.jpg)
All Boys 12-17 Years Old
[Figure: Percent Working Last Week (vertical axis, 0.20 to 0.40) by Survey Round (Nov-97, Nov-98, Jun-99, Nov-99), Treatment vs. Control.]
45
![Page 46: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/46.jpg)
[Figure: All Girls 12-17 Years Old. Percent working last week (vertical axis) by survey round (Nov-97, Nov-98, Jun-99, Nov-99); treatment vs. control.]
![Page 47: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/47.jpg)
4a. Quasi-Experimental Designs: Propensity Score Matching (PSM)
![Page 48: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/48.jpg)
Introduction
• By consensus, a randomized design provides the most credible method of evaluating program impact.
• But experimental designs are difficult to implement and are accompanied by political risks that jeopardize the chances of implementing them:
  • The idea of having a comparison/control group is very unappealing to program managers and governments.
  • Ethical issues are involved in withholding benefits from a certain group of households.
![Page 49: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/49.jpg)
Propensity-score matching (PSM)
• Builds on this fundamental idea of the randomized design and uses it to come up with a control group (under some maintained/untested assumptions).
• In an experimental design, treatment and control units have an equal probability of joining the program. If two people apply and we decide who gets the program by a coin toss, then each person has a 50% probability of joining the program.
![Page 50: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/50.jpg)
Propensity-score matching (PSM): Match on the probability of participation.
• Ideally we would match on the entire vector X of observed characteristics. However, this is practically impossible: X could be huge.
• PSM: match on the basis of the propensity score (Rosenbaum and Rubin):
$$P(X_i) = \Pr(D_i = 1 \mid X_i)$$
• This assumes that participation is independent of outcomes given X. If there is no bias given X, then there is no bias given P(X).
![Page 51: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/51.jpg)
Steps in score matching:
1: Representative, highly comparable surveys of the non-participants and participants.
2: Pool the two samples and estimate a logit (or probit) model of program participation. Predicted values are the “propensity scores”.
3: Restrict samples to assure common support.
Failure of common support is an important source of bias in observational studies (Heckman et al.)
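Steps 2 and 3 can be illustrated with a minimal sketch, assuming a single observed covariate and a tiny made-up pooled sample. The function `fit_logit`, the data, and the min/max rule used for common support are all illustrative assumptions, not the procedure used in any study cited here.

```python
import math

def fit_logit(x, d, iterations=25):
    """Newton-Raphson fit of Pr(D=1 | x) = 1 / (1 + exp(-(a + b*x)))."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to (a, b).
        g0 = sum(di - pi for di, pi in zip(d, p))
        g1 = sum((di - pi) * xi for di, pi, xi in zip(d, p, x))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: (a, b) += H^{-1} g.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical pooled sample: one covariate x and a participation dummy d.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
d = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

# Step 2: the predicted probabilities are the propensity scores.
a, b = fit_logit(x, d)
scores = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]

# Step 3: keep only units whose score lies in the overlap of the two groups' ranges.
treated = [s for s, di in zip(scores, d) if di == 1]
controls = [s for s, di in zip(scores, d) if di == 0]
lo = max(min(treated), min(controls))
hi = min(max(treated), max(controls))
on_support = [xi for xi, s in zip(x, scores) if lo <= s <= hi]
print(on_support)
```

In applied work the logit would normally include many covariates and be estimated with a statistics package; the hand-rolled Newton step is used here only to keep the sketch dependency-free.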
![Page 52: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/52.jpg)
Propensity-score matching (PSM)
• You choose a control group by running a logit/probit where on the LHS you have a binary variable (=1 if a person is in the program, 0 otherwise) as a function of observed characteristics.
• Based on this logit/probit, one can derive the predicted probability of participating in the program (based on the X, or observed characteristics). You then choose a control group for each treatment individual/hh using hh that are NOT in the program and have a predicted probability of being in the program very close to that of the person who is in the program (nearest neighbor matching, kernel matching, etc.).
• Key assumption: selection into the program is based on observables (in other words, unobservables are not important in determining participation in the program).
![Page 53: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/53.jpg)
[Figure: Density of scores for participants; horizontal axis: propensity score, from 0 to 1.]
![Page 54: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/54.jpg)
[Figure: Density of scores for non-participants; horizontal axis: propensity score, from 0 to 1.]
![Page 55: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/55.jpg)
[Figure: Density of scores for non-participants, with the region of common support marked on the propensity-score axis (0 to 1).]
![Page 56: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/56.jpg)
Steps in PSM cont.
5: For each participant find a sample of non-participants that have similar propensity scores.
6: Compare the outcome indicators. The difference is the estimate of the gain due to the program for that observation.
7: Calculate the mean of these individual gains to obtain the average overall gain. Various weighting schemes are possible.
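Steps 5 to 7 can be sketched as follows, assuming propensity scores have already been estimated and using nearest-neighbor matching with replacement. The function name `nearest_neighbor_gain` and the data are hypothetical.

```python
def nearest_neighbor_gain(participants, non_participants, k=1):
    """Mean gain from matching each participant to its k nearest non-participants.

    Both arguments are lists of (propensity_score, outcome) pairs.
    """
    gains = []
    for score_j, outcome_j in participants:
        # Step 5: the k non-participants with the closest propensity scores.
        matches = sorted(non_participants,
                         key=lambda unit: abs(unit[0] - score_j))[:k]
        counterfactual = sum(outcome for _, outcome in matches) / len(matches)
        # Step 6: individual gain = own outcome minus matched outcome.
        gains.append(outcome_j - counterfactual)
    # Step 7: average the individual gains.
    return sum(gains) / len(gains)

# Hypothetical (score, outcome) pairs.
participants = [(0.60, 10.0), (0.70, 12.0), (0.80, 15.0)]
non_participants = [(0.55, 8.0), (0.72, 9.0), (0.30, 5.0), (0.85, 11.0)]

average_gain = nearest_neighbor_gain(participants, non_participants, k=1)
print(average_gain)
```

With k=1 each participant is compared with a single closest non-participant; larger k averages over more matches, which trades some bias for lower variance.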
![Page 57: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/57.jpg)
The mean impact estimator
$$\bar{G} = \sum_{j=1}^{P} \Big( Y_{j1} - \sum_{i=1}^{NP} W_{ij} Y_{ij0} \Big) \Big/ P$$
Various weighting schemes:
• Nearest k neighbors
• Kernel weights (Heckman et al.):
$$K_{ij} = K[P(X_i) - P(X_j)], \qquad W_{ij} = K_{ij} \Big/ \sum_{j=1}^{P} K_{ij}$$
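A kernel-weighted version of the mean impact estimator might look like the sketch below. Several choices are assumptions made for illustration: the Gaussian kernel, the `bandwidth` parameter, and normalizing the weights across non-participants so that each participant's weights sum to one.

```python
import math

def gaussian_kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_matching_gain(participants, non_participants, bandwidth=0.05):
    """Mean over participants j of Y_j1 minus a kernel-weighted mean of Y_i0.

    Both arguments are lists of (propensity_score, outcome) pairs.
    """
    total = 0.0
    for p_j, y_j1 in participants:
        # Kernel weight of each non-participant i, based on the score distance.
        k = [gaussian_kernel((p_i - p_j) / bandwidth) for p_i, _ in non_participants]
        k_sum = sum(k)
        counterfactual = sum(
            k_i * y_i0 for k_i, (_, y_i0) in zip(k, non_participants)
        ) / k_sum
        total += y_j1 - counterfactual
    return total / len(participants)

# Hypothetical (score, outcome) pairs.
participants = [(0.60, 10.0), (0.70, 12.0), (0.80, 15.0)]
non_participants = [(0.55, 8.0), (0.72, 9.0), (0.30, 5.0), (0.85, 11.0)]

gain = kernel_matching_gain(participants, non_participants, bandwidth=0.05)
print(gain)
```

As the bandwidth shrinks, almost all weight falls on the closest non-participant and the estimate approaches the one-nearest-neighbor result.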
![Page 58: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/58.jpg)
Propensity-score weighting
• PSM removes bias under the conditional exogeneity assumption.
• However, it is not the most efficient estimator.
• Hirano, Imbens and Ridder show that weighting the control observations according to their propensity score yields a fully efficient estimator.
• Regression implementation for the common impact model:
$$Y_i = \alpha + \beta D_i + \varepsilon_i$$
with weights of unity for the treated units and $\hat{P}(X)/(1-\hat{P}(X))$ for the controls.
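The weighted regression can be sketched in a few lines, assuming the propensity scores are already estimated. The function `psw_impact` and the data are hypothetical. With a binary D, the weighted least squares slope equals the mean outcome of the treated minus the weighted mean outcome of the controls.

```python
def psw_impact(y, d, pscore):
    """Weighted least squares slope of y on d (regression includes an intercept)."""
    # Weights: 1 for treated units, P/(1-P) for controls.
    w = [1.0 if di == 1 else pi / (1.0 - pi) for di, pi in zip(d, pscore)]
    total = sum(w)
    d_bar = sum(wi * di for wi, di in zip(w, d)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (di - d_bar) * (yi - y_bar) for wi, di, yi in zip(w, d, y))
    var = sum(wi * (di - d_bar) ** 2 for wi, di in zip(w, d))
    return cov / var

# Hypothetical outcomes, treatment dummies and estimated propensity scores.
y = [10.0, 12.0, 15.0, 8.0, 9.0, 5.0, 11.0]
d = [1, 1, 1, 0, 0, 0, 0]
pscore = [0.60, 0.70, 0.80, 0.55, 0.72, 0.30, 0.85]

impact = psw_impact(y, d, pscore)
print(impact)
```

Controls with a high propensity score get large weights, so scores close to 1 among controls can dominate the estimate; trimming such units is a common precaution.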
![Page 59: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/59.jpg)
How does PSM compare to an experiment?
• PSM is the observational analogue of an experiment in which placement is independent of outcomes.
• The difference is that a pure experiment does not require the untestable assumption of independence conditional on observables.
• Thus PSM requires good data.
• Example of Argentina’s Trabajar program:
  • Plausible estimates using matching on good data
  • Implausible estimates using weaker data
![Page 60: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/60.jpg)
How does PSM differ from OLS?
• PSM is a non-parametric method (fully non-parametric in outcome space; optionally non-parametric in assignment space).
• Restricting the analysis to common support => PSM weights the data very differently to standard OLS regression.
• In practice, the results can look very different!
![Page 61: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/61.jpg)
How does PSM perform relative to other methods?
• In comparisons with results of a randomized experiment on a US training program, PSM gave a good approximation (Heckman et al.; Dehejia and Wahba).
• Better than the non-experimental regression-based methods studied by Lalonde for the same program.
• However, robustness has been questioned (Smith and Todd).
![Page 62: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/62.jpg)
Lessons on matching methods
• When neither randomization nor a baseline survey is feasible, careful matching is crucial to control for observable heterogeneity.
• Validity of matching methods depends heavily on data quality: highly comparable surveys; similar economic environment.
• Common support can be a problem (esp. if treatment units are lost).
• Look for heterogeneity in impact; average impact may hide important differences in the characteristics of those who gain or lose from the intervention.
![Page 63: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/63.jpg)
4b. Quasi-Experimental Designs: Regression Discontinuity Design (RDD)
![Page 64: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/64.jpg)
Exploiting program design
Pipeline comparisons
• Applicants who have not yet received the program form the comparison group
• Assumes exogenous assignment amongst applicants
• Reflects latent selection into the program
![Page 65: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/65.jpg)
Lessons from practice
• Know your program well: program design features can be very useful for identifying impact.
• Know your setting well too: is it plausible that outcomes are continuous under the counterfactual?
• But what if you end up changing the program to identify impact? You have evaluated something else!
![Page 66: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/66.jpg)
Introduction
• Alternative: quasi-experimental methods attempting to equalize selection bias between treatment and control groups.
• Discuss a paper using PROGRESA data (again): one of the first to evaluate the performance of RDD in a setting where it can be compared to experimental estimates.
• Focus on school attendance and work of 12-16 yr old boys and girls.
![Page 67: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/67.jpg)
Regression Discontinuity Design: RDD
Discontinuity designs
• Participate if score M < m
• Impact = $E(Y_i^T \mid M_i = m) - E(Y_i^C \mid M_i = m)$
• Key identifying assumption: no discontinuity in counterfactual outcomes at m
![Page 68: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/68.jpg)
Indexes are common in targeting of social programs
• Anti-poverty programs targeted to households below a given poverty index
• Pension programs targeted to population above a certain age
• Scholarships targeted to students with high scores on standardized tests
• CDD programs awarded to NGOs that achieve highest scores
• Others: credit scores in bank lending
![Page 69: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/69.jpg)
Advantages of RDD for Evaluation
• RDD yields an unbiased estimate of the treatment effect at the discontinuity.
• Can often take advantage of a known rule for assigning the benefit; such rules are common in the design of social policy.
• No need to “exclude” a group of eligible households/individuals from treatment.
![Page 70: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/70.jpg)
Potential Disadvantages of RD
• Local treatment effects cannot be generalized (especially if there is heterogeneity of impacts).
• Power: the effect is estimated at the discontinuity, so we generally have fewer observations than in a randomized experiment with the same sample size.
• Specification can be sensitive to functional form: make sure the relationship between the assignment variable and the outcome variable is correctly modeled, including:
  • Nonlinear relationships
  • Interactions
![Page 71: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/71.jpg)
Some Background on PROGRESA’s targeting
Two-stage selection process:
• Geographic targeting (used census data to identify poor localities)
• Within-village household-level targeting (village household census)
  • Used hh income, assets, and demographic composition to estimate the probability of being poor (income per capita < standard food basket).
  • Discriminant analysis applied separately by region
  • Discriminant score of each household compared to a threshold value (high DS = non-eligible, low DS = eligible)
• Initially 52% eligible, then revised selection process so that 78% eligible. But many of the “new poor” households did not receive benefits.
![Page 72: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/72.jpg)
Figure 1: Kernel Densities of Discriminant Scores and Threshold points by region
[Figure panels: Region 3, Region 4, Region 5, Region 6, Region 12, Region 27, Region 28; horizontal axis: discriminant score; vertical axis: density.]
![Page 73: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/73.jpg)
The RDD method-1
• A quasi-experimental approach based on the discontinuity of the treatment assignment mechanism.
• Sharp RD design
  • Individuals/households are assigned to treatment (T) and control (NT) groups solely on the basis of an observed continuous measure such as the discriminant score DS. For example, B=1 if and only if DS<=COS (B=1: eligible beneficiary) and B=0 otherwise. The propensity score is a step function that is discontinuous at the point DS=COS.
  • Analogous to selection on observables only.
  • Violates the strong ignorability assumption of Rosenbaum and Rubin (1983), which also requires the overlap condition.
![Page 74: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/74.jpg)
The RDD method-2
• Fuzzy RD design
  • Treatment assignment depends on an observed continuous variable such as the discriminant score DS, but in a stochastic manner. The propensity score is S-shaped and is discontinuous at the point DS=COS.
  • Analogous to selection on observables and unobservables.
  • Allows for imperfect compliance (self-selection, attrition) among eligible beneficiaries and contamination of the comparison group by non-compliance (substitution bias).
![Page 75: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/75.jpg)
[Figure: Regression Discontinuity Design; treatment assignment in sharp (solid line) and fuzzy (dashed line) designs.]
![Page 76: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/76.jpg)
Kernel Regression Estimator of Treatment Effect with a Sharp RDD
$$\text{Impact}(COS) = Y^{-} - Y^{+} = \lim_{DS_i \uparrow COS} E(Y_i \mid DS_i) - \lim_{DS_i \downarrow COS} E(Y_i \mid DS_i)$$
where
$$Y^{-} = \frac{\sum_{i=1}^{n} Y_i \, \alpha_i \, K(u_i)}{\sum_{i=1}^{n} \alpha_i \, K(u_i)} \quad \text{and} \quad Y^{+} = \frac{\sum_{i=1}^{n} Y_i \, (1-\alpha_i) \, K(u_i)}{\sum_{i=1}^{n} (1-\alpha_i) \, K(u_i)}$$
Here $\alpha_i = 1$ for observations on the eligible side of the cutoff ($DS_i \le COS$) and 0 otherwise, and $K(u_i)$ is the kernel weight of observation i.
Alternative estimators (differ in the way local information is exploited and in the set of regularity conditions required to achieve asymptotic properties):
• Local Linear Regression (HTV, 2001)
• Partially Linear Model (Porter, 2003)
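The one-sided kernel averages can be sketched as below. The triangular kernel, the `bandwidth` argument, and the choice u = (DS - COS) / bandwidth are assumptions made for illustration, as are the data; eligibility follows the rule DS <= COS used in the sharp design.

```python
def triangular_kernel(u):
    return max(0.0, 1.0 - abs(u))

def sharp_rdd_impact(ds, y, cos, bandwidth):
    """Kernel-weighted mean outcome just below the cutoff minus just above it."""
    num_below = den_below = num_above = den_above = 0.0
    for ds_i, y_i in zip(ds, y):
        k = triangular_kernel((ds_i - cos) / bandwidth)
        if ds_i <= cos:
            # Eligible side of the cutoff (treated under a sharp design).
            num_below += y_i * k
            den_below += k
        else:
            # Non-eligible side of the cutoff (comparison group).
            num_above += y_i * k
            den_above += k
    return num_below / den_below - num_above / den_above

# Hypothetical discriminant scores and school-attendance rates.
ds = [700, 720, 740, 750, 760, 780, 800]
y = [0.8, 0.8, 0.8, 0.8, 0.7, 0.7, 0.7]

impact = sharp_rdd_impact(ds, y, cos=750, bandwidth=60)
print(impact)
```

The function divides by the sum of kernel weights on each side, so it needs at least one observation within the bandwidth on both sides of the cutoff.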
![Page 77: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/77.jpg)
Main Results
• Overall, the performance of the RDD is remarkably good.
  • The RDD estimates of program impact agree with the experimental estimates in 10 out of the 12 possible cases.
  • The two cases in which the RDD method failed to reveal any significant program impact, on the school attendance of boys and girls, are in the first year of the program (round 3).
![Page 78: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/78.jpg)
TABLE 3a: Estimates of Program Impact By Round (BOYS 12-16 yrs old)
Columns (1)-(3): Experimental Estimates. Columns (4)-(9): RDD Impact Estimates using different kernel functions.

| SCHOOL | 2DIF (1) | CSDIF (2) | CSDIF-50 (3) | Uniform (4) | Biweight (5) | Epanechnikov (6) | Triangular (7) | Quartic (8) | Gaussian (9) |
|---|---|---|---|---|---|---|---|---|---|
| Round 1 | n.a. | 0.013 | -0.001 | -0.053 | -0.016 | -0.031 | -0.018 | -0.016 | -0.050 |
| st. error | | 0.018 | 0.028 | 0.027 | 0.031 | 0.029 | 0.031 | 0.031 | 0.021 |
| Round 3 | 0.050 | 0.064 | 0.071 | 0.020 | 0.008 | 0.010 | 0.008 | 0.008 | 0.005 |
| st. error | 0.017 | 0.019 | 0.028 | 0.028 | 0.034 | 0.031 | 0.033 | 0.034 | 0.022 |
| Round 5 | 0.048 | 0.061 | 0.099 | 0.052 | 0.072 | 0.066 | 0.069 | 0.072 | 0.057 |
| st. error | 0.020 | 0.019 | 0.030 | 0.028 | 0.032 | 0.030 | 0.032 | 0.032 | 0.021 |

Nobs: 16331, 4279, 16331. R-Squared: 0.25, 0.21.

| WORK | 2DIF (1) | CSDIF (2) | CSDIF-50 (3) | Uniform (4) | Biweight (5) | Epanechnikov (6) | Triangular (7) | Quartic (8) | Gaussian (9) |
|---|---|---|---|---|---|---|---|---|---|
| Round 1 | n.a. | 0.018 | 0.007 | 0.012 | -0.016 | -0.004 | -0.013 | -0.016 | 0.025 |
| st. error | | 0.019 | 0.029 | 0.027 | 0.032 | 0.029 | 0.031 | 0.032 | 0.021 |
| Round 3 | -0.037 | -0.018 | -0.007 | 0.007 | -0.004 | 0.002 | 0.001 | -0.004 | 0.005 |
| st. error | 0.023 | 0.017 | 0.029 | 0.024 | 0.028 | 0.026 | 0.028 | 0.028 | 0.019 |
| Round 5 | -0.046 | -0.028 | -0.037 | -0.031 | -0.029 | -0.030 | -0.029 | -0.029 | -0.028 |
| st. error | 0.025 | 0.017 | 0.025 | 0.024 | 0.028 | 0.026 | 0.027 | 0.028 | 0.019 |

Nobs: 4279, 16331. R-Squared: 0.19, 0.16.

NOTES:
Estimates in bold have t-values >= 2
Treatment Group for Experimental & RDD Estimates: Beneficiary Households in Treatment Villages (Group A)
Comparison Group for Experimental Estimates: Eligible Households in Control Villages (Group B)
Comparison Group for RDD Estimates: NonEligible Households in Treatment Villages (Group C)
![Page 79: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/79.jpg)
TABLE 3b: Estimates of Program Impact By Round (GIRLS 12-16 yrs old)
Columns (1)-(3): Experimental Estimates. Columns (4)-(9): RDD Impact Estimates using different kernel functions.

| SCHOOL | 2DIF (1) | CSDIF (2) | CSDIF-50 (3) | Uniform (4) | Biweight (5) | Epanechnikov (6) | Triangular (7) | Quartic (8) | Gaussian (9) |
|---|---|---|---|---|---|---|---|---|---|
| Round 1 | n.a. | -0.001 | 0.000 | -0.027 | -0.025 | -0.026 | -0.025 | -0.025 | -0.035 |
| st. error | | 0.020 | 0.030 | 0.029 | 0.036 | 0.033 | 0.034 | 0.036 | 0.023 |
| Round 3 | 0.086 | 0.085 | 0.082 | 0.038 | 0.039 | 0.041 | 0.039 | 0.039 | 0.054 |
| st. error | 0.017 | 0.020 | 0.029 | 0.030 | 0.036 | 0.033 | 0.034 | 0.036 | 0.024 |
| Round 5 | 0.099 | 0.098 | 0.099 | 0.078 | 0.114 | 0.097 | 0.107 | 0.114 | 0.084 |
| st. error | 0.020 | 0.019 | 0.028 | 0.031 | 0.036 | 0.033 | 0.035 | 0.036 | 0.025 |

Nobs: 15046, 3865, 15046. R-Squared: 0.23, 0.22.

| WORK | 2DIF (1) | CSDIF (2) | CSDIF-50 (3) | Uniform (4) | Biweight (5) | Epanechnikov (6) | Triangular (7) | Quartic (8) | Gaussian (9) |
|---|---|---|---|---|---|---|---|---|---|
| Round 1 | n.a. | 0.034 | 0.000 | 0.033 | 0.026 | 0.027 | 0.027 | 0.026 | 0.030 |
| st. error | | 0.017 | 0.024 | 0.019 | 0.022 | 0.020 | 0.021 | 0.022 | 0.015 |
| Round 3 | -0.034 | 0.000 | 0.001 | 0.005 | 0.001 | 0.003 | 0.002 | 0.001 | -0.008 |
| st. error | 0.017 | 0.009 | 0.016 | 0.015 | 0.018 | 0.016 | 0.017 | 0.018 | 0.012 |
| Round 5 | -0.042 | -0.008 | -0.025 | -0.019 | -0.034 | -0.029 | -0.033 | -0.034 | -0.025 |
| st. error | 0.019 | 0.009 | 0.018 | 0.015 | 0.018 | 0.017 | 0.018 | 0.018 | 0.013 |

Nobs: 3865, 15046. R-Squared: 0.07, 0.05.

NOTES:
Estimates in bold have t-values >= 2
Treatment Group for Experimental & RDD Estimates: Beneficiary Households in Treatment Villages (Group A)
Comparison Group for Experimental Estimates: Eligible Households in Control Villages (Group B)
Comparison Group for RDD Estimates: NonEligible Households in Treatment Villages (Group C)
![Page 80: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/80.jpg)
5. Instrumental variables/Encouragement Designs
![Page 81: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/81.jpg)
5. Instrumental variables
Identifying exogenous variation using a 3rd variable
Outcome regression:
$$Y_i = \alpha + \beta D_i + \varepsilon_i$$
(D = 0,1 is our program; not random)
• “Instrument” (Z) influences participation, but does not affect outcomes given participation (the “exclusion restriction”).
• This identifies the exogenous variation in outcomes due to the program.
Treatment regression:
$$D_i = \gamma Z_i + u_i$$
![Page 82: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/82.jpg)
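Reduced-form outcome regression (substituting the treatment regression $D_i = \gamma Z_i + u_i$ into the outcome regression $Y_i = \alpha + \beta D_i + \varepsilon_i$):
$$Y_i = \alpha + \beta(\gamma Z_i + u_i) + \varepsilon_i = \alpha + \pi Z_i + \mu_i$$
where $\pi = \beta\gamma$ and $\mu_i = \beta u_i + \varepsilon_i$
Instrumental variables (two-stage least squares) estimator of impact:
$$\hat{\beta}_{IVE} = \hat{\pi}_{OLS} / \hat{\gamma}_{OLS}$$
Or:
$$Y_i = \alpha + \beta(\hat{\gamma} Z_i) + \nu_i$$
Predicted D purged of endogenous part.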
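The ratio form of the estimator can be checked on a toy data set. This is a minimal sketch, assuming a single instrument and OLS slopes from regressions that include an intercept. The data are made up: an unobserved factor `u` raises both participation and outcomes, the true impact is 2, and Z is unrelated to `u`.

```python
def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    var = sum((xi - x_bar) ** 2 for xi in x)
    return cov / var

def iv_estimate(z, d, y):
    """Reduced-form slope of Y on Z divided by the slope of D on Z."""
    return ols_slope(z, y) / ols_slope(z, d)

# Hypothetical data: Y = 1 + 2*D + 4*u, and D = 1 whenever Z + u >= 1.
z = [0, 0, 1, 1]
u = [0, 1, 0, 1]          # unobserved by the evaluator
d = [0, 1, 1, 1]
y = [1.0 + 2.0 * di + 4.0 * ui for di, ui in zip(d, u)]

naive = ols_slope(d, y)       # biased upward by the latent factor u
iv = iv_estimate(z, d, y)     # recovers the common impact of 2
print(naive, iv)
```

The example relies on a common impact for everyone; when impacts differ across people, the same ratio identifies only a local effect.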
![Page 83: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/83.jpg)
Problems with IVE
1. Finding valid IVs:
  • Usually easy to find a variable that is correlated with treatment.
  • However, the validity of the exclusion restrictions is often questionable.
2. Impact heterogeneity due to latent factors
![Page 84: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/84.jpg)
Sources of instrumental variables
• Partially randomized designs as a source of IVs
• Non-experimental sources of IVs
  • Geography of program placement (Attanasio and Vera-Hernandez); “Dams” example (Duflo and Pande)
  • Political characteristics (Besley and Case; Paxson and Schady)
  • Discontinuities in survey design
![Page 85: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/85.jpg)
Endogenous compliance: Instrumental variables estimator
D = 1 if treated, 0 if control
Z = 1 if assigned to treatment, 0 if not.
• Compliance regression: $D_i = \alpha_1 + \beta_1 Z_i + \varepsilon_{1i}$
• Outcome regression (“intention to treat effect”): $Y_i = \alpha_2 + \beta_2 Z_i + \varepsilon_{2i}$
• 2SLS estimator (= ITT deflated by compliance rate): $\hat{\beta}_2 / \hat{\beta}_1$
![Page 86: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/86.jpg)
Essential heterogeneity and IVE
• Common-impact specification is not harmless.
• Heterogeneity in impact can arise from differences between treated units and the counterfactual in latent factors relevant to outcomes.
• For consistent estimation of ATE we must assume that selection into the program is unaffected by latent, idiosyncratic factors determining the impact (Heckman et al.).
• However, likely “winners” will no doubt be attracted to a program, or be favored by the implementing agency.
=> IVE is biased even with “ideal” IVs.
![Page 87: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/87.jpg)
Stylized example
• Two types of people (1/2 of each):
  • Type H: High impact; large gains (G) from program
  • Type L: Low impact; no gain
• Evaluator cannot tell which is which
• But the people themselves can tell (or have a useful clue)
• Randomized pilot:
  • Half goes to each type
  • Impact = G/2
• Scaled-up program:
  • Type H select into program; Type L do not
  • Impact = G
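The arithmetic of the stylized example can be reproduced with a small sketch. The value of G, the population size and the systematic "every 5th person" draw (standing in for random assignment) are illustrative assumptions.

```python
G = 10.0  # hypothetical gain for Type H

# Population: half Type H (gain G), half Type L (no gain).
population = [("H", G)] * 500 + [("L", 0.0)] * 500

def mean_gain(participants):
    return sum(gain for _, gain in participants) / len(participants)

# Randomized pilot: every 5th person stands in for a random draw,
# so half of the pilot participants are of each type.
pilot = population[::5]
pilot_impact = mean_gain(pilot)

# Scaled-up program: only Type H select into the program.
scaled_up = [person for person in population if person[0] == "H"]
scaled_up_impact = mean_gain(scaled_up)

print(pilot_impact, scaled_up_impact)
```

The pilot measures the average gain over both types, while the scaled-up program delivers the gain of those who choose to join, so the pilot understates the scaled-up impact in this example.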
![Page 88: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/88.jpg)
IVE is only a 'local' effect
IVE identifies the effect for those induced to switch by the instrument ("local average effect")
Suppose Z takes 2 values. Then the effect of the program is:
$$\text{IVE} = \frac{E(Y \mid Z=1) - E(Y \mid Z=0)}{E(D \mid Z=1) - E(D \mid Z=0)}$$
Care in extrapolating to the whole population when there is latent heterogeneity.
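
A minimal sketch of this ratio on simulated data may help make the "local" point concrete. Everything in the simulation is assumed for illustration: the three behavioural groups (`always`, `complier`, `never`), their gains, and the function name `wald_estimate` are not from the slides.

```python
import random

def wald_estimate(y, d, z):
    """IV estimate for a binary instrument z: difference in mean outcomes
    divided by difference in mean participation across z = 1 and z = 0."""
    def mean(values, flag):
        picked = [v for v, zi in zip(values, z) if zi == flag]
        return sum(picked) / len(picked)
    reduced_form = mean(y, 1) - mean(y, 0)   # E(Y|Z=1) - E(Y|Z=0)
    first_stage = mean(d, 1) - mean(d, 0)    # E(D|Z=1) - E(D|Z=0)
    return reduced_form / first_stage

rng = random.Random(1)
gain_by_kind = {"always": 6.0, "complier": 2.0, "never": 0.0}  # assumed gains
y, d, z = [], [], []
for _ in range(200_000):
    zi = rng.randint(0, 1)                       # randomly assigned instrument
    kind = rng.choice(["always", "complier", "never"])
    # Always-takers join regardless; compliers join only when z = 1.
    di = 1 if kind == "always" or (kind == "complier" and zi == 1) else 0
    y.append(rng.gauss(0.0, 1.0) + gain_by_kind[kind] * di)
    d.append(di)
    z.append(zi)

print(f"Wald estimate: {wald_estimate(y, d, z):.2f}")
```

Under these assumptions the estimate should land near the compliers' gain of 2.0, not the population average gain (about 2.67) or the gain among participants, which is why extrapolation needs care.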
![Page 89: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/89.jpg)
Local instrumental variables
LIV directly addresses the latent heterogeneity problem.
The method entails a nonparametric regression of outcomes Y on the propensity score:
$$Y_i = f[\hat{P}(Z_i)] + \beta X_i + \varepsilon_i$$
The slope of the regression function, $f'[\hat{P}(Z_i)]$, gives the marginal impact at the data point.
This slope is the marginal treatment effect (Björklund and Moffitt), from which any of the standard impact parameters can be calculated (Heckman and Vytlacil).
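
The idea of reading the marginal impact off the slope can be illustrated with a deliberately crude sketch. It is not the LIV estimator itself: the propensity score is treated as known rather than estimated, covariates X are omitted, and bin averages with finite differences stand in for a proper nonparametric regression. The gain function `2 * (1 - u)` and the name `binned_slopes` are assumptions made for the example.

```python
import random

def binned_slopes(p, y, n_bins=5):
    """Crude stand-in for a nonparametric regression of y on p:
    average y within equal-width bins of p, then difference adjacent bins."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for pi, yi in zip(p, y):
        b = min(int(pi * n_bins), n_bins - 1)
        sums[b] += yi
        counts[b] += 1
    means = [s / c for s, c in zip(sums, counts)]
    width = 1.0 / n_bins
    # Slope between neighbouring bin means approximates f'(p) at the bin edge.
    return [(means[k + 1] - means[k]) / width for k in range(n_bins - 1)]

rng = random.Random(2)
p, y = [], []
for _ in range(200_000):
    pi = rng.random()            # assumed known propensity score P(Z)
    u = rng.random()             # latent resistance to participating
    di = 1 if u < pi else 0      # participate when resistance is below P(Z)
    gain = 2.0 * (1.0 - u)       # assumed: low-resistance people gain most
    p.append(pi)
    y.append(rng.gauss(0.0, 1.0) + gain * di)

slopes = binned_slopes(p, y)
print([round(s, 2) for s in slopes])
```

In this setup the slopes should fall as the propensity score rises, reflecting that the marginal participant drawn in at high participation rates gains less than those who join first.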
![Page 90: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/90.jpg)
Lessons from practice
Partially randomized designs offer great source of IVs.
The bar has risen in standards for non-experimental IVE
  Past exclusion restrictions often questionable in developing country settings
  However, defensible options remain in practice, often motivated by theory and/or other data sources
Future work is likely to emphasize latent heterogeneity of impacts, esp., using LIV.
![Page 91: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/91.jpg)
6. How to Implement an Impact Evaluation
![Page 92: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/92.jpg)
Timeline
[Figure: timeline over periods T1 to T8, with rows for Treatment (start of intervention), Control (start of intervention), Surveys (baseline, follow-up, follow-up), Program Implementation, and Impact Evaluation.]
Baseline survey must go into field before program implemented
Exposure period between Treatment and Control areas is subject to political, logistical considerations
Follow up survey must go into field before program implemented in Control areas
Additional follow up surveys depend on funding and marginal benefits
![Page 93: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/93.jpg)
Prepare & plan evaluation at same time preparing intervention
  Avoid conflicts with operational needs
  Strengthen intervention design and results framework
Prospective designs:
  Key to finding control groups
  More flexibility before publicly presenting roll-out plan
  Better able to deal with ethical and stakeholder issues
  Lower costs of the evaluation
![Page 94: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/94.jpg)
Use Phasing for Control Groups
Limited budget and logistical ability means almost always phase in program over time
  Those who go first are treatments
  Those who go later are controls
Who goes first in rollout plan?
  Eligibility Criteria defines universe
  Cost: minimum efficient scale defines unit of intervention
  Transparency & accountability: criteria should be quantitative and public
  Equity: everyone deserves an equal chance
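
One way to give every eligible unit an equal chance of going first is a public lottery over the roll-out order. The sketch below assumes a hypothetical universe of 20 villages and two phases; the unit names, the seed and the function `assign_phases` are illustrative only.

```python
import random

def assign_phases(unit_ids, n_phases, seed):
    """Randomly order eligible units and split them into roll-out phases."""
    rng = random.Random(seed)    # a fixed, published seed keeps the draw auditable
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    # Deal units round-robin so phase sizes differ by at most one.
    return {unit: pos % n_phases + 1 for pos, unit in enumerate(shuffled)}

eligible = [f"village_{i:02d}" for i in range(1, 21)]   # hypothetical universe
phases = assign_phases(eligible, n_phases=2, seed=2011)
treatment = sorted(u for u, ph in phases.items() if ph == 1)  # go first
control = sorted(u for u, ph in phases.items() if ph == 2)    # go later
print(len(treatment), "treatment units;", len(control), "control units")
```

The unit passed in should match the unit of intervention (for example villages rather than households when the program is delivered village by village).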
![Page 95: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/95.jpg)
Monitoring Data can be used for Impact Evaluation
Program monitoring data usually only collected in areas where active
  Start in control areas at same time as in treatment areas for baseline
  Add outcome indicators into monitoring data collection
Very cost-effective as little need for additional special surveys
![Page 96: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/96.jpg)
Countries already regularly collect
Vital statistics
Electricity, water & sanitation, transport company administration information
School, health clinic MIS
Industrial surveys
Labor force & household budget surveys
Demographic & Health
National & local budgetary data
![Page 97: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/97.jpg)
Can these other data be used?
Critical issues
  Do they collect outcome indicators?
  Can we identify controls and treatments?
    i.e. link to intervention locations and/or beneficiaries
    question of identification codes
  Statistical power: are there sufficient sample sizes in treatment and control areas?
  Are there baseline (pre-intervention) data?
  Is there more than one year of prior data to test for equality of pre-intervention trends?
    True for census data & vital statistics
    Usually not for survey data
![Page 98: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/98.jpg)
Special Surveys
Where there is no monitoring system in place or available data is incomplete
  Need baseline & follow-up of control & treatments
  May need information that do not want to collect on a regular basis (esp specific outcomes)
Options
  Collect baseline as part of program application process
  If controls never apply, then need special survey
![Page 99: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/99.jpg)
Sample Sizes
Should be based on power calculations
  Sample sizes needed to statistically distinguish between two means
    Increases the rarer the outcome (e.g. maternal mortality)
    Increases the larger the standard deviation of the outcome indicator (e.g. test scores)
    Increases the smaller the desired effect size
Need more sample for subpopulation analysis (gender, poverty)
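
A textbook two-sample calculation shows how the standard deviation and the effect size drive the sample. The sketch assumes equal group sizes, a common standard deviation, a two-sided z-test and no clustering; the function name `n_per_arm` and the input values are illustrative, and real designs with village-level assignment would need a design-effect adjustment that is not shown here.

```python
from math import ceil
from statistics import NormalDist

def n_per_arm(effect, sd, alpha=0.05, power=0.80):
    """Sample size per group to detect a difference `effect` between two means
    (normal approximation, equal groups, common standard deviation `sd`)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / effect ** 2)

base = n_per_arm(effect=0.2, sd=1.0)
smaller_effect = n_per_arm(effect=0.1, sd=1.0)   # halving the effect
noisier = n_per_arm(effect=0.2, sd=2.0)          # doubling the standard deviation
print(base, smaller_effect, noisier)
```

Halving the detectable effect or doubling the standard deviation each roughly quadruples the required sample, and each subgroup to be analysed separately needs its own sample of this size.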
![Page 100: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/100.jpg)
Staffing Options
Contract a single firm
  Easy one stop shopping & responsibility clear
  Less flexible & expensive
  Few firms capable, typically large international firms only ones with all skills
Split responsibility
  Contract one for design, questionnaire content, supervision of data collection, and analysis
  Contract another for data collection
  More complex but cheaper
  Can get better mix of skills & use local talent
![Page 101: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/101.jpg)
Staffing
In-country Lead coordinator
  Assists in logistical coordination
  Based in-country to navigate obstacles
  Can be external consultant, or in-country researcher
  Must have stake in the successful implementation of field work
Consultants (International?)
  Experimental Design and analysis
  Questionnaire and sample design
Local research firm
  Responsible for field work and data entry
  Local researchers (work with international?)
![Page 102: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/102.jpg)
Budget: How much will you need?
Single largest component: Data collection
  Cost depends on sample size, interview length & measurement
  Do you need a household survey?
  Do you need an institutional survey?
  What are your sample sizes?
  What is the geographical distribution of your sample?
![Page 103: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/103.jpg)
Consultants
Money is well spent on consultants for design, sampling, and analysis
Are there local researchers?
  Can they do it themselves?
  Partner with international experts (OPORTUNIDADES Model)?
Save money if can access local consultants
Long-term need to build local capacity
![Page 104: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/104.jpg)
Monitoring the Intervention
Supervise the program implementation
  Evaluation design is based on roll-out plan
  Ensure program implementation follows roll-out plan
In order to mitigate problems with program implementation:
  Maintain dialogue with government
  Build support for impact evaluation among other donors, key players
![Page 105: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/105.jpg)
Building Support for Impact Evaluation
Once Evaluation Plan for design and implementation is determined:
  Present plan to government counterparts
  Present plan to key players (donors, NGOs, etc.)
  Present plan to other evaluation experts
![Page 106: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/106.jpg)
Operational messages
Plan evaluation at same time plan project
Build an explicit evaluation team
Influence roll-out to obtain control groups
  Use quantitative & public allocation criteria
  Randomization is ethical
Strengthen monitoring systems to improve IE quality & lower costs
Sample sizes of surveys drive budget
![Page 107: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/107.jpg)
Thank you
![Page 108: Introduction to Impact evaluation: Part II- Methods & Examples - World Banksiteresources.worldbank.org/INTPOVERTY/Resources/335642... · 2011-05-10 · Introduction to Impact evaluation:](https://reader030.fdocuments.in/reader030/viewer/2022040907/5e7d512fa4b6e315da4c4a0b/html5/thumbnails/108.jpg)
ATE and ATT
Average Treatment Effect - ATE:
$$E(Y^T) - E(Y^C)$$
Average Treatment Effect on the Treated - ATT:
$$E(Y^T \mid D=1) - E(Y^C \mid D=1)$$