Addressing the Evaluation Gap
Addressing the Evaluation Gap
Responding to the paper by William D. Savedoff and Ruth Levine, “When Will We Ever Learn? Closing the Evaluation Gap”, Center for Global Development, www.cgdev.org
There have been, and continue to be, multiple discussions concerning the evaluation of international development. They include some commonly agreed frames of reference (as we hope to discover here in Sussex), but they also include forces pulling in many divergent directions, or at least different interpretations of what form of “impact evaluation” is called for.
Some attempt to address the complexities of increasingly integrated, multi-intervention, multi-donor national development assistance, including those promoting human rights and advocating for policy change.
Others call for a form of impact evaluation focused on rigorous research into more specific cause-effect relationships, the findings of which can be used to inform subsequent project design.
There are those who propose using randomized 'scientific' experimental research designs to evaluate 'the real impact' of development projects. One such proponent is the MIT Poverty Action Lab (http://www.povertyactionlab.com/).
Another is the Center for Global Development's "Evaluation Gap" Working Group. Their recently released report (http://www.cgdev.org/section/initiatives/_active/evalgap) is receiving high-profile attention, not only in the US but also in Europe, including at a multi-national, multi-agency conference held in June at the Rockefeller Foundation center in Bellagio, Italy.
There are many aspects of the CGD’s initiative that I believe we should applaud and support. These include (among others):
– Pointing out that “An evaluation gap exists because there are too few incentives to conduct good impact evaluations and too many obstacles.”
– Calling for more financial and technical support for more rigorous evaluation
– Advocating that there be more collaborative evaluations
The CGD’s two main suggested solutions are:
– The formation of an International Council to Catalyze Independent Impact Evaluations of Social Sector Interventions.
– The conducting of more rigorous impact evaluations (implying randomized experimental trials).1

1 In fairness, their proposals are more comprehensive than what I am highlighting here. But this points to an important methodological challenge.
I suggest that those of us gathered here in Sussex consider responses to both of these:
– Do we agree that there is need for the proposed CGD-organized International Council?
• If so, in what ways are we and the institutions we represent willing to collaborate with it?
• Or are its proposed purposes (see next slide) already being adequately met by existing institutions or networks?
– What is the role of randomized experimental trials among other evaluation designs?
The International Council
• Establish quality standards for rigorous evaluations
• Organize and disseminate information
• Identify priority topics
• Review proposals rapidly
• Build capacity to produce, interpret and use knowledge
• Create a directory of researchers
• Provide grants for impact evaluation design
• Create and administer a pooled impact evaluation fund
• Signal quality with a “Seal of Approval”
• Communicate with policymakers
Evaluation Designs
Though I humbly acknowledge that this is a room full of experts, permit me to share with you the introduction to evaluation design that participants in my training workshops have found helpful.1 This could help clarify the role of more rigorous evaluations (even randomized trials) – when they are needed, and when they may be inappropriate or not feasible.
1 These are included in the book RealWorld Evaluation by Bamberger, Rugh and Mabry, published by Sage, February 2006.
Design #1: Post-test only of project participants
  Project participants:   X   P        (P = end-of-project evaluation; no baseline, no comparison group)
Design #2: Pre+post of project; no comparison
  Project participants:   P1  X  P2    (P1 = baseline, P2 = end-of-project evaluation)
Design #3: Pre+post of project; post-only comparison
  Project participants:   P1  X  P2
  Comparison group:               C    (end-of-project evaluation only)
Design #4: Quasi-experimental (pre+post, with ‘comparison’)
  Project participants:   P1  X  P2
  Comparison group:       C1      C2   (baseline and end-of-project evaluation)
Design #5: Randomized experimental (pre+post, with ‘control’)
  Project participants:   P1  X  P2
  Control group:          C1      C2   (baseline and end-of-project evaluation)
  Research subjects randomly assigned either to project or control group.
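To make the contrast between Designs #2, #4 and #5 concrete, here is a minimal sketch of how the impact estimate changes once a comparison or control group is added. This is not from the original presentation; the numbers and variable names are invented purely for illustration, and it assumes only simple group means measured at baseline and end of project.

    # Hypothetical group means (illustrative numbers only)
    # P1, P2 = project group at baseline and at end of project
    # C1, C2 = comparison/control group at baseline and at end of project
    P1, P2 = 40.0, 55.0
    C1, C2 = 41.0, 48.0

    # Design #2 (pre+post, no comparison): the whole observed change
    # is attributed to the project.
    impact_design_2 = P2 - P1                        # 15.0

    # Designs #4 and #5 (pre+post with comparison/control): subtract the
    # change the comparison group experienced anyway, i.e. a
    # difference-in-differences estimate.
    impact_design_4_and_5 = (P2 - P1) - (C2 - C1)    # 15.0 - 7.0 = 8.0

    print(impact_design_2, impact_design_4_and_5)

The gap between the two figures is the change that would likely have occurred without the project; the random assignment in Design #5 is what justifies reading the second figure as a causal effect rather than merely an adjusted comparison.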
Design #6: Longitudinal quasi-experimental
  Project participants:   P1  X  P2  X  P3   P4
  Comparison group:       C1     C2     C3   C4
  (evaluations at baseline, midterm, end of project and post project)
Design #7: Randomized longitudinal experimental
  Project participants:   P1  X  P2  X  P3   P4
  Control group:          C1     C2     C3   C4
  (evaluations at baseline, midterm, end of project and post project)
  Research subjects randomly assigned either to project or control group.
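Following the same logic, the longitudinal Designs #6 and #7 add midterm and post-project observation points, which is what lets an evaluation ask whether impact is sustained after the project ends. A small sketch under the same assumptions, again with invented numbers:

    # Hypothetical group means at baseline, midterm, end of project, post project
    project = {"P1": 40.0, "P2": 48.0, "P3": 55.0, "P4": 54.0}
    control = {"C1": 41.0, "C2": 44.0, "C3": 48.0, "C4": 50.0}

    # Difference-in-differences at each follow-up point, relative to baseline
    points = [("P2", "C2", "midterm"),
              ("P3", "C3", "end of project"),
              ("P4", "C4", "post project")]
    for p, c, label in points:
        estimate = (project[p] - project["P1"]) - (control[c] - control["C1"])
        print(label, round(estimate, 1))

A post-project estimate noticeably smaller than the end-of-project estimate would suggest that the measured impact is not being sustained.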
How often are ‘more rigorous’ evaluation designs actually used?
• Of the 67 projects included in the last bi-annual meta-evaluation conducted by CARE International, 50 (75%) used a posttest-only design without a comparison group (Design 1); 12% used pre + posttest of project group (Design 2). We guess that these are fairly typical of evaluation designs actually used by INGOs and other development agencies.
• There actually had been baseline studies conducted for 19 of the projects where posttest-only evaluations were conducted. The reasons those baselines were not used included problems with the accessibility of the baseline data to the evaluators, lack of comparability (in terms of indicators and methodologies), questions regarding the quality of the baseline studies, and/or oversight by those conducting the evaluations.
We need to be clear on what we are defining as ‘impact’ and on what the contributing causes/contributions are to achieving that ‘impact’.
• We do need to have proven hypotheses of which interventions and outputs have been shown to lead to which outcomes.
• But such research needs to be clear on the relevant conditions and on what other contributing factors were present.
[Diagram: problem tree. ‘High infant mortality rate’ and ‘Children are malnourished’ are traced back to contributing causes: diarrheal disease, insufficient food, poor quality of food, unsanitary practices (flies and rodents, facilities not used correctly, people do not wash hands before eating), the need for improved health policies, and the need for strengthened capacity of health institutions.]
[Diagram: corresponding objectives tree. ‘Lower infant mortality rate’ and ‘More children are well nourished’ are linked to the desired conditions: less diarrheal disease, sufficient food, good quality of food, sanitary practices (fewer flies and rodents, facilities used correctly, people wash hands before eating), improved health policies, and strengthened capacity of health institutions.]
What is the role of randomized experimental trials?
• I believe there are examples of where they should be used to test interventions and determine clear cause-effect relationships. These can then be used in subsequent project design and evaluation.
• I solicit your suggestions of examples where they have been or should be used.