Addressing the Evaluation Gap
Addressing the Evaluation Gap
Responding to the paper by William D. Savedoff and Ruth Levine, “When Will We Ever Learn? Closing the Evaluation Gap”, Center for Global Development, www.cgdev.org
There have been, and continue to be, multiple discussions concerning the evaluation of international development. They include some commonly agreed frames of reference (as we hope to discover here in Sussex), but they also include forces pulling in many divergent directions, or at least different interpretations of what form of “impact evaluation” is called for.
Some attempt to address the complexities of increasingly integrated, multi-intervention, multi-donor national development assistance, including those promoting human rights and advocating for policy change.
Others call for a form of impact evaluation focused on rigorous research into more specific cause-effect relationships, the findings of which can be used to inform subsequent project design.
There are those who propose using randomized 'scientific' experimental research designs to evaluate 'the real impact' of development projects. One such proponent is the MIT Poverty Action Lab (http://www.povertyactionlab.com/).
Another is the Center for Global Development's "Evaluation Gap" Working Group. Their recently released report (http://www.cgdev.org/section/initiatives/_active/evalgap) is receiving high-profile attention, not only in the US but also in Europe, including at a multi-national, multi-agency conference held in June at the Rockefeller Foundation center in Bellagio, Italy.
There are many aspects of the CGD’s initiative that I believe we should applaud and support. These include (among others):
– Pointing out that “An evaluation gap exists because there are too few incentives to conduct good impact evaluations and too many obstacles.”
– Calling for more financial and technical support for more rigorous evaluation
– Advocating that there be more collaborative evaluations
The CGD’s two main suggested solutions are:
– The formation of an International Council to Catalyze Independent Impact Evaluations of Social Sector Interventions.
– The conducting of more rigorous impact evaluations (implying randomized experimental trials).1

1 In fairness, their proposals are more comprehensive than what I am highlighting here. But this points to an important methodological challenge.
I suggest that those of us gathered here in Sussex consider responses to both of these:
– Do we agree that there is need for the proposed CGD-organized International Council?
• If so, in what ways are we and the institutions we represent willing to collaborate with it?
• Or are its proposed purposes (see next slide) already being adequately met by existing institutions or networks?
– What is the role of randomized experimental trials among other evaluation designs?
The International Council
• Establish quality standards for rigorous evaluations
• Organize and disseminate information
• Identify priority topics
• Review proposals rapidly
• Build capacity to produce, interpret and use knowledge
• Create a directory of researchers
• Provide grants for impact evaluation design
• Create and administer a pooled impact evaluation fund
• Signal quality with a “Seal of Approval”
• Communicate with policymakers
Evaluation Designs
Though I humbly acknowledge that this is a room full of experts, permit me to share with you the introduction to evaluation design that participants in my training workshops have found helpful.1 This could help clarify the role of more rigorous evaluations (even randomized trials) – when they are needed, and when they may be inappropriate or not feasible.
1 These are included in the book RealWorld Evaluation by Bamberger, Rugh and Mabry, published by Sage, February 2006.
Design #1: Post-test only of project participants
  Project participants:   X   P        (P = end-of-project evaluation; no baseline, no comparison group)
Design #2: Pre+post of project; no comparison
  Project participants:   P1  X  P2    (P1 = baseline, P2 = end-of-project evaluation)
Design #3: Pre+post of project; post-only comparison
  Project participants:   P1  X  P2
  Comparison group:               C    (end-of-project evaluation only)
Design #4: Quasi-experimental (pre+post, with ‘comparison’)
  Project participants:   P1  X  P2
  Comparison group:       C1      C2   (baseline and end-of-project evaluation)
Design #5: Randomized experimental (pre+post, with ‘control’)
  Project participants:   P1  X  P2
  Control group:          C1      C2   (baseline and end-of-project evaluation)
  Research subjects randomly assigned either to project or control group.
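To make the contrast between Designs #2, #4 and #5 concrete, here is a minimal sketch of how the impact estimate changes once a comparison or control group is added. This is not from the original presentation; the numbers and variable names are invented purely for illustration, and it assumes only simple group means measured at baseline and end of project.

    # Hypothetical group means (illustrative numbers only)
    # P1, P2 = project group at baseline and at end of project
    # C1, C2 = comparison/control group at baseline and at end of project
    P1, P2 = 40.0, 55.0
    C1, C2 = 41.0, 48.0

    # Design #2 (pre+post, no comparison): the whole observed change
    # is attributed to the project.
    impact_design_2 = P2 - P1                        # 15.0

    # Designs #4 and #5 (pre+post with comparison/control): subtract the
    # change the comparison group experienced anyway, i.e. a
    # difference-in-differences estimate.
    impact_design_4_and_5 = (P2 - P1) - (C2 - C1)    # 15.0 - 7.0 = 8.0

    print(impact_design_2, impact_design_4_and_5)

The gap between the two figures is the change that would likely have occurred without the project; the random assignment in Design #5 is what justifies reading the second figure as a causal effect rather than merely an adjusted comparison.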
Design #6: Longitudinal quasi-experimental
  Project participants:   P1  X  P2  X  P3   P4
  Comparison group:       C1     C2     C3   C4
  (evaluations at baseline, midterm, end of project and post project)
Design #7: Randomized longitudinal experimental
  Project participants:   P1  X  P2  X  P3   P4
  Control group:          C1     C2     C3   C4
  (evaluations at baseline, midterm, end of project and post project)
  Research subjects randomly assigned either to project or control group.
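Following the same logic, the longitudinal Designs #6 and #7 add midterm and post-project observation points, which is what lets an evaluation ask whether impact is sustained after the project ends. A small sketch under the same assumptions, again with invented numbers:

    # Hypothetical group means at baseline, midterm, end of project, post project
    project = {"P1": 40.0, "P2": 48.0, "P3": 55.0, "P4": 54.0}
    control = {"C1": 41.0, "C2": 44.0, "C3": 48.0, "C4": 50.0}

    # Difference-in-differences at each follow-up point, relative to baseline
    points = [("P2", "C2", "midterm"),
              ("P3", "C3", "end of project"),
              ("P4", "C4", "post project")]
    for p, c, label in points:
        estimate = (project[p] - project["P1"]) - (control[c] - control["C1"])
        print(label, round(estimate, 1))

A post-project estimate noticeably smaller than the end-of-project estimate would suggest that the measured impact is not being sustained.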
How often are ‘more rigorous’ evaluation designs actually used?
• Of the 67 projects included in the last bi-annual meta-evaluation conducted by CARE International, 50 (75%) used a posttest-only design without a comparison group (Design 1); 12% used pre + posttest of project group (Design 2). We guess that these are fairly typical of evaluation designs actually used by INGOs and other development agencies.
• There actually had been baseline studies conducted for 19 of the projects where posttest-only evaluations were conducted. The reasons those baselines were not used included problems with the accessibility of the baseline data to the evaluators, lack of comparability (in terms of indicators and methodologies), questions regarding the quality of the baseline studies, and/or oversight by those conducting the evaluations.
We need to be clear on what we are defining as ‘impact’ and on what the contributing causes/contributions are to achieving that ‘impact’.
• We do need to have proven hypotheses of which interventions and outputs have been shown to lead to which outcomes.
• But such research needs to be clear on the relevant conditions and on what other contributing factors were present.
[Diagram: problem tree. ‘High infant mortality rate’ and ‘Children are malnourished’ are traced back to contributing causes: diarrheal disease, insufficient food, poor quality of food, unsanitary practices (flies and rodents, facilities not used correctly, people do not wash hands before eating), the need for improved health policies, and the need for strengthened capacity of health institutions.]
[Diagram: corresponding objectives tree. ‘Lower infant mortality rate’ and ‘More children are well nourished’ are linked to the desired conditions: less diarrheal disease, sufficient food, good quality of food, sanitary practices (fewer flies and rodents, facilities used correctly, people wash hands before eating), improved health policies, and strengthened capacity of health institutions.]
What is the role of randomized experimental trials?
• I believe there are examples of where they should be used to test interventions and determine clear cause-effect relationships. These can then be used in subsequent project design and evaluation.
• I solicit your suggestions of examples where they have been or should be used.