Independent Quality Assessment of UNODC Evaluation Reports 2016
Synthesis Report

Prepared for:
Independent Evaluation Unit, UNODC

by John Mathiason and Ann Sutherland
December 22, 2016
CONTENTS

INTRODUCTION
METHODOLOGY
FINDINGS
   A. Strengths of the Evaluation Reports
   B. Weaknesses of the Evaluation Reports
   C. Improvements from First Draft in Final
   D. Best Practices Observed
   E. Challenges for Evaluators
   F. Mainstreaming Human Rights and Gender Considerations
RECOMMENDATIONS
Appendix 1. Comparison of Final Report with Draft Report
Appendix 2. UNODC EQA Template as Used in the Review
Appendix 3. Recommended Revisions to UNODC EQA Template
INTRODUCTION
The Independent Evaluation Unit (IEU) leads and guides evaluations in order to provide objective information on the performance of the United Nations Office on Drugs and Crime (UNODC). As a member of the United Nations Evaluation Group (UNEG), the IEU follows its Norms and Standards.
The number of independent project evaluations and in-depth evaluations at UNODC has increased significantly over the past years, generating important learning opportunities for all stakeholders, including the IEU. UNODC's efforts to create better mechanisms for discussing and propelling strategic issues forward are being informed by the IEU, whereby important insights are generated in the planning, implementation and finalization of evaluations. As part of its efforts to ensure the agency's evaluations provide credible information to inform planning processes, the IEU has commissioned independent quality assessments of evaluation reports produced since 2014.
This document synthesizes the Evaluation Quality Assessment (EQA) of all evaluation reports published in 2016 and makes comparisons with EQAs from 2014 and 2015. Moreover, the quality of a random sample of 30% of first-draft evaluation reports has been reviewed and assessed to analyze the qualitative changes between the first draft and the published version.
This assignment was carried out from mid-November to mid-December 2016 by two independent consultants, Dr. John Mathiason (Team Leader) and Ann Sutherland (Team Member). Both have extensive experience in conducting evaluations and meta-evaluations for international organizations. They are the Managing Director and Principal Associate, respectively, of Associates for International Management Services (AIMS).
METHODOLOGY
The first phase of this assignment involved making minor revisions to the EQA template. Checkboxes for rating the extent to which each criterion was met (met, partially met, not met) were incorporated into the template. This was found to be advantageous in several respects: the comments are less redundant, as it is no longer necessary to re-state the criteria; the reviewer is forced to consider all elements; and the comments section is used only to highlight issues that are particularly strong or weak and to provide any additional justification needed to explain the overall score given for the respective section. The revised EQA template used for the assessment is shown in the Annex.
The reviewers then examined the quality of all of the evaluation reports published during 2016. The total number of evaluations for the year was 19: three were in-depth evaluations and 16 were independent evaluations. The reviewers also compared the final drafts of a third of the evaluations, randomly selected, with the first drafts to observe any differences that could be attributed to the process of IEU commenting on the first drafts. All were drafts of independent evaluations.
To ensure consistency of the review process, both reviewers independently assessed one In-Depth and three Independent Project Evaluation reports. These reports were also randomly selected and were rated according to the nine EQA criteria categories. The reviewers then compared their comments and scores for each criterion as well as the overall score. In all cases, the overall scores were the same for both reviewers. There were minor differences in criteria scores and non-material differences in comments; the differences were discussed and resolved. As it was evident that the team was consistent, the remaining reports were each rated by one person; assignments were based on each reviewer's area of expertise and language skills. Overall, approximately one third of the evaluation reports were assessed by both reviewers.
The team used the "Unsatisfactory" rating only when the criteria elements were missing or very poorly addressed. Similarly, "Very Good" was used only when all criteria were fully met. As a result, "Fair" was used when the criteria were generally not met, and that score can therefore be understood to mean that the reports, or pieces of the reports, were not well done.
FINDINGS
All of the 19 reports produced in 2016 were of adequate quality. As seen in Table 1, four reports received a Very Good rating and no reports were rated as Unsatisfactory.
Table 1: Overall Rating of 2016 Reports compared with previous EQA cycles

                             Unsatisfactory   Fair   Good   Very Good   Total
# of Reports - 2016                0            8      7        4         19
% of Reports - 2016                0%          42%    37%      21%       100%
# of Reports - 2015 [1]            0           12      9        1         22
% of Reports - 2015                0%          53%    41%       5%       100%
# of Reports - 2014/15 [2]         0            9     22        2         33
% of Reports - 2014/15             0%          27%    67%       6%       100%

[1] This includes reports published from June through December 2015.
[2] This includes the first batch of EQA assessments, which considered reports published from January 2014 through May 2015.

Table 1 also shows there have been substantial improvements compared to 2015, with an increased number of Very Good reports and fewer rated as Fair. The drop in ratings for the latter half of 2015 compared to those of 2014 to mid-2015 may be explained by changes made to the EQA template, in particular the addition of a Reliability criterion and an increased emphasis on gender analysis.

In 2016 there were, as last year, some differences by criterion. Table 2 shows that the Executive Summary and Reliability sections were the most likely to have negative (Fair and Unsatisfactory) ratings. Recommendations, Lessons Learned and GEEW were also less strong. The Conclusions and Presentation/Structure sections were the most likely to have positive ratings.
Table 2: Report Rating by Criteria
[Stacked bar chart: number of reports rated Unsatisfactory, Fair, Good and Very Good on each criterion (Presentation/Structure, Executive Summary, Context & Purpose, Scope & Method, Reliability, Findings, Conclusions, Recommendations, Lessons Learned, GEEW).]
Table 3 shows the scores by subject area. Given the low numbers of reports, no specific sector can be said to have stronger reports. However, it is notable that two of the In-depth reports were rated as Very Good and one as Good. This is an improvement from 2015 when, of the eight In-depth reports, 25% were rated as Fair and 75% as Good.
Table 3: Report Rating by Subject
[Stacked bar chart: number of reports rated Fair, Good and Very Good in each subject area (Corruption, Organized Crime & Drugs, Justice, Prev/Treat/Reinteg, Research Trends, Terrorism Prevention, In-depth).]
A. Strengths of the Evaluation Reports

As mentioned above, the evaluation reports for 2016 showed improvement, with 58% receiving a rating of Good or Very Good. There were four primary strengths observed across those reviewed:
Consistency with DAC and UNEG Norms: As was the case for 2015, the reports generally conformed to the accepted norms and guidelines.

Conclusions: The conclusions section was the best in meeting criteria, showing that the evaluators were able to synthesize the results effectively in 16 of 19 cases (84% were rated as Good or Very Good).

Presentation and Structure: This section of the reports received the second-highest score, with 14 of 17 reports receiving a rating of Very Good or Good.

Inclusion of Gender Analysis: There were notable improvements in the 2016 reports; these are discussed in the "Mainstreaming Human Rights and Gender Considerations" section below.
B. Weaknesses of the Evaluation Reports

There continue to be areas in which UNODC evaluations could improve overall. These include:
Executive Summaries: This was the weakest section of the 2016 evaluation reports. The main issues were that the section often (a) exceeds the maximum page length, (b) does not reference the evaluation methodology used, (c) is overly focused on findings rather than conclusions, and (d) includes too many and/or unclear recommendations in the Summary Matrix.
Reliability of Data: A number of good practices in the field of development evaluation are not sufficiently used in the reports reviewed. Common shortcomings are:
• The absence of sampling frames to address reliability of data. Without this, it is not possible to know if the data are representative.
• The sources of data from stakeholders consulted during the evaluation process not being evident in the Findings. In approximately one third of reports, the findings are presented without reference to where the perspectives came from, or with only very general reference to source (e.g. "interviewees said . . ."). This can leave the impression that the Findings are solely or mostly the perspective of the evaluator.
• Responses not being broken down by stakeholder group or by gender.
• Data that was collected but not reported on. In a number of cases, the methodology included data collection processes such as surveys or Most Significant Change (MSC) questions, but there was no specific reference to these in the Findings. BOL/Y15 is a case in point: the methodology included interviews with 467 students, yet none of the data from this exercise was shown. In such instances, the reader cannot know whether the data was not analyzed or simply not useful.
• Lack of quantitative data collected during the evaluation (primary source).
• Overuse of data from document review (secondary source).
• No, or overly vague, explanation of how data was analyzed, which suggests that data is not being reviewed in a systematic manner.
Absence of Logframes: As was the case for 2015, the most significant concern is the absence of results frameworks and clear indicators to measure the progress or achievement of the intervention against its planned results. Logframes were rarely included in the reports, but in four cases the evaluators prepared an improved framework (and these were evaluation reports rated Very Good). Logframes should be mandatory components of evaluations, since they express what has been promised in terms of outcomes and objectives to be achieved. If evaluators find existing logframes inadequate, they should attempt to set one up in order to be clear about what the evaluation process is measuring. At a minimum, the evaluators should articulate the proposed chain of results or program theory. This practice should be suggested in the inception reports.
Inadequate attention to Effectiveness: Effectiveness should be the main summative focus of the Findings. Often, more attention is given to one or more of Relevance, Design, Efficiency, and Impact (when the latter is incorrectly used). While all are important to comment on, these should be less of a focus than progress towards achieving the results of the intervention.
Varied understandings of Effectiveness and Impact: Frequently, the Effectiveness section is used to present the activities that have been carried out without mentioning, or linking them to, the outcomes they are expected to contribute towards. This happens even though, in most cases, the main evaluation questions in the ToRs are explicit about the need for the Effectiveness section to focus on outcomes. An additional concern is that the Impact section is frequently pitched at the level of outcome results, and sometimes at the output level (e.g. increased knowledge gained from training workshops). If the project has not yet had the opportunity to show substantial progress towards results, in most cases it would be sufficient for the evaluator simply to state that in the Impact section.
Other common shortcomings: These include lack of consultation with, or involvement of, CLPs or other stakeholders beyond their being interviewed; not providing information about the evaluators and their suitability for the assignment (in one case the name of the evaluator was not provided); not providing the dates/timeframe of the evaluation process; minimal use of visual aids; and using interview protocols that are not adapted for specific stakeholder groups.
C. Improvements from First Draft in Final

As requested, the reviewers compared the final reports of just over one third (seven) of the evaluations with their first drafts, on which IEU will have commented. These were randomly selected from the first batch of 17 evaluations received. The results of the comparison are shown in Table 4, and the specific analysis is shown in Annex 1.
Table 4. Comparison of First and Final Drafts of a Sample of Evaluations

Evaluation | Date of first draft / final draft | Substantive Changes | Rating
MEXZ44 Preventing Corruption Mexico | October 2015 / October 2015 | Few | Fair
XAWU72 Airport Communications Project Final | July 2016 / September 2016 | Many | Good
LAOX26 Criminal Justice Responses to Human Trafficking in Lao | Aug/Sept 2016 / September 2016 | Many | Very Good
PERG34 Proyecto Sistema de Monitoreo de Cultivos Ilícitos SIMCI | July 2015 / February 2016 | Some | Fair
BRAX16 Expressive Youth - Citizenship, Access to Justice, Peace Brazil | November 2014 / June 2015 (but issued in 2016) | Some | Good
PSEX02 Forensic Human Resource & Governance Palestine Midterm | February 2016 / June 2016 | Some | Good
XAPX37 Counter-Terrorism East and Southeast Asia Partnership Criminal Justice Responses | April 2016 / June 2016 | Few | Very Good
The final documents showed general improvement from the draft versions. Although it did not appear that the changes would have affected the overall EQA ratings of any of the reports, there were several cases where the category scores would have improved, particularly GEEW scores. There was considerable variation in the extent of changes between the draft and final versions: of the seven reports, two had minimal changes, two had substantial changes and three were in between. In the two reports that were in Spanish, the main difference was that the Executive Summary (and summary matrix) was translated into English. Overall, the revisions were mostly helpful in one or more of the following ways: increasing readability through editing and formatting changes, providing more evidence to back up findings, disaggregating the list of stakeholders consulted by gender, including more gender analysis, and reformulating the Conclusions. In some instances the revisions were not beneficial, most notably a report where the Executive Summary was increased from four to seven pages.
There were substantial differences in the time between the first draft and the publication of the final version of the report, ranging from the same month to over a year.
D. Best Practices Observed

Revisions to logical frameworks: Where the evaluators reviewed and then, when necessary, revised the logical framework of the evaluation, they were able to obtain data that show outcomes and give an initial sense of impact, or to indicate where data could not be obtained. The good logical frameworks clearly showed the expected causal connection between outputs and outcomes. The two In-depth evaluations rated as Very Good provide good examples of this practice.

Evaluation Matrix: The matrix included in the report for LAOX26 Strengthening Criminal Justice Responses to Human Trafficking clearly shows the indicators/benchmarks and findings in relation to each of the evaluation questions.

Robust methodology: RASH13 Prevention of Transmission of HIV Among Drug Users is exemplary for its strong methodology, which is able to show effectiveness, relevance and sustainability.

Treatment of findings: The Cluster Evaluation of SMART (GLOJ88) & Forensic (GLOU54) has a strong presentation of Effectiveness, Efficiency, Impact, and Partnerships.
Presentation of Evaluation Tools: TKMX57 Strengthening Customs Service in Turkmenistan is an exemplary report for its inclusion of evaluation tools tailored to different stakeholder groups.
E. Challenges for Evaluators

Although some progress has been made in this regard, the ToRs mostly continue to have too many questions to be answered by the evaluations. The stronger evaluations are generally those where the evaluators have narrowed down the number of questions.
F. Mainstreaming Human Rights and Gender Considerations

As in 2015, the evaluations this year consistently included dedicated sub-sections within the Findings for Human Rights and Gender.
Table 5 shows that the average scores for considering GEEW have improved according to the UN-SWAP criteria. The SWAP tool assesses the extent to which gender equality and the empowerment of women (GEEW) is integrated into evaluation processes. There are four criteria, each rated on a scale of 0-3: 0 is awarded when there is no integration, 1 when there is partial integration, 2 when there is satisfactory integration, and 3 when the evaluation process exceeds requirements. The overall scores for 2016 would have been even higher had it not been for one report that received an Unsatisfactory rating.
Table 5: Average scores for the integration of GEEW (UN-SWAP)

Quality Assessment Criteria | Average Score (0-3), 2015 | Average Score (0-3), 2016
a. GEEW is integrated in the evaluation scope of analysis and indicators are designed in a way that ensures GEEW-related data will be collected. | 1.125 | 1.63
b. Evaluation criteria and evaluation questions specifically address how GEEW has been integrated into design, planning, implementation of the intervention and the results achieved. | 1.625 | 1.63
c. Gender-responsive evaluation methodology, methods and tools, and data analysis techniques are selected. | 0.875 | 1.37
d. Evaluation findings, conclusions and recommendations reflect a gender analysis. | 1.875 | 1.84
Overall Score | 5.5 | 6.53
Overall Rating | Fair | Fair
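As Table 5 implies, the overall UN-SWAP score is simply the sum of the four criterion averages, each on a 0-3 scale. A minimal sketch of that aggregation, assuming simple summation (the report does not state the thresholds mapping a total score to a rating such as "Fair", so the function below only sums; note that the displayed 2016 averages are rounded to two decimals, which is why their sum, 6.47, differs slightly from the reported 6.53):

```python
def unswap_overall(criterion_scores):
    """Sum the four UN-SWAP criterion scores (each on a 0-3 scale) into an
    overall score out of 12. The mapping from overall score to a rating
    (e.g. 'Fair') is not stated in this report, so none is derived here."""
    assert len(criterion_scores) == 4, "UN-SWAP uses exactly four criteria (a-d)"
    assert all(0 <= s <= 3 for s in criterion_scores), "each criterion is scored 0-3"
    return sum(criterion_scores)

# Average criterion scores for 2015, as reported in Table 5
scores_2015 = [1.125, 1.625, 0.875, 1.875]
print(unswap_overall(scores_2015))  # 5.5, rated Fair in Table 5
```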
Reports that were particularly strong in analysing gender issues included:

PSEX02 Forensic Human Resource & Governance Palestine Midterm: a good example of an analysis that goes beyond looking at the representation of women in events and career positions to broader issues of GEEW.
XAPX37 Counter-Terrorism East and Southeast Asia Partnership Criminal Justice Responses and LAOX26 Criminal Justice Responses to Human Trafficking in Lao were both exemplary for the explanation in the methodology section of how gender was considered. The latter report referenced the UN Women handbook "How to Manage a Gender-Responsive Evaluation" and used this guidance to structure the evaluation approach (see Figure 1).
Figure 1: Excerpt of Methodology Section from LAOX26 Evaluation Report
RECOMMENDATIONS
The main recommendations for the IEU to consider are:
Updated guidance for managers of evaluation processes and for evaluators: It is suggested that the review and revision of UNODC's guidance for evaluation processes take into account the recommendations made in the EQA Synthesis Reports for both 2014 and 2015, as the issues raised continue to be relevant. The guidance should include the expectations for reviewing draft reports and should be available in Spanish and French.
Additional recommendations beyond those in past reports are:
More emphasis on robust methodology: Evaluators should include quantitative data collection and analysis processes. If surveys are not feasible, interview protocols and other tools should be designed to elicit at least some responses that can be more easily quantified through content analysis. Evaluators also need to be transparent about how they have analyzed the data collected through both quantitative and qualitative processes. Additionally, evaluators should disaggregate responses by stakeholder group and by gender. It is unlikely that all stakeholders hold the same views, and the varying perspectives should be illuminated in the Findings.
Further refinement of the EQA template: Further refinement of the template is suggested. The new design, with a box to rate each criterion element, has proven helpful, but its use has revealed some redundancies to be addressed. The recommended changes are shown in Annex 3.
Appendix 1. Comparison of Final Report with Draft Report
MEXZ44
Added English-language Executive Summary and Summary Matrix. Very minor editorial changes.
XAWU72
There were extensive changes between the versions; most, but not all, were improvements. The Executive Summary was increased from 4 to 7 pages, in part due to an overly detailed description of the project and changes, and the already excessive number of footnotes was increased. The ES improvements included criteria subheadings for the findings/conclusions and a note about the role of the CLP. The text in the Summary Matrix is somewhat more clear and concise. In the Background section, a description of the project was removed, which negatively affected the EQA rating for that section. In the Methodology, Conclusions and Recommendations sections, references to HR and Gender were added, and in the annexed list of stakeholders, the numbers interviewed were gender disaggregated. Text was adjusted throughout the Findings, mostly for clarification or additional evidence. There was a substantial amount of editing (mostly minor issues); long paragraphs were broken up and some were rearranged. The Conclusions were completely changed to include all criteria. One lesson was moved to become a conclusion. There were a number of formatting changes throughout the document, most of them helpful, but in a couple of places maps/figures ended up covering text, which was not the case in the earlier draft.
LAOX26
There were a number of changes that improved the final version, and the GEEW score. The final version noted that the consultants were independent and that one was a substantive expert. The text of the Executive Summary was not included in the draft. The Summary Matrix was adjusted in the final version to indicate to whom recommendations were directed and to include more in the evidence column, and one recommendation (and the corresponding lesson) about handing back victims of trafficking was dropped. In several cases, direct excerpts from documents (some quite lengthy) included in the draft were removed or shortened, and the main points from those were integrated into the main text; this was an improvement. More specificity was added about institutions that were interviewed and were part of the Project Steering Committee. The full set of evaluation questions (41) was removed from the methodology section, with a reference to where they could be found in the annex. The Efficiency section was expanded to include more financial information (budget/outcome and spending rate), more analysis of the project logframe, and more evidence to support the shortcomings identified. The Effectiveness section included additional information about the outcomes of the training. The Conclusions section, which was organized by criteria, originally did not address Design or Human Rights & Gender. There was minor re-wording of the Recommendations. The names of people consulted during the evaluation were removed from the listing in the annex, and the gender disaggregation of this group was added.
PERG34
Added English-language Executive Summary and Summary Matrix. Moved a table on Evaluación de Hallazgos from the Findings section to an annex. Added a recommendation for UNODC Vienna: "4. It is recommended that headquarters in Vienna establish a training programme on new technologies, aimed among others at the project's technical staff, covering the use and handling of latest-generation satellite products and the new systems for measuring crop extent and production, and define their applicability in the country." Added five recommendations for the project team and edited the results shown in the Summary Matrix. Added a table on budgetary contributions by source. Added a paragraph in the methodology section describing limitations in the interviews that were conducted. Material in the Findings dealing with design, relevance, efficiency, partnerships and effectiveness was expanded in the final. There were no other changes in the conclusions or recommendations.
BRAX16
The first draft did not have an executive summary. The final draft translated the front page from Portuguese to English. Annexes were added to the final draft.
PSEX02
There were minor changes throughout most of the document. The font used in the draft was in accordance with the Guidelines, but this was changed in the final. Minor revisions included moving one recommendation from 'important' to 'key' in the Summary Matrix and noting at the beginning of the Recommendations section to whom the majority of recommendations were directed. More substantial editing changes included moving portions of the text to the footnotes (footnotes were already used rather frequently), gender disaggregation of the list of stakeholders, adding more explanatory detail in some cases, and making some paragraphs/sections more succinct. The latter was particularly evident in the Efficiency section, where the financial discussion was shortened and the lengthy subsection on "Reasons explicating Efficiency" was reduced from 3.5 to 2 pages. Although most of the editing changes appeared helpful, in a few cases references to the perspectives of stakeholders were removed, and it was noted in the EQA that this was a shortcoming of the report. The draft included just the ToR of the lead evaluator, but the final version included the full ToR (with ToRs for both consultants), a practice that is noted in the EQA Summary Reports as being redundant.
XAPX37
There were some editorial changes (including making some statements less definitive). The section on human rights and gender equality was expanded. Text on partnerships was moved up ahead of effectiveness in the findings.
Appendix 2. UNODC EQA Template as Used in the Review

General Project Information
Project/Programme Number and Name
Thematic Area
Geographic Area (Region, Country)
Approved budget at the time of the evaluation (USD)
Type of Evaluation (In-Depth/Independent Project; final/midterm; other)
Evaluator(s)
Date of Evaluation (from MM/YYYY to MM/YYYY)
Date of Evaluation Report (MM/YYYY)
Quality Assessment conducted on/by

OVERALL QUALITY RATING:
SUMMARY:
Quality Assessment Criteria. Assessment Levels: Very Good - Good - Fair - Unsatisfactory. Meets Criteria: Y = Yes, N = No, P = Partially
1. Structure, Completeness and Clarity of Report  RATING:
a. Format (headings, font) accords to IEU Guidelines and Templates for Evaluation Reports.
b. Structure accords to IEU Guidelines for Evaluation Reports with the following sequence: Executive Summary; Summary Matrix of Findings, Evidence and Recommendations; Introduction (Background and Context, Evaluation Scope and Methodology, Limitations to the Evaluation); Findings (Design, Relevance, Efficiency, Partnership and Cooperation, Effectiveness, Impact, Sustainability, Human Rights and Gender Equality/mainstreaming, Innovation); Conclusions; Recommendations; Lessons Learned.
c. Language is empowering and inclusive, avoiding gender, heterosexual, age, cultural and religious bias, among others.
d. Report is easy to read and understand (i.e. written in accessible non-technical language appropriate for the intended audience). Visual aids, such as maps and graphs, are used to convey key information. List of acronyms is included.
e. Report is free from any grammar, spelling, or punctuation errors.
f. Objectives stated in the terms of reference are adequately addressed.
g. Issues of human rights and gender equality/mainstreaming are adequately addressed.
h. Report contains a logical sequence: evidence - assessment - findings - conclusions - recommendations.
i. Composition of Evaluation Team is included and has gender and geographic expertise. Preferably it is gender balanced and includes professionals from the countries or regions concerned.
j. Annexes include at a minimum: evaluation terms of reference; list of persons interviewed and sites visited; list of documents consulted; evaluation tools used.
2. Executive Summary  RATING:
a. Written as a stand-alone section that provides an overview of the evaluation and presents its main results.
b. Generally follows the structure of: i) Purpose, including intended audience(s); ii) Objectives and brief description of intervention; iii) Methodology; iv) Main Conclusions; v) Recommendations.
c. Summary Matrix presents only the key and most important recommendations from the evaluation report.
d. Findings, sources and recommendations in the Summary Matrix are clear and cohesive, and specify the stakeholder to whom they are addressed.
e. Maximum length 4 pages, excluding the Summary Matrix.

3. Evaluation Context and Purpose  RATING:
a. Clear description of the project evaluated is presented.
b. Logic model and/or the expected results chain, and/or program theory (that at a minimum identifies and links objectives, outcomes and indicators of the project) is clearly described.
c. Context of key cultural, gender-related, social, political, economic, demographic, and institutional factors is described, and the key stakeholders involved in the project implementation and their roles are identified.
d. Project's status is described, including its phase of implementation and any significant changes (e.g. strategies, logical frameworks) that have occurred.
e. Purpose of the evaluation is clearly defined, including why it was needed at that point in time, who needed the information, what information is needed, how the information will be used, and the target audience.
4. Scope and Methodology  RATING:
a. Evaluation scope is clearly explained, including the main evaluation criteria, questions, and justification of what the evaluation did and did not cover.
b. Transparent description presented of methodology applied; how it was designed to address the evaluation purpose, objectives, questions and criteria is explained.
c. Methodology allows for drawing causal connections between outputs and expected outcomes.
d. Gender-sensitive methodology, aware of power relations during an evaluation process, inclusive and participatory.
e. Data collection methods and analysis, and data sources, are clearly described, as are the rationale for selecting them and their limitations. Reference indicators and benchmarks are included where relevant.
f. Sampling frame clearly described, including area and population to be represented, rationale for selection, mechanics of selection (including whether random), numbers selected out of potential subjects, and limitations of the sample.
g. Methods are appropriate for analysing gender equality/mainstreaming and human rights issues identified in the evaluation scope.
h. High degree of participation of internal and external stakeholders, including the Core Learning Partners, throughout the evaluation process is planned for and made explicit. When there are thematic or approach gaps (e.g. gender equality/mainstreaming) among stakeholders, other key informants not directly involved in the project were invited for consultation.
5. Reliability of Data (to ensure quality of data and robust data collection processes)  RATING:
a. Triangulation principles (using multiple sources of data and methods) were applied to validate findings.
b. Qualitative and quantitative data sources were used, and included the range of stakeholder groups and additional key informants (when necessary) defined in the evaluation scope.
c. Limitations that emerged in primary and secondary data sources and collection processes (bias, data gaps, etc.) are identified and, if relevant, actions taken to minimize such issues are explained.
d. Evidence provided of how data was collected with sensitivity to issues of discrimination and other ethical considerations.
e. Adequate disaggregation of data by relevant stakeholder undertaken (gender, ethnicity, age, under-represented groups, etc.). If this has not been possible, it is explained.
6. FINDINGS AND ANALYSIS (to ensure sound analysis and credible findings)  RATING:
Findings:
a. Have been formulated clearly, take into account any identified benchmarks, and are based on rigorous analysis of the data collected.
b. Address all evaluation criteria and questions raised in the ToR, including relevance, efficiency, effectiveness, impact and sustainability, as well as UNODC's additional criteria of design, partnership and cooperation, innovation, and the cross-cutting themes of human rights and gender.
c. Address any limitations or gaps in the evidence and discuss any impacts on responding to evaluation questions raised in the ToR.
d. Discuss any variances between planned and actual results of the project (in terms of objectives, outcomes, outputs).
e. Are presented in a clear manner.
Analysis:
a. Interpretations are based on carefully described assumptions.
b. Contextual factors are identified (including reasons for accomplishments and failures, and continuing constraints).
c. Cause-and-effect links between an intervention and its end results (including unintended results) are explained.
d. Includes substantive analysis of gender equality/mainstreaming issues.
e. Includes substantive analysis of human rights issues.

7. CONCLUSIONS  RATING:
a. Take into consideration all evaluation criteria and questions, including human rights and gender equality/mainstreaming criteria.
b. Have been formulated clearly and are based on findings and substantiated by evidence collected.
c. Convey the evaluators' unbiased judgement of the intervention.
d. Developed with the involvement of relevant stakeholders.
e. Present a comprehensive picture of both the strengths and weaknesses of the project.
f. Go beyond the findings and provide a thorough understanding of the underlying issues of the project and add value to the findings.

8. RECOMMENDATIONS  RATING:
a. Are clearly formulated, based on the conclusions, and substantiated by evidence collected.
b. Address flaws, if any, in the project's data acquisition processes.
c. Are specific, realistic, time-bound and actionable, and of a manageable number.
d. Are clustered and prioritized.
e. Reflect stakeholders' consultations whilst remaining balanced and impartial.
f. Clearly identify a target group for action.

9. LESSONS LEARNED  RATING:
a. Are correctly identified, innovative and add value to common knowledge.
b. Are based on specific evidence and analysis drawn from the evaluation.
c. Have wider applicability and relevance to the specific subject and context.
SCORING

Element of the Evaluation           Points per Category   Points Awarded
                                    (Rating: Very Good / Good / Fair / Unsatisfactory)
Presentation and Completeness                10
Executive Summary                             5
Evaluation Context and Purpose                5
Evaluation Scope and Methodology             10
Reliability of Data                           5
Findings and Analysis                        35
Conclusions                                  10
Recommendations                              15
Lessons Learned                               5
Total Maximum Score                         100

Very Good      -> very confident to use
Good           -> confident to use
Fair           -> use with caution
Unsatisfactory -> not confident to use
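The points-based scoring above can be sketched in code. The maximum points per element come from the SCORING table (they total 100), but the template does not specify what fraction of an element's maximum is awarded at each rating level, so the `RATING_FRACTION` values below are purely illustrative assumptions, not part of the EQA methodology.

```python
# Maximum points per element, taken from the SCORING table (total = 100).
MAX_POINTS = {
    "Presentation and Completeness": 10,
    "Executive Summary": 5,
    "Evaluation Context and Purpose": 5,
    "Evaluation Scope and Methodology": 10,
    "Reliability of Data": 5,
    "Findings and Analysis": 35,
    "Conclusions": 10,
    "Recommendations": 15,
    "Lessons Learned": 5,
}

# Hypothetical share of the maximum awarded per rating level; the
# template leaves this allocation to the assessor's judgement.
RATING_FRACTION = {
    "Very Good": 1.0,
    "Good": 0.75,
    "Fair": 0.5,
    "Unsatisfactory": 0.25,
}

def total_score(ratings):
    """Sum the points awarded across all elements, given one rating per element."""
    return sum(MAX_POINTS[elem] * RATING_FRACTION[r] for elem, r in ratings.items())
```

For example, a report rated "Very Good" on every element would score the full 100 points under these assumptions, while one rated "Good" throughout would score 75.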
ASSESSMENT OF THE INTEGRATION OF GENDER EQUALITY AND EMPOWERMENT OF WOMEN (GEEW) for UN-SWAP

Quality Assessment Criteria                                   Comments    Score (0-3)
a. GEEW is integrated in the evaluation scope of analysis, and indicators are designed in a way that ensures GEEW-related data will be collected.
b. Evaluation criteria and evaluation questions specifically address how GEEW has been integrated into the design, planning and implementation of the intervention and the results achieved.
c. Gender-responsive evaluation methodology, methods and tools, and data analysis techniques are selected.
d. Evaluation findings, conclusions and recommendations reflect a gender analysis.
Overall Score:
Overall Rating:

UN-SWAP Scoring System
Exceeding Requirements:   3 - Fully integrated. Applies when all of the elements under a criterion are met, used and fully integrated in the evaluation, and no remedial action is required.
Meeting Requirements:     2 - Satisfactorily integrated. Applies when a satisfactory level has been reached and many of the elements are met, but improvement could still be made.
Approaching Requirements: 1 - Partially integrated. Applies when some minimal elements are met, but further progress is needed and remedial action to meet the standard is required.
Missing:                  0 - Not at all integrated. Applies when none of the elements under a criterion are met.

Overall Calculation: 11-12 = very good; 8-10 = good; 4-7 = fair; 0-3 = unsatisfactory.
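The Overall Calculation rule can be sketched as a short function: the four GEEW criteria (a-d) are each scored 0-3, summed, and the total mapped to an overall rating. The function name is ours, but the thresholds are taken directly from the scoring system above.

```python
def unswap_overall_rating(criterion_scores):
    """Map the four GEEW criterion scores (a-d, each 0-3) to the overall
    UN-SWAP rating: 11-12 = very good, 8-10 = good, 4-7 = fair,
    0-3 = unsatisfactory."""
    total = sum(criterion_scores)
    if not 0 <= total <= 12:
        raise ValueError("four scores of 0-3 must total between 0 and 12")
    if total >= 11:
        return "very good"
    if total >= 8:
        return "good"
    if total >= 4:
        return "fair"
    return "unsatisfactory"
```

So a report scoring 3, 3, 3 and 2 on the four criteria (total 11) would be rated "very good", while one scoring 2 on each (total 8) would be rated "good".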
Appendix 3. Recommended Revisions to UNODC EQA Template
General Project Information

Project/Programme Number and Name
Thematic Area
Geographic Area (Region, Country)
Approved budget at the time of the evaluation (USD)
Type of Evaluation (In-Depth/Independent Project; final/midterm; other)
Evaluator(s)
Date of Evaluation (from MM/YYYY to MM/YYYY)
Date of Evaluation Report (MM/YYYY)
Quality Assessment conducted on/by

OVERALL QUALITY RATING:
SUMMARY:

Quality Assessment Criteria
Assessment Levels: Very Good - Good - Fair - Unsatisfactory
Meets Criteria: Y = Yes; N = No; P = Partially; N/A = Not Applicable
1. Structure, Completeness and Clarity of Report
RATING:
a. Format (headings, font) accords with the IEU Guidelines and Templates for Evaluation Reports.
b. Structure accords with the IEU Guidelines for Evaluation Reports, with the following logical sequence: List of Acronyms; Executive Summary; Summary Matrix of Findings, Evidence and Recommendations; Introduction (Background and Context, Evaluation Scope and Methodology, Limitations to the Evaluation); Findings (Design, Relevance, Efficiency, Partnership and Cooperation, Effectiveness, Impact, Sustainability, Human Rights and Gender Equality/Mainstreaming, Innovation); Conclusions; Recommendations; Lessons Learned.
c. Objectives stated in the terms of reference are adequately addressed.
d. Issues of human rights and gender equality/mainstreaming are adequately addressed.
e. Report is easy to read and understand (i.e. written in accessible, non-technical language appropriate for the intended audience).
f. Language is empowering and inclusive, avoiding gender, heterosexual, age, cultural and religious bias, among others.
g. Report is free from grammar, spelling or punctuation errors.
h. Visual aids, such as maps and graphs, are used to convey key information.
i. Report contains a logical sequence: evidence, assessment, findings, conclusions, recommendations.
j. Composition of the Evaluation Team is included and has gender and geographic expertise. Preferably it is gender-balanced and includes professionals from the countries or regions concerned.
k. Annexes include at a minimum: the evaluation terms of reference; logic model and/or evaluation matrix; list of persons interviewed and sites visited; list of documents consulted; and evaluation tools used.

2. Executive Summary
RATING:
a. Written as a stand-alone section that provides an overview of the evaluation and presents its main results.
b. Generally follows the structure: i) Purpose, including intended audience(s); ii) Objectives and brief description of the intervention; iii) Methodology; iv) Main Conclusions; v) Recommendations.
c. Summary Matrix presents only the key and most important recommendations from the evaluation report.
d. Findings, sources and recommendations in the Summary Matrix are clear and cohesive, and specify the stakeholder to whom they are addressed.
e. Maximum length 4 pages, excluding the Summary Matrix.

3. Evaluation Context and Purpose
RATING:
a. Clear description of the project evaluated is presented.
b. Logic model and/or the expected results chain, and/or program theory (that at a minimum identifies and links objectives, outcomes and indicators of the project) is clearly described.
c. Context of key cultural, gender-related, social, political, economic, demographic and institutional factors is described, and the key stakeholders involved in the project implementation and their roles are identified.
d. Project status is described, including its phase of implementation and any significant changes (e.g. to strategies, logical frameworks) that have occurred.
e. Purpose of the evaluation is clearly defined, including why it was needed at that point in time, who needed the information, what information is needed, how the information will be used, and the target audience.

4. Scope and Methodology
RATING:
a. Evaluation scope is clearly explained, including the main evaluation criteria, questions and justification of what the evaluation did and did not cover.
b. Transparent description of the methodology applied is presented, including how it was designed to address the evaluation purpose, objectives, questions and criteria.
c. Methodology allows for drawing causal connections between outputs and expected outcomes.
d. Methods are appropriate for analysing gender equality/mainstreaming and human rights issues identified in the evaluation scope; a gender-sensitive methodology takes into account power relations during the evaluation process and is inclusive and participatory.
e. Data collection methods, analysis and data sources are clearly described, as are the rationale for selecting them and their limitations. Reference indicators and benchmarks are included where relevant.
f. Sampling frame is clearly described and includes the area and population to be represented, the rationale for selection, the mechanics of selection (including whether random), the numbers selected out of potential subjects, and the limitations of the sample.
g.
h. High degree of participation of internal and external stakeholders, including the Core Learning Partners, throughout the evaluation process is planned for and made explicit. When there are thematic or approach gaps (i.e. gender equality/mainstreaming) among stakeholders, external key informants not directly involved in the project were invited for consultation.
5. Reliability of Data: to ensure quality of data and robust data collection processes
RATING:
a. Triangulation principles (using multiple sources of data and methods) were applied to validate findings.
b. Qualitative and quantitative data sources were used, and included the range of stakeholder groups and additional key informants (when necessary) defined in the evaluation scope.
c. Limitations that emerged in primary and secondary data sources and collection processes (bias, data gaps, etc.) are identified and, if relevant, actions taken to minimize such issues are explained.
d. Evidence provided of how data was collected with a sensitivity to issues of discrimination and other ethical considerations.
e. Adequate disaggregation of data by relevant stakeholder undertaken (gender, ethnicity, age, under-represented groups, etc.). If this has not been possible, it is explained.
6. FINDINGS AND ANALYSIS: to ensure sound analysis and credible findings
RATING:
Findings:
a. Are clearly formulated and presented.
b. Are based on rigorous analysis of the data collected; take into account any identified benchmarks.
c. Address all evaluation criteria and questions raised in the ToR, including relevance, efficiency, effectiveness, impact and sustainability, as well as UNODC's additional criteria of design, partnership and cooperation, innovation, and the cross-cutting themes of human rights and gender.
d. Address any limitations or gaps in the evidence and discuss any impacts on responding to evaluation questions raised in the ToR.
e. Discuss any variances between planned and actual results of the project (in terms of objectives, outcomes, outputs).
f. Are presented in a clear manner.

Analysis:
a. Interpretations are based on carefully described assumptions.
b. Contextual factors are identified (including reasons for accomplishments and failures, and continuing constraints).
c. Cause-and-effect links between an intervention and its end results (including unintended results) are explained.
d. Includes substantive analysis of gender equality/mainstreaming issues.
e. Includes substantive analysis of human rights issues.

7. CONCLUSIONS
RATING:
a. Take into consideration all evaluation criteria and questions, including human rights and gender equality/mainstreaming criteria.
b. Have been formulated clearly and are based on findings and substantiated by evidence collected.
c. Convey the evaluators' unbiased judgement of the intervention.
d. Developed with the involvement of relevant stakeholders.
e. Present a comprehensive picture of both the strengths and weaknesses of the project.
f. Go beyond the findings and provide a thorough understanding of the underlying issues of the project and add value to the findings.

8. RECOMMENDATIONS
RATING:
a. Are clearly formulated, based on the conclusions, and substantiated by evidence collected.
b. Address flaws, if any, in the project's data acquisition processes.
c. Are specific, realistic, time-bound and actionable, and of a manageable number.
d. Are clustered and prioritized.
e. Reflect stakeholders' consultations whilst remaining balanced and impartial.
f. Clearly identify a target group for action.

9. LESSONS LEARNED
RATING:
a. Are correctly identified, innovative and add value to common knowledge.
b. Are based on specific evidence and analysis drawn from the evaluation.
c. Have wider applicability and relevance to the specific subject and context.

SCORING

Element of the Evaluation           Points per Category   Points Awarded
                                    (Rating: Very Good / Good / Fair / Unsatisfactory)
Presentation and Completeness                10
Executive Summary                             5
Evaluation Context and Purpose                5
Evaluation Scope and Methodology             10
Reliability of Data                           5
Findings and Analysis                        35
Conclusions                                  10
Recommendations                              15
Lessons Learned                               5
Total Maximum Score                         100

Very Good      -> very confident to use
Good           -> confident to use
Fair           -> use with caution
Unsatisfactory -> not confident to use
ASSESSMENT OF THE INTEGRATION OF GENDER EQUALITY AND EMPOWERMENT OF WOMEN (GEEW) for UN-SWAP

Quality Assessment Criteria                                   Comments    Score (0-3)
a. GEEW is integrated in the evaluation scope of analysis, and indicators are designed in a way that ensures GEEW-related data will be collected.
b. Evaluation criteria and evaluation questions specifically address how GEEW has been integrated into the design, planning and implementation of the intervention and the results achieved.
c. Gender-responsive evaluation methodology, methods and tools, and data analysis techniques are selected.
d. Evaluation findings, conclusions and recommendations reflect a gender analysis.
Overall Score:
Overall Rating:

UN-SWAP Scoring System
Exceeding Requirements:   3 - Fully integrated. Applies when all of the elements under a criterion are met, used and fully integrated in the evaluation, and no remedial action is required.
Meeting Requirements:     2 - Satisfactorily integrated. Applies when a satisfactory level has been reached and many of the elements are met, but improvement could still be made.
Approaching Requirements: 1 - Partially integrated. Applies when some minimal elements are met, but further progress is needed and remedial action to meet the standard is required.
Missing:                  0 - Not at all integrated. Applies when none of the elements under a criterion are met.

Overall Calculation: 11-12 = very good; 8-10 = good; 4-7 = fair; 0-3 = unsatisfactory.