Post on 30-Dec-2015
How to Measure Impact?
Impact Assessment: to determine what effects programs have on their intended outcomes and whether perhaps there are important unintended effects
What would have happened in the absence of the program?
Since counterfactual is not observable, the key goal of all impact evaluation methods is to construct or “mimic” the counterfactual
Constructing the Counterfactual
Counterfactual is often constructed by selecting a group not affected by the program
Randomized: Use random assignment of the program to
create a control group which mimics the counterfactual
Non-randomized: Argue that a certain excluded group mimics
the counterfactual
Types of Impact Evaluation Methods
Randomized Evaluations Random Assignment Studies Randomized Field Trials Social Experiments Randomized Controlled Trials (RCTs) Randomized Controlled Experiments
The “golden standard” research design for assessing causal effects
Types of Impact Evaluation Methods (Cont.)
Non-experimental or Quasi-experimental Methods Pre-Post Differences-in-Differences Statistical Matching Instrumental Variables Regression Discontinuity Interrupted Time Series
Lack the random assignment to conditions that is essential for true experiments
Randomize or not?
Designs using nonrandomized controls universally yield less convincing results than well-executed randomized field experiments
The randomized field experiment in always the optimal choice for impact assessment
Nevertheless, quasi-experiments are useful for impact assessment when it is impractical or impossible to conduct a true randomized experiment
Randomized trials
Simple case Take a sample of program applicants Randomly assign them to either
Treatment group – is offered treatment Or Control group – not allowed to receive
treatment (during the evaluation period) [Or Placebo group – receives an innocuous
one]
Randomized trials
The critical element in estimating program effects by randomized field experiment is configuring a control group that does not participate in the program but is equivalent to the group that does Identical composition: Intervention and control groups
contain the same mixes of persons or other units in terms of their program-related and outcome-related characteristics
Identical predispositions: Intervention and control groups are equally disposed toward the project and equally likely, without intervention, to attain any given outcome status
Identical experiences: Over the time of observation, intervention and control groups experience the same time-related processes – maturation, interfering events, etc
Randomized trials
Even though target units are assigned randomly, the intervention and control groups will never be exactly the same
But if the random assignment were made over and over, those fluctuations would average out to zero
Statistical methods are used to guide a judgment about whether a specific difference in outcome is likely to have occurred simply by chance or it is more likely to represent the effect of the intervention
Units of analysis
The units on which outcome measures are taken in an impact assessment are called the units of analysis
The choice of the units of analysis should be based on the nature of the intervention and the target units to which it is delivered C.f., units of randomization
The logic of randomized experiments
Outcome Measures
Before Program
After Program Difference
Treatment Group
T1 T2 T = T2 – T1
Control Group C1 C2 C = C2 – C1
* Program effect = T - C
Key steps in conducting a randomized experiment Design the study carefully Randomly assign people to treatment or control Collect baseline data Verify that assignment looks random Monitor process so that integrity of experiment is not
compromised Collect follow-up data for both the treatment and control
groups in identical ways Estimate program impacts by comparing mea outcome
of treatment group vs. mean outcomes of control group Assess whether program impacts are statistically
significant and practically significant
Key advantages of experiments
Because members of the groups (treatment and control) do not differ systematically at the outset of the experiment, any difference that subsequently arises between them can be attributed to the treatment rather than to other factors
Relative to results from non-experimental studies, results from experiments are: Less subject to methodological debates; More likely to be convincing to program
funders and/or policymakers
Validity
A tool to assess credibility of a study
Internal validity: related to ability to draw causal inference, i.e., can we attribute our impact estimates to the program and not to something else
External validity: related to ability generalize to other settings of interest, i.e., can we generalize our impact estimates from this program to other populations, time periods, countries, etc?
Example 1
[Exhibit 8-B] The Child and Adolescent Trial for Cardiovascular Health
(CATCH) 96 elementary schools (CA, LA, MN, TX); 56 intervention
sites and 40 control sites Intervention sites: training sessions for the food service
staffs informing them of the rationale for nutritionally balanced school menus and providing recipes and menus that would achieve that goal
Measured by 24-hour dietary intake interviews with children at baseline and at follow-up, children in the intervention schools were significantly lower than children in control schools in total food intake and in calories derived from fat and saturated fat, but no different with respect to intake of cholesterol or sodium
Example 2
The Minnesota Family Investment Program (MFIP) [Exhibit 8-E] Problem of the Aid to Families with Dependent Children (AFDC): its does not
encourage recipients to leave the welfare rolls and seek employment Conduct an experiment that would encourage AFDC clients to seek employment
and allow them to receive greater income than AFDC would allow if they become employed
Three conditions: An MFIP intervention group receiving more generous benefits and
mandatory participation in employment and training activities An MFIP intervention group receiving only the more generous benefits and
not the mandatory employment and training activities A control group that continued to receive the old AFDC benefits and
services
MFIP intervention families were more likely to be employed and when employed had larger incomes than control families
Those in the intervention group receiving both MFIP benefits and mandatory employment and training activities were more often employed and earned more than the intervention group receiving only the MFIP benefits
Some variations on the basics
Assigning to multiple treatment groups [Example – Education Program]
Problems Large class size Children at different levels of learning Teachers often absent
Possible remedies More teachers to split classes Streaming of pupils into different achievement
bands Make teachers more accountable, may show up
more
Solutions
Do smaller class sizes improve test scores? Add new teachers
Does accountable teacher get better results? New teachers more accountable
Does streaming improve test scores? Divide some classes by achievement
Lottery
An example of clinical trial
Take 1000 people and give half of them the new drug
Can we simply apply this approach to social programs? Need to consider the constraints
Resource Constraints
Many programs have limited resources
Many more eligible recipients than resources will allow services for
Quite common in practice: Training for entrepreneurs or farmers School vouchers
How do you allocate these resources?
Advantages of Lottery
Simple, common, transparent, and flexible
Participants know who the “winners” and “losers” are
There is no a priori reason to discriminate
Perceived as fair