Post on 25-Dec-2015
EVALUATING ACTIVE LABOR MARKET POLICY FOR THE DISADVANTAGED
A Review of US Experience
Robert Moffitt, Johns Hopkins University
Outline
Thumbnail review of recent US social assistance structure
Methods of evaluation used so far
Assessment of strengths and weaknesses of each
Methods that have not been often used
Current Social Assistance Structure
For details, see Means-Tested Transfer Programs in the U.S., ed. R. Moffitt, U of Chicago Press and NBER, 2003
Temporary Assistance for Needy Families (TANF): cash for (mostly) single mothers
Food Stamps: food coupons for all poor
Medicaid: medical care for TANF recipients and selected non-TANF recipients
Earned Income Tax Credit (EITC): earnings subsidy for all, though mostly families with children
Housing, Disabled and Elderly (SSI)
Real Expenditures, 1990-1996 (in millions)

                        TANF-AFDC  FOOD STAMPS  MEDICAID*     EITC  HOUSING      SSI
1990                      $24,758      $20,654    $84,658   $8,092  $16,922  $20,125
1996                       23,677       27,344    159,357   24,088   19,877   32,065
Pct change from 1990          -4%          42%        88%     198%      17%      59%
Share of growth               -1%           7%        60%      13%       4%      10%

*Includes nursing home and elderly care
FY 2000 expenditures:
Child Care: $20,580
Job Training: $7,347
Child Support Enforcement: $3,255
Marginal Benefit Reduction Rates (=marginal tax rates)
TANF: differs by state, median = .50
Food Stamps: .30
Medicaid: either 0 or >1
EITC: varies by family size; .34-.40 in phase-in range, 0 in middle range, .16-.21 in phaseout range
Housing: .20-.30
SSI: 0 or .50
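To make the notion of a marginal benefit reduction rate concrete, here is a minimal sketch in Python. It assumes a stylized transfer that pays a guarantee and withdraws it at a constant rate, and a stylized three-range EITC schedule. The guarantee, maximum credit, and phaseout threshold are hypothetical values chosen only for illustration; only the rates (.50, .40, .21) echo the ranges listed above. In this sketch a negative marginal rate means earnings are subsidized, which is how the EITC phase-in range works.

```python
def transfer_benefit(earnings, guarantee, reduction_rate):
    """Stylized transfer: a guarantee reduced by `reduction_rate` per dollar earned, floored at zero."""
    return max(0.0, guarantee - reduction_rate * earnings)


def eitc_credit(earnings, phase_in_rate, max_credit, phase_out_start, phase_out_rate):
    """Stylized EITC: grows in the phase-in range, is flat in the middle, shrinks in the phaseout."""
    credit = min(phase_in_rate * earnings, max_credit)
    if earnings > phase_out_start:
        credit -= phase_out_rate * (earnings - phase_out_start)
    return max(0.0, credit)


def marginal_rate(benefit_fn, earnings, step=1.0):
    """Benefit lost per extra dollar earned; a negative value is a subsidy."""
    return (benefit_fn(earnings) - benefit_fn(earnings + step)) / step


# Hypothetical program parameters (assumptions, not actual program rules).
def tanf(earnings):
    return transfer_benefit(earnings, guarantee=6000.0, reduction_rate=0.50)


def eitc(earnings):
    return eitc_credit(earnings, phase_in_rate=0.40, max_credit=4000.0,
                       phase_out_start=13000.0, phase_out_rate=0.21)


for label, fn, earnings in [("TANF", tanf, 2000.0),
                            ("EITC phase-in", eitc, 5000.0),
                            ("EITC middle", eitc, 11000.0),
                            ("EITC phaseout", eitc, 15000.0)]:
    print(label, round(marginal_rate(fn, earnings), 2))
```

The sketch treats each program in isolation. Combined rates for a family on several programs depend on how each program counts the others' benefits as income, which is not modeled here.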
Recent Reforms
TANF, 1992-1996: reduction in tax rates, work requirements, sanctions, time limits, diversion, family caps
Food Stamps: work requirements, sanctions
Medicaid: expansion of eligibility to non-TANF groups
EITC: expansion in generosity 1990+
Evaluation Methods Used to Date
Will focus mostly on TANF-AFDC
A little on EITC
A little on Medicaid
But not much done on Food Stamps, housing, or SSI
For more reading, see Evaluating Welfare Reform in an Era of Transition, eds. R. Moffitt and S. Ver Ploeg, National Academy Press, 2001
Distinctions
When talking about evaluation, need to make several distinctions
First one is: what is the question being asked?
SOME QUESTIONS
Monitoring is of great value and has an important role to play
But ultimately everyone wants causal analysis
Monitoring (=Descriptive, possibly Longitudinal) vs Evaluation (=causal analysis)
THREE EVALUATION QUESTIONS
(1) What has been the overall effect of welfare reform? I.e., of the whole “bundle” of components?
(2) What has been the effect of individual broad components of welfare reform (work requirements, time limits, etc.)? I.e., what if everything except that one component had been enacted?
(3) What are the effects of detailed strategies (e.g., Work First vs. Human Capital) within components?
DISTINCTION BETWEEN BACKWARD-LOOKING AND FORWARD-LOOKING QUESTIONS
Backward-looking: what was the effect of what has actually happened relative to the old program?
Forward-looking: what would happen if something were changed in the current program, taken as the baseline?
Both questions are of interest, but may require different evaluation methods
WHAT ARE THE BEST EVALUATION METHODS TO ANSWER THE QUESTIONS
Experimental (random assignment)
Nonexperimental: time series modeling, cross-area, cross-area fixed effects, eligibility-based differences in differences, matching, cohort comparisons
Answer: best method depends on the question
Alternative Evaluation Methodologies for Different Questions of Interest

Experimental methods
Overall effects: Poorly suited. Problems: contamination of control group; macro and feedback effects; entry effects; generalizability from only a few areas.
Effects of individual broad components: Moderately well suited. Need to be complemented with nonexperimental analyses for entry effects and generalizability.
Effects of detailed strategies: Well suited. Need to be complemented with nonexperimental analyses for generalizability and, possibly, entry effects.

Non-experimental methods
Overall effects: Moderately well suited. Time-series modeling and comparison group designs using ineligibles are most promising. Problems: lack of cross-area program variation; data limitations.
Effects of individual broad components: Moderately well suited. Cross-area comparison designs, followed over time, are the most promising. Problems: lack of cross-area program variation; measurement of policies; data limitations.
Effects of detailed strategies: Poorly suited. Within-area matching designs most appropriate, then cross-area comparisons. Problems: extreme data limitations and lack of statistical power; matching reliability uncertain.
OTHER EVALUATION ISSUES
Generalization
Need for microsimulation
MSM as a tool for integration and synthesis
Process analysis and qualitative analysis undervalued?
What questions were addressed and methods used in U.S.?
Monitoring questions that were addressed and answered:
What were the short-run outcomes of women who have left welfare post-reform?
What were the overall trends in income and poverty for single mothers over the 1980s and 1990s, pre-reform and post-reform?
Evaluation Questions Addressed
An evaluation question that was addressed and answered: what was the overall impact of early welfare reform on caseloads, employment, income, poverty, and other outcomes?
Pre-1996, when there was state variation: cross-state, state fixed effects model; dummy variable in regression for reform vs no reform
Post-1996, when there was no state variation (in existence of any reform): eligibility-based DID
Compared nationwide trends in single mothers vs married women or single women; nationwide trends in low-educated single mothers vs high-educated single mothers
Problem with the latter: separating the effects of other things happening at the same time (other reforms, the economy, etc)
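The eligibility-based difference-in-differences (DID) comparison described above can be sketched in a few lines of Python. This is only an illustration: the employment rates are made up, and the function name `diff_in_diff` and the record layout are assumptions introduced here, not anything from the studies reviewed.

```python
from statistics import mean


def diff_in_diff(records):
    """records: iterable of (eligible, post_reform, outcome) tuples.

    Returns (post - pre change for eligibles) minus (post - pre change for ineligibles).
    """
    cells = {(g, p): [] for g in (True, False) for p in (True, False)}
    for eligible, post_reform, outcome in records:
        cells[(eligible, post_reform)].append(outcome)
    change_eligible = mean(cells[(True, True)]) - mean(cells[(True, False)])
    change_ineligible = mean(cells[(False, True)]) - mean(cells[(False, False)])
    return change_eligible - change_ineligible


# Made-up employment rates: eligibles could stand for single mothers,
# ineligibles for married women or single women without children.
records = [
    (True, False, 0.55), (True, False, 0.57),    # eligible, pre-reform
    (True, True, 0.66), (True, True, 0.68),      # eligible, post-reform
    (False, False, 0.60), (False, False, 0.62),  # ineligible, pre-reform
    (False, True, 0.63), (False, True, 0.65),    # ineligible, post-reform
]

estimate = diff_in_diff(records)
print(round(estimate, 3))
```

The estimate equals the reform effect only if everything else happening at the same time (other reforms, the economy) moved the two groups' outcomes by the same amount, which is exactly the problem flagged above.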
More Evaluation Questions Addressed
Impact of Detailed Strategies:
Methods: experiments, though not all directly relevant to what was actually enacted
(1) What were the effects of mandating full-time work?
(2) What were the effects of Work First vs. Human Capital strategies?
(3) What were the effects of significant work subsidies greater than those states have generally enacted?
Evaluation Questions Not Asked or Answered
Main one: effects of broad components
Do not know the effects of time limits, work requirements, sanctions, etc., either added to AFDC incrementally or subtracted from new program
Cross-state, state fixed effects: doesn’t work, too much cross-state variation, can’t isolate components
Experiments: could have been done, but weren’t
Other nonexperimental methods: have not been used; some data problems as well (states don’t collect the data)
Other Evaluation Questions not addressed: Detailed Strategies
What would have happened if the time limit had been 4 years instead of 5? 3 instead of 5?
What would have happened if there had been more exemptions to work requirements? Fewer? Stronger sanctions? Weaker ones?
Data
Data always an issue, but will not go over in detail; very US specific
Administrative vs Survey data
Despite the US reputation for excellent data, there were (and still are) serious data limitations in evaluating welfare reform
Forward-looking questions
Have been mostly talking about backward-looking questions, with the exception of some detailed strategies
Forward-looking: some work done with experiments, detailed strategies
“What works for whom” is the goal of these analyses
Some new experiments of this type (ERA, etc)
Some Other Programs
EITC: evaluation mostly using eligibility-based DID
Example: difference in trends over time when EITC expanded between eligibles and ineligibles
Or between families with different numbers of children
See Hotz-Scholz (2003, Moffitt volume)
Medicaid
Have been expansions of coverage to new non-TANF groups
Evaluation method: cross-state, state fixed effects model
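A minimal sketch of a cross-state fixed effects estimate, assuming a balanced state-by-year panel and a single 0/1 policy dummy. Two things here are assumptions added for illustration and not stated in the slides: the model also includes year effects, and the data are synthetic, built so that the true effect is known. The function name `two_way_fe_effect` is hypothetical.

```python
from statistics import mean


def two_way_fe_effect(panel):
    """panel: dict mapping (state, year) -> (policy_dummy, outcome); must be balanced.

    Removes state and year means from both variables (within transformation),
    then regresses the demeaned outcome on the demeaned policy dummy.
    """
    states = sorted({s for s, _ in panel})
    years = sorted({t for _, t in panel})
    if len(panel) != len(states) * len(years):
        raise ValueError("panel must be balanced")

    def demean(idx):
        v = {k: panel[k][idx] for k in panel}
        state_mean = {s: mean(v[(s, t)] for t in years) for s in states}
        year_mean = {t: mean(v[(s, t)] for s in states) for t in years}
        grand_mean = mean(v.values())
        return {k: v[k] - state_mean[k[0]] - year_mean[k[1]] + grand_mean for k in v}

    d, y = demean(0), demean(1)
    return sum(d[k] * y[k] for k in panel) / sum(d[k] ** 2 for k in panel)


# Synthetic panel (assumed values): outcome = state effect + year effect + effect * policy.
state_effect = {"A": 10.0, "B": 14.0, "C": 8.0}
year_effect = {1990: 0.0, 1991: 1.0, 1992: 1.5, 1993: 0.5}
adoption_year = {"A": 1992, "B": 1991, "C": None}  # state C never adopts
true_effect = -2.0

panel = {}
for s in state_effect:
    for t in year_effect:
        adopted = adoption_year[s] is not None and t >= adoption_year[s]
        d = 1.0 if adopted else 0.0
        panel[(s, t)] = (d, state_effect[s] + year_effect[t] + true_effect * d)

fe_estimate = two_way_fe_effect(panel)
print(round(fe_estimate, 6))
```

The design works here because states adopt at different times. If every state changed policy in the same year, the policy dummy would be absorbed by the year effects and the denominator would be zero.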
Other Programs (continued)
Job Training: no studies of new program (“WIA”) but both experimental and nonexperimental evaluations of old programs (JTPA, CETA, Job Corps, Supported Work, etc.)
Some matching methods have been used in the nonexperimental evaluations
Matching
Has not been used much in evaluation of welfare reform or its components
Much skepticism in the welfare reform community about its validity
Research results: mixed, especially in the experimental-matching comparisons
Have yet to establish a basis for determining when matching is working well and when it is not, esp. in advance (i.e., at the time of evaluation design)
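For readers unfamiliar with matching, a minimal sketch of one simple variant: one-to-one nearest-neighbor matching on a single observed covariate, with replacement. The data are made up, and the function name `matched_effect` and the choice of covariate (prior-year earnings) are assumptions for illustration only. Applied evaluations typically match on many covariates or on an estimated propensity score.

```python
def matched_effect(treated, comparison):
    """Each unit is a (covariate, outcome) pair.

    Pairs every treated unit with the comparison unit whose covariate is
    closest (with replacement) and averages the outcome gaps.
    """
    gaps = []
    for x_t, y_t in treated:
        _, y_c = min(comparison, key=lambda unit: abs(unit[0] - x_t))
        gaps.append(y_t - y_c)
    return sum(gaps) / len(gaps)


def naive_difference(treated, comparison):
    """Raw difference in mean outcomes, ignoring the covariate."""
    mean_t = sum(y for _, y in treated) / len(treated)
    mean_c = sum(y for _, y in comparison) / len(comparison)
    return mean_t - mean_c


# Made-up data: covariate = prior-year earnings, outcome = later earnings (thousands).
treated = [(2.0, 6.0), (5.0, 9.5), (8.0, 12.0)]
comparison = [(1.8, 5.0), (5.2, 8.0), (7.5, 11.0), (15.0, 20.0)]

matched = matched_effect(treated, comparison)
naive = naive_difference(treated, comparison)
print(round(matched, 3), round(naive, 3))
```

In this toy data the naive comparison is negative because one comparison unit has much higher prior earnings, while the matched comparison is positive. Matching adjusts only for observed covariates, so the estimate is credible only if no unobserved differences between the groups remain, which is one source of the skepticism noted above.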
Conclusions
Much evaluation work has been conducted in the US
Much has been learned substantively
Much has been learned about strengths and weaknesses of alternative evaluation methods
But mistakes have been made and important questions have gone unanswered
Demonstrates the challenges