PODER Experimental Design Catherine C. Eckel – Texas A&M University Rick K. Wilson – Rice...
-
Upload
margaret-fletcher -
Category
Documents
-
view
215 -
download
0
Transcript of PODER Experimental Design Catherine C. Eckel – Texas A&M University Rick K. Wilson – Rice...
Capeto
wn
PODER
Experimental Design
Catherine C. Eckel – Texas A&M University
Rick K. Wilson – Rice University
1
7/7
/14
2
Capeto
wn
IN-CLASS EXPERIMENT 1
7/7
/14
3
Capeto
wn
EXPERIMENT
Rules: Your cards are private and may not be revealed to
anyone except the experimenter No talking unless you are asked to do so Keep your earnings private You must pass 2 cards to the experimenter
Earnings in Round: Black Cards are worth 0 Red Cards held in your hand are worth 1 Red Cards in the experimenter stack are
worth .10
7/7
/14
4
Capeto
wn
EXPERIMENT
Payoffs:
1 x (# Red Cards Kept) + 0.1*(# Red Cards in Deck)
7/7
/14
5
Capeto
wn
WHAT IS THIS?
What is the experiment? What were the treatments? What kind of design was used?
Problems with execution?
7/7
/14
6
Capeto
wn
A WELL PLANNED EXPERIMENT ONLY NEEDS
T C
OR
7/7
/14
7
Capeto
wn
WHAT IS AN EXPERIMENT?
Definition: A test of a hypothesis or demonstration of a fact under conditions manipulated by a researcher.
Key elements: Control, control, control Simplify, simplify, simplify Randomize, randomize, randomize
7/7
/14
OBJECTIVES OF EXPERIMENTS Testing theories Establish empirical regularities as a basis for
new theories Testing institutions and environments Policy advice and wind-tunnel experiments Measurement
Preferences: Goods, risk, fairness, time
8
Capeto
wn
7/7
/14
9
Capeto
wn
WHAT AN EXPERIMENT DOESN’T DO
Substitute for thinking Generate hypotheses Not an all-purpose tool
7/7
/14
OBJECTIVES I:THEORY AND EXPERIMENTS
Test a theory or discriminate between theoriesFormal theory provides the basis for
experimental designTest a theory on its own domain:
Implement the conditions of the theory (e.g., preference assumptions, technology assumptions, institutional assumptions)
(Best to have an alternative hypothesis)Compare the prediction(s) with the
experimental outcome10
Capeto
wn
7/7
/14
THEORY, CONT’D What if the results reject theory? Explore the causes of a theory’s failure
Check each of the assumptions Explore parameter space Find out when the theory fails and when it succeeds Design proper control treatments that allows causal
inferences about why the theory fails
11
Capeto
wn
7/7
/14
12
Capeto
wn
OBJECTIVES II:ESTABLISHING REGULARITIES
Finding patterns Lab is inexpensive Manipulations are easy
Pinpointing effects Interventions, GOTV
Fishing … … in the right Lake
7/7
/14
OBJECTIVES III:INSTITUTIONS AND ENVIRONMENTS
(What is an institution?) Compare environments within the same
institution How robust are the results across different
environments? (e.g., CPRs) Compare institutions within the same
environment Allows for comparisons even when no theory about
the effects of the institution is available (Example: cheap talk vs. face-to-face communications in PG)
How to compare? Usual aggregate welfare measure: Aggregate
amount of money earned divided by the maximum that could be earned
13
Capeto
wn
7/7
/14
OBJECTIVES IV:POLICY AND WIND-TUNNEL EXPERIMENTS
Evaluate policy proposals Which auctions generate the higher revenue? (e.g.,
in arts auctions or broadband license auctions) Do emission permits allow efficient pollution control? How should airport slots be allocated?
The laboratory as a wind tunnel for new institutions What is the distributional consequence of
implementing a “transferable quota” system in a fishery?
Market for research space on space station?
14
Capeto
wn
7/7
/14
OBJECTIVES V:MEASUREMENT: ELICITATION OF PREFERENCES
Valuation to Inform PolicyHow much should the government spend on
avoiding traffic injuries?How much should be spend on the
conservation of nature? Measuring people’s valuations is hard
Are people risk seeking/averse?Who is cooperative?
Requires a theory of individual preferences and knowledge about the strength of particular “motives” (preferences). 15
Capeto
wn
7/7
/14
16
Capeto
wn
EXAMPLE: RISK AVERSION
Economists have a theory of decision making under uncertainty: Expected utility
Suggests that ANY measure of the curvature of the utility function will do: Valuation of lotteries
Willingness to pay/accept Becker Degroot Marschak method
Choices between lotteries and certain amounts Choices among lotteries
But these give different answers. (More on request....)
7/7
/14
17
Capeto
wn
HOW THE READINGS FIT
Establishing Regularities: Chaudhuri (2011); Cardenas and Carpenter (2008); Herrmann, Thöni and Gächter (2008)
Measurement/Elicitation of Preferences: Rustagi, Engel and Kosfeld (2010); Karlan (2005)
Institutions and Environments: Rustagi, Engel and Kosfeld (2010); Heinrich et al. (2010)
7/7
/14
18
Capeto
wn
OBJECTIONS TO EXPERIMENTS
Objection: “Experiments are unrealistic.” All models are unrealistic
They leave out many aspects of reality. Simplicity is a virtue – focuses on critical aspects of a
situation (a causal mechanism or logic of a complex relationship)
Experiments are like models They leave out many aspects of reality Focus on critical aspects (cause or precision of
estimate) Realism may be important but so is control.
7/7
/14
19
Capeto
wn
OBJECTIONS CONTINUED
Objection: “Experiments are artificial.” Biased subject pool (students) Low stakes Small number of participants Inexperienced subjects Anonymity
All can be tested in the lab Such testing rarely affects important results
Example: ultimatum game
7/7
/14
20
Capeto
wn
OBJECTIONS CONTINUED
Objection: “Experiments say nothing about the real world.” External validity Generalization
The experiment involves real decisions for the subjects: real consequences
What is the aim of the experiment? Internal validity – ensuring that the causal
inference is correct External validity: secondary
7/7
/14
21
Capeto
wn
LIMITS OF EXPERIMENTS
Control is never perfect (more later) Weather, Laboratory environment No real control over all other motives (no
dominance) Self-selection: who takes part in the experiment?
Randomization is difficult Experiments (like models) are never general,
just examples: Applies to both lab and field experiments
7/7
/14
22
Capeto
wn
WHAT EXPERIMENTS DO WELL
Test for causal claims Inform theory (e.g. fairness) Allow for replication Develop measures (problem of reliability and
validity) Explore parameters of interest (MPCR) Develop counterfactuals (e.g., macro)
7/7
/14
23
Capeto
wn
CAUSAL CONSIDERATIONS
7/7
/14
24
Capeto
wn
RUEBEN CAUSAL MODEL (RCM)
The dilemma:
The same “i” can’t be in two states at the same time!
FPCI: fundamental problem of causal inference
δi =
iT
-
iCTrue Treatment Effect
7/7
/14
25
Capeto
wn
RCM – BEST CORRECTION
iTjT
kTmC
nC
pC
-ATE =
Mean Treatment Mean Control
Analog: y = α + β1X1 + ε, Xi = (0, 1)
7/7
/14
26
Capeto
wn
SUTVA – STABLE UNIT TREATMENT VALUE ASSUMPTION
Treatment ONLY affects the treated.
iTjT
kTmC
nC
pC
7/7
/14
27
Capeto
wn
SUTVA – AUXILIARY ASSUMPTIONS (FOR INFERENCE)
Assumption 1: Average treatment effect is homogeneous across individuals
iT jT kT
= =
7/7
/14
28
Capeto
wn
SUTVA – AUXILIARY ASSUMPTIONS (FOR INFERENCE)
Assumption 2: Treatment is invariant to manner delivered
iT jT kT
= =
Computer Paper Oral
7/7
/14
29
Capeto
wn
SUTVA – AUXILIARY ASSUMPTIONS (FOR INFERENCE)
iTjT
kT
Assumption 3: All possible states of the world are observed
=
Place 1 Place 2
7/7
/14
30
Capeto
wn
SUTVA – AUXILIARY ASSUMPTIONS (FOR INFERENCE)
Assumption 4: Causal question of interest is historically bound to the data. (e.g., no history effects)
Sept. 9, 2001 Sept. 13, 2001
7/7
/14
iTjT
kT
≠
31
Capeto
wn
SUTVA – AUXILIARY ASSUMPTIONS (FOR INFERENCE)
Assumption 5: The treatment precedes the outcome
7/7
/14
32
Capeto
wn
FIXES FOR THREATS TO INFERENCE, I
Threats Lab Field
One-sided noncompliance
Rare ITT problem
Two-sided noncompliance
Rare ITT problem
Attrition Rare Common observables among treated and untreated?
Interference between experimental units
Rare Source of spillover?IV or matching
Heterogeneous treatment effects
Rare, increase sample size
Subgroup analysis; clustered designs
Covariates Blocking Stratification
7/7
/14
33
Capeto
wn
FIXES FOR THREATS TO INFERENCE, II
Threats Lab Field
Multiple Outcomes Rare – split into distinct orthogonal treatments
Often necessary because of cost of carrying out the design
Hawthorne/John Henry Effects
Rare Within subject/unit data
Generalization Problematic Context dependent?Replication is expensive
Selection to treatment
Rare under random assignment
May be due to NGO demands. Matching or IV might help
7/7
/14
34
Capeto
wn
DESIGN CONSIDERATIONS
7/7
/14
Claim by Rustagi et al. (2010): Forest management can only be successful if: (1) there are large numbers of conditional cooperators and (2) there is an investment in monitoring and punishment.
What would this claim look like under different designs?
COMMON RESEARCH DESIGNS: ONE-SHOT DESIGN (NON-EXPERIMENTAL)
35
Capeto
wn
T O
7/7
/14
Rustagi et al. (2010): With their observational data they have field measures of outcomes (potential crop trees) and survey measures of costly monitoring (amount of time spent on forest patrol).
COMMON DESIGNS: ONE-SHOT DESIGN (NON-EXPERIMENTAL)
36
Capeto
wn
T O
Inference:
Statistics: descriptive or kitchen sink
y = α + β1X1 + β2X2 + … + βmXm + ε
SUTVA violations: Everyone is treated (maybe)?
Rustagi et al. (2010): Using observation data only, know nothing about “types” of actors. “Treatment” might be self-report of time spent in monitoring. But monitoring may be purely social or status enhancing – lots of potential explanations.
7/7
/14
PRE/POST-TEST DESIGN(NON-EXPERIMENTAL)
37
Capeto
wn
T O2O1
7/7
/14
Rustagi et al. (2010): With their observational data they have field measures of outcomes (potential crop trees) and survey measures of costly monitoring (amount of time spent on forest patrol). An improvement would be to measure monitoring time at time 1 and again at time 2 once there was some intervention.
PRE/POST-TEST DESIGN(NON-EXPERIMENTAL)
38
Capeto
wn
T O2O1Inference: Treatment might have caused a difference
Statistics: O1 ≠ O2
SUTVA violations: Everyone is treated (maybe)?Other things may have changed at the same timeAll possible states of the world are not observed.
Rustagi et al. (2010): Again, know nothing about “types” of actors. “Treatment” might be some intervention affecting monitoring. But all 49 CPRs are given same tenure rights.
7/7
/14
STATIC GROUP COMPARISON(NON-EXPERIMENTAL)
39
Capeto
wn
T
O1B
O1AGroup A
Group B
7/7
/14
Rustagi et al. (2010): With their observational data they have field measures of outcomes (potential crop trees) and survey measures of costly monitoring (amount of time spent on forest patrol). Treatment could be differences in attention to tenure rights for harvesting the forest.
STATIC GROUP COMPARISON(NON-EXPERIMENTAL)
40
Capeto
wn
T
O1B
O1AGroup A
Group B (Control group?No, not Randomized.)
Inference: Treatment might have caused a difference
Statistics: O1A ≠ O1B
SUTVA violations: Not clear that treatment produced any observed difference between treated and untreated
Rustagi et al. (2010): Suppose selection into the treatment by CPR (e.g., previously more successful CPRs adopt tenure rules)
y = α + β1X1 + β2X2 + … + βmXm + ε
7/7
/14
RANDOMIZED GROUP COMPARISON(EXPERIMENTAL)
41
Capeto
wn
T
O1B
O1ART
RC
7/7
/14
Rustagi et al. (2010): With their observational data they have field measures of outcomes (potential crop trees) and survey measures of costly monitoring (amount of time spent on forest patrol). Experimental measure is (1) cooperation and (2) beliefs.
RANDOMIZED GROUP COMPARISON(EXPERIMENTAL)
42
Capeto
wn
T
O1B
O1ART
RCInference: Very likely the treatment caused the difference
Statistics: O1A ≠ O1BSUTVA violations:
All possible states of the world may not be observedHistorical boundedness IF everything is not run at
same time
Rustagi et al. (2010): Endogenous selection to “type” – no real assignment.
y = α + β1X1 + ε
7/7
/14
PRE/POST CONTROL(EXPERIMENTAL)
43
Capeto
wn
T O2TO1T
O2CO1C
RTRC
7/7
/14
Rustagi et al. (2010): With their observational data they have one measure of outcomes (potential crop trees). Experimental measure is (1) cooperation and (2) beliefs.
PRE/POST CONTROL(EXPERIMENTAL)
44
Capeto
wn
T O2TO1T
O2CO1C
RTRC
Inference: Treatment probably caused a difference
Statistics: O1T = O1C ; O2T ≠ O2C SUTVA violations:
Treatment homogeneous across individuals?Treatment invariant to delivery method?
Rustagi et al. (2010): Treatment 1 is endogenously selected (conditional cooperator or not) and treatment 2 is a mediator (beliefs about cooperation of others).
y = α + β1X1 + ε
where y = O2 – O1
7/7
/14
SOLOMON FOUR-GROUP(EXPERIMENTAL)
45
Capeto
wn
T O2T1O1T1
O2C1O1C1
RT1RC1
T O2T2
O2C2
RT2RC2
7/7
/14
SOLOMON FOUR-GROUP(EXPERIMENTAL)
46
Capeto
wn
T O2T1O1T1
O2C1O1C1
RT1RC1
T O2T2
O2C2
RT2RC2
Inference: Treatment very likely caused a difference
Statistics: O1T1 = O1C1 = O2C1 = O2C2 ; O2T1 = O2T2 ≠ O2C1 = O2C2
7/7
/14
SOLOMON FOUR-GROUP(EXPERIMENTAL)
47
Capeto
wn
T O2T1O1T1
O2C1O1C1
RT1RC1
T O2T2
O2C2
RT2RC2
NOTE: This is easy to accomplish in the Lab, but often not efficient. It is a nightmare in the field.
7/7
/14
48
Capeto
wn
EFFICIENCY IN DESIGN:COUNTER-BALANCE/WITHIN SUBJECT
Counter-balanced designs O1 TA O2 TB O3 and O1 TB O2 TA O3
Builds within subject design Decreases the number of trials Accounts for treatment ordering effect
Cross-over designs O1 TA O2 TB O3 TA O4
Accounts for treatment plus learning
7/7
/14
49
Capeto
wn
EFFICIENCY IN DESIGN:BLOCKING (AND MATCHING) Randomized Blocking Designs
Suppose a 2(H,L)x2(B,S) within subject design with a 4 game ordering effect (HB, HS, LB,LS). Would require 16 cells under complete factorial design Blocking allows 4 cells: (1) HB,HS,LB,LS (2)
HS,LB,LS,HB(3) LB,LS,HB,HS (4) LS,HB,HS,LB
Assumptions Blocks must be homogeneous Blocks must be randomly assigned
Blocking on “nuisance” variables Sex is not randomly assigned
y = α + β1X1 + β2X2 + ε
Treatment Sex of Subject
7/7
/14
50
Capeto
wn
CALIBRATION
Keep in mind that you are producing a data set
Include a “baseline” in the experimental design
Set parameters so you can be sure to tell if hypothesis is supported
Try to factor in “noise” in behavior – variability in the performance of the subjects. Lots of noise means results are hard to determine.
Develop criteria for rejection
7/7
/14
51
Capeto
wn
ADVICE ON CONTROLS
Control everything that might be controlled (and collect data on everything that might be uncontrolled)
Randomize everything else Maximize contrasts with treatments (do you
really need to use low, medium and high manipulations?)
If a “nuisance” variable is suspected to interact with a treatment, then make it a separate treatment.
If a “nuisance” variable is too much of a problem, then make it a constant (blocking) – comparative statics will then play out.
7/7
/14
52
Capeto
wn
GENERAL REMARK
Whether the conditions implemented in the laboratory are also present in reality will always be subject to some uncertainty.
Therefore, laboratory experiments are no substitute for the analysis of field happenstance data for the conduct and the analysis of field
experiments and for survey data.
7/7
/14