Endogeneity & Exogeneity

EXOGENEITY & ENDOGENEITY Shane Thompson Summit Consulting, LLC © 2015

Transcript of Endogeneity & Exogeneity

Page 1: Endogeneity & Exogeneity

EXOGENEITY & ENDOGENEITY

Shane Thompson

Summit Consulting, LLC

© 2015

Page 2: Endogeneity & Exogeneity

EXOGENEITY, ENDOGENEITY, & YOU

Program evaluation, policy impacts, treatment effects

How does our outcome of interest (income, health, free-throw %) change after our independent variable of interest (education, medicine, practice) changes?

Often, we will have to concern ourselves with whether our independent variable of interest is exogenous or endogenous

Page 3: Endogeneity & Exogeneity

EXOGENOUS VARIABLES

An independent variable is exogenous in a dataset if it is assigned/chosen without respect to how it might influence the outcome of interest

Education level assigned without regard to potential income benefit

Medicine assigned without respect to probability of success

Practice hours assigned without respect to how it might improve performance

Randomization of “treatment” assignment ensures that independent variables are exogenous

Page 4: Endogeneity & Exogeneity

ENDOGENOUS VARIABLES

An independent variable is endogenous in a dataset if it is assigned/chosen according to how it might determine the outcome of interest

Income: Bill Gates drops out of college; Shane Thompson goes forever

Health: Patient with family history of heart attacks takes aspirin; patient without does not take it

FT%: Shaq practices free throws; Steve Nash does not

Simple comparisons of treatment-vs-control are biased

College REDUCES income by billions of dollars!

Aspirin INCREASES the probability of heart attacks!

Free-throw practice REDUCES free-throw percentage!

Page 5: Endogeneity & Exogeneity

COMMUTING TIME: SLUGGING

Weather…wait...where…who…wreck?

Exogenous or endogenous?

The weather: EXOGENOUS

The time of day I get in the slug line: ENDOGENOUS

My destination (14th, 18th, or L’Enfant Plaza): ENDOGENOUS

The gender of the driver: EXOGENOUS

A wreck on the HOV lanes: EXOGENOUS

Page 6: Endogeneity & Exogeneity

CORRELATION VS. CAUSATION

Estimates of the effects of endogenous variables on outcome variables are correlational

A change in an outcome variable given a change in an endogenous variable is potentially reflective of changes of several other variables in the model

Estimates of the effects of exogenous variables on outcome variables are causal

A change in an outcome variable given a change in an exogenous variable is attributable to the exogenous variable

Page 7: Endogeneity & Exogeneity

EX: BLUE JEANS AND REVENUE

AL and AC notice that whenever several Summiteers wear jeans into the office, big deliverables are generally due that day.

In leadership meetings they alert the directors to this amazing trend.

Hypothesis: Wearing jeans increases revenue

Page 8: Endogeneity & Exogeneity

# Deliverables   Jeans     # Deliverables   Jeans
1                0         1                0
1                0         3                0
0                0         2                0
0                0         3                0
3                1         5                1
0                0         1                0
2                0         0                0
2                0         3                0
2                0         1                0
6                1         6                1
3                0         3                0
1                0         1                0
3                0         1                0
1                0         0                0
3                1         5                1
1                0         2                0
2                0         0                0
1                0         0                0
0                0         0                0
8                1         5                1
0                0         1                0
1                0         0                0
2                0         0                0
0                0         0                0
6                1         6                1

Page 9: Endogeneity & Exogeneity

JEANS. JEANS! JEANS!

Over the last 50 work days:

Summit averages 5.3 deliverables when Summiteers wear jeans

Summit averages 1.1 deliverables when Summiteers do NOT wear jeans

JEANS INCREASE PRODUCTIVITY!!!

JEANS INCREASE REVENUE!!!
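
The two averages above can be reproduced directly from the 50-day table. A minimal sketch in Python, assuming the table is read left block first, then right block; the variable names are ours, not the deck's:

```python
from statistics import mean

# Deliverables per work day, copied from the table:
# left block (25 days) followed by right block (25 days).
deliverables = [
    1, 1, 0, 0, 3,  0, 2, 2, 2, 6,  3, 1, 3, 1, 3,  1, 2, 1, 0, 8,  0, 1, 2, 0, 6,
    1, 3, 2, 3, 5,  1, 0, 3, 1, 6,  3, 1, 1, 0, 5,  2, 0, 0, 0, 5,  1, 0, 0, 0, 6,
]
# Jeans indicator from the table: in both blocks, every fifth day is a jeans day.
jeans = [0, 0, 0, 0, 1] * 10

avg_jeans = mean(d for d, j in zip(deliverables, jeans) if j == 1)
avg_no_jeans = mean(d for d, j in zip(deliverables, jeans) if j == 0)

# Naive treatment-vs-control comparison (correlational, not causal).
naive_gap = avg_jeans - avg_no_jeans

print(avg_jeans, avg_no_jeans, naive_gap)
```

The gap is large, but nothing in this calculation says why jeans days differ from other days.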

Page 10: Endogeneity & Exogeneity

# Deliverables   Jeans   Day           # Deliverables   Jeans   Day
1                0       Monday        1                0       Monday
1                0       Tuesday       3                0       Tuesday
0                0       Wednesday     2                0       Wednesday
0                0       Thursday      3                0       Thursday
3                1       Friday        5                1       Friday
0                0       Monday        1                0       Monday
2                0       Tuesday       0                0       Tuesday
2                0       Wednesday     3                0       Wednesday
2                0       Thursday      1                0       Thursday
6                1       Friday        6                1       Friday
3                0       Monday        3                0       Monday
1                0       Tuesday       1                0       Tuesday
3                0       Wednesday     1                0       Wednesday
1                0       Thursday      0                0       Thursday
3                1       Friday        5                1       Friday
1                0       Monday        2                0       Monday
2                0       Tuesday       0                0       Tuesday
1                0       Wednesday     0                0       Wednesday
0                0       Thursday      0                0       Thursday
8                1       Friday        5                1       Friday
0                0       Monday        1                0       Monday
1                0       Tuesday       0                0       Tuesday
2                0       Wednesday     0                0       Wednesday
0                0       Thursday      0                0       Thursday
6                1       Friday        6                1       Friday
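
Grouping the same 50 days by day of the week shows where the jeans "effect" comes from. A minimal sketch using the table's values; the dictionary layout and names are our own choice:

```python
from statistics import mean

WEEK = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]

# Values copied from the table: left block, then right block.
deliverables = [
    1, 1, 0, 0, 3,  0, 2, 2, 2, 6,  3, 1, 3, 1, 3,  1, 2, 1, 0, 8,  0, 1, 2, 0, 6,
    1, 3, 2, 3, 5,  1, 0, 3, 1, 6,  3, 1, 1, 0, 5,  2, 0, 0, 0, 5,  1, 0, 0, 0, 6,
]
jeans = [0, 0, 0, 0, 1] * 10
days = WEEK * 10

by_day = {}
for day in WEEK:
    rows = [(d, j) for d, j, w in zip(deliverables, jeans, days) if w == day]
    by_day[day] = {
        "avg_deliverables": mean(d for d, _ in rows),
        "jeans_share": mean(j for _, j in rows),  # fraction of these days with jeans
    }

for day in WEEK:
    print(day, by_day[day])
```

Jeans are worn on every Friday and on no other day, so in these data the jeans indicator cannot be told apart from a Friday indicator.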

Page 11: Endogeneity & Exogeneity

WELL, THAT’S EMBARRASSING…

Jeans DO NOT increase deliverables and DO NOT increase revenue.

Still, enterprising Summiteers seek to find the causal effect of jeans

How might we find the CAUSAL effect of jeans?

Page 12: Endogeneity & Exogeneity

BEST OPTION: RANDOMIZED CONTROLLED TRIAL

Summit randomly assigns casual days

Jean-wearing is exogenous to deliverables, i.e. the assignment to wear jeans is made without regard to how it might influence business

Causal effect of jeans on deliverables: $\text{Avg Deliverables}_{\text{jeans}} - \text{Avg Deliverables}_{\text{no jeans}}$
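
A small simulation can illustrate why random assignment makes the simple difference in averages trustworthy. This is a sketch under made-up assumptions, not Summit data: deliverables depend only on whether the day is a Friday, and jeans have no effect at all.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

N_DAYS = 10_000
days = []
for i in range(N_DAYS):
    is_friday = (i % 5 == 4)
    # Hypothetical data-generating process: deadlines cluster on Fridays,
    # and jeans play no role in how many deliverables go out.
    deliverables = random.randint(3, 8) if is_friday else random.randint(0, 3)
    days.append((is_friday, deliverables))

def diff_in_means(jeans_flags):
    """Average deliverables on jeans days minus average on non-jeans days."""
    treated = [d for (_, d), j in zip(days, jeans_flags) if j]
    control = [d for (_, d), j in zip(days, jeans_flags) if not j]
    return mean(treated) - mean(control)

# Endogenous assignment: jeans are worn exactly on Fridays.
endogenous_jeans = [is_friday for is_friday, _ in days]
# Exogenous assignment: a coin flip decides each day, ignoring deadlines.
randomized_jeans = [random.random() < 0.5 for _ in days]

naive_gap = diff_in_means(endogenous_jeans)
rct_gap = diff_in_means(randomized_jeans)
print(naive_gap, rct_gap)
```

Under the Friday rule the gap is large even though the true effect is zero; under the coin flip the gap should be close to zero, because Fridays end up spread across both groups.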

Page 13: Endogeneity & Exogeneity

NON-RANDOMIZED DATA

Jean-wearing is endogenous to deliverables, i.e. the decision to wear jeans may be made according to deliverable deadlines and client meetings

Causal effect of jeans on deliverables IS NOT: $\text{Avg Deliverables}_{\text{jeans}} - \text{Avg Deliverables}_{\text{no jeans}}$

Why??

Page 14: Endogeneity & Exogeneity

BALANCE IN TREATMENT ASSIGNMENT

Exogenous Treatment

Treatment: Age, Race, Gender, Income

Control: Age, Race, Gender, Income

Endogenous Treatment

Treatment: Older, Black, Female, $$$

Control: Younger, White, Male, $

Page 15: Endogeneity & Exogeneity

QUASI-EXPERIMENTAL METHODS: MAKING ENDOGENOUS VARIABLES EXOGENOUS

Difference-in-Difference

1. Identify a suitable control group (Firm X)

2. Verify parallel trend in deliverables before treatment (controlling for obs characteristics)

3. Verify no spillover effects

4. Jeans are exogenous, conditional on the parallel trend (which is conditional on obs characteristics)
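
The steps above can be sketched as a two-by-two calculation. All numbers here are hypothetical weekly deliverables invented for illustration; the sketch skips the observable-characteristics adjustment and assumes no spillovers:

```python
from statistics import mean

# Hypothetical weekly deliverables; every number is made up.
summit_pre = [10, 11, 12, 13]    # Summit, before the jeans policy
summit_post = [16, 17, 18, 19]   # Summit, after the jeans policy
firm_x_pre = [6, 7, 8, 9]        # Firm X (control), same weeks
firm_x_post = [10, 11, 12, 13]

def weekly_changes(series):
    """Week-over-week changes, used as a rough parallel-trend check."""
    return [b - a for a, b in zip(series, series[1:])]

# Step 2: the pre-treatment trends should match.
parallel_pre_trend = (
    mean(weekly_changes(summit_pre)) == mean(weekly_changes(firm_x_pre))
)

# Difference-in-difference: Summit's change minus Firm X's change.
summit_change = mean(summit_post) - mean(summit_pre)
firm_x_change = mean(firm_x_post) - mean(firm_x_pre)
did_estimate = summit_change - firm_x_change

print(parallel_pre_trend, did_estimate)
```

Firm X's change stands in for what would have happened to Summit without jeans, so only the extra change at Summit is credited to the policy.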

Page 16: Endogeneity & Exogeneity

QUASI-EXPERIMENTAL METHODS: MAKING ENDOGENOUS VARIABLES EXOGENOUS

Regression Discontinuity

1. Summit management allows staff above a fixed threshold of billable hours to wear jeans (threshold is unknown to staff)

2. Staff just above and just below the threshold are equally productive

3. Jeans are exogenous immediately above and below threshold

[Figure: deliverables in 2014 (vertical axis) plotted against billable hours in hundreds (horizontal axis)]
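
One common way to estimate the jump at the threshold is to fit a separate line on each side, within a narrow window, and compare the two predictions at the cutoff. A sketch with invented, noise-free data; the cutoff, bandwidth, and the size of the jump are all assumptions chosen for illustration:

```python
from statistics import mean

THRESHOLD = 15.0   # hypothetical cutoff, in hundreds of billable hours
BANDWIDTH = 2.0    # hypothetical window on each side of the cutoff
TRUE_JUMP = 3.0    # jump built into the made-up data

# Made-up staff: deliverables rise with hours and jump above the cutoff.
hours = [10.125 + 0.25 * k for k in range(40)]
deliverables = [
    2 + 0.5 * h + (TRUE_JUMP if h > THRESHOLD else 0.0) for h in hours
]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar

def predict_at_threshold(points):
    xs, ys = zip(*points)
    slope, intercept = fit_line(xs, ys)
    return intercept + slope * THRESHOLD

data = list(zip(hours, deliverables))
below = [(h, d) for h, d in data if THRESHOLD - BANDWIDTH <= h < THRESHOLD]
above = [(h, d) for h, d in data if THRESHOLD < h <= THRESHOLD + BANDWIDTH]

rd_estimate = predict_at_threshold(above) - predict_at_threshold(below)
print(rd_estimate)
```

Fitting lines rather than comparing raw averages keeps the ordinary upward slope in billable hours from being mistaken for a jeans effect.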

Page 17: Endogeneity & Exogeneity

QUASI-EXPERIMENTAL METHODS: MAKING ENDOGENOUS VARIABLES EXOGENOUS

Propensity Score Matching

1. Summit allows all staff to wear jeans if they want

2. Estimate the probability of jean-wearing given observable staff characteristics

3. Assume we observe ALL relevant predictors of jean-wearing

4. Jeans are exogenous at each probability level
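
The steps above can be sketched with a single observable characteristic. The staff records are hypothetical, and the propensity score is estimated as a simple within-group frequency rather than the logistic regression typically used in practice:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical staff records: (tenure group, wears jeans, deliverables).
staff = [
    ("junior", 1, 3), ("junior", 1, 3), ("junior", 1, 3), ("junior", 0, 2),
    ("senior", 1, 7), ("senior", 0, 6), ("senior", 0, 6), ("senior", 0, 6),
]

# Step 2: estimate P(jeans | observable characteristics) per group.
jeans_by_group = defaultdict(list)
for group, wears_jeans, _ in staff:
    jeans_by_group[group].append(wears_jeans)
propensity = {g: mean(flags) for g, flags in jeans_by_group.items()}

treated = [(propensity[g], y) for g, j, y in staff if j == 1]
controls = [(propensity[g], y) for g, j, y in staff if j == 0]

def matched_control_outcome(p):
    """Average outcome of the controls whose propensity is nearest to p."""
    nearest = min(abs(pc - p) for pc, _ in controls)
    return mean(y for pc, y in controls if abs(pc - p) == nearest)

# Step 4: compare each jeans-wearer with controls at the same propensity.
matched_effect = mean(y - matched_control_outcome(p) for p, y in treated)
naive_gap = mean(y for _, y in treated) - mean(y for _, y in controls)

print(propensity, matched_effect, naive_gap)
```

In this made-up example the naive gap has the wrong sign because jeans-wearers are mostly junior staff; comparing within propensity levels removes that imbalance, but only under step 3's assumption that nothing unobserved drives jean-wearing.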

Page 18: Endogeneity & Exogeneity

QUASI-EXPERIMENTAL METHODS: MAKING ENDOGENOUS VARIABLES EXOGENOUS

Synthetic Control Method

1. Identify several potential control groups (firms)

2. Construct a synthetic Summit that is a weighted combination of other firms

3. Constrain the synthetic Summit to be approx equal to Summit in observable characteristics and deliverables before the jeans policy

4. Jeans are exogenous, conditional on the pre-treatment equality between Summit and the synthetic control
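
A toy version of the weighting step, with two hypothetical donor firms and a grid search over the weight. All numbers are invented, and the sketch matches on pre-policy deliverables only, leaving out the observable characteristics in step 3:

```python
from statistics import mean

# Hypothetical deliverables per period; "pre" and "post" refer to the jeans policy.
firm_a = {"pre": [4, 5, 6], "post": [7, 8]}
firm_b = {"pre": [10, 11, 12], "post": [13, 14]}
summit = {"pre": [8.5, 9.5, 10.5], "post": [13.5, 14.5]}

def synthetic(weight_a, period):
    """Weighted combination of the two donor firms (weights sum to 1)."""
    return [
        weight_a * a + (1 - weight_a) * b
        for a, b in zip(firm_a[period], firm_b[period])
    ]

def pre_period_error(weight_a):
    """Squared gap between synthetic Summit and Summit before the policy."""
    return sum(
        (s - y) ** 2 for s, y in zip(synthetic(weight_a, "pre"), summit["pre"])
    )

# Steps 2-3: pick the weight in [0, 1] that best reproduces pre-policy Summit.
candidate_weights = [k / 100 for k in range(101)]
best_weight_a = min(candidate_weights, key=pre_period_error)

# Step 4: the post-policy gap between Summit and its synthetic twin.
effect = mean(
    y - s for y, s in zip(summit["post"], synthetic(best_weight_a, "post"))
)
print(best_weight_a, effect)
```

With more than two donor firms the grid search would be replaced by a constrained optimizer, but the logic is the same: fit the weights on the pre-policy period, then read the effect off the post-policy gap.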


Page 19: Endogeneity & Exogeneity

CONCLUSIONS

Correlation ≠ Causation

Data are generally messy, poorly-tracked, and non-experimental (rife with endogeneity!)

Cutting edge evaluators, statisticians, and econometricians must be able to:

1. identify endogenous and exogenous variables

2. implement statistical methods to mitigate endogeneity

The causal effect of jeans is 10,000 additional deliverables