ENDOGENEITY - SIMULTANEITY


Page 1: ENDOGENEITY - SIMULTANEITY

ENDOGENEITY - SIMULTANEITY

Development Workshop

Page 2: ENDOGENEITY - SIMULTANEITY

What is endogeneity and why do we not like it? [REPETITION]

Three causes:

– X influences Y, but Y reinforces X too

– Z causes both X and Y fairly contemporaneously

– X causes Y, but we cannot observe X, and Z (which we observe) is influenced by X but also by Y

Consequences:

– No matter how many observations we have, the estimators remain biased (this is called inconsistency)

– Ergo: whatever point estimates we find, we cannot even tell whether they are positive/negative/significant, because we do not know the size of the bias and have no way to estimate it
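The first cause and its consequence can be illustrated with a small simulation. This is only a sketch under assumed values: the two-equation system, the coefficients `beta` and `gamma`, and the function names are all hypothetical and not taken from the slides. It shows an OLS slope that settles near 0.8 instead of the true 0.5, regardless of sample size.

```python
import random

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n, beta=0.5, gamma=0.5, seed=0):
    """Draw n observations from the assumed system y = beta*x + u, x = gamma*y + v."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        # Reduced form: solve the two equations jointly for x and y.
        xs.append((gamma * u + v) / (1 - beta * gamma))
        ys.append((beta * v + u) / (1 - beta * gamma))
    return xs, ys

for n in (100, 10_000, 100_000):
    x, y = simulate(n)
    print(n, round(ols_slope(x, y), 3))
# With unit error variances the OLS slope converges to
# (beta + gamma) / (1 + gamma**2) = 0.8, not to the true beta = 0.5.
```

More data only makes the estimate more precisely wrong, which is the sense in which the estimator is inconsistent.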

Page 3: ENDOGENEITY - SIMULTANEITY

The magic of „ceteris paribus”

Each regression is actually ceteris paribus

Problem: data may be at odds with ceteris paribus

Examples?

Page 4: ENDOGENEITY - SIMULTANEITY

Problems with Inferring Causal Effects from Regressions

Regressions tell us about correlations but ‘correlation is not causation’

Example: regression of whether a person currently has a health problem on whether they were in hospital in the past year:

 HEALTHPROB |      Coef.   Std. Err.      t
------------+---------------------------------
    PATIENT |    .262982   .0095126    27.65
      _cons |    .153447    .003092    49.63

Do hospitals make you sick? – a causal effect
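A positive coefficient like this can arise even when hospitals help. The sketch below uses invented numbers, not the data behind the table: an unobserved `frailty` variable (hypothetical) raises both the chance of a hospital stay and the chance of a health problem, while the assumed true effect of a stay is slightly negative.

```python
import random

def simulate_hospital(n=50_000, true_effect=-0.05, seed=1):
    """Return (patient, healthprob) pairs driven by unobserved frailty."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        frailty = rng.random()                        # unobserved, uniform on [0, 1]
        patient = 1 if rng.random() < 0.5 * frailty else 0
        p_problem = 0.1 + 0.6 * frailty + true_effect * patient
        healthprob = 1 if rng.random() < p_problem else 0
        rows.append((patient, healthprob))
    return rows

def naive_coefficient(rows):
    """OLS slope of a binary outcome on a binary regressor = difference in means."""
    treated = [h for p, h in rows if p == 1]
    control = [h for p, h in rows if p == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

rows = simulate_hospital()
print("assumed true effect of a hospital stay:", -0.05)
print("naive regression coefficient:", round(naive_coefficient(rows), 3))
```

The naive coefficient comes out positive because patients were frailer to begin with, not because the stay harmed them.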

Page 5: ENDOGENEITY - SIMULTANEITY

The problem in causal inference in case of simultaneity

[Diagram: Confounding Influence, Treatment, Outcome; legend: Observed Factor, Unobserved Factor]

Page 6: ENDOGENEITY - SIMULTANEITY

Any solutions?

[Diagram: Confounding Influence, Treatment, Outcome; legend: Observed Factor, Unobserved Factor]

Page 7: ENDOGENEITY - SIMULTANEITY

Instrumental Variables solution…

[Diagram: Instrumental Variable(s), Treatment, Outcome, Confounding Influence; legend: Observed Factor, Unobserved Factor]
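A minimal sketch of the instrumental-variables idea, under an assumed linear data-generating process (the variable names, coefficients and the simple covariance-ratio estimator are illustrative choices, not taken from the slides). The instrument `z` shifts the treatment `x` but is unrelated to the unobserved confounder `c`.

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / n

def simulate_iv(n=50_000, beta=1.0, seed=2):
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)               # instrument: moves x, independent of c
        c = rng.gauss(0, 1)                # unobserved confounder
        xi = zi + c + rng.gauss(0, 1)      # treatment
        yi = beta * xi + 2.0 * c + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y

z, x, y = simulate_iv()
ols = cov(x, y) / cov(x, x)   # contaminated by c
iv = cov(z, y) / cov(z, x)    # uses only the variation in x induced by z
print("OLS:", round(ols, 3), " IV:", round(iv, 3), " assumed true beta: 1.0")
```

The IV ratio recovers a value close to the assumed `beta` only because `z` is constructed to be valid here; with real data that validity has to be argued, it cannot be read off the estimate.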

Page 8: ENDOGENEITY - SIMULTANEITY

Fixed Effects Solution… (DiD does pretty much the same)

[Diagram: Fixed Influences, Confounding Influence, Treatment, Outcome; legend: Observed Factor, Unobserved Factor]

Page 9: ENDOGENEITY - SIMULTANEITY

Short motivating story – ALMPs in Poland

Basic statement: 50% of the unemployed have found employment because of ALMPs (active labour market programmes)

Facts:

– 50% of whom? – only of those who were treated (only those were monitored)

– only 90% of the treated completed the programmes

– of those who completed, indeed 50% work, but only 60% of those who work say it was because of the programme

So how many are actually employed because of the programme?
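Taking the three shares quoted in the facts at face value, the question can be answered by chaining them. This is only back-of-the-envelope arithmetic with hypothetical variable names. The last share is self-reported, so even the final number is not a causal estimate.

```python
# Shares as quoted in the facts above; everything is relative to the treated.
completed = 0.90                   # treated who completed the programme
employed_given_completed = 0.50    # completers who work
credit_given_employed = 0.60       # workers who credit the programme

gross = completed * employed_given_completed   # treated who completed and work
net = gross * credit_given_employed            # ... and credit the programme

print(f"completed and employed: {gross:.0%}")               # 45%
print(f"employed 'because of the programme': {net:.0%}")    # 27%
```

So the headline "50%" shrinks to roughly 27% of the treated, before asking what would have happened to them without the programme.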

Page 10: ENDOGENEITY - SIMULTANEITY

Short motivating story – ALMPs in Poland

[Chart: bars labelled "Completed training …", "... found employment ..." and "... thanks to programme …"; values shown: 90%, 52, 30; annotations: Product, Gross effectiveness, Net effectiveness, Net efficiency? (???)]

Page 11: ENDOGENEITY - SIMULTANEITY

Basic problems in causal inference

Compare somebody „before” and „after”

– If they were different already before, the differential will be wrongly attributed to „treatment”

– Can we measure/capture this inherent difference?

– Does it stay unchanged „before” and „after”?

– What if we only know „after”?

• If the difference stays the same => DiD estimator => assumption that cannot be tested for

• If the difference cannot be believed to stay the same?
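The logic of the bullets above can be sketched with simulated data. All numbers here are assumptions: the treated group starts 2 units lower, both groups share a common trend of +1, and the assumed true effect is +0.5. The two naive comparisons are then set against the DiD estimator.

```python
import random
from statistics import fmean

def simulate_panel(n=20_000, effect=0.5, seed=3):
    """Outcomes keyed by (group, period); group 1 is treated in period 1."""
    rng = random.Random(seed)
    data = {(g, t): [] for g in (0, 1) for t in (0, 1)}
    for g in (0, 1):
        for _ in range(n):
            level = -2.0 * g + rng.gauss(0, 1)   # fixed pre-existing difference
            for t in (0, 1):
                y = level + 1.0 * t + effect * g * t + rng.gauss(0, 1)
                data[(g, t)].append(y)
    return data

m = {key: fmean(values) for key, values in simulate_panel().items()}

before_after = m[(1, 1)] - m[(1, 0)]                     # trend + effect
after_only = m[(1, 1)] - m[(0, 1)]                       # level gap + effect
did = (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])  # effect only

print("before-after:", round(before_after, 2))
print("after only:  ", round(after_only, 2))
print("DiD:         ", round(did, 2))
```

DiD works here only because the simulated gap between the groups is constant over time, which is exactly the assumption that cannot be tested.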

Page 12: ENDOGENEITY - SIMULTANEITY

Faked counterfactual or generating a parallel world

o MEDICINE: uses control groups – equally sick people who get a different treatment or a placebo => experimenting

o What if an experiment is impossible?

2011-04-21, Master's seminar, session 4

Page 13: ENDOGENEITY - SIMULTANEITY

What if an experiment is impossible?

Only cross-sectional data:

– Regression Discontinuity Design

– Propensity Score Matching

– Instrumental variables

Panel data:

– Before-After estimators

– Difference-in-Difference estimators (DiD)

– Propensity Score Matching + DiD

Page 14: ENDOGENEITY - SIMULTANEITY

Propensity Score Matching

[Diagram: Confounding Influence, Treatment (shown twice), Outcome; legend: Observed Factor, Unobserved Factor]

Page 15: ENDOGENEITY - SIMULTANEITY

Propensity score matching

Group             | Y1                              | Y0
Treated (D=1)     | observed                        | counterfactual (does not exist)
Nontreated (D=0)  | counterfactual (does not exist) | observed

Average treatment effect

E(Y1-Y0)=E(Y1)-E(Y0)

Average treatment effect for the untreated

E(Y1-Y0|D=0)=E(Y1|D=0)-E(Y0|D=0)

Average treatment effect for the treated (ATT)

E(Y1-Y0|D=1)=E(Y1|D=1)-E(Y0|D=1)
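In a simulation, unlike in real data, both potential outcomes are visible, so the three effects can be computed directly and compared with what the observed cells alone would give. The data-generating process below is an assumption made for illustration: gains are 1 + a, and units with a larger a are more likely to be treated.

```python
import random
from statistics import fmean

def simulate_potential_outcomes(n=50_000, seed=4):
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        a = rng.random()                   # heterogeneity, uniform on [0, 1]
        y0 = a + rng.gauss(0, 1)           # outcome without treatment
        y1 = y0 + 1.0 + a                  # outcome with treatment, gain = 1 + a
        d = 1 if rng.random() < a else 0   # larger gains select into treatment
        units.append((y1, y0, d))
    return units

def effects(units):
    ate = fmean(y1 - y0 for y1, y0, d in units)
    att = fmean(y1 - y0 for y1, y0, d in units if d == 1)
    atu = fmean(y1 - y0 for y1, y0, d in units if d == 0)
    # Only the two "observed" cells of the table are available in practice.
    naive = (fmean(y1 for y1, _, d in units if d == 1)
             - fmean(y0 for _, y0, d in units if d == 0))
    return ate, att, atu, naive

ate, att, atu, naive = effects(simulate_potential_outcomes())
print("ATE:", round(ate, 2), "ATT:", round(att, 2),
      "ATU:", round(atu, 2), "naive:", round(naive, 2))
```

The naive contrast mixes the treatment effect with the difference in Y0 between the groups, so it matches none of the three effects in this setup.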

Page 16: ENDOGENEITY - SIMULTANEITY

Propensity Score Matching

Idea

– Compares outcomes of similar units where the only difference is treatment; discards the rest

Example

– Low-ability students will have lower future achievement, and are also likely to be retained in class

– Naïve comparison of untreated/treated students creates bias, where the untreated do better in the post period

– Matching methods make the proper comparison

Problems

– If similar units do not exist, this estimator cannot be used

Page 17: ENDOGENEITY - SIMULTANEITY

How to get PSM estimator?

– First stage: regress „treatment” on observable characteristics

– Second stage: estimate the probability of „treatment”

– Third stage: compare results of those „treated” and similar non-treated („statistical twins”)

– The less similar two units are, the less they should be compared with one another

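$T_{it} = \beta X_{it} + \varepsilon_{it}$

$\widehat{\Pr}(T_{it} = 1 \mid X_{it}) = \hat{p}_{it}$

[Matching estimator: outcome differences $Y_{it} - Y_{jt}$ between treated units $i \in \{T=1\}$ and non-treated units $j \in \{T=0\}$, weighted according to the distance between $\hat{p}_{it}$ and $\hat{p}_{jt}$]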
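The three stages can be sketched end to end on simulated data. Everything specific here is an assumption made for illustration: a single covariate `x`, a logit first stage fitted by Newton's method, nearest-neighbour matching with replacement, and an assumed true effect of 1.

```python
import bisect
import math
import random
from statistics import fmean

def simulate(n=4_000, effect=1.0, seed=5):
    rng = random.Random(seed)
    x, t, y = [], [], []
    for _ in range(n):
        xi = rng.gauss(0, 1)
        ti = 1 if rng.random() < 1 / (1 + math.exp(-xi)) else 0
        x.append(xi)
        t.append(ti)
        y.append(2.0 * xi + effect * ti + rng.gauss(0, 1))
    return x, t, y

def fit_logit(x, t, iters=25):
    """Stage 1: logit of treatment on x (intercept a, slope b) via Newton steps."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ti in zip(x, t):
            p = 1 / (1 + math.exp(-(a + b * xi)))
            w = p * (1 - p)
            g0 += ti - p
            g1 += (ti - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def psm_att(x, t, y):
    a, b = fit_logit(x, t)
    # Stage 2: estimated probability of treatment for every unit.
    ps = [1 / (1 + math.exp(-(a + b * xi))) for xi in x]
    # Stage 3: match each treated unit to the control with the closest score.
    controls = sorted((p, yi) for p, yi, ti in zip(ps, y, t) if ti == 0)
    control_ps = [p for p, _ in controls]
    diffs = []
    for p, yi, ti in zip(ps, y, t):
        if ti == 1:
            k = bisect.bisect_left(control_ps, p)
            nearest = min((j for j in (k - 1, k) if 0 <= j < len(controls)),
                          key=lambda j: abs(control_ps[j] - p))
            diffs.append(yi - controls[nearest][1])
    return fmean(diffs)

x, t, y = simulate()
naive = (fmean(yi for yi, ti in zip(y, t) if ti == 1)
         - fmean(yi for yi, ti in zip(y, t) if ti == 0))
print("naive:", round(naive, 2), " PSM ATT:", round(psm_att(x, t, y), 2))
```

Matching removes the bias here because selection depends only on the observed `x`; it would not help against an unobserved confounder.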

Page 18: ENDOGENEITY - SIMULTANEITY

The obtained propensity score is irrelevant (as long as consistent)

NEAREST NEIGHBOR (NN)

Pros => so-called 1:1 matching

Cons => if a 1:1 match does not exist, completely senseless

Page 19: ENDOGENEITY - SIMULTANEITY

The obtained propensity score is irrelevant (as long as consistent)

CALIPER/RADIUS MATCHING (NN)

Pros => more flexible than NN

Cons => who specifies the radius/caliper?
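A small sketch of why the caliper choice matters. The propensity scores and the helper `caliper_match` are made up for illustration: the same data give a different matched sample depending on the caliper.

```python
def caliper_match(treated_ps, control_ps, caliper):
    """Nearest-neighbour matching (with replacement) within a caliper.

    Returns (matches, unmatched): matches maps treated index -> control index,
    unmatched lists treated indices with no control inside the caliper.
    """
    matches, unmatched = {}, []
    for i, p in enumerate(treated_ps):
        j = min(range(len(control_ps)), key=lambda k: abs(control_ps[k] - p))
        if abs(control_ps[j] - p) <= caliper:
            matches[i] = j
        else:
            unmatched.append(i)
    return matches, unmatched

treated = [0.30, 0.62, 0.90]    # hypothetical propensity scores
controls = [0.28, 0.55, 0.65]

print(caliper_match(treated, controls, caliper=0.05))  # ({0: 0, 1: 2}, [2])
print(caliper_match(treated, controls, caliper=0.30))  # ({0: 0, 1: 2, 2: 2}, [])
```

A tight caliper drops the third treated unit; a loose one keeps it but pairs it with a control whose score is 0.25 away.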

Page 20: ENDOGENEITY - SIMULTANEITY

The obtained propensity score is irrelevant (as long as consistent)

Stratification and Interval

Pros => eliminates discretion in radius/caliper choice

Cons => within strata/interval, units don’t have to be „similar”

(some people say 10 strata is OK)

Page 21: ENDOGENEITY - SIMULTANEITY

The obtained propensity score is irrelevant (as long as consistent)

KERNEL MATCHING (KM)

Pros => always uses all observations

Cons => one needs to keep common support in mind

[Figure: Treatment and Control observations plotted as points]
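A minimal sketch of kernel matching, assuming a Gaussian kernel and a bandwidth of 0.06 (both are illustrative choices, as are the function names). Every control contributes to each treated unit's counterfactual, with a weight that falls as the propensity scores move apart.

```python
import math

def kernel_weights(p_treated, control_ps, bandwidth=0.06):
    """Normalised Gaussian kernel weights of all controls for one treated unit."""
    raw = [math.exp(-0.5 * ((p_treated - p) / bandwidth) ** 2) for p in control_ps]
    total = sum(raw)   # strictly positive for scores in [0, 1] at this bandwidth
    return [w / total for w in raw]

def kernel_att(treated, controls, bandwidth=0.06):
    """treated and controls are lists of (pscore, outcome) pairs."""
    control_ps = [p for p, _ in controls]
    control_ys = [y for _, y in controls]
    diffs = []
    for p, y in treated:
        w = kernel_weights(p, control_ps, bandwidth)
        counterfactual = sum(wi * yi for wi, yi in zip(w, control_ys))
        diffs.append(y - counterfactual)
    return sum(diffs) / len(diffs)

weights = kernel_weights(0.50, [0.45, 0.20, 0.80])
print([round(w, 4) for w in weights])
```

The control at 0.45 receives almost all the weight, but distant controls are never formally excluded, which is why the common support still has to be checked.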

Page 22: ENDOGENEITY - SIMULTANEITY

What is „common support”?

Distributions of the pscore may differ substantially between the treated and the non-treated

Only sensible solutions!
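One common way to impose common support is min-max trimming: keep only units whose propensity score lies in the range covered by both groups. The slide does not say which rule is meant, so the rule, the helper name and the scores below are assumptions.

```python
def common_support(treated_ps, control_ps):
    """Trim both groups to the overlap of their propensity-score ranges."""
    low = max(min(treated_ps), min(control_ps))
    high = min(max(treated_ps), max(control_ps))
    keep_treated = [p for p in treated_ps if low <= p <= high]
    keep_control = [p for p in control_ps if low <= p <= high]
    return (low, high), keep_treated, keep_control

treated = [0.35, 0.50, 0.70, 0.95]   # hypothetical propensity scores
controls = [0.05, 0.30, 0.45, 0.80]

bounds, kept_treated, kept_control = common_support(treated, controls)
print(bounds)         # (0.35, 0.8)
print(kept_treated)   # [0.35, 0.5, 0.7]
print(kept_control)   # [0.45, 0.8]
```

Units outside the overlap have no comparable counterpart, so any estimate for them would rest on extrapolation rather than on matching.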

Page 23: ENDOGENEITY - SIMULTANEITY

Real world examples

Page 24: ENDOGENEITY - SIMULTANEITY

Next week – practical exercise

Read the papers posted on the web

I will post one that we will replicate soon…