ENDOGENEITY - SIMULTANEITY


Development Workshop

What is endogeneity and why do we not like it? [RECAP]

Three causes:
– X influences Y, but Y reinforces X too
– Z causes both X and Y fairly contemporaneously
– X causes Y, but we cannot observe X, and Z (which we observe) is influenced by X but also by Y

Consequences:
– No matter how many observations we have, the estimators remain biased (this is called inconsistency)
– Ergo: whatever point estimates we find, we cannot even tell whether they are positive, negative or significant, because we do not know the size of the bias and have no way to estimate it
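The first cause (X influences Y, but Y reinforces X too) can be illustrated with a small simulation. This is only a sketch under assumed values: the system y = beta*x + u, x = gamma*y + v, the parameters beta = gamma = 0.5 and the standard normal shocks are all hypothetical choices, not taken from the slides. Under these assumptions the OLS slope converges to (gamma + beta) / (gamma^2 + 1) = 0.8 rather than to beta = 0.5, and adding observations does not help.

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n, beta=0.5, gamma=0.5, seed=0):
    """Draw n observations from the simultaneous system
    y = beta*x + u and x = gamma*y + v, using its reduced form."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append((gamma * u + v) / (1 - beta * gamma))
        ys.append((beta * v + u) / (1 - beta * gamma))
    return xs, ys

# The estimate stays away from beta = 0.5 even in the large sample.
slopes = {n: ols_slope(*simulate(n)) for n in (1_000, 100_000)}
for n, slope in slopes.items():
    print(f"n = {n:>7}: OLS slope = {slope:.3f} (true beta = 0.5)")
```

The larger sample only makes the estimate more precisely wrong: it settles around the biased limit instead of the true coefficient.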

The magic of „ceteris paribus”

Each regression is actually ceteris paribus

Problem: data may be at odds with ceteris paribus

Examples?

Problems with Inferring Causal Effects from Regressions

Regressions tell us about correlations but ‘correlation is not causation’

Example: regression of whether one currently has a health problem on whether one has been in hospital in the past year:

 HEALTHPROB |     Coef.   Std. Err.       t
------------+---------------------------------
    PATIENT |   .262982    .0095126   27.65
      _cons |   .153447     .003092   49.63

Do hospitals make you sick? – a causal effect
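To read the output above: assuming both HEALTHPROB and PATIENT are 0/1 indicators (the slide describes them as "whether" variables, but this is an assumption), the constant is the share of non-patients with a health problem and the PATIENT coefficient is the difference in shares between the two groups. A minimal arithmetic sketch:

```python
# Coefficients copied from the regression output above.
cons = 0.153447     # share with a health problem among non-patients
patient = 0.262982  # difference in shares: patients minus non-patients

share_nonpatients = cons
share_patients = cons + patient

print(f"non-patients with a health problem: {share_nonpatients:.3f}")
print(f"patients with a health problem:     {share_patients:.3f}")
```

The regression therefore only restates a difference in group means; by itself it says nothing about whether hospitals caused the difference.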

The problem in causal inference in case of simultaneity

[Diagram: nodes Treatment, Outcome, Observed Factor, Unobserved Factor; label: Confounding Influence]

Any solutions?


[Diagram: nodes Treatment, Outcome, Observed Factor, Unobserved Factor]

Instrumental Variables solution…


[Diagram: nodes Instrumental Variable(s), Treatment, Outcome, Observed Factor, Unobserved Factor]

Fixed Effects Solution… (DiD does pretty much the same)


[Diagram: nodes Treatment, Outcome, Fixed Influences, Observed Factor, Unobserved Factor]

Short motivating story – ALMPs in Poland

Basic statement: 50% of the unemployed have found employment because of ALMPs (active labour market policies)

Facts:
– 50% of whom? Only of those who were treated (only those were monitored)
– only 90% of the treated completed the programmes
– of those who completed, indeed 50% work, but only 60% of those who work say it was because of the programme

So how many are actually employed because of the programme?
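Chaining the three facts as stated gives a quick back-of-the-envelope answer. This is only the share implied by the reported percentages, taking the self-reported attribution at face value; it is not a causal estimate, because nobody observed what would have happened without the programme. The variable names are illustrative.

```python
from fractions import Fraction

# Percentages as stated in the facts above.
completed = Fraction(90, 100)          # share of the treated who completed
employed_if_completed = Fraction(50, 100)
credits_programme = Fraction(60, 100)  # of those employed, self-reported

# Share of all treated who completed and found employment.
employed = completed * employed_if_completed

# Share of all treated who completed, work, and credit the programme.
employed_thanks_to_programme = employed * credits_programme

print(f"completed and employed: {float(employed):.0%}")
print(f"... and credit the programme: {float(employed_thanks_to_programme):.0%}")
```

So the headline 50% shrinks to 27% of the treated even before asking the causal question.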

Short motivating story – ALMPs in Poland (continued)

[Figure: stages "Completed training …", "… found employment …", "… thanks to programme …"; labels: Product, Gross effectiveness, Net effectiveness, Net efficiency?; values shown: 90%, 52, 30, ???]

Basic problems in causal inference

Compare somebody „before” and „after”
– If they were different already before, the differential will be wrongly attributed to „treatment”
  • can we measure/capture this inherent difference?
  • does it stay unchanged „before” and „after”?
  • what if we only know „after”?

• If the difference stays the same => DiD estimator => assumption that cannot be tested for

• If the difference cannot be believed to stay the same?
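The case where the inherent difference stays the same can be sketched with simulated data. All numbers here are hypothetical: the treated group starts 2.0 higher, both groups share a time trend of 1.0, and the assumed true effect is 3.0. Under these assumptions the before-after comparison picks up the trend, the after-only comparison picks up the level gap, and the difference in differences removes both.

```python
import random

def mean(values):
    return sum(values) / len(values)

def simulate_group(n, level, trend, effect, rng):
    """Outcomes before and after for one group (unit-variance noise)."""
    before = [level + rng.gauss(0, 1) for _ in range(n)]
    after = [level + trend + effect + rng.gauss(0, 1) for _ in range(n)]
    return before, after

rng = random.Random(1)
TRUE_EFFECT = 3.0  # assumed for the simulation
t_before, t_after = simulate_group(5000, 2.0, 1.0, TRUE_EFFECT, rng)
c_before, c_after = simulate_group(5000, 0.0, 1.0, 0.0, rng)

# Before-after on the treated only: confounds effect with the common trend.
before_after = mean(t_after) - mean(t_before)

# Treated vs control after treatment only: confounds effect with the level gap.
after_only = mean(t_after) - mean(c_after)

# Difference in differences: removes the fixed gap and the common trend.
did = (mean(t_after) - mean(t_before)) - (mean(c_after) - mean(c_before))

print(f"before-after: {before_after:.2f}")
print(f"after-only:   {after_only:.2f}")
print(f"DiD:          {did:.2f} (true effect {TRUE_EFFECT})")
```

The sketch works only because the simulation builds in a constant gap and a common trend; with real data that is exactly the assumption that cannot be tested.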

Faked counterfactual or generating a parallel world

o MEDICINE: uses control groups, i.e. equally sick people who get a different treatment or a placebo => experimenting

o What if an experiment is impossible?

2011-04-21, Master's seminar, class 4

What if an experiment is impossible?

Only cross-sectional data        | Panel data
---------------------------------+-------------------------------------------
Regression Discontinuity Design  | Before-After Estimators
Propensity Score Matching        | Difference in Difference Estimators (DiD)
Instrumental variables           | Propensity Score Matching + DiD

Propensity Score Matching


[Diagram: nodes Treatment, Outcome, Observed Factor, Unobserved Factor]

Propensity score matching

Group            | Y1                              | Y0
-----------------+---------------------------------+---------------------------------
Treated (D=1)    | observed                        | counterfactual (does not exist)
Nontreated (D=0) | counterfactual (does not exist) | observed

Average treatment effect (ATE)

$E(Y_1 - Y_0) = E(Y_1) - E(Y_0)$

Average treatment effect for the untreated

$E(Y_1 - Y_0 \mid D=0) = E(Y_1 \mid D=0) - E(Y_0 \mid D=0)$

Average treatment effect for the treated (ATT)

$E(Y_1 - Y_0 \mid D=1) = E(Y_1 \mid D=1) - E(Y_0 \mid D=1)$
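The three effects differ whenever the gain from treatment is related to who gets treated. The simulation below is a sketch in which both potential outcomes are known for every unit, which is never the case with real data. The data-generating process (an "ability" variable that raises the baseline outcome, the gain and the chance of treatment) is entirely hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

rng = random.Random(2)
units = []
for _ in range(50_000):
    ability = rng.gauss(0, 1)
    y0 = ability + rng.gauss(0, 1)            # outcome without treatment
    y1 = y0 + 1.0 + 0.5 * ability             # outcome with treatment
    d = 1 if ability + rng.gauss(0, 1) > 0 else 0  # selection on ability
    units.append((y1, y0, d))

# Effects defined on potential outcomes (observable only in a simulation).
ate = mean([y1 - y0 for y1, y0, d in units])
att = mean([y1 - y0 for y1, y0, d in units if d == 1])
atu = mean([y1 - y0 for y1, y0, d in units if d == 0])

# What observed data allow directly: E(Y1 | D=1) - E(Y0 | D=0).
naive = (mean([y1 for y1, y0, d in units if d == 1])
         - mean([y0 for y1, y0, d in units if d == 0]))

print(f"ATE = {ate:.2f}, ATT = {att:.2f}, ATU = {atu:.2f}, naive = {naive:.2f}")
```

In this setup the naive comparison overstates even the ATT, because the treated would have had better outcomes without treatment as well.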

Propensity Score Matching

Idea
– Compares outcomes of similar units where the only difference is treatment; discards the rest

Example
– Low ability students will have lower future achievement, and are also likely to be retained in class
– Naïve comparison of untreated/treated students creates bias, where the untreated do better in the post period
– Matching methods make the proper comparison

Problems
– If similar units do not exist, cannot use this estimator

How to get PSM estimator?

– First stage: regress „treatment” on observable characteristics

– Second stage: estimate the probability of „treatment”

– Third stage: compare the outcomes of those „treated” with those of similar non-treated units („statistical twins”)

– The less similar two units are, the less weight their comparison should get

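$T_{it} = X_{it}\beta + \varepsilon_{it}$

$\widehat{\Pr}(T_{it} = 1 \mid X_{it}) = \hat{p}_{it}$

$\hat{\delta} = \frac{1}{N_1} \sum_{i \in \{T=1\}} \Big( Y_{it} - \sum_{j \in \{T=0\}} \omega_{ij} Y_{jt} \Big)$, where the weight $\omega_{ij}$ depends on the distance between $\hat{p}_{it}$ and $\hat{p}_{jt}$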

The value of the obtained propensity score is itself irrelevant (as long as it is estimated consistently); it only serves to find similar units
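The three stages can be sketched end to end on simulated data. Everything specific here is an assumption made for illustration: a single observable x, a logit model for treatment fitted by Newton-Raphson, a true effect of 2.0, and nearest-neighbour matching with replacement in the third stage. The helper names `fit_logit` and `pscore` are hypothetical, not from any library.

```python
import math
import random

def pscore(a, b, xi):
    """Logit probability of treatment given x."""
    return 1.0 / (1.0 + math.exp(-(a + b * xi)))

def fit_logit(x, t, iters=25):
    """Newton-Raphson for Pr(T=1 | x) = logistic(a + b*x); returns (a, b)."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ti in zip(x, t):
            p = pscore(a, b, xi)
            w = p * (1.0 - p)
            g0 += ti - p
            g1 += (ti - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Simulated data (hypothetical): x drives both treatment and outcome.
rng = random.Random(3)
TRUE_EFFECT = 2.0
x = [rng.gauss(0, 1) for _ in range(2000)]
t = [1 if rng.random() < pscore(0.0, 1.0, xi) else 0 for xi in x]
y = [TRUE_EFFECT * ti + 1.5 * xi + rng.gauss(0, 1) for xi, ti in zip(x, t)]

# Stages 1 and 2: model treatment and predict its probability.
a_hat, b_hat = fit_logit(x, t)
p_hat = [pscore(a_hat, b_hat, xi) for xi in x]

treated = [(p, yi) for p, yi, ti in zip(p_hat, y, t) if ti == 1]
controls = [(p, yi) for p, yi, ti in zip(p_hat, y, t) if ti == 0]

# Stage 3: match each treated unit to the control with the closest score.
gaps = []
for p_i, y_i in treated:
    _, y_match = min(controls, key=lambda c: abs(c[0] - p_i))
    gaps.append(y_i - y_match)
att_hat = sum(gaps) / len(gaps)

# Naive comparison of group means, for contrast.
naive = (sum(yi for _, yi in treated) / len(treated)
         - sum(yi for _, yi in controls) / len(controls))

print(f"naive difference: {naive:.2f}, matched ATT: {att_hat:.2f}")
```

Matching works here because selection runs only through the observed x; with selection on unobservables the matched estimate would remain biased.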

NEAREST NEIGHBOR (NN)

Pros => so-called 1:1 matching

Cons => if a 1:1 match does not exist, completely senseless


CALIPER/RADIUS MATCHING (NN)

Pros => more flexible than NN

Cons => who specifies the radius/caliper?


Stratification and Interval

Pros => eliminates discretion in radius/caliper choice

Cons => within strata/interval, units don’t have to be „similar”

(some people say 10 strata is fine)


KERNEL MATCHING (KM)

Pros => always uses all observations

Cons => one needs to remember about common support
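A minimal sketch of kernel matching, assuming propensity scores have already been estimated. Each treated unit is compared with a weighted average of all controls, with weights that fall as the scores move apart. The Gaussian kernel, the default bandwidth of 0.1 and the function name `kernel_att` are illustrative assumptions; other kernels and bandwidths are equally possible.

```python
import math

def kernel_att(treated, controls, bandwidth=0.1):
    """Kernel-matching ATT.

    treated, controls: lists of (propensity_score, outcome) pairs.
    Every control contributes to every treated unit's counterfactual,
    weighted by a Gaussian kernel in the propensity-score distance.
    """
    gaps = []
    for p_i, y_i in treated:
        weights = [math.exp(-0.5 * ((p_i - p_j) / bandwidth) ** 2)
                   for p_j, _ in controls]
        total = sum(weights)
        y0_hat = sum(w * y_j for w, (_, y_j) in zip(weights, controls)) / total
        gaps.append(y_i - y0_hat)
    return sum(gaps) / len(gaps)

# Tiny made-up example: two controls equally far from one treated unit.
example = kernel_att(treated=[(0.5, 5.0)],
                     controls=[(0.4, 1.0), (0.6, 3.0)])
print(example)
```

Because every control receives some weight, a treated unit with no comparable controls still gets a counterfactual, which is why common support has to be checked separately.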

[Figure: Treatment and Control units plotted as points]

What is „common support”?

Distributions of the propensity score may differ substantially between treated and control units

Only sensible solutions!
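One simple way to impose common support is the min-max rule: keep only units whose propensity score lies in the range covered by both groups. This is a sketch of that one rule, not necessarily the one used in the course; the function names and the example scores are made up.

```python
def common_support(p_treated, p_control):
    """Overlap region of two sets of propensity scores (min-max rule)."""
    lo = max(min(p_treated), min(p_control))
    hi = min(max(p_treated), max(p_control))
    return lo, hi

def trim(p_scores, lo, hi):
    """Keep only the scores inside the common-support interval."""
    return [p for p in p_scores if lo <= p <= hi]

# Made-up scores: treated units reach higher, controls reach lower.
p_treated = [0.3, 0.5, 0.9]
p_control = [0.1, 0.4, 0.7]

lo, hi = common_support(p_treated, p_control)
kept_treated = trim(p_treated, lo, hi)
kept_control = trim(p_control, lo, hi)
print(lo, hi, kept_treated, kept_control)
```

Dropping the treated unit with score 0.9 changes the population the estimate refers to, so the trimmed ATT describes only treated units that have comparable controls.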

Real world examples

Next week – practical exercise

Read the papers posted on the web. I will soon post one that we will replicate…
