Matching Methods for Causal Inference - Gary King · Matching Methods for Causal Inference Gary...

Matching Methods for Causal Inference

Gary King1

Institute for Quantitative Social ScienceHarvard University

Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center,Harvard Medical School, 3/9/2018

1GaryKing.org1 / 25

3 Problems, 3 Solutions

1. The most popular method (propensity score matching, used in103,000 articles!) sounds magical:

Why Propensity Scores Should Not Be Used forMatching (Gary King, Richard Nielsen)

2. Do powerful methods have to be complicated?

Causal Inference Without Balance Checking:Coarsened Exact Matching (PA, 2011. StefanoM Iacus, Gary King, and Giuseppe Porro)

3. Matching methods optimize either imbalance ( bias) or #units pruned ( variance); users need both simultaneously:

The Balance-Sample Size Frontier in MatchingMethods for Causal Inference (In press, AJPS;Gary King, Christopher Lucas and Richard Nielsen)

2 / 25


1. The most popular method (propensity score matching, used in103,000 articles!) sounds magical:

Why Propensity Scores Should Not Be Used forMatching (Gary King, Richard Nielsen)





2 / 25


1. The most popular method (propensity score matching, used in103,000 articles!) sounds magical: Why Propensity Scores Should Not Be Used for

Matching (Gary King, Richard Nielsen)





2 / 25








2 / 25




2. Do powerful methods have to be complicated? Causal Inference Without Balance Checking:

Coarsened Exact Matching (PA, 2011. StefanoM Iacus, Gary King, and Giuseppe Porro)



2 / 25








2 / 25






3. Matching methods optimize either imbalance ( bias) or #units pruned ( variance); users need both simultaneously: The Balance-Sample Size Frontier in Matching

Methods for Causal Inference (In press, AJPS;Gary King, Christopher Lucas and Richard Nielsen)

2 / 25

Matching to Reduce Model Dependence

3 / 25

Matching to Reduce Model Dependence(Ho, Imai, King, Stuart, 2007: fig.1, Political Analysis)

3 / 25


Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

3 / 25


Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

T

3 / 25


Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

T

CC

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

CC C

C

C

C

C

C

C

C

C

C

C

C

CCC

CC

CC

C

C

3 / 25


Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

T

CC

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

CC C

C

C

C

C

C

C

C

C

C

C

C

CCC

CC

CC

C

C

3 / 25


Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

T

CC

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

CC C

C

C

C

C

C

C

C

C

C

C

C

CCC

CC

CC

C

C

3 / 25


Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

T

CC

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

CC C

C

C

C

C

C

C

C

C

C

C

C

CCC

CC

CC

C

C

3 / 25


Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

TC

C

C

C

C

CC

C

C

CC

C CC

C

C

CCCC

C

CC

C

CC

CC

CC

C

C

C

C

CC

CCCC

3 / 25


Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

TC

C

C

C

C

CC

C

C

CC

C CC

C

C

CCCC

C

CC

C

CC

CC

CC

C

C

C

C

CC

CCCC

3 / 25

The Problems Matching Solves

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25


Without Matching:







4 / 25


Without Matching:

Imbalance







4 / 25


Without Matching:

Imbalance Model Dependence







4 / 25


Without Matching:

Imbalance Model Dependence Researcher discretion







4 / 25


Without Matching:

Imbalance Model Dependence Researcher discretion Bias







4 / 25


Without Matching:

Imbalance Model Dependence Researcher discretion Bias Qualitative choice from unbiased estimates = biased estimator






4 / 25


Without Matching:


e.g., Choosing from results of 50 randomized experiments

Choosing based on plausibility is probably worse[effrt]





4 / 25


Without Matching:







4 / 25


Without Matching:



conscientious effort doesnt avoid biases (Banaji 2013)[acc]

People do not have easy access to their own mental processesor feedback to avoid the problem (Wilson and Brekke1994)[exprt]



4 / 25


Without Matching:







4 / 25


Without Matching:







4 / 25


Without Matching:







4 / 25


WithHHout Matching:

Imbalance Model Dependence Researcher discretion Bias







4 / 25


WithHHout Matching:

ZZImbalance Model Dependence Researcher discretion Bias







4 / 25


WithHHout Matching:

ZZImbalance ((((((((

(hhhhhhhhhModel Dependence Researcher discretion Bias







4 / 25


WithHHout Matching:


(hhhhhhhhhModel Dependence (((((((

(((hhhhhhhhhhResearcher discretion Bias







4 / 25


WithHHout Matching:



(((hhhhhhhhhhResearcher discretion XXXBias







4 / 25


WithHHout Matching:



(((hhhhhhhhhhResearcher discretion XXXBias

A central project of statistics: Automating away human discretion







4 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders

Treatment Effect for treated observation i :




SATT = Meani{Ti=1}

(TEi )




5 / 25

Whats Matching?





SATT = Meani{Ti=1}

(TEi )




5 / 25

Whats Matching?


TEi = Yi (1) Yi (0)

= observed unobserved



SATT = Meani{Ti=1}

(TEi )




5 / 25

Whats Matching?


TEi = Yi (1) Yi (0)= observed unobserved



SATT = Meani{Ti=1}

(TEi )




5 / 25

Whats Matching?





SATT = Meani{Ti=1}

(TEi )




5 / 25

Whats Matching?



Estimate Yi (0) with Yj with a matched (Xi Xj) control

Quantities of Interest:


SATT = Meani{Ti=1}

(TEi )




5 / 25

Whats Matching?





SATT = Meani{Ti=1}

(TEi )




5 / 25

Whats Matching?





SATT = Meani{Ti=1}

(TEi )




5 / 25

Whats Matching?





SATT = Meani{Ti=1}

(TEi )




5 / 25

Whats Matching?





SATT = Meani{Ti=1}

(TEi )




5 / 25

Whats Matching?





SATT = Meani{Ti=1}

(TEi )




5 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25



BalanceCovariates:


FullyBlocked






6 / 25



BalanceCovariates:


FullyBlocked






6 / 25



BalanceCovariates:


FullyBlocked






6 / 25



BalanceCovariates:


FullyBlocked






6 / 25



BalanceCovariates:


FullyBlocked

Observed

On average Exact

Unobserved

On average On average





6 / 25



BalanceCovariates:


FullyBlocked

Observed On average

Exact

Unobserved

On average On average





6 / 25



BalanceCovariates:


FullyBlocked

Observed On average

Exact

Unobserved On average

On average





6 / 25



BalanceCovariates:


FullyBlocked

Observed On average ExactUnobserved On average

On average





6 / 25



BalanceCovariates:


FullyBlocked






6 / 25



BalanceCovariates:


FullyBlocked


Fully blocked dominates complete randomization

for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!




6 / 25



BalanceCovariates:


FullyBlocked


Fully blocked dominates complete randomization for:

imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!




6 / 25



BalanceCovariates:


FullyBlocked


Fully blocked dominates complete randomization for:imbalance,

model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!




6 / 25



BalanceCovariates:


FullyBlocked


Fully blocked dominates complete randomization for:imbalance, model dependence,

power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!




6 / 25



BalanceCovariates:


FullyBlocked


Fully blocked dominates complete randomization for:imbalance, model dependence, power,

efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!




6 / 25



BalanceCovariates:


FullyBlocked


Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency,

bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!




6 / 25



BalanceCovariates:


FullyBlocked


Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias,

researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!




6 / 25



BalanceCovariates:


FullyBlocked


Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts,

robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!




6 / 25



BalanceCovariates:


FullyBlocked


Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness.

E.g., Imai, King, Nall 2009: SEs 600% smaller!




6 / 25



BalanceCovariates:


FullyBlocked






6 / 25



BalanceCovariates:


FullyBlocked






6 / 25



BalanceCovariates:


FullyBlocked




PSM: complete randomization

Other methods: fully blocked Other matching methods dominate PSM


6 / 25



BalanceCovariates:


FullyBlocked




PSM: complete randomization Other methods: fully blocked

Other matching methods dominate PSM


6 / 25



BalanceCovariates:


FullyBlocked






6 / 25



BalanceCovariates:


FullyBlocked




PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM (wait, it gets worse)

6 / 25

Method 1: Mahalanobis Distance Matching

1. Preprocess (Matching)

Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt) (Mahalanobis is for methodologists; in applications, use

Euclidean!) Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

7 / 25

Method 1: Mahalanobis Distance Matching(Approximates Fully Blocked Experiment)


Distance(Xc ,Xt) =




7 / 25



Distance(Xc ,Xt) =




7 / 25



Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt)

(Mahalanobis is for methodologists; in applications, useEuclidean!)

Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)


7 / 25



Distance(Xc ,Xt) =


Euclidean!)



7 / 25



Distance(Xc ,Xt) =


Euclidean!) Match each treated unit to the nearest control unit

Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)


7 / 25



Distance(Xc ,Xt) =


Euclidean!) Match each treated unit to the nearest control unit Control units: not reused; pruned if unused

Prune matches if Distance>caliper (Many adjustments available to this basic method)


7 / 25



Distance(Xc ,Xt) =


Euclidean!) Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper

(Many adjustments available to this basic method)


7 / 25



Distance(Xc ,Xt) =




7 / 25

Mahalanobis Distance Matching

Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

8 / 25


Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

8 / 25


Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

C

C

CC

C

C

C

C

C

CC

C

CCC

CC

C

C

C

CC CC

C

C

CC

C

CC

CC

C

C C

CC

C

C

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

8 / 25


Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

C

C

CC

C

C

C

C

C

CC

C

CCC

CC

C

C

C

CC CC

C

C

CC

C

CC

CC

C

C C

CC

C

C

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

8 / 25


Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

T TT T

TTTT

T TTTT

T TT

TTTT

CCC C

CC

C

C

C CC

C

CC

CCC CC

C

C

CCCCC

CCC CCCCC

C CCCC

C

8 / 25


Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

T TT T

TTTT

T TTTT

T TT

TTTT

CCC C

CC

C

C

C CC

C

CC

CCC CC

C

8 / 25


Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

T TT T

TTTT

T TTTT

T TT

TTTT

CCC C

CC

C

C

C CC

C

CC

CCC CC

C

8 / 25

Best Case: Mahalanobis Distance Matching

9 / 25


Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

T TTTT T TTTT TT T

T TTTTTTT T

TTTTT T T

T TTTT T

TT TTTTT

TTTT

TTTT

C CCCC C CCCC CC C

C CCCCCCC C

CCCCC C C

C CCCC C

CC CCCCC

CCCC

CCCC

CC CCC CCCCC CCCCCC CCC CC C C CCCCC CCC CC CC CC CC CC CC CC CC C CCC

CC CC CC C CCCC C CCC CC CC C CCC CC CC CC C CCC CC CC CCC C CC CCCCC

CC CCCC C CCC CCC CC C C CCCCCC CCCC C CCCC C CC CCC CC CCC C

C CC CC CC C CCCC C CC C C CC CCC CC CCC CCCCC CCCC CCC CC CCC C C CC CC C

C CCCC CC CC CCC C C CCC C CC CCC CC C CCC C C CCC CC CC C CC C CC

CCCCC CCCC C C CC C CCCC CC CCC CCC C CCC CC CC CC CC CC C CC C C

C CC CCC CC C C CCC CCC C CC CC CCC CCC CC CCC CC C CCC CC C C

C CCCC C CC C CC CC C CC C CCC C C CCC CC C CCCCCC CC CCCC CC CC C CC C CC CC C CC CC

C CC C CCC CCCC C CC CC C CCC CC CC CC C CCCCC CCC C C CC C CC CCC

CCC CC CCC CC CCCC C CCC CC CCC

9 / 25


Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

T TTTT T TTTT TT T

T TTTTTTT T

TTTTT T T

T TTTT T

TT TTTTT

TTTT

TTTT

C CCCC C CCCC CC C

C CCCCCCC C

CCCCC C C

C CCCC C

CC CCCCC

CCCC

CCCC

9 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)


Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate)

Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X ) Prune any stratum with 0 treated or 0 control units

Pass on original (uncoarsened) units except those pruned


Weight controls in each stratum to equal treateds

10 / 25


(Approximates Fully Blocked Experiment)









10 / 25











10 / 25



1. Preprocess (Matching) Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate) Apply exact matching to the coarsened X , C (X )





10 / 25










10 / 25









10 / 25





Sort observations into strata, each with unique values of C(X )

Prune any stratum with 0 treated or 0 control units Pass on original (uncoarsened) units except those pruned



10 / 25









10 / 25









10 / 25







2. Estimation Difference in means or a model Weight controls in each stratum to equal treateds

10 / 25

Coarsened Exact Matching

11 / 25


Education

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

CCC C

CC CC

C CC C CCC CCC

CCCC CC CC

C CCCCCC

C CCC CC

C

T TT T

TTTT

T TTTT

T TT

TTTT

11 / 25


Education

HS BA MA PhD 2nd PhD

Drinking age

Don't trust anyoneover 30

The Big 40

Senior Discounts

Retirement

Old

CCC C

CC CC

C CC C CCC CCC

CCCC CC CC

C CCCCCC

C CCC CC

C

T TT T

TTTT

T TTTT

T TT

TTTT

11 / 25


Education


Drinking age


The Big 40

Senior Discounts

Retirement

Old

CCC C

CC CC

C CC C CCC CCC

CCCC CC CC

C CCCCCC

C CCC CC

C

T TT T

TTTT

T TTTT

T TT

TTTT

11 / 25


Education


Drinking age


The Big 40

Senior Discounts

Retirement

Old

CC C

CC

CC CCCC CC C

CCCC

TTT T T

TTT

T TT

TTTT

11 / 25


Education


Drinking age


The Big 40

Senior Discounts

Retirement

Old

CC C

CCCC C CC

C CC CCC

C

C

TTT T T

TTT

T TT

TTTT

11 / 25


Education

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

CC C

CCCC C CC

C CC CCC

C

C

TTT T T

TTT

T TT

TTTT

11 / 25

Best Case: Coarsened Exact Matching

12 / 25


Education

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

CC CCC CCCCC CCCCCC CCC CC C C CCCCC CCC CC CC CC C CCC CCC CCC CC CC C CCC

CCC CC CCC C C CCCC C CCC CC CC C CCCCC CC C CC CC CCC CCC CC CC CCC C C CC C

CC CCC CC CCCC C CC CC CCC CC CC C CCCC C CC CC CCC CC CCCCC C CC CCC CC

C C CCC CC CC CC CCC C CC CCC C CC C C CC CCC CC CCC C CCCCC CCCC CCC CC CC CCC C CC CC C

C CC CCCCC CC CC CCC C CC CCC CCC CCC CC C CCCC C C CCC CC CC CCC C C

C C CC CCCCC CCCC CC C CCC CCCC CCC CCC CCC C CCC CC CC CC CCC CC C

C CC C C CC CC CC CC CCC C C CCC CCC C CCCCC CCC CCC CC CCC CC C CCC C C

C C CC CCCCC C CC C CC CC C CC C CCC C C CCC CC C CCC CCC CC CCCC CC CC C CC C CC CC CC CC

CCC CC C CCC CCCC C CC CC C CCC C CC CCC CC C CCC CCCC CCCC C C CC C CC CC

C CCC CC CCC CC CCCC CCCC CC CCC

T TTTT T TTTT TT T

T TTTTTTT T

TTTTT T T

T TTTT T

TT TTTTT

TTTT

TTTT

12 / 25


Education

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

C CCCC CCCCC C

C CCCCCC

CCCCC C C

CCCC C

C CCC C

C

C CCCC

T TTTT TTTTT T

T TTTTTT

TTTTT T T

TTTT T

T TTT T

T

T TTTT

12 / 25


Education

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

C CCCC CCCCC C

C CCCCCC

CCCCC C C

CCCC C

C CCC C

C

C CCCC

T TTTT TTTTT T

T TTTTTT

TTTTT T T

TTTT T

T TTT T

T

T TTTT

12 / 25

Method 3: Propensity Score Matching


Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi

Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)


13 / 25

Method 3: Propensity Score Matching(Approximates Completely Randomized Experiment)





13 / 25






13 / 25


1. Preprocess (Matching) Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi



13 / 25



Distance(Xc ,Xt) = |c t |



13 / 25



Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit

Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)


13 / 25



Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit Control units: not reused; pruned if unused

Prune matches if Distance>caliper (Many adjustments available to this basic method)


13 / 25



Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper

(Many adjustments available to this basic method)


13 / 25





13 / 25

Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

C

CC

C

C

C

C

C

CC

C

CCC

CC

C

C

C

CCCC

C

C

CC

C

CC

CC

C

C C

CC

C

C

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

14 / 25


Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

C

CC

C

C

C

C

C

CC

C

CCC

CC

C

C

C

CCCC

C

C

CC

C

CC

CC

C

C C

CC

C

C

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

1

0

PropensityScore 14 / 25


Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

C

CC

C

C

C

C

C

CC

C

CCC

CC

C

C

C

CCCC

C

C

CC

C

CC

CC

C

C C

CC

C

C

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

1

0

PropensityScore

C

C

CC

CCC

C

C

C

CC

C

C

C

C

C

C

C

CCCCC

C

CCCCCCCCC

C

C

C

C

CC

T

TTT

T

TT

T

T

T

T

TT

T

T

T

T

T

T

T

14 / 25


Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

1

0

PropensityScore

C

C

CC

CCC

C

C

C

CC

C

C

C

C

C

C

C

CCCCC

C

CCCCCCCCC

C

C

C

C

CC

T

TTT

T

TT

T

T

T

T

TT

T

T

T

T

T

T

T

14 / 25


Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

1

0

PropensityScore

C

C

CC

CCC

C

C

C

CC

C

C

C

C

C

C

C

CCCCC

C

CCCCCCCCC

C

C

C

C

CC

CCC

C

CCC

C

CCC

CCCC

T

TTT

T

TT

T

T

T

T

TT

T

T

T

T

T

T

T

14 / 25


Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

1

0

PropensityScore

CCC

C

CCC

C

CCC

CCCC

T

TTT

T

TT

T

T

T

T

TT

T

T

T

T

T

T

T

14 / 25


Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

C

CC

C

CC

C

C

C

C

C

C

C

C

C

C

CC

C TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

1

0

PropensityScore

CCC

C

CCC

C

CCC

CCCC

T

TTT

T

TT

T

T

T

T

TT

T

T

T

T

T

T

T

14 / 25


Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

C

CC

C

CC

C

C

C

C

C

C

C

C

C

C

CC

C TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

14 / 25

Best Case: Propensity Score Matching

15 / 25


Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

CCC

C

C

C

C

CC

C

C

C

CC

C

C

C

C

C

CCCC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

C

C

CCC

CC

C

C

C

CC

C

C

C

C

CC

C

CC

C

C

C C

C

C

C

CC

CC

C

CC

C

C

CC

C

C

C

C

CC

CC

C

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

CC

C

C

C

CC

CC

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

C CC

C

C

C

CC

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

CCC

C

C

C

CC

CC

C

C

CC

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C C

C

C C

CC

CC

C

C

C

C

CC

C

C

C C

CC

CC

C

C

C

C C

CC

C

C

C

C

C

C

C C

CC

C

CC

C

C

C

C

C

C

C

C

CC

CC

C

C

CC

CC

C

CC

C

C

C

C

CC

C

CC

C

C

C

CC

C

C

CC

C

C

C

C

C

C

C C

C

CC C

C

CC

C C

C

C

C

C

C

CC

CC

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

CC

CC C

C

C

C

C

C

C C

C

C

C

CC

C

C

C

CC

CC

C

CC

C

C

C

C

C

C

C

CC

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

CC

CC

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

CC

C

CC

CC C

CCC

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C C

C

C

C

C

C

C CC

C

C

C

C CC

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

CC

C

C

CC

C

C

C

C

CC

CC

C

C

T

TTT

T

T

T

T

TT

T

T

T

TT

T

T

T

T

T

TT TT

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

TT

T

T

TTT

TT

1

0

PropensityScore

15 / 25


Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

CCC

C

C

C

C

CC

C

C

C

CC

C

C

C

C

C

CCCC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

C

C

CCC

CC

C

C

C

CC

C

C

C

C

CC

C

CC

C

C

C C

C

C

C

CC

CC

C

CC

C

C

CC

C

C

C

C

CC

CC

C

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

CC

C

C

C

CC

CC

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

C CC

C

C

C

CC

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

CCC

C

C

C

CC

CC

C

C

CC

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C C

C

C C

CC

CC

C

C

C

C

CC

C

C

C C

CC

CC

C

C

C

C C

CC

C

C

C

C

C

C

C C

CC

C

CC

C

C

C

C

C

C

C

C

CC

CC

C

C

CC

CC

C

CC

C

C

C

C

CC

C

CC

C

C

C

CC

C

C

CC

C

C

C

C

C

C

C C

C

CC C

C

CC

C C

C

C

C

C

C

CC

CC

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

CC

CC C

C

C

C

C

C

C C

C

C

C

CC

C

C

C

CC

CC

C

CC

C

C

C

C

C

C

C

CC

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

CC

CC

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

CC

C

CC

CC C

CCC

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C C

C

C

C

C

C

C CC

C

C

C

C CC

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

CC

C

C

CC

C

C

C

C

CC

CC

C

C

T

TTT

T

T

T

T

TT

T

T

T

TT

T

T

T

T

T

TT TT

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

TT

T

T

TTT

TT

1

0

PropensityScore

15 / 25


Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

CC

CC

CC

C

C

CC

C

C

C

CC

C

C

C

C

C

C

C CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

CC

C

CC

CC

T

TTT

T

T

T

T

TT

T

T

T

TT

T

T

T

T

T

TT TT

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

TT

T

T

TTT

TT

15 / 25

Best Case: Propensity Score Matching is Suboptimal

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

CC

CC

CC

C

C

CC

C

C

C

CC

C

C

C

C

C

C

C CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

CC

C

CC

CC

T

TTT

T

T

T

T

TT

T

T

T

TT

T

T

T

T

T

TT TT

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

TT

T

T

TTT

TT

15 / 25

PSMs Statistical Properties

1. Low Standards: Sometimes helps, never optimizes

Efficient relative to complete randomization, but Inefficient relative to (the more powerful) full blocking Other methods usually dominate:

Xc = Xt = c = t butc = t 6= Xc = Xt

2. The PSM Paradox: When you do better, you do worse

Background: Random matching increases imbalance When PSM approximates complete randomization (to begin

with or, after some pruning)

all 0.5 (or constantwithin strata) pruning at random Imbalance Inefficency Model dependence Bias

If the data have no good matches, the paradox wont be aproblem but youre cooked anyway.

Doesnt PSM solve the curse of dimensionality problem?

Nope. The PSM Paradox gets worse with more covariates

16 / 25


1. Low Standards: Sometimes helps, never optimizes

Efficient relative to complete randomization, but Inefficient relative to (the more powerful) full blocking Other methods usually dominate:









16 / 25


1. Low Standards: Sometimes helps, never optimizes Efficient relative to complete randomization, but

Inefficient relative to (the more powerful) full blocking Other methods usually dominate:









16 / 25


1. Low Standards: Sometimes helps, never optimizes Efficient relative to complete randomization, but Inefficient relative to (the more powerful) full blocking

Other methods usually dominate:









16 / 25


1. Low Standards: Sometimes helps, never optimizes Efficient relative to complete randomization, but Inefficient relative to (the more powerful) full blocking Other methods usually dominate:









16 / 25



Xc = Xt = c = t

butc = t 6= Xc = Xt








16 / 25











16 / 25


1. Low Standards: Sometimes helps, never optimizes Efficient relative to complete randomization, but Ineffici

Matching Methods for Causal Inference - Gary King · Matching Methods for Causal Inference Gary...

Documents

Transcript of Matching Methods for Causal Inference - Gary King · Matching Methods for Causal Inference Gary...