Matching Methods for Causal Inference - Gary King · Matching Methods for Causal Inference Gary...

208
Matching Methods for Causal Inference Gary King 1 Institute for Quantitative Social Science Harvard University Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Harvard Medical School, 3/9/2018 1 GaryKing.org 1 / 25

Transcript of Matching Methods for Causal Inference - Gary King · Matching Methods for Causal Inference Gary...

Matching Methods for Causal Inference

Gary King1

Institute for Quantitative Social ScienceHarvard University

Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center,Harvard Medical School, 3/9/2018

1GaryKing.org1 / 25

3 Problems, 3 Solutions

1. The most popular method (propensity score matching, used in103,000 articles!) sounds magical:

Why Propensity Scores Should Not Be Used forMatching (Gary King, Richard Nielsen)

2. Do powerful methods have to be complicated?

Causal Inference Without Balance Checking:Coarsened Exact Matching (PA, 2011. StefanoM Iacus, Gary King, and Giuseppe Porro)

3. Matching methods optimize either imbalance ( bias) or #units pruned ( variance); users need both simultaneously:

The Balance-Sample Size Frontier in MatchingMethods for Causal Inference (In press, AJPS;Gary King, Christopher Lucas and Richard Nielsen)

2 / 25

3 Problems, 3 Solutions

1. The most popular method (propensity score matching, used in103,000 articles!) sounds magical:

Why Propensity Scores Should Not Be Used forMatching (Gary King, Richard Nielsen)

2. Do powerful methods have to be complicated?

Causal Inference Without Balance Checking:Coarsened Exact Matching (PA, 2011. StefanoM Iacus, Gary King, and Giuseppe Porro)

3. Matching methods optimize either imbalance ( bias) or #units pruned ( variance); users need both simultaneously:

The Balance-Sample Size Frontier in MatchingMethods for Causal Inference (In press, AJPS;Gary King, Christopher Lucas and Richard Nielsen)

2 / 25

3 Problems, 3 Solutions

1. The most popular method (propensity score matching, used in103,000 articles!) sounds magical: Why Propensity Scores Should Not Be Used for

Matching (Gary King, Richard Nielsen)

2. Do powerful methods have to be complicated?

Causal Inference Without Balance Checking:Coarsened Exact Matching (PA, 2011. StefanoM Iacus, Gary King, and Giuseppe Porro)

3. Matching methods optimize either imbalance ( bias) or #units pruned ( variance); users need both simultaneously:

The Balance-Sample Size Frontier in MatchingMethods for Causal Inference (In press, AJPS;Gary King, Christopher Lucas and Richard Nielsen)

2 / 25

3 Problems, 3 Solutions

1. The most popular method (propensity score matching, used in103,000 articles!) sounds magical: Why Propensity Scores Should Not Be Used for

Matching (Gary King, Richard Nielsen)

2. Do powerful methods have to be complicated?

Causal Inference Without Balance Checking:Coarsened Exact Matching (PA, 2011. StefanoM Iacus, Gary King, and Giuseppe Porro)

3. Matching methods optimize either imbalance ( bias) or #units pruned ( variance); users need both simultaneously:

The Balance-Sample Size Frontier in MatchingMethods for Causal Inference (In press, AJPS;Gary King, Christopher Lucas and Richard Nielsen)

2 / 25

3 Problems, 3 Solutions

1. The most popular method (propensity score matching, used in103,000 articles!) sounds magical: Why Propensity Scores Should Not Be Used for

Matching (Gary King, Richard Nielsen)

2. Do powerful methods have to be complicated? Causal Inference Without Balance Checking:

Coarsened Exact Matching (PA, 2011. StefanoM Iacus, Gary King, and Giuseppe Porro)

3. Matching methods optimize either imbalance ( bias) or #units pruned ( variance); users need both simultaneously:

The Balance-Sample Size Frontier in MatchingMethods for Causal Inference (In press, AJPS;Gary King, Christopher Lucas and Richard Nielsen)

2 / 25

3 Problems, 3 Solutions

1. The most popular method (propensity score matching, used in103,000 articles!) sounds magical: Why Propensity Scores Should Not Be Used for

Matching (Gary King, Richard Nielsen)

2. Do powerful methods have to be complicated? Causal Inference Without Balance Checking:

Coarsened Exact Matching (PA, 2011. StefanoM Iacus, Gary King, and Giuseppe Porro)

3. Matching methods optimize either imbalance ( bias) or #units pruned ( variance); users need both simultaneously:

The Balance-Sample Size Frontier in MatchingMethods for Causal Inference (In press, AJPS;Gary King, Christopher Lucas and Richard Nielsen)

2 / 25

3 Problems, 3 Solutions

1. The most popular method (propensity score matching, used in103,000 articles!) sounds magical: Why Propensity Scores Should Not Be Used for

Matching (Gary King, Richard Nielsen)

2. Do powerful methods have to be complicated? Causal Inference Without Balance Checking:

Coarsened Exact Matching (PA, 2011. StefanoM Iacus, Gary King, and Giuseppe Porro)

3. Matching methods optimize either imbalance ( bias) or #units pruned ( variance); users need both simultaneously: The Balance-Sample Size Frontier in Matching

Methods for Causal Inference (In press, AJPS;Gary King, Christopher Lucas and Richard Nielsen)

2 / 25

Matching to Reduce Model Dependence

3 / 25

Matching to Reduce Model Dependence(Ho, Imai, King, Stuart, 2007: fig.1, Political Analysis)

3 / 25

Matching to Reduce Model Dependence(Ho, Imai, King, Stuart, 2007: fig.1, Political Analysis)

Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

3 / 25

Matching to Reduce Model Dependence(Ho, Imai, King, Stuart, 2007: fig.1, Political Analysis)

Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

T

3 / 25

Matching to Reduce Model Dependence(Ho, Imai, King, Stuart, 2007: fig.1, Political Analysis)

Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

T

CC

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

CC C

C

C

C

C

C

C

C

C

C

C

C

CCC

CC

CC

C

C

3 / 25

Matching to Reduce Model Dependence(Ho, Imai, King, Stuart, 2007: fig.1, Political Analysis)

Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

T

CC

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

CC C

C

C

C

C

C

C

C

C

C

C

C

CCC

CC

CC

C

C

3 / 25

Matching to Reduce Model Dependence(Ho, Imai, King, Stuart, 2007: fig.1, Political Analysis)

Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

T

CC

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

CC C

C

C

C

C

C

C

C

C

C

C

C

CCC

CC

CC

C

C

3 / 25

Matching to Reduce Model Dependence(Ho, Imai, King, Stuart, 2007: fig.1, Political Analysis)

Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

T

CC

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

CC C

C

C

C

C

C

C

C

C

C

C

C

CCC

CC

CC

C

C

3 / 25

Matching to Reduce Model Dependence(Ho, Imai, King, Stuart, 2007: fig.1, Political Analysis)

Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

TC

C

C

C

C

CC

C

C

CC

C CC

C

C

CCCC

C

CC

C

CC

CC

CC

C

C

C

C

CC

CCCC

3 / 25

Matching to Reduce Model Dependence(Ho, Imai, King, Stuart, 2007: fig.1, Political Analysis)

Education (years)

Out

com

e

12 14 16 18 20 22 24 26 28

0

2

4

6

8

10

12

T

T

T

T T

T

T

TTT

TT

T TT T

T

T

T

TC

C

C

C

C

CC

C

C

CC

C CC

C

C

CCCC

C

CC

C

CC

CC

CC

C

C

C

C

CC

CCCC

3 / 25

The Problems Matching Solves

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance Model Dependence

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance Model Dependence Researcher discretion

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance Model Dependence Researcher discretion Bias

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance Model Dependence Researcher discretion Bias Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance Model Dependence Researcher discretion Bias Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments

Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance Model Dependence Researcher discretion Bias Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance Model Dependence Researcher discretion Bias Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc]

People do not have easy access to their own mental processesor feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance Model Dependence Researcher discretion Bias Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance Model Dependence Researcher discretion Bias Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

Without Matching:

Imbalance Model Dependence Researcher discretion Bias Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

WithHHout Matching:

Imbalance Model Dependence Researcher discretion Bias

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

WithHHout Matching:

ZZImbalance Model Dependence Researcher discretion Bias

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

WithHHout Matching:

ZZImbalance ((((((((

(hhhhhhhhhModel Dependence Researcher discretion Bias

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

WithHHout Matching:

ZZImbalance ((((((((

(hhhhhhhhhModel Dependence (((((((

(((hhhhhhhhhhResearcher discretion Bias

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

WithHHout Matching:

ZZImbalance ((((((((

(hhhhhhhhhModel Dependence (((((((

(((hhhhhhhhhhResearcher discretion XXXBias

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

The Problems Matching Solves

WithHHout Matching:

ZZImbalance ((((((((

(hhhhhhhhhModel Dependence (((((((

(((hhhhhhhhhhResearcher discretion XXXBias

A central project of statistics: Automating away human discretion

Qualitative choice from unbiased estimates = biased estimator

e.g., Choosing from results of 50 randomized experiments Choosing based on plausibility is probably worse[effrt]

conscientious effort doesnt avoid biases (Banaji 2013)[acc] People do not have easy access to their own mental processes

or feedback to avoid the problem (Wilson and Brekke1994)[exprt]

Experts overestimate their ability to control personal biasesmore than nonexperts, and more prominent experts are themost overconfident (Tetlock 2005)[tch]

Teaching psychology is mostly a waste of time (Kahneman2011)

4 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders

Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi (1) Yi (0)

= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi (1) Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control

Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Whats Matching?

Yi dep var, Ti (1=treated, 0=control), Xi confounders Treatment Effect for treated observation i :

TEi = Yi Yi (0)= observed unobserved

Estimate Yi (0) with Yj with a matched (Xi Xj) control Quantities of Interest:

1. SATT: Sample Average Treatment effect on the Treated:

SATT = Meani{Ti=1}

(TEi )

2. FSATT: Feasible SATT (prune badly matched treateds too)

Big convenience: Follow preprocessing with whateverstatistical method youd have used without matching

Pruning nonmatches makes control vars matter less: reducesimbalance, model dependence, researcher discretion, & bias

5 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed

On average Exact

Unobserved

On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average

Exact

Unobserved

On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average

Exact

Unobserved On average

On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average

On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization

for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:

imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance,

model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence,

power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power,

efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency,

bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias,

researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts,

robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness.

E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization

Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked

Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM

(wait, it gets worse)

6 / 25

Matching: Finding Hidden Randomized Experiments

Types of Experiments

BalanceCovariates:

CompleteRandomization

FullyBlocked

Observed On average ExactUnobserved On average On average

Fully blocked dominates complete randomization for:imbalance, model dependence, power, efficiency, bias, researchcosts, robustness. E.g., Imai, King, Nall 2009: SEs 600% smaller!

Goal of Each Matching Method (in Observational Data)

PSM: complete randomization Other methods: fully blocked Other matching methods dominate PSM (wait, it gets worse)

6 / 25

Method 1: Mahalanobis Distance Matching

1. Preprocess (Matching)

Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt) (Mahalanobis is for methodologists; in applications, use

Euclidean!) Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

7 / 25

Method 1: Mahalanobis Distance Matching(Approximates Fully Blocked Experiment)

1. Preprocess (Matching)

Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt) (Mahalanobis is for methodologists; in applications, use

Euclidean!) Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

7 / 25

Method 1: Mahalanobis Distance Matching(Approximates Fully Blocked Experiment)

1. Preprocess (Matching)

Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt) (Mahalanobis is for methodologists; in applications, use

Euclidean!) Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

7 / 25

Method 1: Mahalanobis Distance Matching(Approximates Fully Blocked Experiment)

1. Preprocess (Matching)

Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt)

(Mahalanobis is for methodologists; in applications, useEuclidean!)

Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

7 / 25

Method 1: Mahalanobis Distance Matching(Approximates Fully Blocked Experiment)

1. Preprocess (Matching)

Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt) (Mahalanobis is for methodologists; in applications, use

Euclidean!)

Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

7 / 25

Method 1: Mahalanobis Distance Matching(Approximates Fully Blocked Experiment)

1. Preprocess (Matching)

Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt) (Mahalanobis is for methodologists; in applications, use

Euclidean!) Match each treated unit to the nearest control unit

Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

7 / 25

Method 1: Mahalanobis Distance Matching(Approximates Fully Blocked Experiment)

1. Preprocess (Matching)

Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt) (Mahalanobis is for methodologists; in applications, use

Euclidean!) Match each treated unit to the nearest control unit Control units: not reused; pruned if unused

Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

7 / 25

Method 1: Mahalanobis Distance Matching(Approximates Fully Blocked Experiment)

1. Preprocess (Matching)

Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt) (Mahalanobis is for methodologists; in applications, use

Euclidean!) Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper

(Many adjustments available to this basic method)

2. Estimation Difference in means or a model

7 / 25

Method 1: Mahalanobis Distance Matching(Approximates Fully Blocked Experiment)

1. Preprocess (Matching)

Distance(Xc ,Xt) =

(Xc Xt)S1(Xc Xt) (Mahalanobis is for methodologists; in applications, use

Euclidean!) Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

7 / 25

Mahalanobis Distance Matching

Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

8 / 25

Mahalanobis Distance Matching

Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

8 / 25

Mahalanobis Distance Matching

Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

C

C

CC

C

C

C

C

C

CC

C

CCC

CC

C

C

C

CC CC

C

C

CC

C

CC

CC

C

C C

CC

C

C

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

8 / 25

Mahalanobis Distance Matching

Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

C

C

CC

C

C

C

C

C

CC

C

CCC

CC

C

C

C

CC CC

C

C

CC

C

CC

CC

C

C C

CC

C

C

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

8 / 25

Mahalanobis Distance Matching

Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

T TT T

TTTT

T TTTT

T TT

TTTT

CCC C

CC

C

C

C CC

C

CC

CCC CC

C

C

CCCCC

CCC CCCCC

C CCCC

C

8 / 25

Mahalanobis Distance Matching

Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

T TT T

TTTT

T TTTT

T TT

TTTT

CCC C

CC

C

C

C CC

C

CC

CCC CC

C

8 / 25

Mahalanobis Distance Matching

Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

T TT T

TTTT

T TTTT

T TT

TTTT

CCC C

CC

C

C

C CC

C

CC

CCC CC

C

8 / 25

Best Case: Mahalanobis Distance Matching

9 / 25

Best Case: Mahalanobis Distance Matching

Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

T TTTT T TTTT TT T

T TTTTTTT T

TTTTT T T

T TTTT T

TT TTTTT

TTTT

TTTT

C CCCC C CCCC CC C

C CCCCCCC C

CCCCC C C

C CCCC C

CC CCCCC

CCCC

CCCC

CC CCC CCCCC CCCCCC CCC CC C C CCCCC CCC CC CC CC CC CC CC CC CC C CCC

CC CC CC C CCCC C CCC CC CC C CCC CC CC CC C CCC CC CC CCC C CC CCCCC

CC CCCC C CCC CCC CC C C CCCCCC CCCC C CCCC C CC CCC CC CCC C

C CC CC CC C CCCC C CC C C CC CCC CC CCC CCCCC CCCC CCC CC CCC C C CC CC C

C CCCC CC CC CCC C C CCC C CC CCC CC C CCC C C CCC CC CC C CC C CC

CCCCC CCCC C C CC C CCCC CC CCC CCC C CCC CC CC CC CC CC C CC C C

C CC CCC CC C C CCC CCC C CC CC CCC CCC CC CCC CC C CCC CC C C

C CCCC C CC C CC CC C CC C CCC C C CCC CC C CCCCCC CC CCCC CC CC C CC C CC CC C CC CC

C CC C CCC CCCC C CC CC C CCC CC CC CC C CCCCC CCC C C CC C CC CCC

CCC CC CCC CC CCCC C CCC CC CCC

9 / 25

Best Case: Mahalanobis Distance Matching

Education (years)

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

T TTTT T TTTT TT T

T TTTTTTT T

TTTTT T T

T TTTT T

TT TTTTT

TTTT

TTTT

C CCCC C CCCC CC C

C CCCCCCC C

CCCCC C C

C CCCC C

CC CCCCC

CCCC

CCCC

9 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)

1. Preprocess (Matching)

Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate)

Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X ) Prune any stratum with 0 treated or 0 control units

Pass on original (uncoarsened) units except those pruned

2. Estimation Difference in means or a model

Weight controls in each stratum to equal treateds

10 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)

(Approximates Fully Blocked Experiment)

1. Preprocess (Matching)

Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate)

Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X ) Prune any stratum with 0 treated or 0 control units

Pass on original (uncoarsened) units except those pruned

2. Estimation Difference in means or a model

Weight controls in each stratum to equal treateds

10 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)

(Approximates Fully Blocked Experiment)

1. Preprocess (Matching)

Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate)

Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X ) Prune any stratum with 0 treated or 0 control units

Pass on original (uncoarsened) units except those pruned

2. Estimation Difference in means or a model

Weight controls in each stratum to equal treateds

10 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)

(Approximates Fully Blocked Experiment)

1. Preprocess (Matching) Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate) Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X ) Prune any stratum with 0 treated or 0 control units

Pass on original (uncoarsened) units except those pruned

2. Estimation Difference in means or a model

Weight controls in each stratum to equal treateds

10 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)

(Approximates Fully Blocked Experiment)

1. Preprocess (Matching) Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate)

Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X ) Prune any stratum with 0 treated or 0 control units

Pass on original (uncoarsened) units except those pruned

2. Estimation Difference in means or a model

Weight controls in each stratum to equal treateds

10 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)

(Approximates Fully Blocked Experiment)

1. Preprocess (Matching) Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate) Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X ) Prune any stratum with 0 treated or 0 control units

Pass on original (uncoarsened) units except those pruned

2. Estimation Difference in means or a model

Weight controls in each stratum to equal treateds

10 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)

(Approximates Fully Blocked Experiment)

1. Preprocess (Matching) Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate) Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X )

Prune any stratum with 0 treated or 0 control units Pass on original (uncoarsened) units except those pruned

2. Estimation Difference in means or a model

Weight controls in each stratum to equal treateds

10 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)

(Approximates Fully Blocked Experiment)

1. Preprocess (Matching) Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate) Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X ) Prune any stratum with 0 treated or 0 control units

Pass on original (uncoarsened) units except those pruned

2. Estimation Difference in means or a model

Weight controls in each stratum to equal treateds

10 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)

(Approximates Fully Blocked Experiment)

1. Preprocess (Matching) Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate) Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X ) Prune any stratum with 0 treated or 0 control units

Pass on original (uncoarsened) units except those pruned

2. Estimation Difference in means or a model

Weight controls in each stratum to equal treateds

10 / 25

Method 2: Coarsened Exact Matching(Most powerful easy-to-use approach)

(Approximates Fully Blocked Experiment)

1. Preprocess (Matching) Temporarily coarsen X as much as youre willing

e.g., Education (grade school, high school, college, graduate) Apply exact matching to the coarsened X , C (X )

Sort observations into strata, each with unique values of C(X ) Prune any stratum with 0 treated or 0 control units

Pass on original (uncoarsened) units except those pruned

2. Estimation Difference in means or a model Weight controls in each stratum to equal treateds

10 / 25

Coarsened Exact Matching

11 / 25

Coarsened Exact Matching

Education

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

CCC C

CC CC

C CC C CCC CCC

CCCC CC CC

C CCCCCC

C CCC CC

C

T TT T

TTTT

T TTTT

T TT

TTTT

11 / 25

Coarsened Exact Matching

Education

HS BA MA PhD 2nd PhD

Drinking age

Don't trust anyoneover 30

The Big 40

Senior Discounts

Retirement

Old

CCC C

CC CC

C CC C CCC CCC

CCCC CC CC

C CCCCCC

C CCC CC

C

T TT T

TTTT

T TTTT

T TT

TTTT

11 / 25

Coarsened Exact Matching

Education

HS BA MA PhD 2nd PhD

Drinking age

Don't trust anyoneover 30

The Big 40

Senior Discounts

Retirement

Old

CCC C

CC CC

C CC C CCC CCC

CCCC CC CC

C CCCCCC

C CCC CC

C

T TT T

TTTT

T TTTT

T TT

TTTT

11 / 25

Coarsened Exact Matching

Education

HS BA MA PhD 2nd PhD

Drinking age

Don't trust anyoneover 30

The Big 40

Senior Discounts

Retirement

Old

CC C

CC

CC CCCC CC C

CCCC

TTT T T

TTT

T TT

TTTT

11 / 25

Coarsened Exact Matching

Education

HS BA MA PhD 2nd PhD

Drinking age

Don't trust anyoneover 30

The Big 40

Senior Discounts

Retirement

Old

CC C

CCCC C CC

C CC CCC

C

C

TTT T T

TTT

T TT

TTTT

11 / 25

Coarsened Exact Matching

Education

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

CC C

CCCC C CC

C CC CCC

C

C

TTT T T

TTT

T TT

TTTT

11 / 25

Best Case: Coarsened Exact Matching

12 / 25

Best Case: Coarsened Exact Matching

Education

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

CC CCC CCCCC CCCCCC CCC CC C C CCCCC CCC CC CC CC C CCC CCC CCC CC CC C CCC

CCC CC CCC C C CCCC C CCC CC CC C CCCCC CC C CC CC CCC CCC CC CC CCC C C CC C

CC CCC CC CCCC C CC CC CCC CC CC C CCCC C CC CC CCC CC CCCCC C CC CCC CC

C C CCC CC CC CC CCC C CC CCC C CC C C CC CCC CC CCC C CCCCC CCCC CCC CC CC CCC C CC CC C

C CC CCCCC CC CC CCC C CC CCC CCC CCC CC C CCCC C C CCC CC CC CCC C C

C C CC CCCCC CCCC CC C CCC CCCC CCC CCC CCC C CCC CC CC CC CCC CC C

C CC C C CC CC CC CC CCC C C CCC CCC C CCCCC CCC CCC CC CCC CC C CCC C C

C C CC CCCCC C CC C CC CC C CC C CCC C C CCC CC C CCC CCC CC CCCC CC CC C CC C CC CC CC CC

CCC CC C CCC CCCC C CC CC C CCC C CC CCC CC C CCC CCCC CCCC C C CC C CC CC

C CCC CC CCC CC CCCC CCCC CC CCC

T TTTT T TTTT TT T

T TTTTTTT T

TTTTT T T

T TTTT T

TT TTTTT

TTTT

TTTT

12 / 25

Best Case: Coarsened Exact Matching

Education

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

C CCCC CCCCC C

C CCCCCC

CCCCC C C

CCCC C

C CCC C

C

C CCCC

T TTTT TTTTT T

T TTTTTT

TTTTT T T

TTTT T

T TTT T

T

T TTTT

12 / 25

Best Case: Coarsened Exact Matching

Education

Age

12 14 16 18 20 22 24 26 28

20

30

40

50

60

70

80

C CCCC CCCCC C

C CCCCCC

CCCCC C C

CCCC C

C CCC C

C

C CCCC

T TTTT TTTTT T

T TTTTTT

TTTTT T T

TTTT T

T TTT T

T

T TTTT

12 / 25

Method 3: Propensity Score Matching

1. Preprocess (Matching)

Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi

Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

13 / 25

Method 3: Propensity Score Matching(Approximates Completely Randomized Experiment)

1. Preprocess (Matching)

Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi

Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

13 / 25

Method 3: Propensity Score Matching(Approximates Completely Randomized Experiment)

1. Preprocess (Matching)

Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi

Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

13 / 25

Method 3: Propensity Score Matching(Approximates Completely Randomized Experiment)

1. Preprocess (Matching) Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi

Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

13 / 25

Method 3: Propensity Score Matching(Approximates Completely Randomized Experiment)

1. Preprocess (Matching) Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi

Distance(Xc ,Xt) = |c t |

Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

13 / 25

Method 3: Propensity Score Matching(Approximates Completely Randomized Experiment)

1. Preprocess (Matching) Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi

Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit

Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

13 / 25

Method 3: Propensity Score Matching(Approximates Completely Randomized Experiment)

1. Preprocess (Matching) Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi

Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit Control units: not reused; pruned if unused

Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

13 / 25

Method 3: Propensity Score Matching(Approximates Completely Randomized Experiment)

1. Preprocess (Matching) Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi

Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper

(Many adjustments available to this basic method)

2. Estimation Difference in means or a model

13 / 25

Method 3: Propensity Score Matching(Approximates Completely Randomized Experiment)

1. Preprocess (Matching) Reduce k elements of X to scalari Pr(Ti = 1|X ) = 11+eXi

Distance(Xc ,Xt) = |c t | Match each treated unit to the nearest control unit Control units: not reused; pruned if unused Prune matches if Distance>caliper (Many adjustments available to this basic method)

2. Estimation Difference in means or a model

13 / 25

Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

C

CC

C

C

C

C

C

CC

C

CCC

CC

C

C

C

CCCC

C

C

CC

C

CC

CC

C

C C

CC

C

C

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

14 / 25

Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

C

CC

C

C

C

C

C

CC

C

CCC

CC

C

C

C

CCCC

C

C

CC

C

CC

CC

C

C C

CC

C

C

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

1

0

PropensityScore 14 / 25

Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

C

CC

C

C

C

C

C

CC

C

CCC

CC

C

C

C

CCCC

C

C

CC

C

CC

CC

C

C C

CC

C

C

TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

1

0

PropensityScore

C

C

CC

CCC

C

C

C

CC

C

C

C

C

C

C

C

CCCCC

C

CCCCCCCCC

C

C

C

C

CC

T

TTT

T

TT

T

T

T

T

TT

T

T

T

T

T

T

T

14 / 25

Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

1

0

PropensityScore

C

C

CC

CCC

C

C

C

CC

C

C

C

C

C

C

C

CCCCC

C

CCCCCCCCC

C

C

C

C

CC

T

TTT

T

TT

T

T

T

T

TT

T

T

T

T

T

T

T

14 / 25

Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

1

0

PropensityScore

C

C

CC

CCC

C

C

C

CC

C

C

C

C

C

C

C

CCCCC

C

CCCCCCCCC

C

C

C

C

CC

CCC

C

CCC

C

CCC

CCCC

T

TTT

T

TT

T

T

T

T

TT

T

T

T

T

T

T

T

14 / 25

Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

1

0

PropensityScore

CCC

C

CCC

C

CCC

CCCC

T

TTT

T

TT

T

T

T

T

TT

T

T

T

T

T

T

T

14 / 25

Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

C

CC

C

CC

C

C

C

C

C

C

C

C

C

C

CC

C TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

1

0

PropensityScore

CCC

C

CCC

C

CCC

CCCC

T

TTT

T

TT

T

T

T

T

TT

T

T

T

T

T

T

T

14 / 25

Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

C

CC

C

CC

C

C

C

C

C

C

C

C

C

C

CC

C TTTT

T

T

T

T

T

T

T

T

T

TT

T

T

T

T

T

14 / 25

Best Case: Propensity Score Matching

15 / 25

Best Case: Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

CCC

C

C

C

C

CC

C

C

C

CC

C

C

C

C

C

CCCC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

C

C

CCC

CC

C

C

C

CC

C

C

C

C

CC

C

CC

C

C

C C

C

C

C

CC

CC

C

CC

C

C

CC

C

C

C

C

CC

CC

C

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

CC

C

C

C

CC

CC

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

C CC

C

C

C

CC

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

CCC

C

C

C

CC

CC

C

C

CC

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C C

C

C C

CC

CC

C

C

C

C

CC

C

C

C C

CC

CC

C

C

C

C C

CC

C

C

C

C

C

C

C C

CC

C

CC

C

C

C

C

C

C

C

C

CC

CC

C

C

CC

CC

C

CC

C

C

C

C

CC

C

CC

C

C

C

CC

C

C

CC

C

C

C

C

C

C

C C

C

CC C

C

CC

C C

C

C

C

C

C

CC

CC

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

CC

CC C

C

C

C

C

C

C C

C

C

C

CC

C

C

C

CC

CC

C

CC

C

C

C

C

C

C

C

CC

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

CC

CC

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

CC

C

CC

CC C

CCC

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C C

C

C

C

C

C

C CC

C

C

C

C CC

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

CC

C

C

CC

C

C

C

C

CC

CC

C

C

T

TTT

T

T

T

T

TT

T

T

T

TT

T

T

T

T

T

TT TT

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

TT

T

T

TTT

TT

1

0

PropensityScore

15 / 25

Best Case: Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

C

CCC

C

C

C

C

CC

C

C

C

CC

C

C

C

C

C

CCCC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

C

C

CCC

CC

C

C

C

CC

C

C

C

C

CC

C

CC

C

C

C C

C

C

C

CC

CC

C

CC

C

C

CC

C

C

C

C

CC

CC

C

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

CC

C

C

C

CC

CC

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

C CC

C

C

C

CC

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

CCC

C

C

C

CC

CC

C

C

CC

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C C

C

C C

CC

CC

C

C

C

C

CC

C

C

C C

CC

CC

C

C

C

C C

CC

C

C

C

C

C

C

C C

CC

C

CC

C

C

C

C

C

C

C

C

CC

CC

C

C

CC

CC

C

CC

C

C

C

C

CC

C

CC

C

C

C

CC

C

C

CC

C

C

C

C

C

C

C C

C

CC C

C

CC

C C

C

C

C

C

C

CC

CC

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

CC

CC C

C

C

C

C

C

C C

C

C

C

CC

C

C

C

CC

CC

C

CC

C

C

C

C

C

C

C

CC

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

CC

CC

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C

CC

C

CC

CC C

CCC

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

C

CC

C

C C

C

C

C

C

C

C CC

C

C

C

C CC

C

C

C

C

C

C

C

C

C

CC

C

C

C

C

C

C

C

C

CC

C

C

CC

C

C

C

C

CC

CC

C

C

T

TTT

T

T

T

T

TT

T

T

T

TT

T

T

T

T

T

TT TT

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

TT

T

T

TTT

TT

1

0

PropensityScore

15 / 25

Best Case: Propensity Score Matching

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

CC

CC

CC

C

C

CC

C

C

C

CC

C

C

C

C

C

C

C CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

CC

C

CC

CC

T

TTT

T

T

T

T

TT

T

T

T

TT

T

T

T

T

T

TT TT

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

TT

T

T

TTT

TT

15 / 25

Best Case: Propensity Score Matching is Suboptimal

Education (years)

Age

12 16 20 24 28

20

30

40

50

60

70

80

CC

CC

CC

C

C

CC

C

C

C

CC

C

C

C

C

C

C

C CC

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

C

CC

CC

C

CC

CC

T

TTT

T

T

T

T

TT

T

T

T

TT

T

T

T

T

T

TT TT

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

T

TT

T

T

TTT

TT

15 / 25

PSMs Statistical Properties

1. Low Standards: Sometimes helps, never optimizes

Efficient relative to complete randomization, but Inefficient relative to (the more powerful) full blocking Other methods usually dominate:

Xc = Xt = c = t butc = t 6= Xc = Xt

2. The PSM Paradox: When you do better, you do worse

Background: Random matching increases imbalance When PSM approximates complete randomization (to begin

with or, after some pruning)

all 0.5 (or constantwithin strata) pruning at random Imbalance Inefficency Model dependence Bias

If the data have no good matches, the paradox wont be aproblem but youre cooked anyway.

Doesnt PSM solve the curse of dimensionality problem?

Nope. The PSM Paradox gets worse with more covariates

16 / 25

PSMs Statistical Properties

1. Low Standards: Sometimes helps, never optimizes

Efficient relative to complete randomization, but Inefficient relative to (the more powerful) full blocking Other methods usually dominate:

Xc = Xt = c = t butc = t 6= Xc = Xt

2. The PSM Paradox: When you do better, you do worse

Background: Random matching increases imbalance When PSM approximates complete randomization (to begin

with or, after some pruning)

all 0.5 (or constantwithin strata) pruning at random Imbalance Inefficency Model dependence Bias

If the data have no good matches, the paradox wont be aproblem but youre cooked anyway.

Doesnt PSM solve the curse of dimensionality problem?

Nope. The PSM Paradox gets worse with more covariates

16 / 25

PSMs Statistical Properties

1. Low Standards: Sometimes helps, never optimizes Efficient relative to complete randomization, but

Inefficient relative to (the more powerful) full blocking Other methods usually dominate:

Xc = Xt = c = t butc = t 6= Xc = Xt

2. The PSM Paradox: When you do better, you do worse

Background: Random matching increases imbalance When PSM approximates complete randomization (to begin

with or, after some pruning)

all 0.5 (or constantwithin strata) pruning at random Imbalance Inefficency Model dependence Bias

If the data have no good matches, the paradox wont be aproblem but youre cooked anyway.

Doesnt PSM solve the curse of dimensionality problem?

Nope. The PSM Paradox gets worse with more covariates

16 / 25

PSMs Statistical Properties

1. Low Standards: Sometimes helps, never optimizes Efficient relative to complete randomization, but Inefficient relative to (the more powerful) full blocking

Other methods usually dominate:

Xc = Xt = c = t butc = t 6= Xc = Xt

2. The PSM Paradox: When you do better, you do worse

Background: Random matching increases imbalance When PSM approximates complete randomization (to begin

with or, after some pruning)

all 0.5 (or constantwithin strata) pruning at random Imbalance Inefficency Model dependence Bias

If the data have no good matches, the paradox wont be aproblem but youre cooked anyway.

Doesnt PSM solve the curse of dimensionality problem?

Nope. The PSM Paradox gets worse with more covariates

16 / 25

PSMs Statistical Properties

1. Low Standards: Sometimes helps, never optimizes Efficient relative to complete randomization, but Inefficient relative to (the more powerful) full blocking Other methods usually dominate:

Xc = Xt = c = t butc = t 6= Xc = Xt

2. The PSM Paradox: When you do better, you do worse

Background: Random matching increases imbalance When PSM approximates complete randomization (to begin

with or, after some pruning)

all 0.5 (or constantwithin strata) pruning at random Imbalance Inefficency Model dependence Bias

If the data have no good matches, the paradox wont be aproblem but youre cooked anyway.

Doesnt PSM solve the curse of dimensionality problem?

Nope. The PSM Paradox gets worse with more covariates

16 / 25

PSMs Statistical Properties

1. Low Standards: Sometimes helps, never optimizes Efficient relative to complete randomization, but Inefficient relative to (the more powerful) full blocking Other methods usually dominate:

Xc = Xt = c = t

butc = t 6= Xc = Xt

2. The PSM Paradox: When you do better, you do worse

Background: Random matching increases imbalance When PSM approximates complete randomization (to begin

with or, after some pruning)

all 0.5 (or constantwithin strata) pruning at random Imbalance Inefficency Model dependence Bias

If the data have no good matches, the paradox wont be aproblem but youre cooked anyway.

Doesnt PSM solve the curse of dimensionality problem?

Nope. The PSM Paradox gets worse with more covariates

16 / 25

PSMs Statistical Properties

1. Low Standards: Sometimes helps, never optimizes Efficient relative to complete randomization, but Inefficient relative to (the more powerful) full blocking Other methods usually dominate:

Xc = Xt = c = t butc = t 6= Xc = Xt

2. The PSM Paradox: When you do better, you do worse

Background: Random matching increases imbalance When PSM approximates complete randomization (to begin

with or, after some pruning)

all 0.5 (or constantwithin strata) pruning at random Imbalance Inefficency Model dependence Bias

If the data have no good matches, the paradox wont be aproblem but youre cooked anyway.

Doesnt PSM solve the curse of dimensionality problem?

Nope. The PSM Paradox gets worse with more covariates

16 / 25

PSMs Statistical Properties

1. Low Standards: Sometimes helps, never optimizes Efficient relative to complete randomization, but Ineffici