
  • Soc504: Causal Inference Topics

    Brandon Stewart

    Princeton

    April 10 - April 19, 2017

    This lecture draws from slides by Matt Blackwell, Jens Hainmueller, Erin Hartman and Gary King.

  • Readings

    Monday
    - King, Gary and Langche Zeng. "The Dangers of Extreme Counterfactuals," Political Analysis, 14, 2 (2006): 131-159.
    - King, Gary and Langche Zeng. "When Can History Be Our Guide? The Pitfalls of Counterfactual Inference," International Studies Quarterly, 51, 1 (March 2007): 183-210.

    Wednesday
    - Daniel Ho, Kosuke Imai, Gary King, and Elizabeth Stuart. "Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference," Political Analysis, 15 (2007): 199-236.

    Monday
    - Review Morgan and Winship's potential outcomes chapter.
    - Kosuke Imai, Gary King, and Elizabeth Stuart. "Misunderstandings Among Experimentalists and Observationalists About Causal Inference," Journal of the Royal Statistical Society, Series A, 171, part 2 (2008): 481-502.

    Wednesday
    - Optional: Imai, Keele, Tingley and Yamamoto. "Unpacking the Black Box of Causality: Learning About Causal Mechanisms from Experimental and Observational Studies," American Political Science Review (2011).
    - Optional: Acharya, Blackwell and Sen. "Explaining Causal Findings Without Bias: Detecting and Assessing Direct Effects," American Political Science Review (2016).


  • 1 Assessing Counterfactuals

    2 A (Brief) Review of Selection on Observables

    3 Matching as Non-parametric Preprocessing

    4 Fundamentals of Matching

    5 Three Approaches to Matching

    6 The Propensity Score

    7 Mechanisms: Estimands and Identification

    8 Mechanisms: Estimation

    9 Controlled Direct Effects

    10 Appendix: The Case Against Propensity Score Matching



  • Counterfactuals

    Three types:
    1. Forecasts: Will Donald Trump win reelection?
    2. What-if questions: What would have happened if the U.S. had not invaded Iraq?
    3. Causal effects: What is the causal effect of the Iraq war on U.S. Supreme Court decision making? (a factual minus a counterfactual)

    Counterfactuals are some part of most research, and absolutely essential in the context of quantities of interest.

    The model will always give an answer, so how do we identify reasonable counterfactuals?

    Summary of today: don't ask your model unreasonable questions. (Remember the Momentous Sprint?)


  • Which model would you choose? (Both fit the data well.)

    Compare the prediction at x = 1.5 to the prediction at x = 5.

    How do you choose a model? R²? Some "test"? "Theory"?

    The bottom line: answers to some questions don't exist in the data. Our estimates of certain quantities of interest are highly model dependent.
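    To make the point concrete, here is a minimal sketch (not from the slides; the data are simulated): two models that fit the observed data about equally well agree where the data are dense and can disagree sharply where they are not.

```python
# Hypothetical illustration of model dependence: fit two models that describe
# the observed range about equally well, then compare their predictions at an
# interpolated point (x = 1.5) and an extrapolated point (x = 5).
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 2, 50)                    # observations concentrated in [0, 2]
y = 1 + 2 * x + rng.normal(0, 0.3, 50)

linear = np.polyfit(x, y, 1)                 # straight-line fit
cubic = np.polyfit(x, y, 3)                  # cubic fit with similar in-sample fit

for x_new in (1.5, 5.0):
    print(x_new, np.polyval(linear, x_new), np.polyval(cubic, x_new))
# Near the data (x = 1.5) the two predictions are close; far from the data
# (x = 5) they can differ substantially, even though both fit about equally well.
```

    Nothing in the data adjudicates between the two fits at x = 5; that gap is exactly the model dependence the slide is about.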


  • Model Dependence Proof

    Model-Free Inference: To estimate E(Y | X = x) at x, average many observed Y with value x.

    Assumptions (Model-Based Inference):
    1. Definition: model dependence at x is the difference between predicted outcomes for any two models that fit about equally well.
    2. The functional form satisfies strong continuity (think smoothness, although it is less restrictive).

    Result: The maximum degree of model dependence is solely a function of the distance from the counterfactual to the data.
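    Rendered as a formula (a paraphrase of the slide's definition, with notation chosen here rather than taken from King and Zeng): for two models that fit the observed data about equally well, model dependence at a counterfactual point x is their gap in predictions.

```latex
% Notation assumed for illustration, not verbatim from the slides.
D(x) \;=\; \bigl|\hat{f}_1(x) - \hat{f}_2(x)\bigr|,
\qquad \hat{f}_1,\ \hat{f}_2 \text{ fit the observed data about equally well.}
```

    The result then says that the largest attainable D(x), over pairs of strongly continuous models that fit equally well, grows with the distance from x to the observed data, and depends on nothing else.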


  • Detecting Model Dependence: A (Hypothetical) Research Design

    Randomly select a large number of infants.

    Randomly assign them to 0, 6, 8, 10, 12, or 16 years of education.

    Assume 100% compliance, and no measurement error, omitted variables, or missing data.

    Regress cumulative salary in year 17 on education.

    We find a coefficient of β̂ = $1,000, big t-statistics, narrow confidence intervals, and we pass every test for autocorrelation, fit, normality, linearity, homoskedasticity, etc.


  • What Inferences Would You Be Willing to Make?

    A factual question: How much salary would someone receive with 12 years of education (a high school degree)?

    The model-free estimate: mean(Y) among those with X = 12.

    The model-based estimate: Ŷ = Xβ̂ = 12 × $1,000 = $12,000.


  • Counterfactual Inferences with Interpolation

    How much salary would someone receive with 14 years of education (an Associate's degree)?

    Model-free estimate: impossible.

    Model-based estimate: Ŷ = Xβ̂ = 14 × $1,000 = $14,000.


  • Counterfactual Inference with Extrapolation

    How much salary would someone receive with 24 years of education (a Ph.D.)?

    Ŷ = Xβ̂ = 24 × $1,000 = $24,000.


  • Another Counterfactual Inference with Extrapolation

    How much salary would someone receive with 53 years of education?

    Ŷ = Xβ̂ = 53 × $1,000 = $53,000.

    Recall: the regression passed every test and met every assumption; identical calculations worked for the other questions.

    What's changed? How would we recognize it when the example is less extreme or multidimensional?
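    The whole running example fits in a few lines. Below is a toy simulation (the data-generating process is invented for illustration): the model-free estimate exists only at the assigned education levels, while the regression cheerfully answers every question, including x = 53.

```python
# Toy simulation of the hypothetical design: salaries observed only at the
# assigned education levels; the fitted regression extrapolates anywhere.
import numpy as np

rng = np.random.default_rng(2)
educ = np.repeat([0, 6, 8, 10, 12, 16], 200)          # assigned education levels
salary = 1000 * educ + rng.normal(0, 500, educ.size)  # invented DGP, slope $1,000

slope, intercept = np.polyfit(educ, salary, 1)

print(salary[educ == 12].mean())          # model-free estimate at X = 12
for x in (12, 14, 24, 53):                # factual, interpolation, extrapolations
    print(x, intercept + slope * x)       # the model answers all of them
# There is no model-free estimate at 14, 24, or 53: no one was assigned those
# values, so the data are silent there.
```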


  • Model Dependence with One Explanatory Variable

    Suppose Y is starting salary and X is education in 10 categories.

    To estimate E(Y | X) we need 10 parameters, E(Y | X = x_j), j = 1, ..., 10.

    Model-free method: average 50 observations on Y for each value of X.

    Model-based method: regress Y on X, summarizing the 10 parameters with 2 (intercept and slope).

    The difference between the 10 we need and the 2 we estimate with regression is pure assumption.

    (If X were continuous, we would be reducing ∞ to 2, also by assumption.)


  • Model Dependence with Two Explanatory Variables

    Variables: X (education) and Z (parents' income), both with 10 categories.

    How many parameters do we now need to estimate? 20? Nope. It's 10 × 10 = 100. This is the curse of dimensionality: the number of parameters goes up geometrically, not additively.

    If we run a regression, we are summarizing 100 parameters with 3 (an intercept and two slopes).

    But what about including an interaction? Right, so now we're summarizing 100 parameters with 4.

    The difference: an enormous assumption based on convenience, not evidence or theory.


  • Model Dependence with Many Explanatory Variables

    Suppose: 15 explanatory variables, each with 10 categories.
    - We need to estimate 10¹⁵ (a quadrillion) parameters; with how many observations?
    - Regression reduces this to 16 parameters; quite an assumption!

    Suppose: 80 explanatory variables.
    - 10⁸⁰ is more than the number of atoms in the universe.
    - Yet, with a few simple assumptions, we can still run a regression and estimate only 81 parameters.

    The curse of dimensionality introduces huge assumptions, often unrecognized.
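    The arithmetic behind these counts, as a quick sanity check (the variable counts are the ones on the slide):

```python
# Saturated (model-free) cells vs. main-effects regression parameters when
# every explanatory variable has 10 categories.
for k in (1, 2, 15, 80):
    print(f"{k:2d} variables: 10^{k} cells vs. {k + 1} regression parameters")
#  2 variables: 100 cells vs. 3 parameters (intercept + two slopes)
# 15 variables: 10^15 (a quadrillion) cells vs. 16 parameters
# 80 variables: 10^80 cells (more than atoms in the universe) vs. 81 parameters
```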


  • How Factual is your Counterfactual?

    Is your counterfactual close enough to the data that statistical methods provide empirical answers?

    If not, the same calculations will be based on indefensible model assumptions. With the curse of dimensionality, it's too easy to fall into this trap.

    A good existing approach is sensitivity testing, but it requires the user to specify a class of models, estimate them all, and check how much inferences change.

    King/Zeng "Convex Hull" approach (sketched below):
    - Specify your explanatory variables, X.
    - Assume E(Y | X) is (minimally) smooth in X.
    - No need to specify models (or a class of models), estimators, or dependent variables.
    - Results of one run apply to the class of all models, all estimators, and all dependent variables.
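    One operational piece of the King/Zeng approach is asking whether a counterfactual point lies in the convex hull of the observed covariates, which reduces to a linear programming feasibility problem. Their WhatIf package for R implements the full procedure; the snippet below is an illustrative Python sketch of just the hull membership test, with invented data.

```python
# Convex hull membership as an LP feasibility check: x_star is a convex
# combination of the rows of X iff there exist lambda >= 0 with
# X' lambda = x_star and sum(lambda) = 1. Illustrative sketch, not the
# WhatIf implementation.
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(X, x_star):
    n = X.shape[0]
    A_eq = np.vstack([X.T, np.ones(n)])   # rows: X' lambda = x_star, 1' lambda = 1
    b_eq = np.append(x_star, 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.success                    # feasible <=> inside the hull

# Invented covariates: years of education and parents' income, 100 units
rng = np.random.default_rng(3)
X = np.column_stack([rng.choice(17, 100), rng.normal(50, 10, 100)])
print(in_convex_hull(X, np.array([12.0, 50.0])))   # interpolation-style query
print(in_convex_hull(X, np.array([53.0, 50.0])))   # extreme counterfactual: False
```

    Points outside the hull are the extreme counterfactuals the readings warn about: any answer there rests on the model, not the data.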


    A good existing approach: Sensitivity testing, but this requires theuser to specify a class of models and then to estimate them all andcheck how much inferences change

    King/Zeng “Convex Hull” approach:I Specify your explanatory variables, X .I Assume E(Y |X ) is (minimally) smooth in X

    I No need to specify models (or a class of models), estimators, ordependent variables.

    I Results of one run apply to the class of all models, all estimators, andall dependent variables.

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 15 / 168

  • How Factual is your Counterfactual?

    Is your counterfactual close enough to data so that statisticalmethods provide empirical answers?

    If not, the same calculations will be based on indefensible modelassumptions. With the curse of dimensionality, its too easy to fall intothis trap.

    A good existing approach: Sensitivity testing, but this requires theuser to specify a class of models and then to estimate them all andcheck how much inferences change

    King/Zeng “Convex Hull” approach:I Specify your explanatory variables, X .I Assume E(Y |X ) is (minimally) smooth in XI No need to specify models (or a class of models), estimators, or

    dependent variables.

    I Results of one run apply to the class of all models, all estimators, andall dependent variables.

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 15 / 168

  • How Factual is your Counterfactual?

    Is your counterfactual close enough to the data that statistical methods provide empirical answers?

    If not, the same calculations will be based on indefensible model assumptions. With the curse of dimensionality, it's too easy to fall into this trap.

    A good existing approach: sensitivity testing, but this requires the user to specify a class of models and then to estimate them all and check how much inferences change.

    King/Zeng “Convex Hull” approach:

    I Specify your explanatory variables, X.
    I Assume E(Y | X) is (minimally) smooth in X.
    I No need to specify models (or a class of models), estimators, or dependent variables.
    I Results of one run apply to the class of all models, all estimators, and all dependent variables.

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 15 / 168

  • Interpolation vs Extrapolation in one Dimension

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 16 / 168

  • Interpolation or Extrapolation in One and Two Dimensions

    Interpolation: inside the convex hull

    Extrapolation: outside the convex hull

    Calculating the convex hull directly would take forever in high dimensions

    The WhatIf package instead uses linear programming to check whether a candidate point is inside the hull (a sketch of the idea follows below)

    The key idea is making sure your counterfactual is near the data!
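    A minimal sketch of that linear-programming check (the idea, not necessarily WhatIf's exact internals): a candidate point lies inside the convex hull of the rows of X exactly when some non-negative weights w with sum(w) = 1 reproduce it.

    # Hull membership as an LP feasibility problem (illustrative sketch)
    library(lpSolve)

    in_hull <- function(X, point) {
      n <- nrow(X)
      res <- lp(direction    = "min",
                objective.in = rep(0, n),               # feasibility only
                const.mat    = rbind(t(X), rep(1, n)),  # X'w = point; sum(w) = 1
                const.dir    = rep("=", ncol(X) + 1),
                const.rhs    = c(point, 1))             # w >= 0 is lp()'s default
      res$status == 0                                   # status 0 means feasible
    }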

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 17 / 168

  • Replication: Doyle and Sambanis, APSR 2000

    Data: 124 Post-World War II civil wars

    Dependent variable: peacebuilding success

    Treatment variable: multilateral UN peacekeeping intervention (0/1)

    Control variables: war type, severity, and duration; development status; etc.

    Counterfactuals: UN intervention switched (0/1 to 1/0) for each observation

    Percent of counterfactuals in the convex hull: 0%

    Thus, without estimating any models, we know inferences will be model dependent; for illustration, here is an example. . . .

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 18 / 168

  • Doyle and Sambanis, Logit Model

                         Original Model          Modified Model
    Variables          Coeff     SE  P-val     Coeff     SE  P-val
    Wartype           −1.742   .609   .004    −1.666   .606   .006
    Logdead            −.445   .126   .000     −.437   .125   .000
    Wardur              .006   .006   .258      .006   .006   .342
    Factnum           −1.259   .703   .073    −1.045   .899   .245
    Factnum2            .062   .065   .346      .032   .104   .756
    Trnsfcap            .004   .002   .010      .004   .002   .017
    Develop             .001   .000   .065      .001   .000   .068
    Exp               −6.016  3.071   .050    −6.215  3.065   .043
    Decade             −.299   .169   .077     −.284   .169   .093
    Treaty             2.124   .821   .010     2.126   .802   .008
    UNOP4              3.135  1.091   .004      .262  1.392   .851
    Wardur*UNOP4           —      —      —      .037   .011   .001
    Constant           8.609  2.157   .000     7.978  2.350   .000
    N                    122                     122
    Log-likelihood   −45.649                 −44.902
    Pseudo R2           .423                    .433

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 19 / 168

  • Doyle and Sambanis: Model Dependence

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 20 / 168

  • UN Peacekeeping Operations

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 21 / 168

  • Another Example

    Remember our negative binomial model?

    mod <- ...                         # model call truncated in the source

                  Estimate Std. Error z value Pr(>|z|)
    (Intercept)     0.5943     0.1718   3.459 0.000541 ***
    cathunemp       7.9323     0.9150   8.669  < 2e-16 ***
    protunemp     -19.1683     2.3713  -8.084 6.29e-16 ***
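    For context, a hedged sketch of how such a model is typically fit with MASS::glm.nb (the actual call is truncated in the source; the data frame dat and outcome y are hypothetical names, and only the regressors cathunemp and protunemp are confirmed by the output above):

    library(MASS)   # provides glm.nb() for negative binomial regression

    # Hypothetical reconstruction; `dat` and `y` are assumed names.
    mod <- glm.nb(y ~ cathunemp + protunemp, data = dat)
    summary(mod)$coefficients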

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 22 / 168

  • Proposed Counterfactuals

    Let's consider two first differences we might plausibly estimate. At the baseline, both variables are assumed fixed at their sample means.

    1. Counterfactual 1: Catholic unemployment increases by one standard deviation and Protestant unemployment increases by one standard deviation.

    2. Counterfactual 2: Catholic unemployment decreases by one standard deviation and Protestant unemployment increases by one standard deviation.

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 23 / 168

  • Proposed Counterfactuals Plotted

    [Figure: scatterplot of the observed data, cathunemp (y-axis, 0.10 to 0.40) against protunemp (x-axis, 0.05 to 0.15), showing where the proposed counterfactual points fall relative to the data cloud.]

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 24 / 168

  • Checking the Convex Hull

    library(WhatIf)

    cf1 <- ...   # the remainder of this slide's code is truncated in the source
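    A hedged reconstruction of what the truncated calls plausibly looked like, assuming the model data sit in a data frame dat (a hypothetical name) with columns protunemp and cathunemp; whatif() takes the observed data and the candidate counterfactuals and reports hull membership:

    # Hedged sketch; `dat` is an assumed name for the model data frame.
    X <- dat[, c("protunemp", "cathunemp")]
    mu <- colMeans(X)
    s  <- apply(X, 2, sd)

    # Counterfactual 1: both unemployment rates one SD above their means
    cf1 <- data.frame(protunemp = mu["protunemp"] + s["protunemp"],
                      cathunemp = mu["cathunemp"] + s["cathunemp"])

    # Counterfactual 2: Protestant up one SD, Catholic down one SD
    cf2 <- data.frame(protunemp = mu["protunemp"] + s["protunemp"],
                      cathunemp = mu["cathunemp"] - s["cathunemp"])

    cf.res1 <- whatif(data = X, cfact = cf1)
    cf.res2 <- whatif(data = X, cfact = cf2)
    cf.res1$in.hull   # TRUE if counterfactual 1 lies inside the convex hull

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 25 / 168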

  • A Measure of Distance

    The whatif function also tells us the percentage of data points within 1 geometric variance of the counterfactual.

    > cf.res1$sum.stat
            1
    0.2608696
    > cf.res2$sum.stat
             1
    0.04603581

    The geometric variance is a generalization of the usual variance that is more suitable to mixed discrete and continuous variables; essentially, it is the average pairwise Gower distance in the data. The number of GVs away can be altered with the nearby argument.
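    To make the distance concrete, a sketch of the underlying quantity (the idea as described above, not necessarily the package's internal code), reusing the X from the earlier sketch:

    library(cluster)   # daisy() computes pairwise Gower dissimilarities

    # Average pairwise Gower distance among observed points: the "geometric variance"
    gv <- mean(daisy(X, metric = "gower"))

    # whatif() counts a data point as near a counterfactual when their Gower
    # distance is within nearby * gv (nearby = 1 by default).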

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 26 / 168

  • Biases in Regression: A Decomposition

    d = mean(Y | D = 1) − mean(Y | D = 0)

    bias ≡ E(d) − θ = ∆o + ∆p + ∆i + ∆e

    where θ is the true average causal effect and:

    ∆o Omitted variable bias (ignorability)

    ∆p Post-treatment bias (check this with theory!)

    ∆i Interpolation bias (use models or matching)

    ∆e Extrapolation bias (check this with data!)

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 27 / 168

  • Counterfactuals Summary

    When the model is true, we can extrapolate or interpolate as desired.

    In practical settings we do not believe the model is true, even if it is locally accurate.

    Thus we may get wildly different counterfactuals from different models when we are far from the data; we call this model dependence.

    The convex hull provides a way to check for extrapolation.

    This is a great way of assessing the reasonableness of our simulated quantities of interest.

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 28 / 168

  • 1 Assessing Counterfactuals

    2 A (Brief) Review of Selection on Observables

    3 Matching as Non-parametric Preprocessing

    4 Fundamentals of Matching

    5 Three Approaches to Matching

    6 The Propensity Score

    7 Mechanisms: Estimands and Identification

    8 Mechanisms: Estimation

    9 Controlled Direct Effects

    10 Appendix: The Case Against Propensity Score Matching

    Stewart (Princeton) Causal Inference Apr 10 - Apr 19, 2017 29 / 168
