1 Causal Inference and Ambiguous Manipulations Richard Scheines Grant Reaber, Peter Spirtes Carnegie Mellon University
Date posted: 21-Dec-2015

Causal Inference and Ambiguous Manipulations

Richard Scheines

Grant Reaber, Peter Spirtes

Carnegie Mellon University

1. Motivation

Wanted: Answers to Causal Questions:

• Does attending Day Care cause Aggression?

• Does watching TV cause obesity?

• How can we answer these questions empirically?

• When and how can we estimate the size of the effect?

• Can we know our estimates are reliable?

Causation & Intervention

P(Lung Cancer | Tar-stained teeth = no)

P(Lung Cancer | Tar-stained teeth set= no)

Conditioning is not the same as intervening

Show Teeth Slides

[Figure: two causal graphs over Gender and CEO Earnings; the second adds an intervention variable I.]

Causal Inference: Experiments

Gold Standard: Randomized Clinical Trials

- Intervene: Randomly assign treatment

- Observe Response

Estimate P( Response | Treatment assigned)

Causal Inference: Observational Studies

Collect a sample on

- Potential Causes (X)

- Response (Y)

- Covariates (potential confounders Z)

Estimate P(Y | X, Z)

• Highly unreliable

• We can estimate sampling variability, but we don’t know how to estimate specification uncertainty from data

Individual    Day Care    Aggressiveness
John          A lot       A lot
Mary          None        A little

2. Progress 1985 – Present

1. Representing causal structure, and connecting it to probability

2. Modeling Interventions

3. Indistinguishability and Discovery Algorithms

Representing Causal Structures

Causal Graph G = {V, E}

Each edge X → Y represents a direct causal claim:

X is a direct cause of Y relative to V

Exposure → Infection → Symptoms

Direct Causation

X is a direct cause of Y relative to S iff

∃ z, x1 ≠ x2 such that P(Y | X set= x1, Z set= z) ≠ P(Y | X set= x2, Z set= z),

where Z = S − {X, Y}

X → Y

Causal Bayes Networks

Smoking [0,1] → Yellow Fingers [0,1]

Smoking [0,1] → Lung Cancer [0,1]

P(S = 0) = .7
P(S = 1) = .3

P(YF = 0 | S = 0) = .99    P(LC = 0 | S = 0) = .95
P(YF = 1 | S = 0) = .01    P(LC = 1 | S = 0) = .05
P(YF = 0 | S = 1) = .20    P(LC = 0 | S = 1) = .80
P(YF = 1 | S = 1) = .80    P(LC = 1 | S = 1) = .20

P(S, YF, LC) = P(S) P(YF | S) P(LC | S)

The Joint Distribution Factors According to the Causal Graph, i.e.,

P(V) = ∏ over all X in V of P(X | Immediate Causes of(X))
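The factorization on this slide can be checked numerically. The sketch below uses the slide's own conditional probability tables for Smoking (S), Yellow Fingers (YF) and Lung Cancer (LC); the function and variable names are illustrative choices, not part of the original talk.

```python
from itertools import product

# Conditional probability tables copied from the slide.
P_S = {0: 0.7, 1: 0.3}
P_YF_GIVEN_S = {(0, 0): 0.99, (1, 0): 0.01, (0, 1): 0.20, (1, 1): 0.80}  # key: (yf, s)
P_LC_GIVEN_S = {(0, 0): 0.95, (1, 0): 0.05, (0, 1): 0.80, (1, 1): 0.20}  # key: (lc, s)


def joint(s, yf, lc):
    """P(S, YF, LC) = P(S) P(YF | S) P(LC | S), one factor per variable."""
    return P_S[s] * P_YF_GIVEN_S[(yf, s)] * P_LC_GIVEN_S[(lc, s)]


def p_lc1_given_yf(yf):
    """P(LC = 1 | YF = yf), obtained by summing the joint over S."""
    num = sum(joint(s, yf, 1) for s in (0, 1))
    den = sum(joint(s, yf, lc) for s in (0, 1) for lc in (0, 1))
    return num / den


# The eight joint probabilities sum to one.
total = sum(joint(s, yf, lc) for s, yf, lc in product((0, 1), repeat=3))

print("sum of joint:", total)
print("P(LC=1 | YF=1):", p_lc1_given_yf(1))
print("P(LC=1 | YF=0):", p_lc1_given_yf(0))
```

Yellow Fingers and Lung Cancer come out associated in this joint distribution even though neither is a cause of the other in the graph; the association runs entirely through Smoking.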

Modeling Ideal Interventions

Interventions on the Effect

[Figure: Pre-experimental System and Post-intervention system over Room Temperature and Wearing Sweater.]

Modeling Ideal Interventions

Interventions on the Cause

[Figure: Pre-experimental System and Post-intervention system over Room Temperature and Wearing Sweater.]

Interventions & Causal Graphs

• Model an ideal intervention by adding an “intervention” variable outside the original system

• Erase all arrows pointing into the variable intervened upon

Intervene to change Inf

Pre-intervention graph: Exp → Inf → Rash

Post-intervention graph? I → Inf → Rash, with the arrow Exp → Inf erased

Calculating the Effect of Interventions

Pre-manipulation Joint Distribution

P(Exp, Inf, Rash) = P(Exp) P(Inf | Exp) P(Rash | Inf)

Graph: Exp → Inf → Rash

Intervention on Inf

Post-manipulation Joint Distribution

P(Exp, Inf, Rash) = P(Exp) P(Inf | I) P(Rash | Inf)

Graph: Exp    I → Inf → Rash (no arrow from Exp into Inf)
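A minimal sketch of this calculation follows, assuming binary variables and made-up conditional probabilities (none of the numbers come from the slides). The post-manipulation joint is built by swapping the factor P(Inf | Exp) for P(Inf | I), here an intervention that sets Inf to a fixed value.

```python
from itertools import product

# Hypothetical CPTs for the chain Exp -> Inf -> Rash (illustrative numbers only).
P_EXP = {0: 0.8, 1: 0.2}
P_INF_GIVEN_EXP = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.40, 1: 0.60}}   # [exp][inf]
P_RASH_GIVEN_INF = {0: {0: 0.90, 1: 0.10}, 1: {0: 0.30, 1: 0.70}}  # [inf][rash]

STATES = list(product((0, 1), repeat=3))  # (exp, inf, rash)


def pre_joint(e, i, r):
    """Pre-manipulation: P(Exp) P(Inf | Exp) P(Rash | Inf)."""
    return P_EXP[e] * P_INF_GIVEN_EXP[e][i] * P_RASH_GIVEN_INF[i][r]


def post_joint(e, i, r, inf_set=1):
    """Post-manipulation: P(Inf | Exp) is replaced by P(Inf | I), which puts
    all mass on the value the intervention sets."""
    p_inf = 1.0 if i == inf_set else 0.0
    return P_EXP[e] * p_inf * P_RASH_GIVEN_INF[i][r]


def prob(joint_fn, event, given=lambda e, i, r: True):
    """P(event | given) under the supplied joint distribution."""
    den = sum(joint_fn(*s) for s in STATES if given(*s))
    num = sum(joint_fn(*s) for s in STATES if given(*s) and event(*s))
    return num / den


# Conditioning on Inf = 1 changes our belief about Exp; setting Inf = 1 does not.
p_exp_cond = prob(pre_joint, lambda e, i, r: e == 1, lambda e, i, r: i == 1)
p_exp_set = prob(post_joint, lambda e, i, r: e == 1)

# With no confounding of Inf and Rash, conditioning and setting agree for Rash.
p_rash_cond = prob(pre_joint, lambda e, i, r: r == 1, lambda e, i, r: i == 1)
p_rash_set = prob(post_joint, lambda e, i, r: r == 1)

print(p_exp_cond, p_exp_set, p_rash_cond, p_rash_set)
```

Under these assumed numbers, observing an infection raises the probability of exposure, while forcing an infection leaves it at its marginal value; the downstream effect on Rash is the same either way because Inf is the only cause of Rash in this graph.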

Causal Discovery from Observational Studies

[Figure: causal graphs over X1, X2, X3 entail Independence Relations via the Causal Markov Axiom (D-separation); a Discovery Algorithm maps the Independence Relations back to an Equivalence Class of Causal Graphs.]

Equivalence Class with Latents: PAGs (Partial Ancestral Graphs)

[Figure: a PAG over X1, X2, X3 and some of the graphs it represents, including graphs with latent variables T1 and T2, etc.]

Assumptions:

• Acyclic graphs

• Latent variables

• Sample Selection Bias

Equivalence:

• Independence over measured variables

Knowing when we know enough to calculate the effect of Interventions

The Prediction Algorithm (SGS, 2000)

Causal Inference from

Observational Studies

Causal Discovery from Observational Studies

Observed Independence:

• X1 _||_ X4

• X1 _||_ X3 | X2

• X4 _||_ X3 | X2

Discovery Algorithm → Equivalence Class (PAG) over X1, X2, X3, X4 → Prediction Algorithm

Predictions?

• P(X3 | X2 set): yes

• P(X2 | X1 set): don’t know

• P(X1 | X2 set): yes

• ….
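To make the link between a graph and its independence relations concrete, here is a small d-separation check. It assumes one particular DAG, X1 → X2 ← X4 with X2 → X3, chosen only because it entails the three independences listed on this slide; the slide's PAG stands for a whole class of graphs, and this DAG is a hypothetical member of it. The test uses the standard moralized ancestral graph criterion, not the discovery or prediction algorithms themselves.

```python
def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, x, y, z=()):
    """True iff x and y are d-separated given z in the DAG `parents`
    (a dict mapping each child to the set of its parents)."""
    keep = ancestors(parents, {x, y} | set(z))
    # Moralize the ancestral subgraph: undirect edges, marry co-parents.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for a in ps:
            for b in ps:
                if a != b:
                    adj[a].add(b)
    # x and y are d-separated iff every path between them passes through z.
    blocked = set(z)
    seen, stack = {x}, [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        for m in adj[node]:
            if m not in seen and m not in blocked:
                seen.add(m)
                stack.append(m)
    return True


# Assumed DAG (hypothetical): X1 -> X2 <- X4, X2 -> X3.
dag = {"X2": {"X1", "X4"}, "X3": {"X2"}}

print(d_separated(dag, "X1", "X4"))           # X1 _||_ X4
print(d_separated(dag, "X1", "X3", ["X2"]))   # X1 _||_ X3 | X2
print(d_separated(dag, "X4", "X3", ["X2"]))   # X4 _||_ X3 | X2
print(d_separated(dag, "X1", "X4", ["X2"]))   # conditioning on the collider X2
```

The last call illustrates why the collider matters: X1 and X4 are independent marginally but not once X2 is conditioned on.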

3. The Ambiguity of Manipulation

Assumptions

• Causal graph known (Cholesterol is a cause of Heart Condition)

• No Unmeasured Common Causes

Total Blood Cholesterol → Heart Disease

Therefore

The manipulated and unmanipulated distributions are the same:

P(H | TC = x) = P(H | TC set= x)

The Problem with Predicting the Effects of Acting

Problem: the cause is a composite of causes that don’t act uniformly,

e.g., Total Blood Cholesterol (TC) = HDL + LDL

[Figure: Total Blood Cholesterol = HDL + LDL → Heart Disease, with the two component edges labeled + and −.]

• The observed distribution over TC is determined by the unobserved joint distribution over HDL and LDL

• Ideally intervening on TC does not determine a joint distribution for HDL and LDL

The Problem with Predicting the Effects of Setting TC

[Figure: Total Blood Cholesterol = HDL + LDL → Heart Disease, with the two component edges labeled + and −.]

• P(H | TC set1= x) puts NO constraints on P(H | TC set2= x)

• P(H | TC = x) puts NO constraints on P(H | TC set= x)

• Nothing in the data tips us off about our ignorance, i.e., we don’t know that we don’t know.
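A toy calculation shows how two ways of setting TC to the same value can disagree with each other and with the observational conditional. Everything numeric here is an assumption made for illustration: HDL and LDL take levels 0, 1, 2, are independent and uniform in the observed population, and the risk function raises heart-disease risk with LDL and lowers it with HDL.

```python
from itertools import product

LEVELS = (0, 1, 2)
# Hypothetical observed (unmanipulated) distributions over the components.
P_HDL = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}
P_LDL = {0: 1 / 3, 1: 1 / 3, 2: 1 / 3}


def p_hd(hdl, ldl):
    """Hypothetical P(H = 1 | HDL, LDL): LDL raises risk, HDL lowers it."""
    return 0.3 + 0.15 * ldl - 0.10 * hdl


def p_hd_given_tc(tc):
    """Observational P(H = 1 | TC = tc), where TC = HDL + LDL.
    Averages over the component pairs that the population uses to reach tc."""
    pairs = [(h, l) for h, l in product(LEVELS, LEVELS) if h + l == tc]
    weight = sum(P_HDL[h] * P_LDL[l] for h, l in pairs)
    return sum(P_HDL[h] * P_LDL[l] * p_hd(h, l) for h, l in pairs) / weight


# Two manipulations that both achieve TC = 2 through different components.
p_set1 = p_hd(hdl=2, ldl=0)   # "set1": reach TC = 2 entirely through HDL
p_set2 = p_hd(hdl=0, ldl=2)   # "set2": reach TC = 2 entirely through LDL
p_obs = p_hd_given_tc(2)

print("P(H | TC set1= 2):", p_set1)
print("P(H | TC set2= 2):", p_set2)
print("P(H | TC = 2):   ", p_obs)
```

All three quantities refer to "TC = 2", yet under these assumptions they differ, and data on TC and H alone would only ever reveal the last one.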

Examples Abound

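[Figure: Total TV = Violent Junk + PBS, Discovery Channel → Social Adjustment, with the two component edges labeled + and −.]

[Figure: Total Day Care = Overcrowded, Poor Quality + High Quality → Aggressiveness, with the two component edges labeled + and −.]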

Possible Ways Out

• Causal Graph is Not Known:

Cholesterol does not really cause Heart Condition

• Confounders (unmeasured common causes) are present:

LDL and HDL are confounders

Cholesterol is not really a cause of Heart Condition

Relative to a set of variables S (and a background),

X is a cause of Y iff ∃ x1 ≠ x2 such that P(Y | X set= x1) ≠ P(Y | X set= x2)

• Total Cholesterol is a cause of Heart Disease

Cholesterol is not really a cause of Heart Condition

Is Total Cholesterol a direct cause of Heart Condition relative to {TC, LDL, HDL, HD}?

• TC is logically related to LDL and HDL, so manipulating it once LDL and HDL are set is impossible.

LDL, HDL are confounders

[Figure: graph over HDL, LDL, TC and Heart Disease, with a questioned (?) edge.]

• No way to manipulate TC without affecting HDL, LDL

• HDL, LDL are logically related to TC

Logico-Causal Systems

S: Atomic Variables

• independently manipulable

• effects of all manipulations are unambiguous

S’: Defined Variables

• defined logically from variables in S

For example:

S: LDL, HDL, HD, Disease1, Disease2

S’: TC

Logico-Causal Systems: Adding Edges

S: LDL, HDL, HD, D1, D2

S’: TC

[Figure: the system over S (D1, D2, LDL, HDL, HD) beside the system over S ∪ S’, which adds TC and a questioned (?) edge TC → HD.]

TC → HD iff manipulations of TC are unambiguous wrt HD

Logico-Causal Systems: Unambiguous Manipulations

TC → HD iff all manipulations of TC are unambiguous wrt HD

For each variable X’ in S’, let Parents(X’) be the set of variables in S that logically determine X’, i.e.,

X’ = f(Parents(X’)), e.g., TC = LDL + HDL

Inv(x’) = set of all values p of Parents(X’) s.t. f(p) = x’

A manipulation of a variable X’ in S’ to a value x’ wrt another variable Y is unambiguous iff

∀ p1 ≠ p2 [P(Y | p1 ∈ Inv(x’)) = P(Y | p2 ∈ Inv(x’))]
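The definition above translates fairly directly into a check over a finite domain. The sketch below assumes a binary Y (so comparing P(Y = 1) is enough), discrete parent values, and two made-up risk functions; `inv`, `is_unambiguous`, `opposing` and `uniform` are illustrative names, not terms from the talk.

```python
from itertools import product


def inv(f, parent_values, x_prime):
    """Inv(x'): all parent value tuples p with f(p) == x'."""
    return [p for p in parent_values if f(*p) == x_prime]


def is_unambiguous(p_y_given_parents, f, parent_values, x_prime, tol=1e-12):
    """A manipulation of X' to x' is unambiguous wrt Y iff every p in Inv(x')
    gives the same P(Y | Parents(X') = p). Y is assumed binary here."""
    probs = [p_y_given_parents(*p) for p in inv(f, parent_values, x_prime)]
    return all(abs(q - probs[0]) <= tol for q in probs)


def tc(hdl, ldl):
    """The defined variable: TC = LDL + HDL."""
    return hdl + ldl


def opposing(hdl, ldl):
    """Hypothetical P(HD = 1 | HDL, LDL): components act in opposite directions."""
    return 0.3 + 0.15 * ldl - 0.10 * hdl


def uniform(hdl, ldl):
    """Hypothetical P(HD = 1 | HDL, LDL): only the total matters."""
    return 0.1 + 0.1 * (hdl + ldl)


parent_values = list(product(range(3), repeat=2))  # (hdl, ldl) with levels 0..2

print(inv(tc, parent_values, 2))                          # [(0, 2), (1, 1), (2, 0)]
print(is_unambiguous(opposing, tc, parent_values, 2))     # False
print(is_unambiguous(uniform, tc, parent_values, 2))      # True
```

When the components act uniformly, every way of reaching a given TC value has the same effect on HD, so the manipulation is unambiguous; when they act in opposite directions, it is not.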

Logico-Causal Systems: Removing Edges

S: LDL, HDL, HD, D1, D2

S’: TC

[Figure: the system over S (D1, D2, LDL, HDL, HD) beside the system over S ∪ S’, which adds TC; two edges are questioned (?).]

Remove LDL → HD iff LDL _||_ HD | TC

Logico-Causal Systems: Faithfulness

[Figure: graph over D1, D2, LDL, HDL, HD, TC.]

Faithfulness: independences are entailed by structure, not by special parameter values. Crucial to inference.

Effect of TC on HD unambiguous

Unfaithfulness: LDL _||_ HDL | TC

because LDL and TC determine HDL, and similarly, HDL and TC determine LDL

Effect on Prediction Algorithm

Manipulate    Effect on    Assume manipulation unambiguous    Manipulation maybe ambiguous
Disease 1     Disease 2    None                               None
Disease 1     HD           Can’t tell                         Can’t tell
Disease 1     TC           Can’t tell                         Can’t tell
Disease 2     Disease 1    None                               None
Disease 2     HD           Can’t tell                         Can’t tell
Disease 2     TC           Can’t tell                         Can’t tell
TC            Disease 1    None                               Can’t tell
TC            Disease 2    None                               Can’t tell
TC            HD           Can’t tell                         Can’t tell
HD            Disease 1    None                               Can’t tell
HD            Disease 2    None                               Can’t tell
HD            TC           Can’t tell                         Can’t tell

Observed System: TC, HD, D1, D2

[Figure: graph over D1, D2, LDL, HDL, HD, TC with three questioned (?) edges.]

Still sound, but less informative

Effect on Prediction Algorithm

Observed System: TC, HD, D1, D2, X

[Figure: graph over D1, D2, LDL, HDL, HD, TC and X, with one questioned (?) edge.]

Not completely sound

No general characterization of when the Prediction algorithm, suitably modified, is still informative and sound. Conjectures, but no proof yet.

Example:

• If observed system has no deterministic relations

• All orientations due to marginal independence relations are still valid

Effect on Causal Inference of Ambiguous Manipulations

Experiments, e.g., RCTs:

Manipulating treatment is

• unambiguous → sound

• ambiguous → unsound

Observational Studies, e.g., Prediction Algorithm:

Manipulation is

• unambiguous → potentially sound

• ambiguous → potentially sound

References

• Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition. MIT Press.

• Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge Univ. Press.

• Spirtes, P., Scheines, R., Glymour, C., Richardson, T., and Meek, C. (2004). “Causal Inference,” in Handbook of Quantitative Methodology in the Social Sciences, ed. David Kaplan, Sage Publications, 447-478.

• Spirtes, P., and Scheines, R. (2004). Causal Inference of Ambiguous Manipulations. In Proceedings of the Philosophy of Science Association Meetings, 2002.

• Reaber, G. (2005). The Theory of Ambiguous Manipulations. Master's Thesis, Department of Philosophy, Carnegie Mellon University.