Ockham’s Razor in Causal Discovery: A New Explanation
Kevin T. Kelly
Conor Mayo-Wilson
Department of Philosophy
Joint Program in Logic and Computation
Carnegie Mellon University
www.hss.cmu.edu/philosophy/faculty-kelly.php


I. Prediction vs. Policy

Predictive Links

Correlation or co-dependency allows one to predict Y from X.

[Figure: Ash trays and Lung cancer, with a scientist and a policy maker.]

“Ash trays linked to lung cancer!”

Policy

Policy manipulates X to achieve a change in Y.

[Figure: Ash trays and Lung cancer.]

“Ash trays linked to lung cancer!”

“Prohibit ash trays!”

“We failed!”

Correlation is not Causation

Manipulation of X can destroy the correlation of X with Y.

[Figure: Ash trays and Lung cancer.]

“We failed!”

Standard Remedy

Randomized controlled study

[Figure: Ash trays and Lung cancer.]

“That’s what happens if you carry out the policy.”

Infeasibility, Expense, Morality

[Figure: Lead and IQ.]

“Let me force a few thousand children to eat lead.”

“Just joking!”

Ironic Alliance

[Figure: Lead and IQ.]

Industry: “Ha! You will never prove that lead affects IQ… And you can’t throw my people out of work on a mere whim. So I will keep on polluting, which will never settle the matter because it is not a randomized trial.”

II. Causes From Correlations

Causal Discovery

Patterns of conditional correlation can imply unambiguous causal conclusions (Pearl, Spirtes, Glymour, Scheines, etc.).

[Figure: causal network over Protein A, Protein B, Protein C, and Cancer protein.]

“Eliminate protein C!”

Basic Idea

Causation is a directed, acyclic network over variables.

What makes a network causal is a relation of compatibility between networks and joint probability distributions.

[Figure: a network G over X, Y, Z, related by compatibility to a joint distribution p.]

Compatibility

Joint distribution p is compatible with directed, acyclic network G iff:

• Causal Markov Condition: each variable X is independent of its non-effects given its immediate causes.

• Faithfulness Condition: every conditional independence relation that holds in p is a consequence of the Causal Markov Condition.

[Figure: example network over V, W, X, Y, Z.]
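The two compatibility conditions can be restated in graph notation. The block below is a sketch in which "non-effects" is read as non-descendants and "immediate causes" as parents; the symbols ND_G, Pa_G, I(p) and I_Markov(G) are notational assumptions, not taken from the slides.

```latex
% Causal Markov Condition: every variable X is independent (in p) of its
% non-descendants in G, given its parents in G.
X \;\perp\!\!\!\perp_{p}\; \bigl(\mathrm{ND}_G(X) \setminus \mathrm{Pa}_G(X)\bigr) \;\bigm|\; \mathrm{Pa}_G(X)
\qquad \text{for every variable } X \text{ in } G.

% Faithfulness Condition: p has no conditional independencies beyond those
% entailed by the Causal Markov Condition applied to G.
\mathcal{I}(p) \;\subseteq\; \mathcal{I}_{\mathrm{Markov}}(G)

% Compatibility of p with G is the conjunction of the two conditions,
% which amounts to \mathcal{I}(p) = \mathcal{I}_{\mathrm{Markov}}(G).
```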

Common Cause

[Figure: A is a common cause of B and C.]

• B yields info about C (Faithfulness);
• B yields no further info about C given A (Markov).

Causal Chain

[Figure: a causal chain over B, A, C, with A in the middle.]

• B yields info about C (Faithfulness);
• B yields no further info about C given A (Markov).

Common Effect

[Figure: A is a common effect of B and C.]

• B yields no info about C (Markov);
• B yields extra info about C given A (Faithfulness).
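The three patterns can be checked numerically. The following is a minimal sketch, assuming linear models with unit coefficients and standard normal error terms (choices made here for illustration, not taken from the slides). It compares the correlation of B and C with their partial correlation given A; the function names are hypothetical.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing each on z."""
    design = np.column_stack([np.ones_like(z), z])
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

def simulate(structure, n=20000, seed=0):
    """Draw n samples of (A, B, C) from an illustrative linear Gaussian model."""
    rng = np.random.default_rng(seed)
    e1, e2, e3 = rng.normal(size=(3, n))
    if structure == "common_cause":     # B <- A -> C
        a = e1
        b = a + e2
        c = a + e3
    elif structure == "chain":          # B -> A -> C
        b = e1
        a = b + e2
        c = a + e3
    elif structure == "common_effect":  # B -> A <- C
        b = e1
        c = e2
        a = b + c + e3
    else:
        raise ValueError(structure)
    return a, b, c

def summary(structure):
    """Return (corr(B, C), partial corr(B, C | A)) for one structure."""
    a, b, c = simulate(structure)
    return float(np.corrcoef(b, c)[0, 1]), partial_corr(b, c, a)

for s in ("common_cause", "chain", "common_effect"):
    marginal, given_a = summary(s)
    print(f"{s:14s} corr(B,C)={marginal:+.2f}  corr(B,C|A)={given_a:+.2f}")
```

Under these assumptions, the first two structures show a clear correlation that vanishes given A, while the common effect shows the reverse pattern: no correlation until A is conditioned on.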

Distinguishability

[Figure: four networks over A, B, C. The common cause and the two causal chains are indistinguishable; the common effect is distinctive.]

Immediate Connections

• There is an immediate causal connection between X and Y iff X is dependent on Y given every subset of variables not containing X and Y (Spirtes, Glymour and Scheines).

[Figure: X and Y directly connected. No intermediate conditioning set breaks dependency.]

[Figure: X and Y connected only by way of Z and W. Some conditioning set breaks dependency.]

Recovery of Skeleton

• Apply preceding condition to recover every non-oriented immediate causal connection.

[Figure: the true network and the skeleton recovered from it.]
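The skeleton condition can be sketched directly as a brute-force search over conditioning sets. The code below assumes linear Gaussian data and uses a Fisher z-test on partial correlations as the dependence test. The threshold `z_crit`, the function names, and the example network (X -> Z <- Y, Z -> W) are illustrative assumptions, not the authors' procedure.

```python
import itertools
import math
import numpy as np

def partial_corr(data, i, j, cond):
    """Partial correlation of columns i and j given the columns in cond."""
    idx = [i, j, *cond]
    prec = np.linalg.inv(np.corrcoef(data[:, idx], rowvar=False))
    return -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])

def dependent(data, i, j, cond, z_crit=4.0):
    """Fisher z-test; z_crit is an illustrative threshold."""
    n = data.shape[0]
    r = partial_corr(data, i, j, cond)
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    return abs(z) > z_crit

def skeleton(data):
    """Keep edge i-j iff i and j are dependent given every subset of the rest."""
    d = data.shape[1]
    edges = set()
    for i, j in itertools.combinations(range(d), 2):
        others = [k for k in range(d) if k not in (i, j)]
        subsets = itertools.chain.from_iterable(
            itertools.combinations(others, r) for r in range(len(others) + 1))
        if all(dependent(data, i, j, s) for s in subsets):
            edges.add((i, j))
    return edges

# Illustrative data: columns 0..3 are X, Y, Z, W with X -> Z <- Y and Z -> W.
rng = np.random.default_rng(1)
n = 5000
x = rng.normal(size=n)
y = rng.normal(size=n)
z = x + y + rng.normal(size=n)
w = z + rng.normal(size=n)
data = np.column_stack([x, y, z, w])
print(sorted(skeleton(data)))
```

The search is exponential in the number of variables, so it is only meant to make the definition concrete.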

Orientation of Skeleton

• Look for the distinctive pattern of common effects.

• Draw all deductive consequences of these orientations.

[Figure: the skeleton with a common effect marked, next to the true network. Annotation: “Y is not common effect of ZY. So orientation must be downward.”]
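The two orientation steps can be sketched as follows, assuming the skeleton and the separating sets (a conditioning set that broke each non-adjacent pair's dependency) are already known. The collider rule and the single propagation rule are the standard ones from constraint-based search and are offered as an illustration; the function name `orient` and the example inputs are hypothetical.

```python
import itertools

def orient(edges, sepset):
    """Orient a skeleton: first common effects, then their consequences."""
    adj = {frozenset(e) for e in edges}
    nodes = sorted({v for e in edges for v in e})
    directed = set()
    # Step 1: i - k - j with i, j non-adjacent is a common effect at k
    # exactly when k was not needed to separate i from j.
    for k in nodes:
        nbrs = [v for v in nodes if frozenset((v, k)) in adj]
        for i, j in itertools.combinations(nbrs, 2):
            pair = frozenset((i, j))
            if pair not in adj and k not in sepset[pair]:
                directed |= {(i, k), (j, k)}
    # Step 2: if a -> b and b - c with a, c non-adjacent, then b -> c,
    # because b <- c would create a common effect that step 1 ruled out.
    changed = True
    while changed:
        changed = False
        for a, b in list(directed):
            for c in nodes:
                if (c != a and frozenset((b, c)) in adj
                        and frozenset((a, c)) not in adj
                        and (b, c) not in directed
                        and (c, b) not in directed):
                    directed.add((b, c))
                    changed = True
    return directed

# Illustrative input: skeleton X - Z, Y - Z, Z - W.
edges = [("X", "Z"), ("Y", "Z"), ("Z", "W")]
sepset = {
    frozenset("XY"): set(),   # X and Y are independent unconditionally
    frozenset("XW"): {"Z"},   # Z separates X from W
    frozenset("YW"): {"Z"},   # Z separates Y from W
}
print(sorted(orient(edges, sepset)))
```

In this example Z is identified as a common effect of X and Y, and the remaining edge is then forced to point from Z to W.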

Causation from Correlation

The following network is causally unambiguous if all variables are observed.

[Figure: causal network over Protein A, Protein B, Protein C, and Cancer protein, with one arrow drawn in red.]

The red arrow is also immune to latent confounding causes.

Brave New World for Policy

Experimental (confounder-proof) conclusions from correlational data!

[Figure: causal network over Protein A, Protein B, Protein C, and Cancer protein.]

“Eliminate protein C!”

III. The Catch

Metaphysics vs. Inference

The above results all assume that the true statistical independence relations for p are given.

But they must be inferred from finite samples.

[Diagram: Sample → Inferred statistical dependencies → Causal conclusions]

Problem of Induction

Independence is indistinguishable from sufficiently small dependence at sample size n.

[Figure: data compatible with both independence and dependence.]
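One way to make the dependence on sample size concrete is to ask how large a sample correlation must be before a standard test flags it. The sketch below assumes a Fisher z-test with a two-sided threshold `z_crit`; both the test and the default threshold are illustrative choices, not taken from the slides.

```python
import math

def min_detectable_corr(n, z_crit=1.96):
    """Smallest |sample correlation| that a Fisher z-test flags at sample size n.

    The test rejects independence when |atanh(r)| * sqrt(n - 3) > z_crit,
    which is equivalent to |r| > tanh(z_crit / sqrt(n - 3)).
    """
    return math.tanh(z_crit / math.sqrt(n - 3))

for n in (100, 10_000, 1_000_000):
    print(n, round(min_detectable_corr(n), 4))
```

The threshold shrinks roughly like 1/sqrt(n) but never reaches zero, so at any fixed n some sufficiently weak dependence is treated the same as independence.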

Bridging the Inductive Gap

Assume conditional independence until the data show otherwise.

Ockham’s razor: assume no more causal complexity than necessary.

Inferential Instability

• No guarantee that small dependencies will not be detected later.

• Can have spectacular impact on prior causal conclusions.

Current Policy Analysis

[Figure: causal network over Protein A, Protein B, Protein C, and Cancer protein.]

“Eliminate protein C!”

As Sample Size Increases…

[Figure: the network with an added Protein D and one weak connection.]

“Rescind that order!”

As Sample Size Increases Again…

[Figure: the network with added Proteins D and E and three weak connections.]

“Eliminate protein C again!”

Etc.

Typical Applications

• Linear Causal Case: each variable X is a linear function of its parents and a normally distributed hidden variable called an “error term”. The error terms are mutually independent.

• Discrete Multinomial Case: each variable X takes on a finite range of values.
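The linear causal case can be written out as a structural equation. This is a sketch in which the coefficient names b_{ij}, the variances \sigma_i^2, the zero means, and the parent-set notation Pa(i) are assumptions introduced for illustration.

```latex
% Linear causal case: each variable is a linear function of its parents
% in G plus a normally distributed error term.
X_i \;=\; \sum_{j \in \mathrm{Pa}(i)} b_{ij}\, X_j \;+\; \varepsilon_i ,
\qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma_i^2),
\qquad
\varepsilon_1, \dots, \varepsilon_k \ \text{mutually independent}.
```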

An Optimistic Concession

No unobserved latent confounding causes

[Figure: Genetics, Smoking, Cancer.]

Causal Flipping Theorem

No matter what a consistent causal discovery procedure has seen so far, there exists a pair G, p satisfying the above assumptions so that the current sample is arbitrarily likely in p and the procedure produces arbitrarily many opposite conclusions in p about an arbitrary causal arrow in G as sample size increases.

• Every consistent causal inference method is covered.

• Therefore, multiple instability is an intrinsic feature of the causal discovery problem.

“Oops… I meant… oops… I meant…”

The Crooked Course

"Living in the midst of ignorance and considering themselves intelligent and enlightened, the senseless people go round and round, following crooked courses, just like the blind led by the blind."

Katha Upanishad, I. ii. 5.

Extremist Reaction

Since causal discovery cannot lead straight to the truth, it is not justified.

“I must remain silent.”

“Therefore, I win.”

Moderate Reaction

Many explanations have been offered to make sense of the here-today-gone-tomorrow nature of medical wisdom — what we are advised with confidence one year is reversed the next — but the simplest one is that it is the natural rhythm of science.

(Do We Really Know What Makes us Healthy?, NY Times Magazine, Sept. 16, 2007).

Skepticism Inverted

Unavoidable retractions are justified because they are unavoidable.

Avoidable retractions are not justified because they are avoidable.

So the best possible methods for causal discovery are those that minimize causal retractions.

The best possible means for finding the truth are justified.

Larger Proposal

The same holds for Ockham’s razor in general when the aim is to find the true theory.

IV. Ockham’s Razor

Which Theory is Right?

???

Ockham Says:

Choose the Simplest!

But Why?

Gotcha!

Puzzle

An indicator must be sensitive to what it indicates.

But Ockham’s razor always points at simplicity.

How can a broken compass help you find something unless you already know where it is?

[Figure labels: simple, complex.]

Standard Accounts

1. Prior Simplicity Bias: Bayes, BIC, MDL, MML, etc.

2. Risk Minimization: SRM, AIC, cross-validation, etc.

1. Bayesian Account

• Ockham’s razor is a feature of one’s personal prior belief state.

• Short run: no objective connection with finding the truth (flipping theorem applies).

• Long run: converges to the truth, but other prior biases would also lead to convergence.

2. Risk Minimization Account

Risk minimization is about prediction rather than truth.

Urges using a false causal theory rather than the known true theory for predictive purposes.

Therefore, not suited to exact science or to practical policy applications.

V. A New Foundation for Ockham’s Razor

Connections to the Truth

• Short-run Reliability: too strong to be feasible when theory matters.

• Long-run Convergence: too weak to single out Ockham’s razor.

[Figure: paths between Simple and Complex.]

Middle Path

• Short-run Reliability: too strong to be feasible when theory matters.

• “Straightest” convergence: just right?

• Long-run Convergence: too weak to single out Ockham’s razor.

[Figure: paths between Simple and Complex.]

Empirical Problems

• Set K of infinite input sequences.

• Partition of K into alternative theories.

[Figure: K partitioned into T1, T2, T3.]

Empirical Methods

Map finite input sequences to theories or to “?”.

[Figure: K partitioned into T1, T2, T3; a method maps the finite input sequence e to T3.]

Method Choice

At each stage, scientist can choose a new method (agreeing with past theory choices).

[Figure: input history e1 e2 e3 e4 and output history T1 T2 T3.]

Aim: Converge to the Truth

[Figure: K partitioned into T1, T2, T3.]

Output sequence: T3 ? T2 ? T1 T1 T1 T1 . . . T1 T1 T1

Retraction

Choosing T and then not choosing T next.

[Figure: after T, the next output is T’ or “?”.]
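The definition can be applied mechanically to an output sequence. The sketch below counts retractions and records the stage at which each occurs, reading “?” as choosing no theory, so that moving from “?” to a theory is not counted. That reading and the function name are assumptions made here.

```python
def retraction_times(outputs):
    """Stages t at which the theory chosen at stage t - 1 is no longer chosen."""
    return [
        t for t in range(1, len(outputs))
        if outputs[t - 1] != "?" and outputs[t] != outputs[t - 1]
    ]

# A finite prefix of the output sequence T3 ? T2 ? T1 T1 T1 ...
outputs = ["T3", "?", "T2", "?", "T1", "T1", "T1"]
print(retraction_times(outputs))  # [1, 3]
```

On this prefix the method retracts twice (T3, then T2) before settling on T1.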

Aim: Eliminate Needless Retractions

[Figure label: Truth.]

Aim: Eliminate Needless Delays to Retractions

[Figure: a theory with the corollaries and applications that accumulate on it.]

Why Timed Retractions?

Retraction minimization = generalized significance level.

Retraction time minimization = generalized power.

Easy Retraction Time Comparisons

[Figure: output sequences of Method 1 and Method 2 over T1, T2, T3, T4, compared retraction by retraction: at least as many, at least as late.]
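The comparison “at least as many, at least as late” can be given one possible formalization: each retraction of the better method is matched to a distinct retraction of the worse method that occurs no earlier. This matching criterion is an assumption made for illustration, and the function name is hypothetical.

```python
def at_least_as_bad(times_a, times_b):
    """True if b retracts at least as many times, and at least as late, as a.

    times_a and times_b are lists of retraction stages. Pairing the latest
    retractions first suffices to decide whether a valid matching exists.
    """
    a = sorted(times_a, reverse=True)
    b = sorted(times_b, reverse=True)
    return len(b) >= len(a) and all(ta <= tb for ta, tb in zip(a, b))

print(at_least_as_bad([1, 5], [2, 3, 6]))  # True
print(at_least_as_bad([4, 5], [1, 6]))     # False
```

In the second call only one retraction of the second method is late enough, so the two methods are not comparable in that direction.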

Worst-case Retraction Time Bounds

[Figure: a tree of output sequences over T1, T2, T3, T4, with worst-case timed retraction bound (1, 2, ∞).]

Curve Fitting

Data = open intervals around Y at rational values of X.

[Figures: data showing no effects; a first-order effect; a second-order effect.]

Ockham

[Figure: a path through the answers Constant, Linear, Quadratic, Cubic.]

“There yet?” “Maybe.”

Ockham Violation

[Figure: a path through the answers Constant, Linear, Quadratic, Cubic.]

“There yet?” “Maybe.”

“I know you’re coming!”

“Maybe.”

“!!!”

“Hmm, it’s quite nice here…”

“You’re back! Learned your lesson?”

Violator’s Path

[Figure: the violator’s path through Constant, Linear, Quadratic, Cubic.]

“See, you shouldn’t run ahead, even if you are right!”

Ockham Path

[Figure: the Ockham path through Constant, Linear, Quadratic, Cubic.]
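The Ockham path in the curve-fitting example can be sketched as "report the lowest polynomial degree compatible with the data so far". The code below treats each datum as an interval of half-width `half_width` around an observed y value and uses a least-squares fit as a stand-in for an exact compatibility check. Both simplifications, and all names, are assumptions made for illustration.

```python
import numpy as np

def ockham_degree(xs, ys, half_width, max_degree=5):
    """Lowest degree whose least-squares fit stays inside every interval.

    Returns None if no degree up to max_degree fits. Least squares is only
    an approximation to asking whether some polynomial fits the intervals.
    """
    for degree in range(min(max_degree, len(xs) - 1) + 1):
        coeffs = np.polyfit(xs, ys, degree)
        if np.max(np.abs(np.polyval(coeffs, xs) - ys)) < half_width:
            return degree
    return None

wide = np.linspace(-1.0, 1.0, 9)
narrow = np.linspace(-0.1, 0.1, 5)

print(ockham_degree(wide, 1 + 2 * wide, 0.05))   # linear data
print(ockham_degree(wide, wide ** 2, 0.05))      # quadratic effect visible
print(ockham_degree(narrow, narrow ** 2, 0.05))  # same effect, not yet visible
```

The last call shows why effects may take arbitrarily long to appear: over a narrow range the quadratic term is smaller than the interval width, so the simplest compatible answer is still the constant.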

More General Argument Required

• Cover case in which demon has branching paths (causal discovery).

• Cover case in which scientist lags behind (using time as a cost).

“Come on!”

Empirical Effects

May take arbitrarily long to discover, but can’t be taken back.

Empirical Theories

True theory determined by which effects appear.

Empirical Complexity

[Figure: theories ordered by the effects they imply, with an arrow labeled “More complex”.]

Background Constraints

[Figure: the same ordering restricted by background constraints, with an arrow labeled “More complex”.]

Ockham’s Razor

Don’t select a theory unless it is uniquely simplest in light of experience.

Weak Ockham’s Razor

Don’t select a theory unless it is among the simplest in light of experience.

Stalwartness

Don’t retract your answer while it is uniquely simplest.
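The cost of violating these principles can be illustrated with a toy simulation of the curve-fitting story. The input stream records how many effects have appeared by each stage, and a theory is identified with a number of effects. The Ockham method reports exactly the effects seen so far. The violator guesses one extra effect and falls back only after `patience` stages without news. The stream, the fallback rule, and all names are hypothetical.

```python
def ockham_method(counts):
    """Answer with exactly the number of effects seen so far."""
    return counts[-1]

def violator_method(counts, patience=2):
    """Guess one extra effect until `patience` stages pass without a new one."""
    current = counts[-1]
    stages_at_current = 0
    for c in reversed(counts):
        if c != current:
            break
        stages_at_current += 1
    return current + 1 if stages_at_current <= patience else current

def run(method, stream):
    """Apply the method to every finite prefix of the input stream."""
    return [method(stream[: t + 1]) for t in range(len(stream))]

def retractions(outputs):
    """Number of stages at which the previous answer is dropped."""
    return sum(1 for prev, cur in zip(outputs, outputs[1:]) if prev != cur)

# A stream in which each new effect shows up only after the violator gives up.
stream = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
print(retractions(run(ockham_method, stream)))    # 2
print(retractions(run(violator_method, stream)))  # 5
```

On this stream the Ockham method retracts once per effect, while the violator retracts both when it retreats and when the withheld effect finally appears.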

Timed Retraction Bounds

r(M, e, n) = the least timed retraction bound covering the total timed retractions of M along input streams of complexity n that extend e.

[Figure: bounds for M plotted against empirical complexity 0, 1, 2, 3, . . .]

Efficiency of Method M at e

• M converges to the truth no matter what;

• For each convergent M’ that agrees with M up to the end of e, and for each n: r(M, e, n) ≤ r(M’, e, n).

[Figure: bounds for M and M’ plotted against empirical complexity 0, 1, 2, 3, . . .]

M is Beaten at e

There exists convergent M’ that agrees with M up to the end of e, such that:

• For each n, r(M, e, n) ≥ r(M’, e, n);

• There exists n such that r(M, e, n) > r(M’, e, n).

[Figure: bounds for M and M’ plotted against empirical complexity 0, 1, 2, 3, . . .]

Ockham Efficiency Theorem

Let M be a solution. The following are equivalent:

• M is always strongly Ockham and stalwart;

• M is always efficient;

• M is never weakly beaten.

Example: Causal Inference

Effects are conditional statistical dependence relations.

X dep Y | {Z}, {W}, {Z,W}
Y dep Z | {X}, {W}, {X,W}
X dep Z | {Y}, {Y,W}
. . .

Page 112:

Causal Discovery = Ockham’s Razor

[Figure: graph over variables X, Y, Z, W]

Page 113:

Ockham’s Razor

[Figure: graph over variables X, Y, Z, W]

X dep Y | {Z}, {W}, {Z,W}

Page 114:

Causal Discovery = Ockham’s Razor

[Figure: graph over variables X, Y, Z, W]

X dep Y | {Z}, {W}, {Z,W}

Y dep Z | {X}, {W}, {X,W}

X dep Z | {Y}, {Y,W}

Page 115:

Causal Discovery = Ockham’s Razor

[Figure: graph over variables X, Y, Z, W]

X dep Y | {Z}, {W}, {Z,W}

Y dep Z | {X}, {W}, {X,W}

X dep Z | {Y}, {W}, {Y,W}

Page 116:

Causal Discovery = Ockham’s Razor

[Figure: graph over variables X, Y, Z, W]

X dep Y | {Z}, {W}, {Z,W}

Y dep Z | {X}, {W}, {X,W}

X dep Z | {Y}, {W}, {Y,W}

Z dep W | {X}, {Y}, {X,Y}

Y dep W | {Z}, {X,Z}

Page 117:

Causal Discovery = Ockham’s Razor

[Figure: graph over variables X, Y, Z, W]

X dep Y | {Z}, {W}, {Z,W}

Y dep Z | {X}, {W}, {X,W}

X dep Z | {Y}, {W}, {Y,W}

Z dep W | {X}, {Y}, {X,Y}

Y dep W | {X}, {Z}, {X,Z}
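One possible reading of the build-up across these slides is that a pair of variables is linked only once dependence has been observed given every listed conditioning set, and not before. The sketch below encodes that reading; it is an assumption made for illustration, not the authors' stated procedure. Following the slides, only nonempty conditioning sets are considered, and the names `conditioning_sets` and `ockham_edges` are hypothetical.

```python
from itertools import combinations


def conditioning_sets(variables, x, y):
    """All nonempty subsets of the variables other than x and y (as frozensets)."""
    others = sorted(v for v in variables if v not in (x, y))
    return [
        frozenset(subset)
        for size in range(1, len(others) + 1)
        for subset in combinations(others, size)
    ]


def ockham_edges(variables, observed):
    """Pairs whose dependence has been observed given every nonempty conditioning set.

    observed maps frozenset({x, y}) to the set of conditioning sets (frozensets)
    given which a dependence between x and y has been seen so far.
    """
    edges = set()
    for x, y in combinations(sorted(variables), 2):
        pair = frozenset((x, y))
        needed = set(conditioning_sets(variables, x, y))
        if needed <= observed.get(pair, set()):
            edges.add(pair)
    return edges


fs = frozenset
variables = ["X", "Y", "Z", "W"]

# Dependencies listed on the slide where X dep Z is still missing {W}.
earlier = {
    fs("XY"): {fs("Z"), fs("W"), fs("ZW")},
    fs("YZ"): {fs("X"), fs("W"), fs("XW")},
    fs("XZ"): {fs("Y"), fs("YW")},
}

# The next slide adds X dep Z | {W}.
later = dict(earlier)
later[fs("XZ")] = earlier[fs("XZ")] | {fs("W")}
```

Under this reading, `ockham_edges(variables, earlier)` links only X, Y and Y, Z, while `ockham_edges(variables, later)` also links X, Z: the conclusion grows only when new dependencies force it.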

Page 118:

IV. Simplicity Defined

Page 119:

Approach

Empirical complexity reflects nested problems of induction posed by the problem.

Hence, simplicity is problem-relative but topologically invariant.

Page 120:

Empirical Problems

Set K of infinite input sequences.

Partition Q of K into alternative theories.

[Figure: K partitioned into theories T1, T2, T3]

Page 121:

Simplicity Concepts

A simplicity concept for (K, Q) is just a well-founded order < on a partition S of K with ascending chains of order type not exceeding omega such that:

1. Each element of S is included in some answer in Q;

2. Each downward union in (S, <) is closed;

3. Incomparable sets share no boundary point;

4. Each element of S is included in the boundary of its successor.

Page 122:

Empirical Complexity Defined

Let K|e denote the set of all possibilities compatible with observations e.

Let (S, <) be a simplicity concept for (K|e, Q).

Define c(w, e) = the length of the longest < path to the cell of S that contains w.

Define c(T, e) = the least c(w, e) such that T is true in w.
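The two definitions above can be illustrated on a small finite order. The sketch assumes the simplicity concept is given as a finite map from each cell of S to the cells immediately below it; the cell names and the helpers `cell_complexity` and `theory_complexity` are hypothetical, and the topological conditions on (S, <) are not checked.

```python
from functools import lru_cache


def cell_complexity(below):
    """Length of the longest <-path leading up to each cell.

    below maps every cell to the collection of cells immediately beneath it in <.
    Every cell must appear as a key; minimal cells map to an empty tuple.
    """
    @lru_cache(maxsize=None)
    def depth(cell):
        predecessors = below[cell]
        if not predecessors:
            return 0
        return 1 + max(depth(p) for p in predecessors)

    return {cell: depth(cell) for cell in below}


def theory_complexity(cells_where_true, complexity):
    """c(T, e): the least complexity of any cell in which theory T is true."""
    return min(complexity[cell] for cell in cells_where_true)


# Hypothetical example in the spirit of polynomial laws: cells ordered by degree.
below = {
    "degree 0": (),
    "degree 1": ("degree 0",),
    "degree 2": ("degree 1",),
}
complexity = cell_complexity(below)

# A theory true exactly in the degree 1 and degree 2 cells.
c_nonconstant = theory_complexity(["degree 1", "degree 2"], complexity)
```

Here the cells receive complexities 0, 1 and 2, and the theory "not constant" receives complexity 1, the least complexity among the cells where it holds.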

Page 123:

Applications

Polynomial laws: complexity = degree.

Conservation laws: complexity = particle types – conserved quantities.

Causal networks: complexity = number of logically independent conditional dependencies entailed by faithfulness.

Page 124:

General Ockham Efficiency Theorem

Let M be a solution. The following are equivalent:

M is always strongly Ockham and stalwart;

M is always efficient;

M is never beaten.

Page 125:

Conclusions

Causal truths are necessary for counterfactual predictions.

Ockham’s razor is necessary for staying on the straightest path to the true theory but does not point at the true theory.

No evasions or circles are required.

Page 126:

Future Directions

Extension of unique efficiency theorem to stochastic model selection.

Latent variables as Ockham conclusions.

Degrees of retraction.

Pooling of marginal Ockham conclusions.

Retraction efficiency assessment of MDL, SRM.

Page 127:

Suggested Reading

"Ockham’s Razor, Truth, and Information", in Handbook of the Philosophy of Information, J. van Benthem and P. Adriaans, eds., to appear.

"Ockham’s Razor, Empirical Complexity, and Truth-finding Efficiency", Theoretical Computer Science, 383: 270-289, 2007.

Both available as pre-prints at: www.hss.cmu.edu/philosophy/faculty-kelly.php

Page 128:

1. Prior Simplicity Bias

The simple theory is more plausible now because it was more plausible yesterday.

Page 129:

More Subtle Version

Simple data are a miracle in the complex theory but not in the simple theory.

P C

Regularity: retrograde motion of Venus at solar conjunction

Has to be!

Page 130:

However…

e would not be a miracle given P(θ);

Why not this?

C P

Page 131:

The Real Miracle

Ignorance about model: p(C) ≈ p(P);

+ Ignorance about parameter setting: p(P(θ) | P) ≈ p(P(θ’) | P).

= Knowledge about C vs. P(θ): p(P(θ)) << p(C).

[Figure: C alongside P, with P divided among many parameter settings θ]

Lead into gold. Perpetual motion. Free lunch.

Sounds good!
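The arithmetic behind this slide is easy to make concrete. The sketch below assumes, purely for illustration, equal prior mass on the two models and a uniform spread of P's mass over a finite grid of 1000 parameter settings; the grid size and the helper `prior_on_setting` are hypothetical.

```python
from fractions import Fraction


def prior_on_setting(p_model: Fraction, num_settings: int) -> Fraction:
    """Prior mass on one parameter setting when the model's mass is spread uniformly."""
    if num_settings <= 0:
        raise ValueError("need at least one parameter setting")
    return p_model / num_settings


# "Ignorance" about the model: equal mass on C and P (an assumption of this sketch).
p_C = Fraction(1, 2)
p_P = Fraction(1, 2)

# "Ignorance" about the parameter: P's mass spread evenly over 1000 hypothetical settings.
p_P_theta = prior_on_setting(p_P, 1000)

# How strongly the prior favours C over any single setting of P.
ratio = p_C / p_P_theta
```

Two layers of indifference yield a prior that favours C over each particular P(θ) by a factor of 1000 before any data arrive, which is the "knowledge from ignorance" the slide objects to.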

Page 132:

Standard Paradox of Indifference

Ignorance of red vs. not-red

+ Ignorance over not-red:

= Knowledge about red vs. white.

Knognorance = All the privileges of knowledge with none of the responsibilities.

Sounds good!

Page 133:

The Ellsberg Paradox

[Figure: three options with chances 1/3, ?, ?]

Page 134:

Human Preference

[Figure: three options with chances 1/3, ?, ?]

a > b

a c < c b

Page 135:

Human View

[Figure: three options with chances 1/3, ?, ?]

a > b

a c < c b

knowledge ignorance

knowledge ignorance

Page 136:

Bayesian “Rationality”

[Figure: three options with chances 1/3, ?, ?]

a > b

a c > c b

knognorance knognorance

knognorance knognorance
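The contrast between the human preferences and the Bayesian ones can be checked numerically. The sketch below assumes the standard Ellsberg setup, which the slides only hint at: a has known chance 1/3, b has unknown chance x, and c has chance 2/3 - x. The function names are hypothetical.

```python
from fractions import Fraction

P_A = Fraction(1, 3)          # known chance of a
TOTAL_BC = Fraction(2, 3)     # b and c share the remaining chance in unknown proportion


def prefers_a_over_b(x: Fraction) -> bool:
    """a > b under a single probability assignment with P(b) = x."""
    return P_A > x


def prefers_bc_over_ac(x: Fraction) -> bool:
    """(a or c) < (b or c) under the same assignment, with P(c) = 2/3 - x."""
    p_c = TOTAL_BC - x
    return x + p_c > P_A + p_c


# Scan candidate values of P(b) from 0 to 2/3 in steps of 1/300.
grid = [Fraction(k, 300) for k in range(0, 201)]
consistent = [x for x in grid if prefers_a_over_b(x) and prefers_bc_over_ac(x)]
```

The list `consistent` comes out empty: the first preference needs x < 1/3 and the second needs x > 1/3, so no single probability assignment reproduces the human pattern, whereas any fixed x makes a > b carry over to the bets that add c.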

Page 137:

In Any Event

The coherentist foundations of Bayesianism have nothing to do with short-run truth-conduciveness.

Not so loud!

Page 138:

Bayesian Convergence

Too-simple theories get shot down…

[Figure: updated opinion over theories ordered by complexity]

Page 139:

Bayesian Convergence

Plausibility is transferred to the next-simplest theory…

[Figure: updated opinion over theories ordered by complexity; "Blam!", "Plink!"]

Page 142:

Bayesian Convergence

The true theory is never shot down.

[Figure: updated opinion over theories ordered by complexity; "Blam!", "Zing!"]
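The transfer of plausibility described on these slides can be sketched with a toy update. The sketch assumes that the refuting data are equally likely under every surviving theory, so conditioning reduces to zeroing the refuted theory and renormalizing; the prior values and the helper `refute_simplest` are hypothetical.

```python
from fractions import Fraction


def refute_simplest(opinion):
    """Zero out the simplest theory that still has positive probability, then renormalize.

    opinion[i] is the current probability of the theory of complexity i.
    """
    refuted = next(i for i, p in enumerate(opinion) if p > 0)
    remaining = [Fraction(0) if i == refuted else p for i, p in enumerate(opinion)]
    total = sum(remaining)
    if total == 0:
        raise ValueError("no theory would survive the refutation")
    return [p / total for p in remaining]


# A hypothetical simplicity-biased prior over theories of complexity 0, 1, 2, 3.
prior = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]

after_one = refute_simplest(prior)       # complexity 0 shot down
after_two = refute_simplest(after_one)   # complexity 1 shot down
```

After the first refutation the theory of complexity 1 holds the most mass, which is the "Plink!" step. Nothing in the update ever removes a theory that the data do not refute, which is why the true theory is never shot down.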

Page 143:

Convergence

But alternative strategies also converge:

Any theory choice in the short run is compatible with convergence in the long run.

Page 144:

Summary of Bayesian Approach

Prior-based explanations of Ockham’s razor are circular and based on a faulty model of ignorance.

Convergence-based explanations of Ockham’s razor fail to single out Ockham’s razor.

Page 145:

2. Risk Minimization

Ockham’s razor minimizes expected distance of empirical estimates from the true value.

[Figure: Truth]

Page 146:

Unconstrained Estimates

are centered on truth but spread around it.

[Figure: unconstrained aim; "Pop! Pop! Pop! Pop!"]

Page 147:

Constrained Estimates

Off-center but less spread.

[Figure: clamped aim relative to Truth]

Page 148:

Constrained Estimates

Off-center but less spread.

Overall improvement in expected distance from truth…

[Figure: clamped aim relative to Truth; "Pop! Pop! Pop! Pop!"]
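The trade-off between being off-center and being less spread can be shown with the usual decomposition of mean squared error into squared bias plus variance. The sketch assumes a toy setting that is not in the slides: a true mean mu is estimated from n independent draws with standard deviation sigma, and the "clamped" estimator is the sample mean multiplied by a factor lam between 0 and 1. All names and numbers are hypothetical.

```python
def mse_unconstrained(sigma: float, n: int) -> float:
    """Sample mean: unbiased, so its mean squared error is just its variance."""
    return sigma ** 2 / n


def mse_clamped(mu: float, sigma: float, n: int, lam: float) -> float:
    """Sample mean scaled by lam: biased toward 0 but with smaller variance."""
    bias = (lam - 1.0) * mu
    variance = lam ** 2 * sigma ** 2 / n
    return bias ** 2 + variance


# Hypothetical numbers: small true mean, noisy data, few observations.
mu, sigma, n, lam = 0.5, 2.0, 4, 0.5

risk_unconstrained = mse_unconstrained(sigma, n)   # 1.0
risk_clamped = mse_clamped(mu, sigma, n, lam)      # 0.0625 + 0.25 = 0.3125
```

With these numbers the clamped estimator has lower expected squared distance from the truth even though it is centered on 0.25 rather than on the true value 0.5, which is the sense in which lower risk need not mean a true theory.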

Page 149:

Doesn’t Find True Theory

The theory that minimizes estimation risk can be quite false…

[Figure: clamped aim; "Four eyes!"]

Page 150:

Makes Sense…

…when loss of an answer is similar in nearby distributions.

[Figure: axes Similarity (p) and Loss; "Close is good enough!"]

Page 151:

But Not When Truth Matters

…i.e., when loss of an answer is discontinuous with similarity.

[Figure: axes Similarity (p) and Loss; "Close is no cigar!"]