Can causal models be evaluated? Isabelle Guyon ClopiNet / ChaLearn
Transcript of "Can causal models be evaluated?" by Isabelle Guyon, ClopiNet / ChaLearn
Can causal models be evaluated?
Isabelle Guyon
ClopiNet / ChaLearn
http://clopinet.com/causality [email protected]
1) Feature Extraction, Foundations and Applications. I. Guyon, S. Gunn, et al. Springer, 2006. http://clopinet.com/fextract-book
2) Causation and Prediction Challenge. I. Guyon, C. Aliferis, G. Cooper, A. Elisseeff, J.-P. Pellet, P. Spirtes, and A. Statnikov, Eds. CiML, volume 2, Microtome, 2010. http://www.mtome.com/Publications/CiML/ciml.html
Acknowledgements and references
http://gesture.chalearn.org
Co-founders:
Constantin Aliferis Alexander Statnikov
André Elisseeff Jean-Philippe Pellet
Gregory F. Cooper Peter Spirtes
ChaLearn directors and advisors:
Alexander Statnikov Ioannis Tsamardinos
Richard Scheines Frederick Eberhardt
Florin Popescu
Preparation of ExpDeCo: Experimental design in causal discovery
• Motivations
• Quiz
• What we want to do (next challenge)
• What we already set up (virtual lab)
• What we could improve
• Your input…
Note: Experiment = manipulation = action
Causal discovery motivations (1)
Interesting problems:
What affects… your health? …climate change? …the economy?
And which actions will have beneficial effects?
Predict the consequences of (new) actions
• Predict the outcome of actions
– What if we ate only raw foods?
– What if we required all cars to be painted white?
– What if we broke up the Euro?
• Find the best action to get a desired outcome
– Determine treatment (medicine)
– Determine policies (economics)
• Predict counterfactuals
– A man not wearing his seatbelt died in a car accident. Would he have died had he worn it?
Causal discovery motivations (2): Lots of data available
http://data.gov
http://data.uk.gov
http://www.who.int/research/en/
http://www.ncdc.noaa.gov/oa/ncdc.html
http://neurodatabase.org/
http://www.ncbi.nlm.nih.gov/Entrez/
http://www.internationaleconomics.net/data.html
http://www-personal.umich.edu/~mejn/netdata/
http://www.eea.europa.eu/data-and-maps/
Causal discovery motivations (3): Classical ML helpless
Predict the consequences of actions: under "manipulations" by an external agent, only causes are predictive; consequences and confounders are not.
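This point can be made concrete with a toy simulation (my illustration, not from the slides; the chain cause → Y → effect, the coefficients, and the variable names are all assumed): when an external agent sets the downstream variable at random, the arrow into it is cut and only the cause remains correlated with Y.

```python
# Toy illustration (not from the slides): under manipulation, only causes
# stay predictive. The causal chain and coefficients here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Natural distribution: cause -> Y -> effect
cause = rng.normal(size=n)
y = cause + 0.1 * rng.normal(size=n)
effect = y + 0.1 * rng.normal(size=n)

def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])

# In observational data both are strongly correlated with Y:
print(corr(cause, y), corr(effect, y))

# Manipulated distribution: an external agent sets `effect` at random,
# cutting the arrow Y -> effect; the cause keeps its predictive power.
effect_manip = rng.normal(size=n)
print(corr(cause, y), corr(effect_manip, y))
```

A purely correlational learner would rank `effect` as highly as `cause` on the natural data, yet it is useless for predicting the outcome of actions.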
If manipulated, a cause influences the outcome…
… a consequence does not …
… and neither does a confounder (a consequence of a common cause).
• Special case: stationary or cross-sectional data (no time series).
• Superficially, the problem resembles a classical feature selection problem.
[Figure: data matrix X, m examples × n features, with n' features selected]
Quiz
What could be the causal graph?
Could it be this? [Graph over Y, X1, X2]
Let's try
[Scatter plot of x1 vs. x2]
Simpson's paradox: X1 || X2 | Y
Could it be this? [Graph over Y, X1, X2]
Let's try
[Scatter plot of x1 vs. x2]
Plausible explanation
[Graph over peak (X1), baseline (X2), health (Y)]
[Scatter plot of peak (x1) vs. baseline (x2), points labeled normal / disease]
X2 || Y
X2 || Y | X1
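A minimal numeric illustration of Simpson's paradox (the counts below are the classic hypothetical example, not the data from the slide): an association that holds within every subgroup reverses when the groups are pooled, which is exactly why purely observational associations can mislead.

```python
# Simpson's paradox with hypothetical counts: the treatment wins within
# every subgroup but loses on the pooled data.
groups = {
    # group: (treated_success, treated_total, control_success, control_total)
    "mild":   (81, 87, 234, 270),
    "severe": (192, 263, 55, 80),
}

def rate(success, total):
    return success / total

# Within each group the treatment has the higher success rate...
for name, (ts, tt, cs, ct) in groups.items():
    assert rate(ts, tt) > rate(cs, ct), name

# ...but pooled over groups the control looks better.
ts = sum(v[0] for v in groups.values())
tt = sum(v[1] for v in groups.values())
cs = sum(v[2] for v in groups.values())
ct = sum(v[3] for v in groups.values())
print(rate(ts, tt), rate(cs, ct))
```

The reversal happens because group membership (a confounder) is correlated with both the treatment choice and the outcome.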
What we would like
[Graph and scatter plot: manipulate X1, observe Y]
[Graph and scatter plot: manipulate X2, observe Y]
What we want to do
Causal data mining
How are we going to do it?
Obstacle 1: Practical
Many statements of the "causality problem"
Obstacle 2: Fundamental
It is very hard to assess solutions
Evaluation
• Experiments are often:
– Costly
– Unethical
– Infeasible
• Non-experimental “observational” data is abundant and costs less.
New challenge: ExpDeCo
Experimental design in causal discovery
- Goal: Find variables that strongly influence an outcome
- Method:
- Learn from a “natural” distribution (observational data)
- Predict the consequences of given actions (checked against a test set of “real” experimental data)
- Iteratively refine the model with experiments (using on-line learning from experimental data)
What we have already done
[Virtual lab: participants send QUERIES to, and receive ANSWERS from, a database of models of systems, e.g. the LUCAS lung-cancer network (Smoking, Genetics, Lung Cancer, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, Fatigue)]
http://clopinet.com/causality
February 2007: Project starts. Pascal2 funding.
August 2007: Two-year NSF grant.
Dec. 2007: Workbench alive. 1st causality challenge.
Sept. 2008: 2nd causality challenge (Pot luck).
Fall 2009: Virtual lab alive.
Dec. 2009: Active Learning Challenge (Pascal2).
Dec. 2010: Unsupervised and Transfer Learning Challenge (DARPA).
Fall 2012: ExpDeCo (Pascal2).
Planned: CoMSiCo.
What remains to be done
ExpDeCo (new challenge)
Setup:
• Several paired datasets (preferably real data):
– “Natural” distribution – “Manipulated” distribution
• Problems
– Learn a causal model from the natural distribution
– Assessment 1: test with natural distribution
– Assessment 2: test with manipulated distribution
– Assessment 3: on-line learning from manipulated distribution (sequential design of experiments)
Challenge design constraints
- Largely not relying on "ground truth", which is difficult or impossible to get (in real data)
- Not biased towards particular methods
- Realistic setting as close as possible to actual use
- Statistically significant, not involving "chance"
- Reproducible on other similar data
- Not specific to very particular settings
- No cheating possible
- Capitalize on classical experimental design
Lessons learned from the Causation & Prediction Challenge
Causation and Prediction challenge
Toy datasets
Challenge datasets
Assessment w. manipulations (artificial data)
[LUCAS0: the natural (unmanipulated) causal graph over Smoking, Genetics, Lung Cancer, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, Fatigue]
Causality assessment with manipulations
[LUCAS1: manipulated graph over the same variables]
[LUCAS2: manipulated graph over the same variables]
• Participants score feature relevance: S = ordered list of features
• We assess causal relevance with AUC = f(V, S)
Assessment w. ground truth
• We define: V = variables of interest (theoretical minimal set of predictive variables, e.g. MB, direct causes, ...)
[Figure: causal graph with numbered nodes 0 to 11; example ordered list S = 4 11 2 3 1]
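A sketch of how an AUC = f(V, S) score could be computed (an assumed scoring scheme for illustration, not the challenge's actual code): each feature's rank in the submitted list S serves as its score, unlisted features tie at the bottom, and we measure how well that score separates the variables of interest V from the rest.

```python
# Hedged sketch of rank-based AUC scoring for an ordered feature list.
# `causal_auc` and its conventions (0-indexed features, ties at score 0)
# are my assumptions, not the challenge's official implementation.
def causal_auc(S, V, n_features):
    # Higher score = ranked earlier; features absent from S share score 0.
    score = {f: n_features - i for i, f in enumerate(S)}
    pos = [score.get(f, 0) for f in range(n_features) if f in V]
    neg = [score.get(f, 0) for f in range(n_features) if f not in V]
    # AUC = P(pos > neg) + 0.5 * P(tie), computed by brute force.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Ordered list from the slide; V = {4, 11, 2} is a hypothetical truth set.
print(causal_auc([4, 11, 2, 3, 1], {4, 11, 2}, 12))
```

Ranking every variable of interest ahead of all others yields AUC = 1.0; ranking them last yields 0.0.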
Assessment without manip. (real data)
Using artificial “probes”
[LUCAP0: natural — the LUCAS network plus artificial probe variables P1, P2, P3, …, PT]
[LUCAP1 & LUCAP2: manipulated — the probes are manipulated]
Scoring using “probes”
• What we can compute (Fscore):
– Negative class = probes (here, all “non-causes”, all manipulated).
– Positive class = other variables (may include causes and non causes).
• What we want (Rscore):
– Positive class = causes.
– Negative class = non-causes.
• What we get (asymptotically):
Fscore = (NTruePos/NReal) Rscore + 0.5 (NTrueNeg/NReal)
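The asymptotic relation above can be sanity-checked with a tiny helper (the function name is mine, and I read NReal as NTruePos + NTrueNeg, i.e. the total number of real, non-probe variables):

```python
# Sketch of the asymptotic Fscore/Rscore relation. Among the non-probe
# ("real") variables, n_true_pos are true causes and n_true_neg are
# non-causes; probes behave like non-causes, so the probe-based Fscore
# mixes the ideal Rscore with a chance-level 0.5 term.
def fscore(rscore, n_true_pos, n_true_neg):
    n_real = n_true_pos + n_true_neg  # assumption: NReal = NTruePos + NTrueNeg
    return (n_true_pos / n_real) * rscore + 0.5 * (n_true_neg / n_real)

# If every real variable is a cause, Fscore equals Rscore:
print(fscore(0.9, 10, 0))
# If no real variable is a cause, Fscore is 0.5 regardless of the ranking:
print(fscore(0.9, 0, 10))
```

The second case shows the weakness of probe-based scoring: when causes are rare, the Fscore is dominated by the uninformative 0.5 term.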
Pairwise comparisons
Gavin Cawley
Yin-Wen Chang
Mehreen Saeed
Alexander Borisov
E. Mwebaze & J. Quinn
H. Jair Escalante
J.G. Castellano
Chen Chu An
Louis Duclos-Gosselin
Cristian Grozea
H.A. Jen
J. Yin & Z. Geng Gr.
Jinzhu Jia
Jianming Jin
L.E.B & Y.T.
M.B.
Vladimir Nikulin
Alexey Polovinkin
Marius Popescu
Ching-Wei Wang
Wu Zhili
Florin Popescu
CaMML Team
Nistor Grozavu
Causal vs. non-causal
Jianxin Yin: causal. Vladimir Nikulin: non-causal.
Insensitivity to irrelevant features
Simple univariate predictive model: binary target and features, all relevant features correlate perfectly with the target, all irrelevant features randomly drawn. With 98% confidence, |feat_weight| < w and Σ_i w_i x_i < v.
ng = number of "good" (relevant) features
nb = number of "bad" (irrelevant) features
m = number of training examples
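The slide's claim can be checked by simulation (a sketch under assumed conventions: ±1 binary target and features, and a univariate weight w_i = mean(x_i · y), which is one common choice, not necessarily the slide's exact model):

```python
# Simulation sketch: with m examples, a perfectly correlated feature gets
# univariate weight 1, while the weights of nb random irrelevant features
# stay bounded by roughly a few / sqrt(m). Setup is assumed, see lead-in.
import numpy as np

rng = np.random.default_rng(0)
m, nb = 1000, 500                         # training examples, irrelevant features

y = rng.choice([-1, 1], size=m)           # binary +-1 target
x_good = y.copy()                         # relevant: correlates perfectly
x_bad = rng.choice([-1, 1], size=(m, nb)) # irrelevant: random

w_good = float(np.mean(x_good * y))       # exactly 1.0
w_bad = np.abs(x_bad.T @ y) / m           # |w_i| for each irrelevant feature

print(w_good, float(w_bad.max()))
```

Even the largest irrelevant weight stays small here, but with many irrelevant features (large nb) and few examples (small m) some of them can look as relevant as true causes, which is the sensitivity problem the slide points to.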
How to overcome this problem?
• Learning curve in terms of number of features revealed
– Without re-training on manipulated data
– With on-line learning with manipulated data
• Give pre-manipulation variable values and the value of the manipulation
• Other metrics: stability, residuals, instrumental variables, missing features by design
Conclusion (more: http://clopinet.com/causality)
• We want causal discovery to become "mainstream" data mining
• We believe we need to start with "simple" standard procedures of evaluation
• Our design is close enough to a typical prediction problem, but:
– Training on natural distribution
– Test on manipulated distribution
• We want to avoid pitfalls of previous challenge designs:
– Reveal only pre-manipulated variable values
– Reveal variables progressively "on demand"