Exploiting Pearl’s Theorems for Graphical Model Structure Discovery
Transcript of Exploiting Pearl’s Theorems for Graphical Model Structure Discovery
Exploiting Pearl’s Theorems for Graphical
Model Structure Discovery
Dimitris Margaritis
(joint work with Facundo Bromberg and Vasant Honavar)
Department of Computer Science
Iowa State University
The problem
General problem: Learn probabilistic graphical models from data
Specific problem: Learn the structure of probabilistic graphical models
Why graphical probabilistic models?
Tools for reasoning under uncertainty can use them to calculate the probability of any
propositional formula (probabilistic inference) given the facts (known values of some variables)
Efficient representation of the joint probability using conditional independences
Most popular graphical models: Markov networks (undirected) and Bayesian networks (directed acyclic)
Markov Networks
Define a neighborhood structure N among pairs of variables (i, j).
MNs’ assumption: each variable Xi is conditionally independent of all others given its neighbors.
Intuitively: variable X is conditionally independent (CI) of variable Y given a set of variables Z if Z “shields” any influence between X and Y.
Notation: (X ⊥ Y | Z).
This implies a decomposition of the joint probability into a product of potential functions over the cliques of the graph.
Markov Network Example
Target random variable: crop yield X
Observable random variables: soil acidity Y1, soil humidity Y2, concentration of potassium Y3, concentration of sodium Y4
Example: Markov network for crop field
The crop field is organized spatially as a regular grid
Defines a dependency structure that matches spatial structure
Markov Networks (MN)
We can represent structure graphically using Markov network G=(V, E):
V: nodes represent random variables, E: undirected edges represent structure i.e.,
(i; j ) 2 E ( ) (i; j ) 2 N
Example MN for:
V = f0,1,2,3,4,5,6,7g
N = f (1;4);(4;7);(7;0);(7;5);(6;5);(0;3);(5;3);(3;2)g
Markov network semantics
The CIs of a probability distribution P are encoded in a MN G by vertex separation:
3 ⊥̸ 7 | {0}
3 ⊥ 7 | {0, 5}
(Pearl ’88) If the CIs in the graph match exactly those of distribution P, P is said to be graph-isomorph.
Conditional dependence is denoted by ⊥̸.
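The vertex-separation semantics can be sketched as a breadth-first search over the graph. A minimal sketch in Python; the function name and the edge-list representation are mine, not from the talk:

```python
from collections import deque

def separated(edges, x, y, z):
    """Return True iff every path between x and y in the undirected
    graph passes through a node in z, i.e. the graph encodes x _|_ y | z."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, queue = {x}, deque([x])
    while queue:
        node = queue.popleft()
        if node == y:
            return False  # reached y without entering z: not separated
        for nxt in adj.get(node, ()):
            if nxt not in z and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

# The example network from the slides:
N = [(1, 4), (4, 7), (7, 0), (7, 5), (6, 5), (0, 3), (5, 3), (3, 2)]
print(separated(N, 3, 7, {0}))     # False: 3 and 7 stay connected via 5
print(separated(N, 3, 7, {0, 5}))  # True: {0, 5} blocks every path
```

The two calls reproduce the slide’s examples: conditioning on {0} alone leaves the path 3–5–7 open, while {0, 5} blocks every path from 3 to 7.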
The problem revisited
Learn the structure of Markov networks from data.
True probability distribution Pr(1, 2, …, 7): unknown.
Data sampled from the distribution: known!
Data → learning algorithm → learned network, compared against the true network.
Structure Learning of Graphical Models
Approaches to structure learning:
Score-based:
• Search for the graph with an optimal score (likelihood, MDL)
• Score computation is intractable in Markov networks
Independence-based:
• Infer the graph using information about independences that hold in the underlying model
Other isolated approaches
Independence-based approach
Assumes the existence of an independence-query oracle that answers the CIs that hold in the true probability distribution.
Proceeds iteratively:
1. Query the independence oracle for the value of a CI h in the true model
2. Discard structures that violate h
3. Repeat until a single structure is left (uniqueness under assumptions)
Example query: Is variable 7 independent of variable 3 given variables {0, 5}? The oracle says NO: 3 ⊥̸ 7 | {0, 5}. So a structure in which {0, 5} separates 3 and 7 is inconsistent, while one that keeps them connected is consistent.
But an oracle does not exist!
It can be approximated by a statistical independence test (SIT), e.g. Pearson’s χ² or Wilks’s G².
Given as input a data set D (sampled from the true distribution) and a triplet (X, Y | Z),
the SIT computes the p-value: the probability of error in assuming dependence when in fact the variables are independent,
and decides: independence if the p-value exceeds a significance threshold, dependence otherwise.
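A minimal sketch of such a test, for the unconditional case (Z = ∅): a Pearson χ² statistic over a 2×2 table of counts. To stay dependency-free it compares the statistic against the α = 0.05 critical value for 1 degree of freedom (3.841), which is equivalent to the p-value decision rule; the function name and the example table are mine:

```python
def pearson_chi2(table):
    """Pearson's chi-squared statistic for a contingency table of
    observed counts; large values are evidence against independence."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = rows[i] * cols[j] / n  # expected count under independence
            stat += (obs - exp) ** 2 / exp
    return stat

# For a 2x2 table (1 degree of freedom) the critical value at
# significance level 0.05 is 3.841: decide "dependent" iff stat > 3.841.
table = [[30, 10],
         [10, 30]]
stat = pearson_chi2(table)
print(stat > 3.841)  # True: declare X and Y dependent
```

A conditional test (X, Y | Z) would run one such table per configuration of Z and sum the statistics and degrees of freedom.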
Outline
• Introductory Remarks
• The GSMN and GSIMN algorithms
• The Argumentative Independence Test
• Conclusions
GSMN and GSIMN Algorithms
GSMN algorithm
We introduce the first two independence-based algorithms for MN structure learning: GSMN and GSIMN
GSMN (Grow-Shrink Markov Network structure inference algorithm) is a direct adaptation of the grow-shrink (GS) algorithm (Margaritis, 2000) for learning a variable’s Markov blanket using independence tests
Definition: A Markov blanket BL(X) of X ∈ V is any subset S of variables that shields X from all other variables, that is, (X ⊥ V − S − {X} | S).
GSMN (cont’d)
The Markov blanket is the set of neighbors in the structure (Pearl and Paz ’85).
Therefore, we can learn the structure by learning the Markov blankets:
1: for every X ∈ V
2:   BL(X) ← get Markov blanket of X using the GS algorithm
3:   for every Y ∈ BL(X)
4:     add edge (X, Y) to E(G)
GSMN extends the above algorithm with a heuristic ordering for the grow and shrink phases of GS.
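The per-variable grow-shrink procedure can be sketched as follows. This is a sketch, not the talk’s exact pseudocode: `dependent(X, Y, Z)` stands in for the independence oracle or SIT, and the toy chain oracle `dep` below is my own illustration:

```python
def grow_shrink(X, variables, dependent):
    """Grow-shrink Markov blanket discovery for X.
    `dependent(X, Y, Z)` stands in for the independence oracle/SIT and
    returns True iff X and Y are dependent given the set Z."""
    B = []
    # Growing phase: add any variable still dependent on X given B.
    changed = True
    while changed:
        changed = False
        for Y in variables:
            if Y != X and Y not in B and dependent(X, Y, set(B)):
                B.append(Y)
                changed = True
    # Shrinking phase: drop variables independent of X given the rest.
    for Y in list(B):
        if not dependent(X, Y, set(B) - {Y}):
            B.remove(Y)
    return set(B)

# Toy oracle for the chain 0 - 1 - 2: 0 and 2 are dependent unless 1 is given.
def dep(x, y, z):
    if {x, y} in ({0, 1}, {1, 2}):
        return True
    return {x, y} == {0, 2} and 1 not in z

print(grow_shrink(0, [0, 1, 2], dep))  # {1}
print(grow_shrink(1, [0, 1, 2], dep))  # {0, 2}
```

Running it for variable 2 shows the shrink phase at work: 0 is added during growing (2 and 0 are dependent given {}), then removed once 1 enters the blanket.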
Initially No Arcs
[Figure: example network over variables A, B, C, D, E, F, G, K, L, with no arcs yet]
Growing phase
Markov blanket of A = {}
1. B dependent of A given {}? → Markov blanket of A = {B}
2. F dependent of A given {B}?
3. G dependent of A given {B}? → Markov blanket of A = {B,G}
4. C dependent of A given {B,G}? → Markov blanket of A = {B,G,C}
5. K dependent of A given {B,G,C}? → Markov blanket of A = {B,G,C,K}
6. D dependent of A given {B,G,C,K}? → Markov blanket of A = {B,G,C,K,D}
7. E dependent of A given {B,G,C,K,D}? → Markov blanket of A = {B,G,C,K,D,E}
8. L dependent of A given {B,G,C,K,D,E}?
Shrinking phase
Markov blanket of A = {B,G,C,K,D,E}
9. G dependent of A given {B,C,K,D,E}? (i.e. the set − {G}) → Markov blanket of A = {B,C,K,D,E}
10. K dependent of A given {B,C,D,E}? → Markov blanket of A = {B,C,D,E}
Minimum Markov blanket
GSIMN
• GSIMN (Grow-Shrink Inference Markov Network) uses properties of CIs as inference rules to infer the results of novel tests, avoiding costly SITs.
• Pearl (’88) introduced properties, the undirected axioms, satisfied by the CIs of distributions isomorphic to Markov networks.
• GSIMN modifies GSMN by exploiting these axioms to infer novel tests.
Axioms as inference rules
[Transitivity] (X ⊥ W | Z) ∧ (W ⊥̸ Y | Z) ⟹ (X ⊥ Y | Z)
Example: (1 ⊥ 7 | {4}) ∧ (7 ⊥̸ 3 | {4}) ⟹ (1 ⊥ 3 | {4})
Triangle theorems
(X ⊥ W | Z1) ∧ (W ⊥̸ Y | Z1 ∪ Z2) ⟹ (X ⊥ Y | Z1)
(X ⊥̸ W | Z1) ∧ (W ⊥̸ Y | Z2) ⟹ (X ⊥̸ Y | Z1 ∩ Z2)
GSIMN actually uses the triangle theorem rules, derived only from Strong Union and Transitivity. It rearranges GSMN’s visit order to maximize the benefits, and applies these rules only once (as opposed to computing the closure).
Despite these simplifications, GSIMN infers >95% of inferable tests (shown experimentally).
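A one-pass application of these two rules over a cache of known test results might look like this. The dictionary keying is my own representation; the real GSIMN interleaves inference with its visit order and also treats (X, Y | Z) and (Y, X | Z) symmetrically, which this sketch omits:

```python
def triangle_infer(known):
    """One pass of the two triangle-theorem rules over known results.
    `known` maps (X, Y, frozenset(Z)) to True (independent) or False."""
    inferred = {}
    items = list(known.items())
    for (x, w, z1), ind_xw in items:
        for (w2, y, z2), ind_wy in items:
            if w2 != w or y in (x, w):
                continue
            if ind_xw and not ind_wy and z1 <= z2:
                # (X ind W | Z1) ^ (W dep Y | Z1 u Z2)  =>  (X ind Y | Z1)
                inferred[(x, y, z1)] = True
            if not ind_xw and not ind_wy:
                # (X dep W | Z1) ^ (W dep Y | Z2)  =>  (X dep Y | Z1 n Z2)
                inferred[(x, y, z1 & z2)] = False
    return inferred

# The transitivity example from the slides:
known = {
    (1, 7, frozenset({4})): True,   # 1 ind 7 | {4}
    (7, 3, frozenset({4})): False,  # 7 dep 3 | {4}
}
print(triangle_infer(known))  # {(1, 3, frozenset({4})): True}
```

From the two cached results the first rule infers (1 ⊥ 3 | {4}) without running a SIT, matching the example on the previous slide.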
Experiments
Our goal: demonstrate that GSIMN requires fewer tests than GSMN, without significantly affecting accuracy
Results for exact learning
• We assume an independence query oracle, so tests are 100% accurate ⟹ output network = true network (proof omitted)
Sampled data: weighted number of tests
Sampled data: Accuracy
Real-world data
More challenging because:
• Non-random topologies (e.g. regular lattices, small worlds, chains, etc.)
• The underlying distribution may not be graph-isomorph
Outline
• Introductory Remarks
• The GSMN and GSIMN algorithms
• The Argumentative Independence Test
• Conclusions
The Argumentative Independence Test (AIT)
The Problem
Statistical independence tests (SITs) are unreliable for small data sets,
and produce erroneous networks when used by independence-based algorithms.
This problem is one of the most important criticisms of the independence-based approach.
Our contribution: a new general-purpose independence test, the argumentative independence test (AIT), that improves reliability for small data sets.
Main Idea
The new independence test (AIT) improves accuracy by “correcting” the outcomes of a statistical independence test (SIT):
• Incorrect SITs may produce CIs inconsistent with Pearl’s properties of conditional independence
• Thus, resolving inconsistencies among SITs may correct the errors
Propositional knowledge base (KB):
• propositions are CIs, i.e., (X ⊥ Y | Z) or (X ⊥̸ Y | Z) for a triplet (X, Y | Z)
• inference rules are Pearl’s conditional independence axioms
Pearl’s axioms
• We presented above the undirected axioms
• Pearl (1988) also introduced general axioms, which hold for any distribution,
• and directed axioms, for distributions isomorphic to directed graphs
Example
• Consider the following KB of CIs, constructed using a SIT:
A. (0 ⊥ 1 | {2,3})
B. (0 ⊥ 4 | {2,3})
C. (0 ⊥̸ {1,4} | {2,3})
• Assume C is wrong (the SIT’s mistake).
• Assuming the Composition axiom holds, then
D. (0 ⊥ 1 | {2,3}) ∧ (0 ⊥ 4 | {2,3}) ⟹ (0 ⊥ {1,4} | {2,3})
• Inconsistency: D and C contradict each other
Example (cont’d)
A. (0 ⊥ 1 | {2,3})
B. (0 ⊥ 4 | {2,3})
C. (0 ⊥̸ {1,4} | {2,3})
D. (0 ⊥ 1 | {2,3}) ∧ (0 ⊥ 4 | {2,3}) ⟹ (0 ⊥ {1,4} | {2,3})
The KB above is inconsistent and incorrect. There are at least two ways to resolve the inconsistency: rejecting D (a consistent but incorrect KB) or rejecting C (a consistent and correct KB).
If we can resolve the inconsistency in favor of D, the error can be corrected.
The argumentation framework presented next provides a principled approach for resolving inconsistencies.
Preference-based Argumentation Framework
An instance of defeasible (non-monotonic) logics.
Main contributors: Dung ’95 (basic framework), Amgoud and Cayrol ’02 (added preferences).
The framework PAF = ⟨A, R, π⟩ consists of three elements:
A: set of arguments
R: attack relation among arguments
π: preference order over arguments
Arguments
An argument (H, h) is an “if-then” rule (if H then h):
• Support H: a set of consistent propositions
• Head h: a proposition
In independence KBs, the if-then rules are instances (propositionalizations) of Pearl’s universally quantified rules, e.g. instances of Weak Union.
Propositional arguments: arguments ({h}, h) for an individual CI proposition h.
Example
The set of arguments corresponding to the KB of the previous example is:
Name  (H, h)
A.    ({(0 ⊥ 1 | {2,3})}, (0 ⊥ 1 | {2,3}))
B.    ({(0 ⊥ 4 | {2,3})}, (0 ⊥ 4 | {2,3}))
C.    ({(0 ⊥̸ {1,4} | {2,3})}, (0 ⊥̸ {1,4} | {2,3}))
D.    ({(0 ⊥ 1 | {2,3}), (0 ⊥ 4 | {2,3})}, (0 ⊥ {1,4} | {2,3}))
(Of these, only C is incorrect.)
Preferences
Preference over arguments is obtained from preferences over CI propositions.
We say argument (H, h) is preferred over argument (H’, h’) iff it is more likely for all propositions in H to be correct than for all those in H’: π(H) = Π_{h ∈ H} π(h) > π(H’).
The probability π(h) that h is correct is obtained from the p-value of h, computed using a statistical test (SIT) on the data.
Example
Let’s extend the arguments with preferences:
Name  (H, h)                                                      π(H)
A.    ({(0 ⊥ 1 | {2,3})}, (0 ⊥ 1 | {2,3}))                        0.8
B.    ({(0 ⊥ 4 | {2,3})}, (0 ⊥ 4 | {2,3}))                        0.7
C.    ({(0 ⊥̸ {1,4} | {2,3})}, (0 ⊥̸ {1,4} | {2,3}))               0.5
D.    ({(0 ⊥ 1 | {2,3}), (0 ⊥ 4 | {2,3})}, (0 ⊥ {1,4} | {2,3}))   0.8 × 0.7 = 0.56
Attack relation
The attack relation R formalizes and extends the notion of logical contradiction.
Since an argument (H1, h1) models an if-H-then-h rule, it can be logically contradicted by (H2, h2) in two ways:
• (H1, h1) rebuts (H2, h2) iff h1 contradicts h2
• (H1, h1) undercuts (H2, h2) iff there exists h ∈ H2 such that h contradicts h1
Definition: Argument b attacks argument a iff b logically contradicts a and a is not preferred over b.
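The preference and attack definitions can be sketched as follows. This assumes the preference of a support is the product of its propositions’ probabilities (as in the 0.8 × 0.7 = 0.56 example); the proposition labels and function names below are mine:

```python
from math import prod

def pref(pi, H):
    """pi(H): probability that all propositions in the support H are
    correct, taken as the product of the individual pi(h)."""
    return prod(pi[h] for h in H)

def attacks(b, a, pi, contradicts):
    """b attacks a iff b logically contradicts a (rebut or undercut)
    and a is not preferred over b."""
    return contradicts(b, a) and not (pref(pi, a[0]) > pref(pi, b[0]))

# The running example: C = propositional argument for the (wrong) SIT
# outcome, D = the Composition-based argument; their heads rebut each other.
pi = {"0_ind_1": 0.8, "0_ind_4": 0.7, "0_dep_14": 0.5}
C = (("0_dep_14",), "0_dep_14")
D = (("0_ind_1", "0_ind_4"), "0_ind_14")
neg = {"0_ind_14": "0_dep_14", "0_dep_14": "0_ind_14"}
rebuts = lambda b, a: neg.get(b[1]) == a[1]

print(attacks(D, C, pi, rebuts))  # True: pi 0.56 vs 0.5, so D attacks C
print(attacks(C, D, pi, rebuts))  # False: D is preferred over C
```

Because π({A, B}) = 0.56 exceeds π({C}) = 0.5, the attack is one-directional: D attacks C, but not vice versa.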
Example
C and D rebut each other, and C is not preferred over D (π(H) = 0.5 vs. 0.56), so D attacks C.
(Arguments A–D and their preferences are as in the previous table.)
Inference = Acceptability
Inference is modeled in argumentation frameworks by acceptability.
An argument r is “inferred” iff it is accepted, “not inferred” iff it is rejected, or in abeyance if neither.
Dung and Amgoud’s idea: accept argument r if r is not attacked, or if r is attacked but its attackers are also attacked.
Example
We had that D attacks C (and there is no other attack). Since nothing attacks D, D is accepted. C is attacked by an accepted argument, so C is rejected.
Argumentation resolved the inconsistency in favor of the correct proposition D!
In practice, we have thousands of arguments. How do we compute the acceptability status of all of them?
(Arguments A–D and their preferences are as in the previous table.)
Computing Acceptability Bottom-up
accept if not attacked, or if all attackers attacked.
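The bottom-up rule can be sketched as a fixed-point computation in the style of grounded semantics (a sketch; the function and status names are mine):

```python
def acceptability(args, attack_pairs):
    """Bottom-up acceptability: repeatedly accept arguments whose
    attackers are all rejected (in particular, unattacked arguments)
    and reject arguments with an accepted attacker, to a fixed point.
    Arguments left as None are in abeyance."""
    attackers = {a: {b for (b, c) in attack_pairs if c == a} for a in args}
    status = {a: None for a in args}
    changed = True
    while changed:
        changed = False
        for a in args:
            if status[a] is not None:
                continue
            if all(status[b] == "rejected" for b in attackers[a]):
                status[a] = "accepted"
                changed = True
            elif any(status[b] == "accepted" for b in attackers[a]):
                status[a] = "rejected"
                changed = True
    return status

# Running example: the only attack is D -> C.
print(acceptability(["A", "B", "C", "D"], [("D", "C")]))
```

On the running example this accepts A, B, and D and rejects C; mutually attacking arguments of equal strength would remain in abeyance (None).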
Top-down algorithm
The bottom-up algorithm is highly inefficient: it computes the acceptability of all possible arguments.
Top-down is an alternative: given an argument r, it responds whether r is accepted or rejected.
Accept if all attackers are rejected; reject if at least one attacker is accepted.
We illustrate this with an example.
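A recursive sketch of the top-down rule with a depth bound. This uses plain depth-limited recursion rather than the talk’s iterative deepening, and treating arguments at the cutoff as accepted is my assumption, not the paper’s:

```python
def accepted(r, attackers, depth=3):
    """Top-down acceptability: r is accepted iff every attacker of r is
    rejected, i.e. no attacker is itself accepted. `depth` bounds the
    recursion as in the approximate algorithm; the cutoff behavior
    (treat as accepted) is an assumption of this sketch."""
    if depth == 0:
        return True
    return not any(accepted(b, attackers, depth - 1)
                   for b in attackers.get(r, ()))

attackers = {"C": ["D"]}          # D attacks C; nothing attacks D
print(accepted("D", attackers))   # True
print(accepted("C", attackers))   # False: its only attacker is accepted
```

Only the attackers reachable from the queried argument are ever evaluated, which is exactly why the example below never touches arguments 8, 9, and 10.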
Computing Acceptability Top-down
accept if all attackers rejected, reject if at least one accepted.
[Figure: attack graph over arguments 1–13; argument 7 is the target node]
[Animation: the tree of attackers is expanded from target 7 to its attackers {3, 6, 11}, then to their attackers {4, 5, 12}, then {2, 1, 13}; leaf arguments are accepted, and statuses propagate back up to the target.]
We didn’t evaluate arguments 8, 9 and 10!
Approximate top-down algorithm
It is a tree traversal; we chose iterative deepening.
Time complexity: O(b^d) (e.g. b = 3, d = 3)
Difficulties:
1. Exponential in the depth d.
2. By the nature of Pearl’s rules, the number of attackers of some nodes (the branching factor b) may be exponential.
Approximation: To solve (1), we limit d to 3. To solve (2), we consider an alternative propositionalization of Pearl’s rules that bounds b to polynomial size (details omitted here).
Experiments
We considered 3 variations of the AIT, one per set of Pearl axioms: general, directed, and undirected.
Experiments on data sampled from Markov networks and from Bayesian networks (directed graphical models).
Approximate top-down algorithm: accuracy on data
[Figure panels: axioms = general, true model = BN; axioms = directed, true model = BN; axioms = general, true model = MN; axioms = undirected, true model = MN]
Top-down runtime: approximate vs. exact
[Figures: results for the PC algorithm and the GSMN algorithm]
We show results only for the specific axioms.
Top-down accuracy: approximate vs. exact
Experiments show the accuracies of both match in all but a few cases (only the specific axioms shown).
Conclusions
Summary
I presented two uses of Pearl’s independence axioms/theorems:
1. The GSIMN algorithm
• Uses the axioms to infer independence test results from known ones when learning the domain Markov network
⟹ faster execution
2. The AIT general-purpose independence test
• Uses multiple tests on the data and the axioms as integrity constraints to return the most reliable value
⟹ more reliable tests on small data sets
Further Research
Explore other methods of resolving inconsistencies in KB of known independences
Use such constraints to improve Bayesian network and Markov network structure learning from small data sets (instead of just improving individual tests)
Develop faster methods of inferring independences using Pearl’s axioms—Prolog tricks?
Thank you!
Questions?