Exact Inference - University of...
Transcript of Exact Inference - University of...
Exact Inference CSci 5512: Artificial Intelligence II
Instructor: Paul Schrater
Overview: Inference Tasks
Simple Queries: Compute posterior marginals P(b|j,¬m)
Overview: Inference Tasks
Simple Queries: Compute posterior marginals P(b|j,¬m) Conjunctive Queries: Compute joint marginals P(b,¬e|j,¬m)
Overview: Inference Tasks
Simple Queries: Compute posterior marginals P(b|j,¬m) Conjunctive Queries: Compute joint marginals P(b,¬e|j,¬m) Optimal Decisions: Compute P(outcome|action,evidence)
Overview: Inference Tasks
Simple Queries: Compute posterior marginals P(b|j,¬m) Conjunctive Queries: Compute joint marginals P(b,¬e|j,¬m) Optimal Decisions: Compute P(outcome|action,evidence) Value of Information: Which evidence to seek next?
Overview: Inference Tasks
Simple Queries: Compute posterior marginals P(b|j,¬m) Conjunctive Queries: Compute joint marginals P(b,¬e|j,¬m) Optimal Decisions: Compute P(outcome|action,evidence) Value of Information: Which evidence to seek next? Explanation: Why do I need a new starter motor?
Inference
.001 P(B)
.002 P(E)
Earthquake
MaryCalls JohnCalls
B T T F F
E T F T F
.95
.94
.29
.001
Burglary
P(A|B,E)
T F
.90
.05
Alarm
A P(J|A) T F
.70
.01
A P(M|A)
How can we compute P(b|j,m)?
Inference by Enumeration
Simple query can be answered using Bayes rule
Instructor: Arindam Banerjee Exact Inference
Inference by Enumeration
Simple query can be answered using Bayes rule From Bayes Rule
P(b|j,m) = P(b,j,m) P(j,m)
Each marginal can be obtained from the joint distribution
E A P(b,j,m)
P(j,m)
=
= B E A
P(b,E,A,j,m)
P(B,E,A,j,m)
Each term can be written as product of conditionals
P(b,E,A,j,m) = P(b)P(E)P(A|b,E)P(j|A)P(m|A)
The complexity of the simple approach is O(n2n)
Inference by Enumeration (Contd)
Complexity can be improved by a simple observation
P(b|j,m) = 1 P(j,m) E A
P(b,E,A,j,m)
= 1 P(j,m) E A
P(b)P(E)P(A|E,b)P(j|A)P(m|A)
= 1 P(j,m)
P(b) E
P(E) A
P(A|E,b)P(j|A)P(m|A)
Complexity is O(2n) Some computations are repeated
Burglary Network
.001
P(B)
.002
P(E) Earthquake
MaryCalls JohnCalls
B T T F F
E T F T F
.95
.94
.29
.001
Burglary
P(A|B,E)
T F
.90
.05
Alarm
A P(J|A) T F
.70
.01
A P(M|A)
P(j|a) .90
P(m|a) .70
P(j| a) P(j|a) .05
P(m| a) .01
P(j| a)
Enumeration Tree
P(e) .002
P( e) .998
P(a|b,e) .95
P( a|b, e) .06
P( a|b,e) .05
P(a|b, e) .94
.05
P(m| a) .01
Instructor: Arindam Banerjee
P(j|a) .90
P(m|a) .70
.05
P(m| a) .01
P(j| a) P(j|a) .90
P(m|a) .70
.05
P(m| a) .01
P(j| a)
Enumeration Tree
P(b) .001
P(e) .002
P( e) .998
P(a|b,e) .95
P( a|b, e) .06
P( a|b,e) .05
P(a|b, e) .94
Enumeration Tree for P(b|j,m) P(j|a)P(m|a) and P(j|¬a)P(m|¬a) are computed twice
Variable Elimination
Queries can be written as sum-of-products Main idea Sum over products to eliminate variables Storage required to prevent repeat computations Burglary network
P(b|j,m)
= 1 P(j,m)
P(b) B E
P(E) E A
P(A|b,E)P(j|A)P(m|A) A J M
Have a factor for every variable
Factors for Variable Elimination
Factor for M: fm(A) = [P(m|a) P(m|¬a)] Factor for J: fj(A) = [P(j|a) P(j|¬a)] Similarly, factors fA(B,E,A),fE(E),fB(B) Summing out to eliminate variables
¯
¯ ¯
fAjm(B,E) =
fEAjm(B) = A
E ¯
fA(B,E,A)fj(A)fm(A)
fE(E)fAjm(B,E)
P(B|j,m) = 1 P(j,m)
¯ ¯ fB(B)fEAjm(B)
Variable Elimination: Basic Operations
Summing out a variable from a product of factors Pointwise products of (a pair of) factors Sum out a variable from a product of factors Pointwise product of factors g1(a,b) × g2(b,c) = g(a,b,c) In general g1(x1,...,xj,y1,...,yk) × g2(y1,...,yk,z1,...,zl) = g(x1,...,xj,y1,...,yk,z1,...,zl)
Assuming f1,...,fi do not depend on X
x f1 × ··· × fk
X
fi+1 × ··· × fk = f1 × ··· × fi ×
= f1 × ··· × fi × fX¯
Example: Pointwise Product of Factors
A T T F F
B T F T F
f1(A,B) 0.3 0.7 0.9 0.1
B T T F F
C T F T F
f2(B,C) 0.2 0.8 0.6 0.4
A T T T T
B T T F F
C T F T F
f3(A,B,C) 0.3 × 0.2 0.3 × 0.8 0.7 × 0.6 0.7 × 0.4
T T F F
T F T F
0.9 × 0.2 0.9 × 0.8 0.1 × 0.6 0.1 × 0.4
Algorithm�
Irrelevant variables�
Irrelevant variables�
Complexity of Exact Inference
Singly connected networks (or polytrees) Any two nodes are connected by at most one path Time and space cost of variable elimination are O(dkn)
Multiply connected networks Can reduc 3SAT to exact inference =⇒ NP-hard Equivalent to counting 3SAT models =⇒ #P-complete
Inference Problems: Big Picture
Broadly the following types of problems Compute Compute Compute Compute
likelihood of observations marginals P(xA) on subset A of nodes conditionals P(xA|xB) on disjoint subsets A,B mode argmaxx P(x)
• First 3 problems are similar Involves marginalization over sum-of-products • Last problem is fundamentally different Entails maximization rather than marginalization • There is an important connection between the problems