Exact Inference - University of...

Exact Inference CSci 5512: Artificial Intelligence II

Instructor: Paul Schrater

Overview: Inference Tasks

Simple Queries: Compute posterior marginals P(b|j,¬m)

Simple Queries: Compute posterior marginals P(b|j,¬m) Conjunctive Queries: Compute joint marginals P(b,¬e|j,¬m)

Simple Queries: Compute posterior marginals P(b|j,¬m) Conjunctive Queries: Compute joint marginals P(b,¬e|j,¬m) Optimal Decisions: Compute P(outcome|action,evidence)

Simple Queries: Compute posterior marginals P(b|j,¬m) Conjunctive Queries: Compute joint marginals P(b,¬e|j,¬m) Optimal Decisions: Compute P(outcome|action,evidence) Value of Information: Which evidence to seek next?

Simple Queries: Compute posterior marginals P(b|j,¬m) Conjunctive Queries: Compute joint marginals P(b,¬e|j,¬m) Optimal Decisions: Compute P(outcome|action,evidence) Value of Information: Which evidence to seek next? Explanation: Why do I need a new starter motor?

Inference

.001 P(B)

.002 P(E)

Earthquake

MaryCalls JohnCalls

B T T F F

E T F T F

Burglary

P(A|B,E)

A P(J|A) T F

A P(M|A)

How can we compute P(b|j,m)?

Inference by Enumeration

Simple query can be answered using Bayes rule

Instructor: Arindam Banerjee Exact Inference

Inference by Enumeration

Simple query can be answered using Bayes rule From Bayes Rule

P(b|j,m) = P(b,j,m) P(j,m)

Each marginal can be obtained from the joint distribution

E A P(b,j,m)

P(j,m)

= B E A

P(b,E,A,j,m)

P(B,E,A,j,m)

Each term can be written as product of conditionals

P(b,E,A,j,m) = P(b)P(E)P(A|b,E)P(j|A)P(m|A)

The complexity of the simple approach is O(n2n)

Inference by Enumeration (Contd)

Complexity can be improved by a simple observation

P(b|j,m) = 1 P(j,m) E A

P(b,E,A,j,m)

= 1 P(j,m) E A

P(b)P(E)P(A|E,b)P(j|A)P(m|A)

= 1 P(j,m)

P(b) E

P(E) A

P(A|E,b)P(j|A)P(m|A)

Complexity is O(2n) Some computations are repeated

Burglary Network

P(E) Earthquake

MaryCalls JohnCalls

B T T F F

E T F T F

Burglary

P(A|B,E)

A P(J|A) T F

A P(M|A)

P(j|a) .90

P(m|a) .70

P(j| a) P(j|a) .05

P(m| a) .01

P(j| a)

Enumeration Tree

P(e) .002

P( e) .998

P(a|b,e) .95

P( a|b, e) .06

P( a|b,e) .05

P(a|b, e) .94

P(m| a) .01

Instructor: Arindam Banerjee

P(j|a) .90

P(m|a) .70

P(m| a) .01

P(j| a) P(j|a) .90

P(m|a) .70

P(m| a) .01

P(j| a)

Enumeration Tree

P(b) .001

P(e) .002

P( e) .998

P(a|b,e) .95

P( a|b, e) .06

P( a|b,e) .05

P(a|b, e) .94

Variable Elimination

Queries can be written as sum-of-products Main idea Sum over products to eliminate variables Storage required to prevent repeat computations Burglary network

P(b|j,m)

= 1 P(j,m)

P(b) B E

P(E) E A

P(A|b,E)P(j|A)P(m|A) A J M

Have a factor for every variable

Factors for Variable Elimination

Factor for M: fm(A) = [P(m|a) P(m|¬a)] Factor for J: fj(A) = [P(j|a) P(j|¬a)] Similarly, factors fA(B,E,A),fE(E),fB(B) Summing out to eliminate variables

fAjm(B,E) =

fEAjm(B) = A

fA(B,E,A)fj(A)fm(A)

fE(E)fAjm(B,E)

P(B|j,m) = 1 P(j,m)

¯ ¯ fB(B)fEAjm(B)

Variable Elimination: Basic Operations

Summing out a variable from a product of factors Pointwise products of (a pair of) factors Sum out a variable from a product of factors Pointwise product of factors g1(a,b) × g2(b,c) = g(a,b,c) In general g1(x1,...,xj,y1,...,yk) × g2(y1,...,yk,z1,...,zl) = g(x1,...,xj,y1,...,yk,z1,...,zl)

Assuming f1,...,fi do not depend on X

x f1 × ··· × fk

fi+1 × ··· × fk = f1 × ··· × fi ×

= f1 × ··· × fi × fX¯

Example: Pointwise Product of Factors

A T T F F

B T F T F

f1(A,B) 0.3 0.7 0.9 0.1

B T T F F

C T F T F

f2(B,C) 0.2 0.8 0.6 0.4

A T T T T

B T T F F

C T F T F

f3(A,B,C) 0.3 × 0.2 0.3 × 0.8 0.7 × 0.6 0.7 × 0.4

T T F F

T F T F

0.9 × 0.2 0.9 × 0.8 0.1 × 0.6 0.1 × 0.4

Algorithm�

Irrelevant variables�

Complexity of Exact Inference

Singly connected networks (or polytrees) Any two nodes are connected by at most one path Time and space cost of variable elimination are O(dkn)

Multiply connected networks Can reduc 3SAT to exact inference =⇒ NP-hard Equivalent to counting 3SAT models =⇒ #P-complete

Inference Problems: Big Picture

Broadly the following types of problems Compute Compute Compute Compute

likelihood of observations marginals P(xA) on subset A of nodes conditionals P(xA|xB) on disjoint subsets A,B mode argmaxx P(x)

• First 3 problems are similar Involves marginalization over sum-of-products • Last problem is fundamentally different Entails maximization rather than marginalization • There is an important connection between the problems

Exact Inference - University of...

Documents

Transcript of Exact Inference - University of...

Radial Basis Function Networks - Vision Labsvision.psych.umn.edu/users/schrater/schrater_lab/courses/PattRecog… · Radial Basis Function Networks. Introduction In this lecture we

CSCI 5521: Pattern Recognition - Vision Labsvision.psych.umn.edu/users/schrater/schrater_lab/courses/PattRecog… · Prof. Paul Schrater Pattern Recognition CSCI 5521 4 Syllabus cont’d

Segmentation via Generative Models - Vision Labsvision.psych.umn.edu/users/schrater/schrater_lab/courses/CompVis09/... · CSCI 5561: Computer Vision, Prof. Paul Schrater, Spring 2005

OpenIntro Statistics, 2nd Editionsite.iugaza.edu.ps/mriffi/files/2016/02/os2_slides_06.pdfInference for a single proportion Two scientists want to know if a certain drug is effective

What is color? - Vision Labsvision.psych.umn.edu/users/schrater/schrater_lab/courses/CompVis03/... · Slides by D.A. Forsyth What is color? ... eye are not sufficient or necessary

Approximate Inference: MCMC - CSci 5512: Artificial Intelligence IIvision.psych.umn.edu/users/schrater/schrater_lab/courses/... · 2011. 2. 14. · Rejection Sampling Target density

Normative Decision Theory - Vision Labsvision.psych.umn.edu/users/schrater/schrater_lab/courses/MathMod07/... · PSY 5018H: Math Models Hum Behavior, Prof. Paul Schrater, Spring 2005

Convergence accommodation - University of Minnesotavision.psych.umn.edu/.../papers/Kersten1983Convergence_accommo… · Convergence accommodation is an accommodative response that

An Introduction to MCMC for Machine Learning - Vision Labsvision.psych.umn.edu/users/schrater/schrater_lab/courses/AI2/MCMC... · An Introduction to MCMC for Machine Learning CHRISTOPHE

Reinforcement Learning - CSci 5512: Artificial Intelligence IIvision.psych.umn.edu/users/schrater/schrater_lab/courses/AI2/rl1.pdfReinforcement Learning (RL) Learning what to do to

The problem of serial order in behavior: Lashley’s legacyvision.psych.umn.edu/users/schrater/schrater_lab/... · The problem of serial order in behavior: Lashley’s legacy David

Camera Models and Perspective Projection - Vision Labsvision.psych.umn.edu/users/schrater/schrater_lab/courses/CompVis07/... · Cameras •First photograph due to Niepce •First

Inference for Proportions - New Jersey Institute of …dhar/math661/IPS7e_LecturePPT_ch08.pdfInference for Proportions ... 8.2 Comparing two proportionsAuthors: William H Holmes ·

Filtering 2 - University of Minnesotavision.psych.umn.edu/users/schrater/schrater_lab/courses/CompVis03/ImageFiltering2.pdfFiltering as a Linear Transform h = 1 4 6 4 1 F = F convolved

Lect 10 RegressWid - University of Minnesotavision.psych.umn.edu/.../Lect_10_RegressWid.nb.pdfLect_10_RegressWid.nb 3. A fundamental assumption that affects the modeling and performance

CSCI 5521: Pattern Recognition - Vision Labsvision.psych.umn.edu/users/schrater/schrater_lab/courses/PattRecog... · CSCI 5521: Pattern Recognition Please submit your work in an electronic

Lect 12 Hopfield - University of Minnesotavision.psych.umn.edu/.../Psy5038WF2014/Lectures/Lect_11_Hopfield/Lect_11_Hopfield.nb.pdfIsing model of ferromagnetism developed in the 1920’s

Linear Discriminant Functionsvision.psych.umn.edu/users/schrater/schrater_lab/...6 –The multi-category case •We deﬁne c linear discriminant functions and assign x to ω i if

A Computational Theory of Executive Cognitive Processes ...vision.psych.umn.edu/users/schrater/schrater_lab/courses/Labmeeting/... · A Computational Theory of Executive Cognitive

Lecture 02: Statistical Inference for Binomial Parameterspeople.musc.edu/~bandyopd/bmtry711.11/lecture_02.pdfInference for a probability • Phase II cancer clinical trials are usually