Computing & Information Sciences Kansas State University Data Sciences Summer Institute Multimodal...

Computing & Information Sciences, Kansas State University. Data Sciences Summer Institute: Multimodal Information Access and Synthesis. Learning and Reasoning with Graphical Models of Probability for the Identity Uncertainty Problem. William H. Hsu, Tuesday, 29 May 2007. Laboratory for Knowledge Discovery in Databases, Kansas State University. http://www.kddresearch.org/KSU/CIS/DSSI-MIAS-SRL-20070529.ppt University of Illinois at Urbana-Champaign, DSSI--MIAS

Computing & Information Sciences
Kansas State University

Data Sciences Summer Institute
Multimodal Information Access and Synthesis

Learning and Reasoning with Graphical Models of Probability for the Identity Uncertainty Problem

William H. Hsu

Tuesday, 29 May 2007

Laboratory for Knowledge Discovery in Databases

Kansas State University

http://www.kddresearch.org/KSU/CIS/DSSI-MIAS-SRL-20070529.ppt

University of Illinois at Urbana-Champaign, DSSI--MIAS

• Graphical Models of Probability
  – Markov graphs
  – Bayesian (belief) networks
  – Causal semantics
  – Direction-dependent separation (d-separation) property
• Learning and Reasoning: Problems, Algorithms
  – Inference: exact and approximate
    • Junction tree – Lauritzen and Spiegelhalter (1988)
    • (Bounded) loop cutset conditioning – Horvitz and Cooper (1989)
    • Variable elimination – Dechter (1996)
  – Structure learning
    • K2 algorithm – Cooper and Herskovits (1992)
    • Variable ordering problem – Larrañaga (1996), Hsu et al. (2002)
• Probabilistic Reasoning in Machine Learning, Data Mining
• Current Research and Open Problems

Part 1 of 8: Graphical Models Intro
Overview

Adapted from Fayyad, Piatetsky-Shapiro, and Smyth (1996)

Stages of Data Mining

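P(Age = 20s, Gender = Female, Exposure-To-Toxins = Low, Smoking = Non-Smoker, Cancer = No-Cancer, Serum Calcium = Negative, Lung Tumor = Negative) = P(20s) · P(Female) · P(Low | 20s) · P(Non-Smoker | 20s, Female) · P(No-Cancer | Low, Non-Smoker) · P(Negative | No-Cancer) · P(Negative | No-Cancer)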

• Conditional Independence
  – X is conditionally independent (CI) from Y given Z (sometimes written X ⊥ Y | Z) iff P(X | Y, Z) = P(X | Z) for all values of X, Y, and Z
  – Example: P(Thunder | Rain, Lightning) = P(Thunder | Lightning), i.e., T ⊥ R | L

• Bayesian (Belief) Network
  – Acyclic directed graph model B = (V, E, Θ) representing CI assertions over Ω
  – Vertices (nodes) V: denote events (each a random variable)
  – Edges (arcs, links) E: denote conditional dependencies

• Markov Condition for BBNs (Chain Rule): $P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{parents}(X_i))$

• Example BBN

[Figure: example BBN over X1 = Age, X2 = Gender, X3 = Exposure-To-Toxins, X4 = Smoking, X5 = Cancer, X6 = Serum Calcium, X7 = Lung Tumor, with Parents, Descendants, and Non-Descendants regions labeled]

Graphical Models Defined [1]: Independence and Bayes Nets
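The chain-rule factorization on this slide can be made concrete with a small Python sketch. The parent sets follow the factorization given for the example network; every probability value below is a hypothetical placeholder chosen for illustration, not a number from the slides, and only the CPT entries needed for the one example assignment are filled in.

```python
# Parent sets read off the example factorization; node names are shortened.
PARENTS = {
    "Age": (),
    "Gender": (),
    "Exposure": ("Age",),
    "Smoking": ("Age", "Gender"),
    "Cancer": ("Exposure", "Smoking"),
    "SerumCalcium": ("Cancer",),
    "LungTumor": ("Cancer",),
}

# CPT[node][parent values][node value] -> probability.
# ASSUMPTION: all numbers are invented for illustration.
CPT = {
    "Age": {(): {"20s": 0.2}},
    "Gender": {(): {"Female": 0.5}},
    "Exposure": {("20s",): {"Low": 0.9}},
    "Smoking": {("20s", "Female"): {"Non-Smoker": 0.8}},
    "Cancer": {("Low", "Non-Smoker"): {"No-Cancer": 0.99}},
    "SerumCalcium": {("No-Cancer",): {"Negative": 0.95}},
    "LungTumor": {("No-Cancer",): {"Negative": 0.98}},
}


def joint_probability(assignment):
    """Multiply P(X_i | parents(X_i)) over all nodes for one full assignment."""
    p = 1.0
    for node, parents in PARENTS.items():
        parent_values = tuple(assignment[q] for q in parents)
        p *= CPT[node][parent_values][assignment[node]]
    return p


example = {
    "Age": "20s",
    "Gender": "Female",
    "Exposure": "Low",
    "Smoking": "Non-Smoker",
    "Cancer": "No-Cancer",
    "SerumCalcium": "Negative",
    "LungTumor": "Negative",
}
p_example = joint_probability(example)
```

The point of the factorization is that seven small local tables replace one table over all joint assignments of the seven variables.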

[Figure: three ways an intermediate node Z can block a path between X and Y given evidence E, labeled (1), (2), (3)]

From S. Russell & P. Norvig (1995)

Adapted from J. Schlabach (1996)

Motivation: The conditional independence status of nodes within a BBN might change as the availability of evidence E changes. Direction-dependent separation (d-separation) is a technique used to determine conditional independence of nodes as evidence changes.

Definition: A set of evidence nodes E d-separates two sets of nodes X and Y if every undirected path from a node in X to a node in Y is blocked given E.

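A path is blocked given E if some node Z on the path satisfies one of three conditions:

(1) Z is in E, and one path arc leads into Z and the other leads out of Z
(2) Z is in E, and both path arcs lead out of Z
(3) Neither Z nor any descendant of Z is in E, and both path arcs lead into Z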

Graphical Models Defined [2]: D-Separation and Markov Blankets
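A d-separation query can be checked mechanically. The Python sketch below does not enumerate paths as in the definition above; it swaps in the moralized ancestral graph criterion, which gives the same answer: keep only the ancestors of X, Y, and E, connect co-parents, drop arc directions, delete E, and test whether X can still reach Y. The network is the example cancer BBN from this deck, with shortened node names that are an assumption of this sketch.

```python
from collections import deque

# child -> parents, for the example cancer network (names shortened).
PARENTS = {
    "Age": (),
    "Gender": (),
    "Exposure": ("Age",),
    "Smoking": ("Age", "Gender"),
    "Cancer": ("Exposure", "Smoking"),
    "SerumCalcium": ("Cancer",),
    "LungTumor": ("Cancer",),
}


def ancestral_set(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, xs, ys, evidence):
    """True if evidence d-separates xs from ys (moral ancestral graph test)."""
    xs, ys, evidence = set(xs), set(ys), set(evidence)
    keep = ancestral_set(parents, xs | ys | evidence)
    # Moralize: link each child to its parents and link co-parents together.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for p in ps:
            adj[child].add(p)
            adj[p].add(child)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # Breadth-first search from xs that never enters an evidence node.
    frontier = deque(xs - evidence)
    seen = set(frontier)
    while frontier:
        node = frontier.popleft()
        if node in ys:
            return False
        for m in adj[node]:
            if m not in seen and m not in evidence:
                seen.add(m)
                frontier.append(m)
    return True


# Observing the common cause Age separates Exposure from Smoking ...
sep_given_age = d_separated(PARENTS, {"Exposure"}, {"Smoking"}, {"Age"})
# ... but also observing their common effect Cancer reconnects them.
sep_given_age_cancer = d_separated(
    PARENTS, {"Exposure"}, {"Smoking"}, {"Age", "Cancer"}
)
```

The two queries illustrate the motivation stated on the slide: adding evidence can destroy an independence that held with less evidence.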

Adapted from slides by S. Russell, UC Berkeley http://aima.cs.berkeley.edu/

Multiply-connected case: exact, approximate inference are #P-complete

Graphical Models Defined [3]: Reasoning with Bayes Nets

• Goal: Estimate $P(X_t^i \mid y_{1:r})$
• Filtering: r = t
  – Intuition: infer current state from observations
  – Applications: signal identification
  – Variation: Viterbi algorithm
• Prediction: r < t
  – Intuition: infer future state
  – Applications: prognostics
• Smoothing: r > t
  – Intuition: infer past hidden state
  – Applications: signal enhancement
• CF Tasks
  – Plan recognition by smoothing
  – Prediction cf. WebCANVAS – Cadez et al. (2000)

Adapted from Murphy (2001), Guo (2002)

Bayesian Network Applications [1]: Time Series Prediction
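Filtering (r = t) and prediction (r < t) can be sketched for the simplest temporal model, a discrete hidden Markov model, using the standard forward recursion. This is a minimal sketch under stated assumptions: the two-state model, its transition and emission tables, and the function names are all hypothetical, and smoothing (which needs an additional backward pass) is not shown.

```python
def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def step(belief, transition):
    """Push a belief one time step forward through the transition model."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def forward_filter(prior, transition, emission, observations):
    """Return P(X_t | y_1..y_t) for t = 1..T; prior is the belief over X_0."""
    belief = list(prior)
    history = []
    for y in observations:
        predicted = step(belief, transition)  # time update
        belief = normalize(                    # measurement update
            [predicted[j] * emission[j][y] for j in range(len(predicted))]
        )
        history.append(belief)
    return history


def predict(belief, transition, k):
    """Return the belief k steps past the last observation (prediction, r < t)."""
    for _ in range(k):
        belief = step(belief, transition)
    return belief


# ASSUMPTION: a made-up two-state, two-symbol model.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]   # transition[i][j] = P(X_t = j | X_{t-1} = i)
emission = [[0.9, 0.1], [0.2, 0.8]]     # emission[j][y] = P(y | X_t = j)

beliefs = forward_filter(prior, transition, emission, [0, 0, 1])
forecast = predict(beliefs[-1], transition, 2)
```

Each filtering step costs time quadratic in the number of states, independent of how many observations came before, which is why filtering suits online use.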

• General-Case BBN Structure Learning: Use Inference to Compute Scores
• Optimal Strategy: Bayesian Model Averaging
  – Assumption: models h ∈ H are mutually exclusive and exhaustive
  – Combine predictions of models in proportion to marginal likelihood
    • Compute conditional probability of hypothesis h given observed data D
    • i.e., compute expectation over unknown h for unseen cases
    • Let h ≡ structure, parameters Θ ≡ CPTs

$$P(x_{m+1} \mid D) = P(x_1, x_2, \ldots, x_n \mid x^{(1)}, x^{(2)}, \ldots, x^{(m)}) = \sum_{h \in H} P(x_{m+1} \mid D, h)\, P(h \mid D)$$

$$\underbrace{P(h \mid D)}_{\text{Posterior Score}} \propto \underbrace{P(D \mid h)}_{\text{Marginal Likelihood}} \cdot \underbrace{P(h)}_{\text{Prior over Structures}} = P(h) \int \underbrace{P(D \mid h, \Theta)}_{\text{Likelihood}}\, \underbrace{P(\Theta \mid h)}_{\text{Prior over Parameters}}\, d\Theta$$

Bayesian Network Applications [2]: Bayes Optimal Classification
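The model-averaging idea on this slide can be illustrated with a deliberately tiny Python sketch. As a simplifying assumption, each hypothesis h here is a coin with a fixed, known bias rather than a network structure, so P(D | h) needs no integral over Θ; the hypothesis names, biases, prior, and data are all hypothetical.

```python
def bernoulli_likelihood(theta, data):
    """P(D | h) for i.i.d. 0/1 outcomes under a fixed success probability theta."""
    heads = sum(data)
    tails = len(data) - heads
    return theta ** heads * (1.0 - theta) ** tails


def posterior_over_models(prior, likelihood):
    """P(h | D) proportional to P(D | h) * P(h), normalized over all h in H."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnormalized.values())
    return {h: u / z for h, u in unnormalized.items()}


# ASSUMPTION: three mutually exclusive, exhaustive toy hypotheses.
biases = {"h_low": 0.2, "h_fair": 0.5, "h_high": 0.8}
prior = {h: 1.0 / len(biases) for h in biases}
data = [1, 1, 0, 1]  # observed cases x^(1)..x^(m)

likelihood = {h: bernoulli_likelihood(theta, data) for h, theta in biases.items()}
posterior = posterior_over_models(prior, likelihood)

# Bayes-optimal prediction: sum over h of P(x_{m+1} = 1 | h) * P(h | D).
p_next_heads = sum(posterior[h] * biases[h] for h in biases)
```

The averaged prediction uses every hypothesis, weighted by its posterior, instead of committing to the single most probable one.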

Split vertex in undirected cycle; condition upon each of its state values

Number of network instantiations: Product of arity of nodes in minimal loop cutset

Posterior: marginal conditioned upon cutset variable values

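[Figure: the example BBN with cutset node X1 (Age) split into one instantiation per state value: X1,1: Age = [0, 10), X1,2: Age = [10, 20), ..., X1,10: Age = [100, ∞); remaining nodes X2 = Gender, X3 = Exposure-To-Toxins, X4 = Smoking, X5 = Cancer, X6 = Serum Calcium, X7 = Lung Tumor]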

• Deciding Optimal Cutset: NP-hard

• Current Open Problems
  – Bounded cutset conditioning: ordering heuristics
  – Finding randomized algorithms for loop cutset optimization

Inference in Bayesian Networks: Loop Cutset Conditioning
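The bookkeeping behind cutset conditioning can be sketched in a few lines of Python. The sketch only counts and enumerates cutset instantiations and mixes per-instantiation posteriors; the singly-connected inference run for each instantiation is left out. The function names, the extra Gender cutset node, and the mixing weights are hypothetical; the arity of 10 for Age matches the ten age intervals on the slide.

```python
from itertools import product
from math import prod


def num_instantiations(arity):
    """Number of conditioned networks: product of the cutset nodes' arities."""
    return prod(arity.values())


def cutset_instantiations(arity):
    """Yield every joint assignment of state indices to the cutset nodes."""
    names = list(arity)
    for values in product(*(range(arity[name]) for name in names)):
        yield dict(zip(names, values))


def combine(posteriors, weights):
    """Mix per-instantiation posteriors P(x | c, e) with weights for P(c | e)."""
    z = sum(weights)
    n = len(posteriors[0])
    return [
        sum(w * p[k] for w, p in zip(weights, posteriors)) / z
        for k in range(n)
    ]


# Cutset {Age} with ten intervals, as in the slide's example.
count_age = num_instantiations({"Age": 10})

# ASSUMPTION: a larger hypothetical cutset, to show the multiplicative growth.
count_two = num_instantiations({"Age": 10, "Gender": 2})
all_two = list(cutset_instantiations({"Age": 10, "Gender": 2}))

# Two made-up per-instantiation posteriors mixed with unnormalized weights.
mixed = combine([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
```

Because the count is a product of arities, it grows exponentially with cutset size, which is why the slide stresses minimal cutsets and bounded variants.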

Novel Contributions [3]: Learning in Graphical Models

Continuing Work:
• Speeding up Approximate Inference using Edge Deletion - J. Thornton (2005)
• Bayesian Network tools in Java (BNJ) v4 - W. Hsu, J. M. Barber, J. Thornton (2006)

Dynamic Bayes Net for Prediction

© 2005 KSU Bayesian Network tools in Java (BNJ) Development Team

ALARM Network

Bayesian Network tools in Java (BNJ) v4

Questions and Discussion
