Chapter 5 Belief Updating in Bayesian Networks

Bayesian Networks and Decision Graphs, Finn V. Jensen. Qunyuan Zhang, Division of Statistical Genomics, CGS. Statistical Genetics Forum, May 7, 2007.


Page 1: Chapter 5   Belief Updating in Bayesian Networks


Chapter 5 Belief Updating in Bayesian Networks

Bayesian Networks and Decision Graphs, Finn V. Jensen

Qunyuan Zhang, Division of Statistical Genomics, CGS

Statistical Genetics Forum, May 7, 2007

Page 2: Chapter 5   Belief Updating in Bayesian Networks


Contents of the Book

I A Practical Guide to Normative Systems

1 Causal and Bayesian Networks

2 Building Models

3 Learning, Adaptation, and Tuning

4 Decision Graphs

II Algorithms for Normative Systems

5 Belief Updating in Bayesian Networks

6 Bayesian Network Analysis Tools

7 Algorithms for Influence Diagrams

Page 3: Chapter 5   Belief Updating in Bayesian Networks


Structure of the Book

Chapter groups:

1 Causal and Bayesian Networks

2 Building Models; 3 Learning, Adaptation, and Tuning

5 Belief Updating in Bayesian Networks; 6 Bayesian Network Analysis Tools

4 Decision Graphs; 7 Algorithms for Influence Diagrams

Questions addressed:

I. What is a BN?

II. How to create a BN?

III. What can we use a BN to do, and how?

[to know something]

Prob.(a single variable | BN)

Joint Prob.(a set of variables | BN)

Importance of variables: evidence sensitivity, parameter sensitivity

Data conflict analysis

[to make decisions]

Optimal decision (cost & gain)

Page 4: Chapter 5   Belief Updating in Bayesian Networks


BN & Decision Tree

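[Figure: a decision graph over the nodes A, B, C, T, D1, D2 with utilities V1 and V2 (U = V1 + V2), next to the corresponding decision tree. The tree branches on D1, then T, then D2; each leaf is labeled AXC and V1 + V2, and the leaf probabilities are P(A,C|D1,T,D2).]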

Page 5: Chapter 5   Belief Updating in Bayesian Networks


“BN” of the Book

[Diagram with the following boxes]

Concept of BN

Model Building (known part of structure)

BN Learning (uncertain part of structure)

BN (structure & parameters)

Rules & Theories

Data & Algorithms

Probability Calculation

Knowing, Understanding & Explaining

Decisions

Actions

Cost & Gain

Changes

Page 6: Chapter 5   Belief Updating in Bayesian Networks


Chapter 5 Belief Updating in Bayesian Networks

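Belief = Probability

Belief updating = calculating probabilities from a BN (model, parameters, and/or evidence)

Linear Model

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + e \]

Logistic Model

\[ P(y = 1 \mid x_1, x_2, x_3) = \frac{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3}}{1 + e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3}} \]

BN

Conditional Probability: P(Y | X1, X2, X3)

Marginal Probability: P(Y) = ∑[-Y] Πφ, the sum over all variables except Y of the product of the potentials φ

[Figure: the regression models drawn as graphs over X1, X2, X3, e and Y, next to an example BN over the nodes A, B, C, D, E, F.]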

Page 7: Chapter 5   Belief Updating in Bayesian Networks


Marginal Probability Calculation in BN

I. Simplification (5.5)

II. Marginalization (5.2),(5.3),(5.4),(5.6)

III. Simulation (5.7)

Page 8: Chapter 5   Belief Updating in Bayesian Networks


I. Simplifications

Graph-theoretic Representation

Definitions, Propositions & Theorems

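Barren Nodes

A barren node is a node that is neither a query nor an evidence variable and has no descendants that are. Barren nodes do not influence the queried probability and can be removed.

d-separation

Nodes that are d-separated from the query variable given the evidence e carry no information about it and can be removed as well.

[Figure: a BN over the nodes A, B, C, D, E, F, G with evidence e, simplified step by step by excluding the non-informative nodes (white nodes).]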
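The barren-node rule can be sketched in a few lines. This is a minimal illustration, not the book's algorithm: the function name `prune_barren`, the parent-list representation, and the small example network are all assumptions made here.

```python
def prune_barren(parents, keep):
    """Return the nodes left after repeatedly removing barren nodes.

    parents: dict mapping each node to the list of its parents.
    keep: set of query and evidence nodes, which are never removed.
    """
    nodes = set(parents)
    while True:
        # Nodes that are still a parent of some remaining node.
        has_child = {p for n in nodes for p in parents[n] if p in nodes}
        # Barren: not query/evidence and no remaining children.
        barren = {n for n in nodes if n not in keep and n not in has_child}
        if not barren:
            return nodes
        nodes -= barren


# Hypothetical network: A -> B, B -> C, B -> D, C -> E, D -> F, E -> G
example_parents = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["B"],
    "E": ["C"], "F": ["D"], "G": ["E"],
}
# Query A with evidence on E: F and G are barren, then D becomes barren.
remaining = prune_barren(example_parents, keep={"A", "E"})
print(sorted(remaining))  # ['A', 'B', 'C', 'E']
```

Removing a barren node can make its parents barren in turn, which is why the pruning loops until nothing changes.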

Page 9: Chapter 5   Belief Updating in Bayesian Networks


II. Marginalization

Calculating sums of products of potentials by eliminating variables repeatedly

Page 10: Chapter 5   Belief Updating in Bayesian Networks


Marginal Probabilities

Joint probabilities P(A, B) with their marginals:

            A1                A2                P(B)
B1          p1                p2                P(B1) = p1+p2
B2          p3                p4                P(B2) = p3+p4
P(A)        P(A1) = p1+p3     P(A2) = p2+p4
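As a small numeric illustration of the table, the sketch below sums a joint table over one variable to obtain the marginal of the other. The values standing in for p1 to p4 and the helper name `marginal` are made up for this example.

```python
# Hypothetical joint table P(A, B): p1=0.1, p2=0.2, p3=0.3, p4=0.4
joint = {
    ("A1", "B1"): 0.1, ("A2", "B1"): 0.2,
    ("A1", "B2"): 0.3, ("A2", "B2"): 0.4,
}


def marginal(joint, axis):
    """Sum the joint table over every variable except the one at position `axis`."""
    out = {}
    for states, p in joint.items():
        out[states[axis]] = out.get(states[axis], 0.0) + p
    return out


p_a = marginal(joint, 0)  # P(A1) = p1+p3, P(A2) = p2+p4
p_b = marginal(joint, 1)  # P(B1) = p1+p2, P(B2) = p3+p4
print(p_a, p_b)
```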

Page 11: Chapter 5   Belief Updating in Bayesian Networks


An Example of Marginalization/Elimination

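BN parameters (potentials):

φ1=P(A1), φ2=P(A2|A1), φ3=P(A3|A1), φ4=P(A4|A2)

φ5=P(A5|A2,A3), φ6=P(A6|A3)

[Figure: BN with links A1→A2, A1→A3, A2→A4, A2→A5, A3→A5, A3→A6.]

P(A4)=?

\[
\begin{aligned}
P(A_4) &= \sum_{A_1,A_2,A_3,A_5,A_6} P(U) \\
&= \sum_{A_1,A_2,A_3,A_5,A_6} \phi_1(A_1)\,\phi_2(A_2,A_1)\,\phi_4(A_4,A_2)\,\phi_3(A_3,A_1)\,\phi_5(A_5,A_2,A_3)\,\phi_6(A_6,A_3) \\
&= \sum_{A_1} \phi_1(A_1) \sum_{A_2} \phi_2(A_2,A_1)\,\phi_4(A_4,A_2) \sum_{A_3} \phi_3(A_3,A_1) \sum_{A_5} \phi_5(A_5,A_2,A_3) \sum_{A_6} \phi_6(A_6,A_3)
\end{aligned}
\]

The last step uses the Distributive Law: each sum is moved inward past the potentials that do not contain its variable.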
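To see what the distributive law buys, the sketch below computes the same marginal twice on a three-variable chain, once as a plain sum of products and once with the inner sum pushed inward, and counts the multiplications. The tables `f`, `g`, `h` (standing for P(A), P(B|A), P(C|B)) and their numbers are assumptions chosen only for illustration.

```python
# Hypothetical potentials on binary variables: f(a), g(a, b), h(b, c)
f = {0: 0.3, 1: 0.7}
g = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}
h = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.2, (1, 1): 0.8}
states = (0, 1)


def brute_force(c):
    """Sum over a and b of f(a) * g(a, b) * h(b, c), term by term."""
    total, mults = 0.0, 0
    for a in states:
        for b in states:
            total += f[a] * g[a, b] * h[b, c]
            mults += 2
    return total, mults


def factored(c):
    """Same quantity as sum_b h(b, c) * (sum_a f(a) * g(a, b))."""
    total, mults = 0.0, 0
    for b in states:
        inner = 0.0
        for a in states:
            inner += f[a] * g[a, b]
            mults += 1
        total += h[b, c] * inner
        mults += 1
    return total, mults


print(brute_force(0), factored(0))
```

Both functions return the same value; the factored form needs 6 multiplications instead of 8 here, and the gap grows quickly with the number of variables and states.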

Page 12: Chapter 5   Belief Updating in Bayesian Networks


Marginalization/Elimination Order

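\[
\begin{aligned}
P(A_4) &= \sum_{A_1,A_2,A_3,A_5,A_6} \phi_1(A_1)\,\phi_2(A_2,A_1)\,\phi_4(A_4,A_2)\,\phi_3(A_3,A_1)\,\phi_5(A_5,A_2,A_3)\,\phi_6(A_6,A_3) \\
&= \sum_{A_1,A_2,A_3,A_5} \phi_1(A_1)\,\phi_2(A_2,A_1)\,\phi_4(A_4,A_2)\,\phi_3(A_3,A_1)\,\phi_5(A_5,A_2,A_3)\,\phi_6'(A_3) \\
&= \sum_{A_1,A_2,A_3} \phi_1(A_1)\,\phi_2(A_2,A_1)\,\phi_4(A_4,A_2)\,\phi_3(A_3,A_1)\,\phi_5'(A_2,A_3)\,\phi_6'(A_3) \\
&= \sum_{A_1,A_2} \phi_1(A_1)\,\phi_2(A_2,A_1)\,\phi_4(A_4,A_2)\,\phi_3'(A_1,A_2) \\
&= \sum_{A_1} \phi_1(A_1)\,\phi_2'(A_1,A_4) \\
&= \phi_1'(A_4)
\end{aligned}
\]

Each primed potential is the result of summing one variable out, e.g. φ6'(A3) = ∑A6 φ6(A6,A3).

[Figure: BN with links A1→A2, A1→A3, A2→A4, A2→A5, A3→A5, A3→A6.]

Variable Elimination Order

A6 → A5 → A3 → A2 → A1 → P(A4)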

Page 13: Chapter 5   Belief Updating in Bayesian Networks


Marginalization/Elimination

Graph-theoretic Representation

Definitions, Propositions & Theorems

Domain: a set of variables in the BN

Potential: a real-valued table over a domain, e.g.

φ1=P(A1), φ2=P(A2|A1), φ3=P(A3|A1), φ4=P(A4|A2)

φ5=P(A5|A2,A3), φ6=P(A6|A3)

[Figure: BN with links A1→A2, A1→A3, A2→A4, A2→A5, A3→A5, A3→A6.]

Definition 5.1 (Elimination) Let Ф be a set of potentials, and let X be a variable. X is eliminated from Ф by:

1. Remove all potentials in Ф with X in their domains. Call the removed set ФX.

   e.g. X = A3 => ФX = {φ3, φ5, φ6}, leaving Ф = {φ1, φ2, φ4}

2. Calculate φ-X = ∑X ΠФX.

   e.g. φ-A3 = ∑A3 φ3 φ5 φ6

3. Add φ-X to Ф. Call the resulting set Ф-X.

   e.g. Ф-A3 = {φ1, φ2, φ4, φ-A3}

P(Y) is calculated by repeatedly eliminating every variable except Y.

Question: how to find an efficient/optimal elimination order?
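Definition 5.1 translates almost directly into code. The following is a minimal sketch assuming binary variables and potentials stored as `(variables, table)` pairs; the helper names `make_potential`, `cpt`, `eliminate` and all probability values are hypothetical and serve only to make the example runnable.

```python
from itertools import product

STATES = (0, 1)  # assumption: every variable is binary


def make_potential(variables, fn):
    """Tabulate fn over all configurations of `variables`."""
    return (tuple(variables),
            {cfg: fn(*cfg) for cfg in product(STATES, repeat=len(variables))})


def cpt(prob_one):
    """Turn 'P(child = 1 | parents)' into a function over (child, *parents)."""
    return lambda child, *pa: prob_one(*pa) if child == 1 else 1.0 - prob_one(*pa)


def eliminate(potentials, x):
    """Definition 5.1: take out the potentials containing x, multiply, sum x out."""
    with_x = [p for p in potentials if x in p[0]]      # step 1: the set Phi_X
    rest = [p for p in potentials if x not in p[0]]
    domain = sorted({v for vars_, _ in with_x for v in vars_})
    out_vars = tuple(v for v in domain if v != x)
    table = {}
    for cfg in product(STATES, repeat=len(domain)):    # step 2: sum of products
        assign = dict(zip(domain, cfg))
        value = 1.0
        for vars_, tab in with_x:
            value *= tab[tuple(assign[v] for v in vars_)]
        key = tuple(assign[v] for v in out_vars)
        table[key] = table.get(key, 0.0) + value
    return rest + [(out_vars, table)]                  # step 3: add phi_-X


# Hypothetical numbers for phi1 ... phi6 of the example network
potentials = [
    make_potential(["A1"], cpt(lambda: 0.6)),
    make_potential(["A2", "A1"], cpt(lambda a1: 0.7 if a1 else 0.2)),
    make_potential(["A3", "A1"], cpt(lambda a1: 0.5 if a1 else 0.1)),
    make_potential(["A4", "A2"], cpt(lambda a2: 0.9 if a2 else 0.3)),
    make_potential(["A5", "A2", "A3"], cpt(lambda a2, a3: 0.8 if a2 and a3 else 0.4)),
    make_potential(["A6", "A3"], cpt(lambda a3: 0.25 if a3 else 0.75)),
]

phi = potentials
for x in ["A6", "A5", "A3", "A2", "A1"]:
    phi = eliminate(phi, x)

final_vars, p_a4 = phi[0]   # a single potential over A4 remains
print(final_vars, round(p_a4[(1,)], 6))  # ('A4',) 0.6
```

With these numbers, eliminating A6 and A5 yields potentials that are identically 1, which reflects that they are barren for the query P(A4).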

Page 14: Chapter 5   Belief Updating in Bayesian Networks


Domain Graphs

Graph-theoretic Representation

Definitions, Propositions & Theorems

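BN graph

[Figure: directed graph with links A1→A2, A1→A3, A2→A4, A2→A5, A3→A5, A3→A6.]

Domain graph

The domain graph is undirected: its nodes are the variables, and two variables are linked if some potential has both in its domain.

6 domains

φ1 (A1), φ2 (A2,A1),

φ3 (A3,A1), φ4 (A4,A2)

φ5 (A5,A2,A3), φ6 (A6,A3)

[Figure: the BN graph with the arrow directions dropped, plus a link A2-A3 because both variables appear in the domain of φ5.]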
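Building a domain graph from a list of domains is mechanical. A minimal sketch, assuming an adjacency-set representation and a hypothetical helper name `domain_graph`:

```python
from itertools import combinations


def domain_graph(domains):
    """Undirected graph that links every pair of variables sharing a domain."""
    graph = {v: set() for d in domains for v in d}
    for d in domains:
        for u, v in combinations(d, 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


# Domains of phi1 ... phi6 from the example
domains = [("A1",), ("A2", "A1"), ("A3", "A1"), ("A4", "A2"),
           ("A5", "A2", "A3"), ("A6", "A3")]
graph = domain_graph(domains)
print(sorted(graph["A2"]))  # ['A1', 'A3', 'A4', 'A5']
```

The neighbor A3 of A2 comes only from the domain of φ5; it is not a link of the original BN.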

Page 15: Chapter 5   Belief Updating in Bayesian Networks


Perfect Elimination Sequence

Graph-theoretic Representation

Definitions, Propositions & Theorems

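Fill-ins (red links): links added between neighbors of an eliminated node that were not already linked.

Perfect Elimination Sequence

An elimination sequence that introduces no fill-ins.

e.g.

A6, A5, A3, A1, A2 down to A4 => P(A4)

A5, A6, A3, A1, A2 down to A4 => P(A4)

A1, A5, A6, A3, A2 down to A4 => P(A4)

[Figure: the domain graph over A1, ..., A6, and the graph over A1, A2, A4, A5, A6 that is left after eliminating A3 first, with the resulting fill-ins drawn in red.]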
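Whether a sequence is perfect can be checked by simulating the eliminations on the domain graph and counting fill-ins. The sketch below does this for the example graph; the function names and the edge-list encoding are assumptions of this illustration.

```python
from itertools import combinations


def eliminate_node(graph, x):
    """Remove x from graph (dict node -> neighbor set); return the fill-ins added."""
    nbrs = graph.pop(x)
    fill_ins = []
    for u, v in combinations(sorted(nbrs), 2):
        if v not in graph[u]:          # neighbors of x that are not yet linked
            graph[u].add(v)
            graph[v].add(u)
            fill_ins.append((u, v))
    for n in nbrs:
        graph[n].discard(x)
    return fill_ins


def count_fill_ins(edges, order):
    """Total number of fill-ins introduced by eliminating nodes in `order`."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return sum(len(eliminate_node(graph, x)) for x in order)


# Domain graph of the example (links of the BN plus A2-A3)
EDGES = [("A1", "A2"), ("A1", "A3"), ("A2", "A3"), ("A2", "A4"),
         ("A2", "A5"), ("A3", "A5"), ("A3", "A6")]

perfect = count_fill_ins(EDGES, ["A6", "A5", "A3", "A1", "A2", "A4"])
a3_first = count_fill_ins(EDGES, ["A3", "A6", "A5", "A1", "A2", "A4"])
print(perfect, a3_first)  # 0 4
```

Eliminating A3 first links A1-A5, A1-A6, A2-A6 and A5-A6, so that sequence is not perfect.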

Page 16: Chapter 5   Belief Updating in Bayesian Networks


Domain Set of Elimination Sequence

Graph-theoretic Representation

Definitions, Propositions & Theorems

The domain set of an elimination sequence is the set of domains of the potentials produced during the elimination, with domains that are subsets of other domains removed.

For the sequence

A6, A5, A3, A1, A2 down to A4 => P(A4)

the set of domains is

{(A6,A3), (A2,A3,A5), (A1,A2,A3), (A2,A4)}

The domain (A1,A2), produced when A1 is eliminated, is a subset of (A1,A2,A3) and is therefore removed.

The domain set reflects the complexity of an elimination sequence.

Question: how to find the smallest domain set?
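The definition can be applied mechanically by tracking only the domains, not the tables. The sketch below is one way to do that under the stated rule that subsets of other domains are dropped; the function name `domain_set` is hypothetical.

```python
def domain_set(domains, order):
    """Domains produced while eliminating `order`, with subsets of other domains removed."""
    current = [frozenset(d) for d in domains]
    produced = []
    for x in order:
        with_x = [d for d in current if x in d]
        merged = frozenset().union(*with_x)      # domain of the product of Phi_X
        produced.append(merged)
        current = [d for d in current if x not in d] + [merged - {x}]
    # Keep only domains that are not strictly contained in another one.
    return {d for d in produced if not any(d < other for other in produced)}


DOMAINS = [("A1",), ("A2", "A1"), ("A3", "A1"), ("A4", "A2"),
           ("A5", "A2", "A3"), ("A6", "A3")]

perfect_set = domain_set(DOMAINS, ["A6", "A5", "A3", "A1", "A2", "A4"])
a3_first_set = domain_set(DOMAINS, ["A3", "A6", "A5", "A1", "A2", "A4"])
print(sorted(sorted(d) for d in perfect_set))
print(max(len(d) for d in a3_first_set))  # 5
```

For the perfect sequence the largest domain has three variables, and {A1, A2} is dropped because it is contained in {A1, A2, A3}. Eliminating A3 first produces a domain over five variables.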

Page 17: Chapter 5   Belief Updating in Bayesian Networks


Set of Cliques

Graph-theoretic Representation

Definitions, Propositions & Theorems

All perfect elimination sequences produce the same domain set, namely the set of cliques (maximal complete sets of nodes) of the domain graph.

e.g.

all the sequences

A6, A5, A3, A1, A2 down to A4

A5, A6, A3, A1, A2 down to A4

A1, A5, A6, A3, A2 down to A4

produce the domain set

{(A6,A3), (A2,A3,A5), (A1,A2,A3), (A2,A4)}

which contains 4 domains / cliques. The domain (A1,A2) is not listed because it is a subset of (A1,A2,A3).

Any perfect elimination sequence is optimal.

The cliques are the domains produced by perfect elimination sequences.

The clique set is the optimal set of domains.

Question: how to determine the set of cliques?

Page 18: Chapter 5   Belief Updating in Bayesian Networks


Triangulated Graphs

Graph-theoretic Representation

Definitions, Propositions & Theorems

An undirected graph with a perfect elimination sequence is called a triangulated graph.

A triangulated graph: has a perfect elimination sequence, e.g. A5, A2, A4, A3 down to A1

A nontriangulated graph: has no perfect elimination sequence

[Figure: two graphs over the nodes A1, A2, A3, A4, A5, one triangulated and one nontriangulated.]

Page 19: Chapter 5   Belief Updating in Bayesian Networks


Cliques in Triangulated Graphs

Graph-theoretic Representation

Definitions, Propositions & Theorems

X: a node in the domain graph

Fx: the set of neighbors of X, plus X itself

Simplicial: a node X is simplicial if its neighbors are pairwise linked, i.e. Fx is complete

To determine the set of cliques in a triangulated graph:

1. Eliminate a simplicial node X. Fx is a clique candidate.

2. If Fx does not include all remaining nodes, go to 1.

3. Prune the set of clique candidates by removing sets that are subsets of other clique candidates.

4. The resulting set is the set of cliques.

Question: given a set of cliques, how to determine the perfect elimination order?

[Figure: an example graph over the nodes A, B, C, D, E and X.]
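The four steps above can be sketched as follows. This is an illustrative implementation on the example domain graph, with hypothetical function names; it raises an error when no simplicial node exists, which is exactly the case of a nontriangulated graph.

```python
from itertools import combinations


def is_simplicial(graph, x):
    """True if the neighbors of x are pairwise linked."""
    return all(v in graph[u] for u, v in combinations(graph[x], 2))


def cliques_of_triangulated(graph):
    graph = {n: set(nb) for n, nb in graph.items()}   # work on a copy
    candidates = []
    while graph:
        x = next((n for n in sorted(graph) if is_simplicial(graph, n)), None)
        if x is None:
            raise ValueError("no simplicial node: the graph is not triangulated")
        family = frozenset(graph[x] | {x})            # Fx, a clique candidate
        candidates.append(family)
        if len(family) == len(graph):                 # Fx covers all remaining nodes
            break
        for n in graph.pop(x):                        # eliminate x
            graph[n].discard(x)
    # Prune candidates that are subsets of other candidates.
    return {c for c in candidates if not any(c < d for d in candidates)}


# Domain graph of the A1 ... A6 example
GRAPH = {
    "A1": {"A2", "A3"}, "A2": {"A1", "A3", "A4", "A5"},
    "A3": {"A1", "A2", "A5", "A6"}, "A4": {"A2"},
    "A5": {"A2", "A3"}, "A6": {"A3"},
}
cliques = cliques_of_triangulated(GRAPH)
print(sorted(sorted(c) for c in cliques))
```

On this graph the procedure returns the four cliques {A1, A2, A3}, {A2, A3, A5}, {A2, A4} and {A3, A6}.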

Page 20: Chapter 5   Belief Updating in Bayesian Networks


Join Tree

Graph-theoretic Representation

Definitions, Propositions & Theorems

A join tree is a tree of cliques in which every node on the path between two cliques V and W contains the intersection of V and W.

[Figure: a domain graph over the nodes A, B, C, D, E, F, G, H, I, J.]

Elimination sequence

A, F, I, H, J, G, B, C, D down to E

Cliques (V) and Separators (S)

V1:  ABCD    S1: BCD
V3:  DEFI    S3: DE
V5:  CGHJ    S5: CG
V6:  BCDG    S6: BCD
V10: BCDE

[Figure: two trees over the cliques ABCD, CGHJ, BCDE, BCDG, DEFI. One is a join tree; the other is not a join tree.]
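The join tree property can be tested by walking the path between every pair of cliques. In the sketch below the clique names come from the slide, but the two sets of links are assumptions chosen for illustration, since the links of the original figures are not recoverable from the transcript.

```python
def tree_path(tree, start, goal):
    """Path between two nodes of a tree given as {node: set of neighbors}."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        stack.extend((n, path + [n]) for n in tree[node] if n not in path)
    return None


def is_join_tree(cliques, links):
    """Check that every clique on the path from V to W contains V ∩ W."""
    tree = {c: set() for c in cliques}
    for u, v in links:
        tree[u].add(v)
        tree[v].add(u)
    for i, v in enumerate(cliques):
        for w in cliques[i + 1:]:
            shared = set(v) & set(w)
            path = tree_path(tree, v, w)
            if path is None or any(not shared <= set(c) for c in path):
                return False
    return True


CLIQUES = ["ABCD", "BCDG", "BCDE", "CGHJ", "DEFI"]
# Hypothetical link sets over the same cliques
good_links = [("ABCD", "BCDG"), ("BCDG", "CGHJ"), ("BCDG", "BCDE"), ("BCDE", "DEFI")]
bad_links = [("ABCD", "CGHJ"), ("CGHJ", "BCDG"), ("BCDG", "BCDE"), ("BCDE", "DEFI")]

print(is_join_tree(CLIQUES, good_links), is_join_tree(CLIQUES, bad_links))  # True False
```

The second tree fails because ABCD and BCDG share B, C and D, but the clique CGHJ on the path between them contains neither B nor D.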

Page 21: Chapter 5   Belief Updating in Bayesian Networks


Propagation in Junction Trees

Graph-theoretic Representation

Definitions, Propositions & Theorems

A junction tree is a join tree with the following structure:

1. Each potential is attached to a clique containing the domain of this potential (cliques)

2. Each link has the appropriate separator attached (separators)

3. Each separator contains two “mailboxes”, one for each direction (mutual communication)

Cliques and attached potentials:

V4: A1, A2, A3 (φ1, φ2, φ3)

V6: A2, A4 (φ4)

V2: A2, A3, A5 (φ5)

V1: A3, A6 (φ6)

Separators, each with two mailboxes (↑ ↓):

S4: A2

S2: A2, A3

S1: A3

Collect evidence to V6, then distribute evidence from V6.

Junction trees provide a general framework for finding optimal elimination sequences for triangulated graphs.

Question: what if a graph is non-triangulated?

Page 22: Chapter 5   Belief Updating in Bayesian Networks


Triangulations

Graph-theoretic Representation

Definitions, Propositions & Theorems

Convert a non-triangulated graph into a triangulated one by adding new link(s).

[Figure: three panels over the nodes A, B, C, D, E, F, G, H, I, J: the BN, the non-triangulated graph, and the triangulated graph.]

Optimal triangulation? Minimal fill-in size?

Heuristic approach: repeatedly eliminate a simplicial node; if none exists, eliminate a node X with minimal size of Fx.
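A minimal sketch of this heuristic follows. It assumes that the size of Fx is measured by the number of nodes in it, which amounts to assuming that all variables have the same number of states; the function names and the four-node example are hypothetical.

```python
from itertools import combinations


def missing_links(g, n):
    """Pairs of neighbors of n that are not linked (the fill-ins n would cause)."""
    return [(u, v) for u, v in combinations(sorted(g[n]), 2) if v not in g[u]]


def triangulate(graph):
    """Greedy elimination: a simplicial node if one exists, else the smallest Fx.

    Returns (elimination order, list of fill-ins to add).
    """
    g = {n: set(nb) for n, nb in graph.items()}   # work on a copy
    order, fill_ins = [], []
    while g:
        simplicial = [n for n in sorted(g) if not missing_links(g, n)]
        if simplicial:
            x = simplicial[0]
        else:
            x = min(sorted(g), key=lambda n: len(g[n]))  # assumption: size = node count
        for u, v in missing_links(g, x):
            g[u].add(v)
            g[v].add(u)
            fill_ins.append((u, v))
        for n in g.pop(x):
            g[n].discard(x)
        order.append(x)
    return order, fill_ins


# A cycle A-B-C-D-A has no simplicial node, so one fill-in is needed.
square = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}
order, fill_ins = triangulate(square)
print(order, fill_ins)  # ['A', 'B', 'C', 'D'] [('B', 'D')]
```

Adding the returned fill-ins to the original graph gives a triangulated graph for which the returned order is a perfect elimination sequence. The heuristic does not guarantee a minimal fill-in.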

Page 23: Chapter 5   Belief Updating in Bayesian Networks


III. Stochastic Simulations

Forward Sampling

1. Sample A from P(A)

2. Sample B from P(B|A) and C from P(C|A)

3. Sample D from P(D|B)

4. Sample E from P(E|C,D)

5. Repeat steps 1-4

[Figure: BN with links A→B, A→C, B→D, C→E, D→E.]
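The steps above can be sketched as follows for the network A→B, A→C, B→D, (C,D)→E with states y/n. All probability tables are invented for this illustration, and the exact value is computed by enumeration only to check the estimate.

```python
import random
from itertools import product

# Hypothetical CPTs: probability that the node is "y" given its parents
P_A = 0.3
P_B = {"y": 0.8, "n": 0.1}     # P(B=y | A)
P_C = {"y": 0.6, "n": 0.2}     # P(C=y | A)
P_D = {"y": 0.7, "n": 0.05}    # P(D=y | B)
P_E = {("y", "y"): 0.95, ("y", "n"): 0.5,
       ("n", "y"): 0.4, ("n", "n"): 0.01}   # P(E=y | C, D)


def draw(rng, p_yes):
    return "y" if rng.random() < p_yes else "n"


def forward_sample(rng):
    """Sample every variable once, parents before children (steps 1 to 4)."""
    a = draw(rng, P_A)
    b = draw(rng, P_B[a])
    c = draw(rng, P_C[a])
    d = draw(rng, P_D[b])
    e = draw(rng, P_E[c, d])
    return {"A": a, "B": b, "C": c, "D": d, "E": e}


def joint(a, b, c, d, e):
    """Exact joint probability, used only to check the estimate."""
    def pr(p_yes, s):
        return p_yes if s == "y" else 1.0 - p_yes
    return (pr(P_A, a) * pr(P_B[a], b) * pr(P_C[a], c)
            * pr(P_D[b], d) * pr(P_E[c, d], e))


rng = random.Random(0)
n = 20000
estimate = sum(forward_sample(rng)["E"] == "y" for _ in range(n)) / n   # step 5
exact = sum(joint(a, b, c, d, "y") for a, b, c, d in product("yn", repeat=4))
print(estimate, exact)
```

The estimate of P(E=y) is the fraction of samples with E=y. With evidence, samples that contradict the evidence would have to be thrown away, which is wasteful when the evidence is improbable.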

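Gibbs Sampling

Evidence: B=n, E=n. P(B=n, E=n) is small, so forward sampling would rarely produce samples that agree with the evidence.

P(A | B=n, E=n) = ?

Start from an arbitrary configuration (a0, d0) of the unobserved variables, then resample each unobserved variable in turn given the current values of all the others:

P(C | B=n, E=n, A=a0, D=d0) => c1

P(D | B=n, E=n, C=c1, A=a0) => d1

P(A | B=n, E=n, D=d1, C=c1) => a1

P(C | B=n, E=n, A=a1, D=d1) => c2

... (discard these early samples)

P(C | B=n, E=n, A=a_{t-1}, D=d_{t-1}) => c_t

... (collect the samples from here on)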
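The scheme can be sketched as follows on the same network, A→B, A→C, B→D, (C,D)→E. The probability tables, burn-in length and sample count are assumptions for illustration, and each full conditional is obtained by normalising the joint over the two states of the variable being resampled, which is only practical for a network this small.

```python
import random
from itertools import product

# Hypothetical CPTs: probability that the node is "y" given its parents
P_A = 0.3
P_B = {"y": 0.8, "n": 0.1}     # P(B=y | A)
P_C = {"y": 0.6, "n": 0.2}     # P(C=y | A)
P_D = {"y": 0.7, "n": 0.05}    # P(D=y | B)
P_E = {("y", "y"): 0.95, ("y", "n"): 0.5,
       ("n", "y"): 0.4, ("n", "n"): 0.01}   # P(E=y | C, D)


def joint(A, B, C, D, E):
    def pr(p_yes, s):
        return p_yes if s == "y" else 1.0 - p_yes
    return (pr(P_A, A) * pr(P_B[A], B) * pr(P_C[A], C)
            * pr(P_D[B], D) * pr(P_E[C, D], E))


def gibbs(evidence, n_samples, burn_in, rng):
    # Arbitrary starting configuration for the unobserved variables
    state = {"A": "y", "C": "y", "D": "y", **evidence}
    kept = []
    for t in range(burn_in + n_samples):
        for var in ("C", "D", "A"):
            # Full conditional of var: joint with everything else held fixed
            w = {s: joint(**{**state, var: s}) for s in "yn"}
            p_yes = w["y"] / (w["y"] + w["n"])
            state[var] = "y" if rng.random() < p_yes else "n"
        if t >= burn_in:                 # discard the burn-in, collect the rest
            kept.append(dict(state))
    return kept


evidence = {"B": "n", "E": "n"}
samples = gibbs(evidence, n_samples=20000, burn_in=1000, rng=random.Random(0))
estimate = sum(s["A"] == "y" for s in samples) / len(samples)

# Exact P(A=y | B=n, E=n) by enumeration, for comparison only
num = sum(joint("y", "n", c, d, "n") for c, d in product("yn", repeat=2))
den = sum(joint(a, "n", c, d, "n") for a, c, d in product("yn", repeat=3))
exact = num / den
print(estimate, exact)
```

Every sample agrees with the evidence by construction, so none is wasted, but consecutive samples are correlated and the early ones depend on the starting configuration.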