
Lirong Xia

Bayesian networks (2)

Thursday, Feb 25, 2014

Reminder

• Pay attention to all reminders

• Written HW 1 out: due this Friday before class

– typed printout preferred

– otherwise it must be in neat print handwriting; we may return illegible HWs ungraded

– no email submissions

• Midterm Mar 7

– in-class

– open book

– no smartphones/laptops/wifi

• Project 2 deadline: Mar 18, midnight

Reassigned project 1

• Avg score for reassigned project 1: 15.97

• Usage of old util.py

– should not matter if you only use the member functions

• if so, let me know

– might be problematic if you directly access the member variables

– advice: make sure you use all files that come with the autograder for project 2

Last class

• Bayesian networks

– compact, graphical representation of a joint probability distribution

– conditional independence

Bayesian network

• Definition of a Bayesian network (Bayes' net or BN)

– A set of nodes, one per variable X

– A directed, acyclic graph

– A conditional distribution for each node

• A collection of distributions over X, one for each combination of the parents' values: p(X | a1,…, an)

• CPT: conditional probability table

• Description of a noisy "causal" process

• A Bayesian network = Topology (graph) + Local Conditional Probabilities
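
To make "topology + local conditional probabilities" concrete, here is a minimal sketch (my own illustration, not course-provided code) that stores a Bayes' net as two plain dicts, using the rain/sprinkler network that appears later in this lecture:

```python
# Topology: each variable maps to the tuple of its parents (a DAG).
parents = {
    'R': (),          # Rained
    'S': (),          # Sprinklers were on
    'N': ('R',),      # Neighbor walked dog
    'G': ('R', 'S'),  # Grass wet
    'D': ('N', 'G'),  # Dog wet
}

# Local conditional probabilities: p_true[X][parent values] = p(+X | parents).
# All variables are boolean, so p(-X | ...) = 1 - p(+X | ...).
p_true = {
    'R': {(): 0.2},
    'S': {(): 0.6},
    'N': {(True,): 0.3, (False,): 0.4},
    'G': {(True, True): 0.9, (True, False): 0.7,
          (False, True): 0.8, (False, False): 0.2},
    'D': {(True, True): 0.9, (True, False): 0.4,
          (False, True): 0.5, (False, False): 0.3},
}

def local_prob(var, value, assignment):
    """One CPT lookup: p(var = value | parents(var)) under a full assignment dict."""
    pt = p_true[var][tuple(assignment[p] for p in parents[var])]
    return pt if value else 1.0 - pt
```

Any representation works, as long as it records the DAG plus one CPT entry per node and parent assignment.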

Probabilities in BNs

• Bayesian networks implicitly encode joint distributions

– As a product of local conditional distributions:

p(x1,…, xn) = Πi p(xi | parents(Xi))

• This lets us reconstruct any entry of the full joint

• Not every BN can represent every joint distribution

– The topology enforces certain conditional independencies
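
As a quick check (my own example, using the rain/sprinkler CPTs given later in this lecture), one entry of the full joint is just a product of CPT entries:

```python
# p(+R, -S, +N, +G, +D) = p(+R) p(-S) p(+N|+R) p(+G|+R,-S) p(+D|+N,+G)
entry = 0.2 * (1 - 0.6) * 0.3 * 0.7 * 0.9
print(entry)  # ≈ 0.01512
```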

Reachability (D-Separation)

• Question: are X and Y conditionally independent given evidence variables {Z}?

– Yes, if X and Y are "separated" by Z

– Look for active paths from X to Y

– No active paths = independence!

• A path is active if each triple along it is active:

– Causal chain A→B→C (either direction) where B is unobserved

– Common cause A←B→C where B is unobserved

– Common effect A→B←C where B or one of its descendants is observed

• All it takes to block a path is a single inactive segment

Checking conditional independence from BN graph

• Given random variables Z1,…, Zp, we are asked whether X ⊥ Y | Z1,…, Zp

• Step 1: shade Z1,…, Zp

• Step 2: for each undirected path from X to Y

– if all triples on the path are active, then X and Y are NOT conditionally independent

• If all paths have been checked and none of them is active, then X ⊥ Y | Z1,…, Zp
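
A sketch of this procedure in code (my own illustration with hypothetical helper names; it assumes `observed` is a set and `descendants` is a precomputed map from each node to the set of its descendants):

```python
def undirected_paths(edges, x, y):
    """Yield simple x-to-y paths in the undirected version of the DAG."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    stack = [(x, [x])]
    while stack:
        node, path = stack.pop()
        if node == y:
            yield path
            continue
        for nxt in adj.get(node, ()):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))

def d_separated(edges, x, y, observed, descendants):
    """True iff X ⊥ Y | observed: no undirected path is fully active."""
    E = set(edges)
    for path in undirected_paths(edges, x, y):
        active = True
        for a, b, c in zip(path, path[1:], path[2:]):
            if (a, b) in E and (c, b) in E:
                # common effect: active iff B or one of its descendants is observed
                ok = b in observed or bool(observed & descendants[b])
            else:
                # causal chain or common cause: active iff B is unobserved
                ok = b not in observed
            if not ok:
                active = False
                break
        if active:
            return False  # found an active path: not independent
    return True
```

For example, on A→B←C with nothing observed, the only path's triple is an inactive common effect, so d_separated returns True.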

Example

• R ⊥ B? Yes!

• R ⊥ B | T?

• R ⊥ B | T'?

Example

• L ⊥ T' | T? Yes!

• L ⊥ B? Yes!

• L ⊥ B | T?

• L ⊥ B | T'?

• L ⊥ B | T, R? Yes!

Example

• Variables:

– R: Raining

– T: Traffic

– D: Roof drips

– S: I am sad

• Questions:

– T ⊥ D?

– T ⊥ D | R? Yes!

– T ⊥ D | R, S?

Today: Inference via variable elimination

Inference

• Inference: calculating some useful quantity from a joint probability distribution

• Examples:

– Posterior probability: p(Q | e1,…, ek)

– Most likely explanation: argmaxq p(q | e1,…, ek)

Inference

• Given unlimited time, inference in BNs is easy

• Recipe:

– State the marginal probabilities you need

– Figure out ALL the atomic probabilities you need

– Calculate and combine them

• Example:

p(+b | +j, +m) = p(+b, +j, +m) / p(+j, +m)

Example: Enumeration

• In this simple method, we only need the BN to synthesize the joint entries:

p(+b, +j, +m)
= p(+b) p(+e) p(+a|+b,+e) p(+j|+a) p(+m|+a)
+ p(+b) p(+e) p(-a|+b,+e) p(+j|-a) p(+m|-a)
+ p(+b) p(-e) p(+a|+b,-e) p(+j|+a) p(+m|+a)
+ p(+b) p(-e) p(-a|+b,-e) p(+j|-a) p(+m|-a)

Inference by Enumeration?

• More elaborate rain and sprinklers example

[Network: Rained (R) and Sprinklers were on (S) are root nodes; R → Neighbor walked dog (N); R, S → Grass wet (G); N, G → Dog wet (D)]

p(+R) = .2

p(+N|+R) = .3

p(+N|-R) = .4

p(+S) = .6

p(+G|+R,+S) = .9

p(+G|+R,-S) = .7

p(+G|-R,+S) = .8

p(+G|-R,-S) = .2

p(+D|+N,+G) = .9

p(+D|+N,-G) = .4

p(+D|-N,+G) = .5

p(+D|-N,-G) = .3

Inference

• Want to know: p(+R | +D) = p(+R, +D) / p(+D)

• Let's compute p(+R, +D)
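
Before the speedup, here is a brute-force enumeration sketch (my own code, not from the course) that computes p(+R, +D) by summing the joint over all values of S, N, G, using the CPTs above:

```python
from itertools import product

def joint(r, s, n, g, d):
    """p(R=r, S=s, N=n, G=g, D=d) as a product of the local CPTs above."""
    pR = 0.2 if r else 0.8
    pS = 0.6 if s else 0.4
    pN = 0.3 if r else 0.4                                   # p(+N | r)
    pN = pN if n else 1 - pN
    pG = {(True, True): 0.9, (True, False): 0.7,
          (False, True): 0.8, (False, False): 0.2}[(r, s)]   # p(+G | r, s)
    pG = pG if g else 1 - pG
    pD = {(True, True): 0.9, (True, False): 0.4,
          (False, True): 0.5, (False, False): 0.3}[(n, g)]   # p(+D | n, g)
    pD = pD if d else 1 - pD
    return pR * pS * pN * pG * pD

p_rd = sum(joint(True, s, n, g, True)
           for s, n, g in product([True, False], repeat=3))
p_d = sum(joint(r, s, n, g, True)
          for r, s, n, g in product([True, False], repeat=4))
print(p_rd)        # ≈ 0.11356
print(p_rd / p_d)  # p(+R | +D) ≈ 0.2147
```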


Inference…

• p(+R, +D) = Σs Σg Σn p(+R) p(s) p(n|+R) p(g|+R,s) p(+D|n,g)
= p(+R) Σs p(s) Σg p(g|+R,s) Σn p(n|+R) p(+D|n,g)


Staring at the formula…

p(+R) Σs p(s) Σg p(g|+R,s) Σn p(n|+R) p(+D|n,g)

• Order: s > g > n

• The outermost sum only involves s

• The middle sum only involves s and g

• The innermost sum only involves s, g, and n

Variable elimination

• From the factor Σn p(n|+R) p(+D|n,g), we sum out n to obtain a factor depending only on g

• [Σn p(n|+R) p(+D|n,+G)] = p(+N|+R) p(+D|+N,+G) + p(-N|+R) p(+D|-N,+G) = .3*.9+.7*.5 = .62

• [Σn p(n|+R) p(+D|n,-G)] = p(+N|+R) p(+D|+N,-G) + p(-N|+R) p(+D|-N,-G) = .3*.4+.7*.3 = .33

• Continuing to the left, g will be summed out next, etc. (continued on board)
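
Since the rest was continued on the board, here is a sketch of the full right-to-left elimination with the slide's numbers (my own code, not the course's implementation):

```python
pS = {True: 0.6, False: 0.4}           # p(s)
pN = {True: 0.3, False: 0.7}           # p(n | +R)
pG = {True: {True: 0.9, False: 0.1},   # p(g | +R, s), indexed pG[s][g]
      False: {True: 0.7, False: 0.3}}
pD = {(True, True): 0.9, (True, False): 0.4,
      (False, True): 0.5, (False, False): 0.3}   # p(+D | n, g)

# Sum out n: a factor over g alone (the .62/.33 computed above).
f_n = {g: sum(pN[n] * pD[(n, g)] for n in (True, False)) for g in (True, False)}
print(f_n)   # ≈ {True: 0.62, False: 0.33} (up to float rounding)

# Sum out g: a factor over s alone.
f_g = {s: sum(pG[s][g] * f_n[g] for g in (True, False)) for s in (True, False)}

# Sum out s, then multiply by p(+R).
p_rd = 0.2 * sum(pS[s] * f_g[s] for s in (True, False))
print(p_rd)  # ≈ 0.11356, matching brute-force enumeration
```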


Elimination order matters

• p(+R, +D) = Σn Σs Σg p(+R) p(s) p(n|+R) p(g|+R,s) p(+D|n,g)
= p(+R) Σn p(n|+R) Σs p(s) Σg p(g|+R,s) p(+D|n,g)

• The last factor will depend on two variables in this case!
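
A quick numeric check of this claim (my own snippet): with the order n > s > g, the innermost sum Σg p(g|+R,s) p(+D|n,g) produces a table indexed by both n and s, a larger factor than the single-variable .62/.33 factor obtained before:

```python
pG = {True: 0.9, False: 0.7}   # p(+G | +R, s) for s = +S, -S
pD = {(True, True): 0.9, (True, False): 0.4,
      (False, True): 0.5, (False, False): 0.3}   # p(+D | n, g)

# Innermost sum for the order n > s > g: sum out g first.
f = {(n, s): pG[s] * pD[(n, True)] + (1 - pG[s]) * pD[(n, False)]
     for n in (True, False) for s in (True, False)}
print(f)   # four entries: this factor ranges over BOTH n and s
```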


General method for variable elimination

• Compute a marginal probability p(x1,…, xp) in a Bayesian network

– Let Y1,…, Yk denote the remaining variables

– Step 1: fix an order over the Y's (w.l.o.g. Y1 > … > Yk)

– Step 2: rewrite the summation as

Σy1 [ Σy2 [ … Σyk-1 [ Σyk (anything) ] … ] ]

• the innermost sum Σyk yields sth only involving Y1, Y2,…, Yk-1 and the X's

• …

• Σy2 yields sth only involving Y1 and the X's

• Σy1 yields sth only involving the X's

– Step 3: variable elimination from right to left
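
A generic sketch of this procedure (my own illustration with hypothetical names, not the course's implementation): represent each CPT as a factor, then repeatedly multiply in and sum out one Y at a time, right to left in the chosen order:

```python
from itertools import product

def sum_out(var, factors, domain=(True, False)):
    """Multiply together all factors mentioning `var`, then sum `var` out.
    A factor is {'vars': [names...], 'table': {tuple-of-values: number}},
    with table keys ordered to match 'vars'."""
    touching = [f for f in factors if var in f['vars']]
    rest = [f for f in factors if var not in f['vars']]
    new_vars = sorted({v for f in touching for v in f['vars']} - {var})
    table = {}
    for vals in product(domain, repeat=len(new_vars)):
        asst = dict(zip(new_vars, vals))
        total = 0.0
        for x in domain:                # sum over the eliminated variable
            asst[var] = x
            p = 1.0
            for f in touching:          # product of all touching factors
                p *= f['table'][tuple(asst[v] for v in f['vars'])]
            total += p
        table[vals] = total
    return rest + [{'vars': new_vars, 'table': table}]

# Eliminating n from {p(n|+R), p(+D|n,g)} reproduces the slide's g-factor:
factors = [
    {'vars': ['n'], 'table': {(True,): 0.3, (False,): 0.7}},   # p(n|+R)
    {'vars': ['g', 'n'],
     'table': {(True, True): 0.9, (True, False): 0.5,
               (False, True): 0.4, (False, False): 0.3}},      # p(+D|n,g)
]
print(sum_out('n', factors)[-1]['table'])  # ≈ {(True,): 0.62, (False,): 0.33}
```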

Trees make our life much easier

• Choose an extreme variable (e.g., a leaf) to eliminate first

• … Σx4 p(x4|x1,x2) … Σx5 p(x5|x4) … = … Σx4 p(x4|x1,x2) [Σx5 p(x5|x4)] …

• Why does this not work well for general BNs?

[Figure: a tree-shaped network over X1,…, X8; among its edges, X4 has parents X1 and X2, and X5 is a child of X4]