Page 1:

Bayesian Networks

CPSC 386 Artificial Intelligence

Ellen Walker

Hiram College

Page 2:

Bayes’ Rule

• P(A^B) = P(A|B) * P(B) = P(B|A) * P(A)

• So P(A|B) = P(B|A) * P(A) / P(B)

• This allows us to compute diagnostic probabilities from causal probabilities and prior probabilities!
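A minimal Python sketch of this computation; the flu/fever numbers happen to match the worked example on a later slide, but treat them as illustrative:

```python
# Bayes' rule: get the diagnostic probability P(flu | fever)
# from the causal probability P(fever | flu) and the priors.
p_fever_given_flu = 0.8   # P(fever | flu), causal direction
p_flu = 0.25              # prior P(flu)
p_fever = 0.45            # prior P(fever)

p_flu_given_fever = p_fever_given_flu * p_flu / p_fever
print(p_flu_given_fever)  # 0.444... = 4/9
```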

Page 3:

Joint Probability Distribution

• Consider all possibilities of a set of propositions
– E.g. picking 2 cards from a deck:

P(card1 is red and card2 is red)

P(card1 is red and card2 is black)

P(card1 is black and card2 is red)

P(card1 is black and card2 is black)

Sum of all combinations should be 1

Sum of all “card1 is red” combinations is P(card1 is red)

Page 4:

Joint P.D. table

                       Second card is red       Second card is black
First card is red      26*25/(52*51) = 0.245    26*26/(52*51) = 0.255
First card is black    26*26/(52*51) = 0.255    26*25/(52*51) = 0.245
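A sketch of how this table could be computed in Python, using exact fractions for the two-card draw without replacement:

```python
from fractions import Fraction

# Joint distribution over the colors of two cards drawn
# without replacement from a standard 52-card deck.
joint = {}
for first in ("red", "black"):
    for second in ("red", "black"):
        p_first = Fraction(26, 52)               # 26 of each color
        same = 25 if first == second else 26     # cards of 2nd color left
        joint[(first, second)] = p_first * Fraction(same, 51)

for outcome, p in joint.items():
    print(outcome, float(p))   # 0.245... on the diagonal, 0.254... off it

print(sum(joint.values()))     # 1; the whole table sums to one
```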

Page 5:

Operations on Joint P.D.

• Marginalization (summing out)
– Add up elements in a row or column that contain all possibilities for a given item, to remove that item from the distribution
– P(1st card is red) = 0.245 + 0.255 = 0.5

• Conditioning
– Get a distribution over one variable by summing out all other variables

• Normalization
– Take the values in the distribution and find a proper multiplier (alpha) so that they add up to 1
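The same operations sketched in Python on the card table above; the last step shows normalization turning a column of joint values into a proper distribution:

```python
# The 2x2 joint table from the previous slide (exact values).
joint = {
    ("red", "red"): 25/102, ("red", "black"): 13/51,
    ("black", "red"): 13/51, ("black", "black"): 25/102,
}

# Marginalization: sum out the second card.
p_first_red = sum(p for (first, _), p in joint.items() if first == "red")
print(p_first_red)   # 0.5, matching 0.245 + 0.255

# Normalization: take the entries with 2nd card red, scale by alpha.
column = [joint[("red", "red")], joint[("black", "red")]]
alpha = 1 / sum(column)
print([alpha * v for v in column])   # [25/51, 26/51], sums to 1
```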

Page 6:

A bigger joint PD

            Flu                  No flu
            Fever     No fever   Fever     No fever
Ache        0.15      0.05       0.05      0.10
No ache     0.05      0.00       0.20      0.40

Page 7:

Based on that PD…

• Summing out…
– P(flu) = 0.25
– P(fever) = 0.45
– P(flu ^ fever) = 0.2
– P(~flu ^ fever) = 0.25
– P(flu | fever) = P(flu ^ fever) / P(fever) = 0.2 / 0.45 = 4/9

• Normalizing
– <P(flu | fever), P(~flu | fever)> = α <0.2, 0.25> = <4/9, 5/9>
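A quick Python check of these numbers, with the joint table of the previous slide keyed by (flu, fever, ache) truth values:

```python
joint = {
    (True,  True,  True):  0.15, (True,  False, True):  0.05,
    (False, True,  True):  0.05, (False, False, True):  0.10,
    (True,  True,  False): 0.05, (True,  False, False): 0.00,
    (False, True,  False): 0.20, (False, False, False): 0.40,
}

p_flu = sum(p for (flu, _, _), p in joint.items() if flu)              # 0.25
p_fever = sum(p for (_, fev, _), p in joint.items() if fev)            # 0.45
p_both = sum(p for (flu, fev, _), p in joint.items() if flu and fev)   # 0.20
print(p_both / p_fever)   # 0.444... = 4/9
```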

Page 8:

Evaluating Full Joint PD’s

• Advantage
– All combinations are available
– Any joint or unconditional probability can be computed

• Disadvantage
– Combinatorial Explosion! For N variables, need 2^N individual probabilities
– Difficult to get probabilities for all combinations
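For a sense of scale, a one-line arithmetic check (the variable counts are arbitrary, not from the slides):

```python
# Entries needed in a full joint table over N boolean variables.
for n in (5, 10, 20, 30):
    print(n, 2 ** n)   # 32, 1024, 1048576, 1073741824
```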

Page 9:

Independence

• Absolute independence:
– P(A|B) = P(A) or P(A,B) = P(A) * P(B)
– No need for a joint table
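As an illustrative check in Python: the two-card table from earlier fails this test, since drawing without replacement makes the two cards dependent:

```python
joint = {
    ("red", "red"): 25/102, ("red", "black"): 13/51,
    ("black", "red"): 13/51, ("black", "black"): 25/102,
}
p_first_red = sum(p for (f, _), p in joint.items() if f == "red")    # 0.5
p_second_red = sum(p for (_, s), p in joint.items() if s == "red")   # 0.5
print(joint[("red", "red")], p_first_red * p_second_red)
# 0.245... vs 0.25; not equal, so the two draws are NOT independent.
```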

Page 10:

Conditional independence

• P(A|B,C) = P(A|C) or P(A,B|C) = P(A|C) * P(B|C)

• If we know the truth of C, then A and B become independent
– (e.g. ache and fever are independent given flu)

• We can say C “separates” A and B
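A sketch with hypothetical numbers: build a joint distribution by assuming P(A,B|C) = P(A|C) * P(B|C), then verify that P(A|B,C) = P(A|C):

```python
# All numbers below are made up for illustration.
p_c = {True: 0.3, False: 0.7}
p_a_given_c = {True: 0.8, False: 0.1}   # P(A=true | C)
p_b_given_c = {True: 0.6, False: 0.2}   # P(B=true | C)

def p(a, b, c):
    """Joint built from the conditional-independence assumption."""
    pa = p_a_given_c[c] if a else 1 - p_a_given_c[c]
    pb = p_b_given_c[c] if b else 1 - p_b_given_c[c]
    return pa * pb * p_c[c]

# P(A | B, C) for B=true, C=true:
lhs = p(True, True, True) / (p(True, True, True) + p(False, True, True))
print(lhs)   # 0.8, the same as P(A=true | C=true)
```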

Page 11:

Naïve Bayes model

• Assume that all possible effects (symptoms) are separated by the cause

• Then:
– P(cause, effect1, effect2, effect3, …) = P(cause) * P(effect1 | cause) * P(effect2 | cause) * …

• Can work surprisingly well in many cases
• Necessary conditional probabilities can be learned
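A minimal naive-Bayes sketch in Python; the priors and likelihoods are hypothetical (loosely inspired by the flu example), not values given on these slides:

```python
priors = {"flu": 0.25, "no flu": 0.75}
likelihood = {                      # P(symptom | cause), assumed values
    "flu":    {"fever": 0.80, "ache": 0.80},
    "no flu": {"fever": 0.33, "ache": 0.20},
}

observed = ["fever", "ache"]
scores = {}
for cause, prior in priors.items():
    score = prior                   # P(cause) * product of P(effect | cause)
    for symptom in observed:
        score *= likelihood[cause][symptom]
    scores[cause] = score

alpha = 1 / sum(scores.values())    # normalize so the posteriors sum to 1
print({c: alpha * s for c, s in scores.items()})   # flu is about 0.76 here
```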

Page 12:

Bayesian Network

• Data structure that represents
– Dependencies (and conditional independencies) among variables
– Necessary information to compute a full joint probability distribution

Page 13:

Structure of Bayesian Network

• Nodes represent random variables
– (e.g. flu, fever, ache)

• Directed links (arrows) connect pairs of nodes, from parent to child

• Each node has a conditional P.D. P(child | parents)
– More parents, bigger P.D.

• Graph has no directed cycles
– No node is its own (great… grand) parent!
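One way such a network could be held in code, as a sketch; the three-node chain (flu -> fever -> therm) and all of its numbers are hypothetical:

```python
# Each node stores its parent list and a table of P(node=true | parent values).
network = {
    "flu":   {"parents": [],        "cpt": {(): 0.25}},
    "fever": {"parents": ["flu"],   "cpt": {(True,): 0.80, (False,): 0.33}},
    "therm": {"parents": ["fever"], "cpt": {(True,): 0.90, (False,): 0.05}},
}
# P(node=false | ...) is 1 minus the stored entry, so boolean nodes need one
# number per row. More parents means more rows: k boolean parents -> 2**k rows.
```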

Page 14:

Example Bayesian Network

[Network diagram: nodes Damp weather, flu, and measles; the symptoms ache, fever, and spots; and Thermometer>100 F.]

Page 15:

Probabilities in the network

• Probability of a complete set of variable assignments is the product of each variable’s probability given its parents
– P(x1, x2, …) = P(x1 | parents(X1)) * P(x2 | parents(X2)) * …

• Example
– P(~therm ^ damp ^ ache ^ ~fever ^ flu) =
P(~therm) * P(damp) * P(ache | damp) * P(~fever | ~therm) * P(flu | ache ^ ~fever)
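A sketch of this product in Python, reusing the hypothetical flu -> fever -> therm network from the structure slide:

```python
network = {
    "flu":   {"parents": [],        "cpt": {(): 0.25}},
    "fever": {"parents": ["flu"],   "cpt": {(True,): 0.80, (False,): 0.33}},
    "therm": {"parents": ["fever"], "cpt": {(True,): 0.90, (False,): 0.05}},
}

def joint_probability(assignment):
    """P(x1, x2, ...) = product over nodes of P(xi | parents(Xi))."""
    total = 1.0
    for name, node in network.items():
        parent_vals = tuple(assignment[p] for p in node["parents"])
        p_true = node["cpt"][parent_vals]
        total *= p_true if assignment[name] else 1 - p_true
    return total

print(joint_probability({"flu": True, "fever": True, "therm": False}))
# 0.25 * 0.80 * (1 - 0.90) = 0.02
```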

Page 16:

Constructing a Bayesian Network

• Start with root causes
• Add direct consequences next (connected), and so on…
– E.g. damp weather -> ache, but not damp weather -> flu
• Each node should be directly connected to (influenced by) only a few others
• If we choose a different order, we’ll get tables that are too big!