Bayesian Networks Aldi Kraja Division of Statistical Genomics.


Bayesian Networks and Decision Graphs. Chapter 1

• A causal network consists of a set of variables and a set of directed links between the variables

• Variables represent events (propositions)

• A variable can have any number of states

• Purpose: Causal networks can be used to follow how a change of certainty in one variable may change certainty of other variables

Causal networks

[Figure: Causal network for a reduced car-start problem. Nodes and states: Fuel (Y, N); Clean Sparks (Y, N); Fuel Meter Standing (F, ½, E); Start (Y, N).]

Causal Networks and d-separation

• Serial connection (blocking)

A → B → C

Evidence may be transmitted through a serial connection unless the state of the variable in the connection is known.

A and C are d-separated given B. When B is instantiated, it blocks the communication between A and C.

Causal networks and d-separation

• Diverging connections (blocking)

A → B, A → C, …, A → E

Influence can pass between all children of A unless the state of A is known. Evidence may be transmitted through a diverging connection unless A is instantiated.

Causal networks and d-separation

• Converging connections (opening)

B → A, C → A, …, E → A

Case 1: If nothing is known about A except what may be inferred from knowledge of its parents, then the parents are independent. Evidence on one of the parents has no influence on the other parents.

Case 2: If anything is known about the consequences, then information on one possible cause may tell us something about the other causes (the explaining-away effect).

Evidence may only be transmitted through the converging connection if either A or one of its descendants has received evidence.

Evidence

• Evidence on a variable is a statement of the certainties of its states

• If the variable is instantiated then the variable provides hard evidence

• Blocking in the case of serial and diverging connections requires hard evidence

• Opening in the case of converging connections holds for all kinds of evidence

D-separation

• Two distinct variables A and B in a causal network are d-separated if, for all paths between A and B, there is an intermediate variable V (distinct from A and B) such that either:

• the connection is serial or diverging and V is instantiated, or

• the connection is converging and neither V nor any of V’s descendants have received evidence
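The definition above can be checked mechanically on a small network. The following is a minimal sketch, not an efficient algorithm: it enumerates every path between two variables and tests each intermediate variable against the two blocking conditions. The edge list `car_edges` is an assumption based on the usual car-start example (Fuel → FuelMeter, Fuel → Start, CleanSparks → Start); the function names are illustrative only.

```python
def descendants(edges, v):
    """Return all descendants of v in the DAG given as (parent, child) pairs."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    seen, stack = set(), [v]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(edges, a, b):
    """Yield every path from a to b that ignores link direction and repeats no node."""
    neighbours = {}
    for parent, child in edges:
        neighbours.setdefault(parent, set()).add(child)
        neighbours.setdefault(child, set()).add(parent)

    def walk(path):
        if path[-1] == b:
            yield path
            return
        for nxt in neighbours.get(path[-1], ()):
            if nxt not in path:
                yield from walk(path + [nxt])

    yield from walk([a])


def d_separated(edges, a, b, evidence):
    """True if every path between a and b is blocked given the evidence set."""
    edge_set, evidence = set(edges), set(evidence)
    for path in simple_paths(edges, a, b):
        blocked = False
        for prev, v, nxt in zip(path, path[1:], path[2:]):
            converging = (prev, v) in edge_set and (nxt, v) in edge_set
            if converging:
                # Blocks unless v or one of its descendants has evidence.
                if v not in evidence and not (descendants(edges, v) & evidence):
                    blocked = True
            elif v in evidence:
                # Serial or diverging connection with v instantiated.
                blocked = True
        if not blocked:
            return False
    return True


# Assumed structure of the reduced car-start network (hypothetical node names).
car_edges = [("Fuel", "FuelMeter"), ("Fuel", "Start"), ("CleanSparks", "Start")]
```

With these edges, `d_separated(car_edges, "Fuel", "CleanSparks", [])` holds because the only path runs through the converging node Start, while adding Start to the evidence opens that path.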

Probability Theory

• Uncertainty arises from noise in the measurements and from the small sample size of the data.

• Use probability theory to quantify the uncertainty.

P(B=r)=4/10

P(B=g)=6/10

[Figure: wheat example with two fungus types, red (r) and gray (g), each associated with ripe and unripe wheat.]

Probability Theory

• The probability of an event is the fraction of times that event occurs out of the total number of trials, in the limit that the total number of trials goes to infinity

Probability Theory

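• Sum rule: $p(X) = \sum_{Y} p(X, Y)$

• Product rule: $p(X, Y) = p(Y \mid X)\, p(X)$

[Figure: grid of counts with columns X = x_i (i = 1, …, M) and rows Y = y_j (j = 1, …, L); n_ij is the count in cell (i, j), c_i is the total of column i, and r_j is the total of row j.]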

Probability Theory

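$$p(X = x_i, Y = y_j) = \frac{n_{ij}}{N}$$

$$p(X = x_i) = \frac{c_i}{N}, \quad \text{where } c_i = \sum_{j} n_{ij}$$

$$p(X = x_i) = \sum_{j=1}^{L} p(X = x_i, Y = y_j)$$

$$p(X = x_i, Y = y_j) = \frac{n_{ij}}{N} = \frac{n_{ij}}{c_i} \cdot \frac{c_i}{N} = p(Y = y_j \mid X = x_i)\, p(X = x_i)$$

In general form:

$$p(X) = \sum_{Y} p(X, Y)$$

$$p(X, Y) = p(Y \mid X)\, p(X)$$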
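The count-based definitions can be verified numerically. The sketch below builds the joint, marginal, and conditional probabilities from a table of counts and checks that the sum rule and the product rule hold exactly. The count table `n` is made up for illustration and does not come from the slides.

```python
from fractions import Fraction

# Hypothetical counts: n[i][j] = number of trials with X = x_i and Y = y_j.
n = [[3, 1, 2],
     [2, 4, 0]]
M, L = len(n), len(n[0])
N = sum(sum(row) for row in n)  # total number of trials

# Joint probability p(X = x_i, Y = y_j) = n_ij / N
joint = [[Fraction(n[i][j], N) for j in range(L)] for i in range(M)]

# Marginal p(X = x_i) = c_i / N, where c_i = sum_j n_ij
c = [sum(n[i]) for i in range(M)]
p_x = [Fraction(c[i], N) for i in range(M)]

# Conditional p(Y = y_j | X = x_i) = n_ij / c_i
cond = [[Fraction(n[i][j], c[i]) for j in range(L)] for i in range(M)]

# Sum rule: p(X = x_i) equals the joint summed over all values of Y.
sum_rule_holds = all(p_x[i] == sum(joint[i]) for i in range(M))

# Product rule: the joint equals conditional times marginal in every cell.
product_rule_holds = all(
    joint[i][j] == cond[i][j] * p_x[i] for i in range(M) for j in range(L)
)
```

Exact fractions are used instead of floats so that the equalities can be tested without rounding tolerance.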

Probability Theory

• Symmetry property

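$$p(X, Y) = p(Y, X)$$

$$p(Y \mid X)\, p(X) = p(X \mid Y)\, p(Y)$$

$$p(Y \mid X) = \frac{p(X \mid Y)\, p(Y)}{p(X)} \quad \text{(Bayes' theorem)}$$

Special case (X and Y independent):

$$p(X, Y) = p(X)\, p(Y)$$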

Probability Theory

• P(W=u | F=R)=8/32=1/4

• P(W=r | F=R)=24/32=3/4

• P(W=u | F=G)=18/24=3/4

• P(W=r | F=G)=6/24=1/4

P(F=R)=4/10=0.4

P(F=G)=6/10=0.6

[Figure: wheat example showing ripe and unripe wheat for red fungus and gray fungus. The conditional probabilities for each fungus type sum to 1.]

Probability Theory

• p(W=u) = p(W=u|F=R)p(F=R) + p(W=u|F=G)p(F=G) = 1/4*4/10 + 3/4*6/10 = 11/20

• p(W=r) = 1 - 11/20 = 9/20

• p(F=R|W=r) = p(W=r|F=R)p(F=R)/p(W=r) = 3/4*4/10*20/9 = 2/3

• p(F=G|W=r) = 1 - 2/3 = 1/3
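The arithmetic of the wheat example can be reproduced with exact fractions. This is a small sketch using the priors and conditional probabilities stated on the slides; the dictionary layout and function names are illustrative choices.

```python
from fractions import Fraction as Fr

# Prior over fungus type F (R = red, G = gray), as given on the slides.
p_F = {"R": Fr(4, 10), "G": Fr(6, 10)}

# Conditional p(W | F) for wheat state W (u = unripe, r = ripe).
p_W_given_F = {
    "R": {"u": Fr(1, 4), "r": Fr(3, 4)},
    "G": {"u": Fr(3, 4), "r": Fr(1, 4)},
}


def marginal_W(w):
    """Sum and product rules: p(W=w) = sum over f of p(W=w | F=f) p(F=f)."""
    return sum(p_W_given_F[f][w] * p_F[f] for f in p_F)


def posterior_F(f, w):
    """Bayes' theorem: p(F=f | W=w) = p(W=w | F=f) p(F=f) / p(W=w)."""
    return p_W_given_F[f][w] * p_F[f] / marginal_W(w)


print(marginal_W("u"))        # 11/20
print(marginal_W("r"))        # 9/20
print(posterior_F("R", "r"))  # 2/3
print(posterior_F("G", "r"))  # 1/3
```

The two posteriors given W=r sum to 1, which is why the last value can also be obtained as 1 - 2/3.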

Conditional probabilities

• Diverging connection (blocking)

• p(a|b)p(b)=p(a,b)

• p(a|b,c)p(b|c)=p(a,b|c)

• p(b|a)=p(a|b)p(b)/p(a)

• p(b|a,c)=p(a|b,c)p(b|c)/p(a|c)

[Figure: b → a and b → c, shown once with b unobserved and once with b observed.]

p(a,b,c)=p(a|b)p(c|b)p(b)

p(a,c|b)=p(a,b,c)/p(b)=p(a|b)p(c|b)p(b)/p(b)=p(a|b)p(c|b)

a ⊥⊥ c | b

Conditional probabilities

• Serial connection (blocking)

• p(a|b)p(b)=p(a,b)

• p(a|b,c)p(b|c)=p(a,b|c)

• p(b|a)=p(a|b)p(b)/p(a)

• p(b|a,c)=p(a|b,c)p(b|c)/p(a|c)

[Figure: a → b → c, shown once with b unobserved and once with b observed.]

p(a,b,c)=p(a)p(b|a)p(c|b)

p(a,c|b)=p(a,b,c)/p(b)

=p(a)p(b|a)p(c|b)/p(b)

=p(a) {p(a|b)p(b)/p(a)} p(c|b)/p(b)

=p(a|b)p(c|b)

a ⊥⊥ c | b
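The blocking result for the serial connection can be confirmed numerically. The sketch below builds the joint distribution of a chain a → b → c over binary variables and checks that a and c are independent given b but dependent when b is unobserved. All probability tables are hypothetical values chosen for illustration.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical tables for binary variables (values 0 and 1).
p_a = {0: Fr(3, 10), 1: Fr(7, 10)}
p_b_given_a = {0: {0: Fr(4, 5), 1: Fr(1, 5)},   # indexed as [a][b]
               1: {0: Fr(1, 4), 1: Fr(3, 4)}}
p_c_given_b = {0: {0: Fr(9, 10), 1: Fr(1, 10)},  # indexed as [b][c]
               1: {0: Fr(2, 5), 1: Fr(3, 5)}}

# Joint of the serial connection: p(a,b,c) = p(a) p(b|a) p(c|b)
joint = {
    (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    for a, b, c in product((0, 1), repeat=3)
}


def marg(**fixed):
    """Sum the joint over all entries that agree with the fixed values."""
    return sum(
        p for (a, b, c), p in joint.items()
        if all({"a": a, "b": b, "c": c}[name] == value
               for name, value in fixed.items())
    )


# Conditional independence: p(a,c|b) == p(a|b) p(c|b) for every assignment.
indep_given_b = all(
    marg(a=a, b=b, c=c) / marg(b=b)
    == (marg(a=a, b=b) / marg(b=b)) * (marg(b=b, c=c) / marg(b=b))
    for a, b, c in product((0, 1), repeat=3)
)

# Marginal independence: p(a,c) == p(a) p(c) for every assignment.
indep_marginally = all(
    marg(a=a, c=c) == marg(a=a) * marg(c=c)
    for a, c in product((0, 1), repeat=2)
)
```

For these tables `indep_given_b` is true and `indep_marginally` is false, matching the statement that instantiating b blocks communication between a and c.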

Conditional probabilities

• Converging connection (opening)

• p(a|b)p(b)=p(a,b)

• p(a|b,c)p(b|c)=p(a,b|c)

• p(b|a)=p(a|b)p(b)/p(a)

• p(b|a,c)=p(a|b,c)p(b|c)/p(a|c)

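[Figure: a → b ← c, shown once with b unobserved and once with b observed.]

p(a,b,c)=p(a)p(c)p(b|a,c)

p(a,c|b)=p(a,b,c)/p(b)=p(a)p(c)p(b|a,c)/p(b)

a ⊥⊥ c | ∅ (a and c are independent when nothing is observed), but in general a and c are not independent given b, because p(a,c|b) does not factorize into p(a|b)p(c|b).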
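The opening behaviour of the converging connection, including the explaining-away effect, can be illustrated with a small numerical sketch. The structure a → b ← c is taken from the slide; the probability values are hypothetical and were chosen so that either cause makes b likely.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical priors for two independent binary causes a and c.
p_a = {0: Fr(9, 10), 1: Fr(1, 10)}
p_c = {0: Fr(4, 5), 1: Fr(1, 5)}

# Hypothetical p(b=1 | a, c), indexed by (a, c).
p_b1 = {(0, 0): Fr(1, 20), (0, 1): Fr(4, 5),
        (1, 0): Fr(9, 10), (1, 1): Fr(19, 20)}


def joint(a, b, c):
    """Converging connection: p(a,b,c) = p(a) p(c) p(b|a,c)."""
    p_b = p_b1[(a, c)] if b == 1 else 1 - p_b1[(a, c)]
    return p_a[a] * p_c[c] * p_b


# With b unobserved, summing b out leaves p(a,c) = p(a) p(c).
marginally_independent = all(
    sum(joint(a, b, c) for b in (0, 1)) == p_a[a] * p_c[c]
    for a, c in product((0, 1), repeat=2)
)

# Posterior of cause a after observing the effect b=1.
p_b_is_1 = sum(joint(a, 1, c) for a, c in product((0, 1), repeat=2))
p_a1_given_b1 = sum(joint(1, 1, c) for c in (0, 1)) / p_b_is_1

# Posterior of cause a after also learning that the other cause c=1 is present.
p_a1_given_b1_c1 = joint(1, 1, 1) / sum(joint(a, 1, 1) for a in (0, 1))

print(p_a1_given_b1)     # 91/271
print(p_a1_given_b1_c1)  # 19/163
```

Observing b=1 raises the probability of a=1 from 1/10 to 91/271, and additionally learning c=1 lowers it again to 19/163: the second cause explains the evidence away.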

Graphical Models

• We need probability theory to quantify the uncertainty. All the probabilistic inference can be expressed with the sum and the product rule.

p(a,b,c)=p(c|a,b)p(a,b)

p(a,b,c)=p(c|a,b)p(b|a)p(a)

[Figure: DAG over a, b, c with links a → b, a → c, and b → c.]

p(x1,x2,…,xK-1,xK)=p(xK|x1,…,xK-1)…p(x2|x1)p(x1)

Graphical Models

• DAG explaining joint distribution of x1,…x7

• The joint distribution defined by a graph is given by the product, over all of the nodes of the graph, of a conditional distribution for each node conditioned on the variables corresponding to the parents of that node in the graph.

$$p(x_1, \ldots, x_7) = p(x_1)\, p(x_2)\, p(x_3)\, p(x_4 \mid x_1, x_2, x_3)\, p(x_5 \mid x_1, x_3)\, p(x_6 \mid x_4)\, p(x_7 \mid x_5)$$

[Figure: DAG over the nodes x1, …, x7.]

$$p(\mathbf{x}) = \prod_{k=1}^{K} p(x_k \mid \mathrm{pa}_k)$$
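The factorization over parents translates directly into code. The following sketch assumes binary variables and the parent sets read from the factorization on this slide (x4 ← {x1, x2, x3}, x5 ← {x1, x3}, x6 ← {x4}, x7 ← {x5}). The conditional table in `p_node` is a made-up rule, used only so that the example runs.

```python
from fractions import Fraction as Fr
from itertools import product

# Parent sets pa_k, read from the factorization of p(x1, ..., x7).
parents = {
    "x1": [], "x2": [], "x3": [],
    "x4": ["x1", "x2", "x3"],
    "x5": ["x1", "x3"],
    "x6": ["x4"],
    "x7": ["x5"],
}


def p_node(value, parent_values):
    """Hypothetical table: p(x_k=1 | pa_k) = (1 + number of parents at 1) / (2 + number of parents)."""
    p_one = Fr(1 + sum(parent_values), 2 + len(parent_values))
    return p_one if value == 1 else 1 - p_one


def joint(assignment):
    """p(x) = product over k of p(x_k | pa_k) for a full assignment dict."""
    prob = Fr(1)
    for node, pa in parents.items():
        prob *= p_node(assignment[node], [assignment[q] for q in pa])
    return prob


# A valid factorization sums to 1 over all 2**7 assignments.
names = list(parents)
total = sum(
    joint(dict(zip(names, values)))
    for values in product((0, 1), repeat=len(names))
)
```

Storing one small table per node instead of one table over all seven variables is the practical benefit of the factorization: the number of parameters grows with the size of the parent sets rather than with the total number of variables.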