Bayesian Inference
Artificial Intelligence (CMSC 25000)
February 26, 2002
Agenda
• Motivation
  – Reasoning with uncertainty
• Medical Informatics
• Probability and Bayes' Rule
  – Bayesian Networks
  – Noisy-Or
• Decision Trees and Rationality
• Conclusions
Motivation
• Uncertainty in medical diagnosis
  – Diseases produce symptoms
  – In diagnosis, observed symptoms => disease ID
  – Uncertainties
    • Symptoms may not occur
    • Symptoms may not be reported
    • Diagnostic tests not perfect
      – False positive, false negative
• How do we estimate confidence?
Motivation II
• Uncertainty in medical decision-making
  – Physicians, patients must decide on treatments
  – Treatments may not be successful
  – Treatments may have unpleasant side effects
• Choosing treatments
  – Weigh risks of adverse outcomes
• People are BAD at reasoning intuitively about probabilities
  – Provide systematic analysis
Probabilities Model Uncertainty
• The World - Features
  – Random variables {X1, X2, ..., Xn}
  – Feature values: Xi takes values {x_i1, x_i2, ..., x_ik_i}
• States of the world
  – Assignments of values to all variables
  – Exponential in # of variables
  – # possible states = Π_{i=1}^{n} k_i
  – If all k_i = 2: 2^n possible states
Probabilities of World States
• P(S_i): Joint probability of one complete assignment (state S_i)
  – States are distinct and exhaustive
  – Probabilities of all states sum to 1: Σ_{j=1}^{Π_{i=1}^{n} k_i} P(S_j) = 1
• Typically care about SUBSET of assignments
  – aka "Circumstance"
  – e.g. P(X2=t, X4=f) = Σ_{u∈{t,f}} Σ_{v∈{t,f}} P(X1=u, X2=t, X3=v, X4=f)
  – Exponential in # of don't cares
A Simpler World
• 2^n world states = Maximum entropy
  – Know nothing about the world
• Many variables independent
  – P(strep, ebola) = P(strep) P(ebola)
• Conditionally independent
  – Depend on same factors but not on each other
  – P(fever, cough | flu) = P(fever | flu) P(cough | flu)
Probabilistic Diagnosis
• Question:
  – How likely is a patient to have a disease if they have the symptoms?
• Probabilistic Model: Bayes' Rule
  – P(D|S) = P(S|D)P(D)/P(S)
  – Where
    • P(S|D): Probability of symptoms given disease
    • P(D): Prior probability of having disease
    • P(S): Prior probability of having symptoms
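The rule above can be sketched in Python; the disease/symptom numbers here are invented for illustration, not from the slides:

```python
def posterior(p_s_given_d, p_d, p_s_given_not_d):
    """P(D|S) via Bayes' rule, expanding P(S) by total probability."""
    p_s = p_s_given_d * p_d + p_s_given_not_d * (1 - p_d)  # P(S)
    return p_s_given_d * p_d / p_s

# Hypothetical disease: 1% prior; symptom appears in 90% of cases,
# but also in 5% of healthy patients.
print(round(posterior(0.90, 0.01, 0.05), 3))  # 0.154
```

Even strong evidence (90% vs. 5%) only lifts a 1% prior to about 15%: the prior matters.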
Modeling (In)dependence
• Bayesian network
  – Nodes = Variables
  – Arcs = Child depends on parent(s)
• No arcs = independent (0 incoming: only a priori)
• Parents of X = π(X)
• For each X need P(X | π(X))
Simple Bayesian Network
• MCBN1

      A
     / \
    B   C
     \ / \
      D   E

  A = only a priori
  B depends on A
  C depends on A
  D depends on B, C
  E depends on C

  Need:       Truth table size:
  P(A)        2
  P(B|A)      2*2
  P(C|A)      2*2
  P(D|B,C)    2*2*2
  P(E|C)      2*2
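A sketch of how MCBN1's factored joint would be computed; the CPT values are invented for illustration, and only the network structure comes from the slide:

```python
from itertools import product

# Invented CPT numbers; structure is MCBN1's.
P_A = 0.3                                    # P(A=t)
P_B_given_A = {True: 0.8, False: 0.1}        # P(B=t | A)
P_C_given_A = {True: 0.5, False: 0.2}        # P(C=t | A)
P_D_given_BC = {(True, True): 0.9, (True, False): 0.6,
                (False, True): 0.7, (False, False): 0.05}  # P(D=t | B,C)
P_E_given_C = {True: 0.4, False: 0.1}        # P(E=t | C)

def bern(p_true, value):
    """P(X = value) for a binary variable with P(X=t) = p_true."""
    return p_true if value else 1 - p_true

def joint(a, b, c, d, e):
    """P(A,B,C,D,E) = P(A) P(B|A) P(C|A) P(D|B,C) P(E|C)."""
    return (bern(P_A, a) * bern(P_B_given_A[a], b) * bern(P_C_given_A[a], c)
            * bern(P_D_given_BC[(b, c)], d) * bern(P_E_given_C[c], e))

# Sanity check: the 2^5 = 32 joint entries sum to 1.
total = sum(joint(*vals) for vals in product([True, False], repeat=5))
print(round(total, 10))  # 1.0
```

Note that the factorization needs only 2 + 4 + 4 + 8 + 4 table entries rather than a full 2^5-entry joint.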
Simplifying with Noisy-OR
• How many computations?
  – p = # parents; k = # values per variable
  – (k-1) * k^p entries per conditional probability table
  – Very expensive! 10 binary parents: 2^10 = 1024
• Reduce computation by simplifying model
  – Treat each parent as possible independent cause
  – Only 11 computations
    • 10 causal probabilities + "leak" probability
      – "Some other cause"
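The noisy-OR combination rule can be written as a small function (a sketch; the parameter values below are arbitrary):

```python
def noisy_or(cause_probs, active, leak):
    """P(effect) under noisy-OR: each active parent i independently
    causes the effect with probability cause_probs[i]; leak covers
    'some other cause'."""
    p_no_effect = 1 - leak
    for c, on in zip(cause_probs, active):
        if on:
            p_no_effect *= 1 - c  # parent i fails to cause the effect
    return 1 - p_no_effect

# 10 binary parents: 11 numbers (10 causal strengths + leak)
# replace a 2^10-row truth table.
c = [0.2] * 10
p = noisy_or(c, [True] * 10, leak=0.1)
print(round(p, 4))  # 0.9034
```

With no parent active, the function returns just the leak probability, matching Pn(effect | all parents off) = l.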
Noisy-OR Example
A → B

Pn(b|a)   = 1 - (1-ca)(1-l)
Pn(¬b|a)  = (1-ca)(1-l)
Pn(b|¬a)  = 1 - (1-l) = l = 0.5
Pn(¬b|¬a) = (1-l)

Observed table:
P(B|A)    b     ¬b
  a      0.6   0.4
  ¬a     0.5   0.5

Solve for ca:
Pn(b|a) = 1 - (1-ca)(1-l) = 0.6
(1-ca)(1-l) = 0.4
(1-ca) = 0.4/(1-l) = 0.4/0.5 = 0.8
ca = 0.2
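The algebra above can be checked mechanically (numbers from the slide):

```python
# Leak l = P(b|¬a) = 0.5 and observed P(b|a) = 0.6, from the table above.
p_b_given_a = 0.6
leak = 0.5

# Invert the noisy-OR equation P(b|a) = 1 - (1 - ca)(1 - l):
ca = 1 - (1 - p_b_given_a) / (1 - leak)
print(round(ca, 2))  # 0.2
```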
Noisy-OR Example II

A → C ← B

Full model: P(c|ab), P(c|a¬b), P(c|¬ab), P(c|¬a¬b) & negations

Noisy-OR: ca, cb, l
Pn(c|ab)   = 1 - (1-ca)(1-cb)(1-l)
Pn(c|¬ab)  = 1 - (1-cb)(1-l)
Pn(c|a¬b)  = 1 - (1-ca)(1-l)
Pn(c|¬a¬b) = 1 - (1-l) = l

Assume:
P(a) = 0.1
P(b) = 0.05
P(c|¬a¬b) = 0.3 = l
ca = 0.5
P(c|b) = 0.7

Solve for cb:
Pn(c|b) = Pn(c|ab)P(a) + Pn(c|¬ab)P(¬a)
1 - 0.7 = (1-ca)(1-cb)(1-l)(0.1) + (1-cb)(1-l)(0.9)
0.3 = 0.5(1-cb)(0.7)(0.1) + (1-cb)(0.7)(0.9)
    = 0.035(1-cb) + 0.63(1-cb) = 0.665(1-cb)
cb ≈ 0.55
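The same solve can be verified numerically (all givens taken from the slide):

```python
# Givens: leak l, causal strength ca, prior P(a), observed marginal P(c|b).
l, ca, p_a = 0.3, 0.5, 0.1
p_c_given_b = 0.7

# P(¬c|b) = (1-ca)(1-cb)(1-l)·P(a) + (1-cb)(1-l)·P(¬a) = (1-cb)·coeff
coeff = (1 - ca) * (1 - l) * p_a + (1 - l) * (1 - p_a)   # 0.665
cb = 1 - (1 - p_c_given_b) / coeff
print(round(cb, 2))  # 0.55
```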
Graph Models
• Bipartite graphs
  – E.g. medical reasoning
  – Generally, diseases cause symptoms (not the reverse)
  [Figure: bipartite graph with disease nodes d1–d4, each linked by arcs to symptom nodes s1–s6]
Topologies
• Generally more complex
  – Polytree: at most one path between any two nodes
• General Bayes Nets
  – Graphs with undirected cycles
  – No directed cycles - a variable can't be its own cause
• Issue: Automatic net acquisition
  – Update probabilities by observing data
  – Learn topology: use statistical evidence of independence, heuristic search to find most probable structure
Decision Making
• Design model of rational decision making
  – Maximize expected value among alternatives
• Uncertainty from
  – Outcomes of actions
  – Choices taken
• To maximize outcome
  – Select maximum over choices
  – Weighted average value of chance outcomes
Gangrene Example
Decision 1: Medicine vs. Amputate foot
  • Amputate foot:
    – Live 0.99 : 850
    – Die  0.01 : 0
  • Medicine:
    – Full Recovery 0.7  : 1000
    – Worse         0.25 : go to Decision 2
    – Die           0.05 : 0

Decision 2 (if worse): Medicine vs. Amputate leg
  • Amputate leg:
    – Live 0.98 : 700
    – Die  0.02 : 0
  • Medicine:
    – Live 0.6 : 995
    – Die  0.4 : 0
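Folding back the tree above (averaging at chance nodes, maximizing at decision nodes) with the slide's probabilities and values:

```python
def expected(outcomes):
    """Average value at a chance node: outcomes is a list of (prob, value)."""
    return sum(p * v for p, v in outcomes)

# Decision 2, reached only if the patient gets worse: max over the two choices.
amputate_leg = expected([(0.98, 700), (0.02, 0)])    # 686
medicine_again = expected([(0.6, 995), (0.4, 0)])    # 597
worse_value = max(amputate_leg, medicine_again)      # amputate leg wins

# Decision 1: fold the subtree's value back into the "worse" branch.
amputate_foot = expected([(0.99, 850), (0.01, 0)])
medicine = expected([(0.7, 1000), (0.25, worse_value), (0.05, 0)])

print(round(medicine, 1), round(amputate_foot, 1))  # 871.5 841.5
```

So the rational plan is: try medicine first (871.5 vs. 841.5), and amputate the leg if the condition worsens.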
Decision Tree Issues
• Problem 1: Tree size
  – k activities: 2^k orders
• Solution 1: Hill-climbing
  – Choose best apparent choice after one step
    • Use entropy reduction
• Problem 2: Utility values
  – Difficult to estimate; sensitivity; duration
  – Change value depending on phrasing of question
• Solution 2c: Model effect of outcome over lifetime
Conclusion
• Reasoning with uncertainty
  – Many real systems uncertain - e.g. medical diagnosis
• Bayes' Nets
  – Model (in)dependence relations in reasoning
  – Noisy-OR simplifies model/computation
    • Assumes causes independent
• Decision Trees
  – Model rational decision making
  – Maximize outcome: max over choices, average over chance outcomes
Holmes Example (Pearl)
Holmes is worried that his house will be burgled. For the time period of interest, there is a 10^-4 a priori chance of this happening, and Holmes has installed a burglar alarm to try to forestall this event. The alarm is 95% reliable in sounding when a burglary happens, but also has a false positive rate of 1%. Holmes' neighbor, Watson, is 90% sure to call Holmes at his office if the alarm sounds, but he is also a bit of a practical joker and, knowing Holmes' concern, might (30%) call even if the alarm is silent. Holmes' other neighbor Mrs. Gibbons is a well-known lush and often befuddled, but Holmes believes that she is four times more likely to call him if there is an alarm than not.
Holmes Example: Model
There are four binary random variables:
B: whether Holmes' house has been burgled
A: whether his alarm sounded
W: whether Watson called
G: whether Gibbons called

Network: B → A; A → W; A → G
Holmes Example: Tables
P(B):
       B=#t     B=#f
       0.0001   0.9999

P(A|B):
  B    A=#t   A=#f
  #t   0.95   0.05
  #f   0.01   0.99

P(W|A):
  A    W=#t   W=#f
  #t   0.90   0.10
  #f   0.30   0.70

P(G|A):
  A    G=#t   G=#f
  #t   0.40   0.60
  #f   0.10   0.90
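With these tables, the network is small enough for exact inference by enumeration; for instance, P(Burglary | Watson called):

```python
from itertools import product

# Tables from the slides: B → A, A → {W, G}, all binary.
P_A_given_B = {True: 0.95, False: 0.01}   # P(A=t | B)
P_W_given_A = {True: 0.90, False: 0.30}   # P(W=t | A)
P_G_given_A = {True: 0.40, False: 0.10}   # P(G=t | A)

def bern(p_true, val):
    """P(X = val) for a binary variable with P(X=t) = p_true."""
    return p_true if val else 1 - p_true

def joint(b, a, w, g):
    """P(B,A,W,G) = P(B) P(A|B) P(W|A) P(G|A)."""
    return (bern(0.0001, b) * bern(P_A_given_B[b], a)
            * bern(P_W_given_A[a], w) * bern(P_G_given_A[a], g))

# P(B=t | W=t): sum out A and G, then normalize by P(W=t).
num = sum(joint(True, a, True, g) for a, g in product([True, False], repeat=2))
den = sum(joint(b, a, True, g) for b, a, g in product([True, False], repeat=3))
print(num / den)  # ≈ 0.00028
```

Because Watson's 30% false-alarm rate makes his call weak evidence, it only raises the 10^-4 prior roughly threefold.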