Software Engineering Laboratory
Introduction of Bayesian Network
4/20/2005
CSE634 Data Mining, Prof. Anita Wasilewska
105269827 Hiroo Kusaba

References
[1] D. Heckerman: "A Tutorial on Learning with Bayesian Networks", in "Learning in Graphical Models", ed. M. I. Jordan, The MIT Press, 1998.
[2] http://www.cs.huji.ac.il/~nir/Nips01-Tutorial/
[3] Jiawei Han: "Data Mining: Concepts and Techniques", ISBN 1-55860-489-8
[4] J. Whittaker: "Graphical Models in Applied Multivariate Statistics", John Wiley and Sons, 1990.

Contents
- Brief introduction
- Review
- A little review of probability
- Bayes theorem
- Bayesian Classification
- Steps of using Bayesian Network

- Random variables: X, Y, X_i, Θ (capitals)
- Condition (or value) of a variable: x, y, x_i, θ (lowercase)
- Set of variables: X, Y, X_i, Θ (bold capitals)
- Set of conditions (or values): x, y, x_i, θ (bold lowercase)
- P(x|a): probability that an event x occurs (or happens) under the condition a

What is a Bayesian Network?
- A network that expresses the dependencies among the random variables
- Each node has a conditional probability that depends on its parent random variable(s)
- The whole network also expresses the joint probability distribution of all the random variables
- Pa_i is the parent(s) of node i

$$\mathbf{X} = \{x_1, x_2, \ldots, x_n\}$$
$$p(\mathbf{X}) = \prod_{i=1}^{n} p(x_i \mid Pa_i)$$

How is it used?
- Bayesian Learning
  - Estimating the dependencies between the random variables from the actual data
- Bayesian Inference
  - When some of the random variables are given (observed), it calculates the probabilities of the others
  - Example: with a patient's condition as a random variable, it predicts the disease from the condition
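
To make the two uses concrete, here is a minimal Python sketch on a two-node network, disease → symptom. The ten records are hypothetical, and the "learning" step is a plain relative-frequency estimate rather than a full Bayesian treatment with priors, so read it as an illustration under those assumptions.

```python
from collections import Counter

# Hypothetical (disease, symptom) observations; 1 = present, 0 = absent.
records = [
    (1, 1), (1, 1), (1, 1), (1, 0),
    (0, 1), (0, 1), (0, 0), (0, 0), (0, 0), (0, 0),
]

# Learning: estimate P(disease) and P(symptom | disease) by relative frequency.
disease_counts = Counter(d for d, _ in records)
pair_counts = Counter(records)
p_disease = {d: disease_counts[d] / len(records) for d in (0, 1)}
p_symptom_given = {
    (s, d): pair_counts[(d, s)] / disease_counts[d]  # key: (symptom, disease)
    for d in (0, 1)
    for s in (0, 1)
}

# Inference: observe symptom = 1 and compute P(disease = 1 | symptom = 1).
evidence = sum(p_disease[d] * p_symptom_given[(1, d)] for d in (0, 1))
posterior = p_disease[1] * p_symptom_given[(1, 1)] / evidence
print(f"P(disease=1 | symptom=1) = {posterior:.3f}")
```

With these made-up records the prior P(disease=1) is 0.4, and observing the symptom raises it to 0.6.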

What is so good about it?
- Conditional independencies and graphical expression capture the structure of many real-world distributions. [1]
- A learned model can be used for many tasks
- Supports all the features of probabilistic learning
  - Model selection criteria
  - Dealing with missing data and hidden variables

Example of Bayesian Network
- Structure of a network: X → Y → Z
- Conditional probability
  - X, Y, Z are random variables which take either 0 or 1
  - p(X), p(Y|X), p(Z|Y)

X  P(X)
0  0.5
1  0.5

X  Y  P(Y|X)
0  0  0.1
0  1  0.9
1  0  0.2
1  1  0.8

Y  Z  P(Z|Y)
0  0  0.3
0  1  0.7
1  0  0.4
1  1  0.6

Example of Bayesian Network 2
- What is the joint probability P(X, Y, Z)?
- P(X, Y, Z) = P(X) * P(Y|X) * P(Z|Y)

X  Y  Z  P(X,Y,Z)
0  0  0  0.015
0  0  1  0.035
0  1  0  0.180
0  1  1  0.270
1  0  0  0.030
1  0  1  0.070
1  1  0  0.160
1  1  1  0.240
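
The joint table can be reproduced mechanically from the three conditional tables. The following Python sketch does that; the dictionary layout and variable names are illustrative choices, not part of the slides.

```python
from itertools import product

# Tables from the example: P(X), P(Y|X), P(Z|Y).
p_x = {0: 0.5, 1: 0.5}
p_y_given_x = {(0, 0): 0.1, (1, 0): 0.9, (0, 1): 0.2, (1, 1): 0.8}  # key: (y, x)
p_z_given_y = {(0, 0): 0.3, (1, 0): 0.7, (0, 1): 0.4, (1, 1): 0.6}  # key: (z, y)

# P(X, Y, Z) = P(X) * P(Y|X) * P(Z|Y) for every assignment.
joint = {
    (x, y, z): p_x[x] * p_y_given_x[(y, x)] * p_z_given_y[(z, y)]
    for x, y, z in product((0, 1), repeat=3)
}

for (x, y, z), p in joint.items():
    print(x, y, z, f"{p:.3f}")

# A joint distribution must sum to 1.
total = sum(joint.values())
```

Three small tables (2 + 4 + 4 entries) are enough to recover all 8 joint entries, which is the saving the factorization gives.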

A little review of probability 1
- Probability: how likely is it that an event will happen?
- Sample space S
  - An element of S is an elementary event
  - An event A is a subset of S
- P(A) ≥ 0
- P(S) = 1

A little review of probability 2
- Discrete probability distribution
  P(A) = Σ_{s ∈ A} P(s)
- Conditional probability distribution
  P(A|B) = P(A, B) / P(B)
- If the events are independent
  P(A, B) = P(A) * P(B)
- Bayes theorem

$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)} = \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(\bar{A})\,P(B \mid \bar{A})}$$

Bayes Theorem

$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{P(B)} = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B \mid A_j)}$$

Example of Bayes Theorem
- You are about to be tested for a rare disease. How worried should you be if the test result is positive?
- Accuracy of the test: P(T) = 85%
- Chance of infection: P(I) = 0.01%
- What is P(I | not T)?
- http://www.gametheory.net/Mike/applets/Bayes/Bayes.html
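
The question about a positive result can be worked through with Bayes theorem. The slide does not say what "accuracy" means, so the Python sketch below assumes the test is correct 85% of the time for both infected and uninfected people (sensitivity = specificity = 0.85). The function and parameter names are illustrative.

```python
def posterior_infected(prior, sensitivity, specificity):
    """P(infected | positive test) by Bayes theorem."""
    true_positive = prior * sensitivity
    false_positive = (1.0 - prior) * (1.0 - specificity)
    return true_positive / (true_positive + false_positive)


# Prior P(I) = 0.01% = 0.0001; the 85% figure is used for both rates (assumption).
p = posterior_infected(prior=0.0001, sensitivity=0.85, specificity=0.85)
print(f"P(I | positive) = {p:.6f}")
```

Under these assumptions the probability of infection after a positive result is still only about 0.06%, because the disease is so rare that false positives far outnumber true positives.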

Bayesian Classification
- Suppose that there are m classes, C_1, C_2, ..., C_m
- Given an unknown data sample X, the Bayesian classifier assigns X to the class C_i if and only if

$$P(C_i \mid X) > P(C_j \mid X) \quad \text{for } 1 \le j \le m,\ j \ne i$$

$$P(C_i \mid X) = \frac{P(X \mid C_i)\,P(C_i)}{P(X)}$$

- We have to maximize P(X|C_i) P(C_i)
- In order to reduce computation, the assumption of class conditional independence is made

$$P(X \mid C_i) = \prod_{k=1}^{n} P(x_k \mid C_i)$$

Example of Bayesian Classification in the textbook [3]
- A customer is under 30, has "medium" income, is a student, and has a "fair" credit rating. Which category does the customer belong to: buy or not?
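
A minimal Python sketch of this kind of classification follows. The eight training rows are hypothetical and are not the textbook's table; only the attribute names and the query customer follow the slide. It scores each class by P(C) times the product of P(x_k | C), with no smoothing of zero counts.

```python
from collections import Counter, defaultdict

# Hypothetical training rows: (age, income, student, credit_rating, buys).
rows = [
    ("<=30", "high", "no", "fair", "no"),
    ("<=30", "medium", "yes", "fair", "yes"),
    ("<=30", "low", "yes", "fair", "yes"),
    ("31..40", "medium", "no", "excellent", "yes"),
    (">40", "medium", "no", "excellent", "no"),
    (">40", "low", "yes", "fair", "yes"),
    ("<=30", "medium", "no", "excellent", "no"),
    ("31..40", "high", "yes", "fair", "yes"),
]


def train(rows):
    """Count classes and, per (attribute index, class), attribute values."""
    class_counts = Counter(label for *_, label in rows)
    value_counts = defaultdict(Counter)
    for *features, label in rows:
        for k, value in enumerate(features):
            value_counts[(k, label)][value] += 1
    return class_counts, value_counts


def class_scores(x, class_counts, value_counts):
    """Return P(C) * prod_k P(x_k | C) for every class C (unnormalized)."""
    total = sum(class_counts.values())
    result = {}
    for c, n_c in class_counts.items():
        score = n_c / total
        for k, value in enumerate(x):
            score *= value_counts[(k, c)][value] / n_c
        result[c] = score
    return result


class_counts, value_counts = train(rows)
query = ("<=30", "medium", "yes", "fair")
scores = class_scores(query, class_counts, value_counts)
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

P(X) is the same for every class, so comparing the unnormalized scores is enough to pick the class. In this toy data no non-buyer is a student, so the "no" score collapses to zero; a real implementation would usually smooth the counts to avoid that.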

Bayesian Network
- A network that expresses the dependencies among the random variables
- The whole network also expresses the joint probability distribution of all the random variables
- Pa_i is the parent(s) of node i

$$\mathbf{X} = \{x_1, x_2, \ldots, x_n\}$$
$$p(\mathbf{X}) = \prod_{i=1}^{n} p(x_i \mid Pa_i)$$

[Figure: network over X, Y, Z]

$$p(\mathbf{x}) = \prod_{i=1}^{n} p(x_i \mid x_1, x_2, \ldots, x_{i-1})$$
$$p(x_i \mid x_1, x_2, \ldots, x_{i-1}) = p(x_i \mid Pa_i)$$

- Pa_i is a subset of {x_1, ..., x_{i-1}}
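
The step from the chain rule to the parent-only form can be checked numerically on the earlier X, Y, Z example, where P(X, Y, Z) = P(X) P(Y|X) P(Z|Y). The Python sketch below rebuilds that joint distribution and confirms that P(Z | X, Y) equals P(Z | Y); the helper name cond_z is illustrative.

```python
from itertools import product

# Tables from the X, Y, Z example.
p_x = {0: 0.5, 1: 0.5}
p_y_given_x = {(0, 0): 0.1, (1, 0): 0.9, (0, 1): 0.2, (1, 1): 0.8}  # key: (y, x)
p_z_given_y = {(0, 0): 0.3, (1, 0): 0.7, (0, 1): 0.4, (1, 1): 0.6}  # key: (z, y)

joint = {
    (x, y, z): p_x[x] * p_y_given_x[(y, x)] * p_z_given_y[(z, y)]
    for x, y, z in product((0, 1), repeat=3)
}


def cond_z(z, y, x=None):
    """P(Z=z | Y=y) if x is None, else P(Z=z | X=x, Y=y), from the joint."""
    xs = (0, 1) if x is None else (x,)
    numerator = sum(joint[(xv, y, z)] for xv in xs)
    denominator = sum(joint[(xv, y, zv)] for xv in xs for zv in (0, 1))
    return numerator / denominator


# Knowing X adds nothing once Y is known: Z's only parent is Y.
for x, y, z in product((0, 1), repeat=3):
    assert abs(cond_z(z, y, x) - cond_z(z, y)) < 1e-9
print("p(z | x, y) == p(z | y) for all assignments")
```

Here Pa_Z = {Y}, a subset of the predecessors {X, Y}, which is exactly the simplification the second equation expresses.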

Steps to apply Bayesian Network
- Step 1: Create a Bayesian belief network
  - Include all the variables that are important in your system
  - Use causal knowledge to guide the connections made in the graph
  - Use your prior knowledge to specify the conditional distributions
- Step 2: Calculate p(x_i | pa_i) for your goal

Example from [1]
- Example of making a BN from prior knowledge
- A BN to detect credit card fraud
- Define the random variables
  - Fraud (F): whether the owner is a fraud
  - Gas (G): bought gas in the last 24 hours
  - Jewelry (J): bought jewelry in the last 24 hours
  - Age (A): age of the owner of the card
  - Sex (S): gender of the owner of the card
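
Give an ordering to the random variables
- Define the dependencies, but you have to be careful.

[Figure: network over F, A, S, G, J for the first set of assumptions]

$$p(a \mid f) = p(a)$$
$$p(s \mid f, a) = p(s)$$
$$p(g \mid f, a, s) = p(g \mid f)$$
$$p(j \mid f, a, s, g) = p(j \mid f, a, s)$$

[Figure: network over F, A, S, G, J for the second set of assumptions]

$$p(a \mid f) = p(a \mid f)$$
$$p(s \mid f, a) = p(s)$$
$$p(g \mid f, a, s) = p(g \mid f)$$
$$p(j \mid f, a, s, g) = p(j \mid f)$$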
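
One way to see what an ordering plus independence assumptions gives you is to turn them into parent sets and print the resulting factorization. The Python sketch below uses the ordering F, A, S, G, J with p(a|f) = p(a), p(s|f,a) = p(s), p(g|f,a,s) = p(g|f) and p(j|f,a,s,g) = p(j|f,a,s). The data structure and function name are illustrative.

```python
# Variable ordering and parent sets read off the independence statements.
order = ["f", "a", "s", "g", "j"]
parents = {"f": [], "a": [], "s": [], "g": ["f"], "j": ["f", "a", "s"]}


def factorization(order, parents):
    """Return the joint distribution as a product of p(node | parents)."""
    seen = set()
    factors = []
    for node in order:
        # Each parent set must be a subset of the variables earlier in the ordering.
        if not set(parents[node]) <= seen:
            raise ValueError(f"parents of {node!r} must precede it in the ordering")
        condition = ",".join(parents[node])
        factors.append(f"p({node}|{condition})" if condition else f"p({node})")
        seen.add(node)
    return " ".join(factors)


text = factorization(order, parents)
print(text)
```

Swapping in different parent sets, for example a depending on f and j depending only on f, changes the factorization accordingly, which is why the choice of dependencies needs care.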

Next topic
- Training with Bayesian Network
- Bayes Inference
- If the training data is complete
- If the training data is missing
- Network Evaluation
Thank you for listening.