Learning in Bayesian Networks. Known Structure Complete Data Known Structure Incomplete Data...
Learning in Bayesian Networks
The Learning Problem
• Known Structure, Complete Data
• Known Structure, Incomplete Data
• Unknown Structure, Complete Data
• Unknown Structure, Incomplete Data
Known Structure Complete Data
Known Structure Incomplete Data
Unknown Structure Complete Data
Unknown Structure Incomplete Data
Known Structure
• Method A: CPTs A
• Method B: CPTs B
Known Structure
• Network structure + CPTs A = $\Pr_A$
• Network structure + CPTs B = $\Pr_B$
• Which probability distribution should we choose?
• Common criterion: choose the distribution that maximizes the likelihood of the data.
Known Structure
• Network structure + CPTs A = $\Pr_A$
• Network structure + CPTs B = $\Pr_B$
• Data D: cases $d_1, \ldots, d_m$ (the figure shows $d_1$ through $d_6$)
• Likelihood of data given $\Pr_A$: $\Pr_A(D) = \Pr_A(d_1) \cdots \Pr_A(d_m)$
• Likelihood of data given $\Pr_B$: $\Pr_B(D) = \Pr_B(d_1) \cdots \Pr_B(d_m)$
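The comparison between two candidate distributions can be sketched in a few lines of Python. This is only an illustration: the two-node network A -> B, both sets of CPT values, and the six data cases are invented for the example and do not come from the slides. Log-likelihoods are used instead of raw products to avoid numerical underflow.

```python
import math

def joint(cpts, a, b):
    """Pr(a, b) = Pr(a) * Pr(b | a) for the hypothetical network A -> B."""
    return cpts["A"][a] * cpts["B|A"][a][b]

def log_likelihood(cpts, data):
    """log Pr(D) = sum_j log Pr(d_j), treating the cases as independent."""
    return sum(math.log(joint(cpts, a, b)) for a, b in data)

# Two candidate parameterizations of the same structure (made-up values).
cpts_a = {"A": {0: 0.5, 1: 0.5},
          "B|A": {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}}
cpts_b = {"A": {0: 0.5, 1: 0.5},
          "B|A": {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}}

# Six complete cases (a, b), also made up.
data = [(0, 0), (0, 0), (1, 1), (1, 1), (0, 0), (1, 0)]

ll_a = log_likelihood(cpts_a, data)
ll_b = log_likelihood(cpts_b, data)
preferred = "B" if ll_b > ll_a else "A"
print(f"log Pr_A(D) = {ll_a:.3f}, log Pr_B(D) = {ll_b:.3f}, choose {preferred}")
```

Under the maximum-likelihood criterion, the parameterization with the larger (log-)likelihood is the one to keep.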
Maximizing Likelihood of Data
• Complete data: a unique set of CPTs maximizes the likelihood of the data.
• Incomplete data: in general, no unique set of CPTs maximizes the likelihood of the data.
Known Structure, Complete Data
• Data D (the figure shows cases $d_1$ through $d_6$)
• Estimated parameter:
$$\hat{\theta}_{d \mid bc} = \frac{\mathrm{Count}(dbc; D)}{\mathrm{Count}(bc; D)} = \frac{\text{number of data points } d_i \text{ with } d, b, c}{\text{number of data points } d_i \text{ with } b, c}$$
Known Structure, Complete Data
• Data D (the figure shows cases $d_1$ through $d_6$)
• Estimated parameter:
$$\hat{\theta}_{d \mid bc} = \frac{\mathrm{Count}(dbc; D)}{\mathrm{Count}(bc; D)} = \frac{\sum_{j=1}^{m} I(dbc; d_j)}{\sum_{j=1}^{m} I(bc; d_j)}$$
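With complete data the estimate is a ratio of counts, where the indicator I(x; d_j) is 1 when case d_j agrees with x and 0 otherwise. A minimal sketch, assuming a node D with parents B and C and six invented cases (the function name `ml_estimate` and the data are hypothetical):

```python
from collections import Counter

def ml_estimate(data, child, parents):
    """Return {(child_value, parent_values): Count(child, parents) / Count(parents)}."""
    family_counts = Counter()
    parent_counts = Counter()
    for case in data:
        u = tuple(case[p] for p in parents)
        family_counts[(case[child], u)] += 1   # adds I(dbc; d_j)
        parent_counts[u] += 1                  # adds I(bc; d_j)
    return {(x, u): n / parent_counts[u] for (x, u), n in family_counts.items()}

# Six complete cases over binary variables B, C, D (made-up values).
data = [
    {"B": 1, "C": 1, "D": 1},
    {"B": 1, "C": 1, "D": 0},
    {"B": 1, "C": 1, "D": 1},
    {"B": 0, "C": 1, "D": 0},
    {"B": 1, "C": 0, "D": 1},
    {"B": 1, "C": 1, "D": 1},
]
theta = ml_estimate(data, child="D", parents=["B", "C"])
print(theta[(1, (1, 1))])  # estimate of Pr(D=1 | B=1, C=1)
```

Parent configurations that never occur in the data get no entry at all, which is one reason practical implementations often add pseudo-counts.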
Complexity
• Network with:
– Nodes: n
– Parameters: k
– Data points: m
• Time complexity: O(mk) (straightforward implementation)
• Space complexity: O(k + mn) (parameter count + space for data)
Known Structure, Incomplete Data
• Estimated parameters at iteration i+1 (using the CPTs at iteration i):
$$\hat{\theta}_{d \mid bc} = \frac{\sum_{j=1}^{m} \Pr_i(dbc \mid d_j)}{\sum_{j=1}^{m} \Pr_i(bc \mid d_j)}$$
• $\Pr_0$ corresponds to the initial Bayesian network (random CPTs)
Known Structure, Incomplete Data
• EM Algorithm (Expectation-Maximization):
– Initialize CPTs to random values
– Repeat until convergence:
– Estimate parameters using current CPTs (E-step)
– Update CPTs using estimates (M-step)
EM Algorithm
• Likelihood of data cannot get smaller after an iteration
• Algorithm is not guaranteed to return the network which globally maximizes the likelihood of the data
• It converges only to a local maximum, so random restarts are used
• Algorithm is stopped when
– change in likelihood gets very small
– change in parameters gets very small
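The EM loop and its stopping rule can be illustrated on a deliberately tiny example. The sketch below assumes a hypothetical network A -> B with binary variables in which A is sometimes unobserved (`None`); the data, the starting CPTs, and the tolerance are invented, and a fixed starting point is used in place of random values so that the run is reproducible.

```python
import math

def log_likelihood(data, pa, pb):
    """log Pr(D) for A -> B; a missing A (None) is summed out."""
    total = 0.0
    for a_obs, b in data:
        states = (0, 1) if a_obs is None else (a_obs,)
        total += math.log(sum(pa[a] * pb[a][b] for a in states))
    return total

def em(data, pa, pb, tol=1e-9, max_iter=1000):
    history = [log_likelihood(data, pa, pb)]
    for _ in range(max_iter):
        n_a = [0.0, 0.0]                    # expected counts of A = a
        n_ab = [[0.0, 0.0], [0.0, 0.0]]     # expected counts of (A = a, B = b)
        for a_obs, b in data:
            # E-step: posterior over A for this case under the current CPTs
            if a_obs is None:
                w = [pa[a] * pb[a][b] for a in (0, 1)]
                post = [x / sum(w) for x in w]
            else:
                post = [float(a == a_obs) for a in (0, 1)]
            for a in (0, 1):
                n_a[a] += post[a]
                n_ab[a][b] += post[a]
        # M-step: re-estimate the CPTs from the expected counts
        pa = [n_a[a] / len(data) for a in (0, 1)]
        pb = [[n_ab[a][b] / n_a[a] for b in (0, 1)] for a in (0, 1)]
        history.append(log_likelihood(data, pa, pb))
        if history[-1] - history[-2] < tol:  # change in likelihood is very small
            break
    return pa, pb, history

data = [(0, 0), (0, 0), (1, 1), (None, 1), (None, 0), (1, 1), (None, 1), (0, 1)]
pa, pb, history = em(data, pa=[0.6, 0.4], pb=[[0.7, 0.3], [0.4, 0.6]])
print(len(history) - 1, "iterations, final log-likelihood", round(history[-1], 4))
```

Running `em` from several different starting CPTs and keeping the result with the highest final log-likelihood is the random-restart strategy in its simplest form.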
Complexity
• Network with:
– Nodes: n
– Parameters: k
– Data points: m
– Treewidth: w
• Time complexity (per iteration): $O(m\,k\,n\,2^w)$ (straightforward implementation)
• Space complexity: $O(k + nm + n\,2^w)$ (parameter count + space for data + space for inference)
Collaborative Filtering
• Collaborative Filtering (CF) finds items of interest to a user based on the preferences of other similar users.
– Assumes that human behavior is predictable
Where is it used?
• E-commerce
– Recommend products based on previous purchases or click-stream behavior
– Ex: Amazon.com
• Information sites
– Rate items based on previous user ratings
– Ex: MovieLens, Jester
User   Ratings for four items ("-" = not rated)
John   5  -  3    2
Sam    -  4  1    5
Cindy  3  -  5    -
Bob    5  1  -    -
After CF, Bob's missing ratings are filled in with predictions:
Bob    5  1  3.5  1.7
Memory-based Algorithms
• Use the entire database of user ratings to make predictions.
– Find users with similar voting histories to the active user.
– Use these users’ votes to predict ratings for products not voted on by the active user.
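A rough sketch of a memory-based predictor follows. It reuses the four-user rating table from the earlier example slide; the weighting scheme (cosine similarity between rating vectors, followed by a similarity-weighted average) is an assumption made for illustration, since the slides do not spell out a formula, and it is not expected to reproduce the 3.5 and 1.7 shown there.

```python
import math

ratings = {
    "John":  [5, None, 3, 2],
    "Sam":   [None, 4, 1, 5],
    "Cindy": [3, None, 5, None],
    "Bob":   [5, 1, None, None],
}

def vector_similarity(u, v):
    """Cosine similarity; unrated items contribute nothing to the dot product."""
    dot = sum(x * y for x, y in zip(u, v) if x is not None and y is not None)
    norm_u = math.sqrt(sum(x * x for x in u if x is not None))
    norm_v = math.sqrt(sum(y * y for y in v if y is not None))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict(active, item):
    """Similarity-weighted average of the other users' votes on `item`."""
    num = den = 0.0
    for user, votes in ratings.items():
        if user == active or votes[item] is None:
            continue
        w = vector_similarity(ratings[active], votes)
        num += w * votes[item]
        den += w
    return num / den if den > 0 else None

for item in (2, 3):
    print("Bob, item", item, "->", round(predict("Bob", item), 2))
```

Every prediction scans the whole rating database, which is why prediction time and space for memory-based methods grow with the number of users and titles.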
Model-based Algorithms
• Construct a model from the vote database.
• Use the model to predict the active user’s ratings.
Bayesian Clustering
• Use a Naïve Bayes network to model the vote database.
• m vote variables: one for each title.
– Represent discrete vote values.
• 1 “cluster” variable
– Represents user personalities
Naïve Bayes
[Figure: Naïve Bayes network with the cluster variable C as the parent of the vote variables V1, V2, V3, …, Vm, shown with example CPTs Pr(C) over cluster values c1, …, c4 and Pr(v_k | C).]
• Inference
– Evidence: known votes $v_k$ for titles $k \in I$
– Query: title j for which we need to predict the vote
• Expected value of vote:
$$p_j = \sum_{h=1}^{w} h \cdot \Pr(v_j = h \mid v_k : k \in I)$$
[Figure: Naïve Bayes network with C as the parent of V1, V2, V3, …, Vm.]
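The expected vote can be computed directly from the Naïve Bayes factorization: first the posterior over clusters given the known votes, then the predictive distribution of the queried title. The sketch below is a plain-Python illustration; the function name, the data layout (`cpt[k][c][h-1]` for Pr(V_k = h | c)), and the two-cluster, two-title numbers are all hypothetical.

```python
def expected_vote(prior, cpt, evidence, query):
    """Expected vote for title `query` given known votes `evidence` = {title: vote}.

    prior[c] = Pr(c); cpt[k][c][h - 1] = Pr(V_k = h | c), with votes h = 1..w.
    """
    # Posterior over clusters: Pr(c | evidence) is proportional to
    # Pr(c) * product over known titles k of Pr(v_k | c)
    weights = []
    for c, p_c in enumerate(prior):
        w = p_c
        for k, vote in evidence.items():
            w *= cpt[k][c][vote - 1]
        weights.append(w)
    z = sum(weights)
    posterior = [w / z for w in weights]

    # p_j = sum_h h * Pr(v_j = h | evidence)
    num_values = len(cpt[query][0])
    return sum(
        h * sum(posterior[c] * cpt[query][c][h - 1] for c in range(len(prior)))
        for h in range(1, num_values + 1)
    )

prior = [0.5, 0.5]                  # two clusters
cpt = [
    [[0.9, 0.1], [0.1, 0.9]],       # title 0: Pr(V_0 = 1 or 2 | c)
    [[0.8, 0.2], [0.2, 0.8]],       # title 1: Pr(V_1 = 1 or 2 | c)
]
p = expected_vote(prior, cpt, evidence={0: 1}, query=1)
print(round(p, 2))
```

Because the network is a Naïve Bayes model, this inference costs time proportional to the number of clusters times the number of known votes, with no need for a general-purpose inference engine.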
Learning
• Simplified Expectation Maximization (EM) Algorithm with partial data
• Initialize CPTs with random values, where $\theta_c = \Pr(c)$ and $\theta_{v_k \mid c} = \Pr(v_k \mid c)$, subject to the following constraints:
$$\sum_{c} \theta_c = 1, \qquad \sum_{v_k} \theta_{v_k \mid c} = 1$$
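One simple way to satisfy the normalization constraints is to draw positive random weights and divide by their sum. The sketch below assumes that approach; the function names, the sizes, and the seed are illustrative choices, not details taken from the slides.

```python
import random

def random_distribution(size, rng):
    """Positive random weights normalized so that they sum to 1."""
    weights = [rng.random() + 1e-6 for _ in range(size)]
    total = sum(weights)
    return [w / total for w in weights]

def init_naive_bayes(num_clusters, num_titles, num_vote_values, seed=0):
    """Random CPTs: theta_c[c] = Pr(c), theta_v[k][c][h] = Pr(V_k = h-th value | c)."""
    rng = random.Random(seed)
    theta_c = random_distribution(num_clusters, rng)
    theta_v = [
        [random_distribution(num_vote_values, rng) for _ in range(num_clusters)]
        for _ in range(num_titles)
    ]
    return theta_c, theta_v

theta_c, theta_v = init_naive_bayes(num_clusters=4, num_titles=3, num_vote_values=5)
print([round(x, 3) for x in theta_c])
```

The small offset added to each weight keeps every entry strictly positive, so no cluster or vote value starts with zero probability.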
Datasets
• MovieLens
– 943 users; 1682 titles; 100,000 votes (1..5); explicit voting
• MS Web – website visits
– 610 users; 294 titles; 8,275 votes (0,1); counting null votes as 0 gives 179,340 votes; implicit voting
• Learning curve for MovieLens Dataset
[Figure: Total Absolute Change (y-axis, 0 to 800) versus Iteration (x-axis, 0 to 15).]
Protocols
• User database is divided into: 80% training set and 20% test set.
– One-by-one select a user from the test set to be the active user.
– Predict some of their votes based on remaining votes
• All-But-One: all of the active user's votes except one are evidence (e); the single held-out vote is the query (Q).
• Given-{Two, Five, Ten}: two, five, or ten of the active user's votes are evidence (e); the remaining votes are queries (Q).
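The two protocols differ only in how many of the active user's votes are kept as evidence. A minimal sketch of the split, assuming the evidence votes are chosen at random (the slides do not say how they are selected) and using a hypothetical `split_votes` helper:

```python
import random

def split_votes(votes, protocol, rng):
    """Split one active user's votes {title: vote} into (evidence, query).

    protocol is "all-but-one" or an integer N for Given-N.
    """
    titles = list(votes)
    rng.shuffle(titles)
    n_evidence = len(titles) - 1 if protocol == "all-but-one" else protocol
    if not 0 < n_evidence < len(titles):
        raise ValueError("user has too few votes for this protocol")
    evidence = {t: votes[t] for t in titles[:n_evidence]}
    query = {t: votes[t] for t in titles[n_evidence:]}
    return evidence, query

rng = random.Random(0)
votes = {f"title{i}": (i % 5) + 1 for i in range(12)}   # made-up votes in 1..5
given_two = split_votes(votes, 2, rng)
all_but_one = split_votes(votes, "all-but-one", rng)
print(len(given_two[0]), len(given_two[1]), len(all_but_one[0]), len(all_but_one[1]))
```

Users with too few votes for a given protocol raise an error here; an evaluation harness would more likely skip them.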
Evaluation Metric
• Average Absolute Deviation
$$\frac{1}{|P_a|} \sum_{j \in P_a} \left| p_{a,j} - v_{a,j} \right|$$
• Ranked Scoring
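Average absolute deviation is the mean absolute difference between the predicted and the actual votes over the titles held out for the active user. A small sketch, in which the predicted values echo the earlier Bob example and the "actual" votes are hypothetical:

```python
def average_absolute_deviation(predicted, actual):
    """Mean of |p_j - v_j| over the held-out titles j of one active user."""
    if not predicted:
        raise ValueError("no predictions to score")
    return sum(abs(predicted[j] - actual[j]) for j in predicted) / len(predicted)

predicted = {"t3": 3.5, "t4": 1.7}
actual = {"t3": 4, "t4": 1}        # hypothetical true votes
score = average_absolute_deviation(predicted, actual)
print(round(score, 2))
```

Lower scores are better; a score of 0 would mean every held-out vote was predicted exactly.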
Results
• Experiments were run 5 times and averaged
• MovieLens
Algorithm    Given-Two  Given-Five  Given-Ten  All-But-One
Correlation  1.019      .916        .865       .806
VecSim       .948       .878        .843       .799
BC(9)        .771       .765        .763       .753
• MS Web
Algorithm    Given-Two  Given-Five  Given-Ten  All-But-One
Correlation  0.105      0.0911      0.0844     0.0673
VecSim       0.101      0.0885      0.0818     0.0675
BC(9)        0.0652     0.0652      0.0649     0.0507
Computational Issues
• Prediction time: 10 minutes per experiment (memory-based); 2 minutes (model-based)
• Learning time: 20 minutes per iteration
• n: number of data points; m: number of titles; w: number of votes per title; |C|: number of personality types
Algorithm     Prediction Time  Learning Time  Space
Memory-based  O(n*m)           N/A            O(n*m)
Model-based   O(|C|*m)         O(n*m*|C|*w)   O(|C|*m*w)
Demo of SamIam
• Building networks:
– Nodes, Edges
– CPTs
• Inference:
– Posterior marginals
– MPE
– MAP
• Learning: EM
• Sensitivity Engine