Learning and Reasoning for AI (luc.deraedt/Francqui4ab.pdf)
Roadmap
• Prob. Programming - Modeling
• Inference
• Learning
• Dynamics
• KBMC & Markov Logic
• DeepProbLog
• From StarAI to NeSy
... with some detours on the way
Part V: KBMC, Markov Logic
A key question in AI: how to combine
• dealing with uncertainty (probability theory, graphical models, ...)
• reasoning with relational data (logic, databases, programming, ...)
• learning (parameters, structure)
At their intersection: statistical relational learning & probabilistic programming, ... (so far)
A key question in AI: how to combine
• dealing with uncertainty (probability theory, graphical models, ...)
• reasoning with relational data (logic, databases, programming, ...)
• learning (parameters, structure)
At their intersection: statistical relational learning & probabilistic programming, ... (next)
De Raedt, Kersting, Natarajan, Poole: Statistical Relational AI
Flexible and Compact Relational Model for Predicting Grades
“Program” abstraction:
▪ S, C: logical variables representing students, courses
▪ the set of individuals of a type is called a population
▪ Int(S), Grade(S, C), D(C) are parametrized random variables

Grounding:
• for every student s, there is a random variable Int(s)
• for every course c, there is a random variable D(c)
• for every (s, c) pair there is a random variable Grade(s,c)
• all instances share the same structure and parameters
ProbLog by example: Grading
Shows the relational structure.
Grounded model: replace variables by constants.
Works for any number of students / classes (for 1000 students and 100 classes, you get 101,100 random variables); still only a few parameters.
With SRL / PP you can:
• build and learn compact models,
• transfer from one set of individuals to other sets,
• reason about exchangeability,
• build even more complex models,
• incorporate background knowledge.
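The arithmetic behind the 101,100 figure above can be checked with a short sketch (predicate names Int/D/Grade as on the slide):

```python
# One Int(s) per student, one D(c) per course, one Grade(s, c) per pair:
# the grounded model grows multiplicatively, the program does not.
def grounded_rv_count(n_students, n_courses):
    return n_students + n_courses + n_students * n_courses

print(grounded_rv_count(1000, 100))  # 1000 + 100 + 100000 = 101100
```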
Lots of proposals in the literature, e.g.
• relational Markov networks (RMNs) [Taskar et al 2002]
• Markov logic networks (MLNs) [Richardson & Domingos 2006]
• probabilistic soft logic (PSL) [Broecheler et al 2010]
• FACTORIE [McCallum et al 2009]
• Bayesian logic programs (BLPs) [Kersting & De Raedt 2001]
• relational Bayesian networks (RBNs) [Jaeger 2002]
• logical Bayesian networks (LBNs) [Fierens et al 2005]
• probabilistic relational models (PRMs) [Koller & Pfeffer 1998]
• Bayesian logic (BLOG) [Milch et al 2005]
• CLP(BN) [Santos Costa et al 2008]
• and many more ...
Probabilistic Relational Models (PRMs)
[Figure: a PRM for blood type: each Person has M-chromosome, P-chromosome, and Bloodtype variables; a person's chromosomes depend on those of the (Father) and (Mother) persons, with CPDs given as tables. After Getoor, Koller & Pfeffer.]
Probabilistic Relational Models (PRMs)
[The same PRM figure; after Getoor, Koller & Pfeffer.]
View (extensional predicates):
father(Father, Person). mother(Mother, Person).

Random variables:
bt(Person) = BT.
pc(Person) = PC.
mc(Person) = MC.

Dependencies (CPDs associated with):
bt(Person) = BT | pc(Person) = PC, mc(Person) = MC.
pc(Person) = PC | pc_father(Father) = PCf, mc_father(Father) = MCf.
pc_father(Person) = PCf | father(Father, Person), pc(Father) = PC.
...
Probabilistic Relational Models (PRMs) Bayesian Logic Programs (BLPs)
father(rex, fred). mother(ann, fred).
father(brian, doro). mother(utta, doro).
father(fred, henry). mother(doro, henry).
bt(Person) = BT | pc(Person) = PC, mc(Person) = MC.
pc(Person) = PC | pc_father(Person) = PCf, mc_father(Person) = MCf.
mc(Person) = MC | pc_mother(Person) = PCm, mc_mother(Person) = MCm.
[Figure: the grounded Bayesian network, with mc/pc/bt nodes for rex, ann, fred, brian, utta, doro, and henry; each node is a random variable over its states.]
pc_father(Person) = PCf | father(Father, Person), pc(Father) = PC.
...
Extension
Intension
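The extension (ground facts) plus the intension (clauses) determine the grounded network. A minimal sketch, with a made-up pedigree, of how grounding yields one mc/pc/bt node per person:

```python
# Illustrative extension: child -> parent maps (not the slide's full data).
father = {"fred": "rex", "doro": "brian", "henry": "fred"}
mother = {"fred": "ann", "doro": "utta", "henry": "doro"}

# Grounding the intension: every person mentioned anywhere gets
# mc/pc/bt random variables in the induced Bayesian network.
persons = set(father) | set(mother) | set(father.values()) | set(mother.values())
nodes = [f"{attr}({p})" for p in sorted(persons) for attr in ("mc", "pc", "bt")]
print(len(nodes))  # 7 persons x 3 attributes = 21 nodes
```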
Answering Queries
P(bt(ann)) ?
[Figure: the grounded network again, with the support network for the query bt(ann) highlighted.]
Answering Queries
[Figure: the grounded network again.]
P(bt(ann), bt(fred)) ?

Bayes' rule:
P(bt(ann) | bt(fred)) = P(bt(ann), bt(fred)) / P(bt(fred))
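Answering the conditional query by Bayes' rule is a one-liner once the joint over the query atoms is available; a sketch with made-up numbers (not values computed from the pedigree):

```python
# Illustrative joint P(bt(ann), bt(fred)) over two blood types "a"/"b".
joint = {("a", "a"): 0.20, ("a", "b"): 0.10,
         ("b", "a"): 0.25, ("b", "b"): 0.45}

def conditional(bt_ann, bt_fred):
    # Bayes' rule: P(bt(ann) | bt(fred)) = P(bt(ann), bt(fred)) / P(bt(fred))
    p_fred = sum(p for (_, f), p in joint.items() if f == bt_fred)
    return joint[(bt_ann, bt_fred)] / p_fred

print(conditional("a", "a"))  # 0.20 / (0.20 + 0.25)
```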
Combining Rules
• A student reads two books: the CPDs P(A|B) and P(A|C) must be combined into P(A|B,C)
• Typical combining rules: noisy-or, noisy-max, ...

prepared(Student,Topic) | read(Student,Book), discusses(Book,Topic).
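The noisy-or combining rule treats each ground instance of the clause as an independent cause; a small sketch with illustrative probabilities:

```python
# Noisy-or: combined probability is 1 - prod(1 - p_i) over the
# contributions p_i of the individual ground instances.
def noisy_or(probs):
    result = 1.0
    for p in probs:
        result *= 1.0 - p
    return 1.0 - result

# prepared(s, t) from two read books discussing the topic:
print(noisy_or([0.8, 0.6]))  # 1 - 0.2 * 0.4 = 0.92
```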
Knowledge Based Model Construction

Extension + Intension => Probabilistic Model

Advantages:
• the same intension can be used for multiple extensions
• parameters are shared / tied together (unification is essential)
• learning becomes feasible: maximum likelihood parameter estimation & structure learning
Bayesian Logic Programs
Markov chain (MC):
% apriori nodes
nat(0).
% aposteriori nodes
nat(s(X)) | nat(X).
Ground network: nat(0), nat(s(0)), nat(s(s(0))), ...

Hidden Markov model (HMM):
% apriori nodes
state(0).
% aposteriori nodes
state(s(Time)) | state(Time).
output(Time) | state(Time).
Ground network: state(0), state(s(0)), ..., with output(T) per state(T).

Dynamic Bayesian network (DBN):
% apriori nodes
n1(0).
% aposteriori nodes
n1(s(TimeSlice)) | n2(TimeSlice).
n2(TimeSlice) | n1(TimeSlice).
n3(TimeSlice) | n1(TimeSlice), n2(TimeSlice).
Ground network: n1(0), n2(0), n3(0), n1(s(0)), n2(s(0)), n3(s(0)), ...

Pure Prolog and Bayesian nets are special cases.
Learning BLPs
RVs + States = (partial) Herbrand interpretation; probabilistic learning from interpretations.
Family(1): pc(brian)=b, bt(ann)=a, bt(brian)=?, bt(dorothy)=a
Family(2): bt(cecily)=ab, pc(henry)=a, mc(fred)=?, bt(kim)=a, pc(bob)=b
Family(3): pc(rex)=b, bt(doro)=a, bt(brian)=?
Background: m(ann,dorothy), f(brian,dorothy), m(cecily,fred), f(henry,fred), f(fred,bob), m(kim,bob), ...
Parameter Estimation
The data (interpretations) plus the clauses

bt(Person,BT) | pc(Person,PC), mc(Person,MC).
pc(Person,PC) | pc_father(Person,PCf), mc_father(Person,MCf).
mc(Person,MC) | pc_mother(Person,PCm), mc_mother(Person,MCm).

yield the estimated parameters.
Parameter Estimation (continued)
The same clause is grounded many times, so its CPD parameters are shared across all ground instances: parameter tying.
Expectation Maximization
EM algorithm (iterate until convergence), given a logic program L and initial parameters θ0:
• E-step: using the current model (M, θk), compute the expected counts of each clause,
  ∑_DC ∑_GI P(head(GI), body(GI) | DC)  and  ∑_DC ∑_GI P(body(GI) | DC),
  summing over data cases DC and ground instances GI.
• M-step: update the parameters (ML or MAP) from these expected counts.
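For a single clause the E- and M-steps reduce to a few lines; a minimal sketch with illustrative data, assuming the clause body holds in every data case and the head is sometimes unobserved:

```python
# Estimate q = P(head | body) by EM. Observations of the head are
# True/False, or None when unobserved in that data case.
observations = [True, True, False, None, None]

def em_step(q):
    # E-step: expected count of "head true", filling in q for missing values
    expected_true = sum(q if o is None else float(o) for o in observations)
    # M-step: maximum-likelihood update
    return expected_true / len(observations)

q = 0.5
for _ in range(50):  # iterate until (practical) convergence
    q = em_step(q)
print(round(q, 4))  # fixed point of q' = (2 + 2q) / 5, i.e. 2/3
```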
Markov Logic: Intuition
▪ Undirected graphical model
▪ A logical KB is a set of hard constraints on the set of possible worlds
▪ Let's make them soft constraints: when a world violates a formula, it becomes less probable, not impossible
▪ Give each formula a weight (higher weight ⇒ stronger constraint)

P(world) ∝ exp( ∑ weights of formulas it satisfies )
A possible worlds view
Say we have two domain elements, Anna and Bob, and two predicates, Friends and Happy. The atoms Friends(Anna,Bob) and Happy(Bob) span a 2 × 2 grid of possible worlds.
(slides by Pedro Domingos)
A possible worlds view
Logical formulas such as ¬Friends(Anna,Bob) or Happy(Bob) exclude possible worlds: the constraint ¬Friends(Anna,Bob) ∨ Happy(Bob) rules out the world with Friends(Anna,Bob) true and Happy(Bob) false.
(slides by Pedro Domingos)
A possible worlds view
Make the constraint soft by giving worlds potentials:
Φ(¬Friends(Anna,Bob) ∨ Happy(Bob)) = 1
Φ(Friends(Anna,Bob) ∧ ¬Happy(Bob)) = 0.75
The three worlds satisfying the rule get potential 1 and the violating world 0.75: four times as likely that the rule holds.
(slides by Pedro Domingos)
A possible worlds view
As a log-linear model, the potential becomes a weight:
w(Φ(¬Friends(Anna,Bob) ∨ Happy(Bob))) = log(1/0.75) ≈ 0.29
This can also be viewed as building a graphical model.
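The potential-to-weight conversion on the slide is a one-line computation:

```python
import math

# The soft constraint with potential 0.75 becomes the log-linear weight
# w = log(1 / 0.75), as on the slide.
w = math.log(1 / 0.75)
print(round(w, 2))  # 0.29
```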
Markov Logic
Suppose we have two constants: Anna (A) and Bob (B). Grounding the unary predicates gives the nodes Smokes(A), Smokes(B), Cancer(A), Cancer(B).
(slides by Pedro Domingos)
Markov Logic
Grounding Friends adds the nodes Friends(A,A), Friends(A,B), Friends(B,A), Friends(B,B); ground formulas link the atoms they share into a Markov network.
(slides by Pedro Domingos)
Markov Logic
The ground network represented as a factor graph: variable nodes C(A), S(A), C(B), S(B), F(A,A), F(A,B), F(B,A), F(B,B), with a factor F1 per grounding of the first formula and a factor F2 per grounding of the second.

P(Interpretation) ∝ ∏_{i,θ} exp( w_i · 𝕀(Interpretation ⊧ F_i θ) )
Markov Logic
▪ A Markov Logic Network (MLN) is a set of pairs (F, w) where
  ▪ F is a formula in first-order logic
  ▪ w is a real number
▪ An MLN defines a Markov network with
  ▪ one node for each grounding of each predicate in the MLN
  ▪ one feature for each grounding of each formula F in the MLN, with the corresponding weight w
▪ Probability of a world x:

P(x) = (1/Z) exp( ∑_i w_i n_i(x) )

where w_i is the weight of formula i and n_i(x) is the number of true groundings of formula i in x.
Possible Worlds
A vocabulary: Smokes(Alice), Smokes(Bob), Friends(Alice,Bob), Friends(Bob,Alice).
Possible worlds = the logical interpretations of these atoms.
(Slides adapted from Guy Van den Broeck)
Possible Worlds
A logical theory: ∀x,y: Smokes(x) ∧ Friends(x,y) ⇒ Smokes(y)
Models = the interpretations that satisfy the theory.
(Slides adapted from Guy Van den Broeck)
First-Order Model Counting
A logical theory: ∀x,y: Smokes(x) ∧ Friends(x,y) ⇒ Smokes(y)
The first-order model count is the number of models of the theory, the first-order analogue of #SAT.
(Slides by Guy Van den Broeck)
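For a two-element domain the model count can be brute-forced over all 2^6 interpretations; a sketch (the domain and atom names follow the slide):

```python
from itertools import product

# Count models of  forall x,y: Smokes(x) & Friends(x,y) => Smokes(y)
# over {alice, bob}: 2 Smokes atoms + 4 Friends atoms.
people = ["alice", "bob"]

def satisfies(smokes, friends):
    return all((not smokes[x]) or (not friends[(x, y)]) or smokes[y]
               for x in people for y in people)

count = 0
for sm in product([False, True], repeat=2):
    smokes = dict(zip(people, sm))
    for fr in product([False, True], repeat=4):
        friends = dict(zip(product(people, people), fr))
        if satisfies(smokes, friends):
            count += 1
print(count)  # 48 of the 64 interpretations are models
```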
Markov Logic (recap)
An MLN is a set of weighted first-order formulas (F, w); the induced Markov network has one node per ground atom and one feature per ground formula, and

P(x) = (1/Z) exp( ∑_i w_i n_i(x) )

with w_i the weight of formula i and n_i(x) the number of true groundings of formula i in x.
A Markov Logic theory:

1.5  ∀x,y: Smokes(x) ∧ Friends(x,y) ⇒ Smokes(y)

over the atoms Smokes(Alice), Smokes(Bob), Friends(Alice,Bob), Friends(Bob,Alice). True groundings are counted only for substitutions with X ≠ Y (X=Alice, Y=Bob and X=Bob, Y=Alice): worlds satisfying both groundings get unnormalized weight (1/Z) exp(1.5 × 2), worlds satisfying one get (1/Z) exp(1.5 × 1).
(Slides adapted from Guy Van den Broeck)
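The same brute-force enumeration gives the weighted version; a sketch computing Z and the probability of a world satisfying both groundings (the exp(1.5 × 2)/Z from the slide):

```python
import math
from itertools import product

# The theory 1.5 :: Smokes(x) & Friends(x,y) => Smokes(y) over {alice, bob},
# counting true groundings only for x != y, as on the slide.
people = ["alice", "bob"]
w = 1.5

def n_true(smokes, friends):
    # number of satisfied groundings of the implication with x != y
    return sum(1 for x in people for y in people if x != y
               and ((not smokes[x]) or (not friends[(x, y)]) or smokes[y]))

world_weights = []
for sm in product([False, True], repeat=2):
    smokes = dict(zip(people, sm))
    for fr in product([False, True], repeat=4):
        friends = dict(zip(product(people, people), fr))
        world_weights.append(math.exp(w * n_true(smokes, friends)))

Z = sum(world_weights)          # partition function by enumeration
prob = math.exp(w * 2) / Z      # a world satisfying both groundings
print(prob)
```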
A Markov Logic theory:

1.5  ∀x,y: Smokes(x) ∧ Friends(x,y) ⇒ Smokes(y)

The partition function Z sums the unnormalized weights exp(1.5 × n(world)) over all possible worlds, turning the per-world expressions into probabilities.
(Slides adapted from Guy Van den Broeck)
Weighted First-Order Model Counting
A logical theory and a weight function for the predicates:
Smokes → 1, ¬Smokes → 2, Friends → 4, ¬Friends → 1
over the atoms Smokes(Alice), Smokes(Bob), Friends(Alice,Bob), Friends(Bob,Alice). The weighted first-order model count sums, over all models of the theory, the product of the atom weights.
Related to ProbLog inference!
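With the slide's weight function, the weighted model count can again be brute-forced over the two-element domain; a sketch:

```python
from itertools import product

# Weights from the slide: Smokes -> 1, not-Smokes -> 2,
#                         Friends -> 4, not-Friends -> 1.
people = ["alice", "bob"]
w_pos = {"smokes": 1.0, "friends": 4.0}
w_neg = {"smokes": 2.0, "friends": 1.0}

def world_weight(smokes, friends):
    weight = 1.0
    for v in smokes.values():
        weight *= w_pos["smokes"] if v else w_neg["smokes"]
    for v in friends.values():
        weight *= w_pos["friends"] if v else w_neg["friends"]
    return weight

def satisfies(smokes, friends):
    return all((not smokes[x]) or (not friends[(x, y)]) or smokes[y]
               for x in people for y in people)

wfomc = 0.0
for sm in product([False, True], repeat=2):
    smokes = dict(zip(people, sm))
    for fr in product([False, True], repeat=4):
        friends = dict(zip(product(people, people), fr))
        if satisfies(smokes, friends):
            wfomc += world_weight(smokes, friends)
print(wfomc)  # 3625.0
```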
Parameter Learning

∂/∂w_i log P_w(x) = n_i(x) − E_w[n_i(x)]

i.e. the number of times clause i is true in the data minus the expected number of times clause i is true according to the MLN. Has been used for generative learning (pseudolikelihood); many variations (also discriminative); applications in networks, NLP, bioinformatics, ...
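The gradient above drives weight learning directly; a minimal sketch of one gradient-ascent step, with made-up counts and learning rate:

```python
# d/dw_i log P_w(x) = n_i(x) - E_w[n_i(x)]
n_data = 2.0        # times clause i is true in the data (illustrative)
expected_n = 1.6    # expected count under the current MLN (illustrative)

def gradient_step(w, lr=0.1):
    # one step of gradient ascent on the log-likelihood
    return w + lr * (n_data - expected_n)

print(gradient_step(1.5))  # 1.5 + 0.1 * 0.4
```
In practice the expected count E_w[n_i(x)] is itself an inference problem, which is why pseudolikelihood and other approximations are used.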
Applications
▪ Natural language processing, Collective Classification, Social Networks, Activity Recognition, …
Information Extraction
Singla, P., & Domingos, P. (2006). Memory-efficient inference in relational domains. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 500-505). Boston, MA: AAAI Press.

Poon, H., & Domingos, P. (2006). Sound and efficient inference with probabilistic and deterministic dependencies. In Proceedings of the Twenty-First National Conference on Artificial Intelligence. Boston, MA: AAAI Press.
![Page 42: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/42.jpg)
Segmentation
Segmenting the citation strings into Author, Title, and Venue fields (same references as above).
Entity Resolution
Resolving duplicate citations and entities (same references as above).
Roadmap
• Prob. Programming - Modeling
• Inference
• Learning
• Dynamics
• KBMC & Markov Logic
• DeepProbLog
• From StarAI to NeSy
... with some detours on the way
Part VI: DeepProbLog
Learning
Three different paradigms for learning: probability, logic, neural.
Integrate Deep Learning and (Probabilistic) Logics?

Logic: e.g. the alarm network (earthquake, burglary, alarm, hears_alarm, calls).
Deep Learning: e.g. "Are there an equal number of large things and metal spheres?" (cf. the Visual Genome and CLEVR datasets)

Neural-symbolic learning and reasoning: A survey and interpretation [Besold et al.]
NeSy state-of-the-art
• The integration of perception and reasoning is still an open problem.
• Main idea: inject/encode logic into neural networks (and let the NN do the rest)
• Encoding logic in the weights of neural networks
• Learning embeddings for logical entities
• Logical constraints as a regularizer during training
• Templating neural networks
• Building neural networks from functional programs
• Building neural networks from backwards proving
• Differentiable neural computers / program interpreters
State-of-the-art
• Encoding logic in the weights of neural networks
• Logic Tensor Networks (Serafini et al.)
• A Semantic Loss Function for Deep Learning with Symbolic Knowledge (Xu et al.)
• Ontology Reasoning with Deep Neural Networks (Hohenecker et al.)
• Semantic Based Regularization (Diligenti et al.)
State-of-the-art
• Templates for neural networks (a kind of Knowledge Base Model Construction)
• Lifted Relational Neural Networks (Šourek et al.)
• Neural Theorem Prover (Rocktäschel et al.)
• Neural Module Networks (Andreas et al.)
State-of-the-art
• Differentiable neural computers / program interpreters
• Differentiable Neural Computer (Graves et al.)
• Neural Programmer-Interpreters (Reed et al.)
• Differentiable Forth Interpreter (Bošnjak et al.)
DeepProbLog
Idea: inject neural networks into logic by extending an existing PLP language
DeepProbLog = ProbLog + neural predicate
The neural predicate makes neural networks first-class citizens
53
| Related work | DeepProbLog |
| --- | --- |
| Logic is made less expressive | Full expressivity is retained |
| Logic is pushed into the neural network | Clean separation |
| Fuzzy logic | Probabilistic logic |
| Language semantics unclear | Clear semantics |
NeurIPS 2018
![Page 54: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/54.jpg)
Neural predicate
• Neural networks have uncertainty in their predictions
• A normalized output can be interpreted as a probability distribution
• Neural predicate models the output as probabilistic facts
• No changes needed in the probabilistic host language
54
Neural network
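As a sketch in plain Python (illustrative only: the logits are made up and `mnist_net` itself is not shown), the raw network outputs are softmax-normalized, and each class probability is then read off as one probabilistic fact:

```python
import math

def softmax(logits):
    """Normalize raw network outputs into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs of an MNIST classifier for one image.
logits = [0.1, 2.0, 0.3, 0.1, 0.0, 0.2, 0.1, 3.0, 0.1, 0.0]
probs = softmax(logits)

# Each entry becomes one probabilistic fact: probs[i] :: digit(img, i).
facts = {f"digit(img,{i})": p for i, p in enumerate(probs)}
```

Since the outputs sum to one, the facts together form an annotated disjunction: exactly one of the digit values is true.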
![Page 55: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/55.jpg)
DTAI research group
The neural predicate
The output of the neural network is represented as probabilistic facts in DeepProbLog
Example:
nn(mnist_net, [X], Y, [0 ... 9] ) :: digit(X,Y).
Instantiated into a (neural) Annotated Disjunction:
0.04::digit( ,0) ; 0.35::digit( ,1) ; ... ; 0.53::digit( ,7) ; ... ; 0.014::digit( ,9).
![Page 56: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/56.jpg)
DeepProbLog exemplified: MNIST addition
Task: Classify pairs of MNIST digits with their sum
Benefit of DeepProbLog:
• Encode addition in logic
• Separate addition from digit classification
8411
nn(mnist_net, [X], Y, [0 ... 9] ) :: digit(X,Y).
addition(X,Y,Z) :- digit(X,N1), digit(Y,N2), Z is N1+N2.
Examples: addition( , ,8), addition( , ,4), addition( , ,11), …
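The semantics of this two-line program can be sketched in Python (a hand-rolled illustration, not the DeepProbLog implementation): given the two digit distributions produced by the neural predicate, the probability of each sum is obtained by summing over all digit pairs that produce it.

```python
def addition_distribution(p1, p2):
    """P(Z = z) = sum over n1 + n2 = z of p1[n1] * p2[n2],
    mirroring addition(X,Y,Z) :- digit(X,N1), digit(Y,N2), Z is N1+N2."""
    dist = [0.0] * 19  # possible sums: 0..18
    for n1, a in enumerate(p1):
        for n2, b in enumerate(p2):
            dist[n1 + n2] += a * b
    return dist

# Toy digit distributions: image a is surely a 3; image b is 5 or 6.
p_a = [0.0] * 3 + [1.0] + [0.0] * 6
p_b = [0.0] * 5 + [0.7, 0.3] + [0.0] * 3
dist = addition_distribution(p_a, p_b)
# P(sum = 8) = 0.7 and P(sum = 9) = 0.3
```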
![Page 57: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/57.jpg)
DeepProbLog exemplified: MNIST addition
Task: Classify pairs of MNIST digits with their sum
Benefit of DeepProbLog:
• Encode addition in logic
• Separate addition from digit classification
8411
nn(mnist_net, [X], Y, [0 ... 9] ) :: digit(X,Y).
addition(X,Y,Z) :- digit(X,N1), digit(Y,N2), Z is N1+N2.
addition( , ,8) :- digit( ,N1), digit( ,N2), 8 is N1 + N2.
Examples: addition( , ,8), addition( , ,4), addition( , ,11), …
![Page 58: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/58.jpg)
Example
Learn to classify the sum of pairs of MNIST digits
Individual digits are not labeled!
E.g. ( , , 8)
Could be done by a CNN: classify the concatenation of both images into 19 classes
However:
58
+ = ?
![Page 59: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/59.jpg)
MNIST Addition
• Pairs of MNIST images, labeled with their sum
• Baseline: CNN
• Classifies the concatenation of both images into classes 0 … 18
• DeepProbLog:
• CNN that classifies images into 0 … 9
• Two lines of DeepProbLog code
59
![Page 60: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/60.jpg)
Multi-digit MNIST addition
Result
60
number([], Result, Result).
number([H|T], Acc, Result) :-
    digit(H, Nr), Acc2 is Nr + 10*Acc, number(T, Acc2, Result).
number(X, Y) :- number(X, 0, Y).

multiaddition(X, Y, Z) :-
    number(X, X2), number(Y, Y2), Z is X2 + Y2.
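The `number/3` accumulator corresponds to a left fold over the digit list; a Python sketch of the same recursion (illustrative only, with the digits already classified):

```python
def number(digits):
    """number/3 as a left fold: Acc2 is Nr + 10*Acc."""
    acc = 0
    for nr in digits:
        acc = nr + 10 * acc
    return acc

def multiaddition(xs, ys):
    """multiaddition/3: convert both digit lists, then add."""
    return number(xs) + number(ys)

# number([4, 2, 1]) = 421, so multiaddition([4, 2, 1], [7, 9]) = 500
```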
![Page 61: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/61.jpg)
Example
Learn to classify the sum of pairs of MNIST digits
Individual digits are not labeled!
E.g. ( , , 8)
Could be done by a CNN: classify the concatenation of both images into 19 classes
However:
61
+ = ?
![Page 62: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/62.jpg)
(Deep)ProbLog : Inference
![Page 63: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/63.jpg)
Inference / Reasoning
• Most of the work in PP and StarAI is on inference
• It is hard (complexity wise)
• Many inference methods
• exact, approximate, sampling and lifted …
• Inference is the key to learning
63
![Page 64: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/64.jpg)
ProbLog Inference
Answering a query in a ProbLog program happens in four steps:
1. Ground the program w.r.t. the query
2. Rewrite the ground logic program into a propositional logic formula
3. Compile the formula into an arithmetic circuit
4. Evaluate the arithmetic circuit
0.1 :: burglary. 0.5 :: hears_alarm(mary).
0.2 :: earthquake. 0.4 :: hears_alarm(john).
alarm :– earthquake.
alarm :– burglary. calls(X) :– alarm, hears_alarm(X).
Query
?- P(calls(mary))
![Page 65: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/65.jpg)
ProbLog Inference
Answering a query in a ProbLog program happens in four steps:
1. Ground the program w.r.t. the query (only the relevant part!)
2. Rewrite the ground logic program into a propositional logic formula
3. Compile the formula into an arithmetic circuit
4. Evaluate the arithmetic circuit
0.1 :: burglary. 0.5 :: hears_alarm(mary).
0.2 :: earthquake. 0.4 :: hears_alarm(john).
alarm :– earthquake.
alarm :– burglary. calls(mary) :– alarm, hears_alarm(mary).calls(john) :– alarm, hears_alarm(john).
Query
?- P(calls(mary))
![Page 66: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/66.jpg)
ProbLog Inference
Answering a query in a ProbLog program happens in four steps:
1. Ground the program w.r.t. the query
2. Rewrite the ground logic program into a propositional logic formula
3. Compile the formula into an arithmetic circuit
4. Evaluate the arithmetic circuit
0.1 :: burglary. 0.5 :: hears_alarm(mary).
0.2 :: earthquake. 0.4 :: hears_alarm(john).
alarm :– earthquake.
alarm :– burglary. calls(mary) :– alarm, hears_alarm(mary).
calls(john) :– alarm, hears_alarm(john).
calls(mary)
↔
hears_alarm(mary) ∧ (burglary ∨ earthquake)
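With the facts independent, this formula evaluates to P(calls(mary)) = 0.5 · (1 − 0.9 · 0.8) = 0.14. As a quick check in Python (the arithmetic the circuit performs, not the ProbLog engine itself):

```python
p_burglary = 0.1
p_earthquake = 0.2
p_hears = 0.5

# P(burglary OR earthquake) for independent facts, via the complement:
p_alarm = 1 - (1 - p_burglary) * (1 - p_earthquake)   # 0.28
p_calls_mary = p_hears * p_alarm                      # 0.14
```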
![Page 67: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/67.jpg)
ProbLog Inference
Answering a query in a ProbLog program happens in four steps:
1. Ground the program w.r.t. the query
2. Rewrite the ground logic program into a propositional logic formula
3. Compile the formula into an arithmetic circuit (knowledge compilation)
4. Evaluate the arithmetic circuit
calls(mary)
↔
hears_alarm(mary) ∧ (burglary ∨ earthquake)

[Arithmetic circuit for calls(mary): leaves earthquake (0.2), ¬earthquake (0.8), burglary (0.1) and hears_alarm(mary) (0.5); the AND/OR nodes are evaluated bottom-up, yielding P(calls(mary)) = 0.14 at the root.]
![Page 68: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/68.jpg)
![Page 69: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/69.jpg)
Optimization
PLP usually considers the inference setting.

DeepProbLog focuses on optimization:
• We have a set of tuples (q, p), where q is a query and p its desired success probability
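For instance, with a mean squared error between computed and desired success probabilities (MSE is the loss used in some of the experiments later on), the optimization objective can be sketched as:

```python
def mse_loss(predicted, target):
    """Mean squared error over (query, desired success probability) tuples:
    predicted[i] is the computed probability of query i, target[i] the label."""
    assert len(predicted) == len(target)
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)

# Hypothetical example: two queries with desired probability 1.0,
# currently computed as 0.9 and 0.2.
loss = mse_loss([0.9, 0.2], [1.0, 1.0])   # (0.01 + 0.64) / 2 = 0.325
```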
![Page 70: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/70.jpg)
DeepProbLog
• We use algebraic ProbLog (aProbLog) with the gradient semiring
• What is aProbLog?
• a version of ProbLog where the probability semiring is replaced by an arbitrary semiring structure
• labels on facts are elements of the semiring
• cf. the different semirings for WMC, #SAT, …
70
![Page 71: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/71.jpg)
more examples
71
[Table of example semirings and their label functions.]
![Page 72: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/72.jpg)
Implementing DeepProbLog
1. Evaluating the neural networks:
• Instantiate the neural annotated disjunction
• Happens during grounding
• ProbLog already had support for external functions
2. Performing backpropagation in the neural networks:
• No direct loss for the neural networks
• The loss is defined on the logic level
• Derive the gradient in the logic
• Start backpropagation with the derived gradient
![Page 73: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/73.jpg)
Deriving the gradient
• The outputs of the neural network are probabilistic facts
• Probabilistic facts are leaves in the AC
• The AC is a differentiable structure
• We can compute the gradient in the forward pass along with the probability
aProbLog + gradient semiring
![Page 74: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/74.jpg)
Gradient semiring

t(0.2) :: earthquake.
t(0.1) :: burglary.
0.5 :: hears_alarm.
alarm :- earthquake.
alarm :- burglary.
calls :- alarm, hears_alarm.
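The gradient semiring can be sketched in a few lines of Python (an illustration of the semiring operations, not the aProbLog implementation): each label is a pair (probability, gradient vector), ⊕ adds component-wise, and ⊗ applies the product rule.

```python
# Gradient semiring over (p, grad) pairs, one gradient slot per
# learnable parameter; here [d/d_earthquake, d/d_burglary].
def s_plus(a, b):
    """Semiring sum of disjoint events: probabilities and gradients add."""
    return (a[0] + b[0], [x + y for x, y in zip(a[1], b[1])])

def s_times(a, b):
    """Semiring product of independent events: product rule for gradients."""
    return (a[0] * b[0], [a[0] * y + b[0] * x for x, y in zip(a[1], b[1])])

eq     = (0.2, [1.0, 0.0])    # t(0.2) :: earthquake
not_eq = (0.8, [-1.0, 0.0])
burg   = (0.1, [0.0, 1.0])    # t(0.1) :: burglary
hears  = (0.5, [0.0, 0.0])    # fixed probability: zero gradient

# calls <-> hears_alarm AND (earthquake OR (burglary AND NOT earthquake))
calls = s_times(hears, s_plus(eq, s_times(burg, not_eq)))
# calls == (0.14, [0.45, 0.4])
```

The forward pass thus yields the success probability and its gradient w.r.t. the learnable parameters in a single circuit evaluation.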
![Page 75: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/75.jpg)
The DeepProbLog pipeline
![Page 76: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/76.jpg)
EXPERIMENTS
76
![Page 77: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/77.jpg)
Program Induction• Approach similar to that of ‘Programming with a Differentiable
Forth Interpreter’ [1] (∂4)
• Partially defined Forth program with slots / holes
• Slots are filled by neural network (encoder / decoder)
• Fully differentiable interpreter: NNs are trained with input / output examples
• DeepProbLog program with switches
• Switches are controlled by neural networks
77
[1]: Matko Bosnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel: Programming with a Differentiable Forth Interpreter. ICML 2017: 547-556
Logic
Neural
![Page 78: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/78.jpg)
● Sorting
○ Sort lists of numbers using bubble sort
○ Hole: swap or don't swap when comparing two numbers
● Addition
○ Add two numbers and a carry
○ Hole: what is the resulting digit and carry at each step
○ (Note: not MNIST digits, but actual numbers)
● Word Algebra Problems
○ E.g. "Ann has 8 apples. She buys 4 more. She distributes them equally among her 3 kids. How many apples does each child receive?"
○ Hole: sequence of permuting, swapping and performing operations on the three numbers

[1]: Matko Bosnjak, Tim Rocktäschel, Jason Naradowsky, Sebastian Riedel: Programming with a Differentiable Forth Interpreter. ICML 2017: 547-556
Tasks[1]
![Page 79: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/79.jpg)
hole(X,Y,X,Y) :- swap(X,Y,0).
hole(X,Y,Y,X) :- swap(X,Y,1).

bubble([X],[],X).
bubble([H1,H2|T],[X1|T1],X) :- hole(H1,H2,X1,X2), bubble([X2|T],T1,X).

bubblesort([],L,L).
bubblesort(L,L3,Sorted) :- bubble(L,L2,X), bubblesort(L2,[X|L3],Sorted).

sort(L,L2) :- bubblesort(L,[],L2).
Holes defined by neural predicate
Bubble sort implementation
Example DeepProbLog solution
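The control flow of this sketch can be mimicked in Python (a deterministic illustration: the `hole` function stands in for the neural predicate, which after training should implement "swap iff X > Y"):

```python
def bubble_step(lst, hole):
    """One bubble pass: `hole` decides per adjacent pair whether to swap,
    playing the role of the neural predicate swap/3."""
    out = list(lst)
    for i in range(len(out) - 1):
        out[i], out[i + 1] = hole(out[i], out[i + 1])
    return out

def sort_with_hole(lst, hole):
    """bubblesort/3: repeat the bubble pass until the list is sorted."""
    for _ in range(len(lst)):
        lst = bubble_step(lst, hole)
    return lst

# The hole a correctly trained network should implement:
learned_hole = lambda x, y: (y, x) if x > y else (x, y)
# sort_with_hole([3, 1, 2], learned_hole) -> [1, 2, 3]
```

In DeepProbLog the hole's decision is probabilistic, so inference sums over all swap choices weighted by the network's probabilities; the sketch above shows only the deterministic skeleton.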
![Page 80: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/80.jpg)
Result
80
![Page 81: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/81.jpg)
Noisy Addition

nn(classifier, [X], Y, [0 .. 9]) :: digit(X,Y).
t(0.2) :: noisy.
1/19 :: uniform(X,Y,0) ; ... ; 1/19 :: uniform(X,Y,18).
addition(X,Y,Z) :- noisy, uniform(X,Y,Z).
addition(X,Y,Z) :- \+noisy, digit(X,N1), digit(Y,N2), Z is N1+N2.
(a) The DeepProbLog program.
nn(classifier,[a],0) :: digit(a,0) ; nn(classifier,[a],1) :: digit(a,1).
nn(classifier,[b],0) :: digit(b,0) ; nn(classifier,[b],1) :: digit(b,1).
t(0.2) :: noisy.
1/19 :: uniform(a,b,1).
addition(a,b,1) :- noisy, uniform(a,b,1).
addition(a,b,1) :- \+noisy, digit(a,0), digit(b,1).
addition(a,b,1) :- \+noisy, digit(a,1), digit(b,0).
(b) The ground DeepProbLog program.
(c) The AC for query addition(a,b,1).
Figure 4: Parameter learning in DeepProbLog. (Example 5)
Neural
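The success probability of `addition/3` under this noise model is a mixture; a Python sketch (with toy digit distributions, not the trained network):

```python
def noisy_addition(p_noisy, p1, p2, z):
    """P(addition(X,Y,z)) under the noise model:
    with probability p_noisy the label is uniform over 0..18,
    otherwise it is the true sum of the two digits."""
    clean = sum(p1[n1] * p2[n2]
                for n1 in range(10) for n2 in range(10) if n1 + n2 == z)
    return p_noisy * (1.0 / 19.0) + (1.0 - p_noisy) * clean

# If both digits are certainly 4: P(addition = 8) = 0.2/19 + 0.8
p1 = [0.0] * 4 + [1.0] + [0.0] * 5
p2 = list(p1)
```

Because `t(0.2) :: noisy.` is a learnable parameter, the gradient semiring lets DeepProbLog fit the noise fraction itself, as the results below show.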
![Page 82: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/82.jpg)
![Page 83: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/83.jpg)
Noisy Addition

[Arithmetic circuit under the gradient semiring: each node carries a pair (probability, gradient vector), with one gradient entry per learnable parameter (p_noisy, p_digit(a,0), …, p_digit(a,9), p_digit(b,0), …, p_digit(b,9)). Leaves: noisy = 0.2, [1, 0, 0, …]; ¬noisy = 0.8, [-1, 0, 0, …]; digit(a,0) = 0.8; digit(a,1) = 0.1; digit(b,0) = 0.2; digit(b,1) = 0.6; uniform(a,b,1) = 0.053. The ⊗ and ⊕ nodes combine these pairs bottom-up; the root addition(a,b,1) evaluates to p = 0.411 with gradient [-0.447, 0.48, 0.16, …, 0.08, 0.64, …].]
![Page 84: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/84.jpg)
Noisy Addition

Figure 8: The accuracy on the MNIST test set for individual digits while training on (T3).

| Fraction of noise | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | 1.0 |
| --- | --- | --- | --- | --- | --- | --- |
| Baseline | 93.46 | 87.85 | 82.49 | 52.67 | 8.79 | 5.87 |
| DeepProbLog | 97.20 | 95.78 | 94.50 | 92.90 | 46.42 | 0.88 |
| DeepProbLog w/ explicit noise | 96.64 | 95.96 | 95.58 | 94.12 | 73.22 | 2.92 |
| Learned fraction of noise | 0.000 | 0.212 | 0.415 | 0.618 | 0.803 | 0.985 |

Table 3: The accuracy on the test set for T4.

DeepProbLog is noise tolerant, even retaining an accuracy of 73.2% with 80% noisy labels. As shown in the last row, it is also able to learn the fraction of noisy labels in the data. This shows that the model is able to recognize which examples have noisy labels.

6.2. Program Induction

The second set of problems demonstrates that DeepProbLog can perform program induction. We follow the program sketching [25] setting of differentiable Forth (∂4) [8], where holes in given programs need to be filled by neural networks trained on input-output examples for the entire program. As in their work, we consider three tasks: addition, sorting [26] and word algebra problems (WAPs) [27].

T5: forth_addition([4], [8], 1, [1, 3])
The input consists of two numbers, represented as lists of digits, and a carry. The output is the sum of the numbers and the carry. The program specifies the basic addition algorithm in which we go from right to left over all digits, calculating the sum of two digits and taking the carry over to the next pair. The hole in this program corresponds to calculating the resulting digit (result/4) and carry (carry/4), given two digits and the previous carry.
![Page 85: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/85.jpg)
Addition of images only• Examples of the form
• addition( , , )
• What will happen?
![Page 86: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/86.jpg)
Addition of images only• Examples of the form
• addition( , , )
• What will happen ?
• usual loss function will map all images onto 0 (0 + 0 = 0 )
• can compensate for this by adding regularisation term based on max entropy
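One way to realize such a regularizer (a sketch only; the exact term used is not spelled out on the slide, and `regularized_loss` is a hypothetical helper) is to reward high entropy of the batch-averaged digit predictions, which rules out the collapsed solution where every image is classified as 0:

```python
import math

def entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector (eps avoids log(0))."""
    return -sum(x * math.log(x + eps) for x in p)

def batch_entropy_bonus(predictions):
    """Entropy of the average predicted distribution over a batch.
    Maximizing it pushes the classifier to use all classes."""
    n = len(predictions)
    avg = [sum(p[i] for p in predictions) / n for i in range(len(predictions[0]))]
    return entropy(avg)

def regularized_loss(base_loss, predictions, weight=0.1):
    # Hypothetical combination: the loss shrinks as batch entropy grows.
    return base_loss - weight * batch_entropy_bonus(predictions)
```

A collapsed batch (all mass on class 0) gets zero bonus, while a batch spread over several classes is rewarded.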
![Page 87: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/87.jpg)
Simplified Poker
• dealing with uncertainty
• ignore suits and use just A, J, Q and K
• two players, two cards each, and one community card
• train the neural network to recognize the four cards
• reason probabilistically about the non-observed card
• learn the distribution of the unlabeled community card
Probability
NeuralLogic
nn(m_swap, [X]) :: swap(X,Y).
hole(X,Y,X,Y) :- \+swap(X,Y).
hole(X,Y,Y,X) :- swap(X,Y).

bubble([X],[],X).
bubble([H1,H2|T],[X1|T1],X) :- hole(H1,H2,X1,X2), bubble([X2|T],T1,X).

bubblesort([],L,L).
bubblesort(L,L3,Sorted) :- bubble(L,L2,X), bubblesort(L2,[X|L3],Sorted).

forth_sort(L,L2) :- bubblesort(L,[],L2).

Listing 6: Forth sorting sketch (T6)
Figure A.10: Examples of cards used as input for the Poker without perturbations (T9) experiment.

calculate the result.

In Listing 8, there are two neural predicates: coin1/2 and coin2/2. Their input is the image of the two coins (e.g. Figure 9). The output is heads or tails. The coins/2 predicate classifies both coins using these two predicates and then performs the comparison of the classes with the compare/3 predicate.

In Listing 9, there is a single neural predicate rank/2 that takes as input the image of a card and classifies it as either a jack, queen, king or ace. There is also an AD with learnable parameters that represents the distribution of the unseen community card (house_rank/1). The hand/2 predicate's first argument is a list of 3 cards. It unifies the output with any of the valid hands that these cards contain. The valid hands are: high card, pair (two cards have the same rank), three of a kind (three cards have the same rank), low straight (jack, queen, king) and high straight (queen, king, ace). Each hand is assigned a rank with the
| Distribution | Jack | Queen | King | Ace |
| --- | --- | --- | --- | --- |
| Actual | 0.2 | 0.4 | 0.15 | 0.25 |
| Learned | 0.203 ± 0.002 | 0.396 ± 0.002 | 0.155 ± 0.003 | 0.246 ± 0.002 |

Table 8: The results for the Poker experiment (T9).

two cards and the community card. For simplicity, we only use the jack, queen, king and ace. We also do not consider the suits of the cards. The input consists of 4 images that show the cards dealt to the two players. Additionally, every example is labeled with the chance that the game is won, lost or ended in a draw, e.g.:

0.8 :: poker([Q~, Q}, A}, K|], loss)

We expect DeepProbLog to:
• train the neural network to recognize the four cards
• reason probabilistically about the non-observed card
• learn the distribution of the unlabeled community card

To make DeepProbLog converge more reliably, we add some examples with additional supervision. Namely, in 10% of the examples we additionally specify the community card, i.e.

poker([Q~, Q}, A}, K|], A}, loss).

This also showcases one of the strengths of DeepProbLog, namely, that it can make use of examples that have different levels of observability. The loss function used in this experiment is the MSE between the predicted and target probabilities.

Results. We ran the experiment 10 times. Out of these 10 runs, 4 did not converge on the correct solution. The average values of the learned parameters for the remaining 6 runs are shown in Table 8. As can be seen, DeepProbLog is able to correctly learn the probabilistic parameters. In these 6 runs, the neural network also correctly learned to classify all card types, achieving 100% accuracy. The other runs did not converge because some of the classes were permuted (i.e., queens predicted as aces and vice versa) or multiple classes mapped onto the same one (queens and kings were both predicted as kings).
Distribution Jack Queen King Ace
Actual 0.2 0.4 0.15 0.25Learned 0.203± 0.002 0.396± 0.002 0.155± 0.003 0.246± 0.002
Table 8: The results for the Poker experiment (T9).
two cards and the community card.For simplicity, we only use the jack, queen, king and ace. We also do notconsider the suits of the cards.The input consists of 4 images that show the cards dealt to the two players.Additionally, every example is labeled with the chance that the game iswon, lost or ended in a draw, e.g.:
0.8 :: poker([Q~, Q}, A}, K|], loss)
We expect DeepProbLog to:
• train the neural network to recognize the four cards• reason probabilistically about the non-observed card• learn the distribution of the unlabeled community card
To make DeepProbLog converge more reliably, we add some examples withadditional supervision. Namely, in 10% of the examples we additionallyspecify the community card, i.e.
poker([Q~, Q}, A}, K|], A}, loss).
This also showcases one of the strengths of DeepProbLog, namely, it canmake use of examples that have different levels of observability. The lossfunction used in this experiment is the MSE between the predicted andtarget probabilities.
Results. We ran the experiment 10 times. Out of these 10 runs, 4 didn’tconverge on the correct solution. The average values of the learned pa-rameters for the remaining 6 runs are shown Table 8. As can be seen,DeepProbLog is able to correctly learn the probabilistic parameters. Inthese 6 runs, the neural network also correctly learned to classify all cardtypes, achieving a 100% accuracy. The other runs did not converge becausesome of the classes were permuted (i.e., queens predicted as aces and viceversa) or multiple classes mapped onto the same one (queens and kingswere both predicted as kings).
27
Distribution Jack Queen King Ace
Actual 0.2 0.4 0.15 0.25Learned 0.203± 0.002 0.396± 0.002 0.155± 0.003 0.246± 0.002
Table 8: The results for the Poker experiment (T9).
These results hold in 6 of the 10 runs.
![Page 88: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/88.jpg)
Challenges
• The data needs to provide a signal (cf. the addition of images experiment, and Poker … ); curriculum learning and regularization can help
• Scaling up:
  • still using the exact inference of ProbLog
  • circuits can be very large
  • we are working on approximate inference
![Page 89: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/89.jpg)
Further Reading

• One book
• Three websites to start
• http://probmods.org/ Probabilistic Models of Cognition — Church
• http://dtai.cs.kuleuven.be/problog/ — check also [DR & Kimmig, MLJ 15]
• http://alchemy.cs.washington.edu/ — Markov Logic; check also [Domingos & Lowd], Markov Logic, Morgan & Claypool.
![Page 90: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/90.jpg)
Thanks!
http://dtai.cs.kuleuven.be/problog
Maurice Bruynooghe
Bart Demoen
Anton Dries
Daan Fierens
Jason Filippou
Bernd Gutmann
Manfred Jaeger
Gerda Janssens
Kristian Kersting
Angelika Kimmig
Theofrastos Mantadelis
Wannes Meert
Bogdan Moldovan
Siegfried Nijssen
Davide Nitti
Joris Renkens
Kate Revoredo
Ricardo Rocha
Vitor Santos Costa
Dimitar Shterionov
Ingo Thon
Hannu Toivonen
Guy Van den Broeck
Mathias Verbeke
Jonas Vlasselaer
![Page 91: Learning and Reasoning for AIluc.deraedt/Francqui4ab.pdf · Lots of proposals in the literature, e.g. • relational Markov networks (RMNs) [Taskar et al 2002] • Markov logic networks](https://reader031.fdocuments.in/reader031/viewer/2022012003/60a25d6cbeeb523bf62fe3f2/html5/thumbnails/91.jpg)
PLP Systems

• PRISM http://sato-www.cs.titech.ac.jp/prism/
• ProbLog2 http://dtai.cs.kuleuven.be/problog/
• Yap Prolog http://www.dcc.fc.up.pt/~vsc/Yap/ includes:
  • ProbLog1
  • cplint https://sites.google.com/a/unife.it/ml/cplint
  • CLP(BN)
  • LP2
• PITA in XSB Prolog http://xsb.sourceforge.net/
• AILog2 http://artint.info/code/ailog/ailog2.html
• SLPs http://stoics.org.uk/~nicos/sware/pepl
• contdist http://www.cs.sunysb.edu/~cram/contdist/
• DC https://code.google.com/p/distributional-clauses
• WFOMC http://dtai.cs.kuleuven.be/ml/systems/wfomc