Post on 21-Dec-2015
1
Knowledge Representation and Reasoning
Representação do Conhecimento e
Raciocínio
José Júlio Alferes
2
Part 1: Introduction
3
What is it?
• What data does an intelligent “agent” deal with?
- Not just facts or tuples.
• How does an “agent” know what surrounds it? What are the rules of the game? – One must represent that “knowledge”.
• And what to do afterwards with that knowledge? How to draw conclusions from it? How to reason?
• Knowledge Representation and Reasoning is to AI as Algorithms and Data Structures are to Computation
4
What is it good for?
• Fundamental topic in Artificial Intelligence
– Planning
– Legal Knowledge
– Model-Based Diagnosis
• Expert Systems
• Semantic Web (http://www.w3.org)
– Reasoning on the Web (http://www.rewerse.com)
• Ontologies and data-modeling
5
What is this course about?
• Logic approaches to knowledge representation
• Issues in knowledge representation– semantics, expressivity, complexity
• Representation formalisms
• Forms of reasoning
• Methodologies
• Applications
6
Bibliography
• Will be pointed out as we go along (articles, surveys) in the summaries at the web page
• For the first part of the syllabus:
– Reasoning with Logic Programming, J. J. Alferes and L. M. Pereira, Springer LNAI, 1996
– Nonmonotonic Reasoning, G. Antoniou, MIT Press, 1996.
7
What prior knowledge?
• Computational Logic
• Introduction to Artificial Intelligence
• Logic Programming
8
Logic for KRR
• Logic is a language conceived for representing knowledge
• It was developed for representing mathematical knowledge
• What is appropriate for mathematical knowledge might not be so for representing common sense
• What is appropriate for mathematical knowledge might be too complex for modeling data.
9
Mathematical knowledge vs common sense
• Complete vs incomplete knowledge
– ∀x: x ∈ ℕ → x ∈ ℝ
– go_Work → use_car
• Solid inferences vs default ones
– In the face of incomplete knowledge
– In emergency situations
– In taxonomies
– In legal reasoning
– ...
10
Monotonicity of Logic
• Classical Logic is monotonic
T |= F → T ∪ T’ |= F
• This is a basic property which makes sense for mathematical knowledge
• But is not desirable for knowledge representation in general!
11
Non-monotonic logics
• Do not obey that property
• Appropriate for Common Sense Knowledge
• Default Logic– Introduces default rules
• Autoepistemic Logic– Introduces (modal) operators which speak about
knowledge and beliefs
• Logic Programming
12
Logics for Modeling
• Mathematical 1st order logics can be used for modeling data and concepts. E.g.
– Define ontologies
– Define (ER) models for databases
• Here monotonicity is not a problem– Knowledge is (assumed) complete
• But undecidability, complexity, and even notation might be a problem
13
Description Logics
• Can be seen as subsets of 1st order logics– Less expressive– Enough (and tailored for) describing
concepts/ontologies– Decidable inference procedures– (arguably) more convenient notation
• Quite useful in data modeling
• New applications to the Semantic Web
– Languages for the Semantic Web are in fact Description Logics!
14
In this course (revisited)
• Non-Monotonic Logics– Languages– Tools– Methodologies– Applications
• Description Logics– Idem…
15
Part 2: Default and Autoepistemic Logics
16
Default Logic
• Proposed by Ray Reiter (1980)
go_Work → use_car
• Does not admit exceptions!
• Default rules
go_Work : use_car
use_car
17
More examples
anniversary(X) ∧ friend(X) : give_gift(X)
give_gift(X)

friend(X,Y) ∧ friend(Y,Z) : friend(X,Z)
friend(X,Z)

accused(X) : innocent(X)
innocent(X)
18
Default Logic Syntax
• A theory is a pair (W,D), where:
– W is a set of 1st order formulas
– D is a set of default rules of the form:
α : β1, … , βn
γ
– α (pre-requisite), βi (justifications) and γ (conclusion) are 1st order formulas
19
The issue of semantics
• If α is true (where?) and all βi are consistent (with what?) then γ becomes true (becomes? Wasn’t it before?)
• Conclusions must:
– be a closed set
– contain W
– apply the rules of D maximally, without becoming unsupported
20
Default extensions
• Γ(S) is the smallest set such that:
– W ⊆ Γ(S)
– Th(Γ(S)) = Γ(S)
– if A:B1,…,Bn/C ∈ D, A ∈ Γ(S) and ¬B1,…,¬Bn ∉ S, then C ∈ Γ(S)
• E is an extension of (W,D) iff E = Γ(E)
21
Quasi-inductive definition
• E is an extension iff E = ∪i Ei for:
– E0 = W
– Ei+1 = Th(Ei) ∪ {C : A:B1,…,Bn/C ∈ D, A ∈ Ei, ¬B1,…,¬Bn ∉ E}
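The quasi-inductive construction can be animated on tiny propositional theories. The sketch below is my own illustration (not from the bibliography): it restricts W and all conclusions to literals so that Th is trivial, encodes negation with a `-` prefix, and finds extensions by guess-and-check; the names `extensions` and `reduct_fixpoint` are invented for the example.

```python
from itertools import combinations

def neg(lit):
    """Complement of a literal written as 'a' / '-a'."""
    return lit[1:] if lit.startswith('-') else '-' + lit

def reduct_fixpoint(W, D, E):
    """Quasi-inductive sequence: E0 = W, and E_{i+1} adds the conclusion
    of every default whose prerequisite is already derived and whose
    justifications are consistent with the *candidate* extension E
    (checked against E, not against E_i)."""
    Ei = set(W)
    while True:
        new = Ei | {c for (pre, justs, c) in D
                    if (pre is None or pre in Ei)
                    and all(neg(j) not in E for j in justs)}
        if new == Ei:
            return Ei
        Ei = new

def extensions(W, D):
    """All extensions of (W, D): candidates E for which the construction
    above reproduces exactly E.  Th() is trivial since all formulas are
    literals."""
    lits = set(W) | {c for (_, _, c) in D}
    found = []
    for k in range(len(lits) + 1):
        for cand in combinations(sorted(lits), k):
            E = set(cand)
            if reduct_fixpoint(W, D, E) == E:
                found.append(E)
    return found

# go_Work : use_car / use_car, with W = {go_work}: one extension
print(extensions({'go_work'}, [('go_work', ['use_car'], 'use_car')]))
#  : -p / p  -- justification contradicts the conclusion: no extension
print(extensions(set(), [(None, ['-p'], 'p')]))
```

The second theory, whose only default has a justification inconsistent with its own conclusion, shows that extensions need not exist.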
22
Some properties
• (W,D) has an inconsistent extension iff W is inconsistent
– If an inconsistent extension exists, it is unique
• If W ∪ Just(D) ∪ Conc(D) is consistent, then there is only a single extension
• If E is an extension of (W,D), then it is also an extension of (W ∪ E’,D) for any E’ ⊆ E
23
Operational semantics
• The computation of an extension can be reduced to finding a rule application order (without repetitions).
• Π = (δ1, δ2, ...), and Π[k] is the initial segment of Π with k elements
• In(Π) = Th(W ∪ {conc(δ) | δ ∈ Π})
– The conclusions after the rules in Π are applied
• Out(Π) = {¬ψ | ψ ∈ just(δ) and δ ∈ Π}
– The formulas which may not become true, after application of the rules in Π
24
Operational semantics (cont’d)
• δ is applicable in Π iff pre(δ) ∈ In(Π) and ¬ψ ∉ In(Π), for every ψ ∈ just(δ)
• Π is a process iff, for every k, δk is applicable in Π[k-1]
• A process Π is:
– successful iff In(Π) ∩ Out(Π) = {}. Otherwise it is failed.
– closed iff every δ ∈ D applicable in Π already occurs in Π
• Theorem: E is an extension iff there exists a successful and closed process Π such that In(Π) = E
25
Computing extensions (Antoniou page 39)
extension(W,D,E) :- process(D,[],W,[],_,E,_).

process(D,PCur,InCur,OutCur,P,In,Out) :-
    getNewDefault(default(A,B,C),D,PCur),
    prove(InCur,[A]),
    not prove(InCur,[~B]),
    process(D,[default(A,B,C)|PCur],[C|InCur],[~B|OutCur],P,In,Out).
process(D,P,In,Out,P,In,Out) :-
    closed(D,P,In), successful(In,Out).

closed(D,P,In) :-
    not ( getNewDefault(default(A,B,C),D,P),
          prove(In,[A]), not prove(In,[~B]) ).

successful(In,Out) :- not ( member(B,Out), member(B,In) ).

getNewDefault(Def,D,P) :- member(Def,D), not member(Def,P).
26
Normal theories
• Every rule has its justification identical to its conclusion
• Normal theories always have extensions
• If D grows, then the extensions grow (semi-monotonicity)
• They are not good for everything:
– John is a recent graduate
– Normally recent graduates are adult
– Normally adults, not recently graduated, have a job
(this cannot be coded with a normal rule!)
27
Problems
• No guarantee of extension existence
• Deficiencies in reasoning by cases
– D = {italian:wine/wine, french:wine/wine}
– W = {italian ∨ french}
• No guarantee of consistency among justifications
– D = {: usable(X) ∧ ¬broken(X) / usable(X)}
– W = {broken(right) ∨ broken(left)}
• Non-cumulativity
– D = {:p/p, p∨q:¬p/¬p}
– derives p (and hence p∨q), but after adding p∨q as a fact, p is no longer derived
28
Auto-Epistemic Logic
• Proposed by Moore (1985)
• Contemplates reflection on self knowledge (auto-epistemic)
• Allows for representing knowledge not just about the external world, but also about the knowledge I have of it
29
Syntax of AEL
• 1st Order Logic, plus the operator L (applied to formulas)
• Lφ means “I know φ”
• Examples:
MScOnSW → L MScOnSW
(or L MScOnSW → MScOnSW)
young(X) ∧ ¬L studies(X) → ¬studies(X)
30
Meaning of AEL
• What do I know?– What I can derive (in all models)
• And what do I not know?– What I cannot derive
• But what can be derived depends on what I know– Add knowledge, then test
31
Semantics of AEL
• T* is an expansion of theory T iff
T* = Th(T ∪ {Lφ : T* |= φ} ∪ {¬Lφ : T* |≠ φ})
• Assuming the inference rule φ/Lφ:
T* = CnAEL(T ∪ {¬Lφ : T* |≠ φ})
• An AEL theory is always two-valued in L, that is, for every expansion:
for every φ, Lφ ∈ T* or ¬Lφ ∈ T*
32
Knowledge vs. Belief
• Belief is a weaker concept
– For every formula, I either know it or know it not
– There may be formulas in which I do not believe, nor in their contrary
• The Auto-Epistemic Logic of knowledge and belief (AELB) also introduces the operator B
– Bφ means “I believe in φ”
33
AELB Example
• I rent a film if I believe I’m neither going to baseball nor football games
¬B baseball ∧ ¬B football → rent_film
• I don’t buy tickets if I don’t know I’m going to baseball nor know I’m going to football
¬L baseball ∧ ¬L football → ¬buy_tickets
• I’m going to football or baseball
baseball ∨ football
• I should not conclude that I rent a film, but do conclude I should not buy tickets
34
Axioms about beliefs
• Consistency Axiom
¬B ⊥
• Normality Axiom
B(F → G) → (B F → B G)
• Necessitation rule
F
B F
35
Minimal models
• In what do I believe?
– In that which belongs to all preferred models
• Which are the preferred models?
– Those that, for one same set of beliefs, have a minimal number of true things
• A model M is minimal iff there does not exist a smaller model N, coincident with M on B and L atoms
• When φ is true in all minimal models of T, we write T |=min φ
36
AELB expansions
• T* is a static expansion of T iff
T* = CnAELB(T ∪ {¬Lφ : T* |≠ φ} ∪ {Bφ : T* |=min φ})
where CnAELB denotes closure using the axioms of AELB plus necessitation for L
37
The special case of AEB
• Because of its properties, the case of theories without the knowledge operator is especially interesting
• Then, the definition of expansion becomes:
T* = Ψ(T*)
where Ψ(T*) = CnAEB(T ∪ {Bφ : T* |=min φ})
and CnAEB denotes closure using the axioms of AEB
38
Least expansion
• Theorem: the operator Ψ is monotonic, i.e.
T ⊆ T1 ⊆ T2 → Ψ(T1) ⊆ Ψ(T2)
• Hence, there always exists a least expansion of T, obtainable by transfinite induction:
– T0 = Cn(T)
– Ti+1 = Ψ(Ti)
– Tλ = ∪α<λ Tα (for limit ordinals λ)
39
Consequences
• Every AEB theory has at least one expansion
• If a theory is affirmative (i.e. all clauses have at least one positive literal) then it has at least one consistent expansion
• There is a procedure to compute the semantics
40
Part 3: Logic Programming for Knowledge representation
3.1 Semantics of Normal Logic Programs
41
LP for Knowledge Representation
• Due to its declarative nature, LP has become a prime candidate for Knowledge Representation and Reasoning
• This has been more noticeable since its relations to other NMR formalisms were established
• For this usage of LP, a precise declarative semantics was in order
42
Language
• A Normal Logic Program P is a set of rules:
H ← A1, …, An, not B1, … not Bm (n,m ≥ 0)
where H, Ai and Bj are atoms
• Literals not Bj are called default literals
• When no rule in P has default literals, P is called definite
• The Herbrand base HP is the set of all instantiated atoms from program P.
• We will consider programs as possibly infinite sets of instantiated rules.
43
Declarative Programming
• A logic program can be an executable specification of a problem
member(X,[X|Y]).
member(X,[Y|L]) ← member(X,L).
• Easier to program, compact code
• Adequate for building prototypes
• Given efficient implementations, why not use it to “program” directly?
44
LP and Deductive Databases
• In a database, tables are viewed as sets of facts:
flight  from    to
        lisbon  london
        lisbon  adam

flight(lisbon, london).
flight(lisbon, adam).

• Other relations are represented with rules:

connection(A,B) ← flight(A,B).
connection(A,B) ← flight(A,C), connection(C,B).
choose_another(A,B) ← not connection(A,B).
45
LP and Deductive DBs (cont)
• LP allows storing, besides relations, rules for deducing other relations
• Note that default negation cannot be classical negation in:

connection(A,B) ← flight(A,B).
connection(A,B) ← flight(A,C), connection(C,B).
choose_another(A,B) ← not connection(A,B).

• A form of Closed World Assumption (CWA) is needed for inferring non-availability of connections
46
Default Rules
• The representation of default rules, such as
“All birds fly”
can be done via the non-monotonic operator not

flies(A) ← bird(A), not abnormal(A).
bird(P) ← penguin(P).
abnormal(P) ← penguin(P).
bird(a).
penguin(p).
47
The need for a semantics
• In all the previous examples, classical logic is not an appropriate semantics
– In the 1st, it does not derive not member(3,[1,2])
– In the 2nd, it never concludes choosing another company
– In the 3rd, all abnormalities must be expressed
• The precise definition of a declarative semantics for LPs is recognized as an important issue for its use in KRR.
48
2-valued Interpretations
• A 2-valued interpretation I of P is a subset of HP
– A is true in I (i.e. I(A) = 1) iff A ∈ I
– Otherwise, A is false in I (i.e. I(A) = 0)
• Interpretations can be viewed as representing possible states of knowledge.
• If knowledge is incomplete, there might be in some states atoms that are neither true nor false
49
3-valued Interpretations
• A 3-valued interpretation I of P is a set
I = T ∪ not F
where T and F are disjoint subsets of HP
– A is true in I iff A ∈ T
– A is false in I iff A ∈ F
– Otherwise, A is undefined (I(A) = 1/2)
• 2-valued interpretations are a special case, where:
HP = T ∪ F
50
Models
• Models can be defined via an evaluation function Î:
– For an atom A, Î(A) = I(A)
– For a formula F, Î(not F) = 1 − Î(F)
– For formulas F and G:
• Î((F,G)) = min(Î(F), Î(G))
• Î(F ← G) = 1 if Î(F) ≥ Î(G), and = 0 otherwise
• I is a model of P iff, for every rule H ← B of P:
Î(H ← B) = 1
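The evaluation function Î is easy to animate. The following is an illustrative encoding of my own (not course material): formulas are atoms or nested `('not', F)` / `('and', F, G)` tuples, and an interpretation maps atoms to 0, 1/2 or 1.

```python
def val(I, f):
    """3-valued evaluation: formulas are atoms (strings), ('not', F) or
    ('and', F, G); I maps atoms to 0, 0.5 or 1 (absent atoms are false)."""
    if isinstance(f, str):
        return I.get(f, 0)
    if f[0] == 'not':
        return 1 - val(I, f[1])
    if f[0] == 'and':
        return min(val(I, f[1]), val(I, f[2]))
    raise ValueError(f)

def rule_holds(I, head, body):
    """A rule H <- B holds in I when val(H) >= val(B)."""
    return val(I, head) >= val(I, body)

I = {'a': 1, 'b': 0.5}
print(val(I, ('and', 'a', ('not', 'b'))))   # min(1, 1 - 0.5) = 0.5
print(rule_holds(I, 'b', ('not', 'b')))     # 0.5 >= 0.5, so True
```

The last line shows why a rule such as b ← not b is satisfiable in 3-valued models: making b undefined gives both sides the value 1/2.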
51
Minimal Models Semantics
• The idea of this semantics is to minimize positive information. What is implied as true by the program is true; everything else is false.

ableMathematician(X) ← physicist(X).
physicist(einstein).
president(cavaco).

• {pr(c), pr(e), ph(c), ph(e), aM(c), aM(e)} is a model
• Lack of information that cavaco is a physicist should indicate that he isn’t
• The minimal model is: {pr(c), ph(e), aM(e)}
52
Minimal Models Semantics
D [Truth ordering] For interpretations I and J, I ≤ J iff for every atom A, I(A) ≤ J(A), i.e.
TI ⊆ TJ and FI ⊇ FJ
T Every definite logic program has a least (truth ordering) model.
D[minimal models semantics] An atom A is true in (definite) P iff A belongs to its least model. Otherwise, A is false in P.
53
TP operator
• The least model of a definite P can be computed (bottom-up) via the operator TP
D [TP] Let I be an interpretation of definite P.
TP(I) = {H : (H ← Body) ∈ P and Body ⊆ I}
T If P is definite, TP is monotone and continuous. Its least fixpoint can be built by:
I0 = {} and In = TP(In-1)
T The least model of definite P is the limit of this sequence, TP↑ω({})
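The bottom-up construction of the least model via TP can be sketched as follows (an illustrative encoding of my own, with rules given as (head, body) pairs of atoms):

```python
def tp(program, I):
    """One application of T_P: heads of the rules whose body is
    contained in I.  Rules are (head, body) pairs; bodies are lists."""
    return {head for (head, body) in program if set(body) <= I}

def least_model(program):
    """Least fixpoint of T_P, built up from the empty interpretation:
    I0 = {}, I_{n+1} = T_P(I_n), until nothing changes."""
    I = set()
    while True:
        J = tp(program, I)
        if J == I:
            return I
        I = J

# a.   b <- a.   c <- a, b.   d <- e.
prog = [('a', []), ('b', ['a']), ('c', ['a', 'b']), ('d', ['e'])]
print(least_model(prog))   # d stays false: there is no rule for e
```

Since the program is finite, the iteration stops after at most as many rounds as there are atoms; continuity of TP is what guarantees this construction reaches the least fixpoint.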
54
On Minimal Models
• SLD can be used as a proof procedure for the minimal model semantics:
– If there is an SLD-derivation for A, then A is true
– Otherwise, A is false
• The semantics does not apply to normal programs:
– p ← not q has two minimal models:
{p} and {q}
There is no least model!
55
The idea of completion
• In LP one uses “if” but means “iff” [Clark78]
natural(0).
natural(s(N)) ← natural(N).
• This doesn’t imply that -1 is not a natural number!
• With this program we mean:
∀X (natural(X) ↔ X = 0 ∨ ∃N (X = s(N) ∧ natural(N)))
• This is the idea of Clark’s completion:
– Syntactically transform if’s into iff’s
– Use classical logic in the transformed theory to provide the semantics of the program
56
Program completion
• The completion of P is the theory comp(P) obtained by:
– Replace p(t) ← φ by p(X) ← X = t ∧ φ
– Replace p(X) ← φ by p(X) ← ∃Y φ, where Y are the original variables of the rule
– Merge all rules with the same head into a single one:
p(X) ← φ1 ∨ … ∨ φn
– For every q(X) without rules, add ¬q(X)
– Replace p(X) ← φ by ∀X (p(X) ↔ φ)
57
Completion Semantics
• Though completion’s definition is not that simple, the idea behind it is quite simple
• Also, it defines a non-classical semantics by means of classical inference on a transformed theory
D Let comp(P) be the completion of P, where not is interpreted as classical negation:
– A is true in P iff comp(P) |= A
– A is false in P iff comp(P) |= ¬A
58
SLDNF proof procedure
• By adopting completion, procedurally we have:
not is “negation as finite failure”• In SLDNF proceed as in SLD. To prove not A:
– If there is a finite derivation for A, fail not A– If, after any finite number of steps, all derivations
for A fail, remove not A from the resolvent (i.e. succeed not A)
• SLDNF can be efficiently implemented (cf. Prolog)
59
SLDNF example
p ← p.
q ← not p.
a ← not b.
b ← not c.

(SLDNF trees: to prove a, prove not b; the derivation for b, via not c, succeeds because c finitely fails, so not b fails and a fails. To prove q, prove not p; the derivation for p loops on p ← p — no success nor finite failure.)
• According to completion:
– comp(P) |= {not a, b, not c}
– comp(P) |≠ p, comp(P) |≠ not p
– comp(P) |≠ q, comp(P) |≠ not q
60
Problems with completion
• Some consistent programs may become inconsistent: p ← not p becomes p ↔ ¬p
• Does not correctly deal with deductive closures
edge(a,b). edge(c,d). edge(d,c).
reachable(a).
reachable(A) ← edge(A,B), reachable(B).
• Completion doesn’t conclude not reachable(c), due to the circularity caused by edge(c,d) and edge(d,c)
Circularity is a procedural concept, not a declarative one
61
Completion Problems (cont)
• Difficulty in representing equivalencies:
bird(tweety).
fly(B) ← bird(B), not abnormal(B).
abnormal(B) ← irregular(B).
irregular(B) ← abnormal(B).
• Completion doesn’t conclude fly(tweety)!– Without the rules on the left fly(tweety) is true
– An explanation for this would be: “the rules on the left cause a loop”.
Again, looping is a procedural concept, not a declarative one
When defining declarative semantics, procedural concepts should be rejected
62
Program stratification
• Minimal models don’t have “loop” problems
• But are only applicable to definite programs
• Generalize Minimal Models to Normal LPs:
– Divide the program into strata
– The 1st is a definite program. Compute its minimal model
– Eliminate all nots whose truth value was thus obtained
– The 2nd becomes definite. Compute its MM
– …
63
Stratification example

P, divided into strata:
P1: p ← p. a ← b. b.
P2: c ← not p. d ← c, not a.
P3: e ← a, not d. f ← not c.

• Least(P1) = {a, b, not p}
• Processing this, P2 becomes:
c ← true
d ← c, false
• Its minimal model, together with P1, is:
{a, b, c, not d, not p}
• Processing this, P3 becomes:
e ← a, true
f ← false
• The (desired) semantics for P is then:
{a, b, c, not d, e, not f, not p}
64
Stratification
D Let S1,…,Sn be such that S1 ∪ … ∪ Sn = HP and all the Si are disjoint. If, for every rule of P
A ← B1,…,Bm, not C1,…,not Ck
with A ∈ Si:
• {B1,…,Bm} ⊆ ∪j≤i Sj
• {C1,…,Ck} ⊆ ∪j≤i-1 Sj
then, letting Pi contain all rules of P whose head belongs to Si, P1;…;Pn is a stratification of P
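A stratification, when one exists, can be found mechanically. The sketch below is my own illustration (rule format (head, positive_body, negative_body)): it assigns strata by repeatedly enforcing the two conditions above; if the assignment keeps growing past the number of atoms, some cycle goes through negation and the program is not stratified.

```python
def stratify(program):
    """Assign a stratum to every atom so that each head is no lower than
    its positive body atoms and strictly higher than its negated ones.
    Returns the stratum map, or None when no stratification exists
    (a cycle through negation forces the strata to grow forever)."""
    atoms = {h for (h, _, _) in program}
    for (_, pos, negb) in program:
        atoms |= set(pos) | set(negb)
    stratum = {a: 0 for a in atoms}
    for _ in range(len(atoms) + 1):
        changed = False
        for (h, pos, negb) in program:
            need = max([stratum[b] for b in pos] +
                       [stratum[c] + 1 for c in negb] + [0])
            if stratum[h] < need:
                stratum[h] = need
                changed = True
        if not changed:
            return stratum
    return None   # still growing after |atoms|+1 passes: negative cycle

# a.   b <- a.   c <- not a.      (stratified: c one stratum above a)
print(stratify([('a', [], []), ('b', ['a'], []), ('c', [], ['a'])]))
# b <- not a.   a <- not b.       (no stratification)
print(stratify([('b', [], ['a']), ('a', [], ['b'])]))
```

The two test programs are exactly the ones on the next slide: the first admits a stratification, the second does not.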
65
Stratification (cont)
• A program may have several stratifications. For
a.
b ← a.
c ← not a.
both P1 = {a. b ← a.}; P2 = {c ← not a.} and P1 = {a.}; P2 = {b ← a.}; P3 = {c ← not a.} are stratifications.
• Or may have no stratification:
b ← not a.
a ← not b.
D A Normal Logic Program is stratified iff it admits (at least) one stratification.
66
Semantics of stratified LPs
D Let I|R be the restriction of interpretation I to the atoms in R, and P1;…;Pn be a stratification of P. Define the sequence:
• M1 = least(P1)
• Mi+1 is the minimal model of Pi+1 such that:
Mi+1|(∪j≤i Sj) = Mi
Mn is the standard model of P
• A is true in P iff A ∈ Mn
• Otherwise, A is false
67
Properties of Standard Model
Let MP be the standard model of stratified P
MP is unique (does not depend on the stratification)
MP is a minimal model of P
MP is supported
DA model M of program P is supported iff:
A ∈ M → ∃(A ← Body) ∈ P : Body ⊆ M
(true atoms must have a rule in P with true body)
68
Perfect models• The original definition of stratification (Apt et al.) was made
on predicate names rather than atoms.
• By abandoning the restriction of a finite number of strata, the definitions of Local Stratification and Perfect Models (Przymusinski) are obtained. This enlarges the scope of application:
even(0).
even(s(X)) ← not even(X).
P1 = {even(0)}
P2 = {even(s(0)) ← not even(0)}
...
• The program isn’t stratified (even/1 depends negatively on itself) but is locally stratified.
• Its perfect model is: {even(0),not even(1),even(2),…}
69
Problems with stratification
• Perfect models are adequate for stratified LPs– Newer semantics are generalization of it
• But there are (useful) non-stratified LPs
even(X) ← zero(X).                  zero(0).
even(Y) ← suc(X,Y), not even(X).    suc(X,s(X)).
• It is not stratified because (even(0) ← suc(0,0), not even(0)) ∈ P
• No stratification is possible if P has:
pacifist(X) ← not hawk(X).
hawk(X) ← not pacifist(X).
• This is useful in KR: “X is pacifist if it cannot be assumed X is hawk, and vice-versa. If nothing else is said, it is undefined whether X is pacifist or hawk”
70
SLS procedure
• In perfect models not includes infinite failure• SLS is a (theoretical) procedure for perfect models
based on possible infinite failure• No complete implementation is possible (how to
detect infinite failure?)• Sound approximations exist:
– based on loop checking (with ancestors)– based on tabulation techniques
(cf. XSB-Prolog implementation)
71
Stable Models Idea
• The construction of perfect models can be done without stratifying the program. Simply guess the model, process it into P and see if its least model coincides with the guess.
• If the program is stratified, the results coincide:– A correct guess must coincide on the 1st strata;
– and on the 2nd (given the 1st), and on the 3rd …
• But this can be applied to non-stratified programs…
72
Stable Models Idea (cont)
• “Guessing a model” corresponds to “assuming default negations not”. This type of reasoning is usual in NMR
– Assume some default literals
– Check in P the consequences of such assumptions
– If the consequences completely corroborate the assumptions, they form a stable model
• The stable models semantics is defined as the intersection of all the stable models (i.e. what follows, no matter what stable assumptions)
73
SMs: preliminary example
a ← not b    c ← a    p ← not q
b ← not a    c ← b    q ← not r    r
• Assume, e.g., not r and not p as true, and all others as false. By processing this into P:
a ← false    c ← a    p ← false
b ← false    c ← b    q ← true    r
• Its least model is {not a, not b, not c, not p, q, r}
• So, it isn’t a stable model:
– By assuming not r, r becomes true
– not a is not assumed and a becomes false
74
SMs example (cont)
a ← not b    c ← a    p ← not q
b ← not a    c ← b    q ← not r    r
• Now assume, e.g., not b and not q as true, and all others as false. By processing this into P:
a ← true    c ← a    p ← true
b ← false    c ← b    q ← false    r
• Its least model I is {a, not b, c, p, not q, r}
• I is a stable model
• The other one is {not a, b, c, p, not q, r}
• According to Stable Model Semantics:
– c, r and p are true and q is false.
– a and b are undefined
75
Stable Models definition
D Let I be a (2-valued) interpretation of P. The definite program P/I is obtained from P by:
• deleting all rules whose body has a default literal not A with A ∈ I
• deleting from the body all the remaining default literals
Γ(I) = least(P/I)
D M is a stable model of P iff M = Γ(M).
• A is true in P iff A belongs to all SMs of P
• A is false in P iff A doesn’t belong to any SM of P (i.e. not A “belongs” to all SMs of P).
76
Properties of SMs
Stable models are minimal models
Stable models are supported
If P is locally stratified then its single stable model is the perfect model
The stable models semantics assigns meaning to (some) non-stratified programs
– E.g. the one in the example before
77
Importance of Stable Models
Stable Models are an important contribution:– Introduce the notion of default negation (versus negation as
failure)– Allow important connections to NMR. Started the area of
LP&NMR– Allow for a better understanding of the use of LPs in
Knowledge Representation– Introduce a new paradigm (and accompanying
implementations) of LP
It is considered as THE semantics of LPs by a significant part of the community.
But...
78
Cumulativity
DA semantics Sem is cumulative iff for every P:
if A ∈ Sem(P) and B ∈ Sem(P) then B ∈ Sem(P ∪ {A})
(i.e. all derived atoms can be added as facts, without changing the program’s meaning)
• This property is important for implementations:– without cumulativity, tabling methods cannot be used
79
Relevance
D A directly depends on B if B occurs in the body of some rule with head A. A depends on B if A directly depends on B, or there is a C such that A directly depends on C and C depends on B.
D A semantics Sem is relevant iff for every P:
A ∈ Sem(P) iff A ∈ Sem(RelA(P))
where RelA(P) contains all rules of P whose head is A or some B on which A depends.
• Only this property allows for the usual top-down execution of logic programs.
80
Problems with SMs
a ← not b    c ← not a
b ← not a    c ← not c
The only SM is {not a, b, c}
• Don’t provide a meaning to every program:
– P = {a ← not a} has no stable models
• It’s non-cumulative and non-relevant:
– However b is not true in P ∪ {c} (non-cumulativity)
• P ∪ {c} has 2 SMs: {not a, b, c} and {a, not b, c}
– b is not true in Relb(P) (non-relevance)
• The rules in Relb(P) are the 2 on the left
• Relb(P) has 2 SMs: {not a, b} and {a, not b}
81
Problems with SMs (cont)
• Its computation is NP-complete
• The intersection of SMs is non-supported:
a ← not b    c ← a
b ← not a    c ← b
c is true but neither a nor b are true.
• Note that the perfect model semantics:– is cumulative– is relevant– is supported– its computation is polynomial
82
Part 3: Logic Programming for Knowledge representation
3.2 Answer-Set Programming
83
Programming with SMs
• A new paradigm of problem representation with Logic Programming (Answer-Set Programming – ASP)
– A problem is represented as (part of) a logic program (intensional database)
– An instance of a problem is represented as a set of facts (extensional database)
– Solutions of the problem are the models of the complete program
• In Prolog
– A problem is represented by a program
– Instances are given as queries
– Solutions are substitutions
84
Finding subsets
• In Prolog
subSet([],_).
subSet([E|Ss],[_|S]) :- subSet([E|Ss],S).
subSet([E|Ss],[E|S]) :- subSet(Ss,S).
?- subSet(X,[1,2,3]).
• In ASP:
– Program:
in_sub(X) :- element(X), not out_sub(X).
out_sub(X) :- element(X), not in_sub(X).
– Facts: element(1). element(2). element(3).
– Each stable model represents one subset.
• Which one do you find more declarative?
85
Generation of Stable Models
• A pair of rules
a :- not b.
b :- not a.
generates two stable models: one with a and another with b.
• Rules:
a(X) :- elem(X), not b(X).
b(X) :- elem(X), not a(X).
with elem(X) having N solutions, generate 2^N stable models
86
Small subsets
• From the previous program, eliminate stable models with more than one member
– I.e. eliminate all stable models where in_sub(X), in_sub(Y), X ≠ Y
• Just add the rule:
foo :- element(X), in_sub(X), in_sub(Y), not eq(X,Y), not foo.
% eq(X,X).
• Since there is no notion of query, it is very important to guarantee that it is possible to ground programs.
– All variables appearing in a rule must appear in a predicate that defines the domains, making it possible to ground the rule (in this case, the element(X) predicate).
87
Restricting Stable Models
• A rule
a :- cond, not a.
eliminates all stable models where cond is true.
• In most ASP solvers, this is simply written as an integrity constraint:
:- cond.
• An ASP program usually has:
– A part defining the domain (and the specific instance of the problem)
– A part generating models
– A part eliminating models
88
N-Queens
• Place N queens on an N×N chess board so that no queen attacks another.
% Generating models
hasQueen(X,Y) :- row(X), column(Y), not noQueen(X,Y).
noQueen(X,Y) :- row(X), column(Y), not hasQueen(X,Y).
% Eliminating models
% No 2 queens in the same row or column or diagonal
:- row(X), column(Y), row(XX), hasQueen(X,Y), hasQueen(XX,Y), not eq(X,XX).
:- row(X), column(Y), column(YY), hasQueen(X,Y), hasQueen(X,YY), not eq(Y,YY).
:- row(X), column(Y), row(XX), column(YY), hasQueen(X,Y), hasQueen(XX,YY), not eq(X,XX), eq(abs(X-XX), abs(Y-YY)).
% All rows must have at least one queen
:- row(X), not hasQueen(X).
hasQueen(X) :- row(X), column(Y), hasQueen(X,Y).
89
The facts (in smodels)
• Define the domain of the predicates and the specific program
• Possible to write in abbreviated form, resorting to constants
const size=8.
column(1..size).
row(1..size).
hide.
show hasQueen(X,Y).
• Solutions by:
> lparse -c size=4 | smodels 0
90
N-Queens version 2
• Generate less, such that no two queens appear in the same row or column.
% Generating models
hasQueen(X,Y) :- row(X), column(Y), not noQueen(X,Y).
noQueen(X,Y) :- row(X), column(Y), column(YY), not eq(Y,YY), hasQueen(X,YY).
noQueen(X,Y) :- row(X), column(Y), row(XX), not eq(X,XX), hasQueen(XX,Y).
• This already guarantees that all rows have a queen. Elimination of models is only needed for diagonals:
% Eliminating models
:- row(X), column(Y), row(XX), column(YY), hasQueen(X,Y), hasQueen(XX,YY), not eq(X,XX), eq(abs(X-XX), abs(Y-YY)).
91
Back to subsets
in_sub(X) :- element(X), not out_sub(X).out_sub(X) :- element(X), not in_sub(X).
• Generate subsets with at most 2 elements:
:- element(X), element(Y), element(Z),
   not eq(X,Y), not eq(Y,Z), not eq(X,Z),
   in_sub(X), in_sub(Y), in_sub(Z).
• Generate subsets with at least 2 elements:
hasTwo :- element(X), element(Y), not eq(X,Y), in_sub(X), in_sub(Y).
:- not hasTwo.
• It could be done for any maximum and minimum
• Smodels has a simplified notation for that:
2 {in_sub(X) : element(X)} 2.
92
Simplified notation in Smodels
• Generate models with between N and M elements of P(X) that satisfy Q(X), given R.
N {P(X):Q(X)} M :- R
• Example:
% Exactly one hasQueen(X,Y) per model for each column(Y), ranging over row(X)
1 {hasQueen(X,Y):row(X)} 1 :- column(Y).
% Same for columns
1 {hasQueen(X,Y):column(Y)} 1 :- row(X).
% Elimination in diagonal
:- row(X), column(Y), row(XX), column(YY), hasQueen(X,Y), hasQueen(XX,YY), not eq(X,XX), eq(abs(X-XX), abs(Y-YY)).
93
Graph colouring
• Problem: find all colourings of a map of countries using not more than 3 colours, such that neighbouring countries are not given the same colour.
• The predicate arc connects two countries.• Use ASP rules to generate colourings, and
integrity constraints to eliminate unwanted solutions
94
Graph colouring
arc(minnesota, wisconsin).  arc(illinois, iowa).
arc(illinois, michigan).    arc(illinois, wisconsin).
arc(illinois, indiana).     arc(indiana, ohio).
arc(michigan, indiana).     arc(michigan, ohio).
arc(michigan, wisconsin).   arc(minnesota, iowa).
arc(wisconsin, iowa).       arc(minnesota, michigan).
col(Country,Colour) ??
(figure: map of the seven states and their adjacencies)
95
Graph colouring
% generate
col(C,red) :- node(C), not col(C,blue), not col(C,green).
col(C,blue) :- node(C), not col(C,red), not col(C,green).
col(C,green) :- node(C), not col(C,blue), not col(C,red).

% eliminate
:- colour(C), con(C1,C2), col(C1,C), col(C2,C).

% auxiliary (colour/1 facts added so the constraint can be grounded)
colour(red). colour(blue). colour(green).
con(X,Y) :- arc(X,Y).
con(X,Y) :- arc(Y,X).
node(N) :- con(N,C).
96
One colouring solution
(figure: the coloured map)
Answer: 1Stable Model: col(minnesota,blue) col(wisconsin,green) col(michigan,red) col(indiana,green) col(illinois,blue) col(iowa,red) col(ohio,blue)
97
Hamiltonian paths
• Given a graph, find all Hamiltonian paths
arc(a,b). arc(a,d).
arc(b,a). arc(b,c).
arc(d,b). arc(d,c).

(figure: the four-node graph a, b, c, d)
98
Hamiltonian paths
% Subsets of arcs
in_arc(X,Y) :- arc(X,Y), not out_arc(X,Y).
out_arc(X,Y) :- arc(X,Y), not in_arc(X,Y).

% Nodes
node(N) :- arc(N,_).
node(N) :- arc(_,N).

% Notion of reachable
reachable(X) :- initial(X).
reachable(X) :- in_arc(Y,X), reachable(Y).
99
Hamiltonian paths
% initial is one (and only one) of the nodes
initial(N) :- node(N), not non_initial(N).
non_initial(N) :- node(N), not initial(N).
:- initial(N1), initial(N2), not eq(N1,N2).

% In Hamiltonian paths all nodes are reachable
:- node(N), not reachable(N).

% Paths must be connected subsets of arcs
% I.e. an arc from X to Y can only belong to the path if X is reachable
:- arc(X,Y), in_arc(X,Y), not reachable(X).

% No node can be visited more than once
:- node(X), node(Y), node(Z), in_arc(X,Y), in_arc(X,Z), not eq(Y,Z).
100
Hamiltonian paths (solutions)
(figure: the four-node graph a, b, c, d)

{in_arc(a,d), in_arc(b,c), in_arc(d,b)}
{in_arc(a,d), in_arc(b,a), in_arc(d,c)}
101
ASP vs. Prolog like programming
• ASP is adequate for:
– NP-complete problems
– situations where the whole program is relevant for the problem at hand
If the problem is polynomial, why use such a complex system?
If only part of the program is relevant for the desired query, why compute the whole model?
102
ASP vs. Prolog
• For such problems top-down, goal-driven mechanisms seem more adequate
• This type of mechanism is used by Prolog
– Solutions come as variable substitutions rather than complete models
– The system is activated by queries
– No global analysis is made: only the relevant part of the program is visited
103
Problems with Prolog
• Prolog declarative semantics is the completion– All the problems of completion are inherited by
Prolog
• According to SLDNF, termination is not guaranteed, even for Datalog programs (i.e. programs with finite ground version)
• A proper semantics is still needed
104
Part 3: Logic Programming for Knowledge representation
3.3 The Well Founded Semantics
105
Well Founded Semantics
• Defined in [GRS90], generalizes SMs to 3-valued models.
• Note that:
– there are programs with no fixpoints of Γ
– but all have fixpoints of Γ²
P = {a ← not a}: Γ({a}) = {} and Γ({}) = {a}
• There are no stable models
• But: Γ²({}) = {} and Γ²({a}) = {a}
106
Partial Stable Models

D A 3-valued interpretation (T ∪ not F) is a PSM of P iff:
• T = Γ²(T)
• T ⊆ Γ(T)
• F = HP - Γ(T)

The 2nd condition guarantees that no atom is both true and false: T ∩ F = {}

P = {a ← not a} has a single PSM: {}

a ← not b        c ← not a
b ← not a        c ← not c

This program has 3 PSMs:
{}, {a, not b} and {c, b, not a}
The 3rd corresponds to the single SM
107
WFS definition
T [WF Model] Every P has a knowledge ordering (i.e. wrt ⊆) least PSM, obtainable by the transfinite sequence:

T0 = {}
Ti+1 = Γ²(Ti)
Tδ = ∪α<δ Tα, for limit ordinals δ

Let T be the least fixpoint obtained.

MP = T ∪ not (HP - Γ(T))

is the well founded model of P.
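For finite ground programs the sequence is finite, and the construction can be sketched directly. This is a Python illustration of the alternating-fixpoint definition, with our own rule representation (head, positive body, negative body), not an actual WFS implementation.

```python
def gamma(rules, context):
    """Gelfond-Lifschitz Γ operator: delete every rule whose default
    negation conflicts with `context`, then return the least model of
    the resulting definite program."""
    reduct = [(head, pos) for head, pos, neg in rules if not (neg & context)]
    model, changed = set(), True
    while changed:
        changed = False
        for head, pos in reduct:
            if pos <= model and head not in model:
                model.add(head)
                changed = True
    return model

def well_founded(rules, atoms):
    """Iterate Ti+1 = Γ²(Ti) from T0 = {} up to the least fixpoint T;
    return (T, atoms false in the WFM), i.e. the WFM is T ∪ not(H - Γ(T))."""
    t = set()
    while True:
        nxt = gamma(rules, gamma(rules, t))
        if nxt == t:
            break
        t = nxt
    return t, atoms - gamma(rules, t)

# P = {a ← not a}: a is undefined (neither true nor false)
true_atoms, false_atoms = well_founded([('a', set(), {'a'})], {'a'})
```

For P = {a ← not a} both returned sets are empty, so a is undefined, as in the example above.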
108
Well Founded Semantics
• Let M be the well founded model of P:
– A is true in P iff A ∈ M
– A is false in P iff not A ∈ M
– Otherwise (i.e. A ∉ M and not A ∉ M), A is undefined in P
109
WFS Properties
• Every program is assigned a meaning
• Every PSM extends one SM
– If the WFM is total, it coincides with the single SM
• It is sound wrt the SMs semantics
– If P has stable models and A is true (resp. false) in the WFM, it is also true (resp. false) in the intersection of SMs
• The WFM coincides with the perfect model in locally stratified programs (and with the least model in definite programs)
110
More WFS Properties
• The WFM is supported
• WFS is cumulative and relevant
• Its computation is polynomial (on the number of instantiated rules of P)
• There are top-down proof procedures, and sound implementations
– these are mentioned in the sequel
111
Part 3: Logic Programming for Knowledge representation
3.4 Comparison to other Non-Monotonic Formalisms
112
LP and Default Theories

D Let the default theory be obtained by transforming every rule

H ← B1,…,Bn, not C1,…, not Cm

into the default:

B1,…,Bn : ¬C1,…, ¬Cm
――――――――――――――
H

T There is a one-to-one correspondence between the SMs of P and the default extensions of the transformed theory
T If L ∈ WFM(P) then L belongs to every extension of the transformed theory
113
LPs as defaults
• LPs can be viewed as sets of default rules
• Default literals are the justifications:
– they can be assumed if it is consistent to do so
– they are withdrawn if inconsistent
• In this reading of LPs, ← is not viewed as implication. Instead, LP rules are viewed as inference rules.
114
LP and Auto-Epistemic Logic
D Let the AEL theory be obtained by transforming every rule

H ← B1,…,Bn, not C1,…, not Cm

into:

B1 ∧ … ∧ Bn ∧ ¬L C1 ∧ … ∧ ¬L Cm → H

T There is a one-to-one correspondence between the SMs of P and the (Moore) expansions of the transformed theory
T If L ∈ WFM(P) then L belongs to every expansion of the transformed theory
115
LPs as AEL theories
• LPs can be viewed as theories that refer to their own knowledge
• Default negation not A is interpreted as “A is not known”
• The LP rule symbol ← is here viewed as material implication
116
LP and AEB

D Let the AEB theory be obtained by transforming every rule

H ← B1,…,Bn, not C1,…, not Cm

into:

B1 ∧ … ∧ Bn ∧ B¬C1 ∧ … ∧ B¬Cm → H

T There is a one-to-one correspondence between the PSMs of P and the AEB expansions of the transformed theory
T A ∈ WFM(P) iff A is in every expansion of the transformed theory
not A ∈ WFM(P) iff B¬A is in all expansions of the transformed theory
117
LPs as AEB theories
• LPs can be viewed as theories that refer to their own beliefs
• Default negation not A is interpreted as “It is believed that A is false”
• The LP rule symbol ← is also viewed as material implication
118
SM problems revisited
• The mentioned problems of SM are not necessarily problems:
– Relevance is not desired when analyzing global problems
– If the SMs are equated with the solutions of a problem, then some problems simply have no solution
– Some problems are NP, so using an NP language is not a problem
– For NP problems, the efficiency gains from cumulativity are not really an issue
119
SM versus WFM

• They yield different forms of programming and of representing knowledge, for usage with different purposes
• Usage of the WFM:
– Closer to that of Prolog
– When local reasoning (and relevance) are important
– When efficiency is an issue, even at the cost of expressivity
• Usage of SMs:
– For representing NP-complete problems
– Global reasoning
– A different form of programming, not close to that of Prolog
• Solutions are models, rather than answers/substitutions
120
Part 3: Logic Programming for Knowledge representation
3.5 Extended Logic Programs
121
Extended LPs

• In Normal LPs all the negative information is implicit. Though that's desired in some cases (e.g. the database with flight connections), sometimes an explicit form of negation is needed for Knowledge Representation
• "Penguins don't fly" could be: noFly(X) ← penguin(X)
• This does not relate fly(X) and noFly(X) in:

fly(X) ← bird(X)
noFly(X) ← penguin(X)

For establishing such relations, and for representing negative information, a new form of negation is needed in LP:

Explicit negation: ¬
122
Extended LP: motivation
• ¬ is also needed in bodies:
"Someone is guilty if he is not innocent"
– cannot be represented by: guilty(X) ← not innocent(X)
– This would imply guilty in the absence of information about innocent
– Instead, guilty(X) ← ¬innocent(X) only implies guilty(X) if X is proven not to be innocent
• The difference between not p and ¬p is essential whenever the information about p cannot be assumed to be complete
123
ELP motivation (cont)
• ¬ allows for greater expressivity:
"If you're not sure that someone is not innocent, then further investigation is needed"
– Can be represented by:

investigate(X) ← not ¬innocent(X)

• ¬ extends the relation of LP to other NMR formalisms. E.g.:
– it can represent default rules with negative conclusions and pre-requisites, and positive justifications
– it can represent normal default rules
124
Explicit versus Classical ¬

• Classical ¬ complies with the "excluded middle" principle (i.e. F ∨ ¬F is tautological)
– This makes sense in mathematics
– What about in common sense knowledge?
• ¬A is the opposite of A.
• The "excluded middle" leaves no room for undefinedness:

hire(X) ← qualified(X)
reject(X) ← ¬qualified(X)

The "excluded middle" implies that every X is either hired or rejected.
It leaves no room for those about whom further information is needed to determine whether they are qualified.
125
ELP Language

• An Extended Logic Program P is a set of rules:

L0 ← L1, …, Lm, not Lm+1, …, not Ln   (n ≥ m ≥ 0)

where the Li are objective literals
• An objective literal is an atom A or its explicit negation ¬A
• Literals of the form not Lj are called default literals
• The Extended Herbrand base HP is the set of all instantiated objective literals from program P
• We will consider programs as possibly infinite sets of instantiated rules.
126
ELP Interpretations

• An interpretation I of P is a set

I = T ∪ not F

where T and F are disjoint subsets of HP and

¬L ∈ T ⇒ L ∈ F   (Coherence Principle)

i.e. if L is explicitly false, it must be assumed false by default
• I is total iff HP = T ∪ F
• I is consistent iff ¬∃L: {L, ¬L} ⊆ T
– In total consistent interpretations the Coherence Principle is trivially satisfied
127
Answer sets

• It was the 1st semantics for ELPs [Gelfond&Lifschitz90]
• Generalizes stable models to ELPs

D Let M- be a stable model of the normal program P- obtained by replacing in the ELP P every ¬A by a new atom A-. An answer-set M of P is obtained by replacing A- by ¬A in M-.

• A is true in an answer set M iff A ∈ M
• A is false iff ¬A ∈ M
• Otherwise, A is unknown
• Some programs have no consistent answer sets:
– e.g. P = {a ←, ¬a ←}
128
Answer sets and Defaults

D Let the default theory be obtained by transforming every rule

L0 ← L1,…,Lm, not Lm+1,…, not Ln

into the default:

L1,…,Lm : ¬Lm+1,…, ¬Ln
――――――――――――――
L0

where ¬¬A is (always) replaced by A

T There is a one-to-one correspondence between the answer-sets of P and the default extensions of the transformed theory
129
Answer-sets and AEL
D Let the AEL theory be obtained by transforming every rule

L0 ← L1,…,Lm, not Lm+1,…, not Ln

into:

L1 ∧ L L1 ∧ … ∧ Lm ∧ L Lm ∧ ¬L Lm+1 ∧ … ∧ ¬L Ln → L0 ∧ L L0

T There is a one-to-one correspondence between the answer-sets of P and the expansions of the transformed theory
130
The coherence principle

• Generalizing WFS in the same way yields unintuitive results:

pacifist(X) ← not hawk(X)
hawk(X) ← not pacifist(X)
¬pacifist(a)

– Using the same method, the WFM is: {¬pacifist(a)}
– Though it is explicitly stated that a is non-pacifist, not pacifist(a) is not assumed, and so hawk(a) cannot be concluded.
• Coherence is not satisfied… Coherence must be imposed
131
Imposing Coherence
• Coherence is: ¬L ∈ T ⇒ L ∈ F, for objective literals L
• According to the WFS definition, everything that doesn't belong to Γ(T) is false
• To impose coherence, when applying Γ(T) simply delete all rules for the objective complement of literals in T

"If L is explicitly true, then when computing the undefined literals forget all rules with head ¬L"
132
WFSX definition

D The semi-normal version of P, Ps, is obtained by adding not ¬L to the body of every rule of P with head L

D An interpretation (T ∪ not F) is a PSM of an ELP P iff:
• T = ΓP(ΓPs(T))
• T ⊆ ΓPs(T)
• F = HP - ΓPs(T)

T The WFSX semantics is determined by the knowledge ordering (i.e. wrt ⊆) least PSM
133
WFSX example
P:  pacifist(X) ← not hawk(X)
    hawk(X) ← not pacifist(X)
    ¬pacifist(a)

Ps: pacifist(X) ← not hawk(X), not ¬pacifist(X)
    hawk(X) ← not pacifist(X), not ¬hawk(X)
    ¬pacifist(a) ← not pacifist(a)

T0 = {}
Γs(T0) = {¬p(a), p(a), h(a), p(b), h(b)}
T1 = {¬p(a)}
Γs(T1) = {¬p(a), h(a), p(b), h(b)}
T2 = {¬p(a), h(a)}
T3 = T2

The WFM is:
{¬p(a), h(a), not p(a), not ¬h(a), not ¬p(b), not ¬h(b)}
134
Properties of WFSX
• Complies with the coherence principle
• Coincides with WFS in normal programs
• If WFSX is total it coincides with the only answer-set
• It is sound wrt answer-sets
• It is supported, cumulative, and relevant
• Its computation is polynomial
• It has sound implementations (cf. below)
135
Inconsistent programs
• Some ELPs have no WFM. E.g. {a ←, ¬a ←}
• What to do in these cases?
Explosive approach: everything follows from contradiction
• taken by answer-sets
• gives no information in the presence of contradiction
Belief revision approach: remove contradiction by revising P
• computationally expensive
Paraconsistent approach: isolate contradiction
• efficient
• allows reasoning about the non-contradictory part
136
WFSXp definition
• The paraconsistent version of WFSX is obtained by dropping the requirement that T and F are disjoint, i.e. dropping the condition T ⊆ ΓPs(T)

D An interpretation T ∪ not F is a PSMp of P iff:
• T = ΓP(ΓPs(T))
• F = HP - ΓPs(T)

T The WFSXp semantics is determined by the knowledge ordering (i.e. wrt ⊆) least PSMp
137
WFSXp example
P:  c ← not b
    b ← a
    a
    ¬a
    d ← not e

Ps: c ← not b, not ¬c
    a ← not ¬a
    b ← a, not ¬b
    ¬a ← not a
    d ← not e, not ¬d

T0 = {}
Γs(T0) = {¬a, a, b, c, d}
T1 = {¬a, a, b, d}
Γs(T1) = {d}
T2 = {¬a, a, b, c, d}
T3 = T2

The WFM is:
{¬a, a, b, c, d, not a, not ¬a, not b, not ¬b, not c, not ¬c, not ¬d, not e}
138
Surgery situation

• A patient arrives with: sudden epigastric pain; abdominal tenderness; signs of peritoneal irritation
• The rules for diagnosing are:
– if he has sudden epigastric pain, abdominal tenderness, and signs of peritoneal irritation, then he has a perforation of a peptic ulcer or an acute pancreatitis
– the former requires surgery, the latter therapeutic treatment
– if he has high amylase levels, then a perforation of a peptic ulcer can be exonerated
– if he has Jobert's manifestation, then pancreatitis can be exonerated
– In both situations, the patient should not be nourished, but should take H2 antagonists
139
LP representation

perforation ← pain, abd-tender, per-irrit, not high-amylase
pancreat ← pain, abd-tender, per-irrit, not jobert
¬nourish ← perforation        h2-ant ← perforation
¬nourish ← pancreat           h2-ant ← pancreat
surgery ← perforation         anesthesia ← surgery
¬surgery ← pancreat

pain.  abd-tender.  per-irrit.  ¬high-amylase.  ¬jobert.

The WFM is:
{pain, abd-tender, per-irrit, ¬high-am, ¬jobert, not ¬pain, not ¬abd-tender, not ¬per-irrit, not high-am, not jobert, ¬nourish, h2-ant, not nourish, not ¬h2-ant, surgery, ¬surgery, not surgery, not ¬surgery, anesthesia, not anesthesia, not ¬anesthesia}
140
Results interpretation
• The symptoms are derived and non-contradictory
• Both perforation and pancreatitis are concluded
• He should not be fed (¬nourish), but take H2 antagonists
• The information about surgery is contradictory
• Anesthesia, though not explicitly contradictory (¬anesthesia doesn't belong to the WFM), relies on contradiction (both anesthesia and not anesthesia belong to the WFM)

The WFM is:
{pain, abd-tender, per-irrit, ¬high-am, ¬jobert, …, ¬nourish, h2-ant, not nourish, not ¬h2-ant, surgery, ¬surgery, not surgery, not ¬surgery, anesthesia, not anesthesia, not ¬anesthesia}
141
Part 3: Logic Programming for Knowledge representation
3.6 Proof procedures
142
WFSX programming
• Prolog programming style, but with the WFSX semantics
• Requires:
– A new proof procedure (different from SLDNF), complying with WFS and with explicit negation
– The corresponding Prolog-like implementation: XSB-Prolog
143
SLX:Proof procedure for WFSX
• SLX (SL with eXplicit negation) is a top-down procedure for WFSX
• It is similar to SLDNF:
– Nodes are either successful or failed
– Resolution with program rules, and resolution of default literals by failure of their complements, are as in SLDNF
• In SLX, failure doesn't mean falsity. It simply means non-verity (i.e. false or undefined)
144
Success and failure
• A finite tree is successful if its root is successful, and failed if its root is failed
• The status of a node is determined by:
– A leaf labeled with an objective literal is failed
– A leaf labeled with true is successful
– An intermediate node is successful if all its children are successful, and failed otherwise (i.e. at least one of its children is failed)
145
Negation as Failure?

• As in SLS, to solve infinite positive recursion, infinite trees are (by definition) failed
• Can a NAF-like rule be used? YES:

True of not A succeeds if true-or-undefined of A fails
True-or-undefined of not A succeeds if true of A fails

• This is the basis of SLX. It defines:
– T-Trees for proving truth
– TU-Trees for proving truth or undefinedness
146
T and TU-trees

• They differ in that literals involved in recursion through negation, and so undefined in WFSXp, are failed in T-Trees and successful in TU-Trees

[Figure: for the program a ← not b, b ← not a, the T-trees for a and for b are failed, while the corresponding TU-trees are successful, since a and b are undefined]
147
Explicit negation in SLX

• ¬-literals are treated as atoms
• To impose coherence, the semi-normal program is used in TU-trees

[Figure: trees for the program a ← not b, b ← not a, ¬a; in the TU-tree for a the additional child not ¬a fails, because ¬a succeeds as true, so the T-tree for b succeeds]
148
Explicit negation in SLX (2)

• In TU-trees, L also fails if ¬L succeeds as true
• I.e. if not ¬L fails as true-or-undefined

[Figure: trees for the program c ← not c, b ← not c, ¬b, a ← b; in the TU-tree, b fails because ¬b succeeds as true, even though not c is only undefined, and so a fails]
149
T and TU-trees definition
D T-Trees (resp. TU-trees) are AND-trees labeled by literals, constructed top-down from the root by expanding nodes with the rules:
• Nodes labeled with an objective literal A:
– If there are no rules for A, the node is a leaf
– Otherwise, non-deterministically select a rule for A

A ← L1,…,Lm, not Lm+1,…, not Ln

– In a T-tree the children of A are L1,…,Lm, not Lm+1,…, not Ln
– In a TU-tree A has, additionally, the child not ¬A
• Nodes labeled with default literals are leaves
150
Success and Failure

D All infinite trees are failed. A finite tree is successful if its root is successful, and failed if its root is failed. The status of nodes is determined by:
• A leaf node labeled with true is successful
• A leaf node labeled with an objective literal is failed
• A leaf node of a T-tree (resp. TU-tree) labeled with not A is successful if all TU-trees (resp. T-trees) with root A (subsidiary trees) are failed; and failed otherwise
• An intermediate node is successful if all its children are successful; and failed otherwise

After applying these rules, some nodes may remain undetermined (recursion through not). Undetermined nodes in T-trees (resp. TU-trees) are by definition failed (resp. successful)
151
Properties of SLX
• SLX is sound and (theoretically) complete wrt WFSX.
• If there is no explicit negation, SLX is sound and (theoretically) complete wrt WFS.
• See [AP96] for the definition of a refutation procedure based on the AND-trees characterization, and for all proofs and details
152
Infinite trees examples

s ← not p, not q, not r
p ← not s, q, not r
q ← r, not p
r ← p, not q

The WFM is {s, not p, not q, not r}

[Figure: the T-tree for s succeeds, since its children not p, not q and not r all succeed; the subsidiary trees for p, q and r are infinite, due to the positive recursion between q and r, and are therefore failed]
153
Negative recursion example
q ← not p(0), not s
p(N) ← not p(s(N))
s ← true

WFM = {s, not q}

[Figure: the T-tree for q fails, because not s fails (s succeeds with true); the trees for p(0), p(1), p(2), … recurse through negation forever, so the p(N) remain undefined]
154
Guaranteeing termination
• The method is not effective, because of loops
• To guarantee termination in ground programs:

Local ancestors of a node n are the literals in the path from n to the root, exclusive of n

Global ancestors are assigned to trees:
• the root tree has no global ancestors
• the global ancestors of T, a subsidiary tree of leaf n of T', are the global ancestors of T' plus the local ancestors of n
• global ancestors are divided into those occurring in T-trees and those occurring in TU-trees
155
Pruning rules

• For cyclic positive recursion:

Rule 1: If the label of a node belongs to its local ancestors, then the node is marked failed, and its children are ignored

• For recursion through negation:

Rule 2: If a literal L in a T-tree occurs in its global T-ancestors, then it is marked failed, and its children are ignored
156
Pruning rules (2)

[Figure: schematic trees illustrating Rule 1 (a node labeled L repeating one of its local ancestors L) and Rule 2 (a literal L in a T-tree repeating a global T-ancestor L)]
157
Other sound rules

Rule 3: If a literal L in a T-tree occurs in its global TU-ancestors, then it is marked failed, and its children are ignored

Rule 4: If a literal L in a TU-tree occurs in its global T-ancestors, then it is marked successful, and its children are ignored

Rule 5: If a literal L in a TU-tree occurs in its global TU-ancestors, then it is marked successful, and its children are ignored
158
Pruning examples

a ← not b
b ← not a
¬a

c ← not c
b ← not c
¬b
a ← b

[Figure: the trees of the two previous examples, now finite: Rule 2 prunes the repeated occurrence of b, and Rule 3 prunes the repeated occurrence of c]
159
Non-ground case

• The characterization and pruning rules apply to allowed non-ground programs, with ground queries
• It is well known that pruning rules do not generalize to general programs with variables:

p(X) ← p(Y)
p(a)

In the tree for p(X), the child p(Y) repeats (up to renaming) its ancestor. What to do?
• If we "fail" it, the answers are incomplete
• If we "proceed", then we loop
160
Tabling
• To guarantee termination in non-ground programs, instead of ancestors and pruning rules, tabulation mechanisms are required:
– when there is a possible loop, suspend the literal and try alternative solutions
– when a solution is found, store it in a table
– resume suspended nodes with new solutions in the table
– apply an algorithm to determine completion of the process, i.e. when no more solutions exist, and fail the corresponding suspended nodes
161
Tabling example
• SLX is also implemented with tabulation mechanisms
• It uses the XSB-Prolog tabling implementation
• SLX with tabling is available with XSB-Prolog from Version 2.0 onwards
• Try it at: http://xsb.sourceforge.net/

p(X) ← p(Y)
p(a)

[Figure: tabled evaluation of p(X); the call to p(Y) is suspended (possible loop), the fact p(a) puts the answer X = a in the table for p(X), and the suspended node is then resumed with Y = a]
162
Tabling (cont.)
• If a solution is already stored in a table, and the predicate is called again, then:
– there is no need to compute the solution again
– simply pick it from the table!
• This increases efficiency. Sometimes by one order of magnitude.
163
Fibonacci example

fib(1,1).
fib(2,1).
fib(X,F) :- X > 2, X1 is X-1, X2 is X-2,
            fib(X1,F1), fib(X2,F2), F is F1 + F2.

[Figure: call tree for fib(6,X) with the answer table built during the evaluation; each call fib(N,F) is computed once and its answer is afterwards reused from the table]

• Linear rather than exponential
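The gain from tabling here is that of memoization; the same effect can be sketched in Python:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Tabled reading of fib/2: each fib(n) is computed once and then
    read back from the cache, just as XSB reuses answers from the table,
    so the evaluation is linear rather than exponential."""
    if n in (1, 2):
        return 1
    return fib(n - 1) + fib(n - 2)
```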
164
XSB-Prolog
• Can be used to compute under WFS
• Prolog + tabling
– To turn tabling on for, e.g., predicate p with 3 arguments:

:- table p/3.

• Tables are kept from call to call, until:

abolish_all_tables
abolish_table_pred(P/A)
165
XSB Prolog (cont.)
• Well-founded negation can be used via tnot(Pred)
• Explicit negation via -Pred
• The answer to query Q is yes if Q is either true or undefined in the WFM
• The answer is no if Q is false in the WFM of the program
166
Distinguishing T from U
• After providing all answers, tables store the literals suspended due to recursion through negation: the Residual Program
• If the residual is empty then True
• If it is not empty then Undefined
• The residual can be inspected with:get_residual(Pred,Residual)
167
Residual program example

:- table a/0.
:- table b/0.
:- table c/0.
:- table d/0.

a :- b, tnot(c).
c :- tnot(a).
b :- tnot(d).
d :- d.

| ?- a, b, c, d, fail.
no
| ?- get_residual(a,RA).
RA = [tnot(c)] ;
no
| ?- get_residual(b,RB).
RB = [] ;
no
| ?- get_residual(c,RC).
RC = [tnot(a)] ;
no
| ?- get_residual(d,RD).
no
| ?-
168
Transitive closure
• Due to circularity, completion cannot conclude not reach(c)
• SLDNF (and Prolog) loops on that query
• XSB-Prolog works fine

:- auto_table.
edge(a,b).
edge(c,d).
edge(d,c).
reach(a).
reach(A) :- edge(A,B), reach(B).

|?- reach(X).
X = a;
no.
|?- reach(c).
no.
|?- tnot(reach(c)).
yes.
169
Transitive closure (cont)
:- auto_table.
edge(a,b).
edge(c,d).
edge(d,c).
reach(a).
reach(A) :- edge(A,B), reach(B).

• Instead one could have written:

:- auto_table.
edge(a,b).
edge(c,d).
edge(d,c).
reach(a).
reach(A) :- reach(B), edge(A,B).

• Declarative semantics closer to operational
• Left recursion is handled properly
• The left-recursive version is usually more efficient
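Under either rule ordering, what tabled reach/1 computes is the least fixpoint of the edge step; a Python sketch of that fixpoint (names are ours):

```python
def reachable(edges, facts):
    """Least-fixpoint reading of reach/1: a node is in the result if it is
    a reach fact, or has an edge into a node already in the result. This
    is the model both tabled versions compute, regardless of recursion style."""
    reach = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in edges:
            if b in reach and a not in reach:
                reach.add(a)
                changed = True
    return reach
```

On the program above, reachable({('a','b'), ('c','d'), ('d','c')}, {'a'}) contains only a, matching not reach(c) in the WFM.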
170
Grammars
• Prolog provides "for free" a right-recursive descent parser
• With tabling, left recursion can be handled
• It also eliminates redundancy (gaining efficiency), and handles grammars that loop under Prolog.
171
Grammars example
:- table expr/2, term/2.

expr --> expr, [+], term.
expr --> term.
term --> term, [*], prim.
term --> prim.
prim --> ['('], expr, [')'].
prim --> [Int], {integer(Int)}.

• This grammar loops in Prolog
• XSB handles it correctly, properly associating * and + to the left
172
Grammars example

:- table expr/3, term/3.

expr(V) --> expr(E), [+], term(T), {V is E + T}.
expr(V) --> term(V).
term(V) --> term(T), [*], prim(P), {V is T * P}.
term(V) --> prim(V).
prim(V) --> ['('], expr(V), [')'].
prim(Int) --> [Int], {integer(Int)}.

• With XSB one gets "for free" a parser based on a variant of Earley's algorithm, or an active chart recognition algorithm
• Its time complexity is better!
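The value the annotated grammar assigns, with * binding tighter than + and both associating to the left, can be sketched in Python. This is a plain iterative parser over a token list, not XSB's tabled evaluation; all names are ours.

```python
def eval_expr(tokens):
    """Evaluate a tokenized arithmetic expression with the grammar's
    semantics: * binds tighter than +, both associate to the left."""
    def parse_prim(i):
        if tokens[i] == '(':
            v, i = parse_expr(i + 1)
            return v, i + 1                  # skip the closing ')'
        return int(tokens[i]), i + 1
    def parse_term(i):
        v, i = parse_prim(i)
        while i < len(tokens) and tokens[i] == '*':
            p, i = parse_prim(i + 1)
            v *= p                           # left-associative *
        return v, i
    def parse_expr(i):
        v, i = parse_term(i)
        while i < len(tokens) and tokens[i] == '+':
            t, i = parse_term(i + 1)
            v += t                           # left-associative +
        return v, i
    return parse_expr(0)[0]
```

For example, eval_expr(['1', '+', '2', '*', '3']) gives 7, while the parenthesized variant gives 9.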
173
Finite State Machines
:- table rec/2.

rec(St) :- initial(I), rec(St,I).

rec([],S) :- is_final(S).
rec([C|R],S) :- d(S,C,S2), rec(R,S2).

• Tabling is well suited for Automata Theory implementations

initial(q0).
d(q0,a,q1).
d(q1,a,q2).
d(q2,b,q1).
d(q1,a,q3).
is_final(q3).

[Figure: the automaton with states q0, q1, q2, q3 and the transitions above; q3 is the final state]
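The tabled rec/2 amounts to running the automaton over the word. Since the machine above is nondeterministic (two a-transitions leave q1), a set-of-states simulation captures it; a Python sketch, names ours:

```python
def accepts(delta, initial, finals, word):
    """Nondeterministic reading of rec/2: track the set of states reachable
    after each symbol, and accept if some run ends in a final state."""
    states = {initial}
    for c in word:
        states = {s2 for (s1, ch, s2) in delta if s1 in states and ch == c}
    return bool(states & finals)

# Transitions of the slide's automaton, as (from, symbol, to) triples
delta = {('q0', 'a', 'q1'), ('q1', 'a', 'q2'), ('q2', 'b', 'q1'), ('q1', 'a', 'q3')}
```

With these transitions, accepts(delta, 'q0', {'q3'}, 'aa') holds, since one of the two a-transitions from q1 reaches q3.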
174
Dynamic Programming
• Strategy for evaluating subproblems only once.
– Problems amenable to DP might also be amenable to XSB.
• The Knap-Sack Problem:
– Given n items, each with a weight Ki (1 ≤ i ≤ n), determine whether there is a subset of the items that sums to K
175
The Knap-Sack Problem
Given n items, each with a weight Ki (1 ≤ i ≤ n), determine whether there is a subset of the items that sums to K.

:- table ks/2.

ks(0,0).
ks(I,K) :- I > 0, I1 is I-1, ks(I1,K).
ks(I,K) :- I > 0, item(I,Ki), K1 is K-Ki, I1 is I-1, ks(I1,K1).

• There is an exponential number of subsets. Computing this with plain Prolog is exponential.
• There are only polynomially many distinct calls. Computing this with tabling is polynomial.
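The tabled ks/2 is plain memoized subset-sum; a Python sketch of the same recursion, names ours:

```python
from functools import lru_cache

def knapsack(weights, target):
    """ks(I,K) reading: a subset of the first i items sums to k if the first
    i-1 items already do (skip item i), or if they sum to k minus item i's
    weight (take item i). Memoization bounds the work by the number of
    distinct (i, k) calls, exactly as tabling does."""
    @lru_cache(maxsize=None)
    def ks(i, k):
        if i == 0:
            return k == 0
        w = weights[i - 1]
        return ks(i - 1, k) or (k >= w and ks(i - 1, k - w))
    return ks(len(weights), target)
```

For example, knapsack([3, 5, 7], 8) holds (take 3 and 5), while knapsack([3, 5, 7], 9) does not.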
176
Combined WFM and ASP at work
• The XSB-Prolog XASP package combines XSB with Smodels
– Makes it possible to combine WFM computation with Answer-sets
– Use (top-down) WFM computation to determine the relevant part of the program
– Compute the stable models of the residual
– Possibly manipulate the results back in Prolog
177
XNMR mode

• Extends the Prolog shell with querying the stable models of the residual:

:- table a/0, b/0, c/0.

a :- tnot(b).
b :- tnot(a).
c :- b.
c :- a.

C:\> xsb xnmr.
[…]
nmr| ?- [example].
yes
nmr| ?- c.
DELAY LIST = [a]
DELAY LIST = [b]        (the residuals of the query)
? s
{c;a} ;
{c;b} ;                 (SMs of the residual)
no
nmr| ?- a.
DELAY LIST = [tnot(b)]
? s
{a} ;
no                      (SMs of the residual where the query is true)
nmr| ?-
178
XNMR mode and relevance
• Stable models given a query
• First computes the relevant part of the program given the query
• This step already allows for:
– Processing away literals in the WFM
– Grounding of the program, given the query
• This is a different grounding mechanism, in contrast to lparse or to that of DLV
• It is query dependent, and doesn't require as many domain predicates in rule bodies…
179
XASP libraries
• Allows for calling smodels from within XSB programs
• Detailed control and processing of Stable Models
• Two libraries are provided:
– sm_int, which includes a quite low-level (external) control of smodels
– xnmr_int, which allows for a combination of SMs and Prolog in the same program
180
sm_int library
• Assumes a store with (smodels) rules
• Provides predicates for:
– Initializing the store (smcInit/0 and smcReInit/0)
– Adding and retracting rules (smcAddRule/2 and smcRetractRule/2)
– Calling smodels on the rules of the store (smcCommitProgram/0 and smcComputeModel/0)
– Examining the computed SMs (smcExamineModel/2)
– smcEnd/0 for reclaiming resources in the end
181
xnmr_int library

• Allows for control, within Prolog, of the interface provided by xnmr:
– Predicates that call goals, compute the residual, and compute the SMs of the residual
• pstable_model(+Query,-Model,0)
– Computes one SM of the residual of the Query
– Upon backtracking, computes other SMs
• pstable_model(+Query,-Model,1)
– As above, but only SMs where Query is true
• Allows for pre- and post-processing of the models
– E.g. for finding models that are minimal or preferred in some sense
– For pretty input and output, etc.
• You must:

:- import pstable_model/3 from xnmr_int.
182
Exercise
• Write an XSB-XASP program that:
– Reads from the input the dimension N of the board
– Computes the solutions of the N-queens problem of that dimension
– Shows the solutions "nicely" on the screen
– Shows what is common to all solutions
• E.g. square (1,1) never has a queen, in any solution
• Write an XSB-XASP program that computes minimal diagnoses of digital circuits
183
Part 3: Logic Programming for Knowledge representation
3.7 Application to representing taxonomies
184
A methodology for KR
• WFSXp provides mechanisms for representing usual KR problems:
– logic language
– non-monotonic mechanisms for defaults
– forms of explicitly representing negation
– paraconsistency handling
– ways of dealing with undefinedness
• In what follows, we propose a methodology for representing (incomplete) knowledge of taxonomies with default rules using WFSXp
185
Representation method (1)
Definite rules ("If A then B"):
– B ← A
• penguins are birds: bird(X) ← penguin(X)

Default rules ("Normally, if A then B"):
– B ← A, rule_name, not ¬B
  rule_name ← not ¬rule_name
• birds normally fly:
  fly(X) ← bird(X), bf(X), not ¬fly(X)
  bf(X) ← not ¬bf(X)
186
Representation method (2)

Exceptions to default rules ("Under conditions COND do not apply rule RULE"):
– ¬RULE ← COND
• Penguins are an exception to the birds-fly rule:
  ¬bf(X) ← penguin(X)

Preference rules ("Under conditions COND prefer rule RULE+ to RULE-"):
– ¬RULE- ← COND, RULE+
• for penguins, prefer the penguins-don't-fly rule to the birds-fly rule:
  ¬bf(X) ← penguin(X), pdf(X)
187
Representation method (3)

Hypothetical rules ("If A then B" may or may not apply):
– B ← A, rule_name, not ¬B
  rule_name ← not ¬rule_name
  ¬rule_name ← not rule_name
• quakers might be pacifists:
  pacifist(X) ← quaker(X), qp(X), not ¬pacifist(X)
  qp(X) ← not ¬qp(X)
  ¬qp(X) ← not qp(X)

For a quaker, there is a PSM with pacifist and another with not pacifist. In the WFM, pacifist is undefined.
188
Taxonomy example

The elements:
• Pluto is a mammal
• Joe is a penguin
• Tweety is a bird
• Dracula is a dead bat

The taxonomy:
• Mammals are animals
• Bats are mammals
• Birds are animals
• Penguins are birds
• Dead animals are animals
• Normally animals don't fly
• Normally bats fly
• Normally birds fly
• Normally penguins don't fly
• Normally dead animals don't fly

The preferences:
• Dead bats don't fly though bats do
• Dead birds don't fly though birds do
• Dracula is an exception to the above
• In general, more specific information is preferred
189
The taxonomy
[Figure: taxonomy graph with nodes flies, animal, bird, penguin, mammal, bat and dead animal, the individuals pluto, tweety, joe and dracula, and edges marked as definite rules, default rules, or negated default rules]
190
Taxonomy representation

Taxonomy:
animal(X) ← mammal(X)
mammal(X) ← bat(X)
animal(X) ← bird(X)
bird(X) ← penguin(X)
deadAn(X) ← dead(X)

Facts:
mammal(pluto). bird(tweety). deadAn(dracula). penguin(joe). bat(dracula).

Default rules:
¬flies(X) ← animal(X), adf(X), not flies(X)
adf(X) ← not ¬adf(X)
flies(X) ← bat(X), btf(X), not ¬flies(X)
btf(X) ← not ¬btf(X)
flies(X) ← bird(X), bf(X), not ¬flies(X)
bf(X) ← not ¬bf(X)
¬flies(X) ← penguin(X), pdf(X), not flies(X)
pdf(X) ← not ¬pdf(X)
¬flies(X) ← deadAn(X), ddf(X), not flies(X)
ddf(X) ← not ¬ddf(X)

Explicit preferences:
¬btf(X) ← deadAn(X), bat(X), r1(X)
r1(X) ← not ¬r1(X)
¬bf(X) ← deadAn(X), bird(X), r2(X)
r2(X) ← not ¬r2(X)
¬r1(dracula)
¬r2(dracula)

Implicit preferences:
¬adf(X) ← bat(X), btf(X)
¬adf(X) ← bird(X), bf(X)
¬bf(X) ← penguin(X), pdf(X)
191
Taxonomy results

[Table: truth values in the WFM, for each of joe, dracula, pluto and tweety, of deadAn, bat, penguin, mammal, bird, animal, of the rule names adf, btf, bf, pdf, ddf, r1, r2, and of flies]
192
Part 4: Knowledge Evolution
193
LP and Non-Monotonicity
• LP includes a non-monotonic form of default negation:

not L is true if L cannot (currently) be proven

• This feature is used for representing incomplete knowledge:

With incomplete knowledge, assume hypotheses, and jump to conclusions.
If (later) the conclusions are proven false, withdraw some hypotheses to regain consistency.
194
Typical example

• All birds fly. Penguins are an exception:

flies(X) ← bird(X), not ab(X).
ab(X) ← penguin(X).
bird(a).

This program concludes flies(a), by assuming not ab(a).

• If later we learn penguin(a):
– Add: penguin(a).
– This goes back on the assumption not ab(a).
– flies(a) is no longer concluded.
195
LP representing a static world
• The work on LP allows the (non-monotonic) addition of new knowledge.
• But what we have seen so far does not consider this evolution of knowledge:
– LPs represent static knowledge of a given world in a given situation.
– The issue of how to add new information to a logic program wasn't yet addressed.
196
Knowledge Evolution

• Up to now we have not considered evolution of the knowledge
• In real situations knowledge evolves by:
– completing it with new information
– changing it according to the changes in the world itself
• Simply adding the new knowledge possibly leads to contradiction
• In many cases a process for restoring consistency is desired
197
Revision and Updates

• In real situations knowledge evolves by:
– completing it with new information (Revision)
– changing it according to the changes in the world itself (Updates)
• These forms of evolution require a differentiated treatment. Example:
– I know that I have a flight booked for London (either for Heathrow or for Gatwick).

Revision: I learn that it is not for Heathrow
• I conclude my flight is for Gatwick

Update: I learn that flights for Heathrow were canceled
• Either I have a flight for Gatwick or no flight at all
198
Part 4: Knowledge Evolution
4.1 Belief Revision and Logic Programming
199
AGM Postulates for Revision
For revising a logical theory T with a formula F, first modify T so that it does not derive ¬F, and then add F. The contraction of T by a formula F, T-(F), should obey:

1. T-(F) has the same language as T
2. Th(T-(F)) ⊆ Th(T)
3. If T |≠ F then T-(F) = T
4. If |≠ F then T-(F) |≠ F
5. Th(T) ⊆ Th(T-(F) ∪ {F})
6. If |= F ↔ G then Th(T-(F)) = Th(T-(G))
7. T-(F) ∩ T-(G) ⊆ T-(F ∧ G)
8. If T-(F ∧ G) |≠ F then T-(F ∧ G) ⊆ T-(F)
200
Epistemic Entrenchment
• The question in general theory revision is: how to change a theory so that it obeys the postulates?
• What formulas to remove and what formulas to keep?
• In general this is done by defining preferences among formulas: some can and some cannot be removed.
• Epistemic Entrenchment: some formulas are "more believed" than others.
• This is quite complex in general theories.
• In LP, there is a natural notion of "more believed"
201
Logic Programs Revision

• The problem:
– A LP represents consistent incomplete knowledge
– New factual information comes
– How to incorporate the new information?
• The solution:
– Add the new facts to the program
– If the union is consistent, this is the result
– Otherwise, restore consistency to the union
• The new problem:
– How to restore consistency to an inconsistent program?
202
Simple revision example (1)
P:  flies(X) ← bird(X), not ab(X).
    bird(a).
    ab(X) ← penguin(X).
• We learn penguin(a).
P ∪ {penguin(a)} is consistent. Nothing more to be done.
• We learn instead ¬flies(a).
P ∪ {¬flies(a)} is inconsistent. What to do?
Since the inconsistency rests on the assumption not ab(a), remove that assumption (e.g. by adding the fact ab(a), or forcing it undefined with ab(a) ← u), obtaining a new program P’.
If an assumption supports a contradiction, then go back on that assumption.
203
Simple revision example (2)
P:  flies(X) ← bird(X), not ab(X).
    bird(a).
    ab(X) ← penguin(X).
If later we also learn flies(a) (besides the previous ¬flies(a)):
P’ ∪ {flies(a)} is inconsistent.
The contradiction does not depend on assumptions.
The contradiction cannot be removed!
Some programs are non-revisable.
204
What to remove?
• Which assumptions should be removed?
normalWheel ← not flatTyre, not brokenSpokes.
flatTyre ← leakyValve.        ¬normalWheel ← wobblyWheel.
flatTyre ← puncturedTube.     wobblyWheel.
– Contradiction can be removed by either dropping not flatTyre or not brokenSpokes
– We’d like to delve deeper in the model and (instead of not flatTyre) drop either not leakyValve or not puncturedTube.
205
Revisables
• Solution:
– Define a set of revisables:
Revisables = not {leakyValve, puncturedTube, brokenSpokes}
normalWheel ← not flatTyre, not brokenSpokes.
flatTyre ← leakyValve.        ¬normalWheel ← wobblyWheel.
flatTyre ← puncturedTube.     wobblyWheel.
– Revisions in this case are {not lv}, {not pt}, and {not bs}
206
Integrity Constraints
• For convenience, instead of:
¬normalWheel ← wobblyWheel
we may use the denial:
← normalWheel, wobblyWheel
• ICs can be further generalized into:
L1 ∨ … ∨ Ln ← Ln+1, …, Lm
where the Li are literals (possibly of the form not L).
207
ICs and Contradiction
• In an ELP with ICs, add for every atom A:
← A, ¬A
• A program P is contradictory iff P ⊢ ⊥
where ⊢ is the paraconsistent derivation relation of SLX
208
Algorithm for 3-valued revision
• Find all derivations of ⊥, collecting for each one the set of revisables supporting it. Each such set is a support set.
• Compute the minimal hitting sets of the support sets. Each is a removal set.
• A revision of P is obtained by adding
{A ← u : not A ∈ R}
where R is a removal set of P.
209
(Minimal Hitting Sets)
• H is a hitting set of S = {S1,…,Sn} iff
– H ∩ S1 ≠ {} and … and H ∩ Sn ≠ {}
• H is a minimal hitting set of S iff it is a hitting set of S and there is no other hitting set H’ of S such that H’ ⊂ H.
• Example:
– Let S = {{a,b},{b,c}}
– Hitting sets are {a,b}, {a,c}, {b}, {b,c}, {a,b,c}
– Minimal hitting sets are {b} and {a,c}.
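The hitting-set step is easy to prototype. The following is a small Python sketch (the function name and the brute-force, smallest-first strategy are my own, not part of the course material) that reproduces the example above:

```python
from itertools import chain, combinations

def minimal_hitting_sets(sets):
    """Enumerate the minimal hitting sets of a collection of sets."""
    universe = sorted(set(chain.from_iterable(sets)))
    hitting = []
    # Try candidates smallest-first: any candidate containing an
    # already-found hitting set cannot be minimal.
    for size in range(len(universe) + 1):
        for cand in combinations(universe, size):
            c = set(cand)
            if all(c & s for s in sets) and not any(h <= c for h in hitting):
                hitting.append(c)
    return hitting

print(minimal_hitting_sets([{'a', 'b'}, {'b', 'c'}]))  # [{'b'}, {'a', 'c'}]
```

Brute-force enumeration is exponential, but support sets in these revision examples are tiny, so this suffices for experimentation.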
210
Example
Rev = not {a,b,c}
← p, q
p ← not a.
q ← not b, r.
r ← not b.
r ← not c.
(Derivation tree for ⊥: via p with not a, and via q with not b and r, where r uses not b or not c.)
Support sets are: {not a, not b} and {not a, not b, not c}.
Removal sets are: {not a} and {not b}.
211
Simple diagnosis example
inv(G,I,0) ← node(I,1), not ab(G).
inv(G,I,1) ← node(I,0), not ab(G).
node(b,V) ← inv(g1,a,V).
node(a,1).
¬node(b,0).
% Fault model
inv(G,I,0) ← node(I,0), ab(G).
inv(G,I,1) ← node(I,1), ab(G).
(Figure: an inverter g1 with input a = 1 and output b, observed b ≠ 0.)
The only revision is: P ∪ {ab(g1) ← u}
It does not conclude node(b,1).
• In diagnosis applications (when fault models are considered) 3-valued revision is not enough.
212
2-valued Revision
• In diagnosis one often wants the IC:
ab(X) ∨ not ab(X)
– With such ICs (which are not denials), 3-valued revision is not enough.
• A 2-valued revision is obtained by adding facts for revisables, in order to remove contradiction.
• For 2-valued revision the previous algorithm no longer works…
213
Example
p ← not a, not b.
← p.     ← a.     ← b, not c.
The only support set is {not a, not b}.
Removal sets are {not a} and {not b}.
But:
• P ∪ {a} is contradictory (and unrevisable).
• P ∪ {b} is contradictory (though revisable).
• In 2-valued revision:
– some removals must be deleted;
– the process must be iterated.
214
Algorithm for 2-valued revision
1. Let Revs = {{}}
2. For every element R of Revs:
– Add it to the program and compute the removal sets of P ∪ R.
– Remove R from Revs.
– For each removal set RS:
• Add R ∪ not RS to Revs (where not RS is the set of complements of the literals in RS).
3. Remove non-minimal sets from Revs.
4. Repeat 2 and 3 until reaching a fixed point of Revs.
The revisions are the elements of the final Revs.
215
Example of 2-valued revision
p ← not a, not b.
← p.     ← a.     ← b, not c.
Rev0 = {{}}
• Choose {}. Removal sets of P ∪ {} are {not a} and {not b}. Add {a} and {b} to Rev.
Rev1 = {{a}, {b}}
• Choose {a}. P ∪ {a} has no removal sets.
Rev2 = {{b}}
• Choose {b}. The removal set of P ∪ {b} is {not c}. Add {b, c} to Rev.
Rev3 = {{b,c}}
• Choose {b,c}. The removal set of P ∪ {b,c} is {}. Add {b,c} to Rev.
Rev4 = Rev3
• The fixed point has been reached. P ∪ {b,c} is the only revision.
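The iteration above can be traced in Python. In this sketch the removal-set computation is stubbed out with a hand-made table for this particular program (a hypothetical helper, not the general derivation-based algorithm); only the fixed-point loop is general. Removal sets are represented directly by the facts to be added:

```python
def removal_sets(R):
    # Hand-tabulated removal sets of P U R for the slide's program:
    #   <- p.   <- a.   <- b, not c.   p <- not a, not b.
    # An empty list means P U R is contradictory but unrevisable.
    table = {
        frozenset():          [{'a'}, {'b'}],  # drop 'not a' or 'not b'
        frozenset({'a'}):     [],              # <- a violated: unrevisable
        frozenset({'b'}):     [{'c'}],         # <- b, not c forces dropping 'not c'
        frozenset({'b', 'c'}): [set()],        # already consistent
    }
    return table[frozenset(R)]

def two_valued_revisions():
    revs = {frozenset()}
    while True:
        new = set()
        for r in revs:
            for rs in removal_sets(r):
                new.add(frozenset(r | rs))
        # keep only minimal candidate revisions
        new = {r for r in new if not any(o < r for o in new)}
        if new == revs:
            return revs
        revs = new

print(two_valued_revisions())  # {frozenset({'b', 'c'})}
```

The loop reproduces Rev0 … Rev4 from the slide: unrevisable candidates such as {a} simply contribute no successors and disappear.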
216
Part 4: Knowledge Evolution
4.2 Application to Diagnosis
217
Revision and Diagnosis
• In model-based diagnosis one has:
– a program P with the model of a system (the correct and, possibly, incorrect behaviors)
– a set of observations O inconsistent with P (or not explained by P).
• The diagnoses of the system are the revisions of P ∪ O.
• This allows mixing consistency-based and explanation (abduction) based diagnosis.
218
Diagnosis Example
(Figure: a circuit with inputs c1 = c2 = c3 = c6 = c7 = 0, NAND gates g10, g11, g16, g19, g22, g23, and observed outputs g22 = 0 and g23 = 1.)
219
Diagnosis Program
Observables
obs(out(inpt0, c1), 0).    obs(out(inpt0, c2), 0).    obs(out(inpt0, c3), 0).
obs(out(inpt0, c6), 0).    obs(out(inpt0, c7), 0).
obs(out(nand, g22), 0).    obs(out(nand, g23), 1).
Predicted and observed values cannot be different:
← obs(out(G, N), V1), val(out(G, N), V2), V1 ≠ V2.
Connections
conn(in(nand, g10, 1), out(inpt0, c1)).
conn(in(nand, g10, 2), out(inpt0, c3)).
…
conn(in(nand, g23, 1), out(nand, g16)).
conn(in(nand, g23, 2), out(nand, g19)).
Value propagation
val( in(T,N,Nr), V ) ← conn( in(T,N,Nr), out(T2,N2) ), val( out(T2,N2), V ).
val( out(inpt0, N), V ) ← obs( out(inpt0, N), V ).
Normal behavior
val( out(nand,N), V ) ← not ab(N), val( in(nand,N,1), W1), val( in(nand,N,2), W2), nand_table(W1,W2,V).
Abnormal behavior
val( out(nand,N), V ) ← ab(N), val( in(nand,N,1), W1), val( in(nand,N,2), W2), and_table(W1,W2,V).
220
Diagnosis Example
(Figure: the same circuit annotated with the values propagated from the inputs; the predicted outputs conflict with the observations g22 = 0 and g23 = 1.)
The revisions are:
{ab(g23)}, {ab(g19)}, and {ab(g16), ab(g22)}
221
Revision and Debugging
• Declarative debugging can be seen as diagnosis of a program.
• The components are:
– rule instances (that may be incorrect);
– predicate instances (that may be uncovered).
• The (partial) intended meaning can be added as ICs.
• If the program with ICs is contradictory, revisions are the possible bugs.
222
Debugging Transformation
• Add to the body of each possibly incorrect rule r(X) the literal not incorrect(r(X)).
• For each possibly uncovered predicate p(X) add the rule:
p(X) ← uncovered(p(X)).
• For each goal G that you don’t want to prove add: ← G.
• For each goal G that you want to prove add: ← not G.
223
Debugging example
a ← not b.
b ← not c.
WFM = {not a, b, not c}
b should be false: add ← b.
Transformed program (revisables are incorrect/1 and uncovered/1):
a ← not b, not incorrect(a ← not b).
b ← not c, not incorrect(b ← not c).
a ← uncovered(a).
b ← uncovered(b).
c ← uncovered(c).
← b.
Revisions are:
{incorrect(b ← not c)}
{uncovered(c)}
BUT a should be false! Add ← a.
Revisions now are:
{inc(b ← not c), inc(a ← not b)}
{unc(c), inc(a ← not b)}
BUT c should be true! Add ← not c.
The only revision is:
{unc(c), inc(a ← not b)}
224
Part 4: Knowledge Evolution
4.3 Abductive Reasoning and Belief Revision
225
Deduction, Abduction and Induction
• In deductive reasoning one derives conclusions based on rules and facts
– From the fact that Socrates is a man and the rule that all men are mortal, conclude that Socrates is mortal
• In abductive reasoning, given an observation and a set of rules, one assumes (or abduces) a justification explaining the observation
– From the rule that all men are mortal and the observation that Socrates is mortal, assume that Socrates being a man is a possible justification
• In inductive reasoning, given facts and observations, one induces rules that may synthesize the observations
– From the fact that Socrates (and many others) are men, and the observation that all those are mortal, induce that all men are mortal.
226
Deduction, Abduction and Induction
• Deduction: an analytic process based on the application of general rules to particular cases, with inference of a result
• Induction: synthetic reasoning which infers the rule from the case and the result
• Abduction: synthetic reasoning which infers the (most likely) case given the rule and the result
227
Abduction in logic
• Given a theory T, an associated set of assumptions Ab (abducibles), and an observation G (abductive query), Δ is an abductive explanation (or solution) for G iff:
1. Δ ⊆ Ab
2. T ∪ Δ |= G
3. T ∪ Δ is consistent
• Usually minimal abductive solutions are of special interest
• For the notion of consistency, in general integrity constraints are also used (as in revision)
228
Abduction example
• It has been observed that wobblyWheel.
• What are the abductive solutions for that, assuming that the abducibles are brokenSpokes, leakyValve and puncturedTube?
wobblyWheel ← flatTyre.
wobblyWheel ← brokenSpokes.
flatTyre ← leakyValve.
flatTyre ← puncturedTube.
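For definite programs like this one, abductive solutions can be enumerated by forward chaining over candidate sets of abducibles. The sketch below is my own illustration (function names and representation are assumptions, not from the course); enumerating smallest sets first yields the minimal solutions:

```python
from itertools import combinations

RULES = [  # head <- body
    ('wobblyWheel', ['flatTyre']),
    ('wobblyWheel', ['brokenSpokes']),
    ('flatTyre',    ['leakyValve']),
    ('flatTyre',    ['puncturedTube']),
]
ABDUCIBLES = ['brokenSpokes', 'leakyValve', 'puncturedTube']

def consequences(facts):
    """Forward-chain the definite rules from a set of facts."""
    derived, changed = set(facts), True
    while changed:
        changed = False
        for head, body in RULES:
            if head not in derived and all(b in derived for b in body):
                derived.add(head)
                changed = True
    return derived

def abductive_solutions(goal):
    sols = []
    for n in range(len(ABDUCIBLES) + 1):  # smallest first => minimal first
        for delta in combinations(ABDUCIBLES, n):
            d = set(delta)
            if goal in consequences(d) and not any(s <= d for s in sols):
                sols.append(d)
    return sols

print(abductive_solutions('wobblyWheel'))  # three minimal singleton solutions
```

As expected, {brokenSpokes}, {leakyValve} and {puncturedTube} each explain the wobbly wheel, the latter two via flatTyre.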
229
Applications
• In diagnosis:
– Find explanations for the observed behaviour
– Abducibles are the normality (or abnormality) of components, and also fault modes
• In view updates:
– Find extensional data changes that justify the intensional data change in the view
– This can be further generalized to knowledge assimilation
230
Abduction as Nonmonotonic reasoning
• If abductive explanations are understood as conclusions, the process of abduction is nonmonotonic
• In fact, abduction may be used to encode various other forms of nonmonotonic logics
• Vice-versa, other nonmonotonic logics may be used to perform abductive reasoning
231
Negation by Default as Abduction
• Replace every not A by a new atom A*
• Add for every A the integrity constraints:
A ∨ A*        ← A, A*
• L is true in a stable model iff there is an abductive solution for the query L
• Negation by default is viewed as hypotheses that can be assumed consistently
232
Defaults as abduction
• For each default rule d:
A : B
C
add the rule
C ← d(B), A
and the ICs
¬d(B) ← ¬B
¬d(B) ← ¬C
• Make all d(B) abducible
233
Abduction and Stable Models
• Abduction can be “simulated” with Stable Models
• For each abducible A, add to the program:
A ← not ¬A
¬A ← not A
• For getting abductive solutions for G, just collect the abducibles that belong to stable models with G
• I.e. compute stable models after also adding
← not G
and then collect all abducibles from each stable model
234
Abduction and Stable Models (cont)
• The method just suggested lacks means for capturing the relevance of the abductions made for actually proving the query
• Literals may be in the abductive solution because they “help” in proving the abductive query, or simply because they are needed for consistency, independently of the query
• Using a combination of WFS and Stable Models may help in this matter.
235
Abduction as Revision
• For abductive queries:
– Declare as revisable all the abducibles
– If the abductive query is Q, add the IC:
← not Q
– The revisions of the program are the abductive solutions of Q.
236
Part 4: Knowledge Evolution
4.4 Methodologies for modeling updates
237
Reasoning about changes
• Dealing with changes in the world, rather than in the beliefs (updates rather than revision), requires:
– Methodologies for representing knowledge about the changes, actions, etc., using existing languages
or
– New languages and semantics for dealing with a changing world
• Possibly with translation into the existing languages
238
Situation calculus
• Initially developed for representing knowledge that changes, using 1st order logic [McCarthy and Hayes 1969]
– Several problems of the approach triggered research in nonmonotonic logics
• Main ingredients:
– Fluents: predicates that may change their truth value
– Situations: in which the fluents are true or false
• A special initial situation
• Other situations are characterized by the actions that were performed from the initial situation up to that situation
239
Situation Calculus - Basis
• (Meta)-predicate holds/2 for describing which fluents hold in which situations
• Situations are represented by:– constant s0, representing the initial situation– terms of the form result(Action,Situation),
representing the situation that results from performing the Action in the previous situation
240
Yale shooting
• There is a turkey, initially alive:
holds(alive(turkey),s0).
• Whenever you shoot with a loaded gun, the turkey at which you shoot dies, and the gun becomes unloaded:
¬holds(alive(turkey),result(shoot,S)) ← holds(loaded,S).
¬holds(loaded,result(shoot,S)).
• Loading a gun results in a loaded gun:
holds(loaded,result(load,S)).
• What happens to the turkey if I load the gun, and then shoot at it?
– holds(alive(turkey), result(shoot, result(load,s0)))?
241
Frame Problem
• In general, axioms describing what changes are not enough
• Knowledge is also needed about what doesn’t change.
• Suppose that there is an extra action of waiting:
– holds(alive(turkey), result(shoot, result(wait, result(load,s0)))) is not derivable.
• By default, fluents should keep the truth value they had before, unless there is evidence for their change (commonsense law of inertia)
– In 1st order logic it is difficult to express this
– With a nonmonotonic logic this should be easy
242
Frame Axioms in Logic Programming
• The truth value of fluents in two consecutive situations is, by default, the same:
holds(F,result(A,S)) ← holds(F,S), not ¬holds(F,result(A,S)), not nonInertial(F,A,S).
¬holds(F,result(A,S)) ← ¬holds(F,S), not holds(F,result(A,S)), not nonInertial(F,A,S).
• These rules establish the law of inertia.
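Effect axioms plus inertia can be mimicked procedurally: start each new situation from a copy of the previous one (inertia) and overwrite only the fluents the action affects. This Python sketch of the Yale shooting domain is an illustration of the idea, not the logic-programming encoding itself (the state representation and function name are my own):

```python
def result(action, state):
    """Return the state after performing `action`, with inertia by default."""
    state = dict(state)          # inertia: copy the previous situation
    if action == 'load':
        state['loaded'] = True
    elif action == 'shoot':
        if state.get('loaded'):
            state['alive'] = False   # effect axiom: shooting a loaded gun kills
        state['loaded'] = False      # effect axiom: shooting unloads the gun
    # 'wait' has no effect axioms: everything persists by inertia
    return state

s = {'alive': True, 'loaded': False}   # initial situation s0
for a in ['load', 'wait', 'shoot']:
    s = result(a, s)
print(s)  # {'alive': False, 'loaded': False}
```

With inertia built in, the intervening wait changes nothing and the turkey still dies, exactly the conclusion the frame axioms are meant to license.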
243
Representing Knowledge with the situation calculus
• Write rules for predicate holds/2 describing the effects of actions.
• Write rules (partially) describing the initial situation, and possibly also some other situations
• Add the frame axioms
• Care must be taken, especially in the case of Stable Models, because the models are infinite
• Look at the models of the program (be it SM or WF) to get the consequences
244
Yale shooting results
• The WFM of the program contains, e.g.:
holds(alive(turkey),result(load,s0))
¬holds(alive(turkey),result(shoot,result(wait,result(load,s0))))
¬holds(loaded,result(shoot,result(wait,result(load,s0))))
• Queries of the form
?- holds(X,<situation>)
return what holds in the given situation.
• Queries of the form
?- holds(<property>,X)
return linear plans for obtaining the property from the initial situation.
245
More on the rules of inertia
• The rules allow, given information about the past, reasoning about possible futures.
• Reasoning about the past given information about the future is also possible, but requires additional axioms:
holds(F,S) ← holds(F,result(A,S)), not ¬holds(F,S), not nonInertial(F,A,S).
¬holds(F,S) ← ¬holds(F,result(A,S)), not holds(F,S), not nonInertial(F,A,S).
• Care must be taken when using these rules, since they may create infinite chains of derivations
• On the other hand, it is difficult with this representation to deal with simultaneous actions
246
Fluent Calculus
• Extends the situation calculus by introducing a notion of state [Thielscher 1998]
• Situations are representations of states
• State(S) denotes the state of the world in situation S
• The operator ∘ is used for composing the fluents that are true in the same state.
• Example:
– State(result(shoot,S)) ∘ alive(turkey) ∘ loaded = State(S)
– State(result(load,S)) = State(S) ∘ loaded
• Axioms are needed for guaranteeing that ∘ is commutative and associative, and for equality
• This allows inferring non-effects of actions without the need for extra frame axioms
247
Event Calculus
• It is another methodology developed for representing knowledge that changes over time [Kowalski and Sergot 1986]
• Solves the frame problem in a different (simpler) way, also without frame axioms.
• It is adequate for determining what holds after a series of actions is performed
• It does not directly help with planning or with general reasoning about the changing knowledge
248
Event Calculus - Basis
• Fluents are represented as terms, as in the situation calculus
• Instead of situations, there is a notion of discrete time:
– constants for representing time points
– predicate </2 for representing the (partial) order among time points
– predicate </2 should come with axioms for transitive closure, as usual.
• A predicate holds_at/2 defines which fluents hold at which time points
• Events are represented as constants.
• Predicate occurs/2 defines which events happen at which time points.
249
Event Calculus – Basis (cont)
• Events initiate (the truth of) some fluents and terminate (the truth of) other fluents.
• This is represented using predicates initiates/3 and terminates/3
• Effects of actions are described by the properties initiated and terminated by the event associated with the action occurrence.
• There is a special event that initiates all fluents holding at the beginning
250
Yale shooting again
• There is a turkey, initially alive:
initiates(alive(turkey),start,T).        occurs(start,t0).
• Whenever you shoot with a loaded gun, the turkey at which you shoot dies, and the gun becomes unloaded:
terminates(alive(turkey),shoot,T) ← holds_at(loaded,T).
terminates(loaded,shoot,T).
• Loading a gun results in a loaded gun:
initiates(loaded,load,T).
• The gun was loaded at time t10, and the shot occurred at time t20:
occurs(load,t10).        occurs(shoot,t20).
• Is the turkey alive at time t21?
– holds_at(alive(turkey), t21)?
251
General axioms for event calculus
• Rules are needed to describe what holds, based on the events that occurred:
holds_at(P,S) ← occurs(E,S1), initiates(P,E,S1), S1 < S, not clipped(P,S1,S).
clipped(P,S1,S2) ← occurs(E,S), S1 ≤ S < S2, terminates(P,E,S).
• There is no need for frame axioms. By default, things remain true until terminated.
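The holds_at/clipped axioms translate almost literally into a functional sketch. The following Python rendering of the Yale shooting narrative (time points as integers, names my own) is an illustration under the assumption that events and times are fully known:

```python
# occurs/2: which event happens at which time point
OCCURS = [('start', 0), ('load', 10), ('shoot', 20)]

def initiates(fluent, event, t):
    if event == 'start':
        return fluent == 'alive'
    if event == 'load':
        return fluent == 'loaded'
    return False

def terminates(fluent, event, t):
    if event == 'shoot':
        # shooting always unloads; it kills only if the gun is loaded then
        return fluent == 'loaded' or (fluent == 'alive' and holds_at('loaded', t))
    return False

def clipped(fluent, t1, t2):
    # clipped(P,S1,S2) <- occurs(E,S), S1 <= S < S2, terminates(P,E,S)
    return any(t1 <= t < t2 and terminates(fluent, e, t) for e, t in OCCURS)

def holds_at(fluent, t):
    # holds_at(P,S) <- occurs(E,S1), initiates(P,E,S1), S1 < S, not clipped(P,S1,S)
    return any(t1 < t and initiates(fluent, e, t1) and not clipped(fluent, t1, t)
               for e, t1 in OCCURS)

print(holds_at('alive', 15), holds_at('alive', 21))  # True False
```

No frame axioms appear anywhere: alive persists from the start event until the shoot event terminates it, by the default in holds_at.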
252
Event calculus application
• Appropriate when it is known which events occurred, and the reasoning task is to know what holds at each moment. E.g.:
– reasoning about changing databases
– reasoning about legislation knowledge bases, for determining what applies after a series of events
– reasoning with parallel actions.
• Not directly applicable when one wants to know which actions lead to an effect (planning), or for reasoning about possible alternative courses of action:
– No way of inferring occurrences of actions
– No way of representing various courses of actions
– No way of reasoning from the future to the past
253
Event calculus and abduction
• With abduction it is possible to perform planning using the event calculus methodology:
– Declare the occurrences of events as abducible
– Declare also as abducible the order among the time points at which events occurred
– Abductive solutions for holds_at(<fluent>,<time>) give plans to achieve the fluent before the given (deadline) time.
254
Representing Knowledge with the event calculus
• Write rules for predicates initiates/3 and terminates/3 describing the effects of actions.
• Describe the initial situation as the result of a special event, e.g. start, and state that start occurred at the least time point.
• Add the axioms defining holds_at/2
• Add rules describing the partial order of time
– These are not needed if e.g. integers are used for representing time
• Add the occurrences of the events
• Query the program at time points
255
Part 4: Knowledge Evolution
4.5 Action Languages
256
Action Languages
• Instead of
– using existing formalisms, such as 1st order logic, logic programming, etc.,
– and developing methodologies,
design new languages specifically tailored for representing knowledge in a changing world:
– With a tailored syntax for action programs, providing ways of describing how an environment evolves given a set of external actions
– Common expressions are static and dynamic rules.
• Static rules describe the rules of the domain
• Dynamic rules describe effects of actions.
257
• Usually, the semantics of an action program is defined in terms of a transition system.
– Intuitively, given the current state of the world s and a set of actions K, a transition system specifies which are the possible resulting states after performing, simultaneously, all the actions in K.
• The semantics can also be given as a translation into an existing formalism
– E.g. translating action programs into logic programs (possibly with extra arguments on predicates and extra rules, e.g. for frame axioms), ensuring that the semantics of the transformed program has a one-to-one correspondence with the semantics of the action program
Action Languages (cont)
258
The A language
• First proposal by [Gelfond and Lifschitz, 1993].
• Action programs are sets of rules of the form:
– initially <Fluent>
– <Fluent> after <Action1>; … ; <ActionN>
– <Action> causes <Fluent> [if <Condition>]
• The semantics was first defined in terms of a transition system (a labeled graph where the nodes are states – sets of fluents true in them – and the arcs are labeled with actions)
• Allows for:
– non-deterministic effects of actions
– conditional effects of actions
259
The Yale shooting in A
initially alive.
shoot causes ¬alive if loaded.
shoot causes ¬loaded.
load causes loaded.
• It is possible to make statements about other states, e.g.
¬alive after shoot; wait.
• and to make queries about states:
¬alive after shoot; wait; load ?
260
Translating A into logic programs
• An alternative definition of the semantics is obtained by translating A-programs into logic programs. Roughly:
– Add the frame axioms just as in the situation calculus
– For each rule:
• initially f — add holds(f,s0).
• f after a1;…;an — add holds(f,result(a1,…,result(an,s0)…)).
• a causes f if cond — add holds(f,result(a,S)) ← holds(cond,S).
• Theorem: holds(f,result(a1,…,result(an,s0)…)) belongs to a stable model of the program iff there is a state resulting from the initial state after applying a1, …, an in which f is true.
261
The B Language
• The B language [Gelfond and Lifschitz, 1997] extends A by adding static rules.
• Dynamic rules, as in A, allow for describing the effects of actions, and “cause” a change of state.
• Static rules allow for describing rules of the domain, and are “imposed” at every state
• They allow for indirect effects of actions
• Static rules in B are of the form:
<Fluent> if <Condition>
• Example:
dead if ¬alive.
262
Causality and the C language
• Unlike in both A and B, where inertia is assumed for all fluents, in C one can decide which fluents are subject to inertia and which aren’t:
– Some fluents, such as one-time events, should not be assumed to keep their value by inertia, e.g. action names, incoming messages, etc.
• Based on notions of causality:
– It allows asserting that F is caused by an action, which is stronger than asserting that F holds
• As in B, it comprises static and dynamic rules
263
Rules in C
• Static rules:
caused <Fluent> if <Condition>
– Intuitively: <Condition> causes the truth of <Fluent>
• Dynamic rules:
caused <Fluent> if <Condition> after <Formula>
– The <Formula> can be built with fluents as well as with action names
– Intuitively, this rule states that after <Formula> is true, the rule caused <Fluent> if <Condition> is in place
264
Causal Theories and semantics of C
• The semantics of C is defined in terms of causal theories (sets of static rules):
– Something is true iff it is caused by something else
• Let T be a causal theory and M a set of fluents, and let
TM = {F | (caused F if G) ∈ T and M |= G}
M is a causal model of T iff M is the unique model of TM.
• The transition system of C is defined by:
– In any state s (a set of fluents), consider the causal theory TK formed by the static rules and the dynamic rules true at that state together with K, where K is any set of actions
– There is an arc from s to s’ labeled with K iff s’ is a causal model of TK.
– Note that this way inertia is not obtained!
265
Yale shooting in C
caused ¬alive if True after shoot ∧ loaded
caused ¬loaded if True after shoot
caused loaded if True after load
• We still need to say that alive and loaded are inertial:
caused alive if alive after alive
caused loaded if loaded after loaded
266
Macros in C
• Macro expressions have been defined to ease the representation of knowledge in C:
– A causes F if G
• standing for caused F if True after G ∧ A
– inertial F
• standing for caused F if F after F
– always F
• standing for caused ⊥ if ¬F
– nonexecutable A if F
• standing for caused ⊥ after F ∧ A
– …
267
Extensions of C
• Several extensions exist. E.g.:
– C+, allowing for multi-valued fluents and for encoding resources
– K, allowing for reasoning with incomplete states
– P and Q, which extend C with rich query languages, allowing for querying various states, planning queries, etc.
268
Part 4: Knowledge Evolution
4.6 Logic Programs Updates
269
Rule Updates
• These languages and methodologies are basically concerned with facts that change:
– There is a set of fluents (facts)
– There are static rules describing the domain, which are not subject to change
– There are dynamic rules describing how the facts may change due to actions
• What if the rules themselves, be they static or dynamic, are subject to change?
– The rules of a given domain may change over time
– Even the rules that describe the effects of actions may change (e.g. rules describing the effects of actions in physical devices that degrade with time)
• What we have seen up to now does not help!
270
Languages for rule updates
• Languages dealing with highly dynamic environments, where besides facts also the static and dynamic rules of an agent may change, need:
– Means of integrating knowledge updates from external sources (be it from user changes in the rules describing agent behavior, or simply from environment events)
– Means for describing rules about the transition between states
– Means for describing self-updates and self-evolution of a program, and for combining self-updates with external ones
• We will study this in the setting of Logic Programming:
– First define what it means to update a (running) program by another (externally given) program
– Then extend the language of Logic Programs to describe transitions between states (i.e. some sort of dynamic rules)
– Make sure that this deals both with self-updates (coming from the dynamic rules) and with updates that come directly from external sources
271
Updates of LPs by LPs
• Dynamic Logic Programming (DLP) [ALPPP98] was introduced to address the first of these concerns:
– It gives meaning to sequences of LPs
• Intuitively, a sequence of LPs P1 ⊕ P2 ⊕ … is the result of updating P1 with the rules in P2, …
– But different programs may also come from different hierarchical instances, different viewpoints (with preferences), etc.
• Inertia is applied to rules rather than to literals:
– Older rules conflicting with newer applicable rules are rejected
272
Updating Models isn’t enough
• When updating LPs, doing it model by model is not desired: it loses the directional information of the LP arrow.
P:  sleep ← not tv_on.
    watch ← tv_on.
    tv_on.
U:  not tv_on ← p_failure.
    p_failure.
U2: not p_failure.
• Updating the models literal by literal gives M = {tv, w}, then Mu = {pf, w} and Mu2 = {w}, whereas the intended results are {pf, s} after U and {tv, w} after U2.
• Inertia should be applied to rule instances rather than to their previous consequences.
273
Logic Programs Updates Example
• One should not have to worry about how to incorporate new knowledge; the semantics should take care of it. Another example:
Initial knowledge: the restaurant is open on the weekend.
Open-Day(X) ← Week-end(X).    Week-end(23).    Week-end(24).    Sunday(24).
New knowledge: on Sunday the restaurant is closed.
not Open-Day(X) ← Sunday(X).
• Instead of rewriting the program we simply update it with the new rules. The semantics should consider the last update, plus all rule instances of the previous ones that do not conflict.
274
Generalized LPs
• Programs with default negation in the head are meant to encode that something should no longer be true.
– The generalization of the semantics is not difficult
• A generalized logic program P is a set of propositional Horn clauses
L ← L1, …, Ln
where L and the Li are literals over LK, i.e. of the form A or not A.
• Program P is normal if no head of a clause in P has the form not A.
275
Generalized LP semantics
• A set M is an interpretation of LK if for every atom A in K exactly one of A and not A is in M.
• Definition:
An interpretation M of LK is a stable model of a generalized logic program P if M is the least model of the Horn theory P ∪ {not A : not A ∈ M}.
276
Generalized LPs example
Example: K = {a, b, c, d, e}
P:  a ← not b        c ← b        e ← not d
    not d ← a, not c        d ← not e
This program has exactly one stable model:
M = Least(P ∪ {not b, not c, not d}) = {a, e, not b, not c, not d}
N = {not a, not e, b, c, d} is not a stable model since
N ≠ Least(P ∪ {not a, not e})
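The stability check above is mechanical: treat the not-literals of a candidate M as extra facts, close under the Horn rules, and compare. A small Python sketch (representation with 'not x' strings is my own convention) verifies the two candidates from this slide:

```python
RULES = [  # head <- body; default negation written as the string 'not x'
    ('a', ['not b']),
    ('c', ['b']),
    ('e', ['not d']),
    ('not d', ['a', 'not c']),
    ('d', ['not e']),
]

def least(rules, facts):
    """Least model of a set of Horn clauses plus facts."""
    m, changed = set(facts), True
    while changed:
        changed = False
        for head, body in rules:
            if head not in m and all(l in m for l in body):
                m.add(head)
                changed = True
    return m

def is_stable(m):
    context = {l for l in m if l.startswith('not ')}
    return least(RULES, context) == m

M = {'a', 'e', 'not b', 'not c', 'not d'}
N = {'not a', 'not e', 'b', 'c', 'd'}
print(is_stable(M), is_stable(N))  # True False
```

For N, the closure of P ∪ {not a, not e} only yields {not a, not e, d}, which differs from N, so N is not stable.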
277
Dynamic Logic Programming
• A Dynamic Logic Program P is a sequence of GLPs:
P1 ⊕ P2 ⊕ … ⊕ Pn
• An interpretation M is a stable model of P iff:
M = least([∪i Pi – Reject(M)] ∪ Defaults(M))
– From the union of the programs remove the rules that are in conflict with newer ones (rejected rules)
– Then, if some atom has no rule, add (in Defaults) its negation
– Compute the least model, and check stability
278
Rejection and Defaults
• By default, assume the negation of atoms that have no rule with a true body:
Default(M) = {not A | ∄r: head(r) = A and M |= body(r)}
• Reject all rules with head A that belong to an earlier program, if there is a later rule with complementary head and a body true in M:
Reject(M) = {r ∈ Pi | ∃r’ ∈ Pj, i ≤ j, head(r) = not head(r’) and M |= body(r’)}
279
Example
P1: sleep ← not tv_on.
    watch ← tv_on.
    tv_on.
P2: not tv_on ← p_failure.
    p_failure.
P3: not p_failure.
• {pf, sl, not tv, not wt} is the only SM of P1 ⊕ P2:
– Rej = {tv_on ←}
– Def = {not wt}
– Least(P – {tv_on ←} ∪ {not wt}) = M
• {tv, wt, not sl, not pf} is the only SM of P1 ⊕ P2 ⊕ P3:
– Rej = {p_failure ←}
– Def = {not sl}
280
Another example
P1: not fly(X) ← animal(X)
P2: fly(X) ← bird(X)
P3: not fly(X) ← penguin(X)
P4: animal(X) ← bird(X)
    bird(X) ← penguin(X)
    animal(pluto)
    bird(duffy)
    penguin(tweety)
The program P1 ⊕ P2 ⊕ P3 ⊕ P4 has a unique stable model, in which fly(duffy) is true and both fly(pluto) and fly(tweety) are false.
281
Some properties
• If M is a stable model of the union P ∪ U of programs P and U, then it is a stable model of the update program P ⊕ U.
– Thus, the semantics of P ⊕ U is always weaker than or equal to the semantics of P ∪ U.
• If either P or U is empty, or if both P and U are normal programs, then the semantics of P ⊕ U and P ∪ U coincide.
– DLP extends the stable model semantics
282
What is still missing
• DLP gives meaning to sequences of LPs
• But how to come up with those sequences?
– Changes may be additions or retractions
– Updates may be conditional on the present state
– Some rules may represent (persistent) laws
• Since LPs can be used to describe knowledge states and also sequences of updating states, it is only fitting that LPs are also used to describe transitions, and thus to come up with such sequences
283
LP Update Languages
• Define languages that extend LP with features that allow defining dynamic (state transition) rules:
– Put, on top of LP, a language with sets of meaningful commands that generate DLPs (LUPS, EPI, KABUL), or
– Extend the basic LP language minimally in order to allow for this generation of DLPs (EVOLP)
284
What do we need to make LPs evolve?
• Programs must be allowed to evolve
– The meaning of programs should be sequences of sets of literals, representing evolutions
– A construct to assert new information is needed
– nots in the heads allow newer rules to supervene older ones
• Program evolution may be influenced by the outside
– Allow external events
– … written in the language of programs
285
EVOLP Syntax
• EVOLP rules are Generalized LP rules (possibly with nots in heads) plus the special predicate assert/1
• The argument of assert is an EVOLP rule (i.e. arbitrary nesting of assert is allowed)
• Examples:
assert( a ← not b ) ← d, not e
not a ← not assert( assert(a ← b) ← not b ), c
• EVOLP programs are sets of EVOLP rules
286
Meaning of Self-evolving LPs
• Determined by sequences of sets of literals
• Each sequence represents a possible evolution
• The nth set in a sequence represents what is true/false after n steps in that evolution
• The first set in a sequence is a SM of the LP, where assert/1 literals are viewed as normal ones
• If assert(Rule) belongs to the nth set, then the (n+1)th set must consider the addition of Rule
287
Intuitive example
a ←        assert(b ←)        assert(c ←) ← b
• At the beginning a is true, and so is assert(b ←)
• Therefore, rule b ← is asserted
• At the 2nd step, b becomes true, and so does assert(c ←)
• Therefore, rule c ← is asserted
• At the 3rd step, c becomes true.
< {a, assert(b ←)}, {a, b, assert(b ←), assert(c ←)}, {a, b, c, assert(b ←), assert(c ←)} >
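This evolution can be simulated in ordinary code. Below is a minimal Python sketch (not the official EVOLP machinery): since this program has no negation, each step's set is just the least model of the rules accumulated so far, and every assert(R) true at step n adds rule R at step n+1. The rule encoding is an assumption made here.

```python
def least_model(rules):
    """Least model of definite rules (head, body), body a frozenset of atoms."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if body <= model and head not in model:
                model.add(head)
                changed = True
    return model

def evolve(program, steps):
    """Simulate EVOLP self-evolution for a negation-free program."""
    rules = list(program)
    trace = []
    for _ in range(steps):
        model = least_model(rules)
        trace.append(model)
        for atom in model:                       # assert(R) true at this step...
            if isinstance(atom, tuple) and atom[0] == 'assert' \
               and atom[1] not in rules:
                rules.append(atom[1])            # ...adds rule R for the next step
    return trace

# a ←        assert(b ←)        assert(c ←) ← b
P = [('a', frozenset()),
     (('assert', ('b', frozenset())), frozenset()),
     (('assert', ('c', frozenset())), frozenset({'b'}))]

trace = evolve(P, 3)
```

Running it reproduces the slide's three-step sequence: b only appears at step 2, and c at step 3.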
288
Self-evolution definitions
• An evolution interpretation of P over L is a sequence <I1,…,In> of sets of atoms from L_assert (L extended with assert/1 literals)
• The evolution trace of <I1,…,In> is <P1,…,Pn> where:
P1 = P and Pi = {R | assert(R) ∈ Ii-1} (2 ≤ i ≤ n)
• Evolution traces contain the programs imposed by the interpretations
• It remains to check whether each nth set complies with the programs of the trace
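The trace construction translates directly to code. A Python sketch, using an assumed encoding where an assert-literal is a ('assert', rule) pair and rules are otherwise opaque labels:

```python
def evolution_trace(P, interpretations):
    """P1 = P; for 2 <= i <= n, Pi = {R | assert(R) in I(i-1)}."""
    trace = [set(P)]
    for I in interpretations[:-1]:
        trace.append({a[1] for a in I
                      if isinstance(a, tuple) and a[0] == 'assert'})
    return trace

# atoms are strings; rules are opaque labels in this sketch
I1 = {'a', ('assert', 'b <-')}
I2 = {'a', 'b'}
trace = evolution_trace({'a <-'}, [I1, I2])
```

Here the second program of the trace contains exactly the rule asserted in I1; nothing from I2 is used, since In never contributes a program.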
289
Evolution Stable Models
• <I1,…,In>, with trace <P1,…,Pn>, is an evolution stable model of P iff
∀i, 1 ≤ i ≤ n: Ii is a SM of the DLP P1 ⊕ … ⊕ Pi
• Recall that I is a stable model of P1 ⊕ … ⊕ Pn iff
I = least( (∪Pi – Rej(I)) ∪ Def(I) )
where:
– Def(I) = {not A | ∄ (A ← Body) ∈ ∪Pi, Body ⊆ I}
– Rej(I) = {(L0 ← Bd) ∈ Pi | ∃ (not L0 ← Bd') ∈ Pj, i ≤ j ≤ n, and Bd' ⊆ I}
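For small propositional programs this condition can be checked mechanically. A Python sketch (an assumed encoding: a rule is a (head, body) pair, and a negative head or body literal is ('not', atom)):

```python
def lit_true(lit, I):
    """Truth of a literal in interpretation I (a set of atoms)."""
    return lit[1] not in I if isinstance(lit, tuple) else lit in I

def is_stable_model(I, programs, atoms):
    """Check I = least((union of Pi - Rej(I)) + Def(I)) for a DLP."""
    rules = [(i, h, b) for i, P in enumerate(programs) for (h, b) in P]

    def compl(h):
        return h[1] if isinstance(h, tuple) else ('not', h)

    # Rej(I): rules whose head is contradicted by a later (or same-stage)
    # rule with body true in I
    rej = {(i, h, b) for (i, h, b) in rules for (j, h2, b2) in rules
           if j >= i and h2 == compl(h)
           and all(lit_true(l, I) for l in b2)}
    # Def(I): default negations for atoms with no applicable rule
    defaults = {('not', a) for a in atoms
                if not any(h == a and all(lit_true(l, I) for l in b)
                           for (_, h, b) in rules)}
    # least closure over literals; 'not A' literals behave as plain atoms here
    kept = [(h, b) for (i, h, b) in rules if (i, h, b) not in rej]
    derived, changed = set(defaults), True
    while changed:
        changed = False
        for h, b in kept:
            if h not in derived and all(l in derived for l in b):
                derived.add(h)
                changed = True
    return ({l for l in derived if not isinstance(l, tuple)} == I
            and {l[1] for l in derived if isinstance(l, tuple)} == atoms - I)

# P1 = {a <-},  P2 = {not a <-}: the update rejects the old fact
progs = [[('a', frozenset())], [(('not', 'a'), frozenset())]]
```

On this two-program DLP, the empty interpretation is stable (the newer not a rejects the fact a), while {a} is not.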
290
Simple example
• <{a, assert(b ← a)}, {a, b, c, assert(not a ←)}, {assert(b ← a)}> is an evolution SM of P:
a ←
assert(b ← a) ← not c
assert(not a ←) ← b
c ← assert(not a ←)
• The trace is <P, {b ← a}, {not a ←}>
291
Example with various evolutions
• No matter what, assert c; if a is not going to be asserted, then assert b; if c is true and b is not going to be asserted, then assert a.
assert(b) ← not assert(a).
assert(c) ←.
assert(a) ← not assert(b), c.
• Paths in the graph below are evolution SMs
(Graph of states, where ast abbreviates assert: {ast(b), ast(c)}; {b, c, ast(b), ast(c)}; {b, c, ast(a), ast(c)}; {a, b, c, ast(b), ast(c)}; {a, b, c, ast(a), ast(c)}.)
292
Event-aware programs
• Self-evolving programs are closed to the outside world!
• Events may come from the outside:– Observations of facts or rules– Assertion order
• Both can be written in EVOLP language
• Influence from outside should not persist by inertia
293
Event-aware programs• Events may come from the outside:
– Observations of facts or rules– Assertion order
• Both can be written in the EVOLP language
• An event sequence is a sequence of sets of EVOLP rules.
• <I1,…,In>, with trace <P1,…,Pn>, is an evolution SM of P given <E1,…,Ek> iff
∀i, 1 ≤ i ≤ n: Ii is a SM of the DLP
P1 ⊕ P2 ⊕ … ⊕ (Pi ∪ Ei)
294
Simple example
• The program says that: whenever c, assert a ← b
• The events were: 1st, c was perceived; 2nd, an order to assert b; 3rd, an order to assert not a

P: assert(a ← b) ← c
Events: <{c ←}, {assert(b ←)}, {assert(not a ←)}>

• Resulting evolution (events do not persist by inertia):
< {c, assert(a ← b)}, {assert(b ←)}, {b, a, assert(not a ←)}, {b} >
– rule a ← b is added at step 2, b ← at step 3, and not a ← at step 4
295
Yale shooting with EVOLP
• There is a turkey, initially alive:
alive(turkey)
• Whenever you shoot with a loaded gun, the turkey at which you shoot dies, and the gun becomes unloaded:
assert(not alive(turkey)) ← loaded, shoot.
assert(not loaded) ← shoot.
• Loading a gun results in a loaded gun:
assert(loaded) ← load.
• Events of shoot, load, wait, etc. make the program evolve
• After some time, the shooter becomes older, has sight problems, and no longer hits the turkey unless wearing glasses. Add the event:
assert( not assert(not alive(turkey)) ← not glasses )
296
LUPS, EPI and KABUL languages
• Sequences of commands build sequences of LPs
• There are several types of commands: assert, assert event, retract, always, …
always (not a ← b, not c) when d, not e
• EPI extends LUPS to allow for:– commands whose execution depends on other
commands– external events to condition the KB evolution
• KABUL extends LUPS and EPI with nesting
297
LUPS Syntax
• Statements (commands) are of the form:
assert [event] RULE when COND
– asserts RULE if COND is true at that moment. The RULE is non-inertial if the keyword event is used.
retract [event] RULE when COND
– the same, for rule retraction
always [event] RULE when COND
– from then onwards, whenever COND holds, assert RULE (as an event if the keyword event is used)
cancel RULE when COND
– cancels an always command
298
LUPS as EVOLP
• The behavior of all LUPS commands can be constructed in EVOLP. Eg:
• always (not a ← b, not c) when d, not ecoded as event:
assert( assert(not a ← b, not c) ← d, not e )
• always event (a ← b) when ccoded as events:
assert( assert(a ← b, ev(a ← b)) ← c )assert( assert(ev(a ← b)) ← c )
plus:assert( not ev(R) ) ← ev(R), not assert(ev(R))
299
EVOLP features
• All LUPS and EPI features are EVOLP features:– Rule updates; Persistent updates; simultaneous updates;
events; commands dependent on other commands; …
• Many extra features (some of them in KABUL) can be programmed:– Commands that span over time– Events with incomplete knowledge– Updates of persistent laws– Assignments– …
300
More features
• EVOLP extends the syntax and semantics of logic programs
– If no events are given, and no asserts are used, the semantics coincides with the stable models
– A variant of EVOLP (and DLP) has also been defined extending WFS
– An implementation of the latter is available
• EVOLP was shown to properly embed the action languages A, B, and C.
301
EVOLP possible applications
• Legal reasoning
• Evolving systems, with external control
• Reasoning about actions
• Active Data (and Knowledge) Bases
• Static program analysis of agents’ behavior
• …
302
… and also
• EVOLP is a concise, simple and quite powerful language to reason about KB evolution– Powerful: it can do everything other update and action
languages can, and much more– Simple and concise: much better to use for proving
properties of KB evolution• EVOLP: a firm formal basis in which to express,
implement, and reason about dynamic KB• Sometimes it may be regarded as too low level.
– Macros with most used constructs can help, e.g. as in the translation of LUPS’ always event command
303
Suitcase example
• A suitcase has two latches, and is opened whenever both are up:
open ← up(l1), up(l2)
• There is an action of toggling applicable to each latch:
assert(up(X)) ← not up(X), toggle(X)

assert(not up(X)) ← up(X), toggle(X)
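The intended state transitions can be sketched in ordinary code (names hypothetical): toggled latches flip position, all others keep theirs by inertia, and the case is open exactly when both latches are up.

```python
def step(up, toggled):
    """Latches in `toggled` flip position; the rest persist by inertia."""
    return (up - toggled) | (toggled - up)

def is_open(up):
    """The suitcase is open whenever both latches are up."""
    return {'l1', 'l2'} <= up

state = {'l1'}                  # l1 up, l2 down: closed
state = step(state, {'l2'})     # toggle l2: now both latches are up
```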
304
Abortion Example
• Once Republicans take over both Congress and the Presidency they establish the law stating that abortions are punishable by jail
assert(jail(X) ← abortion(X)) ← repCongress, repPresident• Once Democrats take over both Congress and the Presidency
they abolish such a lawassert(not jail(X) ← abortion(X)) ← not repCongress, not repPresident
• Performing an abortion is an event, i.e., a non-inertial update.– I.e. we will have events of the form abortion(mary)…
• The change of congress is inertial– I.e. The recent change in the congress can be modeled by the event
assert(not repCongress)
305
Twice fined example
• A car driver loses his license after a second fine. He can regain the license if he undergoes a refresher course at the drivers' school.
assert(not license ← fined, probation) ← fined

assert(probation) ← fined

assert(license) ← attend_school

assert(not probation) ← attend_school
306
Bank example
• An account accepts deposits and withdrawals. The latter are only possible when there is enough balance:
assert(balance(Ac,B+C)) ← changeB(Ac,C), balance(Ac,B)
assert(not balance(Ac,B)) ← changeB(Ac,C), balance(Ac,B)
changeB(Ac,D) ← deposit(Ac,D)
changeB(Ac,-W) ← withdraw(Ac,W), balance(Ac,B), B > W.
• Deposits and withdrawals are added as events. E.g.– {deposit(1012,10), withdraw(1111,5)}
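The effect of one such event set on the balance facts can be sketched in Python (a simplification of the update semantics, with hypothetical names):

```python
def apply_events(balances, events):
    """One update step: deposits always produce a changeB; a withdrawal
    only does so if the current balance strictly covers it (B > W)."""
    changes = []
    for kind, acc, amount in events:
        if kind == 'deposit':
            changes.append((acc, amount))
        elif kind == 'withdraw' and balances.get(acc, 0) > amount:
            changes.append((acc, -amount))
    for acc, delta in changes:   # the new balance fact replaces the old one
        balances[acc] = balances.get(acc, 0) + delta
    return balances

accounts = {1012: 100, 1111: 3}
apply_events(accounts, [('deposit', 1012, 10), ('withdraw', 1111, 5)])
```

With the event set from the slide, account 1012 ends at 110, while the withdrawal from 1111 is refused (3 > 5 fails), so its balance stays at 3.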
307
Bank example (cont)

• The bank now changes its policy, and no longer accepts deposits under 50 €. Event:
assert( not changeB(Ac,D) ← deposit(Ac,D), D < 50 )
• Next, VIP accounts are allowed a negative balance, up to an account-specified limit:
assert( changeB(Ac,-W) ← vip(Ac,L), withdraw(Ac,W), balance(Ac,B), B+L > W )
308
Email agent example
• Personal assistant agent for e-mail management able to:– Perform basic actions of sending, receiving, deleting
messages– Storing and moving messages between folders– Filtering spam messages– Sending automatic replies and forwarding– Notifying the user of special situations
• All of this may depend on user specified criteria• The specification may change dynamically
309
EVOLP for e-mail Assistant
• If the user specifies, once and for all, a consistent set of policies triggering actions, then any existing (commercial) assistant would do the job.
• But if we allow the user to update their policies, and to specify both positive (e.g. “…must be deleted”) and negative (e.g. “…must not be deleted”) instances, soon the union of all policies becomes inconsistent
• We cannot expect the user to debug the set of policy rules so as to invalidate all the old rules (instances) contravened by newer ones.
• Some automatic way to resolve inconsistencies due to updates is needed.
310
EVOLP for e-mail Assistant (cont)• EVOLP provides an automatic way of removing
inconsistencies due to updates:– With EVOLP the user simply states whatever new is to be
done, and let the agent automatically determine which old rules may persist and which not.
– We are not presupposing the user is contradictory, but just that he keeps updating its profile
• EVOLP further allows:– Postponed addition of rules, depending on user specified
criteria– Dynamic changes in policies, triggered by internal and/or
external conditions– Commands that span over various states– …
311
An EVOLP e-mail Assistant
• In the following we show some policy rules of the EVOLP e-mail assistant.– A more complete set of rules, and the results given by
EVOLP, can be found in the corresponding paper
• Basic predicates:– New messages come as events of the form:
newmsg(Identifier, From, Subject, Body)– Messages are stored via predicates:
msg(Identifier, From, Subj, Body, TimeStamp)and
in(Identifier, FolderName)
312
Simple e-mail EVOLP rules
• By default, messages are stored in the inbox:
assert(msg(M,F,S,B,T)) ← newmsg(M,F,S,B), time(T), not delete(M).
assert(in(M,inbox)) ← newmsg(M,F,S,B), not delete(M).
assert(not in(M,F)) ← delete(M), in(M,F).
• Spam messages are to be deleted:
delete(M) ← newmsg(M,F,S,B), spam(F,S,B).
• The definition of spam can be done by LP rules:
spam(F,S,B) ← contains(S,credit).
• This definition can later be updated:
not spam(F,S,B) ← contains(F,my_accountant).
313
More e-mail EVOLP rules• Messages can be automatically moved to other folders. When that
happens (not shown here) the user wants to be notified: notify(M) ← newmsg(M,F,S,B), assert(in(M,F)), assert(not in(M,inbox)).
• When a message is marked both for deletion and automatic move to another folder, the deletion should prevail:
not assert(in(M,F)) ← move(M,F), delete(M).
• The user is organizing a conference, assigning papers to referees. After receipt of a referee’s acceptance, a new rule is to be added, which forwards to the referee any messages about assigned papers:
assert(send(R,S,B1) ← newmsg(M1,F,S,B1), contains(S,Id), assign(Id,R)) ← newmsg(M2,R,Id,B2), contains(B2,accept).
314
Part 5: Ontologies
315
Logic and Ontologies
• Up to now we have studied logic languages for Knowledge Representation and Reasoning:
– in both static and dynamic domains
– with possibly incomplete knowledge and nonmonotonic reasoning
– interacting with the environment and completing the knowledge, possibly contracting previous assumptions
• All of this is parametric on a set of predicates and a set of objects
• The meaning of a theory depends on, and is built on top of, the meaning of the predicates and objects
316
Choice of predicates
• We want to represent that trailer trucks have 18 wheels. In 1st order logic:
∀x trailerTruck(x) → hasEighteenWheels(x)
or ∀x trailerTruck(x) → numberOfWheels(x,18)
or ∀x ((truck(x) ∧ ∃y (trailer(y) ∧ part(x,y))) →
∃s (set(s) ∧ count(s,18) ∧ ∀w (member(w,s) → (wheel(w) ∧ part(x,w)))))
• The choice depends on which predicates are available
• For understanding (and sharing) the represented knowledge it is crucial that the meaning of predicates (and also of objects) is formally established
317
Ontologies
• Ontologies establish a formal specification of the concepts used in representing knowledge
• Ontology: originates from philosophy as a branch of metaphysics– studies the nature of existence
– Defines what exists and the relation between existing concepts (in a given domain)
– Sought universal categories for classifying everything that exists
318
An Ontology
• An ontology is a catalog of the types of things that are assumed to exist in a domain.
• The types in an ontology represent the predicates, word senses, or concept and relation types of the language when used to discuss topics in the domain.
• Logic says nothing about anything, but the combination of logic with an ontology provides a language that can express relationships about the entities in the domain of interest.
• Up to now we have implicitly assumed the ontology– I assumed that you understand the meaning of predicates and
objects involved in examples
319
Aristotle’s Ontology
(Tree diagram: Being at the root, dividing into Substance and Accident; lower levels include Property, Relation, Inherence, Directedness, Containment, Quality, Quantity, Movement, Intermediacy, Activity, Passivity, Having, Situated, Spatial, and Temporal.)
320
The Ontology
• Effort to define and categorize everything that exists
• Agreeing on the ontology makes it possible to understand the concepts
• Efforts to define a big ontology, defining all concepts, still exist today:
– The Cyc (from Encyclopedia) ontology: over 100,000 concept types and over 1M axioms
– Electronic Dictionary Research: 400,000 concept types
– WordNet: 166,000 English word senses
321
Cyc Ontology
322
Cyc Ontology
(Tree diagram: the top level of the Cyc ontology, with root Thing, dividing into Object and Intangible; the categories shown include Intangible Object, Collection, Process, Occurrence, Relationship, Intangible Stuff, Slot, Internal machine thing, Attribute value, Attribute, Represented Thing, Event, and Stuff.)
323
Small Ontologies
• Designed for specific applications
• How to make these coexist with big ontologies?
324
Domain-Specific Ontologies
• Medical domain: – Cancer ontology from the National Cancer Institute in the United
States • Cultural domain:
– Art and Architecture Thesaurus (AAT) with 125,000 terms in the cultural domain
– Union List of Artist Names (ULAN), with 220,000 entries on artists
– Iconclass vocabulary of 28,000 terms for describing cultural images
• Geographical domain:– Getty Thesaurus of Geographic Names (TGN), containing over 1
million entries
325
Ontologies and the Web
• In the Web ontologies provide shared understanding of a domain– It is crucial to deal with differences in terminology
• To understand data on the web it is crucial that an ontology exists
• To be able to automatically understand the data, and use it in a distributed environment, it is crucial that the ontology is:
– Explicitly defined
– Available on the Web
• The Semantic Web initiative provides (web) languages for defining ontologies (RDF, RDF Schema, OWL)
326
Defining an Ontology
• How to define a catalog of the types of things that are assumed to exist in a domain?
– I.e. how to define an ontology for a given domain?
• What makes an ontology?– Entities in a taxonomy– Attributes– Properties and relations– Facets– Instances
• Similar to ER models in databases
327
Main Stages in Ontology Development
1. Determine scope
2. Consider reuse
3. Enumerate terms
4. Define taxonomy
5. Define properties
6. Define facets
7. Define instances
8. Check for anomalies
Not a linear process!
328
Determine Scope
• There is no correct ontology of a specific domain – An ontology is an abstraction of a particular
domain, and there are always viable alternatives
• What is included in this abstraction should be determined by – the use to which the ontology will be put– by future extensions that are already anticipated
329
Determine Scope (cont)
• Basic questions to be answered at this stage are:
– What is the domain that the ontology will cover?
– For what are we going to use the ontology?
– For what types of questions should the ontology provide answers?
– Who will use and maintain the ontology?
330
Consider Reuse
• One rarely has to start from scratch when defining an ontology – In these web days, there is almost always an
ontology available that provides at least a useful starting point for our own ontology
• With the Semantic Web, ontologies will become even more widely available
331
Enumerate Terms
• Write down in an unstructured list all the relevant terms that are expected to appear in the ontology– Nouns form the basis for class names
– Verbs form the basis for property/predicate names
• Traditional knowledge engineering tools (e.g. laddering and grid analysis) can be used to obtain – the set of terms
– an initial structure for these terms
332
Define the Taxonomy
• Relevant terms must be organized in a taxonomic is_a hierarchy– Opinions differ on whether it is more
efficient/reliable to do this in a top-down or a bottom-up fashion
• Ensure that hierarchy is indeed a taxonomy:– If A is a subclass of B, then every object of
type A must also be an object of type B
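The stated check, that every object of a subclass is also an object of the superclass, means membership must be closed under the transitive is_a relation. A small Python sketch (single-inheritance hierarchy; class names hypothetical):

```python
def superclasses(cls, is_a):
    """All ancestors of cls under a single-parent, acyclic is_a relation."""
    ancestors = []
    while cls in is_a:          # walk up the hierarchy to the root
        cls = is_a[cls]
        ancestors.append(cls)
    return ancestors

is_a = {'Reporter': 'Employee', 'Employee': 'Person'}
# An instance of Reporter must therefore also be an Employee and a Person.
```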
333
Define Properties
• Often interleaved with the previous step• Attach properties to the highest class in the
hierarchy to which they apply:– Inheritance applies to properties
• While attaching properties to classes, it makes sense to immediately provide statements about the domain and range of these properties– Immediately define the domain of properties
334
Define Facets
• Define extra conditions over properties– Cardinality restrictions– Required values– Relational characteristics
• symmetry, transitivity, inverse properties, functional values
335
Define Instances
• Filling the ontologies with such instances is a separate step
• Number of instances >> number of classes
• Thus populating an ontology with instances is not done manually – Retrieved from legacy data sources (DBs)– Extracted automatically from a text corpus
336
Check for Anomalies
• Test whether the ontology is consistent– For this, one must have a notion of consistency in the
language
• Examples of common inconsistencies – incompatible domain and range definitions for
transitive, symmetric, or inverse properties
– cardinality properties
– requirements on property values can conflict with domain and range restrictions
337
Protégé
• Java based Ontology editor• It supports Protégé-Frames and OWL as
modeling languages– Frames is based on Open Knowledge Base
Connectivity protocol (OKBC)
• It exports into various formats, including (Semantic) Web formats
• Let’s try it
338
The newspaper example (part):
(Class diagram: Thing at the root, with classes including Author, Person, Reporter, Employee, Salesperson, Article, Manager, Advertisement, Content, News Service, and Editor.)
• Properties (slots)
– Persons have names, which are strings, phone numbers, etc.
– Employees (further) have salaries, which are positive numbers
– Editors are responsible for other employees
– Articles have an author, which is an instance of Author, and possibly various keywords
• Constraints
– Each article must have at least two keywords
– The salary of an editor should be greater than the salary of any employee for whom the editor is responsible
339
Part 6: Description Logics
340
Languages for Ontologies
• In the early days of Artificial Intelligence, ontologies were represented resorting to non-logic-based formalisms
– Frame systems and semantic networks
• Graphical representations
– arguably easy to design
– but difficult to manage for complex pictures
– a formal semantics, allowing for reasoning, was missing
341
Semantic Networks
• Nodes representing concepts (i.e. sets or classes of individual objects)
• Links representing relationships
– IS_A relationship
– More complex relationships may have nodes
(Diagram: a semantic network with concept nodes Person, Female, Parent, Woman, and Mother, and a hasChild(1,NIL) link from Parent to Person.)
342
Logics for Semantic Networks
• Logic was used to describe the semantics of the core features of these networks
– Relying on unary predicates for describing sets of individuals, and binary predicates for relationships between individuals
• Typical reasoning used in structure-based representations does not require the full power of 1st order theorem provers
– Specialized reasoning techniques can be applied
343
From Frames to Description Logics
• Specialized logical languages for describing ontologies
• The name changed over time:
– Terminological systems, emphasizing that the language is used to define a terminology
– Concept languages, emphasizing the concept-forming constructs of the languages
– Description Logics, moving attention to the properties of the languages, including decidability, complexity, and expressivity
344
Description Logic ALC
• ALC is the smallest propositionally closed Description Logic. Syntax:
– Atomic types:
• Concept names, which are unary predicates
• Role names, which are binary predicates
– Constructs:
• ¬C (negation)
• C1 ⊓ C2 (conjunction)
• C1 ⊔ C2 (disjunction)
• ∃R.C (existential restriction)
• ∀R.C (universal restriction)
345
Semantics of ALC
• Semantics is based on interpretations (ΔI, ·I) where ·I maps:
– Each concept name A to AI ⊆ ΔI
• I.e. a concept denotes a set of individuals from the domain (unary predicates)
– Each role name R to RI ⊆ ΔI × ΔI
• I.e. a role denotes pairs of (binary relationships among) individuals
• An interpretation is a model for concept C iff CI ≠ {}
• Semantics can also be given by translating to 1st order logic
346
Negation, conjunction, disjunction
• ¬C denotes the set of all individuals in the domain that do not belong to C. Formally:
– (¬C)I = ΔI – CI
– {x: ¬C(x)}
• C1 ⊔ C2 (resp. C1 ⊓ C2) is the set of all individuals that belong to C1 or (resp. and) to C2:
– (C1 ⊔ C2)I = C1I ∪ C2I, resp. (C1 ⊓ C2)I = C1I ∩ C2I
– {x: C1(x) ∨ C2(x)}, resp. {x: C1(x) ∧ C2(x)}
• Persons that are not female:
– Person ⊓ ¬Female
• Male or female individuals:
– Male ⊔ Female
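These clauses translate directly into set operations. A Python sketch of an evaluator for the boolean part of ALC over a fixed interpretation (the concept encoding and the example data are assumptions made here):

```python
def interpret(concept, domain, ext):
    """Evaluate a concept term, given the extensions of the concept names."""
    op = concept[0]
    if op == 'name':
        return ext[concept[1]]
    if op == 'not':                       # (not C)^I = domain - C^I
        return domain - interpret(concept[1], domain, ext)
    if op == 'and':                       # conjunction is intersection
        return (interpret(concept[1], domain, ext)
                & interpret(concept[2], domain, ext))
    if op == 'or':                        # disjunction is union
        return (interpret(concept[1], domain, ext)
                | interpret(concept[2], domain, ext))
    raise ValueError(op)

domain = {'ann', 'bob', 'eve'}
ext = {'Person': {'ann', 'bob', 'eve'}, 'Female': {'ann', 'eve'}}
# Person ⊓ ¬Female
result = interpret(('and', ('name', 'Person'), ('not', ('name', 'Female'))),
                   domain, ext)
```

On this interpretation the concept denotes exactly {bob}, the one person not in Female.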
347
Quantified role restrictions
• Quantifiers are meant to characterize relationships between concepts
• ∃R.C denotes the set of all individuals that relate via R to at least one individual in concept C:
– (∃R.C)I = {d ∈ ΔI | ∃e: (d,e) ∈ RI and e ∈ CI}
– {x | ∃y R(x,y) ∧ C(y)}
• Persons that have a female child:
– Person ⊓ ∃hasChild.Female
348
Quantified role restrictions (cont)
• ∀R.C denotes the set of all individuals for which every individual to which they relate via R belongs to concept C:
– (∀R.C)I = {d ∈ ΔI | (d,e) ∈ RI implies e ∈ CI}
– {x | ∀y R(x,y) → C(y)}
• Persons all of whose children are female:
– Person ⊓ ∀hasChild.Female
• The link in the network above:
– Parents have at least one child that is a person, and there is no upper limit on children:
∃hasChild.Person ⊓ ∀hasChild.Person
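Both restrictions can be computed from a role's extension, represented as a set of pairs. A Python sketch (the data is hypothetical):

```python
def exists_r(role, c_ext, domain):
    """(∃R.C)^I: individuals with at least one R-successor in C."""
    return {d for d in domain
            if any(e in c_ext for (x, e) in role if x == d)}

def forall_r(role, c_ext, domain):
    """(∀R.C)^I: individuals all of whose R-successors are in C
    (vacuously including individuals with no R-successor at all)."""
    return {d for d in domain
            if all(e in c_ext for (x, e) in role if x == d)}

domain = {'ann', 'bob', 'carl', 'dora'}
female = {'ann', 'dora'}
has_child = {('bob', 'ann'), ('bob', 'carl'), ('carl', 'dora')}
```

Note the asymmetry: childless individuals never satisfy ∃hasChild.Female, but they do satisfy ∀hasChild.Female vacuously, which is why the slide's Parent definition conjoins both restrictions.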
349
Elephant example
• Elephants are grey mammals that have a trunk:
– Mammal ⊓ ∃bodyPart.Trunk ⊓ ∀color.Grey
• Elephants are heavy mammals, except for Dumbo elephants, which are light:
– Mammal ⊓ (∀weight.Heavy ⊔ (Dumbo ⊓ ∀weight.Light))
350
Reasoning tasks in DL
• What can we do with an ontology? What more does the logical formalism bring?
• Reasoning tasks:
– Concept satisfiability (is there any model for C?)
– Concept subsumption (does C1I ⊆ C2I hold for all I?)
C1 ⊑ C2
• Subsumption is important because from it one can compute a concept hierarchy
• Specialized (decidable and efficient) proof techniques exist for ALC, that do not employ the whole power needed for 1st order logic:
– Based on tableau algorithms
351
Representing Knowledge with DL
• A DL knowledge base is made of:
– A TBox: terminological (background) knowledge
• Defines concepts.
• E.g. Elephant ≐ Mammal ⊓ ∃bodyPart.Trunk
– An ABox: knowledge about individuals, be it concepts or roles
• E.g. dumbo:Elephant or (lisa,dumbo):hasChild
• Similar to, e.g., databases, where there exist a schema and an instance of the database.
352
General TBoxes
• T is a finite set of equations of the form
C1 ≐ C2
• I is a model of T if for all C1 ≐ C2 ∈ T, C1I = C2I
• Reasoning:
– Satisfiability: given C and T, find whether there is a model both of C and of T
– Subsumption (C1 ⊑T C2): does C1I ⊆ C2I hold for all models of T?
353
Acyclic TBoxes
• For decidability, TBoxes are often restricted to equations
A ≐ C
where A is a concept name (rather than an expression)
• Moreover, the concept A must not appear in the expression C, nor in the definition of any of the concepts there (i.e. the definitions are acyclic)
354
ABoxes
• Define a set of individuals, as instances of concepts and roles
• It is a finite set of expressions of the form:
– a:C
– (a,b):R
where both a and b are names of individuals, C is a concept and R a role
• I is a model of an ABox if it satisfies all its expressions. It satisfies:
– a:C iff aI ∈ CI
– (a,b):R iff (aI,bI) ∈ RI
355
Reasoning with TBoxes and ABoxes
• Given a TBox T (defining concepts) and an ABox A (defining individuals):
– Find whether there is a common model (i.e. find out about consistency)
– Find whether a concept is subsumed by another concept: C1 ⊑T C2
– Find whether an individual belongs to a concept (A,T |= a:C), i.e. whether aI ∈ CI for all models of A and T
356
Inference under ALC
• Since the semantics of ALC can be defined in terms of 1st order logic, clearly 1st order theorem provers can be used for inference
• However, ALC only uses a small subset of 1st order logic:
– Only unary and binary predicates, with a very limited use of quantifiers and connectives
• Inference and algorithms can be much simpler:
– Tableau algorithms are used for ALC and most other description logics
• ALC is also decidable, unlike 1st order logic
357
More expressive DLs
• The limited use of 1st order logic has its advantages, but some obvious drawbacks: expressivity is also limited
• Some concepts are not possible to define in ALC. E.g.:
– An elephant has exactly 4 legs
• (expressing qualified number restrictions)
– Every mother has (at least) a child, and every son is the child of a mother
• (inverse role definition)
– Elephants are animals
• (defining concepts without giving necessary and sufficient conditions)
358
Extensions of ALC
• ALCN extends ALC with unqualified number restrictions
≤n R, ≥n R, and =n R
– These denote the individuals that relate via R to at most (resp. at least, exactly) n individuals
– E.g. Person ⊓ (≥2 hasChild)
• Persons with at least two children
• The precise meaning is defined by (similarly for ≥ and =):
– (≤n R)I = {d ∈ ΔI | #{e | (d,e) ∈ RI} ≤ n}
• It is possible to define the meaning in terms of 1st order logic, with recourse to equality. E.g.:
– ≥2 R is {x: ∃y,z: y ≠ z ∧ R(x,y) ∧ R(x,z)}
– ≤2 R is {x: ∀y,z,w: (R(x,y) ∧ R(x,z) ∧ R(x,w)) → (y=z ∨ y=w ∨ z=w)}
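Counting R-successors gives the semantics directly. A Python sketch (hypothetical data; a role is a set of pairs, so successors are automatically distinct):

```python
def at_least(n, role, domain):
    """(≥n R)^I: individuals with at least n R-successors."""
    return {d for d in domain
            if sum(1 for (x, _) in role if x == d) >= n}

def at_most(n, role, domain):
    """(≤n R)^I: individuals with at most n R-successors."""
    return {d for d in domain
            if sum(1 for (x, _) in role if x == d) <= n}

domain = {'ann', 'bob'}
has_child = {('ann', 'bob'), ('ann', 'carl')}
```

Here ann, with two children, is the only member of ≥2 hasChild, while bob (no children) is the only member of ≤1 hasChild.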
359
Qualified number restriction
• ALCN can be further extended to include the more expressive qualified number restrictions
(≤n R C), (≥n R C), and (=n R C)
– These denote the individuals that relate via R to at most (resp. at least, exactly) n individuals of concept C
– E.g. Person ⊓ (≥2 hasChild Female)
• Persons with at least two female children
– E.g. Mammal ⊓ (=4 bodyPart Leg)
• Mammals with 4 legs
• The precise meaning is defined by (similarly for ≥ and =):
– (≤n R C)I = {d ∈ ΔI | #{e ∈ CI | (d,e) ∈ RI} ≤ n}
• Again, it is possible to define the meaning in terms of 1st order logic, with recourse to equality. E.g.:
– (≥2 R C) is {x: ∃y,z: y ≠ z ∧ C(y) ∧ C(z) ∧ R(x,y) ∧ R(x,z)}
360
Further extensions
• Inverse relations
– R⁻ denotes the inverse of R: R⁻(x,y) iff R(y,x)
• One-of constructs (nominals)
– {a1, …, an}, where the ai are individuals, denotes one of a1, …, an
• Statements of subsumption in TBoxes (rather than only definitions)
• Role transitivity
– Trans(R) denotes the transitive closure of R
• SHOIN is the DL resulting from extending ALC with all the above extensions
– It is the underlying logic for the Semantic Web language OWL-DL
– The less expressive language SHIF, without nominals, is the basis for OWL-Lite
361
Example
• From the w3c wine ontology:
– Wine ⊑ PotableLiquid ⊓ (=1 hasMaker) ⊓ ∀hasMaker.Winery
• Wine is a potable liquid with exactly one maker, and the maker must be a winery
– ∃hasColor⁻.Wine ⊑ {“white”, “rose”, “red”}
• Wines can be either white, rose or red.
– WhiteWine ≐ Wine ⊓ ∃hasColor.{“white”}
• White wines are exactly the wines with color white.
362
Part 7: Rules and Ontologies
363
Combining rules and ontologies
• We now know how to represent (possibly incomplete, evolving, etc.) knowledge using rules, but assuming that the ontology is known.
• We have also learned how to represent ontologies.
• To close the circle, we need to combine both.
• The goal is to represent knowledge with rules that make use of an ontology for defining the objects and individuals
– This is still a (hot) research topic!
– Crucial for using knowledge represented by rules in the context of the Web, where the ontology must be made explicit
364
Full integration of rules/ontologies
• Amounts to:– Combine DL formulas with rules having no restrictions
– The vocabularies are the same
– Predicates can be defined either using rules or using DL
• This approach encounters several problems– The base assumptions of DL and of non-monotonic
rules are quite different, and so mixing them so tightly is not easy
365
Problems with integration
• Rule languages (e.g. Logic Programming) use some form of closed world assumption (CWA)
– Assume negation by default
– This is crucial for reasoning with incomplete knowledge
• DL, being a subset of 1st order logic, has no closed world assumption
– The world is kept open in 1st order logic (OWA)
– This is reasonable when defining concepts
– Mostly, the ontology is desirably monotonic
• What if a predicate is both “defined” using DL and LP rules?
– Should its negation be assumed by default?
– Or should it be kept open?
– How exactly can one define what is CWA or OWA in this context?
366
CWA vs OWA
• Consider the program P:
wine(X) ← whiteWine(X)
nonWhiteWine(X) ← not whiteWine(X)
wine(esporão_tinto)
and the “corresponding” DL theory:
WhiteWine ⊑ Wine
¬WhiteWine ⊑ nonWhiteWine
esporão_tinto:Wine
• P derives nonWhiteWine(esporão_tinto) whilst the DL does not.
367
Modeling exceptions
• The following TBox makes the concept Penguin unsatisfiable:
Bird ⊑ Flies
Penguin ⊑ Bird ⊓ ¬Flies
• The first assertion should be seen as allowing exceptions
• This is easily dealt with by nonmonotonic rule languages, e.g. logic programming, as we have seen
368
Problems with integration (cont)
• DL uses classical negation, while LP uses either default or explicit negation
– Default negation is nonmonotonic
– Like classical negation, explicit negation does not assume a complete world, and is monotonic
– But classical negation and explicit negation are different
– With classical negation it is not possible to deal with paraconsistency!
369
Classical vs Explicit Negation
• Consider the program P:
wine(X) ← whiteWine(X)
¬wine(coca_cola)
• and the DL theory:
WhiteWine ⊑ Wine
coca_cola: ¬Wine
• The DL theory derives ¬WhiteWine(coca_cola) whilst P does not.
– In logic programs with explicit negation, contraposition of implications is not possible/desired
– Note in this case that contraposition would amount to assuming that no inconsistency is ever possible!
370
Problems with integration (cont)
• Decidability is dealt with differently:
– DL achieves decidability by enforcing restrictions on the form of formulas and predicates of 1st order logic, but still allowing for quantifiers and function symbols
• E.g. it is still possible to talk about an individual without knowing who it is:
∃hasMaker.{esporão} ⊑ GoodWine
– LP achieves decidability by restricting the domain and disallowing function symbols, but being more liberal in the format of formulas and predicates
• E.g. it is still possible to express conjunctive formulas (e.g. those corresponding to joins in relational algebra):
isBrother(X,Y) ← hasChild(Z,X), hasChild(Z,Y), X≠Y
371
Recent approaches to full integration
• Several recent (and in progress) approaches attacking the problem of full integration of DL and (nonmonotonic) rules:– Hybrid MKNF [Motik and Rosati 2007, to appear]
• Based on interpreting rules as auto-epistemic formulas (cf. previous comparison of LP and AEL)
• DL part is added as a 1st order theory, together with the rules
– Equilibrium Logics [Pearce et al. 2006]
– Open Answer Sets [Heymans et al. 2004]
372
Interaction without full integration
• Other approaches combine (DL) ontologies, with (nonmonotonic) rules without fully integrating them:– Tight semantic integration
• Separate rule and ontology predicates• Adapt existing semantics for rules in ontology layer• Adopted e.g. in DL+log [Rosati 2006] and the Semantic Web
proposal SWRL [w3c proposal 2005]– Semantic separation
• Deal with the ontology as an external oracle• Adopted e.g. in dl-Programs [Eiter et al. 2005] (to be studied
next)
373
Nonmonotonic dl-Programs
• Extend logic programs, under the answer-set semantics, with queries to DL knowledge bases
• There is a clean separation between the DL knowledge base and the rules
– Makes it possible to use a DL engine on the ontology and an ASP solver on the rules, with adaptations for the interface
• Prototype implementations exist (see dlv-Hex)
• The definition of the semantics is close to that of answer sets
• It also allows changing the ABox of the DL knowledge base when querying– This permits a limited form of flow of information from the LP
part into the DL part
374
dl-Programs
• dl-Programs include a set of (logic program) rules and a DL knowledge base (a TBox and an ABox)
• The semantics of the DL part is independent of the rules– Just use the semantics of the DL-language, completely
ignoring the rules• The semantics of the dl-Program comes from the
rules– It is an adaptation of the answer-set semantics of the
program, now taking into consideration the DL (as a kind of oracle)
375
dl-atoms to query the DL part
• Besides the usual atoms (that are to be “interpreted” on the rules), the logic program may have dl-atoms that are “interpreted” in the DL part
• Simple example:
DL[Bird](“tweety”)
– It is true in the program if in the DL ontology the concept Bird includes the element “tweety”
• Usage in a rule:
flies(X) ← DL[Bird](X), not ab(X)
– The query Bird(X) is made in the DL ontology and used in the rule
376
More on dl-atoms
• To allow flow of information from the rules to the ontology, dl-atoms allow adding elements to the ABox before querying
DL[Penguin ⊎ my_penguin;Bird](X)– First add to the ABox p:Penguin for each individual p
such that my_penguin(p) (in the rule part), and then query for Bird(X)
• Additions can also be made for roles (with binary rule predicates) and for negative concepts and roles. Eg:
DL[Penguin ⊌ nonpenguin;Bird](X)– In this case p:¬Penguin is added for each
nonpenguin(p)
377
The syntax of dl-Programs
• A dl-Program is a pair (L,P) where– L is a description logic knowledge base– P is a set of dl-rules
• A dl-rule is:
H ← A1, …, An, not B1, …, not Bm (n,m ≥ 0)
where H is an atom and the Ai's and Bi's are atoms or dl-atoms
• A dl-atom is:
DL[S1 op1 p1, …, Sn opn pn;Q](t) (n ≥ 0)
where Si is a concept (resp. role), opi is either ⊎ or ⊌, pi is a unary (resp. binary) predicate, and Q(t) is a DL-query.
378
DL-queries
• Besides querying for concepts, as in the examples, dl-atoms also allow querying for roles, and concept subsumption.
• A DL-query is either– C(t) for a concept C and term t
– R(t1,t2) for a role R and terms t1 and t2
– C1 ⊑ C2 for concepts C1 and C2
379
Interpretations in dl-Programs
• Recall that the Herbrand base HP of a logic program is the set of all instantiated atoms from the program, with the existing constants
• In dl-programs constants are both those in the rules and the individuals in the ABox of the ontology
• As usual a 2-valued interpretation is a subset of HP
380
Satisfaction of atoms wrt L
• Satisfaction wrt a DL knowledge base L– For (rule) atoms
I |=L A iff A ∈ I
I |=L not A iff A ∉ I
– For dl-atoms
I |=L DL[S1 op1 p1, …, Sn opn pn;Q](t) iff
L ∪ A1(I) ∪ … ∪ An(I) |= Q(t)
where– Ai(I) = {Si(c) | pi(c) ∈ I} if opi is ⊎
– Ai(I) = {¬Si(c) | pi(c) ∈ I} if opi is ⊌
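As an illustrative aid (and not how an actual engine such as dlv-Hex works internally), the input sets Ai(I) and dl-atom satisfaction can be mimicked in a few lines of Python. All names here (dl_atom_input, oracle, the Penguin ⊑ Bird axiom) are made up for this sketch:

```python
# Toy model of dl-atom satisfaction wrt L; all names are illustrative.

def dl_atom_input(ops, I):
    """Compute A1(I) ∪ … ∪ An(I) for a dl-atom.
    ops: triples (Si, opi, pi), with opi '+' for ⊎ and '-' for ⊌.
    I:   interpretation, a set of ground atoms like ('my_penguin','tweety')."""
    extra = set()
    for concept, op, pred in ops:
        for name, c in I:
            if name == pred:  # pi(c) ∈ I
                # ⊎ adds Si(c); ⊌ adds ¬Si(c)
                extra.add((concept, c) if op == '+' else ('¬' + concept, c))
    return extra

def satisfies_dl_atom(oracle, ops, query, t, I):
    """I |=L DL[S1 op1 p1, …; Q](t) iff L ∪ A1(I) ∪ … ∪ An(I) |= Q(t)."""
    return oracle(dl_atom_input(ops, I), query, t)

def oracle(extra, query, t):
    # Mock DL reasoning for L = {Penguin ⊑ Bird} with an empty ABox.
    abox = set(extra)
    if query == 'Bird':
        return ('Bird', t) in abox or ('Penguin', t) in abox
    return (query, t) in abox

I = {('my_penguin', 'tweety')}
print(satisfies_dl_atom(oracle, [('Penguin', '+', 'my_penguin')],
                        'Bird', 'tweety', I))  # prints True
```

The fact my_penguin(tweety) in the rule part is shipped to the DL side as the ABox assertion Penguin(tweety), from which the mock oracle derives Bird(tweety).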
381
Models of a Program
• Models can be defined for other formulas by extending |= with:
– I |=L not A iff I |≠L A
– I |=L F, G iff I |=L F and I |=L G
– I |=L H ← G iff I |=L H or I |≠L G
for atom H, atom or dl-atom A, and formulas F and G
• I is a model of a program (L,P) iff
for every rule H ← G ∈ P, I |=L H ← G
• I is a minimal model of (L,P) iff there is no other I’ ⊂ I that is a model of (L,P)
• I is the least model of (L,P) if it is the only minimal model of (L,P)
• It can be proven that every positive dl-program (without default negation) has a least model
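A minimal sketch of how such a least model can be computed, by iterating an immediate-consequence operator over a ground positive dl-program. The rule encoding, the dl_eval oracle, and the toy wine ontology are all assumptions of this illustration, not part of any real system:

```python
# Least model of a ground positive dl-program by fixpoint iteration
# (illustrative only; assumes monotonic dl-atoms).

def least_model(rules, dl_eval):
    """rules: list of (head, body); a body element is either a plain atom
    (string) or a dl-atom tuple handled by dl_eval(atom, I)."""
    I = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in I and all(
                    dl_eval(a, I) if isinstance(a, tuple) else a in I
                    for a in body):
                I.add(head)
                changed = True
    return I

# Mock DL part: the ontology classifies two individuals as Wine.
ontology_wines = {'porto', 'madeira'}

def dl_eval(atom, I):
    _, query, t = atom  # ('DL', 'Wine', t)
    return query == 'Wine' and t in ontology_wines

rules = [('wine(porto)',   [('DL', 'Wine', 'porto')]),
         ('wine(madeira)', [('DL', 'Wine', 'madeira')]),
         ('drink(porto)',  ['wine(porto)'])]
print(sorted(least_model(rules, dl_eval)))
# prints ['drink(porto)', 'wine(madeira)', 'wine(porto)']
```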
382
Alternative definition of Models
• Models can also be defined similarly to what has been done above for normal programs, via an evaluation function ÎL:
– For an atom A, ÎL(A) = 1 if I |=L A, and = 0 otherwise
– For a formula F, ÎL(not F) = 1 - ÎL(F)
– For formulas F and G:
• ÎL((F,G)) = min(ÎL(F), ÎL(G))
• ÎL(F ← G) = 1 if ÎL(F) ≥ ÎL(G), and = 0 otherwise
• I is a model of (L,P) iff, for every rule H ← B of P:
ÎL(H ← B) = 1
• This definition easily allows for extensions to 3-valued
interpretations and models (not yet explored!)
383
Reduct of dl-Programs
• Let (L,P) be a dl-Program
• Define the Gelfond-Lifschitz reduct P/I as for normal programs, treating dl-atoms as regular atoms
• P/I is obtained from P by
– Deleting all rules whose body contains not A and I |=L A (A being either a regular atom or a dl-atom)
– Deleting all the remaining default literals
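The two deletion steps above can be sketched directly; this is a toy for ground programs, and the tuple encoding of rules is an assumption of this illustration:

```python
# Gelfond-Lifschitz reduct P/I, treating dl-atoms like regular atoms
# (illustrative only).

def reduct(rules, I, satisfies):
    """rules: list of (head, pos_body, neg_body); satisfies(a, I) decides
    I |=L a for a regular atom or dl-atom a. Returns P/I."""
    out = []
    for head, pos, neg in rules:
        if any(satisfies(a, I) for a in neg):
            continue             # delete rules with not A in body and I |=L A
        out.append((head, pos))  # keep the rule, dropping default literals
    return out

sat = lambda a, I: a in I        # no dl-atoms in this toy
P = [('p', ['q'], ['r']),        # p ← q, not r
     ('q', [], [])]              # q ←
print(reduct(P, {'q'}, sat))     # prints [('p', ['q']), ('q', [])]
```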
384
Answer-sets of dl-Programs
• Let least(L,P) be the least model of P wrt L, where P is a positive program (i.e. without negation by default)
• I is an answer-set of (L,P) iff
I = least(L,P/I)
• Explicit negation can be used in P, and is treated just like in answer-sets of extended logic programs
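Putting the reduct and the least-model computation together gives a self-contained toy check of the condition I = least(L,P/I); here there are no dl-atoms, so L plays no role, and all names are illustrative:

```python
# Answer-set check for a ground program without dl-atoms (illustrative).

def reduct(rules, I):
    """GL reduct: drop rules defeated by I, then drop default literals."""
    return [(h, pos) for h, pos, neg in rules
            if not any(a in I for a in neg)]

def least(positive_rules):
    """Least model of a positive program, by fixpoint iteration."""
    M, changed = set(), True
    while changed:
        changed = False
        for h, pos in positive_rules:
            if h not in M and all(a in M for a in pos):
                M.add(h)
                changed = True
    return M

def is_answer_set(rules, I):
    return least(reduct(rules, I)) == I

P = [('flies', ['bird'], ['ab']),  # flies ← bird, not ab
     ('bird', [], [])]             # bird ←
print(is_answer_set(P, {'bird', 'flies'}))  # prints True
print(is_answer_set(P, {'bird'}))           # prints False
```

In a full dl-Program the same check applies, with dl-atom satisfaction delegated to a DL oracle as in the earlier slides.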
385
Some properties
• Every answer-set of a dl-Program (L,P) is a minimal model of (L,P)
• Programs without default negation or explicit negation always have an answer-set
• If the program is stratified then it has a single answer-set
• If P has no dl-atoms then the semantics coincides with the answer-set semantics of normal and extended programs
386
An example (from [Eiter et al 2006])
• Assume the w3c wine ontology, defining concepts about wines, and with an ABox with several wines
• Besides the ontology, there is a set of facts in a LP defining some persons, and their preferences regarding wines
• Find a set of wines for dinner that makes everybody happy (regarding their preferences)
387
Wine Preferences Example
%Get wines from the ontology
wine(X) ← DL[“Wine”](X).
%Persons and preferences in the program
person(axel). preferredWine(axel,whiteWine).
person(gibbi). preferredWine(gibbi,redWine).
person(roman). preferredWine(roman,dryWine).
%Available bottles a person likes
likes(P,W) ← preferredWine(P,sweetWine), wine(W), DL[“SweetWine”](W).
likes(P,W) ← preferredWine(P,dryWine), wine(W), DL[“DryWine”](W).
likes(P,W) ← preferredWine(P,whiteWine), wine(W), DL[“WhiteWine”](W).
likes(P,W) ← preferredWine(P,redWine), wine(W), DL[“RedWine”](W).
%Available bottles a person dislikes
dislikes(P,W) ← person(P), wine(W), not likes(P,W).
%Generation of the various possibilities of choosing wines
bottleChosen(W) ← wine(W), person(P), likes(P,W), not nonChosen(W).
nonChosen(W) ← wine(W), person(P), likes(P,W), not bottleChosen(W).
%Each person must have a bottle of his preference
happy(P) ← bottleChosen(W), likes(P,W).
false ← person(P), not happy(P), not false.
388
Wine example continued
• Suppose that later we learn about some wines, not in the ontology
• One may add facts in the program for such new wines. Eg:
white(joão_pires). ¬dry(joão_pires).
• To allow for integrating this knowledge with that of the ontology, the 1st rule must be changed to:
wine(X) ← DL[“WhiteWine”⊎white,“DryWine”⊌¬dry;“Wine”](X)
• In general more should be added in this rule (to allow e.g. for adding red wines, non red, etc…)
• Try more examples in dlv-Hex!
389
About other approaches
• This is just one of the current proposals for mixing rules and ontologies
• Is this the approach?– There is currently debate on this issue
• Is it enough to have just a loose coupling of rules and ontologies?
– It certainly helps for implementations, as it allows for re-using existing implementations of DL alone and of LP alone.
– But is it expressive enough in practice?
390
Extensions
• A Well-Founded-based semantics for dl-Programs [Eiter et al. 2005] exists
– But such a semantics does not yet exist for other approaches
• What about paraconsistency?
– Mostly it is yet to be studied!• What about belief revision with rules and ontologies?
– Mostly it is yet to be studied!• What about abductive reasoning over rules and ontologies?
– Mostly it is yet to be studied!• What about rule updates when there is an underlying ontology?
– Mostly it is yet to be studied!• What about updates of both rules and ontologies?
– Mostly it is yet to be studied!• What about … regarding combination of rules and ontologies?
– Mostly it is yet to be studied!• Plenty of room for PhD theses!
– Currently it is a hot research topic with many applications and crying out for results!
391
Part 8: Wrap up
392
What we have studied (in a nutshell)• Logic rule-based languages for representing
common sense knowledge– and reasoning with those languages
• Methodologies and languages for dealing with evolution of knowledge– Including reasoning about actions
• Languages for defining ontologies• Briefly on the recent topic of combining
rules and ontologies
393
What we have studied (1)• Logic rule-based languages for representing
common sense knowledge
– Started by pointing out the need for non-monotonicity to reason in the presence of incomplete knowledge
– Then seminal nonmonotonic languages• Default Logics
• Auto-epistemic logics
– Focused on Logic Programming as a nonmonotonic
394
What we have studied (2)• Logic Programming for Knowledge
Representation– Thorough study of semantics
• of normal logic programs• of extended (paraconsistent) logic programs• including state of the art semantics and corresponding systems
– Corresponding proof procedures allowing for reasoning with Logic Programs
– Programming under these semantics• Answer-Set Programming• Programming with tabling
– Example methodology for representing taxonomies
395
What we have studied (3)• Knowledge evolution
– Methods and semantics for dealing with inclusion of new information (still in a static world)
• Introduction to belief revision of theories• Belief revision in the context of logic programming• Abductive Reasoning in the context of belief
revision• Application to model based diagnosis and
debugging
– Methods and languages for knowledge updates
396
What we have studied (4)• Methods and languages for knowledge
updates– Methodologies for reasoning about changes
• Situation calculus• Event calculus
– Languages for describing knowledge that changes
• Action languages• Logic programming update languages
– Dynamic LP and EVOLP with corresponding implementations
397
What we have studied (5)• Ontologies for defining objects, concepts, and
roles, and their structure– Basic notions of ontologies
– Ontology design (exemplified with Protégé)
• Languages for defining ontologies– Basic notions of description logics for representing
ontologies
• Representing knowledge with rules and ontologies– To close the circle
– Still a hot research topic
398
What type of issues• A mixture of:
– Theoretical study of classical issues, well established for several years
• E.g. default and autoepistemic logics, situation and event calculus, …
– Theoretical study of state of the art languages and corresponding systems
• E.g. answer-sets, well-founded semantics, Dynamic LPs, Action languages, EVOLP, Description logics, …
– Practical usage of state of the art systems• E.g. programming with ASP-solvers, with XSB-Prolog, XASP, …
– Current research issues with still lots of open topics• E.g. Combining rules and ontologies
399
What next in UNL?
For MCL only, sorry
• Semantic Web– Where knowledge representation is applied to the domain of the
web, with a big emphasis on languages for representing ontologies in the web
• Agents– Where knowledge representation is applied to multi-agent
systems, with a focus on knowledge changes and actions
• Integrated Logic Systems– Where you learn how logic programming systems are
implemented
• Project– A lot can be done in this area.– Just contact professors of these courses!
400
What next in partner Universities?
Even more for MCL, this time 1st year only
• In FUB– Module on Semantic Web, including course on Description Logics
• In TUD
– Advanced course in KRR with seminars on various topics (this year F-Logic, abduction and induction, …)
– General game playing, in which KRR is used for developing general game playing systems
– Advanced course in Description Logics
• In TUW– Courses on data and knowledge based systems, and much on answer-set
programming
• In UPM– Course on intelligent agents and multi-agent systems– Course on ontologies and the semantic web
401
The End
From now onwards it is up to you!
Study for the exam and do the project
I’ll always be available to help!