Post on 18-Aug-2015
Datalog± Track Introduction
&
Reasoning on UML Class Diagrams via Datalog±
Georg Gottlob
Department of Computer Science
University of Oxford
Joint work with G. Orsi, A. Pieris et al.
09:00 <Datalog+/- Track>
09:00 - Datalog Track Introduction
Tue, 4 August, 09:00 – 09:30
Description: Georg Gottlob, Giorgio Orsi, and Andreas Pieris: Consistency Checking of Re-Engineered UML Class Diagrams via Datalog+/- (Invited)
09:30 - Graal: A Toolkit for Query Answering with Existential Rules
Tue, 4 August, 09:30 – 10:00
Description: Jean-François Baget, Michel Leclère, Marie-Laure Mugnier, Swan Rocher and Clément Sipieter
10:00 - Binary Frontier-guarded ASP with Function Symbols
Tue, 4 August, 10:00 – 10:30
Description: Mantas Simkus
11:00 - Ontology-Based Multidimensional Contexts with Applications to Quality Data Specification and Extraction
Tue, 4 August, 11:00 – 11:30
Description: Mostafa Milani and Leopoldo Bertossi
11:30 - Existential rules and Bayesian Networks for Probabilistic Ontological Data Exchange
Tue, 4 August, 11:30 – 12:00
Description: Thomas Lukasiewicz, Maria Vanina Martinez, Livia Predoiu, Gerardo Ignacio Simari
Wed, 09:00 – 09:10
Datalog+, RuleML and OWL 2: Formats and Translations for Existential Rules
JF Baget, A Gutierrez, M Leclère, ML Mugnier, S Rocher and C Sipieter
The Need for Modeling Data and Objects
UML Class Diagrams
[Figure: UML class diagram with classes Company, Stock, Executive and Person; associations Member, Owns, Issues and Competes; Stock has attribute Index[0..1]:Str and operation getIndex():List]
Description Logics
Speaker ⊑ Person ⊓ ∃gives.Talk
Tutorial ⊑ ∀attendedBy.Student
attendedBy ⊑ attended⁻¹
Speaker ⊓ Student ⊑ ⊥
F-Logic
student::person
person[age *=> number]
person[age {0:1} *=> number]
person[name {1:*} *=> string]
SPARQL
PREFIX abc: <http://example.com/exampleOntology#>
SELECT ?capital ?country
WHERE {
?x abc:cityname ?capital ;
   abc:isCapitalOf ?y .
?y abc:countryname ?country ;
   abc:isInContinent abc:Africa . }
A Unifying Framework for Reasoning over Data
Datalog±
• Logical foundations, semantics and decidability
• Complexity results for ontological query answering
• Identification of tractable fragments
The Datalog± Family
• Datalog: ∀X∀Y φ(X,Y) → R(X)
• Extend Datalog by allowing in the head:
  • Existential quantification (TGDs) – ∀X∀Y φ(X,Y) → ∃Z ψ(X,Z)
  • Equality predicate (EGDs) – ∀X φ(X) → Xi = Xj
  • Constant false (Negative Constraints) – ∀X φ(X) → ⊥
• But query answering under Datalog[∃] is already undecidable
  see, e.g., [Beeri & Vardi, ICALP 1981] and [Calì, G. & Kifer, KR 2008]
• Datalog[∃,=,⊥] is syntactically restricted → Datalog±
Ontological Query Answering
[Figure: a database D and an ontology Ο together form the knowledge base ⟨D,Ο⟩, against which the query is evaluated]
⟨D,Ο⟩ ⊨ Query ⇔ D ∧ Ο ⊨ Query
Guarded Datalog[∃]
∀X∀Y∀Z R(X,Y,Z) ∧ S(Y) ∧ P(X,Z) → ∃W Q(X,W)
guard: R(X,Y,Z)
• All ∀-variables occur in one body atom – guard atom
• Models of finite treewidth ⇒ decidability of query answering
  related to [Andréka, Németi & van Benthem, J. Philosophical Logic 1998] and [Grädel, J. Symb. Log. 1999]
• Query answering is PTIME-complete in data complexity
[Calì, G. & Kifer, KR 2008]
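The guardedness condition is purely syntactic, so it can be checked mechanically. The following is a minimal sketch, assuming a rule body is given as a list of (predicate, arguments) pairs whose arguments are all variables; the names `find_guard`, `is_guarded` and `is_linear` and the `unguarded` example are illustrative, not taken from the talk.

```python
from typing import List, Optional, Set, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (predicate name, argument variables)


def body_variables(body: List[Atom]) -> Set[str]:
    """All universally quantified variables occurring in the rule body."""
    return {v for _, args in body for v in args}


def find_guard(body: List[Atom]) -> Optional[Atom]:
    """Return a body atom that contains every body variable, or None."""
    all_vars = body_variables(body)
    for atom in body:
        if all_vars <= set(atom[1]):
            return atom
    return None


def is_guarded(body: List[Atom]) -> bool:
    return find_guard(body) is not None


def is_linear(body: List[Atom]) -> bool:
    """Linear rules have exactly one body atom (and are trivially guarded)."""
    return len(body) == 1


# Body of the rule on the slide: R(X,Y,Z) ∧ S(Y) ∧ P(X,Z)
slide_rule = [("R", ("X", "Y", "Z")), ("S", ("Y",)), ("P", ("X", "Z"))]
# Hypothetical non-guarded body: no single atom contains X, Y and Z
unguarded = [("R", ("X", "Y")), ("S", ("Y", "Z"))]

print(find_guard(slide_rule))  # ('R', ('X', 'Y', 'Z'))
print(is_guarded(unguarded))   # False
```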
Linear Datalog[∃]
• Rules with just one body-atom
  ∀X supervisorOf(X,X) → manager(X)
• Query answering is first-order rewritable
[Calì, G. & Lukasiewicz, J. Web. Sem. 2012]
First-Order Rewritability
∀D (D ∪ Ο ⊨ Q ⇔ D ⊨ Q*)
[Figure: compilation turns Q and Ο into a positive first-order query Q_FO (Q*); translation turns Q* into SQL; evaluation runs it over D]
Query answering is in AC0 in data complexity
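As a small illustration of the equivalence D ∪ Ο ⊨ Q ⇔ D ⊨ Q*, the sketch below takes the single linear rule supervisorOf(X,X) → manager(X) and the atomic query Q(X) ← manager(X). Its rewriting is Q*(X) = manager(X) ∨ supervisorOf(X,X), which is evaluated on the database alone. The facts about ann, bob and carl are hypothetical, and the saturation function is included only as a reference to compare against.

```python
def answers_by_rewriting(db):
    """Evaluate Q*(X) = manager(X) ∨ supervisorOf(X,X) directly on D."""
    direct = {args[0] for pred, args in db if pred == "manager"}
    via_rule = {args[0] for pred, args in db
                if pred == "supervisorOf" and args[0] == args[1]}
    return direct | via_rule


def answers_by_saturation(db):
    """Reference semantics: apply the rule to D, then evaluate Q(X) ← manager(X)."""
    derived = {("manager", (args[0],)) for pred, args in db
               if pred == "supervisorOf" and args[0] == args[1]}
    saturated = set(db) | derived
    return {args[0] for pred, args in saturated if pred == "manager"}


# Hypothetical database D
db = {
    ("supervisorOf", ("ann", "ann")),
    ("supervisorOf", ("ann", "bob")),
    ("manager", ("carl",)),
}

print(sorted(answers_by_rewriting(db)))  # ['ann', 'carl']
```

The rewritten query never touches the ontology at evaluation time, which is what allows it to be handed to a standard SQL engine.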
Linear Datalog[∃]
• Rules with just one body-atom
  ∀X supervisorOf(X,X) → manager(X)
• Query answering is first-order rewritable
• … and thus, query answering is in AC0 in data complexity
[Calì, G. & Lukasiewicz, J. Web. Sem. 2012]
Datalog[∃,=,⊥]
• Every decidable Datalog[∃] language can be enriched with:
  • Non-Conflicting EGDs – no interaction with TGDs
  • Negative constraints
• Unsatisfiability can be reduced to query answering
• If the theory is satisfiable, then consider only the TGDs
see, e.g., [Calì, G. & Pieris, Artif. Intell. 2012]
Guarded Datalog[∃,=,⊥]
• Logical foundations, semantics and decidability
• Complexity results for ontological query answering
• Identification of tractable fragments
UML Class Diagrams
• Modeling the intensional structure of a software system
• Reasoning services limited to diagram satisfiability checking
• Verify properties at runtime by reasoning over a (possibly
incomplete) state of the system – query answering
Querying UML Class Diagrams: Example I
[Figure: UML class diagram with classes Company, Stock, Executive and Person; associations Member, Owns, Issues and Competes; attribute Index[0..1]:Str and operation getIndex():List]
Database:
Executive(john)
Member(john,LU)
Stock(BAY)
Issues(BA,BAY)
Owns(john,BAY)
Competes(LU,BA)
Does anybody have a potential conflict of interest?
Conflict ← Person(P), Company(C1), Company(C2), Stock(S),
Owns(P,S), Member(P,C1), Issues(C2,S), Competes(C1,C2)
Atoms entailed by the diagram:
Person(john)
Company(LU)
Company(BA)
Answer: {P → john, C1 → LU, C2 → BA, S → BAY}
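Example I can be replayed in a few lines. This is only a sketch under stated assumptions: the typing rules in `saturate` are guessed from the derived atoms Person(john), Company(LU) and Company(BA), and the argument order Member(person, company) follows the database facts of this example rather than any rule shown elsewhere in the talk.

```python
from itertools import product

# Database of Example I
db = {
    ("Executive", ("john",)), ("Member", ("john", "LU")), ("Stock", ("BAY",)),
    ("Issues", ("BA", "BAY")), ("Owns", ("john", "BAY")), ("Competes", ("LU", "BA")),
}


def saturate(facts):
    """Apply the (assumed) typing rules until no new atom is derived."""
    facts = set(facts)
    while True:
        new = set()
        for pred, args in facts:
            if pred == "Executive":      # Executive(X) → Person(X)
                new.add(("Person", args))
            elif pred == "Member":       # assumed: Member(P,C) → Company(C)
                new.add(("Company", (args[1],)))
            elif pred == "Issues":       # assumed: Issues(C,S) → Company(C) ∧ Stock(S)
                new.add(("Company", (args[0],)))
                new.add(("Stock", (args[1],)))
            elif pred == "Competes":     # assumed: Competes(C1,C2) → Company(C1) ∧ Company(C2)
                new.add(("Company", (args[0],)))
                new.add(("Company", (args[1],)))
        if new <= facts:
            return facts
        facts |= new


def conflicts(facts):
    """Evaluate the Conflict query; returns the bindings (P, C1, C2, S)."""
    def ext(pred):
        return {args for p, args in facts if p == pred}

    person, company, stock = ext("Person"), ext("Company"), ext("Stock")
    answers = set()
    for (p, s), (p2, c1), (c2, s2), (c1b, c2b) in product(
            ext("Owns"), ext("Member"), ext("Issues"), ext("Competes")):
        if (p == p2 and s == s2 and c1 == c1b and c2 == c2b
                and (p,) in person and (c1,) in company
                and (c2,) in company and (s,) in stock):
            answers.add((p, c1, c2, s))
    return answers


print(conflicts(db))            # set(): Person(john) is not stored explicitly
print(conflicts(saturate(db)))  # {('john', 'LU', 'BA', 'BAY')}
```

The query has no answer over the raw database; the answer appears only once the atoms implied by the diagram are taken into account, which is the point of the example.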
Querying UML Class Diagrams: Example II
[Figure: UML class diagram with Student, Professor, Member and Group; a {disjoint} constraint; associations WorksIn and Leads; association class CLeads with attribute since: Date]
Database:
Group(DB)
Is there a professor who works in the database group?
Ans ← Professor(P), WorksIn(P,DB)
Atoms derived from the diagram (z1, z2 are labeled nulls):
Leads(z1,DB)
Professor(z1)
glue(z1,DB,z2)
CLeads(z2)
WorksIn(z1,DB)
…
Answer: {P → z1, DB → DB}
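The derivation in Example II is a chase that invents labeled nulls. The sketch below hard-codes a set of rules that is consistent with the derived atoms (Leads, Professor, glue, CLeads, WorksIn); since the diagram itself is not recoverable from the slides, these rules are an assumption, and the function names are hypothetical.

```python
import itertools


def chase(db, max_rounds=10):
    """Restricted chase for the assumed rules of Example II."""
    facts = set(db)
    fresh = (f"z{i}" for i in itertools.count(1))  # labeled nulls z1, z2, ...
    for _ in range(max_rounds):
        new = set()
        for pred, args in facts:
            if pred == "Group":
                g = args[0]
                # Group(G) → ∃Z Leads(Z,G); fire only if G has no leader yet
                if not any(p == "Leads" and a[1] == g for p, a in facts | new):
                    new.add(("Leads", (next(fresh), g)))
            elif pred == "Leads":
                prof, g = args
                new.add(("Professor", (prof,)))   # Leads(P,G) → Professor(P) ∧ Group(G)
                new.add(("Group", (g,)))
                new.add(("WorksIn", (prof, g)))   # Leads(P,G) → WorksIn(P,G)
                # Leads(P,G) → ∃Y glue(P,G,Y); fire only if not yet satisfied
                if not any(p == "glue" and a[:2] == args for p, a in facts | new):
                    new.add(("glue", (prof, g, next(fresh))))
            elif pred == "glue":
                # Leads(P,G) ∧ glue(P,G,Y) → CLeads(Y); glue atoms only arise from Leads here
                new.add(("CLeads", (args[2],)))
        if new <= facts:
            break
        facts |= new
    return facts


def professors_in(facts, group):
    """Bindings of P in Ans ← Professor(P), WorksIn(P, group)."""
    return {a[0] for p, a in facts
            if p == "WorksIn" and a[1] == group and ("Professor", (a[0],)) in facts}


result = chase({("Group", ("DB",))})
print(professors_in(result, "DB"))  # {'z1'}
```

The witness z1 is a null rather than a named individual, so the Boolean query is entailed even though no concrete professor is known.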
From Diagrams to First-Order Logic (Datalog±)
[Figure: UML class diagram with classes Company, Stock, Executive and Person; associations Member, Owns, Issues and Competes; attribute Index[0..1]:Str and operation getIndex():List]
∀X∀Y Member(X,Y) → Company(X) ∧ Executive(Y)
∀X Company(X) → ∃Y∃Z Member(X,Y) ∧ Member(X,Z) ∧ Y ≠ Z
∀X Executive(X) → ∃Y Member(Y,X)
∀X Company(X) → ∃Y Issues(X,Y)
∀X Stock(X) → ∃Y Issues(Y,X)
∀X∀Y∀Z Stock(X) ∧ Issues(Y,X) ∧ Issues(Z,X) → Y = Z
∀X∀Y Stock(X) ∧ Index(X,Y) → Str(Y)
∀X∀Y Stock(X) ∧ getIndex(X,Y) → List(Y)
∀X Executive(X) → Person(X)
[Berardi et al., Artif. Intell. 2005]
Complexity of Query Answering
• EXPTIME-complete in combined complexity
  implicit in [Berardi et al., Artif. Intell. 2005] and [Lutz, IJCAR 2008]
• coNP-complete in data complexity
  implicit in [Ortiz, Calvanese & Eiter, AAAI 2006]
• Undecidable when diagrams are combined with arbitrary OCL (Object Constraint Language) constraints
  [folklore]
Research Challenge: Reduce High Complexity
• Diagrams often have very large instantiations
• Some applications require very large diagrams
• OCL constraints, which are not expressible diagrammatically,
lead to undecidability or high complexity
Aims and Objectives
• Restrict UML class diagrams to achieve tractability of query
answering in data complexity
• Better understanding of combined complexity
• Add relevant OCL constraints without losing tractability of
query answering in data complexity
Lean UML Class Diagrams
1. For each attribute assertion Attribute[i..j]:Type, j ∈ {1,∞}
2. For each association A between C1 (multiplicity mL..mU) and C2 (multiplicity nL..nU):
  • upper bounds mU, nU ∈ {1,∞},
  • if A generalizes some other association, then mU = nU = ∞
3. Completeness constraints are forbidden, e.g., a class C with subclasses C1 and C2 marked {complete}:
  ∀X C(X) → C1(X) ∨ C2(X)
[Calì, G., Orsi & Pieris, FOSSACS 2012]
Lean UML Class Diagrams: Example I
[Figure: the class diagram with classes Company, Stock, Executive and Person; associations Member, Issues, Owns and Competes; attribute Index[0..1]:Str and operation getIndex():List]
Lean UML Class Diagrams: Example II
[Figure: the class diagram with Student, Professor, Member and Group; a {disjoint} constraint; associations WorksIn and Leads; association class CLeads with attribute since: Date]
Non-Diagrammatic Constraints (OCL)
disjoint classes [Figure: classes C, C1, C2 and C3]
∀X C2(X) ∧ C3(X) → ⊥
We need negative constraints of the form
∀X C1(X) ∧ … ∧ Cn(X) → ⊥
most-specific class [Figure: classes C, C1 and C2]
∀X C1(X) ∧ C2(X) → C(X)
We need most-specific class constraints of the form
∀X C1(X) ∧ … ∧ Cn(X) → C(X)
Non-Diagrammatic Constraints (OCL)
type the domain of Enrolled [Figure: class Student with attribute Enrolled[0..1]:Course; class CS-Student with attribute Enrolled[0..1]:CS-Course]
∀X∀Y CS-Course(X) ∧ Enrolled(Y,X) → CS-Student(Y)
We need domain-type constraints of the form
∀X∀Y C(X) ∧ Attribute(Y,X) → T(Y)
Non-Diagrammatic Constraints (OCL)
OCL (Object Constraint Language)
Most-specific class: ∀X C1(X) ∧ C2(X) → C(X)
Context C1 inv:
  C1.allInstances -> forAll ( x1: C1 |
    C2.allInstances -> forAll ( x2: C2 |
      x1=x2 implies x2.oclIsTypeOf(C)
    )
  )
Disjoint classes: ∀X C1(X) ∧ C2(X) → ⊥
Context C1 inv:
  C1.allInstances -> forAll ( x1: C1 |
    C2.allInstances -> forAll ( x2: C2 |
      x1<>x2
    )
  )
Domain type: ∀X∀Y C(X) ∧ A(Y,X) → T(Y)
Context Object inv:
  Object.allInstances -> forAll ( y: Object |
    y.A.oclIsTypeOf(C) implies y.oclIsTypeOf(T)
  )
Lean UML Class Diagrams as Rules
classes
∀X∀Y C(X) ∧ Attr(X,Y) → T(Y)
∀X C(X) → ∃Y1…∃Yn Attr(X,Y1) ∧ … ∧ Attr(X,Yn)
∀X C1(X) → C2(X)
associations
∀X1∀X2 A(X1,X2) → C1(X1) ∧ C2(X2)
∀X1∀X2∀Y A(X1,X2) ∧ glue(X1,X2,Y) → CA(Y)
∀X1∀X2 A(X1,X2) → ∃Y glue(X1,X2,Y)
∀X C(X) → ∃Y1…∃Yn A(X,Y1) ∧ … ∧ A(X,Yn)
∀X C(X) → ∃Y1…∃Yn A(Y1,X) ∧ … ∧ A(Yn,X)
∀X1∀X2 A1(X1,X2) → A2(X1,X2)
additional constraints
∀X C1(X) ∧ … ∧ Cn(X) → C(X)
∀X∀Y C(X) ∧ Attr(Y,X) → T(Y)
Data Complexity of Query Answering
Theorem: Query answering under Lean UML class diagrams + negative & most-specific class & domain-type constraints is PTIME-complete
Proof:
• in PTIME: reduction to query answering under guarded TGDs
• PTIME-hardness (even without domain-type constraints): reduction from Path System Accessibility
[Calì, G., Orsi & Pieris, FOSSACS 2012]
Combined Complexity of Query Answering
Theorem: Query answering under Lean UML class diagrams + negative & most-specific class constraints is PSPACE-complete
Proof:
• in PSPACE: there exists a tree-like universal model (chase procedure)
[Figure: chase tree growing from the database D, containing the atoms C(x) and A(y,z)]
  … can be computed from A(y,z) ∪ type(A(y,z))
  type(A(y,z)): atoms whose arguments are among {y,z}
  ∀X∀Y C(X) ∧ A(X,Y) → T(Y)
  ∀X∀Y A(X,Y) → C1(X) ∧ C2(Y)
  ∀X∀Y A(X,Y) → A′(X,Y)
• type(A(y,z)) can be computed from A(y,z) ∪ type(C(x)) in NP
• construct non-deterministically a finite part with at most |Q| atoms
[Figure: the atoms R(a,v), S(v,w), B(y,z) and T(z) of the chase tree are matched step by step against the query]
Q ← R(X1,X2), S(X2,X3), B(X4,X5), T(X5)
Combined Complexity of Query Answering
Proof:
• PSPACE-hardness: simulation of the computation of a PSPACE Turing machine on input I = α0α1…αn-1, assuming that it uses m = n^k cells
• Initialization rules
  ∀X initial(X) → initial-state(X)
  ∀X initial(X) → cell0[α0,1](X)
  ∀X initial(X) → celli[αi,0](X), for each i ∈ {1,…,n-1}
  ∀X initial(X) → celli[0,0](X), for each i ∈ {n,…,m-1}
  [Figure: class initial together with classes initial-state, cell0[α0,1], …, celln-1[αn-1,0], celln[0,0], …, cellm-1[0,0]]
• Configuration generation rules
  ∀X initial(X) → config(X)
  ∀X config(X) → ∃Y succ(X,Y)
  ∀X∀Y config(X) ∧ succ(X,Y) → config(Y)
  [Figure: classes initial and config; attribute succ[1..1]:config]
• Rules to describe the transition from one configuration to another, e.g., state transition rules
  for each δ(⟨s1,α1⟩) = ⟨s2,α2,d⟩:
  ∀X∀Y s1-celli[α1,1](X) ∧ succ(X,Y) → state-s2(Y), for each i ∈ {0,…,m-1}
  s1-celli[α1,1](X): in configuration X, which has state s1, the i-th cell contains α1, and the cursor is over cell i
  [Figure: class s1-celli[α1,1] with attribute succ[0..1]:state-s2]
• Acceptance rule ∀X accept-state(X) → accept(X)
  [Figure: classes accept-state and accept]
• Initial database D = {initial(c)} – c is the initial configuration
• Boolean CQ yes ← accept(c)
• Turing machine accepts I iff D ∪ Σ ⊨ yes
Combined Complexity of Query Answering
Theorem: Query answering under Lean UML class diagrams + negative & most-specific class & domain-type constraints is EXPTIME-complete
Proof:
• in EXPTIME: reduction to guarded TGDs of bounded arity
• EXPTIME-hardness: simulation of an alternating PSPACE Turing machine
• Acceptance rules – q ∈ {∃,∀} and i ∈ {1,2}:
  ∀X accept-stateq(X) → accept(X)
  ∀X∀Y accept(X) ∧ succi(Y,X) → accepti(Y) (domain-type)
  ∀X state∃(X) ∧ accepti(X) → accept(X) (most-specific class)
  ∀X state∀(X) ∧ accept1(X) ∧ accept2(X) → accept(X) (most-specific class)
  [Figure: classes accept-stateq and accept]
Further Restrictions – Restricted Lean (RLean)
different classes have disjoint sets of attributes and operations
[Figure: class Student with attribute Enrolled[0..1]:Course; class CS-Student with attribute Enrolled[0..1]:CS-Course]
instead of ∀X∀Y Student(X) ∧ Enrolled(X,Y) → Course(Y)
we have ∀X∀Y Student-Enrolled(X,Y) → Course(Y)
Data complexity in AC0 and combined complexity NP-complete
Overview
Linear (in AC0): RLean UML
Guarded (PTIME-c): Lean UML
Complexity of Querying Lean UML Class Diagrams
UML Formalism   Additional Constraints                   Data Complexity   Combined Complexity
Lean            negative, specific-class, domain-type    PTIME-c           EXPTIME-c
Lean            negative, specific-class                 PTIME-c           PSPACE-c
RLean           negative, specific-class                 in AC0            NP-c
Description Logics (DLs)
• Logics that model the domain of interest in terms of:
  • Concepts → sets of objects
  • Roles → binary relations on sets of objects
• Used for conceptual reasoning in the Semantic Web context
[Figure: concept person = {p1, p2, p3}; role father = {(f1,p2), (f2,p3), (f3,p1)}]
person ⊑ ∃father⁻¹    each person has a father
∃father ⊑ person      each father is a person
The DL-Lite Family
Popular family of DLs with AC0 data complexity
B ::= A | ∃R | ∃R⁻¹ (basic concept)
C ::= B | ¬B (general concept)
R ::= P | P⁻¹ (basic role)
E ::= R | ¬R (general role)
DL-Lite_core: B ⊑ C
DL-Lite_F: DL-Lite_core + (funct R)
DL-Lite_R (OWL 2 QL): DL-Lite_core + R ⊑ E
[Calvanese, De Giacomo, Lembo, Lenzerini & Rosati, J. Autom. Reasoning 2007]
The DL-Lite Family
Popular family of DLs with AC0 data complexity
DL-Lite TBox              First-Order Representation (Datalog±)
DL-Lite_core
professor ⊑ ∃teachesTo    ∀X professor(X) → ∃Y teachesTo(X,Y)
professor ⊑ ¬student      ∀X professor(X) ∧ student(X) → ⊥
DL-Lite_R
hasTutor⁻ ⊑ teachesTo     ∀X∀Y hasTutor(X,Y) → teachesTo(Y,X)
DL-Lite_F
funct(hasTutor)           ∀X∀Y∀Z hasTutor(X,Y) ∧ hasTutor(X,Z) → Y = Z
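The translation from DL-Lite axioms to rules is mechanical, one rule per axiom. Here is a minimal sketch covering only the four axiom shapes used on this slide; the tuple encoding of axioms and the function name `translate` are assumptions made for illustration.

```python
def translate(axiom):
    """Translate one DL-Lite axiom (encoded as a tuple) into a Datalog± rule string."""
    kind = axiom[0]
    if kind == "exists":        # A ⊑ ∃R  (TGD with an existential head variable)
        _, a, r = axiom
        return f"∀X {a}(X) → ∃Y {r}(X,Y)"
    if kind == "disjoint":      # A ⊑ ¬B  (negative constraint)
        _, a, b = axiom
        return f"∀X {a}(X) ∧ {b}(X) → ⊥"
    if kind == "inverse_incl":  # R⁻ ⊑ P  (plain Datalog rule, arguments swapped)
        _, r, p = axiom
        return f"∀X∀Y {r}(X,Y) → {p}(Y,X)"
    if kind == "funct":         # funct(R)  (EGD)
        _, r = axiom
        return f"∀X∀Y∀Z {r}(X,Y) ∧ {r}(X,Z) → Y = Z"
    raise ValueError(f"unsupported axiom shape: {kind}")


tbox = [
    ("exists", "professor", "teachesTo"),
    ("disjoint", "professor", "student"),
    ("inverse_incl", "hasTutor", "teachesTo"),
    ("funct", "hasTutor"),
]
for ax in tbox:
    print(translate(ax))
```

Every rule produced this way has a single body atom except the negative constraint and the EGD, which is why the TGD part of a DL-Lite TBox falls into the linear fragment.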
The EL Family
Popular family of DLs with PTIME data complexity (biological applications)
EL:     C ::= ⊤ | A | C1 ⊓ C2 | ∃P.C     C1 ⊑ C2
ELH:    P1 ⊑ P2
ELHI:   R ::= P | P⁻¹                    R ⊑ P
ELHI¬:  E ::= R | ¬R                     R ⊑ E
see, e.g., [Baader, IJCAI 2003], [Rosati, DL 2007] and [Pérez-Urbina et al., J. Applied Logic 2010]
The EL Family
Popular family of DLs with PTIME data complexity (biological applications)
ELHI¬ TBox     First-Order Representation (Datalog±)
A ⊓ B ⊑ C      ∀X A(X) ∧ B(X) → C(X)
A ⊑ ∃R         ∀X A(X) → ∃Y R(X,Y)
A ⊑ ∃R.B       ∀X A(X) → ∃Y R(X,Y) ∧ B(Y)
∃R ⊑ A         ∀X∀Y R(X,Y) → A(X)
∃R.A ⊑ B       ∀X∀Y R(X,Y) ∧ A(Y) → B(X)
A ⊑ ¬B         ∀X A(X) ∧ B(X) → ⊥
R ⊑ ¬P         ∀X∀Y R(X,Y) ∧ P(X,Y) → ⊥
DLs vs. Datalog[∃,=,⊥]
• Theorem: For query answering,
  DL-Lite ≤L Linear
  ELHI¬ ≤L Guarded
• Remark: the data complexity is preserved
• Theorem: Linear/Guarded TGDs are strictly more expressive
  ∀X supervisorOf(X,X) → manager(X)
Overview
Linear (in AC0): DL-Lite and RLean UML
Guarded (PTIME-c): Lean UML and ELHI¬
Frame Logic (F-Logic)
• Originally developed for object-oriented deductive databases
• Now used as an ontology language – Semantic Web
• F-Logic is in general undecidable
see, e.g., [Kifer, Lausen & Wu, J. ACM 1995]
F-Logic Lite
• An expressive decidable fragment of F-Logic
• Obtained from F-Logic by:
• Excluding negation and default inheritance
• Allowing only a limited form of cardinality constraints
• Query answering is PTIME-complete in data complexity
[Calì & Kifer, VLDB 2006]
From F-Logic Lite to First-Order Logic (Datalog±)
[Calì & Kifer, VLDB 2006]
data(O,A,V) ∧ data(O,A,W) ∧ funct(A,O) → V = W
type(O,A,T) ∧ data(O,A,V) → member(V,T)
sub(C1,C3) ∧ sub(C3,C2) → sub(C1,C2)
member(O,C) ∧ sub(C,C1) → member(O,C1)
mandatory(A,O) → ∃V data(O,A,V)
member(O,C) ∧ type(C,A,T) → type(O,A,T)
sub(C,C1) ∧ type(C1,A,T) → type(C,A,T)
type(C,A,T1) ∧ sub(T1,T) → type(C,A,T)
sub(C,C1) ∧ mandatory(A,C1) → mandatory(A,C)
member(O,C) ∧ mandatory(A,C) → mandatory(A,O)
sub(C,C1) ∧ funct(A,C1) → funct(A,C)
member(O,C) ∧ funct(A,C) → funct(A,O)
Non-guarded rules: more expressive Datalog[∃] language?
Weakly-Guarded Datalog[∃]
[Calì, G. & Kifer, KR 2008]
∀X∀Y S(X,Y) ∧ P(X,Y) → ∃Z P(Y,Z)
∀X∀Y∀W P(X,Y) ∧ P(W,X) → S(Y,X)
Affected positions (where a null can occur during the chase) = {P[2], S[1]}
• All ∀-variables at affected positions occur in one body atom
• Models of finite treewidth ⇒ decidability of query answering
  related to [Andréka, Németi & van Benthem, J. Philosophical Logic 1998] and [Grädel, J. Symb. Log. 1999]
• Query answering is EXPTIME-complete in data complexity
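The set of affected positions can be computed as a least fixpoint. The sketch below assumes the usual inductive definition: a position is affected if an existentially quantified variable occurs there in some head, or if a head variable occurring there appears in the body only at affected positions. The rule encoding and function names are illustrative.

```python
def positions_by_variable(body):
    """Map each body variable to the set of (predicate, index) positions where it occurs."""
    occ = {}
    for pred, args in body:
        for i, v in enumerate(args, start=1):
            occ.setdefault(v, set()).add((pred, i))
    return occ


def affected_positions(rules):
    """rules: list of (body, head, existential_vars); atoms are (pred, args) pairs."""
    affected = set()
    # Base case: positions holding an existentially quantified head variable
    for _, head, ex in rules:
        for pred, args in head:
            for i, v in enumerate(args, start=1):
                if v in ex:
                    affected.add((pred, i))
    # Inductive case: propagate until nothing changes
    changed = True
    while changed:
        changed = False
        for body, head, _ in rules:
            occ = positions_by_variable(body)
            for pred, args in head:
                for i, v in enumerate(args, start=1):
                    if v in occ and occ[v] <= affected and (pred, i) not in affected:
                        affected.add((pred, i))
                        changed = True
    return affected


def is_weakly_guarded(rules):
    """Each rule needs a body atom containing all variables that occur only at affected positions."""
    affected = affected_positions(rules)
    for body, _, _ in rules:
        occ = positions_by_variable(body)
        dangerous = {v for v, pos in occ.items() if pos <= affected}
        if not any(dangerous <= set(args) for _, args in body):
            return False
    return True


# The two rules from the slide
rules = [
    ([("S", ("X", "Y")), ("P", ("X", "Y"))], [("P", ("Y", "Z"))], {"Z"}),
    ([("P", ("X", "Y")), ("P", ("W", "X"))], [("S", ("Y", "X"))], set()),
]
print(sorted(affected_positions(rules)))  # [('P', 2), ('S', 1)]
print(is_weakly_guarded(rules))           # True
```

In the second rule only Y occurs exclusively at affected positions, so P(X,Y) serves as the weak guard even though W lies outside it.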
F-Logic Lite is Weakly-Guarded
[Calì & Kifer, VLDB 2006]
type(O,A,T) ∧ data(O,A,V) → member(V,T)
sub(C1,C3) ∧ sub(C3,C2) → sub(C1,C2)
member(O,C) ∧ sub(C,C1) → member(O,C1)
mandatory(A,O) → ∃V data(O,A,V)
member(O,C) ∧ type(C,A,T) → type(O,A,T)
sub(C,C1) ∧ type(C1,A,T) → type(C,A,T)
type(C,A,T1) ∧ sub(T1,T) → type(C,A,T)
sub(C,C1) ∧ mandatory(A,C1) → mandatory(A,C)
member(O,C) ∧ mandatory(A,C) → mandatory(A,O)
sub(C,C1) ∧ funct(A,C1) → funct(A,C)
member(O,C) ∧ funct(A,C) → funct(A,O)
Affected Positions: data[1], data[3], type[1], member[1], mandatory[2], funct[2]
Tractability?
Polynomial Cloud Criterion (PCC)
[Calì, G. & Kifer, KR 2008]
cloud(a): all atoms in the chase with arguments in a and domain constants
• For every database, only polynomially many clouds in the chase (up to isomorphism), and can be computed in PTIME
• Theorem: Query answering under fixed weakly-guarded TGDs that satisfy the PCC is in PTIME in data complexity
• A useful tool for identifying tractable cases – F-Logic Lite
Overview
Linear (in AC0): DL-Lite and RLean UML
Guarded (PTIME-c): Lean UML and ELHI¬
Weakly-Guarded (EXPTIME-c)
PCC (within Weakly-Guarded): F-Logic Lite
SPARQL Protocol and RDF Query Language
• RDF – data model for representing information in the Web
• … in fact, an RDF graph is a finite set of triples (subject, predicate, object) – or a relational database for the schema {triple(.,.,.)}
• SPARQL – the standard language for querying RDF data
Some SPARQL Queries
• P = (?X, name, ?Y) – list of pairs (o1,o2) such that o2 is the name of o1
• P = (?X, name, B) – list of elements that have a name
• P = (?X, name, ?Y) OPT (?X, phone, ?Z) – for every object o, return the object o, the name of o, and the phone number of o, if the phone number is available; otherwise, return the object o and its name
From SPARQL to First-Order Logic (Datalog±)
P = (?X, name, ?Y) – list of pairs (o1,o2) such that o2 is the name of o1
∀X∀Y triple(X,name,Y) → queryP(X,Y)
P = (?X, name, B) – list of elements that have a name
∀X∀Y triple(X,name,Y) → queryP(X)
P = (?X, name, ?Y) OPT (?X, phone, ?Z) – for every object o, return the object o, the name of o, and the phone number of o, if the phone number is available; otherwise, return the object o and its name
∀X∀Y∀Z triple(X,name,Y) ∧ triple(X,phone,Z) → queryP(X,Y,Z) ∧ compatibleP(X)
  compatibleP: list of individuals with phone number
∀X∀Y triple(X,name,Y) ∧ ¬compatibleP(X) → queryP,3(X,Y)
  queryP,3: the third argument (i.e., the phone no.) is missing
Non-guarded rules, and also negation is needed
Stratified Weakly-Guarded Datalog[∃,¬,⊥]
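The two rules for the OPT pattern can be evaluated stratum by stratum: the positive rule is computed first, and the rule with the negated compatibleP atom only afterwards. A minimal sketch follows, assuming compatibleP collects the objects that have a phone number; the triples and the function name `evaluate_opt` are hypothetical.

```python
def evaluate_opt(triples):
    """Evaluate P = (?X, name, ?Y) OPT (?X, phone, ?Z) via two stratified rules."""
    names = {(s, o) for s, p, o in triples if p == "name"}
    phones = {(s, o) for s, p, o in triples if p == "phone"}

    # Stratum 1: triple(X,name,Y) ∧ triple(X,phone,Z) → queryP(X,Y,Z) ∧ compatibleP(X)
    query_p = {(x, y, z) for x, y in names for x2, z in phones if x == x2}
    compatible_p = {x for x, _, _ in query_p}

    # Stratum 2: triple(X,name,Y) ∧ ¬compatibleP(X) → queryP,3(X,Y)
    # compatible_p is complete at this point, so the negation is safe to evaluate.
    query_p3 = {(x, y) for x, y in names if x not in compatible_p}
    return query_p, query_p3


# Hypothetical RDF data as (subject, predicate, object) triples
triples = {
    ("o1", "name", "Ann"),
    ("o1", "phone", "555-0100"),
    ("o2", "name", "Bob"),
}

with_phone, without_phone = evaluate_opt(triples)
print(with_phone)     # {('o1', 'Ann', '555-0100')}
print(without_phone)  # {('o2', 'Bob')}
```

Evaluating the second rule before the first stratum is complete could wrongly report an object as having no phone, which is why stratification matters here.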
Additional Functionalities
• Reasoning capabilities – deal with RDFS and OWL vocabularies
• Navigational capabilities – exploit the graph structure of RDF data
• General form of recursion – express natural queries
Theorem: Stratified weakly-guarded Datalog[∃,¬,⊥] is strictly more expressive than SPARQL enriched with the above functionalities (under the OWL 2 QL and OWL 2 RL profiles of OWL 2)
Overview
Linear (in AC0): DL-Lite and RLean UML
Guarded (PTIME-c): Lean UML and ELHI¬
Weakly-Guarded (EXPTIME-c): SPARQL
PCC (within Weakly-Guarded): F-Logic Lite
Complexity of Datalog[∃,=,¬,⊥]
Program     Query          Linear      Guarded      Weakly-Guarded
Arbitrary   Arbitrary      PSPACE-c    2EXPTIME-c   2EXPTIME-c
Arbitrary   Fixed/Atomic   PSPACE-c    2EXPTIME-c   2EXPTIME-c
Fixed       Arbitrary      NP-c        NP-c         EXPTIME-c
Fixed       Fixed/Atomic   in AC0      PTIME-c      EXPTIME-c
Data Complexity: the last row (fixed program, fixed/atomic query)
Polynomial Cloud Criterion: for a fixed program, the Weakly-Guarded entries become NP-c (arbitrary query) and PTIME-c (fixed/atomic query)
A Unifying Framework for Reasoning over Data
Datalog±
• Logical foundations, semantics and decidability
• Complexity results for ontological query answering
• Identification of tractable fragments
Thank you!