KR12 Semantic Index and TBox optimisation with respect to dependencies

Mariano Rodríguez-Muro and Diego Calvanese KRDB Research Group Free University of Bozen-Bolzano KR’12 Realizing OBDA with data dependencies Thursday, June 14, 2012

description

KR'12 presentation of two techniques for efficient reasoning and query answering in DL-Lite (OWL 2 QL). The first, the Semantic Index, is a technique for storing triples (ABoxes) in a way that already encodes all the entailments of the ontology (RDFS or OWL 2 QL) in the backend, allowing for reasoning without exponential rewritings or forward chaining. The technique can easily be used to construct very efficient triple stores with inference support. The second is a TBox optimisation technique that yields simpler TBoxes when the triples (ABoxes) satisfy a set of constraints.

Transcript of KR12 Semantic Index and TBox optimisation with respect to dependencies

Page 1: KR12 Semantic Index and TBox optimisation with respect to dependencies

Mariano Rodríguez-Muro and Diego Calvanese

KRDB Research Group

Free University of Bozen-Bolzano

KR’12

Realizing OBDA with data dependencies

Thursday, June 14, 2012

Page 2: KR12 Semantic Index and TBox optimisation with respect to dependencies

OBDA and -Quest-

DL-Lite (OWL 2 QL) - promise

• Light-weight DL
• Allows for QA by query rewriting into FO-queries (i.e., SQL)
• Mapping techniques that allow for OBDA (sound, GAV mappings)
• Fast query rewriting techniques
• Implementations available

Page 3: KR12 Semantic Index and TBox optimisation with respect to dependencies

A trivial example

[Figure: concept hierarchy over A, B, C, D, E, F, G, H, I, drawn with A at the top, B above E, D, F and C above G, H, I]

q(x) ← A(x), R(x, y), A(y), R(y, z), A(z)

• UCQ = 729 CQs, UNION of 729 SPJ SQL queries
• Datalog = 9 rules, 1 SPJ of UNIONS (3 nested UNIONS of 9 elements each)
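The figure 729 follows from the three atoms over A in q: each can be replaced by A or any of its subconcepts. The sketch below counts the combinations. It assumes the hierarchy is a tree in which A subsumes B and C, B subsumes E, D, F, and C subsumes G, H, I (a reading of the figure layout); `CHILDREN`, `subsumees` and `ucq_rewriting` are hypothetical names introduced for illustration.

```python
from itertools import product

# Assumed reading of the slide's hierarchy: concept -> direct subconcepts.
CHILDREN = {"A": ["B", "C"], "B": ["E", "D", "F"], "C": ["G", "H", "I"]}


def subsumees(concept):
    """Return the concept together with all of its transitive subconcepts."""
    result = [concept]
    for child in CHILDREN.get(concept, []):
        result.extend(subsumees(child))
    return result


def ucq_rewriting(concept_atoms):
    """Enumerate one CQ per combination of subconcepts chosen for each atom."""
    choices = [subsumees(c) for c in concept_atoms]
    return list(product(*choices))


# q(x) <- A(x), R(x, y), A(y), R(y, z), A(z) contains three atoms over A.
cqs = ucq_rewriting(["A", "A", "A"])
print(len(subsumees("A")))  # number of concepts that can stand in for A
print(len(cqs))             # number of CQs in the union
```

Under these assumptions the first count is 9 and the second is 9 × 9 × 9 = 729, matching the UCQ size on the slide; the 9 Datalog rules correspond to the 9 concepts that can stand in for A.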

Page 4: KR12 Semantic Index and TBox optimisation with respect to dependencies

DL-Lite (OWL 2 QL) - reality

• Query answering by query rewriting is still slow, but not due to the rewriting technique
• RDBMSs cannot handle our queries efficiently
• Work focused on the “size” of the rewriting, and dealing with A ⊑ ∃R ended up as an “encoding” game...
• The role of SQL engines, their features and limitations, has been neglected

Page 5: KR12 Semantic Index and TBox optimisation with respect to dependencies

“Efficient query answering in OBDA or QAO requires exploiting and optimizing every element/resource in the query answering system, not just query rewriting algorithms”

Page 6: KR12 Semantic Index and TBox optimisation with respect to dependencies

ABOX DEPENDENCIES

Describing our data sources

Page 9: KR12 Semantic Index and TBox optimisation with respect to dependencies

QAO revisited

[Diagram labels: Reasoner, Application, Inputs (TBox, ABox), Source, OBDA Model]

Page 10: KR12 Semantic Index and TBox optimisation with respect to dependencies

OBDA

[Diagram labels: Reasoner, Source, Application, Direct Communication, Inputs (TBox, OBDA Model)]

Page 11: KR12 Semantic Index and TBox optimisation with respect to dependencies

An ontology

CardiacArrest ⊑ Condition
Clog ⊑ Condition
Patient ⊑ ∃name
Patient ⊑ ∃age
Patient ⊑ ∃ssn
Patient ⊑ ∃affectedBy
∃name ⊑ Patient
∃age ⊑ Patient
∃ssn ⊑ Patient
∃affectedBy ⊑ Patient
∃affectedBy− ⊑ Condition

Page 12: KR12 Semantic Index and TBox optimisation with respect to dependencies

An example

Table: patient
id [PKEY]   name   age   ssn
12          John   37    xxx-999

Table: condition
patient_id [FKEY]   c_code
12                  33

OBDA Model

Page 15: KR12 Semantic Index and TBox optimisation with respect to dependencies

An example

Table: patient
id [PKEY]   name   age   ssn
12          John   37    xxx-999

Table: condition
patient_id [FKEY]   c_code
12                  33

OBDA Model (each mapping is a source SQL query followed by its target triples)

SELECT id,name,age,ssn FROM patient
:person/{$id} a Patient; name $name; age $age^^xsd:int; ssn $ssn

SELECT id, c_id FROM condition
:person/{$id} affectedBy :cond/{$id}. :cond/{$id} a Condition

SELECT id FROM condition WHERE c_code = 33
:cond/{$id} a CardiacArrest

SELECT id,c_id FROM condition WHERE c_code = 27
:cond/{$id} a Clog
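To make the role of a mapping concrete, here is a minimal sketch that evaluates the first mapping (the one over the patient table) and instantiates its target template once per row. The `MAPPING` dictionary, the triple-template format and the `materialise` helper are assumptions made for this illustration; they are not the mapping syntax or API of Quest.

```python
import sqlite3

# In-memory copy of the slide's patient table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT, age INTEGER, ssn TEXT)"
)
conn.execute("INSERT INTO patient VALUES (12, 'John', 37, 'xxx-999')")

# Hypothetical representation: a source query plus (subject, predicate, object)
# templates whose {placeholders} refer to the columns of the source query.
MAPPING = {
    "source": "SELECT id, name, age, ssn FROM patient",
    "target": [
        (":person/{id}", "a", "Patient"),
        (":person/{id}", "name", "{name}"),
        (":person/{id}", "age", "{age}^^xsd:int"),
        (":person/{id}", "ssn", "{ssn}"),
    ],
}


def materialise(conn, mapping):
    """Run the source query and fill in the target templates for every row."""
    cursor = conn.execute(mapping["source"])
    columns = [d[0] for d in cursor.description]
    triples = []
    for row in cursor:
        binding = dict(zip(columns, row))
        for template in mapping["target"]:
            triples.append(tuple(part.format(**binding) for part in template))
    return triples


triples = materialise(conn, MAPPING)
for triple in triples:
    print(triple)
```

In an OBDA setting these triples are normally left virtual and the mappings are used to translate queries; materialising them here only serves to show which ABox assertions a mapping stands for.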

Page 16: KR12 Semantic Index and TBox optimisation with respect to dependencies

ABox dependencies for DL-Lite

• In addition to the TBox and ABox (or OBDA model), we introduce a set of ABox dependencies Σ (Sigma)

Role descriptions: R := P | P−

Concept descriptions: B := A | ∃R

ABox constraints: B1 ⊑_A B2 and R1 ⊑_A R2

• Semantics:

∀x. B1(x) ∈ A → B2(x) ∈ A
∀x, y. R1(x, y) ∈ A → R2(x, y) ∈ A
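The semantics above can be checked directly on a concrete ABox. The following sketch does so for concept dependencies B1 ⊑_A B2. It assumes that "∃R(x) ∈ A" is read as "some assertion R(x, y) is in the ABox"; the fact encoding and the names `members` and `satisfies` are hypothetical and introduced only for this example.

```python
# ABox facts: ("A", "a") stands for A(a), ("P", "a", "b") stands for P(a, b).
ABOX = {
    ("Patient", "p12"),
    ("name", "p12", "John"),
    ("affectedBy", "p12", "c33"),
    ("CardiacArrest", "c33"),
    ("Condition", "c33"),
}


def members(abox, basic_concept):
    """Individuals explicitly in a basic concept B := A | ∃P | ∃P−."""
    kind, name = basic_concept
    if kind == "atomic":
        return {f[1] for f in abox if len(f) == 2 and f[0] == name}
    if kind == "exists":  # ∃P: individuals in first position of P
        return {f[1] for f in abox if len(f) == 3 and f[0] == name}
    if kind == "exists_inv":  # ∃P−: individuals in second position of P
        return {f[2] for f in abox if len(f) == 3 and f[0] == name}
    raise ValueError(f"unknown concept kind: {kind}")


def satisfies(abox, b1, b2):
    """True if every explicit member of b1 is also an explicit member of b2."""
    return members(abox, b1) <= members(abox, b2)


print(satisfies(ABOX, ("atomic", "Patient"), ("exists", "name")))
print(satisfies(ABOX, ("exists_inv", "affectedBy"), ("atomic", "Condition")))
print(satisfies(ABOX, ("exists", "name"), ("atomic", "Condition")))
```

The check looks only at assertions that are explicitly present, with no reasoning involved, which is what distinguishes an ABox dependency from a TBox inclusion.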

Page 17: KR12 Semantic Index and TBox optimisation with respect to dependencies

ABox constraints, example

CardiacArrest ⊑_A Condition
Clog ⊑_A Condition
Patient ⊑_A ∃name
Patient ⊑_A ∃age
Patient ⊑_A ∃ssn
Patient ⊑_A ∃affectedBy
∃name ⊑_A Patient
∃age ⊑_A Patient
∃ssn ⊑_A Patient
∃affectedBy ⊑_A Patient
∃affectedBy− ⊑_A Condition

Page 18: KR12 Semantic Index and TBox optimisation with respect to dependencies

ABOX CONSTRAINTS

Optimizing your TBox

Page 20: KR12 Semantic Index and TBox optimisation with respect to dependencies

Redundancy in T w.r.t. Sigma

Given a TBox T and a set of ABox dependencies Σ, we want to compute T′ s.t. each α ∈ T′ is not redundant w.r.t. Σ.

Chains in T and Sigma

A T-chain from B to C is a sequence of inclusion assertions (B_i ⊑ B′_i), i = 0, ..., n, for some n ≥ 0, such that:

1. B_0 = B, B′_n = C, and

2. for 1 ≤ i ≤ n, we have that B′_{i−1} and B_i are basic concepts s.t. either (i) B′_{i−1} = B_i, or (ii) B′_{i−1} = ∃R′ and B_i = ∃R′−, for some basic role R′.

Respectively for role chains.

Page 22: KR12 Semantic Index and TBox optimisation with respect to dependencies

Redundancy w.r.t. Sigma

Direct redundancy

B ⊑ C is directly redundant in T w.r.t. Σ if (i) Σ |= B ⊑_A C and (ii) for every T-chain (B_i ⊑ B′_i), i = 0, ..., n, with B′_n = B in T, there is a Σ-chain (B_i ⊑_A B′_i), i = 0, ..., n.

Similarly for roles.

[Figure: a T-chain and the corresponding Sigma-chain]

Page 23: KR12 Semantic Index and TBox optimisation with respect to dependencies

Redundancy w.r.t. Sigma

Redundancy

B ⊑ C is redundant in T w.r.t. Σ if

(a) it is directly redundant, or

(b) there exists B′ ≠ B s.t. (i) T |= B′ ⊑ C, (ii) B′ ⊑ C is not directly redundant in T w.r.t. Σ, and (iii) B ⊑ B′ is directly redundant.

[Figure: a T-chain and the corresponding Sigma-chain]

Page 24: KR12 Semantic Index and TBox optimisation with respect to dependencies

Redundancy w.r.t. Sigma

T′ is the set of inclusion assertions {α ∈ sat(T) | α is not redundant in sat(T) w.r.t. sat(Σ)}.

• We can compute T′ by reducing redundancy checks to reachability in DAGs
• Computable in polynomial time
• Optimal (no redundant axioms in T′)
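As a rough illustration of the reachability idea, the sketch below tests only condition (i) of direct redundancy, Σ |= B ⊑_A C, by following dependency edges in a graph. It deliberately ignores the chain condition (ii), role inclusions and the saturation step, so it is a sufficient test for (i) alone and not the procedure of the paper. The small Σ and T used here, and the names `reachable` and `candidates`, are hypothetical.

```python
from collections import defaultdict


def reachable(edges, start, goal):
    """Depth-first search: is goal reachable from start along the edges?"""
    graph = defaultdict(list)
    for sub, sup in edges:
        graph[sub].append(sup)
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


# Hypothetical dependencies that happen to hold in some data source.
SIGMA = [("CardiacArrest", "Condition"), ("∃affectedBy−", "Condition")]

# Part of the example TBox, as (subconcept, superconcept) pairs.
TBOX = [
    ("CardiacArrest", "Condition"),
    ("Clog", "Condition"),
    ("Patient", "∃name"),
]

# Axioms whose inclusion is already guaranteed by the data (condition (i) only).
candidates = [axiom for axiom in TBOX if reachable(SIGMA, *axiom)]
print(candidates)
```

Each check is a single graph traversal, which is consistent with the polynomial-time bound stated on the slide.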

Page 25: KR12 Semantic Index and TBox optimisation with respect to dependencies

ABOX CONSTRAINTS

Enforcing dependencies

Page 26: KR12 Semantic Index and TBox optimisation with respect to dependencies

OBDA with OWL 2 QL

OBDA Model: manipulate mappings

Manipulate the data

Inference shifting: separate ground reasoning from existential reasoning (efficiently)

Page 29: KR12 Semantic Index and TBox optimisation with respect to dependencies

OBDA with OWL 2 QL

[Diagram labels: Reasoner, Application, Inputs (TBox, ABox), OBDA Model]

Page 33: KR12 Semantic Index and TBox optimisation with respect to dependencies

Creating semantic ABox storage

1. Encode the hierarchies of T in indexes and intervals

[Figure: the concept hierarchy with one index per concept: A = 1, B = 2, E = 3, D = 4, F = 5, C = 6, G = 7, H = 8, I = 9]

2. Insert into your DB using those indexes

ABox: A = {D(d), H(h)}

C   IDX
d   4
h   8
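The numbering on the slide is what a depth-first traversal of the hierarchy produces. The sketch below reproduces it and builds the rows to insert, assuming the hierarchy is the tree A over B and C, B over E, D, F, and C over G, H, I; `CHILDREN` and `assign_indexes` are hypothetical names for this example.

```python
# Assumed reading of the slide's hierarchy: concept -> direct subconcepts.
CHILDREN = {"A": ["B", "C"], "B": ["E", "D", "F"], "C": ["G", "H", "I"]}


def assign_indexes(root):
    """Number the concepts in depth-first (pre-order) order, starting at 1."""
    index = {}

    def visit(concept):
        index[concept] = len(index) + 1
        for child in CHILDREN.get(concept, []):
            visit(child)

    visit(root)
    return index


index = assign_indexes("A")

# ABox assertions D(d) and H(h), stored as (concept, individual) pairs.
abox = [("D", "d"), ("H", "h")]

# Each assertion becomes one row (individual, index of its concept).
rows = [(individual, index[concept]) for concept, individual in abox]
print(index)
print(rows)
```

Because a depth-first numbering gives every subtree a contiguous block of indexes, all instances of a concept and of its subconcepts can later be fetched with a single range condition on IDX.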

Page 36: KR12 Semantic Index and TBox optimisation with respect to dependencies

Creating semantic ABox storage

[Figure: the concept hierarchy with index and interval per concept: A 1 [1,8], B 2 [2,5], E 3 [3,3], D 4 [4,4], F 5 [5,5], C 6 [6,9], G 7 [7,7], H 8 [8,8], I 9 [9,9]]

3. Define intervals to retrieve your data

4. Create mappings using those intervals

OBDA Model:

?x a A
SELECT c FROM t WHERE IDX >= 1 AND IDX <= 8

5. Complement with mappings to cover domain, range and inverse inferences
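Steps 3 and 4 can be tried out on a small in-memory database. In this sketch the intervals are copied verbatim from the slide, the table and column names (t, c, IDX) follow the slide's SQL, and the `instances` helper and the index name `t_idx` are assumptions added for illustration.

```python
import sqlite3

# Intervals as shown on the slide, one per concept.
INTERVALS = {
    "A": (1, 8), "B": (2, 5), "E": (3, 3), "D": (4, 4), "F": (5, 5),
    "C": (6, 9), "G": (7, 7), "H": (8, 8), "I": (9, 9),
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (c TEXT, IDX INTEGER)")
# The ABox D(d), H(h) stored with the indexes of D and H.
conn.executemany("INSERT INTO t VALUES (?, ?)", [("d", 4), ("h", 8)])
# An ordinary index on IDX lets the engine answer range conditions efficiently.
conn.execute("CREATE INDEX t_idx ON t (IDX)")


def instances(conn, concept):
    """Answer '?x a concept' with one range scan over the concept's interval."""
    low, high = INTERVALS[concept]
    cursor = conn.execute(
        "SELECT c FROM t WHERE IDX >= ? AND IDX <= ? ORDER BY IDX", (low, high)
    )
    return [row[0] for row in cursor]


print(instances(conn, "A"))
print(instances(conn, "B"))
print(instances(conn, "C"))
```

The point of the design is that the query for A stays a single SELECT with one range condition, instead of a union over A and each of its subconcepts.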

Page 37: KR12 Semantic Index and TBox optimisation with respect to dependencies

Experiments

• Experimentation using Stanford’s “Resource Index”
• Semantic search over annotated documents
• 200 ontologies from BioPortal (only hierarchies: 200k concepts, millions of subClassOf)

Page 38: KR12 Semantic Index and TBox optimisation with respect to dependencies

Experiments

• Current system uses chase
  • Naive chase: 7 days
  • Optimized chase: 40 mins
  • Cost: 16 GB + 70 GB chase data
  • Split-second responses
• Pure query rewriting approaches
  • Not feasible (UCQs or Datalog)
• Semantic index-based rewriting
  • DAG computation and indexing: 5 mins
  • Cost: 16 GB
  • Single queries, split-second responses

Page 39: KR12 Semantic Index and TBox optimisation with respect to dependencies

CONCLUSIONS

Page 40: KR12 Semantic Index and TBox optimisation with respect to dependencies

Conclusions

• Introduced ABox dependencies to describe the structure of data
• Showed how to optimize TBoxes w.r.t. dependencies
• Introduced the idea of “shifting” inferences from TBox reasoning to mapping reasoning
• Introduced “Semantic Index” repositories
• Not mentioned: “Equivalence optimization”

Page 41: KR12 Semantic Index and TBox optimisation with respect to dependencies

Conclusions

• Inference shifting is also possible in strict OBDA (see T-mappings [AMW’09])
• All of this (including T-mappings) is implemented in Quest and -ontop-, now available as a P4 plugin
• Combined with a fast rewriting technique (see [Kontchakov et al. 12] and [Rosati, 12]), T-mappings and Semantic Indexes allow for OBDA and QAO in practice
• Practical and theoretical evidence that we have covered relevant cases
• Time to put things into practice! (and look at the future: hybrid approaches, EL, Datalog+-, OWL 2 RL)

Page 42: KR12 Semantic Index and TBox optimisation with respect to dependencies

THANK YOU