Introduction to query rewriting optimisation with dependencies

Dependencies Making Ontology Based Data Access Work in Practice Mariano Rodriguez-Muro and Diego Calvanese {rodriguez,calvanese}@inf.unibz.it KRDB Research Centre Free University of Bozen Bolzano July, 2011 Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 1 / 33

description

Introduction to query rewriting optimisation with dependencies in APEX lab, Shanghai 2012.

Transcript of Introduction to query rewriting optimisation with dependencies

Page 1: Introduction to query rewriting optimisation with dependencies

Dependencies: Making Ontology Based Data Access Work in Practice

Mariano Rodriguez-Muro and Diego Calvanese
{rodriguez,calvanese}@inf.unibz.it

KRDB Research Centre
Free University of Bozen-Bolzano

July, 2011

Rodriguez-Muro and Calvanese (UNIBZ) APEX-Shanghai, 2011 July, 2011 1 / 33

Page 2: Introduction to query rewriting optimisation with dependencies

The context

Page 3: Introduction to query rewriting optimisation with dependencies

DL Ontologies

Description Logics:

• Formalisms for knowledge representation.

• Decidable fragments of FOL

• Base of OWL

• World is described by means of Concepts and Roles

Ontologies

• Intensional knowledge: TBox T.

• Extensional knowledge: ABox A.

Page 5: Introduction to query rewriting optimisation with dependencies

OBDA with DL-Lite

A family of light-weight ontology languages

• DL-Lite_F concepts: B := A | ∃R

• DL-Lite_F roles: R := P | P⁻

• DL-Lite_F TBoxes: B ⊑ B | B ⊑ ¬B | (funct R)

• DL-Lite_F ABoxes: A(a) | R(a, b)
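
The grammar above can be mirrored almost one-to-one in code. The following is a minimal sketch, not part of the talk: the class names (`Role`, `Atomic`, `Exists`, `Inclusion`, `Funct`) are hypothetical, and the sample TBox is the one used in the query answering example of this talk.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Role:
    """A basic role R: an atomic role P, or its inverse P⁻ when inverse=True."""
    name: str
    inverse: bool = False

    def inv(self) -> "Role":
        # The inverse of P is P⁻, and the inverse of P⁻ is P again.
        return Role(self.name, not self.inverse)


@dataclass(frozen=True)
class Atomic:
    """An atomic concept A."""
    name: str


@dataclass(frozen=True)
class Exists:
    """An unqualified existential restriction ∃R."""
    role: Role


Basic = Union[Atomic, Exists]  # B := A | ∃R


@dataclass(frozen=True)
class Inclusion:
    """B1 ⊑ B2, or B1 ⊑ ¬B2 when negated=True."""
    lhs: Basic
    rhs: Basic
    negated: bool = False


@dataclass(frozen=True)
class Funct:
    """The functionality assertion (funct R)."""
    role: Role


has_father = Role("hasFather")
tbox = [
    Inclusion(Atomic("Man"), Atomic("Person")),
    Inclusion(Atomic("Woman"), Atomic("Person")),
    Inclusion(Atomic("Person"), Exists(has_father)),
    Inclusion(Exists(has_father.inv()), Atomic("Person")),
]
```

Frozen dataclasses are used so that assertions are hashable and can be stored in sets, which is convenient when a TBox is saturated or compared.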

Page 10: Introduction to query rewriting optimisation with dependencies

Query Answering

TBox:

Man ⊑ Person, Woman ⊑ Person, Person ⊑ ∃hasFather, ∃hasFather⁻ ⊑ Person

ABox: Man(mariano)

Queries: q(x) ← Person(x), hasFather(x, y), Person(y)

Problem: Compute the certain answers of Q, denoted cert(Q,O).

The promise

We can do this as efficiently as answering DB queries, also in the virtual setting.
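
To see why mariano is a certain answer here, one can materialise the consequences of the TBox and evaluate the query over the result. The following is an illustrative sketch only, under several assumptions: the chase is infinite for this TBox, so it is cut off after a fixed number of rounds; the string encoding (`"E:hasFather"` for ∃hasFather, a trailing `-` for the inverse) and the `_n` prefix for invented individuals are conventions made up for this example. Rewriting-based OBDA systems avoid this materialisation altogether.

```python
from itertools import count

# TBox from the slide as (left-hand side, right-hand side) pairs.
# "E:hasFather" stands for ∃hasFather and "E:hasFather-" for ∃hasFather⁻.
TBOX = [("Man", "Person"), ("Woman", "Person"),
        ("Person", "E:hasFather"), ("E:hasFather-", "Person")]
ABOX = {("Man", "mariano")}


def members(concept, facts):
    """Individuals (or nulls) that belong to a basic concept in `facts`."""
    if not concept.startswith("E:"):
        return {f[1] for f in facts if len(f) == 2 and f[0] == concept}
    role = concept[2:]
    if role.endswith("-"):  # ∃P⁻: second arguments of P
        return {f[2] for f in facts if len(f) == 3 and f[0] == role[:-1]}
    return {f[1] for f in facts if len(f) == 3 and f[0] == role}  # ∃P


def chase(abox, tbox, rounds):
    """Apply the inclusions `rounds` times, inventing labelled nulls for ∃R."""
    facts, fresh = set(abox), count()
    for _ in range(rounds):
        new = set()
        for lhs, rhs in tbox:
            for ind in members(lhs, facts) - members(rhs, facts):
                if not rhs.startswith("E:"):
                    new.add((rhs, ind))
                elif rhs.endswith("-"):
                    new.add((rhs[2:-1], f"_n{next(fresh)}", ind))
                else:
                    new.add((rhs[2:], ind, f"_n{next(fresh)}"))
        facts |= new
    return facts


def answers(facts):
    """Evaluate q(x) ← Person(x), hasFather(x, y), Person(y); drop nulls."""
    person = members("Person", facts)
    return {f[1] for f in facts
            if len(f) == 3 and f[0] == "hasFather"
            and f[1] in person and f[2] in person
            and not f[1].startswith("_n")}


print(answers(chase(ABOX, TBOX, rounds=3)))  # {'mariano'}
```

Three rounds suffice for this query: the first derives Person(mariano), the second invents a father for mariano, and the third makes that father a Person.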

Page 14: Introduction to query rewriting optimisation with dependencies

Query Answering with PerfectRef (2005)

Query: q(x) ← Person(x), hasFather(x, y), Person(y)

Reformulation:

q(x) ← Person(x), hasFather(x, y), Person(y)

q(x) ← Person(x), hasFather(x, y), hasFather(z, y)

q(x) ← Person(x), hasFather(x, y)

q(x) ← Person(x), Person(x)

q(x) ← Person(x)

q(x) ← Person(x), hasFather(x, y), Man(y)

q(x) ← Person(x), hasFather(x, y), Woman(y)

q(x) ← hasFather(x, m), hasFather(x, y), Person(y)

q(x) ← hasFather(x, m), hasFather(x, y), hasFather(z, y)

q(x) ← hasFather(x, m), hasFather(x, y)

q(x) ← hasFather(x, m), Person(x)

q(x) ← hasFather(x, m), hasFather(x, t)

q(x) ← hasFather(x, m)

q(x) ← hasFather(x, m), hasFather(x, y), Man(y)

q(x) ← hasFather(x, m), hasFather(x, y), Woman(y)

q(x) ← Man(x), hasFather(x, y), Person(y)

q(x) ← Man(x), hasFather(x, y), hasFather(y, z)

q(x) ← Man(x), hasFather(x, y), Man(y)

q(x) ← Man(x), hasFather(x, y), Woman(y)

q(x) ← Woman(x), hasFather(x, y), Person(y)

q(x) ← Woman(x), hasFather(x, y), hasFather(y, z)

q(x) ← Woman(x), hasFather(x, y), Man(y)

q(x) ← Woman(x), hasFather(x, y), Woman(y)
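
The basic step behind such a reformulation is to replace one atom by an atom that entails it through a TBox inclusion. The sketch below illustrates only that single step for a concept atom; it is not the PerfectRef algorithm itself, which also unifies atoms and iterates until a fixpoint. The tuple encoding of ∃R and the function name `rewrite_concept_atom` are assumptions made for this example.

```python
# Positive inclusions of the example TBox as (lhs, rhs) pairs.
# ("E", "hasFather", False) encodes ∃hasFather; True marks the inverse role.
TBOX = [
    ("Man", "Person"),
    ("Woman", "Person"),
    ("Person", ("E", "hasFather", False)),
    (("E", "hasFather", True), "Person"),
]


def rewrite_concept_atom(concept, var, tbox, fresh="z"):
    """Atoms that entail concept(var) through one inclusion lhs ⊑ concept."""
    out = []
    for lhs, rhs in tbox:
        if rhs != concept:
            continue
        if isinstance(lhs, str):  # A ⊑ concept gives A(var)
            out.append((lhs, var))
        else:
            _, role, inverse = lhs
            # ∃P ⊑ concept gives P(var, z); ∃P⁻ ⊑ concept gives P(z, var)
            out.append((role, fresh, var) if inverse else (role, var, fresh))
    return out


print(rewrite_concept_atom("Person", "y", TBOX))
# [('Man', 'y'), ('Woman', 'y'), ('hasFather', 'z', 'y')]
```

Applied to the atom Person(y) of the original query, the three results are the replacements Man(y), Woman(y) and hasFather(z, y) that appear in the reformulation above.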

Page 17: Introduction to query rewriting optimisation with dependencies

Alternatives

• Improved version of PerfectRef (2007-2011)

• RQR (Urbina et al., 2007)

Too many unions, cannot execute!

• PRESTO (Rosati et al., 2010)

Better, but eventually it breaks.

• Combined Approach (Kontchakov et al., 2010)

Fast. But too much data and too much time.

Page 24: Introduction to query rewriting optimisation with dependencies

What can we do?

?

Page 25: Introduction to query rewriting optimisation with dependencies

Query Answering: It is not only about existential constants

Query: q(x, y) ← Person(x), hasFather(x, y), Person(y)

Reformulation:

q(x, y) ← Person(x), hasFather(x, y), Person(y)

q(x, y) ← Person(x), hasFather(x, y), hasFather(z, y)

q(x, y) ← Person(x), hasFather(x, y), Man(y)

q(x, y) ← Person(x), hasFather(x, y), Woman(y)

q(x, y) ← hasFather(x, m), hasFather(x, y), Person(y)

q(x, y) ← hasFather(x, m), hasFather(x, y), hasFather(z, y)

q(x, y) ← hasFather(x, m), hasFather(x, y), Man(y)

q(x, y) ← hasFather(x, m), hasFather(x, y), Woman(y)

q(x, y) ← Man(x), hasFather(x, y), Person(y)

q(x, y) ← Man(x), hasFather(x, y), hasFather(z, y)

q(x, y) ← Man(x), hasFather(x, y), Man(y)

q(x, y) ← Man(x), hasFather(x, y), Woman(y)

q(x, y) ← Woman(x), hasFather(x, y), Person(y)

q(x, y) ← Woman(x), hasFather(x, y), hasFather(z, y)

q(x, y) ← Woman(x), hasFather(x, y), Man(y)

q(x, y) ← Woman(x), hasFather(x, y), Woman(y)

Page 27: Introduction to query rewriting optimisation with dependencies

The full picture: Ontology Based Data Access

[Figure: OBDA architecture with the labels User, Queries, Ontology, Mappings, Source]

To deal with OBDA we need to consider:

• If in the backend we have RDBMSs, we cannot go beyond their capabilities.

• All systems are composed of T, D = 〈R, I〉, and M.

Page 28: Introduction to query rewriting optimisation with dependencies

First Observation: Is my data complete?

Completeness of A

The TBox says: Manager ⊑ Employee

In the ABox: all Managers are already Employees.

In any realistic scenario:

• We don’t use arbitrary sources;

• Intersection of semantics is reflected in completeness (e.g., no need to chase, expand or rewrite)

• This happens a lot!

Keyword

Redundancy

Page 36: Introduction to query rewriting optimisation with dependencies

Second Observation: There are no ABoxes

THERE ARE NO ABOXES!

Any ontology-based query answering system today:

• Uses relational DBs to store the ABox data;

• In such a D, both R and I can be manipulated;

• Implementors may choose any M for their system.

Opportunity

To complete an ABox we can do more than expansion.

Page 38: Introduction to query rewriting optimisation with dependencies

How to approach the problem: Two level approach

How to approach OBDA in practice?

• Efficient ways to deal with redundancy due to completeness.

• Efficient ways to complete (virtual) ABoxes.

Page 42: Introduction to query rewriting optimisation with dependencies

Contributions: Dealing with redundancy

Page 43: Introduction to query rewriting optimisation with dependencies

Characterizing completeness

ABox Dependencies

Definition

An assertion B ⊑_A B that restricts valid ABoxes.

Syntax: B₁ ⊑_A B₂

Semantics: A ⊨ Manager ⊑_A Employee if Manager(x) ∈ A implies Employee(x) ∈ A.

ABox dependencies are fundamentally different from TBox assertions. Think open world.
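
Because a dependency talks about what is explicitly stored, checking it is a plain containment test over the ABox. The following is a minimal sketch for atomic concepts only; the function name `satisfies` and the individuals `ann` and `bob` are made up for the illustration.

```python
def satisfies(abox, lhs, rhs):
    """True if every individual asserted in `lhs` is also asserted in `rhs`."""
    in_lhs = {ind for concept, ind in abox if concept == lhs}
    in_rhs = {ind for concept, ind in abox if concept == rhs}
    return in_lhs <= in_rhs


# ABoxes as sets of (concept, individual) assertions.
complete = {("Manager", "ann"), ("Employee", "ann"), ("Employee", "bob")}
incomplete = {("Manager", "ann"), ("Employee", "bob")}

print(satisfies(complete, "Manager", "Employee"))    # True
print(satisfies(incomplete, "Manager", "Employee"))  # False
```

The second ABox is still consistent with the TBox assertion Manager ⊑ Employee under the open world reading, since Employee(ann) is simply entailed. It only violates the dependency, which requires Employee(ann) to be stored.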

Page 45: Introduction to query rewriting optimisation with dependencies

Where to deal with redundancy?

Given a TBox T, an ABox A, a set of dependencies Σ and a query Q, what do we do?

Available Options:

• Optimize the query reformulation algorithm to deal with Σ.

• Optimize the TBox T with respect to Σ.

Page 49: Introduction to query rewriting optimisation with dependencies

When is an assertion redundant?

Direct Redundancy: Case 1

Let T imply the following hierarchy:

[Diagram: hierarchy over ∃hasFather, Person, Human]

Redundant if Σ is:

[Diagram: dependencies over ∃hasFather, Person, Human]

Σ says: hasFather(mariano, ramon) ∈ A → Human(mariano) ∈ A.

Page 53: Introduction to query rewriting optimisation with dependencies

When is an assertion redundant?

Direct Redundancy: Case 2

Let T be the following TBox:

[Diagram: hierarchy over Person, ∃hasFather⁻, ∃hasFather, Man]

Redundant if Σ is:

[Diagram: dependencies over Person, ∃hasFather⁻, ∃hasFather, Man]

Σ says: Man(ramon) ∈ A → there is an a′ such that hasFather(ramon, a′) ∈ A and Person(a′) ∈ A.

Page 58: Introduction to query rewriting optimisation with dependencies

When is an assertion redundant? Indirect Redundancy

Let T be the following TBox:

[Diagram: hierarchy over Animal, Man, Human]

Redundant if Σ is:

[Diagram: dependencies over Animal, Man, Human]

Σ says: if Man(mariano) ∈ A then Animal(mariano) ∈ A.

Page 62: Introduction to query rewriting optimisation with dependencies

Formalization: Redundancy

Given a TBox T and a set of dependencies Σ over T, the optimized version of T w.r.t. Σ, denoted optim(T, Σ), is the set of inclusion assertions

{α ∈ sat(T) | α is not redundant in sat(T) w.r.t. sat(Σ)}

We can compute optim(T, Σ) in linear time.
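
The shape of this computation can be sketched in a few lines, under strong simplifying assumptions that are not in the slides: only inclusions between atomic concept names are handled, an assertion counts as redundant only when the same inclusion is entailed by Σ (the direct case), and saturation is a naive transitive closure rather than a linear-time procedure. The function names and the example concepts are hypothetical.

```python
def saturate(inclusions):
    """Transitive closure of a set of (sub, sup) pairs, without trivial pairs."""
    closure = set(inclusions)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and a != d and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def optim(tbox, sigma):
    """Keep the saturated TBox inclusions not already guaranteed by sat(Σ).

    Simplification: only direct redundancy is detected here.
    """
    return saturate(tbox) - saturate(sigma)


tbox = {("Manager", "Employee"), ("Employee", "Person")}
sigma = {("Employee", "Person")}  # every stored Employee is a stored Person

print(sorted(optim(tbox, sigma)))
# [('Manager', 'Employee'), ('Manager', 'Person')]
```

In this example Employee ⊑ Person no longer needs to be considered during rewriting, because the data already reflects it, while the two assertions about Manager are kept.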

Page 63: Introduction to query rewriting optimisation with dependencies

Contributions: Completing ABoxes

Page 64: Introduction to query rewriting optimisation with dependencies

General considerations

OBDA systems have no ABoxes, instead virtual ABoxes V = 〈D, M〉 with D = 〈R, I〉.

If we want V ⊨ A ⊑_A B, we make sure that the mappings for B include all the data coming from the mappings of A.

Trade-off:

• Degree of completeness (# of dependencies),

• Cost of the procedure,

• Performance of query answering.

We can complete virtual ABoxes up to B ⊑ ∃R without the need for new data.
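
The idea of completing a virtual ABox through its mappings can be illustrated with an in-memory SQLite database. This is a sketch under assumptions: the `staff` table, the two mapping queries and the helper names are invented for the example, and a mapping is reduced to a list of SQL queries per concept name.

```python
import sqlite3

# D = <R, I>: a tiny relational schema with some data.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE staff(name TEXT, is_manager INTEGER);
    INSERT INTO staff VALUES ('ann', 1), ('bob', 0);
""")

# Hypothetical mappings M: concept name -> SQL queries producing its instances.
mappings = {
    "Manager": ["SELECT name FROM staff WHERE is_manager = 1"],
    "Employee": ["SELECT name FROM staff WHERE is_manager = 0"],
}


def instances(concept):
    """Virtual ABox assertions for `concept`: union of its mapping queries."""
    rows = set()
    for sql in mappings[concept]:
        rows |= {name for (name,) in db.execute(sql)}
    return rows


def complete(sub, sup):
    """Make V satisfy sub ⊑_A sup by adding sub's queries to sup's mappings."""
    for sql in mappings[sub]:
        if sql not in mappings[sup]:
            mappings[sup].append(sql)


print(sorted(instances("Employee")))  # ['bob']
complete("Manager", "Employee")
print(sorted(instances("Employee")))  # ['ann', 'bob']
```

No row is inserted into the database: only M changes, which is what makes the completion virtual.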

Page 68: Introduction to query rewriting optimisation with dependencies

Semantic Index for OBDA

General Idea

• Encode the semantics of T in numeric indexes and ranges for concept names and roles.

• Store the ABox in the database using those indexes and ranges.

• Make mappings for the system that take the ranges into account.

We can do this by using the implied hierarchy of T to generate the index and ranges!
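
One way to realise this idea is to number the concepts in a depth-first traversal of the hierarchy, so that every concept covers a contiguous range of indexes. The sketch below assumes the hierarchy is a tree (a general hierarchy may need several ranges per concept) and handles concept names only; the hierarchy, the individuals and the function names are invented for the illustration.

```python
# Hypothetical concept hierarchy (parent -> children), assumed to be a tree.
HIERARCHY = {"Person": ["Man", "Woman"], "Man": [], "Woman": []}


def build_index(root, hierarchy):
    """Pre-order numbering: returns {concept: (index, last index in subtree)}."""
    ranges, counter = {}, [0]

    def visit(concept):
        counter[0] += 1
        idx = counter[0]
        for child in hierarchy[concept]:
            visit(child)
        ranges[concept] = (idx, counter[0])

    visit(root)
    return ranges


RANGES = build_index("Person", HIERARCHY)
# Person -> (1, 3), Man -> (2, 2), Woman -> (3, 3)

# ABox stored as rows (individual, index of the asserted concept).
ROWS = [("mariano", RANGES["Man"][0]),
        ("maria", RANGES["Woman"][0]),
        ("alex", RANGES["Person"][0])]


def instances(concept):
    """Answer concept(x) with a single range test instead of a union."""
    low, high = RANGES[concept]
    return {ind for ind, idx in ROWS if low <= idx <= high}


print(sorted(instances("Person")))  # ['alex', 'maria', 'mariano']
print(sorted(instances("Man")))     # ['mariano']
```

In a relational backend the range test becomes a `BETWEEN` condition on an indexed integer column, so asking for all instances of Person needs neither a union of queries nor an expanded ABox.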


Page 78: Introduction to query rewriting optimisation with dependencies

Semantic Index Example

T = {B ⊑ A, C ⊑ A, C ⊑ D}

Hierarchy graph (A above B and C; D above C), each node labelled with its index and ranges:

• A: 1, {(1, 3)}

• B: 2, {(2, 2)}

• C: 3, {(3, 3)}

• D: 4, {(3, 4)}

We create a table TC with constant and idx columns. To insert the data we use the indexes, e.g., if B(mariano) ∈ A then we put (mariano, 2) ∈ TC.

We create the mappings using the ranges, e.g.,

SELECT constant FROM TC WHERE idx >= 1 AND idx <= 3 ⇝ A(constant)
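
The example can be reproduced with the sketch below. It is one possible way to obtain such an index and not necessarily the authors' procedure: it assumes an acyclic hierarchy, numbers concepts depth-first starting from the top concepts, and merges the indexes below each concept into ranges. The function names semantic_index and instances are made up for illustration.

```python
import sqlite3

def semantic_index(inclusions):
    """Assign an index to every concept and a list of index ranges covering its subconcepts.

    `inclusions` is a list of (sub, sup) pairs; the hierarchy is assumed acyclic.
    """
    children, order = {}, []
    for sub, sup in inclusions:
        for name in (sup, sub):
            if name not in children:
                children[name] = []
                order.append(name)
        children[sup].append(sub)

    has_parent = {sub for sub, _ in inclusions}
    index = {}

    def number(name):
        # Depth-first numbering; a concept reached twice keeps its first index.
        if name in index:
            return
        index[name] = len(index) + 1
        for child in children[name]:
            number(child)

    for name in order:
        if name not in has_parent:
            number(name)

    def covered(name):
        # Indexes of `name` and of everything below it.
        result = {index[name]}
        for child in children[name]:
            result |= covered(child)
        return result

    ranges = {}
    for name in order:
        merged = []
        for i in sorted(covered(name)):
            if merged and i == merged[-1][1] + 1:
                merged[-1] = (merged[-1][0], i)  # extend the current range
            else:
                merged.append((i, i))  # start a new range
        ranges[name] = merged
    return index, ranges

index, ranges = semantic_index([("B", "A"), ("C", "A"), ("C", "D")])

# Store B(mariano) using the index of B.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TC (constant TEXT, idx INTEGER)")
conn.execute("INSERT INTO TC VALUES (?, ?)", ("mariano", index["B"]))

def instances(concept):
    """Retrieve the instances of `concept` with one range condition per range."""
    where = " OR ".join("(idx >= ? AND idx <= ?)" for _ in ranges[concept])
    params = [bound for r in ranges[concept] for bound in r]
    rows = conn.execute(f"SELECT constant FROM TC WHERE {where}", params)
    return [row[0] for row in rows]
```

For this TBox the sketch yields the indexes and ranges of the example, and the range (1, 3) of A retrieves mariano although only B(mariano) was stored.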


Page 79: Introduction to query rewriting optimisation with dependencies

Experimentation I

The Resource Index features:

• Search over 22 document collections

• Semantics given by the hierarchies of 200 ontologies (SNOMED, GO)

Implementation in a nutshell:

(i) Understand documents with natural language processing and annotate

Cervical Cancer('doc224')

(ii) Expand the ABox

(iii) Pose queries that retrieve documents as

q(x) ← A_1(x) ∧ ··· ∧ A_n(x)
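
Steps (ii) and (iii) can be sketched as follows. Only the assertion Cervical Cancer('doc224') comes from the slide; the small hierarchy, the document doc225, and the helper names expand_abox and answer are assumptions made for illustration.

```python
def expand_abox(abox, inclusions):
    """Add C(x) for every A(x) in the ABox and every concept C above A in the hierarchy."""
    parents = {}
    for sub, sup in inclusions:
        parents.setdefault(sub, set()).add(sup)

    expanded = set(abox)
    frontier = list(abox)
    while frontier:
        concept, individual = frontier.pop()
        for sup in parents.get(concept, ()):
            fact = (sup, individual)
            if fact not in expanded:
                expanded.add(fact)
                frontier.append(fact)
    return expanded

def answer(abox, concepts):
    """Answers to q(x) <- A1(x) AND ... AND An(x): individuals asserted in every Ai."""
    per_concept = [{x for c, x in abox if c == concept} for concept in concepts]
    return set.intersection(*per_concept) if per_concept else set()

# Hypothetical hierarchy and data (only the first assertion is from the slide).
inclusions = [("Cervical Cancer", "Cancer"), ("Cancer", "Disease")]
abox = {("Cervical Cancer", "doc224"), ("Disease", "doc225")}
expanded = expand_abox(abox, inclusions)
```

On the expanded ABox the query over Cancer and Disease returns doc224; on the original ABox it returns nothing. The price is that every assertion is copied once per superconcept, which is the data growth reported in the following slides.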


Page 80: Introduction to query rewriting optimisation with dependencies

Experimentation II

The challenge:

• ≈ 3 million concepts and ≈ 2.5 million is-a assertions

• Split second responses

• 150 GB of data

• Expansion data: 1.5 TB

The experimentation data:

• ClinicalTrials.gov (CT)

• 181 million assertions (≈ 14 GB of data, ≈ 140 GB when expanded).


Page 81: Introduction to query rewriting optimisation with dependencies

Results

The query:

q(x) ← DNA Repair Gene(x) ∧ Antigen Gene(x) ∧ Cancer Gene(x)

Results:

• Traditional reformulation: union of 467874 SQL SPJ queries;

• Semantic Index: 1 SQL query; execution 3.582s (0.082s if warm); time to compute the semantic index: 1 min; size of data: +≈ 4 GB.

• ABox expansion: 1 SQL query; execution 3s (0.6s if warm); expansion time ≈ 7 days; size of data: +≈ 126 GB.


Page 83: Introduction to query rewriting optimisation with dependencies

The Query

The query:

q(x) ← DNA Repair Gene(x) ∧ Antigen Gene(x) ∧ Cancer Gene(x)

SELECT DISTINCT r0.element_id AS element_id
FROM RESOURCE_INDEX.CT_ANN r0
  JOIN RESOURCE_INDEX.CT_ANN r1 ON r0.element_id = r1.element_id
  JOIN RESOURCE_INDEX.CT_ANN r2 ON r1.element_id = r2.element_id
WHERE (r0.idx >= 1783559 AND r0.idx <= 1783657)
  AND (r1.idx >= 1782996 AND r1.idx <= 1783029)
  AND (r2.idx >= 1783115 AND r2.idx <= 1783253);

Standard SQL query, efficient in ANY DBMS.
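
A query of this shape can be generated mechanically from the ranges of the concepts in the conjunctive query. The sketch below is an illustration, not the system's actual generator: the function conjunctive_sql is hypothetical, it assumes every atom has at least one range and that the table name is trusted, and the test rows are made up. The schema prefix RESOURCE_INDEX is dropped so that the query runs in an in-memory SQLite database.

```python
import sqlite3

def conjunctive_sql(table, ranges_per_atom):
    """Build one SQL query for q(x) <- A1(x) AND ... AND An(x).

    `ranges_per_atom[i]` is the list of (low, high) index ranges of concept Ai.
    """
    if not ranges_per_atom:
        raise ValueError("need at least one atom")
    lines = ["SELECT DISTINCT r0.element_id AS element_id", f"FROM {table} r0"]
    # One self-join per additional atom, all on the same element.
    for i in range(1, len(ranges_per_atom)):
        lines.append(f"JOIN {table} r{i} ON r{i - 1}.element_id = r{i}.element_id")
    conditions = []
    for i, ranges in enumerate(ranges_per_atom):
        alternatives = " OR ".join(
            f"(r{i}.idx >= {int(low)} AND r{i}.idx <= {int(high)})" for low, high in ranges
        )
        conditions.append(f"({alternatives})")
    lines.append("WHERE " + " AND ".join(conditions))
    return "\n".join(lines)

# Ranges taken from the query on the slide.
sql = conjunctive_sql(
    "CT_ANN",
    [[(1783559, 1783657)], [(1782996, 1783029)], [(1783115, 1783253)]],
)

# Tiny made-up instance to check that the generated query runs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CT_ANN (element_id TEXT, idx INTEGER)")
conn.executemany(
    "INSERT INTO CT_ANN VALUES (?, ?)",
    [("doc1", 1783600), ("doc1", 1783000), ("doc1", 1783200), ("doc2", 1783600)],
)
result = [row[0] for row in conn.execute(sql)]
```

Only doc1 is annotated within all three ranges, so it is the only answer; the number of joins grows with the number of atoms, not with the size of the hierarchy.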


Page 84: Introduction to query rewriting optimisation with dependencies

Conclusions

Contributions

• We indicated that efficient OBDA requires taking into account more than just T, A and Q.

• We provided means to deal with redundancy at the level of the TBox.

• We showed that expansion is not necessary and that we can complete ABoxes instead.

• We presented two efficient ways to complete ABoxes, one for the general OBDA setting and one for the virtual setting.

Future work

• Exploring more expressive languages.

• Exploring the RDFS/SPARQL setting.

• Handling updates of T and A.


Page 86: Introduction to query rewriting optimisation with dependencies

Extra examples


Page 87: Introduction to query rewriting optimisation with dependencies

First Observation (cont.)

Mappings will introduce dependencies over ABoxes

Let R be a DB schema with the relation schema employee with attributes id, dept, and salary. Let M be the following mappings:

SELECT id, dept FROM employee ⇝ q(id, dept) ← Employee(id) ∧ WORKS-FOR(id, dept)

SELECT id, dept FROM employee WHERE salary > 1000 ⇝ q(id, dept) ← Manager(id) ∧ MANAGES(id, dept)

Then for any instance I, if Manager(John) ∈ A we have that Employee(John) ∈ A.

This is an indicator of completeness of all ABoxes A for M and R, e.g., A is complete w.r.t. Manager ⊑_A Employee.
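
The observation can be checked on a concrete instance with the sketch below. The two source queries are those of the mappings above; the rows in employee are made up. Note that the check covers one instance only; the claim for every instance follows from the Manager source query being a restriction of the Employee source query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("John", "sales", 2000), ("Mary", "sales", 900)],  # made-up rows
)

# Source queries of the two mappings.
employee_source = "SELECT id, dept FROM employee"
manager_source = "SELECT id, dept FROM employee WHERE salary > 1000"

# Concept assertions of the virtual ABox induced by each mapping.
employees = {row[0] for row in conn.execute(employee_source)}
managers = {row[0] for row in conn.execute(manager_source)}

# The dependency from Manager to Employee holds in this ABox
# iff every Manager individual is also an Employee individual.
dependency_holds = managers <= employees
```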


Page 90: Introduction to query rewriting optimisation with dependencies

Formalization: Chains

Let T be a TBox, B, C basic concepts, and Σ a set of dependencies over T. A T-chain from B to C in T (resp., a Σ-chain from B to C in Σ) is a sequence of concept inclusion assertions (B_i ⊑ B'_i)_{i=0}^{n} in T (resp., a sequence of inclusion dependencies (B_i ⊑_A B'_i)_{i=0}^{n} in Σ), for some n ≥ 0, such that:

1. B_0 = B, B'_n = C, and

2. for 1 ≤ i ≤ n, we have that B'_{i−1} and B_i are basic concepts s.t. either

   (i) B'_{i−1} = B_i, or

   (ii) B'_{i−1} = ∃R and B_i = ∃R⁻, for some basic role R.
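
The definition translates into a small checker. This is a sketch under an encoding chosen here, not one from the talk: atomic concepts are strings, ∃R is the tuple ("exists", role, inverse), and an assertion is a pair (left-hand side, right-hand side). The concepts Manager and Department and the role manages are hypothetical.

```python
def exists(role, inverse=False):
    """Basic concept ∃R; `inverse=True` stands for ∃R⁻ (encoding chosen for this sketch)."""
    return ("exists", role, inverse)

def links(prev_rhs, next_lhs):
    """Condition 2: equal concepts, or ∃R followed by ∃R⁻ for some basic role R."""
    if prev_rhs == next_lhs:
        return True
    return (
        isinstance(prev_rhs, tuple)
        and isinstance(next_lhs, tuple)
        and prev_rhs[1] == next_lhs[1]
        and prev_rhs[2] != next_lhs[2]
    )

def is_chain(sequence, start, end, assertions):
    """Check that `sequence` of (lhs, rhs) pairs is a chain from `start` to `end` in `assertions`."""
    if not sequence or any(pair not in assertions for pair in sequence):
        return False
    # Condition 1: the chain starts at `start` and ends at `end`.
    if sequence[0][0] != start or sequence[-1][1] != end:
        return False
    # Condition 2 for every pair of consecutive assertions.
    return all(links(sequence[i - 1][1], sequence[i][0]) for i in range(1, len(sequence)))

# Hypothetical TBox: Manager ⊑ ∃manages and ∃manages⁻ ⊑ Department.
tbox = {
    ("Manager", exists("manages")),
    (exists("manages", inverse=True), "Department"),
}
chain = [("Manager", exists("manages")), (exists("manages", inverse=True), "Department")]
```

The same function covers T-chains and Σ-chains, depending on whether `assertions` holds the inclusions of T or the dependencies of Σ.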


Page 91: Introduction to query rewriting optimisation with dependencies

Formalization: Redundancy

Let T be a TBox, B, C basic concepts, and Σ a set of dependencies. The concept inclusion assertion B ⊑ C is directly redundant in T w.r.t. Σ if

(i) Σ |= B ⊑_A C and

(ii) for every T-chain (B_i ⊑ B'_i)_{i=0}^{n} with B'_n = B in T, there is a Σ-chain (B_i ⊑_A B'_i)_{i=0}^{n}.

Then, B ⊑ C is redundant in T w.r.t. Σ if

(a) it is directly redundant, or

(b) there exists B' ≠ B s.t.

   (i) T |= B' ⊑ C,

   (ii) B' ⊑ C is not redundant in T w.r.t. Σ, and

   (iii) B ⊑ B' is directly redundant in T w.r.t. Σ.
