Reasoning on Data
Transcript of Reasoning on Data
ReasoningonData:
TheOntology-MediatedQueryAnsweringProblem
Marie-Laure Mugnier
University of Montpellier
UNILOGSchool–Vichy–2018
KNOWLEDGEREPRESENTATIONANDREASONING(KR)
• A field historically at the heart of Artificial Intelligence
• Study formalisms (or languages) to
• represent various kinds of human knowledge • do reasoning on these representations
• along the tradeoff expressivity / tractability of reasoning
à KR languages based on computational logic In this talk: classical first-order logic (FOL)
Major conferences: IJCAI, AAAI, KR
M.-L.Mugnier–UNILOGSchool– 2018 2
Part1:Basics
Knowledge bases, Ontologies Logical view of Queries and Data Main KR formalisms to represent and reason with ontologies Ontology-Mediated Query Answering
Part2:KRformalismsandalgorithmicapproaches
Part3:DecidabilityissuesintheexistenFalruleframework
OVERVIEWOFTHETUTORIAL
M.-L.Mugnier–UNILOGSchool– 2018 3
Knowledge Base (KB)
Reasoning Services
• General knowledge on the application domain « Cats are Mammals »
• Factual Knowledge Description of specific individuals, situations, ...
Félix is a Cat
Factbase, Database
Fundamental tasks
• Checking the consistency of the KB
• Computing answers to a query over the KB
...
KNOWLEDGEBASEDSYSTEMS
Ontology
Knowledge expressed in a KR language
Reasoning algorithms associated with the KR language
M.-L.Mugnier–UNILOGSchool– 2018 4
WHATISANONTOLOGY?
In computer science: a formal specification of the knowledge of a particular domain Ø which allows for machine processing Ø that relies on the semantics of knowledge
> automated reasoning
Such a specification consists of Ø a vocabulary in terms of concepts and relations Ø semantic relationships between these elements
M.-L.Mugnier–UNILOGSchool– 2018 5
EXAMPLESOFONTOLOGIES
¢ Medecine and life sciences : hundreds of available ontologies
� general medical ontologies
SNOMED CT (400 000 terms) GALEN (> 30 000 terms)
� specialized medical ontologies FMA (anatomy) NCI (cancer), ...
� biology � agronomy
¢ Information systems of large organizations and corporations
M.-L.Mugnier–UNILOGSchool– 2018 6
ATTHECOREOFONTOLOGIES:CONCEPTS/CLASSES
¢ Concept/class:acategoryofenSSes(objects)thatshareproperSes InFOL:unarypredicate:Cat,Mammal
¢ Instanceofclass:aspecificmemberofthisclass
¢ FundamentalrelaSononclasses:specializaFon(subsumpFon)(«isakindof»,«subclassof»)
SemanScs:everyinstanceofC1isalsoinstanceofC2
C1
C2
C1 subclass of C2
�x (C1(x) à C2(x)) �x (Cat(x) à Mammal(x))
In FOL: a term (variable or constant)
M.-L.Mugnier–UNILOGSchool– 2018 7
ATTHEHEARTOFONTOLOGIES:CONCEPTS/CLASSES
Disease
LungDisease
Pneumonia
InfecSousPneumonia
InfecSousDisease
BacterialDisease
ViralDisease
BacterialPneumonia
Legionella
InfecSousAgent
Bacteria Virus
OrganismDisorder
However, an ontology is not just a classification!
Concepts organized by specialization
(SNOMEDCT)
M.-L.Mugnier–UNILOGSchool– 2018 8
ONTOLOGIESAREMUCHMORETHANCLASSIFICATIONS
Vocabulary 1. concepts / classes
2. relations (between instances)
« An ontology specifies the vocabulary of an application domain and semantic relationships between the terms of the vocabulary»
+ semantic relationships between concepts
+ semantic relationships between relations
+ properties of relations
+ other axioms that more generally express domain knowledge
+ properties of concepts
M.-L.Mugnier–UNILOGSchool– 2018 9
RELATIONSBETWEENINSTANCES
Often these are binary relations (also called « roles » or « properties »)
subrelation of
SignatureofarelaSon:assignsamaximumconcepttoeachargument(«domain»and«range»inOWL)
�x�y (hasCA(x,y) à dueTo(x,y))
dueTo
hasCausaFveAgent
�x�y (hasCA(x,y) à Disease(x) � Organism(y))
Argument1:aDisorderArgument2:anOrganismor[...]
Argument1:aDiseaseArgument2:anOrganism
M.-L.Mugnier–UNILOGSchool– 2018 10
EXAMPLESOFOTHERFREQUENTTYPESOFAXIOMS
• Necessaryand/orsufficientproperSesofconcepts (ex: BacterialDisease)
• Properties of relations
• NegaSveconstraints(disjointnessbetweenconcepts,relaSons,...)
Bacteria∩Virus=� �x(Bacteria(x)�Virus(x)à )�x(Bacteria(x)à¬Virus(x))
�x (BacterialDisease(x) → �y (Bacteria(y) � hasCausativeAgent(x,y))
A bacterial disease is caused by a bacteria
inverse relations: �x�y (hasPart(x,y) ⟷ isPartOf(y,x))
symmetry, transitivity, ...
functional relation: �x�y�z (isPartOf(x,y) � isPartOf(x,z) → y = z)
M.-L.Mugnier–UNILOGSchool– 2018 11
WHATKINDSOFLANGUAGESTOEXPRESSONTOLOGIES?
HierarchiesofclassesHierarchiesofbinaryrelaSons(called«properSes»)SignaturesoftheserelaSons(«domain»and«range»)
àOWLDLfragmentofRDFSchema(SemanScWeb)
DescripFonLogicsRule-basedlanguages Datalog,existenSalrules,
RDFdeducSverules,AnswerSetProgramming...
Very light languages
More expressive fragments of first-order logics
From a logical viewpoint: anontologyiscomposedofafinitesetofpredicates(toexpressconceptsandrelaSons)afinitesetof(closed)formulasoverthesepredicates oftheform�X(condiFon[X]àconclusion[X])
M.-L.Mugnier–UNILOGSchool– 2018 12
WHATAREONTOLOGIESGOODFOR?
• provideacommonvocabulary
àitiseasiertoshareinformaSon(typicallybetweenexpertsofseveraldomains)
• constrainthemeaningofterms
àforcestoexplicitnot-saidthingsandtoremoveambiguiSeshencelessmisunderstandings
• todoautomatedreasoning,basisofhigh-levelservices
à findimplicitlinksbetweenpiecesofknowledgeà checktheconsistencyoftheKB,finderrorsinmodelingà enrichdataqueryanswering
M.-L.Mugnier–UNILOGSchool– 2018 13
«findallpaSentsaffectedbyalungdiseaseduetoabacteria»
Data
Query (SQL, SPARQL, MongoDB ...)
ONTOLOGY-MEDIATEDQUERYANSWERING(EX:MEDICALRECORDS)
Database(relaSonal,RDF,NoSQL,...)
??
PaSentP:Diagnosis=«legionella»
M.-L.Mugnier–UNILOGSchool– 2018 14
DonnéesData
Query
Knowledge Base
AlegionellaisbacterialpneumoniaAbacterialpneumoniaisapneumoniaApneumoniaisalungdiseaseAbacterialpneumoniaiscausedbyabacteriaIfxiscausedbyythenxisduetoyIfthediagnosisofapaSentxcontainsadiseaseythenxisaffectedbyy
ONTOLOGY-MEDIATEDQUERYANSWERING
«findallpaSentsaffectedbyalungdiseaseduetoabacteria»
PaSentP:Diagnosis=«legionella»
Ontology
M.-L.Mugnier–UNILOGSchool– 2018 15
Knowledgebase
Query
Ontology
Addinganontologicallayerontopofdata
1-Enrichthevocabularyallowingtoabstractfromaspecificdatastorage
2-Infernewfacts,notexplicitelystored,allowingforincompletedatarepresentaSon
Data
ONTOLOGY-MEDIATEDQUERYANSWERING(OMQA)
M.-L.Mugnier–UNILOGSchool– 2018 16
Query
Ontology
Data
3–provideaunifiedviewofmulSplesources
Data Data
ONTOLOGY-MEDIATEDQUERYANSWERING(OMQA)
M.-L.Mugnier–UNILOGSchool– 2018 17
OMQAEXAMPLE:ONTOLOGICALKNOWLEDGE
AlegionellaisbacterialpneumoniaAbacterialpneumoniaisapneumoniaApneumoniaisalungdiseaseAbacterialpneumoniaiscausedbyabacteriaIfxiscausedbyythenxisduetoyIfthediagnosisofapaSentxcontainsadiseaseythenxisaffectedbyy
�x(Legionella(x)→BacterialPneumonia(x))
�x(BacterialPneumonia(x)→�y(hasCausaSveAgent(x,y)�Bacteria(y)))
�x�y(hasCausaSveAgent(x,y)→dueTo(x,y))
�x�y((Diagnosis(x,y)�Disease(y))→isAffectedBy(x,y))
M.-L.Mugnier–UNILOGSchool– 2018 18
FACTBASE
RelaSonalschema: finitesetRofrelaSonsàpredicates infinitedomainofvaluesàconstants
Factbase:usuallyasetofgroundatoms(ontheontologicalvocabulary)seenastheconjuncSonoftheseatoms
F={PaSent(P),Diagnosis(P,M),Legionella(M)}
ArelaFonaldatabasemaynaturallybeviewedasafactbase
InstanceofarelaSonr: finitesetoftuplesonràatomsonr
rakr1 akr2a1a2a1
a2a3a1
«Thediagnosisforthepa4entPislegionella»
Databaseinstance={instanceforeachrinR} àfactbase
{r(a1,a2),r(a2,a3),r(a1,a1)}
M.-L.Mugnier–UNILOGSchool– 2018 19
CONJUNCTIVEQUERIES(CQ)
« find all patients affected by a lung disease due to a bacteria»q(x) = ∃y ∃z (Patient(x) � isAffBy(x,y) � LungDisease(y) � dueTo(y,z) � Bacteria(z)) A CQ is an existentially quantified conjunction of atoms The free variables are the answer variables If closed formula: Boolean CQ DatalognotaSonans(x)ßPaSent(x),isAffBy(x,y),LungDisease(y),dueTo(y,z),Bacteria(z)Select-Join-ProjectqueriesinrelaSonalalgebra(SQL)SELECT...FROM…WHERE<joincondi4ons> SPARQL(semanScwebqueries)SELECT…WHERE<basicgraphpa=ern>
M.-L.Mugnier–UNILOGSchool– 2018 20
ANSWERSTOACONJUNCTIVEQUERY
¢ TheanswertoaBooleanCQqinFisyesifF�qyes=()
¢ LettheCQq(x1,...,xk).Atuple(a1,…,ak)ofconstantsisananswertoqwithrespecttoafactbaseFif F�q[a1,...,ak],whereq[a1,...,ak]isobtainedfromq(x1,...,xk)byreplacingeachxibyai
¢ LetFandqbeseenassetsofatoms.AhomomorphismhfromqtoFisa
mappingfromvariables(q)toterms(F)suchthath(q)F
F�q()iffqcanbemappedbyhomomorphismtoF(a1,…,ak)isananswertoq(x1,...,xk)w.r.t.Fiff
thereisahomomorphismfromqtoFthatmapseachxitoai
M.-L.Mugnier–UNILOGSchool– 2018 21
KEYNOTION:HOMOMORPHISM
q(x)=∃y(movie(y)∧play(x,y))
HomomorphismhfromqtoF:subsStuSonofvar(q)byterms(F)suchthath(q)F
F
h1:xàayàm1
h2:xàayàm2
h1(q)=movie(m1)∧play(a,m1)
h2(q)=movie(m2)∧play(a,m2)
movie(m1)movie(m2)movie(m3)actor(a)actor(b)actor(c)play(a,m1)play(a,m2)play(c,m3)
Answers:obtainedbyrestricSngthedomainsofhomomorphisms toanswervariables
x = a x = c
h3:xàcyàm3 h3(q)=movie(x0)∧play(c,m3)
movie(y)play(x,y)
M.-L.Mugnier–UNILOGSchool– 2018 22
« find all patients affected by a l u n g d i s e a s e d u e t o a bacteria »
q(x) = ∃y∃z(PaSent(x)�isAffectedBy(x,y)�LungDisease(y)�dueTo(y,z)�Bacteria(z))
Factbase = { Patient(P), Diagnosis(P,M), Legionella(M) } « The diagnosis for the patient P is legionella »
Legionellaspecialisa4onofLungDiseaseandBacterialDisease(andDisease)henceLungDisease(M) henceBacterialDisease(M),
Disease(M)
�x(BacterianDisease(x)→�y(hasCausaSveAgent(x,y)�Bacteria(y)))hencehasCausaSveAgent(M,b)andBacteria(b)
�x�y(hasCausaSveAgent(x,y)→dueTo(x,y))hencedueTo(M,b)
�x�y((Diagnosis(x,y)�Disease(y))→isAffectedBy(x,y))henceisAffectedBy(P,M) Answer : x = P
ONTHEOMQAEXAMPLE
M.-L.Mugnier–UNILOGSchool– 2018 23
AMOREGENERALSCHEMA
Query
Ontology
Data Data Data
Mappingsfromdatatofacts
{Databasequery⤳Facts}
Factbase
Conceptuallevel
DescripSonoftheapplicaSondomainwithahighabstracSonlevel
Queryusingthevocabularyoftheontology
Factbase(possiblyvirtual)usingthevocabularyoftheontology
Independentandheterogeneousdatasources
Theanswerstothequeryareinferredfromtheknowledgebase
«Ontology-BasedDataAccess»[Poggietal.,JoDS,2008]
M.-L.Mugnier–UNILOGSchool– 2018 24
MAPPINGS
q(x):�n�sPaSent_T(x,n,s) ⤳PaSent(x)q’(x):�n�sPaSent_T(x,n,s)�DiagnosSc_T(x,y)�y=«Legionella»
⤳�z(diagnosis(x,z)�legionella(z))
Patient_T [ID_PATIENT, NAME,SSN] Diagnosis_T[ID_PATIENT, DISORDER]
PaSent/1Diagnosis/2Legionella/1
Mapping:databasequery(X)⤳conjuncFonwithfreevariablesX
PaSent(P)Diagnosis(P,M)Legionella(M)
PaSent_T Diagnosis_T
......
id ssn disidname
⤳ P....
..
..
..
..
..
..
«Leg.»....
P....
M.-L.Mugnier–UNILOGSchool– 2018 25
Knowledgebase
Query
Ontology
Factbase
ONTOLOGY-MEDIATEDQUERYANSWERING(OMQA)
(Boolean)conjuncSvequeryq
TheoryOinasuitableFOLfragment
Setofgroundatoms(orexistenSallyclosedformula)F
Fundamental decision problem
O, F � q ?
M.-L.Mugnier–UNILOGSchool– 2018 26
OVERVIEWOFTHELECTURE
Part1:Basics
Part2:KRformalismsandalgorithmicapproaches
Outline of description logics – Horn DLs Existential Rules Materialization approach (forward chaining) Query rewriting approach (related to backward chaining)
Part3:DecidabilityissuesintheexistenFalruleframework
M.-L.Mugnier–UNILOGSchool– 2018 27
DESCRIPTIONLOGICS
¢ AfamilyofKRlanguagesforrepresenSngandreasoningwithontologies
¢ MostlycorrespondtodecidablefragmentsofFOL
(relatedtomodalproposiSonallogic,theguardedfragmentofFOL,...)¢ Variable-freesyntax
¢ Usedtobecalled«conceptlanguages»:fromconceptandrolenames(unaryandbinarypredicates)andasetofconstructorsdefinecomplexconcepts(morerecently:complexroles)
¢ Anontologyisasetofaxiomsthatstateinclusionsbetweenconcepts (andbetweenroles)
M.-L.Mugnier–UNILOGSchool– 2018 28
DESCRIPTIONLOGICS:BUILDINGBLOCKS(SYNTAX)Vocabulary
Atomicconcepts: Human,Parent,Student…(unarypredicates)Atomicroles: parentOf,siblingOf,… (binarypredicates)
Complexconceptsandrolescanbebuiltusingasetofconstructors(whichdependsoneachparFcularDL)
conjuncSon(П),disjuncSon(�),negaSon(¬)HumanП¬Parent Female�Male
restrictedformsofexistenSalanduniversalquanSficaSon(�,�)
∃parentOf.(FemaleПStudent) �parentOf.Male inverseofarole(-),composiSonofroles(o) ∃parentOf- parentOfoparentOf
M.-L.Mugnier–UNILOGSchool– 2018 29
DESCRIPTIONLOGICS:BUILDINGBLOCKS(SEMANTICS)
ToeachconceptisassignedaFOLsentencewithfreevariablex Human Human(x)
HumanП¬Parent Human(x)�¬Parent(x)∃parentOf.(FemaleПStudent) ∃y(parentOf(x,y)�Female(y)
�Student(y))�parentOf.Female �y(parentOf(x,y)àFemale(y))ToeachroleisassignedaFOLsentencewith2freevariablesxandy parentOfoparentOf ∃z(parentOf(x,z)�parentOf(z,y))
M.-L.Mugnier–UNILOGSchool– 2018 30
DESCRIPTIONLOGICS:KNOWLEDGEBASE
KnowledgeBase=TBox(ontology)+ABox(factbase)Tbox:axiomsoftheformC1�C2∀x(fol(C1)àfol(C2))
orr1�r2 ∀x∀y(fol(r1)àfol(r2))
Abox:setofgroundfacts parentOf(A,B),Female(A),…
Human�Male�Female ∀x(Human(x)àMale(x)�Female(x))Adult�¬Child ∀x(Adult(x)∧Child(x)à⊥)Parent��parentOf ∀x(Parent(x)à�yparentOf(x,y))HappyFather��parentOf.Female ∀x(HP(x)à(�y(parentOf(x,y)àFemale(y))Human��parentOf-.Human ∀x(Human(x)à�y(parentOf(y,x)∧Human(y)))parentOfoparentOf�ancestorOf ∀x∀y(�z(parentOf(x,z)∧parentOf(z,y))à
ancestorOf(x,y)
M.-L.Mugnier–UNILOGSchool– 2018 31
DESCRIPTIONLOGICS:STANDARDREASONINGTASKSStandardreasoningtasksonaKB(T,A)
¢ ConceptsubsumpSon T�C�D?¢ ConceptsaSsfiability isCsaSsfiablew.r.t.T ?¢ KBsaSsfiability is(T,A)saSsfiable?
¢ Instancechecking (T,A)�C(b),wherebisaconstant?
ConceptsubsumpSon T�C�Diff(T, {C(a),¬D(a)})unsaSsfiableConceptsaSsfiability CsaSsfiablew.r.t.T iff(T, {C(a)})saSsfiable
Instancechecking (T,A)�C(b)iff(T,A�{¬C(b)})unsaSsfiable
AllthesetaskscanbeexpressedintermsofKB(un)saSsfiabilityprovidedthattheconstructorsintheconsideredDLallowforit
Queryansweringbeyondinstancechecking?cannotbereducedtothestandardreasoningtasks
M.-L.Mugnier–UNILOGSchool– 2018 32
EVOLUTIONOFDLS
¢ Concepts:
¢ TBoxaxioms:onlyconceptinclusions
SaSsfiabilityandinstancecheckinginALCare:EXPTIME-completeincombinedcomplexitycoNP-completeindatacomplexity
Evenworseifweaddinverseroles:2EXPTIME-completeincombinedcomplexity
StandardexpressiveDL ALC
M.-L.Mugnier–UNILOGSchool– 2018 33
TWOCOMPLEXITYMEASURESFORQUERYANSWERINGPROBLEMS
Combined complexity (usual complexity measure) The input is O, F and q
Data complexity
The input is F (O and q supposed to be fixed)
This distinction comes from database theory: the size of the query is negligible compared to the size of the data
Problem: Given a KB = (O, F), with O the ontology and F the factbase, and a query q, is q entailed by the KB?
E.g.,qBooleanCQ,FfactbaseDoesF�q?
NP-complete(combined)PTime(data)
M.-L.Mugnier–UNILOGSchool– 2018 34
EVOLUTIONOFDLS
¢ Concepts:
¢ TBoxaxioms:onlyconceptinclusions
SaSsfiabilityandinstancecheckinginALCare:EXPTIME-completeincombinedcomplexitycoNP-completeindatacomplexity
Evenworseifweaddinverseroles:2EXPTIME-completeincombinedcomplexity
TwofactorsledtotheevoluFonofdescripFonlogics:1. pracScaluse(e.g.SNOMEDCT):peoplemostlyuseconjuncSonandexistenSal
quanSficaSon2. complexitytoohighforqueryansweringproblems
StandardexpressiveDL ALC
M.-L.Mugnier–UNILOGSchool– 2018 35
NEWDLSWITHLOWERCOMPLEXITY
DL-LiteR EL
Commonfeature:nodisjuncSon(no«true»negaSon) ThenasaSsfiableKBhasauniquecanonicalmodelM:
ForanyBooleanCQq,KB�qiffMisamodelofqReasoningtechniquesfortheselighterDLsareverysimilartoforwardorbackwardchaininginrule-basesystems
where
where LargeTBoxesClassificaSon
LargeABoxesQueryanswering
M.-L.Mugnier–UNILOGSchool– 2018 36
COMPLEXITYINTRODUCEDBYDISJUNCTIONORNEGATION
q(): �x�y (Blue(x) � on(x,y) � Other(y))
CBA
T: T � Blue � Other A: Blue(A), Other(C), on(A,B), on(B,C)
To answer q, we have to consider two cases: in each model of the KB, either Blue(B) or Other(B) holds
Similarly if we replace T by: ¬Blue � Other (equivalentaxiom)
KB (T,A)
Note that Other � ¬Blue is harmless: it is just a disjointness constraint
M.-L.Mugnier–UNILOGSchool– 2018 37
INSUMMARY
DL ontology (TBox) has axioms of the form
∀x (fol(C1) à fol(C2))
∀x∀y (fol(r1) à fol(r2)) wherefol(r)isapathofatomicrolesortheirinverses
With the new DLs: left and right parts of the implication are both existentially quantified conjunctions of atoms
called « Horn description logics »
DLs essentially satisfy the tree model property:
if a KB is satisfiable then it has a « tree-shaped » model
M.-L.Mugnier–UNILOGSchool– 2018 38
WHY«HORNDLS»ONANEXAMPLE
ELAxiom
Letusskolemize(uandvresp.replacedbyf1(x)andf2(x)):weobtainasetof3Hornclauses(withskolemterms)
HencethenameHorndescripFonlogics
FOLtransla4on
prenexform
M.-L.Mugnier–UNILOGSchool– 2018 39
X,Y,Z:setsofvariables
∀x(actor(x)à∃zplay(x,z))
EXISTENTIALRULES
∀X∀Y(Body[X,Y]à∃ZHead[X,Z])
any positive conjunction (without functional symbols except constants)
Key point: ability to assert the existence of unknown entities
Crucial for representing ontological knowledge in open domains
See « value invention » in databases
∀x ∀y ( siblingOf(x,y) à ∃ z (parentOf(z,x) ∧ parentOf(z,y)) )
we often simplify by omitting universal quantifiers
M.-L.Mugnier–UNILOGSchool– 2018 40
DATA/FACTS
m1m2?x
RelaSonaldatabase RDF
rdf:type
am1am2c?x
ex:actor
AbstracFoninfirst-orderlogic(FOL)
∃x(movie(m1)∧movie(m2)∧movie(x)actor(a)∧actor(b)∧actor(c)play(a,m1)∧play(a,m2)∧play(c,x))
Etc.
ex:play Movie PlayActor
abc
ex:b
rdf:type
rdf:type
ex:movie
rdf:type ex:a ex:m1
ex:c
rdf:type
ex:m2
ex:play
WegeneralizeheretheclassicalnoSonofafactbyallowingexistenSalvariablesfact/factbase=existenFallyclosedconjuncFonofatoms
rdf:type
......
...
...
m_id a_id m_id a_id
_:x
M.-L.Mugnier–UNILOGSchool– 2018 41
LABELLEDHYPERGRAPH/GRAPHREPRESENTATION¢ Afactorasetoffactscanbeseenasasetofatoms
movie(m1),movie(m2),movie(x),actor(a),actor(b),actor(c),play(a,m1),play(a,m2),play(c,x)àhenceahypergraphoritsassociatedbiparFte(mulF-)graph
pvie
12
34
p(x,y,a,x),r(x,y)
• one(labelled)nodeperterm• one(labelled)nodeperatom(~hyperedge)• totallyorderededges
1
2
a
r
x
y
M.-L.Mugnier–UNILOGSchool– 2018 42
movievie
m1 m2
a b c
movievie
actorvie
playvie
movievie
1 1 1
1 11
2 2 2
1 1 1
Ifpredicatesareatmostbinary:atomnodescanbereplacedbylabelsanddirectededges
a
m1 m2
b c
play
actor actoractor
movie movie movie
actor
actor
play
play
movie(m1),movie(m2),movie(x),actor(a),actor(b),actor(c),play(a,m1),play(a,m2),play(c,x)
M.-L.Mugnier–UNILOGSchool– 2018 43
GRAPHHOMOMORPHISMS(1)
• LetG1=(V1,E1)toG2=(V2,E2)beclassicalgraphs.HomomorphismhfromG1toG2: mappingfromV1toV2s.t.
foreveryedge(u,v)inE1,(h(u),h(v))isinE2
mapsto mapsto
M.-L.Mugnier–UNILOGSchool– 2018 44
a
m1 m2
b c play
actor actoractor
movie movie movie
GRAPHHOMOMORPHISMS(2)
• LetG1=(V1,E1)toG2=(V2,E2)beclassicalgraphs.HomomorphismhfromG1toG2: mappingfromV1toV2s.t.
foreveryedge(u,v)inE1,(h(u),h(v))isinE2
• Iftherearelabels:theyhavetobe``kept’’aswell
play
movie
q F
M.-L.Mugnier–UNILOGSchool– 2018 45
GRAPHHOMOMORPHISMS(3)
• LetG1=(V1,E1)toG2=(V2,E2)beclassicalgraphs.HomomorphismhfromG1toG2: mappingfromV1toV2s.t.
foreveryedge(u,v)inE1,(h(u),h(v))isinE2
• Iftherearelabels:theyhavetobe``kept’’aswell
play 1
2
1 movie
qF
movievie
m1 m2
a b c
movievie
actorvie
playvie
movievie
1 1 1
1 11
2 2 2
1 1 1
actor
actor
play
play
M.-L.Mugnier–UNILOGSchool– 2018 46
∀x ∀y ( siblingOf(x,y) à ∃ z (parentOf(z,x) ∧ parentOf(z,y)) )
graph graph
P
P 2
S
1
2
2
1
1
GRAPHVIEWOFEXISTENTIALRULES
∀X∀Y(Body[X,Y]à∃ZHead[X,Z])
s
p
p
Theruleheadhas2kindsofvariables:-fronFer:sharedwiththebody-existenFal(new``blank’’nodes)
x x
y
y
zz
M.-L.Mugnier–UNILOGSchool– 2018 47
F = siblingOf(a,b)
R = ∀x ∀y (siblingOf(x,y) à ∃ z (parentOf(z,x) ∧ parentOf(z,y)))
F�=∃z0(siblingOf(a,b)∧parentOf(z0,a)∧parentOf(z0,b))
s
p
p
a
b
s
RisapplicabletoFifthereisahomomorphismhfrombody(R)toF
xàayàb
ApplyingRtoFw.r.t.hproducesF∪h(head(R))whereanewvariableiscreatedforeachexistenSalvariableinR
a
b
s
p
p
GENERATIONOFFRESH(UNKNOWN)INDIVIDUALS
R F
M.-L.Mugnier–UNILOGSchool– 2018 48
EXISTENTIALRULEFRAMEWORK(LOGICAL/GRAPHICAL)
q(x)=∃y(movie(y)∧play(x,y))
Data/Facts
« Pure » existential rules
Equality rules
Negative Constraints
Conjunctive Queries
∀x ( actor(x) à ∃ z (movie(z) ∧ play(x,z)) )
∀x ∀y ∀z ( movie(y) ∧ director(x,y) ∧ director(z,y) à x = z )
∀x ( movie(x) ∧ person(x) à ⊥ )
movie(m1) play(a,m1) play(c, x) ...
M.-L.Mugnier–UNILOGSchool– 2018 49
MULTIPLETHEORETICALFOUNDATIONS
Datalog(70-80s)
ConceptualGraphs[Sowa1984]
logical translation
+ « value invention »
��-rules,existenFalRules[Baget+IJCAI2009]
Datalog+/- [Cali+PODS2009]
[CheinMugnier1992,2009]
• Samelogicalformas«Tuple-GeneraSngDependencies»(TGDs)longstudiedinrelaSonaldatabases
LightweightDescripSonLogics,e.g.OWL2tractableprofilesMoregenerally,HornDLs
+ «unrestricted cycles » on variables + unbounded arity
M.-L.Mugnier–UNILOGSchool– 2018 50
u TheFOLtranslaSonofHornDLsyieldsexistenSalrulesu ExistenSalrulesarestrictlymoreexpressive:
siblingOf(x,y)à∃z(parentOf(z,x)∧parentOf(z,y))cannotbeexpressedinmostDLsbecauseofthe«cycleonvariables»(needsrolecomposi4on: s�pop)
u Theunboundedpredicatearityallowsformoreflexibility:àdirecttranslaSonofdatabaserelaSonsàaddingcontextualinformaSoniseasy(provenance,trust,etc.)
EXISTENTIALRULESAREMOREEXPRESSIVETHANHORN-DLS
s p
p
x
y
z
Morecomplexinterac4onsbetweenvariablescannotbeexpressedatallinDLs
Unsurprisingly,thisaddedexpressivityhasacost
M.-L.Mugnier–UNILOGSchool– 2018 51
¢ Fundamentaldecisionproblem
Input:K=(F,R)knowledgebase qBooleanconjuncSvequeryQuesSon:isqentailedbyK ?¢ Thisproblemisnotdecidable
f.i.[BeeriVardiICALP1981]onTGDsevenwithasinglerule[Baget&al.KR2010]à find«decidable»classesofrules
withgoodexpressivity/tractabilitytradeoff
EXISTENTIALRULEFRAMEWORK
Data/Facts
« Pure » existential rules
Equality rules
Negative Constraints
Conjunctive Queries
M.-L.Mugnier–UNILOGSchool– 2018 52
atomic body
frontier-1
weakly- guarded
weakly frontier-guarded
datalog
guarded
weakly- acyclic
acyclic Graph of Rule Dependencies
wa-GRD jointly- acyclic
frontier- guarded
jointly-fg
sticky-join
w-sticky-join
sticky
w-sticky
1970s
2003 2004
Since 2008
(PARTIAL)MAPOFDECIDABLECLASSES
DL-LiteR EL
M.-L.Mugnier–UNILOGSchool– 2018 53
FUNDAMENTALNOTIONSFORREASONINGINFOL(�,∧)
¢ BacktotheposiFveconjuncFveexistenFalfragmentofFOL:FOL(�,∧)
¢ Allowstoexpressfactsand(Boolean)conjuncFvequeries¢ Suchformulascanbeseenassetsofatoms,labelledgraphs,relaSonal
structures,...¢ HomomorphismisafundamentalnoSoninthisfragment:
� AninterpretaSonI isamodelofasentencefiffthereisahomomorphism
fromftoI
� OnecandefinehomomorphismsbetweeninterpretaSons.Then:IfI1mapstoI2then,foranyf,I1modeloff�I2modeloff
� Toaformulaf,weassignitsisomorphicmodelM(f)(akacanonicalmodel)
M.-L.Mugnier–UNILOGSchool– 2018 54
MODELISOMORPHICTOAFOL(�,∧)FORMULA
ToaformulafinFOL(�,∧),weassignitsisomorphicmodelM(f) alsocalledcanonicalmodel
f=�x�y�z(p(x,y)∧p(y,z)∧r(x,z,a))
M(f): D={dx,dy,dz,a} pM(f)={(dx,dy),(dy,dz)} rM(f)={(dx,dz,da)}
ThecanonicalmodelM(f)isuniversal:forallM’modeloff,M(f)mapstoM’
foranyfandginFOL(�,∧),g�fiffM(g)isamodeloffifffmapstog
M.-L.Mugnier–UNILOGSchool– 2018 55
ADDINGRANGE-RESTRICTED(=DATALOG)RULESTOFACTS
K = (F, R) whereRisasetofrange-restrictedrules(i.e.,var(head)var(body))
Fisafactbase(ruleswithanemptybody):groundatomsByapplyingrulesfromRstarSngfromF,auniqueresultisobtained:thesaturaFonofF(denotedbyF*)F*isfinitesincenonewvariableiscreatedF*isacore(noredundancies)
F
q
R
F*
TheniceproperSesofFOL(�,∧)arekept:
F*isauniversalmodelofK Hence:foranyCQq,K ⊨ q iff q maps to F*
M.-L.Mugnier–UNILOGSchool– 2018 56
KNOWLEDGEBASESWITHEXISTENTIALRULES
K = (F, R) whereRisasetofexistenFalrulesFisafactbase(ruleswithanemptybody):existenSalconjuncSonsofatoms
Mainchange:F*canbeinfinite
R=person(x)à�yhasParent(x,y)∧person(y)
∧person(y0)∧hasParent(a,y0)
∧person(y1)∧hasParent(y0,y1) Etc.
F=person(a)
butitremainsauniversalmodelhence K ⊨ q iff q mapstoF*
M.-L.Mugnier–UNILOGSchool– 2018 57
F
q
R
« bottom-up » « chase » (TGDs)
APPROACH1TORULES:FORWARDCHAINING/MATERIALISATION
K= (F, R)
K ⊨ q iff q maps by homomorphism to F*
Pros: materialisation offline, then online query answering is fast Cons: volume of the materialisation
needs writing access rights to the data not feasible if data is distributed among several databases not adapted if data change frequently
F*
M.-L.Mugnier–UNILOGSchool– 2018 58
EXAMPLE(MATERIALIZATION)
x = a y = m1 x = a y = m2 x = b y = z0 x = c y = x0
movie(m1)movie(m2)movie(x0)movieActor(a)movieActor(b)play(a,m1)play(a,m2)play(c,x0)
∀x(movieActor(x)à∃z(movie(z)∧play(x,z)))
q(x) = ∃ y (movie(y) ∧ play(x, y)) « find those who play in a movie »
SaturaFon
movie(z0) play(b,z0)
M.-L.Mugnier–UNILOGSchool– 2018 59
« top-down » decomposition into 2 steps [DL-Lite]
APPROACH2TORULES:BACKWARDCHAINING/QUERYREWRITING
R
Rewriting into a set of CQs, seen as a union of conjunctive queries (UCQ) and more generally into a « first-order » query (core SQL query)
F
q K= (F, R)
Query rewriting is independant from any factbase. For any F, F,R�qiffF�Q(i.e.,ifQisaUCQ:thereisqi�QwithF�qi)
Q
Pros: independent from the data Cons: rewriting done at query time, easily leads to huge and unusual queries
M.-L.Mugnier–UNILOGSchool– 2018 60
EXAMPLE
Rewq(x) = ∃ y (movie(y) ∧ play(x, y)) � movieActor(x) Query rewriting
x = a y = m1 x = a y = m2 x = c y = x0
x = a x = b
movie(m1)movie(m2)movie(x0)movieActor(a)movieActor(b)play(a,m1)play(a,m2)play(c,x0)
∀x(movieActor(x)à∃z(movie(z)∧play(x,z)))
q(x) = ∃ y (movie(y) ∧ play(x, y)) « find those who play in a movie »
M.-L.Mugnier–UNILOGSchool– 2018 61
BACKWARD CHAINING SCHEME!
UnificaSonbyaunifieru(ofq’andh’)
Body Headq’
QueryrewriSng
Body
Basicstep:Queryq RuleR
Newquery
h’
Direct rewriting of q with R and u = u(q \ q’) � u(body(R))
M.-L.Mugnier–UNILOGSchool– 2018 62
BASICPROPERTIES(1)
LetF2beobtainedfromF1bytheapplicaSonofRuleRLetaqueryQ1thatmapstoF2byahomomorphismthatusesatleastoneatom
broughtbyRThenthereisQ2,adirectrewriSngofQ1withR,suchthatQ2mapstoF1
F1 F2application of R
Q1
h1
Q2
h2
direct rewriting with R
and h1usesF2\F1
The reciprocal property holds
M.-L.Mugnier–UNILOGSchool– 2018 63
BASICPROPERTIES(2)
F1 F2application of R
Q1
h1
Q2
h2
direct rewriting with R
LetQ2beadirectrewriSngofQ1withRuleRLetF1beafactbasesuchthatQ2mapstoF2ThenthereisanapplicaSonofRtoF1thatproducesF2suchthatQ2mapstoF1
M.-L.Mugnier–UNILOGSchool– 2018 64
ForanyconjuncSvequeryq,foranyfactbaseF,foranysetofrules:thereisahomomorphismfromqtoF’,whereF’isobtainedfromFbyaruleapplicaSonsequenceoflength≤niffthereisahomomorphismfromq’toF,whereq’isobtainedfromqbyarewriSngsequenceoflength≤n
EQUIVALENCEDERIVATION/REWRITINGSEQUENCES
F1 F2
Q1
h1
Q2
h2
F1 F2
Q1
h1
Q2
h2
s.t. h1 usesF2\F1
M.-L.Mugnier–UNILOGSchool– 2018 65
TAKINGINTOACCOUNTEXISTENTIALVARIABLESINRULEHEADS(1)
¢ WewantacompletesetofsoundrewriSngs(setofCQs):qis.t.foranyF,ifF⊨qithenF,R⊨q
R=person(x)à�yhasParent(x,y)q=hasParent(v,w),denSst(w)u={x�v,y�w}rew(q,R,u)=qi=person(v),denSst(w)
qiisunsound:F=person(Maria),denSst(Giorgos)F⊨qihowever(F,{R})doesnotentailq
(1)IfwinqisunifiedwithanexistenSalvariableofR,thenallatomsinwhichwoccurmustbepartoftheunificaSon
M.-L.Mugnier–UNILOGSchool– 2018 66
TAKINGINTOACCOUNTEXISTENTIALVARIABLESINRULEHEADS(2)
R=p(x)à�z1�z2r(x,z1),r(x,z2),s(z1,z2)q=r(v,w),s(w,w)u={x�v,z1�w,z2�w}rew(q,R,u)=qi=p(v)(2)AnexistenSalvariableofRcannotbeunifiedwithanotherterminhead(R)
qiisunsound:F=p(a)F⊨ qi however(F,{R})doesnotentailq
M.-L.Mugnier–UNILOGSchool– 2018 67
PIECE-UNIFIER(FORBOOLEANCQS)
Apiece-unifieruofq’�qandh’head(R)isasubsStuSonofvar(q’+h’)byterms(q’+h’)[ifxisunchanged,wewriteu(x)=x]suchthat:¢ u(q’)=u(h’)
¢ existenSalvariablesofh’areunifiedonlywithvariablesofq’thatdonotoccur in(q\q’)(i.e.,ifxisexistenSalandu(x)=u(t),thentisavariableofq’andnotof(q\q’))
Queryq RuleR
q’ body head
h’
variablessharedbyq’and(q\q’)
ToextendthenoSontogeneralCQs:universalvariablescannotbeunifiedwithanswervariables
M.-L.Mugnier–UNILOGSchool– 2018 68
EXAMPLE
R=twin(x,y)à�zmotherOf(z,x)∧motherOf(z,y)
q=motherOf(v,w)∧motherOf(v,t)∧Female(w)∧Male(t)?
R=twin(x,y)à�zmotherOf(z,x)∧motherOf(z,y)
q=motherOf(v,w)∧motherOf(v,t)∧Female(w)∧Male(t)?
piece-unifieru1 = {z � v,x� w, y � w}
rewrite(q,R,u1)=motherOf(v,t) ∧Female(w)∧Male(w) ∧ twin(w,w)
R=twin(x,y)à�zmotherOf(z,x)∧motherOf(z,y)
q=motherOf(v,w)∧motherOf(v,t)∧Female(w)∧Male(t)?
piece-unifieru2 = {z � v,x � w,y � t}
rewrite(q,R,u2)=twin(w,t)∧Female(w)∧Male(t)
Ifwerewriteagainthisquerywecouldremovethefirstatom
M.-L.Mugnier–UNILOGSchool– 2018 69
WHATIFWESKOLEMIZEDRULES?
qiisunsound:F=person(Maria),denSst(Giorgos)F⊨qihowever(F,{R})doesnotentailq
R=person(x)à�yhasParent(x,y)q=hasParent(v,w),denSst(w)u = { x � v, y � w } rew(q,R,u)=qi=person(v),denSst(w)
Skolem(R)=person(x)àhasParent(x,f(x))
ClassicalmostgeneralunifierofhasParent(x,f(x))andhasParent(v,w):v�xandw�f(x)rew(q,R,u)=denSst(f(x))�person(x) whichcannotbeunifiedwitharulehead
(wouldnotbekeptintheouputsinceitcontainsaskolemfuncSon
Wecouldskolemizetherulesandrelyonusualm.g.u.thenkeeponlyrewriSngswithoutskolemfuncSon
butthiswouldcreateuselessintermediaterewriSngs
M.-L.Mugnier–UNILOGSchool– 2018 70
WHY«PIECES»?
Apieceisaunitofknowledgebroughtbyarule:¢ FronFervariables(andconstants)actascutpointstodecomposeruleheads
intopieces(«minimalnon-emptysubsetsgluedbyexistenSalvariables»)
R=b(x)à�y�zp(x,y)∧p(y,z)∧p(z,x)∧q(x,x) x y
z
¢ Arulewithkpiecescanbedecomposedintokrules,oneforeachpiece,whilekeepingthesamebody
¢ Itcannotbefurtherdecomposed(exceptbyintroducingnewpredicates)
b(x)à�y�z p(x,y)∧p(y,z)∧p(z,x)
b(x)àq(x,x)
M.-L.Mugnier–UNILOGSchool– 2018 71
DECOMPOSITIONOFRULESINTOATOMICHEADRULES(1)
R:b(x)à�y�z p(x,y)∧p(y,z)∧p(z,x) rule with single-piece head
Decomposition into rules with atomic head by introducing a fresh predicate
R0:b(x)à�y�z pR(x,y,z)R1:pR(x,y,z)àp(x,y)R2:pR(x,y,z)àp(y,z)R3:pR(x,y,z)àp(z,x)
We lose the structure of the head • much less efficient query rewriting
• may even lead to lose the property of having a finite universal model
(if the set of rules has this property)
M.-L.Mugnier–UNILOGSchool– 2018 72
DECOMPOSITIONOFRULESINTOATOMICHEADRULES(2)
R:p(x,y)→�zp(y,z),p(z,y)
F:p(a,b)
a b z0 ...F1
F2
z1
z2
F2≡ F1 (F2 mapstoF1 ) henceF*≡ F1
Finiteuniversalmodel
AÅerdecomposiSonintoatomicheadrules:
R0:p(x,y)→�zpR(y,z)R1:pR(y,z)àp(y,z)R2:pR(y,z)àp(z,y)
a b z0 ...F1
F2
z1
z2
F2F1
Nofiniteuniversalmodel
M.-L.Mugnier–UNILOGSchool– 2018 73
OVERVIEWOFTHELECTURE
Part1:Basics
Part2:KRformalismsandalgorithmicapproaches
Part3:DecidabilityissuesintheexistenFalruleframework
UndecidabilityofthefundamentalproblemGenericproperSesthatensuredecidability
Main«concrete»decidableclassesofexistenSalrules
M.-L.Mugnier–UNILOGSchool– 2018 74
However,here:queryrewriSngwithRisfiniteforanyq
SATURATIONMAYNOTHALT
R=person(x)àhasParent(x,y)∧person(y)
∧person(y0)∧hasParent(a,y0)
∧person(y1)∧hasParent(y0,y1)
F=person(a)
NoredundanciesareaddedTheKBhasnofiniteuniversalmodel
M.-L.Mugnier–UNILOGSchool– 2018 75
R=friend(u,v)∧friend(v,w)àfriend(u,w)
q=friend(Giorgos,Maria)
q1=friend(Giorgos,v0)∧friend(v0,Maria)
q2=friend(Giorgos,v1)∧friend(v1,v0)∧friend(v0,Maria)
q2’=friend(Giorgos,v0)∧friend(v0,v1)∧friend(v1,Maria)
q2andq2’areequivalent
Etc.q3=friend(Giorgos,v2)∧friend(v2,v1)∧friend(v1,v0)∧friend(v1,Maria)
QUERYREWRITINGMAYNOTHALT
However,here:saturaSonwithRisfiniteforanyF
There are cases where both processes do not halt (even if the factbase is known)
Thereisaninfinitenumberofnon-redundantrewriSngs
M.-L.Mugnier–UNILOGSchool– 2018 76
UNDECIDABILITYOFTHEFUNDAMENTALPROBLEM
FundamentaldecisionproblemInput:K=(F,R)knowledgebase,qBooleanconjuncSvequeryQuesSon:isqentailedbyK?
This problem is undecidable (only semi-decidable)
E.g. proof by reduction from the word problem in a semi-Thue system
There is a one-step derivation from a word w to w’ if there is a rule wi à wj in G, and w = w1wiw2, w' = w1wjw2 w’ is derived from w if
there is a (finite) sequence of one-step derivations from w to w’
Input: a set G of rules of the form wi à wj, 2 words w0 and wf
Question: is it possible to derive (exactly) wf from w0 using the rules in G?
M.-L.Mugnier–UNILOGSchool– 2018 77
REDUCTIONFROMTHEWORDPROBLEMFromG,w0andwfwebuildaKB(F, R)andaBooleanCQq
Vocabulary constants:thelekersoccuringinG,w0andwf +twospecialconstantsBandE binarypredicates:succandval
FactbaseF=T(w0,B,E)
SetofrulesRisobtainedbytranslaSngeachrulewiàwjintotheexistenSalrule �x �y (T(wi,x,y)à T(wj,x,y))
Toawordw=a1...anweassignthefollowinggraphT(w,x,y) wheretheziareexistenSalvariablesandx,yarefree
x y
Queryq=T(wf,B,E)
Key:anywordwderivablefromw0withGcorrespondstoapathT(w, B, E) inthesaturaSonofFbyR,andreciprocally
M.-L.Mugnier–UNILOGSchool– 2018 78
finite saturation
bounded treewidth saturation
finite UCQ rewriting
atomic body (linear)
frontier-1
weakly- guarded
weakly frontier-guarded
datalog
guarded
weakly- acyclic
acyclic GRD
wa-GRD jointly- acyclic
frontier- guarded
jointly-fg
sticky-join
w-sticky-j
sticky
weakly-sticky
(PARTIAL)MAPOFDECIDABLECASES
DL-Lite
EL
M.-L.Mugnier–UNILOGSchool– 2018 79
GENERICPROPERTIESTHATENSUREDECIDABILITY
ThreegenerickindsofproperSesensuringdecidability:- SaturaSonbyForwardChaininghaltsforanyfactbase
(«finiteexpansionset»,fes)
- QueryrewriSnghaltsforanyconjuncSvequery(«finiteunificaSonset»,fus,orUCQ-rewritability)
- SaturaSonbyForwardChainingmaynothaltbutforanyfactbasethegeneratedfactshaveatree-likestructure(«boundedtreewidthset»,bts)
NoneoftheseproperSesisrecognizable[Baget+KR10]buttheseproperSesprovidegenericalgorithmicschemes
M.-L.Mugnier–UNILOGSchool– 2018 80
No existential variables
MainClasseswithFiniteSaturaFon(fes)
Datalog
Acyclic position dependency graph
Weak-acyclicity
Acyclic Graph of Rule Dependencies Acyclic
GRD
Joint-acyclicity
Acyclic existential dependency graph
GRD with fes strongly connected components
fes-GRD
[BagetKR�04]
[BagetKR�04]
[Deutsch+ICDT�03][Fagin+ICDT03]
[Krötzsch+IJCAI�11]
Position dependency graph: nodes are positions in predicates edges show how existential variables are propagated
Graph of rule dependencies: nodes are rules edges express that a rule may lead to trigger a rule
M.-L. Mugnier – UNILOG School – 2018 81
WEAK-ACYCLICITY
R1:p(x)→�yr(x,y)�q(y)R2:r(x,y)→p(x)
PosiFondependencygraphnodes:posiSons(p,i)inpredicatesedges:foreachfronServariablexinposiSon(p,i)inarulebody -anedgefrom(p,i)toeachposiSon(q,j)ofxintherulehead
-aspecialedgefrom(p,i)toeachposiSonofanexistenSalintheruleheadRisweakly-acyclicifitsposiSongraphcontainsnocircuitwithaspecialedge(*)
weaklyacyclic
notweaklyacyclicspecialedge(p,1)à(r,1)duetoR1edge(r,1)à(p,1)duetoR2
R1:p(x)→�y�zr(x,y)�r(y,z)�r(z,x)R2:r(x,y)�r(y,x)→p(x)
M.-L.Mugnier–UNILOGSchool– 2018 82
ACYCLICGRAPHOFRULEDEPENDENCYGraphofRuleDependenciesnodes:therulesedges:anedgefromRitoRjifanapplicaSonofRimayleadtotriggeranew
applicaSonofRj(«RjdependsonRi»)DependencycanbeeffecSvelycomputedbycheckingifthereisapiece-unifierof
body(Rj)andhead(Ri)
Theseexamplesshowthatweak-acyclicityandacyclicGRDareincomparablecriteriaCommongeneralizaSonsofthesetwonoSonshavebeendefined
CyclicGRDsinceR1andR2dependoneachother
R1:p(x)→�yr(x,y)�q(y)R2:r(x,y)→p(x)
R1:p(x)→�y�zr(x,y)�r(y,z)�r(z,x)R2:r(x,y)�r(y,x)→p(x)
M.-L.Mugnier–UNILOGSchool– 2018 83
E.g. inclusion dependencies, necessary properties of concepts / relations
MainClasseswithFiniteQueryRewriFng(fus)
Atomic- body Sticky Domain-
restricted
Sticky-join Each head atom contains
all or none of the body variables
E.g. concept product Elephant(x) ∧ Mouse(y) à bigger-than(x,y)
[Cali+PVLDB2010]
[Baget+IJCAI�09][Baget+IJCAI�09]
[Cali+RR�10]
= linear Datalog+ [Cali+PODS2009]
Body restricted to a single atom
Restricts multiple occurrences of body variables that do not occur in all head atoms
Each head atom contains all the body variables
E.g. Human(x) àparentOf(y,x) ∧Human(y) is atomic-body, sticky and domain-restricted
M.-L. Mugnier – UNILOG School – 2018 84
WidthofatreedecomposiSon=maxnumberofnodesinabag(minus1)Treewidthofagraph=minwidthoveralldecomposiSontreesofthisgraph
r
a
p(a,b)q(b,z0)r(a,b,t0)p(b,t0)q(t0,z1)r(b,t0,t1)p(t0,t1)
DecomposiFontree1)eachnode(term)appearsinabag2)eachhyperedge(atom)hasallitsnodesinabag3)foreachnodex,thesubgraphinducedbythebagscontainingxisconnected
b
r
t0
z0 ab
node
hyperedge
DecomposiFonTree/Treewidth
p p p
q q
z1
t1bz0 abt0
t0z1 bt0t1
p(a,b)
q(b,z0)
r(a,b,t0) p(b,t0)
r(b,t0,t1) p(t0,t1)
q(t0,z1)
M.-L. Mugnier – UNILOG School – 2018 85
The decidability proof does not provide a halting algorithm (relies on the bounded treewidth model property [Courcelle90])
R is bts if the forward chaining with R generates facts with bounded treewidth: i.e., for any factbase F, there is an integer b s.t. any factbase R -derived from F has treewidth bounded by b
F
BoundedTreewidthoftheDerivedFacts(bts)
EssenSally[CaliGoklobKiferKR’08]
fes (finite saturation) is included in bts (bound given by the number of terms in the finite saturation)
M.-L. Mugnier – UNILOG School – 2018 86
An atom in the body guards all the body variables
Guard only the frontier
Guard only affected variables (i.e.possibly mapped to new existentials)
Guard only affected variables from the frontier
Frontier: variables shared by the body and the head
SomeRecognizablebts(andnotfes)ClassesofRules
guarded [Cali+KR’08]
weakly guarded
[Cali+KR�08]The frontier has size 1
[Baget+KR�10]
frontier guarded
[Baget+KR�10]
weakly frontier guarded
[Baget+IJCAI�09]
frontier 1
r(x,y) ∧ r(y,z) ∧ s(x,y,z) à �u r(y,u) ∧ r(z,u) r(x,y) ∧ r(y,z) ∧ r(x,z) à �u r(z,u)
r(x,y) ∧ r(y,z) à r(y,u) ∧ r(z,u)
datalog
These classes are moreover « greedy bts » => a halting algorithm [Baget+IJCAI�11]M.-L. Mugnier – UNILOG School – 2018 87
Greedybts
R1 = p(x,y) à ∃z p(y,z) R2 = p(x,y) ∧ q(x,z) à ∃t r(x,y,t) ∧ p(y,t)
F = p(a,b) ab
bz0 abt0
t0z1 bt0t1
p(a,b)
q(b,z0)r(a,b,t0)p(b,t0)
r(b,t0,t1)p(t0,t1)q(t0,z1)
R1 R2
R1 R2
Greedy construction of a decomposition tree of derived facts
with bounded width
Etc.
M.-L. Mugnier – UNILOG School – 2018 88
The«Greedybts»Property[Baget+ IJCAI�11]
T0 T0 = terms(F) + {constants} F
T0 ∪ var(h(H))
B
H
h
Derived facts Decomposition tree
All bags contain T0 F
h(H)
Foranyfactbase,foreachruleapplicaSon,fronServariablesnotbeingmappedtoiniSaltermsarejointlymappedtovariablesoccurringinatomsaddedbyasinglepreviousruleapplicaSon
M.-L. Mugnier – UNILOG School – 2018 89
MainIdeasoftheAlgorithmforgbts(1)
1. Bagpakern={homomorphismsfrompartofarulebodyto«currentfact»
thatusesometermsofthebag}
• Aruleisapplicabletothecurrentfactbaseiffabagpakerncontainsitsbody• FCcanbeperformedonthedecoratedtree
2. EquivalencerelaSononbags
OnlyonebagperequivalenceclassisdevelopedTheothernodesareblocked
Boundednumberofequivalenceclassesàfinite«fullblockedtree»T*
BuildafinitedecomposiSontreethatencodesapotenSallyinfinitefact
M.-L. Mugnier – UNILOG School – 2018 90
MainIdeasoftheAlgorithmforgbts(2)
[Baget+IJCAI2011]qaddedasarule«qàmatch»
qisentailediffmatchoccursinabagpa=erni.e.,qmapsbyhomomorphismtoatoms(T*)
[Thomazo+KR2012]offline/onlineseparaSon
(1)compilaSon:treeT*builtindependentlyfromanyquery
(2)querying:anyqisentailediffitmapsby*-homomorphismtoT* i.e.qmapsbyhomomorphismtoabounded«development»ofT*
QuerythisfinitedecomposiSontree
M.-L. Mugnier – UNILOG School – 2018 91
DataComplexityofgbtsClasses
guarded
weakly guarded
frontier guarded
weakly frontier guarded
frontier 1
Datalog (fes)
ExpTime-c
PTime-c
Previousalgorithmisworst-caseopSmalongbtsfordata/combinedcomplexity.
CanbespecializedtobeopSmalonthesegbtssubclassesM.-L. Mugnier – UNILOG School – 2018 92
FES GBTS
FUS
linear
frontier-1
weakly- guarded
weakly frontier-guarded
Datalog
guarded weakly- acyclic
aGRD jointly- acyclic frontier-
guarded
jointly-fg
sticky-join
w-sticky-join
sticky
weakly-sticky
DL-Lite
EL
MFA
glut-fg (BTS)
domain- restricted
super-weak- acyclic
M.-L.Mugnier–UNILOGSchool– 2018 93
CONCLUSION
• Reasoningwithontologiesisbecomingcentralinmanydata-centricapplicaSons
• SolidtheoreScalfoundaSonswitharangeofontologicalformalisms
thatoffervarioustradeoffexpressivity/complexity
• Ongoingresearch• Gobeyond(unionsof)conjuncSvequeries,e.g.combinethemwith
navigaSonalquerieslikeregularpathqueries• NewqueryrewriSngtechniquesthattargetmorepowerful
langages,e.g.Datalog• NewqueryansweringtechniquesthatcombinematerialisaSonand
queryrewriSng• StudytheinteracSonoftheontologywithmappings,whichiskey
toefficientqueryansweringoverheterogeneousdata• RepresenSngandreasoningwithtemporalandspaSaldata
• Dealingwithdatainconsistencies ....M.-L.Mugnier–UNILOGSchool– 2018 94
(Small) Bibliography BienvenuM.,LeclèreM.,Mugnier,M.-L.andRousset,M.-C., Reasoning with Ontologies, chapter6,volume1in«AguidedtourofarSficialintelligenceresearch»,Springer,toappear.IntroducFonstoseveralaspectsofontology-mediatedqueryansweringwithdescripFonlogicsorexistenFalrulesintheReasoningWebsummerschoolbooks:inparScular:Bienvenu,M.andOrSz,M.(2015).Ontology-mediatedqueryansweringwithdatatractabledescripSonlogics.11thInternaSonalReasoningWebSummerSchool,volume9203ofLNCS,pages218–307.Springer.Mugnier,M.andThomazo,M.(2014).AnintroducSontoontology-basedqueryansweringwithexistenSalrules.10thInternaSonalReasoningWebSummerSchool,volume8714ofLNCS,pages245–278.Springer.Goklob,G.,Orsi,G.,Pieris,A.,andSimkus,M.(2012).DataloganditsextensionsforsemanScwebdatabases.10thInternaSonalReasoningWebSummerSchool,volume7487ofLNCS,pages54–77.Springer.
ThesesynthesesprovidefurtherreferencesM.-L.Mugnier–UNILOGSchool– 2018 95
APPENDIX:FURTHERDETAILS
¢ FundamentaldefiniSonsandproperSesfortheFOL(�,∧)fragment
¢ Piece-unifiers
M.-L.Mugnier–UNILOGSchool– 2018 96
INTERPRETATIONS/MODELS(1)
¢ VocabularyV=(P,C),where P=finitesetofpredicates C=setofconstants
¢ InterpretaFonI=(DI ,.I)ofV,where DI≠ø (domain) forallcinC,cIinDI
forallpinPwitharityk,pIDIk
¢ Furthermore,uniquenameassumpSon:forallcanddinC,cI≠dI
¢ SimplifyingassumpSon(inlinewiththeuniquenameassumpSon):CDIandforallcinC,cI=c
V = ( {p/2, r/3 }, {a, b} ) I: DI = {a, b, d1} pI = { (b, a), (b, d1), (d1, b) }
rI = { (d1, d1, a) }
¢ Iisamodeloff(builtonV)iffistrueinI
M.-L.Mugnier–UNILOGSchool– 2018 97
INTERPRETATIONS/MODELS(2)
¢ LetfinFOL(�,∧).Iisamodeloffiff thereisamappingvfromterms(f)toDIsuchthat
forallp(e1,...,ek)inf,(v(e1),...,v(ek))inpI
I: DI = {a, b, d1} pI = {(b, a), (b, d1), (d1, b)} rI = {(d1, d1, a)} f = �x�y�z ( p(x, y) ∧ p(y, z) ∧ r(x, z, a) )
v: x↦ d1 y↦ b z↦ d1
¢ InterpretaSonscanbeseenassetsofatoms
(withelementsfromD\Cseenasvariables)
p(b,a), p(b,x1), p(x1,b), r(x1, x1,a)¢ I isamodeloffiffthereisahomomorphismfromftoI
M.-L.Mugnier–UNILOGSchool– 2018 98
HOMOMORPHISMSAGAINANDAGAIN
¢ OnecandefinehomomorphismsbetweeninterpretaFons¢ Wehave:
IfI1mapsI2then,foranyf,I1modeloff�I2modeloff
¢ ToaformulafinFOL(�,∧),weassignitsisomorphicmodelM(f) (alsocalledcanonicalmodel)
f=�x�y�z(p(x,y)∧p(y,z)∧r(x,z,a))
M(f): D={dx,dy,dz,a} pM(f)={(dx,dy),(dy,dz)} rM(f)={(dx,dz,da)}
M.-L.Mugnier–UNILOGSchool– 2018 99
NICESEMANTICPROPERTIESOFFOL(�,∧)
¢ ThecanonicalmodelM(f)isuniversal,i.e.,forallM’modeloff,M(f)mapstoM’
Proof:LetM’modeloff.Then,fmapstoM’.SinceM(f)isomorphictof,M(f)mapstoM’
¢ g⊨f(i.e.,everymodelofgisamodeloff)ifffmapsbyhomomorphismtoM(g)ifffmapsbyhomomorphismtogProof:⇒Assumeg⊨f.InparScularM(g)isamodeloff,hencefmapstoM(g)
⇐AssumefmapstoM(g).SinceM(g)isuniversal:foranyM’modelofg,fmapstoM’,i.e.,M’isamodeloff,henceg⊨f
M.-L.Mugnier–UNILOGSchool– 2018 100
WHY«PIECES»?(CONT’D)–PIECE-UNIFIERS
¢ UnificaSonmust«map»partsofqaccordingto«pieces»thatcanbeprovidedbyaruleapplicaSon
(otherwiseitisunsoundoruseless) body head
body head
specializa4onofthefron4er
homomorphism
ThetermsofqunifiedwiththefronSer(orwithconstants)cutqinto«pieces»⇒ enSre«pieces»ofqmustbemappedtothepiecesoftherule
R
q
M.-L.Mugnier–UNILOGSchool– 2018 102
PIECE-UNIFIER:ALTERNATIVEDEFINITION
Letu1:fronSer(R)àfronSer(R)�constants(u1isaspecializaSonofthefronSerofR)Letu2beahomomorphismfromq’�qtou1(head(R))Cutpoints:termsofq’mappedtou1(fron4er(R))
ortoconstantsThecutpointscutqinto«pieces»u1+u2isapiece-unifierifq’iscomposedofpieces
body head
body head
u1
u2
M.-L.Mugnier–UNILOGSchool– 2018 103