Finding Commonalities: from Description Logics to the Web of Data
Finding Commonalities in Linked Open Data
Silvia Giannini
PhD Student (Supervisor: Prof. Eugenio Di Sciascio)
Dipartimento di Ingegneria Elettrica e dell'Informazione (DEI), Politecnico di Bari, Bari, Italy
in collaboration with Prof. Francesco M. Donini, Ph.D. Simona Colucci
Web&Media Group Meeting | 31 March, 2014
Finding Commonalities: A DLs use case Finding Commonalities: the Web of Data Conclusion
Outline
1 Finding Commonalities: A DLs use case
  The I.M.P.A.K.T. system
  The Core Competence module
2 Finding Commonalities: the Web of Data
3 Conclusion
Silvia Giannini Finding commonalities in Linked Open Data
The I.M.P.A.K.T. system
What is I.M.P.A.K.T.
Information Management and Processing with the Aid of Knowledge-based Technologies
An integrated system managing three enterprise business services based on knowledge management:
1 Skill Matching [1]
2 Team Composition [2]
3 Core Competence Extraction [3]
[1] E. Tinelli, S. Colucci, S. Giannini, E. Di Sciascio, and F.M. Donini, Large scale skill matching through knowledge compilation. In: Proc. of ISMIS 2012, Springer-Verlag (2012) 192-201.
[2] E. Tinelli, S. Colucci, E. Di Sciascio, and F.M. Donini, Knowledge compilation for automated team composition exploiting standard SQL. In: Proc. of SAC 2012, ACM (2012) 1680-1685.
[3] S. Colucci, E. Tinelli, S. Giannini, E. Di Sciascio, and F.M. Donini, Knowledge Compilation for Core Competence Extraction in Organizations. In: Proc. of Business Information Systems 2013, Springer (2013) 163-174.
Skill Matching GUI
Behind I.M.P.A.K.T.
An ontology for the HR domain (nearly 5000 concepts)
T-Box modules:
Employee Profile (M0)
Industry (M1)
ComplementarySkill (M2)
Level (M3)
Knowledge (M4)
Language (M5)
JobTitle (M6)
Main module M0: it models the properties (entry points) needed to import all the sections describing an employee CV.
Possible employee skills and technical tools usage ability.
Specified through:
type - experience role (e.g., developer, administrator)
year - experience level
lastdate - last temporal update of work experience
A Curriculum Vitae representation
A-Box
A profile $P = \sqcap(\exists R^0_j . C)$ is a concept in $\mathcal{ALE}(D)$, where $R^0_j$, $1 \le j \le 6$, is an entry point, and $C$ is a concept in $\mathcal{FL}_0(D)$ modeled in $M_j$.
The Core Competence module
What is a Core Competence
Core Competence: a Knowledge Management process
"Core competencies are a company's collective knowledge about how to coordinate diverse production skills and integrate multiple streams of technologies. Identifying core competencies helps to support competitive advantage, articulate a strategic intent, and allocate resources to build cross-unit technological and production links."
(G. Hamel and C.K. Prahalad, The core competence of the corporation, Harvard Business Review, May-June (1990) 79-90)
Examples:
Apple - design
Netflix - content delivery
Google - expertise in algorithms
. . .
The reasoning service
Objective: Automatically extract Core Competence by identifying a common know-how in a significant portion of personnel (k employees, with k set as a threshold value by the people in charge of the strategic analysis).
Tool:
- Logic-based approach
- Non-standard inference services (LCS, k-CS, BICS)
Method:
- Knowledge-compilation process
- It solves subsumption only via SQL queries against a proper R-DB schema, without any exponential-time inference engine
A logic-based approach
Least Common Subsumer (LCS)
Let C1, ..., Cn be a collection of n concepts in a DL L. The Least Common Subsumer (LCS) of C1, ..., Cn is a concept D in L such that D is the most specific concept subsuming all the elements of the collection.
k-Common Subsumer (k-CS)
Let C1, ..., Cn be a collection of n concepts in a DL L and let k < n. A k-Common Subsumer (k-CS) of C1, ..., Cn is a concept D in L such that D is an LCS of k concepts among C1, ..., Cn.
Informative k-Common Subsumer (IkCS)
Given k < n, an Informative k-Common Subsumer (IkCS) of the concepts C1, ..., Cn in a DL L is a concept D such that D is a k-CS strictly subsumed by LCS(C1, ..., Cn), i.e., one adding informative content to it.
Best Informative Common Subsumer (BICS)
Given k < n, a Best Informative Common Subsumer (BICS) of the concepts C1, ..., Cn in a DL L is a concept B such that B is an IkCS for C1, ..., Cn, and for every k < j ≤ n every j-CS is not informative.
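The definitions above can be tried out on a toy case. The sketch below assumes that every concept is a plain conjunction of atoms already in normal form, so that subsumption reduces to set inclusion and the LCS to set intersection; this is a simplification of the DL setting, and the concept names and helper functions are illustrative only.

```python
from itertools import combinations

def lcs(concepts):
    """LCS of conjunctions of atoms: the atoms shared by every concept."""
    return frozenset.intersection(*map(frozenset, concepts))

def k_common_subsumers(concepts, k):
    """All k-CSs: the LCS of every choice of k concepts in the collection."""
    return {lcs(group) for group in combinations(concepts, k)}

def informative_k_cs(concepts, k):
    """IkCSs: k-CSs that add atoms to the LCS of the whole collection."""
    base = lcs(concepts)
    # More atoms means a more specific concept, i.e. one strictly subsumed by base.
    return {d for d in k_common_subsumers(concepts, k) if d > base}

# Hypothetical concepts, each written as a set of atoms.
c1 = {"ComputerScienceSkill", "OOP", "Java"}
c2 = {"ComputerScienceSkill", "OOP", "C++"}
c3 = {"ComputerScienceSkill", "DBMS"}

overall = lcs([c1, c2, c3])
informative = informative_k_cs([c1, c2, c3], k=2)
print(sorted(overall), [sorted(d) for d in informative])
```

Here only the pair (c1, c2) shares more than the overall LCS, so it yields the single informative 2-CS.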
The Knowledge Compilation process
Issues:
- Computational difficulties of deduction in knowledge bases expressed through a logical formalism;
- Combining the representation power of a logical language with the scalability and efficiency of information processing in a DBMS.
Knowledge Compilation:
1 OFF-LINE REASONING: pre-processing of a company's intellectual capital, described in a Description Logics (DLs) Knowledge Base (KB), into an appropriate relational database schema.
2 ON-LINE REASONING: querying of the data structure produced by the first phase through standard SQL queries for efficient Core Competence Extraction.
CV translation
OFF-LINE REASONING: Relational schema design rules
T-Box informative content
- Table CONCEPT: it stores the CCNF of all the $\mathcal{FL}_0(D)$ concepts (part (a))
- A table is created for each entry point $R^0_j$, $j > 0$ (part (b))
A-Box informative content
- Each atom of CCNF(C) of a conjunct $\exists R^0_j . C$ is stored in a different tuple of table $R_j$ with the same groupID (part (b))
- Table PROFILE includes profileID and extra-ontological structured information (e.g., personal data, work-related information) (part (b))
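A minimal sketch of these design rules, using an in-memory SQLite database. The column layout, the single entry-point table KNOWLEDGE (standing for hasKnowledge), the sample rows and the helper subsumed_profiles are assumptions made for illustration; the actual I.M.P.A.K.T. schema is not given in the slides. Atoms are assumed to be stored already unfolded, so that a skill also lists its more general concepts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROFILE (profileID INTEGER PRIMARY KEY, name TEXT);
-- One table per entry point; only hasKnowledge is modeled here (hypothetical layout).
CREATE TABLE KNOWLEDGE (profileID INTEGER, groupID INTEGER, atom TEXT);
""")
conn.executemany("INSERT INTO PROFILE VALUES (?, ?)",
                 [(1, "profile-1"), (2, "profile-2")])
# Each conjunct of a profile becomes one group of tuples sharing a groupID.
conn.executemany("INSERT INTO KNOWLEDGE VALUES (?, ?, ?)", [
    (1, 1, "Java"), (1, 1, "ProgrammingLanguage"), (1, 1, "ComputerScienceSkill"),
    (1, 2, "DBMS"), (1, 2, "ComputerScienceSkill"),
    (2, 1, "DBMS"), (2, 1, "ComputerScienceSkill"),
])

def subsumed_profiles(conn, atoms):
    """IDs of profiles having one hasKnowledge conjunct that contains all atoms."""
    atoms = sorted(set(atoms))
    placeholders = ", ".join("?" for _ in atoms)
    query = f"""
        SELECT DISTINCT profileID FROM KNOWLEDGE
        WHERE atom IN ({placeholders})
        GROUP BY profileID, groupID
        HAVING COUNT(DISTINCT atom) = ?
        ORDER BY profileID
    """
    return [row[0] for row in conn.execute(query, (*atoms, len(atoms)))]

print(subsumed_profiles(conn, ["ComputerScienceSkill"]))
print(subsumed_profiles(conn, ["Java", "DBMS"]))
```

Grouping by groupID is what keeps atoms of different conjuncts apart: profile 1 knows both Java and DBMS, but not within the same conjunct, so the second query returns no profile.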
ON-LINE REASONING: The Core Competence Extraction Algorithm
1 Profiles Subsumers Matrix computation
Idea: Extract the common know-how, expressed in the form of atomic information, shared by the same group of employees, with cardinality greater than or equal to k.
Example
Mario Rossi: Cplusplus (5 years), Java (5 years), Visual Basic (5 years)
Daniela Bianchi: Cplusplus (2 years), Java (6 years), Visual Basic (1 year)
Elena Pomarico: Cplusplus, Java, Visual Basic
Carmelo Piccolo: VBScript, Process Performance Monitoring
Lucio Battista: DBMS (2 years)
Mariangela Porro: DBMS (2 years), Internet Technologies (2 years)
Nicola Marco: DBMS (5 years), Internet Technologies (5 years)
Domenico De Palo: OOprogramming (6 years), Artificial Intelligence (4 years), Internet Technologies (4 years)
Profile  D1  D2  D3  D4  D5  D6  D7  D8  D9  D10  D11  ...
1        1   1   1   1   1   0   1   0   1   1    1    ...
2        1   1   1   1   1   0   1   0   1   1    1    ...
3        1   1   0   0   0   1   0   0   0   0    0    ...
4        1   1   0   0   0   1   0   1   0   0    0    ...
5        1   1   0   0   1   1   0   1   0   0    0    ...
6        1   0   1   0   0   0   0   0   0   0    0    ...
7        1   0   1   1   0   0   0   0   1   1    1    ...
8        1   1   1   1   1   0   1   1   0   0    0    ...
Table: Portion of the Profile Subsumers Matrix for the previous example
D1   ∃hasKnowledge.ComputerScienceSkill
D2   ∃hasKnowledge.(ComputerScienceSkill ⊓ =2 years)
D3   ∃hasKnowledge.ProgrammingLanguage
D4   ∃hasKnowledge.OOP
D5   ∃hasKnowledge.(ComputerScienceSkill ⊓ =5 years)
D6   ∃hasKnowledge.(DBMS ⊓ =2 years)
D7   ∃hasKnowledge.(OOP ⊓ =5 years)
D8   ∃hasKnowledge.(InternetTechnologies ⊓ =2 years)
D9   ∃hasKnowledge.C++
D10  ∃hasKnowledge.VisualBasic
D11  ∃hasKnowledge.Java
...
Table: Description of D1, ..., D11 reported in the previous table
2 Common Subsumers enumeration
Referring to the PSM of the set P = {P(a1), ..., P(an)}, and to a concept component Dk ∈ {D1, ..., Dm} deriving from P, a Core Competence is the union of the most specific features (i.e., profile concept components Dj) shared by the same group of k employees, where k is a predefined threshold.
LCS = ∃hasKnowledge.ComputerScienceSkill
BICS = ∃hasKnowledge.(ComputerScienceSkill ⊓ =5 years)
ICS3 = ∃hasKnowledge.(DBMS ⊓ =2 years)
ICS3 = ∃hasKnowledge.(OOP ⊓ =5 years)
ICS3 = ∃hasKnowledge.(InternetTechnologies ⊓ =2 years)
ICS3 = ∃hasKnowledge.(C++ ⊓ VisualBasic ⊓ Java)
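One way to read the enumeration step is to group the concept components by the set of profiles that share them. The sketch below does this in plain Python on the matrix portion shown in the example; it is an illustration of that reading, not the SQL-based implementation of the Core Competence module, and the function name group_by_support is hypothetical.

```python
from collections import defaultdict

# Portion of the Profile Subsumers Matrix from the example
# (rows: profiles 1-8, columns: components D1-D11).
PSM = [
    [1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 0],
]

def group_by_support(psm):
    """Map each set of profiles to the components shared by exactly that set."""
    groups = defaultdict(list)
    for j in range(len(psm[0])):
        support = frozenset(i + 1 for i, row in enumerate(psm) if row[j])
        groups[support].append(f"D{j + 1}")
    return dict(groups)

groups = group_by_support(PSM)
all_profiles = frozenset(range(1, len(PSM) + 1))

# Components shared by every profile: they make up the LCS.
lcs_components = groups.get(all_profiles, [])
# Components shared by groups of exactly three profiles.
shared_by_3 = {tuple(sorted(s)): ds for s, ds in groups.items() if len(s) == 3}

print(lcs_components)
for profiles, components in sorted(shared_by_3.items()):
    print(profiles, components)
```

On this portion, D1 is the only component shared by all profiles, and the groups of three profiles carry D6, D7, D8 and the conjunction of D9, D10 and D11, matching the LCS and ICS3 results listed above.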
Core Competence module GUI
Lessons learned
Proposal: Knowledge Compilation approach for Core Competence Extraction.
+ It improves performance in terms of execution time w.r.t. the classical logic-based approach.
+ It adopts standard SQL queries to compute the same informative content as advanced inference services.
+ It makes the computational costs of the process affordable also for large organizations, while retaining the full expressiveness of the logic-based approaches.
Notes on Performance:
The number of profiles is highly relevant in the common subsumers enumeration process.
The most computationally expensive process is the creation of the Profile Subsumers Matrix, under a threshold number of profile concept components.
Outline
1 Finding Commonalities: A DLs use case
2 Finding Commonalities: the Web of Data
  Common Subsumer in RDF
  RDF Clustering
3 Conclusion
Common Subsumer in RDF
Motivation
Learning from the Web of Data:
- huge amount of interconnected and machine-understandable data
- data modeled as RDF resources
- datasets referred to as Linked (Open) Data (LOD)
Facts to learn:
- identification of subsets of resources related to a common informative content
  - Cluster search (approximate matching)
  - Disambiguation
  - Personalization
Problem Definition
In analogy to the LCS service, proposed in DLs to learn from examples.
Adaptation to the Web of Data:
- giving up the subsumption minimality requirement: even rough Common Subsumers are useful for learning in the Web of Data
- definition of Common Subsumer of pairs of RDF resources
Definition (Rooted Graph (r-graph))
Let TWr be the set of all triples with subject r in the Web. A Rooted Graph (r-graph) is a pair 〈r, Tr〉, where
1 r is either the URI of an RDF resource, or a blank node
2 Tr = {t | t = <<r p c>>} is a subset of relevant triples in TWr
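As a small illustration of the r-graph definition, the sketch below represents triples as (subject, predicate, object) tuples and builds 〈r, Tr〉 by filtering a triple set. The function r_graph, the relevance predicate and the ex: resources are hypothetical; how relevance is decided is left open by the definition.

```python
def r_graph(r, triples, relevant=lambda t: True):
    """Return the r-graph <r, T_r>: r plus the relevant triples with subject r."""
    t_r = {t for t in triples if t[0] == r and relevant(t)}
    return r, t_r

# Hypothetical triples standing in for a fragment of the Web of Data.
WEB = {
    ("ex:a", "rdf:type", "ex:Deputy"),
    ("ex:a", "foaf:gender", "female"),
    ("ex:b", "rdf:type", "ex:Deputy"),
}

# Keep every triple about ex:a, then only those judged relevant.
_, t_all = r_graph("ex:a", WEB)
root, t_a = r_graph("ex:a", WEB, relevant=lambda t: t[1] != "foaf:gender")
print(root, sorted(t_a))
```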
Example: A Possible Representation for resources a and b
Example: A(nother) Possible Representation for resources a and b
Common Subsumer
Definition (Common Subsumer)
Let 〈a, Ta〉, 〈b, Tb〉 be two r-graphs and x, w, y be blank nodes.
- If 〈a, Ta〉 = 〈b, Tb〉, then 〈a, Ta〉 is a Common Subsumer of 〈a, Ta〉, 〈b, Tb〉.
- If Ta = ∅ or Tb = ∅, the pair 〈x, ∅〉 is a Common Subsumer of 〈a, Ta〉, 〈b, Tb〉.
- Otherwise, a pair 〈x, T〉 is a Common Subsumer of 〈a, Ta〉, 〈b, Tb〉 iff:
  ∃t = <<x w y>> such that (T entails t) ⇒ ∃t1 = <<a p c>>, t2 = <<b q d>> such that (T entails t1) ∧ (T entails t2)   (1)
  where Ta ⊆ T, Tb ⊆ T and 〈w, T〉 is a Common Subsumer of 〈p, Tp〉 and 〈q, Tq〉, and 〈y, T〉 is a Common Subsumer of 〈c, Tc〉 and 〈d, Td〉.
Note: We consider only simple entailment
Example: a Common Subsumer of a and b
Note: Triples with a blank node in predicate and object positions are discarded
Example: a(nother) Common Subsumer of a and b
Solving Algorithm
Main Features:
- anytime: if interrupted, it always returns a Common Subsumer of the input pair of RDF resources
- modular: it takes as input a function computing the sets of triples relevant for the input RDF resources
Our current criteria for triple selection:
- triples within a given graph distance from the input resource
- triples having properties within a selected set of significant properties for the dataset/application of interest
Output: A Common Subsumer of two r-graphs 〈a, Ta〉 and 〈b, Tb〉: a pair made up of a resource (anonymous or not) and a set of triples stating facts about such a resource which are "true" for both a and b.
Alternative cases:
- 〈_:cs, T〉: a blank node _:cs together with a set of triples related to _:cs
- 〈a, Ta〉, iff 〈a, Ta〉 = 〈b, Tb〉
- 〈_:cs, ∅〉 if either Ta = ∅ or Tb = ∅
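A rough recursive sketch in the spirit of the features listed above, not the authors' algorithm. It assumes triples are (subject, predicate, object) tuples of strings, that blank nodes are written with a "_:" prefix, and that a depth bound stands in for the graph-distance criterion. The names common_subsumer, triples_of, GRAPH and the ex: resources are all hypothetical.

```python
from itertools import count

_fresh = count()  # source of fresh blank-node labels

def is_blank(node):
    return node.startswith("_:")

def common_subsumer(a, b, triples_of, depth=2):
    """Return (root, T): a Common Subsumer of the r-graphs rooted in a and b.

    triples_of(r) is the pluggable function selecting the relevant triples
    with subject r; depth bounds the exploration (a simplification).
    """
    if a == b:
        return a, set(triples_of(a))      # same resource, same r-graph
    x = f"_:cs{next(_fresh)}"
    t_a, t_b = triples_of(a), triples_of(b)
    if depth == 0 or not t_a or not t_b:
        return x, set()                   # the pair <x, {}> case
    result = set()
    for (_, p, c) in t_a:
        for (_, q, d) in t_b:
            w, t_w = common_subsumer(p, q, triples_of, depth - 1)
            y, t_y = common_subsumer(c, d, triples_of, depth - 1)
            if is_blank(w) and is_blank(y):
                continue                  # blank predicate and blank object: discarded
            result |= {(x, w, y)} | t_w | t_y
    return x, result

# Hypothetical data: two resources sharing type and gender but not party.
GRAPH = {
    "ex:a": {("ex:a", "rdf:type", "ex:Deputy"),
             ("ex:a", "foaf:gender", "female"),
             ("ex:a", "ex:party", "ex:P1")},
    "ex:b": {("ex:b", "rdf:type", "ex:Deputy"),
             ("ex:b", "foaf:gender", "female"),
             ("ex:b", "ex:party", "ex:P2")},
}

def triples_of(r):
    return GRAPH.get(r, set())

cs_root, cs_triples = common_subsumer("ex:a", "ex:b", triples_of)
for triple in sorted(cs_triples):
    print(triple)
```

The result keeps the shared type and gender, and records that both resources have some party by using a blank node in object position. Every pair of triples is compared, so the cost grows quickly with the number of relevant triples, which is one reason the selection function matters.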
RDF Clustering
Target Semantic Web Task
Clustering of Web resources with a CS
retrieving resources conveying the same information in their different RDF descriptions
CS description → SPARQL queries: WHERE { Tcs [blank nodes → variables] }
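The translation from a CS description to a query can be sketched as follows. The function cs_to_sparql is hypothetical and makes simplifying assumptions: nodes are strings, blank nodes start with "_:", anything starting with http:// or https:// is a URI, and everything else is emitted as a plain literal without escaping. The sample triples reuse the properties that appear in the Chamber of Deputies query of the use case.

```python
def cs_to_sparql(root, triples):
    """Build a SELECT query whose WHERE clause is T_cs with blank nodes as variables."""
    variables = {}

    def term(node):
        if node.startswith("_:"):
            # Each blank node gets its own variable, numbered in order of appearance.
            return variables.setdefault(node, f"?x{len(variables)}")
        if node.startswith(("http://", "https://")):
            return f"<{node}>"
        return f'"{node}"'  # anything else is treated as a plain literal

    head = term(root)
    patterns = [" ".join(term(n) for n in t) + " ." for t in sorted(triples)]
    body = "\n".join("  " + p for p in patterns)
    return f"SELECT DISTINCT {head}\nWHERE {{\n{body}\n}}"

# Hypothetical Common Subsumer description.
T_CS = {
    ("_:cs", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://dati.camera.it/ocd/deputato"),
    ("_:cs", "http://xmlns.com/foaf/0.1/gender", "female"),
    ("_:cs", "http://dati.camera.it/ocd/rif_mandatoCamera", "_:m"),
}

query = cs_to_sparql("_:cs", T_CS)
print(query)
```

The resulting string could then be sent to a SPARQL endpoint; running it retrieves every resource that matches the description, which is what makes the CS usable as a cluster definition.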
Clustering with a CS: A use case
The Italian Chamber of Deputies LOD
Public SPARQL endpoint (http://dati.camera.it/sparql)
Running example: Find the commonalities between deputies Nilde Iotti and Tina Anselmi in the 10th Legislature
SELECT DISTINCT ?x0
WHERE {
  ?x0 a <http://dati.camera.it/ocd/deputato> .
  ?x0 <http://xmlns.com/foaf/0.1/gender> "female" .
  ?x0 <http://dati.camera.it/ocd/rif_mandatoCamera> ?x1 .
  . . .
}
1st Legislature clusters
Conclusion
Motivation: learning shared informative content in collections of RDF resources
Problem Definition: search for Common Subsumers that are not subsumption-minimal, in order to ensure computability in the Web of Data, which is too large to be explored
Results: An anytime algorithm computing Common Subsumers of pairs of RDF resources:
- allowing the use of partially learned informative content for further processing, whenever the search for Common Subsumers is interrupted
- possibly supporting the clustering of collections of RDF resources, by exploiting associativity of Common Subsumers.
Future work:
- Extension of the CS definition to other entailment regimes
- Investigation of methods for the selection of relevant triples
- Automated link traversal techniques for broader dataset exploration
- Application to data quality problems (e.g., missing values)