From Big Linked Data to Linked Big Data: DBpedia as a framework for data integration
Giuseppe Futia (1), Antonio Vetrò (1), Giuseppe Rizzo (2)
(1) Nexa Center for Internet and Society, DAUIN, Politecnico di Torino
(2) Istituto Superiore Mario Boella (ISMB)
7th DBpedia Community Meeting in Leipzig, 15 September 2016
PhD candidate on semantics at Nexa Center for Internet & Society, DAUIN, Politecnico di Torino
Experiences with LOD and DBpedia
• TellMeFirst, a tool for classifying and enriching textual documents built on DBpedia Spotlight (http://tellmefirst.polito.it)
• Contratti Pubblici, a tool for processing, exploring, and visualizing Italian Public Procurements (http://public-contracts.nexacenter.org/)
How TellMeFirst works
TellMeFirst
Results obtained with a description of the Eyes Wide Shut movie
Contratti Pubblici (Synapta + Nexa)
Different data sources to build a search engine on Italian Public Contracts:
• Anti-corruption National Authority
• Agency for Digital Italy
Contratti Pubblici (Synapta + Nexa)
Linked Data repository of Public Contracts, linked to DBpedia and SPC
DBpedia in our projects
• TellMeFirst:
  – Training set used for the semantic classification task
  – Several interlinks used for the enrichment task
• Contratti Pubblici:
  – Data enrichment to enable advanced SPARQL queries
  – Data quality improvement (e.g., consistent labels)
From Big Linked Data to Linked Big Data
• Big Linked Data
  – Already a reality, as shown by the exponential growth of Linked Data in recent years
• Linked Big Data
  – RDF data model for Big Data Variety
  – Meta information to enable powerful analytics
  – Simplify Big Data access, integration, and interlinking
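The "RDF data model for Big Data Variety" point can be illustrated with a minimal sketch, which is not taken from the talk. It assumes two hypothetical sources (a CSV row and a JSON document) that describe the same public contract in different shapes. Both are reduced to (subject, predicate, object) triples, so they merge by plain set union. The `http://example.org/` namespace, the field names, and the sample values are all made up for illustration.

```python
# Toy illustration: records with different shapes become one set of triples.
# EX is a hypothetical namespace; field names and values are invented.
EX = "http://example.org/"


def csv_row_to_triples(row):
    """Turn a flat CSV-style record into (subject, predicate, object) triples."""
    subject = EX + "contract/" + row["id"]
    return {
        (subject, EX + "supplier", row["supplier"]),
        (subject, EX + "amount", row["amount"]),
    }


def json_doc_to_triples(doc):
    """Turn a nested JSON-style record into triples using the same vocabulary."""
    subject = EX + "contract/" + doc["contractId"]
    return {
        (subject, EX + "supplier", doc["winner"]["name"]),
        (subject, EX + "year", str(doc["year"])),
    }


csv_row = {"id": "42", "supplier": "ACME", "amount": "1000"}
json_doc = {"contractId": "42", "winner": {"name": "ACME"}, "year": 2015}

# Triples are plain tuples, so integration is a set union: the supplier
# statement present in both sources appears only once in the merged graph.
graph = csv_row_to_triples(csv_row) | json_doc_to_triples(json_doc)

for triple in sorted(graph):
    print(triple)
```

Because both sources share the subject identifier and the `supplier` predicate, the overlapping statement is deduplicated and the remaining statements complement each other.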
Big Data notion of Variety
• Variety of data and representation formats
• Variety of conceptualizations and data models
• Variety related to temporal and spatial dependencies
• Variety as a “generalization of the semantic heterogeneity as studied in the field of Linked Data”
(Pascal Hitzler & Krzysztof Janowicz)
PhD research questions (i)
• RQ1: How can the technological foundations of Linked Data and Big Data be further improved and combined to create an open software architecture for a multi-thematic, multi-perspective, and multi-medial knowledge graph from heterogeneous sources?
PhD research questions (ii)
• RQ2: What are the features of a research method for meeting and evaluating the security, scalability, performance, openness, and interoperability of the software architecture described in RQ1? And how can we measure the quality of the knowledge graph produced with this architecture?
Key ideas for my PhD
• Get concepts and ontologies from the DBpedia knowledge base to support semantic alignment during the integration stage
• Use frameworks for data integration of structured information with Big Data technologies: RDF Mapping Language (RML) + Hadoop or Spark
• Exploit Machine Learning techniques (e.g., Deep Learning) to augment datasets with unstructured data
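The mapping idea behind the second point can be sketched as follows. This is not RML syntax and not the author's pipeline: it is a plain-Python analogue of a declarative mapping, in which a dictionary states how table columns become RDF predicates and one generic function applies it row by row. The `MAPPING` structure, the `http://example.org/` namespace, the sample rows, and the use of the DBpedia ontology property `dbo:city` for the `city` column are assumptions for illustration.

```python
import csv
import io

DBO = "http://dbpedia.org/ontology/"  # DBpedia ontology namespace
EX = "http://example.org/"            # hypothetical namespace

# Hypothetical declarative mapping: a subject template plus column -> predicate.
# Reusing a DBpedia ontology term (assumed here: dbo:city) is one way to align
# local columns with shared concepts during integration.
MAPPING = {
    "subject_template": EX + "contract/{id}",
    "predicates": {
        "city": DBO + "city",
        "amount": EX + "amount",
    },
}


def apply_mapping(rows, mapping):
    """Yield (subject, predicate, object) triples for each row; skip empty cells."""
    for row in rows:
        subject = mapping["subject_template"].format(**row)
        for column, predicate in mapping["predicates"].items():
            value = row.get(column)
            if value:
                yield (subject, predicate, value)


# Invented sample data standing in for a structured source.
SAMPLE_CSV = "id,city,amount\n1,Torino,1000\n2,Leipzig,\n"

triples = list(apply_mapping(csv.DictReader(io.StringIO(SAMPLE_CSV)), MAPPING))
for triple in triples:
    print(triple)
```

Since each row is mapped independently of the others, the per-row step is the kind of work that could in principle be distributed across workers in Hadoop or Spark; that distribution is not shown here.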
DBpedia as knowledge base for:
• Entity linking and annotations in documents
• Assertion of additional categories for data
• Improvement of multilingual information
• Estimation of data quality of integrated information according to different features (e.g., provenance)
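As a rough illustration of the first point (entity linking and annotations), the sketch below matches surface forms in a text against a tiny hand-written gazetteer of DBpedia resource URIs. It is a deliberately naive stand-in: tools such as DBpedia Spotlight also disambiguate candidates by context, which this toy does not attempt. The gazetteer contents, the function name, and the sample sentence are assumptions for illustration.

```python
# Hand-written gazetteer: lowercase label -> DBpedia resource URI.
GAZETTEER = {
    "eyes wide shut": "http://dbpedia.org/resource/Eyes_Wide_Shut",
    "stanley kubrick": "http://dbpedia.org/resource/Stanley_Kubrick",
}


def link_entities(text, gazetteer):
    """Return sorted (start, end, uri) annotations for every label occurrence."""
    lowered = text.lower()
    annotations = []
    for label, uri in gazetteer.items():
        start = lowered.find(label)
        while start != -1:
            annotations.append((start, start + len(label), uri))
            start = lowered.find(label, start + 1)
    return sorted(annotations)


text = "Eyes Wide Shut is a film directed by Stanley Kubrick."
annotations = link_entities(text, GAZETTEER)

for start, end, uri in annotations:
    print(text[start:end], "->", uri)
```

Each annotation keeps the character offsets of the mention, so the linked URI can later be used to attach categories or multilingual labels to that span of the document.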
Challenges
• Greater accuracy (integrating different datasets)
• Immediacy (near-real time data, from new data sources)
• Flexibility (not constrained by database structure)
• Better analytics (the ability to change the rules)
• Data quality (reliability and effectiveness of data)