Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics...
![Page 1: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/1.jpg)
Creating and Exploiting a Web of Semantic Data
Tim Finin, UMBC
Earth and Space Science Informatics Workshop
05 August 2009
http://ebiquity.umbc.edu/resource/html/id/272/
![Page 2: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/2.jpg)
Overview
•Introduction
•Semantic Web 101
•Recent Semantic Web trends
•Examples: DBpedia, Wikitology
•Conclusion
![Page 3: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/3.jpg)
The Age of Big Data
•Massive amounts of data are available today
•Advances in many fields are driven by the availability of unstructured data, e.g., text, audio, images
•Increasingly, large amounts of structured and semi-structured data are also online
•Much of this is available in the Semantic Web language RDF, fostering integration and interoperability
•Such structured data is especially important for the sciences
![Page 4: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/4.jpg)
Twenty years ago…
Tim Berners-Lee’s 1989 WWW proposal described a web of relationships among named objects, unifying many information management tasks
Capsule history
• Guha’s MCF (~94)
• XML+MCF=>RDF (~96)
• RDF+OO=>RDFS (~99)
• RDFS+KR=>DAML+OIL (00)
• W3C’s SW activity (01)
• W3C’s OWL (03)
• SPARQL, RDFa (08)
• Rules (09)
http://www.w3.org/History/1989/proposal.html
![Page 5: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/5.jpg)
Ten years ago…
•The W3C started developing standards for the Semantic Web
•The vision, technology and use cases are still evolving
•Moving from a web of documents to a web of data
![Page 6: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/6.jpg)
Today
4.5 billion integrated facts published on the Web as RDF Linked Open Data
![Page 7: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/7.jpg)
Tomorrow
Large collections of integrated facts published on the Web for many disciplines and domains
![Page 8: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/8.jpg)
W3C’s Semantic Web Goal
“The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.”
-- Berners-Lee, Hendler and Lassila, The Semantic Web, Scientific American, 2001
![Page 9: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/9.jpg)
Contrast with a non-Web approach
The W3C Semantic Web approach is
•Distributed
•Open
•Non-proprietary
•Standards based
![Page 10: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/10.jpg)
How can we share data on the Web?
•POX, Plain Old XML, is one approach, but it has deficiencies
•The Semantic Web languages RDF and OWL offer a simpler and more abstract data model (a graph) that is better for integration
•Their well-defined semantics support knowledge modeling and inference
•Supported by a stable, funded standards organization, the World Wide Web Consortium
![Page 11: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/11.jpg)
Simple RDF Example
[Graph: the document http://umbc.edu/~finin/talks/idm02/ has dc:Title “Intelligent Information Systems on the Web and in the Aether” and dc:Creator pointing to a blank node; the blank node has bib:name “Tim Finin”, bib:email “[email protected]” and bib:Aff http://umbc.edu/]
![Page 12: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/12.jpg)
The RDF Data Model
•An RDF document is an unordered collection of statements, each with a subject, predicate and object
•Each such triple can be thought of as a labelled arc in a graph
•Statements describe properties of resources
•A resource is any object that can be referenced or denoted by a URI
•Properties themselves are also resources (URIs)
•Dereferencing a URI produces useful additional information, e.g., a definition or additional facts
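The data model above can be sketched in a few lines of plain Python. This is only an illustration, not an RDF library: the `Triple` type, the `objects` helper and the blank node label `_:author` are hypothetical names chosen for the sketch, and the data is the talk description used in the Simple RDF Example slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: a labelled arc from subject to object."""
    subject: str
    predicate: str
    obj: str


DOC = "http://umbc.edu/~finin/talks/idm02/"
AUTHOR = "_:author"  # hypothetical label for the blank node

# An RDF document is an unordered collection of statements, so a set fits.
graph = {
    Triple(DOC, "dc:title",
           '"Intelligent Information Systems on the Web and in the Aether"'),
    Triple(DOC, "dc:creator", AUTHOR),
    Triple(AUTHOR, "bib:Name", '"Tim Finin"'),
    Triple(AUTHOR, "bib:Aff", "http://umbc.edu/"),
}


def objects(triples, subject, predicate):
    """Return every object reachable from subject along arcs labelled predicate."""
    return {t.obj for t in triples
            if t.subject == subject and t.predicate == predicate}


# Follow two arcs: document -> creator (blank node) -> name.
for creator in objects(graph, DOC, "dc:creator"):
    print(objects(graph, creator, "bib:Name"))
```

Because the collection is a set, adding the same statement twice changes nothing, which matches the "unordered collection" reading of an RDF document.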
![Page 13: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/13.jpg)
RDF is the first SW language
RDF is a simple language for graph-based representations
The RDF data model has three views:
•XML encoding: good for machine processing
<rdf:RDF ……..> <….> <….></rdf:RDF>
•Graph: good for human viewing
•Triples: good for storage and reasoning
stmt(docInst, rdf_type, Document)
stmt(personInst, rdf_type, Person)
stmt(inroomInst, rdf_type, InRoom)
stmt(personInst, holding, docInst)
stmt(inroomInst, person, personInst)
![Page 14: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/14.jpg)
XML encoding for RDF
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:bib="http://daml.umbc.edu/ontologies/bib/">
  <rdf:Description rdf:about="http://umbc.edu/~finin/talks/idm02/">
    <dc:title>Intelligent Information … and in the Aether</dc:title>
    <dc:creator>
      <rdf:Description>
        <bib:Name>Tim Finin</bib:Name>
        <bib:Email>[email protected]</bib:Email>
        <bib:Aff rdf:resource="http://umbc.edu/" />
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
![Page 15: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/15.jpg)
N3 is a friendlier encoding
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix bib: <http://daml.umbc.edu/ontologies/bib/> .

<http://umbc.edu/~finin/talks/idm02/>
    dc:title "Intelligent ... and in the Aether" ;
    dc:creator [ bib:Name "Tim Finin" ;
                 bib:Email "[email protected]" ;
                 bib:Aff <http://umbc.edu/> ] .
![Page 16: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/16.jpg)
RDFS supports simple inferences
• RDF Schema adds vocabulary for classes, properties & constraints
• An RDF ontology plus some RDF statements may imply additional RDF statements (not possible in XML)
• Note that this is part of the data model and not of the accessing or processing code.

@prefix rdfs: <http://www.....>.
@prefix : <genesis.n3>.
parent a rdf:Property; rdfs:domain person; rdfs:range person.
mother rdfs:subPropertyOf parent; rdfs:domain woman; rdfs:range person.
eve mother cain.

person a class.
woman subClass person.
mother a property.
eve a person; a woman; parent cain.
cain a person.
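How such implied statements arise can be sketched with a small forward-chaining loop in plain Python. This is a minimal illustration under stated assumptions, not a full RDFS reasoner: it applies only three entailment patterns (property inheritance via rdfs:subPropertyOf, rdfs:domain and rdfs:range), and the function name `rdfs_closure` and the tuple encoding are hypothetical choices for the sketch.

```python
def rdfs_closure(triples):
    """Repeatedly apply three RDFS patterns until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for schema_s, schema_p, schema_o in inferred:
                if schema_s != p:
                    continue  # this schema statement is about another property
                if schema_p == "rdfs:subPropertyOf":
                    new.add((s, schema_o, o))           # s also has the super-property
                elif schema_p == "rdfs:domain":
                    new.add((s, "rdf:type", schema_o))  # subject is in the domain class
                elif schema_p == "rdfs:range":
                    new.add((o, "rdf:type", schema_o))  # object is in the range class
        if new <= inferred:
            return inferred
        inferred |= new


# The schema and the single fact from the slide, as (subject, predicate, object).
genesis = {
    ("parent", "rdfs:domain", "person"),
    ("parent", "rdfs:range", "person"),
    ("mother", "rdfs:subPropertyOf", "parent"),
    ("mother", "rdfs:domain", "woman"),
    ("mother", "rdfs:range", "person"),
    ("eve", "mother", "cain"),
}

closure = rdfs_closure(genesis)
for triple in sorted(closure - genesis):
    print(triple)
```

From the one fact "eve mother cain" the loop derives that eve is a parent of cain, that eve is a woman and a person, and that cain is a person. The inference lives in the data and schema, not in application code.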
![Page 17: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/17.jpg)
OWL adds further richness
OWL adds richer representational vocabulary, e.g.
– parentOf is the inverse of childOf
– Every person has exactly one mother
– Every person is a man or a woman but not both
– A man is the equivalent of a person with a sex property with value “male”
OWL is based on ‘description logic’, a subset of logic with efficient reasoners that are complete
– Good algorithms for reasoning about descriptions
![Page 18: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/18.jpg)
That was then, this is now
• 1996-2000: focus on RDF and data
• 2000-2007: focus on OWL, developing ontologies, sophisticated reasoning
• 2008-…: Integrating and exploiting large RDF data collections backed by lightweight ontologies
![Page 19: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/19.jpg)
A Linked Data story
•Wikipedia as a source of knowledge
–Wikis are a great way to collaborate on building up knowledge resources
•Wikipedia as an ontology
–Every Wikipedia page is a concept or object
•Wikipedia as RDF data
–Map this ontology into RDF
•DBpedia as the lynchpin for Linked Data
–Exploit its breadth of coverage to integrate things
![Page 20: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/20.jpg)
Populating Freebase KB
![Page 21: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/21.jpg)
Underlying Powerset’s KB
![Page 22: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/22.jpg)
Mined by TrueKnowledge
![Page 23: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/23.jpg)
Wikipedia as an ontology
• Using Wikipedia as an ontology
–each article (~3M) is an ontology concept or instance
–terms linked via category system (~200k), infobox template use, inter-article links, infobox links
–Article history contains metadata for trust, provenance, etc.
• It’s a consensus ontology with broad coverage
• Created and maintained by a diverse community for free!
• Multilingual
• Very current
• Overall content quality is high
![Page 24: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/24.jpg)
Wikipedia as an ontology
•Uncategorized and miscategorized articles
•Many ‘administrative’ categories (e.g., articles needing revision) and useless ones (e.g., 1949 births)
•Multiple infobox templates for the same class
•Multiple infobox attribute names for same property
•No datatypes or domains for infobox attribute values
• etc.
![Page 25: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/25.jpg)
DBpedia: Wikipedia in RDF
•A community effort to extract structured information from Wikipedia and publish it as RDF on the Web
•Effort started in 2006 with EU funding
•Data and software open sourced
•DBpedia doesn’t extract information from Wikipedia’s text, but from its structured information, e.g., links, categories, infoboxes
![Page 26: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/26.jpg)
DBpedia: Linked Data lynchpin
![Page 27: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/27.jpg)
http://lookup.dbpedia.org/
![Page 28: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/28.jpg)
![Page 29: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/29.jpg)
![Page 30: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/30.jpg)
![Page 31: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/31.jpg)
DBpedia uses WP structured data
DBpedia extracts structured data from Wikipedia, especially from Infoboxes
![Page 32: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/32.jpg)
http://dbpedia.org/sparql/
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp: <http://dbpedia.org/resource/>
PREFIX dbpo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?Property ?Place
WHERE {
  dbp:Barack_Obama ?Property ?Place .
  ?Place rdf:type dbpo:Place .
}
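A query like this one can also be sent to the endpoint from a program. The sketch below uses only the Python standard library and the `query` parameter defined by the SPARQL protocol. The helper names `build_request` and `run_query` are hypothetical, and the sketch assumes the public endpoint is reachable and honours a request for JSON results; what it returns depends on the live DBpedia data.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbp: <http://dbpedia.org/resource/>
PREFIX dbpo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?Property ?Place
WHERE {
  dbp:Barack_Obama ?Property ?Place .
  ?Place rdf:type dbpo:Place .
}
"""


def build_request(endpoint, query):
    """Encode the query as a GET request asking for SPARQL JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})


def run_query(endpoint, query, timeout=30):
    """Send the query and flatten each result row to {variable: value}."""
    with urllib.request.urlopen(build_request(endpoint, query),
                                timeout=timeout) as response:
        results = json.load(response)
    return [{name: cell["value"] for name, cell in row.items()}
            for row in results["results"]["bindings"]]


if __name__ == "__main__":
    # Requires network access; each row pairs a property with a place.
    for row in run_query(ENDPOINT, QUERY):
        print(row["Property"], row["Place"])
```

Keeping request construction separate from the network call makes the encoding step easy to inspect without contacting the server.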
![Page 33: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/33.jpg)
DBpedia: Linked Data lynchpin
![Page 34: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/34.jpg)
Consider Baltimore, MD
![Page 35: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/35.jpg)
Looking at the RDF description
We find assertions equating DBpedia's object for Baltimore with those in other LOD datasets:
dbpedia:Baltimore%2C_Maryland
    owl:sameAs census:us/md/counties/baltimore/baltimore;
    owl:sameAs cyc:concept/Mx4rvVin-5wpEbGdrcN5Y29ycA;
    owl:sameAs freebase:guid.9202a8c04000641f800000000004921a;
    owl:sameAs geonames:4347778/ .
Since owl:sameAs is defined as an equivalence relation, the mapping works both ways.
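Because owl:sameAs is reflexive, symmetric and transitive, a set of such assertions partitions identifiers into groups that all denote the same thing. One way to compute those groups is a union-find structure, sketched here in plain Python. The function names `same_as_classes` and `equivalents` are hypothetical, and the identifiers are the prefixed names from the Baltimore example.

```python
def same_as_classes(pairs):
    """Group identifiers linked, directly or indirectly, by owl:sameAs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    classes = {}
    for node in list(parent):
        classes.setdefault(find(node), set()).add(node)
    return list(classes.values())


def equivalents(pairs, term):
    """Return every identifier that denotes the same thing as term."""
    for group in same_as_classes(pairs):
        if term in group:
            return group
    return {term}


BALTIMORE = "dbpedia:Baltimore%2C_Maryland"
SAME_AS = [
    (BALTIMORE, "census:us/md/counties/baltimore/baltimore"),
    (BALTIMORE, "cyc:concept/Mx4rvVin-5wpEbGdrcN5Y29ycA"),
    (BALTIMORE, "freebase:guid.9202a8c04000641f800000000004921a"),
    (BALTIMORE, "geonames:4347778/"),
]

# Starting from the GeoNames identifier we still reach all the others.
print(sorted(equivalents(SAME_AS, "geonames:4347778/")))
```

Although every assertion is stated from the DBpedia side, a lookup from any of the other four datasets lands in the same group, which is what makes DBpedia useful as a hub.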
![Page 36: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/36.jpg)
Linked Data Cloud, March 2009
![Page 37: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/37.jpg)
Wikitology
We’ve been exploring a different approach to derive an ontology from Wikipedia through a series of use cases:
– Identifying user context in a collaboration system from documents viewed (2006)
– Improve IR accuracy by adding Wikitology tags to documents (2007)
– ACE: cross document co-reference resolution for named entities in text (2008)
– TAC KBP: Knowledge Base population from text (2009)
– Improve Web search engine by tagging documents and queries (2009)
![Page 38: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/38.jpg)
Wikitology 2.0 (2008)
[Architecture diagram; labels: WordNet, Yago, Human input & editing, Databases, Freebase KB, RDF, text, graphs]
![Page 39: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/39.jpg)
Conclusion
•The Semantic Web is a powerful approach to data interoperability and integration
•The research focus is shifting to a “Web of Data” perspective
•Many research issues remain: uncertainty, provenance, trust, parallel graph algorithms, reasoning over billions of triples, user-friendly tools, etc.
•Just as the Web enhances human intelligence, the Semantic Web will enhance machine intelligence
•The ideas and technology are still evolving
![Page 40: Creating and Exploiting a Web of Semantic Data Tim Finin, UMBC Earth and Space Science Informatics Workshop 05 August 2009](https://reader036.fdocuments.in/reader036/viewer/2022062322/5697bff81a28abf838cbf515/html5/thumbnails/40.jpg)
http://ebiquity.umbc.edu/