Semantic Web Technologies for Practical Applicationsguus/talks/07-helsinki.pdf · 2017-04-04 ·...


Tumbling Walls & Building Bridges

Semantic Web Technologies for

Practical Applications

Guus Schreiber
Free University Amsterdam

Co-chair W3C Semantic Web Deployment Working Group

Overview

• Non-tech intro to Semantic Web
• Semantics for the Web
• Two application examples
• Principles and techniques
• W3C activities
• Semantic Web: hope or hype?

Non-tech intro to the Semantic Web

The Web: resources and links

[Figure: two URLs connected by a Web link]

The Semantic Web: typed resources and links

[Figure: the painting “Woman with hat” (SFMOMA) is linked by “creator” (Dublin Core) to “Henri Matisse” (ULAN)]
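The typed link in this slide can be read as a subject, predicate, object statement. The following is a minimal Python sketch of that idea, not an RDF implementation. The identifiers `sfmoma:woman-with-hat`, `ulan:henri-matisse` and `ex:Painting` are hypothetical placeholders; only the Dublin Core `creator` property and the names ULAN and SFMOMA come from the slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One typed link: subject --predicate--> object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers, for illustration only (not real ULAN or SFMOMA URIs).
PAINTING = "sfmoma:woman-with-hat"
MATISSE = "ulan:henri-matisse"

TRIPLES = [
    Triple(PAINTING, "rdf:type", "ex:Painting"),  # the resource is typed
    Triple(PAINTING, "dc:creator", MATISSE),      # the link is typed
]


def objects_of(triples, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


if __name__ == "__main__":
    print(objects_of(TRIPLES, PAINTING, "dc:creator"))
```

Because both the resource and the link carry a type, a program can ask "who created this painting?" instead of merely following an untyped hyperlink.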

Principle 1: semantic annotation

• Description of web objects with “concepts” from a shared vocabulary

Principle 2: semantic search

• Search for objects which are linked via concepts (semantic link)
• Use the type of semantic link to provide meaningful presentation of the search results

[Figure: query “Paris”; Montmartre is linked to Paris by PartOf]
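The Paris example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the object identifiers and annotations are invented, and the PartOf table is assumed to be free of cycles. The search follows PartOf links upward from each annotation and reports the link that explains each hit, so the results can be presented meaningfully.

```python
# Hypothetical annotations: object id -> concept it is annotated with.
ANNOTATIONS = {
    "photo-1": "Montmartre",
    "painting-2": "Paris",
    "print-3": "Amsterdam",
}

# PartOf links between concepts: part -> whole (assumed to contain no cycles).
PART_OF = {"Montmartre": "Paris"}


def semantic_search(query):
    """Return (object, explanation) pairs for objects linked to `query`."""
    results = []
    for obj, concept in sorted(ANNOTATIONS.items()):
        path = [concept]
        # Follow PartOf links upward until the query concept is reached
        # or there is no further link to follow.
        while path[-1] != query and path[-1] in PART_OF:
            path.append(PART_OF[path[-1]])
        if path[-1] == query:
            explanation = "direct match" if len(path) == 1 else " PartOf ".join(path)
            results.append((obj, explanation))
    return results


if __name__ == "__main__":
    for obj, why in semantic_search("Paris"):
        print(obj, "-", why)
```

The explanation string is what lets an interface group results by link type, for example "located in a part of Paris" versus "annotated with Paris".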

Principle 3: multiple vocabularies, or: the myth of a unified vocabulary

• In large virtual collections there are always multiple vocabularies
  – In multiple languages
• Every vocabulary has its own perspective
  – You can’t just merge them
• But you can use vocabularies jointly by defining a limited set of links
  – “Vocabulary alignment”
• It is surprising what you can do with just a few links

Example: “Tokugawa”

AAT style/period: Edo (Japanese period), Tokugawa

SVCN period: Edo

SVCN is a local in-house ethnology thesaurus
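A small Python sketch of how a single alignment link can make this example work: a search for “Tokugawa” matches a label in AAT, and the link to the SVCN concept then retrieves an object that was only annotated with the in-house term. The concept identifiers (`aat:edo`, `svcn:edo`) and the annotated object are hypothetical, and the expansion follows only one alignment step.

```python
# Labels per concept, taken from the slide; the identifiers are hypothetical.
LABELS = {
    "aat:edo": {"Edo (Japanese period)", "Tokugawa"},
    "svcn:edo": {"Edo"},
}

# One manually defined alignment link between the two vocabularies.
ALIGNMENT = [("aat:edo", "svcn:edo")]

# Hypothetical collection object annotated with the in-house SVCN term only.
ANNOTATIONS = {"museum-object-17": "svcn:edo"}


def concepts_for(term):
    """Concepts whose label equals `term`, plus directly aligned concepts."""
    hits = {c for c, labels in LABELS.items() if term in labels}
    expanded = set(hits)
    for a, b in ALIGNMENT:  # links are used in both directions
        if a in hits:
            expanded.add(b)
        if b in hits:
            expanded.add(a)
    return expanded


def search(term):
    """Objects annotated with any concept reachable from `term`."""
    concepts = concepts_for(term)
    return sorted(o for o, c in ANNOTATIONS.items() if c in concepts)


if __name__ == "__main__":
    print(search("Tokugawa"))
```

Without the alignment entry, the same search returns nothing, since “Tokugawa” is not a label in SVCN.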

Semantics for the Web

Challenges

• Machine-processable representation of semantic information

• Defining semantics in an OPEN environment
  – Adding semantics to other people’s semantics
  – Ability for everyone to contribute
• Ability to define mappings between semantic representations
  – There is no uniform way to classify the world!

The notion of ontology (as currently used in computer science)

• The Semantic Web needs sets of shared concepts

• These sets of concepts are called “ontologies”

• It is hard and time-consuming to develop ontologies

• Therefore, the Semantic Web developers are looking for existing ontologies, vocabularies, taxonomies

Ontologies and data models

• Main difference with data models is not the content, but the purpose (generalizes over applications)

• You cannot see the difference by just looking at the syntax!

• A conceptual model written in an ontology language is not necessarily an ontology!

Example “ontologies” for SW applications

• Domain-specific vocabularies
  – Medicine: UMLS, SNOMED, Galen
  – Art history: AAT, ULAN
  – Geography: TGN
• Generic ontologies
  – Top-level categories (reminiscent of Aristotelian categories)
  – Lexical vocabularies: WordNet
  – Units and dimensions, time ontology
  – Currencies, country codes, …

Good and bad ontologies?!

• Good ontologies are used
• Good ontologies represent some form of consensus in a community
• Good ontologies are maintained
• Good ontologies do not need to be complex
• Good ontologies may contain “mistakes”

Levels of interoperability

• Syntactic interoperability
  – Using data formats that you can share
  – XML family is the preferred option
• Semantic interoperability
  – How to share meaning / concepts
  – Technology for finding and representing semantic links

Two application examples

DOPE: semantic search of large document repositories

• Stuckenschmidt et al. (2003)
• EMTREE thesaurus (MeSH-based)
• Documents
  – 5M Medline abstracts
  – 500M of full-text articles
• Automatic document indexing
• RDF used for syntactic interoperability
  – RDF wrapper for SOAP-based access to documents
• Disambiguation of search terms
• Visualization of search results through semantic categories
  – Needed to prevent information overflow

E-Culture demonstrator

http://e-culture.multimedian.nl

• Part of large Dutch knowledge-economy project MultimediaN
• Partners: VU, CWI, UvA, DEN, ICN
• People: Alia Amin, Lora Aroyo, Mark van Assem, Victor de Boer, Lynda Hardman, Michiel Hildebrand, Laura Hollink, Marco de Niet, Borys Omelayenko, Marie-France van Orsouw, Jacco van Ossenbruggen, Guus Schreiber, Jos Taekema, Annemiek Teesing, Anna Tordai, Jan Wielemaker, Bob Wielinga
• Artchive.com, Rijksmuseum Amsterdam, Dutch ethnology museums (Amsterdam, Leiden), National Library (Bibliopolis)

Semantic Web applications: principles and techniques

Principle 1: Be modest!

• Do the things you’re good at
• Use resources of others where possible
  – E.g. geographical vocabularies, lexical resources, such as WordNet

Principle 2: Think Large!

"Once you have a truly massive amount of information integrated as knowledge, then the human-software system will be superhuman, in the same sense that mankind with writing is superhuman compared to mankind before writing."

• Don’t be afraid to include large external resources
• The technology can handle it!

Principle 3: Don’t strive for perfection!

• The “not invented here” syndrome
• Don’t discard an external resource because it does not exactly meet your needs
• Just create your local extensions
  – The technology for this exists

Principle 4: Use open (Web) standards

• Why does the Web work?
  – Because it moved away from vendor-specific formats
• XML-related standards make shared life so much easier
• Think thrice before embarking on Flash
  – Even forgetting the accessibility problems for a moment

Technique: syntactic vocabulary interoperability

• Make your vocabularies available in the Web standard RDF

• Many organizations already do this

• W3C provides the SKOS template to make this almost straightforward

• Effort required: at most a few days
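To give a feel for why the effort is small, here is a hedged Python sketch that writes one thesaurus term as SKOS in Turtle syntax, using only string formatting. The SKOS namespace and the properties `skos:prefLabel`, `skos:altLabel` and `skos:broader` are part of SKOS; the `ex:` namespace, the concept identifiers and the `japanese-periods` parent term are hypothetical. A real conversion would normally use an RDF library and would need to handle identifiers that are not valid Turtle local names.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"
EX_NS = "http://example.org/vocab/"  # hypothetical namespace for your own terms


def turtle_literal(text, lang):
    """Quote `text` as a language-tagged Turtle string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def concept_to_turtle(concept_id, pref_label, alt_labels=(), broader=None, lang="en"):
    """Serialize one thesaurus term as a skos:Concept in Turtle."""
    parts = [
        f"ex:{concept_id} a skos:Concept",
        f"    skos:prefLabel {turtle_literal(pref_label, lang)}",
    ]
    for label in alt_labels:
        parts.append(f"    skos:altLabel {turtle_literal(label, lang)}")
    if broader is not None:
        parts.append(f"    skos:broader ex:{broader}")
    return " ;\n".join(parts) + " ."


def vocabulary_to_turtle(concepts):
    """Serialize a list of keyword-argument dicts as one Turtle document."""
    header = f"@prefix skos: <{SKOS_NS}> .\n@prefix ex: <{EX_NS}> .\n"
    body = "\n\n".join(concept_to_turtle(**c) for c in concepts)
    return header + "\n" + body + "\n"


if __name__ == "__main__":
    print(vocabulary_to_turtle([
        {"concept_id": "edo", "pref_label": "Edo (Japanese period)",
         "alt_labels": ["Tokugawa"], "broader": "japanese-periods"},
    ]))
```

Most of the work in practice lies in deciding how the existing thesaurus fields map onto preferred labels, alternative labels and broader/narrower links, not in the serialization itself.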

Technique: information extraction

MATISSE, Henri
Le bonheur de vivre (The Joy of Life)
1905-1906
Oil on canvas, 69 1/8 x 94 7/8 in. (175 x 241 cm)
Barnes Foundation, Merion, PA

Textual annotation mapped to vocabulary terms
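As a rough illustration of this mapping step, the Python sketch below splits the caption above into fields and looks up the creator and the medium words in a tiny term table. It assumes the five-line layout of this particular record; the vocabulary identifiers (`ulan:matisse-henri`, `aat:oil-paint`, `aat:canvas`) are hypothetical stand-ins for real ULAN and AAT entries, and real extraction would need far more robust parsing and disambiguation.

```python
import re

RECORD = """MATISSE, Henri
Le bonheur de vivre (The Joy of Life)
1905-1906
Oil on canvas, 69 1/8 x 94 7/8 in. (175 x 241 cm)
Barnes Foundation, Merion, PA"""

# Hypothetical lookup table: lowercased text -> vocabulary concept.
VOCABULARY = {
    "matisse, henri": "ulan:matisse-henri",
    "oil": "aat:oil-paint",
    "canvas": "aat:canvas",
}


def extract(record):
    """Split a five-line caption into fields and map terms to concepts."""
    creator, title, date, medium_line, location = [
        line.strip() for line in record.splitlines()
    ]
    years = [int(y) for y in re.findall(r"\d{4}", date)]
    medium = medium_line.split(",", 1)[0]  # text before the dimensions
    # Candidate terms: the creator string and each word of the medium.
    terms = [creator.lower()] + re.findall(r"[a-z]+", medium.lower())
    concepts = [VOCABULARY[t] for t in terms if t in VOCABULARY]
    return {
        "creator": creator,
        "title": title,
        "years": years,
        "medium": medium,
        "location": location,
        "concepts": concepts,
    }


if __name__ == "__main__":
    for field, value in extract(RECORD).items():
        print(f"{field}: {value}")
```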

Technique: enriching vocabularies

Technique: vocabulary alignment

• Find semantic links between vocabulary terms:
  – Derain (ULAN) related-to Fauve (AAT)
• Automatic techniques exist, but performance varies
• Often combination of automatic and manual alignment
• Effort strongly dependent on vocabularies
  – But “a little semantics goes a long way” (Hendler)
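One simple automatic technique is matching normalized labels and handing the candidates to a person for review. The Python sketch below shows only that baseline, under stated assumptions: the concept identifiers and the `svcn:meiji` entry are hypothetical, and label matching cannot find links such as "Derain related-to Fauve", which need manual work or other evidence.

```python
import re


def normalize(label):
    """Lowercase, drop parenthetical qualifiers, collapse whitespace."""
    label = re.sub(r"\([^)]*\)", "", label)
    return " ".join(label.lower().split())


def candidate_links(vocab_a, vocab_b):
    """Propose (concept_a, concept_b) pairs that share a normalized label."""
    index = {}
    for concept, labels in vocab_b.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(concept)
    links = set()
    for concept, labels in vocab_a.items():
        for label in labels:
            for other in index.get(normalize(label), ()):
                links.add((concept, other))
    return sorted(links)


# Hypothetical identifiers; the Edo labels follow the "Tokugawa" example.
AAT = {"aat:edo": ["Edo (Japanese period)", "Tokugawa"]}
SVCN = {"svcn:edo": ["Edo"], "svcn:meiji": ["Meiji"]}

if __name__ == "__main__":
    # Candidates still need manual review before they become alignment links.
    print(candidate_links(AAT, SVCN))
```

Keeping the output as candidates rather than accepted links reflects the combination of automatic and manual alignment noted in the bullets.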

W3C Semantic Web activities

http://www.w3.org/2001/sw/

• RDF/OWL: ontology representation
• SPARQL: query language
• RIF: rule language
• Health Care & Life Sciences group
• SW Education & Outreach
• SW Deployment group

W3C Semantic Web Deployment WG

SKOS: RDF pattern for thesaurus modeling

• Based on ISO standard
  – broader/narrower, related, multilinguality
• Documentation: http://www.w3.org/TR/swbp-skos-core-guide/

Semantic Web: Hope or Hype?

16 Nov 2006

Dave Beckett’s blog about ISWC 2006

http://www.oracle.com/technology/tech/semantic_technologies/index.html

http://esw.w3.org/topic/HCLSIG/Drug_Safety_and_Efficacy

http://www.geneontology.org/GO.downloads.ontology.shtml

et cetera: http://esw.w3.org/topic/CommercialProducts

http://web.resource.org/rss/1.0/spec

Take-home message

• Basic Semantic Web technology is ready for deployment

• Social barriers have to be overcome!
  – “Open door” policy
• Make sure you can connect to others and others can connect to you
  – “Don’t buy software which does not support standard open APIs”