Semantic Web Standards: Slides based on Ian Horrocks’s class

Date posted: 19-Dec-2015

Page 1

Semantic Web Standards

Slides based on Ian Horrocks’s class

Page 2

Where we are Today: the Syntactic Web

[Hendler & Miller 02]

Page 3

The Syntactic Web is…

• A hypermedia, a digital library
  – A library of documents (called web pages) interconnected by a hypermedia of links
• A database, an application platform
  – A common portal to applications accessible through web pages, and presenting their results as web pages
• A platform for multimedia
  – BBC Radio 4 anywhere in the world! Terminator 3 trailers!
• A naming scheme
  – Unique identity for those documents

A place where computers do the presentation (easy) and people do the linking and interpreting (hard).

Why not get computers to do more of the hard work?

[Goble 03]

Page 4

Hard Work using the Syntactic Web…

Find images of Peter Patel-Schneider, Frank van Harmelen and Alan Rector…

Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois

Page 5

Impossible (?) using the Syntactic Web…

• Complex queries involving background knowledge
  – Find information about “animals that use sonar but are not either bats or dolphins”, e.g., Barn Owl
• Locating information in data repositories
  – Travel enquiries
  – Prices of goods and services
  – Results of human genome experiments
• Finding and using “web services”
  – Visualise surface interactions between two proteins
• Delegating complex tasks to web “agents”
  – Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English

Page 6

What is the Problem?

• Consider a typical web page:
• Markup consists of:
  – rendering information (e.g., font size and colour)
  – Hyper-links to related content
• Semantic content is accessible to humans but not (easily) to computers…

Page 7

What information can we see…

WWW2002
The eleventh international world wide web conference
Sheraton waikiki hotel
Honolulu, hawaii, USA
7-11 may 2002
1 location 5 days learn interact

Registered participants coming from
australia, canada, chile, denmark, france, germany, ghana, hong kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire

Register now
On the 7th May Honolulu will provide the backdrop of the eleventh international world wide web conference. This prestigious event …

Speakers confirmed
Tim berners-lee
Tim is the well known inventor of the Web, …
Ian Foster
Ian is the pioneer of the Grid, the next generation internet …

Page 8

What information can a machine see…

Page 9

Solution: XML markup with “meaningful” tags?

<name> </name>
<location> </location>
<date> </date>
<slogan> </slogan>
<participants> </participants>
<introduction> </introduction>
<speaker> </speaker>
<bio> </bio>
…

Page 10

But What About…

<conf> </conf>
<place> </place>
<date> </date>
<slogan> </slogan>
<participants> </participants>
<introduction> </introduction>
<speaker> </speaker>
<bio> …
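The point of the two tag sets can be sketched in a few lines of Python. This is only an illustration: the two page strings are made-up stand-ins for the elided slide content, and `conference_name` is a hypothetical helper, not part of any standard.

```python
import xml.etree.ElementTree as ET

# Two hypothetical pages describing the same conference with different tag names.
PAGE_A = "<page><name>WWW2002</name><location>Honolulu</location></page>"
PAGE_B = "<page><conf>WWW2002</conf><place>Honolulu</place></page>"


def conference_name(xml_text):
    """Return the text of the <name> element, or None if the page has no such tag."""
    root = ET.fromstring(xml_text)
    element = root.find("name")
    return element.text if element is not None else None


print(conference_name(PAGE_A))  # WWW2002
print(conference_name(PAGE_B))  # None
```

A program written against the first vocabulary finds nothing in the second page, even though a human reader sees the same information in both.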

Page 11

Need to Add “Semantics”

• External agreement on meaning of annotations
  – E.g., Dublin Core
    • Agree on the meaning of a set of annotation tags
  – Problems with this approach
    • Inflexible
    • Limited number of things can be expressed
• Use Ontologies to specify meaning of annotations
  – Ontologies provide a vocabulary of terms
  – New terms can be formed by combining existing ones
  – Meaning (semantics) of such terms is formally specified
  – Can also specify relationships between terms in multiple ontologies

Page 12

History of the Semantic Web

• Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN
• TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:

  “... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.”

• TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web
  – E.g., article in May 2001 issue of Scientific American…

Page 13

Scientific American, May 2001:

Beware of the Hype

Page 14

Beware of the Hype

• Hype seems to suggest that Semantic Web means: “semantics + web = AI”
  – “A new form of Web content that is meaningful to computers will unleash a revolution of new abilities”
• More realistic to think of it as meaning: “semantics + web + AI = more useful web”
  – Realising the complete “vision” is too hard for now (probably)
  – But we can make a start by adding semantic annotation to web resources

Images from Christine Thompson and David Booth

Page 15

Web “Schema” Languages

• Existing Web languages extended to facilitate content description
  – XML → XML Schema (XMLS)
  – RDF → RDF Schema (RDFS)
• XMLS not an ontology language
  – Changes format of DTDs (document schemas) to be XML
  – Adds an extensible type hierarchy
    • Integers, Strings, etc.
    • Can define sub-types, e.g., positive integers
• RDFS is recognisable as an ontology language
  – Classes and properties
  – Sub/super-classes (and properties)
  – Range and domain (of properties)

Page 16

RDF and RDFS

• RDF stands for Resource Description Framework
• It is a W3C candidate recommendation (http://www.w3.org/RDF)
• RDF is a graphical formalism (+ XML syntax + semantics)
  – for representing metadata
  – for describing the semantics of information in a machine-accessible way
• RDFS extends RDF with “schema vocabulary”, e.g.:
  – Class, Property
  – type, subClassOf, subPropertyOf
  – range, domain

Page 17

The RDF Data Model

• Statements are <subject, predicate, object> triples:

    Ian --hasColleague--> Uli

• Can be represented using XML serialisation, e.g.: <Ian,hasColleague,Uli>
• Statements describe properties of resources
• A resource is a URI representing a (class of) object(s):
  – a document, a picture, a paragraph on the Web;
    – http://www.cs.man.ac.uk/index.html
  – a book in the library, a real person (?)
    – isbn://5031-4444-3333
  – …
• Properties themselves are also resources (URIs)
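As a rough sketch of this data model, a statement can be held as a three-field tuple and a set of statements queried by pattern. The `http://example.org/` URIs and the `match` helper below are assumptions made for illustration; they are not taken from the slides or from any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical URIs: subject, predicate and (here) object are all resources.
BASE = "http://example.org/"
triples = [
    Triple(BASE + "person/ian", BASE + "hasColleague", BASE + "person/uli"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


print(match(triples, predicate=BASE + "hasColleague"))
```

Note that the predicate is itself a URI, which mirrors the last bullet: properties are resources too.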

Page 18

URIs

• URI = Uniform Resource Identifier
• “The generic set of all names/addresses that are short strings that refer to resources”
• URIs may or may not be dereferenceable
  – URLs (Uniform Resource Locators) are a particular type of URI, used for resources that can be accessed on the WWW (e.g., web pages)
• In RDF, URIs typically look like “normal” URLs, often with fragment identifiers to point at specific parts of a document:
  – http://www.somedomain.com/some/path/to/file#fragmentID
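To make the fragment-identifier idea concrete, the short example below splits the URI from the slide into its document part and its fragment using Python's standard `urllib.parse.urldefrag`. It only illustrates the string structure; it says nothing about whether the URI is dereferenceable.

```python
from urllib.parse import urldefrag

uri = "http://www.somedomain.com/some/path/to/file#fragmentID"

# urldefrag returns a (url, fragment) pair.
document, fragment = urldefrag(uri)

print(document)  # http://www.somedomain.com/some/path/to/file
print(fragment)  # fragmentID
```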

Page 19

Linking Statements

• The subject of one statement can be the object of another
• Such collections of statements form a directed, labeled graph
• Note that the object of a triple can also be a “literal” (a string)
• Note also that RDF triples don’t by themselves give meaning
  – You know that (1) Ian and Carole are most likely colleagues (barring multiple jobs for Uli) and (2) (Uli hasColleague Ian) holds (“colleagueness”, unlike “love”, is symmetric). But DOES YOUR PROGRAM KNOW THIS?

    Ian --hasColleague--> Uli
    Carole --hasColleague--> Uli
    Uli --hasHomePage--> http://www.cs.mam.ac.uk/~sattler
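The question on this slide can be made concrete with a small sketch. The triples are the ones from the graph; the `SYMMETRIC` set and the `with_symmetry` helper are assumptions added for illustration, standing in for knowledge that the program has to be given explicitly.

```python
# The three statements from the graph, as (subject, predicate, object) tuples.
triples = {
    ("Ian", "hasColleague", "Uli"),
    ("Carole", "hasColleague", "Uli"),
    ("Uli", "hasHomePage", "http://www.cs.mam.ac.uk/~sattler"),
}

# Assumption: someone has to tell the program which predicates are symmetric.
SYMMETRIC = {"hasColleague"}


def with_symmetry(store, symmetric_predicates):
    """Add the reversed statement for every triple whose predicate is symmetric."""
    inferred = {(o, p, s) for (s, p, o) in store if p in symmetric_predicates}
    return store | inferred


query = ("Uli", "hasColleague", "Ian")
print(query in triples)                            # False
print(query in with_symmetry(triples, SYMMETRIC))  # True
```

Without the extra rule the program cannot answer the query, although the answer is obvious to a person.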

Page 20

RDF Syntax

• RDF has an XML syntax that has a specific meaning:
• Every Description element describes a resource
• Every attribute or nested element inside a Description is a property of that Resource with an associated object resource
• Resources are referred to using URIs

<Description about="some.uri/person/ian_horrocks">
  <hasColleague resource="some.uri/person/uli_sattler"/>
</Description>
<Description about="some.uri/person/uli_sattler">
  <hasHomePage>http://www.cs.mam.ac.uk/~sattler</hasHomePage>
</Description>
<Description about="some.uri/person/carole_goble">
  <hasColleague resource="some.uri/person/uli_sattler"/>
</Description>
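The reading rules above can be sketched as a small parser that turns the slide's simplified fragment back into triples. This is a sketch under assumptions: the slide's XML omits the `rdf:` namespace that real RDF/XML uses, so plain `xml.etree.ElementTree` is enough here, and the `<root>` wrapper and `to_triples` helper are hypothetical additions needed only because the fragment has several top-level elements.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """
<Description about="some.uri/person/ian_horrocks">
  <hasColleague resource="some.uri/person/uli_sattler"/>
</Description>
<Description about="some.uri/person/uli_sattler">
  <hasHomePage>http://www.cs.mam.ac.uk/~sattler</hasHomePage>
</Description>
<Description about="some.uri/person/carole_goble">
  <hasColleague resource="some.uri/person/uli_sattler"/>
</Description>
"""


def to_triples(fragment):
    """Read each Description as a subject and each child element as a property."""
    # Wrap the fragment so that it has a single root element and parses as XML.
    root = ET.fromstring("<root>" + fragment + "</root>")
    triples = []
    for description in root.findall("Description"):
        subject = description.get("about")
        for prop in description:
            # The object is either a referenced resource or literal element text.
            obj = prop.get("resource")
            if obj is None:
                obj = (prop.text or "").strip()
            triples.append((subject, prop.tag, obj))
    return triples


for triple in to_triples(FRAGMENT):
    print(triple)
```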

Page 21

RDF Schema (RDFS)

• RDF gives a formalism for metadata annotation, and a way to write it down in XML, but it does not give any special meaning to vocabulary such as subClassOf or type
  – Interpretation is an arbitrary binary relation
  – I.e., <Person,subClassOf,Animal> has no special meaning
• RDF Schema defines “schema vocabulary” that supports definition of ontologies
  – gives “extra meaning” to particular RDF predicates and resources (such as subClassOf)
  – this “extra meaning”, or semantics, specifies how a term should be interpreted
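One way to picture the “extra meaning” is as inference rules that a program applies to the triples. The sketch below hard-codes two rules in the spirit of RDFS (subClassOf is transitive, and type is inherited along subClassOf). The `LivingThing` class and the individual `Ian` are assumptions added for illustration, and the code is not a full RDFS reasoner.

```python
triples = {
    ("Person", "subClassOf", "Animal"),
    ("Animal", "subClassOf", "LivingThing"),  # hypothetical extra class
    ("Ian", "type", "Person"),                # hypothetical instance
}


def rdfs_closure(store):
    """Apply the two rules repeatedly until no new triples appear."""
    facts = set(store)
    while True:
        new = set()
        for (a, p1, b) in facts:
            for (c, p2, d) in facts:
                if b != c or p2 != "subClassOf":
                    continue
                if p1 == "subClassOf":
                    new.add((a, "subClassOf", d))  # transitivity of subClassOf
                elif p1 == "type":
                    new.add((a, "type", d))        # instances inherit superclasses
        if new <= facts:
            return facts
        facts |= new


closure = rdfs_closure(triples)
print(("Ian", "type", "Animal") in triples)  # False
print(("Ian", "type", "Animal") in closure)  # True
```

With plain RDF the triple <Person,subClassOf,Animal> licenses nothing; once the rules are in place it lets a program conclude that Ian is an Animal.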

Page 22

“Background Theory”

“Instances”

RDF Schema is really RDF background knowledge!

Page 23

RDF/RDFS vs. General Knowledge Rep & Reasoning

• We noted that RDF can be seen as “base level facts” and RDFS can be seen as “background theory/facts/rules”
• At this level, inference with RDF/RDFS seems to be just a special case of Knowledge Representation and Reasoning
• This is good (CSE471 Ahoy!) and bad (reasoning over most non-trivial logics is NP-hard or much, much worse).
• RDF/RDFS can be seen as an attempt to limit the complexity of reasoning by limiting what can be expressed
  – RDF/RDFS together can be seen as capturing a certain tractable subset of First Order Logic
  – ...already there is trouble in paradise, with people complaining that the expressiveness is not enough
• Enter OWL, which attempts to provide expressiveness equivalent to “description logics” (a sort of inheritance reasoning in first-order logic)

Page 24

Problems with RDFS

• RDFS too weak to describe resources in sufficient detail
  – No localised range and domain constraints
    • Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants
  – No existence/cardinality constraints
    • Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents
  – No transitive, inverse or symmetrical properties
    • Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical
  – …
• Difficult to provide reasoning support
  – No “native” reasoners for non-standard semantics
  – May be possible to reason via FO axiomatisation
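To illustrate what the missing property characteristics would buy, the sketch below computes by hand the statements that a transitive isPartOf and an inverse hasPart would entail. The body-part data and both helper functions are assumptions for illustration; nothing in RDFS itself lets an ontology declare these rules.

```python
def transitive_closure(pairs):
    """Return the smallest transitive relation containing the given pairs."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


def inverse(pairs):
    """Swap the two ends of every pair."""
    return {(b, a) for (a, b) in pairs}


# Hypothetical isPartOf statements.
is_part_of = {("finger", "hand"), ("hand", "arm")}

all_is_part_of = transitive_closure(is_part_of)
has_part = inverse(all_is_part_of)

print(("finger", "arm") in all_is_part_of)  # True
print(("arm", "finger") in has_part)        # True
```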

Page 25

RDF Schema is now being superseded by OWL

Page 26

Layer 4½: Mapping Between Ontologies

• Taxonomy Crisis:
  – How can your agent know that my “title” is your “name”?!
  – How can my agent know that some of your “address” objects are post-boxes, not physical addresses?!
  – How can my agent know that many Asian first names correspond to Western surnames?
• Semantic Web Solution: Services for translating/mapping between “related” ontologies.
• Suppose Amazon.com uses Dublin Core (“title”), while Fred Hanna uses its own document ontology (“name”). So far … my agent is forced to choose an ontology, or must be carefully crafted to understand both languages
• A better solution: A niche now exists for an independent entity (UniversalBookInfo.com) that maps “title” ↔ “name” etc.
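A mapping service of this kind can be pictured as a published lookup table that agents consult instead of hard-coding both vocabularies. Everything in the sketch below is hypothetical: the vocabulary keys, the `FIELD_MAPPINGS` table and the `translate` function only illustrate the “title” to “name” mapping mentioned above.

```python
# Hypothetical mapping table, of the kind a service such as
# UniversalBookInfo.com might publish.
FIELD_MAPPINGS = {
    ("dublin_core", "fred_hanna"): {"title": "name"},
}


def translate(record, source, target):
    """Rename the fields of a record from the source to the target vocabulary."""
    mapping = FIELD_MAPPINGS[(source, target)]
    # Fields without a mapping entry are passed through unchanged.
    return {mapping.get(field, field): value for field, value in record.items()}


amazon_record = {"title": "War & Peace"}
print(translate(amazon_record, "dublin_core", "fred_hanna"))
# {'name': 'War & Peace'}
```

The agent then needs to understand only one vocabulary plus the mapping, rather than every shop's ontology.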

Page 27

without UniversalBookInfo.com

[Figure labels: Nick wants to buy War & Peace; Nick’s very complicated agent; Amazon; Fred Hanna; Amazon ontology; Fred Hanna ontology; Programmer’s bank account (€€€)]

Page 28

with UniversalBookInfo.com

[Figure labels: Nick wants to buy War & Peace; Nick’s agent; Amazon; UniversalBookInfo.com; Fred Hanna; Joe’s agent; Jane’s agent; Bank account (€€€, € €)]

Page 29

(In)famous “Layer Cake”

[Figure: Semantic Web layer-cake diagram, annotated with the labels “Data Exchange”, “Semantics+reasoning” and “Relational Data?”, and with question marks against the remaining layers]

• Relationship between layers is not clear
• OWL DL extends “DL subset” of RDF

Page 30

Who will annotate the data?

• Semantic web works if the users annotate their pages using some existing ontology (or their own ontology, but with mapping to other ontologies)
  – But users typically do not conform to standards...
    • and are not patient enough for delayed gratification…
• Three Solutions
  – 1. Intercede in the way pages are created (act as if you are helping them write web-pages)
    • What if we change the MS Frontpage/Claris Homepage so that they (slyly) add annotations?
    • E.g. The Mangrove project at U. Wash.
      – Help user in tagging their data (allow graphical editing)
      – Provide instant gratification by running services that use the tags.
  – 2. Collaborative tagging!
    • “Folksonomies” (look at Wikipedia article)
      – Flickr, Technorati, del.icio.us etc.
  – 3. Automated information extraction (next topic)

Page 31

Folksonomies… the good

• Bottom-up approach to taxonomies/ontologies

– [In systems like] Furl, Flickr and Del.icio.us... people classify their pictures/bookmarks/web pages with tags (e.g. wedding), and then the most popular tags float to the top (e.g. Flickr's tags or Del.icio.us on the right)....

– [F]olksonomies can work well for certain kinds of information because they offer a small reward for using one of the popular categories (such as your photo appearing on a popular page). People who enjoy the social aspects of the system will gravitate to popular categories while still having the freedom to keep their own lists of tags.

Classic case of research playing catch-up with practice ;-)

Page 32

Works best when many people tag the same info…

Page 33

Folksonomies… the bad

• On the other hand, not hard to see a few reasons why a folksonomy would be less than ideal in a lot of cases:
  – None of the current implementations have synonym control (e.g. "selfportrait" and "me" are distinct Flickr tags, as are "mac" and "macintosh" on Del.icio.us).
  – Also, there's a certain lack of precision involved in using simple one-word tags--like which Lance are we talking about? (Though this is great for discovery, e.g. hot or Edmonton)
  – And, of course, there's no hierarchy and the content types (bookmarks, photos) are fairly simple.
• For indexing and library people, folksonomies are about as appealing as Wikipedia is to encyclopedia editors.
  – But.. there's some interesting stuff happening around them.

Computizing Eyeballs; (brain) cycle stealing
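The synonym-control problem is easy to see in a tag count. In the sketch below the tag list is invented, and the `SYNONYMS` table is a hand-made assumption of the kind a folksonomy does not provide on its own; `count_tags` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical tags as typed by different users.
tags = ["mac", "macintosh", "me", "selfportrait", "mac", "wedding"]

# Assumption: a hand-curated synonym table, which folksonomies lack.
SYNONYMS = {"macintosh": "mac", "me": "selfportrait"}


def count_tags(tag_list, synonyms=None):
    """Count tags, optionally folding each synonym into its canonical tag."""
    synonyms = synonyms or {}
    return Counter(synonyms.get(tag, tag) for tag in tag_list)


raw = count_tags(tags)
merged = count_tags(tags, SYNONYMS)

print(raw["mac"], raw["macintosh"])  # 2 1
print(merged["mac"])                 # 3
```

Without the table, the popularity of a concept is split across its spellings, so fewer tags "float to the top".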

Page 34

Collaborative Computing AKA Brain Cycle Stealing AKA Computizing Eyeballs

• A lot of exciting research related to web currently involves “co-opting” the masses to help with large-scale tasks
  – It is like “cycle stealing”, except we are stealing “human brain cycles” (the most idle of the computers if there is ever one ;-)
    • Remember the mice in the Hitchhiker’s Guide to the Galaxy? (..who were running a mass-scale experiment on the humans to figure out the question..)
  – Collaborative knowledge compilation (Wikipedia!)
  – Collaborative curation
  – Collaborative tagging
• Many big open issues
  – How do you pose the problem such that it can be solved using collaborative computing?
  – How do you “incentivize” people into letting you steal their brain cycles?