Semantic Web Towards a Web of Knowledge - Introduction Spring 2006 Computer Engineering Department...

56
Semantic Web Towards a Web of Knowledge - Introduction Spring 2006 Computer Engineering Department Sharif University of Technology

Transcript of Semantic Web Towards a Web of Knowledge - Introduction Spring 2006 Computer Engineering Department...

Semantic WebTowards a Web of Knowledge -

Introduction

Spring 2006

Computer Engineering Department

Sharif University of Technology

Semantic web - Computer Engineering Dept. - Spring 2006

2

Outline

• Introduction– From Original Web to Semantic Web

• Building Blocks– URI– XML– RDF– Ontology– Agent

• History of SW• SW Architecture• Motivation for the SW• Knowledge Representation

Semantic web - Computer Engineering Dept. - Spring 2006

3

The Original Web

• Web of relationships amongst named objects

Information Management: A Proposal, Tim Berners-Lee, CERN, March 1989, May 1990, http://www.w3.org/History/1989/proposal.html

Semantic web - Computer Engineering Dept. - Spring 2006

4

Current Web: the Syntactic Web

Semantic web - Computer Engineering Dept. - Spring 2006

5

The Syntactic Web is…

• A place where computers do the presentation (easy) and people do the linking and interpreting (hard).

• Why not get computers to do more of the hard work?

• “The bane of my existence is doing things thatI know the computer could do for me.”—Dan Connolly, “The XML Revolution”

Semantic web - Computer Engineering Dept. - Spring 2006

6

Hard Work using the Syntactic Web…

Find images

Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois

Semantic web - Computer Engineering Dept. - Spring 2006

7

Impossible using the Syntactic Web…

• Complex queries involving background knowledge– Find information about “animals that use sonar but

are not either bats or dolphins”• Locating information in data repositories

– Travel enquiries– Prices of goods and services– Results of human genome experiments

• Finding and using “web services”• Delegating complex tasks to web “agents”

– Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English

Semantic web - Computer Engineering Dept. - Spring 2006

8

What is the Problem?

• Consider a typical web page:

• Markup consists of: – rendering

information (e.g., font size and colour)

– Hyper-links to related content

• Semantic content is accessible to humans but not (easily) to computers…

Semantic web - Computer Engineering Dept. - Spring 2006

9

What information can we see…

WWW2002The eleventh international world wide web conferenceSheraton waikiki hotelHonolulu, hawaii, USA7-11 may 20021 location 5 days learn interactRegistered participants coming fromaustralia, canada, chile denmark, france, germany, ghana, hong

kong, india, ireland, italy, japan, malta, new zealand, the netherlands, norway, singapore, switzerland, the united kingdom, the united states, vietnam, zaire

Register nowOn the 7th May Honolulu will provide the backdrop of the eleventh

international world wide web conference. This prestigious event …

Speakers confirmedTim berners-lee Tim is the well known inventor of the Web, …Ian FosterIan is the pioneer of the Grid, the next generation internet …

Semantic web - Computer Engineering Dept. - Spring 2006

10

What information can a machine see…

Semantic web - Computer Engineering Dept. - Spring 2006

11

Solution: XML markup with “meaningful” tags?

<name> </name><location> </location>

<date> </date>

<slogan> </slogan>

<participants>

</participants>

<introduction>

</introduction>

<speaker> </speaker>

<bio> </bio>…

Semantic web - Computer Engineering Dept. - Spring 2006

13

Still the Machine only sees…

< > </ >< > </ >

< > </ >< > </ >< >

</ >

< >

</ >< > </ >< > </ >< > </ >< > </ >

Semantic web - Computer Engineering Dept. - Spring 2006

14

Need to Add “Semantics”

• External agreement on meaning of annotations– E.g., Dublin Core for annotation of library/bibliographic information

• Agree on the meaning of a set of annotation tags– Problems with this approach

• Inflexible• Limited number of things can be expressed

• Use Ontologies to specify meaning of annotations– Ontologies provide a vocabulary of terms– New terms can be formed by combining existing ones

• “Conceptual Lego”– Meaning (semantics) of such terms is formally specified– Can also specify relationships between terms in multiple

ontologies

Semantic web - Computer Engineering Dept. - Spring 2006

15

The Building Blocks

1. URI

2. XML

3. RDF

4. Ontologies

5. Agents

Semantic web - Computer Engineering Dept. - Spring 2006

16

URIs

• URI = Uniform Resource Identifier

• "The generic set of all names/addresses that are short strings that refer to resources"

• URLs (Uniform Resource Locators) are a particular type of URI, used for resources that can be accessed on the WWW.

Semantic web - Computer Engineering Dept. - Spring 2006

17

XML

“XML allows users to add arbitrary structure to their documents but says nothing about what the structures mean”

Semantic web - Computer Engineering Dept. - Spring 2006

18

RDF

o Meaning encoded in sets of ‘triples’: entities have properties which have values

o Entities, properties and values all have distinct URIs

Ganji CEstudentOF

Sharif http://ce.sharif.edu

departmentOF hasHomePage

Semantic web - Computer Engineering Dept. - Spring 2006

19

Example

We are looking for a woman who works for one of my client and her son studies at University of Rovaniemi

Semantic web - Computer Engineering Dept. - Spring 2006

20

Ontologies

o Database A and Database B may use different fields to contain ‘zip code’ (e.g. zip code and postal code)

o Ontologies sort this outo Ontology = ‘a document or file that formally

defines the relations among terms’o Ontologies for the web normally have

o A taxonomyo A set of inference rules

o e.g. “If an animal eats a food, and this food is expensive, then that animal is not suitable for poor people.”

Semantic web - Computer Engineering Dept. - Spring 2006

21

Agents

o Programs whicho collect Web content from various sources

o process the information

o exchange results with other

o effectiveness increases exponentially with number of available agents

Semantic web - Computer Engineering Dept. - Spring 2006

22

Ambient Intelligence (Pervasive Computing)

“In the next step, the Semantic Web will break out of the virtual realm and extend into our physical world. URIs can point to anything, including physical entities, which means we can use the RDF language to describe devices such as cell phones and TVs.”Berners-Lee, T, Hendler, J & Lassila, O ‘The semantic web’, Scientific American, May 2001

Semantic web - Computer Engineering Dept. - Spring 2006

23

History of the Semantic Web

• Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN

• TBL’s original vision of the Web was much more ambitious than the reality of the existing (syntactic) Web:

• TBL (and others) have since been working towards realising this vision, which has become known as the Semantic Web– E.g., article in May 2001 issue of Scientific American…

“... a goal of the Web was that, if the interaction between person and hypertext could be so intuitive that the machine-readable information space gave an accurate representation of the state of people's thoughts, interactions, and work patterns, then machine analysis could become a very powerful management tool, seeing patterns in our work and facilitating our working together through the typical problems which beset the management of large organizations.”

Semantic web - Computer Engineering Dept. - Spring 2006

24

• Realising the complete “vision” is too hard for now (probably)• But we can make a start by adding semantic annotation to web

resources

Scientific American, May 2001:

Semantic web - Computer Engineering Dept. - Spring 2006

25

Semantic Web Architecture

Youarehere

Semantic web - Computer Engineering Dept. - Spring 2006

26

Web of Trust

Semantic web - Computer Engineering Dept. - Spring 2006

27

An example scenario (1)(Originally from Berners Lee, Scientific American paper)

• The entertainment system was belting out the Beatles’ “We Can Work It Out” when the phone rang

• When Pete answered, his phone turned the sound down by sending a message to all the other local devices that had a volume control

• His sister, Lucy, was on the line from the doctor’s office: “Mom needs to see a specialist and then has to have a series of physical therapy sessions. Biweekly or something. I’m going to have my agent set up the appointments.”

• Pete immediately agreed to share the work

Semantic web - Computer Engineering Dept. - Spring 2006

28

An example scenario (2)

• At the doctor’s office, Lucy instructed her semantic web agent through her handheld web browser

• The agent promptly retrieved information about Mom’s prescribed treatment from the doctor’s agent, looked up several lists of providers, and checked for the ones in plan for Mom’s insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services

• It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their web sites) and Pete’s and Lucy’s busy schedules

Semantic web - Computer Engineering Dept. - Spring 2006

29

An example scenario (3)• In a few minutes the agent presented them with

a plan• Pete didn’t like it – University hospital was all the

way across the town from Mom’s place, and he’d be driving back in the middle of rush hour

• He set his own agent to redo the search with stricter preferences about location and time

• Lucy’s agent having complete trust in Pete’s agent in the context of the present task, automatically assisted by supplying access certificates and shortcuts to the data it had already been sorted through

Semantic web - Computer Engineering Dept. - Spring 2006

30

An example scenario (4)

• Almost instantly the new plan was presented• A much closer clinic and earlier times – but there were

two warning notes• First, Pete would have to reschedule a couple of his less

important appointments. He checked what they were – not a problem

• The other was something about insurance company’s list failing to include this provider under physical therapists: “Service type and insurance plan status securely verified by other means, “the agent reassured him.”

Semantic web - Computer Engineering Dept. - Spring 2006

31

An example scenario (5)

• Lucy registered her assent at about same moment Pete was muttering, “Spare me the details,” and it was all set.

• (Of course, Pete couldn’t resist the details and later that night had his agent explain how it had found that provider even though it wasn’t on the proper list.)

Semantic web - Computer Engineering Dept. - Spring 2006

32

MomPhysician’s Agent

Lucy’s Agent

requiredtreatment

Schedule appointment

Insurance Co.

Provider sites

Rating

in-plan?close-by?

Specialist?

Pete’ Agent

Driving schedule

Semantic web - Computer Engineering Dept. - Spring 2006

33

Expressed Meaning (1)

• Pete and Lucy could use their agents to carry out all these tasks thanks not to the World Wide Web of today but rather the Semantic Web that it will evolve into tomorrow

• Most of the Web’s content today is designed for humans to read, not for computer programs to manipulate meaningfully

• Computers can adeptly parse Web pages for layout and routine processing – here a header, there a link to another page – but in general computers have no reliable way to process the semantics:– This is the home page of the Hartman and Strauss

Physio Clinic– This link goes to Dr. Hartman’s curriculum vitae

Semantic web - Computer Engineering Dept. - Spring 2006

34

Expressed Meaning (2)

• The semantic web will bring structure to the meaningful content of web pages– Creating an environment where software agents roaming from

page to page can readily carry out sophisticated tasks for users

• Such an agent coming to the clinic’s web page will know not just that– The page has keywords such as “treatment, medicine, physical,

therapy” (as might be encoded today) but also that– Dr. Hartman works at this clinic on Mondays, Wedmesdays and

Fridays and that– The script takes a yyyy-mm-dd format and returns appointment

times

Semantic web - Computer Engineering Dept. - Spring 2006

35

Expressed Meaning (3)

• And it will “know” all this without needing AI on the scale of 2001’s Hal or Star War’s C-3PO.

• Instead these semantics were encoded into Web page when the clinic’s office manager (who never took computer courses) massaged it into shape using off-the-shelf software for writing– Semantic Web pages along with– Resources listed on the Physical Therapy Association’s

site

Semantic web - Computer Engineering Dept. - Spring 2006

36

Expressed Meaning (4)

• The semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation

• The first steps in weaving the semantic web into the structure of the existing Web are already under way

• In the near future, these developments will usher in significant new functionality as machines become much better able to process and “understand” the data that they merely display at present.

Semantic web - Computer Engineering Dept. - Spring 2006

37

Expressed Meaning (5)

• The essential property of the World Wide Web is its university

• The power of a hypertext link is that “anything can link to anything”

• One important observation is the difference between– Information produced primarily for human consumption and that– Produced mainly for machines

• At one end of the scale we have everything from five-second TV commercial to poetry

• At the other end we have databases, programs and sensor output

Semantic web - Computer Engineering Dept. - Spring 2006

38

Expressed Meaning (6)

• To date, the Web has developed most rapildy as a medium of documents for– People rather than for– Data and information that can be processed

automatically

• The Semantic Web aims to make up for this

• Like the internet, the Semantic Web will be as decentralized as possible

Semantic web - Computer Engineering Dept. - Spring 2006

39

Expressed Meaning (7)

• Such Web-like systems generate a lot of excitement at every level– From major corporation to individual user, and – Provide benefits that are hard or impossible to predict in

advance

• Decentralization requires compromises the Web had to throw away the ideal of total consistency of all of its interconnections ushering in the infamous message “Error 404 Not Found” but allowing unchecked exponential growth

Semantic web - Computer Engineering Dept. - Spring 2006

40

Knowledge Representation (1)

• For the semantic web to function, computers must have access to– Structured collections of information and– Sets of inference rules

• They can use these to conduct automated reasoning

• AI researchers have studied such systems since long before the Web was developed

Semantic web - Computer Engineering Dept. - Spring 2006

41

Knowledge Representation (2)

• Knowledge representation is currently in a state comparable to that of hypertext before the advent of the Web– It is clearly a good idea, and some very nice

demonstrations exist– But it has not yet changed the world

• It contains the seeds of important applications, but to realize its full potential it must be linked into single global system:– The Semantic Web

Semantic web - Computer Engineering Dept. - Spring 2006

42

Knowledge Representation (3)

• Traditionally knowledge-representation systems typically been centralized

• Requiring everyone to share exactly the same definition of common concepts such as “parent” or “vehicle”

• But central control is stifling, and increasing the size and the scope of such a system rapidly becomes unmanageable

Semantic web - Computer Engineering Dept. - Spring 2006

43

Knowledge Representation (4)

• Knowledge representation systems usually carefully limit the questions that can be asked so that the computer can answer reliably – or answer at all

• The problem is reminiscent of Godel’s theorem from mathematics:– Any system that is complex enough to be useful also

encompasses unanswerable questions– Much like sophisticated versions of the basic paradox

“This sentence is false”

Semantic web - Computer Engineering Dept. - Spring 2006

44

Knowledge Representation (5)

• To avoid such problems, traditional knowledge representation systems generally have their own set of rules for making inferences about their data

• For example, a genealogy system, acting on databases of family trees, might include the rule “a wife of an uncle is an aunt”

• Even if the data could be transferred from one system to another, the rules, existing in a completely different form, usually could not

Semantic web - Computer Engineering Dept. - Spring 2006

45

Knowledge Representation (6)

• Semantic Web researchers, in contrast, accept that paradoxes and unanswerable questions are a price that must be paid to achieve versatility

• They make the language for the rules as expressive as needed to allow the web to reason as widely as desired

• This philosophy is similar to that of the conventional Web:– Early in the Web’s development detractors pointed out that it could

never be a well-organized library– Without a central database and tree structure, one would never be

sure of finding everything– They were right

Semantic web - Computer Engineering Dept. - Spring 2006

46

Knowledge Representation (7)

• The expressive power of the Web made vast amounts of information available

• Search engines now produce remarkably complete indices of a lot of the material out there– Which would have seemed quite impractical a decade ago

• The challenge of the Semantic Web, therefore, is to provide a language that expresses both data and rules for reasoning about the data and that allows rules from any existing knowledge representation system to be exported onto the Web

Semantic web - Computer Engineering Dept. - Spring 2006

47

Knowledge Representation (8)

• Adding logic to the Web that means to use rules to make– Inferences,– Choose courses of action and– Answer questions

• This are tasks before the Semantic Web community at the moment

• A mixture of mathematical and engineering decisions complicate this task

Semantic web - Computer Engineering Dept. - Spring 2006

48

Knowledge Representation (9)

• The logic must be powerful enough– To describe complex properties of objects– But not so powerful that agents can be tricked by being

asked to consider a paradox

• Fortunately, a large majority of the information we want to express is along the lines of– “a hex-head bolt is a type of machine bolt”– This is really written in existing languages with a little

extra vocabulary

Semantic web - Computer Engineering Dept. - Spring 2006

49

Knowledge Representation (10)

• Two important technologies for developing the Semantic Web already in place– eXtensible Markup Language (XML) and– The Resource Description Framework (RDF)

• XML lets everyone create their own tags– Hidden labels that annotate Web pages or sections of text on a

page– Scripts, or programs, can use of these tags in sophisticated ways,

but the script writer has to know what the page writer uses each tag for

– XML allows user to add arbitrary structure to their documents but says nothing about what the structures mean

Semantic web - Computer Engineering Dept. - Spring 2006

50

Knowledge Representation (11)

• Meaning is expressed by RDF– Which encodes it in sets of triples– Each triple being rather like the subject, verb and object of an

elementary sentence

• These triples can be written using XML tags• In RDF, a document makes assertions that

– Particular things (people, Web pages or whatever) have– Properties (such as “is a sister of”, “is the author of”) with– Certain values (another person, another Web page)

Semantic web - Computer Engineering Dept. - Spring 2006

51

Knowledge Representation (12)

• This structure turns out to be a natural way to describe the vast majority of the data processed by machines

• Subject and object are each identified by a Universal Resource Identifier (URI)– Just as used in link on a Web page– (URLs, Uniform Resource Locators, are the most common type of

URI)– The verbs are also identified by URIs– This allows anyone to define

• A new concept• A new verb

– Just be defining a URI for it somewhere on the Web

Semantic web - Computer Engineering Dept. - Spring 2006

52

Knowledge Representation (15)

• Human language thrives when using same term to mean somewhat different things, but automation does not

• Imagine that a clown messenger service is hired to deliver balloons to customers on their birthdays

• Unfortunately, the service transfers the addresses from the customer’s database to its own database,

• It does not know that the “addresses” in the customer’s database are where bills are sent and that many of them are post office boxes

Semantic web - Computer Engineering Dept. - Spring 2006

53

Knowledge Representation (14)

• The hired clowns end up entertaining a number of postal workers– Not necessarily a bad thing but certainly not the intended

effect

• Using a different URI for each specific concept solves that problem

• An address that is a mailing address can be distinguished from one that is a street address, and both can be distinguished from an address that is a speech

Semantic web - Computer Engineering Dept. - Spring 2006

54

Knowledge Representation (15)

• The triples of RDF form Webs of information about related things

• RDF uses URIs to encode this information in a document

• URIs ensure that concepts are not just words in a document but are tied to a unique definition that everyone can find on the Web

Semantic web - Computer Engineering Dept. - Spring 2006

55

Knowledge Representation (16)

• For example, imagine that we have access to a variety of databases with information about people, including their addresses

• If we want to find people living in a specified zip code, we need to know which fields in each database represent– Names and which represent– Zip codes

• RDF can specify that “(field 5 in database A) (is a filed of type) (zip code),” using URIs rather than phrases for each term

Semantic web - Computer Engineering Dept. - Spring 2006

56

Knowledge Representation (17)

• RDF has a simple data structure which can be used to express vast amount of knowledge on the web

• However, it lacks enough power to express higher level of knowledge:– For example it is not possible to build hierarchy of concepts in

RDF• RDF-S (RDF Schema) adds the ability of building

hierarchies to RDF• A layer above RDF-S is named as OWL which extends

RDF and RDF-S with amongst other things, the ability to have restrictions on concepts participating in relations

• In this course we will talk about XML, RDF, RDF-S and OWL

Semantic web - Computer Engineering Dept. - Spring 2006

57

References

• Semantic web official home page:– www.w3.org/2001/sw/

• Previous term course page:– http://ce.sharif.edu/courses/83-84/2/ce926/

• T. Berners-Lee et al., The Semantic Web, Scientific American, 2001

• Manchester university

• Montreal university