RDF by Structured Reference to Semantics, the RS2 framework

Post on 07-Nov-2014


Current standard web documents are designed to be presented to humans; machines cannot interpret the information they contain. The Semantic Web is organized in a structured way so that it is meaningful to both machines and humans. In this presentation, we suggest a framework that processes web documents and produces a machine-readable format in RDF (Resource Description Framework) in combination with OWL (Web Ontology Language). Our suggested framework, which we call RS2 (RDF by Structured Reference to Semantics), takes an HTML document as input and extracts the plain text from it. The natural-language content of the plain text is then parsed to yield the subject, predicate and object of each sentence. These are looked up in the ontology to generate an RDF graph, which is the machine-intelligible semantic equivalent of the original human-readable text.


An Approach to Emerge Semantic Web

Khan Muhammad Nafee Mostafa | 0507007
Samiul Hoque Sourav | 0507035
Qudrat-E-Alahy Ratul | 0507037

Khulna University of Engineering & Technology
Department of Computer Science and Engineering

Supervised by | Rushdi Shams | Lecturer, CSE, KUET

Introduction

• Web » a horde of valuable but unorganized and scattered documents
• A web of documents is not intelligible to machines, but a web of linked data is
• Semantic Web » web of data
• RDF graph » semantics underlying the document
• Bottleneck of the emerging Semantic Web » conversion of HTML to RDF

Objective

• Generate RDF from an HTML document
• Suggest a framework titled ‘RDF by Structured Reference to Semantics’, or the RS2 framework, to do so

Overview: Web Versions

Web 1.0
• Static Web

Web 2.0
• Social Web
• User-generated content
• People consume, produce & share information

Web 3.0
• Semantic Web
• Collection of machine-understandable data and documents

Example

Example

Why Semantic Web

QUERY: List of Bangladeshi players who played in the IPL

Why Semantic Web

Semantic Web Stack

[Diagram labels: APPLICATION; DATA IN ABSTRACT FORMAT; DATA IN VARIOUS FORMATS; URI; XML; SYNTAX; QUERY; MAP]

Why RDF?

• A simple RDF graph tells which instance belongs to which class.
• It also tells what the relation between two instances is.
• Example: Mashrafe plays for Kolkata Knight Riders. Mashrafe’s nationality is Bangladeshi.

RDF graph:
Mashrafe --instance of--> Player
Kolkata Knight Riders --instance of--> Cricket Team
Bangladesh --instance of--> Country
Mashrafe --play for--> Kolkata Knight Riders
Mashrafe --nationality--> Bangladesh
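The graph on this slide can be written down as a set of subject-predicate-object triples. The following is a minimal sketch in plain Python, not the authors' implementation; the helper names `objects` and `instances_of` are assumptions made for illustration.

```python
# Each RDF statement is one (subject, predicate, object) triple.
GRAPH = {
    ("Mashrafe", "instance of", "Player"),
    ("Kolkata Knight Riders", "instance of", "Cricket Team"),
    ("Bangladesh", "instance of", "Country"),
    ("Mashrafe", "play for", "Kolkata Knight Riders"),
    ("Mashrafe", "nationality", "Bangladesh"),
}


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def instances_of(graph, cls):
    """Return every subject declared as an instance of class `cls`, sorted."""
    return sorted(s for s, p, o in graph if p == "instance of" and o == cls)


print(objects(GRAPH, "Mashrafe", "play for"))
print(instances_of(GRAPH, "Player"))
```

Because both relations and class membership are stored as triples, a query such as "which team does a player from Bangladesh play for" reduces to filtering the same set twice.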

Architecture of RS2 framework

1. Extract plaintext » plaintext
2. Parse natural language text » parse tree
3. Yield SPO » Subject, Predicate, Object
4. Look up semantically equivalent Semantic Web entities for the SPO
5. Generate RDF

RS2 framework in action

[Diagram: RS2 framework; labels: lexicon, mapper, ontology]

HTML to plaintext

HTML tags carry no meaningful information

Strip them

Get the text that we actually read
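One way to strip the tags is Python's standard `html.parser` module. This is a sketch of the step, not the tool used in the presentation; the class name `TextExtractor` and the sample markup are assumptions, and skipping `<script>` and `<style>` content is an added design choice.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the content of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def html_to_plaintext(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse the runs of whitespace left behind by the removed markup.
    return " ".join(" ".join(parser.parts).split())


sample = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Squad</h1><p>Mashrafe plays for KKR.</p></body></html>"
)
print(html_to_plaintext(sample))
```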

Parse sentence

Separate sentences

Separate words

POS tagging

parse with a grammar

parse tree

Example: Mashrafe plays for KKR

<noun/> <verb/> <preposition/> <name/>

Parse tree:

sentence
    Noun Phrase
        Mashrafe
    Verb Phrase
        Verb Phrase
            Verb
                plays
            Preposition
                for
        Noun
            KKR
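The steps on this slide (separate sentences, separate words, POS tagging, parse with a grammar) can be sketched as follows. The tiny `LEXICON` and the one-rule grammar are assumptions that cover only the example sentence; a real system would use a trained tagger and a full grammar.

```python
import re

# Hypothetical toy lexicon: word (lower-cased) -> part-of-speech tag.
LEXICON = {
    "mashrafe": "Noun",
    "kkr": "Noun",
    "plays": "Verb",
    "for": "Preposition",
}


def split_sentences(text):
    """Split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def split_words(sentence):
    return re.findall(r"[A-Za-z0-9]+", sentence)


def pos_tag(words):
    return [(w, LEXICON.get(w.lower(), "Unknown")) for w in words]


def parse(tagged):
    """Toy grammar: S -> NP VP, VP -> VP' NP, VP' -> Verb [Preposition]."""
    pos = 0

    def take(tag):
        # Consume the next word if it carries the expected tag.
        nonlocal pos
        if pos < len(tagged) and tagged[pos][1] == tag:
            node = (tag, tagged[pos][0])
            pos += 1
            return node
        return None

    subject, verb = take("Noun"), take("Verb")
    prep = take("Preposition")
    obj = take("Noun")
    if None in (subject, verb, obj) or pos != len(tagged):
        raise ValueError("sentence does not match the toy grammar")
    inner_vp = ("VP", verb) + ((prep,) if prep else ())
    return ("S", ("NP", subject), ("VP", inner_vp, ("NP", obj)))


for sentence in split_sentences("Mashrafe plays for KKR."):
    print(parse(pos_tag(split_words(sentence))))
```

The nested tuples mirror the parse tree above: each node is a label followed by its children.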

Yield Subject-Predicate-Object

• Subject » first Noun Phrase traversed in the parse tree
• Predicate » part of the Verb Phrase that describes the predicate, generally the part before the NP in the VP
• Object » Noun Phrase in the parse tree of the first Verb Phrase
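The three rules above can be sketched as a traversal over a parse tree stored as nested tuples. The tree literal, the labels "NP" and "VP", and the function names are assumptions for illustration; the rules themselves are the ones stated on the slide.

```python
# Parse tree for "Mashrafe plays for KKR" as (label, child, ...) tuples.
TREE = (
    "S",
    ("NP", ("Noun", "Mashrafe")),
    ("VP",
     ("VP", ("Verb", "plays"), ("Preposition", "for")),
     ("NP", ("Noun", "KKR"))),
)


def leaves(node):
    """Words under a node, left to right."""
    children = node[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def first(node, label):
    """First subtree with the given label in pre-order, or None."""
    if node[0] == label:
        return node
    for child in node[1:]:
        if isinstance(child, tuple):
            found = first(child, label)
            if found is not None:
                return found
    return None


def yield_spo(tree):
    subject_np = first(tree, "NP")      # first NP traversed in the tree
    vp = first(tree, "VP")              # first (outermost) verb phrase
    if subject_np is None or vp is None:
        raise ValueError("tree has no NP or no VP")
    predicate_words, object_np = [], None
    for child in vp[1:]:
        if child[0] == "NP":            # the NP inside the VP is the object
            object_np = child
            break
        predicate_words.extend(leaves(child))  # everything before that NP
    if object_np is None:
        raise ValueError("verb phrase has no object NP")
    return (
        " ".join(leaves(subject_np)),
        " ".join(predicate_words),
        " ".join(leaves(object_np)),
    )


print(yield_spo(TREE))
```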

Lookup semantic web entities

KKR is located in Kolkata.

Kolkata Knight Riders is situated at West Bengal.

“I think KKR and Kolkata Knight Riders are different.”

The same anomaly occurs for the predicate and the object.

Lookup semantic web entities

• A natural-language subject, predicate and object are not recognizable by the machine.
• Convert them to a machine-accessible form.

Natural Language:
KKR is located in Kolkata.
Kolkata Knight Riders is situated at West Bengal.

RDF Triple:
Subject » Kolkata Knight Riders
Predicate » location
Object » Kolkata
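A simple way to picture the lookup is a lexicon that maps every surface form to one ontology name. The two dictionaries below are hypothetical stand-ins for the lexicon and ontology, filled only with the phrases from these slides; this is a sketch, not the mapper used in RS2.

```python
# Hypothetical lexicon: surface form (lower-cased) -> ontology entity.
ENTITY_LEXICON = {
    "kkr": "Kolkata Knight Riders",
    "kolkata knight rider": "Kolkata Knight Riders",
    "kolkata knight riders": "Kolkata Knight Riders",
    "kolkata": "Kolkata",
    "west bengal": "West Bengal",
}

# Hypothetical lexicon: verb phrase (lower-cased) -> ontology property.
PROPERTY_LEXICON = {
    "is located in": "location",
    "is situated at": "location",
    "plays for": "play for",
}


def normalize(phrase):
    """Lower-case and collapse whitespace so variants compare equal."""
    return " ".join(phrase.lower().split())


def lookup(spo):
    """Map a natural-language (S, P, O) to its ontology equivalents."""
    subject, predicate, obj = spo
    try:
        return (
            ENTITY_LEXICON[normalize(subject)],
            PROPERTY_LEXICON[normalize(predicate)],
            ENTITY_LEXICON[normalize(obj)],
        )
    except KeyError as missing:
        raise LookupError(f"no semantic equivalent for {missing}") from None


print(lookup(("KKR", "is located in", "Kolkata")))
print(lookup(("Kolkata Knight Rider", "is situated at", "West Bengal")))
```

Both sentences now share the same subject and predicate, so the machine no longer treats KKR and the full team name as different things.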

Generate RDF

Generate URI

Check Property Type in OWL

Lookup domain and range of the property in OWL

Create Instances if not created before in RDF Graph

Create the Triple with Subject, Predicate, Object in RDF Graph
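The five steps above can be sketched in plain Python. The `BASE` namespace, the `ONTOLOGY` dictionary (a stand-in for property declarations read from an OWL file) and the class names in it are all assumptions; only the `rdf:type` URI is a real W3C identifier. The sketch handles object properties only.

```python
from urllib.parse import quote

BASE = "http://example.org/rs2/"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical stand-in for property declarations in an OWL ontology.
ONTOLOGY = {
    "location": {"type": "ObjectProperty", "domain": "CricketTeam", "range": "City"},
    "play for": {"type": "ObjectProperty", "domain": "Player", "range": "CricketTeam"},
}


def make_uri(name):
    """Step 1: generate a URI from an entity or property name."""
    return BASE + quote(name.replace(" ", "_"))


def add_statement(graph, subject, predicate, obj):
    prop = ONTOLOGY[predicate]
    # Step 2: check the property type.
    if prop["type"] != "ObjectProperty":
        raise NotImplementedError("this sketch only covers object properties")
    s_uri, p_uri, o_uri = make_uri(subject), make_uri(predicate), make_uri(obj)
    # Steps 3 and 4: type the instances by the property's domain and range.
    # The graph is a set, so an instance created earlier is not duplicated.
    graph.add((s_uri, RDF_TYPE, make_uri(prop["domain"])))
    graph.add((o_uri, RDF_TYPE, make_uri(prop["range"])))
    # Step 5: create the triple itself.
    graph.add((s_uri, p_uri, o_uri))
    return graph


def to_ntriples(graph):
    """Serialize the graph as one N-Triples line per statement."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(graph))


graph = add_statement(set(), "Kolkata Knight Riders", "location", "Kolkata")
print(to_ntriples(graph))
```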

Web 3.0: Advantages

• Playing songs on the basis of user feedback
• Tag-based applications

Web 3.0: Advantages(2)

• Automatic air ticket reservation
• Automatic data integration
• Digital library
• Semantic Web services
• Searching

Demo Application

OWL: English Premier League

Topic: Chelsea Football Club

Demo Application

Conversion from HTML to RDF

Demo Application

Future Work and Benefits

• An application/framework to enhance the web ontology with knowledge acquired from HTML documents
• Applications with Semantic Web features

Benefit:
• Emergence of the Semantic Web
• Automatic conversion of piles of HTML into RDF graphs

Conclusion

• A framework and a prototype application to convert HTML documents into RDF
• Eliminate the bottleneck in the emergence of the Semantic Web with RS2

Thank you
