Search Engines After The Semantic Web

Search Engines After The Semantic Web Presented by Samar Hamed Damascus University


Transcript of Search Engines After The Semantic Web

Page 1: Search Engines After The Semantic Web

Search Engines After The Semantic Web

Presented by Samar Hamed

Damascus University

Page 2: Search Engines After The Semantic Web

Agenda

• Basic Semantic Web Principles

• Falcons Semantic Search Engine

• Search Engine Giants Experience (Google, Yahoo, Microsoft)

• Kngine New Promising Search Engine

• Summary

• References

Page 3: Search Engines After The Semantic Web

Web evolving

• The Semantic Web is also known as Web 3.0, the Web of Things, or the Web of Data.

• Data objects are linked to other data objects, similar to how web pages are linked today.

• Computers will be able to make use of the data residing inside web pages.

Page 4: Search Engines After The Semantic Web

Data Representation

RDF (Resource Description Framework)

Page 5: Search Engines After The Semantic Web

Vocabulary

RDF provides a generic, abstract data model for describing resources using subject, predicate, object triples. However, it does not provide any domain-specific terms for describing classes of things in the world and how they relate to each other. This function is served by:

• RDFS (the RDF Vocabulary Description Language, also known as RDF Schema)

• OWL (the Web Ontology Language)

They provide:

• Definitions of the terms we can use

• The restrictions that apply

• The extra relationships that exist

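
The triple model and the role of a vocabulary can be sketched in a few lines of Python. This is only an illustration: the `ex:` names and the `objects` helper are assumptions made up for the example, while `rdf:type`, `rdfs:subClassOf`, and the FOAF terms come from existing vocabularies.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# The ex: identifiers are hypothetical; foaf:, rdf: and rdfs: are real vocabularies.
triples = {
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    # Vocabulary-level statement (RDFS): every foaf:Person is a foaf:Agent.
    ("foaf:Person", "rdfs:subClassOf", "foaf:Agent"),
}


def objects(subject, predicate):
    """Return all objects of the triples matching subject and predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


alice_types = objects("ex:alice", "rdf:type")
alice_friends = objects("ex:alice", "foaf:knows")
```

The first three triples are plain data; the last one is the kind of statement that RDFS or OWL contributes, telling a consumer how the terms relate to each other.
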
Page 6: Search Engines After The Semantic Web

RDFS vs OWL

RDFS is a lightweight ontology language, while OWL extends the expressivity of RDFS with additional modeling primitives. For example, OWL defines the primitives:

• equivalentClass

• equivalentProperty

• inverseOf, which allows the creator of a vocabulary to state that one property is the inverse of another, for example: prod:directed is the owl:inverseOf tv:director.

These primitives increase the interoperability of data sets modeled using different vocabularies.
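
To make the effect of owl:inverseOf concrete, here is a minimal sketch of how a consumer might derive the inverse statements. The function name `apply_inverse_of` and the `ex:` resources are assumptions for illustration; a real OWL reasoner does far more than this.

```python
def apply_inverse_of(triples):
    """Return the input triples plus those implied by owl:inverseOf declarations."""
    inverses = {}
    for s, p, o in triples:
        if p == "owl:inverseOf":
            inverses[s] = o
            inverses[o] = s  # inverseOf holds in both directions

    inferred = set(triples)
    for s, p, o in triples:
        if p in inverses:
            # (s, p, o) implies (o, inverse-of-p, s)
            inferred.add((o, inverses[p], s))
    return inferred


data = {
    ("prod:directed", "owl:inverseOf", "tv:director"),
    ("ex:some_director", "prod:directed", "ex:some_film"),  # hypothetical resources
}
result = apply_inverse_of(data)
```

A data set that only uses prod:directed can now answer queries phrased with tv:director, which is the interoperability gain the slide refers to.
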

Page 7: Search Engines After The Semantic Web

RDFa

• RDFa is a way to express RDF data within XHTML by reusing the existing human-readable content instead of repeating it.

<div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <p property="foaf:name">Alice Birpemswick</p>
  <p>Email: <a rel="foaf:mbox" href="mailto:[email protected]">[email protected]</a></p>
  <p>Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a></p>
</div>
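
As a rough illustration of how a crawler could read such markup, the sketch below uses Python's standard `html.parser` to pull predicate and value pairs out of the `property` and `rel` attributes. The class name `RDFaSketch` is an assumption, only the name and phone lines of the example are used, and this is nowhere near a conforming RDFa processor.

```python
from html.parser import HTMLParser


class RDFaSketch(HTMLParser):
    """Collects (predicate, value) pairs from `property` and `rel` attributes."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._pending = None  # predicate still waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "rel" in attrs and "href" in attrs:
            # rel + href: the value is the linked resource
            self.pairs.append((attrs["rel"], attrs["href"]))
        elif "property" in attrs:
            # property: the value is the text inside the element
            self._pending = attrs["property"]

    def handle_data(self, data):
        if self._pending and data.strip():
            self.pairs.append((self._pending, data.strip()))
            self._pending = None


markup = """
<div typeof="foaf:Person" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <p property="foaf:name">Alice Birpemswick</p>
  <p>Phone: <a rel="foaf:phone" href="tel:+1-617-555-7332">+1 617.555.7332</a></p>
</div>
"""

parser = RDFaSketch()
parser.feed(markup)
pairs = parser.pairs
```

The human-readable text ("Alice Birpemswick") doubles as the machine-readable value, which is the reuse that the bullet above describes.
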

Page 8: Search Engines After The Semantic Web

Agenda

• Basic Semantic Web Principles

• Falcons Semantic Search Engine

• Search Engine Giants Experience (Google, Yahoo, Microsoft)

• Kngine New Promising Search Engine

• Summary

• References

Page 9: Search Engines After The Semantic Web

Falcons Semantic Search Engine

Object Search

Concept Search

Document Search

Page 10: Search Engines After The Semantic Web

Falcons Object Search

Karlsruhe

Page 11: Search Engines After The Semantic Web

Falcons Object Search

Knows Peter Mika

Page 12: Search Engines After The Semantic Web

Falcons Object Search

Peter Mika Jim Hendler

Page 13: Search Engines After The Semantic Web

Object Indexing

• To build the inverted index, the search engine constructs a virtual document for every object. The virtual document contains the object's description, drawn from:

• local names

• associated literals of SW objects

• textual descriptions of its neighboring resources

Inverted index (figure):

Term1: object4, object2, object1

Term2: object2

Term3: object4, object3
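
The idea of indexing virtual documents can be sketched as follows. The sample texts and the whitespace tokenizer are assumptions for illustration; the slides do not describe how Falcons tokenizes or stores its index.

```python
from collections import defaultdict


def build_inverted_index(virtual_docs):
    """Map each term to the set of objects whose virtual document contains it."""
    index = defaultdict(set)
    for obj, text in virtual_docs.items():
        for term in text.lower().split():
            index[term].add(obj)
    return index


# Hypothetical virtual documents: one text blob per SW object.
virtual_docs = {
    "object1": "semantic web conference",
    "object2": "semantic search",
    "object3": "search engine",
}
index = build_inverted_index(virtual_docs)
```

Looking up a query term in `index` returns the candidate objects directly, without scanning every virtual document.
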

Page 14: Search Engines After The Semantic Web

Object Indexing

• Falcons collects the neighbors of a SW object by starting from the object and traversing the graph. The traversal stops when it reaches a URI or a literal, but continues through blank nodes, because no terms can be collected from them.

Terms collected in the example: WWW2008, International, World, Wide, Web, Conference, Beijing
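
A minimal sketch of that traversal is shown below. It assumes that blank nodes are identified by a `_:` prefix and that only outgoing edges are followed; both are simplifications chosen for the example, and the property names are hypothetical.

```python
def collect_neighbors(graph, start):
    """Collect URIs and literals reachable from start, walking through blank nodes only."""
    collected = []
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        for s, p, o in graph:
            if s != node:
                continue
            if o.startswith("_:"):
                stack.append(o)      # blank node: nothing to collect, keep walking
            else:
                collected.append(o)  # URI or literal: collect it and stop here
    return collected


# Hypothetical graph: a paper linked to a conference through a blank node.
graph = [
    ("ex:paper", "ex:presentedAt", "_:b1"),
    ("_:b1", "rdfs:label", "WWW2008"),
    ("_:b1", "ex:location", "Beijing"),
    ("ex:paper", "ex:author", "ex:someAuthor"),
    ("ex:someAuthor", "foaf:name", "not collected"),
]
neighbors = collect_neighbors(graph, "ex:paper")
```

The literal attached to `ex:someAuthor` is not collected, because the walk stops as soon as it reaches a URI.
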

Page 15: Search Engines After The Semantic Web

Weighting and Similarity

• Both the virtual document and the query are represented as term vectors in a term vector space.

• The terms of the virtual document are weighted: terms from the local name and labels are assigned a higher weighting coefficient than terms from literal properties and from neighbors' properties.

• The similarity between an object and the query is calculated with the cosine measure.

• Results are ranked by a combination of their relevance to the query and their popularity, where:

• The relevance score is calculated with the cosine similarity measure.

• The popularity score is evaluated according to the number of RDF documents in which the SW object is used.
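
The scoring described above might look roughly like the sketch below. The weight values, the logarithmic popularity term, and the choice to multiply relevance by popularity are all assumptions; the slides only state that names and labels weigh more and that the two scores are combined.

```python
import math
from collections import Counter

# Assumed coefficients: names/labels count more than literals and neighbor text.
FIELD_WEIGHTS = {"name": 2.0, "literal": 1.0, "neighbor": 1.0}


def virtual_document_vector(fields):
    """Build a weighted term vector from a {field: text} description."""
    vector = Counter()
    for field, text in fields.items():
        for term in text.lower().split():
            vector[term] += FIELD_WEIGHTS[field]
    return vector


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score(fields, query, rdf_document_count):
    """Combine relevance and popularity (the combination rule is an assumption)."""
    relevance = cosine(virtual_document_vector(fields), Counter(query.lower().split()))
    popularity = math.log(1 + rdf_document_count)
    return relevance * popularity


# Hypothetical objects: the same terms, but in different fields.
obj_a = {"name": "semantic web", "literal": "", "neighbor": "conference"}
obj_b = {"name": "conference", "literal": "", "neighbor": "semantic web"}
query = "semantic web"
```

With equal popularity, `obj_a` outranks `obj_b` for this query because the matching terms sit in its name rather than in its neighbors' descriptions.
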

Page 16: Search Engines After The Semantic Web

Lightweight Inference

• Falcons indexes the classes of SW objects and provides a user-friendly navigation hierarchy of classes that lets users refine the search results. It uses class-inclusion reasoning to discover the implicit types of objects.

• For each object, Falcons indexes not only its explicitly specified classes but also their superclasses.

Class index (figure):

Class1: object3, object2, object1

Class2: object2

Class3: object4, object1
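
Class-inclusion reasoning at indexing time can be sketched as a walk up the subclass hierarchy. The class names and both helper functions are hypothetical; the sketch only shows the principle of indexing an object under its explicit classes and all of their superclasses.

```python
from collections import defaultdict


def all_superclasses(cls, subclass_of):
    """Return cls together with every class reachable through rdfs:subClassOf."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(subclass_of.get(current, ()))
    return seen


def build_class_index(object_types, subclass_of):
    """Index each object under its explicit classes and all their superclasses."""
    index = defaultdict(set)
    for obj, classes in object_types.items():
        for cls in classes:
            for ancestor in all_superclasses(cls, subclass_of):
                index[ancestor].add(obj)
    return index


# Hypothetical hierarchy: AcademicConference < Conference < Event.
subclass_of = {
    "ex:AcademicConference": ["ex:Conference"],
    "ex:Conference": ["ex:Event"],
}
object_types = {
    "object1": ["ex:AcademicConference"],
    "object2": ["ex:Event"],
}
class_index = build_class_index(object_types, subclass_of)
```

A user who restricts the results to `ex:Event` then also sees `object1`, although its description never states that type explicitly.
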

Page 17: Search Engines After The Semantic Web

Lightweight Inference

• The system does not recommend all the subclasses; instead, it uses a simple algorithm to determine which ones should be offered to the user.

OrgnizedEvent

Page 18: Search Engines After The Semantic Web

Agenda

• Basic Semantic Web Principles

• Falcons Semantic Search Engine

• Search Engine Giants Experience (Google, Yahoo, Microsoft)

• Kngine New Promising Search Engine

• Summary

• References

Page 19: Search Engines After The Semantic Web

Google Rich Snippets

• Webmasters can provide structured data by using RDFa to mark up their web pages.

• Google crawls RDFa data describing people, products, businesses, organizations, reviews, recipes, and events.

• Search results look smarter and richer according to the kind of data they describe.

Page 20: Search Engines After The Semantic Web

Yahoo SearchMonkey

• SearchMonkey is a system that aims to make the presentation of search results more intelligent by crawling RDFa data.

• It enables the people who know each result best, the publishers, to define what should be presented and how.

• It differs from Google Rich Snippets in that site owners can design the way their results are presented themselves.

Page 21: Search Engines After The Semantic Web

Google Question Answering

What is birth date of Catherine Zeta-Jones.

Page 22: Search Engines After The Semantic Web

Google Question Answering

what is the name of Britney Spears’s mother

Page 23: Search Engines After The Semantic Web

Schema.org: Library of Vocabularies

• In early June 2011, Google, Microsoft, and Yahoo announced schema.org, a new service intended to create and support a common vocabulary for structured data markup on web pages.

• The idea is to provide a library of vocabularies for embedding machine-readable data into web pages in a manner that can be fully exploited across search engines.

• Schema.org appears to be "Linked Data Lite", with extremely limited support for vocabularies (the full list is available at schema.org/docs/full.html).

Page 24: Search Engines After The Semantic Web

Extending Schema.org

• One can always create new schemas that are not on schema.org if the content of your domain is not covered by any of the schema.org types.

• If the schema gains adoption, search engines may start using this data. Extensions that gain significant adoption on the web may be moved into the core schema.org vocabulary.

• If you publish content of an unsupported type, you have these options:

• Use a less-specific markup type. For example, schema.org has no "Professor" type. However, if you have a directory of professors in your university department, you could use the "Person" type to mark up the information for every professor in the directory.

• If you are feeling ambitious, use the schema.org extension system to define a new type.

Page 25: Search Engines After The Semantic Web

Microdata Model

• Schema.org does not use RDF as its data model; instead, it uses very generic Microdata, supported by HTML5, with a model derived from RDF Schema.

Page 26: Search Engines After The Semantic Web

Microdata vs RDFa

Microdata audience

• RDFa is extensible and very expressive, but the substantial complexity of the language has contributed to slower adoption.

• Schema.org vocabularies are search-engine oriented rather than domain specific like RDF vocabularies.

• Microdata can be converted to RDFa.

• Schema.RDFS.org is a complementary effort by people from the Linked Data community to express the terms provided by the Schema.org vocabularies in RDF.

• By tagging information, web page owners could improve the position of their site in search results, an important source of traffic.

Page 27: Search Engines After The Semantic Web

Microdata vs RDFa

RDFa audience

• All of the capabilities promised by schema.org are already fully supported, in a richer and more scalable manner, by RDFa.

• The entire Web community should decide which features should be supported, not just Microsoft, Google, or Yahoo.

• Google and Yahoo already support Microdata and RDFa in their advanced search services (Google Rich Snippets and Yahoo Search). So, why is it that we cannot continue to use

Page 28: Search Engines After The Semantic Web

Agenda

• Basic Semantic Web Principles

• Falcons Semantic Search Engine

• Search Engine Giants Experience (Google, Yahoo, Microsoft)

• Kngine New Promising Search Engine

• Summary

• References

Page 29: Search Engines After The Semantic Web

Kngine New Promising Search Engine

• The Egyptian startup Kngine announced that its new Kngine search engine went live in 2010.

• Most existing semantic search engines draw their results from a limited number of sites, such as Wikipedia and Freebase. Kngine, however, has expanded beyond those sources and seeks to index structured information.

Page 30: Search Engines After The Semantic Web

Smart Information

Yes Man

Page 31: Search Engines After The Semantic Web

Words with Multiple Meanings

Java

Page 32: Search Engines After The Semantic Web

Comparisons

iPhone vs iPhone 3G iPhone 3GS

Page 33: Search Engines After The Semantic Web

Answer your questions

Who is the director of 2012

Page 34: Search Engines After The Semantic Web

Updated Information

• (Weather, stocks, currency prices, and sports match results)

Latest world cup matches results

Page 35: Search Engines After The Semantic Web

Agenda

• Basic Semantic Web Principles

• Falcons Semantic Search Engine

• Search Engine Giants Experience (Google, Yahoo, Microsoft)

• Kngine New Promising Search Engine

• References

Page 36: Search Engines After The Semantic Web

• Taha, E. Linked Data: State of The Art; Department of Software Engineering and Information System, 2010.

• Heath, T.; Bizer, C. Linked Data: Evolving the Web into a Global Data Space; Synthesis Lectures on the Semantic Web: Theory and Technology, 1st ed.; Morgan & Claypool, 2011.

• Cheng, G.; Qu, Y. Integrating Lightweight Reasoning into Class-Based Query Refinement for Object Search; Scientific paper; Institute of Web Science, School of Computer Science and Engineering, Southeast University: Nanjing, 2008.

• Schema.org and the Semantic Web. prototypo.blogspot.com/2011/06/schemaorg-and-semantic-web.html (accessed June 3, 2011).

• Lur, X. Kngine: The Smartest Search Engine Ever? http://www.techxav.com/2010/04/09/kngine-the-smartest-search-engine-ever (accessed April 9, 2010).

• Shadbolt, N.; Hall, W.; Berners-Lee, T. The Semantic Web Revisited; IEEE Computer Society, 2006.

Page 37: Search Engines After The Semantic Web

Thank You