Internet Resources Discovery (IRD)

Page 1: Internet Resources Discovery (IRD)

T.Sharon-A.Frank

Internet Resources Discovery (IRD)

Intelligent IRD

Page 2: Internet Resources Discovery (IRD)

Motivation for Intelligence

“We are drowning in information but starved for knowledge.” John Naisbitt

Page 3: Internet Resources Discovery (IRD)

Content

• Classical IRD characteristics and the information food chain

• Agents - Softbots family

• Meta SE - Metacrawler

• Homepage finder - Ahoy!

• ILA – Internet Learning Agent

• Shopbot – Jango et al.

See Oren Etzioni’s Web site at:

http://www.cs.washington.edu/research/projects/WebWare1/www/softbots/softbots.html

Page 4: Internet Resources Discovery (IRD)

No Time for Intelligence!

Classical IRD Characteristics

• Massive memory and network resources required.

• Amortized over millions of queries per day.

• Minimal cycles devoted to each individual.

• No memory of previous requests.

• Least common denominator service.

Page 5: Internet Resources Discovery (IRD)

Classical Information Food Chain

Page 6: Internet Resources Discovery (IRD)

Intelligent Information Food Chain

Page 7: Internet Resources Discovery (IRD)

Definition: Softbots

• Softbots are intelligent agents that use software tools and services on a person’s behalf.

• Make intensive use of artificial intelligence (AI) techniques: planning, scheduling, learning, etc.

Page 8: Internet Resources Discovery (IRD)

Softbot Family Tree

[Diagram: softbot family tree with the nodes Rodney, Sims, InfoManifold, Occam, Simon, MetaCrawler, Ahoy!, ILA, ShopBot, BargainFinder]

Page 9: Internet Resources Discovery (IRD)

General problems to be solved

• Discovery
  – How to find new information sources (IS)?

• Extraction
  – What to send and how to parse the response?

• Translation
  – How to interpret the response in terms of internal concepts?

• Evaluation
  – How to evaluate the quality of an IS?

Page 10: Internet Resources Discovery (IRD)

Main Focus of the Robots

[Diagram: the robots Metacrawler, Ahoy!, and ILA placed against the problem areas "Discovery, Evaluation", "Extraction", and "Translation"]

Page 11: Internet Resources Discovery (IRD)

Meta Search Engine

MetaCrawler

Yahoo, WebCrawler, Open Text, Lycos, InfoSeek, Inktomi, Galaxy, Excite

Page 12: Internet Resources Discovery (IRD)

Search Service - Motivation

1. The number and variety of search services.
2. Each service provides an incomplete snapshot of the Web.
3. Users are forced to try and retry their queries across different indices.
4. Each service has its own interface.
5. Irrelevant, outdated or unavailable responses.
6. There is no time for intelligence.
7. Each query is independent.
8. No individual customization.
9. The result is not homogenized.

Page 13: Internet Resources Discovery (IRD)

The Web Community Demands

• Robustness
  – A working system, accessible 24 hours a day.

• Speed
  – Transmitting useful information within seconds.

• Added Value
  – Any increase in sophistication had better yield a tangible benefit to users.

Page 14: Internet Resources Discovery (IRD)

Premises of MetaCrawler

• No single search service is sufficient.

• Users have problems expressing their queries.

• Low quality references can be detected.

Page 15: Internet Resources Discovery (IRD)

MetaCrawler

Page 16: Internet Resources Discovery (IRD)

MetaCrawler is a Meta-Service

• It doesn’t use a database of its own.

• It uses other external search services that provide the information necessary to fulfill user queries.

Page 17: Internet Resources Discovery (IRD)

MetaCrawler Advantages

• It accesses multiple databases and provides a large number of higher-quality references.

• It does not depend upon the implementation or existence of any specific search service.

• It accesses the search services simultaneously.

• Users need not remember the addresses, interfaces, … of each search service.
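
The independence from any single service and the single user-facing interface can be illustrated with a small adapter layer. This is only a sketch under assumptions: the two raw services, their response formats, and the names `Reference`, `ADAPTERS` and `search_all` are hypothetical and are not taken from MetaCrawler itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Reference:
    """Common result format shown to the user, whatever the source."""
    title: str
    url: str
    source: str


# Hypothetical raw services; each one answers in its own format.
def raw_service_one(query: str) -> List[dict]:
    return [{"name": "Softbots", "link": "http://example.org/softbots"}]


def raw_service_two(query: str) -> List[tuple]:
    return [("http://example.org/agents", "Agents")]


# One adapter per service translates its raw format into Reference objects.
def adapt_one(query: str) -> List[Reference]:
    return [Reference(r["name"], r["link"], "one") for r in raw_service_one(query)]


def adapt_two(query: str) -> List[Reference]:
    return [Reference(title, url, "two") for url, title in raw_service_two(query)]


# Adding or dropping a service only changes this registry.
ADAPTERS: Dict[str, Callable[[str], List[Reference]]] = {
    "one": adapt_one,
    "two": adapt_two,
}


def search_all(query: str) -> List[Reference]:
    """Query every registered adapter through the same interface."""
    results: List[Reference] = []
    for adapter in ADAPTERS.values():
        results.extend(adapter(query))
    return results
```

Because callers only see `Reference` objects, a service can be replaced or removed without touching the code that presents results.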

Page 18: Internet Resources Discovery (IRD)

How Does It Work?

• It currently accesses a few services: InfoSeek, Lycos, WebCrawler, Yahoo, etc.

• It submits a query to every search service it knows in parallel.

• It collates the results by merging all hits returned.

• It has sorting and verification options.

• It presents a results page consisting of a list of references.
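
The steps on this slide (parallel submission, collation, optional verification, sorted results list) can be sketched as follows. This is a minimal illustration, not MetaCrawler's actual implementation: the stand-in services, the `(url, score)` hit format, the score-summing merge rule, and the `verify` callback are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# A hit is (url, score); the score scale is an assumption of this sketch.
Hit = Tuple[str, float]
Service = Callable[[str], List[Hit]]


def meta_search(query: str,
                services: Dict[str, Service],
                verify: Callable[[str], bool] = lambda url: True) -> List[Hit]:
    """Send the query to all services in parallel, merge, verify, sort."""
    def safe_call(service: Service) -> List[Hit]:
        try:
            return service(query)
        except Exception:
            return []  # a failing service must not break the whole search

    with ThreadPoolExecutor(max_workers=max(1, len(services))) as pool:
        per_service = list(pool.map(safe_call, services.values()))

    # Collate: the same URL returned by several services is merged,
    # and its scores are added up (one possible merge rule).
    merged: Dict[str, float] = {}
    for hits in per_service:
        for url, score in hits:
            merged[url] = merged.get(url, 0.0) + score

    # Verify option: drop references the callback rejects.
    kept = [(url, score) for url, score in merged.items() if verify(url)]
    # Sort option: best score first, URL as a tie-breaker.
    return sorted(kept, key=lambda hit: (-hit[1], hit[0]))


# Hypothetical in-memory stand-ins for real search services.
def service_a(query: str) -> List[Hit]:
    return [("http://example.org/a", 0.9), ("http://example.org/b", 0.4)]


def service_b(query: str) -> List[Hit]:
    return [("http://example.org/b", 0.7), ("http://example.org/dead", 0.8)]


def service_down(query: str) -> List[Hit]:
    raise ConnectionError("service unavailable")


SERVICES = {"a": service_a, "b": service_b, "down": service_down}
```

Calling `meta_search("softbots", SERVICES, verify=lambda u: not u.endswith("/dead"))` ranks the reference found by both services first and drops the one that fails verification. In a real system `verify` would fetch each page, which is slow, so it is kept optional here just as on the slide.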

Page 19: Internet Resources Discovery (IRD)

Meta-Search

• http://www.metacrawler.com

Page 20: Internet Resources Discovery (IRD)

Meta Search Results