An “information retrieval system” searches the computer system to find the required...

WEB SEARCH ENGINES & THEIR PERFORMANCE Merve YILMAZ MGIS 301 08.05.2008


Page 1: An “information retrieval system”  searches the computer system to find the required information.

WEB SEARCH ENGINES & THEIR PERFORMANCE

Merve YILMAZ

MGIS 301

08.05.2008

Page 2:

An “information retrieval system” searches the computer system to find the required information.

Page 3:

Web search engines: the most public, visible form of a search engine.

The information a web search engine looks for can be:
› Web pages
› Images
› Other types of files

Page 4:

The very first tool that led to web search engines: Archie
› Collects file names and builds a searchable database from the collection (file contents are not indexed).
› Created by a student at McGill University, Montreal
› The name stands for “archive” without the “v”

Page 5:

The first web search engine: Wandex.
› Created by an MIT student.

One of the first full-text, crawler-based search engines: WebCrawler

* crawler: something that crawls like a reptile

Another popular search engine: Lycos, which started at Carnegie Mellon University.

Then came other search engines: Excite, AltaVista, Infoseek…

Page 6:

Yahoo provides directory browsing

Page 7:

How did Google become the most popular among others?

› An innovation called “PageRank”
› A minimalist interface, rather than a web search engine embedded in a web portal

Page 8:

I. Web Crawling
II. Indexing
III. Searching

Web Crawling: an automated web browser follows every link it sees.

Indexing: words are extracted from titles, headings, and meta tags.

* meta tag: <META name="keywords" content="stamps,stamp collecting,stamps for sale">
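The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how any real search engine is built: the two pages in `PAGES` are made up, the "web" is an in-memory dictionary rather than the network, and the names `PageParser`, `crawl_and_index` and `search` are hypothetical helpers introduced for this example.

```python
import re
from collections import defaultdict, deque
from html.parser import HTMLParser


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class PageParser(HTMLParser):
    """Collects outgoing links and the words found in titles,
    headings and the keywords meta tag of one page."""
    INDEXED_TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []
        self._open_tag = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # tag and attribute names arrive lowercased
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            self.words += tokenize(attrs.get("content") or "")
        elif tag in self.INDEXED_TAGS:
            self._open_tag = tag

    def handle_endtag(self, tag):
        if tag == self._open_tag:
            self._open_tag = None

    def handle_data(self, data):
        if self._open_tag:
            self.words += tokenize(data)


def crawl_and_index(pages, start):
    """I + II: follow every link from `start` and map each word to its pages."""
    index = defaultdict(set)
    queue, seen = deque([start]), {start}
    while queue:
        url = queue.popleft()
        parser = PageParser()
        parser.feed(pages.get(url, ""))
        for word in parser.words:
            index[word].add(url)
        for link in parser.links:
            if link in pages and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """III: return the pages that contain every word of the query."""
    sets = [index.get(word, set()) for word in tokenize(query)]
    return set.intersection(*sets) if sets else set()


# Made-up two-page site, for illustration only.
PAGES = {
    "/home": '<title>Stamp shop</title>'
             '<META name="keywords" content="stamps,stamp collecting,stamps for sale">'
             '<a href="/about">About</a>',
    "/about": "<h1>About our stamps</h1><a href='/home'>Home</a>",
}
index = crawl_and_index(PAGES, "/home")
print(search(index, "stamp collecting"))
```

Body text is deliberately ignored here, because the slide lists only titles, headings and meta tags as sources of index words.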

Page 9:

Search Target 1: Odysseuss2009.org

Best rating percentage: Yahoo.com

Fewest results: Excite.com

Worst rating percentage: Windows Live / MSN Search (search.msn.com) (0%)

* Excite.com is actually a meta search engine and shows results from other search engines.

Page 10:

Best rating percentage among meta engines: Dogpile.com

Worst rating percentage: Apollo7.co.uk (almost 0%)

* The engines from which meta search engines collect their results differ from one meta search engine to another, so their performance varies accordingly.
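To illustrate why a meta search engine's results depend on its sources, here is a rough sketch of one possible way to combine ranked lists. The round-robin merge is an assumption chosen for simplicity; the slides do not say how Dogpile.com or any other meta engine actually merges results. The engine names and URLs are placeholders, and `merge_results` is a hypothetical helper.

```python
from itertools import zip_longest


def merge_results(results_by_engine):
    """Interleave the ranked lists rank by rank, dropping duplicate URLs."""
    merged, seen = [], set()
    for tier in zip_longest(*results_by_engine.values()):
        for url in tier:
            # zip_longest pads shorter lists with None
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Placeholder result lists, not real search output.
sample = {
    "engine_a": ["a.example/1", "shared.example", "a.example/2"],
    "engine_b": ["shared.example", "b.example/1"],
}
merged = merge_results(sample)
print(merged)
```

Swapping `engine_b` for a different source changes the merged list, which is the effect described in the note above.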

Page 11:

Search query: “Loss of customer goodwill in lot-sizing”, an article by Deniz Aksen

* All results that link to the article, or to pages where the article is cited or referred to, are counted as “hits”.

• Best rating percentage: Ask.com (found 1st in almost all searches)

• Worst rating percentage: Lycos.com (almost none of the results are hits)

• Best rating percentage among meta engines: Mamma.com

Page 12:

• Worst rating percentage among meta engines: Donbusca.com

* These samples show the results of only some of the searches; in total, 480 searches with various keywords were run for one target search result. The remaining results are included in detail in the Excel sheets.
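The slides do not spell out how the rating percentage is calculated. One plausible reading, assumed here, is the share of returned results that count as hits. The sketch below uses that assumption with made-up placeholder data; `rating_percentage`, `rank_engines` and the `is_hit` rule are hypothetical and do not reproduce the study's actual figures.

```python
def rating_percentage(results, is_hit):
    """Percentage of results that are hits (assumed definition)."""
    if not results:
        return 0.0
    hits = sum(1 for result in results if is_hit(result))
    return 100.0 * hits / len(results)


def rank_engines(results_by_engine, is_hit):
    """Sort engines from best to worst rating percentage."""
    scores = {
        engine: rating_percentage(results, is_hit)
        for engine, results in results_by_engine.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Placeholder data: a result is a hit if its URL mentions the target.
def is_hit(url):
    return "target" in url


sample = {
    "engine_a": ["target.example/paper", "other.example",
                 "cites-target.example", "x.example"],
    "engine_b": ["other.example", "y.example"],
    "engine_c": [],
}
ranking = rank_engines(sample, is_hit)
print(ranking)
```

An engine that returns nothing is scored 0% here; a real evaluation might prefer to report it separately, as the slides do with the fewest-results category.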