WEB SEARCH ENGINES & THEIR PERFORMANCE
Merve YILMAZ
MGIS 301
08.05.2008
An “information retrieval system” searches the computer system to find
the required information.
A web search engine is the most public, visible form of a search engine.
Information it searches for can be:
› Web pages
› Images
› Other types of files
The very first tool leading to web search engines: Archie
› Collects file names and creates a searchable database from the collection, without indexing file contents.
› Created by a student at McGill University, Montreal
› The name stands for “archive” without the “v”
The first search engine: Wandex
› Created by an MIT student.
One of the first full-text crawler-based search engines: WebCrawler*
* crawler: something that crawls like a reptile
Another popular search engine: Lycos, which started at Carnegie Mellon.
Then came other search engines: Excite, AltaVista, Infoseek…
Yahoo provides directory browsing
How did Google become the most popular among others?
› An innovation called “PageRank”
› A minimalist interface rather than a web search engine embedded in a web portal
I. Web Crawling
II. Indexing
III. Searching
Web Crawling: an automated web browser follows every link it sees.
Indexing: words are extracted from titles, headings and meta tags*
* meta tag: <META name="keywords" content="stamps,stamp collecting,stamps for sale">
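The three steps can be illustrated with a minimal Python sketch. It is not how any particular engine is implemented: the two pages in PAGES, the PageParser class and the crawl_and_index and search functions are hypothetical names made up for this example, and the "web" is an in-memory dictionary rather than real network requests.

```python
from collections import defaultdict, deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML. No network access is involved.
PAGES = {
    "a.html": '<html><head><title>Stamp shop</title>'
              '<meta name="keywords" content="stamps,stamp collecting"></head>'
              '<body><h1>Stamps for sale</h1><a href="b.html">more</a></body></html>',
    "b.html": '<html><head><title>Coin shop</title></head>'
              '<body><h2>Rare coins</h2><a href="a.html">back</a></body></html>',
}


class PageParser(HTMLParser):
    """Collects outgoing links and words from titles, headings and meta keywords."""

    TEXT_TAGS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.links, self.words = [], []
        self._open = 0  # how many indexable tags are currently open

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            self._add(attrs.get("content") or "")
        elif tag in self.TEXT_TAGS:
            self._open += 1

    def handle_endtag(self, tag):
        if tag in self.TEXT_TAGS and self._open:
            self._open -= 1

    def handle_data(self, data):
        if self._open:
            self._add(data)

    def _add(self, text):
        self.words += text.lower().replace(",", " ").split()


def crawl_and_index(start):
    """I. Crawl: follow every link seen. II. Index: map each word to its pages."""
    index, seen, queue = defaultdict(set), {start}, deque([start])
    while queue:
        url = queue.popleft()
        parser = PageParser()
        parser.feed(PAGES[url])
        for word in parser.words:
            index[word].add(url)
        for link in parser.links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, query):
    """III. Search: return the pages that contain every query word."""
    sets = [index.get(word, set()) for word in query.lower().split()]
    return sorted(set.intersection(*sets)) if sets else []


index = crawl_and_index("a.html")
print(search(index, "stamps"))
print(search(index, "shop"))
```

Only the title, heading and keyword text is indexed here, mirroring the slide; the link text "more" is followed by the crawler but never added to the index.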
Search Target 1: Odysseuss2009.org
• Best rating percentage: Yahoo.com
• Fewest results: Excite.com*
• Worst rating percentage: Windows Live / MSN Search (search.msn.com) (0%)
* Excite.com is actually a meta search engine and shows results from other search engines.
• Best rating percentage among meta engines: Dogpile.com
• Worst rating percentage among meta engines: Apollo7.co.uk (almost 0%)
* The engines from which meta search engines collect results differ from one meta search engine to another, so their performance varies accordingly.
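Why meta engines differ from one another can be shown with a small sketch. Everything in it is an assumption made for illustration: the three underlying engines are hypothetical functions returning fixed result lists, and meta_search simply merges their results in order while removing duplicates, which is only one of many possible merging strategies.

```python
def meta_search(query, engines):
    """Query every underlying engine and merge the results.

    Returns a dict mapping each URL (in first-seen order) to the list of
    engine names that returned it.
    """
    merged = {}
    for name, engine in engines.items():
        for url in engine(query):
            merged.setdefault(url, []).append(name)
    return merged


# Hypothetical underlying engines with fixed, made-up results.
def engine_a(query):
    return ["x.org", "y.org"]


def engine_b(query):
    return ["y.org", "z.org"]


def engine_c(query):
    return ["w.org"]


# Two meta engines that draw on different sets of underlying engines.
meta_1 = meta_search("stamps", {"A": engine_a, "B": engine_b})
meta_2 = meta_search("stamps", {"B": engine_b, "C": engine_c})

print(list(meta_1))
print(list(meta_2))
```

The same query yields different result lists for the two meta engines purely because their sources differ, which is the effect described in the note above.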
Search query: “Loss of customer goodwill in lot-sizing”, an article by Deniz Aksen
* All results that link to the article, or to pages where the article is cited or referred to, are counted as “hits”.
• Best rating percentage: Ask.com (found first in almost all searches)
• Worst rating percentage: Lycos.com (almost none of the results are hits)
• Best rating percentage among meta engines: Mamma.com
• Worst rating percentage among meta engines: Donbusca.com
* The samples show the results of only some searches; there are 480 searches in total, with various keywords, for one target search result. The rest are included in detail in the Excel sheets.
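The slides do not spell out how the rating percentage is calculated. The sketch below assumes it is the share of returned results that count as hits; the rating_percentage function, the is_hit check and the sample URLs are all hypothetical and are not taken from the study's data.

```python
def rating_percentage(results, is_hit):
    """Percentage of returned results that are hits (assumed definition)."""
    if not results:
        return 0.0
    hits = sum(1 for result in results if is_hit(result))
    return 100.0 * hits / len(results)


# Made-up result list for one search; a result is a hit if it links to the
# article or to a page citing it (modelled here as a simple set lookup).
relevant = {"journal.example/article", "blog.example/cites-article",
            "uni.example/reading-list"}
results = ["journal.example/article", "shop.example/ads",
           "blog.example/cites-article", "uni.example/reading-list"]

score = rating_percentage(results, lambda url: url in relevant)
print(score)  # 75.0
```

Under this assumed definition, an engine that returns no results at all is scored 0% rather than raising a division error.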