Pedersen
Transcript of Pedersen
1
Internet Search Engines: Past and Future
Jan Pedersen
Chief Scientist, Yahoo! Search
11 April 2005
2
Outline
• Compare and contrast
– 2005 vs 1998
• Some underlying trends
• What’s New
3
Search Landscape 2005
• Three major Players
– Google ($52B)
– Yahoo ($42B)
– MSN ($271B)
• $6+B in Paid Search Revenues
• >400M searches daily
• 5-8B claimed index size
• Excellent relevance
Source: Search Engine Watch
4
Search Landscape in 1998
5
Infoseek circa 1998
• IPO in 1996
– Same year as Excite and Yahoo!
• 5th ranked destination site
• Market cap of $1B
• Competed as a portal
– Content was king
• Ultimately sold to Disney
– Go network
• Decommissioned in 2001
6
Infoseek Search Technology
• 1.5 Generation Search
– Tuned for relevance
• Proximity
• Anchor text (ESP)
• Inlink counting
– Small but competent index
• 60M pages
• Alta Vista was 140M at that time
• Still exists as Ultraseek server
– Sold to Inktomi in 2000, later resold to Verity
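Inlink counting, one of the relevance signals listed on this slide, can be sketched in a few lines. This is only an illustration of the general idea, not Infoseek's implementation: the function name `count_inlinks`, the `(source, target)` edge format, and the choice to ignore self-links and count each linking page once are all assumptions.

```python
from collections import defaultdict


def count_inlinks(links):
    """Count distinct pages linking to each target page.

    `links` is an iterable of (source_url, target_url) pairs.
    Self-links are ignored and repeated links from the same source
    count once (both are assumptions made for this sketch).
    """
    sources_by_target = defaultdict(set)
    for source, target in links:
        if source != target:
            sources_by_target[target].add(source)
    return {page: len(sources) for page, sources in sources_by_target.items()}


# Hypothetical toy link graph.
links = [
    ("a.example", "c.example"),
    ("b.example", "c.example"),
    ("b.example", "c.example"),  # duplicate link, counted once
    ("c.example", "c.example"),  # self-link, ignored
    ("c.example", "a.example"),
]
inlinks = count_inlinks(links)
```

A ranker could then use the count as one feature alongside proximity and anchor-text matches; how those signals were actually combined is not stated in the slides.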
7
Infoseek UI
8
Comparison to State-of-the-art
9
What was missing?
• Business Model
– Monetized via untargeted banner ads
– Need for increased inventory
• Portalitis
• Clutter
• Lack of focus
• Search was deprioritized
10
Some Underlying Trends
11
Moore’s Law
12
Index Size
• ~150M in 1998
• ~5B in 2005
– 33x increase
– Moore would predict 25x
• Monthly refresh in 1998
• Daily refresh in 2005
• What about 2010?
– 40B?
• Where is the content?
– Public Web?
– Personal Web?
Source: Search Engine Watch and Search Engine Showdown
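The two growth figures on this slide can be checked with a short calculation. The sketch below assumes the common reading of Moore's Law as a doubling every 18 months; the slide does not state which doubling period it used, and the function name `moore_factor` is made up for the example.

```python
def moore_factor(years, doubling_period_years=1.5):
    """Growth factor after `years`, assuming one doubling per period.

    The 18-month (1.5-year) doubling period is an assumption; the
    slide does not say which version of Moore's Law it applied.
    """
    return 2 ** (years / doubling_period_years)


years = 2005 - 1998                # 7 years between the two snapshots
predicted = moore_factor(years)    # growth Moore's Law would predict
observed = 5e9 / 150e6             # ~5B pages in 2005 vs ~150M in 1998

print(f"Moore's Law prediction: {predicted:.0f}x")
print(f"Observed index growth:  {observed:.0f}x")
```

Under that assumption the prediction rounds to 25x and the observed growth to 33x, matching the slide's figures: index size grew somewhat faster than hardware capacity alone would suggest.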
13
Meaning of Hit Counts
• Hit counts are estimated
– Indices are tiered
– Estimates can be non-linear
Source: http://aixtal.blogspot.com/2005/01/web-googles-counts-faked.html
14
Meaning of Document Counts
• Claimed index Size
– Google: 3B
– FAST: 3B
– AV: 1B
• Not all Documents are equal
– Thin docs
• Disparity between claimed and reported
Source: Search Engine Showdown
15
Online Advertising
• Internet accounts for 30+% of viewing time
– Yet only 4% of spend
– $370B overall
• $10B online
• Fastest growing advertising segment
• Steady shift toward Online advertising
Source: The Economist
16
The Keyword Marketplace
• The great, unsung, search problem
– Matching relevant ads to user intent
• Example of distributed authorship
– Advertisers bid on keywords
– Discounts for good performance (relevance)
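One way to picture "bid on keywords" combined with "discounts for good performance" is a quality-weighted second-price auction. The slides do not describe the actual pricing mechanism, so the sketch below is an assumption throughout: the `Ad` fields, the `relevance` score (for example, an estimated click-through rate), and the pricing rule are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    advertiser: str
    bid: float        # maximum price per click, in dollars
    relevance: float  # hypothetical quality score in (0, 1]


def rank_and_price(ads):
    """Rank ads by bid * relevance and compute a per-click price.

    Each ad pays the minimum amount needed to keep its position:
    the next ad's score divided by its own relevance. A more
    relevant ad therefore pays less for the same slot, which is
    one possible form of "discount for good performance".
    The last ad pays 0.0 (no reserve price, an assumption).
    """
    ranked = sorted(ads, key=lambda ad: ad.bid * ad.relevance, reverse=True)
    results = []
    for i, ad in enumerate(ranked):
        if i + 1 < len(ranked):
            runner_up = ranked[i + 1]
            price = runner_up.bid * runner_up.relevance / ad.relevance
        else:
            price = 0.0
        results.append((ad.advertiser, round(price, 2)))
    return results


# Hypothetical bids on a single keyword.
ads = [
    Ad("A", bid=1.00, relevance=0.2),
    Ad("B", bid=0.60, relevance=0.5),
    Ad("C", bid=0.50, relevance=0.1),
]
results = rank_and_price(ads)
```

In this toy example advertiser B outranks A despite the lower bid, because its higher relevance gives it the higher score, and it pays less than its maximum bid.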
17
What’s New?
18
Verticals
Image Search
Product Search
19
Local
20
Personal Search
21
Contextual Search
22
Desktop Search
23