Search
brisso99
Category: Technology
Reading
Halavais (2009), esp. chapter 3
Brin and Page (1998)
Hargittai, E. (2004) Do you "google"? Understanding search engine use beyond the hype. First Monday, volume 9, number 3 (March 2004). URL: http://firstmonday.org/issues/issue9_3/hargittai/index.html
What we will cover today
• What is search?
• What is a search engine?
• How do search engines work?
• Search Engine Optimisation
• Tensions, problems, issues
Searches (millions)

Site               July 2008   July 2009   Change (%)   Share (%)
Total internet        80,554     113,685           41       100
Google sites          48,666      76,684           58        67.5
Yahoo! sites           8,689       8,898            2         7.8
Baidu.com              7,413       7,976            8         7
Microsoft sites        2,349       3,317           41         2.9
eBay                   1,223       1,723           41         1.5
NHN Corp.              1,243       1,526           23         1.3
Ask Network              929       1,291           39         1.1
Yandex                   663       1,290           94         1.1
AOL                    1,148       1,023          -11         0.9
Facebook                 743         879           18         0.7
Notes: Audience includes Internet users, ages 15 and older, at home and work. It excludes Internet activity from public computers, such as Internet cafes, and access from mobile phones or PDAs.
Source: comScore qSearch, 2009
“Like many kinds of statistics, search engine popularity is very hard to measure reliably, and interpretations of available data vary... More confusing is the difference in how popularity is understood. Popularity can mean, at the most basic level, two very distinct things: a) percentage of users who turn to a search engine for their search needs; and, b) percentage of all search queries that are run on a particular search engine. Depending on one’s interest, this distinction is important.”
Hargittai (2004)
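The two meanings of popularity in the quotation can give very different pictures of the same data. The sketch below is only an illustration: the query log, the user names and the engine names are all invented, and the functions `share_of_users` and `share_of_queries` are hypothetical helpers, not anything taken from Hargittai or comScore.

```python
from collections import Counter

# Hypothetical query log, one (user, engine) pair per query. Not real data.
query_log = [
    ("ann", "EngineA"), ("ann", "EngineA"), ("ann", "EngineA"),
    ("ann", "EngineA"), ("ann", "EngineA"), ("ann", "EngineA"),
    ("bob", "EngineB"), ("cat", "EngineB"), ("dan", "EngineB"),
]

def share_of_users(log):
    """Meaning (a): fraction of all users who used each engine."""
    users_by_engine = {}
    for user, engine in log:
        users_by_engine.setdefault(engine, set()).add(user)
    all_users = {user for user, _ in log}
    return {e: len(u) / len(all_users) for e, u in users_by_engine.items()}

def share_of_queries(log):
    """Meaning (b): fraction of all queries run on each engine."""
    counts = Counter(engine for _, engine in log)
    return {e: c / len(log) for e, c in counts.items()}

print(share_of_users(query_log))
print(share_of_queries(query_log))
```

In this invented log, one heavy user makes EngineA the leader by queries, while EngineB leads by number of users.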
[Figure: Conceptual organisation of the typical search engine. Halavais (2009): 15]
Components shown: URL list, Crawlers, Raw archive, Indexing and ranking, Database, “Front end”, Query form, Results
Stages shown: Gather information from web pages; Determine relevance to search query; Accept search query and present results
CRAWLER:
• Compiles list of URLs (pages) to be visited
• Saves copy of pages
• Looks through for links to other pages
• Adds new links to the bottom of the list
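The crawler steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any real search engine is built: `FAKE_WEB` is an invented three-page "web" that stands in for real HTTP fetches, and `LinkExtractor` and `crawl` are hypothetical names. A real crawler would also need politeness delays, robots.txt handling and error handling.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical tiny "web": URL -> page source. Stands in for real fetching.
FAKE_WEB = {
    "http://example.org/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.org/a": '<a href="/b">B</a>',
    "http://example.org/b": '<a href="/">home</a> <a href="/missing">gone</a>',
}

class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_url, fetch):
    """Visit pages in list order; return (archive, visit order)."""
    url_list = deque([seed_url])   # URLs still to be visited
    seen = {seed_url}              # avoids queueing a URL twice
    archive = {}                   # raw archive: URL -> saved copy
    order = []
    while url_list:
        url = url_list.popleft()
        page = fetch(url)
        if page is None:           # page could not be fetched: skip it
            continue
        archive[url] = page        # save a copy of the page
        order.append(url)
        parser = LinkExtractor()
        parser.feed(page)          # look through the page for links
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                url_list.append(link)   # new links go to the bottom
    return archive, order

archive, order = crawl("http://example.org/", FAKE_WEB.get)
print(order)
```

Because new links are added to the bottom of the list, pages are visited breadth-first, starting from the seed URL.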
ARCHIVE:
• Created by crawlers
• Allows for further processing to obtain information about page, e.g. extraction and indexing of key terms
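One simple form of "extraction and indexing of key terms" is an inverted index, which maps each term to the pages that contain it. The sketch below assumes an invented two-page archive whose markup has already been stripped. The stop-word list is illustrative only and `build_index` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Hypothetical raw archive: URL -> saved page text (markup already removed).
archive = {
    "http://example.org/a": "Search engines crawl the web",
    "http://example.org/b": "Engines rank pages; search is ranking",
}

STOP_WORDS = {"the", "is", "a", "of"}   # illustrative, not a standard list

def build_index(archive):
    """Map each key term to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in archive.items():
        for term in re.findall(r"[a-z0-9]+", text.lower()):
            if term not in STOP_WORDS:
                index[term].add(url)
    return index

index = build_index(archive)
print(sorted(index["search"]))
print(sorted(index["crawl"]))
```

A query for a term then becomes a lookup in the index rather than a scan of every saved page.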
DATABASE:
• Ranks pages according to relevance to query
• Google uses PageRank, based on incoming links, to infer authority
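The idea of inferring authority from incoming links can be illustrated with a simplified PageRank computed by repeated iteration. This is a teaching sketch, not Google's production algorithm: the four-page link graph is invented, `pagerank` is a hypothetical function, and the damping value of 0.85 and the even spreading of rank from pages with no outgoing links are assumptions of this sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Every link target must itself be a key of links.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}          # start with equal rank
    for _ in range(iterations):
        # every page receives a small baseline share
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                # a page passes its rank on, split across its links
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # page with no outgoing links: spread its rank evenly
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical link graph: A, B and D all link to C; C links back to A.
links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph C, which has the most incoming links, ends up with the highest score. A comes second because its single incoming link is from the highly ranked C, which shows that who links to a page matters as well as how many pages do.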
Implications
The more popular you are, the more popular you become
Niches are important
Older nodes (sites) tend to be more popular than new ones, but only on average
Money alone is not enough to guarantee future popularity or growth, but relevance and connection to already popular nodes can be
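The first and third implications above can be illustrated with a toy growth model. The sketch assumes one common way of modelling "the more popular you are, the more popular you become": each new node links to one existing node, chosen with probability proportional to the links that node already has. The model, the function `grow_network` and the network size are assumptions made for illustration, not taken from the readings.

```python
import random

def grow_network(n_nodes, seed=0):
    """Grow a network one node at a time and return each node's link count.

    Each new node links to one existing node, picked with probability
    proportional to that node's current number of links.
    """
    rng = random.Random(seed)
    degree = [1, 1]        # start with two nodes joined by one link
    endpoints = [0, 1]     # each node appears once per link it has
    for new_node in range(2, n_nodes):
        target = rng.choice(endpoints)   # well-linked nodes are likelier
        degree[target] += 1
        degree.append(1)                 # the new node has its one link
        endpoints.extend([target, new_node])
    return degree

degree = grow_network(1000)
oldest = sum(degree[:100]) / 100     # average links, 100 oldest nodes
newest = sum(degree[-100:]) / 100    # average links, 100 newest nodes
print("oldest:", oldest, "newest:", newest, "max:", max(degree))
```

Comparing the averages for the oldest and newest nodes shows the advantage of age, while printing individual entries of `degree` shows that this holds only on average.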
“The most important change the web brings us is not this increase of information. The real change on the web is in the technologies of attention, the ways in which individuals come to attend to particular content.”
Halavais (2009): 69
Glossary (Halavais, 2009: 196-7)
Google bowling: Making a competitor look like a search spammer by employing obvious spam techniques on their behalf
Google dance: Reordering of PageRank after Google completes a new crawl
Googlebomb: An attempt to associate a key phrase with a given website by collectively using that phrase in links to that site
Googlejuice: An imaginary representation of the reputational currency provided by linking from one site to another, thereby improving PageRank
Keyword stuffing: Hiding many unrelated keywords, or a large number of the same keyword, on a page to improve its representation in search results
Link farming: Creation of large numbers of pages with the single intent of linking to a page and thus increasing its apparent popularity
Link slutting/whoring: Creating specific content for a site, etc., with the aim of collecting inbound links from other sites
Link spamming: Use of links to deceive search engines as to the reputation of a target site