Understanding Web Searching
Secondary Readings and So On…
Will Meurer for WIRED
October 7, 2004
Introduction
• Why do we care about how people use the Web?
• Today’s topics (10/7, not the present age):
  – Implicit vs. explicit feedback
  – Representation effectiveness
  – Browser-based activities
  – History mechanisms
  – How do we cater to the people?
  – Resources
  – Research
Implicit vs. Explicit Feedback
Reading Time, Scrolling and… (Kelly & Belkin, 2001)
• Implicit feedback (Morita & Shinoda):
  – Time spent on a page is directly related to user interest. Backed by many studies.
• Explicit feedback (this study):
  – Time spent on a page is similar for relevant and irrelevant content.
• Results suggest:
  – “Generalizability” is severely affected by explicit feedback methods.
  – Spend time to choose the right feedback type!
Implicit vs. Explicit Feedback
Reading Time, Scrolling and… (Kelly & Belkin, 2001)
• Why do the results differ?
  – Relevance was difficult to distinguish this time
  – Participants were truly interested in the content in former studies
  – Users may have rushed to complete the task in this experimental context
Representation Effectiveness
How we really use the Web (Krug, 2000)
Three “facts of life”:
1. “We don’t read pages. We scan them.”
  – Why? Hurry, necessity, habit
  – If we are to read a page in its entirety, we save or print it! (ClearType project)
Representation Effectiveness
How we really use the Web (Krug, 2000)
2. “We don’t make optimal choices. We satisfice.”
  – Why? Hurry, quick access to and fro, less work than thinking
  – Generally, it’s more productive to guess.
Representation Effectiveness
How we really use the Web (Krug, 2000)
3. “We don’t figure out how things work.”
  – Why? Not important, “if it ain’t broke (baroque)…”
  – Is it important to us whether the user understands how it works or not? Why?
Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
• Users get lost on the Web. Why?
• It is not just interactivity between user and system, but among user, task, and information
• An analysis structure for browsing behavior is presented and tested: “The Interactivity Framework” or “How we should analyze cognitive strategies”
Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
• The Interactivity Framework
  – User Level – Web experience, cognitive processes, cognitive style, knowledge (CS majors knew more about SE processes)
  – User Strategies – based on searching structure (or lack of), task nature
SEARCHING CONDITIONS: FACT FINDING | EXPLORATORY
DISPERSED STRUCTURE
• Look for data base algorithm in Java
• Look for criteria for the diagnosis of diseases
• Find all the available jobs for profession
CATEGORY STRUCTURE
• Look for word definition
• Find all information about 1997 Nobel Prize for Literature
Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
  – Information Structure
    • Internal (user’s) representation
    • External (system’s) representation
    • Computational Offloading – How much work does the user have to do to understand and how much does a representation help?
  – Re-representation – How much it makes problem solving easier or more difficult
  – Graphical Constraining – How it constrains inferences
  – Temporal and Spatial Constraining – How it helps when distributed over time and space
Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
SEARCHING TASK | EXPERIENCED WEB-PARTICIPANTS | NOVICE WEB-PARTICIPANTS
INFORMATION IN WEB: DISPERSED STRUCTURE (e.g. find criteria for a psychological disease)
SPECIFIC FACT FINDING:
• Bottom-up
• Mixed strategy at the beginning and selecting bottom-up
• Start with top-down and change at the end to bottom-up
• Start typing without knowing why
EXPLORATORY:
• Top-down
INFORMATION IN WEB: CATEGORY STRUCTURE (e.g. find a job opening)
• Mixed strategy at the beginning and then selecting top-down
• Top-down
• Top-down following browser categories
• Start with bottom-up and change to top-down
Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
• More Results
  – Experienced users searched with a plan
  – By having a plan you keep a more internal representation and focus your search
  – Inexperienced users were more influenced by external representations
  – Computational Offloading Results
    • Must explain
  – How have these issues changed?
Representation Effectiveness
Cognitive Strategies in Web… (Navarro-Prieto, et al, 1999)
• Conclusions
  – Cognitive strategies used by the participants depend on how the information is structured.
  – Interaction is a multi-dimensioned concept.
  – Search engine interfaces should be designed to have less restrictive external representation.
Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)
• User study of browsing events at Georgia Tech (xMosaic browser)
• Three main browsing strategies identified:
  – Search browsing – directed search, goal known
  – General purpose browsing – consulting highly likely sources for needed information (dictionary.com)
  – Serendipitous browsing – random
  – Most people use a combination of these
Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)
• Results
  – Users were patient 99% of the time for long page loads
  – 1222 unique sites accessed outside of GATech (~16% of Web servers)
  – Paths were calculated (sequences of page navigation)
    • Per session, paths of 7 different sites occurred 5 times
    • Per user, paths of 8 different sites occurred 9 times
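Path counts like these come from treating each session as an ordered sequence of pages and tallying the subsequences that repeat. A minimal sketch of that kind of tally, assuming sessions are already available as lists of page identifiers; the function name `count_paths` and the sample sessions are hypothetical and not taken from the study:

```python
from collections import Counter

def count_paths(sessions, length):
    """Count how often each contiguous path of `length` pages occurs across sessions."""
    counts = Counter()
    for pages in sessions:
        for i in range(len(pages) - length + 1):
            counts[tuple(pages[i:i + length])] += 1
    return counts

# Hypothetical sessions: each list is the ordered pages one user visited.
sessions = [
    ["index", "news", "index", "staff"],
    ["index", "news", "sports"],
]
paths = count_paths(sessions, 2)
print(paths.most_common(1))
```

Raising `length` finds longer paths, which naturally occur less often.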
Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)
• More Results
  – 2% of the retrieved pages were saved or printed
  – Based on user’s slope, browsing strategy categories were applied
  – Slope can also categorize usage patterns of Web documents
  – Users tended to operate in one small area of a site
Browser-based Activities
Characterizing Browsing… (Catledge & Pitkow, 1995)
• Design Strategies
  – Users averaged 10 pages per server
    • Make most important info within 2 or 3 jumps from the index
    • Do not put too many links on one page – increases search time (back, forward, back, site map, etc.)
  – Facilitate the likely visitor browser patterns
    • Maybe make more than one version of your page?
    • Most work well in a “hub and spoke” environment
• The Future
  – Offer site tour based on most frequently traveled paths
  – Alter page design dynamically based on site trends
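The "2 or 3 jumps from the index" guideline can be checked mechanically with a breadth-first search over a site's link graph. The sketch below assumes the site map is available as a dictionary of page to linked pages; the `click_depths` name, the sample site, and the threshold of 3 are illustrative choices, not part of the study:

```python
from collections import deque

def click_depths(links, start):
    """Breadth-first search: minimum number of link jumps from `start` to each page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site map: page -> pages it links to.
site = {
    "index": ["products", "about"],
    "products": ["widget"],
    "widget": ["specs"],
    "specs": ["manual"],
}
depths = click_depths(site, "index")
too_deep = sorted(p for p, d in depths.items() if d > 3)
print(too_deep)
```

Pages listed in `too_deep` are candidates for a shortcut link closer to the index.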
History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Purpose: Provide empirical data to aid in the development of effective history mechanisms
  – Understand revisitation patterns
  – Evaluate current mechanisms and suggest best practices and methods
• Data Collection
  – Altered version of xMosaic to record activity
  – Survey of users afterward
History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Revisitation Results
  – 58% recurrence rate (>40% are new pages!)
  – As people search they build their vocabulary
  – 7 browsing strategies
    • First-time visits to cluster of pages
    • Revisits to pages
    • Authoring of pages (high reload percentage)
    • Regular use of web-based apps
    • Hub-and-spoke (breadth-first approach)
    • Guided tour (e.g. next page links)
    • Depth-first search (following links deeply before returning to the index)
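A recurrence rate of this kind is usually computed as the share of page visits that return to a URL already seen: (total visits minus distinct URLs) divided by total visits. That formula is an assumption here, since the slide gives only the resulting figure; the function name and the sample log are hypothetical:

```python
def recurrence_rate(visits):
    """Fraction of visits that return to a URL already seen earlier in the log."""
    if not visits:
        return 0.0
    return (len(visits) - len(set(visits))) / len(visits)

# Hypothetical log: 6 visits to 3 distinct URLs, so 3 visits are repeats.
log = ["a", "b", "a", "c", "a", "b"]
print(recurrence_rate(log))
```

Under this definition, a 58% rate means that a little over 40% of visits go to pages the user has not seen before.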
History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Revisitation Results
  – Visit frequency as a function of distance
    • Users mostly revisit recently visited pages (within about 6 jumps)
    • 39% chance that the next URL will match one of the previous 6 pages visited
  – Access frequency
    • 60% of pages visited only once
    • 19% visited twice
    • 8% visited 3 times
    • 4% visited 4 times
  – Locality (not valuable for predicting next page)
    • Most locality sets were small
    • Only 2.5 to 4.5 URLs per set
    • Only 15% of pages were part of a locality set
  – Paths (not valuable for predicting next page)
    • Could these be captured and offered in a history mechanism?
    • Time per page could indicate path
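The "previous 6 pages" figure can be read as a hit rate: how often the next URL already appears in a short window of recent visits. A rough sketch under that reading follows. Counting raw visits in the window (rather than distinct pages) is an assumption, and `window_hit_rate` and the sample log are hypothetical:

```python
def window_hit_rate(visits, window=6):
    """Share of visits whose URL appears among the `window` visits just before it."""
    if len(visits) < 2:
        return 0.0
    hits = sum(
        1
        for i in range(1, len(visits))
        if visits[i] in visits[max(0, i - window):i]
    )
    # The first visit has no history, so it is excluded from the denominator.
    return hits / (len(visits) - 1)

log = ["a", "b", "a", "c", "d", "b"]
print(window_hit_rate(log, window=2))
print(window_hit_rate(log, window=6))
```

Plotting the hit rate against `window` shows how quickly the benefit of a longer recency list levels off.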
History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Mechanism types
  – Recency Ordered
    • Sequential order based on time accessed
    • Repeated entries for revisitation
    • “Pruned” by keeping only first instance or only last
    • Simple for users to understand (they remember paths)
  – Frequency Ordered
    • Most visited at top, least visited at bottom
    • User interest changes; newly visited URLs must build up frequency before they rise in the list
    • How to break ties – last visited, earliest visited
    • When few items are on the list, this suffers
    • Difficult for users to understand
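The two orderings can be sketched in a few lines. This is a toy illustration, not the mechanisms evaluated in the study: the function names are hypothetical, and breaking frequency ties by most recent visit is just one of the options listed above:

```python
from collections import Counter

def recency_list(visits, keep="last"):
    """Most recent first, pruned to one entry per URL.

    keep="last" positions each URL by its latest visit,
    keep="first" by its earliest visit.
    """
    seen = set()
    ordered = []
    source = reversed(visits) if keep == "last" else visits
    for url in source:
        if url not in seen:
            seen.add(url)
            ordered.append(url)
    # For keep="first" the list was built oldest-first, so flip it.
    return ordered if keep == "last" else ordered[::-1]

def frequency_list(visits):
    """Most visited first; ties broken by most recent visit (an assumed rule)."""
    counts = Counter(visits)
    last_seen = {url: i for i, url in enumerate(visits)}
    return sorted(counts, key=lambda u: (-counts[u], -last_seen[u]))

log = ["a", "b", "c", "a", "b", "d"]
print(recency_list(log, keep="last"))
print(recency_list(log, keep="first"))
print(frequency_list(log))
```

In the sample log, "d" tops both recency lists but sits below "a" and "b" in the frequency list, which illustrates why a newly interesting page is slow to surface under frequency ordering.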
History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Stack-based
  – Recently visited at top
  – Order and availability depend on:
    • Loading – causes page to be added to the top
    • Recalling – changes pointer to the currently displayed page
    • Revisiting – user reloads the page, has no effect on the stack
  – Keeps duplicates
  – Non-persistent vs. persistent (btw sessions)
  – Better than recency at short distances
  – Users have difficulty understanding this model
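A toy model makes the three operations concrete. The class below is a sketch only. In particular, discarding the entries above the pointer when a page is loaded from the middle of the stack is an assumption based on how common browsers behave, not something stated on the slide, and the `HistoryStack` name is hypothetical:

```python
class HistoryStack:
    """Toy model of a session history stack with a pointer to the displayed page."""

    def __init__(self):
        self.pages = []      # bottom of the stack first
        self.pointer = -1    # index of the currently displayed page

    def load(self, url):
        # Assumption: loading from the middle of the stack discards
        # the entries above the pointer before adding the new page.
        del self.pages[self.pointer + 1:]
        self.pages.append(url)
        self.pointer = len(self.pages) - 1

    def back(self):
        # Recalling: only the pointer moves, the stack is unchanged.
        if self.pointer > 0:
            self.pointer -= 1
        return self.current()

    def reload(self):
        # Revisiting: no effect on the stack.
        return self.current()

    def current(self):
        return self.pages[self.pointer] if self.pages else None

h = HistoryStack()
for url in ["a", "b", "c"]:
    h.load(url)
h.back()        # pointer moves to "b"; the stack still holds a, b, c
h.load("d")     # "c" is discarded and "d" goes on top
h.load("a")     # duplicates are kept
print(h.pages)
```

The discarded "c" is one reason users find this model hard to predict: a page they visited can vanish from the list without any explicit action on it.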
History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Hierarchically Structured
  – Recency ordered hyperlink sublists
    • Like recency w/ latest position saved
    • Each URL has its own sublist of links from that page
    • Helps with common linking paths
    • Easier to understand
  – Context-sensitive web subspace
    • Somewhat of a combination of the above-mentioned and stack-based approaches
    • Gives user better understanding of context of his/her searches
    • May be difficult to remember where a certain URL was
    • I THINK this approach would be a great tool
History Mechanisms (in browsers)
Revisitation Patterns in… (Tauscher & Greenberg, 1997)
• Do users actually use history mechanisms?
  – Less than 1% of navigation
  – 3% involve favorites
  – 30% of navigation was back button usage
How do we cater to the people?
• Inter-site browsing strategies are not easy to tackle. How would you control that?
• Why should we attempt to understand user behavior and search strategies?
  – Formulate general design principles (e.g. 3 level depth)
  – Design for multiple searching personalities
  – Understand how to survey your intended users or get feedback most appropriately
  – Identify importance of all aspects of the development process and allocate resources accordingly
How do we cater to the people?
Some Bright Ideas
• Personalized search
  – Learning systems – You might also like…
  – www.a9.com (history, favorites, personalized interface)
  – But what about changing for different types of user behavior based on the user’s path history on your server?
    • Researched since 1995 and earlier!
    • What has resulted?
    • Microsoft ASP.NET 2.0 – Web Parts
What resources are out there?
• xMosaic 2.6 download, for those of you so excited
• Architecture of the World Wide Web, http://www.w3.org/TR/webarch/
• Sum Sun Sug Gestions, http://www.sun.com/980713/webwriting/
• Jakob Nielsen – research on content usability, http://useit.com/alertbox/9710a.html
Research
• Vox Populi: The Public Searching Of The Web (2001)
  – Compares statistics from two studies
  – Shows how public searching changed from 1997 to 1999
• Usage Patterns of a Web-Based Library Catalog (2001), Michael D. Cooper
• Real Life, Real Users, and Real Needs: A Study and Analysis of User Queries on the Web (2000), Jansen, Spink & Saracevic
• Redefining the Browser History in Hypertext Terms (), Mark Ollerenshaw