ISP 433/633 Week 12
User Interface in IR
Why care about User Interface in IR
• Human search using IR depends on
  – Search in IR and search in human memory
    • One search in service of another
    • Lexical Memory -> Computer IR -> Visual
• Design problem is really to make them work together better
  – Don’t just speed up search engine!
  – Provide help for Lexical Memory Problem
  – Provide help for Visual Search of Results
Slide by James Landay
What is HCI?
[Diagram labels: Humans, Technology, Task, Design, Organizational & Social Issues]
Goal of HCI
• Well-designed interactive computer systems
  – Promote
    • Positive feelings of success
    • Competence
    • Mastery
  – Allow users to concentrate on their work, exploration, or pleasure, rather than on the system or the interface
Interaction Paradigms for IR
• Direct manipulation
  – Query specification
  – Query refinement
  – Result selection
• Delegation
  – Agents
  – Recommender systems
  – Filtering
HCI Questions for IR
• Where does a user start?
  – Faced with a large set of collections, how can a user choose one to begin with?
• How will a user formulate a query?
• How will a user scan, evaluate, and interpret the results?
• How can a user reformulate a query?
Starting Points
• Faced with a prompt or an empty entry form … how to start?
  – Lists of sources
  – Overviews
    • Clusters
    • Category Hierarchies/Subject Codes
    • Co-citation links
  – Examples, Wizards, and Guided Tours
  – Automatic source selection
List of Sources
• Have to guess based on the name
• Requires prior exposure/experience
Overviews
• Supervised (manual) category overviews
  – Yahoo!
  – MeSHBrowse
• Unsupervised (automated) groupings
  – Clustering
  – Kohonen feature maps
Example: MeSH and MedLine
• MeSH category hierarchy
  – Medical Subject Headings
  – ~18,000 labels
  – Manually assigned
  – ~8 labels/article on average
  – Average depth: 4.5
  – Max depth: 9
• Top level categories:
  anatomy, diagnosis, related disc, animals, psych, technology, disease, biology, humanities, drugs, physics
MeSHBrowse (Korn & Shneiderman 95)
Pro and con of Category Labels
• Advantages:
  – Interpretable
  – Capture summary information
  – Describe multiple facets of content
  – Domain dependent, and so descriptive
• Disadvantages:
  – Do not scale well (for organizing documents)
  – Domain dependent, so costly to acquire
  – Vocabulary Problem!
Text Clustering
• What clustering does:
  – Finds overall similarities among groups of documents
  – Finds overall similarities among groups of tokens
  – Picks out some themes, ignores others
• How clustering works:
  – Cluster entire collection
  – Find cluster centroid that best matches the query
  – Problems with clustering
    • It is expensive
    • It doesn’t work well
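The step "find cluster centroid that best matches the query" can be sketched as follows. This is a minimal illustration, not the method of any particular system: it assumes plain term-frequency vectors and cosine similarity, and the clusters are hand-made for the example. The names `tf_vector`, `centroid`, `cosine`, and `best_cluster` are hypothetical.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector for whitespace-tokenized text."""
    return Counter(text.lower().split())

def centroid(vectors):
    """Average the term weights over all vectors in one cluster."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {term: weight / len(vectors) for term, weight in total.items()}

def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def best_cluster(query, clusters):
    """Return the name of the cluster whose centroid best matches the query."""
    q = tf_vector(query)
    return max(clusters, key=lambda name: cosine(q, centroid(clusters[name])))

# Assumption: clusters are given by hand here; a real system would compute them.
clusters = {
    "astronomy": [tf_vector("star galaxy telescope"), tf_vector("star planet orbit")],
    "film": [tf_vector("movie star actor"), tf_vector("film director actor")],
}

print(best_cluster("telescope orbit", clusters))
```

Note that the expensive part, clustering the entire collection, is not shown; only the comparatively cheap query-to-centroid match is.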
Scatter/Gather
• How it works
  – Cluster sets of documents into general “themes”, like a table of contents
  – Display the contents of the clusters by showing topical terms and typical titles
  – User chooses subsets of the clusters and re-clusters the documents within
  – Resulting new groups have different “themes”
• Originally used to give collection overview
• Evidence suggests more appropriate for displaying retrieval results in context
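The scatter, summarize, gather, re-scatter loop described above might look roughly like this. It is a sketch under strong assumptions: a naive nearest-seed assignment (the first k documents act as seeds, compared by Jaccard overlap) stands in for a real clustering algorithm, and all function names are hypothetical.

```python
from collections import Counter

def tokens(doc):
    """Set of lowercase whitespace-separated tokens in a document."""
    return set(doc.lower().split())

def overlap(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def scatter(docs, k):
    """Naive stand-in clustering: each document joins the most similar seed."""
    seeds = docs[:k]
    groups = [[] for _ in seeds]
    for doc in docs:
        best = max(range(len(seeds)),
                   key=lambda i: overlap(tokens(doc), tokens(seeds[i])))
        groups[best].append(doc)
    return groups

def topical_terms(group, n=3):
    """Most widely shared terms in a group; ties are broken alphabetically."""
    counts = Counter(t for doc in group for t in tokens(doc))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]

def gather(groups, chosen):
    """Merge the clusters the user selected into one document set."""
    return [doc for i in chosen for doc in groups[i]]

docs = [
    "star galaxy telescope",
    "movie star actor",
    "galaxy telescope orbit",
    "actor film award",
    "star actor film",
]

groups = scatter(docs, 2)                   # first overview of the "star" results
summary = [topical_terms(g, 2) for g in groups]
focused = scatter(gather(groups, [1]), 2)   # user picks cluster 1, then re-clusters
```

Each round narrows the document set, so the re-clustered groups surface themes that were hidden inside the chosen cluster.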
S/G Example: Query on “star”
Another Use of Clustering
• Use clustering to map the entire huge multidimensional document space into a number of small clusters
• “Project” these onto a 2D graphical representation
“ThemeScapes” Clustering
Kohonen Feature Maps on Text
Pro and con of Clustering
• Advantages:
  – Get an overview of main themes
  – Domain independent
• Disadvantages:
  – Many of the ways documents could group together are not shown
  – Not always easy to understand what they mean
  – Can’t see what documents are about
  – Documents forced into one position in semantic space
  – Hard to view titles
• Perhaps more suited for pattern discovery
  – Not good for search
Query Specification
• Interaction styles (Shneiderman 97)
  – Command language
  – Form fill
  – Menu selection
  – Direct manipulation
  – Natural language
• What about gesture, eye-tracking, or implicit inputs like reading habits?
Direct Manipulation Query Specification
Menu-Based Query Specification
Display of Retrieval Results
• Goal:
  – Minimize time/effort for deciding which documents to examine in detail
• Idea:
  – Show the roles of the query terms in the retrieved documents, making use of document structure
Putting Results in Context
• Interfaces should
  – Give hints about the roles terms play in the collection
  – Give hints about what will happen if various terms are combined
  – Show explicitly why documents are retrieved in response to the query
  – Summarize compactly the subset of interest
• To support both Query and Navigation
{Query v. Nav} x {Search v. Browse}
[Matrix: Strategies (Query, Navigate) against Tasks (Browse, Search)]
Hybrid strategies: Query + Navigation
1. Query Guided Navigation
   E.g., Superbook
   – Query hits posted against structure
   – Pattern of hits gives additional navigational cues for your target
2. Query Initiated Navigation
   E.g., typical use of WWW query engine
   – Query to neighborhood, then navigate to target
   – Your vocabulary choice just has to occur near your target
Superbook
• Interface to very large hierarchically structured texts.
• Full text searching
• Posting hits against the table of contents
• Fisheye View of table of contents
• Highlighting hits in the text
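Posting hits against a hierarchical table of contents can be illustrated with a small sketch. It assumes the text is a nested dictionary with hypothetical `title`, `text`, and `children` keys, and that a section's hit count includes its subsections; this is an illustration of the idea, not Superbook's actual implementation.

```python
def post_hits(section, term):
    """Return (title, hits, children), where hits includes all subsections."""
    own = section.get("text", "").lower().split().count(term.lower())
    children = [post_hits(child, term) for child in section.get("children", [])]
    total = own + sum(hits for _, hits, _ in children)
    return (section["title"], total, children)

def render(node, depth=0):
    """Lay out the table of contents with the hit count in front of each title."""
    title, hits, children = node
    lines = [f"{'  ' * depth}{hits:>3}  {title}"]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines

# Hypothetical document structure used only for this example.
book = {
    "title": "Manual",
    "children": [
        {"title": "Installation", "text": "install the index server"},
        {
            "title": "Searching",
            "text": "query the index",
            "children": [
                {"title": "Ranking", "text": "index terms and index weights"},
            ],
        },
    ],
}

tree = post_hits(book, "index")
print("\n".join(render(tree)))
```

The rolled-up counts are what give the navigational cue: a reader can see which branch of the hierarchy holds most hits before opening it.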
TileBars
• Graphical representation of term distribution and overlap
• Simultaneously indicate:
  – Relative document length
  – Query term frequencies
  – Query term distributions
  – Query term overlap
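The data behind such a display can be sketched as one row of per-segment counts for each query term. The sketch below assumes fixed-size word windows as the segments ("tiles") and a crude text rendering; the names `tile_counts` and `render_bar` and the `tile_size` parameter are hypothetical.

```python
def tile_counts(text, query_terms, tile_size=4):
    """Count each query term in consecutive fixed-size word windows (tiles)."""
    words = text.lower().split()
    tiles = [words[i:i + tile_size] for i in range(0, len(words), tile_size)]
    return {term: [tile.count(term.lower()) for tile in tiles]
            for term in query_terms}

def render_bar(counts):
    """Text stand-in for shaded tiles: a denser character means more hits."""
    shades = " .:#"
    return "".join(shades[min(c, len(shades) - 1)] for c in counts)

doc = ("dbms reliability matters because a dbms stores data "
       "and banking systems need reliability")

grid = tile_counts(doc, ["dbms", "reliability"])
for term, counts in grid.items():
    print(f"{term:<12}|{render_bar(counts)}|")
```

In this layout the number of tiles reflects relative document length, each row shows one term's frequency and distribution, and a column where both rows are non-empty marks a passage where the query terms overlap.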
TileBars Example
• Mainly about both DBMS & reliability
• Mainly about DBMS, discusses reliability
• Mainly about, say, banking, with a subtopic discussion on DBMS/Reliability
• Mainly about high-tech layoffs
Query terms: DBMS and Reliability
VIBE (Olson et al. 93, Korfhage 93)
VR-VIBE
Query Reformulation
• Thesaurus expansion
  – Suggest terms similar to query terms
• Relevance feedback
  – Suggest terms (and documents) similar to retrieved documents that have been judged to be relevant
  – “More like this” interaction
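A very simple form of relevance feedback can be sketched as follows: suggest the terms that appear in the most documents judged relevant, excluding terms already in the query. This is a deliberately naive illustration (no term weighting, no stopword removal), and the function name `suggest_terms` is hypothetical.

```python
from collections import Counter

def suggest_terms(query, relevant_docs, n=3):
    """Suggest up to n new terms, ranked by how many relevant documents contain them."""
    query_terms = set(query.lower().split())
    counts = Counter()
    for doc in relevant_docs:
        # Count each term once per document, skipping terms already in the query.
        counts.update(set(doc.lower().split()) - query_terms)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]

# Assumption: the user has marked these three results as relevant.
judged_relevant = [
    "star galaxy telescope",
    "galaxy star cluster",
    "telescope galaxy images",
]

suggestions = suggest_terms("star", judged_relevant, n=2)
```

Adding the suggested terms to the query is one way to realize the "more like this" interaction.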
Letizia (Lieberman)
• Watches what you like
  – did you save page
  – use it as a jumping off place
  – what search terms have you used
  – passed-over link
• Extracts keywords etc. to profile user’s interest
• Looks ahead in your neighborhood evaluating what it finds