ISP 433/633 Week 12 User Interface in IR. Why care about User Interface in IR Human Search using IR...

ISP 433/633 Week 12

User Interface in IR

Why care about User Interface in IR

• Human search using IR depends on
  – Search in IR and search in human memory
• One search in service of another
• Lexical Memory -> Computer IR -> Visual
• Design problem is really to make them work together better
  – Don’t just speed up search engine!
  – Provide help for Lexical Memory Problem
  – Provide help for Visual Search of Results

Slide by James Landay

What is HCI?

[Diagram labels: Humans, Technology, Task, Design, Organizational & Social Issues]

Goal of HCI

• Well-designed interactive computer systems
  – Promote
    • Positive feelings of success
    • Competence
    • Mastery
  – Allow users to concentrate on their work, exploration, or pleasure, rather than on the system or the interface

Interaction Paradigms for IR

• Direct manipulation
  – Query specification
  – Query refinement
  – Result selection
• Delegation
  – Agents
  – Recommender systems
  – Filtering

HCI Questions for IR

• Where does a user start?
  – Faced with a large set of collections, how can a user choose one to begin with?

• How will a user formulate a query?

• How will a user scan, evaluate, and interpret the results?

• How can a user reformulate a query?

Starting Points

• Faced with a prompt or an empty entry form … how to start?
  – Lists of sources
  – Overviews
    • Clusters
    • Category Hierarchies/Subject Codes
    • Co-citation links
  – Examples, Wizards, and Guided Tours
  – Automatic source selection

List of Sources

• Have to guess based on the name

• Requires prior exposure/experience

Overviews

• Supervised (manual) category overviews
  – Yahoo!
  – MeSHBrowse
• Unsupervised (automated) groupings
  – Clustering
  – Kohonen feature maps

Example: MeSH and MedLine

• MeSH category hierarchy
  – Medical Subject Headings
  – ~18,000 labels
  – Manually assigned
  – ~8 labels/article on average
  – Average depth: 4.5
  – Max depth: 9
• Top level categories:
    anatomy    diagnosis    related disc
    animals    psych        technology
    disease    biology      humanities
    drugs      physics

MeshBrowse (Korn & Shneiderman 95)

Pros and Cons of Category Labels

• Advantages:
  – Interpretable
  – Capture summary information
  – Describe multiple facets of content
  – Domain dependent, and so descriptive
• Disadvantages:
  – Do not scale well (for organizing documents)
  – Domain dependent, so costly to acquire
  – Vocabulary Problem!

Text Clustering

• What clustering does:
  – Finds overall similarities among groups of documents
  – Finds overall similarities among groups of tokens
  – Picks out some themes, ignores others
• How clustering works:
  – Cluster entire collection
  – Find cluster centroid that best matches the query
  – Problems with clustering
    • It is expensive
    • It doesn’t work well
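The two steps under "How clustering works" (cluster the whole collection, then pick the cluster whose centroid best matches the query) can be sketched as follows. This is a minimal illustration, not the method behind any particular system: it assumes bag-of-words vectors, cosine similarity, and a k-means-style loop seeded with the first k documents. The function names and the toy documents are hypothetical.

```python
from collections import Counter
import math

def vectorize(text):
    """Bag-of-words term-frequency vector (an assumed representation)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Average of the member vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return Counter({t: c / len(vectors) for t, c in total.items()})

def cluster(vectors, k, rounds=5):
    """k-means-style clustering; seeding with the first k vectors is an assumption."""
    centroids = [vectors[i] for i in range(k)]
    assignment = []
    for _ in range(rounds):
        # Assign each document to its most similar centroid.
        assignment = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                      for v in vectors]
        # Recompute each centroid from its current members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = centroid(members)
    return assignment, centroids

def best_cluster(query, centroids):
    """Index of the centroid most similar to the query."""
    q = vectorize(query)
    return max(range(len(centroids)), key=lambda c: cosine(q, centroids[c]))

docs = [
    "star galaxy telescope astronomy",
    "movie star actor film",
    "galaxy telescope star orbit",
    "film actor award movie",
]
vectors = [vectorize(d) for d in docs]
assignment, centroids = cluster(vectors, k=2)
hit = best_cluster("telescope orbit", centroids)
matching_docs = [d for d, a in zip(docs, assignment) if a == hit]
```

Even in this toy form the cost concern is visible: every round compares every document with every centroid, and the whole collection has to be clustered before the first query is answered.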

Scatter/Gather

• How it works
  – Cluster sets of documents into general “themes”, like a table of contents
  – Display the contents of the clusters by showing topical terms and typical titles
  – User chooses subsets of the clusters and re-clusters the documents within
  – Resulting new groups have different “themes”
• Originally used to give collection overview
• Evidence suggests more appropriate for displaying retrieval results in context
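The scatter, display, and gather steps can be sketched in a few lines. This is only an illustration of the interaction loop: the grouping below uses farthest-first seeds and raw term overlap as a stand-in for a real clustering algorithm, the "typical title" is simply the first document's title, and all names and documents are hypothetical.

```python
from collections import Counter

def terms(doc):
    return set(doc["text"].lower().split())

def pick_seeds(docs, k):
    """Farthest-first seeds: each new seed shares the fewest terms with the seeds so far."""
    seeds = [docs[0]]
    while len(seeds) < min(k, len(docs)):
        rest = [d for d in docs if d not in seeds]
        seeds.append(min(rest, key=lambda d: max(len(terms(d) & terms(s)) for s in seeds)))
    return seeds

def scatter(docs, k):
    """Group documents around k seeds by term overlap (stand-in for real clustering)."""
    seeds = [terms(s) for s in pick_seeds(docs, k)]
    clusters = [[] for _ in seeds]
    for d in docs:
        overlaps = [len(terms(d) & s) for s in seeds]
        clusters[overlaps.index(max(overlaps))].append(d)
    return [c for c in clusters if c]

def summarize(cluster, n_terms=2):
    """Topical terms (most frequent words) plus one title to display for the cluster."""
    counts = Counter(w for d in cluster for w in d["text"].lower().split())
    top = [t for t, _ in counts.most_common(n_terms)]
    return {"terms": top, "title": cluster[0]["title"], "size": len(cluster)}

def gather(clusters, chosen):
    """Merge the clusters the user selected into one working set."""
    return [d for i in chosen for d in clusters[i]]

docs = [
    {"title": "Stellar evolution", "text": "star fusion hydrogen astronomy"},
    {"title": "Oscar night", "text": "star film actor award"},
    {"title": "Neutron stars", "text": "star neutron astronomy density"},
    {"title": "Box office", "text": "star film movie ticket"},
]
clusters = scatter(docs, k=2)                  # scatter: two broad themes
summaries = [summarize(c) for c in clusters]   # what the interface would display
subset = gather(clusters, chosen=[0])          # user picks the first theme
regrouped = scatter(subset, k=2)               # re-cluster within the selection
```

Re-scattering the gathered subset is what produces the new, finer "themes": the seeds are chosen again from the smaller set, so the grouping criteria change with each round.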

S/G Example: Query on “star”

Another Use of Clustering

• Use clustering to map the entire huge multidimensional document space into a number of small clusters

• “Project” these onto a 2D graphical representation

“ThemeScapes” Clustering

Kohonen Feature Maps on Text

Pros and Cons of Clustering

• Advantages:
  – Get an overview of main themes
  – Domain independent
• Disadvantages:
  – Many of the ways documents could group together are not shown
  – Not always easy to understand what they mean
  – Can’t see what documents are about
  – Documents forced into one position in semantic space
  – Hard to view titles
• Perhaps more suited for pattern discovery
  – Not good for search

Query Specification

• Interaction styles (Shneiderman 97)
  – Command language
  – Form fill
  – Menu selection
  – Direct manipulation
  – Natural language

• What about gesture, eye-tracking, or implicit inputs like reading habits?

Direct Manipulation Query Specification

Menu-Based Query Specification

Display of Retrieval Results

• Goal:
  – Minimize time/effort for deciding which documents to examine in detail
• Idea:
  – Show the roles of the query terms in the retrieved documents, making use of document structure

Putting Results in Context

• Interfaces should
  – Give hints about the roles terms play in the collection
  – Give hints about what will happen if various terms are combined
  – Show explicitly why documents are retrieved in response to the query
  – Summarize compactly the subset of interest

• To support both Query and Navigation

{Query v. Nav} x {Search v. Browse}

[Figure: grid crossing Strategies (Query, Navigate) with Tasks (Browse, Search)]

Hybrid strategies: Query + Navigation

1. Query Guided Navigation
   E.g., Superbook
   – Query hits posted against structure
   – Pattern of hits gives additional navigational cues for your target

2. Query Initiated Navigation
   E.g., typical use of WWW query engine
   – Query to neighborhood, then navigate to target
   – Your vocabulary choice just has to occur near your target

Superbook

• Interface to very large hierarchically structured texts
• Full text searching
• Posting hits against the table of contents
• Fisheye View of table of contents
• Highlighting hits in the text
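"Posting hits against the table of contents" can be illustrated with a small recursive count. This is a sketch under assumptions, not Superbook's actual implementation: the table of contents is a hypothetical nested dict, a "hit" is an exact whitespace-delimited match of one query term, and each heading's count includes the hits in all of its subsections.

```python
def post_hits(node, term):
    """Return a copy of the TOC in which every node carries a 'hits' count.
    A node's count includes hits in all of its subsections."""
    own = node.get("text", "").lower().split().count(term.lower())
    children = [post_hits(c, term) for c in node.get("children", [])]
    return {
        "title": node["title"],
        "hits": own + sum(c["hits"] for c in children),
        "children": children,
    }

def render(node, depth=0):
    """Indented TOC lines, each prefixed with its hit count."""
    lines = [f"{'  ' * depth}{node['hits']:>3}  {node['title']}"]
    for c in node["children"]:
        lines.extend(render(c, depth + 1))
    return lines

# Hypothetical document structure, for illustration only.
toc = {"title": "IR Handbook", "children": [
    {"title": "Indexing", "children": [
        {"title": "Inverted files",
         "text": "an inverted index maps each term to a postings list"},
        {"title": "Compression",
         "text": "compress the index to save space"},
    ]},
    {"title": "Interfaces", "text": "users scan results"},
]}

posted = post_hits(toc, "index")
print("\n".join(render(posted)))
```

The rolled-up counts are the navigational cue: a reader can see which branch of the hierarchy holds the hits before opening any section.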

TileBars

• Graphical representation of term distribution and overlap
• Simultaneously indicate:
  – Relative document length
  – Query term frequencies
  – Query term distributions
  – Query term overlap
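The four properties listed above can all be read off one small grid: one row per query term, one column per text segment. The sketch below is an approximation that assumes fixed-size word windows as segments and plain whitespace tokenization; the function names and the sample text are hypothetical.

```python
def tilebar(text, query_terms, tile_size=5):
    """One row per query term; each cell is that term's frequency in one tile.
    Tiles are fixed-size word windows (an assumption of this sketch)."""
    words = text.lower().split()
    tiles = [words[i:i + tile_size] for i in range(0, len(words), tile_size)]
    return {t.lower(): [tile.count(t.lower()) for tile in tiles] for t in query_terms}

def overlap(rows):
    """Indices of tiles in which every query term occurs at least once."""
    if not rows:
        return []
    n_tiles = len(next(iter(rows.values())))
    return [i for i in range(n_tiles) if all(counts[i] > 0 for counts in rows.values())]

def render(rows):
    """Text rendering: darker characters stand for higher frequency."""
    shades = " .:#"
    return [f"{term:>12} |" + "".join(shades[min(c, 3)] for c in counts) + "|"
            for term, counts in rows.items()]

text = ("the dbms stores data safely "
        "reliability of a dbms matters "
        "banks need both every day")
rows = tilebar(text, ["DBMS", "Reliability"])
shared = overlap(rows)
print("\n".join(render(rows)))
```

The number of columns gives relative document length, the shading gives term frequency, the position of shaded cells gives distribution, and columns shaded in every row show where the query terms overlap.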

TileBars Example

• Mainly about both DBMS & reliability

• Mainly about DBMS, discusses reliability

• Mainly about, say, banking, with a subtopic discussion on DBMS/Reliability

• Mainly about high-tech layoffs

Query terms: DBMS and Reliability

VIBE (Olson et al. 93, Korfhage 93)

VR-VIBE

Query Reformulation

• Thesaurus expansion
  – Suggest terms similar to query terms
• Relevance feedback
  – Suggest terms (and documents) similar to retrieved documents that have been judged to be relevant
  – “More like this” interaction
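Both reformulation aids can be sketched briefly. The code below is a simplified illustration: the thesaurus is a hypothetical hand-written dict, the stopword list is a tiny placeholder, and relevance feedback is reduced to counting how many judged-relevant documents contain each candidate term (real systems typically use weighted schemes).

```python
from collections import Counter

STOPWORDS = {"the", "a", "is", "in", "of", "and", "to"}  # tiny illustrative list

def expand_with_thesaurus(query, thesaurus):
    """Append thesaurus entries for each query term (the thesaurus is a plain dict here)."""
    expanded = []
    for term in query.lower().split():
        expanded.append(term)
        expanded.extend(thesaurus.get(term, []))
    return expanded

def suggest_terms(query, relevant_docs, n=3):
    """Suggest terms that occur in the most judged-relevant documents
    and are not already in the query. Ties are broken alphabetically."""
    query_terms = set(query.lower().split())
    counts = Counter(
        w
        for doc in relevant_docs
        for w in set(doc.lower().split())
        if w not in query_terms and w not in STOPWORDS
    )
    return sorted(counts, key=lambda t: (-counts[t], t))[:n]

thesaurus = {"speed": ["velocity"]}  # hypothetical entry
relevant = [
    "jaguar cat top speed in the wild",
    "the jaguar is a big cat",
    "cat speed records",
]
expanded = expand_with_thesaurus("jaguar speed", thesaurus)
suggested = suggest_terms("jaguar speed", relevant, n=2)
```

Here the suggestion list leads with "cat", the term the user's judgments point to but the original query did not contain.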

Letizia (Lieberman)

• Watches what you like
  – did you save page
  – use it as a jumping off place
  – what search terms have you used
  – passed-over link

• Extracts keywords etc. to profile user’s interest

• Looks ahead in your neighborhood evaluating what it finds
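The profile-then-look-ahead idea can be sketched as below. This is not Letizia's actual algorithm: it assumes the profile is a simple word count over pages the user showed interest in, that the "neighborhood" is explored breadth-first to a fixed depth, and that the web is a small in-memory dict. All names and pages are hypothetical.

```python
from collections import Counter, deque

def build_profile(liked_texts):
    """Keyword profile: word counts over pages the user saved or followed."""
    profile = Counter()
    for text in liked_texts:
        profile.update(text.lower().split())
    return profile

def score(text, profile):
    """Sum of profile weights for the distinct words on a page."""
    return sum(profile[w] for w in set(text.lower().split()))

def look_ahead(start, links, pages, profile, max_depth=2):
    """Breadth-first scan of pages within max_depth links of start,
    returning those with a positive score, best first."""
    seen, queue, found = {start}, deque([(start, 0)]), []
    while queue:
        url, depth = queue.popleft()
        if depth > 0:
            found.append((score(pages[url], profile), url))
        if depth < max_depth:
            for nxt in links.get(url, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, depth + 1))
    ranked = sorted(found, key=lambda pair: (-pair[0], pair[1]))
    return [url for s, url in ranked if s > 0]

# Hypothetical pages and link structure.
pages = {
    "home": "welcome",
    "a": "jazz concerts tonight",
    "b": "tax forms",
    "c": "jazz records and jazz history",
    "d": "jazz",
}
links = {"home": ["a", "b"], "a": ["c"], "c": ["d"]}
profile = build_profile(["jazz history", "jazz festival"])
recommended = look_ahead("home", links, pages, profile)
```

Page "d" is three links away and is never examined, which shows how the depth limit keeps the agent's exploration close to what the user is currently viewing.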
