User Interface Design LBSC 708A/CMSC 838L Douglas W. Oard Session 5, October 9, 2001.

User Interface Design

LBSC 708A/CMSC 838L

Douglas W. Oard

Session 5, October 9, 2001

Agenda

• Muddiest points and questions

• Query formulation

• Selection

• Examination

• Document delivery

Muddiest Points

• Inference networks (4)

• Bayes Theorem

• Probability computation

• Relationship to vector space model

• What if a query term is in no document?

• The mud example

Problematic Assumptions

• Term independence (9)

• Binary relevance (3)

• Relevance rather than utility (2)

• Prior probability

• Relationship between terms and concepts

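Retrieval System Model

[Figure: block diagram of the retrieval process. Labeled stages: Source Selection, Query Formulation, Search, Selection, Examination, Delivery. Labeled flows: Query, Ranked List, Document. Labeled components of the IR System: Indexing, Index, Acquisition, Collection.]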

Query Formulation

[Figure: diagram with the labels User, Query Formulation, Search, Index.]

Some Facts About Queries

• IR research shows that long queries are better
  – Early TREC queries filled a page

• Search engine logs show mostly short queries
  – Averaging just over 2 words per query
  – Very few use “advanced” query interfaces
  – Almost nobody reads “help” screens

• Why don’t people do what’s good for them?
  – And what can we do about it?

A Conceptual Framework

• Four perspectives on “information needs”
  – Visceral
    • What you really want to know
  – Conscious
    • What you recognize that you want to know
  – Formalized (e.g., TREC topics)
    • How you articulate what you want to know
  – Compromised (e.g., TREC queries)
    • How you express what you want to know to a system

Compromising Information Needs

• Direct translation from a conscious info need
  – End users rarely formalize their needs first

• Constrained by perceived system capabilities
  – Vocabulary
    • Guessed, found in earlier searches, or from a thesaurus
  – Structure
    • Length, operators for combining terms, syntax

• Users learn systems and topics by exploring

Things That Help

• Encourage longer queries
  – Provide a large text entry area

• Provide examples
  – e.g., for good pizza, type: +Chicago +“deep dish”
  – Examples related to the last query are best

• Offer lists of related terms
  – Terms with high IDF in documents from the last retrieved set
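
One way to offer related terms is to rank the terms that occur in the last retrieved set by inverse document frequency (IDF). The sketch below is only an illustration under simple assumptions: a toy in-memory collection, a whitespace tokenizer, and the hypothetical helper names `idf` and `suggest_terms`, none of which come from the lecture.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()

def idf(collection):
    """Map each term to log(N / df), where df counts documents containing it."""
    n_docs = len(collection)
    df = Counter()
    for doc in collection:
        df.update(set(tokenize(doc)))
    return {term: math.log(n_docs / count) for term, count in df.items()}

def suggest_terms(collection, retrieved_ids, query, k=3):
    """Return up to k non-query terms from the retrieved set, highest IDF first."""
    weights = idf(collection)
    query_terms = set(tokenize(query))
    candidates = set()
    for doc_id in retrieved_ids:
        candidates.update(tokenize(collection[doc_id]))
    candidates -= query_terms
    # Sort by descending IDF, breaking ties alphabetically for stable output.
    return sorted(candidates, key=lambda t: (-weights[t], t))[:k]

# Toy collection (made up for this example).
collection = [
    "chicago deep dish pizza",
    "chicago pizza delivery",
    "new york thin crust pizza",
    "chicago weather",
]
suggestions = suggest_terms(collection, retrieved_ids=[0, 1], query="chicago pizza")
print(suggestions)  # ['deep', 'delivery', 'dish']
```

Common terms such as "chicago" and "pizza" have low IDF and are already in the query, so the suggestions are the rarer terms that distinguish the retrieved documents.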

Things That Hurt

• Obscure ranking methods
  – Unpredictable effects of adding or deleting terms
    • Only single-term queries avoid this problem

• Counterintuitive statistics
  – “clis”: AltaVista says 3,882 docs match the query
  – “clis library”: 27,025 docs match the query!
    • Every document with either term was counted
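
The jump in the match count follows from counting the union of the posting lists rather than their intersection. A minimal sketch, assuming a hypothetical inverted index with made-up document IDs (the real AltaVista counts are not reproduced here):

```python
def matching_docs(index, terms, mode):
    """Return the IDs of documents matching all terms ("and") or any term ("or")."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    if mode == "and":
        return set.intersection(*postings)
    return set.union(*postings)

# Hypothetical inverted index: term -> set of document IDs.
index = {
    "clis": {1, 2, 3},
    "library": {3, 4, 5, 6, 7},
}

one_term = matching_docs(index, ["clis"], "or")
two_terms_or = matching_docs(index, ["clis", "library"], "or")
two_terms_and = matching_docs(index, ["clis", "library"], "and")
print(len(one_term), len(two_terms_or), len(two_terms_and))  # 3 7 1
```

A user who expects an added term to narrow the search is thinking of the "and" count, while the reported number is the "or" count.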

What About Boolean Queries?

• Present a set of text boxes
  – OR the terms in each box
  – AND the boxes together

• Allow graphical query depictions
  – Several techniques have been tried

[Figure: graphical query depiction using the terms “poetry”, “Milton”, and “Shakespeare”.]
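
The text-box scheme can be sketched in a few lines. This is an illustrative example only: the function names are hypothetical, and the grouping of the three terms into boxes is an assumption, since the structure of the original figure is not recoverable.

```python
def boxes_to_query(boxes):
    """Render text boxes as a Boolean string: OR within a box, AND across boxes."""
    clauses = []
    for box in boxes:
        terms = box.split()
        if terms:  # skip boxes the user left empty
            clauses.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(clauses)

def matches(boxes, doc_terms):
    """True if the document contains at least one term from every non-empty box."""
    doc_terms = set(doc_terms)
    return all(
        any(term in doc_terms for term in box.split())
        for box in boxes
        if box.split()
    )

# Assumed grouping: one box for the genre, one for the authors, one left empty.
boxes = ["poetry", "Milton Shakespeare", ""]
print(boxes_to_query(boxes))  # (poetry) AND (Milton OR Shakespeare)
print(matches(boxes, ["poetry", "Shakespeare", "sonnet"]))  # True
print(matches(boxes, ["poetry", "Keats"]))  # False
```

The appeal of this design is that users never type an operator: the layout of the boxes carries the Boolean structure.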

Alternate Query Modalities

• Spoken queries
  – Used for telephone and hands-free applications
  – Reasonable performance with limited vocabularies
    • But some error correction method must be included

• Handwritten queries
  – Palm pilot graffiti, touch-screens, …
  – Fairly effective if some form of shorthand is used
    • Ordinary handwriting often has too much ambiguity

Browsing Retrieved Sets

• Uses the detection stage’s output
  – Unranked sets, ranked lists, document clusters

• Two goals
  – Identify documents for some form of delivery
  – Enrich the query in some way

• Two stages
  – Select promising documents
  – Examine those documents individually

Indicative vs. Informative

• Terms often applied to document abstracts
  – Indicative abstracts support selection
    • They describe the contents of a document
  – Informative abstracts support understanding
    • They summarize the contents of a document

• Applies to any information presentation
  – Presented for indicative or informative purposes

Browsing Goals

• Identify documents for some form of delivery
  – An indicative purpose

• Query Enrichment
  – Relevance feedback (indicative)
    • User designates “more like this” documents
    • System adds terms from those documents to the query
  – Manual reformulation (informative)
    • Better approximation of visceral information need
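
The relevance feedback loop described above can be illustrated with a very small sketch. It assumes that new terms are chosen by raw frequency in the designated documents, which is a simplification; practical systems usually weight candidate terms rather than counting them. The function name `expand_query` and the sample documents are hypothetical.

```python
from collections import Counter

def expand_query(query_terms, liked_docs, n_new_terms=2):
    """Add the most frequent unseen terms from user-designated documents."""
    counts = Counter()
    for doc in liked_docs:
        counts.update(t for t in doc.lower().split() if t not in query_terms)
    # Most frequent first; alphabetical order breaks ties deterministically.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return list(query_terms) + ranked[:n_new_terms]

query = ["videoconferencing"]
# Documents the user marked as "more like this" (made up for this example).
liked = [
    "videoconferencing codec isdn bandwidth",
    "isdn videoconferencing codec pricing",
]
expanded = expand_query(query, liked)
print(expanded)  # ['videoconferencing', 'codec', 'isdn']
```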

Selection

[Figure: diagram with the labels User, Search, Selection, Examination, Index, Indexing, Docs, shown with this example result list:]

About 7381 documents match your query.

1. MAHEC Videoconference Systems. Major Category. Product Type. Product. Network System. Multipoint Conference Server (MCS) PictureTel Prism - 8 port. . - size 5K - 6-Jun-97 - English -

2. VIDEOCONFERENCING PRODUCTS. Aethra offers a complete product line of multimedia and videocommunications products to meet all the applications needs of... - size 4K - 1-Jul-97 - English -

A Selection Interface Taxonomy

• One dimensional lists
  – Content: title, source, date, summary, ratings, ...
  – Order: retrieval status value, date, alphabetic, ...
  – Size: scrolling, specified number, RSV threshold

• Two dimensional displays
  – Construction: clustering, starfields, projection
  – Navigation: jump, pan, zoom

• Three dimensional displays
  – Contour maps, fishtank VR, immersive VR

Cluster Map

Cluster Formation

• Based on inter-document similarity
  – Computed using the cosine measure, for example

• Heuristic methods can be fairly efficient
  – Pick any document as the first cluster “seed”
  – Add the most similar document to each cluster
    • Adding the same document will join two clusters
  – Check to see if each cluster should be split
    • Does it contain two or more fairly coherent groups?

• Lots of variations on this have been tried
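
As one of the many possible variations, the sketch below uses single-pass clustering: each document joins the cluster whose seed it most resembles under the cosine measure, or becomes a new seed if nothing is similar enough. This is a simplified stand-in rather than the exact heuristic on the slide; it omits the join and split steps, and the threshold value and sample documents are assumptions.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def cluster(docs, threshold=0.3):
    """Single-pass clustering: join the most similar seed, or start a new cluster."""
    vectors = [Counter(d.lower().split()) for d in docs]
    clusters = []  # each cluster is a list of document indices; element 0 is the seed
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, vectors[c[0]])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([i])  # nothing similar enough: this document is a new seed
        else:
            best.append(i)
    return clusters

docs = [
    "video conference server",
    "video conference products",
    "poetry of milton",
    "milton poetry collection",
]
groups = cluster(docs)
print(groups)  # [[0, 1], [2, 3]]
```

Because each document is compared only with cluster seeds, the cost grows with the number of clusters rather than with the number of document pairs, which is why heuristics of this kind can be fairly efficient.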

Starfield

Constructing Starfield Displays

• Two attributes determine the position
  – Can be dynamically selected from a list

• Numeric position attributes work best
  – Date, length, rating, …

• Other attributes can affect the display
  – Displayed as color, size, shape, orientation, …

• Each point can represent a cluster
  – Interactively specified using “dynamic queries”
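
The data side of a starfield can be sketched as follows: two selectable attributes give each point its position, and range filters of the kind a dynamic-query slider would set decide which points are shown. The `Doc` record, its fields, and the function name are hypothetical, and no drawing is attempted.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    title: str
    year: int
    length: int   # in words
    rating: float

def starfield_points(docs, x_attr, y_attr, filters=None):
    """Return (x, y, title) points for documents that pass every range filter.

    filters maps an attribute name to an inclusive (low, high) range, the kind
    of constraint a dynamic-query slider would set.
    """
    filters = filters or {}
    points = []
    for doc in docs:
        if all(lo <= getattr(doc, attr) <= hi for attr, (lo, hi) in filters.items()):
            points.append((getattr(doc, x_attr), getattr(doc, y_attr), doc.title))
    return points

# Made-up documents for illustration.
docs = [
    Doc("A", 1995, 1200, 3.5),
    Doc("B", 1999, 400, 4.8),
    Doc("C", 2001, 2500, 2.0),
]
points = starfield_points(docs, "year", "length", filters={"rating": (3.0, 5.0)})
print(points)  # [(1995, 1200, 'A'), (1999, 400, 'B')]
```

Switching the position attributes is just a matter of passing different names for `x_attr` and `y_attr`, which mirrors selecting them from a list in the interface.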

Projection

• Depict many numeric attributes in 2 dimensions
  – While preserving important spatial relationships

• Typically based on the vector space model
  – Which has about 100,000 numeric attributes!

• Approximates multidimensional scaling
  – Heuristic approaches are reasonably fast

• Often visualized as a starfield
  – But the dimensions lack any particular meaning

Contour Map Displays

• Display cluster density as terrain elevation
  – Fit a smooth opaque surface to the data

• Visualize in three dimensions
  – Project to 2-D and allow manipulation
  – Use stereo glasses to create a virtual “fishtank”
  – Create an immersive virtual reality experience
    • Head-mounted stereo monitors and head tracking
    • “Cave” with wall projection and body tracking

Examination

[Figure: diagram with the labels User, Selection, Examination, Delivery, Docs, shown with this example document:]

Aethra offers a complete product line of multimedia and videocommunications products to meet all the applications needs of users. The standard product line is augmented by a bespoke service to solve customer specific functional requirements.

Standard Videoconferencing Product Line

Vega 384 and Vega 128, the improved Aethra Set-top systems, can be connected to any TV monitor for high quality videoconferencing up to 384 Kbps. A compact and lightweight device, VEGA is very easy to use and can be quickly installed in any office and work environment.

Voyager is the first Videoconference briefcase designed for journalists, reporters and people on-the-go. It combines high quality video-communication (up to 384 Kbps) with the necessary reliability in a small and light briefcase.

Full-Text Examination Interfaces

• Most use scroll and/or jump navigation
  – Some experiments with zooming

• Long documents need special features
  – “Best passage” function helps users get started
    • Overlapping 300 word passages work well
  – “Next search term” function facilitates browsing

• Integrated functions for relevance feedback
  – Passage selection, query term weighting, …
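
A "best passage" function over overlapping windows might look like the sketch below. It scores each window by counting query-term occurrences, which is an assumed scoring rule, and it assumes a 50% overlap (a 150-word step for 300-word passages); the lecture specifies neither. The short sample text uses a 4-word window only so that the behavior is easy to follow.

```python
def best_passage(text, query_terms, size=300, step=150):
    """Return (start_word_index, passage) for the window with the most query-term hits.

    Windows of `size` words start every `step` words, so consecutive
    windows overlap by size - step words.
    """
    words = text.split()
    query = {t.lower() for t in query_terms}
    best_start, best_hits = 0, -1
    last_start = max(len(words) - size, 0)
    for start in range(0, last_start + step, step):
        window = words[start:start + size]
        # Strip trailing punctuation so "video." still counts as a hit.
        hits = sum(1 for w in window if w.lower().strip(".,;:!?") in query)
        if hits > best_hits:  # strict comparison keeps the earliest best window
            best_start, best_hits = start, hits
    return best_start, " ".join(words[best_start:best_start + size])

text = "alpha beta gamma delta video conference epsilon zeta video eta"
start, passage = best_passage(text, ["video", "conference"], size=4, step=2)
print(start, passage)  # 2 gamma delta video conference
```

Overlap matters because a cluster of query terms that straddles a boundary between non-overlapping windows would otherwise be split and undercounted.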

Extraction-Based Summarization

• Robust technique for making disfluent summaries

• Four broad types:
  – Single-document vs. multi-document
  – Term-oriented vs. sentence-oriented

• Combination of evidence for selection:
  – Salience: similarity to the query
  – Selectivity: IDF or chi-squared
  – Emphasis: title, first sentence

• For multi-document, suppress duplication
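
A single-document, sentence-oriented extractor that combines the three kinds of evidence can be sketched as below. Several choices here are assumptions made for illustration: the three scores are simply added with equal weight, selectivity uses an IDF computed over the sentences of the one document (a stand-in for collection-level IDF), and emphasis is a fixed bonus for the first sentence. The sample text is made up.

```python
import math
import re
from collections import Counter

def summarize(text, query, n_sentences=2, emphasis_bonus=1.0):
    """Pick the n highest-scoring sentences and return them in document order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    token_sets = [set(re.findall(r"[a-z0-9]+", s.lower())) for s in sentences]
    query_terms = set(re.findall(r"[a-z0-9]+", query.lower()))

    # Selectivity: IDF computed over sentences, used here as a stand-in for
    # collection-level IDF because this sketch sees only one document.
    df = Counter(t for tokens in token_sets for t in tokens)
    n = len(sentences)

    scores = []
    for i, tokens in enumerate(token_sets):
        salience = len(tokens & query_terms)          # overlap with the query
        selectivity = sum(math.log(n / df[t]) for t in tokens) / max(len(tokens), 1)
        emphasis = emphasis_bonus if i == 0 else 0.0  # favor the first sentence
        scores.append(salience + selectivity + emphasis)

    top = sorted(range(n), key=lambda i: (-scores[i], i))[:n_sentences]
    return [sentences[i] for i in sorted(top)]

text = (
    "Aethra sells videoconferencing products. "
    "The weather was mild. "
    "Vega is a set-top videoconferencing system."
)
summary = summarize(text, "videoconferencing system")
print(summary)
```

Here the first and third sentences are selected: both overlap the query, and the first also receives the emphasis bonus. Because sentences are copied verbatim and stitched together, the result can read as disfluent, which is the trade-off the slide names.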

Generated Summaries

• Fluent summaries for a specific domain

• Define a knowledge structure for the domain
  – Frames are commonly used

• Analysis: process documents to fill the structure
  – Studied separately as “information extraction”

• Compression: select which facts to retain

• Generation: create fluent summaries
  – Templates for initial candidates
  – Use language model to select an alternative

Things That Help

• Show the query in the selection interface
  – It provides context for the display

• Explain what the system has done
  – It is hard to control a tool you don’t understand
    • Highlight search terms, for example

• Complement what the system has done
  – Users add value by doing things the system can’t
  – Expose the information users need to judge utility
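
Highlighting search terms is one of the simplest ways to explain what the system has done. A minimal sketch, assuming whole-word, case-insensitive matching and plain bracket markers in place of whatever visual emphasis a real interface would use (the function name is hypothetical):

```python
import re

def highlight(text, terms, left="[", right="]"):
    """Wrap whole-word, case-insensitive matches of any search term in markers."""
    if not terms:
        return text
    pattern = r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b"
    return re.sub(pattern, lambda m: left + m.group(0) + right, text, flags=re.IGNORECASE)

marked = highlight(
    "Vega supports videoconferencing up to 384 Kbps.",
    ["vega", "videoconferencing"],
)
print(marked)  # [Vega] supports [videoconferencing] up to 384 Kbps.
```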

Delivery

[Figure: diagram with the labels User, Examination, Delivery, Docs.]

Delivery Modalities

• On-screen viewing
  – Good for hypertext, multimedia, cut-and-paste, …

• Printing
  – Better resolution, portability, annotations, …

• Fax-on-demand
  – Really just another way to get to a printer

• Synthesized speech
  – Useful for telephone and hands-free applications

Two Minute Paper

• When examining documents in the selection and examination interfaces, which type of information need (visceral, conscious, formalized, or compromised) guides the user’s decisions? Please justify your answer.

• What was the muddiest point in today’s lecture?