User Interfaces for Exploring Large Web-based Collections: Recommender Systems & Metadata-Based Search Interfaces

Rashmi Sinha

SIMS, UC Berkeley

Structure of Talk

• Broad comparison of Recommender Systems & Search/Browse Systems

• Different User Needs: Levels of control, task (how open-ended), level of personalized need.

• Focus on Recommender Systems: 2 studies, some design suggestions

• Quick Overview of Flamenco: A Metadata-Based Search / Browse Interface

• Summary, Conclusions & Future Plans

Interaction Paradigm of Recommender Systems…

Basic Interaction Paradigm of Recommender Systems

Which book should I read?

Input (Ratings of Books): I recently enjoyed: Of Mice and Men, Bias, The Summons & Good to Great

Output (Recommendations): Books you might enjoy are…

Popularity of Recommender Systems…

I know what you’ll read next summer (Amazon, Barnes&Noble)

• what movies you should watch… (Reel, RatingZone, Amazon)

• what music you should listen to… (CDNow, Mubu, Gigabeat)

• what websites you should visit (Alexa)

• what jokes you will like (Jester)

• & who you should date (Yenta)

At the heart of Recommender Systems are Collaborative Filtering Algorithms that rely on correlation between individuals

• Meg & James: correlation = .52

• Meg & Jim: correlation = -.67

• Meg & Nick: correlation = .23

Ratings of Books 1 2 3 4 5 6 7 8

Meg 5 3 3 4 2 1

Jim 3 4 2 3 4 5 1 3

Nick 4 3 1 2 4 2 4 1

James 4 2 1 3 4 1 5 5

Recommendations For Meg
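The correlation idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm used by any of the systems named in the talk. The ratings for Jim, Nick and James are taken from the table above; Meg's row in the transcript has only six values and the skipped books are not recoverable, so her ratings below are hypothetical and the resulting correlations are not expected to match the slide. The function names (`pearson`, `predict`, `recommend`) and the mean-centered weighted average are assumptions chosen for illustration.

```python
from math import sqrt

# Ratings keyed by book number. Jim, Nick and James come from the slide's table.
# Meg's row is HYPOTHETICAL: the transcript does not show which books she skipped.
RATINGS = {
    "Meg":   {1: 5, 2: 3, 3: 3, 5: 4, 6: 2, 8: 1},
    "Jim":   {1: 3, 2: 4, 3: 2, 4: 3, 5: 4, 6: 5, 7: 1, 8: 3},
    "Nick":  {1: 4, 2: 3, 3: 1, 4: 2, 5: 4, 6: 2, 7: 4, 8: 1},
    "James": {1: 4, 2: 2, 3: 1, 4: 3, 5: 4, 6: 1, 7: 5, 8: 5},
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def pearson(a, b):
    """Pearson correlation computed over the items both users rated."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = mean(a[i] for i in common)
    mean_b = mean(b[i] for i in common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict(user, item, ratings):
    """Predict a rating as the user's mean plus the correlation-weighted
    average of how far each neighbor rated the item above or below their own mean."""
    base = mean(ratings[user].values())
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        weight = pearson(ratings[user], theirs)
        num += weight * (theirs[item] - mean(theirs.values()))
        den += abs(weight)
    return base if den == 0 else base + num / den


def recommend(user, ratings):
    """Return the items the user has not rated, best predicted rating first."""
    all_items = set().union(*(r.keys() for r in ratings.values()))
    unseen = all_items - set(ratings[user])
    return sorted(unseen, key=lambda i: predict(user, i, ratings), reverse=True)


if __name__ == "__main__":
    for name in ("Jim", "Nick", "James"):
        print(name, round(pearson(RATINGS["Meg"], RATINGS[name]), 2))
    print("Recommended order for Meg:", recommend("Meg", RATINGS))
```

Negatively correlated neighbors still contribute here: a book they disliked counts slightly in its favor for Meg, which is one common design choice among several.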

Example Recommender System…

Amazon’s Interaction Paradigm

Contrast interaction with Search /Browse Systems

Types of user needs in Exploring Collections

Looking for some new music…

I am tired of my old records, looking for something new. I know what I like, don’t know what else I might like, am open to ideas

- more open-ended

- often about “individual taste”

Example Search/Browse System…

Looking for some recipes for cooking

I have some needs (vegan dessert for 6 people). Also I have some strawberries lying around.

- less open-ended

- less about “individual taste”

Integrated Browse-Search Interaction Paradigm

Chose tomatoes…

Currently refined by Course/Meal. You can change to Preparation, Cuisine, Season. The system informs you what the next choices are, and what subset of items will be left if you take that path.
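The preview behavior described here (showing the next choices and how many items each would leave) can be sketched as follows. This is only an illustration under stated assumptions: the recipe records, the field names and the `refine` / `preview` helpers are all hypothetical and are not taken from the system shown in the talk.

```python
from collections import Counter

# Hypothetical recipe records. Facet names loosely follow the slide
# (Course/Meal, Cuisine, Season); the data itself is invented.
RECIPES = [
    {"title": "Tomato soup", "ingredient": "tomatoes", "course": "starter",
     "cuisine": "Italian", "season": "summer"},
    {"title": "Gazpacho", "ingredient": "tomatoes", "course": "starter",
     "cuisine": "Spanish", "season": "summer"},
    {"title": "Pasta al pomodoro", "ingredient": "tomatoes", "course": "main",
     "cuisine": "Italian", "season": "summer"},
    {"title": "Strawberry sorbet", "ingredient": "strawberries", "course": "dessert",
     "cuisine": "French", "season": "spring"},
    {"title": "Shakshuka", "ingredient": "tomatoes", "course": "main",
     "cuisine": "North African", "season": "winter"},
]


def refine(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [it for it in items
            if all(it.get(facet) == value for facet, value in selected.items())]


def preview(items, facet):
    """For each value of `facet`, count how many of `items` would remain
    if the user picked that value next."""
    return Counter(it[facet] for it in items if facet in it)


if __name__ == "__main__":
    chosen = refine(RECIPES, ingredient="tomatoes")   # "Chose tomatoes…"
    for facet in ("course", "cuisine", "season"):     # possible next groupings
        print(facet, dict(preview(chosen, facet)))
```

Because the counts are computed from the current result set, the user never sees a choice that leads to zero items.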

User Experience for such systems…

User Experience in such Search/Browse interfaces

• More of a controlled experience

• Every movement (forward, making a turn, backwards) is a conscious choice (need information at every step).

• User might make mistakes and retract (go back) a step or two, or start again. Each of these is a conscious choice.

Experience is similar to driving a car…

User Experience with Recommender Systems

• User has less control over specifics of the interaction.

• System does not provide information about specifics of action.

• More of the black box model (some input from user, output from system).

Experience is more similar to riding a roller coaster…

What does this mean for Recommender System Interfaces?

Research on Recommender Systems has mostly focused on CF Algorithms

Collaborative Filtering Algorithms

Output (Recommendations)

Input from user

Social Recommendations

Taking a closer look at the Recommendation Process

Input: User incurs cost in using system (Time, Effort, Privacy Issues)

Receives Recommendation

Cost in reviewing recommendations

Benefit if recommended item appeals

Judges if he/she will sample recommendation

Amazon’s Recommendation Process

Input: One artist/author name

Output: List of Recommendations

Explore / Refine Recommendations

Search using Recommendations

Book Recommendation Site: Sleeper

Input: Ratings of 10 books for all users

Use of continuous Rating Bar

(System designed by Ken Goldberg)

Output: List of items with brief information about each item

Degree of confidence in prediction

Sleeper: Output

Study 1: Book and Movie Recommender Systems

Three book systems: Amazon Books, Sleeper, RatingZone

Three movie systems: MovieCritic, Amazon Movies, Reel

Study design similar to before. Recommendations were sampled during the study this time.

Study 2: Looking for music online: Music Recommender Systems

Five systems: CDNow, Amazon, SongExplorer, MoodLogic, MediaUnbound

Not an experiment, but designed like one. Conducted in Lab environment

Broad overview to start with, then zero in on some systems

Meshing of quantitative and qualitative methods (one informing the other)

Pre-test, pre-test, pre-test

User motivation ascertained before study

Within-subjects design used wherever possible

Multiple small studies, rather than one big study

General Testing Methodology

Comprehensive Data Collection: Observation, Behavior logging with time stamps, questionnaires, post-test interviews.

Testing Methodology cont.

The Slim Logger: Simple Excel Based tool for recording timed observations.

For each of the online systems:

• Rated items

• Reviewed and evaluated recommendation set

• Completed questionnaire

For Study 1: also reviewed and evaluated sets of recommendations from 3 friends each

About 15-20 participants in each study, aged 18 to 34 years

Study Procedure

How do recommendations from Online Systems compare to those from friends?

Popularity of Online Systems indicates that people find such systems useful. What are they useful for?

Comparing Human Recommenders (user’s friends) to Online Systems

Human Recommenders & Systems: “Good” & “Useful” Recommendations

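[Figure: % Good Recommendations and % Useful Recommendations (y-axis, 0 to 100%) for Books (Amazon, Sleeper, RatingZone, Friends) and Movies (Amazon, Reel, MovieCritic, Friends). Number of recommendations per source, in parentheses: Books: Amazon (15), Sleeper (10), RatingZone (8), Friends (9); Movies: Amazon (15), Reel (5-10), MovieCritic (20), Friends (9). Legend also marks the RS Average and the Ave. Std. Error.]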

However Users Like Online RS…

This result was supported by post test interviews.

Do you prefer recommendations from friends or online systems?

[Figure: bar chart for Books and Movies; y-axis "Number preferring system" (0 to 7); legend: System, Friends.]

Why Systems Over Friends?

“Suggested a number of things I hadn’t heard of, interesting matches.”

“It was like going to Cody’s—looking at that table up front for new and interesting books.”

“Systems can pull from a large database—no one person knows about all the movies I might like.”

Recommender Systems broaden horizons

Friends mostly recommend familiar items

[Figure: % Not Heard of Previously (y-axis, 0 to 100) for Books (Amazon, Sleeper, RatingZone, Friends) and Movies (Amazon, MovieCritic, Reel, Friends).]

What aspects of systems do users like?

Why do users like particular systems: Searching for reasons

Previously Liked Items & adequate Item Description are correlated.

Time to Receive Recommendations & No. of Items to Rate are not correlated.

Correlation of Subjective Usefulness and Ratings of System Features

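[Figure: correlation (y-axis, -0.2 to 0.6) between subjective usefulness and each system feature: No. Good Recs, Detail in Item Description, Know reason for recs?, Previously Liked Recs., Time to get recs, No. of items to rate. Asterisk markers appear above some bars; their positions are not recoverable from the transcript.]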

Good, Useful & Previously Experienced Recommendations

Post Test Interviews indicate that users “trust” systems if they have already sampled some recommendations

•Previous Positive Experiences lead to “trust”

•Previous Negative Experiences lead to mistrust of system

[Diagram: All Good Recommendations, divided into Useful (not yet read/viewed) and Previously read/viewed (lead to trust).]

Adequate Item Description: The RatingZone Story

0% of Version 1 and 60% of Version 2 users found item description adequate

% Useful For Both Versions of RatingZone

[Figure: % Useful Recs. (y-axis, 0 to 45) for Version 1: Without Description vs. Version 2: With Description.]

An adequate item description, with links to other sources about the item, was a crucial factor in users being convinced by a recommendation.

Study 1

System Transparency

User perception that they understand why an item was recommended

[Figure: Mean Liking (y-axis, 0 to 4.5) of Not Transparent vs. Transparent recommendations for Amazon, CDNow, MediaUnbound, MoodLogic, SongExplorer.]

Transparent recommendations liked more than not-transparent ones for all five systems

Study 2

Mean liking for Familiar and Unfamiliar Recommendations

[Figure: Mean Liking (y-axis, 0 to 5) of Unfamiliar vs. Familiar recommendations for Amazon, CDNow, MediaUnbound, MoodLogic, SongExplorer.]

Familiar recommendations liked more than unfamiliar ones for all five systems

Some Results: Effect of familiarity on liking

Study 2

Two Models of Recommender System Success

• Recommendations from Amazon received highest liking rating for Study 1 (for books & movies) and second highest for Study 2 (Music)

• Recommendations from MediaUnbound outperformed Amazon in Study 2 (Music)

• Both systems were well liked but differed dramatically in interaction style…

Genre Selection

Favorite artist / band

Rating some songs

Amazon’s bare-bones recommendation process

Genre Selection

MediaUnbound’s long, extended (35 questions) recommendation process

Level of Familiarity

Feedback at every stage

Rating some songs

More feedback about user’s tastes

Setting system expectations

Users find MediaUnbound recommendations more useful

[Figure: % Users (y-axis, 0% to 120%) answering "Useful" and "Use in Future" for Amazon vs. MediaUnbound.]

Also, most users preferred MediaUnbound over Amazon

But whose recommendations would they buy?

[Figure: percentage (y-axis, 0% to 35%) by intended action (No Action, Borrow / Get for free, Buy) for Amazon vs. MediaUnbound.]

But, users express more interest in buying Amazon recommendations

Amazon:

• Safe, conservative approach to recommendations

• Recommendations are familiar, few new items

• Users find system logic transparent

• Users don’t feel like they learnt anything new

MediaUnbound:

• Asks users how familiar they want recommendations to be

• Long input process seems to generate trust

• Recommendations are often new, but well liked

Discussion & Design Suggestions

Justify your Recommendations

– Adequate Item Information: Providing enough detail about item for user to make choice

– System Transparency: Generate (at least some) recommendations which are clearly linked to the rated items

– Explanation: Explain why the item was recommended.

– Community Ratings: Provide link to ratings / reviews by other users. If possible, present numerical summary of ratings.

Accuracy vs. Less Input

Don’t sacrifice accuracy for the sake of generating quick recommendations. Users don’t mind rating more items to receive quality recommendations.

–Multilevel recommendations: Users can initially use the system by providing one rating, and are offered subsequent opportunities to refine recommendation

–Provide a happy medium between too little input (leading to low accuracy) and too much input (leading to user impatience)

–Unlike with Search Engines, users are not willing to try again and again.

Users like Rec. Systems as they provide information about new, unexpected items.

List of recommended items should include new items which the user might not find out in any other way.

List could also include some unexpected items (e.g., from other topics / genres) which the user might not have thought of themselves.

Include some New Unexpected Items

Users (especially first time users) need to develop trust in the system.

Trust in system is enhanced by the presence of items that the user has already enjoyed.

Including some very popular items (which have probably been experienced previously) in the initial recommendation set might be one way to achieve this.

Trust Generating Items

Trust Generating Items: A few very popular ones, which the system has high confidence in

Unexpected Items: Some unexpected items, whose purpose is to allow users to broaden horizons.

Transparent Items: At least some items for which the user can see the clear link between the items he /she rated and the recommendation.

New Items: Some items which are new /just released

The right mix of items

Question: Should these be presented as a sorted list / unsorted list/ different categories of recommendations?

Verify degree of familiarity user wants

This can help produce the right mix of items for each user.
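One way to picture how a user's stated familiarity preference could drive the mix of the four item types listed above is sketched below. It is purely illustrative: the talk leaves the presentation question open, and the pool names, the `familiarity` parameter, the even split within each group and the `mix_recommendations` function are all assumptions rather than anything the studied systems are known to do.

```python
# Hypothetical candidate pools, each already ranked best-first by some recommender.
EXAMPLE_POOLS = {
    "trust": ["very popular A", "very popular B", "very popular C"],
    "transparent": ["linked to rated item A", "linked to rated item B"],
    "unexpected": ["other genre A", "other genre B"],
    "new": ["just released A", "just released B"],
}


def mix_recommendations(pools, familiarity, size=8):
    """Build a list of (item, category) pairs.

    familiarity: 0.0 (user wants mostly novel items) to 1.0 (mostly familiar).
    Familiar slots are shared between "trust" and "transparent" items,
    the remaining slots between "unexpected" and "new" items.
    """
    if not 0.0 <= familiarity <= 1.0:
        raise ValueError("familiarity must be between 0.0 and 1.0")
    n_familiar = round(size * familiarity)
    n_novel = size - n_familiar
    quotas = {
        "trust": n_familiar - n_familiar // 2,
        "transparent": n_familiar // 2,
        "unexpected": n_novel - n_novel // 2,
        "new": n_novel // 2,
    }
    picked, seen = [], set()
    for category, quota in quotas.items():
        taken = 0
        for item in pools.get(category, []):
            if taken == quota:
                break
            if item not in seen:  # an item may appear in more than one pool
                seen.add(item)
                picked.append((item, category))
                taken += 1
    return picked


if __name__ == "__main__":
    for level in (0.0, 0.5, 1.0):
        print(level, mix_recommendations(EXAMPLE_POOLS, level, size=4))
```

Keeping the category next to each item leaves the open question on the slide (sorted list, unsorted list, or separate categories) to the presentation layer.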

To sell as many items as possible or to help users explore their tastes?

The two goals are often contradictory, at least in short term.

Important for system designer to keep goals in mind while designing system.

What kind of a system do you want?

Onwards to Search/Browse Systems…

Flamenco: Designing an innovative Search / Browse System for Architectural Images

• System supports exploration of a large architectural image dataset

• Our goal was to build flexible navigation and search using faceted metadata

The Philosophy

• Information architecture should be designed to integrate search throughout

• Search results should reflect the information architecture, supporting an interplay between navigation and search

• This supports the most common human search strategies.

• Use metadata to show user where to go next. More flexible than canned links

• Allow users to expand as well as refine
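Refining and expanding over hierarchical facets can be sketched with a query that stores one path per facet: refining appends a level, expanding removes one. This is a minimal illustration and not Flamenco's implementation; the image records, the facet names and the `refine`, `expand`, `matches` and `search` helpers are hypothetical.

```python
# Hypothetical image records: each facet value is a path from general to specific.
IMAGES = [
    {"id": 1, "location": ("Europe", "Italy", "Rome"), "material": ("masonry", "brick")},
    {"id": 2, "location": ("Europe", "Italy", "Florence"), "material": ("masonry", "stone")},
    {"id": 3, "location": ("Europe", "France", "Paris"), "material": ("metal", "iron")},
    {"id": 4, "location": ("Asia", "Japan", "Kyoto"), "material": ("wood",)},
]


def matches(item, query):
    """An item matches when, for every facet in the query, the item's path
    begins with the path selected for that facet."""
    return all(item.get(facet, ())[:len(path)] == path
               for facet, path in query.items())


def search(items, query):
    """Return the ids of all items matching the query."""
    return [it["id"] for it in items if matches(it, query)]


def refine(query, facet, value):
    """Return a new query that drills one level deeper into `facet`."""
    new_query = dict(query)
    new_query[facet] = query.get(facet, ()) + (value,)
    return new_query


def expand(query, facet):
    """Return a new query that steps one level up in `facet`,
    dropping the facet entirely when no levels remain."""
    new_query = dict(query)
    path = query.get(facet, ())[:-1]
    if path:
        new_query[facet] = path
    else:
        new_query.pop(facet, None)
    return new_query


if __name__ == "__main__":
    q = refine(refine({}, "location", "Europe"), "location", "Italy")
    print(q, search(IMAGES, q))
    q = refine(q, "material", "masonry")
    print(q, search(IMAGES, q))
    q = expand(q, "location")  # similar in spirit to clicking an earlier breadcrumb
    print(q, search(IMAGES, q))
```

Because both operations return a new query instead of mutating the old one, earlier states can be kept around to support stepping back.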

An Important Search Strategy

• Do a simple, general search / browse
– Gets results in the generally correct area

• Look around in the local space of those results

• If that space looks wrong, start over
– Akin to Shneiderman’s overview + details

• Our approach endeavors to support this search strategy

Questions we are trying to answer

• How can Faceted Metadata be effectively used to build an easy to use Search / Browse system?
– How many facets should be displayed at once?
– Should facets be mixed and matched?
– How much is too much?

• How should hierarchies be revealed (progressively, one step at a time)?

• How should large categories be displayed?

• How should refinement & expansion of query be supported?

Image Dataset

Architect

Image

Geo-Region Time/Date

~40,000 images, 9 hierarchical facets, rich faceted hierarchical data

Facets

Development Timeline

• Needs assessment.
– Interviewed architects and conducted contextual inquiries.

• Lo-fi prototyping.
– Showed paper prototype to 3 professional architects.

• Design / Study Round 1.
– Simple interactive version. Users liked metadata idea.

• Design / Study Round 2.
– Developed 4 different detailed versions; evaluated with 11 architects; results somewhat positive but many problems identified. Matrix emerged as a good idea.

• Design / Study Round 3.
– New version based on results of Round 2
– Highly positive user response

Architects’ Image Use

• Common activities:
– Use images for inspiration
  • Browsing during early stages of design
– Collage making, sketching, pinning up on walls
– This is different from illustrating PowerPoint

• Maintain sketchbooks & shoeboxes of images

• No formal organization scheme
– None of 10 architects interviewed about their image collections used indexes

• Do not like to use computers to find images

Type of tasks we want to support

• Your firm wants to enter a competition to design a new central library in downtown Oakland. Find images to support your ideas for creative use of space in crowded urban setting (the site is surrounded by skyscrapers).

• You're preparing an exhibit to show off the possibilities of "environmentally-friendly" design at an upcoming home & garden show. Find some images that will encourage new-home purchasers to consider building in this way.

• Your team is doing a pool / pool-house / garden renovation. The lead architect’s preliminary design calls for rough/unfinished materials such as concrete and iron. The client is confused and resistant. Please find 2 or 3 images to use for a collage to help the client explore the idea of “concrete in the garden.”

The Interface

• Nine hierarchical facets
– Matrix
– SingleTree (control interface inspired by Yahoo)

• Chess metaphor
– Opening
– Middlegame
– Endgame

• Tightly Integrated Search

• Expand as well as Refine

• Intermediate pages for large categories

Begin Game for SingleTree

Middle Game for SingleTree

Chose “cultural landscapes” in structure types

Begin Game for Matrix

Next: Group by Location

Middle Game for Matrix

Next: Click on one image

Next: Expand from image view

Searched for “brick”

Next: Chose pitched brick vaults

Next: Expand using breadcrumbs

Results of User Study with 19 architects

• Users rated Matrix more highly for:
– Usefulness for design work
– Seeing relationships between images
– Flexibility
– Power

• On all except “find this image” task, users also rated the Matrix higher for:
– Feeling “on track” during search
– Feeling confident about having found all relevant images

User Comments - Matrix

• “Powerful at limiting and expanding result sets. Easy to shift between searches.”

• “Keep better track of where I am located as well as possible places to go from there.”

• “Left margin menu made it easy to view other possible search queries, helped in trouble-shooting research problems.”

• “Interface was friendlier, easier, more helpful.”

• “I understood the hierarchical relationships better.”

User Comments – Single Tree

• Pro
– “Simple”
– “More typical of other search engines I’d use”
– “Visually simpler and more intuitive…Matrix a bit overwhelming with choices.”

• Con
– “I found SingleTree difficult to use when I had to refine my search on a search topic which I was not familiar with. I found myself guessing.”
– “SingleTree required more thought to use and to find specific images.”
– “I do not trust my typing and spelling skills. I like having categories.”

Feature Usage (%): Types of Actions

[Figure: Action Categories, share of actions (0% to 60%) for Matrix vs. Tree: Refine search (reduce # of results), Expand search (increase # of results), Arrange results, Start over/backup.]

Feature Usage (%): Refining

[Figure: Use of Features to Refine Search, share of actions (0% to 30%) for Matrix vs. Tree: Drill above images, Drill in matrix, Drill from image detail, Drill from large category, Drill by clicking "All N items", Search within, Disambiguate keyword search.]

Feature Usage: Expanding / Starting Over

[Figure: Use of Features to Expand Search / Start Over, share of actions (0% to 25%) for Matrix vs. Tree: Expand search using breadcrumbs, Expand by clicking X, Expand from image detail, Go back to start mid-search, Search all mid-task, Back.]

Summary & Conclusions

• A new approach to web site search
– Use hierarchical faceted metadata dynamically, integrated with search

• Many difficult design decisions
– Iterating and testing was key

• Design Challenges
– How to seamlessly integrate metadata previews with search (Show search results in metadata context)
– How to show hierarchical metadata from several facets (The “matrix” view, Show one level of depth in the “matrix” view)
– How to handle large metadata categories (Use intermediate pages)
– How to support expanding as well as refining (Still working on it to some extent)

Overall Summary

• Two modes of information exploration…

User has more control…
- Wants information at every step
- Flexible way to refine, change, expand

User has less control…
- Wants transparency
- Wants new information
- Needs to be convinced that recommendation is good

Conclusions & Future Plans

• Recommender Systems and Search/Browse systems support two different modes of information exploration
– Directly compare both in controlled study
– Expert / Novice differences in suitability of the two interfaces
– Task based differences in suitability of interfaces

• Recommender Systems:
– Role of explicit community as compared to automated recommendations
– How to integrate community with recommendations

• What about Product Advisors? Where do they fit in?

COLLABORATORS

Recommender Systems Project: Kirsten Swearingen, SIMS, UC Berkeley

Metadata Based Search User Interfaces: Marti Hearst, Ame Elliott, Jennifer English, Kirsten Swearingen, Ping Yee

For more information: sims.berkeley.edu/~sinha
