Dec 2003, DRTC © C.Watters 1 Users and the Digital Library Carolyn Watters Dalhousie University Halifax, Canada

Users and the Digital Library

Carolyn Watters

Dalhousie University

Halifax, Canada

Are Digital Libraries Libraries?

Phase I - Electronic access to traditional library
Phase II - Access to electronic documents
Phase III - Access to all information that is digital
– Communities of interest
– Personal
– Archival
– Current
– Editions
Phase IV - Semantic Web

Ranganathan’s 5 Laws and the DL

1. Books are for use.
2. Every reader his or her book.
3. Every book its reader.
4. Save the time of the reader.
5. The library is a growing organism.

This Talk: Users of Digital Libraries

Task: what kind of information?
Motivation: why is the user interacting?
Interventions: what can we change?
– Query
– Matching
– Ranking
– Presentation
Conclusions: what is effective?

Task: what kind of information?

Research – I am doing a study on Mesopotamia

Search – Who was the Prime Minister of India in 1962?

Refind – What was the name of that movie I read the review on?

Browse – I am interested in Post-modern Art

Impact of Motivation & Task

Type I - Uses & Gratification Task
– Satisfaction is in the result
– News reading?

Type II - Ludic Task
– Part of the satisfaction is in the process
– News reading?

Type I Tasks – Uses and Gratification tasks

The uses and gratification theoretical perspective is based on the assumption that the reader has some underlying goal, outside the reading itself, that reading satisfies.

Having the answer is the goal
May be intrinsically motivated (i.e., may just want to know)
Traditional information need

Types of U&G Tasks

Research Queries
– History of Kerala
– Breadth important
– Multiple viewpoints/sources expected

Question and Answer
– Capital of Kerala
– Accuracy important
– Contradiction not expected

Characteristics of U&G Info Tasks

Articulation of a specific query

Recognition of relevance of retrieved results

User has control of “satisficing” point

* Herbert Simon (1976)

Can we make predictions for U&G tasks?

A study* of reading the news suggested that a neural net model was able to learn enough about user preferences to predict what a user would read for this type of task.

Use this to modify the query
Most useful for repeated topic queries

*Shepherd, M., C. Watters, and A. Marath. Adaptive Filtering for Electronic News. Proc. of HICSS’35. Jan 7-10, 2002.

Type II Tasks - Ludic Tasks

The ludic theoretical perspective is based on the assumption that the reading itself brings satisfaction to the reader.

The process of getting information is satisfying

Web browsing
News reading

Ludic Use characteristics

Individual path selection
– Users are happy to get different information for the same general search

Apperception
– Users choose information that fits their current knowledge

Habitualness
– Users perform these searches as part of their routine (rather than a specific one-time info need)

Types of Ludic Tasks

Updating & community awareness tasks
– What is happening?
– Breadth important
– Unknown events are relevant

Search & Browse tasks
– What is new/odd/interesting?
– Novelty is important
– Answers not expected
– Community membership
– ***Exact query unknown

Can we predict what will be read?

We cannot predict, based on past behavior, what a user will choose to read or the path the user will choose to follow
– Reading the news
– Browsing the web

User often multitasking
– 33% of web sessions involve more than 2 topics (Spink)

How can we help the user?

Predicting for Ludic Tasks

“when you don’t know where you are going, any road will take you there!”

Lewis Carroll

What do we have to work with?

An information need expressed as a query [user]

A user profile [user/system]
– Interests
– History

Document content (metadata, keywords, genre) [derived]

Link topology [author]

Usage patterns of documents [community]

Information about current task [user]

What can we Manipulate?

I. Query

II. Matching

III. Ranking

IV. Presentation

I. Improving the Query

Longer queries are better [BelK03]
Average query is 2.2 terms
Type of queries (Q&A / research / browse / refind)
Quality of query
Modification of query
Personal Profiles
Stereotypes

Quality of Query

Purpose of query: separation of the doc set into relevant & nonrelevant documents

How well does the language of the query fit or not fit the language of the docs?

Clarity* = difference between the distribution of the terms used in the query and the distribution of all terms in the collection

*Croft, 2001
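
The clarity idea can be tried on a toy collection. The sketch below is illustrative only and is not the method from the cited work: it takes "difference" to mean Kullback-Leibler divergence, and it estimates the query's term distribution from the documents that contain every query term (a simplifying assumption). The function names and the four-document collection are made up for this example.

```python
import math
from collections import Counter


def term_distribution(docs):
    """Relative frequency of each term across a list of tokenized documents."""
    counts = Counter(term for doc in docs for term in doc)
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def clarity(query_terms, collection):
    """KL divergence (in bits) between the language of the documents that
    match the query and the language of the whole collection.

    Assumption: a document "matches" only if it contains every query term.
    """
    matching = [doc for doc in collection if all(t in doc for t in query_terms)]
    if not matching:
        return 0.0
    p_query = term_distribution(matching)
    p_coll = term_distribution(collection)
    # Matching documents are a subset of the collection, so p_coll[t] > 0.
    return sum(p * math.log2(p / p_coll[t]) for t, p in p_query.items())


# Hypothetical "general news" collection in which "apple" is ambiguous.
general_news = [
    ["apple", "pie", "recipe"],
    ["apple", "computer", "laptop"],
    ["apple", "city", "festival"],
    ["election", "vote", "city"],
]

ambiguous = clarity(["apple"], general_news)
focused = clarity(["apple", "computer"], general_news)
```

On this toy data the longer query scores higher than the single term, which is the direction the "Apple" versus "Apple Computer Company" comparison describes for a general news collection.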

Example of Clarity Values

Query = Apple
– General news DL: clarity value is low (apple pies, computers, city)
– Computer DL: clarity value is high

Query = Apple Computer Company
– General news DL: clarity is high
– Computer DL: same as just Apple

Query Modification

Feedback: add terms from similar docs
– User relevance judgements
– More like this one

Profiles: add terms from history of user or user interests

Thesaurus: add related terms
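
A minimal sketch of the profile and thesaurus options, assuming a query is a list of terms, a thesaurus is a plain dictionary, and a profile is a list of interest terms. The function name, the `max_added` cap, and the sample data are hypothetical.

```python
def expand_query(query_terms, thesaurus=None, profile_terms=None, max_added=3):
    """Return the query with related terms appended, without duplicates.

    Thesaurus terms are considered first, then profile terms; at most
    max_added new terms are appended so the original query still dominates.
    """
    expanded = list(query_terms)
    candidates = []
    for term in query_terms:
        candidates.extend((thesaurus or {}).get(term, []))
    candidates.extend(profile_terms or [])
    for term in candidates:
        if len(expanded) - len(query_terms) >= max_added:
            break
        if term not in expanded:
            expanded.append(term)
    return expanded


# Hypothetical thesaurus entry and user profile.
thesaurus = {"train": ["railway", "locomotive"]}
profile = ["steam", "europe"]
expanded = expand_query(["train"], thesaurus, profile)
```

Capping the number of added terms is one simple way to limit topic drift; the talk does not say how many terms were added in practice.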

Rocchio Feedback
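
Only the slide title survives in this transcript. As a reminder of the technique, here is a sketch of the standard textbook Rocchio update: the query vector is moved toward the centroid of documents judged relevant and away from the centroid of those judged nonrelevant. The default weights (alpha=1.0, beta=0.75, gamma=0.15) are commonly quoted values, not figures from the talk, and the sample vectors are made up.

```python
from collections import defaultdict


def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector (term -> weight).

    query and every document in relevant / nonrelevant are dicts that map
    a term to its weight.
    """
    new_query = defaultdict(float)
    for term, weight in query.items():
        new_query[term] += alpha * weight
    for docs, coeff in ((relevant, beta), (nonrelevant, -gamma)):
        for doc in docs:
            for term, weight in doc.items():
                # Dividing by len(docs) turns the sum into a centroid.
                new_query[term] += coeff * weight / len(docs)
    # Terms whose weight is not positive are dropped from the new query.
    return {term: w for term, w in new_query.items() if w > 0}


# Hypothetical relevance judgements for the query "apple".
modified = rocchio(
    {"apple": 1.0},
    relevant=[{"apple": 1.0, "computer": 1.0}],
    nonrelevant=[{"apple": 1.0, "pie": 1.0}],
)
```

Here "computer" enters the query with a positive weight and "pie" is dropped, which is the "more like this one" behaviour described under Query Modification.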

II. Improving the Matching

Using Profiles
– [Joe: railway, steam, engine, track, Europe]

Using metadata
– Mapping to controlled vocabularies
– Add semantics to documents
– [D1: <loc>Europe</loc> <topic>Train Transportation</topic>]

Genre
– Reports
– Home pages
– Shopping pages
– News
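
One way to picture profile-plus-metadata matching, using the Joe and D1 examples from this slide. This is a rough sketch under stated assumptions: metadata is held as a dictionary of field values, profile terms count for half as much as query terms, and `match_score` is a made-up name. None of this is taken from the talk.

```python
def match_score(query_terms, profile_terms, doc_metadata, profile_weight=0.5):
    """Score a document by the query and profile terms found in its metadata.

    Each query term that appears in any metadata value adds 1.0; each
    profile term adds profile_weight. Matching is case-insensitive.
    """
    words = {w.lower() for value in doc_metadata.values() for w in value.split()}
    score = sum(1.0 for t in query_terms if t.lower() in words)
    score += sum(profile_weight for t in profile_terms if t.lower() in words)
    return score


joe = ["railway", "steam", "engine", "track", "Europe"]
d1 = {"loc": "Europe", "topic": "Train Transportation"}

score = match_score(["train"], joe, d1)
```

The query term "train" matches the topic field and Joe's interest in Europe matches the location field, so D1 scores higher for Joe than for a user with no profile.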

III. Improving ranking

User Profiles
Location
Stereotypes

User Profiles: Recommender Systems

To “recommend” an existing path through an information space that best satisfies the user’s information need.

Depends on goal of search!

Community & Personal Profiles

Community profiles
– Provide stability
– Common interests

Personal Profiles
– Long term interests vs short term
– Multiple interests
– Topic drift

Effect for Browsing Tasks

Browsing behavior was idiosyncratic and personal
System could not learn over time

*Shepherd, M., C.Watters, and R.Kaushik. Lessons from Reading E-News for Browsing the Web: The Roles of Genre and Task. Proc. of the Annual Conference of the American Society for Information Science and Technology. November 2001, Washington.

Profiles & Tasks

Repeated queries based on user profile for sustained interests work well

Feedback mechanisms such as Rocchio work well for sustained interests

BUT not for idiosyncratic queries or browsing

Example of Alternate Ranking: Geospatial Queries

What is here?
– What can I do in Bangalore?

Where is there x?
– Where can I ski in Eastern Canada or USA?

Where can I go skiing?

What do we have to work with

Geoparsing
– Recognizing geographical context: country, river, feature, etc.

Geocoding
– Assigning longitude and latitude values
– End, middle, etc.
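
Once documents are geocoded, a "where is there x?" query can be ranked by distance from the user instead of by term match. The sketch below uses the haversine great-circle formula and assumes each place is reduced to a single latitude/longitude point, which sidesteps the "end, middle, etc." question for extended features such as rivers. The resort names and all coordinates are invented for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def rank_by_distance(user_location, geocoded_docs):
    """Return document ids ordered from nearest to farthest."""
    return sorted(geocoded_docs,
                  key=lambda doc_id: haversine_km(user_location, geocoded_docs[doc_id]))


# Hypothetical user position and geocoded ski documents.
user = (44.65, -63.58)
ski_docs = {
    "resort_b": (44.5, -72.8),
    "resort_a": (45.0, -64.0),
    "resort_c": (47.0, -67.0),
}
ranking = rank_by_distance(user, ski_docs)
```

In a fuller system the distance would probably be combined with a topical relevance score rather than used alone.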

User Stereotypes for Medical News*

Select medical items from online news sources

Categorize medical items by intended audience

*Watters, C., W. Zheng, and E. Milios. 2002. Filtering for Medical News Items. Proc. of the American Society for Information Science and Technology Conference. Nov. 15-19, Pittsburgh.

Profiling by Keyword

Customized vocabulary in MeSH
– Pruned non-medical branches
– 31,441 headings

Assigned weights to these headings

Category          Weight   Example headings
Nonmedical        1        Building
Lay medical       2        Body, stomach
General medical   3        Anatomy, umbilicus
Specific medical  4        Inguinal canal
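
The weighting scheme can be illustrated with a small sketch that assigns a news item to an intended audience. The six headings and their weights come from the slide; everything else is an assumption made for illustration, in particular matching headings as whole words in the text and taking the rounded mean weight as the item's level. The transcript does not say how the study combined the weights.

```python
import re

# Headings and weights as listed on the slide (a tiny subset of the vocabulary).
HEADING_WEIGHTS = {
    "building": 1,
    "body": 2,
    "stomach": 2,
    "anatomy": 3,
    "umbilicus": 3,
    "inguinal canal": 4,
}
LEVELS = {1: "nonmedical", 2: "lay medical", 3: "general medical", 4: "specific medical"}


def audience_level(text):
    """Guess the intended audience from the headings that occur in the text.

    Assumption: the level is the rounded mean weight of all matched headings;
    text with no matched heading is treated as nonmedical.
    """
    text = text.lower()
    weights = [w for heading, w in HEADING_WEIGHTS.items()
               if re.search(r"\b" + re.escape(heading) + r"\b", text)]
    if not weights:
        return LEVELS[1]
    return LEVELS[round(sum(weights) / len(weights))]


specialist = audience_level("Surgical repair of the inguinal canal")
lay = audience_level("Stomach pain and what it tells you about your body")
```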

Prototype

IV. Improving Presentation

1. Genre

2. Views

3. Transformations for device

Hit List (linear)

News (broadsheet)

Album

Preferred broadsheet for browsing

*Shneiderman’s PhotoLib

Report

Effect of Genre on Different Devices

Views

*R.Furuta

Million list server messages

*websom.hut.fi/websom

Summary of Interventions

Improve the query
Improve matching
Improve the ranking
Improve presentation

[Diagram labels: profiles; query, match, rank, present; user]

Reality Check

Not so easy for real tasks and real users
– Topic shifts
– Topic relevancy/importance judgements
– *Multitasking
– *Task Detection
– *Getting Personal Preferences

Can we help?

YES
– Queries can be modified for U&G tasks
– Use of community profiles for Ludic tasks
– Alternative ranking schemes can be based on type of task
– Match presentation to contents and use

Roles of the DL

Conflicting Goals
– Archival
– Access
– Derivative uses (e.g., animation)
– Digital rights management

Conclusions for Digital Libraries

User is an integral part of the system
User’s immediate task and motivation matter
Community interests matter
Ranganathan’s rule 2 for 2003
– Every user his or her information.

Thank you!

More information:
My web site – www.cs.dal.ca/~watters
Web Information Filtering Lab – www.cs.dal.ca/wifl