
DESIGNING A USER INTERFACE FOR A WEB-BASED MULTIMEDIA INFORMATION RETRIEVAL SYSTEM

A STUDY SUBMITTED IN PARTIAL FULFILMENT

of the requirements for the degree of

Master of Science in Information Management

at

THE UNIVERSITY OF SHEFFIELD

by

STEPHEN LEVIN

September 2002

ABSTRACT

The growth of the Internet, in particular the World Wide Web, has made it possible

for millions of people around the world to access vast amounts of multimedia

information, in the form of text, images, video, audio, etc. However, this has in turn

generated problems, one of the most significant of which is how users find the

information they want, when the resource they are using is so large and seemingly

disorganised. As technology improves and the number of digital libraries containing

multimedia information increases, research into creating effective multimedia

information retrieval (MIR) systems becomes increasingly important.

In the past, IR systems have been developed with a focus on functionality, often at

the expense of the user interface. However, neglecting to consider whether a user

interface is actually usable can result in much confusion and frustration for users. On

the other hand, a well-designed interface can decrease the time it takes a user to get

to grips with a system, increase their understanding of the system and ultimately give

them greater control.

The aim of this project was to design an effective user interface for a multimedia

information retrieval system. This report details the user-based approach that was

adopted to achieve this. Such approaches differ from traditional systems-based

approaches in that the user is central to the design of the interface, to the point where

decisions are dictated by them, as opposed to the designer, wherever possible.

The most important element of a user-centred approach is usability evaluation. In the

approach detailed in this report, the design process of the interface is iterative and

comprises generating and collecting qualitative and quantitative data from

usability evaluation sessions, then designing/re-designing the system accordingly.

The design process begins with a survey of MIR systems publicly accessible on the

Web. Following this, several of these systems are then user-tested. Based on

feedback collected from participants, potential interface designs are then sketched,

one of which is converted into a low-fidelity prototype for further usability

evaluation. Finally, the interface is redesigned and the design process culminates in

the implementation of a prototype interface in HTML.

It is concluded that, contrary to popular belief, a user-based approach to interface

design can yield significant results, without the need for large amounts of time,

numerous participants and vast amounts of money.

ACKNOWLEDGEMENTS

I would like to express my gratitude to Daniela Petrelli and Mark Sanderson for

providing me with motivation, supervision and guidance throughout the course of

writing this dissertation.

I would also like to thank those that participated in the usability evaluation sessions, for giving up their time to help me.

Thank you also to all my friends for their words of encouragement.

Finally, the biggest thank you of all must go to my Mum, Janet Levin, who was a continuous source of support and encouragement.

Stephen Levin Designing a User Interface for a Web-based Multimedia Information Retrieval System

CONTENTS

CHAPTER 1: INTRODUCTION
1.1 Motivation
1.2 Aims
1.3 Outline

CHAPTER 2: MULTIMEDIA INFORMATION RETRIEVAL
2.1 Existing Approaches To Multimedia Information Retrieval
2.1.1 Feature Based Retrieval
2.1.1.1 QBIC (Query By Image and video Content)
2.1.1.2 VisualSEEk
2.1.1.3 Problems Associated With Feature Based Retrieval Systems
2.1.2 Combining Feature Based Retrieval and Text Based Retrieval
2.1.2.1 AMORE (Advanced Multimedia Oriented Retrieval Engine)
2.1.2.2 Webpic
2.2 Content Description Languages For Multimedia

CHAPTER 3: USER INTERFACE DESIGN IN INFORMATION RETRIEVAL
3.1 Design Aims
3.2 The User-Based Approach
3.3 The Information Retrieval Process
3.3.1 Starting The Retrieval Process
3.3.2 Specifying a Query
3.3.3 Viewing Retrieved Results in Context
3.3.4 Query Reformulation

CHAPTER 4: SURVEY OF EXISTING IR SYSTEMS
4.1 Content Based Image Retrieval Systems
4.1.1 Circus
4.1.2 Compass
4.1.3 NETRA-2
4.1.4 QBIC
4.2 Content-Based Audio Retrieval Systems
4.2.1 SoundFisher
4.2.2 Speechbot
4.3 Multimedia Retrieval Systems
4.3.1 AllTheWeb
4.3.2 Lycos Multimedia Search
4.3.3 Singingfish
4.4 Critical Analysis of Query Methods

CHAPTER 5: METHODOLOGY
5.1 Usability Evaluation of Existing MIR Systems
5.1.1 Initial Questionnaire
5.1.2 Retrieval Tasks
5.1.3 Usability Satisfaction Questionnaire
5.1.4 Interview Session
5.2 Usability Evaluation of Low-Fidelity Prototype
5.3 Prototyping of Final User Interface

CHAPTER 6: RESULTS FROM USABILITY EVALUATION OF EXISTING MIR SYSTEMS
6.1 User Background, Knowledge and Experience
6.2 Usability Questionnaires
6.2.1 Ease of Learning
6.2.2 Ease of Use
6.2.3 Ability to Complete Task
6.2.4 Information Quality
6.2.5 Interface Issues
6.2.6 Overall Satisfaction
6.2.7 Comparison of Groups
6.3 Interview Sessions
6.3.1 Usefulness of System Features
6.3.1.1 Singingfish
6.3.1.2 Lycos Multimedia Search
6.3.1.3 AllTheWeb
6.3.2 System Preferences
6.3.2.1 Result Summary Preferences
6.3.2.2 Overall Interface Preferences
6.3.2.3 Overall System Preferences
6.3.3 Suggested Improvements
6.3.3.1 Singingfish
6.3.3.2 Lycos Multimedia Search
6.3.3.3 AllTheWeb

CHAPTER 7: REQUIREMENTS SPECIFICATION
7.1 Layout
7.2 Querying Options
7.3 Results

CHAPTER 8: INITIAL DESIGN
8.1 Site Identification
8.2 Query Interface
8.3 Results Presentation

CHAPTER 9: RESULTS FROM USABILITY EVALUATION OF LOW-FIDELITY PROTOTYPE
9.1 Recognising Purpose of System
9.2 Issuing a Query
9.3 Browsing Results
9.4 Overall Comments
9.5 Summary of Findings

CHAPTER 10: FINAL DESIGN

CHAPTER 11: CONCLUSION

BIBLIOGRAPHY

APPENDICES
Appendix A: Screenshots of AllTheWeb
Appendix B: Screenshots of Lycos Multimedia Search
Appendix C: Screenshots of Singingfish
Appendix D: Documents Given To Participants During First Usability Evaluation Sessions
Appendix E: Summary of Results From Usability Satisfaction Questionnaires
Appendix F: Sketches of User Interface Selected For Low-Fidelity Prototyping
Appendix G: Low-Fidelity Prototype
Appendix H: HTML Prototype

CHAPTER 1: INTRODUCTION

Vast amounts of digital data are being created daily, due in part to the explosive

growth of the Internet, and in particular the World Wide Web - in 1999, the number

of publicly accessible pages on the Web was estimated to be about 800 million

(Lawrence & Giles (1998)).

The information conveyed in this data presents itself in many different forms

including text, video, images and audio, leading some to describe the World Wide

Web as ‘the ultimate, large-scale, dynamically changing, multi-media database’

(Srihari & Zhang (1999:497)).

As well as being generated at a great rate, the information is also being disseminated

all over the world at an equal pace. However, coupled with the disorganised nature of

the Internet, this has created increased awareness of the need for information

retrieval (IR) systems that can effectively handle several different media types, as

well as indexing and management methods to support these systems.

According to Kobayashi and Takeda (2000), about 85% of web users use search

engines or a similar search tool to retrieve information. At present, however, the

majority of popular search engines on the Internet are primarily textual, e.g. Google,

Lycos, AltaVista etc. To use such a system, a user must issue a word or phrase as a

search query. The indexed web pages are then examined for relevance, based on their

textual content (Mukherjea et al. (1997)). Such systems have progressed rapidly

over the past two decades, and the ‘state of the art’ of text retrieval is

considerably further ahead than that of retrieval for other media types

(Ortega-Binderberger (1999)). However, as the number of digital libraries containing a

variety of media types continues to rise, and the amount of information contained

within these libraries grows, the issue of multimedia information retrieval (MIR)

becomes increasingly important.

1.1 Motivation

‘User interface design is a highly creative process requiring intuition and an artistic

sense from the designer’ (Lee (2001:2)). However, as with any area motivated

mainly by technological development, in the construction of IR systems, particularly

those available on the Web, the design of the user interface (UI) is often largely

overlooked. Although there are a large number of web-based search engines publicly

available, users are often dissatisfied with aspects such as the query methods and the

ways in which results are presented (Lawrence & Giles (1998)).

Developers tend to attach more importance to the functionality of the search

engine than to the component that bridges the gap between system functionality and

user requirements. This is despite the fact that, regardless of technology, if an

interface does not serve its intended purpose of permitting effective communication

between the user and the system, then the whole system will fail to function

correctly.

A poorly designed UI can cause confusion and frustration in those who use it,

regardless of their skill level. Conversely, an effective user interface can

decrease the time it takes a user to get to grips with a system, increase their

understanding of the system and give them more control (Shneiderman et al. (1997)).

In turn, this generates more successful searches and greater user satisfaction, as

shown in studies by Koenemann & Belkin (1996). However, user interface design is

often an afterthought in the engineering process, with aims extending little further

than to produce something that is cosmetically appealing.

An important part of the development process of any system is establishing what type

of users will be using the system and their associated needs. This information

facilitates the construction of a detailed specification of requirements, which in turn

enables the designer to create a product that satisfies its users’ needs. However, when

dealing with such a diverse user base as that found on the World Wide Web, this can

prove problematic. Designers of web-based IR systems find themselves having to

cater to users with a variety of different levels of skill and experience or,

alternatively, tailoring their systems to suit a specific type of user.

Although previous user studies in the field of text-based retrieval have yielded useful

results, which can be applied to the design of such systems, the same cannot be said

for emerging technologies such as MIR, for which there exists little research into

user groups and their associated needs and information seeking behaviour.

Consequently, there are no guidelines on interface design, making the process of

initial design and subsequent iterative refinement particularly difficult.

1.2 Aims

The overall aim of this project is to design an interface for a web-based multimedia

search engine that makes the task of retrieving relevant information, in a variety of

different media types, as fast and simple as possible for users of any level of

experience, whilst at the same time providing a level of functionality suitable for

more experienced users.

The design of the interface should ideally be influenced in part by the results of

existing research into interface design in the area of MIR. However, as such

information is scarce, it will first be necessary to analyse existing approaches to MIR

and survey existing MIR systems, including not only those approaches and systems

concerned with retrieving several different media types, but also those concerned

with retrieving individual media types other than text.

Based on these findings, usability evaluation sessions will then be conducted using

several existing web-based MIR systems, the results of which will be used to

facilitate the construction of a detailed specification of requirements. Based on these

requirements, an initial interface will then be designed. Following further user

testing, in the form of usability evaluation sessions using a low-fidelity prototype, the

proposed interface will be improved accordingly. Finally, a prototype of the interface

will be implemented in HTML (Hyper Text Markup Language).

1.3 Outline

Chapter 2 introduces the concept of multimedia information retrieval and discusses

the three main approaches used to represent the content of multimedia data in MIR

systems.

Chapter 3 discusses user interface design, with a particular focus on the field of

information retrieval.

Chapter 4 surveys existing web-based MIR systems and other web-based IR systems

that deal with media types other than text, with a particular focus on their user

interfaces.

Chapter 5 describes the methodology used to reach the final user interface design,

including the usability evaluation sessions, low-fidelity prototyping and prototyping

of the final UI.

Chapter 6 summarises and discusses the results from the first set of usability

evaluation sessions.

Chapter 7 lists the user interface requirements derived from the results in the

previous chapter.

Chapter 8 describes the initial design of the proposed user interface.

Chapter 9 summarises and discusses the results from the second set of usability

evaluation sessions, in which the low-fidelity prototype of the proposed user

interface was used.

Chapter 10 describes the final design of the user interface.

Chapter 11 presents the conclusions of the project.

CHAPTER 2: MULTIMEDIA INFORMATION RETRIEVAL

Multimedia information retrieval (MIR) systems are specifically designed to deal

with data with a wide variety of characteristics, such as text, images, video and

sound. This is reflected in the scope of MIR research, which incorporates a wide

range of different topics including: “analysis of text, image and video, speech and

non-speech audio; graphics; animations; artificial intelligence; human-computer

interaction and multimedia computing” (Kobayashi & Takeda (2000:26)).

Understandably, therefore, the development of a MIR system is considerably more

complex than the development of a traditional IR system, which would typically only

deal with unstructured, textual data. As with any IR system, a MIR system must be

able to represent data in such a way that the information can be retrieved effectively

and efficiently. However, a MIR system must also deal with semi-structured data -

data with a structure that does not exactly match specifications set by the data

schema (Bertino et al. (1999)). Such data is represented using not only

information about the attributes of objects (e.g. their size or type), but also

information about the content (this is referred to as ‘content-based retrieval’).

2.1 Existing Approaches To Multimedia Information Retrieval

Amato et al. (1998) define three main approaches used to represent the content of

multimedia data in MIR systems:

1. Feature based – where a set of features is directly extracted from the

machine-readable representation of the data. These features are either

generated automatically, or with user assistance.

2. Keyword based – where the multimedia content is described using free text or

words from a controlled vocabulary. These annotations are usually manually

created by users, but may also be automatically generated.

3. Concept based – where application domain knowledge is used to determine

the content of an object and ‘the interpretation leads to the recognition of

concepts which are used to retrieve the object itself’ (Amato et al. (1998:3)).

This chapter will focus on the first two approaches, as they are considered to be the

most relevant to the design of web-based multimedia search engines.

2.1.1 Feature Based Retrieval

As Flickner et al. (1995) state, image retrieval systems typically determine the

similarity between images based on their statistical properties, or ‘features’, for

example colour, shape and texture. Some image retrieval systems are also capable of

distinguishing between global features that apply to an entire image, such as average

colour distribution, and object features that apply to individual components denoted

as objects within an image (Amato et al. (1998)).

2.1.1.1 QBIC (Query By Image and video Content)

The QBIC system, created at the IBM Almaden Research Center, is one such system

capable of distinguishing between global and object features, thereby allowing

queries on individual objects, such as ‘find images with a red round object’ (Flickner

et al. (1995:25)). Created in 1995, QBIC was the first commercial content-based

image retrieval system. The framework and techniques used by the system have

influenced the design of numerous image retrieval systems since (Ortega-

Binderberger (1999)).

QBIC uses colour, texture and shape features to determine the similarity between

images. The colour features are the average values of colour distribution, and a

colour histogram, whilst the texture feature is a combination of coarseness, contrast

and directionality. The shape feature consists of area, circularity, eccentricity, axis

orientation and algebraic moment invariants (Ortega-Binderberger (1999)).
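To illustrate how colour features of this kind can be compared, the following is a minimal Python sketch, not QBIC's actual implementation. It assumes that an image is supplied as a list of (R, G, B) pixel tuples, quantises each channel into a small number of bins (the default of 4 is an arbitrary choice) and uses a simple L1 distance between histograms as a stand-in for whatever distance measure a real system would use. All function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each value in the range 0-255


def average_colour(pixels: List[Pixel]) -> Tuple[float, ...]:
    """Return the mean value of each colour channel (a global feature)."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def colour_histogram(pixels: List[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalised colour histogram with bins_per_channel**3 bins."""
    step = 256 // bins_per_channel
    counts = [0] * bins_per_channel ** 3
    for pixel in pixels:
        # Quantise each channel, clamping so that 255 never overflows the last bin.
        r, g, b = (min(v // step, bins_per_channel - 1) for v in pixel)
        counts[(r * bins_per_channel + g) * bins_per_channel + b] += 1
    return [c / len(pixels) for c in counts]


def histogram_distance(h1: List[float], h2: List[float]) -> float:
    """L1 distance between two normalised histograms (assumed measure)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))
```

Under these assumptions, two images whose pixels all fall into the same bins have a distance of 0, whereas two images that share no bins have the maximum distance of 2.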

As well as image retrieval, QBIC is also capable of content-based retrieval on video

databases, by computing camera and object motion features. For example, the system

could theoretically deal with a query such as: “find all shots panning from left to

right” (Flickner et al. (1995:25)). Each video clip is split into a series of shots and

representative frames (r-frames) are created for each shot. These r-frames then have

their features extracted as though they were normal, still images. The system also

identifies and records moving objects within the shots (Flickner et al. (1995:25)).
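The idea of splitting a clip into shots and choosing a representative frame for each can be sketched as follows. This is an illustration only: the source does not describe how QBIC detects shot boundaries, so the sketch assumes that each frame has already been reduced to a normalised colour histogram, that a boundary occurs wherever consecutive histograms differ by more than an arbitrary threshold, and that the middle frame of each shot serves as its r-frame. The function names and the threshold value are hypothetical.

```python
from typing import List


def split_into_shots(frame_histograms: List[List[float]],
                     threshold: float = 0.5) -> List[List[int]]:
    """Group frame indices into shots, cutting where the histogram jumps."""
    if not frame_histograms:
        return []
    shots = []
    current = [0]
    for i in range(1, len(frame_histograms)):
        previous, frame = frame_histograms[i - 1], frame_histograms[i]
        difference = sum(abs(a - b) for a, b in zip(previous, frame))
        if difference > threshold:
            # A large change between consecutive frames is treated as a cut.
            shots.append(current)
            current = []
        current.append(i)
    shots.append(current)
    return shots


def representative_frames(shots: List[List[int]]) -> List[int]:
    """Pick the middle frame of each shot as its r-frame (an assumed choice)."""
    return [shot[len(shot) // 2] for shot in shots]
```

The r-frames selected in this way could then be passed to the same feature extraction used for still images.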

Perhaps the most interesting feature of QBIC, however, is the ‘graphical query

language in which queries are posed by drawing, selecting and other graphical

means’ (Flickner et al. (1995:25)). This will be examined in detail in chapter 4.

2.1.1.2 VisualSEEk

The VisualSEEk system, developed by Smith & Chang (1996), is similar to QBIC in

that it supports queries based on visual features such as colour and texture – in this

case, colour is represented using ‘colour sets’, a compact alternative to colour

histograms. Colour sets are easily indexed, enabling the system to search for similar

colour sets with ease, therefore speeding up the image retrieval process (Smith &

Chang (1996)).

Unlike QBIC, however, VisualSEEk is also able to extract and convey spatial

information about images. Each image is divided into a number of regions, with

specific features (i.e. colour), and spatial properties (i.e. size, location and

relationships with other regions). Images are then compared by region (Smith &

Chang (1996)). Users issue queries to VisualSEEk by ‘diagramming spatial

arrangements of colour regions’ (Smith & Chang (1996:1)). For example, if a user

wished to find an image of a sunset, they could sketch a red or orange coloured

region near the top of the query window and a blue or green region near the bottom

(Ortega-Binderberger (1999)).
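The sunset query described above can be sketched in Python as a comparison of coloured regions and their positions. This is a simplified illustration and not VisualSEEk's matching algorithm: it assumes that each region is reduced to a coarse colour label and a centre point in normalised coordinates, and that an image is scored by summing, for each query region, the distance to the nearest image region of the same colour. The class, the function names and the example image names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Region:
    colour: str  # coarse colour label, e.g. "orange"
    x: float     # centre, from 0.0 (left) to 1.0 (right)
    y: float     # centre, from 0.0 (top) to 1.0 (bottom)


def region_distance(query: Region, candidate: Region) -> float:
    """Distance between region centres; infinite if the colours differ."""
    if query.colour != candidate.colour:
        return math.inf
    return math.hypot(query.x - candidate.x, query.y - candidate.y)


def query_distance(query: List[Region], image: List[Region]) -> float:
    """Sum, over the query regions, of the distance to the best image region."""
    return sum(
        min((region_distance(q, r) for r in image), default=math.inf)
        for q in query
    )


def rank_images(query: List[Region], images: Dict[str, List[Region]]) -> List[str]:
    """Return image names ordered from best to worst match."""
    return sorted(images, key=lambda name: query_distance(query, images[name]))
```

For instance, a query consisting of `Region("orange", 0.5, 0.2)` and `Region("blue", 0.5, 0.8)` would favour an image with an orange region near the top and a blue region near the bottom over one in which the same colours appear in the opposite arrangement.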

2.1.1.3 Problems Associated With Feature Based Retrieval Systems

Although systems such as QBIC and VisualSEEk are ‘robust and efficient’ (Srihari

& Zhang (1999:498)), they are often criticised for being unable to convey semantic

information, i.e. for not considering the meaning of images (Amato et al. (1997)).

Mukherjea & Cho (1999:586) explain that: ‘the similarity of two images can be

determined in two ways: visually and semantically’, and ‘since visual similarity does

not consider the meaning of the images, a picture of a figure skater may be visually

similar to a picture of an ice hockey player (because of the white background and

similar shape), but it may not be meaningful for the user’. Even the creators of QBIC

admit that: ‘one of the guiding principles of QBIC is to let computers do what they

do best – quantifiable measurement – and let humans do what they do best –

attaching semantic meaning’ (Flickner et al. (1995:23)).

One solution to the problem of being unable to convey semantic information is to use

a keyword based approach. Although many feature based image retrieval systems,

including QBIC, allow keyword searches in addition to feature based searches, the

user must specify the keywords manually and there is still no concept of semantic

similarity between images (Mukherjea & Cho (1999:587)). Ortega-Binderberger

(1999) points out the difficulty a user faces in describing something that is typically

taken for granted within a given media type. For example, how a user might describe

the ripples on an ocean within a picture would vary depending on the weather.

Flickner et al. (1995:23) claim that ‘if a program can be written to extract

semantically relevant text phrases from images, the problem (of content-based

querying) may be solved’. Several such systems have been attempted; however, none

has so far been proven to be particularly effective (Srihari & Zhang

(1999)).

2.1.2 Combining Feature Based Retrieval and Text Based Retrieval

Several web-based search engines have been successfully designed that avoid the

aforementioned limitations associated with image retrieval systems such as QBIC

and VisualSEEk, by combining feature-based queries with keyword queries.

Furthermore, these systems automatically derive semantic attributes of an image

from a web page by examining the surrounding text, making them capable of

considering both the visual and semantic meaning of images. As Henrich & Robbert

(2000:4) state: ‘In a structured document, the most fertile information about an

image, an audio, or a video/animation can be found in the text objects associated

with this media object’.

Two such systems are the Advanced Multimedia Oriented Retrieval Engine

(AMORE), designed by Mukherjea & Cho (1999), and Webpic, designed by Srihari

& Zhang (1999). Although they only deal with images and text, both systems were

designed with a view to being extended to cover various other media types, at a later

date.

2.1.2.1 AMORE (Advanced Multimedia Oriented Retrieval Engine)

The AMORE system combines image queries and textual queries, indexing both the

textual content and the images contained within web pages. The user issues image

queries by specifying an example image, then asking the system to find similar

images (this is known as ‘query by example’). Studies by Mukherjea & Cho (1999)

showed that users find this method of querying particularly useful in image retrieval

systems. However, this method does have its shortcomings, which will be discussed

in chapter 4. To avoid these problems, the system also allows a user to retrieve

images from the Web by specifying keywords.

The process of assigning keywords to images is automated, based on the content of

the web page from which the image originates. As HTML files do not contain

explicit captions, the system has to parse each individual HTML file in turn and

record the keywords that are located ‘near’ to a particular image (Mukherjea & Cho

(1999)). However, in this case, nearness does not necessarily denote physical

proximity. For example, if an image is in a table, then the keywords describing it

may not be located near to the image itself. Furthermore, if a page contains numerous

images, it is often practically impossible to determine which words should be

associated with which images. To avoid these problems, AMORE uses several

heuristics to determine keywords.
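One simple heuristic of this kind can be sketched with Python's standard `html.parser` module. The sketch is an assumption made for illustration and does not reproduce AMORE's heuristics: it associates each image with the words of its ALT text plus a fixed window of words that precede and follow the image in the HTML source. The class name and the window size are hypothetical.

```python
from html.parser import HTMLParser
from typing import Dict, List


class ImageKeywordParser(HTMLParser):
    """Collect candidate keywords for every <img> element in a page."""

    def __init__(self, window: int = 5):
        super().__init__()
        self.window = window  # number of words taken on each side of an image
        self.words = []       # all text words, in source order
        self.images = []      # (src, ALT words, position within self.words)

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            src = attributes.get("src") or ""
            alt_words = (attributes.get("alt") or "").lower().split()
            self.images.append((src, alt_words, len(self.words)))

    def handle_data(self, data):
        self.words.extend(data.lower().split())

    def keywords(self) -> Dict[str, List[str]]:
        """Return ALT words plus nearby words for each image source."""
        result = {}
        for src, alt_words, position in self.images:
            start = max(0, position - self.window)
            nearby = self.words[start:position + self.window]
            result[src] = alt_words + nearby
        return result
```

Because the window is measured in source order, this sketch shares the weakness noted above: text that appears close to an image on screen, for example in an adjacent table cell, may lie far from it in the HTML file.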

The AMORE system is also specifically designed to consider not only syntactic

information but also semantic information. The keywords assigned to images are

represented as terms within vectors, in a vector model. However, not all keywords

are equally relevant to an image, and many are often completely irrelevant. The

effectiveness of the system at establishing semantic similarity is dependent on the

ability to ‘identify relevant keywords and give them more weight in the vector

model’ (Mukherjea & Cho (1999:594)).
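The effect of keyword weighting in a vector model can be illustrated with a short Python sketch. It assumes that each image is represented as a mapping from keyword to weight and that similarity is measured by the cosine of the angle between two such vectors, which is a standard choice in vector models but is not confirmed here as AMORE's exact formula. The example weights are invented.

```python
import math
from typing import Dict


def cosine_similarity(v1: Dict[str, float], v2: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse keyword vectors."""
    dot = sum(weight * v2.get(term, 0.0) for term, weight in v1.items())
    norm1 = math.sqrt(sum(w * w for w in v1.values()))
    norm2 = math.sqrt(sum(w * w for w in v2.values()))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


# Hypothetical vectors: highly relevant keywords are given a weight of 3.0,
# incidental keywords a weight of 1.0.
figure_skater = {"skating": 3.0, "ice": 1.0, "olympics": 1.0}
another_skater = {"skating": 3.0, "figure": 1.0}
hockey_player = {"hockey": 3.0, "ice": 1.0, "team": 1.0}
```

With these weights, the two skating images are far more similar to each other than either is to the ice hockey image, even though all three may share the incidental keyword ‘ice’ or look alike visually.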

2.1.2.2 Webpic

The Webpic system is similar to AMORE, insofar as it is capable of interactively

combining text processing with image processing in both the indexing and retrieval

phases (Srihari & Zhang (1999)). Furthermore, Webpic exploits the text

accompanying an image, or ‘collateral text’ (Srihari & Zhang (1999:497)), as well as

the image itself.

Webpic is specifically designed to retrieve pictures of certain people in particular

contexts. For example, a user could perform a search for images of Britney Spears

talking to journalists outside a film premiere. In this case, the feature-based side of

the system would be responsible for recognising Britney Spears’ face, whilst

keyword based methods would satisfy the remaining contextual requirements (Srihari

& Zhang (1999)).

During indexing, the Webpic system uses face detection to build up a database of

individuals’ faces, given a set of names to search for and a set of URLs to search

over. The face detection itself uses ‘pattern classification in a colour feature space’

(Srihari & Zhang (1999:508)) to determine whether an image contains a human face

and then to isolate the face from the rest of the image. However, before the image

can be stored in the face library, the identity of the individual must be established.

This is achieved by extracting collateral textual information from the web page.

Webpic uses a ‘collateral text editor’ to ‘determine the scope of the text relevant to a

given picture’ (Srihari & Zhang (1999:500)). In a similar manner to the way in which

AMORE extracts keywords from a web page, Webpic’s collateral text editor is

capable of detecting image describers based on clues within the HTML source code,

such as spatial proximity to text and words contained within ‘ALT’ tags. However,

given background knowledge about captions usually found on news websites,

Webpic is also able to detect describers by searching for the presence of phrases such

as ‘left, foreground, rear’ and the use of special fonts (Srihari & Zhang (1999:500)).

Once extracted, the relevant collateral text is subjected to natural language

processing, to produce information about the context of the image, including: the

objects and people present in the image, the event occurring in the image, the general

context of the image, e.g. political, entertainment etc., and further attributes such as

‘indoors vs. outdoors’, etc. (Srihari & Zhang (1999:504)). Srihari & Zhang

(1999:500) claim that: ‘this represents a more semantic analysis of the data than

general text and image indexing based on statistical features’.

2.2 Content Description Languages For Multimedia

Henrich & Robbert (2000) suggest that a multimedia query language should not only

facilitate a similar approach to those used by systems such as Webpic and AMORE,

where descriptive information is extracted directly from the collateral text within a

web page, but should also facilitate the storage of this information using a

standardised content description language. They state that this so-called meta-data

and, in particular, manually created meta-data ‘will be the backbone of multimedia

information retrieval systems for the next decade’.

Bertino et al. (1999) take this notion further by suggesting that one of the major

differences between MIR systems and traditional IR systems is that the former

require some form of database schema to provide meta-data, whilst the latter do not

support meta-data at all.

Paek et al. (1999:1) also stress that ‘an interoperable method of describing

multimedia content is necessary’, whilst Li et al. (1998:1) point out that ‘there is a

growing need for developing a content description language for multimedia that

improves searching, indexing and managing of the multimedia content’.

CHAPTER 3:

USER INTERFACE DESIGN IN INFORMATION RETRIEVAL

HCI (Human Computer Interaction) research is ‘concerned with the design,

evaluation and implementation of interactive computing systems for human use and

with the study of major phenomena surrounding them’ (Preece et al. (2002:8)). HCI

research is increasingly being recognised as important within the field of computer

science. One of the key focuses in HCI research is the design and evaluation of user

interfaces for both hardware and software.

Hackos & Redish (1998) describe the user interface as:

‘the bridge between the world of the product or system and the world of the

users... the means by which the users interact with the product to achieve

their goal... the means by which the system reveals itself to the users and

behaves in relation to the users’ needs’.

Developers have recently begun to take notice of the importance of user interfaces in

IR systems, due mainly to the growth of the World Wide Web (WWW), which has

made vast quantities of information available to a wide and varied user base, thereby

posing the opportunity for user interface research to have a greater impact than ever

before (Hearst (1999)). However, as Hansen (1997) points out, this has not always

been the case. The evaluation of IR systems has traditionally been considered an

issue of precision and recall (standard measures of performance for IR systems), as

opposed to one of usability, which in the past has led developers to neglect the

importance of the user interface and create systems that are: ‘a hurdle for novice

users and an inadequate tool for experts’ (Shneiderman (1998:511)). Nielsen (1993)

recommends that in future a qualitative, rather than a quantitative, approach should

be taken towards the evaluation of IR systems, with a focus on user satisfaction in

terms of the results presented and the user interface.
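
For reference, precision is the fraction of retrieved items that are relevant, and recall is the fraction of relevant items that are retrieved. A minimal sketch of both measures follows; the function name and the convention of returning 0.0 for an empty set are assumptions.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Neither figure says anything about how easily the user was able to formulate the query or interpret the results, which is the gap that usability evaluation is intended to fill.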

3.1 Design Aims

The design process when creating a user interface can be divided into two stages -

the development of interaction components and the development of interface

software. The former is concerned with how the interface functions and responds to

user interaction, whilst the latter deals with the coding and general implementation

(Hix & Hartson (1993)). This project focuses only on the first stage, the development

of interaction components.

When creating a user interface for an IR system, it is particularly important that the

interface is well designed, as users more often than not approach the system with an

uncertainty about what their goals are and how they will be achieved. The goal of the

user interface in an IR system is to enable a user to satisfy an information need

without the assistance of an experienced, human intermediary (Brajnik et al. (1996)).

A well-designed interface should therefore assist the user in clarifying their

information needs, and subsequently help them to formulate suitable queries and

understand the results (Hearst (1999)).

3.2 The User-Based Approach

In order to design an effective user interface for an IR system, it is necessary to shift

away from the traditional systems-based perspective of IR research, towards a user-

centric approach, that considers the knowledge, experience, requirements, goals and

expectations of users (Hansen (1997)). Abels et al. (1998) describe the user as

occupying a central place during the user-based design process, whereby whenever

possible, design decisions are made by the user as opposed to the designer.

Shneiderman (1998) stresses the importance of characterising the users and the

situation as accurately and thoroughly as possible before beginning the design of a

system. However, when the diversity of individuals that use the WWW is combined

with the wide range of scenarios, goals, and frequencies of use, ‘the set of design

possibilities becomes enormous’ (Shneiderman (1998:67)).

As an example of differing user requirements, Shneiderman (1998) describes three,

crudely defined, generic user groups and how design aims might differ to cater for

each:

The first group Shneiderman (1998:68) defines are ‘novice or first time users’ -

novice users are those who have absolutely no experience of using similar interfaces,

whilst first time users are familiar with the basic task concepts, but not the interface

concepts. Both groups may not be at ease with using computers, which may in turn

hinder their ability to learn. The designer must therefore ensure that constructive help

and feedback are provided to the user. They must also ensure that they refrain from

using technical jargon that may cause confusion. Furthermore, it helps to keep the

number of actions the user must perform to a minimum, thereby ensuring that basic

goals can be achieved, in turn improving the user’s confidence.

The second group of users are ‘knowledgeable, intermittent users’ - those who use a

variety of systems on an irregular basis, therefore possessing a moderate knowledge

of both general task and interface concepts. The main design aim in this case is to

assist the user in remembering how to complete tasks successfully. This can be

achieved by structuring components in an orderly manner, using ‘consistent

terminology’ and maintaining ‘high interface apparency’.

The final group of users, ‘expert frequent users’, are those who frequently use such

systems and are highly familiar with the task and interface concepts. In this case, the

designer must create an interface that allows the user to complete their tasks as

rapidly as possible, with the least amount of interruption. This can be facilitated by

providing shortcuts, macros, etc.

The design principles suggested by Shneiderman (1998) are in keeping with those of

Hearst (1999:259), who recommends that a user interface for an IR system should:

‘offer informative feedback’, ‘reduce working memory load’ and ‘provide alternative

interfaces for novice and expert users’.

3.3 The Information Retrieval Process

Hearst (1999) identifies three fundamental stages in the information retrieval process,

which an IR interface should support: 1. Starting the retrieval process, 2. Specifying

a query, 3. Viewing retrieved results in context; and occasionally, 4. Query

reformulation.

3.3.1 Starting The Retrieval Process

In the first stage, the user begins the retrieval process. The user interface must offer

the user something more detailed than simply a query box. The interface should give

users clues about how to begin their search, for example, by assisting them in

selecting the sources they wish to search over. This is particularly important in the

case of novice or first time users.

3.3.2 Specifying a Query

In the second stage, the user formulates and specifies a query. Shneiderman

(1998:71) lists five ‘primary interaction styles’, each of which can be used in query

specification interfaces:

1. Menu selection - where users choose the item that is most appropriate to their task

from a choice of options.

2. Form fillin - used when data must be inputted, and menu selection is therefore

inappropriate. In this case, users see a list of fields, select the appropriate fields

and enter data where necessary.

3. Command language - where users enter purely textual expressions. This style is

the most common interaction style for traditional IR systems, which accept

Boolean queries consisting of combinations of keywords and the AND and OR

operators.

4. Natural language - where users enter natural language sentences or phrases,

without the need for Boolean operators, etc. IR systems that use statistical

ranking algorithms often use this interaction style. However, such systems have

the disadvantage of giving the user ‘less feedback about and control over the

results’ (Hearst (1999:287)).

5. Direct manipulation - whilst this is perhaps the most interesting and innovative of

the five interaction styles proposed by Shneiderman (1998), it is also the least

common. This graphical approach involves creating visual representations of

objects and actions, which the user can manipulate during ‘rapid, incremental,

reversible operations’ using ‘physical actions or button presses’, the results of

which are made ‘immediately visible’ (Shneiderman (1998:205)). Direct

manipulation instills a feeling within the user that they are directly involved in a

world of objects rather than merely communicating with an intermediary

(Hutchins et al. (1986)).
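
To make the command language style more concrete, the sketch below evaluates a simple Boolean query against an inverted index. It is an illustration only: the function names are hypothetical, AND is assumed to bind more tightly than OR, and parentheses and other operators are not supported.

```python
def build_index(documents):
    """Map each lower-cased word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


def boolean_search(query, index):
    """Evaluate a query such as 'red AND car OR blue' (AND binds tighter than OR)."""
    result = set()
    for clause in query.split(" OR "):
        terms = [term.strip().lower() for term in clause.split(" AND ")]
        postings = [index.get(term, set()) for term in terms]
        # A clause matches only the documents containing every one of its terms.
        result |= set.intersection(*postings)
    return result
```

The user must know both the operators and their precedence to predict the outcome, which is one reason this style suits expert users better than novices.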

3.3.3 Viewing Retrieved Results in Context

The third stage in the retrieval process involves the user browsing through the

returned results set. The user interface must display the results in a contextual

manner, to facilitate the user’s understanding of them. In traditional IR systems, this

is usually achieved by returning the results in a ranked list that denotes order of

relevance to the query. However, Hearst (1999) gives examples of other methods,

such as using a tabular display in which results can be organised depending on

criteria assigned to the X and Y axes, and the KWIC (keyword-in-context) method

whereby the portions of the documents containing keywords are extracted from the

text and presented along with information such as document title and abstract.
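
The KWIC method can be sketched in a few lines. The function below is a simplified illustration rather than any particular system's implementation; the `width` parameter (the number of words of context on each side) and the use of square brackets to mark the keyword are assumptions.

```python
import re


def kwic(text, keyword, width=3):
    """Return each occurrence of keyword with up to `width` words of context either side."""
    words = text.split()
    snippets = []
    for i, word in enumerate(words):
        # Compare without punctuation and without regard to case.
        if re.sub(r"\W", "", word).lower() == keyword.lower():
            left = words[max(0, i - width):i]
            right = words[i + 1:i + 1 + width]
            snippets.append(" ".join(left + ["[" + word + "]"] + right))
    return snippets
```

For instance, searching the text 'The tower at night, seen from the river.' for 'tower' with a width of 2 would yield the single snippet 'The [tower] at night,'.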

3.3.4 Query Reformulation

Although it is not a compulsory part of the retrieval process, having viewed the result

set, users may wish to reformulate their query and begin the search process again.

Some IR systems provide a feature known as ‘relevance feedback’ to support this,

whereby the user reformulates their query by selecting the results that they feel are

most relevant to their query. The system then extracts features from the chosen

documents and returns a new set of results, based on this information.

For an IR system to be capable of relevance feedback, the user interface should be

designed in such a way that facilitates it. For example, results should be presented

with checkboxes, enabling the user to denote them as relevant. However, how much

control the interface should give the user over the relevance feedback process is

arguable (Hearst (1999)).
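
One common way of implementing relevance feedback in a vector model is a Rocchio-style update, in which the query vector is moved towards the average of the documents marked as relevant. The sketch below shows only this positive-feedback step; it is an assumed technique chosen for illustration, not one attributed to any system discussed in this report, and the parameter names `alpha` and `beta` are conventional rather than prescribed.

```python
def refine_query(query_vector, relevant_vectors, alpha=1.0, beta=0.75):
    """Return a new term -> weight query built from the old query and the chosen results."""
    new_query = {term: alpha * weight for term, weight in query_vector.items()}
    count = len(relevant_vectors)
    for vector in relevant_vectors:
        for term, weight in vector.items():
            # Add each term's average weight across the selected documents.
            new_query[term] = new_query.get(term, 0.0) + beta * weight / count
    return new_query
```

In interface terms, the checkboxes supply `relevant_vectors`, while `alpha` and `beta` are examples of the kind of control that a designer might choose either to expose to the user or to hide.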

CHAPTER 4: SURVEY OF EXISTING IR SYSTEMS

In this chapter, a sample of the existing IR systems for a variety of different media

types will be surveyed, with a particular emphasis on their user interfaces. Some of

the systems described in chapter 2 are excluded from this survey, because working

versions were not publicly accessible.

Although some of the systems surveyed in this chapter only return objects of a

specific media type, the user interfaces are still of interest, as the UI concepts they

utilise, in particular the query methods and result presentation styles, may be applied

to the design of true multimedia retrieval systems, that retrieve a variety of different

media types.

4.1 Content Based Image Retrieval Systems

The systems surveyed in this section allow the user to query their databases and

retrieve images based on their visual characteristics.

4.1.1 Circus (Content-based Image Retrieval and Consultation User-centered

System) - The Laboratory for Audio-Visual Communications & Laboratory of

Ergonomics of Intelligent Systems and Design (Switzerland)

The system offers colour and spatial-feature matching features. A user can search

using a standard Query-by-Example (QBE) system, where images are added to the

query as the user considers them to be similar to the desired results, or using a unique

colour matching option. The colour matching works by allowing the user to specify

the percentage of each desired colour that should appear in the image. An example

image is selected and a part of the image is then clicked on to denote the colour that

should be matched. The similarly coloured region of the image is then automatically

selected by the system. This coloured region can then be fine-tuned using a palette,

before searching. The system also offers a rudimentary sketching option that allows

the user to draw their query; however, this is not fully functional at present.
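
Matching by colour percentage can be illustrated as follows. This is a simplified sketch, not the algorithm used by Circus: it assumes that each pixel has already been reduced to a named colour, and it scores an image by how far its colour proportions are from those requested.

```python
def colour_proportions(pixels):
    """Return the share of the image taken up by each (already quantised) colour."""
    counts = {}
    for colour in pixels:
        counts[colour] = counts.get(colour, 0) + 1
    total = len(pixels)
    return {colour: n / total for colour, n in counts.items()} if total else {}


def proportion_distance(wanted, image_proportions):
    """Sum of absolute differences over the requested colours; smaller means a closer match."""
    return sum(abs(share - image_proportions.get(colour, 0.0))
               for colour, share in wanted.items())
```

Images in a collection could then be ranked by sorting on `proportion_distance`, with the closest matches presented first.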

The interface itself (see figure 4.1 (a)) is neat and compact, with a series of

overlapping windows that allow the user to quickly switch between browsing

through previous results, and issuing new queries, and a series of shortcut buttons for

connecting to the system, executing queries etc. Pull-down menus are used to allow

the user to select a specific image collection and query algorithm. A useful ‘basket’

feature allows the user to store images of particular interest, along with their own

annotations.

Figure 4.1 (a): Circus interface

The results are presented as in figure 4.1 (b), ranked in order of decreasing relevance

from left to right and top to bottom. Results can be clicked on by the user to add

them to an existing search query, as relevant examples, or double-clicked to create a

new query consisting of the chosen image alone.

A demonstration version of this system is currently available in Java applet form

(http://lcavwww.epfl.ch/~zpecenov/CIRCUS/Demo.html). However, at the time of

writing it was not fully accessible.

Figure 4.1 (b): Circus results screen

4.1.2 Compass - ITC-IRST Centre for Scientific and Technological Research

Compass is another query by example system, and supports colour, spatial and

textual feature matching. The user selects the desired example images from

thumbnails of the images in the database. These are then added to a ‘Query bag’.

Based on the contents of the ‘Query bag’ a search is conducted, and the 100 most

relevant images are returned.

The system interface (see figure 4.2) is particularly interesting, as the database

images, contents of the query bag and results are all presented within one window

(located from left to right), making it quick and easy to perform searches. The

interface also contains check boxes and text boxes to allow the user to modify search

criteria (hue, intensity, saturation and edge) and the threshold value, respectively.

Figure 4.2: Compass interface

Various demo versions of this system can be downloaded from the Compass website

(http://compass.itc.it/demos.html), each containing different image databases, e.g.

fine art, faces, etc.

4.1.3 NETRA-2 - Department of Electrical and Computer Engineering, University

of California

NETRA-2 uses colour matching to retrieve results. However, the unique aspect of

the system is the colour image segmentation algorithm, which allows the user to

select a specific segment of interest within an image. A search is then conducted for

similarly coloured regions within other images in the database and those images are

returned. It has been proposed to extend the system to incorporate texture, shape and

spatial location information into the searching algorithm.

The system’s interface (see figure 4.3) is particularly simple to use. The window

is divided into four areas. The top frame allows the user to select an image database

from a pull-down menu or issue a textual query for a specific image number. The

right side of the window displays an array of images in the database. This is replaced

by the search results once a user issues a query. These images can be displayed in

‘segmented’ or ‘unsegmented’ form (specified in a pull-down menu) and browsed

through using the ‘<<’ and ‘>>’ buttons. Queries are issued by the user clicking on

an image from the database. The image then appears in the ‘query image’ pane in the

left-side of the window and the user can then issue a query by clicking on a specific

region of the image. A menu in the bottom-left of the window allows the user to

choose a particular category of images to search over.

Figure 4.3: NETRA-2 interface

A Java applet demo version of the system, using a database comprising 2500

images from Corel photo CDs, is available for download at:

http://maya.ece.ucsb.edu/Netra/netra2.html

4.1.4 QBIC - IBM, Almaden Research Centre

The QBIC system, discussed in section 2.1.1.1, offers several query types: simple

queries, multi-feature queries and multi-pass queries. Simple queries consist of

searches using a single feature, e.g. colour, whilst multi-feature queries are searches

that incorporate several features, e.g. colour, shape and texture.

The system is currently being utilised as a search engine for finding items within the

collection at the Hermitage Museum website (www.hermitagemuseum.org). Two

search options are available on the site, a colour search and a layout search. The

colour search allows the user to add different colours to a ‘bucket’, and specify the

proportion of each colour that should be present in the retrieved image. The interface

(see figure 4.4 (a)) consists of a colour palette, sliders with which the user can adjust

RGB values, and the colour ‘bucket’.

Figure 4.4 (a): QBIC colour search interface

The layout search on the other hand allows the user to draw coloured squares and

circles on a canvas to denote the general spatial information of the image they wish

to retrieve. The interface for the layout search (see figure 4.4 (b)) consists of a palette

with which a user can select a colour, and sliders to adjust RGB values. Next to the

canvas there are buttons for selecting shapes and ‘Bring to front’/‘Send to back’

buttons. Below the canvas there are buttons for deleting individual shapes, clearing

the canvas and issuing the search query.

Figure 4.4 (b): QBIC layout search interface

4.2 Content-Based Audio Retrieval Systems

The two systems described in this section both retrieve audio files based on their

content. However, they vary greatly in their approaches.

4.2.1 SoundFisher - Muscle Fish (division of Audible Magic Corporation)

Rather than being a straightforward content-based audio retrieval system,

SoundFisher is marketed as a sound effects database management system. The

primary purpose of the system is to categorise sound effects, based on their

similarities. Given an example sound, the system is able to retrieve similar sounds

from a database, based on various audio features, which will be described later.

The interface for SoundFisher is adequately detailed, but still user-friendly enough to

guarantee simple searches can be performed with ease. Special SoundFisher

databases of sounds can be created or downloaded for use within the system, or

individual sounds can be added, through the ‘File’ menu. Issuing a query first

involves selecting an example sound from the database. The sounds are listed in the

bottom-half of the window in several different ways, depending on the view chosen

by the user. The category view (see figure 4.5 (a)) lists the filenames in a tree

formation, determined by the user’s creation of categories. The list view presents the

results in the style of a spreadsheet. The 2D view presents the files as vectors on an

X and Y axis, where the X and Y axes are selectable from pull-down menus with the

following fields: loudness, loudness rate, pitch (octaves), pitch rate, brightness

(octaves), in time and out time.

Figure 4.5 (a): SoundFisher category view

Files are selected as examples by clicking on the filename, or relevant row in the

spreadsheet. It should be noted, however, that files cannot be selected from within

the 2D view. Furthermore, the sound can be previewed using the playback buttons

located above the results pane.

Once a suitable sound has been selected, the user then selects the appropriate fields

from the pull-down menus in the ‘Query’ pane, to dictate how many results should

be returned (first 100, first 1000, first 10000 or all), which files should be searched

(entire database, current records or selected records), and what features should be

matched. The system can perform searches based on the features of the actual sound,

i.e. duration, loudness, loudness rate, pitch, pitch rate, brightness, brightness rate,

sample rate, sample width, sample format. The system can also perform searches

based on physical features of the file, i.e. filename, file format, date, channels, and

searches based on features defined by the user within SoundFisher, i.e. category,

URL, keyword, comment.

The results are listed in the bottom-half of the window (see figure 4.5 (b)), in place

of the database contents, using the spreadsheet-style view. Shortcut buttons located at

the top of the window make it simple to switch between the results view and

database contents view, reload a particular view, issue a search query and print the

records within the current view.

Figure 4.5 (b): SoundFisher results view

Version 1.0 of the system is available for download at: http://www.soundfisher.com.

4.2.2 SpeechBot - Hewlett-Packard

SpeechBot is a content-based audio retrieval system that searches over 14,500 hours

of radio broadcasts from about 25 different shows, for specific keywords or phrases.

Hewlett-Packard use speech recognition software to automatically generate a time-

aligned transcript of each show. Although the transcripts are fairly inaccurate, the

key words are said to be recognised correctly, most of the time. These transcripts are

indexed and searched through whenever a query is issued.
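
The idea of indexing a time-aligned transcript can be sketched as below. This is not SpeechBot's implementation; the function name, the show identifiers and the representation of a transcript as (seconds, word) pairs are assumptions made for illustration.

```python
def index_transcript(show_id, timed_words, index=None):
    """Add a transcript, given as (seconds, word) pairs, to a word -> [(show, seconds)] index."""
    index = {} if index is None else index
    for seconds, word in timed_words:
        index.setdefault(word.lower(), []).append((show_id, seconds))
    return index
```

Looking up a keyword in such an index returns not only the shows in which it was recognised but also the offsets at which playback could begin, which is what allows each result to be paired with a play button.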

The SpeechBot system offers two search options: a simple search and an advanced

search. The interface for the simple search (see figure 4.6 (a)) is minimal yet

effective, containing just one textbox for the query, and two pull-down menus, one to

select topic, e.g. arts and entertainment, government and military, etc. and one to

select date, which allows the user to narrow the results down to those created in the

last 3 days, last week, last 2 weeks, last month, last 3 months, last 6 months or last

year.

Figure 4.6 (a): SpeechBot query screen

The interface for the SpeechBot Power Search (see figure 4.6 (b)) is the same as for

the simple search, but with additional pull-down menus to allow the user to specify

exactly which radio show to search over, and whether to rank the results by

relevance, originating website or date. Furthermore, radio buttons can be used to

specify whether the system should search for all the words within the query, any of

the words in the query, the exact phrase, or a Boolean expression. Although there are

radio buttons for selecting an audio, video or audio & video search, the database only

contains audio files at present.

Figure 4.6 (b): SpeechBot ‘power search’ screen

The search results are displayed in a table (see figure 4.6 (c)), below the search

interface. For each result, the table contains a play button, the originating website,

the date of the broadcast, and a brief extract from the transcription, containing the

search query.

Figure 4.6 (c): SpeechBot results view

4.3 Multimedia Retrieval Systems

The systems described in this section are ‘true’ multimedia IR systems, in so far as

they are capable of returning results in the form of a variety of different media types.

4.3.1 AllTheWeb

AllTheWeb is perhaps the largest and most popular multimedia retrieval system

available for public usage on the World Wide Web. Given a specific keyword or

phrase, AllTheWeb searches the Web or, more precisely, those web pages indexed

by the system, and returns pictures, videos and audio clips pertaining to the query.

AllTheWeb offers two search options, as is standard with most web-based search

engines: a basic search and an ‘advanced search’. The basic query screen (see

Appendix A-1) is fairly minimal, with a pull-down menu allowing the user to select

the language in which to search, a text-box in which to enter a query, and a button to

begin the search. There is also a check-box which allows the user to specify, when

using queries containing more than one word, if the exact phrase should be searched for,

as opposed to just the individual words.

However, what makes the AllTheWeb interface particularly interesting is that the

user can choose which media type they wish to search for by clicking on one of the

media-type tabs located above the query box. By default ‘Web pages’ is the

highlighted tab. However, if the user clicks on the ‘Videos’ tab for example, they are

presented with the same query box but the subsequent search results are restricted to

video files only.

Once the results have been displayed, the media-type tabs can also be used to re-

issue the same query for a different type of media. For example, if the user is

currently browsing image results from a particular query and they click on the

‘Videos’ tab, then the interface will change to display video results from the same

query.

In the web pages section of the results, the first ten web page titles are displayed,

along with short summaries describing their content (see Appendix A-2). The news

section has a similar layout, with the first ten news pages displayed, along with a

summary extracted from the document itself.

In the images section of the results, the first nine image results (in order of relevance)

are previewed as thumbnails (see Appendix A-3), below which is the file name,

along with the file’s format (e.g. GIF, JPG, etc.), dimensions (e.g. 640x480) and size

(e.g. 56 kB). If a user follows an image link, a larger (but still scaled down) version

of the file is displayed, along with further image file information and links to the

actual image and its originating page.

In the video section, the results are listed in a table format, with the column headings

‘title and nearby text’, ‘duration’, ‘size’ and ‘format’ (see Appendix A-4).

The MP3 results are presented in a similar manner to video results, but instead using

the headings ‘reliability’, ‘type’, ‘title’, ‘size’ and ‘date’ (see Appendix A-5).

Reliability is represented as a number of stars, but it is unclear how this is

determined.

A particularly original feature of AllTheWeb is that when results of a particular

media type are being displayed, links to the top search results in a different media

type are also listed, in a separate area to the right of the results being displayed. For

example, if a user was viewing a list of video results, links to the top 7 web results

would also be listed, separate from the results, as shown in Appendix A-3.

As is to be expected, AllTheWeb’s advanced search interface is more complex, with

the system allowing the user to specify more detailed individual preferences.

However, the options available vary for each media type searched and there is no

universal advanced search interface that covers all media types. The advanced search

screen displayed is determined by which media-type tab is highlighted at the time.

For example, if the user is currently browsing web page results, and they click on the

‘Advanced Search’ link, they will be presented with search options specific to web

pages, as shown in Appendix A-6.

These search options enable the user to restrict their search by specifying, for

example, that only results updated a certain amount of time prior to the search should

be returned, e.g. in the last month, the last 6 months, etc. Users can also use a

combination of two pull-down menus and a query box to specify that only results

smaller than, larger than or exactly a certain size in bytes, kilobytes or megabytes

should be returned. The user can also restrict the number of search results that are

displayed per page. However, although one would expect these options to be

available for all file types, this is inexplicably not the case.

The AllTheWeb search engine can be accessed at: http://www.alltheweb.com.

4.3.2 Lycos Multimedia Search

Although the Lycos Multimedia Search (LMS) system is powered by the same

technology as AllTheWeb, the user interface is fairly dissimilar. The major

difference between the two search engines is that LMS does not use the same tab

system that AllTheWeb uses.

To issue a query, the user must click on one of a series of radio buttons labelled

‘All’, ‘Pictures’, ‘Audio’, ‘MP3s’ and ‘Video’, as shown in Appendix B-1, to select a

media type, before entering their query. The exclusion of web page and news results

further distinguishes LMS from AllTheWeb.

If the user chooses to display ‘All’ media types, as is the default option, the results

are presented in three sections (see Appendix B-2).

In this case, the first three image results (in order of relevance) are previewed as

thumbnails in the image section, along with filenames, details of the source and a

link to ‘Enlarge’ the image.

If a user follows the ‘Enlarge’ link, a larger version of the image is displayed, along

with a short summary of what the image shows, the filename and copyright details

(see Appendix B-3). Another particularly interesting feature, although it should be

noted that this is not available if the image does not reside on the Lycos server, is the

inclusion of ‘Next’ and ‘Previous’ image thumbnails.

In the audio section of the results pages, the first three audio results are listed by a

short description of the contents of the audio file, along with file name, file size and a

‘Play Audio Track’ link. It is unclear how the descriptions are generated, and they

often appear inaccurate, and occasionally completely irrelevant. The video results are

presented in the same manner as the audio results, but listed by file name as opposed

to description.

In each section, a ‘More...’ link allows the user to view the other image, video or

audio results, individually. Appendix B-4 shows an example of image results being

displayed exclusively. Furthermore, if the user decides to restrict their search to a

certain media type, then instead of having to re-issue their query, they can simply

click on one of the links at the top of the page. These links serve a similar purpose to

the tabs in AllTheWeb.

The Lycos Multimedia Search engine is located at: http://multimedia.lycos.com/

4.3.3 Singingfish

This is a similar system to AllTheWeb and Lycos Multimedia Search, in so far as it

is a web-based search engine, which returns multimedia objects, along with relevant

information. However, in the case of Singingfish, the search is limited to audio and

video files.

The basic query screen (see Appendix C-1) is even less sophisticated than that used

by AllTheWeb, with simply a text-box in which to enter the search query.

However, the advanced search interface (see Appendix C-2) has several interesting

features. As well as being able to use checkboxes to select the usual features such as

file format (MP3, QuickTime, Real or Windows) and media type (audio, video or

live-stream), the user can choose to view media that is specifically encoded for a

certain connection speed (28.8k through to 300k) and is of a specific length (>1

minute or >3 minutes). However, perhaps the most interesting feature is the ability to

group the results by category.

A selectable list containing a choice of categories (other, music, movies, news, radio,

sports, television, finance, or all categories) can be used to denote that only results

regarded as belonging to a specific category should be returned. However, it is

unclear how the system determines which category a specific file belongs to.

If the ‘group by category’ checkbox is ticked, the search results are presented

categorised as shown in Appendix C-3. For example, ‘other’ results are grouped

together, followed by music results, then movie results, etc. The first five results are

presented in each category, with the option to display more results individually.
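
The grouped presentation described above can be sketched as follows. This is an illustrative reconstruction rather than Singingfish's own code: the function name `group_by_category`, the data layout and the sample results are assumptions, and only the category order and the limit of five results per category are taken from the description above.

```python
from collections import defaultdict

# Display order taken from the category list in the advanced search interface.
CATEGORY_ORDER = ["other", "music", "movies", "news", "radio",
                  "sports", "television", "finance"]


def group_by_category(ranked_results, per_category=5):
    """Group relevance-ranked (title, category) pairs by category.

    The input is assumed to be sorted by relevance already, so the first
    `per_category` titles kept for each category are its most relevant ones.
    Results whose category is not in CATEGORY_ORDER are ignored in this sketch.
    """
    groups = defaultdict(list)
    for title, category in ranked_results:
        if len(groups[category]) < per_category:
            groups[category].append(title)
    # Emit categories in the fixed display order, skipping empty ones.
    return [(c, groups[c]) for c in CATEGORY_ORDER if groups[c]]


# Made-up results, limited to one per category to keep the output short.
ranked = [("Clip A", "music"), ("Clip B", "sports"),
          ("Clip C", "music"), ("Clip D", "other")]
grouped = group_by_category(ranked, per_category=1)
print(grouped)
# [('other', ['Clip D']), ('music', ['Clip A']), ('sports', ['Clip B'])]
```
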


If the option to group by category is not selected, the results are sorted by relevance

only (see Appendix C-4). In either case, the results are presented as a title, followed

by any additional information stored within the file, e.g. artist, copyright holder, etc.

The length of the file is also listed. Each result is also accompanied by an icon

denoting the type of file, e.g. MP3 file, Real file, Windows Media file, etc. and the

most suitable connection speed with which to view the file. It should be noted that

audio and video files are interspersed, and not presented separately as is the case with

AllTheWeb and LMS.

The Singingfish search engine can be accessed at: http://www.singingfish.com.


4.4 Critical Analysis of Query Methods

Several of the systems analysed in the previous section, for example Circus and

Compass, allow the user to specify their query by selecting an example of a relevant

object from a subset of the searchable objects, or from a particular database. As

mentioned in chapter 2, this is a method known as Query-by-Example (QBE).

The QBE query method does have its advantages, in so far as it is a fairly straightforward

way for novice and first-time users to obtain useful results from the system, due to the

simplicity of the concept and the ease with which the system can process the query

effectively. However, the method is potentially time-consuming, as querying may

involve searching through numerous pages of objects before a suitable object is

encountered. Most significantly though, the QBE method lacks expressive power,

and an expert user may find themselves unable to represent all the desired features

(Cho & Yoo, 1998).

An alternative to the QBE method, that aims to avoid the problem of insufficient

expressive power, is the user-drawn sketch approach adopted by systems such as

QBIC, and described in section 4.1.4. Using such a system, the user may feel that

they are more able to represent their query accurately. However, although the

rudimentary sketch system is able to represent features such as colour, shape and

texture, the effectiveness of the search is dependent on the user’s ability to accurately

depict the target image, and it is often the case that the user’s representation differs

greatly from the desired result.

The colour-matching method, the alternative query method employed by QBIC and

also used by Circus, whereby the user specifies a particular colour, is simpler

than the previous methods. It is therefore easier for a user to specify a query that

more closely matches what they are aiming to retrieve. However, colour-matching

does not take into account features such as shape and texture and therefore has less

expressive power than QBE.


The query method adopted by Speechbot and all three of the true MIR systems

surveyed is the traditional keyword-based approach. Although the keyword-based

approach lacks expressive power to a certain extent, for example a user may find it

difficult to describe the texture of an image, it is the least time-consuming way for

users to issue queries. The method is extremely simple and most novice users

should be familiar with the concept of entering a textual query. At the same time, by

also recognising commands such as Boolean operators, etc. the query method can

cater to more experienced users.

Most importantly, in the context of this study, the keyword-based approach can be

applied to any type of media, enabling the user to search for and retrieve objects in a

variety of different media types with just one query, as demonstrated by AllTheWeb,

LMS and Singingfish.

It should be noted, however, that this method does have a weakness, in that it is

usually reliant on the indexed objects being annotated successfully, which is

sometimes a time-consuming, manual and potentially subjective process. However,

the three true MIR systems surveyed demonstrate that in the case of a web-based

search engine, it is possible to produce effective search results using only

information automatically extracted from the objects themselves and their originating

web pages.


CHAPTER 5: METHODOLOGY

Having taken the factors discussed in section 4.4 into consideration, it was decided

that the user interface designed should utilise the keyword-based query method.

Different approaches were considered, but the original aim of the study was to

produce an interface that allowed the user to search for and retrieve objects of a

variety of different media types. It was apparent that none of the other approaches detailed

in chapter 4 would be suitable for this purpose, and it was predicted that creating an

interface that combined several different query methods would be too complicated a

design task and not necessarily practical from a user’s viewpoint.

Subsequently, it was decided that the three MIR systems described in section 4.3 -

AllTheWeb, Lycos Multimedia Search (LMS) and Singingfish - should be used as a

basis for usability testing, as these were the only publicly accessible, web-based MIR

systems capable of retrieving objects of several different media types.

The methodology used to reach the final UI design takes a user-based approach, with

the following aims:

• Use qualitative and quantitative data collection methods (interviews and

questionnaires) to collect cognitive and statistical data from users performing

predetermined information seeking tasks on existing MIR systems.

• Analyse the resulting data, in order to establish users’ views and opinions on

the systems, and subsequently use this information to generate initial UI

designs.

• Collect qualitative data from usability evaluation of a low-fidelity prototype

and use this data to create the final UI design in HTML.


5.1 Usability Evaluation of Existing MIR Systems

The first set of usability evaluation sessions took place over the period of a week in

July 2002, in the usability lab in the Department of Information Studies, Sheffield

University.

In the experiment, four participants, tested separately during 2 hour sessions, were

given briefings (see Appendix D-1) which instructed them that they were to perform

20 minute retrieval tasks on each of the three MIR systems - AllTheWeb, LMS and

Singingfish. A within-groups (same-participant) design was adopted, whereby each

user performed every task. The participants were selected from students completing

the MSc Information Management course within the Department of Information

Studies, as they were known to have a degree of familiarity with web-based search

engines.

Familiarity with the goals and objectives of the participants completing the tasks was

assumed, and the main focus of the evaluation was to ascertain their thoughts,

preferences and opinions regarding each of the three MIR systems.

5.1.1 Initial Questionnaire

Before beginning the evaluation, participants were asked to complete an initial

questionnaire (see Appendix D-2), aimed at collecting background information about

the participants, including their web searching habits, their level of expertise and

most importantly, whether they had previously used any of the search engines.

5.1.2 Retrieval Tasks

Having completed the initial questionnaires, the participants were then given

scenarios (see Appendix D-3), each of which asked them to retrieve the most

relevant information relating to a specific subject. They were allowed to perform an

unlimited number of searches and asked to bookmark any objects they perceived to

be of relevance. Whilst completing the tasks, the participants’ facial expressions and

on-screen activities were recorded using a webcam and video capture software.


Using a technique developed by Ericsson & Simon (1985), participants were asked to

think aloud and verbalise their thoughts and feelings whilst completing each task.

This was also recorded, with the use of a microphone.

It was crucial that the tasks encouraged the participants to interact with the systems

as much as possible. Therefore, the tasks were tested beforehand to ensure that given

the subjects the participants were asked to retrieve information about, each of the

systems was likely to return a large number of results across all media types,

regardless of the participant’s skill level. However, at the same time, the task had to

be made suitably specific, to ensure that only a low proportion of these results could

be considered relevant, thereby ensuring that the user had to perform several

different searches.

Counterbalancing was used during testing, i.e. the order in which the systems were

tested was varied from user to user, to avoid introducing learning bias. However,

each task could only be allocated to a particular system, as the number of results

returned for the given subjects varied from system to system.
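
One way of producing a counterbalanced schedule is sketched below. The report does not state which system orders were actually used, so the scheme shown here (cycling through the possible orders of the three systems) and the participant labels P1 to P4 are assumptions made purely for illustration.

```python
from itertools import permutations

SYSTEMS = ["AllTheWeb", "LMS", "Singingfish"]


def assign_orders(participants, systems=SYSTEMS):
    """Give each participant a system order, cycling through all permutations.

    Three systems give six possible orders, so with four participants every
    participant receives a different order, although not every order is used.
    """
    orders = list(permutations(systems))
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}


# Hypothetical participant labels; the study itself used four participants.
schedule = assign_orders(["P1", "P2", "P3", "P4"])
for participant, order in schedule.items():
    print(participant, "->", ", ".join(order))
```

With only four participants the design cannot be fully counterbalanced across all six orders; a larger study would typically use a multiple of six participants, or a Latin square, so that each system appears equally often in each position.
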

5.1.3 Usability Satisfaction Questionnaire

After completing each task, the participants were then asked to complete an online

questionnaire, based on Lewis’ (1995) Computer Usability Satisfaction

Questionnaire.

The questionnaire measured the participants’ satisfaction with different aspects of the

system’s usability, by asking them to rate statements such as: ‘It was simple to use

this system’ on a 3-point Likert[1] scale, based on the extent to which they agreed or

disagreed with the statement.

[1] Likert scales are commonly used in the evaluation of user satisfaction. They present the individual with a set number of opinions, typically ranging from negative to positive, with a neutral score in the middle (Babbie, 1983). The individual must select the opinion which most closely matches theirs.


The questionnaire (located at http://www.acm.org/~perlman/question.html) had

already been designed and coded, in the form of a Perl CGI script, by Perlman

(1998), and it was only necessary to carry out a minimal amount of customisation.

5.1.4 Interview Session

Having completed all tasks and online questionnaires, the participants were then

interviewed, face-to-face, for 30 minutes, with the aim of extracting their opinions,

preferences and suggestions.

It was decided at an early stage to interview participants, rather than use

questionnaires, as it allowed the questioning to be adapted to suit the participants’

activity whilst completing the tasks. For example, if a participant utilised a particular

system feature whilst completing a task, this was noted, and during the interview

they were then asked to state the usefulness of that feature.

Firstly, participants were asked to rate the features of each of the three systems, some

of which were common features, others of which were unique, in terms of their

usefulness. Ratings were given on a 5-point Likert scale, ranging from ‘completely

useless’ to ‘very useful’.

The participants were then asked several open-ended questions aimed at ascertaining

which of the interfaces they preferred and why, and which specific parts of each

interface they preferred, for example the way image results were presented, etc.

Following this, participants were asked to suggest ways in which each of the three

interfaces might be improved.

Although the interview was adapted to suit each participant, a semi-structured

approach was taken, whereby the interviewer followed a basic script, but could probe

the interviewee for more information if an interesting issue was raised (Preece et al.,

2002). It was important to ask neutral, open-ended questions so as to not influence

the participants’ opinions, and in order to facilitate further discussion and elicit as

much information as possible from the participant (Hackos & Redish, 1998).


The interview ended by asking the participants to rate their familiarity with each of

the three search subjects, on a scale of 1-3, where 1 was ‘Not familiar at all’, 2 was

‘Somewhat familiar’ and 3 was ‘Very familiar’.

5.2 Usability Evaluation of Low-Fidelity Prototype

Having analysed the results and produced a requirements specification, sketches

were drawn of potential user interface designs, the best of which (see Appendix F)

was selected for low-fidelity prototyping. This involved creating a hand-drawn,

card-based prototype[2] of the proposed user interface, with removable sections representing

the parts of the interface that change as the user interacts with the system (see

Appendix G).

The primary advantage of using a low-fidelity prototype at this stage in the user

interface design process was its efficiency in terms of both time and cost. It was a

cheap, simple and fast way to confirm that participants understood how to use the

system and liked the design, and obtain feedback on design alternatives. However,

low-fidelity prototypes often have the disadvantage of being impossible to

implement in the final technology (Hackos & Redish, 1998); therefore it was

important to consider what could actually be achieved in HTML, before designing

the prototype.

Four participants were selected to participate individually in a ten-minute usability

evaluation session, based on a process described by Hackos & Redish (1998). Two of

the participants had taken part in the earlier usability evaluation sessions; however,

the other two had never used an MIR system before. It was necessary to introduce

this mix, to ensure that first time users were able to get to grips with the user

interface. The initial questionnaire (see Appendix D-2) was used to select suitable

participants.

[2] Hackos & Redish (1998:377) define a prototype as an ‘easily changeable draft or simulation of at least part of an interface’.


In each session, the participant was first presented with the card representing the

initial screen of the user interface, and asked what they thought the purpose of the

site was. They were then asked to describe what they would do if they wished to find

information about Britney Spears. If the user responded with a vague answer, the

interviewer would request the exact action the participant would take. Once a clear

answer had been obtained, the interviewer would play the role of the computer and

add or remove parts of the prototype accordingly. If the participant did not interact

with an important part of the system, then the interviewer would ask them what they

expected to happen if they clicked on it, or what they thought the purpose of

something was. This process was then repeated until the task was considered

complete.

At certain points during the evaluation session, alternative designs were tested. In

this case, the interviewer would change the relevant part of the prototype, once or

several times, and ask the participant to identify which style they preferred. The

design alternatives are detailed in chapter 9.

5.3 Prototyping of Final User Interface

Once the results from the low-fidelity prototyping sessions had been analysed, and

the final design of the UI had been decided upon, a prototype of the proposed system

was then coded in HTML, using Macromedia Dreamweaver. Although the prototype

was coded in HTML and therefore implementable on the Web, it was merely a

front-end, without any supporting architecture. The prototype was not intended for

usability evaluation, but merely to represent the exact look and feel of the proposed

user interface and confirm that it could be implemented.


CHAPTER 6: RESULTS FROM USABILITY EVALUATION OF EXISTING

MIR SYSTEMS

As detailed in section 5.1, the first part of the user-based design process involved

conducting usability evaluation sessions using the three existing web-based MIR

systems. The results of these sessions and the user comments arising from them are

discussed below:

6.1 User Background, Knowledge and Experience

As mentioned in section 5.1, the participants were selected from students enrolled on

the MSc Information Management course at Sheffield University, as they were

known to be familiar with using web-based search engines. It was anticipated,

therefore, that by applying their existing knowledge they would not have difficulty

getting to grips with the retrieval tasks, and would subsequently have a great deal of

feedback to offer.

The results from the initial questionnaire (see Appendix E), aimed at eliciting

background information about the participants, show that the participants were

familiar with using both web-based and commercial search engines. All but one of the

participants used the Internet (Q.4) and web-based search engines (Q.5) daily, and in

both cases the other participant stated a usage frequency of ‘more than once a week’.

Furthermore, each of the participants described their level of skill in doing so as

‘intermediate’ (Q.7). The most popular search engine (Q.6) was Google.

Three of the four participants also claimed to have had experience in using

commercial search engines, such as Dialog (Q.8). However, the same number

described their ability as ‘novice’ (Q.10). It was also apparent from the results that

the frequency of use was considerably less than for web-based search engines, with 3

of the participants stating their usage frequency as ‘less than monthly’, and the

remaining participant stating ‘monthly’ (Q.9).

Although three of the four participants claimed to have used search engines in the

past to retrieve media objects other than web pages (Q.14), all of the participants

stated that this was exclusively using Google (Q.15). The usage frequencies varied

from ‘several times a month’ to ‘less than monthly’ (Q.16). None of the participants

had previously used any of the MIR systems tested (Q.11-13).

6.2 Usability Questionnaires

A table showing the mean averages of the data collected from the usability

questionnaires completed by the participants can be found in Appendix E. The

participants were asked to rate the statements on a Likert scale of 1 to 3, where 1 was

disagree, 2 was neither agree nor disagree and 3 was agree.

Generally, the data collected is fairly neutral. It can be observed from the mean totals

for each question that participants tended to disagree more often than agree with

statements; however, this was not to a great extent, as the totals all lie within the

range 1.5 to 2.08.

Regardless of the neutral nature of the results, they are still of interest when

comparing the systems to one another, particularly when the questions are grouped

based on the issues they deal with, as below.

6.2.1 Ease of Learning

Questions 7 and 8 deal with the issue of how easy it is to get to grips with the system

and quickly become capable of searching effectively. We can see from the results

shown in table 1 that the participants found Lycos Multimedia Search (LMS) to be

the easiest of the 3 systems to learn to use, with a combined mean score of 4 for

questions 7 and 8, whilst participants found it most difficult to get to grips with

Singingfish, which was given a combined score of 3.25.


Table 1: Summary of average results from questions relating to ease of learning.

Question                                              Singingfish   Lycos   AllTheWeb   Mean Score
7. It was easy to learn to use this search engine     1.75          2       2           1.92
8. I believe I became productive quickly using
   this search engine                                 1.5           2       1.5         1.67
Total                                                 3.25          4       3.5         3.59
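
The combined scores quoted above are simply the sums of the per-question means for each system. As a worked check, using only the figures from table 1 (the variable names are illustrative):

```python
# Per-question mean scores from table 1 (1 = disagree, 2 = neutral, 3 = agree).
table_1 = {
    "Singingfish": {"Q7": 1.75, "Q8": 1.5},
    "LMS": {"Q7": 2.0, "Q8": 2.0},
    "AllTheWeb": {"Q7": 2.0, "Q8": 1.5},
}

# Combined score per system: the sum of its means for questions 7 and 8.
combined = {system: sum(scores.values()) for system, scores in table_1.items()}
easiest = max(combined, key=combined.get)
hardest = min(combined, key=combined.get)

print(combined)          # {'Singingfish': 3.25, 'LMS': 4.0, 'AllTheWeb': 3.5}
print(easiest, hardest)  # LMS Singingfish
```
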

6.2.2 Ease of Use

Distinguishable from the questions that deal with ease of learning, are those that

question the participants’ general satisfaction with how easy it is to use the system.

Questions 1, 2 and 6 deal with this issue, as shown in table 2. Again, participants

appear to be most satisfied with LMS, which achieved a total mean score of 6.5 for

these questions, whilst participants were least satisfied with AllTheWeb and

Singingfish in equal measure, with both achieving totals of 5.5.

Table 2: Summary of average results from questions relating to ease of use.


Question                                              Singingfish   Lycos   AllTheWeb   Mean Score
1. Overall, I am satisfied with how easy it is to
   use this search engine                             2             2       2.25        2.08
2. It was simple to use this search engine            2             2.25    1.75        2
6. I feel comfortable using this search engine        1.5           2.25    1.5         1.75
Total                                                 5.5           6.5     5.5         5.83

6.2.3 Ability to Complete Task

Another important issue that the questionnaire asks participants about is their ability

to complete the task, using the given system, in 3 ways: effectively, quickly and

efficiently (questions 3-5). When the results for each of the three criteria are

combined (as shown in table 3), LMS appears to be the system that best facilitates

task completion, with a total mean score of 5.75, compared to the worst rated

system, AllTheWeb, which achieved a score of only 4.75.

Table 3: Summary of average results from questions relating to ability to complete task.


Question                                              Singingfish   Lycos   AllTheWeb   Mean Score
3. I can effectively complete my work using this
   search engine                                      1.75          1.75    1.75        1.75
4. I am able to complete my work quickly using
   this search engine                                 1.75          2       1.5         1.75
5. I am able to efficiently complete my work using
   this search engine                                 1.75          2       1.5         1.75
Total                                                 5.25          5.75    4.75        5.25

6.2.4 Information Quality

In questions 11, 12, 13 and 14, the questionnaire also asks the participants for their

opinions on the availability, clarity, quality and usefulness of the information that the

system provides them with, e.g. online help, on-screen messages, etc. Each of the 3

systems scored fairly poorly, as shown in table 4. However, Singingfish proved to be

the most popular in this case, receiving a total mean score of 7, whilst LMS and

AllTheWeb both scored 6.75.

Table 4: Summary of average results from questions relating to information provided by the system.


Question                                              Singingfish   Lycos   AllTheWeb   Mean Score
11. The information (such as online help,
    on-screen messages, and other documentation)
    provided with this search engine is clear         1.75          2       1.75        1.83
12. It is easy to find the system information I
    needed                                            1.75          1.75    1.5         1.67
13. The information provided for the search engine
    is easy to understand                             2             1.75    1.5         1.75
14. The information is effective in helping me
    complete the tasks and scenarios                  1.5           1.25    2           1.58
Total                                                 7             6.75    6.75        6.83


6.2.5 Interface Issues

Perhaps of most direct relevance to this project are questions 15, 16 and 17, which

ask participants to comment on interface issues. In the first case, participants were

asked to what extent they agreed with the statement: ‘The organization of

information on the search engine screens is clear’. The system that received the

poorest mean response to this statement was AllTheWeb, as shown in table 5.

However, both Singingfish and LMS achieved neutral mean averages of 2. Questions

16 and 17 attempt to gauge the participant’s general attitudes towards the interface,

namely how ‘pleasant’ they found it and how much they liked it. In this case, the

results show that participants preferred the LMS interface, as it achieved a combined

score of 4.75 for these two questions, compared to the 3.5 scored by Singingfish, which

was judged to have the worst interface.

Table 5: Summary of average results from questions relating to the user interface.


Question                                              Singingfish   Lycos   AllTheWeb   Mean Score
15. The organization of information on the search
    engine screens is clear                           2             2       1.5         1.83
16. The interface of this search engine is pleasant   1.75          2.25    1.75        1.92
17. I like using the interface of this search engine  1.75          2.5     2           2.08
Total                                                 5.5           6.75    5.25        5.83

6.2.6 Overall Satisfaction

The results discussed so far appear to suggest that overall, participants were most

satisfied with LMS. Furthermore, the mean score totals for each system show that

LMS achieved higher scores than the other two search engines overall. Therefore, it

would be expected that in the final question (Q.20), that aims to elicit overall

satisfaction with the system, LMS would score most highly. However, the results

actually show that participants considered themselves to be as satisfied with LMS as

Singingfish, with both achieving mean scores of 0.


6.2.7 Comparison of Groups

By comparing the totals of the average scores for each of the question groups (as

shown in table 6), it is possible to determine that participants were most dissatisfied

with the quality of the information provided by the systems, e.g. on-screen messages,

help, etc. and the ease of use of the systems. These were areas that were therefore

given particular consideration when designing the user interface. For example, it was

noted that in the initial design of the UI, there should be a link to a help page.

However, it was decided that the design of the help page was beyond the scope of the

project.

Table 6: Summary of total average scores for specific areas of questioning.

Area                        Mean Total
Ease of learning            3.59
Ease of use                 5.83
Ability to complete task    5.25
Information quality         6.83
Interface                   5.83
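
Because the question groups contain different numbers of questions (between two and four), the totals in table 6 are easier to compare once each is divided by the number of questions in its group. The sketch below does this using the totals from table 6 and the question counts given in sections 6.2.1 to 6.2.5; the variable names are illustrative. On this per-question basis, information quality has the lowest mean.

```python
# (total from table 6, number of questions in the group)
groups = {
    "Ease of learning": (3.59, 2),          # questions 7 and 8
    "Ease of use": (5.83, 3),               # questions 1, 2 and 6
    "Ability to complete task": (5.25, 3),  # questions 3 to 5
    "Information quality": (6.83, 4),       # questions 11 to 14
    "Interface": (5.83, 3),                 # questions 15 to 17
}

# Mean score per question for each group, on the 1 to 3 scale (2 = neutral).
per_question = {name: round(total / n, 2) for name, (total, n) in groups.items()}

# List the groups from lowest to highest per-question mean.
for name, mean in sorted(per_question.items(), key=lambda item: item[1]):
    print(f"{name}: {mean}")
```
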

6.3 Interview Sessions

Although the statistics derived from the usability questionnaires were fairly neutral

and occasionally contradictory, the comments arising from the subsequent interview

sessions helped to clarify some of the discrepancies.

The aim of the interview sessions was to elicit information about how useful

participants found particular features of each of the 3 systems, which results

presentation styles they favoured, their overall interface preferences and any other

feedback. The results are described below:


6.3.1 Usefulness of System Features

As described in section 5.1.4, during the first part of the interview sessions,

participants were asked to express how useful they found individual aspects of each

of the systems, using a 5-point Likert scale, ranging from 1 (completely useless) to 5

(very useful), with 3 being ‘neutral’.

Table 7 contains a summary of the mean average results for each feature of

Singingfish. Tables 8 and 9 contain the same for LMS and AllTheWeb, respectively.

6.3.1.1 Singingfish

Table 7: Summary of mean average results for Singingfish, from first part of interview session.

Feature                                                                 Mean Score
1. Results categorised into music, movies, news, radio, sports, etc.    3.25
2. Ability to return results from specific categories only              3.25
3. Icons denoting file type of results, e.g. MP3, QuickTime, Real,
   Windows, etc.                                                        1.5
4. Icons denoting media type of results, i.e. audio or video files      4.25
5. Automatic popup of originating web page in new window when
   audio/video file is loaded                                           3.75
6. ‘Simple search’ option                                               3.5

Although the majority of the results shown in table 7 are fairly neutral, the

participants’ comments that were recorded during the interview sessions appear to

suggest otherwise. For example, although features 1 and 2 only scored a neutral 3.25,

participants’ subsequent comments ranged from describing them as ‘quite useful’ to

‘very useful’. However, participants did remark that feature 1, the categorisation of

results, was somewhat confusing, due in part to an inaccurate and confusing

categorisation system. One participant remarked that ‘when you’re doing a football

search, surely all the results should be listed under sports?’ and ‘there were some

radio clips that didn’t appear to be in the radio section’.


Based on these comments, it was therefore decided that in the initial design of the UI,

the results should be categorised in some way. However, the categories should be

distinct and meaningful. Furthermore, users should be given the option of retrieving

results from a specific category only.

Despite the fact that participants described feature 3, the icons denoting the file type

of the result, as ‘fairly useless’ using the Likert scale, participants appeared to like

the use of similar icons denoting media type, with participants agreeing that they

were ‘really useful’ and the best way to denote media type. It was therefore noted

that such icons should be included in the initial design of the UI.

In contrast, the average score of 3.75 given to feature 5, the automatic pop-up

window, did appear to be appropriate given user comments such as ‘it was an added

bonus’ and ‘I didn’t find it a nuisance’. Most of the participants found the feature

useful for judging how authoritative a site was; however, one participant found it

‘really annoying’ that every time they followed a link an extra window was opened,

which required subsequent effort to close.

It was decided that the pop-up window feature would not be included in the initial

design of the UI, as one of the four participants was so strongly opposed to it.

However, it could still be incorporated into an advanced search page in future, as an

optional feature. It was noted from the comments though that participants would still

like to know where an object originated from, in order to judge authoritativeness,

therefore it was decided that the results summaries should list the source of the

object.

As it was necessary to collect participants’ feedback on features 1 and 2, the

participants used the ‘advanced search’ menu to complete their tasks. However, if the

user followed the ‘Simple Search’ link (feature 6), an uncategorised search was

performed. Two of the four participants did so, one of whom responded that: ‘it

probably helps for initial searches, but I would need something advanced if I was

looking for specific things’. This further emphasised the need for result

categorisation in the initial design.

6.3.1.2 Lycos Multimedia Search

Table 8: Summary of mean average results for Lycos Multimedia Search, from first part of interview session

Feature | Mean Score
1. Ability to exclusively retrieve pictures, audio files, MP3 files, video files, or retrieve all file types. | 4
2. If all file types selected, top results from each media type presented on same results page. | 4

The results in table 8 are slightly less neutral than those in table 7. On average,

participants found it ‘somewhat useful’ that it was possible to exclusively retrieve

files of a specific media type, with one participant remarking that ‘it was useful.... if

you were specifically looking for pictures, it would be good’, and another suggesting

that ‘it would have been useful if things had been retrieved of any use’. It was

therefore noted that in the initial design of the UI, users should be able to search for

objects of a specific media type, as well as objects of all media types.

Participants also stated that they found feature 2, the results presentation method

whereby the top results of each media type were presented together on the same

page, ‘somewhat useful’. The high average score for feature 2 is supported by

comments such as ‘it was useful, especially if you’re looking for different media’

and ‘it was a better way of presenting the results than mixing them up into stupid

categories’. However, one participant suggested that it ‘could possibly be improved

if you had more results for each media type’.

It was also observed that once participants had performed searches over all media

types, some of them took a noticeable amount of time to find the video results, as

they appeared below the fold-line³ of the page, and were therefore not immediately

visible.

It was concluded from the participant comments, and the general preference

exhibited towards the LMS interface in the data from the usability questionnaires,

that the UI should be designed such that if a user chooses to retrieve objects of all

media types, the top results from each media type should be presented on the same

page. However, more than just 3 results should be displayed for each media type, and

all 3 media types should be displayed, at least in part, above the fold-line of the page.

³ A term derived from newspaper publishing, referring to the part of the page that is visible before scrolling (Krug (2000))

6.3.1.3 AllTheWeb

Table 9: Summary of mean average results for AllTheWeb, from first part of interview session.

Feature | Mean Score
1. Detailed image information displayed when image link is followed | 2.5
2. Results categorised by media type, i.e. ‘web’, ‘news’, ‘pictures’, ‘video’, ‘MP3 files’ | 3.75

Surprisingly, the detailed information that is presented when a user follows an image

link on AllTheWeb was deemed to be little more than mostly useless, with

participants remarking that the picture itself was the only important thing and that the

supplemental information was ‘not really any more useful than when the image was a

thumbnail’ and they ‘didn’t really take it in’. It was therefore decided that, in the

initial design of the UI, thumbnails should form part of the results summaries for

images. Furthermore, if a user clicks on a thumbnail, simply an enlarged version of

the image should be displayed.

AllTheWeb’s tab-based style of results categorisation, as described in section 4.3.1,

was judged to be slightly less useful than the all-in-one method of results

presentation used by LMS. It should be noted that one of the participants had

problems finding files of different media types at first, and commented that the

system was ‘more difficult to get to grips with than the others’. However, the 0.25

difference in average scores is marginal and further user comments are necessary to

establish whether the participants truly favoured one system over the other - these

will be discussed in the next section.
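
One way to see why a 0.25 difference is marginal with only four participants is that a single participant moving one point on the five-point scale shifts the mean by exactly that amount. The sketch below illustrates this. The individual scores are hypothetical (only the mean scores of 4 and 3.75 are reported above) and were chosen simply so that they reproduce those means.

```python
from statistics import mean

# Hypothetical individual Likert scores (1-5) for the four participants.
# Only the means (4 and 3.75) come from the report; these values are illustrative.
lms_feature_2 = [4, 4, 4, 4]
alltheweb_feature_2 = [4, 4, 4, 3]

difference = mean(lms_feature_2) - mean(alltheweb_feature_2)
print(difference)  # 0.25

# With n participants, one participant changing their score by one point
# moves the mean by 1/n, which here equals the whole observed difference.
smallest_step = 1 / len(lms_feature_2)
print(smallest_step)  # 0.25
```

Under these assumptions, the gap between the two systems could be explained by one participant choosing an adjacent point on the scale, which is why the qualitative comments carry more weight than the scores.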

6.3.2 System Preferences

In the second part of the interview sessions, participants were asked several open-

ended questions, aimed at determining their overall interface and system preferences.

6.3.2.1 Result Summary Preferences

In collecting user feedback, it was important to determine exactly what information

participants preferred to accompany each search result and how they preferred it to

be presented. By asking them to state which of the three result summary styles they

preferred in the case of each media type and why, it was possible to elicit

information that could influence the final design of the interface. This data is

represented in table 10, which shows which results style each of the four participants

(A, B, C & D) preferred in the case of each media type.

Table 10: Result summary style preferences

Singingfish Lycos AllTheWeb

Audio Files C A, B, D

Video Files A C B, D

Images n/a C A, B, D

The results show that AllTheWeb’s style of result summary was most popular in the

case of every media type and user comments appear to support this. However, these

results do differ slightly from those from the first part of the interview when, for

example, participants said that they found the video results summaries from

Singingfish marginally more useful than those from AllTheWeb.

As described in section 4.3.1, AllTheWeb’s audio results are presented in a tabular

format, with columns containing the ‘reliability’ (which is denoted by a number of

stars), ‘type’ (denoted by an icon depicting whether the object is an audio file or

folder containing audio files), file name, file size and date the file was created.

Participants commented that they liked the general style of presentation, for example,

participant A said ‘it’s fairly clear cut, you know what you’re getting’, and

participant D commented that ‘you’ve got all the information there that you need’.

However, participants also remarked that they did not understand the ‘reliability’ or

‘type’ information: ‘I didn’t really know what any of those things meant’.

It was decided, based on these comments and the results in table 10, that the initial

design of the UI should present audio results in a similar manner to AllTheWeb, in

terms of the tabular layout and the information provided. However, ‘reliability’ and

‘type’ information should not be included.

AllTheWeb’s video files are presented in a similar layout to audio files, with

columns listing the ‘Title and nearby text’, ‘Duration’, ‘Size’ and ‘Format’ for each

result. When asked to comment on why they favoured AllTheWeb’s video result

summaries, participants stated that: ‘you know what to expect when you click on

them’ and knowing the duration, size and format of results was ‘useful’. They also

commented that the tabular style of presentation makes the results ‘quite easy to

read’. However, one participant criticised the accuracy of the title and nearby text

accompanying the results, saying that: ‘a lot of the time it isn’t particularly clear

exactly what you’re going to be looking at when it comes up’.

Based on these comments, it was decided that the initial design of the UI should

display video results in a similar manner to AllTheWeb, but an accurate description

of the contents of each object should be given, rather than just a summary of the

nearby text.

The image results returned by AllTheWeb are presented in the form of nine

thumbnails. Each thumbnail is accompanied by the file name, below which the file

format, dimensions and size are listed. Participants liked the layout of the results,

stating that the larger size of the thumbnails, compared to LMS, was preferable: ‘the

pictures are bigger on AllTheWeb... that’s more useful, because you’ve got a better

idea of what you’re getting, so you haven’t got to click on them’. Another participant

stressed the importance of thumbnails, stating: ‘I just preferred AllTheWeb because

the pictures are slightly bigger. It just makes it much easier to see’. Participants also

commented that the option to enlarge the images was ‘useful’. However, participants

did criticise the file name as not being particularly useful, with one participant

stating: ‘the kind of layout was good, but some of the titles were a bit useless - they

weren’t very descriptive’.

These comments confirmed what was proposed earlier and it was decided that, in the

initial design of the UI, the image results should be presented in a similar manner to

AllTheWeb, using enlargeable thumbnails. However, it was noted that the file name

should be replaced with a brief title, which would better convey information about

the image.

6.3.2.2 Overall Interface Preferences

Participants were also asked which of the three interfaces they preferred overall;

however, their opinions were somewhat varied.

Participant A stated that they preferred Lycos. Participant B on the other hand cited

the Singingfish interface as their favourite, stating that they particularly liked the way

information was laid out simply and clearly on screen, with ample whitespace: ‘I

liked the way everything was centred, right in the middle of the screen, with clear

things either side... You’ve still got your search box always at the top to check what

you put in and if you want to change your query’.

Participant C cited LMS as their favourite interface, due to its unique style of results

presentation: ‘You actually saw what all the results were, rather than having to click

through different screens to get to them... I quite liked the fact that it gave you a few

results just to show you what they thought the top results were, but you have the

option to go further in to a more specific screen’.

Participant D declared that they preferred the Singingfish interface, as compared to

the other interfaces, it was ‘clear and less cluttered’. They also stated that they

particularly favoured the results presentation style.

It was noted from these comments that the initial design of the UI should be simple,

centralised and utilise plenty of white space. Furthermore, the query box should

remain in clear view near the top of the page, at all times.

6.3.2.3 Overall System Preferences

When asked which search engine they preferred overall, the responses were

practically identical to those for the previous question, affirming that the UI plays a

major role in improving overall user satisfaction.

Participant A stated that they preferred Lycos Multimedia Search, as ‘some of the

results were fairly good and it was quite straightforward to use’.

Participant B responded that they favoured Singingfish, in particular the screen

layout which they described as ‘quite clear’ with ‘a nice search box that was always

present in the middle of the screen’, yet AllTheWeb was described as ‘a close

second’.

Participant C cited LMS as their preferred system, the reasons being the results - ‘it

just came up with better results’ - and the manner in which they were presented - ‘it

had all the different media types laid out in different sections which were quite easy

to work through’. In contrast, participant D declared that they preferred AllTheWeb

as ‘it was much easier to see on screen what you were getting’.

These comments served to affirm the validity of prior observations about what

should be included in the initial design of the UI.

6.3.3 Suggested Improvements

For the final part of the interview, participants were asked to suggest ways in which

each of the three systems could be improved. Several of the comments re-iterated

what was stated earlier in the interview, whilst others provided new insight into what

was desired and what should be avoided in the design of the interface. The comments

for each of the systems are summarised below.

6.3.3.1 Singingfish

When asked how Singingfish could be improved, participant A replied that they

would like to see a help button giving information on how to search. It was therefore

noted that a link to a help page should be incorporated into the interface in the initial

design.

Participant B stated that the distinction between audio and video should be made

clearer and proposed using a similar style to AllTheWeb, with a ‘table format, giving

file format, title, then size’. This served to affirm the previous observation about how

audio and video results should be presented.

Participant C stated that Singingfish could be improved by removing some of the

search options and ‘grouping the audio and video together, a bit like Lycos’. This

served to affirm the belief that the method of merging audio and video results

together, as used by Singingfish, was not favoured by users.

6.3.3.2 Lycos Multimedia Search

When asked how LMS could be improved, participant A commented that there were

too many adverts and the results did not occupy a large or central enough part of the

screen: ‘There needs to be more attention paid to the results and not so much

attention paid to everything else’. Participant B agreed, remarking ‘I’d move the

results into the centre of the screen... If they do have to have adverts, they should be

moved to the sides’. From this it was concluded that the interface should occupy a

large, central part of the screen, yet still be viewable using a number of different

resolutions.

Participant C commented that they would change the way that audio files were

presented, suggesting that there should be: ‘more of a description about what each

audio bit is, because it was a bit misleading’, whilst Participant D remarked that

larger image thumbnails should be used in the image result summaries. Both these

comments served to affirm that the previously proposed method of presenting image

and audio results was desirable.

6.3.3.3 AllTheWeb

When asked how AllTheWeb could be improved, participant A was unsure, but they

did remark that ‘the font size is quite small’. It was therefore noted that, once

implemented in HTML, the UI should be viewed at a number of different resolutions,

using different web browsers, to ensure that the information could be viewed

properly.

Participant B commented that the quality of the results should be improved.

However, this was not considered to be a UI design issue.

Participant C remarked that AllTheWeb ‘seemed to give a lot of good pictures, but

not where they came from’. This stressed the previously observed importance of

incorporating information about the source of objects into the results summaries, in

the initial design of the UI.

Participant D commented that it would be useful if the system was capable of

remembering a user’s search queries. However, this was considered to be more of a

technical issue than an interface issue. Furthermore, the latest version of Microsoft

Internet Explorer uses an auto-complete feature by default, that is capable of

remembering a user’s search queries.

CHAPTER 7: REQUIREMENTS SPECIFICATION

Based on the statistical results and user comments discussed in chapter 6, the

following list of user interface requirements was created. The list is divided into

sections for requirements relating to the general layout, querying options, and the

display of results.

7.1 Layout

1. A simple, uncluttered, centralised interface that occupies a larger part of the

screen than LMS, but still utilises plenty of white space.

2. UI should be viewable at a number of resolutions, on several different web

browsers.

3. The search box should be clear and remain prominent at the top of the screen, at

all times.

4. Typefaces slightly larger than those used in AllTheWeb should be utilised.

7.2 Querying Options

5. An option to search for specific media types only, i.e. just images, just audio files,

etc.

6. An option to retrieve results in a specific category only.

7. A link to a separate advanced search page.

8. A link to a help page that gives detailed yet concise information on how to

construct search queries, etc.

7.3 Results

9. Results should be categorised, using distinct and meaningful categories.

10. If a user searches over all media types, the top results in the case of each media

type should be displayed, separate from one another, on a single page. More than

3 results should be displayed for each media type.

11. All media types should appear, at least in part, above the fold-line of the page, to

ensure that it is clear to the user that all results are being displayed on one page.

12. Once a user has issued a search query, it should then be possible to perform the

same search but specifying a different media type, by following highly visible

links.

13. The web page that an object originates from must be made clear, by specifying

the URL or a description of the source, in order that the authoritativeness of each

result can be judged effectively.

14. Image results presented in a similar style to AllTheWeb, with 9 fairly large

thumbnails per page, but more detailed information. The summary should

include an option to enlarge the image, source information, image dimensions

and file size & type.

15. Enlarged images should be presented on their own, without additional

information.

16. Audio results presented in a similar, tabular style to AllTheWeb, but without the

confusing ‘Reliability’ and ‘Type’ information. The result summaries should

include details such as the originating web page, file name, size and date.

17. Video results presented in a similar, tabular style to AllTheWeb, with summaries

that include the originating URL, duration, size and format of the file. However,

summaries of the contents of the files should be given, as opposed to nearby

text.

18. Content summaries should be succinct and accurately convey exactly what each

search result contains.

19. A small number of links to the top results for different media types should be

listed, separate from the current search results, as in AllTheWeb.

20. Icons should be displayed next to result summaries, denoting the media type of

the object.
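
Requirements 10 and 11 imply that, for a search over all media types, the results must be grouped by media type and truncated to a small number per type before being laid out on one page. A minimal sketch of that grouping step is given below. The `Result` fields, the relevance `score` and the default of five results per type are assumptions made for illustration; the requirements state only that more than 3 results should be shown for each media type.

```python
from collections import defaultdict
from dataclasses import dataclass

MEDIA_TYPES = ("image", "audio", "video")  # the three media types in the proposed UI


@dataclass
class Result:
    title: str
    media_type: str
    source_url: str  # requirement 13: the originating page should be shown
    score: float     # hypothetical relevance score; higher is better


def top_results_by_media_type(results, per_type=5):
    """Group results by media type, keeping the best `per_type` of each.

    Results with a media type outside MEDIA_TYPES are ignored.
    """
    grouped = defaultdict(list)
    for result in results:
        grouped[result.media_type].append(result)
    return {
        media_type: sorted(grouped[media_type], key=lambda r: r.score, reverse=True)[:per_type]
        for media_type in MEDIA_TYPES
    }


# Illustrative data only.
sample = [
    Result("Sunset", "image", "http://example.org/sunset", 0.9),
    Result("Beach", "image", "http://example.org/beach", 0.4),
    Result("Interview", "audio", "http://example.org/interview", 0.7),
]
page = top_results_by_media_type(sample, per_type=1)
print([r.title for r in page["image"]])  # ['Sunset']
print(page["video"])  # []
```

Returning an entry for every media type, even an empty one, keeps all three sections present on the page, which is in keeping with requirement 11.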

CHAPTER 8: INITIAL DESIGN

Based on the requirements specified in the previous chapter, initial user interface

design ideas were sketched. Sketches of the proposed final design, selected for low-

fidelity prototyping, can be found in Appendix F.

The design of the proposed UI was heavily influenced by the three existing

commercial, web-based MIR systems. This was to be expected given that the

statistical results and user comments that formed the basis of the requirements were

derived from usability evaluation of these systems. However, as well as combining

the elements of each of these systems that participants favoured most, the designed

UI also featured several original interface concepts. The main features of the

proposed UI are described below.

8.1 Site Identification

It was necessary to include some form of logo in the UI, in order that participants

could identify the name of the site and more importantly, its purpose. Based on

Krug’s (2000) recommendations, the logo was placed at the top of every page.

The logo used was that of the MIND research group. At the time of completing this

project, the group were conducting a project concerned with multimedia digital

libraries. More specifically, the objective of the project was: ‘to design models and to

build sets of tools and associated test-beds to improve the effectiveness of resource

selection, multimedia information access, retrieval and fusion of the retrieved data’

from such libraries (MIND (2001)).

It was believed that the research in this project may potentially be of use to the

MIND project; the site was therefore branded with their logo, which satisfied Krug’s

recommendations of being easily recognisable and also gave some indication of the

purpose of the site.

8.2 Query Interface

The start screen of the proposed UI allows the user to specify whether the system

should retrieve results of all media types or of a specific media type only, using a

series of radio buttons. It was decided that images, audio files and video files would

be the media types that users could choose from, as they were the most common

types used by the systems tested. Although AllTheWeb was capable of retrieving

web pages and news articles as well, these were considered to be primarily textual

objects and therefore of less relevance to the project.

A pull-down menu also gives the user the option of retrieving results from all

categories or from a specific category. The categories selected were based on results

of a survey of six category-based web search engines: Singingfish

(http://www.singingfish.com), The Open Directory Project (http://dmoz.org), Lycos

(http://www.lycos.com), InfoSpace (http://www.infospace.com), LookSmart

(http://www.looksmart.com) and Yahoo! (http://www.yahoo.com). Every top-level

category featured in each of the search engines was recorded and those that occurred

most frequently were selected for inclusion in the UI. Although the category names

were different, the concepts were often the same and would be counted as the same

category, for example, the categories: ‘motoring’ and ‘automotive’ were treated as

identical.
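
The category selection procedure described above can be sketched as a simple frequency count. In the sketch below, only the treatment of ‘motoring’ and ‘automotive’ as identical is taken from this report; the engine names, the category lists and the function names are hypothetical and do not reproduce the data actually recorded in the survey.

```python
from collections import Counter

# Synonym map: different names for the same concept share one label.
# Only the 'motoring'/'automotive' pairing comes from the report.
SYNONYMS = {"motoring": "automotive"}


def normalise(name):
    """Lower-case a category name and map known synonyms to one label."""
    key = name.strip().lower()
    return SYNONYMS.get(key, key)


def most_frequent_categories(engines, how_many):
    """Return the categories that occur in the largest number of engines."""
    counts = Counter()
    for categories in engines.values():
        # A set ensures each concept is counted once per engine.
        counts.update({normalise(c) for c in categories})
    return [name for name, _ in counts.most_common(how_many)]


# Illustrative input only; not the categories recorded in the survey.
engines = {
    "Engine 1": ["Motoring", "News", "Sport"],
    "Engine 2": ["Automotive", "News"],
    "Engine 3": ["News", "Travel"],
}
print(most_frequent_categories(engines, 2))  # ['news', 'automotive']
```

In practice the synonym map would have to be compiled by hand, since deciding that two category names express the same concept is a judgement rather than a mechanical step.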

The UI presents the user with a standard query box and search button with which to

issue their query. These were designed with Krug’s recommendations in mind: ‘it’s a

simple formula: a box, a button, and the word “Search”. Don’t make it hard for them

(the users) – stick to the formula’ (Krug (2000:67)).

Links to an ‘advanced’ page and ‘help’ page were also included in the UI. However,

the pages themselves were not designed, as this was considered to be beyond the

scope of the project.
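
As an illustration of how the choices made on the start screen might be carried into a search request, the sketch below builds a search URL from the query, media type and category. It also generates the links that requirement 12 calls for, which repeat the same query for a different media type. The URL path, the parameter names (`q`, `media`, `category`) and the function names are hypothetical; the prototype described in this report was not implemented in this way.

```python
from urllib.parse import urlencode

MEDIA_TYPES = ("all", "images", "audio", "video")  # the radio-button options


def search_url(query, media_type="all", category="all", base="/search"):
    """Build a search URL; the path and parameter names are hypothetical."""
    if media_type not in MEDIA_TYPES:
        raise ValueError(f"unknown media type: {media_type}")
    params = {"q": query, "media": media_type, "category": category}
    return f"{base}?{urlencode(params)}"


def alternative_media_links(query, current, category="all"):
    """Links that repeat the same query for every other media type."""
    return {m: search_url(query, m, category) for m in MEDIA_TYPES if m != current}


print(search_url("jazz piano", "audio"))
# /search?q=jazz+piano&media=audio&category=all
print(sorted(alternative_media_links("jazz piano", "audio")))
# ['all', 'images', 'video']
```

Keeping the query and category unchanged in each alternative link means the user can switch media type without re-entering anything.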

8.3 Results Presentation

In a similar style to AllTheWeb, the UI uses a row of tabs, which the user can use to

switch between results for different media types. Tabs were chosen because no

superior alternative for navigation could be conceived. Krug (2000) lists four

reasons why tabs are such excellent navigational tools:

1. They are simple to understand - because they are based on a physical

metaphor, users of all skill levels can understand how they work.

2. They are highly noticeable - because they make such a visual impact, it is

hard for a user to overlook them.

3. They are efficient - they improve the ease of navigation within a site without

slowing it down.

4. They improve users’ perception of divisions within the site - by creating

the impression that the active tab is physically in front of the other tabs, the

user’s feeling that the site is split into sections, and that they are within one of

the sections, is enhanced.

However, the UI differs from all the systems tested, in that a second row of tabs is

present, which the user can use to switch between results belonging to different

categories. The additional row of tabs makes it possible for the user to easily switch

between results for different media types and results for different categories.

Based on Krug’s (2000) recommendations, contrasting shades of colours are used for

the active and inactive tabs, in order to enhance the feeling that the active tab is

physically in front of the other tabs. Furthermore, borders round the edge of the area

in which the results are listed help to give the feeling that the tabs physically connect

with the space below.
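
The two rows of tabs can be thought of as two independent selections, one for media type and one for category, each with exactly one active tab. The sketch below renders such a row as an HTML list in which the active tab carries a different class, so that contrasting colours can be applied to it. The markup, the class names and the function name are assumptions for illustration and are not taken from the prototype.

```python
from html import escape


def render_tab_row(labels, active):
    """Return one row of tabs as HTML; markup and class names are hypothetical."""
    if active not in labels:
        raise ValueError(f"active tab {active!r} is not one of the labels")
    items = []
    for label in labels:
        state = "active" if label == active else "inactive"
        items.append(f'<li class="tab {state}">{escape(label)}</li>')
    return "<ul>" + "".join(items) + "</ul>"


# First row: media types. Second row: categories (example labels only).
media_row = render_tab_row(["All", "Images", "Audio", "Video"], active="Audio")
category_row = render_tab_row(
    ["All categories", "Arts and Entertainment"], active="All categories"
)
print(media_row)
print(category_row)
```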

Two sketches were made depicting how the proposed UI would display the results of

a search over all media types. One of the sketches (see Appendix F-1) shows the

results being presented in a similar style to LMS, with results divided into sections

for each media type and these sections laid out horizontally. The other sketch (see

Appendix F-2) shows an alternative design concept, in which the results are laid out

vertically, with columns for each media type. It was decided that the two alternatives

would be used in the low-fidelity prototyping sessions, to determine which layout

users preferred.

CHAPTER 9: RESULTS FROM USABILITY EVALUATION OF LOW-FIDELITY PROTOTYPE

Having sketched the initial design of the interface, a low-fidelity prototype of the

system was then created and usability evaluation sessions were conducted, as

detailed in section 5.2. The results of these sessions, and the user comments arising

from them, are summarised below:

9.1 Recognising Purpose of System

As described in section 5.2, during the low-fidelity prototyping sessions, participants

were firstly presented with a card-based sketch, showing the proposed initial query

screen (see Appendix G-1). Participants were then asked what they believed the

purpose of the site to be. Both participants who had taken part in the usability

evaluation of existing MIR systems successfully identified the system as a

multimedia search engine. The remaining participants, who had not previously used a

MIR system, also correctly identified the purpose of the site, with one of the

participants describing the system as ‘something searchable, where you can get

images, audio and video files’ and the other describing it as ‘a search engine

specifically for images, audio and video files rather than web pages’.

9.2 Issuing a Query

Participants were also asked to describe precisely what they would do if they wished

to perform a search for a specific topic. All four participants correctly described how

they would select the media type they wished to search for, and the implications of

choosing a particular media type. All participants also understood how to select a

particular category and how this would affect their search, for example, one

participant stated that: ‘I’d click in Arts and Entertainment and I’d expect it to bring

up some popular culture stuff’, whilst another recognised that if they selected a

single category, rather than all categories, they would ‘get less results’.

As was to be expected, given that they had all used traditional web-based IR systems

in the past, the participants also correctly described how to enter a word or phrase in

the query box and use the search button to issue the query.

At this point, two alternative design styles for the query screen were tested, by

adding, removing and re-arranging detachable parts of the prototype. Participants

were asked to comment on each style and to specify which they preferred overall.

Style B (see Appendix G-2) was the same as the initial query screen (style A), except that,

instead of using radio buttons to select category type, participants used a drop-down

menu. It was necessary to test this alternative, as it was predicted that if there were a

large number of categories available in the final system, it would be difficult to fit all

the data onto the screen using style A. However, three of the four participants preferred

radio buttons, describing the drop-down menu as ‘quite good and quite

neat’, but criticising the fact that ‘you can’t automatically see what categories there

are’.

Style C (see Appendix G-3) was the same as style A, except the category options

were located beneath the query box. This style was tested as participants specified

during the usability evaluation of existing MIR systems that they did not necessarily

want to search in specific categories, but still wanted the option to be available. It

was predicted therefore that users would prefer it if the category box appeared below

the query box. However, this was not the case, and participants unanimously stated

that they preferred styles A and B, complaining that they either did not notice the

category options (‘you don’t notice it straight away’) or would not have bothered to

examine them (‘I’m not sure I would have bothered with it’).

It was therefore decided that in the final design of the system, style A should be used,

whereby radio buttons located above the query box are used to select a category.

9.3 Browsing Results

Once the participants had described how they would issue a search query, the

prototype was then modified to represent the system displaying the search results.

It was decided that participants’ feedback about the feature used in AllTheWeb,

whereby the top results for different media types are presented alongside the results

currently being browsed, should be collected. When the relevant part was added to

the low-fidelity prototype and participants were asked to give their feedback on it,

the comments were unanimously positive. It was therefore decided that the feature

should be included in the final design of the UI.

Two design alternatives were then tested at this point. Style A presented the audio,

video and image files in individual rows (see Appendix G-4) while style B presented

the different media types in separate columns (see Appendix G-5). After being

shown each results style, participants were then asked to give their thoughts and

opinions on it, and specify which style they preferred overall.

In each case, participants were first asked what they expected to happen if they

clicked on any of the category tabs, the media type tabs or the ‘next’ links. The

outcome of each action was successfully predicted by all four participants.

Three of the four participants said that they preferred style B over style A. In particular,

participants liked the way that more results could be viewed at once with style B,

compared to style A. For example, one participant stated: ‘because before (using

style A) it was horizontal, you couldn’t really see everything’, whilst another pointed

out that: ‘At first glance, you’ve got an idea of what you’re going to get… you

haven’t got to scroll down the page’.

Participants also liked the way that results from different media types could be easily

compared using style B: ‘I think that this is better because you can cross-reference.

It’s good to know what you’ve got, next to each other.’ However, one participant did

observe a flaw in the design, in that the category tabs appeared to be in line with

some of the columns, misleading the participant into believing that the category tabs

and results columns were connected. It was subsequently noted that in the final

design of the interface, the category tabs should be re-aligned, to avoid confusion.

One of the four participants did state that they preferred style A to style B. However,

this preference was based on the way that the information accompanying each

individual audio and video result was presented, rather than the general layout: ‘Style

A splits the results summaries into separate columns, in the other one it’s a bit dense,

and not as easy to find a date or something’.

It was therefore decided that the final design should use style B to present the results,

as this was generally preferred. However, the information accompanying each audio

and video result should be made clearer by spacing the information out.

9.4 Overall Comments

At the end of the session, participants were asked how they thought the interface

compared to those that they had used in the past. The two participants who had

experience using MIR systems made positive remarks about the way in which the

interface was capable of presenting all media types on one page. One of the

participants stated that: ‘The vertical format is quite rare, I haven’t come across that

before, I think that could be quite interesting’, whilst the other participant

commented: ‘You’re searching so many different media types that it’s quite nice to

have them all on screen’. The same participant also commented that: ‘There’s a bit

more information than what I’m used to, but I don’t really get fussed by that’.

The remaining participants, who had not used an MIR system in the past, also gave

positive feedback about the interface. One participant commented on the layout,

remarking: ‘I like the plainness and the white background because what you’re

looking for instantly is the query box on any search engine’, whilst the other

commented on the categorisation: ‘I wouldn’t have much difficulty in trying to use

it... I like the way you can select categories’. However, despite the presence of both

these features, one of the participants also remarked that: ‘It didn’t have an advanced

section or a help page’. It was therefore noted that in the final design, the links to the

advanced page and help page should be made clearer, by making them larger and

bolder.

9.5 Summary of Findings

The comments made by participants during the low-fidelity prototyping sessions

suggested that the ideas about user preferences collected from the initial usability

evaluation sessions were accurate. The sessions also yielded results which suggested

that both first time and experienced MIR system users could understand the purpose

of the system, could rapidly get to grips with the interface and were capable of

performing search tasks without assistance. The results also indicated strong user

preferences towards specific design options, for example the use of columns to

display results of all media types.

CHAPTER 10: FINAL DESIGN

Since participants had given generally positive feedback about the initial design

during the usability evaluation sessions, the next step was to implement a

prototype of the final design in HTML.

The design was implemented in HTML such that the UI could be viewed at

resolutions of 800 by 600 pixels upwards. It was also coded so that it would display

correctly in both Microsoft Internet Explorer and Netscape Navigator.

Screen grabs of each of the pages of the prototype can be found in Appendix H. As

can be seen from the screen grabs, the final UI was practically identical to the low-

fidelity prototype. Based on the comments made by participants during the usability

evaluation sessions, the prototype was designed such that results from queries over

all media types were presented in columns.

The only change made to the design that was not based on information derived from

the usability evaluation sessions, was the inclusion of a pull-down menu, rather than

radio buttons, for category selection.

Although three of the four participants in the usability evaluation

stated that they preferred radio buttons to a pull-down menu, at the implementation

stage it was discovered that, due to the length of the category names chosen, it was

not possible to fit all the options onto the screen using radio buttons, without creating

a cluttered interface. Using a pull-down menu was considered to be a practical and

neat alternative.
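
The trade-off described above can be illustrated with a minimal sketch that builds the markup for a pull-down menu from a list of category labels. This is not the prototype's actual code: the function name, the field name and the category labels are all hypothetical, since the real category names are not listed in this chapter. It only shows that a pull-down menu takes up the space of a single control however long the labels are.

```python
from html import escape


def render_category_select(categories, name="category", selected=None):
    """Return an HTML <select> element with one <option> per category label."""
    lines = [f'<select name="{escape(name)}">']
    for label in categories:
        # Mark the currently chosen category, if any.
        flag = " selected" if label == selected else ""
        lines.append(
            f'  <option value="{escape(label)}"{flag}>{escape(label)}</option>'
        )
    lines.append("</select>")
    return "\n".join(lines)


# Hypothetical labels, chosen only to show long category names.
CATEGORIES = [
    "All Categories",
    "Arts & Entertainment",
    "News & Current Affairs",
    "Sport & Recreation",
]

if __name__ == "__main__":
    print(render_category_select(CATEGORIES, selected="All Categories"))
```

With radio buttons, each of these labels would need its own on-screen control and caption, which is where the clutter noted above comes from.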

CHAPTER 11: CONCLUSION

This report began by suggesting that system developers have only recently begun to

take the issue of user interface design seriously and that in the past they were

generally too preoccupied with a system’s functionality to investigate whether the

interface was usable.

It was further suggested, however, that following the creation of the World Wide

Web, which created an opportunity for interface research to positively affect millions

of users, and the recent increase in the popularity of HCI research, developers’

attitudes have begun to change.

The existing web-based MIR systems surveyed in this study appear to support this

claim. At a glance, the user interfaces for Singingfish, LMS and AllTheWeb would

appear to be usable. They seem to have most of the basic features that one would

expect to find in any web-based search engine and their layouts look fairly clear and

reasonably logical. However, the findings in this report indicate that, despite

superficial appearances, these interfaces are still significantly flawed.

A series of usability evaluation sessions was carried out as part of this project,

which involved asking participants to perform retrieval tasks using the

aforementioned MIR systems, then complete online usability questionnaires and

participate in one-to-one interviews. The qualitative and quantitative data collected

from these sessions revealed that the three systems tested had a number of features

that users regarded as useless, several aspects that users found confusing and some

parts that users just ignored completely. It may be concluded from this that system

developers, or more specifically, those in the field of web-based information

retrieval, are still not attaching enough importance to user interface design.

It should be noted however that the user feedback collected during usability

evaluation also revealed that the three MIR systems tested had some highly desirable

aspects, which participants regarded as particularly useful. These elements formed

the basis for the subsequent design of the user interface.

As well as having features influenced by those in existing MIR systems, the final

user interface also has some aspects that may be considered original. For example, it

uses two rows of tabs to allow the user to navigate between media types and category

types. As well as allowing the user to retrieve results from several different media

types at once, the interface also presents the results in columns, enabling the user to

view more results at once, and compare results of different media types. It can be

concluded from this that usability testing can yield extremely useful results and, at

the same time, facilitate the creation of new ideas.

One of the main complaints levelled at proponents of usability evaluation is that it is

very expensive and time-consuming. However, this project has shown

that valuable user feedback can be collected using a minimal number of people and

at very little cost. It has also shown to a certain extent that the user-centric approach

to interface design works and that users can be involved in and play a valuable role

in the design process right from the very start.

It is expected that as technology improves, the amount of research into multimedia

information retrieval will continue to increase and so too will the rate of development

of MIR systems. It is hoped that a user-based approach to interface design will be

regarded as vital in the development of any such systems, so that research findings

are fully exploited.

BIBLIOGRAPHY

Abels, E. G., White, M. D. & Hahn, K. (1998). “A User-Based Design Process for

Web Sites”. Internet Research: Electronic Networking Applications and Policy, 8 (1), 39-48.

Amato, G., Rabitti, F. & Savino, P. (1997). Multimedia Document Search on the Web

[Online]. http://www7.scu.edu.au/programme/posters/1907/com1907.htm [Accessed

9th September 2002].

Amato, G., Mainetto, G. & Savino, P. (1998). An Approach to a Content-Based

Retrieval of Multimedia Data [Online]. http://citeseer.nj.nec.com/287555.html

[Accessed 9th September 2002].

Babbie, E. (1983). The Practice of Social Research. California: Wadsworth

Publishing.

Bertino, E., Catania, B. & Ferrari, E. (1999). Multimedia IR: Models and Languages.

In: Baeza-Yates, R. & Ribeiro-Neto, B. (eds) Modern Information Retrieval. New

York, ACM Press. p. 325-343.

Brajnik, G., Mizzaro, S. & Tasso, C. (1996). Evaluating User Interfaces to

Information Retrieval System. A Case Study on User Support. In: Frei, H-P.,

Harman, D., Schauble, P., & Wilkinson, R. (eds.) Proceedings of the 19th Annual

International ACM SIGIR Conference on Research and Development in Information

Retrieval. 18-22 April, 1996, Zurich, Switzerland. pp.128-136.

Cho, S. J. & Yoo, S. I. (1998). Image Retrieval Using Topological Structure of User

Sketch [Online]. http://citeseer.nj.nec.com/cho98image.html [Accessed 9th

September 2002].

Erikson, T. D. & Simon, H. A. (1985). Protocol Analysis: Verbal Reports as Data.

Massachusetts: The MIT Press.

Flickner, M., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., Gorkani,

M., Hafner, J., Lee, D., Petkovic, D., Steele, D. & Yanker, P. (1995). “Query by

Image and Video Content: The QBIC System”. IEEE Computer, 28 (9), 23-32.

Hackos, J. T. & Redish, J. C. (1998). User and Task Analysis For Interface Design.

New York: Wiley.

Hansen, P. (1997). An Exploratory Study of IR Interaction for User Interface Design

[Online]. http://citeseer.nj.nec.com/hansen97exploratory.html [Accessed 9th

September 2002].

Hearst, M. A. (1999). User Interfaces and Visualization. In: Baeza-Yates, R. &

Ribeiro-Neto, B. (eds) Modern Information Retrieval. New York, ACM Press. p.

257-323.

Henrich, A. & Robbert, G. (2000). Combining Multimedia Retrieval and Text

Retrieval to Search Structured Document in Digital Libraries [Online].

http://www.ercim.org/publication/ws-proceedings/DelNoe01/7_Henrich.pdf


Hix, D. & Hartson, H. R. (1993). Developing User Interfaces: Ensuring Usability

Through Product & Process. New York, Wiley.

Hutchins, E. L., Hollan, J. D., & Norman, D. A. (1986). “Direct Manipulation

Interfaces”. In: Norman, D. A. & Draper, S. W. (eds), User Centered System Design:

New Perspectives on Human-Computer Interaction, pp. 87-124. New Jersey:

Lawrence Erlbaum Associates.

IBM. QBIC – IBM’s Query By Image Content [Online].

http://wwwqbic.almaden.ibm.com/ [Accessed 9th September 2002].

Kobayashi, M. & Takeda, K. (2000). “Information Retrieval on The Web”. ACM

Computing Surveys, 32 (2), 144-173.

Koenemann, J. & Belkin, N. (1996). A Case For Interaction: A Study of Interactive

Information Retrieval Behavior and Effectiveness [Online].

http://www.acm.org/sigchi/chi96/proceedings/papers/Koenemann/jk1_txt.htm


Krug, S. (2000). Don’t Make Me Think! A Common Sense Approach To Web

Usability. Indiana: New Riders.

Lawrence, S. & Giles, C. (1998). “Searching the World Wide Web”. Science, 280,

98-100.

Lee, H., Smeaton, A. F., Murphy, N., O’Connor, N. & Marlow, S. (2001). Físchlár

on a PDA: Handheld User Interface Design to a Video Indexing, Browsing and

Playback System [Online]. http://www.cdvp.dcu.ie/Papers/UAHCI2001.pdf


Lewis (1995). “IBM Computer Usability Satisfaction Questionnaires: Psychometric

Evaluation and Instructions for Use”. International Journal of Human-Computer

Interaction, 7 (1), 57-78.

Li, C-S., Mohan, R. & Smith, J. R. (1998). Multimedia Content Description in the

Infopyramid [Online].

http://www.research.ibm.com/networked_data_systems/transcoding/Publications/mpeg7.pdf

[Accessed 9th September 2002].

MIND (2001). MIND Project Summary [Online].

http://www.mind.cs.strath.ac.uk/MINDAbstract2.html [Accessed 9th September

2002].

Mukherjea, S., Hirata, K. & Hara, Y. (1997). Towards a Multimedia World Wide

Web Information Retrieval Engine [Online].

http://www.scope.gmd.de/info/www6/technical/paper003/paper3.html [Accessed 9th

September 2002].

Mukherjea, S. & Cho, J. (1999). “Automatically Determining Semantics for World

Wide Web Multimedia Information Retrieval”. Journal of Visual Languages and

Computing, 10, 585-606.

Nielsen, J. (1993). Usability Engineering. Academic Press: New York.

Ortega-Binderberger, M. (1999). WebMARS: A Multimedia Search Engine for the

World Wide Web [Online]. http://citeseer.nj.nec.com/322978.html [Accessed 9th

September 2002].

Paek, S., Benitez, A. B. & Chang, S-F. (1999). Self-Describing Schemes for

Interoperable MPEG-7 Multimedia Content Descriptions [Online].

http://citeseer.nj.nec.com/157404.html [Accessed 9th September 2002].

Perlman, G. (1998). http://www.acm.org/~perlman/question.html [Accessed 9th

September 2002].

Preece, J., Rogers, Y. & Sharp, H. (2002). Interaction Design: Beyond Human-

Computer Interaction. New York: Wiley.

Shneiderman, B., Byrd, D., & Croft, W. B. (1997). Clarifying Search – A User-

Interface Framework for Text Searches [Online].

http://www.dlib.org/dlib/january97/retrieval/01shneiderman.html [Accessed 9th

September 2002].

Shneiderman, B. (1998). Designing The User Interface: Strategies for Effective

Human-Computer Interaction. Addison-Wesley: Massachusetts.

Smith, J. R. & Chang, S-F. (1996). VisualSEEk: a Fully Automated Content-Based

Image Query System, Proceedings [Online].

http://www.ctr.columbia.edu/~jrsmith/html/pubs/acmmm96/acm.html [Accessed 9th

September 2002].

Srihari, R. K. & Zhang, Z. (1999). “Exploiting Multimodal Context in Image

Retrieval”. Library Trends, 48 (2), 497-520.

APPENDIX A: Screenshots of AllTheWeb

Appendix A-1: Basic Query Screen

Appendix A-2: Web Results

Appendix A-3: Image Results

Appendix A-4: Video Results

Appendix A-5: Audio Results

Appendix A-6: Advanced Query Screen For Web Page Search

APPENDIX B: Screenshots of Lycos Multimedia Search

Appendix B-1: Query Screen

Appendix B-2: Results For All Media Types

Appendix B-3: Detailed Image Information

Appendix B-4: Image Results

APPENDIX C: Screenshots of SingingFish

Appendix C-1: Basic Query Screen

Appendix C-2: Advanced Query Screen

Appendix C-3: Categorised Results

Appendix C-4: Uncategorised Results

APPENDIX D: Documents Given to Participants During First Usability Evaluation

Sessions

Appendix D-1: Briefing

The aim of the experiment in which you will be participating is to collect users’ thoughts and opinions on three different multimedia search engines currently available on the Web. This feedback will be used as a basis for the design of a new user interface.

In the experiment, you will be presented with three different Web-based search engines and asked to perform a specific task on each. All of these tasks are concerned with retrieving certain information from the Web. You can perform as many different searches as you wish. You will be asked to bookmark objects of relevance to the task, be they web pages, images, audio files or video files. In each case, try to find the most relevant objects. To bookmark an object, right-click on the link to the search result you wish to bookmark and select ‘Add to Favorites’.

During the experiment, your activity will be video recorded for analysis at a later date. The recording will remain completely confidential and will only be used for this purpose. It would be extremely helpful if you would describe aloud your thoughts, feelings and opinions whilst completing the tasks, e.g. what you are trying to achieve.

In total, the experiment should last no longer than 2 hours. If you have any difficulties or questions during the test, please ask the experimenter. The schedule for this experiment will be as follows:

1. Read briefing and complete initial questionnaire – 10 minutes

2. Read task one briefing and complete task – 20 minutes

3. Complete online questionnaire – 5 minutes

4. Read task two briefing and complete task – 20 minutes

5. Complete online questionnaire – 5 minutes

6. Read task three briefing and complete task – 20 minutes

7. Complete online questionnaire – 5 minutes

8. Interview – 30 minutes

Thank you for participating.


Appendix D-2: Initial Questionnaire

1. What is your age?
   20-29 / 30-39 / 39-49 / 49-59 / 59+

2. Briefly describe your education/qualifications:

3. What is your occupation?

4. How often do you use the Internet?
   Daily / More than once a week / Weekly / Several times a month / Monthly / Less than monthly

5. How often do you use web-based search engines, e.g. Google, Yahoo! etc?
   Daily / More than once a week / Weekly / Several times a month / Monthly / Less than monthly

6. Which web-based search engines do you use?

7. How would you describe your ability to use web-based search engines?
   Beginner / Novice / Intermediate / Experienced / Expert

8. Have you ever used a commercial search engine, for example Dialog, before? (If no, go to Q.11)
   Yes / No

9. How often do you use commercial search engines, e.g. Dialog etc?
   Daily / More than once a week / Weekly / Several times a month / Monthly / Less than monthly

10. How would you describe your ability to use commercial search engines?
    Beginner / Novice / Intermediate / Experienced / Expert

11. Have you ever used the Singingfish search engine before?
    Yes / No

12. Have you ever used the Lycos Multimedia Search search engine before?
    Yes / No

13. Have you ever used the AllTheWeb (a.k.a. FAST) search engine before?
    Yes / No

14. Have you ever used search engines to retrieve objects other than web pages before, e.g. images, video files, audio files, etc.? (If yes, go to question 15)
    Yes / No

15. Which web-based search engines do you use?

16. How often do you use search engines to retrieve objects other than web pages?
    Daily / More than once a week / Weekly / Several times a month / Monthly / Less than monthly


Appendix D-3: Scenarios

Task 1 (SingingFish):

A friend of yours, who works for the Football Association, has asked you to help

them find information on the Internet about the England team’s World Cup 2002

campaign. Retrieve all the audio files and video files that you can, which relate to

this subject and bookmark them. Do not restrict yourself to any one media type.

Task 2 (Lycos Multimedia Search):

You are a journalist working for an online film magazine. You have been asked to

write an article on Jennifer Lopez’s success in Hollywood, in films such as U-Turn,

The Cell, The Wedding Planner and Angel Eyes. You are not interested in her

singing career. The article will be published on the Web and should therefore contain

most different types of media, so retrieve all the pictures, audio files and video files

that you feel are relevant and bookmark them. Please restrict yourself to using the

‘Multimedia Search’ box only.

Task 3 (AllTheWeb):

You are working as a researcher for a new online music retailer. They have asked

that you find as many web pages, images, audio files and video files as you can that

relate to Britney Spears’ second album ‘Oops!… I Did It Again’, and the singles

from the album: ‘Lucky’, ‘Stronger’ and ‘Don’t Let Me Be The Last To Know’.

Anything you find that you feel is relevant should be bookmarked.

APPENDIX E: Summary of Results From Usability Satisfaction Questionnaires

Question                                                    Singingfish  Lycos  AllTheWeb  Mean Score

1. Overall, I am satisfied with how easy it is to use this  2            2      2.25       2.08
   search engine
2. It was simple to use this search engine                  2            2.25   1.75       2
3. I can effectively complete my work using this search     1.75         1.75   1.75       1.75
   engine
4. I am able to complete my work quickly using this search  1.75         2      1.5        1.75
   engine
5. I am able to efficiently complete my work using this     1.75         2      1.5        1.75
   search engine
6. I feel comfortable using this search engine              1.5          2.25   1.5        1.75
7. It was easy to learn to use this search engine           1.75         2      2          1.92
8. I believe I became productive quickly using this search  1.5          2      1.5        1.67
   engine
9. The search engine gives error messages that clearly      2.25         1.25   1.5        1.67
   tell me how to fix problems
10. Whenever I make a mistake using the search engine,      1.75         1.25   1.5        1.5
    I recover easily and quickly
11. The information (such as online help, on-screen         1.75         2      1.75       1.83
    messages, and other documentation) provided with this
    search engine is clear
12. It is easy to find the system information I needed      1.75         1.75   1.5        1.67
13. The information provided for the search engine is easy  2            1.75   1.5        1.75
    to understand
14. The information is effective in helping me complete     1.5          1.25   2          1.58
    the tasks and scenarios
15. The organization of information on the search engine    2            2      1.5        1.83
    screens is clear
16. The interface of this search engine is pleasant         1.75         2.25   1.75       1.92
17. I like using the interface of this search engine        1.75         2.5    2          2.08
18. This search engine has all the functions and            1.75         1.75   1.5        1.67
    capabilities I expect it to have
19. Overall, I am satisfied with this search engine         2            2      1.5        1.83

Total                                                       34.25        36     31.75
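
As a check on the table above, the Mean Score column and the Total row can be recomputed from the per-system scores. The following is a minimal sketch, not part of the original analysis; the names `SCORES`, `question_means` and `system_totals` are introduced here for illustration, and the values are transcribed from the table in the column order Singingfish, Lycos, AllTheWeb.

```python
# Per-question scores transcribed from the Appendix E table.
# Tuple order: (Singingfish, Lycos, AllTheWeb).
SCORES = {
    1: (2, 2, 2.25),
    2: (2, 2.25, 1.75),
    3: (1.75, 1.75, 1.75),
    4: (1.75, 2, 1.5),
    5: (1.75, 2, 1.5),
    6: (1.5, 2.25, 1.5),
    7: (1.75, 2, 2),
    8: (1.5, 2, 1.5),
    9: (2.25, 1.25, 1.5),
    10: (1.75, 1.25, 1.5),
    11: (1.75, 2, 1.75),
    12: (1.75, 1.75, 1.5),
    13: (2, 1.75, 1.5),
    14: (1.5, 1.25, 2),
    15: (2, 2, 1.5),
    16: (1.75, 2.25, 1.75),
    17: (1.75, 2.5, 2),
    18: (1.75, 1.75, 1.5),
    19: (2, 2, 1.5),
}


def question_means(scores):
    """Mean across the three systems for each question, to two decimals."""
    return {q: round(sum(row) / len(row), 2) for q, row in scores.items()}


def system_totals(scores):
    """Column sums: one total per system, in the same order as the tuples."""
    return tuple(sum(column) for column in zip(*scores.values()))


if __name__ == "__main__":
    for question, mean in question_means(SCORES).items():
        print(question, mean)
    print("Totals:", system_totals(SCORES))
```

Run against the transcribed values, the recomputed means and totals agree with the figures printed in the table.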

APPENDIX F: Sketches of User Interface Selected For Low-Fidelity Prototyping

Appendix F-1: Query Style A

Appendix F-2: Query Style B

APPENDIX G: Low-Fidelity Prototype

Appendix G-1: Query Style A

Appendix G-2: Query Style B

Appendix G-3: Query Style C

Appendix G-4: Result Style A

Appendix G-5: Result Style B

APPENDIX H: HTML Prototype

Appendix H-1: Displaying Results of All Media Types in All Categories

Appendix H-2: Displaying Image Results in All Categories

Appendix H-3: Displaying Audio Results in All Categories

Appendix H-4: Displaying Video Results in All Categories
