
http://img.cs.man.ac.uk/stevens

Evaluation

CS2391 Lecture n+1: Robert Stevens


Introduction

• You’ve gathered requirements, designed your system, built the artefact, … But does it fulfil the user’s requirements?
• Basic usability
• Basic evaluation
• Evaluation styles
• Design evaluation
• Implementation evaluation


Usability Basics

• Allowing users to achieve a goal with efficiency, effectiveness and satisfaction
• Utility is the functionality of a system
• A system can have utility without usability, but not vice versa
• Worthy, but unhelpful
• Have paradigms of good usability, e.g. GUI
• Also need theory to know why something is usable
• Really want principles to guide developers – engineering not craft


Execution and Evaluation

[Figure: interaction framework with four components (System, User, Input, Output) linked by four translations (presentation, performance, observation, articulation)]


Execution & Evaluation (2)

• Presentation: How the system renders state and allows the user to evaluate state and alteration to the state
• Observation: What the user notices of the presentation; can he/she see what they need to?
• Articulation: Expression of a user’s execution plan
• Performance: The system’s execution of a plan, the results of which are presented to the user


Usability Principles

a. Visibility of system status: System should always keep users informed
b. Match between system and the real world: System should speak the user's language
c. System functions chosen by mistake need a clear 'emergency exit'
d. Consistency and standards: Avoid ambiguity
e. Error prevention
f. Recognition rather than recall
g. Flexibility and efficiency of use
h. Aesthetic and minimalist design
i. Recognize, diagnose and recover from errors
j. Help and documentation


What is Evaluation?

• Do the design and implementation behave as we expect and fulfil the user’s requirements?
• Not just an add-on at the end!
• Assess the design at various times during the life cycle
• Assess implementation prototypes, alpha and beta versions
• Evaluation saves time and money
• There are many types of evaluation; the trick is to choose the appropriate one
• Purpose is to uncover usability problems


Usability Thoughts

• Recall and recognition
• Making a system easier to use makes it more powerful
• Humans can switch topics fast – think of more than one thing at once
• Computer system should be able to do the same
• Complex syntax often hides the task – need directness of interaction


Styles of Evaluation

Evaluation

Design Evaluation
• Cognitive walkthrough
• Heuristic evaluation
• Review-based evaluation
• The use of models

Implementation Evaluation
• Empirical
• Observational
• Query


Evaluation Styles (2)

• Cheaper to evaluate design, before the expense of implementation
• Design evaluation tends not to involve the end-users, except as consultants
• Evaluation of an implementation does involve end-users
• Design evaluation techniques can be used to evaluate implementation
• Design evaluations are often paper based and involve experts
• Implementation evaluations are time consuming, difficult and expensive and can involve numbers of end-users


Types of User

• Not all users are Computer Scientists
• Different users have different needs
• Remember: Managers, system administrators and trainers
• Use end-users where possible and appropriate
• Important to have evaluatees that are representative of end-users
• Balance between under use and over use: Users need a reward for their time


Hawthorne Effect

• Users like to please the evaluator
• People respond well to having someone interested in them
• Simply by evaluating an artefact, experience of that artefact improves
• Investigation of light levels in factories showed the investigation itself was the most important factor
• Not much to be done about it – be aware


Goals of Evaluation

• Does the system have the correct functionality? Does it match the user’s task?
• A clerk used to searching by post-code should be able to search by post-code
• Can the functionality be used: What is the effect on the user?
• What are the problems with the system?
• The last is part of the other two, but draws out the negative aspects


Laboratory Techniques

• A usability lab: One-way mirror; Video and audio recorders
• Logging of system
• Lacks context; unnatural for end-users, and natural collaborative work is difficult
• Does allow close study, particularly of specialist task or particular UI notion
• Good for single user tasks


Field Techniques

• See the user in context
• Allows a user to interact with all people, objects and actions involved in a task
• Collaborative work can take place
• Noisy, difficult to record, etc.
• Can lack the detail possible in the laboratory


Cognitive Walkthrough

• Brings psychology theory into the informal and subjective walkthrough
1. Need a design: not necessarily complete, but location and wording helpful
2. A description of the task: Should be representative
3. A list of actions the user makes to perform the task
4. A description of the users and the experience expected of them
• These are given to experts, who step through the actions and make an assessment of usability
1. Are the users performing the task described by the action?
2. Can the users see the object of interaction (button etc.)?
3. Can the user tell that it is the right action?
4. Once performed, does the user get appropriate feedback?
• End of execution & evaluation cycle
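
The walkthrough can be recorded in a simple structure: one entry per action, with an expert's yes/no answer to each of the four questions. The following is a minimal sketch only. The class and function names, and the example task and answers, are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

# The four questions an expert asks of every action in the walkthrough.
QUESTIONS = (
    "Are the users performing the task described by the action?",
    "Can the users see the object of interaction?",
    "Can the user tell that it is the right action?",
    "Once performed, does the user get appropriate feedback?",
)


@dataclass
class ActionAssessment:
    action: str      # one step from the list of actions
    answers: tuple   # four booleans, one per question, in order
    notes: str = ""  # free-text comment from the expert

    def failed_questions(self):
        """Return the questions the expert answered 'no' to."""
        return [q for q, ok in zip(QUESTIONS, self.answers) if not ok]


def usability_problems(assessments):
    """Map each problematic action to the questions it failed."""
    return {a.action: a.failed_questions()
            for a in assessments if a.failed_questions()}


# Hypothetical walkthrough of a "search by post-code" task.
walkthrough = [
    ActionAssessment("Open the search screen", (True, True, True, True)),
    ActionAssessment("Choose the post-code field", (True, False, True, True),
                     notes="Field is hidden behind an 'Advanced' link"),
    ActionAssessment("Press Search", (True, True, True, False),
                     notes="No progress indicator"),
]

problems = usability_problems(walkthrough)
for action, failed in problems.items():
    print(action, "->", failed)
```

Any action with at least one "no" answer is a candidate usability problem. The notes field carries the expert's explanation.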


Heuristic Evaluation

• A set of heuristics (rules of thumb) developed by Jakob Nielsen and Rolf Molich
• Each heuristic used to critique an interface
• A set of independent experts use the heuristics
• Problems found follow a Poisson distribution – 5 experts find about 75% of problems
• Usability questions used to guide and stimulate
• Essentially a checklist
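
To see why a handful of experts is usually enough, here is a rough sketch of how coverage grows with the number of evaluators. It is not the Poisson model itself but a simpler independent-detection model: each expert is assumed to find any given problem with the same probability p, independently of the others. The value p = 0.25 is an assumption, chosen only because it reproduces the "about 75%" figure for five experts.

```python
def proportion_found(evaluators: int, p: float) -> float:
    """Expected share of problems found by at least one evaluator.

    Assumes every evaluator independently finds any given problem
    with probability p (an illustrative simplification).
    """
    if evaluators < 0 or not 0.0 <= p <= 1.0:
        raise ValueError("need evaluators >= 0 and 0 <= p <= 1")
    return 1.0 - (1.0 - p) ** evaluators


# Assumed per-evaluator detection probability (not taken from the lecture).
P_ASSUMED = 0.25

for n in range(1, 11):
    print(f"{n:2d} experts: {proportion_found(n, P_ASSUMED):.0%}")
```

Under this assumption five experts find roughly 76% of problems, and each additional expert adds less than the one before.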


Review Based Evaluation

• Principles from experimental psychology and HCI literature used to provide evaluation criteria

• E.g., menu design, naming items, icon design and language design and memory attributes

• Cheaper than performing the experiment, but beware of context in which a study was performed

• Like all expert based methods, it is all about stimulating basic questions to be asked

• Try and ensure independence of experts
• Performance, using scales and comment fields should be used


Empirical Evaluation

• Evaluating the implementation (can also use Design Evaluation methods here)
• Empirical studies concentrate on end-users, rather than experts
• The controlled experiment technique
• Measure some attribute, while controlling other attributes of system
• Various experimental conditions, which differ only in the value of some variable
• Independent (manipulated) and dependent (measured) variable
• Difference in behaviour attributed to different values of independent variable that provide the different conditions (interface style, pointing device, wording, etc.)
• Dependent variable must be measurable in some way – speed, mouse clicks, satisfaction etc.
• Use both subjective and objective measures


Empirical Techniques (2)

• A hypothesis is framed in terms of the variables
• A change in the independent variable causes a change in the dependent
• The experiment attempts to demonstrate this relationship
• Achieved by rejecting the null hypothesis, that is, that there is no relationship between the variables
• Use statistics to show that any differences seen are unlikely to have happened by chance
• Experimental design: Between groups and within groups
• Between Groups: Subjects assigned to experimental and control groups; the latter ensures it is the independent variable that counts
• Each subject only does one condition, so avoiding learning effects; but prone to variation between subjects
• Within Groups: Subject performs in all conditions; vary condition order to counter learning effects
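
The two experimental designs can be sketched as two assignment functions. This is an illustrative sketch, not a prescribed procedure. The function names, the participant labels and the two interface styles ("menu" and "toolbar") are assumptions made up for the example.

```python
import random
from itertools import permutations


def between_groups(subjects, conditions, seed=0):
    """Randomly assign each subject to exactly one condition.

    Group sizes differ by at most one. The seed only makes the
    example repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {c: [] for c in conditions}
    for i, subject in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(subject)
    return groups


def within_groups(subjects, conditions):
    """Give every subject all conditions, cycling through every order."""
    orders = list(permutations(conditions))
    return {s: orders[i % len(orders)] for i, s in enumerate(subjects)}


subjects = [f"P{i}" for i in range(1, 7)]   # hypothetical participants
conditions = ["menu", "toolbar"]            # hypothetical interface styles

groups = between_groups(subjects, conditions)
orders_by_subject = within_groups(subjects, conditions)
print(groups)
print(orders_by_subject)
```

Cycling through every order is only practical for a few conditions, since the number of orders grows factorially. With more conditions a subset of orders, such as a Latin square, is the usual compromise.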


Empirical Evaluation (3)

• Good for evaluation of individual design decisions: Colour, dialogue, wording, etc.

• Less good for overall usability – systems and humans too complex for controlled experiment

• Difficult to design
• Expensive in time, money and users
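
For a single design decision of this kind, a between-groups comparison of two conditions can be analysed with an exact permutation test. This is one possible choice of statistic, sketched here with the standard library only. The task times are invented for illustration and are not measured data.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference of group means.

    Under the null hypothesis the condition labels are interchangeable,
    so every split of the pooled measurements into groups of the same
    sizes is equally likely. The p-value is the share of splits whose
    mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    extreme = splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        splits += 1
        if diff >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / splits


# Invented task-completion times in seconds for two button wordings.
wording_a = [41, 38, 45, 40, 43]
wording_b = [52, 49, 55, 47, 50]

p_value = permutation_p_value(wording_a, wording_b)
print(f"p = {p_value:.4f}")
```

With these invented times only 2 of the 252 possible splits are as extreme as the observed one, so p is about 0.008 and the null hypothesis of no difference would be rejected at the usual 5% level. Enumerating every split is only feasible for small groups.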


Observational Techniques

• Think aloud & Co-operative evaluation
• Observing the user’s actions in work context – the whole task
• Usually pre-determined, representative tasks and users explain what they are doing (think aloud)
• Experimenter interacts with participant (subject) to elicit more information
• Everything recorded (notes, system log, audio, video)
• Protocols analysed
• Post-experiment walk through


Query Based Techniques

• Asking the user can be very informative
• Simple, but highly subjective
• Interviews and questionnaires (see earlier lectures)
• Good for large numbers and high-level
• Good for exploring alternative strategies, particularly in context
• Less systematic, more subjective
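
Questionnaire answers on a rating scale can be summarised per question. The sketch below assumes a 1 to 5 scale and uses the median, since ratings are ordinal. The question wording and the responses are hypothetical.

```python
from statistics import median


def summarise(responses):
    """Return {question: (median rating, number of answers)}.

    Each response is a dict mapping a question to a rating from 1 to 5.
    """
    by_question = {}
    for answer_sheet in responses:
        for question, rating in answer_sheet.items():
            if not 1 <= rating <= 5:
                raise ValueError(f"rating out of range: {rating}")
            by_question.setdefault(question, []).append(rating)
    return {q: (median(r), len(r)) for q, r in by_question.items()}


# Hypothetical answers from three respondents.
responses = [
    {"easy to learn": 4, "would use again": 5},
    {"easy to learn": 2, "would use again": 4},
    {"easy to learn": 4, "would use again": 3},
]

summary = summarise(responses)
for question, (mid, count) in summary.items():
    print(f"{question}: median {mid} from {count} answers")
```

A summary like this scales to large numbers of respondents, but it remains subjective data and says nothing about why a rating was given.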


Summary

• Need to test appropriateness of functionality
• Also that functionality can be used
• Efficiency, effectiveness and satisfaction
• Evaluation of design and its implementation
• Choose your users with care
• HCI: Dix, Finlay, Abowd & Beale; Chapter 11