© Simeon Keates 2009 Usability with Project Lecture 13 – 28/10/09 Dr. Simeon Keates.



Exercise – part 1

Prepare the testing protocol for evaluating the accessibility and usability of your web-site

Also, address any additional research aims identified in your research plan from Wednesday


Exercise – part 2

You need to consider the following:

Pre-session briefing
• Prepare your welcome statement
• What you are doing and why
• Privacy issues and right to withdraw
• Any initial questions you wish to ask
• Prepare a consent form

Tasks
• Identify at least 5 tasks for each user on each site
• Ensure you do not introduce systematic errors
• Prepare any likely questions you may wish to ask
• Remember to add/amend tasks for the “blind” test


Exercise – part 3

You need to consider (continued):

Post-session de-brief
• Ask any remaining questions needed to address your research issues
• Thank the user for their time

E-mail your protocol to Stina, Susanne and me

Remember – you will be putting this to the test next week!


Exercise – suggestions for tasks

Exploring the site / describe each page
• Great for getting users used to what is where

Completing a guided product selection task
• Find “this” product

Completing an unguided product selection task
• Find “any” product of your choice

Changing your mind
• You decide you do not want this

How many types of [x] (example: tea)?

More discussion on tasks a little later…


Exercise – additional points

Decide whether all users do the same tasks in the same order or not
• Be on the lookout for “order” effects

You should randomise the presentation of the sites
• ½ do site 1 first
• ½ do site 2 first

Think about getting timing data
• Very important to know how long the tasks and sub-tasks take
• Companies often care about productivity

More discussion on order effects a little later…


Designing the task protocol


Good experimental design

A usability trial is fundamentally a scientific experiment:
• A research question is asked
• An experiment is designed to obtain data
• The data is analysed
• An answer is suggested to the question


Example questions

A major search engine company often asks new recruits this:

“On our search results pages, we often include the ability to progress to different pages of results via the hyperlinks in “Page 1 / 2 / 3 / … / 10” and also “next page” and “previous page” links. Those links are available at the top and bottom of each page of results. We wish to save money by removing one or other set of links. How would you test which set to remove and whether removing either significantly affects the usability of our search engine?”

How would you test for this?



Example questions

You have been asked to design a new set of icons for operating the electric windows in ITU

Currently the icons look like this:

Which icon opens and which closes?

Your task:
• Design a better pair of icons
• Design an evaluation to show that your icons are better


An example set of answers

Designs: “^” and “v” are quite abstract
• Need cognitive effort to map to “open” and “close”

Evolution tells us that humans respond more quickly to particular shapes and concepts

Using that we can design icons that are more closely associated with “open” and “close”

Examples:


An example evaluation approach

Quick approach:
• Prepare stick-on versions of your icons and put them on the existing button
• Ask users which is easier to use

More scientific approach:
• Code a virtual simulation of the window and the weather outside
• Show rain/snow and bright sunshine
• Ask users to press the correct button on screen to open or close the window
• Record the RT (reaction time)


Good questions to ask (source: Nielsen “Usability Engineering”)

General categories:
• Time
• Errors
• Extent of system usage
• Use of help
• User response
• User effectiveness



Time:
• The time users take to complete a specific task
• The number of tasks of various kinds that can be completed in time x

Errors:
• The ratio between successful interactions and errors
• The time spent recovering from errors
• The number of user errors
• The number of immediately subsequent erroneous actions



Extent of system usage:
• The number of commands or other features used by the user
• The number of commands or other features never used by the user
• The number of system features the user can remember during a debrief

Use of help:
• The frequency of use of the manuals/help system
• The time spent using manuals/help
• How frequently manuals/help solved the user’s problem



User response:
• The proportion of user statements during the trial that were positive or critical towards the system
• The number of times the user expresses clear frustration (or joy)
• The proportion of users who say they prefer the system over [X]

User effectiveness:
• The number of times the user had to work round an unsolvable problem
• The proportion of users using efficient working strategies vs. those without
• The amount of “dead” time when the user is not interacting with the system
  • (a) response time delays – user waiting for system
  • (b) thinking time delays – system waiting for user
• The number of times the user is sidetracked from focusing on the real task


Good experimental design


How to ensure we get “good” results

First, what is “good”?

It does not mean “what we were looking for”

It does mean “results we can trust and believe to be valid”


Balanced designs… (a.k.a. Latin squares)

“Order effects” can significantly alter the results of usability trials
• Especially those based on comparing two or more designs

The reason is that users get better the more that they practise

Example: “If you go shopping in China and try to find tea in a supermarket, your first attempt will most likely involve walking up and down the aisles in turn until you see the tea. The second time you go in, you will walk straight to the tea section, or at least the drinks section.”

Thus, the second time you do the task, even if the layout is slightly different (and possibly “poorer”), you will most likely be much faster.


The Power Law of Practice

This improvement over time is a known psychological phenomenon

It can be described mathematically through the Power Law of Practice

The time $T_n$ to perform a task on the $n$-th trial follows a power law:

$T_n = T_1 n^{-\alpha}$

where: $\alpha = 0.4$ [0.2~0.6]


The Power Law of Practice

$T_n = T_1 n^{-\alpha}$

With $\alpha = 0.4$ and $T_1 = 60$ s: $T_2 = 45.5$ s (24% faster), $T_{10} = 23.9$ s (60% faster)
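The worked values above can be checked with a few lines of Python. This is only a minimal sketch: the function name `practice_time` is illustrative, and the default exponent of 0.4 is simply the nominal value quoted on the slide.

```python
def practice_time(t1: float, n: int, alpha: float = 0.4) -> float:
    """Predicted time on the n-th trial under the Power Law of Practice."""
    if n < 1:
        raise ValueError("trial number n must be >= 1")
    return t1 * n ** (-alpha)


if __name__ == "__main__":
    t1 = 60.0  # time for the first attempt, in seconds
    for n in (1, 2, 10):
        tn = practice_time(t1, n)
        saving = 100 * (1 - tn / t1)
        print(f"T{n} = {tn:.1f} s ({saving:.0f}% faster than T1)")
```

Changing `alpha` within the quoted 0.2~0.6 range shows how sensitive the predicted speed-up is to the learner and the task.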


Eliminating the effects of practice

The only way to eliminate the effects of practice is to use a balanced design

Example:
• We have 2 competing web-site designs
• We want to see which is the fastest for finding an arbitrary product
  • i.e. a product that is not “special” in any particular way

Variables:
• 4 users (1, 2, 3, 4)
• 2 web-sites (A, B)
• 2 products (20, 40)


A “balanced” experimental design

User   1st trial             2nd trial
1      Site A, Product 20    Site B, Product 40
2      Site B, Product 40    Site A, Product 20
3      Site A, Product 40    Site B, Product 20
4      Site B, Product 20    Site A, Product 40

Unbalanced design – Site B has a built in advantage

Balanced design – Sites A and B are both first


Also:
• Order effects on product
• Product/site interactions
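One way to generate such a schedule is to cross every site order with every product order. The sketch below assumes exactly two sites and two products, as in the example; the function name `balanced_schedules` is hypothetical, and with more users the four schedules would simply be repeated in blocks of four.

```python
from itertools import product


def balanced_schedules(sites=("A", "B"), products=(20, 40)):
    """Return one schedule per user as ((site, product), (site, product)).

    Each site order is crossed with each product order, so every site and
    every product appears first equally often.
    """
    site_orders = (sites, sites[::-1])
    product_orders = (products, products[::-1])
    schedules = []
    for site_order, product_order in product(site_orders, product_orders):
        first = (site_order[0], product_order[0])
        second = (site_order[1], product_order[1])
        schedules.append((first, second))
    return schedules


if __name__ == "__main__":
    for user, (first, second) in enumerate(balanced_schedules(), start=1):
        print(f"User {user}: {first} then {second}")
```

The four schedules produced are the same four combinations as in the table, although not necessarily assigned to the same user numbers.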


Which site is better?

We need to establish which site offers the best usability
• One option: which has the fastest time to find a product?

Data collected:

User   Site   Product   Time (s)
1      A      40        15
1      B      20        10
2      B      20        9
2      A      40        14
3      A      20        13
3      B      40        12
4      B      40        11
4      A      20        13


Which site is better?

Collating the data:

Category     Data 1 (s)   Data 2 (s)   Data 3 (s)   Data 4 (s)   Mean (s)
Site A       15           14           13           13           13.75
Site B       10           9            12           11           10.50
Product 20   10           9            13           13           11.25
Product 40   15           14           12           11           13.00
1st site     15           9            13           11           12.00
2nd site     10           14           12           13           12.25
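The collation step can be sketched in Python as follows. The trial records are the data collected above; the tuple layout and the helper name `collate` are illustrative choices, and the sketch assumes each user's trials are listed in the order they were performed.

```python
from statistics import mean

# (user, site, product, time in seconds), listed in the order performed
trials = [
    (1, "A", 40, 15), (1, "B", 20, 10),
    (2, "B", 20, 9), (2, "A", 40, 14),
    (3, "A", 20, 13), (3, "B", 40, 12),
    (4, "B", 40, 11), (4, "A", 20, 13),
]


def collate(records):
    """Group times by site, by product, and by presentation order."""
    groups = {}
    seen_users = set()
    for user, site, product, seconds in records:
        # A user's first record is their 1st site, the next is their 2nd.
        position = "2nd site" if user in seen_users else "1st site"
        seen_users.add(user)
        for key in (f"Site {site}", f"Product {product}", position):
            groups.setdefault(key, []).append(seconds)
    return groups


groups = collate(trials)
means = {category: mean(times) for category, times in groups.items()}

if __name__ == "__main__":
    for category, value in means.items():
        print(f"{category:<11} {groups[category]} mean = {value:.2f} s")
```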


Statistical significance

It looks like Site B is “better” than Site A
• 10.50 vs. 13.75

It also looks like product 20 is faster to find than product 40
• 11.25 vs. 13.00

It also looks like the site evaluated 1st is faster than the one evaluated 2nd
• 12.00 vs. 12.25

Are these results reliable? i.e. are these statistically significant?


Testing statistical significance

Need to set up two hypotheses:
• H0 (null): There is no difference in the means (μA = μB)
• H1: μB (mean for B) < μA (mean for A)

Evaluate the difference between means using Student’s t-test
• One-tailed test (because we believe B is faster than A)

Using Excel’s TTEST function (one-tailed, paired, because each user tried both sites):
• p = 0.026, i.e. statistically significant at the 5% level
• i.e. if μA = μB, a difference this large would arise by chance less than 5% of the time



What about product order (μ20 < μ40) and learning (μ2nd < μ1st)?

Set up similar hypotheses:
• H0 (null): Means are the same (μ20 = μ40) and (μ1st = μ2nd)
• H1: Means of product 20 and 2nd site are lower (μ20 < μ40) and (μ2nd < μ1st)

Using Excel’s TTEST (one-tailed, paired):
• p = 0.21 for product (H0: μ20 = μ40), i.e. a difference this large would arise about 21% of the time if the means were equal, and thus not statistically significant
• p = 0.46 for learning (H0: μ1st = μ2nd), i.e. about 46% of the time, and thus not statistically significant
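As a rough check, the three quoted p-values can be reproduced without a spreadsheet. The sketch below is a stand-in for the spreadsheet function, not a description of it: it pairs each user's two times, computes a paired t-statistic, and integrates the Student t density with Simpson's rule rather than calling a statistics library. All helper names are illustrative. Pairing the data by user and taking the tail on |t| are assumptions about how the quoted values were produced; under those assumptions the results come out at about 0.026, 0.21 and 0.46.

```python
import math
from statistics import mean, stdev


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def upper_tail(t, df, steps=2000):
    """P(T > |t|): 0.5 minus the Simpson's-rule integral of the pdf on [0, |t|]."""
    t = abs(t)
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def paired_one_tailed_p(xs, ys):
    """One-tailed p-value of a paired t-test on two equal-length samples."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return upper_tail(t, n - 1)


# Times in seconds for users 1 to 4, taken from the collated data
site_a, site_b = [15, 14, 13, 13], [10, 9, 12, 11]
product_20, product_40 = [10, 9, 13, 13], [15, 14, 12, 11]
first_site, second_site = [15, 9, 13, 11], [10, 14, 12, 13]

if __name__ == "__main__":
    print(f"site:     p = {paired_one_tailed_p(site_a, site_b):.3f}")
    print(f"product:  p = {paired_one_tailed_p(product_40, product_20):.2f}")
    print(f"learning: p = {paired_one_tailed_p(first_site, second_site):.2f}")
```

With only four users the test has very little power, so a non-significant result here says little about whether product or order effects exist.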



You can only state authoritatively that something is better than another if you test for statistical significance

The tighter the threshold, the more believable
• i.e. 1% is better than 5%

Even then you still might be wrong
• 1 in 20 chance at 5% level, i.e. 64% chance ($1 - 0.95^{20}$) of being wrong at least once after 20 experiments
• 1 in 100 chance at 1% level, i.e. 18% chance ($1 - 0.99^{20}$) of being wrong at least once after 20 experiments
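The "wrong at least once" figures follow from a one-line calculation, sketched below. It assumes the experiments are independent and that there is no real effect in any of them; the function name is illustrative. Evaluating it gives roughly 64% at the 5% level and 18% at the 1% level.

```python
def chance_of_false_positive(alpha: float, experiments: int) -> float:
    """Probability of at least one false positive across independent experiments."""
    return 1 - (1 - alpha) ** experiments


if __name__ == "__main__":
    for alpha in (0.05, 0.01):
        risk = chance_of_false_positive(alpha, 20)
        print(f"threshold {alpha:.0%}: {risk:.0%} chance of at least one error in 20 experiments")
```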


Cognitive modelling


A psychological theoretical perspective

Cognitive psychology offers an insight into issues to look for when performing user trials, for example:

Some things are easier to see or hear than others
• Effects of contrast, loudness, size, etc.
• Models can provide quantitative data on this

Source: Wharton and Lewis “Role of Psychological Theory” in: Usability Inspection Methods, ed. Jakob Nielsen



Some things don’t look or sound the way you would think
• Same colour can look very different on different backgrounds, or with different monitors
• May actually need to change a colour to get the same visual appearance

Which picture is brighter?




Are the red and green squares the same on each “half”?



Only some of the contents of a complex display are likely to be seen
• Depends on size, colour, organisation and movement
• Also: where the user is looking (focus)
• What the user knows about the structure of the scene
• What the user is trying to do



Precise movements take longer than gross movements
• i.e. small, fiddly things take longer than big, simple ones
• Described by Fitts’ Law
• We can model how long a movement of a given length and requiring a given precision will take
• More on this later this morning



Mental operations take time
• It takes time to recall info from memory or to make a decision
• Can make quantitative estimates of how much time
• See Model Human Processor slides from earlier lectures
• More on this later

People can perform some mental and physical operations in parallel
• It usually takes practice though!
• Examples – driving a car, talking while typing, etc.

People get faster the more often that they perform a task
• Power Law of Practice
• Improvement is rapid at first, but drops off over time…



Novice users may perform tasks differently from expert users
• Differences arise from the way that knowledge is mentally represented
• Also how much knowledge is available
• And how the task is understood and organised for different levels of expertise

It takes time to learn things well
• Small amount of time available = small amount learnt
• Also only remembered for a small amount of time (usually)
• Can make quantitative estimates of how much time is required to learn something – based on decomposition of the skill into small parts

Prior knowledge can be beneficial
• As for the scaffolding technique
• Relating to prior/existing knowledge speeds up data acquisition and retention



Recognition is easier than recall
• [cf. Jordan guidelines and Nielsen heuristics]
• One of the defining principles of the Star interface design (the forerunner of modern GUIs)
• Interaction is dominated by recognising depictions (icons) rather than remembering commands

People forget things
• Need ample time to rehearse
• Hard to keep arbitrary information in mind while performing a task
• Needs scaffolding
• Often affected by external factors, e.g. inducing stress



Behaviour is often guided by goals
• People choose actions that they believe will accomplish their goals
• If an action does not appear to help this, it will not be selected
• Usually results in “label following” – where labels are related to the goal
• Example: archiving a file – call it “Archive” not “Disk maintenance”

Alternative methods can cause problems
• Problems with many solution options seem harder than those with few
• Possible issue here with preferred solutions for Universal Access!

People try to assess progress
• If they do not seem to be making progress, they will often stop, go back or try a completely new method


Cognitive modelling perspectives – Perception (CMN)

The eye sees up to almost 180°
• Detail is only seen by the fovea over 2°
• Remainder is seen by the rest of the retina as peripheral vision for orientation

The eye moves continuously in a sequence of saccades
• Each saccade takes ~30 ms
• Each eye dwell takes 60~700 ms

So, estimated time for eye-movement to a new target (travel + fixation time):

Eye-movement = 230 [70~700] ms


Cognitive modelling – Reading rate

Assuming 230 ms per saccade, how much can a reader read per fixation?

1 – One saccade per letter (5 letters per word):

$\dfrac{60\ \text{s/min}}{5\ \text{saccades/word} \times 0.230\ \text{s/saccade}} = 52\ \text{words/min}$

2 – One saccade per word:

$\dfrac{60\ \text{s/min}}{1\ \text{saccade/word} \times 0.230\ \text{s/saccade}} = 261\ \text{words/min}$

3 – One saccade per phrase (13 chars = 2.5 words for a good reader):

$\dfrac{60\ \text{s/min}}{\frac{1}{2.5}\ \text{saccades/word} \times 0.230\ \text{s/saccade}} = 652\ \text{words/min}$
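The three reading-rate estimates all come from the same division, which the following sketch makes explicit. The function name `words_per_minute` is illustrative, and the 230 ms saccade-plus-fixation time is the nominal value assumed above.

```python
SACCADE_S = 0.230  # nominal eye-movement time (travel + fixation), in seconds


def words_per_minute(saccades_per_word: float, saccade_s: float = SACCADE_S) -> float:
    """Reading rate implied by a given number of saccades per word."""
    return 60.0 / (saccades_per_word * saccade_s)


CASES = {
    "one saccade per letter": 5,         # 5 letters per word
    "one saccade per word": 1,
    "one saccade per phrase": 1 / 2.5,   # 2.5 words per saccade
}

if __name__ == "__main__":
    for label, saccades in CASES.items():
        print(f"{label}: {words_per_minute(saccades):.0f} words/min")
```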


Cognitive modelling – The perceptual processor

What is the cycle time τp of the Perceptual Processor?
• This is the unit impulse response time
  • i.e. the time response of the visual system to a very brief pulse of light
• Also, the time taken (from t = 0) for the image to be available in the Visual Image Store
  • This is the “working” store of images in the brain and holds 17 [7~17] letters with a half-life of 200 [90~1000] ms
• For most users in most circumstances (it varies by stimulus and need):

τp = 100 [50~200] ms

Note: The Perceptual Processor cycle time τp varies inversely with stimulus intensity (i.e. bigger, louder, brighter = faster response)


Cognitive modelling – The perceptual processor

How do people perceive motion?

What happens if you vary the time delay between one action and the following one?


Cognitive modelling – The motor system

Experiment:
• Draw two horizontal parallel lines on a piece of paper approximately 2.5 cm apart
• Now, draw vertical lines back and forth between them for 5 s
• Now count the number of back and forth motions
• Should be approximately 70

Thus motor processor “open loop” time is:

τm = 70 [30~100] ms


Cognitive modelling – The motor system

We can also look at the “closed loop” response times

Draw the “envelope” of edge contours, like this:

And now count how many “changes in direction” you see
• Each “change in direction” is a closed-loop correction
  • i.e. “I am overshooting/undershooting the line and must correct”

You should have ~20


Cognitive modelling – The cognitive processor

So, closed loop control takes ~250 ms

Of this, we have:
• τp = 100 ms
• τm = 70 ms

So what else is happening?

You need to make a decision to perform a correction

This is the cognitive processor time, τc

τc = 70 [25~170] ms

Note that τc is shorter: (1) when greater effort is induced by increased task demands or information loads; (2) with practice
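The arithmetic behind the pen-and-paper experiment can be laid out as a short sketch. The counts (about 70 strokes and about 20 corrections in 5 s) and the nominal τp and τm values are those quoted in the slides; the function name is illustrative. The residual it computes is only a rough estimate of τc, and it lands inside the quoted 25~170 ms range rather than exactly on the nominal 70 ms.

```python
def cycle_time_ms(duration_s: float, count: int) -> float:
    """Average time per event, in milliseconds."""
    return 1000 * duration_s / count


TAU_P_MS = 100  # nominal perceptual processor cycle time
TAU_M_MS = 70   # nominal motor processor cycle time

# Open loop: about 70 back-and-forth strokes in 5 s
open_loop_ms = cycle_time_ms(5, 70)

# Closed loop: about 20 visible corrections in 5 s
closed_loop_ms = cycle_time_ms(5, 20)

# Whatever is left after perceiving and moving is attributed to deciding
tau_c_estimate_ms = closed_loop_ms - TAU_P_MS - TAU_M_MS

if __name__ == "__main__":
    print(f"open loop:   {open_loop_ms:.0f} ms per stroke")
    print(f"closed loop: {closed_loop_ms:.0f} ms per correction")
    print(f"residual for the cognitive processor: {tau_c_estimate_ms:.0f} ms")
```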


Cognitive modelling – The Model Human Processor

Time_taken = x τp + y τc + z τm

Where:
• x, y and z are integers
• τp = time for perceptual processor
• τc = time for cognitive processor
• τm = time for (simple) motor function


Cognitive modelling – The Model Human Processor

Time_taken = x τp + y τc + z τm

For a simple reaction task:
x = 1, y = 1, z = 1 (or 2 depending on key-down or key-up)

For a simple classification task (e.g. primary colour or simple shape):
x = 1, y = 2, z = 1

For a more complex classification task (e.g. colour and shape):
x = 1, y = 3, z = 1

For a more abstract classification task (e.g. letter):
x = 1, y = 4, z = 1

I saw something (1), it was a shape (2) it was this shape (3), that shape is this letter (4)
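Plugging the nominal cycle times into the formula gives a predicted time for each task type. The sketch below uses τp = 100 ms, τc = 70 ms and τm = 70 ms from the earlier slides; the function and dictionary names are illustrative, and z = 1 is assumed throughout.

```python
TAU_P_MS, TAU_C_MS, TAU_M_MS = 100, 70, 70  # nominal cycle times in ms


def mhp_time_ms(x: int, y: int, z: int,
                tau_p: float = TAU_P_MS,
                tau_c: float = TAU_C_MS,
                tau_m: float = TAU_M_MS) -> float:
    """Time_taken = x * tau_p + y * tau_c + z * tau_m."""
    return x * tau_p + y * tau_c + z * tau_m


# (x, y, z) cycle counts for each task type
TASKS = {
    "simple reaction": (1, 1, 1),
    "simple classification": (1, 2, 1),
    "complex classification": (1, 3, 1),
    "abstract classification": (1, 4, 1),
}

if __name__ == "__main__":
    for name, cycles in TASKS.items():
        print(f"{name}: {mhp_time_ms(*cycles)} ms")
```

Each extra cognitive cycle adds 70 ms at the nominal value, so the abstract classification is predicted to take 210 ms longer than the simple reaction.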


Cognitive modelling – Other useful info

Working memory – holds the information under current consideration

Long-term memory – stores knowledge for future use

Pure working memory has capacity, μWM = 3 [2.5~4.1] chunks
• i.e. you can only “remember” 3 things at any one time

However, working memory can be augmented by long-term memory, such that:

Effective capacity of Working Memory, μWM* = 7 [5~9] chunks


Cognitive modelling – Dealing with uncertainty

The Uncertainty Principle states that decision time T increases with uncertainty about the decision to be made:

$T = I_C H$

Where:
• $H$ is the information-theoretic entropy of the decision
• $I_C$ = 150 [0~157] ms/bit

For n equally probable alternatives (Hick’s Law):

$H = \log_2(n + 1)$

More generally:

$H = \sum_i p_i \log_2\left(\frac{1}{p_i} + 1\right)$


Implications of Hick’s Law

Web-site of 62 products
• Each equally likely

$T = I_C H = 150 \times \log_2(62 + 1) = 897$ ms to decide which product

Note – this only applies where the user knows the name of the product and its location in the list

If we have 8 clusters of 7-8 products (i.e. mean of 7.75 per cluster):

$T = 150 \times \log_2(8 + 1) + 150 \times \log_2(7.75 + 1) = 945$ ms to decide which product

i.e. approximately 5.4% more “thinking” time
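The comparison between a flat list and a clustered layout can be reproduced with a short sketch. It uses the nominal 150 ms/bit and treats the clustered case as two successive decisions, as above; the function names are illustrative. The general entropy form is included to show that it reduces to log2(n + 1) when all n alternatives are equally likely.

```python
import math

I_C_MS_PER_BIT = 150  # nominal value


def hick_time_ms(n_alternatives: float, ic: float = I_C_MS_PER_BIT) -> float:
    """Decision time for n equally probable alternatives (Hick's Law)."""
    return ic * math.log2(n_alternatives + 1)


def entropy_bits(probabilities) -> float:
    """General form: H = sum of p_i * log2(1 / p_i + 1)."""
    return sum(p * math.log2(1 / p + 1) for p in probabilities if p > 0)


flat_ms = hick_time_ms(62)
clustered_ms = hick_time_ms(8) + hick_time_ms(7.75)

if __name__ == "__main__":
    print(f"flat list of 62:       {flat_ms:.0f} ms")
    print(f"8 clusters of ~7.75:   {clustered_ms:.0f} ms")
    print(f"extra thinking time:   {100 * (clustered_ms / flat_ms - 1):.1f}%")
```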


Exercise


Exercise – Part 1

Last week you were asked to prepare your user trial protocols

Today – put them into practice

Perform a pilot study of the usability of your web-site with at least 1 user

Remember – the principal aim is to “test the test” • (or “trial the trial” or “evaluate the evaluation”…)


Exercise – Part 2

Prepare a progress presentation for the board for Friday
• Show that good progress is being made

Summarise:
• The tasks performed
• The data collected
• Whether the user liked the site
• Whether the user could use the site (e.g. complete the tasks)
• What you think is working well in the design
• What you think needs to be looked at more closely in the design
• Any changes you would like to make to the site and protocol


Exercise - Practicalities

Remember to print out copies of your protocol

Allow plenty of blank space for adding observation notes

Allocate one person to do the pre-session briefing and debrief

Allocate one person to be the facilitator (the person who directs the user)

The remaining members act as observers
