Post on 27-Jan-2017
Lecture 12
[Workshop #1] UX Goals and Metrics
Human Computer Interaction/COG3103, 2016 Fall
Class hours: Monday 1-3 pm/Wednesday 2-3 pm
Lecture room: Widang Hall 209
21st November
METRICS AND MEASUREMENTS Workshop #1
Workshop #1 COG_Human Computer Interaction 2
Choosing the Right Metrics Ten Types of Usability Studies
• Issue Based Metrics (Ch 5)
– Anything that prevents task completion
– Anything that takes someone off course
– Anything that creates some level of confusion
– Anything that produces an error
– Not seeing something that should be noticed
– Assuming something should be correct when it is not
– Assuming a task is complete when it is not
– Performing the wrong action
– Misinterpreting some piece of content
– Not understanding the navigation
Task Success
Task Time
Errors
Efficiency
Learnability
Issue Based Metrics
Self Reported Metrics
Behavioral and Physiological Metrics
Combined and Comparative Metrics
Live Website Metrics
Card Sorting Data
Choosing the Right Metrics Ten Types of Usability Studies
• Self Reported Metrics (Ch 6): Asking participants for information about their
perception of the system and their interaction with it
– Overall interaction
– Ease of use
– Effectiveness of navigation
– Awareness of certain features
– Clarity of terminology
– Visual appeal
– Likert scales
– Semantic differential scales
– After-scenario questionnaire
– Expectation measures
– Usability Magnitude Estimation
– SUS
– CSUQ (Computer System Usability Questionnaire)
– QUIS (Questionnaire for User Interface Satisfaction)
– WAMMI (Website Analysis & Measurement Inventory)
– Product Reaction Cards
Choosing the Right Metrics Ten Types of Usability Studies
• Behavioral and Physiological Metrics (Ch 7)
– Verbal Behaviors
• Strongly positive comment
• Strongly negative comment
• Suggestion for improvement
• Question
• Variation from expectation
• Stated confusion/frustration
– Nonverbal Behaviors
• Frowning/Grimacing/Unhappy
• Smiling/Laughing/Happy
• Surprised/Unexpected
• Furrowed brow/Concentration
• Evidence of impatience
• Leaning in close to screen
• Fidgeting in chair
• Rubbing head/eyes/neck
Choosing the Right Metrics Ten Types of Usability Studies
• Combined and Comparative Metrics (Ch 8)
– Taking smaller pieces of raw data like task
completion rates, time-on-task, and self-reported
ease of use to derive new metrics such as
an overall usability metric or usability
scorecard
– Comparing existing usability data to expert
or ideal results
Choosing the Right Metrics Ten Types of Usability Studies
• Live Website Metrics (Ch 9)
– Information you can glean from live data on
a production website
• Server logs – page views and visits
• Click through rates - # times link shown vs.
actually clicked
• Drop off rates – abandoned process
• A/B studies – manipulate the pages users see
and compare metrics between them
Choosing the Right Metrics Ten Types of Usability Studies
• Card Sorting Data (Ch 9)
– Open card sort
• Give participants cards, they sort and define
groups
– Closed card sort
• Give participants cards and name of groups,
they put cards into groups
Choosing the Right Metrics Ten Types of Usability Studies
• Increasing Awareness
– Aimed at increasing awareness of a specific piece of content
or functionality
– Why is something not noticed or used?
• Metrics
– Live Website Metrics
• Monitor interactions
• Not foolproof – user may notice and decide not to click,
alternatively user may click but not notice interaction
• A/B testing to see how small changes impact user behavior
– Self Reported Metrics
• Pointing out specific elements to user and asking whether
they had noticed those elements during task
• Aware of feature before study began
– Not everyone has good memory
• Show users different elements and ask them to choose
which one they saw during task
– Behavioral and Physiological Metrics
• Eye tracking
– Determine amount of time looking at a certain element
– Average time spent looking at a certain element
Choosing the Right Metrics Ten Types of Usability Studies
• Problem Discovery
– Identify major usability issues
– After deployment, find out what annoys users
– Periodic checkup to see how users are interacting with
the product
• Discovery vs. usability study
– Open-ended
– Participants may generate own tasks
– Strive for realism in typical task and in user’s
environment
– Comparing across participants can be difficult
• Metrics
– Issue Based Metrics
• Capture all usability issues, you can convert into type
and frequency
• Assign severity rating and develop a quick-hit list of
design improvements
– Self Reported Metrics
Choosing the Right Metrics Ten Types of Usability Studies
• Creating an Overall Positive User Experience
– Not enough to be usable, want exceptional user
experience
– Thought provoking, entertaining, slightly-addictive
– Performance useful, but what user thinks, feels, and
says really matters
• Metrics
– Self Reported
• Satisfaction – common but not enough
• Exceed expectations – want user to say it was easier,
more efficient, or more entertaining than expected
• Likelihood to purchase, use in future
• Recommend to a friend
• Behavioral and Physiological
– Pupil diameter
– Heart rate
– Skin conductance
Choosing the Right Metrics Ten Types of Usability Studies
• Comparing Designs
– Comparing more than one design alternative
– Early in the design process teams put together semi-
functional prototypes
– Evaluate using predefined set of metrics
• Participants
– Can’t ask same participant to perform same tasks with
all designs
– Even with counterbalancing design and task order,
the information is of limited value
• Procedure
– Study as between-subjects, participant only works with
one design
– Have primary design participant works with, show
alternative designs and ask for preference
Choosing the Right Metrics Ten Types of Usability Studies
• Comparing Designs (continued)
• Metrics
– Task Success
• Indicates which design more usable
• Small sample size, limited value
– Task Time
• Indicates which design more usable
• Small sample size, limited value
– Issue Based Metrics
• Compare the frequency of high-, medium-, and
low-severity issues across designs to see which one
is most usable
– Self Reported Metrics
• Ask participant to choose the prototype they would
most like to use in the future (forced comparison)
• Ask participant to rate each prototype along
dimensions such as ease of use and visual appeal
Independent & Dependent Variables
Independent variables:
– The things you manipulate or control for
– Aspect of a study that you manipulate
– Chosen based on research question
– e.g.
• Characteristics of participants (e.g., age, sex, relevant experience)
• Different designs or prototypes being tested
• Tasks
Dependent variables:
– The things you measure
– Describes what happened as a result
of the study
– Something you measure as the result,
or as dependent on, how you
manipulate the independent variables
– e.g.
• Task Success
• Task Time
• SUS score
• etc.
Need to have a clear idea of what you plan to manipulate and what you plan to measure
Designing a Usability Study
RQ 1
• Research Question :
– Differences in performance
between males and females
• Independent variable
– Gender
• Dependent variable
– Task completion time
RQ 2
• Research Question :
– Differences in satisfaction
between novice and expert
users
• Independent variable :
– Experience level
• Dependent variable :
– Satisfaction
Types of Data
• Nominal (aka Categorical)
– e.g., Male, Female; Design A, Design B.
• Ordinal
– e.g., Rank ordering of 4 designs tested from Most Visually Appealing to
Least Visually Appealing.
• Interval
– e.g., 7-point scale of agreement: “This design is visually appealing.
Strongly Disagree . . . Strongly Agree”
• Ratio
– e.g., Time, Task Success %
NOMINAL DATA
• Definition
– Unordered groups or categories
– Without order, cannot say one is better than another
• May provide characteristics of users, independent variables that allow you to segment
data
– Windows versus Mac users
– Geographical location
– Males versus females
• What about dependent variables?
– Number of users who clicked on A vs. B
– Task success
• Usage
– Counts and frequencies
ORDINAL DATA
• Definition
– Ordered groups and categories
– Data is ordered in a certain way but intervals between measurements are not
meaningful
• Ordinal data comes from self-reported data on questionnaires
– Website rated as excellent, good, fair, or poor
– Severity rating of problem encountered as high, medium, or low
• Usage
– Looking at frequencies
– Calculating average is meaningless (distance between high and medium may
not be the same as medium and low)
INTERVAL DATA
• Definition
– Continuous data where differences between the measurements are meaningful
– Zero point on the scale is arbitrary
• System Usability Scale (SUS)
– Example of interval data
– Based on self-reported data from a series of questions about overall usability
– Scores range from 0 to 100
• Higher score indicates better usability
• Distance between points meaningful because it indicates increase/decrease in perceived
usability
• Usage
– Able to calculate descriptive statistics such as average, standard deviation, etc.
– Inferential statistics can be used to generalize to a population
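To make the 0 to 100 range concrete, here is a minimal sketch of SUS scoring. It assumes the standard ten-item questionnaire with each item answered on a 1 to 5 scale, where odd-numbered items are positively worded and even-numbered items are negatively worded; the function name and the sample responses are illustrative, not taken from the lecture.

```python
def sus_score(responses):
    """Convert ten SUS item responses (each 1-5) into a 0-100 score.

    Assumes the standard SUS layout: odd-numbered items are positive
    statements, even-numbered items are negative statements.
    """
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")

    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1      # positive item: contributes 0..4
        else:
            total += 5 - response      # negative item: contributes 0..4
    return total * 2.5                 # rescale 0..40 to 0..100


# Hypothetical participant who answered "3" to every item
neutral_score = sus_score([3] * 10)
print(neutral_score)
```

Because each participant ends up with a single number on the same scale, averages and standard deviations across participants are straightforward to compute.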
Ordinal vs. Interval Rating Scales
• Are these two scales different?
• Top scale is ordinal. You should only calculate frequencies of each
response.
• Bottom scale can be considered interval. You can also calculate
means.
RATIO DATA
• Definition
– Same as interval data with the addition of absolute zero
– Zero has inherent meaning
• Example
– Difference between a person aged 35 and a person aged 38 is the same as the
difference between people who are 12 and 15
– Time to completion, you can say that one participant is twice as fast as
another
• Usage
– Most analyses that you do work with ratio and interval data
– Geometric mean is an exception, it needs ratio data
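As a small illustration of the geometric mean on ratio data, the sketch below compares it with the arithmetic mean for a set of task times. The times are made up for the example; `statistics.geometric_mean` is part of the Python standard library (3.8 and later).

```python
from statistics import geometric_mean, mean

# Hypothetical task times in seconds; one slow participant skews the data
task_times = [10, 20, 40, 160]

arithmetic = mean(task_times)
geometric = geometric_mean(task_times)

print(f"arithmetic mean: {arithmetic:.1f} s")
print(f"geometric mean:  {geometric:.1f} s")
```

The geometric mean multiplies values together, which is only meaningful when zero is a true zero, so it is pulled less by the one long time than the arithmetic mean is.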
Statistics for each Data Type
Confidence Intervals
• Assume this was your time data for a study with 5 participants:
Does that make a difference in your answer?
Calculating Confidence Intervals
– <alpha> is normally .05 (for a
95% confidence interval)
– <std dev> is the standard
deviation of the set of
numbers (9.6 in this example)
– <n> is how many numbers are
in the set (5 in this example)
=CONFIDENCE(<alpha>,<std dev>,<n>)
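For readers working outside Excel, the same calculation can be sketched in Python. This assumes the normal-distribution form that Excel's CONFIDENCE function uses (critical z value times the standard error); the function name `confidence_margin` is illustrative.

```python
import math
from statistics import NormalDist


def confidence_margin(alpha, std_dev, n):
    """Half-width of a confidence interval around a mean.

    Mirrors the three arguments of =CONFIDENCE(<alpha>,<std dev>,<n>),
    using the standard normal distribution for the critical value.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = .05
    return z * std_dev / math.sqrt(n)


# Values from the slide: alpha = .05, standard deviation = 9.6, n = 5
margin = confidence_margin(0.05, 9.6, 5)
print(f"mean +/- {margin:.2f}")
```

The returned value is the amount to add to and subtract from the mean, which is also the length of the error bar on each side. With only 5 participants, a t-based interval would come out wider than this normal-based one.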
Excel Example
Show Error Bars
Excel Example
How to Show Error Bar
Binary Success
• Pass/fail (or other binary criteria)
• 1’s (success) and 0’s (failure)
Confidence Interval for Task Success
• When you look at task success data across participants for a single
task the data is commonly binary:
– Each participant either passed or failed on the task.
• In this situation, you need to calculate the confidence interval using
the binomial distribution.
Example
– Easiest way to calculate confidence interval is using Jeff Sauro’s
web calculator:
– http://www.measuringusability.com/wald.htm
1=success, 0=failure. So, 6/8 succeeded, or 75%.
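If the web calculator is not available, a binomial confidence interval for the 6-out-of-8 example can be sketched as below. This assumes the adjusted Wald method, which is one common approximation for small samples and may not be the only method the calculator reports; the function name is illustrative.

```python
import math
from statistics import NormalDist


def adjusted_wald_interval(successes, n, alpha=0.05):
    """Adjusted Wald confidence interval for a binary completion rate.

    Adds z^2/2 successes and z^2 trials before applying the usual
    Wald formula, then clips the result to the 0..1 range.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# Example from the slide: 6 of 8 participants succeeded (75%)
low, high = adjusted_wald_interval(6, 8)
print(f"95% CI: {low:.0%} to {high:.0%}")
```

The interval is wide because the sample is small, which is the main point to keep in mind when reporting a 75% success rate from 8 participants.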
Chi-square
• Allows you to compare actual and expected frequencies for
categorical data.
=CHITEST(<actual range>,<expected range>)
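The comparison that CHITEST performs can be sketched with the Python standard library alone. The click counts below are hypothetical, the function names are illustrative, and the degrees of freedom are taken as the number of categories minus one, which assumes a single row of counts.

```python
import math


def chi_square_statistic(actual, expected):
    """Sum of (actual - expected)^2 / expected over all categories."""
    return sum((a - e) ** 2 / e for a, e in zip(actual, expected))


def chi_square_p_value(stat, df):
    """Upper-tail probability of the chi-square distribution.

    Starts from the closed form for df = 1 (odd) or df = 2 (even) and
    steps up two degrees of freedom at a time with the standard
    incomplete-gamma recurrence.
    """
    half = stat / 2
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(half))
    else:
        a, q = 1.0, math.exp(-half)
    while a < df / 2:
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q


# Hypothetical data: clicks on three links vs. an even split of 60 clicks
actual = [30, 20, 10]
expected = [20, 20, 20]

stat = chi_square_statistic(actual, expected)
p_value = chi_square_p_value(stat, df=len(actual) - 1)
print(f"chi-square = {stat:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the observed frequencies differ from the expected ones by more than chance alone would explain.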
Excel Example
Comparing Means
T-test
• Independent samples
(between subjects)
– Apollo websites, task times
T-test
• Paired samples (within
subjects)
– Haptic mouse study
T-tests in Excel
Independent Samples: Paired Samples:
=TTEST(<array1>,<array2>,x,y)
x = 2 (for two-tailed test) in almost all cases
y = 2 (independent samples) y = 1 (paired samples)
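The difference between the two settings of y can be seen in how the t statistic is built. The sketch below computes the statistic and degrees of freedom for both cases with made-up task times; it assumes the equal-variance (pooled) form for independent samples, which is what y = 2 selects in Excel. It stops short of the p-value, since the t distribution is not in the Python standard library.

```python
import math
from statistics import mean, stdev, variance


def independent_t(a, b):
    """Pooled-variance t statistic for two independent groups (y = 2)."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / std_err, n1 + n2 - 2


def paired_t(a, b):
    """t statistic on per-participant differences (y = 1)."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    std_err = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / std_err, len(diffs) - 1


# Hypothetical task times in seconds for two designs
design_a = [30, 35, 40, 45, 50]
design_b = [40, 50, 60, 70, 80]

t_ind, df_ind = independent_t(design_a, design_b)
t_pair, df_pair = paired_t(design_a, design_b)
print(f"independent: t = {t_ind:.2f}, df = {df_ind}")
print(f"paired:      t = {t_pair:.2f}, df = {df_pair}")
```

With the same numbers, the paired version gives a larger t in magnitude because it removes between-participant variation, which is why the choice of y should follow the study design rather than the data.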
Comparing Multiple Means
• Analysis of Variance (ANOVA)
Excel example: Study comparing 4 navigation approaches for a website
“Tools” > “Data Analysis” > “Anova: Single Factor”
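What the single-factor ANOVA computes can be sketched as follows. The four groups of task times are invented for illustration and the function name is hypothetical; the sketch returns the F ratio and its degrees of freedom but not the p-value, which needs the F distribution.

```python
from statistics import mean


def one_way_anova(groups):
    """F ratio for a single-factor ANOVA: between-group variance
    divided by within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)

    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical task times (seconds) for 4 navigation approaches
navigation_times = [
    [41, 42, 43],
    [42, 43, 44],
    [43, 44, 45],
    [44, 45, 46],
]

f_ratio, df_between, df_within = one_way_anova(navigation_times)
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

A large F means the group means differ by more than the spread inside each group would suggest; it does not say which groups differ, so a follow-up comparison is still needed.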
Homework
1. Make your own Kickstarter Project Page
Make a link on your blog, and share the preview link. It should:
- Contain a project concept video
- Set the project funding goal
- Set a reward scheme
2. Recruit participants & Gather Concept Test Data
Your Team Blog Post #4
- Questionnaire example: http://goo.gl/forms/tucU34LKNI
- Quantitative measures
- Qualitative measures
- Deduce the key experience features you should test in a formative evaluation.
3. Complete Exercise 10-2
Your Blog Post #11
- Make your own UX Target Table
- Example: Table 10-8 Choosing UX metrics for UX measures
Submission Due: 11:59 pm Fri. 25th November