IEEM 552 - Human-Computer Systems Dr. Vincent Duffy - IEEM Week 7 - Hazards in HCI March 16, 1999 ...

Page 1: IEEM 552 - Human-Computer Systems Dr. Vincent Duffy - IEEM Week 7 - Hazards in HCI March 16, 1999  ieem.ust.hk/dfaculty/duffy/552 email: vduffy@uxmail.ust.hk.

IEEM 552 - Human-Computer Systems

Dr. Vincent Duffy - IEEM

Week 7 - Hazards in HCI
March 16, 1999

http://www-ieem.ust.hk/dfaculty/duffy/552

email: vduffy@uxmail.ust.hk

Page 2:

For today
1. Further discussion/review of the summary of results you submitted
– based on the week 3 in-class exercise ‘an example’
2. Hazards to conducting and interpreting HCI Experiments
3. Brief discussion - Pictogram, Miller & Stanney
4. Brief discussion about exam 1

Page 3:

A test of 2 interfaces - Which interface is better?
Self rating of Expertise for Library Online Searches
Human Subjects Consent Form
Data sheet for data collector
Data Sheet for Subject
Introduction
– subjects will leave the room

Page 4:

Is it enough to ask which is better?
What do I expect from the results/data?
What are the hypotheses?
– What do I think is true about the system before I start?
– What questions am I trying to answer with the data/analyses?
– H.1. www search is faster.
– H.2. More errors using www search.
– H.3. More data can be found by www.

Page 5:

Self-rating and consent form

Page 6:

Group # _____ Name member _______
Please rate your experience using the Library search databases at UST or other universities.
Least experience  1  2  3  4  5  Most experience

Page 7:

Data sheet for subject

Page 8:

Data sheet for subject
Group no./name _________________
The test administrator will show you which interface to use first.
Please do all 3 of the following tasks. Do not stop until all 3 tasks are completed.
Interface 1
1. Please write the 'call number' of the book titled 'user interface design' by Eberts.
Call no. ________________
2. Please write the call number for a video about the Wright Brothers. We do not know the title. However, it is less than 30 minutes long (duration).
Call no. ______________
3. Please locate/find as many Visual C++ books as you can in less than 5 minutes.
Number of Visual C++ books found _____________

Page 9:

Data/instructions for data collector

Page 10:

Before the experiment - subjects out of the room
Group #____ Data sheet for data collector
1. record name of data collector __________
2. the data collector will need to record time (by watch, clock, computer, etc.)
3. be sure subject signs human subjects consent form
4. give instruction sheet, allow 1 minute for reading and one minute for questions.
5. Show the subject how to start library online systems. Count time beginning when the subject double clicks the correct icon (the two interfaces to be tested are www or telnet/dos).
– odd numbered groups (eg. 1,3,5) should begin with www
– even numbered groups (eg. 2,4,6) should begin with the telnet/dos interface
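The starting-interface rule in step 5 alternates by group number, so that neither interface is always the one tested first. A minimal Python sketch of that rule is given here; the function names `first_interface` and `interface_order` and the string labels are illustrative choices, not part of the course materials.

```python
def first_interface(group_no: int) -> str:
    """Return the interface a group is tested on first.

    Odd-numbered groups start with www, even-numbered groups with
    telnet/dos, following the data-collector sheet.
    """
    if group_no < 1:
        raise ValueError("group numbers start at 1")
    return "www" if group_no % 2 == 1 else "telnet/dos"


def interface_order(group_no: int) -> list[str]:
    """Return both interfaces in the order this group should use them."""
    first = first_interface(group_no)
    second = "telnet/dos" if first == "www" else "www"
    return [first, second]
```

With six groups, groups 1, 3 and 5 would begin with www and groups 2, 4 and 6 with telnet/dos, which balances the two orders across groups.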

Page 11:

During the experiment
collect 14 pieces of data - 7 pieces for each of two interfaces
1. subject name/group no. _______________
2. www interface (1) or telnet/dos interface (2) _______________
3. time to find item 1 (call number of 'user interface design' by Eberts). _______________
4. time to find item 2 (a video about Wright Brothers - less than 30 minutes video) ______________ (begin counting time immediately after finding item 1)
5. number of errors in finding item 2 (count error as any back, prev. record, start over, etc.) ________
6. time to find (how many books can you find on Visual C++ in less than 5 minutes). _____________
7. quantity of Visual C++ books ________________
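The seven items recorded per interface map naturally onto one record per subject and interface. The sketch below is one possible layout, not the form used in class: the field names, the choice of seconds as the time unit, and the `ERROR_ACTIONS` set are assumptions (the sheet ends its list of error actions with "etc.", so the set is necessarily incomplete).

```python
from dataclasses import dataclass


@dataclass
class InterfaceRecord:
    """One subject's seven data items for one interface."""
    subject: str          # item 1: subject name/group no.
    interface: int        # item 2: 1 = www, 2 = telnet/dos
    time_item1_s: float   # item 3: time to find the Eberts call number
    time_item2_s: float   # item 4: time to find the Wright Brothers video
    errors_item2: int     # item 5: errors while finding item 2
    time_books_s: float   # item 6: time spent on the Visual C++ search
    books_found: int      # item 7: quantity of Visual C++ books


# Assumed, incomplete list of navigation actions that count as an error.
ERROR_ACTIONS = {"back", "prev. record", "start over"}


def count_errors(actions):
    """Count how many logged actions fall in the assumed error set."""
    return sum(1 for action in actions if action in ERROR_ACTIONS)
```

Two such records per subject, one for each interface, give the 14 pieces of data the sheet asks for.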

Page 12:

After collecting the data
Use sample experimental data (previously collected)
upload to www, download so accessible to you, run analyses
– using SAS - Statistical Analysis System
how to compare?
– Simple test of difference in means - we used T-test (comparing only 2 variables)
– discuss hypotheses
– for hw: asked you to interpret the output

Page 13:

For assumptions of analyses/output
– hint: see Chapter 6 Cody and Smith (p.138-149)
How do I determine if Hypothesis 1-3 are supported?
– H.1. www search is faster.
– H.2. More errors using www search.
– H.3. More data can be found by www.

Page 14:

Sample SAS program

Page 15:

H.1. www search is faster.
For our hypothesis we want to check difference in means.
First check if variances equal to help decide which p values to use (find p>F’ - prob. that we reject Ho incorrectly? if p<.05 reject Ho)
Either way, look at p>|T|, probability that we reject Ho incorrectly.
If p<.05, reject Ho for the T-test, for which Ho says ‘means are same’
– if you reject, then conclude - means are statistically different
For time for task 1, means are not statistically different.
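The procedure above reads three quantities from the output: an F' ratio for equality of variances, and a T value for each of the equal-variance and unequal-variance cases. The following Python sketch computes those statistics with the standard library only, assuming the usual textbook formulas (folded F as larger variance over smaller, pooled t, and the Welch/Satterthwaite approximation for unequal variances). It does not compute the p-values, which need the F and t distributions. The function name and the sample times are hypothetical and are not the class data.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_summary(a, b):
    """Statistics behind a two-sample t-test report for groups a and b."""
    n1, n2 = len(a), len(b)
    m1, m2 = mean(a), mean(b)
    v1, v2 = variance(a), variance(b)  # sample variances (n - 1 denominator)

    # Folded F: larger variance over smaller, used to judge whether the
    # equal-variance assumption is plausible. Fails if a variance is zero.
    f_folded = max(v1, v2) / min(v1, v2)

    # Equal-variance (pooled) t statistic.
    df_equal = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df_equal
    t_equal = (m1 - m2) / sqrt(pooled * (1 / n1 + 1 / n2))

    # Unequal-variance t statistic with approximate degrees of freedom.
    se2 = v1 / n1 + v2 / n2
    t_unequal = (m1 - m2) / sqrt(se2)
    df_unequal = se2 ** 2 / (
        (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
    )

    return {
        "f_folded": f_folded,
        "t_equal": t_equal,
        "df_equal": df_equal,
        "t_unequal": t_unequal,
        "df_unequal": df_unequal,
    }


# Hypothetical task-1 times in seconds, NOT the data collected in class.
www_times = [42.0, 55.0, 61.0, 48.0, 70.0]
telnet_times = [50.0, 66.0, 58.0, 73.0, 64.0]
summary = two_sample_summary(www_times, telnet_times)
```

When the two sample variances are close, the two T values nearly coincide, which is why the variance check mostly matters when F' is large.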

Page 16:

How do these results influence conclusions about H1?

Page 17:

H.2. More errors using www search.
Suppose your results looked like this….
check difference in means
First check if variances equal to help decide which p values to use (find p>F’ - prob. that we reject Ho incorrectly? if p<.05 reject Ho)
Either way, look at p>|T|, probability that we reject Ho incorrectly.
If p<.05, reject Ho for the T-test, for which Ho says ‘means are same’
– if you reject then conclude - means are statistically different
For number of errors, means are not statistically different.

Page 18:

However, this was your data...
What do you conclude about H2?

Page 19:

H.3. More data can be found by www.
check difference in means
First check if variances equal: p=.004, reject Ho (that variances are equal)
– use the information to decide which p value to observe for the T-test
In this case, look at p>|T|, probability that we reject Ho incorrectly.
– For ‘unequal variances’, to help decide whether to reject Ho for the T-test, which says ‘means are equal’
p=.356, do not reject Ho for the T-test - ‘means are same’
– if you reject then conclude - means are statistically different
For quantity of books found, means are not statistically different.
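The two-step reading of the output for H.3 can be written as a pair of small decision functions. This is only a sketch of the rule described on the slide, using the two p-values quoted there (.004 for equality of variances, .356 for the unequal-variances T-test); the function names and return strings are illustrative.

```python
ALPHA = 0.05  # significance level used throughout the slides


def variance_row(p_f: float, alpha: float = ALPHA) -> str:
    """Choose which T-test row to read, given the p-value for equal variances."""
    return "unequal" if p_f < alpha else "equal"


def means_decision(p_t: float, alpha: float = ALPHA) -> str:
    """Decide about Ho ('means are same') from the chosen T-test p-value."""
    return "reject Ho: means differ" if p_t < alpha else "do not reject Ho"


row = variance_row(0.004)         # variances differ, so read the unequal row
decision = means_decision(0.356)  # p > .05, so Ho is not rejected
```

Failing to reject Ho here means only that no difference in the quantity of books found was detected, not that the two interfaces are shown to be the same.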

Page 20:

None of our 3 hypotheses were fully supported
Does this mean we were incorrect from the start?
– WWW is no better than the dos/telnet based system?
Does it mean www is
– not faster,
– not less error prone,
– not more likely to allow you to find more information?
What might have gone wrong?

Page 21:

Hazards to conducting & interpreting HCI experiments
To be avoided
– when conducting experiments
To be noticed
– when reading experiments of other people
– to see if the methodology or interpretation of results invalidates some conclusions
– Sheil (1981) found large % of studies had some methodology problem which made results suspect

Page 22:

What is wrong with the following? (please submit a separate sheet with your answers)
Q1. Hypothesis states/asks: ‘Is this new interface effective?’
Q2a. New interface is compared to old interface. The subjects tested using the new design also have used the old design.
Q2b. Subjects for new improved design treated more enthusiastically (or more quiet room)
Q3. Software manufacturer tests financial planning software on its employees (mostly programmers)
Q4. Two experiments show the same mean difference between interface measures, but the difference is statistically significant in one experiment and not in the other
Q5. One person administers a test to 10 subjects for one interface test condition (treatment). A different person administers the test to 11 subjects for the other.
Q6. Suppose a correlation (R=.55) showing a significant relationship (p<.05) is found between ‘percent correct’ and ‘frequency of use’ of help menus. Also suppose a correlation is found betwn. likert scale (1-7 scale) variable & ‘frequency of use’ of help menus. Which are you more likely to use for drawing conclusions?
Q7. An experimenter finds that there is no statistical difference between measured variables of the old and new designs. He concludes that the two are the same. Marketing of the new design is halted.
Q8. A vendor is trying to sell your software company a computer programming tool that was found to reduce programming time by 50%. You are told you should expect 50% reduction in software development time. The product was previously tested on novices.

Page 23:

Q.1. What is wrong with this?
Q. Is this new interface effective?
– what is meant by effective?
– should this mean faster or fewer errors?
– should this mean people prefer this one?
– for whom? expert or novice?
– effective compared to what? 2 different designs? some standard?
– evaluations must be made for two or more treatment conditions
a better question/hypothesis
– can the new interface be used with less assistance?

Page 24:

Hazard 1 - Question phrased improperly
What’s the big deal?
– experimenter may discover certain measures for which data should have been collected
– too late
How to avoid it
– planning, behind the scenes work
– conduct a pilot test on a small number of subjects
– understanding the underlying theories related to the independent variables or the dependent (performance) measures

Page 25:

Q.2. What is wrong with this?
Q.2.a. New interface is compared to old interface. The subjects tested using the new design also have used the old design.
– important variable not controlled
– subjects have prior experience (training)
Q.2.b. subjects for new improved design treated more enthusiastically (or more quiet room)
– treatment of subjects varies w/level of ind.var.

Page 26:

Hazard 2 - Important variables not controlled
What is the big deal?
– uncontrolled variable (confounding) can simulate or counteract (eliminate detection) of a treatment effect
How to eliminate or minimize this?
– list all the variables that might influence
– control each variable through randomization, hold constant or eliminate (variability), manipulate it
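Randomization, the first control listed above, can be as simple as shuffling the subject list before assigning conditions. The sketch below randomly assigns which interface each subject sees first while keeping the two orders balanced; the function name, subject labels, and seed are hypothetical.

```python
import random


def randomize_first_interface(subjects, seed=None):
    """Randomly assign the first interface, balanced across subjects.

    Half of the shuffled subjects (rounded down) start with www and the
    rest with telnet/dos, so order is not tied to who signed up first.
    """
    rng = random.Random(seed)  # seeded so the assignment can be reproduced
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        subject: ("www" if index < half else "telnet/dos")
        for index, subject in enumerate(shuffled)
    }


# Hypothetical subject labels.
plan = randomize_first_interface(["s1", "s2", "s3", "s4", "s5", "s6"], seed=552)
```

Unlike the fixed odd/even rule, this leaves the assignment to chance, so an unlisted variable (for example, more experienced subjects tending to volunteer first) is less likely to line up with one interface order.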

Page 27:

Consider ‘an example’ Hyp. 2 & ‘which came first’?
For task 2, ‘find wright brothers video’,
– time to complete and errors significantly higher for the first interface (is it because of improvement (2nd time doing the task) or because www is easier to use? Do you know?)

Page 28:

Q.3 What is wrong with this?
Software manufacturer tests financial planning software on its employees (mostly programmers)
– inappropriate sample used
– tested mostly experts when software was designed for computer novices
– mixed group of novice and expert employees

Page 29:

Hazard 3 - Inappropriate sample used
What’s the big deal?
– results can be misleading if they are generalized to the wrong group of users
How to avoid?
– try to demonstrate that subjects have been stabilized at some performance level
– or report honestly that subjects may not have been allowed sufficient time to become proficient at task (so as to stabilize level)

Page 30:

Q.4. What is wrong with this?
Two experiments show the same mean difference between interface measures, but the difference is statistically significant in one experiment and not in the other
– not enough subjects are used
– What is meant by statistically significant?
– usually set at p<.05 (probability you reject null incorrectly, ex: null: no difference)

Page 31:

Hazard 4 - Not enough subjects used
Why does this commonly happen?
– finding appropriate sample, that is large enough, is difficult sometimes
– in practice, number of subjects often determined by number available or based on reports of previous studies
How to avoid?
– choose larger samples (more expensive)
– consider trade-offs: sample size, variability, size of potential effect (to be measured)
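The trade-off among sample size, variability, and effect size can be made concrete with the standard normal-approximation formula for comparing two means, n per group = 2((z_(1-alpha/2) + z_power) * sd / effect)^2. The sketch below assumes that formula, a two-sided test, and a power target of 0.80, none of which come from the slides; the approximation runs slightly low for small samples compared with a t-based calculation.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a mean difference.

    effect: smallest difference in means worth detecting
    sd:     assumed common standard deviation of the measure
    """
    if effect <= 0 or sd <= 0:
        raise ValueError("effect and sd must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / effect) ** 2)


# Hypothetical numbers: detect a 20 s difference when times vary with sd 40 s.
needed = subjects_per_group(effect=20.0, sd=40.0)
```

Halving the effect to be detected, or doubling the variability, roughly quadruples the number of subjects required, which is why two experiments with the same mean difference can disagree on significance.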

Page 32:

Q.5. What is wrong with this?
One person administers a test to 10 subjects for one interface test condition (treatment). A different person administers the test to 11 subjects for the other.
– what if the two experimenters (people administering the test) conduct the experiment differently?
– test administered improperly
– two different people should not administer the test in this manner

Page 33:

Hazard 5 - Test administered improperly - experimental studies
What’s the big deal?
– different distractions or test conditions can influence the results, or increase the variability making actual differences difficult to detect (or differences may be due to sloppy test conditions)
How to avoid?
– Test the interface to eliminate bugs, stabilize the experimental/room, and general test conditions
How is this different for field studies?

Page 34:

Q.6. What is wrong with this?
a correlation (R=.55) showing a significant relationship (p<.05) is found between ‘percent correct’ and ‘frequency of use’ of help menus
correlation is found betwn. likert scale (1-7 scale) variable & ‘frequency of use’ of help menus
– the first example violates the assumptions of the method of analysis used (percent correct not usually normally distributed - likert scales are).
– parametric statistics assume normality and homogeneity of variances

Page 35:

Suppose our T-test shows a significant difference in mean time to complete task 2 between the interfaces. Can we safely conclude that our hypothesis is supported?
[Figure: histogram of Timtsk2 (1st interface tested); frequency axis 0-7; bins <100s, 100<x<200, >200]
What kind of distribution is shown by this data?
Normal, uniform?
What are the assumptions of the statistics we used (eg. T-test)?
Normality
If the data is not normally distributed, you can not use the statistics that require normality as a basic assumption (correlation, t-test, anova, etc.).

Page 36:

Hazard 6 - Improper analysis used
What’s the big deal?
– it can invalidate the results of the experiment
How to avoid?
– test the data - distributions of variables should be normal and should have equality of variances - for multi-variate stats like regressions
– if necessary, use a different method of analysis (non-parametric - not as robust) or transform data
– be sure that the data meets the assumptions before running the analysis otherwise you waste your time
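The slide does not name a particular non-parametric method. As one illustration, the Mann-Whitney U statistic (also known as the Wilcoxon rank-sum test) is a common rank-based alternative to the two-sample t-test that does not assume normality. The sketch below computes only the U statistics by direct pairwise comparison; turning U into a p-value requires tables or a statistics package, and the function name is an illustrative choice.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (x from a, y from b) with x > y, and each tie
    counts as one half. U_a + U_b always equals len(a) * len(b).
    """
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(a) * len(b) - u_a
    return u_a, u_b
```

A U value near zero for one group means almost all of its observations are smaller than the other group's; two values near half of len(a) * len(b) mean the samples overlap heavily.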

Page 37:

Q.7. What is wrong with this?
An experimenter finds that there is no statistical difference between measured variables of the old and new designs. He concludes that the two are the same. Marketing of the new design is halted.
– if you can not reject the null hypothesis (no difference), that does not prove it
– it only shows that you could not prove a difference
– it may still exist. how?

Page 38:

Hazard 7 - Null effects interpreted incorrectly
examples of some things that may make a difference more difficult to detect
– Hazard 2. a confound may have occurred
– Hazard 4. not enough subjects to detect a difference
– Hazard 5. treatments administered poorly causing high variability in the conditions
– Hazard 6. was wrong statistical test conducted?
– your measure may not be sensitive enough to detect a difference

Page 39:

Q.8. What is wrong with this?
a vendor is trying to sell your software company a computer programming tool that was found to reduce programming time by 50%. You are told you should expect 50% reduction in software development time. The product was previously tested on novices.
– software developers are likely not novices, so it is difficult to know what to expect.

Page 40:

Hazard 8 - Results generalized beyond conditions tested
What’s the big deal?
– can mislead readers.
– we can be misled if we only read the abstract and conclusion
How to avoid?
– be careful not to generalize the results beyond the sample & conditions tested
– your results may lend evidence, but further testing may be needed to confirm

Page 41:

For week 8 - Exam details
Old exam - on web page
closed book format in class
– 100 points
– 65% lecture notes, 3 videos, 2 cases & demo
– 35% integrating concepts with the research papers & the class trip
– week 1-7, lectures 1-5 & demos.
– Background reading: Chapter 1,3 Eberts; Cody & Smith, Ch. 6 (p.138-146), 3 journal papers - ‘Thinking Aloud’ and ‘Task complexity’ and ‘Pictogram’, 2 cases.