Cluster analysis as a technique to guide interface design

Int. J. Man-Machine Studies (1991) 35, 251-265


SCOTT LEWIS

AT&T Bell Laboratories, Crawfords Corner Rd, Holmdel, NJ 07733, USA

(Received 3 July 1989 and accepted in revised form 18 January 1990)

One difficulty with designing an effective user interface for a hardware or software system is that the designer frequently does not have specific information about the user's model of the system. In this paper, a methodology to address this problem is demonstrated. The method is presented as a tool to help constrain the choices that the interface designer can make in the construction of a usable human-system interface. The method involves the use of a specific cluster analysis technique to characterize the models of a domain held by one or more user groups. I discuss how the specific information gained from this technique may be applied to help guide interface construction.

Introduction

Recently, researchers have attempted to understand a user group's representation of a domain through the application of multidimensional scaling (MDS) and cluster analysis (CA). Most of these studies have been concerned with the use of these techniques for helping to structure command menus (Card, 1982; McDonald, Stone, Liebelt & Karat, 1982; McDonald, Stone & Liebelt, 1983; Tullis, 1985) or information-search menus (Hollands & Merikle, 1987). These techniques seem to show promise in their ability to provide useful summary information about how a user group perceives the relationships among a set of items. Designers may use this information to help structure menus to correspond to the user's model of the system. Indeed, empirical studies have shown a correspondence between the use of these techniques to guide menu organization and menu navigation efficiency (Liebelt, McDonald, Stone & Karat, 1982; Hollands & Merikle, 1987).

One misleading aspect of the above studies is that they imply that these methodologies are limited to determining menu organization. Indeed, many of the research investigations of the application of CA and MDS as design methodologies have been in the domain of menu organization (there are exceptions, however; e.g. Tullis, Sperling & Steinberg, 1986). Although clearly effective as a way to guide the organization of command and information-search menus, these techniques can readily be generalized to provide constraints on software interface designs other than menu organization. They may just as readily be used to constrain decisions about the design of an interface for a hardware system (e.g. front panel spatial organization).

The purpose of this paper is to present a brief empirical investigation of the usefulness of a specific clustering technique. This technique is presented as a way to gain insight into how users represent a system and the implications of the representation for the construction of a user interface. In three experiments the technique is applied to understand the way that three diverse user groups represent the functions of an oscilloscope. I show that specific design constraints are derivable from a straightforward analysis of a data set from 11 users. I also show that the method is capable of characterizing useful differences and similarities between user groups in their model of a domain.

0020-7373/91/020251+15$03.00/0 © 1991 Academic Press Limited

Clique Optimization

The clustering technique used in the experiments below is called Clique Optimization (CLOPT).† Before presenting the experiments, it would be useful to briefly introduce CLOPT and give a basic explanation of its operation. The description here is necessarily incomplete, and the curious reader may consult Sriram (1990) for a formal description of the method and Sriram and Lewis (unpublished data) for an empirical comparison of the method with other clustering procedures. For an introduction to CA techniques in general, the reader may consult Everitt (1980).

The input to CLOPT is a lower-half triangular matrix representing some measure of the similarity between each of the n*(n - 1)/2 pairs of n objects. Although the question of how similarity between objects should be measured can be a difficult one for the researcher (Everitt, 1980), the method is indifferent to these considerations as it assumes some measure of similarity as input.

Like the well-known Hierarchical Clustering (HCS) algorithms (Johnson, 1967), CLOPT is a method which generates a hierarchical structure to represent similarity data. Clique Optimization, however, does not generate the hierarchical structure by a simple series of agglomerative steps as HCS does. Clique Optimization first uses a heuristic search (described below) to find a single, optimal partition of the n objects. The method then alternates between divisive and agglomerative steps to generate a complete hierarchy. The ability to find a single optimal partition is a useful feature for a clustering technique because often one is only interested in a single, unambiguous partition of n objects. In fact, the experiments presented below focus on the optimal partition produced by the method and the design information it can reveal.

The optimal partition is composed of several disjoint clusters, each containing one or more of the original n objects. The number of clusters in the partition may range from 2 to n - 1. An index of the model's quality of fit is provided by the squared Pearson correlation between the model and the data. This may be understood as the proportion of variance in the data accounted for by the model.
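The fit index can be illustrated with a short sketch. This is not the author's implementation: it assumes that "the model" is the binary co-membership matrix of the partition (1 where two objects share a cluster, 0 otherwise) and that the correlation is taken over the n(n - 1)/2 lower-triangle pairs. The function names are hypothetical.

```python
import numpy as np

def partition_matrix(clusters, n):
    """n x n binary matrix with 1 where two objects share a cluster."""
    labels = np.empty(n, dtype=int)
    for c, members in enumerate(clusters):
        labels[list(members)] = c
    return (labels[:, None] == labels[None, :]).astype(float)

def squared_fit(sim, clusters):
    """Squared Pearson correlation between a partition and similarity data.

    Assumption: only the lower-triangle pairs (i > j) enter the correlation.
    """
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    rows, cols = np.tril_indices(n, k=-1)
    model = partition_matrix(clusters, n)[rows, cols]
    data = sim[rows, cols]
    r = np.corrcoef(model, data)[0, 1]
    return r ** 2
```

Under this reading, a partition with a single cluster or with n singleton clusters gives a constant model vector, for which the correlation is undefined. That is consistent with the stated range of 2 to n - 1 clusters.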

The method used to find the optimal partition is a heuristic search because only a subset of the possible partitions is examined.‡ The heuristic search consists of two steps. First, a similarity threshold is applied to decide if the similarity between pairs of objects is high enough to consider the pair of objects "connected" to one another. This operation abstracts a number of disjoint groups (not yet considered clusters) of objects which are related to one another at some level greater than or equal to the similarity threshold.

† An implementation of the CLOPT algorithm in Pascal or C is available from the author.

‡ An exhaustive search of all possible partitions is infeasible because the number of partitions grows exponentially with respect to the number of objects.


Second, a resemblance threshold is applied to decide if each object in a group has a high enough family resemblance to other objects in its group to warrant it becoming a member of the cluster. The family resemblance of a particular object is defined as the proportion of connections that it has to members in its group relative to the total possible number of connections to members in its group. Only the objects which have a family resemblance above the resemblance threshold constitute the members of a cluster. Those not put into a cluster are returned to the pool of objects yet to be assigned to clusters. Clusters are extracted until all of the objects have been assigned to a cluster.
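The two thresholding steps can be sketched as follows. This is only one possible reading of the description, not the CLOPT implementation (see Sriram, 1990, for the formal method). Several details are assumptions made here for illustration: groups are taken to be connected components of the thresholded graph, the largest remaining group is processed first, an object alone in its group is given a family resemblance of 1, and a fallback keeps the best-connected object when no object passes the resemblance threshold. All function names are hypothetical.

```python
import numpy as np

def connected_groups(adj, pool):
    """Connected components of the thresholded graph, restricted to `pool`."""
    pool = set(pool)
    groups = []
    while pool:
        start = pool.pop()
        group, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in list(pool):
                if adj[i, j]:
                    pool.remove(j)
                    group.add(j)
                    stack.append(j)
        groups.append(sorted(group))
    return groups

def family_resemblance(adj, obj, group):
    """Connections of `obj` within its group over the possible connections."""
    others = [j for j in group if j != obj]
    if not others:
        return 1.0  # assumption: a lone object fully resembles its group
    return sum(adj[obj, j] for j in others) / len(others)

def extract_partition(sim, sim_thresh, res_thresh):
    """Extract clusters until every object is assigned (sim must be symmetric)."""
    adj = np.asarray(sim) >= sim_thresh       # step 1: "connected" pairs
    pool = list(range(len(adj)))
    clusters = []
    while pool:
        # assumption: the largest remaining group is handled first
        group = max(connected_groups(adj, pool), key=len)
        # step 2: keep objects whose family resemblance passes the threshold
        members = [i for i in group
                   if family_resemblance(adj, i, group) >= res_thresh]
        if not members:
            # assumption: fall back to the single best-connected object
            members = [max(group, key=lambda i: family_resemblance(adj, i, group))]
        clusters.append(sorted(members))
        pool = [i for i in pool if i not in members]  # the rest return to the pool
    return clusters
```

For example, if objects 0, 1 and 2 are mutually connected and object 3 is connected only to object 0, a resemblance threshold of 0.6 yields the cluster [0, 1, 2] and returns object 3 to the pool, where it later becomes a singleton.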

To find the optimal partition, the method conducts a search in the two-dimensional parameter space defined by the two threshold values. That is, different values of the similarity threshold and resemblance threshold are iteratively tried, and for each combination the resulting partition is compared with the data. The optimal partition is the one that maximizes the squared correlation between the partition and the data.†
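The search over the two thresholds amounts to a grid search, sketched below under stated assumptions. The callables `partition_fn` (which builds a partition from the data and one pair of thresholds) and `fit_fn` (which scores a partition against the data) are hypothetical placeholders supplied by the caller, and the candidate threshold values are chosen by the caller. How CLOPT actually steps through the parameter space is not described in this paper.

```python
def search_thresholds(sim, partition_fn, fit_fn, sim_thresholds, res_thresholds):
    """Try every threshold pair and keep the best-fitting partition.

    partition_fn(sim, s, r) -> list of clusters   (hypothetical callable)
    fit_fn(sim, clusters)   -> fit score, e.g. a squared correlation
    Returns (best_score, best_s, best_r, best_clusters), or None if a grid is empty.
    """
    best = None
    for s in sim_thresholds:
        for r in res_thresholds:
            clusters = partition_fn(sim, s, r)
            score = fit_fn(sim, clusters)
            # a strict ">" keeps the first pair that reaches the best score
            if best is None or score > best[0]:
                best = (score, s, r, clusters)
    return best
```

Because only the partitions reachable from the supplied threshold pairs are scored, the result is optimal only relative to that grid, which matches the heuristic character of the search.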

Once an optimal partition is found, two useful summary measures are computed. First, for each cluster an average similarity is computed. This measure is defined as the average of the intra-cluster similarities. It reflects the tightness of the cluster and is useful as an indication of how closely related the items within a cluster are perceived (i.e. when a cluster's average similarity is high, the items in the cluster are more closely related to one another than when the average similarity is low).

Second, each item in a cluster may be assigned a value that reflects its prototypicality within a cluster. This measure is defined as the average of an item's similarities to the other items in its cluster. It reflects how central or typical an item is of its respective cluster. Both the prototypicality measure and the category average similarity measure are standardized to lie between 0 and 1.‡
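Both summary measures follow directly from their definitions, as the sketch below shows. It assumes a symmetric similarity matrix and assumes that standardization means dividing by the largest possible similarity (`max_sim`, for example the number of subjects when similarities are co-occurrence counts). The paper does not state how the standardization is done, so that choice and the function name are illustrative only.

```python
import numpy as np

def cluster_summaries(sim, clusters, max_sim):
    """Average similarity per cluster and prototypicality per item.

    sim      : symmetric n x n similarity matrix
    clusters : list of lists of item indices
    max_sim  : largest possible similarity (assumed scaling constant)
    """
    scaled = np.asarray(sim, dtype=float) / max_sim
    summaries = []
    for members in clusters:
        k = len(members)
        if k < 2:
            # measures are not reported for single-item clusters
            summaries.append({"members": list(members), "avg_sim": None, "proto": {}})
            continue
        sub = scaled[np.ix_(members, members)]
        off_diag = sub - np.diag(np.diag(sub))   # drop self-similarities
        proto = off_diag.sum(axis=1) / (k - 1)   # mean similarity to the other items
        summaries.append({
            "members": list(members),
            "avg_sim": float(off_diag.sum() / (k * (k - 1))),
            "proto": {m: float(p) for m, p in zip(members, proto)},
        })
    return summaries
```

With these definitions, a cluster's average similarity equals the mean of its items' prototypicalities.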

Experiment 1

METHOD

Overview

A group of oscilloscope users (all technicians with approximately the same work requirements and background of oscilloscope use) was presented with 69 cards. On each card, an atomic function of a modern digital oscilloscope was printed. Subjects were given a sorting task (Rosenberg & Kim, 1975). They were instructed to sort the functions into groups which "made sense to them" or that "belonged together". They were allowed to have any number of groups that they wished, with the restriction that their partition should have more than five groups and fewer than 69 groups.

† Since the partition search is heuristic, there are conditions that lead to sub-optimal solutions. In practice, however, these conditions are not restrictive. See Sriram and Lewis (unpublished data) for an investigation of the properties of the method.

‡ In the following studies the average similarity measure and the typicality measure are not given for clusters with only one item because an item's similarity to itself is considered to be a maximum.

Subjects

Eleven male subjects participated in the experiment. The subjects' work backgrounds were very similar; they were all technicians who were involved in testing and/or troubleshooting an oscilloscope in the penultimate stage of manufacturing. The average subject had more than 11 years of oscilloscope experience.

Stimulus materials

Atomic functions were chosen from the General Purpose Interface Bus (GPIB) command set for the Tektronix 2430A Digital Oscilloscope. The GPIB is a set of commands allowing remote operation of an oscilloscope. Atomic functions were generated from the GPIB specification by selecting a subset of functions (as described below) from the complete list.

To make the sorting task less tedious for subjects, the number of functions was limited to 69. This limitation required that a number of scope functions in the GPIB specification be excluded. The excluded functions were either (1) very esoteric or abstruse functions that would be required only under the most extreme conditions, or (2) queries of the system that normal use of the oscilloscope would not require (i.e. those that would only be made by a computer remotely controlling the oscilloscope). Thus 69 functions were selected that were representative of all the capabilities of the scope under common usage. The 69 functions used in all three experiments are listed in the Appendix. Each function was translated into a sentence fragment (phrased in active voice) and typed on a small card.

Procedure

Subjects were presented with the 69 cards (randomly ordered) and an instruction sheet. The instructions indicated that they were to "organize these cards according to your intuitions about which functions belong together". The only restrictions were that there be more than five groups and fewer than 69 groups in their partitions. Subjects were given as much time as they desired to complete this task, and most subjects took between 30 minutes and an hour to finish their sort.

RESULTS AND DISCUSSION

For each subject, a matrix of co-occurrences between all pairs of items was created by placing a 1 in the matrix if a pair of items was placed in the same group or a 0 if the pair was placed in separate groups. Thus, each subject's groupings were represented as a (lower-half) 69 x 69 binary co-occurrence matrix. These 11 subject matrices were summed to produce a single similarity matrix. The elements in this matrix varied from 0 (all subjects put this pair of items in separate groups) to 11 (all 11 subjects put this pair of items in the same group). This single similarity matrix served as the input to CLOPT.
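The construction of the similarity matrix can be sketched in a few lines. The representation of a sort as a list of groups of item indices, and the function names, are assumptions made for this illustration.

```python
import numpy as np

def cooccurrence(sort, n_items):
    """Binary n x n matrix for one subject's sort (a list of groups of item indices)."""
    m = np.zeros((n_items, n_items), dtype=int)
    for group in sort:
        idx = np.array(list(group), dtype=int)
        m[np.ix_(idx, idx)] = 1      # every pair within a group co-occurs
    return m

def similarity_matrix(sorts, n_items):
    """Sum the subjects' co-occurrence matrices and keep the lower-half triangle."""
    total = sum(cooccurrence(s, n_items) for s in sorts)
    return np.tril(total, k=-1)      # entries range from 0 to len(sorts)
```

For 69 items the lower-half triangle holds 69 x 68 / 2 = 2346 pairs, and with 11 subjects each entry lies between 0 and 11.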

The optimal partition produced by CLOPT for this data set is presented in Figure 1. The squared correlation between this partition and the data was 0.67, indicating that the partition was a reasonably good fit to the original data. Associated with each of the 11 non-single-item clusters is an average similarity value for the cluster (AVG SIM value) and a prototypicality value for each item in the cluster (Proto column). Labels for the clusters (which convey the basic meaning of the item groupings) are presented in the upper left corner of each cluster. The labels are only suggestive, but they provide a reasonable interpretation of the semantic content of the clusters.


Save and retrieve scope setup (AVG SIM 1.00)
    Items                                   Proto
    Save current scope setup                1.00
    Retrieve scope setup                    1.00

Cursor activities (AVG SIM 0.95)
    Items                                   Proto
    Turn off cursors                        0.97
    Select cursor mode                      0.97
    Position cursors horizontally           0.97
    Select active cursor                    0.97
    Position cursors vertically             0.97
    Turn on cursors                         0.97
    Select target wfm for cursor            0.82

Hardcopy (AVG SIM 0.91)
    Items                                   Proto
    Print waveform                          0.91
    Set print control params                0.91

TV trigger (AVG SIM 0.86)
    Items                                   Proto
    Enter TV-line number to trig            0.91
    Select TV signal interlaced             0.91
    Select TV signal non-inter              0.85
    Lock trigger level of comp vid          0.79

Customize appearance of display (AVG SIM 0.82)
    Items                                   Proto
    Adjust graticule intensity              0.89
    Adjust readout intensity                0.89
    Adjust display intensity                0.82
    Adjust intensified zone intensity       0.75
    Toggle readout display                  0.75

General trigger functions (AVG SIM 0.81)
    Items                                   Proto
    Choose trigger coupling                 0.89
    Adjust trigger level                    0.89
    Set trigger mode                        0.89
    Adjust trigger slope                    0.89
    Select trigger                          0.89
    Set trigger position                    0.89
    Choose trigger source                   0.89
    Adjust A-trigger holdoff                0.73
    Autoset trigger                         0.61
    Enter trigger word                      0.53

Vertical activities (AVG SIM 0.74)
    Items                                   Proto
    Adjust vertical size of select          0.82
    Select waveform for vertical            0.78
    Choose input coupling                   0.75
    Position waveform vertically            0.74
    Choose bandwidth limit                  0.67
    Select channel for display              0.67

Save and display waveforms (AVG SIM 0.61)
    Items                                   Proto
    Save reference waveform                 0.64
    Select reference waveform               0.64
    Display reference waveform              0.54

Traditional horizontal activities (AVG SIM 0.57)
    Items                                   Proto
    Move intensified zone                   0.63
    Change size of intensified zone         0.63
    Set delay time                          0.61
    Position B-sweep horizontally           0.60
    Expand B-sweep horizontally             0.60
    Position A-sweep horizontally           0.56
    Display A-sweep inten by B              0.56
    Expand A-sweep horizontally             0.56
    Display B-sweep                         0.56
    Display A-sweep                         0.53
    Display waveform with inten             0.51
    Display intensified zone only           0.51

Measurement activities and acquisition (AVG SIM 0.57)
    Items                                   Proto
    Set number of sweeps aver               0.58
    Select acquisition mode                 0.57
    Choose mode                             0.57
    Set params for wfm meas                 0.55
    Display auto wfm meas                   0.51
    Toggle smoothing                        0.51
    Perform math operations                 0.49
    Set number of any sweeps                0.48

What to do when in trouble (AVG SIM 0.52)
    Items                                   Proto
    Obtain help                             0.59
    Initialize                              0.50
    Toggle menu display                     0.45

Single-item clusters
    Set calibration of a channel
    Display XY
    Choose magnification factor
    Select clock function
    Choose numbering scheme
    Set delay by events
    Autosetup scope

FIGURE 1. Partition produced by CLOPT from technicians' sorts. Note that the clusters are ordered by their AVG SIM value and the items within a cluster are ordered by their Prototypicality.

The results from CLOPT illustrate that the grouping of items can provide interface designers with insight into the mental model that the user group holds about the domain of interest. For brevity, in the following discussion of results I will focus on a few salient features of the analysis and the usefulness of this information for supplying interface design constraints.

In the partition presented in Figure 1, there is a cluster that is characterized as "Traditional Horizontal Activities". Horizontal activities are functions of an oscilloscope having to do with the horizontal axis on the display, the axis usually associated with control of oscilloscope timing. There is an item, however, that was not included as a member of the cluster "Traditional Horizontal Activities" but that traditionally has a strong relationship to oscilloscope horizontal items. This is the item "Display XY". By all expectations, this item should have been a member of the cluster "Traditional Horizontal Activities" because of its functional similarity to the other cluster items. The fact that this function was not consistently placed with the other items by subjects suggests that this user group may not have understood the function or perceived it in an idiosyncratic way.

Indeed, this argument is supported by the fact that the "Traditional Horizontal Activities" cluster and the "Display XY" item merged at the next higher level in the hierarchy. That is, the partition one level up in the hierarchy captured a relationship between "Display XY" and the other "Traditional Horizontal Activities". Apparently, there was some recognition of the meaning of "Display XY", but the rated similarity was not high enough to have it be a member of the cluster "Traditional Horizontal Activities" in the optimal partition.

The number of groups that the analysis produces relative to the number of groups that the subjects use can be used to detect items about which there is disagreement. If the solution produces more groups than any one of the subjects, this indicates that there was substantial disagreement among subjects about which other items were associates. Consider a hypothetical example: if Subject 1 had placed item A with B, Subject 2 had placed item A with C, and Subject 3 had placed item A with D, etc., then item A's similarity to other items would tend to be low. Furthermore, because of this, item A is likely to be clustered (in the solution) only with itself (a singleton). As Figure 1 indicates, the method produced 18 distinct clusters. The most groups that any one subject had was only 15. This suggests that on some of the items there may have been considerable disagreement about the associates. The singleton clusters in the solution are obvious candidates for items which may not have been viewed consistently by the subject group.
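The check described in this paragraph can be expressed as a small helper. It is a sketch only: the function name and the data layout (a partition as a list of clusters, each subject's sort as a list of groups) are assumptions for illustration.

```python
def disagreement_report(partition, sorts):
    """Compare the solution's cluster count with the subjects' group counts.

    partition : list of clusters (lists of item indices) from the analysis
    sorts     : one list of groups per subject
    """
    max_groups = max(len(sort) for sort in sorts)
    singletons = [cluster[0] for cluster in partition if len(cluster) == 1]
    return {
        "n_clusters": len(partition),
        "max_subject_groups": max_groups,
        # more clusters than any subject used hints at disagreement
        "more_clusters_than_any_subject": len(partition) > max_groups,
        # singleton items are candidates for inconsistent interpretation
        "singletons": singletons,
    }
```

Applied to the technicians' data, such a report would show 18 clusters against a maximum of 15 groups for any one subject, together with the list of singleton items.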

In general, information about how aspects of an instrument's (or software application's) functionality are understood or misunderstood by a particular user group should prove useful to interface designers. During the process of interface design, questions often arise about the organization or grouping of functions in a command menu hierarchy or on a front panel. What is sometimes needed is a way to characterize how well the user group understands the set of functions. As shown above, this technique can help provide that information. Such information can give the human-interface designer guidance when making choices about, for example, the accessibility of functions in a menu hierarchy or the salience of a specific control on a front panel.

Notice that the average similarities of the non-singleton clusters range from 0.515 (for the "What to do when in trouble" cluster) to 0.948 (for the "Cursor Activities" cluster). The average similarity for a cluster indicates how strongly this subject sample tended to group these items together. The value for the "Cursor Activities" cluster indicates that there was considerable agreement among the subjects that these items constitute this cluster. By contrast, there is not nearly as much agreement among subjects about the composition of the "What to do when in trouble" cluster.

This index of the relative agreement among subjects about the composition of clusters may be useful for matching the structure of a developing user interface to a target population's expectations. That is, it would sometimes be helpful to know that a large proportion of users expect certain items to be grouped together (such as those in the "Cursor Activities" cluster), while a smaller proportion of users expect other items to be placed together (such as those in the "What to do when in trouble" cluster). For illustration, consider a situation in which design constraints, such as hardware configuration or a production deadline, require that the grouping of some system functions must deviate from "more natural" groupings (assumed to be known). Information about the relative strength of users' expectations about item groupings might aid decisions about which groupings to sacrifice and which to affirm in the interface design.

The prototypicality of each item in a cluster may also provide useful information to interface designers. In a sense, the items within a cluster with the highest prototypicality may capture more of the essence or meaning of the grouping than the other items. They are the most typical items of their grouping. For example, for the "Traditional Vertical Activities" cluster the item with the highest prototypicality is "Adjust vertical size of selected waveform". Knowledge that this item is the most typical item of this group aided in interpreting this group as "Traditional Vertical Activities".

Knowledge that some items are more or less central to a grouping may also guide the development of parts of a user interface. For instance, the clusterings provided by this method could be used to guide menu organization for a software or hardware system. Additionally, within each menu, the prototypicality of an item in a cluster could be used to suggest menu labels or item orderings. For example, a reasonable strategy would be to place those items with higher prototypicalities at the beginning of a menu list and those with low prototypicalities at the end. Thus the items perceived as most typical of a grouping would appear first in the menu.
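The ordering strategy is a simple sort by prototypicality, sketched here with the values reported in Figure 1 for the "What to do when in trouble" cluster. The function name is hypothetical.

```python
def order_menu(prototypicality):
    """Return item names ordered from most to least prototypical.

    prototypicality : dict mapping item name -> prototypicality value
    """
    return sorted(prototypicality, key=prototypicality.get, reverse=True)

trouble = {
    "Toggle menu display": 0.45,
    "Obtain help": 0.59,
    "Initialize": 0.50,
}
menu = order_menu(trouble)
```

Here `menu` lists "Obtain help" first, followed by "Initialize" and "Toggle menu display". The first item could also serve as a candidate for the menu label.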

Experiment 2

A potential criticism of Experiment 1 is that subjects' sortings may not reflect their "true" cognitive organization of the domain. That is, subjects may be using non-semantic aspects of the stimuli (such as the similarity in item wording) to sort the functions. This could have resulted in a solution which, although reliable, does not reflect the subject group's cognitive organization of oscilloscope functions. Experiment 2 was performed to investigate this possibility.

A simple way to test this criticism is to ask subjects who have no experience with oscilloscopes to sort these functions. Since this group of subjects has only non-semantic (syntactical, lexical) aspects of the stimuli on which to base their groupings, the groupings that they provide should reflect only the effects of the surface aspects of the stimuli, such as item wording. That is, it could be that the structures derived by CLOPT in Experiment 1 only reflect the idiosyncratic aspects of the stimuli and do not reflect any of the "true" semantic organization.

Actually, if the solutions obtained from technicians and naive subjects are similar, then two interpretations are possible. The first is the interpretation described above, namely that the technique is not capturing any of the real semantic organization. A second interpretation, however, is that the syntax and semantics are complementary. In other words, the wording of the items may precisely convey the appropriate semantic associations. Although clearly not very plausible, this would certainly be a desirable aspect of a human-system interface.


METHOD

Subjects

Seven females and four males participated in the experiment. None of these 11 subjects had any significant experience with oscilloscopes.

Procedure

As in Experiment 1, each subject was given the 69 cards and asked to sort the cards on whatever basis "seemed natural" or "made sense to them". Subjects were given as long as they required to accomplish this task.


The squared correlation between the optimal partition and the data was 0.52, indicating that the fit of the solution to the original data is not as good as that for the technicians. The number of clusters present in this solution (24) relative to the technicians' solution (18) is informative. It suggests that in general there was less agreement among these naive subjects about the structure of the clusters than among the technicians. The optimal partition generated by CLOPT is presented in Figure 2.

A few of the clusters present in this solution correspond quite closely with the groupings of the technicians from Experiment 1. For example, the "triggering" clusters are similar for both groups. There are, however, subtle differences which are illuminating and suggest that the groupings occurred for different reasons. For example, the "triggering" clusters are exactly the same between the two solutions except for the presence of the item "Lock trigger level of composite video back porch to 0 volts" in the naive subjects' solution. The technicians placed this item with the "TV Trigger" cluster. The presence of this item in the naive subjects' triggering group suggests that to form this group they were keying off of the word "trigger" in each of the items. The "Lock trigger level ..." item made it into this group because it had the requisite word. For the technicians, however, this item had additional (semantic) ties to other items (the other TV trigger items) and even the presence of the word "trigger" did not mean that it belonged with the other items that contained the word "trigger".

There are clusters in the naive subjects' partition which are clearly grouped on the basis of item wording. One example is the "Functions with the Word 'Set'" cluster. These items are very diverse in their meaning and are grouped in a completely different manner by the technicians in Experiment 1.

Notice that the average similarities for most of the clusters in the naive subjects' solution are relatively low. This indicates that the agreement about the organization of these items was generally lower for the naive subjects than for the technicians.

Clearly, the differences in the partitions suggest that (1) the naive subjects seemed to be making their groupings largely on the basis of lexical and syntactical aspects of the items, and (2) by contrast, the technicians' sortings reflect, at least in part, the use of more semantic criteria.

[FIGURE 2. Optimal partition produced by CLOPT from the naive subjects' sorts. The figure content could not be recovered from the source text.]

Experiment 3

The two previous experiments suggest that it is possible to get some insight into the differences between two user groups with respect to their understanding and organization of a domain. It was clear that the solutions from the two user groups examined in Experiments 1 and 2 reflect a difference in knowledge and/or organization of the functions of an oscilloscope.

In terms of the concerns of the user interface designer, this difference in the two user groups' organizations may suggest different expectations of an oscilloscope interface. Experiment 3 is an attempt to characterize a third user group and to look explicitly at the ways that CA could contribute to the refinement of an interface for a specific group of users.

For this experiment, oscilloscope designers performed the same task that was given to subjects in the previous experiments. Oscilloscope designers putatively have a deeper understanding of many of the main operations of an oscilloscope. It was expected that this would be reflected in their sortings and that the differences in the CLOPT solutions would provide information relevant to human-interface design considerations.

METHOD

Subjects

Eleven male subjects participated in the experiment. All of the subjects were senior engineers who had been or were currently involved in the design of oscilloscopes or were considered expert users of oscilloscopes by their peers.

The average subject had more than 23 years of experience with oscilloscope use. The subjects' backgrounds with oscilloscopes were extremely varied, however. Some subjects used oscilloscopes only occasionally, while others used the instruments daily.

Stimulus materials

The materials used were the same as those in Experiments 1 and 2.

Procedure

The procedure was the same as described in Experiment 1.


The names given to the clusters (and the AVG SIM values) for the designers' optimal partition appear in Figure 3. The squared correlation between the partition and the data was 0.61, suggesting that the fit of the model to the data is not as good as that for the technicians, but better than that for the naive subjects. The poorer fit (relative to the technicians) may be due to the fact that the experts' backgrounds are particularly diverse in comparison to the technicians'. This is especially true with regard to the nature and extent of the oscilloscope use for these two user groups.

The partition produced by the technicians in Experiment 1 is in many ways very similar to that produced by the designers. For example, several of the major groups present in the technicians' partition can reasonably be given the same names in the designers' partition (e.g. "TV-trigger", "General Trigger Functions", "Traditional Horizontal Activities", "Traditional Vertical Activities", "Cursor Activities", "Customize Appearance of Display", etc.). These clusterings may be capturing the two groups' shared cognitive organization of these oscilloscope functions.

    Cluster                                 AVG SIM
    Save and retrieve scope setup           1.00
    Cursor activities                       0.95
    TV trigger                              0.91
    Customize appearance of display         0.85
    General trigger functions               0.83
    Hardcopy                                0.82
    Setting sweep modes                     0.82
    Save and display waveforms              0.76
    What to do when in trouble              0.67
    Measurement activities                  0.67
    Traditional vertical activities         0.[illegible]
    Traditional horizontal activities       0.61

    Single-item clusters
    Select acquisition mode
    Choose magnification factor
    Toggle smoothing
    Enter trigger word
    Autosetup scope
    Select channel for display
    Choose mode
    Choose numbering scheme
    Set calibration of a channel
    Select clock function
    Adjust intensified zone intensity

FIGURE 3. Names of clusters for partition produced by CLOPT from designers' sorts.

It is useful to consider a more quantitative measure of the notion that the technicians' and the designers' partitions are relatively similar. A useful index of the relationship between two partitions is the simple element-wise correlation between the binary matrices that represent the partitions (see Hubert & Arabie, 1985, for a discussion of measures of partition similarity). The correlation matrix among the partitions of the user groups from the three experiments is presented in Table 1. Clearly the technicians' partition is more closely related to the designers' partition than to that of the naive subjects, as one would expect.
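The element-wise correlation between two partitions can be sketched as follows. The sketch assumes that each partition is represented by its binary co-membership matrix and that only the lower-triangle pairs are correlated. Whether the original analysis used the half or the full matrix is not stated, and the function names are hypothetical.

```python
import numpy as np

def comembership(clusters, n):
    """n x n binary matrix with 1 where two items share a cluster."""
    labels = np.empty(n, dtype=int)
    for c, members in enumerate(clusters):
        labels[list(members)] = c
    return (labels[:, None] == labels[None, :]).astype(float)

def partition_correlation(part_a, part_b, n):
    """Pearson correlation between two partitions of the same n items."""
    rows, cols = np.tril_indices(n, k=-1)   # each pair counted once
    x = comembership(part_a, n)[rows, cols]
    y = comembership(part_b, n)[rows, cols]
    return float(np.corrcoef(x, y)[0, 1])
```

Identical partitions give a correlation of 1, and partitions that agree on few pairs give values near or below 0.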

TABLE 1. Matrix of correlations among the partitions for the technicians, naive subjects and designers

            Technicians   Naive
Naive       0.56
Designers   0.81          0.54

Though the technicians' and the designers' partitions are fairly closely related, there are interesting differences. For example, the number of clusters is greater for the designers (23) than for the technicians (18). This indicates either greater disagreement among the designers about these functions or a consistent view of these functions as more differentiated. Additionally, the average similarities of the clusters were higher for the designers' clusters than for the technicians' clusters on many of the similar clusters (e.g. the AVG SIM for the "TV Trigger" cluster is 0.86 for the technicians and 0.91 for the designers). This may indicate a more thorough understanding of the relations among these functions for the designers (and thus greater inter-subject agreement) than for the technicians.
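To make the AVG SIM comparison concrete, the sketch below shows one plausible way such a value could be derived from sorting data. It rests on two assumptions that are not confirmed by the text here: that the similarity of two functions is the proportion of subjects who placed them in the same pile, and that a cluster's AVG SIM is the mean of those similarities over all pairs in the cluster. The function names and the three example sorts are hypothetical.

```python
from itertools import combinations


def sort_similarity(sorts):
    """Map each unordered pair to the proportion of subjects who sorted it together.

    `sorts` holds one entry per subject; each entry is a list of piles (sets of names).
    Pairs never sorted together are simply absent from the result.
    """
    counts = {}
    for piles in sorts:
        for pile in piles:
            for a, b in combinations(sorted(pile), 2):
                key = frozenset((a, b))
                counts[key] = counts.get(key, 0) + 1
    return {pair: count / len(sorts) for pair, count in counts.items()}


def average_similarity(cluster, similarity):
    """Mean pairwise similarity within a cluster (assumed meaning of AVG SIM)."""
    pairs = list(combinations(sorted(cluster), 2))
    if not pairs:
        raise ValueError("a cluster needs at least two items")
    return sum(similarity.get(frozenset(p), 0.0) for p in pairs) / len(pairs)


# Three hypothetical subjects sorting three functions from the Appendix.
sorts = [
    [{"Print waveform", "Set print control parameters"}, {"Obtain help"}],
    [{"Print waveform", "Set print control parameters", "Obtain help"}],
    [{"Print waveform"}, {"Set print control parameters", "Obtain help"}],
]
similarity = sort_similarity(sorts)
hardcopy_sim = average_similarity(
    {"Print waveform", "Set print control parameters"}, similarity
)
all_three_sim = average_similarity(
    {"Print waveform", "Set print control parameters", "Obtain help"}, similarity
)
```

Under these assumptions a higher value means that more subjects agreed the cluster's members belong together, which is the sense in which the comparison above is read as inter-subject agreement.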

One aspect of the designers' solution that may be of interest is the location of the "Display XY" function. Recall that this function was omitted from the "Traditional Horizontal Activities" cluster by the technicians in Experiment 1. The designers' partition included the "Display XY" function in the "Traditional Horizontal Activities" cluster. The fact that the designers did place this function in the expected cluster suggests that the technicians may not have clearly understood the "Display XY" function and therefore had less motivation to place this function with the "Traditional Horizontal Activities" cluster.

General discussion

The results from the three experiments support the notion that useful design information can result from the application of the CLOPT cluster analysis technique. The methodology seems capable of providing information about the user group's general mental model of the domain as well as information relevant to specific design issues. Below, I briefly review a few applications of this methodology.

Information about a single user group

First, as shown by Experiment 1, it is possible to address specific questions about a user group's understanding of a particular function or set of functions by using this methodology. For example, it appears from Experiments 1 and 3 that the technicians may have an incomplete understanding of the meaning of the function "Display XY". Certainly the technicians' model of this function differs from the designers'. With knowledge of this fact (and given that the misunderstanding is not an issue of nomenclature), the conscientious designer may pursue one of several options when designing an interface for this group. One option might be to re-evaluate the need for the function in the system. Alternatively, the function may be made relatively inaccessible by being placed deeper in a menu hierarchy to ensure that it does not become a confusing aspect of the interface. Regardless of the action, however, the key aspect of this example is that the design team has information about a user group's comprehension of a specific function (or functions) and may use this information to guide the design of the interface.

Second, CA results may clearly be used to help determine the organization of many aspects of the interface. One well known example of this is the use of CA and MDS to organize menus. Several researchers have used these techniques to compare the effects of menu organization on user performance in command menus (Card, 1982; McDonald et al., 1983) and in information-retrieval menus (Hollands & Merikle, 1987). The results of these studies generally suggest that categorical organizations (as opposed to random or alphabetical organizations) improve user performance in some respect, although the effects of organization depend upon the level of user expertise (Hollands & Merikle, 1987) and task (McDonald, Dayton & McDonald, 1988).

Clique Optimization is especially well suited to the generation of categorically organized menus. The technique generates a hierarchical partition of the objects like several other techniques; however, unlike many other clustering techniques, it provides information about an object's status within a cluster. That is, information about the object's typicality is provided. This information may be quite useful to interface designers for two reasons: (1) it could provide a guide to item ordering within a menu and (2) it allows easier interpretation of a cluster's meaning (the items that are more typical of the cluster are more central to the cluster's meaning), which may help resolve questions of how to name top level or intermediate level menus in a menu hierarchy.
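As an illustration of how typicality might guide item ordering within a menu, the sketch below ranks the members of one cluster. It does not reproduce CLOPT's own typicality measure, which is not specified here; it assumes, purely for illustration, that an item's typicality can be approximated by its mean similarity to the other members of its cluster. The function name `typicality_order` and the similarity values are hypothetical.

```python
def typicality_order(cluster, similarity):
    """Order cluster members from most to least typical.

    Typicality is approximated (an assumption, not CLOPT's definition) as an
    item's mean similarity to the other members. `similarity` maps a frozenset
    of two item names to a value in [0, 1]; missing pairs count as 0.
    """
    members = sorted(cluster)  # alphabetical base order, so ties stay alphabetical
    if len(members) < 2:
        return members

    def typicality(item):
        others = [m for m in members if m != item]
        total = sum(similarity.get(frozenset((item, m)), 0.0) for m in others)
        return total / len(others)

    return sorted(members, key=typicality, reverse=True)


# Hypothetical pairwise similarities for three trigger functions from the Appendix.
similarity = {
    frozenset(("Adjust trigger level", "Adjust trigger slope")): 0.9,
    frozenset(("Adjust trigger level", "Choose trigger source")): 0.8,
    frozenset(("Adjust trigger slope", "Choose trigger source")): 0.5,
}
menu_order = typicality_order(
    {"Adjust trigger level", "Adjust trigger slope", "Choose trigger source"},
    similarity,
)
```

With these made-up values, "Adjust trigger level" comes first, so it would head the menu and would also be the natural starting point when choosing a name for the cluster.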

Additionally, the results from CLOPT or other CA techniques may be useful for deciding upon spatial arrangements of a human-system interface. Previous investigations have found MDS useful for determining spatial layouts of facilities in a proposed space station (Tullis et al., 1986) and in a manufacturing plant (Nathan, 1984). In the domain of hardware design, perhaps CA will be useful for determining front panel layouts. Presumably, a front panel that was congruous with a user group's natural grouping would prove easier to use. Thus the results from CA and MDS analyses may prove useful for prescribing as well as proscribing certain front panel arrangements. For example, given the results of the analysis from Experiment 1, one may hesitate to have the control for the "Adjust trigger level" function placed near the control for the "Choose bandwidth limit" function. Conversely, the clustering would suggest that placing the "Adjust trigger level" function close to the "Select trigger" function would result in a natural grouping for oscilloscope technicians.

Multiple user groups

As demonstrated by the three experiments, these techniques allow one to characterize some of the domain specific differences among user groups. This type of information may be useful for answering marketing questions (e.g. choosing a user group for a product or modifying the design of a product for a particular user group) and for highlighting the differences between designers' models of the system and a target user group's model of the system. It is clear from the differences between the solutions from Experiment 1 and Experiment 3 that technicians have a somewhat different model of oscilloscope functions than do designers. This would certainly imply that an interface that was completely intuitive to oscilloscope designers may seem unintuitive or even obfuscated to a different user group due to their apparently different models of the system.

Also, one may use these techniques to track an individual's or a group's mental model development (Coury, 1984). Such information may be useful for understanding the learnability of an interface and for gaining insight into the aspects of the interface which are confusing or foster the development of inaccurate models of the system.

In summary, I have attempted to demonstrate the potential for the use of CLOPT as a human-system interface design tool through the presentation of three experiments. According to Gardiner and Christie (1987), one of the goals of a design support environment is to provide guidelines for specific design decisions. The method presented here should allow the designer to gain some insight into the mental model that one or more user groups hold of the domain of interest, and use that information to guide specific decisions about the human-interface development.

The author wishes to thank Gene Lynch, Sandra Grossmann, Steve Knox, Wayne Bailey, Robert Mauro and two anonymous reviewers for their insightful comments and criticisms on a previous version of this article.

References

CARD, S. K. (1982). User perceptual mechanisms in the search of computer command menus. In Proceedings of Human Factors in Computer Systems, pp. 190-196. New York: Association for Computing Machinery.

CARD, S. K., MORAN, T. P. & NEWELL, A. (1983). The Psychology of Human-Computer Interaction. Hillsdale, NJ: Lawrence Erlbaum Associates.

COURY, B. G. (1984). The development and evaluation of a methodology for assessing mental models of complex decision tasks. In Proceedings of the Human Factors Society 28th Annual Meeting, pp. 133-137. San Antonio, TX: Human Factors Society.

EVERITT, B. (1980). Cluster Analysis. New York: Halsted Press.

GARDINER, M. & CHRISTIE, B. (1987). Applying Cognitive Psychology to User Interface Design. Chichester, UK: John Wiley and Sons.

HOLLANDS, J. G. & MERIKLE, P. M. (1987). Menu organization and user expertise in information search tasks. Human Factors, 29, 577-586.

JOHNSON, S. C. (1967). Hierarchical clustering schemes. Psychometrika, 32, 241-254.

LIEBELT, L. S., McDONALD, J. E., STONE, J. D. & KARAT, J. (1982). The effect of organization on learning menu access. In Proceedings of the Human Factors Society 26th Annual Meeting, pp. 546-550. Seattle, WA: Human Factors Society.

McDONALD, J. E., DAYTON, T. & McDONALD, D. R. (1988). Adapting menu layout to tasks. International Journal of Man-Machine Studies, 28, 417-435.

McDONALD, J. E., STONE, J. D. & LIEBELT, L. S. (1983). Searching for items in menus: the effects of organization and type of target. In Proceedings of the Human Factors Society 27th Annual Meeting, pp. 834-837. Norfolk, VA: Human Factors Society.

McDONALD, J. E., STONE, J. D., LIEBELT, L. S. & KARAT, J. (1982). Evaluating a method for structuring the user-system interface. In Proceedings of the Human Factors Society 26th Annual Meeting, pp. 551-555. Seattle, WA: Human Factors Society.

NATHAN, J. (1984). The use of multidimensional scaling technique for layout analysis: an example. In Proceedings for the 16th Annual Meeting of the American Institute for Decision Sciences, pp. 670-672.

ROSENBERG, S. & KIM, M. (1975). The method of sorting as a data-gathering procedure in multivariate research. Multivariate Behavioral Research, 10, 489-502.

SRIRAM, N. (1990). Clique Optimization: A method to construct parsimonious ultrametric trees from similarity data. Journal of Classification, 7, 33-52.

TULLIS, T. S. (1985). Designing a menu interface to an operating system. In CHI '85 Conference Proceedings, San Francisco, CA, pp. 79-84.

TULLIS, T. S., SPERLING, B. B. & STEINBERG, A. L. (1986). The use of multidimensional scaling for facilities layout: An application to the design of the space station. In Proceedings of the Human Factors Society 30th Annual Meeting, pp. 38-42. Dayton, OH: Human Factors Society.

Appendix

Below are the 69 oscilloscope functions used as stimuli for this research.

(1) Enter TV-line number to trigger on
(2) Display A-sweep
(3) Move intensified zone
(4) Choose trigger coupling
(5) Set parameters for waveform measurements
(6) Select waveform for vertical size adjustment
(7) Position B-sweep horizontally
(8) Save reference waveform
(9) Turn off cursors
(10) Adjust trigger level
(11) Display reference waveform
(12) Select target waveform for cursors
(13) Choose numbering scheme for display (octal or hexadecimal)
(14) Save current scope setup
(15) Position A-sweep horizontally
(16) Adjust A-trigger holdoff
(17) Choose input coupling (AC, DC, or GND)
(18) Display waveform with intensified zone
(19) Choose mode (run or stop)
(20) Adjust graticule intensity
(21) Select TV signal non-interlaced coupling (FLD1 or TV-line)
(22) Select cursor mode (delta/absolute)
(23) Print waveform
(24) Set trigger mode (autolevel or normal)
(25) Toggle menu display (on/off)
(26) Set delay time
(27) Display XY
(28) Display automated waveform measurement results (risetime, peak-peak, distal, mesial, proximal, etc.)
(29) Set number of sweeps averaged before starting over
(30) Display A-sweep intensified by B-sweep
(31) Change size of intensified zone
(32) Set calibration of a channel
(33) Adjust display intensity
(34) Retrieve scope setup (previously saved)
(35) Select clock function (asynchronous, fall or rise)
(36) Expand A-sweep horizontally
(37) Adjust trigger slope
(38) Position cursors horizontally
(39) Choose bandwidth limit
(40) Display intensified zone intensity
(41) Adjust intensified zone intensity
(42) Initialize
(43) Select active cursor (cursor one or cursor two)
(44) Select acquisition mode (normal, envelope, average or equivalent time)
(45) Position cursors vertically
(46) Display B-sweep
(47) Choose magnification factor
(48) Select trigger (A or B)
(49) Set trigger position
(50) Set print control parameters
(51) Position waveform vertically
(52) Select reference waveform for adjustments
(53) Set delay by events
(54) Choose trigger source
(55) Lock trigger level of composite video back porch to 0 volts
(56) Turn on cursors (volts, slope or time)
(57) Toggle smoothing (on/off)
(58) Perform math operations on waveforms (add, multiply, invert, etc.)
(59) Adjust readout intensity
(60) Enter trigger word (bit pattern)
(61) Toggle readout display (on/off)
(62) Autosetup scope
(63) Adjust vertical size of selected waveform
(64) Expand B-sweep horizontally
(65) Select channel for display
(66) Obtain help
(67) Select TV signal interlaced coupling (FLD1, FLD2, ALT or TV-line)
(68) Autoset trigger
(69) Set number of envelope sweeps to be completed before resetting
