AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Post on 24-Jun-2015

Presents an innovative tool that supports design and evaluation of a Web site’s information architecture. A case study demonstrated that it can significantly reduce resources required to design information-rich applications. AutoCardSorter Video Demo: http://www.youtube.com/watch?v=ly_4GsOMWmU

Transcript of AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Christos Katsanos | ckatsanos@ece.upatras.gr

Nikolaos Tselios | nitse@ece.upatras.gr

Nikolaos Avouris | avouris@ece.upatras.gr

AutoCardSorter: Designing the Information Architecture of a Web Site Using Latent Semantic Analysis

ACM SIGCHI | Florence, Italy | 5-10 April, 2008

Purpose & Motivation

Automate Structural Design of Information Spaces

Increase efficiency and flexibility for practitioners

Why is it important?

Structural design greatly affects user experience

Current approaches (e.g. Card Sorting) are often neglected:

Time constraints

Cost to recruit users and run the studies

Increased complexity for data analysis

Challenging for large sites (>100 pages)

Our tool-based Methodology

Page Text Descriptions => Semantic Similarity Measure (e.g. LSA) => Semantic Similarity Matrix => Hierarchical Clustering Algorithms => Interactive Tree Structure

Additional Support:

1. Number of Groups

2. Cross-Hierarchy Links
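The clustering stage of this pipeline can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual implementation: the slides name "Hierarchical Clustering Algorithms" without saying which linkage is used, so average linkage is assumed here, and the function name `average_linkage` is hypothetical. The `n_groups` parameter plays the role of the "Number of Groups" support listed above.

```python
from itertools import combinations

def average_linkage(sim, labels, n_groups):
    """Merge the two most similar clusters until n_groups remain.

    Assumption: average linkage; the slides do not name a linkage rule.
    """
    clusters = [[i] for i in range(len(labels))]

    def cluster_sim(a, b):
        # Average pairwise similarity between the members of two clusters.
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_groups:
        # Pick the pair of clusters with the highest average similarity.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_sim(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(labels[i] for i in c) for c in clusters]

# Values copied from the first five-page example matrix shown later in the
# deck, mirrored into a full symmetric matrix for convenience.
pages = ["P1", "P2", "P3", "P4", "P5"]
sim = [
    [1.00, 0.94, 0.11, 0.33, 0.50],
    [0.94, 1.00, 0.33, 0.28, 0.83],
    [0.11, 0.33, 1.00, 0.11, 0.06],
    [0.33, 0.28, 0.11, 1.00, 0.06],
    [0.50, 0.83, 0.06, 0.06, 1.00],
]
groups = average_linkage(sim, pages, n_groups=3)
print(groups)
```

With three groups requested, P1, P2 and P5 end up together while P3 and P4 stay on their own. Recording the order of merges instead of stopping at `n_groups` would give the tree structure that the tool presents interactively.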

The tool Interface (1/2)

The tool Interface (2/2)

Validation Study Design

AutoCardSorter vs Card Sorting

Investigate quality of results & efficiency

Health & Nutrition Site

Same content item descriptions

18 representative users

Measures & Analysis

     P1     P2     P3     P4     P5
P1   -
P2   0.94   -
P3   0.11   0.33   -
P4   0.33   0.28   0.11   -
P5   0.50   0.83   0.06   0.06   -

     P1     P2     P3     P4     P5
P1   -
P2   0.62   -
P3   0.21   0.14   -
P4   0.49   0.51   0.83   -
P5   0.61   0.11   0.21   0.92   -

Validity

Similarity-Matrices Correlation

AutoCardSorter: LSA (P5, P1)

Card Sorting: frequency with which users placed P1 and P5 in the same pile
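As a rough sketch of this measure, the off-diagonal cells of the two matrices can be flattened and correlated. The slides report the result as r, so Pearson's coefficient is assumed here. The helper names are hypothetical, and the numbers are the illustrative five-page matrices from the slides, not the study data.

```python
import math

def lower_triangle(matrix):
    """Return the entries below the diagonal, row by row."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]

def pearson_r(xs, ys):
    """Pearson correlation of two equally long sequences (assumed measure)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative matrices from the slides, stored as ragged lower triangles
# (row i holds the similarities of page i to pages 0..i-1).
matrix_a = [[], [0.94], [0.11, 0.33], [0.33, 0.28, 0.11], [0.50, 0.83, 0.06, 0.06]]
matrix_b = [[], [0.62], [0.21, 0.14], [0.49, 0.51, 0.83], [0.61, 0.11, 0.21, 0.92]]

cells_a = lower_triangle(matrix_a)
cells_b = lower_triangle(matrix_b)
r = pearson_r(cells_a, cells_b)
print(f"r = {r:.2f} over {len(cells_a)} page pairs")
```

Only the lower triangle is used because both matrices are symmetric and the diagonal carries no information. A significance test for r is left out of this sketch.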

Validity

% Agreement of Design

1) Hierarchical Cluster Analysis of Card Sorting Data

2) AutoCardSorter vs User-Data Dendrogram

a) Eigenvalue Analysis to ‘cut’ objectively

b) User structure => Ideal

c) In Agreement => Longer sequence of pages grouped together in the same category as Ideal
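The two ingredients of this measure can be sketched as follows. The first function builds the card-sorting co-occurrence data that a hierarchical cluster analysis would start from. The second is one possible reading of step (c), and it is an assumption: for every proposed category it counts the largest set of pages that also share a single category in the user-derived (ideal) structure. Both function names and all data are hypothetical, and the eigenvalue-based cut of step (a) is not shown.

```python
from itertools import combinations

def co_occurrence(sorts, pages):
    """Fraction of participants who put each pair of pages in the same pile."""
    freq = {pair: 0 for pair in combinations(pages, 2)}
    for piles in sorts:
        for pile in piles:
            # Order the pile like `pages` so pair keys match the dict keys.
            for pair in combinations(sorted(pile, key=pages.index), 2):
                freq[pair] += 1
    return {pair: count / len(sorts) for pair, count in freq.items()}

def percent_agreement(proposed, ideal):
    """Assumed metric: share of pages grouped as in the ideal structure."""
    if len(proposed) != len(ideal):
        raise ValueError("cut both structures into the same number of categories")
    total = sum(len(group) for group in ideal)
    agreeing = 0
    for group in proposed:
        # Largest set of pages in this group that share one ideal category.
        agreeing += max(len(set(group) & set(ref)) for ref in ideal)
    return 100.0 * agreeing / total

# Two hypothetical participants sorting five pages into piles.
pages = ["P1", "P2", "P3", "P4", "P5"]
sorts = [
    [["P1", "P2", "P5"], ["P3", "P4"]],
    [["P1", "P2"], ["P3", "P4", "P5"]],
]
freq = co_occurrence(sorts, pages)

# Hypothetical categories after cutting both dendrograms into two groups.
ideal = [["P1", "P2", "P5"], ["P3", "P4"]]
proposed = [["P1", "P2"], ["P3", "P4", "P5"]]
agreement = percent_agreement(proposed, ideal)
print(freq[("P1", "P5")], agreement)
```

The check on the number of categories matters for this reading of the metric: without it, a proposal that puts every page in its own category would trivially score 100%.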

Efficiency

Total Time Required

AutoCardSorter

Card Sorting

Study Results

Study Results – Validity (1/2)

AutoCardSorter produced results of comparable quality to Card Sorting:

Similarity-Matrices Correlation = 0.80 (p<0.01)

% Agreement of Design = 100%

Study Results – Validity (2/2)

AutoCardSorter vs Card Sorting

Study Results - Efficiency

Discussion - Advantages

Increased efficiency (27 times faster)

Reduces resources required

Explore alternative solutions early

Simple to learn and apply

Easy to apply for large sites (>100 pages)

Possibility for wider adoption

Discussion – Current Limitations

Lack of qualitative feedback

No insight into category labels

Future Research

More validation studies in different domains

Additional constraints (e.g. group size)

Improvements to algorithm

Dynamic semantic similarity algorithms (e.g. LSA-IR)

Alternatives to Hierarchical Clustering (e.g. Factor Analysis)

A Demo - Sit back and enjoy

Summary & Questions

Proposed an approach that automates the structural design of an information space.

The validation study showed a substantial efficiency gain, with results similar to those of a user-based technique.

Cheap + Fast + Easy = Possibility for wider adoption

Complementary to user-based methods

Christos Katsanos | ckatsanos@ece.upatras.gr

Extra Slides

More Validation Studies

Summary of Results

                                 Health & Nutrition   Educational Portal   Travel & Tourism Site
Similarity-Matrices r (p<0.01)   0.80                 0.52                 0.59
% Agreement of Design            100%                 93%                  87%
Efficiency (X Times Faster)      27                   11                   14

More Validation Studies (chart slides): Efficiency, Number of Proposed Categories, Avg. Items/Proposed Category, Correlation against number of items

Statistical Semantic Similarity Measures - Overview

LSA: Latent Semantic Analysis (Landauer & Dumais, 1997)

LSA-IR (Falconer et al, 2006)

PLSA (Hofmann, 1999)

PMI: Point-wise Mutual Information (Manning & Schütze, 1999)

PMI-IR (Turney, 2001)

GLSA (Matveeva et al, 2005)

HAL: Hyperspace Analogue to Language (Lund & Burgess, 1996)

COALS (Rohde et al, 2004)

Latent Semantic Analysis

Similar documents tend to have common words

1) Parse corpora representing users’ reading comprehension level

2) Calculate each word’s frequency of occurrence in each document (term-document matrix, TDM)

3) Weight by word’s importance (document, domain)

4) Apply Singular Value Decomposition

5) LSA Index = Cos(Angle of Document Vectors) => [-1,1]
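The five steps above can be sketched with NumPy. This is a toy illustration under several assumptions, not the tool's implementation: the semantic space is built from the page descriptions themselves rather than from a large general corpus, the weighting is a simple log(tf) x idf scheme chosen for brevity, and the function name `lsa_similarity`, the `n_dims` parameter and the sample descriptions are all hypothetical.

```python
import numpy as np

def lsa_similarity(documents, n_dims=2):
    """Return a document-by-document cosine similarity matrix in LSA space."""
    # Steps 1-2: parse the texts and count word occurrences (term-document matrix).
    vocab = sorted({w for doc in documents for w in doc.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    tdm = np.zeros((len(vocab), len(documents)))
    for j, doc in enumerate(documents):
        for w in doc.lower().split():
            tdm[index[w], j] += 1

    # Step 3: weight each count (assumed scheme: log term frequency x idf).
    doc_freq = np.count_nonzero(tdm, axis=1)
    idf = np.log(len(documents) / doc_freq) + 1.0
    weighted = np.log1p(tdm) * idf[:, None]

    # Step 4: singular value decomposition, keeping the top n_dims dimensions.
    u, s, vt = np.linalg.svd(weighted, full_matrices=False)
    k = min(n_dims, len(s))
    doc_vectors = (np.diag(s[:k]) @ vt[:k]).T

    # Step 5: cosine of the angle between document vectors, in [-1, 1].
    norms = np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    unit = doc_vectors / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T

# Hypothetical page descriptions: two about nutrition, two about travel.
docs = [
    "vitamins minerals healthy diet",
    "healthy diet vitamins fruit",
    "flight hotel booking travel",
    "travel hotel flight deals",
]
similarity = lsa_similarity(docs, n_dims=2)
print(np.round(similarity, 2))
```

In this toy corpus the two nutrition descriptions score close to 1 with each other and close to 0 with the travel descriptions. A matrix of this kind is what the clustering step takes as input.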

Card Sorting

Typical Effort in person days

http://www.intranetleadership.com.au

Why 2 validation measures?

Similarity-matrices Correlation

Strictest approach (compares measurements of semantic similarity)

More general (does not presuppose cluster analysis)

% Agreement of Design

Less strict

How close are the ‘proposed’ designs?