AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Christos Katsanos | [email protected] Nikolaos Tselios | [email protected] Nikolaos Avouris | [email protected] AutoCardSorter: Designing the Information Architecture of a Web Site Using Latent Semantic Analysis ACM SIGCHI | Florence, Italy | 5-10 April, 2008

description

Presents an innovative tool that supports design and evaluation of a Web site’s information architecture. A case study demonstrated that it can significantly reduce resources required to design information-rich applications. AutoCardSorter Video Demo: http://www.youtube.com/watch?v=ly_4GsOMWmU

Transcript of AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Page 1: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Christos Katsanos | [email protected]

Nikolaos Tselios | [email protected]

Nikolaos Avouris | [email protected]

AutoCardSorter: Designing the

Information Architecture of a Web Site

Using Latent Semantic Analysis

ACM SIGCHI | Florence, Italy | 5-10 April, 2008

Page 2: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Purpose & Motivation

Automate Structural Design of Information Spaces

Increase efficiency and flexibility for practitioners


Page 3: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Why is it important?

Structural design greatly affects user experience

Current approaches (e.g. Card Sorting) are often neglected due to:

Time constraints

Cost to recruit users and run the studies

Increased complexity of data analysis

Challenging for large sites (>100 pages)

Page 4: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Our tool-based Methodology

Page Text Descriptions

=> Semantic Similarity Measure (e.g. LSA)

=> Semantic Similarity Matrix

=> Hierarchical Clustering Algorithms

=> Interactive Tree Structure

Additional Support

1. Number of Groups

2. Cross-Hierarchy Links
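
The step from a semantic similarity matrix to grouped pages can be illustrated with a small sketch. It assumes average-linkage agglomerative clustering; the slides only say "Hierarchical Clustering Algorithms", so the linkage rule, the function name `average_linkage`, the page labels and the similarity values are all hypothetical choices made for illustration, not the tool's actual implementation.

```python
from itertools import combinations


def average_linkage(similarity, labels, num_groups):
    """Merge the two most similar clusters until num_groups clusters remain.

    similarity: symmetric matrix (list of lists) of pairwise page similarities.
    labels: page names, in the same order as the matrix rows.
    """
    clusters = [[i] for i in range(len(labels))]

    def cluster_similarity(a, b):
        # Average pairwise similarity between the members of two clusters.
        return sum(similarity[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > num_groups:
        # Pick the pair of clusters with the highest average similarity.
        x, y = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_similarity(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]

    return [sorted(labels[i] for i in c) for c in clusters]


# Hypothetical pages and similarity values (not taken from the study).
pages = ["A", "B", "C", "D"]
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
groups = average_linkage(sim, pages, num_groups=2)
print(groups)
```

Passing a different `num_groups` corresponds to cutting the tree at a different level, which is the kind of decision the "Number of Groups" support is meant to inform.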

Page 5: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

The tool Interface (1/2)


Page 6: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

The tool Interface (2/2)


Page 7: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Validation Study Design


Page 8: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Validation Study Design

Card Sorting vs AutoCardSorter

Investigate quality of results & efficiency

Health & Nutrition Site

Same content item descriptions

18 representative users

Page 9: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Measures & Analysis


Page 10: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Validity

Similarity-Matrices Correlation

     P1    P2    P3    P4    P5
P1   -
P2   0.94  -
P3   0.11  0.33  -
P4   0.33  0.28  0.11  -
P5   0.50  0.83  0.06  0.06  -

     P1    P2    P3    P4    P5
P1   -
P2   0.62  -
P3   0.21  0.14  -
P4   0.49  0.51  0.83  -
P5   0.61  0.11  0.21  0.92  -

AutoCardSorter: LSA (P5,P1)

Card Sorting: Frequency Users placed P1 and P5 in Same Pile
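
As a rough sketch of how two such matrices can be compared, the code below flattens the entries below the diagonal of each matrix and computes Pearson's r between them. The slide does not name the correlation statistic or the significance test, so Pearson's r is an assumption here, and no p-value is computed. The values are the illustrative ones shown on the slide; the transcript does not say which table belongs to which method, so they are named `matrix_one` and `matrix_two`, and the result need not match the correlation reported for the real study data.

```python
from math import sqrt


def lower_triangle(matrix):
    """Flatten the entries below the diagonal, row by row."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Rows P1..P5; only the entries below the diagonal are stored.
matrix_one = [
    [],
    [0.94],
    [0.11, 0.33],
    [0.33, 0.28, 0.11],
    [0.50, 0.83, 0.06, 0.06],
]
matrix_two = [
    [],
    [0.62],
    [0.21, 0.14],
    [0.49, 0.51, 0.83],
    [0.61, 0.11, 0.21, 0.92],
]

r = pearson(lower_triangle(matrix_one), lower_triangle(matrix_two))
print(round(r, 2))
```

Only the lower triangle is used because the matrices are symmetric and the diagonal carries no information about agreement between the two methods.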

Page 11: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Validity

% Agreement of Design

1) Hierarchical Cluster Analysis of Card Sorting Data

2) AutoCardSorter vs User-Data Dendrogram

a) Eigenvalue Analysis to ‘cut’ objectively

b) User structure => Ideal

c) In Agreement => Longer sequence of pages grouped together in the same category as Ideal
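
The slide does not spell out how the agreement percentage is computed. The sketch below is one simple stand-in, not the authors' procedure: each proposed group is matched to the "ideal" (user-derived) category that most of its pages belong to, and the pages that fall in that majority category count as being in agreement. The function name `percent_agreement`, the page names and the category names are all hypothetical.

```python
from collections import Counter, defaultdict


def percent_agreement(proposed, ideal):
    """Share of pages whose proposed group matches the majority ideal category.

    proposed: dict mapping page -> group id from the automated clustering.
    ideal: dict mapping page -> category from the user-based structure.
    """
    members = defaultdict(list)
    for page, group in proposed.items():
        members[group].append(page)

    matched = 0
    for pages in members.values():
        # Count how many pages of this group share its most common ideal category.
        counts = Counter(ideal[p] for p in pages)
        matched += counts.most_common(1)[0][1]

    return 100.0 * matched / len(proposed)


# Hypothetical example: P3 is grouped with P1 and P2 but users put it elsewhere.
ideal = {"P1": "food", "P2": "food", "P3": "health", "P4": "health", "P5": "health"}
proposed = {"P1": 0, "P2": 0, "P3": 0, "P4": 1, "P5": 1}
print(percent_agreement(proposed, ideal))
```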

Page 12: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Efficiency

Total Time Required


AutoCardSorter

Card Sorting

Page 13: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Study Results


Page 14: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Study Results – Validity (1/2)

AutoCardSorter produced results of comparable quality to Card Sorting:

Similarity-Matrices Correlation = 0.80 (p<0.01)

% Agreement of Design = 100%

Page 15: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Study Results – Validity (2/2)

AutoCardSorter | Card Sorting

Page 16: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Study Results - Efficiency


Page 17: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Discussion - Advantages

Increased efficiency (x27)

Reduces resources required

Explore alternative solutions early

Simple to learn and apply

Easy to apply for large sites (>100 pages)

Possibility for wider adoption

Page 18: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Discussion – Current Limitations

Lack of qualitative feedback

No insight to category-labels


Page 19: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Future Research

More validation studies in different domains

Additional constraints (e.g. group size)

Improvements to the algorithm

Dynamic semantic similarity algorithms (e.g. LSA-IR)

Alternatives to Hierarchical Clustering (e.g. Factor Analysis)

Page 20: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

A Demo - Sit back and enjoy


Page 21: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Summary & Questions

Proposed an approach that automates structural design of an information space.

Validation study showed a substantial efficiency gain, with similar results to a user-based technique

Cheap + Fast + Easy = Possibility for wider adoption

Complementary to user-based methods

Christos Katsanos | [email protected]

Page 22: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Extra Slides


Page 23: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

More Validation Studies

Summary of Results

                                  Health & Nutrition   Educational Portal   Travel & Tourism Site
Similarity-Matrices r (p<0.01)    0.80                 0.52                 0.59
% Agreement of Design             100%                 93%                  87%
Efficiency (X Times Faster)       27                   11                   14

Page 24: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

More Validation Studies

Efficiency


Page 25: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

More Validation Studies

Number of Proposed Categories


Page 26: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

More Validation Studies

Avg. Items/Proposed Category


Page 27: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

More Validation Studies

Correlation against No of items


Page 28: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Statistical Semantic Similarity Measures - Overview

LSA: Latent Semantic Analysis (Landauer & Dumais, 1997)

LSA-IR (Falconer et al., 2006)

PLSA (Hofmann, 1999)

PMI: Point-wise Mutual Information (Manning & Schütze, 1999)

PMI-IR (Turney, 2001)

GLSA (Matveeva et al., 2005)

HAL: Hyperspace Analogue to Language (Lund & Burgess, 1996)

COALS (Rohde et al., 2004)

Page 29: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Latent Semantic Analysis

Similar documents tend to have common words

1) Parse corpora representing users’ understanding skills

2) Calculate each word’s frequency of occurrence (TDM)

3) Weight by word’s importance (document, domain)

4) Apply Singular Value Decomposition

5) LSA Index = Cos(Angle of Document Vectors) => [-1,1]
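
To make steps 2, 3 and 5 concrete, here is a minimal standard-library sketch that counts terms, weights them, and takes the cosine of the document vectors. It deliberately omits the Singular Value Decomposition of step 4, so it is a plain weighted vector-space similarity rather than LSA itself. The weighting (term count times a smoothed inverse document frequency) is an assumption, since the slide does not say which scheme is used, and the three page descriptions are invented. Without the SVD step all weights are non-negative, so these cosines fall in [0, 1] rather than the [-1, 1] range given above.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def weighted_vectors(documents):
    """Build one weighted term vector per document (steps 2 and 3)."""
    counts = [Counter(tokenize(d)) for d in documents]
    vocabulary = sorted(set().union(*counts))
    n_docs = len(documents)
    # Number of documents each term appears in.
    doc_freq = {t: sum(1 for c in counts if t in c) for t in vocabulary}
    vectors = [
        # Assumed weighting: raw count times log(1 + N / document frequency).
        [c[t] * math.log(1 + n_docs / doc_freq[t]) for t in vocabulary]
        for c in counts
    ]
    return vocabulary, vectors


def cosine(u, v):
    """Cosine of the angle between two vectors (step 5)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical page descriptions, not taken from the study.
docs = [
    "healthy recipes with fresh vegetables",
    "fresh vegetables and fruit recipes",
    "contact the site administrators",
]
vocabulary, vectors = weighted_vectors(docs)
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

In full LSA the weighted term-document matrix would first be reduced with a truncated SVD, and the cosine would be taken between the reduced document vectors, which lets documents with related but non-identical vocabulary score as similar.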

Page 30: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Card Sorting

Typical Effort in person days

http://www.intranetleadership.com.au

Page 31: AutoCardSorter - Designing the Information Architecture of a web site using Latent Semantic Analysis

Why 2 validation measures?

Similarity-Matrices Correlation

Strictest approach (compares measurements of semantic similarity)

More general (does not presuppose cluster analysis)

% Agreement of Design

Less strict

How close are the ‘proposed’ designs?