Visual Analytics and the Geometry of Thought— Spatial Intelligence through Sapient Interfaces...

Post on 22-Dec-2015


Visual Analytics and the Geometry of Thought: Spatial Intelligence through Sapient Interfaces

Alexander Klippel & Frank Hardisty

Department of Geography, GeoVISTA Center & e-Dutton Institute for Education

Penn State

Star Plots: How Shape Characteristics Influence Classification Tasks

Overview

Multivariate data displays
Experiment on the influence of shape (of star plots) on the classification of data
Design of a tool to administer grouping experiments
Design of a tool to analyze individual similarity ratings
Does shape matter?
Conclusion and future work

Displaying Multivariate Data

We encounter limitations in displaying multivariate data in two dimensions.

In response to these constraints, several graphic designs have been devised, for example:
Andrews curves
Parallel plots
Chernoff faces
Star plots
etc.

The big question is: which visualization technique works for which data sets, and which does not?

Parallel Coordinate Plot

Chernoff Faces

Source: http://mapmaker.rutgers.edu/355/links.html

www.ncgia.ucsb.edu

www.ghastlyfop.com

Star Plots

GeoViz Toolkit: http://www.geovista.psu.edu/grants/cdcesda/software/

Question

In their work on Chernoff faces, Chernoff and Rizvi (1975) found that varying the assignment of variables to facial characteristics influences classification tasks.

For star plots, the assumption is made that the assignment of variables to rays does not matter. But is that really the case?

Experiment: Car Data

[Figure: eight-ray star plot, shown once for each condition. Rays are numbered 1-8 and labeled Price; Safety rating (higher is better); MPG (miles per gallon); Emissions; Weight; Maximum speed; Interior space; Acceleration (higher is faster). Conditions: 1-3-5-7 and 2-3-6-7.]

20 participants in each condition (Penn State undergraduates)

[Figure labels: 30-15, 65-50, 100-85, 100-85]

The Grouping Tool

81 icons (4 variables, each at 3 levels: high, medium, low)

[Figure panels: 1-3-5-7 and 2-3-6-7]
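The count of 81 icons follows from crossing three levels over four variables (3^4 = 81). A minimal sketch of that enumeration is given here; the names `LEVELS`, `N_VARIABLES`, and `icons` are illustrative and not taken from the grouping tool itself.

```python
from itertools import product

# Illustrative names; the slides state only "4 variables, 3 levels".
LEVELS = ("low", "medium", "high")
N_VARIABLES = 4

# One icon per combination of levels across the four variables.
icons = list(product(LEVELS, repeat=N_VARIABLES))

print(len(icons))  # 81
print(icons[0])    # ('low', 'low', 'low', 'low')
```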

Example: All Low Values

[Figure: the same all-low record drawn as a star plot in condition 1-3-5-7 and in condition 2-3-6-7]
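A rough geometric sketch of why the same record can produce different shapes in the two conditions. It rests on assumptions not stated in the slides: the eight rays are equally spaced, ray 1 points straight up, numbering runs clockwise, and the ray length 0.2 for the level "low" is a made-up value.

```python
import math

def angular_gaps(active_rays, n_rays=8):
    """Angles (degrees) between neighbouring active rays, going around the star.

    Assumes equally spaced rays numbered 1..n_rays.
    """
    rays = sorted(active_rays)
    step = 360 / n_rays
    return [((b - a) % n_rays) * step for a, b in zip(rays, rays[1:] + rays[:1])]

def ray_endpoints(active_rays, lengths, n_rays=8):
    """End points (x, y) of the active rays.

    Assumes ray 1 points straight up and numbering runs clockwise.
    """
    points = []
    for ray, length in zip(active_rays, lengths):
        angle = math.radians(90 - (ray - 1) * 360 / n_rays)
        points.append((length * math.cos(angle), length * math.sin(angle)))
    return points

all_low = [0.2, 0.2, 0.2, 0.2]  # hypothetical ray length for level "low"
shape_a = ray_endpoints((1, 3, 5, 7), all_low)
shape_b = ray_endpoints((2, 3, 6, 7), all_low)

print(angular_gaps((1, 3, 5, 7)))  # [90.0, 90.0, 90.0, 90.0]
print(angular_gaps((2, 3, 6, 7)))  # [45.0, 135.0, 45.0, 135.0]
```

Under these assumptions, condition 1-3-5-7 spaces its rays evenly, while condition 2-3-6-7 pairs them up, so identical values yield visibly different outlines.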

Data

Number of groups
Time to complete
Similarity matrix
Linguistic labels
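One common way to derive a similarity matrix from grouping data is to count, for each pair of icons, how many participants placed both in the same group. The sketch below assumes that coding; the slides do not specify how the matrix was built, and the function name and toy data are hypothetical.

```python
from itertools import combinations

def similarity_matrix(groupings, n_icons):
    """Count, for every icon pair, how many participants grouped them together."""
    sim = [[0] * n_icons for _ in range(n_icons)]
    for groups in groupings:      # one entry per participant
        for group in groups:      # each group is a set of icon ids
            for i, j in combinations(sorted(group), 2):
                sim[i][j] += 1
                sim[j][i] += 1
    return sim

# Toy data: two hypothetical participants sorting four icons (ids 0-3).
toy = [
    [{0, 1}, {2, 3}],
    [{0, 1, 2}, {3}],
]
sim = similarity_matrix(toy, n_icons=4)

print(sim[0][1])  # 2
print(sim[0][2])  # 1
print(sim[0][3])  # 0
```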

Some Results

There is no statistically significant difference in the number of groups created between conditions 1-3-5-7 and 2-3-6-7 (t = .241, df = 38, p = .811).

There is no statistically significant difference in the time participants needed to complete the task (t = -1.533, df = 38, p = .134).

The similarity values in the two similarity matrices are correlated, and the correlation is statistically significant (r = .581, N = 3240, p < .0005).
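The reported figures are consistent with an independent-samples t-test on two groups of 20 (df = 20 + 20 - 2 = 38) and a Pearson correlation over the 81 · 80 / 2 = 3240 unique icon pairs. The sketch below assumes a pooled-variance t-test, which the slides do not state; the input data are placeholders, not the experiment's results.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Student's t for two independent samples, assuming equal variances."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def upper_triangle(matrix):
    """Unique off-diagonal entries of a symmetric matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]

# Placeholder samples of 20 values each, used only to show the df.
_, df = pooled_t(list(range(20)), list(range(1, 21)))
print(df)                # 38
print(math.comb(81, 2))  # 3240
```

To correlate two similarity matrices, one would pass `upper_triangle(m1)` and `upper_triangle(m2)` to `pearson_r`.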

Cluster Analysis

Ward’s method

1-3-5-7

2-3-6-7

MDS Plots

1-3-5-7

2-3-6-7

Grouping Analysis

Improvise by Chris Weaver (http://www.personal.psu.edu/cew15/improvise/index.html)

[Figures: a series of slides comparing the groupings in conditions 1-3-5-7 and 2-3-6-7, each repeating the eight-ray star plot legend from "Experiment: Car Data"]

Conclusion

Shape does matter
The assignment of variables to rays in a star plot influences classification tasks (compare Chernoff faces)
Characteristic shape features have an influence on rating the similarity of the represented data
The more characteristic the shape, the greater the influence

It may therefore be that star plots are less suitable for exploratory analysis by laypersons but more effective in communication (if carefully chosen).

Outlook

Quantifying data analysis
  Cluster validation methods, e.g., Rand statistic, Jaccard coefficient
Individual analysis of “shape families”
Relation to linguistic labels
Continue work on how variables should be assigned to rays
  For example, is there a time advantage for salient shapes?
Influence of contextual parameters
  Of a star plot as such (e.g., number of variables/rays)
  As a symbol in a map (e.g., spatial patterns, and the first law of geography)
Star plots in comparison to other visualization techniques
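The two cluster validation measures named above can both be computed from pair counts. This is a minimal sketch using the standard pair-counting definitions; the function names and toy partitions are illustrative and are not results from the experiment.

```python
from itertools import combinations

def pair_counts(labels_a, labels_b):
    """Classify every item pair by whether each partition groups it together."""
    both = only_a = only_b = neither = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            both += 1
        elif same_a:
            only_a += 1
        elif same_b:
            only_b += 1
        else:
            neither += 1
    return both, only_a, only_b, neither

def rand_statistic(labels_a, labels_b):
    """Share of pairs on which the two partitions agree."""
    both, only_a, only_b, neither = pair_counts(labels_a, labels_b)
    return (both + neither) / (both + only_a + only_b + neither)

def jaccard_coefficient(labels_a, labels_b):
    """Like the Rand statistic, but ignoring pairs separated in both partitions."""
    both, only_a, only_b, _ = pair_counts(labels_a, labels_b)
    return both / (both + only_a + only_b)

# Toy partitions of four items (group label per item).
a = [0, 0, 1, 1]
b = [0, 0, 0, 1]
print(rand_statistic(a, b))       # 0.5
print(jaccard_coefficient(a, b))  # 0.25
```

Note that `jaccard_coefficient` is undefined when neither partition groups any pair together; a fuller version would handle that case explicitly.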

Thank you