Intro to data visualization

Post on 19-Jan-2015

Slides used in capita selecta HCI course H05N2A

Data Visualization - An introduction

Prof Jan Aerts, Biodata Visualization and Analysis, ESAT/SCD, University of Leuven, Belgium

twitter: @jandot
Google+: +Jan Aerts
jan.aerts@esat.kuleuven.be
http://biovizanlab.wordpress.com
http://saaientist.blogspot.com

1. What is data visualization?

“A good sketch is better than a long speech” (Napoleon)

Minard’s map of Napoleon’s Russian campaign shows: size of the army, geographical coordinates, direction that the army was traveling, location of the army with respect to certain dates, temperature along the path of the retreat

John Snow - cholera map

Shape of Songs: “Like a Prayer” (Madonna), by Martin Wattenberg

What I use as a definition:

“computer-based visualization systems providing visual representations of datasets intended to help people carry out some task more effectively.” (T Munzner)

cognition <=> perception

cognitive task => perceptive task

“eyes beat memory”

Why do we visualize data?

• record information

  • blueprints, photographs, seismographs, ...

• analyze data to support reasoning

  • develop & assess hypotheses

  • discover errors in data

  • expand memory

  • find patterns (see Snow’s cholera map)

• communicate information

  • share & persuade

  • collaborate & revise

pictorial superiority effect

[Figure: how much of the word “information” is recalled after 72 hr: “informa” (65%) versus “i” (1%)]

2. Exploration <-> explanation

exploration <-> explanation

visual analytics <-> infographics

hypothesis generation

“visual analytics”

=> identify unexpected patterns

J van Wijk

Anscombe’s quartet

• μX = 9.0

• μY = 7.5

• σX = 3.317

• σY = 2.03

• Y = 3 + 0.5X

• R² = 0.67
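
The point of the quartet is that these summary statistics are nearly identical for four datasets that look completely different when plotted. A minimal sketch that recomputes the numbers above with the Python standard library; the data values are Anscombe's published quartet, while the helper name `summarize` is only chosen for this illustration:

```python
from statistics import mean, stdev

# Anscombe's quartet; datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return mean, sample standard deviation, least-squares line and R^2."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mx,
        "mean_y": my,
        "sd_x": stdev(xs),
        "sd_y": stdev(ys),
        "slope": slope,
        "intercept": my - slope * mx,
        "r2": sxy ** 2 / (sxx * syy),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Only a plot reveals that one dataset is roughly linear, one is curved, and two are dominated by a single outlier.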

A concrete example: hive plots

Martin Krzywinski

same network

Martin Krzywinski

different networks!

3D, anyone?

• occlusion

• interaction complexity

• perspective distortion

• text legibility

Gene interaction data: “gene A regulates gene B”

Functions in the Linux operating system: “function A calls function B”

regulator

manager

workhorse
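
A hive plot places each node on an axis according to a structural rule, so the layout depends only on the network itself. The slide gives just the three axis labels; the sketch below assumes the usual reading for directed networks (a regulator only has outgoing edges, a workhorse only has incoming edges, a manager has both). The function name `classify_nodes` is hypothetical.

```python
from collections import Counter


def classify_nodes(edges):
    """Assign every node of a directed edge list to a hive-plot axis.

    Assumed rule (not spelled out on the slide):
      regulator = outgoing edges only
      workhorse = incoming edges only
      manager   = both incoming and outgoing edges
    """
    out_degree = Counter(source for source, _ in edges)
    in_degree = Counter(target for _, target in edges)
    roles = {}
    for node in set(out_degree) | set(in_degree):
        if in_degree[node] == 0:
            roles[node] = "regulator"
        elif out_degree[node] == 0:
            roles[node] = "workhorse"
        else:
            roles[node] = "manager"
    return roles


# "gene A regulates gene B" or "function A calls function B"
example_edges = [("A", "B"), ("B", "C"), ("A", "C")]
example_roles = classify_nodes(example_edges)
print(sorted(example_roles.items()))
```

Because the rule is the same for both datasets, the gene network and the function-call network can be compared axis by axis.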

3. Why specifically learn about dataviz?

Isn’t it all just about using common sense?

• huge space of design alternatives => many tradeoffs

• many possibilities known to be ineffective

• avoid random walk through parameter space

• avoid some of our past mistakes

• extensive experimentation has already been done

• guidelines continue to evolve

• we reflect on lessons learned in design studies

• iterative refinement usually wise

4. Stages of data visualization

How do we get from data to visualization? We need to understand:

• properties of the data

• properties of the image

• the rules mapping data to image

4.1. Properties of the data

S Stevens “On the theory of scales of measurement” (1946)
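
Stevens' paper distinguishes four levels of measurement (nominal, ordinal, interval, ratio), each allowing the comparisons of the level below it plus one more. The slide only cites the paper, so the sketch below is one way to write that hierarchy down; the names `SCALES`, `NEW_OPERATIONS` and `allowed_operations` are made up for this illustration.

```python
# Levels of measurement, from weakest to strongest.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# Comparison that becomes meaningful at each level.
NEW_OPERATIONS = {
    "nominal": {"=", "!="},   # same category or not
    "ordinal": {"<", ">"},    # ranking
    "interval": {"+", "-"},   # differences are meaningful
    "ratio": {"*", "/"},      # a true zero makes ratios meaningful
}


def allowed_operations(scale):
    """Operations meaningful on `scale`: its own plus those of all weaker scales."""
    level = SCALES.index(scale)
    operations = set()
    for weaker_or_equal in SCALES[: level + 1]:
        operations |= NEW_OPERATIONS[weaker_or_equal]
    return operations


print(sorted(allowed_operations("ordinal")))
```

For example, temperature in degrees Celsius is an interval scale: differences make sense, but "twice as hot" does not.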

4.2. Properties of the image - perception

Semiology of graphics

• Jacques Bertin, Gauthier-Villars 1967, EHESS 1998

• semiology = study of signs and sign processes, likeness, analogy, metaphor, symbolism, signification, and communication (Wikipedia)

• visual encoding:

• what - points, lines, areas (also patterns, trees/networks, grids)

• where - positional: XY (1D, 2D, 3D)

• how - retinal: Z (size, lightness, texture, colour, orientation, shape)

• when - temporal: animation

“marks” - geometric primitives

“channels” - control appearance of marks

Gestalt laws - interplay between parts and the whole (Kurt Koffka)

series of principles

Election results Florida:

• black = Bush

• white = Gore

Gestalt - Principle of Simplicity

We perceive every pattern as the simplest structure possible.

Gestalt - Principle of Proximity

Things that are close to each other are seen as belonging together (=> clusters)

Gestalt - Principle of Similarity

Things that are similar in some way are perceived as belonging together.

Gestalt - Principle of Closure

We tend to complete an incomplete pattern.

Gestalt - Principle of Connectedness

Things that are connected are perceived as belonging together. This encoding is stronger than similarity, shape, colour, and size.

Gestalt - Principle of Good Continuation

Objects that are arranged in a straight or smooth line tend to be seen as a unit.

Gestalt - Principle of Common Fate

Objects that move in the same direction tend to be seen as a unit.

Gestalt - Principle of Familiarity

Gestalt - Principle of Symmetry

Symmetrical areas tend to be seen as figures against asymmetrical backgrounds.

Context affects perceptual tasks

Pre-attentive vision

= ability of the low-level human visual system to rapidly identify certain basic visual properties

• some features “pop out”

• used for:

• target detection

• boundary detection

• counting/estimation

• ...

• visual system takes over => all cognitive power available for interpreting the figure, rather than needing part of it for processing the figure

Really fast; see http://www.csc.ncsu.edu/faculty/healey/PP/

Limitations of preattentive vision

1. Combining pre-attentive features does not always work => we would need to resort to “serial search” (most channel pairs; all channel triplets), e.g. is there a red square in this picture?

2. Speed depends on the channel used (choose one that works well for categorical data; see “accuracy” further on)

4.3. Mapping data to image: visual encoding

Language of graphics

• graphics = sign system:

• each mark (point, line, area) represents a data element

• choose visual variables to encode relationships between data elements

• difference, similarity, order, proportion

• only position supports all relationships (see later)

• huge range of alternatives for data with many attributes

• find images that express & effectively convey the information

Which encoding should I use?

• From a huge list of possibilities, you have to choose the best one.

• Principle of Consistency

• properties of the representation should match properties of the data (e.g. pie chart: area vs radius)

• Principle of Importance Ordering

• encode the most important piece of information in the most “effective” way (i.e. spatial position)
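
One reading of the “area vs radius” remark: when a value is shown as a circle, the eye compares areas, so the radius has to grow with the square root of the value. A small sketch under that assumption; the function names are hypothetical.

```python
import math


def radius_for_area_encoding(value, max_value, max_radius):
    """Radius such that the circle's AREA is proportional to `value`."""
    return max_radius * math.sqrt(value / max_value)


def radius_for_naive_encoding(value, max_value, max_radius):
    """Radius proportional to `value`: the area then grows quadratically."""
    return max_radius * value / max_value


def circle_area(radius):
    return math.pi * radius ** 2


# Compare a value of 50 against a value of 100 (a true ratio of 0.5).
big = circle_area(radius_for_area_encoding(100, 100, 10))
small = circle_area(radius_for_area_encoding(50, 100, 10))
naive_small = circle_area(radius_for_naive_encoding(50, 100, 10))

print(small / big)        # area ratio matches the data ratio
print(naive_small / big)  # area ratio understates the smaller value
```

With the naive mapping the half-sized value is drawn with only a quarter of the area, so the representation no longer matches the data.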

Stevens’ psychophysical law

= proposed relationship between the magnitude of a physical stimulus and its perceived intensity or strength
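
The law is usually written as a power function, perceived = k * stimulus^a, where the exponent a depends on the channel. The sketch below uses commonly cited approximate exponents (length about 1.0, area about 0.7, brightness about 0.5, electric shock about 3.5); treat them as illustrative values, since reported exponents vary between experiments.

```python
# Approximate, commonly cited exponents; assumptions for illustration only.
EXPONENTS = {
    "length": 1.0,
    "area": 0.7,
    "brightness": 0.5,
    "electric shock": 3.5,
}


def perceived_ratio(stimulus_ratio, exponent):
    """How much larger a stimulus is perceived to be when it is physically
    `stimulus_ratio` times larger (the constant k cancels out of the ratio)."""
    return stimulus_ratio ** exponent


for channel, exponent in EXPONENTS.items():
    print(f"{channel}: doubling the stimulus is perceived as "
          f"x{perceived_ratio(2.0, exponent):.2f}")
```

Channels with an exponent below 1 are underestimated (doubling an area looks like less than a doubling), which is one reason length and position are read more accurately than area.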

Accuracy of quantitative perceptual tasks (Mackinlay)

what/where (qualitative) <-> how much (quantitative)

“power of the plane”

grouping: see Gestalt laws

COLOUR ... is tricky, and often used wrongly

Colour space

• = mathematical model to talk about colour

• RGB (red-green-blue)

• most common, but less useful

• HSV (hue-saturation-value)

• more useful
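
HSV is more useful because its axes match how we talk about colour: picking "the same colour, but lighter" or "five clearly different hues" is a change along one axis. A minimal sketch using Python's standard `colorsys` module; the helper names `categorical_palette` and `to_hex` and the default saturation and value are assumptions for this example.

```python
import colorsys


def categorical_palette(n, saturation=0.65, value=0.9):
    """Return n colours with evenly spaced hues as (r, g, b) floats in [0, 1]."""
    return [colorsys.hsv_to_rgb(i / n, saturation, value) for i in range(n)]


def to_hex(rgb):
    """Convert an (r, g, b) triple of floats in [0, 1] to a '#rrggbb' string."""
    return "#{:02x}{:02x}{:02x}".format(*(round(channel * 255) for channel in rgb))


pure_red_hsv = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)  # (0.0, 1.0, 1.0)
palette = [to_hex(colour) for colour in categorical_palette(5)]

print(pure_red_hsv)
print(palette)
```

Evenly spaced hues are not perceptually uniform, so for real figures a tested palette is still the safer choice.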

colorbrewer2.org

in R: please use RColorBrewer!

Context affects colour perception

Dangers of Depth (3D)

• We do NOT see in 3D; we see in 2.05D.

• occlusion

• interaction complexity

• perspective distortion

3D example

Lie factor

“lie factor” = (size of effect shown in graphic) / (size of effect in data)
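
A worked example of the ratio, assuming (as Tufte does) that the "size of effect" is the relative change between two values. The numbers are from Tufte's well-known fuel-economy chart: the data rise from 18 to 27.5 miles per gallon, while the lines drawn for them grow from 0.6 to 5.3 inches. The function names are chosen for this sketch.

```python
def relative_change(first, second):
    """Size of effect as a fraction of the starting value."""
    return (second - first) / first


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Effect shown in the graphic divided by the effect in the data."""
    shown = relative_change(graphic_first, graphic_second)
    actual = relative_change(data_first, data_second)
    return shown / actual


# Line lengths in inches versus fuel economy in miles per gallon.
fuel_economy_lie_factor = lie_factor(0.6, 5.3, 18.0, 27.5)
print(round(fuel_economy_lie_factor, 1))
```

A faithful graphic has a lie factor close to 1; here a 53% increase in the data is drawn as a 783% increase.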

3D scatter plots are better as series of 2D projections
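
A sketch of what "series of 2D projections" means in practice: enumerate every pair of axes and keep only those two coordinates per point, giving one flat scatter plot per pair. The record layout (a list of dicts) and the function names are assumptions for this example.

```python
from itertools import combinations


def axis_pairs(columns):
    """All unordered pairs of axes, one per 2D scatter plot."""
    return list(combinations(columns, 2))


def project(points, pair):
    """Keep only the two coordinates named in `pair` for every point."""
    first, second = pair
    return [(point[first], point[second]) for point in points]


points = [
    {"x": 1.0, "y": 2.0, "z": 3.0},
    {"x": 4.0, "y": 5.0, "z": 6.0},
]

projections = {pair: project(points, pair) for pair in axis_pairs(["x", "y", "z"])}
for pair, coordinates in projections.items():
    print(pair, coordinates)
```

Three axes give three panels (xy, xz, yz), none of which suffers from occlusion or perspective distortion.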

Dynamic data

• animation is good sometimes, but often not:

• we can only follow 3-4 visual cues simultaneously

• change in “mental map”

• change blindness (e.g. http://nivea.psycho.univ-paris5.fr/CBMovies/BarnTrackFlickerMovie.gif)

http://vimeo.com/2035117

5. Interaction

Overview, zoom and filter, details on demand (Shneiderman’s Information Seeking Mantra)

Operations on the data

• sorting

• filtering

• browsing/exploring

• comparison

• characterizing trends & distributions

• finding anomalies & outliers

• ...

Techniques to support these operations

• re-orderable matrices

• brushing

• linked views

• overview & detail

• focus & context

• ...

6. Validation

Evaluate the right thing

Munzner, 2009

Slide/picture acknowledgments

• Jeffrey Heer

• Tamara Munzner

• Jessie Kennedy

• Nils Gehlenborg

• Miriah Meyer

“I think this presentation went quite well...”