How Humans See Data - Amazon Cut

How Humans See Data John Rauser @jrauser January 2017

Transcript of How Humans See Data - Amazon Cut

Page 1: How Humans See Data - Amazon Cut

How Humans

See Data

John Rauser

@jrauser

January 2017

Page 3: How Humans See Data - Amazon Cut

visualization

Page 4: How Humans See Data - Amazon Cut

visualization

is

communication

Page 5: How Humans See Data - Amazon Cut

how to make better visualizations

Page 6: How Humans See Data - Amazon Cut

help humans solve analytical

problems quickly and accurately

with visualization

Page 7: How Humans See Data - Amazon Cut

Part I: Why visualize data at all?

Page 8: How Humans See Data - Amazon Cut
Page 9: How Humans See Data - Amazon Cut

x      y      x      y
1.972  1.236  0.111  0.542
1.112  1.994  0.902  0.005
0.000  1.009  0.598  0.085
0.665  1.942  1.613  1.790
0.235  0.356  1.298  1.955
0.247  1.658  0.651  1.937
1.275  1.961  1.949  1.316
0.702  0.045  0.099  0.567
1.760  0.350  0.862  0.010
1.691  0.277  0.027  0.768
1.628  1.778  0.706  1.956
1.957  1.290  1.042  1.999

Page 10: How Humans See Data - Amazon Cut
Page 11: How Humans See Data - Amazon Cut

pre-attentive processing

Page 12: How Humans See Data - Amazon Cut

A graph is an encoding

of the data.

Page 13: How Humans See Data - Amazon Cut

x      y      x      y
1.972  1.236  0.111  0.542
1.112  1.994  0.902  0.005
0.000  1.009  0.598  0.085
0.665  1.942  1.613  1.790
0.235  0.356  1.298  1.955
0.247  1.658  0.651  1.937
1.275  1.961  1.949  1.316
0.702  0.045  0.099  0.567
1.760  0.350  0.862  0.010
1.691  0.277  0.027  0.768
1.628  1.778  0.706  1.956
1.957  1.290  1.042  1.999

Page 14: How Humans See Data - Amazon Cut
Page 15: How Humans See Data - Amazon Cut

n x y n x y

1 1.972 1.236 13 0.111 0.542

2 1.112 1.994 14 0.902 0.005

3 0.000 1.009 15 0.598 0.085

4 0.665 1.942 16 1.613 1.790

5 0.235 0.356 17 1.298 1.955

6 0.247 1.658 18 0.651 1.937

7 1.275 1.961 19 1.949 1.316

8 0.702 0.045 20 0.099 0.567

9 1.760 0.350 21 0.862 0.010

10 1.691 0.277 22 0.027 0.768

11 1.628 1.778 23 0.706 1.956

12 1.957 1.290 24 1.042 1.999
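
The plots that accompany this table are not part of the transcript, so the following is only a sketch of what a graph of these 24 points might make obvious at a glance. It assumes, from the numbers alone and not from anything the slides state, that the points sit on a circle of radius 1 centered at (1, 1), and checks that assumption arithmetically. The names `POINTS` and `radii` are made up for this illustration.

```python
import math

# The 24 (x, y) pairs from the table, in order of n.
POINTS = [
    (1.972, 1.236), (1.112, 1.994), (0.000, 1.009), (0.665, 1.942),
    (0.235, 0.356), (0.247, 1.658), (1.275, 1.961), (0.702, 0.045),
    (1.760, 0.350), (1.691, 0.277), (1.628, 1.778), (1.957, 1.290),
    (0.111, 0.542), (0.902, 0.005), (0.598, 0.085), (1.613, 1.790),
    (1.298, 1.955), (0.651, 1.937), (1.949, 1.316), (0.099, 0.567),
    (0.862, 0.010), (0.027, 0.768), (0.706, 1.956), (1.042, 1.999),
]


def radii(points, center=(1.0, 1.0)):
    """Distance of each point from the assumed center (an assumption, not from the slides)."""
    cx, cy = center
    return [math.hypot(x - cx, y - cy) for x, y in points]


distances = radii(POINTS)
print(f"smallest distance: {min(distances):.3f}")
print(f"largest distance:  {max(distances):.3f}")
```

Reading the same structure out of the table takes deliberate arithmetic; a scatterplot hands it to the visual system for free.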

Page 16: How Humans See Data - Amazon Cut
Page 17: How Humans See Data - Amazon Cut
Page 18: How Humans See Data - Amazon Cut

Good visualizations optimize

for the human visual system.

Page 19: How Humans See Data - Amazon Cut

How does the human

visual system work?

Page 20: How Humans See Data - Amazon Cut

How does the human visual

system decode a graph?

Page 21: How Humans See Data - Amazon Cut
Page 22: How Humans See Data - Amazon Cut

Cleveland’s three visual

operations of pattern perception:

1. Detection

2. Assembly

3. Estimation

Page 23: How Humans See Data - Amazon Cut

Part II: estimation

Page 24: How Humans See Data - Amazon Cut

Three levels of estimation

a. discrimination X=Y X!=Y

b. ranking X>Y X<Y

c. ratioing X / Y = ?
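
As a rough illustration of how the three levels differ in what they ask of the reader, each can be written as a question about two values. The function names are hypothetical and chosen only for this sketch; the slides give just the X and Y notation.

```python
def discriminate(x, y):
    """Level a: are the two values the same or different?"""
    return x == y


def rank(x, y):
    """Level b: which value is larger? Returns -1, 0, or 1."""
    return (x > y) - (x < y)


def ratio(x, y):
    """Level c: how many times larger is x than y?"""
    return x / y


print(discriminate(6, 3))  # False: the values differ
print(rank(6, 3))          # 1: the first value is larger
print(ratio(6, 3))         # 2.0: the first value is twice the second
```

Each level needs more from the encoding than the one before it: a reader can tell two hues apart, but cannot say that one hue is twice another.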

Page 25: How Humans See Data - Amazon Cut

At the heart of quantitative

reasoning is a single question:

Compared to what?

- Tufte, Envisioning Information

Page 26: How Humans See Data - Amazon Cut

Three levels of estimation

a. discrimination X=Y X!=Y

b. ranking X>Y X<Y

c. ratioing X / Y = ?

Page 27: How Humans See Data - Amazon Cut
Page 28: How Humans See Data - Amazon Cut
Page 29: How Humans See Data - Amazon Cut

the most

important

thing

Page 30: How Humans See Data - Amazon Cut
Page 31: How Humans See Data - Amazon Cut

The most important measurement should exploit

the highest ranked encoding possible.

• Position along a common scale

• Position on identical but nonaligned scales

• Length

• Angle or Slope

• Area

• Volume or Density or Color saturation

• Color hue
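
One way to read this rule is as a matching problem: sort the measurements by importance, sort the available encodings by rank, and pair them off. The sketch below does that with the list from the slide. The function name and the example measurement names are hypothetical.

```python
# Encodings in the order given on the slide, most accurately decoded first.
ENCODINGS_BY_RANK = [
    "position along a common scale",
    "position on identical but nonaligned scales",
    "length",
    "angle or slope",
    "area",
    "volume, density, or color saturation",
    "color hue",
]


def assign_encodings(measurements_by_importance, available_encodings):
    """Pair the most important measurement with the highest ranked available encoding."""
    ranked = sorted(available_encodings, key=ENCODINGS_BY_RANK.index)
    return dict(zip(measurements_by_importance, ranked))


# Hypothetical example: two measurements, two encodings left to spend.
plan = assign_encodings(
    ["mpg", "cylinders"],
    ["color hue", "position along a common scale"],
)
print(plan)
```

Here the more important measurement gets position along a common scale and the secondary one falls back to color hue.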

Page 34: How Humans See Data - Amazon Cut

“The first rule of color:

do not talk about color!”

- Tamara Munzner

Page 35: How Humans See Data - Amazon Cut

luminance

saturation

hue

Page 36: How Humans See Data - Amazon Cut
Page 37: How Humans See Data - Amazon Cut

luminance

saturation

hue

Page 38: How Humans See Data - Amazon Cut
Page 39: How Humans See Data - Amazon Cut
Page 40: How Humans See Data - Amazon Cut
Page 41: How Humans See Data - Amazon Cut
Page 42: How Humans See Data - Amazon Cut

Observation: Alphabetical is

almost never the correct ordering

of a categorical variable.
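
A minimal sketch of the alternative: order the categories by the value being plotted, so that position along the axis also carries the ranking. The category names and numbers are made up for illustration.

```python
def order_by_value(values_by_category, descending=True):
    """Return category names sorted by their value instead of by name."""
    return sorted(values_by_category, key=values_by_category.get, reverse=descending)


# Made-up data for illustration only.
values = {"apple": 12, "banana": 30, "cherry": 7}

print(sorted(values))          # alphabetical: ['apple', 'banana', 'cherry']
print(order_by_value(values))  # by value:     ['banana', 'apple', 'cherry']
```

With the value ordering, ranking neighbors becomes a matter of reading down the axis rather than comparing scattered marks.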

Page 43: How Humans See Data - Amazon Cut
Page 44: How Humans See Data - Amazon Cut
Page 45: How Humans See Data - Amazon Cut

The most important measurement should exploit

the highest ranked encoding possible.

• Position along a common scale

• Position on identical but nonaligned scales

• Length

• Angle or Slope

• Area

• Volume or Density or Color saturation

• Color hue

Page 46: How Humans See Data - Amazon Cut
Page 47: How Humans See Data - Amazon Cut
Page 48: How Humans See Data - Amazon Cut
Page 49: How Humans See Data - Amazon Cut
Page 50: How Humans See Data - Amazon Cut
Page 51: How Humans See Data - Amazon Cut

The most important measurement should exploit

the highest ranked encoding possible.

• Position along a common scale

• Position on identical but nonaligned scales

• Length

• Angle or Slope

• Area

• Volume or Density or Color saturation

• Color hue

Page 52: How Humans See Data - Amazon Cut
Page 53: How Humans See Data - Amazon Cut
Page 54: How Humans See Data - Amazon Cut
Page 55: How Humans See Data - Amazon Cut

The most important measurement should exploit

the highest ranked encoding possible.

• Position along a common scale

• Position on identical but nonaligned scales

• Length

• Angle or Slope

• Area

• Volume or Density or Color saturation

• Color hue

Page 56: How Humans See Data - Amazon Cut
Page 57: How Humans See Data - Amazon Cut
Page 58: How Humans See Data - Amazon Cut
Page 59: How Humans See Data - Amazon Cut
Page 60: How Humans See Data - Amazon Cut

The most important measurement should exploit

the highest ranked encoding possible.

• Position along a common scale

• Position on identical but nonaligned scales

• Length

• Angle or Slope

• Area

• Volume or Density or Color saturation

• Color hue

Page 61: How Humans See Data - Amazon Cut

11 mpg

Page 62: How Humans See Data - Amazon Cut

11 mpg

Page 63: How Humans See Data - Amazon Cut

11 mpg

Page 64: How Humans See Data - Amazon Cut
Page 65: How Humans See Data - Amazon Cut

The most important measurement should exploit

the highest ranked encoding possible.

• Position along a common scale

• Position on identical but nonaligned scales

• Length

• Angle or Slope

• Area

• Volume or Density or Color saturation

• Color hue

Page 66: How Humans See Data - Amazon Cut
Page 67: How Humans See Data - Amazon Cut
Page 68: How Humans See Data - Amazon Cut
Page 69: How Humans See Data - Amazon Cut

The most important measurement should exploit

the highest ranked encoding possible.

• Position along a common scale

• Position on identical but nonaligned scales

• Length

• Angle or Slope

• Area

• Volume or Density or Color saturation

• Color hue

Page 70: How Humans See Data - Amazon Cut
Page 71: How Humans See Data - Amazon Cut
Page 72: How Humans See Data - Amazon Cut

The most important measurement should exploit

the highest ranked encoding possible.

• Position along a common scale

• Position on identical but nonaligned scales

• Length

• Angle or Slope

• Area

• Volume or Density or Color saturation

• Color hue

Page 73: How Humans See Data - Amazon Cut

Observation: Stacked

anything is nearly always

a mistake.

Page 74: How Humans See Data - Amazon Cut
Page 75: How Humans See Data - Amazon Cut
Page 76: How Humans See Data - Amazon Cut
Page 77: How Humans See Data - Amazon Cut
Page 78: How Humans See Data - Amazon Cut
Page 79: How Humans See Data - Amazon Cut

Stacking makes the reader

decode lengths, not position

on a common scale.
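
The point can be made concrete by computing where each stacked segment starts and ends. In this sketch (function names and data are made up), only the bottom layer starts every segment at the same baseline; every layer above it floats on whatever lies underneath.

```python
def stacked_segments(layers):
    """For each layer (bottom first), return a list of (baseline, top) pairs."""
    baseline = [0.0] * len(layers[0])
    result = []
    for layer in layers:
        tops = [b + v for b, v in zip(baseline, layer)]
        result.append(list(zip(baseline, tops)))
        baseline = tops
    return result


def shares_common_baseline(segments):
    """True if every segment in the layer starts at the same position."""
    return len({start for start, _ in segments}) == 1


# Made-up data: the second layer is constant, but it sits on a rising first layer.
bottom, upper = stacked_segments([[1, 2, 3], [2, 2, 2]])
print(shares_common_baseline(bottom))  # True
print(shares_common_baseline(upper))   # False
print(upper)
```

The upper layer's values are all equal, yet its segments start and end at different heights, so the reader has to judge lengths instead of positions.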

Page 80: How Humans See Data - Amazon Cut

11 mpg

Page 81: How Humans See Data - Amazon Cut
Page 82: How Humans See Data - Amazon Cut

Observation: Stacked

anything is nearly always

a mistake.

Page 83: How Humans See Data - Amazon Cut
Page 84: How Humans See Data - Amazon Cut

Observation: Pie charts are

ALWAYS a mistake.

Page 85: How Humans See Data - Amazon Cut

Piecharts are the information visualization

equivalent of a roofing hammer to the

frontal lobe. They have no place in the world

of grownups, and occupy the same semiotic

space as short pants, a runny nose, and

chocolate smeared on one’s face. They are

as professional as a pair of assless chaps.

http://blog.codahale.com/2006/04/29/google-analytics-the-goggles-they-do-nothing/

Page 87: How Humans See Data - Amazon Cut

The most important measurement should exploit

the highest ranked encoding possible.

• Position along a common scale

• Position on identical but nonaligned scales

• Length

• Angle or Slope

• Area

• Volume or Density or Color saturation

• Color hue

Page 88: How Humans See Data - Amazon Cut
Page 89: How Humans See Data - Amazon Cut
Page 90: How Humans See Data - Amazon Cut

Tables are preferable to graphics for many small

data sets. A table is nearly always better than a

dumb pie chart; the only thing worse than a pie

chart is several of them, for then the viewer is

asked to compare quantities located in spatial

disarray both within and between pies… Given

their low data-density and failure to order

numbers along a visual dimension, pie charts

should never be used.

-Edward Tufte, The Visual Display of Quantitative Information

Page 92: How Humans See Data - Amazon Cut

Who do you think did a better job in tonight’s debate?

                     Clinton   Trump
Among Democrats        99%       1%
Among Republicans      53%      47%

Page 93: How Humans See Data - Amazon Cut
Page 94: How Humans See Data - Amazon Cut

Afghanistan

Albania

Algeria

Angola

Argentina

Australia

Austria

Bahrain

Bangladesh

Belgium

Benin

Bolivia

Bosnia and Herzegovina

Botswana

Brazil

Bulgaria

Burkina Faso

Burundi

Cambodia

Cameroon

Page 95: How Humans See Data - Amazon Cut

All good pie charts are jokes.

Page 96: How Humans See Data - Amazon Cut
Page 97: How Humans See Data - Amazon Cut
Page 98: How Humans See Data - Amazon Cut

Observation: Comparison is trivial

on a common scale.

Page 99: How Humans See Data - Amazon Cut
Page 100: How Humans See Data - Amazon Cut
Page 101: How Humans See Data - Amazon Cut
Page 102: How Humans See Data - Amazon Cut
Page 103: How Humans See Data - Amazon Cut
Page 104: How Humans See Data - Amazon Cut

the dashboard metaphor is

fundamentally flawed

Page 105: How Humans See Data - Amazon Cut
Page 106: How Humans See Data - Amazon Cut
Page 107: How Humans See Data - Amazon Cut
Page 108: How Humans See Data - Amazon Cut

Observation: Scatterplots

show relationships directly.

Page 109: How Humans See Data - Amazon Cut
Page 110: How Humans See Data - Amazon Cut
Page 111: How Humans See Data - Amazon Cut

Observation: Growth charts

usually aren’t.

Page 112: How Humans See Data - Amazon Cut
Page 113: How Humans See Data - Amazon Cut

If growth (slope) is

important, plot it directly.
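
A minimal sketch of plotting growth directly: turn the series of levels into a series of period-over-period growth rates and put that on the axis. The function name and the numbers are made up for illustration.

```python
def growth_rates(levels):
    """Fractional change from each period to the next."""
    return [(curr - prev) / prev for prev, curr in zip(levels, levels[1:])]


# Made-up series: the level rises every period.
levels = [100, 150, 180, 198]
rates = growth_rates(levels)
print([f"{r:.0%}" for r in rates])
```

A chart of `levels` climbs the whole way, while a chart of `rates` falls each period. Showing the rate as a position means the reader no longer has to estimate it from slopes.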

Page 114: How Humans See Data - Amazon Cut
Page 115: How Humans See Data - Amazon Cut

Observation: Growth charts

usually aren’t.

If growth (slope) is important,

plot it directly.

Page 116: How Humans See Data - Amazon Cut

The most important measurement should exploit

the highest ranked encoding possible.

• Position along a common scale

• Position on identical but nonaligned scales

• Length

• Angle or Slope

• Area

• Volume or Density or Color saturation

• Color hue

Page 117: How Humans See Data - Amazon Cut

Cleveland’s three visual operations

of pattern perception:

1. Detection

2. Assembly

3. Estimation

Page 118: How Humans See Data - Amazon Cut

Part III: assembly

Page 119: How Humans See Data - Amazon Cut

Gestalt Psychology

Page 120: How Humans See Data - Amazon Cut
Page 121: How Humans See Data - Amazon Cut

reification

Page 122: How Humans See Data - Amazon Cut

emergence

Page 123: How Humans See Data - Amazon Cut
Page 124: How Humans See Data - Amazon Cut

emergence

Page 125: How Humans See Data - Amazon Cut
Page 126: How Humans See Data - Amazon Cut

Prägnanz

Page 127: How Humans See Data - Amazon Cut

Law Of Closure

Page 128: How Humans See Data - Amazon Cut
Page 129: How Humans See Data - Amazon Cut

Law Of Continuity

Page 130: How Humans See Data - Amazon Cut
Page 131: How Humans See Data - Amazon Cut
Page 132: How Humans See Data - Amazon Cut

Observation: Good plots

leverage the law of continuity

to assist with assembly.

Page 133: How Humans See Data - Amazon Cut
Page 134: How Humans See Data - Amazon Cut
Page 135: How Humans See Data - Amazon Cut

Law of Similarity

Page 136: How Humans See Data - Amazon Cut
Page 137: How Humans See Data - Amazon Cut
Page 138: How Humans See Data - Amazon Cut
Page 139: How Humans See Data - Amazon Cut
Page 140: How Humans See Data - Amazon Cut
Page 141: How Humans See Data - Amazon Cut
Page 142: How Humans See Data - Amazon Cut

Law of Proximity

Page 143: How Humans See Data - Amazon Cut
Page 144: How Humans See Data - Amazon Cut
Page 145: How Humans See Data - Amazon Cut

Observation: dodged bar

charts are a bad idea

Page 146: How Humans See Data - Amazon Cut

Cleveland’s three visual operations

of pattern perception:

1. Detection

2. Assembly

3. Estimation

Page 147: How Humans See Data - Amazon Cut

Part IV: detection

Page 148: How Humans See Data - Amazon Cut
Page 149: How Humans See Data - Amazon Cut
Page 150: How Humans See Data - Amazon Cut
Page 151: How Humans See Data - Amazon Cut
Page 152: How Humans See Data - Amazon Cut
Page 153: How Humans See Data - Amazon Cut
Page 154: How Humans See Data - Amazon Cut
Page 155: How Humans See Data - Amazon Cut

Excel’s defaults are pretty bad

Page 156: How Humans See Data - Amazon Cut

[Chart with Excel default formatting: y-axis from 0 to 200,000 in steps of 20,000; x-axis categories 1 to 6]

Page 157: How Humans See Data - Amazon Cut

Observation: Detection isn’t

as trivial as it seems.

Page 158: How Humans See Data - Amazon Cut
Page 159: How Humans See Data - Amazon Cut

“Above all else, show the data.”

-Tufte

Page 160: How Humans See Data - Amazon Cut

Part V: other useful results

Page 161: How Humans See Data - Amazon Cut

Weber’s law: The “Just Noticeable

Difference” is proportional to the

size of the initial stimulus.

Page 162: How Humans See Data - Amazon Cut

10 20

Page 163: How Humans See Data - Amazon Cut

10 20

100 110
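
The two number pairs can be compared with the relative change that Weber’s law is about. This is only a sketch: the function name is made up, and treating the slide’s numbers as stimulus magnitudes is an assumption.

```python
def relative_change(initial, changed):
    """Change in stimulus as a fraction of the initial stimulus."""
    return (changed - initial) / initial


# Both pairs differ by the same absolute amount, 10.
print(relative_change(10, 20))    # 1.0, i.e. a 100% change
print(relative_change(100, 110))  # 0.1, i.e. a 10% change
```

The absolute difference is identical, but relative to the starting size the second change is ten times smaller, which is why it is so much harder to see.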

Page 164: How Humans See Data - Amazon Cut
Page 165: How Humans See Data - Amazon Cut
Page 166: How Humans See Data - Amazon Cut

12 units

12 units

Page 167: How Humans See Data - Amazon Cut

Observation: Weber’s Law is

why gridlines are useful

Page 168: How Humans See Data - Amazon Cut
Page 169: How Humans See Data - Amazon Cut
Page 170: How Humans See Data - Amazon Cut
Page 171: How Humans See Data - Amazon Cut

“Erase non-data ink.”

-Tufte

Page 172: How Humans See Data - Amazon Cut

“Erase non-data ink,

within reason.”

-Tufte

Page 173: How Humans See Data - Amazon Cut

“Erase non-data ink that interferes

with detection or doesn’t assist

assembly and estimation.”

-Rauser

Page 174: How Humans See Data - Amazon Cut
Page 175: How Humans See Data - Amazon Cut

You are bad at estimating

the difference between lines.

Page 176: How Humans See Data - Amazon Cut
Page 177: How Humans See Data - Amazon Cut
Page 178: How Humans See Data - Amazon Cut
Page 179: How Humans See Data - Amazon Cut
Page 180: How Humans See Data - Amazon Cut

Observation: If a difference is

important, plot it directly.
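
A minimal sketch of plotting the difference directly: subtract one series from the other point by point and chart the result against its own scale. The function name and the data are made up for illustration.

```python
def pointwise_difference(first, second):
    """Return second minus first at each position; both series must align."""
    if len(first) != len(second):
        raise ValueError("series must have the same length")
    return [b - a for a, b in zip(first, second)]


# Made-up series for illustration only.
series_a = [10, 14, 19, 25]
series_b = [12, 17, 21, 26]
print(pointwise_difference(series_a, series_b))  # [2, 3, 2, 1]
```

The gap becomes a position on a common scale instead of a distance the reader has to judge between two moving lines.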

Page 181: How Humans See Data - Amazon Cut

You are best at detecting variation

in slope near 45 degrees.

Page 182: How Humans See Data - Amazon Cut
Page 183: How Humans See Data - Amazon Cut

banking to 45

Page 184: How Humans See Data - Amazon Cut
Page 185: How Humans See Data - Amazon Cut
Page 186: How Humans See Data - Amazon Cut

Observation: Banking to 45

best shows variation in slope
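
The slides do not say how the aspect ratio is chosen. One common approach, assumed here, is to pick the height-to-width ratio that puts the median absolute slope of the line segments at 45 degrees. A data slope s is drawn with physical slope s × (height / width) × (x range / y range), and setting that to 1 at the median gives the formula in this sketch. The function name is hypothetical.

```python
import statistics


def bank_to_45(xs, ys):
    """Height/width ratio that draws the median absolute segment slope at 45 degrees.

    Assumes the points are ordered by x and the plot frame spans the data range.
    """
    points = list(zip(xs, ys))
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
        if x1 != x0 and y1 != y0  # skip vertical and flat segments
    ]
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    return y_range / (x_range * statistics.median(slopes))


# A zigzag that rises and falls by 1 for every unit of x.
print(bank_to_45([0, 1, 2], [0, 1, 0]))  # 0.5: the frame should be twice as wide as tall
```

The returned ratio could be passed to whatever aspect-ratio setting the plotting tool offers. Other banking methods exist and may give different answers.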

Page 187: How Humans See Data - Amazon Cut
Page 188: How Humans See Data - Amazon Cut

Q: Should I include 0 on my scale?

Page 189: How Humans See Data - Amazon Cut
Page 190: How Humans See Data - Amazon Cut
Page 191: How Humans See Data - Amazon Cut

Q: Should I include 0 on my scale?

A: It depends.

Page 192: How Humans See Data - Amazon Cut

Q: Should I include 0 on my scale?

A: Relying on the pre-attentive

perception of size or intensity?

Yes, otherwise you will mislead.

Using position? It’s up to you.
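
The risk with size encodings can be put in numbers. For a bar chart, the reader compares drawn bar lengths, and those lengths depend on where the axis starts. The function name and the values below are made up for illustration.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of drawn bar lengths for values a and b when the axis starts at baseline."""
    return (a - baseline) / (b - baseline)


# Two made-up values that differ by 5%.
print(apparent_ratio(105, 100))               # about 1.05 with zero on the scale
print(apparent_ratio(105, 100, baseline=95))  # 2.0 with the axis starting at 95
```

With the axis cut at 95, one bar is drawn twice as long as the other even though the values differ by 5%. A dot plot of the same values has no bar length to misread, which is why the choice is open when position is the encoding.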

Page 193: How Humans See Data - Amazon Cut
Page 194: How Humans See Data - Amazon Cut
Page 195: How Humans See Data - Amazon Cut
Page 196: How Humans See Data - Amazon Cut
Page 197: How Humans See Data - Amazon Cut
Page 198: How Humans See Data - Amazon Cut
Page 199: How Humans See Data - Amazon Cut

“Above all else, show the data.”

-Tufte

Page 200: How Humans See Data - Amazon Cut

“Above all else, show

the variation in the data.”

-Rauser (via Tufte)

Page 201: How Humans See Data - Amazon Cut

R/ggplot2 code for every plot in this

presentation available at http://goo.gl/xH5PLV

The rendered document is at

http://rpubs.com/jrauser/hhsd_notes

This presentation is at

https://goo.gl/LuDNje

I will tweet these links as @jrauser

Page 202: How Humans See Data - Amazon Cut

coda

Page 203: How Humans See Data - Amazon Cut

visualization

is

communication

Page 204: How Humans See Data - Amazon Cut

art

is

communication

Page 205: How Humans See Data - Amazon Cut

visualization

is

art

Page 206: How Humans See Data - Amazon Cut
Page 207: How Humans See Data - Amazon Cut
Page 208: How Humans See Data - Amazon Cut
Page 209: How Humans See Data - Amazon Cut
Page 210: How Humans See Data - Amazon Cut
Page 211: How Humans See Data - Amazon Cut

why does it make you

feel that way?

Page 212: How Humans See Data - Amazon Cut

visualization has as much to

learn from art as from science

Page 213: How Humans See Data - Amazon Cut

R/ggplot2 code for every plot in this

presentation available at http://goo.gl/xH5PLV

The rendered document is at

http://rpubs.com/jrauser/hhsd_notes

This presentation is at

https://goo.gl/LuDNje

I will tweet these links as @jrauser

Page 214: How Humans See Data - Amazon Cut

end