"Introduction to Data Visualization" Workshop for General Assembly by Hunter Whitney Feb 2015


INTRODUCTION TO DATA VISUALIZATION

February 3, 2015 Hunter Whitney


DRAFT

INTRODUCTION

HUNTER WHITNEY

‣ UX Design and Data Visualization Consultant

‣ Author and Contributing Editor

‣ @hunterwhitney

INTRODUCTION

HELLO!

‣ Who are you?

‣ What do you do?

‣ What’s your learning goal for today?

‣ Is there a topic you’d like to visualize in the exercise today?


Sections:

1) What is Data Visualization?

2) Data Visualization Purposes

3) Data and Design

4) People and Process

5) Examples to Discuss

6) Class Exercise

7) Resources and Conclusions


CLASS EXERCISE PRELIMINARIES

DISCUSSION

Toward the end of class, we’re going to split up into groups and create data visualization concept designs. As we go through each section, think about applying the ideas we cover to a project you might choose.

Topic suggestion for the final exercise: create a visualization that shows how a series of events unfolds over time. Be creative. It doesn’t have to be just a timeline on an x-axis. This can be applied to many areas, including business (e.g., patterns of timing from VC funding to IPO), sports (e.g., changes in ball possession during a game), and medicine (e.g., the spread of an epidemic).

START THINKING…

KEY QUESTIONS TO ADDRESS IN YOUR PROJECTS

‣ What is the purpose/value of the visualization?

‣ Who are the intended users?

‣ How was the data selected and acquired?

‣ What design elements were used and why?

KEEP IN MIND…

‣ We’re only scratching the surface of every topic presented here

‣ The main goal is for you to look at data visualization with a holistic perspective

‣ Whatever your levels of skill and experience are, you have something to offer

INTRODUCTION TO DATA VISUALIZATION

SECTION 1: WHAT IS DATA VISUALIZATION?


VISUALIZATIONS MAKE IT EASIER TO SEE PATTERNS IN DATA


http://data.oecd.org/healthcare/child-vaccination-rates.htm

The key to effectively exposing meaningful patterns in data comes down to thoughtful visual encoding.

http://www.gapminder.org/


720349656089226535931140790070322302076958689027429003358787115045223998424533087922668417382319480046553364246202505406711172160430997890121737608183566145635519888049583302306957749597705315240714467203496560892265359311407900703223020769586890274290033587871150452239984245330879226684173823194800465533642462025054067111721604309978901217376081835661456355

How does encoding work?

Guess how many ‘7’s there are in this set.

[The same digit string as above, with each 7 highlighted.]

They’re the same set of numbers, but now the 7’s pop out at us.

Now, try guessing again.

720349656089226535931140790070322302076958689027429003358787115045223998424533087922668417382319480046553364246202505406711172160430997890121737608183566145635519888049583302306957749597705315240714467203496560892265359311407900703223020769586890274290033587871150452239984245330879226684173823194800465533642462025054067111721604309978901217376081835661456355

Effective visualizations require thoughtful encoding.
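The digit exercise above can be sketched in a few lines of code. This is only an illustration, assuming the short 30-digit sample from the slides; the function names `count_digit` and `highlight` are made up for this sketch, and brackets stand in for the color highlighting a slide would use.

```python
# Short digit sample copied from the slides.
DIGITS = "720349656089226535931140790070"


def count_digit(digits: str, target: str) -> int:
    """Count how often one digit character appears (the slow, 'read every digit' task)."""
    return digits.count(target)


def highlight(digits: str, target: str, start: str = "[", end: str = "]") -> str:
    """Re-encode the same data so the target digit stands out.

    Brackets are a plain-text stand-in for color; the data itself is unchanged.
    """
    return "".join(f"{start}{ch}{end}" if ch == target else ch for ch in digits)


if __name__ == "__main__":
    print(DIGITS)                     # unencoded: the 7s are hard to spot
    print(highlight(DIGITS, "7"))     # encoded: the 7s are marked
    print(count_digit(DIGITS, "7"))   # the answer to the guessing game for this sample
```

The data in both printed rows is identical. Only the encoding differs, which is the point of the exercise.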


Design decisions have a big impact on what people will see in the data.


720349656089226535931140790070

720349656089226535931140790070

A substantial portion of the human brain is devoted to visual processing

Source: http://www.flickr.com/photos/orangeacid/234358923/ (Creative Commons Attribution License)

Source: http://en.wikipedia.org/wiki/File:Brodmann_areas_17_18_19.png (GNU Free Documentation License)

WE ARE WIRED FOR VISUALIZATION

The retina transmits an estimated 10 million bits per second to the brain

Source: Current Biology (July 2006) by Judith McLean and Michael A. Freed

PRE-ATTENTIVE PROCESSING

TAPPING INTO OUR PERCEPTUAL POWERS

The pop-out effects are due to your brain’s pre-attentive processing.

‣ COLOR HUE

‣ ORIENTATION

‣ TEXTURE

‣ POSITION & ALIGNMENT

‣ COLOR BRIGHTNESS

‣ COLOR SATURATION

‣ SIZE

‣ SHAPE

What is easier to distinguish here - color or shape differences?

Some attributes pop out more than others.


http://www.slideshare.net/slideshow/view?login=johnwhalen&title=cognitive-science-of-design-in-10-minutes-or-less


SHAPE

BRAIN SYSTEMS

SECTION 1: DATA VISUALIZATION PROCESS AND PRACTICES

Adapted from Stephen Few.


PUTTING THE PIECES TOGETHER

The components of visualizations fit into a larger context of goals, users, and the media in which they are presented.

SECTION 2: DATA VISUALIZATION PURPOSES


I CAN RELATE!

Overview first, zoom and filter, then details-on-demand.

‣ Time Series and Event Sequences

‣ Part-to-Whole

‣ Geospatial

‣ Ranking

‣ Distribution

‣ Correlation

‣ Deviation

‣ Nominal Comparison

There can be overlaps in what can be shown and related in one visualization.

TIME-SERIES GRAPH

http://www.businessinsider.com/india-and-america-come-meet-mum-2015-1

STREAMGRAPH

TEMPORAL HEATMAP

EARLY EXAMPLES

NEAR REAL-TIME DATA

MORE TIME EXAMPLES

FOR A DEEPER DIVE INTO TEMPORAL DATA VIS…

http://www.oreilly.com/pub/e/3139

http://uxmag.com/articles/its-about-time

PART-TO-WHOLE: A TREEMAP OF TITANIC PROPORTIONS

Source: http://blog.visual.ly/the-whole-story-on-part-to-whole-relationships/

PART-TO-WHOLE: OTHER EXAMPLES

‣ Pie

‣ Stacked Area

‣ Parallel Sets

‣ Sankey Diagram

Source: http://blog.visual.ly/the-whole-story-on-part-to-whole-relationships/

FRUIT TREEMAPS: HIERARCHY AND PROPORTIONS

Source: Data Insights: New Ways to Visualize and Make Sense of Data, by Hunter Whitney, Morgan Kaufmann; 2012.

GEOSPATIAL: THE POLITICAL LANDSCAPE

GEOSPATIAL: EARLY EXAMPLE

Source: http://en.wikipedia.org/wiki/1854_Broad_Street_cholera_outbreak

http://uxmag.com/articles/leveraging-the-kano-model-for-optimal-results

RANKING

http://datavizblog.com/category/distribution/

DISTRIBUTION

http://www.statsblogs.com/2014/08/20/creating-heat-maps-in-sasiml/

CORRELATION

DEVIATION

NOMINAL COMPARISON: BAR CHART

Source: Data Insights: New Ways to Visualize and Make Sense of Data, by Hunter Whitney, Morgan Kaufmann; 2012.

DIFFERENT PERSPECTIVES: NOMINAL COMPARISON AND PART-TO-WHOLE

Source: Data Insights: New Ways to Visualize and Make Sense of Data, by Hunter Whitney, Morgan Kaufmann; 2012.

CLASS EXERCISE (KEEP IN MIND)

DISCUSSION

KEY QUESTIONS TO ADDRESS

‣ What are the main functions (e.g., exploratory, tracking, explanatory)?

‣ What kinds of design elements might you want to use?

‣ What level of interactivity might be good to include?

For whichever subject area you choose, think about the basic design elements and functions that might work best. These questions will come into sharper focus as you learn more about the goals of the users.

CONSIDERATIONS FOR YOUR CLASS PROJECT

SECTION 3: DATA AND DESIGN


http://phys.org/news/2013-10-visualization.html

THERE ARE ENDLESS FORMS OF VISUALIZATION

THE MARRIAGE OF DESIGN AND DATA

DATA CAN BE BROKEN INTO TWO MAJOR CLASSES: DISCRETE AND CONTINUOUS

Source: Data Insights: New Ways to Visualize and Make Sense of Data, by Hunter Whitney, Morgan Kaufmann; 2012.


THE MARRIAGE OF DESIGN AND MEASUREMENTS

Nominal Scale: This is simply putting items together without ordering or ranking them (e.g., an apple, an orange, and a tomato).

Ordinal Scale: Elements of the data describe properties of objects or events that are ordered by some characteristic.

Source: Data Insights: New Ways to Visualize and Make Sense of Data, by Hunter Whitney, Morgan Kaufmann; 2012.

Interval Scale: These are data measured on a scale with equal steps between values but no true zero, often temporal (e.g., the days of the week, hours of the day).


Ratio Scale: An ordered series of numbers assigned to items (objects, events, etc.) that allow for estimating and comparing different measures in terms of multiples, such as “half as many” or “four times as heavy.”
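One way to keep the four measurement scales straight is to treat them as layers, where each scale supports every comparison of the scales below it plus one more. The sketch below encodes that reading. The layering is the conventional interpretation of these four scales rather than something the slides spell out, and the names `Scale`, `NEW_COMPARISONS`, and `meaningful_comparisons` are invented for illustration.

```python
from enum import IntEnum


class Scale(IntEnum):
    """The four measurement scales, ordered from least to most structure."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# The comparison that first becomes meaningful at each level.
NEW_COMPARISONS = {
    Scale.NOMINAL: "same or different",
    Scale.ORDINAL: "higher or lower in rank",
    Scale.INTERVAL: "size of the difference between values",
    Scale.RATIO: "multiples, such as 'half as many'",
}


def meaningful_comparisons(scale: Scale) -> list[str]:
    """Return every comparison that makes sense for data on the given scale."""
    return [NEW_COMPARISONS[level] for level in Scale if level <= scale]


if __name__ == "__main__":
    for scale in Scale:
        print(scale.name, "->", meaningful_comparisons(scale))
```

A chart type that implies a comparison the data cannot support (for example, bar lengths for purely nominal categories with no counts) is a sign the encoding and the scale do not match.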

Source: Data Insights: New Ways to Visualize and Make Sense of Data, by Hunter Whitney, Morgan Kaufmann; 2012.

STATISTICAL SUMMARIZATION AND ANALYSIS

Visualizations can clarify or obscure the statistical summarization of

http://blog.visual.ly/using-visual-reasoning-to-understand-numbers/
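A small sketch of why summaries and pictures need each other: the two data sets below share the same mean and median, yet look nothing alike once plotted. The numbers are invented for illustration and do not come from the workshop, and `summarize` and `dot_plot` are hypothetical helper names.

```python
from statistics import mean, median, pstdev

# Illustrative values only; not workshop data.
CLUSTERED = [4, 5, 5, 5, 6]
SPREAD = [1, 1, 5, 9, 9]


def summarize(values):
    """Two common summary statistics that hide the shape of the data."""
    return {"mean": mean(values), "median": median(values)}


def dot_plot(values, low=1, high=9):
    """Text dot plot: one row per possible value, one '*' per observation."""
    return "\n".join(f"{v}: {'*' * values.count(v)}" for v in range(low, high + 1))


if __name__ == "__main__":
    for name, data in (("clustered", CLUSTERED), ("spread", SPREAD)):
        print(name, summarize(data), "std dev:", round(pstdev(data), 2))
        print(dot_plot(data))
        print()
```

The summaries match, but the dot plots show one tight cluster and one data set split between two extremes.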


Source: Reprinted in Data Insights: New Ways to Visualize and Make Sense of Data, by Hunter Whitney, Morgan Kaufmann; 2012.

CHART EFFECTIVENESS

Source: Enrico Bertini, Assistant Professor at NYU-Poly (@filwd)


Think about good design practices: selective labeling


Schwabish, Jonathan A. 2014. "An Economist's Guide to Visualizing Data." Journal of Economic Perspectives, 28(1): 209-34. DOI: 10.1257/jep.28.1.209


Think about good design practices: proximity

Which one is bigger, A or B?

Think about good design practices: multiples


Schwabish, Jonathan A. 2014. "An Economist's Guide to Visualizing Data." Journal of Economic Perspectives, 28(1): 209-34. DOI: 10.1257/jep.28.1.209


Source: Data Insights: New Ways to Visualize and Make Sense of Data, by Hunter Whitney, Morgan Kaufmann; 2012.

COLOR AND VALUE

http://blog.visual.ly/building-effective-color-scales/

YOUR VISUAL SYSTEM

http://www.lottolab.org/articles/illusionsoflight.asp

http://adaynotwasted.com/2010/02/light-and-color-illusionsgin-art/

CONSTANCY

GESTALT PRINCIPLES

Principles of Pattern Recognition: “Gestalt” is German for “pattern” or “form, configuration.”

Idea: Forms or patterns transcend the stimuli used to create them. Why do patterns emerge? Under what circumstances?

http://sixrevisions.com/web_design/gestalt-principles-applied-in-design/

http://graphicdesign.spokanefalls.edu/tutorials/process/gestaltprinciples/gestaltprinc.htm

What do you see here?

http://sixrevisions.com/web_design/gestalt-principles-applied-in-design/


‣ How do you design the “perfect” visualization?

‣ There’s no perfect visualization: the design space is just too big!

‣ But it’s up to you to design the one that fits...

‣ Visualization Display Choices

http://scitechdaily.com/scientists-manage-flood-big-data-space/

http://www.steema.com/tags/mobile

A FEW DATA VISUALIZATION DEVELOPMENT TOOLS:


SECTION 4: PEOPLE AND PROCESS


http://cnr.ncsu.edu/geospatial/wp-content/uploads/sites/6/2014/02/earth_observation-574_crop1-1500x600.jpg

VISUALIZATION IS ONLY THE TIP OF THE ICEBERG

Data visualization is only a part of a much larger process that includes identifying the purpose of the visualization, the kinds of people who will use it, the types of data that can be collected and analyzed, and good design choices.

VISUALIZATION IS PART OF AN ITERATIVE PROCESS

Source: Data Insights: New Ways to Visualize and Make Sense of Data, by Hunter Whitney, Morgan Kaufmann; 2012.

PERSPECTIVE: BIOTECHNOLOGY EXECUTIVE

‣ “We usually have an underlying narrative or hypothesis that is driving the analysis, but even with that you have to be ready for a surprise. Be willing to go where the data leads you, provided you have good data from multiple sources.”

‣ “We try to have teams involved in the data collection and analysis process ‘from soup to nuts’. If people join only at the end of the process, you could be setting yourself up for failure.”

‣ “If you rely on just one data set, you can be totally misled.”


USERS

USER QUESTION 1 - WHO VIEWS THE DATA?

ROLE

• RESEARCHER

• PUBLIC

PRIOR KNOWLEDGE

• NONE

• SUBJECT EXPERT

USE FREQUENCY

• ONCE A DECADE

• EVERY HOUR

PURPOSE

HYPOTHESIS?

• WHAT ARE WE TRYING TO LEARN OR SHOW?

GOAL?

• HOW DO WE KNOW IF WE ACHIEVED IT?

PARAMETERS?

• WHAT ARE THE BOUNDARIES?

DATA

DATA QUESTION 1 - WHO OWNS IT?

PRIMARY

• YOU COLLECT IT

• YOU OWN IT

• NOBODY ELSE HAS IT

SECONDARY

• OTHERS COLLECT IT

• OTHERS OWN IT

• OTHERS HAVE IT

DATA

DATA QUESTION 2 - DOES IT CHANGE?

DYNAMIC

• CHANGES OFTEN

• COLLECTED OFTEN

• TIME WINDOW MATTERS

STATIC

• DOES NOT CHANGE

• COLLECT IT ONCE

• TIME WINDOW MATTERS

“Applied field ethnography”, data, and map visualizations

USER CONTROL: HIGH

STATIC

EXPLAIN

EXPLORE

(e.g., data-intensive research applications)

(e.g., print infographic advocacy)

(e.g., interactive infographic journalism)

(e.g., data-rich visualizations with limited interactivity)

DYNAMIC

USER CONTROL: LOW


SECTION 5: EXAMPLES TO DISCUSS


After Nate Silver moved on to other things, The New York Times filled the gap with a data-centric journalism section called “The Upshot.”

Let’s discuss, deconstruct, and critique a few examples from the site. These are screenshots, so you may not have full context, but let’s see how these visualizations stand up.

You might want to visit the site, play with it more on your own, and practice evaluating it based on what we’ve already discussed.

http://www.nytimes.com/upshot/


http://www.nytimes.com/interactive/2014/07/08/upshot/how-the-year-you-were-born-influences-your-politics.html?abt=0002&abg=1


http://www.nytimes.com/newsgraphics/2014/senate-model/


http://www.nytimes.com/interactive/2014/upshot/buy-rent-calculator.html?abt=0002&abg=0


https://source.opennews.org/en-US/articles/nyts-512-paths-white-house/


SECTION 6: CLASS EXERCISE


EXERCISE IDEA: THINK TIME

‣ Get into groups of 4 or more, and discuss the ideas and examples you have in mind.

‣ Then...

• Select the purpose, audience, and data you want to use for a visualization

• Design the visualization on the provided poster paper

• Be ready to share your results and describe your thought process

Streamgraph Space Time Cube Gantt Chart
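Of the time-oriented chart types named above, the Gantt chart is the easiest to rough out before committing to poster paper. The sketch below draws one as plain text. The task list is hypothetical, day numbers are arbitrary units, and `gantt` is a made-up helper, so treat it as a starting point for thinking rather than a finished design.

```python
# Hypothetical tasks for illustration: (name, start day, end day), end exclusive.
TASKS = [
    ("Collect data", 0, 3),
    ("Sketch design", 2, 5),
    ("Build chart", 5, 8),
]


def gantt(tasks, width=None):
    """Return one text row per task, with '#' marking the days it is active."""
    width = width or max(end for _, _, end in tasks)
    label_width = max(len(name) for name, _, _ in tasks)
    rows = []
    for name, start, end in tasks:
        # Position (start) and length (end - start) encode when and how long.
        bar = "." * start + "#" * (end - start) + "." * (width - end)
        rows.append(f"{name.ljust(label_width)} |{bar}|")
    return "\n".join(rows)


if __name__ == "__main__":
    print(gantt(TASKS))
```

Overlaps between rows (here, collecting data and sketching the design on day 2) are visible at a glance, which a plain list of dates would not show.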


Food for thought...

http://www.gapminder.org


SECTION 7: RESOURCES AND CONCLUSIONS


DATA VISUALIZATION RESOURCES

Sites:

‣ Flowing Data (http://flowingdata.com/)

‣ Fast Company Co.design (http://www.fastcodesign.com/)

‣ UX Magazine (http://uxmag.com/)

‣ The Human-Computer Interaction Lab (http://www.cs.umd.edu/hcil/)

‣ A Periodic Table of Visualization Methods (www.visual-literacy.org/periodic_table/periodic_table.html)

DATA VISUALIZATION BOOKS:

‣ Bertin, J. (2011). Semiology of graphics: Diagrams, networks, maps. (Berg, W. J., Trans.) Redlands, CA: Esri Press. (Original work published 1965)

‣ Card, S. K., Mackinlay, J. D., & Shneiderman, B. (Eds.). (1999). Readings in information visualization: Using vision to think. San Francisco, CA: Morgan Kaufmann Publishers.

‣ Few, S. C. (2009). Now you see it: Simple visualization techniques for quantitative analysis. Oakland, CA: Analytics Press.

‣ Few, S. C. (2004). Show me the numbers: Designing tables and graphs to enlighten. Oakland, CA: Analytics Press.

‣ Fry, B. (2008). Visualizing data. Sebastopol, CA: O’Reilly Media, Inc.

‣ Segaran, T., & Hammerbacher, J. (Eds.) (2009). Beautiful data: The stories behind elegant data solutions. Sebastopol, CA: O’Reilly Media, Inc.

‣ Tufte, E.R. (1997). Visual explanations: Images and quantities, evidence and narrative. Cheshire, CT: Graphics Press, LLC.

‣ Ware, C. (2008). Visual thinking for design. Burlington, MA: Morgan Kaufmann Publishers.

‣ Whitney, H. (2012). Data insights: New ways to visualize and make sense of data. Morgan Kaufmann/Elsevier.

‣ Wilkinson, L. (2005). The grammar of graphics. Chicago, IL: Springer.

‣ Yau, N. (2011). Visualize this: The FlowingData guide to design, visualization, and statistics. Indianapolis, IN: Wiley Publishing, Inc.

‣ Length: Treisman & Gormican [1988]

‣ Width: Julesz [1985]

‣ Size: Treisman & Gelade [1980]

‣ Curvature: Treisman & Gormican [1988]

‣ Number: Julesz [1985]; Trick & Pylyshyn [1994]

‣ Terminators: Julesz & Bergen [1983]

‣ Intersection: Julesz & Bergen [1983]

‣ Closure: Enns [1986]; Treisman & Souther [1985]

‣ Color (hue): Nagy & Sanchez [1990, 1992]; D'Zmura [1991]; Kawai et al.

‣ Intensity: Beck et al. [1983]; Treisman & Gormican [1988]

‣ Flicker: Julesz [1971]

‣ Direction of motion: Nakayama & Silverman [1986]; Driver & McLeod [1992]

‣ Binocular luster: Wolfe & Franzel [1988]

‣ Stereoscopic depth: Nakayama & Silverman [1986]

‣ 3-D depth cues: Enns [1990]

‣ Lighting direction: Enns [1990]

CONCLUDING THOUGHTS

• Data visualization involves learning about the rules and the process

• Start with the problem, not with the data or the visualization

• Think big: find the data you need

• Visualize your data in multiple ways

• Know your audience and their goals

Keep in mind - the value of data depends on what you do with it


Source: Reprinted in Data Insights: New Ways to Visualize and Make Sense of Data, by Hunter Whitney, Morgan Kaufmann; 2012.


QUESTIONS?

CONTACT: HUNTER WHITNEY [email protected] @HUNTERWHITNEY
