(A bit about) Data Visualization
ICPSR Biennial Meeting
October 2, 2015
Ryan Womack ([email protected])
Data Librarian, Rutgers University
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Introduction
What this talk IS:
Discusses standard techniques of data visualization, the day-to-day power tools for understanding data
Reviews various graphical techniques, from early to recent, from simple to advanced
Presents principles of good data presentation, and shows the R implementation of many functions
Introduction
What this talk is NOT:
It is not about “infographics”, the beautiful, heavily customized products of expert graphic designers. [See 1 and 2 for more discussion]
It is not about the cognitive science aspects of data perception [wish I knew more about this!]
It is not about how to use R or other software [although code is provided for those who are interested]
It is not necessarily a balanced survey of all data visualization. In particular, it is light on graph networks, clustering, and trees [not my expertise]
Very little mapping, too [Others are better at this]
Setup
Most of the graphics examples that are not web accessible are run in R.
R is open source software available at http://r-project.org
RStudio is a useful, freely available editor, available at http://rstudio.com
Workshop materials, including R scripts, supplemental images and data, are available for download from http://ryanwomack.com/ICPSR2015
The R script file contains working demonstrations of many of the concepts mentioned here for you to try on your own.
Outline
Why?
Whirlwind tour of historical data viz
Standard visualization vs. some less commonly used examples
3-D and Animation
Interactivity, data exploration
A little bit of big data
Why Data Visualization?
Data visualization can:
provide clear understanding of patterns in data
detect hidden structures in data
condense information
Anscombe’s Quartet
For example, see Anscombe’s quartet (image source: http://commons.wikimedia.org/wiki/File:Anscombe%27s_quartet_3.svg):
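The quartet can be checked directly in R, which ships the data as the built-in `anscombe` data frame. The following is a minimal sketch in base R, not taken from the workshop script: the four x/y pairs share nearly identical summary statistics, yet the plots look completely different.

```r
# Anscombe's quartet ships with base R as the data frame `anscombe`
# (columns x1..x4 and y1..y4).
data(anscombe)

# Summary statistics are nearly identical across the four sets ...
stats <- sapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  c(mean_x = mean(x), mean_y = mean(y), cor_xy = cor(x, y))
})
colnames(stats) <- paste("set", 1:4)
print(round(stats, 2))

# ... but plotting reveals four very different structures.
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  plot(x, y, main = paste("Set", i), xlim = c(3, 19), ylim = c(3, 13))
  abline(lm(y ~ x), col = "red")  # the fitted line is almost the same each time
}
par(op)
```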
Links to DataViz sites
Some examples of good data visualization (and fancy infographics) can be found at:
Information Aesthetics
Chart Porn
Eagereyes
DataVis.ca
VizWiz
US Census Data Visualization Gallery
Bad Graphs
Pie Charts are known to be problematic
Clutter and other issues can ruin graphics
Novel or nonsensical?
For more bad ideas, try:
Junk Charts
Ten Worst Graphs
WTFviz
Pie Chart Examples
image source: http://peltiertech.com/WordPress/3d-pie-charts/
Pie Chart Examples
image source: http://ndevisual.wordpress.com/tag/uses-of-pie-charts/
Pie Chart Examples
image source: http://www.nbcchicago.com/news/local/FOX-News-Chart-Fails-Math-73711092.html
Pie Chart Examples
image source: http://tips.vovici.com/content/111031 swb
Pie Chart Examples
image source: http://tips.vovici.com/content/111031 swb
Clutter Example
image source: http://junkcharts.typepad.com/junk_charts/2013/03/which-software-is-responsible-for-this.html
Playfair
Astronomical observations, charts, and maps led in graphical innovation prior to 1800. See also Classic Data Visualizations
William Playfair is the pioneer of the line chart, bar chart, time series plot, and pie chart.
Playfair, W. (1786). Commercial and Political Atlas: Representing, by Copper-Plate Charts, the Progress of the Commerce, Revenues, Expenditure, and Debts of England, during the Whole of the Eighteenth Century.
Playfair, W. (1801). Statistical Breviary.
Both republished in The Commercial and Political Atlas and Statistical Breviary, 2005, Cambridge University Press.
Playfair Examples
Playfair Examples
Minard
Charles Joseph Minard was the next influential data graphic creator after Playfair.
Minard’s flow map of Napoleon’s Russian campaign is celebrated by Tufte and others as one of the greatest information graphics.
It embodies an ideal of highly compressed informative elements, presented with style
Six variables: size, location in 2 dimensions, the direction of the army, temperature, date [and group]
This is a one-off design that crosses into infographics, but it can be reproduced in R and other software.
Minard Examples
Minard Examples
Fisher and Tukey
In the 20th century, statisticians such as Ronald Fisher and John Tukey continued to advance graphical methods for the analysis of data.
Fisher emphasized plotting the data to understand relationships.
Tukey’s Exploratory Data Analysis emphasized the use of graphics to understand the data during analysis, rather than the final presentation to an outside audience.
Tukey created the box and whiskers plot and the stem and leaf plot.
Tufte
Edward R. Tufte’s series of books, beginning with The Visual Display of Quantitative Information, have become the most widely known works on data visualization.
There is considerable overlap between the various publications
Tufte’s ideal is highly compressed, elegant, and informative data, as expressed in dense printed graphics
Tufte sometimes emphasizes beauty and design to the detriment of simplicity and clarity [e.g., train schedules]
“Graphical elegance is often found in simplicity of design and complexity of data.”
“Beautiful graphics do not traffic with the trivial.”
Train Schedule from Marey
Tufte’s principles
Tufte has developed and popularized numerous principles and terminology:
Graphics reveal data - show the data without distorting it - “above all else show the data”
Small multiples - understanding one slice makes understanding others easier
Lie factor - size of effect shown in the graphic / size of effect in the data
Graphical integrity - no lies, let data vary, not design
Data density - maximize data/ink ratio
Sparklines - seems they haven’t caught on
Chartjunk - self-explanatory
PowerPoint is responsible for most of the world’s sorrows [The Cognitive Style of PowerPoint]
Lie Factor
image source: http://www.datavis.ca/gallery/lie-factor.php
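The lie factor is simple arithmetic, so a small worked example may help. The sketch below uses the figures from Tufte's well-known fuel-economy example as they are commonly quoted (lines growing from 0.6 to 5.3 inches to represent a standard rising from 18 to 27.5 miles per gallon). The function name `lie_factor` is hypothetical, introduced only for illustration.

```r
# Lie factor = (size of effect shown in graphic) / (size of effect in data),
# where each "size of effect" is a relative change: (second - first) / first.
lie_factor <- function(graphic_first, graphic_second, data_first, data_second) {
  effect_graphic <- (graphic_second - graphic_first) / graphic_first
  effect_data    <- (data_second - data_first) / data_first
  effect_graphic / effect_data
}

# Graphic: line length grows from 0.6 to 5.3 inches (about a 783% increase).
# Data: the standard rises from 18 to 27.5 mpg (about a 53% increase).
lf <- lie_factor(0.6, 5.3, 18, 27.5)
print(round(lf, 1))  # 14.8
```

A lie factor of 1 means the graphic represents the change faithfully. Values far above 1 exaggerate the effect, and values far below 1 understate it.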
Cleveland
William Cleveland’s Elements of Graphing Data and Visualizing Data pioneered systematic consideration of data legibility
Cleveland is particularly known for promoting the dot plot as an alternative to bars and pies.
The dot plot provides clarity and easy comparison of data.
Cleveland also pioneered Trellis graphics
Trellis graphics emphasize comparison of multiple panels of data
The lattice package implements Trellis graphics in R
See Cleveland.pdf for a summary of Cleveland’s recommendations
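As an illustration of the Trellis idea in lattice, here is a minimal sketch using the built-in `mtcars` data. The choice of variables is an assumption made for this example, not something taken from the workshop script.

```r
library(lattice)  # lattice ships with standard R installations

# One panel per number of cylinders. All panels share the same axes,
# which makes comparison across the conditioning variable straightforward.
p <- xyplot(mpg ~ wt | factor(cyl), data = mtcars,
            layout = c(3, 1),
            xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
print(p)  # lattice plots are objects; print() draws them
```

The `|` in the formula is what requests the conditioning: each level of `factor(cyl)` gets its own panel.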
Scatterplot matrix
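A scatterplot matrix can be produced in base R with `pairs()`. This is a small sketch on the built-in `iris` data. The colour choices are arbitrary.

```r
# Every pairwise scatterplot of the four iris measurements in one grid.
# Colour points by species so group structure is visible in every panel.
vars <- iris[, 1:4]
species_col <- c("steelblue", "darkorange", "forestgreen")[as.integer(iris$Species)]
pairs(vars, col = species_col, pch = 19, cex = 0.6,
      main = "Scatterplot matrix of the iris measurements")
```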
The Grammar of Graphics
The Grammar of Graphics, by Leland Wilkinson, was extremely influential in thinking about graphics
Grammar means “rules for art and science”
The Grammar of Graphics specifies rules both mathematical and aesthetic
Earlier graph producers focused on aesthetics of static content
Dynamic graphics and scientific visualization, by contrast, require sophisticated designs to enable brushing, drill-down, zooming, linking
The Grammar of Graphics is easily adapted to this approach
ggplot2 was developed by Hadley Wickham as an implementation of the Grammar of Graphics
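To make the "grammar" idea concrete, the following sketch builds one ggplot2 plot from separate components. It assumes the ggplot2 package is installed. The data set and variable choices are illustrative only.

```r
# Assumes ggplot2 is installed: install.packages("ggplot2")
library(ggplot2)

# A plot is assembled from independent components of the grammar.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +         # data + aesthetic mappings
  geom_point(aes(colour = factor(cyl))) +           # geometric object: points
  geom_smooth(method = "lm", se = FALSE) +          # statistic: linear fit per panel
  facet_wrap(~ am) +                                # facets: one panel per transmission type
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")
print(p)
```

Because each component is added separately, swapping `geom_point()` for another geom or changing the facet variable leaves the rest of the specification untouched.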
From Barchart to Dot Plot
The Cleveland dot plot
use to compare labeled quantities, ordered lists
Figure: Bar chart v. Dot Plot
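A comparison along these lines can be drawn in base R with `barplot()` and `dotchart()`. This is a minimal sketch using the built-in `mtcars` data and is not necessarily how the original figure was made.

```r
# Car fuel economy from the built-in mtcars data, sorted for easy reading
mpg <- sort(setNames(mtcars$mpg, rownames(mtcars)))

op <- par(mfrow = c(1, 2), mar = c(4, 8, 2, 1))
barplot(mpg, horiz = TRUE, las = 1, cex.names = 0.6,
        main = "Bar chart", xlab = "Miles per gallon")
dotchart(mpg, cex = 0.6, pch = 19,
         main = "Cleveland dot plot", xlab = "Miles per gallon")
par(op)
```

Sorting the values first is what turns the dot plot into a readable ordered list.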
Visualizing Distributions of Data
Box and Whiskers Plot
Illustrates quantiles and outliers. There is also a Tufte version.
Violin plot
Blends density information with box and whiskers style (in an artistic manner)
Figure: Box Plot v. Violin Plot
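One way to draw both views of the same data is with lattice, whose `bwplot()` accepts `panel.violin` as an alternative panel function. This is a sketch on the built-in `iris` data. Other packages offer violin plots too, and the original figure may have used a different one.

```r
library(lattice)

# Same data, two views: box-and-whisker (quantiles, outliers) ...
box <- bwplot(Sepal.Length ~ Species, data = iris, main = "Box plot")
# ... and violin (a mirrored density estimate per group)
vio <- bwplot(Sepal.Length ~ Species, data = iris, main = "Violin plot",
              panel = panel.violin)

# Draw the two plots side by side on one page
print(box, split = c(1, 1, 2, 1), more = TRUE)
print(vio, split = c(2, 1, 2, 1))
```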
Visualizing Categorical Data
Beyond the pie chart
The mosaic plot allows multiple categories to be displayed on the same graph, but can be complicated to interpret.
The spineplot is a variant of the mosaic plot, plotting proportions in 2 dimensions.
Figure: Pie Chart v. Mosaic Plot
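All three displays are available in base R. The sketch below uses the built-in `Titanic` contingency table. The choice of data is an assumption made for illustration.

```r
# Titanic is a built-in 4-way contingency table (Class, Sex, Age, Survived)
tab <- margin.table(Titanic, c(1, 4))  # collapse to Class x Survived

op <- par(mfrow = c(1, 3))
pie(margin.table(Titanic, 1), main = "Pie: passengers by class")
mosaicplot(tab, main = "Mosaic: class by survival", color = TRUE)
spineplot(tab, main = "Spineplot: survival within class")
par(op)
```

The pie chart can show only one categorical variable at a time, while the mosaic plot and spineplot show how survival varies within each class.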
Maps and Glyphs
Maps are obviously an important and widespread way of presenting data.
We examine a few examples of choropleth maps, in which shading indicates data levels
See also Interactive Maps in R and 5 kinds of Interactive maps in Plot.ly for further exploration
Glyphs present iconic representations of data elements.
Weather maps often use glyphs.
A more dynamic example is here.
As an R example, consider Chernoff faces and the aplpack package. Also, Smiley faces [and many more graph variants in this chapter].
Figure: Choropleth Map v. Chernoff Faces
3-D
3-D scatterplots: cloud (lattice)
contour plots: to plot standardized levels of data
wireframe plots: to present a 3-D surface representation of data
rgl (a separate package containing several 3-D plotting functions and animation)
mosaic3d extends the mosaic paradigm to three dimensions
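The lattice functions named above can be tried on built-in data. This is a minimal sketch: `iris` and the `volcano` elevation matrix are choices made for this example, and rgl and mosaic3d are not covered here.

```r
library(lattice)

# 3-D scatterplot of three iris measurements, grouped by species
p_cloud <- cloud(Sepal.Length ~ Petal.Length * Petal.Width, data = iris,
                 groups = Species, auto.key = TRUE)

# The built-in volcano matrix (elevations on a grid), drawn as a surface
# and as contour lines
p_wire    <- wireframe(volcano, shade = TRUE,
                       xlab = "row", ylab = "column", zlab = "height")
p_contour <- contourplot(volcano, cuts = 10, xlab = "row", ylab = "column")

print(p_cloud)
print(p_wire)
print(p_contour)
```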
Animation
Animation is an easy way to step through data over time or to provide comparisons of different views of data
R makes animation easy with the animation package
Just enclose a sequence of graphics in one of the package’s save commands (saveHTML, saveGIF, saveSWF, saveLatex, saveVideo) to generate interactive HTML (or GIF, SWF, LaTeX, video).
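A minimal sketch of this pattern with `saveHTML()` from the animation package follows. It assumes the package is installed. The file names and the simulated data are made up for illustration.

```r
# Assumes the animation package is installed: install.packages("animation")
library(animation)

# Each pass through the loop draws one frame. saveHTML() records the frames
# as images and writes an HTML page with playback controls.
saveHTML({
  for (n in seq(10, 200, by = 10)) {
    hist(rnorm(n), breaks = 20, xlim = c(-4, 4),
         main = paste("Sample of", n, "standard normal draws"))
  }
}, htmlfile = "histogram_animation.html", img.name = "hist_frame",
   interval = 0.3, autobrowse = FALSE)
```

The other output formats follow the same shape: the plotting code goes in the first argument, and only the wrapping function changes.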
Interactive DataViz - Principles
Why aren’t all of our graphs interactive?
Brushing is used to select data points and track them through various analyses.
Drilling down, zooming, and subsetting are also interactive techniques.
Data displays can be linked so that a selection in one panel modifies the output displayed in another panel.
Interactivity is especially useful for data exploration, studying multidimensional relationships.
Interactive Data in Practice
There are many R packages that allow for interactive data work in a graphical user interface, including:
playwith - versatile package that works with any graphics function. Graphics can be explored, edited, and exported.
requires separate installation of GTK+ on your computer [method varies by OS]
googleVis
In many contexts, visualizing the relationships between data elements is made easier by viewing related data interactively.
Making this easy are googleVis and other “Vis” packages, e.g. bdvis for biodiversity or rainfreq.
A Library example - comparing selected ARL Statistics for public CIC universities
Interactive Data on the Web - rCharts
rCharts is a package that uses JavaScript to create interactive visualizations.
Lattice-style commands are used.
The package can output JavaScript for use in an HTML page.
Some commands depend on supplemental JavaScript libraries that must be installed, such as NVD3
Can embed in documents too, with slidify
Interactive Data on the Web - shiny
The shiny package is developed by the RStudio folks
You can learn shiny in half a day via the online tutorial
More custom control of the design is possible with shiny, in comparison to other do-it-all packages
Graphics use familiar R syntax (including ggplot2), with wrappers to implement web functionality
Every shiny app has the same structure: two R scripts saved together in a directory [ui and server files]
You must install the shiny server to deliver pages via the web
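The ui/server split can be sketched as follows. For brevity, both parts are shown in a single script and combined with `shinyApp()`. In the two-file layout described above, the `ui` definition would go in ui.R and the `server` function in server.R. The example assumes the shiny package is installed, and the app itself (a histogram of the built-in `faithful` data) is made up for illustration.

```r
# Assumes the shiny package is installed: install.packages("shiny")
library(shiny)

# User interface: one slider input and one plot output
ui <- fluidPage(
  titlePanel("Old Faithful eruptions"),
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server logic: redraw the histogram whenever the slider changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(faithful$eruptions, breaks = input$bins,
         main = "Eruption duration", xlab = "Minutes")
  })
}

app <- shinyApp(ui = ui, server = server)
# In an interactive R session, start the app locally with: runApp(app)
```

The plotting code inside `renderPlot()` is ordinary R. The only shiny-specific part is that it reads `input$bins`, which is what makes the plot react to the slider.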
Interactive Data on the Web - shiny, cont.
There are samples built into the shiny package.
You can build a Census Explorer of your own with these instructions from Ari Lamstein.
You can see more in the shiny gallery
rCharts works with shiny too.
Interactive Data on the Web - ggvis
The ggvis package is ALSO developed by the RStudio folks
Think ggplot meets shiny
Similar syntax to ggplot
Some ability to add interactive controls
Can embed in shiny for web access
Interactive Data on the Web - radiant
Radiant is another new R interface built with shiny
The following links demonstrate capabilities:
vnijs.shinyapps.io/base
vnijs.shinyapps.io/quant
vnijs.shinyapps.io/marketing
By automating the mechanics of interacting with data, we can focus on exploring and understanding.
Other (non-R) options for Web visualization
D3.js, free at http://d3js.org/
Inkscape, free at https://inkscape.org/
Tableau, free 1-year student license at http://www.tableau.com/academic/students
Plot.ly environment at http://plot.ly
Interactive Power
Population pyramids are one example where interactivity + animation = insight.
Populationpyramid.net - for all countries, basic animation
The German Population Pyramid from Destatis is even more interactive
Doing it in R is possible with these instructions (Part 1) and (Part 2)
Big Data
Big data presents special issues for data visualization
While many techniques and graphics are the same, exploration and plotting must be optimized for the size of the data set
Representation of the complexity of the data may require special techniques
hexbin
bigvis
![Page 51: ICPSR Biennial Meeting October 2, 2015 Ryan Womack ...€¦ · to present a 3-D surface representation of data rgl (a separate package containing several 3d plotting functions and](https://reader034.fdocuments.in/reader034/viewer/2022050219/5f64789192cb8771cf2db674/html5/thumbnails/51.jpg)
bigvis
bigvis was an experimental package by Hadley Wickham to deal with the issues of Big Data
There is a Preprint and R Meetup presentation by Hadley Wickham
Complete code is available at https://github.com/hadley/bigvis-infovis
Target: process 100 million observations in under 5 seconds.
Fundamental principle: No need for more data points than there are pixels on the screen.
“ggstat” package has been mentioned as a future project that will incorporate these ideas.
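The pixel principle above can be illustrated with a minimal sketch. It is written in plain Python rather than with the bigvis R API, and the function name `condense_to_pixels`, the 800-pixel plot width, and the simulated data are all assumptions made for illustration. The point is only that the summary handed to the plotting step has a fixed size, however many observations go in.

```python
import random

def condense_to_pixels(values, n_pixels):
    """Bin values into n_pixels equal-width bins and return the bin counts.

    Hypothetical helper for illustration; not part of bigvis.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_pixels or 1.0  # guard against a zero-width range
    counts = [0] * n_pixels
    for v in values:
        # The maximum value would fall just past the last bin, so clamp it.
        idx = min(int((v - lo) / width), n_pixels - 1)
        counts[idx] += 1
    return lo, width, counts

random.seed(1)
data = [random.gauss(0, 1) for _ in range(200_000)]  # simulated observations
origin, width, counts = condense_to_pixels(data, n_pixels=800)
print(len(data), "observations ->", len(counts), "bin counts")
```

Only the 800 counts need to be drawn, so rendering cost depends on the screen width and not on the number of observations.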
![Page 52: ICPSR Biennial Meeting October 2, 2015 Ryan Womack ...€¦ · to present a 3-D surface representation of data rgl (a separate package containing several 3d plotting functions and](https://reader034.fdocuments.in/reader034/viewer/2022050219/5f64789192cb8771cf2db674/html5/thumbnails/52.jpg)
bigvis steps
Condense (bin, condense)
Smooth (smooth, best_h, peel)
Visualize (autoplot plus standard methods)
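The three steps above can be sketched end to end. This is a language-neutral illustration in plain Python, not the bigvis implementation: the functions `condense`, `smooth`, and `visualize` below are hypothetical stand-ins that only borrow the step names, the triangular kernel is an assumed choice, and bandwidth selection and peeling are left out.

```python
import random

def condense(values, width):
    """Condense raw values into (bin_center, count) pairs for fixed-width bins."""
    counts = {}
    for v in values:
        idx = int(v // width)  # bin index via floor division
        counts[idx] = counts.get(idx, 0) + 1
    lo, hi = min(counts), max(counts)
    # Include empty bins so the smoother sees an evenly spaced grid.
    return [((i + 0.5) * width, counts.get(i, 0)) for i in range(lo, hi + 1)]

def smooth(binned, h):
    """Smooth the counts with a triangular kernel reaching h bins to each side."""
    n = len(binned)
    out = []
    for i, (center, _) in enumerate(binned):
        num = den = 0.0
        for j in range(max(0, i - h), min(n, i + h + 1)):
            w = 1.0 - abs(i - j) / (h + 1)  # weight falls off linearly with distance
            num += w * binned[j][1]
            den += w
        out.append((center, num / den))
    return out

def visualize(smoothed, height=40):
    """Render the smoothed summary as a crude text bar chart, one row per bin."""
    peak = max(c for _, c in smoothed) or 1.0
    return "\n".join(
        f"{x:6.2f} | " + "#" * round(height * c / peak) for x, c in smoothed
    )

random.seed(2)
data = [random.gauss(0, 1) for _ in range(100_000)]  # simulated observations
binned = condense(data, width=0.5)
smoothed = smooth(binned, h=2)
print(visualize(smoothed))
```

Smoothing operates on the condensed summary rather than the raw data, so its cost depends on the number of bins, not the number of observations.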
![Page 53: ICPSR Biennial Meeting October 2, 2015 Ryan Womack ...€¦ · to present a 3-D surface representation of data rgl (a separate package containing several 3d plotting functions and](https://reader034.fdocuments.in/reader034/viewer/2022050219/5f64789192cb8771cf2db674/html5/thumbnails/53.jpg)
Trelliscope (Tessera)
Tessera is developed by Purdue, Pacific Northwest National Laboratory, and Mozilla. Launched in November 2014, this project holds a lot of promise.
Running in the R environment, Tessera provides its own commands that execute across a cluster, easing the burden of analysis in this environment.
The datadr package “divides and recombines” in a manner similar to MapReduce, providing a simplified interface to Hadoop.
Tessera has its own visualization interface, Trelliscope, that can handle views across many variables and observations. Described in this paper.
Tessera’s Bootcamp is a good introduction, or try the quickstart.
Live demo is here.
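The "divide and recombine" idea mentioned for datadr can be sketched on a single machine. The Python below is an assumption-laden illustration, not the datadr API: the function names `divide`, `apply_to_subsets`, and `recombine`, the `region` and `value` fields, and the four records are all made up. In a real cluster setting, the per-subset step is the part that would run in parallel.

```python
from collections import defaultdict
from statistics import mean

def divide(records, key):
    """Divide: split records into independent subsets by a conditioning variable."""
    subsets = defaultdict(list)
    for rec in records:
        subsets[rec[key]].append(rec)
    return subsets

def apply_to_subsets(subsets, fn):
    """Apply the same analysis to every subset (serially here, for simplicity)."""
    return {k: fn(rows) for k, rows in subsets.items()}

def recombine(results):
    """Recombine: merge the per-subset results into one table sorted by key."""
    return sorted(results.items())

# Made-up records for illustration only.
records = [
    {"region": "north", "value": 4.0},
    {"region": "south", "value": 1.0},
    {"region": "north", "value": 6.0},
    {"region": "south", "value": 3.0},
]
by_region = divide(records, "region")
means = apply_to_subsets(by_region, lambda rows: mean(r["value"] for r in rows))
print(recombine(means))
```

Because each subset is analyzed independently, the same pattern extends to many subsets and many workers without changing the analysis function.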
![Page 54: ICPSR Biennial Meeting October 2, 2015 Ryan Womack ...€¦ · to present a 3-D surface representation of data rgl (a separate package containing several 3d plotting functions and](https://reader034.fdocuments.in/reader034/viewer/2022050219/5f64789192cb8771cf2db674/html5/thumbnails/54.jpg)
Infographics links
Although not covered here, the following links are a sampling of infographics sites for your later enjoyment:
Data Storytelling in Video
Art of Data Visualization - in spite of its title, more on the infographics side
Parisian Subway Traffic and New York Subway Inequality
Tulp Interactive
Mapping London and London Riots + Twitter
YouTube Trends Map
Global Burden of Disease Visualizations
and the Tree of Life
![Page 55: ICPSR Biennial Meeting October 2, 2015 Ryan Womack ...€¦ · to present a 3-D surface representation of data rgl (a separate package containing several 3d plotting functions and](https://reader034.fdocuments.in/reader034/viewer/2022050219/5f64789192cb8771cf2db674/html5/thumbnails/55.jpg)
Keep Exploring
Data Visualization represents a nearly infinite world of possibility for exploration:
plunge into programming
deep dives into data
indulge in interactivity
...have fun and keep learning! [e.g., R-bloggers.com]
![Page 56: ICPSR Biennial Meeting October 2, 2015 Ryan Womack ...€¦ · to present a 3-D surface representation of data rgl (a separate package containing several 3d plotting functions and](https://reader034.fdocuments.in/reader034/viewer/2022050219/5f64789192cb8771cf2db674/html5/thumbnails/56.jpg)
References
There is also an online bibliography of references to accompany this presentation on my home page.