V is for Visualization
@everydayanalyst
Myles Harrison
www.everydayanalytics.ca
Big Data Week 2014
05-07-14
Practical Considerations for Visualizing “Big Data”
?
01100010 01101001 01100111 00100000 01100100 01100001 01110100 01100001
DATA
The 3 “V”s of Big Data:
Volume
Velocity
Variety
Visualization!
4
i.
DIMENSIONS
MEASURES
attentive processing
pre-attentive processing
1172 / 293 = ?
1172
293
Adapted from Show Me The Numbers, 2nd ed. by Stephen Few. Analytics Press, 2012
orientation length closure size (area)
curvature density estimation colour (hue)
intensity intersection termination depth
f(x) = data-ink / total ink used
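The ratio above is the data-ink ratio usually credited to Edward Tufte: the share of a graphic's ink that actually encodes data. A minimal sketch of the arithmetic follows; the function name and the ink measurements (imagined here as pixel counts) are hypothetical and not taken from the slides.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Return the share of a graphic's ink that encodes data (0.0 to 1.0)."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical ink measurements (e.g. pixel counts), for illustration only.
before = data_ink_ratio(data_ink=300, total_ink=1200)  # heavy gridlines and borders
after = data_ink_ratio(data_ink=300, total_ink=400)    # non-data ink stripped away
print(before, after)
```

The data ink is the same in both charts; only the non-data ink changes, so the ratio rises as decoration is removed.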
ii.
(how filling)
STOP!
quantity of interest
density
iii.
(quickly now)
iv.
(the spice of life)
METADATA
ABC
101010101010101101010110
Natural Language Processing (NLP)
bag o’ words
tf-idf
sentiment analysis
stemming
named-entity recognition
shingling
semantic analysis
information retrieval
N-grams
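To make two of the terms above concrete, here is a minimal sketch of a bag-of-words count weighted by tf-idf, using only the Python standard library. The whitespace tokenizer, the unsmoothed log(N / df) weighting, and the three sample documents are assumptions chosen for illustration; real libraries use several different tf-idf variants.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]  # naive whitespace tokenizer
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)  # the "bag of words" for this document
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up documents, for illustration only.
docs = [
    "big data is big",
    "data visualization is useful",
    "visualization of big data",
]
scores = tf_idf(docs)
for doc, weights in zip(docs, scores):
    top_term = max(weights, key=weights.get)
    print(f"{doc!r}: top term = {top_term!r}")
```

A term that appears in every document (here "data") gets a weight of zero, which is the point of the idf factor: it turns free text into numeric measures that emphasize what is distinctive about each document.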
1.0
0.0
-1.0
• Dimensionality reduction techniques:
– Multi-dimensional scaling (MDS)
– Principal Component Analysis (PCA)
– Linear Discriminant Analysis (LDA)
• Variable selection (forward & backward)
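As a rough illustration of dimensionality reduction, the sketch below finds the first principal component of a small dataset with power iteration on the covariance matrix, then projects each point onto it. This is a simplified standard-library stand-in, not a full PCA implementation; the function name and the sample points are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, projected scores) for the first principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # provided the starting vector is not orthogonal to it.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered point onto the component: one number per point.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Made-up points lying close to the line y = 2x, for illustration only.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
direction, scores = first_principal_component(points)
print(direction)
print(scores)
```

Each two-dimensional point collapses to a single score along the direction of greatest variance, which is the sense in which the technique trades dimensions for something easier to plot.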
Problems of High Dimensionality
Credit: Kirk, Andy. In Praise of Slopegraphs.
v.
(no jumping allowed)
Summary
• Keep in mind the basics - understand data & perception in visualization
• Use visualization for what it is good at, and analysis techniques to handle the ‘Big’ part
• Complexity can be handled with analytical techniques and less-common visualization types (where appropriate)
The Spectrum of Data Visualization
ANALYSIS DESIGN
Graphs
Data Art Infographics Dashboards
Tables Information Design
Science Art ?
Thank you