#5 DataBeersBCN -"Dos and Don'ts of Data Viz"
DOS AND DON’TS OF DATA VIZ
A TALE OF PIES, DECEPTION AND MIND TRICKS
IÑAKI PUIGDOLLERS SABIN
Data Scientist
DON’T RESCALE PROPORTIONS!
1.75 times bigger
Source: http://cadenaser.com/
15.1 + 70.7 + 15.2 = 101%
DO KEEP THE PROPORTIONS AS THEY ARE
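One way to check whether a graphic keeps its proportions is to compare the ratio of the drawn sizes with the ratio of the underlying values. The sketch below is only an illustration: the function name `distortion_ratio` and the pixel heights and values in the example are hypothetical, not taken from the slides.

```python
def distortion_ratio(drawn_a, drawn_b, value_a, value_b):
    """Return how much the drawing exaggerates the ratio a/b.

    1.0 means the drawn sizes are proportional to the data;
    anything above 1.0 means `a` looks bigger than it should.
    """
    if min(drawn_a, drawn_b, value_a, value_b) <= 0:
        raise ValueError("sizes and values must be positive")
    return (drawn_a / drawn_b) / (value_a / value_b)


# Hypothetical example: values 40 and 20, drawn as bars 140 px and 40 px tall.
factor = distortion_ratio(drawn_a=140, drawn_b=40, value_a=40, value_b=20)
print(f"The first bar is drawn {factor:.2f} times bigger than it should be")
```

A result of 1.0 is the target: the bars then differ exactly as much as the numbers do.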
YET ANOTHER EXAMPLE …
?
Source: Twitter, @ppmadrid
NOW IN A PROPORTIONAL SCALE
[Chart: PSOE vs. Partido Popular; y-axis: Número de Parados (number of unemployed)]
DON’T OMIT THE ORIGIN OF THE Y-AXIS
Where is the axis??
94 is not 0
Source: http://blog.rtve.es/, http://mediamatters.org/
DO SHOW THE Y-AXIS FROM THE ORIGIN
Million Dollars
50.66% 49.07%
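The effect of cutting the y-axis can be put into numbers by comparing the bar heights as drawn with and without a truncated baseline. This is a minimal sketch: it uses the two percentages shown on the slide, while the function name `apparent_ratio` and the baseline of 48 are assumptions chosen for illustration.

```python
def apparent_ratio(value_a, value_b, baseline=0.0):
    """Ratio of the two bar heights as drawn when the y-axis starts at `baseline`."""
    if baseline >= min(value_a, value_b):
        raise ValueError("baseline must be below both values")
    return (value_a - baseline) / (value_b - baseline)


a, b = 50.66, 49.07  # the two percentages shown on the slide

# Axis starting at the origin: the bars look almost identical.
print(round(apparent_ratio(a, b), 2))

# Hypothetical truncated axis starting at 48: the first bar looks
# more than twice as tall as the second.
print(round(apparent_ratio(a, b, baseline=48.0), 2))
```

The closer the baseline gets to the smaller value, the larger the apparent difference becomes, even though the data has not changed.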
THIS ALSO HAPPENS IN SCIENTIFIC PAPERS
This is a big difference, isn’t it?
According to the paper, this should be 1.82
The value of Y (Rape Myth Acceptance) varies between 1 and 5
There are values placed in the wrong position
Source: Fox, Jesse; Bailenson, Jeremy N.; Tricase, Liz (2013). "The embodiment of sexualized virtual selves: The Proteus effect and experiences of self-objectification via avatars". Computers in Human Behavior 29 (3): 930–938
THE REALITY IS SOMETHING DIFFERENT
[Plot label: Face]
It was not that different in the end…
Remember: The value of Y (Rape Myth Acceptance) varies between 1 and 5
DON’T USE INVENTED OR TAILOR-MADE SCALES
How can this be a line?
Source: http://mediamatters.org/
DO PLOT DATA AS IT IS
DON’T USE DIFFERENT SCALES FOR THE SAME AXIS
Left Y-Axis (representing the non-smokers) starts at 2
Right Y-Axis (representing the smokers) starts at 3
Source: H. Wainer, Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot
Disclaimer! This Graph is from a tobacco company
DO USE THE SAME SCALE TO MAKE DATA COMPARABLE
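A simple way to keep two series comparable is to compute one pair of axis limits from all the data and apply it to both. The sketch below assumes a hypothetical helper `shared_limits` and made-up numbers for the two groups; it does not reproduce the data behind the original graph.

```python
def shared_limits(*series, include_zero=True):
    """Return one (low, high) pair of axis limits covering every series."""
    values = [v for s in series for v in s]
    if not values:
        raise ValueError("at least one data point is required")
    low, high = min(values), max(values)
    if include_zero:
        # Extend the range so that the origin is always visible.
        low, high = min(low, 0), max(high, 0)
    return low, high


# Hypothetical values for the two groups.
non_smokers = [2.1, 2.4, 2.8]
smokers = [3.2, 3.9, 4.5]

low, high = shared_limits(non_smokers, smokers)
print(f"Use the same y-axis for both series: {low} to {high}")
```

The resulting limits can then be passed to whatever plotting tool is in use, so that both series are drawn against a single axis.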
DON’T SHOW MEANINGLESS NUMBERS
DON’T USE PIE CHARTS
193% ???
That’s a big pie!
Source: http://mediamatters.org/
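Before drawing shares of a whole, it is worth checking that they actually add up to 100%. The sketch below is one possible check: the function name `shares_sum_to_whole` and the 0.5-point rounding tolerance are assumptions, the first set of shares is the one quoted earlier in this deck, and the second set is illustrative, chosen only so that it adds up to 193.

```python
import math


def shares_sum_to_whole(shares, whole=100.0, tolerance=0.5):
    """Return (ok, total): whether the shares add up to `whole` within `tolerance`."""
    total = math.fsum(shares)
    return abs(total - whole) <= tolerance, total


# Shares quoted earlier in this deck: they add up to 101%.
ok, total = shares_sum_to_whole([15.1, 70.7, 15.2])
print(ok, round(total, 1))

# Illustrative shares that add up to 193%: far too much for one pie.
ok, total = shares_sum_to_whole([70, 63, 60])
print(ok, round(total, 1))
```

When the check fails, the categories probably overlap or the numbers were rounded carelessly, and a pie chart is the wrong choice either way.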
DON’T USE 3D
Perspective makes percentages look different
Source: http://imgarcade.com/1/misleading-circle-graphs/
SOME THINGS WE LEARNED AT SCHIBSTED
■ Know your audience and adapt the visualization to them
■ The title matters: it has to be attractive but not distracting
■ Select the most suitable plot: there is no one-plot-fits-all
■ Show only relevant information: crowded visualizations are misleading
■ Sometimes you can break the rules…
DO CHOOSE A VISUALIZATION FITTING YOUR AUDIENCE
Percentage of Sellers per segment
Slack channels sharing users
DON’T USE CROWDED PLOTS WITH MISLEADING INFORMATION
■ Too many elements
■ The colours are meaningless
■ The axes are misleading (not showing the origin)
DO SHOW ONLY WHAT IS IMPORTANT
■ Axes starting at 0
■ Only the necessary elements
GOAL: Show the correlation of the data points
… A DIFFERENT APPROACH
■ We don’t care about the exact values, so it’s OK to break the axis rule!
■ The colours have a meaning
GOAL: Show the distribution and density of the data points
WE ARE LOOKING FOR TALENT!
Thanks, questions?
Data Scientist – Schibsted Product & Technology