Introduction - University of Northern Iowafabos/seminar/readings/tufte.pdf · ·...
-
Upload
nguyenkien -
Category
Documents
-
view
216 -
download
1
Transcript of Introduction - University of Northern Iowafabos/seminar/readings/tufte.pdf · ·...
Introduction
Data graphics visually display measured (Fulltities by meam of
the combined use of points, Jines, a coordinate system, numbers,
symbols, words, shading, and color.
The usc ofabstract, non-representational pictures to show numbers
is a surprisingly recent invention, perhaps because of the diversity
of skills required - the visual-artistic, empirical-statistical, and
mathematical. It was not ullti11750-1Hoo that statistical graphics
length and area to show quantity, time-series, scatterplots, and
multivariate displays-were invented, long after such triumphs of
mathematical ingenuity as logarithms, Cartesian coordinates, the
calculus, and the basics of probability theory. The remarkable
William Playfair (1759-1 H23) developed or improved upon nearly
all the fundamental graphical designs, seeking to replace conven
tional tables of numbers with the systematic visual representations
of his "linear ari thmetic."
Modern data graphics can do much more than simply substitute
for small statistical tables. At their best, graphics arc instrumcnts
for reasoning about quantitative inf(xmation. Often the most effec
tive way to describe, explore, and summarize a set of l1umbers
even a very large set-is to look at pictures of those numbers.
Furthermore, of all mcthods for analyzing and communicating
statistical information, well-designcd data graphics arc usually the
simplest and at the same time the most powerful.
The tlrst part of this book reviews the graphical practice of the
two cellturies since Playfair. The reader will, I hope, rejoice in the
graphical glories shown in Chapter 1 and then condemn the lapses
and lost opportunities exhibited in Chapter 2. Chapter 3, 011 graph
ical integrity and sophistication, seeks to account t(x these diftcr
ences in quality of graphical design.
The second part of the book provides a language for discussing graphics and a practical theory of data graphics. Applying to most visual displays of quantitative information, the theory leads to changes and improvements in design, suggests why some graphics might be better than others, and generates new types of graphics. The emphasis is on maximizing principles, empirical measures of graphical performance, and the sequential improvement of graphics through revision and editing. Insights into graphical design are to be gained, I believe, from theories of what makes for excellence in art, architecture, and prose.
This is a book about the design of statistical graphics and, as such, it is concerned both with design and with statistics. But it is also about how to communicate information through the simultaneous presentation of words, numbers, and pictures. The design of statistical graphics is a universal matter -like mathematics-and is not tied to the unique features of a particular language. The descriptive concepts (a vocabulary for graphics) and the principles advanced apply to most designs. I have at times provided evidence about the scope of these ideas, by showing how frequently a principle applies to (a random sample of) news and scientific graphics.
Each year, the world over, somewhere between 900 billion (9 X 1011) and 2 trillion (2 X 1012 ) images of statistical graphics are printed. The principles of this book apply to most of those graphics. Some of the suggested changes are small, but others are substantial, with consequences for hundreds of billions of printed pages.
But I hope also that the book has consequences for the viewers and makers of those images-that they will never view or create statistical graphics the same way again. That is in part because we are about to see, collected here, so many wonderful drawings, those of Playfair, of Minard, of Marcy, and, nowadays, of the computer.
Most of all, then, this book is a celebration of data graphics.
provides a language for discussing of data graphics. Applying to most
nformation, the theory leads to design, suggests why some graphics ld generates new types of graphics. g principles, empirical measures of . sequential improvement of graphics nsights into graphical design are to ries of what makes for excellence
I of statistical graphics and, as such, n and with statistics. But it is also formation through the simultaneous s, and pictures. The design of statis:ter-Iike mathematics-and is not particular language. The descriptive
phics) and the principles advanced .t times provided evidence about the 19 how frequently a principle applies and scientifIc graphics. omewhere between 900 billion [2) images of statistical graphics are look apply to most of those graphics. are small, but others are substantial,
.s of billions of printed pages. has consequences for the viewers and hey will never view or create statislin. That is in part because we are many wonderful drawings, those ~y, and, nowadays, of the computer. s a celebration of data graphics.
PART I
Graphical Practice
I Graphical Excellence
Excellence in statistical graphics consists of complex ideas
communicated with clarity, precision, and effIciency. Graphical
displays should
• show the data
• induce the viewer to think about the substance rather than about
methodology, graphic design, the technology of graphic production, or something else
• avoid distorting what the data have to say
• present many numbers in a small space
• make large data sets coherent
• encourage the eye to compare different pieces of data
• reveal the data at several levels of detail, from a broad overview
to the fIne structure
• serve a reasonably clear purpose: description, exploration,
tabulation, or decoration
• be closely integrated with the statistical and verbal descriptions
of a data set.
Graphics reveal data. Indeed graphics can be more precise and revealing than conventional statistical computations. Consider
Anscombe's quartet: all four of these data sets are described by
exactl y the same linear model (at least un til the residuals are ex
amined).
II III IV X Y X Y X Y X Y
10.0 8.04 10.0 9.14 10.0 7.46 8.0 6.58 8.0 6.95 8.0 8.14 8.0 6.77 8.0 5.76
13.0 7.58 13.0 8.74 13.0 12.74 8.0 7.71 9.0 8.81 9.0 8.77 9.0 7.11 8.0 8.84
11.0 8.33 11.0 9.26 11.0 7.81 8.0 8.47 14.0 9.06 14.0 8.10 14.0 8.84 8.0 7.04 6.0 7.24 6.0 6.13 6.0 6.08 8.0 5.25 4.0 4.26 4.0 3.10 4.0 5.39 19.0 12.50
12.0 10.84 12.0 9.13 12.0 8.15 8.0 5.51, 7.0 4.82 7.0 7.2() 7.0 6.42 8.0 7.91 5.0 5.68 5.0 4.74 5.0 5.73 8.0 6.89
N = 11 mean of X's = 9.0 mean of Y's = 7.5 equation l)f regression line: Y = 3 -I O.5X standard error ofestimate ofslope = 0.118 t = 4.24 sum of squares X - X = 110.0 regression sum of squares = 27.50 residual sum of squares of Y = 13.75 correlation coefficient = .82 r2 = .67
• • • • • • • •
• ••••
••• •••
•••
•• • •
•• •• • •
•
••
•
•
14 GRAPHICAL PRACTICE
And yet how they differ, as the graphical display of the data F. J. Anscombe, "Graphs in Statistical Analysis," AII/erirelll Steltistiriilll, 27makes vividly clear: (Febrnary 1973), 17-21.
II
•10 • •
••5 • •• •
10 20
IV •III •
••• I•I
And likewise a graphic easily reveals point A, a wildshot observation that will dominate standard statistical calculations. Note that point A hides in the marginal distribution but shows up as clearly exceptional in the bivariate scatter.
••
•Stephen S. Brier and Stephen E. Fienberg, "Recent Econometric Modelling
• • • of Crime and Punishment: Support tlJr•• • • • the Deterrence Hypothesis?" in StephCll
• A E. Fienberg and Albert J. Reiss. Jr., cds., - • • Indiwtors of Crill/I' illld Criminal jll.'tirl': Quantitiltillc Studies (Washington. D.C., 1980), p. 89.
t
Edward R. Dewey and Edwin F. Dakin, Cycles: Tit" SriCllC( of Predictioll (New York, 1947), p. 144.
GRAPHICAL EXCELLENCE 15
600
700
""<; 650 >
~ z
160 c o
400 -go
156...J
350
152
-148
550
168
500 164
450
.......~\ /~ \
' /' \... , # , ....... / \ '-- \
London Stock Prices \
\ \
..., \
" \/ \
/ \ New York Stock Prices // \
"..... / \ A ' ... _- J/ \
--' -..... _, \ "'8 \
\ \ \ \ \
'--
Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec.
-.010
+.002
-.012
" : -.006
" ,;. ~ -.004 " Co
.~ -.002 ~ u
Normal
'":; .~ -.008
Of course, statistical graphics, just like statistical calculations, are only as good as what goes into them. An ill-specified or preposterous model or a puny data set cannot be rescued by a graphic (or by calculation), no matter how clever or fancy. A silly theory
means a silly graphic:
SOLAR RADIATION AND STOCK PRICES
A. New York stock prices (Barron's average) . B. Solar Radiation, inverted, and C. London stock prices, all by months, 1929 (after Garcia-Mata and Shaffner) .
Let us turn to the practice of graphical excellence, the efficient communication of complex quantitative ideas. Excellence, nearly always of a multivariate sort, is illustrated here for fundamental graphical designs: data maps, time-series, space-time narrative
designs, and relational graphics. These examples serve several purposes, providing a set of high-quality graphics that can be discussed (and sometimes even redrawn) in constructing a theory of data graphics, helping to demonstrate a descriptive terminology, and telling in brief about the history of graphical development. Most of all, we will be able to see just how good statistical graphics can be.
16 GRAPHICAL PRACTICE
Data Maps
These six maps report the age-adjusted death rate from various types of cancer for the 3,056 counties of the United States. Each map portrays some 21,000 numbers. 1 Only a picture can carry such a volume of data in such a small space. Furthermore, all that data,
thanks to the graphic, can be thought about in many different ways at many different levels of analysis-ranging from the contemplation of general overall patterns to the detection of very
fine county-by-county detail. To take just a few examples, look at the
• high death rates from cancer in the northeast part of the country and around the Great Lakes
low rates in an east-west band across the middle of the country
• higher rates for men than for women in the south, particularly
Louisiana (cancers probably caused by occupational exposure, from working with asbestos in shipyards)
unusual hot spots, including northern Minnesota and a few counties in Iowa and Nebraska along the Missouri River
• differences in types of cancer by region (for example, the high rates of stomach cancer in the north-central part of the country -probably the result of the consumption of smoked fish by
Scandinavians)
rates in areas where you have lived.
The maps provide many leads into the causes-and avoidance-
of cancer. For example, the authors report:
In certain situations ... the unusual experience of a county warrants further investigation. For example, Salem County, New Jersey, leads the nation in bladder cancer mortality among white men. We attribute this excess risk to occupational exposures, since about 25 percent of the employed persons in this county work in the chemical industry, particularly the manufacturing of organic chemicals, which may cause bladder tumors. After the finding was communicated to New Jersey health officials, a company in the area reported that at least 330 workers in a single plant had developed bladder cancer during the last 50 years. It is urgent that surveys of cancer risk and programs in cancer control be initiated among workers and former workers in this area.2
1 Each county's rate is located in two dimensiCl11S and, further, at least four numbers would be necessary to reconstruct the size and shape of each county. This yields 7X 3,056 entries in a data matrix sufficient to reproduce a map.
In highest decile, statistically significant .. Significantly high, but not in highest decile .. In highc-st decile, but not statistically signifIcant
Not significantly differmt from U.S. as a whole c:=J Significantly lowc-r than c:=JU.S. as a whole
2 Robert Hoover, Thomas J. Mason, Frank W. McKay, and Joseph F. Fraumeni, Jr., "Cancer by County: New Resource for Etiologic Clues," Scie1lce, 189 (September 19, 1975), 1006.
Maps fwm Atlas of CarIcer Mortality 1'lI' U.S. COIl1lties: 1950-1969, by Thomas J. Mason, Frank W. McKay, Robert Hoover, William J. Blot, and Joseph F. Fraumcni,Jr. (Washington, D.C.: Public Health Service, National I11Stitutcs of Health, 1975). The six maps shown here were redesigned and redrawn by Lawrence Fahey and Edward Tufte.
All types of cancer, white females;
age-adjusted rate by county, 1950-1969
All types of cancer, white males;
age-adjusted rate by county, 1950-1969
Trachea, bronchus, and lung cancer; white females; age-adjusted rate
by county, 1950-1969
Trachea, bronchus, and lung cancer;
white males; age-adjusted rate
by county, 1950-1969
Stomach cancer, white females;
age-adjusted rate by county, 1950--1969
Stomach cancer, white males;
age-a~jllsted rate by county, 1950--1969
20 GH API! I CAL PHACTI CE
The maps repay careful study. Notice how quickly and naturally
our attention has been directed toward exploring the substantive
content of the data rather than toward questions of methodology
and technique. Nonetheless the maps do han' their flaws. They
wrongly equate the visual importance of each county with its
geographic area rather than with the number of people living in
the coullty (or t]Je number of cancer deaths). Our visual impres
sion of the data is entangled with the circumstance of geographic
boundaries, shapes, and areas-the chronic problem afflicting shaded
in-area designs of such "blot maps" or "patch maps."
A further shortcoming, a defect of data rather than graphical
composition, is that the maps arc founded on a suspect data source,
death certificate reports on the cause of death. These reports fIll
under the influence of diagnostic fashions prevailing among doc
tors and coroners in particular places and times, a troublesome
adulterant of the evidence purporting to describe the already some
times ambiguous matter of the exact bodily site of the primarv
cancer. Thus part of the regional clustering seen on the maps, as
well as some of the hot spots, may reflect varying diagnostic
customs and fads along with the actual ditE'rences in cancer rates
between areas.
Data maps have a curious history. It was not until the seventeenth
century that the combination of cartographic and statistical skills
required to construct the data map came together, fully 5,000 years
after the first geographic maps \vere drawn on clay tablets. And
many highly sophisticatecl geographic maps were produced cen
turies before the first map colltaining any statistical material was
drawn." For example, a detailed map with a full grid was engraved
during the eleventh century A.D. in China. The Yii Chi Thu (Map
of the Tracks of Yii the Great) shown here is described by Joseph
Needham as the
. . . most remarkable cartographic work of its age in any culture, can'ed in stone in - 1137 but probably dating from before - 1100. The scale of the grid is 100 Ii to the division. The coastal outline is relatively fmu and the precision of the network of river systems extraordinary. The size of the original, which is now in the Pei Lin Museum at Sian, is about 3 feet square. The name of the geographer is not known.... Anyone who compares this map with the contcmporary productions of European religious cosmography cannot but be amazed at the cxtent to which Chinese geography was at that time ahead of the \X/ cst.... There was nothing like it in Europe tiJl the Escorial MS. map of about ,', 1550 ....4
3 Dot3 IlUpS :He lIs1I3lh- descnbed .lS "thematic nUf1s" in cartogr.lf1hy. For a thorough account, sec Arthur H. Robinson, Early The/llatic 1i1appillg ill tllc Hi"tory 0/ Cartogral,hy (Chicago, 1982) . On the history of st3tistical gr3phics, sec H. Cray FunldlOuscr, "Historical Development of the Graphical Representation of Statistical Data," Osiris, 3 (November 19.17), 2°9-404: and Jamcs R. Beniger and Dorothy L. Robyn, "Quantitative Graphics in Statistics: A Brief History," AllicriCilIl Statisticiall, 32 (February 1978), 1-11.
4]osepb Ncedham, SciCl/rc alld Cil'ili"atiOIl ;'1 Chilla (Cambridge, 1959), v"l. 3.
54°-547.
. '\ ...... 11 to;, iI. Sajll.... Y
GRAPHICAL EXCELLENCE 21
.. . . 'I ;Of 9 . 'V
,. ,. 8 I ..
r.. 0(
'" , ~ .. T ICV
.. - '1-+-+-+-+-+'" I
~ f-- - -
I I t .. " ... , ~ i
t -j
"
~
, ,
---" . + ..---i.. --ti ,
I
E. Chavannes, "Lcs Deux Plus Ancicns Spt'cimcns de la Cartographic Chinoise," Bulletin de l' Ecole Frall\aise de ITxtreme Oriellt,3 (1903), 1-35, Carte B.
22 GRAPHICAL PRACTICE
~ Ecce formulam,v[um,arque HruCturarn TabuIarum ProIornxi,cum qUibufdam locis,Jn
quibus fiudiofus Geographi;c fe Caris exercerc pordt.
S E PTE N T RIO. pars fupcrior.
tA
o
jZ
37
pars inferior. MER I DIE S.
()
~i' % ~ ,~
3 '" M '" ~Z '" Ul
The 1546 edition of Cosmocernphia by Petrus Apianus contained examples of map design that show how very close European cartography by that time had come to achieving statistical graphicacy, even approaching the bivariate scatterplot. But, according to the historical record, no one had yet made the quantitative abstraction
of placing a measured quantity on the map's surface at the intersection of the two threads instead of the name of a city, let alone
the more difficult abstraction of replacing latitude and longitude with some other dimensions, such as time and money. Indeed, it was not until 1786 that the first economic time-series was plotted.
GRAPHICAL EXCELLENCE 23
'\ F LZ I C
One of the first data maps was Edmond Halley's 16R6 chart
showing trade winds and monsoons on a world map.;; The detailed
section below shows the cartograpllic symbolization; with, as
Halley wrote, "'... the sharp end of each little stroak pointing out
that part of the Horizon, from whence the wind continually comes;
and where there are Monsoons the rows of stroaks run alternately
backwards and forwards, by which means they are thicker [denser I than elsewhere."
!0 fJ !II n i 1 II Ii (I c" f ,1 V r -'"'-~ (II
\
'Nonnan J. W. Thrower, "Edlllond Halley as a Thematic (;co-Carwgraphcr," A/l/w/s o(the .1<,<0(;,11/011 of AlIl<r;(1111 Ccogr,ll'/;crs, 5'J (])cccnlber I'j()9), ('52-07()·
Edmond Halle\, "An Historical AcCOlll\t of the Tr.ldl' Winds, ,md !'v1on"l(ml, Ohser\ahk in the Sus lkt\\'CCl1 and Ncar the Tropicks: With ,111 Attelll f't to Assign thl' Phisic'al Cause of Said Winds," Ph;/o."(1!,hira! rrall.,actio/IS,
1~3 (1 (,~(,), 1 S3-1(,R,
24 GRAPH r CA L PHA CTr CE
An early and most worthy use of a map to chart pattcrns uf
discase was the famous dot map of Dr. John Snow, who plotted
the location of deaths from cholera in central London for Sep
tember 1R54. Deaths were marked by dots and, in addition, the
area's elevcn water pumps were located by crosses. Examining the
scatter over the surface of the map, Snow observed that cholera
occurred almost entirely among those who lived ncar (and drank
from) the Broad Street water pump. He had the handle of the
contaminated pump removed, ending the neighborhood epidemic
which had taken more than 500 lives.6 The pump is located at the
center of the map, just to the right of the D in BROAD STREET. Of
course the link between the pump and the disease might have been
revealed by computation and analysis without graphics, with some
good luck and hard work. But, here at least, graphical analysis
testifies about the data far more efficiently than calculation.
';"'J,
6 E. W. Gilbert, "Pioneer Maps ofHealth and Disease in England," Geographical Journal, 124 (1958), 172-183.
so 50 !'JO
Charles Joseph Minard gave quantity as well as direction to the
data measures located on the world map in his portrayal of the
1864 exports of French wine:
)$(;,\I-~l"'l"-' :1,\,\1"1"",,,,:1,,,,,,,,,,,,,,,,,,,, ,)"1.."t, ~",,,,- L'c t-~ll~r('al( ,,''''If,I:'i'jl''-'' I ,:,'"".,,:, If. \.",., "- .."
\ , ",.I'I ~,1;-" "'1":'" d.'",. 1,.'"".,,, ,'.' "'I,d.-.",., ""f'''Ic",
"1('''''''' 1.;,,1,,1',\,,-.,
I"" " '\1"1",1,,,,( .d~.'l,
I,,,'-'-"1'
1
_,'" ,J>'
\ \1 ,.: H I I' I L
\ I I I .\1 \ ,; '\ I \ ~) It I) ,"
I J: \ \ ( L
\\'=:'\"
( 111.'"
\1
,; (. I>
I: J:
,I"
() I.
\ I: 1\ () I:
\ I ... l' 1\ \ I. I I"
Charles Joseph Minard, Tahletl/lx Graph/qlll'S et Garles F(~lIrlllil'es de ;\1.. Millard, 1845-1869, a portfolio of his work held by the I3ibliothcclue de !'Eco!c NationaleIV
des Ponts ct Chaussces, Paris.
26 GRAPHICAL PRACTICE
Computerized cartography and modern photographic techniques have increased the dellSity of information some 5,ooo-fold in the
best of current data maps compared to Halley's pioneering effort. This map shows the distribution of 1.3 million galaxies (including some overlaps) in the northern galactic hemisphere. The map divides the sky into 1,024 X 2,222 rectangles. The number of gal
axies counted in each of the 2,275,328 rectangles is represented by ten gray tones; the darker the tone, the greater the number of galaxies counted. The north galactic pole is at the center. The
sharp edge on the left results from the earth blocking the view trom the observatory. In the area near the perimeter of the map,
the view is obscured by the interstellar dust of the galaxy in which we live (the Milky Way) as the line of sight passes through the flattened disk of our galaxy, The curious texture of local clusters of galaxies seen in this truly new view of the universe was not
anticipated by students of galaxies, who had, of course, microscopically examined millions of photographs of galaxies before seeing this macroscopic view. Although the clllSters are clearly
evident (and accounted for by a theory of galactic origins), the seemingly random filaments may be happenstance. The producers of the map note the "strong temptation to conclude that the galaxies are arranged in a remarkable filamentary pattern on scales of approximately 5° to 15°, but we caution that this visual impression may be misleading because the eye tends to pick out linear 7 Michael Seldner, B. H. Siebers, Edward
patterns even in random noise. Indeed, roughly similar patterllS J. C;roth and P. .lames E. Peebles, "New Reduction of the Lick Catalog of
are seen on maps constructed from simulated catalogs where no Galaxies," AslrOllOllli',ll Joumal, 82 linear structure has been built in, .. ,"7 (April 1977), 249-314.
The most extensive data maps, such as the cancer atlas and the count of the galaxies, place millions of bits of information on a single page before our eyes. No other method for the display of statistical information is so powerful.
'c;' ,~._ .• , •• ' - . ',~ - ••~' ~ - .... .y, ,.- ~>~fI'- .~-:r... ~~~ .. :"'~~:· ...f..:..... ~ .. ~ :._ - \-. ' ~~ ~.r: _~;-·,::."".1 .t.,_'· ~ ...T:"!'~~.':,.'1I'r1' 7- ....,'1-...; ~. ~~ ...... ~_, ~*. ~~"'"" .. 1.,...;.,,",' ...... _-,- -.;... ••.,~;~ - .. i .,.t~ .. ·.". ':-:-".-:"\ .~. ..:.- .I;~: ~.."~_'_'~ , ~:"~-;:"" ~~ .. '...-: '" ' ~..~: ~ ~ ...",;i'!.,.~;jt;..-:~, '. ~ ....-...!¥... 7', .... ''''J~ .... _ .~~1 -.(~. ~ ;!II',. _-v~ .. '"' .~. L • ~ ~ I , • ..: 0' r: "'r!f~' ~'-r-" • \ .s.' " ..~....:.,-. . t ~ :..- " ..' -~ -' .....~...... '. ~~~'"1.. •..•. ~' ,:~ .. '.' ,: .. ' ',4'\' ",,. -to".;
.~ •• : • .=.~ "~~'.~'~,.,,,. '".";';.:Ir. -... ~ .. ~~., ....Jf.;....•.;....~(:y'\.!: ~.:.:.,.~.. ...i( ~.r ~-'." :~. _ -',,"': r .- .' ,. ,..-..... ~ '.: 'v. ~ ~. , ~ '.~ -"'';' .~- •• ~ '~-" '-..,., -"''''''d:.:''flL- ',- ~"-' ,' '''-v •. ", ',,-' '.' -"~~~" .":x .. "':.;:;
_~: -'\" 'r\! .-.' .:-:. _- .'.,··".r_..--- a-';' 1:~_'J,~~-"';';;JI"ii ..... I.,.. :""1":::/ ~- .._. ;-: .. r_... :...... 'f,:""- .~--.:.-. ,. . ';;;To=":~ . "-til J:' ........ -iL··~~~~·- . ." ~-:."'" ·.... k~I··~'L..... .. ." '.,.Jr·•. ·~.·.-,r-JI t~~•• ~.' - '!';.-~ . r . ~&4i:' ~. . ~ .... '1;~"'-'" .~...,::'...... ~':"" ',:. " " ,-C'" -', :.,....., •. ... ~• ' ... -£J'. --.- _ • ........ .. -:".." ,..._ -.-.c' • ;0 ~ .~ .... • ~'" •• ~ -L - • '.:t.-..... .... .-' . ,,".,.... - " .:r -"'~' ..'- ° i '---' III ~ .• '._ '~ '. . ." ~ ." ." -.' ,-,..... ,..-,; .,.... " .. -. , ,'... ~.. . •. .,',' -r.--- .. ,~ .. , r :-.... '~ ~~. - - '
. '.' ,., c,,<~.<, ,.* ." ".' ,., .,' ",,."of." ' '.' ' ..-,' ',.', ~.:,- ." '.' ,.
·~;~~~;'·~:.~~~:t;·"&~~~J~~~J}q~j,~·:~t;i~~~~tf;i~j~;:.:;;:~
GRAPHICAL EXCELLENCE 27
I I
.-:~
J ..
-:~.:: - . .;. "(
.....•
':'j. .. ;. 'J--.' \
'~I! ",~)l:;;';)., . . " ",' '!~~
., ~. 'I . ,~ . "''0,;,,'. Ial, ,,'" ._~ '-"
\. j .. -,.', ~,l . >:.:rt:. "~"~':" -!>",,' ;. :t '. '.'.r. 'i{",~ '-""~ ... ' .,
,I ·~~~,r .. :···:~;~'~·~i~·~~. ~ r--~i" ~:) ~fM ~;'"... \.
.'" ( " '.. ;~..r-::'ifi;'J:' ,:,1 ,,!'" . '\" ~\,.,: ,~'••~~~'.:">~ ~~~;:-\.-j
, ,"".".~ ,. ~~' .' ..:'< •
,,, '", ' ~....,;.
.\'-~, .•..
'.;;." 1- './'
.' . :,~m,
C,{ ~\:", _~'~~ ... ,,: ...... l.. 1 ..~- ~ '~":':.:\~;: ~.~~>
'.'
28 GRAPHICAL PRACTICE
Time-Series
The time-series plot is the most frequently used form of graphic 8 A random sample of 4,000 graphics drawn fmm 15 of the world's newsdesign.s With one dimension marching along to the regular rhythm papers and magazines published from
of seconds, minutes, hours, days, weeks, months, years, centuries, or 1974 to 1980 found that more than 75
millennia, the natural ordering of the time scale gives this design a percent of all the graphics published were time-series. Chapter 3 reports more
strength and efficiency of interpretation found in no other graphic on this. arran gemen t.
This reputed tenth- (or possibly elcventh-) century illustration of the inclinations of the planetary orbits as a function of time, apparently part of a text for monastery schools, is the oldest known example of an attempt to show changing values graphically. It appears as a mysterious and isolated wonder in the history of data
graphics, since the next extant graphic of a plotted time-series shows up some 800 years later. According to Funkhouser, the astronomical content is confused and there are difficulties in recon
ciling the graph and its accompanying text with the actual movements of the planets. Particularly disconcerting is the wavy path
9 H. Gray Funkhouser, "A Note on a ascribed to the sun.') An erasure and correction of a curve occur Tenth Century Gr'lph," Osiris, 1
(January 1936), 260--262. near the middle of the graph.
.~ 0; \t.S
'-.
.iJi:..l/V /,V~,~ \ ~V,~VIP Vcr~1\ t\ \(\ "-I I l.i
.~ _ Vv 1\ '" "V V ~ / l'L Vl~rFI<c V 1/ I\. X' 1/" '\1\" V "1//LL ,7 ",VK .... ~ V II --t\ 'Y V
£/;X 1/ [If"" 1/ V"ll.J~l/I '\'/ ~ \ L,..1
"J :' 'Ij "'"' V
GRAPHICAL EXCELLENCE 29
It was not until the late 1700S that time-series charts began to
appear in scientific writings. This drawing ofJohann Heinrich
Lambert, one of a long series, shows the periodic variation in soil
temperature in relation to the depth under the surface. The greater
the depth, the greater the time-lag in temperature responsiveness.
Modern graphic designs showing time-serics periodicities differ
little from those of Lambert, although the data bases are far larger.
, II
\ I .. I~·1'1
I .! 1)
J. H. Lambert, Pyromctrie (Berlin, 1779).
I~OR~H=:PII SCLRCE '!OYI~GtcP ? ';xrHtmJ SClRCE-I
'E 10 I 56.2 kHz I2:
-'0
" I in..J • .. Nb .. • I' : ... ";
to ={w iL 3.11 cHz u cr 10
6 W!, • 'I • '""e • .. • i' ' I 1:: d d -{fu w 17.8 kHz *.'
f t
c:: -6 10
20" w 0 0 f-f<l ..J MLIGNETIC JOVIc.RARHIC
.200
R(R) 106.5 9").' 83.4 71.::) 59.2 46.5 33.1 19.3 JULY 2 JUlY 3 JULY 4 JULY 5 JULY 6 JULY 7 8,1979
This plot of radio emissions from Jupiter is based on data collected
by Voyager 2 in its pass close by the planet in July 1979. The radio
intensity increases and decreases in a tcn-hour cycle as Jupiter
rotates. Maximum intensity occurs when the Jovian north mag
netic pole is tipped toward the spacecraft, indicating a northern
hemisphere source. A southern source was detected on July 7, as
the spacecraft neared the equatorial plane. The horizontal scale
shows the distance of the spacecraft from the planet measured in
terms ofJupiter radii (R). Note the use of duallabcls on the hori
zontal to indicate both the date and distance from Jupiter. The
entire bottom panel also serves to label the horizontal scale,
describing the changing orientation of the spacecraft relative to
Jupiter as the planet is approached. The multiple time-series
enforce not only comparisons within each series over time (as do D. A. (;urnett, W. S. Kurth, and F. L.all time-series plots) but also comparisons between the three Scarf, "Plasma Wave Observations Near
different sampled radio bands shown. This richly multivariate Jupiter: Initial Results from Voyager 2," Scim(c 206 (November 23, 1979), display is based on 453,600 instrument samples of eight bits each. 987~991; and letter from Donald A.
The resulting 3.6 million bits were reduced by peak and average Gurnett tD Edward R. Tufte, June 27, processing to the 18,900 points actually plotted on the graphic. 1980.
'.
----------------------------------------
30 GRAPHICAL PRACTICE
Time-series displays are at their best for big data sets with real
variability. Why waste the power of data graphics on simple lin
ear changes,
which can usually be better summarized in one or two numbers?
Instead, graphics should be reserved for the richer, more complex, more difficult statistical material. This New York City weather
summary for 1980 depicts 2,220 numbers. The daily high and
low temperatures are shown in relation to the long-run average.
The path of the normal temperatures also provides a forecast of
expected change over the year; in tlle middle of February, for
instance, New York City residents can look forward to warming
at the rate of about 1.5 degrees per week all the way to July,
the yearly peak. This distinguished graphic successfully organizes
a large collection of numbers, makes comparisons between diflcrent parts of the data, and tells a story.
~~~(TJAL NORMAL I ACT JAL J0PMAL M'rUAl NOHh!~Al N~~C:TUALNC,qM~ ACTUAL NO"cI~::cL ACnJAl!.<::\RMA~C:TUAL--.i:!0B!'1..-Al 1 _~~AL "'OR'.AA~TU,'LNO~-'~~_LJhl Ng~1_
'!~;:i\'~' ~,vJ)?8~ :/\I\j"" -~~f,f'/ .niL·--tl~~,~=G:2i='VI~!f: J,~~i~:: (J I .. ~.~------L~""·M'D'N '~O"J--.-n_-_=t __--_ ..~~~- ~___ 0
NEW YORK CITY'S WEATHER FOR 1980 Wf-!IIhW
1
:,=~ I -~ ~~=-tL~ -.:a'=• ....J] Jdj •.• _~:
NC1V York Tillles,January 11, 1981, p. 32.
GRAPHICAL EXCELLENCE 31
E.]. Marey, La Methode Graphiquc (Paris, 1885), p. 20. The method is attributed to the French engineer, lbry.
A design with similar strengtbs is Marey's graphical train scbed
ule for Paris to Lyon in the 1880s. Arrivals and departures from a station are located along the horizontal; length of stop at a station is indicated by the length of the horizontal line. The stations are separated in proportion to their actual distance apart. The slope of the line reflects the speed of the train: the more nearly vertical the line, the faster the train. The intersection of two lines locates
the time and place that trains going in opposite directions pass
each other.
In 1981 a new express train from Paris to Lyon cut the trip to under three hours, compared to more than nine hours when Marcy published the graphical train schedule. The path of the modern
TGV (train a/?randc LJitcssc) is shown, overlaid on the schedule of 100 years before:
!~ 'I,
10 II MIDI
f~ I I t'l,
I
I i+ I
I IHII
1\ 'IL !.~ '\J
! i
[~~r I':,'I I: I I IJ Iii! il' N! nI
T m I' Y1iJI
i ! ru ' 11'liflJ~ Jij IJ!
{11~11~~
!
l i I
II 'I j, I
I II1 I
~ ~ ~I II '
10 II MIDI ,
8
~~F~I
I!, ,
I i L; ~
I II I
~~ I~ IIvk\~lti '!,~ "Y~'I! I: ii, II r
i I I I!}K
IT ,I , I
I I I
I
I
32 GRAPHICAL PRACTICE
The two great inventors of modern graphical designs were J. H. Lambert (1 72X-1777), a Swiss-German scientist and mathematician,
and William Playfair (1759-1823), a Scottish political economist. lO
The first known time-series using economic data was published in
Playfair's remarkable book, The COl1llllercial alia Political Atlas (Lon
don, 1786). Note the graphical arithmetic, which shows the shift
ing balance of trade by the difference between the import and export time-series. Playfair contrasted his new graphical method
with the tabular presentation of data:
Information, that is imperfectly acquired, is generally as imperfectly retained; and a man who has carefully investigated a printed table, finds, when done, that he has only a very faint and partial idea of what he has read; and that like a figure imprinted on sand, is soon totally erased and defaced. The amount of mercantile transactions in money, and of profit or loss, are capable of being as easily represented in drawing, as any part of space, or as the face of a country; though, till now, it has not been attempted. Upon that principle these Charts were made; and, while they give a simple and distinct idea, they are as near perfect accuracy as is any way useful. On inspecting anyone of these Charts attentively, a sufficiently distinct impression will be made, to remain unimpaired for a considerable time, and the idea which does remain will be simple and complete, at once including the duration and the amount. [pages 3-4]
For Playfair, graphics were preferable to tables because graphics
showed the shape of the data in a comparative perspective. Time
l7LlRT t'T"lllht' IiI/POR1:", ami EXP()RT,\' ;" £~V(TLAN1)d/ul //;""
• Ji'('/II Iii,' ,1(<1/' 1/1'(' t.. /7'''2 11' If /Ym!f/lr
10Laura Tilling, "Early Experimental Graphs," BrifUl JOlirnalfor flie History
~f Scimce, 8 (1975),193-213.
~ - 1 -~---
f- ~~-~I----t"-~-
~-1~---=X-E-=t-- -r---+-----l-
~'S' A
4
,L
11IHEHtt -JtLf 1 , ',-::,__
-," L,7J.. 8,?,bm
f...-.
qOQ .1JIl' .1?§O .1/-JO '74" i;'3"J 4'""0 L,:?<I
TJu. /),N~'I;;'nJ' 17/ 1,1", /ltllh'm, (?':If"'II; J-E~IR.". f,L IIt"vy rW 11t"/-(~4>;flhd71£l..1fIIJJ{l\-:1'71~(lr:\1).f f !i--u,':,.. <'~u~' f'a.fliof../ tlJ'th<,.ld,tU~ri; 'If',M.lt1p""'"/7<''.''
GRAPHICAL EXCELLENCE 33
series plots did this, and all but one of the 44 charts in the first edition of The Commercial and Political Atlas were time-series. That
one exception is the first known bar chart, which Playfair invented because year-to-year data were missing and he needed a design to portray the one-year data that were available. Nonetheless he was skeptical about his innovation:
This Chart is diflcrent from the others in principle, as it does not comprehend any portion of time, and it is much inferior in utility to those that do; for though it gives the extent of the different branches of trade, it does not compare the same branch of commerce with itself at different periods; nor does it imprint upon the mind that distinct idea, in doing which, the chief advantage of Charts consists: for as it wants the dimension that is formed by duration, there is no shape given to the quantities. [page 101 ]
He was right: small, noncomparative, highly labeled data sets usually belong in tables.
ill J" J" 1" ,)0 I,,, 7" fJo 'l0 \(><, '(0
. ft.,.;',..,,1
r),/'u,../
-:... '::--~ - I /Jrn;"",.,( ,,,,,I
')-"r .. '" ====.I",rn"f
(",./,,,,,1
The chart does show, at any rate, the imports (cross-hatched lines) and exports (solid lines) to and from Scotland in 1781 for
17 countries, which are ordered by volume of trade. The horizontal scale is at the top, possibly to make it more convenient to see in plotting the points by hand. Zero values are nicely indicated both by the absence of a bar and by a "0." The horizontal scale mistakenly repeats "200." In nearly all his charts, Playfair placed the
labels for the vertical scale on the right side of the page (suggesting that he plotted the data points using his left hand).
•
34 GRAPHICAL PRACTICE
Playfair's last book addressed the gllestion whether the price of wheat had increased relative to wages. In his Letter OI! our agriC1/ltural distresses, their causes and remedies; accompanied with tables and
copper-plate charts shewing and {(Jlliparil1c~ the prices c:r wheL1t, head and labour, from 1565 to 1821, Playfair wrote:
You have before you, my Lords and Gentlemen, a chart of the prices of wheat for 250 years, made from official returns; on the same plate I have traced a line representing, as nearly as I can, the wages of good mechanics, such as smiths, masons, and carpenters, in order to compare the proportion between them and the price of wheat at every different period.... the main fact deserving ofconsideration is, that never at any former period was wheat so cheap, in proportion to mechanical labour, as it is at the present time.... [pages 29-31]
Here Playfair plotted three parallel time-series: prices, wages, and the reigns of British kings and gueens.
IhtL.('("ufln;\'
h'ltztll'dh ./(1/",,, .. /
Ii lh (\'utllI:V
{lmrl'ul Ifh'vontl rYttU'b'.rH
(0 {'U.\RT. 0, , .J'III'II'I/I,/ III U/I/ !ti'H' )
, // _j/fr;" ;.; llll (Jllllril/"'I"IIml -__ /,, _.r •
,1\: \'~:!!('s of r~!~:~~~I~vlhr 't~.t.~
. //" Jh" r /;;;,; I, 101':.1/, ,,- ." /' -Jw "'
~nl.l.l.Ull'I.AY.·.~
lM,I. ('("utllry
(;"'1I1/'8
I"{,r"'!!"",,r
, 7(1 "
" 5
M· J .~
-1'; ~
'I I~ { ~
'I i . I , ";,1I. ";,¥,.r,, ",,'; .$= ~............... I I 1;"''''0.' 'I(' I.t'·,
)('j/ldlm,_
; );", , "";" ,llIf.JIt'l
The history and genealogy ofroyalty was long a graphical favorite.
This superb construction of E. J. Marey brings together several sets of facts about English rulers into a time-series that conveys a sense of the march of history. Marey (1830-1904) also pioneered
the development of graphical methods in human and animal physiology, including studies of horses moving at different paces,
GRAPHICAL EXCELLENCE 35
000 "'.... <Xl
'"~
0 001 0 c.o, .... <Xl '" I
0 0 0 0 '"('J l"'l .,. '" <0
I co co
~+
E.]. Marey, La MhilOde Graphiqllc (Paris, 1885), p. 6.
E.]. Marcy, l',[ollcmenl (London, 1895). Beginning with the tracks of the horse, the time-series are from pages 191, 224, 222, 265, 60, and 61.
n
, , , , , :
:0:0'
o I"'!! -~~ •• '," -:(j'j : I , , , ,
", Q ; ()
fl ('){ fl " n:
: Q ,
fl-j ~:,In ~_: I,'t) fl: -1\
()1'r.!inCl./'!J TF(llk q~lick ..t 1I/7Jlf. (,'I!/I,,)!. 11'{({!c. (Tail!] stn'de). 11,((lk.
---
36 GRAPHICAL PRACTICE
the movement of a starfish turning itself over (read intages from
the bottom upwards),
r---------- "----------- ---I
\
? - -----_.:f·~.11 (;;'
the undulations of the dorsal fin of a descending sea-horse, . Zi-~
_IIIIIIIIII.S.~ I~~~~ t~+t~+~ , ~,
'--- - --,- ~------'!
as well as the advancc of the gecko.
Marcy's man in black velvet, photographed in stick-figurc imagcs,
becamc the timc-series forerunner of Marcel Duchamp's Nude Descending a Staircase.
"~'I
,:';) _c: l '
(:,.'l~j '-, ~ ;'/)~S,).~ "
_.'-, Y', '.', ' ~~ '- ~ I ' '
_ ;t"l "I J?/~~"tn~!I~~~-~~ il,gJd,S!~) "
,"' ... - ~.-~'\,~#:,,__'J J, "
'-,I,
GRAPHICAL EXCELLENCE 37
The problem with time-series is that the simple passage of time
is not a good explanatory variable: descriptive chronology is not causal explanation. There are occasional exceptions, especially when
there is a clear mechanism that drives the Y-variable. This timeseries does testify about causality: the outgoing mail of the u.s. House of Representatives peaks every two years, just before the election dav:
I
60 - Monthly outgoing mail workload, millions of units
October 1972October 1970
October 1968
40
The graphic is worth at least 700 words, the number used in a news report describing how incumbent representatives exploit their
free mailing privileges to advance their re-election campaigns:
!, for sending them as an. integral i founding of the r~publlC, .and Ifrank~d newsletter to his ol~ I Dole of Kansas, Peter H. Domi- I Senator Tower's use ofFRANKED MAIL TIE Ipart of a model re-electIOn cam- only Congress pollees agamst constituents after he had len' nick ?f Colorado Charles McC.! franked mail in his 1972 cam· II
Npalgn. abuses o~ the free .f!lailin~s. office.. Me: Clark is seekmg~Mathlas Jr. of Maryland paignwasctocumen.ted bymem.
q~enator John G. T?wer, Re~ Complamts o~. pohtie:a1. use to regam hIS old post. ' Another pOlItical mall speClaj- d publtcan of Texas, mailed more of the free-mailing pnvilege, I . d' oran ums.TO VOTING SHOW than 800,000 special.interest called the franking privilege, I Practice Documente .. i1St, Lee W. MacGregor, wrote: Tom. Loe~fler, a nig~-rankif1g letters at taxpayer expense as are heard every election year. I Seldom has th<:: pohttcal a proposal. for. the. use of, campaIgn aIde, wrote In a mem-
Testimony Finds the Volume part of his 1972 re-election Recently, however, the volume,use of franked mall. been so ;franke~ mall by hIS chIef, ~ena- orandum da.ted Oct. ~7, . . effort and received campaign and cost of franked mail has ,well. documented as In fec.ent :to:,Javlts, m 1973... 1972, that dunng the campaIgn
Rises Before Elections 1yolunteer offers and donations multiplied. A new Federal law I~estlmony and documents filed: The over~all obJectlve of the Senator Tower had sent "31 110 response. will limit what out-of-office,ln a Federal Court by Common 1franked mal1.p.rogram can ~ special interest letters totaling I q~nator Jacob K. Javits. Re-I challengers can spend to unseat I~ause. the lobby group. whlch ,: to g.et th.e reclpl~';1t of the. mall approximately 803,333 franked
WASHlNGTO~. June 1 (AP)'pubhcan of New York. gave incumbents. IS sumg for an .e!1d to tax-fl-:to I.denttfy positively With a maIlings." -New court testimony and doc-l written appr~Yal in 1973 fOTI In 1972, Congress passed a !nanced mass mallmgs by Con_lpartlcu.lar stand you .have taken \1r. Tower was not available uments show that much ofla tax-paid mal! program intend- law prohibiting mass franked :gress. lor a ~lll you. hav~. mt~oduced; for comment. His administra· the mail Congress sends atl ed to better his image and Imailings within 28 days before I For. ~xample: Joyc~ ~. Bake.T, :the kmd of ldentlf!catlOn that, ti~e assistant, Elwin Skiles. taxpayer expense is tied. direct- payoff at the polls. He focused an ejection. The sponsor of!a political m.all speclal1st, sllld can be translated ll'lt.o a vote·sald the S~nator's use of ly to the re-election campaigns, his mail on areas where he that legislation, Representative 10 a 1973 Job proposal that at the polls on ejectIOn day" franked mail m 1972 was withof Senate and House members.l ~eeded votes. . Morris K. Udall, Democrat of! she wanted to set up direct- Mr. MacGregor saId. in the law.. and .he defended ~ccording . ~ material filed tiThe volume of "official"l Anzona, said in an inte"dew, mall progral?s for Republic~n I Mr. Javits was out of the the free-mailing prIvileges. 10 a laWSUit 10 F~deral Court.: \Congressicnal mail rises. in that further c.hange~ ~ere need- .Se~ators USlOg fran.ked mall. Icou.ntry an~ could ..not .be Pos.tal Service figures show
tISenate Republlcans put two election years and peaks just! ed to curtail political abuse· The purpose of such a pro- reached. HIS admlOlstratlve,that 10 the 12 months before direct-mail experts on the pub~ before the general election. of the frank. gram is to help an mcumbent aSSIstant, Donald Kellerman, November, 1973, Congress .sent lie payroU to ~vise the~.onl ~one. of this activity neces- Mr. Udall urged a 60-dayISe:tator get re-elected," she !def~nded the use of franked 222.9 .million .franked pieces how to use theIr free malhng: sanlyVlolates anyla.w or regula- ,pre-election cutoff for mass: saId. mall. ,of mall. But 10 the next 12 privileges to get votes. ltion, since Congress has wide Imailings and said he favored I She was put on the Senate: "It is a standard device to:months, covering the election
qAn election manual pre- discretion in the use of tax-paid, closing a loophole that recently payroll at $18,810 a year in 11et voters, not voters but Cltl2- season Of .1974, C~mgress sent pared for Senate Democrats:~aiJ. Congress gave itself thelallowed defeated Representa-·1973 and 1974 and testified i~ns. know w~at the Senatorl 350.6 ffillhon, a Jump of 57
~~,!? ~~~t~~s aati~~;~W:: ~~~r~e~~nd e~~~~~~ ~ail th;1 ;ite ::n~~l~~n?~~r~~ ~~~£ra~: khea~u~~~I:; 1~~ta;~se ~~b:;:e1. ,he ~~:~~ here In Washington, ;::,~~~~.a~k~~s :a~~~'s happen
38 GRAPHICAL PRACTICE
Time-series plots can be moved toward causal explanation by smuggling additional variables into the graphic design. For example, this decomposition of economic data, arraying 1,296 numbers, breaks out the top series into seasonal and trading-day fluctuations (which dominate short-term changes) to reveal the long-run trend
adjusted for inflation. (Note a significant defect in the design, however: the vertical grid conceals the height of the December
peaks.) The next step would be to bring in additional variables to explain the transformed and improved series at the bottom. ll
Systemafic and Irregular Components of Toto! Retail Soles, Unifed States [Jillion:;>; of (k31ars
20
44 40 36 32
28
24
j 105 100
. 95
r('rCd,t
rer Cl>rt l
Unadjusted data
I ~~J~II
PcrCl.'n t
\~~j :~~~~~Wv\rl 95
Holiday variation
Percent
1900 ISSI 1862 1563 1S6'l 1955 1966 1$7 1$8 1$9 1970 1971
11 See William S. Cleveland and Irma]. Tcrpcnning, "Graphical Methods for Seasonal Adjustment," Jllllrnal or the American Statisti((jl Associati,'1l 77 (March 1982), 52- 62 .
Julius Shiskin, "Measuring Currcnt Economic Fluctuations," Statistical Reporter (July 1973), p. 3.
time-series:
Finally, a vivid design (with appropriate data) IS the before-after
Time 01 day (PST)
3 6 9 12 15 18 21 24 - ,----;- i---T
8 r.~""~'" ".\10'\',"",,,,, "',1 4I'w~""oftiI""''''~
6 ~ I I'0 4
! x 2 ... !
0 ...¥'t!~j:~ .d'~1f'''''''';,''''''''',.J
,;! ~.-I
-2 LN 2 transfer
February 14, 1982-4
A monopole?
Cahrera's candidate monopole YiRna/ /(lorns Ol'('1' a dis/ur!>a!1ce cau ..... ed hy a li(jllid nitrogen tran~fer earlier in the da....... The jUnIP i/l maf;netic flux throURh the slIpel"cotlductil1R detector loop (or equil'alently. the jump in the loop's SllperCl/rrt'1l1) is jN.\! the riAl/It l1la,t!J1itude to he (l
monopole. ilforeover, the currelll remained stahle for many hours affenl'ard
And before and after the collapse of a bridge on the Rhone in 1840:
Paul de llourO'-S~Andcol sur Ie H.hime. b
GRAPHICAL EXCELLENCE 39
M. Mitchell Waldrop, "In Search of the Magnetic MOllopole," Sciel/ce (June 4, 1982), p. 1087.
Charles Joseph Minard, "Dc 1a Chute des Ponts dans Ies grandes Crues," (October 24, 18S6), Figure 3, in Minard, Collectioll de ses brochures (Paris, 18211869), held by the Bibliothcque de l'EcoIc Natiollale des Pouts et Challssees, Paris.
40 GRAPHICAL PRACTICE
Narrative Graphics of Space and Time
An especially effective device for enhancing the explanatory power
of time-series displays is to add spatial dimensions to the design of the graphic, so that the data are moving over space (in two or three dimensions) as well as over time. Three excellent space-time
story graphics illustrate here how multivariate complexity can be subtly integrated into graphical architecture, integrated so gently and unobtrusively that viewers are hardly aware that they are looking into a world of four or five dimensions. Occasionally graphics are belligerently multivariate, advertising the technique rather than the data. But not these three.
The first is the classic of Charles Joseph Minard (1781-1870), the French engineer, which shows the terrible fate of Napoleon's army
in Russia. Described by E. J. Marey as seeming to defy the pen of the historian by its brutal eloquence,12 this combination of data map and time-series, drawn in 1861, portrays the devastating losses suffered in Napoleon's Russian campaign of 1812. Beginning at the left on the Polish-Russian border near the Niemen River, the thick band shows the size of the army (4-22,000 men) as it invaded Russia in June 1812. The width of the band indicates the size of the army at each place on the map. In September, the army reached Moscow, which was by then sacked and deserted, with 100,000
men. The path of Napoleon's retreat from Moscow is depicted by the darker, lower band, which is linked to a temperature scale and dates at the bottom of the chart. It was a bitterly cold winter, and many froze on the march out of Russia. As the graphic shows, the crossing of the Berezina River was a disaster, and the army finally
struggled back into Poland with only 10,000 men remaining. Also shown arc the movements of auxiliary troops, as they sought to protect the rear and the flank of the advancing army. Minard's
graphic tells a rich, coherent story with its multivariate data, far more enlightening than just a single number bouncing along over time. Six variables are plotted: the size of the army, its location all a two-dimensional surface, direction of the army's movement, and temperature on various dates during the retreat from Moscow.
It may well be the best statistical graphic ever drawn.
12 E.]. Marcy, La MMlOde Graphiql!e (Paris, 1885), p. 73. For morc on Minard, see Arthur H. Robinson, "The Thematic Maps of Charles Joscph Minard," It1Iago Mttndi, 21 (1967), 95~108.
CARTE FIGURATIVE. deli pertes successives en homme~ de l'Armee Fran~aise dans fa campagne de Russie 1812-1813.
Dres.ee parM.Minard,lnspecteur Genera! des Ponts et Chaussees en retro1.ite.
o
22 000 oo
&.000 ,.;
91Jn' = NOVClllbcr 8 11 1'(' = ()nobcr
42 GRAPHICAL PRACTICE
The next time-space graphic, drawn by a computer, displays the levels of three air polIutallts located over a two-dimensional surflce (six counties in southern California) at four times during
the day. Nitrogen oxides (top row) are emitted by power plants, refineries, and vehicles. Refineries along the coast and Kaiser Steel's Fontana plant produce the post-midnight peaks shown in the first
panel; traffic and power plants (With their heavy daytime demand) send levels up during the day. Carbon monoxide (second row) is low after midnight except out at the steel plant; morning traffic then begins to generate each day's ocean of carbon monoxide, with the greatest concentration at the convergence of five freeways in downtown Los Angeles. Reactive hydrocarbons (third row), like nitrogen oxides, come from refmeries after midnight and then increase with traffic during the day. Each of the 12 time-spacepollutant slices summarizes pollutants for 2,400 spatial locations
(2,400 squares five kilometers on a side). Thus 28,800 pollutant readings are shown, except for those masked by peaks.
The air pollution display is a slJlalllllltltiplc. The same graphical design structure is repeated for each of the twelve slices or multiples. Small multiples are economical: once viewers understand the design of one slice, they have immediate access to the data in all the other slices. Thus, as the eye moves from one slice to the next, the constancy of the design allows the viewer to focus on changes in the data rather than on changes in graphical design.
~._-~ BOUNDARYOfSCAB~
~_.L_OSC.A.O.N.(..G.I-:~yES _.C~~N BERNARDINO"
ORANGE - __ . RiVERSIDE
c ', ----caUNTT COUNn"
Los Angeles TiIlICS, July 22, 1979; based on work of Gregory J. McRae, California Institute of Technology.
~---
I ~,ill\1 l
rr,' 'I)' I, I
~c-!~".,.-,' Vir ~~~:)I-II-------, ~r,'\:~~\JIII\ft\I.- ftv.:",,\, r"" \.
~,i-~~pdrJ !:11~~~'JV"~ 1"";>, \
~~~
Our third example of a space-time-story graphic ingeniously mixes space and time on the horizontal axis. This design moves
well beyond the convelltiollal time-series because of its clever
plotting fdd, with location rclatiw to the ground surface Oll the
vertical axis and time/space on the horizolltal. The life cycle of
the Japanese beetle is shown.
February March May JuneIJanuary J lily /\U~lust
I i I
i
More Abstract Designs: Relational Graphics
The invention of data graphics required replacing the latitude
longitude coordinates of the map with more abstract measures not
based on geographical analogy, Moving from maps to statistical
graphics was a big step, and thousands of years passed bdore this
step was taken by Lambert, Playfair, and others in the eighteenth
century. Even so, analogies to the physical world served as the
conceptual basis for early time-series. Playtair repeatedly compared
his charts to maps and, in the preface to the first edition of The COllllllcrrial alld Political Atlas, argued that his cllarts corresponded
to a physical realization of the data:
Suppose the mOlley we pay in anyone year for the expence of the Navy were in guineas, and that these guineas were laid down upon a large table in a straight line, and touching each other, and those paid next year ,vere laid down in another
GRAPHICAL EXCELLENCE 43
L. Hugh Newl1lan, ;Hall alld ["'COS
(L"nd()Jl, 19('5), Pl" 104- 10 5.
: September ~ October : NovPlllher" DUCl'mbcr
44 GRAPHICAL PRACTICE
straight line, and the same continued for a number of years: these lines would be of different lengths, as there were fewer or more guineas; and they would make a shape, the dimensions of which would agree exactly with the amount of the sums; and the value of a guinea would be represented by the part of space which it covered. The charts are exactly this upon a small scale, and one division represents the breadth or value of ten thousand or an hundred thousand guineas as marked, with the same exactness that a square inch upon a map may represent a square mile of country. And they, therefore, are a representation of the real money laid down in differellt lines, as it was originally paid away. [pages iii-iv]
Fifteen years later in The Statistical Brel!iary, his most theoretical
book about graphics, Playtair broke free of analogies to the physical world and drew graphics as designs-in-themselves.
One of four plates in The Statistical BreIJiary, this graphic is dis
tinguished by its multivariate data, the use of area to depict quan
tity, and the pie chart~in apparently the first application of these
devices. The circle represents the area of each country; the line on
the left, the population in millions read on the vertical scales; the
line on the right, the revenue (taxes) collected in millions of pounds sterling read also on the vertical scale; and the "dotted lines drawn
between the population and revenue, are merely intended to con
nect together the lines belonging to the same country. The ascent
*D*
STATISTICAL BREVIARY;
ON A PRTNCIPLE ENTIRELY NEW,
THE RESOURCES
STATE AND KINGDOM IN EUROPE,
STAlKED COrPER·PLATE CHARTS,
PH,YSICAL POWEllS OF F.:\CH DISTINCT NATION WITH LAS!': AND PERsPICUITY.
By WILLIAM PLAYFAIR.
A SIMILAR EXHIBITION OF THE RULING POWf:RS OF HINDOOSTA!'.'.
LONDON· Prin... by T. II'N'UY, Bolt COO", Flu' Str<<l,
hr J. W~"'" 46, r',ume"" Row; C........ TU on<! C" lIond s,...,..; I n"TON, WhJ,<h.lI; V,. No" 'no H"OD, Poo!<ry; B•• <~
~~,~: •• y, Le.d=h.lI Stre"; and T' •• &T andDI<"'.' ~L j.m••••
tHOl,
(JIARTiI<-\,""",,'ntinrlhr Jo:XTJo:NT,POl'rJH\TION 1'1lJo1VE~PI'~S, ofth.. PlUNl7P.·U; ..\~17'!I'.'·S III to:., R()PEtaftu'ihe.J)N~.TON..t·I'OI,.ANnA"TREATY of LrNKVlLU': •
; :. I ~ /1
-1'""'~
:t-I I " I, I, I,'I 1:1 , \~,-
,
AsIATIl~ DO)[I]lU'JN~
Hrw.I"I<iNJ!MJ'UU· +1__ · r,,""'PNi!u
.~ ~ t.'~
:~
, ~
i ,5H ..
~ r..- t:..--,. ,¥ ......,«,.~ .".
~J i'f ~ 1:
•jh U
I
I' i
of those lines being from right to left, or from left to right, shews
whether in proportion to its population the country is burdened
with heavy taxes or otherwise" (pages 13-14). The slope of the
dotted line is uninformative, since it is dependent on the diameter
of the circle as well as the height of the two verticals. However,
the sign of the slope does make sense, taking Playfair to his familiar
point about what he regarded as excessive taxation in Britain
(fourth circle from the right, with the slope ri.lllning opposite to
most countries). Playfair was enthusiastic about the multivariate
arrangement because it fostered comparisons:
The author of this work applied the usc of lines to matters of commerce and finance about fifteen years ago, with great success. His mode was generally approved of as not only facilitating, but rendering those studies more clear, and retained more easily by the memory. The present charts arc in like manner intended to aid statistical studies, by shewing to the eye the sizes of different countries represented by similar forms, for where forms are not similar, the eye cannot compare them easily nor accurately. From this circumstance it happens, that we have a more accurate idea of the sizes of the planets, which arc spheres, than of the nations of Europe which we see on the maps, all of which arc irregular forms in themselves as well as unlike to each other. Size, Population, and Revenue, are the three principal objects of attention upon the general scale of statistical studies, whether we are actuated by curiosity or interest; I have therefore represented these three objects in one view.... [page 15]
But here Playfair had a forerunner-and one who thought more
clearly about the abstract problems of graphical design than did
Playfair, who lacked mathematical skills. A most remarkable and
explicit early theoretical statement advancing the general (non
analogical) relational graphic was made by J. H. Lambert in 1765,
35 years before The Statistical Breviary:
We have in general two variable quantities, x, y, which will be collated with one another by observation, so that we can determine for each value of x, which may be considered as an abscissa, the corresponding ordinate y. Were the experiments or observations completely accurate, these ordinates would give a number of points through which a straight or curved line should be drawn. But as this is not so, the line deviates to a greater or lesser extent from the observational points. It must therefore be drawn in such a way that it comes as ncar as possible to its true position and goes, as it were, through the middle of the given points. 13
GRAPHICAL EXCELLENCE 45
13Johann Heinrich Lambert. Beytr'{l!e zum Gebrm{(hr da AIarhematik ulld daetl AllllJel1lltl1lg (Berlin. 1765), as quoted in Laura Tilling, "Early Experimental Graphs," Brirish JOllrnal for the History of S(;el1(C, 8 (1975),204-205.
46 CRAPHICAL PRACTICE
Lambert drew a graphical derivation of the evaporation rate of
water as a function of temperature, according to Tilling. The
analysis begins with two time-series: DE F, showing the decreasing
height of water in a capillary tube as a function of time, and AB C,
the temperature. The slope of the curve DE F is then taken (note the
tangent DEC) at a number of places, yielding the rate ofevaporation:
~,~ 100 !'-: I ",,,"
2l
'0 '9
" C
A F
'PI ~ " , L i b .. .,' I> :r-I
(, ~'7a/t£...
To complete the graphical calculus, the measured rate is plotted
against the corresponding temperature in this relational graphic:
6
J. H. Lambert, "Essai d'hygromctrie ou sur la mcsure de l'llUmidite," A!<'lIloires de I'Acad"lIlie Royale des Scif1/ces et BellesLettres . .. 1769 (Berlin, 1771), plate 1, facing p. 120; from Tilling's article.
li)!Jr
A ~I I
' I .r 11 ! : _B. 1I
I !I _
30J~ 20 ... J~ 60 It
Thus, by the early 1800s, graphical design was at last no longer
dependent on direct analogy to the physical world-thanks to the
work of Lambert and Playfair. This meant, quite simply but quite
profoundly, that any variable quantity could be placed in rela
tionship to any other variable quantity, measured for the same
units of observation. Data graphics, because they were relational
and not tied to geographic or time coordinates, became relevant
to all quantitative inquiry. Indeed, in modern scientific literature, about 40 percent of published graphics have a relational form, with two or more variables (none of which are latitude, longitude, or time). This is no accident, since the relational graphic-in its barest form, the scatterplot and its variants-is the greatest of all graphical designs. It links at least two variables, encouraging and even imploring the viewer to assess the possible causal relationship between the plotted variables. It confronts causal theories that X causes Y with empirical evidence as to the actual relationship between X and Y, as in the case of the relationship between lung cancer and smoking:
CRUDE MALE DEATH RATE FOR LUNG CANCER IN 1950 AND PER CAPITA CONSUMPTION OF
CIGARETTES IN 1930 IN VARIOUS COUNTRIES.
500
400
z 300 0 ...J ...J
::i
W '" D..
'":r: >«: w 0 200
100
GREAT BJITAIN • ~
FINLAND V)/
SWITZERLAND /0.73, 0.30.- ~ HOLLAND ~
)" •U.S.A.
DENMARK ~- AUSTRALIA-V CANADA
./ '-SWEDEN
V -NORWAY
•ICELAND
250 500 750 1000 1250 1500
CIGARETTE CONSUMPTION
GRAPHICAL EXCELLENCE 47
Report of the Advisory Committee to the Surgeon General, SII/"f.:illg alld Health (Washingwn, D.C., 19(4), p.176; based on R. Doll, "Etiology ofLung Cancer," Ad/',lI/CCS ill Calla'r Rescarcll, 3 (1955), 1-50 .
---
48 GRAPHICAL PRACTICE
These small-multiple relational graphs show unemployment and Paul McCracken, ct a!', Towards FilII Employment alld Price Stability (Paris,inflation over time in "Phillips curve" plots for nine countries, 1977), p. 106.
demonstrating the collapse of what was once thought to be an
inverse relationship between the variables.
InDation and Unemployment Rates
Per cent
Canada United States Japan
20
10
15
10
74
6'
- I I 3 4 5 6 7 1 0 14 18 22
10
Male unemployment rate
United Kingdom France Germany
25 75 J 74~~. 73
i.) .1 I ~75 20
1074 7615
I ',:6
71 8110 73
)72 7[J
I ' 67
14 18 2.2 26
Netherlands Sweden Italy75
10 I ~ 10 I
16
12
\
A ~ 72
R 65
67
10 15 2.0 25 4
GRAPHICAL EXCELLENCE 49
Theory and measured observations diverge in the physical sci
ences, also. Here the relationship between temperature and the
thermal conductivity of copper is assessed in a series of measure
ments from difl:crent laboratories. The connected points are from
a single publication, cited by an identification number. The very
different answers reported in the published literature result mainly
from impurities in the samples of copper. Note how effectively
the graphic organizes a vast amount of data, recording findings
of hundreds of studies on a single page and, at the same time,
enforcing comparisons of the varying results.
:1
RECOMMENDED I
'1
8 10'
THERMAL CONDUCTIVITY OF
COPPER
RECOMMENDED~' t~:,(ljquid}
'I
8 Kl'
TEMPERATURE. K
c. Y. Ho, R. W. Powell, and P. E. Liley, TherllJal Conductipity (:( the J::lclIJCIlts: A ComprehCllsipe Repicll', supplement no. 1, JOllrnal o( Physical alld ChellJ;wl Referc11cc Data, 3 (1974), 1-244·
50 GRAPHICAL PRACTICE
Finally, two relational designs of a different sort- wherein the E. C. Zeeman, "Catastrophe Theory," Scientific American, 234 (April 1976), 67; data points are themselves data. Here the effect of two variables based on Konrad Z. Lorenz. King
interacting is portrayed by the faces on the plotting field: SoiomOfl's Ring (New York, 1952).
FEAR
/ c::-:::-~' -~7
- ' ~-- -- ' . ."" ",#~~
''e " ~
~
1, "
C:., j.Y'
"
----?7 RAGE
And similarly, the varying sizes of white pine seedlings after growing for one season in sand containing different amounts of calcium, in parts per million in nutrient-sand cultures:
H. L. Mitchell, The Grolvth alld Nlltritioll ':( White Pine Seedlillgs if I Cultures Il'ith Varyillg Nitroge1l, Phosphorlls, Potassiul1l mzd CalciulII, The Black Rock Forest Bulletin NO.9 (Cornwall-on-theHudson, New York, 1939), p. 70.
--
GRAPHICAL EXCELLENCE 51
Principles of Graphical Excellence
Graphical excellence is the well-designed presentation of interesting
data-a matter of substal/ce, of statistics, :lIld of design.
Graphical exce]]ence consists of complex ideas communicated
with clarity, precision, and efficiency.
Graphical excellence is that which giY(~s to the viewer the greatest
number of ideas in the shortest time with the least ink in the
smallest space.
spaceink
"",, <" ,.""1 .., "n.. ' ""~'" ",~~"" ".M.. ,."".".j'""'""M"!~ j ........ 1111 III}
ideas time
Graphical exce]]ence is nearly always multivariate.
And graphical excellence requires telling the truth about the data.