Introduction - University of Northern Iowafabos/seminar/readings/tufte.pdf ·  ·...

42
Introduction Data graphics visually display measured (Fulltities by meam of the combined use of points, Jines, a coordinate system, numbers, symbols, words, shading, and color. The usc of abstract, non-representational pictures to show numbers is a surprisingly recent invention, perhaps because of the diversity of skills required - the visual-artistic, empirical-statistical, and mathematical. It was not ullti11750-1Hoo that statistical graphics- length and area to show quantity, time-series, scatterplots, and multivariate displays-were invented, long after such triumphs of mathematical ingenuity as logarithms, Cartesian coordinates, the calculus, and the basics of probability theory. The remarkable William Playfair (1759-1 H23) developed or improved upon nearly all the fundamental graphical designs, seeking to replace conven- tional tables of numbers with the systematic visual representations of his "linear ari thmetic." Modern data graphics can do much more than simply substitute for small statistical tables. At their best, graphics arc instrumcnts for reasoning about quantitative inf(xmation. Often the most effec- tive way to describe, explore, and summarize a set of l1umbers- even a very large set-is to look at pictures of those numbers. Furthermore, of all mcthods for analyzing and communicating statistical information, well-designcd data graphics arc usually the simplest and at the same time the most powerful. The tlrst part of this book reviews the graphical practice of the two cellturies since Playfair. The reader will, I hope, rejoice in the graphical glories shown in Chapter 1 and then condemn the lapses and lost opportunities exhibited in Chapter 2. Chapter 3, 011 graph- ical integrity and sophistication, seeks to account t(x these diftcr- ences in quality of graphical design.

Transcript of Introduction - University of Northern Iowafabos/seminar/readings/tufte.pdf ·  ·...

Introduction

Data graphics visually display measured (Fulltities by meam of

the combined use of points, Jines, a coordinate system, numbers,

symbols, words, shading, and color.

The usc ofabstract, non-representational pictures to show numbers

is a surprisingly recent invention, perhaps because of the diversity

of skills required - the visual-artistic, empirical-statistical, and

mathematical. It was not ullti11750-1Hoo that statistical graphics­

length and area to show quantity, time-series, scatterplots, and

multivariate displays-were invented, long after such triumphs of

mathematical ingenuity as logarithms, Cartesian coordinates, the

calculus, and the basics of probability theory. The remarkable

William Playfair (1759-1 H23) developed or improved upon nearly

all the fundamental graphical designs, seeking to replace conven­

tional tables of numbers with the systematic visual representations

of his "linear ari thmetic."

Modern data graphics can do much more than simply substitute

for small statistical tables. At their best, graphics arc instrumcnts

for reasoning about quantitative inf(xmation. Often the most effec­

tive way to describe, explore, and summarize a set of l1umbers­

even a very large set-is to look at pictures of those numbers.

Furthermore, of all mcthods for analyzing and communicating

statistical information, well-designcd data graphics arc usually the

simplest and at the same time the most powerful.

The tlrst part of this book reviews the graphical practice of the

two cellturies since Playfair. The reader will, I hope, rejoice in the

graphical glories shown in Chapter 1 and then condemn the lapses

and lost opportunities exhibited in Chapter 2. Chapter 3, 011 graph­

ical integrity and sophistication, seeks to account t(x these diftcr­

ences in quality of graphical design.

The second part of the book provides a language for discussing graphics and a practical theory of data graphics. Applying to most visual displays of quantitative information, the theory leads to changes and improvements in design, suggests why some graphics might be better than others, and generates new types of graphics. The emphasis is on maximizing principles, empirical measures of graphical performance, and the sequential improvement of graphics through revision and editing. Insights into graphical design are to be gained, I believe, from theories of what makes for excellence in art, architecture, and prose.

This is a book about the design of statistical graphics and, as such, it is concerned both with design and with statistics. But it is also about how to communicate information through the simultaneous presentation of words, numbers, and pictures. The design of statis­tical graphics is a universal matter -like mathematics-and is not tied to the unique features of a particular language. The descriptive concepts (a vocabulary for graphics) and the principles advanced apply to most designs. I have at times provided evidence about the scope of these ideas, by showing how frequently a principle applies to (a random sample of) news and scientific graphics.

Each year, the world over, somewhere between 900 billion (9 X 1011) and 2 trillion (2 X 1012 ) images of statistical graphics are printed. The principles of this book apply to most of those graphics. Some of the suggested changes are small, but others are substantial, with consequences for hundreds of billions of printed pages.

But I hope also that the book has consequences for the viewers and makers of those images-that they will never view or create statis­tical graphics the same way again. That is in part because we are about to see, collected here, so many wonderful drawings, those of Playfair, of Minard, of Marcy, and, nowadays, of the computer.

Most of all, then, this book is a celebration of data graphics.

provides a language for discussing of data graphics. Applying to most

nformation, the theory leads to design, suggests why some graphics ld generates new types of graphics. g principles, empirical measures of . sequential improvement of graphics nsights into graphical design are to ries of what makes for excellence

I of statistical graphics and, as such, n and with statistics. But it is also formation through the simultaneous s, and pictures. The design of statis­:ter-Iike mathematics-and is not particular language. The descriptive

phics) and the principles advanced .t times provided evidence about the 19 how frequently a principle applies and scientifIc graphics. omewhere between 900 billion [2) images of statistical graphics are look apply to most of those graphics. are small, but others are substantial,

.s of billions of printed pages. has consequences for the viewers and hey will never view or create statis­lin. That is in part because we are many wonderful drawings, those ~y, and, nowadays, of the computer. s a celebration of data graphics.

PART I

Graphical Practice

I Graphical Excellence

Excellence in statistical graphics consists of complex ideas

communicated with clarity, precision, and effIciency. Graphical

displays should

• show the data

• induce the viewer to think about the substance rather than about

methodology, graphic design, the technology of graphic pro­duction, or something else

• avoid distorting what the data have to say

• present many numbers in a small space

• make large data sets coherent

• encourage the eye to compare different pieces of data

• reveal the data at several levels of detail, from a broad overview

to the fIne structure

• serve a reasonably clear purpose: description, exploration,

tabulation, or decoration

• be closely integrated with the statistical and verbal descriptions

of a data set.

Graphics reveal data. Indeed graphics can be more precise and revealing than conventional statistical computations. Consider

Anscombe's quartet: all four of these data sets are described by

exactl y the same linear model (at least un til the residuals are ex­

amined).

II III IV X Y X Y X Y X Y

10.0 8.04 10.0 9.14 10.0 7.46 8.0 6.58 8.0 6.95 8.0 8.14 8.0 6.77 8.0 5.76

13.0 7.58 13.0 8.74 13.0 12.74 8.0 7.71 9.0 8.81 9.0 8.77 9.0 7.11 8.0 8.84

11.0 8.33 11.0 9.26 11.0 7.81 8.0 8.47 14.0 9.06 14.0 8.10 14.0 8.84 8.0 7.04 6.0 7.24 6.0 6.13 6.0 6.08 8.0 5.25 4.0 4.26 4.0 3.10 4.0 5.39 19.0 12.50

12.0 10.84 12.0 9.13 12.0 8.15 8.0 5.51, 7.0 4.82 7.0 7.2() 7.0 6.42 8.0 7.91 5.0 5.68 5.0 4.74 5.0 5.73 8.0 6.89

N = 11 mean of X's = 9.0 mean of Y's = 7.5 equation l)f regression line: Y = 3 -I O.5X standard error ofestimate ofslope = 0.118 t = 4.24 sum of squares X - X = 110.0 regression sum of squares = 27.50 residual sum of squares of Y = 13.75 correlation coefficient = .82 r2 = .67

• • • • • • • •

• ••••

••• •••

•••

•• • •

•• •• • •

••

14 GRAPHICAL PRACTICE

And yet how they differ, as the graphical display of the data F. J. Anscombe, "Graphs in Statistical Analysis," AII/erirelll Steltistiriilll, 27makes vividly clear: (Febrnary 1973), 17-21.

II

•10 • •

••5 • •• •

10 20

IV •III •

••• I•I

And likewise a graphic easily reveals point A, a wildshot obser­vation that will dominate standard statistical calculations. Note that point A hides in the marginal distribution but shows up as clearly exceptional in the bivariate scatter.

••

•Stephen S. Brier and Stephen E. Fien­berg, "Recent Econometric Modelling

• • • of Crime and Punishment: Support tlJr•• • • • the Deterrence Hypothesis?" in StephCll

• A E. Fienberg and Albert J. Reiss. Jr., cds., - • • Indiwtors of Crill/I' illld Criminal jll.'tirl': Quantitiltillc Studies (Washington. D.C., 1980), p. 89.

t

Edward R. Dewey and Edwin F. Dakin, Cycles: Tit" SriCllC( of Predictioll (New York, 1947), p. 144.

GRAPHICAL EXCELLENCE 15

600

700

""<; 650 >­

~ z

160 c o

400 -go

156...J

350

152

-148

550

168

500 164

450

.......~\ /~ \

'­ /' \... , # , ....... / \ '--­ \

London Stock Prices \

\ \

..., \

" \/ \

/ \ New York Stock Prices // \

"..... / \ A ' ... _-­ J/ \

--' -..... _, \ "'8 \

\ \ \ \ \

'--­

Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec.

-.010

+.002

-.012

" : -.006

" ,;. ~ -.004 " Co

.~ -.002 ~ u

Normal

'":; .~ -.008

Of course, statistical graphics, just like statistical calculations, are only as good as what goes into them. An ill-specified or prepos­terous model or a puny data set cannot be rescued by a graphic (or by calculation), no matter how clever or fancy. A silly theory

means a silly graphic:

SOLAR RADIATION AND STOCK PRICES

A. New York stock prices (Barron's average) . B. Solar Radiation, inverted, and C. London stock prices, all by months, 1929 (after Garcia-Mata and Shaffner) .

Let us turn to the practice of graphical excellence, the efficient communication of complex quantitative ideas. Excellence, nearly always of a multivariate sort, is illustrated here for fundamental graphical designs: data maps, time-series, space-time narrative

designs, and relational graphics. These examples serve several purposes, providing a set of high-quality graphics that can be discussed (and sometimes even redrawn) in constructing a theory of data graphics, helping to demonstrate a descriptive terminology, and telling in brief about the history of graphical development. Most of all, we will be able to see just how good statistical graphics can be.

16 GRAPHICAL PRACTICE

Data Maps

These six maps report the age-adjusted death rate from various types of cancer for the 3,056 counties of the United States. Each map portrays some 21,000 numbers. 1 Only a picture can carry such a volume of data in such a small space. Furthermore, all that data,

thanks to the graphic, can be thought about in many different ways at many different levels of analysis-ranging from the con­templation of general overall patterns to the detection of very

fine county-by-county detail. To take just a few examples, look at the

• high death rates from cancer in the northeast part of the country and around the Great Lakes

low rates in an east-west band across the middle of the country

• higher rates for men than for women in the south, particularly

Louisiana (cancers probably caused by occupational exposure, from working with asbestos in shipyards)

unusual hot spots, including northern Minnesota and a few counties in Iowa and Nebraska along the Missouri River

• differences in types of cancer by region (for example, the high rates of stomach cancer in the north-central part of the country -probably the result of the consumption of smoked fish by

Scandinavians)

rates in areas where you have lived.

The maps provide many leads into the causes-and avoidance-

of cancer. For example, the authors report:

In certain situations ... the unusual experience of a county warrants further investigation. For example, Salem County, New Jersey, leads the nation in bladder cancer mortality among white men. We attribute this excess risk to occupational exposures, since about 25 percent of the employed persons in this county work in the chemical industry, particularly the manufacturing of organic chemicals, which may cause bladder tumors. After the finding was communicated to New Jersey health officials, a company in the area reported that at least 330 workers in a single plant had developed bladder cancer during the last 50 years. It is urgent that surveys of cancer risk and programs in cancer control be initiated among workers and former workers in this area.2

1 Each county's rate is located in two dimensiCl11S and, further, at least four numbers would be necessary to recon­struct the size and shape of each county. This yields 7X 3,056 entries in a data matrix sufficient to reproduce a map.

In highest decile, statistically significant .. Significantly high, but not in highest decile .. In highc-st decile, but not statistically signifIcant

Not significantly differmt from U.S. as a whole c:=J Significantly lowc-r than c:=JU.S. as a whole

2 Robert Hoover, Thomas J. Mason, Frank W. McKay, and Joseph F. Frau­meni, Jr., "Cancer by County: New Resource for Etiologic Clues," Scie1lce, 189 (September 19, 1975), 1006.

Maps fwm Atlas of CarIcer Mortality 1'lI' U.S. COIl1lties: 1950-1969, by Thomas J. Mason, Frank W. McKay, Robert Hoover, William J. Blot, and Joseph F. Fraumcni,Jr. (Washington, D.C.: Public Health Service, National I11Stitutcs of Health, 1975). The six maps shown here were redesigned and redrawn by Lawrence Fahey and Edward Tufte.

All types of cancer, white females;

age-adjusted rate by county, 1950-1969

All types of cancer, white males;

age-adjusted rate by county, 1950-1969

Trachea, bronchus, and lung cancer; white females; age-adjusted rate

by county, 1950-1969

Trachea, bronchus, and lung cancer;

white males; age-adjusted rate

by county, 1950-1969

Stomach cancer, white females;

age-adjusted rate by county, 1950--1969

Stomach cancer, white males;

age-a~jllsted rate by county, 1950--1969

20 GH API! I CAL PHACTI CE

The maps repay careful study. Notice how quickly and naturally

our attention has been directed toward exploring the substantive

content of the data rather than toward questions of methodology

and technique. Nonetheless the maps do han' their flaws. They

wrongly equate the visual importance of each county with its

geographic area rather than with the number of people living in

the coullty (or t]Je number of cancer deaths). Our visual impres­

sion of the data is entangled with the circumstance of geographic

boundaries, shapes, and areas-the chronic problem afflicting shaded­

in-area designs of such "blot maps" or "patch maps."

A further shortcoming, a defect of data rather than graphical

composition, is that the maps arc founded on a suspect data source,

death certificate reports on the cause of death. These reports fIll

under the influence of diagnostic fashions prevailing among doc­

tors and coroners in particular places and times, a troublesome

adulterant of the evidence purporting to describe the already some­

times ambiguous matter of the exact bodily site of the primarv

cancer. Thus part of the regional clustering seen on the maps, as

well as some of the hot spots, may reflect varying diagnostic

customs and fads along with the actual ditE'rences in cancer rates

between areas.

Data maps have a curious history. It was not until the seventeenth

century that the combination of cartographic and statistical skills

required to construct the data map came together, fully 5,000 years

after the first geographic maps \vere drawn on clay tablets. And

many highly sophisticatecl geographic maps were produced cen­

turies before the first map colltaining any statistical material was

drawn." For example, a detailed map with a full grid was engraved

during the eleventh century A.D. in China. The Yii Chi Thu (Map

of the Tracks of Yii the Great) shown here is described by Joseph

Needham as the

. . . most remarkable cartographic work of its age in any culture, can'ed in stone in - 1137 but probably dating from before - 1100. The scale of the grid is 100 Ii to the division. The coastal outline is relatively fmu and the precision of the network of river systems extraordinary. The size of the original, which is now in the Pei Lin Museum at Sian, is about 3 feet square. The name of the geographer is not known.... Anyone who compares this map with the contcmporary productions of European religious cosmography cannot but be amazed at the cxtent to which Chinese geography was at that time ahead of the \X/ cst.... There was nothing like it in Europe tiJl the Escorial MS. map of about ,', 1550 ....4

3 Dot3 IlUpS :He lIs1I3lh- descnbed .lS "thematic nUf1s" in cartogr.lf1hy. For a thorough account, sec Arthur H. Rob­inson, Early The/llatic 1i1appillg ill tllc Hi"tory 0/ Cartogral,hy (Chicago, 1982) . On the history of st3tistical gr3phics, sec H. Cray FunldlOuscr, "Historical Devel­opment of the Graphical Representation of Statistical Data," Osiris, 3 (November 19.17), 2°9-404: and Jamcs R. Beniger and Dorothy L. Robyn, "Quantitative Graphics in Statistics: A Brief History," AllicriCilIl Statisticiall, 32 (February 1978), 1-11.

4]osepb Ncedham, SciCl/rc alld Cil'ili"a­tiOIl ;'1 Chilla (Cambridge, 1959), v"l. 3.

54°-547.

. '\ ...... 11 to;, iI. Sajll.... Y

GRAPHICAL EXCELLENCE 21

.. . . 'I ;Of 9 . 'V­

,. ,. 8 I ..

r.. 0(

'" , ~ .. T ICV

.. - '1-+-+-+-+-+'" I

~ f--­ - -

I I t .. " ... , ~ i

t -j

"

~

, ,

---" . + ..­---i.. --ti ,

I

E. Chavannes, "Lcs Deux Plus Ancicns Spt'cimcns de la Cartographic Chinoise," Bulletin de l' Ecole Frall\aise de ITxtreme Oriellt,3 (1903), 1-35, Carte B.

22 GRAPHICAL PRACTICE

~ Ecce formulam,v[um,arque HruCturarn TabuIarum ProIornxi,cum qUibufdam locis,Jn

quibus fiudiofus Geographi;c fe Caris exercerc pordt.

S E PTE N T RIO. pars fupcrior.

tA

o

jZ

37

pars inferior. MER I DIE S.

()

~i' % ~ ,~

3 '" M '" ~Z '" Ul

The 1546 edition of Cosmocernphia by Petrus Apianus contained examples of map design that show how very close European car­tography by that time had come to achieving statistical graphicacy, even approaching the bivariate scatterplot. But, according to the historical record, no one had yet made the quantitative abstraction

of placing a measured quantity on the map's surface at the inter­section of the two threads instead of the name of a city, let alone

the more difficult abstraction of replacing latitude and longitude with some other dimensions, such as time and money. Indeed, it was not until 1786 that the first economic time-series was plotted.

GRAPHICAL EXCELLENCE 23

'\ F LZ I C

One of the first data maps was Edmond Halley's 16R6 chart

showing trade winds and monsoons on a world map.;; The detailed

section below shows the cartograpllic symbolization; with, as

Halley wrote, "'... the sharp end of each little stroak pointing out

that part of the Horizon, from whence the wind continually comes;

and where there are Monsoons the rows of stroaks run alternately

backwards and forwards, by which means they are thicker [denser I than elsewhere."

!0 fJ !II n i 1 II Ii (I c" f ,1 V r -'"'-~ (II

\

'Nonnan J. W. Thrower, "Edlllond Halley as a Thematic (;co-Carwgra­phcr," A/l/w/s o(the .1<,<0(;,11/011 of AlIl<r­;(1111 Ccogr,ll'/;crs, 5'J (])cccnlber I'j()9), ('52-07()·

Edmond Halle\, "An Historical Ac­COlll\t of the Tr.ldl' Winds, ,md !'v1on­"l(ml, Ohser\ahk in the Sus lkt\\'CCl1 and Ncar the Tropicks: With ,111 At­telll f't to Assign thl' Phisic'al Cause of Said Winds," Ph;/o."(1!,hira! rrall.,actio/IS,

1~3 (1 (,~(,), 1 S3-1(,R,

24 GRAPH r CA L PHA CTr CE

An early and most worthy use of a map to chart pattcrns uf

discase was the famous dot map of Dr. John Snow, who plotted

the location of deaths from cholera in central London for Sep­

tember 1R54. Deaths were marked by dots and, in addition, the

area's elevcn water pumps were located by crosses. Examining the

scatter over the surface of the map, Snow observed that cholera

occurred almost entirely among those who lived ncar (and drank

from) the Broad Street water pump. He had the handle of the

contaminated pump removed, ending the neighborhood epidemic

which had taken more than 500 lives.6 The pump is located at the

center of the map, just to the right of the D in BROAD STREET. Of

course the link between the pump and the disease might have been

revealed by computation and analysis without graphics, with some

good luck and hard work. But, here at least, graphical analysis

testifies about the data far more efficiently than calculation.

';"'J,

6 E. W. Gilbert, "Pioneer Maps ofHealth and Disease in England," Geographical Journal, 124 (1958), 172-183.

so 50 !'JO

Charles Joseph Minard gave quantity as well as direction to the

data measures located on the world map in his portrayal of the

1864 exports of French wine:

)$(;,\I-~l"'l"-' :1,\,\1"1"",,,,:1,,,,,,,,,,,,,,,,,,,, ,)"1.."t, ~",,,,- L'c t-~ll~r('al( ,,''''If,I:'i'jl''-'' I ,:,'"".,,:, If. \.",., "- .."

\ , ",.I'I ~,1;-" "'1":'" d.'",. 1,.'"".,,, ,'.' "'I,d.-.",., ""f'''Ic",

"1('''''''' 1.;,,1,,1',\,,-.,

I"" " '\1"1",1,,,,( .d~.'l,

I,,,'-'-"1'­

1

_,'" ,J>'

\ \1 ,.: H I I' I L

\ I I I .\1 \ ,; '\ I \ ~) It I) ,"

I J: \ \ ( L

\\'=:'\"

( 111.'"

\1

,; (. I>

I: J:

,I"

() I.

\ I: 1\ () I:

\ I ... l' 1\ \ I. I I"

Charles Joseph Minard, Tahletl/lx Gra­ph/qlll'S et Garles F(~lIrlllil'es de ;\1.. Millard, 1845-1869, a portfolio of his work held by the I3ibliothcclue de !'Eco!c NationaleIV

des Ponts ct Chaussces, Paris.

26 GRAPHICAL PRACTICE

Computerized cartography and modern photographic techniques have increased the dellSity of information some 5,ooo-fold in the

best of current data maps compared to Halley's pioneering effort. This map shows the distribution of 1.3 million galaxies (including some overlaps) in the northern galactic hemisphere. The map divides the sky into 1,024 X 2,222 rectangles. The number of gal­

axies counted in each of the 2,275,328 rectangles is represented by ten gray tones; the darker the tone, the greater the number of galaxies counted. The north galactic pole is at the center. The

sharp edge on the left results from the earth blocking the view trom the observatory. In the area near the perimeter of the map,

the view is obscured by the interstellar dust of the galaxy in which we live (the Milky Way) as the line of sight passes through the flattened disk of our galaxy, The curious texture of local clusters of galaxies seen in this truly new view of the universe was not

anticipated by students of galaxies, who had, of course, micro­scopically examined millions of photographs of galaxies before seeing this macroscopic view. Although the clllSters are clearly

evident (and accounted for by a theory of galactic origins), the seemingly random filaments may be happenstance. The producers of the map note the "strong temptation to conclude that the gal­axies are arranged in a remarkable filamentary pattern on scales of approximately 5° to 15°, but we caution that this visual impres­sion may be misleading because the eye tends to pick out linear 7 Michael Seldner, B. H. Siebers, Edward

patterns even in random noise. Indeed, roughly similar patterllS J. C;roth and P. .lames E. Peebles, "New Reduction of the Lick Catalog of

are seen on maps constructed from simulated catalogs where no Galaxies," AslrOllOllli',ll Joumal, 82 linear structure has been built in, .. ,"7 (April 1977), 249-314.

The most extensive data maps, such as the cancer atlas and the count of the galaxies, place millions of bits of information on a single page before our eyes. No other method for the display of statistical information is so powerful.

'c;' ,~._ .• , •• ' - . ',~ - ••~' ~ - .... .y, ,.- ­~>~fI'- .~-:r... ~~~ .. :"'~~:· ...f..:..... ~ .. ~ :._ - \-. ' ~~ ~.r: _~;-·,::."".1 .t.,_'· ~ ...T:"!'~~.':,.'1I'r1' 7- ....,'1-...; ~. ~~ ...... ~_, ~*. ~~"'"" .. 1.,...;.,,",' ...... _-,- -.;... ••.,~;~ - .. i .,.t~ .. ·.". ':-:-".-:"\ .~. ..:.- .I;~: ~.."~_'_'~ , ~:"~-;:"" ~~ .. '...-: '" ' ~..~: ~ ~ ...",;i'!.,.~;jt;..-:~, '. ~ ....-...!¥... 7', .... ''''J~ .... _ .~~1 -.(~. ~ ;!II',. _-v~ .. '"' .~. L • ~ ~ I , • ..: 0' r: "'r!f~' ~'-r-" • \ .s.' " ..~....:.,-. . t ~ :..- " ..' -~ -' .....~...... '. ~~~'"1.. •..•. ~' ,:~ .. '.' ,: .. ' ',4'\' ",,. -to".;

.~ •• : • .=.~ "~~'.~'~,.,,,. '".";';.:Ir. -... ~ .. ~~., ....Jf.;....•.;....~(:y'\.!: ~.:.:.,.~.. ...i( ~.r ~-'." :~. _ -',,"': r .- .' ,. ,..-..... ~ '.: 'v. ~ ~. , ~ '.~ -"'';' .~- •• ~ '~-" '-..,., -"''''''d:.:''flL- ',- ~"-' ,' '''-v •. ", ',,-' '.' -"~~~" .":x .. "':.;:;

_~: -'\" 'r\! .-.' .:-:. _- .'.,··".r_..--- a-';' 1:~_'J,~~-"';';;JI"ii ..... I.,.. :""1":::/ ~- .._. ;-: .. r_... :...... 'f,:""- .~--.:.-. ,. . ';;;To=":~ . "-til J:' ........ -iL··~~~~·- . ." ~-:."'" ·.... k~I··~'L..... .. ." '.,.Jr·•. ·~.·.-,r-JI t~~•• ~.' - '!';.-~ . r . ~&4i:' ~. . ~ .... '1;~"'-'" .~...,::'...... ~':"" ',:. " " ,-C'" -', :.,....., •. ... ~• ' ... -£J'. --.- ­_ • ........ .. -:".." ,..._ -.-.c' • ;0 ~ .~ .... • ~'" •• ~ -L - • '.:t.-..... .... .-' . ,,".,.... - " .:r -"'~' ..'- ° i '---' III ~ .• '._ '~ '. . ." ~ ." ." -.' ,-,..... ,..-,; .,.... " .. -. , ,'... ~.. . •. .,',' -r.--- .. ,~ .. , r :-.... '~ ~~. - - '

. '.' ,., c,,<~.<, ,.* ." ".' ,., .,' ",,."of." ' '.' ' ..-,' ',.', ~.:,- ." '.' ,.

·~;~~~;'·~:.~~~:t;·"&~~~J~~~J}q~j,~·:~t;i~~~~tf;i~j~;:.:;;:~

GRAPHICAL EXCELLENCE 27

I I

.-:~

J ..

-:~.:: - . .;. "(

.....•

':'j. .. ;. 'J--.' \

'~I! ",~)l:;;';)., . . " ",' '!~~

., ~. 'I . ,~ . "''0,;,,'. Ial, ,,'" ._~ '-"

\. j .. -,.', ~,l . >:.:rt:. "~"~':" -!>",,' ;. :t '. '.'.r. 'i{",~ '-""~ ... ' .,

,I ·~~~,r .. :···:~;~'~·~i~·~~. ~ r--~i" ~:) ~fM ~;'"... \.

.'" ( " '.. ;~..r-::'ifi;'J:' ,:,1 ,,!'" . '\" ~\,.,: ,~'••~~~'.:">~ ~~~;:-\.-j

, ,"".".~ ,. ~~' .' ..:'< •

,,, '", ' ~....,;.

.\'-~, .•..

'.;;." 1- './'

.' . :,~m,

C,{ ~\:", _~'~~ ... ,,: ...... l.. 1 ..~- ~ '~":':.:\~;: ~.~~>

'.'

28 GRAPHICAL PRACTICE

Time-Series

The time-series plot is the most frequently used form of graphic 8 A random sample of 4,000 graphics drawn fmm 15 of the world's news­design.s With one dimension marching along to the regular rhythm papers and magazines published from

of seconds, minutes, hours, days, weeks, months, years, centuries, or 1974 to 1980 found that more than 75

millennia, the natural ordering of the time scale gives this design a percent of all the graphics published were time-series. Chapter 3 reports more

strength and efficiency of interpretation found in no other graphic on this. arran gemen t.

This reputed tenth- (or possibly elcventh-) century illustration of the inclinations of the planetary orbits as a function of time, apparently part of a text for monastery schools, is the oldest known example of an attempt to show changing values graphically. It appears as a mysterious and isolated wonder in the history of data

graphics, since the next extant graphic of a plotted time-series shows up some 800 years later. According to Funkhouser, the astronomical content is confused and there are difficulties in recon­

ciling the graph and its accompanying text with the actual move­ments of the planets. Particularly disconcerting is the wavy path

9 H. Gray Funkhouser, "A Note on a ascribed to the sun.') An erasure and correction of a curve occur Tenth Century Gr'lph," Osiris, 1

(January 1936), 260--262. near the middle of the graph.

.~ 0; \t.S

'-.

.iJi:..l/V /,V~,~ \ ~V,~VIP Vcr~1\ t\ \(\ "-I I l.i

.~ _ Vv 1\ '" "V V ~ / l'L Vl~rFI<c V 1/ I\. X' 1/" '\1\" V "1//LL ,7 ",VK .... ~ V II --t\ 'Y V

£/;X 1/ [If"" 1/ V"ll.J~l/I '\'/ ~ \ L,..1

"J :' 'Ij "'"' V

GRAPHICAL EXCELLENCE 29

It was not until the late 1700S that time-series charts began to

appear in scientific writings. This drawing ofJohann Heinrich

Lambert, one of a long series, shows the periodic variation in soil

temperature in relation to the depth under the surface. The greater

the depth, the greater the time-lag in temperature responsiveness.

Modern graphic designs showing time-serics periodicities differ

little from those of Lambert, although the data bases are far larger.

, II

\ I .. I~·1'1

I .! 1)

J. H. Lambert, Pyromctrie (Berlin, 1779).

I~OR~H=:PII SCLRCE '!OYI~GtcP ? ';xrHtmJ SClRCE-I

'E 10 I 56.2 kHz I2:

-'0

" I in..J • .. Nb .. • I' : ... ";

to ={w iL 3.11 cHz u cr 10

6 W!, • 'I • '""e • .. • i' ' I 1:: d d -{f­u w 17.8 kHz *.'

f t

c:: -6 10

20" w 0 0 f-f­<l ..J MLIGNETIC JOVIc.RARHIC

.200

R(R) 106.5 9").' 83.4 71.::) 59.2 46.5 33.1 19.3 JULY 2 JUlY 3 JULY 4 JULY 5 JULY 6 JULY 7 8,1979

This plot of radio emissions from Jupiter is based on data collected

by Voyager 2 in its pass close by the planet in July 1979. The radio

intensity increases and decreases in a tcn-hour cycle as Jupiter

rotates. Maximum intensity occurs when the Jovian north mag­

netic pole is tipped toward the spacecraft, indicating a northern

hemisphere source. A southern source was detected on July 7, as

the spacecraft neared the equatorial plane. The horizontal scale

shows the distance of the spacecraft from the planet measured in

terms ofJupiter radii (R). Note the use of duallabcls on the hori­

zontal to indicate both the date and distance from Jupiter. The

entire bottom panel also serves to label the horizontal scale,

describing the changing orientation of the spacecraft relative to

Jupiter as the planet is approached. The multiple time-series

enforce not only comparisons within each series over time (as do D. A. (;urnett, W. S. Kurth, and F. L.all time-series plots) but also comparisons between the three Scarf, "Plasma Wave Observations Near

different sampled radio bands shown. This richly multivariate Jupiter: Initial Results from Voyager 2," Scim(c 206 (November 23, 1979), display is based on 453,600 instrument samples of eight bits each. 987~991; and letter from Donald A.

The resulting 3.6 million bits were reduced by peak and average Gurnett tD Edward R. Tufte, June 27, processing to the 18,900 points actually plotted on the graphic. 1980.

'.

----------------------------------------

30 GRAPHICAL PRACTICE

Time-series displays are at their best for big data sets with real

variability. Why waste the power of data graphics on simple lin­

ear changes,

which can usually be better summarized in one or two numbers?

Instead, graphics should be reserved for the richer, more complex, more difficult statistical material. This New York City weather

summary for 1980 depicts 2,220 numbers. The daily high and

low temperatures are shown in relation to the long-run average.

The path of the normal temperatures also provides a forecast of

expected change over the year; in tlle middle of February, for

instance, New York City residents can look forward to warming

at the rate of about 1.5 degrees per week all the way to July,

the yearly peak. This distinguished graphic successfully organizes

a large collection of numbers, makes comparisons between diflcrent parts of the data, and tells a story.

~~~(TJAL NORMAL I ACT JAL J0PMAL M'rUAl NOHh!~Al N~~C:TUALNC,qM~ ACTUAL NO"cI~::cL ACnJAl!.<::\RMA~C:TUAL--.i:!0B!'1..-Al 1 _~~AL "'OR'.AA~TU,'LNO~-'~~_LJhl Ng~1_

'!~;:i\'~' ~,vJ)?8~ :/\I\j"" -~~f,f'/ .niL·--tl~~,~=G:2i='VI~!f: J,~~i~:: (J I .. ~.~------L~""·M'D'N '~O"J--.-n_-_=t __--_ ..~~~- ~___ 0

NEW YORK CITY'S WEATHER FOR 1980 Wf-!IIhW

1

:,=~ I -~ ~~=-tL~ -.:a'=• ....J] Jdj •.• _~:

NC1V York Tillles,January 11, 1981, p. 32.

GRAPHICAL EXCELLENCE 31

E.]. Marey, La Methode Graphiquc (Paris, 1885), p. 20. The method is attributed to the French engineer, lbry.

A design with similar strengtbs is Marey's graphical train scbed­

ule for Paris to Lyon in the 1880s. Arrivals and departures from a station are located along the horizontal; length of stop at a station is indicated by the length of the horizontal line. The stations are separated in proportion to their actual distance apart. The slope of the line reflects the speed of the train: the more nearly vertical the line, the faster the train. The intersection of two lines locates

the time and place that trains going in opposite directions pass

each other.

In 1981 a new express train from Paris to Lyon cut the trip to under three hours, compared to more than nine hours when Marcy published the graphical train schedule. The path of the modern

TGV (train a/?randc LJitcssc) is shown, overlaid on the schedule of 100 years before:

!~ 'I,

10 II MIDI

f~ I I t'l,

I

I i+ I

I IHII

1\ 'IL !.~ '\J

! i

[~~r I':,'I I: I I IJ Iii! il' N! nI

T m I' Y1iJI

i ! ru ' 11'liflJ~ Jij IJ!

{11~11~~

!

l i I

II 'I j, I

I II1 I

~ ~ ~I II '

10 II MIDI ,

8

~~F~I

I!, ,

I i L; ~

I II I

~~ I~ IIvk\~lti '!,~ "Y~'I! I: ii, II r

i I I I!}K

IT ,I , I

I I I

I

I

32 GRAPHICAL PRACTICE

The two great inventors of modern graphical designs were J. H. Lambert (1 72X-1777), a Swiss-German scientist and mathematician,

and William Playfair (1759-1823), a Scottish political economist. lO

The first known time-series using economic data was published in

Playfair's remarkable book, The COl1llllercial alia Political Atlas (Lon­

don, 1786). Note the graphical arithmetic, which shows the shift­

ing balance of trade by the difference between the import and export time-series. Playfair contrasted his new graphical method

with the tabular presentation of data:

Information, that is imperfectly acquired, is generally as imper­fectly retained; and a man who has carefully investigated a printed table, finds, when done, that he has only a very faint and partial idea of what he has read; and that like a figure imprinted on sand, is soon totally erased and defaced. The amount of mercantile transactions in money, and of profit or loss, are capable of being as easily represented in drawing, as any part of space, or as the face of a country; though, till now, it has not been attempted. Upon that principle these Charts were made; and, while they give a simple and distinct idea, they are as near perfect accuracy as is any way useful. On inspecting anyone of these Charts attentively, a sufficiently distinct impression will be made, to remain unimpaired for a considerable time, and the idea which does remain will be simple and complete, at once including the duration and the amount. [pages 3-4]

For Playfair, graphics were preferable to tables because graphics

showed the shape of the data in a comparative perspective. Time­

l7LlRT t'T"lllht' IiI/POR1:", ami EXP()RT,\' ;" £~V(TLAN1)d/ul //;""

• Ji'('/II Iii,' ,1(<1/' 1/1'(' t.. /7'''2 11' If /Ym!f/lr

10Laura Tilling, "Early Experimental Graphs," BrifUl JOlirnalfor flie History

~f Scimce, 8 (1975),193-213.

~ - 1­ -~---

f-­ ~~-~I----t"-~-

~-1~---=X-E-=t--­ -r---+-----l-­

~'S' A

4

,L

11IHEHtt -J­tLf 1 , ',-::,__

-," L,7J.. 8,?,bm

f...-.

qOQ .1JIl' .1?§O .1/-JO '74" i;'3"J 4'""0 L,:?<I

TJu. /),N~'I;;'nJ' 17/ 1,1", /ltllh'm, (?':If"'II; J-E~IR.". f,L IIt"vy rW 11t"/-(~4>;flhd71£l..1fIIJJ{l\-:1'71~(lr:\1).f f !i--u,':,.. <'~u~' f'a.fliof../ tlJ'th<,.ld,tU~ri; 'If',M.lt1p""'"/7<''.''

GRAPHICAL EXCELLENCE 33

series plots did this, and all but one of the 44 charts in the first edition of The Commercial and Political Atlas were time-series. That

one exception is the first known bar chart, which Playfair invented because year-to-year data were missing and he needed a design to portray the one-year data that were available. Nonetheless he was skeptical about his innovation:

This Chart is diflcrent from the others in principle, as it does not comprehend any portion of time, and it is much inferior in utility to those that do; for though it gives the extent of the different branches of trade, it does not compare the same branch of commerce with itself at different periods; nor does it imprint upon the mind that distinct idea, in doing which, the chief advantage of Charts consists: for as it wants the di­mension that is formed by duration, there is no shape given to the quantities. [page 101 ]

He was right: small, noncomparative, highly labeled data sets usually belong in tables.

ill J" J" 1" ,)0 I,,, 7" fJo 'l0 \(><, '(0

­. ft.,.;',..,,1

r),/'u,../

-:... '::­--~ - I /Jrn;"",.,( ,,,,,I

')-"r .. '" ====­.I",rn"f

(",./,,,,,1

The chart does show, at any rate, the imports (cross-hatched lines) and exports (solid lines) to and from Scotland in 1781 for

17 countries, which are ordered by volume of trade. The horizontal scale is at the top, possibly to make it more convenient to see in plotting the points by hand. Zero values are nicely indicated both by the absence of a bar and by a "0." The horizontal scale mis­takenly repeats "200." In nearly all his charts, Playfair placed the

labels for the vertical scale on the right side of the page (suggest­ing that he plotted the data points using his left hand).

34 GRAPHICAL PRACTICE

Playfair's last book addressed the gllestion whether the price of wheat had increased relative to wages. In his Letter OI! our agriC1/l­tural distresses, their causes and remedies; accompanied with tables and

copper-plate charts shewing and {(Jlliparil1c~ the prices c:r wheL1t, head and labour, from 1565 to 1821, Playfair wrote:

You have before you, my Lords and Gentlemen, a chart of the prices of wheat for 250 years, made from official returns; on the same plate I have traced a line representing, as nearly as I can, the wages of good mechanics, such as smiths, masons, and carpenters, in order to compare the proportion between them and the price of wheat at every different period.... the main fact deserving ofconsideration is, that never at any former period was wheat so cheap, in proportion to mechanical labour, as it is at the present time.... [pages 29-31]

Here Playfair plotted three parallel time-series: prices, wages, and the reigns of British kings and gueens.

IhtL.('("ufln;\'

h'ltztll'dh ./(1/",,, .. /

Ii lh (\'utllI:V

{lmrl'ul Ifh'vontl rYttU'b'.rH

(0 {'U.\RT. 0, , .J'III'II'I/I,/ III U/I/ !ti'H' )

, // _j/fr;" ;.; llll (Jllllril/"'I"IIml -­__ /,,­ _.r­ •

,1\: \'~:!!('s of r~!~:~~~I~vlhr 't~.t.~

. //" Jh" r /;;;,; I, 101':.1/, ,,-­ ." /' -Jw­ "'

~nl.l.l.Ull'I.AY.·.~

lM,I. ('("utllry

(;"'1I1/'8

I"{,r"'!!"",,r

, 7(1 "

" 5

M· J .~

-1'; ~

'I I~ { ~

'I i . I , ";,1I. ";,¥,.r,, ",,'; .$= ~............... I I 1;"''''0.' 'I(' I.t'·,

)('j/ldlm,_

; );", , "";" ,llIf.JIt'l

The history and genealogy ofroyalty was long a graphical favorite.

This superb construction of E. J. Marey brings together several sets of facts about English rulers into a time-series that conveys a sense of the march of history. Marey (1830-1904) also pioneered

the development of graphical methods in human and animal physiology, including studies of horses moving at different paces,

GRAPHICAL EXCELLENCE 35

000 "'.... <Xl

'"~

0 001 0 c.o, .... <Xl '" I

0 0 0 0 '"('J l"'l .,. '" <0

I co co

~+

E.]. Marey, La MhilOde Graphiqllc (Paris, 1885), p. 6.

E.]. Marcy, l',[ollcmenl (London, 1895). Beginning with the tracks of the horse, the time-series are from pages 191, 224, 222, 265, 60, and 61.

n

, , , , , :

:0:0'

o I"'!! -~~ •• '," -:(j'j : I , , , ,

", Q ; ()

fl ('){ fl " n:

: Q ,

fl-j­ ~:,In ~_: I,'t) fl: -1\

()1'r.!inCl./'!J TF(llk q~lick ..t 1I/7Jlf. (,'I!/I,,)!. 11'{({!c. (Tail!] stn'de). 11,((lk.

---

36 GRAPHICAL PRACTICE

the movement of a starfish turning itself over (read intages from

the bottom upwards),

r---------- "----------- ---­I

\

? - -----_.:f·~.11 (;;'

the undulations of the dorsal fin of a descending sea-horse, . Zi-~

_IIIIIIIIII.S.~ I~~~~ t~+t~+~ , ~,

'--- - --,- ~------'!

as well as the advancc of the gecko.

Marcy's man in black velvet, photographed in stick-figurc imagcs,

becamc the timc-series forerunner of Marcel Duchamp's Nude Descending a Staircase.

"~'I

,:';) _c: l '

(:,.'l~j '-, ~ ;'/)~S,).~ "

_.'-, Y', '.', ' ~~ '- ~ I ' '

_ ;t"l "I J?/~~"tn~!I~~~-~~ il,gJd,S!~) "

,"' ... - ~.-~'\,~#:,,__'J J, "

'-,I,

GRAPHICAL EXCELLENCE 37

The problem with time-series is that the simple passage of time

is not a good explanatory variable: descriptive chronology is not causal explanation. There are occasional exceptions, especially when

there is a clear mechanism that drives the Y-variable. This time­series does testify about causality: the outgoing mail of the u.s. House of Representatives peaks every two years, just before the election dav:

I

60 - Monthly outgoing mail workload, millions of units

October 1972October 1970

October 1968

40 ­

The graphic is worth at least 700 words, the number used in a news report describing how incumbent representatives exploit their

free mailing privileges to advance their re-election campaigns:

!, for sending them as an. integral i founding of the r~publlC, .and Ifrank~d newsletter to his ol~ I Dole of Kansas, Peter H. Domi- I Senator Tower's use ofFRANKED MAIL TIE Ipart of a model re-electIOn cam- only Congress pollees agamst constituents after he had len' nick ?f Colorado Charles McC.! franked mail in his 1972 cam· II

Npalgn. abuses o~ the free .f!lailin~s. office.. Me: Clark is seekmg~Mathlas Jr. of Maryland paignwasctocumen.ted bymem.

q~enator John G. T?wer, Re~ Complamts o~. pohtie:a1. use to regam hIS old post. ' Another pOlItical mall speClaj- d publtcan of Texas, mailed more of the free-mailing pnvilege, I . d' oran ums.TO VOTING SHOW than 800,000 special.interest called the franking privilege, I Practice Documente .. i1St, Lee W. MacGregor, wrote: Tom. Loe~fler, a nig~-rankif1g letters at taxpayer expense as are heard every election year. I Seldom has th<:: pohttcal a proposal. for. the. use of, campaIgn aIde, wrote In a mem-

Testimony Finds the Volume part of his 1972 re-election Recently, however, the volume,use of franked mall. been so ;franke~ mall by hIS chIef, ~ena- orandum da.ted Oct. ~7, . . effort and received campaign and cost of franked mail has ,well. documented as In fec.ent :to:,Javlts, m 1973... 1972, that dunng the campaIgn

Rises Before Elections 1yolunteer offers and donations multiplied. A new Federal law I~estlmony and documents filed: The over~all obJectlve of the Senator Tower had sent "31 110 response. will limit what out-of-office,ln a Federal Court by Common 1franked mal1.p.rogram can ~ special interest letters totaling I q~nator Jacob K. Javits. Re-I challengers can spend to unseat I~ause. the lobby group. whlch ,: to g.et th.e reclpl~';1t of the. mall approximately 803,333 franked

WASHlNGTO~. June 1 (AP)'pubhcan of New York. gave incumbents. IS sumg for an .e!1d to tax-fl-:to I.denttfy positively With a maIlings." -New court testimony and doc-l written appr~Yal in 1973 fOTI In 1972, Congress passed a !nanced mass mallmgs by Con_lpartlcu.lar stand you .have taken \1r. Tower was not available uments show that much ofla tax-paid mal! program intend- law prohibiting mass franked :gress. lor a ~lll you. hav~. mt~oduced; for comment. His administra· the mail Congress sends atl ed to better his image and Imailings within 28 days before I For. ~xample: Joyc~ ~. Bake.T, :the kmd of ldentlf!catlOn that, ti~e assistant, Elwin Skiles. taxpayer expense is tied. direct- payoff at the polls. He focused an ejection. The sponsor of!a political m.all speclal1st, sllld can be translated ll'lt.o a vote·sald the S~nator's use of ly to the re-election campaigns, his mail on areas where he that legislation, Representative 10 a 1973 Job proposal that at the polls on ejectIOn day" franked mail m 1972 was with­of Senate and House members.l ~eeded votes. . Morris K. Udall, Democrat of! she wanted to set up direct- Mr. MacGregor saId. in the law.. and .he defended ~ccording . ~ material filed tiThe volume of "official"l Anzona, said in an inte"dew, mall progral?s for Republic~n I Mr. Javits was out of the the free-mailing prIvileges. 10 a laWSUit 10 F~deral Court.: \Congressicnal mail rises. in that further c.hange~ ~ere need- .Se~ators USlOg fran.ked mall. Icou.ntry an~ could ..not .be Pos.tal Service figures show

tISenate Republlcans put two election years and peaks just! ed to curtail political abuse· The purpose of such a pro- reached. HIS admlOlstratlve,that 10 the 12 months before direct-mail experts on the pub~ before the general election. of the frank. gram is to help an mcumbent aSSIstant, Donald Kellerman, November, 1973, Congress .sent lie payroU to ~vise the~.onl ~one. of this activity neces- Mr. Udall urged a 60-dayISe:tator get re-elected," she !def~nded the use of franked 222.9 .million .franked pieces how to use theIr free malhng: sanlyVlolates anyla.w or regula- ,pre-election cutoff for mass: saId. mall. ,of mall. But 10 the next 12 privileges to get votes. ltion, since Congress has wide Imailings and said he favored I She was put on the Senate: "It is a standard device to:months, covering the election

qAn election manual pre- discretion in the use of tax-paid, closing a loophole that recently payroll at $18,810 a year in 11et voters, not voters but Cltl2- season Of .1974, C~mgress sent pared for Senate Democrats:~aiJ. Congress gave itself thelallowed defeated Representa-·1973 and 1974 and testified i~ns. know w~at the Senatorl 350.6 ffillhon, a Jump of 57

~~,!? ~~~t~~s aati~~;~W:: ~~~r~e~~nd e~~~~~~ ~ail th;1 ;ite ::n~~l~~n?~~r~~ ~~~£ra~: khea~u~~~I:; 1~~ta;~se ~~b:;:e1. ,he ~~:~~ here In Washington, ;::,~~~~.a~k~~s :a~~~'s happen­

38 GRAPHICAL PRACTICE

Time-series plots can be moved toward causal explanation by smuggling additional variables into the graphic design. For example, this decomposition of economic data, arraying 1,296 numbers, breaks out the top series into seasonal and trading-day fluctuations (which dominate short-term changes) to reveal the long-run trend

adjusted for inflation. (Note a significant defect in the design, however: the vertical grid conceals the height of the December

peaks.) The next step would be to bring in additional variables to explain the transformed and improved series at the bottom. ll

Systemafic and Irregular Components of Toto! Retail Soles, Unifed States [Jillion:;>; of (k31ars

20

44 40 36 32

28

24

j 105 100

. 95

r('rCd,t

rer Cl>rt l

Unadjusted data

I ~~J~II

PcrCl.'n t

\~~j :~~~~~Wv\rl 95

Holiday variation

Percent

1900 ISSI 1862 1563 1S6'l 1955 1966 1$7 1$8 1$9 1970 1971

11 See William S. Cleveland and Irma]. Tcrpcnning, "Graphical Methods for Seasonal Adjustment," Jllllrnal or the American Statisti((jl Associati,'1l 77 (March 1982), 52- 62 .

Julius Shiskin, "Measuring Currcnt Eco­nomic Fluctuations," Statistical Reporter (July 1973), p. 3.

time-series:

Finally, a vivid design (with appropriate data) IS the before-after

Time 01 day (PST)

3 6 9 12 15 18 21 24 - ,----;- i---T

8 r.~""~'" ".\10'\',"",,,,, "',1 4I'w~""oftiI""''''~

6 ~ I I'0 4

! x 2 ... !

0 ...¥'t!~j:~ .d'~1f'''''''';,''''''''',.J

,;! ~.-I

-2 LN 2 transfer

February 14, 1982-4

A monopole?

Cahrera's candidate monopole YiRna/ /(lorns Ol'('1' a dis/ur!>a!1ce cau ..... ed hy a li(jllid nitrogen tran~fer earlier in the da....... The jUnIP i/l maf;netic flux throURh the slIpel"cotlductil1R detector loop (or equil'alently. the jump in the loop's SllperCl/rrt'1l1) is jN.\! the riAl/It l1la,t!J1itude to he (l

monopole. ilforeover, the currelll remained stahle for many hours affenl'ard

And before and after the collapse of a bridge on the Rhone in 1840:

Paul de llourO'-S~Andcol sur Ie H.hime. b

GRAPHICAL EXCELLENCE 39

M. Mitchell Waldrop, "In Search of the Magnetic MOllopole," Sciel/ce (June 4, 1982), p. 1087.

Charles Joseph Minard, "Dc 1a Chute des Ponts dans Ies grandes Crues," (October 24, 18S6), Figure 3, in Minard, Collectioll de ses brochures (Paris, 1821­1869), held by the Bibliothcque de l'EcoIc Natiollale des Pouts et Challssees, Paris.

40 GRAPHICAL PRACTICE

Narrative Graphics of Space and Time

An especially effective device for enhancing the explanatory power

of time-series displays is to add spatial dimensions to the design of the graphic, so that the data are moving over space (in two or three dimensions) as well as over time. Three excellent space-time­

story graphics illustrate here how multivariate complexity can be subtly integrated into graphical architecture, integrated so gently and unobtrusively that viewers are hardly aware that they are looking into a world of four or five dimensions. Occasionally graphics are belligerently multivariate, advertising the technique rather than the data. But not these three.

The first is the classic of Charles Joseph Minard (1781-1870), the French engineer, which shows the terrible fate of Napoleon's army

in Russia. Described by E. J. Marey as seeming to defy the pen of the historian by its brutal eloquence,12 this combination of data map and time-series, drawn in 1861, portrays the devastating losses suffered in Napoleon's Russian campaign of 1812. Beginning at the left on the Polish-Russian border near the Niemen River, the thick band shows the size of the army (4-22,000 men) as it invaded Rus­sia in June 1812. The width of the band indicates the size of the army at each place on the map. In September, the army reached Moscow, which was by then sacked and deserted, with 100,000

men. The path of Napoleon's retreat from Moscow is depicted by the darker, lower band, which is linked to a temperature scale and dates at the bottom of the chart. It was a bitterly cold winter, and many froze on the march out of Russia. As the graphic shows, the crossing of the Berezina River was a disaster, and the army finally

struggled back into Poland with only 10,000 men remaining. Also shown arc the movements of auxiliary troops, as they sought to protect the rear and the flank of the advancing army. Minard's

graphic tells a rich, coherent story with its multivariate data, far more enlightening than just a single number bouncing along over time. Six variables are plotted: the size of the army, its location all a two-dimensional surface, direction of the army's movement, and temperature on various dates during the retreat from Moscow.

It may well be the best statistical graphic ever drawn.

12 E.]. Marcy, La MMlOde Graphiql!e (Paris, 1885), p. 73. For morc on Minard, see Arthur H. Robinson, "The Thematic Maps of Charles Joscph Minard," It1Iago Mttndi, 21 (1967), 95~108.

CARTE FIGURATIVE. deli pertes successives en homme~ de l'Armee Fran~aise dans fa campagne de Russie 1812-1813.

Dres.ee parM.Minard,lnspecteur Genera! des Ponts et Chaussees en retro1.ite.

o

22 000 oo

&.000 ,.;

91Jn' = NOVClllbcr 8 11 1'(' = ()nobcr

42 GRAPHICAL PRACTICE

The next time-space graphic, drawn by a computer, displays the levels of three air polIutallts located over a two-dimensional surflce (six counties in southern California) at four times during

the day. Nitrogen oxides (top row) are emitted by power plants, refineries, and vehicles. Refineries along the coast and Kaiser Steel's Fontana plant produce the post-midnight peaks shown in the first

panel; traffic and power plants (With their heavy daytime demand) send levels up during the day. Carbon monoxide (second row) is low after midnight except out at the steel plant; morning traffic then begins to generate each day's ocean of carbon monoxide, with the greatest concentration at the convergence of five freeways in downtown Los Angeles. Reactive hydrocarbons (third row), like nitrogen oxides, come from refmeries after midnight and then increase with traffic during the day. Each of the 12 time-space­pollutant slices summarizes pollutants for 2,400 spatial locations

(2,400 squares five kilometers on a side). Thus 28,800 pollutant readings are shown, except for those masked by peaks.

The air pollution display is a slJlalllllltltiplc. The same graphical design structure is repeated for each of the twelve slices or multi­ples. Small multiples are economical: once viewers understand the design of one slice, they have immediate access to the data in all the other slices. Thus, as the eye moves from one slice to the next, the constancy of the design allows the viewer to focus on changes in the data rather than on changes in graphical design.

~._-~ BOUNDARYOfSCAB~

~_.L_OSC.A.O.N.(..G.I-:~yES _.C~~N BERNARDINO"

ORANGE - __ . RiVERSIDE

c ', ----caUNTT COUNn"

Los Angeles TiIlICS, July 22, 1979; based on work of Gregory J. McRae, Cali­fornia Institute of Technology.

~---

I ~,ill\1 l

rr,' 'I)' I, I

~c-!~".,.-,' Vir ~~~:)I-II-------, ~r,'\:~~\JIII\ft\I.- ftv.:",,\, r"" \.

~,i-~~pdrJ !:11~~~'JV"~ 1"";>, \

~~~

Our third example of a space-time-story graphic ingeniously mixes space and time on the horizontal axis. This design moves

well beyond the convelltiollal time-series because of its clever

plotting fdd, with location rclatiw to the ground surface Oll the

vertical axis and time/space on the horizolltal. The life cycle of

the Japanese beetle is shown.

February March May JuneIJanuary J lily /\U~lust

I i I

i

More Abstract Designs: Relational Graphics

The invention of data graphics required replacing the latitude­

longitude coordinates of the map with more abstract measures not

based on geographical analogy, Moving from maps to statistical

graphics was a big step, and thousands of years passed bdore this

step was taken by Lambert, Playfair, and others in the eighteenth

century. Even so, analogies to the physical world served as the

conceptual basis for early time-series. Playtair repeatedly compared

his charts to maps and, in the preface to the first edition of The COllllllcrrial alld Political Atlas, argued that his cllarts corresponded

to a physical realization of the data:

Suppose the mOlley we pay in anyone year for the expence of the Navy were in guineas, and that these guineas were laid down upon a large table in a straight line, and touching each other, and those paid next year ,vere laid down in another

GRAPHICAL EXCELLENCE 43

L. Hugh Newl1lan, ;Hall alld ["'COS

(L"nd()Jl, 19('5), Pl" 104- 10 5.

: September ~ October : NovPlllher" DUCl'mbcr

44 GRAPHICAL PRACTICE

straight line, and the same continued for a number of years: these lines would be of different lengths, as there were fewer or more guineas; and they would make a shape, the dimensions of which would agree exactly with the amount of the sums; and the value of a guinea would be represented by the part of space which it covered. The charts are exactly this upon a small scale, and one division represents the breadth or value of ten thousand or an hundred thousand guineas as marked, with the same exactness that a square inch upon a map may represent a square mile of country. And they, therefore, are a represen­tation of the real money laid down in differellt lines, as it was originally paid away. [pages iii-iv]

Fifteen years later in The Statistical Brel!iary, his most theoretical

book about graphics, Playtair broke free of analogies to the phys­ical world and drew graphics as designs-in-themselves.

One of four plates in The Statistical BreIJiary, this graphic is dis­

tinguished by its multivariate data, the use of area to depict quan­

tity, and the pie chart~in apparently the first application of these

devices. The circle represents the area of each country; the line on

the left, the population in millions read on the vertical scales; the

line on the right, the revenue (taxes) collected in millions of pounds sterling read also on the vertical scale; and the "dotted lines drawn

between the population and revenue, are merely intended to con­

nect together the lines belonging to the same country. The ascent

*D*

STATISTICAL BREVIARY;

ON A PRTNCIPLE ENTIRELY NEW,

THE RESOURCES

STATE AND KINGDOM IN EUROPE,

STAlKED COrPER·PLATE CHARTS,

PH,YSICAL POWEllS OF F.:\CH DISTINCT NATION WITH LAS!': AND PERsPICUITY.

By WILLIAM PLAYFAIR.

A SIMILAR EXHIBITION OF THE RULING POWf:RS OF HINDOOSTA!'.'.

LONDON· Prin... by T. II'N'UY, Bolt COO", Flu' Str<<l,

hr J. W~"'" 46, r',ume"" Row; C........ TU on<! C" lIond s,...,..; I n"TON, WhJ,<h.lI; V,. No" 'no H"OD, Poo!<ry; B•• <~

~~,~: •• y, Le.d=h.lI Stre"; and T' •• &T andDI<"'.' ~L j.m••••

tHOl,

(JIARTiI<-\,""",,'ntinrlhr Jo:XTJo:NT,POl'rJH\TION 1'1lJo1VE~PI'~S, ofth.. PlUNl7P.·U; ..\~17'!I'.'·S III to:., R()PEtaftu'ihe.J)N~.TON..t·I'OI,.ANnA"TREATY of LrNKVlLU': •

; :. I ~ /1

-1'""'~

:t-I I " I, I, I,'I 1:1 , \~,-

,

AsIATIl~ DO)[I]lU'JN~

Hrw.I"I<iNJ!MJ'UU· +1__ · r,,""'PNi!u

.~ ~ t.'~

:~

, ~

i ,5H ..

~ r..- t:..--,. ,¥ ......,«,.~ .".

~J i'f ~ 1:

•jh U

I

I' i

of those lines being from right to left, or from left to right, shews

whether in proportion to its population the country is burdened

with heavy taxes or otherwise" (pages 13-14). The slope of the

dotted line is uninformative, since it is dependent on the diameter

of the circle as well as the height of the two verticals. However,

the sign of the slope does make sense, taking Playfair to his familiar

point about what he regarded as excessive taxation in Britain

(fourth circle from the right, with the slope ri.lllning opposite to

most countries). Playfair was enthusiastic about the multivariate

arrangement because it fostered comparisons:

The author of this work applied the usc of lines to matters of commerce and finance about fifteen years ago, with great success. His mode was generally approved of as not only facili­tating, but rendering those studies more clear, and retained more easily by the memory. The present charts arc in like manner intended to aid statistical studies, by shewing to the eye the sizes of different countries represented by similar forms, for where forms are not similar, the eye cannot compare them easily nor accurately. From this circumstance it happens, that we have a more accurate idea of the sizes of the planets, which arc spheres, than of the nations of Europe which we see on the maps, all of which arc irregular forms in themselves as well as unlike to each other. Size, Population, and Revenue, are the three principal objects of attention upon the general scale of statistical studies, whether we are actuated by curiosity or interest; I have therefore represented these three objects in one view.... [page 15]

But here Playfair had a forerunner-and one who thought more

clearly about the abstract problems of graphical design than did

Playfair, who lacked mathematical skills. A most remarkable and

explicit early theoretical statement advancing the general (non­

analogical) relational graphic was made by J. H. Lambert in 1765,

35 years before The Statistical Breviary:

We have in general two variable quantities, x, y, which will be collated with one another by observation, so that we can determine for each value of x, which may be considered as an abscissa, the corresponding ordinate y. Were the experiments or observations completely accurate, these ordinates would give a number of points through which a straight or curved line should be drawn. But as this is not so, the line deviates to a greater or lesser extent from the observational points. It must therefore be drawn in such a way that it comes as ncar as possible to its true position and goes, as it were, through the middle of the given points. 13

GRAPHICAL EXCELLENCE 45

13Johann Heinrich Lambert. Beytr'{l!e zum Gebrm{(hr da AIarhematik ulld daetl AllllJel1lltl1lg (Berlin. 1765), as quoted in Laura Tilling, "Early Experimental Graphs," Brirish JOllrnal for the History of S(;el1(C, 8 (1975),204-205.

46 CRAPHICAL PRACTICE

Lambert drew a graphical derivation of the evaporation rate of

water as a function of temperature, according to Tilling. The

analysis begins with two time-series: DE F, showing the decreasing

height of water in a capillary tube as a function of time, and AB C,

the temperature. The slope of the curve DE F is then taken (note the

tangent DEC) at a number of places, yielding the rate ofevaporation:

~,~ 100 !'-: I ",,,"

2l

'0 '9

" C

A F

'PI ~ " , L i b .. .,' I> :r-I

(, ~'7a/t£...

To complete the graphical calculus, the measured rate is plotted

against the corresponding temperature in this relational graphic:

J. H. Lambert, "Essai d'hygromctrie ou sur la mcsure de l'llUmidite," A!<'lIloires de I'Acad"lIlie Royale des Scif1/ces et Belles­Lettres . .. 1769 (Berlin, 1771), plate 1, facing p. 120; from Tilling's article.

li)!Jr

A ~I I

' I .r 11 ! : _B. 1I

I !I _

30J~ 20 ... J~ 60 It

Thus, by the early 1800s, graphical design was at last no longer

dependent on direct analogy to the physical world-thanks to the

work of Lambert and Playfair. This meant, quite simply but quite

profoundly, that any variable quantity could be placed in rela­

tionship to any other variable quantity, measured for the same

units of observation. Data graphics, because they were relational

and not tied to geographic or time coordinates, became relevant

to all quantitative inquiry. Indeed, in modern scientific literature, about 40 percent of published graphics have a relational form, with two or more variables (none of which are latitude, longitude, or time). This is no accident, since the relational graphic-in its barest form, the scatterplot and its variants-is the greatest of all graphical designs. It links at least two variables, encouraging and even imploring the viewer to assess the possible causal relationship between the plotted variables. It confronts causal theories that X causes Y with empirical evidence as to the actual relationship be­tween X and Y, as in the case of the relationship between lung cancer and smoking:

CRUDE MALE DEATH RATE FOR LUNG CANCER IN 1950 AND PER CAPITA CONSUMPTION OF

CIGARETTES IN 1930 IN VARIOUS COUNTRIES.

500

400

z 300 0 ...J ...J

::i

W '" D..

'":r: >­«: w 0 200

100

GREAT BJITAIN • ~

FINLAND V)/

SWITZERLAND /0.73, 0.30.- ~ HOLLAND ~

)" •U.S.A.

DENMARK ~- AUSTRALIA-V CANADA

./ '-SWEDEN

V -NORWAY

•ICELAND

250 500 750 1000 1250 1500

CIGARETTE CONSUMPTION

GRAPHICAL EXCELLENCE 47

Report of the Advisory Committee to the Surgeon General, SII/"f.:illg alld Health (Washingwn, D.C., 19(4), p.176; based on R. Doll, "Etiology ofLung Cancer," Ad/',lI/CCS ill Calla'r Rescarcll, 3 (1955), 1-50 .

---

48 GRAPHICAL PRACTICE

These small-multiple relational graphs show unemployment and Paul McCracken, ct a!', Towards FilII Employment alld Price Stability (Paris,inflation over time in "Phillips curve" plots for nine countries, 1977), p. 106.

demonstrating the collapse of what was once thought to be an

inverse relationship between the variables.

InDation and Unemployment Rates

Per cent

Canada United States Japan

20

10

15

10

74

6'

- I I 3 4 5 6 7 1 0 14 18 22

10

Male unemployment rate

United Kingdom France Germany

25 75 J 74~~. 73

i.) .1 I ~75 20

1074 7615

I ',:6

71 8110 73

)72 7[J

I ' 67

14 18 2.2 26

Netherlands Sweden Italy75

10 I ~ 10 I

16

12

\

A ~ 72

R 65

67

10 15 2.0 25 4

GRAPHICAL EXCELLENCE 49

Theory and measured observations diverge in the physical sci­

ences, also. Here the relationship between temperature and the

thermal conductivity of copper is assessed in a series of measure­

ments from difl:crent laboratories. The connected points are from

a single publication, cited by an identification number. The very

different answers reported in the published literature result mainly

from impurities in the samples of copper. Note how effectively

the graphic organizes a vast amount of data, recording findings

of hundreds of studies on a single page and, at the same time,

enforcing comparisons of the varying results.

:1

RECOMMENDED I

'1

8 10'

THERMAL CONDUCTIVITY OF

COPPER

RECOMMENDED~' t~:,(ljquid}

'I

8 Kl'

TEMPERATURE. K

c. Y. Ho, R. W. Powell, and P. E. Liley, TherllJal Conductipity (:( the J::lclIJCIlts: A ComprehCllsipe Repicll', supplement no. 1, JOllrnal o( Physical alld ChellJ;wl Referc11cc Data, 3 (1974), 1-244·

50 GRAPHICAL PRACTICE

Finally, two relational designs of a different sort- wherein the E. C. Zeeman, "Catastrophe Theory," Scientific American, 234 (April 1976), 67; data points are themselves data. Here the effect of two variables based on Konrad Z. Lorenz. King

interacting is portrayed by the faces on the plotting field: SoiomOfl's Ring (New York, 1952).

FEAR

/ c::-:::-~' -~7

- ' ~-- -- ' . ."" ",#­~~

''e " ~

~

1, "

C:., j.Y'

----?7 RAGE

And similarly, the varying sizes of white pine seedlings after growing for one season in sand containing different amounts of calcium, in parts per million in nutrient-sand cultures:

H. L. Mitchell, The Grolvth alld Nlltritioll ':( White Pine Seedlillgs if I Cultures Il'ith Varyillg Nitroge1l, Phosphorlls, Potassiul1l mzd CalciulII, The Black Rock Forest Bulletin NO.9 (Cornwall-on-the­Hudson, New York, 1939), p. 70.

--

GRAPHICAL EXCELLENCE 51

Principles of Graphical Excellence

Graphical excellence is the well-designed presentation of interesting

data-a matter of substal/ce, of statistics, :lIld of design.

Graphical exce]]ence consists of complex ideas communicated

with clarity, precision, and efficiency.

Graphical excellence is that which giY(~s to the viewer the greatest

number of ideas in the shortest time with the least ink in the

smallest space.

spaceink

"",, <" ,.""1 .., "n.. ' ""~'" ",~~"" ".M.. ,."".".j'""'""M"!~ j ........ 1111 III}

ideas time

Graphical exce]]ence is nearly always multivariate.

And graphical excellence requires telling the truth about the data.