IESBS Statistics History


ensure that his results are not overvalued. Terms such as ‘statistical significance’ are easily and frequently misunderstood to imply a finding of practical significance; levels of significance are all too often interpreted as posterior probabilities, for example, in the guise that, if a DNA profile occurs only 1 in 10,000 times, then the chances are 9,999 to 1 that an individual having a matching profile is the source of the profile. As a result, individuals may tend to focus on the quantitative elements of a case, thereby overlooking qualitative elements that may in fact be more germane or relevant.
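To see why the ‘9,999 to 1’ reading (often called the prosecutor’s fallacy) is wrong, it helps to run the arithmetic through Bayes’ theorem. The following minimal Python sketch uses entirely hypothetical numbers, not figures from any actual case: a 1-in-10,000 profile frequency and an assumed pool of one million alternative possible sources.

    # Hypothetical illustration: a rare profile does not by itself make the
    # matching individual the likely source.
    profile_frequency = 1 / 10_000   # assumed frequency of the profile in the population
    population = 1_000_000           # assumed number of possible sources
    prior = 1 / population           # flat prior: the suspect is a priori one among many

    # Bayes' theorem, taking the match probability for the true source to be 1:
    posterior = prior / (prior + (1 - prior) * profile_frequency)
    print(round(posterior, 3))       # about 0.01, far from the claimed 0.9999

With these assumptions, roughly a hundred people in the pool would match by chance, so the posterior probability that a given matching individual is the source is on the order of one percent, not 99.99 percent.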

9. Other Applications

The last decades of the twentieth century saw a remarkable variety of legal applications of statistics: attempting to determine the possible deterrent effects of capital punishment, estimating damages in antitrust litigation, epidemiological questions of the type raised in the book and movie A Civil Action. Some of the books listed in the bibliography discuss a number of these.

See also: Juries; Legal Reasoning and Argumentation; Legal Reasoning Models; Linear Hypothesis: Fallacies and Interpretive Problems (Simpson’s Paradox); Sample Surveys: The Field

Bibliography

Bickel P, Hammel E, O’Connell J W 1975 Is there a sex bias in graduate admissions? Data from Berkeley. Science 187: 398–404. Reprinted in: Fairley W B, Mosteller F (eds.) Statistics and Public Policy. Addison-Wesley, New York, pp. 113–30

Chaikan J, Chaikan M, Rhodes W 1994 Predicting violent behavior and classifying violent offenders. In: Understanding and Preventing Violence, Volume 4: Consequences and Control. National Academy Press, Washington, DC, pp. 217–95

DeGroot M H, Fienberg S E, Kadane J B 1987 Statistics and the Law. Wiley, New York

Evett I W, Weir B S 1998 Interpreting DNA Evidence: Statistical Genetics for Forensic Scientists

Federal Judicial Center 1994 Reference Manual on Scientific Evidence. McGraw Hill

Fienberg S E 1971 Randomization and social affairs: The 1970 draft lottery. Science 171: 255–61

Fienberg S E (ed.) 1989 The Evolving Role of Statistical Assessments as Evidence in the Courts. Springer-Verlag, New York

Finkelstein M O, Levin B 1990 Statistics for Lawyers. Springer-Verlag, New York

Gastwirth J L 1988 Statistical Reasoning in Law and Public Policy, Vol. 1: Statistical Modeling and Decision Science; Vol. 2: Tort Law, Evidence and Health. Academic Press, Boston

Gastwirth J L 2000 Statistical Science in the Courtroom. Springer-Verlag, New York

Harr J 1996 A Civil Action. Vintage Books, New York

Lagakos S W, Wessen B J, Zelen M 1986 An analysis of contaminated well water and health effects in Woburn, Massachusetts. Journal of the American Statistical Association 81: 583–96

Meier P, Zabell S 1980 Benjamin Peirce and the Howland Will. Journal of the American Statistical Association 75: 497–506

Tribe L H 1971 Trial by mathematics. Harvard Law Review 84: 1329–48

Zeisel H, Kaye D H 1997 Prove It With Figures: Empirical Methods in Law and Litigation. Springer-Verlag, New York

S. L. Zabell

Statistics, History of

The word ‘statistics’ today has several different meanings. For the public, and even for many people specializing in social studies, it designates numbers and measurements relating to the social world: population, gross national product, and unemployment, for instance. For academics in ‘Statistics Departments,’ however, it designates a branch of applied mathematics making it possible to build models in any area featuring large numbers, not necessarily dealing with society.

History alone can explain this dual meaning. Statistics appeared at the beginning of the nineteenth century as meaning a ‘quantified description of human–community characteristics.’ It brought together two previous traditions: that of German ‘Statistik’ and that of English political arithmetic (Lazarsfeld 1977).

When it originated in Germany in the seventeenth and eighteenth centuries, the Statistik of Hermann Conring (1606–81) and Gottfried Achenwall (1719–79) was a means for classifying the knowledge needed by kings. This ‘science of the state’ included history, law, political science, economics, and geography, that is, a major part of what later became the subjects of ‘social studies,’ but presented from the point of view of their utility for the state. These various forms of knowledge did not necessarily entail quantitative measurements: they were basically associated with the description of specific territories in all their aspects. This territorial connotation of the word ‘statistics’ would persist long into the nineteenth century.

Independently of the German Statistik, the English tradition of political arithmetic had developed methods for analyzing numbers and calculations, on the basis of parochial records of baptisms, marriages, and deaths. Such methods, originally developed by John Graunt (1620–74) in his work on ‘bills of mortality,’ were then systematized by William Petty (1627–87). They were used, among other things, to assess the population of a kingdom and to draw up the first forms of life insurance. They constitute the origin of modern demography.

Nineteenth-century ‘statistics’ was therefore a fairly informal combination of these two traditions: taxonomy and numbering. At the beginning of the twentieth century, it further became a mathematical method for the analysis of facts (social or not) involving large numbers and for inference on the basis of such collections of facts. This branch of mathematics is generally associated with probability theory, developed in the seventeenth and eighteenth centuries, which, in the nineteenth century, influenced many branches of both the social and natural sciences (Gigerenzer et al. 1989). The word ‘statistics’ is still used in both ways today, and both of these uses are still related, of course, insofar as quantitative social studies use, in varied proportions, inference tools provided by mathematical statistics. The probability calculus, for its part, grounds the credibility of statistical measurements resulting from surveys, together with random sampling and confidence intervals.

The diversity of meanings of the word ‘statistics,’ maintained until today, has been heightened by the massive development, beginning in the 1830s, of bureaus of statistics, administrative institutions distinct from universities, in charge of collecting, processing, and transmitting quantified information on the population, the economy, employment, living conditions, etc. Starting in the 1940s, these bureaus of statistics became important ‘data’ suppliers for empirical social studies, then in full growth. Their history is therefore an integral part of these sciences, especially in the second half of the twentieth century, during which the mathematical methods developed by university statisticians were increasingly used by the so-called ‘official’ statisticians. Consequently the ‘history of statistics’ is a crossroads history connecting very different fields, which are covered in other articles of this encyclopedia: general problems of the ‘quantification’ of social studies (Theodore Porter), mathematical ‘sampling’ methods (Stephen E. Fienberg and J. M. Tanur), ‘survey research’ (Martin Bulmer), ‘demography’ (Simon Szreter), ‘econometrics,’ etc. The history of these different fields was the object of much research in the 1980s and 1990s, some examples of which are indicated in the Bibliography. Its main interest is to underscore the increasingly close connections between the so-called ‘internal’ dimensions (history of the technical tools), the ‘external’ ones (history of the institutions), and those related to the construction of social studies ‘objects,’ born from the interaction between the three foci constituted by university research, administrations in charge of ‘social problems,’ and bureaus of statistics. This ‘co-construction’ of objects makes it possible to join historiographies that not long ago were distinct. Three key moments of this history will be mentioned here: Adolphe Quetelet and the average man (1830s), Karl Pearson and correlation (1890s), and the establishment of large systems for the production and processing of statistics.

1. Quetelet, the Average Man, and Moral Statistics

The cognitive history of statistics can be presented as that of the tension and sliding between two foci: the measurement of uncertainty (Stigler 1986), resulting from the work of eighteenth-century astronomers and physicists, and the reduction of diversity, which would be taken up by social studies. Statistics is a way of taming chance (Hacking 1990) in two different ways: chance and uncertainty related to protocols of observation, chance and dispersion related to the diversity and the indetermination of the world itself. The Belgian astronomer and statistician Adolphe Quetelet (1796–1874) is the essential character in the transition between the world of ‘uncertain measurement’ of the probability proponents (Carl Friedrich Gauss, Pierre-Simon de Laplace) and that of the ‘regularities’ resulting from diversity, thanks to his having transferred, around 1830, the concept of average from the natural sciences to the human sciences, through the construction of a new being, the average man.

As early as the eighteenth century, specific regularities appeared in observations made in large numbers: drawing balls out of urns, gambling, successive measurements of the position of a star, sex ratios (male and female births), or the mortality resulting from preventive smallpox inoculation, for instance. The radical innovation of this century was to connect these very different phenomena thanks to the common perspective provided by the ‘law of large numbers’ formulated by Jacques Bernoulli in 1713. If draws from a constant urn containing white and black balls are repeated a large number of times, the observed share of white balls ‘converges’ toward the share actually contained in the urn. Considered by some as a mathematical theorem and by others as an experimental result, this ‘law’ was at the crossroads of the two currents of the epistemology of science: one ‘hypothetical-deductive,’ the other ‘empirical-inductive.’
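A minimal simulation sketch of Bernoulli’s urn scheme, in Python with arbitrary assumed parameter values (a 30 percent share of white balls), shows the convergence described above:

    # Law of large numbers: the observed share of white balls drawn with
    # replacement converges to the urn's true share as the number of draws grows.
    import random

    true_share = 0.3            # assumed proportion of white balls in the urn
    random.seed(0)
    for n in (10, 1_000, 100_000):
        whites = sum(random.random() < true_share for _ in range(n))
        print(n, whites / n)    # the observed share settles near 0.3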

Beginning in the 1830s, the statistical description of ‘observations in large numbers’ became a regular activity of the state. Previously reserved for princes, this information, henceforth available to ‘enlightened men,’ was related to the population, to births, marriages, and deaths, to suicides and crimes, to epidemics, to foreign trade, schools, jails, and hospitals. It was generally a by-product of administrative activity, not the result of special surveys. Only the population census, the showcase product of nineteenth-century statistics, was the object of regular surveys. These statistics were published in volumes with heterogeneous contents, but their very existence suggests that the characteristics of society were henceforth a matter of scientific law and no longer of judicial law, that is, of observed regularity and not of the normative decisions of political power. Quetelet was the man who orchestrated this new way of thinking about the social world. In the 1830s and 1840s, he set up administrative and social networks for the production of statistics and established—until the beginning of the twentieth century—how statistics were to be interpreted.

This interpretation is the result of the combination of two ideas developed from the law of large numbers: the generality of the ‘normal distribution’ (or, in Quetelet’s vocabulary, the ‘law of possibilities’) and the regularity of certain yearly statistics. As early as 1738, Abraham de Moivre, seeking to determine the convergence conditions for the law of large numbers, had formulated the mathematical expression of the future ‘Gaussian law’ as the limit of a binomial distribution. Then Laplace (1749–1827) had shown that this law constituted a good representation of the distribution of measurement errors in astronomy, hence the name that Quetelet and his contemporaries also used to designate it: the law of errors (the expression ‘normal law,’ under which it is known today, would not be introduced until the late nineteenth century, by Karl Pearson).
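In modern notation (not de Moivre’s own), the limit he identified is now stated as the de Moivre–Laplace theorem: if S_n is the number of successes in n independent trials with success probability p, then

    P\left( a \le \frac{S_n - np}{\sqrt{np(1-p)}} \le b \right) \;\longrightarrow\; \frac{1}{\sqrt{2\pi}} \int_a^b e^{-x^2/2}\, dx \qquad (n \to \infty),

the right-hand side being the ‘law of errors’ that Laplace fitted to astronomical measurements and that Quetelet reused.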

Quetelet’s daring intellectual masterstroke was to bring together two forms: on the one hand the law of errors in observation, and on the other, the law of distribution of certain body measurements of individuals in a population, such as the height of conscripts in a regiment. The likeness of the ‘Gaussian’ shapes of these two distributions justified the invention of a new being with a promise of notable posterity in social studies: the average man. Thus, Quetelet restricted the calculation and the legitimate use of averages to cases where the distribution of the observations had a Gaussian shape, analogous to that of the distribution of the astronomical observations of a star. Reasoning on that basis, just as behind that distribution there was a real star (the ‘cause’ of the Gaussian-shaped distribution), behind the equally Gaussian distribution of the height of conscripts there was a being of a reality comparable to the existence of the star. Quetelet’s average man is also the ‘constant cause’ lying behind the observed, controlled variability. He is a sort of model, of which specific individuals are imperfect copies.

The second part of this cognitive construction, which is so important in the later uses of statistics in social studies, is the attention drawn to the ‘remarkable regularity’ of series of statistics, such as those of marriages, suicides, or crimes. Just as series of draws from an urn reveal a regularity in the observed frequency of white balls, the regularity in the rates of suicide or crime can be interpreted as resulting from series of draws from a population, some of the members of which are affected with a ‘propensity’ to suicide or crime. The average man is therefore endowed not only with physical attributes but also ‘moral’ ones, such as these propensities. Here again, just as the average heights of conscripts are stable whereas individual heights are dispersed, crime or suicide rates are just as stable, whereas these acts are eminently individual and unpredictable. This form of statistics, then called ‘moral,’ signaled the beginning of sociology, a science of society radically distinct from a science of the individual, such as psychology (Porter 1986). Quetelet’s reasoning would ground the argument developed by Durkheim in Suicide: A Study in Sociology (1897).

This way of using statistical regularity to back the idea of the existence of a society ruled by specific ‘laws,’ distinct from those governing individual behavior, dominated nineteenth-century and, in part, twentieth-century social studies. Around 1900, however, another approach appeared, this one centered on two ideas: the distribution (no longer just the average) of observations, and the correlation between two or several variables, observed in individuals (no longer just in groups, such as territories).

2. Distribution, Correlation, and Causality

This shift of interest from the average individual to the distributions and hierarchies among individuals was connected to the rise, in late Victorian England, of a eugenicist and hereditarian current of thought inspired by Darwin (MacKenzie 1981). Its two advocates were Francis Galton (1822–1911), a cousin of Darwin, and Karl Pearson (1857–1936). In their attempt to measure biological heredity, which was central to their political construction, they created a radically new statistical tool that made it possible to conceive of partial causality. Such causality had been absent from all previous forms of thought, for which A either is or is not the cause of B, but cannot be so somewhat or incompletely. Yet Galton’s research on heredity led to precisely such a formulation: the parents’ height ‘explains’ the children’s, but does not entirely ‘determine’ it. The taller fathers are, the taller their sons are on average, but, for a given father’s height, the dispersion of the sons’ heights is great. This formalization of heredity led to the two related ideas of regression and correlation, later to be extensively used in social studies as symptoms of causality.
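A minimal simulation sketch of Galton’s observation, in Python with made-up parameter values (not Galton’s data), illustrates the regression toward the mean that gave the technique its name:

    # Regression toward the mean: sons of unusually tall fathers are taller than
    # average, but on average shorter than their fathers.
    import random

    random.seed(1)
    mean, sd, r = 175.0, 7.0, 0.5      # assumed mean (cm), SD, and father-son correlation
    pairs = []
    for _ in range(50_000):
        father = random.gauss(mean, sd)
        son = mean + r * (father - mean) + random.gauss(0, sd * (1 - r**2) ** 0.5)
        pairs.append((father, son))

    tall = [(f, s) for f, s in pairs if f > mean + sd]   # fathers at least one SD above the mean
    avg_father = sum(f for f, _ in tall) / len(tall)
    avg_son = sum(s for _, s in tall) / len(tall)
    print(round(avg_father, 1), round(avg_son, 1))       # sons exceed 175 cm, but by roughly half as much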

Pearson, however, greatly influenced by the anti-realist philosophy of the physicist Ernst Mach, challenged the idea of ‘causality,’ which according to him was ‘metaphysical,’ and stuck to that of ‘correlation,’ which he described with the help of ‘contingency tables’ (Pearson 1911, Chap. 5). For him, scientific laws are only summaries, brief descriptions in mental stenography, abridged formulas, a condensation of perception routines for future use and forecasting. Such formulas are the limits of observations that never perfectly respect strict functional laws. The correlation coefficient makes it possible to measure the strength of the connection, between zero (independence) and one (strict dependence). Thus, in this conception of science, associated by Pearson with the budding field of mathematical statistics, the reality of things can only be invoked for pragmatic ends and provided that the ‘perception routines’ are maintained. Similarly, ‘causality’ exists only insofar as it is a proven correlation, and therefore predictable with a fairly high probability. Pearson’s pointed formulations would constitute, in the early twentieth century, one of the foci of the epistemology of statistics applied to social studies. Others, in contrast, would seek to give new meaning to the concepts of reality and causality by defining them differently. These discussions were strongly related to the aims of statistical work, strained between scientific knowledge and decisions.
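In modern notation, the product-moment coefficient that now bears Pearson’s name is, for n paired observations (x_i, y_i),

    r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}, \qquad -1 \le r \le 1,

where the ‘strength of the connection between zero and one’ described above corresponds to the magnitude |r|.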

Current mathematical statistics proceed from the work of Karl Pearson and his successors: his son Egon Pearson (1895–1980), the Polish mathematician Jerzy Neyman (1894–1981), the statistician and pioneer of agricultural experimentation Ronald Fisher (1890–1962), and finally the engineer and beer brewer William Gosset, alias ‘Student’ (1876–1937). These developments were the result of an increasingly thorough integration of so-called ‘inferential’ statistics into probabilistic models. The interpretation of these constructions is always stretched between two perspectives: that of science, which aims to prove or test hypotheses, with truth as its goal, and that of action, which aims to make the best decision, with efficiency as its goal. This tension explains a number of the controversies that opposed the founders of inferential statistics in the 1930s. In effect, the essential innovations were often directly generated within the framework of research applied to economic issues, as in the cases of Gosset and Fisher.

Gosset was employed in a brewery. He developed product quality-control techniques based on a small number of samples, and he needed to appraise the variances and laws of distribution of parameters calculated from observations too few in number for the ‘law of large numbers’ to apply. Fisher, who worked in an agricultural research center, could only carry out a limited number of controlled tests. He mitigated this limitation by artificially creating a randomness, itself controlled, for the variables other than those whose effect he was trying to measure. This ‘randomization’ technique thus introduced probabilistic chance into the very heart of the experimental process. Unlike Karl Pearson, Gosset and Fisher used distinct notations to designate, on the one hand, the theoretical parameter of a probability distribution (a mean, a variance, a correlation) and, on the other, the estimate of this parameter, calculated on the basis of observations so few in number that the gap between these two values, theoretical and estimated, could no longer be disregarded.
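A minimal sketch of the randomization idea in its modern form, a permutation test on two made-up small samples (the numbers are illustrative assumptions, not Fisher’s agricultural data), may clarify how chance enters the analysis itself:

    # Randomization (permutation) test: if treatment had no effect, the treatment
    # labels are arbitrary, so the observed difference in means is compared with
    # the differences obtained under every relabeling of the same observations.
    import itertools

    treated = [29.1, 30.4, 31.0, 28.7]   # hypothetical plot yields
    control = [27.9, 28.2, 29.5, 27.1]
    observed = sum(treated) / len(treated) - sum(control) / len(control)

    pooled = treated + control
    extreme = total = 0
    for idx in itertools.combinations(range(len(pooled)), len(treated)):
        group = [pooled[i] for i in idx]
        rest = [pooled[i] for i in range(len(pooled)) if i not in idx]
        diff = sum(group) / len(group) - sum(rest) / len(rest)
        total += 1
        extreme += diff >= observed
    print(extreme / total)   # share of relabelings at least as extreme as what was observed

The resulting proportion plays the role of a significance level computed from the randomization itself rather than from an assumed distribution.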

This new system of notation marked a decisive turning point: it enabled an inferential statistics based on probabilistic models. This form of statistics was developed in two directions. The estimation of parameters, which took into account a set of recorded data, presupposed that the model was true: the information produced by the model was combined with the data, but nothing indicated whether the model and the data were in agreement. In contrast, hypothesis tests allowed this agreement to be tested and, if necessary, the model to be modified: this was the inventive part of inferential statistics. In asking whether a set of events could plausibly have occurred if a model were true, one compared these events—explicitly or otherwise—to those that would have occurred if the model were true, and made a judgment about the gap between these two sets of events.

This judgment could itself be made from two different perspectives, which were the object of a vivid controversy between Fisher on the one hand, and Neyman and Egon Pearson on the other. Fisher’s test was placed in a perspective of truth and science: a theoretical hypothesis was judged plausible or was rejected, after consideration of the observed data. Neyman and Pearson’s test, in contrast, was aimed at decision making and action. One evaluated the respective costs of rejecting a true hypothesis and of accepting a false one, described as errors of Type I and Type II. These two different aims—truth and economy—although supported by closely related probabilistic formalisms, led to practically incommensurable argumentative worlds, as was shown by the dialogue of the deaf between Fisher on one side, and Neyman and Pearson on the other (Gigerenzer et al. 1989, pp. 90–109).
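In the modern notation of the Neyman–Pearson framework (added here for clarity; it is not in the original passage), the two error probabilities are

    \alpha = P(\text{reject } H_0 \mid H_0 \text{ true}) \quad (\text{Type I}), \qquad \beta = P(\text{fail to reject } H_0 \mid H_0 \text{ false}) \quad (\text{Type II}),

and the test is chosen so as to maximize the power 1 - \beta subject to a fixed bound on \alpha.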

3. Official Statistics and the Construction of the State

At the same time as mathematical statistics were developing, so-called ‘official’ statistics were also being developed in ‘bureaus of statistics,’ for a long time on an independent course. The latter did not use the new mathematical tools until the 1930s in the United States and the 1950s in Europe, in particular when the random sample-survey method was used to study employment or household budgets. Yet in the 1840s, Quetelet had already actively pushed for such bureaus to be set up in different countries, and for their ‘scientification’ with the tools of the time. In 1853, he had begun organizing the meetings of the ‘International Congress of Statistics,’ which led to the establishment in 1885 of the ‘International Statistical Institute’ (which still exists and includes mathematicians and official statisticians). One could write the history of these bureaus as an aspect of the more general history of the construction of the state, insofar as they developed and legitimized a common language specifically combining the authority of science and that of the state (Anderson 1988, Desrosières 1998, Patriarca 1996).

More precisely, every period of the history of a state could be characterized by the list of questions socially judged to be ‘social’ and consequently put on the agenda of official statistics. Three interdependent foci were thus co-constructed: representation, action, and statistics—a way of describing and interpreting social phenomena (to which social studies would increasingly contribute), a method for determining state intervention and public action, and, finally, a list of statistical ‘variables’ and procedures aimed at measuring them.

Thus, for example, in England in the second third of the nineteenth century, poverty, mortality, and epidemic morbidity were followed closely, in terms of a detailed geographical distribution (counties), by the General Register Office (GRO), set up by William Farr in 1837. England’s economic liberalism and the Poor Law Amendment Act of 1834 (which led to the creation of workhouses) were consistent with this form of statistics. In the 1880s and 1890s, Galton and Pearson’s hereditarian eugenics would compete with this ‘environmentalism,’ which explained poverty in terms of social and territorial contexts. This new ‘social philosophy’ was reflected in new forms of political action, and of statistics. Thus the ‘social classification’ into five differentiated groups used by British statisticians throughout the twentieth century is marked by the political and cognitive configuration of the beginning of the century (Szreter 1996).

In all the important countries (including Great Britain) of the 1890s and 1900s, however, the work of the bureaus of statistics was guided by labor-related issues: employment, wages, workers’ budgets, subsistence costs, etc. The modern idea of unemployment emerged, but its definition and its measurement were not yet standardized. This interest in labor statistics was linked to the fairly general development of a specific ‘labor law’ and the first so-called ‘social welfare’ legislation, such as Bismarck’s in Germany, or that developed in the Nordic countries in the 1890s. It is significant that the application of the sample-survey method (then called the ‘representative’ survey) was first tested in Norway in 1895, precisely in view of preparing a new law enacting general pension funds and invalidity insurance: this suggests the consistency of the political, technical, and cognitive dimensions of this co-construction.

These forms of consistency are found in the statistical systems that were extensively developed, on a different scale, after 1945. At that time, public policies were governed by a number of issues: the regulation of the macroeconomic balance as seen through the Keynesian model, the reduction of social inequalities and the struggle against unemployment thanks to social-welfare systems, the democratization of schooling, etc. Some people then spoke of a ‘revolution’ in government statistics (Duncan and Shelton 1978), and underscored its four components, which have largely shaped present statistical systems. National accounting, a vast construction integrating a large number of statistics from different sources, was the instrument on which the macroeconomic models resulting from Keynesian analysis were based. Sample surveys made it possible to study a much broader range of issues and to accumulate quantitative descriptions of the social world that were unthinkable at a time when observation techniques were limited to censuses and monographs. Statistical coordination, an apparently strictly administrative affair, was indispensable for making the observations resulting from different fields consistent with one another. Finally, beginning in 1960, the generalization of computer data processing radically transformed the activity of bureaus of statistics.

Thus ‘official statistics,’ placed at the junction between social studies, mathematics, and information on public policies, has become an important component of research in the social studies. Given, however, that from the institutional standpoint it is generally located outside them, it is often hardly perceived by those who seek to draw up a panorama of these sciences. In fact, the way bureaus of statistics operate and are integrated into administrative and scientific contexts varies a great deal from one country to another, so a history and a sociology of social studies cannot omit examining these institutions, which are often perceived as mere suppliers of data assumed to ‘reflect reality,’ when they are actually places where this ‘reality’ is instituted through co-constructed operations of social representation, public action, and statistical measurement.

See also: Estimation: Point and Interval; Galton, Sir Francis (1822–1911); Neyman, Jerzy (1894–1981); Pearson, Karl (1857–1936); Probability and Chance: Philosophical Aspects; Quantification in History; Quantification in the History of the Social Sciences; Quetelet, Adolphe (1796–1874); Statistical Methods, History of: Post-1900; Statistical Methods, History of: Pre-1900; Statistics: The Field

Bibliography

Anderson M J 1988 The American Census. A Social History. Yale University Press, New Haven, CT

Desrosières A 1998 The Politics of Large Numbers. A History of Statistical Reasoning. Harvard University Press, Cambridge, MA

Duncan J W, Shelton W C 1978 Revolution in United States Government Statistics, 1926–1976. US Department of Commerce, Washington, DC

Gigerenzer G et al. 1989 The Empire of Chance. How Probability Changed Science and Everyday Life. Cambridge University Press, Cambridge, UK

Hacking I 1990 The Taming of Chance. Cambridge University Press, Cambridge, UK

Klein J L 1997 Statistical Visions in Time. A History of Time Series Analysis, 1662–1938. Cambridge University Press, Cambridge, UK


Lazarsfeld P 1977 Notes on the history of quantification in sociology: Trends, sources and problems. In: Kendall M, Plackett R L (eds.) Studies in the History of Statistics and Probability. Griffin, London, Vol. 2, pp. 213–69

MacKenzie D 1981 Statistics in Britain, 1865–1930. The Social Construction of Scientific Knowledge. Edinburgh University Press, Edinburgh, UK

Patriarca S 1996 Numbers and Nationhood: Writing Statistics in Nineteenth-century Italy. Cambridge University Press, Cambridge, UK

Pearson K 1911 The Grammar of Science, 3rd edn., rev. and enl. A. and C. Black, London

Porter T 1986 The Rise of Statistical Thinking, 1820–1900. Princeton University Press, Princeton, NJ

Stigler S M 1986 The History of Statistics: The Measurement of Uncertainty Before 1900. Belknap Press of Harvard University Press, Cambridge, MA

Szreter S 1996 Fertility, Class and Gender in Britain, 1860–1940. Cambridge University Press, Cambridge, UK

A. Desrosières

Statistics: The Field

Statistics is a term used to refer to both a field of scientific inquiry and a body of quantitative methods. The field of statistics has a 350-year intellectual history rooted in the origins of probability and the rudimentary tools of political arithmetic of the seventeenth century. Statistics came of age as a separate discipline with the development of formal inferential theories in the twentieth century. This article briefly traces some of this historical development and discusses current methodological and inferential approaches as well as some cross-cutting themes in the development of new statistical methods.

1. Introduction

Statistics is a body of quantitative methods associated with empirical observation. A primary goal of these methods is coping with uncertainty. Most formal statistical methods rely on probability theory to express this uncertainty and to provide a formal mathematical basis for data description and for analysis. The notion of variability associated with data, expressed through probability, plays a fundamental role in this theory. As a consequence, much statistical effort is focused on how to control and measure variability and/or how to assign it to its sources.

Almost all characterizations of statistics as a field include the following elements:

(a) Designing experiments, surveys, and other systematic forms of empirical study.

(b) Summarizing and extracting information from data.

(c) Drawing formal inferences from empirical data through the use of probability.

(d) Communicating the results of statistical investigations to others, including scientists, policy makers, and the public.

This article describes a number of these elements and the historical context out of which they grew. It provides a broad overview of the field that can serve as a starting point to many of the other statistical entries in this encyclopedia.

2. The Origins of the Field

The word ‘statistics’ is related to the word ‘state,’ and the original activity that was labeled as statistics was social in nature and related to elements of society through the organization of economic, demographic, and political facts. Paralleling this work to some extent was the development of the probability calculus and the theory of errors, typically associated with the physical sciences (see Statistical Methods, History of: Pre-1900). These traditions came together in the nineteenth century and led to the notion of statistics as a collection of methods for the analysis of scientific data and the drawing of inferences therefrom.

As Hacking (1990) has noted: ‘By the end of the century chance had attained the respectability of a Victorian valet, ready to be the logical servant of the natural, biological and social sciences’ (p. 2). At the beginning of the twentieth century, we see the emergence of statistics as a field under the leadership of Karl Pearson, George Udny Yule, Francis Y. Edgeworth, and others of the ‘English’ statistical school. As Stigler (1986) suggests:

Before 1900 we see many scientists of different fields developing and using techniques we now recognize as belonging to modern statistics. After 1900 we begin to see identifiable statisticians developing such techniques into a unified logic of empirical science that goes far beyond its component parts. There was no sharp moment of birth; but with Pearson and Yule and the growing number of students in Pearson’s laboratory, the infant discipline may be said to have arrived. (p. 361)

Pearson’s laboratory at University College, London quickly became the first statistics department in the world, and it was to influence subsequent developments in a profound fashion for the next three decades. Pearson and his colleagues founded the first methodologically oriented statistics journal, Biometrika, and they stimulated the development of new approaches to statistical methods. What remained before statistics could legitimately take on the mantle of a field of inquiry, separate from mathematics or the use of statistical approaches in other fields, was the development of the formal foundations of theories of inference from observations, rooted in an axiomatic theory of probability.

