64 Vol. 6, nos. 4-5

SOME STATISTICAL CONCEPTS IN TAXONOMY

by NICHOLAS SHOUMATOFF

In my profession of process engineering in the pulp and paper industry, I have been confronted on numerous occasions with problems involving the statistical analysis of observational and experimental data. I have been impressed with the power of analysis made available by modern statistical methods. The possibilities of applying these methods to the taxonomy of Lepidoptera, as discussed in the recent series of articles in the Lep. News by F. MARTIN BROWN, have therefore interested me keenly.

The statistical methods of modern experimental science were largely developed in biology, although some of the most significant early contributions (1908) were made by "Student", an anonymous industrial engineer. The full power of today's methods, however, based on the principle of fiducial probability, is largely due to R. A. FISHER, whose specialty of genetics is certainly related to taxonomy. I therefore anticipated no difficulty in attempting statistical reversion from technology to Lepidoptera. For this reason it was quite a surprise to find that, in Mr. BROWN'S article in the News, Vol. 5: pp. 64-66, some basic concepts are recommended which are quite different from those which I have encountered elsewhere.

Mr. BROWN indicates that in taxonomy valid tests of significance should be based on probability levels of a vastly different order of magnitude than are commonly used in other branches of science. He further indicates that the required probability level can be uniquely determined from the size of the sample and the size of the population from which it is drawn. This was illustrated with two examples, without details of the mathematical procedure employed.

I am not in a position to take issue with these recommendations directly, but I would like to point out (in the hope of obtaining information which may resolve this conflict) how they differ from what I have understood to be the basic concepts underlying modern procedures of statistical inference.

The first principle involved in the case is that statistical methods do not establish what can be known with certainty, but only what can be expected with any desired degree of confidence. The specification of the degree of confidence, which is necessary for any test of significance, is purely subjective, although certain definite criteria have been established by custom. Published tables of statistical functions frequently do not extend beyond the 99.9% level of confidence.

The implications of selecting the level of confidence should be clearly understood. If in a given test the 95% confidence level is established as the criterion of significance, it means that in accepting the significance of results which meet this criterion the chance of error is 5%. The chance of error in rejecting the significance of a result which does not meet this criterion is not uniquely determined by the selected confidence level, but depends on the degree of difference between the true and assumed values of the quantity being investigated. It may be as high as 95%. It may be seen that if the confidence level is changed to a higher value to reduce the chance of accepting a non-significant result, the chance of rejecting a result which may be significant is correspondingly increased. Greater certainty in the first type of judgement can be acquired only at the price of reduced sensitivity in the

1952 The Lepidopterists' News 65

second type of judgement. In certain investigations, particularly those which are in a preliminary stage, it is often desirable to follow up an indication even though in the end it may prove to be insignificant. In such cases a lower confidence criterion must be allowed. However, the minimum confidence level commonly used is 95%, which with large samples is approximately equivalent to two standard deviations.
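The trade-off between the two types of error can be made concrete with a small numerical sketch. It is not part of the original argument and rests on assumptions of its own: a two-sided large-sample (normal) test, and a true difference expressed in units of the standard error. The function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def critical_value(confidence):
    """Two-sided large-sample critical value, in standard errors."""
    return STANDARD_NORMAL.inv_cdf(0.5 + confidence / 2)


def chance_of_missing(confidence, true_shift):
    """Chance of rejecting significance when the true value really differs.

    true_shift is the (assumed) true difference, in standard errors.
    """
    z = critical_value(confidence)
    return STANDARD_NORMAL.cdf(z - true_shift) - STANDARD_NORMAL.cdf(-z - true_shift)


# Raising the confidence level raises the chance of missing a real difference.
for confidence in (0.95, 0.99, 0.999):
    print(
        f"confidence={confidence}  critical={critical_value(confidence):.2f}  "
        f"miss(shift=0.5)={chance_of_missing(confidence, 0.5):.3f}  "
        f"miss(shift=3)={chance_of_missing(confidence, 3.0):.3f}"
    )
```

With a vanishingly small true difference the chance of missing it approaches the confidence level itself, which is the sense in which the error of the second type "may be as high as 95%".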

It is understandable that a taxonomist would like to exercise the highest degree of confidence in assigning names to populations of Lepidoptera so that the names will have enduring validity rather than clutter up the literature with synonyms. On the other hand, it is doubtful whether, in certain groups at least, the subspecific structure has been so clearly defined that one can afford to overlook indicated differences at a lower level of confidence. In "The Karanasa Butterflies" (Annals Carnegie Museum, 1951) AVINOFF and SWEADNER assigned names to every local population that they were able to distinguish. In doing so they realized that future investigations based on more complete data might not uphold all these names. This was felt to be a lesser evil, however, than the danger of confusing two really distinct entities under the same name, as has frequently happened in earlier literature on this group.

It should always be borne in mind that the magnitude of variation corresponding to any given level of confidence is not fixed but depends on the amount of information available. In the absence of a complete census, the true random variation of an entire population is never known as such, but must be estimated from a sample. A most important concept in this connection is the number of degrees of freedom (number of observations minus number of restraints) available for calculation of and comparison with the estimate of error. For example, the number of standard deviations at each probability level is a variable, depending on the number of degrees of freedom, in accordance with "Student's" t-function, tables of which can be found in almost any current book on statistics. The fixed values listed in Mr. BROWN'S article correspond to infinite degrees of freedom, and are approximately true for large samples only. In most actual cases there are several methods of calculating the estimate of error from the same data, each with a corresponding number of degrees of freedom. A typical example is testing the difference between the means of two samples. If specimens are available from two localities so that they may be arranged in two parallel time series to form, say, ten simultaneous pairs, and if one measurement is made on each specimen, the mean difference between the two localities can be tested by comparison with four different estimates of error as follows:

Square Deviations of                                  Degrees of    Value of "t" at
                                                      Freedom       95% Confidence

Individual values from general mean                       19            2.093
Individual values from mean of each locality              18            2.101
Differences in each pair from zero                        10            2.228
Differences in each pair from average difference           9            2.262

The last of these methods has the fewest degrees of freedom and the highest "t," yet it is often the most sensitive method because the variance among pairs and between localities has been eliminated from the estimate of error. Whichever of these four methods yields the highest confidence level is the one whose result must be considered. These differences in sensitivity due to different sources of variation should not be confused with the general principle that, regardless of the methods of statistical reduction employed, all tests of significance of the same hypothesis based on the same sources of variation in the same set of data are bound, if correctly carried out, to yield exactly the same result, barring only the use of inefficient statistics.
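The four estimates of error in the table can be tried on figures. The sketch below is one plausible reading of them, not the author's own computation: the measurements are invented for illustration (ten pairs sharing a strong drift over time), the function name is hypothetical, and the critical values are simply those quoted in the table.

```python
from math import sqrt
from statistics import mean

# Two-sided 95% critical values of Student's t, as quoted in the table.
T_95 = {19: 2.093, 18: 2.101, 10: 2.228, 9: 2.262}

# Hypothetical measurements: ten simultaneous pairs from two localities,
# both series drifting upward together through the season.
LOCALITY_A = [10.2, 11.1, 12.3, 12.9, 14.2, 15.0, 16.1, 17.2, 17.9, 19.1]
LOCALITY_B = [9.8, 10.5, 11.6, 12.6, 13.5, 14.6, 15.4, 16.5, 17.6, 18.4]


def four_tests(a, b):
    """Return {method: (degrees of freedom, observed t)} for the mean difference."""
    n = len(a)
    diff = mean(a) - mean(b)
    d = [x - y for x, y in zip(a, b)]

    def ss(values, centre):
        # Sum of squared deviations about the given centre.
        return sum((v - centre) ** 2 for v in values)

    # Mean squares: each sum of squares divided by its degrees of freedom.
    ms_general = ss(a + b, mean(a + b)) / (2 * n - 1)
    ms_locality = (ss(a, mean(a)) + ss(b, mean(b))) / (2 * n - 2)
    ms_pair_zero = ss(d, 0.0) / n
    ms_pair_mean = ss(d, mean(d)) / (n - 1)

    return {
        "general mean": (2 * n - 1, diff / sqrt(ms_general * 2 / n)),
        "locality means": (2 * n - 2, diff / sqrt(ms_locality * 2 / n)),
        "pairs from zero": (n, diff / sqrt(ms_pair_zero / n)),
        "pairs from average": (n - 1, diff / sqrt(ms_pair_mean / n)),
    }


for name, (df, t) in four_tests(LOCALITY_A, LOCALITY_B).items():
    print(f"{name:20s} df={df:2d}  t={t:6.2f}  critical={T_95[df]}")
```

On these invented figures the two unpaired estimates are swamped by the drift over time and fall far short of their critical values, while the paired estimate with only 9 degrees of freedom exceeds its higher critical value comfortably.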

With the small sampling theory illustrated above, confidence limits can be established just as exactly from small samples as from large samples. However, with larger samples, the limits are smaller. This is due to three separate effects:

1. The degrees of freedom, not the total number of observations, is used in calculating the mean square deviation.
2. The value of "t" depends on the degrees of freedom.
3. The standard error of a sample mean varies inversely as the square root of the sample size.

Eventually a point is reached where relatively little is gained by increasing the sample size. This has been called the principle of inertia for large samples.
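The second and third effects, and the diminishing return from larger samples, can be seen in a short calculation. This is only a sketch: the sample standard deviation is assumed to stay the same at every sample size, the function name is hypothetical, and the values of "t" are taken from standard published tables rather than from the article.

```python
from math import sqrt

# Two-sided 95% values of Student's t for selected degrees of freedom,
# from standard published tables.
T_95 = {4: 2.776, 9: 2.262, 19: 2.093, 29: 2.045, 99: 1.984}


def half_width(n, s=1.0):
    """Half-width of the 95% confidence limits for the mean of n observations.

    s is the sample standard deviation, assumed already computed
    with n - 1 degrees of freedom.
    """
    return T_95[n - 1] * s / sqrt(n)


for n in (5, 10, 20, 30, 100):
    print(f"n={n:3d}  t={T_95[n - 1]:.3f}  half-width={half_width(n):.3f} s")
```

Going from 5 to 10 observations narrows the limits by roughly half a standard deviation, whereas going from 20 to 30 narrows them by only about a tenth of one.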

The previous discussion is intended to show that statistical methods are objective only insofar as they establish accurate betting odds, but the final step of the procedure, whether or not to accept these odds, involves a subjective decision. In contrast, Mr. BROWN has, I believe, suggested that there is an absolute scientific basis for completing this final step.

A second fundamental concept involved in this case has to do with the character of the population. The basic calculus of statistical analysis has been derived from the assumption of random sampling from an infinite population. All actual populations are of course finite. Fortunately, if the populations are very large in proportion to the sample, the calculus of infinite populations can be applied with entirely negligible error. The principle of inertia applies to population size as well as sample size. In taxonomy, on the other hand, one is not primarily concerned with actual populations. A sample containing specimens taken over a period of time exceeding the life span of individuals certainly represents more than one actual population. Conclusions drawn from the study of the sample usually refer not only to the actual populations represented, but also to an unspecified number of future populations. The taxonomic unit is an abstraction which does not actually exist in its entirety at any one time. It is partly actual, partly hypothetical, and in effect infinite. It does not appear, therefore, that significance tests based on the calculated size of an actual population are pertinent to the problems of taxonomy, whether or not they are statistically correct. If Mr. BROWN'S reasoning on this point is followed to its logical conclusion, an infinite deviation would be required if an infinite population is considered.
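The remark that infinite-population calculus applies with negligible error to large finite populations can be illustrated with the usual textbook finite population correction. The article does not give this formula; it is brought in here as an assumption, for a sample drawn without replacement, and the function name is hypothetical.

```python
from math import sqrt


def finite_population_factor(population, sample):
    """Textbook factor by which the standard error of a mean is multiplied
    when the sample is drawn without replacement from a finite population."""
    if population < 2 or not 1 <= sample <= population:
        raise ValueError("need population >= 2 and 1 <= sample <= population")
    return sqrt((population - sample) / (population - 1))


# A sample of 20 specimens from populations of increasing size.
for population in (100, 1_000, 10_000, 1_000_000):
    factor = finite_population_factor(population, 20)
    print(f"population={population:>9,d}  factor={factor:.5f}")
```

The factor moves toward 1 as the population grows, so for a population very large in proportion to the sample the infinite-population result is left practically unchanged.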

In conclusion, I would like to repeat that these thoughts have been assembled not in a spirit of criticism but in the hope of reaching a more complete mutual understanding, as all those who work in the same field should have. Properly used statistical methods can do much to promote mutual understanding in taxonomy, and Mr. BROWN'S articles with their high standard of clarity are undoubtedly a most significant contribution in this direction.

Box 333, Bedford, N. Y., U. S. A.