
THE PROBABLE ERROR OF A MEAN

By STUDENT

Introduction

Any experiment may be regarded as forming an individual of a “population” of experiments which might be performed under the same conditions. A series of experiments is a sample drawn from this population.

Now any series of experiments is only of value in so far as it enables us to form a judgment as to the statistical constants of the population to which the experiments belong. In a greater number of cases the question finally turns on the value of a mean, either directly, or as the mean difference between the two quantities.

If the number of experiments be very large, we may have precise information as to the value of the mean, but if our sample be small, we have two sources of uncertainty: (1) owing to the “error of random sampling” the mean of our series of experiments deviates more or less widely from the mean of the population, and (2) the sample is not sufficiently large to determine what is the law of distribution of individuals. It is usual, however, to assume a normal distribution, because, in a very large number of cases, this gives an approximation so close that a small sample will give no real information as to the manner in which the population deviates from normality: since some law of distribution must be assumed it is better to work with a curve whose area and ordinates are tabled, and whose properties are well known. This assumption is accordingly made in the present paper, so that its conclusions are not strictly applicable to populations known not to be normally distributed; yet it appears probable that the deviation from normality must be very extreme to lead to serious error. We are concerned here solely with the first of these two sources of uncertainty.

The usual method of determining the probability that the mean of the population lies within a given distance of the mean of the sample is to assume a normal distribution about the mean of the sample with a standard deviation equal to s/√n, where s is the standard deviation of the sample, and to use the tables of the probability integral.

But, as we decrease the number of experiments, the value of the standard deviation found from the sample of experiments becomes itself subject to an increasing error, until judgments reached in this way may become altogether misleading.

In routine work there are two ways of dealing with this difficulty: (1) an experiment may be repeated many times, until such a long series is obtained that the standard deviation is determined once and for all with sufficient accuracy. This value can then be used for subsequent shorter series of similar experiments. (2) Where experiments are done in duplicate in the natural course of the work, the mean square of the difference between corresponding pairs is equal to the standard deviation of the population multiplied by √2. We can thus combine together several series of experiments for the purpose of determining the standard deviation. Owing however to secular change, the value obtained is nearly always too low, successive experiments being positively correlated.

There are other experiments, however, which cannot easily be repeated very often; in such cases it is sometimes necessary to judge of the certainty of the results from a very small sample, which itself affords the only indication of the variability. Some chemical, many biological, and most agricultural and large-scale experiments belong to this class, which has hitherto been almost outside the range of statistical inquiry.

Again, although it is well known that the method of using the normal curve is only trustworthy when the sample is “large”, no one has yet told us very clearly where the limit between “large” and “small” samples is to be drawn.

The aim of the present paper is to determine the point at which we may use the tables of the probability integral in judging of the significance of the mean of a series of experiments, and to furnish alternative tables for use when the number of experiments is too few.

The paper is divided into the following nine sections:

I. The equation is determined of the curve which represents the frequency distribution of standard deviations of samples drawn from a normal population.

II. There is shown to be no kind of correlation between the mean and the standard deviation of such a sample.

III. The equation is determined of the curve representing the frequency distribution of a quantity z, which is obtained by dividing the distance between the mean of a sample and the mean of the population by the standard deviation of the sample.

IV. The curve found in I is discussed.

V. The curve found in III is discussed.

VI. The two curves are compared with some actual distributions.

VII. Tables of the curves found in III are given for samples of different size.

VIII and IX. The tables are explained and some instances are given of their use.

X. Conclusions.

Section I

Samples of n individuals are drawn out of a population distributed normally, to find an equation which shall represent the frequency of the standard deviations of these samples.

If s be the standard deviation found from a sample x₁, x₂, …, xₙ (all these being measured from the mean of the population), then

$$ s^2 = \frac{S(x_1^2)}{n} - \left(\frac{S(x_1)}{n}\right)^2 = \frac{S(x_1^2)}{n} - \frac{S(x_1^2)}{n^2} - \frac{2S(x_1 x_2)}{n^2}. $$

Summing for all samples and dividing by the number of samples we get the mean value of s², which we will write s̄²:
$$ \bar{s}^2 = \frac{n\mu_2}{n} - \frac{n\mu_2}{n^2} = \mu_2\,\frac{n-1}{n}, $$
where µ₂ is the second moment coefficient in the original normal distribution of x: since x₁, x₂, etc. are not correlated and the distribution is normal, products involving odd powers of x₁ vanish on summing, so that 2S(x₁x₂)/n² is equal to 0.

If M′_R represent the Rth moment coefficient of the distribution of s² about the end of the range where s² = 0,
$$ M'_1 = \mu_2\,\frac{n-1}{n}. $$

Again,
$$
\begin{aligned}
s^4 &= \left\{\frac{S(x_1^2)}{n} - \left(\frac{S(x_1)}{n}\right)^2\right\}^2
= \left(\frac{S(x_1^2)}{n}\right)^2 - \frac{2S(x_1^2)}{n}\left(\frac{S(x_1)}{n}\right)^2 + \left(\frac{S(x_1)}{n}\right)^4\\[4pt]
&= \frac{S(x_1^4)}{n^2} + \frac{2S(x_1^2 x_2^2)}{n^2} - \frac{2S(x_1^4)}{n^3} - \frac{4S(x_1^2 x_2^2)}{n^3} + \frac{S(x_1^4)}{n^4} + \frac{6S(x_1^2 x_2^2)}{n^4}\\[4pt]
&\qquad + \text{other terms involving odd powers of } x_1,\ \text{etc., which will vanish on summation.}
\end{aligned}
$$

Now S(x₁⁴) has n terms, but S(x₁²x₂²) has ½n(n−1); hence summing for all samples and dividing by the number of samples, we get

$$
\begin{aligned}
M'_2 &= \frac{\mu_4}{n} + \mu_2^2\,\frac{n-1}{n} - \frac{2\mu_4}{n^2} - 2\mu_2^2\,\frac{n-1}{n^2} + \frac{\mu_4}{n^3} + 3\mu_2^2\,\frac{n-1}{n^3}\\[4pt]
&= \frac{\mu_4}{n^3}\,\{n^2 - 2n + 1\} + \frac{\mu_2^2}{n^3}\,(n-1)\{n^2 - 2n + 3\}.
\end{aligned}
$$

Now since the distribution of x is normal, µ₄ = 3µ₂², hence
$$ M'_2 = \mu_2^2\,\frac{n-1}{n^3}\,\{3n - 3 + n^2 - 2n + 3\} = \mu_2^2\,\frac{(n-1)(n+1)}{n^2}. $$

In a similar tedious way I find
$$ M'_3 = \mu_2^3\,\frac{(n-1)(n+1)(n+3)}{n^3} $$
and
$$ M'_4 = \mu_2^4\,\frac{(n-1)(n+1)(n+3)(n+5)}{n^4}. $$

The law of formation of these moment coefficients appears to be a simple one, but I have not seen my way to a general proof.

If now M_R be the Rth moment coefficient of s² about its mean, we have

$$ M_2 = \mu_2^2\,\frac{n-1}{n^2}\,\{(n+1) - (n-1)\} = 2\mu_2^2\,\frac{n-1}{n^2}, $$

$$
\begin{aligned}
M_3 &= \mu_2^3\left\{\frac{(n-1)(n+1)(n+3)}{n^3} - 3\,\frac{n-1}{n}\cdot\frac{2(n-1)}{n^2} - \frac{(n-1)^3}{n^3}\right\}\\[4pt]
&= \mu_2^3\,\frac{n-1}{n^3}\,\{n^2 + 4n + 3 - 6n + 6 - n^2 + 2n - 1\} = 8\mu_2^3\,\frac{n-1}{n^3},
\end{aligned}
$$

$$
\begin{aligned}
M_4 &= \frac{\mu_2^4}{n^4}\left\{(n-1)(n+1)(n+3)(n+5) - 32(n-1)^2 - 12(n-1)^3 - (n-1)^4\right\}\\[4pt]
&= \frac{\mu_2^4\,(n-1)}{n^4}\,\{n^3 + 9n^2 + 23n + 15 - 32n + 32 - 12n^2 + 24n - 12 - n^3 + 3n^2 - 3n + 1\}\\[4pt]
&= 12\mu_2^4\,\frac{(n-1)(n+3)}{n^4}.
\end{aligned}
$$

Hence
$$ \beta_1 = \frac{M_3^2}{M_2^3} = \frac{8}{n-1}, \qquad \beta_2 = \frac{M_4}{M_2^2} = \frac{3(n+3)}{n-1}, $$
$$ \therefore\ 2\beta_2 - 3\beta_1 - 6 = \frac{1}{n-1}\,\{6(n+3) - 24 - 6(n-1)\} = 0. $$

Consequently a curve of Prof. Pearson’s Type III may be expected to fit the distribution of s².

The equation referred to an origin at the zero end of the curve will be
$$ y = Cx^p e^{-\gamma x}, $$
where
$$ \gamma = \frac{2M_2}{M_3} = \frac{4\mu_2^2(n-1)\,n^3}{8\mu_2^3(n-1)\,n^2} = \frac{n}{2\mu_2} $$
and
$$ p = \frac{4}{\beta_1} - 1 = \frac{n-1}{2} - 1 = \frac{n-3}{2}. $$
Consequently the equation becomes
$$ y = Cx^{\frac{n-3}{2}}\,e^{-\frac{nx}{2\mu_2}}, $$
which will give the distribution of s².

The area of this curve is $C\int_0^\infty x^{\frac{n-3}{2}} e^{-\frac{nx}{2\mu_2}}\,dx = I$ (say). The first moment coefficient about the end of the range will therefore be
$$ \frac{C\int_0^\infty x^{\frac{n-1}{2}}\,e^{-\frac{nx}{2\mu_2}}\,dx}{I}
= \frac{\left[-C\,\frac{2\mu_2}{n}\,x^{\frac{n-1}{2}}\,e^{-\frac{nx}{2\mu_2}}\right]_{x=0}^{x=\infty}}{I}
+ \frac{C\int_0^\infty \frac{n-1}{n}\,\mu_2\,x^{\frac{n-3}{2}}\,e^{-\frac{nx}{2\mu_2}}\,dx}{I}. $$

The first part vanishes at each limit and the second is equal to
$$ \frac{\frac{n-1}{n}\,\mu_2\,I}{I} = \frac{n-1}{n}\,\mu_2, $$
and we see that the higher moment coefficients will be formed by multiplying successively by (n+1)µ₂/n, (n+3)µ₂/n, etc., just as appeared to be the law of formation of M′₂, M′₃, M′₄, etc.

Hence it is probable that the curve found represents the theoretical distribution of s²; so that although we have no actual proof we shall assume it to do so in what follows.

The distribution of s may be found from this, since the frequency of s is equal to that of s² and all that we must do is to compress the base line suitably.

Now if y₁ = φ(s²) be the frequency curve of s² and y₂ = ψ(s) be the frequency curve of s, then
$$ y_1\,d(s^2) = y_2\,ds, \qquad 2s\,y_1\,ds = y_2\,ds, \qquad \therefore\ y_2 = 2s y_1. $$
Hence
$$ y_2 = 2Cs\,(s^2)^{\frac{n-3}{2}}\,e^{-\frac{ns^2}{2\mu_2}} $$
is the distribution of s. This reduces to
$$ y_2 = 2Cs^{n-2}\,e^{-\frac{ns^2}{2\sigma^2}}. $$

Hence $y = Ax^{n-2} e^{-\frac{nx^2}{2\sigma^2}}$ will give the frequency distribution of standard deviations of samples of n, taken out of a population distributed normally with standard deviation σ. The constant A may be found by equating the area of the curve as follows:
$$ \text{Area} = A\int_0^\infty x^{n-2}\,e^{-\frac{nx^2}{2\sigma^2}}\,dx. \qquad \left(\text{Let } I_p \text{ represent } \int_0^\infty x^p\,e^{-\frac{nx^2}{2\sigma^2}}\,dx.\right) $$

Then
$$
\begin{aligned}
I_p &= \frac{\sigma^2}{n}\int_0^\infty x^{p-1}\,\frac{d}{dx}\left(-e^{-\frac{nx^2}{2\sigma^2}}\right)dx\\[4pt]
&= \frac{\sigma^2}{n}\left[-x^{p-1}\,e^{-\frac{nx^2}{2\sigma^2}}\right]_{x=0}^{x=\infty} + \frac{\sigma^2}{n}\,(p-1)\int_0^\infty x^{p-2}\,e^{-\frac{nx^2}{2\sigma^2}}\,dx
= \frac{\sigma^2}{n}\,(p-1)\,I_{p-2},
\end{aligned}
$$
since the first part vanishes at both limits.

By continuing this process we find
$$ I_{n-2} = \left(\frac{\sigma^2}{n}\right)^{\frac{n-2}{2}}(n-3)(n-5)\ldots 3.1\,I_0 $$
or
$$ I_{n-2} = \left(\frac{\sigma^2}{n}\right)^{\frac{n-3}{2}}(n-3)(n-5)\ldots 4.2\,I_1, $$
according as n is even or odd. But I₀ is
$$ \int_0^\infty e^{-\frac{nx^2}{2\sigma^2}}\,dx = \sqrt{\frac{\pi}{2n}}\,\sigma, $$
and I₁ is
$$ \int_0^\infty x\,e^{-\frac{nx^2}{2\sigma^2}}\,dx = \left[-\frac{\sigma^2}{n}\,e^{-\frac{nx^2}{2\sigma^2}}\right]_{x=0}^{x=\infty} = \frac{\sigma^2}{n}. $$

Hence if n be even,
$$ A = \frac{\text{Area}}{(n-3)(n-5)\ldots 3.1\,\sqrt{\frac{\pi}{2}}\left(\frac{\sigma^2}{n}\right)^{\frac{n-1}{2}}}, $$
while if n be odd
$$ A = \frac{\text{Area}}{(n-3)(n-5)\ldots 4.2\left(\frac{\sigma^2}{n}\right)^{\frac{n-1}{2}}}. $$

Hence the equation may be written
$$ y = \frac{N}{(n-3)(n-5)\ldots 3.1}\,\sqrt{\frac{2}{\pi}}\left(\frac{n}{\sigma^2}\right)^{\frac{n-1}{2}} x^{n-2}\,e^{-\frac{nx^2}{2\sigma^2}} \qquad (n \text{ even}) $$
or
$$ y = \frac{N}{(n-3)(n-5)\ldots 4.2}\left(\frac{n}{\sigma^2}\right)^{\frac{n-1}{2}} x^{n-2}\,e^{-\frac{nx^2}{2\sigma^2}} \qquad (n \text{ odd}), $$
where N as usual represents the total frequency.
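In modern terms the curve just found says that s is distributed as σ/√n times a chi variate on n − 1 degrees of freedom. A minimal numerical check of that identification (my gloss, assuming SciPy is available; A below is the constant for total frequency N = 1):

```python
import numpy as np
from scipy import stats
from scipy.special import gamma

n, sigma = 10, 1.0
x = np.linspace(0.1, 2.5, 6)
# Student's curve, normalised so that its area is 1:
A = 2 * (n / (2 * sigma**2))**((n - 1) / 2) / gamma((n - 1) / 2)
student = A * x**(n - 2) * np.exp(-n * x**2 / (2 * sigma**2))
# Modern equivalent: s = sigma/sqrt(n) times a chi variate on n-1 d.f.
modern = stats.chi.pdf(x * np.sqrt(n) / sigma, df=n - 1) * np.sqrt(n) / sigma
print(np.allclose(student, modern))     # True: the two forms agree
```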

Section II

To show that there is no correlation between (a) the distance of the mean of a sample from the mean of the population and (b) the standard deviation of a sample with normal distribution.

(1) Clearly positive and negative positions of the mean of the sample are equally likely, and hence there cannot be correlation between the absolute value of the distance of the mean from the mean of the population and the standard deviation, but (2) there might be correlation between the square of the distance and the square of the standard deviation. Let

$$ u^2 = \left(\frac{S(x_1)}{n}\right)^2 \qquad\text{and}\qquad s^2 = \frac{S(x_1^2)}{n} - \left(\frac{S(x_1)}{n}\right)^2. $$

Then if m′₁, M′₁ be the mean values of u² and s², we have by the preceding part
$$ M'_1 = \mu_2\,\frac{n-1}{n} \qquad\text{and}\qquad m'_1 = \frac{\mu_2}{n}. $$

Now
$$
\begin{aligned}
u^2 s^2 &= \frac{S(x_1^2)}{n}\left(\frac{S(x_1)}{n}\right)^2 - \left(\frac{S(x_1)}{n}\right)^4\\[4pt]
&= \frac{\{S(x_1^2)\}^2}{n^3} + \frac{2S(x_1 x_2)\,S(x_1^2)}{n^3} - \frac{S(x_1^4)}{n^4} - \frac{6S(x_1^2 x_2^2)}{n^4}\\[4pt]
&\qquad - \text{other terms of odd order which will vanish on summation.}
\end{aligned}
$$

Summing for all values and dividing by the number of cases we get
$$ R_{u^2 s^2}\,\sigma_{u^2}\,\sigma_{s^2} + m'_1 M'_1 = \frac{\mu_4}{n^2} + \mu_2^2\,\frac{n-1}{n^2} - \frac{\mu_4}{n^3} - 3\mu_2^2\,\frac{n-1}{n^3}, $$
where $R_{u^2 s^2}$ is the correlation between u² and s². Since µ₄ = 3µ₂² for the normal distribution,
$$ R_{u^2 s^2}\,\sigma_{u^2}\,\sigma_{s^2} + \mu_2^2\,\frac{n-1}{n^2} = \mu_2^2\,\frac{n-1}{n^3}\,\{3 + n - 3\} = \mu_2^2\,\frac{n-1}{n^2}. $$

Hence $R_{u^2 s^2}\,\sigma_{u^2}\,\sigma_{s^2} = 0$, or there is no correlation between u² and s².
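A short simulation of this result (mine, assuming NumPy) for the normal case:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(500_000, 4))   # many samples of n = 4
u2 = x.mean(axis=1)**2              # u^2 = (S(x1)/n)^2
s2 = x.var(axis=1)                  # s^2 with divisor n
print(np.corrcoef(u2, s2)[0, 1])    # close to 0, as Section II asserts
```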

Section III

To find the equation representing the frequency distribution of the means of samples of n drawn from a normal population, the mean being expressed in terms of the standard deviation of the sample.

We have
$$ y = \frac{C}{\sigma^{n-1}}\,s^{n-2}\,e^{-\frac{ns^2}{2\sigma^2}} $$
as the equation representing the distribution of s, the standard deviation of a sample of n, when the samples are drawn from a normal population with standard deviation σ.

Now the means of these samples of n are distributed according to the equation¹
$$ y = \frac{\sqrt{n}\,N}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{nx^2}{2\sigma^2}}, $$
and we have shown that there is no correlation between x, the distance of the mean of the sample, and s, the standard deviation of the sample.

¹ Airy, Theory of Errors of Observations, Part II, §6.

Now let us suppose x measured in terms of s, i.e. let us find the distribution of z = x/s.

If we have y₁ = φ(x) and y₂ = ψ(z) as the equations representing the frequency of x and of z respectively, then
$$ y_1\,dx = y_2\,dz = y_2\,\frac{dx}{s}, \qquad \therefore\ y_2 = s y_1. $$

Hence
$$ y = \frac{\sqrt{n}\,N s}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{n s^2 z^2}{2\sigma^2}} $$
is the equation representing the distribution of z for samples of n with standard deviation s.

Now the chance that s lies between s and s + ds is
$$ \frac{\int_s^{s+ds} \frac{C}{\sigma^{n-1}}\,s^{n-2}\,e^{-\frac{ns^2}{2\sigma^2}}\,ds}{\int_0^\infty \frac{C}{\sigma^{n-1}}\,s^{n-2}\,e^{-\frac{ns^2}{2\sigma^2}}\,ds}, $$
which represents the N in the above equation. Hence the distribution of z due to values of s which lie between s and s + ds is
$$ y = \frac{\int_s^{s+ds} \frac{C}{\sigma^{n}}\sqrt{\frac{n}{2\pi}}\,s^{n-1}\,e^{-\frac{ns^2(1+z^2)}{2\sigma^2}}\,ds}{\int_0^\infty \frac{C}{\sigma^{n-1}}\,s^{n-2}\,e^{-\frac{ns^2}{2\sigma^2}}\,ds}, $$

and summing for all values of s we have as an equation giving the distribution of z
$$ y = \frac{\sqrt{\frac{n}{2\pi}}}{\sigma}\cdot\frac{\int_0^\infty s^{n-1}\,e^{-\frac{ns^2(1+z^2)}{2\sigma^2}}\,ds}{\int_0^\infty s^{n-2}\,e^{-\frac{ns^2}{2\sigma^2}}\,ds}. $$

By what we have already proved this reduces to
$$ y = \tfrac{1}{2}\,\frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\frac{5}{4}\cdot\frac{3}{2}\,(1+z^2)^{-\frac{n}{2}}, \quad\text{if } n \text{ be odd,} $$
and to
$$ y = \tfrac{1}{2}\,\frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\frac{4}{3}\cdot\frac{2}{1}\cdot\frac{2}{\pi}\,(1+z^2)^{-\frac{n}{2}}, \quad\text{if } n \text{ be even.} $$

Since this equation is independent of σ it will give the distribution of the distance of the mean of a sample from the mean of the population expressed in terms of the standard deviation of the sample for any normal population.
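In modern notation this curve states that z√(n−1) follows Student's t distribution on n − 1 degrees of freedom. A quick numerical check of the odd- and even-n constants against SciPy (my identification, not the paper's wording):

```python
import numpy as np
from scipy import stats
from scipy.special import gamma

z = np.linspace(-3.0, 3.0, 7)
for n in (4, 5, 8, 9):
    # The curve found above, with its constant written via the Gamma function:
    direct = gamma(n / 2) / (gamma((n - 1) / 2) * np.sqrt(np.pi)) \
        * (1 + z**2)**(-n / 2)
    # Same density obtained from the modern t distribution on n-1 d.f.:
    via_t = stats.t.pdf(z * np.sqrt(n - 1), df=n - 1) * np.sqrt(n - 1)
    print(n, np.allclose(direct, via_t))             # True for each n
```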

Section IV. Some Properties of the Standard Deviation Frequency Curve

By a similar method to that adopted for finding the constant we may find the mean and moments: thus the mean is at $I_{n-1}/I_{n-2}$, which is equal to
$$ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\frac{2}{1}\,\sqrt{\frac{2}{\pi}}\cdot\frac{\sigma}{\sqrt{n}}, \quad\text{if } n \text{ be even,} $$
or
$$ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\frac{3}{2}\,\sqrt{\frac{\pi}{2}}\cdot\frac{\sigma}{\sqrt{n}}, \quad\text{if } n \text{ be odd.} $$

The second moment about the end of the range is
$$ \frac{I_n}{I_{n-2}} = \frac{(n-1)\,\sigma^2}{n}. $$
The third moment about the end of the range is equal to
$$ \frac{I_{n+1}}{I_{n-2}} = \frac{I_{n+1}}{I_{n-1}}\cdot\frac{I_{n-1}}{I_{n-2}} = \sigma^2 \times \text{the mean}. $$
The fourth moment about the end of the range is equal to
$$ \frac{I_{n+2}}{I_{n-2}} = \frac{(n-1)(n+1)}{n^2}\,\sigma^4. $$

If we write the distance of the mean from the end of the range as Dσ/√n and the moments about the end of the range as ν₁, ν₂, etc., then
$$ \nu_1 = \frac{D\sigma}{\sqrt{n}}, \qquad \nu_2 = \frac{n-1}{n}\,\sigma^2, \qquad \nu_3 = \frac{D\sigma^3}{\sqrt{n}}, \qquad \nu_4 = \frac{n^2-1}{n^2}\,\sigma^4. $$

From this we get the moments about the mean:
$$
\begin{aligned}
\mu_2 &= \frac{\sigma^2}{n}\,(n - 1 - D^2),\\[4pt]
\mu_3 &= \frac{\sigma^3}{n\sqrt{n}}\,\{nD - 3(n-1)D + 2D^3\} = \frac{\sigma^3 D}{n\sqrt{n}}\,\{2D^2 - 2n + 3\},\\[4pt]
\mu_4 &= \frac{\sigma^4}{n^2}\,\{n^2 - 1 - 4D^2 n + 6(n-1)D^2 - 3D^4\} = \frac{\sigma^4}{n^2}\,\{n^2 - 1 - D^2(3D^2 - 2n + 6)\}.
\end{aligned}
$$

It is of interest to find out what these become when n is large.

In order to do this we must find out what is the value of D.

Now Wallis’s expression for π derived from the infinite product value of sin x is
$$ \frac{\pi}{2}\,(2n+1) = \frac{2^2.4^2.6^2\ldots(2n)^2}{1^2.3^2.5^2\ldots(2n-1)^2}. $$
If we assume a quantity θ (= a₀ + a₁/n + etc.) which we may add to the 2n + 1 in order to make the expression approximate more rapidly to the truth, it is easy to show that θ = −½ + 1/16n − etc., and we get²
$$ \frac{\pi}{2}\left(2n + \frac{1}{2} + \frac{1}{16n}\right) = \frac{2^2.4^2.6^2\ldots(2n)^2}{1^2.3^2.5^2\ldots(2n-1)^2}. $$
From this we find that, whether n be even or odd, D² approximates to n − 3/2 + 1/8n when n is large.

² This expression will be found to give a much closer approximation to π than Wallis’s.

Substituting this value of D we get

$$ \mu_2 = \frac{\sigma^2}{2n}\left(1 - \frac{1}{4n}\right), \qquad \mu_3 = \frac{\sigma^3\sqrt{1 - \frac{3}{2n} + \frac{1}{16n^2}}}{4n^2}, \qquad \mu_4 = \frac{3\sigma^4}{4n^2}\left(1 + \frac{1}{2n} - \frac{1}{16n^2}\right). $$

Consequently the value of the standard deviation of a standard deviation which we have found, $\frac{\sigma}{\sqrt{2n}}\sqrt{1 - \frac{1}{4n}}$, becomes the same as that found for the normal curve by Prof. Pearson, σ/√(2n), when n is large enough to neglect the 1/4n in comparison with 1.
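The approximation to D² can be checked against its exact value, which from the mean found above is D = √2 Γ(n/2)/Γ((n−1)/2) in modern notation (a sketch of mine, standard library only):

```python
from math import gamma

for n in (4, 10, 30, 100):
    D2 = 2 * (gamma(n / 2) / gamma((n - 1) / 2))**2   # exact value of D^2
    approx = n - 1.5 + 1 / (8 * n)                    # the Wallis-based value
    print(n, round(D2, 4), round(approx, 4))          # agreement improves with n
```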

Neglecting terms of lower order than 1/n, we find
$$ \beta_1 = \frac{2n-3}{n(4n-3)}, \qquad \beta_2 = 3\left(1 - \frac{1}{2n}\right)\left(1 + \frac{1}{2n}\right). $$

Consequently, as n increases, β₂ very soon approaches the value 3 of the normal curve, but β₁ vanishes more slowly, so that the curve remains slightly skew.

Diagram I shows the theoretical distribution of the standard deviations found from samples of 10.

Section V. Some Properties of the Curve
$$ y = \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\left(\begin{matrix}\frac{4}{3}\cdot\frac{2}{\pi} & \text{if } n \text{ be even}\\[3pt] \frac{5}{4}\cdot\frac{3}{2}\cdot\frac{1}{2} & \text{if } n \text{ be odd}\end{matrix}\right)(1+z^2)^{-\frac{n}{2}} $$

Writing z = tan θ the equation becomes y = (n−2)/(n−3) · (n−4)/(n−5) … etc. × cosⁿ θ, which affords an easy way of drawing the curve. Also dz = dθ/cos² θ.


Hence to find the area of the curve between any limits we must find
$$
\begin{aligned}
&\frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\ldots\text{etc.}\times\int\cos^{n-2}\theta\,d\theta\\[4pt]
&\quad = \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\ldots\text{etc.}\left\{\frac{n-3}{n-2}\int\cos^{n-4}\theta\,d\theta + \left[\frac{\cos^{n-3}\theta\sin\theta}{n-2}\right]\right\}\\[4pt]
&\quad = \frac{n-4}{n-5}\cdot\frac{n-6}{n-7}\ldots\text{etc.}\int\cos^{n-4}\theta\,d\theta + \frac{1}{n-3}\cdot\frac{n-4}{n-5}\ldots\text{etc.}\,[\cos^{n-3}\theta\sin\theta],
\end{aligned}
$$
and by continuing the process the integral may be evaluated. For example, if we wish to find the area between 0 and θ for n = 8 we have

$$
\begin{aligned}
\text{Area} &= \frac{6}{5}\cdot\frac{4}{3}\cdot\frac{2}{1}\cdot\frac{1}{\pi}\int_0^\theta \cos^6\theta\,d\theta\\[4pt]
&= \frac{4}{3}\cdot\frac{2}{\pi}\int_0^\theta \cos^4\theta\,d\theta + \frac{1}{5}\cdot\frac{4}{3}\cdot\frac{2}{\pi}\,\cos^5\theta\sin\theta\\[4pt]
&= \frac{2}{\pi}\int_0^\theta \cos^2\theta\,d\theta + \frac{1}{3}\cdot\frac{2}{\pi}\,\cos^3\theta\sin\theta + \frac{1}{5}\cdot\frac{4}{3}\cdot\frac{2}{\pi}\,\cos^5\theta\sin\theta\\[4pt]
&= \frac{\theta}{\pi} + \frac{1}{\pi}\cos\theta\sin\theta + \frac{1}{3}\cdot\frac{2}{\pi}\cos^3\theta\sin\theta + \frac{1}{5}\cdot\frac{4}{3}\cdot\frac{2}{\pi}\cos^5\theta\sin\theta,
\end{aligned}
$$
and it will be noticed that for n = 10 we shall merely have to add to this same expression the term $\frac{1}{7}\cdot\frac{6}{5}\cdot\frac{4}{3}\cdot\frac{2}{\pi}\cos^7\theta\sin\theta$.

The tables at the end of the paper give the area between −∞ and z (or θ = −½π and θ = tan⁻¹ z).

This is the same as 0.5 + the area between θ = 0 and θ = tan⁻¹ z, and as the whole area of the curve is equal to 1, the tables give the probability that the mean of the sample does not differ by more than z times the standard deviation of the sample from the mean of the population.

The whole area of the curve is equal to
$$ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\ldots\text{etc.}\times\int_{-\frac{1}{2}\pi}^{+\frac{1}{2}\pi}\cos^{n-2}\theta\,d\theta, $$
and since all the parts between the limits vanish at both limits this reduces to 1.

Similarly, the second moment coefficient is equal to
$$
\frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\ldots\text{etc.}\times\int_{-\frac{1}{2}\pi}^{+\frac{1}{2}\pi}\cos^{n-2}\theta\tan^2\theta\,d\theta
= \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\ldots\text{etc.}\times\int_{-\frac{1}{2}\pi}^{+\frac{1}{2}\pi}(\cos^{n-4}\theta - \cos^{n-2}\theta)\,d\theta
= \frac{n-2}{n-3} - 1 = \frac{1}{n-3}.
$$

Hence the standard deviation of the curve is 1/√(n−3). The fourth moment coefficient is equal to
$$
\frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\ldots\text{etc.}\times\int_{-\frac{1}{2}\pi}^{+\frac{1}{2}\pi}\cos^{n-2}\theta\tan^4\theta\,d\theta
= \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\ldots\text{etc.}\times\int_{-\frac{1}{2}\pi}^{+\frac{1}{2}\pi}(\cos^{n-6}\theta - 2\cos^{n-4}\theta + \cos^{n-2}\theta)\,d\theta
$$
$$
= \frac{n-2}{n-3}\cdot\frac{n-4}{n-5} - \frac{2(n-2)}{n-3} + 1 = \frac{3}{(n-3)(n-5)}.
$$

The odd moments are of course zero, as the curve is symmetrical, so
$$ \beta_1 = 0, \qquad \beta_2 = \frac{3(n-3)}{n-5} = 3 + \frac{6}{n-5}. $$

Hence as n increases the curve approaches the normal curve whose standard deviation is 1/√(n−3). β₂, however, is always greater than 3, indicating that large deviations are more common than in the normal curve.

I have tabled the area for the normal curve with standard deviation 1/√7 so as to compare it with my curve for n = 10.³ It will be seen that odds laid according to either table would not seriously differ till we reach z = 0.8, where the odds are about 50 to 1 that the mean is within that limit: beyond that the normal curve gives a false feeling of security, for example, according to the normal curve it is 99,986 to 14 (say 7000 to 1) that the mean of the population lies between −∞ and +1.3s, whereas the real odds are only 99,819 to 181 (about 550 to 1).

³ See the table in Section VII.

Now 50 to 1 corresponds to three times the probable error in the normal curve and for most purposes it would be considered significant; for this reason I have only tabled my curves for values of n not greater than 10, but have given the n = 9 and n = 10 tables to one further place of decimals. They can be used as foundations for finding values for larger samples.⁴

⁴ E.g. if n = 11, to the corresponding value for n = 9 we add (7/8) × (5/6) × (3/4) × (1/2) × (1/2) cos⁸θ sin θ; if n = 13 we add as well (9/10) × (7/8) × (5/6) × (3/4) × (1/2) × (1/2) cos¹⁰θ sin θ, and so on.

The table for n = 2 can be readily constructed by looking out θ = tan⁻¹ z in Chambers’s tables and then 0.5 + θ/π gives the corresponding value. Similarly ½ sin θ + 0.5 gives the values when n = 3.
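Both special cases reduce to elementary functions, so they need nothing beyond the standard library (function names are mine):

```python
from math import atan, sin, pi

def p_n2(z):
    return 0.5 + atan(z) / pi          # n = 2: 0.5 + theta/pi

def p_n3(z):
    return 0.5 + 0.5 * sin(atan(z))    # n = 3: 0.5 + (1/2) sin theta

print(p_n2(1.0), p_n3(1.0))            # ~0.75 and ~0.8536
```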

There are two points of interest in the n = 2 curve. Here s is equal to half the distance between the two observations; tan⁻¹(s/s) = π/4, so that between +s and −s lies 2 × (π/4) × (1/π), or half the probability, i.e. if two observations have been made and we have no other information, it is an even chance that the mean of the (normal) population will lie between them. On the other hand the second moment coefficient is
$$ \frac{1}{\pi}\int_{-\frac{1}{2}\pi}^{+\frac{1}{2}\pi}\tan^2\theta\,d\theta = \frac{1}{\pi}\Big[\tan\theta - \theta\Big]_{-\frac{1}{2}\pi}^{+\frac{1}{2}\pi} = \infty, $$

or the standard deviation is infinite while the probable error is finite.

Section VI. Practical Test of the foregoing Equations

Before I had succeeded in solving my problem analytically, I had endeavoured to do so empirically. The material used was a correlation table containing the height and left middle finger measurements of 3000 criminals, from a paper by W. R. Macdonnell (Biometrika, i, p. 219). The measurements were written out on 3000 pieces of cardboard, which were then very thoroughly shuffled and drawn at random. As each card was drawn its numbers were written down in a book, which thus contains the measurements of 3000 criminals in a random order. Finally, each consecutive set of 4 was taken as a sample—750 in all—and the mean, standard deviation, and correlation⁵ of each sample determined. The difference between the mean of each sample and the mean of the population was then divided by the standard deviation of the sample, giving us the z of Section III.

⁵ I hope to publish the results of the correlation work shortly.
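The experiment is easy to re-run today with pseudo-random normal deviates standing in for the shuffled cards (a sketch of mine, assuming NumPy; the seed and constants are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1908)
pop_mean, pop_sd = 0.0, 1.0
samples = rng.normal(pop_mean, pop_sd, size=(750, 4))   # 750 samples of 4
s = samples.std(axis=1)                                 # divisor n, as in the paper
z = (samples.mean(axis=1) - pop_mean) / s               # Section III's z
print(round(s.mean(), 3))   # theory: ~0.798 for n = 4, sigma = 1
print(round(z.std(), 3))    # theory: 1/sqrt(n - 3) = 1
```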

This provides us with two sets of 750 standard deviations and two sets of 750 z’s on which to test the theoretical results arrived at. The height and left middle finger correlation table was chosen because the distribution of both was approximately normal and the correlation was fairly high. Both frequency curves, however, deviate slightly from normality, the constants being for height β₁ = 0.0026, β₂ = 3.176, and for left middle finger lengths β₁ = 0.0030, β₂ = 3.140, and in consequence there is a tendency for a certain number of larger standard deviations to occur than if the distributions were normal. This, however, appears to make very little difference to the distribution of z.

Another thing which interferes with the comparison is the comparatively large groups in which the observations occur. The heights are arranged in 1 inch groups, the standard deviation being only 2.54 inches; while the finger lengths were originally grouped in millimetres, but unfortunately I did not at the time see the importance of having a smaller unit and condensed them into 2 millimetre groups, in terms of which the standard deviation is 2.74.

Several curious results follow from taking samples of 4 from material disposed in such wide groups. The following points may be noticed:

(1) The means only occur as multiples of 0.25.

(2) The standard deviations occur as the square roots of the following types of numbers: n, n + 0.10, n + 0.25, n + 0.50, n + 0.69, 2n + 0.75.

(3) A standard deviation belonging to one of these groups can only be associated with a mean of a particular kind; thus a standard deviation of √2 can only occur if the mean differs by a whole number from the group we take as origin, while √1.69 will only occur when the mean is at n ± 0.25.

(4) All the four individuals of the sample will occasionally come from thesame group, giving a zero value for the standard deviation. Now this leads toan infinite value of z and is clearly due to too wide a grouping, for althoughtwo men may have the same height when measured by inches, yet the finerthe measurements the more seldom will they he identical, till finally the chancethat four men will have exactly the same height is infinitely small. If we hadsmaller grouping the zero values of the standard deviation might be expectedto increase, and a similar consideration will show that the smaller values of thestandard deviation would also be likely to increase, such as 0.436, when 3 fallin one group and 1 in an adjacent group, or 0.50 when 2 fall in two adjacentgroups. On the other hand, when the individuals of the sample lie far apart,the argument of Sheppard’s correction will apply, the real value of the standarddeviation being more likely to he smaller than that found owing to the frequencyin any group being greater on the side nearer the mode.

These two effects of grouping will tend to neutralize the effect on the mean value of the standard deviation, but both will increase the variability.

Accordingly, we find that the mean value of the standard deviation is quite close to that calculated, while in each case the variability is sensibly greater. The fit of the curve is not good, both for this reason and because the frequency is not evenly distributed owing to effects (2) and (3) of grouping. On the other hand, the fit of the curve giving the frequency of z is very good, and as that is the only practical point the comparison may be considered satisfactory.

The following are the figures for height:

Mean value of standard deviations: Calculated 2.027 ± 0.02; Observed 2.026; Difference −0.001.
Standard deviation of standard deviations: Calculated 0.8558 ± 0.015; Observed 0.9066; Difference +0.0510.

Comparison of Fit. Theoretical Equation: $y = \frac{16\times 750}{\sqrt{2\pi}\,\sigma^3}\,x^2 e^{-\frac{2x^2}{\sigma^2}}$

Scale in terms of standard deviations of population

Calculated frequency:  1½   10½   27    45½   64½   78½   87    88    81½   71    58    45    33    23    15    9½    5½    7
Observed frequency:    3    14½   24½   37½   107   67    73    77    77½   64    52½   49½   35    28    12½   9     11½   7
Difference:            +1½  +4    −2½   −8    +42½  −11½  −14   −11   −4    −7    −5½   +4½   +2    +5    −2½   −½    +6    0

Whence χ² = 48.06, P = 0.00006 (about).

In tabling the observed frequency, values between 0.0125 and 0.0875 were included in one group, while between 0.0875 and 0.1125 they were divided over the two groups. As an instance of the irregularity due to grouping I may mention that there were 31 cases of standard deviations 1.30 (in terms of the grouping), which is 0.5117 in terms of the standard deviation of the population, and they were therefore divided over the groups 0.4 to 0.5 and 0.5 to 0.6. Had they all been counted in groups 0.5 to 0.6, χ² would have fallen to 20.85 and P would have risen to 0.03. The χ² test presupposes random sampling from a frequency following the given law, but this we have not got owing to the interference of the grouping.

When, however, we test the z’s where the grouping has not had so much effect, we find a close correspondence between the theory and the actual result.

There were three cases of infinite values of z which, for the reasons given above, were given the next largest values which occurred, namely +6 or −6. The rest were divided into groups of 0.1; 0.04, 0.05 and 0.06 being divided between the two groups on either side.

The calculated value for the standard deviation of the frequency curve was 1 (±0.0171), while the observed was 1.030. The value of the standard deviation is really infinite, as the fourth moment coefficient is infinite, but as we have arbitrarily limited the infinite cases we may take as an approximation 1/√1500, from which the value of the probable error given above is obtained. The fit of the curve is as follows:

Comparison of Fit. Theoretical Equation: $y = \frac{2N}{\pi}\cos^4\theta$, z = tan θ

Scale of z

Calculated frequency:  5    9½    13½   34½   44½   78½   119   141   119   78½   44½   34½   13½   9½    5
Observed frequency:    9    14½   11½   33    43½   70½   119½  151½  122   67½   49    26½   16    10    6
Difference:            +4   +5    −2    −1½   −1    −8    +½    +10½  +3    −11   +4½   −8    +2½   +½    +1

Whence χ² = 12.44, P = 0.56.

This is very satisfactory, especially when we consider that as a rule observations are tested against curves fitted from the mean and one or more other moments of the observations, so that considerable correspondence is only to be expected; while this curve is exposed to the full errors of random sampling, its constants having been calculated quite apart from the observations.

The left middle finger samples show much the same features as those of the height, but as the grouping is not so large compared to the variability the curves fit the observations more closely. Diagrams III⁶ and IV give the standard deviations and the z’s for the set of samples. The results are as follows:

⁶ There are three small mistakes in plotting the observed values in Diagram III, which make the fit appear worse than it really is.


Mean value of standard deviations: Calculated 2.186 ± 0.023; Observed 2.179; Difference −0.007.
Standard deviation of standard deviations: Calculated 0.9224 ± 0.016; Observed 0.9802; Difference +0.0578.

Comparison of Fit. Theoretical Equation: $y = \frac{16\times 750}{\sqrt{2\pi}\,\sigma^3}\,x^2 e^{-\frac{2x^2}{\sigma^2}}$

Scale in terms of standard deviations of population

Calculated frequency:  1½   10½   27    45½   64½   78½   87    88    81½   71    58    45    33    23    15    9½    5½    7
Observed frequency:    2    14    27½   51    64½   91    94½   68½   65½   73    48½   40½   42½   20    22½   12    5     7½
Difference:            +½   +3½   +½    +5½   0     +12½  +7½   −19½  −16   +2    −9½   −4½   +9½   −3    +7½   +2½   −½    +½

Whence χ² = 21.80, P = 0.19.

Value of standard deviation: Calculated 1 (±0.017); Observed 0.982; Difference −0.018.

Comparison of Fit. Theoretical Equation: $y = \frac{2N}{\pi}\cos^4\theta$, z = tan θ

Scale of z

Calculated frequency:  5    9½    13½   34½   44½   78½   119   141   119   78½   44½   34½   13½   9½    5
Observed frequency:    4    15½   18    33½   44    75    122   138   120½  71    46½   36    11    9     6
Difference:            −1   +6    +4½   −1    −½    −3½   +3    −3    +1½   −7½   +2    +1½   −2½   −½    +1

Whence χ² = 7.39, P = 0.92.

A very close fit.

We see then that if the distribution is approximately normal our theory gives us a satisfactory measure of the certainty to be derived from a small sample in both the cases we have tested; but we have an indication that a fine grouping is of advantage. If the distribution is not normal, the mean and the standard deviation of a sample will be positively correlated, so although both will have greater variability, yet they will tend to counteract one another, a mean deviating largely from the general mean tending to be divided by a larger standard deviation. Consequently, I believe that the table given in Section VII below may be used in estimating the degree of certainty arrived at by the mean of a few experiments, in the case of most laboratory or biological work where the distributions are as a rule of a “cocked hat” type and so sufficiently nearly normal.

Section VII. Tables of
$$ \frac{n-2}{n-3}\cdot\frac{n-4}{n-5}\cdots\left(\begin{matrix}\frac{3}{2}\cdot\frac{1}{2} & n \text{ odd}\\[3pt] \frac{2}{1}\cdot\frac{1}{\pi} & n \text{ even}\end{matrix}\right)\int_{-\frac{1}{2}\pi}^{\tan^{-1}z}\cos^{n-2}\theta\,d\theta $$
for values of n from 4 to 10 inclusive, together with
$$ \frac{\sqrt{7}}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-\frac{7x^2}{2}}\,dx $$
for comparison when n = 10.

z (= x/s)   n = 4    n = 5    n = 6    n = 7    n = 8    n = 9     n = 10    For comparison
0.1         0.5633   0.5745   0.5841   0.5928   0.6006   0.60787   0.61462   0.60411
0.2         0.6241   0.6458   0.6634   0.6798   0.6936   0.70705   0.71846   0.70159
0.3         0.6804   0.7096   0.7340   0.7549   0.7733   0.78961   0.80423   0.78641
0.4         0.7309   0.7657   0.7939   0.8175   0.8376   0.85465   0.86970   0.85520
0.5         0.7749   0.8131   0.8428   0.8667   0.8863   0.90251   0.91609   0.90691
0.6         0.8125   0.8518   0.8813   0.9040   0.9218   0.93600   0.94732   0.94375
0.7         0.8440   0.8830   0.9109   0.9314   0.9468   0.95851   0.96747   0.96799
0.8         0.8701   0.9076   0.9332   0.9512   0.9640   0.97328   0.98007   0.98253
0.9         0.8915   0.9269   0.9498   0.9652   0.9756   0.98279   0.98780   0.99137
1.0         0.9092   0.9419   0.9622   0.9751   0.9834   0.98890   0.99252   0.99820
1.1         0.9236   0.9537   0.9714   0.9821   0.9887   0.99280   0.99539   0.99926
1.2         0.9354   0.9628   0.9782   0.9870   0.9922   0.99528   0.99713   0.99971
1.3         0.9451   0.9700   0.9832   0.9905   0.9946   0.99688   0.99819   0.99986
1.4         0.9531   0.9756   0.9870   0.9930   0.9962   0.99791   0.99885   0.99989
1.5         0.9598   0.9800   0.9899   0.9948   0.9973   0.99859   0.99926   0.99999
1.6         0.9653   0.9836   0.9920   0.9961   0.9981   0.99903   0.99951
1.7         0.9699   0.9864   0.9937   0.9970   0.9986   0.99933   0.99968
1.8         0.9737   0.9886   0.9950   0.9977   0.9990   0.99953   0.99978
1.9         0.9769   0.9904   0.9959   0.9983   0.9992   0.99967   0.99985
2.0         0.9797   0.9919   0.9967   0.9986   0.9994   0.99976   0.99990
2.1         0.9821   0.9931   0.9973   0.9989   0.9996   0.99983   0.99993
2.2         0.9841   0.9941   0.9978   0.9992   0.9997   0.99987   0.99995
2.3         0.9858   0.9950   0.9982   0.9993   0.9998   0.99991   0.99996
2.4         0.9873   0.9957   0.9985   0.9995   0.9998   0.99993   0.99997
2.5         0.9886   0.9963   0.9987   0.9996   0.9998   0.99995   0.99998
2.6         0.9898   0.9967   0.9989   0.9996   0.9999   0.99996   0.99999
2.7         0.9908   0.9972   0.9989   0.9997   0.9999   0.99997   0.99999
2.8         0.9916   0.9975   0.9989   0.9998   0.9999   0.99998   0.99999
2.9         0.9924   0.9978   0.9989   0.9998   0.9999   0.99998   0.99999
3.0         0.9931   0.9981   0.9989   0.9998   —        0.99999   —

Explanation of Tables

The tables give the probability that the value of the mean, measured from the mean of the population, in terms of the standard deviation of the sample, will lie between −∞ and z. Thus, to take the table for samples of 6, the probability of the mean of the population lying between −∞ and once the standard deviation of the sample is 0.9622; the odds are about 24 to 1 that the mean of the population lies between these limits.

The probability is therefore 0.0378 that it is greater than once the standard deviation and 0.0756 that it lies outside ±1.0 times the standard deviation.
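With modern software the same probabilities come from the t distribution, since the tabled quantity is the t integral at z√(n−1) on n − 1 degrees of freedom (a check of mine, assuming SciPy):

```python
from math import sqrt
from scipy import stats

n, z = 6, 1.0
print(stats.t.cdf(z * sqrt(n - 1), df=n - 1))   # ~0.9622, as tabled for n = 6
```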

Illustration of Method

Illustration I. As an instance of the kind of use which may be made of the tables, I take the following figures from a table by A. R. Cushny and A. R. Peebles in the Journal of Physiology for 1904, showing the different effects of the optical isomers of hyoscyamine hydrobromide in producing sleep. The average number of hours’ sleep gained by the use of the drug is tabulated below.

The conclusion arrived at was that in the usual doses 2 was, but 1 was not, of value as a soporific.

Additional hours’ sleep gained by the use of hyoscyamine hydrobromide

Patient    1 (Dextro-)    2 (Laevo-)    Difference (2 − 1)
1             +0.7           +1.9            +1.2
2             −1.6           +0.8            +2.4
3             −0.2           +1.1            +1.3
4             −1.2           +0.1            +1.3
5             −0.1           −0.1             0
6             +3.4           +4.4            +1.0
7             +3.7           +5.5            +1.8
8             +0.8           +1.6            +0.8
9              0             +4.6            +4.6
10            +2.0           +3.4            +1.4

Mean          +0.75          +2.33           +1.58
s.d.           1.70           1.90            1.17

First let us see what is the probability that 1 will on the average give increase of sleep; i.e. what is the chance that the mean of the population of which these experiments are a sample is positive. +0.75/1.70 = 0.44, and looking out z = 0.44 in the table for ten experiments we find by interpolating between 0.8697 and 0.9161 that 0.44 corresponds to 0.8873, or the odds are 0.887 to 0.113 that the mean is positive.

That is about 8 to 1, and would correspond in the normal curve to about 1.8 times the probable error. It is then very likely that 1 gives an increase of sleep, but would occasion no surprise if the results were reversed by further experiments.

If now we consider the chance that 2 is actually a soporific we have the mean increase of sleep = 2.33/1.90 or 1.23 times the s.d. From the table the probability corresponding to this is 0.9974, i.e. the odds are nearly 400 to 1 that such is the case. This corresponds to about 4.15 times the probable error in the normal curve. But I take it that the real point of the authors was that 2 is better than 1. This we must test by making a new series, subtracting 1 from 2. The mean value of this series is +1.58, while the s.d. is 1.17, the mean value being +1.35 times the s.d. From the table, the probability is 0.9985, or the odds are about 666 to one that 2 is the better soporific. The low value of the s.d. is probably due to the different drugs reacting similarly on the same patient, so that there is correlation between the results.

Of course odds of this kind make it almost certain that 2 is the better soporific, and in practical life such a high probability is in most matters considered as a certainty.
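Illustration I can be reproduced in a few lines (a sketch of mine, assuming NumPy and SciPy; note that the s.d. is taken with divisor n, as throughout the paper):

```python
import numpy as np
from scipy import stats

diff = np.array([1.2, 2.4, 1.3, 1.3, 0.0, 1.0, 1.8, 0.8, 4.6, 1.4])
n = len(diff)
z = diff.mean() / diff.std()             # s.d. with divisor n
print(round(diff.mean(), 2), round(diff.std(), 2), round(z, 2))  # 1.58 1.17 1.35
print(round(stats.t.cdf(z * np.sqrt(n - 1), df=n - 1), 4))       # ~0.9986
```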

Illustration II. Cases where the tables will be useful are not uncommon in agricultural work, and they would be more numerous if the advantages of being able to apply statistical reasoning were borne in mind when planning the experiments. I take the following instances from the accounts of the Woburn farming experiments published yearly by Dr Voelcker in the Journal of the Agricultural Society.

A short series of pot culture experiments were conducted in order to determine the causes which lead to the production of Hard (glutinous) wheat or Soft (starchy) wheat. In three successive years a bulk of seed corn of one variety was picked over by hand and two samples were selected, one consisting of “hard” grains and the other of “soft”. Some of each of them were planted in both heavy and light soil and the resulting crops were weighed and examined for hard and soft corn.

The conclusion drawn was that the effect of selecting the seed was negligible compared with the influence of the soil.

This conclusion was thoroughly justified, the heavy soil producing in each case nearly 100% of hard corn, but still the effect of selecting the seed could just be traced in each year.

But a curious point, to which Dr Voelcker draws attention in the second year’s report, is that the soft seeds produced the higher yield of both corn and straw. In view of the well-known fact that the varieties which have a high yield tend to produce soft corn, it is interesting to see how much evidence the experiments afford as to the correlation between softness and fertility in the same variety.

Further, Mr Hooker⁷ has shown that the yield of wheat in one year is largely determined by the weather during the preceding year. Dr Voelcker’s results may afford a clue as to the way in which the seed is affected, and would almost justify the selection of particular soils for growing wheat.⁸

The figures are as follows, the yields being expressed in grammes per pot:

Year                               1899            1900            1901
Soil                            Light   Heavy   Light   Heavy   Light   Heavy    Average   Standard deviation    z
Yield of corn from soft seed     7.85    8.89   14.81   13.55    7.48   15.39    11.328
Yield of corn from hard seed     7.27    8.32   13.81   13.36    7.97   13.13    10.643
Difference                      +0.58   +0.57   +1.00   +0.19   −0.49   +2.26    +0.685        0.778            0.88
Yield of straw from soft seed   12.81   12.87   22.22   20.21   13.97   22.57    17.442
Yield of straw from hard seed   10.71   12.48   21.64   20.26   11.71   18.96    15.927
Difference                      +2.10   +0.39   +0.78   −0.05   +2.66   +3.61    +1.515        1.261            1.20

If we wish to lay the odds that the soft seed will give a better yield of corn on the average, we divide the average difference by the standard deviation, giving us z = 0.88. Looking this up in the table for n = 6 we find p = 0.9465, or the odds are 0.9465 to 0.0535, about 18 to 1.

⁷ Journal of the Royal Statistical Society, 1897.
⁸ And perhaps a few experiments to see whether there is a correlation between yield and “mellowness” in barley.

Similarly for straw z = 1.20, p = 0.9782, and the odds are about 45 to 1.

In order to see whether such odds are sufficient for a practical man to draw a definite conclusion, I take another set of experiments in which Dr Voelcker compares the effects of different artificial manures used with potatoes on a large scale.

The figures represent the difference between the crops grown with the use of sulphate of potash and kainit respectively in both 1904 and 1905:

          cwt.  qr.  lb.          ton  cwt.  qr.  lb.
1904   +   10    3   20    :   +    1   10    1   26
1905   +    6    0    3    :   +        13    2    8

(two experiments in each year)

The average gain by the use of sulphate of potash was 15.25 cwt. and the s.d. 9 cwt., whence, if we want the odds that the conclusion given below is right, z = 1.7, corresponding, when n = 4, to p = 0.9698 or odds of 32 to 1; this is midway between the odds in the former example. Dr Voelcker says: “It may now fairly be concluded that for the potato crop on light land 1 cwt. per acre of sulphate of potash is a better dressing than kainit.”

As an example of how the table should be used with caution, I take the following pot culture experiments to test whether it made any difference whether large or small seeds were sown.

Illustration III. In 1899 and in 1903 “head corn” and “tail corn” were taken from the same bulks of barley and sown in pots. The yields in grammes were as follows:

                 1899    1903
Large seed       13.9     7.3
Small seed       14.4     8.7
                 +0.5    +1.4

The average gain is thus 0.95 and the s.d. 0.45, giving z = 2.1. Now the table for n = 2 is not given, but if we look up the angle whose tangent is 2.1 in Chambers’s tables,
$$ p = \frac{\tan^{-1} 2.1}{180°} + 0.5 = \frac{64°39'}{180°} + 0.5 = 0.859, $$

so that the odds are about 6 to 1 that small corn gives a better yield than large. These odds⁹ are those which would be laid, and laid rigidly, by a man whose only knowledge of the matter was contained in the two experiments. Anyone conversant with pot culture would however know that the difference between the two results would generally be greater and would correspondingly moderate the certainty of his conclusion. In point of fact a large-scale experiment confirmed this result, the small corn yielding about 15% more than the large.

⁹ [Through a numerical slip, now corrected, Student had given the odds as 33 to 1 and it is to this figure that the remarks in this paragraph relate.]

I will conclude with an example which comes beyond the range of the tables, there being eleven experiments.

To test whether it is of advantage to kiln-dry barley seed before sowing, seven varieties of barley were sown (both kiln-dried and not kiln-dried) in 1899 and four in 1900; the results are given in the table.

(N.K.D. = not kiln-dried; K.D. = kiln-dried)

           Lb. head corn per acre     Price of head corn in       Cwt. straw per acre       Value of crop per acre
                                      shillings per quarter                                 in shillings
Year      N.K.D.   K.D.    Diff.     N.K.D.  K.D.   Diff.        N.K.D.  K.D.   Diff.      N.K.D.   K.D.    Diff.
1899       1903    2009    +106       26½    26½     0            19½    25     +5½         140½    152     +11½
           1935    1915    − 20       28     26½    −1½           22½    24     +1½         152½    145     − 7½
           1910    2011    +101       29½    28½    −1            23     24     +1          158½    161     + 2½
           2496    2463    − 33       30     29     −1            23     28     +5          204½    199½    − 5
           2108    2180    + 72       27½    27     −½            22½    22½     0          162     164     + 2
           1961    1925    − 36       26     26      0            19½    19     −½          142     139½    − 2½
           2060    2122    + 62       29     26     −3            24½    22½    −2          168     155     −13
1900       1444    1482    + 38       29½    28½    −1            15½    16     +½          118     117½    − ½
           1612    1542    − 70       28½    28     −½            18     17½    −½          128½    121     − 7½
           1316    1443    +127       30     29     −1            14½    15½    +1          109½    116½    + 7
           1511    1535    + 24       28½    28     −½            17     17½    +½          120     120½    + ½

Average   1841.5   1875.2  +33.7      28.45  27.55  −0.91         19.95  21.05  +1.10       145.82  144.68  −1.14

Standard deviation          63.1                     0.79                        2.25                        6.67
Standard deviation ÷ √8     22.3                     0.28                        0.80                        2.36

It will be noticed that the kiln-dried seed gave on an average the larger yield of corn and straw, but that the quality was almost always inferior. At first sight this might be supposed to be due to superior germinating power in the kiln-dried seed, but my farming friends tell me that the effect of this would be that the kiln-dried seed would produce the better quality barley. Dr Voelcker draws the conclusion: “In such seasons as 1899 and 1900 there is no particular advantage in kiln-drying before sowing.” Our examination completely justifies this and adds “and the quality of the resulting barley is inferior though the yield may be greater.”

In this case I propose to use the approximation given by the normal curve with standard deviation s/√(n−3) and therefore use Sheppard’s tables, looking up the difference divided by s/√8. The probability in the case of yield of corn per acre is given by looking up 33.7/22.3 = 1.51 in Sheppard’s tables. This gives p = 0.934, or the odds are about 14 to 1 that kiln-dried corn gives the higher yield.

Similarly 0.91/0.28 = 3.25, corresponding to p = 0.9994,¹⁰ so that the odds are very great that kiln-dried seed gives barley of a worse quality than seed which has not been kiln-dried.

Similarly, it is about 11 to 1 that kiln-dried seed gives more straw and about 2 to 1 that the total value of the crop is less with kiln-dried seed.

¹⁰ As pointed out in Section V, the normal curve gives too large a value for p when the probability is large. I find the true value in this case to be p = 0.9976. It matters little, however, to a conclusion of this kind whether the odds in its favour are 1660 to 1 or merely 416 to 1.
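The two normal-curve look-ups can be reproduced with any normal integral in place of Sheppard’s tables (a check of mine, assuming SciPy):

```python
from scipy import stats

print(round(stats.norm.cdf(33.7 / 22.3), 3))   # corn: ~0.934, odds about 14 to 1
print(round(stats.norm.cdf(0.91 / 0.28), 4))   # quality: ~0.9994
```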


Section X. Conclusions

1. A curve has been found representing the frequency distribution of standard deviations of samples drawn from a normal population.

2. A curve has been found representing the frequency distribution of the means of such samples, when these values are measured from the mean of the population in terms of the standard deviation of the sample.

3. It has been shown that the curve represents the facts fairly well even when the distribution of the population is not strictly normal.

4. Tables are given by which it can be judged whether a series of experiments, however short, have given a result which conforms to any required standard of accuracy or whether it is necessary to continue the investigation.

Finally I should like to express my thanks to Prof. Karl Pearson, without whose constant advice and criticism this paper could not have been written.

[Biometrika, 6 (1908), pp. 1–25; reprinted on pp. 11–34 in “Student’s” Collected Papers, edited by E. S. Pearson and John Wishart with a Foreword by Launce McMullen, Cambridge University Press for the Biometrika Trustees, 1942.]
