Poisson Statistics - MITweb.mit.edu/jgross/Public/mit-classes/8.13/poisson... · 10/9/2011  ·...

4
Poisson Statistics Jason Gross * Student at MIT (Dated: October 9, 2011) I present the theory behind the use of Poisson statistics; Poisson statistics arise from the counting of random, statistically independent processes, examples of which abound in nature. We investigate background radiation counts with a scintillation counter, and we do not reject the null hypothesis that they are Poisson-distributed. On the other hand, we reject the hypothesis that they are normally distributed for μ 10 with confidence > 99%. I. THE THEORY OF INDEPENDENT RANDOM VARIABLES AND POISSON STATISTICS I.1. Random Variables It is commonly assumed, at least implicitly, that every measurement taken is a measurement of a random vari- able. Sometimes, the set of possible values is discrete; for example, we may look at a light and observe it to be “on”. Other times, the set of possible values seems con- tinuous; for example, we may measure the width of this sheet of paper in centimeters. Regardless of what the details of a measurement are, we assume that our measurements are repeatable in the limit, and that our particular set of measurements con- stitute a representative sample of all the set of all such possible measurements. If, for example, you have a small field of apple trees, you might estimate the number of ap- ples that you have by counting how many trees you have, counting how many apples are on some particular tree, and multiplying your two counts. In science, we assume that such methods of estimation are reasonable. More formally, we assume that every sequence of mea- surements is a random variable ; this amounts, I think, to assuming that frequentist definition of probability is self-consistent and that the Law of Large Numbers holds for every sequence of measurements. 1 The assumption that every measurement, both in the- ory and in practice, is of a random variable, allows us to talk meaningfully and systematically about models for measurement, which in turn allow us to reason about * [email protected] 1 The law of large numbers states that the sample average of iden- tical statistically independent random measurements almost al- ways converges to the expected value of a given measurement. The formal definition of “almost always” is non-trivial and be- yond the scope of this paper. Any introductory formal statistics or probability theory textbook should explain it; alternatively, see MathWorld[1] or Wikipedia[2]. This may be combined with the frequentist definition of prob- ability to note that the proportion of measurements in which we measure a fixed particular outcome is itself a random vari- able subject to the law of large numbers, and so the law of large numbers permits us to assert that the average of distributions of measurements will match the probability distribution in the large n limit. models for the underlying universe, and how experimen- tal results correspond to these models. It remains to establish that the displays of our measurement devices are actually measurements of the underlying quantities. To me, this seems a remarkable fact, which I hypothesize to be a result of the central limit theorem and the fact that variances add in quadrature. A complete theoretical analysis of the nature of measurement is either far beyond the scope of this paper or far beyond my understanding (or both). I.2. Poisson Statistics: A Limit of the Binomial Distribution Consider a measurement, such as flipping a coin, with two possible outcomes, with probabilities p and q := 1-p; call the outcomes “success” and “failure”, respectively. If you take n such measurements, all statistically indepen- dent, then the probability distribution on the number of successes x is P (x; n, p)= n x p x q n-x = n! x!(n - x)! p x q n-x . (1) Straightforward calculation yields that the mean of this distribution is μ = np. Consider now a measurement, such as the number of radioactive decays in a minute of an atom with a par- ticular half-life, that is believed to occur with binomial distribution with known mean μ and an unknown but large number of potential events (n μ; or p 1). The distribution on the measured number of successes of such a measurement is given by P (x; μ)= μ x x! e -μ . (2) The mean and deviation of this distribution are both equal to μ. A full derivation and discussion of this dis- tribution can be found in an introductory statistics text- books such as Bevington [3].

Transcript of Poisson Statistics - MITweb.mit.edu/jgross/Public/mit-classes/8.13/poisson... · 10/9/2011  ·...

Page 1: Poisson Statistics - MITweb.mit.edu/jgross/Public/mit-classes/8.13/poisson... · 10/9/2011  · Poisson statistics are everywhere. Examples of Poisson-distributed measurements include

Poisson Statistics

Jason Gross∗

Student at MIT(Dated: October 9, 2011)

I present the theory behind the use of Poisson statistics; Poisson statistics arise from the countingof random, statistically independent processes, examples of which abound in nature. We investigatebackground radiation counts with a scintillation counter, and we do not reject the null hypothesisthat they are Poisson-distributed. On the other hand, we reject the hypothesis that they arenormally distributed for µ ≤ 10 with confidence > 99%.

I. THE THEORY OF INDEPENDENT RANDOMVARIABLES AND POISSON STATISTICS

I.1. Random Variables

It is commonly assumed, at least implicitly, that everymeasurement taken is a measurement of a random vari-able. Sometimes, the set of possible values is discrete;for example, we may look at a light and observe it to be“on”. Other times, the set of possible values seems con-tinuous; for example, we may measure the width of thissheet of paper in centimeters.

Regardless of what the details of a measurement are,we assume that our measurements are repeatable in thelimit, and that our particular set of measurements con-stitute a representative sample of all the set of all suchpossible measurements. If, for example, you have a smallfield of apple trees, you might estimate the number of ap-ples that you have by counting how many trees you have,counting how many apples are on some particular tree,and multiplying your two counts. In science, we assumethat such methods of estimation are reasonable.

More formally, we assume that every sequence of mea-surements is a random variable; this amounts, I think,to assuming that frequentist definition of probability isself-consistent and that the Law of Large Numbers holdsfor every sequence of measurements.1

The assumption that every measurement, both in the-ory and in practice, is of a random variable, allows us totalk meaningfully and systematically about models formeasurement, which in turn allow us to reason about

[email protected] The law of large numbers states that the sample average of iden-

tical statistically independent random measurements almost al-ways converges to the expected value of a given measurement.The formal definition of “almost always” is non-trivial and be-yond the scope of this paper. Any introductory formal statisticsor probability theory textbook should explain it; alternatively,see MathWorld[1] or Wikipedia[2].

This may be combined with the frequentist definition of prob-ability to note that the proportion of measurements in whichwe measure a fixed particular outcome is itself a random vari-able subject to the law of large numbers, and so the law of largenumbers permits us to assert that the average of distributionsof measurements will match the probability distribution in thelarge n limit.

models for the underlying universe, and how experimen-tal results correspond to these models. It remains toestablish that the displays of our measurement devicesare actually measurements of the underlying quantities.To me, this seems a remarkable fact, which I hypothesizeto be a result of the central limit theorem and the factthat variances add in quadrature. A complete theoreticalanalysis of the nature of measurement is either far beyondthe scope of this paper or far beyond my understanding(or both).

I.2. Poisson Statistics: A Limit of the BinomialDistribution

Consider a measurement, such as flipping a coin, withtwo possible outcomes, with probabilities p and q := 1−p;call the outcomes “success” and “failure”, respectively. Ifyou take n such measurements, all statistically indepen-dent, then the probability distribution on the number ofsuccesses x is

P (x;n, p) =

(n

x

)pxqn−x =

n!

x!(n− x)!pxqn−x. (1)

Straightforward calculation yields that the mean of thisdistribution is µ = np.

Consider now a measurement, such as the number ofradioactive decays in a minute of an atom with a par-ticular half-life, that is believed to occur with binomialdistribution with known mean µ and an unknown butlarge number of potential events (n� µ; or p� 1). Thedistribution on the measured number of successes of sucha measurement is given by

P (x;µ) =µx

x!e−µ. (2)

The mean and deviation of this distribution are bothequal to µ. A full derivation and discussion of this dis-tribution can be found in an introductory statistics text-books such as Bevington [3].

Page 2: Poisson Statistics - MITweb.mit.edu/jgross/Public/mit-classes/8.13/poisson... · 10/9/2011  · Poisson statistics are everywhere. Examples of Poisson-distributed measurements include

2

II. MOTIVATION AND METHODOLOGY

II.1. Motivation

Poisson statistics are everywhere. Examples ofPoisson-distributed measurements include the rate of ra-dioactive emissions of a radioisotope, the number of dropsof water per minute that fall in to a bucket of water dur-ing a rainstorm, and the number of particles per unitarea in an approximately uniform beam of particles.

We set out to observe a particular example of Pois-son statistics—the level of background radiation—in thelab. We tested the hypothesis that the background γ-raycount, at various energy levels (not measured), is well-described by a Poisson distribution.

II.2. Methodology

We measured the level of background radiation (in par-ticular, γ-rays) using a scintillation counter. A scintilla-tion counter is composed of a scintillator and a photo-multiplier. The scintillator is a material that fluoresces(releases photons) when struck with charged particles.The entrance of the photomultiplier is metallic; the metalreleases electrons when struck by light, via the photoelec-tric effect. The electrons are then accelerated to progres-sively higher voltage dynodes, which are made of materi-als that emit many electrons when struck with [few] elec-trons. This cascade of electron emissions creates a signalthat can be measured, or amplified by a traditional am-plifier. The signal is approximately proportional to theenergy of incoming radiation.

III. PROCEDURE

As shown in Figure 1, we hooked the NaI scintilla-tion counter to a high voltage source, and to a counterthrough a preamplifier and amplifier. The counter wascoupled with a [built-in] discriminator which dropped sig-nals below a particular voltage.

We wanted to measure the distribution at mean per-second counts of approximately 1, 4, 10, and 100 γ-rays;we set the amplifier and discriminator to the followingsettings:

µ Amplifier (ratio) Discriminator (V)≈ 1 26.10 ± 0.01 9.7 ± 0.2≈ 4 26.10 ± 0.01 6.0 ± 0.2≈ 10 46.10 ± 0.01 9.7 ± 0.2≈ 100 46.10 ± 0.01 2.0 ± 0.2

For each setting, we recorded the number of eventsin 100 one-second intervals, and then in one 100 secondinterval. The timer on the counter was determined to be

+HV Power Supply

Scintillator NaI

Photo Multiplier Tube

Voltage Divider

PreamplifierAmplifierCounter

FIG. 1. The setup for measuring the number of incident γ-rays of background radiation in a given time interval. Fig-ure shamelessly stolen (and slightly modified) from the labguide.[4]

accurate to one part in a million with a pulse generator.2

IV. RESULTS AND CONCLUSIONS

IV.1. Analysis and Results

We found that background γ-ray radiation was Poissondistributed.

The data for the one-second trials were plotted againstthe [continuous] theoretical Poisson and normal distri-butions in Figure 2, using the average of the data asthe mean and the square root of that value as the stan-dard deviation. Plots of the best-fit normal curves arealso included for µ ≈ 1 and µ ≈ 4;3 no significant dif-ferences were found between best-fit curves and curvesbased only on the averages, and the visual differencesbetween best-fit Poisson curves and average-based Pois-son curves was not worth showing. Cumulative averageplots are included in Figure 3 to show that the error isinversely proportional to the square root of the numberof measurements. The Pearson χ2 test was performedto find goodness-of-fit. I do not reject the null hypoth-esis that γ-ray counts are Poisson distributed. I rejectthe null hypothesis that γ-ray counts are normally dis-tributed for µ ≤ 10 1/s, with a confidence of > 99%.

2 We also found a systematic offset of 12 µs, but I suspect this wasdue to a confusing setting of our pulse generator (called ToPer)which we did not discover until the electromagnetic pulses lab.

3 I was having trouble with the larger values of µ, presumablybecause Mathematica decided that χ2 had no gradient at aroundµ ≈ 0, where I’m guessing it started. I minimized the χ2 valuesmyself, though, and found no appreciable differences.

Page 3: Poisson Statistics - MITweb.mit.edu/jgross/Public/mit-classes/8.13/poisson... · 10/9/2011  · Poisson statistics are everywhere. Examples of Poisson-distributed measurements include

3

Num

ber

ofC

ount

s

0 1 2 3 4 5 6

10

20

30

40

50

Poisson Data against the Theoretical Distributions for Μ » 1 � sx � 0.96 ± 0.98

Χ2 P-Value Curve

Poisson 3.4 0.33

Normal 3.5´102 5.5´10-67

Normal Fit 3.7´102 2.9´10-72

Frequency Bin

(a)

Num

ber

ofC

ount

s

0 2 4 6 8 10 12 14

5

10

15

20

25Poisson Data against the Theoretical Distributions for Μ » 4 � s

x � 5.5 ± 2.3

Χ2 P-Value Curve

Poisson 6.0 0.65

Normal 84. 6.8´10-13

Normal Fit 84. 6.8´10-13

Frequency Bin

(b)

Num

ber

ofC

ount

s

0 5 10 15 20

5

10

15

20

Poisson Data against the Theoretical Distributions for Μ » 10 � sx � 9.9 ± 3.1

Χ2 P-Value Curve

Poisson 5.7 0.77

Normal 28. 0.0052

Frequency Bin

(c)

Num

ber

ofC

ount

s

80 100 120 140 1600

2

4

6

8

10

12

14

Poisson Data against the Theoretical Distributions for Μ » 100 � sx � 115. ± 11.

Χ2 P-Value Curve

Poisson 11. 0.55

Normal 10. 0.59

Frequency Bin

(d)Unlike the other plots, a bin count of 3 was usedhere.

FIG. 2. Plots of experimental data against the theoreticalcurves. The fitted normal curves have two degrees of freedom,and the other theoretical curves have one.

0 20 40 60 80 100Trial ð

0.5

1.0

1.5

2.0

2.5

3.0rcH jL

Cumulative Average for Μ » 1 � s

(a)

0 20 40 60 80 100Trial ð

2

4

6

8

10

rcH jLCumulative Average for Μ » 4 � s

(b)

0 20 40 60 80 100Trial ð

2

4

6

8

10

12

rcH jLCumulative Average for Μ » 10 � s

(c)

20 40 60 80 100Trial ð

95

100

105

110

115

120

125

rcH jLCumulative Average for Μ » 100 � s

(d)

FIG. 3. Plots of the cumulative average with error bars.

Page 4: Poisson Statistics - MITweb.mit.edu/jgross/Public/mit-classes/8.13/poisson... · 10/9/2011  · Poisson statistics are everywhere. Examples of Poisson-distributed measurements include

4

IV.2. Discussion and Conclusion

Unsurprisingly, background γ-ray radiation counts atthe energies that we probed were found to be Poisson dis-

tributed. This supports our hypothesis that radioactivedecays are random, statistically independent processes.4

[1] J. Renze and E. W. Weisstein, “Law of large numbers,”http://mathworld.wolfram.com/LawofLargeNumbers.

html.[2] “Law of large numbers,” http://en.wikipedia.org/

wiki/Law_of_large_numbers (2011).[3] P. Bevington and D. Robinson, Data Reduction and Error

Analysis for the Physical Sciences (McGraw-Hill, 2003).

[4] S. P. Robinson, “Poisson statistics,” (2011).

ACKNOWLEDGMENTS

The author gratefully acknowledges his lab partnerThomas Vandermeulen for help collecting the data, andthe J-Lab staff for their help in experimentation.

4 Actually, it supports the hypothesis that background γ radiationcomes from random, statistically independent processes. There isan additional hypothesis connecting these two hypotheses, which

is roughly that there is no process that alters the character ofthe distribution of γ-rays between emission and detection.