Uncertainties


Experimental Uncertainties

1 Identifying sources of uncertainty

The results of measurements can be influenced in many ways. Uncertainties can arise from instrumental limitations, such as the resolution of the scale or the accuracy of its calibration, external influences such as temperature fluctuations, variations in the system itself, and disturbance of the system due to the measuring process, for example putting a cold thermometer in a hot fluid. Below are some of the basic categories. As a general rule, you always need to think at least about resolution (how accurate your scale is) and repeatability (by how much the result changes if you repeat the measurement).

1. Resolution: There will always be an uncertainty due to the limited information you can extract from a measuring instrument. For an analog instrument, the uncertainty due to resolution is usually half the smallest division on the scale, as you are usually capable of rounding the measurement to the nearest graduation. For a digital instrument, the uncertainty is best taken as 1 in the last digit, because you don't know whether the instrument rounded or truncated the later digits.

2. Repeatability: If you repeat an experiment, you will most likely get a different value, even if you think you have repeated it in exactly the same way (unless the experiment is very simple or your precision is not that great). Quite often these variations can be called noise: this is where the reading (for example, on a voltmeter, or on an oscilloscope) changes quickly and randomly when nothing is changing in the experimental set-up and conditions. In some cases the signal-to-noise ratio is so large that you are unaware of the noise, but it will be present in most of the experiments you do.

To estimate the size of the uncertainty due to noise, it is best to take several measurements and average them. For a small number of measurements, a simple estimate of the associated uncertainty is half the range of measurements. For example, if you measure voltages of 1.25 V, 1.35 V, 1.30 V and 1.33 V, for which the average is 1.31 V with a range of 0.1 V, a reasonable estimate of the uncertainty is ±0.05 V. Another method would be to use an uncertainty that covers the spread of measurements from the mean: in this case, ±0.06 V.

If you take more than about 10 measurements, you can use statistical methods to estimate the uncertainty from the range of measured values. For a set of n measurements x1, x2, ..., xn, first calculate the mean x̄ and the standard deviation σ of the measurements. The standard deviation is calculated as

σ = √[ Σᵢ (xi − x̄)² / (n − 1) ]

where the sum runs over all n measurements.

It is usually easiest to use a computer or calculator with a built-in function such as Excel's STDEV. Make sure you use the formula for the sample standard deviation, not the population standard deviation, since you are sampling from a population of possible measurements – you have not made every possible repeat measurement. (The formula for the population standard deviation has n instead of n − 1 in the denominator.) The standard deviation indicates how big the scatter of the measurements is around the mean.

For uncorrelated measurements, the standard error in the mean is obtained from the experimental standard deviation by

∆(x̄) = σ/√n    (1)


This is a measure of how well the mean has been determined by your repeat measurements and is the uncertainty you would quote on your experimental result. Note the dependence on n: the more measurements you take, the smaller the uncertainty. (A short code sketch after this list of categories shows the calculation.)

Assuming we are dealing with a Gaussian or normal distribution of measured values, the "true" (asymptotic as n → ∞) value has a 68% chance of falling within ∆(x̄) of the mean of your measurements. Uncertainties in science are usually quoted at a 68% level.

We also use the standard error in the mean to examine outliers or anomalous readings – we expect 95% of our data to fall within 2∆(x̄) of the mean or expected value, so if a measurement falls more than several ∆(x̄) away from the mean, it is often worth re-measuring or considering whether it follows the pattern we attributed to the data.

3. Calibration: No instrument is perfect; it always has to be checked or calibrated against another instrument. (That instrument in turn also must be checked, leading in a chain all the way back to the international reference standards.) Though this information is not usually available in undergraduate labs, you will sometimes be able to calibrate your equipment yourself. For example, in the photo-electric effect experiment, you can calibrate the wavelengths being passed through the monochromator using the standard filters provided, and you can calibrate zero on the picoammeter using the zero check and zero adjustment. Similarly, you can try to think of ways of ensuring the thermometers used to measure the solar house temperatures are properly calibrated (e.g. by measuring the temperature of ice). Assuming a device behaves linearly, a two-point calibration is often sufficient to check that it is calibrated properly.

4. Disturbances due to the measuring apparatus: The act of measuring can influence the quantity under measurement. Like the external influences, if you can correct for this, do so, otherwise try to quantify the uncertainty this is introducing into your measurement. A simple example is a temperature measurement: when you lower a cold thermometer into a hot liquid, the immediate temperature reading will be lower than the initial temperature of the liquid. Waiting for the system to reach thermal equilibrium is one way of minimizing the disturbance in this case.

5. External influences: The environment – temperature, pressure, humidity, vibrations, background illumination – in which the experiment is conducted will affect the measurements obtained. The first step in accounting for such effects is to estimate by how much a change in the external parameter affects your measurement, either by theoretical considerations or experimental tests. For example, if you are measuring electromagnetic spectra, you can test whether the background illumination in the room has a serious impact on your data by taking a measurement with the spectrograph probe covered. In some cases you may be able to quantify these effects, but often you can't and so you will just need to make a sensible decision about whether such factors are negligible, or what order of magnitude they might contribute to your uncertainties.
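If you want to check these estimates with a computer rather than by hand, the calculation is short. Below is a minimal Python sketch (Python and the variable names are illustrative choices, not part of these notes) that computes the mean, the sample standard deviation and the standard error in the mean for the four voltage readings used as an example in point 2 above.

    import statistics

    # Repeat voltage readings from the example in point 2 (in volts)
    readings = [1.25, 1.35, 1.30, 1.33]

    n = len(readings)
    mean = statistics.mean(readings)
    # Sample standard deviation (n - 1 in the denominator), the same as Excel's STDEV
    sigma = statistics.stdev(readings)
    # Standard error in the mean, equation (1)
    std_error = sigma / n ** 0.5
    # Simple half-range estimate, for comparison
    half_range = (max(readings) - min(readings)) / 2

    print(f"mean = {mean:.3f} V")
    print(f"sample standard deviation = {sigma:.3f} V")
    print(f"standard error in the mean = {std_error:.3f} V")
    print(f"half-range estimate = {half_range:.3f} V")

For these four readings this gives a mean of about 1.31 V with a standard error of about 0.02 V, a little smaller than the half-range estimate of ±0.05 V; with only four readings the simpler half-range estimate quoted above is perfectly reasonable.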


2 Propagation of uncertainties

We denote the uncertainty in a quantity x1 by ∆x1. So for a length (20.5 ± 0.5) cm, we have x1 = 20.5 cm and ∆x1 = 0.5 cm. We can also write uncertainties in the form x1 = 20.5(5) cm, where the number in parentheses indicates the uncertainty in the last digit – such notation is common in physics research.

If your desired answer is a scalar multiple of the value you measure (for example, you measure a photon frequency f and want to relate it to the photon's energy through E = hf where h is Planck's constant), then the uncertainty on the final result is simply the same scalar multiplied by the uncertainty on your measurement: that is, ∆E = h∆f.

If you identify more than one source of uncertainty, you may have to combine them. Say we measure quantities x1, x2, x3, ... which have associated uncertainties ∆x1, ∆x2, ∆x3, ... We will assume that the measurements are uncorrelated, so that the outcome of a measurement of one quantity has no effect on the measurement of the others. (If they are correlated, things get a little more complicated.)

We then combine the results of our measurements to produce a final answer z = f(x1, x2, x3). The general rule for combining uncertainties is:

(∆z)² = (∆x1)² (∂z/∂x1)² + (∆x2)² (∂z/∂x2)² + (∆x3)² (∂z/∂x3)² + ...

      = Σᵢ (∆xi)² (∂z/∂xi)²    (2)

However, you should always try to make your life as easy as possible: before you start combining uncertainties, look at the relative contributions each uncertainty is going to make to the uncertainty on your final result. If you are adding two (or more) numbers together, compare the absolute sizes of the uncertainties: if one is a lot larger than the other, it is often sufficient to just use that. If the numbers are combined in any other way, compare the fractional uncertainties: if one quantity has a fractional uncertainty of 1/10 or 10%, while the other has a fractional uncertainty of 1/100 or 1%, your final answer will have a 10% uncertainty. In such cases, where one source of uncertainty dominates the others, you do not need to do a complex error propagation.
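If the partial derivatives in equation (2) are awkward to evaluate by hand, they can be estimated numerically. The following is a minimal Python sketch of that idea (the function name, step size and example numbers are illustrative assumptions, not something from these notes):

    import math

    def propagate(f, values, uncertainties, eps=1e-6):
        """Combine uncorrelated uncertainties using equation (2).

        f             -- function of the measured quantities, f(x1, x2, ...)
        values        -- measured values [x1, x2, ...]
        uncertainties -- uncertainties [dx1, dx2, ...]
        The partial derivatives are estimated by central finite differences."""
        total = 0.0
        for i, (x, dx) in enumerate(zip(values, uncertainties)):
            step = eps * max(abs(x), 1.0)
            up = list(values)
            up[i] = x + step
            down = list(values)
            down[i] = x - step
            dfdx = (f(*up) - f(*down)) / (2 * step)  # estimate of the partial derivative
            total += (dfdx * dx) ** 2
        return math.sqrt(total)

    # Example: z = x1 * x2 with x1 = 20.5 ± 0.5 and x2 = 3.00 ± 0.10 (made-up numbers)
    print(propagate(lambda x1, x2: x1 * x2, [20.5, 3.00], [0.5, 0.10]))

For this example the result (about ±2.5) agrees with the multiplication rule given in section 2.1 below.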

2.1 Common cases

Make sure you can see how all of the following examples come from equation 2.

Addition or subtraction: If z = x1 + x2 or z = x1 − x2,

∆z = √[ (∆x1)² + (∆x2)² ]

Multiplication or division: If z = x1 × x2 or z = x1/x2,

(∆z/z)² = (∆x1/x1)² + (∆x2/x2)²

Powers: If z = x1ⁿ,

∆z/z = n ∆x1/x1
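As a quick worked example (with made-up numbers): if z = x1 × x2 with x1 = 10.0 ± 0.3 (a 3% fractional uncertainty) and x2 = 4.00 ± 0.20 (5%), then (∆z/z)² = 0.03² + 0.05² gives ∆z/z ≈ 0.058, so z = 40.0 ± 2.3.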


3 The least-squares model and least-squares fitting

The least-squares model assumes that the data you obtain is a function of one or more variables (which you need to estimate) plus scatter – that is, that data is signal plus noise.

For example, if you make a series of measurements of some quantity m, we assume that the result of your ith measurement, di, is equal to m + εi, where εi is a measure of the noise. The εs are often referred to as residuals. For example, in one experiment you may measure the mass of 6 nominally identical 50 g masses, which you find to be 49.9, 50.0, 50.1, 50.5, 49.3 and 49.7 grams. Each of these provides an equation:

d1 = 49.9 = m + ε1
d2 = 50.0 = m + ε2
d3 = 50.1 = m + ε3
d4 = 50.5 = m + ε4
d5 = 49.3 = m + ε5
d6 = 49.7 = m + ε6

So we have six equations and seven unknowns: m (the average mass of the whole population of masses) and the six residuals.

The least-squares method takes the sum of the squares of the residuals

Q = Σᵢ (di − m)²

and minimizes it with respect to m – i.e. it calculates dQ/dm and finds the minimum,

dQ/dm = Σᵢ −2(di − m) = 0

This gives an estimate of the mean mass of all of the masses, not just the ones we measured. To find the variance and hence the uncertainty we take the sum of the squares of the residuals and divide by the number of degrees of freedom ν – in this case, there are 6 residuals, but they are constrained by the fact that they have to sum to zero, so there are 5 degrees of freedom:

σ² = Σᵢ (di − m)² / ν

You'd quote your result as m ± σ.

The least-squares approach can be extended to deal with situations where more than one variable needs to be determined. For data showing a linear trend that can be described in the form

y = mx + c

we define a quantity

D = n Σ xi² − (Σ xi)²

which should look vaguely familiar ... Then

c = ( Σ yi Σ xi² − Σ xiyi Σ xi ) / D


and

m = ( n Σ xiyi − Σ xi Σ yi ) / D

The residuals are calculated as

εi = yi − c − mxi

and the root-mean-square error as

σ = √[ Σ εi² / (n − 2) ]

The number of degrees of freedom is now ν = n − 2 because there are two constraints on the data, the two parameters m and c. Note that σ is NOT an uncertainty in the slope m! To find the uncertainty on the slope and the intercept we have to go a step further,

σc = σ √( Σ xi² / D )

σm = σ √( n / D )
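These formulas are easy to code up if you want to cross-check a spreadsheet or calculator fit. A minimal Python sketch (the function and variable names are illustrative):

    import math

    def linear_fit(x, y):
        """Unweighted least-squares fit of y = m*x + c using the formulas above.

        Returns (m, c, sigma_m, sigma_c)."""
        n = len(x)
        sum_x = sum(x)
        sum_y = sum(y)
        sum_xx = sum(xi * xi for xi in x)
        sum_xy = sum(xi * yi for xi, yi in zip(x, y))

        D = n * sum_xx - sum_x ** 2
        m = (n * sum_xy - sum_x * sum_y) / D
        c = (sum_y * sum_xx - sum_xy * sum_x) / D

        # Root-mean-square error of the residuals (n - 2 degrees of freedom)
        residuals = [yi - c - m * xi for xi, yi in zip(x, y)]
        sigma = math.sqrt(sum(e * e for e in residuals) / (n - 2))

        sigma_c = sigma * math.sqrt(sum_xx / D)
        sigma_m = sigma * math.sqrt(n / D)
        return m, c, sigma_m, sigma_c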

3.1 Weighted least-squares fitting

The previous section assumed that the uncertainties on each individual measurement were the same. However, the least-squares approach can be further extended to take into account uncertainties on individual measurements. Essentially this is done by assigning a weight to each measurement such that measurements with large uncertainties contribute less to the minimization process than measurements with small uncertainties:

yi ⇒ Wi yi,    Wi = 1/σ²y,i

xi ⇒ Wi xi,    Wi = 1/σ²x,i

where σy,i and σx,i are the uncertainties on the ith y and x values respectively.

This is what the Mathematica template you were given in the first year lab does for you! Some of you will also have calculators that can do this, and Excel has a built-in function – but be careful to check whether you are using an unweighted least-squares fit or a weighted one, since simply plotting the error bars on the data is not enough to make Excel pay attention to them.
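For completeness, here is a sketch of a weighted straight-line fit in Python. The explicit formulas used are the standard weighted least-squares results (they are not derived in these notes), with each point weighted by Wi = 1/σ²y,i:

    import math

    def weighted_linear_fit(x, y, sigma_y):
        """Weighted least-squares fit of y = m*x + c.

        Each point carries a weight w_i = 1 / sigma_y_i**2, so points with large
        uncertainties contribute less to the fit. Returns (m, c, sigma_m, sigma_c)."""
        w = [1.0 / s ** 2 for s in sigma_y]
        S = sum(w)
        Sx = sum(wi * xi for wi, xi in zip(w, x))
        Sy = sum(wi * yi for wi, yi in zip(w, y))
        Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

        delta = S * Sxx - Sx ** 2
        m = (S * Sxy - Sx * Sy) / delta
        c = (Sy * Sxx - Sx * Sxy) / delta
        sigma_m = math.sqrt(S / delta)
        sigma_c = math.sqrt(Sxx / delta)
        return m, c, sigma_m, sigma_c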


4 Testing a hypothesis

In many of the 2nd year lab experiments, you will have a hypothesis concerning how you expect the data to behave – for example, that the stopping voltages and frequencies in the photoelectric effect experiment should obey the relationship

eV = hf − φ (3)

That is, you believe that the stopping voltage is linearly related to the frequency of the incident light through the above equation, with accepted values of e, h and φ. You can use a least-squares fit to obtain values of h and φ from your data, but it is also often worth evaluating how likely it is that your hypothesis is correct, i.e. that the data really are described by the theory you think they're described by.

The first step you should take in testing a hypothesis is to make sure you know what your hypothesis is and what your associated assumptions are. In this example, our assumptions might be

1. that the energy of a photon depends in some way on its frequency;

2. that the energy of a photon is transferred to an electron in the metal of the photocathode;

3. that some of the energy is "used up" in the process of the electron escaping from the metal, and that this amount is a constant characteristic of the material;

4. that the charge on the electron is what we expect (i.e. a constant and equal to 1.6 × 10⁻¹⁹ C);

5. that our equipment allows us to isolate particular frequencies of light; and

6. that we know what those frequencies are!

There may be other assumptions underlying the way we analyse our data, but you get the idea. If we take all these assumptions as reliable, our experiment provides a means of testing whether the energy depends linearly on the frequency and of determining a value for Planck's constant which can be compared to the accepted value.

To test the hypothesis that equation 3 correctly describes the situation, we compare the result of our weighted least-squares fit to a standard probability distribution. One of the most common ways of doing this is by using the χ² statistic.

4.1 χ² goodness-of-fit test

If the observations we make have no associated uncertainty (as may be the case in some statistical experiments), we calculate the test statistic (the χ²) as:

χ² = Σᵢ (Oi − Ei)² / Ei

where Oi is the observed result and Ei is the expected or theoretical result. You may be worried by the idea of observations with no uncertainty, but think about the following (classic) situation. The number of deaths per year by horse- or mule-kick were recorded in 10 Prussian army corps over 20 years. These numbers were exact: if no such deaths occurred, the number was zero. If one such death occurred, the number was exactly one. No uncertainties are attached to such records.


On the other hand, if the observations we make do have associated uncertainties (as is more often the case in the lab), we calculate the test statistic (the χ²) as:

χ² = Σᵢ (Oi − Ei)² / σi²

where σi is the uncertainty on the ith observation.

This statistic can then be compared to the χ² distribution if the number of degrees of freedom is known. The number of degrees of freedom, ν, is the number of values in the calculation of the statistic that are free to vary. This is equal to the number of data points entering into the calculation (n) minus the number of parameters in the fit, p. In the photo-electric effect example, this is equal to n − 2, since the straight line fit includes a calculation of both the intercept and the slope.

χ² distributions are tabulated and available online for a variety of different degrees of freedom. However, a common approach in physics and many other fields is to calculate the χ² statistic per degree of freedom,

χ²/ν = χ²/(n − p)
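Both forms of the statistic are one-liners to compute. A minimal Python sketch (the function name and the comment about reduced χ² are illustrative):

    def chi_squared(observed, expected, sigma=None):
        """Chi-squared test statistic.

        With sigma=None, use (O - E)**2 / E (e.g. for counting data with no
        quoted uncertainties); otherwise use (O - E)**2 / sigma**2."""
        if sigma is None:
            return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        return sum((o - e) ** 2 / s ** 2
                   for o, e, s in zip(observed, expected, sigma))

    # Reduced chi-squared: divide by the number of degrees of freedom, n - p
    # chi2_per_dof = chi_squared(O, E) / (len(O) - p)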

4.1.1 Example

In 1898, von Bortkiewicz published a now famous study on the occurrence of deaths by horse- or mule-kick in 10 Prussian army corps over a period of 20 years. The data can be summarised as follows:

Number of deaths:    0     1     2     3     4
Frequency:         109    65    22     3     1

We believe that such events should be governed by Poisson statistics, so we may make a test to determine whether they follow a Poisson distribution,

P(x) = μ^x e^(−μ) / x!

where μ is the mean of the distribution and P(x) is the probability of observing x deaths in a given corps in a given year.

We can use a χ² goodness-of-fit test to see whether the horse- and mule-kick deaths are consistent with a Poisson distribution. First, we calculate the mean number of deaths per corps per year and find it is 0.61.

Then we calculate the expected frequency with which each number of deaths is observed assuming a Poisson distribution,

N(x) = 200 P(x) = 200 × 0.61^x e^(−0.61) / x!

Then we find the difference between the observed and expected values.

Number of deaths:    0     1     2     3     4
Ei:                109    66    20     4     1
Oi:                109    65    22     3     1
Oi − Ei:             0    −1     2    −1     0


Now we calculate the χ² statistic, obtaining χ² = 0.448.

In this case we have five data points and one calculated parameter (the mean of the distribution), giving four degrees of freedom. This gives us a χ² per degree of freedom of 0.112. We then look up or calculate a P-value – that is, the probability that we would have obtained a χ² higher than this if our hypothesis (that the data are described by a Poisson distribution) is correct. There are online calculators such as the one at http://stattrek.com/Tables/ChiSquare.aspx which you can use (just try googling chi-square distribution calculator if that link is defunct). A table of P-values for a χ² distribution with one degree of freedom is included at the end of this document.

We find the probability of getting a larger χ² statistic is 98%! This is equivalent to saying that a much larger deviation (and hence larger χ² statistic) would not have been unlikely. In fact, the data are spookily close to a perfect Poisson distribution, with no "noise" or randomness ... but also (apparently) genuine.
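The whole calculation is easy to script. A minimal Python sketch (note that the exact χ² you get depends on how much you round the expected frequencies, since the last bin has a very small expected count; scipy, if you have it, can supply the P-value):

    import math

    # von Bortkiewicz data from the table above
    deaths = [0, 1, 2, 3, 4]
    observed = [109, 65, 22, 3, 1]

    total = sum(observed)                                        # 200 corps-years
    mean = sum(d * o for d, o in zip(deaths, observed)) / total  # 0.61

    # Expected frequencies assuming a Poisson distribution
    expected = [total * mean ** d * math.exp(-mean) / math.factorial(d)
                for d in deaths]

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # About 0.7 with unrounded Ei; using the rounded Ei from the table gives a
    # value closer to the 0.448 quoted above.
    print(chi2)

    # P-value (probability of a larger chi-squared), e.g. with scipy if installed:
    # from scipy.stats import chi2 as chi2_dist
    # print(chi2_dist.sf(chi2, df=4))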

4.1.2 Example

In a particular lab, you measure the extension of a metal wire as a function of applied load and obtain the following data:

Load (kg), ±0.001 kg:       0.100   0.200   0.300   0.400   0.500
Extension (mm), ±0.2 mm:      0.5     1.1     1.6     2.3     3.4

Your hypothesis is that the stress applied to the wire has not exceeded the proportional limit, so that the data should obey Hooke's law and the strain is proportional to the stress as

F/A = Y ∆l/l0   ⇒   ∆l = (l0/(AY)) F = κF

If we assume that the cross-sectional area A does not change as the wire stretches, we can calculate a χ² statistic to test the goodness of fit of our straight line hypothesis.

First, we perform a linear regression to find the constant of proportionality κ. Although we have two sets of uncertainties, the fractional uncertainty on the applied load is much smaller than that on the measured extension, so we can neglect the uncertainty on the load. In fact, since the uncertainties are the same on all data points we can get away with ignoring them altogether for the step of doing the linear regression.

A linear regression on ∆l vs F gives a slope of κ = 0.0007143 m N⁻¹ (with the extension in metres and the load converted to a force in newtons). But what is the χ² value – i.e., was the straight line fit a sensible thing to do in the first place?

Calculate the χ² statistic for this example for yourself – use a spreadsheet to make it easier. You will find that the χ² is significantly greater than 1.

How many degrees of freedom are there in this problem? Remember the linear regression gives values for both the slope and the intercept – that is, two parameters come out of the fit ...

The χ² per degree of freedom is still significantly greater than one. This tells us that this is not a good fit, so some of our assumptions must be wrong. One possibility is that our uncertainty estimates are too small – but if we want to mess with them, we must have a good reason. Maybe we didn't take parallax into account, or maybe we're just extremely incompetent. Unless you can actually justify your actions, never mess with your uncertainty estimates just to make a fit seem better.


What else could be wrong? Well, perhaps the data don't actually follow a straight line at all. We know that if you stretch a wire enough, it will exceed its elastic limit – at some applied stress, planes of molecules in the metal start to slide past each other and the extension rapidly increases. Perhaps we're seeing the onset of that with the last point. To test this hypothesis, remove the point, redo the linear regression and goodness of fit test, and see what the value of χ² is now ...