I Introduction to Measurement and Data Analysis


Measurement

In physics lab the activity in which you will most frequently be engaged is measuring things. Using a wide variety of measuring instruments you will measure times, temperatures, masses, forces, speeds, frequencies, energies, and many more physical quantities. The tools you will use will span a range of technologies from the simple (such as a ruler) to the complex (perhaps a digital computer). Certainly it would be worthwhile to devote a little time and thought to some of the details of “measuring things” that may not yet have occurred to you.

True Value - How Tall?

At first thought you might suppose that the goal of measurement is a very straightforward one: find the true value of the thing being measured. Alas, things are seldom as simple as we would like. Consider the following “case study.”

Suppose you wished to measure how tall your lab partner is. One way might be to simply look at him or her and estimate, “Oh, about five-nine,” meaning five feet, nine inches tall. But of course you couldn’t be sure that five-eight or five-ten, or even five-eleven, might not be a better estimate. In other words, your measurement (estimate) is uncertain by some amount, perhaps an inch or two either way. The “true value” lies somewhere within a range of uncertainty and one way to express this notion is to say that your partner’s height is

Five feet nine inches, plus or minus two inches.

or

69 ± 2 inches.

Now it should begin to be clear that at least one of the goals of measurement is to reduce the uncertainty to as small an amount as is feasible and useful. Feasible and useful are important adjectives here.

In some texts the uncertainty may be referred to as the error in the measurement. But our ordinary understanding of the word “error” implies some sort of a mistake or blunder. That is not the intended meaning here. Uncertainties occur, regardless of the amount of care and attention paid to the measuring process. We shall discuss ways to estimate and classify these unavoidable uncertainties later. Mistakes and blunders can (and must) be ruthlessly hunted down and eliminated.

In your quest to determine your partner’s height you might go and get a tape measure, marked off every eighth of an inch. A new measurement might allow you to state that his or her height is “Five feet nine and three-quarters inches.” But since there are marks on the tape only every eighth of an inch you are really saying that the height is closer to five nine and three-quarters than it is to five nine and five-eighths or to five nine and seven-eighths. It is reasonable to expect that your measurement is “good” to within half of the smallest division on the tape, in this case half of an eighth, or a sixteenth. You now express the height as

Five feet nine and three-quarter inches, plus or minus one-sixteenth inch.

or

69.75 ± 0.06 inches.

Note that we have rounded off one-sixteenth (0.0625) to 0.06. More about that later. Now this is clearly a “better” measurement. The uncertainty is quite a bit smaller. But is it feasible to make it even smaller still? If you were to continue this obsession with measuring your partner (and your partner consented) you might purchase a precision stainless steel tape that is marked off every one hundredth of an inch, thus reducing the uncertainty to 0.005 inches. Or perhaps a research grant from MA (Measurers Anonymous) would fund the purchase of a laser interferometer, capable of measuring to within a wavelength of light (about 0.00002 inches). Would you have at last found the “true height” of your partner? Well, you would be able to express his or her height as

69.74843 ± 0.00001 inches.

But you note, to your dismay, that the measurement is still uncertain. All the King’s horses and all the King’s men (or, to use a more modern idiom, all the money in the National Science Foundation) cannot give you the means to find the “true height” of your partner. In fact, some modern findings in quantum physics place some fundamental limits on our ability to measure things. (Look up the Heisenberg Uncertainty Principle in your physics text.)

But wait, things get worse yet. As you carried out this exercise you probably discovered that your tape measure (the cheap one, made of mylar plastic) would stretch somewhat, depending on how tightly you pulled it. When you got the expensive stainless steel tape you hoped to eliminate that problem, but no, stainless steel can stretch too. Not only that, now that you can measure more carefully you probably notice that the steel tape expands and contracts as the temperature changes.

As if the problem wasn’t ugly enough, your partner remembers reading in a physiology textbook that a person’s height actually varies over the course of the day, being greater in the morning (after horizontal sleep) and less after gravity has done its work of compressing the vertebrae for a few hours. A few quick checks over a several-hour period reveal the awful truth - there is no “one true height” for your partner.

Before you throw up your hands and run screaming from the lab recall the words above, feasible and useful. While it is certainly feasible to measure a person’s height with a laser interferometer it is hardly useful. One reason to measure height might be to get the right size for an article of clothing. For this purpose your original, cheap, mylar tape measure is perfectly adequate. Were you measuring rocket engine parts for the space shuttle, the interferometer might be an absolute essential.

All measurements are uncertain to some degree. That is an inescapable fact. Some of the uncertainty is due to the limitations of the measuring instrument (such as stretchy tapes, fuzzy markings, etc.) and some is due to natural variations in the thing being measured (people who shrink after they awaken, or steel balls that expand when heated).

The drawing at right shows your ruler, marked every eighth of an inch, and the arrow indicates the height of your partner. Note that you must make an estimate of which eighth-inch mark is closest to the arrow.

Whenever you record a measurement you should make note of the uncertainty inherent in that measurement. The “plus or minus” notation is a good way to do so. In this laboratory we will adhere to the following rules:

RULE 1. Uncertainties shall be rounded off to one significant digit.

RULE 2. Measurements shall be rounded off to the digit in which they are uncertain.


For example, if you had measured a time interval as 2.2475 seconds with an uncertainty of 0.0166 seconds then you should record the measurement as

2.25 ± 0.02 seconds.

First you rounded off the uncertainty to 1 significant digit (0.0166 → 0.02). Thus the uncertainty lies in the “hundredths” digit. So you then rounded off the measured value to the nearest hundredth (2.2475 → 2.25). By this means you are stating that the “true value” lies somewhere between 2.23 and 2.27 seconds. At first glance this seems pessimistic, that it “throws away” useful information. But in fact any other expression is just false security, leading others to believe that the measurement was better than it really was. As a practical matter, in teaching labs such as this course, the tendency is to underestimate, rather than overestimate, uncertainties, so the above rules are realistic ones.
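Rules 1 and 2 are mechanical enough to sketch in a few lines of Python. The function name `round_measurement` is hypothetical. The sketch assumes the value and uncertainty are supplied as decimal strings, so that binary floating-point error cannot disturb the rounding, and it assumes halves round upward.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_measurement(value: str, uncertainty: str) -> tuple[Decimal, Decimal]:
    """Apply Rules 1 and 2 to a measured value and its uncertainty.

    Both arguments are decimal strings (an assumption of this sketch).
    """
    sigma = Decimal(uncertainty)
    if sigma <= 0:
        raise ValueError("uncertainty must be positive")
    # Rule 1: keep one significant digit of the uncertainty.
    # adjusted() is the power of ten of the leading digit.
    quantum = Decimal(1).scaleb(sigma.adjusted())
    sigma = sigma.quantize(quantum, rounding=ROUND_HALF_UP)
    # Rounding up can carry into a new leading digit (0.096 -> 0.10),
    # so recompute the quantum and round once more.
    quantum = Decimal(1).scaleb(sigma.adjusted())
    sigma = sigma.quantize(quantum, rounding=ROUND_HALF_UP)
    # Rule 2: round the value to the digit in which it is uncertain.
    rounded = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    return rounded, sigma


value, sigma = round_measurement("2.2475", "0.0166")
print(f"{value} ± {sigma} seconds")
```

For the time interval in the text this prints `2.25 ± 0.02 seconds`, and the tape-measure case (`"69.75"`, `"0.0625"`) gives 69.75 ± 0.06.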

Precision and Accuracy

Common use of our language assigns roughly similar meaning to the terms “precise” and “accurate.” But in scientific terms the words have quite different implications.

Precision in measurement implies the ability to distinguish between closely spaced values. An electronic digital stopwatch, which “reads out” to 0.01 seconds, is much more precise than a wristwatch with a sweep second hand, which can be “read” to a half second at best. Thus the number of significant digits tells something about the precision of the measurement.

Special Note: A few years ago the Detroit automakers were faced with the marketing problem of getting car buyers to accept smaller cars. One of them, who shall remain nameless, decided that, rather than call them “small” or “compact” or “economy” or “personal sized,” it would coin the term “precision sized.” The English language is a wonderful thing, is it not?

All instruments have a smallest increment that can be detected. This smallest increment is called the least count of the instrument. A meter stick has markings every millimeter (0.001 meter) so its least count is 0.001 meter. A medical “fever thermometer” is usually marked off in tenths of a degree, so its least count is 0.1 degree. The more precise an instrument is, the smaller will be the least count.

(a) Target Shooting. Precise, but not very accurate.

(b) Target Shooting. Accurate, but not very precise.

(c) Target Shooting. Neither precise nor accurate.

(d) Target Shooting. Both precise and accurate.

Accuracy refers to the ability of an instrument to give a reading that compares favorably with generally accepted values. It really has little or nothing to do with the precision of the instrument.

We know that water freezes at zero degrees Celsius. A thermometer placed in a mixture of ice and water should, if it is accurate, read very nearly zero degrees. In fact, you would expect it to read within one “division” of zero. That is, zero, within its least count. If it does not read zero within its least count then the precision of the markings is not useful, because the thermometer gives inaccurate readings.

Accuracy of instruments can only be checked by comparing the instrument against a comparison standard. You can purchase a “standard kilogram mass” from a science supply company and use it to check the accuracy of scales. The “ice-water” bath is a common check for thermometers, as is a “boiling water” bath. A “standard battery cell” is used to check voltmeters.

In labs doing very careful and intricate measurements the instruments will be routinely calibrated and checked against comparison standards. The instrument will usually have a tag or sticker telling when it was calibrated last, by whom, and how closely it agreed with the comparison standard.

In a teaching lab such as this course you have no recourse but to presume that the instruments are accurate to within their least count. Should you suspect that an instrument is inaccurate due to a malfunction you should ask your Instructor to check it for you. Because you can judge the precision of an instrument by its marked divisions, but must presume its accuracy, the following rule applies in this class:

RULE 3. The minimum uncertainty of a single measured value is presumed to be one-half of the least count of the instrument used to make the measurement.

Were you to measure the length of a sheet of tablet paper with a ruler marked off every millimeter the measurement could correctly be recorded as

279.0 ± 0.5 millimeters

Digital Instruments

We must slightly modify the rule above when the measuring instrument indicates the measured value with a digital (numerical) display. In this case you are not given the opportunity to estimate whether the indicated value is closer to one reading or another. The instrument may be “rounding off” or it may simply be truncating (discarding extra digits) to fit the size of its display.

Even the NBA (National Basketball Association), and now the NCAA, recognized this physics problem when they required that game clocks be able to display fractions of a second during the last minute of play. Previously, if the game clock showed 4 seconds remaining, there was no way to tell if there were 4.9 seconds or 4.0 seconds, since the clock truncated the tenths of seconds. And nine-tenths of a second can be an eternity for a defensive player facing former LSU Tiger Shaquille O’Neal “in the paint.” Today’s game clock will now show tenths of a second during the last minute. So the clock might now display 4.3 seconds. Now this display is still uncertain. It could be as much as 4.39 s or as little as 4.30 s, since it is now the hundredths digit that is truncated. But now the uncertainty is of a magnitude that is negligible, at least insofar as designing basketball strategy is concerned. All of this discussion leads us to the following corollary to Rule 3:

RULE 3(A). Typically, the minimum uncertainty of a single measurement made with an instrument incorporating a digital readout is equal to the value of the least significant digit (least count) of the display, but check with the instrument’s documentation to make sure.

For example, suppose you thought you were sick and had a fever and you took your temperature with both a conventional mercury thermometer and a digital electronic thermometer. Perhaps both indicated 101.4 °F. The correct representation of each measurement, assuming that the least counts of both thermometers were 0.1 °F, would be

Mercury thermometer: 101.40° ± 0.05° F

Digital thermometer: 101.4° ± 0.1° F

As you can see, just because the instrument is modern, electronic, and digital doesn’t always mean it’s better.
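The difference between Rule 3 and Rule 3(A) can be written as a one-line choice. The following is a minimal sketch; the function name `instrument_uncertainty` and its `digital` flag are hypothetical, and it assumes the digital instrument's documentation does not specify something different.

```python
def instrument_uncertainty(least_count: float, digital: bool = False) -> float:
    """Minimum uncertainty of a single reading (Rules 3 and 3A)."""
    if least_count <= 0:
        raise ValueError("least count must be positive")
    # Rule 3A: a digital display may round or truncate, so use the full
    # least count. Rule 3: an analog scale can be read to half a division.
    return least_count if digital else least_count / 2


mercury = instrument_uncertainty(0.1)                 # analog scale
digital = instrument_uncertainty(0.1, digital=True)   # digital display
print(mercury, digital)
```

With a least count of 0.1 °F this reproduces the two thermometer uncertainties above, 0.05 °F and 0.1 °F.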

Multiple Measurements

One way to improve your confidence in a measurement is to repeat the measurement several times. This is especially valuable when measuring things that are themselves somewhat variable. You would expect that repeating the measurement would allow you to determine a value that is somehow “better” than that given by a single reading.

A baseball is not a perfect sphere. It has raised seams, the leather is not of uniform thickness, and so on. If you set out to determine the “true diameter” of a baseball you would probably measure its diameter at several different places. Suppose that the following were measurements of the diameter of a baseball, taken with a ruler whose least count is one millimeter.

(each ± 0.5 mm) 72 mm, 74 mm, 75 mm, 72 mm, 73 mm, 73 mm, 75 mm, 73 mm

After inspecting these data you could reasonably conclude that the diameter lies somewhere between the lowest value (72 mm) and the highest (75 mm). If forced to settle on one single number to report you might calculate the mean value (simply the average) as 73.375 mm. But you are now aware that such a statement would be claiming an awful lot of precision, probably not justifiable, given the least count of the ruler and the variability of the baseball.

What you really want to do is to make a claim of the most likely diameter of the baseball, as well as an estimate of the uncertainty of that claim. The most likely value is, intuitively, just the mean value. But the uncertainty deserves some deeper thought. Let us make a table of data, showing each of the measured values and the amount by which each measurement differs from the mean of all the measurements (the deviation from the mean).

Measured value, x_i    Deviation, d_i = x_i − x_avg

72 mm                  72 − 73.375 = −1.375 mm
74 mm                  74 − 73.375 = 0.625 mm
75 mm                  75 − 73.375 = 1.625 mm
72 mm                  72 − 73.375 = −1.375 mm
73 mm                  73 − 73.375 = −0.375 mm
73 mm                  73 − 73.375 = −0.375 mm
75 mm                  75 − 73.375 = 1.625 mm
73 mm                  73 − 73.375 = −0.375 mm

Perhaps the average of the deviations from the mean would provide a good estimate of the uncertainty. But upon close inspection you will find that the sum of the deviations, and hence their average, is exactly zero. That should come as no surprise, for, by definition, some of the values are a bit larger than the mean and some are a bit smaller, so that the sum of the deviations will always be zero, for any collection of data.

The equation for the mean, where the bar over the symbol indicates the mean value, and N is the number of individual measurements, is

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (1)$$
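The claim that the deviations always sum to zero follows directly from the definition of the mean in equation (1). As a short check, writing each deviation as $d_i = x_i - \bar{x}$:

$$\sum_{i=1}^{N} d_i = \sum_{i=1}^{N} (x_i - \bar{x}) = \sum_{i=1}^{N} x_i - N\bar{x} = N\bar{x} - N\bar{x} = 0$$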

But it is the amount of deviation that is of interest, not whether it is above or below the mean. If all of the deviations were converted to positive numbers before averaging a more useful estimate of the uncertainty might be made. One way to convert the deviations to positive numbers is to take the absolute value of each. But another, and ultimately more useful, way is to square them. The data table now looks like this:

x_i       d_i           (d_i)^2

72 mm     −1.375 mm     1.891
74 mm     0.625 mm      0.391
75 mm     1.625 mm      2.641
72 mm     −1.375 mm     1.891
73 mm     −0.375 mm     0.141
73 mm     −0.375 mm     0.141
75 mm     1.625 mm      2.641
73 mm     −0.375 mm     0.141

The mean of the squares of the deviations from the mean is now 1.23 mm². If you take the square root of this result (in effect undoing the squaring done earlier) then you obtain the square root of the mean of the squares of the deviations from the mean, which turns out to be 1.11 mm. That mouthful of a name is usually shortened to just root-mean-square (or even shorter to R.M.S.) and the result is called the standard deviation of the set of values x_i, usually represented by the Greek letter sigma (σ). In equation form:

$$\sigma_x = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} d_i^2}$$

Using the standard deviation as an estimate of the uncertainty and applying rules 1 and 2, then

baseball diameter = 73± 1 millimeters

Statisticians, unfortunately, cannot leave well enough alone and insist that there are very good arguments for computing the standard deviation by dividing by one fewer than the number of measurements. In effect the N in the equation for the standard deviation above is simply replaced by (N − 1).

$$\sigma_x = \sqrt{\frac{1}{N - 1} \sum_{i=1}^{N} (x_i - \bar{x})^2} \qquad (2)$$

Nothing is to be gained by attempting to prove the correctness of this assertion, but it is correct and we shall use this latter expression. The earlier expression (using N) is usually called the population standard deviation, while the latter (using N − 1) is called the sample standard deviation. For the data in the table above the population standard deviation is 1.11 mm, and the sample standard deviation is 1.19 mm. You can see that the correct expression for the diameter of the baseball is the same in either case.

If you inspect the two equations for standard deviations it should be obvious that as the number of measurements, N, becomes quite large, the two types of standard deviation become very nearly equal. And, statistically speaking, unless there are at least five or six measurements, standard deviation analysis of any kind is not a very good predictor of uncertainty. Within this laboratory guide, when the symbol σ or the term standard deviation is used, you should presume that the meaning is that of sample standard deviation. Furthermore the following rules will apply:

RULE 4. The most likely value of a quantity that has been measured several times is the mean value of the series of measurements.

RULE 5. The uncertainty in the value of a quantity that has been measured several times is the sample standard deviation of the series of measurements.
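The mean and both standard deviations quoted for the baseball data can be checked with Python's standard `statistics` module. This is only a sketch of the arithmetic; the variable names are ours.

```python
import statistics

# Baseball diameters from the text, in millimeters.
diameters_mm = [72, 74, 75, 72, 73, 73, 75, 73]

mean = statistics.mean(diameters_mm)         # Rule 4: most likely value
pop_sd = statistics.pstdev(diameters_mm)     # divides by N
sample_sd = statistics.stdev(diameters_mm)   # divides by N - 1 (Rule 5)

print(f"mean = {mean} mm")
print(f"population standard deviation = {pop_sd:.2f} mm")
print(f"sample standard deviation = {sample_sd:.2f} mm")
```

Either standard deviation rounds to 1 mm under Rule 1, so the reported diameter is 73 ± 1 millimeters in both cases.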

Absolute and Fractional Uncertainty

The uncertainties which you have been calculating thus far are more precisely called the absolute uncertainty. The absolute uncertainty is expressed using the same units as the measurement. In the data above the absolute uncertainty in the baseball diameter is 1 mm.

If you divide the absolute uncertainty by the most likely value of the measurement you obtain the fractional uncertainty. Using the baseball data

$$\text{fractional uncertainty} = \frac{\sigma_D}{\bar{D}}$$

In this case

$$\frac{\sigma_D}{\bar{D}} = \frac{1\ \text{mm}}{73\ \text{mm}} = 0.014$$

Since the units (mm) cancel out, the fractional uncertainty is a dimensionless quantity, being the ratio of two numbers of the same units. Multiplying by 100 turns the fractional uncertainty into the percent uncertainty. In this case, the percent uncertainty in the baseball diameter is 1.4%.

Both fractional and absolute uncertainties play very important roles as we investigate the effect of measurement uncertainty on calculations which use those measurements.

Addition and Subtraction of Uncertain Values

When two or more numbers, each of which is uncertain, are added (or subtracted) you should expect that the sum (or difference) will also be uncertain. Let us agree on some notation and see what happens.

$$X = \bar{X} \pm \sigma_X \qquad Y = \bar{Y} \pm \sigma_Y \qquad Z = X + Y$$

$$X + Y = (\bar{X} \pm \sigma_X) + (\bar{Y} \pm \sigma_Y) = (\bar{X} + \bar{Y}) \pm (\sigma_X + \sigma_Y) = \bar{Z} \pm \sigma_Z$$

$$\sigma_Z = \sigma_X + \sigma_Y \qquad (3)$$

Notice that we have presumed the “worst case condition” and taken both uncertainties as positive numbers and added them. In the “real world” occasionally one will partially “cancel out” the other, but you cannot count on that. Subtraction is just adding the negative of a number, thus:

RULE 6. When uncertain values are added (or subtracted) the absolute uncertainty of the sum (or difference) is the sum of the absolute uncertainties of each value.

Once again, statisticians will argue that, because uncertainties do sometimes “cancel out,” Rule 6 gives too large a result and the uncertainty of a sum should be the quadrature sum (square root of the sum of the squares) of the absolute uncertainties. In equation form:

$$Z = X + Y \qquad \sigma_Z = \sqrt{\sigma_X^2 + \sigma_Y^2}$$

But because, as stated earlier, uncertainties in a teaching lab are more often understated than overstated, we will adhere to Rule 6 for most of the calculations you will perform.
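To see how much the two estimates differ, here is a minimal Python sketch that computes both for a sum. The function names and the two lengths are hypothetical illustrations, not data from this course.

```python
import math


def sum_uncertainty(*sigmas: float) -> float:
    """Rule 6: worst-case uncertainty of a sum or difference."""
    return sum(abs(s) for s in sigmas)


def quadrature_uncertainty(*sigmas: float) -> float:
    """Quadrature sum: square root of the sum of the squares."""
    return math.sqrt(sum(s * s for s in sigmas))


# Hypothetical example: two lengths laid end to end, in millimeters,
# each read from a ruler with a 1 mm least count.
x, sigma_x = 279.0, 0.5
y, sigma_y = 216.0, 0.5
z = x + y

print(z, sum_uncertainty(sigma_x, sigma_y))                   # Rule 6
print(z, round(quadrature_uncertainty(sigma_x, sigma_y), 2))  # quadrature
```

Rule 6 gives 1.0 mm here, while the quadrature sum gives about 0.7 mm; the same functions apply unchanged to a difference of the two lengths.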

Multiplying and Dividing Uncertain Values

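Using the same notation as before you can investigate what happens when uncertain measurements are part of products or quotients.

If $Z = XY$, then

$$Z = (\bar{X} \pm \sigma_X)(\bar{Y} \pm \sigma_Y) = \bar{X}\bar{Y} \pm \sigma_X \bar{Y} \pm \sigma_Y \bar{X} \pm \sigma_X \sigma_Y$$

Rearranging,

$$Z = \bar{X}\bar{Y} \pm (\sigma_X \bar{Y} \pm \sigma_Y \bar{X}) \pm \sigma_X \sigma_Y$$

If $\sigma_X \sigma_Y \ll (\sigma_X \bar{Y} \pm \sigma_Y \bar{X})$, then $\sigma_Z = (\sigma_X \bar{Y} \pm \sigma_Y \bar{X})$.

But, taking the worst case in which the two terms add,

$$\frac{\sigma_Z}{\bar{Z}} = \frac{\sigma_X \bar{Y} + \sigma_Y \bar{X}}{\bar{X}\bar{Y}} = \frac{\sigma_X}{\bar{X}} + \frac{\sigma_Y}{\bar{Y}}$$

Therefore,

$$\frac{\sigma_Z}{\bar{Z}} = \frac{\sigma_X}{\bar{X}} + \frac{\sigma_Y}{\bar{Y}}$$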

If the calculation involves raising either or both measured values to a power other than 1, the above relation is slightly modified. Since raising to a power is the same sort of operation as multiplying or dividing [$x^2 = x \cdot x$] and negative powers are just reciprocals of positive powers [$x^{-2} = (1/x) \cdot (1/x)$] the change is a minor one.

If $Z = X^m Y^n$, then

$$\frac{\sigma_Z}{\bar{Z}} = |m| \frac{\sigma_X}{\bar{X}} + |n| \frac{\sigma_Y}{\bar{Y}} \qquad (4)$$

Note that the exponents, m and n, may be positive or negative, but that the absolute value operation renders their contribution to the uncertainty always positive.

But statisticians once more intrude and remind us that, for the same reasons noted above, a quadrature sum is a more correct estimate of the uncertainty. In equation form,

If $Z = X^m Y^n$, then

$$\left(\frac{\sigma_Z}{\bar{Z}}\right)^2 = \left(\frac{m\,\sigma_X}{\bar{X}}\right)^2 + \left(\frac{n\,\sigma_Y}{\bar{Y}}\right)^2$$

Note that the absolute value operations have disappeared, but squaring each term serves a similar purpose in keeping all the terms positive. Notwithstanding the statisticians’ cry of “Rigor, rigor!” we shall dispense with the quadrature sum for the purposes of this course and use the simpler algebraic sum. Here then is the next rule:

RULE 7. When uncertain values (possibly raised to powers) are multiplied (or divided) the fractional uncertainty of the product (or quotient) is the sum of the fractional uncertainties of the individual values, each fractional uncertainty having been multiplied by the absolute value of the power to which that measurement was raised in the calculation.

Rules 6 and 7 incorporate procedures generally referred to as the propagation of uncertainties through calculations. Whenever calculations are performed which involve numbers which are uncertain you must apply these rules and propagate the uncertainty in order to estimate the uncertainty in the result of the calculations.

In addition remember that, if the fractional uncertainty and mean value are already known, the absolute uncertainty can be calculated by

$$\sigma_D = \bar{D} \cdot (\text{fractional uncertainty})$$
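Rule 7 and the conversion back to an absolute uncertainty can be sketched in Python as follows. The function name `product_fractional_uncertainty` and the table-top dimensions are hypothetical, chosen only to illustrate the arithmetic.

```python
def product_fractional_uncertainty(terms):
    """Rule 7: fractional uncertainty of a product of powers.

    terms is a list of (value, sigma, power) tuples, one per measured
    quantity; a quantity in a denominator gets a negative power.
    """
    return sum(abs(power) * sigma / abs(value) for value, sigma, power in terms)


# Hypothetical table top: length and width in centimeters.
length, sigma_length = 150.0, 0.5
width, sigma_width = 75.0, 0.5

area = length * width
frac = product_fractional_uncertainty(
    [(length, sigma_length, 1), (width, sigma_width, 1)]
)
sigma_area = area * frac  # absolute uncertainty = value * fractional uncertainty
print(area, frac, sigma_area)
```

Here the fractional uncertainty of the area is about 0.01 (1%), so the absolute uncertainty is about 112.5 cm², which Rule 1 rounds to 100 cm². For a quotient such as speed = distance / time, the time would simply be entered with power −1.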

Reporting uncertainties in laboratory reports

Lab reports should always include consideration of uncertainty, both in raw data and in the final results of calculations. The table below will help to clarify which are the reportable uncertainties for various kinds of reportable results. Here we use the term direct measurement to mean a quantity read straight from the dial of the instrument, such as the length or width of a table. An indirect measurement is a value arrived at as the result of a calculation, such as the area of a table top.

Kind of Measurement              Uncertainty that should be reported

Single, direct                   Instrument uncertainty
Multiple, direct measurements    Sample standard deviation or instrument uncertainty (whichever is larger)
Indirect                         Propagation of uncertainty according to formulae
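For direct measurements, the choice described in the table can be sketched as a small Python helper. The function name `reportable_uncertainty` is hypothetical, and the sketch assumes the instrument uncertainty has already been found from Rule 3 or Rule 3(A).

```python
import statistics


def reportable_uncertainty(readings, instrument_sigma):
    """Pick the uncertainty to report for direct measurements.

    One reading: the instrument uncertainty. Several readings: the larger
    of the sample standard deviation and the instrument uncertainty.
    """
    if len(readings) < 2:
        return instrument_sigma
    return max(statistics.stdev(readings), instrument_sigma)


baseball_mm = [72, 74, 75, 72, 73, 73, 75, 73]
print(reportable_uncertainty(baseball_mm, 0.5))   # spread of readings dominates
print(reportable_uncertainty([73, 73, 73], 0.5))  # instrument limit dominates
print(reportable_uncertainty([73], 0.5))          # single reading
```

The second call shows why the "whichever is larger" clause matters: identical readings have a standard deviation of zero, yet the ruler still cannot be read to better than half its least count.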

Sources of Uncertainty

You should now have a fairly good idea of why measurements are uncertain. Part of the uncertainty is due to the limitations in the precision and accuracy of the measuring instrument and part is due to actual variations in the thing being measured. It is important that you realize and understand that, in many cases, you cannot know the individual contributions. You can only analyze your data and determine the overall uncertainty.

There are a few exceptions to that statement. For instance, if you were measuring the diameters of all of the apples in a crate and found the uncertainty (sample standard deviation) to be over 10 mm, having used a ruler marked every one mm, you should rightfully attribute the variation to the apples, not to changes in the ruler. A useful “rule of thumb” is that if the uncertainty is significantly larger than the least count of the instrument (and the instrument is not malfunctioning) then the uncertainty is mostly attributable to the thing being measured.

Random Uncertainties

Random uncertainties are as completely unpredictable as the roll of a pair of dice. They may be due to instrument fluctuations or to variations in the measured quantity, and you cannot know which. It is possible to estimate the magnitude of random uncertainty. Such methods as the standard deviation and the techniques used when adding, subtracting, multiplying, and dividing measurements will allow you to make good estimates of random uncertainties. Better instruments, more care in measurement, and consistent procedures will all help to minimize random uncertainty. But it can never be eliminated.

Consistent Bias

Bias refers to a discrepancy in a measured quantity due to a flaw in the measuring instrument. Consistent means the flaw is always present and is present in a uniform (though not necessarily known) fashion. If you can find out what the consistent bias is you can compensate for it as you analyze your data. There are a number of types of consistent bias; here are just a few.

Zero bias. If an instrument does not read zero when there is no input, then it exhibits a zero bias. For example, suppose the first 3 mm of your meter stick had broken off. Any measurement made with that stick would always be 3 mm too long. But if you noticed it you could just subtract 3 mm from each measurement and there would be no problem. Another example might be a light intensity meter. It should read zero in the dark. If it doesn’t, there is a zero bias to contend with.

Proportional bias. If an instrument reads high or low by a constant percentage of its reading then it exhibits proportional bias. Consider a clock that runs 1 minute fast per day. If you set it at midnight, by the following midnight it would be 1 minute fast, or about 0.07% fast. After one more day it would be 2 minutes fast, but that would still be 0.07% of the total time. At the end of a full year it would be 365 minutes fast, about 6 hours. But 6 hours compared to a year is still just 0.07%. As before, if you notice the problem you can compensate for it. As you will see when we discuss graphs of data, proportional bias shows up in the slope of a graph, while zero bias shifts the whole graph a fixed amount (up, down, left, or right).
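The clock arithmetic, and one way of compensating for a known proportional bias, can be sketched in Python. The helper `corrected_elapsed` is hypothetical, and the sketch assumes the clock gains exactly 1 minute per true day.

```python
MINUTES_PER_DAY = 24 * 60  # 1440

gain_per_day_min = 1.0  # the clock in the text gains 1 minute per day
fraction_fast = gain_per_day_min / MINUTES_PER_DAY


def corrected_elapsed(indicated_minutes: float) -> float:
    """Remove a known proportional bias from an indicated elapsed time."""
    return indicated_minutes / (1 + fraction_fast)


print(f"{fraction_fast:.2%}")                         # fractional error, as a percent
print(round(365 * gain_per_day_min / 60, 1))          # hours gained in a 365-day year
print(round(corrected_elapsed(MINUTES_PER_DAY + 1)))  # one true day, in minutes
```

The fractional error stays at about 0.07% however long the clock runs, which is what makes the bias proportional: dividing any indicated interval by the same factor recovers the true interval.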

Nonlinear bias. A discrepancy which is repeatable but non-uniform on either an absolute or a percentage basis exhibits a nonlinear bias. An example might be a spring-wound clock that gradually slows down as the spring tension decreases. On a graph such a discrepancy will show up as a distortion of the shape of the graph. For instance, a straight line is distorted into a curve.

Mistakes and Blunders

They do happen, but are neither acceptable nor excusable. When you catch yourself in a mistake you must repeat the measurements or calculations. If that is not possible you must eliminate the mistaken data from any consideration. Under no circumstances may you submit a report which includes data or calculations which are known to be mistaken. Such action is completely contrary to the principles of ethics in science. If eliminating the mistaken data means that you cannot complete the experimental procedure then you must accept that fact and its consequences. In class that will mean a lower grade on an assignment. In your professional career it may have far more serious ramifications. So let’s be careful out there.

Graphical Analysis

In many of your experiments your goal will be to find the relationships between measured quantities. One of the best ways to investigate relationships is to draw graphs. Graphs give you a visual clue as to how quantities are related, and the visual clues can suggest mathematical relationships.

A good graph should be as completely self-contained as possible, and should require a minimum of reference to explanatory text. Follow these precepts and your graphs will be attractive and useful:

1. Use graph paper. A grid of 5 squares per cm or 10 per inch will serve for most graphs.

2. Use a sharp-tipped pen or pencil. Broad lines are difficult to read.

3. Draw the graph as nearly full-page as possible, given the range of data and a convenient scale factor. A compressed graph is difficult to read.

4. Give the graph a title. The title should be concise, but should tell what is being graphed, and from what sort of procedure the data were taken.

5. Use the horizontal (x) axis for the independent variable and the vertical (y) axis for the dependent variable. (More details on that later.)

6. Label each axis with the name of the quantity and the units in which it is expressed.

7. Select a scale for each axis which includes zero whenever possible.

8. Use vertical and/or horizontal bars to indicate the uncertainties in the plotted quantities.

9. Draw a straight line (with a straightedge) or a smooth curve, as appropriate, through the plotted points. In general, due to random uncertainties, some of the points will not lie on the line or curve. In fact, statistically, fully a third of them may not fall within their uncertainty range of the line or curve.

There are a number of personal computer programs which perform graphical display and analysis. You will use one or more of these programs in this class.

Making and using a graph

Suppose an experiment is conducted in which an automobile is smoothly accelerated from a standing start and its speed and position are recorded every 5 seconds for a total elapsed time of 30 seconds. The speed will be read from the speedometer, whose least count is 5 miles per hour. The position will be read from the odometer, whose least count is 0.1 miles. The timing of the measurements will be controlled by beeps from an electronic digital timer, presumed correct to within 0.01 seconds. The table of data might look as follows:

Time, seconds     Position, miles     Speed, miles/hour

5.00 ± 0.01       0.00 ± 0.05         10 ± 3
10.00 ± 0.01      0.00 ± 0.05         15 ± 3
15.00 ± 0.01      0.10 ± 0.05         25 ± 3
20.00 ± 0.01      0.10 ± 0.05         40 ± 3
25.00 ± 0.01      0.20 ± 0.05         55 ± 3
30.00 ± 0.01      0.30 ± 0.05         65 ± 3

Note that the position data is very imprecise indeed. The least count of the odometer is 0.1 miles, so theposition must be stated to the nearest tenth of a mile, and the uncertainty is half the least count, or 0.05miles (Rule 3). The speed data is only a little better. The least count of the odometer is 5 mph, so thespeed is recorded to the nearest 5 mph marking. The uncertainty in speed is half of the least count, or 2.5mph, which is rounded (according to the Rule 1) to 3 mph. The time, having been taken from a digitalinstrument, is recorded with an uncertainty equal to the full least count, 0.01 s (Rule 3A).

At left is a graph of position versus time. Notehow each data point is identified with a circle. Notealso how the vertical bars are used to representthe uncertainty in the dependent (y−axis) variable.No horizontal bars are used to represent uncertaintyin time (x−axis) because the absolute uncertaintyis too small (0.01 s) to show on a graph of thisscale.

No line has been drawn through the data points as yet because it is unclear how the data are related. We don’t yet know whether a straight line or a curve would be appropriate. More on this topic later.

When a straight line is appropriate it should be drawn so that it passes as close as possible to the maximum number of plotted points. It takes a bit of practice to be able to “eyeball” the best line. In fact, a little later you will learn a mathematical technique that will guarantee the best line through a set of points.

The next graph is one of speed versus time. It is a little clearer in this case that a straight line is a reasonable representation of the relationship of these two quantities (speed and time). The line has therefore been drawn. Notice that some of the points are above the line, some below, but that most of them fall along the line to within their uncertainty. Two important bits of information can be gleaned from this graph by considering the slope of the line and the area under the line.

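The slope is found by selecting two widely separated points on the line, not data points, and calculating the ratio of rise (change in y-value) to run (change in x-value):

$$\text{slope} = m = \frac{\text{rise}}{\text{run}} = \frac{\Delta y}{\Delta x}$$

$$m = \frac{60\ \text{mi/hr}}{26\ \text{s}} = 2.3\ \frac{\text{mi}}{\text{hr}\cdot\text{s}}$$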


You might recognize the slope as the acceleration of the automobile. The units may sound strange, but remember that the slope must agree with the units of measurement for the two quantities.

For an object that is accelerating uniformly, that is, one whose acceleration is constant so that its speed increases linearly with time, the speed of the object at any time is

$$v = v_0 + at = [at + v_0]$$

The form of the expression in square brackets, above, is interesting when compared with the general form of the equation of a straight line, in x and y.

$$y = mx + b$$

Here m is the slope and b is the y-intercept. Comparing the two equations should show that, for the speed vs time graph, the acceleration is equal to the slope and that the speed at time zero is equal to the y-intercept. Upon inspection of the graph you see that the y-intercept is not exactly zero, but is very nearly so, within the uncertainty of the measurements. Remember, the line on the graph is an estimate of the best straight line through the real data.

The area is also of interest. The area under this graph can be found by counting the number of grid squares included. There are three full squares and approximately three half-squares, for a total of 4.5 squares. But what are the units of the area?

The area of one of the squares is just its height multiplied by its width. But the height is measured, not in cm, but in miles per hour. And the width is measured in seconds. Or in other words, the area has units of speed multiplied by time, specifically (miles per hour)·(seconds), which is distance. Each square is 20 miles per hour “high” and 10 seconds “wide” and if you remember that there are 3600 seconds in an hour then you can see that each square represents a distance of 0.056 miles. Thus the area under this graph is

$$\text{area} = \left(0.056\ \frac{\text{miles}}{\text{square}}\right)(4.5\ \text{squares}) = 0.25\ \text{miles}$$
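The unit bookkeeping in the area calculation can be checked with a short Python sketch. The function name is hypothetical; the square size (20 mi/hr by 10 s) and the count of 4.5 squares are taken from the text.

```python
SECONDS_PER_HOUR = 3600

def square_distance_miles(height_mph: float, width_s: float) -> float:
    """Distance represented by one grid square on a speed (mi/hr) versus time (s) graph."""
    # (mi/hr) * s must be divided by 3600 s/hr to come out in miles
    return height_mph * width_s / SECONDS_PER_HOUR

per_square = square_distance_miles(20, 10)   # about 0.056 miles per square
distance = 4.5 * per_square                  # area under the line, in miles
print(f"{per_square:.3f} miles per square, {distance:.2f} miles total")
```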

If you examine the position graph you will see that, at the end of 30 seconds, the automobile had travelled 0.3 miles. This agrees with the area calculation of the distance within the uncertainty of the measurements. Now let us return to the position versus time graph, repeated here, but with a smooth curve now drawn through the plotted points. For a uniformly accelerating object that began from rest, the position at any time is

$$x = x_0 + \tfrac{1}{2}at^2$$

This is a quadratic equation and the shape of the graph should be a parabola, which is the shape drawn. But straight lines are much easier to draw than parabolae. Is there any way to use a straight line to show the relationship of these variables? Fortunately there is.

The answer is to make a substitution of variables. Instead of plotting time on the horizontal axis we shall plot time-squared, as shown. The vertical axis is just the same as it was. When you compare the quadratic equation above with the general equation of a straight line you can see that if $t^2$ takes the place of the straight-line variable x, the position takes the place of y, and the initial position $x_0$ takes the place of b, then a straight line can indeed relate position to the square of elapsed time. And $\tfrac{1}{2}a$ takes the place of m.

The graph to the left shows position plotted versus the square of the elapsed time. The slope of the line is, checking the equation, just one-half the acceleration, and the y-intercept is the position at time zero. Calculating the slope, as above

$$\text{slope} = \frac{\text{rise}}{\text{run}} = \frac{0.22\ \text{miles}}{600\ \text{s}^2} = 0.00037\ \frac{\text{mi}}{\text{s}^2} = 1.3\ \frac{\text{mi/hr}}{\text{s}}$$

Since the slope is half of the acceleration, the acceleration is 2.6 miles per hour per second. Compare this value with 2.3 miles per hour per second, as found from the slope of the speed versus time graph. In this example you have seen how an important physical quantity, acceleration, can be determined graphically from the slope of a straight line. In the first case a naturally linear relation was used. In the second case a nonlinear relation was linearized by substitution.
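As a rough check on the linearization, the following Python sketch builds the time-squared values used for the horizontal axis and converts the eyeballed slope into an acceleration. The variable names are illustrative only; the rise and run (0.22 miles over 600 s²) are the values read from the graph in the text.

```python
SECONDS_PER_HOUR = 3600

# Substitution of variables: the horizontal axis is t squared, not t
times_s = [5, 10, 15, 20, 25, 30]
times_squared = [t ** 2 for t in times_s]      # 25, 100, 225, 400, 625, 900 s^2

# Slope of the position versus t^2 line, from two points read off the line
rise_miles = 0.22
run_s2 = 600.0
slope_mi_per_s2 = rise_miles / run_s2                    # miles per second squared
slope_mph_per_s = slope_mi_per_s2 * SECONDS_PER_HOUR     # (mi/hr) per second

# x = x0 + (1/2) a t^2, so the slope is a/2 and the acceleration is twice the slope
acceleration_mph_per_s = 2 * slope_mph_per_s
print(f"slope = {slope_mph_per_s:.2f} (mi/hr)/s, a = {acceleration_mph_per_s:.1f} (mi/hr)/s")
```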


The Method of Least-Squares

Finding the Best Line

As you learned in the preceding section, graphing data in such a way that a straight line represents the functional relationship of the data is a very powerful tool. The only weakness in the method was the need to “eyeball” how the line is to be drawn. As promised, we shall now show you how to use a mathematical, statistical method to determine the guaranteed best straight line for a collection of data points.

In principle the problem is a simple one. We have a collection of data points that we would like to represent by the equation of a straight line

$$y = mx + b$$

The real question is what value of m, and what value of b, will represent the best possible straight line that could be drawn through the data points? As you have already seen, this line need not pass right through all or even most of the points. It should, however, be “closest to the mostest” of the points. How can the art and science of mathematics and statistics deliver this result?

First we must settle on a criterion for what is the best straight line. Clearly we wish to minimize the amount by which the line “misses” each of the plotted points. So a measure of how good the line is could be obtained by adding up the amount of each miss. Since sometimes the line will miss “high” and sometimes “low” we can square the amount of the miss in order to make all the amounts positive.

Adding up the square of the amount by which the line misses each plotted point yields a number which we call the sum of the squares of the deviations from the measured values, or for short just the sum of the squares of the deviations. What is desired is that a line be chosen (by selecting m and b) that will result in the least possible value of the sum of the squares of the deviations. Thus the process is named the method of least squares.
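To make the criterion concrete, here is a small Python sketch that computes the sum of the squares of the deviations for any candidate line. The function name is hypothetical. The data are the automobile speeds from the earlier table, with one added assumption: the standing start is included as the point (0 s, 0 mi/hr). The two candidate slopes, 2.3 and 2.6, are the values obtained graphically in the previous section, tried here with an intercept of zero.

```python
def sum_squared_deviations(xs, ys, m, b):
    """Sum over all points of (line value minus measured value) squared."""
    return sum((m * x + b - y) ** 2 for x, y in zip(xs, ys))

times = [0, 5, 10, 15, 20, 25, 30]       # seconds (t = 0 point is an assumption)
speeds = [0, 10, 15, 25, 40, 55, 65]     # miles per hour

# Two candidate lines through the origin
ssd_slope_23 = sum_squared_deviations(times, speeds, 2.3, 0.0)
ssd_slope_26 = sum_squared_deviations(times, speeds, 2.6, 0.0)

# The smaller sum identifies the better of the two candidates
print(ssd_slope_23, ssd_slope_26)
```

By this criterion the line with slope 2.3 is the better of the two. The method of least squares finds the pair (m, b) that makes this sum as small as it can possibly be.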

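Statistical methods, which we shall neither prove nor discuss, ultimately yield two formulae, one for the slope of the line, m, and one for the y-intercept, b.

$$m = \frac{N\left(\sum_{i=1}^{N} x_i y_i\right) - \left(\sum_{i=1}^{N} x_i\right)\left(\sum_{i=1}^{N} y_i\right)}{N\left(\sum_{i=1}^{N} (x_i)^2\right) - \left(\sum_{i=1}^{N} x_i\right)^2} \qquad (5)$$

$$b = \frac{\left(\sum_{i=1}^{N} y_i\right) - m\left(\sum_{i=1}^{N} x_i\right)}{N} \qquad (6)$$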

Though these equations may on the surface appear dreadfully formidable you should approach them a little like modern dentistry. Done correctly, it can be virtually painless. Let us use the data from the accelerating automobile and use the method of least squares to find the best straight line for the speed versus time graph.

The data are repeated below, but now each column has been added up and the sums appear at the bottom of each column. Also, extra columns have been added to account for the extra terms needed in the equations above. For clarity and simplicity, the uncertainties have been omitted. Note that the starting point (time zero, speed zero) is included in the table, so N = 7. Remember, the x-variable is time and the y-variable is speed.


time      speed     time²      speed²     (time)(speed)
x_i       y_i       (x_i)²     (y_i)²     (x_i)(y_i)
 0.0       0          0           0           0
 5.0      10         25         100          50
10.0      15        100         225         150
15.0      25        225         625         375
20.0      40        400        1600         800
25.0      55        625        3025        1375
30.0      65        900        4225        1950

Σx_i = 105.0    Σy_i = 210    Σ(x_i)² = 2275    Σ(y_i)² = 9800    Σx_i y_i = 4700

Let us now use these sums to evaluate the two equations. First we will find the slope of the line, m.

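$$m = \frac{N\left(\sum_{i=1}^{N} x_i y_i\right) - \left(\sum_{i=1}^{N} x_i\right)\left(\sum_{i=1}^{N} y_i\right)}{N\left(\sum_{i=1}^{N} (x_i)^2\right) - \left(\sum_{i=1}^{N} x_i\right)^2} = \frac{7(4700) - (105)(210)}{7(2275) - (105)^2} = 2.21$$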

With the value of m known, use the second equation to find the y−intercept, b.

$$b = \frac{\left(\sum_{i=1}^{N} y_i\right) - m\left(\sum_{i=1}^{N} x_i\right)}{N} = \frac{(210) - 2.21(105)}{7} = -3.15$$

Thus the best possible straight line through the real data points is given by the equation

y = 2.21x− 3.15
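The slope and intercept calculation can be reproduced with a short Python sketch of equations (5) and (6). The function name is hypothetical and the data are the seven (time, speed) pairs from the table. Because the sketch carries the unrounded slope (about 2.214) into the intercept formula, it gives an intercept of about −3.21 rather than the −3.15 obtained above with the slope already rounded to 2.21.

```python
def least_squares_line(xs, ys):
    """Slope m and intercept b of the least-squares line, per equations (5) and (6)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    delta = n * sum_xx - sum_x ** 2            # denominator of equation (5)
    m = (n * sum_xy - sum_x * sum_y) / delta   # equation (5)
    b = (sum_y - m * sum_x) / n                # equation (6)
    return m, b

times = [0, 5, 10, 15, 20, 25, 30]       # x: seconds
speeds = [0, 10, 15, 25, 40, 55, 65]     # y: miles per hour

m, b = least_squares_line(times, speeds)
print(f"m = {m:.3f} (mi/hr)/s, b = {b:.3f} mi/hr")
```

The small difference in the intercept is a reminder of how sensitive these formulae are to early rounding.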

or in other words

$$\text{speed in } \frac{\text{mi}}{\text{hr}} = \left(2.21\ \frac{\text{mi}}{\text{hr}\cdot\text{s}}\right)(\text{time in s}) - \left(3.15\ \frac{\text{mi}}{\text{hr}}\right)$$

It now remains to draw this line on the graph. Since the equation of the line is known, all that is necessary is to choose two values of x (in this case, time) and calculate the corresponding values of y (in this case, speed). Convenient choices might be 0 seconds and 30 seconds since these will give two widely spaced points. Calculating the corresponding speeds,

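$$\text{speed at 0 s} = \left(2.21\ \frac{\text{mi}}{\text{hr}\cdot\text{s}}\right)(0\ \text{s}) - 3.15\ \frac{\text{mi}}{\text{hr}} = -3.15\ \frac{\text{mi}}{\text{hr}}$$

$$\text{speed at 30 s} = \left(2.21\ \frac{\text{mi}}{\text{hr}\cdot\text{s}}\right)(30\ \text{s}) - 3.15\ \frac{\text{mi}}{\text{hr}} = 63.15\ \frac{\text{mi}}{\text{hr}}$$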

The two points to be plotted are (0, −3.15) and (30, 63.15). Once these are plotted on the graph a straight line is drawn connecting them. If the actual measured data points are added to the graph you will see that the line passes above some points and below others. The line may not pass exactly through any of the points. Here is the speed versus time graph, repeated, with the least-squares straight line drawn.

Remember that the least-squares line is an estimate of the relationship of the measured quantities. While it is the best estimate we can make there is no guarantee that it is the real, true functional relationship. In other words, the slope and intercept we have just found are themselves uncertain. This fact should not surprise you, since the data values themselves were uncertain to begin with.

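Further mathematical analysis, starting with the equations for the least-squares slope and intercept, can give us additional formulae which represent the uncertainties in the slope and intercept. These formulae are presented below.

$$\sigma_m = \sqrt{\frac{N\sigma_y^2}{\Delta}} \qquad (7)$$

$$\sigma_b = \sqrt{\frac{\left(\sigma_y^2\right)\left(\sum_{i=1}^{N} (x_i)^2\right)}{\Delta}} \qquad (8)$$

where

$$\Delta = N\left(\sum_{i=1}^{N} (x_i)^2\right) - \left(\sum_{i=1}^{N} x_i\right)^2$$

and

$$\sigma_y^2 = \frac{1}{N-2}\sum_{i=1}^{N} (mx_i + b - y_i)^2 = \frac{\left(\sum_{i=1}^{N} (y_i)^2\right) - b\left(\sum_{i=1}^{N} y_i\right) - m\left(\sum_{i=1}^{N} x_i y_i\right)}{N-2}$$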

You should by now be able to identify $\sigma_m$ and $\sigma_b$ as the uncertainties in the slope and y-intercept, respectively. Whenever you report the result of a least-squares calculation you should always include the uncertainties.

It is important that you learn to calculate, using your hand calculator, the parameters of the least-squares “fit.” Later on in this course you will learn to use computer software that performs these calculations, as well as actually drawing the graphs. Be careful to carry as many significant digits as you can when calculating the right-hand side of the expression for $\sigma_y^2$. Note that if this calculation for $\sigma_y^2$ yields a negative quantity, then you must use the middle form in the expression.

There is one final bit of information that statisticians can extract from these relations. The quantity $\sigma_y$, above, can be used to estimate the uncertainty in a predicted value of y, calculated from the straight-line equation. Thus, if you use the least-squares straight line equation (y = mx + b) to predict a value of y for which no measurement exists, you should report its uncertainty as follows

$$y = (mx + b) \pm \frac{\sigma_y}{\sqrt{N-2}}$$

Let’s use the automobile data one more time and compute the rest of the parameters.

$$\Delta = N\left(\sum_{i=1}^{N} (x_i)^2\right) - \left(\sum_{i=1}^{N} x_i\right)^2 = 7(2275) - (105)^2 = 4900$$

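Then

$$\sigma_m = \sqrt{\frac{N\sigma_y^2}{\Delta}} = \sqrt{\frac{7(14.9)}{4900}} = 0.146$$

and

$$\sigma_b = \sqrt{\frac{\left(\sigma_y^2\right)\left(\sum_{i=1}^{N} (x_i)^2\right)}{\Delta}} = \sqrt{\frac{(14.9)(2275)}{4900}} = 2.63$$

where

$$\sigma_y^2 = \frac{1}{N-2}\sum_{i=1}^{N} (mx_i + b - y_i)^2 = \frac{\left(\sum_{i=1}^{N} (y_i)^2\right) - b\left(\sum_{i=1}^{N} y_i\right) - m\left(\sum_{i=1}^{N} x_i y_i\right)}{N-2} = \frac{(9800) - (-3.15)(210) - (2.21)(4700)}{(7-2)} = 14.9$$

Therefore,

$$\sigma_y = \sqrt{14.9} = 3.86$$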
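The uncertainty formulae (7) and (8) can be sketched in Python as follows. The function name is hypothetical. The sketch uses the middle form of the expression for the variance of y (the explicit sum of squared deviations) and carries full precision throughout, so its results differ slightly from the hand calculation above, which used m and b rounded to three digits: it gives a variance of about 13.6 rather than 14.9, and uncertainties of roughly 0.14 and 2.5 for the slope and intercept. After rounding to one significant figure the reported uncertainties are unchanged.

```python
import math

def fit_with_uncertainties(xs, ys):
    """Least-squares slope and intercept plus sigma_y, sigma_m and sigma_b."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    delta = n * sum_xx - sum_x ** 2
    m = (n * sum_xy - sum_x * sum_y) / delta
    b = (sum_y - m * sum_x) / n
    # Middle form: a sum of squares, so it can never come out negative
    var_y = sum((m * x + b - y) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    sigma_m = math.sqrt(n * var_y / delta)        # equation (7)
    sigma_b = math.sqrt(var_y * sum_xx / delta)   # equation (8)
    return m, b, math.sqrt(var_y), sigma_m, sigma_b

times = [0, 5, 10, 15, 20, 25, 30]       # seconds
speeds = [0, 10, 15, 25, 40, 55, 65]     # miles per hour

m, b, sigma_y, sigma_m, sigma_b = fit_with_uncertainties(times, speeds)
print(f"sigma_y = {sigma_y:.2f}, sigma_m = {sigma_m:.3f}, sigma_b = {sigma_b:.2f}")
```

The size of the shift in the variance, from a change in only the third digit of m and b, illustrates the warning about carrying as many significant digits as possible.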

A correct report of the results of our least-squares calculation is, according to the rules of rounding off and stating uncertainties,

$$m = 2.2 \pm 0.1\ \frac{\text{mi}}{\text{hr}\cdot\text{s}}$$

$$b = -3 \pm 3\ \frac{\text{mi}}{\text{hr}}$$

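Suppose we then used the least-squares relation to calculate the automobile’s speed at time = 22 seconds, for which no measurement exists. Using the symbol v to represent the speed,

$$v = \left(2.21\ \frac{\text{mi}}{\text{hr}\cdot\text{s}}\right)(22\ \text{s}) + \left(-3.15\ \frac{\text{mi}}{\text{hr}}\right) = 45.47\ \frac{\text{mi}}{\text{hr}}$$

$$v = 45.47 \pm \frac{3.86}{\sqrt{7-2}} = 45 \pm 2\ \frac{\text{mi}}{\text{hr}}$$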

A quick mental calculation indicates that the fractional uncertainty in the speed, expressed as a percent, is between 4 and 5 percent. This result should not seem a disappointingly large uncertainty. Remember that the estimate is based upon data that were themselves quite uncertain.
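The prediction and its fractional uncertainty can be reproduced with a few lines of Python. This is only a sketch with a hypothetical function name. It uses the rounded fit values from the worked example (m = 2.21, b = −3.15, σy = 3.86, N = 7) and the text's rule that the predicted value carries an uncertainty of σy divided by the square root of N − 2.

```python
import math

def predict_with_uncertainty(x, m, b, sigma_y, n):
    """Predicted y on the least-squares line and its uncertainty, sigma_y / sqrt(N - 2)."""
    y = m * x + b
    uncertainty = sigma_y / math.sqrt(n - 2)
    return y, uncertainty

v, dv = predict_with_uncertainty(22, m=2.21, b=-3.15, sigma_y=3.86, n=7)

# Report as in the text: uncertainty to one significant figure, value to match
v_reported = round(v)       # mi/hr
dv_reported = round(dv)     # mi/hr
percent = 100 * dv_reported / v_reported
print(f"v = {v_reported} ± {dv_reported} mi/hr ({percent:.1f} percent)")
```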

Selection of independent and dependent variables

The use of statistical methods, such as least-squares, colors our choice of which variables will serve as the independent and dependent axis variables. The method we used to determine the slope and intercept is based upon the assumption that the uncertainty in the independent (x) variable is zero or negligible and that all of the uncertainty resides in the dependent (y) variable. Clearly this idealized situation will not be encountered in your lab.

Usually, however, one of the variables you will be measuring will be known to have a much smaller fractional uncertainty than others. For instance, when you use the small “slotted masses” that are stamped with their mass in grams you may assume that the uncertainty is negligible. When comparing uncertainties always compare the fractional uncertainties of variables. The method of least-squares will yield good results if you observe the following rule:

RULE 8. When making a graph for which the method of least-squares will be used, select the variable with the smallest fractional uncertainty as the independent variable.
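Rule 8 amounts to a simple comparison, sketched here in Python. The function names and the dictionary layout are assumptions made for illustration; the sample values are the last row of the automobile data (30.00 ± 0.01 s and 65 ± 3 mi/hr).

```python
def fractional_uncertainty(value, uncertainty):
    """Uncertainty expressed as a fraction of the measured value."""
    return abs(uncertainty / value)

def choose_independent_variable(measurements):
    """Return the name of the variable with the smallest fractional uncertainty.

    measurements maps a variable name to a representative (value, uncertainty) pair.
    """
    return min(
        measurements,
        key=lambda name: fractional_uncertainty(*measurements[name]),
    )

measurements = {
    "time": (30.00, 0.01),   # seconds, digital timer
    "speed": (65.0, 3.0),    # miles per hour, speedometer
}
independent = choose_independent_variable(measurements)
print(independent)   # time: about 0.03 percent, against about 5 percent for speed
```

Using one representative measurement per variable is a simplification. In practice the fractional uncertainty varies from point to point, so it is worth checking more than one row of the data.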
