Measurements and Their Analysis. Introduction Note that in this chapter, we are talking about...


Measurements and Their Analysis


Introduction

• Note that in this chapter, we are talking about multiple measurements of the same quantity

• Numerical analysis – computation of statistical quantities (mean, variance, etc.)

• Graphical analysis – construction of bar charts, scatter diagrams, etc.


Sample Versus Population

• Most often, we collect a small data sample from a much larger population

• For example, say we wanted to determine the ratio of female to male students enrolled at UF

• Theoretically we could visit every UF student and collect this information – then compute the ratio

• This would be an assessment of the population, which gives the actual ratio


Sample of Students

• Visiting every UF student would take a very long time, so we might collect a smaller sample (perhaps stand by the union for 10 minutes and count students as they walk by)

• If we compute the ratio from this sample, we would get an estimate of the actual ratio

• It is important to be unbiased

• If we based our sample on students in this room, we would get a biased estimate


Sample of Measurements

• The population size for measurements is infinite

• Thus, we are always dealing with samples when analyzing measurements

Now, let’s look at some analysis methods for samples of measurements.


Range and Median

• The range (sometimes called dispersion) is the difference between the largest and smallest values

• Generally, a smaller range implies better precision

• The median is the middle value of a sorted data set

• When the number of data elements is even, take the mean of the two middle values


Ordered Set of 50 Readings

20.1  20.5  21.2  21.7  21.8
21.9  22.0  22.2  22.3  22.3
22.5  22.6  22.6  22.7  22.8
22.8  22.9  22.9  23.0  23.1
23.1  23.2  23.2  23.3  23.4
23.5  23.6  23.7  23.8  23.8
23.8  23.9  24.0  24.1  24.1
24.2  24.3  24.4  24.6  24.7
24.8  25.0  25.2  25.3  25.3
25.4  25.5  25.9  25.9  26.1

Range = 26.1-20.1 = 6.0

Median is 23.45 (the average of 23.4 and 23.5)

Note that the difference between the lowest value and the median is 3.35 and between the highest and the median is 2.65
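As a quick check of the numbers on this slide, the range and median can be computed directly. This is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the original slides.

```python
import statistics

# The ordered set of 50 readings from the slide.
readings = [
    20.1, 20.5, 21.2, 21.7, 21.8, 21.9, 22.0, 22.2, 22.3, 22.3,
    22.5, 22.6, 22.6, 22.7, 22.8, 22.8, 22.9, 22.9, 23.0, 23.1,
    23.1, 23.2, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 23.8, 23.8,
    23.8, 23.9, 24.0, 24.1, 24.1, 24.2, 24.3, 24.4, 24.6, 24.7,
    24.8, 25.0, 25.2, 25.3, 25.3, 25.4, 25.5, 25.9, 25.9, 26.1,
]

# Range: difference between the largest and smallest values.
data_range = max(readings) - min(readings)

# Median: with an even count, statistics.median averages the two middle values.
median = statistics.median(readings)

# Distances from the extremes to the median hint at asymmetry in the data.
low_to_median = median - min(readings)
median_to_high = max(readings) - median

print(f"range        = {data_range:.1f}")
print(f"median       = {median:.2f}")
print(f"median - min = {low_to_median:.2f}")
print(f"max - median = {median_to_high:.2f}")
```

The unequal distances on either side of the median are the same 3.35 and 2.65 noted on the slide.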


Frequency Histogram

• The frequency histogram (or simply histogram) is a graphical representation of data

• A histogram is a bar graph that illustrates the data distribution

• To produce a histogram, the data are divided into classes, which are subranges that are usually equal in width

• The number of classes can vary depending on the number of values, but odd numbers like 7, 9, or 11 are often good choices


Dividing Data Into Classes

Say we want to construct a histogram of the previous data set using 7 classes spanning the range.

The class width will be 6.0/7 = 0.857143 ≈ 0.86

Therefore the first class subrange will be 20.10 – 20.96, the second subrange will be 20.96 – 21.81, the third will be 21.81 – 22.67, etc. (The boundaries are computed with the unrounded width and then rounded to two decimals.)

We then count the number of values falling within those classes and compute the fraction of the total.


Class Frequency Table

Class interval     Class frequency     Class relative frequency
20.10 - 20.96             2            2/50 = 0.04
20.96 - 21.81             3            3/50 = 0.06
21.81 - 22.67             8            8/50 = 0.16
22.67 - 23.53            13            13/50 = 0.26
23.53 - 24.39            11            11/50 = 0.22
24.39 - 25.24             6            6/50 = 0.12
25.24 - 26.10             7            7/50 = 0.14

Σ = 50/50 = 1
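The class counts in the table can be reproduced with a short script. This is only a sketch: it assumes each class includes its lower boundary and excludes its upper one (except the last class, which includes the maximum). The slides do not state a boundary convention, but none of the readings falls exactly on an interior boundary, so the choice does not affect the counts here.

```python
# The ordered set of 50 readings from the earlier slide.
readings = [
    20.1, 20.5, 21.2, 21.7, 21.8, 21.9, 22.0, 22.2, 22.3, 22.3,
    22.5, 22.6, 22.6, 22.7, 22.8, 22.8, 22.9, 22.9, 23.0, 23.1,
    23.1, 23.2, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 23.8, 23.8,
    23.8, 23.9, 24.0, 24.1, 24.1, 24.2, 24.3, 24.4, 24.6, 24.7,
    24.8, 25.0, 25.2, 25.3, 25.3, 25.4, 25.5, 25.9, 25.9, 26.1,
]

num_classes = 7
low, high = min(readings), max(readings)
width = (high - low) / num_classes  # unrounded class width, about 0.857143

# Count how many readings fall in each class.
counts = [0] * num_classes
for y in readings:
    k = int((y - low) / width)      # index of the class containing y
    k = min(k, num_classes - 1)     # put the maximum value in the last class
    counts[k] += 1

# Print the class frequency table.
for k, count in enumerate(counts):
    lower = low + k * width
    upper = low + (k + 1) * width
    rel = count / len(readings)
    print(f"{lower:5.2f} - {upper:5.2f}   {count:2d}   {rel:.2f}")
```

Plotting `counts` as bars of equal width gives the frequency histogram.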


What Can We Judge From a Histogram?

• Symmetry

• Range

• Frequencies

• Steepness indicates precision, but only if the histograms have the same class intervals and scales


What might cause these various shapes?


Numerical Measures

• Measures of central tendency

• Measures of data variation


Measures of Central Tendency

• Arithmetic mean or average

• Median (mentioned previously)

• Mode


Arithmetic Mean

$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$$

$\bar{y}$ is the sample mean

μ is the population mean but the same formula is used

n is the number of values

$y_i$ are the individual values


Mean of the 50 Values

$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{1175.0}{50} = 23.500$$

Why 5 significant figures?


Median

• The median is the middle value

• Half of the values are above and half are below

• It is more effective as a measure of central tendency when there are outliers (blunders) in the data set


Mode

• The mode is the most frequently occurring value

• It is seldom of use when dealing with measurements (real numbers)

• More useful with integers (e.g. most common age)
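To illustrate why the mode says little about real-valued measurements, here is a small sketch that finds it for the 50 readings used in this chapter. It relies only on `collections.Counter` from the standard library; the variable names are illustrative.

```python
from collections import Counter

# The ordered set of 50 readings from the earlier slide.
readings = [
    20.1, 20.5, 21.2, 21.7, 21.8, 21.9, 22.0, 22.2, 22.3, 22.3,
    22.5, 22.6, 22.6, 22.7, 22.8, 22.8, 22.9, 22.9, 23.0, 23.1,
    23.1, 23.2, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 23.8, 23.8,
    23.8, 23.9, 24.0, 24.1, 24.1, 24.2, 24.3, 24.4, 24.6, 24.7,
    24.8, 25.0, 25.2, 25.3, 25.3, 25.4, 25.5, 25.9, 25.9, 26.1,
]

# Tally how often each distinct value occurs and take the most frequent one.
tally = Counter(readings)
mode_value, mode_count = tally.most_common(1)[0]

print(f"mode = {mode_value} (occurs {mode_count} times out of {len(readings)})")
```

The most frequent value occurs only three times in 50 readings, and that count depends on the readings having been recorded to one decimal place. With finer resolution, repeated values would become rarer still.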


Definitions

• True value. A quantity’s theoretically correct or exact value. In theory, it is the population mean, μ, which is indeterminate for measurements

• Error (ε). The difference between a measurement and the true value

$$\varepsilon_i = y_i - \mu$$


Definitions (continued)

• Most probable value ($\bar{y}$). Derived from a sample, it is the average of equally weighted measurements

• Residual (v). The difference between the most probable value and an individual measurement. It is similar to an error, but definitely not the same thing

$$v_i = \bar{y} - y_i$$
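One practical difference between residuals and errors is that residuals can always be computed from the sample, whereas errors require the true value μ, which is unknown for measurements. The sketch below computes the residuals for the chapter's 50 readings and shows that they sum to zero, which follows from the definition of the mean. Variable names are illustrative.

```python
# The ordered set of 50 readings from the earlier slide.
readings = [
    20.1, 20.5, 21.2, 21.7, 21.8, 21.9, 22.0, 22.2, 22.3, 22.3,
    22.5, 22.6, 22.6, 22.7, 22.8, 22.8, 22.9, 22.9, 23.0, 23.1,
    23.1, 23.2, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 23.8, 23.8,
    23.8, 23.9, 24.0, 24.1, 24.1, 24.2, 24.3, 24.4, 24.6, 24.7,
    24.8, 25.0, 25.2, 25.3, 25.3, 25.4, 25.5, 25.9, 25.9, 26.1,
]

n = len(readings)
mean = sum(readings) / n  # most probable value (y-bar)

# Residual: most probable value minus the individual measurement.
residuals = [mean - y for y in readings]

# The residuals about the mean sum to zero (up to floating-point rounding).
residual_sum = sum(residuals)

print(f"mean             = {mean:.3f}")
print(f"first residual   = {residuals[0]:+.2f}")
print(f"last residual    = {residuals[-1]:+.2f}")
print(f"sum of residuals = {residual_sum:.1e}")
```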


Definitions (continued)

• Degrees of Freedom. The number of observations that are in excess of the minimum number necessary to solve for the unknowns – it equals the number of redundant observations

• Population variance (σ²). This quantifies the precision of the population of a set of data. It can also be called the mean squared error

$$\sigma^2 = \frac{\sum_{i=1}^{n} \varepsilon_i^2}{n}$$


Definitions (continued)

• Sample variance (S²). This is an unbiased estimate of the population variance.

$$S^2 = \frac{\sum_{i=1}^{n} v_i^2}{n - 1}$$

• Standard error (σ). Square root of population variance – for normally distributed data, 68.3% of all observations lie within ±σ of the true value


Definitions (continued)

• Standard deviation (S). This is the square root of the sample variance – it is an estimate of the standard error. (Do not expect 68.3% of sample observations to fall within ±S of the sample mean unless n is large.)


Definitions (continued)

• Standard deviation of the mean ($S_{\bar{y}}$). The mean value will have a lower standard deviation than any single measurement. As n → ∞, $S_{\bar{y}}$ → 0.

$$S_{\bar{y}} = \frac{S}{\sqrt{n}}$$


Alternate Formula for Sample Variance

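$$S^2 = \frac{\sum_{i=1}^{n} (\bar{y} - y_i)^2}{n - 1}$$

Expanding the square and using $\sum_{i=1}^{n} y_i = n\bar{y}$ gives

$$S^2 = \frac{\sum_{i=1}^{n} y_i^2 - 2\bar{y}\sum_{i=1}^{n} y_i + n\bar{y}^2}{n - 1} = \frac{\sum_{i=1}^{n} y_i^2 - n\bar{y}^2}{n - 1}$$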


Example

20.1  20.5  21.2  21.7  21.8
21.9  22.0  22.2  22.3  22.3
22.5  22.6  22.6  22.7  22.8
22.8  22.9  22.9  23.0  23.1
23.1  23.2  23.2  23.3  23.4
23.5  23.6  23.7  23.8  23.8
23.8  23.9  24.0  24.1  24.1
24.2  24.3  24.4  24.6  24.7
24.8  25.0  25.2  25.3  25.3
25.4  25.5  25.9  25.9  26.1

$$\bar{y} = 23.50$$


$$S = \sqrt{\frac{\sum_{i=1}^{n} v_i^2}{n - 1}} = \sqrt{\frac{92.36}{50 - 1}} = 1.37$$

What about significant figures?


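By alternate form: $\sum y_i^2 = 27{,}704.86$

$$S = \sqrt{\frac{27{,}704.86 - 50(23.50)^2}{50 - 1}} = \sqrt{\frac{27{,}704.86 - 27{,}612.50}{49}} = \sqrt{\frac{92.36}{49}} = 1.37$$

Standard deviation of the mean

$$S_{\bar{y}} = \frac{1.37}{\sqrt{50}} = 0.194$$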

(Note the higher precision for the mean)
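The worked example can be checked numerically. The following sketch computes the sample variance both from the residuals and from the alternate form, then the standard deviation and the standard deviation of the mean. It uses only the standard library, and the variable names are illustrative.

```python
import math

# The 50 readings from the example.
readings = [
    20.1, 20.5, 21.2, 21.7, 21.8, 21.9, 22.0, 22.2, 22.3, 22.3,
    22.5, 22.6, 22.6, 22.7, 22.8, 22.8, 22.9, 22.9, 23.0, 23.1,
    23.1, 23.2, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 23.8, 23.8,
    23.8, 23.9, 24.0, 24.1, 24.1, 24.2, 24.3, 24.4, 24.6, 24.7,
    24.8, 25.0, 25.2, 25.3, 25.3, 25.4, 25.5, 25.9, 25.9, 26.1,
]

n = len(readings)
mean = sum(readings) / n

# Form 1: sum of squared residuals divided by (n - 1).
sum_v2 = sum((mean - y) ** 2 for y in readings)
var_from_residuals = sum_v2 / (n - 1)

# Form 2 (alternate): (sum of y_i^2 - n * mean^2) / (n - 1).
sum_y2 = sum(y * y for y in readings)
var_alternate = (sum_y2 - n * mean ** 2) / (n - 1)

# Standard deviation and standard deviation of the mean.
s = math.sqrt(var_from_residuals)
s_mean = s / math.sqrt(n)

print(f"mean         = {mean:.3f}")
print(f"sum v^2      = {sum_v2:.2f}")
print(f"sum y^2      = {sum_y2:.2f}")
print(f"S^2 (resid.) = {var_from_residuals:.4f}")
print(f"S^2 (alt.)   = {var_alternate:.4f}")
print(f"S            = {s:.2f}")
print(f"S of mean    = {s_mean:.3f}")
```

The alternate form avoids computing residuals, but it subtracts two large, nearly equal numbers (27,704.86 and 27,612.50 here), so intermediate values should be carried with plenty of digits when working by hand.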


Now let’s work the same example using the program “STATS”


Was the Data Set Reasonably Normal?

About 68% of the values should lie within ±S of the mean.

So: 23.50 ±1.37

22.13 to 24.87

34 out of the 50 values fall within that range, which is 68%
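The count of 34 can be verified with a few lines of code. This sketch uses `statistics.mean` and `statistics.stdev` (the latter divides by n − 1, matching the sample standard deviation S defined earlier); the variable names are illustrative.

```python
import statistics

# The 50 readings from the example.
readings = [
    20.1, 20.5, 21.2, 21.7, 21.8, 21.9, 22.0, 22.2, 22.3, 22.3,
    22.5, 22.6, 22.6, 22.7, 22.8, 22.8, 22.9, 22.9, 23.0, 23.1,
    23.1, 23.2, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 23.8, 23.8,
    23.8, 23.9, 24.0, 24.1, 24.1, 24.2, 24.3, 24.4, 24.6, 24.7,
    24.8, 25.0, 25.2, 25.3, 25.3, 25.4, 25.5, 25.9, 25.9, 26.1,
]

mean = statistics.mean(readings)
s = statistics.stdev(readings)  # sample standard deviation (n - 1 divisor)

# Keep the readings that lie within one standard deviation of the mean.
lower, upper = mean - s, mean + s
inside = [y for y in readings if lower <= y <= upper]
fraction = len(inside) / len(readings)

print(f"interval: {lower:.2f} to {upper:.2f}")
print(f"{len(inside)} of {len(readings)} values inside ({fraction:.0%})")
```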


Evaluating Normalcy of Data

• Use statistics to the extent possible

• Look for values beyond ±3S (about 99.7% of values should be within that range from the mean)

• At this point in the class, it is still somewhat of a judgment call.

• Later in the course, we will look at other methods of data evaluation
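As a simple illustration of the ±3S screening, the sketch below flags any of the chapter's 50 readings that lie more than three sample standard deviations from the mean. It is only a rough first check under the assumption of approximately normal data, not a formal outlier test; the variable names are illustrative.

```python
import statistics

# The 50 readings from the example.
readings = [
    20.1, 20.5, 21.2, 21.7, 21.8, 21.9, 22.0, 22.2, 22.3, 22.3,
    22.5, 22.6, 22.6, 22.7, 22.8, 22.8, 22.9, 22.9, 23.0, 23.1,
    23.1, 23.2, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 23.8, 23.8,
    23.8, 23.9, 24.0, 24.1, 24.1, 24.2, 24.3, 24.4, 24.6, 24.7,
    24.8, 25.0, 25.2, 25.3, 25.3, 25.4, 25.5, 25.9, 25.9, 26.1,
]

mean = statistics.mean(readings)
s = statistics.stdev(readings)  # sample standard deviation (n - 1 divisor)

# Flag readings farther than 3S from the mean as candidates for closer review.
limit = 3 * s
suspects = [y for y in readings if abs(y - mean) > limit]

print(f"acceptance interval: {mean - limit:.2f} to {mean + limit:.2f}")
print(f"values beyond 3S: {suspects if suspects else 'none'}")
```

For this data set no reading falls outside the interval, which is consistent with the earlier finding that the data look reasonably normal.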