Mechanical Measurements Prof. S. P. Venkateshan

Indian Institute of Technology Madras

MODULE 1 Mechanical Measurements

1. Introduction to Mechanical Measurements

Figure 1 Why make measurements?

We recognize three reasons for making measurements, as indicated in
Figure 1. Measurements for commerce are outside the scope of this course.

Engineers design physical systems in the form of machines to serve

some specified functions. The behavior of the parts of the machine during the

operation of the machine needs to be examined or analyzed or designed such

that it functions reliably. Such an activity needs data regarding the machine parts

in terms of material properties. These are obtained by performing measurements

in the laboratory.

[Figure 1 content: Why measure? (a) Generate data for design; (b) generate data to validate or propose a theory; (c) for commerce.]



The scientific method consists in the study of nature to understand the
way it works. Science proposes hypotheses or theories based on observations,
and these need to be validated with carefully performed experiments that use many
measurements. Once a theory has been established it may be used to make
predictions, which may themselves be confirmed by further experiments.

Measurement categories

1. Primary quantity

2. Derived quantity

3. Intrusive – Probe method

4. Non-intrusive

Measurement categories are described in some detail now.

1. Primary quantity:

It is possible that a single quantity that is directly measurable is of

interest. An example is the measurement of the diameter of a cylindrical

specimen. It is directly measured using an instrument such as vernier calipers.

We shall refer to such a quantity as a primary quantity.

2. Derived quantity:

There are occasions when a quantity of interest is not directly

measurable by a single measurement process. The quantity of interest needs to

be estimated by using an appropriate relation involving several measured

primary quantities. The measured quantity is thus a derived quantity. An

example of a derived quantity is the determination of acceleration due to gravity

(g) by finding the period (T) of a simple pendulum of length (L). T and L are the

measured primary quantities while g is the derived quantity.
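As an illustration of a derived quantity, the short sketch below estimates g from the two primary quantities L and T. It assumes the standard small-amplitude simple-pendulum relation T = 2π√(L/g), i.e. g = 4π²L/T², which is not written out in the text above; the function name and the numerical values of L and T are hypothetical.

```python
import math

def gravity_from_pendulum(length_m, period_s):
    """Derived quantity g from the primary quantities L and T.

    Assumes the small-amplitude simple-pendulum relation T = 2*pi*sqrt(L/g).
    """
    return 4.0 * math.pi ** 2 * length_m / period_s ** 2

# Hypothetical measured primary quantities (illustrative values only)
L = 1.000   # pendulum length, m
T = 2.007   # period of oscillation, s

g = gravity_from_pendulum(L, T)
print(f"g = {g:.3f} m/s^2")
```

Any error in L or T carries over to g; how it does so is the subject of error propagation, treated later in this module.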



3. Probe or intrusive method:

Most of the time, the measurement of a physical quantity uses a probe

that is placed inside the system. Since a probe invariably affects the measured

quantity the measurement process is referred to as an intrusive type of

measurement.

4. Non-intrusive method:

When the measurement process does not involve insertion of a probe into
the system, the method is referred to as being non-intrusive. Methods that use
some naturally occurring process, like radiation emitted by a body, to measure a
desired quantity relating to the system may be considered non-intrusive.
The measurement process may also be assumed to be non-intrusive
when the probe has negligibly small interaction with the system. A typical
example of such a process is the use of a laser Doppler velocimeter (LDV) to
measure the velocity of a flowing fluid.

General measurement scheme:

Figure 2 Schematic of a general measurement system

[Block diagram. Blocks shown: Measured quantity; Detector and Transducer; Signal Conditioner; Computer; Output; Controller; Calibration or reference signal; External power.]



Figure 2 shows the schematic of a general measurement scheme. Not all

the elements shown in the Figure may be present in a particular case. The

measurement process requires invariably a detector that responds to the

measured quantity by producing a measurable change in some property of the

detector. The change in the property of the detector is converted to a

measurable output that may be either mechanical movement of a pointer over a

scale or an electrical output that may be measured using an appropriate

electrical circuit. This action of converting the measured quantity to a different

form of output is done by a transducer. The output may be manipulated by a

signal conditioner before it is recorded or stored in a computer. If the

measurement process is part of a control application the computer can use a

controller to control the measured quantity. The relationship that exists between

the measured quantity and the output of the transducer may be obtained by

calibration or by comparison with a reference value. The measurement

system requires external power for its operation.

Some issues:

1. Errors – Systematic or Random

2. Repeatability

3. Calibration and Standards

4. Linearity or Linearization

Any measurement, however carefully it is conducted, is subject to

measurement errors. These errors make it difficult to ascertain the true value of

the measured quantity. The nature of the error may be ascertained by repeating

the measurement a number of times and looking at the spread of the values. If

the spread in the data is small the measurement is repeatable and may be



termed as being good. If we compare the measurements made with a given
instrument with those made with a standardized instrument, the two may show
different performance as far as the repeatability is concerned. If we add or
subtract a certain correction to make the two instruments give data with
similar spread, the correction is said to constitute a systematic error. The
spread of the data in each of the instruments constitutes the random error.

The process of ascertaining the systematic error is calibration. The

response of a detector to the variation in the measured quantity may be linear or

non-linear. In the past the tendency was to look for a linear response as the

desired response. Even when the response of the detector was non-linear the

practice was to make the response linear by some manipulation. With the advent

of automatic recording of data using computers this is not necessary since

software can take care of this aspect.



Sub Module 1.2

2. Errors in measurements

Errors accompany any measurement, however well it is conducted. The

error may be inherent in the measurement process or it may be induced due to

variations in the way the experiment is conducted. The errors may be classified

as:

1. Systematic errors (Bias):

Systematic errors are due to faulty or improperly calibrated instruments.
These may be reduced or eliminated by careful choice and calibration of
instruments. Sometimes bias may be linked to a specific cause and estimated by
analysis. In such a case a correction may be applied to eliminate or reduce bias.
Bias is an indication of the accuracy of the measurement: the smaller the bias,
the more accurate the data.

2. Random errors:

Random errors are due to non-specific causes like natural disturbances
that may occur during the measurement process. These cannot be eliminated.
The magnitude of the spread in the data due to the presence of random errors is
a measure of the precision of the data: the smaller the random error, the more
precise the data. Random errors are statistical in nature and may be
characterized by statistical analysis.



We shall explain these through the familiar example shown in Figure 3.

Three different individuals with different skill levels are allowed to complete a

round of target practice. The outcome of the event is shown in the figure.

Figure 3 Precision and accuracy explained through a familiar example

It is evident that the target at the left belongs to a highly skilled shooter.

This is characterized by all the shots in the inner most circle. The result indicates

good accuracy as well as good precision. A measurement made well must be

like this case! The individual in the middle is precise but not accurate. Maybe it

is due to a faulty bore of the gun. The individual at the right is an unskilled

person who is behind on both counts. Most beginners will fall into this category.

The analogy is quite realistic since most students performing a measurement in

the laboratory may be put into one of the three categories. A good

experimentalist has to work hard to excel in it!

[Figure 3 panel labels, in the order discussed in the text: Good Precision, Good Accuracy (left); Good Precision, Poor Accuracy (middle); Poor Precision, Poor Accuracy (right).]



Another example:

Figure 4 Example showing the presence of systematic and random errors in data.

[Plot: Output, mV (0 to 25) versus Temperature, °C (0 to 500). Legend: Standard Reference; Individual Thermocouple Data; Poly. (Individual Thermocouple Data). Annotations: Bias, Error.]

The results shown in Figure 4 compare the response of a particular
thermocouple (that measures temperature) and a standard thermocouple. The
measurements are reported between room temperature (close to 20°C) and
500°C. That there is a systematic variation between the two is clear from the
figure that shows the trend of the measured temperatures indicated by the
particular thermocouple. The systematic error appears to vary with the
temperature. The data points indicated by the full symbols appear also to hug
the trend line. However the data points do not lie on it. This is due to random
errors that are always present in any measurement. Actually the standard
thermocouple would also have the random errors that are not indicated in the
figure. We have deliberately shown only the trend line for the standard
thermocouple.

Sub Module 1.3

3. Statistical analysis of experimental data

Statistical analysis and best estimate from replicate data:

Let a certain quantity X be measured repeatedly to get

\[ X_i, \quad i = 1, 2, \ldots, n \tag{1} \]

Because of random errors these are all different.

How do we find the best estimate Xb for the true value of X?

It is reasonable to assume that the best value be such that the

measurements are as precise as they can be!

In other words, the experimenter is confident that he has conducted the

measurements with the best care and he is like the skilled shooter in the

target practice example presented earlier!

Thus we minimize the variance with respect to the best estimate X_b of X, i.e. we minimize:

\[ S = \sum_{i=1}^{n} \left[ X_i - X_b \right]^2 \tag{2} \]



This requires that:

\[ \frac{\partial S}{\partial X_b} = \sum_{i=1}^{n} 2\left[ X_i - X_b \right](-1) = 0 \quad \text{or} \quad X_b = \frac{\sum_{i=1}^{n} X_i}{n} \tag{3} \]

The best estimate is thus nothing but the mean of all the individual

measurements!
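The result can be checked numerically. The sketch below evaluates the sum of squares S of Eq. (2) at the mean and at trial values on either side of it; the replicate data and the function name are hypothetical and serve only as an illustration.

```python
def sum_of_squares(data, xb):
    """S of Eq. (2): sum of squared deviations of the data from a trial value xb."""
    return sum((x - xb) ** 2 for x in data)

# Hypothetical replicate data (illustrative values only)
data = [9.8, 10.1, 10.0, 9.9, 10.2]
mean = sum(data) / len(data)

# S evaluated at the mean and at trial values on either side of it
for xb in (mean - 0.05, mean, mean + 0.05):
    print(f"xb = {xb:.3f}  S = {sum_of_squares(data, xb):.4f}")
```

S is smallest at the mean; any other trial value gives a larger sum of squares.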

Error distribution:

When a quantity is measured repeatedly it is expected that it will be

distributed around the best value according to some distribution. Many times

the random errors may be distributed as a normal distribution. If µ and σ are,

respectively, the mean and the standard deviation, then, the probability density is

given by

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left[\frac{x-\mu}{\sigma}\right]^2} \tag{4} \]

The probability that the error around the mean is (x-µ) is the area under

the probability density function between (x-µ)+dx and (x-µ) represented by the

product of the probability density and dx. The probability that the error is

anywhere between -∞ and x is thus given by the following integral:

\[ F(x) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}\left[\frac{v-\mu}{\sigma}\right]^2} \, dv \tag{5} \]

This is referred to as the cumulative probability. It is noted that if x→∞

the integral tends to 1. Thus the probability that the error is of all possible

magnitudes (between -∞ and +∞) is unity! The integral is symmetrical with



respect to x=µ as may be easily verified. The above integral is in fact the error

integral that is a tabulated function. A plot of f(x) and F(x) is given in Figure 5.

Figure 5 Normal distribution and its integral

Many times we are interested in finding out the chances of error lying between

two values in the form ±pσ. This is referred to as the “confidence interval” and

the corresponding cumulative probability specifies the chances of the error

occurring within the confidence interval. Table 1 gives the confidence intervals

that are useful in practice:

[Figure 5 plot: probability density and cumulative probability versus (x-µ)/σ over the range -3 to 3; vertical axis from 0 to 1.]



Table 1

Confidence intervals according to normal distribution

Cumulative Probability   0    0.95     0.99     0.999
Interval p               0    ±1.96    ±2.58    ±3.29

The table indicates that an error of magnitude greater than ±3.29σ is very unlikely to
occur. In most applications we specify ±1.96σ as the error bounds based on
95% confidence.
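The entries of Table 1 can be reproduced from the inverse of the cumulative normal distribution. The following is a minimal sketch using `NormalDist` from the Python standard library (Python 3.8 or later); the function name `interval_multiplier` is an assumption made for this illustration.

```python
from statistics import NormalDist

def interval_multiplier(confidence):
    """Return p such that the error lies within +/- p*sigma with the given probability.

    For a two-sided interval the upper limit must leave (1 - confidence)/2
    in each tail, hence the argument 0.5 + confidence/2.
    """
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)

for confidence in (0.95, 0.99, 0.999):
    print(f"{confidence:>6}: p = {interval_multiplier(confidence):.2f}")
```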



Example 1

Resistance of a certain resistor is measured repeatedly to obtain the

following data.

No. 1 2 3 4 5 6 7 8 9

R, kΩ 1.22 1.23 1.26 1.21 1.22 1.22 1.22 1.24 1.19

What is the best estimate for the resistance? What is the error with 95%

confidence?

Best estimate is the mean of the data.

\[ \bar{R} = \frac{1.22 \times 4 + 1.23 + 1.26 + 1.21 + 1.24 + 1.19}{9} = 1.223 \approx 1.22\ \text{k}\Omega \]

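Standard deviation of the error σ:

\[ \text{Variance} = \frac{1}{9} \sum_{i=1}^{9} \left[ R_i - \bar{R} \right]^2 = 3.33 \times 10^{-4} \]

Hence:

\[ \sigma = \sqrt{3.33 \times 10^{-4}} = 0.0183 \approx 0.02\ \text{k}\Omega \]

Error with 95% confidence:

\[ \text{Error}_{95\%} = 1.96\,\sigma = 1.96 \times 0.0183 = 0.036 \approx 0.04\ \text{k}\Omega \]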
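The arithmetic of Example 1 may be sketched in a few lines of Python. The variable names are chosen for this illustration only; the variance is taken with the divisor n, as in the worked solution above.

```python
import math

# Replicate resistance data of Example 1, in kilo-ohms
R = [1.22, 1.23, 1.26, 1.21, 1.22, 1.22, 1.22, 1.24, 1.19]
n = len(R)

mean = sum(R) / n                                # best estimate
variance = sum((r - mean) ** 2 for r in R) / n   # variance about the mean (divisor n)
sigma = math.sqrt(variance)                      # standard deviation
error_95 = 1.96 * sigma                          # 95% confidence bound (Table 1)

print(f"mean = {mean:.3f} kOhm, sigma = {sigma:.4f} kOhm, 95% error = {error_95:.3f} kOhm")
```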



Example 2

Thickness of a metal sheet (in mm) is measured repeatedly to obtain the

following replicate data. What is the best estimate for the sheet thickness? What

is the variance of the distribution of errors with respect to the best value? Specify

an error estimate to the mean value based on 99% confidence.

Experiment No. 1 2 3 4 5 6

t, mm 0.202 0.198 0.197 0.215 0.199 0.194

Experiment No. 7 8 9 10 11 12

t, mm 0.204 0.198 0.194 0.195 0.201 0.202

The best estimate for the metal sheet thickness is the mean of the 12

measured values. This is given by

\[ \bar{t} = t_b = \frac{\sum_{i=1}^{12} t_i}{12} = \frac{\left[ \begin{array}{l} 0.202 + 0.198 + 0.197 + 0.215 + 0.199 + 0.194 + 0.204 \\ {} + 0.198 + 0.194 + 0.195 + 0.201 + 0.202 \end{array} \right]}{12} = 0.2\ \text{mm} \]

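The variance with respect to the mean or the best value is given (on
substituting \bar{t} for t_b) by

\[ \sigma_b^2 = \frac{\sum_{i=1}^{12} \left[ t_i - \bar{t} \right]^2}{12} = \frac{\sum_{i=1}^{12} t_i^2}{12} - \bar{t}^2 \]

\[ = \frac{\left[ \begin{array}{l} 0.202^2 + 0.198^2 + 0.197^2 + 0.215^2 + 0.199^2 + 0.194^2 + 0.204^2 \\ {} + 0.198^2 + 0.194^2 + 0.195^2 + 0.201^2 + 0.202^2 \end{array} \right]}{12} - 0.2^2 = 3.04 \times 10^{-5}\ \text{mm}^2 \]

Note that the numerical value is obtained with the unrounded mean \bar{t} = 2.399/12; the two terms are nearly equal, and using the rounded value 0.2 in the last term would lose the small difference between them.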

The corresponding standard deviation is given by

\[ \sigma_b = \sqrt{3.04 \times 10^{-5}} = 0.0055 \approx 0.006\ \text{mm} \]

The corresponding error estimate based on 99% confidence is

\[ \text{Error} = \pm 2.58\,\sigma_b = \pm 2.58 \times 0.0055 \approx \pm 0.014\ \text{mm} \]
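The sketch below repeats the calculation of Example 2 using the "mean of squares minus square of mean" form of the variance. It also illustrates, as an aside that is not part of the original solution, why the unrounded mean has to be used in that form: the two terms are nearly equal, so rounding the mean to 0.2 destroys the result. Variable names are chosen for this illustration only.

```python
import math

# Replicate thickness data of Example 2, in mm
t = [0.202, 0.198, 0.197, 0.215, 0.199, 0.194,
     0.204, 0.198, 0.194, 0.195, 0.201, 0.202]
n = len(t)

t_mean = sum(t) / n
mean_of_squares = sum(v * v for v in t) / n

variance = mean_of_squares - t_mean ** 2   # uses the unrounded mean
sigma = math.sqrt(variance)
error_99 = 2.58 * sigma                    # 99% confidence bound (Table 1)

# Same formula with the mean rounded to 0.2: the difference becomes negative
variance_rounded_mean = mean_of_squares - 0.2 ** 2

print(f"mean = {t_mean:.4f} mm, variance = {variance:.3e} mm^2")
print(f"sigma = {sigma:.4f} mm, 99% error = {error_99:.3f} mm")
print(f"variance with rounded mean = {variance_rounded_mean:.3e} mm^2 (meaningless)")
```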

Principle of Least Squares

Earlier we have dealt with the method of obtaining the best estimate from

replicate data based on minimization of variance. No mathematical proof was

given as a basis for this. We shall now look at the above afresh, in the light of

the error distribution that has been presented above.

Consider a set of replicate data x_i. Let the best estimate for the measured
quantity be x_b. The probability for a certain value x_i within the interval
x_i, x_i + dx_i to occur in the measured data is given by the relation

\[ p(x_i) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x_b - x_i)^2}{2\sigma^2}} \, dx_i \tag{6} \]

The probability that the particular values of measured data are obtained in
replicate measurements must be given by the compound probability

\[ p = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x_b - x_i)^2}{2\sigma^2}} \, dx_i = \frac{1}{\left( \sigma\sqrt{2\pi} \right)^n} \, e^{-\frac{\sum_{i=1}^{n} (x_b - x_i)^2}{2\sigma^2}} \prod_{i=1}^{n} dx_i \tag{7} \]

The reason the set of data was obtained as replicate data is that it was the
most probable! Since the intervals dx_i are arbitrary, the above will have to be
maximized by the proper choice of x_b and σ such that the exponential factor is a
maximum. Thus we have to choose x_b and σ such that

\[ p' = \frac{1}{\sigma^n} \, e^{-\frac{\sum_{i=1}^{n} (x_b - x_i)^2}{2\sigma^2}} \tag{8} \]

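has the largest possible value. As usual we set the derivatives
\partial p' / \partial x_b = \partial p' / \partial \sigma = 0 to
get the values of the two parameters x_b and σ. We have:

\[ \frac{\partial p'}{\partial x_b} = -\frac{1}{2\sigma^{n+2}} \, e^{-\frac{\sum_{i=1}^{n} (x_i - x_b)^2}{2\sigma^2}} \underbrace{\sum_{i=1}^{n} 2 (x_i - x_b)(-1)}_{\text{This part should go to zero}} = 0 \tag{9} \]

Or

\[ \sum_{i=1}^{n} (x_i - x_b) = 0 \quad \text{or} \quad x_b = \frac{\sum_{i=1}^{n} x_i}{n} = \bar{x} \tag{10} \]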

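It is clear thus that the best value is nothing but the mean of the values! We also
have:

\[ \frac{\partial p'}{\partial \sigma} = \underbrace{\left[ -\frac{n}{\sigma^{n+1}} + \frac{1}{\sigma^{n+3}} \sum_{i=1}^{n} (x_i - x_b)^2 \right]}_{\text{This part should go to zero}} e^{-\frac{\sum_{i=1}^{n} (x_i - x_b)^2}{2\sigma^2}} = 0 \tag{11} \]

Or

\[ \sigma^2 = \frac{\sum_{i=1}^{n} (x_i - x_b)^2}{n} \tag{12} \]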

This last expression indicates that the parameter σ² is nothing but the
variance of the data with respect to the mean! Thus the best values of the
measured quantity and its spread are based on the minimization of the squares of
errors with respect to the mean. This embodies what is referred to as the
“Principle of Least Squares”.



Propagation of errors:

Replicate data collected by measuring a single quantity repeatedly

enables us to calculate the best value and characterize the spread by the

variance with respect to the best value, using the principle of least squares. Now

we look at the case of a derived quantity that is estimated from the

measurement of several primary quantities. The question that needs to be

answered is the following:

“A derived quantity Q is estimated using a formula that involves the
primary quantities a_1, a_2, ..., a_n. Each one of these is available in terms of the
respective best values \bar{a}_1, \bar{a}_2, ..., \bar{a}_n and the respective standard deviations
σ_1, σ_2, ..., σ_n. What is the best estimate for Q and what is the corresponding
standard deviation σ_Q?”

We have, by definition

\[ Q = Q(a_1, a_2, \ldots, a_n) \tag{13} \]

It is obvious that the best value of Q should correspond to that obtained by using
the best values for the a’s. Thus the best estimate for Q is given by \bar{Q} as

\[ \bar{Q} = Q(\bar{a}_1, \bar{a}_2, \ldots, \bar{a}_n) \tag{14} \]

Again, by definition, we should have:

\[ \sigma_Q^2 = \frac{1}{N} \sum_{i=1}^{N} \left( Q_i - \bar{Q} \right)^2 \tag{15} \]

The subscript i indicates the experiment number and the i-th estimate of Q is given
by

\[ Q_i = Q(a_{1i}, a_{2i}, \ldots, a_{ni}) \tag{16} \]



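If we assume that the spread in the values is small compared to the mean or the
best values (this is what one would expect from a well conducted experiment),
the difference between the i-th estimate and the best value may be written, using a
Taylor expansion around the best value, as

\[ Q_i - \bar{Q} \approx \frac{\partial Q}{\partial a_1} \Delta a_{1i} + \frac{\partial Q}{\partial a_2} \Delta a_{2i} + \cdots + \frac{\partial Q}{\partial a_n} \Delta a_{ni} \]

where \Delta a_{ji} = a_{ji} - \bar{a}_j. Substituting this in Equation (15) we get

\[ \sigma_Q^2 = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{\partial Q}{\partial a_1} \Delta a_{1i} + \frac{\partial Q}{\partial a_2} \Delta a_{2i} + \cdots + \frac{\partial Q}{\partial a_n} \Delta a_{ni} \right)^2 \tag{17} \]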

where the partial derivatives are all evaluated at the best values for the a’s. If the
a’s are all independent of one another then the errors in these are unrelated to
one another and hence the cross terms vanish:

\[ \sum_{i=1}^{N} \Delta a_{mi} \, \Delta a_{ki} = 0 \quad \text{for } m \neq k \]

Thus equation (17) may be rewritten as

\[ \sigma_Q^2 = \frac{1}{N} \sum_{i=1}^{N} \left[ \left( \frac{\partial Q}{\partial a_1} \Delta a_{1i} \right)^2 + \left( \frac{\partial Q}{\partial a_2} \Delta a_{2i} \right)^2 + \cdots + \left( \frac{\partial Q}{\partial a_n} \Delta a_{ni} \right)^2 \right] \tag{18} \]

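Noting that \sum_{i=1}^{N} (\Delta a_{ji})^2 = N \sigma_j^2, we may recast the above equation in the form

\[ \sigma_Q^2 = \left( \frac{\partial Q}{\partial a_1} \right)^2 \sigma_1^2 + \left( \frac{\partial Q}{\partial a_2} \right)^2 \sigma_2^2 + \cdots + \left( \frac{\partial Q}{\partial a_n} \right)^2 \sigma_n^2 \tag{19} \]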

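Equation (19) is the error propagation formula. Taking the square root, it may also be recast in the form

\[ \sigma_Q = \pm \sqrt{ \left( \frac{\partial Q}{\partial a_1} \right)^2 \sigma_1^2 + \left( \frac{\partial Q}{\partial a_2} \right)^2 \sigma_2^2 + \cdots + \left( \frac{\partial Q}{\partial a_n} \right)^2 \sigma_n^2 } \tag{20} \]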
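The error propagation formula lends itself to a small general-purpose routine. The sketch below is one possible arrangement, not a method prescribed by the text: it assumes the primary quantities are independent, and it estimates the partial derivatives numerically by central differences instead of analytically. The function name `propagate`, its `rel_step` parameter and the sample quantity Q = a1·a2 with its values are hypothetical.

```python
import math

def propagate(func, best, sigmas, rel_step=1e-6):
    """Return (Q_best, sigma_Q) for Q = func(a_1, ..., a_n) using Eq. (19).

    best   : best values of the primary quantities
    sigmas : their standard deviations (assumed independent)
    Partial derivatives are approximated by central differences.
    """
    var_q = 0.0
    for j, (a_j, s_j) in enumerate(zip(best, sigmas)):
        h = rel_step * (abs(a_j) if a_j != 0 else 1.0)
        up = list(best)
        dn = list(best)
        up[j] = a_j + h
        dn[j] = a_j - h
        dq_da = (func(*up) - func(*dn)) / (2.0 * h)   # influence coefficient
        var_q += (dq_da * s_j) ** 2
    return func(*best), math.sqrt(var_q)

# Hypothetical derived quantity: Q = a1 * a2
q_best, sigma_q = propagate(lambda a1, a2: a1 * a2, [2.0, 3.0], [0.1, 0.2])
print(f"Q = {q_best:.3f} +/- {sigma_q:.3f}")
```

For this product the influence coefficients are a2 = 3 and a1 = 2, so the routine should return the square root of (3 × 0.1)² + (2 × 0.2)².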



Example 3

The volume of a sphere is estimated by measuring its diameter by vernier

calipers. In a certain case the diameter has been measured as D = 0.0502 ±

0.00005 m. Determine the volume and specify a suitable uncertainty for the

same.

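Nominal volume of sphere:

\[ V = \pi \frac{D^3}{6} = 3.14159 \times \frac{0.0502^3}{6} = 6.624 \times 10^{-5}\ \text{m}^3 \]

The error in the measured diameter is specified as:

\[ \Delta D = \pm 0.00005\ \text{m} \]

The influence coefficient is defined as

\[ I_D = \frac{\partial V}{\partial D} = \pi \frac{D^2}{2} = 3.14159 \times \frac{0.0502^2}{2} = 3.958 \times 10^{-3}\ \text{m}^2 \]

Using the error propagation formula, we have

\[ \Delta V = I_D \, \Delta D = 3.958 \times 10^{-3} \times 0.00005 = 1.979 \times 10^{-7}\ \text{m}^3 \]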

Thus

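\[ V = 6.624 \times 10^{-5} \pm 1.979 \times 10^{-7}\ \text{m}^3 \]

Alternate solution to the problem

By logarithmic differentiation we have

\[ \frac{dV}{V} = 3 \frac{dD}{D} \]

This may be recast as

\[ \Delta V = \pm 3V \frac{\Delta D}{D} = \pm 3 \times 6.624 \times 10^{-5} \times \frac{0.00005}{0.0502} = \pm 0.0198 \times 10^{-5}\ \text{m}^3 \]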

This is the same as the result obtained earlier.
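Both routes of Example 3 can be checked with a few lines of Python; the variable names are chosen for this illustration only.

```python
import math

D = 0.0502      # measured diameter, m
dD = 0.00005    # uncertainty in the diameter, m

V = math.pi * D ** 3 / 6.0       # nominal volume, m^3
I_D = math.pi * D ** 2 / 2.0     # influence coefficient dV/dD, m^2

dV = I_D * dD                    # error propagation formula
dV_log = 3.0 * V * dD / D        # logarithmic differentiation

print(f"V = {V:.4e} m^3")
print(f"dV (influence coefficient)      = {dV:.4e} m^3")
print(f"dV (logarithmic differentiation) = {dV_log:.4e} m^3")
```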

Example 4

Two resistances R1 and R2 are given as 1000 ± 25 Ω and 500 ± 10 Ω .

Determine the equivalent resistance when these two are connected in a) series

and b) parallel. Also determine the uncertainties in these two cases.

Given Data:

\[ R_1 = 1000,\ \sigma_1 = 25; \quad R_2 = 500,\ \sigma_2 = 10 \quad (\text{all values are in } \Omega) \]

Case a) Resistances connected in series:

Equivalent resistance is

\[ R_s = R_1 + R_2 = 1000 + 500 = 1500\ \Omega \]

Influence coefficients are:

\[ I_1 = \frac{\partial R_s}{\partial R_1} = 1; \quad I_2 = \frac{\partial R_s}{\partial R_2} = 1 \]

Hence the uncertainty in the equivalent resistance is

\[ \sigma_s = \pm \sqrt{ (I_1 \sigma_1)^2 + (I_2 \sigma_2)^2 } = \pm \sqrt{ 25^2 + 10^2 } = \pm 26.93\ \Omega \]

Case b) Resistances connected in parallel:

Equivalent resistance is given by



\[ R_p = \frac{R_1 R_2}{R_1 + R_2} = \frac{1000 \times 500}{1000 + 500} = 333.3\ \Omega \]

Influence coefficients are:

\[ I_1 = \frac{\partial R_p}{\partial R_1} = \frac{R_2}{R_1 + R_2} - \frac{R_1 R_2}{(R_1 + R_2)^2} = \frac{500}{1500} - \frac{1000 \times 500}{1500^2} = 0.111 \]

\[ I_2 = \frac{\partial R_p}{\partial R_2} = \frac{R_1}{R_1 + R_2} - \frac{R_1 R_2}{(R_1 + R_2)^2} = \frac{1000}{1500} - \frac{1000 \times 500}{1500^2} = 0.444 \]

Hence the uncertainty in the equivalent resistance is

\[ \sigma_p = \pm \sqrt{ (I_1 \sigma_1)^2 + (I_2 \sigma_2)^2 } = \pm \sqrt{ (0.111 \times 25)^2 + (0.444 \times 10)^2 } = \pm 5.24\ \Omega \]

Thus the equivalent resistance is 1500 ± 26.9 Ω in the series arrangement
and 333.3 ± 5.24 Ω in the parallel arrangement.
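The numbers of Example 4 can be reproduced with the short sketch below, which evaluates the influence coefficients analytically as in the worked solution. Variable names are chosen for this illustration only.

```python
import math

R1, s1 = 1000.0, 25.0   # ohm
R2, s2 = 500.0, 10.0    # ohm

# Case a) series: both influence coefficients are unity
Rs = R1 + R2
sigma_s = math.sqrt((1.0 * s1) ** 2 + (1.0 * s2) ** 2)

# Case b) parallel
Rp = R1 * R2 / (R1 + R2)
I1 = R2 / (R1 + R2) - R1 * R2 / (R1 + R2) ** 2   # dRp/dR1
I2 = R1 / (R1 + R2) - R1 * R2 / (R1 + R2) ** 2   # dRp/dR2
sigma_p = math.sqrt((I1 * s1) ** 2 + (I2 * s2) ** 2)

print(f"series:   {Rs:.1f} +/- {sigma_s:.2f} ohm")
print(f"parallel: {Rp:.1f} +/- {sigma_p:.2f} ohm")
```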



Error estimation – some results without proof

Standard deviation of the means

The problem occurs as indicated below:

• Replicate data is collected with n measurements in a set

• Several such sets of data are collected

• Each one of them has a mean and a variance (precision)

• What is the mean and standard deviation of the means of all sets?

Population mean

Let N be the total number of data in the entire population. Mean of all the

sets m will be nothing but the population mean (i.e. the mean of all the

collected data taken as a whole).

Population variance

Let the population variance be

\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - m)^2}{N} \tag{21} \]

Variance of the means

Let the variance of the means be σ_m². Then we can show that:

\[ \sigma_m^2 = \frac{(N - n)}{n (N - 1)} \sigma^2 \tag{22} \]

If n << N the above relation may be approximated as

\[ \sigma_m^2 = \frac{(N - n)}{n (N - 1)} \sigma^2 = \frac{(1 - n/N)}{n (1 - 1/N)} \sigma^2 \approx \frac{\sigma^2}{n} \tag{23} \]

Estimate of variance

• Sample and its variance

– How is it related to the population variance?

• Let the sample variance from its own mean m_s be σ_e².

• Then we can show that:

\[ \sigma_e^2 = \frac{N (n - 1)}{n (N - 1)} \sigma^2 \approx \sigma^2 \left( 1 - \frac{1}{n} \right) \tag{24} \]

Error estimator

The last expression may be written down in the more explicit form:

\[ \sigma_e^2 = \frac{\sum_{i=1}^{n} (x_i - m_s)^2}{(n - 1)} \tag{25} \]

Physical interpretation

Equation (25) may be interpreted using physical arguments. Since the

mean (the best value) is obtained by one use of all the available data, the

degrees of freedom available (units of information available) is one less than

before. Hence the error estimator uses the factor (n-1) rather than n in the

denominator!



Example 5 (Example 1 revisited)

Resistance of a certain resistor is measured repeatedly to obtain the

following data.

# 1 2 3 4 5 6 7 8 9

R, kΩ 1.22 1.23 1.26 1.21 1.22 1.22 1.22 1.24 1.19

What is the best estimate for the resistance? What is the error with 95%

confidence?

Best estimate is the mean of the data.

\[ \bar{R} = \frac{1.22 \times 4 + 1.23 + 1.26 + 1.21 + 1.24 + 1.19}{9} = 1.223 \approx 1.22\ \text{k}\Omega \]

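Standard deviation of the error σ_e:

\[ \sigma_e^2 = \frac{1}{8} \sum_{i=1}^{9} \left[ R_i - \bar{R} \right]^2 = 3.75 \times 10^{-4} \]

Hence

\[ \sigma_e = \sqrt{3.75 \times 10^{-4}} = 0.019 \approx 0.02\ \text{k}\Omega \]

Error with 95% confidence:

\[ \text{Error}_{95\%} = 1.96\,\sigma_e = 1.96 \times 0.019 = 0.037 \approx 0.04\ \text{k}\Omega \]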
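The difference between the divisor n used in Example 1 and the divisor (n-1) used here corresponds to the two standard-deviation functions of the Python standard library: `statistics.pstdev` divides by n and `statistics.stdev` divides by (n-1). A minimal sketch, with variable names chosen for this illustration:

```python
import statistics

# Replicate resistance data of Examples 1 and 5, in kilo-ohms
R = [1.22, 1.23, 1.26, 1.21, 1.22, 1.22, 1.22, 1.24, 1.19]

sigma_n = statistics.pstdev(R)   # divisor n     (Example 1)
sigma_e = statistics.stdev(R)    # divisor n - 1 (Example 5, Eq. (25))

print(f"divisor n:     sigma   = {sigma_n:.4f} kOhm, 95% error = {1.96 * sigma_n:.3f} kOhm")
print(f"divisor n - 1: sigma_e = {sigma_e:.4f} kOhm, 95% error = {1.96 * sigma_e:.3f} kOhm")
```

With only nine readings the two estimates differ by a few percent; after rounding to one significant figure both give the same error bound of 0.04 kΩ.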



Sub Module 1.4

4. Regression analysis:

Now we are ready to consider curve fitting, or regression analysis. A suitable
plot of the data will indicate the nature of the trend in the data, and hence the
nature of the relation between the independent and the dependent variables. A
few examples are shown in Figure 6(a-c).

Figure 6(a) Linear relation between y and x

Figure 6(b) Linear relation between log x and log y



The linear graph shown in Figure 6(a) follows a relationship of the form
y = ax + b. The linear relationship on the log-log plot shown in Figure 6(b) follows
the form y = ax^b. The non-linear relationship shown in Figure 6(c) follows a
polynomial relationship of the form y = ax³ + bx² + cx + d. The parameters a, b, c,
d are known as the fit parameters and need to be determined as a part of the
regression analysis.

Figure 6(c) Non-linear relation between y and x



Linear fit is possible in all the cases shown in Table 2.

Table 2

Relation       Type of fit        How it plots
y = ax + b     Linear fit         Plots as a straight line on a linear graph sheet
y = ax^b       Power law fit      Plots as a straight line on a log-log graph
y = ae^(bx)    Exponential fit    Plots as a straight line on a semi-log graph

Linear regression:

Let (x_1, y_1), (x_2, y_2), (x_3, y_3), ..., (x_n, y_n) be a set of ordered pairs of
data. It is expected that there is a linear relation between y and x. Thus, if we
plot the data on a linear graph sheet as in Figure 6(a), the trend of the data

should be well represented by a straight line. We notice that the straight line

shown in the figure does not pass through any of the data points shown by full

symbols. There is a deviation between the data and the line and this deviation is

sometimes positive, sometimes negative, sometimes large and sometimes small.

If we look at the value given by the straight line as a local mean then the

deviations are distributed with respect to the local mean as a normal distribution.

If all data are obtained with equal care one may expect the deviations at various

data points to follow the same distribution and hence the least square principle

may be applied as under:

\[ \text{Minimise} \quad s^2 = \frac{\sum_{i=1}^{n} \left[ y_i - y_f \right]^2}{n} = \frac{\sum_{i=1}^{n} \left[ y_i - (a x_i + b) \right]^2}{n} \tag{26} \]

where y_f = ax + b is the desired linear fit to data. We see that s² is the variance
of the data with respect to the fit and minimization will yield the proper choice of
the mean line represented by the proper parameters a and b. The minimization
requires that

\[ \frac{\partial s^2}{\partial a} = -\frac{1}{n} \sum_{i=1}^{n} 2 \left[ y_i - (a x_i + b) \right] x_i = 0; \quad \frac{\partial s^2}{\partial b} = -\frac{1}{n} \sum_{i=1}^{n} 2 \left[ y_i - (a x_i + b) \right] = 0 \tag{27} \]

These equations may be rearranged as two simultaneous equations for a and b

as given below:

\[ \begin{aligned} \left( \sum x_i^2 \right) a + \left( \sum x_i \right) b &= \sum x_i y_i \\ \left( \sum x_i \right) a + n b &= \sum y_i \end{aligned} \tag{28} \]

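These are known as normal equations. The summation is from i=1 to n and is
not indicated explicitly. The solution to these two equations may be obtained
easily by the use of Cramer’s rule.

\[ a = \frac{ \begin{vmatrix} \sum x_i y_i & \sum x_i \\ \sum y_i & n \end{vmatrix} }{ \begin{vmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{vmatrix} }, \quad b = \frac{ \begin{vmatrix} \sum x_i^2 & \sum x_i y_i \\ \sum x_i & \sum y_i \end{vmatrix} }{ \begin{vmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{vmatrix} } \tag{29} \]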

We now introduce the following definitions:

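\[ \bar{x} = \frac{\sum x_i}{n}, \quad \bar{y} = \frac{\sum y_i}{n}, \quad \sigma_x^2 = \frac{\sum x_i^2}{n} - \bar{x}^2, \quad \sigma_y^2 = \frac{\sum y_i^2}{n} - \bar{y}^2 \quad \text{and} \quad \sigma_{xy} = \frac{\sum x_i y_i}{n} - \bar{x}\,\bar{y} \tag{30} \]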

The last of the quantities defined in (30) is known as the covariance. All the

other quantities are already familiar to us from statistical analysis. With these

definitions the slope of the line fit a may be written as

\[ a = \frac{\sigma_{xy}}{\sigma_x^2} \tag{31} \]

The latter of the expressions in (28) may be solved for the fit line intercept b as

\[ b = \bar{y} - a \bar{x} \tag{32} \]

In fact the last equation indicates that the regression line passes through the
point (\bar{x}, \bar{y}). The fit line may be represented in the alternate form Y_f = aX, where
Y_f = y_f - \bar{y} and X = x - \bar{x}.

Example 6

The following data is expected to follow a relation of the form y=ax+b.

Determine the fit parameters by linear regression.

x    0.9    2.3    3.3    4.5    5.7    6.7
y    1.1    1.6    2.6    3.2    4      5

It is convenient to make a table as shown below. The data given are in

columns 2 and 3. The other quantities needed to calculate the fit parameters are

in the other columns.

Data No.       x       y         x²          y²         xy
1              0.9     1.1       0.8100      1.2100     0.9900
2              2.3     1.6       5.2900      2.5600     3.6800
3              3.3     2.6       10.8900     6.7600     8.5800
4              4.5     3.2       20.2500     10.2400    14.4000
5              5.7     4         32.4900     16.0000    22.8000
6              6.7     5         44.8900     25.0000    33.5000
Column Sum:    23.4    17.5      114.6200    61.7700    83.9500
Column Mean    3.9     2.9167    19.1033     10.2950    13.9917
σ_x²           3.8933  Slope of the fit line is: a = 0.6721
σ_y²           1.7881  The intercept is: b = 0.2955



Sums are calculated column-wise and are shown in row 8. Various

means are then in row 9. The variances are in rows 10, 11 and column 2. The

regression parameters are then calculated using the results of the analysis

presented earlier.

The regression line is thus given by y_f = 0.6721x + 0.2955. The data and
the fit are compared in the following table.

That the fit is a good representation of the data is indicated by the

proximity of the respective values in the second and third columns. The plot

shown in Figure 7 is further proof of this.

x      y      y_f
0.9    1.1    0.9
2.3    1.6    1.8
3.3    2.6    2.5
4.5    3.2    3.3
5.7    4      4.1
6.7    5      4.8

Figure 7 Comparison of data with the fit

[Plot: y and y_f versus x. Legend: Data, Fit.]
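The calculation of Example 6 may be sketched in Python using the definitions (30) and the results (31) and (32). The function name `linear_fit` is chosen for this illustration only.

```python
def linear_fit(x, y):
    """Least-squares line y_f = a*x + b via Eqs. (30)-(32)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    var_x = sum(v * v for v in x) / n - x_mean ** 2                    # sigma_x^2
    cov_xy = sum(u * v for u, v in zip(x, y)) / n - x_mean * y_mean    # sigma_xy
    a = cov_xy / var_x        # slope, Eq. (31)
    b = y_mean - a * x_mean   # intercept, Eq. (32)
    return a, b

# Data of Example 6
x = [0.9, 2.3, 3.3, 4.5, 5.7, 6.7]
y = [1.1, 1.6, 2.6, 3.2, 4.0, 5.0]

a, b = linear_fit(x, y)
print(f"a = {a:.4f}, b = {b:.4f}")
for xi, yi in zip(x, y):
    print(f"x = {xi:.1f}  y = {yi:.1f}  y_f = {a * xi + b:.1f}")
```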



Goodness of fit and the correlation coefficient:

A measure of how good the regression line is as a representation of the data
is deduced now. In fact it is possible to fit two lines to the data by (a) treating x as
the independent variable and y as the dependent variable or by (b) treating y as
the independent variable and x as the dependent variable. The former has been
done above. The latter is described by a relation of the form x = a'y + b'. The
procedure followed earlier can be followed through to get the following (the
reader is expected to show these results):

\[ a' = \frac{\sigma_{xy}}{\sigma_y^2}, \quad b' = \bar{x} - a' \bar{y} \tag{33} \]

The second fit line may be recast in the form

\[ y' = \frac{1}{a'} x - \frac{b'}{a'} \tag{34} \]

The slope of this line is 1/a', which is not the same, in general, as a, the
slope of the first regression line. If the two slopes are the same the two
regression lines coincide. Otherwise the two lines are distinct. The ratio of the
slopes of the two lines is a measure of how good the form of the fit is to the data.
In view of this we introduce the correlation coefficient ρ defined through the
relation

\[ \rho^2 = \frac{\text{slope of first regression line}}{\text{slope of second regression line}} = a a' = \frac{\sigma_{xy}^2}{\sigma_x^2 \sigma_y^2} \tag{35} \]

Or

\[ \rho = \pm \frac{\sigma_{xy}}{\sigma_x \sigma_y} \tag{36} \]



The sign of the correlation coefficient is determined by the sign of the

covariance. If the regression line has a negative slope the correlation coefficient

is negative while it is positive if the regression line has a positive slope. The

correlation is said to be perfect if ρ = ±1. The correlation is poor if ρ ≈ 0.

Absolute value of the correlation coefficient should be greater than 0.5 to indicate

that y and x are related!

In Example 6 the correlation coefficient is positive. The pertinent
parameters are σ_x² = 3.8933, σ_y² = 1.7881 and σ_xy = 2.6167. With these the
correlation coefficient is

\[ \rho = \frac{2.6167}{\sqrt{3.8933 \times 1.7881}} = 0.992 \]

Since the correlation
coefficient is close to unity the fit represents the data very closely (Figure 7 has
already indicated this).
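The relation ρ² = a a' can be verified numerically for the data of Example 6. In the sketch below both regression slopes are computed and the correlation coefficient is formed from their product, with the sign taken from the covariance; variable names are chosen for this illustration only.

```python
import math

# Data of Example 6
x = [0.9, 2.3, 3.3, 4.5, 5.7, 6.7]
y = [1.1, 1.6, 2.6, 3.2, 4.0, 5.0]
n = len(x)

x_mean, y_mean = sum(x) / n, sum(y) / n
var_x = sum(v * v for v in x) / n - x_mean ** 2
var_y = sum(v * v for v in y) / n - y_mean ** 2
cov_xy = sum(u * v for u, v in zip(x, y)) / n - x_mean * y_mean

a = cov_xy / var_x         # slope of the y-on-x line, Eq. (31)
a_prime = cov_xy / var_y   # slope parameter of the x-on-y line, Eq. (33)

# Eq. (35): rho^2 = a * a'; the sign follows the covariance, Eq. (36)
rho = math.copysign(math.sqrt(a * a_prime), cov_xy)

print(f"a = {a:.4f}, 1/a' = {1.0 / a_prime:.4f}, rho = {rho:.3f}")
```

The two slopes a and 1/a' are close but not equal, which is why ρ falls slightly short of unity.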

Polynomial regression:

Sometimes the data may show a non-linear behavior that may be modeled

by a polynomial relation. Consider a quadratic fit as an example. Let the fit

equation be given by y_f = ax² + bx + c. The variance of the data with respect to
the fit is again minimized with respect to the three fit parameters a, b, c to get
three normal equations. These are solved for the fit parameters. Thus we have

\[ s^2 = \frac{\sum \left[ y_i - (a x_i^2 + b x_i + c) \right]^2}{n} \tag{37} \]

Least square principle requires



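\[ \begin{aligned} \frac{\partial s^2}{\partial a} &= -\frac{2}{n} \sum \left[ y_i - (a x_i^2 + b x_i + c) \right] x_i^2 = 0 \\ \frac{\partial s^2}{\partial b} &= -\frac{2}{n} \sum \left[ y_i - (a x_i^2 + b x_i + c) \right] x_i = 0 \\ \frac{\partial s^2}{\partial c} &= -\frac{2}{n} \sum \left[ y_i - (a x_i^2 + b x_i + c) \right] = 0 \end{aligned} \tag{38} \]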

These may be rewritten as

\[ \begin{aligned} a \sum x_i^4 + b \sum x_i^3 + c \sum x_i^2 &= \sum x_i^2 y_i \\ a \sum x_i^3 + b \sum x_i^2 + c \sum x_i &= \sum x_i y_i \\ a \sum x_i^2 + b \sum x_i + n c &= \sum y_i \end{aligned} \tag{39} \]

Normal equations (39) are easily solved for the three fit parameters to

complete the regression analysis.

Goodness of fit and the index of correlation:

In the case of a non-linear fit we define a quantity known as the index of

correlation to determine the goodness of the fit. The fit is termed good if the

variance of the deviates is much less than the variance of the y’s. Thus we

require the index of correlation defined below to be close to ±1 for the fit to be

considered good.

\[ \rho = \pm \sqrt{ 1 - \frac{s^2}{\sigma_y^2} } = \pm \sqrt{ 1 - \frac{\sum \left[ y - y_f \right]^2}{\sum \left[ y - \bar{y} \right]^2} } \tag{40} \]

It can be shown that the index of correlation is identical to the correlation coefficient for a linear fit. The index of correlation compares the scatter of the data about the regression curve with the scatter of the data about its own mean.



Example 7

The friction factor Reynolds number product fRe for laminar flow in a rectangular duct is a function of the aspect ratio $A = h/w$, where h is the height and w is the width of the rectangle. The following table gives the available data:

A      0      0.05   0.10   0.125  0.167  0.25   0.4    0.5    0.75   1
fRe    96     89.81  84.68  82.34  78.81  72.93  65.47  62.19  57.89  56.91

Make a suitable fit to data.

A plot of the given data indicates that a quadratic fit may be appropriate. For the purpose of the following analysis we represent the aspect ratio as x and the fRe product as y. We seek a fit to data of the form $y_f = ax^2 + bx + c$. The following tabulation helps in the regression analysis.

No.   x      y       x^2       x^3       x^4       x y       x^2 y
1     0      96      0         0         0         0         0
2     0.05   89.81   0.0025    0.000125  6.25E-06  4.4905    0.224525
3     0.1    84.68   0.01      0.001     0.0001    8.468     0.8468
4     0.125  82.34   0.015625  0.001953  0.000244  10.2925   1.286563
5     0.167  78.81   0.027889  0.004657  0.000778  13.16127  2.197932
6     0.25   72.93   0.0625    0.015625  0.003906  18.2325   4.558125
7     0.4    65.47   0.16      0.064     0.0256    26.188    10.4752
8     0.5    62.19   0.25      0.125     0.0625    31.095    15.5475
9     0.75   57.89   0.5625    0.421875  0.316406  43.4175   32.56313
10    1      56.91   1         1         1         56.91     56.91
Sum   3.342  747.03  2.091014  1.634236  1.409541  212.2553  124.6098

The three normal equations are then given by

$$\begin{aligned}
1.409541a + 1.634236b + 2.091014c &= 124.6098 \\
1.634236a + 2.091014b + 3.342c &= 212.2553 \\
2.091014a + 3.342b + 10c &= 747.03
\end{aligned}$$

These three simultaneous equations are solved to get the three fit parameters as

a=58.354, b=-94.432, c=94.06
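The solution step can be checked with a few lines of Python. The sketch below is only an illustration, not part of the original notes: it builds the sums of equation (39) from the raw data and solves the three normal equations with a small Gaussian elimination routine. The helper name `solve_linear` is hypothetical, and the value y = 57.89 at x = 0.75 is assumed, since that is the value used in the tabulation above.

```python
# Quadratic least-squares fit for Example 7 via the normal equations (39).
# Assumption: y at x = 0.75 is 57.89, as used in the worked tabulation.
x = [0, 0.05, 0.10, 0.125, 0.167, 0.25, 0.4, 0.5, 0.75, 1.0]
y = [96, 89.81, 84.68, 82.34, 78.81, 72.93, 65.47, 62.19, 57.89, 56.91]
n = len(x)


def solve_linear(A, rhs):
    """Solve A p = rhs by Gaussian elimination with partial pivoting."""
    m = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, m):
            factor = M[r][col] / M[col][col]
            for k in range(col, m + 1):
                M[r][k] -= factor * M[col][k]
    p = [0.0] * m
    for r in range(m - 1, -1, -1):  # back substitution
        s = sum(M[r][k] * p[k] for k in range(r + 1, m))
        p[r] = (M[r][m] - s) / M[r][r]
    return p


# Sums that appear in the normal equations
Sx = sum(x)
Sx2 = sum(xi**2 for xi in x)
Sx3 = sum(xi**3 for xi in x)
Sx4 = sum(xi**4 for xi in x)
Sy = sum(y)
Sxy = sum(xi * yi for xi, yi in zip(x, y))
Sx2y = sum(xi**2 * yi for xi, yi in zip(x, y))

A = [[Sx4, Sx3, Sx2],
     [Sx3, Sx2, Sx],
     [Sx2, Sx, n]]
rhs = [Sx2y, Sxy, Sy]

a, b, c = solve_linear(A, rhs)
print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.2f}")
```

The printed parameters may be compared with the values quoted above; small differences in the last digit can arise because the tabulated sums are rounded.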

The following table helps in comparing the data with the fit.

x      y       y_f     (y - y_f)^2  (y - mean y)^2
0      96      94.06   3.763026     453.5622
0.05   89.81   89.48   0.105978     228.2214
0.1    84.68   85.20   0.270956     99.54053
0.125  82.34   83.17   0.685562     58.32377
0.167  78.81   79.91   1.226588     16.86745
0.25   72.93   74.09   1.367455     3.143529
0.4    65.47   65.62   0.023763     85.24829
0.5    62.19   61.43   0.573285     156.5752
0.75   57.89   56.06   3.346951     282.677
1      56.91   57.98   1.150145     316.5908
Sum    747.03  747.03  12.51371     1700.75
Mean   74.703          1.251371     170.075

The table also shows how the index of correlation is calculated. The column sums and column means required are given in the last two rows of the table. Note that the calculation of $\sigma_y^2$ requires sums of the form $\sum \left[ y - \bar{y} \right]^2$, where $\bar{y}$ is available as the last entry in column 2. The index of correlation uses the mean values of columns 4 and 5 given by $s^2 = 1.251371$ and $\sigma_y^2 = 170.075$. The index of correlation is thus equal to

$$\rho = -\sqrt{1 - \frac{1.251371}{170.075}} = -0.996.$$

The negative sign indicates that y decreases when x increases. The index of correlation is close to -1 and hence the fit represents the data very well. A plot of the data along with the fit given in Figure 8 also indicates this. The standard error of the fit is given by $s = \pm\sqrt{1.251371} = \pm 1.12$.
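The arithmetic behind the index of correlation can be reproduced with the short Python sketch below. It is an illustration only: it takes the fit parameters quoted in Example 7 as given (rounded as printed), assumes y = 57.89 at x = 0.75 as in the tabulation, and evaluates equation (40) with the negative root.

```python
import math

# Data of Example 7 (y at x = 0.75 assumed to be 57.89, as in the tabulation)
x = [0, 0.05, 0.10, 0.125, 0.167, 0.25, 0.4, 0.5, 0.75, 1.0]
y = [96, 89.81, 84.68, 82.34, 78.81, 72.93, 65.47, 62.19, 57.89, 56.91]
a, b, c = 58.354, -94.432, 94.06  # fit parameters quoted in the text

n = len(x)
y_fit = [a * xi**2 + b * xi + c for xi in x]
y_mean = sum(y) / n

# Variance of the deviates about the fit and variance of y about its mean
s2 = sum((yi - yf) ** 2 for yi, yf in zip(y, y_fit)) / n
sigma_y2 = sum((yi - y_mean) ** 2 for yi in y) / n

# Negative root chosen because y decreases as x increases
rho = -math.sqrt(1.0 - s2 / sigma_y2)
std_error = math.sqrt(s2)

print(f"s^2 = {s2:.4f}, sigma_y^2 = {sigma_y2:.3f}")
print(f"index of correlation = {rho:.4f}, standard error = {std_error:.2f}")
```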



Figure 8 Comparison of the data with the quadratic fit

[Plot: abscissa, aspect ratio (0 to 1); ordinate, friction factor Reynolds number product (0 to 100); legend: Data, Quadratic fit]

In the above we have considered cases that involved one independent variable and one dependent variable. Sometimes the dependent variable may be a function of more than one variable. For example, the relation of the form $Nu = a\,Re^{b}\,Pr^{c}$ is a common type of relationship between the Nusselt number (Nu, dependent variable) and Reynolds (Re) and Prandtl (Pr) numbers, both of which are independent variables. By taking logarithms, we see that $\log(Nu) = \log(a) + b \log(Re) + c \log(Pr)$. It is thus seen that the relationship is linear when logarithms of the dependent and independent variables are used to describe the fit. Also the relationship may be expressed in the form z = ax + by + c, where z is the dependent variable, x and y are independent variables and a, b, c are the fit parameters. The least square method may be used to determine the fit parameters. Let the data be available as n sets of x, y, z values. The quantity to be minimized is given by

$$s^2 = \sum_i \left[ z_i - \left( a x_i + b y_i + c \right) \right]^2 \qquad (41)$$

The normal equations are obtained by the usual process of setting the first partial

derivatives with respect to the fit parameters to zero.

$$\begin{aligned}
a \sum x_i^2 + b \sum x_i y_i + c \sum x_i &= \sum x_i z_i \\
a \sum x_i y_i + b \sum y_i^2 + c \sum y_i &= \sum y_i z_i \\
a \sum x_i + b \sum y_i + n c &= \sum z_i
\end{aligned} \qquad (42)$$

These equations are solved simultaneously to get the three fit parameters.

Example 8

The following table gives the variation of z with x and y. Obtain a multiple linear

fit to the data and comment on the goodness of the fit.

No.  x      y     z
1    0.1    0.2   0.426
2    0.3    0.35  0.539
3    0.559  0.5   0.651
4    0.847  0.65  0.786
5    1.156  0.8   0.892
6    1.48   0.95  1.058
7    1.817  1.1   1.185
8    2.168  1.25  1.33
9    2.525  1.4   1.474
10   2.893  1.55  1.634

The calculation procedure follows that given previously. Several sums are

required and these are tabulated below.



No.  x       y     z      x^2       x y      y^2     x z       y z
1    0.1     0.2   0.426  0.01      0.02     0.04    0.0426    0.0852
2    0.3     0.35  0.539  0.09      0.105    0.1225  0.1617    0.18865
3    0.559   0.5   0.651  0.312481  0.2795   0.25    0.363909  0.3255
4    0.847   0.65  0.786  0.717409  0.55055  0.4225  0.665742  0.5109
5    1.156   0.8   0.892  1.336336  0.9248   0.64    1.031152  0.7136
6    1.48    0.95  1.058  2.1904    1.406    0.9025  1.56584   1.0051
7    1.817   1.1   1.185  3.301489  1.9987   1.21    2.153145  1.3035
8    2.168   1.25  1.33   4.700224  2.71     1.5625  2.88344   1.6625
9    2.525   1.4   1.474  6.375625  3.535    1.96    3.72185   2.0636
10   2.893   1.55  1.634  8.369449  4.48415  2.4025  4.727162  2.5327
Sum  13.845  8.75  9.975  27.40341  16.0137  9.5125  17.31654  10.39125

The last row contains the sums required and the normal equations are easily

written down as under:

$$\begin{aligned}
27.40341a + 16.0137b + 13.845c &= 17.31654 \\
16.0137a + 9.5125b + 8.75c &= 10.39125 \\
13.845a + 8.75b + 10c &= 9.975
\end{aligned}$$



Figure 9 Parity plot showing the goodness of the fit

These are solved to get the fit parameters as a=0.285, b=0.297, c=0.343.

The data and the fit may be compared by making a parity plot as shown in Figure

9. The parity plot is a plot of given data (z) along the abscissa and the fit (zf)

along the ordinate. The parity line is a line of equality between the two. The

departure of the data from the parity line is an indication of the quality of the fit.

The above figure indicates that the fit is indeed very good. When the data is a

function of more than one independent variable it is not always possible to make

plots between independent and dependent variables. In such a case the parity

plot is a way out.
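The multiple linear fit of Example 8 and the data needed for a parity plot can be sketched in Python as follows. This is an illustrative sketch rather than the procedure used in the notes: it forms the sums of equation (42) from the raw data and solves the 3 x 3 system by Cramer's rule. The helper names `det3` and `replace_column` are hypothetical.

```python
# Multiple linear fit z = a x + b y + c for the data of Example 8
x = [0.1, 0.3, 0.559, 0.847, 1.156, 1.48, 1.817, 2.168, 2.525, 2.893]
y = [0.2, 0.35, 0.5, 0.65, 0.8, 0.95, 1.1, 1.25, 1.4, 1.55]
z = [0.426, 0.539, 0.651, 0.786, 0.892, 1.058, 1.185, 1.33, 1.474, 1.634]
n = len(z)


def det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def replace_column(m, j, col):
    """Copy of m with column j replaced by col (used by Cramer's rule)."""
    return [[col[i] if k == j else m[i][k] for k in range(3)] for i in range(3)]


# Sums appearing in the normal equations (42)
Sx, Sy, Sz = sum(x), sum(y), sum(z)
Sxx = sum(xi * xi for xi in x)
Sxy = sum(xi * yi for xi, yi in zip(x, y))
Syy = sum(yi * yi for yi in y)
Sxz = sum(xi * zi for xi, zi in zip(x, z))
Syz = sum(yi * zi for yi, zi in zip(y, z))

A = [[Sxx, Sxy, Sx],
     [Sxy, Syy, Sy],
     [Sx, Sy, n]]
rhs = [Sxz, Syz, Sz]

D = det3(A)
a, b, c = (det3(replace_column(A, j, rhs)) / D for j in range(3))

# Parity data: fitted value z_f against the given z
z_fit = [a * xi + b * yi + c for xi, yi in zip(x, y)]
sse = sum((zi - zf) ** 2 for zi, zf in zip(z, z_fit))

print(f"a = {a:.3f}, b = {b:.3f}, c = {c:.3f}")
for zi, zf in zip(z, z_fit):
    print(f"z = {zi:.3f}   z_f = {zf:.3f}")
```

Plotting the pairs (z, z_f) against the line z_f = z gives the parity plot described above.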

[Plot for Figure 9: abscissa, Data (0 to 2); ordinate, Fit (0 to 2); legend: Fit, Parity Line]



We may also calculate the index of correlation as an indicator of the

quality of the fit. This calculation is left to the reader!

General non-linear fit:

The fit equation may sometimes have to be chosen as a non-linear relation that is neither a polynomial nor in a form that may be reduced to the linear form. In such a case the parameter estimation is more involved and requires the use of a search method to determine the best parameter set that minimizes the sum of the squares of the residual. The method is described in some detail below.

Let us represent the fit relation in the form $y_f = f(x; a_1, a_2, \ldots, a_m)$ where the independent variable is x and $a_1$ to $a_m$ are m fit parameters to be determined by the regression analysis. As before we assume that n sets of x, y values are available. Consider the sum of the squares of the residual given by

$$s^2(a_1, \ldots, a_m) = S(a_1, \ldots, a_m) = \sum_i \left[ y_i - f(x_i; a_1, a_2, \ldots, a_m) \right]^2 \qquad (43)$$

In general it is not possible to set the partial derivatives with respect to the parameters to zero to obtain the normal equations and thus obtain the fit parameters. In view of this let us look at what is happening to the sum of squares near a starting parameter set $a_1^0, a_2^0, \ldots, a_m^0$. The sum of squares is evaluated using this parameter set in equation (43) to get $S^0 = S(a_1^0, a_2^0, \ldots, a_m^0)$. Perturb each of the a's individually to get $S(a_1^0 + \Delta a_1, a_2^0, \ldots, a_m^0)$, ..., $S(a_1^0, a_2^0, \ldots, a_j^0 + \Delta a_j, \ldots, a_m^0)$, ..., $S(a_1^0, a_2^0, \ldots, a_m^0 + \Delta a_m)$. Using these we may estimate the partial derivatives by the use of finite difference approximation as

$$\left. \frac{\partial S}{\partial a_j} \right|_{a_1^0, a_2^0, \ldots, a_m^0} = \frac{S(a_1^0, a_2^0, \ldots, a_j^0 + \Delta a_j, \ldots, a_m^0) - S(a_1^0, a_2^0, \ldots, a_j^0, \ldots, a_m^0)}{\Delta a_j}.$$

There are m such partial derivatives and they are all likely to be non-zero (if they are all zero we are already at the optimum point where the sum of squares is possibly a minimum). The gradient vector is then given by the components

$$\left( \frac{\partial S}{\partial a_1}, \frac{\partial S}{\partial a_2}, \ldots, \frac{\partial S}{\partial a_j}, \ldots, \frac{\partial S}{\partial a_m} \right).$$

The magnitude of this vector is obtained by summing the squares of all the partial derivatives and then taking the square root of this sum.

$$\left| \mathrm{grad}\, S \right| = \sqrt{\sum_{j=1}^{m} \left( \frac{\partial S}{\partial a_j} \right)^2} \qquad (44)$$

We divide each of the partial derivatives occurring in the gradient vector by the magnitude of the gradient vector thus calculated to get the components of a unit vector that is aligned with the gradient vector. Thus

$$\left( \frac{\partial S / \partial a_1}{\left| \mathrm{grad}\, S \right|}, \frac{\partial S / \partial a_2}{\left| \mathrm{grad}\, S \right|}, \ldots, \frac{\partial S / \partial a_j}{\left| \mathrm{grad}\, S \right|}, \ldots, \frac{\partial S / \partial a_m}{\left| \mathrm{grad}\, S \right|} \right) \qquad (45)$$

We now take a specific fraction (α, small) of each of these components to define a step along a direction opposite the gradient vector to get

$$a_j^1 = a_j^0 - \alpha \frac{\partial S / \partial a_j}{\left| \mathrm{grad}\, S \right|}, \qquad j = 1, 2, \ldots, m \qquad (46)$$

The calculation above is redone with the new values of the parameter set. The calculation is continued till the magnitude of the gradient reaches zero or an acceptably small value, at which point the calculation stops and the parameter set is assumed to have satisfied the least square principle. An example will make this procedure clear.
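Before the worked example, the steps in equations (44) to (46) can be sketched in Python. The code below is a minimal illustration under stated assumptions: the function names `finite_difference_gradient` and `descent_step` are hypothetical, the perturbation `delta` is taken to be the same small value for every parameter, and the demonstration function `S_demo` is an invented sum of squares with a known minimum, not data from the notes.

```python
import math


def finite_difference_gradient(S, a, delta=1e-6):
    """Forward-difference estimate of dS/da_j at the parameter set a."""
    S0 = S(a)
    grad = []
    for j in range(len(a)):
        perturbed = list(a)
        perturbed[j] += delta  # perturb one parameter at a time
        grad.append((S(perturbed) - S0) / delta)
    return grad


def descent_step(S, a, alpha, delta=1e-6):
    """One step of length alpha opposite to the gradient, as in equation (46)."""
    grad = finite_difference_gradient(S, a, delta)
    magnitude = math.sqrt(sum(g * g for g in grad))  # equation (44)
    if magnitude == 0.0:
        return list(a), magnitude  # already at a stationary point
    new_a = [aj - alpha * g / magnitude for aj, g in zip(a, grad)]
    return new_a, magnitude


# Hypothetical sum of squares with its minimum at (1, -2), for illustration only
def S_demo(a):
    return (a[0] - 1.0) ** 2 + (a[1] + 2.0) ** 2


a_new, grad_mag = descent_step(S_demo, [0.0, 0.0], alpha=0.1)
print(a_new, grad_mag)
```

Calling `descent_step` repeatedly, with α reduced as the optimum is approached, reproduces the search described above.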

Example 9

The data given in the following table is expected to follow a relation of the form $y_f = a e^{bx} + cx$. Determine the fit parameters by general non-linear regression.

x   0      0.2    0.4    0.6   0.8    1.0    1.2    1.4    1.6    1.8
y   1.196  1.379  1.581  1.79  2.013  2.279  2.545  2.842  3.173  3.5

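The sum of squares of the residual is given by $S = \sum_{i=1}^{10} \left[ y_i - \left( a e^{b x_i} + c x_i \right) \right]^2$. The partial derivatives needed are obtained analytically as

$$\begin{aligned}
\frac{\partial S}{\partial a} &= -2 \sum_{i=1}^{10} \left[ y_i - \left( a e^{b x_i} + c x_i \right) \right] e^{b x_i}, \\
\frac{\partial S}{\partial b} &= -2 \sum_{i=1}^{10} \left[ y_i - \left( a e^{b x_i} + c x_i \right) \right] a x_i e^{b x_i}, \\
\frac{\partial S}{\partial c} &= -2 \sum_{i=1}^{10} \left[ y_i - \left( a e^{b x_i} + c x_i \right) \right] x_i
\end{aligned} \qquad (47)$$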

The above means that the partial derivatives may be computed once the starting set of parameters is known or assumed. We start the calculation with the initial parameter set a = 1, b = 0.2, c = 0.1. The value of S turns out to be 11.673 for this set of parameter values. Using (47) the partial derivatives are obtained respectively as -24.023, -30.681, -23.003. The magnitude of the gradient vector is then given by

$$\left| \mathrm{grad}\, S \right| = \left[ (-24.023)^2 + (-30.681)^2 + (-23.003)^2 \right]^{0.5} = 45.249.$$

The components of the unit vector $u_a^0, u_b^0, u_c^0$ are then given by -24.023/45.249, -30.681/45.249, -23.003/45.249, or -0.531, -0.678, -0.508. We shall choose an α value of 0.02 to get the next trial values for the parameters as

$$\begin{aligned}
a^1 &= a^0 - \alpha u_a^0 = 1 - 0.02 \times (-0.531) = 1.011 \\
b^1 &= b^0 - \alpha u_b^0 = 0.2 - 0.02 \times (-0.678) = 0.214 \\
c^1 &= c^0 - \alpha u_c^0 = 0.1 - 0.02 \times (-0.508) = 0.110
\end{aligned}$$

The S value for this parameter set turns out to be 10.759. The

calculations may be repeated as above. The results are summarized below.

α      a         b         c         |grad S|  S
0.02   1         0.2       0.1       45.25     11.67
       1.011     0.214     0.11      44.29     10.76
       1.022     0.228     0.12      43.25     9.87
..     ..        ..        ..        ..        ..
       1.219     0.505     0.265     0.848     0.0286
0.005  1.21004   0.5984    0.2665    0.2831    0.001265
0.001  1.209288  0.509251  0.266293  0.108248  0.00108

It is clear from the table that a large number of trials are involved in the

regression analysis. The value of α needs to be reduced as we approach the

optimum set. The final set of parameters for the present case is given by

a=1.2093,b=0.5093,c=0.2663
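The first trial of Example 9 and the subsequent search can be sketched in Python as below. This is an illustration under an assumption that is not in the notes: the rule for reducing α (halving it whenever a trial step fails to reduce S) is only one possible schedule, whereas the notes reduce α by hand. The function names are hypothetical.

```python
import math

x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8]
y = [1.196, 1.379, 1.581, 1.79, 2.013, 2.279, 2.545, 2.842, 3.173, 3.5]


def sum_of_squares(a, b, c):
    """S for the model y_f = a exp(b x) + c x."""
    return sum((yi - (a * math.exp(b * xi) + c * xi)) ** 2 for xi, yi in zip(x, y))


def gradient(a, b, c):
    """Analytical partial derivatives of S, equation (47)."""
    dS_da = dS_db = dS_dc = 0.0
    for xi, yi in zip(x, y):
        e = math.exp(b * xi)
        r = yi - (a * e + c * xi)  # residual at this data point
        dS_da += -2.0 * r * e
        dS_db += -2.0 * r * a * xi * e
        dS_dc += -2.0 * r * xi
    return dS_da, dS_db, dS_dc


def step(params, alpha):
    """One step of length alpha opposite to the gradient, equation (46)."""
    g = gradient(*params)
    mag = math.sqrt(sum(gi * gi for gi in g))
    return tuple(p - alpha * gi / mag for p, gi in zip(params, g)), mag


# First trial, as in the worked example
p0 = (1.0, 0.2, 0.1)
S0 = sum_of_squares(*p0)
p1, grad0 = step(p0, alpha=0.02)
S1 = sum_of_squares(*p1)
print(f"S0 = {S0:.3f}, |grad S| = {grad0:.3f}, p1 = {p1}, S1 = {S1:.3f}")

# Continue the search; halve alpha when a step does not reduce S (assumed rule)
params, S, alpha = p1, S1, 0.02
for _ in range(500):
    trial, mag = step(params, alpha)
    S_trial = sum_of_squares(*trial)
    if S_trial < S:
        params, S = trial, S_trial
    else:
        alpha *= 0.5
print(f"after search: params = {params}, S = {S:.5f}")
```

The value of S after the first step differs slightly from the tabulated 10.76 because the table uses the parameter values rounded to three decimals.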



Figure 10 Comparison of data with the fit

That the regression analysis has indeed converged to the proper fit

parameters is seen from the excellent agreement between the data and the fit

shown in Figure 10. The reader is left to determine the index of correlation and

the standard error of the fit.

[Plot for Figure 10: abscissa, x (0 to 2); ordinate, y (0 to 4); legend: Data, Fit]



Sub Module 1.5

5. Use of EXCEL for regression analysis

EXCEL is a Microsoft product that comes along with the Office suite of

programs. It is essentially a spread sheet program that provides a computing

environment with graphic capabilities. The student is encouraged to learn the

basics of EXCEL programming so that data analysis, regression analysis and

suitable plots may all be done within the EXCEL environment.

EXCEL work sheet provides a grid with cells in it. The cells form columns

and rows as in a matrix. The columns are identified by letters and the rows by numerals. For example, A1 refers to the cell in the first column and first row. Cell C5 will represent the cell in the 3rd column (column C) and the 5th row (row 5). Column identifiers go from A to Z and then from AA to AZ and so on. The cell can hold a number, a statement or a formula.

A number or a statement is simply written by putting the cursor in the appropriate

cell and keying in the number or the statement, as the case may be. A formula,

however, is written by preceding the formula by “=” sign. The formula can

contain a reference to many built in functions in EXCEL as well as the usual

arithmetic operations. The formula can refer to the content of any other cell or

cells. The formulas can be calculated repeatedly over a set of rows by simply

copying down the formula vertically.



Figure 11 An extract of an EXCEL work sheet shows some of the things

one can do!

Columns: A B C D E F G

Row 2:        23, 88
Row 3:        This is a statement
Row 4:        In cell B4 is the formula "=A2*B2" i.e. product of two numbers; displayed result: 2024
Row 5:        The formula in B4 is acted upon and the result alone appears in the cell B4 as seen above.
Row 6:        x, x^2 (column headings)
Rows 7 to 16: x = 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 with x^2 = 1, 4, 9, 16, 25, 36, 49, 64, 81, 100
Row 17:       Sum of G7 to G16 is obtained by entering the formula "=SUM(G7:G16)" in cell G17; displayed result: 385
Row 18:       SUM() is a built in function in EXCEL

Data may be keyed into the cells in the form of columns as shown in the

work sheet given as Figure 12 below. The plotting is menu driven and the plot

may be displayed as a separate plot or within the work sheet. The latter is

shown in the case given here. The data range for the plot is specified by simply

blocking the Data cells shown by the blue background!



Figure 12 Another extract of an EXCEL work sheet, showing data and the

corresponding plot.

Columns: A B C D E F G H

Row   x   y
2     x   y   (column headings)
3     1   3.33
4     2   10.43
5     3   21.53
6     4   36.63
7     5   55.73
8     6   78.83
9     7   105.93
10    8   137.03
11    9   172.13

Properties of cells and of the chart (a plot is referred to as a chart in EXCEL) are changed to suit the requirements with menu driven controls. The student should become familiar with these through the “HELP” available in EXCEL.



Figure 13 Another extract of an EXCEL work sheet, showing data and the corresponding plot along with the automatically generated fit. The inset in the plot gives the linear relation between y and x. The square of the correlation coefficient is also shown in the inset (symbol R^2).

Columns: A B C D E F G H

Row 1: The following data is expected to follow a linear law. Obtain such a law
Row 2: by using the "Trend Line" option in EXCEL.

Row   x     yd
3     x     yd   (column headings)
4     0.5   0.35
5     1     1.66
6     1.5   3.418
7     2     4.488
8     2.5   5.306
9     3     8.584
10    3.5   9.97
11    4     12.196
12    4.5   15.382
13    5     15.548
14    5.5   17.274
15    6     18.704
16    6.5   20.306
17    7     21.612
18    7.5   21.446
19    8     24.108

Figure 13 shows how a trend line can be added to the plot. The inset in

the plot shows the relationship that exists between the y and x data values.

The correlation coefficient is very high, indicating that the fit represents the data extremely well.

Several examples of regression using EXCEL are presented below. The

examples are self explanatory and I expect the student to work them out using

EXCEL himself/herself.



Example 10

Linear fit example using EXCEL

Price per thousand pieces of a certain product (x) determines the demand (y) for the product. The data is given below. Fit a straight line to data and discuss the quality of the fit.

The EXCEL worksheet into which the data is input is shown below. The data is keyed in the cells as shown. The sums required are automatically calculated by using the SUM function. Variances and the covariance are calculated using their definitions given earlier. The slopes and the intercepts are calculated using the formulae derived by the least square method. The two regression lines are then given by:

$$y_1 = -14.15x + 288.4, \qquad y_2 = -14.74x + 297.6$$

No.   x       y       x^2   y^2    xy    y1      y2
1     15      82      225   6724   1230  76.103  76.495
2     18      25      324   625    450   33.64   32.269
3     13      93      169   8649   1209  104.41  105.98
4     16      60      256   3600   960   61.949  61.753
5     12      128     144   16384  1536  118.57  120.72
6     20      12      400   144    240   5.3309  2.7855
Sum   94      400     1518  36126  5625  400     400
Mean  15.667  66.667

Variance of x = 7.5556
Variance of y = 1576.6
Covariance = -106.9
Slope of first regression line = -14.15
Intercept of first regression line = 288.42
Slope of second regression line = -14.74
Intercept of second regression line = 297.62
Correlation coefficient = -0.98

The correlation coefficient is calculated using the statistical parameters that

have been already calculated.
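The worksheet arithmetic of Example 10 can be mirrored in Python as a cross-check. The sketch below is an illustration only; it uses the population definitions of variance and covariance (division by n), which is what the tabulated values correspond to, and the variable names are chosen here for readability.

```python
import math

# Price (x) and demand (y) data of Example 10
x = [15, 18, 13, 16, 12, 20]
y = [82, 25, 93, 60, 128, 12]
n = len(x)

x_mean = sum(x) / n
y_mean = sum(y) / n
var_x = sum(xi**2 for xi in x) / n - x_mean**2
var_y = sum(yi**2 for yi in y) / n - y_mean**2
cov_xy = sum(xi * yi for xi, yi in zip(x, y)) / n - x_mean * y_mean

# First regression line: y regressed on x
slope1 = cov_xy / var_x
intercept1 = y_mean - slope1 * x_mean

# Second regression line: x regressed on y, written in the form y = slope2 * x + intercept2
slope2 = var_y / cov_xy
intercept2 = y_mean - slope2 * x_mean

rho = cov_xy / math.sqrt(var_x * var_y)

print(f"y1 = {slope1:.2f} x + {intercept1:.2f}")
print(f"y2 = {slope2:.2f} x + {intercept2:.2f}")
print(f"correlation coefficient = {rho:.2f}")
```

Both lines pass through the point (mean x, mean y) by construction, since each intercept is computed from the means.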



Figure 14 Plot of the resulting data using EXCEL

The data generated has been plotted using EXCEL in the form of a chart in Figure 14. The chart option used is “scatter plot”. The given data is shown using the red circles. The two lines of regression are shown by the brown and blue lines. Both of them pass through the point (mean x, mean y), which is marked on the chart. The fit is good because the two regression lines are very close to each other.

[Chart for Figure 14: title, Linear Fit to Data; abscissa, x (10 to 20); ordinate, y (0 to 140); legend: y, y1, y2]



Figure 15 Plot of the data using EXCEL with “Trend Line” option

The fit may also be done by using the “trend line” option available in

EXCEL. We choose the linear trend line and get the plot shown in Figure 15.

It is observed that the “Trend Line” option yields the first regression

line that considers y to be a function of x. The required arithmetic is

automatically performed by EXCEL. There is an option to automatically display

the regression line equation on the chart.

[Chart for Figure 15: title, Linear Fit Using "Trend line"; annotation, The regression equation is y = -14.15x + 288.4; abscissa, x (10 to 20); ordinate, y (0 to 140); legend: Data, Linear (Data)]



Example 11

Exponential fit example using EXCEL

The data given in the first two columns of the work sheet are the time (t, s) and the corresponding temperature excess (T, °C) over the ambient of a certain system. The data is expected to be well represented by an exponential law in the form $T = A \exp(-Bt)$. Obtain the fit using EXCEL.



EXCEL work sheet appears as below.

t (s)   Data, T   Fit, T   ln(T)   t^2     t ln(T)   [ln(T)]^2
0.35    60        58.64    4.09    0.12    1.43      16.76
0.6     50        50.39    3.91    0.36    2.35      15.30
0.937   40        41.07    3.69    0.88    3.46      13.61
1.438   30        30.31    3.40    2.07    4.89      11.57
2.175   20        19.38    3.00    4.73    6.52      8.97
3.25    10        10.10    2.30    10.56   7.48      5.30

Mean t, s                      1.45833
Mean of ln(T)                  3.3991
Variance of t                  0.9935
Covariance                     -0.6026
Variance of ln(T)              0.3659
Slope (equal to -B), 1/s       -0.607
Intercept, ln A                4.2837
τ, s                           1.649
(T_o) = A, °C                  72.5
ρ                              -0.9994

t (s)   Data, T   Fit, T   Error
0.35    60        58.64    1.36
0.6     50        50.39    -0.39
0.937   40        41.07    -1.07
1.438   30        30.31    -0.31
2.175   20        19.38    0.62
3.25    10        10.10    -0.10

It is noted that the data represents a linear law on the semi-log plot (the student

is recommended to test this out by making a plot).
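The linearization used in this example, a straight-line fit of ln(T) against t, can be sketched in Python as follows. This is an illustration that mirrors the worksheet columns; the variable names are chosen here and the time constant τ is taken to be 1/B, which is consistent with the tabulated values.

```python
import math

# Time (s) and temperature excess (deg C) data of Example 11
t = [0.35, 0.6, 0.937, 1.438, 2.175, 3.25]
T = [60, 50, 40, 30, 20, 10]
n = len(t)

ln_T = [math.log(Ti) for Ti in T]
t_mean = sum(t) / n
ln_T_mean = sum(ln_T) / n

var_t = sum(ti**2 for ti in t) / n - t_mean**2
var_ln_T = sum(v**2 for v in ln_T) / n - ln_T_mean**2
cov = sum(ti * v for ti, v in zip(t, ln_T)) / n - t_mean * ln_T_mean

slope = cov / var_t                      # slope of ln(T) against t, equal to -B
intercept = ln_T_mean - slope * t_mean   # equal to ln(A)

B = -slope
A = math.exp(intercept)
tau = 1.0 / B                            # time constant (assumed to be 1/B)
rho = cov / math.sqrt(var_t * var_ln_T)

T_fit = [A * math.exp(-B * ti) for ti in t]
print(f"A = {A:.1f}, B = {B:.3f}, tau = {tau:.3f}, rho = {rho:.4f}")
```

Note that this minimizes the squared deviations of ln(T), not of T itself, which is also what an exponential trend line fitted through the logarithms does.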



Figure 16 Comparison of data with fit

[Chart: title, Comparison of data with fit; annotation, T = 72.5 e^(-0.607 t); abscissa, Time t, s (0 to 4); ordinate, Temperature Excess T, °C (0 to 70); legend: Data, Expon. (Data)]

The plot shown as Figure 16 indicates that the fit represents the data very

well. Error bars are indicated based on 95% confidence limits using the feature

available in EXCEL.



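The standard error is calculated by using the following tabulation in EXCEL.

t (s)   Data, T (°C)   Fit, T (°C)   Error (°C)   Square of error (°C)^2
0.35    60             58.64         1.36         1.86
0.6     50             50.39         -0.39        0.15
0.937   40             41.07         -1.07        1.15
1.438   30             30.31         -0.31        0.10
2.175   20             19.38         0.62         0.38
3.25    10             10.10         -0.10        0.01

Standard Error 1.87

The error bar indicated on T is based on the value of 1.87 given in the above table. This value is consistent with 1.96 times the root mean square error evaluated with n - 2 = 4 degrees of freedom, $1.96\sqrt{3.65/4} \approx 1.87$, that is, with the 95% confidence limits mentioned earlier.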

The problem may easily be solved by using the “Trend Line” option by

choosing an “exponential law” from the menu.

The quality of the fit may also be gauged by comparing the data with the values

obtained by the use of the exponential least square fit. This is done by making a

“parity plot” as given below as Figure 17. The distribution of the data around

the “parity line” is a measure of the goodness of the fit. The points should be

close to the parity line and must be distributed evenly on the two sides if there is

no “bias” in the measurement. It is observed that the exponential fit is good

based on both these counts!



Figure 17 Parity plot for the exponential fit example.

[Chart: title, Parity plot; abscissa, Data (0 to 60); ordinate, Fit (0 to 60); legend: Data, Parity line]

Sometimes it is instructive to show the error between the data and the fit.

In this example the error in the temperature excess between the data and the fit

is plotted as a function of time as shown in Figure 18. The error is evenly

distributed on both the positive and negative sides indicating absence of bias.

The error between the data and the fit is no more than 1.5°C.

Figure 18 Error distribution plot for the exponential fit example.

[Chart: title, Error plot; abscissa, Time, s (0 to 3.5); ordinate, Error = (Data - Fit), °C (-1.5 to 1.5)]

Example 12

Polynomial fit example using EXCEL

The x y data set shown below is expected to follow a quadratic

relationship. Obtain the fit by the least squares method. Discuss the relevant

statistical parameters that characterize the fit. Make a suitable plot.

x      y(Data)  y(Fit)  (y - mean y)^2  (y - y(Fit))^2
0.2    2.55     2.57    119.39          0.000511
0.35   2.86     2.98    112.80          0.016439
0.55   3.84     3.69    92.79           0.024156
0.73   4.18     4.47    86.40           0.086132
1.05   6.46     6.23    49.18           0.054466
1.32   8.29     8.07    26.87           0.050926
1.65   10.26    10.75   10.32           0.240154
1.86   13.11    12.72   0.13            0.15708
2.01   14.77    14.24   1.68            0.28611
2.55   19.83    20.55   40.41           0.515724
2.92   26.07    25.63   158.68          0.197862
3.11   27.58    28.47   198.92          0.796958
3.5    35.37    34.82   479.33          0.306418
Mean   1.68     13.48   13.48           105.92          0.273294

Index of correlation = 0.999
Standard error = 1.026

Using EXCEL “Trend Line” polynomial (quadratic) option the fit is easily obtained as $y(\mathrm{Fit}) = 2.232x^2 + 1.513x + 2.181$. The index of correlation is calculated using the relation given previously and is shown in the table. Index of correlation of 0.999 indicates that the fit is very good.



Plot shown below as Figure 19 indicates graphically the goodness of the

fit. Note that the fit shown uses the “Trend Line” option with “Polynomial Fit” of

EXCEL.

Figure 19 Polynomial fit example showing the data and the fit.

[Chart: title, Quadratic Fit to Data; inset, y = 2.2318x^2 + 1.5132x + 2.1807, R^2 = 0.998; abscissa, x (0 to 4); ordinate, Data, Fit (0 to 40); legend: y(Data), Poly. (y(Data))]

The regression equation and its index of correlation are also given in the

inset. Error bars are also indicated based on 95% confidence intervals.



Sub Module 1.6

6. Design of experiments

Goal of experiments:

• Experiments help us in understanding the behavior of a (mechanical)

system

• Data collected by systematic variation of influencing factors helps us to

quantitatively describe the underlying phenomenon or phenomena

The goal of any experimental activity is to get the maximum information

about a system with the minimum number of well designed experiments. An

experimental program recognizes the major “factors” that affect the outcome of

the experiment. The factors may be identified by looking at all the quantities that

may affect the outcome of the experiment. The most important among these

may be identified using a few exploratory experiments or from past experience

or based on some underlying theory or hypothesis. The next thing one has to

do is to choose the number of levels for each of the factors. The data will be

gathered for these values of the factors by performing the experiments by

maintaining the levels at these values.

Suppose we know that the phenomenon being studied is affected by the

pressure maintained within the apparatus during the experiment. We may

identify the smallest and the largest possible values for the pressure based on

experience, capability of the apparatus to withstand the pressure and so on.

Even though the pressure may be varied “continuously” between these limits, it



is seldom necessary to do so. One may choose a few values within the identified

range of the pressure. These will then be referred to as the levels.

Experiments repeated with a particular set of levels for all the factors

constitute replicate experiments. Statistical validation and repeatability concerns

are answered by such replicate data.

In summary an experimental program should address the following issues:

• Is it a single quantity that is being estimated or is it a trend involving more

than one quantity that is being investigated?

• Is the trend linear or non-linear?

• How different are the influence coefficients?

• What does dimensional analysis indicate?

• Can we identify dimensionless groups that influence the quantity or quantities being measured?

• How many experiments do we need to perform?

• Do the factors have independent effect on the outcome of the experiment?

• Do the factors interact to produce a net effect on the behavior of the

system?



Full factorial design:

A full factorial design of experiments consists of the following:

– Vary one factor at a time

– Perform experiments for all levels of all factors

– Hence perform a large number of experiments that are needed!

– All interactions are captured (as will be shown later)

Consider a simple design for the following case:

Let the number of factors = k

Let the number of levels for the ith factor = ni

The total number of experiments (n) that need to be performed is $n = \prod_{i=1}^{k} n_i$.

If k = 5 and the number of levels is 3 for each of the factors, the total number of experiments to be performed in a full factorial design is $3^5 = 243$.
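The count of experiments, and the list of level combinations itself, can be generated with a few lines of Python. This is a minimal sketch: levels are labelled 0, 1, 2 purely for illustration, and the names `levels_per_factor` and `runs` are chosen here.

```python
from itertools import product
from math import prod

# k = 5 factors with 3 levels each (levels labelled 0, 1, 2 for illustration)
levels_per_factor = [3, 3, 3, 3, 3]

# Total number of experiments: product of the number of levels of each factor
n_experiments = prod(levels_per_factor)

# Every combination of levels, one tuple per experiment
runs = list(product(*(range(n_i) for n_i in levels_per_factor)))

print(n_experiments, len(runs))
print(runs[:3])
```

Changing the entries of `levels_per_factor` shows how quickly the number of runs grows with the number of factors and levels.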

$2^k$ factorial design:

Consider a simple example of a $2^k$ factorial design. Each of the k factors is assigned only two levels. The levels are usually High = 1 and Low = -1. Such a scheme is useful as a preliminary experimental program before a more ambitious study is undertaken. The outcome of the $2^k$ factorial experiment will help identify the relative importance of factors and also will offer some knowledge about the interaction effects. Let us take a simple case where the number of factors is 2. Let these factors be $x_A$ and $x_B$. The number of experiments that may be performed is 4 corresponding to the following combinations:

Experiment No.   x_A   x_B
1                +1    +1
2                -1    +1
3                +1    -1
4                -1    -1

Let us represent the outcome of each experiment to be a quantity y. Thus $y_1$ will represent the outcome of experiment number 1 with both factors having their “High” values, $y_2$ will represent the outcome of the experiment number 2 with the factor A having the “Low” value and the factor B having the “High” value and so on. The outcome of the experiments may be represented as the following matrix:

x_A ↓  x_B →   +1    -1
+1             y_1   y_3
-1             y_2   y_4

A simple regression model that may be used can have up to four parameters. Thus we may represent the regression equation as

$$y = p_0 + p_A x_A + p_B x_B + p_{AB} x_A x_B \qquad (48)$$

The p's are the parameters that are determined by using the “outcome” matrix by the simultaneous solution of the following four equations:

$$\begin{aligned}
p_0 + p_A + p_B + p_{AB} &= y_1 \\
p_0 - p_A + p_B - p_{AB} &= y_2 \\
p_0 + p_A - p_B - p_{AB} &= y_3 \\
p_0 - p_A - p_B + p_{AB} &= y_4
\end{aligned} \qquad (49)$$



Figure 14 Interpretation of $2^2$ factorial experiment

It is easily seen that the parameter $p_0$ is simply the mean value of y that is obtained by putting $x_A = x_B = 0$ corresponding to the mean values for the factors. Equation (49) expresses the fact that the outcome may be interpreted as shown in Figure 14.

It is thus seen that the values of $y - p_0$ at the corners of the square indicate the deviations from the mean value and hence the mean of the square of these deviations (we may divide the sum of the squares by the number of degrees of freedom = 3) is the variance of the sample data collected in the experiment. The influence of the factors may then be gauged by the contribution of each term to the variance. These ideas will be brought out by Example 13.



Example 13

A certain process of finishing a surface involves a machine. The machine has two speed levels $x_A$ and the depth of cut $x_B$ may also take on two values. The two values are assigned +1 and -1 as explained in the case of the $2^2$ factorial experiment. The outcome of the process is the surface finish y that may have a value between 1 (the worst) and 10 (the best). A $2^2$ factorial experiment was performed and the following matrix gives the results:

x_A ↓  x_B →   +1    -1
+1             3.5   1.5
-1             8.2   2

Determine the regression parameters and comment on the results.

The regression model given in equation (48) is made use of. The four simultaneous equations for the regression parameters are given by

$$\begin{aligned}
p_0 + p_A + p_B + p_{AB} &= 3.5 \qquad \text{(i)} \\
p_0 - p_A + p_B - p_{AB} &= 1.5 \qquad \text{(ii)} \\
p_0 + p_A - p_B - p_{AB} &= 8.2 \qquad \text{(iii)} \\
p_0 - p_A - p_B + p_{AB} &= 2 \qquad \text{(iv)}
\end{aligned}$$

If we add the four equations and divide by 4 we get the parameter $p_0 = \frac{3.5 + 1.5 + 8.2 + 2}{4} = \frac{15.2}{4} = 3.8$. (i)-(ii)+(iii)-(iv) yields the value of parameter $p_A = \frac{3.5 - 1.5 + 8.2 - 2}{4} = \frac{8.2}{4} = 2.05$. (i)+(ii)-(iii)-(iv) yields the value of parameter $p_B = \frac{3.5 + 1.5 - 8.2 - 2}{4} = -\frac{5.2}{4} = -1.3$. Finally (i)-(ii)-(iii)+(iv) yields the value of parameter $p_{AB} = \frac{3.5 - 1.5 - 8.2 + 2}{4} = -\frac{4.2}{4} = -1.05$. Thus the regression equation based on the experiments is

$$y = 3.8 + 2.05 x_A - 1.3 x_B - 1.05 x_A x_B$$

The deviation with respect to the mean is obviously given by

$$d = y - 3.8 = 2.05 x_A - 1.3 x_B - 1.05 x_A x_B$$

It may be verified that the total sum of squares (SST) of the deviations is given by

$$\begin{aligned}
SST &= 4 \times \left( p_A^2 + p_B^2 + p_{AB}^2 \right) = 4 \times \left( 2.05^2 + 1.3^2 + 1.05^2 \right) \\
&= 4 \times \left( 4.2025 + 1.69 + 1.1025 \right) = 4 \times 6.995 = 27.98
\end{aligned}$$

The sample variance is thus given by

$$s_y^2 = \frac{SST}{n - 1} = \frac{27.98}{3} \approx 9.33$$

Contributions to the sample variance are given by 4 times the square of the respective parameter and hence we also have

$$\begin{aligned}
SSA &= 4 \times 4.2025 = 16.81 \\
SSB &= 4 \times 1.69 = 6.76 \\
SSAB &= 4 \times 1.1025 = 4.41
\end{aligned}$$



Here SSA means the sum of squares due to variation in level of $x_A$ and so on. The relative contributions to the sample variance are represented as percentage contributions in the following table:

        Contribution   % Contribution
SST     27.98          100
SSA     16.81          60.08
SSB     6.76           24.16
SSAB    4.41           15.76

Thus the dominant factor is the machine speed followed by the depth of

cut and lastly the interaction effect. In this example all these have significant

effects and hence a full factorial experiment is justified.
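The parameter estimates and the breakdown of the sum of squares in Example 13 can be reproduced with the Python sketch below. It is an illustration under one stated assumption: each outcome is paired with the levels $(x_A, x_B)$ according to the numbering of equations (i) to (iv), that is, 3.5 with (+1, +1), 1.5 with (-1, +1), 8.2 with (+1, -1) and 2 with (-1, -1).

```python
# (x_A, x_B, y) for each run, following the numbering of equations (i) to (iv)
runs = [(+1, +1, 3.5), (-1, +1, 1.5), (+1, -1, 8.2), (-1, -1, 2.0)]
n = len(runs)

# Because the design columns are orthogonal, each parameter is a signed average
p0 = sum(y for _, _, y in runs) / n
pA = sum(xa * y for xa, _, y in runs) / n
pB = sum(xb * y for _, xb, y in runs) / n
pAB = sum(xa * xb * y for xa, xb, y in runs) / n

# Sum of squares contributed by each term and the total
SSA = n * pA**2
SSB = n * pB**2
SSAB = n * pAB**2
SST = SSA + SSB + SSAB
sample_variance = SST / (n - 1)

# SST computed directly from the deviations, as a consistency check
SST_direct = sum((y - p0) ** 2 for _, _, y in runs)

percent = {name: 100.0 * ss / SST for name, ss in
           [("A", SSA), ("B", SSB), ("AB", SSAB)]}

print(p0, pA, pB, pAB)
print(SST, SST_direct, sample_variance)
print(percent)
```

The agreement between `SST` and `SST_direct` is a direct consequence of the orthogonality of the design columns.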

More on full factorial design

We like to generalize the ideas described above in what follows.

Extension to larger number of factors as well as larger number of levels would

then be straight forward. Let the High and Low levels be represented by + an –

respectively. In the case of 22 factorial experiment design the following will hold:

Ax Bx A Bx xRow vector 1 + + + Row vector 2 + - - Row vector 3 - + - Row vector 4 - - + Column sum 0 0 0 Column sum of squares 4 4 4



We note that the dot product of any two columns is zero. Also the column sums are zero. Hence the three columns may be considered as vectors that form an orthogonal set. In fact, while calculating the sample variance earlier, these properties were used without being spelt out.
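The column properties quoted above are easy to confirm numerically. The following is a minimal sketch, with +1 and -1 standing for the High and Low levels; the dictionary names are illustrative and not taken from the notes.

```python
from itertools import combinations

# Columns of the 2^2 design table, read down the four row vectors.
columns = {
    "xA": [+1, +1, -1, -1],
    "xB": [+1, -1, +1, -1],
}
columns["xAxB"] = [a * b for a, b in zip(columns["xA"], columns["xB"])]

column_sums = {name: sum(col) for name, col in columns.items()}
sums_of_squares = {name: sum(v * v for v in col) for name, col in columns.items()}

# Dot product of every pair of distinct columns.
dot_products = {
    (m, n): sum(u * v for u, v in zip(columns[m], columns[n]))
    for m, n in combinations(columns, 2)
}

print(column_sums)
print(sums_of_squares)
print(dot_products)
```

Because the columns are mutually orthogonal and each has a sum of squares equal to 4, every regression coefficient can be obtained independently as a signed sum of the observations divided by 4.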

Most of the time it is not possible to conduct that many experiments! The

question that is asked is: “Can we reduce the number of experiments and yet get

an adequate representation of the relationship between the outcome of the

experiment and the variation of the factors?” The answer is in general “yes”. We

replace the full factorial design with a fractional factorial design. In the

fractional factorial design only certain combinations of the levels of the factors

are used to conduct the experiments. This ploy helps to reduce the number of

experiments. The price to be paid is that not all interactions will be resolved.

Consider again the case of the $2^2$ factorial experiment. If the interaction term $x_A x_B$ is small, or assumed to be small, only three regression coefficients $p_0, p_A, p_B$ will be important. We then require only three experiments, so that the four equations reduce to only three equations corresponding to the three experimental data that will be available. They may be either:

$p_0 + p_A + p_B = y_1$
$p_0 - p_A + p_B = y_2$
$p_0 + p_A - p_B = y_3$    (50)

Or

$p_0 + p_A + p_B = y_1$
$p_0 - p_A + p_B = y_2$
$p_0 - p_A - p_B = y_4$    (51)



depending on the choice of the row vectors in conducting the experiments.

In this simple case of two factors the economy of reducing the number of

experiments by one may not be all that important. However it is very useful to go

in for a fractional factorial design when the number of factors is large and when

we expect some factors or interactions between some factors to be unimportant.

Thus fractional factorial experiment design is useful when main effects dominate

with interaction effects being of lower order. Also we may always do more

experiments if necessitated by the observations.
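To see what dropping the interaction term does in practice, the three-equation set (50) can be solved by simple elimination. The sketch below is only an illustration: it reuses the first three observations of the earlier example (3.5, 1.5, 8.2) as $y_1$, $y_2$, $y_3$, which is an assumption about how those runs map onto the equations, and the function name `solve_three_runs` is hypothetical.

```python
def solve_three_runs(y1, y2, y3):
    """Solve p0+pA+pB=y1, p0-pA+pB=y2, p0+pA-pB=y3 by elimination."""
    p_a = (y1 - y2) / 2   # first minus second removes p0 and pB
    p_b = (y1 - y3) / 2   # first minus third removes p0 and pA
    p0 = (y2 + y3) / 2    # second plus third removes pA and pB
    return p0, p_a, p_b

# Three of the four observations from the earlier 2^2 example (assumed mapping).
p0, p_a, p_b = solve_three_runs(3.5, 1.5, 8.2)
print(p0, p_a, p_b)

# Full factorial values from the earlier example, for comparison.
full = {"p0": 3.8, "pA": 2.05, "pB": -1.3, "pAB": -1.05}
print(p0 - full["p0"], p_a - full["pA"], p_b - full["pB"])
```

With these numbers the three-run estimates differ from the full factorial values by exactly the magnitude of the interaction coefficient, 1.05. This is consistent with the remark that the reduced design is appropriate only when the interaction is small; in the earlier example it is not.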

One half factorial design:

Figure 15 Three factors system with two levels. All possible values are given by the corners of the cube.
(Axes: A, B, C. Corners labeled a, b, c, abc and a’, b’, c’, a’b’c’.)



For a system with k factors and 2 levels the number of experiments in a full factorial design will be $2^k$. For example, if k = 3, this number works out to be $2^3 = 8$. The eight combinations of levels correspond to the corners of a cube as represented by Figure 15. A half factorial design would use $2^{k-1}$ experiments. With k = 3 this works out to be $2^2 = 4$. The half factorial design thus cuts the number of experiments by half. In the half factorial design we have to choose half the number of experiments, and they should correspond to four of the eight corners of the cube. We may choose the corners corresponding to a, b, c and abc as one possible set. This set will correspond to the following:

Point   $x_A$   $x_B$   $x_C$
a       +       -       -
b       -       +       -
c       -       -       +
abc     +       +       +

We notice at once that the three column vectors are orthogonal. Also the points are obtained by requiring that the projection on to the left face of the cube gives a full factorial design of type $2^2$. It is easily seen that we may also use the corners a’, b’, c’ and a’b’c’ to get a second possible half factorial design. This is represented by the following:

Point    $x_A$   $x_B$   $x_C$
a’       +       +       -
b’       +       -       +
c’       -       +       +
a’b’c’   -       -       -
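One way to generate the two sets of corners listed above is to split the eight corners of the cube according to the sign of the product $x_A x_B x_C$. This observation is not stated in the notes; it is offered here as a sketch that happens to reproduce both tables, with illustrative names throughout.

```python
from itertools import product

# All 2^3 corners of the cube as (xA, xB, xC), with levels +1 and -1.
corners = list(product([+1, -1], repeat=3))

# Split the corners by the sign of the triple product xA*xB*xC.
half_plus = [c for c in corners if c[0] * c[1] * c[2] == +1]
half_minus = [c for c in corners if c[0] * c[1] * c[2] == -1]

def is_balanced_orthogonal(points):
    """True if each factor column sums to zero and all pairs are orthogonal."""
    cols = list(zip(*points))
    balanced = all(sum(col) == 0 for col in cols)
    orthogonal = all(
        sum(u * v for u, v in zip(cols[i], cols[j])) == 0
        for i in range(len(cols)) for j in range(i + 1, len(cols))
    )
    return balanced and orthogonal

print(half_plus, is_balanced_orthogonal(half_plus))
print(half_minus, is_balanced_orthogonal(half_minus))

# By contrast, four corners on one face (xA fixed at +1) are not balanced.
one_face = [c for c in corners if c[0] == +1]
print(is_balanced_orthogonal(one_face))
```

The first set corresponds to the points a, b, c, abc and the second to a’, b’, c’, a’b’c’. Four corners taken from a single face fail the check because one factor never changes level.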

In each of these cases we need to perform only 4 experiments. Let us

look at the consequence of this. For this purpose we make the following table

corresponding to the a, b, c, abc case.

Point        $x_A$   $x_B$   $x_C$   $x_A x_B$   $x_A x_C$   $x_B x_C$   $x_A x_B x_C$
a            +       -       -       -           -           +           +
b            -       +       -       -           +           -           +
c            -       -       +       +           -           -           +
abc          +       +       +       +           +           +           +
Column No.   1       2       3       4           5           6           7

Notice that column vector 1 is identical to column vector 6. We say that these two are “aliases”. Similarly column vectors 2 and 5, and column vectors 3 and 4, form alias pairs. Let us look at the consequence of these.

The most general regression model that can be used to represent the

outcome of the full factorial experiment would be



$y = p_0 + p_A x_A + p_B x_B + p_C x_C + p_{AB} x_A x_B + p_{AC} x_A x_C + p_{BC} x_B x_C + p_{ABC} x_A x_B x_C$ (52)

There are eight regression coefficients and the eight experiments in the

full factorial design would yield all these coefficients. However we now have only

four experiments and hence only four regression coefficients may be resolved.

By looking at the procedure used earlier in solving the equations for the regression coefficients, it is clear that it is not possible to separately obtain the coefficients that form an alias pair. For example, it is not possible to resolve $p_A$ and $p_{BC}$. These two are said to be “confounded”. The student may verify that the following pairs are confounded: $p_B$ and $p_{AC}$, $p_A$ and $p_{BC}$, $p_C$ and $p_{AB}$, $p_0$ and $p_{ABC}$. The consequence of this is that the best regression we may propose is

$y = p_0 + p_A x_A + p_B x_B + p_C x_C$ (53)

All interaction effects are unresolved and we have only the primary effects

being accounted for. The regression model is indeed a linear model! The

student may perform a similar analysis with the second possible half factorial

design and come to the same conclusion.
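The alias pairs for both half factorial designs can be found mechanically by comparing the sign columns of all effects. The sketch below assumes the corner coordinates given in the two tables above; columns that are identical, or identical apart from an overall sign, are reported as aliases. The function names are illustrative only.

```python
from itertools import combinations

def effect_columns(points):
    """Sign columns for the intercept (I), main effects and interactions."""
    names = "ABC"
    cols = {"I": [1] * len(points)}
    for size in (1, 2, 3):
        for idx in combinations(range(3), size):
            label = "".join(names[i] for i in idx)
            col = []
            for p in points:
                value = 1
                for i in idx:
                    value *= p[i]
                col.append(value)
            cols[label] = col
    return cols

def alias_pairs(points):
    """Pairs of effects whose columns are identical (+1) or opposite (-1)."""
    cols = effect_columns(points)
    pairs = []
    for m, n in combinations(cols, 2):
        if cols[m] == cols[n]:
            pairs.append((m, n, +1))
        elif cols[m] == [-v for v in cols[n]]:
            pairs.append((m, n, -1))
    return pairs

# Corners a, b, c, abc and a', b', c', a'b'c' as (xA, xB, xC).
first_half = [(+1, -1, -1), (-1, +1, -1), (-1, -1, +1), (+1, +1, +1)]
second_half = [(+1, +1, -1), (+1, -1, +1), (-1, +1, +1), (-1, -1, -1)]

print(alias_pairs(first_half))
print(alias_pairs(second_half))
```

Both designs give the same pairing (intercept with ABC, A with BC, B with AC, C with AB). In the second design the paired columns are opposite in sign rather than identical, which still leaves the two coefficients of each pair unresolved.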

Generalization:

Consider a $2^k$ full factorial design. The number of experiments will be $2^k$. These experiments will resolve k main effects, ${}^kC_2$ two-factor interactions, ${}^kC_3$ three-factor interactions, ..., ${}^kC_{k-1}$ (k-1)-factor interactions and 1 k-factor interaction. (If k = 5, say, these will be 5 main effects, 10 two-factor interactions, 10 three-factor interactions, 5 four-factor interactions and 1 five-factor interaction.)
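The counts quoted for k = 5 follow directly from the binomial coefficients. A minimal sketch, using Python's standard `math.comb` and an illustrative function name:

```python
from math import comb

def effect_counts(k):
    """Number of j-factor effects in a 2^k design (j = 1 means main effects)."""
    return {j: comb(k, j) for j in range(1, k + 1)}

counts = effect_counts(5)
total_with_mean = 1 + sum(counts.values())  # add 1 for the mean (intercept)

print(counts)
print(total_with_mean, 2**5)
```

Together with the mean, the effects add up to $2^k$, so the $2^k$ experiments provide exactly as many equations as there are regression coefficients.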



More on simple design:

We now look at other ways of economizing on the number of experimental runs required to understand the behavior of systems. We take a simple example of characterizing the frictional pressure drop in a tube. A typical experimental setup will look like the one shown in Figure 16.

The factors that influence the pressure drop (measured with a differential

pressure gage) between stations 1 and 2 may be written down as:

1. Properties of the fluid flowing in the tube: Density ρ, the fluid viscosity µ

(two factors)

2. Geometric parameters: Pipe diameter D, Distance between the two

stations L and the surface roughness parameter (zero for smooth pipe

and non-zero positive number for a rough pipe). (two or three factors)

3. Fluid velocity V (one factor)

Figure 16 Schematic of an experimental set up for friction factor measurement in pipe flow
(Labels: stations 1 and 2 separated by distance L; pipe diameter D; flow velocity V; pressure drop ∆p between the stations; pipe inner surface qualifier smooth/rough.)



The total number of factors that may influence the pressure drop ∆p is thus equal

to 5 factors in the case of a smooth pipe and 6 factors in the case of a rough

pipe. The student can verify that a full factorial design with two levels would

require a total of 32 or 64 experiments. In practice the velocity of the fluid in the

pipe may vary over a wide range of values. The fluid properties may be varied

by changing the fluid or by conducting the experiment at different pressure and

temperature levels! The diameter and length of the pipe may indeed vary over a

very wide range. If one were to go for a simple design the number of factors and

the levels are very large and it would be quite impossible to perform the required

number of experiments.

The question now is: “How are we going to design the experimental scheme?”

How do we conduct a finite number of experiments and yet get enough

information to understand the outcome of the measurement? We look for the

answer in “dimensional analysis” (the student would have learnt this from his/her course in Fluid Mechanics). Dimensional analysis, in

fact, indicates that the outcome of the experiment may be represented by a

simple relationship of the form

$\frac{\Delta p}{\rho V^2} = f\left(\frac{\rho V D}{\mu}, \frac{L}{D}\right)$ (54)

The student may verify that both the left hand side quantity and the quantities within the bracket on the right hand side are non-dimensional. This means they are pure numbers! The f outside the bracketed term on the right hand side indicates a functional relation. In Fluid Mechanics parlance $\Delta p/(\rho V^2)$ is known as the Euler number and $\rho V D/\mu$ is known as the Reynolds number. The third non-dimensional parameter that appears above is the ratio L/D. Essentially the Euler number is a function of only two factors: the Reynolds number and the length to diameter ratio! The number of factors has been reduced from 6 to just 2! We may conduct the experiments with just one or two fluids, a few values of the velocity and maybe a couple of different diameter pipes of various lengths to identify the nature of the functional relationship indicated in Equation (54).
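The verification suggested to the student can be done by bookkeeping the exponents of mass, length and time. The sketch below assumes the usual definitions of the Euler number, $\Delta p/(\rho V^2)$, and the Reynolds number, $\rho V D/\mu$; the dictionary layout and the function name `dimension_of` are illustrative choices, not part of the notes.

```python
# Dimensions of each variable as exponents of (mass, length, time).
DIMS = {
    "dp":  (1, -1, -2),   # pressure drop: M L^-1 T^-2
    "rho": (1, -3, 0),    # density: M L^-3
    "mu":  (1, -1, -1),   # dynamic viscosity: M L^-1 T^-1
    "V":   (0, 1, -1),    # velocity: L T^-1
    "D":   (0, 1, 0),     # pipe diameter: L
    "L":   (0, 1, 0),     # distance between stations: L
}

def dimension_of(group):
    """Net (M, L, T) exponents of a product of variables raised to powers."""
    total = [0, 0, 0]
    for name, power in group.items():
        for i, exponent in enumerate(DIMS[name]):
            total[i] += power * exponent
    return tuple(total)

euler = {"dp": 1, "rho": -1, "V": -2}               # dp / (rho V^2)
reynolds = {"rho": 1, "V": 1, "D": 1, "mu": -1}     # rho V D / mu
length_ratio = {"L": 1, "D": -1}                    # L / D

for label, group in [("Euler", euler), ("Reynolds", reynolds), ("L/D", length_ratio)]:
    print(label, dimension_of(group))
```

All three groups come out with zero net exponents, which is what is meant by saying they are pure numbers.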

In summary:

1. The $2^k$ experiments will account for the intercept (or the mean), the main effects and all interaction effects, and this is referred to as a Resolution k design. (With k = 5, we have a full factorial design having resolution 5. Number of experiments is 32.)

2. Semi or half factorial design $2^{k-1}$ will be a resolution k-1 design. (With k = 5, we have a half factorial design having resolution 4. Number of experiments is 16.)

3. Quarter factorial design $2^{k-2}$ will have a resolution of k-2. (With k = 5, we have a quarter factorial design having resolution 3. Number of experiments is 8.)

And so on.

The student may also work out the aliases in all these cases.
