MODULE 1 Mechanical Measurementslibrarian/web courses/IIT-MADRAS/Mech_Meas... · MODULE 1...
Transcript of MODULE 1 Mechanical Measurementslibrarian/web courses/IIT-MADRAS/Mech_Meas... · MODULE 1...
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
MODULE 1 Mechanical Measurements
1. Introduction to Mechanical Measurements
Figure 1 Why make measurements?
We recognize three reasons for making measurements as indicated in
Figure 1. From the point of view of the course measurements for commerce is
outside its scope.
Engineers design physical systems in the form of machines to serve
some specified functions. The behavior of the parts of the machine during the
operation of the machine needs to be examined or analyzed or designed such
that it functions reliably. Such an activity needs data regarding the machine parts
in terms of material properties. These are obtained by performing measurements
in the laboratory.
Why Measure?
Generate Data for Design
Generate Data to Validate or Propose a
Theory
For Commerce
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
The scientific method consists in the study of nature to understand the
way it works. Science proposes hypotheses or theories based on observations
and need to be validated with carefully performed experiments that use many
measurements. When once a theory has been established it may be used to make
predictions which may themselves be confirmed by further experiments.
Measurement categories
1. Primary quantity
2. Derived quantity
3. Intrusive – Probe method
4. Non-intrusive
Measurement categories are described in some detail now.
1. Primary quantity:
It is possible that a single quantity that is directly measurable is of
interest. An example is the measurement of the diameter of a cylindrical
specimen. It is directly measured using an instrument such as vernier calipers.
We shall refer to such a quantity as a primary quantity.
2. Derived quantity:
There are occasions when a quantity of interest is not directly
measurable by a single measurement process. The quantity of interest needs to
be estimated by using an appropriate relation involving several measured
primary quantities. The measured quantity is thus a derived quantity. An
example of a derived quantity is the determination of acceleration due to gravity
(g) by finding the period (T) of a simple pendulum of length (L). T and L are the
measured primary quantities while g is the derived quantity.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
3. Probe or intrusive method:
Most of the time, the measurement of a physical quantity uses a probe
that is placed inside the system. Since a probe invariably affects the measured
quantity the measurement process is referred to as an intrusive type of
measurement.
4. Non-intrusive method:
When the measurement process does not involve insertion of a probe into
the system the method is referred to as being non-intrusive. Methods that use
some naturally occurring process like radiation emitted by a body to measure a
desired quantity relating to the system the method may be considered as non-
intrusive. The measurement process may be assumed to be non-intrusive
when the probe has negligibly small interaction with the system. A typical
example for such a process is the use of laser Doppler velocimeter (LDV) to
measure the velocity of a flowing fluid.
General measurement scheme:
Figure 2 Schematic of a general measurement system
Signal Conditioner
Detector and
Transducer
Measured quantity
Calibration or
reference signal
External power
Controller
Output
Computer
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 2 shows the schematic of a general measurement scheme. Not all
the elements shown in the Figure may be present in a particular case. The
measurement process requires invariably a detector that responds to the
measured quantity by producing a measurable change in some property of the
detector. The change in the property of the detector is converted to a
measurable output that may be either mechanical movement of a pointer over a
scale or an electrical output that may be measured using an appropriate
electrical circuit. This action of converting the measured quantity to a different
form of output is done by a transducer. The output may be manipulated by a
signal conditioner before it is recorded or stored in a computer. If the
measurement process is part of a control application the computer can use a
controller to control the measured quantity. The relationship that exists between
the measured quantity and the output of the transducer may be obtained by
calibration or by comparison with a reference value. The measurement
system requires external power for its operation.
Some issues:
1. Errors – Systematic or Random
2. Repeatability
3. Calibration and Standards
4. Linearity or Linearization
Any measurement, however carefully it is conducted, is subject to
measurement errors. These errors make it difficult to ascertain the true value of
the measured quantity. The nature of the error may be ascertained by repeating
the measurement a number of times and looking at the spread of the values. If
the spread in the data is small the measurement is repeatable and may be
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
termed as being good. If we compare the measured quantity obtained by the use
of any instrument and compare it with that obtained by a standardized
instrument the two may show different performance as far as the repeatability
is concerned. If we add or subtract a certain correction to make the two
instruments give data with similar spread the correction is said to constitute a
systematic error. The spread of data in each of the instruments will constitute
random error.
The process of ascertaining the systematic error is calibration. The
response of a detector to the variation in the measured quantity may be linear or
non-linear. In the past the tendency was to look for a linear response as the
desired response. Even when the response of the detector was non-linear the
practice was to make the response linear by some manipulation. With the advent
of automatic recording of data using computers this is not necessary since
software can take care of this aspect.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Sub Module 1.2 2. Errors in measurements
Errors accompany any measurement, however well it is conducted. The
error may be inherent in the measurement process or it may be induced due to
variations in the way the experiment is conducted. The errors may be classified
as:
1. Systematic errors (Bias):
Systematic errors due to faulty or improperly calibrated instruments.
These may be reduced or eliminated by careful choice and calibration of
instruments. Sometimes bias may be linked to a specific cause and estimated by
analysis. In such a case a correction may be applied to eliminate or reduce bias.
Bias is an indication of the accuracy of the measurement. Smaller the bias more
accurate the data
2. Random errors:
Random errors are due to non-specific causes like natural disturbances
that may occur during the measurement process. These cannot be eliminated.
The magnitude of the spread in the data due to the presence of random errors is
a measure of the precision of the data. Smaller the random error more precise is
the data. Random errors are statistical in nature. These may be characterized by
statistical analysis.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
We shall explain these through the familiar example shown in Figure 3.
Three different individuals with different skill levels are allowed to complete a
round of target practice. The outcome of the event is shown in the figure.
Figure 3 Precision and accuracy explained through a familiar example
It is evident that the target at the left belongs to a highly skilled shooter.
This is characterized by all the shots in the inner most circle. The result indicates
good accuracy as well as good precision. A measurement made well must be
like this case! The individual in the middle is precise but not accurate. Maybe it
is due to a faulty bore of the gun. The individual at the right is an unskilled
person who is behind on both counts. Most beginners will fall into this category.
The analogy is quite realistic since most students performing a measurement in
the laboratory may be put into one of the three categories. A good
experimentalist has to work hard to excel in it!
Good Precision Poor Accuracy
Good Precision Good Accuracy
Poor Precision Poor Accuracy
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Another example:
Figure 4 Example showing the presence of systematic and random errors
in data.
The results shown in Figure 4 compare the response of a particular
thermocouple (that measures temperature) and a standard thermocouple. The
measurements are reported between room temperature (close to 20°C) and
500°C. That there is a systematic variation between the two is clear from the
figure that shows the trend of the measured temperatures indicated by the
particular thermocouple. The systematic error appears to vary with the
0
5
10
15
20
25
0 100 200 300 400 500Temperature, oC
Out
put,
mV
Standard ReferenceIndividual Thermocouple DataPoly. (Individual Thermocouple Data)
Bias
Error
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
temperature. The data points indicated by the full symbols appear also to hug
the trend line. However the data points do not lie on it. This is due to random
errors that are always present in any measurement. Actually the standard
thermocouple would also have the random errors that are not indicated in the
figure. We have deliberately shown only the trend line for the standard
thermocouple.
Sub Module 1.3
3. Statistical analysis of experimental data Statistical analysis and best estimate from replicate data:
Let a certain quantity X be measured repeatedly to get
iX , i=1,n (1)
Because of random errors these are all different.
How do we find the best estimate Xb for the true value of X?
It is reasonable to assume that the best value be such that the
measurements are as precise as they can be!
In other words, the experimenter is confident that he has conducted the
measurements with the best care and he is like the skilled shooter in the
target practice example presented earlier!
Thus, we minimize the variance with respect to the best estimate Xb of X.
Thus we minimize:
[ ]n
2i b
i 1S X X
== −∑ (2)
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
This requires that:
[ ]b
n
ii 1
SX
X
n=
∂−
∂
=
∑
∑
n
i bi=1
b
= 2 X X (-1) =0
or X
(3)
The best estimate is thus nothing but the mean of all the individual
measurements!
Error distribution:
When a quantity is measured repeatedly it is expected that it will be
distributed around the best value according to some distribution. Many times
the random errors may be distributed as a normal distribution. If µ and σ are,
respectively, the mean and the standard deviation, then, the probability density is
given by
−⎡ ⎤− ⎢ ⎥⎣ ⎦=
21 x µ2 σ1f(x) e
σ 2π (4)
The probability that the error around the mean is (x-µ) is the area under
the probability density function between (x-µ)+dx and (x-µ) represented by the
product of the probability density and dx. The probability that the error is
anywhere between -∞ and x is thus given by the following integral:
−⎡ ⎤− ⎢ ⎥⎣ ⎦=−∞∫
1 v µx22 σ1F(x) e dv
σ 2π (5)
This is referred to as the cumulative probability. It is noted that if x→∞
the integral tends to 1. Thus the probability that the error is of all possible
magnitudes (between -∞ and +∞) is unity! The integral is symmetrical with
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
respect to x=µ as may be easily verified. The above integral is in fact the error
integral that is a tabulated function. A plot of f(x) and F(x) is given in Figure 5.
Figure 5 Normal distribution and its integral
Many times we are interested in finding out the chances of error lying between
two values in the form ±pσ. This is referred to as the “confidence interval” and
the corresponding cumulative probability specifies the chances of the error
occurring within the confidence interval. Table 1 gives the confidence intervals
that are useful in practice:
00.20.40.60.8
1
-3 -2 -1 0 1 2 3(x-µ )/σ
Cumulative Probability density
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Table 1
Confidence intervals according to normal distribution
Cumulative Probability 0 0.95 0.99 0.999
Interval p 0 +1.96 +2.58 +3.29
The table indicates that error of magnitude greater than ±3.29σ is very unlikely to
occur. In most applications we specify +1.96σ as the error bounds based on
95% confidence.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Example 1
Resistance of a certain resistor is measured repeatedly to obtain the
following data.
No. 1 2 3 4 5 6 7 8 9
R, kΩ 1.22 1.23 1.26 1.21 1.22 1.22 1.22 1.24 1.19
What is the best estimate for the resistance? What is the error with 95%
confidence?
Best estimate is the mean of the data.
1.22 4 1.23 1.26 1.21 1.24 1.19R9
= 1.223 1.22 k
× + + + + +=
≈ Ω
Standard deviation of the error σ:
9 2
1-4
-4
1Variance = Ri -R9
=3.33 10Hence :
= 3.33 10 = 0.183 0.02 k
⎡ ⎤⎣ ⎦
×
σ ×≈ Ω
∑
Error with 95% confidence :
95% Error = 1.96 = 1.96 0.0183 = 0.036 0.04 k
σ ×
≈ Ω
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Example 2
Thickness of a metal sheet (in mm) is measured repeatedly to obtain the
following replicate data. What is the best estimate for the sheet thickness? What
is the variance of the distribution of errors with respect to the best value? Specify
an error estimate to the mean value based on 99% confidence.
Experiment No. 1 2 3 4 5 6
t, mm 0.202 0.198 0.197 0.215 0.199 0.194
Experiment No. 7 8 9 10 11 12
t, mm 0.204 0.198 0.194 0.195 0.201 0.202
The best estimate for the metal sheet thickness is the mean of the 12
measured values. This is given by
12
i1
b
0.202 0.198 0.197 0.215 0.199 0.194 0.204t
0.198 0.194 0.195 0.201 0.202t t = 0.2 mm
12 12
+ + + + + +⎡ ⎤⎢ ⎥+ + + + +⎣ ⎦= = =
∑
The variance with respect to the mean or the best value is given by (on
substituting t for bt ) as
12 122i i
2 21 1b
2 2 2 2 2 2 2
2 2 2 2 2
-5 2
t t t = t
12 120.202 0.198 0.197 0.215 0.199 0.194 0.204
0.198 0.194 0.195 0.201 0.2020.2 mm
12= 3.04 10 mm
−⎡ ⎤⎣ ⎦σ = −
⎡ ⎤+ + + + + +⎢ ⎥⎢ ⎥+ + + + +⎣ ⎦= −
×
∑ ∑
The corresponding standard deviation is given by
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
5b 3.04 10 =0.0055 0.006 mm−σ = × ≈
The corresponding error estimate based on 99% confidence is
bError = 2.58 = 2.58 0.0055 0.014 mm± σ ± × ≈ ±
Principle of Least Squares
Earlier we have dealt with the method of obtaining the best estimate from
replicate data based on minimization of variance. No mathematical proof was
given as a basis for this. We shall now look at the above afresh, in the light of
the error distribution that has been presented above.
Consider a set of replicate data xi. Let the best estimate for the measured
quantity be xb. The probability for a certain value xi within the interval
i i ix , x dx+ to occur in the measured data is given by the relation
( )2b i
2x x
2i i
1p(x ) e dx2
−−
σ=σ π
(6)
The probability that the particular values of measured data are obtained in
replicate measurements must be given by the compound probability given by
( )
( )
( )
( )22 nb ib i
22 i 1
x xx xn n22
i in ni 1 i 1
1 1p = e dx e dx2 2
=
−− −−σσ
= =
∑=
σ π σ π∏ ∏ (7)
The reason the set of data was obtained as replicate data is that it was the
most probable! Since the intervals idx are arbitrary, the above will have to be
maximized by the proper choice of bx and σ such that the exponential factor is a
maximum. Thus we have to choose bx and σ such that
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
( )2n
b i2
i 1
x x2
n1p ' e =
−−
σ∑
=σ
(8)
has the largest possible value. As usual we set the derivatives b
p ' p ' 0x∂ ∂
= =∂ ∂σ
to
get the values of the two parameters xb and σ. We have:
( )
( )
2ni b
2i 1
x xn
2i bn 2
b i 1This part should go to zero
p ' 1 e 2 x x ( 1) 0x 2
=
−
σ+
=
∑∂= − − − =
∂ σ∑ (9)
Or
( )n n
i b b ii 1 i n
x x =0 or x x x= =
− = =∑ ∑ (10)
It is clear thus that the best value is nothing but the mean of the values! We also
have:
( )( )2n
i b2
i=1
x -xn 2 2
i bn+1 n+3i 0
This part should go to Zero
p ' n 1 = - x x e 0σ
=
∑⎡ ⎤∂+ − =⎢ ⎥∂σ σ σ⎢ ⎥⎣ ⎦
∑ (11)
Or
( )
n 2i b
2 i 1x x
=n
=−
σ∑
(12)
This last expression indicates that the parameter σ2 is nothing but the
variance of the data with respect to the mean! Thus the best values of the
measured quantity and its spread is based on the minimization of the squares of
errors with respect to the mean. This embodies what is referred to as the
“Principle of Least Squares”.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Propagation of errors:
Replicate data collected by measuring a single quantity repeatedly
enables us to calculate the best value and characterize the spread by the
variance with respect to the best value, using the principle of least squares. Now
we look at the case of a derived quantity that is estimated from the
measurement of several primary quantities. The question that needs to be
answered is the following:
“A derived quantity Q is estimated using a formula that involves the
primary quantities. 1 2 na ,a ,.....a Each one of these is available in terms of the
respective best values 1 2 na , a ,.....a and the respective standard deviations
1 2 n, ....σ σ σ . What is the best estimate for Q and what is the corresponding
standard deviation Qσ ?”
We have, by definition
1 2 nQ =Q(a ,a ,.......a ) (13)
It is obvious that the best value of Q should correspond to that obtained by using
the best values for the a’s. Thus, the best estimate for Q given by Q as
1 2 nQ =Q(a ,a ,.......a ) (14)
Again, by definition, we should have:
( )N 22
Q ii 1
1 = Q QN =
σ −∑ (15)
The subscript i indicates the experiment number and the ith estimate of Q is given
by
( )i 1i 2i niQ Q a ,a ,....a= (16)
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
If we assume that the spread in values are small compared to the mean or the
best values (this is what one would expect from a well conducted experiment),
the difference between the ith estimate and the best value may be written using a
Taylor expansion around the best value as
2N
2Q 1i 2i ni
1 2 2i 2
1 Q Q Qa a ...... aN a a a=
⎛ ⎞∂ ∂ ∂σ = ∆ + ∆ + + ∆⎜ ⎟∂ ∂ ∂⎝ ⎠
∑ (17)
where the partial derivatives are all evaluated at the best values for the a’s. If the
a’s are all independent of one another then the errors in these are unrelated to
one another and hence the cross terms. N
mi kii 1
a a 0 for m k=∆ ∆ = ≠∑ Thus equation
(17) may be rewritten as
2 2 2N
2Q 1i 2i ni
1 2 ni 1
1 Q Q Qa a ....... aN a a a=
⎡ ⎤⎛ ⎞ ⎛ ⎞ ⎛ ⎞∂ ∂ ∂⎢ ⎥σ = ∆ + ∆ + + ∆⎜ ⎟ ⎜ ⎟ ⎜ ⎟∂ ∂ ∂⎢ ⎥⎝ ⎠ ⎝ ⎠ ⎝ ⎠⎣ ⎦∑ (18)
Noting that ( )2N
2ji j
i 1a =N
=∆ σ∑ we may recast the above equation in the form
2 2 2
2 2 2 2Q 1 2 n
1 2 n
Q Q Q = + +.......+a a a
⎛ ⎞ ⎛ ⎞ ⎛ ⎞∂ ∂ ∂σ σ σ σ⎜ ⎟ ⎜ ⎟ ⎜ ⎟∂ ∂ ∂⎝ ⎠ ⎝ ⎠ ⎝ ⎠
(19)
Equation (19) is the error propagation formula. It may also be recast in the form
2 2 22 2 2
Q 1 2 n1 2 n
Q Q Q = + +.......+a a a
⎛ ⎞ ⎛ ⎞ ⎛ ⎞∂ ∂ ∂σ σ σ σ⎜ ⎟ ⎜ ⎟ ⎜ ⎟∂ ∂ ∂⎝ ⎠ ⎝ ⎠ ⎝ ⎠
(20)
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Example 3
The volume of a sphere is estimated by measuring its diameter by vernier
calipers. In a certain case the diameter has been measured as D = 0.0502 ±
0.00005 m. Determine the volume and specify a suitable uncertainty for the
same.
Nominal volume of sphere:
3 35 3D 0.0502V =3.14159 6.624 10 m
6 6−= π × = ×
The error in the measured diameter is specified as:
D 0.00005m∆ = ±
The influence coefficient is defined as
2 2-3 2
DV D 0.0502I = = 3.14159 =3.958 10 mD 2 2∂
= π × ×∂
Using the error propagation formula, we have
-3 7 3DV=I D=3.958 10 0.00005 1.979 10 m−∆ ∆ × × = ×
Thus
5 7 3V 6.624 10 1.979 10 m− −= × ± × Alternate solution to the problem
By logarithmic differentiation we have
dV dD =3V D
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
This may be recast as
-5 -5 3D 0.00005V 3V = 3 6.624 10 = 0.0198 10 mD 0.0502∆
∆ = ± ± × × × ± ×
This is the same as the result obtained earlier.
Example 4
Two resistances R1 and R2 are given as 1000 ± 25 Ω and 500 ± 10 Ω .
Determine the equivalent resistance when these two are connected in a) series
and b) parallel. Also determine the uncertainties in these two cases.
Given Data:
1 1 2 2R 1000, 25;R 500 10 All Values are in = σ = = σ = → Ω
Case a) Resistances connected in series:
Equivalent resistance is
s 1 2R =R R = 1000+500=1500+ Ω
Influence coefficients are:
s s1 2
1 2
R RI 1 ; I 1R R∂ ∂
= = = =∂ ∂
Hence the uncertainty in the equivalent resistance is
( ) ( ) ( ) ( )2 2 2 2s 1 1 2 2 = I I = 25 10 26.93 σ ± σ + σ ± + = ± Ω
Case b) Resistances connected in parallel:
Equivalent resistance is given by
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
( )
1 2p
1 2
1 2p
1 2
R R 1000 500R = 333.3 R R 1000 500
R R RR R
×= = Ω
+ +
=+
Influence coefficients are:
( ) ( )
( ) ( )
p 2 1 21 2
1 1 2 1 2
p 1 1 21 2 2
2 1 2 1 2
R R R R 500 1000 500I = = = 0.111R R R 1500 1500R RR R R R 1000 1000 500I = = = 0.444R R R 1500 1500R R
∂ ×− − =
∂ + +
∂ ×− − =
∂ + +
Hence the uncertainty in the equivalent resistance is
( ) ( ) ( ) ( )2 2 2 2s 1 1 2 2 = I I = 0.111 25 0.444 10 = 5.24 σ ± σ + σ ± × + × ± Ω
Thus the equivalent resistance is 1500 ± 26.9 Ω in the series arrangement
and 333.6 ± 5.24 Ω in the parallel arrangement.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Error estimation – some results without proof
Standard deviation of the means
The problem occurs as indicated below:
• Replicate data is collected with n measurements in a set
• Several such sets of data are collected
• Each one of them has a mean and a variance (precision)
• What is the mean and standard deviation of the means of all sets?
Population mean
Let N be the total number of data in the entire population. Mean of all the
sets m will be nothing but the population mean (i.e. the mean of all the
collected data taken as a whole).
Population variance
Let the population variance be
( )N
2i
2 i 1x m
=N
=−
σ∑
(21)
Variance of the means
Let the variance of the means be 2mσ . Then we can show that:
( )( )
2 2m
N nn N 1
−σ = σ
− (22)
If n<<N the above relation will be approximated as
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
( )( )( )( )
2 2m
22
N nn N 1
1-n/N =
n 1-1/N n
−σ = σ
−
σσ ≈
(23)
Estimate of variance
• Sample and its variance
– How is it related to the population variance?
• Let the sample variance from its own mean ms be 2eσ .
• Then we can show that:
( )( )
2 2 2e
N n 1 = 1n N 1 n
− ⎛ ⎞σ σ ≈ σ −⎜ ⎟− ⎝ ⎠ (24)
Error estimator
The last expression may be written down in the more explicit form:
( )
( )
n 2i s
2 1e
x -m =
n-1σ
∑ (25)
Physical interpretation
Equation (25) may be interpreted using physical arguments. Since the
mean (the best value) is obtained by one use of all the available data, the
degrees of freedom available (units of information available) is one less than
before. Hence the error estimator uses the factor (n-1) rather than n in the
denominator!
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Example 5 (Example 1 revisited)
Resistance of a certain resistor is measured repeatedly to obtain the
following data.
# 1 2 3 4 5 6 7 8 9
R, kΩ 1.22 1.231.261.211.221.221.221.241.19
What is the best estimate for the resistance? What is the error with 95%
confidence?
Best estimate is the mean of the data.
1.22 4 1.23 1.26 1.21 1.24 1.19R =9
=1.223 1.22 k
× + + + + +
≈ Ω
Standard deviation of the error eσ :
92 -4e i
1
1 = R R = 3.75 108
⎡ ⎤σ − ×⎣ ⎦∑
Hence
-4e = 3.75 10 = 0.019 0.02kσ × ≈ Ω
Error with 95% confidence :
95% eError = 1.96 =1.96 0.019 = 0.036 0.04 k
σ ×≈ Ω
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Sub Module 1.4
4. Regression analysis:
Now we are ready to consider curve fit or regression analysis. Suitable
plot of data will indicate the nature of the trend in data and hence will indicate the
nature of the relation between the independent and the dependent variables. A
few examples are shown in Figure 6(a-c).
Figure 6 (a) Linear relation between y and x
-0.5
0
0.5
1
1.5
2
2.5
0 1 2 3
x
y
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 6(b) Linear relation between log x and log y
1
10
100
1000
10000 100000 1000000 x
y
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
The linear graph shown in Figure 6(a) follows a relationship of the form
y=ax+b. The linear relationship on the log-log plot shown in Figure 6(b) follows
the form by ax= . The non-linear relationship shown in Figure 6(c) follows a
polynomial relationship of the form 3 2y ax bx cx d= + + + . The parameters a, b, c,
d are known as the fit parameters and need to be determined as a part of the
regression analysis.
16
18
20
22
24
26
28
30
32
0 500 1000 1500
x
y
Figure 6(c) Non-linear relation between y and x
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Linear fit is possible in all the cases shown in Table 2.
Table 2
y=ax+b Linear fit Plots as a straight line on a linear graph sheetby ax= Power law fit Plots as a straight line on a log-log graph
bxy ae= Exponential fit Plots as a straight line on a semi-log graph
Linear regression:
Let ( ) ( ) ( ) ( )1 1 2 2 3 3 n nx , y , x , y , x , y ,........ x , y be a set of ordered pairs of
data. It is expected that there is a linear relation between y and x. Thus, if we
plot the data on a linear graph sheet as in Figure 2(a) the trend of the data
should be well represented by a straight line. We notice that the straight line
shown in the figure does not pass through any of the data points shown by full
symbols. There is a deviation between the data and the line and this deviation is
sometimes positive, sometimes negative, sometimes large and sometimes small.
If we look at the value given by the straight line as a local mean then the
deviations are distributed with respect to the local mean as a normal distribution.
If all data are obtained with equal care one may expect the deviations at various
data points to follow the same distribution and hence the least square principle
may be applied as under:
[ ] ( )n n 22
i f i i2 1 1
y y y ax bMinimise s
n n
− − +⎡ ⎤⎣ ⎦= =∑ ∑
(26)
where fy ax b= + is the desired linear fit to data. We see that 2s is the variance
of the data with respect to the fit and minimization will yield the proper choice of
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
the mean line represented by the proper parameters a and b. The minimization
requires that
( ) ( )2 2n n
i i i i i1 1
s 1 s 12 y ax b x 0; 2 y ax b 0a n a n
∂ ∂= − − + = = − − + =⎡ ⎤ ⎡ ⎤⎣ ⎦ ⎣ ⎦∂ ∂∑ ∑ (27)
These equations may be rearranged as two simultaneous equations for a and b
as given below:
( ) ( )
( )
2i i i i
i i
x a x b x y
x a nb y
+ =
+ =
∑ ∑ ∑
∑ ∑ (28)
These are known as normal equations. The summation is from i=1 to n and is
not indicated explicitly. The solution to these two equations may be obtained
easily by the use of Kramer’s rule.
i i i2
i i i i i i
i i2 2
i i i i
y x n y
x y x x x ya , b
n x n x
x x x x
= =
∑ ∑ ∑∑ ∑ ∑ ∑
∑ ∑∑ ∑ ∑ ∑
(29)
We now introduce the following definitions:
2 22 2i i i i i i2 2
x y xyx x x y x y
x , y , x , y and xyn n n n n
= = σ = − σ = − σ = −∑ ∑ ∑ ∑ ∑ (30)
The last of the quantities defined in (30) is known as the covariance. All the
other quantities are already familiar to us from statistical analysis. With these
definitions the slope of the line fit a may be written as
xy2x
aσ
=σ
(31)
The latter of the expressions in (28) may be solved for the fit line intercept b as
y abx−
= (32)
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
In fact the last equation indicates that the regression line passes through the
point ( )x, y . The fit line may be represented in the alternate form fY aX= where
f fY y y= − and X x x= − .
Example 6
The following data is expected to follow a relation of the form y=ax+b.
Determine the fit parameters by linear regression.
x 0.9 2.3 3.3 4.5 5.7 6.7y 1.1 1.6 2.6 3.2 4 5
It is convenient to make a table as shown below. The data given are in
columns 2 and 3. The other quantities needed to calculate the fit parameters are
in the other columns.
Data No. x y x2 y2 x y 1 0.9 1.1 0.8100 1.2100 0.9900 2 2.3 1.6 5.2900 2.5600 3.6800 3 3.3 2.6 10.8900 6.7600 8.5800 4 4.5 3.2 20.2500 10.2400 14.4000 5 5.7 4 32.4900 16.0000 22.8000 6 6.7 5 44.8900 25.0000 33.5000
Column Sum: 23.4 17.5 114.6200 61.7700 83.9500 Column Mean 3.9 2.9167 19.1033 10.2950 13.9917
2xσ 3.8933 Slope of the fit line is: a = 0.6721
2yσ 1.7881 The intercept is: b = 0.2955
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Sums are calculated column-wise and are shown in row 8. Various
means are then in row 9. The variances are in rows 10, 11 and column 2. The
regression parameters are then calculated using the results of the analysis
presented earlier.
The regression line is thus given by fy 0.6721x 0.2955= + . The data and
the fit are compared in the following table.
That the fit is a good representation of the data is indicated by the
proximity of the respective values in the second and third columns. The plot
shown in Figure 7 is further proof of this.
Figure 7 Comparison of data with the fit
x y yf
0.9 1.1 0.9 2.3 1.6 1.8 3.3 2.6 2.5 4.5 3.2 3.3 5.7 4 4.1 6.7 5 4.8
0123456
0 2 4 6 8
x
y an
d y f
Data Fit
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Goodness of fit and the correlation coefficient:
A measure of how good the regression line as a representation of the data
is deduced now. In fact it is possible to fit two lines to data by (a) treating x as
the independent variable and y as the dependent variable or by (b) treating y as
the independent variable and x as the dependent variable. The former has been
done above. The latter is described by a relation of the form x a ' y b '= + . The
procedure followed earlier can be followed through to get the following (the
reader is expected to show these results):
xy2y
a ' , b ' x a ' yσ
= = −σ
(33)
The second fit line may be recast in the form
1 b 'y ' xa ' a '
= − (34)
The slope of this line is 1a '
which is not the same, in general, as a, the
slope of the first regression line. If the two slopes are the same the two
regression lines coincide. Otherwise the two lines are distinct. The ratio of the
slopes of the two lines is a measure of how good the form of the fit is to the data.
In view of this we introduce the correlation coefficient ρ defined through the
relation
2xy2
2 2x y
slope of first Regression line aa 'slope of second Regression line
σρ = = =
σ σ (35)
Or
xy
x y
σρ = ±
σ σ (36)
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
The sign of the correlation coefficient is determined by the sign of the
covariance. If the regression line has a negative slope the correlation coefficient
is negative while it is positive if the regression line has a positive slope. The
correlation is said to be perfect if 1ρ = ± . The correlation is poor if 0ρ ≈ .
Absolute value of the correlation coefficient should be greater than 0.5 to indicate
that y and x are related!
In Example 6 the correlation coefficient is positive. The pertinent
parameters are 2 2x y xy3.8933, 1.7811 and 2.6167σ = σ = σ = . With these the
correlation coefficient is 2.6167 0.9923.8933 1.7811
ρ = =×
. Since the correlation
coefficient is close to unity the fit represents the data very closely (Figure 7 has
already indicated this).
Polynomial regression:
Sometimes the data may show a non-linear behavior that may be modeled
by a polynomial relation. Consider a quadratic fit as an example. Let the fit
equation be given by 2fy ax bx c= + + . The variance of the data with respect to
the fit is again minimized with respect to the three fit parameters a, b, c to get
three normal equations. These are solved for the fit parameters. Thus we have
( ) 22i2
y ax bx cs
n
⎡ ⎤− + +⎣ ⎦=∑
(37)
Least square principle requires
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
( )
( )
( )
22 2
i i
22
i i
22
i
s 2 y ax bx c x 0a ns 2 y ax bx c x 0b ns 2 y ax bx c 0c n
∂ ⎡ ⎤= − + + =⎣ ⎦∂
∂ ⎡ ⎤= − + + =⎣ ⎦∂
∂ ⎡ ⎤= − + + =⎣ ⎦∂
∑
∑
∑
(38)
These may be rewritten as
4 3 2 2i i i i i3 2i i i i i
2i i i
a x b x c x x y
a x b x c x x y
a x b x nc y
+ + =
+ + =
+ + =
∑ ∑ ∑ ∑∑ ∑ ∑ ∑
∑ ∑ ∑ (39)
Normal equations (39) are easily solved for the three fit parameters to
complete the regression analysis.
Goodness of fit and the index of correlation:
In the case of a non-linear fit we define a quantity known as the index of
correlation to determine the goodness of the fit. The fit is termed good if the
variance of the deviates is much less than the variance of the y’s. Thus we
require the index of correlation defined below to be close to ±1 for the fit to be
considered good.
[ ]22f
2 2y
y ys1 1y y
−ρ = ± − = ± −
σ ⎡ ⎤−⎣ ⎦
∑∑
(40)
It can be shown that the index of correlation is identical to the correlation
coefficient for a linear fit. The index of correlation compares the scatter of the
data with respect to its own mean as compared to the scatter of the data with
respect to the regression curve.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Example 7
The friction factor Reynolds number product fRe for laminar flow in a
rectangular duct is a function of the aspect ratio hAw
= where h is the height and
w is the width of the rectangle. The following table gives the available data:
A 0 0.05 0.10 0.125 0.167 0.25 0.4 0.5 0.75 1
fRe 96 89.81 84.68 82.34 78.81 72.93 65.47 62.19 57.87 56.91
Make a suitable fit to data.
A plot of the given data indicates that a quadratic fit may be appropriate.
For the purpose of the following analysis we represent the aspect ratio as x and
the fRe product as y. We seek a fit to data of the form 2fy ax bx c= + + . The
following tabulation helps in the regression analysis.
No. x y x2 x3 x4 x y x2y 1 0 96 0 0 0 0 0 2 0.05 89.81 0.0025 0.000125 6.25E-06 4.4905 0.224525 3 0.1 84.68 0.01 0.001 0.0001 8.468 0.8468 4 0.125 82.34 0.015625 0.001953 0.000244 10.2925 1.286563 5 0.167 78.81 0.027889 0.004657 0.000778 13.16127 2.197932 6 0.25 72.93 0.0625 0.015625 0.003906 18.2325 4.558125 7 0.4 65.47 0.16 0.064 0.0256 26.188 10.4752 8 0.5 62.19 0.25 0.125 0.0625 31.095 15.5475 9 0.75 57.89 0.5625 0.421875 0.316406 43.4175 32.56313 10 1 56.91 1 1 1 56.91 56.91 sum 3.342 747.03 2.091014 1.634236 1.409541 212.2553 124.6098
The three normal equations are then given by
1.409541a 1.634236b 2.091014c 124.60981.634236a 2.091014b 3.342c 212.2553
2.091014a 3.342b 10c 747.03
+ + =+ + =
+ + =
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
These three simultaneous equations are solved to get the three fit parameters as
a=58.354, b=-94.432, c=94.06
The following table helps in comparing the data with the fit.
x y yf s2=(y-yf)2 sy2
0 96 94.06 3.763026 453.5622 0.05 89.81 89.48 0.105978 228.2214 0.1 84.68 85.20 0.270956 99.54053 0.125 82.34 83.17 0.685562 58.32377 0.167 78.81 79.91 1.226588 16.86745 0.25 72.93 74.09 1.367455 3.143529 0.4 65.47 65.62 0.023763 85.24829 0.5 62.19 61.43 0.573285 156.5752 0.75 57.89 56.06 3.346951 282.677 1 56.91 57.98 1.150145 316.5908 Sum 747.03 747.03 12.51371 1700.75 Mean 74.703 1.251371 170.075
The table also shows how the index of correlation is calculated. The
column sums and column means required are given the last two rows of the
table. Note that calculation of 2yσ requires sums of the form
2y y⎡ ⎤−⎣ ⎦ where y is
available as the last entry in column 2. The index of correlation uses the mean
values of columns 4 and 5 given by 2y 170.075σ = and 2s 1.251371= . The index of
correlation is thus equal to 1.2513711 0.963.170.075
ρ = − = − The negative sign
indicates that y decreases when x increases. The index of correlation is close to
-1 and hence the fit represents the data very well. A plot of the data along with
the fit given in Figure 8 also indicates this. The standard error of the fit is given
by s 1.251371 1.12= = ± .
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 8 Comparison of the data with the quadratic fit
In the above we have considered cases that involved one independent
variable and one dependent variable. Sometimes the dependent variable may
be a function of more than one variable. For example, the relation of the form
b cNu a Re Pr= is a common type of relationship between the Nusselt number
(Nu, dependent variable) and Reynolds (Re) and Prandtl (Pr) numbers both of
which are independent variables. By taking logarithms, we see
that log(Nu) log(a) b log(Re) c log(Pr)= + + . It is thus seen that the relationship is
linear when logarithms of the dependent and independent variables are used to
describe the fit. Also the relationship may be expressed in the form z=ax+by+c,
where z is the dependent variable, x and y are independent variables and a, b, c
are the fit parameters. The least square method may be used to determine the fit
0
20
40
60
80
100
0 0.2 0.4 0.6 0.8 1
Aspect ratio
Fric
tion
fact
or R
eyno
lds
num
ber
prod
uct
Data Quadratic fit
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
parameters. Let the data be available for set of n x, y values. The quantity to be
minimized is given by
( ) 22i
is z ax bx c= − + +⎡ ⎤⎣ ⎦∑ (41)
The normal equations are obtained by the usual process of setting the first partial
derivatives with respect to the fit parameters to zero.
2i i i i i i
2i i i i i i
i i i
a x b x y c x x z
a x y b y c y y z
a x b y nc z
+ + =
+ + =
+ + =
∑ ∑ ∑ ∑∑ ∑ ∑ ∑
∑ ∑ ∑ (42)
These equations are solved simultaneously to get the three fit parameters.
Example 8
The following table gives the variation of z with x and y. Obtain a multiple linear
fit to the data and comment on the goodness of the fit.
No. x y z 1 0.1 0.2 0.426 2 0.3 0.35 0.539 3 0.559 0.5 0.651 4 0.847 0.65 0.786 5 1.156 0.8 0.892 6 1.48 0.95 1.058 7 1.817 1.1 1.185 8 2.168 1.25 1.33 9 2.525 1.4 1.474 10 2.893 1.55 1.634
The calculation procedure follows that given previously. Several sums are
required and these are tabulated below.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
No. x y z x2 x y y2 x z y z 1 0.1 0.2 0.426 0.01 0.02 0.04 0.0426 0.0852 2 0.3 0.35 0.539 0.09 0.105 0.1225 0.1617 0.18865 3 0.559 0.5 0.651 0.312481 0.2795 0.25 0.363909 0.3255 4 0.847 0.65 0.786 0.717409 0.55055 0.4225 0.665742 0.5109 5 1.156 0.8 0.892 1.336336 0.9248 0.64 1.031152 0.7136 6 1.48 0.95 1.058 2.1904 1.406 0.9025 1.56584 1.0051 7 1.817 1.1 1.185 3.301489 1.9987 1.21 2.153145 1.3035 8 2.168 1.25 1.33 4.700224 2.71 1.5625 2.88344 1.6625 9 2.525 1.4 1.474 6.375625 3.535 1.96 3.72185 2.0636 10 2.893 1.55 1.634 8.369449 4.48415 2.4025 4.727162 2.5327 Sum 13.845 8.75 9.975 27.40341 16.0137 9.5125 17.31654 10.39125
The last row contains the sums required and the normal equations are easily
written down as under:
27.40341a 16.0137b 13.845c 17.3165416.0137a 9.5125b 8.75c 10.39125
13.845a 8.75b 10c 9.975
+ + =+ + =
+ + =
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 9 Parity plot showing the goodness of the fit
These are solved to get the fit parameters as a=0.285, b=0.297, c=0.343.
The data and the fit may be compared by making a parity plot as shown in Figure
9. The parity plot is a plot of given data (z) along the abscissa and the fit (zf)
along the ordinate. The parity line is a line of equality between the two. The
departure of the data from the parity line is an indication of the quality of the fit.
The above figure indicates that the fit is indeed very good. When the data is a
function of more than one independent variable it is not always possible to make
plots between independent and dependent variables. In such a case the parity
plot is a way out.
0.000
0.400
0.800
1.200
1.600
2.000
0 0.5 1 1.5 2
Data
Fit
Fit Parity Line
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
We may also calculate the index of correlation as an indicator of the
quality of the fit. This calculation is left to the reader!
General non-linear fit:
The fit equation may sometimes have to be chosen as a non-linear
relation that is not either a polynomial or in a form that may be reduced to the
linear form. In such a case the parameter estimation is more involved and
requires the use of a search method to determine the best parameter set that
minimizes the sum of the squares of the residual. The method is described in
some detail below.
Let us represent the fit relation in the form ( )f 1 2 my f x;a ,a ,...a= where the
dependent variable is x and 1 ma a− are m fit parameters to be determined by the
regression analysis. As before we assume that n sets of x, y values are
available. Consider the sum of the squares of the residual given by
( ) ( ) ( ) 221 m 1 m i 1 2 m
is a ..a S a ..a y f x;a ,a ,...a= = −⎡ ⎤⎣ ⎦∑ (43)
In general it is not possible to set the partial derivatives with respect to the
parameters to zero to obtain the normal equations and thus obtain the fit
parameters. In view of this let us look at what is happening to the sum of
squares near a starting parameter set 0 0 01 2 ma ,a ,...a . The sum of squares is
evaluated using this parameter set in equation (43) to get
( )0 0 0 01 2 mS S a ,a ,...a= .Perturb each of the a’s individually to get
( ) ( ) ( )0 0 0 0 0 0 0 0 01 1 2 m 1 1 2 m 1 1 2 mS a a ,a ..a ,S a a ,a ..a ,S a a ,a ..a+ ∆ + ∆ + ∆ ,.. ( )0 0 0 0
1 2 j j mS a ,a ,..a a ..a+ ∆
( )0 0 0 01 2 j j mS a ,a ,..a a ..a+ ∆ , ( )0 0 0 0
1 2 m mS a ,a ,..a a+ . Using these we may estimate the
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
partial derivatives by the use of finite difference approximation
as( )
( ) ( )0 0 01 2 m
0 0 0 0 0 0 0 01 2 j j m 1 2 j m
j ja ,a ,..a
S a ,a ,..a a ..a S a ,a ,a ..aSa a
+ ∆ −∂=
∂ ∆. There are m such
partial derivatives and they are all likely to be non-zero (if they are all zero we are
already at the optimum point where the sum of squares is possibly a minimum).
The gradient vector is then given by the components1 2 j m
S S S S, ,.. ..a a a a
⎛ ⎞∂ ∂ ∂ ∂⎜ ⎟⎜ ⎟∂ ∂ ∂ ∂⎝ ⎠
. The
magnitude of this vector is obtained by summing the squares of all the partial
derivatives and then taking the square root of this sum.
2m
jj 1
Sgrad Sa=
⎛ ⎞∂= ⎜ ⎟⎜ ⎟∂⎝ ⎠∑ (44)
We divide each of the partial derivatives occurring in the gradient vector
by the magnitude of the gradient vector thus calculated to get the components of
a unit vector that is aligned with the gradient vector. Thus
j1 2 m
SS S Saa a a
, ,.. ,..grad S grad S grad S grad S
⎛ ⎞∂⎛ ⎞ ⎛ ⎞ ⎛ ⎞∂ ∂ ∂⎜ ⎟⎜ ⎟ ⎜ ⎟ ⎜ ⎟⎜ ⎟∂∂ ∂ ∂⎝ ⎠ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠ (45)
We now take a specific fraction (α, small) of each of these components to
define a step along a direction opposite the gradient vector to get
1 21 0 1 01 1 2 2
j m1 0 1 0j j m m
S Sa a
a a ,a a ,..grad S grad S
S Sa a
a a ,..a agrad S grad S
⎛ ⎞ ⎛ ⎞∂ ∂⎜ ⎟ ⎜ ⎟∂ ∂⎝ ⎠ ⎝ ⎠= −α = −α
⎛ ⎞∂ ⎛ ⎞∂⎜ ⎟ ⎜ ⎟⎜ ⎟∂ ∂⎝ ⎠ ⎝ ⎠= −α = −α
(46)
The calculation above is redone with the new values of the parameter set.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
The calculation is continued till the magnitude of the gradient reaches
zero or acceptably small value at which the calculation stops and the
parameter set is assumed to have satisfied the least square principle. An
example will make this procedure clear.
Example 9 The data given in the following table is expected to follow a relation of the
form bxfy ae cx= + . Determine the fit parameters by general non-linear
regression.
x 0 0.2 0.4 0.6 0.8 1.0 1.2 1.4 1.6 1,8
y 1.196 1.379 1.581 1.79 2.013 2.279 2.545 2.842 3.173 3.5
The sum of squares of the residual is given by ( )i
210bx
i ii 1
S y ae cx=
⎡ ⎤= − +⎣ ⎦∑ . The
partial derivatives needed are obtained analytically
( ) ( ) ( )
( )
i i i i
i
10 10bx bx bx bx
i i i i ii 1 i 110
bxi i i
i 1
S S2 y ae cx e , 2 y ae cx ax e ,a b
S 2 y ae cx xa
= =
=
∂ ∂⎡ ⎤ ⎡ ⎤= − + = − + −⎣ ⎦ ⎣ ⎦∂ ∂
∂ ⎡ ⎤= − +⎣ ⎦∂
∑ ∑
∑ (47)
The above means that the partial derivatives may be computed once the
starting set of parameters is known or assumed. We start the calculation with the
initial parameter seta=1,b=0.2,c=0.1. The value of S turns out to be 11.673 for
this set of parameter values. Using (47) the partial derivatives are obtained
respectively as-24.023,-30.681,-23.003.The magnitude of the gradient vector is
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
then given by ( ) ( ) ( )0.52 2 2-24.023 , -24.023 , -24.023 45.249⎡ ⎤ =
⎣ ⎦. The components of
the unit vector 0 0 0a b cu ,u ,u are then given by 24.023 30.681 23.003, ,
45.249 45.249 45.249− −
− − − or-
0.531,-0.678,-0.508. We shall choose α value of 0.02 to get the next trial values
for the parameters as
( )( )( )
1 0 0a
1 0 0b
1 0 0c
a a u 1 0.02 0.531 1.011
b b u 1 0.02 0.678 0.214
c c u 1 0.02 0.508 0.11
= −α = − ×− =
= −α = − ×− =
= −α = − ×− =
The S value for this parameter set turns out to be 10.759. The
calculations may be repeated as above. The results are summarized below.
α a b c grad S S 0.02 1 0.2 0.1 45.25 11.67 1.011 0.214 0.11 44.29 10.76 1.022 0.228 0.12 43.25 9.87 .. .. .. .. .. .. 1.219 0.505 0.265 0.848 0.0286 0.005 1.21004 0.5984 0.2665 0.2831 0.001265 0.001 1.209288 0.509251 0.266293 0.108248 0.00108
It is clear from the table that a large number of trials are involved in the
regression analysis. The value of α needs to be reduced as we approach the
optimum set. The final set of parameters for the present case is given by
a=1.2093,b=0.5093,c=0.2663
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 10 Comparison of data with the fit
That the regression analysis has indeed converged to the proper fit
parameters is seen from the excellent agreement between the data and the fit
shown in Figure 10. The reader is left to determine the index of correlation and
the standard error of the fit.
00.5
11.5
22.5
33.5
4
0 0.5 1 1.5 2
x
y
Data Fit
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Sub Module 1.5
5. Use of EXCEL for regression analysis
EXCEL is a Microsoft product that comes along with the Office suite of
programs. It is essentially a spread sheet program that provides a computing
environment with graphic capabilities. The student is encouraged to learn the
basics of EXCEL programming so that data analysis, regression analysis and
suitable plots may all be done within the EXCEL environment.
EXCEL work sheet provides a grid with cells in it. The cells form columns
and rows as in a matrix. The columns are identified by alphanumeric symbols
and the rows by numerals. For example, A1 refers to the cell in the first column
and first row. Cell C5 will represent the cell in the 3rd column (column number C)
and the 5th row (row number 5). Column identifiers will go from A – Z and then
from AA – AZ and so on.. The cell can hold a number, a statement or a formula.
A number or a statement is simply written by putting the cursor in the appropriate
cell and keying in the number or the statement, as the case may be. A formula,
however, is written by preceding the formula by “=” sign. The formula can
contain a reference to many built in functions in EXCEL as well as the usual
arithmetic operations. The formula can refer to the content of any other cell or
cells. The formulas can be calculated repeatedly over a set of rows by simply
copying down the formula vertically.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 11 An extract of an EXCEL work sheet shows some of the things
one can do!
A B C D E F G 1 2 23 88 3 This is a statement
4
In cell B4 is the formula "=A2*B2" i.e. product of two numbers 2024
5 The formula in B4 is acted upon and the result alone appears in the cell B4 as seen above..
6 x x^2 7 1 1 8 2 4 9 3 9 10 4 16 11 5 25 12 6 36 13 7 49 14 8 64 15 9 81 16 10 100
17 Sum of G7to G16 is obtained by entering the formula "=SUM(G7:G16) in Cell G17 385
18 SUM() is a built in function in EXCEL
Data may be keyed into the cells in the form of columns as shown in the
work sheet given as Figure 12 below. The plotting is menu driven and the plot
may be displayed as a separate plot or within the work sheet. The latter is
shown in the case given here. The data range for the plot is specified by simply
blocking the Data cells shown by the blue background!
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 12 Another extract of an EXCEL work sheet, showing data and the
corresponding plot.
A B C D E E G H 1 2 x y 3 1 3.33 4 2 10.43 5 3 21.53 6 4 36.63 7 5 55.73 8 6 78.83 9 7 105.93 10 8 137.03 11 9 172.13
Properties of cells, chart (plot is referred to as chart in EXCEL) are
changed to suit the requirements with menu driven controls. Student should
familiarize oneself by learning these through “HELP” available in EXCEL.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 13 Another extract of an EXCEL work sheet, showing data and the
corresponding plot along with the automatically generated fit. The inset in
the plot gives the linear relation between y and x. The square of the
correlation coefficient is also shown in the inset (symbol R2).
A B C D E F G H 1 The following data is expected to follow a linear law. Obtain such a law 2 by using the "Trend Line" option in EXCEL. 3 x yd 4 0.5 0.35 5 1 1.66 6 1.5 3.418 7 2 4.488 8 2.5 5.306 9 3 8.584 10 3.5 9.97 11 4 12.196 12 4.5 15.382 13 5 15.548 14 5.5 17.274 15 6 18.704 16 6.5 20.306 17 7 21.612 18 7.5 21.446 19 8 24.108
Figure 13 shows how a trend line can be added to the plot. The inset in
the plot shows the relationship that exists between the y and x data values.
Correlation coefficient is very high indicating the fit to represent the data
extremely well.
Several examples of regression using EXCEL are presented below. The
examples are self explanatory and I expect the student to work them out using
EXCEL himself/herself.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Example 10
Linear fit example using EXCEL
The EXCEL worksheet into which the data is input is shown below. The
data is keyed in the cells as shown. The sums required are automatically
calculated by using the SUM function. Variances and the covariance are
calculated using their definitions given earlier. The slopes and the intercepts
are calculated using the formulae derived by the least square method. The two
regression lines are then given by:
1
2
y 14.15x 288.4y 14.74x 297.6= − += − +
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Price per thousand pieces of a certain product (x) determines the demand (y) for the product. The data is given below. Fit a straight line to data and discuss the quality of the fit.
No. x y x2 y2 xy y1 y2
Mean: 15.667 66.667 1 15 82 225 6724 1230 76.103 76.495 2 18 25 324 625 450 33.64 32.269 3 13 93 169 8649 1209 104.41 105.98 4 16 60 256 3600 960 61.949 61.753 5 12 128 144 16384 1536 118.57 120.72 6 20 12 400 144 240 5.3309 2.7855
Sum 94 400 1518 36126 5625 400 400 Mean 15.667 66.667 Variance of x= 7.5556 Variance of y= 1576.6 Covariance = -106.9 Slope of first regression line=
-14.15
Intercept of first regression line=
288.42
Slope of second regression line=
-14.74
Intercept of first regression line=
297.62
Correlation coefficient = -0.98
The correlation coefficient is calculated using the statistical parameters that
have been already calculated.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 14 Plot of the resulting data using EXCEL
The data generated has been plotted using EXCEL in the form of a chart
in Figure 14.. The chart option used is “scatter plot”. The given data is shown
using the red circles. The two lines of regression are shown by the brown and
blue lines. Both of them pass through the mean x and mean y (indicated by the
point ). The fit is good because the two regression lines are very close to each
other.
Linear Fit to Data
0
20
40
60
80
100
120
140
10 12 14 16 18 20
x
y
y y1 y2
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 15 Plot of the data using EXCEL with “Trend Line” option
The fit may also be done by using the “trend line” option available in
EXCEL. We choose the linear trend line and get the plot shown in Figure 15.
It is observed that the “Trend Line” option yields the first regression
line that considers y to be a function of x. The required arithmetic is
automatically performed by EXCEL. There is an option to automatically display
the regression line equation on the chart.
Linear Fit Using "Trend line"
The regression equation is y = -14.15x + 288.4
0
20
40
60
80
100
120
140
10 12 14 16 18 20
x
yData Linear (Data)
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Example 11
Exponential fit example using EXCEL
The given in the first two columns are the time (t, s) and the corresponding
temperature excess (T, °C) over the ambient of a certain system. The data is
expected to be well represented by an exponential law in the form T A exp( Bt)= − .
Obtain the fit using EXCEL.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
EXCEL work sheet appears as below.
t (s) Data, (T)
Fit,(T) ln(T) ln(T) t2 t ln(T) [ln(T)]2
0.35 60 58.64 4.09 4.09 0.12 1.43 16.76 0.6 50 50.39 3.91 3.91 0.36 2.35 15.30
0.937 40 41.07 3.69 3.69 0.88 3.46 13.61 1.438 30 30.31 3.40 3.40 2.07 4.89 11.57 2.175 20 19.38 3.00 3.00 4.73 6.52 8.97 3.25 10 10.10 2.30 2.30 10.56 7.48 5.30
mean t, s
Mean of
[ln(T)]
Var t Covaria-nce
Var of ln(T)
1.45833 3.3991 0.9935 0.6026 0.3659 slope B, 1/s
intercept,
ln A
-0.607 4.2837 τ s (To)=A,o
C
1.649 72.5 ρ
t (s) Data,(T) Fit,(T) Error -0.9994 0.35 60 58.64 1.36 0.6 50 50.39 -0.39
0.937 40 41.07 -1.07 1.438 30 30.31 -0.31 2.175 20 19.38 0.62 3.25 10 10.10 -0.10
It is noted that the data represents a linear law on the semi-log plot (the student
is recommended to test this out by making a plot).
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Comparison of data with fit
T = 72.5 e-0.607 t
0
10
20
30
40
50
60
70
0 1 2 3 4Time t, s
Tem
pera
ture
Exc
ess
T ,
o CData Expon. (Data)
Figure 16 Comparison of data with fit
The plot shown as Figure 16 indicates that the fit represents the data very
well. Error bars are indicated based on 95% confidence limits using the feature
available in EXCEL.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
The standard error is calculated by using the following tabulation using EXCEL.
t (s) Data, T °C Fit,T°C Error, °C
Square of error (°C)2
0.35 60 58.64 1.36 1.86 0.6 50 50.39 -0.39 0.15
0.937 40 41.07 -1.07 1.15 1.438 30 30.31 -0.31 0.10 2.175 20 19.38 0.62 0.38 3.25 10 10.10 -0.10 0.01
Standard Error 1.87
Error bar indicated on T is based on the standard error of 1.87 indicated in the
above table.
The problem may easily be solved by using the “Trend Line” option by
choosing an “exponential law” from the menu.
The quality of the fit may also be gauged by comparing the data with the values
obtained by the use of the exponential least square fit. This is done by making a
“parity plot” as given below as Figure 17. The distribution of the data around
the “parity line” is a measure of the goodness of the fit. The points should be
close to the parity line and must be distributed evenly on the two sides if there is
no “bias” in the measurement. It is observed that the exponential fit is good
based on both these counts!
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Parity plot
0
20
40
60
0 20 40 60Data
Fit
Data Parity line
Figure 17 Parity plot for the exponential fit example.
Sometimes it is instructive to show the error between the data and the fit.
In this example the error in the temperature excess between the data and the fit
is plotted as a function of time as shown in Figure 18. The error is evenly
distributed on both the positive and negative sides indicating absence of bias.
The error between the data and the fit is no more than 1.5°C
Figure 18 Error distribution plot for the exponential fit example.
Error plot
-1.5
-1
-0.5
0
0.5
1
1.5
0 0.5 1 1.5 2 2.5 3 3.5
Time, s
Erro
r=(D
ata-
Fit)
, o C
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Example 12
Polynomial fit example using EXCEL
The x y data set shown below is expected to follow a quadratic
relationship. Obtain the fit by the least squares method. Discuss the relevant
statistical parameters that characterize the fit. Make a suitable plot.
x y(Data) y(Fit) (y-mean y)2 (y-y(Fit))2 0.2 2.55 2.57 119.39 0.000511 0.35 2.86 2.98 112.80 0.016439 0.55 3.84 3.69 92.79 0.024156 0.73 4.18 4.47 86.40 0.086132 1.05 6.46 6.23 49.18 0.054466 1.32 8.29 8.07 26.87 0.050926 1.65 10.26 10.75 10.32 0.240154 1.86 13.11 12.72 0.13 0.15708 2.01 14.77 14.24 1.68 0.28611 2.55 19.83 20.55 40.41 0.515724 2.92 26.07 25.63 158.68 0.197862 3.11 27.58 28.47 198.92 0.796958 3.5 35.37 34.82 479.33 0.306418 Mean 1.68 13.48 13.48 105.92 0.273294 Index of correlation= 0.999 Standard error = 1.026
Using EXCEL “Trend Line” polynomial (quadratic) option the fit is easily
obtained as 2y(Fit) 2.232x 1.513x 2.181.= + + The index of correlation is calculated
using the relation given previously and is shown in the table. Index of correlation
of 0.999 indicates that the fit is very good.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Plot shown below as Figure 19 indicates graphically the goodness of the
fit. Note that the fit shown uses the “Trend Line” option with “Polynomial Fit” of
EXCEL.
Quadratic Fit to Data
y = 2.2318x2 + 1.5132x + 2.1807R2 = 0.998
0
5
10
15
20
25
30
35
40
0 0.5 1 1.5 2 2.5 3 3.5 4x
Dat
a, F
it
y(Data) Poly. (y(Data))
Figure 19 Polynomial fit examples showing the data and the fit.
The regression equation and its index of correlation are also given in the
inset. Error bars are also indicated based on 95% confidence intervals.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Sub Module 1.6
6. Design of experiments
Goal of experiments:
• Experiments help us in understanding the behavior of a (mechanical)
system
• Data collected by systematic variation of influencing factors helps us to
quantitatively describe the underlying phenomenon or phenomena
The goal of any experimental activity is to get the maximum information
about a system with the minimum number of well designed experiments. An
experimental program recognizes the major “factors” that affect the outcome of
the experiment. The factors may be identified by looking at all the quantities that
may affect the outcome of the experiment. The most important among these
may be identified using a few exploratory experiments or from past experience
or based on some underlying theory or hypothesis. The next thing one has to
do is to choose the number of levels for each of the factors. The data will be
gathered for these values of the factors by performing the experiments by
maintaining the levels at these values.
Suppose we know that the phenomena being studied is affected by the
pressure maintained within the apparatus during the experiment. We may
identify the smallest and the largest possible values for the pressure based on
experience, capability of the apparatus to withstand the pressure and so on.
Even though the pressure may be varied “continuously” between these limits, it
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
is seldom necessary to do so. One may choose a few values within the identified
range of the pressure. These will then be referred to as the levels.
Experiments repeated with a particular set of levels for all the factors
constitute replicate experiments. Statistical validation and repeatability concerns
are answered by such replicate data.
In summary an experimental program should address the following issues:
• Is it a single quantity that is being estimated or is it a trend involving more
than one quantity that is being investigated?
• Is the trend linear or non-linear?
• How different are the influence coefficients?
• What does dimensional analysis indicate?
• Can we identify dimensionless groups that influence the quantity or
quantities being measured
• How many experiments do we need to perform?
• Do the factors have independent effect on the outcome of the experiment?
• Do the factors interact to produce a net effect on the behavior of the
system?
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Full factorial design:
A full factorial design of experiments consists of the following:
– Vary one factor at a time
– Perform experiments for all levels of all factors
– Hence perform a large number of experiments that are needed!
– All interactions are captured (as will be shown later)
Consider a simple design for the following case:
Let the number of factors = k
Let the number of levels for the ith factor = ni
The total number of experiments (n) that need to be performed isk
ii 1n n
== Π .
If k = 5 and number of levels is 3 for each of the factors the total number of
experiments to be performed in a full factorial design is 53 243= .
2k factorial design:
Consider a simple example of a 2k factorial design. Each of the k factors
is assigned only two levels. The levels are usually High = 1 and Low = -1. Such
a scheme is useful as a preliminary experimental program before a more
ambitious study is undertaken. The outcome of the 2k factorial experiment will
help identify the relative importance of factors and also will offer some knowledge
about the interaction effects. Let us take a simple case where the number of
factors is 2. Let these factors be Ax and Bx . The number of experiments that
may be performed is 4 corresponding to the following combinations:
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Experiment No. Ax Bx1 +1 +1 2 -1 +1 3 +1 -1 4 -1 -1
Let us represent the outcome of each experiment to be a quantity y. Thus
1y will represent the outcome of experiment number 1 with both factors having
their “High” values, 2y will represent the outcome of the experiment number 2
with the factor A having the “Low” value and the factor B having the “High”
value and so on. The outcome of the experiments may be represented as the
following matrix:
Ax ↓ Bx → +1 -1 +1 1y 3y-1 2y 4y
A simple regression model that may be used can have up to four
parameters. Thus we may represent the regression equation as
0 A A B B AB A By p p x p x p x x= + + + (48)
The p’s are the parameters that are determined by using the “outcome”
matrix by the simultaneous solution of the following four equations:
0 A B AB 1
0 A B AB 2
0 A B AB 3
0 A B AB 4
p p p p yp p p p yp p p p yp p p p y
+ + + =
− + − =
− − − =
− − + =
(49)
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Figure 14 Interpretation of 22 factorial experiment
It is easily seen that the parameter 0p is simply the mean value of y that
is obtained by putting A Bx x 0= = corresponding to the mean values for the
factors. Equation (49) expresses the fact that the outcome may be interpreted as
shown in Figure 14.
It is thus seen that the values of y- 0p at the corners of the square indicate
the deviations from the mean value and hence the mean of the square of these
deviations (we may divide the sum of the squares with the number of degrees of
freedom = 3) is the variance of the sample data collected in the experiment.
The influence of the factors may then be gauged by the contribution of each term
to the variance. These ideas will be brought out by example 13.
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Example 13
A certain process of finishing a surface involves a machine. The machine
has two speed levels Ax and the depth of cut Bx may also take on two values.
The two values are assigned +1 and -1 as explained in the case of 22 factorial
experiment. The outcome of the process is the surface finish y that may have a
value between 1 (the worst) to 10 (the best). A 22factorial experiment was
performed and the following matrix gives the results:
Ax ↓ Bx → +1 -1
+1 3.5 1.5
-1 8.2 2
Determine the regression parameters and comment on the results. The
regression model given in equation (48) is made use of. The four simultaneous
equations for the regression parameters are given by
0 A B AB
0 A B AB
0 A B AB
0 A B AB
p p p p 3.5.....(i)p p p p 1.5.....(ii)p p p p 8.2.....(iii)p p p p 2........(iv)
+ + + =
− + − =
− − − =
− − + =
If we add the four equations and divide by 4 we get the
parameter o3.5 1.5 8.2 2 15.2p 3.8
4 4+ + +
= = = . (i)-(ii)+(iii)-(iv) yields the value of
parameter A3.5 1.5 8.2 2 8.2p 2.05
4 4+ − −
= = = . (i)+(ii)-(iii)-(iv) yields the value of
parameter B3.5 1.5 8.2 2 5.2p 1.3
4 4− − −
= = − = − . Finally (i)-(ii)-(iii)+(iv) yields the
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
value of parameter AB3.5 1.5 8.2 2 4.2p 1.05
4 4− − +
= = − = − . Thus the regression
equation based on the experiments is
A B A By 3.8 2.05x 1.3x 1.05x x= + − −
The deviation with respect to the mean is obviously given by
A B A Bd y 3.8 2.05x 1.3x 1.05x x= − = − −
It may be verified that the total sum of squares (SST) of the deviations is given by
( ) ( )( )
2 2 2 2 2 2A B ABSST 4 p p p 4 2.05 1.3 1.05
4 4.2025 1.69 1.1025 4 6.995 27.98
= × + + = × + +
= × + + = × =
The sample variance is thus given by
2y
SST 27.98s 9.33n 1 3
= = ≈−
Contributions to the sample variance are given by 4 times the square of
the respective parameter and hence we also have
SSA 4 4.2025 16.81SSB 4 1.69 6.76SSAB 4 1.1025 4.41
= × == × == × =
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Here SSA means the sum of squares due to variation in level of Ax and
so on. The relative contributions to the sample variance are represented as
percentage contributions in the following table:
Contribution % Contribution
SST 27.98 100
SSA 16.81 60.08
SSB 6.76 24.16
SSC 4.41 15.76
Thus the dominant factor is the machine speed followed by the depth of
cut and lastly the interaction effect. In this example all these have significant
effects and hence a full factorial experiment is justified.
More on full factorial design
We like to generalize the ideas described above in what follows.
Extension to larger number of factors as well as larger number of levels would
then be straight forward. Let the High and Low levels be represented by + an –
respectively. In the case of 22 factorial experiment design the following will hold:
Ax Bx A Bx xRow vector 1 + + + Row vector 2 + - - Row vector 3 - + - Row vector 4 - - + Column sum 0 0 0 Column sum of squares 4 4 4
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
We note that the product of any two columns is zero. Also the column
sums are zero. Hence the three columns may be considered as vectors that
form an orthogonal set. In fact while calculating the sample variance earlier
these properties were used without being spelt out.
Most of the time it is not possible to conduct that many experiments! The
question that is asked is: “Can we reduce the number of experiments and yet get
an adequate representation of the relationship between the outcome of the
experiment and the variation of the factors?” The answer is in general “yes”. We
replace the full factorial design with a fractional factorial design. In the
fractional factorial design only certain combinations of the levels of the factors
are used to conduct the experiments. This ploy helps to reduce the number of
experiments. The price to be paid is that all interactions will not be resolved.
Consider again the case of 22 factorial experiment. If the interaction term
A Bx x is small or assumed to be small only three regression coefficients
0 A Bp ,p ,p will be important. We require only three experiments such that the
four equations reduce to only three equations corresponding to the three
experimental data that will be available. They may be either:
0 A B 1
0 A B 2
0 A B 3
p p p yp p p yp p p y
+ + =
− + =
+ − =
(50)
Or
0 A B 1
0 A B 2
0 A B 4
p p p yp p p yp p p y
+ + =
− + =
− − =
(51)
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
depending on the choice of the row vectors in conducting the experiments.
In this simple case of two factors the economy of reducing the number of
experiments by one may not be all that important. However it is very useful to go
in for a fractional factorial design when the number of factors is large and when
we expect some factors or interactions between some factors to be unimportant.
Thus fractional factorial experiment design is useful when main effects dominate
with interaction effects being of lower order. Also we may always do more
experiments if necessitated by the observations.
One half factorial design:
Figure 15 Three factors system with two levels. All possible values are given by the corners of the cube.
BC
Aa'b'c' a
a'
abc
b'
b
c
c'
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
For a system with k factors and 2 levels the number of experiments in a
full factorial design will be 2k. For example, if k=3, this number works out to be
23=8. The eight values of the levels would correspond to the corners of a cube
as represented by Figure 15. A half factorial design would use 2k-1 experiments.
With k=3 this works out to be 22=4. The half factorial design would cut the
number of experiments by half. In the half factorial design we would have to
choose half the number of experiments and they should correspond to four of the
eight corners of the cube. We may choose the corners corresponding to a, b, c
and abc as one possible set. This set will correspond to the following:
Point Ax Bx Cx
a + - -
b - + -
c - - +
abc + + +
We notice at once that the three column vectors are orthogonal. Also the
points are obtained by requiring that the projection on to the left face of the cube
gives a full factorial design of type 22. It is easily seen that we may also use the
corners a’, b’, c’ and a’b’c’ to get a second possible half factorial design. This is
represented by the following:
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Point Ax Bx Cx
a’ + + -
b’ + - +
c’ - + +
a’b’c’ - - -
In each of these cases we need to perform only 4 experiments. Let us
look at the consequence of this. For this purpose we make the following table
corresponding to the a, b, c, abc case.
Point Ax Bx Cx A Bx x A Cx x B Cx x A B Cx x x
a + - - - - + +
b - + - - + - +
c - - + + - - +
abc + + + + + + +
Column No. 1 2 3 4 5 6 7
Notice that Column vector 1 is identical to the column vector 6. We say
that these two are “aliases”. Similarly the column vectors 2-5 and column
vectors 3-4 form aliases. Let us look at the consequence of these.
The most general regression model that can be used to represent the
outcome of the full factorial experiment would be
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
0 A A B B C C AB A B AC A C ABC A B Cy p p x p x p x p x x p x x p x x x= + + + + + + (52)
There are eight regression coefficients and the eight experiments in the
full factorial design would yield all these coefficients. However we now have only
four experiments and hence only four regression coefficients may be resolved.
By looking at the procedure used earlier in solving the equations for the
regression coefficients, it is clear that it is not possible to obtain the coefficients
that form an alias pair. For example, it is not possible to resolve Ap and BCp .
These two are said to be “confounded”. The student may verify that the
following are confounded: Bp and ACp , Ap and BCp , Cp and ABp , 0p and ABCp . The
consequence of this is that the best regression we may propose is
0 A A B B C Cy p p x p x p x= + + + (53)
All interaction effects are unresolved and we have only the primary effects
being accounted for. The regression model is indeed a linear model! The
student may perform a similar analysis with the second possible half factorial
design and come to the same conclusion.
Generalization:
Consider a 2k full factorial design. The number of experiments will be 2k.
These experiments will resolve k main effects, kC2 - 2 factor interactions, kC3 - 3
factor interactions,…………., kCk-1 (k-1) interactions and 1 - k interactions. (if k=5,
say, these will be 5 main effects, 10 – 2 factor interactions, 10 – 3 factor
interactions, 5 – 4 factor interactions and 1 – 5 factor interaction).
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
More on simple design: We now look at other ways of economizing on the number of experimental runs
required to understand the behavior of systems. We take a simple example of
characterizing the frictional pressure drop in a tube. A typical experimental set
up will look like the one shown in Fig. 16.
The factors that influence the pressure drop (measured with a differential
pressure gage) between stations 1 and 2 may be written down as:
1. Properties of the fluid flowing in the tube: Density ρ, the fluid viscosity µ
(two factors)
2. Geometric parameters: Pipe diameter D, Distance between the two
stations L and the surface roughness parameter (zero for smooth pipe
and non-zero positive number for a rough pipe). (two or three factors)
3. Fluid velocity V (one factor)
Figure 16 Schematic of an experimental set up for friction factor measurement in
pipe flow
L Diameter D
Pipe inner surface qualifier- Smooth/rough
∆p
Station
1 Station
2
Flow Velocity V
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
The total number of factors that may influence the pressure drop ∆p is thus equal
to 5 factors in the case of a smooth pipe and 6 factors in the case of a rough
pipe. The student can verify that a full factorial design with two levels would
require a total of 32 or 64 experiments. In practice the velocity of the fluid in the
pipe may vary over a wide range of values. The fluid properties may be varied
by changing the fluid or by conducting the experiment at different pressure and
temperature levels! The diameter and length of the pipe may indeed vary over a
very wide range. If one were to go for a simple design the number of factors and
the levels are very large and it would be quite impossible to perform the required
number of experiments.
The question now is: “How are we going to design the experimental scheme?”
How do we conduct a finite number of experiments and yet get enough
information to understand the outcome of the measurement? We look for the
answer in “dimensional analysis” (the student would have learnt this from
his/her course in Fluid Mechanics), for this purpose. Dimensional analysis, in
fact, indicates that the outcome of the experiment may be represented by a
simple relationship of the form
⎟⎟⎠
⎞⎜⎜⎝
⎛= (54)
The student may verify that both the left hand side quantity and the quantity
within the bracket on the right hand side are non-dimensional. This means both
are pure numbers! The f outside the bracketed term on the right hand side
indicates a functional relation. In Fluid Mechanics parlance is known as the
Mechanical Measurements Prof. S. P. Venkateshan
Indian Institute of Technology Madras
Euler number and is known as the Reynolds number. The third non-
dimensional parameter that appears above is the ratio L/D. Essentially the Euler
number is a function of only two factors the Reynolds number and the length to
diameter ratio! The number of factors has been reduced from 6 to just 2! We
may conduct the experiments with just one or two fluids, a few values of the
velocity and may be a couple of different diameter pipes of various lengths to
identify the nature of the functional relationship indicated in Equation (54).
In summary:
1. The 2k experiments will account for the intercept (or the mean) and all
interaction effects and is referred to as Resolution k design. (With k=5, we
have full factorial design having resolution 5. Number of experiments is
32.).
2. Semi or half factorial design 2k-1 will be resolution k-1 design. (With k=5,
we have half factorial design having resolution 4. Number of experiments
is 16.).
3. Quarter factorial design will have a resolution of k-2. (With k=5, we have
quarter factorial design having resolution 3. Number of experiments is 8.).
And so on…..
The student may also work out the aliases in all these cases.