The effectiveness of optimisation techniques
for material property characterisation
E.J. O'Brien, J.J. O'Donnell
Department of Civil, Structural and Environmental
Engineering, Trinity College, Dublin, Ireland
ABSTRACT
The paper describes the application of a technique whereby the parameters
which characterise a physical process can be found. The technique is based on
the use of optimisation to find those values for the characteristic parameters
which give a best fit of the measured results to a chosen numerical model.
Typically these parameters represent the physical properties of the materials
involved in the process. Factors which influence the effectiveness of the
technique are described and methods are developed for the assessment of its
suitability for various applications.
INTRODUCTION
There are many situations where the parameters which characterise a physical
process are difficult to determine directly. When modelling such a process
numerically, it is sometimes convenient to determine characteristic parameters
indirectly from measured results. This can be achieved by finding those values
of the characteristic parameters which give a best fit of the measured results to
the numerical model.
In any numerical model of a physical process, three groups of parameters can
be identified. The first group consists of the characteristic parameters, $x_i$; i =
1, 2, ..., n, the parameters for which values are sought. The second consists
Transactions on Modelling and Simulation vol 5, © 1993 WIT Press, www.witpress.com, ISSN 1743-355X
46 Computational Methods and Experimental Measurements
of those parameters used in the 'best fit' process, i.e., those for which
differences between measured values, $f_i^*$; i = 1, 2, ..., p, and values found
from the numerical model, $f_i$; i = 1, 2, ..., p, are minimised. All other
parameters belong to the third group, $y_i$; i = 1, 2, ..., m, and values for these
are either measured, known or assumed. Parameters of the second group, as
calculated using the numerical model, can be expressed as:
$$f_i = f_i(x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_m) \quad \text{for } i = 1, 2, \ldots, p$$
The best fit process consists of minimising the sum of squares of differences
between these calculated values and the corresponding measured values, i.e.,
Find $x_i$; i = 1, 2, ..., n
to minimise
$$\sum_{i=1}^{p} \left( f_i - f_i^* \right)^2$$
The minimisation problem can be solved using standard optimisation
algorithms.
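The best fit process described above can be sketched in a few lines of Python. This is a minimal illustration only: the straight-line model, the synthetic 'measurements' and the compass (coordinate) search are assumptions chosen for brevity and are not the model or the optimisation algorithm used by the authors.

```python
def sum_of_squares(model, x, y_rows, measured):
    """Objective: sum over i of (f_i(x, y_i) - measured_i)^2."""
    return sum((model(x, y) - f_meas) ** 2 for y, f_meas in zip(y_rows, measured))


def compass_search(objective, x0, steps, tol=1e-10, max_iter=100000):
    """Derivative-free minimiser: try +/- one step along each axis and
    halve all step lengths whenever no trial point improves the objective."""
    x = list(x0)
    steps = list(steps)
    best = objective(x)
    for _ in range(max_iter):
        improved = False
        for j in range(len(x)):
            for sign in (1.0, -1.0):
                trial = list(x)
                trial[j] += sign * steps[j]
                value = objective(trial)
                if value < best:
                    x, best, improved = trial, value, True
                    break
        if not improved:
            steps = [s * 0.5 for s in steps]
            if max(steps) < tol:
                break
    return x, best


# Hypothetical model: two characteristic parameters x, one known parameter y.
def line_model(x, y):
    return x[0] + x[1] * y


y_rows = [0.0, 1.0, 2.0, 3.0]
# Synthetic 'measured' values generated from a known solution (2.0, -0.5).
measured = [line_model([2.0, -0.5], y) for y in y_rows]

x_fit, residual = compass_search(
    lambda x: sum_of_squares(line_model, x, y_rows, measured),
    x0=[0.0, 0.0],
    steps=[1.0, 1.0],
)
print(x_fit, residual)
```

Because the 'measurements' are generated from the model itself, the search should recover the generating parameters and drive the objective towards zero; with real data the optimum value of the objective would generally be non-zero.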
A MEASURE OF ILL-CONDITIONING
Two problems, each involving two characteristic parameters, $x_1$, $x_2$, two
parameters of the second group, $f_1$, $f_2$, and one other parameter, $y_1$, are
illustrated in Figures 1a and 1b. As only two parameters are used in the best fit
process, a unique solution can be found at P. In many cases, there are errors in
the numerical model and/or the measurements (the measured values of $f_1$ and
$f_2$ in these examples). As the curves in Figure 1a are almost parallel at P, the
solution (namely, the coordinates at P) is sensitive to such errors and the problem
can be termed ill-conditioned. In the general case, where the solution is being
found by optimisation, it can be quite difficult to obtain an accurate solution to
ill-conditioned problems such as this. The problem of Figure 1b is less sensitive
to such errors and is, in contrast, well conditioned.
a) Ill-Conditioned b) Well Conditioned
Figure 1. Conditioning of Two-Dimensional Problems
A measure of the difference in slope of the two curves at P is the area of the
parallelogram constructed about the unit normal vectors as illustrated in the
figure. Clearly, a small area implies an ill-conditioned problem while an area
close to unity implies that the problem is well conditioned. This area can be
calculated numerically as the determinant of the matrix, [G], where,
$$[G] = \begin{bmatrix} \dfrac{f_{1,1}}{|\nabla f_1|} & \dfrac{f_{1,2}}{|\nabla f_1|} \\[2ex] \dfrac{f_{2,1}}{|\nabla f_2|} & \dfrac{f_{2,2}}{|\nabla f_2|} \end{bmatrix} \tag{1}$$
In this equation, $f_{i,j}$ denotes the partial derivative of $f_i$ with respect to $x_j$, i.e.,
$$f_{i,j} = \frac{\partial f_i}{\partial x_j} \tag{2}$$
and $|\nabla f_i|$ is the length of the gradient vector for $f_i$, i.e.,
$$|\nabla f_i| = \sqrt{f_{i,1}^2 + f_{i,2}^2} \tag{3}$$
For a problem in which n parameters are used in the best fit process to
determine n characteristic parameters, the corresponding matrix is, [H], where,
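$$[H] = \begin{bmatrix} \dfrac{f_{1,1}}{|\nabla f_1|} & \dfrac{f_{1,2}}{|\nabla f_1|} & \cdots & \dfrac{f_{1,n}}{|\nabla f_1|} \\ \vdots & \vdots & & \vdots \\ \dfrac{f_{n,1}}{|\nabla f_n|} & \dfrac{f_{n,2}}{|\nabla f_n|} & \cdots & \dfrac{f_{n,n}}{|\nabla f_n|} \end{bmatrix} \tag{4}$$
and where,
$$|\nabla f_i| = \sqrt{\left(\frac{\partial f_i}{\partial x_1}\right)^2 + \left(\frac{\partial f_i}{\partial x_2}\right)^2 + \cdots + \left(\frac{\partial f_i}{\partial x_n}\right)^2} \tag{5}$$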
The determinant of [H] has been used by the authors as a comparative measure
of the conditioning of problems. As for two-dimensional problems, a
determinant close to zero implies ill-conditioning while a value close to unity
implies a well conditioned problem.
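As an illustration, the measure can be evaluated numerically as sketched below. The sketch assumes that partial derivatives are estimated by central differences and that the absolute value of the determinant is reported; the two test problems and all function names are hypothetical and are not taken from the paper.

```python
import math


def gradient(f, x, h=1e-6):
    """Central-difference estimate of the partial derivatives of f at x."""
    grad = []
    for j in range(len(x)):
        up = list(x)
        up[j] += h
        down = list(x)
        down[j] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[pivot][k] == 0.0:
            return 0.0
        if pivot != k:
            a[k], a[pivot] = a[pivot], a[k]
            det = -det
        det *= a[k][k]
        for r in range(k + 1, n):
            factor = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= factor * a[k][c]
    return det


def conditioning_measure(functions, x):
    """|det [H]|, where row i of [H] is the unit gradient vector of f_i at x."""
    rows = []
    for f in functions:
        g = gradient(f, x)
        length = math.sqrt(sum(c * c for c in g))
        rows.append([c / length for c in g])
    return abs(determinant(rows))


# Hypothetical problems: nearly parallel curves versus perpendicular curves.
nearly_parallel = [lambda x: x[0] + x[1], lambda x: x[0] + 1.05 * x[1]]
perpendicular = [lambda x: x[0] + x[1], lambda x: x[0] - x[1]]

ill = conditioning_measure(nearly_parallel, [1.0, 1.0])
well = conditioning_measure(perpendicular, [1.0, 1.0])
print(ill, well)
```

The nearly parallel pair gives a value close to zero and the perpendicular pair a value close to unity, in line with the interpretation given above.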
REDUNDANCY OF PROBLEMS
In many cases it is possible to use many more parameters in the best fit process
than there are characteristic parameters, i.e.,
p > n
In such cases, it is often not possible to obtain exact agreement between all
measured values and those calculated using the numerical model.
A two-dimensional example with a redundancy of one is illustrated in Figure 2.
The best fit solution, P, clearly does not satisfy any of the equations exactly. A
simple measure of how closely the numerical model fits the measured results is
the sum of squares of differences between measured and calculated values.
This is, of course, the value of the objective function at the optimum solution.
Figure 2. Redundant Problem in Two Dimensions
As before, a measure of ill-conditioning is the area of a parallelogram or the
multi-dimensional equivalent of this. However, for redundant problems in
n-dimensions, there are more than n normal vectors. It can be seen in Figure 2
that the curves defined by equating $f_1(x_1, x_2, y_1)$ and $f_2(x_1, x_2, y_1)$ to
their measured values are almost parallel. However, as the third curve is
approximately perpendicular to them at P the problem is well conditioned. It can
be seen that the parallelogram defined by the vectors, $n_1$ and $n_3$, is an
appropriate measure of
conditioning for this example. In general for two-dimensional problems, the
parallelogram of greatest area defined by any pair of vectors would seem to be
most appropriate.
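The selection of the parallelogram of greatest area can be sketched as follows for a redundant two-dimensional problem. The three gradient vectors are hypothetical values chosen to resemble the situation described (two nearly parallel curves and a third roughly perpendicular to them); they are not data from the paper.

```python
import math
from itertools import combinations


def unit(v):
    """Scale a two-dimensional vector to unit length."""
    length = math.hypot(v[0], v[1])
    return (v[0] / length, v[1] / length)


def greatest_parallelogram_area(gradients):
    """Largest parallelogram area over all pairs of unit normal vectors."""
    normals = [unit(g) for g in gradients]
    return max(abs(a[0] * b[1] - a[1] * b[0]) for a, b in combinations(normals, 2))


# Hypothetical gradients at P: the first two are almost parallel,
# the third is approximately perpendicular to both.
gradients = [(1.0, 0.0), (1.0, 0.1), (0.05, 1.0)]
area = greatest_parallelogram_area(gradients)
print(area)
```

Although the first two normals alone would give an area of about 0.1, the pair that includes the third normal gives an area close to unity, so the problem as a whole is classed as well conditioned.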
The unit normal vectors are illustrated in Figure 3 for a problem where n = 3
and p = 5. As each of these vectors is of unit length, each end point is located
on a sphere of unit radius as illustrated in Figure 3a and 3b. The points A, B, C
and D in this figure are defined by the intersection with the sphere of planes
parallel to the $x_1$ and $x_2$ axes (two parallel to each). The volume of a solid
defined by the vectors PA, PB, PC (and PD) would give some measure of the
conditioning of this problem as all unit normal vectors would be 'enclosed' by
this. However, there is a skew in the orientation of the unit normal vectors
which this volume would not reflect. Therefore, linear regression has been
applied to identify the plane passing through P which gives a best fit to the end
points of the unit vectors. The axes are then transformed so that the $x_1$ and $x_2$
axes lie in this plane and the $x_3$ axis is perpendicular to it. Then, planes
parallel to the transformed axes are found which just enclose the end points of
all unit normal vectors. The result is illustrated for the three-dimensional
problem in Figure 3c where it can be seen that the appropriate solid is defined
by the vectors PE, PF, PG (and PH).
Figure 3. Unit Normal Vectors in 3-Dimensional Problem: a) Unit Normal Vectors; b) Sphere of Unit Radius and Planes Parallel to Original Axes; c) Sphere of Unit Radius and Planes Parallel to Transformed Axes
THE INFLUENCE OF SCALING
The authors have used the conjugate directions method [1] to solve the
optimisation problem. While this technique is relatively inefficient, it has been
found to be extremely robust for problems involving discontinuities of first
derivative and/or sudden changes in second derivative. For the subroutine
used, the rate of convergence was found to be sensitive to the relative scaling of
the variables. For example, convergence was found to be poor when $x_1$ ranged
in value from -3000 to +3000 while $x_2$ ranged from -2 to +6. Speed of
convergence was restored when the initial step length used in the search
procedure for each variable was calculated as a fixed percentage of the expected
range.
The scaling of the variables clearly also affects the angle between unit normal
vectors such as those illustrated in Figure 2. To remove this effect in a manner
consistent with that used for the optimisation, each partial derivative in matrices
[G] and [H] is multiplied by the range of the appropriate variable. Thus, if $X_j$ is
the expected range of the variable, $x_j$, then the matrix [H] becomes:
$$[H'] = \begin{bmatrix} \dfrac{f_{1,1} X_1}{|g_1|} & \dfrac{f_{1,2} X_2}{|g_1|} & \cdots & \dfrac{f_{1,n} X_n}{|g_1|} \\ \vdots & \vdots & & \vdots \\ \dfrac{f_{n,1} X_1}{|g_n|} & \dfrac{f_{n,2} X_2}{|g_n|} & \cdots & \dfrac{f_{n,n} X_n}{|g_n|} \end{bmatrix} \tag{5}$$
where,
$$|g_i| = \sqrt{\left(f_{i,1} X_1\right)^2 + \left(f_{i,2} X_2\right)^2 + \cdots + \left(f_{i,n} X_n\right)^2} \tag{6}$$
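The effect of this scaling can be illustrated with a short sketch. The ranges of 6000 and 8 correspond to the intervals quoted above (-3000 to +3000 and -2 to +6); the partial derivatives are hypothetical values chosen so that the unscaled normals are almost parallel, and the function names are assumptions made for this example.

```python
import math


def det(m):
    """Laplace expansion along the first row; adequate for small matrices."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def scaled_measure(jacobian, ranges):
    """|det [H']|, where entry (i, j) is f_{i,j} X_j / |g_i|."""
    rows = []
    for derivs in jacobian:
        g = [d * r for d, r in zip(derivs, ranges)]
        length = math.sqrt(sum(c * c for c in g))
        rows.append([c / length for c in g])
    return abs(det(rows))


# Hypothetical partial derivatives f_{i,j} for a two-parameter problem in
# which x_1 is numerically much larger than x_2.
jacobian = [[1.0 / 3000.0, 1.0], [1.0 / 3000.0, -1.0]]

unscaled = scaled_measure(jacobian, [1.0, 1.0])   # ranges ignored
scaled = scaled_measure(jacobian, [6000.0, 8.0])  # ranges X_1 = 6000, X_2 = 8
print(unscaled, scaled)
```

Without scaling the determinant is of the order of 0.001, which would suggest severe ill-conditioning; once each derivative is multiplied by the range of its variable the value rises to about 0.47, showing that the apparent ill-conditioning was largely an artefact of the units.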
THERMAL CONDUCTION
The temperature increase at an internal node, i, on the axis of a hydrating
concrete prism can be expressed as,
$$f_i = x_1 \exp\left[ x_2 \left( \frac{1}{y_2} - \frac{1}{y_1} \right) \right] + x_3 y_3 \tag{7}$$
where
$y_1$, $y_2$, $y_3$ = known parameters (over which there is control)
$x_1$, $x_2$, $x_3$ = characteristic parameters
In a real thermal conduction problem, Equation 7 cannot be applied directly as it
does not allow for heat loss through the ends and sides of the prism. However,
it is used here to illustrate the principles described above. Three numerical
examples are considered first. In each case, temperature increases were
calculated at three different nodes along the prism. In each example,
'measured' temperatures were generated numerically using values for the
characteristic parameters of $x_1 = 1.630 \times 10^{-4}$, $x_2 = -5683$ and
$x_3 = 1.520 \times 10^{-6}$. This exact solution to the optimisation problem was
subsequently used to assess the accuracy of the optimisation procedure. The
variables, $y_1$, $y_2$ and $y_3$, can be controlled by the experimenter. For the
three examples considered,
different combinations of values were chosen as presented in Table 1.

Table 1. Details of Thermal Conduction Examples

Example   Equation Number, i   Group 3 variables       Measured f_i
          (Node No.)           y_1     y_2     y_3     (x 10^4)
1         1                    303     303     99      3.135
          2                    303     302     100     3.052
          3                    303     301     101     2.974
2         1                    303     303     90      2.998
          2                    303     298     100     2.710
          3                    303     293     110     2.531
3         1                    333     333     100     3.150
          2                    333     303     200     3.341
          3                    333     273     300     4.598

The determinant of matrix [H'], as given by Equation 5, was calculated for each
example. The results are given in Table 2. It can be seen from the table that,
while none of the examples are very well conditioned, the third is significantly
better than the other two. The ill-conditioning of the former two is evident from
the relatively large optimum value for the objective functions and the large
number of iterations required to achieve convergence. In these numerically
generated examples the exact solution has been used to calculate the % errors in
the optimal solutions. It can be seen that the errors in the values inferred for the
characteristic parameters vary from up to 38% in Example 1 to 0.1% in
Example 3. Clearly, there is a strong correlation between accuracy and the
determinant of [H'].
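The numerically generated 'measured' values of Table 1 can be checked against Equation 7 with the short sketch below. It assumes exponents of -4 and -6 for the exact values of the first and third characteristic parameters (consistent with the values listed in Table 2) and assumes that the tabulated temperature increases are quoted as multiples of 10^-4; the function name is hypothetical.

```python
import math


def temperature_increase(x, y):
    """Equation 7: f = x1 * exp[x2 * (1/y2 - 1/y1)] + x3 * y3."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    return x1 * math.exp(x2 * (1.0 / y2 - 1.0 / y1)) + x3 * y3


# Exact characteristic parameters (exponents assumed, see lead-in).
x_exact = (1.630e-4, -5683.0, 1.520e-6)

# Group 3 variables (y1, y2, y3) for the three nodes of Example 1 in Table 1.
example_1 = [(303.0, 303.0, 99.0), (303.0, 302.0, 100.0), (303.0, 301.0, 101.0)]

# Temperature increases expressed as multiples of 10^-4 (scaling assumed).
generated = [temperature_increase(x_exact, y) * 1e4 for y in example_1]
print([round(value, 3) for value in generated])
```

Under these assumptions the three values agree with the Example 1 entries of Table 1 to the three decimal places quoted there.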
Table 2. Results of Thermal Conduction Examples

Eg.   Det [H']   No.      Objective    x_1                    x_2             x_3
      (x 10^)    Iters.   Fn. (x 10^)  (Error)                (Error)         (Error)
1     0.84       6798     588          1.034 x 10^-4 (37%)    -7254 (28%)     2.105 x 10^-6 (38%)
2     104        3500     122          1.480 x 10^-4 (9.2%)   -6293 (10.7%)   1.691 x 10^-6 (11.3%)
3     986        392      0.82         1.628 x 10^-4 (0.1%)   -5682 (0.1%)    1.519 x 10^-6 (0.1%)

In a fourth example, experimental data was used to determine values for the
same characteristic parameters as considered above. Details of the experiment
in which measured temperature changes at nine locations were used to infer
values for six characteristic parameters are given in [2]. For this example,
temperature changes over a one hour period were considered. In order to
provide a comparison with the examples considered above, only three of the
unknown parameters were treated as characteristic parameters. For the other
three, the values inferred from a previous run were taken to be known. Thus,
the problem became one of using nine equations (nodes) to infer values for
three parameters. The determinant of [H'] was found for this fourth example to
be 338 x 10^. Comparison with the results presented in Table 2 would suggest
an error of not greater than 10% for each parameter. This inference is of course
only valid if measured temperature changes and the values assumed for other
parameters are exact.
CREEP DEFLECTION MODEL
In this example the measured quantities are deflections at various points and for
successive construction stages of a post-tensioned balanced cantilever bridge
deck. Full details are presented elsewhere in this publication [3]. There are
again three characteristic parameters representing, in this case, elastic stiffness
and creep properties of the concrete. A computer program was used to calculate
theoretical deflections for given values of the characteristic parameters. The
problem considered is to find those values for the characteristic parameters
which result in a best fit to measured deflection data. The bridge was
constructed in stages with one segment being added at each stage. The
deflections at the end of each segment at each stage of construction were the
parameters used in the best fit process. Thus, one parameter was available after
Stage 1 (Segment 1, Stage 1), three parameters were available after Stage 2
(Segment 1, Stages 1 and 2 and Segment 2, Stage 2) and so on. For the
purposes of this study, values for the three characteristic parameters were
determined four times using all data available after each of Stages 4, 5, 6 and 7
(10, 15, 21 and 28 parameter values respectively).
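The parameter counts quoted above follow directly from the staged construction and can be confirmed with a few lines of code. The sketch assumes, as the description implies, that segment k contributes one deflection reading at every stage from k onwards; the function name is hypothetical.

```python
def parameters_available(stage):
    """Count the (segment, stage) deflection readings available after a stage.

    Segment k is assumed to be added at stage k and read at every later stage.
    """
    return sum(
        1
        for segment in range(1, stage + 1)
        for reading_stage in range(segment, stage + 1)
    )


counts = [parameters_available(stage) for stage in (4, 5, 6, 7)]
print(counts)
```

The count after stage s is the triangular number s(s + 1)/2, so the amount of data grows quadratically with the number of construction stages.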
Table 3. Results of Creep Deflection Model

Problem No. j    No.      Determinants
(No. Params.)    Iters.   At Solution   At Solution   At Solution
                          to No. j      to No. 1      to No. 4
1 (10)           147      0.904         0.904         0.047
2 (15)           171      0.861         0.896         0.233
3 (21)           197      0.293         0.928         0.233
4 (28)           317      0.233         0.951         0.233

The results are presented in Table 3. It was expected that the conditioning of
the problem would improve as an increasing amount of data became available.
This, however, would not appear to be the case as the number of iterations
required increases with the quantity of data used. In the third column of Table
3, the determinants of [H'], as evaluated at the solution in each case, are
presented. The general trend apparent in these values is consistent with the
increasing number of iterations required to determine the solutions, i.e.,
successive problems are less well conditioned. The fact that the determinants
reduce as data is added has been found to be due largely to the fact that the
optimal solutions for each problem are quite different from one another. In the
fourth column of Table 3, the determinants are all evaluated at the point, (43.29
x 10̂ , 1.341, 0.841), which is the optimal solution for Problem No. 1. The
final column of the table contains the determinants as evaluated at the point,
(31.92 x 10f, 0.385, -4.04), which is the optimal solution for Problem No. 4.
Clearly, for these problems the optimal solution changes as more data is
considered. It can be seen that this has a significant effect on the values of the
determinants. The general trend evident in each of columns 4 and 5 is one of
increasingly well conditioned problems. However, a comparison of
corresponding values in different columns indicates that the conditioning of the
problems is quite variable in the vicinity of the solutions. For example,
Problem No. 1 is well conditioned near its own solution but is quite poorly
conditioned at the optimal solution to Problem No. 4. Clearly such a possibility
must be considered when the determinant of [H'] is used as a measure of the
conditioning of problems.
CONCLUSIONS
A technique is described for the determination of the parameters which
characterise a physical process. The determinant of a matrix of partial derivative
terms is proposed as a measure of the conditioning of such problems. A
method of catering for highly redundant problems is described. Examples
illustrate the application of the method to two different physical processes.
REFERENCES
1. Rao S.S. Optimization Theory and Applications, 2nd Edition, Halsted Press
New York, 1984.
2. O'Brien, E.J., O'Donnell, J.J., Waldron P., Lahlouh, El-H., "Non-linear
Conduction Modelling of Concrete Walls under the Influence of Heat of
Hydration of Cement" in Advanced Computational Methods in Heat Transfer II,
Vol 1, (Ed. Wrobel, R.C., Brebbia, C.A. & Nowak A.J.) pp. 121-130.
Computational Mechanics Publications & Elsevier Applied Science, 1992.
3. Flanagan J.W., O'Brien E.J. "Numerical Modelling of Time Dependent
Deflections during Construction of Prestressed Concrete Bridges", in CMEM
93 - Computational Methods and Experimental Measurements, Siena, 1993
(Ed. Brebbia, C.A. and Carlomagno, G.M.). Computational Mechanics
Publications, 1993.