IEEE IMTC 2001. Proceedings of the 18th IEEE Instrumentation and Measurement Technology Conference. Rediscovering Measurement in the Age of Informatics.

IEEE Instrumentation and Measurement Technology Conference, Budapest, Hungary, May 21-23, 2001.

Anytime Information Processing Based on Fuzzy and Neural Network Models

Annamária R. Várkonyi-Kóczy¹, António Ruano², Péter Baranyi³, Orsolya Takács¹

¹Dept. of Measurement and Information Systems, Budapest University of Technology and Economics

Műegyetem rkp. 9., Budapest, H-1521 Hungary Phone: +36 1 463 2057, Fax: +36 1 463 4112, E-mail: [email protected]; [email protected]

²Dept. of Electronic Engineering and Computing, University of Algarve,

Campus de Gambelas, 8000 Faro, Portugal Phone: +351 289 800912, Fax: +351 289 818560, E-mail: [email protected]

³Dept. of Telecommunication and Telematics, Budapest University of Technology and Economics, Hungary Phone: +36 1 463 1758, E-mail: [email protected]

Abstract - In modern measurement and control systems, the available time and resources are often not only limited, but may also change during the operation of the system. In these cases, the so-called anytime algorithms can be used advantageously. While different soft computing methods are widely used in system modeling, their usability in these cases is limited, because the lack of a universal method for determining the needed complexity often results in huge and redundant neural networks/fuzzy rule-bases. This paper proposes a possible way to carry out anytime information processing in fuzzy systems or neural networks, with the help of the Singular Value Decomposition (SVD)-based complexity reduction algorithm.

Keywords - fuzzy modeling, neural networks, anytime systems, complexity reduction, singular value decomposition.

I. INTRODUCTION

Model based schemes play an important role among the measurement and control strategies applied to dynamic plants. The basically linear approaches to fault diagnosis, optimal state estimation, and controller design are well understood and successfully combined with adaptive techniques (see, e.g., [1]) to provide optimum performance. Nonlinear techniques, however, are far from this maturity and are still not well understood. There is a wide variety of possible models to be applied, based on both classical methods [2] and recent advances in handling information [3], but up till now practically no systematic method has been available which could be offered to solve a larger family of nonlinear control problems. The efforts in the field of fuzzy and neural network (NN) based modeling and control, however, seem to result in a real breakthrough in this respect as well. With the advent of adaptive fuzzy and NN controllers, very many control problems could be efficiently solved, and the model based approach to fuzzy and NN controller design became a reality. When model based techniques are used in measurement and control, the inverse models also play a definite role.

Another reason for dealing with so-called soft computational model based techniques is that in computer-based monitoring and diagnostic systems the operations should be performed under prescribed response time conditions. It is an obvious requirement to provide enough computational power, but the achievable processing speed is highly influenced by the precedence, timing, and data access conditions of the processing itself. Even in the case of extremely careful design, it seems unavoidable to get into situations where the shortage of necessary data and/or processing time becomes serious. Such situations may result in a critical breakdown of the monitoring and/or diagnostic systems [4]. The concept of “anytime” processing tries to handle the case of too many abrupt changes and their consequences in larger scale embedded systems [5]. The idea is that if there is a temporal shortage of computational power and/or a loss of some data, the actual operations should be continued to maintain the overall performance “at lower price”, i.e., information processing based on algorithms and/or models of lower complexity should provide outputs of acceptable quality to continue the operation of the complete monitoring system. The accuracy of the processing will be temporarily lower, but possibly still enough to produce data for qualitative evaluations and to support decisions. Consequently, “anytime” algorithms provide short response times and are very flexible with respect to the available input information and computational power.

Takagi-Sugeno fuzzy models [6], fuzzy models with non-singleton consequents [7], and generalised NN based approaches [8] combined with the Singular Value Decomposition (SVD) technique are excellent tools for “anytime” operations. By using SVD, not only the “sequence” of the rules is defined but also the extent to which they contribute to the mapping. To cope with the limits arising in the system or in its environment, we can abandon the less significant part of the rule base on-line, as determined by the computational need of the remaining truncated model. The approximation error will also be given.

This work was sponsored by the Office of Bilateral Intergovernmental Scientific Cooperation Programs (P-16/99) and the Hungarian Funds for Scientific Research (OTKA T026254, T035190).

0-7803-6646-8/01/$10.00 ©2001 IEEE

The novelty of this paper is that it proposes a frame for SVD based fuzzy and NN techniques to be used in anytime systems. The advantages of the suggested modelling methods are that these models (1) are suitable for modelling a large class of non-linear problems, (2) have relatively low (optimal) computational complexity (it can be proved that after the exact reduction the obtained computational complexity is minimal), and (3) may be operated in anytime systems (the use of this modelling technique offers an easy way for further, non-exact reductions where, besides the reduced computational need, the error of the approximation can also be obtained). The extra calculations caused by the SVD can be pre-executed off-line.

In the following, the authors present exact and non-exact SVD based model reduction methods for certain fuzzy and generalised NN based models. The aim is to decrease model complexity and thus to open up the possibility of applying these models in extremely large systems, or in systems where serious temporal limitations may occur in the computing and/or timing conditions during operation. The approximation error of the non-exact reduction will also be given, along with an illustrative example.

II. THE SVD ALGORITHM

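The SVD-based complexity reduction algorithm is based on the decomposition of the $F$ matrix (size $n_1 \times n_2$):

$$F = A_1 B A_2^T, \qquad (1)$$

where $A_k$ ($k = 1, 2$) are orthogonal matrices ($A_k A_k^T = E$), and $B$ contains the $\lambda_i$ singular values of $F$ in decreasing order. The maximum number of the nonzero singular values is $n_{SVD} = \min(n_1, n_2)$. The singular values indicate the significance of the corresponding columns of $A_k$. Let the matrices be partitioned in the following way:

$$A_k = \begin{bmatrix} A_k^r & A_k^d \end{bmatrix}, \qquad B = \begin{bmatrix} B^r & 0 \\ 0 & B^d \end{bmatrix}, \qquad (2)$$

where $r$ denotes “reduced” and $n_r \le n_{SVD}$.

If $B^d$ contains only zero singular values then $B^d$ and $A_k^d$ can be dropped: $F = A_1^r B^r A_2^{rT}$. If $B^d$ contains nonzero singular values as well, then the $F' = A_1^r B^r A_2^{rT}$ matrix is only an approximation of $F$, and the maximal difference between the values of $F$ and $F'$ is [9]:

$$|F - F'| \le \left( \sum_{i=n_r+1}^{n_{SVD}} \lambda_i \right) 1_{n_1 \times n_2}. \qquad (3)$$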
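As a small numerical illustration of the non-exact reduction, the following worked example truncates a $2 \times 2$ matrix to its largest singular value and compares the resulting error with the sum of the discarded singular values. The matrix and its factors are chosen here purely for illustration and do not come from the paper.

```latex
\begin{align*}
F &= \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
   = A_1 B A_2^T, \qquad
A_1 = A_2 = \tfrac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad
B = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}, \\
F' &= A_1^r B^r A_2^{rT}
   = 3 \cdot \tfrac{1}{2} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \end{bmatrix}
   = \begin{bmatrix} 1.5 & 1.5 \\ 1.5 & 1.5 \end{bmatrix}
   \qquad (n_r = 1), \\
|F - F'| &= \begin{bmatrix} 0.5 & 0.5 \\ 0.5 & 0.5 \end{bmatrix}
   \le 1 \cdot \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
   \qquad \text{(discarded singular value: } 1\text{)}.
\end{align*}
```

Every element of the error matrix stays below the single discarded singular value, as the bound requires.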

III. REDUCTION OF FUZZY RULE-BASES WITH SVD

Consider a fuzzy rule base with two inputs, where the antecedent fuzzy sets are in Ruspini-partition and the consequent fuzzy sets are singletons. So the rules are:

$R_{i,j}$: If $x_1$ is $A_{1,i}$ and $x_2$ is $A_{2,j}$ then $y = y_{i,j}$,

where $i = 1 \dots n_1$ and $j = 1 \dots n_2$.

The fuzzification method is singleton and, during the inference, product T-norm and sum S-norm are used. The result of the fuzzy inference in case of the $(x_1^*, x_2^*)$ input values will be:

$$y^* = \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} \mu_{A_{1,i}}(x_1^*) \, \mu_{A_{2,j}}(x_2^*) \, y_{i,j}.$$

Let $F$ be a matrix containing the $y_{i,j}$ elements, then apply the above mentioned procedures to obtain $F \approx F' = A_1 B A_2^T$, where $A_1$ and $A_2$ are SN (Sum-Normalised: the sum of each row equals one) and NN (Non-Negative). Then the new rule-base will be:

$R'_{i,j}$: If $x_1$ is $A'_{1,i}$ and $x_2$ is $A'_{2,j}$ then $y = y'_{i,j}$,

where $i = 1 \dots n_1^r$, $j = 1 \dots n_2^r$, $y'_{i,j}$ are the elements of $B$, and the new membership functions can be obtained as:

$$\mu_{A'_{k,i}}(x_k) = \sum_{j=1}^{n_k} \mu_{A_{k,j}}(x_k) \, A_{k,j,i},$$

where $A_{k,j,i}$ is the $(j,i)$th element of $A_k$.

The reduced rule-base contains only $n_1^r \cdot n_2^r$ rules instead of $n_1 \cdot n_2$ rules, and the error can be estimated from the discarded singular values.

The method can be extended to n-dimensional cases, as follows. In this case the reduction can be made in $n$ steps; in every step one dimension of the $F_1$ matrix, containing the $y_{i_1,\dots,i_n}$ consequents, is reduced. The $i$-th step will be:

1. Spreading out the n-dimensional $F_i$ matrix (size: $n_1^r \times \dots \times n_{i-1}^r \times n_i \times \dots \times n_n$) into a two-dimensional $S_i$ matrix (size: $n_i \times (n_1^r \cdot \ldots \cdot n_{i-1}^r \cdot n_{i+1} \cdot \ldots \cdot n_n)$). $F_1$ contains the $y_{i_1,\dots,i_n}$ consequents; the following $F_i$-s will be generated by the algorithm.

2. Reduction of $S_i$: $S_i \approx A_i B A'^T = A_i S_i^*$, where the size of $A_i$ is $n_i \times n_i^r$ and the size of $S_i^*$ is $n_i^r \times (n_1^r \cdot \ldots \cdot n_{i-1}^r \cdot n_{i+1} \cdot \ldots \cdot n_n)$.

3. Re-stacking of $S_i^*$ into the n-dimensional $F_{i+1}$ matrix (size: $n_1^r \times \dots \times n_i^r \times n_{i+1} \times \dots \times n_n$), and continuing with step 1 for $F_{i+1}$.

The consequents of the reduced rule-base will be the elements of $F_{n+1}$, and the new membership functions will be:

$$\mu_{A'_{k,i}}(x_k) = \sum_{j=1}^{n_k} \mu_{A_{k,j}}(x_k) \, A_{k,j,i}.$$

The reduced rule-base contains only $n_1^r \cdot \ldots \cdot n_n^r$ rules instead of $n_1 \cdot \ldots \cdot n_n$. The maximum error of the reduction - at any point - will be less than or equal to the sum of the discarded singular values.

There also exist extensions of the SVD based reduction to non-singleton ([7]) and Takagi-Sugeno ([10]) rule-bases, and to extremely large rule-bases, where the size of the rule-base is greater than the available operational memory ([11]).

IV. REDUCTION OF A GENERALIZED NEURAL NETWORK

The classical multilayer neural network can be generalized if the non-linear transfer functions are moved from the nodes into the links. It results in neurons that apply only a sum operation to the input values, and links that are characterized by possibly non-linear weighting functions, instead of simple constant weights.

Let us focus on two neighbouring layers $l$ and $l+1$ of a forward model. Let the neurons be denoted as $N_{l,i}$, $i = 1 \dots n_l$ in layer $l$, where $n_l$ is the number of neurons. Further, let the inputs of $N_{l,i}$ be $x_{l,i,k}$, $k = 1 \dots n_{l-1}$, and its output $y_{l,i}$. The connection between layers $l$ and $l+1$ can be defined by the $f_{l,j,i}(y_{l,i})$ weighting functions ($j = 1 \dots n_{l+1}$). Thus

$$x_{l+1,j,i} = f_{l,j,i}(y_{l,i}). \qquad (5)$$

Therefore, the output of neuron $N_{l+1,j}$ will be

$$y_{l+1,j} = \sum_{i=1}^{n_l} f_{l,j,i}(y_{l,i}). \qquad (6)$$

The weighting functions can also be changed during the training: the unknown weighting functions are approximated with linearly combined known functions, where only the linear combination must be trained. For this approximation the above described PSGS fuzzy systems can be used, with one input and one output:

$$f_{l,j,i}(y_{l,i}) = \sum_{t=1}^{m_{l,i}} \mu_{A_{l,i,t}}(y_{l,i}) \, b_{l,j,i,t}. \qquad (7)$$

To reduce the size of a generalised neural network the SVD-based complexity reduction can be used. Using (6), equation (7) can always be transformed into the following form:

$$y_{l+1,j} = \sum_{z=1}^{n_{l+1}^r} a_{l,j,z} \sum_{i=1}^{n_l} \sum_{t=1}^{m_{l,i}^r} \mu_{A_{l,i,t}}^r(y_{l,i}) \, b_{l,z,i,t}^r, \qquad (8)$$

where "r" denotes "reduced", further $n_{l+1}^r \le n_{l+1}$ and $\forall i: m_{l,i}^r \le m_{l,i}$.

The reduced form is represented as a network with an extra inner layer between layers $l$ and $l+1$. Between the original layer $l$ and the new layer the weighting functions are approximated from the reduced PSGS fuzzy systems, and layer $l+1$ simply computes the weighted sum (with weights $a_{l,j,z}$) of the outputs of the new layer.

The reduction means the reduction of the $B = [b_{l,j,i,t}]$ three-dimensional matrix in two steps. In the first step, the first dimension is reduced, and the $a_{l,j,z}$ are determined, while in the second the third dimension is reduced, and the new membership functions are determined. The detailed description of the algorithm can be found in [12].

The maximal error of the resulting neural network can be computed from the discarded singular values, considering that the singular values discarded in the first step count $n_l$ times ([13]).

The error bounds for generalised type neural networks with non-singleton consequents can be found in [14].
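To make the structure of the generalized network concrete, the following minimal sketch evaluates one layer whose neurons only sum their inputs and whose links carry one-input PSGS weighting functions with singleton consequents. The triangular shape of the membership functions, the function names and the numeric values are assumptions made for illustration; the paper only requires the antecedents to form a Ruspini partition.

```python
from typing import Callable, List, Sequence

Membership = Callable[[float], float]


def triangular_partition(centers: Sequence[float]) -> List[Membership]:
    """Triangular membership functions forming a Ruspini partition.

    For any y inside [centers[0], centers[-1]] the memberships sum to one.
    (Triangular shapes are an assumption of this sketch.)
    """
    def make(t: int) -> Membership:
        def mu(y: float) -> float:
            c = centers[t]
            if t > 0 and centers[t - 1] <= y <= c:
                return (y - centers[t - 1]) / (c - centers[t - 1])
            if t < len(centers) - 1 and c <= y <= centers[t + 1]:
                return (centers[t + 1] - y) / (centers[t + 1] - c)
            return 1.0 if y == c else 0.0
        return mu
    return [make(t) for t in range(len(centers))]


def weighting_function(mus: Sequence[Membership], b: Sequence[float]) -> Membership:
    """One-input PSGS system: f(y) = sum_t mu_t(y) * b_t."""
    return lambda y: sum(mu(y) * bt for mu, bt in zip(mus, b))


def layer_output(
    y_prev: Sequence[float],
    mus_per_input: Sequence[Sequence[Membership]],
    b: Sequence[Sequence[Sequence[float]]],
) -> List[float]:
    """Outputs of layer l+1: y[j] = sum_i f_{j,i}(y_prev[i]).

    b[j][i][t] is the singleton consequent of link (i -> j) for antecedent set t.
    """
    return [
        sum(
            weighting_function(mus_per_input[i], b[j][i])(y_prev[i])
            for i in range(len(y_prev))
        )
        for j in range(len(b))
    ]


# Hypothetical layer: two inputs, one output neuron, two antecedent sets per input.
mus = triangular_partition([0.0, 1.0])
outputs = layer_output([0.25, 0.5], [mus, mus], [[[0.0, 4.0], [2.0, 6.0]]])
print(outputs)
```

In this form the trainable parameters are exactly the `b[j][i][t]` values, which is the three-dimensional array that the SVD-based reduction operates on.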

V. ANYTIME USE OF SVD BASED FUZZY AND NN MODELS

Iterative algorithms are popular tools in anytime systems, because their complexity and computational time need can be flexibly changed according to the temporal conditions. These algorithms always produce some result when they are stopped (continuously or at certain discrete points in time), and the accuracy of the results is monotone, i.e., more accurate results can be obtained if the calculations are continued. The available time/resources need not be estimated in advance: the calculations can be continued until the results are needed, so the most accurate results achievable under the given conditions are always obtained.
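The behaviour described above can be sketched with any iteration whose accuracy improves monotonically. The example below uses Newton's iteration for a square root as a stand-in; the function name, the time budget parameter and the choice of problem are assumptions for illustration and are not taken from the paper.

```python
import time
from typing import Tuple


def anytime_sqrt(a: float, budget_s: float, max_iter: int = 50) -> Tuple[float, int]:
    """Newton iteration for sqrt(a), a >= 1, interrupted when the time budget expires.

    Starting from x = a, the iterates decrease monotonically towards sqrt(a),
    so a later interruption never yields a worse result than an earlier one.
    """
    deadline = time.monotonic() + budget_s
    x = a      # crude result, available immediately
    steps = 0
    while steps < max_iter and time.monotonic() < deadline:
        x = 0.5 * (x + a / x)
        steps += 1
    return x, steps


# With no time available the crude initial value is returned;
# with a generous budget the iteration runs until max_iter.
print(anytime_sqrt(9.0, 0.0))
print(anytime_sqrt(9.0, 1.0))
```

Note that the sketch returns a result at any interruption point but, like many iterative methods, gives no error bound with it.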

Unfortunately, the usability of iterative algorithms is limited. While there is a wide range of problems that can be solved by iterative algorithms, an adequate evaluation method cannot always be found. In some cases there is no effective iterative evaluation method at all, and more frequently the accuracy of the results is not known: we only know that the algorithm gives more and more accurate results, but it is not known how much time is needed to achieve a given accuracy, or what the error will be if the calculations are stopped at a given point.

Besides the iterative algorithms, a wide range of other types of computing methods/algorithms can also be used in anytime systems, by using modular architectures (Fig. 1.). The flexibility of this method is lower than in the case of iterative algorithms, and it needs some extra planning and consideration, but it can also be used in cases when no adequate iterative algorithm can be found at all.

In this case, each module of the system implementing a certain task is realized in several different ways: for the same task, there are several units which have the same inputs and outputs and solve the same problem, but have different complexity, accuracy, etc. At a given time, in the knowledge of the current conditions (tasks to complete, achievable time/resources, needed accuracy, etc.), an expert system chooses the adequate configuration, i.e., the units which will be used. This means the optimization of the whole system, instead of the optimization of the individual modules; e.g., in some cases it can be more advantageous to reduce the computational complexity and accuracy of some parts of the system and reallocate the resources to another task that is more important at the moment.
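The selection step can be sketched as follows, assuming each unit advertises its computational need and accuracy as attributes. The simple rule used here (pick the most accurate unit that fits the time budget and the accuracy requirement) is only a stand-in for the expert system, whose actual decision logic the paper does not specify; all names and numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Unit:
    name: str
    cost_ms: float       # assumed worst-case run time of the unit
    error_bound: float   # assumed guaranteed bound on the output error


def choose_unit(units: List[Unit], available_ms: float, max_error: float) -> Optional[Unit]:
    """Return the most accurate unit that satisfies both constraints, or None."""
    feasible = [
        u for u in units
        if u.cost_ms <= available_ms and u.error_bound <= max_error
    ]
    return min(feasible, key=lambda u: u.error_bound) if feasible else None


# Three hypothetical implementations of the same task.
units = [
    Unit("full", cost_ms=10.0, error_bound=0.0),
    Unit("reduced", cost_ms=4.0, error_bound=0.5),
    Unit("coarse", cost_ms=1.0, error_bound=2.0),
]
print(choose_unit(units, available_ms=5.0, max_error=1.0))
```

A system-level optimizer would apply such a choice jointly over all modules, trading accuracy in one module for resources in another.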

Although the units implementing a certain task may have different internal structures, from several points of view it is advantageous if they are built with a similar structure. In this case the adaptation or change between the units means only the change of some parameter set.

When we apply very simple SVD based fuzzy and NN models in anytime systems, it is enough to use one copy of the model and to continue the evaluation of the model (starting from the most significant part) until we reach the limit of the allowable computational time. After the evaluation is stopped, the error bound can be obtained from the model.
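The idea of evaluating a single model "starting from the most significant part" can be sketched on a plain matrix stored in SVD form: terms are accumulated in decreasing order of their singular values, and the sum of the singular values not yet used bounds the remaining error. This is a simplified illustration on a lookup table rather than on a full fuzzy or NN model; the budget is expressed as a number of terms instead of time, and all names are assumptions.

```python
from typing import Sequence, Tuple


def anytime_svd_lookup(
    u: Sequence[Sequence[float]],  # u[k] is the k-th left singular vector
    s: Sequence[float],            # singular values in decreasing order
    v: Sequence[Sequence[float]],  # v[k] is the k-th right singular vector
    i: int,
    j: int,
    max_terms: int,
) -> Tuple[float, float]:
    """Approximate F[i][j] from at most max_terms terms, with an error bound.

    Entries of unit-length singular vectors have magnitude at most one, so each
    skipped term changes the value by at most its singular value.
    """
    used = min(max_terms, len(s))
    value = 0.0
    for k in range(used):
        value += s[k] * u[k][i] * v[k][j]
    error_bound = sum(s[used:])
    return value, error_bound


# Hypothetical 2x2 example: F = [[2, 1], [1, 2]] with singular values 3 and 1.
r = 2 ** -0.5
u = v = [[r, r], [r, -r]]
s = [3.0, 1.0]
print(anytime_svd_lookup(u, s, v, 0, 1, max_terms=1))
print(anytime_svd_lookup(u, s, v, 0, 1, max_terms=2))
```

Because the factors are computed off-line, stopping early costs nothing extra at run time: the bound is available at every interruption point.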

On the other hand, if we use more complex models, the more general modular architecture should be applied because of the cross-effects within the module. The computational need, accuracy, etc. parameters should be included in the units as their attributes.

VI. ERROR EXPANSION

The temporary reduction of complexity causes the reduction of accuracy, as well. While in case of SVD-based complexity reduction this error can be easily estimated, to obtain the so called resultant error (the error of the whole system) further computations must be made, considering the errors of the different modules, and the path of data and error through the modules.
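The accumulation of error along a data path can be sketched as follows, assuming each module is described by an error transfer function (mapping an absolute input error to the additional output error it causes) and an additive internal error, both of which are detailed in the rest of this section. The representation of a module as a pair and the numeric values are assumptions for illustration.

```python
from typing import Callable, Sequence, Tuple

# (error transfer function, additive internal error) of one module
Module = Tuple[Callable[[float], float], float]


def resultant_error(chain: Sequence[Module], input_error: float = 0.0) -> float:
    """Propagate an absolute error through modules connected in series.

    At each module the incoming error is mapped by the module's error transfer
    function and the module's own internal error is added to the result.
    """
    error = input_error
    for transfer, internal in chain:
        error = transfer(error) + internal
    return error


# Hypothetical two-module path: A passes errors unchanged and adds 0.1;
# B doubles incoming errors and adds 0.05.
module_a: Module = (lambda e: e, 0.1)
module_b: Module = (lambda e: 2.0 * e, 0.05)
print(resultant_error([module_a, module_b]))
```

For the two-module case this reduces to applying B's error transfer function to A's error and adding B's own error.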

For the calculation of the resultant error the error transfer functions of the modules must also be known. If module B uses the results of module A, then, if the accuracy of module A is reduced, the accuracy of the outputs of B will be reduced as well. The $y = f_E(x)$ error transfer function means that if the input of the module has an absolute error $x$, then the output of the module will have an additional absolute error $y$. The errors along the data-path are cumulative. It is supposed that the internal error of the modules, originating from inexact computations, noise, etc., can be modeled as an additive error component in the output of the module. Thus in the example above, if module A has an error $E_A$ and module B has an error $E_B$, then the resultant error on the output of B will be $f_{E,B}(E_A) + E_B$, where $f_{E,B}$ is the error expansion function of module B.

Fig. 1. Anytime system with a modular architecture (block shown: Expert System)

The error transfer function can be determined in a given $x_0$ point as:

$$f_E(x_e) = \max_{|\Delta x| \le x_e} \left| f(x_0 + \Delta x) - f(x_0) \right|, \qquad (9)$$

where $f()$ is the transfer function of the given module/unit.

While this formula is accurate, it is also practically uncomputable, so simpler methods are needed. If $f()$ is monotone, then the error transfer function is:

$$f_E(x_e) = \max\left( \left| f(x_0) - f(x_0 - x_e) \right|, \left| f(x_0 + x_e) - f(x_0) \right| \right). \qquad (10)$$

This formula still needs the calculation of $f()$ two times for every $x_0$.

If $f()$ is linear or nearly linear, the error transfer function can be estimated as:

$$f_E(x_e) \approx x_e \cdot f'(x_0). \qquad (11)$$

While this gives an easily computable estimation, it can cause underestimation of the resultant error if $f()$ is non-linear.

A certain overestimation of the resultant error can be obtained as follows:

$$f_E(x_e) = x_e \cdot \max_{x \in [x_0 - x_e,\, x_0 + x_e]} \left| f'(x) \right|. \qquad (12)$$

While this formula also needs a lot of calculations, if the derivative of $f()$ has a global maximum, then the $f_E()$ function can be computed off-line and it will be valid for the whole domain. To improve the accuracy of the estimation, the domain can be divided into several overlapping intervals, and different $f_E()$ functions can be determined for the intervals.

With this last formula, only the derivative of $f()$ must be known. In case of the above described fuzzy inference systems and generalized neural networks this can be easily computed.

In dynamic systems, further considerations must be made. In this case, the error can spread not only in space, but also in time in the system; namely, the temporary reduction of accuracy may affect the operation of the system even after the restoration of the original accuracy. If the system contains additive memory elements then the error theoretically will never disappear from the system.

Let the $f_E()$ error expansion function of the feed-back module be bounded as:

$$f_E(x_e) \le k \cdot x_e. \qquad (13)$$

This means that if the absolute value of the error expansion (the factor $k$) is always less than one, then the error will sooner or later disappear from the system, but if its absolute value is greater than or equal to one, then the effect of a temporal accuracy-reduction will influence all of the later results ([15]).

This property of dynamic systems demands a great deal of precaution and planning from the engineers. The error expansion functions of the feed-back modules must always be estimated in advance (e.g. from the derivatives of the system), and if they do not have the wanted properties, i.e., they are not bounded, additional error-compensation methods must be used.

VII. RECONFIGURATION TRANSIENTS

Anytime algorithms based on feedback systems unavoidably suffer from transients. These well-known phenomena are due to the dynamic nature of the signal processing structures applied. Both parameter and structure adaptations generate transients. The nature of these transients depends not only on the transfer function of the structures to be implemented, but also on the actual information processing structure [16]. For this very reason the implementation of anytime algorithms must be performed using structures having good transient behavior. This structure dependency is strongly related to the “energy distribution” within the processing structure.

Another aspect to be considered is that if the system is reconfigured between the old and the new configuration through intermediate steps, the transients may decrease. The transients depend on the selection of the number and the actual locations of these intermediate steps. Unfortunately the simplest method, i.e., linear interpolation, in most cases will not ensure good results [16]. Fuzzy decision making may help to find the optimal strategy of the adaptation; however, controlling transients in reconfigurable systems is still an important area of investigation and research.

VIII. ILLUSTRATIVE EXAMPLE

In this Section, to illustrate the efficiency of the proposed SVD-based reduction method, a simple example with a three input-one output fuzzy model is presented.

The model describes the human hearing system and it is constructed similarly to the fuzzy system in Section III. The inputs are the frequency, sound intensity and age, while the output is the sound intensity felt by the person, i.e., the subjective intensity. The inputs have 16, 13 and 11 antecedent fuzzy sets (frequency, intensity, age), thus the number of rules is 16*13*11 = 2288. By discarding some of the singular values, the number of rules can be reduced to 1287 (9*13*11), with a relatively small guaranteed error-bound (1.7054).
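The rule counts quoted above follow directly from the number of antecedent sets per input, as the short check below shows. The variable names are chosen here for illustration; the derived saving is simple arithmetic on the figures given in the text.

```python
from math import prod

# Antecedent fuzzy sets per input: frequency, intensity, age.
original_sets = [16, 13, 11]
reduced_sets = [9, 13, 11]   # only the frequency dimension is reduced

original_rules = prod(original_sets)   # 2288
reduced_rules = prod(reduced_sets)     # 1287
saving = 1 - reduced_rules / original_rules

print(original_rules, reduced_rules, f"{saving:.2%}")
```

The reduced model therefore keeps 9/16 of the original rules, i.e., it evaluates 43.75% fewer rules.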

Fig. 2. shows the obtained surface of the original and the reduced fuzzy systems, as a function of frequency and sound intensity at age of 60 years, and the error of the reduction.

CONCLUSIONS

In systems where the available time and resources are limited, and can even change during the operation of the system, special algorithms, the so-called anytime techniques, can be used advantageously. While different soft computing methods are widely used in nearly every area of engineering, their usability is limited because of their high complexity. In this paper the use of exact and non-exact Singular Value Decomposition based complexity reduction is proposed for the elimination of the redundancy of fuzzy systems and generalized neural networks and for their anytime modes of operation, to overcome these problems and to ensure the possibility of continuous operation and “optimal” performance for the whole system.

REFERENCES

[1] B. Widrow, E. Walach, Adaptive Inverse Control, Prentice Hall, 1996.

[2] Billings, S.A., “Identification of Nonlinear Systems - A Survey,” IEE Proc., Vol. 127, Pt. D, No. 6, Nov. 1980, pp. 272-284.

[3] Klir, G.J., T.A. Folger, Fuzzy Sets, Uncertainty, and Information, Prentice-Hall International, Inc., 1988.

[4] Várkonyi-Kóczy, A.R., T. Kovácsházy, “Anytime Algorithms in Embedded Signal Processing Systems,” In Proc. of the IX. European Signal Processing Conference, EUSIPCO-98, Rhodes, Greece, Sep. 8-11, 1998, Vol. 1, pp. 169-172.

[5] Baron, C., J.-C. Geffroy, G. Motet, ed., Embedded System Applications, Kluwer Academic Publishers, 1997.

[6] P. Baranyi, Y. Yam, “Singular Value-Based Approximation with Takagi-Sugeno Type Fuzzy Rule Base,” In IEEE Int. Conf. on Fuzzy Systems, 1997, pp. 265-270.

[7] P. Baranyi and Y. Yang, “Singular value-based approximation with non-singleton support,” In Seventh Int. IFSA World Congress, Prague, June 25-29, 1997, pp. 127-132.

[8] Baranyi P., Y. Yam, H. Hashimoto, P. Korondi, P. Michelberger, “Approximation and Complexity Reduction of the Generalized Neural Network,” accepted to IEEE Trans. on Fuzzy Systems.

[9] Yam, Y., P. Baranyi, C.T. Yang, “Reduction of Fuzzy Rule Base Via Singular Value Decomposition,” IEEE Trans. on Fuzzy Systems, Vol. 7, No. 2, Apr. 1999, pp. 120-132.

[10] P. Baranyi, Y. Yam, C.T. Yang, A.R. Várkonyi-Kóczy, “Complexity Reduction of a Rational General Form,” Proc. of the IEEE Int. Conference on Fuzzy Systems, FUZZ-IEEE'99, Aug. 22-25, 1999, Seoul, Korea, pp. 366-371.

[11] P. Baranyi, Y. Yam, C.T. Yang, A.R. Várkonyi-Kóczy, “Practical Extension of the SVD Based Reduction Technique for Extremely Large Fuzzy Rule Bases,” Proc. of the IEEE Int. Workshop on Intelligent Signal Processing, WISP'99, Sep. 4-7, 1999, pp. 29-33.

[12] Baranyi P., Y. Yam, H. Hashimoto, P. Korondi, P. Michelberger, “Approximation and Complexity Reduction of the Generalized Neural Network,” accepted to IEEE Trans. on Fuzzy Systems.

[13] Lei, K., “Error bound of the SVD Based Neural Network Reduction,” submitted to IEEE Workshop on Intelligent Signal Processing, WISP'2001, Budapest, Hungary, May 24-25, 2001.

[14] Takács, O., A.R. Várkonyi-Kóczy, “Error-Bound for the Non-Exact SVD-Based Complexity Reduction of the Generalized Type Hybrid Neural Network with Non-Singleton Consequents,” submitted to the 2000 IEEE Instrumentation & Measurement Technology Conf., IMTC2000, Budapest, Hungary, May 21-23, 2001.

[15] Dorf, R.C., Modern Control Systems, Addison-Wesley Publ. Comp., USA, 1987.

[16] Péceli, G., T. Kovácsházy, “Transients in Reconfigurable Systems,” In Proc. 1998 IEEE Instrumentation & Measurement Technology Conf., IMTC'98, St. Paul, Minnesota, USA (May, 1998), pp. 919-922.

Fig. 2. The original and the reduced model of the human hearing system and the error of its reduction (panels: original rule-base, reduced rule-base)