An integrated structure for Kalman-filter-based measurand reconstruction



IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, VOL. 43, NO. 3, JUNE 1994 403

An Integrated Structure for Kalman-Filter-Based Measurand Reconstruction

Andrzej Barwicz, Member, IEEE, Daniel Massicotte, Student Member, IEEE, Yvon Savaria, Member, IEEE, Marc-Alain Santerre, and Roman Z. Morawski, Member, IEEE

Abstract-The problem of improving the quality of measurand reconstruction using integration techniques for implementation of various measurement functions is addressed. An integrated specialized structure for Kalman-filter-based reconstruction of the measurand is proposed. It is designed to act as an external coprocessor for a host processor. Although intended to improve the resolution of spectrometric measurements, it may be used in other applications where similar processing of measurement signals is required. The performance of the integrated specialized structure is compared to that of the general-purpose digital signal processor DSP56001.

I. INTRODUCTION

MEASURAND reconstruction constitutes a fundamental problem of metrology [2], which may be formulated using the general scheme of measurement shown in Fig. 1. In this figure, $x$ is a measurand, i.e., the empirical characteristic of the object of measurement whose determination is the purpose of measurement, $\tilde{y}$ is the raw result of measurement, and $\hat{x}$ is the final result of measurement.

The transformation of the measurement information in the measurement channel can be decomposed logically into two stages [2]: conversion, aimed at transferring measurement information into the domain of easily interpretable phenomena (e.g., electrical signals, digital codes), and reconstruction, aimed at transforming the result of conversion $\tilde{y}$, i.e., the raw result of measurement, into the final result of measurement $\hat{x}$.

Recent achievements in the domain of microelectronics make possible an improvement of the quality of measurements, in particular by the use of micromechanical sensors (conversion) and sophisticated digital-signal-processing algorithms (reconstruction), both implemented in VLSI technology. Depending on the accuracy, speed, and reliability of measurement required, one may choose general-purpose means of digital signal processing (microcontrollers, DSPs, field-programmable gate arrays, etc.) or a specialized VLSI processor. Specialized processors dedicated to measurement

Manuscript received May 18, 1993; revised January 31, 1994. This work was supported by the Natural Sciences and Engineering Research Council of Canada, Canadian Microelectronics Corporation, and by the University of Québec. Hewlett-Packard Canada contributed with equipment support.

A. Barwicz and M.-A. Santerre are with the Département d'Ingénierie, Université du Québec à Trois-Rivières, C.P. 500, Trois-Rivières, Québec G9A 5H7, Canada.

D. Massicotte and Y. Savaria are with the Département de Génie Électrique et Informatique, École Polytechnique de Montréal, C.P. 6079, Succ. A, Montréal, Québec H3C 3A7, Canada.

R. Z. Morawski is with the Institute of Radioelectronics, Warsaw University of Technology, ul. Nowowiejska 15/19, 00-665 Warsaw, Poland.

IEEE Log Number 9402229.

Fig. 1. General scheme of measurement.

applications become necessary if the measurand reconstruction requires processing of mathematical models leading to high computational requirements. In general, the justification for the use of a specialized processor instead of a general-purpose digital signal processor (DSP) comes primarily from the speed requirement. A specialized processor adapted to the reconstruction algorithms offers a much higher speed than a general-purpose DSP. Another justification arises if the direct association of integrated sensors with a processor is desirable in order to reduce the influence of noise or signal distortion. However, when accuracy of computation is of primary importance, a general-purpose DSP usually yields satisfactory results. Of course, there are other widely recognized incentives for integration, such as requirements concerning miniaturization of the measuring system, its reliability, limited power consumption, and ease of development and/or maintenance.

In this paper, an integrated structure for Kalman-filter-based reconstruction of the measurand is proposed. Although intended to improve the resolution of spectrometric measurements, it may be used in other applications where similar processing of measurement signals is required.

II. MEASURAND RECONSTRUCTION IN SPECTROMETRY

The results of spectrophotometric measurement (the absorption spectrum of a sample under study) are subject to systematic errors of an instrumental type. To take into account their total effect, the following model of the recorded spectrum $y(\lambda)$ is used most frequently:

$$y(\lambda) = \int_{-\infty}^{+\infty} g(\lambda - \lambda')\, x(\lambda')\, d\lambda' \tag{1}$$

where $\lambda$ is the wavelength, $x(\lambda)$ is the undistorted spectrum as it might be recorded by a perfectly resolving spectrophotometer, and $g(\lambda)$ is the unnormalized incoherent optical response function, whose shape resembles a steep Gaussian function. Digital correction of spectrophotometric data consists of numerically solving (1) on the basis of measured samples

0018-9456/94$04.00 © 1994 IEEE


Fig. 2. Model of the data used for correction.

of $y(\lambda)$, which are inevitably subject to random errors, and an a priori identified $g(\lambda)$. This operation, a special case of measurand reconstruction, is, as a rule, numerically ill-conditioned. Therefore, care must be taken with noisy data to avoid large errors in the results. Numerous regularization techniques have been developed to deal with this problem [2]. Among them, iterative methods are the most popular in the domain of spectrometry (cf., e.g., [3]). In other domains, such as seismology and dynamic calorimetry, more efficient techniques, the variational methods in particular, have gained popularity. The Kalman-filter-based algorithms of measurand reconstruction belong to this group.
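The ill-conditioning mentioned above can be made concrete with a small numerical sketch. The Python fragment below is an illustration only: the Gaussian width `sigma` (in units of the sampling step), the truncation `half_width`, and all helper names are assumptions, not values taken from the paper. It evaluates how strongly a sampled Gaussian response attenuates each normalized frequency; an unregularized inverse filter would amplify the noise in the data by the reciprocal of that attenuation.

```python
import math


def gaussian_kernel(sigma, half_width):
    """Sampled Gaussian response, normalized to unit sum (hypothetical g)."""
    taps = [math.exp(-0.5 * (m / sigma) ** 2)
            for m in range(-half_width, half_width + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def transfer_magnitude(kernel, freq):
    """Magnitude of the kernel's frequency response at a normalized frequency."""
    half = len(kernel) // 2
    re = sum(g * math.cos(2 * math.pi * freq * (i - half))
             for i, g in enumerate(kernel))
    im = sum(g * math.sin(2 * math.pi * freq * (i - half))
             for i, g in enumerate(kernel))
    return math.hypot(re, im)


kernel = gaussian_kernel(sigma=2.0, half_width=8)
frequencies = (0.0, 0.1, 0.25, 0.5)  # cycles per sample, 0.5 is Nyquist
attenuation = {f: transfer_magnitude(kernel, f) for f in frequencies}
# Noise gain of a naive (unregularized) inverse filter at each frequency.
inverse_gain = {f: (float("inf") if a == 0.0 else 1.0 / a)
                for f, a in attenuation.items()}

for f in frequencies:
    print(f"f = {f:4.2f}  attenuation = {attenuation[f]:.3e}  "
          f"inverse gain = {inverse_gain[f]:.3e}")
```

Under these assumed settings the gain grows by several orders of magnitude between zero frequency and the Nyquist frequency, which is why regularization is needed before the data can be inverted.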

III. KALMAN-FILTER-BASED ALGORITHMS OF MEASURAND RECONSTRUCTION

Numerous algorithms for solving the problem formulated in the previous section have been developed using the principle of Kalman filtering [4]-[12]. Some attempts have also been made to design integrated structures dedicated to these algorithms [13], [14], [16]-[20]. As an example, the Kalman-filter-based algorithm for spectrometric data correction proposed in [11] will be outlined here. The principle of model decomposition of the spectrometric data is shown in Fig. 2. It is based on a convolutional factorization of the optical response function

$$g(\lambda) = h^1(\lambda) * h^2(\lambda) * \cdots * h^K(\lambda) \tag{2}$$

where $h^k(\lambda)$ are real-valued functions whose shapes resemble very steep Gaussians. In this way, the problem of solving (1) is transformed into the problem of solving the series of $K$ equations

$$y^1(\lambda) = y(\lambda) = h^1(\lambda) * y^2(\lambda) \tag{3}$$

$$h^k(\lambda) * y^{k+1}(\lambda) = y^k(\lambda) \quad \text{for } k = 1, 2, \ldots, K \tag{4}$$

on the basis of the noisy samples $\{\tilde{y}_n\}$ of $y(\lambda)$. The use of a Kalman filter for solving (4) is based on the discretization of the equation

[Equations (5) and (6), each containing a sum over $m = 1, \ldots, M$, are not legible in this transcript.]

Here $\Delta\lambda$ is the discretization step along the $\lambda$-axis, $\lambda_n = \lambda_{MIN} + (n - 1)\Delta\lambda$, $\lambda'_m = \lambda'_{MIN} + (m - 1)\Delta\lambda$, $x_n = x(\lambda_n)$, $g_m = g(\lambda'_m)$; $x(\lambda)$ and $y(\lambda)$ vanish outside the interval $[\lambda_{MIN}, \lambda_{MIN} + (N - 1)\Delta\lambda]$, and $g(\lambda)$ vanishes outside $[\lambda'_{MIN}, \lambda'_{MIN} + (M - 1)\Delta\lambda]$.

[The state-space model, equations (7) and (8), is not legible in this transcript.]

with $h^k_m = h^k(\lambda'_m)$ for $k = 1, 2, \ldots, K$. Here $\{u^k_n\}$ and $\{\eta^k_n\}$ are the stationary random sequences modeling the solution of (4) and the error in the data, respectively. For the sake of simplicity, a steady-state version of the Kalman filter [15] is used to estimate the state vector

[Equation (9) is not legible in this transcript.]

$$\hat{z}^k_{n/n-1} = \Phi\, \hat{z}^k_{n-1} \tag{10}$$

Here the subscript $j/l$ denotes the estimate at time $j$ given the data available at time $l$, and the symbol $\hat{z}^k_n$ is used instead of $\hat{z}^k_{n/n}$ to simplify the notation. The vector of the steady-state Kalman gain

$$k^k_\infty = \lim_{n \to \infty} k^k_n$$

is computed according to

[Equations (11)-(15) are not legible in this transcript.]

which involve a regularization parameter chosen to minimize the reconstruction error [11]. It follows from (11)-(15) that $k^k_\infty$ may be calculated in advance since it does not depend on the data.

Taking into account that for physical reasons the undistorted spectrum $x(\lambda)$ is nonnegative, the following constraint is imposed on the solution:

$$\hat{z}^k_{n,m} \leftarrow \begin{cases} \text{[illegible expression involving } \alpha\text{]} & \text{if } \hat{z}^k_{n,m} < 0 \\ \hat{z}^k_{n,m} & \text{if } \hat{z}^k_{n,m} \geq 0 \end{cases} \quad \text{for } m = 1, 2, \ldots, M \tag{16}$$

where $\alpha \in [0, 1]$ is a parameter to be optimized empirically if $K = 1$, and $\alpha = 0$ if $K > 1$.
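The convolutional factorization (2) can be illustrated numerically. The sketch below assumes, purely for illustration, that all $K$ factors are identical sampled Gaussians; the paper only states that their shapes resemble very steep Gaussians, and the values of `K`, `sigma`, and `half_width` as well as the helper names are hypothetical. It shows that convolving $K$ narrow factors yields a response whose variance is the sum of the factor variances, so each factor is narrower than $g$ by a factor of about $\sqrt{K}$.

```python
import math


def gaussian_taps(sigma, half_width):
    """Sampled Gaussian normalized to unit sum (a hypothetical factor h^k)."""
    taps = [math.exp(-0.5 * (m / sigma) ** 2)
            for m in range(-half_width, half_width + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def convolve(a, b):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def variance(taps):
    """Second central moment of a unit-sum sequence, in squared samples."""
    mean = sum(i * t for i, t in enumerate(taps))
    return sum((i - mean) ** 2 * t for i, t in enumerate(taps))


K = 4                                   # assumed number of factors
h = gaussian_taps(sigma=1.5, half_width=8)
g = h
for _ in range(K - 1):                  # g = h * h * ... * h (K factors)
    g = convolve(g, h)

width_ratio = math.sqrt(variance(g) / variance(h))
print(f"width of g relative to one factor: {width_ratio:.3f}")
```

With identical factors the width ratio equals $\sqrt{K}$, which gives a rough sense of how much steeper each $h^k$ is than the overall response $g$.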


BARWICZ et al.: AN INTEGRATED STRUCTURE FOR KALMAN-BASED RECONSTRUCTION 405

The model of the measurement data, equations (7) and (8), makes it possible to include a fixed-lag smoother without extra calculations [15]. The estimates $\hat{x}^k_n$ are extracted from the state vector $\hat{z}^k_n$ according to

[Equations (17) and (18) are not legible in this transcript. Equation (17) selects an element of the state estimate scaled by $(c^k)^{-1}$, with one case for $n = 1, 2, \ldots, N - d$ and another for $n = N - d + 1, \ldots, N$; equation (18) defines $c^k$.]

where $d$ is the delay of estimation.

The outlined reconstruction algorithm reveals the common features of the whole class of Kalman-filter-based algorithms which make them suitable for integration, i.e., their recursive nature, the possibility of their iterative use, the sparseness of the matrix $\Phi$, and the limited number of nonzero elements in the vector $b$. The iterative nature of the algorithm reinforces the justification for the development of the specialized processor, since it increases the computational complexity by a factor of $K$.

IV. DEVELOPMENT OF THE INTEGRATED STRUCTURE

Even the simplest case of the algorithm described in Section III, viz., that corresponding to $K = 1$, can be applied to a large class of signals [5], [7], [8], [10]. Therefore, we propose a specialized architecture for implementation of the algorithm corresponding to $K = 1$, and we show how to use it for measurand reconstruction with the iterative version of the algorithm ($K > 1$), which requires high computational capacity from the processor.

The architectures of general-purpose DSPs available on the market are intended to be versatile, and they are not optimized for any particular algorithm. Computing operations are performed over many cycles using only one multiplier/accumulator (M/A). A specialized architecture designed for a stationary version of the Kalman filter, using only one M/A unit, was proposed in [16]. Systolic architectures oriented toward nonstationary versions of the Kalman filter were suggested in [17]-[20]. When applied to a stationary model of the data $\{\tilde{y}_n\}$ for $M \geq 64$, the specialized processors based on those architectures become very expensive and are also "resistant" to the introduction of constraints. On the other hand, the particular form of the matrix $\Phi$ and of the vector $b$ makes it possible to simplify these architectures.

A. From Algorithm to Architecture

The developed structure should be faster than a general-purpose DSP and suitable for incorporation of the positivity constraints. To keep cost at a reasonable level, we have chosen for implementation the steady-state version of the Kalman filter defined by (9) and (10), using the Kalman gain precomputed according to (11)-(15). In this case, each iteration performed by this filter requires $2MN$ multiplications and $2MN$ additions. To maximize the performance of the processor, its architecture was developed to minimize the number of cycles required to execute our target algorithm within practical implementation and complexity constraints. The processor executes one instruction per cycle. Data are assigned to several independent memory modules for access in parallel, and the internal data paths make it possible to exploit the parallelism of the filtering algorithm at the vector-element level. To make this parallelism apparent, we have transformed (9) and (10) into the sequential form

$$\hat{z}^k_{n/n-1} = \Phi f(\hat{z}^k_{n-1}), \quad \hat{z}^k_0 = 0 \tag{19}$$

$$\hat{y}^k_n = (h^k)^T \hat{z}^k_{n/n-1} \tag{20}$$

$$I^k_n = \tilde{y}^k_n - \hat{y}^k_n \tag{21}$$

$$\hat{z}^k_n = \hat{z}^k_{n/n-1} + k^k_\infty I^k_n \tag{22}$$

where $f(\cdot)$ represents the positivity constraint of (16). The operation (19) is easy to execute because the multiplication of the vector $f(\hat{z}^k_{n-1})$ by the matrix $\Phi$ simply shifts the elements of the vector, i.e.,

$$\hat{z}^k_{n/n-1} = \left[\hat{z}^k_{n-1,2} \mid \hat{z}^k_{n-1,3} \mid \hat{z}^k_{n-1,4} \mid \cdots \mid \hat{z}^k_{n-1,M-1} \mid \hat{z}^k_{n-1,M} \mid \hat{z}^k_{n-1,M}\right]^T. \tag{23}$$

Operations (20) and (22) are a bit more complicated because they involve vector multiplication. Operation (22) uses the results of operation (20), but careful study reveals that we may execute them in parallel since the computation of elements of the vectors in equations (20) and (22) can be alternated [13]. The time necessary for operations (15), (16), and (21) is not significant for the total processing time.
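The sequential form (19)-(22) can be sketched in a few lines of Python for a single iteration $k$. This is a minimal illustration, not the authors' implementation. Two points are assumptions: the positivity function $f(\cdot)$ is taken as clipping negative elements to zero (the $\alpha = 0$ case), and the shift by $\Phi$ is taken to repeat the last element, as read from (23). The vectors `h` and `k_inf` hold made-up values; in the paper the gain is precomputed from (11)-(15).

```python
def positivity(z):
    """Assumed form of f(.): clip negative elements to zero (alpha = 0)."""
    return [v if v >= 0.0 else 0.0 for v in z]


def filter_step(z_prev, y_meas, h, k_inf):
    """Process one measurement sample with the steady-state filter."""
    constrained = positivity(z_prev)
    # (19): multiplying by Phi shifts the elements up by one position.
    # Assumption: the last element is repeated, as read from (23).
    z_pred = constrained[1:] + [constrained[-1]]
    # (20): predicted measurement, M multiplications.
    y_hat = sum(hm * zm for hm, zm in zip(h, z_pred))
    # (21): innovation.
    innovation = y_meas - y_hat
    # (22): correction with the precomputed gain, M more multiplications.
    z_new = [zm + km * innovation for zm, km in zip(z_pred, k_inf)]
    return z_new, y_hat, innovation


# Hypothetical values for M = 3, chosen only to exercise the code.
h = [0.5, 0.25, 0.25]
k_inf = [0.1, 0.2, 0.3]
z_new, y_hat, innovation = filter_step([1.0, -2.0, 3.0], 2.0, h, k_inf)
print(z_new, y_hat, innovation)
```

Each call performs $2M$ multiplications, so processing $N$ samples costs $2MN$ multiplications per iteration, in line with the count given in the text.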

The flow diagram of the algorithm is shown in Fig. 3, where vertical arrows indicate sequential computations and horizontal arrows the possibility of parallel computations. The estimate $\hat{y}^k_{n+1}$ is computed in parallel with the estimate $\hat{z}^k_n$, which is necessary for obtaining the data $\hat{y}^k_{n+1}$ used for the next sample. Fig. 4 shows the sequence of parallel computation presented in Fig. 3. The right-hand column represents the calculation producing $\hat{z}^k_n$ and the left-hand column the operations producing $\hat{y}^k_{n+1}$. This figure emphasizes the links between the two operations and shows how their parallelism may be exploited by using many M/A units for obtaining $\hat{z}^k_n$ and $\hat{y}^k_{n+1}$ [13]. The recursive equations following from Fig. 4 are

$$\hat{z}^k_{n,m+1} = \hat{z}^k_{n/n-1,m+1} + k^k_{\infty,m+1} I^k_n \tag{24}$$

$$\hat{y}^k_{n+1,m-1} = h^k_{m-1}\, \hat{z}^k_{n+1/n,m-1} + \hat{y}^k_{n+1,m-2} \tag{25}$$

or

$$\hat{y}^k_{n+1,m-1} = h^k_{m-1}\, f(\hat{z}^k_{n,m}) + \hat{y}^k_{n+1,m-2} \tag{26}$$

where $\hat{z}^k_{n+1/n,m-1} = f(\hat{z}^k_{n,m})$. Equations (24), (25), and (26) show the dependence between the processed data $\hat{y}^k_{n+1,m-1}$ and $\hat{z}^k_{n,m}$, which suggests the distribution of the operations among an even number of M/A units; (24) and (26) are computed simultaneously [13]. Simultaneous access to the elements of the vectors $\hat{z}^k_{n/n-1}$ and $k^k_\infty$, as well as to the


Fig. 3. Data flow diagram of the algorithm.

Fig. 4. Sequence of parallel computation of the operations (20) and (22). The function $f(\cdot)$ represents the positivity constraint defined in (16).

elements of the vectors $f(\hat{z}^k_n)$ and $h^k$, should be ensured. The result of reconstruction $\hat{x}^k_n$ is found according to (16).
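The interleaving expressed by (24) and (26) can be checked with a short sketch: as soon as element $m$ of the corrected state is available, it can feed the running sum of the next predicted measurement, so both recursions advance in the same loop. The code below is an illustration under assumptions, not the hardware schedule. It again takes $f(\cdot)$ as clipping to zero and repeats the last element in the shift, and it compares the interleaved loop with a plain two-pass reference. All names and numeric values are hypothetical.

```python
def clip(v):
    """Assumed positivity function f(.) for a single element (alpha = 0)."""
    return v if v >= 0.0 else 0.0


def interleaved_update(z_pred, innovation, h, k_inf):
    """Compute the corrected state and the next prediction in one loop."""
    size = len(z_pred)
    z_new = [0.0] * size
    y_next = 0.0
    for m in range(size):
        # (24): correct element m of the state (work for one M/A unit).
        z_new[m] = z_pred[m] + k_inf[m] * innovation
        # (26): element m-1 of the shifted vector is f(z_new[m]), so the
        # running sum of the next prediction advances in the same step
        # (work for the other M/A unit).
        if m >= 1:
            y_next += h[m - 1] * clip(z_new[m])
    # Assumption: the last element of the shifted vector repeats f(z_new[-1]).
    y_next += h[-1] * clip(z_new[-1])
    return z_new, y_next


def two_pass_reference(z_pred, innovation, h, k_inf):
    """Same result computed sequentially: full correction, then prediction."""
    z_new = [zm + km * innovation for zm, km in zip(z_pred, k_inf)]
    shifted = [clip(v) for v in z_new[1:]] + [clip(z_new[-1])]
    y_next = sum(hm * zm for hm, zm in zip(h, shifted))
    return z_new, y_next


h = [0.5, 0.25, 0.25]
k_inf = [0.1, 0.2, 0.3]
z_a, y_a = interleaved_update([0.0, 3.0, 3.0], 0.5, h, k_inf)
z_b, y_b = two_pass_reference([0.0, 3.0, 3.0], 0.5, h, k_inf)
print(z_a, y_a)
```

Because the two multiplications inside the loop are independent of each other, they are natural candidates for the two M/A units described in the next subsection.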

B. Proposed Architecture

The functional block diagram of the designed integrated structure shown in Fig. 5 allows the effective parallel execution of (24) and (26). The vectors $\hat{z}^k$, $k^k$, and $h^k$ used by the algorithm of reconstruction are stored in three internal memories MEM A, MEM B, and MEM C, respectively. The register MEM REG is used to store the data $N$, $M$, $K$, and $\alpha$ in (16) for the algorithm. To ensure high-speed parallel transfer of data, all memory units may be accessed simultaneously because each of them has its own data bus. When the structure is used to execute different operations, the multiplexers (MUX) are used to link the buses to the memories and to the internal registers of the M/A units. Thus, the architecture comprises six data buses: four buses for the memories MEM A, MEM B, MEM C, and MEM REG, one bus for the output of M/A 1 and M/A 2, and one bus for external access (Data I/O). To

Fig. 5. Functional block diagram.

avoid delays in addressing the data, each memory has its own address generation unit (AGU) using two pointers to allow the division of the memory into separate sections. One pointer is modified while the other is in use, and in this way vectorial processing is accelerated [14]. Regarding the data path block on the right, the implementation of a complete arithmetic-logic unit would not be efficient considering the silicon area required. Therefore, to execute its necessary computational functions, the data path has independent comparator and adder units. The comparator supports the introduction of the positivity constraints defined by (16), and an adder solves (22) in one cycle. An advantage of this simplified design is that it makes possible parallel transfer of data to the memories. The counter unit supports the execution of loops. It includes four independent counters: one of them may be used to generate the time base for sampling, and the others may be used to control the loops. The microprogrammable sequencer is similar to the Advanced Micro Devices Am2910, and it contains a RAM unit for microcode [22]. It supports a horizontal programming style (a horizontal microcode instruction typically contains separate fields dedicated to each control block of the architecture) to execute as many operations as possible in one cycle while minimizing the complexity of the control logic. The vectors $k^k$ and $h^k$ can be changed during the processing by stopping the sequencer with a suitable control signal. This makes possible the use of reconstruction algorithms based on a nonstationary version of the Kalman filter.

The horizontal microcode instructions implemented in the memory of the microsequencer are not easily interpretable for a programmer. Thus, it was necessary to define a set of instructions to facilitate programming. A set of 11 dedicated instructions (at the assembler level) was defined in order to execute the algorithm to be implemented (Table I). Several dedicated instructions can be concatenated to obtain a horizontal instruction executable in one clock cycle. The program executing the proposed algorithm (Section III) is composed of 64 horizontal instructions. This set of instructions can also be used to execute various signal-processing algorithms such as the FFT and IIR or FIR filters.

C. Implementation of the Iterative Algorithm

The proposed architecture for $K = 1$ may be used for implementation of the iterative algorithm of reconstruction in


TABLE I
SET OF SPECIALIZED INSTRUCTIONS

Mnemonic  Operation                     Mnemonic  Operation
ABS       Absolute Value                JMP       Jump
CMP       Compare                       JSR       Jump to Subroutine
DEC       Decrement Address Register    MOVE      Move Data (including the M/A)
DO        Start Hardware Loop           RTS       Return from Subroutine
INC       Increment Address             WAIT      Wait for Interrupt
Jcc       Jump Conditionally

Fig. 6. Use of the architecture for iterative algorithm.

two ways shown in Fig. 6. In the first version [Fig. 6(a)], a single processor supports all $K$ iterations, while in the second version [Fig. 6(b)] the computation is pipelined on $N_p$ processors with the data decomposition of Fig. 2. In the first version, the processor would complete the computation of $K$ iterations before acquisition of the next measurement sample $\tilde{y}_n$. This, however, would require splitting its internal memories as shown in Fig. 7. For each iteration $k$, it is necessary to store the vectors $\hat{z}^k_n$, $k^k$, and $h^k$ in three memory modules $M_z$, $M_k$, and $M_h$, respectively. Each module may be divided into a number of sections $S_z$, $S_k$, and $S_h$, respectively, with $1 \leq S_z \leq K$, $1 \leq S_k \leq K$, and $1 \leq S_h \leq K$, and each section has $M$ memory cells. However, the vectors $k^k$ and $h^k$ can be the same for different iterations [11], and in this case one section suffices for each of them ($S_k = 1$ and $S_h = 1$), while the vectors $\hat{z}^k_n$ still require $S_z$ sections. Two problems arise with this first solution: the processing time is proportional to $K$ and, to support vectors of fixed size, the size of the internal memories would grow with $K$ to excessive values if $K$ were large. Thus, in Fig. 6(b) we propose the use of $N_p = K/S_z$ pipelined units, each supporting $S_z$ iterations. This solution offers maximum throughput when $N_p = K$ and minimum total cost when $N_p = 1$.
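The trade-off between the two versions can be tabulated with a small helper. The function below is a hypothetical sketch: its name, its arguments, and the rule that unshared gain and response vectors need as many sections as the state ($S_k = S_h = S_z$) are assumptions made for illustration. It only encodes the relations stated above, namely $N_p = K/S_z$ units and $M$ cells per memory section.

```python
def pipeline_plan(num_iterations, sections_z, vector_size, shared_k_and_h=True):
    """Return the number of pipelined units and the cells per memory module.

    num_iterations : K, number of iterations of the algorithm
    sections_z     : S_z, iterations handled by one unit
    vector_size    : M, cells per memory section
    shared_k_and_h : True if k^k and h^k are the same for all iterations
    """
    if not 1 <= sections_z <= num_iterations:
        raise ValueError("S_z must satisfy 1 <= S_z <= K")
    if num_iterations % sections_z != 0:
        raise ValueError("K must be a multiple of S_z in this sketch")
    units = num_iterations // sections_z          # N_p = K / S_z
    sections_k = 1 if shared_k_and_h else sections_z
    sections_h = 1 if shared_k_and_h else sections_z
    cells = {
        "M_z": sections_z * vector_size,
        "M_k": sections_k * vector_size,
        "M_h": sections_h * vector_size,
    }
    return units, cells


# Example with made-up sizes: K = 8 iterations, M = 32 cells per section.
for s_z in (1, 2, 8):
    print(s_z, pipeline_plan(8, s_z, 32))
```

Setting `sections_z` to 1 gives the fully pipelined case $N_p = K$, while setting it to $K$ gives the single-processor case $N_p = 1$ with the largest state memory.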

D. Prototyping

The developed specialized integrated structure is intended to act as an external coprocessor for a host processor, as shown in Fig. 8. It is composed of two M/A units and of a sequencer controlling these units according to the executed algorithm, with or without positivity constraints. Two M/A units are used as a trade-off between speed and silicon area. Nevertheless, the architecture allows the use of more than two M/A units when justified by speed requirements and acceptable from the cost point of view. In Fig. 8, Data I/O is the input/output bus for

Fig. 7. Splitting of the memory.

Fig. 8. General block diagram of the processor using the proposed structure.

the data, ADD PGM is the address bus for programming the memories, CNTR is the control bus, Test Port is used for internal scan-chain testing of the chip, and Vdd - GND is the power supply.

The design was made by means of the standard CAD tools available from the Canadian Microelectronics Corporation (CMC): CADENCE, SILOS, BNR DFT, SYNFUL, and the high-level hardware description language VERILOG. Since the design may be implemented using different word lengths and technologies, a reduced-cost 8-b prototype was fabricated using the 1.2 µm CMOS technology available from CMC. In this prototype, MEM A, MEM B, and MEM C are 32 x 8-b memories. To save silicon area, the memory MEM REG is not implemented, and the corresponding data are introduced into the microcode instruction. The microprogrammable sequencer is divided into eight 32 x 8-b sections to support horizontal programming of 64-b instructions. The M/A 1 and M/A 2 units are 8 x 8-b signed multipliers with 16-b accumulators and have been designed according to the Booth algorithm [23]. The MUX units are designed with tristate buffers to save silicon area. A two-phase nonoverlapping clock is used. The chip comprises 24 000 transistors without the memories, and the silicon area is 5.1 x 9.0 mm². A 68-pin grid array package has been chosen, and 59 pins are used. This manufactured prototype is sufficient for physical confirmation of the design and will be used for teaching purposes.
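The text states only that the 8 x 8-b signed multipliers follow the Booth algorithm [23]. As a reminder of what that recoding does, the sketch below implements the textbook radix-2 form in Python. Whether the prototype uses radix-2 or a higher radix is not stated, so the radix, the function name, and the software formulation are assumptions for illustration. Each pair of adjacent multiplier bits decides whether the shifted multiplicand is subtracted, added, or skipped.

```python
def booth_multiply(multiplicand, multiplier, bits=8):
    """Signed multiplication by radix-2 Booth recoding of the multiplier."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (low <= multiplicand <= high and low <= multiplier <= high):
        raise ValueError("operands must fit in the given signed word length")
    pattern = multiplier & ((1 << bits) - 1)   # two's complement bit pattern
    product = 0
    previous_bit = 0                           # implicit bit right of the LSB
    for position in range(bits):
        bit = (pattern >> position) & 1
        if (bit, previous_bit) == (1, 0):      # start of a run of ones
            product -= multiplicand << position
        elif (bit, previous_bit) == (0, 1):    # end of a run of ones
            product += multiplicand << position
        previous_bit = bit
    return product


print(booth_multiply(-7, 13))   # same value as -7 * 13
```

For 8-b operands the largest product magnitude is 128 x 128 = 16 384, which fits in the 16-b accumulator mentioned above.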

V. PERFORMANCE EVALUATION

The previous section has shown that a specialized structure can be implemented with fewer components than a general-purpose DSP such as the DSP56001, which contains about


TABLE II
RESULTS OF THE COMPARISON OF THE PERFORMANCE OF BOTH PROCESSORS FOR THE VECTOR DIMENSION M AND FOR K = 1, N = 1, AND N_p = 1

Vectorial operation                            DSP56000     Specialized    Asymptotic reduction   Reduction of
                                               (cycles)     processor      of number of cycles    number of cycles
                                                            (cycles)       (M → ∞)                (M = 64)
Addition of two vectors                        20 + 6M      2 + M          84%                    84%
Shifting of a vector                           14 + 4M      2 + 2M         50%                    52%
Multiplication of a vector by a scalar         26 + 4M      2 + M          75%                    77%
Multiplication of two vectors                  20 + 2M      4 + M          50%                    54%
Reconstruction without positivity constraint   106 + 16M    10 + 2(M - 3)  88%                    88%
Reconstruction with a positivity constraint    106 + 20M    14 + 4(M - 3)  80%                    81%
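The last column of Table II can be reproduced directly from the cycle-count formulas in the two middle columns. The short Python sketch below does this for $M = 64$; the dictionary layout and the function name are illustrative choices, while the formulas themselves are copied from the table.

```python
# Cycle counts as functions of the vector dimension M, copied from Table II:
# (general-purpose DSP, specialized processor).
CYCLES = {
    "addition of two vectors": (lambda m: 20 + 6 * m, lambda m: 2 + m),
    "shifting of a vector": (lambda m: 14 + 4 * m, lambda m: 2 + 2 * m),
    "vector times scalar": (lambda m: 26 + 4 * m, lambda m: 2 + m),
    "multiplication of two vectors": (lambda m: 20 + 2 * m, lambda m: 4 + m),
    "reconstruction, no constraint": (lambda m: 106 + 16 * m,
                                      lambda m: 10 + 2 * (m - 3)),
    "reconstruction, with constraint": (lambda m: 106 + 20 * m,
                                        lambda m: 14 + 4 * (m - 3)),
}


def reduction_percent(operation, m):
    """Relative saving in cycles of the specialized processor, in percent."""
    dsp, special = CYCLES[operation]
    return 100.0 * (1.0 - special(m) / dsp(m))


reductions_64 = {name: round(reduction_percent(name, 64)) for name in CYCLES}
for name, value in reductions_64.items():
    print(f"{name:32s} {value:3d}%")
```

The asymptotic column follows from the ratio of the coefficients of $M$ (for example, $1 - 4/20 = 80\%$ for reconstruction with the constraint); for the addition of two vectors that ratio gives about 83.3%, one point below the tabulated 84%.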

200 000 transistors [24]. Naturally the word lengths in both devices are not the same, but in a custom design such as ours, one can tailor the word length to the exact requirements of the application to reduce implementation costs, energy consumption and heat dissipation, which is very important in integrated measuring system applications.

The expected computing performance of the designed structure has been estimated in the following way.

The basic functions of the vector computation and the algorithm described by equations (19)-(22) were programmed for two processors: the VERILOG model of the implemented structure and the Motorola general-purpose DSP56001. The performance was assessed using the number of cycles required for the execution of the algorithm. The comparison was based on the assumption that the clock frequency is the same for both processors, that the pipeline of the DSP56001 is full, that the data of the DSP56001 are in the internal memory, and that the access time and computation time of the M/A units are smaller than or equal to one clock cycle. The results of the comparison for $K = 1$, $N = 1$, and $N_p = 1$ are shown in Table II.

These results show the improvement offered by the algorithm-adapted architecture. For the reconstruction without positivity constraints, the gain in the number of cycles is about 88%, while for the reconstruction with the positivity constraints it is about 80%. The difference between the time of reconstruction with and without constraints is due to the execution time of the comparison operation, which cannot be executed in parallel. The reduction of the number of clock cycles by 80% implies an increase of the reconstruction speed of up to 5 times in terms of necessary clock cycles. In the case of the addition of two vectors, the reduction ratio reaches 84%. This comes from the fact that the DSP56001 has only two internal memories, and that the result of the addition must temporarily be placed in a register before being stored in the memory. Thus, this operation is executed in many cycles. The reduction for the multiplication of two vectors is significant because the use of two M/A units allows about 50% reduction of the number of required cycles.

According to the results of the simulation at the schematic level, the proposed structure could operate at 45 MHz, thus providing 45 million instructions per second (MIPS). However, the real maximum clock frequency of the circuit can be estimated only by testing the real physical structure or by detailed layout-based simulations which take into account interconnection delays, the clock skew, and the effects of packaging. The difference between schematic-based performance and real performance could decrease the operation speed by a factor of 2. The speed of the DSP56001, which was rated at 10.25 MIPS at a clock rate of 20 MHz [25], is at least 2 times slower than the speed of the designed structure in terms of the number of instructions per cycle.

VI. CONCLUSIONS

The primary objective of this study was the implementation of a measurand reconstruction algorithm using integration techniques.

The proposed integrated structure for Kalman-filter-based reconstruction is implemented with fewer components, is less costly in silicon area and in energy consumption (and by consequence in heat dissipation) as well as faster than existing general-purpose DSP chips by a comfortable margin from the point of view of its application in integrated measuring systems.

The computing performance of the proposed structure was compared with that of the general-purpose digital signal processor DSP56001 using as a criterion the number of cycles necessary to execute basic operations. In the case of reconstruction with the positivity constraints, our structure is 10 times faster. The factor of 10 results from the multiplication of the clock rate reduction factor of 2 by the cycle reduction factor of 5. This improvement of the speed of algorithm execution by the proposed structure, which is designed to act as a coprocessor for a host processor, is due to the parallelization of the computation algorithm on two multipliers/accumulators, and to the parallelization of the internal data transfers to the multipliers/accumulators. Each operation is executed in one cycle thanks to the use of a wide horizontal microprogrammed control structure.
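The benefit of distributing an inner product over two multipliers/accumulators can be illustrated with a simple cycle-count model. This is a hypothetical sketch, not the microprogram of the proposed structure: it assumes that each unit completes one multiply-accumulate per cycle, that both operand pairs are delivered in parallel, and that merging the two partial sums costs one extra cycle. The function names are illustrative.

```python
from typing import Sequence, Tuple


def dot_single_mac(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Inner product on one multiplier/accumulator: one product per cycle."""
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    acc = 0.0
    cycles = 0
    for x, y in zip(a, b):
        acc += x * y
        cycles += 1
    return acc, cycles


def dot_dual_mac(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Inner product split over two multiplier/accumulator units.

    Even-indexed products go to unit 0 and odd-indexed products to unit 1,
    so two products are accumulated in every modeled cycle.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have equal length")
    acc0 = 0.0
    acc1 = 0.0
    cycles = 0
    for i in range(0, len(a), 2):
        acc0 += a[i] * b[i]
        if i + 1 < len(a):
            acc1 += a[i + 1] * b[i + 1]
        cycles += 1
    # Assumption: one additional cycle to add the two partial sums.
    return acc0 + acc1, cycles + 1


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [1.0] * 8
print(dot_single_mac(a, b))  # (36.0, 8)
print(dot_dual_mac(a, b))    # (36.0, 5)
```

In this model the cycle count approaches one half of the single-unit count as the vectors grow longer, which is in line with the reduction of about 50% attributed to the two M/A units.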

The integrated structure for Kalman-filter-based measurand reconstruction presented in this paper constitutes a step towards the design of specialized autonomous processors for metrological applications. In the development of such processors based on the proposed structure according to specific application requirements, one can consider the trade-off between silicon area, speed, and accuracy of the processor.

The designed specialized integrated structure is basically intended to improve the resolution of spectrometric measurements, but it may be applied to the solving of other problems where similar processing of measurement signals is required.

REFERENCES

[1] P. A. Jansson, Deconvolution with Applications in Spectroscopy. Orlando, FL: Academic, 1984.

[2] R. Z. Morawski, “Unified approach to measurement signal reconstruction,” invited paper, in Conf. Rec. IEEE IMTC/93, Irvine, CA, May 18-20, 1993.

[3] P. B. Crilly, “A quantitative evaluation of various iterative deconvolution algorithms,” IEEE Trans. Instrum. Measurement, vol. 40, no. 3, pp. 558-562, June 1991.

[4] N. D. Crump, “A Kalman filter approach to the deconvolution of seismic signals,” Geophys., vol. 39, pp. 432-444, 1974.

[5] G. Demoment and A. Segalen, “Adele, a fast suboptimal estimator for real-time deconvolution,” Electron. Lett., vol. 19, no. 3, pp. 86-88, 1983.

[6] D. Alba and G. R. Meira, “Inverse optimal filtering method for the instrumental spreading correction in size exclusion chromatography,” J. Liquid Chromatography, vol. 7, no. 14, pp. 2833-2862, 1984.

[7] G. Demoment and R. Reynaud, “Fast minimum variance deconvolution,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, no. 4, pp. 1324-1326, 1985.

[8] G. Demoment, “Deconvolution of signal,” Lab. Signaux et Systèmes (CNRS/ESE) and Coursebook of the École Sup. d’Électricité, No. 3086/85, France, Internal Rep. L2S 20/84, 1986 (in French).

[9] J. K. Tugnait, “Constrained signal restoration via iterated extended Kalman filtering,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-33, no. 2, pp. 472-475, 1985.

[10] D. Massicotte, R. Z. Morawski, and A. Barwicz, “Incorporation of a positivity constraint into a Kalman-filter-based algorithm for correction of spectrometric data,” in Conf. Rec. IEEE IMTC/92, New York, May 1992, pp. 590-593.

[11] D. Massicotte, R. Z. Morawski, and A. Barwicz, “Efficiency of constraining the set of feasible solutions in Kalman-filter-based algorithms of spectrophotometric data correction,” in Conf. Rec. IEEE IMTC/93, Irvine, CA, May 18-20, 1993.

[12] M. Ben Slima, R. Z. Morawski, and A. Barwicz, “A recursive spline-based algorithm for spectrophotometric data correction,” in Conf. Rec. IEEE IMTC/93, Irvine, CA, May 18-20, 1993.

[13] D. Massicotte, M. A. Santerre, Y. Savaria, and A. Barwicz, “Structure of parallel calculus for Kalman-filter into the signal reconstruction,” in Proc. Canadian Conf. Electrical and Computer Engineering, Toronto, Ont., Sept. 13-16, 1992, pp. TM4.16.1-TM4.16.4 (in French).

[14] M. A. Santerre, D. Massicotte, A. Barwicz, and Y. Savaria, “Architecture of specialized processor for Kalman-filter-based signal reconstruction,” in Proc. Canadian Conf. Electrical and Computer Engineering, Toronto, Ont., Sept. 13-16, 1992, pp. MM7.3.1-MM7.3.4 (in French).

[15] B. D. O. Anderson and J. B. Moore, Optimal Filtering. Englewood Cliffs, NJ: Prentice-Hall, 1979.

[16] R. Reynaud, “Kalman filtering for the monodimensional signal deconvolution: Study of the new fast-Kalman-type algorithm, numerical experimentation, implementation in a specialized fixed-point arithmetic processor,” Ph.D. dissertation, Univ. Paris XI, 1986 (in French).

[17] S. Y. Kung and J. N. Hwang, “Systolic array designs for Kalman filtering,” IEEE Trans. Signal Processing, vol. 39, no. 1, pp. 171-182, Jan. 1991.

[18] F. M. F. Gaston and G. W. Irwin, “Systolic approach to square root information Kalman filtering,” Int. J. Contr., vol. 50, no. 1, pp. 225-248, 1989.

[19] P. Rao and M. Bayoumi, “An algorithm specific VLSI parallel architecture for Kalman filter,” in VLSI Signal Processing, IV. New York: IEEE Press, 1991, pp. 264-273.

[20] M. R. Azimi-Sadjadi, T. Lu, and E. M. Nebot, “Parallel and sequential block Kalman filtering and their implementation using systolic arrays,” IEEE Trans. Signal Processing, vol. 39, no. 1, pp. 137-147, Jan. 1991.

[21] R. D. Fellman, R. T. Kaneshiro, and K. Konstantinides, “Design and evaluation of an architecture for a digital signal processor for instrumentation applications,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, no. 3, pp. 537-546, Mar. 1990.

[22] D. White, Bit-Slice Design: Controllers and ALU’s. Garland, 1981.

[23] M. Annaratone, Digital CMOS Circuit Design. Kluwer, 1986, ch. 6.6.

[24] E. F., “The samples of the Motorola DSP are foreseen for the beginning of 87,” Processeur et Systemes, no. 5, pp. 15-17, Sept. 1986 (in French).

[25] DSP56001 Digital Signal Processor User’s Manual, Motorola, Inc., 1990, ch. 1.2.

Andrzej Barwicz (M’83) was born in Poland in 1942. He received the M.S.E.E. and Ph.D. degrees from the Warsaw University of Technology, Poland, in 1965 and 1973, respectively.

From 1965 to 1967, he was with the Group of Computer-Aided Measurements, Institute of Radioelectronics, Warsaw University of Technology. From 1974 to 1979, he was the Head of Postgraduate Study on Computer-Aided Measurement. From 1979 to 1986, he was an Associate Professor at the Institute of Telecommunications of Oran, Oran, Algeria. Since 1987, he has been a Professor with the Department of Engineering at the University of Quebec, Trois-Rivières, P.Q., Canada. From 1990 to 1992 he was the head of the Graduate Program in Industrial Electronics. His interests include electrical measuring systems and digital microelectronics. He is the coauthor of 30 technical papers and the holder of 5 patents.

Dr. Barwicz was the Vice Chairman from 1988 to 1991, and the Chairman in 1991/1992, of the IEEE-St. Maurice Section.

Daniel Massicotte was born in Québec, Canada, in 1964. He received the B.Sc.A. and M.Sc.A. degrees in electrical engineering and industrial electronics in 1987 and 1990, respectively, from the Université du Québec à Trois-Rivières (UQTR), P.Q., Canada.

He is presently studying for the Ph.D. degree in electrical engineering at the École Polytechnique de Montréal (EPM), P.Q., Canada. His field of research is signal processing, microelectronics, and neural networks. From 1987 to 1990 he worked at the acousto-optical/ultrasonics laboratory at the UQTR and in 1990 he joined the microelectronics research team at the EPM and the electrical measuring systems laboratory at the UQTR.

Yvon Savaria (S’78-M’85) received the B.Eng. and M.Sc.A. degrees in electrical engineering from École Polytechnique de Montréal, P.Q., Canada, in 1980 and 1982, respectively, and the Ph.D. degree also in electrical engineering from McGill University, Montréal, P.Q., Canada, in 1985.

Since 1985, he has been with École Polytechnique where he is currently Associate Professor in the Department of Electrical Engineering. He is also a member of the Ordre des Ingénieurs du Québec.

His main research interests are fault tolerant architectures, WSI, testing, VLSI design methodologies, advanced packaging, testing, high speed electronics circuits, and practical applications of these technologies, including integrated measurement systems.

Dr. Savaria has published a book on VLSI design and more than 80 technical papers in these areas. He has served on the program committee of the IEEE workshop on defect and fault tolerance in VLSI, ISCAS ’92, FTCS21, and the Canadian conference on very large scale integration. In 1993, he served as program co-chairman of the IEEE workshop on defect and fault tolerance in VLSI and he is general co-chairman of the 1994 edition of that workshop. He is a member of the Montreal “Groupe de Recherche en Integration et Architecture des Ordinateurs” and he chairs its university industry liaison committee. He is also the director of the microelectronics research group of École Polytechnique de Montréal.


Marc-Alain Santerre was born in Québec, Canada, in 1964. He received the B.Sc.A. degree in microelectronics in 1990 from the Université du Québec à Montréal (UQAM), P.Q., Canada. He is presently studying for the M.Sc.A. degree in industrial electronics at the Université du Québec à Trois-Rivières (UQTR), P.Q., Canada. Since 1993 he has been working as an ASIC designer with ABL Canada. His interests include signal processing and microelectronics.

Roman Z. Morawski was born in 1949 in Poland. He received the M.Sc. degree with honours in electronic instrumentation from Warsaw University of Technology, Poland, in 1972, the Ph.D. degree in computing processes and structures in measurement from Leningrad Institute of Electrical Engineering, USSR, in 1979, and the higher Doctor’s degree (habilitation) from Warsaw University of Technology in 1990.

Since 1972 he has been with the Group on Computer-Aided Measurements, Institute of Radioelectronics, Warsaw University of Technology, currently as an Assistant Professor. His research interests are concentrated on software aspects of measurement. Within the last 10 years he has published more than 40 papers and 2 monographs on mathematical modeling and signal processing in computerized measurements, measuring system design, and teaching metrology.

Dr. Morawski is a member of the Polish Engineering Association for Measurement, Automation and Robotics, and the Committee for Metrology and Instrumentation, Polish Academy of Sciences.