
MATLAB Simulation of Gradient-Based Neural Network for Online Matrix Inversion

Yunong Zhang, Ke Chen, Weimu Ma, and Xiao-Dong Li

Department of Electronics and Communication Engineering, Sun Yat-Sen University, Guangzhou 510275, China

[email protected]

Abstract. This paper investigates the simulation of a gradient-based recurrent neural network for online solution of the matrix-inverse problem. Several important techniques are employed as follows to simulate such a neural system. 1) The Kronecker product of matrices is introduced to transform a matrix-differential-equation (MDE) to a vector-differential-equation (VDE); i.e., finally, a standard ordinary-differential-equation (ODE) is obtained. 2) The MATLAB routine “ode45” is introduced to solve the transformed initial-value ODE problem. 3) In addition to various implementation errors, different kinds of activation functions are simulated to show the characteristics of such a neural network. Simulation results substantiate the theoretical analysis and efficacy of the gradient-based neural network for online constant matrix inversion.

Keywords: Online matrix inversion, Gradient-based neural network, Kronecker product, MATLAB simulation.

1 Introduction

The problem of matrix inversion is considered to be one of the basic problems widely encountered in science and engineering. It is usually an essential part of many solutions; e.g., as preliminary steps for optimization [1], signal-processing [2], electromagnetic systems [3], and robot inverse kinematics [4]. Since the mid-1980's, efforts have been directed towards computational aspects of fast matrix inversion and many algorithms have thus been proposed [5]-[8]. It is known that the minimal arithmetic operations are usually proportional to the cube of the matrix dimension for numerical methods [9], and consequently such algorithms performed on digital computers are not efficient enough for large-scale online applications. In view of this, some O(n²)-operation algorithms were proposed to remedy this computational problem, e.g., in [10][11]. However, they may be still not fast enough; e.g., in [10], it takes on average around one hour to invert a 60000-dimensional matrix. As a result, parallel computational schemes have been investigated for matrix inversion.

The dynamic system approach is one of the important parallel-processing methods for solving matrix-inversion problems [2][12]-[18]. Recently, due to the in-depth research in neural networks, numerous dynamic and analog solvers based on recurrent neural networks (RNNs) have been developed and investigated [2][13]-[18]. The neural dynamic approach is thus regarded as a powerful alternative for online computation because of its parallel distributed nature and convenience of hardware implementation [4][12][15][19][20].

To solve for a matrix inverse, the neural system design is based on the equation AX − I = 0, with A ∈ R^{n×n}. We can define a scalar-valued energy function such as E(t) = ‖AX(t) − I‖²/2. Then, we use the negative of the gradient ∂E/∂X = A^T(AX(t) − I) as the descent direction. As a result, the classic linear model is shown as follows:

Ẋ(t) = −γ ∂E/∂X = −γA^T(AX(t) − I),   X(0) = X0,    (1)

where design parameter γ > 0, being an inductance parameter or the reciprocal of a capacitive parameter, is set as large as the hardware permits, or selected appropriately for experiments.

As proposed in [21], the following general neural model is an extension to the above design approach with a nonlinear activation-function array F:

Ẋ(t) = −γA^T F(AX(t) − I),    (2)

where X(t), starting from an initial condition X(0) = X0 ∈ R^{n×n}, is the activation state matrix corresponding to the theoretical inverse A^{-1} of matrix A. Like in (1), the design parameter γ > 0 is used to scale the convergence rate of the neural network (2), while F(·) : R^{n×n} → R^{n×n} denotes a matrix activation-function mapping of neural networks.
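Before developing the full ode45-based simulation in Section 3, dynamics (2) can already be checked numerically in matrix form with a simple forward-Euler iteration. In the following minimal sketch, the matrix A (taken from the example of Section 4), the gain γ, the step size h, and the linear activation F(E) = E are illustrative assumptions.

% Forward-Euler sketch of matrix-form dynamics (2) with linear activation
A=[1 0 1;1 1 0;1 1 1];        % example nonsingular matrix (see Section 4)
gamma=10; h=1e-3;             % assumed design parameter and Euler step size
X=zeros(3);                   % initial state X(0)=0
for k=1:10000
E=A*X-eye(3);                 % error matrix AX-I
X=X-h*gamma*(A'*E);           % Euler step of Xdot=-gamma*A'*F(AX-I)
end
norm(X-inv(A))                % residual from the theoretical inverse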

2 Main Theoretical Results

In view of equation (2), different choices of F may lead to different performance. In general, any strictly-monotonically-increasing odd activation function f(·), being an element of the matrix mapping F, may be used for the construction of the neural network. In order to demonstrate the main ideas, four types of activation functions are investigated in our simulation:

– linear activation function f(u) = u,
– bipolar sigmoid function f(u) = (1 − exp(−ξu))/(1 + exp(−ξu)) with ξ ≥ 2,
– power activation function f(u) = u^p with odd integer p ≥ 3, and
– the following power-sigmoid activation function

f(u) = { u^p,                                                          if |u| ≥ 1,
       { ((1 + exp(−ξ))/(1 − exp(−ξ))) · (1 − exp(−ξu))/(1 + exp(−ξu)),  otherwise,    (3)

with suitable design parameters ξ ≥ 1 and p ≥ 3.

Other types of activation functions can be generated by these four basic types. Following the analysis results of [18][21], the convergence results of using different activation functions are qualitatively presented as follows.


Proposition 1. [15]-[18][21] For a nonsingular matrix A ∈ R^{n×n}, any strictly-monotonically-increasing odd activation-function array F(·) can be used for constructing the gradient-based neural network (2).

1. If the linear activation function is used, then global exponential convergence is achieved for neural network (2), with convergence rate proportional to the product of γ and the minimum eigenvalue of A^T A.

2. If the bipolar sigmoid activation function is used, then superior convergence can be achieved for the error range [−δ, δ], ∃δ ∈ (0, 1), as compared to the linear-activation-function case. This is because the error signal eij = [AX − I]ij in (2) is amplified by the bipolar sigmoid function for the error range [−δ, δ].

3. If the power activation function is used, then superior convergence can be achieved for the error ranges (−∞, −1] and [1, +∞), as compared to the linear-activation-function case. This is because the error signal eij = [AX − I]ij in (2) is amplified by the power activation function for the error ranges (−∞, −1] and [1, +∞).

4. If the power-sigmoid activation function is used, then superior convergence can be achieved for the whole error range (−∞, +∞), as compared to the linear-activation-function case. This is in view of Properties 2) and 3).

In the analog implementation or simulation of the gradient-based neural networks (1) and (2), we usually assume that it is under ideal conditions. However, there are always some realization errors involved. For example, for the linear activation function, its imprecise implementation may look more like a sigmoid or piecewise-linear function because of the finite gain and frequency dependency of operational amplifiers and multipliers. For these realization errors possibly appearing in the gradient-based neural network (2), we have the following theoretical results.

Proposition 2. [15]-[18][21] Consider the perturbed gradient-based neural model

Ẋ(t) = −γ(A + ΔA)^T F((A + ΔA)X(t) − I),

where the additive term ΔA exists such that ‖ΔA‖ ≤ ε1, ∃ε1 ≥ 0. Then the steady-state residual error lim_{t→∞} ‖X(t) − A^{-1}‖ is uniformly upper bounded by some positive scalar, provided that the resultant matrix A + ΔA is still nonsingular.

For the model-implementation error due to the imprecise implementation of system dynamics, the following dynamics is considered, as compared to the original dynamic equation (2):

Ẋ(t) = −γA^T F(AX(t) − I) + ΔB,    (4)

where the additive term ΔB exists such that ‖ΔB‖ ≤ ε2, ∃ε2 ≥ 0.

Proposition 3. [15]-[18][21] Consider the imprecise implementation (4). The steady-state residual error lim_{t→∞} ‖X(t) − A^{-1}‖ is uniformly upper bounded by some positive scalar, provided that the design parameter γ is large enough (the so-called design-parameter requirement). Moreover, the steady-state residual error lim_{t→∞} ‖X(t) − A^{-1}‖ can be made zero as γ tends to positive infinity.


As additional results to the above propositions, we have the following general observations.

1. For a large entry error (e.g., |eij| > 1 with eij := [AX − I]ij), the power activation function could amplify the error signal (|eij|^p > · · · > |eij|^3 > |eij| > 1), thus able to automatically remove the design-parameter requirement.

2. For a small entry error (e.g., |eij| < 1), the use of sigmoid activation functions has better convergence and robustness than the use of linear activation functions, because of the larger slope of the sigmoid function near the origin.

Thus, using the power-sigmoid activation function in (3) is theoretically a better choice than other activation functions for superior convergence and robustness.

3 Simulation Study

While Section 2 presents the main theoretical results of the gradient-based neural network, this section will investigate the MATLAB simulation techniques in order to show the characteristics of such a neural network.

3.1 Coding of Activation Function

To simulate the gradient-based neural network (2), the activation functions are to be defined first in MATLAB. Inside the body of a user-defined function, the MATLAB routine “nargin” returns the number of input arguments which are used to call the function. By using “nargin”, different kinds of activation functions can be generated, at least with their default input argument(s).

The linear activation-function mapping F(X) = X ∈ R^{n×n} can be generated simply by using the following MATLAB code.

function output=Linear(X)
output=X;

The sigmoid activation-function mapping F(·) with ξ = 4 as its default input value can be generated by using the following MATLAB code.

function output=Sigmoid(X,xi)
if nargin==1, xi=4; end
output=(1-exp(-xi*X))./(1+exp(-xi*X));

The power activation-function mapping F(·) with p = 3 as its default input value can be generated by using the following MATLAB code.

function output=Power(X,p)
if nargin==1, p=3; end
output=X.^p;


The power-sigmoid activation function defined in (3), with ξ = 4 and p = 3 being its default values, can be generated below.

function output=Powersigmoid(X,xi,p)
if nargin==1, xi=4; p=3;
elseif nargin==2, p=3;
end
output=(1+exp(-xi))/(1-exp(-xi))*(1-exp(-xi*X))./(1+exp(-xi*X));
i=find(abs(X)>=1);
output(i)=X(i).^p;
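As a quick visual check of these definitions, the four activation-function mappings can be compared on a scalar input range; the input range below is an illustrative choice, and the default parameters ξ = 4 and p = 3 are used.

u=linspace(-2,2,401);   % illustrative scalar input range
plot(u,Linear(u),u,Sigmoid(u),u,Power(u),u,Powersigmoid(u));
legend('linear','sigmoid','power','power-sigmoid');
xlabel('u'); ylabel('f(u)'); grid on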

3.2 Kronecker Product and Vectorization

The dynamic equations of the gradient-based neural networks (2) and (4) are all described in matrix form, which cannot be simulated directly. To simulate such neural systems, the Kronecker product of matrices and the vectorization technique are introduced in order to transform the matrix-form differential equations to vector-form differential equations.

– In the general case, given matrices A = [aij] ∈ R^{m×n} and B = [bij] ∈ R^{p×q}, the Kronecker product of A and B is denoted by A ⊗ B and is defined to be the following block matrix

           ⎡ a11B  · · ·  a1nB ⎤
A ⊗ B :=   ⎢   ⋮     ⋱     ⋮   ⎥  ∈ R^{mp×nq}.
           ⎣ am1B  · · ·  amnB ⎦

It is also known as the direct product or tensor product. Note that in general A ⊗ B ≠ B ⊗ A. Specifically, for our case, I ⊗ A = diag(A, . . . , A).

– In the general case, given X = [xij] ∈ R^{m×n}, we can vectorize X as a vector, i.e., vec(X) ∈ R^{mn×1}, which is defined as

vec(X) := [x11, . . . , xm1, x12, . . . , xm2, . . . , x1n, . . . , xmn]^T.

As stated in [22], in the general case, let X be unknown; given A ∈ R^{m×n} and B ∈ R^{p×q}, the matrix equation AX = B is equivalent to the vector equation

(I ⊗ A) vec(X) = vec(B).

Based on the above Kronecker product and vectorization technique, for simulation purposes, the matrix differential equation (2) can be transformed to a vector differential equation. We thus obtain the following theorem.

Theorem 1. The matrix-form differential equation (2) can be reformulated as the following vector-form differential equation:

vec(Ẋ) = −γ(I ⊗ A^T) F((I ⊗ A) vec(X) − vec(I)),    (5)

where the activation-function mapping F(·) in (5) is defined the same as in (2), except that its dimensions are changed hereafter as F(·) : R^{n²×1} → R^{n²×1}.


Proof. For readers' convenience, we repeat the matrix-form differential equation (2) here as Ẋ(t) = −γA^T F(AX(t) − I).

By vectorizing equation (2) based on the Kronecker product and the above vec(·) operator, the left-hand side of (2) is vec(Ẋ), and the right-hand side of equation (2) is

vec(−γA^T F(AX(t) − I)) = −γ vec(A^T F(AX(t) − I))
                        = −γ(I ⊗ A^T) vec(F(AX(t) − I)).    (6)

Note that, as shown in Subsection 3.1, the definition and coding of the activation-function mapping F(·) are very flexible and could be a vectorized mapping from R^{n²×1} to R^{n²×1}. We thus have

vec(F(AX(t) − I)) = F(vec(AX(t) − I)) = F(vec(AX) + vec(−I))
                  = F((I ⊗ A) vec(X) − vec(I)).    (7)

Combining equations (6) and (7) yields the vectorization of the right-hand side of the matrix-form differential equation (2):

vec(−γA^T F(AX(t) − I)) = −γ(I ⊗ A^T) F((I ⊗ A) vec(X) − vec(I)).

Clearly, the vectorization of both sides of the matrix-form differential equation (2) should be equal, which generates the vector-form differential equation (5). The proof is thus complete.

Remark 1. The Kronecker product can be generated easily by using the MATLAB routine “kron”; e.g., A ⊗ B can be generated by the MATLAB command kron(A,B). To generate vec(X), we can use the MATLAB routine “reshape”. That is, if the matrix X has m rows and n columns, then the MATLAB command for vectorizing X is reshape(X,m*n,1), which generates the column vector vec(X) = [x11, . . . , xm1, x12, . . . , xm2, . . . , x1n, . . . , xmn]^T.
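For instance, the identity vec(AX) = (I ⊗ A) vec(X) underlying the above vectorization can be checked numerically with “kron” and “reshape”; the size n = 3 and the random test matrices below are illustrative choices.

n=3;                                   % illustrative size
A=rand(n); X=rand(n);                  % random test matrices
lhs=reshape(A*X,n^2,1);                % vec(AX)
rhs=kron(eye(n),A)*reshape(X,n^2,1);   % (I kron A)*vec(X)
norm(lhs-rhs)                          % numerically zero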

Based on the MATLAB routines “kron” and “reshape”, the following code defines a function that returns the evaluation of the right-hand side of the matrix-form gradient-based neural network (2); in other words, it also returns the evaluation of the right-hand side of the vector-form gradient-based neural network (5). Note that I ⊗ A^T = (I ⊗ A)^T.

function output=GnnRightHandSide(t,x,gamma)
if nargin==2, gamma=1; end
A=MatrixA; n=size(A,1); IA=kron(eye(n),A);
% The following generates the vectorization of identity matrix I
vecI=reshape(eye(n),n^2,1);
% The following calculates the right hand side of equations (2) and (5)
output=-gamma*IA'*Powersigmoid(IA*x-vecI);


Note that we can change “Powersigmoid” in the above MATLAB code to “Sigmoid” (or “Linear”) for using different activation functions.
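As a usage sketch, the function “GnnRightHandSide” can be passed directly to “ode45”, and the final state can be reshaped back into matrix form; the zero initial state, the time span [0 10], and γ = 10 below are illustrative choices, and the function “MatrixA” of Section 4 is assumed to be on the MATLAB path.

n=size(MatrixA,1);
x0=zeros(n^2,1);                                  % initial state vec(X(0))
[t,x]=ode45(@GnnRightHandSide,[0 10],x0,[],10);   % gamma=10
Xfinal=reshape(x(end,:),n,n)                      % approximates inv(A)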

4 Illustrative Example

For illustration, let us consider the following constant matrix:

     ⎡1 0 1⎤          ⎡1 1 1⎤             ⎡ 1  1 −1⎤
A =  ⎢1 1 0⎥ ,  A^T = ⎢0 1 1⎥ ,  A^{-1} = ⎢−1  0  1⎥ .
     ⎣1 1 1⎦          ⎣1 0 1⎦             ⎣ 0 −1  1⎦

For example, matrix A can be given in the following MATLAB code.

function A=MatrixA(t)
A=[1 0 1;1 1 0;1 1 1];

The gradient-based neural network (2) is thus in the following specific form:

⎡ẋ11 ẋ12 ẋ13⎤        ⎡1 1 1⎤   ⎛⎡1 0 1⎤ ⎡x11 x12 x13⎤   ⎡1 0 0⎤⎞
⎢ẋ21 ẋ22 ẋ23⎥ = −γ  ⎢0 1 1⎥ F ⎜⎢1 1 0⎥ ⎢x21 x22 x23⎥ − ⎢0 1 0⎥⎟ .
⎣ẋ31 ẋ32 ẋ33⎦        ⎣1 0 1⎦   ⎝⎣1 1 1⎦ ⎣x31 x32 x33⎦   ⎣0 0 1⎦⎠

4.1 Simulation of Convergence

To simulate the gradient-based neural network (2) starting from eight random initial states, we first define a function “GnnConvergence” as follows.

function GnnConvergence(gamma)
tspan=[0 10]; n=size(MatrixA,1);
for i=1:8
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));   % random initial state in [-2,2]
[t,x]=ode45(@GnnRightHandSide,tspan,x0,[],gamma);
for j=1:n^2
% map the j-th entry of vec(X) (column-major) to its subplot position (row-major)
k=mod(n*(j-1)+1,n^2)+floor((j-1)/n);
subplot(n,n,k); plot(t,x(:,j)); hold on
end
end

To show the convergence of the gradient-based neural model (2), using the power-sigmoid activation function with ξ = 4 and p = 3 and using the design parameter γ := 1, the MATLAB command is GnnConvergence(1), which generates Fig. 1(a). Similarly, the MATLAB command GnnConvergence(10) can generate Fig. 1(b).

To monitor the network convergence, we can also use and show the norm of the computational error, ‖X(t) − A^{-1}‖. The MATLAB codes are given below, i.e., the user-defined functions “NormError” and “GnnNormError”. By calling “GnnNormError” three times with different γ values, we can generate Fig. 2. It shows that, starting from any initial state randomly selected in [−2, 2], the state matrices of the presented neural network (2) all converge to the theoretical inverse A^{-1}, where the computational errors ‖X(t) − A^{-1}‖ all converge to zero. Such a convergence can be expedited by increasing γ. For example, if γ is increased to 10^3, the convergence time is within 30 milliseconds; and, if γ is increased to 10^6, the convergence time is within 30 microseconds.

Fig. 1. Online matrix inversion by gradient-based neural network (2), showing the state entries x11 through x33 versus time: (a) γ = 1; (b) γ = 10.

function NormError(x0,gamma)
tspan=[0 10]; options=odeset();
[t,x]=ode45(@GnnRightHandSide,tspan,x0,options,gamma);
Ainv=inv(MatrixA);
B=reshape(Ainv,size(Ainv,1)^2,1);   % vec of the theoretical inverse
total=length(t); x=x';
for i=1:total, nerr(i)=norm(x(:,i)-B); end
plot(t,nerr); hold on

function GnnNormError(gamma)
if nargin<1, gamma=1; end
total=8; n=size(MatrixA,1);
for i=1:total
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));   % random initial state in [-2,2]
NormError(x0,gamma);
end
text(2.4,2.2,['gamma=' int2str(gamma)]);

4.2 Simulation of Robustness

Similar to the transformation of the matrix-form differential equation (2) to the vector-form differential equation (5), the perturbed gradient-based neural network (4) can be vectorized as follows:

vec(Ẋ) = −γ(I ⊗ A^T) F((I ⊗ A) vec(X) − vec(I)) + vec(ΔB).    (8)


Fig. 2. Convergence of ‖X(t) − A^{-1}‖_F using the power-sigmoid activation function (shown for γ = 1, γ = 10, and γ = 100).

To show the robustness characteristics of gradient-based neural networks, the following model-implementation error is added in a sinusoidal form (with ε2 = 0.5):

        ⎡cos(3t)  −sin(3t)    0    ⎤
ΔB = ε2 ⎢   0      sin(3t)  cos(3t)⎥ .
        ⎣   0        0      sin(2t)⎦

The following MATLAB code is used to define the function “GnnRightHandSideImprecise” for ODE solvers, which returns the evaluation of the right-hand side of the perturbed gradient-based neural network (4); in other words, the right-hand side of the vector-form differential equation (8).

function output=GnnRightHandSideImprecise(t,x,gamma)
if nargin==2, gamma=1; end
e2=0.5;
deltaB=e2*[cos(3*t) -sin(3*t) 0; 0 sin(3*t) cos(3*t); 0 0 sin(2*t)];
vecB=reshape(deltaB,9,1);   % vec(DeltaB)
vecI=reshape(eye(3),9,1);   % vec(I)
IA=kron(eye(3),MatrixA);
output=-gamma*IA'*Powersigmoid(IA*x-vecI)+vecB;

To use the sigmoid (or linear) activation function, we only need to change “Powersigmoid” to “Sigmoid” (or “Linear”) in the above MATLAB code. Based on the above function “GnnRightHandSideImprecise” and the function below (i.e., “GnnRobust”), the MATLAB commands GnnRobust(1) and GnnRobust(100) can generate Fig. 3.

function GnnRobust(gamma)
tspan=[0 10]; options=odeset(); n=size(MatrixA,1);
for i=1:8
x0=4*(rand(n^2,1)-0.5*ones(n^2,1));
[t,x]=ode45(@GnnRightHandSideImprecise,tspan,x0,options,gamma);
for j=1:n^2
k=mod(n*(j-1)+1,n^2)+floor((j-1)/n);
subplot(n,n,k); plot(t,x(:,j)); hold on
end
end


Fig. 3. Online matrix inversion by GNN (4) with large implementation errors, showing the state entries x11 through x33 versus time: (a) γ = 1; (b) γ = 100.

Fig. 4. Convergence of computational error ‖X(t) − A^{-1}‖ by perturbed GNN (4) (shown for γ = 1, γ = 10, and γ = 100).

Similarly, we can show the computational error ‖X(t) − A^{-1}‖ of the gradient-based neural network (4) with large model-implementation errors. To do so, in the previously defined MATLAB function “NormError”, we only need to change “GnnRightHandSide” to “GnnRightHandSideImprecise”. See Fig. 4. Even with imprecise implementation, the perturbed neural network still works well, and its computational error ‖X(t) − A^{-1}‖ is still bounded and very small. Moreover, as the design parameter γ increases from 1 to 100, the convergence is expedited and the steady-state computational error is decreased. It is worth mentioning again that using power-sigmoid or sigmoid activation functions yields a smaller steady-state residual error than using linear or power activation functions. It is observed from other simulation data that, when using power-sigmoid activation functions, the maximum steady-state residual error is only 2 × 10^{-2} and 2 × 10^{-3} for γ = 100 and γ = 1000, respectively. Clearly, compared to the case of using linear or pure power activation functions, superior performance can be achieved by using power-sigmoid or sigmoid activation functions under the same design specification. These simulation results have substantiated the theoretical results presented in previous sections and in [21].


5 Conclusions

The gradient-based neural networks (1) and (2) have provided an effective online-computing approach for matrix inversion. By considering different types of activation functions and implementation errors, such recurrent neural networks have been simulated in this paper. Several important simulation techniques have been introduced, i.e., coding of activation-function mappings, the Kronecker product of matrices, and the MATLAB routine “ode45”. Simulation results have also demonstrated the effectiveness and efficiency of gradient-based neural networks for online matrix inversion. In addition, the characteristics of such a negative-gradient design method of recurrent neural networks could be summarized as follows.

– From the viewpoint of system stability, any monotonically-increasing activation function f(·) with f(0) = 0 could be used for the construction of recurrent neural networks. But, for the solution effectiveness and design simplicity, the strictly-monotonically-increasing odd activation function f(·) is preferred for the construction of recurrent neural networks.

– The gradient-based neural networks are intrinsically designed for solving time-invariant matrix-inverse problems, but they could also be used to solve time-varying matrix-inverse problems in an approximate way (a minimal sketch is given after this list). Note that, in this case, the design parameter γ is required to be large enough.

– Compared to other methods, the gradient-based neural networks have a simpler structure for simulation and hardware implementation. As parallel-processing systems, such neural networks could solve the matrix-inverse problem more efficiently than serial-processing methods.
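For the time-varying case mentioned in the second item above, the following is a minimal sketch only: the time-varying matrix A(t), the large default design parameter γ = 10^4, and the function name are illustrative assumptions; the code mirrors “GnnRightHandSide” but evaluates A at the current time t.

function output=GnnRhsTimeVarying(t,x,gamma)
% Illustrative sketch: approximate tracking of a time-varying inverse
if nargin==2, gamma=1e4; end            % assumed large design parameter
A=[1+0.1*sin(t) 0 1;1 1 0;1 1 1];       % assumed time-varying, nonsingular A(t)
n=size(A,1); IA=kron(eye(n),A);
vecI=reshape(eye(n),n^2,1);
output=-gamma*IA'*Powersigmoid(IA*x-vecI);

It can then be integrated with “ode45” in the same way as “GnnRightHandSide” above; the state X(t) tracks A^{-1}(t) only approximately, with the tracking error decreasing as γ increases.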

Acknowledgements. This work is funded by the National Science Foundation of China under Grant 60643004 and by the Science and Technology Office of Sun Yat-Sen University. Before joining Sun Yat-Sen University in 2006, the corresponding author, Yunong Zhang, had been with the National University of Ireland, the University of Strathclyde, the National University of Singapore, and the Chinese University of Hong Kong since 1999. He has continued this line of research, supported by various research fellowships/assistantships. His web page is available at http://www.ee.sysu.edu.cn/teacher/detail.asp?sn=129.

References

1. Zhang, Y.: Towards Piecewise-Linear Primal Neural Networks for Optimization and Redundant Robotics. Proceedings of IEEE International Conference on Networking, Sensing and Control (2006) 374-379
2. Steriti, R.J., Fiddy, M.A.: Regularized Image Reconstruction Using SVD and a Neural Network Method for Matrix Inversion. IEEE Transactions on Signal Processing, Vol. 41 (1993) 3074-3077
3. Sarkar, T., Siarkiewicz, K., Stratton, R.: Survey of Numerical Methods for Solution of Large Systems of Linear Equations for Electromagnetic Field Problems. IEEE Transactions on Antennas and Propagation, Vol. 29 (1981) 847-856


4. Sturges Jr, R.H.: Analog Matrix Inversion (Robot Kinematics). IEEE Journal of Robotics and Automation, Vol. 4 (1988) 157-162
5. Yeung, K.S., Kumbi, F.: Symbolic Matrix Inversion with Application to Electronic Circuits. IEEE Transactions on Circuits and Systems, Vol. 35 (1988) 235-238
6. El-Amawy, A.: A Systolic Architecture for Fast Dense Matrix Inversion. IEEE Transactions on Computers, Vol. 38 (1989) 449-455
7. Neagoe, V.E.: Inversion of the Van Der Monde Matrix. IEEE Signal Processing Letters, Vol. 3 (1996) 119-120
8. Wang, Y.Q., Gooi, H.B.: New Ordering Methods for Sparse Matrix Inversion via Diagonalization. IEEE Transactions on Power Systems, Vol. 12 (1997) 1298-1305
9. Koc, C.K., Chen, G.: Inversion of All Principal Submatrices of a Matrix. IEEE Transactions on Aerospace and Electronic Systems, Vol. 30 (1994) 280-281
10. Zhang, Y., Leithead, W.E., Leith, D.J.: Time-Series Gaussian Process Regression Based on Toeplitz Computation of O(N²) Operations and O(N)-Level Storage. Proceedings of the 44th IEEE Conference on Decision and Control (2005) 3711-3716
11. Leithead, W.E., Zhang, Y.: O(N²)-Operation Approximation of Covariance Matrix Inverse in Gaussian Process Regression Based on Quasi-Newton BFGS Methods. Communications in Statistics - Simulation and Computation, Vol. 36 (2007) 367-380
12. Manherz, R.K., Jordan, B.W., Hakimi, S.L.: Analog Methods for Computation of the Generalized Inverse. IEEE Transactions on Automatic Control, Vol. 13 (1968) 582-585
13. Jang, J., Lee, S., Shin, S.: An Optimization Network for Matrix Inversion. Neural Information Processing Systems, American Institute of Physics, NY (1988) 397-401
14. Wang, J.: A Recurrent Neural Network for Real-Time Matrix Inversion. Applied Mathematics and Computation, Vol. 55 (1993) 89-100
15. Zhang, Y.: Revisit the Analog Computer and Gradient-Based Neural System for Matrix Inversion. Proceedings of IEEE International Symposium on Intelligent Control (2005) 1411-1416
16. Zhang, Y., Jiang, D., Wang, J.: A Recurrent Neural Network for Solving Sylvester Equation with Time-Varying Coefficients. IEEE Transactions on Neural Networks, Vol. 13 (2002) 1053-1063
17. Zhang, Y., Ge, S.S.: A General Recurrent Neural Network Model for Time-Varying Matrix Inversion. Proceedings of the 42nd IEEE Conference on Decision and Control (2003) 6169-6174
18. Zhang, Y., Ge, S.S.: Design and Analysis of a General Recurrent Neural Network Model for Time-Varying Matrix Inversion. IEEE Transactions on Neural Networks, Vol. 16 (2005) 1477-1490
19. Carneiro, N.C.F., Caloba, L.P.: A New Algorithm for Analog Matrix Inversion. Proceedings of the 38th Midwest Symposium on Circuits and Systems, Vol. 1 (1995) 401-404
20. Mead, C.: Analog VLSI and Neural Systems. Addison-Wesley, Reading, MA (1989)
21. Zhang, Y., Li, Z., Fan, Z., Wang, G.: Matrix-Inverse Primal Neural Network with Application to Robotics. Dynamics of Continuous, Discrete and Impulsive Systems, Series B, Vol. 14 (2007) 400-407
22. Horn, R.A., Johnson, C.R.: Topics in Matrix Analysis. Cambridge University Press, Cambridge (1991)