
IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 3, NO. 1, JANUARY 1992

    Dynamical Analysis of the Brain-State-in-a-Box (BSB) Neural Models

    Stefen Hui and Stanislaw H. Zak, Member, IEEE

Abstract—In this paper we perform a stability analysis of the brain-state-in-a-box (BSB) neural models with weight matrices that need not be symmetric. The implementation of associative memories using the analyzed class of neural models is also addressed. In particular, we modify the BSB model so that we can better control the extent of the domains of attraction of stored patterns. Generalizations of the results obtained for the BSB models to a class of cellular neural networks are also discussed.

I. INTRODUCTION

The brain-state-in-a-box (BSB) neural model was conceived by Anderson and coworkers in 1977 [1]. It is often represented in discrete time by the following difference equation:

$$x(k+1) = g(x(k) + \alpha W x(k)) \qquad (1)$$

where x(k) is the state vector at time k, α > 0 is a step size, W ∈ R^{n×n} is a symmetric weight matrix, and g is a saturating linear function defined as follows: Let

$$y = [y_1, \ldots, y_n]^T;$$

then the ith component of g(y) is

$$(g(y))_i = \begin{cases} 1 & \text{if } y_i \ge 1 \\ y_i & \text{if } -1 < y_i < 1 \\ -1 & \text{if } y_i \le -1. \end{cases}$$

Note that the function g(·) is responsible for the name given to (1), for the state vector x(k) lives in the "box" H_n ≜ [−1, 1]^n, which is the closed n-dimensional unit hypercube.
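As an illustration of the definitions above, a minimal Python sketch of the saturation function g and of one iteration of model (1) might read as follows (the function names are ours, not the authors'):

```python
import numpy as np

def g(y):
    # Saturating linear activation: clip every component of y to the box [-1, 1].
    return np.clip(y, -1.0, 1.0)

def bsb_step(x, W, alpha):
    # One iteration of the BSB model (1): x(k+1) = g(x(k) + alpha * W x(k)).
    return g(x + alpha * (W @ x))
```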

The BSB model has been investigated by many researchers, among them Anderson et al. [1], Greenberg [3], Grossberg [4], Michel and Farrell [9], and Hui and Zak [6]. Anderson et al. [1] used the BSB model as a "categorical perceiver." This model, as was pointed out by Rumelhart et al. [10], is a close relative of the linear associator. From the dynamics point of view, the BSB model can be viewed as a discrete linear system in a saturated mode. The main difference between the BSB model and a usual discrete system is that the linear system is defined on R^n, while the BSB model is defined on the closed n-dimensional hypercube. The above point of view was utilized by Li et al. [7] in the analysis and synthesis of an analog counterpart of the BSB model. Li et al. [7] referred to this analog counterpart as the linear system in a saturated mode (LSSM).

Manuscript received January 15, 1991; revised June 17, 1991.
S. Hui is with the Department of Mathematical Sciences, San Diego State University, San Diego, CA 92182.
S. H. Zak is with the School of Electrical Engineering, Purdue University, West Lafayette, IN 47907.
IEEE Log Number 9103356.

The BSB model can be used in the implementation of associative memories, where each stored prototype pattern is an asymptotically stable equilibrium point of (1). Hopfield [5] and Michel and Farrell [9] discuss the applicability of artificial neural networks to associative memories and present techniques that have been used in the design of neural networks for this purpose. The difficulties involved in the implementation of associative memories using neural networks, as pointed out by Michel and Farrell [9], include:

    i) storing each prototype pattern as an asymptotically stable equilibrium point of the network;

    ii) controlling the extent of the basins of attraction of each stored pattern; and

iii) minimizing the number of asymptotically stable equilibria in the network that are not used for storing patterns.

These issues are addressed in this paper. Our goal is to investigate conditions on the weight matrix W guaranteeing that the extreme points are stable equilibria and that no other equilibrium point in H_n is stable. In Section III we will devise a method which allows one to determine all the equilibrium points of the network. The analysis will show that a generic trajectory starting in the open hypercube H_n^o will hit a "face" and slide on that surface to an "edge," and so on, until an extreme point is reached (see Fig. 1). The remaining degrees of freedom in the elements of the weight matrix W can be used to control the extent of the basins of attraction of each prototype pattern. Our approach does not require the symmetry of the weight matrix W; we perform our analysis of the BSB neural model with a nonsymmetric interconnection matrix W. Furthermore, we modify the BSB model (1) by adding to its linear part a component αb (see eq. (2)), which allows one to better control the extent of the domains of attraction of stored patterns. In Section IV we prove rigorously, under some mild assumptions on the weight matrix, that the set of initial states whose trajectories do not reach an extreme point of H_n in a finite number of steps has n-dimensional Lebesgue measure 0. In Section V we show that the results we obtained for the BSB neural models can be generalized to a class of cellular neural networks.

    To proceed further we need some technical results which are presented in the following section.

II. SOME DEFINITIONS AND BACKGROUND RESULTS

    In the following two sections we will be analyzing a more general version of the BSB model than that considered by



Fig. 1. A 3-D illustration of the evolution of a typical trajectory of the analyzed neural network model.

Anderson et al. in [1]. The model we will be dealing with has the form

$$x(k+1) = g((I_n + \alpha W)x(k) + \alpha b) \qquad (2)$$

where b ∈ R^n, and the weight matrix W need not be symmetric.

The two main reasons we have modified the BSB model (1) by introducing in it the vector αb are:

i) the presence of αb in (2) will allow us to better control the extent of the basins of attraction of the prototype patterns, as will become evident in Section III;

ii) the analysis of the BSB model behavior on the boundary regions of H_n can be reduced to the study of models of type (2) rather than (1).
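A corresponding sketch of the modified model (2), including the extra term αb, is given below; the fixed-point stopping rule and the step budget are our own conveniences for experimentation, not part of the model:

```python
import numpy as np

def modified_bsb_step(x, W, b, alpha):
    # One iteration of model (2): x(k+1) = g((I + alpha*W) x(k) + alpha*b),
    # where g clips componentwise to [-1, 1].
    return np.clip(x + alpha * (W @ x + b), -1.0, 1.0)

def run(x0, W, b, alpha, max_steps=1000):
    # Iterate (2) from x0 until a fixed point is reached or the step budget runs out.
    x = np.asarray(x0, dtype=float)
    for _ in range(max_steps):
        x_next = modified_bsb_step(x, W, b, alpha)
        if np.array_equal(x_next, x):
            break
        x = x_next
    return x
```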

    The discussion of the stability or instability of a point (stored pattern) can proceed only if the point under consideration is an equilibrium point (state). Only after that can one try to determine its stability properties.

    To define an equilibrium point we need the following notation:

$$T(x) \triangleq g((I_n + \alpha W)x + \alpha b).$$

Definition 1: A point x ∈ H_n = [−1, 1]^n is an equilibrium point of (2) if T(x) = x.

Definition 2: A matrix W ∈ R^{n×n} is row diagonal dominant if

$$w_{ii} \ge \sum_{\substack{j=1 \\ j \ne i}}^{n} |w_{ij}|, \qquad i = 1, \ldots, n.$$

The matrix W is said to be strongly row diagonal dominant if

$$w_{ii} > \sum_{\substack{j=1 \\ j \ne i}}^{n} |w_{ij}|, \qquad i = 1, \ldots, n.$$
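A direct numerical check of Definition 2 could look like this (a sketch; the function name is ours):

```python
import numpy as np

def is_row_diagonal_dominant(W, strong=False):
    # Definition 2: w_ii >= sum_{j != i} |w_ij| for every row i (strict '>' if strong).
    W = np.asarray(W, dtype=float)
    diag = np.diag(W)
    off_diag_sums = np.sum(np.abs(W), axis=1) - np.abs(diag)
    return bool(np.all(diag > off_diag_sums)) if strong else bool(np.all(diag >= off_diag_sums))
```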

    With the above definitions in hand we can prove the following result concerning the BSB neural model with a nonsymmetric interconnection matrix.

Theorem 1: Consider (2) where b = 0. Then the extreme points of the hypercube H_n are equilibrium points if and only if W is row diagonal dominant.

Proof: A proof of the sufficiency part (⇐) of this theorem can be found in Greenberg [3, p. 323]. The necessity part (⇒) of the theorem will be proved by contraposition. Thus assume that, for some i,

$$w_{ii} < \sum_{\substack{j=1 \\ j \ne i}}^{n} |w_{ij}|.$$

Define e = [e_1, ..., e_n]^T as follows:

$$e_j = \begin{cases} 1 & \text{if } j = i \text{ or } w_{ij} = 0 \\ -\operatorname{sgn}(w_{ij}) & \text{if } j \ne i \text{ and } w_{ij} \ne 0, \end{cases}$$

where sgn(·) denotes the signum function. Note that, so defined, e is an extreme point of the hypercube, and the ith entry of (I_n + αW)e is

$$((I_n + \alpha W)e)_i = (1 + \alpha w_{ii}) - \alpha \sum_{\substack{j=1 \\ j \ne i}}^{n} |w_{ij}| = 1 + \alpha\Big(w_{ii} - \sum_{\substack{j=1 \\ j \ne i}}^{n} |w_{ij}|\Big) < 1.$$

Since α > 0 and w_ii − Σ_{j≠i} |w_ij| < 0, this implies that the extreme point e is not an equilibrium point since T(e) ≠ e. □

In further development we need the following four technical lemmas.

Lemma 1 (Levy–Desplanques theorem): A complex n × n matrix W such that

$$|w_{ii}| > \sum_{\substack{j=1 \\ j \ne i}}^{n} |w_{ij}|, \qquad i = 1, \ldots, n,$$

is nonsingular, that is, det W ≠ 0. (Thus, a strongly row diagonal dominant matrix is invertible.)

Proof: See, e.g., Marcus and Minc [8, p. 146]. □

Lemma 2 (Gerschgorin's theorem): Given any n × n matrix W with complex entries, denote by D_i (i = 1, ..., n) the closed disk in the complex plane with center w_ii and radius

$$r_i = \sum_{\substack{j=1 \\ j \ne i}}^{n} |w_{ij}|.$$

Then the eigenvalues of W are located in the closed disks D_i, i = 1, ..., n.

Proof: This is a corollary of Lemma 1. See [8, p. 146] for more details. □

Lemma 3: Given W ∈ R^{n×n}, if W is strongly row diagonal dominant, then all its eigenvalues are located in the open right-half of the complex plane.
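The following sketch illustrates Lemmas 2 and 3 numerically (the helpers are ours; for a strongly row diagonal dominant real W, the second check should return True):

```python
import numpy as np

def gerschgorin_disks(W):
    # Lemma 2: each eigenvalue lies in some disk D_i with center w_ii and
    # radius r_i = sum_{j != i} |w_ij|.
    W = np.asarray(W, dtype=float)
    centers = np.diag(W)
    radii = np.sum(np.abs(W), axis=1) - np.abs(centers)
    return centers, radii

def eigenvalues_in_open_right_half_plane(W):
    # Lemma 3: for a strongly row diagonal dominant real W, every eigenvalue has Re > 0.
    return bool(np.all(np.linalg.eigvals(np.asarray(W, dtype=float)).real > 0))
```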


With the above background results at our disposal we can now analyze the stability of the neural network modeled by (2).

III. STABILITY ANALYSIS OF NEURAL MODEL (2)

In this section we analyze the neural network modeled by (2). Recall that the dynamics of this net is governed by

$$x(k+1) = g((I_n + \alpha W)x(k) + \alpha b),$$

where we assume that the components of b ∈ R^n satisfy the following assumption:

$$|b_i| < w_{ii} - \sum_{\substack{j=1 \\ j \ne i}}^{n} |w_{ij}|, \qquad i = 1, \ldots, n. \qquad (3)$$

We use the following notation: H_n^o = (−1, 1)^n is the open unit n-dimensional hypercube, and L(x) ≜ (I_n + αW)x + αb, so that T(x) = g(L(x)).

Lemma 4: Suppose that W and b satisfy (3). Then −W^{-1}b ∈ H_n^o.

Proof: It follows from Lemma 3 that W is invertible. From the assumption on b, it follows that the image of each extreme point e = [e_1, ..., e_n]^T, where |e_i| = 1 for i = 1, ..., n, under W is in the same quadrant. That is, if y = We, then sgn(y_i) = e_i for i = 1, ..., n. We also have |y_i| > |b_i|, i = 1, ..., n, from the assumption on b. Since H_n^o = (−1, 1)^n is the open convex hull spanned by the extreme points e^(1), ..., e^(2^n), the image W((−1, 1)^n) is the open convex hull spanned by We^(1), ..., We^(2^n) (see Fig. 2). From our previous observations,

$$\prod_{i=1}^{n} [-|b_i|, |b_i|] \subset W((-1, 1)^n).$$

Thus −W^{-1}b ∈ H_n^o. □

Theorem 2: The unique equilibrium point of (2) in the open hypercube H_n^o is −W^{-1}b. Furthermore, if ||(I_n + αW)x|| > ||x|| for all x ≠ 0, then for every y ∈ H_n^o with y ≠ −W^{-1}b and T(y) ∈ H_n^o,

$$\|T(y) + W^{-1}b\| > \|y + W^{-1}b\|.$$

Proof: A simple computation shows

$$L(-W^{-1}b) = (I_n + \alpha W)(-W^{-1}b) + \alpha b = -W^{-1}b.$$

Since −W^{-1}b ∈ H_n^o, we have

$$T(-W^{-1}b) = -W^{-1}b.$$

Thus −W^{-1}b is an equilibrium point of (2). Since W is invertible, the uniqueness of the equilibrium point is clear.

Let y ∈ H_n^o with y ≠ −W^{-1}b, and suppose T(y) ∈ H_n^o. Then T(y) = L(y), and we can write

$$y = \theta - W^{-1}b \quad \text{for some } \theta \in \mathbb{R}^n, \ \theta \ne 0.$$

Then

$$T(y) = L(y) = (I_n + \alpha W)(\theta - W^{-1}b) + \alpha b = (I_n + \alpha W)\theta - W^{-1}b.$$

Thus

$$\|T(y) + W^{-1}b\| = \|(I_n + \alpha W)\theta\| > \|\theta\| = \|y + W^{-1}b\|. \qquad \square$$
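The interior equilibrium of Lemma 4 and Theorem 2 can be computed directly; the sketch below first checks assumption (3) and then solves for −W^{-1}b (the names and the error handling are ours):

```python
import numpy as np

def interior_equilibrium(W, b):
    # Verify assumption (3): |b_i| < w_ii - sum_{j != i} |w_ij| for every i,
    # then return the unique equilibrium -W^{-1} b of (2) in the open hypercube.
    W = np.asarray(W, dtype=float)
    b = np.asarray(b, dtype=float)
    margins = np.diag(W) - (np.sum(np.abs(W), axis=1) - np.abs(np.diag(W)))
    if not np.all(np.abs(b) < margins):
        raise ValueError("assumption (3) is violated")
    x_eq = -np.linalg.solve(W, b)
    assert np.all(np.abs(x_eq) < 1.0)   # Lemma 4: -W^{-1} b lies in (-1, 1)^n
    return x_eq
```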


Remark 2: A simple consequence of the above theorem is that the only equilibrium point of the BSB neural model described by (1) in H_n^o is 0.

Remark 3: Observe that the vector b in (2) can be used to assign the equilibrium point −W^{-1}b to a prespecified location in H_n^o. Also note that since −W^{-1}b ∈ H_n^o and in H_n^o model (2) is linear:

$$x(k+1) = (I_n + \alpha W)x(k) + \alpha b,$$

where the eigenvalues of (I_n + αW) are all outside the unit disk in the complex plane, −W^{-1}b is a completely unstable equilibrium point of (2) in H_n^o. Thus if x(0) is in a neighborhood of −W^{-1}b, then the successive iterates x(k) will move away from −W^{-1}b.

We next show that the trajectory {x(k)} moves away from −W^{-1}b until it hits the boundary of the hypercube H_n. We then show that this behavior continues once the trajectory hits an edge: it will stay on that edge until it reaches a lower dimensional edge (see Fig. 1).

Theorem 3: All edges of H_n are invariant for (2); that is, if the ith coordinate of x(k) is 1 (or −1), then the ith coordinate of x(k+1) is 1 (or −1). If more than one coordinate of x(k) has entries 1 (or −1), then x(k+1) has 1 (or −1) at the same coordinates.

Proof: Let the ith coordinate of x(k) be 1. Then the ith coordinate of (I_n + αW)x(k) + αb is

$$(1 + \alpha w_{ii}) + \alpha \sum_{\substack{j=1 \\ j \ne i}}^{n} w_{ij} x_j(k) + \alpha b_i \ge 1 + \alpha\Big(w_{ii} - \sum_{\substack{j=1 \\ j \ne i}}^{n} |w_{ij}| - |b_i|\Big) > 1.$$

The last inequality follows because of (3). Thus x_i(k+1) = [T(x(k))]_i = 1. Similarly, we have the result for x_i(k) = −1. □

An immediate corollary of Theorem 3 is the following analogue of Theorem 1.

Corollary 1: The extreme points of the hypercube H_n are equilibrium points of model (2) if and only if W and b satisfy (3).

Now the question arises whether the equilibrium points are stable. We formally define the stability of an equilibrium point of (2) as in Greenberg [3].

Definition 3: An equilibrium point x is stable if there exists a neighborhood N(x) in H_n for which x = T(y) for all y ∈ N(x).

In other words, an equilibrium point is stable if all points near it are sent to it in one step.

Theorem 4: The extreme points of the hypercube H_n are stable equilibrium points of (2).

Proof: Let e be an extreme point of H_n. In the proof of Theorem 3, we showed that if |x_i(k)| = 1, then

$$|[L(x(k))]_i| = |[(I_n + \alpha W)x(k) + \alpha b]_i| > 1.$$

By the continuity of L(x) there is a δ > 0 so that for all x such that ||x − e|| < δ we have

$$|[L(x)]_i| > 1 \quad \text{for } i = 1, \ldots, n.$$

Thus

$$|[T(x)]_i| = 1 \quad \text{for } i = 1, \ldots, n.$$

By continuity, for sufficiently small δ, we must have T(x) = e. □

One can use a technique of Greenberg's [3] to obtain an alternative proof of the above theorem.

For certain models, we can analyze their dynamics further to give an upper bound on the number of steps that the trajectory will stay in the open hypercube before hitting the boundary.

Corollary 2: Assume that on top of the assumption (3) the matrix W + W^T is positive definite. Let x(0) ∈ H_n^o and x(0) ≠ −W^{-1}b. Let d = ||x(0) + W^{-1}b||. Then x(k) lies on the boundary of H_n for

$$k > \frac{2 \log \frac{2\sqrt{n}}{d}}{\log\big(1 + \alpha \lambda_{\min}(W + W^T) + \alpha^2 \lambda_{\min}(W^T W)\big)},$$

where λ_min(P) denotes the smallest (real) eigenvalue of the symmetric matrix P.

Proof: From the assumptions it follows that λ_min(W + W^T) > 0 and λ_min(W^T W) > 0. We write x(0) = η − W^{-1}b for some η ∈ R^n, η ≠ 0, with ||η|| = d. Since g(y) = y for all y in H_n^o, we have, for all x(k) in H_n^o,

$$x(k) = (I_n + \alpha W)^k \eta - W^{-1}b.$$

Thus

$$\|x(k) + W^{-1}b\| = \|(I_n + \alpha W)^k \eta\|.$$

To proceed further we need to evaluate the norm ||(I_n + αW)η||. Note that

$$\|(I_n + \alpha W)\eta\|^2 = \eta^T \big(I_n + \alpha(W + W^T) + \alpha^2 W^T W\big)\eta \ge \|\eta\|^2 + \alpha \lambda_{\min}(W + W^T)\|\eta\|^2 + \alpha^2 \lambda_{\min}(W^T W)\|\eta\|^2,$$

since if P = P^T > 0 then λ_min(P)||η||^2 ≤ η^T P η. Thus

$$\|(I_n + \alpha W)\eta\|^2 \ge \big(1 + \alpha \lambda_{\min}(W + W^T) + \alpha^2 \lambda_{\min}(W^T W)\big)\|\eta\|^2. \qquad (4)$$

Hence, letting η_1 = (I_n + αW)^{k−1}η and taking into account (4), we have

$$\|(I_n + \alpha W)^k \eta\| = \|(I_n + \alpha W)\eta_1\| \ge \big(1 + \alpha \lambda_{\min}(W + W^T) + \alpha^2 \lambda_{\min}(W^T W)\big)^{1/2}\|\eta_1\|.$$

Proceeding in a similar manner, we obtain

$$\|x(k) + W^{-1}b\| = \|(I_n + \alpha W)^k \eta\| \ge \big(1 + \alpha \lambda_{\min}(W + W^T) + \alpha^2 \lambda_{\min}(W^T W)\big)^{k/2}\|\eta\|.$$

Since x(k) ∈ H_n^o, we have ||x(k) + W^{-1}b|| < 2√n, where 2√n is the length of the diagonal of H_n. Thus, for x(k) ∈ H_n^o,

$$2\sqrt{n} > \big(1 + \alpha \lambda_{\min}(W + W^T) + \alpha^2 \lambda_{\min}(W^T W)\big)^{k/2} d$$

since ||η|| = d. Therefore

$$k < \frac{2 \log \frac{2\sqrt{n}}{d}}{\log\big(1 + \alpha \lambda_{\min}(W + W^T) + \alpha^2 \lambda_{\min}(W^T W)\big)}.$$

To complete the proof, observe that by Theorem 3 we are guaranteed that the trajectory never leaves the boundary once there. □
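Corollary 2 yields a computable upper bound on how long a trajectory can stay in the open hypercube. A sketch of that computation, assuming W + W^T is positive definite (the helper name is ours):

```python
import numpy as np

def boundary_hitting_bound(W, alpha, d, n):
    # Bound of Corollary 2: once k exceeds this value the trajectory must lie on
    # the boundary of H_n (requires W + W^T positive definite and assumption (3)).
    W = np.asarray(W, dtype=float)
    lam_sym = np.min(np.linalg.eigvalsh(W + W.T))    # lambda_min(W + W^T)
    lam_gram = np.min(np.linalg.eigvalsh(W.T @ W))   # lambda_min(W^T W)
    growth = 1.0 + alpha * lam_sym + alpha**2 * lam_gram
    return 2.0 * np.log(2.0 * np.sqrt(n) / d) / np.log(growth)
```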


We next turn our attention to the dynamics of the system (2) restricted to an edge of the hypercube H_n. We will show that the equation which governs the dynamics of the neural model (2) on an edge of H_n has a form similar to that of (2), although with W and b of lower dimensionality.

We will use the following definitions in further considerations.

Definition 4: We say that S is an edge of H_n if there exists a nonempty subset J of the set {1, 2, ..., n} such that

$$S = \{[s_1, \ldots, s_n]^T \mid -1 \le s_j \le 1 \text{ for } j \notin J \text{ and } s_j = u_j \text{ for } j \in J\},$$

where each u_j is fixed to 1 or −1. We define the dimension of S to be the number of free (not fixed) variables in its definition.

Definition 5: The interior of the edge S, or the open edge S°, is the set

$$S^{\circ} = \{[s_1, \ldots, s_n]^T \in S \mid -1 < s_j < 1 \text{ for } j \notin J\}.$$

Let S be an edge of H_n. By Theorem 3, x(k) ∈ S implies x(m) ∈ S for m ≥ k. Thus x_i(m) = u_i for i ∈ J. Therefore we only need to study x_i(m) for i ∉ J.

We write

$$[(I_n + \alpha W)x + \alpha b]_i = \sum_{j=1}^{n} (\delta_{ij} + \alpha w_{ij})x_j + \alpha b_i = \sum_{j \notin J} (\delta_{ij} + \alpha w_{ij})x_j + \alpha\Big(b_i + \sum_{j \in J} w_{ij}x_j\Big) + \sum_{j \in J} \delta_{ij}x_j,$$

where

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j. \end{cases}$$

For x ∈ S and i ∉ J, we have

$$\sum_{j \in J} \delta_{ij}x_j = 0$$

and

$$\sum_{j \in J} w_{ij}x_j = \sum_{j \in J} w_{ij}u_j.$$

Let

$$\tilde{b}_i = b_i + \sum_{j \in J} w_{ij}u_j.$$

Then b̃ is a constant vector independent of x ∈ S (see Definition 4). Thus for i ∉ J we have

$$[g((I_n + \alpha W)x + \alpha b)]_i = g\Big(\sum_{j \notin J} (\delta_{ij} + \alpha w_{ij})x_j + \alpha \tilde{b}_i\Big).$$

The above equation shows us that if we identify S with the (n − |J|)-dimensional hypercube by deleting all fixed coordinates corresponding to the index set J, then the network (2) behavior is governed by the equation

$$\tilde{x}(k+1) = g\big((I_{n-|J|} + \alpha \tilde{W})\tilde{x}(k) + \alpha \tilde{b}\big) \qquad (5)$$

where W̃ is the (n − |J|) × (n − |J|) submatrix of W obtained by deleting the rows and columns with indices in J from W. Note that, for i ∉ J, we have

$$\tilde{w}_{ii} > \sum_{j \ne i} |\tilde{w}_{ij}| + |\tilde{b}_i| \qquad (6)$$

since |u_j| = 1 for j ∈ J. Observe the similarity between the two pairs of equations (2) and (5), and (3) and (6). The invariance of the structure of the network when restricted to a boundary region of H_n allows us to apply our previous results to the analysis of the network behavior on an edge.

We will now illustrate the above results on a numerical example.

Example 1: Let W ∈ R^{3×3} be a strongly row diagonal dominant weight matrix and let b = 0. The dynamics of the system on the front face {[1, x_2, x_3]^T | −1 ≤ x_i ≤ 1, i = 2, 3} of H_3 is governed by

$$\tilde{x}(k+1) = g\big((I_2 + \alpha \tilde{W})\tilde{x}(k) + \alpha \tilde{b}\big),$$

where W̃ and b̃ are obtained from W and b as described above. Of course, we are identifying x̃ ∈ R^2 and (1, x̃) ∈ R^3. The dynamics on the lower front edge {[1, y, −1]^T | −1 ≤ y ≤ 1} is governed by

$$\hat{x}(k+1) = g\big((I_1 + \alpha \hat{W})\hat{x}(k) + \alpha \hat{b}\big),$$


where Ŵ and b̂ are the corresponding 1 × 1 restrictions of W and b. Similarly we can obtain the dynamics on all the faces and edges of H_3.

Once we have the dynamics, we can obtain the equilibrium points by applying Theorem 2. For example, on the front face the equilibrium point is −W̃^{-1}b̃ (a point with first coordinate equal to 1 when embedded in H_3), and on the lower front edge it is −Ŵ^{-1}b̂ (a point with first coordinate 1 and third coordinate −1).
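The reduction to an edge in (5) amounts to deleting the clamped coordinates and folding their contribution into the bias, exactly as in the derivation above. A sketch (the index set J and the clamp values u are passed explicitly; names are ours):

```python
import numpy as np

def restrict_to_edge(W, b, J, u):
    # Reduced model (5): J is the set of clamped indices, u maps each j in J to +1 or -1.
    # Returns (W_tilde, b_tilde) acting on the remaining free coordinates,
    # with b_tilde_i = b_i + sum_{j in J} w_ij * u_j.
    W = np.asarray(W, dtype=float)
    b = np.asarray(b, dtype=float)
    free = [i for i in range(W.shape[0]) if i not in J]
    W_tilde = W[np.ix_(free, free)]
    b_tilde = b[free] + sum(W[free, j] * u[j] for j in J)
    return W_tilde, b_tilde
```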

    In general, it is true that there is exactly one equilibrium point in the interior of each edge.

Theorem 5: On the hypercube H_n, model (2) has 3^n equilibrium points, and only the 2^n extreme points are stable equilibria. The distribution of the equilibrium points is independent of α > 0.

Proof: We know that the 2^n extreme points are stable equilibria by Theorem 4. By Theorem 2, there is one equilibrium point in H_n^o and, with Theorem 2 applied to (2) restricted to each edge, there is one equilibrium point in the interior of each edge. There are

$$\sum_{j=1}^{n-1} \binom{n}{j} 2^{n-j}$$

nontrivial edges. Thus there are

$$1 + \sum_{j=1}^{n-1} \binom{n}{j} 2^{n-j} + 2^n = 3^n$$

equilibrium points. By the proof of Theorem 2, the equilibrium points in H_n^o and in the interiors of the edges are not stable. Note that the extreme points and edges of an edge are extreme points and edges of H_n; therefore, it is not necessary to consider these separately. The equilibrium points are independent of α by Theorem 2. □

Observe that we can obtain the equilibrium points explicitly by computing W̃ and b̃ for each edge. Then the equilibrium point is at −W̃^{-1}b̃ by Theorem 2.

Similar conclusions to those of Theorem 5 about the distribution of equilibrium points for the analog counterpart of (1) were obtained by Li et al. [7].

In the next section we prove rigorously that the set of initial states whose trajectories do not reach an extreme point of H_n in a finite number of steps has n-dimensional Lebesgue measure 0. That is, with probability 1 (with respect to any probability measure that is absolutely continuous with respect to the Lebesgue measure), all initial states lead to an extreme point.

From what we have done in this section, it is not difficult to see that H_n is partitioned into 2^n regions by (n − 1)-dimensional surfaces. Each of the regions contains exactly one extreme point. The surfaces meet at −W^{-1}b, the totally unstable equilibrium point in H_n^o.

Fig. 3. Illustration of the domains of attraction in Example 2 along with sample trajectories. If x(0) = [0.1, 0]^T, then the trajectory starting from this initial state reaches [1, 1]^T in seven steps. If x(0) = [0.2, −0.2]^T, then the corresponding trajectory reaches [−1, −1]^T in eleven steps.

Example 2: For a particular choice of W ∈ R^{2×2} and b ∈ R^2 satisfying (3), we find that the equilibrium points are the extreme points of H_2, the point [0, 0]^T, and one point in the interior of each of the four edges. Fig. 3 illustrates the distribution of the equilibrium points along with the domains of attraction of the extreme stable equilibrium points.
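As in Theorem 5 and Example 2, all 3^n equilibria can be enumerated by applying the edge reduction to every set of clamped coordinates. A self-contained brute-force sketch, intended only for small n (names are ours):

```python
import numpy as np
from itertools import combinations, product

def all_equilibria(W, b):
    # Enumerate the 3^n equilibria of (2): -W^{-1}b in the interior, one point in the
    # interior of each nontrivial edge (via the reduced W~, b~), and the 2^n extreme points.
    W = np.asarray(W, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(b)
    points = [-np.linalg.solve(W, b)]                    # interior equilibrium
    for m in range(1, n):                                # m = number of clamped coordinates
        for J in combinations(range(n), m):
            free = [i for i in range(n) if i not in J]
            for signs in product((-1.0, 1.0), repeat=m):
                # Reduced weight matrix and bias of (5): delete clamped rows/columns
                # and fold the clamped values into the bias.
                W_t = W[np.ix_(free, free)]
                b_t = b[free] + W[np.ix_(free, list(J))] @ np.array(signs)
                x = np.empty(n)
                x[free] = -np.linalg.solve(W_t, b_t)     # equilibrium of the edge dynamics
                x[list(J)] = signs
                points.append(x)
    points += [np.array(c) for c in product((-1.0, 1.0), repeat=n)]   # extreme points
    return points
```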

IV. GENERIC PROPERTIES OF THE TRAJECTORIES OF THE BSB SYSTEMS

In [10, p. 67] one can find the following statement: "BSB systems always end up in one or another of the corners. The particular corner depends on the start state of the network, the input to the system, and the pattern of connections among the units." The statement is technically correct, although given without a proof. The first portion of the above statement was also investigated by Greenberg [3], who concludes that since the extreme points are stable equilibria when W is strongly diagonal dominant "we have the essence of convergence to extreme point equilibria" [3, p. 324]. He then, correctly, claims, without a proof, that if x(0) ∈ H_n^o then the trajectory starting from x(0) converges to an extreme point of H_n with probability 1.

    In what follows we provide a rigorous proof of an even stronger version of the above claims using only deterministic arguments.

    Theorem 6: The set of initial states of the trajectories of (2) which do not reach an extreme point in a finite number of steps has n-dimensional Lebesgue measure 0. That is, with


    probability 1 with respect to any probability measure that is absolutely continuous with respect to the Lebesgue measure, all initial states lead to an extreme point.

    In the proof of the above theorem we need the following lemma, which states that the set of points which are sent into a thin set of an edge must be thin itself.

Lemma 5: Let S be a nontrivial edge of H_n. Suppose dim S = m. Let E be a subset of S° (the interior of S) with m-dimensional Lebesgue measure 0. Then the set

$$\{x \in H_n^{\circ} \mid T(x) \in E\}$$

has n-dimensional Lebesgue measure 0.

Proof of Lemma 5: Without loss of generality, we can assume that

$$S^{\circ} = \{[s_1, \ldots, s_m, 1, \ldots, 1]^T \in \mathbb{R}^n \mid -1 < s_i < 1, \ i = 1, \ldots, m\}.$$

By assumption E ⊂ S°. Then, for x ∈ H_n^o,

$$T(x) = g((I_n + \alpha W)x + \alpha b) \in E$$

if and only if

$$(I_n + \alpha W)x + \alpha b \in \{[s_1, \ldots, s_m, t_{m+1}, \ldots, t_n]^T \mid [s_1, \ldots, s_m, 1, \ldots, 1]^T \in E \text{ and } t_i \ge 1 \text{ for } i = m+1, \ldots, n\} \cap L(H_n^{\circ}) \triangleq N.$$

Recall L(x) = (I_n + αW)x + αb. Clearly N has identical cross sections along the direction orthogonal to S. Thus, by Fubini's theorem on iterated integrals [11, p. 78],

$$\operatorname{meas}(N) \le \operatorname{meas}(E) \cdot \operatorname{diam}(L(H_n^{\circ})) = 0. \qquad \square$$

Proof of Theorem 6: We use induction on the dimension of the hypercube H_n. The case n = 1 follows immediately from Corollary 2.

As an illustration of the idea of the proof, we present the proof for the case n = 2 before proceeding with the general case.

Let x(0) ∈ H_2^o and x(0) ≠ −W^{-1}b. By Corollary 2, the trajectory {x(k)} will be on the boundary of H_2 in a finite number of steps. If the trajectory hits the boundary not at one of the four equilibrium points in the interiors of the edges, then another application of Corollary 2, in the one-dimensional hypercube H_1, shows that the trajectory will end at an extreme point. Let S be an edge and a in S° be the equilibrium point. Let

$$N = \{x \in H_2^{\circ} \mid T(x) = a\}.$$

Since {a} has one-dimensional measure 0, N has two-dimensional measure 0 by Lemma 5. The matrix (I_2 + αW) is nonsingular (with the eigenvalues of W in the open right-half complex plane); hence L is one-to-one and onto. Thus L^{-1}(N), L^{-2}(N), ... are line segments in H_2 and have two-dimensional measure 0. Therefore the set

$$\{x(0) \in H_2^{\circ} \mid x(k) = a \text{ for some } k = 1, 2, \ldots\} = \bigcup_{k=0}^{\infty} L^{-k}(N)$$

has measure zero. Observe that ∪_{k=0}^∞ L^{-k}(N) is the piecewise-linear curve connecting a, L^{-1}(a), ..., L^{-k}(a), ... to −W^{-1}b, which is the equilibrium point in H_2^o (see Fig. 4). The other three equilibrium points can be analyzed similarly. Thus the proof of the case n = 2 is complete.

Fig. 4. Illustration of the proof of Theorem 6 for n = 2.

Let us now consider the general case. Assume the conclusion of the theorem holds for i = 1, ..., n − 1. By Corollary 2, applied to the open hypercube H_n^o and each edge, a trajectory will reach a lower dimensional edge unless it hits an equilibrium point which is not an extreme point. Let a be the fixed point of an open (n − m)-dimensional edge S°, 0 < m < n. If x(k) ∈ S° and x(k) ≠ a, then the trajectory will reach a lower dimensional edge by Corollary 2.

Let

$$N = \{x \in H_n^{\circ} \mid T(x) = a\}.$$

By Lemma 5, the n-dimensional measure of N is 0. Since L is one-to-one and onto, the sets

$$L^{-1}(N), L^{-2}(N), \ldots$$

are in H_n^o and have n-dimensional measure 0. Thus the set

$$\{x(0) \in H_n^{\circ} \mid \{x(k)\} \text{ hits the boundary first at } a\} = \bigcup_{k=0}^{\infty} L^{-k}(N)$$

has measure 0.

We next consider the case where the trajectory {x(k)} hits the boundary at another edge before hitting a. So suppose that the trajectory hits the boundary first at an edge S_1 ≠ S. Then n > dim S_1 > dim S, since the trajectory can move from S_1 to S only if S is an edge of S_1. Let n_1 = dim S_1. Since n_1 ≤ n − 1, the set

$$E = \{x(0) \in S_1 \mid x(k) = a \text{ for some } k\}$$

has n_1-dimensional measure 0 by induction. By Lemma 5 the set

$$N_1 = \{x \in H_n^{\circ} \mid T(x) \in E\}$$


has n-dimensional measure 0. Again by the one-to-one property of L, the sets

$$L^{-1}(N_1), L^{-2}(N_1), \ldots$$

are in H_n^o and have measure 0. Thus the set

$$\{x(0) \in H_n^{\circ} \mid \{x(k)\} \text{ hits the boundary first at } S_1 \text{ and then } a\} = \bigcup_{k=0}^{\infty} L^{-k}(N_1)$$

has measure 0. Since there are only a finite number of edges, the set

$$\{x(0) \in H_n^{\circ} \mid \{x(k)\} \text{ hits } a \text{ for some } k\}$$

has measure 0. Since there are only a finite number of equilibrium points, the set

$$\{x(0) \in H_n^{\circ} \mid \{x(k)\} \text{ hits some nonextreme equilibrium point}\}$$

has measure 0, and the proof is complete. □

V. GENERALIZATIONS TO A CLASS OF DISCRETE CELLULAR NEURAL NETWORKS

    In this section we show that the results we obtained for the BSB neural models can be extended to a class of discrete cellular neural networks.

Analog cellular neural networks have been investigated by Chua and Yang [2]. In this section we propose a class of discrete cellular neural network models which can be considered as a discrete counterpart of the analog models proposed by Chua and Yang.

One can form a cellular network of any dimension. In this paper, however, for the sake of clarity we deal only with two-dimensional cellular networks. A cellular network is composed of interconnected basic building cells (neurons) C(i, j), where any cell in the network is connected only to its neighborhood cells. We define a neighborhood of the cell C(i, j) in a manner similar to that in [2].

Definition 6: The r-neighborhood N_r(i, j) of a cell C(i, j) is the set of cells C(k, l) such that

$$N_r(i, j) = \{C(k, l) \mid \max\{|k - i|, |l - j|\} \le r\}.$$

An illustration of a two-neighborhood of a cell C(i, j) is

    depicted in Fig. 5. Having defined the neighborhood of a cell in a cellular

    neural network, we now propose to consider the following model of a class of discrete neural networks:

    (7)

Fig. 5. The neighborhood of cell C(i, j) for r = 2.

    In (7) we do not require that the interconnections between the cells have any kind of symmetry.

The above 2-D model can be represented in the format of the 1-D BSB model described by (2) if one defines the mn-dimensional state vector x(k) as follows:

    Then one can use the results of the previous sections to study the cellular neural model (7).
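The paper's exact stacking of the m × n array of cell states into the mn-dimensional vector is not legible in this transcript; the row-major ordering below is only one natural choice, shown for illustration:

```python
import numpy as np

def flatten_grid_state(X):
    # Stack the m-by-n array of cell states x_{ij}(k) into an mn-vector (row-major order).
    return np.asarray(X, dtype=float).reshape(-1)

def unflatten_grid_state(x, m, n):
    # Recover the m-by-n cell array from the stacked mn-vector.
    return np.asarray(x, dtype=float).reshape(m, n)
```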

    VI. CONCLUSIONS

    In this paper we have studied the dynamic properties of a class of brain-state-in-a-box (BSB) neural networks described by a system of first-order linear difference equations defined on a closed hypercube.

We have introduced a more general version of the BSB neural model. This generalization allows one to better control the extent of the domains of attraction of stored patterns. Our stability analysis was performed for a network with a nonsymmetric interconnection matrix. A method for computing the equilibrium points has also been offered. The results obtained were interpreted from the point of view of the implementation of associative memories using the analyzed neural model.

    We also discussed generalizations of the results obtained to a class of discrete cellular neural models which can be viewed as a discrete counterpart of the analog neural models analyzed in [2].

    ACKNOWLEDGMENT

    The authors gratefully acknowledge the constructive re- marks of the reviewers.


REFERENCES

[1] J. A. Anderson, J. W. Silverstein, S. A. Ritz, and R. S. Jones, "Distinctive features, categorical perception, and probability learning: Some applications of a neural model," in Neurocomputing: Foundations of Research, J. A. Anderson and E. Rosenfeld, Eds. Cambridge, MA: MIT Press, 1988.
[2] L. O. Chua and L. Yang, "Cellular neural networks: Theory," IEEE Trans. Circuits Syst., vol. 35, pp. 1257-1272, Oct. 1988.
[3] H. J. Greenberg, "Equilibria of the brain-state-in-a-box (BSB) neural model," Neural Networks, vol. 1, no. 4, pp. 323-324, 1988.
[4] S. Grossberg, "Nonlinear neural networks: Principles, mechanisms, and architectures," Neural Networks, vol. 1, no. 1, pp. 17-61, 1988.
[5] J. J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," Proc. Nat. Acad. Sci. U.S., vol. 81, pp. 3088-3092, May 1984.
[6] S. Hui and S. H. Zak, "On the brain-state-in-a-box (BSB) neural models," in Proc. Int. AMSE Conf. Neural Networks (San Diego, CA), May 29-31, 1991.
[7] J.-H. Li, A. N. Michel, and W. Porod, "Analysis and synthesis of a class of neural networks: Linear systems operating on a closed hypercube," IEEE Trans. Circuits Syst., vol. 36, pp. 1405-1422, Nov. 1989.
[8] M. Marcus and H. Minc, A Survey of Matrix Theory and Matrix Inequalities. Boston, MA: Allyn and Bacon, 1964.
[9] A. N. Michel and J. A. Farrell, "Associative memories via artificial neural networks," IEEE Control Systems Magazine, vol. 10, no. 3, pp. 6-17, Apr. 1990.
[10] D. E. Rumelhart, J. L. McClelland, and the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol. 1, Foundations. Cambridge, MA: MIT Press, 1989.
[11] L. Debnath and P. Mikusinski, Introduction to Hilbert Spaces with Applications. San Diego, CA: Academic Press, 1990.

Stefen Hui received the B.A. degree from the University of California at Berkeley and the Ph.D. degree from the University of Washington in Seattle, both in mathematics.

He has held positions as an applied mathematician at the Naval Ocean Systems Center in San Diego and as a Research Assistant Professor at Purdue University. Currently he is an Assistant Professor of Mathematics at San Diego State University. His research interests include function theory, control theory, and neural computations.

Stanislaw H. Zak (M'81) received the Ph.D. degree from the Technical University of Warsaw, Poland, in 1977.

He was an Assistant Professor in the Institute of Control and Industrial Electronics, Technical University of Warsaw, from 1977 to 1980. From 1980 until 1983, he was a Visiting Assistant Professor in the Department of Electrical Engineering, University of Minnesota. In August 1983, he joined the School of Electrical Engineering at Purdue University, where he is now an Associate Professor. His research interests are in three different fields: an algebraic approach to the analysis and synthesis of linear systems; the control and observation of nonlinear systems; and applications of neural networks to optimization and control problems.