INTERNATIONAL SYMPOSIUM
ON
OPERATOR THEORY OF NETWORKS
AND SYSTEMS
Volume I
AUGUST 12-14, 1975
MONTREAL
Copyright ©1976 by Western Periodicals Company
13000 Raymer Street
North Hollywood, California 91605
PREFACE
The enclosed papers constitute those discussed at the Operator Theory of Networks and Systems International Symposium, OTNS, held at Concordia University, Montreal, Canada, August 12-14, 1975. On behalf of the audience wishing contact with these ideas I wish to thank the authors for making them available and Western Periodicals for publishing them.
The OTNS Symposium has been set up to bring together researchers working in the area for cross-fertilization and codification of results. By its nature the field is one where both engineers and mathematicians have made extensive contributions. Consequently, we have been happy to see that the OTNS Symposium has positively contributed to fertile interactions between the two disciplines and we hope such will continue within the OTNS structure now established.
The success of the Montreal OTNS Symposium is owed to a number of people. Primary among these is Professor N. Levan of UCLA who as Program Chairman set the tone of the meeting. Besides him I would like to publicly thank W. Porter, R. M. De Santis, V. Ramachandran, R. Saeks, M. N. S. Swamy, J. Baras, N. DeClaris, as well as my Co-Chairman A. Zemanian, all of whom contributed to the administration of the Symposium.
We also appreciated the institutional support of our Co-Sponsors:
1. Department of Applied Mathematics and Statistics, SUNY, Stony Brook, New York.
2. System Science Department, UCLA, Los Angeles, California.
3. Electrical Engineering Department, Concordia University, Montreal, Canada.
I look forward to seeing a renewal of these activities at the next OTNS Symposium presently being planned for summer 1977.
R. W. Newcomb
1975 OTNS Co-Chairman
Electrical Engineering Department
University of Maryland
College Park, Maryland
TABLE OF CONTENTS
"Structure Result for Nonlinear Passive Systems"
B.D.O. Anderson and P. J. Moylan, Department of Electrical Engineering, University of Newcastle, Australia
"An Algebra of Operator Networks"
William N. Anderson, Jr., Department of Mathematics and George E. Trapp, Department of Statistics and Computer Science, West Virginia University, Morgantown, West Virginia
"Contractive Perturbations of Restricted Shifts"
Joseph A. Ball, Virginia Polytechnic Institute and State University, Blacksburg, Virginia and Arthur Lubin, Northwestern University, Evanston, Illinois
"Frequency Response Methods in Multivariable Infinite Dimensional Linear Systems"
John S. Baras, Electrical Engineering Department, University of Maryland, College Park, Maryland
"On Simultaneous Diagonalization of a Collection of Hermitian Matrices"
S. Chakrabarti, B. B. Bhattacharyya and M. N. S. Swamy, Department of Electrical Engineering, Concordia University, Montreal, Quebec, Canada
"A Walsh Operational Matrix for Solving Variational Problems"
C. F. Chen and C. H. Hsiao, Electrical Engineering Department, University of Houston, Houston, Texas
"A Complex Form of the Generalized Fourier Series and Transforms"
Dan A. Ciulin, Polytechnical Institute of Bucharest
"Triangularization of Some Restricted Shifts"
D. N. Clark, University of Georgia, Athens, Georgia and S. Sickler, Eastern Nazarene College, Quincy, Massachusetts
"Further Results on the Association of Variables"
James Conland and F. L. Koll, University of Regina, Regina, Saskatchewan
"The Feedback Interconnection of Multivariable Systems: Simplifying Theorems for Stability"
C. A. Desoer and W. S. Chan, Department of Electrical Engineering and Computer Sciences and the Electronics Research Laboratory, University of California, Berkeley, California
"The 'Fourier' Transform of a Resolution Space and a Theorem of Masani"
R. A. DeCarlo, R. Saeks and M. Strauss, Texas Tech University, Lubbock, Texas
"Lumped-Distributed Network Synthesis Via Invariant Subspace Theory"
P. Dewilde, Departement Elektrotechniek, Katholieke Universiteit Leuven, Heverlee, Belgium and J. S. Baras, Electrical Engineering Department, University of Maryland, College Park, Maryland
"Linear Hilbert Networks Containing Finitely Many Nonlinear Elements"
Vaclav Dolezal
"Livsic's Chain Synthesis"
T. T. Ha and R. W. Newcomb, Electrical Engineering Department, University of Maryland, College Park, Maryland
"Radar Target Recognition--An Operator Theoretic Approach"
D. E. Hammers and A. J. MacKinnon, ITT Gilfillan, Van Nuys, California
"Infinite Dimensional Realizability Theory"
J. William Helton, University of California, San Diego, La Jolla, California
"Linear Network Synthesis Using Iteration Methods"
Y. ~. Jan and F. R. Chang, National Chiao-Tung University, Taiwan, Republic of China
"The Transformation Operator Approach to Multi-Subsystem Dynamics"
William Jerkovsky, The Aerospace Corporation, El Segundo, California
"A Note on the Nagy-Foias Lossy and Lossless Space"
N. Levan, Department of System Science, 4532 Boelter Hall, University of California, Los Angeles, California
"An Output Control Problem Containing Input Derivatives"
Victor Lovass-Nagy and David L. Powers, Clarkson College of Technology, Potsdam, New York
"An Explicit Treatment of Dilation Theory"
P. Masani, University of Pittsburgh, Pittsburgh, Pennsylvania
"Characterizations of Operations Derived from Network Connections"
Katsuyoshi Nishio, Department of Information Engineering, Faculty of Engineering, Ibaraki University, Hitachi, Ibaraki, Japan and Tsuyoshi Ando, Research Institute of Applied Electricity, Hokkaido University, Sapporo, Japan
"A Functional Analysis Approach to Minimum Sensitivity Control Design"
J. Gary Reid, United States Air Force Avionics Laboratory, Wright Patterson AFB, Ohio
"Passivity and L^p-Stability of Some Nonlinear Evolution Equations"
Dinu Wexler, Department of Mathematics, Facultés Universitaires N.D. de la Paix, Namur, Belgium
"Contractive Transfer Ratios of Operator Networks"
A. H. Zemanian, State University of New York at Stony Brook, Stony Brook, New York
STRUCTURE RESULT FOR NONLINEAR
PASSIVE SYSTEMS
B.D.O. Anderson and P. J. Moylan Department of Electrical Engineering
University of Newcastle Australia
Abstract
A class of nonlinear, finite-dimensional, dynamic systems is studied for which the input and output vectors u and y satisfy a passivity condition. It is shown that such systems may be viewed as a cascade of a memoryless passive nonlinear system and a dynamic lossless system. The discussion is related to known results on linear network synthesis.
1. INTRODUCTION
Classical network synthesis, for linear, lumped, finite, passive networks, is concerned with the problem of passing from a port description of a network in terms of, say, a positive real impedance matrix Z(s), to a collection of (passive) network elements and a scheme for interconnecting them to produce a network with impedance matrix equal to that prescribed, [1-3]. State-space approaches to the same problem [4] commence by assuming known a state-variable realization {F, G, H, J} (generally minimal) of Z(s) - thus
$$Z(s) = J + H'(sI - F)^{-1}G \qquad (1)$$
and then attempt to construct from this an internally dissipative realization {F, G, H, J}, i.e. one for which
$$\begin{bmatrix} J + J' & -(H - G)' \\ -(H - G) & -(F + F') \end{bmatrix} \geq 0 \qquad (2)$$
Equivalently, one needs to find a positive definite P such that
$$\begin{bmatrix} J + J' & -(H - PG)' \\ -(H - PG) & -PF - F'P \end{bmatrix} \geq 0 \qquad (3)$$
Once (2) or (3) is obtained, it is then possible to define easily a nondynamic coupling network N_c, synthesisable merely with (passive) resistors, transformers and gyrators, such that termination of some of the ports of N_c in inductors leads to an impedance Z(s) being observed at the remaining ports. For details, see [4].
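As a small illustration of condition (3), the following sketch tests it for a scalar realization. The example impedance Z(s) = 1 + 1/(s + 1), the realization F = -1, G = 1, H = 1, J = 1, and the function names are assumptions chosen for illustration; they do not come from [4].

```python
from fractions import Fraction

def is_psd_2x2(a, b, d):
    """A symmetric matrix [[a, b], [b, d]] is positive semidefinite
    iff a >= 0, d >= 0 and a*d - b*b >= 0."""
    return a >= 0 and d >= 0 and a * d - b * b >= 0

def dissipation_matrix(F, G, H, J, P):
    """Entries (a, b, d) of the matrix in (3) when F, G, H, J, P are scalars:
    a = J + J', b = -(H - P G), d = -P F - F' P."""
    return 2 * J, -(H - P * G), -2 * P * F

# Hypothetical example: Z(s) = 1 + 1/(s + 1)
F, G, H, J = Fraction(-1), Fraction(1), Fraction(1), Fraction(1)

ok_P1 = is_psd_2x2(*dissipation_matrix(F, G, H, J, Fraction(1)))    # P = 1 satisfies (3)
ok_P10 = is_psd_2x2(*dissipation_matrix(F, G, H, J, Fraction(10)))  # P = 10 does not
```

With P = 1 the matrix is diag(2, 2), so this particular realization is already internally dissipative in the sense of (2); a poorly chosen P such as P = 10 fails the test.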
Our purpose here is to describe how some of these results will carry over to a nonlinear situation. Consider a system described by
$$\dot{x} = f(x) + g(x)u, \qquad y = h(x) + j(x)u \qquad (4)$$
where u(·) and y(·) are real m-vector functions of time, x(·) is a real n-vector function of time and f(·), g(·), h(·), j(·) are suitably smooth real functions of appropriate dimension, with f(0) = 0, h(0) = 0. We call such a system passive if for all u(·) and t_1, given x(t_0) = 0, one has
$$\int_{t_0}^{t_1} u'y \, dt \geq 0 \qquad (5)$$
This definition is a natural extension of that applying in the linear case; one can think of u(·) and y(·) as current and voltage vectors respectively, so that the integral in (5) constitutes the energy input to a network with port variables u(·) and y(·), computed over [t_0, t_1], and with the network initially unexcited.
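The passivity integral (5) can be explored numerically. The sketch below is only an illustration under stated assumptions: the scalar system x' = -x^3 + u, y = x is a hypothetical example of the form (4), the input sin(t) is arbitrary, and the forward-Euler scheme and step size are choices made here, not part of the paper.

```python
import math

def simulate_energy(u, T=10.0, dt=1e-3):
    """Forward-Euler simulation of x' = -x**3 + u(t), y = x, from x(0) = 0.

    Returns (supplied, stored, dissipated): the discretized integral of u*y
    over [0, T], the final value of x**2 / 2, and the discretized integral
    of x**4.
    """
    x, supplied, dissipated = 0.0, 0.0, 0.0
    steps = int(round(T / dt))
    for k in range(steps):
        uk = u(k * dt)
        supplied += uk * x * dt      # u'y dt with y = h(x) = x
        dissipated += x ** 4 * dt    # energy lost in the state equation
        x += dt * (-x ** 3 + uk)     # Euler step of the state equation
    return supplied, 0.5 * x * x, dissipated

supplied, stored, dissipated = simulate_energy(math.sin)
```

For this example the supplied energy splits, up to discretization error, into a stored part x^2/2 and a nonnegative dissipated part, which is why the integral in (5) cannot become negative when the system starts unexcited.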
Our task is to provide a nonlinear internally passive synthesis for (4). That is, we wish to find a nonlinear, nondynamic or memoryless, passive coupling network together with nonlinear passive inductors so that the arrangement depicted in Figure 1 (coupling network loaded at some ports by inductors) has u related to y by (4). Note that while (5) is a passivity condition on the network of Figure 1, it is an external one, directly putting constraints on the port behaviour alone of the network, and not the behaviour of internal variables nor the properties of components within the network.
Note also that our specification of both the nonlinear inductor network and the nondynamic coupling network resulting from the synthesis procedure will be simply via port descriptions of these networks - we shall not attempt to describe how to undo any mutual coupling of the nonlinear inductors for example. Accordingly, the contribution to practical network synthesis is virtually nil; the result is more one concerning the theory of passive systems, with electrical networks providing one means of visualizing the results.
In section 2, we present background results drawn from [5] which allow reinterpretation of the condition (5) in terms of the state-variable equations (4). These results are used in section 3 to present a passive synthesis. Section 4 contains concluding remarks.
2. BACKGROUND
Returning to the linear problem for the moment, we note that it is possible to associate with a passive (or positive real) Z(s) in (1) a variational problem, the solution to which defines a positive definite P satisfying (3). It turns out that the same sort of idea can be employed in studying the passivity of (4).
Following [5], we shall assume that (4) is completely controllable in the sense that for any finite states x_0 and x_1, there exists a finite time t_1 and a smooth control defined on [0, t_1] such that the state can be driven from x(0) = x_0 to x(t_1) = x_1. Further, we assume a form of local controllability: for any x_0 and any x_1 in a suitably small open neighbourhood of x_0, there exists a u(·) and t_1 as above with the additional property that
$$\left| \int_0^{t_1} u'(t)y(t)\,dt \right| \leq \rho(\|x_1 - x_0\|) \qquad (6)$$
for some continuous ρ(·) such that ρ(0) = 0. (This equation in effect demands that changes of state must not use arbitrarily large amounts of energy). The main theorem of [5] then states that a necessary and sufficient condition for (5) to hold is that there should exist real functions P(·), ℓ(·) and w(·) with P(x) continuous and with, for all x,
$$P(x) \geq 0 \quad \text{and} \quad P(0) = 0 \qquad (7)$$
$$\begin{aligned} f'(x)\nabla P(x) &= -\ell'(x)\ell(x) \\ \tfrac{1}{2} g'(x)\nabla P(x) &= h(x) - w'(x)\ell(x) \\ j(x) + j'(x) &= w'(x)w(x) \end{aligned} \qquad (8)$$
These equations generalize those applying in the linear case, [4]. The results of the linear case are recovered by setting f(x) = Fx, g(x) = G, h(x) = H'x, j(x) = J and P(x) = x'Px. The variational problem used in the linear case when translated to the nonlinear case yields the following characterization for one of the functions P(x) satisfying (8):
$$P(x) = -\lim_{T \to \infty} \inf_{u(\cdot)} \int_0^T 2u'(t)y(t)\,dt \qquad (9)$$
Let us observe for later use that (8) imply
$$\begin{bmatrix} j(x) + j'(x) & h(x) - \tfrac{1}{2}g'(x)\nabla P(x) \\ [h(x) - \tfrac{1}{2}g'(x)\nabla P(x)]' & -f'(x)\nabla P(x) \end{bmatrix} = [\,w(x) \quad \ell(x)\,]'\,[\,w(x) \quad \ell(x)\,] \geq 0 \qquad (10)$$
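To make conditions (7), (8) and (10) concrete, here is a worked scalar example. The system $\dot{x} = -x^3 + u$, $y = x$ and the candidate $P(x) = x^2$ are assumptions chosen for illustration; they are not taken from [5].

```latex
% Hypothetical example: f(x) = -x^3, g(x) = 1, h(x) = x, j(x) = 0, with P(x) = x^2.
\begin{align*}
P(x) &= x^2 \geq 0, \qquad P(0) = 0, \qquad \nabla P(x) = 2x, \\
f'(x)\nabla P(x) &= -2x^4 = -\ell'(x)\ell(x)
  \quad\Longrightarrow\quad \ell(x) = \sqrt{2}\,x^2, \\
\tfrac{1}{2}\, g'(x)\nabla P(x) &= x = h(x) - w'(x)\ell(x)
  \quad\Longrightarrow\quad w(x) = 0, \\
j(x) + j'(x) &= 0 = w'(x)w(x), \\
\begin{bmatrix} j + j' & h - \tfrac{1}{2} g'\nabla P \\
                (h - \tfrac{1}{2} g'\nabla P)' & -f'\nabla P \end{bmatrix}
 &= \begin{bmatrix} 0 & 0 \\ 0 & 2x^4 \end{bmatrix} \geq 0 .
\end{align*}
```

In this example all three relations in (8) hold with $w = 0$, and the matrix in (10) is positive semidefinite but not zero, so the system is passive without being lossless.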
It is also possible to define a lossless system by specializing (4) and (5) somewhat, and to obtain a corresponding specialization of (8). Thus we say (4) is lossless if (a) it is passive and (b) if x(t_0) = x(t_1) = 0, then
$$\int_{t_0}^{t_1} u'y \, dt = 0 \qquad (11)$$
for all u(·). In this case, the results of [5] show that (7) and (8) hold with ℓ(x) and w(x) both zero, and the matrix on the left side of (10) is accordingly zero.
3. SYNTHESIS PROCEDURE
For a single nonlinear inductor carrying current i and flux φ, assumed to have a φ-i characteristic passing through the origin, the stored energy is $\int_0^{\phi} i(\phi)\,d\phi$. For n mutually coupled nonlinear inductors with current vector i and flux vector φ, the stored energy is $\int_0^{\phi} i'(\phi)\,d\phi$; since this integral must be path-independent for a lossless set of inductors, i(φ) must be of the form ∇Q(φ) for some scalar function Q of φ, nonnegative on account of the passivity property.
In our problem, we identify the state variable x with the vector of inductor fluxes, and the function ½P(x) with the stored energy. Since P(x) ≥ 0 for all x, this means that the inductors are certainly passive, indeed lossless. The current corresponding to the flux vector x is ½∇P(x). [In abstract terms, one may regard the inductor simply as the map x ↦ ½∇P(x)].
In Figure 1, we may evidently identify the variables v and i as ẋ and ½∇P(x) respectively. Now observing (4) we see that the only way the coupling network could provide the requisite relation between u and y is if it sustains precisely the voltage-current pairs
$$\begin{bmatrix} y \\ \dot{x} \end{bmatrix}, \quad \begin{bmatrix} u \\ -\tfrac{1}{2}\nabla P(x) \end{bmatrix}$$
or
$$\begin{bmatrix} h(x) + j(x)u \\ f(x) + g(x)u \end{bmatrix}, \quad \begin{bmatrix} u \\ -\tfrac{1}{2}\nabla P(x) \end{bmatrix}$$
These pairs in effect define the coupling network.
Note that in the event that the map x ↦ ½∇P(x) is invertible, the coupling network will be current controlled (i.e. any current can exist at its ports) as will the coupled inductors. Further the coupling network is plainly nondynamic. If x ↦ ½∇P(x) is not invertible, the network is still nondynamic, though controlled by something external to it, viz. the vector x of inductor fluxes.
Let us now observe the passivity of the coupling network. The instantaneous power flow into the network is
$$[\,u'j'(x) + h'(x) \quad u'g'(x) + f'(x)\,] \begin{bmatrix} u \\ -\tfrac{1}{2}\nabla P(x) \end{bmatrix} = \tfrac{1}{2}\,[\,u' \quad 1\,] \begin{bmatrix} j(x) + j'(x) & h(x) - \tfrac{1}{2}g'(x)\nabla P(x) \\ [h(x) - \tfrac{1}{2}g'(x)\nabla P(x)]' & -f'(x)\nabla P(x) \end{bmatrix} \begin{bmatrix} u \\ 1 \end{bmatrix} \geq 0$$
using (10). Moreover, in case (4) is lossless, we know that ℓ(x) and w(x) in (8) are zero, and use of (10) then implies here that the instantaneous power flow into the coupling network is zero. Hence if (4) is passive, so is the coupling network and if (4) is lossless, so is the coupling network.
4. CONCLUSIONS
The main result of this paper is the demonstration that a class of passive systems can be viewed as a cascade of a memoryless passive system (termed earlier the coupling network) and a dynamic lossless system (termed earlier the coupled inductor network). Some immediate variations on this theme are clearly possible; for example, one could work with admittances and capacitors, or one could exhibit a synthesis starting from an analogue of the scattering matrix. [Results akin to those of [5] have been developed by one of the authors which handle this problem].
Perhaps of more interest would be an examination of the extent to which reciprocity ideas could be incorporated into the study. Presumably one would parallel some of the linear system ideas used in [4], but the details remain unclear.
REFERENCES
[1] E. A. Guillemin, Synthesis of Passive Networks, Wiley, New York, 1957.
[2] L. Weinberg, Network Analysis and Synthesis, McGraw Hill Book Co. Inc., New York, 1962.
[3] R. W. Newcomb, Linear Multiport Synthesis, McGraw Hill Book Co. Inc., New York, 1966.
[4] B. D. O. Anderson and S. Vongpanitlerd, Network Analysis and Synthesis - A Modern Systems Theory Approach, Prentice Hall, New Jersey, 1973, 548pp.
[5] P. J. Moylan, "Implications of Passivity in a Class of Nonlinear Systems", IEEE Transactions on Automatic Control, Vol. AC-19, No. 4, pp. 373-381, August 1974.
FIGURE 1
Cascade Decomposition of Nonlinear Network
[Block diagram: a passive nonlinear nondynamic coupling network with external port variables u, y, loaded at the ports with variables v, i by passive coupled nonlinear inductors.]
AN ALGEBRA OF OPERATOR NETWORKS
William N. Anderson, Jr. Department of Mathematics
and George E. Trapp
Department of Statistics and Computer Science
West Virginia University Morgantown, West Virginia
Abstract
An algebraic treatment of the foundation of the theory of networks whose elements are linear operators. The operation which determines the input impedance operator from the branch impedance operator is analyzed. Special cases, such as positively connected networks and resistive networks, are considered in detail.
1. INTRODUCTION
In this paper we present an algebraic treatment of the foundations of the theory of networks whose elements are linear operators. This treatment is based on methods which we have earlier used to study the interconnection of networks (4); ultimately these methods derive from the pioneering paper of Bott and Duffin (9). Related treatments are given by Dolezal and Zemanian (13), (14), (30).
The purpose of this study is to formulate precise mathematical statements of the physical assumptions underlying the theory of operator networks, and to show how certain aspects of the theory can be derived from this comparatively simple set of assumptions.
Our primary results concern the relationship between the branch impedance operator of an arbitrary network and the input impedance operator of an associated n-port network. Given a graph with operators in the edges, the branch impedance operator determines branch voltages as a linear function of branch currents. By choosing n pairs of nodes of the graph we may consider this network as an n-port network and seek to determine the associated input impedance operator. Simple graphical conditions, similar to those given by Cederbaum (11), guarantee the existence of the n-port impedance operator. By defining a Kirchhoff subspace to be the set of allowable current flows in the branches and ports, we are able to show that a branch impedance operator and a Kirchhoff subspace give rise to the n-port impedance operator. Not all Kirchhoff subspaces arise from graphs; our theory holds for any such subspace, allowing us to consider matroids and other non-electrical situations, see (9), (15).
We begin in section 2 by reviewing the necessary linear algebra. We then define almost right definite operators. These operators are a natural generalization of Hermitian positive semidefinite operators and are closely related to positive real operators. We discuss various properties of positive real operators; in particular we show that the Moore-Penrose generalized inverse of a positive real operator is again positive real.
Motivated by Kirchhoff's current laws, we define a Kirchhoff subspace in section 3. We show that a natural dual space, the voltage space, exists and obtain matrix representations of these spaces. If A, the branch impedance operator, is a positive real operator, and Γ is a Kirchhoff space, we show that there exists a new positive real operator Φ(A) (the n-port impedance operator). Using the shorted operator (2), we obtain an explicit representation for Φ(A).
The formalism of section 3 is given network interpretations in section 4. In particular, positively connected networks and network duality results are given.
Specializing to resistive networks in section 5, we are able to demonstrate various inequalities concerning Φ(A) by using the classical power minimization principle. Moreover we determine when two Kirchhoff spaces yield the same Φ function. One application of this result is to the study of the interconnection of networks with and without transformers.
Finally, in the last section, we consider some possible generalizations of this work.
2. POSITIVE REAL OPERATORS
Throughout this paper, we will consider finite dimensional complex vector spaces, with inner product denoted by <·,·>. The book by Halmos is a good reference for the linear algebra we use, (19).
If A is a linear operator, then by A* we mean the linear operator defined by <Ax,y> = <x,A*y> for all vectors x and y. We denote the range of A by ran (A), and the null space of A by ker (A). If W is a subspace of V, then W^⊥ is the orthogonal complement of W. A Hermitian operator is a linear operator such that A* = A. We say that a Hermitian operator is positive semidefinite if <Ax,x> ≥ 0 for all vectors x; if moreover A is invertible, we say that A is positive definite. The well known Fredholm alternative theorem, in the form we will use it, states that ran (A) = ker (A*)^⊥; for the special case that A is Hermitian or skew Hermitian, it follows that ran (A) = (ker (A))^⊥.
It is well known that if A is positive semidefinite, then <Ax,x> = 0 only if Ax = 0. Generalizing this result to non-Hermitian operators, we say that the operator A is almost right definite if
i) Re <Ax,x> ≥ 0 for all vectors x
ii) Re <Ax,x> = 0 only if Ax = 0
Lewis and Newman (22) have used a similar definition for operators on real vector spaces; they use the terminology almost positive definite.
LEMMA 2.1: Let A be a linear operator, with A = A_H + A_S, where A_H is Hermitian and A_S is skew Hermitian. Then A is almost right definite if and only if
iii) A_H is positive semidefinite
iv) ran (A_S) ⊆ ran (A_H).
Proof: It is easy to see that Re <Ax,x> = <A_H x,x>, so that i) and iii) are equivalent.
Now suppose that iii) and iv) hold, and that Re <Ax,x> = 0. Then <A_H x,x> = 0, and since A_H is positive semidefinite, it follows that A_H x = 0. By iv) and the Fredholm alternative, ker (A_H) ⊆ ker (A_S). Therefore A_S x = 0, and thus Ax = 0.
Conversely, suppose that i) and ii) hold. Then if ran (A_S) ⊄ ran (A_H), there is a vector x such that A_H x = 0 but A_S x ≠ 0. Then Re <Ax,x> = <A_H x,x> = 0 but Ax ≠ 0, contradicting ii).
QED
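The criterion of Lemma 2.1 can be sketched in code for 2 x 2 complex matrices. This is an illustrative sketch, not part of the paper: the function names and the tolerance are assumptions, and condition iv) is tested in the equivalent form ker (A_H) ⊆ ker (A_S), which follows from the Fredholm alternative for Hermitian and skew Hermitian operators.

```python
def hermitian_skew_parts(A):
    """Split a 2x2 matrix A into Hermitian part (A + A*)/2 and skew part (A - A*)/2."""
    AH = [[(A[i][j] + A[j][i].conjugate()) / 2 for j in range(2)] for i in range(2)]
    AS = [[(A[i][j] - A[j][i].conjugate()) / 2 for j in range(2)] for i in range(2)]
    return AH, AS

def is_almost_right_definite(A, eps=1e-12):
    AH, AS = hermitian_skew_parts(A)
    a, d = AH[0][0].real, AH[1][1].real
    det = a * d - abs(AH[0][1]) ** 2
    # iii) the Hermitian part must be positive semidefinite
    if a < -eps or d < -eps or det < -eps:
        return False
    # iv) ran(AS) inside ran(AH), tested as ker(AH) inside ker(AS)
    if det > eps:
        return True  # AH is invertible, so its kernel is trivial
    rows = [r for r in AH if abs(r[0]) > eps or abs(r[1]) > eps]
    if not rows:
        kernel = [(1, 0), (0, 1)]  # AH = 0: the kernel is the whole space
    else:
        p, q = rows[0]
        kernel = [(-q, p)]         # spans ker(AH) when AH has rank one
    return all(abs(AS[i][0] * k[0] + AS[i][1] * k[1]) <= eps
               for k in kernel for i in range(2))
```

For example, [[1, 1], [-1, 1]] passes because its Hermitian part is the identity, while [[1, 1], [-1, 0]] fails: x = (0, 1) gives Re <Ax,x> = 0 although Ax ≠ 0.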
LEMMA 2.2: Let A and B be almost right definite operators, on vector spaces W_1 and W_2, respectively. Then the operator A ⊕ B on W_1 ⊕ W_2 is almost right definite.
Proof: For any vectors x ∈ W_1 and y ∈ W_2, <(A ⊕ B)(x,y), (x,y)> = <Ax,x> + <By,y>. Since Re <Ax,x> ≥ 0 and Re <By,y> ≥ 0, it follows that Re [<Ax,x> + <By,y>] ≥ 0. Moreover, if Re [<Ax,x> + <By,y>] = 0, then Re <Ax,x> = 0 and Re <By,y> = 0, so that Ax = 0 and By = 0. Therefore (A ⊕ B)(x,y) = 0.
QED
It is an easy consequence of lemma 2.1 that if A is almost right definite, then ran (A) = ran (A_H) and ran (A) = ker (A)^⊥. It follows that A_1 = A | ran (A) maps ran (A) one-to-one onto ran (A). The operator A†, the generalized inverse of A, is defined by
A† | ran (A) = A_1^{-1}
A† | ker (A) = 0
LEMMA 2.3: If A is an almost right definite operator, then A† is almost right definite.
Proof: Let P be the Hermitian projection onto ran (A). For any vector x, there is a vector y ∈ ran (A) such that Ay = Px. Then <A†x,x> = <A†Px,Px> = <y,Ay>. Therefore Re <A†x,x> = Re <y,Ay> ≥ 0. Moreover, if Re <A†x,x> = 0, then 0 = Ay = Px. Therefore x ∈ ker (P) = ker (A), and thus A† is almost right definite.
QED
The generalized inverse we have constructed above is commonly known as the Moore-Penrose generalized inverse. For many of our purposes it would suffice to consider any operator A' such that AA'A = A; our A† is such an operator, but there exist in general many others which are not almost right definite. The various types of generalized inverses are discussed by Albert (1), and Rao and Mitra (24).
Let D denote the open right-half plane of the complex plane. A positive real operator A (z) is an operator valued function defined on D such that
i) A (z) is analytic on D
ii) A (z) is Hermitian on the real axis
iii) Re <A (z) x,x> ≥ 0 for all vectors x and all z ∈ D
LEMMA 2.4: Let A (z) be an analytic operator valued function on D such that A (z) is Hermitian on the real axis. Then A (z) is a positive real operator if and only if A (z) is an almost right definite operator for all z ∈ D.
Proof: If A (z) is almost right definite for all z ∈ D, then iii) holds.
Conversely, suppose that Re <A (z_0) x,x> = 0 for some z_0 ∈ D and vector x ≠ 0. Consider the scalar function f (z) = <A (z) x,x>; then Re f (z_0) = 0. Since Re f (z) is a harmonic function, it follows that either Re (f) = 0 on D or Re f is negative at points arbitrarily close to z_0. The latter condition violates the positive reality of A (z). Therefore <A (z) x,x> = 0 on the real axis; since A (z) is Hermitian there, it follows that A (z) x = 0 on the real axis. Now let y be any vector. Then <A (z) x,y> is an analytic function which is zero on the real axis, and therefore throughout D by analytic continuation. Therefore A (z) x = 0 throughout D; in particular A (z_0) x = 0.
QED
LEMMA 2.5: Let A (z) be a positive real operator. Then A† (z) is a positive real operator.
Proof: In view of lemmas 2.3 and 2.4, we need only show that A† is an analytic function of z. It follows from the proof of lemma 2.4 that ker A (z) is a constant subspace for z ∈ D. Therefore A (z) = A_1 (z) ⊕ 0, where A_1 (z) is an invertible positive real function acting on the subspace R = ran A (z), and 0 is the zero operator on ker A (z). Then A† (z) = A_1^{-1} (z) ⊕ 0; since a true inverse is analytic, so is A† (z).
QED
LEMMA 2.6: Let A (z) be a positive real operator which is a rational function of z. Then A† is a rational function also.
The proof is a simple consequence of the fact that the inverse of a rational function is again rational.
It is not true for general analytic operator functions that the Moore-Penrose generalized inverse is analytic; however, as Bart, Kaashoek, and Lay (8) have shown, an analytic pseudo inverse A' satisfying AA'A = A can always be constructed.
3. KIRCHHOFF SUBSPACES
We will let S and T denote respectively m and n dimensional complex vector spaces, with inner products <·,·>_S and <·,·>_T respectively; let V = S ⊕ T with inner product
$$\langle (s_1,t_1), (s_2,t_2) \rangle_V = \langle s_1,s_2 \rangle_S + \langle t_1,t_2 \rangle_T .$$
We assume that fixed orthonormal bases E = {e_1, ..., e_m} and F = {f_1, ..., f_n} are given for S and T respectively; then E ∪ F is a basis for V. We call the vectors of E branches, and the vectors of F ports. All matrices will be written with respect to these fixed bases.
A Kirchhoff Subspace, abbreviated KS, is a subspace Γ of V such that
i) for every t ∈ T there exists s ∈ S such that (s,t) ∈ Γ
ii) (0,t) ∈ Γ implies t = 0
In terms of networks i) states that any current t may be put in the ports and an internal distribution s exists in the branches; ii) requires that if all currents in the branches are 0, s = 0, then all port currents must be 0, t = 0.
We have previously considered this type of subspace when connecting two n-port resistive networks, see (4). A Kirchhoff subspace is just a formal way to state Kirchhoff's current laws for an arbitrary network.
If Γ is a KS, then we define the dual confluence Γ' by
$$\Gamma' = \{ (\sigma,\tau) \in V \mid \langle \sigma,s \rangle_S = \langle \tau,t \rangle_T \text{ for all } (s,t) \in \Gamma \}$$
We note the dual KS may be thought of as voltages across ports and branches or, in terms of graphs, the currents in the dual graph.
LEMMA 3.1: If Γ is a Kirchhoff space in V, then
a.) Γ'' = Γ
b.) dim Γ + dim Γ' = dim V
c.) Γ' is a KS
The proof is an easy exercise in linear algebra, and is omitted.
It will be convenient to represent Γ by certain matrices; recall in this regard that fixed orthonormal bases have been chosen for S and T.
LEMMA 3.2: There exists an n x m matrix K and a matrix L with m columns such that the columns of the matrix
$$J_\Gamma = \begin{bmatrix} K^* & L^* \\ I & 0 \end{bmatrix}$$
form a basis for Γ. Moreover, n ≤ m.
Proof: By condition i) of the definition of a KS, we can choose the first n columns of J_Γ as shown; the remaining columns are chosen so as to be a basis for the set of vectors in Γ which are of the form (s,0). The inequality n ≤ m follows from parts b and c of lemma 3.1.
QED
In a similar manner, we let the columns of the matrix
$$J_{\Gamma'} = \begin{bmatrix} M^* & N^* \\ I & 0 \end{bmatrix}$$
be a basis for Γ'.
LEMMA 3.3: The vector (s,t) ∈ Γ if and only if
$$\begin{bmatrix} M \\ N \end{bmatrix} s = \begin{bmatrix} t \\ 0 \end{bmatrix},$$
if and only if there is a vector λ such that
$$\begin{bmatrix} K^* & L^* \end{bmatrix} \begin{bmatrix} t \\ \lambda \end{bmatrix} = s.$$
The vector (σ,τ) ∈ Γ' if and only if
$$\begin{bmatrix} K \\ L \end{bmatrix} \sigma = \begin{bmatrix} \tau \\ 0 \end{bmatrix},$$
if and only if there is a vector μ such that
$$\begin{bmatrix} M^* & N^* \end{bmatrix} \begin{bmatrix} \tau \\ \mu \end{bmatrix} = \sigma.$$
The proof follows by direct computation from the definitions of J_Γ and J_Γ'.
THEOREM 3.4: Let Γ be a KS and A an almost right definite operator on S. Then given a vector c ∈ T, there exists a vector a ∈ S and a unique vector γ ∈ T such that (a,c) ∈ Γ and (Aa,γ) ∈ Γ'.
Proof: Using lemma 3.3, we wish to solve the equations
$$\begin{bmatrix} 0 & 0 & M \\ 0 & 0 & N \\ M^* & N^* & -A \end{bmatrix} \begin{bmatrix} \gamma \\ \mu \\ a \end{bmatrix} = \begin{bmatrix} c \\ 0 \\ 0 \end{bmatrix} \qquad (3.1)$$
We will employ the Fredholm Alternative Theorem. The homogeneous adjoint system to (3.1) is
$$\begin{bmatrix} 0 & 0 & M \\ 0 & 0 & N \\ M^* & N^* & -A^* \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ u_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \qquad (3.2)$$
For any solution to (3.2), we have
$$\begin{bmatrix} M^* & N^* \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = A^* u_3 \qquad (3.3)$$
Then A*u_3 is orthogonal to all solutions to
$$\begin{bmatrix} M \\ N \end{bmatrix} w = 0;$$
since u_3 is such a solution we have <u_3, A*u_3> = 0. Since A is almost right definite, it follows that A*u_3 = 0, so that the right hand side of (3.3) is 0, and thus (3.3) is the homogeneous adjoint system to
$$\begin{bmatrix} M \\ N \end{bmatrix} a = \begin{bmatrix} c \\ 0 \end{bmatrix} \qquad (3.4)$$
By the definition of KS, system (3.4) has a solution for all vectors c, and thus <u_1,c> = 0 for all c. Thus u_1 = 0, so that (3.1) has a solution. Moreover, since γ = 0 for c = 0, it follows that γ is uniquely determined by c.
Definition: Let Γ be a KS and A an almost right definite operator. Φ (A) is the operator defined by Φ (A) c = γ.
THEOREM 3.5: Let A be an almost right definite matrix and let Γ be a KS. Then Φ (A) is an almost right definite matrix.
Proof: Given a vector c, there exists an a_0 so that (a_0,c) ∈ Γ and (Aa_0,γ) ∈ Γ'. Therefore <Aa_0,a_0> = <γ,c> = <Φ (A)c,c>. Therefore Re <Φ (A)c,c> ≥ 0. Now if Re <Φ (A)c,c> = 0, then Re <Aa_0,a_0> = 0 so that Aa_0 = 0. Since Γ' is a confluence, it follows that γ = 0; that is Φ (A)c = 0.
QED
We will derive an explicit expression for the function Φ (A) in terms of a special function known as the shorted operator. Let A be an almost right definite matrix, partitioned
$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$
where A_11 and A_22 are square. Then the shorted operator, S(A), is defined by
$$S(A) = A_{11} - A_{12} A_{22}^{\dagger} A_{21} \qquad (3.5)$$
In previous papers we derived properties of the shorted operator - for the special case of positive semidefinite A - using a variational problem (2), (6). Here we use a KS. The formula (3.5) itself is old, and finds application in many areas not related to this paper (10), (12).
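Formula (3.5) is easy to evaluate when the block A_22 is a single scalar. The sketch below assumes exactly that special case (A_22 is the last diagonal entry), uses exact rational arithmetic, and takes the generalized inverse of a zero scalar to be zero; the function name is hypothetical.

```python
from fractions import Fraction

def shorted(A):
    """Shorted operator S(A) = A11 - A12 A22^+ A21 for a square matrix A
    (list of rows) whose last row and column form the 1x1 block A22."""
    k = len(A) - 1
    a22 = Fraction(A[k][k])
    # Generalized inverse of a scalar: reciprocal if nonzero, else zero.
    a22_pinv = 1 / a22 if a22 != 0 else Fraction(0)
    return [[Fraction(A[i][j]) - A[i][k] * a22_pinv * A[k][j] for j in range(k)]
            for i in range(k)]

# Example with a positive definite matrix; the last index is "shorted out".
A = [[2, 1, 1],
     [1, 2, 1],
     [1, 1, 2]]
S = shorted(A)
```

When A_22 is invertible this is the familiar Schur complement of A_22 in A.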
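THEOREM 3.6: Let Γ be a confluence and A an almost right definite operator. Then
$$\Phi(A) = S\left( \begin{bmatrix} K \\ L \end{bmatrix} A \begin{bmatrix} K^* & L^* \end{bmatrix} \right)$$
Proof: By theorem 3.4, given a vector c ∈ T, there is a vector a ∈ S and a unique vector γ ∈ T such that (a,c) ∈ Γ and (Aa,γ) ∈ Γ'. Using lemma 3.3, we find that there exists a vector λ such that
$$\begin{bmatrix} K^* & L^* \end{bmatrix} \begin{bmatrix} c \\ \lambda \end{bmatrix} = a.$$
Moreover
$$\begin{bmatrix} K \\ L \end{bmatrix} Aa = \begin{bmatrix} \gamma \\ 0 \end{bmatrix},$$
so that
$$\begin{bmatrix} \gamma \\ 0 \end{bmatrix} = \begin{bmatrix} K \\ L \end{bmatrix} A \begin{bmatrix} K^* & L^* \end{bmatrix} \begin{bmatrix} c \\ \lambda \end{bmatrix} \qquad (3.6)$$
In particular, 0 = LAK*c + LAL*λ. Since λ is known to exist for any c, we can solve for one possible choice of λ: λ = -(LAL*)† LAK*c. Substituting for λ in (3.6), we have γ = (KAK* - KAL* (LAL*)† LAK*) c, which is the theorem.
QED
In the special case that K = L = I, we have Φ (A) = S(A).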
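The expression obtained in the proof, Φ(A) = KAK* - KAL*(LAL*)†LAK*, can be tried on a familiar circuit. The sketch below is an illustration under assumptions: real matrices only, L restricted to at most one row, and two hypothetical Kirchhoff subspaces for two resistors seen from one port (parallel: s_1 + s_2 = t, with K = [1 0], L = [1 -1]; series: s_1 = s_2 = t, with K = [1 1] and no L block).

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def port_impedance(K, L, A):
    """Phi(A) = K A K* - K A L* (L A L*)^+ L A K* for real matrices.

    L must be empty or consist of a single row, so that L A L* is a scalar
    and its generalized inverse is a reciprocal (or zero).
    """
    Kt = transpose(K)
    KAK = matmul(matmul(K, A), Kt)
    if not L:
        return KAK
    Lt = transpose(L)
    KAL = matmul(matmul(K, A), Lt)            # n x 1
    LAK = matmul(matmul(L, A), Kt)            # 1 x n
    lal = matmul(matmul(L, A), Lt)[0][0]      # scalar L A L*
    pinv = 1 / Fraction(lal) if lal != 0 else Fraction(0)
    corr = matmul(KAL, [[pinv * v for v in LAK[0]]])
    n = len(KAK)
    return [[KAK[i][j] - corr[i][j] for j in range(n)] for i in range(n)]

A = [[2, 0], [0, 3]]                          # branch resistances 2 and 3
parallel = port_impedance([[1, 0]], [[1, -1]], A)
series = port_impedance([[1, 1]], [], A)
```

The results are the textbook values: 6/5 for the parallel connection and 5 for the series connection.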
If A (z) is a positive real operator, then LA (z) L* is also a positive real operator. Lemma 2.5 yields that (LA (z) L*)† is analytic and hence Φ (A (z)) is analytic. Then theorem 3.5 and lemma 2.4 imply that Φ (A (z)) is positive real. We have therefore demonstrated the following theorem.
THEOREM 3.7: Let A (z) be a positive real operator and let Γ be a KS. Then Φ (A (z)) is a positive real operator.
The preceding theorem deals with the case where A varies as a function of the scalar variable z; one can also consider general matrix variations. In this connection let us recall that a function f is said to be (Fréchet) differentiable if there is a linear function L such that f(x_0 + h) = f(x_0) + Lh + o(‖h‖); for scalar or vector functions this Fréchet derivative is of course the usual derivative, and L is commonly expressed as the Jacobian matrix. However, it is not necessary to express L as a matrix; in writing the derivative of the shorted operator it proves convenient not to do so. The chain rule for derivatives is usually expressed in terms of matrix multiplication; this convenience is due to the fact that multiplication of matrices is equivalent to composition of the corresponding linear functions. When we use the chain rule later, we will have to express this composition of functions directly, since we have not written the derivative of the shorted operator as a Jacobian matrix.
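LEMMA 3.8: Let $A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$ be an almost positive definite matrix such that $A_{22}$ is invertible, and let $E = \begin{bmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{bmatrix}$ be a matrix such that $A_{22} + E_{22}$ is invertible. Then
(3.7) $S(A+E) = S(A) + \begin{bmatrix} I & -A_{12}A_{22}^{-1} \end{bmatrix} \begin{bmatrix} E_{11} & E_{12} \\ E_{21} & E_{22} \end{bmatrix} \begin{bmatrix} I \\ -A_{22}^{-1}A_{21} \end{bmatrix} - (A_{12}A_{22}^{-1}E_{22} - E_{12})(A_{22} + E_{22})^{-1}(E_{22}A_{22}^{-1}A_{21} - E_{21})$.
The proof is a direct computation.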
THEOREM 3.9: Let A be an almost right definite operator such that $A_{22}$ is invertible. Then the shorted operator S(A) is a differentiable function at A.
Proof: For sufficiently small E, $A_{22} + E_{22}$ will be invertible, and the final term in (3.7) is $o(\|E\|)$. Therefore the middle term in (3.7) is the desired linear approximation.
QED
Theorem 3.9 of course implies that if $A_{22}$ is invertible, then S(A) is continuous at A. In the absence of invertibility, S(A) need not be continuous; an example is given in (6).
If higher derivatives are desired, one may expand $(A_{22} + E_{22})^{-1}$ in a Neumann series.
The concept of duality is important in network theory. In the present context, the duality is expressed in terms of dual KS. That is, if $\Gamma$ is a KS and $\Phi$ is the corresponding matrix operation, we consider the dual KS $\Gamma'$ and the corresponding matrix operation $\Phi'$.
THEOREM 3.10: Let $\Gamma$ be a KS and $\Gamma'$ be the dual KS. Let $\Phi$ and $\Phi'$ be the corresponding matrix operations. Then if A is an invertible almost right definite operator, $\Phi(A) = (\Phi'(A^{-1}))^{-1}$.
Proof: Let $y = \Phi(A)c$. Then there exists a vector a such that $(a,c) \in \Gamma$ and $(Aa,y) \in \Gamma'$. Let $\alpha = Aa$; then $(A^{-1}\alpha,c) \in \Gamma'' = \Gamma$ and $(\alpha,y) \in \Gamma'$. Therefore $\Phi'(A^{-1})y = c$, so that $\Phi'(A^{-1})\Phi(A) = I$.
QED
4. OPERATOR NETWORKS
An operator network is a pair $(\Gamma,A)$ where $\Gamma$ is a Kirchhoff space and A = A(z) is a positive real operator acting on the vector space E, that is, the space spanned by the branches. The operator A is called the branch impedance operator.
Given a vector $c \in T$, called the port current vector, we wish to find a branch current vector $a \in S$ and a port voltage vector $y \in T$ such that $(a,c) \in \Gamma$ and $(Aa,y) \in \Gamma'$. The correspondence between c and y is given by the equation $y = \Phi(A)c$; the operator $\Phi(A)$ is called the input impedance operator.
In this section we will prove some basic theorems about operator networks, and discuss some important examples. The fundamental theorem is:
THEOREM 4.1: Let $(\Gamma,A)$ be an operator network. Then for any port current vector c there is a unique port voltage y, and a branch current vector a such that $(a,c) \in \Gamma$ and $(Aa,y) \in \Gamma'$. The operator $\Phi(A)$ defined by the equation $y = \Phi(A)c$ is a positive real operator.
Proof: This is merely a restatement of theorems 3.4 and 3.7.
The operator network $(\Gamma,A)$ is said to be positively connected if $LAL^*$ is invertible, equivalently if $\ker(AL^*) = 0$. This definition would appear to depend on the choice of L; the independence from the choice of L follows from the fact that the columns of $L^*$ form a basis for a uniquely determined subspace of S. In particular, if A is invertible, then $(\Gamma,A)$ is positively connected.
THEOREM 4.2: Let $(\Gamma,A)$ be a positively connected operator network. Then the branch current vector a is uniquely determined by the port current vector c.
Proof: By assumption $LAL^*$ is invertible. Then we can solve (3.6) for $\lambda$, obtaining $\lambda = -(LAL^*)^{-1}LAK^*c$, and then
(4.1) $a = \begin{bmatrix} K^* & L^* \end{bmatrix} \begin{bmatrix} c \\ \lambda \end{bmatrix} = (K^* - L^*(LAL^*)^{-1}LAK^*)c$.
QED
Following Cederbaum (11), we call $\tilde{K} = (K^* - L^*(LAL^*)^{\dagger}LAK^*)^*$ the modified circuit matrix of the operator network $(\Gamma,A)$. An extensive study of the properties of $\tilde{K}$ has been made by Thulasiraman and Murti (27).
THEOREM 4.3: Let $(\Gamma,A)$ be a positively connected operator network. Then $\Phi(A)$ is a differentiable function of A.
Proof: Since $LAL^*$ is invertible, theorem 3.9 applies. Moreover, in view of (3.7), we see that the linear approximation to $\Phi(A + E)$ is $\Phi(A) + \tilde{K}E\tilde{K}^*$.
QED
For the special case of an n-port network, this derivative formula has been given by Cederbaum (11).
One important example of a general variation in A is given by letting the impedance in a single branch vary. A particularly simple formula is available in this case. Let the branch be branch i; we will use the notation $d\Phi(A)/da_{ii}$ for this derivative.
THEOREM 4.4: Let $(\Gamma,A)$ be a positively connected operator network. Let $\alpha_{ij}$ be the current in branch i corresponding to a port current of 1 at port j and 0 at the other ports. Then
$\frac{d\Phi(A)}{da_{ii}} = [\alpha_{i1}, \ldots, \alpha_{im}]^* \, [\alpha_{i1}, \ldots, \alpha_{im}]$.
Proof: For ease of notation, assume that i = 1. Then by theorem 4.3 and the chain rule, the derivative is $\tilde{K} \, \mathrm{diag}(1, 0, \ldots, 0) \, \tilde{K}^*$. By theorem 4.2, the vector $[\alpha_{11}, \ldots, \alpha_{1m}]$ is the first row of $\tilde{K}^*$. But this row is the product $[1 \; 0 \; \cdots \; 0]\tilde{K}^*$.
QED
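Formula (4.1) and theorem 4.4 can be illustrated on the smallest non-trivial case, two resistors in parallel driven through one port. The sketch below assumes K = [1 0] and L = [1 -1] (a hypothetical choice for this network, not taken from the paper), so that (4.1) reduces to the usual current divider, and it compares a central finite difference of the input resistance with the square of the branch-1 current for unit port current.

```python
def branch_currents(r1, r2, c=1.0):
    """Branch currents from (4.1), with assumed K = [1, 0] and L = [1, -1]."""
    lam = -r1 / (r1 + r2) * c      # lambda = -(L A L*)^{-1} L A K* c
    return (c + lam, -lam)         # a = K* c + L* lambda


def input_impedance(r1, r2):
    """Phi(A) for A = diag(r1, r2): the parallel resistance."""
    return r1 * r2 / (r1 + r2)


def d_impedance_d_r1(r1, r2, h=1e-6):
    """Central finite-difference estimate of d Phi / d a_11."""
    return (input_impedance(r1 + h, r2) - input_impedance(r1 - h, r2)) / (2 * h)
```

With r1 = 2 and r2 = 3 the branch currents for unit port current are 0.6 and 0.4, and the finite-difference derivative is close to 0.6 squared, which is what theorem 4.4 gives for a single port.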
As mentioned previously, one familiar example of an operator network is an n-port network. Let G' be an oriented graph with m + n edges, m of which are called port edges. Let P denote the set of port edges, and let G be the subgraph of G' containing the edges of G' which are not in P. We assume that the set P contains neither a circuit nor a cocircuit of G'. The branches $e_i$ then correspond to the edges of G, and the ports $f_i$ to the port edges.
The current space of G' is in fact a KS $\Gamma$. The hypothesis that P contains no cocircuit yields condition i of the KS definition, and the hypothesis that P contains no circuit yields condition ii. The voltage space of G' is also a KS $\Gamma^{\perp}$. The KS $\Gamma'$, the dual to $\Gamma$, is defined by $(\sigma,\tau) \in \Gamma'$ if and only if $(\sigma,-\tau) \in \Gamma^{\perp}$. The duality of $\Gamma$ and $\Gamma'$ here follows from the fact that $\Gamma$ and $\Gamma^{\perp}$ are orthogonal complements.
The matrix $J_{\Gamma}$ may be found by imbedding P in a cotree of G' and then letting the columns of $J_{\Gamma}$ be vectors corresponding to the fundamental basis of circuits relative to the cotree. Similarly, form $J_{\Gamma'}$ by imbedding P in a tree. This construction is described in more detail by Cederbaum (11).
In the electrical interpretation, the edges of G are inside a "black box", and the port edges are ordered pairs of terminals which are accessible from the outside. If $(s,t) \in \Gamma$, then t is a vector of current sources at the ports of the network, and s is the vector of branch currents inside the "black box". Similarly, if $(\sigma,\tau) \in \Gamma'$ then $\tau$ is the vector of port voltages and $\sigma$ is the vector of branch voltages. The duality of $\Gamma$ and $\Gamma'$ expresses the physical principle that power measured at the ports is the same as power measured at the branches. The matrix A has diagonal entries with $a_{ii} = R > 0$ for a resistance; $a_{ii} = Lz$ for an inductance; and $a_{ii} = 1/(Cz)$ for a capacitance. Mutual inductance will result in off-diagonal entries, but in any case A(z) will be positive real.
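As a small illustration of the diagonal entries just listed, the following Python sketch builds the diagonal of A(z) for a list of R, L and C branches and lets one inspect the real parts at a point of the right half plane. The branch encoding (a kind letter plus a value) and the function names are hypothetical; the check used here, non-negative real part when Re z > 0, is only the scalar part of the positive real condition and ignores mutual inductance.

```python
def branch_impedance(kind, value, z):
    """Impedance a_ii of a single branch at complex frequency z."""
    if kind == "R":
        return complex(value)          # resistance: a_ii = R
    if kind == "L":
        return value * z               # inductance: a_ii = L z
    if kind == "C":
        return 1 / (value * z)         # capacitance: a_ii = 1 / (C z)
    raise ValueError(f"unknown branch kind: {kind!r}")


def diagonal_of_A(branches, z):
    """Diagonal entries of A(z) for branches given as (kind, value) pairs."""
    return [branch_impedance(kind, value, z) for kind, value in branches]


def has_nonnegative_real_parts(branches, z):
    """True if every diagonal entry of A(z) has non-negative real part."""
    return all(entry.real >= 0 for entry in diagonal_of_A(branches, z))
```

For instance, `diagonal_of_A([("R", 2.0), ("L", 0.5), ("C", 0.1)], 1 + 2j)` gives one entry per branch, each with non-negative real part because Re z > 0.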
The term operator network is more commonly applied to the generalization of this example in which the edge impedances are operators. Then each entry of 1 in the $J_{\Gamma}$ matrix is replaced by an identity matrix of appropriate size, and the 0's of $J_{\Gamma}$ are replaced by zero matrices. The operator A is then block diagonal, unless mutual inductances are allowed. Special cases of this generalization are the series and parallel connection of networks (3); the more general connections treated in (4) do not necessarily admit such simple graphical models.
Instead of starting with a graph and obtaining a KS, we now start with the KS, and reversing the earlier procedure, we obtain a generalization of a graph, known as a matroid. Let $\Gamma$ be a confluence. If $x \in \Gamma$, the support of x is the set of branches and ports corresponding to the non-zero components of x. A circuit is a minimal non-empty support set; the set of circuits forms a matroid. This definition is given by Tutte (28); there are many well known equivalent definitions (23), (29). If instead of $\Gamma$ we consider $\Gamma'$, then we speak of cocircuits and the dual matroid. (Tutte uses $\Gamma^{\perp}$ instead of $\Gamma'$, but it is easy to see, and is a special case of theorem 10 of (9), that the two definitions are equivalent.)
THEOREM 4.5: Let $(\Gamma,A)$ be a resistive operator network where A is a diagonal matrix. Then the network is positively connected if and only if every circuit containing only branches contains at least one branch with non-zero resistance.
Proof: Write R for the diagonal matrix A. Let $t \in T$ be a non-zero vector such that $RL^*t = 0$. Then every branch in the support of $L^*t$ has zero resistance, so that any circuit contained in the support of $L^*t$ will contain only zero resistors.
Conversely, if some circuit contains only zero resistors, then this circuit is the support of a vector $s \in \Gamma$ such that Rs = 0. Since the columns of $L^*$ are a basis for the vectors in $\Gamma$ whose support is contained in the set of branches, the vector $s = L^*t$ for some vector t; therefore $RL^*t = 0$.
QED
The assumption that every circuit contains one non-zero resistor is common in electrical network literature (18), (23); the term "positively connected" is due to Duffin (17), where an equivalent definition, in terms of cocircuits, is given.
Minty has proved theorems 4.1 and 4.5 in the special case that the matroid is regular (23); he uses the term digraphoid instead.
5. RESISTIVE OPERATOR NETWORKS
Let $(\Gamma,A)$ be an operator network. We say that the network is resistive if the matrix A is a constant matrix. In this case A must of course be Hermitian and positive semidefinite.
A classical variational principle states that in a resistive network the current always takes the path of least resistance. In the case of resistive operator networks we have an algebraic formulation of this principle.
THEOREM 5.1: Let $(\Gamma,A)$ be a resistive operator network, and $\Phi$ the associated matrix operation. Let c be an arbitrary vector and $y = \Phi(A)c$. Then for any vector $(a,c) \in \Gamma$,
(5.1) $\langle Aa,a \rangle \ge \langle y,c \rangle$.
Moreover, equality holds if and only if $(Aa,y) \in \Gamma'$.
Proof: By the definition of KS equality will hold if $(Aa,y) \in \Gamma'$. In fact, by theorem 3.4 there exists an $a_0$ such that $(a_0,c) \in \Gamma$ and $(Aa_0,y) \in \Gamma'$. Let $t = a - a_0$. Since $\Gamma$ is a vector space $(t,0) \in \Gamma$, and thus $\langle Aa_0,t \rangle = 0$. Therefore $\langle Aa,a \rangle = \langle A(a_0 + t), a_0 + t \rangle = \langle Aa_0,a_0 \rangle + \langle At,t \rangle \ge \langle Aa_0,a_0 \rangle = \langle y,c \rangle$. Thus the inequality is established. Moreover, if equality holds, then $\langle At,t \rangle = 0$. Since A is positive semidefinite, it follows that At = 0, that is $Aa = Aa_0$, so that $(Aa,y) = (Aa_0,y) \in \Gamma'$.
QED
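Inequality (5.1) is easy to see on two resistors in parallel. In the sketch below, which is an illustration under the assumption that the Kirchhoff space is the set of (a1, a2, c) with a1 + a2 = c, any split of the port current c between the two branches dissipates at least the port power, and the current-divider split attains equality. The function names are hypothetical.

```python
from fractions import Fraction


def dissipated_power(r1, r2, a1, c):
    """<Aa, a> for A = diag(r1, r2) and branch currents a = (a1, c - a1)."""
    a2 = c - a1
    return r1 * a1 * a1 + r2 * a2 * a2


def port_power(r1, r2, c):
    """<y, c> with y = Phi(A) c and Phi(A) = r1 r2 / (r1 + r2)."""
    r1, r2 = Fraction(r1), Fraction(r2)
    return r1 * r2 / (r1 + r2) * c * c


def optimal_split(r1, r2, c):
    """Branch-1 current of the actual solution (the current divider)."""
    r1, r2 = Fraction(r1), Fraction(r2)
    return r2 / (r1 + r2) * c
```

Scanning `dissipated_power(2, 3, a1, 1)` over several values of a1 shows the minimum 6/5 at a1 = 3/5, matching `port_power(2, 3, 1)`.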
If A and B are positive semidefinite operators, we say that $A \ge B$ if A - B is positive semidefinite. Another expression of this variational principle can be given in terms of this partial order of operators.
THEOREM 5.2: Let $(\Gamma,A)$ and $(\Gamma,B)$ be resistive operator networks, with $\Phi$ the corresponding matrix operation. Then $\Phi(A + B) \ge \Phi(A) + \Phi(B)$.
Proof: The special case for $\Phi$ being the shorted operator is theorem 5.1 of (2). The result then follows from theorem 3.6.
QED
An alternative proof, based on theorem 5.1, can be
constructed analogous to the proof of theorem J.j
Nishio and Ando have shown how some of these inequalities actually characterize certain matrix operations as deriving from network models (25).
Corollary 1: The input impedance operator is a monotone function of the branch impedance operator.
Proof: If the branch impedance operators are A and C, with $A \le C$, then C - A is also positive semidefinite. Therefore $\Phi(C) = \Phi(A + (C - A)) \ge \Phi(A) + \Phi(C - A) \ge \Phi(A)$.
QED
Corollary 2: The input impedance operator is a concave function of the branch impedance operator.
Proof: If $0 \le \alpha \le 1$, then $\Phi(\alpha A + (1 - \alpha)B) \ge \Phi(\alpha A) + \Phi((1 - \alpha)B) = \alpha\Phi(A) + (1 - \alpha)\Phi(B)$.
QED
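Theorem 5.2 and its two corollaries can be tried out on the parallel connection of two scalar resistors, where the matrix operation is assumed to be the parallel sum a1 a2/(a1 + a2). The sketch below is only an illustration with hypothetical function names; it returns the gaps in the superadditivity and concavity inequalities as exact rationals, and both should be non-negative.

```python
from fractions import Fraction


def phi(a):
    """Parallel-sum impedance for A = diag(a[0], a[1]) with non-negative entries."""
    a1, a2 = Fraction(a[0]), Fraction(a[1])
    return Fraction(0) if a1 + a2 == 0 else a1 * a2 / (a1 + a2)


def superadditivity_gap(a, b):
    """Phi(A + B) - Phi(A) - Phi(B); theorem 5.2 says this is >= 0."""
    total = (Fraction(a[0]) + Fraction(b[0]), Fraction(a[1]) + Fraction(b[1]))
    return phi(total) - phi(a) - phi(b)


def concavity_gap(a, b, t):
    """Phi(tA + (1-t)B) - t Phi(A) - (1-t) Phi(B) for 0 <= t <= 1."""
    t = Fraction(t)
    mix = tuple(t * Fraction(x) + (1 - t) * Fraction(y) for x, y in zip(a, b))
    return phi(mix) - t * phi(a) - (1 - t) * phi(b)
```

The superadditivity gap vanishes when B is a scalar multiple of A, for example A = diag(1, 2) and B = diag(2, 4), and is strictly positive for A = diag(1, 4) and B = diag(3, 1).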
Dolezal has proved theorems similar to these two corollaries, see (13), (14).
In addition to defining a partial order of operators, we can define an order relation on the matrix operations derived from Kirchhoff spaces. Let $\Gamma_1$ and $\Gamma_2$ be Kirchhoff spaces and $\Phi_1$, $\Phi_2$ the corresponding matrix operations. We define $\Phi_1 \le \Phi_2$ if $\Phi_1(A) \le \Phi_2(A)$ for all positive semidefinite operators A.
In the next theorem we give a condition on the Kirchhoff spaces which is equivalent to the condition $\Phi_1 \le \Phi_2$. In order to prove our theorem we need two results on operator ranges. First, it is well known that if A and B are positive semidefinite, and $A \le B$, then $\operatorname{ran}(A) \subseteq \operatorname{ran}(B)$. Second,
theorem 1 of ( ::) states that if A is positive
semidefinite, and S is a subspace, then $\operatorname{ran}(S(A)) = \operatorname{ran}(A) \cap S$.
THEOREM 5.3: Let $\Phi_1$ and $\Phi_2$ be matrix operations derived from the Kirchhoff spaces $\Gamma_1$ and $\Gamma_2$ respectively. Then $\Phi_1 \le \Phi_2$ if and only if there is a constant k with $|k| \ge 1$ such that for every vector $(a,c) \in \Gamma_2$, the vector $(a,kc) \in \Gamma_1$.
Proof: Suppose that there is such a k. Let A be a positive semidefinite operator. Then given an arbitrary port current vector c, there exists a vector a such that $(a,c) \in \Gamma_2$ and $(Aa,y) \in \Gamma_2'$, where $y = \Phi_2(A)c$. By hypothesis, $(k^{-1}a,c) \in \Gamma_1$, so that by theorem 5.1, $\langle \Phi_1(A)c,c \rangle \le \langle Ak^{-1}a,k^{-1}a \rangle = |k|^{-2}\langle Aa,a \rangle = |k|^{-2}\langle \Phi_2(A)c,c \rangle$. Since $|k| \ge 1$, it follows that $\Phi_1(A) \le \Phi_2(A)$ for all A.
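Conversely, suppose that $\Phi_1(A) \le \Phi_2(A)$ for all positive semidefinite operators A. Let us consider an arbitrary vector $(\alpha,\gamma_1) \in \Gamma_1'$ with $\gamma_1 \ne 0$. Then $\alpha\alpha^*$ is a positive semidefinite operator, and thus $\Phi_1(\alpha\alpha^*) \le \Phi_2(\alpha\alpha^*)$ by hypothesis. Computing these operators, we have by theorem 3.6 and lemma 3.3
$\Phi_1(\alpha\alpha^*) = S\left( \begin{bmatrix} K_1 \\ L_1 \end{bmatrix} \alpha\alpha^* \begin{bmatrix} K_1^* & L_1^* \end{bmatrix} \right) = \gamma_1\gamma_1^*$
and
$\Phi_2(\alpha\alpha^*) = S\left( \begin{bmatrix} K_2 \\ L_2 \end{bmatrix} \alpha\alpha^* \begin{bmatrix} K_2^* & L_2^* \end{bmatrix} \right) = S\left( \begin{bmatrix} \gamma_2 \\ \beta_2 \end{bmatrix} \begin{bmatrix} \gamma_2^* & \beta_2^* \end{bmatrix} \right) = S\left( \begin{bmatrix} \gamma_2\gamma_2^* & \gamma_2\beta_2^* \\ \beta_2\gamma_2^* & \beta_2\beta_2^* \end{bmatrix} \right)$
for some $\gamma_2$, $\beta_2$. We notice that if $\beta_2 \ne 0$ then, since $\begin{bmatrix} \gamma_2 \\ \beta_2 \end{bmatrix} \begin{bmatrix} \gamma_2^* & \beta_2^* \end{bmatrix}$ has rank 1 and because of the aforementioned result on the range of the shorted operator, we must have $\Phi_2(\alpha\alpha^*) = 0$. However $\Phi_1(\alpha\alpha^*)$ is non-zero, and $\Phi_1 \le \Phi_2$; thus $\beta_2 = 0$ and therefore $\Phi_2(\alpha\alpha^*) = \gamma_2\gamma_2^*$. Moreover, since $\Phi_1(\alpha\alpha^*) = \gamma_1\gamma_1^*$, we must then have $\operatorname{ran}(\gamma_1\gamma_1^*) \subseteq \operatorname{ran}(\gamma_2\gamma_2^*)$. Since these ranges are one-dimensional, it follows that $\gamma_2 = k\gamma_1$; furthermore $|k| \ge 1$ since $\gamma_1\gamma_1^* \le \gamma_2\gamma_2^*$.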
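Conceivably k depends on $\alpha$. However let $\gamma$ and $\delta$ be two linearly independent vectors with $(\alpha,\gamma) \in \Gamma_1'$ and $(\beta,\delta) \in \Gamma_1'$. Then for some constants $k_1$, $k_2$ and $k_3$, $(\alpha,k_1\gamma) \in \Gamma_2'$, $(\beta,k_2\delta) \in \Gamma_2'$, and $(\alpha + \beta, k_3(\gamma + \delta)) \in \Gamma_2'$. Since $\Gamma_2'$ is a subspace, it follows that $(0, k_3(\delta + \gamma) - k_1\gamma - k_2\delta) \in \Gamma_2'$, and since $\Gamma_2'$ is a Kirchhoff space, it follows that $(k_3 - k_2)\delta + (k_3 - k_1)\gamma = 0$. But $\delta$ and $\gamma$ are linearly independent; thus $k_1 = k_2 = k_3$.
We now know that if $(\alpha,\gamma) \in \Gamma_1'$, and $\gamma \ne 0$, then $(\alpha,k\gamma) \in \Gamma_2'$. We need to show that if $(\alpha,0) \in \Gamma_1'$, then also $(\alpha,0) \in \Gamma_2'$. To see this, consider some $(\beta,\gamma) \in \Gamma_1'$ with $\gamma \ne 0$. Then $(\beta,k\gamma) \in \Gamma_2'$. Since $\Gamma_1'$ is a subspace, $(\alpha + \beta,\gamma) \in \Gamma_1'$, so that $(\alpha + \beta,k\gamma) \in \Gamma_2'$. Since $\Gamma_2'$ is a subspace, we have $(\alpha + \beta,k\gamma) - (\beta,k\gamma) = (\alpha,0) \in \Gamma_2'$.
We have proved that if $(\alpha,\gamma) \in \Gamma_1'$ then $(\alpha,k\gamma) \in \Gamma_2'$; the theorem however refers to $\Gamma_1$ and $\Gamma_2$. To complete the proof, consider a vector $(a,c) \in \Gamma_2$. Then for any vector $(\alpha,\gamma) \in \Gamma_1'$, we have $(\alpha,k\gamma) \in \Gamma_2'$. By the definition of Kirchhoff space, it follows that $\langle a,\alpha \rangle = \langle c,k\gamma \rangle = \langle k^*c,\gamma \rangle$. Therefore, by the definition of Kirchhoff space, $(a,k^*c) \in \Gamma_1$. Since $|k^*| = |k| \ge 1$, the theorem is proved.
QED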
Corollary: Let $\Gamma_1$ and $\Gamma_2$ be Kirchhoff spaces, and $\Phi_1$ and $\Phi_2$ the corresponding matrix operations. Then $\Phi_1(A) = \Phi_2(A)$ for all positive real operators A if and only if there is a constant k with $|k| = 1$ such that the vector $(a,c) \in \Gamma_1$ if and only if the vector $(a,kc) \in \Gamma_2$.
Proof: If equality holds for all positive real A, it holds for all positive semidefinite A. Then $\Phi_1 \le \Phi_2$ and $\Phi_2 \le \Phi_1$ and the theorem applies. The converse is proved as in the first part of the proof of the theorem.
Of somewhat more interest is the case where the operator network is derived from a graph, with scalar resistors or operators allowed in the edges. In terms of A, this means that A is restricted to be a diagonal or block diagonal matrix. We conjecture that in this case an analogue to theorem 5.3 is true, except that there is a separate constant for each edge. That the existence of such constants implies the inequality is clear, but we have no proof for the converse.
Even the case of equality seems difficult; the
scalar case has been proved by Bott - Duffin (9),
theorems, (9) but their proof does not seem to
generalize.
An important special case of theorem 5.3 arises when $\Gamma_2 \subseteq \Gamma_1$; that is k = 1. This case arises, for example, in treating the connection of n-port networks with and without ideal transformers. For any type of connection, series, parallel, hybrid, cascade or even more general, the use of transformers merely restricts the space of available currents. Thus by the theorem the impedance of the connection without transformers is always less than or equal to the impedance with the use of transformers. We have discussed the inequalities for the parallel connection in (7); conditions for equality are given there in terms of the topology of the networks. Other authors have treated the case of equality, without discussing the inequalities (21), (24). The series connection appears somewhat more difficult to treat, the difficulty being that the series connection of n-ports is not commutative unless transformers are used.
6. CONCLUSION
In this paper we have presented the foundations of an algebraic treatment of operator networks. Many other properties of ordinary networks can be shown to hold in this wide context. For example, one can generalize Duffin's theory of extremal length (16) and heat-flow networks (17); these generalizations depend primarily on theorems 4.4 and 5.1 of this paper.
Another subject that one may wish to treat is the infinite dimensional generalization of operator networks. For the case of invertible operators in all the branches, Zemanian has developed an extensive theory (30); without this assumption of invertibility, theorem 4.1 need not hold. Counterexamples for the parallel connection of two operators are discussed in (6). However, the shorted operator can be shown to exist in very general circumstances (6), (20), and the formula of theorem 3.6 will still yield useful results. For example, one can consider infinite ladder networks and prove fixed point theorems similar to the ones in (5) without resort to the limiting arguments used there.
REFERENCES
1. A. Albert, Regression and the Moore-Penrose Pseudoinverse, Academic Press, New York 1972.
2. W.N. Anderson, Jr., "Shorted Operators", SIAM J. Appl. Math. 20 (1971) 576-594.
3. W.N. Anderson, Jr. and R.J. Duffin, "Series and Parallel Addition of Matrices", J. Math. Anal. Appl. 26 (1969) 576-594.
4. W.N. Anderson, Jr., R.J. Duffin and G.E. Trapp, "Matrix Operations Induced by Network Connections", SIAM J. on Control 13 (1975) 446-461.
5. W.N. Anderson, Jr., G.D. Kleindorfer, P.R. Kleindorfer and M.B. Woodroofe, "Consistent Estimates of the Parameters of a Linear System", Ann. Math. Stat. 40 (1969) 2064-2075.
6. W.N. Anderson, Jr. and G.E. Trapp, "Shorted Operators II", SIAM J. Appl. Math. 28 (1975) 160-171.
7. W.N. Anderson, Jr. and G.E. Trapp, "Inequalities for the Parallel Connection of Resistive n-Port Networks", J. Franklin Inst. 299 (1975) 305-313.
8. H. Bart, M.A. Kaashoek and D.C. Lay, "Relative Inverses of Finite Meromorphic Operator Functions", Indag. Math., to appear.
9. R. Bott and R.J. Duffin, "On the Algebra of Networks", Trans. Amer. Math. Soc. 74 (1953) 99-109.
10. D. Carlson, E. Haynsworth, and T. Markham, "A Generalization of the Schur Complement by Means of the Moore-Penrose Pseudoinverse", SIAM J. Appl. Math. 26 (1974) 169-175.
11. I. Cederbaum, "On Equivalence of Resistive n-Port Networks", IEEE Trans. Circuit Theory CT-12 (1965) 338-344.
12. R.W. Cottle, "Manifestation of the Schur Complement", Linear Algebra Appl. 8 (1974) 189-211.
13. V. Dolezal, "Hilbert Networks I", SIAM J. Control 12 (1974).
14. V. Dolezal and A.H. Zemanian, "Hilbert Networks II - Some Qualitative Properties", SIAM J. Control 13 (1975) to appear.
15. R.J. Duffin, "An analysis of the Wang algebra of networks", Trans. Amer. Math. Soc. 93 (1959) 114-131.
16. R.J. Duffin, "The Extremal Length of a Network", J. Math. Anal. Appl. 5 (1962) 200-215.
17. R.J. Duffin, "Optimum Heat Transfer and Network Programming", J. Math. Mech. 17 (1968) 759-768.
18. H. Flanders, "Infinite Networks I - Resistive Networks", IEEE Trans. Circuit Theory CT-18 (1971) 326-331.
19. P.R. Halmos, Finite Dimensional Vector Spaces, Van Nostrand, Princeton, 1968.
20. M.G. Krein, "The Theory of Selfadjoint Extensions of Semibounded Hermitian Operators and Its Applications I", Mat. Sbornik N.S. 20(62) (1947) 431-495 (In Russian, with English summary).
21. A. Lempel and I. Cederbaum, "Parallel Interconnection of n-Port Networks", IEEE Trans. Circuit Theory CT-14 (1967) 274-279.
22. T. Lewis and T. Newman, "Pseudoinverses of Positive Semidefinite Matrices", SIAM J. Appl. Math. 16 (1968) 701-703.
23. G. Minty, "On the Axiomatic Foundations of the Theories of Directed Linear Graphs, Electrical Networks, and Network Programming", J. Math. Mech. 15 (1966) 485-520.
24. V.C.K. Murti and K. Thulasiraman, "Parallel Connections of n-Port Networks", Proc. IEEE 55 (1967) 1216-1217.
25. K. Nishio and T. Ando, "Characterizations of Operations Derived from Network Connections", preprint.
26. C.R. Rao and S.K. Mitra, Generalized Inverse of Matrices and Its Applications, Wiley, New York 1971.
27. K. Thulasiraman and V.C.K. Murti, "Modified Cut-Set Matrix of an n-Port Network", Proc. IEE 115 (1968).
28. W.T. Tutte, "Lectures on Matroids", Nat. Bur. of Standards J. Res. 69B (1965) 1-48.
29. H. Whitney, "On the Abstract Properties of Linear Dependence", Amer. J. Math. 57 (1935) 509-533.
30. A.H. Zemanian, "Infinite Networks of Positive Operators", Circuit Theory and Appl. 2 (1974) 69-78.
William N. Anderson, Jr. received the BS, MS and PhD from Carnegie-Mellon University in 1960, 1967 and 1968 respectively. From 1960 to 1964 he served in the US Army Signal Corps. Since receiving the PhD he has been at The Rockefeller University, the University of Maryland and West Virginia University. He is a member of SIAM, ACM and Sigma Xi.
George Trapp received the BS, MS and PhD from Carnegie-Mellon University in 1966, 1967 and 1970 respectively. He has been at West Virginia University and a consultant for Westinghouse Electric Corporation since 1970. He is a member of MAA, SIAM, ANS, Pi Mu Epsilon, Sigma Xi, Tau Beta Pi, and Phi Kappa Phi.
CONTRACTIVE PERTURBATIONS
OF RESTRICTED SHIFTS*
by Joseph A. Ball Virginia Polytechnic Institute and State University
Blacksburg, Virginia 24061
and Arthur Lubin Northwestern University
Evanston, Illinois 60201
Abstract
The characteristic function of a certain type of contractive perturbation of a restricted shift operator is determined in terms of that of the unperturbed operator. Also a spectral representation is computed explicitly for a class of unitary perturbations. These results generalize some of the finite-dimensional results of P.A. Fuhrmann related to stability of linear control systems.
1. INTRODUCTION
In this paper, we study a class of pure contractive and unitary perturbations of (generalized) restricted shifts acting in a Sz.-Nagy-Foias space generated by an analytic contractive operator-valued function S(z), and we consider some relations between the characteristic functions and spectra of the original operators and the perturbations. Restricted shifts (at least the case where S(z) is unitary-valued) arise in the realization theory of discrete linear control systems, in which case the analysis of the perturbations studied here has applications to stability theory for linear control systems. D.N. Clark [2] studied the one-dimensional unitary perturbations of restricted shifts in $H^2$, i.e. S(z) a scalar inner function. The general unitary perturbations are implicit in work of de Branges and Rovnyak [1], though in the context of the de Branges-Rovnyak model theory rather than the Sz.-Nagy-Foias. P.A. Fuhrmann [4] considered a class of completely nonunitary and unitary perturbations for the case S(z) an inner function on a finite-dimensional space. In this case, the maps considered are always compact perturbations. Our purpose here is to generalize the results of [4] and [2] to the context of a more general Sz.-Nagy-Foias space.
2. PRELIMINARY RESULTS
For notation, let $C$ and $C_*$ be separable Hilbert spaces, and let $L^2(C)$, $L^2(C_*)$, $H^2(C)$, $H^2(C_*)$ denote the standard vector-valued Lebesgue and Hardy spaces defined on the unit circle. (See [5] or [7] for general references.) We will use "t" to denote the argument of a function (vector or operator-valued) defined on the unit circle, and for analytic functions, we will freely identify h(t) on the circle with its extension to the disc, denoted h(z). The symbol S denotes a fixed purely contractive analytic operator-valued function from $C$ to $C_*$, i.e. $S(z): C \to C_*$, $\|S(z)\| \le 1$ for all $|z| < 1$ and $\|S(0)c\| < \|c\|$ for all non-zero $c \in C$, and let $\Delta(t) = (I - S(t)^*S(t))^{1/2}$. Let $H = H^2(C_*) \oplus \overline{\Delta L^2(C)}$, where the second summand denotes the $L^2(C)$ closure of $\{\Delta(t)g(t) \mid g \in L^2(C)\}$, and $M = \{(S(z)f(z), \Delta(t)f(t)) \mid f \in H^2(C)\} \subset H$. Then M is invariant under the (unilateral) shift $U_+$ in H defined by $U_+(f,g) = (zf(z), e^{it}g(t))$, so $K = H \ominus M$ is invariant under $U_+^*$, where $U_+^*$ is the left shift defined by $U_+^*(f,g) = (z^{-1}(f(z) - f(0)), e^{-it}g(t))$. We call K the Sz.-Nagy-Foias space generated by S. Let T denote the restricted right shift in K, i.e. the compression of $U_+$ to K. Thus, for $(f,g) \in K$, $T(f,g) = P(zf, e^{it}g)$, where P denotes projection onto K, and $T^* = U_+^*|_K$. Note that if S is inner, then $\Delta(t) = 0$ a.e. and $K = H^2(C_*) \ominus SH^2(C)$.
* Most of these results will appear elsewhere in revised form.
Let $\tilde{S}(z)$ be the analytic operator-valued function defined by $\tilde{S}(z) = S(\bar{z})^*$, i.e. $\tilde{S}(t) = S(-t)^*: C_* \to C$. Analogously to above, define $\tilde{\Delta}(t): C_* \to C_*$ by $\tilde{\Delta}(t) = (I - \tilde{S}(t)^*\tilde{S}(t))^{1/2}$, $\tilde{H} = H^2(C) \oplus \overline{\tilde{\Delta}L^2(C_*)}$, $\tilde{M} = \{(\tilde{S}f, \tilde{\Delta}f) \mid f \in H^2(C_*)\}$, $\tilde{K} = \tilde{H} \ominus \tilde{M}$, and $\tilde{T} = \tilde{P}\tilde{U}_+|_{\tilde{K}}$, where $\tilde{U}_+$ is the unilateral shift in $\tilde{H}$ and $\tilde{P}$ is projection onto $\tilde{K}$. Note that $\tilde{S}$ is inner if and only if S is inner. (We use "inner" in the sense of [5], i.e. $S(t): C \to C_*$ is unitary a.e.; in the terminology of [7], this is called "inner from both sides".) The following result indicates how a restricted left shift may be represented as a restricted right shift, and is basic for our analysis. The inner case is proved in [3]. The proof is a direct computation and will be omitted.
2.1 THEOREM
(i) Let $L = L^2(C_*) \oplus \overline{\Delta L^2(C)}$, $\tilde{L} = L^2(C) \oplus \overline{\tilde{\Delta}L^2(C_*)}$. Then $\tau_e: L \to \tilde{L}$ given by
$\tau_e(f,\Delta g) = e^{-it}\left( S(-t)^*f(-t) + \Delta(-t)^2 g(-t),\ \tilde{\Delta}(t)(f(-t) - S(-t)g(-t)) \right)$
is unitary.
(ii) $\tau = \tau_e|_K$ is a unitary map from K to $\tilde{K}$ implementing a unitary equivalence between the right shift T on K and the left shift $\tilde{T}^*$ on $\tilde{K}$: $\tau T = \tilde{T}^*\tau$.
We can now derive an explicit formula for T which will be useful later on.
2.2 COROLLARY
For $(f,\Delta g) \in K$, $T(f,\Delta g) = (zf(z) - S(z)Q(0),\ e^{it}\Delta(t)g(t) - \Delta(t)Q(0))$, where Q is the first component of $\tau(f,\Delta g)$.
Proof: This is obtained by computing $(\tau^*\tilde{T}^*\tau)(f,\Delta g)$.
If $F = (f,g) \in K$ and $\tau(F) = (Q,h)$, denote by $(\tau_1F)(z)$ the $C$-valued function Q(z). We state several technical lemmas needed later on. The proofs are relatively straightforward computations.
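2.3 LEMMA
For $|w| < 1$, $x \in C_*$, $y \in C$, let
$k_{w,x,y} = \left( \frac{I - S(z)S(w)^*}{1 - z\bar{w}}\,x + \frac{S(z) - S(w)}{z - w}\,y,\ \ -\frac{\Delta(t)S(w)^*}{1 - e^{it}\bar{w}}\,x + \frac{\Delta(t)}{e^{it} - w}\,y \right)$.
Then $k_{w,x,y} \in K$ and
$P\left( \left( \frac{x}{1 - z\bar{w}},\ 0 \right) \right) = k_{w,x,0}$,
$P\left( \left( \frac{S(t)}{e^{it} - w}\,y,\ \frac{\Delta(t)}{e^{it} - w}\,y \right) \right) = k_{w,0,y}$.
2.4 LEMMA
If $(f,g) = F \in K$, then
(i) $(F, k_{w,x,0})_K = (f(w), x)_{C_*}$,
(ii) $(F, k_{w,0,y})_K = ((\tau_1F)(\bar{w}), y)_C$.
In particular
(iii) for $x, y \in C_*$, and $\zeta$ and $\eta$ in D,
$(k_{\zeta,x,0}, k_{\eta,y,0})_K = \left( \frac{I - S(\eta)S(\zeta)^*}{1 - \eta\bar{\zeta}}\,x,\ y \right)_{C_*}$,
(iv) for $x, y \in C$,
$(k_{\zeta,0,x}, k_{\eta,0,y})_K = \left( \frac{I - \tilde{S}(\bar{\eta})\tilde{S}(\bar{\zeta})^*}{1 - \bar{\eta}\zeta}\,x,\ y \right)_C$,
and
(v) for $x \in C$, $y \in C_*$,
$(k_{\zeta,0,x}, k_{\eta,y,0})_K = \left( \frac{S(\eta) - S(\zeta)}{\eta - \zeta}\,x,\ y \right)_{C_*}$.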
We note that if $F = (f,g) \in K$ is orthogonal to $k_{w,x,y}$ for all $w \in D$, $x \in C_*$ and $y \in C$, then f = 0 and $\tau_1F = 0$. From the formula for $\tau_1$, it follows that also g = 0, and hence F is the zero element of K. This implies that $\{k_{w,x,y} \mid w \in D, x \in C_*, y \in C\}$ spans a dense subset of K. This fact will make the formulas (iii) - (v) useful for computations later on.
The next Lemma follows from the Corollary to Theorem 2.1 and direct computations.
2.5 LEMMA
(i) $Tk_{w,x,0} = \bar{w}^{-1}(k_{w,x,0} - k_{0,x,0})$, $w \ne 0$.
(ii) $Tk_{w,0,y} = wk_{w,0,y} - k_{0,S(w)y,0}$.
(iii) $T^*k_{w,x,0} = \bar{w}k_{w,x,0} - k_{0,0,S(w)^*x}$.
(iv) $T^*k_{w,0,y} = w^{-1}(k_{w,0,y} - k_{0,0,y})$, $w \ne 0$.
We wish to distinguish two subspaces of K defined by
$k_0$ = the closure of $\{k_{0,x,0} \mid x \in C_*\}$,
$K_0$ = the closure of $\{k_{0,0,y} \mid y \in C\}$.
Let us simplify the notation for this special case by writing $d_x$ for $k_{0,x,0}$ and $D_y$ for $k_{0,0,y}$.
2.6 LEMMA
Let $F = (f,g) \in K$. Then
(i) $T^*F = (z^{-1}f(z), e^{-it}g(t))$ if and only if $F \perp k_0$,
(ii) $TF = (zf(z), e^{it}g(t))$ if and only if $F \perp K_0$.
2.7 LEMMA
Let $P_{k_0}$ and $P_{K_0}$ denote the orthogonal projection onto $k_0$ and $K_0$ respectively. Then $P_{k_0}F = d_x$, where $x = (I - S(0)S(0)^*)^{-1}f(0)$, and $P_{K_0}F = D_y$, where $y = (I - S(0)^*S(0))^{-1}(\tau_1F)(0)$. (Note since S(z) is a pure contractive function, x and y are well-defined for F in some dense subset of K.)
3. THE PERTURBATIONS
3.1 DEFINITION
Let $A: C \to C_*$ be a bounded linear map. We define Z(A) to be the unique bounded linear map on K such that
$Z(A)F = TF$ if $F \perp K_0$,
$Z(A)F = d_{Ay}$ if $F = D_y$.
3.2 REMARK
It follows that $Z(A)^*$ is given by
$Z(A)^*F = T^*F$ if $F \perp k_0$,
$Z(A)^*F = D_y$ if $F = d_x$, where $y = (I - S(0)^*S(0))^{-1}A^*(I - S(0)S(0)^*)x$.
We note that T = Z(-S(0)) (by lemma 2.5), and that $Z(A)^*d_x = D_{A^*x}$ if and only if
(1) $AS(0)^*S(0) = S(0)S(0)^*A$.
3.3 THEOREM
(i) Z(A) is a contraction if and only if
(2) $A^*(I - S(0)S(0)^*)A \le (I - S(0)^*S(0))$.
(ii) Z(A) is unitary if and only if $A = (I - S(0)S(0)^*)^{-1/2}V(I - S(0)^*S(0))^{1/2}$ for some unitary V.
(iii) If A satisfies condition (1), then Z(A) is a contraction if and only if $\|A\| \le 1$ and Z(A) is unitary if and only if A is unitary.
Proof
(i) Since Z(A) maps $K_0^{\perp}$ isometrically onto $k_0^{\perp}$ and sends $K_0$ into $k_0$, Z(A) is a contraction if and only if it is contractive on $K_0$. By lemma 2.4, this holds precisely when $\|Ay\|^2 - \|S(0)^*Ay\|^2 \le \|y\|^2 - \|S(0)y\|^2$ for all $y \in C$, but this is clearly equivalent to (2).
(ii) As above, Z(A) is isometric precisely when equality holds in (2). By [4, theorem 1.7(i)], this holds if and only if $A = (I - S(0)S(0)^*)^{-1/2}V(I - S(0)^*S(0))^{1/2}$ for some isometry V. By lemma 2.4, $Z(A)^*$ is isometric if and only if $(I - S(0)S(0)^*) = (I - S(0)S(0)^*)^{1/2}VV^*(I - S(0)S(0)^*)^{1/2}$, which holds if and only if $VV^* = I$, so V must be unitary.
(iii) If (1) holds, then (2) reduces to $(A^*A)(I - S(0)^*S(0)) \le (I - S(0)^*S(0))$, which, using (1) again, holds if and only if $A^*A \le I$, i.e. $\|A\| \le 1$. In the second case, (1) implies that A = V.
4. CHARACTERISTIC FUNCTIONS AND SPECTRA
The Sz.-Nagy-Foias model theory for contractions assigns to each contraction T on a Hilbert space H the triple $\{\mathfrak{D}_T, \mathfrak{D}_{T^*}, \Theta_T(\lambda)\}$ where $D_T = (I - T^*T)^{1/2}$, $D_{T^*} = (I - TT^*)^{1/2}$, $\mathfrak{D}_T = \overline{D_TH}$, $\mathfrak{D}_{T^*} = \overline{D_{T^*}H}$ and $\Theta_T(\lambda) = [-T + \lambda D_{T^*}(I - \lambda T^*)^{-1}D_T]\,|_{\mathfrak{D}_T}$ is an analytic operator-valued function whose values are contractions from $\mathfrak{D}_T$ to $\mathfrak{D}_{T^*}$, the defect spaces of T. (This holds since $TD_T = D_{T^*}T$.) We call this triple the characteristic function of T. If T is completely non-unitary (c.n.u.), i.e. there is no reducing subspace on which T is unitary, then T is unitarily equivalent to the adjoint of the restricted shift on the Sz.-Nagy-Foias space generated by its characteristic function [7, p. 248]. In most cases, one is unable to get any "concrete" information from this representation for a specific operator because of computational difficulties involved in simplifying the form of the characteristic function. However, if A satisfies (1), then we can apply Fuhrmann's proof [4, p. 169-172] verbatim to get the following two theorems.
4.1 THEOREM
If $A$ is a strict contraction satisfying (1), then $Z(A)$ is a c.n.u. contraction on $K$ with characteristic function $\{K_0, k_0, \Theta_{Z(A)}(z)\}$, where $\Theta_{Z(A)}(z)$ is given by $\Theta_{Z(A)}(z)\,D y = d\,G(z)\,y$, where
$$(I - S(0)S(0)^*)^{1/2}\, G(z)\, (I - S(0)^*S(0))^{-1/2} = (I - AA^*)^{1/2}(I - \Gamma(z)A^*)^{-1}(\Gamma(z) - A)(I - A^*A)^{-1/2}$$
and
$$\Gamma(z) = (I - S(0)S(0)^*)^{1/2}(I - S(z)S(0)^*)^{-1}(S(z) - S(0))(I - S(0)^*S(0))^{-1/2}.$$
Note that the above are matrix fractional linear transformations.
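For finite matrices the characteristic function $\Theta_T(\lambda)$ defined at the start of this section can be evaluated directly. The following is only a minimal numerical sketch, assuming a finite-dimensional strict contraction (so both defect spaces are the whole space); the helper names `psd_sqrt` and `characteristic_function` and the matrix `T_demo` are hypothetical and do not come from the paper.

```python
import numpy as np

def psd_sqrt(M):
    """Square root of a Hermitian positive semidefinite matrix."""
    w, V = np.linalg.eigh(M)
    w = np.clip(w, 0.0, None)           # remove tiny negative round-off
    return (V * np.sqrt(w)) @ V.conj().T

def characteristic_function(T, lam):
    """Theta_T(lam) = -T + lam * D_{T*} (I - lam T*)^{-1} D_T for a matrix contraction T."""
    n = T.shape[0]
    I = np.eye(n)
    Th = T.conj().T
    D_T = psd_sqrt(I - Th @ T)          # defect operator of T
    D_Ts = psd_sqrt(I - T @ Th)         # defect operator of T*
    return -T + lam * D_Ts @ np.linalg.inv(I - lam * Th) @ D_T

# Hypothetical example: Frobenius norm < 1, hence a strict contraction.
T_demo = np.array([[0.3, 0.4], [0.0, -0.2]], dtype=complex)

for lam in (0.0, 0.5, 0.9j):
    # The spectral norm of Theta_T(lam) should not exceed 1 inside the unit disc.
    print(lam, np.linalg.norm(characteristic_function(T_demo, lam), 2))
```

The intertwining relation $TD_T = D_{T^*}T$ quoted above can be checked the same way by comparing `T_demo @ D_T` with `D_Ts @ T_demo`.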
We call an open arc $\gamma$ of the unit circle regular for $S(z)$ if $S(z)$ has analytic continuation over $\gamma$ and for all $\lambda \in \gamma$, $S(\lambda)$ is unitary. Let $\sigma(T)$ and $\sigma(Z(A))$ denote the spectrum of $T$ and $Z(A)$ respectively. Recall [7, Theorem VI.4.1] that
$$\sigma(T) = \{|z| < 1 : S(z) \text{ is not boundedly invertible}\} \cup \{|\lambda| = 1 : \lambda \text{ lies on no regular arc of } S\}.$$
4.2 THEOREM
Under the assumptions of 4.1,
(i) $\sigma(Z(A)) = \{|\lambda| = 1 : \lambda \text{ lies on no regular arc of } S\} \cup \{|z| < 1 : (\Gamma(z) - A) \text{ is not boundedly invertible}\}$.
(ii) $Z(A)^n$ and $Z(A)^{*n}$ both converge to zero in the strong operator topology if and only if $S(z)$ is inner.
We note that (ii) is the condition for asymptotic stability for a certain discrete linear control system.
5. UNITARY PERTURBATIONS
Since the characteristic function of a unitary
map is zero, the above method fails totally when
Z(A) is unitary. However, when A satisfies (1) we
can still get spectral information about Z(A) by
adapting techniques of D.N. Clark [2] to a more
general setting. We begin with two technical
lemmas, and omit the proofs.
5.1 LEMMA
If $A$ is unitary and satisfies (1), then $\alpha = -(I + S(0)A^*)(S(0)^* + A^*)^{-1}$ is unitary from $C$ to $C_*$.
5.2 LEMMA
For $F = (f, g) \in K$,
(i) $(Z(A) - T)(F) = k_{0,x,0}$, where $x = -(\alpha^* + S(0)^*)^{-1}(\tau_1 F)(0)$;
(ii) $(Z(A)^* - T^*)(F) = k_{0,0,y}$, where $y = -(\alpha + S(0))^{-1} f(0)$.
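For $|z| < 1$, define $\varphi(z): C_* \to C_*$ by $\varphi(z) = (I - S(z)\alpha^*)(I + S(z)\alpha^*)^{-1}$. Then straightforward calculation gives
$$\varphi(\zeta) + \varphi(\eta)^* = 2(I + S(\zeta)\alpha^*)^{-1}(I - S(\zeta)S(\eta)^*)(I + \alpha S(\eta)^*)^{-1} \tag{3}$$
and hence (let $z = \zeta = \eta$) $\varphi(z)$ has non-negative real part for $|z| < 1$. By the operator-valued version of the Herglotz theorem, there exists a non-negative operator-valued measure $\mu$ on $[0, 2\pi]$ such that
$$\varphi(z) = \int_0^{2\pi} (e^{i\theta} + z)(e^{i\theta} - z)^{-1}\, d\mu(\theta).$$
Thus
$$\varphi(\zeta) + \varphi(\eta)^* = 2\int (1 - \zeta\bar\eta)(1 - e^{-i\theta}\zeta)^{-1}(1 - e^{i\theta}\bar\eta)^{-1}\, d\mu(\theta). \tag{4}$$
Comparing (3) and (4) yields
$$\frac{I - S(\zeta)S(\eta)^*}{1 - \zeta\bar\eta} = \int \frac{I + S(\zeta)\alpha^*}{1 - e^{-i\theta}\zeta}\, d\mu(\theta)\, \frac{I + \alpha S(\eta)^*}{1 - e^{i\theta}\bar\eta}. \tag{5}$$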
Similar computations give
(6)
and
(7) I-S{')S(U) _lTTa*I+aS{'t d)J.(9) I+S{li>a* ,-li 0 e -i9 _, e i9 _Tj
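Identity (3), and the resulting non-negativity of the real part of $\varphi$, can be sanity-checked in the scalar case. The sketch below is only an illustration under stated assumptions: the Schur function `S` (a single Blaschke factor) and the unimodular constant `ALPHA` are hypothetical choices, not objects from the paper.

```python
import cmath

def S(z):
    """Hypothetical scalar Schur function: a Blaschke factor with zero at 1/2."""
    return (z - 0.5) / (1 - 0.5 * z)

ALPHA = cmath.exp(0.7j)   # hypothetical unimodular constant in the role of alpha

def phi(z):
    """Scalar version of phi(z) = (I - S(z) alpha*) (I + S(z) alpha*)^{-1}."""
    s = S(z) * ALPHA.conjugate()
    return (1 - s) / (1 + s)

def rhs_of_3(zeta, eta):
    """Right-hand side of identity (3) in the scalar case."""
    a = 1 + S(zeta) * ALPHA.conjugate()
    b = 1 + ALPHA * S(eta).conjugate()
    return 2 * (1 - S(zeta) * S(eta).conjugate()) / (a * b)

zeta, eta = 0.3 + 0.2j, -0.4 + 0.5j
# Both sides of (3) should agree up to round-off.
print(abs(phi(zeta) + phi(eta).conjugate() - rhs_of_3(zeta, eta)))
# Taking zeta = eta shows that the real part of phi is non-negative in the disc.
print(phi(zeta).real >= 0)
```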
We define the Hilbert space $L^2(\mu)$ as in Shulman [6]. For $f = x_1\chi_{E_1} + \cdots + x_n\chi_{E_n}$ a simple $C_*$-valued function, where $\chi_{E_1}, \ldots, \chi_{E_n}$ are characteristic functions of disjoint Borel sets and $x_1, \ldots, x_n$ are corresponding elements of $C_*$, define
$$\|f\|_\mu^2 = \int (d\mu(t)f(t), f(t)) = (\mu(E_1)x_1, x_1) + \cdots + (\mu(E_n)x_n, x_n).$$
This does not depend on the representation of $f(t)$ in terms of characteristic functions. Let $\mathcal{G} = \{f(t): [0, 2\pi) \to C_* \mid f$ is Borel measurable, $\int \|f(t)\|^2\, d(\mu(t)x, x) < \infty$ for all $x \in C_*$, the range of $f(t)$ is contained in a finite dimensional subspace of $C_*\}$. For $f \in \mathcal{G}$ let $e_1, e_2, \ldots, e_k$ be a basis for the smallest subspace which contains the range of $f(t)$, and define
$$a(f, t) = (\mu(t)e_1, e_1) + \cdots + (\mu(t)e_k, e_k).$$
The definition is independent of the choice of basis for this subspace, and $\|f\|_\mu^2 \le \int \|f(t)\|^2\, da(f, t)$ whenever $f$ is a simple function. For $f \in \mathcal{G}$, there is a sequence of simple functions $\{f_n(t)\}$ such that the range of $f_n(t)$ is contained in the range of $f(t)$ for $n = 1, 2, \ldots$, and such that $\int \|f_n(t) - f(t)\|^2\, da(f, t) \to 0$ as $n \to \infty$. We can define $\|f(t)\|_\mu$ unambiguously as
$$\|f\|_\mu^2 = \lim_{n \to \infty} \|f_n\|_\mu^2.$$
By $L^2(\mu)$ is meant the Hilbert space completion of the inner product space of equivalence classes of functions with finite-dimensional range in $\mu$-norm. The definition of $L^2(\mu)$ is such that explicit formulas can be written only for an element associated with the equivalence class of an element of $\mathcal{G}$. This, however, causes no difficulties for our purposes. It is clear, for example, that the transformation $h(t) \to e^{it}h(t)$ is unitary in $L^2(\mu)$, with spectrum equal to $\operatorname{supp}(\mu)$ (the complement of the largest open set on which $\mu$ is zero).
We are now in position to define a unitary transformation of $K$ onto $L^2(\mu)$ which transforms the operator $Z(A)$ on $K$ to the operator of multiplication by $e^{it}$ on $L^2(\mu)$.
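5.3 THEOREM
Define $V$ on elements in $K$ of the form $k_{\zeta,x,y}$ by
$$Vk_{\zeta,x,y} = \frac{I + \alpha S(\zeta)^*}{1 - e^{it}\bar\zeta}\, x - \frac{I + S(\bar\zeta)\alpha^*}{e^{it} - \bar\zeta}\, \alpha y.$$
Then $V$ is well-defined and extends uniquely to a unitary transformation (also $V$) of $K$ onto $L^2(\mu)$ such that $VZ(A) = e^{it}V$.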
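Proof:
We first check that $V$ is an isometry on those vectors where it is defined. Note, for $x, y \in C_*$,
$$(k_{\eta,y,0}, k_{\zeta,x,0})_K = \left(\frac{I - S(\zeta)S(\eta)^*}{1 - \bar\eta\zeta}\, y, x\right)_{C_*} = \left(\int \frac{I + S(\zeta)\alpha^*}{1 - e^{-it}\zeta}\, d\mu(t)\, \frac{I + \alpha S(\eta)^*}{1 - e^{it}\bar\eta}\, y, x\right)_{C_*} \quad \text{(by (5))}$$
$$= \left(\frac{I + \alpha S(\eta)^*}{1 - e^{it}\bar\eta}\, y, \frac{I + \alpha S(\zeta)^*}{1 - e^{it}\bar\zeta}\, x\right)_{L^2(\mu)} = (Vk_{\eta,y,0}, Vk_{\zeta,x,0})_{L^2(\mu)}.$$
Also, for $x, y \in C$,
$$(k_{\eta,0,y}, k_{\zeta,0,x})_K = \left(\frac{I - S(\bar\zeta)^*S(\bar\eta)}{1 - \bar\eta\zeta}\, y, x\right)_C = \left(\int \alpha^*\, \frac{I + \alpha S(\bar\zeta)^*}{e^{-it} - \zeta}\, d\mu(t)\, \frac{I + S(\bar\eta)\alpha^*}{e^{it} - \bar\eta}\, \alpha y, x\right)_C = (Vk_{\eta,0,y}, Vk_{\zeta,0,x})_{L^2(\mu)},$$
and finally, for $x \in C_*$ and $y \in C$,
$$(Vk_{\eta,0,y}, Vk_{\zeta,x,0})_{L^2(\mu)} = (k_{\eta,0,y}, k_{\zeta,x,0})_K.$$
Hence $V$ is isometric (and hence also well-defined) on its domain. Since elements of the form $k_{\zeta,x,y}$ span a dense set in $K$, $V$ extends by linearity and continuity to be an isometry of $K$ into $L^2(\mu)$. Since the range of $V$ contains all elements of the form $x/(1 - e^{it}\bar w)$ and $x/(e^{it} - \bar w)$ for $x \in C_*$ and $|w| < 1$, it follows that $V$ is onto $L^2(\mu)$.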
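It remains to show $VZ(A) = e^{it}V$. By Lemmas 2.5 and 5.2,
$$Z(A)(k_{w,x,0}) = \bar w^{-1}k_{w,x,0} - \bar w^{-1}k_{0,x,0} + \bar w^{-1}k_{0,(\alpha^* + S(0)^*)^{-1}(S(0)^* - S(w)^*)x,0} = \bar w^{-1}\left(k_{w,x,0} - k_{0,(\alpha^* + S(0)^*)^{-1}(\alpha^* + S(w)^*)x,0}\right)$$
and hence
$$VZ(A)k_{w,x,0} = \bar w^{-1}(1 - e^{it}\bar w)^{-1}(I + \alpha S(w)^*)x - \bar w^{-1}(I + \alpha S(w)^*)x = \bar w^{-1}\left[(1 - e^{it}\bar w)^{-1} - 1\right](I + \alpha S(w)^*)x = e^{it}Vk_{w,x,0}.$$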
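Similarly
$$Z(A)k_{w,0,y} = \bar w\, k_{w,0,y} - k_{0,S(\bar w)y,0} - k_{0,(\alpha^* + S(0)^*)^{-1}(I - S(0)^*S(\bar w))y,0}.$$
So
$$VZ(A)k_{w,0,y} = -\bar w(e^{it} - \bar w)^{-1}(I + S(\bar w)\alpha^*)\alpha y - (I + S(\bar w)\alpha^*)\alpha y = e^{it}Vk_{w,0,y}.$$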
The theorem follows.
We note the following inversion formula for V.
5.4 THEOREM
Let $V^*: L^2(\mu) \to K$ be defined, for $F$ in $\mathcal{G}$, by $V^*F = (W_1F, W_2F)$ where
$$(W_1F)(z) = (I + S(z)\alpha^*)\int (1 - e^{-it}z)^{-1}\, d\mu(t)F(t)$$
and
$$(W_2F)(t) = -\lim_{r \to 1}\,(I - S(re^{it})^*S(re^{it}))^{-1/2}\int \left(S(re^{it})^* - S(re^{it})^*S(re^{it})\alpha^*\right)(1 - re^{i(t-\theta)})^{-1}\, d\mu(\theta)F(\theta).$$
Then $V^*$ is the adjoint of $V$ defined in Theorem 5.3.
Proof:
To obtain $W_1$, rewrite equation (5) substituting $z$ for $\zeta$ and noting that
$$Vk_{\eta,x,0} = \frac{I + \alpha S(\eta)^*}{1 - e^{it}\bar\eta}\, x$$
to obtain
$$\frac{I - S(z)S(\eta)^*}{1 - z\bar\eta}\, x = (I + S(z)\alpha^*)\int (1 - e^{-it}z)^{-1}\, d\mu(t)(Vk_{\eta,x,0})(t).$$
Similarly, using equation (6),
$$\frac{S(z) - S(\bar\eta)}{z - \bar\eta}\, y = (I + S(z)\alpha^*)\int (1 - e^{-it}z)^{-1}\, d\mu(t)(Vk_{\eta,0,y})(t).$$
This proves the correctness of the formula for $W_1$ for all $F$ of the form $Vk_{\eta,x,y}$, and hence by approximation for all $F \in \mathcal{G}$. To obtain the formula for $W_2$, we first find a formula for $(\tau_1V^*F)(z)$. By an argument dual to that above, we find
$$(\tau_1V^*F)(z) = -\alpha^*(I + \alpha S(\bar z)^*)\int (e^{-it} - z)^{-1}\, d\mu(t)F(t).$$
The formula for $W_2$ is then obtained by using the explicit formulas for $T$ and $T^*$ in Theorem 2.1.
5.5 THEOREM
Let $A$ be unitary and satisfy (1). Then
$$\sigma(Z(A)) = \{|\lambda| = 1 : \lambda \text{ lies on no regular arc of } S\} \cup \{|\lambda| = 1 : \lambda \text{ lies on a regular arc of } S \text{ but } (I + S(\lambda)\alpha^*) \text{ is not boundedly invertible}\}.$$
Proof:
Since $Z(A)$ has a representation as multiplication by $e^{it}$ on $L^2(\mu)$, we have $\sigma(Z(A)) = \operatorname{supp}(\mu)$, the complement of the largest open set on which $\mu$ is zero. By the integral representation of $\varphi$, we see that the complement of $\operatorname{supp}(\mu)$ is the set of $\lambda$ at which $\varphi(z)$ has analytic continuation with $\operatorname{Re}\varphi(\lambda) = 0$. Since $\varphi(z) = (I - S(z)\alpha^*)(I + S(z)\alpha^*)^{-1}$, we have
$$(I + \varphi(z)) = 2(I + S(z)\alpha^*)^{-1} \quad \text{and} \quad S(z) = (I - \varphi(z))(I + \varphi(z))^{-1}\alpha.$$
Now, suppose $\varphi(z)$ has continuation at $\lambda$ and $\operatorname{Re}\varphi(\lambda) = 0$. Then $(I + \varphi(\lambda))$ is boundedly invertible, and hence $(I + \varphi(z))^{-1}$ extends to an analytic function in a neighborhood of $\lambda$. Thus, $S(z)$ has analytic continuation at $\lambda$ and $(I + S(\lambda)\alpha^*)$ is boundedly invertible; since $\operatorname{Re}\varphi(\lambda) = 0$, $S(\lambda)$ is unitary. Conversely, suppose $S(z)$ has analytic continuation at $\lambda$, $(I + S(\lambda)\alpha^*)$ is boundedly invertible, and $S(\lambda)$ is unitary. Then $(I + S(z)\alpha^*)^{-1}$ is analytic in some neighborhood of $\lambda$, so $\varphi(z)$ has analytic continuation at $\lambda$; since $S(\lambda)$ is unitary, $\operatorname{Re}\varphi(\lambda) = 0$. By taking complements, the theorem now follows.
Since $(I + S(\lambda)\alpha^*) = [(I + S(0)A^*) - S(\lambda)(S(0)^* + A^*)](I + S(0)A^*)^{-1}$, we see that $(I + S(\lambda)\alpha^*)$ is boundedly invertible if and only if $B(\lambda) = -[(I + S(0)A^*) - S(\lambda)(S(0)^* + A^*)]$ is boundedly invertible. With $\Gamma$ as in Theorem 4.1, we have, since $A$ satisfies (1),
$$(\Gamma(\lambda) - A) = (I - S(0)S(0)^*)^{1/2}(I - S(\lambda)S(0)^*)^{-1}B(\lambda)A(I - S(0)^*S(0))^{-1/2}.$$
Thus, $(\Gamma(\lambda) - A)$ is invertible, but not necessarily boundedly, if and only if $B(\lambda)$ is invertible. Since boundedness follows immediately in the finite dimensional case, we have the following generalization of [4, Theorem 3.6] to the case of general analytic contractions $S(z)$.
5.6 COROLLARY
If $A$ is unitary on $C$, $C$ finite-dimensional, and $A$ satisfies (1), then
$$\sigma(Z(A)) = \{|\lambda| = 1 : \lambda \text{ lies on no regular arc of } S\} \cup \{|\lambda| = 1 : \lambda \text{ lies on a regular arc for } S \text{ but } (\Gamma(\lambda) - A) \text{ is not invertible}\}.$$
In the finite-dimensional case, $Z(A)$ is a compact perturbation of $T$. Hence by the known spectral behavior of $T$ and Weyl's theorem, the points of $\{|\lambda| = 1 : \lambda$ lies on a regular arc for $S$ but $\Gamma(\lambda) - A$ is not invertible$\}$ must be eigenvalues for $Z(A)$.
We can also adapt Fuhrmann's calculations [4, page 174] to determine eigenvalues in our more general setting.
5.7 THEOREM
If $A$ is unitary and satisfies (1), and $\lambda$ lies on a regular arc for $S$, then $\lambda$ is an eigenvalue for $Z(A)$ if and only if the range of $\Gamma(\lambda) - A$ is not dense in $C_*$.
BIBLIOGRAPHY
[1] L. de Branges and J. Rovnyak, Canonical models in quantum scattering theory, Perturbation theory and its applications in quantum mechanics, Wiley, New York, (1966), 295-391.
[2] D.N. Clark, One dimensional perturbations of restricted shifts, J. Analyse Math. 25 (1972), 169-191.
[3] P.A. Fuhrmann, On the corona theorem and its application to spectral problems in Hilbert space, Trans. Amer. Math. Soc. 132 (1968), 55-66.
[4] P.A. Fuhrmann, On a class of finite dimensional contractive perturbations of restricted shifts of finite multiplicity, Israel J. Math. 16 (1973), 162-175.
[5] H. Helson, Lectures on invariant subspaces, Academic Press, New York, 1964.
[6] L. Shulman, Perturbations of unitary transformations, J. Math. Anal. Appl. 28 (1969), 231-254.
[7] B. Sz.-Nagy and C. Foias, Harmonic analysis of operators in Hilbert space, North-Holland Publishing Co., 1970.
FREQUENCY RESPONSE METHODS IN
MULTIVARIABLE INFINITE DIMENSIONAL LINEAR SYSTEMS
John S. Baras Electrical Engineering Department
University of Maryland College Park, Maryland 20742
Abstract
Recent results on the analysis of models and structural properties of linear distributed systems are presented. The presentation emphasizes the role played by harmonic analysis in these studies. The conclusions are that a careful selection of mathematical methods makes possible a satisfactory classification and detailed analysis of distributed systems models. These methods provide simple models that reflect input-output data of engineering importance.
SUMMARY
In modeling distributed parameter systems one finds a number of intrinsic problems that do not appear in lumped parameter systems modeling. Typically a linear distributed system is modeled by a differential equation
$$\frac{dx(t)}{dt} = Ax(t) + Bu(t), \qquad y(t) = Cx(t). \tag{1}$$
Here for a great variety of problems it suffices to assume that $x(t)$ is in a Hilbert space $X$ [1]. The operator $A$ arises from a formal partial differential or integrodifferential operator and it may include boundary conditions through the definition of its domain $\mathcal{D}(A)$. In all situations $A$ is assumed to generate a strongly continuous semigroup of bounded operators on $X$. This last statement is an abstract phrasing of the usual assumption that the system of equations under study be well-posed.
The controls $u$ are for us square integrable $\mathbb{C}^n$-valued functions and the outputs $y$ are square integrable $\mathbb{C}^m$-valued functions. So $u \in L^2_n$ and $y \in L^2_m$. Certainly other input and output function spaces can be utilized. It turns out however that the $L^2$ topology gives rise to a particularly rich theory. This does not mean that other function spaces cannot provide theories with similarly rich structures. The latter remains to be proved however. It is fair to say that to date other theories (based on distributions for example [3]) have not produced detailed results like the ones we describe here.
Describing the properties of the operators $B$ and $C$ in (1) above is more intricate. Indeed there are various possibilities that are due to the following facts: in distributed systems we can (a) apply distributed control, that is control distributed in the spatial domain of our partial differential operator or, (b) apply boundary control, that is control through the boundary conditions of our p.d.e. system; in distributed systems we can, (c) have as outputs linear functionals of the whole solution $x$, that is distributed observations (in the form of a weighted average) or, (d) have as outputs linear functionals of the boundary values of the solution and (or) its derivatives, that is boundary observations. In (1) $B: \mathbb{C}^n \to X$ and $C: X \to \mathbb{C}^m$. In the case of distributed control $B$ is bounded, appears in (1) directly from the physical description of the system and usually $\operatorname{Range}(B) \subseteq \mathcal{D}(A)$. Similarly with distributed observation $C$ is bounded. In case of boundary observation $C$ turns out to be typically unbounded. The usual situation however is that $C$ is $A$-bounded [4]. That is, its domain $\mathcal{D}(C) \supseteq \mathcal{D}(A)$ and $\|Cx\|_{\mathbb{C}^m} \le k_1\|Ax\|_X + k_2\|x\|_X$ for some positive $k_1, k_2$ and for all $x \in \mathcal{D}(A)$. The situation with boundary control is a little more subtle. In such cases the physical description of the system does not result directly in a model like (1). Typically one has a p.d.e. $\frac{dx(t)}{dt} = \sigma x(t)$ and a boundary partial differential operator $\tau$, which gives the control via $\tau x(t) = u(t)$, with $\tau$ being $\sigma$-bounded. One has to work further to bring the original description into the form of (1). At the end of this construction one ends with an operator $B$ that is "unbounded", in the sense that $B$ now maps $\mathbb{C}^n$ into $V' \supseteq X \supseteq V$ where $V'$ is the dual of $V$ (note here that $V$ is included in $X$ as a set and not as a Hilbert space, the inner products in $X$ and $V$ may be considerably different) (see [1] or [2] for details). However as a map from $\mathbb{C}^n$ into $V'$, $B$ is clearly bounded. These ideas have been used formally in engineering problems when replacing boundary controls with delta-function type distributed controls.
We mainly analyze here models that have both operators $B$ and $C$ bounded. We would like to point out however that most of the results can be extended to the other cases with additional work required by the more complex mathematical technicalities. The basic ideas remain the same. The matrix valued function $T(t) = Ce^{At}B$ associated with (1) is the weighting pattern of the system. The Laplace transform of $T$ is the transfer function $\hat T(s) = C(Is - A)^{-1}B$ which is originally well defined in some right half plane. The triple $(A, B, C)$ is a regular realization for $T$ or $\hat T$, when $B$ and $C$ are bounded and $T(t) = Ce^{At}B$. This last equation is a representation for the function $T$, and thus we expect classical function theoretic representation results to be quite useful here. We shall see that this is indeed the case.
It is clear that spectral properties of the generator $A$ are crucial for the analysis of systems like (1). Utilization of spectral information can provide structural and qualitative analysis of great detail. On physical grounds it is desirable that the spectral properties of $A$ "faithfully represent input-output measurements". Let us make the last statement more precise. Clearly $\hat T(s) = C(Is - A)^{-1}B$ can be analytically continued in $\rho_0(A)$, the connected component of the resolvent set of $A$ that contains $+\infty$. To simplify the discussion we assume that $\rho(A)$ is connected. Then if we let $\sigma(\hat T)$ denote the set of nonanalyticity of $\hat T$ we have the spectral inclusion property [5]: $\sigma(\hat T) \subseteq \sigma(A)$. A realization $(A, B, C)$ is spectrally minimal if $\sigma(\hat T) = \sigma(A)$, for some analytic continuation of $\hat T$, and with multiplicities counted whenever meaningful. Our position is that spectrally minimal realizations are very useful and natural models for linear distributed systems. After all, physicists and engineers usually measure things like natural frequencies, spectral lines, radiation modes that are reflected in the singularities of $\hat T$. We want then to investigate existence of such models, find simple models of this type and study relations between such models. What follows is a very brief summary of results in this direction. For details and further references we refer to [5] [6] [7].
A regular realization $(A, B, C)$ is reachable whenever $B^*e^{A^*t}x = 0$ for $t \ge 0$ implies $x = 0$; is observable whenever $Ce^{At}x = 0$ for $t \ge 0$ implies $x = 0$; is canonical whenever it is reachable and observable; is exactly reachable whenever the limit
$$\lim_{t_1 \to \infty}\int_0^{t_1} e^{At}BB^*e^{A^*t}\, dt$$
exists as a bounded and boundedly invertible operator; is exactly observable whenever the limit
$$\lim_{t_1 \to \infty}\int_0^{t_1} e^{A^*t}C^*Ce^{At}\, dt$$
exists as a bounded and boundedly invertible operator. First notice that the existence of regular realizations implies certain properties for $T$. Indeed we have:
Theorem 1: Let $T$ be an $m \times n$ matrix weighting pattern. If $T$ has a regular realization then $T$ is continuous and of exponential order. On the other hand if $T$ is locally absolutely continuous and its derivative $\dot T$ is of exponential order, $T$ has a regular realization.
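In finite dimensions the limit that defines exact observability above is the usual observability Gramian, which for a stable matrix $A$ solves the Lyapunov equation $A^*W + WA + C^*C = 0$. The following is only a sketch of that finite-dimensional analogue, not of the infinite-dimensional theory; the function name `observability_gramian` and the example matrices are hypothetical.

```python
import numpy as np

def observability_gramian(A, C):
    """Solve A^H W + W A + C^H C = 0 for W.

    Assumes every eigenvalue of A has negative real part, so that W equals
    the integral of exp(A^H t) C^H C exp(A t) over t in [0, infinity).
    """
    n = A.shape[0]
    I = np.eye(n)
    # Row-major vectorization: vec(M W N) = kron(M, N^T) vec(W).
    K = np.kron(A.conj().T, I) + np.kron(I, A.T)
    rhs = -(C.conj().T @ C).reshape(-1)
    return np.linalg.solve(K, rhs).reshape(n, n)

# Hypothetical stable example with one output.
A_demo = np.diag([-1.0, -2.0])
C_demo = np.array([[1.0, 1.0]])

W = observability_gramian(A_demo, C_demo)
print(W)                               # entries 1/2, 1/3, 1/3, 1/4
print(np.linalg.eigvalsh(W).min() > 0) # invertible Gramian
```

The reachability limit is obtained from the same routine by duality, as `observability_gramian(A.conj().T, B.conj().T)`.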
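To proceed in the analysis we need to use the theory of Hardy functions $H^2_{m \times n}$, $H^\infty_{m \times n}$, $H^2(\mathcal{L}(\mathbb{C}^k, N))$ (see [7] for notations). Then
Theorem 2: Let $\hat T$ be analytic in $\operatorname{Re} s > 0$. If $\hat T(i\omega) = \mathcal{C}(i\omega)^*\mathcal{B}(i\omega)$ a.e. with $\mathcal{C} \in H^2(\mathcal{L}(\mathbb{C}^m, N))$, $\mathcal{B} \in H^2(\mathcal{L}(\mathbb{C}^n, N))$ where $N$ is an auxiliary Hilbert space, then $T$ has a regular realization.
This latter realization is given by
$$X = H^2(N); \quad G: \mathbb{C}^n \to X, \ (Gu)(i\omega) = \mathcal{B}(i\omega)u; \quad e^{Ft}x = P_{H^2(N)}M_{e^{i\omega t}}x; \quad H: X \to \mathbb{C}^m, \ Hx = \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathcal{C}^*(i\omega)x(i\omega)\, d\omega \tag{2}$$
where $M_{e^{i\omega t}}$ is the operator "multiplication by $e^{i\omega t}$". This is the translation realization.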
It is interesting to ask when the factorization condition of the previous theorem becomes necessary. Then
Theorem 3: Let $\hat T$ be a transfer function matrix. If either
(a) $T$ has a dissipative (i.e. for $x \in \mathcal{D}(A)$, $(Ax, x) + (x, Ax) \le 0$) globally asymptotically stable (i.e. $\lim_{t \to \infty}\|e^{At}x\| = 0$, $\forall x \in X$) regular realization, or
(b) $\hat T \in H^2_{m \times n}$ and has a reachable and exactly observable regular realization,
then the factorization condition of Theorem 2 is also necessary.
We would like to analyze case (b) a little further. Note that the square integrability assumption is inessential. The Hankel operator is then well defined:
$$(H_Tu)(t) = \int_0^\infty T(t + \sigma)u(\sigma)\, d\sigma \tag{3}$$
or in the frequency domain
$$H_{\hat T}\hat u = P_{H^2_m}M_{\hat T}\, O\hat u \tag{4}$$
where $(O\hat u)(i\omega) = \hat u(-i\omega)$. Then the following is a well defined regular realization:
$$X = \operatorname{Range}(H_{\hat T}) \subset H^2_m, \quad (Bu)(i\omega) = \hat T(i\omega)u, \quad Cx = \frac{1}{2\pi}\int_{-\infty}^{\infty}x(i\omega)\, d\omega. \tag{5}$$
(5) is the restricted translation realization. But we know [6] [8] that if $(A, B, C)$ and $(F, G, H)$ are two regular, reachable and exactly observable realizations of the same weighting pattern $T$, then there exists a bounded and boundedly invertible operator $P$ so that $PA = FP$, $PB = G$, $C = HP$. So it suffices to analyze the restricted translation realization for this class of weighting patterns (and thus systems). Note that this is an extremely simple model and that the Fourier transform (which is a classical function theoretic representation theorem) was utilized in its construction.
Now $\operatorname{Range}(H_{\hat T})$ is a left translation invariant subspace, and therefore $\operatorname{Range}(H_{\hat T}) = (Q_rH^2_k)^\perp$, $k \le m$, where $Q_r(i\omega)$ is isometric a.e. The important case is when $k = m$. Then $Q_r$ is inner and the subspace of full range. This fact must reflect some properties of $\hat T$. The relevant property is that of existence of a pseudomeromorphic continuation of bounded type in the open left half plane. A transfer function matrix $\hat T$ analytic in $\operatorname{Re} s > 0$ has the above mentioned property if there exists a matrix function $G$ and a scalar function $g$, both bounded and analytic in $\operatorname{Re} s < 0$, so that $\hat T(i\omega) = G(i\omega)/g(i\omega)$ a.e. on the $i\omega$-axis. This is a generalization of the concept of regular analytic continuation. Then the following are equivalent:
(a) $\hat T$ has a meromorphic pseudocontinuation of bounded type in $\operatorname{Re} s < 0$.
(b) $(\operatorname{Range}(H_{\hat T}))^\perp = Q_rH^2_m$, $Q_r$ inner.
(c) $\hat T$ has a right coprime factorization $\hat T(i\omega) = Q_r(i\omega)P_r^*(i\omega)$, with $Q_r$ inner and $P_r \in H^\infty_{m \times n}$.
Now $Q_r$ determines the spectrum of $A$ in the restricted translation realization (5) with multiplicities: $\sigma(A) = \{\lambda \in \text{OLP}$ such that $Q_r^*(-\bar\lambda)$ has non null kernel$\} \cup \{$points on the $i\omega$-axis through which $Q_r$ cannot be continued analytically$\}$. $Q_r$ also determines the singularities of the pseudocontinuation of $\hat T$ and $\sigma(\hat T) = \sigma(A)$, multiplicities counted. Note that, except for pathological cases, the pseudocontinuation will be a true analytic continuation for $\hat T$. Thus we have:
Theorem 4: Suppose $\hat T \in H^2_{m \times n} \cap H^\infty_{m \times n}$, $\hat T$ has a meromorphic pseudo-continuation of bounded type in O.L.P., and $T$ has a reachable and exactly observable regular realization. Then i) the restricted translation realization is spectrally minimal, ii) any other reachable and exactly observable realization is spectrally minimal.
Note also that $Q_r$ gives a precise state space decomposition for this class via the Jordan model theory of Nagy-Foias [10, ch. III]. Similar results for discrete time systems can be found in [8], [9], and the references therein. We would like to remark again that all of the above can be extended to the other cases, i.e. $B$ or $C$ or both being unbounded. What is involved is a careful analysis of the restricted translation realization (5) (which can formally be written for any $H^\infty_{m \times n}$ function) in order to make the various operators well defined.
This is as far as invariant subspace theory and Hardy spaces go. There are however important classes of distributed systems that arise from engineering and physics that are not included here. To produce examples one needs only consider transfer functions with branch points. For a simple example consider heat transfer along a long bar:
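$$\frac{\partial x(t, z)}{\partial t} = \frac{\partial^2 x(t, z)}{\partial z^2} - x(t, z),$$
$$x(0, z) = 0, \qquad x(t, 0) = u(t), \qquad \lim_{z \to \infty}x(t, z) = 0,$$
$$y(t) = (\text{temperature at } z = 1) = x(t, 1). \tag{6}$$
Then $T(t) = \frac{1}{2\sqrt{\pi t^3}}\, e^{-t}e^{-1/4t}$ and $\hat T(s) = e^{-\sqrt{s+1}}$.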
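The transform pair for the heat example can be checked numerically. The sketch below assumes the standard closed form $T(t) = e^{-t}e^{-1/4t}/(2\sqrt{\pi}\,t^{3/2})$ (the prefactor is taken from the classical inverse Laplace transform of $e^{-\sqrt{s}}$, since the printed formula is only partly legible) and compares a trapezoidal approximation of its Laplace integral with $e^{-\sqrt{s+1}}$. The function names and grid parameters are hypothetical.

```python
import numpy as np

def weighting_pattern(t):
    """T(t) = exp(-t) * exp(-1/(4t)) / (2 sqrt(pi) t^{3/2}) for t > 0."""
    return np.exp(-t - 1.0 / (4.0 * t)) / (2.0 * np.sqrt(np.pi) * t ** 1.5)

def laplace_transform(s, t_max=40.0, n=400_001):
    """Trapezoidal approximation of the Laplace integral of T at a real s > -1."""
    t = np.linspace(1e-6, t_max, n)
    f = weighting_pattern(t) * np.exp(-s * t)
    return float(np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t)))

for s in (0.0, 1.0, 3.0):
    # Numerical transform next to the closed form exp(-sqrt(s + 1)).
    print(s, laplace_transform(s), np.exp(-np.sqrt(s + 1.0)))
```

The closed form makes the branch point at $s = -1$ visible, which is the feature the example is meant to exhibit.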
One can write a translation realization for this $T$:
$$X = L^2[0, \infty), \qquad e^{At} = \text{left translation on } L^2[0, \infty), \qquad Cx = \int_0^\infty g(t)x(t)\, dt, \quad g(t) = e^{-t}. \tag{7}$$
This is a canonical regular realization. However $\sigma(A) = $ closed L.P. while $\sigma(\hat T) = \{$branch cut from $-1$ to $-\infty\}$. Thus no spectral
minimality. But certainly (7) is an unnatural model for (6), because it ignores the great internal symmetry of (6). One has to use other means. In particular many problems from mathematical physics lead to models like (1) where $A$ is selfadjoint or normal. Then one can show by the use of the spectral theorem that if $(A, B, C)$ is a canonical regular realization for $T$, and $A = A^*$, this realization is spectrally minimal [6]. The internal symmetry of the system results in additional properties for $T$, which then can be utilized to construct simple models. A simple example, which illustrates the point, and also indicates how classical function representation results can be used here, is provided by the completely monotonic and positive definite functions, well known to electrical engineers. These arise naturally from lumped distributed RC networks. A function $\varphi$ is completely monotonic if it is $C^\infty$ on $[0, \infty)$ and $(-1)^n\varphi^{(n)}(t) \ge 0$ for $t > 0$, and is positive definite on $(-\infty, \infty)$ if $\sum_{i,j}\varphi(t_i - t_j)a_i\bar a_j \ge 0$ for every set of real numbers $\{t_i\}$ and complex numbers $\{a_i\}$. Then we have [6]:
Theorem 5: A weighting pattern $T$ is completely monotonic if and only if it has a regular realization $(A, b, b)$ with $A = A^*$ and stable. $T$ has a positive definite extension on $(-\infty, \infty)$ if and only if it has a regular realization $(A, b, b)$ with $A = -A^*$.
One uses Bernstein's representation of completely monotonic functions and Bochner's representation of positive definite functions to construct simple models.
Under such symmetry the state space isomorphism theorem can be improved. It is important to note that spectral minimality results from assumptions on $A$ alone (like $A = A^*$, $A$ normal), while the state space isomorphism theorem requires additional symmetry:
Theorem 6 ([6]): Let $(A, B, C)$ and $(F, G, H)$ be canonical regular realizations of $T$, with $A = A^*$, $B = C^*$, $F = F^*$, $G = H^*$. Then they are similar via a unitary map.
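The easy direction of Theorem 5 can be illustrated in finite dimensions: if $A$ is a Hermitian matrix with non-positive eigenvalues, then $T(t) = b^*e^{At}b$ is a non-negative combination of decaying exponentials, a discrete form of Bernstein's representation, so $(-1)^nT^{(n)}(t) \ge 0$. The sketch below is only a matrix analogue under that assumption; the function name and the example data are hypothetical.

```python
import numpy as np

def weighting_pattern_derivative(A, b, t, n):
    """n-th derivative of T(t) = b^H exp(At) b for Hermitian A, via the spectral theorem."""
    lam, V = np.linalg.eigh(A)           # A = V diag(lam) V^H with real lam
    c = np.abs(V.conj().T @ b) ** 2      # non-negative weights |<v_k, b>|^2
    return float(np.sum(c * lam ** n * np.exp(lam * t)))

# Hypothetical stable symmetric example with eigenvalues -1 and -3.
A_sym = np.array([[-2.0, 1.0], [1.0, -2.0]])
b_vec = np.array([1.0, 0.0])

for n in range(4):
    # Each printed value (-1)^n T^(n)(1) should be non-negative.
    print(n, (-1) ** n * weighting_pattern_derivative(A_sym, b_vec, 1.0, n))
```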
REFERENCES
[1] J. L. Lions, Optimal Control of Systems Governed by Partial Differential Equations, Springer-Verlag, 1971.
[2] H. O. Fattorini, "Boundary Control Systems", SIAM J. Control, Vol. 6, No. 3, pp. 349-385, 1968.
[3] A. Bensoussan and J. P. Aubin, "Models of Representation of Linear Time Invariant Systems in Continuous Time", Univ. of Wisconsin-Madison, Math. Res. Ctr., Report MRC #1286, Sept. 1972.
[4] T. Kato, Perturbation Theory of Linear Operators, Springer-Verlag, 1966.
[5] J. S. Baras and R. W. Brockett, "H2 Functions and Infinite Dimensional Realization Theory", SIAM J. Control, Vol. 13, No. 1, pp. 221-241, Jan. 1975.
[6] J. S. Baras, R. W. Brockett and P. A. Fuhrmann, "State-space Models for Infinite Dimensional Systems", IEEE Trans. on Aut. Control, Vol. AC-19, No. 6, pp. 693-700, Dec. 1974.
[7] J. S. Baras and P. Dewilde, "Invariant Subspace Methods in Linear Multivariable Distributed Systems and Lumped Distributed Network Synthesis", to be published in IEEE Proceedings, Special Issue on Recent Trends in System Theory, Jan. 1976.
[8] J. W. Helton, "Discrete Time Systems, Operator Models and Scattering Theory", J. Funct. Analysis, 16, 1974, pp. 15-38.
[9] P. A. Fuhrmann, "Realization Theory in Hilbert Space for a Class of Transfer Functions", J. Funct. Analysis, 18, pp. 338-349, 1975.
[10] B. Sz.-Nagy and C. Foias, Harmonic Analysis of Operators on Hilbert Space, North-Holland, Amsterdam, 1970.
ON SIMULTANEOUS DIAGONALIZATION
OF A COLLECTION OF HERMITIAN MATRICES
S. Chakrabarti, B.B. Bhattacharyya
and
M.N.S. Swamy
Department of Electrical Engineering, Concordia University
Montreal, Quebec H3G 1M8, Canada.
Abstract
Existing results on simultaneous diagonalization of a pair of hermitian matrices are extended to more than two matrices. Necessary and sufficient conditions are derived for simultaneous diagonalization through a single congruent transformation or through a pair of contragradient transformations under various conditions. It is shown that the techniques presented here are also applicable in a straightforward fashion to matrices with multivariate functional entries.
I. INTRODUCTION
The problems of simultaneous diagonalization of a finite number of matrices of finite order, through a single transformation or through a pair of specially related transformations, occur often in statistics and engineering, particularly electrical engineering. Initiated by Weierstrass' study of strict equivalence and the canonical forms for regular pencils of matrices and supplemented by Kronecker's study of more general problems involving singular pencils, similar problems have been receiving continuing attention from mathematicians.(1-3,6) The necessary and sufficient conditions for diagonalizing a finite set of $n \times n$ nondefective* matrices through a single similarity transformation are well-known.(1)
*An $n \times n$ matrix is called "nondefective" if and only if it has $n$ linearly independent eigenvectors,(4) i.e. the matrix is similar to a diagonal matrix.
Similarly, the necessary and sufficient conditions for diagonalizing a pair of hermitian matrices $A$ and $B$ through a single congruent transformation, i.e., through transformations $Q^HAQ$ and $Q^HBQ$ where* $Q$ is nonsingular, so that both $Q^HAQ$ and $Q^HBQ$ are diagonal, are known.(3) Various special cases of the last problem have been discussed in (2) and (3), of which (3) includes the most detailed study of this problem (and related problems) known to the authors.
*Throughout this paper, symbols of the form $A^H$ will denote the transpose of the complex conjugate of a matrix $A$.
Simultaneous diagonalization of a pair of hermitian matrices, $A$ and $B$, through contragradient transformations, i.e., through transformations of the form $Q^{-1}A(Q^{-1})^H$ and $Q^HBQ$, has also been obtained by Rao and Mitra.(3) The necessary and sufficient conditions for simultaneous diagonalization of a finite set of hermitian matrices $\{A_i\}$ through a single unitary matrix transformation of the form $U^HA_iU$, where $U$ is unitary, are also known.(3,5) To the best knowledge of the authors,
the problems of simultaneous diagonalization of a
finite set of more than two hermitian matrices
through a single congruent transformation or
through a pair of contragradient transformations
are yet unsolved. The purpose of the present
paper is to extend the existing results on the
problems of the type mentioned above. In particu
lar, the necessary and sufficient conditions for
the simultaneous diagonalization of a set of
hermitian matrices through a single co-gradient
transformation is presented.
It may be recognised that, since the matrices involved are hermitian, whenever $Q^HAQ$ or $Q^{-1}B(Q^{-1})^H$ are diagonal (where $A$ and $B$ are hermitian), these are real diagonals.
Finally, it has been shown that the problems of
simultaneous diagonalization of hermitian matrices
the entries of which are complex-valued functions
of many complex variables can be converted into
the equivalent problems of simultaneous diagonal
ization involving constant matrices.
2. SIMULTANEOUS DIAGONALIZATION
OF CONSTANT HERMITIAN MATRICES
In this section, we begin with a result quoted
from Rao and Mitra. (3)
Lemma 1 [Theorem 6.4.5] Let $A$ and $B$ be a pair of hermitian matrices of the same order. Then a necessary and sufficient condition that there exists a nonsingular matrix $Q$ such that $Q^{-1}A(Q^{-1})^H$ and $Q^HBQ$ are both diagonal is that: (i) $\operatorname{rank}(BA) = \operatorname{rank}(BAB)$; (ii) $AB$ is nondefective with real eigenvalues.
However, it is easy to prove that for any two matrices $A$ and $B$, not necessarily hermitian, such that $BA$ is nondefective, $\operatorname{rank}(BA) = \operatorname{rank}(BAB)$ always holds. Note that $\operatorname{rank}(AB) \le \min\{\operatorname{rank}(A), \operatorname{rank}(B)\}$. Hence, $\operatorname{rank}(BAB) = \operatorname{rank}((BA)B) \le \operatorname{rank}(BA)$. We also know that the rank of a matrix $M$ is equal to the rank of its square, i.e. $\operatorname{rank}(M) = \operatorname{rank}(M^2)$, if, and only if, the elementary divisors of $M$ corresponding to the zero eigenvalue are linear. Since $BA$ is nondefective, all elementary divisors of $BA$ are linear and hence $\operatorname{rank}(BA) = \operatorname{rank}(BABA)$, i.e., $\operatorname{rank}(BA) = \operatorname{rank}((BAB)A) \le \operatorname{rank}(BAB)$. This and the last inequality together yield $\operatorname{rank}(BA) = \operatorname{rank}(BAB)$.
In that case, clearly, the condition (i) of Lemma 1 is superfluous in view of condition (ii).
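The rank argument above can be illustrated on small matrices. The two pairs below are hypothetical examples chosen for this sketch: in the first, $BA$ has three distinct eigenvalues (hence is nondefective) and the ranks agree; in the second, $BA$ is a nilpotent Jordan block (defective) and the rank drops, which shows that the nondefectiveness assumption is actually used.

```python
import numpy as np

rank = np.linalg.matrix_rank

# BA nondefective (eigenvalues 3 + sqrt(3), 3 - sqrt(3), 0 are distinct): ranks agree.
A1 = np.diag([1.0, 2.0, 0.0])
B1 = np.array([[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]])
print(rank(B1 @ A1), rank(B1 @ A1 @ B1))   # 2 2

# BA defective (a nilpotent Jordan block): rank(BAB) drops below rank(BA).
A2 = np.array([[0.0, 1.0], [1.0, 0.0]])
B2 = np.array([[1.0, 0.0], [0.0, 0.0]])
print(rank(B2 @ A2), rank(B2 @ A2 @ B2))   # 1 0
```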
Without using the condition (i) of Lemma 1, we
shall now present an alternative proof for the
necessity and sufficiency of condition (ii) of the
same lemma. The proof is more straightforward
than the one presented in (3).
Lemma 2. Let A and B be hermitian matrices
of the same order. Then there exists a nonsingular H
Q-lAQ-l matrix Q such that and are
both diagonal if, and only if, AB is nondefect
ive with real eigenvalues.
Proof. If anyone of the matrices A and B is
the null matrix, the lemma holds trivially. Let
us consider that both A and B are non-null.
In order to prove necessity we note that if Q
-1 _lH H exists as above so that Q AQ and Q BQ are
diagonal, and hence real diagonals, the product
Q-lABQ is then real diagonal. The necessity
follows.
In order to prove sufficiency, we shall consider
two separate cases, viz., AB = 0 and AB + 0 •
Since A, B are hermitian, AB o if, and only
if, BA = 0 ,i.e., A and B commute with each
other. Hence there exists a unitary matrix U H
such that UHAU = UHAU-l and UHBU are diagonal
and the sufficiency is proved.
Finally, let us consider the case AB + 0 Then
AB is nondefective with real eigenvalues means
that there exists a nonsingular T such that
1 -1 _lH H T- ABT = A is real diagonal. Hence, T AT T BT
H THBTT-lAT-l i ; .e., the matrices
H T-lAT- l and THBT, which are hermitian since
A, B are hermitian, commute. Hence, there exists
a unitary transformation U such that H
UHT-lAT-l U and UHTHBTU are simultaneously
diagonal. Thus Q = TU is the desired nonsin
gular diagonalizing transformation.
Q.E.D.
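The construction $Q = TU$ in the proof can be sketched numerically as follows. This is only an illustration under stated assumptions: NumPy is assumed, the function name `contragradient_diagonalizer` and the test matrices are hypothetical, and the unitary $U$ is obtained by diagonalizing a random real combination of the two commuting hermitian matrices, which is a generic numerical device and not a step taken from the paper.

```python
import numpy as np

def herm(M):
    """Conjugate transpose."""
    return M.conj().T

def off_diag_max(M):
    """Largest absolute off-diagonal entry."""
    return np.abs(M - np.diag(np.diag(M))).max()

def contragradient_diagonalizer(A, B, seed=0):
    """Return Q = T U with Q^{-1} A Q^{-H} and Q^H B Q diagonal.

    Assumes A, B hermitian and AB nondefective with real eigenvalues.
    """
    # T: modal matrix of AB, so that T^{-1} A B T is diagonal.
    _, T = np.linalg.eig(A @ B)
    Tinv = np.linalg.inv(T)
    A1 = Tinv @ A @ herm(Tinv)      # T^{-1} A T^{-H}, hermitian
    B1 = herm(T) @ B @ T            # T^H B T, hermitian; commutes with A1
    # Commuting hermitian matrices share a unitary eigenbasis; a random real
    # combination generically has that basis as its own eigenvectors.
    c = np.random.default_rng(seed).standard_normal(2)
    C = c[0] * A1 + c[1] * B1
    C = (C + herm(C)) / 2           # remove rounding asymmetry
    _, U = np.linalg.eigh(C)
    return T @ U

# Hypothetical test pair built from a known nonsingular Q0.
Q0 = np.array([[1, 1j, 0], [0, 1, 1], [1, 0, 2]], dtype=complex)
Q0inv = np.linalg.inv(Q0)
A = Q0 @ np.diag([1.0, -2.0, 3.0]) @ herm(Q0)
B = herm(Q0inv) @ np.diag([2.0, 1.0, -1.0]) @ Q0inv

Q = contragradient_diagonalizer(A, B)
Qinv = np.linalg.inv(Q)
DA = Qinv @ A @ herm(Qinv)
DB = herm(Q) @ B @ Q
print(off_diag_max(DA), off_diag_max(DB))
```

Both printed values should be at rounding level, since the two transformed matrices are diagonal up to floating-point error.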
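Theorem 1. Let $A_1, A_2, \ldots, A_{n_1}$ and $B_1, B_2, \ldots, B_{n_2}$ be non-null hermitian matrices of the same order and let any one of these matrices, say $B_1$, be invertible. Then there exists a nonsingular matrix $Q$ such that $Q^{-1} A_i (Q^{-1})^H$ and $Q^H B_j Q$ are diagonal for $i = 1, 2, \ldots, n_1$ and $j = 1, 2, \ldots, n_2$ if, and only if, (i) $A_i B_1$ and $B_1^{-1} B_j$ are nondefective with real eigenvalues, $\forall\, i$ and $j = 2, 3, \ldots, n_2$, and (ii) the non-null elements of the set $\{A_1 B_1, A_2 B_1, \ldots, A_{n_1} B_1, B_1^{-1} B_2, B_1^{-1} B_3, \ldots, B_1^{-1} B_{n_2}\}$ pairwise commute.
Proof. Necessity: Given that there exists a nonsingular matrix $Q$ such that $Q^{-1} A_i (Q^{-1})^H$ and $Q^H B_j Q$ are real diagonal, $\forall\, i, j$. Then
$Q^{-1} A_i (Q^{-1})^H \, Q^H B_j Q = Q^{-1} A_i B_j Q$
is real diagonal for $i = 1, 2, \ldots, n_1$ and $j = 1, 2, \ldots, n_2$. Similarly,
$(Q^H B_1 Q)^{-1} \, Q^H B_k Q = Q^{-1} B_1^{-1} B_k Q$
is real diagonal for $k = 2, 3, \ldots, n_2$. The necessity of (i) follows. Since the matrices $A_i B_1$ and $B_1^{-1} B_k$ are simultaneously diagonalizable through a single similarity transformation, the necessity of (ii) follows immediately.
Sufficiency: Note that the conditions (i) and (ii) given above imply that there exists a nonsingular matrix $T$ such that $T^{-1} A_i B_1 T = \Lambda_i$ and $T^{-1} B_1^{-1} B_j T = \Delta_j$ are real diagonal for $i = 1, 2, \ldots, n_1$ and $j = 2, 3, \ldots, n_2$. Then clearly,
$T^{-1} A_i (T^{-1})^H = \Lambda_i (T^H B_1 T)^{-1}$
and
$T^H B_j T = (T^H B_1 T) \Delta_j$
and, since the left-hand sides are hermitian, $\forall\, i$ and $\forall\, j$, the matrices $\Lambda_i$ and $(T^H B_1 T)^{-1}$ pairwise commute and the matrices $(T^H B_1 T)$ and $\Delta_j$ pairwise commute. Since for any two commuting matrices $M$ and $N$, if $M^{-1}$ exists, $M^{-1}$ also commutes with $N$, it follows that the elements of the entire set of hermitian matrices
$\{(T^H B_1 T)^{-1}, \Lambda_1, \Lambda_2, \ldots, \Lambda_{n_1}, (T^H B_1 T), \Delta_2, \ldots, \Delta_{n_2}\}$
pairwise commute. Hence there exists a unitary matrix $U$ such that $U^H (T^H B_1 T)^{-1} U$, $U^H T^H B_1 T U$, $U^H \Lambda_i U$, $\forall\, i$, and $U^H \Delta_j U$, $\forall\, j$, are real diagonal. Thus,
$U^H T^{-1} A_i (T^{-1})^H U = U^H \Lambda_i U \, U^H (T^H B_1 T)^{-1} U$
is real diagonal, $\forall\, i$. Similarly,
$U^H T^H B_j T U = U^H T^H B_1 T U \, U^H \Delta_j U$
is real diagonal for $j = 2, 3, \ldots, n_2$, and hence, $U^H T^H B_k T U$ is diagonal for $k = 1, 2, \ldots, n_2$. Identifying $Q = TU$ as the desired diagonalizing transformation, the sufficiency follows.
Q.E.D.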
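Theorem 2. Let $A_1, A_2, \ldots, A_{n_1}$ and $B_1, B_2, \ldots, B_{n_2}$ be non-null hermitian matrices of the same order and let at least one of the matrices from each collection, say $A_1$ and $B_1$, be invertible. Then there exists a nonsingular matrix $Q$ such that $Q^{-1} A_i (Q^{-1})^H$ and $Q^H B_j Q$ are diagonal for $i = 1, 2, \ldots, n_1$ and $j = 1, 2, \ldots, n_2$ if, and only if, (i) $A_i B_j$ is nondefective with real eigenvalues and (ii) the non-null products $\{A_i B_j,\ i = 1, 2, \ldots, n_1;\ j = 1, 2, \ldots, n_2\}$ pairwise commute.
Proof. Necessity follows in the same fashion as in Theorem 1. For sufficiency we note that since (i) and (ii) hold, there exists a nonsingular matrix $T$ such that $T^{-1} A_i B_j T = \Lambda_{ij}$ is real diagonal, $\forall\, i, j$, and the hermitian matrices $T^{-1} A_i (T^{-1})^H$ and $T^H B_j T$ commute $\forall\, j$, $i = 1, 2, \ldots, n_1$. Then
$T^{-1} A_i (T^{-1})^H = \Lambda_{i1} (T^H B_1 T)^{-1}$, $\forall\, i$,
$T^H B_j T = (T^{-1} A_1 (T^{-1})^H)^{-1} \Lambda_{1j}$, $\forall\, j$,
i.e., $\Lambda_{i1}$ commutes with $(T^H B_1 T)^{-1}$, $\forall\, i$, and $\Lambda_{1j}$ commutes with $T^H A_1^{-1} T$, $\forall\, j$, because $T^{-1} A_i (T^{-1})^H$ and $T^H B_j T$ are hermitian, $\forall\, i$ and $\forall\, j$. Then, $\forall\, i$, $T^{-1} A_i (T^{-1})^H$ commutes with $T^{-1} A_k (T^{-1})^H$, $k = 1, 2, \ldots, n_1$. Similarly, $\forall\, j$, $T^H B_j T$ commutes with $T^H B_l T$, $l \neq j$, $l = 1, 2, \ldots, n_2$. Hence, the hermitian matrices
$T^{-1} A_1 (T^{-1})^H, T^{-1} A_2 (T^{-1})^H, \ldots, T^{-1} A_{n_1} (T^{-1})^H, T^H B_1 T, T^H B_2 T, \ldots, T^H B_{n_2} T$
pairwise commute with one another. Hence, there exists a unitary matrix $U$ such that $U^H T^{-1} A_i (T^{-1})^H U$, $\forall\, i$, and $U^H T^H B_j T U$, $\forall\, j$, are diagonal. Identifying $Q = TU$ as the diagonalizing transformation, the sufficiency follows.
Q.E.D.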
Theorem 3. Let $A_1, A_2, \ldots, A_{n_1}$ and $B_1, B_2, \ldots, B_{n_2}$ be two collections of non-null hermitian matrices of the same order such that the products $A_i B_j$ are non-null nonderogatory* with real eigenvalues $\forall\, i, j$. Then there exists a nonsingular matrix $Q$ such that $Q^{-1} A_i (Q^{-1})^H$ and $Q^H B_j Q$ are diagonal, $\forall\, i, j$, if, and only if, (i) $A_i B_j$ is nondefective with real eigenvalues and (ii) the products $A_i B_j$ commute with one another, $\forall\, i, j$.
* A matrix is called derogatory (4) if the same eigenvalue of this matrix can occur in more than one elementary Jordan block. Otherwise, the matrix is called nonderogatory.
Proof. The necessity is obvious. To prove sufficiency, we note that (i) and (ii) imply that there exists a nonsingular matrix $Q$ such that $Q^{-1} A_i B_j Q = \Lambda_{ij}$ is real diagonal, $\forall\, i, j$, and further, for any fixed $i$ and for any fixed $j$, the diagonal elements of $\Lambda_{ij}$ are different from one another. In that case,
$\Lambda_{ij} (Q^{-1} A_i (Q^{-1})^H) = Q^{-1} A_i (Q^{-1})^H \, Q^H B_j Q \, Q^{-1} A_i (Q^{-1})^H = (Q^{-1} A_i (Q^{-1})^H) \Lambda_{ij}$.
Similarly,
$\Lambda_{ij} (Q^H B_j Q) = (Q^H B_j Q) \Lambda_{ij}$, $\forall\, i, j$.
Using a well-known result (1; Theorem 3, p.223) it follows immediately from the structure of the $\Lambda_{ij}$'s that $Q^{-1} A_i (Q^{-1})^H$ and $Q^H B_j Q$ are diagonal, $\forall\, i$ and $\forall\, j$.
Q.E.D.
In the next two theorems, the conditions are severely restricted so as to make the identification of the diagonalizing matrices particularly simple.
Theorem 4. Let $A_1, A_2, \ldots, A_{n_1}$ and $B_1, B_2, \ldots, B_{n_2}$ be two collections of non-null hermitian matrices of same order such that (i) the products $A_i B_j$ are non-null, nondefective with real eigenvalues and pairwise commuting, $\forall\, i, j$; and (ii) for all $B_j$ (resp. $A_i$), there exists an $A_i$, say $A_1$ (resp. a $B_j$, say $B_1$), such that $A_1 B_j$ (resp. $A_i B_1$) is nonderogatory*, $\forall\, j$ (resp. $\forall\, i$). Then there exists a nonsingular matrix $Q$ such that $Q^{-1} A_i (Q^{-1})^H$, $i = 1, 2, \ldots, n_1$, and $Q^H B_j Q$, $j = 1, 2, \ldots, n_2$, are diagonal if, and only if, $Q^{-1} A_1 B_j Q$, $\forall\, j$, and $Q^{-1} A_i B_1 Q$, $\forall\, i$, are real diagonal.
* Ibid.
Proof. Necessity is obvious. For sufficiency we note that:
$Q^{-1} A_1 B_j Q = \Lambda_{1j}$
$Q^{-1} A_i B_1 Q = \Lambda_{i1}$
are real diagonal, $\forall\, j$ and $\forall\, i$, and further, for each of these diagonal matrices the diagonal elements are mutually different. Then
$\Lambda_{i1} (Q^{-1} A_i (Q^{-1})^H) = Q^{-1} A_i (Q^{-1})^H \, Q^H B_1 Q \, Q^{-1} A_i (Q^{-1})^H = (Q^{-1} A_i (Q^{-1})^H) \Lambda_{i1}$, $\forall\, i$,
and
$(Q^H B_j Q) \Lambda_{1j} = Q^H B_j Q \, Q^{-1} A_1 (Q^{-1})^H \, Q^H B_j Q = \Lambda_{1j} (Q^H B_j Q)$, $\forall\, j$.
Again, using the result given in (1; Theorem 3, p.223), it follows that $Q^{-1} A_i (Q^{-1})^H$, $\forall\, i$, and $Q^H B_j Q$, $\forall\, j$, are diagonal.
Q.E.D.
Theorem 5. Let $A_1, A_2, \ldots, A_{n_1}$ and $B_1, B_2, \ldots, B_{n_2}$ be two collections of non-null hermitian matrices of same order such that (i) the products $A_i B_j$ are non-null, nondefective with real eigenvalues and pairwise commuting; and (ii) let there exist one matrix in each collection, say $A_1$ and $B_1$, such that $A_1$ and $B_1$ are nonderogatory and invertible. Then for any nonsingular matrix $Q$ such that $Q^{-1} A_i B_j Q$ is real diagonal, $\forall\, i, j$, the matrices $Q^{-1} A_i (Q^{-1})^H$, $\forall\, i$, and $Q^H B_j Q$, $\forall\, j$, are diagonal.
Proof. Condition (i) guarantees the existence of a nonsingular matrix $Q$ such that $Q^{-1} A_i B_j Q = \Lambda_{ij}$ is real diagonal, $\forall\, i, j$. Then $Q^{-1} A_1 (Q^{-1})^H = \Lambda_{11} (Q^H B_1 Q)^{-1}$ is hermitian implies that $Q^H B_1 Q$ is diagonal (1; Theorem 3, p.223) and hence $Q^{-1} A_1 (Q^{-1})^H$ is diagonal. Then $Q^{-1} A_i (Q^{-1})^H = \Lambda_{i1} (Q^H B_1 Q)^{-1}$ is diagonal, $\forall\, i$, and $Q^H B_j Q = (Q^{-1} A_1 (Q^{-1})^H)^{-1} \Lambda_{1j}$ is diagonal, $\forall\, j$.
Q.E.D.
If a matrix $A$ is both nondefective and nonderogatory, then the dimension of the eigenspace associated with each eigenvalue of $A$ is 1 and the eigenvectors of $A$ are linearly independent. Then if $B$ is another nondefective and nonderogatory matrix such that $AB = BA$, it follows immediately that any matrix $Q$ which diagonalizes $A$ through a similarity transformation also diagonalizes $B$ through the same similarity transformation. In the case of Theorem 3, Theorem 4 and Theorem 5, if $Q$ is a diagonalizing transformation, then $Q^{-1} A_i B_j Q$ is real diagonal, $\forall\, i, j$, regardless of whether the products $A_i B_j$ are derogatory or nonderogatory. However, as far as the nonderogatory products are concerned, if these products commute, then we need to identify only a modal matrix of any one of these products as the matrix which simultaneously diagonalizes the nonderogatory products $A_i B_j$ through a similarity transformation. Let us consider any one of the derogatory products, say $A_2 B_2$, of the pairwise commuting matrices $A_i B_j$ and let $\nu$ be one of the eigenvalues of $A_2 B_2$ with a geometric multiplicity (4) $p$, where $1 < p < n$, $n$ being the order of the matrices. Then the dimension of the eigenspace associated with $\nu$ is $p$. This $p$-dimensional eigenspace contains an appropriate set of exactly $p$ column vectors of $Q$, since $Q^{-1} A_i B_j Q$ is diagonal, $\forall\, i, j$. Note that for $A_2 B_2$, any nontrivial linear combination of $p$ linearly independent eigenvectors associated with $\nu$ is also an eigenvector associated with $\nu$. Hence, if $Q'$ is a nonsingular matrix such that $Q'^{-1} A_2 B_2 Q'$ is diagonal, $Q'^{-1} A_i B_j Q'$ will not necessarily be diagonal for $i \neq 2$, $j \neq 2$. Thus we conclude that for Theorems 4 and 5, the diagonalizing matrix $Q$ can be identified as any modal matrix of any one of the nonderogatory products $A_i B_j$.
The rest of this section will deal with the simultaneous diagonalization of hermitian matrices through a single cogradient transformation. In order to do so, the following result is needed, which is quoted almost verbatim from Rao and Mitra (3); the theorem number is quoted in parentheses.
Lemma 3 (Theorem 6.4.2). Let $A$ and $B$ be a pair of $n \times n$ hermitian matrices. A necessary and sufficient condition that there exists a nonsingular transformation $Q$ such that $Q^H A Q$ and $Q^H B Q$ are both diagonal is
(i) $\operatorname{rank}(B^{\perp H} A) = \operatorname{rank}(B^{\perp H} A B^{\perp})$,
(ii) $[A - A B^{\perp} (B^{\perp H} A B^{\perp})^{-} B^{\perp H} A] B^{-}$ is nondefective with real eigenvalues;
where $B^{\perp}$ denotes the matrix whose range space is the orthogonal complement of the range space of $B$ and $B^{-}$ is a generalized inverse of $B$ where $B B^{-} B = B$; the superscript "$-$" denotes this kind of generalized inverse.
In proving this result Rao and Mitra showed that the matrix $Q$, henceforth called a "diagonalizing matrix" is of the form
where $S$ is a nonsingular matrix such that
$E$ and $F$ are hermitian matrices, $D$ is real diagonal, $K$ is a nonsingular matrix such that $K^H E K$ and $K^H D K$ are diagonal, $U$ is a unitary matrix such that $U^H F U$ is diagonal; $E = Z^H L A Z$, $F = B^{\perp H} A B^{\perp}$, $Z = Y^H (C^H Y^H)^{-1}$, $C$ is an $n \times r$ matrix such that $B = C D C^H$, $C^H C = I_r$ and $I_r$ is the $r \times r$ identity matrix where $r = \operatorname{rank}(B)$; $Y$ is such that $(B^{-})^H B B^{-} = Y^H \Lambda Y$ where $Y Y^H = I_r$ and $\Lambda$ is diagonal; $L = I_n - A B^{\perp} (B^{\perp H} A B^{\perp})^{-} B^{\perp H}$ and finally, $S$ has the partitioned form as follows:
Our next step is to generalize this result for more than two hermitian matrices.
First we note that for any two $n \times n$ non-null hermitian matrices $A$ and $B$ if $A = \xi B$ for some non-zero complex number $\xi$, then for any nonsingular matrix $Q$, $Q^H A Q$ is diagonal if, and only if, $Q^H B Q$ is diagonal. Hence, without any loss of generality, we can assume that $A_1, A_2, \ldots, A_m$ are such that $A_i \neq \xi A_j$ for any complex number $\xi$, $\forall\, i, j$ with $i \neq j$. Let us call this Assumption I.
Let us now assume that the non-null hermitian matrices $A$ and $B$ have common-position columns which are the same in both $A$ and $B$, i.e., the $i$th column in $A$ is identical to the $i$th column in $B$ and so on. These columns may be all-zero or may contain non-zero entries. Then we know that there exists a real nonsingular transformation $P$, expressible as a product of standard elementary transformations, such that
$P^T A P = \begin{bmatrix} \bar{A} & 0 \\ 0 & 0 \end{bmatrix}, \quad P^T B P = \begin{bmatrix} \bar{B} & 0 \\ 0 & 0 \end{bmatrix}$ (2)
where the superscript "T" denotes matrix transposition, $\bar{A}$ and $\bar{B}$ are hermitian matrices of the same order such that there are no common-position columns which are the same in both $\bar{A}$ and $\bar{B}$.
Let $\bar{A}_1, \bar{A}_2, \ldots, \bar{A}_m$ denote the reduction of the non-null hermitian matrices of same order, $A_1, A_2, \ldots, A_m$, as performed in (2) such that $\bar{A}_1, \bar{A}_2, \ldots, \bar{A}_m$ do not have any common position columns which are the same in all these matrices. In that case it follows that
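Lemma 4. $A_1, A_2, \ldots, A_m$ are simultaneously diagonalizable through a single cogradient transformation if, and only if, $\bar{A}_1, \bar{A}_2, \ldots, \bar{A}_m$ are simultaneously diagonalizable through a single cogradient transformation.
Proof. Necessity: Let $P$ be the real nonsingular transformation such that
$P^T A_i P = \begin{bmatrix} \bar{A}_i & 0 \\ 0 & 0 \end{bmatrix}$
as in (2). Let $Q$ be a nonsingular matrix such that $Q^H A_i Q = \Lambda_i$ is diagonal, $\forall\, i$. Define $M = P^{-1} Q$. Then $M$ is nonsingular and partitioning $M$ as
$M = \begin{bmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{bmatrix}$
where $M_{11}$ has the same order as $\bar{A}_i$, it follows that $M_{11}^H \bar{A}_i M_{11}$ is diagonal, $\forall\, i$.
Sufficiency: Given that there exists a matrix, say $M_{11}$, such that $M_{11}^H \bar{A}_i M_{11}$ are diagonal, $\forall\, i$. Then it is trivial to verify that for any matrix $M$,
$M = \begin{bmatrix} M_{11} & 0 \\ M_{21} & M_{22} \end{bmatrix}$
where $M_{21}$ and $M_{22}$ are arbitrary submatrices such that $\begin{bmatrix} \bar{A}_i & 0 \\ 0 & 0 \end{bmatrix} M$ are well-defined, $\forall\, i$, the matrices $M^H \begin{bmatrix} \bar{A}_i & 0 \\ 0 & 0 \end{bmatrix} M$ are diagonal. In that case, if we define $Q = PM$, then $Q^H A_i Q$ is diagonal, $\forall\, i$, and sufficiency follows.
Q.E.D.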
As a consequence of this lemma, it can be assumed without any loss of generality that the non-null hermitian matrices $A_1, A_2, \ldots, A_m$ do not have any common position columns which are the same in all these matrices. Let us call this Assumption II.
Finally, let ai denote the a-th column of Ai' -a
1 < a < n , where the order of Ai is n x n for il3 il3 il3 il3
all i. Let ~a = ~l~a = ~ a = ••• = ~ ~ 1 Z-al PI3 apl3
10,1< P _< n - 1 , Z < 13 < m, ~l'~Z'···'~ - - PI3 are all non-zero complex numbers, where we also
assume that no other columns of Al,AZ,···,Am satisfy this relation (if there are other such
columns, the following procedure can be applied successively and only the final result need be considered). Then again we note that there exists a nonsingular transformation $E$, expressible as a product of standard elementary transformations, which reduces each $A_i$ to a hermitian matrix $\hat{A}_i$ of order $(n - \min_\beta p_\beta) \times (n - \min_\beta p_\beta)$, $\forall\, i$, and there exists at least one $\hat{A}_i$ such that the columns of this matrix cannot be expressed as a non-zero complex multiple of one another. It is easy to check that Lemma 4 holds even when $\bar{A}_i$ is replaced by $\hat{A}_i$. Hence using the same kind of arguments as before, it can be assumed without any loss of generality that $A_1, A_2, \ldots, A_m$ are such that there exists at least one matrix $A_i$, $1 \le i \le m$, such that all columns of $A_i$ are different from one another. Let us call this Assumption III.
Definition 1. A set of non-null hermitian matrices of the same order, containing at least two elements, such that the elements of this set simultaneously satisfy the Assumptions I, II and III, will be called an almost regular set of matrices.
It is easy to see that every almost regular set of matrices contains at least two elements, say $M_1$ and $M_2$, such that there exists a non-zero real number $\lambda$ where $M_1 + \lambda M_2$ is nonsingular. This is due to the fact that for any almost regular set of matrices we can pick two elements of the set, $M_1$ and $M_2$, such that $\{M_1, M_2\}$ is an almost regular set of matrices and one of the matrices, say $M_1$, is such that its columns cannot be expressed as a non-zero complex multiple of one another. Then treating $\lambda$ as a formal parameter, it follows that $M_1 + \lambda M_2$ has full rank and $\det(M_1 + \lambda M_2)$, expressed as a polynomial in $\lambda$, has exactly $\operatorname{rank}(M_1 + \lambda M_2)$ roots. Then for any real $\lambda$, $\lambda \neq 0$, which is not also a root of $\det(M_1 + \lambda M_2)$, clearly $M_1 + \lambda M_2$ is nonsingular. For such a $\lambda$, the triple $(M_1, M_2, \lambda)$ will be called a regular triple for the almost regular set of matrices. Note that the above argument also yields a procedure for determining $\lambda$ for any regular triple.
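The procedure for determining $\lambda$ can be sketched as a simple search. This is a minimal sketch under stated assumptions: NumPy is assumed, the function name `find_regular_lambda`, the candidate list and the tolerance are hypothetical choices, and singularity is tested through the smallest singular value instead of the determinant. Since $\det(M_1 + \lambda M_2)$ is a polynomial of degree at most $n$ in $\lambda$, at most $n$ of any $n + 1$ distinct candidates can be roots when the polynomial is not identically zero.

```python
import numpy as np

def find_regular_lambda(M1, M2, candidates=None, tol=1e-10):
    """Return a nonzero real lam with M1 + lam*M2 nonsingular, or None."""
    n = M1.shape[0]
    if candidates is None:
        # n + 1 distinct nonzero real trial values (an arbitrary choice).
        candidates = [float(k) for k in range(1, n + 2)]
    for lam in candidates:
        s = np.linalg.svd(M1 + lam * M2, compute_uv=False)
        # Accept lam when the smallest singular value is clearly nonzero.
        if s[-1] > tol * max(s[0], 1.0):
            return lam
    return None

# Hypothetical example: det(M1 + lam*M2) = (1 + lam)(lam - 1), roots at +1, -1.
M1 = np.array([[1.0, 0.0], [0.0, -1.0]])
M2 = np.eye(2)
lam = find_regular_lambda(M1, M2)
print(lam)
```

In this example the first candidate, 1, is a root and is skipped; the search returns the next candidate.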
We can now state and prove the following theorem which incorporates the construction of a diagonalizing transformation whenever it exists.
Theorem 6. Let $\{A_1, A_2, \ldots, A_m\}$ be an almost regular set of matrices and let $\{A_1, A_2, \lambda\}$ be a regular triple for this set. Let $B = A_1 + \lambda A_2$. Then there exists a nonsingular matrix $Q$ such that $Q^H A_i Q$ is diagonal, $\forall\, i$, if, and only if, (i) $B^{-1} A_i$ are nondefective with real eigenvalues, $\forall\, i$, and (ii) the matrices $B^{-1} A_i$ pairwise commute with one another, $\forall\, i$.
Proof. Necessity: Given that there exists a nonsingular matrix $Q$ such that $Q^H A_i Q$ is diagonal, and hence real diagonal, $\forall\, i$. Then $Q^H B Q = \Lambda$ is real diagonal, so that $Q^{-1} B^{-1} A_i Q = (Q^H B Q)^{-1} Q^H A_i Q$ is real diagonal, $\forall\, i$, and necessity follows.
Sufficiency: Given that the conditions (i) and (ii) hold. Hence there exists a nonsingular matrix $T$ such that $T^{-1} B^{-1} A_i T = \Lambda_i$ is real diagonal, $\forall\, i$. Then $T^{-1} B^{-1} (T^{-1})^H \, T^H A_i T = \Lambda_i$, i.e., $T^H A_i T = (T^H B T) \Lambda_i$, $\forall\, i$. Since $T^H A_i T$ is hermitian, $(T^H B T) \Lambda_i = \Lambda_i (T^H B T)$, i.e., $T^H B T$ commutes with $\Lambda_i$, $\forall\, i$, and since the $\Lambda_i$'s are diagonal, the elements of the set $\{T^H B T, \Lambda_1, \Lambda_2, \ldots, \Lambda_m\}$ commute pairwise. Thus there exists a unitary matrix $U$ such that $U^H T^H B T U$ and $U^H \Lambda_i U$, $\forall\, i$, are real diagonal. Hence
$U^H T^H A_i T U = U^H T^H B T U \, U^H \Lambda_i U$,
i.e., $U^H T^H A_i T U$ is real diagonal, $\forall\, i$. Then $Q = TU$ is the desired diagonalizing transformation and the sufficiency follows.
Q.E.D.
The procedure for determining $Q$ may be formalized as follows:
Step 1: Reduce the given set of matrices to an almost regular set of matrices by using standard elementary transformations as given in (2) and (3).
Step 2: Select any regular triple $\{A_i, A_j, \lambda\}$ for this almost regular set of matrices and construct $B = A_i + \lambda A_j$.
Step 3: Check whether (i) and (ii) of Theorem 6 are both satisfied for this $B$ and the almost regular set of matrices. If these conditions are satisfied go to the next step. Otherwise, stop.
Step 4: Construct the diagonalizing matrix $Q_{11}$ as given in the proof of sufficiency in Theorem 6. Then the desired diagonalizing transformation will be of the form $PQ$ where
$Q = \begin{bmatrix} Q_{11} & 0 \\ Q_{21} & Q_{22} \end{bmatrix}$
as in the proof of Lemma 4, and $P$ is the product of standard elementary transformations used to reduce the given set of matrices to the almost regular set of matrices as shown in (2) and (3); $Q_{21}$ and $Q_{22}$ are arbitrary matrices such that $Q$ is nonsingular and $PQ$ is well-defined.
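Steps 2 to 4 can be sketched numerically for a set that is already almost regular (so that $P = I$ and $Q = Q_{11}$). This is only a sketch under stated assumptions: NumPy is assumed, the function name `cogradient_diagonalizer` and the test matrices are hypothetical, conditions (i) and (ii) are taken to hold, and both $T$ and $U$ are obtained from random real combinations of commuting matrices, a generic numerical shortcut that is not part of the paper's procedure.

```python
import numpy as np

def herm(M):
    """Conjugate transpose."""
    return M.conj().T

def off_diag_max(M):
    """Largest absolute off-diagonal entry."""
    return np.abs(M - np.diag(np.diag(M))).max()

def cogradient_diagonalizer(mats, lam, seed=0):
    """Return Q = T U with Q^H A_i Q diagonal for every hermitian A_i in mats.

    Assumes B = A_1 + lam*A_2 is nonsingular and that the matrices
    B^{-1} A_i are nondefective, have real eigenvalues and pairwise commute.
    """
    rng = np.random.default_rng(seed)
    B = mats[0] + lam * mats[1]
    Binv = np.linalg.inv(B)
    # The commuting nondefective matrices B^{-1} A_i share a modal matrix T;
    # a random real combination generically has the same eigenvectors.
    w = rng.standard_normal(len(mats))
    C = sum(wi * (Binv @ Ai) for wi, Ai in zip(w, mats))
    _, T = np.linalg.eig(C)
    # The hermitian matrices T^H A_i T = (T^H B T) Lambda_i commute pairwise,
    # so one unitary U diagonalizes all of them.
    v = rng.standard_normal(len(mats))
    H = sum(vi * (herm(T) @ Ai @ T) for vi, Ai in zip(v, mats))
    H = (H + herm(H)) / 2
    _, U = np.linalg.eigh(H)
    return T @ U

# Hypothetical test set: A_i = Q0^{-H} D_i Q0^{-1} for a known nonsingular Q0.
Q0 = np.array([[1, 1j, 0], [0, 1, 1], [1, 0, 2]], dtype=complex)
Q0inv = np.linalg.inv(Q0)
diagonals = [[1.0, 2.0, 3.0], [1.0, -1.0, 2.0], [0.0, 1.0, -1.0]]
mats = [herm(Q0inv) @ np.diag(d) @ Q0inv for d in diagonals]

Q = cogradient_diagonalizer(mats, lam=1.0)
worst = max(off_diag_max(herm(Q) @ Ai @ Q) for Ai in mats)
print(worst)
```

The recovered $Q$ need not equal the `Q0` used to build the test set; it only has to diagonalize every $A_i$ by congruence.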
Before concluding the discussion on this problem we shall present one more result which follows as a direct corollary to Theorem 6.5.2 in Rao and Mitra (3).
Lemma 5. Let $A_1, A_2, \ldots, A_m$ be non-null hermitian matrices of the same order and let one of these matrices, say $A_1$, be positive semidefinite, and $\mathcal{M}(A_i) \subset \mathcal{M}(A_1)$ for $i = 2, 3, \ldots, m$, where $\mathcal{M}(A_i)$ denotes the column space of $A_i$ and so on. Then there exists a nonsingular matrix $Q$ such that $Q^H A_i Q$ is diagonal, $\forall\, i$, if and only if, $A_i A_1^{-} A_j = A_j A_1^{-} A_i$, for $i = 2, 3, \ldots, m$ and $j = 2, 3, \ldots, m$, where $A_1^{-}$ is any generalized inverse of $A_1$ such that $A_1 A_1^{-} A_1 = A_1$.
The proof is obtained in the same manner as the one given for Theorem 6.5.2 in Rao and Mitra (3).
3. SIMULTANEOUS DIAGONALIZATION OF MATRICES
WITH FUNCTIONAL ENTRIES THROUGH CONSTANT
COMPLEX TRANSFORMATIONS.
In this section a brief discussion will be presented on the simultaneous diagonalization of hermitian matrices with functional entries through a single constant congruent transformation or through a pair of constant contragradient transformations. Consider a finite collection of $(n_1 + n_2)$ hermitian matrices, denoted by $M_1, M_2, \ldots, M_{n_1}$ and $N_1, N_2, \ldots, N_{n_2}$, each of the same finite order $n \times n$. Let the matrices be of the form as given below:
$i \neq j$
$i \neq j$
where $i, j = 1, 2, \ldots, n$; the subscript and superscript $k$ corresponds to the $k$th matrix $M_k$, and similarly for $l$; $x^{(k)} \in \mathbb{C}^{\gamma_k}$, $y^{(l)} \in \mathbb{C}^{\delta_l}$; $\gamma_k$ and $\delta_l$ are positive integers $\ge 1$, $\forall\, k, l$; $\mathbb{C}$ denotes the field of complex numbers; the symbols of the form $\mathbb{C}^{\gamma_k}$ denote the $\gamma_k$-dimensional vector space over $\mathbb{C}$ obtained as the product of $\gamma_k$ copies of $\mathbb{C}$; $\mu_{ij}^{(k)} \in [\mathbb{C}^{\gamma_k} : \mathbb{C}]$, $\forall\, i, j, k$, and correspondingly for the entries of $N_l$, $\forall\, i, j, l$, where the symbols of the form $[\mathbb{C}^{\gamma_k} : \mathbb{C}]$ denote the linear space of all complex-valued functions on $\mathbb{C}^{\gamma_k}$ with $\mathbb{C}$ as the underlying field. Whenever possible, we shall drop the characterizers $(x^{(k)})$ and $(y^{(l)})$.
as follows:
Problem I. Find the necessary and sufficient
conditions that a non-singular constant n x n
matrix Q exists such that
are diagonal for k = 1,Z, ••• ,nl
Problem II. Find the necessary and sufficient
conditions that a non-singular
matrix Q exists such that
nxn constant
(Sa)
(Sb)
and diagonal for k = 1,Z, ••• ,nl
and R, = 1,Z, ••• ,nZ
Let the symbol ~[fl,fZ, ••• ,fp] denote the finite
dimensional linear space of functions generated by
the finite set of complex-valued functions
{fl,fZ, ••• ,fp} with ~ as the underlying field.
Similarly, let .-&g <f l' f Z ' ••• , f g > denote the
linear space of functions over ~ for which a
finite set of g linearly independent functions
{fl,fZ, ••. ,fg} form a basis. Whenever there is
no danger of confusion, the symbols of the form
~g <fl,fZ, ••• ,fg> will be abbreviated to ~g The "script" capital letters will be reserved for
representing such finite-dimensional linear spaces
of complex-valued functions with , as the under
lying field.
and
Lemma 6. A necessary condition that $M_k$ will be diagonalized for all $k$ in the form of equation (4) through a complex constant invertible matrix $Q$ is that every $\mu_{ij}^{(k)}$-entry of $M_k$, $\forall\, i, j$, belongs to a $p^{(k)}$-dimensional subspace of $[\mathbb{C}^{\gamma_k} : \mathbb{C}]$, denoted by $\mathcal{H}_{p^{(k)}}$, where $1 \le p^{(k)} \le n$; $k = 1, 2, \ldots, n_1$. The subspace $\mathcal{H}_{p^{(k)}}$ is the smallest such subspace in the sense that for every other subspace $\mathcal{H}_p$ over $\mathbb{C}$ containing all $\mu_{ij}^{(k)}$-entries of $M_k$, we have $p^{(k)} \le p$, $\mathcal{H}_{p^{(k)}} \subseteq \mathcal{H}_p$ and for $p = p^{(k)}$, $\mathcal{H}_{p^{(k)}} = \mathcal{H}_p$.
Proof: It is sufficient to prove this lemma for only one $k$. The most general form of the $n \times n$ diagonal matrix, $\Lambda^{(k)}(x^{(k)})$, may be written as follows:
$\Lambda^{(k)}(x^{(k)}) = \operatorname{diag}\{\mu_1^{(k)}, \mu_2^{(k)}, \ldots, \mu_n^{(k)}\}$
where all $\mu_i^{(k)}$'s, $i = 1, 2, \ldots, n$, are not necessarily linearly independent of one another or not even different. Let
$\mathcal{H}_{p^{(k)}} \triangleq \mathcal{L}\{\mu_1^{(k)}, \mu_2^{(k)}, \ldots, \mu_n^{(k)}\} \subset [\mathbb{C}^{\gamma_k} : \mathbb{C}]$.
Evidently, $p^{(k)} \le n$. On the other hand, $p^{(k)} = 0$ corresponds to the trivial case of the subspace $\{0\}$. Hence, $1 \le p^{(k)} \le n$. In that case, denoting the largest subset of the linearly independent elements of the set of generators $\{\mu_1^{(k)}, \mu_2^{(k)}, \ldots, \mu_n^{(k)}\}$ of $\mathcal{H}_{p^{(k)}}$ by $\{\mu_1^{(k)}, \mu_2^{(k)}, \ldots, \mu_{p^{(k)}}^{(k)}\}$ (after re-indexing, if necessary), it is possible to write:
i.e., the rest of the functions belonging to the set of generators, viz., $\mu_{p^{(k)}+1}^{(k)}, \mu_{p^{(k)}+2}^{(k)}, \ldots, \mu_n^{(k)}$, may be expressed as a non-trivial linear combination of the basis functions. Therefore, since equation (1) holds, we can write:
$M_k = Q \Lambda^{(k)} Q^H$
which yields
$\mu_{ij}^{(k)} = \sum_{r=1}^{n} q_{ir} \tilde{q}_{jr} \mu_r^{(k)}$, $\forall\, i, j$; (6)
the "tilde" denotes the complex conjugation. Thus $\mu_{ij}^{(k)} \in \mathcal{H}_{p^{(k)}}$, $\forall\, i, j$. The last statement is obvious from the definition of $\mathcal{H}_{p^{(k)}}$.
Q.E.D.
From now on, our attention will primarily be restricted to simultaneous diagonalization through a single cogradient transformation; however, it will be obvious from the subsequent discussion that the same arguments can be used for the case of simultaneous diagonalization through a contragradient transformation.
Except in the simplest case of $p^{(k)}$ equal to 1, the problem is now to identify a basis for $\mathcal{H}_{p^{(k)}}$. Since these spaces are finite-dimensional, the most natural approach for determining the bases is to follow the methods for doing the same in the case of $\mathbb{C}^n$ as closely as possible. This means that the spaces $\mathcal{H}_{p^{(k)}}$ should be pre-Hilbert spaces with respect to suitably defined inner products. The inner product structure enables us to employ the Gram-Schmidt orthonormalization technique for identifying an orthonormal set of basis functions for each $\mathcal{H}_{p^{(k)}}$.
The Gram-Schmidt orthonormalization is then performed on the sets $\{\mu_{11}^{(k)}, \mu_{12}^{(k)}, \ldots, \mu_{ij}^{(k)}, \ldots, \mu_{nn}^{(k)}\}$ using the standard procedure (including checking whether the Gramian is zero or not at every step of orthonormalization) (1), for $k = 1, 2, \ldots, n_1$. Let $\{\hat{\mu}_1^{(k)}, \hat{\mu}_2^{(k)}, \ldots, \hat{\mu}_{p^{(k)}}^{(k)}\}$ denote an orthonormal basis of $\mathcal{H}_{p^{(k)}}$ obtained in this fashion. Since for $p^{(k)} > n$ the simultaneous diagonalization is not possible as mentioned in Lemma 6, in order to proceed further, we shall now assume $p^{(k)} \le n$, $\forall\, k$.
Let $\langle \cdot \mid \cdot \rangle$ denote the inner-product operation corresponding to the appropriate space. Let
$\mu_{ij}^{(k)} = \sum_{m=1}^{p^{(k)}} t_{ijm}^{(k)} \hat{\mu}_m^{(k)}$, $t_{ijm}^{(k)} \in \mathbb{C}$, $\forall\, i, j, k, m$. (7)
Then
$t_{ijm}^{(k)} = \langle \mu_{ij}^{(k)} \mid \hat{\mu}_m^{(k)} \rangle$. (8)
Similarly, let
$\mu_r^{(k)} = \sum_{m=1}^{p^{(k)}} \phi_{rm}^{(k)} \hat{\mu}_m^{(k)}$, $\phi_{rm}^{(k)} \in \mathbb{C}$, $\forall\, r, k, m$; (9)
then,
$\phi_{rm}^{(k)} = \langle \mu_r^{(k)} \mid \hat{\mu}_m^{(k)} \rangle$. (10)
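The orthonormalization step can be sketched for functions of one real variable sampled on a grid. This is a minimal sketch under explicit assumptions, since the paper leaves the inner product unspecified: the inner product is taken to be $\langle f \mid g \rangle = \int_0^1 f(x) \overline{g(x)}\, dx$, approximated by a midpoint rule; NumPy is assumed; the names `inner`, `gram_schmidt` and the sample entries are hypothetical. Linear dependence is detected through the norm of the residual, used here in place of an explicit Gramian test.

```python
import numpy as np

def inner(f, g):
    """Assumed inner product <f|g> on [0, 1], midpoint-rule approximation."""
    return np.mean(f * np.conj(g))

def gram_schmidt(functions, tol=1e-10):
    """Orthonormalize sampled functions, skipping linearly dependent ones."""
    basis = []
    for f in functions:
        r = f.astype(complex)
        for e in basis:
            r = r - inner(r, e) * e      # remove the component along e
        norm2 = inner(r, r).real
        if norm2 > tol:                  # residual nonzero: new direction
            basis.append(r / np.sqrt(norm2))
    return basis

# Hypothetical matrix entries: 1 + 2x depends linearly on 1 and x.
x = (np.arange(2000) + 0.5) / 2000
entries = [np.ones_like(x), x, 1 + 2 * x, x**2]
basis = gram_schmidt(entries)

# Expansion coefficients of an entry in the orthonormal basis, as in (8).
t = [inner(1 + 2 * x, e) for e in basis]
gram = np.array([[inner(a, b) for b in basis] for a in basis])
print(len(basis), np.round(np.abs(gram), 6))
```

Here the four entries span a three-dimensional space, so three basis functions are returned.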
Let us now define $n \times n$ matrices $T_m^{(k)}$ and diagonal matrices $\Phi_m^{(k)}$ as follows:
$T_m^{(k)} = [t_{ijm}^{(k)}]$; $i, j = 1, 2, \ldots, n$ (11)
$\Phi_m^{(k)} = \operatorname{diag}\{\phi_{1m}^{(k)}, \phi_{2m}^{(k)}, \ldots, \phi_{nm}^{(k)}\}$ (12)
with $k = 1, 2, \ldots, n_1$ and $m = 1, 2, \ldots, p^{(k)}$. Since $M_k$ are hermitian, $\forall\, k$, obviously, $T_m^{(k)}$ are hermitian, $\forall\, k, m$.
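Theorem 7. The collection of $n_1$ hermitian matrices $M_1, M_2, \ldots, M_{n_1}$ is simultaneously diagonalizable as in (4) if, and only if, $Q^{-1} T_m^{(k)} (Q^{-1})^H = \Phi_m^{(k)}$ are real diagonal matrices, $\forall\, k, m$.
Proof. Necessity: Substitution of (9) into (6) yields:
$\mu_{ij}^{(k)} = \sum_{m=1}^{p^{(k)}} \sum_{r=1}^{n} q_{ir} \tilde{q}_{jr} \phi_{rm}^{(k)} \hat{\mu}_m^{(k)}$ (13)
Comparing (13) with (7) we have:
$t_{ijm}^{(k)} = \sum_{r=1}^{n} q_{ir} \tilde{q}_{jr} \phi_{rm}^{(k)}$ (14)
i.e.
$T_m^{(k)} = Q \Phi_m^{(k)} Q^H$, $\forall\, k, m$ (15)
where $\Phi_m^{(k)}$ is a diagonal matrix as defined in (12). Thus $Q^{-1} T_m^{(k)} (Q^{-1})^H = \Phi_m^{(k)}$ are diagonal, $\forall\, k, m$. Since $(Q^{-1} T_m^{(k)} (Q^{-1})^H)^H = Q^{-1} T_m^{(k)} (Q^{-1})^H$, $\Phi_m^{(k)}$ must have only real diagonal entries.
Sufficiency. Given that the matrices $Q^{-1} T_m^{(k)} (Q^{-1})^H = \Phi_m^{(k)}$ are diagonal, $\forall\, k, m$. Therefore, $T_m^{(k)} = Q \Phi_m^{(k)} Q^H$. Note that
$M_k = \sum_{m=1}^{p^{(k)}} T_m^{(k)} \hat{\mu}_m^{(k)} = \sum_{m=1}^{p^{(k)}} Q \Phi_m^{(k)} Q^H \hat{\mu}_m^{(k)} = Q \Big( \sum_{m=1}^{p^{(k)}} \Phi_m^{(k)} \hat{\mu}_m^{(k)} \Big) Q^H$ (16)
so that
$Q^{-1} M_k (Q^{-1})^H = \sum_{m=1}^{p^{(k)}} \Phi_m^{(k)} \hat{\mu}_m^{(k)}$. (17)
Since $\Phi_m^{(k)}$ are diagonal, $\forall\, k, m$, $Q^{-1} M_k (Q^{-1})^H$ are diagonal, $\forall\, k$.
Q.E.D.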
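The sufficiency argument can be illustrated numerically for a single matrix with functional entries. This is a minimal sketch, assuming NumPy; the matrix `Q`, the coefficient matrices `Phi`, and the two real-valued functions standing in for the basis functions are all hypothetical choices made for illustration. If the constant coefficient matrices satisfy $T_m = Q \Phi_m Q^H$ with $\Phi_m$ real diagonal, then the same constant $Q$ diagonalizes $M(x)$ at every $x$.

```python
import numpy as np

def herm(M):
    """Conjugate transpose."""
    return M.conj().T

def off_diag_max(M):
    """Largest absolute off-diagonal entry."""
    return np.abs(M - np.diag(np.diag(M))).max()

Q = np.array([[1, 1j, 0], [0, 1, 1], [1, 0, 2]], dtype=complex)
Qinv = np.linalg.inv(Q)

# Hypothetical real diagonal coefficient matrices and constant matrices T_m.
Phi = [np.diag([1.0, 0.0, 2.0]), np.diag([0.0, 1.0, -1.0])]
T = [Q @ P @ herm(Q) for P in Phi]

# Stand-ins for the basis functions (real-valued, so M(x) stays hermitian).
basis = [lambda x: 1.0, lambda x: np.cos(x)]

def M(x):
    """Matrix with functional entries, M(x) = sum_m T_m * f_m(x)."""
    return sum(Tm * f(x) for Tm, f in zip(T, basis))

samples = np.linspace(0.0, 3.0, 7)
worst = max(off_diag_max(Qinv @ M(x) @ herm(Qinv)) for x in samples)
print(worst)
```

The check only samples a few points; the algebraic identity (16) is what guarantees the result for all $x$.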
Let $U_e^{(l)}$, $l = 1, 2, \ldots, n_2$ and $e = 1, 2, \ldots, d^{(l)}$, denote the matrices derived from $N_1, N_2, \ldots, N_{n_2}$, where $U_e^{(l)}$ are the counterparts of $T_m^{(k)}$ (where $d^{(l)}$ are the counterparts of the dimensionalities $p^{(k)}$). Similarly, let $\Psi_e^{(l)}$ denote the real diagonal matrices which are counterparts of $\Phi_m^{(k)}$.
The next theorem follows as an obvious corollary of Theorem 7.
Theorem 8. The collection of $(n_1 + n_2)$ hermitian matrices $M_1, M_2, \ldots, M_{n_1}$ and $N_1, N_2, \ldots, N_{n_2}$ is simultaneously diagonalized through contragradient transformations as shown in (5a) and (5b) if, and only if, $Q^{-1} T_m^{(k)} (Q^{-1})^H = \Phi_m^{(k)}$, $\forall\, k, m$, and $Q^H U_e^{(l)} Q = \Psi_e^{(l)}$, $\forall\, l, e$, are real diagonal matrices.
Depending on how the Gram-Schmidt orthonormalization is initiated, different orthonormal bases may be obtained for the same space. It will now be shown that Theorem 7 and Theorem 8 hold regardless of the choice of the orthonormal bases; the diagonalizing transformation $Q$ remains invariant.
Let $\{\check{\mu}_1^{(k)}, \check{\mu}_2^{(k)}, \ldots, \check{\mu}_{p^{(k)}}^{(k)}\}$ be another orthonormal basis for the space $\mathcal{H}_{p^{(k)}}$. Let $\check{T}_m^{(k)}$, $\check{\Phi}_m^{(k)}$ be the counterparts of $T_m^{(k)}$, $\Phi_m^{(k)}$, respectively, determined using this new basis.
Theorem 9. The matrices $Q^{-1} \check{T}_m^{(k)} (Q^{-1})^H$ are real diagonal if, and only if, $Q^{-1} T_m^{(k)} (Q^{-1})^H$ are real diagonal.
The proof is simple and hence omitted.
The following corollary is obvious.
Corollary. The matrices
The implications of Theorem 8 and Theorem 9 are that the problems of simultaneous diagonalization of matrices with functional entries can be converted to equivalent problems of constant matrices. The rest of the problems then become identical to those discussed in the section II.
This concludes our discussion on the simultaneous diagonalization of functional matrices.
V. CONCLUSION
In this paper, existing results on simultaneous diagonalization of hermitian matrices through a single congruent transformation and through a pair of contragradient transformations have been generalized for an arbitrary but finite number of matrices. The results obtained here are shown to be applicable to the matrices with multivariate functional entries which satisfy appropriate conditions. An example of simultaneous diagonalization through contragradient transformations is also included.
ACKNOWLEDGEMENT
This work was supported by the National Research
Council of Canada under Grant Nos. A-7739 and
A-7740.
The authors wish to thank Prof. N.K. Bose of the
University of Pittsburgh, Pa., and Professor M.
Vidyasagar of Concordia University, Montreal, for
useful discussions and suggestions.
REFERENCES
1. F.R. Gantmacher, Matrix Theory, Vol.I, Chelsea Publishing Co., New York, N.Y., 1960.
2. A.C. Aitken, Determinants and Matrices, 9th Ed., Oliver and Boyd, Edinburgh, U.K., 1956.
3. C.R. Rao and S.K. Mitra, Generalized Inverse of Matrices and Its Applications, John Wiley and Sons, Inc., New York, N.Y., 1971.
4. D.M. Young and R.T. Gregory, A Survey of Numerical Mathematics, Vol.II, Addison-Wesley Publishing Co., Reading, Mass., 1973.
5. P. Bhimasankaram, "On Generalized Inverses of Partitioned Matrices", Sankhya, Series A, Vol. 33, 1971.
6. A. Ben-Israel and T.N.E. Greville, Generalized Inverses: Theory and Applications, Wiley-Interscience, New York, 1974.
* * *
S. Chakrabarti received the B.Sc. (Hons) degree in Physics, and the B.Tech., M.Tech. and D.Sc. degrees in Radiophysics and Electronics from the University of Calcutta, India, in 1965, 1967, 1968 and 1973, respectively.
From 1972 to 1974 he was with the Department of Electrical Engineering, University of California, Davis, supported by a National Scholarship awarded by the Govt. of India. Since October 1974, he has been a Post-doctoral Fellow at Concordia University, Montreal. His current research interests are in the areas of Digital Signal Processing, Optimization Techniques and Systems Theory.
Dr. Chakrabarti is a member of the IEEE, SIAM and MAA.
B.B. Bhattacharyya received the B.Tech. (Honors) and the M.Tech. degrees from the Indian Institute of Technology, Kharagpur, in 1958 and 1959, respectively, and the Ph.D. degree in electrical engineering in 1968 from Nova Scotia Technical College, Halifax, Canada.
From 1959 to 1965 he held appointments as a Technical Teacher trainee and a Lecturer at, respectively, the Indian Institute of Technology, Kharagpur, and the Indian Institute of Technology, Madras. He joined the Electrical Engineering Department of the Nova Scotia Technical College, Halifax, Nova Scotia, Canada, in 1965 and, in 1968, moved to the University of Calgary to become Assistant Professor in electrical engineering. He joined Sir George Williams University (now known as Concordia University), Montreal, Canada, in 1970 as an Associate Professor of electrical engineering. He became a Professor in 1973. He has published a number of articles in the area of network theory.
M.N.S. Swamy was born on April 7, 1935. He received the B.Sc. (Honors) degree in mathematics from Mysore, India, in 1954, the Diploma in electrical communication engineering from the Indian Institute of Science, Bangalore, in 1957, and the M.Sc. and Ph.D. degrees in electrical engineering from the University of Saskatchewan, Saskatoon, Saskatchewan, Canada, in 1960 and 1963, respectively.
He worked as a Senior Research Assistant at the Indian Institute of Science until 1959, when he began graduate study at the University of Saskatchewan. In 1963, he returned to India to work at the Indian Institute of Technology, Madras. From 1964 to 1965, he was an Assistant Professor of Mathematics at the University of Saskatchewan. He has also taught as Professor of Electrical Engineering at Nova Scotia Technical College, Halifax, and the University of Calgary, Calgary, Alberta, Canada. He is now Chairman of the Department of Electrical Engineering, Concordia University, Montreal, Canada. He has published a number of papers on number theory, semiconductor circuits, control systems, and network theory. He is also Associate Editor of the Fibonacci Quarterly. During 1976, he is the vice-president of the IEEE Circuits and Systems Society.
A WALSH OPERATIONAL MATRIX FOR SOLVING VARIATIONAL PROBLEMS
C. F. Chen and C. H. Hsiao
Electrical Engineering Department, University of Houston
Houston, Texas 77004
Abstract
The Walsh function was initiated by Rademacher [1] and independently developed by Walsh [2] in the early nineteen twenties. In recent years, the Walsh theory has been innovated and applied to various fields [3-8] in engineering and science. To the authors' knowledge, however, this powerful tool has not been used for solving variational problems.
This paper establishes a clear procedure for solving variational problems via Walsh function techniques. In the beginning part of this paper, we will introduce Walsh functions and briefly summarize their properties; that part is therefore tutorial in nature. Then we will derive an operational matrix for integration. The variational problems will be solved by means of the direct method with Walsh series.
Introduction
The basic idea of the direct method for solving a variational problem is to convert the problem of extremization of a functional into a problem of extremization of a function which involves a finite number of variables. Ritz's method is a well known one in this field. This paper first introduces Walsh functions, which is tutorial in nature; it then presents a direct method for solving variational problems via Walsh functions. The procedure involves (1) assuming the admissible functions by Walsh series with coefficients to be determined; (2) establishing an operational matrix for performing integration; (3) finding the necessary condition for extremization; and (4) solving the algebraic equation obtained from the previous steps to evaluate the Walsh coefficients. Because of the orthonormal property of the powerful Walsh series, the new direct method is simpler in reasoning as well as in calculation. An illustrative example and a practical application to a heat conduction problem are included.
Rademacher and Walsh Functions
Rademacher's functions are a set of square waves of unit height with periods equal to $1, 1/2, 1/4, 1/8, \ldots, 2^{1-k}$ respectively. The first five square waves are shown in Fig. 1. Alternatively, we state that the number of cycles of the square wave $r_k(t)$ is $2^{k-1}$. It is noted that the set is not complete since, except for $r_0(t)$, the set involves only functions which are odd about $t = 1/2$.
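In 1923, Walsh independently developed a complete set which is known as Walsh functions. The set of Walsh functions and the set of Rademacher functions have the following relations.
$$\begin{aligned}
\phi_0(t) &= r_0(t) \\
\phi_1(t) &= r_1(t) \\
\phi_2(t) &= [r_2(t)]^1 [r_1(t)]^0 \\
\phi_3(t) &= [r_2(t)]^1 [r_1(t)]^1 \\
&\;\;\vdots
\end{aligned} \qquad (1)$$
$$\phi_n(t) = [r_q(t)]^{b_q}\,[r_{q-1}(t)]^{b_{q-1}} \cdots [r_1(t)]^{b_1} \qquad (2)$$
where
$$q = [\log_2 n] + 1 \qquad (3)$$
in which $[\,\cdot\,]$ means taking the greatest integer of "$\cdot$". Therefore,
$$n = b_q 2^{q-1} + b_{q-1} 2^{q-2} + \cdots + b_1 2^0 \qquad (4)$$
where $b_q b_{q-1} \ldots b_1$ is the binary expression of n.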
Therefore, if a particular Walsh function $\phi_n(t)$ is given and its Rademacher function components are required, we simply change n into binary form and then substitute into (2).
Rademacher functions are easy to draw, and so are Walsh functions.
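The binary-digit rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `rademacher`, `walsh` and `walsh_matrix` are hypothetical, the functions are taken on [0, 1), and each $r_k$ is assumed to start at +1.

```python
from fractions import Fraction

def rademacher(k, t):
    """r_k(t) on [0, 1): r_0 = 1; for k >= 1 a unit square wave with 2**(k-1) cycles."""
    if k == 0:
        return 1
    # int(2**k * t) counts the half-periods already elapsed: even -> +1, odd -> -1
    return 1 if int(2**k * t) % 2 == 0 else -1

def walsh(n, t):
    """phi_n(t): product of r_k(t) over the set bits b_k of n (b_1 is the lowest bit)."""
    value, k = 1, 1
    while n:
        if n & 1:
            value *= rademacher(k, t)
        n >>= 1
        k += 1
    return value

def walsh_matrix(m):
    """Row n holds phi_n sampled at the midpoint of each of m equal subintervals."""
    return [[walsh(n, Fraction(2 * k + 1, 2 * m)) for k in range(m)]
            for n in range(m)]

for row in walsh_matrix(4):
    print(row)
```

For m = 4 the rows are the sign patterns of $\phi_0, \ldots, \phi_3$ on the four quarter-intervals, and the rows are mutually orthogonal.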
Walsh Coefficient Determination
A function f(t) which is absolutely integrable in [0,1] may be expanded into a Walsh series
$$f(t) = \sum_{n=0}^{\infty} c_n \phi_n(t) \qquad (5)$$
where $c_n$ are the coefficients of the Walsh series of f(t).
It is desirable to determine the coefficients $c_n$ such that the integral square error
$$\varepsilon = \int_0^1 \Big[ f(t) - \sum_{n=0}^{N} c_n \phi_n(t) \Big]^2 dt \qquad (6)$$
is minimized. Taking the partial derivative of $\varepsilon$ with respect to $c_n$ yields
$$\frac{\partial \varepsilon}{\partial c_n} = 2c_n - 2\int_0^1 f(t)\,\phi_n(t)\,dt$$
and then setting it equal to zero, we have:
$$c_n = \int_0^1 \phi_n(t)\,f(t)\,dt \qquad (7)$$
This simple result is due to the orthonormal property of Walsh functions.
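Let us illustrate the Walsh series expansion by the following simple ramp function example.
$$f(t) = t$$
Substituting f(t) into (7) and taking only 4 terms yields
$$c_0 = \int_0^1 t\,\phi_0(t)\,dt = \tfrac{1}{2}, \quad c_1 = \int_0^1 t\,\phi_1(t)\,dt = -\tfrac{1}{4}, \quad c_2 = \int_0^1 t\,\phi_2(t)\,dt = -\tfrac{1}{8}, \quad c_3 = \int_0^1 t\,\phi_3(t)\,dt = 0$$
After substituting these values of the coefficients into (5), we have
$$t \approx \tfrac{1}{2}\phi_0(t) - \tfrac{1}{4}\phi_1(t) - \tfrac{1}{8}\phi_2(t) + 0\cdot\phi_3(t)$$
which is the four-term Walsh series expansion of the ramp function.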
Discrete Formula
If the given function is not in analytic form but in tabulated or graphical form and its Walsh series expansion is desired, we modify (5) and (7) into discrete forms:
$$f_k = \sum_{n=0}^{m-1} c_n \phi_{nk}, \qquad k = 0, 1, 2, \ldots, (m-1) \qquad (5\text{-a})$$
$$c_n = \frac{1}{m} \sum_{k=0}^{m-1} f_k \phi_{nk}, \qquad n = 0, 1, 2, \ldots, (m-1) \qquad (7\text{-a})$$
where $f_k$ is the average value of the function in question in the k-th subinterval, $\phi_{nk}$ is the value of $\phi_n$, the n-th Walsh function, in the k-th subinterval, and m is the total number of subintervals.
To illustrate the use of the discrete formula, let us evaluate the Walsh series again for the ramp function in its tabulated form. Given

k      0     1     2     3
f_k    1/8   3/8   5/8   7/8

The corresponding graphical form is shown in Fig. 3.
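Equation (7-a) in its expanded form for m=4 is as follows,
$$\begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{bmatrix} = \frac{1}{4} \begin{bmatrix} \phi_{00} & \phi_{01} & \phi_{02} & \phi_{03} \\ \phi_{10} & \phi_{11} & \phi_{12} & \phi_{13} \\ \phi_{20} & \phi_{21} & \phi_{22} & \phi_{23} \\ \phi_{30} & \phi_{31} & \phi_{32} & \phi_{33} \end{bmatrix} \begin{bmatrix} f_0 \\ f_1 \\ f_2 \\ f_3 \end{bmatrix} \qquad (8)$$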
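Substituting the tabulated data of the ramp function into (8), we have
$$\begin{bmatrix} c_0 \\ c_1 \\ c_2 \\ c_3 \end{bmatrix} = \frac{1}{4} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1/8 \\ 3/8 \\ 5/8 \\ 7/8 \end{bmatrix} \qquad (9\text{-a})$$
$$= \begin{bmatrix} 1/2 \\ -1/4 \\ -1/8 \\ 0 \end{bmatrix} \qquad (9\text{-b})$$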
The square matrix defined in (8) and the numerical values shown in (9-a) are easily recognized from the definition of Walsh functions.
It is seen that the Walsh series of a unit ramp function obtained from the discrete formula, or (9), and that obtained from the analytic formula are, of course, the same, since $f_k$ is exactly equal to the average value of f(t) in the k-th interval.
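As a check on the discrete formula (7-a), the ramp example can be reproduced with exact rational arithmetic. The following is a minimal sketch, not the authors' program: `walsh_matrix` and `walsh_coefficients` are hypothetical helper names, and the Walsh functions are assumed to be ordered as in (2), with m a power of 2.

```python
from fractions import Fraction

def walsh_matrix(m):
    """m x m matrix of phi_n on the k-th of m equal subintervals (ordering of Eq. (2))."""
    def phi(n, k):
        sign, level = 1, 1
        while n:
            # r_level is -1 on subinterval k when floor(2**level * k / m) is odd
            if n & 1 and (k * 2**level // m) % 2:
                sign = -sign
            n >>= 1
            level += 1
        return sign
    return [[phi(n, k) for k in range(m)] for n in range(m)]

def walsh_coefficients(samples):
    """Discrete formula (7-a): c_n = (1/m) * sum_k f_k * phi_nk."""
    m = len(samples)
    W = walsh_matrix(m)
    return [sum(W[n][k] * samples[k] for k in range(m)) / m for n in range(m)]

# Average values of f(t) = t on the four quarter-intervals, kept as exact fractions
ramp = [Fraction(2 * k + 1, 8) for k in range(4)]
coeffs = walsh_coefficients(ramp)
print(coeffs)
```

With the tabulated ramp data the coefficients agree with (9-b). Passing `Fraction` samples keeps the result exact; plain integers or floats would give floating-point coefficients instead.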
Eq. (8) can be written in a general compact form,
$$\mathbf{c} = \frac{1}{m} W \mathbf{f} \qquad (10)$$
where W is called the Walsh matrix.
Operational Matrix
In the previous section, we showed that the ramp function can be expressed by a Walsh series, or
$$t \approx \tfrac{1}{2}\phi_0(t) - \tfrac{1}{4}\phi_1(t) - \tfrac{1}{8}\phi_2(t) + 0\cdot\phi_3(t) \qquad (11)$$
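However, a ramp function can be considered as the first integration of a unit step function, or $\phi_0(t)$. Therefore, we write the following
$$\int_0^t \phi_0(x)\,dx \approx \left[\tfrac{1}{2},\, -\tfrac{1}{4},\, -\tfrac{1}{8},\, 0\right] \Phi_{(4)}(t) \qquad (12)$$
where $\Phi_{(4)}(t) = [\phi_0(t), \phi_1(t), \phi_2(t), \phi_3(t)]'$.
The first integration of $\phi_1(t)$ is a triangular function, and if we expand the triangular function into a Walsh series by using the discrete formula with m=4, we have
$$\int_0^t \phi_1(x)\,dx \approx \left[\tfrac{1}{4},\, 0,\, 0,\, -\tfrac{1}{8}\right] \Phi_{(4)}(t) \qquad (13)$$
Similarly, we can evaluate the Walsh series coefficients of the first integration of $\phi_2(t)$ and $\phi_3(t)$ and easily obtain
$$\int_0^t \phi_2(x)\,dx \approx \left[\tfrac{1}{8},\, 0,\, 0,\, 0\right] \Phi_{(4)}(t) \qquad (14)$$
and
$$\int_0^t \phi_3(x)\,dx \approx \left[0,\, \tfrac{1}{8},\, 0,\, 0\right] \Phi_{(4)}(t) \qquad (15)$$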
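Combining (12) through (15), we have
$$\begin{bmatrix} \int_0^t \phi_0(x)\,dx \\ \int_0^t \phi_1(x)\,dx \\ \int_0^t \phi_2(x)\,dx \\ \int_0^t \phi_3(x)\,dx \end{bmatrix} \approx \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{4} & -\tfrac{1}{8} & 0 \\ \tfrac{1}{4} & 0 & 0 & -\tfrac{1}{8} \\ \tfrac{1}{8} & 0 & 0 & 0 \\ 0 & \tfrac{1}{8} & 0 & 0 \end{bmatrix} \begin{bmatrix} \phi_0 \\ \phi_1 \\ \phi_2 \\ \phi_3 \end{bmatrix} \qquad (16)$$
or in compact form
$$\int_0^t \Phi_{(4)}(x)\,dx \approx P_{(4\times 4)}\,\Phi_{(4)}(t) \qquad (17)$$
$P_{(4\times 4)}$ is called the operational matrix of dimension 4, which relates Walsh functions and their integrations. It is chosen as a square matrix for convenience of calculation.
By use of (17), integration becomes multiplication; therefore, we consider P as an operational matrix.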
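The construction of the operational matrix can be sketched numerically: integrate each Walsh function, average the resulting triangular wave over every subinterval, and expand those averages with the discrete formula (7-a). The code below is an illustrative sketch under these assumptions; `walsh_matrix` and `operational_matrix` are hypothetical names and m is assumed to be a power of 2.

```python
from fractions import Fraction

def walsh_matrix(m):
    """m x m matrix of phi_n on the k-th of m equal subintervals (ordering of Eq. (2))."""
    def phi(n, k):
        sign, level = 1, 1
        while n:
            if n & 1 and (k * 2**level // m) % 2:
                sign = -sign
            n >>= 1
            level += 1
        return sign
    return [[phi(n, k) for k in range(m)] for n in range(m)]

def operational_matrix(m):
    """Row n: Walsh coefficients of the integral of phi_n from 0 to t."""
    W = walsh_matrix(m)
    h = Fraction(1, m)  # width of one subinterval
    P = []
    for n in range(m):
        running = Fraction(0)  # value of the integral at the left edge of subinterval k
        averages = []
        for k in range(m):
            # the integral is linear inside a subinterval, so its average is the midpoint value
            averages.append(running + h * W[n][k] / 2)
            running += h * W[n][k]
        # discrete formula (7-a) applied to the averaged triangular wave
        P.append([sum(a * W[i][k] for k, a in enumerate(averages)) / m
                  for i in range(m)])
    return P

P4 = operational_matrix(4)
for row in P4:
    print([str(x) for x in row])
```

For m = 4 the rows reproduce the coefficient vectors of (12) through (15), that is, the matrix in (16).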
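If we divide the unit interval [0,1) into 8 subintervals instead of 4, and evaluate $\int\phi_0\,dt, \int\phi_1\,dt, \ldots, \int\phi_7\,dt$ by either the analytic method or the discrete formula, we obtain a group of triangular waves as shown in Fig. 4. We then expand the triangular waves into Walsh functions. Consequently, we arrive at the following formula.
$$\begin{bmatrix} \int\phi_0\,dt \\ \int\phi_1\,dt \\ \int\phi_2\,dt \\ \int\phi_3\,dt \\ \int\phi_4\,dt \\ \int\phi_5\,dt \\ \int\phi_6\,dt \\ \int\phi_7\,dt \end{bmatrix} \approx \begin{bmatrix}
\tfrac{1}{2} & -\tfrac{1}{4} & -\tfrac{1}{8} & 0 & -\tfrac{1}{16} & 0 & 0 & 0 \\
\tfrac{1}{4} & 0 & 0 & -\tfrac{1}{8} & 0 & -\tfrac{1}{16} & 0 & 0 \\
\tfrac{1}{8} & 0 & 0 & 0 & 0 & 0 & -\tfrac{1}{16} & 0 \\
0 & \tfrac{1}{8} & 0 & 0 & 0 & 0 & 0 & -\tfrac{1}{16} \\
\tfrac{1}{16} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \tfrac{1}{16} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \tfrac{1}{16} & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \tfrac{1}{16} & 0 & 0 & 0 & 0
\end{bmatrix} \begin{bmatrix} \phi_0 \\ \phi_1 \\ \phi_2 \\ \phi_3 \\ \phi_4 \\ \phi_5 \\ \phi_6 \\ \phi_7 \end{bmatrix} \qquad (18)$$
which is
$$\int_0^t \Phi_{(8)}(x)\,dx \approx P_{(8\times 8)}\,\Phi_{(8)}(t) \qquad (18\text{-a})$$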
It is interesting to note that the upper left corner of $P_{(8\times 8)}$ is exactly $P_{(4\times 4)}$ in (17), the upper right corner and the lower left corner are unit matrices multiplied by $-\tfrac{1}{16}$ and $\tfrac{1}{16}$ respectively, and the lower right corner is simply a null matrix. Following a similar line of reasoning, we can write a general expression for the operational matrix P of order m (which is a positive integer power of 2):
$$P_{(m\times m)} = \begin{bmatrix} P_{(\frac{m}{2}\times\frac{m}{2})} & -\dfrac{1}{2m}\, I_{(\frac{m}{2})} \\[2mm] \dfrac{1}{2m}\, I_{(\frac{m}{2})} & 0_{(\frac{m}{2})} \end{bmatrix} \qquad (19)$$
where the upper left block has the same structure with m replaced by m/2, so that the blocks nest down to the leading element $\tfrac{1}{2}$.
This operational matrix will play an important role in the direct method for solving variational problems.
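The nested block structure of the general operational matrix lends itself to a short recursive construction. The sketch below assumes that the corner blocks of the order-m matrix are unit matrices scaled by ∓1/(2m), as the 4×4 and 8×8 cases indicate, and that the recursion bottoms out at the single element 1/2; the function name `operational_matrix_recursive` is hypothetical.

```python
from fractions import Fraction

def operational_matrix_recursive(m):
    """Order-m operational matrix (m a power of 2) built from its nested blocks."""
    if m == 1:
        return [[Fraction(1, 2)]]
    half = m // 2
    inner = operational_matrix_recursive(half)  # upper left block, same structure
    edge = Fraction(1, 2 * m)                   # scale of the two unit-matrix blocks
    zero = Fraction(0)
    top = [inner[i] + [-edge if i == j else zero for j in range(half)]
           for i in range(half)]
    bottom = [[edge if i == j else zero for j in range(half)] + [zero] * half
              for i in range(half)]
    return top + bottom

P8 = operational_matrix_recursive(8)
for row in P8:
    print([str(x) for x in row])
```

Each doubling of m reuses the previous matrix unchanged as the upper left block, so refining the subdivision only appends the new scaled unit blocks.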
Direct Method for Simplest Variational Problem
The regular method for solving the extremization problem of a functional
$$J = \int_0^1 F(t, x(t), \dot{x}(t))\,dt \qquad (20)$$
is through the Euler equation
$$F_x - \frac{d}{dt} F_{\dot{x}} = 0 \qquad (21)$$
However, the differential equation so obtained can be integrated easily only in exceptional cases. Therefore, many direct methods have been developed. Ritz's and Galerkin's methods are well known [9,10]. This paper uses Walsh functions to establish a direct method for variational problems.
Unlike other direct methods, which start with an assumption on the variable itself, the method we developed starts with the rate variable. In other words, we first assume the rate variable $\dot{x}(t)$ as a Walsh series whose coefficients are to be determined.
$$\dot{x}(t) = \sum_{i=0}^{\infty} c_i \phi_i(t) \qquad (22)$$
Taking a finite number of terms as an approximation, we have
$$\dot{x}(t) \approx c_0\phi_0 + c_1\phi_1 + \cdots + c_{m-1}\phi_{m-1} = \mathbf{c}'\,\Phi(t) \qquad (23)$$
From (17) we know that
$$\int_0^t \Phi(\lambda)\,d\lambda \approx P\,\Phi(t) \qquad (24)$$
Then the variable x(t) can be expressed as
$$x(t) = \int_0^t \dot{x}(\lambda)\,d\lambda + x(0) \approx \mathbf{c}' P\,\Phi(t) + x(0) \qquad (25)$$
The other terms in the functional of (20) are known functions of the independent variable t and can be expanded into Walsh series with known coefficients. Expressing $\dot{x}$, x and t in terms of Walsh functions through substitution, we finally have
$$J = J(c_0, c_1, \ldots, c_{m-1}) \qquad (26)$$
The original extremization of a functional shown in (20) becomes the extremization of a function of a finite set of variables in (26).
Taking partial derivatives of J with respect to $c_i$ and setting them equal to zero, we obtain
$$\frac{\partial J}{\partial c_i} = 0, \qquad i = 0, 1, \ldots, m-1 \qquad (27)$$
Solving for $c_i$ and substituting into (25), we will have the answer.
We note that the above proposed method implies Euler's direct method of finite differences and is similar to Ritz's method using power series and Fourier series; but considering (i) the orthonormal property of Walsh series and (ii) the product property of P shown in (24) and the operational property of P itself, we can claim that the new direct method via Walsh functions is simpler and more powerful than any previous methods.
Let us establish the detailed procedure via several classical problems.
Illustrative Example
It is required to find the extremal of the following functional
$$J = \int_0^1 (\dot{x}^2 + t\,\dot{x})\,dt \qquad (28)$$
and boundary conditions
$$x(0) = 0 \qquad (29\text{-a})$$
$$x(1) = \tfrac{1}{4} \qquad (29\text{-b})$$
This is exercise #7, Ch. 1 of Elsgolc's book [11]. For solving this problem by the Walsh direct method, we assume that
$$\dot{x}(t) = [c_0, c_1, c_2, c_3] \begin{bmatrix} \phi_0 \\ \phi_1 \\ \phi_2 \\ \phi_3 \end{bmatrix} = \mathbf{c}'\,\Phi(t) \qquad (30)$$
Here we let m=4 for clarity in presentation; more accurate results can be obtained by using a larger m.
Since the variable t appears explicitly in (28), we need its Walsh series expansion also. Using (11) directly yields
$$t \approx \left[\tfrac{1}{2},\, -\tfrac{1}{4},\, -\tfrac{1}{8},\, 0\right] \Phi(t) = \mathbf{h}'\,\Phi(t) \qquad (11\text{-a})$$
Substituting (30) and (11-a) into (28), we have
$$J = \int_0^1 \left[\mathbf{c}'\Phi(t)\Phi'(t)\,\mathbf{c} + \mathbf{c}'\Phi(t)\Phi'(t)\,\mathbf{h}\right] dt \qquad (31)$$
However, the vector function $\Phi(t)$ has a particular property due to the orthonormality of Walsh functions,
$$\int_0^1 \Phi(t)\,\Phi'(t)\,dt = I \qquad (32)$$
After integration, (31) simply becomes
$$J = \mathbf{c}'\mathbf{c} + \mathbf{c}'\mathbf{h} \qquad (33)$$
So far, we have not taken the boundary conditions into consideration. For the initial one, we easily see that
$$x(t) = \int_0^t \dot{x}(\lambda)\,d\lambda + x(0) = \mathbf{c}' \int_0^t \Phi(\lambda)\,d\lambda + 0 \qquad (34\text{-a})$$
$$\approx \mathbf{c}' P\,\Phi(t) \qquad (34\text{-b})$$
since x(0) = 0. For the final boundary condition, substituting (29-b) into (34-a) yields
$$x(1) = \mathbf{c}' \int_0^1 \Phi(\lambda)\,d\lambda = \tfrac{1}{4} \qquad (35)$$
It is interesting to note that the definite integral of $\phi_0$ from 0 to 1 is equal to 1, while the definite integrals of $\phi_1$, $\phi_2$ and $\phi_3$ are all equal to zero; or
$$\int_0^1 \phi_0(\lambda)\,d\lambda = 1, \qquad \int_0^1 \phi_i(\lambda)\,d\lambda = 0, \quad i = 1, 2, \ldots \qquad (36)$$
Substituting (36) into (35) simply gives
$$c_0 = \tfrac{1}{4} \qquad (37)$$
$c_0$ is thus found with very little effort. This information should also be substituted into (33); we then have
$$J = \tfrac{1}{16} + c_1^2 + c_2^2 + c_3^2 + \tfrac{1}{8} - \tfrac{1}{4}c_1 - \tfrac{1}{8}c_2$$
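For extremization, we take the partial derivatives of J with respect to $c_i$, i = 1, 2, 3, and set them equal to zero:
$$\partial J/\partial c_1 = 0:\; 2c_1 - \tfrac{1}{4} = 0, \qquad \partial J/\partial c_2 = 0:\; 2c_2 - \tfrac{1}{8} = 0, \qquad \partial J/\partial c_3 = 0:\; 2c_3 = 0$$
or
$$c_1 = \tfrac{1}{8}, \qquad c_2 = \tfrac{1}{16}, \qquad c_3 = 0 \qquad (38)$$
Therefore,
$$\dot{x}(t) = \tfrac{1}{4}\phi_0(t) + \tfrac{1}{8}\phi_1(t) + \tfrac{1}{16}\phi_2(t) \qquad (39)$$
And x(t) is obtained from (34-b),
$$x(t) = \mathbf{c}' P\,\Phi(t) = \tfrac{21}{128}\phi_0(t) - \tfrac{1}{16}\phi_1(t) - \tfrac{1}{32}\phi_2(t) - \tfrac{1}{64}\phi_3(t) \qquad (40)$$
If the Euler equation is used for the analytic solution, the answers should be
$$\dot{x}(t) = \tfrac{1}{2}(1 - t) \qquad (39\text{-a})$$
$$x(t) = \tfrac{1}{2}\,t\,(1 - \tfrac{1}{2}t) \qquad (40\text{-a})$$
respectively. The graphical comparison of the solutions via Euler's analytic method and via Walsh's direct method is shown in Fig. 5. It is seen that even with m=4, the Walsh direct method is quite satisfactory.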
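The steps of this example can be traced with exact rational arithmetic. The sketch below is illustrative only: it assumes the block-recursive form of the operational matrix, takes the Walsh coefficients h of t from the first row of P (since t is the integral of $\phi_0$), fixes $c_0 = 1/4$ from the end condition, and sets the remaining derivatives $2c_i + h_i$ of $J = \mathbf{c}'\mathbf{c} + \mathbf{c}'\mathbf{h}$ to zero. The names `operational_matrix` and `solve_example` are hypothetical.

```python
from fractions import Fraction

def operational_matrix(m):
    """Order-m operational matrix (m a power of 2), built block-recursively."""
    if m == 1:
        return [[Fraction(1, 2)]]
    half = m // 2
    inner = operational_matrix(half)
    edge, zero = Fraction(1, 2 * m), Fraction(0)
    top = [inner[i] + [-edge if i == j else zero for j in range(half)]
           for i in range(half)]
    bottom = [[edge if i == j else zero for j in range(half)] + [zero] * half
              for i in range(half)]
    return top + bottom

def solve_example(m=4):
    P = operational_matrix(m)
    h = P[0]  # Walsh coefficients of t, because t is the integral of phi_0
    # c_0 = 1/4 from x(1) = 1/4; for i >= 1, dJ/dc_i = 2*c_i + h_i = 0
    c = [Fraction(1, 4)] + [-h_i / 2 for h_i in h[1:]]
    # Walsh coefficients of x(t) = c' P Phi(t), using x(0) = 0
    x = [sum(c[i] * P[i][j] for i in range(m)) for j in range(m)]
    return c, x

c, x = solve_example(4)
print("rate coefficients:", [str(v) for v in c])
print("x coefficients:   ", [str(v) for v in x])
```

Calling `solve_example(8)` or a larger power of 2 refines the staircase approximation without changing the procedure.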
Application to a Heat Conduction Problem
Consider the following functional extremization problem:
$$J = \int_0^1 \left[\tfrac{1}{2}\dot{y}^2 - y\,g(x)\right] dx = \int_0^1 F(x, y, \dot{y})\,dx \qquad (41)$$
where x is the independent variable and y the dependent variable. The function g(x) is defined as
$$g(x) = \begin{cases} -1 & \text{for } 0 \le x \le \tfrac{1}{4} \text{ and } \tfrac{1}{2} < x \le 1 \\ \;\;3 & \text{for } \tfrac{1}{4} < x \le \tfrac{1}{2} \end{cases} \qquad (42)$$
The values of y(0) and y(1) are unspecified. This is a functional extremization problem with moving boundaries at both ends. Using classical variational analysis, we can find the two conditions from $F(x, y, \dot{y})$:
$$F_{\dot{y}}\big|_{x=0} = 0 \;\text{ implies }\; \dot{y}(0) = 0, \qquad F_{\dot{y}}\big|_{x=1} = 0 \;\text{ implies }\; \dot{y}(1) = 0 \qquad (43)$$
Schechter [12] gave a physical meaning to this problem by assigning y(x) to be the temperature in a solid slab, g(x) the rate of internal heat generation per unit volume, and x the space variable, as shown in Fig. 6. It is noted that g(x) satisfies the following:
$$\int_0^1 g(x)\,dx = 0 \qquad (44)$$
which means that there is a volume weighted equality of sources and sinks of thermal energy.
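What we are interested in is applying the Walsh direct method to solve this functional problem.
First of all, we assume $\dot{y}(x)$ as a Walsh series with 8 terms.
$$\dot{y}(x) = \sum_{i=0}^{\infty} c_i\phi_i \approx c_0\phi_0 + c_1\phi_1 + c_2\phi_2 + c_3\phi_3 + c_4\phi_4 + c_5\phi_5 + c_6\phi_6 + c_7\phi_7 = \mathbf{c}'\,\Phi(x) \qquad (45)$$
Integrating $\dot{y}(x)$ and using the operational matrix P, we obtain
$$y(x) = \int_0^x \dot{y}(\lambda)\,d\lambda + y(0) = \mathbf{c}' \int_0^x \Phi(\lambda)\,d\lambda + y(0) \approx \mathbf{c}' P\,\Phi(x) + y(0) \qquad (46)$$
The Walsh series expansion of g(x) can be shown to be
$$g(x) = [0,\, 1,\, -1,\, -1,\, 0,\, 0,\, 0,\, 0]\,\Phi(x) = \mathbf{h}'\,\Phi(x) \qquad (47)$$
Substituting (45), (46) and (47) into (41) and applying the orthonormal property of $\Phi(x)$, we have
$$J = \int_0^1 \left[\tfrac{1}{2}\mathbf{c}'\Phi(x)\Phi'(x)\,\mathbf{c} - \mathbf{c}' P\,\Phi(x)\Phi'(x)\,\mathbf{h} - y(0)\,g(x)\right] dx = \tfrac{1}{2}\mathbf{c}'\mathbf{c} - \mathbf{c}' P\,\mathbf{h} \qquad (48)$$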
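The boundary conditions of (43) can be expressed in terms of Walsh functions,
$$\mathbf{c}'\,\Phi(0) = 0, \qquad \mathbf{c}'\,\Phi(1) = 0 \qquad (49)$$
In minimizing J with respect to $\mathbf{c}$ subject to the constraint (49), we apply Lagrange's technique with two multipliers $\lambda_1$ and $\lambda_2$ and define
$$J^* = J + \lambda_1\mathbf{c}'\Phi(0) + \lambda_2\mathbf{c}'\Phi(1) = \tfrac{1}{2}\mathbf{c}'\mathbf{c} - \mathbf{c}' P\,\mathbf{h} + \lambda_1\mathbf{c}'\Phi(0) + \lambda_2\mathbf{c}'\Phi(1) \qquad (50)$$
where "*" denotes the auxiliary function. Setting the partial derivatives of $J^*$ with respect to $\mathbf{c}$ equal to zero, we have
$$\frac{\partial J^*}{\partial \mathbf{c}} = \mathbf{c} - P\,\mathbf{h} + \lambda_1\Phi(0) + \lambda_2\Phi(1) = 0 \qquad (51)$$
Equations (49) and (51) may be combined:
$$\begin{bmatrix} I_{8\times 8} & \Phi(0) & \Phi(1) \\ \Phi'(0) & 0 & 0 \\ \Phi'(1) & 0 & 0 \end{bmatrix} \begin{bmatrix} \mathbf{c} \\ \lambda_1 \\ \lambda_2 \end{bmatrix} = \begin{bmatrix} P\,\mathbf{h} \\ 0 \\ 0 \end{bmatrix} \qquad (52)$$
Solving (52), we have
$$\mathbf{c}' = \tfrac{1}{64}\,[-8,\, 7,\, -1,\, 8,\, -1,\, 4,\, -4,\, -5], \qquad \lambda_1 = \tfrac{1}{128}, \qquad \lambda_2 = -\tfrac{1}{128} \qquad (53)$$
Therefore,
$$y(x) = \mathbf{c}' P\,\Phi(x) + y(0) = \tfrac{1}{1024}\left[-39\phi_0 + 52\phi_1 + 12\phi_2 - 19\phi_3 + 8\phi_4 - 7\phi_5 + \phi_6 - 8\phi_7\right] + y(0) \qquad (54)$$
The Walsh series solution curve obtained from the above equation and the actual solution are drawn in Fig. 7.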
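The constrained problem can be worked through with exact fractions as a check. The sketch below rests on several assumptions that are not spelled out in the text: g(x) is taken as −1, 3, −1 on [0, 1/4], (1/4, 1/2], (1/2, 1]; Φ(0) and Φ(1) are taken as the values of the Walsh functions on the first and last subintervals; and, because the columns of the Walsh matrix are mutually orthogonal with squared norm m, the two multipliers are obtained in closed form instead of by a general linear solve. All function names are hypothetical.

```python
from fractions import Fraction

def walsh_matrix(m):
    """m x m matrix of phi_n on the k-th of m equal subintervals (ordering of Eq. (2))."""
    def phi(n, k):
        sign, level = 1, 1
        while n:
            if n & 1 and (k * 2**level // m) % 2:
                sign = -sign
            n >>= 1
            level += 1
        return sign
    return [[phi(n, k) for k in range(m)] for n in range(m)]

def operational_matrix(m):
    """Order-m operational matrix (m a power of 2), built block-recursively."""
    if m == 1:
        return [[Fraction(1, 2)]]
    half = m // 2
    inner = operational_matrix(half)
    edge, zero = Fraction(1, 2 * m), Fraction(0)
    top = [inner[i] + [-edge if i == j else zero for j in range(half)]
           for i in range(half)]
    bottom = [[edge if i == j else zero for j in range(half)] + [zero] * half
              for i in range(half)]
    return top + bottom

m = 8
W, P = walsh_matrix(m), operational_matrix(m)
g = [-1, -1, 3, 3, -1, -1, -1, -1]            # assumed g(x) on the eight subintervals
h = [Fraction(sum(W[n][k] * g[k] for k in range(m)), m) for n in range(m)]
Ph = [sum(P[i][j] * h[j] for j in range(m)) for i in range(m)]
phi_0 = [W[n][0] for n in range(m)]           # Phi(0): values on the first subinterval
phi_1 = [W[n][m - 1] for n in range(m)]       # Phi(1): values on the last subinterval
# Closed-form multipliers, valid because Phi(0)'Phi(0) = Phi(1)'Phi(1) = m and Phi(0)'Phi(1) = 0
lam1 = sum(a * b for a, b in zip(phi_0, Ph)) / m
lam2 = sum(a * b for a, b in zip(phi_1, Ph)) / m
c = [Ph[n] - lam1 * phi_0[n] - lam2 * phi_1[n] for n in range(m)]
# Walsh coefficients of y(x) - y(0) = c' P Phi(x)
y = [sum(c[i] * P[i][j] for i in range(m)) for j in range(m)]
print([str(v * 64) for v in c])
print([str(v * 1024) for v in y])
```

The constant y(0) is left out because the functional does not determine it: g(x) integrates to zero over the slab, so adding a constant to y leaves J unchanged.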
Conclusions
After briefly reviewing the Walsh series techniques, we form an operational matrix for performing integrations in Walsh functions analysis. A direct method of variation is established by using Walsh series. A simplest extremization problem and a variational problem concerning heat conduction are completely solved step by step by the new proposed procedure. It is believed that the approach is more powerful than Ritz's or Euler's direct methods for solving variational problems.
References
1. H. Rademacher, Einige Sätze über Reihen von allgemeinen Orthogonalfunktionen, Math. Ann., Vol. 87, 1922, 112-138.
2. J. L. Walsh, A Closed Set of Orthogonal Functions, Am. J. Math., Vol. 45, 1923, 5-24.
3. J. D. Lee, Review of Recent Work on Applications of Walsh Functions in Communications, Proc. Walsh Function Symp., Nav. Res. Labs., Washington, D.C., 1970, 26-35.
4. H. F. Harmuth, Application of Walsh Functions in Communications, IEEE Spectrum, Nov. 1969, 82-91.
5. J. E. Gibbs and H. A. Gebbie, Application of Walsh Functions to Transform Spectroscopy, Nature, Vol. 224, Dec. 1969, 1012-1013.
6. C. W. Thomas and A. J. Welch, Heart Rate Representation Using Walsh Functions, Proc. Walsh Function Symp., Nav. Res. Labs., Washington, D.C., 1972, 154-158.
7. F. Pichler, Walsh Functions and Optimal Linear Systems, Proc. Walsh Function Symp., Nav. Res. Labs., Washington, D.C., 1970, 17-22.
8. M. S. Corrington, Solution of Differential and Integral Equations with Walsh Functions, IEEE Trans. on Circuit Theory, CT-20, No. 5, Sept. 1973, 470-475.
9. I. M. Gelfand and S. V. Fomin, Calculus of Variations, Prentice-Hall, 1963.
10. C. D. Brewster, Approximate Methods of Higher Analysis, Interscience Publishers, Inc., New York, 1958.
11. L. E. Elsgolc, Calculus of Variations, London, Pergamon Press, 1961.
12. R. S. Schechter, The Variational Method in Engineering, New York: McGraw-Hill Co., 1967, pp. 23-24.
13. C. P. Neuman and A. Sen, "A Suboptimal Control Algorithm for Constrained Problems Using Cubic Splines," Automatica, 9, Sept. 1973, pp. 67-69.
14. J. W. Manz, "A Sequency-Ordered Fast Walsh Transform," IEEE Trans. on Audio and Electroacoustics, Vol. AU-20, No. 3, Aug. 1972, pp. 204-205.
Fig. 1 Rademacher Functions
Fig. 2 Walsh Functions
Fig. 3 Ramp Function
Fig. 4 Walsh Functions and their First Integrations
Fig. 5 Solutions of Extremal Problem with Fixed Boundary Conditions
Fig. 6 Heat Conduction in a Solid Slab
Fig. 7 Solutions of Heat Conduction Example
A COMPLEX FORM OF THE GENERALIZED FOURIER SERIES AND TRANSFORMS
Dan A. Ciulin
Polytechnical Institute of Bucharest
Abstract
Starting from a real function which satisfies the Dirichlet conditions for developing in a Fourier series and transform, and utilising the properties of the analytical signals, an algorithm to generate an orthogonal set of functions is presented. A relation which represents in a complex form the orthogonal sets of functions is obtained directly. A particular case of this relation is the well-known set $e^{jn\omega_0 t}$. By means of these relations the generalized frequency (sequency) is defined and a complex form for generalized Fourier/Laplace series and transforms is deduced. As applications, relations for the generalized Dirac pulse, the periodical generalized Dirac pulse and a generalized form of the convolution theorem, in which the nature of the Fourier transform appears evident, are deduced. The application of the relations presented in the paper leads, through other ways, to the results known in the speciality field.
1. INTRODUCTION
Relations for representation of the signals, [1], [14], by means of generalized Fourier series and transforms are known. These relations are similar to the real forms of the Fourier series and transforms. Sometimes this representation can assume a complex form too, [2]. The utilization of such a representation is advantageous only when the orthogonal set of functions regarding which the developing takes place is a priori known (as for example the Walsh functions, Legendre polynomials etc.). If the set of functions is to be deduced starting from a given real function through an orthogonalizing method, [3], [14], [16], then, because of the recurrence procedure used, a large amount of calculation takes place.
2. SETS OF ORTHOGONAL FUNCTIONS
The well-known orthogonal set of functions $e^{jn\omega_0 t}$ can be written in the form
$$e^{jn\omega_0 t} = \left[e^{j\omega_0 t}\right]^n = \left[\cos\omega_0 t + j\sin\omega_0 t\right]^n = \left[\cos\omega_0 t - j\,\mathcal{H}(\cos\omega_0 t)\right]^n \qquad (1)$$
By replacing the analytical signal $e^{j\omega_0 t}$ by another analytical signal $z_T(t)$ whose real part
(2)
satisfies the Dirichlet conditions and which is either a periodical function with the period T, or a function defined on a finite interval [0,T], another set of functions is obtained, [4]:
[Zr{t)j"/'o = br (t)-iJf[1r(t J)j n/,o . Zr-3rltJ] np.,
=[ V~:(t)+ lt~(tJ] e-JM3~ ~(t) J =einPo['f{t)-i'f(tJ] (3)
rel .. tion it~ \\hich v,e not~d
'1m = - alta tl 'K [MO] d (J 1r{t) (4)
t(i)== exp[jln(1;(f)+'J{2[1r{tJj)] (~)
and n = 1, 2, 3, ...
$p_0$ = a real number
The orthogonal set of functions $[z_T(t)]^{np_0}$ obtained as above satisfies the conditions of orthogonality in accordance with the relations:
f fRe[z;~t)] Jm[r;p-(t)]dt =0 o
r
.if]m[~np'(tJlJm[i!T"'P.{ilJdi =[~rI. for n=m T T 'J 0 ';:0 I' nRn
o
(6)
(7)
( 8)
,Froof: npo 11
Since o3i tLer the i'..mctA~n "(e [ZT (tu
01' ";hC' r;Ulction Jm[ZT 0 (t)] s'itisfies
tl:E' -:11 rich]."'t r.oYJd1 tions by developine, in
'j, ~'ol.,d<2.c 8e1'ie s, [5Jwe c'm wri te:
" IT T Re[lTn'~{t)J]m.[zrnp~{t)]di;. .. 0, T - jk~t .... J"H4Jt (9)
= -1.1 LC(kw.}e !.$(Jw,,)1r iv/oe ]tit T 0 At: • .,. _ L:-oO
in which w.=2JI 'lnd T is the P2L'iod or T
the dc>fjr.j!1;~ interV':l.l of the sign':il Re,
1m [z~po (il)
Into,rc!J;;U1[SL!b the order of tile Sel:ll 'md
S:.1;'1 3.Ild integ.c'ition oper''ttior:.s we obt:dn:
where ((oj = A(·),. i B(.)
To prove t,l"le relation (8) the same method
as for the proof of the relation (7) can
be Ll.tllized. Con8eqLl.ently, the relation
(3)can be considered as a relation repre
senting any orthogonal set of fQnct10ns
which s3.tisfies the Dirichlet conditions
for developing in a ]'oQrier series. Choo
sing a partic'J.lar form for functions 'f(.)
'3.lld r (.) from the m<:l.in rel·1.tion (3) a re
lation to represente a given orthogon,'ll
set of fQnctions will be obtained. Thus,
~or "'(t): 0 and 'fIt) : t the set of l'u.nction; ejnwo t will be obtained.
The number p has a meaning simil'ir to , 0
the number wo' If the an'lli ti c signal
ZIt) is periodic'll with the pertod T we
can wri te : '[ . J Po JPo ftl) -I 't(t)
l.r (tl = e POf
J~['f{t .. tT)-i't{t40nrJ] == z., t+nT): e
(13 )
n=o,1,2, •••
According to the rel'':I.tion (13) there re-; j;e[zT"P-(fJ] Jm[z/P"(t)dt ]dt=
o .... 2 :: -/.L /({iw.J! ~~rziwo :: 0
8-1.1 ts th'it either the f;.lnction of modulus (10) ePo''-f (t) and the fi.l.nction of phase
ejPo'f(t) 'ire periodical fll.nctions (some
an'lly-times tl'ley cau be const.'illts) Because fmc
:E'OLl.- tions e j (.) is periodicoU too there resill ts t
L =·00
~\,c two: ,Jrodlf of the r;~l':ition (7) the
ti c ,1 c1J.l zT (t) is det'eloped in :3.
rjec se.:.':'.es
(ll ) or
49
po 'f(T) " 27i 2Ji po = 'fiT} (14 )
a rel'ltion which represents ~ generaliza
tion of the rel<:ltion
(15 )
for 'f(t) I t
By <:In'llogy with the meqning of the fre-w, Wo
quency given for t~e number 21 (n;; fo)
the number s;; ----2-2
"" has bee'Ul c9.11ed in o JI-
[1] sequency (gener"lli zed freque~cy). an.p. [f(t)-j 'f'{tO The orthogonal set of functions e
can be nonnalised. Because, [4] ,[5J
n~r(t} 1/ :H'K[~{t)JII:: +{r~2(t}dt o
(17)
The ;malitic nOl'm'ilised signal z.(t) will
be gi ven by reI'). ti on :
Z,.,{i}::: ~(tJ _ ' 'R [M+J) 1/ '3r (t)U J 1/ dl [~7(t)JQ (18)
=~(t)-aK[1r{t)J _ ldtl 1117 (t}f - 1/1rft}Q
If function sT(t) is known the'Ul'1l1 tic
dgnal zT(t) oan be c'llculated. ]\mction
t[sT(t)J is calcJ.bted either by utilj ~dng t,he rel'ltion -}( [~(t)] = II! f ~(t) dr;
// t - (; - 0-
or, for the periodi c'il flJ.nction
utilising the relation,[6] T
'J([~r{t)] :: ; J 1r(r;}df[J,(t-~>]dr; o
T
:: 2~ J :1T tr;)cta JiG d~ o d T
(19 )
s.r (t j, by
(20 )
or by '-ltiliSing the FouMer series. If
~(t) = ZAn co;, 27i nt 1" Bn din?!! nt (21 ) I'l T T
then
3. GENERALIZED FOURIER SERIES
3.1. REPRESENTATION OF GENERALIZED FOURIER SERIES
The relations for representation of a function g(t), periodical or defined on an interval [0,T], which satisfies the Dirichlet conditions, by means of a generalized Fourier series, [1], [7], are:
(23)
where
T
G" =f J9r{t){'(t}dt o
(?4 )
and
97 (tJ = ~r (t+lT) (25)
In tllese rel'ttions the complex conjll,;'1.te
of tlie fmction fk (t) has been noted f: (t) ~1.l: d tne 0 L'tho bornl se t 0 f Llll C ti on s
reg~trding whi eh the d"'velopint; in tile Ge
nerqlized Fourier seriAS is done W':J.S no-te d fk (t ). rl.!'i ting
fl( (the iKpo['f(i)- i'f'(tJ] (26)
we obtqin rel~tions
n:-()I>
and respectively
u' The coefficients G(nppi 'tre complex mlmbers
b) In the reLt,tiona (27' md (28) the valu.e for n is to be tOtk"n b t
,'- e lVeen :!.:. 00. By non si derj ng th:? r,? l:t ti on s (9) (10) 'md (12) we e'll1 ob"'f!'" -/-, , J
'-'--ve uil it +'l" o rtl10 gon ''t11 ty j s kent fa t' th'" ": -
• ' w nee~tlve
v'll~e of n too.
c) ... tel:ition (27) represent tile p.ign3.1s
gT(t) eitller if ePo *(t),. 0 or if ePol\'(t)
V''tr.ifl':(~S only for '1 n'l'nb"r nf points of
"zero me:'ts'lre".
d) ;{el!tion (27) Cin be dedloed from re
Ll.tion (29) by developil1b in i /oiuier
series of tl:~ sign'-ll gT(t)
~ lkWpt: Q (t) = L- G., (KuI,,) e ( ?9 ) tfT" l(
by tHe re':1rl"tne;ing ':l.nd regro;']'ping of the
ter'its. In tLi s w"lY if one cansi ders the
develoojng in '1. FO,lorier series oJ each
ter'lJ of the orthogo'1<tl 8Pt of f'moti ons e info [ f(i) - it(t}] %
- 'f27it (30) ;: L Cn {r2Ji}e J T r." T
'Uld by replacing in rel'ltion (27) tLere
res;llts:
a reL'.Ltion from whi ch we c'm dedlce rp-lCl
t ion (?9) thrOll.gh ''1 ra'lrr'mging 'md re
gro:J.pin,~ of the t~rms.
e) l'he rel'itions (28) 'md (31) fihow th'3.t , . anp.['f(t) -i.,(tJ]
tLeortr;ouo1n1 set of j,mctlonse
i s compl e te •
f) rlelqtion (27) c'm be rew,J:'itten in
'~fiotHer form, n'lmely,
:r jnp.C'fttJ-jtMJ -f" jnp .. ['f/tJ-jtllll ar{t) = LG(np.}e tGlo) + Li{np,Je
11:-00' n: 4
;p np.tM -nf. t{l;) = Ao'" 11[6{J1p,)e + 6{-np,)e ] c.tr.)rlfo itt) +
+ i[G{np .. )enp.tlt~ 6l_np,)enpot{-t14irz nt. i(t J] 00'
= Ao'" I A{t,nfi)cron/'Di{t) T'5{~rtfD)~"rznp.'f{I:)
(31 )
in whioh
51
T
AD = f 1 ~T{t)dt (32) o
... {elation (31) is tLe ger.er':l.liz3d form of
relation:
0"
Q ttl: AD T LAn wn,v.t T 8n.1mnUlot (33) dT n:"
TIle '-1p'~ri();Lic'il fmotions C'H' b8 conside
red to proceed iro:n the 8~~.i odi.c,:tl func
tions by "l.); 'Ul1:im.;t~d :incr'~'jse of the pe
riod? B'lt,:in -I,hi" 0,\8', ':iccorrting to
r~l'ltjon (14) po n'lst be 0 djfferenti:'tl
'(ll;o'~nt dp c--:.nd til>? d1 f3C:"3t2 variable
will be considerecl'.lSi cOI!tinous one.
ilnd8r tbe Sf' condi ti on s, by '~tili zing re
I ':I. ti on (;;' 8) we will 0 b bin:
&;m'"8 'f{~i)6IP}:: 7-.00 = f3(t)e-if['f{t)+i'rCtJk-
-011 relqtion in which G (p) repr8sents
seq,lency 0;):;otr:il d"lnsity f:,motion
(35 )
(36 )
the
:irg
f(') represButs tLe inverse of fllllction
'f( • )
(37 )
.!!lCi g(t) the aperiodic,:tl 81..)1::.1.1.
Under tue s",:ne ccndi ti ons flU 011 rel3.ti on
(?7) :h>;>re r'3S'llts:
cw in.,. ['f(t)- j't(tll . 3{t) = linn L G(n.~)e _
T~OI> n:-....
0- - j,,~. ['({t)-tNt)] 21
:t.m. L ~ .,(~ )6(nr-)e Zi,,~/;(g)(38) T..,,,. ft.·eo It
.... t ,;; / ip['f(t)·i'l'ltJ.!h. F 1('1) :: z7i J .. l.7td'Je -r
if the linli t
t . _2.;;..'ii~ ___ "Pl. • ::"
T""" p.""''f{~) (39) :: (~m. ,'f(T) - I,m..!!!!.) T.... Art 'ft'ffTJ] - T-,OI' T
exi st s rmd is eq'.nl wi th k. It :i s known
th~t for ~t)::. t 'md Y;(t) = 0 the limit
exi sts '~'ld is equal wi th 1.
Considering the relation (38) 'lnd repla
cing Gl (p) by Gl
(p) tfp (p) ("p (p) repre
sents the periodical D~rac pulsg with Po
3.S period) there resJ.lts that,[8j
[
0- ip['f(~)-i't(t}J hrltJ ::2~ __ 4 (PJI,. (/') e dp
- 000.., il' ['f{t)-i'l'(t)] ::"2; /4{p)L/(p-np.)e df (10)
_ De 71=-000
Changing the order of sum and integration
we will obtain
ZOO' in,. r'f{tJ-i'l'(tJ)
Ir.,{t):: 64 1",.1 e (41 ) ". -01'
a relation of the sa.me form as relation (27) • In a formal way, the fact tllQt.t the li:n:i t
T'3.king in to ""ccoun t reI 'i tion (27) the pe
riodi c'il gener'lli zed Dir:~c pulse C'U'l be
defined
('f) [00 info ('f(t)- i'l'ltJ) b (t) = e (45) p.
n: -01'
Tne relations (36),(38) ~d (44) &llow a
gener!.tlized theorem of convolution to be
dedClce<i. By considering the sic;nals
i I'";. ;p['f{tJ-tNt)] ~(t) ;' 27 J{I'} e c¥
-00>
(46 )
fO" ,,['f{tJ-j't(t)]
A{t) = ~ H(f} e rlf 2n
(47)
-00
we c~n ded'~ce
I ... 4, [,,{t)- i'l'{t)j 3(t): 2jj J '5'(p)J.I(pJe (J rip
jro; ~P['f(tJ-j'H.JJ -i~["{r)tJ'f'{r)lip[yI4A.)"''''MJ :2~jjJ~{~)h{u)e e e. J'."P .
-..,.
or exists has been j.tstifii:ld in relltion (3~).
In the following, as in the case of the
1ol1rier transform, we will consider k=l.
By means of the fo.£'m;"l change of variable
in relation (38):
31'::S : v+a· w
we can obtain reVil.tion for tile general1zed
Lapl~ce transfona
I f :1 ['1ft) -i'l-{t)) lIt). iF G4(~J e tid
III (421
and
5.APPLICA!IO.
By means of r.l,~,n (38) the generalized D:l.rao pul.e S (t) we can define
SC'I) , r- JP [,!,tj-i'f{tJj (t).- 2;; __ e iJ.p. (44 )
52,
-g{t): ff#~}h(u.)J("itl r,u)d~ tkt (48) --
For !f(t)=. t ~d 'f(t)=.O there results
Ji'f}tru)J =. J(t-,,-u.l (49) I' '({.,.t /
'f'{o) so .
The relation (48) represents the generalized convolution theorem. The function $\delta(t, \tau, u)$ depends only on the functions $\varphi(\cdot)$ and $\psi(\cdot)$ which define the generalized Fourier transform.
The relations (36), (38), (27) and (28) allow the deducing of a sampling theorem for a signal with limited time duration and limited sequency spectrum, a sampling theorem with "variable step" and a quantising theorem, [9], with results similar to those presented in [1], [10].
The same relations allowed us to deduce mathematical relations for defining a probability density function for a given physical system [11]. A main application consists in the analysis and synthesis of systems with "time variable" parameters, [1], [2], [4], [12].
6. CONCLUSION
In the paper an algorithm for direct generation of an orthogonal set of functions, starting from a real function which satisfies the Dirichlet conditions, is presented. A particular case of this algorithm is the generation of the well-known set $e^{jn\omega_0 t}$ starting from the function $\cos\omega_0 t$. The relations obtained permit the generalisation of the frequency meaning, by introducing the meaning of sequency for the generalised frequency, [1]. By utilising the relations thus obtained, relations for the generalized Fourier series and transform and even for the generalized Laplace transform are deduced. These relations have been utilised for defining the generalized Dirac pulse and the periodical generalized Dirac pulse. A generalized form of the convolution theorem, in which the nature of the utilised Fourier transform appears in a direct way, is deduced too.
The relations deduced have been utilised in the specialized literature for applications, [9], [11], [12]. Results like those presented in [1], [10] have been obtained, [9].
The main application of the presented relations consists of the analysis and synthesis of systems with time variable parameters [4], [12].
REFERENCES
[1] H. F. Harmuth, A Generalized Concept of Frequency and Some Applications. IEEE Transactions on Information Theory, vol. IT-14, no. 3, 1968.
[2] D. W. Lamprecht, W. Schmidt, Approximation of Butterworth and Cauer Filters by the Use of Orthonormalized RC Functions. Telecommunication Conference, Bucharest, 1972.
[3] D. C. Ross, Orthonormal Exponentials. IEEE Transactions on Communication Electronics, March 1964.
[4] D. Ciulin, The Analysis of Time-Varying Systems. Polytechnical Institute of Bucharest, December 1973.
[5] V. Cizek, Analytic Signal and Some of its Applications. Symposium Summer School on Circuit Theory, Prague, 1968.
[6] V. Cizek, Discrete Hilbert Transform. IEEE Transactions on Audio and Electroacoustics, vol. AU-18, no. 4, 1970.
[7] A. Zygmund, Trigonometric Series. Cambridge at the University Press, 1959.
[8] D. Ciulin, Signal Representations by Methods of Functional Analysis. Telecommunication Review, Bucharest, no. 5, 1972.
[9] D. Ciulin, Description of the Sampling Signals Using Functional Analysis. Telecommunication Review, no. 12, Bucharest, 1972.
[10] I. Kluvanek, Sampling Theorem in Abstract Harmonic Analysis. Matematicko-fyzikalny Casopis Sloven. Akad. Vied 15, 1965.
[11] V. M. Catuneanu, On the Functional Space and Some Implications in the Reliability Theory. Proceedings of the 3rd Symposium on Reliability, Budapest, 1974.
[12] D. Ciulin, Resonant Time-Variant Circuit. Proceedings of the Second International Symposium on Network Theory, Yugoslavia, 1972.
[13] R. W. Newcomb, Operator Theory of Networks. Circuits and Systems, IEEE, 1974.
[14] H. F. Harmuth, Transmission of Information by Orthogonal Functions. Springer-Verlag, Berlin, Heidelberg, New York, 1970.
[15] C. Bösswetter, Signalanalyse und Synthese mit Hilfe orthogonaler Filter. Frequenz, no. 1, 1971.
[16] H. L. Armstrong, On the Representation of Transients by Series of Orthogonal Functions. IRE Transactions on Circuit Theory, no. 4, 1959.
TRIANGULARIZATION OF SOME RESTRICTED SHIFTS*
D. N. Clark University of Georgia
Athens, Georgia
and S. Sickler Eastern Nazarene College
Quincy, Mass.
Abstract
Triangularization of the restricted shift along a chain of vector regular factorizations is obtained, in case the characteristic function is scalar-valued and pseudo-meromorphic.
1. INTRODUCTION
The purpose of this note is to point out how the Darlington synthesis methods of Arov (2) can be used to obtain a new triangularization of restricted shifts in the pseudo-meromorphic case. Our results take place in the scalar case, but perhaps they point to matrix-valued generalizations. Also, results peculiar to the pseudo-meromorphic case may be of interest because of the physical differences in that case.
2. DEFINITIONS
We say we have a triangularization of a bounded operator T if we have a (discrete) representation of T as a triangular matrix or a (continuous) representation of T as a multiplication operator and a Volterra integral operator on some $L^2$ space. In either case, we need a chain (discrete or continuous) of invariant subspaces of T.
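As a finite-dimensional illustration of the discrete case (it is not taken from the paper), the sketch below takes a hypothetical 3×3 matrix T together with a chain of invariant subspaces $M_k = \mathrm{span}(v_1,\dots,v_k)$, orthonormalizes the chain by Gram-Schmidt, and checks that the matrix of T in the resulting basis is upper triangular. The matrix, the vectors and all function names are assumptions chosen for the example.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(T, v):
    return [dot(row, v) for row in T]

def gram_schmidt(vectors):
    """Orthonormalize in order, so that span(q_1..q_k) = span(v_1..v_k)."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

# Hypothetical operator on R^3. By construction T v1 = 2 v1,
# T v2 = v1 + 3 v2, T v3 = v2 + 4 v3, so each M_k is invariant under T.
T = [[1.0, 1.0, 0.0],
     [-1.0, 3.0, 1.0],
     [2.0, -2.0, 5.0]]
chain = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [0.0, 0.0, 1.0]]

Q = gram_schmidt(chain)
# Matrix of T in the adapted orthonormal basis: entry (i, j) = <T q_j, q_i>.
tri = [[dot(matvec(T, Q[j]), Q[i]) for j in range(3)] for i in range(3)]
below_diagonal = [abs(tri[i][j]) for i in range(3) for j in range(i)]
diagonal = [tri[i][i] for i in range(3)]
```

The entries below the diagonal vanish up to rounding because $T q_j$ stays in $M_j$, which is orthogonal to $q_i$ for $i > j$; the diagonal carries the eigenvalues attached to the chain.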
3. HISTORY
For a contraction operator T having a scalar (Sz.-Nagy-Foiaş) characteristic function $\varphi$, triangularizations have been given by Ahern and Clark (1) for the discrete case, and in (1), Kriete (4) and Clark (3) for the continuous case. All these papers use chains of invariant subspaces arising from factorizations $\varphi = \varphi_1\varphi_2$ of the characteristic function into two scalar factors.
But there are other factorizations of $\varphi$ which give rise to invariant subspaces. Indeed, according to Sickler (6), if $|\varphi(e^{it})| < 1$ a.e., there are chains corresponding to factorizations of the form
$$\varphi = (\varphi_{11}\ \ \varphi_{12})\,(\varphi_{21}\ \ \varphi_{22})^{t}$$
where $\varphi_{ij} \in H^\infty$ with $|\varphi_{i1}|^2 + |\varphi_{i2}|^2 = 1$, $i = 1,2$. We pose the problem of finding triangularizations corresponding to such chains, and we show how to do it in a special case: the case in which $\varphi$ is "noncyclic" or "pseudo-meromorphic". We use a Darlington synthesis method of Arov (2).
*Communicated in written form only.
4. AROV'S METHOD
The pseudo-meromorphic assumption on $\varphi$ amounts to the existence of an inner function $B$ such that $B\bar{\varphi} \in H^2$, or, by Arov (2), the existence of two invariant subspaces $M_1$ and $M_2$ of T such that $M_1 \supset M_2$, $T|M_2$ is isometric, and $T^*|M_1^{\perp}$ is isometric. Let $\Delta(t) = (1 - |\varphi(e^{it})|^2)^{1/2}$ and let $\theta$ be the outer function with $|\theta| = \Delta$. The subspaces $M_1$ and $M_2$ correspond to the factorizations of $\varphi$:
c,o 1 c,o= (1 0) (!Jx'!) , c,o= (c,o ~e) (0)
respectively, where Ii> , ~ are inner.
The result of Arov (2) also asserts that
the characteristic function of the com
pression of T on Ml - M2 is given by
A-C A concrete version of Arov's result is
given as follows.
Lemma. The operator
u- C maps each of the three spaces in the
decomposition
[0 (i) (~) ~]
Ei' [(~ EBH2)eA(~ (+\H2)]
('fl A(O (+)~)
unitarily onto the corresponding space in the decomposition
$$(K \ominus M_1) \oplus (M_1 \ominus M_2) \oplus M_2$$
of the model space K of T.
5. CONCLUSION
Of course triangularization of U*TU can
be obtained, at least if one assumes a
chain of factorizations A = AkAk of A.
For example, one may pick the orthonormal
basis
Ht Zt =A(O,e )
for U*K, and get
1 (TUxj , UYk) = 0lj «0,1) , Ak (0»
o (TUXj , UZ k ) = OJ 1 0kO «0,1) , A (0) (1»
(TUYk' UX t ) = 0
(T~:~ (:~:(-1 ~ a::1 t::] ,A. ((1- a~ .hrJl if t.,k
(TUYk' Uz t )
_ '«,cAk' ( .it ~1 -ak,it) -1), (~))
(TUz t ' UXj ) = (TUz t ' Uy j) = 0
(TUZ t , UZ j ) = °jt+l •
More satisfactory versions of the above can be obtained by using a triangularization of the Sz.-Nagy-Foiaş model of A; see Lubin (5).
REFERENCES
(1) P. R. Ahern and D. N. Clark, On functions orthogonal to invariant subspaces. Acta Math. 124 (1970), 191-204.
(2) D. Z. Arov, Darlington's method for dissipative systems. Soviet Physics Doklady 16 (1972), 954-956.
(3) D. N. Clark, Concrete model theory for a class of operators. J. Functional Analysis 14 (1973), 269-280.
(4) T. L. Kriete, Fourier transforms and chains of inner functions. Duke Math. J. 40 (1973), 131-143.
(5) A. Lubin, Representations of vector valued invariant subspaces and concrete model theory, to appear.
(6) S. Sickler, The invariant subspaces of almost unitary operators. Indiana U. Math. J. 24 (1975), 635-650.
Douglas N. Clark is Associate Professor
at the University of Georgia. He
received the Ph.D. (Mathematics) from
Johns Hopkins in 1967, and was
Assistant Professor at UCLA.
Sheldon Sickler is Assistant Professor
at Eastern Nazarene College. He
received the Ph.D. (Mathematics) at
UCLA in 1973.
FURTHER RESULTS ON THE
ASSOCIATION OF VARIABLES
James Conlan and E. L. Koh
University of Regina
Regina, Saskatchewan
Abstract
This paper summarizes and extends some results on association of variables obtained by us earlier. Specifically, the fractional calculus is used to extend some classical results for Laplace transforms. These results are then applied to some problems in systems analysis.
1. INTRODUCTION
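If $x(t)$ is the input to a non-linear time invariant system, then the output can be represented as a Volterra series of the form $\sum g_n(t)$, where
$$g_n(t) = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} h_n(\tau_1,\dots,\tau_n)\,x(t-\tau_1)\cdots x(t-\tau_n)\,d\tau_1\cdots d\tau_n,$$
see references (1), (2), (12). Here $h_n$ is the impulse response of an n-th order system.
If we let
$$f_n(t_1,\dots,t_n) = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} h_n(\tau_1,\dots,\tau_n)\,x(t_1-\tau_1)\cdots x(t_n-\tau_n)\,d\tau_1\cdots d\tau_n,$$
then upon taking the n-dimensional Laplace transform, $L_n$, of $f_n$, we have
$$L_nf_n = F_n(s_1,\dots,s_n) = H_n(s_1,\dots,s_n)\,X(s_1)\cdots X(s_n),$$
where $H_n = L_nh_n$, and $X = Lx$, where $L = L_1$.
From now on we omit the subscript "n" from the functions $f_n$, $F_n$, and $g_n$. To obtain the output as a function of t, we need to find $L_n^{-1}F = f(t_1,\dots,t_n)$. Upon setting $t_1 = \cdots = t_n = t$, we obtain $g(t) = f(t,\dots,t)$. If we let G(s) be the Laplace transform of g(t) we have the following diagram, see reference (3).
$$\begin{array}{ccc} F(s_1,\dots,s_n) & \xrightarrow{\ L_n^{-1}\ } & f(t_1,\dots,t_n) \\ \downarrow A_n & & \downarrow t_1=\cdots=t_n=t \\ G(s) & \xrightarrow{\ L^{-1}\ } & g(t) \end{array}$$
The function $G = A_nF$ is called the associated transform of the function F.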
2. DERIVATIVE THEOREMS
2.1 FRACTIONAL OPERATORS
In what follows we will make essential use of the concept of fractional integration and differentiation. There are various ways to define these operations, not all of which are equivalent, see reference (11). The particular integrals we will need are the following, see reference (6). The Riemann-Liouville fractional integral:
$$I^{\lambda}\{f(x)\} = \frac{1}{\Gamma(\lambda)}\int_0^x (x-y)^{\lambda-1} f(y)\,dy.$$
The Weyl fractional integral:
$$K^{\lambda}\{f(x)\} = \frac{1}{\Gamma(\lambda)}\int_x^\infty (y-x)^{\lambda-1} f(y)\,dy.$$
Here, and throughout the paper, we assume that $\lambda$ is a complex number with positive real part, $R(\lambda) > 0$. The corresponding fractional derivatives are
$$D_0^{\lambda}f(x) = (d/dx)^k I^{k-\lambda}\{f(x)\}$$
and
$$D_\infty^{\lambda}f(x) = -(d/dx)^k K^{k-\lambda}\{f(x)\},$$
respectively, where k is the integer satisfying $k-1 \le R(\lambda) < k$. Note that in the limiting case where $\lambda = k-1$, we have $D_0^{\lambda}f(x) = D_\infty^{\lambda}f(x) = (d/dx)^{k-1}f(x)$.
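The two fractional integrals can be checked numerically against closed forms. The sketch below is an illustration under stated assumptions: it treats real $\lambda$ only (the paper allows complex $\lambda$), removes the endpoint singularity by the substitution $w = |x-y|^{\lambda}$, and truncates the Weyl integral at a finite cutoff. The function names and the test cases $I^{1/2}\{x\} = \Gamma(2)\,x^{3/2}/\Gamma(5/2)$ and $K^{1/2}\{e^{-x}\} = e^{-x}$ are choices made for the example.

```python
import math

def simpson(g, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def riemann_liouville(f, lam, x):
    # With w = (x - y)**lam, (x - y)**(lam - 1) dy becomes dw / lam.
    integrand = lambda w: f(x - w ** (1.0 / lam))
    return simpson(integrand, 0.0, x ** lam) / (lam * math.gamma(lam))

def weyl(f, lam, x, cutoff=8.0):
    # Same substitution, w = (y - x)**lam; the range is cut at w = cutoff,
    # which assumes f decays fast enough for the tail to be negligible.
    integrand = lambda w: f(x + w ** (1.0 / lam))
    return simpson(integrand, 0.0, cutoff) / (lam * math.gamma(lam))

lam, x = 0.5, 2.0
rl_numeric = riemann_liouville(lambda y: y, lam, x)
rl_exact = math.gamma(2.0) / math.gamma(2.0 + lam) * x ** (1.0 + lam)
weyl_numeric = weyl(lambda y: math.exp(-y), lam, x)
weyl_exact = math.exp(-x)
```

The substitution is what makes a plain quadrature rule usable here: for $\lambda = 1/2$ both integrands become smooth functions of $w$.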
2.2 APPLICATIONS TO SYSTEMS
In what follows it will be convenient to use the following notation. We let $\mathbf{s} = (s_1,\dots,s_n)$, and $\hat{\mathbf{s}}_m = (s_1,\dots,s_{m-1},s_{m+1},\dots,s_n)$. In reference (8), Koh extended a result of Chen and Chiu, reference (3), to obtain the following result.
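Theorem 1. If $F(\mathbf{s}) = (s_m - a)^{-(k+1)}F_1(\hat{\mathbf{s}}_m)$, then $G(s) = \frac{(-1)^k}{k!}(d/ds)^k G_1(s-a)$.
Proof. $g(t) = f(t,\dots,t) = L_n^{-1}F(\mathbf{s})\big|_{\mathbf{t}=(t,\dots,t)} = \frac{t^k e^{at}}{k!}\,g_1(t)$. Upon taking the Laplace transform of both sides, the theorem is proved.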
In references (4) and (9), Conlan and Koh extended the two classical formulas for Laplace transforms
$$L\{t^{-n}f(t)\} = \Big(\int_s^\infty ds\Big)^{n}F(s), \quad\text{and}\quad L\{t^{n}f(t)\} = (-d/ds)^{n}F(s),$$
where n is a positive integer and $F = L\{f\}$, to the case of general n. Specifically, the following results were obtained.
Theorem 2. Assume: (1) $\nu$ is a complex number with $R(\nu) > 0$; (2) f(x) is such that f(x) = 0 for x < 0, and $f(x)e^{-cx}$ is absolutely integrable on $[0,\infty)$ for some c; (3) $x^{-\nu}f(x)$ is absolutely integrable on [0,1]. Then $L\{t^{-\nu}f(t)\} = K^{\nu}L\{f(t)\}$.
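Theorem 3. Under the hypotheses (1), (2) of theorem 2, $L\{t^{\nu}f(t)\} = (-1)^{k-1}D_\infty^{\nu}L\{f(t)\}$.
One can prove theorem 2 by setting
$$K^{\nu}L\{f\} = \frac{1}{\Gamma(\nu)}\int_s^\infty (y-s)^{\nu-1}\Big\{\int_0^\infty f(t)e^{-yt}\,dt\Big\}\,dy,$$
interchanging the order of integration, and using the definition of the gamma function. The proof of theorem 3 follows in a similar fashion from
$$D_\infty^{\nu}L\{f\} = -(d/ds)^k K^{k-\nu}L\{f\}.$$
Using theorem 3, one can obtain the following theorem, the proof of which is similar to that of theorem 1, see reference (4) for details.
Theorem 4. If $F(\mathbf{s}) = (s_r - a)^{-(\nu+1)}F_1(\hat{\mathbf{s}}_r)$, then $G(s) = \frac{(-1)^{k-1}}{\Gamma(\nu+1)}D_\infty^{\nu}G_1(s-a)$.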
3. DISTRIBUTIONS AND
ASSOCIATION OF VARIABLES
3.1 FRACTIONAL OPERATORS AND DISTRIBUTIONS
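The question remains as to possible extensions of theorem 4 to the case where the exponent of the monomial factor of F has positive real part. To answer this question it is useful to apply fractional operators to distributions. One can follow Erdelyi and McBride, reference (7), and define $I^{\lambda}f$ by
$$\langle I^{\lambda}f, \phi\rangle = \langle f, K^{\lambda}\phi\rangle \qquad (1)$$
where $\phi$ is a test function, and f is a distribution with support in $[0,\infty)$; see reference (13) for definitions of these concepts. Note that if f is a function, this definition reduces to the formula for interchanging the order of integration. Using (1), the Laplace transform of $e^{at}I^{\lambda}[\delta(t)]$ can be written as
$$L\{e^{at}I^{\lambda}[\delta(t)]\} = \langle I^{\lambda}[\delta(t)], e^{at}e^{-st}\rangle = \langle\delta(t), K^{\lambda}e^{-(s-a)t}\rangle = \frac{1}{\Gamma(\lambda)}\int_0^\infty y^{\lambda-1}e^{-(s-a)y}\,dy.$$
Putting y = v/(s-a), this becomes
$$\frac{1}{\Gamma(\lambda)}(s-a)^{-\lambda}\int_0^\infty v^{\lambda-1}e^{-v}\,dv.$$
Since the integral is just $\Gamma(\lambda)$, we have
$$L\{e^{at}I^{\lambda}[\delta(t)]\} = (s-a)^{-\lambda}, \quad s > a. \qquad (2)$$
We can find the Laplace transform of $e^{at}D_0^{\lambda}\delta(t)$ in a similar manner:
$$L\{e^{at}D_0^{\lambda}\delta(t)\} = \langle (d/dt)^k I^{k-\lambda}\delta(t), e^{-(s-a)t}\rangle = (s-a)^k\,\langle\delta(t), K^{k-\lambda}e^{-(s-a)t}\rangle,$$
and as in the proof of (2), $\langle\delta(t), K^{k-\lambda}e^{-(s-a)t}\rangle = (s-a)^{-(k-\lambda)}$. Hence
$$L\{e^{at}D_0^{\lambda}\delta(t)\} = (s-a)^{\lambda}, \quad s > a. \qquad (3)$$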
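3.2 A PRODUCT THEOREM
The answer to the question raised at the beginning of paragraph 3.1 is embodied in the following theorem.
Theorem 5. If $F(\mathbf{s}) = (s_m - a)^{\lambda}F_1(\hat{\mathbf{s}}_m)$, then
$$G(s) = K^{k-\lambda}\{e^{-(s-a)t}[(s-a) - d/dt]^k g_1(t)\}\big|_{t=0}, \quad s > a.$$
Proof. We have
$$L_n^{-1}\{F(\mathbf{s})\} = L^{-1}\{(s_m - a)^{\lambda}\}\,L_{n-1}^{-1}\{F_1(\hat{\mathbf{s}}_m)\},$$
and so by (3), $g(t) = e^{at}D_0^{\lambda}\delta(t)\,g_1(t)$. Hence
$$G(s) = L\{e^{at}D_0^{\lambda}\delta(t)\,g_1(t)\} = \langle D_0^{\lambda}\delta(t), g_1(t)e^{-(s-a)t}\rangle$$
$$= (-1)^k\langle\delta(t), K^{k-\lambda}\{(d/dt)^k[g_1(t)e^{-(s-a)t}]\}\rangle$$
$$= (-1)^k\Big\langle\delta(t), K^{k-\lambda}\Big\{\sum_{j=0}^{k}\binom{k}{j}[-(s-a)]^{k-j}g_1^{(j)}(t)\,e^{-(s-a)t}\Big\}\Big\rangle$$
$$= \langle\delta(t), K^{k-\lambda}\{e^{-(s-a)t}[(s-a) - d/dt]^k g_1(t)\}\rangle$$
$$= K^{k-\lambda}\{e^{-(s-a)t}[(s-a) - d/dt]^k g_1(t)\}\big|_{t=0}, \quad \text{q.e.d.}$$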
Example.
Therefore by theorem 5,
$$G(s) = K^{1/2}\{e^{-st}[s - d/dt]g_1(t)\}\big|_{t=0} = \frac{2}{\sqrt{\pi}}\,\frac{1}{\Gamma(1/2)}\int_0^\infty y^{-1/2}e^{-sy}[s - d/dy]\,y^{3/2}\,dy$$
$$= \frac{2}{\pi}\int_0^\infty \{sy - 3/2\}e^{-sy}\,dy = -\frac{1}{\pi s}.$$
One can verify the above result by direct computation as follows:
=L-l{st/2}~ IE2 t 3- ~ D~/20(t)t~/2t3·
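Therefore $g(t) = \frac{2}{\sqrt{\pi}}D_0^{1/2}\delta(t)\,t^{3/2}$. Now for any test function $\phi(t)$,
$$\langle g, \phi\rangle = \frac{2}{\sqrt{\pi}}\langle D_0^{1/2}\delta(t), t^{3/2}\phi(t)\rangle = -\frac{2}{\sqrt{\pi}}\langle\delta(t), K^{1/2}\{(d/dt)[t^{3/2}\phi(t)]\}\rangle$$
$$= -\frac{2}{\sqrt{\pi}}\,\frac{1}{\Gamma(1/2)}\int_t^\infty (y-t)^{-1/2}(d/dy)[y^{3/2}\phi(y)]\,dy\,\Big|_{t=0} = -\frac{1}{\pi}\int_0^\infty \phi(y)\,dy = \Big\langle -\frac{1}{\pi}H, \phi\Big\rangle,$$
where H(t) = 1 for $t \ge 0$, and zero otherwise. Thus, $g(t) = -\frac{1}{\pi}H(t)$, and $G(s) = -\frac{1}{\pi}L\{H\} = -\frac{1}{\pi s}$.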
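The closing integral of the example, $\frac{2}{\pi}\int_0^\infty (sy - 3/2)e^{-sy}\,dy$, can be checked numerically against the stated value $-1/(\pi s)$. The sketch below is only a sanity check under assumptions: the infinite range is truncated where $e^{-sy}$ is negligible, and the function names and sample values of s are choices made here.

```python
import math

def simpson(g, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def G_example(s, upper=60.0):
    # y**-0.5 * [s - d/dy] y**1.5 equals s*y - 1.5, which gives the integrand below.
    integrand = lambda y: (s * y - 1.5) * math.exp(-s * y)
    # Truncate at s*y = upper; the neglected tail is of order exp(-upper).
    return (2.0 / math.pi) * simpson(integrand, 0.0, upper / s)

values = {s: G_example(s) for s in (0.5, 1.0, 2.0)}
expected = {s: -1.0 / (math.pi * s) for s in values}
```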
4. SERIES EXPANSIONS
4.1 A GENERAL METHOD
In this section we present a general method for deriving operational formulas for the association of variables. We consider the function F of the form $F(\mathbf{s}) = F_1(s_m)F_2(\hat{\mathbf{s}}_m)$, where $F_1(s_m)$ can be expanded in an absolutely convergent series, for $|s_m| > R$, of the type
$$F_1(s_m) = \sum_{j=0}^{\infty} a_j / s_m^{\lambda_j} \qquad (4)$$
with $\{\lambda_j\}$ an arbitrary increasing sequence of positive numbers which approach $\infty$. Under these conditions, one can take the inverse transform of $F_1(s_m)$ term by term to yield
$$f_1(t_m) = \sum_{j=0}^{\infty} a_j t_m^{\lambda_j-1}/\Gamma(\lambda_j) \qquad (5)$$
which converges for all real and complex $t_m \neq 0$; see reference (5). This fact enables us to prove the following.
Theorem 6. If $F(\mathbf{s}) = F_1(s_m)F_2(\hat{\mathbf{s}}_m)$, and $F_1(s_m) = \sum_{j=0}^{\infty} a_j/s_m^{\lambda_j}$ where $0 < \lambda_0 < \lambda_1 < \cdots \to \infty$, then
$$G(s) = \sum_{j=0}^{\infty}\frac{a_j}{\Gamma(\lambda_j)}(-1)^{k_j}D_\infty^{\lambda_j-1}G_2(s),$$
where $k_j - 1 < \lambda_j \le k_j$.
Proof. From (5), we have
$$f(t_1,\dots,t_n) = \sum_{j=0}^{\infty} a_j t_m^{\lambda_j-1}f_2(\hat{\mathbf{t}}_m)/\Gamma(\lambda_j).$$
Hence $g(t) = \sum_{j=0}^{\infty} a_j t^{\lambda_j-1}g_2(t)/\Gamma(\lambda_j)$, and so
$$G(s) = \int_0^\infty\Big\{\sum_{j=0}^{\infty} a_j t^{\lambda_j-1}g_2(t)/\Gamma(\lambda_j)\Big\}e^{-st}\,dt.$$
Since $g_2(t)$ is Laplace transformable, and the $a_j$ are such that the series converges absolutely, we can interchange the orders of summation and integration. Thus
$$G(s) = \sum_{j=0}^{\infty}[a_j/\Gamma(\lambda_j)]\int_0^\infty t^{\lambda_j-1}g_2(t)e^{-st}\,dt.$$
By theorem 3, we obtain the conclusion of the theorem.
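4.2 APPLICATIONS
Example 1. Let $F(\mathbf{s}) = (s_m^2 + a^2)^{-\nu}F_1(\hat{\mathbf{s}}_m)$, $R(\nu) > 0$. It is easily shown that
$$(s_m^2 + a^2)^{-\nu} = \sum_{j=0}^{\infty}\frac{(-1)^j\,\Gamma(\nu+j)\,a^{2j}}{j!\,\Gamma(\nu)\,s_m^{2(j+\nu)}}.$$
Thus
$$G(s) = \sum_{j=0}^{\infty}\frac{(-1)^j\,\Gamma(\nu+j)\,a^{2j}\,(-1)^{k_j}}{j!\,\Gamma(\nu)\,\Gamma(2j+2\nu)}\,D_\infty^{2j+2\nu-1}\,G_1(s),$$
where $k_j - 1 < 2j + 2\nu \le k_j$. From Legendre's duplication formula, $\sqrt{\pi}\,\Gamma(2z) = 2^{2z-1}\Gamma(z)\Gamma(z+\tfrac{1}{2})$, we get a closed form, formula (6), expressed through the Bessel function $J_\mu$ of order $\mu$.
Strictly speaking, this formula (6) is symbolic in nature. It becomes more tractable for specific integral values of $\nu$. For instance, when $\nu = 1$, the formula reduces to
$$G(s) = -\frac{1}{a}\sin(aD_\infty)G_1(s) = -\frac{1}{a}\sin(aD)G_1(s),$$
where D = d/ds, which was obtained by Koh in reference (8).
Example 2. Let $F(\mathbf{s}) = (s_m + a)^{-1/2}F_1(\hat{\mathbf{s}}_m)$. Again, it can be shown that
$$(s_m + a)^{-1/2} = \sum_{j=0}^{\infty}\frac{(-1)^j\,\Gamma(j+1/2)\,a^j}{j!\,\sqrt{\pi}\,s_m^{\,j+1/2}}.$$
Thus
$$G(s) = \sum_{j=0}^{\infty}\frac{(-1)^j\,\Gamma(j+1/2)\,a^j\,(-1)^{k_j}}{j!\,\sqrt{\pi}\,\Gamma(j+1/2)}\,D_\infty^{j-1/2}G_1(s),$$
where $k_j \ge j + 1/2$, i.e., $k_j = j+1$. Therefore,
$$G(s) = -\sum_{j=0}^{\infty}[a^j/(j!\sqrt{\pi})]\,D_\infty^{j-1/2}G_1(s).$$
Since $D_\infty^{-1/2} = -K^{1/2}$, and $D_\infty^{\nu} = D^{\nu} = (d/ds)^{\nu}$ for integer $\nu$, $G(s) = \pi^{-1/2}e^{aD}K^{1/2}G_1(s)$, or
$$G(s) = \frac{1}{\pi}\int_{s+a}^{\infty}(y-s-a)^{-1/2}G_1(y)\,dy. \qquad (7)$$
This formula is certainly well defined for the Laplace transform $G_1(s)$. As an application consider $F(s_1,s_2) = [1/\sqrt{s_1+1}]\times[1/(s_2+1)]$. Here $G_1(s) = 1/(s+1)$, and the associated transform of F is
$$G(s) = \frac{1}{\pi}\int_{s+1}^{\infty}[1/\sqrt{y-s-1}]\,[1/(y+1)]\,dy = \frac{1}{\pi}\,\frac{1}{\sqrt{s+2}}\int_0^\infty\frac{dv}{(v+1)\sqrt{v}},$$
on letting $v = (y-s-1)/(s+2)$. Hence $G(s) = 1/\sqrt{s+2}$, since the last integral is simply $\pi$. This result can be verified by direct inversion.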
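The application at the end of Example 2 can also be checked numerically. The sketch below reads formula (7) as $G(s) = \frac{1}{\pi}\int_{s+a}^{\infty}(y-s-a)^{-1/2}G_1(y)\,dy$ (our reading of the printed formula), takes $G_1(s) = 1/(s+1)$ and $a = 1$, and compares with the stated result $1/\sqrt{s+2}$. The substitutions $y = s + a + u^2$ and $u = \tan\theta$ and the function name are assumptions made for the example.

```python
import math

def associated_transform(G1, s, a, n=20000):
    """(1/pi) * integral over y in (s+a, inf) of (y-s-a)**-0.5 * G1(y).

    With y = s + a + u**2 the integral becomes 2 * int_0^inf G1(s+a+u**2) du,
    and u = tan(theta) maps it onto (0, pi/2); a midpoint rule avoids the
    endpoint theta = pi/2.
    """
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        u = math.tan((i + 0.5) * h)
        total += 2.0 * G1(s + a + u * u) * (1.0 + u * u)
    return total * h / math.pi

G1 = lambda y: 1.0 / (y + 1.0)
samples = (0.5, 1.0, 3.0)
numeric = [associated_transform(G1, s, 1.0) for s in samples]
closed_form = [1.0 / math.sqrt(s + 2.0) for s in samples]
```

The same value follows by direct inversion: $e^{-t}/\sqrt{\pi t}$ times $e^{-t}$ has Laplace transform $1/\sqrt{s+2}$.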
REFERENCES
(1) Barrett, J. F.: "The use of functionals in the analysis of non-linear physical systems". J. Electron. Control, 15, 1963, 567.
(2) Brilliant, M. B.: "Theory of the analysis of non-linear systems". Report 345, Research Lab. of Electron., M.I.T., 1958.
(3) Chen, C. F., and Chiu, R. F.: "New theorems of association of variables in multiple dimensional Laplace transform". Int. J. Systems Sci., 1973, 4, 647.
(4) Conlan, J. and Koh, E. L.: "A fractional differentiation theorem for the Laplace transform". Canadian Math. Bull., to appear.
(5) Doetsch, G.: "Guide to the application of the Laplace and z-transform", Van Nostrand, 1971.
(6) Erdelyi, A., et al.: "Tables of integral transforms", McGraw-Hill, 1954.
(7) Erdelyi, A., and McBride, A. C.: "Fractional integrals of distributions", SIAM J. Math. Anal., 1, 1970, 547.
(8) Koh, E. L.: "Association of variables in n-dimensional Laplace transform", Int. J. Systems Sci., 1975, 6, 127.
(9) Koh, E. L. and Conlan, J.: "Fractional derivatives, Laplace transforms, and association of variables", Int. J. Systems Sci., to appear.
(10) Lubbock, J. K. and Bansal, V. S.: "Multidimensional Laplace transforms for solution of nonlinear equations", Proc. I.E.E., 1969, 116, 2075.
(11) Oldham, K. B. and Spanier, J.: "The fractional calculus", Academic Press, 1974.
(12) Volterra, V.: "Theory of functionals", Blackie, 1930.
(13) Zemanian, A. H.: "Distribution theory and transform analysis", McGraw-Hill, 1965.
Professor James Conlan received the bachelors degree and the masters degree from the University of California, Berkeley Campus. He received the Ph.D. in mathematics from the University of Maryland in 1958. He has worked as a research mathematician for the U.S. Navy, and has taught at the University of Western Australia, and at Howard University.
Professor Eusebio L. Koh received his Ph.D. in 1967 at the State University of New York, Stony Brook, working under A. H. Zemanian. His dissertation was on the Hankel transformation of generalized functions. He holds an honors degree from the University of the Philippines and masters degrees from Purdue and Birmingham. He taught engineering in the Philippines until 1964 and mathematics in South Carolina for a year before going to Regina in 1968. He is on Sabbatical Leave at the Technische Hochschule Darmstadt during 1975-76. His research interests are distribution theory, integral transformation, and differential equations. Professor Koh is married and has four children.
THE FEEDBACK INTERCONNECTION OF MULTIVARIABLE
SYSTEMS: SIMPLIFYING THEOREMS FOR STABILITY
C. A. Desoer and W. S. Chan
Department of Electrical Engineering and Computer Sciences and the Electronics Research Laboratory
University of California, Berkeley, California 94720
Abstract
We consider the stability of the feedback interconnection of possibly unstable, n-input n-output subsystems whose interconnection is described by $e_1 = u_1 - y_2$, $e_2 = u_2 + y_1$ and $y_i = G_i(e_i)$, i = 1,2. We give three theorems which simplify the stability tests. Theorem 1 deals with nonlinear time-varying subsystems. It gives conditions on $G_2$ so that the stability of $u_1 \mapsto y_1$ guarantees that of the feedback system. The other two theorems consider continuous-time linear time-invariant subsystems. It is noted that in the multivariable case, the stability of $u_i \mapsto y_i$, i = 1,2, is not sufficient to guarantee the stability of the feedback system, and Theorem 2 specifies some additional required conditions. Theorem 3 shows that if $G_2$ and $G_1(I+G_2G_1)^{-1}$ are in some special stable classes, so is the transfer function of the feedback system. In both theorems, corollaries specialize the results to lumped and single-input single-output cases. The paper ends by showing how these results can be translated for the discrete-time case.
INTRODUCTION
This paper may be viewed as a first step toward a general input-output theory for arbitrary interconnections of multi-input multi-output subsystems. In contrast to [1] it does allow, in several results, unstable subsystems. It is closely related to [2] which gives necessary and sufficient conditions for stability allowing for unstable subsystems. The thrust of the paper is towards finding conditions under which stability tests are greatly simplified. The results below constitute an extension of results presented at the 1974 Allerton Conference [18]. The discrete-time extension is described in Section IV.
The point of view adopted in the paper is that pioneered by Sandberg and Zames [3,4]. This approach to stability problems has been developed in many papers [5-9] and books [10-12]. A slightly different but closely related approach is to be found in [13-16].
In the first section of the paper we describe the system under consideration and review the pertinent definitions and facts needed to state our results. The second section presents two basic examples which are needed to understand some basic points related to the new results. The third section states precisely the three basic theorems and tries to describe the nature and interrelationships of the results. A more complete version of this paper will appear in the Proc. IEEE, Dec., 1975.
Notations. $\mathbb{R}$, $\mathbb{C}$, $\mathbb{R}(s)$, $\mathcal{A}$ denote, respectively, the fields of real numbers, complex numbers, rational functions with real coefficients, and the convolution algebra defined in [5], [6] and [12]. The elements of the convolution algebra $\mathcal{A}$ are generalized functions described in (10), below; it is easy to show that if $f, g \in \mathcal{A}$ then f+g and cf (for all real numbers c) are in $\mathcal{A}$; furthermore if as "product" one takes the convolution of f and g, then $f*g \in \mathcal{A}$; for this reason, $\mathcal{A}$ is called a convolution algebra. Superscripts n and n×n are used to denote the corresponding classes of ordered n-tuples (e.g. $\mathbb{R}^n$, $\mathbb{C}^n$, $\mathcal{A}^n$) and n×n arrays (e.g. $\mathbb{R}(s)^{n\times n}$), respectively. Laplace transforms are denoted by a ^. Operators and matrix transfer functions are denoted by capitals (e.g. $G_1$, $\hat{G}_1$). Scalar transfer functions are denoted by lower case letters (e.g. $\hat{g}(s)$). The abbreviations MIMO and SISO denote "multiple-input multiple-output" and "single-input single-output," respectively. $\mathbb{C}_+$ and $\mathring{\mathbb{C}}_+$ denote the closed and the open right half-plane.
I. SYSTEM DESCRIPTION AND PRELIMINARY DEFINITIONS
We consider a feedback system S whose inputs, outputs, etc. are defined on $\mathcal{T} \subset \mathbb{R}$: typically $\mathcal{T} = \mathbb{R}_+$ for continuous-time systems, and $\mathcal{T} = \mathbb{Z}_+$ (the nonnegative integers) for discrete-time systems. Let $\mathcal{F} = \{f : \mathcal{T} \to \mathcal{V}\}$ where $\mathcal{V}$ is a normed space with norm $\|\cdot\|$. For any $T \in \mathcal{T}$, $f_T(t) = f(t)$ if $t \le T$, and zero for $t > T$. Using the usual definitions of addition and scalar multiplication, we define the vector space
$$\mathcal{L}_e = \{f \in \mathcal{F} : \|f_T\| < \infty \ \ \forall T \in \mathcal{T}\}.$$
To avoid long concatenations of subscripts, we shall write $\|f\|_T$ for $\|f_T\|$.
The feedback system S is made up of two subsystems as shown in Fig. 1. If $\mathcal{V} = \mathbb{R}^n$, then the two subsystems are n-input n-output subsystems. The inputs $u_i$, errors $e_i$, outputs $y_i$ belong to $\mathcal{L}_e$. We define for i = 1,2
$$G_i : \mathcal{L}_e \to \mathcal{L}_e, \qquad y_i = G_ie_i. \qquad (1)$$
Note that it is not required that the $G_i$ be linear. The equations are then
$$e_1 = u_1 - G_2e_2 \qquad (2)$$
$$e_2 = u_2 + G_1e_1 \qquad (3)$$
We make a general existence assumption which will hold throughout the paper: $\forall (u_1,u_2) \in \mathcal{L}_e\times\mathcal{L}_e$, $\exists (e_1,e_2) \in \mathcal{L}_e\times\mathcal{L}_e$ which satisfy the equations (2), (3) of the system. For general existence criteria see [4,11,12]. Note that uniqueness is not required. If uniqueness holds, there is a map, denoted by $H_e$, such that
$$H_e : (u_1,u_2) \mapsto (e_1,e_2).$$
If uniqueness does not hold, $H_e$ becomes a relation [17].
$G_i$ is said to be $\mathcal{L}$-stable iff
$$\exists k < \infty \ \text{such that}\ \forall x \in \mathcal{L}_e,\ \forall T \in \mathcal{T}, \quad \|G_ix\|_T \le k\|x\|_T. \qquad (4)$$
The gain of $G_i$ is defined to be the infimum of all such k; it is denoted by $\gamma(G_i)$. Calculations of the gain for SISO and MIMO systems can be found in [3,4,11,12]. The incremental gain of $G_i$, $\tilde{\gamma}(G_i)$,(†) is defined as
$$\tilde{\gamma}(G_i) = \inf\{\tilde{\gamma} : \|G_ix_1 - G_ix_2\|_T \le \tilde{\gamma}\|x_1 - x_2\|_T \ \ \forall x_1, x_2 \in \mathcal{L}_e,\ \forall T \in \mathcal{T}\}. \qquad (5)$$
(†) The superscript ~ used to distinguish the incremental gain from the gain has nothing to do with the ~ sometimes used to denote Z-transforms.
For linear systems $\gamma(G_i) = \tilde{\gamma}(G_i)$. Let u, e, and y denote the ordered pairs $(u_1,u_2)$, $(e_1,e_2)$, and $(y_1,y_2)$, respectively. We also have the map $H_y : u \mapsto y$. It is important to note that if we define $J : \mathcal{L}_e\times\mathcal{L}_e \to \mathcal{L}_e\times\mathcal{L}_e$ by $J(u) = J(u_1,u_2) = (u_2,-u_1)$, then
$$H_e = I - JH_y, \qquad H_y = J(H_e - I), \qquad (6)$$
where I denotes the identity.
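If both $G_1$ and $G_2$ are linear maps, the map $H_e : u \mapsto e$ is obtained as follows: (a) operate with $G_1$ on (2), use linearity, eliminate $G_1e_1$ using (3) and solve for $e_2$; (b) operate with $G_2$ on (3), use linearity, eliminate $G_2e_2$ using (2) and solve for $e_1$; (c) the linearity of $G_2$ implies $(I+G_2G_1)G_2 = G_2(I+G_1G_2)$, hence $G_2(I+G_1G_2)^{-1} = (I+G_2G_1)^{-1}G_2$; (d) similarly, the linearity of $G_1$ implies $G_1(I+G_2G_1)^{-1} = (I+G_1G_2)^{-1}G_1$. The final result is
$$e = \begin{bmatrix} e_1 \\ e_2 \end{bmatrix} = \begin{bmatrix} (I+G_2G_1)^{-1} & -G_2(I+G_1G_2)^{-1} \\ G_1(I+G_2G_1)^{-1} & (I+G_1G_2)^{-1} \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = H_eu, \qquad (7)$$
where $G_1G_2$ denotes the composition of $G_1$ with $G_2$.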
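The block form of $H_e$ can be sanity-checked on the simplest linear maps, constant gain matrices. The sketch below is an illustration only: it uses hypothetical 2×2 rational matrices $G_1$, $G_2$ and inputs $u_1$, $u_2$ (none of them from the paper), builds $e_1$ and $e_2$ from the rows of (7), and confirms that they satisfy the loop equations $e_1 = u_1 - G_2e_2$, $e_2 = u_2 + G_1e_1$ and the identities used in steps (c) and (d).

```python
from fractions import Fraction as Fr

def mat(rows):
    return [[Fr(x) for x in row] for row in rows]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def neg(A):
    return [[-a for a in row] for row in A]

def inv2(A):
    """Inverse of a 2x2 matrix with exact rational entries."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical static subsystems and inputs (column vectors).
I2 = mat([[1, 0], [0, 1]])
G1 = mat([[1, 2], [0, 3]])
G2 = mat([[2, 0], [1, 1]])
u1 = mat([[1], [-1]])
u2 = mat([[2], [5]])

A = inv2(add(I2, mul(G2, G1)))   # (I + G2 G1)^-1
B = inv2(add(I2, mul(G1, G2)))   # (I + G1 G2)^-1

# Rows of H_e as in (7).
e1 = add(mul(A, u1), neg(mul(mul(G2, B), u2)))
e2 = add(mul(mul(G1, A), u1), mul(B, u2))
```

Exact rational arithmetic is used so that the loop equations hold with equality rather than up to rounding.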
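The map (or the relation) $H_e$ is said to be $\mathcal{L}\times\mathcal{L}$-stable iff $\exists k < \infty$ such that $\forall u_1, u_2 \in \mathcal{L}_e$, $\forall T \in \mathcal{T}$, for i = 1,2
$$\|e_i\|_T \le k\,(\|u_1\|_T + \|u_2\|_T). \qquad (8)$$
In other words, if in the product space we choose the norm $\|u\| = \|u_1\| + \|u_2\|$, then we see that (8) is equivalent to $\gamma(H_e) < \infty$. From (6), $\gamma(H_e) < \infty$ if and only if $\gamma(H_y) < \infty$.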
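For the continuous-time, linear, time-invariant case, for i = 1,2, we define $G_i$ by a convolution: to alleviate notation, we also use $G_i$ to denote the kernel of the convolution operator, thus $G_i : \mathbb{R}_+ \to \mathbb{R}^{n\times n}$ and
$$y_i(t) = (G_i * e_i)(t) = \int_0^t G_i(t-\tau)\,e_i(\tau)\,d\tau. \qquad (9)$$
Using ^ to denote Laplace transformed quantities, we have
$$\hat{y}_i(s) = \hat{G}_i(s)\,\hat{e}_i(s).$$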
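In the linear, time-invariant, distributed case, we introduce the Banach algebras $\mathcal{A}$ and $\hat{\mathcal{A}}$ as follows (see [5],[6],[12])
$$\mathcal{A} \triangleq \Big\{f : \mathbb{R}_+ \to \mathbb{R}\ \Big|\ f(t) = \sum_{i=0}^{\infty} f_i\,\delta(t-t_i) + f_a(t)\ \text{where}\ \sum_{i=0}^{\infty}|f_i| < \infty,\ t_i \ge 0\ \forall i,\ f_a \in L^1\Big\} \qquad (10)$$
$A \in \mathcal{A}^{n\times n}$ means that each element of the matrix A is in $\mathcal{A}$. $\hat{\mathcal{A}}^{n\times n} \triangleq \{\hat{A} \mid A \in \mathcal{A}^{n\times n}\}$. It is well known that if $\hat{G}_1, \hat{G}_2 \in \hat{\mathcal{A}}^{n\times n}$, then $\hat{G}_1 + \hat{G}_2$, $\hat{G}_1\hat{G}_2 \in \hat{\mathcal{A}}^{n\times n}$ and $\hat{G}_1^{-1} \in \hat{\mathcal{A}}^{n\times n} \Leftrightarrow \inf_{s\in\mathbb{C}_+}|\det\hat{G}_1(s)| > 0$.
For $h \in \mathcal{A}$,
$$\|h\|_a \triangleq \sum_{i=0}^{\infty}|h_i| + \int_0^\infty |h_a(t)|\,dt. \qquad (11)$$
If $h = (h_1,h_2,\dots,h_n) \in \mathcal{A}^n$,
$$\|h\|_a \triangleq \max_\alpha \|h_\alpha\|_a,$$
and if $H \in \mathcal{A}^{n\times n}$,
$$\|H\|_a \triangleq \max_i \sum_{j=1}^{n}\|h_{ij}\|_a. \qquad (12)$$
Then if $u \in L_p^n$, $1 \le p \le \infty$, and $H \in \mathcal{A}^{n\times n}$, then
$$\|H*u\|_p \le \|H\|_a\,\|u\|_p \qquad (13)$$
where $\|\cdot\|_p$ denotes the p-th norm [12]. It can also be shown that if $H \in \mathcal{A}^{n\times n}$ and if $u : \mathbb{R}_+ \to \mathbb{R}^n$ is continuous and bounded (or almost periodic, or periodic) then $H*u$ has the same properties, resp. Property (13) is often expressed by saying that $u \mapsto H*u$ is $L_p^n$-stable for all $p \in [1,\infty]$.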
Ii rA lnxn Two elements : . ./\J, ~u of rJ\ are said to be pseudo right coprime, abbr. p.r.c., (resp. pseudo left coprime, abbr. p.l.c.) [12,19] iff
(i) det CUJ (s) ~ 0 Vs E t+
and (ii) q)Ji +C\jct) = q,t; (resp ... Klcfl-fl--m~iti) Gixen a function G : ~+ -+ ~nxn, the ordered pair (;\I,Cf)) is said to be a p.r.c. factorization, abbr. p.r.c.f. (resp. p.l.c. factorization, abbr. p.l.c.f.) of G iff
(i) G =JGc{)-l (resp. G = cO-S~)
In the linear, time-invariant, lumped case, $\hat{G}_1, \hat{G}_2 \in \mathbb{R}(s)^{n\times n}$ and $\hat{G}_i$ is said to be proper iff all its elements are bounded at infinity, and $\hat{G}_i$ is said to be exponentially stable (abbr. exp. st.) iff it is proper and has all its poles in $\mathring{\mathbb{C}}_-$ (the open left half plane).
If $\hat{G} \in \mathbb{R}(s)^{n\times n}$ and $\hat{G}$ is proper, then $\hat{G}$ has both a left- and a right-coprime factorization [12].
The following observation will greatly simplify the analysis below: The map J defined above in (6) is a linear isometric map, therefore the two equations (6) lead to the following lemma.
Lemma: a) In the general nonlinear case: $H_e$ is $\mathcal{L}\times\mathcal{L}$ stable if and only if $H_y$ is $\mathcal{L}\times\mathcal{L}$ stable;
b) in the linear time-invariant case: $\hat{H}_e \in \hat{\mathcal{A}}^{2n\times 2n}$ if and only if $\hat{H}_y \in \hat{\mathcal{A}}^{2n\times 2n}$; also for the lumped case, $\hat{H}_e$ is exp. st. if and only if $\hat{H}_y$ is exp. st.
It is this lemma that allows us to restrict our attention to $H_e$ exclusively. We choose $H_e$ because it is more convenient to work with.
II. INSTRUCTIVE EXAMPLES
In the linear case, $H_e$ is given by (7): $H_e$ splits into four partial maps: $u_i \mapsto e_j$, i,j = 1,2. Each one of these four partial maps may be $\mathcal{L}$-stable or not: this gives $16 = 2^4$ possible patterns of instability; this number is further reduced to 10 by interchanging subscripts 1 and 2. In view of the fact that each of the four partial maps depends on the same two functions $G_1$ and $G_2$, one might expect that not all possible patterns of instability might occur and hence that one might prove the $\mathcal{L}\times\mathcal{L}$-stability of $H_e$ by studying only a proper subset of the four partial maps. This is, in fact, not so. Consider the following two linear time-invariant examples.
(ii) $\mathcal{N}, \mathcal{D}$ are p.r.c. (resp. p.l.c.)
Example 1. If $\hat{g}_1(s) = 1/s$, $\hat{g}_2(s) = s/(s+1)$, then all submatrices of $\hat{H}_e$ are exp. stable except $\hat{g}_1(1+\hat{g}_2\hat{g}_1)^{-1}$ which has a pole at s = 0.
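Example 1 can be reproduced with exact polynomial arithmetic. The sketch below is a small illustration, not the authors' procedure: it writes $\hat{g}_i = n_i/d_i$, forms the four scalar entries of $\hat{H}_e$ over the common denominator $d_1d_2 + n_1n_2$, cancels common factors with a polynomial gcd, and then tests properness and the location of the poles. The helper names are hypothetical, and the pole test (all denominator coefficients of one strict sign) is only valid for degree at most 2, which suffices here.

```python
from fractions import Fraction as Fr

def trim(p):
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def poly(*coeffs):
    """Polynomial as a list of exact coefficients, lowest degree first."""
    return trim([Fr(c) for c in coeffs])

def pmul(p, q):
    out = [Fr(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def padd(p, q):
    n = max(len(p), len(q))
    pad = lambda r: list(r) + [Fr(0)] * (n - len(r))
    return trim([a + b for a, b in zip(pad(p), pad(q))])

def pdivmod(p, q):
    quot = [Fr(0)] * max(len(p) - len(q) + 1, 1)
    rem = trim(p)
    while len(rem) >= len(q) and rem != [0]:
        shift, c = len(rem) - len(q), rem[-1] / q[-1]
        quot[shift] = c
        for i, b in enumerate(q):
            rem[shift + i] -= c * b
        rem = trim(rem[:-1]) if len(rem) > 1 else [Fr(0)]
    return trim(quot), rem

def pgcd(p, q):
    while q != [0]:
        p, q = q, pdivmod(p, q)[1]
    return p

def exp_stable(num, den):
    """Cancel common factors, then require properness and a Hurwitz denominator."""
    g = pgcd(num, den)
    num, den = pdivmod(num, g)[0], pdivmod(den, g)[0]
    assert len(den) <= 3, "coefficient-sign test is only valid up to degree 2"
    proper = len(num) <= len(den)
    hurwitz = all(c > 0 for c in den) or all(c < 0 for c in den)
    return proper and hurwitz

# Example 1: g1 = n1/d1 = 1/s, g2 = n2/d2 = s/(s+1).
n1, d1 = poly(1), poly(0, 1)
n2, d2 = poly(0, 1), poly(1, 1)
char = padd(pmul(d1, d2), pmul(n1, n2))   # d1*d2 + n1*n2 = s**2 + 2s

partial_maps = {
    "u1->e1": exp_stable(pmul(d1, d2), char),  # (1 + g2 g1)^-1
    "u2->e1": exp_stable(pmul(n2, d1), char),  # g2 (1 + g1 g2)^-1, sign dropped
    "u1->e2": exp_stable(pmul(n1, d2), char),  # g1 (1 + g2 g1)^-1
    "u2->e2": exp_stable(pmul(d1, d2), char),  # (1 + g1 g2)^-1
}
```

Only the map $u_1 \mapsto e_2$ fails: the factor s cancels in the other three entries but survives in $(s+1)/(s(s+2))$.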
" lim inf IdetC[)(s.)I > 0 • i-+oo 1.
TheAfoll~wing fact ha~been established in [121. If G E ,...An n A and v(1,q ,) is a p. r. c. f. or a p.l.c.f. of G then p E t+ is a pole of G
* P E (;+ is a zero of det cf).
Example 2. If al(s) 1/ s]
l/s
G 2 (s) = [1/ (s+ 1) 1/ s l o l/(S+ld
then all submatrices of $\hat{H}_e$ are exp. stable except $(I+\hat{G}_2\hat{G}_1)^{-1}$ which has a pole at s = 0. A detailed study of all 10 possibilities is reported in [18].
In conclusion, even in the lumped, linear, time-invariant case, in order to prove that $H_e$ is $\mathcal{L}\times\mathcal{L}$ stable, one must in the general case investigate the stability of each of the four partial maps $u_i \mapsto e_j$, i,j = 1,2.
III. THE SIMPLIFYING THEOREMS
In most design procedures and stability considerations one assumes $u_2 = 0$ and studies the stability of the map $u_1 \mapsto y_1$, namely, $G_1(I+G_2G_1)^{-1}$. An interesting question is then: under what general conditions does the $\mathcal{L}$-stability of $G_1(I+G_2G_1)^{-1}$ imply the $\mathcal{L}\times\mathcal{L}$-stability of $H_e$? The following theorem answers the question for a broad class of nonlinear systems:
Theorem 1. (Nonlinear time-varying MIMO)
Let $G_i$ be defined as in (1). If $G_2$ and $G_1(I+G_2G_1)^{-1}$ are $\mathcal{L}$-stable, and if the incremental gain of $G_2$, $\tilde{\gamma}(G_2)$, is finite, then $H_e$ and $H_y$ are $\mathcal{L}\times\mathcal{L}$ stable.
Comments: (a) It should be stressed that $G_1$ is not required to be $\mathcal{L}$-stable. (b) In particular, if as in most practical cases the feedback subsystem, $G_2$, is linear, then the condition $\tilde{\gamma}(G_2) < \infty$ is equivalent to requiring that $G_2$ be $\mathcal{L}$-stable. Therefore for the linear time-varying MIMO case, the $\mathcal{L}$-stability of $G_2$ and that of $G_1(I+G_2G_1)^{-1}$ imply that $H_e$ is $\mathcal{L}\times\mathcal{L}$ stable. (c) If $G_2$ is unbiased (i.e. $G_20 = 0$), choosing $x_2 = 0$ in (5) and comparing with (4), we see that $\gamma(G_2) \le \tilde{\gamma}(G_2)$. Hence, we have the following
Corollary 1.1. (Nonlinear time-varying MIMO)
If $G_1(I+G_2G_1)^{-1}$ is $\mathcal{L}$-stable, if $G_2$ is unbiased and if $G_2$ has a finite incremental gain, $\tilde{\gamma}(G_2)$, then $H_e$ and $H_y$ are $\mathcal{L}\times\mathcal{L}$ stable.
In order to bring to bear analytical tools, we restrict ourselves to linear time-invariant distributed systems. An important feature of Theorem 2 and its corollaries is that they do not impose any stability conditions on either $G_1$ or $G_2$. This is in contrast to Theorem 1 which requires that $\gamma(G_2) < \infty$.
Theorem 2. (Linear time-invariant distributed MIMO)
Let $G_1$ and $G_2$ be represented by convolution operators as in (9). Suppose that $\hat{G}_1$ has a p.l.c.f. and $\hat{G}_2$ has a p.r.c.f., or $\hat{G}_1$ has a p.r.c.f. and $\hat{G}_2$ has a p.l.c.f. Suppose that for all sequences $(s_i)_{i=1}^{\infty} \subset \mathbb{C}_+$ with $|s_i| \to \infty$,
$$\liminf_{i\to\infty}|\det[I + \hat{G}_1(s_i)\hat{G}_2(s_i)]| > 0. \qquad (14)$$
Under these conditions (abbr. U.t.c.), if (a) $\hat{G}_1(I+\hat{G}_2\hat{G}_1)^{-1}$, $\hat{G}_2(I+\hat{G}_1\hat{G}_2)^{-1}$ are in $\hat{\mathcal{A}}^{n\times n}$, and (b) $\hat{G}_1$, $\hat{G}_2$ have no common $\mathbb{C}_+$ pole, then $\hat{H}_e$ and $\hat{H}_y \in \hat{\mathcal{A}}^{2n\times 2n}$.
Comments: (a) this conclusion implies that $H_e$ is $L_p^n$-stable for all $p \in [1,\infty]$, see (13), and that the system takes inputs u that are continuous and bounded, or almost periodic, or periodic into errors e and outputs y of the same class, resp. (b) Note that neither $G_1$ nor $G_2$ is assumed to be $L_p^n$-stable.
Corollary 2.1. (Linear time-invariant lumped MIMO)
Let, for i = 1,2, $G_i$ be a convolution operator, $\hat{G}_i(s) \in \mathbb{R}(s)^{n\times n}$ and be proper. Let $\det(I+\hat{G}_1\hat{G}_2)(\infty) \neq 0$. U.t.c., if $\hat{G}_1(I+\hat{G}_2\hat{G}_1)^{-1}$, $\hat{G}_2(I+\hat{G}_1\hat{G}_2)^{-1}$ are exp. st. and if $\hat{G}_1$ and $\hat{G}_2$ have no common $\mathbb{C}_+$ pole, then $\hat{H}_e$ and $\hat{H}_y$ are exp. st.
The condition $\det(I+\hat{G}_1\hat{G}_2)(\infty) \neq 0$ is related to well-posedness [11,15]: with the $\hat{G}_i(s) \in \mathbb{R}(s)^{n\times n}$ and proper, this determinantal condition is violated if and only if $(I+\hat{G}_1\hat{G}_2)^{-1}$ and $(I+\hat{G}_2\hat{G}_1)^{-1}$ have a pole at infinity, i.e. the closed-loop system transfer function $\hat{H}_e$ includes differentiators!
Corollary 2.2. (Linear time-invariant lumped SISO)
Let, for i = 1,2, $g_i$ be a convolution operator, $\hat{g}_i(s) \in \mathbb{R}(s)$ and be proper. U.t.c., if $\hat{g}_1(1+\hat{g}_2\hat{g}_1)^{-1}$ and $\hat{g}_2(1+\hat{g}_1\hat{g}_2)^{-1}$ are exp. st., then $\hat{H}_e$ and $\hat{H}_y$ are exp. st.
Observe that if the assumption of Corollary 2.2 were replaced by "$(1+\hat{g}_2\hat{g}_1)^{-1}$ is exp. st." then the conclusion does not follow. See counterexample in [18]: $\hat{g}_1(s) = s/(s-1)$; $\hat{g}_2(s) = (s-1)/s$.
Corollary 2.3. (Linear time-invariant distributed SISO)
Let, for i = 1,2, $G_i$ be SISO, hence denoted by $g_i$, and let it be a convolution operator. Let $\hat{g}_1, \hat{g}_2$ have p.r.c.f. U.t.c., if $\hat{g}_1(1+\hat{g}_2\hat{g}_1)^{-1}$ and $\hat{g}_2(1+\hat{g}_1\hat{g}_2)^{-1}$ are in $\hat{\mathcal{A}}$, then $\hat{H}_e$ and $\hat{H}_y \in \hat{\mathcal{A}}^{2\times 2}$.
Theorem 3 and its corollary are more restrictive: they exploit the properties of the algebras $\hat{\mathcal{A}}^{n\times n}$ and $\mathbb{R}(s)^{n\times n}$, resp., and impose some stability requirement on $G_2$.
Theorem 3. (Linear time-invariant distributed MIMO)
If $\hat{G}_2$ and $\hat{G}_1(I+\hat{G}_2\hat{G}_1)^{-1}$ are in $\hat{\mathcal{A}}^{n\times n}$, then $\hat{H}_e$ and $\hat{H}_y$ are in $\hat{\mathcal{A}}^{2n\times 2n}$.
Since the proof of Theorem 3 is purely algebraic, it obviously extends almost verbatim to the lumped case.
Corollary 3.1. (Linear time-invariant lumped MIMO)
If $\hat{G}_2$ and $\hat{G}_1(I+\hat{G}_2\hat{G}_1)^{-1}$ are exponentially stable, then so are $\hat{H}_e$ and $\hat{H}_y$.
Note that it is this corollary which justifies the common design procedures and the elementary discussions of MIMO feedback systems.
Comments: (a) in Theorem 3 and Corollary 3.1 it is not assumed that $\hat{G}_1$ is in $\hat{\mathcal{A}}^{n\times n}$ or exponentially stable resp., (b) Theorem 3 and Corollary 3.1 bear a striking resemblance to Theorem 1. The nature of the results is different. Theorem 1 deals with nonlinear operators and the conclusion that some operators have finite gain is reached on the basis of similar assumptions concerning $G_2$ and $G_1(I+G_2G_1)^{-1}$, together with the auxiliary assumption $\tilde{\gamma}(G_2) < \infty$. Theorem 3 deals with a specific algebra of linear operators, namely $\hat{\mathcal{A}}^{n\times n}$; the conclusion to be reached is that $\hat{H}_e \in \hat{\mathcal{A}}^{2n\times 2n}$. The proof is purely calculational, relying on properties of linear maps and on the various closure properties of the algebra. The same comments hold for Corollary 3.1. (c) One might think that Theorem 3 follows from Theorem 1 by replacing "$\mathcal{L}$-stable" with "$L_p^n$-stable for all $p \in [1,\infty]$": this, however, is incorrect because the class of convolution operators that are $L_p^n$-stable for all $p \in [1,\infty]$ is larger than $\mathcal{A}^{n\times n}$. Indeed the latter excludes the possibility of singular measures in the convolution kernel.
IV. THE DISCRETE-TIME CASE
The results above except for Theorem 1 and its corollary are stated for the continuous-time case. A study of the proofs shows that they extend easily to the discrete-time case. The required changes are listed in Table I: B(0,1) and $B(0,1)^c$ denote the open unit ball centered on 0 in $\mathbb{C}$ and its complement, resp.; $\ell^1$ denotes the convolution algebra of absolutely convergent sequences: $\ell^1 = \{(z_i)_0^\infty \subset \mathbb{C} \mid \sum_0^\infty |z_i| < \infty\}$ (for details see [12]).
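In discrete time the counterpart of the bound (13) is the inequality $\|h*u\|_p \le \|h\|_1\|u\|_p$ for $h \in \ell^1$. The sketch below checks it on finite scalar sequences for p = 1, 2 and ∞; the sequences and helper names are hypothetical, and the finite-length setting is an assumption made so the sums can be computed exactly as written.

```python
def convolve(h, u):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(u) - 1)
    for i, a in enumerate(h):
        for j, b in enumerate(u):
            out[i + j] += a * b
    return out

def lp_norm(x, p):
    if p == float("inf"):
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

# Hypothetical impulse response (an element of l^1) and input.
h = [0.5, -0.25, 0.125]
u = [1.0, -2.0, 3.0, 0.5]
y = convolve(h, u)

# For each p: (norm of the output, gain bound ||h||_1 * ||u||_p).
checks = {p: (lp_norm(y, p), lp_norm(h, 1) * lp_norm(u, p))
          for p in (1, 2, float("inf"))}
```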
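Table I
Continuous time -> Discrete time
Laplace transform -> Z-transform
$\mathcal{A}$ -> $\ell^1$
$\mathcal{A}^{n\times n}$ -> $(\ell^1)^{n\times n}$
$\mathring{\mathbb{C}}_-$ -> B(0,1)
$\mathbb{C}_+$ -> $B(0,1)^c$
$s \to \infty$ -> $z \to \infty$
$\mathbb{R}(s)^{n\times n}$ -> $\mathbb{R}(z)^{n\times n}$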
Research sponsored by the National Science Foundation Grant GK-43024X.
References
[1] D. W. Porter and A. N. Michel, "Input-Output Stability of Time-Varying Nonlinear Multiloop Feedback Systems," IEEE Trans. AC-19, 4, p. 422-427, Aug. 1974.
[2] F. M. Callier and C. A. Desoer, "A Stability Theorem Concerning Open-loop Unstable Convolution Feedback Systems with Dynamical Feedbacks," to be presented at the IFAC Congress, 1975.
[3] I. W. Sandberg, "Some Results on the Theory of Physical Systems Governed by Nonlinear Functional Equations," Bell Syst. Tech. Jour., 44, p. 871-898 (May-June 1965).
[4] G. Zames, "On the Input-Output Stability of Nonlinear Time-Varying Feedback Systems," IEEE Trans. AC-11, 2, 228-238; 3, 465-467 (1966).
[5] C. A. Desoer and M. Y. Wu, "Stability of Linear Time-Invariant Systems," IEEE Trans. CT-15, p. 245-250 (1968).
[6] ________, "Stability of Multiloop Feedback Linear Time-Invariant Systems," J. Math. Anal. Appl., 23, p. 121-130 (1968).
[7] F. M. Callier and C. A. Desoer, "Necessary and Sufficient Conditions for Stability of n-input n-output Convolution Feedback Systems," IEEE Trans. AC-18, 3, p. 295-298, June 1973.
[8] F. M. Callier and C. A. Desoer, "$L^p$-stability ($1 \le p \le \infty$) of Multivariable Nonlinear Time-Varying Feedback Systems that are Open Loop Stable," Int. Jour. of Control, 19, 1, 65-72, 1974.
[9] M. Vidyasagar, "Some Applications of the Spectral Radius Concept to Nonlinear Feedback Stability," IEEE Trans. CT-19, p. 608-615, Nov. 1972.
[10] J. M. Holtzman, "Nonlinear System Theory," Prentice-Hall, Englewood Cliffs, New Jersey, 1970.
[11] J. C. Willems, "The Analysis of Feedback Systems," MIT Press, Cambridge, Mass., 1971.
[12] C. A. Desoer and M. Vidyasagar, "Feedback Systems: Input Output Properties," Academic Press, New York, 1975.
[13] W. A. Porter and C. L. Zahm, "Basic Concepts in System Theory," Tech. Report 33, Systems Engg. Lab., Univ. of Michigan, Ann Arbor, 1969.
[14] M. J. Damborg and A. Naylor, "Fundamental Structure of Input-Output Stability of Feedback Systems," IEEE Trans. Vol. SSC-6, p. 92-96, 1970.
[15] R. Saeks, "Resolution Space, Operators and Systems," Lecture Notes 82, Springer-Verlag, 1973.
[16] R. De Santis, "Causality, Strict Causality and Invertibility for Systems in Hilbert Resolution Spaces," SIAM J. Control, 12, 3, p. 536-554, Aug. 1974.
[17] S. MacLane and G. Birkhoff, "Algebra," The MacMillan Co., New York, 1967.
[18] C. A. Desoer and W. S. Chan, "Interconnection of Unstable Linear Systems," Proc. Twelfth Allerton Conference 1974, U. of Ill., Urbana, Illinois.
[19] M. Vidyasagar, "Coprime Factorization and Stability of Multivariable Distributed Feedback Systems," (to appear in SIAM J. Control).
[20] F. M. Callier and C. A. Desoer, "Open-loop Unstable Convolution Feedback Systems with Dynamical Feedbacks," to appear in Automatica (Abbreviated Version Proc. IFAC '75).
THE "FOURIER" TRANSFORM OF A RESOLUTION SPACE AND A THEOREM OF MASANI*
R. A. DeCarlo, R. Saeks, and M. Strauss Texas Tech University
Lubbock, Texas
1. ABSTRACT
Using two classic theorems (one of Mackey and another of Stone) and a recent result of Masani and Rosenberg, this paper pieces together a generalized frequency response theory for an abstract Uniform Resolution Space. The present theory assimilates past work by Falb, Freedman, Anton, Masani and Rosenberg, and one of the authors. The results of this paper are not new, but are merely a rearrangement of subtleties uncovered by the aforementioned authors. An interesting consequence of this work is that an abstract Uniform Resolution Space has both a "time transform" and a "frequency transform". Such a duality is not readily identifiable in an $L_2$ function space since the time transform, there, is the identity.
2. INTRODUCTION
Fourier analysis is basic to the design and understanding of physical systems. The property that convolution in the time domain maps into a product in the frequency domain yields a theory both practical and aesthetically pleasing. This note provides what is hoped to be a generalized frequency response theory for arbitrary closed, linear, time invariant operators on a uniform resolution space. Previous attempts at providing a general frequency theory have illuminated numerous subtleties, yet still appear inadequate for one reason or another. Interestingly enough, the mathematics necessary for such a synthesis is well entrenched in the literature. This paper merely pieces these results together and reinterprets them in light of the work of Falb, Freedman, Masani, Rosenberg and Saeks. (3), (9), (12), (21)
Classical Fourier analysis consists essentially of two fundamental ideas: the idea of a "transform" from time to frequency, and the property that a time invariant mapping becomes a product of functions in frequency. We desire a Fourier representation for time invariant operators defined on an appropriate space. Two avenues arise. A traditional approach uses a Fourier-like integral to obtain the representation. In an abstract approach, the Fourier representation is a spectral representation of the abstract operator relative to an appropriate spectral measure. This road is more general and eliminates the need for a specific representation of the operator.
Falb, Freedman and Anton (3), (5) developed a generalization closely paralleling the classical theory. The formulation considers Hilbert space valued $L_2$ functions (square integrable relative to the Haar measure), defined over a locally compact abelian (LCA) group, $G$, and operators which are characterized by an $L_1$ convolutional weighting function. The theory is highly representation-dependent and fits awkwardly into the setting of an abstract resolution space. In fact, the identity and unit delay are not admissible to the theory. The major advantage is that one obtains an operator-valued Fourier representation.
*This research supported in part by Air Force Office of Scientific Research Grant AFOSR 74-2631.
Masani and Rosenberg (11), (7) use a spectral theoretic vehicle to alleviate the difficulty of a specific representation of the operator. Moreover, the theory settles nicely into an abstract setting. Yet, the frequency response is always scalar-valued, even in the multivariable case, and the concept of a "transform" is absent.
Finally, Saeks (21) has a Masani-like development whose Fourier representation assumes values in a suitably restricted class of operators. The advantages are the compatibility with abstract spaces and an operator-valued frequency response. Yet, still, the concept of a transform is missing and major existence questions are still present.
The structure of the present theory rests on the classic theorems of Mackey (7) and Stone (4) and a recent theorem of Rosenberg and Masani. (10) With this comment, we define the setting.
2. UNIFORM RESOLUTION SPACE
A resolution space is a pair, $(H,E)$, where $H$ is a Hilbert space and $E$ is a spectral measure on an ordered LCA topological group, $G$. On an ordered LCA group, a spectral measure determines a resolution of the identity, and conversely. Thus, it is advantageous to work with the resolution of the identity $E^t = E((-\infty, t])$, rather than with the spectral measure $E$, as illustrated at the end of this section.
As an example, consider the Hilbert space, $L_2$, together with the truncation operator, $E^t$, defined as

$$(E^t x)(q) = \begin{cases} x(q), & q \le t \\ 0, & q > t \end{cases}$$

or equivalently, the spectral measure, defined via

$$(E(B)x)(q) = \begin{cases} x(q), & q \in B \\ 0, & q \notin B \end{cases}$$

for all Borel sets $B$.
In addition, $L_2$ admits a group $U$ of shift operators $U^t$, defined as

$$(U^t x)(q) = x(q - t).$$
Thus, the concept of time invariance is well defined in a classical L2 setting. In general, a resolution space lacks the concept of time invariance. Such a property requires an extension of the concept of the L2 "time-shift". A group of such operators, in general, fails to exist in an arbitrary resolution space.
In particular, we seek a strongly continuous group of unitary operators (i.e., $U^{t-s} = U^t (U^s)^{-1}$ for all $t$ and $s$ in $G$), such that

$$U^t E(B) = E(B + t)\,U^t$$

for all $t$ in $G$ and Borel sets $B$. A resolution space, together with such a group $U$ of shift operators, $U^t$, is a Uniform Resolution Space (URS), denoted by the triple $(H,E,U)$.
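The relation between shifts and the spectral measure can be checked by hand in a finite toy model. The sketch below is an illustration only, under the assumption that $G$ is replaced by the cyclic group $Z_N$ and $H$ by $\mathbb{C}^N$; the names `shift`, `project` and `translate` are hypothetical. Since $Z_N$ carries no group ordering, the model illustrates the imprimitivity equality but not the resolution of the identity.

```python
# Toy model: G = Z_N (cyclic group), H = C^N. All names are illustrative.
N = 8

def shift(t, x):
    """(U^t x)(q) = x(q - t), with indices taken modulo N."""
    return [x[(q - t) % N] for q in range(N)]

def project(B, x):
    """(E(B) x)(q) = x(q) if q is in B, and 0 otherwise."""
    return [x[q] if q in B else 0 for q in range(N)]

def translate(B, t):
    """The translated set B + t in Z_N."""
    return {(b + t) % N for b in B}

x = [complex(q + 1, -q) for q in range(N)]
B = {0, 2, 3}
t = 5

lhs = shift(t, project(B, x))                 # U^t E(B) x
rhs = project(translate(B, t), shift(t, x))   # E(B + t) U^t x
imprimitivity_holds = lhs == rhs
print(imprimitivity_holds)
```

Both sides evaluate to $x(q-t)$ when $q - t$ lies in $B$ and to zero otherwise, which is why the comparison can be exact.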
Underlying each URS is an ordered LCA group, $G$, which, for our purposes, is time. Associated with $G$ is a "character group", $\hat G$, which is the group of continuous homomorphisms from $G$ into the multiplicative group of complex numbers of magnitude one. Note that $\hat G$ is, in general, not ordered.
In like manner, attached to each URS (e.g., $(H,E,U)$), defined over $G$, is a "dual" character space $(H,\hat U,\hat E)$*, defined over $\hat G$. $\hat E$ and $\hat U$ are a spectral measure and group of shift operators, respectively, defined via the two equalities

$$U^t = \int_{\hat G} (\gamma,-t)\,d\hat E(\gamma), \qquad t \in G$$

and

$$\hat U^\gamma = \int_G (\gamma,-t)\,dE(t), \qquad \gamma \in \hat G.$$

Here, $(\gamma,-t)$ denotes the complex number of magnitude one resulting from the operation of the character $\gamma$ in $\hat G$ acting on $-t$ in $G$, and the integral is the Lebesgue integral. Stone's theorem (4) assures the existence and uniqueness of $\hat E$ and $\hat U$.
Oddly, the character space $(H,\hat U,\hat E)$ is not a resolution space since $\hat G$ is not ordered. However, $(H,\hat U,\hat E)$ displays all the resolution space properties which do not depend on the ordering of $G$. In fact, by Stone's theorem (4), (7), (12), $\hat U$ is a group of shift operators for $\hat E$, satisfying the imprimitivity equality over $\hat G$, i.e.,

$$\hat U^\gamma \hat E(B) = \hat E(B + \gamma)\,\hat U^\gamma.$$

For our purposes, the character group plays the role of frequency.
*We have adopted the ordering $(H,\hat U,\hat E)$ because, via Stone's theorem, $\hat E$ and $U$ contain the same information. Moreover, $\hat U$ and $E$ do. Thus, $(H,\hat U,\hat E)$ rather than $(H,\hat E,\hat U)$.
Now, the physical properties of causality, memorylessness, time invariance, etcetera, have precise descriptions in the uniform resolution space structure. In particular, for bounded operators, $T$, on $(H,E,U)$, causality is equivalent to $E^tT = E^tTE^t$ (1), (2), (20); anticausality, to $E_tT = E_tTE_t$*; memorylessness, to $E^tT = TE^t$ which, in turn, is equivalent to $T$ being both causal and anticausal. Since memorylessness is a symmetric concept, it has an analog in the character space, $(H,\hat U,\hat E)$, whereas causality does not. Because of this, we say a bounded operator, $T$, is time invariant if $\hat E(B)T = T\hat E(B)$ for all Borel sets $B$ of $\hat G$. Via Stone's theorem, this is equivalent to $U^tT = TU^t$ for all $t$ in $G$. Clearly, we emphasize the character space in the definition of time invariance.
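The bounded-operator conditions can be made concrete in finite dimensions. The following minimal sketch assumes a toy space $H = \mathbb{R}^N$ indexed by times $0,\dots,N-1$, with $E^t$ the truncation that keeps coordinates $q \le t$; the helper names and the sample matrices are hypothetical. In this model the causal operators are the lower triangular matrices and the memoryless ones are the diagonal matrices.

```python
N = 4  # toy time axis 0, 1, ..., N-1

def E(t):
    """Truncation projection E^t: keep coordinates q <= t, zero the rest."""
    return [[1 if (i == j and i <= t) else 0 for j in range(N)]
            for i in range(N)]

def matmul(A, B):
    """Plain matrix product of two N x N matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def is_causal(T):
    """Check E^t T = E^t T E^t for every t."""
    return all(matmul(E(t), T) == matmul(matmul(E(t), T), E(t))
               for t in range(N))

def is_memoryless(T):
    """Check E^t T = T E^t for every t."""
    return all(matmul(E(t), T) == matmul(T, E(t)) for t in range(N))

lower = [[1, 0, 0, 0], [2, 1, 0, 0], [0, 3, 1, 0], [4, 0, 5, 1]]
upper = [[lower[j][i] for j in range(N)] for i in range(N)]  # transpose
diag = [[2 if i == j else 0 for j in range(N)] for i in range(N)]

print(is_causal(lower), is_causal(upper), is_memoryless(diag))
```

Here `upper` looks into the future (its output at time $i$ depends on inputs at times $j > i$), so it fails the causality test, while `lower` is causal without being memoryless.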
In the case of unbounded operators, $T$ (e.g., the derivative operator), $T$ is causal if $E^tT \subset E^tTE^t$**; $T$ is anticausal if $E_tT \subset E_tTE_t$; $T$ is memoryless if $E^tT = TE^t$; and, finally, although somewhat non-intuitively, $T$ is time invariant if $\hat E(B)T \subset T\hat E(B)$ for all Borel sets $B$ in $\hat G$, where, again, we emphasize the definition in the character space. For unbounded operators, Stone's theorem, in general, does not yield an equivalent statement (such as $U^tT = TU^t$) in the original resolution space. However, for linear, single-valued, closed operators with domain dense in $H$, $U^tT = TU^t$ if and only if $\hat E(B)T \subset T\hat E(B)$. (9), (10) The fundamental role of the character space becomes clearer in the following section.
3. EQUIVALENT SPACES
In this section, Mackey's theorem verifies an equivalence between an abstract URS, $(H,E,U)$, and a function space, $(L_2(G,K), \chi_B, \sigma^t)$. Now, the relevant information contained in $(H,E,U)$ is also contained in $(H,\hat U,\hat E)$. Thus, applying Mackey's theorem to $(H,E,U)$ (under the guise of $(H,\hat U,\hat E)$), another equivalence, to $(L_2(\hat G,K), \hat\sigma^\gamma, \chi_{\hat B})$, exists. Furthermore, $(L_2(G,K), \chi_B, \sigma^t)$ and $(L_2(\hat G,K), \hat\sigma^\gamma, \chi_{\hat B})$ have an affinity via Stone's theorem.
After numerous references throughout this development to the above two authors, we at last state their results precisely. Hopefully, this will facilitate understanding of the maps between the various spaces hinted at in the above paragraph. The following is a statement of Stone's theorem for LCA groups. (4), (14)
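Suppose $G$ is an LCA group and $\hat G$ its "dual" character group; let $(\gamma,-t)$ be the complex number of magnitude one resulting from the operation of $\gamma$ in $\hat G$ on $-t$ in $G$; define $\Sigma$ ($\hat\Sigma$) as the $\sigma$-algebra of Borel sets of $G$ ($\hat G$); finally, let $\{U^t \mid t \text{ in } G\}$ ($\{\hat U^\gamma \mid \gamma \text{ in } \hat G\}$) be a strongly continuous group of unitary operators on a complex Hilbert space, $H$, onto $H$. Then, there exists a unique spectral measure, $\hat E(\cdot)$ ($E(\cdot)$), for $H$ on $\hat\Sigma$ ($\Sigma$), such that for all $t$ in $G$ ($\gamma$ in $\hat G$),

$$U^t = \int_{\hat G} (\gamma,-t)\,d\hat E(\gamma)$$

or

$$\hat U^\gamma = \int_G (\gamma,-t)\,dE(t).$$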
The initial task, now, is to construct an equivalence between two $L_2$ spaces via this theorem.
*$E_t = E((t,\infty]) = I - E^t$.
**For an unbounded operator, $T$, on a resolution space, $(H,E,U)$, the domain of $E^tT$ is smaller than the domain of $TE^t$. As such, the containments indicate that, where the domains of $E^tT$ and $TE^t$ coincide, $E^tT = TE^t$.
Consider the URS, $(L_2(G,K), \chi_B, \sigma^t)$, and the character space, $(L_2(\hat G,K), \hat\sigma^\gamma, \chi_{\hat B})$, where $G$ is an LCA group; $\hat G$, its character group; $\chi_B$, the characteristic function of the Borel set $B$ in $\Sigma$; $\sigma^t$, the classical shift operator (i.e., $(\sigma^t f)(q) = f(q - t)$)*; and, lastly, the measure on the space will be the Haar measure, $m$.
Now, the Fourier transform maps $L_2(G,K)$ to $L_2(\hat G,K)$ in such a manner that $\chi_B$ is taken to the spectral measure on $L_2(\hat G,K)$ whose integral is $\hat\sigma^\gamma$. Similarly, $\sigma^t$ maps to the unitary group on $L_2(\hat G,K)$ whose associated spectral measure is $\chi_{\hat B}$. As such, $(L_2(\hat G,K), \hat\sigma^\gamma, \chi_{\hat B})$ is the Fourier transform of the character space for $(L_2(G,K), \chi_B, \sigma^t)$.
Interpreting this, we have $\chi_B$ completely determining $\hat\sigma^\gamma$, and $\chi_{\hat B}$ completely specifying $\sigma^t$. The link between these two spaces and an abstract resolution space is Mackey's theorem. The statement of his theorem for LCA groups follows. (19)
Let $E$ be a spectral measure on the Borel sets of an LCA group, $J$, and let $U$ be a strongly continuous, unitary representation of $J$, such that

$$E(B + t)\,U^t = U^t E(B)$$**

for all Borel sets $B$ of $J$ and all $t$ in $J$; then, there exists a unique Hilbert space, $K$, and a unitary transformation, $M$,

$$M : H \to L_2(J,K,m),$$

such that
(1) $M E(B) M^{-1} = \chi_B$ for all Borel sets of $J$; and
(2) $M U^t M^{-1} = \sigma^t$ for all $t$ in $J$,
where $K$ is a complex Hilbert space and $m$, the Haar measure.
For an arbitrary URS, $(H,E,U)$, defined over an LCA group, $G$, by design, $E$ and $U$ satisfy the imprimitivity equality. Moreover, the character space, $(H,\hat U,\hat E)$, possesses the property that $\hat E$ and $\hat U$ satisfy the imprimitivity equality. (20) Finally, it is clear that $(L_2(G,K), \chi_B, \sigma^t)$ and $(L_2(\hat G,K), \hat\sigma^\gamma, \chi_{\hat B})$ satisfy the hypotheses of the theorem. Therefore, by blending Mackey's and Stone's theorems, the following commutative diagram results:

$$\begin{array}{ccc} & (H,\ E(\hat U),\ U(\hat E)) & \\ {\scriptstyle M}\swarrow & & \searrow{\scriptstyle \hat M} \\ (L_2(G,K),\ \chi_B,\ \sigma^t) & \overset{F}{\longleftrightarrow} & (L_2(\hat G,K),\ \hat\sigma^\gamma,\ \chi_{\hat B}) \end{array}$$

Figure 1
Remarkably, the diagram reveals that an arbitrary URS is equivalent, under a memoryless, time invariant, unitary transformation (a uniform resolution space isomorphism), to an $L_2$ space. Distinctions among the spaces for a fixed $G$, therefore, depend only on the cardinality of the space, $K$. (19) Mixing this equivalence with a recent result of Masani and Rosenberg (10) gives the desired structure, i.e., "time invariance" maps to multiplication.
4. APPLICATION OF THE MASANI-ROSENBERG RESULT
This section begins with the result of the above mentioned authors. The theorem is not stated in its general form (10), but is restricted to a group $J$, a Hilbert space, $K$, and the Haar measure, $m$.
Let $T$ be a closed, single-valued, linear operator with dense domain on $L_2(J,K,m)$, such that $T$ commutes with the operation of multiplication by the characteristic function of a Borel set, i.e.,

$$\chi_B T \subset T \chi_B$$

for all $B$ in $\Sigma$ (the $\sigma$-algebra of all Borel sets of $J$). Then, there exists a measurable function, $\hat T$ on $J$, whose values are operators on $K$, such that

$$(Tf)(j) = \hat T(j) f(j), \qquad j \text{ in } J.$$
This theorem applies to function space. Thus, to verify the sought after properties on the abstract URS, we first apply the Mackey transforms as in Figure 1, redrawn below for simplicity.

$$\begin{array}{ccc} & (H,\ E(\hat U),\ U(\hat E)) & \\ {\scriptstyle M}\swarrow & & \searrow{\scriptstyle \hat M} \\ (L_2(G,K),\ \chi_B,\ \sigma^t) & \overset{F}{\longleftrightarrow} & (L_2(\hat G,K),\ \hat\sigma^\gamma,\ \chi_{\hat B}) \end{array}$$

*Here, $\{\sigma^t \mid t \text{ in } G\}$ and $\{\chi_B \mid B$ in the set of Borel sets of $G\}$ serve as the strongly continuous group of shift operators and the spectral measure, respectively.
**This is the imprimitivity equality.
Our discussion dwells upon two types of operators in the abstract URS: memoryless operators and time invariant operators. First, we consider the time invariant case.
Let $T$ be a closed, linear, single-valued, time invariant operator on $(H, E(\hat U), U(\hat E))$. Recall that $T$ is time invariant if $\hat E(B)T \subset T\hat E(B)$. Under the Mackey transform, $\hat M$ (which we term the Mackey frequency transform), we have an equivalent statement in $(L_2(\hat G,K), \hat\sigma^\gamma, \chi_{\hat B})$, as follows:

$$\chi_{\hat B}\, T_{\hat M} \subset T_{\hat M}\, \chi_{\hat B},$$

where $T_{\hat M}$ is the image of $T$ under the $\hat M$ transformation. Clearly, the conditions of the Masani-Rosenberg theorem are satisfied. Thus, there exists a mapping, $\hat T$, on $\hat G$, whose values are operators on $K$, such that

$$(T_{\hat M} h)(\gamma) = \hat T(\gamma)\, h(\gamma)$$

for all $\gamma$ in $\hat G$ and $h$ in $L_2(\hat G,K)$. This says that time invariant, closed, linear, single-valued operators on an abstract URS are, as hoped, multiplications in the "frequency domain", i.e., in $(L_2(\hat G,K), \hat\sigma^\gamma, \chi_{\hat B})$.
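The conclusion that time invariant operators become multiplications in frequency has a familiar finite-dimensional shadow: on $Z_N$, an operator that commutes with the shifts is a circular convolution, and the discrete Fourier transform turns it into pointwise multiplication. The sketch below illustrates this under the assumptions $G = Z_N$ and $K = \mathbb{C}$; the weighting sequence `h`, the test vector `x` and the function names are hypothetical.

```python
import cmath

N = 8  # toy "time" group G = Z_N; its character group is again Z_N

def dft(x):
    """Fourier transform on Z_N: x_hat(k) = sum_q x(q) * exp(-2*pi*i*k*q/N)."""
    return [sum(x[q] * cmath.exp(-2j * cmath.pi * k * q / N) for q in range(N))
            for k in range(N)]

def shift(t, x):
    """(U^t x)(q) = x(q - t), indices modulo N."""
    return [x[(q - t) % N] for q in range(N)]

def apply_T(h, x):
    """Time invariant operator: (T x)(q) = sum_s h(s) * x(q - s)."""
    return [sum(h[s] * x[(q - s) % N] for s in range(N)) for q in range(N)]

h = [1.0, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
x = [float((q * q) % 5) for q in range(N)]

# T commutes with every shift, i.e. it is time invariant.
commutes = all(
    max(abs(a - b) for a, b in zip(apply_T(h, shift(t, x)),
                                   shift(t, apply_T(h, x)))) < 1e-12
    for t in range(N))

# In the frequency domain T acts as multiplication by T_hat = dft(h).
T_hat = dft(h)
x_hat = dft(x)
lhs = dft(apply_T(h, x))
rhs = [T_hat[k] * x_hat[k] for k in range(N)]
max_error = max(abs(a - b) for a, b in zip(lhs, rhs))
print(commutes, max_error)
```

In this scalar toy case the multiplier is a complex number at each frequency; the operator-valued multiplier of the theorem corresponds to replacing $\mathbb{C}$ by a larger space $K$.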
Now, let $T$ be a memoryless, linear, closed, single-valued operator on $(H,E,U)$. Recall that $T$ is memoryless if $E^tT = TE^t$. Thus, by reasoning similar to the time invariant case, the image of $T$ under the Mackey time transform, $M$, commutes with $\chi_B$ in $(L_2(G,K), \chi_B, \sigma^t)$. Thus, it is equivalent to a multiplication by the Masani-Rosenberg theorem.
This structure, then, shows that certain operators on an abstract URS have the "right" properties. It is interesting to note that there is a duality inherent in this formulation. The presence of a Mackey "time-transform" and a corresponding "frequency-transform" is apparently necessary for the cohesiveness of the theory.
5. CONCLUSIONS
The above ideas assimilate past theories in a number of ways. The theory generalizes the Falb-Freedman-Anton work because of the abstract setting and because it is valid for a larger class of operators. Clearly, there is no restriction to scalar-valued frequency responses as in (9) and (13). Moreover, it circumvents the existence questions associated with Saeks' work. (21) In fact, as in (21), given appropriate conditions, a multiplication on a function space can be viewed as an integral over the spectral measure, defined via multiplication by $\chi$, i.e.,

$$T = \int \hat T(\omega)\,d\chi(\omega).$$

Hence, the pre-image of $T$ under the Mackey frequency transform in $(H,E,U)$ takes the form

$$T = \int \hat T(\gamma)\,d\hat E(\gamma),$$

as was specified in (21).
In a private conversation with one of the authors, Desoer raised a question about the fact that differential operators satisfied the definition of memorylessness in an abstract setting. The question caused some doubt in our minds as to the appropriateness of the definition. This note gives a partial answer, in that closed, memoryless operators are multiplications. Hence, the apparent pathology noted by Desoer can only arise in the case of nonclosed operators.
5. REFERENCES
1. R. DeSantis, Causality and Stability in Resolution Space, Proc. 14th Midwest Symposium on Circuit Theory, Univ. of Denver, 1971.
2. ________, Causality Structure of Engineering Systems, Doctoral Thesis, Univ. of Michigan, Ann Arbor, 1971.
3. P. L. Falb and M. I. Freedman, A Generalized Transform Theory for Causal Operators, SIAM J. Control, 7 (1969), pp. 452-471.
4. P. A. Fillmore, Notes on Operator Theory, Van Nostrand-Reinhold, New York, 1970.
5. M. I. Freedman, P. L. Falb and J. Anton, A Note on Causality and Analyticity, SIAM J. Control, 7 (1969), pp. 472-478.
6. G. W. Mackey, Induced Representations of Groups and Quantum Mechanics, W. A. Benjamin, New York, 1968.
7. ________, A Theorem of Stone and von Neumann, Duke Math. J., 16 (1949), pp. 313-326.
8. ________, Group Representations, Lecture Notes, Oxford Univ., 1966-1967.
9. P. Masani and M. Rosenberg, When is an Operator the Integral of a Spectral Measure, (to appear).
10. ________, The Multiplication Operator in L2 over a Localizable Space and Bochner's Theorem, (to appear).
11. P. Masani, The Normality of Time-invariant Subordinate Operators in a Hilbert Space, Bull. Amer. Math. Soc., 71 (1965), pp. 546-550.
12. ________, Orthogonally Scattered Measures, Advances in Math., 2 (1968), pp. 61-117.
13. ________, Quasi-Isometric Measures and their Applications, Bull. Amer. Math. Soc., 76 (1970), pp. 472-528.
14. F. Riesz and B. Sz.-Nagy, Functional Analysis, Frederick Ungar, New York, 1953.
15. R. Saeks, Causality in Hilbert Space, SIAM Review, 12 (1973), pp. 283-308.
16. ________, State in Hilbert Space, SIAM Review, 15 (1973), pp. 283-308.
17. ________, On the Existence of Shift Operators, Tech. Memo. EE-707, Univ. of Notre Dame, Indiana, 1970.
18. ________, Causal Factorization, Shift Operators and the Spectral Multiplicity Function, in Vector and Operator Valued Measures and Applications, Academic Press, Inc., 1973.
19. ________, Resolution Space, Operators and Systems, Springer-Verlag, New York, 1973.
20. ________, Fourier Analysis in Hilbert Space, SIAM Review, 15 (1973), pp. 604-637.
21. I. E. Segal, Equivalence of Measure Spaces, Amer. J. Math., 73 (1951), pp. 275-313.
LUMPED-DISTRIBUTED NETWORK SYNTHESIS
VIA INVARIANT SUBSPACE THEORY
by
P. Dewilde
Departement Elektrotechniek
Katholieke Universiteit Leuven
Heverlee, Belgium
and
J. S. Baras
Electrical Engineering Department
University of Maryland
College Park, Maryland 20742
SUMMARY
The goal of this paper is to provide cascade synthesis methods for general nonrational impedance/scattering or transfer/scattering matrices. It is shown that the invariant subspace theory of Helson [1] and Lax [2] and the factorization theorems of V. P. Potapov [3] provide the means for synthesis techniques which generalize those used by Belevitch [4,5] and Oono-Yasuura [6] in the rational case. In contrast with the standard multivariable approach to lumped-distributed network synthesis [7-9], our methods are single variable. It should be emphasized that in the Russian literature many examples can be found where similar single variable techniques from operator theory are applied not only to network synthesis problems but to more general systems synthesis as well. See for example [10-11] and the interesting book by Livsic [12].
We analyse both lossless and lossy synthesis. We assume that we are given a scattering matrix $S(j\omega)$, a transfer scattering matrix $L_{21}(j\omega)$ or an input impedance $Z(j\omega)$, which are non-rational, and we discuss methods to synthesize a cascade lumped-distributed network realizing the function at hand. Starting from an analysis of the system properties of a (possibly lossy) scattering matrix, we deduce the crucial property that a lossy (ordinary or transfer) scattering matrix can be synthesized in a lossless cascade if and only if its nullspace is big enough, or if and only if it has a pseudo-continuation of bounded type in the left half plane [13], or if and only if it can be embedded in a lossless scattering matrix. Such matrices are called "roomy". Any factorization technique for a lossless scattering matrix, e.g. Potapov [11], will then yield a cascade synthesis. Such a factorization has the form:
(1") S I = B.P
where $B$ is a Blaschke product and $P$ a singular part. Based on the ordering of matrix inner functions by divisibility, a degree theory is developed for (possibly nonrational) scattering matrices which generalizes that of the rational case. The notion of reduction of degree in this generalized sense is defined. Based on that, we show how one can pull out of (1) elementary factors that are directly synthesizable and thus produce a cascade synthesis. For preliminary results in this direction we refer to [14-18].
Next, technical and computational problems are discussed. First there is the problem of obtaining a spectral factorization, and next the problem of computing a lossless embedding. A method to do both at once, based on a Belevitch synthesis type procedure, is presented and its applicability is discussed. We give various restrictions that should be imposed on $S$, so that the resulting networks will be realistic, and we illustrate these results by particular electrical networks.
At the end we present a discussion that connects the network synthesis problems discussed here with infinite dimensional realization theory.
REFERENCES
[1] H. Helson, Lectures on Invariant Subspaces, Academic Press, N.Y., 1964.
[2] P. D. Lax, "Translation Invariant Spaces," Acta Math., Vol. 101, 1959.
[3] V. P. Potapov, "The Multiplicative Structure of J-Contractive Matrix Functions," A.M.S. Transl. Ser. 2, Vol. 15, 131-243.
[4] V. Belevitch, "Factorization of scattering matrices with application to passive network synthesis," Phil. Res. Repts. 18 (4), 275-317 (Aug. 1963).
[5] V. Belevitch, Classical Network Theory, Holden Day, San Francisco, 1968.
[6] Y. Oono and K. Yasuura, "Synthesis of finite passive 2n-terminal Networks with prescribed Scattering Matrices," Mem. Fac. Eng. Kyushu Univ. 14 (2), 125-177 (May 1954).
[7] D. C. Youla, "A Review of Some Recent Developments in the Synthesis of Rational Multivariable Positive-Real Matrices," SIAM-AMS Proceedings, Vol. III, Mathematical Aspects of Electrical Network Analysis, H. S. Wilf and F. Harary (eds.), 1971, 161-190.
[8] T. Koga, "Synthesis of a Resistively Terminated Cascade of Uniform Lossless Transmission Lines and Lumped Passive Lossless Two-Ports," IEEE Trans. Circuit Theory, Vol. CT-18, 1971, 444-455.
[9] D. C. Youla, J. D. Rhodes and P. C. Marston, "Driving-Point Synthesis of Resistor-Terminated Cascades Composed of Lumped Lossless Passive Two-Ports and Commensurate TEM Lines," IEEE Trans. on Circuit Theory, CT-19, No. 6, 1972, 648-664.
[10] A. V. Efimov, "Realization of Reactive J-expanding Matrix Functions" (in Russian), Akad. Nauk Armjansk. SSSR Izvest. V, No. 1 (1970), 54-63.
[11] V. P. Potapov, "General Theorems on the Structure and Detaching of Elementary Factors of Analytical Matrix Functions" (in Russian), Dokl. Akad. Nauk Armjansk. SSSR, XLVIII, No. 5, 1969, 257-263.
[12] M. S. Livsic, Operators, Oscillations, Waves, Open Systems, A.M.S. Transl. of Math. Monographs, Vol. 34, 1973.
[13] R. G. Douglas and J. W. Helton, "Inner Dilations of Analytic Matrix Functions and Darlington Synthesis," to appear.
[14] P. Dewilde, "Roomy Scattering Matrix Synthesis," Technical Report, Dept. of Mathematics, Univ. of California, Berkeley, 1971.
[15] ________, "On the Finite Unitary Embedding Theorem for Lossy Scattering Matrices," Proc. 1974 European Conference on Circuit Theory and Design, IEE, London.
[16] ________, "Cascade Scattering Matrix Synthesis," Techn. Rept. No. 6560-21, Info. Systems Lab., Stanford Univ., 1970.
[17] ________, V. Belevitch and R. W. Newcomb, "On the Problem of Degree Reduction of a Scattering Matrix by Factorization," Journal of the Franklin Inst., Vol. 291, No. 5, May 1971, 387-401.
[18] J. S. Baras, "Lumped-Distributed Network Synthesis and Infinite Dimensional Realization Theory," Proc. 1974 International Symposium on Circuits and Systems, San Francisco, 76-80.
Linear Hilbert Networks Containing Finitely Many Nonlinear Elements
Vaclav Dolezal
Abstract
This paper establishes conditions for the existence and uniqueness of a regime in a linear (finite or infinite) Hilbert network which contains finitely many nonlinear multivalued elements. These conditions are given in terms of the driving point-set impedance of the linear network and the operator describing the nonlinear elements.
Linear Hilbert Networks Containing Finitely Many Nonlinear Elements
Vaclav Dolezal
Technical Summary
The objective of the paper is to give relatively simple conditions which guarantee the existence and uniqueness of a regime in a linear (finite or infinite) Hilbert network containing finitely many nonlinear multivalued elements.
Using the theory developed in papers [1] - [3], we first prove a theorem on the existence of the driving point-set impedance of a (nonlinear, in general) Hilbert network.
Then we establish the main theorem on existence and uniqueness of a regime in a network under consideration. It turns out that the necessary and sufficient conditions in question are given by the behavior of the mapping $R + Z^+$, where $R$ is the driving point-set impedance of the linear part of the network, and $Z^+$ is an operator describing the nonlinear element.
Moreover, it will be shown that the main theorem holds also for finite networks whose variables are in a linear space that need not be a Hilbert space.
As examples illustrating the application of the results, we consider a finite R, L, C network with constant elements, which contains either a nonlinear resistor or a nonlinear inductance, and a DC-current network containing several nonlinear resistors.
References
[1] V. Dolezal, Hilbert Networks I, SIAM J. Control, Vol. 12, No. 4, Nov. 1974.
[2] V. Dolezal and A. H. Zemanian, Hilbert Networks II - Some Qualitative Properties, SIAM J. Control, Vol. 13, No. 1, Feb. 1975.
[3] V. Dolezal, Generalized Hilbert Networks, to appear in SIAM J. Control.
Livsic's Chain Synthesis
T. T. Ha and R. W. Newcomb
Electrical Engineering Department
University of Maryland
College Park, Maryland 20742
ABSTRACT
The factorization method of M. S. Livsic for J-lossless matrices is presented in concise form.
"On the other bank, the palms beckoned us". [1]
I. INTRODUCTION
In the book of M. S. Livsic [2] a factorization theory for J-lossless matrices is presented which is useful for the cascade synthesis of lossless networks. Unfortunately this is spread throughout a large segment of the book and in a language not too familiar to western engineers. Consequently we here attempt to present the ideas concisely and in a form more accessible to western engineers. Although we hope this paper will be self-contained, we point out the companion paper [3] where the background ideas are presented.
II. PRELIMINARIES
We consider as given an n x n matrix transmission operator $S(p)$ of the complex frequency variable $p = \sigma + j\omega$. Associated with $S(p)$ is a constant matrix $J$ satisfying $J = J^a$, where superscript $a$ means adjoint, and $J^2 = I$, with $I$ the identity. Physically a transmission operator can be interpreted as a chain, scattering or transfer scattering matrix (or various combinations of these), the interpretation depending upon the associated $J$. Thus, for example, $J = I$ corresponds to the scattering matrix while J =[~~] is associated with the chain matrix [3].
We assume, as per [3], that the transmission operator can be written in the form

$$S(p) = I + J\Gamma^a [T - pI]^{-1} \Gamma \qquad (1a)$$
where $T$, the interior operator, and $\Gamma$, the input to state space operator, are consequently taken as known. The interior operator is also taken to satisfy the relationship [3]

(1b)

Knowing the transmission operator in this form we clearly know the input to interior operator

$$R = [T - pI]^{-1}\Gamma \qquad (2a)$$

from which

$$S(p) = I + J\Gamma^a R \qquad (2b)$$

As a final preliminary we comment that if $Q$ is an isometry, $Q^aQ = I$ [4, p. 1S], then the replacement

$$\tilde T = QTQ^a \qquad (3a)$$

$$\tilde\Gamma = Q\Gamma \qquad (3b)$$

leaves $S(p)$ invariant, that is

$$\tilde S(p) = S(p) \qquad (4a)$$

though we find

$$\tilde R = QR \qquad (4b)$$
III. FACTORIZATION - RATIONAL CASE
At this point we assume that $S(p)$ is rational in $p$, in which case $T$ in (1a) can be taken to be an m x m matrix operator with $m$ finite. By putting $T$ in lower triangular form, (5), we will show that a decomposition of the input to interior operator $R$, (10), results to yield a factorization, (12).
By standard techniques [5, p. 75] there exists a unitary transformation $Q$, which is a special case of an isometry, for which $T$ can be put in lower triangular form through (3a). Thus we assume that $T$ is in lower triangular form and, hence, by partition, written as

$$T = \begin{bmatrix} T_{11} & 0 \\ T_{21} & T_{22} \end{bmatrix} \qquad (5)$$

where $T_{11}$ and $T_{22}$ are also lower triangular, with $T_{11}$ of any desired number of rows (smaller than $m$, assuming $m > 1$).
Assuming now that $T$ is in lower triangular form, and to go along with the partition of (5), we define projection matrix operators

$$E_1 = \begin{bmatrix} I_1 & 0 \\ 0 & 0 \end{bmatrix}, \qquad E_2 = \begin{bmatrix} 0 & 0 \\ 0 & I_2 \end{bmatrix} \qquad (6)$$

where $I_1$ and $I_2$ are identities of appropriate dimension; here $E_1 + E_2 = I$. By (5)

(7a)-(7e)

from which our definitions for $T_1$, $T_2$, $\Gamma_1$, $\Gamma_2$ should be clear. Further, as $E_1TE_2 = 0$,

(7f)

(7g)

With these points in mind we can invert $T - pI$
(-1 -1 -,22 T2-pI) £2~1 (Tl-pI) El ( Bb)
= El(Tl-pI)-~1+E2(T2-PI)-122
(8c)
Consequently we obtain, on directly using (8c) in (2),
(9a)
(9b)
(9c)
(10)
(9d)
(ge)
(9f)
(9g)
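Equation (10) is the main one which allows the factorization, for

$$\begin{aligned} S &= I + J\Gamma^a [T - pI]^{-1}\Gamma && (11a) \\ &= I + J(\Gamma_1^a + \Gamma_2^a)[R_1 + R_2 S_1] && (11b) \\ &= S_1 + J\Gamma_2^a R_2 S_1 && (11c) \\ &= (I + J\Gamma_2^a R_2) S_1 && (11d) \\ &= S_2 S_1 && (12) \end{aligned}$$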
Having obtained the desired factorization, (12),
we comment that if the degree δ[S] [6, p. 195]
is m, that is, the state-space realization of S(p)
for (1a) is minimal, then
(13)
as seen from (9) using E1 + E2 = I.
IV. FACTORIZATION - IRRATIONAL CASE
If S(p) is irrational in p then the
above procedure still works whenever T can be
put into the lower triangular form (5) by an
isometry Q. The discussion in [7, p. 73]
indicates that this will always be the case if
S is a contraction. See also [2, Chap. 6].
V. DISCUSSION
Given an S(p) in the form of (1a) we have
presented the ideas of Livsic which lead to its
factorization, which, as we have seen, is always
possible in the rational case. This has the advantage that
optimal degree reduction occurs. Too, if one
desires degree one factors then such can be obtained
by the use of a partition at (5) in which
T11 is 1 x 1. However, it should be pointed out
that the triangularization of T required may
lead to complex valued T and Γ.
The S(p) under consideration all satisfy
(14)
in which case they are often called J-lossless [8].
Such correspond physically to lossless network
structures when S(p) has appropriate analyticity
properties to guarantee passivity.
"A few more strokes! The bank was now closed"
[1].
[1] G. Nam, "Night Crossing", in "We Promise One
Another", The Indochina Mobile Education
Project, Washington, DC, 1971, pp. 57-59.
[2] M. S. Livsic, "Operators, Oscillations,
Waves (Open Systems)", Translations of
Mathematical Monographs, Vol. 34, American
Mathematical Society, Providence, RI, 1973.
[3] T. T. Ha and R. W. Newcomb, "Fundamentals
for Livsic's Chain Synthesis", Proceedings of
the 18th Midwest Symposium on Circuits and
Systems, August 11-12, 1975, pp. 337-340.
[4] P. A. Fillmore, "Notes on Operator Theory",
Van Nostrand Reinhold Co., New York, 1970.
[5] C. C. MacDuffee, "The Theory of Matrices",
Chelsea, New York, 1956.
[6] R. W. Newcomb, "Linear Multiport Synthesis",
McGraw-Hill, New York, 1966.
[7] B. Sz.-Nagy and C. Foias, "Harmonic Analysis
of Operators on Hilbert Spaces", North
Holland, Amsterdam, 1970.
[8] N. Levan, "Theory and Applications of J-Lossless
Scattering Systems", Journal of
the Franklin Institute, Vol. 294, No. 5,
November, 1972, pp. 312-321.
RADAR TARGET RECOGNITION -
AN OPERATOR THEORETIC APPROACH
D. E. Hammers and A. J. MacKinnon
ITT Gilfillan
Van Nuys, Cal.
Abstract
The problem of signal design for radar target recognition is studied from an operator theoretic point of view. Target operators are structured from measured signatures so that recognition becomes a matter of identifying a target operator as a member of an a priori known set of operators. A signal design criterion is established and shown to be optimum in relation to the statistical reliability of recognizing a target in the presence of White Gaussian Noise.
1. INTRODUCTION
Prior to 1970 radar systems were designed on the basis of classical analyses as established in various significant radar texts. (1)(2) These methods, for the most part, are based on important fundamental work by many individuals (too many to name here) towards developing target and background models which are used to develop the important radar subsystems: antenna, transmitter, receiver, and signal processor. The goals stressed are primarily those of detecting and tracking targets in a variety of corruptive backgrounds. In the late sixties digital methods began to be heavily applied for the realization of signal processors needed to extract the important target parameters for these goals.
The emphasis on digital processing coupled with the emergence of modern methods of systems optimization theory has influenced radar systems analysts to begin treating new design problems from an overall systems viewpoint. In this context, it has become apparent that the major item in any radar system which integrates the entire system together is the transmitted waveform. Each subsystem, including the target, should be viewed as a transformation acting on the desired waveform producing a response which ultimately leads toward the primary goals. The fundamental waveform design approach usually taken by radar systems analysts is that of deriving a signal which provides the peak response on the time-frequency domain of the Radar Ambiguity Function. (3) The latter is a correlation function which characterizes the degree to which a signal can be distinguished or resolved from another signal in the time and frequency domain. The waveforms associated with certain additional radar functions cannot be easily treated by this method, in particular, distinguishing or recognizing multiple targets as well as unwanted or bogus targets. What is needed is an extension of the ambiguity function approach to cover this problem by linking waveform design to the statistical likelihood of identifying the target as a known operator. That is, an operator theoretic approach might provide a more unifying mathematical treatment of waveform design for target recognition than the classical method stressed by the Radar Ambiguity Function.
In this paper we study the recognition problem by an operator theoretic approach such as that applied by Balakrishnan to systems identification (4) or by Weber to signal detection (5) to provide a rigorous foundation for developing the above concept. In fact, the techniques developed here are similar to modern methods of designing probing signals for identifying linear systems; e.g., Esposito and Schumer (6) solve for a signal which discriminates between two linear filters in colored Gaussian noise. Mosca (7) generalizes this result by using a functional analytic method for deriving the probing signals for identification of one of M linear channels in colored Gaussian noise. Our approach is to derive a signal design criterion which is more dependent on the properties of the target operator than on the noise. This is due to the fact that we assume that the target has already been detected and located, and that colored noise (clutter) can be removed by standard radar filtering methods. (8)
The concept of a Radar Target Operator is introduced in relation to measurements made on a target by a test waveform. An Impulse Response Operator is then characterized from these measurements with respect to the correlation time assigned to the target. (Correlation time here refers to changes occurring while the target is fixed in space.) These same measurements and operator generation are assumed possible for all targets in the set to be identified.
A probing signal design criterion is derived for identifying any of the Radar Target Operators of the known set. The criterion is based on maximizing the mean square difference (separation distance) between the responses of the target set. The problem of achieving a signal which is equally well matched to all operators is discussed and as a result the requirement of a constrained signal design criterion evolves. It is shown that the desired signal is one which maximizes a constrained bilinear functional created from the operators of the target set.
Finally, the probability of correctly recognizing a member of the a priori set is investigated. It is shown that the mean square difference criteria for the probing signal is optimum in the sense of achieving the optimum Bayes decision in the presence of White Gaussian Noise.
2. STRUCTURE OF THE RADAR TARGET OPERATOR
The concept of the Radar Target Operator as developed in this paper depends on the preservation of linearity of the subspace of all waveforms which are transmitted and received from the target. That is, a real vector space is preserved from the signal domain through the electromagnetic field domain and back to the signal domain. (9) Now, referring to Figure 1, a target, denoted by $\mathcal{T}$, can be regarded as a scattering system which is characterized by a linear target operator, S, which maps incident waves, $w^i$, into reflected waves, $w^r$:
$$w^r = S w^i \tag{2-1}$$
The particular target operator which we are concerned with is the Target Impulse Response Operator. Consider Figure 1 and let $w^i$ be the Dirac $\delta(t-\tau)$ function; also let $h_{\mathcal{T}}$ be the corresponding reflected wave, i.e.,
$$h_{\mathcal{T}}(t,\tau) = S\,\delta(t-\tau) \tag{2-2}$$
consequently:
$$w^r(t) = \int_{-\infty}^{\infty} h_{\mathcal{T}}(t,\zeta)\, w^i(\zeta)\, d\zeta \tag{2-3}$$
for any $w^i$ in the signal domain.
[Figure 1. Modeling of Target Operators. Composite operator model: $p = A_R S A_T \phi$. Single operator model: $p(t) = T\phi(t)$, where $T = A_R S A_T$.]
Note that in (2-3) we have assumed $h_{\mathcal{T}}$ to be non-stationary. This is necessary to accommodate changes caused by the target's motion relative to the spatial location of the radar. To this end, a time correlation interval, $\Delta$, is defined over which these changes will be considered. We exclude the target's trajectory so that $\Delta$ is chosen small enough to only observe motion occurring about a fixed point in space. Now letting $A_T$ and $A_R$ denote the transmitting and receiving antenna functions we have
$$w^i = A_T\,\phi(t) \tag{2-4}$$
and
$$p(t) = A_R\, w^r \tag{2-5}$$
then, substituting (2-1) and (2-4) into (2-5) we have
$$p(t) = A_R S A_T\,\phi(t) \tag{2-6}$$
Now let $a_T$ and $a_R$ be the antenna impulse responses so that
$$p(t) = ((a_R \cdot h_{\mathcal{T}}) \cdot a_T) \cdot \phi(t) \tag{2-7}$$
where "·" denotes the convolution (or composition) operator. But in Figure 1, we define T to be the Radar Target System Operator which maps $\phi(t)$ into $p(t)$, or
$$p(t) = T\phi(t) \tag{2-8}$$
So finally we have a target characterized by
$$T = (A_R S A_T) = ((a_R \cdot h_{\mathcal{T}}) \cdot a_T) \tag{2-9}$$
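The equivalence of the composite and single operator models in (2-9) can be illustrated with a small discrete sketch. It assumes a stationary (time-invariant) target response so that every operator reduces to an ordinary convolution, which is narrower than the non-stationary case allowed in (2-3); all sample values below are made up for illustration.

```python
# Sketch: discrete analogue of T = (a_R . h) . a_T acting on a test signal phi.
# Assumes a time-invariant target; the sequences are illustrative, not measured data.

def convolve(x, y):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(x) + len(y) - 1)
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            out[i + j] += xi * yj
    return out

a_T = [1.0, 0.5]              # transmit antenna impulse response (assumed)
h = [0.0, 2.0, -1.0]          # target impulse response (assumed)
a_R = [1.0, -0.25]            # receive antenna impulse response (assumed)
phi = [1.0, 0.0, 0.0, 1.0]    # test signal samples (assumed)

# Single operator model: build the kernel of T once, then apply it to phi.
t_kernel = convolve(convolve(a_R, h), a_T)
p_single = convolve(t_kernel, phi)

# Composite operator model: apply A_T, then S, then A_R in sequence.
p_composite = convolve(a_R, convolve(h, convolve(a_T, phi)))
```

Both routes give the same response because convolution is associative, which is what allows the three stages to be collapsed into the single operator T.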
We note that p(t) would be a random variable if, for instance, the target orientation (aspect angle of the target) is unknown relative to the direction of the transmitted wave caused by $\phi(t)$. In this case p(t) is mapped 1 to 1 into the a priori range space of T with probability < 1 so that T can be considered as a stochastic operator. If the orientation is known, or if p(t) can be mapped 1 to 1 into the a priori known range space of T with probability 1, then T would be classified as a deterministic operator. If $w^i$ and $w^r$ are horizontally or vertically polarized electric field intensity vectors, then by choosing
we have the impulse response matrix operator:
(2-10)
In our discussions we will only consider one polarization vector.
In general, tractable mathematical representations for T do not exist for complex targets such as aircraft, so we assume that a priori measurements (signatures) under ideal conditions can be made to approximate it. For example, if the target were suspended in free space and $\phi(t) \approx \delta(t)$ with $a_R \approx a_T \approx \delta(t)$ then
$$p(t) = T(t)\,\delta(t) \approx h_{\mathcal{T}}(t) \tag{2-11}$$
This is reasonable since under controlled conditions, a target could be held fixed in the far field* of the antenna and measured by a test signal, $\phi(t)$, consisting of a sequence of very narrow pulses for a length of time equivalent to $\Delta$. These pulses would be transmitted at a rate which is high enough to prevent aliasing (violation of Shannon's sampling theorem) in the measurement of the frequencies defining motion within a bandwidth of $1/\Delta$.
The pulsewidth $\tau$ would be selected less than $d/c$ (ratio of target line-of-sight length to the speed of light). An estimate of the power spectral density associated with p(t) might appear as in Figure 2. Note that the transmitted RF frequency would also be selected such that the wavelength is much less than d, creating the situation of the so-called "extended target." For example, if the target dimension along the line-of-sight of the radar was 25' then a reasonable value of $\tau$ and $\lambda$ might be 10 ns and 3 cm. On the basis of the above, the test signal $\phi(t)$ will be approximated by a sequence of pulse functions modulating the RF carrier, $e^{j\omega_c t}$.
[Figure 2. Example of Target Frequency Response (power spectral density of p(t))]
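The numbers in the extended target example can be checked with a few lines of arithmetic. The sketch below only restates the 25 ft, 10 ns and 3 cm figures from the text; the variable names are illustrative.

```python
# Worked numbers for the extended-target example (25 ft target, 10 ns pulse, 3 cm wavelength).
c = 299_792_458.0            # speed of light in m/s
FT_TO_M = 0.3048             # metres per foot

d = 25 * FT_TO_M             # target line-of-sight length, about 7.62 m
tau = 10e-9                  # pulse width in seconds
wavelength = 0.03            # RF wavelength in metres

transit_time = d / c                     # d/c, roughly 25.4 ns
pulse_is_short_enough = tau < transit_time
wavelength_ratio = wavelength / d        # should be much less than 1
carrier_frequency = c / wavelength       # roughly 10 GHz
```

With these values the pulse width is well under d/c and the wavelength is a small fraction of the target length, so both conditions stated in the text are met.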
Considering a finite set of N complex valued points ($T_S$ apart) on the time interval $[0, \Delta]$, an N-dimensional vector space, $\Phi$, can be defined which is spanned by an orthogonal basis generated from the test signal $\phi(t)$; viz.
Then each $\phi_i$ is an N×1 vector with component, such that
$$\phi = [\phi_1\ \phi_2\ \cdots\ \phi_N] = \begin{bmatrix} \phi_{11} & 0 & \cdots & 0 \\ 0 & \phi_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \phi_{NN} \end{bmatrix} \tag{2-12}$$
with
$$\phi_{ii} = \delta_i\, e^{j\omega_c t} \tag{2-13}$$
$$\delta_i = \frac{1}{\tau}\,[u(t - iT_S) - u(t - i(T_S+\tau))], \quad \tau \ll \frac{d}{c} \tag{2-14}$$
where u(t) is the unit step function.
*The antenna "far field" begins at a distance which is usually defined as $2D^2/\lambda$, where D is antenna aperture and $\lambda$ is wavelength.
Now let a system $\delta$ function be a signal vector in $\Phi$ in the sense that it is formed by unity projections on each $\phi_i$. Thus the system impulse response can be represented in vector form as
$$h_{\mathcal{T}} = T\delta \tag{2-15}$$
where $h_{\mathcal{T}}$ is an N×1 vector, T is an N×N matrix and $\delta$ an N×1 unity vector. The components of $h_{\mathcal{T}}$ are $h_i$ such that
(2-16)
But $h_{\mathcal{T}}$ will be assumed stationary for $t \in [0, T_S]$ so that
(2-17)
3. SIGNAL DESIGN CRITERIA
After establishing the definition of the Radar Target Operator, we proceed to develop a waveform design criterion which will lead to an optimal probing signal other than $\phi(t)$. $\phi(t)$ could be used, but it is our purpose to find a signal of lower dimensionality and bandwidth which can be used to identify the Radar Target Operator observed by the radar. As in Section 2, the components of $\phi$ are the basis vectors which span a finite dimensional waveform space $\Phi$. Then for W, a subspace of $\Phi$, another signal, w, can be generated by operating on the most basic element of $\Phi$, i.e., $\delta$, such that
$$w = U\delta, \quad w \in W \tag{3-1}$$
where U is a degenerate operator* acting on $\Phi$. Further, we assume that the target has initially been detected (i.e., in the sense that its position is known) and its response is that of a deterministic operator. As such, it is thought to be a member of an a priori known target set, $\mathcal{T}$, with
where $T^{(k)}$ is the target operator of the kth target, $\mathcal{T}^{(k)}$. In the radar recognition problem, it is equally likely that any one or none of these operators exist. A design criterion is needed to systematically derive a probing signal, w, which generates a response in the range space of $T^{(k)}$ and which, in some sense, is ideally separated from corresponding responses of the range spaces of the remaining operators of $\mathcal{T}$. This must be accomplished under the limitations of acceptable bandwidths, power levels and duty cycles of the radar transmitter. U will provide this requirement. Note that the above goal is also reasonable when considering the addition of noise to the response, since then respective range spaces are enlarged and can overlap. As seen later, the probability of recognizing $\mathcal{T}^{(k)}$ is related to the design criterion of w.
Since $T^{(k)}$ is the impulse response operator of the kth target, it follows from (2-15) that the response $p^{(k)}$ can be written in vector form as
$$p^{(k)} = T^{(k)} w \tag{3-2}$$
Then
$$\|p^{(k)}\| = \|T^{(k)} w\| \tag{3-3}$$
where $\|x\|$ denotes the norm of an element x in a Hilbert space of complex valued functions defined on a set of N points on the time interval $[0, \Delta]$. $\|p^{(k)}\|$ can be considered as a measure of the voltage received from the kth target when w is transmitted. Similarly,
$$\|p^{(i)}\| - \|p^{(j)}\| = \|T^{(i)} w\| - \|T^{(j)} w\|, \quad i \neq j,\ 1 \le i \le K \tag{3-4}$$
is a measure of the difference voltage received from targets i and j. Applying the triangle inequality, we have
$$\|T^{(i)} w\| - \|T^{(j)} w\| \le \|(T^{(i)} - T^{(j)})\, w\| \tag{3-5}$$
This implies that greater separation is possible between targets when coherent rather than noncoherent information is considered.
3.1 REMARK 1
Regarding the ability to identify the presence of one of the target operators in $\mathcal{T}$, let us consider the following situation. Let $R_1$, $R_2$, $R_3$ be the range spaces of the set
$$\mathcal{T} = (T^{(1)}, T^{(2)}, T^{(3)})$$
and suppose these are subspaces of a space P such that $\mathcal{T}: W \to P$. Then the operators are completely identifiable on W under the following logic:
(3-6)
where $\cap$ and $\cup$ imply the operations of intersection and union and $\emptyset$ is the null set (see Figure 3).
*A degenerate operator maps an entire space into a finite dimensional subspace [10].
Figure 3. Completely Identifiable Targets
With the above preliminaries, we now develop a distance function d(w) on P, relative to the set $\mathcal{T}$. Let $d_{ij}$ denote the norm square of the difference between the ith and jth targets, or from (3-3) and (3-5) we have
$$d_{ij} = \|(T^{(i)} - T^{(j)})\, w\|^2 \tag{3-7}$$
or since the adjoint of T(i) - T(j) exists, we have
the quadratic form
(3-8)
(3-9)
3.2 REMARK 2
Again considering the three target set, it follows from (3-6) that it is desirable to have $d_{12}$, $d_{13}$ and $d_{23}$ as large as possible for some $w \in W$. Letting $p^{(1)}$, $p^{(2)}$, and $p^{(3)}$ be vectors on P, we seek to maximize the norms of the difference vectors as shown in Figure 4.
[Figure 4. Distance Functions for the Three Target Set]
Each $d_{ij}$ cannot be maximized separately since a w which is appropriate for $d_{12}$ may not be for $d_{13}$ or $d_{23}$. Suppose instead the average value of $d_{ij}$ is considered; i.e.,
(3-10)
then the triangle inequality implies that
$$\max_w \alpha(w) \le \max_w d_{12} + \max_w d_{13} + \max_w d_{23}$$
Maximizing (3-10) as it is written may still have the tendency to separate the range spaces of one pair of operators while causing the range spaces of another pair to overlap. Thus we need a constraint on the maximization of $\alpha(w)$. $\alpha(w)$ can be generalized to K targets by considering $d_{ij}$ for all target pairs, with $i \neq j$. Letting $\alpha(w)$ be the sum of the distances between all targets, it follows that
$$\alpha(w) = \sum_{ij}^{L} d_{ij}$$
or letting
$$D = \sum_{ij}^{L} D_{ij}$$
the generalized distance function becomes a quadratic form
(3-11)
(3-11a)
where
$$L = \frac{K(K-1)}{2} \text{ pairings.}$$
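The generalized distance function can be sketched as a sum over all unordered target pairs. The example below assumes real 2 x 2 operators and a real probing signal purely for brevity; the operators, the signal and the helper names are hypothetical.

```python
# Sketch: alpha(w) as the sum of d_ij over all L = K(K-1)/2 target pairs (illustrative data).
from itertools import combinations

def apply(m, v):
    return [sum(x * vk for x, vk in zip(row, v)) for row in m]

def squared_norm(v):
    return sum(x * x for x in v)

targets = {                     # hypothetical target operators T^(1), T^(2), T^(3)
    1: [[1, 0], [0, 1]],
    2: [[0, 1], [1, 0]],
    3: [[2, 0], [0, 0]],
}
w = [1.0, 2.0]                  # hypothetical probing signal

def d_ij(i, j):
    """Squared distance between the responses of targets i and j, as in (3-7)."""
    diff = [a - b for a, b in zip(apply(targets[i], w), apply(targets[j], w))]
    return squared_norm(diff)

K = len(targets)
pairs = list(combinations(sorted(targets), 2))   # unordered pairs with i != j
alpha = sum(d_ij(i, j) for i, j in pairs)
```

Here the three pairwise distances are 2, 5 and 1, so alpha is 8 even though one pair is barely separated, which is the imbalance the constraint in the text is meant to control.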
Recall that the dimensionality of $\Phi$ is N
Also from the definition of U it follows that
and also that
$$d_{ij}(w) \le d_{ij}(\delta) \tag{3-12}$$
Considering another subspace Y where $U: \Phi \to Y$ such that $Y \subset W$ then
(3-13)
and for $y \in Y$
$$d_{ij}(y) \le d_{ij}(w) \tag{3-14}$$
Thus ordered subspaces can be created by U on $[0, \Delta]$ and the corresponding distance functions monotonically increase as the subspace dimension increases.
3.3 REMARK 3
Considering the three target example, a plot of the pairwise distance functions versus subspace dimension can now be structured as shown in Figure 5. Then relative to the dimension of the desired waveform space, W, the following bounds are apparent:
$$d_{\max} = \max_{i,j} d_{ij}(w) \tag{3-15}$$
and
$$d_{\min} = \min_{i,j} d_{ij}(w) \tag{3-16}$$
[Figure 5. Target Separation in Relation to the Dimension of the Waveform Space (average pair-wise separation versus subspace dimension $N_W$)]
The desired constraint is formulated by requiring that the difference between $d_{\max}$ and $d_{\min}$ be bounded, or for $\epsilon > 0$ we form
(3-17)
or equivalently, letting $\beta(w) = w^{*}Bw$ we have
$$\beta(w) = 1 \tag{3-18}$$
where B is defined by
(3-19)
and
$D_{+}$ = $D_{ij}$ from $d_{\max}$
$D_{-}$ = $D_{ij}$ from $d_{\min}$
This $\epsilon$ constraint insures that the paired mean square differences between all targets will be maximized uniformly within $\epsilon$ so that the maximum separation waveform will not favor targets farther apart than others.
The technique of constraining one quadratic form by another can be expressed more succinctly in the form of a Rayleigh Quotient, d(w), of D relative to B (10). That is, under all conditions previously established, the probing signal design criterion d(w) is well defined for $w \neq 0$ and can be simply written from (3-11) and (3-18) as
$$d(w) = \frac{\alpha(w)}{\beta(w)} \tag{3-20}$$
Thus the derivation of a maximum separation waveform becomes a problem of finding that w which maximizes d(w). It can be shown that the desired w is the eigenvector which corresponds to the maximum eigenvalue of d(w) (11).
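Maximizing a Rayleigh quotient of D relative to B amounts to solving the generalized eigenproblem $Dw = \lambda Bw$ for its largest eigenvalue. The sketch below uses plain power iteration on $B^{-1}D$ and assumes that D is symmetric and that B is diagonal and positive definite; neither assumption is guaranteed by the construction of B in the text, and the matrices are hypothetical.

```python
# Sketch: maximizing d(w) = (w'Dw)/(w'Bw) by power iteration on B^{-1} D.
# Assumes D symmetric and B diagonal, positive definite (illustrative matrices).
import math

D = [[2.0, 1.0], [1.0, 2.0]]   # assumed distance matrix
B = [[1.0, 0.0], [0.0, 2.0]]   # assumed constraint matrix

def quad(m, v):
    """Quadratic form v' m v for a real vector v."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))

def rayleigh_quotient(w):
    return quad(D, w) / quad(B, w)

def maximize(iterations=200):
    w = [1.0, 1.0]
    for _ in range(iterations):
        Dw = [sum(D[i][j] * w[j] for j in range(2)) for i in range(2)]
        w = [Dw[i] / B[i][i] for i in range(2)]   # apply B^{-1} (B is diagonal)
        scale = math.sqrt(quad(B, w))             # renormalize so that beta(w) = 1
        w = [x / scale for x in w]
    return w

w_opt = maximize()
d_max = rayleigh_quotient(w_opt)
```

For these matrices the largest eigenvalue of $B^{-1}D$ is $(3+\sqrt{3})/2$, and the iterate is kept on the constraint surface $\beta(w) = 1$ at every step.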
4. PROBABILITY OF TARGET RECOGNITION
We begin by assuming that the received signal can be the response of the target and its local background to the transmitted waveform, w. This signal is defined to be an element in an $N_W$-dimensional response space, P, since it is in the range space of the linear operator defined for the target and background. We further assume that White Gaussian Noise is added to the target by the receiver. Referring to Figure 6, let x be the received signal, $p^{(k)}$ the response of the kth target, c the background response and $\eta$ the noise; we have for K possible targets:
$$x = p^{(k)} + c + \eta \tag{4-1}$$
or equivalently
$$x = T^{(k)} w + Cw + \eta, \quad 1 \le k \le K \tag{4-2}$$
[Figure 6. Definition of Target, Background and Noise]
Here $T^{(k)}$ and C are the linear operators representing finite approximations to the impulse responses of the kth target and associated background. We note that the background return can appear as the localized environment in which the target exists, or as some unknown distortion to the target operator. For simplicity, all operators are considered deterministic; i.e., the target exists at a known aspect angle while the background is unable to change during the dwell time of the N-dimensional waveform, w. In the following treatment the background is non-existent, c = 0, and the variance of the noise, $\sigma^2$, is known or can be measured. The problem for non-deterministic targets and for $c \neq 0$ is treated in reference ( 2).
Under the above conditions, the conditional probability density for the presence of the jth target can be written as
$$f(x \mid p^{(j)}) \propto \exp\!\left(-\frac{\|x - p^{(j)}\|^2}{2\sigma_x^2}\right) \tag{4-3}$$
Here, $\|x\|$ denotes the norm of x on $[0, \Delta]$ and $\sigma_x^2$ the variance of x assuming $p^{(j)}$ was observed.
Now assuming that each target is equally likely to occur, it follows that the optimal Bayes decision logic for selecting the kth target is
$$\exp\!\left(-\frac{1}{2\sigma_x^2}\|x - p^{(k)}\|^2\right) = \max_{1 \le j \le K} \exp\!\left(-\frac{1}{2\sigma_x^2}\|x - p^{(j)}\|^2\right) \tag{4-4}$$
Taking ln of both sides, and since ln is monotonic, it follows that the desired decision logic is
$$\|x - p^{(k)}\|^2 = \min_{1 \le j \le K} \|x - p^{(j)}\|^2 \tag{4-5}$$
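With equally likely targets and white Gaussian noise, the decision logic reduces to choosing the stored response nearest to the received signal. A minimal sketch of that nearest-response rule follows; the response vectors, the received signal and the function names are hypothetical.

```python
# Sketch: minimum-distance (equal-prior Bayes) recognition rule (illustrative data).

def squared_distance(x, p):
    """||x - p||^2 for real or complex sample vectors of equal length."""
    return sum(abs(a - b) ** 2 for a, b in zip(x, p))

def recognize(x, responses):
    """Return the index k whose stored response p^(k) is closest to x."""
    return min(responses, key=lambda k: squared_distance(x, responses[k]))

responses = {                    # hypothetical noise-free responses p^(1), p^(2), p^(3)
    1: [1.0, 0.0],
    2: [0.0, 1.0],
    3: [-1.0, -1.0],
}
x = [0.2, 0.9]                   # hypothetical received signal (response plus noise)
decision = recognize(x, responses)
```

The rule does not depend on the noise variance, since the same positive factor scales every exponent in the likelihood comparison.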
4.1 REMARK 1
Considering once again the space P containing the ranges of the three target set, it is evident from (4-5) that the ideal hyperplane decision surface is achieved when the response vectors form a plane defining an equilateral triangle (see Figure 7).
Here it is clear that the distance between responses is the same for each pair and the unconstrained signal design criterion $\alpha(w)$ can be applied to derive the optimum w.
[Figure 7. Symmetric Target Responses Plus Noise (hyperplane decision surface)]
4.2 REMARK 2
A more likely response geometry is depicted in Figure 8. For this case it can be shown that maximization of the average difference distance, $\alpha(w)$, will not necessarily result in the optimum w since it would tend to favor responses $p^{(1)}$ and $p^{(3)}$. Consider the following:
Suppose
$$x = p^{(2)} + \eta \tag{4-6}$$
and assume that the noise, $\eta$, is such that
(4-7)
[Figure 8. Non-Symmetric Target Responses Plus Noise]
Now define $d_{ij}(x)$ to be the shortest projection of x on $(p_i - p_j)$, $1 \le (i, j) \le 3$.
Then for the geometry shown in Figure 8
select 1 over 2 (4-8a)
select 3 over 2 (4-8b)
select 1 over 3 (4-8c)
The above logic fails to select response 2 even though we assumed it present. Further, it follows that
$$\max_w \alpha(w) \iff \max d_{ij}, \quad 1 \le i, j \le 3 \tag{4-9}$$
so that conditions (4-8a, b, c) are only worsened by maximizing $\alpha(w)$. The decision logic can be balanced by deriving w under the constraining quadratic form $\beta(w)$ established in Equation (3-18). That is, letting
(4-10)
an acceptable value of $\epsilon$ can be found by establishing confidence limits around the required Probability of Recognition, $P_c$, and then computing the corresponding values of $d_{\max}$ and $d_{\min}$. To do this we need to further investigate the method of computing the pair-wise $P_c$ as a function of $d_{ij}$ and RMS noise level.
Without loss of generality we limit the target set to two targets, say
so that
$$g = Gw, \qquad h = Hw \tag{4-11}$$
From (4-5) the two target hypothesis test becomes:
If $\|x - g\|^2 - \|x - h\|^2 \le 0$, then the target is $\mathcal{G}$
If $\|x - g\|^2 - \|x - h\|^2 > 0$, then the target is $\mathcal{H}$ (4-12)
Since G and H, the operators of $\mathcal{G}$ and $\mathcal{H}$, are bounded linear operators (13), their adjoints $G^{*}$ and $H^{*}$ exist so that (4-12) can be written as:
$$|(x, Aw)| - \tfrac{1}{2}(w, \Delta w) \;\begin{cases} \ge 0 \Rightarrow \mathcal{G} \\ < 0 \Rightarrow \mathcal{H} \end{cases} \tag{4-13}$$
where (x, y) denotes the inner product of x and y on $[0, \Delta]$ and
$$Aw = (G - H)w \tag{4-14}$$
$$\Delta w = (G^{*}G - H^{*}H)w \tag{4-15}$$
The recognition of a particular target, say $\mathcal{G}$, implies that the following inequality has been satisfied,
$$2((Gw + \eta), Aw) - (w, \Delta w) \ge 0 \tag{4-16}$$
Since $\alpha(w) = \tfrac{1}{2}(w, A^{*}Aw)$, we get
$$\alpha(w) + (\eta, Aw) \ge 0 \tag{4-17}$$
Recall from (3-7) that max $\alpha(w)$ produced that waveform which separates the targets in the mean square sense. Thus (4-17) shows this criterion to be ideal for the two target case in the sense of Bayes Decision Logic since a greater $\alpha(w)$ strengthens the inequality. Note that for the two target set, $D_{+} = D_{-}$ and therefore B = 0. Thus $\alpha(w) = d(w)$ as used in the following equations.
The success of recognizing target $\mathcal{G}$ can be computed from the conditional density $f_{\eta}(x/g)$. If we define $P_c(g)$ as the probability of correctly recognizing $\mathcal{G}$, then from (4-13) we have for a threshold $\Omega(w)$,
$$P_c(g) = P_R\big[\, |(x, Aw)| \ge \Omega(w) \;\big|\; x = g + \eta \,\big] \tag{4-18}$$
Now let z = (x, Aw), then since x is characterized by a Gaussian density so is z, or letting f(z/x) = f(x/g) the probability density function of z given target $\mathcal{G}$ is
$$f(z/g) = \frac{1}{\sqrt{2\pi\sigma_z^2}}\, e^{-(z-\mu_g)^2/2\sigma_z^2} \tag{4-19}$$
where it can be shown that (12)
(4-20)
$$\sigma_z^2 = \sigma^2(2d) \tag{4-21}$$
and $P_c(g)$ is computed from
$$P_c(g) = \int_{\Omega(w)}^{\infty} f(z/g)\, dz \tag{4-22}$$
or equivalently
$$P_c(g) = \frac{1}{\sqrt{2\pi\sigma^2(2d)}} \int_{\Omega(w)}^{\infty} e^{-(z-\mu_g)^2/2\sigma^2(2d)}\, dz \tag{4-23}$$
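For real-valued signals the integral for $P_c(g)$ reduces to a single evaluation of the normal distribution function. The sketch below is an interpretation rather than the paper's own formula: it assumes that the threshold is $\tfrac{1}{2}(w, \Delta w)$ and that $\mu_g = (g, Aw)$, so that the mean exceeds the threshold by $d = \tfrac{1}{2}\|g - h\|^2$, and it ignores the absolute value in (4-18). The function names and sample vectors are hypothetical.

```python
# Sketch: two-target probability of correct recognition in white Gaussian noise.
# Assumes real signals and mu_g - threshold = d = 0.5 * ||g - h||^2 (see lead-in).
import math

def normal_cdf(u):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def prob_correct(g, h, sigma):
    """P_c for noise-free responses g and h with per-sample noise deviation sigma."""
    d = 0.5 * sum((a - b) ** 2 for a, b in zip(g, h))
    if d == 0.0:
        return 0.5                        # identical responses: a coin toss
    sigma_z = sigma * math.sqrt(2.0 * d)  # variance relation (4-21)
    return normal_cdf(d / sigma_z)

p_close = prob_correct([2.0, 0.0], [0.0, 0.0], sigma=1.0)
p_far = prob_correct([4.0, 0.0], [0.0, 0.0], sigma=1.0)
```

Under these assumptions $P_c$ depends only on the ratio $\|g - h\|/2\sigma$, so it grows monotonically with the pairwise distance.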
4.3 REMARK 3
The required signal design constraint, $\epsilon$, for the three target system can be determined from (4-23). Assuming $\sigma$ fixed, plots of $P_c(p^{(2)})$ versus $d_{12}$ and $d_{23}$ can be made relative to a common d axis for a given waveform dimensionality $N_W \le N$ (e.g., see Figure 9).
The $\epsilon$ criterion can be determined by selecting $d_{\min}$ and $d_{\max}$ for the required Probability of Recognition. Note that if an acceptable $\epsilon$ cannot be determined for the required $P_c$ (i.e., it does not exist for $N_W \le N$), then unreliable target signatures have been obtained from the N sampled test signal, $\phi(t)$, on $(0, \Delta)$.
[Figure 9. Probability of Recognition Versus Pair-Wise Distance]
5. RELATIONSHIP TO THE RADAR AMBIGUITY FUNCTION
The measure of separability (or distance) between target responses for the required recognition performance was shown to be a constrained quadratic form which depended on the paired differences between target operators of the a priori known set. This approach can be viewed as an extension of the concepts implied by the Radar Ambiguity Function. For example, it can be shown that for two targets the Ambiguity Function can be used to derive the best transmitted signal to discriminate between responses due to the Doppler shift of two targets. Here identification or recognition is not the goal, but rather it is the ability to indicate that two targets exist. More specifically, refer back to Equation (3-7) and suppose we define our distance function, $d_{ij}(w)$, on a continuous time interval $(-\infty, \infty)$ instead of the sample interval $(0, \Delta)$. Then assuming that $p_i(t)$ and $p_j(t)$ are square integrable on $(-\infty, \infty)$ we can define
(5-1)
Now if $p_i(t)$ and $p_j(t)$ are modeled as similar functions only with time and frequency differences such that
$$p_i(t) = w(t + \tau_i)\, e^{j\omega_i t} \tag{5-2a}$$
$$p_j(t) = w(t + \tau_j)\, e^{j\omega_j t} \tag{5-2b}$$
then letting $\tau = \tau_i - \tau_j$ and $\omega = \omega_i - \omega_j$ it follows that the cross product term from Equation (5-1) becomes
$$\psi(\tau, \omega) = \int_{-\infty}^{\infty} w\!\left(t + \frac{\tau}{2}\right) w\!\left(t - \frac{\tau}{2}\right) \cos \omega t\; dt \tag{5-3}$$
which is the Ambiguity Function form. By minimizing Equation (5-3) with respect to w(t), we in effect maximize the distance between target responses and can optimally separate the targets. However, in the recognition waveform design problem, the simple modeling used in Equations (5-2a) and (5-2b) cannot be used; rather Equation (3-2) holds, so that Equation (5-3) becomes
$$\psi(\tau, \mu) = \int_{-\infty}^{\infty} T_i(t,\tau)\, w(\tau)\, \big(T_j(t,\mu)\, w(\mu)\big)^{*}\, dt \tag{5-4}$$
which can be defined as the target operator form of a cross-ambiguity function. Thus minimizing Equation (5-4) with respect to w(t) is equivalent to maximizing Equation (5-1). The multiple target constrained form of the pairwise cross-ambiguity function could be similarly derived. Note that if the signal space domain was restricted to N points on $(0, \Delta)$, then finite dimensional target operators could be used and a Rayleigh Quotient form for the constrained minimization problem could also be applied.
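The ambiguity function form (5-3) has a straightforward sampled analogue. The sketch below assumes a real sampled waveform, a delay of 2k samples and a frequency offset in radians per sample; the 7-chip Barker code is used only as a familiar test waveform and does not appear in the paper.

```python
# Sketch: sampled analogue of the ambiguity function form (5-3) for a real waveform.
import math

def ambiguity(w, k, omega):
    """Sum of w[n + k] * w[n - k] * cos(omega * n), i.e. a delay of 2k samples."""
    total = 0.0
    n_samples = len(w)
    for n in range(n_samples):
        if 0 <= n + k < n_samples and 0 <= n - k < n_samples:
            total += w[n + k] * w[n - k] * math.cos(omega * n)
    return total

w = [1.0, 1.0, 1.0, -1.0, -1.0, 1.0, -1.0]   # 7-chip Barker code (illustrative choice)
energy = sum(x * x for x in w)
peak = ambiguity(w, 0, 0.0)                  # zero delay, zero frequency offset
off_peak = [ambiguity(w, k, 0.3) for k in range(-3, 4)]
```

At zero delay and zero frequency offset the sum equals the signal energy, and by the Cauchy-Schwarz inequality no other delay and frequency pair can exceed it in magnitude.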
6. SUMMARY
By considering the radar target as a finite dimensional operator on a Hilbert space, we have attempted to develop an operator theoretic methodology for designing an optimum target recognition system relative to the realizable aspect of measuring the target signatures. The approach has been directed at seeking a balanced solution depending upon the association of the target observation space to the transmitted signal space through the radar target operator. The context of "optimum" has been taken to mean the best transmitted waveform which would provide the required probability of recognition. As noted in Section 3, this meant a degenerate (lower dimensional) form of the original measurement signal when Bayes Decision Logic is applied to the possible responses in the presence of White Gaussian Noise. The characteristics of the signal were derived relative to the discrimination properties necessary to identify (recognize) a target as a particular linear system in an a priori known set of linear systems. Thus a system identification approach was taken.
References
(1) Skolnik, M. I., Radar Handbook, McGraw-Hill, 1970.
(2) Barton, D. K., Radar Systems Analysis, Prentice-Hall, 1964.
(3) Rihaczek, A. W., Principles of High Resolution Radar, Chap. 5, McGraw-Hill, 1969.
(4) Balakrishnan, A. V., "Stochastic System Identification Techniques," "Stochastic Optimization and Control," Wiley & Sons,
(5) Weber, C. L., Elements of Detection and Signal Design, Chap. 14, McGraw-Hill, 1968.
(6) Esposito, R. and M. A. Schumer, "Probing Linear Filters - Signal Design for the Detection Problem," IEEE IT, Vol. IT-16, No. 2, pp. 167-171, March 1970.
(7) Mosca, E., "Probing Signal Design for Linear Channel Identification," IEEE IT, Vol. IT-18, No. 4, pp. 481-487, July 1972.
(8) Nathanson, F. E., Radar Design Principles, Chap. 9, McGraw-Hill, 1969.
(9) Levan, N., "Target Discrimination Studies, Report No. 1," Unpublished International Telephone and Telegraph (Gilfillan) Report, Van Nuys, California, August 1972.
(10) Bellman, R., Introduction to Matrix Analysis, pg. 110, McGraw-Hill, 1960.
(11) Hammers, D. E., "Generation of a Finite Dimensional Matched Filter," Unpublished ITTG Report, Dec. 1970.
(12) Hammers, D. E., A System Theoretic Approach to Radar Target Recognition, Ph.D. Dissertation, UCLA, 1973.
(13) Akhiezer, N. I. and I. M. Glazman, Theory of Linear Operators in Hilbert Space, Ungar Publishing Co., 1966, pg. 43.
(14) Muddle, R. P., "An Eigenfunction Problem Concerning the Ability of a Radar to Distinguish Between Two Targets," Royal Aircraft Establishment, Technical Report No. 66348, Nov. 1966.
INFINITE DIMENSIONAL REALIZABILITY THEORY
J. William Helton
University of California, San Diego
La Jolla, California
This talk is on realizability theory for systems with infinite dimensional state space. As is generally known the study of [A,B,C,D] type linear systems was discovered to be closely related to Lax-Phillips scattering theory or equivalently to operator model theory. This was discussed by several people at the first conference in this series 2 years ago. My purpose here is to give a general description of what has happened and of the state of the art now. The talk will dwell on generalities and downplay technical details. For more detail one is referred to a fairly complete article which will appear in the forthcoming issue of the I.E.E.E. Proceedings devoted to 'Recent Advances in Systems Theory.' The work described in this talk is largely due to Baras, Brockett, Fuhrmann, DeWilde, and myself.
To speak generally the main accomplishment of the last two years has been to establish analogs of the basic facts from finite dimensional systems theory and also to give what seems to be a reasonable definition of infinite dimensional system. The last point sounds strange so let's consider it first. What requirements would one like for an [A,B,C,D] linear system? One would like for the theory to be general enough to contain many examples but be specific enough so that we can prove substantial things about it. To be more specific some desirable properties are
(1) A,B,C,D act on some vector spaces.
(2) A,B,C,D can be associated with most distributed devices.
(3) Any natural notion of energy associated with the physical situation corresponds to a topology on the vector spaces which is 'compatible' with the system.
(4) The basic general properties of finite dimensional systems have valid analogs.
Maybe more should be said about energy. In finite dimensions one can ignore such considerations, but in infinite dimensions one must have a topology on the space. Traditionally in mathematical physics the topology is painstakingly selected to have a direct physical interpretation. Usually it comes from an inner product and gives a Hilbert space structure to the vector space involved. This seems like a very good tradition to maintain in the case of infinite dimensional systems. For discrete time systems this proves not to be difficult, although it must be mentioned that energy considerations are a little garbled in discrete time. In continuous time one runs into a lot of technical problems, certainly more than I originally anticipated. Baras and Brockett showed that if one picks A,B,C,D in the mathematically nicest way then one could not even do a transmission line. This suggests one use a set up like Balakrishnan's where B,C are distributions. That's plenty general, but there is not a natural Hilbert space structure and so energy considerations are ignored. Aubin and Bensoussan of the Lions school pursued the Balakrishnan distribution approach and gave an abstract analysis of it. They then imposed a Hilbert space structure in a certain way and found the same difficulties which Baras and Brockett found. The first point to be emphasized in this talk is that there is a reasonable way of defining infinite dimensional systems which meet conditions (1) through (4). The set up is complicated enough that it requires more discussion than can be given in a short talk, and so we only list the definition and give no further explanation: A system consists of
X, U, Y are Hilbert spaces.
A is a closed operator on X defined on $\mathcal{D}(A)$, a dense subdomain of X, and the semigroup $e^{At}$ is bounded.
This gives rise to a rigged Hilbert space structure $\mathcal{D}(A^*)^{-}$, X, $\mathcal{D}(A^*)$.
The operators B, C, D are continuous linear maps
$B: U \to \mathcal{D}(A^*)^{-}$, $C: \mathcal{D}(A) \to Y$, $D: U \to Y$.
A compatible system satisfies $(z_0 I - A)^{-1} B \subset \mathcal{D}(C)$ (all systems in the author's experience are compatible). In addition there are straightforward notions of continuous, approximate, and exact controllability and observability.
The set up just described has very good properties; in fact it satisfies (1) through (4). Let's expand on (4): Our infinite dimensional system has the properties
(a) An operator function F analytic near infinity can always be realized with a compatible system.
(b) Given two canonical systems $[A,B,C,D]$ and $[\tilde{A},\tilde{B},\tilde{C},\tilde{D}]$ with the same frequency response function there is an M such that
$MA = \tilde{A}M$, $MB = \tilde{B}$, $\tilde{C}M = C$, $\tilde{D} = D$.
(c) If $(\lambda - A)^{-1}$ and $(\lambda - \tilde{A})^{-1}$ are meromorphic in (b), then they have the same poles. If the frequency response function F in (a) is 'pseudomeromorphic' then the singularities of $(\lambda - A)^{-1}$ equal those of F. If F is not 'pseudomeromorphic' everything breaks down.
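A finite dimensional analog of property (b) can be checked numerically. The sketch below is only an illustration under stated assumptions: the 2x2 system, the invertible map M, and the helper names `matmul`, `inv2` and `frequency_response` are hypothetical and do not come from the talk. It builds a second realization from the first through M and confirms that both give the same frequency response $F(z) = D + C(zI - A)^{-1}B$.

```python
# Hypothetical 2x2 example; none of these numbers appear in the talk.

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def frequency_response(A, B, C, D, z):
    """F(z) = D + C (zI - A)^{-1} B for a single-input, single-output system."""
    zI_minus_A = [[z - A[0][0], -A[0][1]], [-A[1][0], z - A[1][1]]]
    return D + matmul(matmul(C, inv2(zI_minus_A)), B)[0][0]

# First realization: F(z) = 1 / (z^2 + 3z + 2).
A = [[0.0, 1.0], [-2.0, -3.0]]
B = [[0.0], [1.0]]
C = [[1.0, 0.0]]
D = 0.0

# Second realization obtained from an invertible M (assumed for illustration).
M = [[1.0, 2.0], [0.0, 1.0]]
M_inv = inv2(M)
A_tilde = matmul(matmul(M, A), M_inv)   # so that M A = A_tilde M
B_tilde = matmul(M, B)                  # M B = B_tilde
C_tilde = matmul(C, M_inv)              # C_tilde M = C

z = 1.0
F_first = frequency_response(A, B, C, D, z)
F_second = frequency_response(A_tilde, B_tilde, C_tilde, D, z)
print(F_first, F_second)
```

In infinite dimensions the map M need not be bounded or boundedly invertible, which is part of what makes (b) and (c) delicate; the sketch only covers the matrix case.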
It should be mentioned that (c) is a little delicate and one cannot expect more to hold. In fact the first part is a little surprising and seems to be the physically correct and useful result. It was first done in the Lax-Phillips theory. That brings us to the next main property of our systems.
(5) Every infinite dimensional system corresponds to an abstract Lax-Phillips scattering situation.
The correspondence is very explicit. Thus the study of this type of system is equivalent to Lax-Phillips scattering theory (with $D_+$ and $D_-$ orthogonal). Advantages of this are: theorems from the Lax-Phillips theory (both present and future) can be converted to theorems about systems and vice versa; we are guaranteed that all of the many examples they treat can be fit into this systems set up; and lastly the Lax-Phillips theory avoids all complications involving distributions and is couched quite simply in terms of Hilbert space. Thus an alternative formulation of a continuously approximately controllable and observable compatible system is The Lax-Phillips Model.
T(t) is a uniformly bounded semigroup on a Hilbert space H containing two distinguished closed orthogonal subspaces $D_+$ and $D_-$ such that
(i) $T(t)D_+ \subset D_+$ and $T(t)^*D_- \subset D_-$;
(ii) $\bigcap_t T(t)D_+ = \{0\}$ and $\bigcap_t T(t)^*D_- = \{0\}$;
(iii) st-lim $P_{D_0 + D_-} T(t) = 0$ and st-lim $P_{D_0 + D_+} T(t)^* = 0$;
(iv) $T(t)$ is isometric on $D_+$ and $T(t)^*$ is isometric on $D_-$;
(v) $T(t)D_+$ is $\perp$ to $T(t)(D_- \oplus X)$ and $T(t)^*D_-$ is $\perp$ to $T(t)^*(D_+ \oplus X)$.
It seems very likely this set up will be easier to use in some circumstances than the half distributional, half operator theoretic [A,B,C,D] formulation. One project which might be informative would be to see if one could translate 'least squares quadratic control' and 'estimation' into the Lax-Phillips set up. Possibly it would lead to some interesting constructions. One might also determine how some of the parabolic and hyperbolic partial differential systems in Lions' framework fit into the Lax-Phillips model.
A final property which should be emphasized is that of 'compactness.' It is not included in the definition of system; however, many systems which carefully model a physical situation will satisfy a compactness condition. Experience with the Lax-Phillips theory indicates it should enter as
(6) $\exists\, T > 0$ so that $e^{AT}(\lambda - A)^{-1}$ is a compact operator (that is, it is approximable by finite rank operators).
For discussion of the physical significance of (6) in several situations see §S
of the longer paper.
This brief sketch hopefully gives some idea of the present state of the art. To my view the general realizability theory is now complete enough that we must begin to go to more specific levels. We must impose physically reasonable additional assumptions on the system's structure which lead to more refined theories. Whereas the restriction of finite dimensional state space is so strong that in traditional systems theory no classification of additional structure has developed, in infinite dimensional situations such a classification is necessary.
BIOGRAPHY
BILL HELTON was born in Jacksonville, Texas on November 21, 1944. He received the bachelor's degree in mathematics from the University of Texas in 1964, the master's degree and the Ph.D. degree in mathematics from Stanford University in 1968, respectively.
From 1968 to 1973 he was an Assistant Professor at Stony Brook, New York. During the beginning of 1974 he was a visiting Associate Professor at UCLA and currently is an Associate Professor in the Department of Mathematics at the University of California, San Diego in La Jolla.
LINEAR NETWORK SYNTHESIS USING ITERATION METHODS
Y. G. Jan and F. R. Chang, National Chiao-Tung University
Taiwan, Republic of China
Abstract
We employ two iteration methods, viz. maximum eigenvalue iteration and Gauss-Seidel iteration, to find the rational approximation to the truncated Laguerre transform of the impulse response of some linear networks. Using the derived algorithm, the executing procedures for synthesizing three simple networks are discussed in detail. Their results are plotted and tabulated.
I. SIGNAL REPRESENTATION AND LAGUERRE POLYNOMIALS
I-1. Signal Representation
There are many different meanings of signals; we consider the representation of a signal as a real function f(t) on $(0,\infty)$. Furthermore, we assume f(t) to be integrable in the mean squared sense, i.e. $\int_0^\infty f^2(t)\,dt < \infty$. We can express the signal f(t) by a complete orthonormal basis $\{l_n(t);\ n=0,1,2,\ldots\}$ [1]. Of course, the energy of $l_n(t)$ is assumed finite.
When f(t) is represented by the basis functions discussed above, we have:
$f(t) = \sum_{n=0}^{\infty} c_n\, l_n(t)$ (1)
where
$c_n = \int_0^\infty f(t)\, l_n(t)\,dt$ (2)
However, in practical cases we can use only a finite number of basis functions in our representation. Suppose $\{l_n(t),\ n=0,1,2,\ldots,K\}$ is the finite set of orthonormal basis functions which are used to represent the signal f(t). Then the approximation $f_a(t)$ can be expressed as:
$f_a(t) = \sum_{n=0}^{K} c_n\, l_n(t)$ (3)
where the coefficients $\{c_n\}$ are chosen in such a way that the mean squared error
$\varepsilon = \int_0^\infty \left(f(t) - f_a(t)\right)^2 dt$ (4)
is minimum. After substituting equation (3) into equation (4) and minimizing, the coefficients $\{c_n\}$ in equation (3) are still given by equation (2). The mean squared error $\varepsilon$ can be reduced by increasing the number K, since we have assumed that the basis functions form a complete set.
I-2. Laguerre Polynomials
A particularly important complete orthonormal basis is the set of Laguerre polynomials $l_n(t)$, $n=0,1,2,\ldots$, which appear as [2]:
$l_n(t) = \begin{cases} \sqrt{2a}\, e^{-at} \sum_{k=0}^{n} \binom{n}{k} \dfrac{(-2at)^k}{k!}, & t \ge 0 \\ 0, & t < 0 \end{cases}$ (5)
which can be obtained by applying the Gram-Schmidt procedure to $\{(at)^n e^{-at},\ n=0,1,2,\ldots\}$, where a is a positive real number.
The corresponding Laplace transform of equation (5) is:
$L_n(s) = \left(\dfrac{\sqrt{2a}}{s+a}\right)\left(\dfrac{s-a}{s+a}\right)^n$ (6)
It appears that the nth Laguerre polynomial $L_n(s)$ is the impulse response of the cascade combination of a single pole network and n all-pass networks, each with a pole at s = -a [1].
Suppose we have a system response h(t); then it can be approximated by Laguerre polynomials as:
$h(t) \approx h_a(t) = \sum_{n=0}^{K} c_n\, l_n(t)$ (7)
where $c_n = \int_0^\infty h(t)\, l_n(t)\,dt = \int_{-\infty}^{\infty} H(f)\, L_n^*(f)\,df$.
By dropping off signals at the network junctions and summing with proper weighting factors $\{c_n\}$, the overall network can approximate the desired function, as shown in figure 1 [1].
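The expansion above can be tried out numerically. The following is a minimal sketch, not the authors' program: the test response $h(t) = e^{-2t}$, the trapezoidal integrator and the names `laguerre_function` and `integrate` are assumptions made for illustration. It evaluates $l_n(t)$ from equation (5) with a = 1, checks orthonormality, and computes the first few coefficients $c_n$. For this particular h(t) the coefficients are known in closed form, $c_n = L_n(2) = \sqrt{2}/3^{n+1}$, which gives a check on the numerics.

```python
import math

def laguerre_function(n, t, a=1.0):
    """l_n(t) of equation (5): sqrt(2a) e^{-at} sum_k C(n,k) (-2at)^k / k!, zero for t < 0."""
    if t < 0:
        return 0.0
    poly = sum(math.comb(n, k) * (-2.0 * a * t) ** k / math.factorial(k)
               for k in range(n + 1))
    return math.sqrt(2.0 * a) * math.exp(-a * t) * poly

def integrate(g, t_end=40.0, steps=8000):
    """Trapezoidal rule on [0, t_end]; the integrands used here decay exponentially."""
    dt = t_end / steps
    total = 0.5 * (g(0.0) + g(t_end))
    for i in range(1, steps):
        total += g(i * dt)
    return total * dt

def h(t):
    """Assumed test response (not one of the paper's examples)."""
    return math.exp(-2.0 * t)

K = 3
# Coefficients c_n as inner products of h with l_n.
c = [integrate(lambda t, n=n: h(t) * laguerre_function(n, t)) for n in range(K + 1)]

# Orthonormality checks: <l_0, l_1> should vanish and <l_2, l_2> should equal 1.
gram_01 = integrate(lambda t: laguerre_function(0, t) * laguerre_function(1, t))
gram_22 = integrate(lambda t: laguerre_function(2, t) ** 2)

# Mean squared error of the truncated expansion: energy of h minus captured energy.
energy = integrate(lambda t: h(t) ** 2)
residual = energy - sum(c_n * c_n for c_n in c)
print(c, gram_01, gram_22, residual)
```

Because the basis is orthonormal, the mean squared error equals the signal energy minus the sum of the squared coefficients, which is why increasing K can only reduce it.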
I-3. Laguerre Transform
From the previous section, we conclude that a finite energy signal can be represented by Laguerre polynomials as $f(t) = \sum_{n=0}^{\infty} c_n\, l_n(t)$. The coefficients $\{c_n\}$ in the representation are named the Laguerre spectrum, a term adopted from Steiglitz's paper [3], and the associated polynomial $F_e(z) = \sum_{n=0}^{\infty} c_n z^n$ is defined as the Laguerre transform of f(t), where the dummy variable z is a complex variable inside the unit circle.
Let us make the following transformation:
$z = \dfrac{s-a}{s+a}$ (8)
i.e. it transforms the unit circle $|z| = 1$ into $s = j\omega$, the outside of the unit circle in the z domain into the left half plane of the s domain, and the inside of the unit circle into the right half plane of the s domain. Then from Parseval's relation and equation (6) we have the following relationship [4]:
$F(s) = \dfrac{\sqrt{2a}}{s+a}\, F_e\!\left(\dfrac{s-a}{s+a}\right)$ (9)
and similarly we have:
$F_e(z) = \dfrac{\sqrt{2a}}{1-z}\, F\!\left(\dfrac{a(1+z)}{1-z}\right)$ (10)
The above property, i.e. the relationship between equations (9) and (10), offers a simple method to compute the Laguerre spectrum of a function with rational Laplace transform.
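As an illustration of how equation (10) yields the Laguerre spectrum of a rational transform, the sketch below takes the assumed example $F(s) = 1/(s+2)$ with a = 1 (not one of the paper's networks). Equation (10) then gives $F_e(z) = \sqrt{2}/(3 - z)$, so $c_n = \sqrt{2}/3^{n+1}$. The code recovers the coefficients numerically by sampling $F_e$ on a circle inside the unit disc and applying a discrete Fourier sum; this sampling approach and the names `laguerre_transform` and `laguerre_spectrum` are illustrative choices, not the procedure of reference [4].

```python
import cmath
import math

def laguerre_transform(F, z, a=1.0):
    """Equation (10): F_e(z) = sqrt(2a) / (1 - z) * F(a (1 + z) / (1 - z))."""
    return math.sqrt(2.0 * a) / (1.0 - z) * F(a * (1.0 + z) / (1.0 - z))

def laguerre_spectrum(F, count, a=1.0, radius=0.5, samples=64):
    """Estimate c_0 .. c_{count-1} from samples of F_e on the circle |z| = radius."""
    values = [laguerre_transform(F, radius * cmath.exp(2j * math.pi * k / samples), a)
              for k in range(samples)]
    spectrum = []
    for n in range(count):
        # Discrete Fourier sum picks out c_n * radius**n (plus negligible aliasing).
        total = sum(val * cmath.exp(-2j * math.pi * n * k / samples)
                    for k, val in enumerate(values))
        spectrum.append((total / samples).real / radius ** n)
    return spectrum

def F(s):
    """Assumed rational Laplace transform, F(s) = 1 / (s + 2)."""
    return 1.0 / (s + 2.0)

c = laguerre_spectrum(F, 5)
expected = [math.sqrt(2.0) / 3.0 ** (n + 1) for n in range(5)]
print(c)
print(expected)
```

For a low-order rational F(s) the power series of $F_e(z)$ can of course be written down by hand, as done in the lead-in; the numerical route is only meant to show that (10) carries all the information needed.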
II. APPROXIMATION OF TRUNCATED LAGUERRE POLYNOMIALS BY RATIONAL FUNCTIONS
In the signal representation of equation (3), we should increase the number of coefficients to reduce the approximation error; in other words, we should increase the complexity of the network. It turns out that the synthesis of the resulting network would be rather complicated and seriously uneconomical.
In order to avoid the above drawbacks, it is possible to employ a rational function $R_e(z)$ to approximate the truncated Laguerre transform [5]. That is, we use the following approximation:
$\hat{H}_e(z) = \sum_{n=0}^{K} c_n z^n \approx R_e(z) = \dfrac{a_0 + a_1 z + \cdots + a_P z^P}{1 + b_1 z + \cdots + b_Q z^Q}$ (11)
By equation (9), the associated Laplace transform of $R_e(z)$ is:
$R(s) = \dfrac{\sqrt{2a}}{s+a}\, R_e\!\left(\dfrac{s-a}{s+a}\right)$ (12)
The realization of equation (12) is shown in fig. 2, in which we have one stage of $\dfrac{\sqrt{2a}}{s+a}$, and P or Q (depending on which one is larger) stages of $\dfrac{s-a}{s+a}$. It is to be noted that the number of unknown coefficients is P+Q+1. However, in the polynomial representation, as in figure 2, we need one stage of $\dfrac{\sqrt{2a}}{s+a}$ and K stages of $\dfrac{s-a}{s+a}$. Its corresponding number of coefficients is K+1. Suppose K is an even number and P=Q=K/2; then we can save half the stages of $\dfrac{s-a}{s+a}$ by the rational approximation of equation (11). In the following, we will first review the Padé method [5] for finding the required rational expression.
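II-1. The Padé Approximation
If we choose K=P+Q, then equation (11) can be rewritten as:
$(c_0 + c_1 z + c_2 z^2 + \cdots)(1 + b_1 z + b_2 z^2 + \cdots + b_Q z^Q) = a_0 + a_1 z + \cdots + a_P z^P$ (13)
Then equation (13) can be rewritten into the following matrix form:
$\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_P \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \begin{bmatrix} c_0 & 0 & \cdots & 0 \\ c_1 & c_0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ c_P & c_{P-1} & \cdots & c_{P-Q} \\ c_{P+1} & c_P & \cdots & c_{P-Q+1} \\ \vdots & \vdots & & \vdots \\ c_{P+Q} & c_{P+Q-1} & \cdots & c_P \end{bmatrix} \begin{bmatrix} 1 \\ b_1 \\ \vdots \\ b_Q \end{bmatrix}$ (14)
with dimensions $(P+Q+1)\times 1$, $(P+Q+1)\times(Q+1)$ and $(Q+1)\times 1$, and with $c_n = 0$ for $n < 0$, or equivalently
$\begin{bmatrix} \bar{a} \\ \bar{0} \end{bmatrix} = \begin{bmatrix} [C_1] \\ [C_2] \end{bmatrix} \bar{b}$ (15)
where $[C_1]$ and $[C_2]$ are the corresponding upper and lower parts of the matrix in equation (14). Equation (15) can be solved in the following way:
$\bar{a} = [C_1]\,\bar{b}$ (16)
$\bar{c}_3 + [C_4]\,\bar{\beta} = \bar{0}$ (17)
The vectors $\bar{a}$, $\bar{\beta}$, $\bar{b}$ and the matrix $[C_4]$ are the corresponding vectors and matrix in equation (14): $\bar{b} = (1, \bar{\beta}^T)^T$ with $\bar{\beta} = (b_1, \ldots, b_Q)^T$, $\bar{c}_3$ is the first column of $[C_2]$, and $[C_4]$ consists of the remaining Q columns of $[C_2]$. If the inverse of $[C_4]$ exists, then from equation (17) we have:
$\bar{\beta} = -[C_4]^{-1}\,\bar{c}_3$ (18)
and also we have the solution for $\bar{a}$ as:
$\bar{a} = [C_1] \begin{bmatrix} 1 \\ -[C_4]^{-1}\,\bar{c}_3 \end{bmatrix}$ (19)
When the rational approximation $R_e(z)$ is expanded in a power series by long division, then by section I-3, the coefficients of this series will be identical to those of $\hat{H}_e(z)$, the truncated Laguerre transform, up to degree P+Q.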
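The Padé step can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not Salomonsson's or the authors' implementation: the helper names `solve`, `rational_series` and `pade` are hypothetical, and the test coefficients come from an assumed rational function $(1 + 0.5z)/(1 - 0.3z)$ so that the exact answer is known. It solves the lower block of the matrix equation for $b_1,\ldots,b_Q$ and then reads off $a_0,\ldots,a_P$ from the upper block.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [[float(val) for val in M[i]] + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def rational_series(a, b, count):
    """Power-series coefficients of (sum a_i z^i) / (sum b_j z^j), with b[0] == 1."""
    c = []
    for n in range(count):
        a_n = a[n] if n < len(a) else 0.0
        c.append(a_n - sum(b[j] * c[n - j] for j in range(1, min(n, len(b) - 1) + 1)))
    return c

def pade(c, P, Q):
    """Return (a, b) with b[0] == 1 matching c_0 .. c_{P+Q}; needs len(c) >= P+Q+1."""
    coef = lambda n: c[n] if n >= 0 else 0.0
    # Lower block: for n = P+1 .. P+Q, c_n + sum_j b_j c_{n-j} = 0.
    lower = [[coef(n - j) for j in range(1, Q + 1)] for n in range(P + 1, P + Q + 1)]
    rhs = [-c[n] for n in range(P + 1, P + Q + 1)]
    b = [1.0] + (solve(lower, rhs) if Q > 0 else [])
    # Upper block: a_n = sum_j b_j c_{n-j} for n = 0 .. P.
    a = [sum(b[j] * coef(n - j) for j in range(0, min(n, Q) + 1)) for n in range(P + 1)]
    return a, b

c = rational_series([1.0, 0.5], [1.0, -0.3], 3)   # assumed test series
a, b = pade(c, P=1, Q=1)
print(c, a, b)
```

When the lower block is singular the elimination divides by zero; this corresponds to the case in which the required inverse does not exist.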
In the above approximation method we only consider P+Q+1 coefficients of the Laguerre transform $\hat{H}_e(z)$. With the same technique, Salomonsson [5] modified the above method and extended the coefficients $c_n$ to K+1 terms with $K \ge P+Q$. In this situation, we have more linear equations than the number of unknown variables $\{a_i\}_{i=0}^{P}$ and $\{b_i\}_{i=1}^{Q}$. Consequently equation (14) is no longer valid, and therefore an error vector should be introduced, i.e. we have:
$\begin{bmatrix} a_0 \\ \vdots \\ a_P \\ 0 \\ \vdots \\ 0 \end{bmatrix} - \begin{bmatrix} c_0 & 0 & \cdots & 0 \\ c_1 & c_0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ c_K & c_{K-1} & \cdots & c_{K-Q} \end{bmatrix} \begin{bmatrix} 1 \\ b_1 \\ \vdots \\ b_Q \end{bmatrix} = \begin{bmatrix} \varepsilon_0 \\ \varepsilon_1 \\ \vdots \\ \varepsilon_K \end{bmatrix}$ (20)
or alternatively we have the following matrix form:
$-\bar{c}_6 + \left[\begin{array}{c|c} \begin{matrix} [I] \\ [0] \end{matrix} & [C_5] \end{array}\right] \bar{x} = \bar{e}$ (21)
or
$-\bar{c}_6 + [A]\,\bar{x} = \bar{e}$ (22)
where $[I]$ is the identity matrix of order $(P+1)\times(P+1)$, and vectors $\bar{a}$ and $\bar{b}$ are defined in equation (15). Vectors $\bar{c}_6$, $\bar{x}$, $\bar{e}$ and matrix $[C_5]$ are defined as:
$\bar{c}_6 = (c_0, c_1, \ldots, c_K)^T$, $\bar{x} = (a_0, a_1, \ldots, a_P, -b_1, \ldots, -b_Q)^T$, $\bar{e} = (\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_K)^T$,
$[C_5] = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ c_0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ c_{K-1} & c_{K-2} & \cdots & c_{K-Q} \end{bmatrix}$ (23)
Hence $[A]$ is a matrix of order $(K+1)\times(P+Q+1)$. From the above consideration, we formulate the problem as finding the unknown vector $\bar{x}$ such that the mean squared error
$E = \bar{e}^T\bar{e} = \bar{x}^T[B]\,\bar{x} - 2\,\bar{x}^T\bar{v} + \bar{c}_6^T\bar{c}_6$
is minimum, where $[B] = [A]^T[A]$ is a real symmetric matrix of order $(P+Q+1)\times(P+Q+1)$ and $\bar{v} = [A]^T\bar{c}_6$ is a (P+Q+1)-tuple vector. From the gradient method in the calculus of variations, the optimal solution $\bar{x}^*$ happens to be:
$\bar{x}^* = [B]^{-1}\,\bar{v}$ (26)
and its associated error is:
$E = \bar{c}_6^T\bar{c}_6 - \bar{x}^{*T}\bar{v}$
Salomonsson employed the above procedure, i.e. one involving a direct matrix inversion, to find the required coefficients for the signal representation of some simple networks [5].
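The least squares formulation can be sketched as follows. This is an illustrative reading of the modified method, not the code used for the paper's results: the helper names `solve`, `rational_series` and `modified_pade` are hypothetical, a direct elimination stands in for the matrix inversion, and the test series is generated from an assumed rational function so that the minimum error is zero. Row n of $[A]$ holds a unit entry for $a_n$ (when $n \le P$) followed by $c_{n-1},\ldots,c_{n-Q}$, and the unknown vector is $(a_0,\ldots,a_P,-b_1,\ldots,-b_Q)$.

```python
def solve(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [[float(val) for val in M[i]] + [float(rhs[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def rational_series(a, b, count):
    """Power-series coefficients of (sum a_i z^i) / (sum b_j z^j), with b[0] == 1."""
    c = []
    for n in range(count):
        a_n = a[n] if n < len(a) else 0.0
        c.append(a_n - sum(b[j] * c[n - j] for j in range(1, min(n, len(b) - 1) + 1)))
    return c

def modified_pade(c, P, Q):
    """Least squares fit of (a, b) to c_0 .. c_K, K = len(c) - 1 >= P + Q."""
    K = len(c) - 1
    coef = lambda n: c[n] if n >= 0 else 0.0
    A = [[1.0 if i == n else 0.0 for i in range(P + 1)]
         + [coef(n - j) for j in range(1, Q + 1)] for n in range(K + 1)]
    cols = P + Q + 1
    B = [[sum(A[n][r] * A[n][s] for n in range(K + 1)) for s in range(cols)]
         for r in range(cols)]                                   # [B] = [A]^T [A]
    v = [sum(A[n][r] * c[n] for n in range(K + 1)) for r in range(cols)]  # v = [A]^T c6
    x = solve(B, v)                                              # x* = [B]^{-1} v
    error = sum(c_n * c_n for c_n in c) - sum(x_i * v_i for x_i, v_i in zip(x, v))
    return x[:P + 1], [1.0] + [-x_i for x_i in x[P + 1:]], error

c = rational_series([1.0, 0.5], [1.0, -0.3], 7)   # assumed test series, K = 6
a, b, error = modified_pade(c, P=1, Q=1)
print(a, b, error)
```

With coefficients that do not come from a rational function of the chosen degrees, the returned error is positive and measures how well the first K+1 coefficients are matched.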
II-2. The First Kind of Iteration Method for Searching Rational Coefficients
From equation (26), we know that we should invert the real symmetric matrix $[B]$ to find the optimal solution $\bar{x}^*$. This is generally tedious and time consuming work. Hence we will approach this problem from another point of view. Instead of inverting matrix $[B]$, we use the following iteration method to find the required solution $\bar{x}$:
$\bar{x}_{n+1} = \bar{x}_n - \lambda_n\left([B]\,\bar{x}_n - \bar{v}\right)$
where $\bar{x}_n$ denotes the value of $\bar{x}$ at the nth iteration stage and $\lambda_n$ is the step size at the nth iteration process.
When $0 < \lambda_n < 2/\lambda_{\max}$, where $\lambda_{\max}$ is the largest eigenvalue of matrix $[B]$, then as the number of iteration stages becomes large enough $\bar{x}_{n+1}$ will approach the optimal solution $\bar{x}^* = [B]^{-1}\bar{v}$. The selection of the iteration step size and its associated computation procedures can be found in reference [4].
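A minimal sketch of this first iteration is given below. It is an illustration under assumptions, not the procedure of reference [4]: the largest eigenvalue is estimated by power iteration, the fixed step size $1/\lambda_{\max}$ is one admissible choice inside the stated bound, and the 2x2 matrix, right-hand side and function names are hypothetical.

```python
def matvec(B, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(b_ij * x_j for b_ij, x_j in zip(row, x)) for row in B]

def largest_eigenvalue(B, iterations=200):
    """Power iteration estimate of the largest eigenvalue of a symmetric positive-definite B."""
    x = [1.0] * len(B)
    estimate = 0.0
    for _ in range(iterations):
        y = matvec(B, x)
        norm = sum(y_i * y_i for y_i in y) ** 0.5
        x = [y_i / norm for y_i in y]
        # Rayleigh quotient of the normalized vector.
        estimate = sum(x_i * bx_i for x_i, bx_i in zip(x, matvec(B, x)))
    return estimate

def first_kind_iteration(B, v, x0, tol=1e-12, max_iter=10000):
    """Iterate x <- x - step * (B x - v) with a fixed step inside (0, 2 / lambda_max)."""
    step = 1.0 / largest_eigenvalue(B)
    x = list(x0)
    for n in range(max_iter):
        residual = [bx_i - v_i for bx_i, v_i in zip(matvec(B, x), v)]
        if max(abs(r) for r in residual) < tol:
            return x, n
        x = [x_i - step * r_i for x_i, r_i in zip(x, residual)]
    return x, max_iter

# Assumed small symmetric positive-definite system; exact solution is (1/11, 7/11).
B = [[4.0, 1.0], [1.0, 3.0]]
v = [1.0, 2.0]
x_star, stages = first_kind_iteration(B, v, x0=[-1.0, -1.0])
print(x_star, stages)
```

The number of stages needed grows with the ratio of the largest to the smallest eigenvalue of the matrix, which is one reason the choice of step size matters in practice.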
II-3. The Second Kind of Iteration Method for Finding Rational Coefficients
The approximation procedure in the previous iteration algorithm is designed in such a manner that not only should the iteration converge fast but also the step size $\lambda_n$ must be no greater than the value $2/\lambda_{\max}$. In other words, the iteration method involves maximum eigenvalue calculations, which in certain situations may become an annoying step.
In order to overcome the above difficulty, we introduce another iteration algorithm:
$\bar{x}_{n+1} = \bar{x}_n - \tfrac{1}{2}\left([D] + [L]\right)^{-1} \nabla E$ (28)
where matrices $[D]$, $[U]$, $[L]$ are the diagonal, upper triangular, and lower triangular parts of $[B]$ respectively. From the definition of the gradient, equation (28) is equivalent to the following expression:
$\bar{x}_{n+1} = \bar{x}_n - \tfrac{1}{2}\left([D]+[L]\right)^{-1} \cdot 2\left([B]\,\bar{x}_n - \bar{v}\right) = \bar{x}_n - \left([D]+[L]\right)^{-1}\left([B]\,\bar{x}_n - \bar{v}\right)$ (29)
Let $\left([D]+[L]\right)^{-1}\left([B]\,\bar{x}_n - \bar{v}\right) = \Delta\bar{x}_n$; then we have another form for equation (29), i.e.
$\bar{x}_{n+1} = \bar{x}_n - \Delta\bar{x}_n$ (30)
Since matrices $[D]$ and $[L]$ are the diagonal and lower triangular parts of the matrix $[B]$, it is not difficult to calculate the value $\Delta\bar{x}_n$ by the back substitution method.
Equation (29) can be further simplified as:
$\bar{x}_{n+1} = -\left([D]+[L]\right)^{-1}[U]\,\bar{x}_n + \left([D]+[L]\right)^{-1}\bar{v}$ (31)
Equation (31) is just the Gauss-Seidel iteration form used in numerical analysis for solving simultaneous equations. Since $[B]$ is a real, symmetric, positive-definite matrix, from the Gauss-Seidel iteration property we conclude that the iteration process in equation (31) will converge no matter what the initial guess $\bar{x}_0$ is [6]. The convergence rate of this iteration process can be found in [4].
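The Gauss-Seidel form can be written componentwise, which avoids forming any inverse explicitly. The sketch below is illustrative only; the 2x2 system, the stopping rule and the function name `gauss_seidel` are assumptions. Each sweep updates the components in order, always using the newest available values, which is equivalent to solving $([D]+[L])\,\bar{x}_{n+1} = -[U]\,\bar{x}_n + \bar{v}$ by substitution.

```python
def gauss_seidel(B, v, x0, tol=1e-12, max_sweeps=10000):
    """Gauss-Seidel sweeps for B x = v; B is assumed symmetric positive definite."""
    x = list(x0)
    size = len(B)
    for sweep in range(max_sweeps):
        largest_change = 0.0
        for i in range(size):
            # Off-diagonal terms use already-updated entries for j < i and old ones for j > i.
            off_diagonal = sum(B[i][j] * x[j] for j in range(size) if j != i)
            new_value = (v[i] - off_diagonal) / B[i][i]
            largest_change = max(largest_change, abs(new_value - x[i]))
            x[i] = new_value
        if largest_change < tol:
            return x, sweep + 1
    return x, max_sweeps

# Assumed small symmetric positive-definite system; exact solution is (1/11, 7/11).
B = [[4.0, 1.0], [1.0, 3.0]]
v = [1.0, 2.0]
x_star, sweeps = gauss_seidel(B, v, x0=[-1.0, -1.0])
print(x_star, sweeps)
```

No eigenvalue estimate or step size is needed here, which is the practical advantage claimed for the second method.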
III. EXAMPLES
In order to illustrate the Padé approximation and the iteration methods described in the previous sections, we have calculated the approximations of the following impulse responses:
~ t < 1
~ t < 2
t? 2
o~t ~2
t> 2
In these examples, the parameter a is chosen as 1, and for comparison we have calculated approximations of these desired impulse responses by the truncated Laguerre series (equation (11)). For the truncated Laguerre series, the degrees of the numerator and denominator are both 5, and therefore the degree of the Laguerre series $\hat{H}_e(z)$ is 10; however, in the modified Padé approximation the degree of the power series is selected as 12. The results are plotted in figure 3 to figure 5, and also tabulated in table 1.
IV. CONCLUSIONS
We have presented two iteration methods for finding the coefficients of Laguerre polynomials in the synthesis of some linear networks. Instead of the original matrix inversion method, these two methods use iteratively adjusted algorithms to find the approximation parameters, and the performance of these methods is based on the minimization of the mean squared error. The first kind of algorithm converges under the condition that the iteration step size is less than a certain constant value, while the second algorithm converges for any initial guess, since $[B]$ is symmetric and positive definite. This analysis has been illustrated by the synthesis of three simple networks, and their results have been plotted, tabulated and compared.
V. REFERENCES
1. Lewis Franks, "Signal Theory," Prentice-Hall, 1969.
2. Norbert Wiener, "Extrapolation, Interpolation and Smoothing of Stationary Time Series," John Wiley and Sons, 1949.
3. K. Steiglitz, "Rational Transform Approximation via the Laguerre Spectrum," J. of the Franklin Institute, Vol. 280, No. 5, pp. 387-393.
4. Y. G. Jan, F. R. Chang, "Signal Representation with Laguerre Polynomials Using Iteration Method," Technical Report, Telecommunication Labs. TeL-B No. 143, Oct. 1974.
5. Goran Salomonsson, "Linear Network Synthesis with Laguerre Polynomials," Ericsson Technics, No. 2, pp. 84-109, 1971.
6. J. H. Wilkinson, "Rounding Errors in Algebraic Processes," Prentice-Hall, 1963.
Table 1. Sum of Squared Error*

Method               h1(t)      h2(t)             h3(t)
Truncated            1.0982     0.33789           0.56918
Padé                 0.53206    0.21829 x 10^-1   0.29672 x 10^-1
Modified Padé        0.52967    0.20740 x 10^-1   0.21769 x 10^-1
First iteration**    0.81507    0.83576 x 10^-1   0.13344
Second iteration**   0.78846    0.79277 x 10^-1   0.12558

* The sampling interval is set equal to 0.1 from t=0 to t=5.
** The initial point $\bar{x}_0$ is chosen arbitrarily as (-1,-1,-1,...,-1). The value of $\varepsilon$ is chosen as 10^-4.

Figure 1. Approximation of desired impulse response by a linear combination of Laguerre polynomials
Figure 2. Realization of $R_e(z)$ with M=N.
Figure 3. Example 1, h1(t). Panels: approximation by the Padé method; approximation by the modified Padé method using matrix inverse; approximation by the modified method using the 1st iteration; approximation by the modified method using the 2nd iteration.
Figure 4. Example 2, h2(t). Panels as in Figure 3.
Figure 5. Example 3, h3(t). Panels as in Figure 3.
THE TRANSFORMATION OPERATOR APPROACH*
TO MULTI-SUBSYSTEM DYNAMICS
William Jerkovsky, The Aerospace Corporation, El Segundo, California
Abstract
A new approach to formulating nonlinear dynamics equations is presented. This new approach is a unified approach which abstracts the essence inherent in the Newton, Euler, Lagrange, and Boltzmann-Hamel methods of formulating dynamics equations. The key ingredient is the linear velocity transformation operator which transforms "Old" velocity variables to "new" velocity variables. The new approach is illustrated by a simple example of interconnecting two rigid bodies.
1. INTRODUCTION
Gabriel Kron 1-3 has developed a tensorial approach
to generating equations of motion for a system
when the equations for the subsystems and the
interconnection equations are given. The most
significant feature of this tensorial approach is
that the resulting equations of motion are the
exact nonlinear ordinary differential equations
for the system. Currently there are many workers
in the field of network and system theory who use
approaches that are similar to Kron's, but most of
these methods are only applicable to linear systems
(because, in effect, they are linearized versions
of Kron's method).
The purpose of this paper is to cast Kron's work
into a modern mathematical framework, and to
illustrate the method with a multi-body spacecraft
dynamics example.
The discussion of the method given herein differs
from previous discussions of Kron's work because
Kron and his followers did not separate the method
into its algebraic and geometric aspects. The
algebraic aspects are essentially the same whether
the system is linear or not; the geometric aspects
of the system contain all the nonlinearity of the dynamics.
2. PRIMITIVE EQUATIONS
In any finite dimensional physical system we can always specify a kinetic energy T which is quadratic in the system velocity vector $\sigma$:
$T = \tfrac{1}{2}\,\sigma^t \mu\, \sigma$ (1)
where $\sigma^t$ is the transpose of $\sigma$, and $\mu$ is the system metric tensor. In general, $\mu$ is positive definite symmetric, and therefore it has a positive definite symmetric inverse $\nu$:
$\nu = \mu^{-1}$ (2)
The product $\mu\,\sigma$ is called the system momentum G:
$G = \mu\,\sigma$ (3)
Even though in general $\mu$ depends on the coordinates, nevertheless G is a linear function of the velocity $\sigma$. The kinetic energy is now given by
$T = \tfrac{1}{2}\,G^t \sigma$ (4)
The equations of motion for any dynamical system can always be written as
$\dot{G} - C\,G = K$ (5)
where $\dot{G}$ is the time derivative of G, K is the system force (including forces derivable from a potential or dissipation function), and C is the system Cartan operator$^4$. In general C depends on the velocity $\sigma$ and on the coordinates. Equation (5) is a nonlinear ordinary differential equation for G. If we introduce the notation
$\overset{\circ}{G} = \dot{G} - C\,G$ (6)
then (5) becomes
$\overset{\circ}{G} = K$ (7)
$\overset{\circ}{G}$ is the covariant time derivative of G.
If we have a network of n subsystems, then the equations for the ith (i = 1,2,...,n) subsystem can be written as $G^i = \mu^i \sigma^i$, $\overset{\circ}{G}{}^i = \dot{G}^i - C^i G^i$, and $\overset{\circ}{G}{}^i = K^i$ (note that the superscript i is a counting index, not a vector component index). Let $\sigma$ be the direct sum of the $\sigma^i$ for i = 1,2,...,n; similarly, let G be the direct sum of the $G^i$; etc. Thus, the composite system consisting of n subsystems can be described by $G = \mu\,\sigma$, $\overset{\circ}{G} = \dot{G} - C\,G$, and $\overset{\circ}{G} = K$. Note that this composite system has as many degrees of freedom as the sum of the numbers of degrees of freedom of all the subsystems.
*This work was supported by USAF SAMSO Contract No. F04701-74-C-0075.
3. TRANSFORMED EQUATIONS
If it is desired to reduce the number of degrees of freedom of the composite system (because of the constraints due to the interconnections), we introduce a new velocity vector $\bar{\sigma}$ which has smaller dimension than $\sigma$. We write the transformation in the form
$\sigma = A\,\bar{\sigma}$ (8)
where A is a linear transformation operator (note that in general A depends on the coordinates but not on the velocity). We can actually write the transformation equation (8) even if we do not want to reduce the number of degrees of freedom of the (composite) system. If $\bar{\sigma}$ and $\sigma$ have the same dimension (i.e., if we do not reduce the number of degrees of freedom) then A is invertible. If $\bar{\sigma}$ has smaller dimension than $\sigma$ (because of the constraints) then the operator A still has rank equal to the dimension of $\bar{\sigma}$; hence, A is an injective operator. When we convert from $\sigma$ to $\bar{\sigma}$ we simultaneously convert from G to $\bar{G}$, where
$\bar{G} = A^t G$ (9)
where $A^t$ is the transpose of A. The kinetic energy can now be written as
$T = \tfrac{1}{2}\,\bar{\sigma}^t \bar{\mu}\, \bar{\sigma}$ (10)
where
$\bar{\mu} = A^t \mu\, A$ (11)
$\bar{\mu}$ is the transformed metric tensor. Since $\mu$ is positive definite symmetric and A is injective, $\bar{\mu}$ is also positive definite symmetric. We can now also write
$\bar{G} = \bar{\mu}\,\bar{\sigma}$ (12)
and
$\bar{\sigma} = \bar{\nu}\,\bar{G}$ (13)
where
$\bar{\nu} = \bar{\mu}^{-1}$ (14)
Evidently $\bar{\nu}$ is also positive definite symmetric.
We now define the linear transformation operator B by
$B = \bar{\nu}\,A^t \mu$ (15)
Since $\mu$ and $\bar{\nu}$ are bijective and A is injective, it follows that B is surjective. In fact, we easily see that B is a left inverse of A, and hence (8) can be inverted:
$\bar{\sigma} = B\,\sigma$ (16)
We can also write
$G = B^t \bar{G}$ (17)
The transformed equations of motion are now given by
$\overset{\circ}{\bar{G}} = \bar{K}$ (18)
where
$\overset{\circ}{\bar{G}} = \dot{\bar{G}} - \bar{C}\,\bar{G}$
$\bar{C} = \dot{A}^t B^t + A^t C\, B^t$ (19)
Since $\bar{K} = A^t K$ we also have
$\overset{\circ}{\bar{G}} = A^t \overset{\circ}{G}$ (20)
and hence $\overset{\circ}{\bar{G}}$ is obtained from $\overset{\circ}{G}$ (via $A^t$) in the same way as $\bar{G}$ is obtained from G.
Thus the equations of motion for a dynamical system are always given by the statement that the covariant time derivative of the momentum is equal to the force. The equations $\dot{G} - C\,G = K$ and $\dot{\bar{G}} - \bar{C}\,\bar{G} = \bar{K}$ are equivalent to (but simpler than!) the Boltzmann-Hamel equations$^{5,6}$. If the system velocity is actually a holonomic velocity then these equations reduce to the Lagrange equations (or equivalently, to the Hamilton equations).
In principle, the amount of effort required to formulate the equations of motion for a dynamical system is the same no matter what approach is used. The advantage of formulating the equations as discussed herein is that all of the nonlinear dynamics is buried in the Cartan operators (C or $\bar{C}$). It may take quite a bit of effort to get an explicit expression for the Cartan operator, but after it is found the equations of motion are very simple when expressed in terms of this operator. Furthermore, the equations of motion always have the same form, and all the terms required in the equations are always obtained in the same way. In a sense, the approach described herein is a nonlinear generalization of similar ideas in linear large-scale system theory$^7$.
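The algebraic part of the transformation can be checked on a small numerical case. The sketch below is only an illustration under assumptions: the 3-dimensional primitive velocity, the diagonal metric, the constraint (third velocity equal to the sum of the first two) and all helper names are hypothetical, and B is formed as the metric-weighted left inverse $\bar{\nu} A^t \mu$. It verifies that B A is the identity, that $\bar{G} = A^t G$ agrees with $\bar{\mu}\bar{\sigma}$, and that the kinetic energy is the same in both sets of variables.

```python
def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def inv2(M):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Assumed primitive metric (3 degrees of freedom) and injective transformation A (3x2).
mu = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 4.0]]
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # constraint: third velocity = first + second
A_t = transpose(A)

mu_bar = matmul(matmul(A_t, mu), A)        # transformed metric, A^t mu A
nu_bar = inv2(mu_bar)                      # its inverse
B = matmul(matmul(nu_bar, A_t), mu)        # metric-weighted left inverse of A

sigma_bar = [[0.5], [-1.0]]                # reduced velocity (column matrix)
sigma = matmul(A, sigma_bar)               # primitive velocity, sigma = A sigma_bar
G = matmul(mu, sigma)                      # primitive momentum
G_bar = matmul(A_t, G)                     # transformed momentum, A^t G

BA = matmul(B, A)
T = 0.5 * matmul(transpose(G), sigma)[0][0]
T_bar = 0.5 * matmul(transpose(G_bar), sigma_bar)[0][0]
print(BA, G_bar, matmul(mu_bar, sigma_bar), T, T_bar)
```

In an actual multi-body problem A depends on the coordinates, so these products would be re-evaluated along the trajectory; the fixed matrices here only exercise the algebra.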
4. TENSORS AND VECTOR SPACES
It is interesting to note that we can consider $\sigma$ and $\bar{\sigma}$ to be contravariant vectors, and we can consider G, $\bar{G}$, and K, $\bar{K}$ to be covariant vectors. The relationships $G = \mu\,\sigma$ and $\bar{G} = \bar{\mu}\,\bar{\sigma}$ then show that $\mu$, $\bar{\mu}$ are covariant of degree 2. Hence $\nu$, $\bar{\nu}$ are contravariant of degree 2. The relationship $\sigma = A\,\bar{\sigma}$ shows that A (and B) is covariant of degree 1 and contravariant of degree 1. The kinetic energy is, of course, a scalar: $T = \tfrac{1}{2} G^t \sigma = \tfrac{1}{2} \bar{G}^t \bar{\sigma}$, or $T = \tfrac{1}{2} \sigma^t \mu\, \sigma = \tfrac{1}{2} \bar{\sigma}^t \bar{\mu}\, \bar{\sigma}$. It is easy to show that the time derivative of T is $\dot{T} = K^t \sigma = \bar{K}^t \bar{\sigma}$, and this is, of course, also a scalar.
We can consider $\sigma$ to be an element in the velocity space $\mathcal{S}$ (denoted by $\sigma \in \mathcal{S}$). Similarly $\bar{\sigma} \in \bar{\mathcal{S}}$, $G \in \mathcal{K}$, $\bar{G} \in \bar{\mathcal{K}}$, $K \in \mathcal{K}$, $\bar{K} \in \bar{\mathcal{K}}$. These spaces are related as shown in the commutative diagram in Figure 1.
[Diagram: $A: \bar{\mathcal{S}} \to \mathcal{S}$; $\mu: \mathcal{S} \to \mathcal{K}$; $A^t: \mathcal{K} \to \bar{\mathcal{K}}$; $\bar{\mu}: \bar{\mathcal{S}} \to \bar{\mathcal{K}}$]
Figure 1: Commutative Diagram for Transformation Operator A
$\mathcal{S}$ and $\bar{\mathcal{S}}$ are tangent spaces to the configuration manifold which describes the dynamical system; $\mathcal{K}$ and $\bar{\mathcal{K}}$ are cotangent spaces which are the duals of the tangent spaces (and vice versa). The tangent and cotangent spaces are linear vector spaces even though the configuration manifold is in general nonlinear. The metric tensor relates these dual spaces to each other. Since the effect of A on $\bar{\mathcal{S}}$ and of $A^t$ on $\mathcal{K}$ is invertible, we also have the commutative diagram in Figure 2.
Figure 2: Commutative Diagram for Transformation Operator B
If we introduce basis vectors in each of the above vector spaces, then $\sigma$, G, K, C, $\mu$ are represented by the matrices $\underline{\sigma}$, $\underline{G}$, $\underline{K}$, $\underline{C}$, $\underline{\mu}$, respectively; here $\underline{\sigma}$, $\underline{G}$, and $\underline{K}$ are column matrices of scalars, and $\underline{C}$ and $\underline{\mu}$ are square matrices of scalars. The element of $\underline{C}$ which is in the ith row and jth column can be expressed as
$\underline{C}_{ij} = \sum_k \Gamma^j_{ik}\, \underline{\sigma}_k$ (21)
where $\underline{\sigma}_k$ is the kth element of $\underline{\sigma}$, and where the coefficients $\Gamma^j_{ik}$ are the components of the affine connection (with respect to the chosen basis). If $\sigma$ is a holonomic velocity, then the $\Gamma^j_{ik}$ are the Christoffel symbols of the second kind for the metric matrix $\underline{\mu}$. The coefficients $\Gamma^j_{ik}$ are usually very tedious to evaluate; however, we don't need to evaluate them when we use the transformation operator approach because we only need the Cartan operator C (or the Cartan matrix $\underline{C}$), and we can get an expression for C directly without first evaluating the $\Gamma^j_{ik}$.
It should be noted that we really don't even need C, because only the product C G appears in the equation of motion. Often we can write $C = C_1 + C_2$ where $C_2 G = 0$. Hence, C G reduces to $C_1 G$.
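In practice it is usually simplest to introduce X = - C G so that (5) and (6) become
$\dot{G} + X = K$ (22)
$\overset{\circ}{G} = \dot{G} + X$
The transformed equations then become
$\dot{\bar{G}} + \bar{X} = \bar{K}$ (23)
where
$\bar{X} = A^t X - \dot{A}^t G$ (24)
The significance of the above development is that we have expressed the equations of motion for an arbitrary dynamical system in terms of the algebraic and geometric aspects. The whole development is centered around the transformation operator A. If we make another transformation, say $\bar{\sigma} = \bar{A}\,\bar{\bar{\sigma}}$, then we get $\bar{\bar{G}} = \bar{A}^t \bar{G}$, $\bar{\bar{K}} = \bar{A}^t \bar{K}$, $\bar{\bar{C}} = \dot{\bar{A}}^t \bar{B}^t + \bar{A}^t \bar{C}\, \bar{B}^t$, $\overset{\circ}{\bar{\bar{G}}} = \dot{\bar{\bar{G}}} - \bar{\bar{C}}\,\bar{\bar{G}}$, and $\overset{\circ}{\bar{\bar{G}}} = \bar{\bar{K}}$; alternatively, we have $\overset{\circ}{\bar{\bar{G}}} = \dot{\bar{\bar{G}}} + \bar{\bar{X}}$ where $\bar{\bar{X}} = \bar{A}^t \bar{X} - \dot{\bar{A}}^t \bar{G}$. Since $\sigma = A\,\bar{\sigma}$ we also have $\sigma = \mathcal{A}\,\bar{\bar{\sigma}}$ where $\mathcal{A} = A\,\bar{A}$. Since A and $\bar{A}$ are covariant of degree 1 and contravariant of degree 1, so is $\mathcal{A}$; since A and $\bar{A}$ are injective, so is $\mathcal{A}$.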
5. EXAMPLE
The transformation operator approach to dynamics is being used at the Aerospace Corporation to generate the dynamics equations of motion for various spacecraft$^8$. Its utility stems from the fact that the same systematic technique can be used for spacecraft of arbitrary configuration. After the equations of motion are expressed in the form $\dot{G} = C\,G + K$ or $\dot{G} = -X + K$, these equations are numerically integrated for G, and then $\sigma$ is found by solving $G = \mu\,\sigma$ for $\sigma$.
We will now give an example of deriving the equa
tions of motion for a system consisting of two
rigid bodies, given the equations of motion for a
single rigid body. This system of two rigid
bodies might represent a spacecraft in which one
of the bodies represents the "main body" of the
spacecraft and the other rigid body represents an
"attached body" such as a gimballed antenna. For . --i l = 1,2, let H c. denote the angular momentum of
and let !i body i about itg center of mass c i ' c. l
denote the torque (or moment) on body i about the -i
center of mass c i ; also, let ~ denote the linear
momentum of body i, and let Fl be the force on
body i. The equations of motion for body i are
now given by
, . j{l
c. l
(25)
Let $\vec\omega^i$ denote the angular velocity of body $i$ with
respect to inertial space, and let $\vec v_{c_i}$ denote the
translational velocity of the center of mass $c_i$
with respect to inertial space. The angular
momentum $\vec H^i_{c_i}$ and linear momentum $\vec P^i$ are now given by
$$\vec H^i_{c_i} = I^i_{c_i} \cdot \vec\omega^i, \qquad \vec P^i = M^i E \cdot \vec v_{c_i} \qquad (26)$$
where $I^i_{c_i}$ is the inertia of body $i$ about the center
of mass $c_i$, and $M^i$ is the mass of body $i$. We
have also introduced the unit dyadic $E$ which has
the property that
$$E \cdot \vec V = \vec V \qquad (27)$$
for any vector $\vec V$.
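The primitive momentum and force for this two-body
system can now be denoted by
$$G = \begin{bmatrix} \vec H^1_{c_1} \\ \vec H^2_{c_2} \\ \vec P^1 \\ \vec P^2 \end{bmatrix}, \qquad K = \begin{bmatrix} \vec L^1_{c_1} \\ \vec L^2_{c_2} \\ \vec F^1 \\ \vec F^2 \end{bmatrix} \qquad (28)$$
From (25) we now have $\dot G + X = K$ where $X = 0$. The
primitive velocity is
$$\sigma = \begin{bmatrix} \vec\omega^1 \\ \vec\omega^2 \\ \vec v_{c_1} \\ \vec v_{c_2} \end{bmatrix} \qquad (29)$$
and the primitive mass/inertia is
$$\mu = \begin{bmatrix} I^1_{c_1} & & & \\ & I^2_{c_2} & & \\ & & M^1 E & \\ & & & M^2 E \end{bmatrix} \qquad (30)$$
where the elements of $\mu$ not shown are zero dyadics
($0$). From (26) we now have $G = \mu \cdot \sigma$.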
Next we assume that the two-body configuration is
such that there are no relative translational
degrees of freedom between the two bodies. Thus,
the composite system has only 9 degrees of freedom
(3 translational and 3 rotational degrees of
freedom for body 1, plus 3 rotational degrees of
freedom for body 2). We incorporate the constraints
(of no relative translational degrees of
freedom) by expressing the 12-dimensional vector
$\sigma$ in terms of a 9-dimensional vector $\bar\sigma$. In order
to find the required transformation operator $A$
we examine Figure 3 and introduce the following
notation.
[Figure 3: Diagram of Two-Body Configuration, showing body 1, body 2, and the inertial reference.]
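If $p$ and $q$ are points in space let $\vec r_p$ and $\vec r_q$ denote the position vectors to points $p$ and $q$,
respectively, from the origin of an inertial
reference; also, let $\vec R_{pq}$ denote the position vector to point $p$ from point $q$. Evidently
$\vec R_{pq} = \vec r_p - \vec r_q$. Let $a$ be an arbitrary reference
point fixed in body 1, and let $h$ be the "hinge
point" where bodies 1 and 2 are interconnected.
We can now write
$$\vec r_{c_1} = \vec r_a + \vec R_{c_1 a}, \qquad \vec r_{c_2} = \vec r_a + \vec R_{ha} + \vec R_{c_2 h} \qquad (31)$$
Taking the inertial time derivative of these equations and introducing the notation $\vec v_p = \dot{\vec r}_p$ (i.e., $\vec v_p$ is the translational velocity of point
$p$ with respect to inertial space) yields
$$\vec v_{c_1} = \vec v_a + \dot{\vec R}_{c_1 a}, \qquad \vec v_{c_2} = \vec v_a + \dot{\vec R}_{ha} + \dot{\vec R}_{c_2 h} \qquad (32)$$
Since $\vec R_{c_1 a}$ is fixed in body 1, we have
$$\dot{\vec R}_{c_1 a} = \vec\omega^1 \times \vec R_{c_1 a} = \tilde R^t_{c_1 a} \cdot \vec\omega^1 \qquad (33)$$
where we have introduced the following notation: if $\vec A$ and $\vec B$ are vectors, then $\tilde A$ and $\tilde A^t$ are defined by
$$\tilde A \cdot \vec B = \vec A \times \vec B, \qquad \tilde A^t = -\tilde A$$
Since $\vec R_{ha}$ is also fixed in body 1 we get
$$\dot{\vec R}_{ha} = \tilde R^t_{ha} \cdot \vec\omega^1 \qquad (34)$$
Similarly, $\dot{\vec R}_{c_2 h} = \tilde R^t_{c_2 h} \cdot \vec\omega^2$. Now let
$\vec\Omega$ denote the angular velocity of body 2 with
respect to body 1, and for convenience write
$\vec\omega^1$ as $\vec\omega$. Then
$$\vec\omega^2 = \vec\omega + \vec\Omega \qquad (35)$$
From this we get
$$\vec v_{c_1} = \vec v_a + \tilde R^t_{c_1 a} \cdot \vec\omega \qquad (36)$$
$$\vec v_{c_2} = \vec v_a + \tilde R^t_{c_2 a} \cdot \vec\omega + \tilde R^t_{c_2 h} \cdot \vec\Omega \qquad (37)$$
where $\vec R_{c_2 a} = \vec R_{c_2 h} + \vec R_{ha}$. (35) to (37) can now be
written as
$$\begin{bmatrix} \vec\omega^1 \\ \vec\omega^2 \\ \vec v_{c_1} \\ \vec v_{c_2} \end{bmatrix} = \begin{bmatrix} E & 0 & 0 \\ E & E & 0 \\ \tilde R^t_{c_1 a} & 0 & E \\ \tilde R^t_{c_2 a} & \tilde R^t_{c_2 h} & E \end{bmatrix} \cdot \begin{bmatrix} \vec\omega \\ \vec\Omega \\ \vec v_a \end{bmatrix} \qquad (38)$$
We now define $\bar\sigma$ by
$$\bar\sigma = \begin{bmatrix} \vec\omega \\ \vec\Omega \\ \vec v_a \end{bmatrix} \qquad (39)$$
The transformation operator $A$ is now given by
comparing $\sigma = A \cdot \bar\sigma$ with (38).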
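Now that we have $A$, we get the transformed momentum
and force from $\bar G = A^t \cdot G$ and $\bar K = A^t \cdot K$. Thus we get
$$\bar G = \begin{bmatrix} E & E & \tilde R_{c_1 a} & \tilde R_{c_2 a} \\ 0 & E & 0 & \tilde R_{c_2 h} \\ 0 & 0 & E & E \end{bmatrix} \cdot G = \begin{bmatrix} \vec H^1_{c_1} + \vec H^2_{c_2} + \tilde R_{c_1 a} \cdot \vec P^1 + \tilde R_{c_2 a} \cdot \vec P^2 \\ \vec H^2_{c_2} + \tilde R_{c_2 h} \cdot \vec P^2 \\ \vec P^1 + \vec P^2 \end{bmatrix} = \begin{bmatrix} \vec H_a \\ \vec H^2_h \\ \vec P \end{bmatrix} \qquad (40)$$
where $\vec H_a$ is the angular momentum of bodies 1 and 2
about $a$, $\vec H^2_h$ is the angular momentum of body 2
about $h$, and $\vec P$ is the linear momentum of bodies
1 and 2. Similarly we get
$$\bar K = \begin{bmatrix} \vec L_a \\ \vec L^2_h \\ \vec F \end{bmatrix} \qquad (41)$$
where $\vec L_a$ is the torque on bodies 1 and 2 about $a$,
$\vec L^2_h$ is the torque on body 2 about $h$, and $\vec F$ is the
force on bodies 1 and 2. Since $X = 0$ we get
$$\bar X = -\dot A^t \cdot G = \begin{bmatrix} \vec v_a \times \vec P \\ \vec v_h \times \vec P^2 \\ 0 \end{bmatrix} \qquad (42)$$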
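The transformed mass/inertia is
$$\bar\mu = A^t \cdot \mu \cdot A = \begin{bmatrix} I_a & I^2_{ah} & M \tilde R_{ca} \\ I^2_{ha} & I^2_h & M^2 \tilde R_{c_2 h} \\ M \tilde R^t_{ca} & M^2 \tilde R^t_{c_2 h} & M E \end{bmatrix} \qquad (43)$$
where
$$\begin{aligned}
I_a &= I^1_{c_1} + M^1 \tilde R_{c_1 a} \cdot \tilde R^t_{c_1 a} + I^2_{c_2} + M^2 \tilde R_{c_2 a} \cdot \tilde R^t_{c_2 a} \\
I^2_{ah} &= I^2_{c_2} + M^2 \tilde R_{c_2 a} \cdot \tilde R^t_{c_2 h} \\
I^2_{ha} &= I^2_{c_2} + M^2 \tilde R_{c_2 h} \cdot \tilde R^t_{c_2 a} \\
I^2_h &= I^2_{c_2} + M^2 \tilde R_{c_2 h} \cdot \tilde R^t_{c_2 h} \\
M &= M^1 + M^2 \\
M \tilde R_{ca} &= M^1 \tilde R_{c_1 a} + M^2 \tilde R_{c_2 a}
\end{aligned} \qquad (44)$$
Note that $I_a$ is the inertia of bodies 1 and 2
about $a$; $I^2_h$ is the inertia of body 2 about $h$, $M$ is
the mass of bodies 1 and 2, and point $c$ is the
center of mass of bodies 1 and 2 (so that $\vec R_{ca}$ is the position vector to the composite center of
mass from the arbitrary reference point $a$ fixed
in body 1).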
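As a numerical cross-check of (43) and (44), the following Python sketch builds $A$ as read off from (38) and $\mu$ from (30), then compares blocks of $A^t \cdot \mu \cdot A$ with the closed-form expressions. The masses, inertias, and position vectors are hypothetical values chosen only for illustration, and `tilde` is an assumed helper name for the cross-product matrix; none of this comes from the report cited as reference 8.

```python
import numpy as np

def tilde(r):
    """Matrix of the dyadic R~ defined by R~ . B = R x B (hypothetical helper)."""
    x, y, z = r
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

# Hypothetical body data (illustration only)
M1, M2 = 100.0, 10.0
I1 = np.diag([40.0, 50.0, 60.0])      # inertia of body 1 about c1
I2 = np.diag([1.0, 2.0, 3.0])         # inertia of body 2 about c2
R_c1a = np.array([0.2, -0.1, 0.3])    # c1 relative to reference point a
R_ha = np.array([1.0, 0.0, 0.5])      # hinge h relative to a
R_c2h = np.array([0.0, 0.4, 0.1])     # c2 relative to hinge h
R_c2a = R_c2h + R_ha

E, Z = np.eye(3), np.zeros((3, 3))

# Transformation operator A (12 x 9), as in (38)
A = np.block([
    [E, Z, Z],
    [E, E, Z],
    [tilde(R_c1a).T, Z, E],
    [tilde(R_c2a).T, tilde(R_c2h).T, E],
])

# Primitive mass/inertia (12 x 12), as in (30)
mu = np.block([
    [I1, Z, Z, Z],
    [Z, I2, Z, Z],
    [Z, Z, M1 * E, Z],
    [Z, Z, Z, M2 * E],
])

mu_bar = A.T @ mu @ A   # transformed mass/inertia (9 x 9)

# Closed-form blocks from (44)
M = M1 + M2
R_ca = (M1 * R_c1a + M2 * R_c2a) / M
I_a = (I1 + M1 * tilde(R_c1a) @ tilde(R_c1a).T
       + I2 + M2 * tilde(R_c2a) @ tilde(R_c2a).T)
I_h = I2 + M2 * tilde(R_c2h) @ tilde(R_c2h).T

print(np.allclose(mu_bar[0:3, 0:3], I_a))
print(np.allclose(mu_bar[3:6, 3:6], I_h))
print(np.allclose(mu_bar[0:3, 6:9], M * tilde(R_ca)))
print(np.allclose(mu_bar[6:9, 6:9], M * E))
```

Because `tilde` is linear in its argument, the composite center-of-mass block follows directly from the mass-weighted sum of the two position vectors.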
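The final transformed equations of motion are now
given by
$$\frac{d}{dt}\begin{bmatrix} \vec H_a \\ \vec H^2_h \\ \vec P \end{bmatrix} + \begin{bmatrix} \vec v_a \times \vec P \\ \vec v_h \times \vec P^2 \\ 0 \end{bmatrix} = \begin{bmatrix} \vec L_a \\ \vec L^2_h \\ \vec F \end{bmatrix} \qquad (\dot{\bar G} + \bar X = \bar K) \qquad (45)$$
and
$$\begin{bmatrix} \vec H_a \\ \vec H^2_h \\ \vec P \end{bmatrix} = \begin{bmatrix} I_a & I^2_{ah} & M \tilde R_{ca} \\ I^2_{ha} & I^2_h & M^2 \tilde R_{c_2 h} \\ M \tilde R^t_{ca} & M^2 \tilde R^t_{c_2 h} & M E \end{bmatrix} \cdot \begin{bmatrix} \vec\omega \\ \vec\Omega \\ \vec v_a \end{bmatrix} \qquad (\bar G = \bar\mu \cdot \bar\sigma) \qquad (46)$$
In a spacecraft dynamics application, $\vec F$ is the
total external force on the spacecraft, $\vec L_a$ is the total external torque on the spacecraft about the
point $a$, and if body 2 is an antenna, $\vec L^2_h$ is the
torque on the antenna about the hinge point $h$.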
This example can easily be generalized to a
system consisting of an arbitrary number of rigid
bodies interconnected in a fashion so that the
graph of the configuration forms a tree (no
closed loops). If the configuration does not
form a tree, the equations are considerably more
involved; however, even this more complicated
case can be handled by cutting the graph so that
it becomes a tree and then enforcing the cut
constraints by appropriate constraint forces
(i.e., Lagrange multipliers). If the bodies are
flexible rather than rigid, the same approach
still applies, but the equations get much more
complicated (primarily because the equations for
a single flexible body are much more complicated
than for a single rigid body).
REFERENCES
1. G. Kron, "Non-Riemannian Dynamics of Rotating
Electrical Machinery", Journal of Mathematics
and Physics, Vol. XIII, No. 2, pp 103-194,
May 1934.
2. B. Hoffman, "Kron's Method of Subspaces",
Quarterly of Applied Mathematics, Vol. II, No.
3, pp 218-231, 1944.
3. L.V. Bewley, Tensor Analysis of Electric
Circuits and Machines, Ronald Press, 1961.
4. H.W. Guggenheimer, Differential Geometry,
McGraw-Hill, 1963.
5. E.T. Whittaker, A Treatise on the Analytical
Dynamics of Particles and Rigid Bodies,
Cambridge University Press, 4th Edition, 1937.
6. D.C. White and H.H. Woodson, Electromechanical
Energy Conversion, Wiley, 1959.
7. D.M. Himmelblau (ed.), Decomposition of
Large-Scale Problems, North-Holland Publishing
Co. (American Elsevier Publishing Co.), 1973.
8. W. Jerkovsky, "The Transformation Operator
Approach to Multi-Body Spacecraft Dynamics",
Aerospace Corporation Report (rough draft),
(3 volumes), dated October 1974.
BIOGRAPHY
William Jerkovsky was born in Batschka Palanka,
Yugoslavia, in 1940. He received the B.S. degree
from Loyola University, Los Angeles, in 1962 and
the M.S. degree from the University of California,
Los Angeles, in 1965; both degrees are in physics.
He joined the Aerospace Corporation in 1972 where
he is currently involved in various aspects of
spacecraft dynamics and control. He was previously
in the controls department at TRW Systems,
Redondo Beach, and in the preliminary design
department at Garrett AiResearch, Torrance.
A NOTE ON THE NAGY-FOIAS
LOSSY AND LOSSLESS SPACE*
N. Levan, Department of System Science
4532 Boelter Hall, University of California
Los Angeles, California 90024
Abstract
We discuss in this paper the construction and properties of a Nagy-Foias space. The meanings and applications of such a space in systems and networks will be studied in some detail.
1. INTRODUCTION
A Nagy-Foias space is an abstract Hilbert space on
which a model of a contraction Hilbert space
operator can be constructed. Plainly speaking, a
model of an operator is another operator (or
operators) which is better understood, but at the
same time, has a lot more structure.
In what follows, we shall investigate the system
meanings and applications of a Nagy-Foias space,
as well as the applications of the Nagy-Foias model
theory in system and network realization.
2. THE NAGY-FOIAS SPACE
Let $H$ be a (separable) Hilbert space. The space
of analytic functions from the unit disc $|z| < 1$
to the vectors in $H$ is denoted by $H^2(H)$. Thus,
$H^2(H)$ consists of power series
$$f(z) = \sum_{n=0}^{\infty} f_n z^n, \quad |z| < 1, \quad f_n \in H, \quad \text{and} \quad \sum_{n=0}^{\infty} \|f_n\|_H^2 < \infty \qquad (2-1)$$
The inner product and norm in $H^2(H)$ are defined
in the usual way
$$[f, g] = \sum_{n=0}^{\infty} [f_n, g_n]_H \qquad (2-2)$$
and
$$\|f\|^2 = \sum_{n=0}^{\infty} \|f_n\|_H^2 \qquad (2-3)$$
The space $H^2(H)$ is actually a Hilbert space, and
furthermore it can be identified with the space
$L^2_+(H)$ [1], which consists of Fourier series with non-negative powers of $e^{it}$. For a function $f(z)$
in $H^2(H)$, there exists a boundary function $f(e^{it})$
in $L^2_+(H)$. We shall need the space $L^2(H)$, which
consists of Fourier series of all powers of $e^{it}$.
Clearly $L^2(H) = L^2_-(H) \oplus L^2_+(H)$, where $L^2_-(H)$
is the space of Fourier series of negative powers of $e^{it}$.
*This work was supported by the National Science Foundation under Grant number ENG 75-11876.
If we consider the $\ell^2(H)$ sequence $\{f_0, f_1, \ldots, f_n, \ldots\}$
as a discrete signal, starting from the
present time 0, then $f(z)$ is just its discrete
Fourier transform. In what follows we shall
regard $L^2(H)$ as the space of allowed signals over
all time, while $H^2(H)$ (or $L^2_+(H)$) is the subspace
of present-future signals and $L^2_-(H)$ is the
subspace of past signals.
Given two Hilbert spaces $H_1$ and $H_2$, a function
$\theta(z)$ from $|z| < 1$ to the operators (linear
bounded) from $H_1$ to $H_2$ is denoted by $\{\theta(z), H_1, H_2\}$.
Such a function is bounded when
$$\sup_{|z|<1} \|\theta(z)\| < \infty \qquad (2-4)$$
and is analytic when it has the power series
expansion
$$\theta(z) = \sum_{n=0}^{\infty} \theta_n z^n, \quad |z| < 1 \qquad (2-5)$$
where $\theta_n$ are operators from $H_1$ to $H_2$ [1].
To each analytic $\theta(z)$ we can have the boundary
function $\theta(e^{it})$ defined almost everywhere. It
then follows that the multiplication by $\theta(z)$ is
a map from $H^2(H_1)$ into $H^2(H_2)$ while the
multiplication by $\theta(e^{it})$ is a map from $L^2(H_1)$
into $L^2(H_2)$. Clearly, $\theta(z)$ can be regarded as
a system transfer function, and $\theta(e^{it})$ is a
system frequency function. In this paper we shall
consider those $\theta(z)$ which have a causal [2,3]
boundary function $\theta(e^{it})$. This can be best
explained as follows. If we represent $\theta(e^{it})$
by a matrix operator with respect to the
decomposition $L^2(H_i) = L^2_-(H_i) \oplus L^2_+(H_i)$, $i = 1, 2$,
$$\theta(e^{it}) = \begin{bmatrix} \theta_{11} & \theta_{12} \\ \theta_{21} & \theta_{22} \end{bmatrix} \qquad (2-6)$$
then clearly $\theta_{12}$ is a map from the present-future
input subspace $L^2_+(H_1)$ into the past output space $L^2_-(H_2)$; therefore, for $\theta(e^{it})$ to be
causal $\theta_{12}$ must be 0, that is, the function
$\theta(e^{it})$ is lower triangular [2,3]. Note that $\theta_{22}$ can actually be identified with the multiplication
by $\theta(z)$. In what follows we shall need
the adjoint $\theta(e^{it})^*$ which is clearly given by
$$\theta(e^{it})^* = \sum_{n=0}^{\infty} e^{-int} \theta_n^*$$
where $\theta_n^*$ is the adjoint map of $\theta_n$.
A bounded analytic function $\{\theta(z), H_1, H_2\}$
is said to be contractive if
$$\|\theta(z) h_1\|_{H_2} \le \|h_1\|_{H_1} \quad \text{for any } h_1 \text{ in } H_1 \qquad (2-7)$$
For a contractive $\theta$, we can define the positive self-adjoint
operator
$$\Delta(t) = [I - \theta(e^{it})^* \theta(e^{it})]^{1/2} \qquad (2-8)$$
It is clear that $\Delta(t)$ is bounded between 0 and
1, and the multiplication by $\Delta(t)$ is a map from $L^2(H_1)$ into $L^2(H_1)$. Let $u(t)$ be an element of
$L^2(H_1)$, then
$$\|\Delta(t)u(t)\|^2 = \|u(t)\|^2 - \|\theta(e^{it})u(t)\|^2 \qquad (2-9)$$
Thus if $u(t)$ is the incident voltage of a linear
network whose scattering operator [4] is $\theta(z)$ then
$\|\Delta(t)u(t)\|^2$ is just the net amount of energy
absorbed by the network. We comment that linear
passive networks are characterized by contractive
analytic functions $\theta$.
A contractive analytic function $\theta$ is said to be
inner if the map $\theta(e^{it})$ is an isometry;
consequently for an inner $\theta$, $\Delta(t) = 0$, and this
clearly corresponds to the case of a lossless
network.
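The energy balance (2-9) can be illustrated in the simplest scalar setting, where $H_1$ and $H_2$ are the complex numbers. The sketch below is only an illustration under that assumption: the two transfer functions (a constant-gain lossy $\theta$ and a Blaschke factor as an inner $\theta$) and the test input are hypothetical choices, and the integral over $t$ is approximated by a mean over a uniform grid.

```python
import numpy as np

# Uniform grid on the unit circle, z = e^{it}
t = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
z = np.exp(1j * t)

def defect(theta_vals):
    """Scalar version of Delta(t) = (1 - |theta(e^{it})|^2)^(1/2)."""
    return np.sqrt(np.clip(1.0 - np.abs(theta_vals) ** 2, 0.0, None))

def energy(f):
    """Grid approximation of (1/2pi) * integral of |f(t)|^2 dt."""
    return np.mean(np.abs(f) ** 2)

theta_lossy = 0.5 * z                    # contractive but not inner (hypothetical)
a = 0.3
theta_inner = (z - a) / (1.0 - a * z)    # Blaschke factor, |theta| = 1 on the circle

u = np.cos(3 * t) + 0.5j * np.sin(t)     # arbitrary test input

absorbed = energy(defect(theta_lossy) * u)        # ||Delta u||^2
balance = energy(u) - energy(theta_lossy * u)     # ||u||^2 - ||theta u||^2

print(absorbed, balance)
print(np.max(defect(theta_inner)))       # essentially zero: lossless case
```

For the inner example the defect vanishes up to rounding, which is the scalar counterpart of the lossless network described above.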
The Output - Energy Dissipation Space
Given a contractive analytic function $\{\theta(z), H_1, H_2\}$
we associate with it the following input
and output spaces: $L^2(H_1)$, $H^2(H_1)$, $L^2(H_2)$ and $H^2(H_2)$. We now define the space
$$\mathcal{H} = H^2(H_2) \oplus \overline{\Delta(t) L^2(H_1)} \qquad (2-10)$$
where $\oplus$ denotes the direct sum, and the bar stands
for the closure. Since $\Delta(t)$ is bounded below
by 0, elements of $\mathcal{H}$ are of the form $(v(z), \Delta(t)u(t))$
with $v(z)$ in $H^2(H_2)$ and $u(t)$ in $L^2(H_1)$. We shall refer to $\mathcal{H}$ as the output -
energy dissipation space. $\mathcal{H}$ is a Hilbert space
[1] whose inner product and norm are defined
in the usual way
$$[(v_1, \Delta u_1), (v_2, \Delta u_2)] = [v_1, v_2] + [\Delta u_1, \Delta u_2]$$
and
$$\|(v, \Delta u)\|^2 = \|v\|^2 + \|\Delta u\|^2$$
Note that for a pair $(v, \Delta u)$, the output $v$ need
not come from the input $u$; in fact both $v$ and $u$
are quite arbitrary. In $\mathcal{H}$ consider the subspace
of all present - future outputs which come
from present - future inputs, i.e., the subspace
$$M = \{(\theta(z)w(z), \Delta(t)w(e^{it})),\ w \text{ in } H^2(H_1)\}$$
We have
$$\|(\theta w, \Delta w)\|^2 = \|\theta w\|^2 + \|w\|^2 - \|\theta w\|^2 = \|w\|^2 \qquad (2-11)$$
Hence $M$ is a closed subspace of $\mathcal{H}$; as a
consequence, we can consider the orthogonal
complement in $\mathcal{H}$ of $M$
$$W = [H^2(H_2) \oplus \overline{\Delta(t) L^2(H_1)}] \ominus \{(\theta(z)w(z), \Delta(t)w(e^{it})),\ w \text{ in } H^2(H_1)\} \qquad (2-12)$$
$W$ is called the Nagy - Foias space generated by
$\{\theta(z), H_1, H_2\}$.
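We now explore further to see what are the
meanings and applications of this space to system
and network theory. Clearly $W$ is basically a
subspace of the output - "input" space $\mathcal{H}$, and
its elements must be such that
$$(v, \Delta u) \in W \iff (v, \Delta u) \perp (\theta w, \Delta w), \quad w \text{ in } H^2(H_1) \qquad (2-13)$$
or
$$(v, \Delta u) \in W \iff [\theta^* v + \Delta^2 u, w] = 0, \quad w \text{ in } H^2(H_1) \qquad (2-14)$$
Thus as a function in $L^2(H_1)$, $(\theta^* v + \Delta^2 u)$ must
be orthogonal to the subspace $H^2(H_1)$, that is
it can only be expanded into a Fourier series of
negative powers of $e^{it}$. Letting $P_i^+$ and $P_i^-$,
$i = 1, 2$, be the projection operators onto $L^2_+(H_i)$ and $L^2_-(H_i)$,
we can express (2-13) and (2-14) simply
as
$$P_1^+(\theta^* v + \Delta^2 u) = 0 \qquad (2-15)$$
Given an element $(v, \Delta u)$ of $W$, we can now
construct the element
$$(\hat v, \Delta u) = (\theta_{21} u_-, \Delta u_-) + (\theta_{22} u_+, \Delta u_+) \qquad (2-16)$$
where $u_- + u_+ = u$, and it follows readily from
(2-15) that $(\theta_{21} u_-, \Delta u_-)$ is in $M^{\perp}$ (that is, in $W$) while
$(\theta_{22} u_+, \Delta u_+)$ is in $M$. The element $\hat v$ is just the
response due to $u$. Using $(\hat v, \Delta u)$ we can write,
for any $(v, \Delta u)$ in $W$
$$(v, \Delta u) = (\hat v, \Delta u) + (v - \hat v, 0)$$
Consequently
$$(v, \Delta u) = (\theta_{21} u_-, \Delta u_-) + P_{M^{\perp}}(v - \hat v, 0) \qquad (2-17)$$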
In what follows we shall concentrate on two
special subspaces of $W$, which are denoted by
$M_1$ and $M_2$ and are defined by
$$M_1 = \text{closure}\ \{(\theta_{21} u_-, \Delta u_-),\ u_- \in L^2_-(H_1)\} \qquad (2-18)$$
and
$$M_2 = \text{closure}\ \{P_W(y, 0),\ y \text{ in } H^2(H_2)\} \qquad (2-19)$$
Plainly speaking $M_1$ is just the subspace of
present-future outputs which result entirely from
past inputs, while $M_2$ is the subspace of outputs
whose corresponding inputs are not prescribed
beforehand.
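3. THE LOSSLESS NAGY-FOIAS SPACE
For the lossless case $\Delta(t) = 0$, and therefore
$$W = H^2(H_2) \ominus \{\theta(z)w(z),\ w \in H^2(H_1)\} \qquad (3-1)$$
The subspace $M_1$ in this case can be simply
described as follows. Setting
$$\theta(z) = \theta_0 + \theta_1 z + \theta_2 z^2 + \cdots + \theta_n z^n + \cdots, \qquad u_- = \sum_{n=1}^{\infty} a_{-n} e^{-int} \qquad (3-2)$$
we have
$$\theta_{21} u_- = P_2^+(\theta(e^{it}) u_-) = (\theta_1 a_{-1} + \theta_2 a_{-2} + \cdots) \cdot 1 + (\theta_2 a_{-1} + \theta_3 a_{-2} + \cdots) e^{it} + \cdots \qquad (3-3)$$
Thus the matrix of $\theta_{21}$ with respect to the
orthonormal basis $\{1, e^{it}, e^{2it}, \ldots\}$ is
$$\begin{bmatrix} \theta_1 & \theta_2 & \theta_3 & \cdots \\ \theta_2 & \theta_3 & \theta_4 & \cdots \\ \theta_3 & \theta_4 & \theta_5 & \cdots \\ \vdots & \vdots & \vdots & \end{bmatrix} \qquad (3-4)$$
This infinite matrix is called the Hankel operator
$H_\theta$ generated by $\theta$. We conclude that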
(3-5)
The subspace $M_2$ in this case is all of $W$ since, by definition, $M_2$ is the closure of
$P_W\{y\}$ for any $y$ in $H^2(H_2)$.
Let $y$ be any element of $H^2(H_2)$; then from
(2-15), it is in $W$ if and only if $P_1^+(\theta^* y) = 0$
or equivalently
$$\theta_{22}^* y = 0 \qquad (3-6)$$
Furthermore, since $\theta$ is inner, it can be
readily verified that $\theta_{22}^* \theta_{21} = 0$. Thus,
those $y$ of $W$ are in the null space of $\theta_{22}^*$,
and in turn, are also in the range space of $\theta_{21}$.
We therefore have, for an inner $\theta$:
(3-7)
The subspace $W$ in this case is the family of
all present-future outputs due entirely to past
inputs; it can therefore be taken to be a
state space [5], and due to (3-7), the system in
this case is both controllable and observable.
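The Hankel matrix (3-4) can be checked against a direct computation in the scalar case. The following sketch assumes scalar coefficients $\theta_n$ and a finitely supported past input; the particular numbers and the helper name `hankel_block` are hypothetical. It builds a truncated Hankel block and compares its action with the non-negative Fourier coefficients of $\theta(e^{it})u_-$ obtained by polynomial multiplication.

```python
import numpy as np

def hankel_block(theta, rows, cols):
    """Truncated Hankel matrix with entry (j, n-1) = theta_{j+n}, n = 1..cols."""
    H = np.zeros((rows, cols))
    for j in range(rows):
        for n in range(1, cols + 1):
            if j + n < len(theta):
                H[j, n - 1] = theta[j + n]
    return H

theta = np.array([0.5, 0.25, 0.125, 0.0625])   # theta_0 .. theta_3 (hypothetical)
a_past = np.array([1.0, -2.0, 3.0])            # a_{-1}, a_{-2}, a_{-3} (hypothetical)

H = hankel_block(theta, rows=3, cols=3)
out_hankel = H @ a_past                        # coefficients of 1, e^{it}, e^{2it}

# Direct route: multiply the two Fourier series and keep powers >= 0.
# a_past[::-1] lists coefficients of e^{-3it}, e^{-2it}, e^{-it}, so index k of
# the convolution corresponds to the power k - 3.
conv = np.convolve(a_past[::-1], theta)
out_direct = conv[len(a_past):]

print(out_hankel)
print(out_direct)
```

Only $\theta_1, \theta_2, \ldots$ enter the block; $\theta_0$ contributes solely to negative powers of $e^{it}$ and is removed by the projection.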
4. THE LOSSY NAGY-FOIAS SPACE
We now consider the case in which $\theta$ is not inner;
this corresponds to a lossy passive network.
We shall briefly describe here the basic structures
of $M_1$ and $M_2$. For details, we refer to
[6].
In what follows we will need the definition of the
restricted shift operator. Plainly speaking the
shift operator on a space is the multiplication
by $z$. For the space $\mathcal{H}$, the shift operator
$S$ is just the multiplication by $z$ and by $e^{it}$, that is, $S(v, \Delta u) = (zv, e^{it}\Delta u)$.
Furthermore, it can be shown [1] that the subspace
$M$ is invariant under $S$, and hence $W$ is invariant under $S^*$. The restricted shift $T$
is a linear bounded operator from $W$ to $W$, defined by
$$T(v, \Delta u) = P_W(zv, e^{it}\Delta u), \quad (v, \Delta u) \text{ in } W \qquad (4-1)$$
It then follows that its adjoint is
$$T^*(v, \Delta u) = \left(\frac{v(z) - v(0)}{z},\ e^{-it}\Delta u\right) \qquad (4-2)$$
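We now set
$$\phi_0(z) = \theta(z), \qquad \phi_1(z) = \frac{\phi_0(z) - \phi_0(0)}{z}, \qquad \ldots, \qquad \phi_{n+1}(z) = \frac{\phi_n(z) - \phi_n(0)}{z}$$
then, clearly $\phi_n(0) = \theta_n$, the coefficient of $z^n$
in the power series expansion of $\theta(z)$. Next,
define $K_n$ by
$$K_n \alpha = [\phi_n, \Delta e^{-int}]\alpha \qquad (4-3)$$
Then clearly [6] for $\alpha$ in $H_1$
$$K_n \alpha = [\theta_{21} e^{-int}\alpha, \Delta e^{-int}\alpha] \qquad (4-4)$$
Thus $K_n \alpha$ is in $M_1$ for $n \ge 1$. Furthermore
it can be readily verified that
$$K_{n+1}\alpha = T^{*n} K_1 \alpha, \quad n \ge 1$$
We have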
Theorem 1
$$M_1 = \text{span}\ \{K_1\alpha, T^* K_1\alpha, \ldots, T^{*n} K_1\alpha, \ldots\} \qquad (4-5)$$
and therefore $K_1\alpha$ is a cyclic subspace of $T^*|M_1$,
the restriction of $T^*$ to $M_1$.
Similar results can be obtained for $M_2$ [6]. Indeed
we have
Theorem 2
$$M_2 = \text{span}\ \{k_0\beta, k_1\beta, k_2\beta, \ldots, k_n\beta, \ldots\}$$
where
and therefore $k_0\beta$ is a cyclic subspace of $T|M_2$.
For the lossy case, the space $W$ can again
be taken to be a state space, except that, in this
case, we also have the dissipated energy along
with the state. We note that for an element in
$M_1$, we have

which is just the energy stored in the system
due to input $u_-$ in the past. Thus, in some
sense, this shows some connection between state and
energy. The space $M_2$ can be thought of as the
space of "error states", since originally, for
a pair $(v, \Delta u)$ of $\mathcal{H}$, the output $v$ and input
$u$ are quite arbitrary.
We conclude with the comment that, since
$K_1\alpha$ and $k_0\beta$ are cyclic for $T^*|M_1$ and $T|M_2$
respectively, they will result in canonical
realizations of $\theta$. For a full treatment of this,
we refer to [6].
REFERENCES
1. B. Sz-Nagy and C. Foias, "Harmonic Analysis of Operators on Hilbert Space", North-Holland/American Elsevier, Amsterdam, New York, 1970.
2. W. A. Porter, "Some Circuit Theory Concepts Revisited", Int. J. on Control, Vol. 12, pp. 433-448, 1970.
3. R. Saeks, "Causality in Hilbert Space", SIAM Review, Vol. 12, pp. 357-383, 1970.
4. R. W. Newcomb, "Linear Multiport Synthesis", McGraw-Hill, New York, 1966.
5. A. V. Balakrishnan, "State Space Theory of Linear Time-Varying Systems", pp. 95-125 of "System Theory", L. A. Zadeh and E. Polak, Editors, McGraw-Hill, New York, 1969.
6. N. Levan, "Canonical Realizations of Transfer Operators", Proceedings 7th I.F.I.P. Conference on Optimization Techniques, Nice, September 8-13, 1975; Springer-Verlag 1975.
On output Control Problems Containing Input Derivatives
Victor Lovass-Nagy David L. Powers
Clarkson College of Technology Potsdam, New York 13676
Abstract
Given plant and control equations, either or both containing the derivative of the input, the problem is to find a control which drives the output along a prescribed path. In this article, a method is developed which avoids the Laplace transform and uses the concept of a matrix generalized inverse. Some criteria are found for existence of a solution, and techniques are given for simplifying the computation of the solution.
Consider the time-invariant control
system
$$\frac{dx}{dt} = Ax + Bu + B_1 \frac{du}{dt} \qquad (1)$$
$$y = Cx + Du + D_1 \frac{du}{dt} \qquad (2)$$
where $x$ is the (n×1) state vector, $u$ is
the (p×1) input vector and $y$ is the (q×1)
output vector. One important problem
in control theory is to determine an
input or control vector $u$ which forces
the output to be a prescribed function
of time. This problem (often called
output function observability or
functional reproducibility of output)
can be attacked by means of the Laplace
transform. The case where $B_1 = 0$ and
$D_1 = 0$ has been treated by several authors
[Brockett, 1970, p. 81; Wolovich, 1974,
p. 163]. In this note a method will be
developed which avoids the Laplace
transform and uses matrix generalized
inverses to determine inputs which drive
the output of the system (1), (2) along
a prescribed path. We will assume that
all functions are as differentiable as
necessary.
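Let $y$ be a given function of time.
It is desired to find a control $u(t)$
satisfying equations (1) and (2)
together with some initial conditions,
$x(0) = x_0$, $u(0) = u_0$.
First we rewrite the original problem
(1), (2) in the form
$$\frac{dx}{dt} - B_1 \frac{du}{dt} = Ax + Bu \qquad (3)$$
$$D_1 \frac{du}{dt} = -Cx - Du + y \qquad (4)$$
that is,
$$\begin{bmatrix} I & -B_1 \\ 0 & D_1 \end{bmatrix} \frac{d}{dt}\begin{bmatrix} x \\ u \end{bmatrix} = \begin{bmatrix} A & B \\ -C & -D \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix} + \begin{bmatrix} 0 \\ y \end{bmatrix} \qquad (5)$$
or
$$P \frac{dw}{dt} = Qw + f \qquad (6)$$
where
$$w = \begin{bmatrix} x \\ u \end{bmatrix}, \quad P = \begin{bmatrix} I & -B_1 \\ 0 & D_1 \end{bmatrix}, \quad Q = \begin{bmatrix} A & B \\ -C & -D \end{bmatrix}, \quad f = \begin{bmatrix} 0 \\ y \end{bmatrix}.$$
Now we treat equation (6) as an
algebraic equation in which $dw/dt$ is to be
found. A symbolic way to solve (6) is
by multiplying through by a conditional
inverse, $P^c$, of $P$, that is, any matrix
satisfying the equation $P P^c P = P$
[Graybill, 1969, p. 162]. $P^c$ is also
called a g-inverse [Rao and Mitra, 1971,
p. 21], or a {1}-inverse [Ben-Israel
and Greville, 1974, p. 8].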
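If equation (6) does indeed have an
algebraic solution, its general form is
$$\frac{dw}{dt} = P^c Q w + P^c f + (I - P^c P) h \qquad (7)$$
where $h$ is an arbitrary function of $t$.
In order to survey the legitimacy of the
passage from the equation (6) to equation
(7) it is convenient to specify the conditional
inverse
$$P^c = \begin{bmatrix} I & B_1 D_1^c \\ 0 & D_1^c \end{bmatrix} \qquad (8)$$
where $D_1^c$ is a conditional inverse of $D_1$
(i.e., any matrix satisfying the equation $D_1 D_1^c D_1 = D_1$). Hence we calculate
$$P P^c = \begin{bmatrix} I & 0 \\ 0 & D_1 D_1^c \end{bmatrix}$$
It is clear that if $P P^c Q = Q$ and
$P P^c f = f$, then equation (7) implies
equation (6). In terms of the blocks
of the partitioned matrices, these conditions
are equivalent to requiring that
$$D_1 D_1^c C = C, \qquad D_1 D_1^c D = D, \qquad D_1 D_1^c y = y$$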
Certainly these conditions are fulfilled
if $D_1$ (and hence $P$) is of full row rank,
for then $D_1 D_1^c = I$. However, even when
these conditions are not fulfilled, it
may happen that
$$(I - P P^c)(Qw + f) = 0$$
for an appropriate choice of the function
$h$. In terms of smaller matrices, this
means that
$$(I - D_1 D_1^c)(Cx + Du - y) = 0. \qquad (9)$$
This condition, of course, can be checked
only a posteriori.
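The role of the conditional inverse can be made concrete numerically. The sketch below is an illustration under stated assumptions: it takes $P$ with block rows $(I, -B_1)$ and $(0, D_1)$, uses the Moore-Penrose inverse of $D_1$ as one admissible conditional inverse $D_1^c$, and assembles one block choice of $P^c$ from it. The matrix sizes and entries are hypothetical; $D_1$ is deliberately rank deficient so that $D_1 D_1^c \ne I$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 3, 2, 2                          # state, input, output dimensions (hypothetical)

B1 = rng.standard_normal((n, p))
D1 = np.array([[1.0, 2.0], [2.0, 4.0]])    # rank 1, so not of full row rank

# The Moore-Penrose inverse is one particular conditional ({1}-) inverse.
D1c = np.linalg.pinv(D1)

P = np.block([[np.eye(n), -B1],
              [np.zeros((q, n)), D1]])

# Assumed block form of a conditional inverse of P built from D1c.
Pc = np.block([[np.eye(n), B1 @ D1c],
               [np.zeros((p, n)), D1c]])

PPc = P @ Pc
expected_PPc = np.block([[np.eye(n), np.zeros((n, q))],
                         [np.zeros((q, n)), D1 @ D1c]])

print(np.allclose(P @ Pc @ P, P))          # defining property P Pc P = P
print(np.allclose(PPc, expected_PPc))      # P Pc = diag(I, D1 D1c)
print(np.allclose(D1 @ D1c, np.eye(q)))    # False here: D1 lacks full row rank
```

When the last check fails, the conditions $PP^cQ = Q$ and $PP^cf = f$ are no longer automatic, which is exactly the situation in which condition (9) has to be verified after the fact.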
The arbitrary function $h$ appearing
in equation (7) deserves some additional
comment. If we partition $h$ in the same
way as $w$,
$$h = \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}$$
then we may calculate
$$(I - P^c P) h = \begin{bmatrix} B_1 (I - D_1^c D_1) h_2 \\ (I - D_1^c D_1) h_2 \end{bmatrix}$$
Evidently, the components of $h_1$ do not
even enter into equation (7) nor its
solution. Of the components of $h_2$,
some may be needed to satisfy the condition
(9). Any remaining components
may be used to minimize a cost functional
if desired.
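Let us now concentrate on the
solution of the initial value problem
$$\frac{dw}{dt} = P^c Q w + P^c f + (I - P^c P) h, \qquad w(0) = w_0 \qquad (10)$$
Setting $M = P^c Q$, the solution of the
initial value problem (10) is
$$w(t) = \exp(Mt)\, w(0) + \int_0^t \exp(M(t-\tau))\, P^c f(\tau)\, d\tau + \int_0^t \exp(M(t-\tau))\, (I - P^c P) h(\tau)\, d\tau \qquad (11)$$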
Since $P$ and $Q$ usually have more
columns than rows, the matrix $M = P^c Q$
will have rank less than its order. It
is then convenient to introduce the matrix
$N = Q P^c$. If we use the power-series
definition of the exponential, we find
$$\exp(Mt) = I + P^c Q t + P^c Q P^c Q \frac{t^2}{2!} + \cdots = I + P^c \left(t + Q P^c \frac{t^2}{2!} + \cdots\right) Q = I + P^c F(Nt) Q$$
where the function
$$F(Nt) = t + \frac{N t^2}{2!} + \frac{N^2 t^3}{3!} + \cdots = \int_0^t \exp(N\tau)\, d\tau$$
Moreover, by a similar manipulation,
$\exp(Mt) P^c = P^c \exp(Nt)$.
By these means, we may recast the
terms of the solution (11), as follows.
$$\int_0^t \exp(M(t-\tau))\, P^c f(\tau)\, d\tau = P^c \int_0^t \exp(N(t-\tau))\, f(\tau)\, d\tau$$
$$\exp(Mt)\, w(0) = (I + P^c F(Nt) Q)\, w(0)$$
Finally we arrive at the solution
$$w(t) = (I + P^c F(Nt) Q)\, w(0) + P^c \int_0^t \exp(N(t-\tau))\, f(\tau)\, d\tau + \int_0^t (I + P^c F(N(t-\tau)) Q) \begin{bmatrix} B_1 \\ I \end{bmatrix} \rho(\tau)\, d\tau$$
where $\rho(t) = (I - D_1^c D_1) h_2(t)$.
Here, everything is expressed in terms
of functions of $N$ only. We make note
of two additional facts: first that
$$\exp(Nt) = I + N F(Nt); \qquad (12)$$
and secondly, if $N$ is nonsingular,
$$F(Nt) = N^{-1}(\exp(Nt) - I). \qquad (13)$$
Thus, only one of the two functions $F(Nt)$
and $\exp(Nt)$ need be calculated.
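The identities relating $\exp(Mt)$, $\exp(Nt)$ and $F(Nt)$ are purely algebraic and can be checked numerically. The sketch below uses truncated power series rather than a library matrix exponential, which is adequate only for the small, well-scaled matrices chosen here. The matrices `P` and `Q` are hypothetical (they are not the ones of the paper's example), and `expm_series` and `F_series` are assumed helper names.

```python
import numpy as np

def expm_series(X, terms=60):
    """Truncated power series for exp(X); fine for small, well-scaled X."""
    out = np.eye(X.shape[0])
    term = np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        out = out + term
    return out

def F_series(N, t, terms=60):
    """F(Nt) = t I + N t^2/2! + N^2 t^3/3! + ... (truncated)."""
    out = np.zeros_like(N)
    term = t * np.eye(N.shape[0])
    for k in range(1, terms):
        out = out + term
        term = term @ N * t / (k + 1)
    return out

# Hypothetical 3 x 4 data: more columns than rows, as in the text.
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
Q = np.array([[-1.0, 0.0, 1.0, 0.0],
              [2.0, -2.0, 0.0, 1.0],
              [0.0, -1.0, -1.0, 0.0]])

Pc = np.linalg.pinv(P)      # one conditional inverse of P
M = Pc @ Q                  # 4 x 4, singular
N = Q @ Pc                  # 3 x 3
t = 0.5
F = F_series(N, t)

print(np.allclose(expm_series(M * t), np.eye(4) + Pc @ F @ Q))
print(np.allclose(expm_series(M * t) @ Pc, Pc @ expm_series(N * t)))
print(np.allclose(expm_series(N * t), np.eye(3) + N @ F))                     # (12)
print(np.allclose(F, np.linalg.solve(N, expm_series(N * t) - np.eye(3))))     # (13)
```

The last check relies on this particular `N` being nonsingular; for a singular `N` only the series form of $F(Nt)$ is available.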
Example. Since the derivative of the
input vector u can be removed from one
(not both) of the equations (1), (2),
[Balabanian and Bickart, 1969, p. 245]
[a,l[ [:j That is,
Then
p
Q
[-1 O~-
A = 2 -2
c = [0,1]
we find that
~ 0 0
U 1 0 0 1
o -2 -1
1 1 0
°0 0 °1
[~ 0
pC 1 0 0
-~
0 0
a J 0 0 0 0 -1/2 1/2 0 1/2 -1/2
0 1 -2 1 -1/2 0 -1/2 0
N 0
~/2J -2 -1
The equation that one solves is
where
dw at
pC f = ~/2YJ and
li/2Y
y.
[1,1] •
0 0 1/2 1/2_
-u
o
or (I-pCP1h ~ rn. where r ~[_~] and n i,
an arbitrary scalar function of t.
Note also these features: $P$ is of full row
rank (rank $P = 3$) so that $P P^c Q = Q$ and
$P P^c f = f$ for any $Q$ and $f$; also the
matrix $N = Q P^c$ is nonsingular, 3×3, while
$M = P^c Q$ is singular, 4×4.
Our solution, in terms of $N$, is
$$w(t) = (I + P^c N^{-1}(\exp(Nt) - I) Q)\, w(0) + P^c \int_0^t \exp(N(t-\tau))\, f(\tau)\, d\tau + \int_0^t (I + P^c N^{-1}(\exp(N(t-\tau)) - I) Q)\, r\, \eta(\tau)\, d\tau.$$
Since $N$ in this example is a diagonalizable
matrix, the exponentials are
relatively easy to compute. The
arbitrary function $\eta$ may be used to
minimize some performance index, if
desired.
References
1. Balabanian, N., and T.A. Bickart,
Electrical Network Theory (New York:
Wiley) 1969.
2. Ben-Israel, A., and T.N.E. Greville,
Generalized Inverses, Theory and
Applications (New York: Wiley) 1974.
3. Brockett, R.W., Finite Dimensional
Linear Systems (New York: Wiley) 1970.
4. Graybill, F.A., Introduction to
Matrices with Applications in
Statistics (Belmont, California:
Wadsworth) 1969.
5. Rao, C.R., and Mitra, S.K.,
Generalized Inverse of Matrices
and its Applications (New York:
Wiley) 1971.
6. Wolovich, W.A., Linear Multivariable
Systems (New York: Springer) 1974.
Victor Lovass-Nagy received B.S.
and M.S. degrees in Electrical Engineering
and the Ph.D. in Mathematics from the
University of Technical Sciences in
Budapest and later taught there in the
Faculty of Mathematics. After four years
as an engineer in the Ganz Electrical
Works, Budapest, and two years as Reader
in Engineering Mathematics at the
University of Khartoum, he came to
Clarkson College of Technology, where he
is now Professor of Mathematics. His
research interests are: theory and
applications of matrices, systems science,
network theory and control theory.
David L. Powers received B.S. and M.S. degrees in Mechanical Engineering
from Carnegie Institute of Technology
and Ph.D. in Mathematics from the
University of Pittsburgh. After two
years at Universidad Santa Maria in
Valparaiso Chile, he joined the
Department of Mathematics at Clarkson
College of Technology, where he now is an
Associate Professor. His research
interests include: control theory, matrix
theory and numerical analysis.
AN EXPLICIT TREATMENT OF DILATION THEORY
by
P. Masani
University of Pittsburgh, Pittsburgh, Pa. 15260
Abstract
In this paper W-to-W* operator-valued positive definite
kernels K(·,·) are defined, where W is a Banach space, and the
Moore-Aronszajn Reproducing Kernel Theorem extended to such K(·,·).
Congruent Hilbertian varieties X(·) whose covariance kernel is
K(·,·) are thereby obtained. The general notion of a propagator
or controller of X(·) is introduced, and necessary and sufficient
conditions established for its existence. It is shown that if
dilations are redefined in terms of isometries rather than projections
(as seems more natural), the dilation of a given
operator-valued function R(·) is precisely the propagator of a
Hilbertian variety whose covariance kernel K(·,·) is that obtained
from R(·) by the methods of Halmos and Nagy. Dilation
Theorems are thus rendered explicit, and their method of proof
routinized.
Characterizations of operations derived
from network connections
Katsuyoshi NISHIO
Department of Information Engineering, Faculty of Engineering
Ibaraki University, Hitachi, Ibaraki, Japan
Tsuyoshi ANDO
Research Institute of Applied Electricity,
Hokkaido University, Sapporo, Japan
1. Introduction
Through study of electrical network connections, Anderson
and Duffin [2] introduced the concept of parallel sum of two Hermitian semi-definite matrices, and subsequently Anderson [1]
defined a matrix operation, called shorted operation to a subspace,
for each Hermitian semi-definite matrix. If $A$ and $B$ are impedance
matrices of two resistive n-port networks then their parallel
sum $A : B$ is the impedance matrix of the parallel connection.
If ports are partitioned to a group of $s$ ports and to the remaining
group of $n-s$ ports, then the shorted matrix $A_M$ to the
subspace $M$ spanned by the former group is the impedance matrix
of the network obtained by shorting the last $n-s$ ports.
Parallel addition and shorted operation can be defined on the
class of all bounded positive linear operators on a Hilbert space
and are of great interest from the point of view of operator theory.
In fact, Anderson and Trapp [3] pushed through this program and
[...] interconnections.
Our purpose in this paper is to give some characterizations
of parallel addition and shorted operation among operations on the
class of all positive operators on a Hilbert space. Theorem 1
will show that the series-parallel inequality and the transformer
inequality are, in some sense, characteristic for parallel addition.
In Theorem 2 shorted operation is recovered through commutativity
with parallel addition, while Theorem 3 will characterize shorted
operation in terms of some inequalities of concave type. In the
final section we make some comment on range inclusion relations
associated with shorted operation and parallel addition.
2. Preliminaries
In this paper we shall be concerned with (bounded linear)
operators on a complex Hilbert space $H$. The range of an operator $T$
will be denoted by ran($T$). A Hermitian operator $A$ will be called
positive if the quadratic form $(Ax, x) \ge 0$ for all $x \in H$. For two
Hermitian operators $A$ and $B$, we denote $A \ge B$ if $A - B$ is
positive. The unique positive square root of a positive operator
$A$ will be denoted by $A^{1/2}$. A projection will always mean a
Hermitian projection. Convergence is understood as strong convergence
unless convergence in norm is mentioned explicitly.
Given a closed subspace $M$ of $H$, the shorted operator $A_M$ of a
positive operator $A$ to $M$ is defined as the maximum of all
positive operators $B$ such that $B \le A$ and ran($B$) $\subseteq M$. The existence
of such maximum is guaranteed by Anderson and Trapp [3, Theorem 1], but it was pointed out far earlier by Krein [7,
Theorem 1]. The operation $A \mapsto A_M$ will be called the shorted
operation to $M$. Anderson and Trapp [3, Theorem 6] showed that
shorted operation admits the following variational description
(minimum power principle):
$$(A_M x, x) = \inf_{y \in M^{\perp}} (A(x+y), x+y) \quad \text{for } x \in H, \qquad (1)$$
where $M^{\perp}$ is the ortho-complement of $M$.
Given two positive operators $A$ and $B$, we observe the
operator on direct sum $H \oplus H$ with (operator) matrix $\begin{pmatrix} A & A \\ A & A+B \end{pmatrix}$.
Now the parallel sum $A : B$ is defined as the restriction of the
shorted operator $\begin{pmatrix} A & A \\ A & A+B \end{pmatrix}_{H \oplus \{0\}}$ to the subspace $H \oplus \{0\}$,
identified with $H$ itself. Then we have by (1)
$$((A:B)x, x) = \inf\{(Ay, y) + (Bz, z) : x = y + z\} \quad \text{for } x \in H. \qquad (2)$$
The binary operation $(A, B) \mapsto A : B$ will be called parallel
addition. The following formula, derived from (2), is often of use:
$$A : B = (A^{-1} + B^{-1})^{-1} \quad \text{if } A \text{ and } B \text{ have bounded inverse.} \qquad (3)$$
We can see immediately from the definition that shorted operation
has the following properties (cf. [3]):
(i) $A_M \le A$,
(ii) $(\alpha A)_M = \alpha A_M$ for $\alpha > 0$,
(iii) $(A_M)_M = A_M$,
(iv) $A_M + B_M \le (A+B)_M$.
Correspondingly, we can derive immediately from (2) that parallel
addition satisfies the following conditions (cf. [3]):
(I) $A : B = B : A$,
(II) $(A : B) : C = A : (B : C)$,
(III) $(\alpha A) : (\alpha B) = \alpha (A : B)$ for $\alpha > 0$,
(IV) $A : A = \frac{1}{2} A$,
(V) $A : B + C : D \le (A + C) : (B + D)$,
(VI) $T^*(A : B)T \le T^*AT : T^*BT$ for every operator $T$,
(VII) If $A_n$ (resp. $B_n$) converges decreasingly to $A$ (resp. $B$)
then so does $A_n : B_n$ to $A : B$.
connection is greater than that of parallel-series connection.
T*AT is the impedance when a transformer is connected. Therefore
(VI) means that the impedance of parallel connection with trans-
"
former first is greater than that w~th transformer last •.
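For matrices, several of the conditions above can be spot-checked numerically. The sketch below is an illustration, not a proof: it assumes the matrix formula $A : B = A(A+B)^{+}B$ (with the Moore-Penrose inverse) for the parallel sum of positive semi-definite matrices, uses randomly generated hypothetical data, and tests inequalities only up to a small eigenvalue tolerance. The names `parallel_sum`, `is_psd`, and `random_pd` are assumed helpers.

```python
import numpy as np

def parallel_sum(A, B):
    """Parallel sum A : B = A (A + B)^+ B of positive semi-definite matrices."""
    return A @ np.linalg.pinv(A + B) @ B

def is_psd(X, tol=1e-8):
    """True when the symmetric part of X has no eigenvalue below -tol."""
    return np.linalg.eigvalsh((X + X.T) / 2).min() >= -tol

rng = np.random.default_rng(1)

def random_pd(n):
    """Random positive definite matrix (hypothetical test data)."""
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

A, B, C, D = (random_pd(3) for _ in range(4))

# A singular "transformer" T, so that (VI) is a genuine inequality.
T = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0]])

formula_3 = np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))

checks = {
    "(3)": np.allclose(parallel_sum(A, B), formula_3),
    "(I)": np.allclose(parallel_sum(A, B), parallel_sum(B, A)),
    "(IV)": np.allclose(parallel_sum(A, A), A / 2),
    "(V)": is_psd(parallel_sum(A + C, B + D)
                  - parallel_sum(A, B) - parallel_sum(C, D)),
    "(VI)": is_psd(parallel_sum(T.T @ A @ T, T.T @ B @ T)
                   - T.T @ parallel_sum(A, B) @ T),
}
print(checks)
```

The pseudoinverse is used so that the same function also covers the singular matrices $T^*AT$ and $T^*BT$ appearing in the transformer inequality.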
The important interconnections between parallel addition and
shorted operation were established by Anderson and Trapp [3,
Theorem 12]:
$$A : \alpha P \text{ converges to } A_M \text{ in norm as } \alpha \to \infty, \qquad (4)$$
where $P$ is the projection to the subspace $M$. In particular,
since $A_H = A$ for all $A$ by definition,
(VIII) $A : \alpha$ converges to $A$ in norm as $\alpha \to \infty$.
An important consequence of (4) is the commutativity of parallel
addition and shorted operation:
$$(A : B)_M = A_M : B = A : B_M. \qquad (5)$$
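In finite dimensions the limit (4) and the minimum power principle (1) can be observed directly. The sketch below assumes a positive definite matrix $A$ and a subspace $M$ spanned by the first $s$ coordinates, in which case the shorted matrix is the Schur complement padded with zeros; the data are hypothetical. The parallel sum is computed here as $A - A(A+B)^{-1}A$, which agrees with $A(A+B)^{-1}B$ whenever $A + B$ is invertible and avoids multiplying rounding errors by a large $\alpha$.

```python
import numpy as np

def shorted(A, s):
    """Shorted matrix of A to the span of the first s coordinates (Schur complement)."""
    A11, A12 = A[:s, :s], A[:s, s:]
    A21, A22 = A[s:, :s], A[s:, s:]
    S = np.zeros_like(A)
    S[:s, :s] = A11 - A12 @ np.linalg.pinv(A22) @ A21
    return S

def parallel_sum(A, B):
    """A : B written as A - A (A + B)^{-1} A; requires A + B invertible."""
    return A - A @ np.linalg.solve(A + B, A)

rng = np.random.default_rng(2)
X = rng.standard_normal((4, 4))
A = X @ X.T + np.eye(4)                  # hypothetical positive definite matrix
s = 2
P = np.diag([1.0, 1.0, 0.0, 0.0])        # projection onto M
A_M = shorted(A, s)

# Norm distance between A : (alpha P) and A_M for growing alpha, as in (4)
errors = [np.linalg.norm(parallel_sum(A, alpha * P) - A_M, 2)
          for alpha in (1e2, 1e4, 1e6)]
print(errors)

# Minimum power principle (1) for a vector x in M
x = np.array([1.0, -2.0, 0.0, 0.0])
y_opt = np.concatenate([np.zeros(s),
                        -np.linalg.solve(A[s:, s:], A[s:, :s] @ x[:s])])
power_min = (x + y_opt) @ A @ (x + y_opt)
print(power_min, x @ A_M @ x)
```

The minimizing `y_opt` lies in the orthogonal complement of $M$, and any other choice of $y$ there gives a quadratic form at least as large.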
3. Parallel addition
In the preceding section we pointed out that parallel addition
satisfies conditions (I) to (VIII). Our main result in this section is to show that these conditions single out parallel addition. To
this end, the following Lemma on functional equation plays a basic
role.
Lemma (Bohnenblust [4, Theorem 4.1]). Let $\Phi$ be a function
defined on $[0, \infty) \times [0, \infty)$ with value in $[0, \infty)$. If $\Phi$ satisfies
the following conditions:
(A) $\Phi(\alpha, \beta) = \Phi(\beta, \alpha)$,
(B) $\Phi(\Phi(\alpha, \beta), \gamma) = \Phi(\alpha, \Phi(\beta, \gamma))$,
(C) $\Phi(\gamma\alpha, \gamma\beta) = \gamma\Phi(\alpha, \beta)$ for $\gamma > 0$,
(D) $\Phi(\alpha, \beta) \le \Phi(\alpha', \beta')$ for $\alpha \le \alpha'$ and $\beta \le \beta'$,
(E) $\Phi(1, 0) = 1$,
then either $\Phi$ has the form
$$\Phi(\alpha, \beta) = \max(\alpha, \beta) \quad \text{for } \alpha, \beta \ge 0,$$
or there is a constant $0 < p < \infty$ such that
$$\Phi(\alpha, \beta) = (\alpha^p + \beta^p)^{1/p} \quad \text{for } \alpha, \beta \ge 0.$$
In particular, if $\Phi(1, 1) = 2$ in addition, then
$$\Phi(\alpha, \beta) = \alpha + \beta \quad \text{for } \alpha, \beta \ge 0.$$
Theorem 1. A binary operation $(A,B) \mapsto A\circ B$ on the class of all positive operators turns out to be parallel addition, i.e. $A\circ B = A:B$ for all positive operators $A$ and $B$, if and only if it satisfies the following conditions:
(I) $A\circ B = B\circ A$,
(II) $(A\circ B)\circ C = A\circ(B\circ C)$,
(III) $(\alpha A)\circ(\alpha B) = \alpha(A\circ B)$ for $\alpha > 0$,
(IV) $A\circ A = \tfrac12 A$,
(V) $A\circ B + C\circ D \le (A+C)\circ(B+D)$,
(VI) $T^*(A\circ B)T \le (T^*AT)\circ(T^*BT)$ for every operator $T$,
(VII) if $A_n$ (resp. $B_n$) converges decreasingly to $A$ (resp. $B$), then so does $A_n\circ B_n$ to $A\circ B$,
(VIII) $A\circ\alpha$ converges to $A$ in norm as $\alpha \to \infty$.
Proof. Suppose that a binary operation $(A,B) \mapsto A\circ B$ satisfies conditions (I) to (VIII). Let us show that if a projection $P$ commutes with $A$ and $B$ then $P$ commutes with $A\circ B$, and more precisely
(6)
In fact, since commutativity implies
$$A = PAP + (1-P)A(1-P) \quad\text{and}\quad B = PBP + (1-P)B(1-P),$$
it follows from (V) and (VI) that
$$A\circ B \ge PAP\circ PBP + (1-P)A(1-P)\circ(1-P)B(1-P) \ge P(A\circ B)P + (1-P)(A\circ B)(1-P).$$
Therefore the operator
$$C = A\circ B - P(A\circ B)P - (1-P)(A\circ B)(1-P)$$
is positive with
$$PCP = (1-P)C(1-P) = 0.$$
Then by positivity of $C$ we conclude that
$$CP = C(1-P) = 0, \quad\text{hence}\quad C = 0,$$
which obviously implies the expected commutativity of $P$ and $A\circ B$.
Since scalars commute with all projections, it follows from the above that for each pair $\alpha$, $\beta$ the operator $\alpha\circ\beta$ commutes with all projections; consequently $\alpha\circ\beta$ must be a scalar. We consider the scalar function $\Phi$ of two positive variables, defined by
$$\Phi(\alpha,\beta) = (\alpha^{-1}\circ\beta^{-1})^{-1} \quad\text{for } \alpha,\beta > 0, \tag{7}$$
which is meaningful because by (IV) and (V)
$$\alpha^{-1}\circ\beta^{-1} \ge \tfrac12\min(\alpha^{-1},\beta^{-1}) > 0.$$
Now it follows from (I), (II), (III) and (V) that
(A) $\Phi(\alpha,\beta) = \Phi(\beta,\alpha)$,
(B) $\Phi(\Phi(\alpha,\beta),\gamma) = \Phi(\alpha,\Phi(\beta,\gamma))$,
(C) $\Phi(\gamma\alpha,\gamma\beta) = \gamma\,\Phi(\alpha,\beta)$ for $\gamma > 0$,
(D) $\Phi(\alpha,\beta) \le \Phi(\alpha',\beta')$ for $\alpha \le \alpha'$ and $\beta \le \beta'$.
On the basis of (A) and (D) we can extend $\Phi$ over $[0,\infty)\times[0,\infty)$ by the formula
$$\Phi(\alpha,0) = \Phi(0,\alpha) = \lim_{\beta\to 0}\Phi(\alpha,\beta),$$
and
$$\Phi(0,0) = 0.$$
The extended function satisfies also conditions (A) and (D), and
(E) $\Phi(1,0) = 1$,
because by (VIII)
$$\Phi(1,0) = \lim_{n\to\infty}\Phi(1,1/n) = \lim_{n\to\infty}(1\circ n)^{-1} = 1.$$
Now since $\Phi$ satisfies conditions (A) to (E) and $\Phi(1,1) = 2$ by (IV), it follows from the Lemma of Bohnenblust that
$$\Phi(\alpha,\beta) = \alpha + \beta \quad\text{for } \alpha,\beta \ge 0. \tag{8}$$
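Since by (3)
$$\alpha:\beta = (\alpha^{-1}+\beta^{-1})^{-1} \quad\text{for } \alpha,\beta > 0,$$
(7) and (8) imply
$$\alpha\circ\beta = \alpha:\beta \quad\text{for } \alpha,\beta > 0,$$
and consequently by continuity
$$\alpha\circ\beta = \alpha:\beta \quad\text{for } \alpha,\beta \ge 0. \tag{9}$$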
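The scalar step can be written out explicitly. The lines below are a worked sketch, reading (7) as $\Phi(\alpha,\beta) = (\alpha^{-1}\circ\beta^{-1})^{-1}$; that reading is an assumption, chosen because it agrees with the identity $\Phi(1,1/n) = (1\circ n)^{-1}$ used in the proof.

```latex
\begin{align*}
\alpha\circ\beta
  &= \Phi(\alpha^{-1},\beta^{-1})^{-1}
   = (\alpha^{-1}+\beta^{-1})^{-1}
   = \frac{\alpha\beta}{\alpha+\beta}
   = \alpha:\beta
  && (\alpha,\beta>0),\\
2\circ 3 &= \tfrac{6}{5},
\qquad
\alpha\circ\alpha = \tfrac{\alpha}{2}
  && \text{(in agreement with (IV)).}
\end{align*}
```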
For a positive operator $A$, in view of the spectral theorem (cf. [8, § 107]), there exists a sequence $\{A_n\}$ such that $A_1 \ge A_2 \ge \dots$ and $A_n$ converges to $A$, and such that each $A_n$ has the form
$$A_n = \sum_{i=1}^{N}\alpha_{n,i}P_{n,i},$$
where $\alpha_{n,i} \ge 0$ ($i = 1, 2, \dots, N = N(n)$) and $P_{n,i}$ are projections such that $P_{n,i}P_{n,j} = 0$ ($i \ne j$) and $\sum_{i=1}^{N}P_{n,i} = 1$. Since $A_n$ commutes with each $P_{n,i}$ we have by (6) and (9)
$$A_n\circ 1 = \sum_{i=1}^{N}A_nP_{n,i}\circ P_{n,i} = \sum_{i=1}^{N}\alpha_{n,i}P_{n,i}\circ P_{n,i} = \sum_{i=1}^{N}(\alpha_{n,i}\circ 1)P_{n,i} = \sum_{i=1}^{N}(\alpha_{n,i}:1)P_{n,i},$$
and correspondingly
$$A_n:1 = \sum_{i=1}^{N}(\alpha_{n,i}:1)P_{n,i},$$
hence
$$A_n\circ 1 = A_n:1 \quad (n = 1, 2, \dots). \tag{10}$$
Now it follows from (10) and (VII) that
$$A\circ 1 = \lim_{n\to\infty}A_n\circ 1 = \lim_{n\to\infty}A_n:1 = A:1. \tag{11}$$
To reach the final goal, we remark that if a positive operator $C$ has bounded inverse then
$$C(A\circ B)C = (CAC)\circ(CBC). \tag{12}$$
In fact, (VI) implies
$$C(A\circ B)C \le (CAC)\circ(CBC),$$
and
$$C^{-1}\{(CAC)\circ(CBC)\}C^{-1} \le A\circ B.$$
Now for arbitrary positive operators $A$ and $B$, since $(B+1/n)^{1/2}$ has bounded inverse for each $n > 0$, we have by (11), (12) and (VII)
$$A\circ B = \lim_{n\to\infty}A\circ(B+1/n)$$
$$= \lim_{n\to\infty}(B+1/n)^{1/2}\{(B+1/n)^{-1/2}A(B+1/n)^{-1/2}\circ 1\}(B+1/n)^{1/2}$$
$$= \lim_{n\to\infty}A:(B+1/n) = A:B.$$
This completes the proof.
Remark. In the proof of Theorem 1 conditions (I), (II), (III), (IV) and (VIII) are used only for scalars, and condition (VI) only for positive $T$.
4. Shorted operation
In § 2 parallel addition is defined in terms of shorted operation, and the latter is recaptured from the former by (4). We shall first characterize shorted operation through commutativity with parallel addition.
Theorem 2. An operation $\mathcal{X}(\cdot)$ on the class of positive operators coincides with shorted operation to a closed subspace $M$, i.e. $\mathcal{X}(A) = A_M$ for every positive operator $A$, if and only if it satisfies the following conditions:
(a) $\mathcal{X}(\alpha A) = \alpha\,\mathcal{X}(A)$,
(b) $\mathcal{X}(A:B) = \mathcal{X}(A):B = A:\mathcal{X}(B)$.
Proof. In § 2 we pointed out that shorted operation satisfies (a) and (b).
Suppose that an operation $\mathcal{X}(\cdot)$ satisfies (a) and (b). Then we have
$$1:n\mathcal{X}(1) = 1:\mathcal{X}(n) = \mathcal{X}(1:n) = \mathcal{X}(1):n. \tag{13}$$
Since by (VIII) the sequence $\mathcal{X}(1):n$ converges to $\mathcal{X}(1)$ in norm as $n \to \infty$, $1:n\mathcal{X}(1)$ converges increasingly to $\mathcal{X}(1)$.
In view of the spectral theorem (cf. [8, § 107]) $\mathcal{X}(1)$ admits the spectral representation
$$\mathcal{X}(1) = \int_0^\infty \lambda\,dP(\lambda),$$
where $P(\lambda)$ are projections such that $P(\lambda) \le P(\mu)$ for $\lambda \le \mu$,
$$\lim_{\mu\to\lambda+}P(\mu) = P(\lambda) \quad\text{and}\quad \lim_{\lambda\to\infty}P(\lambda) = 1.$$
Then we have by the spectral representation
$$\mathcal{X}(1) \ge \lambda(1-P(\lambda)) \quad\text{for } \lambda > 0. \tag{14}$$
It follows from (4), (13) and (14) that
$$\mathcal{X}(1) = \lim_{n\to\infty}1:n\mathcal{X}(1) \ge \lim_{n\to\infty}1:n\lambda(1-P(\lambda)) = 1-P(\lambda),$$
and consequently
$$\mathcal{X}(1) \ge 1 - P(0). \tag{15}$$
On the other hand, since $\mathrm{ran}(\mathcal{X}(1))$ is contained in $M = \mathrm{ran}(1-P(0))$, we have
$$\mathcal{X}(1) \le \|\mathcal{X}(1)\|\cdot(1-P(0)). \tag{16}$$
Now for each positive operator $A$ it follows from (a) and (b) combined with (4), (15) and (16) that
$$\mathcal{X}(A) = \lim_{n\to\infty}\mathcal{X}(A):n = \lim_{n\to\infty}A:n\mathcal{X}(1) \ge \lim_{n\to\infty}A:n(1-P(0)) = A_M,$$
and similarly
$$\mathcal{X}(A) \le \lim_{n\to\infty}A:n\|\mathcal{X}(1)\|(1-P(0)) = A_M.$$
Therefore the operation $\mathcal{X}$ coincides with the shorted operation to $M$.
The following theorem corresponds to Theorem 1 for parallel addition.
Theorem 3. An operation $\mathcal{X}(\cdot)$ on the class of all positive operators turns out to be the shorted operation to a closed subspace $M$, i.e. $\mathcal{X}(A) = A_M$ for every positive operator $A$, if and only if it satisfies the following conditions:
(i) $\mathcal{X}(A) \le A$,
(ii) $\mathcal{X}(\alpha A) = \alpha\,\mathcal{X}(A)$ for $\alpha > 0$,
(iii) $\mathcal{X}(\mathcal{X}(A)) = \mathcal{X}(A)$,
(iv) $\mathcal{X}(A) + \mathcal{X}(B) \le \mathcal{X}(A+B)$,
(v) $\mathcal{X}(A + \mathcal{X}(B)C\mathcal{X}(B)) \le \mathcal{X}(A) + \mathcal{X}(B)C\mathcal{X}(B)$,
(vi) $\mathcal{X}(A^2) \le \mathcal{X}(A)^2$.
Proof. In § 2 we pointed out that the shorted operation to a closed subspace $M$ satisfies (i) to (iv). Also (v) follows immediately from (1) because $B_My = 0$ for every $y \in M^\perp$. To see (vi), let
$$A_n = A + 1/n \quad\text{and}\quad B_n = P + 1/n,$$
where $P$ is the projection to $M$. Since $A_n$ and $B_n$ have bounded inverse, it follows from (3), together with the operator convexity of the square, that
$$(A_n:mB_n)^{-2} = (A_n^{-1}+m^{-1}B_n^{-1})^2 \le (1+m^{-1})(A_n^{-2}+m^{-1}B_n^{-2}) = (1+m^{-1})(A_n^2:mB_n^2)^{-1}.$$
Since the order relation between two positive operators is reversed by forming the respective inverses, we have
$$m(1+m)^{-1}(A_n^2:mB_n^2) \le (A_n:mB_n)^2.$$
By (VII) the sequences $A_n^2:mB_n^2$ and $(A_n:mB_n)^2$ converge to $A^2:mP$ and $(A:mP)^2$ respectively as $n \to \infty$, hence
$$m(1+m)^{-1}(A^2:mP) \le (A:mP)^2. \tag{17}$$
By (4) the left side of (17) converges to $(A^2)_M$ while the right side converges to $(A_M)^2$. This proves (vi) for shorted operation.
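A small finite-dimensional instance may help fix the direction of the inequality $(A^2)_M \le (A_M)^2$ in (vi). The computation below is only a sketch: it assumes the standard matrix fact, not restated in this section, that for a positive definite $2\times 2$ matrix and $M$ spanned by the first basis vector the shorted operator is the Schur complement placed in the (1,1) entry. The matrix itself is chosen purely for illustration.

```latex
\begin{align*}
A &= \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix},
& A_M &= \Bigl(1 - \tfrac{1\cdot 1}{2}\Bigr)\oplus 0 = \tfrac12 \oplus 0,
& (A_M)^2 &= \tfrac14 \oplus 0,\\
A^2 &= \begin{pmatrix} 2 & 3 \\ 3 & 5 \end{pmatrix},
& (A^2)_M &= \Bigl(2 - \tfrac{3\cdot 3}{5}\Bigr)\oplus 0 = \tfrac15 \oplus 0
& &\le \tfrac14 \oplus 0 = (A_M)^2 .
\end{align*}
```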
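Suppose conversely that an operation $\mathcal{X}(\cdot)$ satisfies conditions (i) to (vi). It follows from (i) and (vi) that
$$0 \le \mathcal{X}(1) \le 1 \quad\text{and}\quad \mathcal{X}(1) = \mathcal{X}(1^2) \le \mathcal{X}(1)^2,$$
which implies immediately that $\mathcal{X}(1)$ is a projection. Let $M$ denote the range of $\mathcal{X}(1)$. Now in view of (ii) we have only to show that
$$\mathcal{X}(A) = A_M \quad\text{for } 0 \le A \le 1.$$
Let $0 \le A \le 1$. Then since $\mathcal{X}(A) \le \mathcal{X}(1)$ by (iv), $\mathrm{ran}(\mathcal{X}(A))$ is contained in $M$, so that we conclude from (i) and the definition of shorted operation that
$$\mathcal{X}(A) \le A_M. \tag{10}$$
Then it follows from (10), (iii) and (iv) that
$$\mathcal{X}(A_M - \mathcal{X}(A)) + \mathcal{X}(A) = \mathcal{X}(A_M - \mathcal{X}(A)) + \mathcal{X}(\mathcal{X}(A)) \le \mathcal{X}(A_M) \le \mathcal{X}(A),$$
hence
$$\mathcal{X}(A_M - \mathcal{X}(A)) = 0. \tag{11}$$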
Now since (10) implies
applying (vi) we have by (11)
$$\mathcal{X}(1) = \mathcal{X}\bigl[(A_M - \mathcal{X}(A)) + \cdots$$
hence
This together with (10) concludes the proof.
5. Range inclusion
Remarkable properties of parallel addition and shorted operation from the point of view of range inclusion have been discussed by Anderson and Duffin [2] and Anderson [1] in the finite dimensional case, and by Fillmore and Williams [6] and Anderson and Trapp [3] in the general case: for positive operators $A$, $B$ and a closed subspace $M$
($\alpha$) $\mathrm{ran}((A_M)^{1/2}) = \mathrm{ran}(A^{1/2}) \cap M$, [3, Theorem 1]
($\alpha'$) $\mathrm{ran}(A_M) \supseteq \mathrm{ran}(A) \cap M$,
($\beta$) $\mathrm{ran}((A:B)^{1/2}) = \mathrm{ran}(A^{1/2}) \cap \mathrm{ran}(B^{1/2})$, [3, Theorem 11]
($\beta'$) $\mathrm{ran}(A:B) \supseteq \mathrm{ran}(A) \cap \mathrm{ran}(B)$.
On the other hand, we showed in the course of the proof of Theorem 3
We remark that ($\alpha''$) (resp. ($\beta''$)) can be considered to give quantitative expression to ($\alpha'$) (resp. ($\beta'$)). In fact, for instance, by ($\alpha$) applied to $A^2$, relation ($\alpha'$) is equivalent to
$$\mathrm{ran}(A_M) \supseteq \mathrm{ran}(((A^2)_M)^{1/2}),$$
which says, by a result of Douglas [5], that there is a constant $\gamma > 0$ such that
we can take
References
1 W.N. Anderson, Jr., Shorted operators, SIAM J. Appl. Math. 20(1971), 520-525.
2 W.N. Anderson, Jr. and R.J. Duffin, Series and parallel addition of matrices, J. Math. Anal. Appl. 26(1969), 576-594.
3 W.N. Anderson, Jr. and G.E. Trapp, Shorted operators II, (preprint).
4 F. Bohnenblust, On axiomatic characterization of Lp space, Duke Math. J. 6(1940), 627-640.
5 R.G. Douglas, On majorization, factorization and range inclusion, Proc. Amer. Math. Soc. 17(1966), 413-416.
6 P.A. Fillmore and J.P. Williams, On operator ranges, Advances in Math. 7(1971), 254-281.
7 M.G. Krein, Theory of selfadjoint extensions of semi-bounded operators and its application, Mat. Sb. 20(62)(1947), 431-495 (Russian).
8 F. Riesz and B. Sz.-Nagy, Functional analysis, Ungar, New York, 1955.
A FUNCTIONAL ANALYSIS APPROACH TO
MINIMUM SENSITIVITY CONTROL DESIGN
J. Gary Reid
United States Air Force Avionics Laboratory
Wright Patterson AFB, Ohio 45433
Abstract
This paper treats the problem of open-loop minimum sensitivity control design by a gradient iteration method of solution rather than by the previously well-known Riccati equation techniques. By using standard methods from functional analysis, the quadratic cost functional of the generally high-dimensional sensitivity system is transformed into a low-dimensional minimum norm problem on the control space. A recently obtained matrix-operator form of the parameter sensitivities in linear time-invariant ordinary differential equation systems then enables one to compute the gradient function with a very small number of integrals. Finally, extensions are suggested to more general linear systems defined on an arbitrary Hilbert space, and some of the computational considerations of computing the gradient are discussed.
1. INTRODUCTION
A classic problem in control theory is the design of control laws which are "forgiving" of one's inaccurate knowledge of system parameter values. One method to approach this problem is to treat the nominal trajectory parameter sensitivities as additional "states" of the system, and then use the formalism of the Maximum Principle (Pontryagin et al. [7]) to determine the optimal control which minimizes a cost functional of both the natural system states and the sensitivity "states". (See, e.g., Guardabassi et al. [1], Holtzman and Horing [3], Kahne [4].)
If the system is described by a set of linear ordinary differential equations, then it is well known that the parameter sensitivities also satisfy a linear set of differential equations termed the "sensitivity system". Therefore, if the sensitivity cost functional is selected to be quadratic in the control, system states, and sensitivity "states", then the minimum sensitivity open-loop control may be determined explicitly via solution of a matrix Riccati differential equation (e.g., Kahne [4]). We emphasize that the control law so obtained is merely open-loop, and it cannot be made closed-loop (as is the case when the quadratic cost functional does not contain sensitivity constraints) except by approximation (see, e.g., Lamont and Kahne [5]). This is a well-known dilemma (e.g., Deyst and Price [9]), and basically stems from the fact that the optimal control is computed as a linear combination of the nominal trajectory parameter sensitivities which, in turn, are computed under the assumption that the control is open-loop. Since the parameter sensitivities are not "physical" variables of the system, but are merely mathematical operators (partial derivatives) which depend upon the mathematical nominal trajectory for their values, they lose their meaning in an on-line, feedback control law.
One problem of such Riccati equation approaches is often the extremely large number of differential equations to be solved. If n is the state dimension and p is the parameter dimension, then there are potentially n(p+1) "states" in the sensitivity system and the symmetric Riccati matrix differential equation has dimension n(p+1) x n(p+1). If the system is time-invariant, then this number of differential equations may be reduced substantially by using low-order sensitivity models (e.g., Wilkie and Perkins [13]) or taking controllability properties into account (e.g., Gupta and Mehra [2] or Reid et al. [10] [11]), but regardless there still can be a great amount of computation if either the state or parameter dimension is high.
This paper treats this same open-loop, linear system, minimum sensitivity control problem, but it approaches the solution in a fundamentally different way. Using well-known methods of functional analysis, the quadratic cost functional is transformed into a minimum norm problem on the space of controls. (See, e.g., Porter [8].) This problem may then be solved by standard gradient iteration techniques. For simplicity, the case of linear time-invariant ordinary differential equation systems is treated first. Using recently reported (Reid et al. [10] [11]) algebraic representations of the sensitivities in such systems, a highly efficient method of computing the gradient function is thereby obtained. Following this development, extensions to more general linear systems described by linear operators on a Hilbert space are discussed.
2. PROBLEM FORMULATION
We consider the n-dimensional time-invariant system
$$\dot{x}(t) = A(v)x(t) + B(v)u(t), \qquad x(0) = x_0 \tag{1}$$
with observable output
$$y(t) = C(v)x(t), \qquad y(t) \in R^m \tag{2}$$
where v is a constant, unknown parameter vector which parametrizes A, B, and C. The nominal value of $v \in R^p$ is designated $v_0$ and all system quantities will henceforth be evaluated at $v_0$; therefore, the explicit reference to v will be deleted from the notation. The components of v are designated $v_i$, i = 1, 2, ..., p, and partial derivatives with respect to $v_i$ are designated with a subscript "(i)".
For a given $u \in L^2(0, t_f; R^r)$ and a given $x_0 \in R^n$, the nominal output of the system (1) is uniquely specified by
$$y(t) = T(t)x_0 + W(t)u \tag{3}$$
where
$$T(t) \triangleq Ce^{At}, \qquad W(t) \triangleq \int_0^t Ce^{A(t-s)}B(\cdot)\,ds \tag{4}$$
Then the output parameter sensitivities are given by
$$z_{(i)}(t) \triangleq y_{(i)}(t) = T_{(i)}(t)x_0 + W_{(i)}(t)u \tag{5}$$
where it is fairly easy to show that [10]
(6)
The computation of these partial derivatives will be discussed in the next section.
Now it is assumed that we wish to minimize the quadratic cost functional
$$J(u) = \langle \bar{y}(t_f), S_f\bar{y}(t_f)\rangle + \int_0^{t_f}\langle \bar{y}(t), S(t)\bar{y}(t)\rangle\,dt + \int_0^{t_f}\langle u(t), u(t)\rangle\,dt \tag{7}$$
where we define the m(p+1) dimensioned augmented vector
$$\bar{y}(t) \triangleq \begin{bmatrix} y(t) \\ z_{(1)}(t) \\ \vdots \\ z_{(p)}(t) \end{bmatrix} = \begin{bmatrix} T(t) \\ T_{(1)}(t) \\ \vdots \\ T_{(p)}(t) \end{bmatrix}x_0 + \begin{bmatrix} W(t) \\ W_{(1)}(t) \\ \vdots \\ W_{(p)}(t) \end{bmatrix}u \triangleq \bar{T}(t)x_0 + \bar{W}(t)u \tag{8}$$
and it is assumed that $S_f$ and $S(t)$ are nonnegative and symmetric. This is a very general sensitivity cost functional, as it weights the parameter sensitivities at the terminal time as well as along the nominal trajectory, and various modifications of this cost functional have been used by a number of previous researchers (e.g., [1] [3] [4]).
The problem, then, is to calculate the minimum sensitivity, open-loop control law which minimizes the functional (7). As discussed in Section 1, this problem may be solved explicitly via a Riccati equation technique; however, we will take an alternate approach by using standard tools of functional analysis to transform the cost functional (7) into the form
$$J(u) = k + 2\int_0^{t_f}\langle h(t), u(t)\rangle\,dt + \int_0^{t_f}\langle u(t), P(t)u\rangle\,dt \tag{9}$$
where
$$k = \langle y_{z.i.}(t_f), S_fy_{z.i.}(t_f)\rangle + \int_0^{t_f}\langle y_{z.i.}(t), S(t)y_{z.i.}(t)\rangle\,dt \tag{10}$$
$$h(t) = \bar{W}^*(t_f-t)S_fy_{z.i.}(t_f) + \int_0^{t_f}\bar{W}^*(s-t)S(s)y_{z.i.}(s)\,ds \tag{11}$$
$$P(t)u = u(t) + \bar{W}^*(t_f-t)S_f\bar{W}(t_f)u + \int_t^{t_f}\bar{W}^*(s-t)S(s)\bar{W}(s)u\,ds \tag{12}$$
$$y_{z.i.}(t) = \bar{T}(t)x_0 \tag{14}$$
Since $S_f$ and $S(\cdot)$ are assumed positive, the optimal control exists and is uniquely specified by
$$u^*(t) = -P^{-1}(t)h \tag{15}$$
However, it is not practical to invert P(t) directly except by Riccati equation techniques, and so an iterative method of solution is appropriate.
The gradient of J(u) is easily shown to be
$$\nabla J(t; u) = 2(h(t) + P(t)u) \tag{16}$$
and so a gradient or conjugate gradient algorithm may be conveniently utilized (e.g., Luenberger [6]). The key to such an approach is, of course, obtaining an efficient means to compute the gradient function. This is the topic discussed in the next section.
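The gradient iteration can be sketched in a finite-dimensional setting, where the control is a vector, P is a symmetric positive definite matrix and h is a vector, so that J(u) = k + 2<h,u> + <u,Pu> and the gradient is 2(h + Pu). The code below is only an illustration under those assumptions: the function names, the 2x2 data and the exact line search step <g,g>/(2<g,Pg>) are choices made here, not the mechanization used in the paper.

```python
def matvec(P, u):
    return [sum(p * x for p, x in zip(row, u)) for row in P]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gradient(P, h, u):
    # Gradient of J(u) = k + 2<h,u> + <u,Pu>, i.e. 2(h + Pu).
    return [2.0 * (hi + pi) for hi, pi in zip(h, matvec(P, u))]

def steepest_descent(P, h, u0, tol=1e-20, max_iter=200):
    u = list(u0)
    for _ in range(max_iter):
        g = gradient(P, h, u)
        gg = dot(g, g)
        if gg < tol:          # squared gradient norm small enough: stop
            break
        # Exact minimizer of J(u - step*g) along the gradient direction.
        step = gg / (2.0 * dot(g, matvec(P, g)))
        u = [ui - step * gi for ui, gi in zip(u, g)]
    return u

# Hypothetical data; P plays the role of "identity plus a positive operator".
P = [[2.0, 0.5], [0.5, 1.5]]
h = [1.0, -1.0]
u_opt = steepest_descent(P, h, [-x for x in h])   # initial guess u0 = -h
```

For this data the minimizer is $-P^{-1}h = (-8/11,\ 10/11)$, which the iteration approaches without ever forming $P^{-1}$.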
3. COMPUTATION OF THE GRADIENT
Recently Reid et al. [10] [11] have shown that the zero-input and zero-state response of the system (8) may be put into the form
$$y_{z.i.}(t) = \bar{T}(t)x_0 = F f(t) \tag{17}$$
$$y_{z.s.}(t) = \bar{W}(t)u = \int_0^t H f(t-s)u(s)\,ds \tag{18}$$
where $f(\cdot)$ is a 2n-dimensional vector function with elements of the form $t^je^{\lambda_kt}$, where $\lambda_k$, k = 1, 2, ..., q, are the distinct eigenvalues of A (which, if complex, would also introduce factors of sines and cosines), and j = 0, 1, ..., $2n_k - 1$, where $n_k$ is the eigenvalue multiplicity of $\lambda_k$ in the characteristic polynomial of A. The m(p+1) x 2n and m(p+1) x 2nr dimensional matrices F and H are obtained directly from the quantities $(CAx_0)$, $(CAx_0)_{(i)}$, $(CAB)$, $(CAB)_{(i)}$, i = 1, 2, ..., p, and by the inversion of a 2n x 2n dimensioned Vandermonde matrix. (See Appendix.)
Now if we assume, for simplicity, a single control input (i.e., r = 1) then we may put h(t) and P(t)u, equations (10) - (11), into the form
$$h(t) = f^T(t_f-t)H^TS_fy_{z.i.}(t_f) + \int_0^{t_f}f^T(s-t)H^TS(s)y_{z.i.}(s)\,ds \tag{19}$$
$$P(t)u = u(t) + f^T(t_f-t)H^TS_fy_{z.s.}(t_f) + \int_t^{t_f}f^T(s-t)H^TS(s)y_{z.s.}(s)\,ds \tag{20}$$
where
$$y_{z.i.}(t) = F f(t) \tag{21}$$
$$y_{z.s.}(t) = H\int_0^t f(t-s)u(s)\,ds \tag{22}$$
For more than one control input, the computations are merely repeated for each input and corresponding column vector of B. Also note that the indicated 2n convolutions in equations (18), (19) and (21) may be transformed into an equal number of quadrature integrals through use of relations such as
$$(t-s)e^{\lambda_k(t-s)} = te^{\lambda_kt}e^{-\lambda_ks} - e^{\lambda_kt}se^{-\lambda_ks} \tag{23}$$
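The effect of relation (23) can be seen numerically: the convolution of $(t-s)e^{\lambda(t-s)}$ with $u(s)$ splits into two ordinary integrals whose integrands do not depend on t, so they can be accumulated once as t advances. The sketch below is an illustration only; the trapezoid rule, the function names and the test input are assumptions made here.

```python
import math

def trapezoid_weight(i, n):
    return 0.5 if i in (0, n) else 1.0

def convolution_direct(lam, u, t, n=1000):
    """Trapezoid rule applied to the integral of (t-s) e^{lam (t-s)} u(s) over [0, t]."""
    ds = t / n
    total = 0.0
    for i in range(n + 1):
        s = i * ds
        total += trapezoid_weight(i, n) * (t - s) * math.exp(lam * (t - s)) * u(s)
    return total * ds

def convolution_by_quadrature(lam, u, t, n=1000):
    """Same quantity via (23): t e^{lam t} I0 - e^{lam t} I1, with
    I0 = int e^{-lam s} u(s) ds and I1 = int s e^{-lam s} u(s) ds over [0, t]."""
    ds = t / n
    i0 = 0.0
    i1 = 0.0
    for i in range(n + 1):
        s = i * ds
        e = trapezoid_weight(i, n) * math.exp(-lam * s) * u(s)
        i0 += e
        i1 += s * e
    i0 *= ds
    i1 *= ds
    return t * math.exp(lam * t) * i0 - math.exp(lam * t) * i1

direct = convolution_direct(-1.0, math.cos, 1.0)
split = convolution_by_quadrature(-1.0, math.cos, 1.0)
```

The two routines agree to rounding error on the same grid, because (23) is an identity between the integrands.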
Therefore, in examining equations (18) - (21) we see that the gradient function
$$\nabla J(t; u) = 2(h(t) + P(t)u) \tag{15}$$
may be computed with only 2nr quadrature integrals if $S(\cdot) = 0$ (terminal sensitivity only) or 4nr quadrature integrals if $S(\cdot) \ne 0$ (trajectory sensitivity). Thus the computations for each iteration of a gradient minimization algorithm are quite low, increase only linearly with the state dimension, n, and are independent of the parameter dimension, p. This is in contrast to the Riccati equation approach in which, in the worst case, the computations may increase with the square of n and p. Since the cost functional is quadratic, we are assured that either a gradient or conjugate gradient algorithm will converge to the unique minimizing control (e.g., Luenberger [6]). Additionally, if the system (1) is output controllable and if it is desired to meet the terminal constraint, $y(t_f) = y_f$, exactly, then the gradient projection method of Rosen [12] may be conveniently utilized to calculate the minimizing control.
4. EXTENSIONS TO GENERAL LINEAR SYSTEMS
The problem formulation of Section 2 is easily extended to the more general linear system with output given by the operator equation
$$y(t) = T(t; v)x_0 + W(t; v)u \tag{24}$$
$$y(t) \in R^m, \qquad v \in R^p, \qquad u \in U, \qquad x_0 \in X$$
where the control space U and state space X are assumed to be arbitrary real Hilbert spaces, and the zero-input and the causal zero-state system operators, $T(t; \cdot)$ and $W(t; \cdot)$, are each assumed to be continuous in t and continuously differentiable with respect to each $v_i$ at the nominal $v_0 \in R^p$. Then the output parameter sensitivities are once again given by the operator equations
(25)
and the cost functional J(u), equation (7), may be formed in the same manner as before. Defining the augmented operators $\bar{T}(t)$ and $\bar{W}(t)$ in a similar manner to Section 2 and letting $\bar{W}^*(t_f)$ denote the adjoint operator of $\bar{W}(t_f) \in L_c(U, R^{m(p+1)})$ and $\bar{W}^*$ denote the adjoint operator of $\bar{W}(\cdot) \in L_c(U, L^2(0, t_f; R^{m(p+1)}))$, then we may once again transform the cost functional J(u) into the form
$$J(u) = k + 2\langle u, h\rangle + \langle u, Pu\rangle \tag{26}$$
where
(27)
(28)
and k and $y_{z.i.}(t)$ are defined as in Section 2. Note that each term of the above expressions correlates with the corresponding term for h(t) and P(t)u in equations (18) - (19).
The gradient of J(u) is once again given by $\nabla J(u) = 2(h + Pu)$, and so a gradient or conjugate gradient algorithm may be utilized in computing the optimal control. However, once again the key to such an approach is how easy it is to compute $\bar{W}(\cdot)u$ and the adjoints $\bar{W}^*(t_f)$ and $\bar{W}^*$. For linear time-invariant ordinary differential equation systems we saw that these quantities could be computed quite efficiently; the primary reason for this ease of computation stems from the fact that some normally very time consuming convolutions in computing $\bar{W}(\cdot)u$ and $\bar{W}^*$ could be transformed into the far more convenient quadrature integral form. However, for general linear systems this conversion of the convolutions would generally not be feasible. Hence, it is a significant observation that only the "trajectory sensitivity" adjoint, $\bar{W}^*$, involves a convolution, while inherently the "terminal sensitivity" adjoint, $\bar{W}^*(t_f)$, does not (see equations (18) - (19)). If one only wishes to weight the sensitivities at the terminal time $t_f$ and not along the entire nominal trajectory, then this gradient method of solution may still be a computationally practical method of solution. It is also of interest to note that this substantial difference in computational burden between the "terminal" and "trajectory" cost functional problems is not a feature of the Riccati equation type approaches, for there the number of differential equations remains invariant regardless of such a change in the cost functional. Therefore, in the terminal sensitivity problem, particularly, the gradient type approach might be an attractive alternative to such explicit methods of solution.
5. EXAMPLE
To illustrate the theory of Section 3. we
consider the simple second order linear system
d dt
[lOJ ~l (tJ _ 61 (tJ o (t) - x (t) t E: [0,1]
2 2 (30)
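with nominal parameter vector $v_0 = [-2\ \ {-3}\ \ 1]^T$. Then the nominal eigenvalues are $\lambda_1 = -1$ and $\lambda_2 = -2$. From the Appendix the Vandermonde matrix V and its inverse are
$$V = \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & -2 & 3 \\ 1 & -2 & 4 & -8 \\ 0 & 1 & -4 & 12 \end{bmatrix}, \qquad V^{-1} = \begin{bmatrix} -4 & 4 & 5 & 2 \\ -12 & 8 & 12 & 5 \\ -9 & 5 & 9 & 4 \\ -2 & 1 & 2 & 1 \end{bmatrix}$$
Then it is fairly easy to show that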
F
2 o -2 0
-3 2
5 -2
4 -2
-6 2
o o
o o
-1
2
3
-5
-4
6
o o
o o
21
-2
-2
4
o o
H
1
-1
-2
3
3
-4
1
-1
4
8
5
1
o o 1
-1
-1
1
o o
5
12
9
2
-1
2
2
-3
-3
4
(31)
o o 1
-2
-2
4
-1 0
2 0 (32)
and
$$f(t) = [e^{-t}\ \ te^{-t}\ \ e^{-2t}\ \ te^{-2t}]^T \tag{33}$$
Then for the sensitivity cost functional
$$J(u) = a\,\bar{y}^T(1)\bar{y}(1) + \int_0^1 b\,\bar{y}^T(t)\bar{y}(t)\,dt + \int_0^1 u^2(t)\,dt \tag{34}$$
the gradient function
$$\nabla J(t; u) = 2(h(t) + P(t)u) \tag{35}$$
may be computed from expressions (19) - (22) where the matrices $S_f$ and S are replaced by the scalars a and b, respectively. For each new guess of the control there are four integrals required to compute $y_{z.s.}(\cdot)$ and four to compute the third term of $P(\cdot)u$ (the trajectory sensitivity term).
On the other hand, since there are two system states and three parameters there are a total of eight "states" in the sensitivity system. It may be shown from controllability considerations that the complete sensitivity system for this example may be generated with only six differential equations [10], but regardless, the Riccati equation method of solution would require solution of either an 8x8 or 6x6 nonlinear matrix Riccati equation.
Using a steepest descent algorithm (Luenberger [6]) and the initial guess $u_0(t) = -h(t)$, the gradient method of solution was mechanized on the CDC 6600 digital computer. The results for two different sets of weighting constants, a and b, are shown in Tables 1 and 2. We see that in either case the gradient method converges quite rapidly.
ITERATION    COST         NORM^2 GRADIENT
START        .717         .180
1            .715         .385x10^-3
2            .715         .110x10^-5
Table 1. a = 1, b = .1
ITERATION    COST         NORM^2 GRADIENT
START        .108x10^4    .180x10^4
1            .632x10^3    .413x10^2
2            .612x10^3    .338x10
3            .611x10^3    .777x10^-1
4            .611x10^3    .642x10^-2
Table 2. a = 10, b = 1
6. SUMMARY
By using standard methods of functional analysis, a gradient method of solution has been developed as a computational alternative to Riccati equation approaches to the minimum sensitivity control problem in linear systems. For an n-dimensional linear time-invariant ordinary differential equation system with r control inputs, a recently obtained algebraic representation of the parameter sensitivities [10] [11] allows us to compute the gradient function with merely 2nr or 4nr quadrature integrals for the terminal or trajectory sensitivity problems, respectively. This method of solution may then be a practical alternative to Riccati equation techniques, where the computations increase at a rate greater than the square of the state dimension n.
Finally, extensions have been suggested for more general linear systems, and it was observed that the computations of the "terminal" cost functional problem are inherently far fewer than those of the "trajectory" cost functional problem due to the elimination of time-consuming convolutions.
APPENDIX
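In this appendix we briefly describe how the matrices F and H used in equations (20) and (21) are computed. Far more details and a derivation of this matrix-operator form for the parameter sensitivities may be found in [10] [11].
For simplicity, assume that the nominal A matrix has distinct real eigenvalues, $\lambda_k$, k = 1, 2, ..., n, and assume that there is only one control input so that B is a column vector. Define the 2n x 2n generalized Vandermonde matrix
$$V = \begin{bmatrix} 1 & \lambda_1 & \lambda_1^2 & \cdots & \lambda_1^{2n-1} \\ 0 & 1 & 2\lambda_1 & \cdots & (2n-1)\lambda_1^{2n-2} \\ 1 & \lambda_2 & \lambda_2^2 & \cdots & \lambda_2^{2n-1} \\ 0 & 1 & 2\lambda_2 & \cdots & (2n-1)\lambda_2^{2n-2} \\ \vdots & & & & \vdots \\ 0 & 1 & 2\lambda_n & \cdots & (2n-1)\lambda_n^{2n-2} \end{bmatrix} \tag{A-1}$$
and the m(p+1) x 2n dimensioned matrices
$$E = \begin{bmatrix} Cx_0 & CAx_0 & \cdots & CA^{2n-1}x_0 \\ (Cx_0)_{(1)} & (CAx_0)_{(1)} & \cdots & (CA^{2n-1}x_0)_{(1)} \\ \vdots & & & \vdots \\ (Cx_0)_{(p)} & (CAx_0)_{(p)} & \cdots & (CA^{2n-1}x_0)_{(p)} \end{bmatrix} \tag{A-2}$$
$$G = \begin{bmatrix} CB & CAB & \cdots & CA^{2n-1}B \\ (CB)_{(1)} & (CAB)_{(1)} & \cdots & (CA^{2n-1}B)_{(1)} \\ \vdots & & & \vdots \\ (CB)_{(p)} & (CAB)_{(p)} & \cdots & (CA^{2n-1}B)_{(p)} \end{bmatrix} \tag{A-3}$$
Then
$$F = EV^{-1} \quad\text{and}\quad H = GV^{-1} \tag{A-4}$$
The extension to the case of multiple inputs and of non-distinct and complex eigenvalues is easily made. (See [10] for details.)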
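The generalized Vandermonde matrix of (A-1) is easy to build and invert exactly for small examples. The sketch below assumes the row layout "powers of each eigenvalue followed by their derivatives with respect to that eigenvalue" and uses exact rational Gauss-Jordan elimination; the function names are hypothetical, and the eigenvalues -1 and -2 are those of the example in Section 5.

```python
from fractions import Fraction

def generalized_vandermonde(eigenvalues):
    """Rows [1, l, l^2, ...] and [0, 1, 2l, 3l^2, ...] for each eigenvalue l."""
    size = 2 * len(eigenvalues)
    rows = []
    for lam in eigenvalues:
        lam = Fraction(lam)
        rows.append([lam ** j for j in range(size)])
        rows.append([Fraction(0) if j == 0 else j * lam ** (j - 1)
                     for j in range(size)])
    return rows

def invert(matrix):
    """Exact Gauss-Jordan inversion of a square matrix of Fractions."""
    n = len(matrix)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

V = generalized_vandermonde([-1, -2])
V_inv = invert(V)
```

The columns of the inverse are the coefficient vectors of the Hermite interpolation basis polynomials at the eigenvalues, which is one way to check the result by hand.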
BIBLIOGRAPHY
1. Guardabassi, G., A. Locatelli, and S. Rinaldi, "On the Optimization of Continuous Linear Systems with Sensitivity Constraints," Preprints Second IFAC Symposium on System Sensitivity and Adaptivity, Dubrovnik, Yugoslavia, August 1968.
2. Gupta, N. K. and R. K. Mehra, "Computational Aspects of Maximum Likelihood Estimation and Reduction in Sensitivity Function Calculations," IEEE Trans. on Automatic Control, AC-19: 774-783, Dec. 1974.
3. Holtzman, J. M. and S. Horing, "The Sensitivity of Terminal Conditions of Optimal Control Systems to Parameter Variations," IEEE Trans. on Automatic Control, AC-10: 420-426, Oct. 1965.
4. Kahne, S., "Low-Sensitivity Design of Optimal Linear Control Systems," IEEE Trans. on Aerospace and Elect. Systems, AES-4: 374-397, May 1968.
5. Lamont, G. and S. Kahne, "Comparison of Sensitivity Improvement Techniques for Linear Optimal Control Systems," IEEE Trans. Aerospace and Elect. Systems, AES-5: 142-151, March 1969.
6. Luenberger, D. G., Optimization by Vector Space Methods, New York: John Wiley & Sons, Inc., 1969.
7. Pontryagin, L. S., V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko (trans. by K. N. Trirogoff, ed. by L. W. Neustadt), The Mathematical Theory of Optimal Processes, New York: John Wiley, 1962.
8. Porter, W. A., Modern Foundations of Systems Engineering, New York: Macmillan, 1966.
9. Price, C. and J. Deyst, "A Method for Obtaining Desired Sensitivity Characteristics with Optimal Controls," Proceedings Joint Automatic Control Conference, University of Michigan, 1968.
10. Reid, J. G., "Sensitivity Operators and Associated System Concepts for Linear Dynamic Systems," PhD Dissertation, Air Force Institute of Technology, 1975.
11. Reid, J. G., P. S. Maybeck, R. B. Asher, and J. D. Dillow, "An Algebraic Representation of Parameter Sensitivities in Linear Time-Invariant Systems," Journal of the Franklin Institute, January 1976.
12. Rosen, J., "The Gradient Projection Method of Nonlinear Programming, Part I, Linear Constraints," J. Soc. Industrial Applied Math. 8: 181-217, 1960.
13. Wilkie, D. F. and W. R. Perkins, "Generation of Sensitivity Functions for Linear Systems Using Low-Order Models," IEEE Transactions on Automatic Control.
BIOGRAPHY
J. Gary Reid was born in Newark, New Jersey, on 12 November 1945. In 1967 he received a B.S. in Aeronautics from the United States Air Force Academy and was commissioned in the Air Force. In 1968 he received an S.M. from M.I.T. in Aeronautics and Astronautics, and he is currently working to complete a Ph.D. degree at the Air Force Institute of Technology with a speciality in estimation and control theory.
Captain Reid is a student member of the IEEE societies on automatic control, computers, and aerospace and electronic systems. His current research interests include linear system theory, identification, adaptive nonlinear estimation, and pattern recognition.
PASSIVITY AND LP-STABILITY OF SOME NONLINEAR EVOLUTION EQUATIONS
Dinu Wexler, with the Department of Mathematics, Facultés Universitaires N.D. de la Paix, Namur, Belgium.
Abstract. We state some results on passivity and $L^p$-stability of the input-output operator associated with nonlinear evolution equations involving a monotone operator.
1. INTRODUCTION
We discuss passivity and $L^p$-stability as defined in Systems Theory (see for instance [3], [5]) for the input-output operator associated with the differential equation
$$\frac{du}{dt} + Au \ni f, \tag{1}$$
where A is a maximal monotone operator (possibly nonlinear, unbounded and multivalued) of a real Hilbert space H. A significant class of ordinary and partial differential equations may be written in the form (1). The recently developed general theory of these equations is closely related to nonlinear contraction semigroups. From this point of view it represents a nonlinear version of the well-known Hille-Yosida-Phillips theory. It is also related to variational inequalities for partial differential equations. A systematic exposition may be found in the monograph of H. Brezis [2].
Recall first some basic definitions and results in this theory. A (multivalued) operator $A : H \to \mathcal{P}(H)$ with domain $D(A) = \{x \in H : Ax \ne \emptyset\}$ is said to be maximal monotone if: (i) A is monotone, i.e. $(x_1 - x_2, y_1 - y_2) \ge 0$, $\forall x_1, x_2 \in D(A)$, $y_1 \in Ax_1$, $y_2 \in Ax_2$; and (ii) A does not possess proper monotone extensions. An important class of maximal monotone operators consists of subdifferentials of convex lower semicontinuous functions $\varphi : H \to\ ]-\infty, +\infty]$, $\varphi \not\equiv +\infty$.
Recall the following existence, uniqueness and regularity results: let $f \in L^1_{loc}(t_0, +\infty; H)$. Then for any $u_0 \in \overline{D(A)}$ there exists a unique weak solution on $[t_0, +\infty[$ of the Cauchy problem
$$\frac{du}{dt} + Au \ni f, \qquad u(0) = u_0. \tag{2}$$
This solution is a strong one whenever (i) A is a subdifferential and $f \in L^2_{loc}(t_0, +\infty; H)$; or (ii) $u_0 \in D(A)$ and $f \in BV_{loc}([t_0, +\infty[, H)$. The solution u depends continuously on $u_0$ and f in the following sense: if $u$ and $\hat{u}$ are the solutions of (2) and
$$\frac{d\hat{u}}{dt} + A\hat{u} \ni \hat{f}, \qquad \hat{u}(0) = \hat{u}_0 \tag{3}$$
respectively, then
$$|u(t) - \hat{u}(t)| \le |u_0 - \hat{u}_0| + \int_{t_0}^t |f - \hat{f}|\,d\sigma, \qquad \forall t \ge t_0.$$
In this setting one may consider various problems for systems with input f and output u. We state here our main results on passivity, strong passivity and $L^p$-stability. Proofs will be published elsewhere.
2. PASSIVITY
Assume $0 \in A0$. One says that eq. (1) is passive if to any $f \in L^2_e(R, H)$ (the causal extension of $L^2(R, H)$) one can assign a weak solution u of (1) on R such that $u \in L^2_e(R, H)$ and such that
$$\int_{-\infty}^t (f - \hat{f}, u - \hat{u})\,d\sigma \ge 0, \qquad \forall t \in R,$$
for any $f, \hat{f} \in L^2_e(R, H)$ and any assigned solutions $u, \hat{u}$ respectively.
We first state the properties which are implied by passivity: when eq. (1) is passive, then the solution u assigned to f is unique, $\lim_{t\to-\infty}u(t) = 0$, and the operator $f \mapsto u$ is causal and commutes with any translation. Moreover, the operator $f \mapsto u$ possesses the following continuity property: if $u_n$ and $u$ are the solutions assigned to $f_n$ and $f$ respectively and if $f_n \to f$ in $L^2(-\infty, T; H)$, then $u_n \to u$ weakly in $L^2(-\infty, T; H)$ and strongly in $L^r(-\infty, T; H)$, for any $r \in\ ]2, +\infty]$ (hence uniformly on $]-\infty, T]$).
We state a sufficient passivity condition: assume $0 \in A0$; if there exist $a > 0$ and $\rho > 0$ such that
$$(x, y) \ge a|x|^2, \qquad \forall x \in D(A),\ |x| < \rho,\ y \in Ax,$$
then eq. (1) is passive. Some nonlinear diffusion equations satisfy the above conditions. Another simple example is furnished by the subdifferential of the norm of H (so that $Ax = |x|^{-1}x$ if $x \ne 0$, and $A0$ is the closed unit ball of H with center 0).
3. STRONG PASSIVITY
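One says that eq. (1) is $\alpha$-strongly passive ($\alpha > 0$) if it is passive and if for any $f$ and $\hat f$ in $L^2_e(\mathbb{R}; H)$ the assigned solutions $u$ and $\hat u$ respectively satisfy
$\int_{-\infty}^{t} (f - \hat f, u - \hat u) \, d\sigma \ge \alpha \int_{-\infty}^{t} |u - \hat u|^2 \, d\sigma, \quad \forall t \in \mathbb{R}.$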
The maximal monotone operator A is said to be $\alpha$-strongly monotone if $A - \alpha I$ is still monotone (I being the identity on H). Recall that for any $\lambda > 0$, the (nonlinear) resolvent $J_\lambda = (I + \lambda A)^{-1}$ of A is a single-valued operator defined on the whole of H. We now state our main result.
Theorem 1: Assume $0 \in A0$ and let $\alpha > 0$. The following three conditions are equivalent: (i) eq. (1) is $\alpha$-strongly passive; (ii) A is $\alpha$-strongly monotone; and (iii) the resolvent of A satisfies
$(J_\lambda y_1 - J_\lambda y_2, y_1 - y_2) \ge \lambda \alpha |J_\lambda y_1 - J_\lambda y_2|^2,$
$\forall y_1, y_2$ in H and $\lambda > 0$.
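Condition (iii) is easy to probe numerically in one dimension. The following sketch assumes H is the real line and uses the hypothetical operator $Ax = \alpha x + x^3$, which satisfies $0 \in A0$ and is $\alpha$-strongly monotone because $A - \alpha I$ is the increasing map $x \mapsto x^3$. The resolvent is computed by bisection, and the function names are illustrative only. For this particular example the sharper constant $1 + \lambda\alpha$ also holds, so the gap returned below should be comfortably non-negative.

```python
def resolvent(y, lam, alpha, iters=200):
    """Return x = J_lam(y), the solution of x + lam * (alpha * x + x**3) = y."""
    lo, hi = -abs(y), abs(y)  # the solution always satisfies |x| <= |y|
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + lam * (alpha * mid + mid ** 3) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def inequality_gap(y1, y2, lam, alpha):
    """Left side minus right side of condition (iii) for the scalar example."""
    d = resolvent(y1, lam, alpha) - resolvent(y2, lam, alpha)
    return d * (y1 - y2) - lam * alpha * d * d


def min_gap(alpha=0.7):
    """Smallest gap over a small grid of y1, y2 and lam values."""
    ys = [-3.0, -1.0, -0.2, 0.0, 0.5, 2.0]
    lams = [0.1, 1.0, 10.0]
    return min(inequality_gap(a, b, lam, alpha)
               for a in ys for b in ys for lam in lams)
```

A grid check of this kind cannot prove the equivalence in Theorem 1; it only confirms that condition (iii) is consistent with strong monotonicity on the sampled points.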
This extends to our framework a result established previously for linear evolution equations by Beltrami and Buianouckas [1] by arguments which use the linearity in an essential way. The proof we give is based on certain results in the theory of nonlinear contraction semigroups.
Note that when A is $\alpha$-strongly monotone, the operator $f \mapsto u$ possesses the following Lipschitz property: for any $f, \hat f$ in $L^2_e(\mathbb{R}; H)$, the assigned solutions $u$ and $\hat u$ respectively satisfy
$\|\chi_t (u - \hat u)\|_{L^2(\mathbb{R}; H)} \le \alpha^{-1} \|\chi_t (f - \hat f)\|_{L^2(\mathbb{R}; H)}, \quad \forall t \in \mathbb{R},$
where $\chi_t$ is the characteristic function of $]-\infty, t]$.
4. $L^p$-STABILITY
Some results on $L^p$-stability with $p \in [1, +\infty[$ may be obtained by using the ideas we applied to discuss passivity. We mention here only a result on $L^\infty$-stability. One says that eq. (1) is $L^\infty$-stable if for any $[u_0, f]$ in $\overline{D(A)} \times L^\infty(\mathbb{R}_+; H)$ the weak solution u of (2) on $\mathbb{R}_+$ belongs to $L^\infty(\mathbb{R}_+; H)$. The maximal monotone operator A is said to be coercive if there exists $x_0 \in H$ such that
$\lim_{|x| \to \infty,\ y \in Ax} \frac{(x - x_0, y)}{|x|} = +\infty.$
Theorem 2: If A is coercive, then eq. (1) is $L^\infty$-stable and the operator $[u_0, f] \mapsto u$ from $\overline{D(A)} \times L^\infty(\mathbb{R}_+; H)$ into $L^\infty(\mathbb{R}_+; H)$ carries bounded sets into bounded sets.
The converse of Theorem 2 holds at least when A is a subdifferential. When $\dim H < \infty$ and A is a subdifferential, then eq. (1) is $L^\infty$-stable if and only if A is surjective, i.e.
$\bigcup_{x \in D(A)} Ax = H.$
REFERENCES
[1] E.J. Beltrami and F. Buianouckas. A Note on Passive Evolution Equations. J. of Math. Analysis and Applications 37 (1972), 227-230.
[2] H. Brezis. Opérateurs maximaux monotones et semi-groupes de contractions dans les espaces de Hilbert. North-Holland/American Elsevier, 1973.
[3] C.A. Desoer and M. Vidyasagar. Feedback Systems: Input-Output Properties. Academic Press, 1975.
[4] D. Wexler. Opérateurs fortement monotones et équations d'évolution fortement passives. C.R. Acad. Sc. Paris 280 (1975), 201-204.
[5] A.H. Zemanian. Realizability Theory for Continuous Linear Systems. Academic Press, 1972.
Dinu WEXLER was born in Bucharest, Rumania, on August 28, 1931. He received the Doctor degree in Mathematics from the University of Bucharest in 1966. From 1955 to 1971 he was with the Department of Mathematics of the Institute of Petroleum, Gas and Geology, Bucharest, and the Institute of Mathematics of the Rumanian Academy of Sciences. In 1972 he was with the Department of Mathematics, Université de Paris VI (Pierre et Marie Curie), France, and in 1973 he joined the Department of Mathematics, Facultés Universitaires N.D. de la Paix, Namur, Belgium, where he holds the rank of Associate Professor. His research interests are in Differential Equations and Nonlinear Analysis.
CONTRACTIVE TRANSFER RATIOS OF OPERATOR NETWORKS
A. H. Zemanian
State University of New York at Stony Brook Stony Brook, New York 11794
Abstract
By an "operator" we mean a bounded linear operator in a complex Hilbert space. An operator network without mutual coupling, whose self-impedances are commuting operator-valued analytic functions corresponding to passive elements, has voltage and current transfer ratios that are contractions on certain cones in the right-half complex plane.
Summary. The concept of an electrical network whose parameters are operators in a complex Hilbert space H has been discussed in a number of prior works (1) - (8). The purpose of the present paper is to extend to operator networks the fact that RLC scalar networks having no mutual coupling have voltage and current transfer ratios whose absolute values are bounded by one on the real positive axis. We show that networks whose parameters are invertible commuting positive bounded linear operators in H have operator-valued voltage and current transfer ratios that are strict contractions on certain cones in the right-half plane. The apex of every such cone is at the origin and its bisector is the real axis. Moreover, the angle of the cone depends in a certain way on the number of nodes of the network.
We need some definitions. The numerical range $W(F)$ of an operator F is the bounded set of complex numbers
$W(F) = \{(Fa, a) : a \in H,\ \|a\| = 1\}.$
$\mathbb{C}$ denotes the complex plane in the following expressions. For any positive integer m, $C(m)$ is the open cone
$C(m) = \{\lambda \in \mathbb{C} : |\arg \lambda| < \pi/(2m)\}.$
For any fixed $\zeta \in C(m)$, we define the closed cone
$\Omega(m, \zeta) = \{\lambda \in \mathbb{C} : |\arg \lambda| \le m\, |\arg \zeta|\}.$
Let $\mathfrak{M}$ be a maximal abelian self-adjoint algebra of operators containing the parameters of the network. Given any set of positive commuting operators, such an $\mathfrak{M}$ can always be constructed. $Q(m)$ will denote the set of all analytic operator-valued functions F on $C(m)$ such that, for each $\zeta \in C(m)$, we have $F(\zeta) \in \mathfrak{M}$, $W[F(\zeta)] \subset \Omega(m, \zeta)$, and $W[F(\zeta)]$ is bounded away from the origin.
We assume that the network N under consideration is a connected three-terminal or two-terminal-pair network having no mutual inductance and no internal sources. Every self-impedance is assumed to be a member of $Q(1)$. (Note that every resistance, inductance, or capacitance that is a positive invertible operator in $\mathfrak{M}$ has its impedance in $Q(1)$.) Finally, we assume that N contains no short circuits and that all the external terminal nodes are distinct nodes.
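The remark that resistances, inductances and capacitances have impedances in $Q(1)$ can be spot-checked numerically. The sketch below is an illustration under simplifying assumptions: the operators are commuting diagonal matrices with positive entries (stored as lists of their diagonal entries), the impedances are taken to be $R$, $\zeta L$ and $(\zeta C)^{-1}$, and the numerical range is sampled with random unit vectors. Only the cone condition $W[F(\zeta)] \subset \Omega(1, \zeta)$ is tested; analyticity and the bound away from the origin are not. All names are hypothetical.

```python
import cmath
import math
import random


def in_open_cone(z, m):
    """True if z lies in C(m), i.e. z != 0 and |arg z| < pi / (2m)."""
    return z != 0 and abs(cmath.phase(z)) < math.pi / (2 * m)


def in_closed_cone(lam, m, zeta, tol=1e-12):
    """True if |arg lam| <= m * |arg zeta| (up to a small rounding tolerance)."""
    return abs(cmath.phase(lam)) <= m * abs(cmath.phase(zeta)) + tol


def quadratic_form(diag, a):
    """(Fa, a) for a diagonal operator F given by its diagonal entries."""
    return sum(d * abs(x) ** 2 for d, x in zip(diag, a))


def random_unit_vector(n, rng):
    """Random complex vector of Euclidean norm one."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]


def impedances_in_cone(zeta, R, L, C, samples=200, seed=0):
    """Sample the numerical ranges of R, zeta*L and (zeta*C)^(-1) against Omega(1, zeta)."""
    rng = random.Random(seed)
    families = [
        [complex(r) for r in R],          # resistance
        [zeta * l for l in L],            # inductance
        [1.0 / (zeta * c) for c in C],    # capacitance
    ]
    for diag in families:
        for _ in range(samples):
            w = quadratic_form(diag, random_unit_vector(len(diag), rng))
            if not in_closed_cone(w, 1, zeta):
                return False
    return True
```

For example, `impedances_in_cone(1 + 0.5j, [1.0, 2.0], [0.5, 3.0], [2.0, 0.1])` samples all three element types at a single point of $C(1)$.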
We now generate a certain network N' equivalent to N as follows. We first replace all series connections inside N by equivalent single branches and then replace all parallel connections by equivalent single branches. We continue repeating these two steps until a network N' with no internal series or parallel connections is obtained. N' is uniquely determined by N and has the same behavior at its external terminals as does N. Throughout the following k will denote the number of internal nodes (not counting the external terminal nodes) of N'.
A cut-node of a connected network is a node whose deletion coupled with the deletion of all the branches incident at that node results in a disconnected network. An external terminal node of N is a cut-node if and only if it is a cut-node in N'.
Let $T(\zeta)$ and $J(\zeta)$ denote respectively the open-circuit voltage transfer ratio and the short-circuit current transfer ratio of either a three-terminal or two-terminal-pair network. We are now ready to state our conclusions.
Theorem 1. Let N be a three-terminal network such that the input node not common to the output is not a cut-node. Then, for all $\zeta \in C(2k + 2)$, $\|T(\zeta)\| < 1$ and $T(\zeta)$ has the form
$T(\zeta) = [I + A(\zeta)]^{-1},$
where $A \in Q(2k + 2)$ and I denotes the identity operator in H.
Theorem 2. Let N be a three-terminal network such that the output node not common to the input is not a cut-node. Then, for all $\zeta \in C(2k + 2)$, $\|J(\zeta)\| < 1$ and $J(\zeta)$ has the form
$J(\zeta) = [I + B(\zeta)]^{-1},$
where $B \in Q(2k + 2)$.
Theorem 3. Let N be a two-terminal-pair network such that at least one of the input (respectively, output) nodes is not a cut-node. Then $\|T(\zeta)\| < 1$ (respectively, $\|J(\zeta)\| < 1$) for every $\zeta \in C(2k + 4)$.
These theorems are proven by extending Kirchhoff's third and fourth laws to operator networks and manipulating the numerical ranges of the operator-valued impedances in an appropriate way.
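Theorem 1 can be illustrated on what is perhaps the simplest case, a voltage divider with no internal nodes (k = 0), so that the relevant cone is $C(2)$, i.e. $|\arg \zeta| < \pi/4$. The sketch assumes a series inductance $\zeta L$ between input and output and a shunt capacitance $(\zeta C)^{-1}$ between output and the common node, with L and C commuting diagonal operators with positive entries. Under these assumptions $T(\zeta) = [I + \zeta^2 L C]^{-1}$, and the norm of a diagonal operator is the largest modulus of its entries. The circuit and the function name are chosen here for illustration and are not taken from the paper.

```python
import cmath
import math


def divider_transfer_norm(zeta, L, C):
    """Norm of T(zeta) = [I + zeta**2 * L * C]^(-1) for diagonal L and C.

    L and C are lists holding the positive diagonal entries of the operators.
    """
    return max(abs(1.0 / (1.0 + zeta ** 2 * l * c)) for l, c in zip(L, C))


def contraction_on_samples(L, C):
    """Check ||T(zeta)|| < 1 on a few sample points with |arg zeta| < pi/4."""
    radii = (0.1, 1.0, 10.0)
    angles = (-0.7, -0.3, 0.0, 0.3, 0.7)  # all strictly inside (-pi/4, pi/4)
    return all(divider_transfer_norm(cmath.rect(r, th), L, C) < 1.0
               for r in radii for th in angles)
```

With `L = [1.0, 0.5]` and `C = [0.5, 2.0]`, the sampled points inside $C(2)$ give a strict contraction, whereas `divider_transfer_norm(cmath.rect(1.0, math.pi / 3), L, C)` exceeds one, in line with the claim that the cone cannot be enlarged.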
The values of k in these theorems cannot be decreased; that is, given any $\zeta$ not in the cone indicated in the conclusion of any one of the theorems, there exists a network that satisfies the stated assumptions and whose voltage or current transfer ratio is not a contraction at that $\zeta$. The assumption that all branch impedance operators commute is a severe one. However, it can be shown by example that it is necessary. Indeed, let $r_1$ and $r_2$ be positive invertible noncommuting operators connected in series. Let T be the transfer voltage ratio that maps the voltage drop across $r_1 + r_2$ into the voltage drop across $r_2$. Then, by choosing $r_1$ and $r_2$ appropriately, we can make $\|T\|$ greater than one.
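A concrete instance of such a pair is easy to produce with 2 x 2 real matrices. The sketch below assumes $T = r_2 (r_1 + r_2)^{-1}$, which follows from writing both voltage drops in terms of the common series current. The particular matrices are one hypothetical choice, not the example intended by the author, and the helper names are illustrative.

```python
import math


def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(a, b):
    """Entrywise sum of two 2 x 2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def transpose(a):
    """Transpose of a 2 x 2 matrix."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def inverse(a):
    """Inverse of an invertible 2 x 2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def spectral_norm(t):
    """Largest singular value of a real 2 x 2 matrix."""
    g = matmul(transpose(t), t)  # symmetric Gram matrix t^T t
    mean = 0.5 * (g[0][0] + g[1][1])
    radius = math.hypot(0.5 * (g[0][0] - g[1][1]), g[0][1])
    return math.sqrt(mean + radius)  # square root of the top eigenvalue of g


def divider_norm(r1, r2):
    """Norm of T = r2 (r1 + r2)^(-1) for the series connection of r1 and r2."""
    return spectral_norm(matmul(r2, inverse(add(r1, r2))))


# Both matrices are symmetric positive definite, but they do not commute.
R1 = [[0.01, 0.0], [0.0, 100.0]]
R2 = [[2.0, 1.0], [1.0, 1.0]]
```

Here `divider_norm(R1, R2)` is larger than one, while replacing `R2` by the diagonal matrix `[[2.0, 0.0], [0.0, 1.0]]`, which commutes with `R1`, brings the norm back below one.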
Finally, we note that if some of the external terminal nodes are cut-nodes or if there exist paths of short circuits inside N connecting the terminal nodes, then it is possible for $T(\zeta)$ and $J(\zeta)$ to have norms equal to but not larger than one.
REFERENCES
(1) V. Dolezal, "Hilbert networks: I", SIAM J. Control, to appear.
(2) V. Dolezal and A. H. Zemanian, "Hilbert networks: II - Some qualitative properties", SIAM J. Control, to appear.
(3) V. Dolezal, "Generalized Hilbert networks", to appear.
(4) V. Dolezal, "Networks", to appear.
(5) A. H. Zemanian, "Passive operator networks", IEEE Trans. Circuits and Systems, vol. CAS-21 (1974), pp. 184-193.
(6) A. H. Zemanian, "Infinite networks of positive operators", Circuit Theory and Applications, vol. 2 (1974), pp. 69-78.
(7) A. H. Zemanian, "Continued fractions of operator-valued analytic functions", Journal of Approximation Theory, to appear.
(8) A. H. Zemanian, "Infinite electrical networks", Proc. IEEE, to appear.