
Spectral Estimation by

Geometric, Topological and

Optimization Methods

Per Enqvist

Doctoral Thesis

Stockholm, 2001

Optimization and Systems Theory

Department of Mathematics

Royal Institute of Technology

Stockholm, Sweden

Copyright © 2001 by Per Enqvist

TRITA-MAT-01-OS-03

ISSN 1401-2294

ISRN KTH/OPT SYST/DA 01/02-SE

Universitetsservice US AB, Stockholm, 2001

To my family


Abstract

This thesis consists of four papers dealing with various aspects of spectral

estimation and the stochastic realization problem.

In Paper A a robust algorithm for solving the Rational Covariance Extension Problem with degree constraint (RCEP) is presented. This algorithm improves on the current state of the art, which is based on convex optimization. The new algorithm is based on a continuation method, and uses a change of variables to avoid spectral factorizations and the numerical ill-conditioning in the original formulation occurring for some parameter values.

In Paper B a parameterization of the RCEP is described in the context of cepstral analysis and homomorphic filtering. Further, it is shown that there is a natural extension of the optimization problem mentioned above to incorporate cepstral parameters as a parameterization of zeros. The extended optimization problem is also convex and, in fact, it is shown that a window of covariances and cepstral lags forms local coordinates for ARMA models of order n.

In Paper C the geometry of shaping filters is analyzed by considering parameterizations using various combinations of poles, zeros, covariance lags, cepstral lags and Markov parameters. In particular, the covariance and cepstral interpolation problem is studied using differential geometry and duality theory. Assuming there is an underlying system that is stable and minimum phase, it is shown in this paper that there is a one-to-one correspondence between Markov parameters and cepstral coefficients. An approach based on simultaneous Markov and covariance parameter interpolation has been studied by Skelton et al. In this paper it is studied from a global analysis point of view.

Paper D deals with a regularization of two filter design methods, namely the covariance and cepstral matching ARMA design method and covariance matching for MA filters. Both methods are posed as optimization problems, and a barrier term is introduced to achieve a strictly minimum phase solution. As a result of the regularization, exact interpolation is traded for a gain in entropy, and the map from data to filter defined by the optimization problems is turned into a diffeomorphism.

Keywords: Spectral estimation, ARMA models, Covariance analysis, Cepstral analysis, Markov parameters, Global analysis, Convex optimization, Continuation methods, Entropy maximization.

Mathematics Subject Classification (1991): 93E12, 60G10, 42A70, 30E05, 90C25, 53C12, 94A17.


Acknowledgments*

Although much of my time as a Ph.D. student has been a lone wolf race,

I could not have made it without the help and support of a number of

key persons.

My first and warmest thanks go to my advisor Professor Anders Lindquist, who despite a heavy workload has managed to be available for discussing my spurious ideas, and for all his help with formalizing and carrying out these ideas. The second most important researcher I have had the privilege to work with is Professor Chris Byrnes. During his intense visits at KTH he has managed to boost my research with his curiosity and good mathematical intuition. In particular, it has been very nice to be a coauthor of two papers together with the dynamic duo: Professor Anders Lindquist and Professor Chris Byrnes.

I am also grateful to Professor Sergei Gusev and Professor Tryphon Georgiou for encouraging discussions that directly or indirectly inspired the results presented in Paper A of this thesis.

Further, I thank Professor Clyde F. Martin for supervising my diploma work at Texas Tech, and all the teachers I have come across, to name a few, Professor Tomas Björk, Docent Krister Svanberg, and Professor Anders Lindquist, for inspiring my interest in graduate studies.

My colleagues Anders Dahlén, Ryozo Nagamune, Jorge Mari and Ulf Jönsson have formed a valuable discussion panel, always ready to bounce around any new ideas. The stimulating social environment at the Division of Optimization and Systems Theory has been an important factor in making it a joy to get to work, especially at times when some diversion from the research is needed, such as doing sports (thank you Henrik and Petter) or just having a chat. In particular I would like to thank my two room-mates during these years: Camilla Landén and Torvald Ersson. Camilla and I shared the initial confusion as beginning Ph.D. students and we helped each other through the first courses. I have also appreciated the company of Torvald, who shares my interest in sports.

Finally, I would like to thank my family. With their solid support and encouragement behind me, nothing seems too difficult.

*This work was sponsored in part by TFR and the Göran Gustafsson Foundation.


Contents

1 Introduction
  1 Linear System Models
  2 Realization Theory
    2.1 Deterministic Realization Theory
    2.2 Stochastic Realization Theory
    2.3 Connections between Stochastic and Deterministic Realization
  3 The Rational Covariance Extension Problem
    3.1 The Maximum Entropy Model
    3.2 Parameterizations of Relaxed Versions of the RCEP
  4 Classical Speech Modeling
    4.1 Acoustic Tube Modeling
    4.2 Lossless Tube Equations
    4.3 The Effects of Nasal Coupling
  5 Foliations, Transversality and Local Coordinates
  6 Summary of the Papers
    6.1 Paper A
    6.2 Paper B
    6.3 Paper C
    6.4 Paper D

A A Homotopy Approach to Rational Covariance Extension with Degree Constraint
  1 Introduction
  2 The Original Optimization Problem
  3 A New Formulation of the Optimization Problem
  4 Homotopy approach
    4.1 Initial Value Problem formulation
    4.2 Predictor-Corrector method
  5 The Algorithm
    5.1 Adaptive step length procedure
    5.2 How to choose the initial step size
    5.3 A practical algorithm
  6 Convergence of the proposed algorithm
  7 Acknowledgments

B Cepstral coefficients, covariance lags and pole-zero models for finite data strings
  1 Introduction
  2 Preliminaries
    2.1 Analysis based on infinite data
    2.2 LPC filters
    2.3 Cepstral maximization and LPC filters
  3 Homomorphic filtering and generalizations of LPC filtering
    3.1 Cepstral and covariance windows as local coordinates for pole-zero models
    3.2 Cepstral maximization and a generalization of LPC design
  4 Realization algorithms for lattice-ladder notch (LLN) filters
    4.1 Selecting the positive pseudo-polynomial
    4.2 The algorithm
    4.3 Examples
  5 Conclusions

C Identifiability and well-posedness of shaping-filter parameterizations: A global analysis approach
  1 Introduction
  2 Some geometric representations of classes of models
  3 Main Results
  4 Global analysis on P_n
  5 Identifiability of shaping filters
  6 The simultaneous partial realization problem
  7 Zero assignability vs. cepstral assignability
  A Divisors and polynomials
  B Calculation of cepstral coefficients
  C Connectivity of P_n(c)

D A convex optimization approach to ARMA(n,m) model design from covariance and cepstrum data
  1 Introduction
  2 Local coordinates for ARMA models
  3 Optimization problems for cepstrum and covariance interpolation
  4 Regularization of Problem (P)
  5 Regularization of Problem (M)
  6 Simulations
    6.1 MA Filter Design
    6.2 ARMA Filter Design
  7 Acknowledgment

Introduction

This introduction is intended to provide some background material for understanding the four papers on stochastic realization theory forming the main body of this thesis. Even for readers who are familiar with stochastic realization theory, the introduction will be useful for setting notation and for introducing concepts to be used later from related fields, such as speech processing, information theory and differential geometry.

1 Linear System Models

Before stochastic realization theory can be studied, some basic linear system theory is presented. Throughout this thesis the models are considered to be linear, and this class of models is described next. A linear model can be seen as a linear mapping $w : \mathcal{U} \to \mathcal{Y}$ from an input space $\mathcal{U}$ to an output space $\mathcal{Y}$, both of which are real vector spaces of scalar sequences. If $\{\ldots, u_{-1}, u_0, u_1, \ldots\} \in \mathcal{U}$ is a sequence of inputs, and $\{\ldots, y_{-1}, y_0, y_1, \ldots\} \in \mathcal{Y}$ is the corresponding sequence of outputs, and the system is assumed to be causal and time-invariant, then

$$ y_m = \sum_{k=0}^{\infty} w_k u_{m-k} = \sum_{k=0}^{\infty} w_k z^{-k} u_m = w(z) u_m, \qquad (1) $$

for some parameters w_k, where z denotes a forward shift operator. The parameters w_k are called the Markov parameters of the system and w(z) the transfer function. The input-output map can be described by a black box representation as depicted in Figure 1.
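The relation (1) is an ordinary causal convolution, which is easy to verify numerically. A minimal sketch; the first-order system w(z) = 1/(1 - 0.5 z^{-1}) and its Markov parameters w_k = 0.5^k are assumptions made for the example, not taken from the thesis:

```python
import numpy as np

# Assumed example system w(z) = 1/(1 - 0.5 z^{-1}): Markov parameters w_k = 0.5^k.
w = 0.5 ** np.arange(20)   # truncated Markov parameter sequence

# Apply the causal, time-invariant map (1): y_m = sum_k w_k u_{m-k}.
u = np.zeros(20)
u[0] = 1.0                 # unit impulse input
y = np.convolve(u, w)[:20]

# The impulse response reproduces the Markov parameters.
assert np.allclose(y, w)
```

For any other input, y is the same weighted sum of past inputs, so the Markov parameters fully describe the input-output behavior.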

In this thesis it is assumed that the transfer function w(z) belongs to

the class of real rational functions. In this case, w(z) can be expressed

Figure 1: Black box model (u_m → w(z) → y_m).

as a ratio of polynomials:

$$ w(z) = \frac{\sigma(z)}{a(z)} = \sum_{k=0}^{\infty} w_k z^{-k}, \qquad (2) $$

where σ(z) and a(z) are polynomials of finite order n given by

$$ \sigma(z) \triangleq \sigma_0 z^n + \sigma_1 z^{n-1} + \ldots + \sigma_n, \qquad \sigma_j \in \mathbb{R}, \quad j = 0, 1, \ldots, n, \qquad (3) $$

$$ a(z) \triangleq a_0 z^n + a_1 z^{n-1} + \ldots + a_n, \qquad a_j \in \mathbb{R}, \quad j = 0, 1, \ldots, n. \qquad (4) $$

For the main part of this thesis, we will further assume that w(z) is

• proper, i.e., the order of σ(z) is less than or equal to the order of a(z),

• stable, i.e., the polynomial a(z) has all its roots in the open unit disc,

and for the most part also that w(z) is

• minimum phase, i.e., the polynomial σ(z) has all its roots in the open unit disc.

2 Realization Theory

The theoretical framework used for model building, based on parameters obtained by observing a system, is called realization theory. Given a (linear) system, a number of different characterizing parameters can be determined from the system, such as the Markov parameters. Realization theory deals with the inverse problem, namely the synthesis of systems based on a set of characterizing parameters. There are two branches of realization theory, namely deterministic and stochastic realization theory. These two theories differ in the availability of system inputs, and since this determines which types of characterizing parameters can be estimated from observing the system, the characterizing parameters are different in the two theories. Roughly speaking, deterministic realization theory is based on Markov parameters and stochastic realization theory is based on covariance lags.

Next, the main ideas of the two theories are explained and the con-

nection between them is discussed.

2.1 Deterministic Realization Theory

In deterministic realization theory, the input is assumed to be a known control signal and the output signal is observed as the system evolves over time. Although realization theory is independent of the particular application, it is often useful to consider examples to guide the studies at the higher abstraction level. Realization of systems is used in many different fields of application, e.g. mechanical engineering, biology and physics. To name a few, in economics, it is often assumed that the inflation rate can be modeled as a system controlled by the interest rate, and in chemical engineering, concentration levels are controlled by diluting or adding substances.

One way of building a model for a system is to first estimate a window of Markov parameters w_0, ..., w_N from observed input and output sequences, and then to determine a realization w(z) based on this parameter set. For a stable linear system the Markov parameters form an exponentially decreasing sequence, and if the sequence is truncated after w_N, the relation (1) applied for m = 0, 1, ..., N gives the system

$$ \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} u_0 & u_{-1} & \cdots & u_{-N} \\ u_1 & u_0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & u_{-1} \\ u_N & \cdots & u_1 & u_0 \end{bmatrix} \begin{bmatrix} w_0 \\ w_1 \\ \vdots \\ w_N \end{bmatrix}. \qquad (5) $$

Assuming the input provides a full rank system, the Markov parameter estimates $\{w_j\}_{j=0}^{N}$ can be determined.
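Numerically, this estimation step amounts to solving the linear system (5). A small sketch under the stated full-rank assumption; the "true" Markov parameters and the random input record are invented for the illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 9
w_true = 0.7 ** np.arange(N + 1)       # assumed "true" Markov parameters

# Inputs u_{-N}, ..., u_N, stored so that u[j + N] holds u_j.
u = rng.standard_normal(2 * N + 1)
# The (N+1) x (N+1) input matrix of (5): entry (m, k) is u_{m-k}.
U = np.array([[u[N + m - k] for k in range(N + 1)] for m in range(N + 1)])
y = U @ w_true                          # outputs from (1), truncated after w_N

# With a full rank input matrix the Markov parameters are recovered exactly.
w_est = np.linalg.solve(U, y)
assert np.allclose(w_est, w_true)
```

A generic (e.g. white noise) input makes U nonsingular with probability one, which is the full rank condition in the text.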

The realization step described next is based on the Markov parameters w_0, ..., w_{2n} and the assumption that (2) holds for a system of order n. As described in e.g. [20, 6, 1], the coefficients of σ and a are uniquely determined by this set of Markov parameters. From here on, we will without loss of generality consider the normalized Markov parameters, where w_k := w_k/w_0, so that in particular w_0 = 1. Then it can be assumed that both a(z) and σ(z) are monic, i.e. a_0 = 1 and σ_0 = 1. Multiplication of both sides of (2) by a(z), and identification of the coefficients for z^{-1}, ..., z^{-n}, leads to the equation system

$$ \begin{bmatrix} w_{n+1} \\ w_{n+2} \\ \vdots \\ w_{2n} \end{bmatrix} = - \begin{bmatrix} w_1 & w_2 & \cdots & w_n \\ w_2 & w_3 & \cdots & w_{n+1} \\ \vdots & \vdots & & \vdots \\ w_n & w_{n+1} & \cdots & w_{2n-1} \end{bmatrix} \begin{bmatrix} a_n \\ a_{n-1} \\ \vdots \\ a_1 \end{bmatrix} \qquad (6) $$

and identification of the coefficients for z^n, ..., z, 1 gives

$$ \begin{bmatrix} \sigma_1 \\ \sigma_2 \\ \vdots \\ \sigma_n \end{bmatrix} = \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix} + \begin{bmatrix} 1 & 0 & \cdots & 0 \\ w_1 & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ w_{n-1} & \cdots & w_1 & 1 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}. \qquad (7) $$

If the Hankel matrix in (6) is nonsingular, the coefficients of the polynomial a(z) can be determined. In general, it cannot be guaranteed that a(z) is a Schur polynomial, i.e. a polynomial with all roots inside the unit circle. Since this is related to stability of the system, this is an important issue in many applications. Assuming a(z) is a Schur polynomial obtained from (6), then (7) determines a polynomial σ(z) such that w(z) = σ(z)/a(z) interpolates the Markov parameters 1, w_1, ..., w_{2n}.

Further, for an arbitrary Schur polynomial a(z), (7) determines a polynomial σ(z) such that w(z) = σ(z)/a(z) interpolates the Markov parameters 1, w_1, ..., w_n. Actually, if we consider all rational transfer functions w(z) = σ(z)/a(z) of order n that interpolate the Markov parameters 1, w_1, ..., w_n, it follows from (7) that there is a unique such transfer function for every choice of the Schur polynomial a(z), and all interpolating transfer functions of order n can be obtained in this way. Thus the Schur polynomials a(z) parameterize all solutions. This type of parameterization will be used frequently in this thesis, and therefore the concept of parameterization is formalized and exemplified in Section 5.
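The realization step (6)-(7) can be sketched in a few lines; the helper name `realize` and the first-order test system w(z) = (z + 0.2)/(z - 0.5) are illustrative assumptions, not code from the thesis:

```python
import numpy as np

def realize(w):
    # Recover monic sigma(z) and a(z) of order n from normalized
    # Markov parameters w_0 = 1, w_1, ..., w_2n via (6) and (7).
    n = (len(w) - 1) // 2
    # Hankel system (6); unknowns ordered a_n, ..., a_1.
    H = np.array([[w[i + j] for j in range(n)] for i in range(1, n + 1)])
    a_rev = np.linalg.solve(H, -np.asarray(w[n + 1:2 * n + 1], dtype=float))
    a = np.concatenate(([1.0], a_rev[::-1]))          # a_0 = 1, a_1, ..., a_n
    # (7): sigma_k = sum_{j=0}^{k} a_j w_{k-j}.
    sigma = np.array([sum(a[j] * w[k - j] for j in range(k + 1))
                      for k in range(n + 1)])
    return sigma, a

# Markov parameters of w(z) = (z + 0.2)/(z - 0.5) are 1, 0.7, 0.35, ...
sigma, a = realize([1.0, 0.7, 0.35])
assert np.allclose(a, [1.0, -0.5]) and np.allclose(sigma, [1.0, 0.2])
```

Note that nothing in the computation forces a(z) to be a Schur polynomial, which is exactly the stability caveat raised above.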

2.2 Stochastic Realization Theory

In stochastic realization theory, the input is assumed to be an unknown sequence of (white) noise $\{\ldots, u_{-1}, u_0, u_1, \ldots\} \in \mathcal{U}$, and the only data available for determining a realization is the corresponding sequence of outputs $\{\ldots, y_{-1}, y_0, y_1, \ldots\} \in \mathcal{Y}$. The output will then form a stochastic process.

Examples of processes that can be modeled in this way are the yearly unemployment rate for the U.S., the closing price of a stock at the New York stock exchange, measurement errors in sensors and short samples of speech signals. The last example is described in more detail in Section 4. Although the input signal may not be white noise in these examples, the input is clearly not easily measured, for technical reasons or due to complexity, and then it is reasonable to model it as white noise.

One of the simplest stochastic processes is the white noise process, which can be defined as a sequence of independent, or at least uncorrelated, stochastic variables. The aim of stochastic realization theory is to represent a stochastic process as the output of a linear filter that is driven by a white noise process.

As the output data is usually preprocessed in order to remove trends and other deterministic components, it is assumed that the stochastic processes considered are zero-mean, that is, E y_t = 0 for all t. If the process is considered at two different time instants, t and s, the (linear) dependence between the stochastic variables y_t and y_s is described by the covariance,

$$ \mathrm{Cov}(t, s) \triangleq \mathrm{E}\, y_t y_s. $$

In this thesis the processes are assumed to be stationary in a weak sense, i.e. the covariances are assumed to have the property

$$ \mathrm{Cov}(t, s) = \mathrm{Cov}(t - s, 0) = \mathrm{Cov}(0, s - t), $$

and the variances of the stochastic variables y_t are assumed to be finite. In other words, the covariances depend only on the time difference, so the covariance properties are thus characterized by the covariance sequence

$$ r_\ell \triangleq \mathrm{Cov}(\ell, 0), $$

where clearly r_{-ℓ} = r_ℓ holds. This concept quantifies a weak sense of invariance to time translation for the process.

The spectral density can be defined from the covariances by the Laurent series

$$ \Phi(z) \triangleq \sum_{\ell=-\infty}^{\infty} r_\ell z^{-\ell} \qquad (8) $$

and it can be shown that this series converges in an annulus including the unit circle. In the special case that z = e^{iθ} it is easy to see that

the spectral density is given by the Fourier transform of the covariance sequence

$$ \Phi(e^{i\theta}) = \sum_{\ell=-\infty}^{\infty} r_\ell e^{-i\ell\theta} = r_0 + 2 \sum_{\ell=1}^{\infty} r_\ell \cos(\ell\theta). $$

For the white noise process the spectral density is a constant and thus, as for white light, the contribution to the power spectral distribution of the process is equal for all frequencies. Further, if the covariance sequence is generated by the output of a shaping filter w(z), i.e. a stable transfer function, driven by white noise, it can be shown that the spectral density is related to the shaping filter as Φ(e^{iθ}) = w(e^{iθ})w(e^{-iθ}).

On the other hand, the power spectral density can be factored as Φ(e^{iθ}) = w(e^{iθ})w(e^{-iθ}), where w(z) is a shaping filter, and if v_t is white noise, the process defined by x_t = w(z)v_t is stationary and has the covariances {r_t}, which determine Φ as in (8). The inverse problem of obtaining the shaping filter from the spectral density is called spectral factorization. It is clear that the spectral density has to be nonnegative on the unit circle due to the equation

$$ \Phi(e^{i\theta}) = w(e^{i\theta}) w(e^{-i\theta}) = |w(e^{i\theta})|^2. \qquad (9) $$

In addition, it follows from a theorem by Fejér [5] that this condition is also sufficient. From (9) it also follows that the spectral density can be interpreted as the amplification of the energy at each frequency.
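Relation (9) is easy to check numerically on an example; the first-order MA shaping filter below, with covariances r_0 = 1 + b², r_1 = b, is an assumption made for the illustration:

```python
import numpy as np

# Assumed example: MA(1) shaping filter w(z) = 1 + b z^{-1}, whose output
# covariances are r_0 = 1 + b^2, r_1 = b and r_ell = 0 for ell >= 2.
b = 0.4
r0, r1 = 1 + b ** 2, b

theta = np.linspace(-np.pi, np.pi, 201)
phi_from_covs = r0 + 2 * r1 * np.cos(theta)                  # Fourier series of {r_ell}
phi_from_filter = np.abs(1 + b * np.exp(-1j * theta)) ** 2   # |w(e^{i theta})|^2

# Both sides of (9) agree, and the density is nonnegative on the unit circle.
assert np.allclose(phi_from_covs, phi_from_filter)
assert np.all(phi_from_filter >= 0)
```

With |b| < 1 the filter is also minimum phase, so it is the spectral factor singled out later in the text.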

Clearly, the full covariance sequence determines the shaping filter by (8) and spectral factorization. However, only a few covariances can be estimated from a finite amount of data. A central problem in stochastic realization theory, and in this thesis, is the rational covariance extension problem, which deals with determining all shaping filters of order n matching the window of covariance lags r_0, r_1, ..., r_n. This problem is described in Section 3.
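In practice such a covariance window is obtained by sample averages over the observed outputs. A sketch of the standard (biased) estimator r̂_ℓ = (1/N) Σ y_t y_{t+ℓ}, tried on an assumed white-noise record; the function name and data are invented for the example:

```python
import numpy as np

def covariance_window(y, n):
    # Biased sample estimates r_hat_ell = (1/N) sum_t y_t y_{t+ell}.
    N = len(y)
    return np.array([y[:N - ell] @ y[ell:] / N for ell in range(n + 1)])

# Assumed data record: unit variance white noise, so r_0 = 1, r_ell = 0 otherwise.
rng = np.random.default_rng(1)
y = rng.standard_normal(10000)
r = covariance_window(y, 2)
assert abs(r[0] - 1.0) < 0.1 and abs(r[1]) < 0.1 and abs(r[2]) < 0.1
```

The biased form is often preferred here because it keeps the estimated Toeplitz matrix positive semidefinite, which matters for the extension problem of Section 3.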

2.3 Connections between Stochastic and Deterministic Realization

The stochastic realization problem can be formulated in a way that is similar to deterministic realization, but there are some important differences. Since the input is stochastic and unknown, the Markov parameters are in general not available in stochastic realization theory. A classical approach using impedance functions is presented next, and a new approach is introduced in Paper C.

Define the impedance function v(z), corresponding to the spectral density Φ, by

$$ v(z) \triangleq \frac{r_0}{2} + \sum_{\ell=1}^{\infty} r_\ell z^{-\ell} = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Big( \frac{1}{2} + \sum_{\ell=1}^{\infty} z^{-\ell} e^{i\ell\theta} \Big) \Phi(e^{i\theta})\, d\theta = \frac{1}{2} \cdot \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{e^{i\theta} + z^{-1}}{e^{i\theta} - z^{-1}}\, \Phi(e^{i\theta})\, d\theta, \qquad |z| \geq 1. \qquad (10) $$

Then v(z) is analytic outside the unit disc, v(z^{-1}) is analytic inside the unit disc, and the spectral density can be expressed in terms of the impedance function as

$$ \Phi(z) = v(z) + v(z^{-1}). \qquad (11) $$

The partial stochastic realization problem can be formulated as determining a rational spectral density Φ(z) interpolating the covariances r_0, r_1, ..., r_n, which seems similar to the deterministic problem of determining a rational function v(z) interpolating the "Markov parameters" r_0/2, r_1, ..., r_n. However, there is a further constraint, which says that the function v has to be positive real. A real function is (strictly) positive real if it is stable and maps the outside of the unit circle into the open right half plane of the complex plane, as depicted in Figure 2. This ensures that the spectral density Φ(z) = v(z) + v(z^{-1}) is positive on the unit circle.

Figure 2: Mapping of the Positive Real function v(z).

Hence if the positive real part v can be determined, then the spectral density is obtained from (11) and the shaping filter w(z) is given by the stable minimum phase spectral factor. It is also clear that if the shaping filter w(z) is determined, the positive real part is obtained from (10).

Positive real functions are important in other areas too. For some applications, typically in electric circuit design or for mechanical systems, passive systems are desired. A linear system (in state space form)

$$ x_{k+1} = A x_k + B u_k, \qquad y_k = C x_k + D u_k, \qquad (12) $$

is called passive if the inequality

$$ \sum_{k=0}^{N-1} u_k y_k \geq x_N^\top P x_N - x_0^\top P x_0, \qquad N \geq 0, $$

is satisfied for some positive definite P. For a particular choice of the storage function $x^\top P x$, the concepts of positive real and passive are equivalent if the system (12) is a minimal realization of an asymptotically stable transfer function v(z) [5].

2.3.1 Cepstrum

A third set of parameters characterizing a shaping filter is given by the cepstrum. The cepstrum can be defined from the logarithm of the spectral density by considering its Laurent expansion in an annulus containing the unit circle,

$$ \log \Phi(z) \triangleq \sum_{\ell=-\infty}^{\infty} c_\ell z^{-\ell}, $$

where the cepstral lags are defined as the coefficients c_ℓ. A finite window of cepstral lags for a stationary stochastic process can be estimated from the outputs $\{y_t\}_{t=0}^{N}$. This suggests that the cepstrum can be used as characterizing parameters for stochastic realization theory, and it is shown in Papers B and C that the cepstrum complements the covariances in a very nice way.
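Since the cepstral lags are the Fourier coefficients of log Φ(e^{iθ}), a window of them can be approximated with an FFT. The AR(1) shaping filter w(z) = 1/(1 - p z^{-1}) and its closed-form cepstrum c_ℓ = p^{|ℓ|}/|ℓ| for ℓ ≠ 0 are assumptions made for this example:

```python
import numpy as np

# Assumed example: AR(1) shaping filter w(z) = 1/(1 - p z^{-1}), for which
# log Phi has the known Laurent coefficients c_ell = p^|ell| / |ell|, ell != 0.
p, M = 0.5, 1024
theta = 2 * np.pi * np.arange(M) / M
log_phi = np.log(np.abs(1.0 / (1 - p * np.exp(-1j * theta))) ** 2)

# The inverse FFT recovers the Fourier coefficients c_ell of log Phi(e^{i theta}).
c = np.real(np.fft.ifft(log_phi))

ells = np.arange(1, 6)
assert np.allclose(c[1:6], p ** ells / ells, atol=1e-6)
```

The aliasing error of the grid approximation decays like p^M, so a moderate grid already gives the window to machine-level accuracy.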

In Paper C it is also shown that for stable and minimum phase shaping filters, there is a one-to-one correspondence between finite windows of cepstral lags and Markov parameters. There is thus a connection to deterministic realization theory. Realizations based on covariance and Markov parameters have been studied by Skelton et al. [15, 16].

3 The Rational Covariance Extension Problem

In [12] Kalman posed the Rational Covariance Extension Problem (RCEP).

Problem 3.1 (The Rational Covariance Extension Problem). Given a finite covariance sequence r_0, r_1, ..., r_n such that the Toeplitz matrix

$$ R_n \triangleq \begin{bmatrix} r_0 & r_1 & \cdots & r_n \\ r_1 & r_0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & r_1 \\ r_n & \cdots & r_1 & r_0 \end{bmatrix} \qquad (13) $$

is positive definite, determine all extensions r_{n+1}, r_{n+2}, ... such that R_k is positive definite for all k > n and the spectral density defined by the covariances $\{r_k\}_{k=0}^{\infty}$ is rational of degree at most 2n. Then any minimal spectral factor w(z) will have degree at most n.

If r_0, r_1, ..., r_k are the covariances of the outputs {y_j}, the assumption that R_k is positive definite is equivalent to the natural assumption that

$$ \mathrm{E} \Big( \sum_{j=1}^{k} v_j y_j \Big)^{2} = v^\top R_k v > 0 $$

for every vector v such that at least one element v_j is nonzero. The set of covariances r_0, r_1, ..., r_n such that R_n in (13) is positive definite is denoted by $\mathcal{R}_n$.

A complete parameterization of the solutions to the RCEP was conjectured by Georgiou in [10]. This conjecture was later proven in [4], where the main theorem led to the following corollary, the existence part of which had been proven in [10].

Corollary 3.1 (Corollary 2.4 in [4]). Let $(1, r_1, \ldots, r_n) \in \mathcal{R}_n$ be a given partial covariance sequence. Then given any Schur polynomial

$$ \sigma(z) = z^n + \sigma_1 z^{n-1} + \ldots + \sigma_n, $$

there exists a unique monic Schur polynomial a(z) of degree n and a unique ρ ∈ (0, 1] such that

$$ w(z) = \rho \frac{\sigma(z)}{a(z)} $$

is a minimum phase spectral factor of a spectral density Φ(z) satisfying

$$ \Phi(z) = 1 + \sum_{i=1}^{\infty} \hat{r}_i (z^i + z^{-i}), \qquad \hat{r}_i = r_i \ \text{for } i = 1, 2, \ldots, n. $$

In particular, the solutions of the rational positive extension problem are in one-one correspondence with self-conjugate sets of n points (counted with multiplicity) lying in the open unit disc, i.e. with all possible zero structures of modeling filters. Moreover, the modeling filter depends analytically on the covariance data and the choice of zeros of the spectral density.

A complete parameterization of all solutions to the RCEP is thus provided by the numerator σ(z) of the filter. The convex optimization problem formulated in [3] determines a stable polynomial a(z) such that, for a given stable polynomial σ(z), the filter w(z) = σ(z)/a(z) interpolates a given window of covariances r_0, r_1, ..., r_n. Solving the optimization problem thus provides a means of determining any solution to the RCEP, parameterized as in Corollary 3.1. A robust algorithm for solving this optimization problem is the contribution of Paper A in this thesis. Next, a special extension of the partial covariance sequence is considered.

3.1 The Maximum Entropy Model

The maximum entropy (ME) model is a widely used stochastic model, popular for its low computational complexity and nice matching at frequencies of large energy. It solves the covariance extension problem, and is the unique solution of this problem that also maximizes the entropy of the model. It is a rational model with all zeros at the origin, which in the field of statistics is called an autoregressive (AR) model.

3.1.1 The Entropy of a Process

Consider a discrete-valued stochastic variable y with probability function P_y. The information I of an outcome y = ȳ is defined by [22, 7]

$$ I(y = \bar{y}) \triangleq -\log P_y(y = \bar{y}), $$

and the entropy H is defined as the mean of the information,

$$ H(y) \triangleq \mathrm{E}\, I(y) = -\sum_{\bar{y} \in \mathcal{Y}} P_y(y = \bar{y}) \log P_y(y = \bar{y}). $$

For a stochastic vector-valued variable y = (y_1, ..., y_n) with a probability density P, the entropy is given by

$$ H(y_1, \ldots, y_n) = -\int P(y) \log P(y)\, dy, $$

which can be used to generalize the concept to a stochastic process by considering the limit of the (per sample) information (1/N) H(y_1, ..., y_N) as N tends to infinity. This is called the entropy of the process, or the information rate [5, 23]. Since the entropy of a process depends on the probability density, a special class of stochastic processes has to be considered in order to determine a closed form expression for the entropy.

A Gaussian stochastic process is a stochastic process where for any combination (t_1, t_2, ..., t_n) the vector (y_{t_1}, y_{t_2}, ..., y_{t_n}) of stochastic variables has a multivariate normal distribution. For a real Gaussian stationary stochastic process the distribution is completely determined by its covariances r_0, r_1, r_2, .... The Gaussian property is used here only for motivating the entropy interpretations.

For a Gaussian stochastic process the entropy is given by

$$ H(y) = \frac{1}{2} \log 2\pi + \frac{1}{2} + \frac{1}{4\pi} \int_0^{2\pi} \log \Phi_y(e^{i\theta})\, d\theta $$

(see [5]). With a slight abuse of notation, the expression

$$ H(y) \triangleq \frac{1}{2\pi} \int_0^{2\pi} \log \Phi_y(e^{i\theta})\, d\theta \qquad (14) $$

will be used and referred to as the entropy of a process, even though the process may not be Gaussian.

3.1.2 The Maximum Entropy Solution

Consider the problem of maximizing the entropy measure

$$ \frac{1}{2\pi} \int_{-\pi}^{\pi} \log \Phi(e^{i\theta})\, d\theta, $$

where the spectral density is expressed in terms of the covariances $\{\hat{r}_k\}$ via

$$ \Phi(z) = \hat{r}_0 + \sum_{k=1}^{\infty} \hat{r}_k (z^k + z^{-k}), $$

under the constraints that $\hat{r} \in \mathcal{R}_\infty$ and

$$ \hat{r}_k = r_k, \qquad k = 0, \ldots, n. $$

Burg [2] described this problem in his thesis in the following way:

"Maximum entropy spectral analysis is based on choosing the spectrum which corresponds to the most random or the most unpredictable time series whose autocorrelation function agrees with the known values."

In [2], Burg showed that the solution of this problem can be obtained from the optimal linear predictor, and therefore the ME filter is also called the linear prediction (LPC) filter. The LPC filter coefficients satisfy the normal equations [19, 23] given by

$$ \begin{bmatrix} r_0 & r_1 & \cdots & r_{n-1} \\ r_1 & r_0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & r_1 \\ r_{n-1} & \cdots & r_1 & r_0 \end{bmatrix} \begin{bmatrix} \varphi_{n,n} \\ \varphi_{n,n-1} \\ \vdots \\ \varphi_{n,1} \end{bmatrix} = - \begin{bmatrix} r_n \\ r_{n-1} \\ \vdots \\ r_1 \end{bmatrix}. \qquad (15) $$

This linear system of equations can be solved using the Levinson algo-

rithm, which is a fast algorithm for solving Toeplitz systems.
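A minimal sketch of the Levinson-Durbin recursion for (15) (not the thesis's implementation): it returns the Szegő polynomial coefficients φ_{n,1}, ..., φ_{n,n} and the prediction error variance v_n; the intermediate gamma values are the Schur parameters that appear later in Section 3.2.1. The AR(1) covariance sequence used as a check is an assumption made for the example:

```python
import numpy as np

def levinson(r):
    # Solve the normal equations (15) recursively for a covariance
    # window r = (r_0, ..., r_n) given as a numpy array.
    phi, v = np.zeros(0), r[0]
    for m in range(1, len(r)):
        gamma = (r[m] + phi @ r[1:m][::-1]) / v       # Schur parameter
        phi = np.concatenate((phi - gamma * phi[::-1], [-gamma]))
        v *= 1 - gamma ** 2                           # prediction error variance
    return phi, v

# Covariances of the AR(1) process y_t = 0.5 y_{t-1} + e_t with var(e_t) = 1.
r = (4 / 3) * 0.5 ** np.arange(4)
phi, v = levinson(r)
assert np.allclose(phi, [-0.5, 0.0, 0.0]) and np.isclose(v, 1.0)
```

Here phi[0] = -0.5 corresponds to φ_1(z) = z - 0.5 in the monic convention used below, and v recovers the unit innovation variance; each step costs O(n), giving the O(n²) total that makes the algorithm fast for Toeplitz systems.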

Then the maximum entropy filter is given by

$$ w_{\mathrm{ME}}(z) \triangleq \frac{\sqrt{v_n}\, z^n}{\varphi_n(z)}, $$

where

$$ \varphi_n(z) \triangleq z^n + \varphi_{n,1} z^{n-1} + \cdots + \varphi_{n,n} $$

is the n:th Szegő polynomial of the first kind and $v_n = r_0 + \sum_{j=1}^{n} r_j \varphi_{n,j}$ is the variance of the n:th order prediction error. The corresponding spectral density is given by

$$ \Phi(z) = w_{\mathrm{ME}}(z) w_{\mathrm{ME}}(z^{-1}) = \frac{v_n}{\varphi_n(z)\varphi_n(z^{-1})} = 1 + \sum_{i=1}^{\infty} \hat{r}_i (z^i + z^{-i}), $$

and since $\hat{r}_k = r_k$ for k = 1, ..., n, this determines a solution of the RCEP.

3.2 Parameterizations of Relaxed Versions of the Rational Covariance Extension Problem

The RCEP can also be formulated using the impedance function.

Problem 3.1′ (The Rational Covariance Extension Problem). Given a window of bona fide covariance lags r_0, r_1, ..., r_n, find all positive real impedance functions v(z) which are rational of degree at most n and which interpolate the covariance lags, in the sense that

$$ v(z) = \frac{1}{2}\hat{r}_0 + \hat{r}_1 z^{-1} + \hat{r}_2 z^{-2} + \ldots, \qquad \hat{r}_k = r_k, \quad k = 0, 1, \ldots, n. $$

If either the rationality or the positive realness constraint is omitted, one of the following two classic problems is obtained.

3.2.1 The Carathéodory Extension Problem

Corresponding to each covariance sequence $\{r_k\}$ there is a unique sequence of Schur parameters defined by

$$\gamma_k \triangleq \varphi_{k+1,k+1}, \tag{16}$$

and the relation between these is given by the Levinson recursion. The Schur parameter $\gamma_k$ is given by

$$\gamma_k = \frac{1}{v_k} \sum_{j=0}^{k} \varphi_{k,k-j}\, r_{j+1}.$$

Problem 3.2 (The Carathéodory Extension Problem) Given a window of bona fide covariance lags $r_0, r_1, \dots, r_n$, find all positive real impedance functions interpolating the covariances.

Note that here there is no assumption that the positive real function is rational.

This problem was solved by Schur in [21], where a one-to-one correspondence between normalized covariance sequences and sequences of Schur parameters with $|\gamma_k| < 1$ for all $k$ was established. The Levinson algorithm can be given in a streamlined form using the Szegő polynomials of the first kind and the reversed polynomials

$$\varphi^*_n(z) \triangleq \varphi_{n,n} z^n + \varphi_{n,n-1} z^{n-1} + \cdots + 1,$$

which satisfy the following recursion,

$$\begin{bmatrix} \varphi_{i+1}(z) \\ \varphi^*_{i+1}(z) \end{bmatrix} = \begin{bmatrix} z & -\gamma_i \\ -\gamma_i z & 1 \end{bmatrix} \begin{bmatrix} \varphi_i(z) \\ \varphi^*_i(z) \end{bmatrix}, \qquad \begin{bmatrix} \varphi_0(z) \\ \varphi^*_0(z) \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \tag{17}$$

defined using the Schur parameters. Any extension of the fixed window of Schur parameters with elements of absolute value less than one corresponds to an extension of the finite covariance sequence. This extension defines a shaping filter which is the limit of $w_{ME}(z) = \sqrt{v_n}\, z^n / \varphi_n(z)$ as $n$ tends to infinity. It is, however, highly nontrivial to determine which of these extensions correspond to rational spectral densities of bounded degree.
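The recursion (17) is straightforward to run numerically. The sketch below builds $\varphi_n$ and $\varphi^*_n$ from a given window of Schur parameters; the coefficient convention (highest power first) and the function name are our own.

```python
def szego_polynomials(gammas):
    """Run the recursion (17):
        phi_{i+1}(z)  =  z*phi_i(z) - gamma_i*phi*_i(z),
        phi*_{i+1}(z) = -gamma_i*z*phi_i(z) + phi*_i(z),
    starting from phi_0 = phi*_0 = 1.  Coefficients highest power first.
    """
    phi, phistar = [1.0], [1.0]
    for g in gammas:
        zphi = phi + [0.0]             # z * phi_i, one degree higher
        pad = [0.0] + phistar          # phi*_i, padded to the same length
        phi, phistar = (
            [zphi[i] - g * pad[i] for i in range(len(zphi))],
            [-g * zphi[i] + pad[i] for i in range(len(zphi))],
        )
    return phi, phistar
```

With $\gamma_0 = 0.5$ and $\gamma_1 = 0$ this yields $\varphi_2(z) = z^2 - 0.5z$, whose last coefficient $\varphi_{2,2} = 0$ agrees with the definition (16) of the Schur parameters.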

3.2.2 The Partial Realization Problem

We shall next look at the parameterizations of all impedance functions interpolating a given sequence of covariances. Omitting the assumption that the function is positive real, this is equivalent to the following deterministic realization problem [13], which is posed here in the stochastic setting.

Problem 3.3 (The Partial Realization Problem) Given a window of bona fide covariance lags $r_0, r_1, \dots, r_n$, find all real rational impedance functions of degree at most $n$ interpolating the covariances.

In order to present the solution of this problem we need the Szegő polynomials of both the first and the second kind. The Szegő polynomials of the first kind are given by (17), and the Szegő polynomials of the second kind are given by the recursion

$$\begin{bmatrix} \psi_{i+1}(z) \\ \psi^*_{i+1}(z) \end{bmatrix} = \begin{bmatrix} z & \gamma_i \\ \gamma_i z & 1 \end{bmatrix} \begin{bmatrix} \psi_i(z) \\ \psi^*_i(z) \end{bmatrix}, \qquad \begin{bmatrix} \psi_0(z) \\ \psi^*_0(z) \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \tag{18}$$

which is the same as the first one if we change the sign of each $\gamma_i$.

Any positive real function interpolating the window of covariance lags $1, r_1, \dots, r_n$ can be parameterized as

$$v(z) = \frac{\psi_n(z) - z^{-1} s_{n+1}(z)\, \psi^*_n(z)}{\varphi_n(z) + z^{-1} s_{n+1}(z)\, \varphi^*_n(z)},$$

where $s_{n+1}(z)$ is an arbitrary Schur function. In case the impedance function is rational we have the Kimura-Georgiou parameterization [14, 11]

$$v(z) = \frac{\psi_n(z) + \alpha_1 \psi_{n-1}(z) + \cdots + \alpha_n \psi_0(z)}{\varphi_n(z) + \alpha_1 \varphi_{n-1}(z) + \cdots + \alpha_n \varphi_0(z)}.$$

To characterize the $\alpha$-parameters that give rise to positive real functions using the Kimura-Georgiou parameterization is not an easy problem.
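Although characterizing the positive real $\alpha$-parameters is hard, evaluating the parameterization for given parameters is mechanical: run the recursions (17) and (18) and form the two linear combinations. A sketch under our own conventions (coefficients highest power first; the helper names are hypothetical):

```python
def kimura_georgiou(gammas, alphas):
    """Numerator and denominator of the Kimura-Georgiou parameterization

        v(z) = (psi_n + alpha_1 psi_{n-1} + ... + alpha_n psi_0)
             / (phi_n + alpha_1 phi_{n-1} + ... + alpha_n phi_0).

    The polynomials of the second kind follow (18), i.e. (17) with the
    sign of every Schur parameter flipped.
    """
    def family(gs):
        p, pstar, out = [1.0], [1.0], [[1.0]]
        for g in gs:
            zp = p + [0.0]
            pad = [0.0] + pstar
            p = [zp[i] - g * pad[i] for i in range(len(zp))]
            pstar = [-g * zp[i] + pad[i] for i in range(len(zp))]
            out.append(p)
        return out                     # [p_0, ..., p_n], highest power first

    n = len(gammas)
    phi = family(gammas)               # first kind
    psi = family([-g for g in gammas]) # second kind
    num, den = list(psi[n]), list(phi[n])
    for k in range(1, n + 1):
        # psi_{n-k} has degree n-k, so its coefficients start at index k
        for i, c in enumerate(psi[n - k]):
            num[k + i] += alphas[k - 1] * c
        for i, c in enumerate(phi[n - k]):
            den[k + i] += alphas[k - 1] * c
    return num, den
```

For $n = 1$, $\gamma_0 = 0.5$ and $\alpha_1 = 0.3$ this gives $v(z) = (z + 0.8)/(z - 0.2)$.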

4 Classical Speech Modeling

In classical speech modeling, stochastic realization theory has been used

for coding of speech. Since this application is both important and easily

explained, it will be used throughout this thesis as a motivating example.

Therefore, a brief orientation of some basic speech processing concepts is

given here.

Speech is built up of small units called phonemes, of transitions between these, and of silence. Stochastic processes can be used to model speech, and usually the speech signal is regarded as stationary for approximately the duration of a phoneme. In practice, it is considered stationary on intervals of about 25 ms, as depicted in Figure 3. For a speech signal sampled at 8000 samples per second, 25 ms corresponds to 200 samples.

Figure 3: Speech partitioned into approximately stationary parts

The phonemes in American English are depicted in Figure 4.

Figure 4: Phonemes in American English

The phonemes are divided into two classes, voiced and unvoiced, corresponding to the mode of excitation. Voiced sounds are formed when the vocal cords vibrate in the air stream that comes through the glottis, and the quasi-periodic pulses produced in this way excite the vocal tract. For unvoiced sounds the vocal cords are inactive, and the excitation is the result of turbulence caused as the air stream passes through some constriction in the vocal tract. In Figure 4 the voiced phonemes are shaded.


In Figure 5 (upper part) the amplitude plot of the voiced phoneme "a" is depicted. If an ME filter is fitted to the frame of speech data in Figure 5 and the speech data is filtered by the inverse of this filter, in an attempt to determine the excitation signal, the amplitude plot in the lower part of Figure 5 is obtained. The characteristic pulse-train character of the excitation of voiced phonemes is evident, and this suggests that the shaping effect of the vocal tract is modeled well by the ME filter.

The spectral properties of the voiced phoneme "a" are depicted in the upper part of Figure 6 using a periodogram estimate, and the lower part depicts the periodogram of the inverse filtered signal.

Figure 5: Amplitude plots for a voiced "a", and an inverse filtered version

The spectral envelope of the periodogram is matched by the ME filter, and from the inverse filtered version it can be seen that the spectral envelope is rather flat.

Figure 6: Periodogram estimates of the spectrum for a voiced "a", and an inverse filtered version.

All whispered sounds are unvoiced, since the vocal cords are inactive, and thus an unvoiced version of the phoneme "a" can be formed. A similar analysis as for the voiced "a" is carried out and depicted in Figures 7 and 8. Note that the amplitude plots are much more irregular, and that there are no spikes in the time-domain representation of the inverse filtered signal in the lower part of Figure 7. Note also that the spectral envelopes of the voiced "a" in Figure 6 and the unvoiced "a" in Figure 8 are quite similar.

Figure 7: Amplitude plots for an unvoiced "a", and an inverse filtered version

Figure 8: Periodogram estimates of the spectrum for an unvoiced "a", and an inverse filtered version.

Speech can be synthesized using a speech production model as depicted in Figure 9. The model of the vocal tract is considered next.

4.1 Acoustic Tube Modeling

If one tries to model the physical apparatus that we use to generate speech, a model for the vocal tract has to be formed. Under the assumption that the nasal tract is closed off, the vocal tract is the area between the vocal folds and the lips, as depicted in Figure 10. The anatomical components determining the vocal tract area are the lips, the jaw, the tongue, and the velum. This area can be modeled by a sequence of acoustic tubes as in Figure 11, and is similar to the acoustic tube model Atal computed in 1970.

Figure 9: Speech production model (a glottal pulse generator, controlled by the pitch period, for voiced sounds and a random noise generator for unvoiced sounds, selected by a voiced/unvoiced switch, drive the vocal tract model)

4.2 Lossless Tube Equations

Outlining the derivation of the acoustic tube model given in [17], we first consider the two laws of physics that govern the sound wave propagation. The pressure $p$ and volume velocity $u$ of the sound waves in a tube satisfy the momentum equation

$$\frac{\partial p}{\partial x}(x, t) = -\frac{\rho}{A} \frac{\partial u}{\partial t}(x, t),$$

and the continuity of mass equation

$$\frac{\partial u}{\partial x}(x, t) = -\frac{A}{\rho c^2} \frac{\partial p}{\partial t}(x, t),$$

where $A$ is the cross-section area, $\rho$ is the air density and $c$ is the speed of sound.

If these two laws of physics are differentiated and combined, they form the following two wave equations

$$\frac{\partial^2 p}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 p}{\partial t^2}, \qquad \frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 u}{\partial t^2},$$

whose solutions are superpositions of a left-going and a right-going wave in the volume velocity and the pressure,

$$u(x, t) = u^+(t - x/c) - u^-(t + x/c),$$

$$p(x, t) = p^+(t - x/c) + p^-(t + x/c).$$

Figure 10: Vocal tract.

The solution for a single tube is combined for a number of tubes, using the continuity of the pressure and the volume velocity at the junction of two tubes to connect the solutions. Continuity of the pressure and volume velocity gives

$$u^+_{m-1}(t + T) = \gamma_m u^-_{m-1}(t - T) + (1 + \gamma_m)\, u^+_m(t - T),$$

$$u^-_m(t + T) = (1 - \gamma_m)\, u^-_{m-1}(t - T) - \gamma_m\, u^+_m(t - T),$$

where $T = L/(4c)$ is the time for the sound wave to cover half the length of a tube and

$$\gamma_m = \frac{A_{m-1} - A_m}{A_{m-1} + A_m} \tag{19}$$

is a reflection coefficient. The reflection coefficients $\gamma_m$ are actually the Schur parameters, and from (19) it follows naturally that $\gamma_m \in (-1, 1)$. Now let

$$y^+_m(t) = c_m u^+_m(t + T - t_m), \qquad y^-_m(t) = -c_m u^-_m(t - T - t_m),$$

where $t_m = 2(m+1)T$ and $c_m = \prod_{i=1}^{m}(1 + \gamma_i)$.
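The map (19) from cross-section areas to reflection coefficients is a one-liner, and it makes the Schur-parameter constraint self-evident: for positive areas every $\gamma_m$ lies in $(-1, 1)$. An illustrative sketch (our own naming):

```python
def reflection_coefficients(areas):
    """Reflection coefficients (19) of a lossless acoustic-tube model:
    gamma_m = (A_{m-1} - A_m) / (A_{m-1} + A_m).

    areas -- positive cross-section areas [A_0, A_1, ..., A_n].
    For positive areas every coefficient lies in (-1, 1).
    """
    return [(a_prev - a_next) / (a_prev + a_next)
            for a_prev, a_next in zip(areas, areas[1:])]
```

For example, the areas $(3, 1, 1, 3)$ give $\gamma = (0.5, 0, -0.5)$.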

Figure 11: Lossless tube model.

Taking the z-transform, with $z$ being a forward shift of $4T$, we obtain

$$Y^+_m(z) = Y^+_{m-1}(z) + \gamma_m Y^-_{m-1}(z),$$

$$z\, Y^-_m(z) = \gamma_m Y^+_{m-1}(z) + Y^-_{m-1}(z),$$

which is a form of the Levinson recursion for the maximum entropy filter.

If we further assume that the model is excited by an air flow $u_0(t)$ at the vocal folds, and combine the equations for a sequence of $n$ such tubes, it can be shown that the output at the lips is given by $(1/a_n(z))\, u_0(z)$.

If the excitation is turbulent air flow, it can be regarded as white noise, and for a fixed vocal tract model the output process is a stationary stochastic process. For a slowly changing vocal tract, the model has to be updated as time goes by.

To generate speech using this model, we have to identify the vocal tract parameters and excite the filter $1/a_n(z)$ with white noise.

4.3 The Effects of Nasal Coupling

In the last section it was indicated that speech could be modeled well by an all-pole model. However, the analysis was made under the assumptions that there is no nasal coupling and that the excitation is located at the glottis, among other things. As the phoneme "m" is pronounced, the mouth is closed and the whole stream of air escapes through the nostrils, as depicted in Figure 13. The closed oral cavity can trap energy at certain frequencies, and thus for nasal sounds the vocal system transfer function is characterized by anti-resonances as well as resonances [17, 9, 7]. With the mouth open we get a superposition of the two models. There is thus a physical motivation for including zeros in the shaping filter. How to incorporate zeros into the shaping filter design methods has been the main consideration of this thesis.

Figure 12: The waves at a junction (the incoming and outgoing waves $u^{\pm}$ at the junction between the $(m-1)$:th and $m$:th sections, with areas $A_{m-1}$ and $A_m$).

5 Foliations, Transversality and Local Coordinates

The realization problem can, as described above, be considered as the problem of mapping some set of (measured) data to a family of shaping filters. This map should be a homeomorphism, i.e., onto, one-to-one and continuous with a continuous inverse. Here we shall assume that the map is a diffeomorphism. The data set will then form a "complete parameterization" of the family of shaping filters.

To properly define what we mean by a parameterization, and to introduce some of the concepts used in Papers B and C, a few definitions from differential geometry [18, 8] will be introduced in the context of the class of problems under study.

Definition 5.1 A subset $X \subset \mathbb{R}^n$ is called a smooth manifold of dimension $k$ if each point $x \in X$ has a neighborhood $N_x \cap X$ that is diffeomorphic to an open subset $U$ of the Euclidean space $\mathbb{R}^k$. A diffeomorphism $g : U \to N_x \cap X$ is called a parameterization of the region $N_x \cap X$, and the inverse diffeomorphism $g^{-1} : N_x \cap X \to U$ is called a system of (local) coordinates.

Figure 13: The nasal coupling

For an example we consider the set

$$\mathcal{S}_n \triangleq \{\text{monic Schur polynomials of order } n\},$$

which in fact is a smooth manifold of dimension $n$. This follows from the fact that any polynomial in $\mathcal{S}_n$ is a Szegő polynomial $\varphi_n(z)$ for some choice of Schur parameters $\gamma_0, \gamma_1, \dots, \gamma_{n-1}$ with $(\gamma_0, \dots, \gamma_{n-1}) \in U = (-1, 1)^n$, and the map from Schur parameters to Szegő polynomials is a diffeomorphism, as seen from (16) and (17). Further, it also follows that the set of Schur parameters forms coordinates for the set $\mathcal{S}_n$.

Definition 5.2 Let $M^n$ be an $n$-manifold and let $\mathcal{F} = \{F_\alpha\}$ denote a partition of $M^n$ into disjoint path-connected subsets. Then $\mathcal{F}$ is called a foliation of $M^n$ of codimension $k$ (with $0 < k < n$) if there exists a cover of $M^n$ by open sets $U$, each equipped with a homeomorphism $h : U \to \mathbb{R}^n$ which maps each nonempty component of $F_\alpha \cap U$ onto a parallel translation of the standard hyperplane $\mathbb{R}^{n-k}$ in $\mathbb{R}^n$. Each $F_\alpha$ is then called a leaf.

A simple example based on (7) is now given; it is of a form similar to the problems considered in Papers B and C, and hence will serve as a good introduction to the foliation concepts in the present context. Define the manifolds

$$\Sigma_n \triangleq \{\text{monic polynomials of order } n\}$$

of dimension $n$ and

$$\mathcal{Q}_n \triangleq \mathcal{S}_n \times \Sigma_n$$

of dimension $2n$. Then a pair of monic polynomials $\sigma(z)$ and $a(z)$, such that $w(z) = \sigma(z)/a(z)$ is a transfer function with steady-state gain $w_0 = 1$, can be considered to be an element $(a, \sigma) \in \mathcal{Q}_n$. Define the map

$$\rho : \mathcal{Q}_n \longrightarrow \mathbb{R}^n, \qquad (a, \sigma) \mapsto \begin{bmatrix} w_1 & \cdots & w_n \end{bmatrix}^{\top}, \tag{20}$$

where $w_k = \langle \sigma(z)/a(z),\, z^{-k} \rangle$ are the Markov parameters of the filter corresponding to $(a, \sigma)$. Here $\langle \cdot, \cdot \rangle$ denotes the inner product in $L_2[-\pi, \pi]$. Now the $n$-dimensional manifolds

$$\mathcal{Q}_n(w) \triangleq \{(a, \sigma) \in \mathcal{Q}_n \mid \rho(a, \sigma) = w\} = \rho^{-1}(w)$$

are the leaves of a foliation of $\mathcal{Q}_n$: each $\mathcal{Q}_n(w)$ is path-connected, the subsets $\mathcal{Q}_n(w)$ form a disjoint partition of $\mathcal{Q}_n$, i.e., $\mathcal{Q}_n = \cup_{w \in \mathbb{R}^n} \mathcal{Q}_n(w)$ with $\mathcal{Q}_n(w^{(1)}) \cap \mathcal{Q}_n(w^{(2)}) = \emptyset$ if $w^{(1)} \neq w^{(2)}$, and the map $(a, \sigma) \mapsto a$ restricts to a homeomorphism between each leaf $\mathcal{Q}_n(w)$ and $\mathcal{S}_n$.

For the case $n = 1$, it is clear that if $a(z) = z + a$ and $\sigma(z) = z + \sigma$, then $\mathcal{Q}_1 \cong \{(a, \sigma) \mid a \in (-1, 1),\ \sigma \in \mathbb{R}\}$. Using that $w_1 = \sigma - a$, it is easy to see that the leaves $\mathcal{Q}_1(w_1)$ form lines as depicted in Figure 14.

Definition 5.3 Two manifolds intersect transversely if they do not have a common tangent plane at the intersection.

Two foliations $\mathcal{F}$ and $\mathcal{G}$ are complementary if any leaf $F_\alpha \in \mathcal{F}$ and any leaf $G_\beta \in \mathcal{G}$ always intersect exactly once, and if the intersection of any leaf $F_\alpha \in \mathcal{F}$ and any leaf $G_\beta \in \mathcal{G}$ is transverse.

As an example, consider the following second foliation. Define the map

$$\rho_a : \mathcal{Q}_n \longrightarrow \mathbb{R}^n, \qquad (a, \sigma) \mapsto \begin{bmatrix} a_1 & \cdots & a_n \end{bmatrix}^{\top}, \tag{21}$$

Figure 14: Foliation of $\mathcal{Q}_n$ with the leaves $\mathcal{Q}_n(w)$ for $n = 1$.

where $a_k$ are the coefficients of the monic Schur polynomial $a(z)$. Now the $n$-dimensional manifolds

$$\mathcal{Q}_n(a) \triangleq \{(a, \sigma) \in \mathcal{Q}_n \mid \rho_a(a, \sigma) = a\} = \rho_a^{-1}(a)$$

are the leaves of a foliation of $\mathcal{Q}_n$: each $\mathcal{Q}_n(a)$ is path-connected, $\mathcal{Q}_n = \cup_{a \in \mathcal{S}_n} \mathcal{Q}_n(a)$ with $\mathcal{Q}_n(a^{(1)}) \cap \mathcal{Q}_n(a^{(2)}) = \emptyset$ if $a^{(1)} \neq a^{(2)}$, and the map $(a, \sigma) \mapsto \sigma$ restricts to a homeomorphism between each leaf $\mathcal{Q}_n(a)$ and $\Sigma_n$. Considering the case $n = 1$ again, it is obvious that the leaves $\mathcal{Q}_1(a)$ form vertical lines as depicted in Figure 15.

Figure 15: Foliation of $\mathcal{Q}_n$ with the leaves $\mathcal{Q}_n(a)$ for $n = 1$.

Consider the two foliations $\mathcal{Q}_n(w)$ and $\mathcal{Q}_n(a)$. Then $w$ and $a$ determine $\sigma$ uniquely by (7), and thus the intersection is unique. Further, the tangent spaces of the two foliations are needed in order to show that the intersection is transverse. The tangent space of $\mathcal{Q}_n$ at $(a, \sigma)$ is given by the pairs of polynomials $(u, v)$, where

$$u(z) = u_1 z^{n-1} + \cdots + u_n \quad \text{and} \quad v(z) = v_1 z^{n-1} + \cdots + v_n.$$

Then it is easy to see that the directional derivative

$$D_{(u,v)}\rho_a = \lim_{\epsilon \to 0} \frac{1}{\epsilon}\bigl(\rho_a(a + \epsilon u, \sigma + \epsilon v) - \rho_a(a, \sigma)\bigr) = u \tag{22}$$

is zero if and only if $u = 0$; thus

$$T_{(a,\sigma)} \mathcal{Q}_n(a) = \{(0, v) \mid \deg v \le n - 1\}.$$

For the foliation with the $n$ first Markov parameters fixed, the tangent vector $(u, v)$ must satisfy $D_{(u,v)}\rho = 0$, i.e.,

$$D_{(u,v)}\rho = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \left\langle \frac{\sigma + \epsilon v}{a + \epsilon u} - \frac{\sigma}{a},\ z^k \right\rangle = \left\langle \frac{a v - \sigma u}{a^2},\ z^k \right\rangle = 0, \tag{23}$$

for $k = 0, 1, \dots, n$. It follows that

$$a v - \sigma u = a^2 \sum_{k=n+1}^{\infty} w_k z^{-k}, \tag{24}$$

and since the left-hand side is a polynomial, the right-hand side has to be a polynomial too. Further, the right-hand side of (24) has degree less than or equal to $n - 1$, and therefore it follows that

$$T_{(a,\sigma)} \mathcal{Q}_n(w) = \{(u, v) \mid a v - \sigma u = r,\ \deg r \le n - 1\}.$$

Now, to determine if the foliations are transverse, we assume that $(u, v)$ is a tangent vector of both foliations, i.e., $(u, v) \in T_{(a,\sigma)}\mathcal{Q}_n(a) \cap T_{(a,\sigma)}\mathcal{Q}_n(w)$. This holds if and only if $u = 0$ and $r = a v$ has degree less than or equal to $n - 1$, which is impossible unless $(u, v) = (0, 0)$, and this establishes the transversality.

Finally, we have thus shown that $\mathcal{Q}_n(a)$ and $\mathcal{Q}_n(w)$ form complementary foliations. Further, the function

$$\tau : \mathcal{Q}_n \to \mathbb{R}^{2n}, \qquad (a, \sigma) \mapsto (\rho(a, \sigma), \rho_a(a, \sigma)) = (w_1, \dots, w_n, a_1, \dots, a_n)$$

determines a parameterization of $\mathcal{Q}_n$, and $\tau^{-1}$ determines a system of coordinates.

Considering the case $n = 1$ once again, the complementarity of the two foliations is clear from the fact that all the lines that form the leaves of the two foliations intersect at a nonzero angle, as depicted in Figure 16. It is also clear that an element in $\mathcal{Q}_1$ is determined uniquely by the coordinates $(w_1, a_1)$.

Figure 16: The two complementary foliations of $\mathcal{Q}_n$, with the leaves $\mathcal{Q}_n(w)$ and $\mathcal{Q}_n(a)$, for $n = 1$.

6 Summary of the Papers

The four papers constituting the thesis are:

A: A homotopy approach to rational covariance extension with degree constraint (submitted for publication to International Journal of Applied Mathematics and Computer Science, special MTNS2000 issue)

B: Cepstral coefficients, covariance lags and pole-zero models for finite data strings, coauthored with Professor Anders Lindquist and Professor Chris Byrnes (to appear in IEEE Transactions on Signal Processing, April 2001)

C: Identifiability and well-posedness of shaping-filter parameterizations: A global analysis approach, coauthored with Professor Anders Lindquist and Professor Chris Byrnes (submitted for publication to SIAM Journal on Control and Optimization)

D: A convex optimization approach to ARMA(n,m) model design from covariance and cepstrum data

We next describe the contents of the papers in more detail.

6.1 Paper A: A homotopy approach to rational covariance extension with degree constraint

A convex optimization problem was formulated in [3] for solving the RCEP. A robust algorithm for solving this optimization problem is the contribution of this paper. The algorithm is based on a continuation method, and uses a change of variables to avoid spectral factorizations and the numerical ill-conditioning occurring in the original formulation for some parameter values.

6.2 Paper B: Cepstral coefficients, covariance lags and pole-zero models for finite data strings

In this paper the parameterization of the RCEP presented in [4] is described in the context of cepstral analysis and homomorphic filtering. Further, it is shown that there is a natural extension of the optimization problem formulated in [3] to incorporate cepstral parameters as a parameterization of zeros. The extended optimization problem is also convex and, in fact, it is shown that a window of covariances and cepstral lags form local coordinates for ARMA models of order $n$.

6.3 Paper C: Identifiability and well-posedness of shaping-filter parameterizations: A global analysis approach

The geometry of shaping filters is analyzed by considering parameterizations using various combinations of poles, zeros, covariance lags, cepstral lags and Markov parameters. Assuming there is an underlying system that is stable and minimum phase, it is shown in this paper that there is a one-to-one correspondence between Markov parameters and cepstral coefficients. An approach based on simultaneous Markov and covariance parameter interpolation has been studied by Skelton et al. [15, 16]. In this paper it is studied from a global analysis approach.

6.4 Paper D: A convex optimization approach to ARMA(n,m) model design from covariance and cepstrum data

The final paper deals with a regularization of two filter design methods, namely the covariance and cepstral matching ARMA design method and covariance matching for MA filters. Duality theory from mathematical programming is used to derive these methods; thus the filter design methods are posed as optimization problems. In order to achieve strictly minimum phase solutions, a barrier term is introduced to define a set of regularized optimization problems. As a result of the regularization, exact interpolation is traded for a gain in entropy, and the map from data to filter defined by the optimization problems is turned into a diffeomorphism.

References

[1] Aoki, M. State Space Modeling of Time Series. Springer-Verlag, 1987.

[2] Burg, J. Maximum Entropy Spectral Analysis. PhD thesis, Stanford University, 1975.

[3] Byrnes, C., Gusev, S., and Lindquist, A. A convex optimization approach to the rational covariance extension problem. SIAM Journal on Control and Optimization 37, 1 (1999), 211-229.

[4] Byrnes, C., Lindquist, A., Gusev, S., and Matveev, A. A complete parametrization of all positive rational extensions of a covariance sequence. IEEE Trans. Automatic Control 40, 11 (1995), 1841-1857.

[5] Caines, P. Linear Stochastic Systems. J. Wiley, 1987.

[6] Chen, C.-T. Linear System Theory and Design. Oxford University Press, 1984.

[7] Deller, J., Proakis, J., and Hansen, J. Discrete-Time Processing of Speech Signals. Prentice Hall, 1987.

[8] Doolin, B., and Martin, C. Introduction to Differential Geometry for Engineers. Marcel Dekker, 1990.

[9] Fant, G. Acoustic Theory of Speech Production. Mouton & Co., 1960.

[10] Georgiou, T. Partial Realization of Covariance Sequences. PhD thesis, University of Florida, 1983.

[11] Georgiou, T. Realization of power spectra from partial covariance sequences. IEEE Trans. Acoustics, Speech and Signal Processing ASSP-35 (1987), 438-449.

[12] Kalman, R. Realization of covariance sequences. In Toeplitz Memorial Conference (Tel Aviv, Israel, 1981).

[13] Kalman, R., Falb, P., and Arbib, M. Topics in Mathematical System Theory. McGraw-Hill, New York, 1969.

[14] Kimura, H. Positive partial realization of covariance sequences. In Modelling, Identification and Robust Control. Elsevier Science Publishers, 1986, pp. 499-513.

[15] King, A., Desai, U., and Skelton, R. A generalized approach to q-Markov covariance equivalent realizations for discrete systems. Automatica 24, 4 (1988), 507-515.

[16] Liu, K., and Skelton, R. A new formulation of Q-Markov covariance equivalent realization. Applied Mathematics and Computation 53 (1993), 83-95.

[17] Markel, J., and Gray, A., Jr. Linear Prediction of Speech. Springer-Verlag, 1976.

[18] Milnor, J. Topology from the Differentiable Viewpoint. Princeton University Press, 1997.

[19] Porat, B. Digital Processing of Random Signals: Theory & Methods. Prentice Hall, 1994.

[20] Rugh, W. Linear System Theory. Prentice Hall, 1996.

[21] Schur, I. On power series which are bounded in the interior of the unit circle I and II. Journal für die reine und angewandte Mathematik 148 (1918), 122-145.

[22] Shannon, C., and Weaver, W. The Mathematical Theory of Communication. University of Illinois Press, 1949.

[23] Wu, N. The Maximum Entropy Method. Springer-Verlag, 1997.