Blind Subspace System Identification with Riemannian Optimization

Cassiano Becker, Victor Preciado

Department of Electrical and Systems Engineering, University of Pennsylvania

presented at the

2017 American Control Conference

May 24, 2017


Motivation

System Identification

uses input and output samples to find a dynamic model for a system of interest.

What if we do not have access to the input samples themselves, but only to partial input information?

Input Parametrization

The inputs u(k) ∈ R^m for k = 0, …, L−1 are assumed to be represented as

u(k) = Q(k) z,

where Q(k) ∈ R^{m×d} is known and z ∈ R^d is unknown.

For example (event kernel)

Suppose we want to express the inputs {[u(k)]_l}_{k=0}^{L−1} for an input channel l as the superposition of unknown stereotyped time-courses (or event kernels) z_j ∈ R^{d_j}, associated with j = 1, …, r event types.

The known information consists of the event onsets k_i and the kernel lengths d_j.

Input Encoding Example

Consider the set of inputs {u(k)}_{k=0}^{L−1}, where u(k) = Q(k) z, with

one input channel: u(k) ∈ R^1,

two input kernels, z_1 and z_2, with d_1 = 3 and d_2 = 4.

It can be encoded as

\[
\begin{bmatrix} u(0) \\ u(1) \\ \vdots \\ u(L-1) \end{bmatrix}
=
\begin{bmatrix} Q(0)z \\ Q(1)z \\ \vdots \\ Q(L-1)z \end{bmatrix}
=
S
\begin{bmatrix} z_1(0) \\ \vdots \\ z_1(d_1-1) \\ z_2(0) \\ \vdots \\ z_2(d_2-1) \end{bmatrix},
\]

where S, the stack of the rows Q(0), …, Q(L−1), is a binary selection matrix that places each kernel's samples at its event onsets (overlapping events add their contributions).
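As a concrete illustration of the encoding above, the following sketch builds the selection matrices Q(k) for a single input channel with two event kernels of lengths 3 and 4; the event onsets and kernel values are made up for the example and are not taken from the slides.

```python
import numpy as np

# One input channel (m = 1), two event kernels with lengths d_1 = 3, d_2 = 4.
L = 12
d = [3, 4]
# Hypothetical event onsets k_i for each event type (chosen only for illustration).
onsets = {0: [1, 6], 1: [4]}

def build_Q(k):
    """Known selection matrix Q(k) in R^{1 x (d_1 + d_2)} for time k."""
    Q = np.zeros((1, sum(d)))
    offset = 0
    for j, dj in enumerate(d):
        for k0 in onsets[j]:
            if k0 <= k < k0 + dj:
                # sample (k - k0) of kernel j contributes to u(k)
                Q[0, offset + (k - k0)] += 1.0
        offset += dj
    return Q

# Unknown kernel values stacked into z = [z_1; z_2] (made-up numbers).
z = np.concatenate([[1.0, 0.5, 0.25], [0.3, 0.6, 0.9, 0.6]])
# The inputs are then the superposition of the kernels placed at their onsets.
u = np.array([(build_Q(k) @ z).item() for k in range(L)])
```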

Problem Statement

We consider an unknown discrete LTI system

Σ = (A ∈ R^{n×n}, B ∈ R^{n×m}, C ∈ R^{p×n}, D ∈ R^{p×m}).

Given

output measurements {y(k) ∈ R^p}_{k=0}^{L−1};

partial input information {Q(k) ∈ R^{m×d}}_{k=0}^{L−1}.

Find

input estimates {û(k) = Q(k) ẑ}_{k=0}^{L−1} obtained from ẑ ∈ R^d; and

a linear state-space representation¹ Σ_T = (A_T, B_T, C_T, D_T) with initial state x_T(0)

such that ∑_{k=0}^{L−1} ‖y(k) − ŷ(k)‖_2^2 is minimized, where ŷ(k) denotes the output of the identified model driven by û(k) from x_T(0).

¹ up to an invertible transformation of the state, i.e., x_T(k) = T x(k)

Overview

Subspace methods provide reliable algorithms for discrete state-space LTI identification based on input-output measurements arranged in a linear matrix equation.

[Figure: the data equation Y_{s,N} = O_s X_N + T_s U_{s,N} and the projection Π⊥_{U_{s,N}} removing the input contribution]

The structure in the linear matrix equation can be exploited to allow for a partially unknown input parametrization.

We formulate the joint input-system identification as a low-rank matrix approximation problem, and use Riemannian optimization on fixed-rank matrix manifolds.

[Figure: Riemannian optimization on a fixed-rank matrix manifold M_k^{m×n} embedded in R^{m×n}, with tangent space, negative gradient −∇_m f, projection π_M, and retraction r_M]

Roadmap

1 Introduction

2 Subspace System Identification (SSID)

3 Riemannian Blind Subspace System Identification (RBSID)

4 Experimental Results

5 Future Research


Data Equation (1 of 2)

The output at time k due to an initial condition x(0) and inputs u(i), i = 0, …, k−1, satisfies

x(k) = A^k x(0) + ∑_{i=0}^{k−1} A^{k−i−1} B u(i)   and   y(k) = C x(k) + D u(k).

We can write in matrix form the outputs observed from s samples at times 0, …, s−1:

\[
\underbrace{\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ \vdots \\ y(s-1) \end{bmatrix}}_{Y_{0,s}}
=
\underbrace{\begin{bmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{s-1} \end{bmatrix}}_{\mathcal{O}_s} x(0)
+
\underbrace{\begin{bmatrix}
D & 0 & 0 & \cdots & 0 \\
CB & D & 0 & & 0 \\
CAB & CB & D & & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
CA^{s-2}B & CA^{s-3}B & CA^{s-4}B & \cdots & D
\end{bmatrix}}_{\mathcal{T}_s}
\underbrace{\begin{bmatrix} u(0) \\ u(1) \\ u(2) \\ \vdots \\ u(s-1) \end{bmatrix}}_{U_{0,s}}
\]

where O_s ∈ R^{sp×n} is an s × 1 block matrix with each block ⟦O_s⟧_{i,j} ∈ R^{p×n}, and T_s ∈ R^{sp×sm} is an s × s block matrix with each block ⟦T_s⟧_{i,j} ∈ R^{p×m}.

Data Equation (2 of 2)

We horizontally concatenate N equations for Y_{i,s} starting at times i = 0, …, N−1 and define Y_{s,N} ∈ R^{sp×N}:

\[
Y_{s,N} := \begin{bmatrix} Y_{0,s} & Y_{1,s} & \cdots & Y_{N-1,s} \end{bmatrix}
=
\begin{bmatrix}
y(0) & y(1) & \cdots & y(N-1) \\
y(1) & y(2) & \cdots & y(N) \\
\vdots & \vdots & \ddots & \vdots \\
y(s-1) & y(s) & \cdots & y(N+s-2)
\end{bmatrix}
\]

an s × N block matrix with each block ⟦Y_{s,N}⟧_{i,j} ∈ R^{p×1}.

By defining U_{s,N} ∈ R^{sm×N} correspondingly, and X_N ∈ R^{n×N} such that

X_N = [x(0) x(1) … x(N−1)],

we can write the important data equation:

Y_{s,N} = O_s X_N + T_s U_{s,N}

We note that the term O_s X_N has rank n.
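The data equation is easy to check numerically. The sketch below simulates a small random LTI system (sizes chosen arbitrarily), assembles O_s, T_s and the block Hankel matrices, and verifies Y_{s,N} = O_s X_N + T_s U_{s,N} and rank(O_s X_N) = n.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 2, 1, 1            # state, input, and output dimensions (illustrative)
s, N = 4, 10
L = N + s - 1                # samples needed: k = 0, ..., N + s - 2

A = rng.standard_normal((n, n))
A /= 1.5 * max(abs(np.linalg.eigvals(A)))        # keep the state bounded
B, C, D = (rng.standard_normal(sz) for sz in [(n, m), (p, n), (p, m)])

# Simulate x(k+1) = A x(k) + B u(k), y(k) = C x(k) + D u(k).
u = rng.standard_normal((L, m))
x = np.zeros((L, n)); x[0] = rng.standard_normal(n)
y = np.zeros((L, p))
for k in range(L):
    y[k] = C @ x[k] + D @ u[k]
    if k + 1 < L:
        x[k + 1] = A @ x[k] + B @ u[k]

# Extended observability matrix O_s and block Toeplitz T_s.
Os = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(s)])
Ts = np.zeros((s * p, s * m))
for i in range(s):
    for j in range(i + 1):
        block = D if i == j else C @ np.linalg.matrix_power(A, i - j - 1) @ B
        Ts[i*p:(i+1)*p, j*m:(j+1)*m] = block

# Block Hankel matrices Y_{s,N}, U_{s,N} and the state sequence X_N.
hankel = lambda w, dim: np.hstack([w[t:t+s].reshape(s * dim, 1) for t in range(N)])
Ys, Us, XN = hankel(y, p), hankel(u, m), x[:N].T

# Data equation Y_{s,N} = O_s X_N + T_s U_{s,N}; the first term has rank n.
assert np.allclose(Ys, Os @ XN + Ts @ Us)
assert np.linalg.matrix_rank(Os @ XN) == n
```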

Subspace Methods (for known inputs)

We estimate the range of O_s by removing the effect of the inputs with

Π⊥_{U_{s,N}} = I_N − U_{s,N}^T (U_{s,N} U_{s,N}^T)^{−1} U_{s,N}

being post-multiplied to the data equation, getting

Y_{s,N} Π⊥_{U_{s,N}} = O_s X_N Π⊥_{U_{s,N}}.

Decomposing U_n Σ_n V_n^T := Y_{s,N} Π⊥_{U_{s,N}} and defining T := X_N Π⊥_{U_{s,N}} V_n Σ_n^{−1}, we can express the range

U_n = Y_{s,N} Π⊥_{U_{s,N}} V_n Σ_n^{−1} = O_s X_N Π⊥_{U_{s,N}} V_n Σ_n^{−1} = O_s T,

which implies

\[
U_n = \mathcal{O}_s T =
\begin{bmatrix} CT \\ CT(T^{-1}AT) \\ \vdots \\ CT(T^{-1}AT)^{s-1} \end{bmatrix}
=:
\begin{bmatrix} C_T \\ C_T A_T \\ \vdots \\ C_T A_T^{s-1} \end{bmatrix}.
\]
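A minimal numerical check of this projection step, using synthetic matrices of compatible sizes rather than data from a real system: the projector annihilates the input term, and the leading left singular vectors of Y_{s,N} Π⊥ span the same column space as O_s.

```python
import numpy as np

rng = np.random.default_rng(1)
# Shapes consistent with the data equation (all values synthetic).
n, sp, sm, N = 2, 8, 4, 40
Os, XN = rng.standard_normal((sp, n)), rng.standard_normal((n, N))
Ts, Us = rng.standard_normal((sp, sm)), rng.standard_normal((sm, N))
Ys = Os @ XN + Ts @ Us                       # Y_{s,N} = O_s X_N + T_s U_{s,N}

# Orthogonal projector onto the null space of U_{s,N} (needs U_{s,N} full row rank).
Pi = np.eye(N) - Us.T @ np.linalg.solve(Us @ Us.T, Us)
assert np.allclose(Us @ Pi, 0)               # the input term is annihilated

Yp = Ys @ Pi                                 # equals O_s X_N Pi, which has rank n
U_, S_, _ = np.linalg.svd(Yp)
Un = U_[:, :n]                               # orthonormal basis for range(O_s)
# The column spaces of Un and Os coincide, so stacking them does not raise the rank.
assert np.linalg.matrix_rank(np.hstack([Un, Os])) == n
```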

RBSID - Approach

Issue: We cannot define the projection Π⊥_{U_{s,N}} without knowledge of {u(k)}_{k=0}^{L−1}.

Strategy: Leverage structural knowledge (low-rankness) in the data equation

Y_{s,N} = O_s X_N + T_s U_{s,N}.

Approach

1 parametrize the inputs u(k) = Q(k) z

2 apply a transformation on T_s U_{s,N} to reveal low-rank structure

3 formulate the problem as a low-rank approximation problem

4 use Riemannian optimization to estimate the low-rank matrices

5 apply a realization algorithm on O_s X_N to recover the system matrices, and an SVD on the introduced variable W(z) to recover the inputs

Transformation on T_s U_{s,N}

Recall Y_{s,N} = O_s X_N + T_s U_{s,N}, and denote the Toeplitz matrix elements

\[
\mathcal{T}_s =
\begin{bmatrix}
D & 0 & \cdots & 0 \\
CB & D & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
CA^{s-2}B & \cdots & CB & D
\end{bmatrix}
=:
\begin{bmatrix}
H_1 & 0 & \cdots & 0 \\
H_2 & H_1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
H_s & \cdots & H_2 & H_1
\end{bmatrix}
\]

where each H_i ∈ R^{p×m} is a Hankel parameter of the system.

Expand the product T_s U_{s,N} ∈ R^{sp×N} and apply a transformation from [Scobee et al., 2015] on u(k) = Q(k) z to give

\[
\mathcal{T}_s U_{s,N} =
\begin{bmatrix}
H_1 \otimes z^T & 0 & \cdots & 0 \\
H_2 \otimes z^T & H_1 \otimes z^T & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
H_s \otimes z^T & \cdots & H_2 \otimes z^T & H_1 \otimes z^T
\end{bmatrix}
\begin{bmatrix}
\mathrm{vec}(Q(0)^T) & \cdots & \mathrm{vec}(Q(N-1)^T) \\
\mathrm{vec}(Q(1)^T) & \cdots & \mathrm{vec}(Q(N)^T) \\
\vdots & \ddots & \vdots \\
\mathrm{vec}(Q(s-1)^T) & \cdots & \mathrm{vec}(Q(N+s-2)^T)
\end{bmatrix}
=: \mathcal{H}(z)\, \mathcal{Q}_{s,N}
\]
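The factorization T_s U_{s,N} = H(z) Q_{s,N} can be verified numerically. The sketch below uses random H_i, Q(k), and z of small, arbitrary sizes (not the dimensions used in the experiments) and checks that both sides agree.

```python
import numpy as np

rng = np.random.default_rng(2)
p, m, d, s, N = 2, 1, 3, 3, 6                 # small illustrative sizes
L = N + s - 1
z = rng.standard_normal(d)
Qk = [rng.standard_normal((m, d)) for _ in range(L)]      # known Q(k)
u = [Q @ z for Q in Qk]                                   # u(k) = Q(k) z
H = [rng.standard_normal((p, m)) for _ in range(s)]       # Hankel parameters H_1..H_s

# T_s (block lower-triangular Toeplitz) and U_{s,N} (block Hankel of the inputs).
Ts = np.zeros((s * p, s * m))
for i in range(s):
    for j in range(i + 1):
        Ts[i*p:(i+1)*p, j*m:(j+1)*m] = H[i - j]
Us = np.hstack([np.concatenate(u[t:t+s])[:, None] for t in range(N)])

# H(z) (block Toeplitz of H_i (x) z^T) and Q_{s,N} (block Hankel of vec(Q(k)^T)).
# vec(Q^T) with column-major vec equals the row-major flattening of Q.
Hz = np.zeros((s * p, s * m * d))
for i in range(s):
    for j in range(i + 1):
        Hz[i*p:(i+1)*p, j*m*d:(j+1)*m*d] = np.kron(H[i - j], z[None, :])
Qs = np.hstack([np.concatenate([Qk[t + i].reshape(-1) for i in range(s)])[:, None]
                for t in range(N)])

assert np.allclose(Ts @ Us, Hz @ Qs)          # T_s U_{s,N} = H(z) Q_{s,N}
```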

Low-rankness of W(z)

Note that the first block-column of H(z), i.e.

\[
⟦\mathcal{H}(z)⟧_{*,1} =
\begin{bmatrix} H_1 \otimes z^T \\ H_2 \otimes z^T \\ \vdots \\ H_s \otimes z^T \end{bmatrix}
=: W(z) \in \mathbb{R}^{sp \times d},
\]

has rank m.

In particular, for m = 1 we have

\[
W(z) =
\begin{bmatrix} H_1 \otimes z^T \\ H_2 \otimes z^T \\ \vdots \\ H_s \otimes z^T \end{bmatrix}
=
\begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_s \end{bmatrix} z^T,
\]

which has rank 1.

Given W(z) we can retrieve α z and (1/α) H_i by a (Kronecker) SVD, up to a scalar α.
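For the single-input case m = 1, recovering z and the stacked Hankel parameters from W(z) is just a rank-1 SVD, as in this sketch (all values are random placeholders):

```python
import numpy as np

rng = np.random.default_rng(3)
s, p, d = 4, 2, 3
h = rng.standard_normal(s * p)           # stacked Hankel parameters [H_1; ...; H_s], m = 1
z = rng.standard_normal(d)               # unknown kernel vector
W = np.outer(h, z)                       # W(z) = [H_1; ...; H_s] z^T has rank 1

U, S, Vt = np.linalg.svd(W)
h_hat, z_hat = U[:, 0] * S[0], Vt[0]     # rank-1 factors from the SVD
assert np.allclose(np.outer(h_hat, z_hat), W)

# h and z are identified only up to a scalar alpha.
alpha = (h_hat @ h) / (h @ h)
assert np.allclose(h_hat, alpha * h) and np.allclose(z_hat, z / alpha)
```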

Recovery Problem

We are now ready to state our recovery problem.

We know that O_s X_N = Y_{s,N} − T_s U_{s,N} = Y_{s,N} − H(z) Q_{s,N} has rank n.

We know that ⟦H(z)⟧_{*,1} = W(z) has rank m.

These requirements can be expressed as the problem:

find  H ∈ T_s ⊂ R^{sp×sd}
subject to  rank(Y_{s,N} − H Q_{s,N}) = n,
            rank(⟦H⟧_{*,1}) = m.

This is a non-convex feasibility problem.

Riemannian Methods

[Figure: a fixed-rank matrix manifold M_k^{m×n} embedded in R^{m×n}, with its tangent space T_m M_k^{m×n}, the negative gradient −∇_m f, the projection π_M, and the retraction r_M]

Riemannian Optimization Algorithms

First- and second-order algorithms exist for unconstrained optimization on the manifold [Absil et al., 2009].

They offer essentially the same convergence and complexity guarantees as their Euclidean counterparts.

The Manopt toolbox¹ provides modular implementations w.r.t. the manifolds (fixed-rank manifolds included).

These methods require the Euclidean gradient and (optionally) Hessian operators.

¹ www.manopt.org

Solution using Riemannian Optimization

Consider the manifold of fixed-rank matrices

M_k^{m,n} = {X ∈ R^{m×n} : rank(X) = k}.

1 Introduce the variable W ∈ M_m^{sp×d} (rank m).

2 Introduce the slack variable F ∈ M_n^{sp×N} (rank n) such that ‖F − O_s X_N‖_F^2 = ‖F − Y_{s,N} + H Q_{s,N}‖_F^2 → 0.

3 Define the operator L : R^{sp×d} → T_s such that H = L(W).

We can express the fixed-rank matrix approximation problem

minimize_{F ∈ M_n^{sp×N}, W ∈ M_m^{sp×d}}  ‖F − Y_{s,N} + L(W) Q_{s,N}‖_F^2.

Since the problem is now unconstrained on the manifold, Riemannian optimization methods can be applied.

Linear operator L

The linear operator L : R^{sp×d} → T_s ⊂ R^{sp×sd},

\[
\mathcal{L}(W) = \mathcal{L}\!\left(
\begin{bmatrix} H_1 \otimes z^T \\ H_2 \otimes z^T \\ \vdots \\ H_s \otimes z^T \end{bmatrix}\right)
= \mathcal{L}\!\left(
\begin{bmatrix} H_1(z) \\ H_2(z) \\ \vdots \\ H_s(z) \end{bmatrix}\right)
=
\begin{bmatrix}
H_1(z) & 0 & \cdots & 0 \\
H_2(z) & H_1(z) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
H_s(z) & H_{s-1}(z) & \cdots & H_1(z)
\end{bmatrix}
= \mathcal{H},
\]

can be explicitly written as

\[
\mathcal{H} = \mathcal{L}(W) =: \sum_{i=0}^{s} A_i W B_i
= [S_p^0 \,|\, \dots \,|\, S_p^{s-1}] \sum_{i=0}^{s} (e_i \otimes I_{sp}) W (e_i \otimes I_d)^T,
\]

where S_p ∈ R^{sp×sp}, with [S_p]_{ij} = 1 if i − j = p, and [S_p]_{ij} = 0 otherwise.
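A direct way to implement L is to slice W into its s block rows H_i(z) and place them on the block diagonals of H, as in this sketch; the shift-matrix expression above encodes the same operator in a vectorization-friendly form.

```python
import numpy as np

def L_op(W, s, p, d):
    """H = L(W): put the i-th (p x d) block row of W on the i-th block sub-diagonal."""
    blocks = [W[i*p:(i+1)*p, :] for i in range(s)]        # H_1(z), ..., H_s(z)
    H = np.zeros((s * p, s * d))
    for i in range(s):
        for j in range(i + 1):
            H[i*p:(i+1)*p, j*d:(j+1)*d] = blocks[i - j]
    return H

rng = np.random.default_rng(4)
s, p, d = 3, 2, 4
W = rng.standard_normal((s * p, d))
H = L_op(W, s, p, d)
assert H.shape == (s * p, s * d)
assert np.allclose(H[:p, :d], W[:p, :])            # the (1,1) block is H_1(z)
assert np.allclose(H[p:2*p, d:2*d], W[:p, :])      # Toeplitz: it repeats on the diagonal
```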

SNN: n = 2, s = 4, N = 40, σ = 0 [Scobee et al., 2015]

[Figure: original vs. estimated signals (legends: original / estimate)]

RBSID: n = 2, s = 4, N = 40, σ = 0 (our approach)

[Figure: original vs. estimated signals (legends: original / estimate)]

SNN: n = 4, s = 8, N = 160, σ = 0 [Scobee et al., 2015]

[Figure: original vs. estimated signals (legends: original / estimate)]

RBSID: n = 4, s = 8, N = 160, σ = 0 (our approach)

[Figure: original vs. estimated signals (legends: original / estimate)]

SNN: n = 4, s = 8, N = 240, σ = 1e−1 [Scobee et al., 2015]

[Figure: original vs. estimated signals (legends: original / estimate)]

RBSID: n = 4, s = 8, N = 240, σ = 1e−1 (our approach)

[Figure: original vs. estimated signals (legends: original / estimate)]

Comparison


Future Research

Conclusion

Introduced formulation as low-rank matrix approximation

Improved empirical performance in the low-sample regime

Provided one practical example of input parametrization


References I

Absil, P.-A., Mahony, R., and Sepulchre, R. (2009). Optimization Algorithms on Matrix Manifolds. Princeton University Press.

Becker, C. and Preciado, V. (2017). Blind Subspace System Identification with Riemannian Optimization. In 2017 American Control Conference, pages 1474–1480. IEEE.

Scobee, D., Ratliff, L., Dong, R., Ohlsson, H., Verhaegen, M., and Sastry, S. S. (2015). Nuclear Norm Minimization for Blind Subspace Identification (N2BSID). In 2015 54th IEEE Conference on Decision and Control (CDC), pages 2127–2132. IEEE.

Questions?


Transformation over T_s U_{s,N} (1 of 2)

Recall the lower block-triangular Toeplitz matrix T_s ∈ R^{sp×sm} and denote

\[
\mathcal{T}_s =
\begin{bmatrix}
D & 0 & \cdots & 0 \\
CB & D & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
CA^{s-2}B & \cdots & CB & D
\end{bmatrix}
=:
\begin{bmatrix}
H_1 & 0 & \cdots & 0 \\
H_2 & H_1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
H_s & \cdots & H_2 & H_1
\end{bmatrix}
\]

where each H_i ∈ R^{p×m}. Expand the product T_s U_{s,N} ∈ R^{sp×N} as

\[
\mathcal{T}_s U_{s,N} =
\begin{bmatrix}
H_1 u(0) & \cdots & H_1 u(N-1) \\
H_2 u(0) + H_1 u(1) & \cdots & \vdots \\
\vdots & \ddots & \vdots \\
H_s u(0) + \cdots + H_1 u(s-1) & \cdots & H_s u(N-1) + \cdots + H_1 u(N+s-2)
\end{bmatrix}
\]

We then apply the input parametrization u(k) = Q(k) z and a smart transformation from [Scobee et al., 2015].

Transformation over Ts Us,N (2 of 2)

Each block H_i u(k) ∈ R^p can be written

H_i u(k) = vec((H_i u(k))^T) = vec((H_i Q(k) z)^T) = vec(z^T Q(k)^T H_i^T) = (H_i ⊗ z^T) vec(Q(k)^T).

We define T_s U_{s,N} =: H(z) Q_{s,N}, with H(z) ∈ R^{sp×sd} and Q_{s,N} ∈ R^{sd×N}, i.e.

\[
\begin{bmatrix}
H_1 \otimes z^T & 0 & \cdots & 0 \\
H_2 \otimes z^T & H_1 \otimes z^T & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
H_s \otimes z^T & \cdots & H_2 \otimes z^T & H_1 \otimes z^T
\end{bmatrix}
\begin{bmatrix}
\mathrm{vec}(Q(0)^T) & \cdots & \mathrm{vec}(Q(N-1)^T) \\
\mathrm{vec}(Q(1)^T) & \cdots & \mathrm{vec}(Q(N)^T) \\
\vdots & \ddots & \vdots \\
\mathrm{vec}(Q(s-1)^T) & \cdots & \mathrm{vec}(Q(N+s-2)^T)
\end{bmatrix}
\]

Further, denote H_i(z) = H_i ⊗ z^T ∈ R^{p×d}, so that

\[
\mathcal{H}(z) =
\begin{bmatrix}
H_1(z) & 0 & \cdots & 0 \\
H_2(z) & H_1(z) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
H_s(z) & \cdots & H_2(z) & H_1(z)
\end{bmatrix}
\]
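The key identity H_i u(k) = (H_i ⊗ z^T) vec(Q(k)^T) can be checked in a few lines (random sizes and values, for illustration only); note that vec(·) is the column-major vectorization.

```python
import numpy as np

rng = np.random.default_rng(5)
p, m, d = 2, 3, 4
Hi = rng.standard_normal((p, m))      # a Hankel parameter H_i
Q = rng.standard_normal((m, d))       # known Q(k)
z = rng.standard_normal(d)            # unknown vector z

lhs = Hi @ Q @ z                      # H_i u(k) with u(k) = Q(k) z
# vec(Q^T) with column-major vec equals the row-major flattening of Q.
rhs = np.kron(Hi, z[None, :]) @ Q.reshape(-1)
assert np.allclose(lhs, rhs)          # H_i u(k) = (H_i (x) z^T) vec(Q(k)^T)
```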

Fast Calculation of Euclidean Gradient (1 of 2)

\[
\begin{aligned}
h(F, W) &= \|F - Y_{s,N} + \mathcal{L}(W)\mathcal{Q}_{s,N}\|_F^2 \\
&= \|\mathrm{vec}(F - Y_{s,N} + \mathcal{L}(W)\mathcal{Q}_{s,N})\|_2^2 \\
&= \Big\|\mathrm{vec}(F) - \mathrm{vec}(Y_{s,N}) + \mathrm{vec}\Big(\sum_{i=0}^{s} A_i W B_i \mathcal{Q}_{s,N}\Big)\Big\|_2^2 \\
&= \|\mathrm{vec}(F) - \mathrm{vec}(Y_{s,N}) + M\,\mathrm{vec}(W)\|_2^2 \\
&= \|f - y + Mw\|_2^2 =: h(f, w) \qquad (1)
\end{aligned}
\]

where f := vec(F), y := vec(Y_{s,N}), w := vec(W), and

\[
M := \sum_{i=1}^{s} \big((B_i \mathcal{Q}_{s,N})^T \otimes A_i\big)
\]

is defined by applying vec(AXB) = (B^T ⊗ A) vec(X) in the expansion of the linear operator L(W) = Σ_{i=0}^{s} A_i W B_i.

Fast Calculation of Euclidean Gradient (2 of 2)

With these definitions, the Euclidean gradient is quickly obtained as

∇_f h(f, w) = f − y + M w
∇_w h(f, w) = M^T M w + M^T (f − y)

and the matrix gradients can be obtained by applying the inverse vectorization in each case.

Second-order information (for the Hessian) can also be quickly obtained from this form.
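A quick way to sanity-check these expressions is a finite-difference test on the vectorized objective. In the sketch below, M is just a random matrix standing in for the operator matrix defined on the previous slide, and the objective carries a 1/2 factor so that the gradients match the slide's expressions, which drop the constant.

```python
import numpy as np

rng = np.random.default_rng(6)
nf, nw = 30, 12
M = rng.standard_normal((nf, nw))     # stand-in for M = sum_i (B_i Q_{s,N})^T (x) A_i
y = rng.standard_normal(nf)
f, w = rng.standard_normal(nf), rng.standard_normal(nw)

# 1/2 factor so the gradients below match the slide's expressions.
h = lambda f, w: 0.5 * np.sum((f - y + M @ w) ** 2)
grad_f = f - y + M @ w
grad_w = M.T @ M @ w + M.T @ (f - y)

# Directional finite-difference check of both gradients.
eps = 1e-6
df, dw = rng.standard_normal(nf), rng.standard_normal(nw)
fd_f = (h(f + eps * df, w) - h(f - eps * df, w)) / (2 * eps)
fd_w = (h(f, w + eps * dw) - h(f, w - eps * dw)) / (2 * eps)
assert np.isclose(fd_f, grad_f @ df, rtol=1e-4)
assert np.isclose(fd_w, grad_w @ dw, rtol=1e-4)
```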

The Fixed-Rank Manifold

Manifold Parametrization

M_k^{m,n} := {X ∈ R^{m×n} : rank(X) = k} = {U diag(σ) V^T : U ∈ St_k^m, V ∈ St_k^n}

Tangent Space

T_X M_k^{m,n} = {U M V^T + U_p V^T + U V_p^T : M ∈ R^{k×k}; U_p ∈ R^{m×k}, U_p^T U = 0; V_p ∈ R^{n×k}, V_p^T V = 0}.

Projection

Π_{T_X M_k^{m,n}}(X) = P_u X P_v + P_u^⊥ X P_v + P_u X P_v^⊥,

where P_u = U U^T and P_u^⊥ = I − U U^T (and respectively P_v and P_v^⊥).

Retraction

R_X(ξ) = arg min_{Y ∈ M_k^{m,n}} ‖X + ξ − Y‖_F,

computed as R_X(ξ) = Σ_{i=1}^{k} σ_i u_i v_i^T, with u_i, v_i, σ_i from the SVD of X + ξ.
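The retraction is simply the best rank-k approximation of X + ξ, i.e. a truncated SVD. A minimal sketch follows; in practice the SVD can be made cheaper by exploiting the factored form of X and the tangent vector, which is not done here.

```python
import numpy as np

def retract(X, xi, k):
    """Metric projection retraction: best rank-k approximation of X + xi."""
    U, S, Vt = np.linalg.svd(X + xi, full_matrices=False)
    return (U[:, :k] * S[:k]) @ Vt[:k, :]

rng = np.random.default_rng(7)
m, n, k = 8, 6, 2
X = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))   # a rank-k point
xi = 0.1 * rng.standard_normal((m, n))                          # an arbitrary step
Y = retract(X, xi, k)
assert np.linalg.matrix_rank(Y) == k      # the result stays on the manifold
```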

Recovery Problem - Optimization Approaches

In [Scobee et al., 2015] a (double) convex relaxation is proposed:

minimize_{H ∈ T_s ⊂ R^{sp×sd}}  ‖Y_{s,N} − H Q_{s,N}‖_* + λ ‖⟦H⟧_{*,1}‖_*

which we refer to as Sum-of-Nuclear-Norms (SNN).

However, this formulation simultaneously relaxes two structures on H.

Furthermore, recovery depends on choosing the regularization parameter λ.

Proposed approach

We address the problem in the space of fixed-rank matrices via Riemannian optimization, and compare both approaches experimentally.

Decomposition of the Matrix Os

Simpler case. Suppose u = 0, so the response of the system is due to the initial condition x(0) alone.

Given Y_{s,N} = O_s X_N, we decompose it via an SVD, so that

Y_{s,N} = U_n Σ_n V_n^T = O_s X_N.

Right-multiplying by V_n Σ_n^{−1} and defining the matrix T = X_N V_n Σ_n^{−1}, we can express U_n = O_s T.

We now note that U_n is equivalent to an extended observability matrix

\[
U_n = \mathcal{O}_s T =
\begin{bmatrix} CT \\ CT(T^{-1}AT) \\ \vdots \\ CT(T^{-1}AT)^{s-1} \end{bmatrix}
=
\begin{bmatrix} C_T \\ C_T A_T \\ \vdots \\ C_T A_T^{s-1} \end{bmatrix}
\]

given by the matrices A_T, C_T, which are similarity transformations of the matrices A and C, parametrized by T.

Estimation of AT and CT

The matrix U_n can be used to generate estimates of A_T and C_T.

If we take the product U_n A_T, we note that its first to (s−1)-th blocks are equal to the second to s-th blocks of U_n, considering blocks of size p × n:

⟦U_n⟧_{1:s−1} A_T = ⟦U_n⟧_{2:s}

The estimate of A_T can be obtained in closed form as

A_T = ⟦U_n⟧_{1:s−1}^† ⟦U_n⟧_{2:s}.

Similarly, the estimate of C_T is obtained from U_n as

C_T = ⟦U_n⟧_1.
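The shift-invariance estimate is a one-line pseudoinverse. The sketch below builds U_n exactly from a random system and an arbitrary similarity transformation T (so there is no noise) and recovers A_T and C_T; with an estimated, noisy U_n the same formulas give least-squares estimates.

```python
import numpy as np

rng = np.random.default_rng(8)
n, p, s = 2, 1, 5
A = rng.standard_normal((n, n))
C = rng.standard_normal((p, n))
T = rng.standard_normal((n, n))                      # unknown similarity transformation

AT_true, CT_true = np.linalg.solve(T, A @ T), C @ T  # A_T = T^{-1} A T, C_T = C T
# U_n plays the role of the extended observability matrix in transformed coordinates.
Un = np.vstack([CT_true @ np.linalg.matrix_power(AT_true, i) for i in range(s)])

# Shift-invariance: blocks 1..s-1 of U_n times A_T equal blocks 2..s of U_n.
A_T = np.linalg.pinv(Un[:-p, :]) @ Un[p:, :]
C_T = Un[:p, :]
assert np.allclose(A_T, AT_true) and np.allclose(C_T, CT_true)
```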

Estimation of BT , DT and xT (0)

Given y(k) and u(k) and estimates A_T, C_T, one can find estimates for x_T(0), B_T, and D_T as follows. Applying the vec operator to the output equation, we have

\[
y(k) = \mathrm{vec}\big(y(k)\big)
= \mathrm{vec}\Big(C_T A_T^{k} x_T(0) + \sum_{i=0}^{k-1} C_T A_T^{k-i-1} B\, u(i) + D\, u(k)\Big)
= C_T A_T^{k} x_T(0) + \Big(\sum_{i=0}^{k-1} u(i)^T \otimes C_T A_T^{k-i-1}\Big) \mathrm{vec}(B_T) + \big(u(k)^T \otimes I_p\big)\mathrm{vec}(D_T),
\]

which is linear in the variables x_T(0), vec(B_T), and vec(D_T). Defining

\[
\phi(k)^T = \Big[\; C_T A_T^{k} \;\;\; \Big(\sum_{i=0}^{k-1} u(i)^T \otimes C_T A_T^{k-i-1}\Big) \;\;\; \big(u(k)^T \otimes I_p\big) \;\Big]
\]

and

θ^T = [ x_T(0)^T  vec(B_T)^T  vec(D_T)^T ],

one can find θ by solving the standard least-squares problem

minimize_θ  ∑_{k=0}^{N−1} ‖y(k) − φ(k)^T θ‖_F^2
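The least-squares step can be set up directly from the regressors φ(k). The sketch below uses a small random system and, for simplicity, the true A and C as stand-ins for the estimates A_T, C_T (i.e. T = I); vec(·) is column-major, which fixes the reshape order.

```python
import numpy as np

rng = np.random.default_rng(9)
n, m, p, L = 2, 1, 1, 30
A = rng.standard_normal((n, n))
A /= 1.5 * max(abs(np.linalg.eigvals(A)))
B, C, D = (rng.standard_normal(sz) for sz in [(n, m), (p, n), (p, m)])
x0 = rng.standard_normal(n)
u = rng.standard_normal((L, m))

# Simulate to obtain y(k); A and C stand in for the estimates A_T, C_T.
x, y = x0.copy(), np.zeros((L, p))
for k in range(L):
    y[k] = C @ x + D @ u[k]
    x = A @ x + B @ u[k]

def phi(k):
    """Regressor block: [ C A^k | sum_i u(i)^T (x) C A^{k-i-1} | u(k)^T (x) I_p ]."""
    Ak = lambda j: np.linalg.matrix_power(A, j)
    conv = np.zeros((p, n * m))
    for i in range(k):
        conv += np.kron(u[i][None, :], C @ Ak(k - i - 1))
    return np.hstack([C @ Ak(k), conv, np.kron(u[k][None, :], np.eye(p))])

Phi = np.vstack([phi(k) for k in range(L)])
theta, *_ = np.linalg.lstsq(Phi, y.reshape(-1), rcond=None)
x0_hat = theta[:n]
B_hat = theta[n:n + n*m].reshape(n, m, order="F")       # vec is column-major
D_hat = theta[n + n*m:].reshape(p, m, order="F")
assert np.allclose(x0_hat, x0) and np.allclose(B_hat, B) and np.allclose(D_hat, D)
```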

Arbitrary Inputs (1 of 2)

For general input sequences, the extended observability matrix is obtained from the data matrix as

O_s X_N = Y_{s,N} − T_s U_{s,N},

where the term T_s U_{s,N} depends on the unknown system. However, one can consider the following problem

minimize_{T_s}  ‖Y_{s,N} − T_s U_{s,N}‖_F^2,

which allows a closed-form solution depending only on Y_{s,N} and U_{s,N}:

T̂_s = Y_{s,N} U_{s,N}^T (U_{s,N} U_{s,N}^T)^{−1}.

The objective function at the solution gives

Y_{s,N} − T̂_s U_{s,N} = Y_{s,N} (I_N − U_{s,N}^T (U_{s,N} U_{s,N}^T)^{−1} U_{s,N}) = Y_{s,N} Π⊥_{U_{s,N}},

where the projection matrix Π⊥_{U_{s,N}} is given by

Π⊥_{U_{s,N}} = I_N − U_{s,N}^T (U_{s,N} U_{s,N}^T)^{−1} U_{s,N}.

Arbitrary Inputs (2 of 2)

Noting that U_{s,N} Π⊥_{U_{s,N}} = 0, we can write

Y_{s,N} Π⊥_{U_{s,N}} = O_s X_N Π⊥_{U_{s,N}}.

It can be shown that rank(Y_{s,N} Π⊥_{U_{s,N}}) = n, and therefore

range(O_s X_N) = range(O_s X_N Π⊥_{U_{s,N}}).

We can proceed and decompose the matrix Y_{s,N} Π⊥_{U_{s,N}} = U_n Σ_n V_n^T to get U_n = O_s T with T = X_N Π⊥_{U_{s,N}} V_n Σ_n^{−1}, and find the system matrix estimates.

The matrix Π⊥_{U_{s,N}} performs a projection of Y_{s,N} onto the space spanned by X_N along the space spanned by U_{s,N}.

[Figure: projection of Y_{s,N} = O_s X_N + T_s U_{s,N} onto span(X_N) along span(U_{s,N})]