
Low-rank Techniques:

from Matrix Equations to High-dimensional PDEs

Peter Benner

Joint work with Sergey Dolgov (U Bath)

Boris Khoromskii and Venera Khoromskaia (MPI DCTS and MPI MIS Leipzig)

Patrick Kürschner and Jens Saak (MPI DCTS)

Akwum Onwunta (MPI DCTS/U Maryland soon)

Martin Stoll (TU Chemnitz)

17 November 2017

Landscapes in Mathematical Sciences, University of Bath


The Max Planck Institute (MPI) in Magdeburg

The Max Planck Society

operates 84 institutes — 79 in Germany, 2 in Italy, 1 each in The Netherlands, Luxembourg, and the USA;

has ∼ 23,000 employees;

has had 18 Nobel Laureates since 1948.

"The first MPI in engineering. . . "

MPI Magdeburg

founded 1998

4 departments (directors)

9 research groups

budget ∼ 15 million EUR

∼ 230 employees

130 scientists,

doing research in

biotechnology, chemical engineering, process engineering, energy conversion, applied math

©Peter Benner Low-Rank Techniques 2/40


Outline

1. Motivation

2. Solving Large-Scale Sylvester and Lyapunov Equations
   Some Basics
   The Low-Rank Structure
   LR-ADI Method
   The New LR-ADI Applied to Lyapunov Equations

3. From Matrix Equations to PDEs in d Dimensions
   The Curse of Dimensionality
   Tensor Techniques
   Numerical Examples




Prelude — 20 years of working in low-rank methods

Model Reduction in Frequency Domain

Approximate the linear control system

ẋ = Ax + Bu, A ∈ Rn×n, B ∈ Rn×m,
y = Cx + Du, C ∈ Rq×n, D ∈ Rq×m,

by the reduced-order system

˙x̂ = Âx̂ + B̂u, Â ∈ Rr×r, B̂ ∈ Rr×m,
ŷ = Ĉx̂ + Du, Ĉ ∈ Rq×r, D ∈ Rq×m,

of order r ≪ n, such that

‖ y − ŷ ‖ = ‖Gu − Ĝu ‖ ≤ ‖G − Ĝ ‖ · ‖ u ‖ < tolerance · ‖ u ‖,

where
G(s) = C(sIn − A)−1B + D, Ĝ(s) = Ĉ(sIr − Â)−1B̂ + D

are the associated transfer functions of the two systems.



Prelude — 20 years of working in low-rank methods

Balanced Truncation [Moore ’81]

System Σ :

ẋ(t) = Ax(t) + Bu(t),

y(t) = Cx(t), with A stable, i.e., Λ(A) ⊂ C−,

is balanced if the system Gramians, i.e., the solutions P, Q of the Lyapunov equations

AP + PAT + BBT = 0, ATQ + QA + CTC = 0,

satisfy: P = Q = diag(σ1, . . . , σn) with σ1 ≥ σ2 ≥ . . . ≥ σn > 0.

σ1, . . . , σn are the Hankel singular values (HSVs) of Σ.

Compute a balanced realization of Σ via the state-space transformation

T : (A, B, C) ↦ (TAT−1, TB, CT−1) = ( [ A11 A12 ; A21 A22 ], [ B1 ; B2 ], [ C1 C2 ] ).

Truncation: (Â, B̂, Ĉ) = (A11, B1, C1).





Prelude — 20 years of working in low-rank methods

Balanced Truncation [Moore ’81]

. . . is one of the greatest model reduction techniques for linear systems since

the reduced-order model is stable with HSVs σ1, . . . , σr ;

it allows adaptive choice of r via computable error bound (there is a free lunch!):

‖ y − ŷ ‖2 ≤ ‖G − Ĝ‖H∞ ‖ u ‖2 ≤ 2 (σr+1 + … + σn) ‖ u ‖2.




Prelude — 20 years of working in low-rank methods

Balanced Truncation [Moore ’81]

. . . is one of the greatest model reduction techniques for linear systems!

But: as suggested in [Moore ’81], it is an O(n3) method! (Bottleneck: solution of the Lyapunov equations, followed by an SVD of their Cholesky factors to determine the necessary parts of T and T−1.)

In 1997, we derived a method to compute low-rank factors S, R ∈ Rn×ℓ, ℓ ≪ n, such that P ≈ SST, Q ≈ RRT [1], and then showed that the truncation can be obtained via a cheap ℓS × ℓR SVD of STR [2]!

(Some similar ideas: low-rank Lyapunov solver [Saad ’90, Jaimoukha/Kasenally ’94], small-scale SVD in BT [Penzl ’98, Li/White ’99].)

[1] P. Benner and E.S. Quintana-Ortí. Solving stable generalized Lyapunov equations with the matrix sign function. Numerical Algorithms, 20(1):75–100, 1999. (Preprint SFB393 97–23)

[2] P. Benner, E.S. Quintana-Ortí, and G. Quintana-Ortí. Balanced truncation model reduction of large-scale dense systems on parallel computers. Mathematical and Computer Modeling of Dynamical Systems, 6(4):383–405, 2000.

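The square-root procedure described above (Gramian factors, then an SVD of STR) is compact enough to demonstrate numerically. The following is a minimal sketch, assuming NumPy/SciPy; the test system A, B, C is made-up data for illustration, not from the talk, and the reduced order r is chosen adaptively from the HSV decay so that the "free lunch" error bound can be checked at one frequency point.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
n, m, q = 120, 2, 2

# Hypothetical stable test system (illustration only)
A = -np.diag(np.linspace(0.5, 60.0, n)) + 0.01 * rng.standard_normal((n, n))
lam_max = linalg.eigvals(A).real.max()
if lam_max > -0.1:                      # enforce Lambda(A) in C^-
    A -= (lam_max + 0.5) * np.eye(n)
B = rng.standard_normal((n, m))
C = rng.standard_normal((q, n))

# Gramians: AP + PA^T + BB^T = 0 and A^T Q + QA + C^T C = 0
P = linalg.solve_continuous_lyapunov(A, -B @ B.T)
Q = linalg.solve_continuous_lyapunov(A.T, -C.T @ C)

def psd_factor(M):
    """Factor M = S S^T of a (numerically) PSD matrix via eigendecomposition."""
    w, V = np.linalg.eigh((M + M.T) / 2)
    return V * np.sqrt(np.clip(w, 0.0, None))

S, R = psd_factor(P), psd_factor(Q)

# Square-root BT: the singular values of S^T R are the Hankel singular values
U, hsv, Vt = np.linalg.svd(S.T @ R)
r = int(np.sum(hsv > 1e-6 * hsv[0]))    # adaptive order from HSV decay
Si = np.diag(hsv[:r] ** -0.5)
Tl = Si @ Vt[:r] @ R.T                  # r x n (rows of the transformation)
Tr = S @ U[:, :r] @ Si                  # n x r (columns of its inverse)
Ar, Br, Cr = Tl @ A @ Tr, Tl @ B, C @ Tr

# Bound check at s = i: |G(i) - Gr(i)|_2 <= 2 * (sum of truncated HSVs)
G  = C  @ np.linalg.solve(1j * np.eye(n) - A,  B)
Gr = Cr @ np.linalg.solve(1j * np.eye(r) - Ar, Br)
err, bound = np.linalg.norm(G - Gr, 2), 2 * hsv[r:].sum()
print(f"r = {r}, pointwise error {err:.2e}, error bound {bound:.2e}")
```

Note that Tl Tr = Ir, i.e., the pair forms an oblique projector, which is exactly what balancing followed by truncation produces without ever forming the full transformation T.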


Outline

1. Motivation

2. Solving Large-Scale Sylvester and Lyapunov Equations
   Some Basics
   The Low-Rank Structure
   LR-ADI Method
   The New LR-ADI Applied to Lyapunov Equations

3. From Matrix Equations to PDEs in d Dimensions



Solving Large-Scale Sylvester and Lyapunov Equations

Sylvester equation

James Joseph Sylvester (September 3, 1814 – March 15, 1897)

AX + XB = C.

Lyapunov equation

Alexander Mikhailovich Lyapunov (June 6, 1857 – November 3, 1918)

AX + XAT = C, C = CT.




Solving Large-Scale Sylvester and Lyapunov Equations

Generalizations of Sylvester (AX + XB = C) and Lyapunov (AX + XAT = C) Equations

Generalized Sylvester equation:

AXD + EXB = C.

Generalized Lyapunov equation:

AXET + EXAT = C, C = CT.

Stein equation: X − AXB = C.

(Generalized) discrete Lyapunov/Stein equation:

EXET − AXAT = C, C = CT.

Note:

Consider only regular cases, having a unique solution!

Solutions of symmetric cases are symmetric, X = XT ∈ Rn×n; otherwise, X ∈ Rn×ℓ with n ≠ ℓ in general.




Applications

Gramian-based model order reduction

Linear stability analysis

Continuation algorithms for detecting Hopf bifurcations

Determining metastable equilibrium points of stochastic differential equations

Block-triangularization of matrices and matrix pencils

Computing covariance matrices, e.g., in biochemical reaction networks

Numerical solution of fractional partial differential equations

Solving algebraic Riccati equations via Newton’s method

. . .



Existence of Solutions of Linear Matrix Equations

As an example, consider the generalized Sylvester equation

AXD + EXB = C . (1)

Vectorization (via the Kronecker product) yields an equivalent linear system:

(DT ⊗ A + BT ⊗ E) vec(X) = vec(C) ⇐⇒ A x = c,

where A := DT ⊗ A + BT ⊗ E, x := vec(X), and c := vec(C).

⇒ (1) has a unique solution ⇐⇒ A is nonsingular.

Lemma

Λ(A) = { αj + βk | αj ∈ Λ(A,E), βk ∈ Λ(B,D) }.
Hence, (1) has a unique solution ⇐⇒ Λ(A,E) ∩ (−Λ(B,D)) = ∅.

Example: the Lyapunov equation AX + XAT = C has a unique solution ⇐⇒ ∄ µ ∈ C : ±µ ∈ Λ(A).

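Both the equivalence and the eigenvalue-sum characterization are easy to verify numerically. The sketch below (NumPy/SciPy; small random matrices as hypothetical data) solves a generalized Sylvester equation through its Kronecker form, then checks the sum identity — evaluated here as generalized eigenvalues of the pair (A, DT ⊗ E), which is the form of the statement that also covers E, D ≠ I.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(1)
n, m = 6, 4
A, E = rng.standard_normal((n, n)), rng.standard_normal((n, n))
B, D = rng.standard_normal((m, m)), rng.standard_normal((m, m))
C = rng.standard_normal((n, m))

# Kronecker form of AXD + EXB = C (column-major vec, as in the slides)
K = np.kron(D.T, A) + np.kron(B.T, E)
X = np.linalg.solve(K, C.flatten(order="F")).reshape((n, m), order="F")
res = np.linalg.norm(A @ X @ D + E @ X @ B - C)
print("residual:", res)

# Spectrum lemma: the eigenvalues of the pencil (K, D^T kron E) are exactly
# { alpha_j + beta_k : alpha_j in Lambda(A,E), beta_k in Lambda(B,D) }
alphas = linalg.eig(A, E, right=False)
betas = linalg.eig(B, D, right=False)
sums = np.add.outer(alphas, betas).ravel()
lam = linalg.eig(K, np.kron(D.T, E), right=False)
# pair each computed eigenvalue with its nearest predicted sum
mismatch = np.abs(lam[:, None] - sums[None, :]).min(axis=1).max()
print("max eigenvalue mismatch:", mismatch)
```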



Solving Large-Scale Sylvester and Lyapunov Equations

The Low-Rank Structure

Sylvester Equations

Find X ∈ Rn×m solving

AX − XB = FGT ,

where A ∈ Rn×n, B ∈ Rm×m, F ∈ Rn×r , G ∈ Rm×r .

If n, m are large but r ≪ n, m, then X has small numerical rank. [Penzl 1999, Grasedyck 2004, Antoulas/Sorensen/Zhou 2002]

rank(X, τ) = f ≪ min(n, m)

[Figure: singular values σ(X) of a 1600 × 900 example, decaying rapidly; σ155 ≈ u.]

Compute low-rank solution factors Z ∈ Rn×f, Y ∈ Rm×f, D ∈ Rf×f, such that X ≈ ZDYT with f ≪ min(n, m).



Solving Large-Scale Sylvester and Lyapunov Equations

The Low-Rank Structure

Lyapunov Equations

Find X ∈ Rn×n solving

AX + XAT = −FFT,

where A ∈ Rn×n, F ∈ Rn×r .

If n is large but r ≪ n, then X has small numerical rank. [Penzl 1999, Grasedyck 2004, Antoulas/Sorensen/Zhou 2002]

rank(X, τ) = f ≪ n

[Figure: singular values σ(X) of a 1600 × 900 example, decaying rapidly; σ155 ≈ u.]

Compute low-rank solution factors Z ∈ Rn×f, D ∈ Rf×f, such that X ≈ ZDZT with f ≪ n.

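The rapid singular-value decay is easy to observe directly. A minimal sketch (NumPy/SciPy; A is a hypothetical stable test matrix, here a 1D Laplacian stencil, and F a random rank-r factor, not data from the talk):

```python
import numpy as np
from scipy import linalg

n, r = 400, 2
# Stable symmetric A (1D Laplacian stencil) and a rank-r right-hand side
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
F = np.random.default_rng(2).standard_normal((n, r))

# Solve AX + XA^T = -FF^T
X = linalg.solve_continuous_lyapunov(A, -F @ F.T)

s = linalg.svdvals(X)
numrank = int(np.sum(s > 1e-10 * s[0]))
print(f"numerical rank (tol 1e-10): {numrank} of n = {n}")
```

Even though X is a dense 400 × 400 matrix, its numerical rank stays far below n, which is exactly what makes the factored storage X ≈ ZDZT pay off.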

Solving Large-Scale Sylvester and Lyapunov Equations

Theoretical Foundation of the Low-rank Structure

Lyapunov case: [Grasedyck ’04]

AX + XAT + BBT = 0 ⇐⇒ (In ⊗ A + A ⊗ In) vec(X) = − vec(BBT), with A := In ⊗ A + A ⊗ In.

For stable M, i.e., Λ(M) ⊂ C−, apply

M−1 = − ∫0∞ exp(tM) dt

to A and approximate the integral via (sinc) quadrature ⇒

A−1 ≈ − ∑i=−k,…,k ωi exp(tiA),

with error ∼ exp(−√k) (exp(−k) if A = AT). An approximate Lyapunov solution is then given by

vec(X) ≈ vec(Xk) = ∑i=−k,…,k ωi exp(tiA) vec(BBT).

Now observe that

exp(tiA) = exp( ti (In ⊗ A + A ⊗ In) ) ≡ exp(tiA) ⊗ exp(tiA).

Hence,

vec(Xk) = ∑i=−k,…,k ωi (exp(tiA) ⊗ exp(tiA)) vec(BBT)

⇒ Xk = ∑i=−k,…,k ωi exp(tiA) BBT exp(tiAT) ≡ ∑i=−k,…,k ωi BiBiT, where Bi := exp(tiA)B,

so that rank(Xk) ≤ (2k + 1)m with

‖X − Xk‖2 ≲ exp(−√k) (exp(−k) for A = AT)!

Extended to AX + XAT + ∑j=1,…,m NjXNjT + BBT = 0. [B./Breiten 2012]
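The integral representation behind the quadrature argument can be checked directly: for stable A the solution satisfies X = ∫0∞ exp(tA) BBT exp(tAT) dt, and every quadrature node contributes a term of rank at most m. The sketch below (NumPy/SciPy, hypothetical test matrix) uses a plain truncated trapezoidal rule instead of the sinc rule from the slides — far less efficient, but enough to exhibit the sum-of-rank-m structure:

```python
import numpy as np
from scipy import linalg

n, m = 120, 1
# Stable symmetric test matrix with spectrum inside (-3, -1) (hypothetical)
A = -2.0 * np.eye(n) + 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
B = np.ones((n, m))

# Reference solution of AX + XA^T + BB^T = 0
X = linalg.solve_continuous_lyapunov(A, -B @ B.T)

# X = int_0^inf exp(tA) BB^T exp(tA^T) dt, truncated to [0, 10] and
# discretized by the trapezoidal rule; each term w_i * B_i B_i^T is rank <= m
ts, h = np.linspace(0.0, 10.0, 401, retstep=True)
w = np.full(ts.size, h)
w[0] = w[-1] = h / 2
Bi = [linalg.expm(t * A) @ B for t in ts]
Xk = sum(wi * bi @ bi.T for wi, bi in zip(w, Bi))

rel = np.linalg.norm(Xk - X) / np.linalg.norm(X)
print(f"rank(Xk) <= {ts.size * m}, relative error {rel:.2e}")
```

With the sinc rule the same accuracy would be reached with far fewer nodes (error ∼ exp(−k) here, since A = AT), which is what makes the (2k + 1)m rank bound so effective.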

Exploiting the Low-Rank Structure I

For simplicity, consider again the Lyapunov equation AX + XAT + BBT = 0.

Definition

The matrix sign function of M ∈ Rn×n with no purely imaginary eigenvalues is

sign(M) = sign( T [ J− 0 ; 0 J+ ] T−1 ) = T [ −I 0 ; 0 I ] T−1

with J± containing all Jordan blocks of M corresponding to eigenvalues with positive/negative real parts.

Observations

1. sign( [ A BBT ; 0 −AT ] ) = [ −I 2X ; 0 I ].

2. sign(M) = limk→∞ Mk with Mk+1 = (1/2)(Mk + M−1k) if M0 = M.

Sign function iteration for solving Lyapunov equations

Set M0 = [ A BBT ; 0 −AT ] and use the inversion formula for block-triangular matrices:

Ak+1 ← (1/2)(Ak + A−1k),
Bk+1BTk+1 ← (1/2)(BkBTk + A−1k Bk(A−1k Bk)T),

so that (1/√2) Bk → Z with X = ZZT.

Factored sign function iteration for Lyapunov equations [B./Quintana-Ortí 1997/99]

Ak+1 ← (1/2)(Ak + A−1k),
Bk+1 ← (1/√2) [ Bk, A−1k Bk ]

Problem: the number of columns in Bk doubles in each iteration!

Cure: a truncation operator, Bk+1 ← Tε( (1/√2) [ Bk, A−1k Bk ] ), with, e.g., Tε returning the scaled left singular vectors of the truncated SVD w.r.t. the numerical rank tolerance ε.
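The factored iteration with truncation is compact enough to sketch directly. The version below (NumPy/SciPy) adds determinant scaling, a standard acceleration not spelled out on the slide; the test problem is hypothetical, and the result is compared against a dense Lyapunov solve.

```python
import numpy as np
from scipy import linalg

def lyap_sign_factor(A, B, tol=1e-10, maxit=50):
    """Low-rank factor Z with X ~ Z Z^T for A X + X A^T + B B^T = 0, via the
    factored sign-function iteration with SVD truncation of the columns."""
    n = A.shape[0]
    Ak, Bk = A.copy(), B.copy()
    for _ in range(maxit):
        Ainv = np.linalg.inv(Ak)
        _, logdet = np.linalg.slogdet(Ak)
        c = np.exp(-logdet / n)             # determinant scaling factor
        Anew = 0.5 * (c * Ak + Ainv / c)
        Bk = np.hstack([np.sqrt(c) * Bk, Ainv @ Bk / np.sqrt(c)]) / np.sqrt(2.0)
        # truncation operator T_eps: keep scaled left singular vectors
        U, s, _ = np.linalg.svd(Bk, full_matrices=False)
        mask = s > tol * s[0]
        Bk = U[:, mask] * s[mask]
        done = np.linalg.norm(Anew - Ak) < 1e-10 * np.linalg.norm(Anew)
        Ak = Anew
        if done:
            break
    # A_k -> -I and B_k B_k^T -> 2X, hence Z = B_k / sqrt(2)
    return Bk / np.sqrt(2.0)

# usage on a small stable test matrix (hypothetical data)
n = 60
A = -2.0 * np.eye(n) + 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
B = np.random.default_rng(3).standard_normal((n, 2))
Z = lyap_sign_factor(A, B)
X = linalg.solve_continuous_lyapunov(A, -B @ B.T)
print("rank of Z:", Z.shape[1],
      "rel. error:", np.linalg.norm(Z @ Z.T - X) / np.linalg.norm(X))
```

Without the truncation step the factor would have 2^k · m columns after k iterations; with it, the column count stays at the numerical rank of X throughout.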


Exploiting the Low-Rank Structure II

Exploiting the Low-rank Structure for Large and Sparse Coefficients

The Sylvester equation AX − XB = FGT is equivalent to the linear system of equations

(Im ⊗ A − BT ⊗ In) vec(X) = vec(FGT).

This cannot be used for numerical solutions unless nm ≤ 1,000 (or so), as

it requires O(n2m2) of storage;

direct solver needs O(n3m3) flops;

low (tensor-)rank of right-hand side is ignored;

in Lyapunov case, symmetry and possible definiteness are not respected.


Possible solvers:

Standard Krylov subspace solvers in operator form [Hochbruck, Starke, Reichel, . . . ]

Block-tensor-Krylov subspace methods with truncation [Kressner/Tobler, Bollhöfer/Eppler, B./Breiten, . . . ]

Galerkin-type methods based on (extended, rational) Krylov subspace methods [Jaimoukha, Kasenally, Jbilou, Simoncini, Druskin, Knizhnerman, . . . ]

Doubling-type methods [Smith, Chu et al., B./Sadkane/El Khoury, . . . ]

ADI methods [Wachspress, Reichel, Li, Penzl, B., Saak, Kürschner, Opmeer/Reis, . . . ]



Solving Large-Scale Sylvester and Lyapunov Equations

LR-ADI Method

Sylvester and Stein equations

Let α_k ≠ β_k with α_k ∉ Λ(B) and β_k ∉ Λ(A). Then

AX − XB = FG^T (Sylvester equation)   ⇔   X = A_k X B_k + (β_k − α_k) F_k G_k^H (Stein equation)

with the Cayley-like (Möbius) transformations

A_k := (A − β_k I_n)^{−1}(A − α_k I_n),   B_k := (B − α_k I_m)^{−1}(B − β_k I_m),

F_k := (A − β_k I_n)^{−1} F,   G_k := (B − α_k I_m)^{−H} G.

Alternating directions implicit (ADI) iteration:

X_k = A_k X_{k−1} B_k + (β_k − α_k) F_k G_k^H for k ≥ 1, X_0 ∈ R^{n×m}. [Wachspress 1988]

©Peter Benner Low-Rank Techniques 16/40
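The Sylvester-to-Stein equivalence is easy to verify numerically. The sketch below (random test matrices and an arbitrarily chosen real shift pair, all hypothetical) builds the Möbius-transformed quantities and checks the Stein identity against a reference Sylvester solution.

```python
import numpy as np
from scipy.linalg import solve_sylvester

rng = np.random.default_rng(1)
n, m, r = 7, 5, 2
A = rng.standard_normal((n, n)) - 3 * np.eye(n)    # spectra separated: Λ(A) near -3, Λ(B) near +3
B = rng.standard_normal((m, m)) + 3 * np.eye(m)
F = rng.standard_normal((n, r))
G = rng.standard_normal((m, r))
X = solve_sylvester(A, -B, F @ G.T)                # reference solution of A X - X B = F G^T

alpha, beta = 10.0, -10.0                          # real shifts, well away from both spectra
I_n, I_m = np.eye(n), np.eye(m)
Ak = np.linalg.solve(A - beta * I_n, A - alpha * I_n)   # (A - β I)^{-1}(A - α I)
Bk = np.linalg.solve(B - alpha * I_m, B - beta * I_m)   # (B - α I)^{-1}(B - β I)
Fk = np.linalg.solve(A - beta * I_n, F)
Gk = np.linalg.solve((B - alpha * I_m).T, G)            # (B - α I)^{-H} G in the real case

# Stein identity: X = A_k X B_k + (β - α) F_k G_k^H
err = np.linalg.norm(X - (Ak @ X @ Bk + (beta - alpha) * Fk @ Gk.T))
print(err)
```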


Solving Large-Scale Sylvester and Lyapunov Equations

LR-ADI Method

Sylvester ADI iteration [Wachspress 1988]

X_k = A_k X_{k−1} B_k + (β_k − α_k) F_k G_k^H,

A_k := (A − β_k I_n)^{−1}(A − α_k I_n),   B_k := (B − α_k I_m)^{−1}(B − β_k I_m),

F_k := (A − β_k I_n)^{−1} F ∈ R^{n×r},   G_k := (B − α_k I_m)^{−H} G ∈ C^{m×r}.

Now set X_0 = 0 and find a factorization X_k = Z_k D_k Y_k^H:

X_1 = (β_1 − α_1)(A − β_1 I_n)^{−1} F G^T (B − α_1 I_m)^{−1}

⇒ V_1 := Z_1 = (A − β_1 I_n)^{−1} F ∈ R^{n×r},

D_1 = (β_1 − α_1) I_r ∈ R^{r×r},

W_1 := Y_1 = (B − α_1 I_m)^{−H} G ∈ C^{m×r}.

©Peter Benner Low-Rank Techniques 17/40


X_2 = A_2 X_1 B_2 + (β_2 − α_2) F_2 G_2^H:

V_2 = V_1 + (β_2 − α_1)(A − β_2 I_n)^{−1} V_1 ∈ R^{n×r},

W_2 = W_1 + (α_2 − β_1)(B − α_2 I_m)^{−H} W_1 ∈ R^{m×r},

Z_2 = [Z_1, V_2],   D_2 = diag(D_1, (β_2 − α_2) I_r),   Y_2 = [Y_1, W_2].


Solving Large-Scale Sylvester and Lyapunov Equations

LR-ADI Algorithm [B. 2005, Li/Truhar 2008, B./Li/Truhar 2009]

Algorithm 1: Low-rank Sylvester ADI / factored ADI (fADI)

Input: Matrices defining AX − XB = FG^T and shift parameters α_1, . . . , α_kmax, β_1, . . . , β_kmax.
Output: Z, D, Y such that ZDY^H ≈ X.

1  Z_1 = V_1 = (A − β_1 I_n)^{−1} F.
2  Y_1 = W_1 = (B − α_1 I_m)^{−H} G.
3  D_1 = (β_1 − α_1) I_r.
4  for k = 2, . . . , kmax do
5    V_k = V_{k−1} + (β_k − α_{k−1})(A − β_k I_n)^{−1} V_{k−1}.
6    W_k = W_{k−1} + (α_k − β_{k−1})(B − α_k I_m)^{−H} W_{k−1}.
7    Update solution factors:
     Z_k = [Z_{k−1}, V_k],   Y_k = [Y_{k−1}, W_k],   D_k = diag(D_{k−1}, (β_k − α_k) I_r).

©Peter Benner Low-Rank Techniques 18/40
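Algorithm 1 translates almost line by line into code. Below is a minimal dense-matrix sketch for real shifts (the function name fadi, the diagonal test problem, and the shift choice are illustrative, not from the slides); a serious implementation would use sparse factorizations for the shifted solves and pick the shifts adaptively.

```python
import numpy as np

def fadi(A, B, F, G, alphas, betas):
    """Low-rank Sylvester ADI (Algorithm 1, real-shift sketch): A X - X B = F G^T, X ≈ Z D Y^T."""
    n, m, r = A.shape[0], B.shape[0], F.shape[1]
    V = np.linalg.solve(A - betas[0] * np.eye(n), F)          # V_1 = (A - β_1 I)^{-1} F
    W = np.linalg.solve((B - alphas[0] * np.eye(m)).T, G)     # W_1 = (B - α_1 I)^{-H} G
    Z, Y, d = V, W, [betas[0] - alphas[0]] * r
    for k in range(1, len(alphas)):
        V = V + (betas[k] - alphas[k - 1]) * np.linalg.solve(A - betas[k] * np.eye(n), V)
        W = W + (alphas[k] - betas[k - 1]) * np.linalg.solve((B - alphas[k] * np.eye(m)).T, W)
        Z, Y = np.hstack([Z, V]), np.hstack([Y, W])
        d += [betas[k] - alphas[k]] * r
    return Z, np.diag(d), Y

# Toy Lyapunov-type test: A = -diag(1..8), B = -A^T, with shifts sweeping both spectra
rng = np.random.default_rng(2)
n = 8
A = -np.diag(np.arange(1.0, n + 1))
B = -A.T
F = rng.standard_normal((n, 2))
G = rng.standard_normal((n, 2))
alphas = -np.arange(1.0, n + 1)    # α_k approximate Λ(A)
betas = np.arange(1.0, n + 1)      # β_k approximate Λ(B)

Z, D, Y = fadi(A, B, F, G, alphas, betas)
Xk = Z @ D @ Y.T
res = np.linalg.norm(A @ Xk - Xk @ B - F @ G.T) / np.linalg.norm(F @ G.T)
print(res)   # tiny: with shifts covering the spectra, the iteration is exact here
```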


Solving Large-Scale Sylvester and Lyapunov Equations

LR-ADI Method

Disadvantages of low-rank ADI as of 2012:

1. No efficient stopping criteria: the difference of iterates / the norm of the columns added per step is not reliable and often stops too late; the residual is a full dense matrix and cannot be computed as such.

2. Requires complex arithmetic for real coefficients when complex shifts are used.

3. Expensive (only semi-automatic) set-up phase to precompute the ADI shifts.

None of these disadvantages remains today.

Key observation: the residual is a low-rank matrix, with rank at most that of the right-hand side!

⇒ Speed-ups of the new vs. the old LR-ADI can reach a factor of 20!

©Peter Benner Low-Rank Techniques 19/40


Other Low-rank Lyapunov Solvers. . .

. . . for the Lyapunov equation 0 = AX + XA^T + BB^T

Projection-based methods for Lyapunov equations with A + A^T < 0:

1. Compute an orthonormal basis range(Z), Z ∈ R^{n×r}, of a subspace Z ⊂ R^n with dim Z = r.

2. Set Â := Z^T A Z, B̂ := Z^T B.

3. Solve the small-size Lyapunov equation Â X̂ + X̂ Â^T + B̂ B̂^T = 0.

4. Use X ≈ Z X̂ Z^T.

Examples:

Krylov subspace methods, i.e., for m = 1:
Z = K(A, B, r) = span{B, AB, A^2 B, . . . , A^{r−1} B}
[Saad 1990, Jaimoukha/Kasenally 1994, Jbilou 2002–2008].

Extended Krylov subspace method (EKSM) [Simoncini 2007], Z = K(A, B, r) ∪ K(A^{−1}, B, r).

Rational Krylov subspace methods (RKSM) [Druskin/Simoncini 2011].

©Peter Benner Low-Rank Techniques 20/40
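The four projection steps above can be sketched in a few lines. The Arnoldi-style basis construction and the tridiagonal test matrix below are illustrative choices of mine, with SciPy's dense Lyapunov solver standing in for step 3.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def krylov_basis(A, b, r):
    """Orthonormal basis of K_r(A, b) via Gram-Schmidt with reorthogonalization."""
    Q = np.zeros((A.shape[0], r))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(1, r):
        w = A @ Q[:, j - 1]
        for _ in range(2):                 # two passes keep the basis numerically orthonormal
            w -= Q[:, :j] @ (Q[:, :j].T @ w)
        Q[:, j] = w / np.linalg.norm(w)
    return Q

def projected_lyapunov(A, B, Z):
    """Steps 2-4: project, solve the small equation, lift back."""
    At, Bt = Z.T @ A @ Z, Z.T @ B
    Xt = solve_continuous_lyapunov(At, -Bt @ Bt.T)   # solves At X + X At^T = -Bt Bt^T
    return Z @ Xt @ Z.T

n = 30
A = -2 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)   # A + A^T < 0: stable test matrix
B = np.random.default_rng(3).standard_normal((n, 1))

def residual(r):
    X = projected_lyapunov(A, B, krylov_basis(A, B[:, 0], r))
    return np.linalg.norm(A @ X + X @ A.T + B @ B.T)

print(residual(5), residual(n))   # the residual shrinks as the subspace grows
```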


The New LR-ADI Applied to Lyapunov Equations

Example: an ocean circulation problem [Van Gijzen et al. 1998]

FEM discretization of a simple 3D ocean circulation model (barotropic, constant depth); stiffness matrix −A with n = 42,249; choose the artificial constant term B = rand(n,5).

[Figure: convergence history, LR-ADI with adaptive shifts vs. EKSM: relative residual ‖R‖/‖B^T B‖ versus coldim(Z), with the tolerance TOL marked.]

CPU times: LR-ADI ≈ 110 sec, EKSM ≈ 135 sec.

©Peter Benner Low-Rank Techniques 21/40


Solving Large-Scale Sylvester and Lyapunov Equations

Summary & Outlook (Matrix Equations)

Numerical enhancements of low-rank ADI for large Sylvester/Lyapunov equations:

1. low-rank residuals, reformulated implementation,
2. computation of real low-rank factors in the presence of complex shifts,
3. self-generating shift strategies (quantification in progress).

For the diffusion-convection-reaction example: 332.02 sec down to 17.24 sec, an acceleration by a factor of almost 20.

The generalized version enables the derivation of low-rank solvers for various generalized Sylvester equations.

Ongoing work: apply LR-ADI within Newton methods for algebraic Riccati equations

R(X) = AX + XA^T + GG^T − X S S^T X = 0,

D(X) = A X A^T − E X E^T + GG^T + A^T X F (I_r + F^T X F)^{−1} F^T X A = 0.

For nonlinear AREs see:
P. Benner, P. Kürschner, J. Saak. Low-rank Newton-ADI methods for large nonsymmetric algebraic Riccati equations. J. Franklin Inst., 2015.

©Peter Benner Low-Rank Techniques 22/40
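Each Newton step for R(X) = 0 is itself a Lyapunov equation, which is exactly where a low-rank ADI solver is plugged in. The dense sketch below (random stable test data of my own choosing, with SciPy's Lyapunov and ARE solvers as stand-ins) shows the outer Newton-Kleinman iteration.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

rng = np.random.default_rng(4)
n = 6
A = -2 * np.eye(n) + 0.2 * rng.standard_normal((n, n))   # stable, so X_0 = 0 is stabilizing
G = rng.standard_normal((n, 1))
S = rng.standard_normal((n, 1))

# Newton-Kleinman for R(X) = A X + X A^T + G G^T - X S S^T X = 0: each step solves
#   A_j X_{j+1} + X_{j+1} A_j^T = -(G G^T + X_j S S^T X_j),  with  A_j = A - X_j S S^T.
X = np.zeros((n, n))
for _ in range(15):
    Aj = A - X @ S @ S.T
    X = solve_continuous_lyapunov(Aj, -(G @ G.T + X @ S @ S.T @ X))

# Reference: the same ARE in SciPy's convention a^T X + X a - X b r^{-1} b^T X + q = 0
Xare = solve_continuous_are(A.T, S, G @ G.T, np.eye(1))
print(np.linalg.norm(X - Xare))
```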


Outline

1. Motivation

2. Solving Large-Scale Sylvester and Lyapunov Equations

3. From Matrix Equations to PDEs in d Dimensions: The Curse of Dimensionality; Tensor Techniques; Numerical Examples

©Peter Benner Low-Rank Techniques 23/40


From Matrix Equations to PDEs in d Dimensions

The Curse of Dimensionality [Bellman 1957]

Refining the mesh h → h/2 increases the matrix size of the discretized differential operator by a factor of 2^d: a rapid increase of dimensionality, the Curse of Dimensionality (d > 3).

Consider −∆u = f in [0, 1] × [0, 1] ⊂ R^2, uniformly discretized as

(I ⊗ A + A ⊗ I) x =: 𝒜x = b   ⇔   AX + XA^T = B

with x = vec(X), b = vec(B), and low-rank right-hand side B ≈ b_1 b_2^T.

Low-rankness of X̂ := VW^T ≈ X follows from properties of 𝒜 and B, e.g., [Penzl '00, Grasedyck '04].

We solve this using low-rank Krylov subspace solvers. These essentially require matrix-vector multiplications and vector operations. Hence,

𝒜 vec(X_k) = 𝒜 vec(V_k W_k^T) = vec([AV_k, V_k][W_k, AW_k]^T).

The rank of [AV_k, V_k] ∈ R^{n×2r}, [W_k, AW_k] ∈ R^{n×2r} increases, but can be controlled using a truncation operator T_ε (as in the sign function solver for Lyapunov equations): low-rank Krylov subspace solvers [Kressner/Tobler, B./Breiten, Savostyanov/Dolgov, . . . ].

These ideas correspond to separation of variables for the continuous problem: b(x, y) ≈ b_1(x) b_2(y).

©Peter Benner Low-Rank Techniques 24/40
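The factored matrix-vector product and the truncation operator T_ε can be sketched in a few lines. Sizes are illustrative, and the explicit Kronecker matrix is built only to verify the identity; a real solver never forms it.

```python
import numpy as np

rng = np.random.default_rng(5)
n, r = 20, 3
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian stencil
V = rng.standard_normal((n, r))
W = rng.standard_normal((n, r))

# 𝒜 vec(V W^T) = vec(A V W^T + V W^T A^T) = vec([A V, V][W, A W]^T): a rank-2r product
L, R = np.hstack([A @ V, V]), np.hstack([W, A @ W])
A_kron = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
vec = lambda M: M.flatten(order="F")
print(np.allclose(A_kron @ vec(V @ W.T), vec(L @ R.T)))

# Truncation T_eps: SVD of L R^T computed entirely in factored form (QR + small SVD)
Ql, Rl = np.linalg.qr(L)
Qr, Rr = np.linalg.qr(R)
U, s, Vt = np.linalg.svd(Rl @ Rr.T)
keep = s > 1e-12 * s[0]
Lt = Ql @ U[:, keep] * s[keep]          # truncated left factor, columns scaled by σ
Rt = Qr @ Vt[keep, :].T
print(np.allclose(Lt @ Rt.T, L @ R.T))  # nothing is lost here: the true rank is ≤ 2r
```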


From Matrix Equations to PDEs in d Dimensions

The ideas extend to d ≥ 3: let A_j ∈ R^{n_j×n_j}, j = 1, . . . , d, be 1D discretization matrices corresponding to "grid points" x_j^{(1)}, x_j^{(2)}, . . . , x_j^{(n_j)}. Then, under certain assumptions, the Laplace operator (with homogeneous boundary conditions) can be discretized as

A = A_1 ⊗ I ⊗ · · · ⊗ I + I ⊗ A_2 ⊗ I ⊗ · · · ⊗ I + . . . + I ⊗ · · · ⊗ I ⊗ A_d.

Note: if n_j = n for all j, then A ∈ R^{n^d × n^d}!

If the source term can be written as f(x_1, x_2, . . . , x_d) = f_1(x_1) f_2(x_2) · · · f_d(x_d), the discretized right-hand side becomes

b = b_1 ⊗ b_2 ⊗ · · · ⊗ b_d ∈ R^{n^d} with b_j = [ f_j(x_j^{(1)}), . . . , f_j(x_j^{(n_j)}) ].

Now, if the solution x of Ax = b

1. could be written as x = x_1 ⊗ x_2 ⊗ · · · ⊗ x_d with x_j ∈ R^n,

2. and could be computed without forming A, b, working only with A_j, b_j,

then we could hope to reduce the storage and computational complexity from n^d to dn! This requires low-rank tensor techniques.

(Note: for x we will need a more complex representation, as 1. does not hold in general!)

©Peter Benner Low-Rank Techniques 25/40
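Point 2. is what makes tensor methods feasible: a matvec with A needs only the small A_j. A d = 3 sketch (toy sizes, row-major flattening; the explicit Kronecker matrix appears only to verify the mode-wise product):

```python
import numpy as np

def lap(n):   # 1D Dirichlet Laplacian
    return 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

n1, n2, n3 = 4, 5, 6
A1, A2, A3 = lap(n1), lap(n2), lap(n3)
I1, I2, I3 = np.eye(n1), np.eye(n2), np.eye(n3)

# Explicit Kronecker sum: (n1 n2 n3)^2 entries -- feasible only at toy sizes
A_full = (np.kron(A1, np.kron(I2, I3)) + np.kron(I1, np.kron(A2, I3))
          + np.kron(I1, np.kron(I2, A3)))

X = np.random.default_rng(6).standard_normal((n1, n2, n3))
y_full = A_full @ X.reshape(-1)         # row-major flattening matches this Kronecker ordering

# Same product via mode-wise applications of the small A_j: never forms A_full
y_modes = (np.einsum('ia,ajk->ijk', A1, X) + np.einsum('jb,ibk->ijk', A2, X)
           + np.einsum('kc,ijc->ijk', A3, X)).reshape(-1)
print(np.allclose(y_full, y_modes))
```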

Page 71: Low-rank Techniques: from Matrix Equations to High … · 2020. 3. 23. · Sergey Dolgov (U Bath) Boris Khoromskii and Venera Khoromskaia (MPI DCTS and MPI MIS Leipzig) Patrick

From Matrix Equations to PDEs in d Dimensions

Ideas extend to d ≥ 3: let Aj ∈ Rnj×nj , j = 1, . . . , d , be 1d-discretization matrices

corresponding to ”grid points” x(1)j , x

(2)j , . . . , x

(nj )

j , then under certain assumptions, theLaplace operator (w/ homogeneous boundary conditions) can be discretized as

A = A1 ⊗ I ⊗ · · · ⊗ I + I ⊗ A2 ⊗ I ⊗ · · · ⊗ I + . . .+ I ⊗ · · · ⊗ I ⊗ Ad .

Note: if nj = n ∀ j , then A ∈ Rnd×nd !

If source term can be written as f (x1, x2, . . . , xd) = f1(x1)f2(x2) · · · fd(xd), discretizedright-hand side becomes

b = b1 ⊗ b2 ⊗ · · · ⊗ bd ∈ Rnd with bj = [ fj(x(1)j ), . . . , fj(x

(n)j ) ].

Now if the solution x of Ax = b

1. could be written as x = x1 ⊗ x2 ⊗ · · · ⊗ xd with xj ∈ Rn,

2. and could be computed without forming A, b, but working only with Aj , bj ,

then we could hope to reduce the storage and computational complexity from nd to dn!This requires low-rank tensor techniques.

(Note: for x we wıll need a more complex representation as 1. does not hold in general!)

©Peter Benner Low-Rank Techniques 25/40



Tensor Techniques

Separation of Variables and Low-rank Approximation

Approximate:

x(i_1, …, i_d) [tensor] ≈ Σ_α x_α^(1)(i_1) x_α^(2)(i_2) ⋯ x_α^(d)(i_d) [tensor product decomposition].

Goals:

Store and manipulate x ⇝ O(dn) cost instead of O(n^d).

Solve equations Ax = b ⇝ O(dn^2) cost instead of O(n^(2d)).


Data Compression in 2D: Low-Rank Matrices

Discrete separation of variables:

[ x_{1,1} ⋯ x_{1,n} ; ⋮ ⋱ ⋮ ; x_{n,1} ⋯ x_{n,n} ] = Σ_{α=1}^{r} [ v_{1,α} ; ⋮ ; v_{n,α} ] [ w_{α,1} ⋯ w_{α,n} ] + O(ε).

(Tensor network diagram: the matrix x with legs i_1, i_2 is approximated by factors v and w joined by the rank index α.)

Rank r ≪ n.

mem(v) + mem(w) = 2nr ≪ n² = mem(x).

Singular Value Decomposition (SVD) ⟹ ε(r) optimal w.r.t. spectral/Frobenius norm.
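As a short NumPy sketch (not from the slides; the matrix size, rank, and tolerance are illustrative), the SVD-based compression above looks as follows: keep the singular values above a relative tolerance, which by the Eckart-Young theorem is optimal in the spectral and Frobenius norms.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 200, 5
# A matrix of exact rank r, perturbed by tiny noise
X = rng.random((n, r)) @ rng.random((r, n)) + 1e-9 * rng.random((n, n))

U, s, Vt = np.linalg.svd(X, full_matrices=False)
# Truncate at a relative tolerance: the discarded singular values
# bound the spectral-norm error of the rank-`rank` approximation
tol = 1e-6
rank = int(np.sum(s > tol * s[0]))
V = U[:, :rank] * s[:rank]      # n x rank factor (columns scaled by s)
W = Vt[:rank, :]                # rank x n factor

err = np.linalg.norm(X - V @ W, 2)
mem_ratio = (V.size + W.size) / X.size   # 2nr / n^2
print(rank, err, mem_ratio)
```

For this example the detected rank is 5 and the storage drops to 2nr/n² = 0.05 of the dense matrix.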


Data Compression in Higher Dimensions

Tensor Trains

Matrix Product States/Tensor Train (TT) format [Wilson '75, White '93, Verstraete '04, Oseledets '09/'11]:

For long indices

i_p … i_q = (i_p − 1) n_{p+1} ⋯ n_q + (i_{p+1} − 1) n_{p+2} ⋯ n_q + ⋯ + (i_{q−1} − 1) n_q + i_q,

the TT format can be expressed as

x(i_1 … i_d) = Σ_{α=1}^{r} x^(1)_{α_1}(i_1) · x^(2)_{α_1,α_2}(i_2) · x^(3)_{α_2,α_3}(i_3) ⋯ x^(d)_{α_{d−1},α_d}(i_d)

or, in matrix form,

x(i_1 … i_d) = x^(1)(i_1) ⋯ x^(d)(i_d),   x^(k)(i_k) ∈ R^{r_{k−1} × r_k},

or as a tensor network diagram: cores x^(1), …, x^(d) linked in a chain by the rank indices α_1, …, α_{d−1}, each carrying one free index i_k.

Storage: O(dnr²).
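A TT decomposition can be computed by sequential SVDs of unfoldings (the TT-SVD algorithm of Oseledets). Below is a compact NumPy sketch, not production code: the truncation tolerance is distributed over the d−1 unfoldings, and the test tensor (a rank-1 product f_1 ⊗ f_2 ⊗ f_3 with illustrative grid functions) should compress to all TT ranks equal to 1.

```python
import numpy as np

def tt_svd(X, eps=1e-10):
    # TT-SVD: split off one index at a time via SVD of the current unfolding;
    # the tolerance eps is split over the d-1 truncations.
    d, ns = X.ndim, X.shape
    delta = eps / np.sqrt(d - 1) * np.linalg.norm(X)
    cores, C, r = [], X.reshape(ns[0], -1), 1
    for k in range(d - 1):
        C = C.reshape(r * ns[k], -1)
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        # smallest rank whose discarded tail sqrt(sum_{j>=rk} s_j^2) <= delta
        tails = np.sqrt(np.cumsum(s[::-1] ** 2))[::-1]
        rk = max(1, int(np.sum(tails > delta)))
        cores.append(U[:, :rk].reshape(r, ns[k], rk))
        C = s[:rk, None] * Vt[:rk]
        r = rk
    cores.append(C.reshape(r, ns[-1], 1))
    return cores

def tt_full(cores):
    # Contract the TT cores back into the full tensor (for verification only)
    X = cores[0]
    for G in cores[1:]:
        X = np.tensordot(X, G, axes=([-1], [0]))
    return X.squeeze(axis=(0, -1))

fs = [np.linspace(1, 2, 10), np.cos(np.linspace(0, 1, 11)), np.ones(12)]
X = np.einsum('i,j,k->ijk', *fs)     # rank-1 test tensor f1 x f2 x f3
cores = tt_svd(X)
print([G.shape for G in cores])      # all internal TT ranks are 1 here
```

Storage of the cores is O(dnr²), versus O(n^d) for the full array.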


Overloading Tensor Operations

Always work with the factors x^(k) ∈ R^{r_{k−1} × n_k × r_k} instead of full tensors.

Sum z = x + y ⇝ increase of tensor rank: r_z = r_x + r_y.

TT format for a high-dimensional operator:

A(i_1 … i_d, j_1 … j_d) = A^(1)(i_1, j_1) ⋯ A^(d)(i_d, j_d).

Matrix-vector multiplication y = Ax ⇝ tensor rank r_y = r_A · r_x.

Additions and multiplications increase TT ranks ⇝ decrease the ranks quasi-optimally via the truncation operator T_ε using SVD (or QR).

Iterative linear solvers (and preconditioners) can be implemented in TT format; we use, e.g., TT-GMRES.
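The rank growth r_z = r_x + r_y under addition comes from block-concatenating the cores: the first cores are stacked side by side, the last cores on top of each other, and the middle cores block-diagonally. A self-contained NumPy sketch (illustrative rank-1 operands, not from the talk):

```python
import numpy as np

def tt_add(xc, yc):
    # z = x + y in TT format: block-wise core concatenation, so rz = rx + ry
    d, zc = len(xc), []
    for k, (G, H) in enumerate(zip(xc, yc)):
        rG0, n, rG1 = G.shape
        rH0, _, rH1 = H.shape
        if k == 0:
            Z = np.concatenate([G, H], axis=2)       # row block [G(i) H(i)]
        elif k == d - 1:
            Z = np.concatenate([G, H], axis=0)       # column block [G(i); H(i)]
        else:
            Z = np.zeros((rG0 + rH0, n, rG1 + rH1))  # blkdiag(G(i), H(i))
            Z[:rG0, :, :rG1] = G
            Z[rG0:, :, rG1:] = H
        zc.append(Z)
    return zc

def tt_full(cores):
    # contract cores back to the full tensor (for verification only)
    X = cores[0]
    for G in cores[1:]:
        X = np.tensordot(X, G, axes=([-1], [0]))
    return X.squeeze(axis=(0, -1))

# two rank-1 TT tensors a x b x c and u x v x w
a, b, c = np.arange(1., 4), np.arange(1., 5), np.arange(1., 6)
u, v, w = a + 1, b + 1, c + 1
x = [a.reshape(1, -1, 1), b.reshape(1, -1, 1), c.reshape(1, -1, 1)]
y = [u.reshape(1, -1, 1), v.reshape(1, -1, 1), w.reshape(1, -1, 1)]
z = tt_add(x, y)
print([Z.shape for Z in z])   # ranks add: (1,3,2), (2,4,2), (2,5,1)
```

After a few such operations the ranks are reduced again with the truncation operator T_ε (TT rounding via SVD), exactly as the slide describes.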


Numerical Examples

Our work

Apply low-rank iterative solvers in TT format to the discrete optimality systems resulting from PDE-constrained optimization problems under uncertainty, discretized by the stochastic Galerkin method, i.e., applying a Galerkin projection to the weak formulation using

implicit Euler (discontinuous Galerkin, dG(0)) in time,

classical finite element methods (FEM) in physical space,

generalized polynomial chaos in the N-dimensional parameter space (resulting from parameterizing the uncertain parameters).

This naturally yields a (1 + 2(3) + N)-way tensor representation of the solution, which we approximate in the low-rank TT format.

Take home message

The biggest problem solved so far has n = 1.29 · 10^15 unknowns (optimality system for an unsteady incompressible Navier-Stokes control problem with uncertain viscosity).

Storing the full solution vector would require ≈ 10 petabytes (PB) = 10,000 TB!

Using low-rank tensor techniques, we need only ≈ 7 · 10^7 bytes = 70 MB to solve the optimality system in MATLAB in less than one hour!


Optimal Control of Unsteady Heat Equation

Consider the optimization problem (with desired state ȳ)

J(t, y, u) = (1/2) ‖y − ȳ‖²_{L²(0,T;D)⊗L²(Ω)} + (α/2) ‖std(y)‖²_{L²(0,T;D)} + (β/2) ‖u‖²_{L²(0,T;D)⊗L²(Ω)}

subject, P-almost surely, to

∂y(t, x, ω)/∂t − ∇·(a(x, ω)∇y(t, x, ω)) = u(t, x, ω)   in (0,T] × D × Ω,
y(t, x, ω) = 0   on (0,T] × ∂D × Ω,
y(0, x, ω) = y_0   in D × Ω,

where

for any z: D × Ω → R, z(x, ·) is a random variable defined on the complete probability space (Ω, F, P) for each x ∈ D;

∃ 0 < a_min < a_max < ∞ s.t. P(ω ∈ Ω : a(x, ω) ∈ [a_min, a_max] ∀ x ∈ D) = 1.


Stochastic Galerkin Finite Element Method

Weak formulation of the random PDE

Seek y ∈ H¹(0,T; H¹_0(D) ⊗ L²(Ω)) such that, P-almost surely,

⟨y_t, v⟩ + B(y, v) = ℓ(u, v)   ∀ v ∈ H¹_0(D) ⊗ L²(Ω),

with the coercive¹ bilinear form

B(y, v) := ∫_Ω ∫_D a(x, ω) ∇y(x, ω) · ∇v(x, ω) dx dP(ω),   v, y ∈ H¹_0(D) ⊗ L²(Ω),

and

ℓ(u, v) := ⟨u(x, ω), v(x, ω)⟩ = ∫_Ω ∫_D u(x, ω) v(x, ω) dx dP(ω),   u, v ∈ H¹_0(D) ⊗ L²(Ω).

Coercivity and boundedness of B + Lax-Milgram ⟹ a unique solution exists.

¹ due to the positivity assumption on a(x, ω)


Stochastic Galerkin Finite Element Method

Weak formulation of the optimality system

Theorem [Chen/Quarteroni 2014, B./Onwunta/Stoll 2016]

Under appropriate regularity assumptions, there exist a unique adjoint state p and an optimal solution (y, u, p) of the optimal control problem for the random unsteady heat equation, satisfying the stochastic optimality conditions (KKT system) for t ∈ (0,T], P-almost surely:

⟨y_t, v⟩ + B(y, v) = ℓ(u, v)   ∀ v ∈ H¹_0(D) ⊗ L²(Ω),
⟨p_t, w⟩ − B*(p, w) = ℓ( (y − ȳ) + (α/2) S(y), w )   ∀ w ∈ H¹_0(D) ⊗ L²(Ω),
ℓ(βu − p, w̃) = 0   ∀ w̃ ∈ L²(D) ⊗ L²(Ω),

where

S(y) is the Fréchet derivative of ‖std(y)‖²_{L²(0,T;D)};

B* is the adjoint operator of B.


Stochastic Galerkin Finite Element Method

Discretization of the random PDE

y, p, u are approximated using a standard Galerkin ansatz, yielding approximations of the form

z(t, x, ω) = Σ_{k=0}^{P−1} Σ_{j=1}^{J} z_{jk}(t) φ_j(x) ψ_k(ξ) = Σ_{k=0}^{P−1} z_k(t, x) ψ_k(ξ).

Here,

{φ_j}_{j=1}^{J} are linear finite elements;

{ψ_k}_{k=0}^{P−1} are the P = (N+n)!/(N! n!) multivariate Legendre polynomials of degree ≤ n.

Implicit Euler/dG(0) is used for the temporal discretization with constant time step τ.
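The stochastic basis size P = (N+n)!/(N! n!) is the dimension of the space of N-variate polynomials of total degree ≤ n, and it grows quickly in both N and n. A quick sanity check of the counting formula (the values N = 4, n = 3 are illustrative, not from the slides):

```python
from math import comb
from itertools import product

N, n = 4, 3
P = comb(N + n, n)   # (N+n)! / (N! n!)
# Cross-check by enumerating all multi-indices of total degree <= n
count = sum(1 for k in product(range(n + 1), repeat=N) if sum(k) <= n)
print(P, count)
```

Already for N = 4 and n = 3 this gives P = 35 stochastic basis functions multiplying every spatial degree of freedom, which is why the overall system is treated as a tensor.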


The Fully Discretized Optimal Control Problem

Discrete first order optimality conditions/KKT system

[ τM_1, 0, −K_t^T ; 0, βτM_2, τN^T ; −K_t, τN, 0 ] [ y ; u ; p ] = [ τM_α ȳ ; 0 ; d ],

where

M_1 = D ⊗ G_α ⊗ M =: D ⊗ M_α,   M_2 = D ⊗ G_0 ⊗ M,
K_t = I_{n_t} ⊗ [ Σ_{i=0}^{N} G_i ⊗ K̄_i ] + (C ⊗ G_0 ⊗ M),
N = I_{n_t} ⊗ G_0 ⊗ M,

and

G_0 = diag(⟨ψ_0²⟩, ⟨ψ_1²⟩, …, ⟨ψ_{P−1}²⟩),   G_i(j, k) = ⟨ξ_i ψ_j ψ_k⟩, i = 1, …, N,
G_α = G_0 + α diag(0, ⟨ψ_1²⟩, …, ⟨ψ_{P−1}²⟩)   (with first moments ⟨·⟩ w.r.t. P),
K̄_0 = M + τK_0,   K̄_i = τK_i, i = 1, …, N,
M, K_i ∈ R^{J×J} are the mass and stiffness matrices w.r.t. the spatial discretization, where K_i corresponds to the contributions of the i-th KLE term to the stiffness,
C = −diag(ones, −1),   D = diag(1/2, 1, …, 1, 1/2) ∈ R^{n_t × n_t}.


Linear system with 3JPn_t unknowns!


Solving the Optimality System

The optimality system leads to a saddle point problem

[ A, B^T ; B, 0 ].

Very large scale setting, (block-)structured sparsity ⇝ iterative solution.

Krylov subspace methods for indefinite symmetric systems: MINRES, ….

Requires a good preconditioner.

Famous three-iterations-convergence result [Murphy/Golub/Wathen 2000]: using the ideal preconditioner

P := [ A, 0 ; 0, −S ]   with the Schur complement S := −BA^{−1}B^T,

MINRES finds the exact solution in at most three steps.

Motivates the use of a mean-based approximate Schur complement preconditioner [ A, 0 ; 0, S ].

Implemented in TT format ⇝ TT-MINRES.
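The Murphy/Golub/Wathen result can be checked numerically on a small dense saddle-point system. The SciPy sketch below (not the authors' code; matrix sizes and random data are illustrative) builds K = [A, B^T; B, 0], applies MINRES with the ideal block-diagonal preconditioner blkdiag(A, BA^{−1}B^T), and observes convergence in very few iterations.

```python
import numpy as np
from scipy.sparse.linalg import minres, LinearOperator

rng = np.random.default_rng(1)
m, p = 30, 10
# SPD (1,1)-block and full-rank constraint block
F = rng.standard_normal((m, m))
A = F @ F.T + m * np.eye(m)
B = rng.standard_normal((p, m))

# Symmetric indefinite saddle-point matrix K = [A B^T; B 0]
K = np.block([[A, B.T], [B, np.zeros((p, p))]])
b = rng.standard_normal(m + p)

# Ideal preconditioner blkdiag(A, B A^{-1} B^T): the preconditioned matrix
# has three distinct eigenvalues, so MINRES converges in <= 3 iterations
S = B @ np.linalg.solve(A, B.T)
P_inv = np.linalg.inv(np.block([[A, np.zeros((m, p))],
                                [np.zeros((p, m)), S]]))
M = LinearOperator((m + p, m + p), matvec=lambda v: P_inv @ v)

its = []
x, info = minres(K, b, M=M, callback=lambda xk: its.append(1))
print(len(its), np.linalg.norm(K @ x - b) / np.linalg.norm(b))
```

In the large-scale TT setting the exact Schur complement is of course unavailable; it is replaced by a cheap mean-based approximation, and all matrix-vector products are carried out in TT arithmetic (TT-MINRES).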


Numerical Results

Mean-Based Preconditioned TT-MINRES

TT-MINRES: # iter (t) for increasing numbers of time steps n_t:

n_t                  2^5           2^6           2^8
dim(A) = 3JPn_t      10,671,360    21,342,720    85,370,880

α = 1, tol = 10^{-3}:
β = 10^{-5}          6 (285.5)     6 (300.0)     8 (372.2)
β = 10^{-6}          4 (77.6)      4 (130.9)     4 (126.7)
β = 10^{-8}          4 (56.7)      4 (59.4)      4 (64.9)

α = 0, tol = 10^{-3}:
β = 10^{-5}          4 (207.3)     6 (366.5)     6 (229.5)
β = 10^{-6}          4 (153.9)     4 (158.3)     4 (172.0)
β = 10^{-8}          2 (35.2)      2 (37.8)      2 (40.0)


Optimal Control of 3D Stokes-Brinkman Problem with Uncertain Viscosity

(Figures: mean and standard deviation of the state and of the control.)

Full size: n_x n_ξ n_t ≈ 3 · 10^9. Reduction: mem(TT)/mem(full) = 0.002.


Conclusions (High-dimensional PDEs)

Low-rank tensor solver for unsteady heat and Navier-Stokes equations with uncertain viscosity.

Similar techniques used for Stokes(-Brinkman) optimal control problems.

Adapted the AMEn (TT) solver to saddle point systems.

To consider next:

Navier-Stokes: many parameters coming from uncertain geometry or the Karhunen-Loève expansion of random fields. Initial results: the more parameters, the more significant the complexity reduction w.r.t. memory (up to a factor of 10^9 for the control problem for a backward facing step).

Exploit multicore technology (requires efficient parallelization of AMEn).


References for Optimal Control under Uncertainty

P. Benner, S. Dolgov, A. Onwunta, and M. Stoll. Solving optimal control problems governed by random Navier-Stokes equations using low-rank methods. arXiv preprint arXiv:1703.06097, March 2017.

P. Benner, S. Dolgov, A. Onwunta, and M. Stoll. Low-rank solvers for unsteady Stokes-Brinkman optimal control problem with random data. Computer Methods in Applied Mechanics and Engineering, 304:26–54, 2016.

P. Benner, A. Onwunta, and M. Stoll. Low rank solution of unsteady diffusion equations with stochastic coefficients. SIAM/ASA Journal on Uncertainty Quantification, 3(1):622–649, 2015.

P. Benner, A. Onwunta, and M. Stoll. Block-diagonal preconditioning for optimal control problems constrained by PDEs with uncertain inputs. SIAM Journal on Matrix Analysis and Applications, 37(2):491–518, 2016.

C.E. Powell and D.J. Silvester. Preconditioning steady-state Navier–Stokes equations with random data. SIAM Journal on Scientific Computing, 34(5):A2482–A2506, 2012.

E. Rosseel and G.N. Wells. Optimal control with stochastic PDE constraints and uncertain controls. Computer Methods in Applied Mechanics and Engineering, 213–216:152–167, 2012.

M. Stoll and T. Breiten. A low-rank in time approach to PDE-constrained optimization. SIAM Journal on Scientific Computing, 37(1):B1–B29, 2015.
