
Krylov subspace methods for linear systems with tensor product structure

Christine Tobler

Seminar for Applied Mathematics, ETH Zürich

19 August 2009

Outline

1 Introduction

2 Basic Algorithm

3 Convergence bounds

4 Solving the compressed equation

5 Numerical experiments

1 Introduction

Linear system with tensor product structure

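We consider the linear system

$$A x = b,$$

$$A = \sum_{s=1}^{d} I_{n_1} \otimes \cdots \otimes I_{n_{s-1}} \otimes A_s \otimes I_{n_{s+1}} \otimes \cdots \otimes I_{n_d},$$

$$b = b_1 \otimes \cdots \otimes b_d,$$

with $A_s \in \mathbb{R}^{n_s \times n_s}$ positive definite and $b_s \in \mathbb{R}^{n_s}$.

Example for 3 dimensions:

$$(A_1 \otimes I \otimes I + I \otimes A_2 \otimes I + I \otimes I \otimes A_3)\, x = b_1 \otimes b_2 \otimes b_3$$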
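As a minimal sketch of the three-dimensional example, the following NumPy code assembles the Kronecker-structured matrix densely and checks that applying it is the same as applying each $A_s$ along one mode of the solution tensor. The helper names (`kron_all`, `kronecker_sum`, `random_spd`), the sizes and the random test matrices are assumptions made for illustration; dense assembly is only feasible for tiny problems.

```python
import numpy as np
from functools import reduce

def kron_all(factors):
    """Kronecker product of a list of matrices or vectors, left to right."""
    return reduce(np.kron, factors)

def kronecker_sum(mats):
    """Dense A = sum_s I x ... x A_s x ... x I (small illustrations only)."""
    sizes = [M.shape[0] for M in mats]
    N = int(np.prod(sizes))
    A = np.zeros((N, N))
    for s, M in enumerate(mats):
        factors = [np.eye(n) for n in sizes]
        factors[s] = M
        A += kron_all(factors)
    return A

def random_spd(n, rng):
    """Hypothetical test matrix: symmetric positive definite by construction."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(0)
sizes = (3, 4, 5)
mats = [random_spd(n, rng) for n in sizes]
vecs = [rng.standard_normal(n) for n in sizes]

A = kronecker_sum(mats)      # 60 x 60 system matrix
b = kron_all(vecs)           # rank-one right-hand side as a vector
x = np.linalg.solve(A, b)

# Same operator applied mode by mode to the 3 x 4 x 5 tensor X (row-major vec).
X = x.reshape(sizes)
AX = (np.einsum('ia,ajk->ijk', mats[0], X)
      + np.einsum('ja,iak->ijk', mats[1], X)
      + np.einsum('ka,ija->ijk', mats[2], X))
residual = np.linalg.norm(AX.reshape(-1) - b)
print("residual of mode-wise application:", residual)
```

The mode-wise form never needs the $60 \times 60$ matrix, which is the property the tensorized method relies on.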

Tensor decompositions

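CP decomposition:

$$\operatorname{vec}(X) = \sum_{r=1}^{k} a_r \otimes b_r \otimes c_r, \qquad a_r \in \mathbb{R}^m,\; b_r \in \mathbb{R}^n,\; c_r \in \mathbb{R}^p$$

Tucker decomposition:

$$\operatorname{vec}(X) = \sum_{i=1}^{m} \sum_{j=1}^{n} \sum_{l=1}^{p} G_{ijl}\; a_i \otimes b_j \otimes c_l = (A \otimes B \otimes C)\operatorname{vec}(G),$$

$$G \in \mathbb{R}^{m \times n \times p}, \qquad a_i \in \mathbb{R}^m,\; b_j \in \mathbb{R}^n,\; c_l \in \mathbb{R}^p,$$

where $a_i$, $b_j$, $c_l$ are the columns of $A$, $B$, $C$.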
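The two formats can be checked numerically on a small tensor. This is a sketch under assumptions: the sizes, the CP rank and the random factors are made up, and `vec` is taken to be NumPy's row-major flattening so that it matches `np.kron`.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 3, 4, 2       # tensor dimensions (hypothetical)
k = 2                   # CP rank (hypothetical)

# CP: vec(X) = sum_r a_r (x) b_r (x) c_r
a = rng.standard_normal((k, m))
b = rng.standard_normal((k, n))
c = rng.standard_normal((k, p))
vec_cp = sum(np.kron(np.kron(a[r], b[r]), c[r]) for r in range(k))
X_cp = np.einsum('ri,rj,rl->ijl', a, b, c)      # same tensor, built entrywise

# Tucker: vec(X) = sum_ijl G_ijl a_i (x) b_j (x) c_l = (A (x) B (x) C) vec(G)
A = rng.standard_normal((m, m))    # columns a_i
B = rng.standard_normal((n, n))    # columns b_j
C = rng.standard_normal((p, p))    # columns c_l
G = rng.standard_normal((m, n, p)) # core tensor
vec_sum = sum(G[i, j, l] * np.kron(np.kron(A[:, i], B[:, j]), C[:, l])
              for i in range(m) for j in range(n) for l in range(p))
vec_kron = np.kron(np.kron(A, B), C) @ G.reshape(-1)

print(np.allclose(vec_cp, X_cp.reshape(-1)), np.allclose(vec_sum, vec_kron))
```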

About this system

The eigenvalues of the matrix $A$ are given by all possible sums

$$\lambda^{(1)}_{i_1} + \lambda^{(2)}_{i_2} + \cdots + \lambda^{(d)}_{i_d},$$

where $\lambda^{(s)}_{i_s}$ denotes an eigenvalue of $A_s$. For $A_s$ positive definite, every such sum has positive real part, so $A$ is nonsingular and the system has a unique solution.

Note that $x$ and $b$ are vector representations of tensors in $\mathbb{R}^{n_1 \times \cdots \times n_d}$, and $b$ represents a rank-one tensor.

A tensor arising from the discretization of a sufficiently smooth function $f$ can be approximated by a short sum of rank-one tensors. By superposition, such a tensor can be used as right-hand side of our system: solve for each rank-one term and add the solutions.
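The eigenvalue statement can be verified on a small example. The sketch below assumes symmetric positive definite test matrices of sizes 2, 3 and 4 (hypothetical choices) and compares the spectrum of the assembled matrix with all sums of one eigenvalue per factor.

```python
import numpy as np
from itertools import product

def random_spd(n, rng):
    """Hypothetical test matrix: symmetric positive definite by construction."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(2)
A1, A2, A3 = (random_spd(n, rng) for n in (2, 3, 4))
I1, I2, I3 = np.eye(2), np.eye(3), np.eye(4)

A = (np.kron(np.kron(A1, I2), I3)
     + np.kron(np.kron(I1, A2), I3)
     + np.kron(np.kron(I1, I2), A3))

eig_A = np.sort(np.linalg.eigvalsh(A))
eig_parts = [np.linalg.eigvalsh(M) for M in (A1, A2, A3)]
eig_sums = np.sort([l1 + l2 + l3 for l1, l2, l3 in product(*eig_parts)])

print("spectra agree:", np.allclose(eig_A, eig_sums))
print("smallest eigenvalue of A:", eig_A[0])
```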

Tensorized Krylov subspaces

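A Krylov subspace is defined as

$$\mathcal{K}_k(A,b) = \operatorname{span}\{\, b,\; Ab,\; \ldots,\; A^{k-1}b \,\}.$$

We define a tensorized Krylov subspace as

$$\mathcal{K}^{\otimes}_{K}(A,b) := \operatorname{span}\bigl( \mathcal{K}_{k_1}(A_1,b_1) \otimes \cdots \otimes \mathcal{K}_{k_d}(A_d,b_d) \bigr), \qquad K = (k_1, \ldots, k_d).$$

Note that

$$\mathcal{K}_{k_0}(A,b) \subset \mathcal{K}^{\otimes}_{K}(A,b) \qquad \text{for } K = (k_0, \ldots, k_0).$$

⇒ Tensorized Krylov subspaces are richer than standard Krylov subspaces.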
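The inclusion can be illustrated for $d = 2$: because the two Kronecker summands commute, $A^j (b_1 \otimes b_2)$ is a combination of terms $A_1^{i} b_1 \otimes A_2^{j-i} b_2$, all of which lie in the tensorized space. The sketch below checks this numerically; the sizes, $k_0 = 3$ and the QR-based bases are assumptions that are adequate only for such small $k_0$.

```python
import numpy as np

def krylov_vectors(M, v, k):
    """Columns v, Mv, ..., M^(k-1) v (unnormalised; fine for small k)."""
    cols = [v]
    for _ in range(k - 1):
        cols.append(M @ cols[-1])
    return np.column_stack(cols)

def orth(V):
    """Orthonormal basis of the column space of V via QR."""
    Q, _ = np.linalg.qr(V)
    return Q

def random_spd(n, rng):
    """Hypothetical test matrix: symmetric positive definite by construction."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(3)
n1, n2, k0 = 6, 7, 3
A1, A2 = random_spd(n1, rng), random_spd(n2, rng)
b1, b2 = rng.standard_normal(n1), rng.standard_normal(n2)

A = np.kron(A1, np.eye(n2)) + np.kron(np.eye(n1), A2)
b = np.kron(b1, b2)

# Basis of the tensorized space: k0 * k0 = 9 columns.
U = np.kron(orth(krylov_vectors(A1, b1, k0)), orth(krylov_vectors(A2, b2, k0)))
# Standard Krylov vectors of the full system: only k0 = 3 columns.
W = krylov_vectors(A, b, k0)

# Relative part of each standard Krylov vector outside the tensorized space.
outside = np.linalg.norm(W - U @ (U.T @ W), axis=0) / np.linalg.norm(W, axis=0)
print("dimension standard:", np.linalg.matrix_rank(W), "tensorized:", U.shape[1])
print("largest relative component outside:", outside.max())
```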

2 Basic Algorithm

Reminder: CG method (Ax = b)

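CG computes the best approximation $x_k$ of $x$ in $\mathcal{K}_k(A,b)$ with respect to the $A$-norm:

$$\|x_k - x\|_A = \min_{\tilde{x} \in \mathcal{K}_k(A,b)} \|\tilde{x} - x\|_A.$$

Find $U_k$ with orthonormal columns that span the Krylov subspace $\mathcal{K}_k(A,b)$. Set $x_k = U_k y$; then $y$ is the solution of the compressed system

$$H_k y = \tilde{b}, \qquad H_k = U_k^\top A U_k, \qquad \tilde{b} = U_k^\top b.$$

Convergence bound ($\kappa = \kappa_2(A)$):

$$\|x_k - x\|_A \le C(A,b,\kappa) \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{k}.$$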
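A small sketch of the projection view of CG: instead of running the CG recurrences, it forms an orthonormal Krylov basis explicitly, solves the compressed system, and checks that the result is optimal in the $A$-norm among nearby elements of the subspace. The matrix, sizes and perturbation scale are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 30, 6
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)          # hypothetical s.p.d. test matrix
b = rng.standard_normal(n)
x = np.linalg.solve(A, b)            # reference solution

# Orthonormal basis U_k of K_k(A, b), built column by column.
U = np.zeros((n, k))
U[:, 0] = b / np.linalg.norm(b)
for j in range(1, k):
    w = A @ U[:, j - 1]
    for _ in range(2):               # two Gram-Schmidt passes for stability
        w = w - U[:, :j] @ (U[:, :j].T @ w)
    U[:, j] = w / np.linalg.norm(w)

# Compressed system H_k y = U_k^T b, then x_k = U_k y.
H_k = U.T @ A @ U
y = np.linalg.solve(H_k, U.T @ b)
x_k = U @ y

def a_norm(v):
    return np.sqrt(v @ (A @ v))

err_galerkin = a_norm(x_k - x)
# Any other element of the subspace has a larger A-norm error.
err_others = min(a_norm(U @ (y + 0.1 * rng.standard_normal(k)) - x)
                 for _ in range(20))
galerkin_defect = np.linalg.norm(U.T @ (b - A @ x_k))
print(err_galerkin, err_others, galerkin_defect)
```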

Tensorized Krylov: A vec(X ) = b

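Applying the Arnoldi method to $A_s$, $b_s$ results in $U_s$, $H_s$ with

$$U_s^\top A_s U_s = H_s,$$

where $U_s$ has orthonormal columns and $H_s$ is an upper Hessenberg matrix.

The columns of $U_s$ span $\mathcal{K}_{k_s}(A_s,b_s)$, and similarly the columns of $U := U_1 \otimes \cdots \otimes U_d$ span $\mathcal{K}^{\otimes}_{K}(A,b)$.

Solve the compressed system

$$H y = \tilde{b}$$

with $x_K = U y$, $\tilde{b} = U^\top b$ and $H = U^\top A U$.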
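The steps above can be sketched end to end for a tiny three-dimensional problem. This is an illustration under assumptions, not the authors' implementation: the Arnoldi routine uses a second Gram-Schmidt pass and assumes no breakdown, all Kronecker products are formed densely, and the test matrices, sizes and helper names are hypothetical.

```python
import numpy as np
from functools import reduce

def arnoldi(M, v, k):
    """k Arnoldi steps: U has orthonormal columns, H = U^T M U is upper Hessenberg."""
    U = np.zeros((v.shape[0], k))
    H = np.zeros((k, k))
    U[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = M @ U[:, j]
        for _ in range(2):                    # second pass = re-orthogonalisation
            for i in range(j + 1):
                h = U[:, i] @ w
                H[i, j] += h
                w = w - h * U[:, i]
        if j + 1 < k:                         # assumes no breakdown (norm > 0)
            H[j + 1, j] = np.linalg.norm(w)
            U[:, j + 1] = w / H[j + 1, j]
    return U, H

def kron_all(factors):
    return reduce(np.kron, factors)

def kronecker_sum(mats):
    """Dense sum_s I x ... x M_s x ... x I (small illustrations only)."""
    eyes = [np.eye(M.shape[0]) for M in mats]
    return sum(kron_all(eyes[:s] + [M] + eyes[s + 1:]) for s, M in enumerate(mats))

def tensorized_krylov(mats, vecs, ks):
    pairs = [arnoldi(M, v, k) for M, v, k in zip(mats, vecs, ks)]
    Us = [U for U, _ in pairs]
    Hs = [H for _, H in pairs]
    H = kronecker_sum(Hs)                                 # compressed operator
    norm_b = np.prod([np.linalg.norm(v) for v in vecs])   # ||b||_2 for rank-one b
    b_tilde = norm_b * kron_all([np.eye(k)[:, 0] for k in ks])
    y = np.linalg.solve(H, b_tilde)
    return kron_all(Us) @ y, H, kron_all(Us)

def random_spd(n, rng):
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(0)
sizes = (5, 6, 7)
mats = [random_spd(n, rng) for n in sizes]
vecs = [rng.standard_normal(n) for n in sizes]
A, b = kronecker_sum(mats), kron_all(vecs)
x = np.linalg.solve(A, b)

errors = []
for ks in [(2, 2, 2), (4, 4, 4), sizes]:
    x_K, H, U = tensorized_krylov(mats, vecs, ks)
    e = x_K - x
    errors.append(np.sqrt(e @ (A @ e)))                   # A-norm error
print(errors)
```

With `ks` equal to the full sizes, each $U_s$ is square and the compressed system reproduces the original one, so the last error is at rounding level.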

Compressed system

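The structure of $H y = \tilde{b}$ corresponds to that of $A x = b$:

$$H = \sum_{s=1}^{d} I_{k_1} \otimes \cdots \otimes I_{k_{s-1}} \otimes H_s \otimes I_{k_{s+1}} \otimes \cdots \otimes I_{k_d},$$

$$\tilde{b} = \tilde{b}_1 \otimes \cdots \otimes \tilde{b}_d = \|b\|_2\, (e_1 \otimes \cdots \otimes e_1), \qquad \tilde{b}_s = U_s^\top b_s.$$

For a dense core tensor $y$, $x_K$ is a tensor in Tucker decomposition:

$$x_K = (U_1 \otimes \cdots \otimes U_d)\, y$$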

3 Convergence bounds

Convergence, s.p.d. case (1)

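Similarly to CG, we have:

$$\|x_K - x\|_A = \min_{\tilde{x} \in \mathcal{K}^{\otimes}_{K}(A,b)} \|\tilde{x} - x\|_A$$

Every vector in a Krylov subspace $\mathcal{K}_{k_s}(A_s,b_s)$ can be written as $p(A_s) b_s$, with $p$ a polynomial of degree at most $k_s - 1$. A similar argument reduces the bound calculation to the min-max problem

$$E_\Omega(K) := \min_{p \in \Pi^{\otimes}_{K}} \left\| p(\lambda_1, \ldots, \lambda_d) - \frac{1}{\lambda_1 + \cdots + \lambda_d} \right\|_{\Omega},$$

where $\Pi^{\otimes}_{K}$ is a space of multivariate polynomials, and $\Omega$ contains the eigenvalue tuples $(\lambda_1, \ldots, \lambda_d)$, with $\lambda_i \in [\lambda_{\min}(A_i), \lambda_{\max}(A_i)]$.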

Convergence, s.p.d. case (2)

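Inserting an upper bound for $E_\Omega(K)$, we find

$$\|x_K - x\|_A \le \sum_{s=1}^{d} C(A,b,\kappa_s) \left( \frac{\sqrt{\kappa_s} - 1}{\sqrt{\kappa_s} + 1} \right)^{k_s}$$

with $\kappa_s = 1 + \dfrac{\lambda_{\max}(A_s) - \lambda_{\min}(A_s)}{\lambda_{\min}(A)}$.

For the case $A_s$, $k_s$ constant ($A_s \equiv A_1$, $k_s \equiv k$):

$$\|x_K - x\|_A \le C(A,b,d) \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{k}$$

with $\kappa = \dfrac{d-1}{d} + \dfrac{\kappa_2(A)}{d}$.

Note that the convergence rate improves with increasing dimension.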
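To see the effect of the dimension, one can tabulate the contraction factor $(\sqrt{\kappa}-1)/(\sqrt{\kappa}+1)$. The sketch assumes identical factors $A_s$, for which $\lambda_{\min}(A) = d\,\lambda_{\min}(A_s)$ and $\kappa_2(A) = \kappa_2(A_s)$, so the definition of $\kappa_s$ gives $\kappa = (d-1)/d + \kappa_2(A)/d$. The value $\kappa_2(A) = 10^4$ and the function names are hypothetical.

```python
import math

def kappa_effective(kappa2, d):
    """kappa = (d - 1)/d + kappa_2(A)/d for identical factors A_s."""
    return (d - 1) / d + kappa2 / d

def contraction_factor(kappa):
    """Per-step factor (sqrt(kappa) - 1) / (sqrt(kappa) + 1) in the bound."""
    root = math.sqrt(kappa)
    return (root - 1.0) / (root + 1.0)

kappa2 = 1.0e4                       # hypothetical condition number
dims = (1, 5, 10, 50, 100)
factors = [contraction_factor(kappa_effective(kappa2, d)) for d in dims]
for d, rho in zip(dims, factors):
    print(f"d = {d:3d}  kappa = {kappa_effective(kappa2, d):10.2f}  factor = {rho:.4f}")
```

For $d = 1$ the expression reduces to the classical CG factor; larger $d$ gives a smaller factor, in line with the remark that the rate improves with the dimension.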

Convergence, non-symmetric positive definite case

General convergence bound:

$$\|x_K - x\|_2 \le \sum_{s=1}^{d} \int_0^\infty e^{-\tilde{\alpha}_s t}\, \bigl\| U_s e^{-t H_s} e_1 - e^{-t A_s} b_s \bigr\|_2 \, dt$$

with $\tilde{\alpha}_s := \sum_{j \ne s} \alpha_j$ and $\alpha_j = \lambda_{\min}(A_j + A_j^\top)/2$.

The bound on $\| U_s e^{-t H_s} e_1 - e^{-t A_s} b_s \|_2$ depends on additional knowledge about $A_s$.

For example, when the field of values of each matrix $A_s$ is contained in a known ellipse, an explicit convergence bound can be found.

4 Solving the compressed equation

Grasedyck’s method, system Hy = b

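For $H$ positive definite:

$$H^{-1} = \int_0^\infty \exp(-tH)\, dt$$

The exponential of $H$ has a Kronecker product structure, too. Writing $\hat{H}_s := I_{k_1} \otimes \cdots \otimes H_s \otimes \cdots \otimes I_{k_d}$ for the $s$-th summand of $H$ (these summands commute),

$$\exp(-tH) = \exp\Bigl(-t \sum_{s=1}^{d} \hat{H}_s\Bigr) = \prod_{s=1}^{d} \exp(-t \hat{H}_s) = \bigotimes_{s=1}^{d} \exp(-t H_s).$$

Approximation $y_m$:

$$y_m = \sum_{j=1}^{m} \omega_j \bigotimes_{s=1}^{d} \exp(-\alpha_j H_s)\, \tilde{b}_s$$

with certain coefficients $\alpha_j$, $\omega_j$.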
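The structure of this approximation can be tried out with any exponential sum for $1/z$. The sketch below does not use the optimized coefficients discussed in the talk; as a stand-in it applies the trapezoidal rule to $1/z = \int_{-\infty}^{\infty} e^{u} \exp(-z e^{u})\, du$, which gives $\alpha_j = e^{jh}$ and $\omega_j = h e^{jh}$. The step size, truncation limits, symmetric test matrices and names are assumptions, and far more terms are used than an optimized sum would need.

```python
import numpy as np
from functools import reduce

def exponential_sum_coefficients(h=0.4, j_min=-80, j_max=15):
    """Trapezoidal-rule stand-in: 1/z ~ sum_j omega_j exp(-alpha_j z)."""
    j = np.arange(j_min, j_max + 1)
    alpha = np.exp(j * h)
    return alpha, h * alpha

def random_spd(n, rng):
    """Hypothetical symmetric positive definite compressed matrix H_s."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

rng = np.random.default_rng(4)
sizes = (4, 5, 6)
Hs = [random_spd(k, rng) for k in sizes]
b_tildes = [rng.standard_normal(k) for k in sizes]
alpha, omega = exponential_sum_coefficients()

# Scalar check of the exponential sum on [1, 1000].
z = np.linspace(1.0, 1000.0, 500)
approx = (omega[None, :] * np.exp(-np.outer(z, alpha))).sum(axis=1)
scalar_rel_err = np.max(np.abs(approx * z - 1.0))

# y_m = sum_j omega_j (x)_s exp(-alpha_j H_s) b_s; only small matrices H_s are used.
eig = [np.linalg.eigh(H) for H in Hs]        # exp(-a H) v via eigendecomposition
y_m = np.zeros(int(np.prod(sizes)))
for a, w in zip(alpha, omega):
    factors = [V @ (np.exp(-a * lam) * (V.T @ bt))
               for (lam, V), bt in zip(eig, b_tildes)]
    y_m += w * reduce(np.kron, factors)

# Dense reference solution of H y = b (illustration only).
eyes = [np.eye(k) for k in sizes]
H = sum(reduce(np.kron, eyes[:s] + [Hs[s]] + eyes[s + 1:]) for s in range(3))
y = np.linalg.solve(H, reduce(np.kron, b_tildes))
rel_err = np.linalg.norm(y - y_m) / np.linalg.norm(y)
print("scalar error:", scalar_rel_err, " solution error:", rel_err)
```

Each term of the sum is a rank-one tensor, so $y_m$ is obtained directly in a low-rank (CP) format with $m$ terms.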

Coefficients of the exponential sum (1)

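The coefficients $\alpha_j$, $\omega_j$ should minimize

$$\sup_{z \in \Lambda(H)} \Bigl| \frac{1}{z} - \sum_{j=1}^{m} \omega_j e^{-\alpha_j z} \Bigr|$$

Case 1: $H$ symmetric and condition number known

The eigenvalues of $H$ are real: $\Lambda(H) \subset [\lambda_{\min}, \lambda_{\max}]$. There are coefficients $\alpha_j$, $\omega_j$ such that

$$\|y - y_m\|_2 \le C(H, \tilde{b})\, \exp\!\left( \frac{-m \pi^2}{\log(8 \kappa_2(H))} \right)$$

These coefficients may be found using a variant of the Remez algorithm. Tabulated values have been made available by Hackbusch.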
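The Case 1 bound gives a quick estimate of how many terms are needed for a target accuracy: ignoring the constant $C(H,\tilde{b})$, $\exp(-m\pi^2/\log(8\kappa_2(H))) \le \varepsilon$ holds once $m \ge \log(1/\varepsilon)\,\log(8\kappa_2(H))/\pi^2$. The helper below is a hypothetical convenience function for this arithmetic, with the constant set to 1 as an assumption.

```python
import math

def terms_needed(kappa, tol):
    """Smallest m with exp(-m * pi^2 / log(8 * kappa)) <= tol (constant C ignored)."""
    return math.ceil(math.log(1.0 / tol) * math.log(8.0 * kappa) / math.pi ** 2)

for kappa in (1e5, 1e15):
    print(f"kappa = {kappa:.0e}: m >= {terms_needed(kappa, 1e-8)}")
# kappa = 1e+05: m >= 26
# kappa = 1e+15: m >= 69
```

The number of terms grows only logarithmically with the condition number.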

Coefficients of the exponential sum (2)

Case 2: Nonsymmetric $H$ and/or unknown condition number

There is an explicit formula for $\alpha_j$, $\omega_j$, where the eigenvalues of $H$ only need to have positive real part.

$$\|y - y_{2m+1}\|_2 \le C(H, \tilde{b})\, \exp(\mu/\pi)\, \exp\bigl(-\sqrt{\,\ldots\,}\bigr)$$

where $\mu = \max |\operatorname{Im}(\Lambda(H))|$.

The convergence is significantly slower than for Case 1.

Theoretical convergence bounds for the coefficients

[Figure: error $\|y - y_k\|_2$ on a logarithmic scale versus $k$ (0 to 200); curves: Case 1 for condition number 1e5, Case 1 for condition number 1e15, Case 2.]

5 Numerical experiments

Symmetric example: Poisson Equation

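Finite difference discretization of the Poisson equation in $d$ dimensions:

$$-\Delta u = f \ \text{ in } \Omega = [0,1]^d, \qquad u = 0 \ \text{ on } \Gamma := \partial\Omega,$$

where the right-hand side $f$ is a separable function.

$$A_s = \frac{1}{h^2} \begin{pmatrix} 2 & -1 & & \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ & & -1 & 2 \end{pmatrix}, \qquad b_s: \text{random numbers}.$$

The approximation error is measured by the relative residual $\|A x_K - b\|_2 / \|b\|_2$.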
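A scaled-down version of this experiment can be sketched as follows. The relative residual is evaluated mode by mode on the solution tensor, so the large matrix $A$ is never formed. Assumptions: mesh width $h = 1/(n+1)$, tiny sizes ($n = 6$, $d = 3$), a dense solve of the compressed system, and hypothetical helper names; the talk's experiments use far larger $n$ and $d$.

```python
import numpy as np
from functools import reduce

def poisson_1d(n):
    """(1/h^2) * tridiag(-1, 2, -1) with h = 1/(n + 1) (assumed mesh width)."""
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h ** 2

def arnoldi(M, v, k):
    """k Arnoldi steps with re-orthogonalisation; assumes no breakdown."""
    U = np.zeros((v.shape[0], k))
    H = np.zeros((k, k))
    U[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = M @ U[:, j]
        for _ in range(2):
            for i in range(j + 1):
                h = U[:, i] @ w
                H[i, j] += h
                w = w - h * U[:, i]
        if j + 1 < k:
            H[j + 1, j] = np.linalg.norm(w)
            U[:, j + 1] = w / H[j + 1, j]
    return U, H

def mode_product(M, X, s):
    """Apply the matrix M along mode s of the tensor X."""
    return np.moveaxis(np.tensordot(M, X, axes=(1, s)), 0, s)

def apply_kronecker_sum(mats, X):
    """Tensor form of A x: sum over s of A_s applied along mode s."""
    return sum(mode_product(M, X, s) for s, M in enumerate(mats))

def solve_tensorized(mats, vecs, k):
    d = len(mats)
    pairs = [arnoldi(M, v, k) for M, v in zip(mats, vecs)]
    Us = [U for U, _ in pairs]
    Hs = [H for _, H in pairs]
    B_tilde = reduce(np.multiply.outer, [U.T @ v for U, v in zip(Us, vecs)])
    eye = np.eye(k)
    H = sum(reduce(np.kron, [Hs[s] if t == s else eye for t in range(d)])
            for s in range(d))                 # dense, acceptable for small k**d
    X = np.linalg.solve(H, B_tilde.reshape(-1)).reshape((k,) * d)
    for s, U in enumerate(Us):                 # expand the Tucker core
        X = mode_product(U, X, s)
    return X

def relative_residual(mats, X, B):
    return np.linalg.norm(apply_kronecker_sum(mats, X) - B) / np.linalg.norm(B)

rng = np.random.default_rng(5)
n, d = 6, 3
mats = [poisson_1d(n)] * d
vecs = [rng.random(n) for _ in range(d)]
B = reduce(np.multiply.outer, vecs)            # rank-one right-hand side tensor
residuals = {k: relative_residual(mats, solve_tensorized(mats, vecs, k), B)
             for k in (2, 4, 6)}
print(residuals)
```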

Convergence for system size $200^d$

[Figure, two panels: "Numerical convergence" and "Theoretical convergence rate"; logarithmic vertical axis, horizontal axis from 50 to 200; curves for d = 5, d = 10, d = 50, d = 100.]

Convergence for system size $1000^d$

[Figure, two panels; logarithmic vertical axis, horizontal axis from 200 to 1000; curves for d = 5, d = 10, d = 50, d = 100.]

Extended Krylov subspaces

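Using extended Krylov subspaces

$$\mathcal{K}^{\mathrm{ext}}_{k_s}(A_s,b_s) := \operatorname{span}\bigl( \mathcal{K}_{k_s}(A_s,b_s) \cup \mathcal{K}_{k_s+1}(A_s^{-1},b_s) \bigr),$$

the algorithm works analogously.

Convergence bound ($A_s$, $k_s$ constant):

$$\|x_K - x\|_2 \le C(A,b,d) \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{k}$$

The convergence rate depends on $\kappa$:

$$\kappa \approx \frac{d-1}{d} + \frac{1}{d^2}\, (d-1)^{\frac{d-1}{d}}\, \kappa_2(A)^{\frac{d-1}{d}}$$

$$d = 2:\ \kappa = \frac{1}{2} + \frac{\sqrt{\kappa_2(A)}}{4}, \qquad d \gg 0:\ \kappa \approx \frac{d-1}{d} + \frac{\kappa_2(A)}{d}$$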
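A basis of one extended Krylov subspace can be built by alternately multiplying with $A_s$ and solving with $A_s$. The sketch below is a simple Gram-Schmidt variant written for illustration, not a production extended Arnoldi method: it calls a dense solver in every step (in practice a factorization would be reused), assumes no breakdown, and uses hypothetical names and test data.

```python
import numpy as np

def orthonormalize_against(Q, w):
    """Two Gram-Schmidt passes of w against the vectors in Q, then normalise."""
    for _ in range(2):
        for q in Q:
            w = w - (q @ w) * q
    return w / np.linalg.norm(w)

def extended_krylov_basis(M, v, k):
    """Orthonormal basis of span{v, Mv, ..., M^(k-1) v, M^(-1) v, ..., M^(-k) v}."""
    Q = [v / np.linalg.norm(v)]
    q_pos = q_neg = Q[0]
    for i in range(1, k + 1):
        if i < k:                                   # next positive power
            q_pos = orthonormalize_against(Q, M @ q_pos)
            Q.append(q_pos)
        q_neg = orthonormalize_against(Q, np.linalg.solve(M, q_neg))
        Q.append(q_neg)                             # next negative power
    return np.column_stack(Q)

rng = np.random.default_rng(6)
n, k = 12, 3
G = rng.standard_normal((n, n))
A_s = G @ G.T + n * np.eye(n)        # hypothetical s.p.d. factor matrix
b_s = rng.standard_normal(n)

basis = extended_krylov_basis(A_s, b_s, k)          # n x 2k

# Check: explicit vectors A^i b (i < k) and A^(-i) b (i <= k) lie in the span.
explicit = [np.linalg.matrix_power(A_s, i) @ b_s for i in range(k)]
explicit += [np.linalg.solve(np.linalg.matrix_power(A_s, i), b_s)
             for i in range(1, k + 1)]
outside = max(np.linalg.norm(v - basis @ (basis.T @ v)) / np.linalg.norm(v)
              for v in explicit)
print(basis.shape, outside)
```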

Extended Krylov, system size $200^d$

[Figure, two panels; logarithmic vertical axis; horizontal axis from 5 to 40 in the first panel and from 50 to 200 in the second; curves for d = 5, d = 10, d = 50, d = 100.]

Non-symmetric example

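Finite difference discretization in $d$ dimensions of the convection-diffusion equation

$$-\Delta u - (c, \ldots, c) \nabla u = f \ \text{ in } \Omega = [0,1]^d, \qquad u = 0 \ \text{ on } \Gamma := \partial\Omega,$$

where $f$ is again a separable function.

$$A_s = \frac{1}{h^2} \begin{pmatrix} 2 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 2 \end{pmatrix} + \frac{c}{4h} \begin{pmatrix} 3 & -5 & 1 & & \\ 1 & 3 & -5 & \ddots & \\ & \ddots & \ddots & \ddots & 1 \\ & & 1 & 3 & -5 \\ & & & 1 & 3 \end{pmatrix}$$

For $c = 10$, all eigenvalues of $A$ are real. For $c = 100$, this is not the case, and the convergence is significantly slower.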
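To experiment with the dependence on $c$, the factor matrix can be assembled and its eigenvalues inspected. The sketch rests on assumptions: mesh width $h = 1/(n+1)$, and the convection stencil $(1, 3, -5, 1)$ scaled by $c/(4h)$, which is the scaling that makes it a consistent approximation of $-c\,u'$. Since the eigenvalues of $A$ are sums of eigenvalues of the $A_s$, identical factors give a real spectrum for $A$ exactly when $A_s$ has one. No particular outcome is asserted here; the printed values depend on $n$.

```python
import numpy as np

def convection_diffusion_1d(n, c):
    """Diffusion part tridiag(-1, 2, -1)/h^2 plus convection stencil * c/(4h)."""
    h = 1.0 / (n + 1)                               # assumed mesh width
    diffusion = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h ** 2
    convection = (3.0 * np.eye(n) - 5.0 * np.eye(n, k=1)
                  + np.eye(n, k=2) + np.eye(n, k=-1)) * c / (4.0 * h)
    return diffusion + convection

def max_imaginary_part(M):
    """Largest |Im(lambda)| over the eigenvalues of M."""
    return np.abs(np.linalg.eigvals(M).imag).max()

n = 200
for c in (10.0, 100.0):
    A_s = convection_diffusion_1d(n, c)
    print(f"c = {c:5.1f}: max |Im(lambda)| of A_s = {max_imaginary_part(A_s):.3e}")
```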

Non-symmetric case, system size $200^d$

[Figure, two panels: "Convergence for c = 10" (curves for d = 5, d = 10, d = 50, d = 100) and "Convergence for c = 100"; logarithmic vertical axis, horizontal axis from 50 to 200.]

Conclusions

- An efficient algorithm to calculate a low-rank approximation of the solution tensor
- The computational complexity is linear in the number of dimensions
- Only matrix-vector operations with the full system matrices are required
- A theoretical convergence bound was found.

Thank you for your attention!

Literature

D. Kressner, C. Tobler, “Krylov subspace methods for linear systems with tensor product structure”, Research Report No. 2009-16, SAM, ETH Zurich, 2009.

L. Grasedyck, “Existence and computation of low Kronecker-rank approximations for large linear systems of tensor product structure”, Computing, 72(3-4):247-265, 2004.

D. Braess, W. Hackbusch, “Approximation of 1/x by exponential sums in [1,∞)”, IMA J. Numer. Anal., 25(4):685-697, 2005.
