A short course on: Preconditioned Krylov subspace methods

Yousef Saad
University of Minnesota
Dept. of Computer Science and Engineering

Universite du Littoral, Jan 19-30, 2005
Outline
Part 1
• Introduction, discretization of PDEs
• Sparse matrices and sparsity
• Basic iterative methods (relaxation, ...)
Part 2
• Projection methods
• Krylov subspace methods
Part 3
• Preconditioned iterations
• Preconditioning techniques
• Parallel implementations
Part 4
• Eigenvalue problems
• Applications –
Calais February 7, 2005 2
Preconditioning – Basic principles
Basic idea is to use the Krylov subspace method on a modified
system such as
M−1Ax = M−1b.
• The matrix M−1A need not be formed explicitly; one only needs to
solve Mw = v whenever needed.
• Consequence: fundamental requirement is that it should be easy
to compute M−1v for an arbitrary vector v.
Left, Right, and Split preconditioning
Left preconditioning
M−1Ax = M−1b
Right preconditioning
AM−1u = b, with x = M−1u
Split preconditioning. Assume M is factored: M = MLMR. Then
ML−1AMR−1u = ML−1b, with x = MR−1u
Preconditioned CG (PCG)
➤ Assume: A and M are both SPD.
➤ Applying CG directly to M−1Ax = M−1b or AM−1u = b
won't work because the coefficient matrices are not symmetric.
➤ Alternative: when M = LLT , use the split preconditioner option.
➤ Second alternative: observe that M−1A is self-adjoint w.r.t. the M
inner product:
(M−1Ax, y)M = (Ax, y) = (x, Ay) = (x, M−1Ay)M
Preconditioned CG (PCG)
ALGORITHM : 1 Preconditioned Conjugate Gradient
1. Compute r0 := b−Ax0, z0 = M−1r0, and p0 := z0
2. For j = 0, 1, . . ., until convergence Do:
3. αj := (rj, zj)/(Apj, pj)
4. xj+1 := xj + αjpj
5. rj+1 := rj − αjApj
6. zj+1 := M−1rj+1
7. βj := (rj+1, zj+1)/(rj, zj)
8. pj+1 := zj+1 + βjpj
9. EndDo
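As a concrete illustration, Algorithm 1 can be sketched in a few lines of NumPy. The function name `pcg`, the 1-D Laplacian test matrix, and the Jacobi preconditioner below are illustrative choices, not from the slides:

```python
import numpy as np

def pcg(A, b, apply_Minv, tol=1e-10, maxit=500):
    """Preconditioned CG (Algorithm 1): apply_Minv(v) returns M^{-1} v."""
    x = np.zeros_like(b, dtype=float)
    r = b - A @ x
    z = apply_Minv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxit):
        Ap = A @ p
        alpha = rz / (Ap @ p)           # line 3: (r_j, z_j)/(A p_j, p_j)
        x += alpha * p                  # line 4
        r -= alpha * Ap                 # line 5
        if np.linalg.norm(r) < tol:
            break
        z = apply_Minv(r)               # line 6: z_{j+1} = M^{-1} r_{j+1}
        rz_new = r @ z
        beta = rz_new / rz              # line 7
        p = z + beta * p                # line 8
        rz = rz_new
    return x

# Illustrative data (not from the slides): 1-D Laplacian, Jacobi preconditioner
n = 30
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = pcg(A, b, lambda v: v / A.diagonal())
```

Only the action v → M−1v is needed, here passed as a callback, which matches the requirement stated earlier that M−1v be easy to compute.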
Note that M−1A is also self-adjoint with respect to (., .)A:
(M−1Ax, y)A = (AM−1Ax, y) = (x, AM−1Ay) = (x, M−1Ay)A
➤ Can obtain a similar algorithm.
➤ Assume M is given by its Cholesky factorization M = LLT .
Then, another possibility: the split preconditioning option, which
applies CG to the system
L−1AL−Tu = L−1b, with x = L−Tu
➤ Notation: Â = L−1AL−T . All quantities related to the preconditioned
system are indicated by a hat (ˆ).
ALGORITHM : 2 Conjugate Gradient with Split Preconditioner
1. Compute r0 := b−Ax0; r̂0 = L−1r0; and p0 := L−T r̂0.
2. For j = 0, 1, . . ., until convergence Do:
3. αj := (r̂j, r̂j)/(Apj, pj)
4. xj+1 := xj + αjpj
5. r̂j+1 := r̂j − αjL−1Apj
6. βj := (r̂j+1, r̂j+1)/(r̂j, r̂j)
7. pj+1 := L−T r̂j+1 + βjpj
8. EndDo
➤ The xj’s produced by the above algorithm and by PCG are identical
(if the same initial guess is used).
Flexible accelerators
Question: What can we do in case M is defined only approximately,
i.e., if it can vary from one step to the next?
Applications:
➤ Iterative techniques as preconditioners: block-SOR, SSOR, multigrid, etc.
➤ Chaotic relaxation type preconditioners (e.g., in a parallel computing environment)
➤ Mixing preconditioners – mixing coarse-mesh / fine-mesh preconditioners.
ALGORITHM : 3 GMRES – No preconditioning
1. Start: Choose x0 and a dimension m of the Krylov subspaces.
2. Arnoldi process:
• Compute r0 = b−Ax0, β = ‖r0‖2 and v1 = r0/β.
• For j = 1, ...,m do
– Compute w := Avj
– For i = 1, . . . , j, do
hi,j := (w, vi)
w := w − hi,jvi
– hj+1,j = ‖w‖2; vj+1 = w/hj+1,j
• Define Vm := [v1, ..., vm] and Hm = {hi,j}.
3. Form the approximate solution: Compute xm = x0 + Vmym where
ym = argminy‖βe1 − Hmy‖2 and e1 = [1, 0, . . . , 0]T .
4. Restart: If satisfied stop, else set x0 ← xm and goto 2.
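The restarted loop above can be sketched directly in NumPy. This is an illustrative transcription, not production code: the least-squares problem is solved with `lstsq` rather than the Givens-rotation update used in practice, and the test matrix is an arbitrary diagonally dominant example:

```python
import numpy as np

def gmres_restarted(A, b, m=20, tol=1e-10, max_restarts=50):
    """Restarted GMRES(m), Algorithm 3: Arnoldi with modified Gram-Schmidt,
    then minimize ||beta*e1 - H_m y||_2 by least squares."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(max_restarts):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta < tol:
            break
        V = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        V[:, 0] = r / beta
        k = m
        for j in range(m):                     # Arnoldi process
            w = A @ V[:, j]
            for i in range(j + 1):             # modified Gram-Schmidt
                H[i, j] = w @ V[:, i]
                w -= H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] < 1e-14:            # happy breakdown
                k = j + 1
                break
            V[:, j + 1] = w / H[j + 1, j]
        rhs = np.zeros(k + 1)
        rhs[0] = beta                          # beta * e1
        y, *_ = np.linalg.lstsq(H[:k + 1, :k], rhs, rcond=None)
        x = x + V[:, :k] @ y
    return x

# Illustrative data (not from the slides): a diagonally dominant matrix
rng = np.random.default_rng(0)
n = 60
A = 5 * np.eye(n) + rng.random((n, n)) / n
b = np.ones(n)
x = gmres_restarted(A, b, m=20)
```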
ALGORITHM : 4 GMRES – with (right) Preconditioning
1. Start: Choose x0 and a dimension m of the Krylov subspaces.
2. Arnoldi process:
• Compute r0 = b−Ax0, β = ‖r0‖2 and v1 = r0/β.
• For j = 1, ...,m do
– Compute zj := M−1vj
– Compute w := Azj
– For i = 1, . . . , j, do
hi,j := (w, vi)
w := w − hi,jvi
– hj+1,j = ‖w‖2; vj+1 = w/hj+1,j
• Define Vm := [v1, ..., vm] and Hm = {hi,j}.
3. Form the approximate solution: Compute xm = x0 +M−1Vmym
where ym = argminy‖βe1 − Hmy‖2 and e1 = [1, 0, . . . , 0]T .
4. Restart: If satisfied stop, else set x0 ← xm and goto 2.
ALGORITHM : 5 GMRES – with variable Preconditioning
1. Start: Choose x0 and a dimension m of the Krylov subspaces.
2. Arnoldi process:
• Compute r0 = b−Ax0, β = ‖r0‖2 and v1 = r0/β.
• For j = 1, ...,m do
– Compute zj := Mj−1vj ; Compute w := Azj
– For i = 1, . . . , j, do
hi,j := (w, vi)
w := w − hi,jvi
– hj+1,j = ‖w‖2; vj+1 = w/hj+1,j
• Define Zm := [z1, ..., zm] and Hm = {hi,j}.
3. Form the approximate solution: Compute xm = x0 + Zmym where
ym = argminy‖βe1 − Hmy‖2 and e1 = [1, 0, . . . , 0]T .
4. Restart: If satisfied stop, else set x0 ← xm and goto 2.
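Algorithm 5 differs from right-preconditioned GMRES in exactly two places: each zj must be stored, and the correction is Zm ym rather than M−1Vm ym. A minimal sketch, assuming an illustrative Jacobi-type preconditioner (here constant in j, but the callback may vary with j):

```python
import numpy as np

def fgmres(A, b, apply_prec, m=20, tol=1e-10, max_restarts=50):
    """Flexible GMRES (Algorithm 5): apply_prec(v, j) may change with j;
    the z_j's are saved, and the correction is Z_m y_m."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(max_restarts):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta < tol:
            break
        V = np.zeros((n, m + 1))
        Z = np.zeros((n, m))
        H = np.zeros((m + 1, m))
        V[:, 0] = r / beta
        k = m
        for j in range(m):
            Z[:, j] = apply_prec(V[:, j], j)   # z_j := M_j^{-1} v_j
            w = A @ Z[:, j]
            for i in range(j + 1):
                H[i, j] = w @ V[:, i]
                w -= H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] < 1e-14:
                k = j + 1
                break
            V[:, j + 1] = w / H[j + 1, j]
        rhs = np.zeros(k + 1)
        rhs[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 1, :k], rhs, rcond=None)
        x = x + Z[:, :k] @ y                   # x_m = x_0 + Z_m y_m
    return x

# Illustrative data: Jacobi used as the (here constant) preconditioner M_j
rng = np.random.default_rng(0)
n = 60
A = 5 * np.eye(n) + rng.random((n, n)) / n
b = np.ones(n)
x = fgmres(A, b, lambda v, j: v / A.diagonal(), m=15)
```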
Properties
• xm minimizes ‖b−Ax‖2 over x0 + Span{Zm}.
• If Azj = vj (i.e., if preconditioning is ‘exact’ at step j) then the
approximation xj is exact.
• If Mj is constant then the method is ≡ to right-preconditioned GMRES.
Additional costs:
• Arithmetic: none.
• Memory: must save the additional set of vectors {zj}j=1,...,m
Advantage: Flexibility
Standard preconditioners
• Simplest preconditioner: M = Diag(A) ➤ poor convergence.
• Next to simplest: SSOR
M = (D − ωE)D−1(D − ωF )
• Still simple but often more efficient: ILU(0).
• ILU(p) – ILU with level of fill p – more complex.
• Class of ILU preconditioners with threshold
• Class of approximate inverse preconditioners
• Class of multilevel ILU preconditioners
• Algebraic multigrid preconditioners
An observation. Introduction to Preconditioning
➤ Take a look back at basic relaxation methods: Jacobi, Gauss-
Seidel, SOR, SSOR, ...
➤ These are iterations of the form x(k+1) = Mx(k) + f where M is
of the form M = I − P−1A. For example, for SSOR:
PSSOR = (D − ωE)D−1(D − ωF )
➤ SSOR attempts to solve the equivalent system
P−1Ax = P−1b
where P ≡ PSSOR, by the fixed point iteration
x(k+1) = (I − P−1A)x(k) + P−1b, with iteration matrix M = I − P−1A,
instead of x(k+1) = (I − A)x(k) + b.
In other words:
Relaxation Scheme ⇐⇒ Preconditioned Fixed Point Iteration
The SOR/SSOR preconditioner
[Figure: splitting of A into its diagonal D, strict lower part −E, and strict upper part −F.]
➤ SOR preconditioning
MSOR = (D − ωE)
➤ SSOR preconditioning
MSSOR = (D − ωE)D−1(D − ωF )
➤ MSSOR = LU , with L = unit lower triangular and U = upper triangular. One
solve with MSSOR ≈ same cost as a MAT-VEC.
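Applying MSSOR−1 thus amounts to a lower triangular solve, a diagonal scaling, and an upper triangular solve. A small dense sketch (function name and test data are illustrative; a real code would use dedicated sparse triangular solves rather than `np.linalg.solve`):

```python
import numpy as np

def ssor_apply(A, v, omega=1.0):
    """Apply M_SSOR^{-1} to v, with M = (D - wE) D^{-1} (D - wF) and the
    splitting A = D - E - F (D diagonal, -E strictly lower, -F strictly
    upper).  Two triangular solves plus a diagonal scaling."""
    d = A.diagonal()
    E = -np.tril(A, -1)
    F = -np.triu(A, 1)
    y = np.linalg.solve(np.diag(d) - omega * E, v)          # (D - wE) y = v
    return np.linalg.solve(np.diag(d) - omega * F, d * y)   # (D - wF) w = D y

# Check against forming M explicitly (small illustrative matrix, omega = 1)
rng = np.random.default_rng(1)
A = 5 * np.eye(6) + rng.random((6, 6))
v = np.ones(6)
w = ssor_apply(A, v)
D = np.diag(A.diagonal())
M = (D + np.tril(A, -1)) @ np.linalg.inv(D) @ (D + np.triu(A, 1))
```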
➤ k-step SOR (resp. SSOR) preconditioning:
k steps of SOR (resp. SSOR)
➤ Questions: Best ω? For preconditioning one can take ω = 1:
M = (D − E)D−1(D − F )
Observe: M = A + R with R = ED−1F .
➤ Best k? k = 1 is rarely the best. Substantial differences in
performance.
[Figure: CPU time versus number of SOR steps k for SOR(k)-preconditioned GMRES(10) and GMRES(20).]
ILU(0) and IC(0) preconditioners
➤ Notation: NZ(X) = {(i, j) | Xi,j ≠ 0}
➤ Formal definition of ILU(0): A = LU + R with
NZ(L) ∪ NZ(U) = NZ(A)
rij = 0 for (i, j) ∈ NZ(A)
➤ This does not define ILU(0) in a unique way.
Constructive definition: Compute the LU factorization of A but
drop any fill-in in L and U outside of Struct(A).
➤ ILU factorizations are often based on the i, k, j version of GE.
What is the IKJ version of GE?
ALGORITHM : 6 Gaussian Elimination – IKJ Variant
1. For i = 2, . . . , n Do:
2. For k = 1, . . . , i− 1 Do:
3. aik := aik/akk
4. For j = k + 1, . . . , n Do:
5. aij := aij − aik ∗ akj
6. EndDo
7. EndDo
8. EndDo
[Figure: access pattern of the matrix in the IKJ variant – parts labeled: not accessed / accessed but not modified / accessed and modified.]
ILU(0) – zero-fill ILU
ALGORITHM : 7 ILU(0)
For i = 1, . . . , N Do:
For k = 1, . . . , i− 1 and if (i, k) ∈ NZ(A) Do:
Compute aik := aik/akk
For j = k + 1, . . . , N and if (i, j) ∈ NZ(A), Do:
Compute aij := aij − aikakj
EndFor
EndFor
EndFor
➤ When A is SPD, the ILU(0) factorization = Incomplete Choleski
factorization IC(0). Meijerink and Van der Vorst [1977].
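A dense-storage sketch of Algorithm 7 (illustrative only; a practical implementation works directly on a sparse data structure). For a tridiagonal matrix there is no fill at all, so ILU(0) coincides with the exact LU factorization, which gives an easy sanity check:

```python
import numpy as np

def ilu0(A):
    """ILU(0) via the IKJ variant (Algorithm 7), dense storage for clarity:
    every update is restricted to positions where A itself is nonzero."""
    n = A.shape[0]
    nz = A != 0                      # NZ(A)
    B = A.astype(float).copy()
    for i in range(1, n):
        for k in range(i):
            if not nz[i, k]:
                continue
            B[i, k] /= B[k, k]       # a_ik := a_ik / a_kk
            for j in range(k + 1, n):
                if nz[i, j]:         # drop fill-in outside NZ(A)
                    B[i, j] -= B[i, k] * B[k, j]
    L = np.tril(B, -1) + np.eye(n)   # unit lower triangular
    U = np.triu(B)
    return L, U

# Tridiagonal example: no fill, so ILU(0) = exact LU
n = 8
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
L, U = ilu0(A)
```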
Typical eigenvalue distribution
Pattern of ILU(0) for 5-point matrix
[Figure: sparsity patterns of A, of the factors L and U, and of the product LU.]
Stencils and ILU factorization
Stencils of A and the L and U parts of A:
[Figure: stencils of A, of L, and of U.]
Stencil of the product LU :
[Figure: stencil of LU, with the extra fill-in positions marked.]
Higher order ILU factorization
➤ Higher accuracy incomplete Choleski: for regularly structured
problems, IC(p) allows p additional diagonals in L.
➤ Can be generalized to irregular sparse matrices using the notion
of level of fill-in [Watts III, 1979]
• Initially: Lev(aij) = 0 if aij ≠ 0, and Lev(aij) = ∞ if aij = 0.
• At a given step i of Gaussian elimination:
Lev(akj) = min{ Lev(akj) , Lev(aki) + Lev(aij) + 1 }
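The level update can be demonstrated on a small example. The sketch below runs only the symbolic level computation, without the dropping an actual ILU(p) would apply for finite p; the loop indices are named as in the IKJ algorithm (the slide's formula is the same update with the roles of the indices written the other way around):

```python
import numpy as np

def fill_levels(A):
    """Symbolic phase: propagate levels of fill through an IKJ sweep using
    lev(a_ij) := min( lev(a_ij), lev(a_ik) + lev(a_kj) + 1 )."""
    n = A.shape[0]
    lev = np.where(A != 0, 0.0, np.inf)      # level 0 on NZ(A), inf elsewhere
    for i in range(1, n):                    # row being eliminated
        for k in range(i):                   # pivot row
            if np.isinf(lev[i, k]):
                continue
            for j in range(k + 1, n):
                lev[i, j] = min(lev[i, j], lev[i, k] + lev[k, j] + 1)
    return lev

# Arrowhead example: eliminating any row against row 0 creates level-1 fill
A = np.eye(5)
A[0, :] = 1.0
A[:, 0] = 1.0
lev = fill_levels(A)
```

ILU(p) would then zero out every entry whose level exceeds p.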
➤ ILU(p) strategy: drop anything with level of fill-in exceeding p.
* Increasing the level of fill-in usually results in a more accurate ILU and...
* ...typically in fewer steps and fewer arithmetic operations.
ILU(1)
[Figure: pattern of the augmented A, of the ILU(1) factors L1 and U1, and of the product L1U1.]
ALGORITHM : 8 ILU(p)
For i = 2, . . . , N Do
For each k = 1, . . . , i− 1 and if aik ≠ 0 Do
Compute aik := aik/akk
Compute ai,∗ := ai,∗ − aikak,∗
Update the levels of ai,∗
Replace any element in row i with lev(aij) > p by zero.
EndFor
EndFor
➤ The algorithm can be split into a symbolic and a numerical phase.
The level-of-fill is computed in the symbolic phase.
ILU with threshold – generic algorithms
ILU(p) factorizations are based on structure only, not on numerical
values ➤ potential problems for non-M-matrices.
➤ One remedy: ILU with threshold (generic name: ILUT).
Two broad approaches:
First approach [derived from direct solvers]: use any (direct) sparse
solver and incorporate a dropping strategy. [Munksgaard (?), Osterby
& Zlatev, Sameh & Zlatev [90], D. Young et al. (Boeing), etc.]
Second approach [derived from the ‘iterative solvers’ viewpoint]:
1. use a (row or column) version of the (i, k, j) version of GE;
2. apply a drop strategy to the element lik as it is computed;
3. perform the linear combinations to get ai∗, using a full row expansion
of ai∗;
4. apply a drop strategy to fill-ins.
ILU with threshold: ILUT(k, ε)
• Do the i, k, j version of Gaussian Elimination (GE).
• During each i-th step in GE, discard any pivot or fill-in whose
value is below ε‖rowi(A)‖.
• Once the i-th row of L + U (L-part + U-part) is computed, retain
only the k largest elements in both parts.
➤ Advantages: controlled fill-in. Smaller memory overhead.
➤ Easy to implement – much more so than preconditioners derived
from direct solvers.
➤ Can be made quite inexpensive.
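A dense-storage sketch of ILUT(k, ε) along these lines (illustrative only; a real implementation works row by row on sparse storage, and ties at the magnitude cutoff may keep a few extra entries here). Note that with ε = 0 and k = n nothing is dropped, so the exact LU factorization is recovered:

```python
import numpy as np

def ilut(A, k, eps):
    """ILUT(k, eps) sketch: IKJ elimination where entries below
    eps * ||row_i(A)||_2 are discarded, and only the k largest entries
    of the L-part and of the U-part of each row are kept."""
    n = A.shape[0]
    B = A.astype(float).copy()
    for i in range(1, n):
        thresh = eps * np.linalg.norm(A[i])
        for kk in range(i):
            if B[i, kk] == 0.0:
                continue
            piv = B[i, kk] / B[kk, kk]
            if abs(piv) < thresh:            # drop a small pivot
                B[i, kk] = 0.0
                continue
            B[i, kk] = piv
            B[i, kk + 1:] -= piv * B[kk, kk + 1:]
        # dropping: threshold, then keep the k largest in each part
        for part in (np.arange(0, i), np.arange(i + 1, n)):
            row = B[i, part]
            row[np.abs(row) < thresh] = 0.0
            if np.count_nonzero(row) > k:
                cut = np.sort(np.abs(row))[-k]
                row[np.abs(row) < cut] = 0.0
            B[i, part] = row
    L = np.tril(B, -1) + np.eye(n)
    U = np.triu(B)
    return L, U

# Sanity check: with eps = 0 and k = n, nothing is dropped -> exact LU
rng = np.random.default_rng(2)
n = 10
A = n * np.eye(n) + rng.random((n, n))
L, U = ilut(A, k=n, eps=0.0)
```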
Restarting methods for linear systems
Goal: use the information generated in the current GMRES
loop to improve convergence at the next GMRES restart.
References:
➤ R. A. Nicolaides (87): Deflated CG.
➤ R. Morgan (92): Deflated GMRES.
➤ S. Kharchenko & A. Yeremin (92): pole placement ideas.
➤ K. Burrage, J. Erhel, and B. Pohl (93): Deflated GMRES.
➤ E. de Sturler: use SVD information in GMRES.
➤ Can help improve convergence and prevent stagnation of GMRES
in some cases.
Generally speaking: one should not expect to solve very hard
problems with eigenvalue deflation preconditioning alone.
➤ Question: Can the same effects be achieved with block-Krylov
methods?
Using the Flexible GMRES framework
Method: Deflation can be achieved by ‘enriching’ the Krylov subspace
with approximate eigenvectors obtained from previous runs.
We can use Flexible GMRES and append these vectors at the end. [See
R. Morgan (92), Chapman & YS (95).]
➤ Vectors v1, . . . , vm−p = standard Arnoldi vectors.
➤ Vectors vm−p+1, . . . , vm = computed as in FGMRES, where the new
vectors zj are previously computed eigenvectors.
➤ Storage: we need to store v1, . . . , vm and zm−p+1, . . . , zm ➤ p
additional vectors, with typically p << m.
GMRES with deflation
1. Deflated Arnoldi process: r0 := b−Ax0, v1 := r0/(β := ‖r0‖2).
For j = 1, ...,m do
If j ≤ m− p then zj := vj Else zj := uj−(m−p) (eigenvector)
w = Azj
For i = 1, . . . , j, do
hi,j := (w, vi)
w := w − hi,jvi
hj+1,j = ‖w‖2, vj+1 = w/‖w‖2.
EndDo
Define Zm := [z1, ..., zm] and Hm = {hi,j}.
2. Form the approximate solution:
Compute xm = x0 + Zmym where ym = argminy‖βe1 − Hmy‖2.
3. Get next eigenvector estimates u1, . . . , up from Hm, Vm, Zm, ...
4. Restart: If satisfied stop, else set x0 ← xm and goto 1.
Question 1: Which eigenvectors to add?
➤ Answer: those associated with the smallest eigenvalues.
Question 2: How to compute eigenvectors from the Flexible GMRES
step? ➤ Answer: use the relation
AZm = Vm+1Hm
Approximate eigenpair: (λ, u) with u = Zmy.
Galerkin condition: r ⊥ AZm gives the generalized eigenvalue problem
Hm^H Hm y = λ Hm^H Vm+1^H Zm y
In addition, in GMRES: Hm = QmRm, so Hm^H Hm = Rm^H Rm.
See: Morgan (1993).
An example: Shell problems
➤ Can be very hard to solve!
➤ A matrix of size N = 38,002, with Nz = 949,452 nonzero elements.
➤ Actually symmetric. Not exploited in the test.
➤ Most simplistic methods fail.
➤ ILUT(50,0) does not work even with GMRES(80).
➤ This is an example where a large subspace is required.
[Figure: convergence histories (log10 of residual vs. number of GMRES steps) for the shell problem, N = 38,002, Nz = 949,452, m = 80, with 0, 5, 10, 20, and 30 added eigenvectors.]
An example
A matrix arising from Euler’s equations on an unstructured mesh.
[Contributed by Larry Wigton from Boeing]
Size = 3,864 (966 mesh points).
Nonzero elements: 238,252 (about 62 per row).
➤ Difficult to solve in spite of its small size.
➤ Results with ILUT(lfil, ε), tol = 10−8:

lfil | Iterations | estimate of ‖(LU)−1‖
100  |     *      | 0.19E+56
110  |     *      | 0.34E+9
120  |    30      | 0.70E+5
130  |    25      | 0.33E+7
140  |    20      | 0.17E+4
150  |    19      | 0.69E+4
Results with Block Jacobi preconditioning with eigenvalue deflation:
reduction in residual norm in 1200 GMRES steps with m = 49.

       | 4x4 block | 16x16 block
p = 0  |  0.8 E 0  |  0.8 E 0
p = 4  |  0.8 E 0  |  4.0 E-5
p = 8  |  1.2 E-2  |  2.9 E-7
p = 12 |  1.9 E-2  |  3.8 E-6
Theory – (Hermitian case only)
Assume that A is SPD and let K = Km +W , where W is such that
dist(AW, U) = ε
with U = the exact invariant subspace associated with λ1, .., λs. Then
the residual r obtained from the minimal residual projection
process onto the augmented Krylov subspace K satisfies the
inequality
‖r‖2 ≤ ‖r0‖2 sqrt( 1/Tm(γ)^2 + ε^2 )
where γ ≡ (λn + λs+1)/(λn − λs+1) and Tm ≡ the Chebyshev polynomial of degree m of the 1st kind.
➤ See [YS, SIMAX vol. 4, pp 43-66 (1997)] for other results.
SPECIAL FORMS OF ILUS
Crout-based ILUT (ILUTC)
Terminology: Crout versions of LU compute the k-th row of U and
the k-th column of L at the k-th step.
[Figure: computational pattern at step k – black: part computed at step k; blue: part accessed.]
Main advantages:
1. Less expensive than ILUT (avoids sorting)
2. Allows better techniques for dropping
References:
[1] M. Jones and P. Plassman. An improved incomplete Choleski
factorization. ACM Transactions on Mathematical Software, 21:5–17, 1995.
[2] S. C. Eisenstat, M. H. Schultz, and A. H. Sherman. Algorithms
and data structures for sparse symmetric Gaussian elimination.
SIAM Journal on Scientific Computing, 2:225–237, 1981.
[3] M. Bollhofer. A robust ILU with pivoting based on monitoring the
growth of the inverse factors. Linear Algebra and its Applications,
338(1–3):201–218, 2001.
[4] N. Li, Y. Saad, and E. Chow. Crout versions of ILU. MSI technical
report, 2002.
Crout LU (dense case)
➤ Go back to the delayed-update (IKJ) algorithm and observe: we
could do both a column and a row version.
➤ Left: U computed by rows. Right: L computed by columns.
Note: the entries 1 : k − 1 in the k-th row in the left figure need not be
computed; they are available from the already computed columns of L.
Similar observation for L (right).
ALGORITHM : 9 Crout LU Factorization (dense case)
1. For k = 1 : n Do :
2. For i = 1 : k − 1 and if aki ≠ 0 Do :
3. ak,k:n = ak,k:n − aki ∗ ai,k:n
4. EndDo
5. For i = 1 : k − 1 and if aik ≠ 0 Do :
6. ak+1:n,k = ak+1:n,k − aik ∗ ak+1:n,i
7. EndDo
8. aik = aik/akk for i = k + 1, ..., n
9. EndDo
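A direct NumPy transcription of Algorithm 9 (dense and illustrative; the function name and test matrix are not from the slides):

```python
import numpy as np

def crout_lu(A):
    """Dense Crout LU (Algorithm 9): at step k, update and finalize the
    k-th row of U and the k-th column of L using only already-computed
    columns of L and rows of U."""
    n = A.shape[0]
    B = A.astype(float).copy()
    for k in range(n):
        for i in range(k):                    # lines 2-4: row k of U
            if B[k, i] != 0.0:
                B[k, k:] -= B[k, i] * B[i, k:]
        for i in range(k):                    # lines 5-7: column k of L
            if B[i, k] != 0.0:
                B[k + 1:, k] -= B[i, k] * B[k + 1:, i]
        B[k + 1:, k] /= B[k, k]               # line 8: scale the L column
    L = np.tril(B, -1) + np.eye(n)
    U = np.triu(B)
    return L, U

# Illustrative check on a small diagonally dominant matrix
rng = np.random.default_rng(3)
A = 6 * np.eye(6) + rng.random((6, 6))
L, U = crout_lu(A)
```

At step k only columns 1..k−1 of L and rows 1..k−1 of U are read, which is what makes the dropping in ILUC (Algorithm 10) convenient.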
ALGORITHM : 10 ILUC - Crout version of ILU
1. For k = 1 : n Do:
2.   Initialize row z: z1:k−1 = 0, zk:n = ak,k:n
3.   For i | 1 ≤ i ≤ k − 1 and lki ≠ 0 Do:
4.     zk:n = zk:n − lki ∗ ui,k:n
5.   EndDo
6.   Initialize column w: w1:k = 0, wk+1:n = ak+1:n,k
7.   For i | 1 ≤ i ≤ k − 1 and uik ≠ 0 Do:
8.     wk+1:n = wk+1:n − uik ∗ lk+1:n,i
9.   EndDo
10.  Apply a dropping rule to row z
11.  Apply a dropping rule to column w
12.  uk,: = z; l:,k = w/ukk; lkk = 1
13. EndDo
II Notice that the updates to the k-th row of U (resp. the k-th column of L) can be made in any order.
II Operations in Lines 4 and 8 are sparse vector updates (must be done in sparse mode).
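A dense-storage sketch of ILUC (Algorithm 10) may clarify the flow; the simple relative-threshold dropping rule and all names are illustrative assumptions, and a real implementation keeps z, w, L, and U in sparse mode:

```python
import numpy as np

def iluc(A, droptol=1e-2):
    """ILUC sketch: at step k build row z of U and column w of L,
    then apply a drop rule. With droptol=0 this is exact Crout LU."""
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(n):
        z = A[k, k:].astype(float).copy()        # line 2: init row z
        for i in range(k):                       # lines 3-5
            if L[k, i] != 0.0:
                z -= L[k, i] * U[i, k:]
        w = A[k+1:, k].astype(float).copy()      # line 6: init column w
        for i in range(k):                       # lines 7-9
            if U[i, k] != 0.0:
                w -= U[i, k] * L[k+1:, i]
        # lines 10-11: drop small entries (never drop the diagonal u_kk)
        z[1:][np.abs(z[1:]) < droptol * np.linalg.norm(z)] = 0.0
        w[np.abs(w) < droptol * np.linalg.norm(w)] = 0.0
        U[k, k:] = z                             # line 12
        L[k+1:, k] = w / z[0]                    # z[0] = u_kk
    return L, U
```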
Comparison with standard techniques

[Plot: RAEFSKY3 (n = 21,200; nnz = 1,488,768)]
Precondition time vs. Lfil for ILUC (solid), row-ILUT (circles), column-ILUT (triangles), and row-ILUT with Binary Search Trees (stars).
ILUS – ILU for Sparse Skyline format
II Often in CFD codes the matrices are generated in a format consisting of a sparse row representation of the decomposition

    A = D + L1 + L2^T

where D is the diagonal of A and L1, L2 are strict lower triangular matrices.
[Figure: sparse row (→) and sparse column (←) storage]
II Can develop ILU versions based on this data structure.
II Advantages: (1) Savings when A has a symmetric structure. (2) Graceful degradation to an incomplete Cholesky factorization when A is symmetric (or nearly symmetric). (3) A little more convenient than ILUT for handling 'instability' of the factorization.
Let

    A_{k+1} = [ A_k   v_k     ]
              [ w_k   α_{k+1} ]

If A_k = L_k D_k U_k then

    A_{k+1} = [ L_k  0 ] [ D_k     0   ] [ U_k  z_k ]
              [ y_k  1 ] [ 0   d_{k+1} ] [ 0     1  ]

with

    z_k = D_k^{-1} L_k^{-1} v_k,   y_k = w_k U_k^{-1} D_k^{-1},   d_{k+1} = α_{k+1} − y_k D_k z_k.
II To get the next column z_k we need to solve a system with sparse L_k and sparse RHS v_k. Similarly for y_k.
II How can we solve such systems approximately and inexpensively?
Note: sparse RHS and sparse L.
II Simplest possibility: truncated Neumann series,

    z_k = D_k^{-1} L_k^{-1} v_k ≈ D_k^{-1} (I + E_k + E_k^2 + ... + E_k^p) v_k,

where L_k = I − E_k. The vector z_k gets denser as the 'level-of-fill' p increases.
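The triangular part of the series can be sketched as follows (names illustrative): for a unit lower triangular L = I − E the matrix E is nilpotent, so the series becomes exact once p ≥ n − 1, while a small p yields a cheap, sparse approximation:

```python
import numpy as np

def neumann_solve(Lmat, v, p=3):
    """Approximate L^{-1} v by the truncated Neumann series
    (I - E)^{-1} v ≈ (I + E + ... + E^p) v, where E = I - L."""
    E = np.eye(Lmat.shape[0]) - Lmat
    z = v.copy()
    term = v.copy()
    for _ in range(p):
        term = E @ term    # next power of E applied to v
        z += term
    return z
```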
II We also use sparse-sparse mode GMRES.
II The idea of sparse-sparse mode computations is quite useful in developing preconditioners.
Approximate Inverse preconditioners
Motivation:
• L and U solves in ILU may be 'unstable'
• Parallelism in L and U solves is limited
Idea: Approximate the inverse of A directly M ≈ A−1
II Right preconditioning: Find M such that
AM ≈ I
II Left preconditioning: Find M such that
MA ≈ I
II Factored approximate inverse: Find L and U s.t.
LAU ≈ D
Some references
• Benson and Frederickson ('82): approximate inverse using stencils
• Grote and Simon ('93): Choose M to be banded
• Cosgrove, Díaz and Griewank ('91): Procedure to add fill-ins to M
• Kolotilina and Yeremin ('93): Factorized symmetric preconditionings M = G_L^T G_L
• Huckle and Grote ('95): Procedure to find a good pattern for M
• Chow and YS ('95): Find pattern dynamically by using dropping
• M. Benzi & Tuma ('96, '97, ...): Factored approximate inverse
One (of many) options:
try to find M to approximately minimize ‖I − AM‖_F
Note:

    min ‖I − AM‖_F^2 = min Σ_{j=1}^n ‖e_j − A M e_j‖_2^2 = Σ_{j=1}^n min ‖e_j − A m_j‖_2^2
II Problem decouples into n independent least-squares systems
II In each of these systems the matrix and RHS are sparse
Two paths:
1. Find a good sparsity pattern for M first, then compute M using this pattern.
2. Find the pattern dynamically [similar to ILUT].
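Path 1 can be sketched as follows: given a prescribed pattern for each column, every m_j is an independent small least-squares problem over the columns of A selected by that pattern (the function name and dense storage are illustrative assumptions):

```python
import numpy as np

def spai_fixed_pattern(A, pattern):
    """For each column j, minimize ||e_j - A m_j||_2 over vectors m_j
    supported on the index set pattern[j]; only the corresponding
    columns of A enter each small least-squares problem."""
    n = A.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        J = pattern[j]
        ej = np.zeros(n)
        ej[j] = 1.0
        mj, *_ = np.linalg.lstsq(A[:, J], ej, rcond=None)
        M[J, j] = mj
    return M
```

With the full pattern this recovers A^{-1}; a sparse pattern trades accuracy for cheap storage and an embarrassingly parallel construction.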
Approximate inverse with drop-tolerance [Chow & YS, 1994]
Find

    min ‖e_j − A m_j‖_2^2,   1 ≤ j ≤ n,

by solving approximately

    A m_j = e_j,   1 ≤ j ≤ n,

with a few steps of GMRES, starting from a sparse m_j.
• Iterative method works in sparse mode: Krylov vectors are sparse.
• Use sparse-vector by sparse-vector and sparse-matrix by sparse-
vector operations.
• Dropping strategy is applied on mj.
• Exploit ‘self-preconditioning’.
Sparse-Krylov MINRES and GMRES
• Dual threshold dropping strategy: drop tolerance and maximum
number of nonzeros per column
• In MINRES, dropping performed on solution after each inner iter-
ation
• In GMRES, dropping performed on Krylov basis at each iteration
• Use sparse-vector by sparse-vector operations
Self-preconditioning

The system

    A m_j = e_j

may be preconditioned with the current M. This is even more effective if the columns are computed in sequence.
• Actually use FGMRES
• Leads to inner and outer iteration approach
• Quadratic convergence if no dropping is done
• Effect of reordering?
A few remarks
II There is no guarantee that M is nonsingular, unless the accuracy
is high enough.
II There are many cases in which APINV preconditioners work while ILU or ILUTP [with reasonable fill] won't.
II The best use of APINV preconditioners may be in combining them with other techniques. For example, minimize

    ‖B − AM‖_F

where B is some other preconditioner (e.g. block-diagonal).
II Resulting preconditioner for A: M B^{-1}.
Approximate inverses for block-partitioned matrices
Motivation. Domain Decomposition:

    [ B_1               F_1 ]
    [      B_2          F_2 ]
    [           ...     ... ]   ≡   [ B  F ]
    [               B_n F_n ]       [ E  C ]
    [ E_1  E_2  ... E_n  C  ]

Note the factorization:

    [ B  F ]   [ B  0 ] [ I  B^{-1}F ]
    [ E  C ] = [ E  S ] [ 0     I    ]

in which S is the Schur complement,

    S = C − E B^{-1} F.
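The Schur complement and the block factorization above can be checked numerically; this dense sketch is illustrative (in practice B^{-1}F is only approximated so that S stays sparse):

```python
import numpy as np

def schur_complement(B, F, E, C):
    """S = C - E B^{-1} F for the block matrix [[B, F], [E, C]]."""
    return C - E @ np.linalg.solve(B, F)
```

Multiplying out [B 0; E S] and [I B^{-1}F; 0 I] recovers the original block matrix.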
One idea: Compute M = LU in which

    L = [ B   0  ]   and   U = [ I  B^{-1}F ]
        [ E  M_S ]             [ 0     I    ]

II M_S = some preconditioner for S.
One option: M_S = S̃, a sparse approximation to S:

    S̃ = C − E Y   where   Y ≈ B^{-1} F

II Need to find a sparse matrix Y such that

    B Y ≈ F

where F and B are sparse.
Preconditioning the Normal Equations
II Why not solve

    A^T A x = A^T b   or   A A^T y = b ?

II Advantage: symmetric positive definite systems.
II Disadvantages:
• Worse conditioning
• Not easy to precondition
II Generally speaking, the disadvantages outweigh the advantages.
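The "worse conditioning" can be made concrete: for the 2-norm, κ(A^T A) = κ(A)^2, which this small diagonal example (illustrative) exhibits:

```python
import numpy as np

# The spectral condition number of the normal-equations matrix is the
# square of that of A: kappa_2(A^T A) = kappa_2(A)^2.
A = np.diag([1.0, 10.0, 100.0])
kappa_A = np.linalg.cond(A)             # 100 for this example
kappa_normal = np.linalg.cond(A.T @ A)  # 100^2 = 10^4
```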
Incomplete Cholesky and SSOR for Normal Equations
II First observation: IC(0) does not necessarily exist for SPD matrices.
II Can shift the matrix: perform IC(0) on A A^T + αI, for example. Hard to find good values of α for general matrices.
II Can modify the dropping strategy: exploit the relation between IC(0) and Incomplete Modified Gram–Schmidt on A → ICMGS [Wang & Gallivan, 1993].
II Can also get L from the Incomplete LQ factorization [Saad, 1989]. Advantage: arbitrary accuracy. Disadvantage: need the Q factor.
II We never need to form the matrix B = A^T A or B = A A^T in the implementation.
II Alternative: use SSOR [equivalent to the Kaczmarz algorithm]. No difficulties with shifts [take ω = 1], trivial to implement, no additional storage required.
ILUM AND ARMS
Independent set orderings & ILUM (Background)
Independent set orderings permute a matrix into the form

    [ B  F ]
    [ E  C ]

where B is a diagonal matrix.
II Unknowns associated with the B block form an independent set
(IS).
II IS is maximal if it cannot be augmented by other nodes to form
another IS.
II IS ordering can be viewed as a "simplification" of multicoloring
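A greedy sweep is a standard way to obtain a maximal IS; this sketch (adjacency-list input is an assumption) accepts a node whenever none of its neighbors has already been accepted:

```python
def greedy_independent_set(adj):
    """Greedy maximal independent set: adj maps each node to its
    neighbors. The result cannot be augmented by any remaining node,
    hence it is maximal (though not necessarily of maximum size)."""
    chosen = set()
    for v in adj:
        if all(u not in chosen for u in adj[v]):
            chosen.add(v)
    return chosen
```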
Main observation: The reduced system obtained by eliminating the unknowns associated with the IS is still sparse, since its coefficient matrix is the Schur complement

    S = C − E B^{-1} F

II Idea: apply the IS reduction recursively.
II When the reduced system is small enough, solve it by any method.
II Can devise an ILU factorization based on this strategy.
II See work by [Botta-Wubs '96, '97, YS '94, '96 (ILUM), Leuze '89, ...]
ALGORITHM : 11 ILUM
For lev = 1, nlev Do
a. Get an independent set for A.
b. Form the reduced system associated with this set;
c. Apply a dropping strategy to this system;
d. Set A := current reduced matrix and go back to (a).
EndDo
Group Independent Sets / Aggregates
II Generalizes (common) Independent Sets
Main goal: to improve robustness
Main idea: use independent sets of "cliques", or "aggregates". There is no coupling between the aggregates.
[Figure: aggregates with no coupling between them]
II Reorder equations so that nodes of the independent sets come first.
Algebraic Recursive Multilevel Solver (ARMS)
Original matrix A, and reordered matrix A_0 = P_0^T A P_0.
[Spy plots of A and of the reordered A_0 (nz = 3155 in both)]
II Block ILU factorization (diagonal blocks treated as sparse):

    P_l^T A_l P_l = [ B_l  F_l ]  ≈  [ L_l           0 ] × [ I      0   ] × [ U_l  L_l^{-1} F_l ]
                    [ E_l  C_l ]     [ E_l U_l^{-1}  I ]   [ 0  A_{l+1} ]   [ 0         I       ]
Problem: Fill-in
[Spy plot of the reduced system with fill-in (nz = 12205)]
Remedy: dropping strategy
[Spy plot after dropping (nz = 4255)]
II Next step: treat the Schur complement recursively
Algebraic Recursive Multilevel Solver (ARMS)
    [ B  F ] [ y ]   [ f ]
    [ E  C ] [ z ] = [ g ]
    →
    [ L         0 ] [ U  L^{-1}F ] [ y ]   [ f ]
    [ E U^{-1}  I ] [ 0     S    ] [ z ] = [ g ]

where S = C − E B^{-1} F = Schur complement.
II Idea: perform above block factorization recursively on S
II Blocks in B treated as sparse. Can be as large/small as desired.
II Algorithm is fully recursive.
II Incorporates so-called W-cycles.
II A stability criterion is added to the block independent sets algorithm.
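The recursion can be sketched as a toy dense solver (names and the halving split are illustrative; ARMS additionally drops fill at every level, so it is approximate where this sketch is exact):

```python
import numpy as np

def multilevel_solve(A, b, min_size=2):
    """Eliminate the leading block, recurse on the Schur complement
    S = C - E B^{-1} F, then back-substitute:
    S z = g - E B^{-1} f,  y = B^{-1}(f - F z)."""
    n = A.shape[0]
    if n <= min_size:
        return np.linalg.solve(A, b)
    m = n // 2
    B, F = A[:m, :m], A[:m, m:]
    E, C = A[m:, :m], A[m:, m:]
    f, g = b[:m], b[m:]
    BinvF = np.linalg.solve(B, F)
    z = multilevel_solve(C - E @ BinvF, g - E @ np.linalg.solve(B, f), min_size)
    y = np.linalg.solve(B, f) - BinvF @ z
    return np.concatenate([y, z])
```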
Factorization:

    P_l^T A_l P_l = [ B_l  F_l ]  ≈  [ L_l           0 ] × [ I      0   ] × [ U_l  L_l^{-1} F_l ]
                    [ E_l  C_l ]     [ E_l U_l^{-1}  I ]   [ 0  A_{l+1} ]   [ 0         I       ]
II L-solve ∼ restriction operation. U-solve ∼ prolongation.
II Solve the last-level system with, e.g., ILUT+GMRES.
ALGORITHM : 12 ARMS(A_lev) factorization
1. If lev = last_lev then
2.   Compute A_lev ≈ L_lev U_lev
3. Else:
4.   Find an independent set permutation P_lev
5.   Apply permutation A_lev := P_lev^T A_lev P_lev
6.   Compute factorization
7.   Call ARMS(A_lev+1)
8. EndIf
Inner-Outer inter-level iterations
Idea: Use an iteration at level l to reduce the residual norm by a tolerance τ.
[Figure: forward sweep down to the reduced system (solved by any means) and backward sweep back up to the original system]
II Many possible variants.
Three options for inner-outer inter-level iterations
(VARMS) Descend, using the level-structure. At last level use GMRES-
ILUT. Ascend back to the current level.
(WARMS) Use a few steps of GMRES+VARMS to solve the reduced
system. At last level: use GMRES-ILUT.
(WARMS*) Use a few steps of FGMRES to solve the reduced system. Preconditioner: WARMS* (recursive). Last level: use ILUT-GMRES.
II WARMS* can be expensive! Use it with a small number of levels.
II Iterating allows the use of less costly factorizations [memory].
Storage:
II At each level, except the last, store: L_i, U_i, F_i, E_i.
[Figure: nested storage of L_0, U_0, F_0, E_0; L_1, U_1, F_1, E_1; and the last-level matrix A_2]
II For WARMS: need to multiply by the intermediate A_i's.
II A_{l+1} × w is computed as (C_l − E_l U_l^{-1} L_l^{-1} F_l) × w.
II Need to store the above 4 matrices + C_i.
Group Independent Set reordering
[Figure: first block and separator]
Simple strategy used: Do a Cuthill–McKee ordering until there are enough points to make a block. Reverse the ordering. Start a new block from a non-visited node. Continue until all points are visited. Add a criterion for rejecting "not sufficiently diagonally dominant rows."
Original matrix
[Matrix plot; entry magnitudes ranging from 0.10E-06 to 0.19E+07]
Block size of 6
[Matrix plot; entry magnitudes ranging from 0.10E-06 to 0.19E+07]
Block size of 20
[Matrix plot; entry magnitudes ranging from 0.10E-06 to 0.19E+07]
PARALLEL PRECONDITIONERS
Introduction
II In recent years: big thrust of parallel computing techniques into application areas. Problems are becoming larger and harder.
II In general: very large machines (e.g. Cray T3E) are gone. Exception: big US government labs.
II Replaced by 'medium'-size machines (e.g. IBM SP2, SGI Origin).
II Programming model: message passing seems to be king (MPI).
II OpenMP and threads for small numbers of processors.
II Important new reality: parallel programming has penetrated the
‘applications’ areas [Sciences and Engineering + industry]
Parallel preconditioners: A few approaches
“Parallel matrix computation” viewpoint:
• Local preconditioners: Polynomial (in the 80s), Sparse Approximate Inverses [M. Benzi, Tuma et al. '99, E. Chow '00]
• Distributed versions of ILU [Ma & YS '94, Hysom & Pothen '00]
• Use of multicoloring to unravel parallelism
Domain Decomposition ideas:
• Schwarz-type Preconditioners [e.g. Widlund, Bramble-Pasciak-
Xu, X. Cai, D. Keyes, Smith, ...]
• Schur-complement techniques [Gropp & Smith, Farhat et al. (FETI), T.F. Chan et al., YS and Sosonkina '97, J. Zhang '00, ...]
Multigrid / AMG viewpoint:
• Multi-level Multigrid-like preconditioners [e.g., Shadid-Tuminaro et
al (Aztec project), ...]
II In practice: variants of additive Schwarz are very common (simplicity).
Intrinsically parallel preconditioners
Some alternatives
(1) Polynomial preconditioners;
(2) Approximate inverse preconditioners;
(3) Multi-coloring + independent set ordering;
(4) Domain decomposition approach.
POLYNOMIAL PRECONDITIONING
Principle: M^{-1} = s(A) where s is a (low) degree polynomial:

    s(A) A x = s(A) b

Problem: how to obtain s? Note: s(A) ≈ A^{-1}.
II Several approaches:
* Chebyshev polynomials
* Least-squares polynomials
* Others
II Polynomial preconditioners are seldom used in practice.
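For illustration only (since, as noted, these are rarely used in practice), the simplest choice of s is the truncated Neumann series s(A) = I + (I − A) + ... + (I − A)^p, valid when the spectrum of A is clustered near 1; Chebyshev or least-squares polynomials give better s for a given degree, but the application loop looks the same:

```python
import numpy as np

def poly_precond(A, v, p=5):
    """Apply s(A) v with the Neumann polynomial
    s(A) = I + (I-A) + ... + (I-A)^p ≈ A^{-1}."""
    z = v.copy()
    term = v.copy()
    for _ in range(p):
        term = term - A @ term   # term <- (I - A) term
        z += term
    return z
```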
Domain Decomposition
Problem:
∆u = f in Ω
u = uΓ on Γ = ∂Ω.
Domain:

    Ω = ∪_{i=1}^{s} Ω_i

[Figure: domain partitioned into Ω_1, Ω_2, Ω_3, with interfaces Γ_12 and Γ_13]
II Domain decomposition (or substructuring) methods attempt to solve a PDE problem on the entire domain from problem solutions on the subdomains Ω_i.
[Figure: finite difference grid with nodes numbered 1–40 over the partitioned domain]
Discretization of domain
Coefficient Matrix
Types of mappings

[Figure: a 12-node grid shown with three partitionings into Ω_1 and Ω_2]
(a) Vertex-based, (b) edge-based, and (c) element-based partitioning
DISTRIBUTED SPARSE MATRICES
Generalization: Distributed Sparse Systems
II Simple illustration: block assignment. Assign equation i and unknown i to a given processor.
[Figure: matrix rows assigned to processors in contiguous blocks]
Partitioning a sparse matrix
II Use a graph partitioner to partition the adjacency graph:
[Figure: 12-node grid partitioned among processors P1–P4]
II Can allow overlap.
II Partition can be very general.
Given: a general mapping of pairs equation/unknown to processors, i.e., a set of p subsets of variables.
II Subsets can be arbitrary and may 'overlap'. The mapping can be obtained from graph partitioners.
Problem: build the local data structures needed for the iteration phase.
II Several graph partitioners available: Metis, Chaco, Scotch, ...
Distributed Sparse matrices (continued)
II Once a good partitioning is found, questions are:
1. How to represent this partitioning?
2. What is a good data structure for representing dis-
tributed sparse matrices?
3. How to set up the various “local objects” (matrices,
vectors, ..)
4. What can be done to prepare for communication that will
be required during execution?
Two views of a distributed sparse matrix
[Figure: subdomain view with internal nodes, local interface nodes, and external interface nodes; local matrix A_i acts on local vector x_i]
II Local interface variables always ordered last.
Local view of distributed matrix:

[Figure: local data (A_i, x_i) plus external data from neighboring subdomains]

The local matrix:

[Figure: local points split into internal points and interface points; rows of the local matrix split into A_loc and B_ext]
Distributed Sparse Matrix-Vector Product Kernel
Algorithm:
1. Communicate: exchange boundary data.
   Scatter x_bound to neighbors; gather x_ext from neighbors.
2. Local matrix-vector product:
   y = A_loc x_loc
3. External matrix-vector product:
   y = y + B_ext x_ext
NOTE: 1 and 2 are independent and can be overlapped.
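Stripped of communication, the kernel is just two products; this sketch (names illustrative) checks that the local/external split reproduces the global product:

```python
import numpy as np

def distributed_matvec(Aloc, Bext, xloc, xext):
    """One processor's share of y = A x: Aloc acts on locally owned
    entries, Bext on external interface entries received from
    neighbors. In MPI code, receiving xext overlaps with step 2."""
    y = Aloc @ xloc      # step 2: local product
    y += Bext @ xext     # step 3: external contribution
    return y
```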
Distributed Sparse Matrix-Vector Product
Main part of the code:
      call MSG_bdx_send(nloc,x,y,nproc,proc,ix,ipr,ptrn,ierr)
c
c     do local matrix-vector product for local points
c
      call amux(nloc,x,y,aloc,jaloc,ialoc)
c
c     receive the boundary information
c
      call MSG_bdx_receive(nloc,x,y,nproc,proc,ix,ipr,ptrn,ierr)
c
c     do local matrix-vector product for external points
c
      nrow = nloc - nbnd + 1
      call amux1(nrow,x,y(nbnd),aloc,jaloc,ialoc(nloc+1))
c
      return
The local exchange information
► List of adjacent processors (or subdomains).
► For each of these processors, lists of boundary nodes to be sent to / received from adjacent PEs.
► The receiving processor must have a matrix ordered consistently with the order in which data is received.

Requirements

► The 'set-up' routines should handle overlapping.
► Should use minimal storage (only arrays of size nloc allowed).
Main Operations in (F) GMRES :
1. SAXPYs – local operation – no communication
2. Dot products – global operation
3. Matrix-vector products – local operation – local communication
4. Preconditioning operations – locality varies.
Distributed Dot Product
    /*-------------------- call blas1 function */
    tloc = DDOT(n, x, incx, y, incy);
    /*-------------------- call global reduction */
    MPI_Allreduce(&tloc, &ro, 1, MPI_DOUBLE, MPI_SUM, comm);
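The pattern is a local dot product followed by a global sum; the sketch below (illustrative Python, no real MPI) simulates the all-reduce sequentially over two "processors":

```python
# Local BLAS-1 dot product followed by a simulated MPI_Allreduce(SUM):
# every "processor" ends up holding the same global dot product.

def local_dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def allreduce_sum(local_values):
    # stand-in for MPI_Allreduce with MPI_SUM over all processes
    total = sum(local_values)
    return [total] * len(local_values)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 0.5, 1.0, 1.0]
# split the vectors across two "processors"
tloc = [local_dot(x[:2], y[:2]), local_dot(x[2:], y[2:])]
ro = allreduce_sum(tloc)    # ro = [10.0, 10.0] on both processors
```

This is why dot products are the one truly global operation in the iteration: every processor must participate in the reduction before the next step can proceed.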
PARALLEL PRECONDITIONERS
Three approaches:
• Schwarz Preconditioners
• Schur-complement based Preconditioners
• Multi-level ILU-type Preconditioners
► Observation: often, in practical applications, only Schwarz preconditioners are used.
Domain-Decomposition-Type Preconditioners
Local view of distributed matrix:
local Data
External data External data
OO AiiX Xi
Block Jacobi Iteration (Additive Schwarz):
1. Obtain external data y_i
2. Compute (update) local residual r_i = (b − Ax)_i = b_i − A_i x_i − B_i y_i
3. Solve A_i δ_i = r_i
4. Update solution x_i = x_i + δ_i
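A minimal sequential sketch of this block-Jacobi sweep (illustrative Python: two 2×2 subdomains of a small dense system, exact local solves, all updates applied simultaneously):

```python
# Additive-Schwarz (block-Jacobi) sweeps on a 4x4 system split into two
# 2x2 subdomains; illustrative only, with dense blocks.

def solve2(M, r):
    # direct solve of a 2x2 system M d = r (Cramer's rule)
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [( M[1][1]*r[0] - M[0][1]*r[1]) / det,
            (-M[1][0]*r[0] + M[0][0]*r[1]) / det]

A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 4.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 4.0]]
b = [5.0, 6.0, 6.0, 5.0]
x = [0.0] * 4
blocks = [(0, 2), (2, 4)]          # index ranges of the two subdomains

for sweep in range(50):
    deltas = []
    for lo, hi in blocks:
        # local residual r_i = (b - A x)_i, using the old x everywhere
        r = [b[i] - sum(A[i][j]*x[j] for j in range(4)) for i in range(lo, hi)]
        Ai = [row[lo:hi] for row in A[lo:hi]]
        deltas.append((lo, solve2(Ai, r)))   # solve A_i delta_i = r_i
    for lo, d in deltas:                     # x_i = x_i + delta_i, all at once
        for k, dk in enumerate(d):
            x[lo + k] += dk

# converges to the solution of A x = b, here x = [1, 1, 1, 1]
```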
► Multiplicative Schwarz. Needs a coloring of the subdomains.

[Figure: subdomains colored 1–4 so that adjacent subdomains receive different colors.]
Multicolor Block SOR Iteration (Multiplicative Schwarz):
1. Do col = 1, . . . , numcols
2.   If (col .eq. mycol) Then
3.     Obtain external data y_i
4.     Update local residual r_i = (b − Ax)_i
5.     Solve A_i δ_i = r_i
6.     Update solution x_i = x_i + δ_i
7.   EndIf
8. EndDo
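The multiplicative variant differs from block Jacobi only in that each local solve sees the freshest iterate; a sketch under the same toy setup as before (illustrative, one subdomain per color):

```python
# Multiplicative-Schwarz (block Gauss-Seidel) sweeps: subdomains are
# processed in color order and each local solve uses the updated x.

def solve2(M, r):
    # direct solve of a 2x2 system M d = r (Cramer's rule)
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [( M[1][1]*r[0] - M[0][1]*r[1]) / det,
            (-M[1][0]*r[0] + M[0][0]*r[1]) / det]

A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 4.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 4.0]]
b = [5.0, 6.0, 6.0, 5.0]
x = [0.0] * 4
colors = [(0, 2), (2, 4)]     # each "color" owns one 2x2 subdomain here

for sweep in range(30):
    for lo, hi in colors:     # the sequential color loop of the algorithm
        r = [b[i] - sum(A[i][j]*x[j] for j in range(4)) for i in range(lo, hi)]
        Ai = [row[lo:hi] for row in A[lo:hi]]
        d = solve2(Ai, r)
        for k, dk in enumerate(d):
            x[lo + k] += dk   # update immediately, before the next color

# converges (faster than the additive variant) to x = [1, 1, 1, 1]
```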
Breaking the sequential color loop
► The color loop is sequential, but it can be broken in several different ways.

(1) Have a few subdomains per processor
[Figure: each processor holds several subdomains with colors 1–4, so every processor has work to do at each step of the color loop.]
(2) Separate interior nodes from interface nodes (2-level blocking)
[Figure: interior nodes form color 1; the interface blocks alternate between colors 2 and 3.]
(3) Use a block-GMRES algorithm with block size = number of colors. The SOR step targets a different color on each column of the block ► no idle time.
Local Solves
► Each local system A_i δ_i = r_i can be solved in three ways:

1. By a (sparse) direct solver
2. Using a standard preconditioned Krylov solver
3. Doing a backward-forward solution associated with an accurate ILU (e.g., ILUT) preconditioner

► We only use (2), with a small number of inner steps (up to 10), or (3).
Performance comparison for different machines

[Figure: wall-clock time (seconds) vs. number of processors for overlapped Schwarz with an ILU solver on matrix VENKAT01, comparing IBM-COW (FC), SP2, SGI-COW (HiPPI), and Cray T3E.]
[Figure: time (seconds) vs. number of processors for Jacobi overlap with an LU solver on matrix VENKAT01 on the CRAY T3E, with separate curves for the total, precon, matvec, and fgmres times.]
[Figure: time (seconds) vs. number of processors for multicolor SOR preconditioning on matrix VENKAT01, comparing multicolor SOR, multicolor SOR with ILU solver, multicolor segregated SOR with ILU solver, and multicolor segregated SOR with ILU solver and GMRES.]
SCHUR COMPLEMENT-BASED PRECONDITIONERS
Schur complement system
Local system can be written as

    A_i x_i + X_i y_{i,ext} = b_i.   (1)

[Figure: the local rows of the distributed matrix, with the local block Ai and the external-coupling blocks Xi.]

x_i = vector of local unknowns, y_{i,ext} = external interface variables, and b_i = local part of RHS.
► Local equations

    [ B_i  F_i ] [ u_i ]   [           0          ]   [ f_i ]
    [ E_i  C_i ] [ y_i ] + [ Σ_{j∈N_i} E_{ij} y_j ] = [ g_i ].   (2)
► Eliminate u_i from the above system:

    S_i y_i + Σ_{j∈N_i} E_{ij} y_j = g_i − E_i B_i^{-1} f_i ≡ g'_i,

where S_i is the "local" Schur complement

    S_i = C_i − E_i B_i^{-1} F_i.   (3)
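Forming S_i for a toy block partition (an illustrative sketch; a diagonal B_i keeps the inversion trivial, and the block names mirror the equation above):

```python
# Sketch: local Schur complement S = C - E B^{-1} F for a tiny
# 2x2 interior block and a 1x1 interface block; illustrative only.

B = [[2.0, 0.0],
     [0.0, 4.0]]    # interior-interior block B_i (diagonal, easy to invert)
F = [[1.0],
     [2.0]]         # interior-interface block F_i
E = [[3.0, 1.0]]    # interface-interior block E_i
C = [[5.0]]         # interface-interface block C_i

# B^{-1} F: with a diagonal B this is just a row-wise scaling
BinvF = [[F[i][0] / B[i][i]] for i in range(2)]
# S = C - E B^{-1} F
S = C[0][0] - sum(E[0][i] * BinvF[i][0] for i in range(2))
# S = 5 - (3*0.5 + 1*0.5) = 3.0
```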
Structure of Schur complement system
► Schur complement system

    S y = g'

with

    [ S_1     E_{12}  . . .  E_{1p} ] [ y_1 ]   [ g'_1 ]
    [ E_{21}  S_2     . . .  E_{2p} ] [ y_2 ]   [ g'_2 ]
    [  ...            . . .    ...  ] [ ... ] = [  ... ]
    [ E_{p1}  E_{p2}  . . .  S_p    ] [ y_p ]   [ g'_p ]
Simplest idea: Schur Complement Iterations
[Figure: the local unknowns split into internal variables u_i and interface variables y_i.]

► Do a global primary iteration (e.g., block-Jacobi)
► Then accelerate only the y variables (with a Krylov method)
► Still need to precondition...
Approximate Schur-LU
► Two-level method based on an induced preconditioner. The global system can also be viewed as

    [ B  F ] [ u ]   [ f ]
    [ E  C ] [ y ] = [ g ],   with the full matrix

    [ B_1                  F_1 ]
    [      B_2             F_2 ]
    [            . . .     ... ]
    [                B_p   F_p ]
    [ E_1  E_2  · · ·  E_p  C  ]

Block LU factorization of A:

    [ B  F ]   [ B  0 ] [ I  B^{-1}F ]
    [ E  C ] = [ E  S ] [ 0     I    ],
Preconditioning:

    L = [ B   0   ]         U = [ I  B^{-1}F ]
        [ E   M_S ]   and       [ 0     I    ]

with M_S = some approximation to S.

► A preconditioning for the global system can be induced from any preconditioning on the Schur complement.

Rewrite the local Schur system as

    y_i + S_i^{-1} Σ_{j∈N_i} E_{ij} y_j = S_i^{-1} [ g_i − E_i B_i^{-1} f_i ].

► This is equivalent to a Block-Jacobi preconditioner for the Schur complement.
► Solve with a few steps (e.g., 5) of GMRES.
► Question: how to solve with S_i?
Two approaches:

(1) Compute an approximation S̃_i ≈ S_i using approximate-inverse techniques (M. Sosonkina).

(2) Simply use the LU factorization of A_i. Exploit the property:

    If  A_i = [ L_{B_i}            0       ] [ U_{B_i}   L_{B_i}^{-1} F_i ]
              [ E_i U_{B_i}^{-1}   L_{S_i} ] [    0          U_{S_i}      ]

    then  L_{S_i} U_{S_i} = S_i.
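This property can be checked numerically; the sketch below (illustrative Python, Doolittle LU without pivoting) factors a tiny A_i and compares the trailing entry of U with the Schur complement computed directly:

```python
# Check: if A = [[B, F], [E, C]] is factored as A = L U (no pivoting),
# the trailing block of the factorization is an LU factorization of the
# Schur complement S = C - E B^{-1} F. Illustrative 3x3 example.

def lu(A):
    # plain Doolittle LU without pivoting: A = L U, L unit lower triangular
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [row[:] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U

A = [[2.0, 0.0, 1.0],
     [0.0, 4.0, 2.0],
     [3.0, 1.0, 5.0]]    # B = leading 2x2 block, C = trailing 1x1 block
L, U = lu(A)

# Schur complement directly: S = 5 - (3*(1/2) + 1*(2/4)) = 3
# Trailing 1x1 factors: L_S = 1, U_S = U[2][2], so U[2][2] must equal S
assert abs(U[2][2] - 3.0) < 1e-12
```

So an ILU or LU factorization of the whole local matrix A_i delivers (approximate) factors of S_i for free, with no need to form S_i explicitly.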
Name      Precon    lfil     4     8    16    24    36    40   (processors)
raefsky1  SAPINV      10    14    13    10    11     8     8
                      20    12    11     9     9     8     8
          SAPINVS     10    16    13    10    11     8     8
                      20    13    11     9     9     8     8
          SLU         10   215   197   198   194   166   171
                      20    48    50    40    42    41    41
          BJ          10    85   171   173   273   252   263
                      20    82   170   173   271   259   259

Number of FGMRES(20) iterations for the RAEFSKY1 problem.
Name     Precon    lfil    16    24    32    40    56    64    80    96   (processors)
af23560  SAPINV      20    32    36    27    29    73    35    71    61
                     30    32    35    23    29    46    60    33    52
         SAPINVS     20    32    35    24    29    55    35    37    59
                     30    32    34    23    28    43    45    23    35
         SLU         20    81   105    94    88    90    76    85    71
                     30    38    34    37    39    38    39    38    35
         BJ          20    37   153    53    60    77    80    95     *
                     30    36    41    53    57    81    87    97   115

Number of FGMRES(20) iterations for the AF23560 problem.
[Figure: two panels for a 360×360 mesh, CPU time (T3E seconds) and iteration counts vs. number of processors; solid line: BJ, dash-dot line: SAPINV, dash-star line: SLU.]

Times and iteration counts for solving a 360×360 discretized Laplacean problem with 3 different preconditioners using flexible GMRES(10).
► Solution times for a Laplacean problem with various local subproblem sizes using FGMRES(10) with 3 different preconditioners (BJ, SAPINV, SLU) and the Schur complement iteration (SI).

[Figure: two panels of T3E seconds vs. number of processors, for a 50×50 mesh per PE and a 70×70 mesh per PE.]
PARALLEL ARMS
Parallel implementation of ARMS
Three types of points: interior points (in independent sets), local interfaces, and inter-domain interfaces.

[Figure: a subdomain with interior points, local interface points, and inter-domain interface points.]

Main ideas: (1) exploit recursion; (2) distinguish two phases: elimination of the interior points, then of the interface points.
Result: a 2-part Schur complement, one part corresponding to the local interfaces and the other to the inter-domain interfaces.

[Figure: reduced system with blocks for the local-interface unknowns (I1) and the inter-domain interface unknowns (I2), plus the external coupling Bext.]
Three approaches
Method 1: Simple Additive Schwarz using ILUT or ARMS locally.

Method 2: Schur complement approach. Solve the Schur complement system (both I1 and I2) with either block Jacobi (M. Sosonkina and YS, '99) or multicolor ILU(0).

Method 3: Do independent-set reduction across subdomains. Requires construction of global group independent sets.

► Current status: Methods 1 and 2.
Construction of global group independent sets: a two-level strategy

1. Color the subdomains
2. Find group independent sets locally
3. Color the groups consistently

[Figure: four subdomains (Proc 1–Proc 4) with consistently colored local group independent sets.]
[Figure: subdomains colored 1–4, with internal interface points and external interface points marked.]

Algorithm: Multicolor Distributed ILU(0)
1. Eliminate local rows
2. Receive external interface rows from PEs such that color(PE) < MyColor
3. Process local interface rows
4. Send local interface rows to PEs such that color(PE) > MyColor
Methods implemented in pARMS:
add_x      Additive Schwarz procedure, with method x for the subdomains, with or without overlap; x is one of ILUT, ILUK, ARMS.

sch_x      Schur complement technique, where x = factorization used for the local submatrix (ILUT, ILUK, ARMS). Equivalent to an Additive Schwarz preconditioner on the Schur complement.

sch_sgs    Multicolor Multiplicative Schwarz (block Gauss-Seidel) preconditioning used instead of Additive Schwarz for the Schur complement.

sch_gilu0  ILU(0) preconditioning used for solving the global Schur complement system obtained from the ARMS reduction.
Test problem
1. Scalability experiment: sample finite difference problem

    −Δu + γ ( e^{xy} ∂u/∂x + e^{−xy} ∂u/∂y ) + α u = f,

with Dirichlet boundary conditions; γ = 100, α = −10; centered-difference discretization.

► Keep the size constant on each processor [100×100] ► global linear system with 10,000 × nproc unknowns.

2. Comparison with a parallel direct solver (symmetric problems).

3. Large irregular matrix example arising from magnetohydrodynamics.
[Figure: wall-clock time (Origin 3800 seconds) vs. number of processors, 100×100 mesh per processor, for add_arms and add_ilut with/without overlap ("ovp") and with/without inner iterations ("no its").]

Solution times for the 2D PDE problem with fixed subproblem size.
[Figure: iteration counts vs. number of processors, 100×100 mesh per processor, for add_arms and add_ilut with/without overlap and with/without inner iterations.]

Iterations for the 2D PDE problem with fixed subproblem size.
[Figure: wall-clock time (Origin 3800 seconds) vs. number of processors, 100×100 mesh per processor, comparing add_arms (no its, ovp no its), sch_arms, sch_gilu0 (with/without inner its), and sch_sgs (with/without inner its).]

Solution times for a 2D PDE problem with the fixed subproblem size using different preconditioners.
[Figure: iteration counts vs. number of processors, 100×100 mesh per processor, for the add_arms, sch_arms, sch_gilu0, and sch_sgs variants.]

Iterations for a 2D PDE problem with the fixed subproblem size using different preconditioners.