Scientific Computing - Parallele Algorithmen (Parallel Algorithms) · January 2007 · Funken / Keller / Urban · Parallel...
Scientific Computing
Prof. Dr. Stefan Funken, Prof. Dr. Alexander Keller, Prof. Dr. Karsten Urban | 11 January 2007
Parallel Algorithms (Parallele Algorithmen)
Scientific Computing | 11 January 2007 | Funken / Keller / Urban | Parallel Numerical Algorithms
How to solve a tridiagonal system?
Algorithm (Tridiagonal system)
1. Eliminate the subdiagonal elements in each diagonal block.
2. Eliminate the superdiagonal elements in each diagonal block, starting from the third-last row.
3. Eliminate the elements in the superdiagonal blocks.
This results in a tridiagonal subsystem in the unknowns $x_5, x_{10}, x_{15}, x_{20}$.
If the data are stored row-wise, only one communication with the neighbouring processor is necessary.
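The three elimination steps can be sketched on a small dense matrix. This is a minimal sketch, not the lecture's implementation. It assumes the block layout implied by the unknowns $x_5, x_{10}, x_{15}, x_{20}$ (20 unknowns in four blocks of five rows, one block per processor). It reads step 3 as removing the single entry that couples the last row of a block to the first unknown of the next block. It uses no pivoting in the block eliminations, so a diagonally dominant matrix is assumed. The names `solve_dense` and `partitioned_tridiagonal_solve` are hypothetical.

```python
def solve_dense(R, c):
    """Gaussian elimination with partial pivoting for a small dense system."""
    m = len(R)
    M = [list(row) + [ci] for row, ci in zip(R, c)]
    for k in range(m):
        piv = max(range(k, m), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, m):
            f = M[i][k] / M[k][k]
            for j in range(k, m + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * m
    for i in reversed(range(m)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, m))
        x[i] = (M[i][m] - s) / M[i][i]
    return x


def partitioned_tridiagonal_solve(A, b, block):
    """Solve A x = b for a tridiagonal A (dense list of rows) block by block."""
    n = len(A)
    assert n % block == 0
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented rows [A | b]

    def eliminate(target, pivot):
        # Row operation that zeroes M[target][pivot] using the diagonal of row `pivot`.
        f = M[target][pivot] / M[pivot][pivot]
        M[target] = [t - f * p for t, p in zip(M[target], M[pivot])]

    starts = range(0, n, block)
    for s in starts:                       # independent work per block
        e = s + block - 1
        for i in range(s + 1, e + 1):      # step 1: subdiagonal inside the block
            eliminate(i, i - 1)
        for i in range(e - 2, s - 1, -1):  # step 2: superdiagonal, third-last row upwards
            eliminate(i, i + 1)
    for s in starts:                       # step 3: entry in the superdiagonal block
        e = s + block - 1
        if e + 1 < n:
            eliminate(e, e + 1)            # uses the first row of the neighbouring block

    iface = [s + block - 1 for s in starts]        # x5, x10, ... in 1-based numbering
    R = [[M[i][j] for j in iface] for i in iface]  # reduced tridiagonal subsystem
    x = [0.0] * n
    for i, v in zip(iface, solve_dense(R, [M[i][n] for i in iface])):
        x[i] = v
    for i in range(n):                     # interior rows couple only to interface unknowns
        if i not in iface:
            s = sum(M[i][j] * x[j] for j in iface)
            x[i] = (M[i][n] - s) / M[i][i]
    return x


n = 20
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 4.0 + 0.1 * i
    if i > 0:
        A[i][i - 1] = -1.0
    if i < n - 1:
        A[i][i + 1] = -1.5
b = [float(i + 1) for i in range(n)]
x = partitioned_tridiagonal_solve(A, b, 5)
residual = max(abs(sum(A[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n))
print("max residual:", residual)
```

Steps 1 and 2 touch only rows of one block, so each processor can run them independently. Step 3 is the point where a block needs one row from its neighbour.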
Iterative Solver
Steepest Descent
The steepest descent method minimizes a differentiable function $F$ in the direction of steepest descent.
Consider $F(x) := \tfrac{1}{2} x^T A x - b^T x$, where $A$ is symmetric and positive definite.
Hence, $\nabla F(x) = \tfrac{1}{2}(A + A^T)x - b = Ax - b$.
Input: initial guess $x_0$
$r_0 := b - A x_0$
Iteration: $k = 0, 1, \ldots$
    $x_{k+1} := x_k + \lambda_{opt}(x_k, r_k)\, r_k$    % update $x_k$
    $r_{k+1} := b - A x_{k+1}$    % compute residual
Using $r_{k+1} = b - A x_{k+1} = b - A\,(x_k + \lambda_{opt}(x_k, r_k)\, r_k) = r_k - \lambda_{opt}(x_k, r_k)\, A r_k$, one gets:
Input: initial guess $x_0$
$r_0 := b - A x_0$
Iteration: $k = 0, 1, \ldots$
    $x_{k+1} := x_k + \lambda_{opt}(x_k, r_k)\, r_k$    % update $x_k$
    $r_{k+1} := r_k - \lambda_{opt}(x_k, r_k)\, A r_k$    % compute residual
Steepest Descent Method
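Let $x, p \in \mathbb{R}^n$. What is the optimal $\lambda_{opt}(x, p)$ in the steepest descent method?
Consider the following minimization problem:
$$f(\lambda) \overset{!}{=} \min \quad \text{with} \quad f(\lambda) := F(x + \lambda p).$$
Then, with $F(x) = \tfrac{1}{2}\langle x, Ax\rangle - \langle b, x\rangle$ and the symmetry of $A$, we get
$$\begin{aligned}
f(\lambda) &= F(x + \lambda p) \\
&= \tfrac{1}{2}\langle x + \lambda p, A(x + \lambda p)\rangle - \langle b, x + \lambda p\rangle \\
&= \tfrac{1}{2}\langle x, Ax\rangle - \langle b, x\rangle + \lambda \langle p, Ax - b\rangle + \tfrac{1}{2}\lambda^2 \langle p, Ap\rangle \\
&= F(x) + \lambda \langle p, Ax - b\rangle + \tfrac{1}{2}\lambda^2 \langle p, Ap\rangle .
\end{aligned}$$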
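If $p \neq 0$, then $\langle p, Ap\rangle > 0$.
Hence, from $0 \overset{!}{=} f'(\lambda) = \langle p, Ax - b\rangle + \lambda \langle p, Ap\rangle$ we obtain
$$\lambda_{opt}(x, p) = \frac{\langle p, b - Ax\rangle}{\langle p, Ap\rangle}.$$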
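In the steepest descent iteration the search direction is the residual, $p = r_k = b - A x_k$. As a short worked step (a sketch that follows from the formula for $\lambda_{opt}$ and the expansion of $f(\lambda)$, using only the symbols of the derivation), inserting this choice gives the step length and the decrease of $F$ per iteration:

```latex
\lambda_{opt}(x_k, r_k)
  = \frac{\langle r_k, b - A x_k\rangle}{\langle r_k, A r_k\rangle}
  = \frac{\langle r_k, r_k\rangle}{\langle r_k, A r_k\rangle},
\qquad
F(x_{k+1}) = F(x_k) - \frac{1}{2}\,\frac{\langle r_k, r_k\rangle^{2}}{\langle r_k, A r_k\rangle}.
```

Since $A$ is positive definite, the subtracted term is positive whenever $r_k \neq 0$, so $F$ decreases strictly in every step.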
Numerical Example
2D Problem
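- $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$
- $b = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$
- $x_0 = \begin{pmatrix} 8 \\ -3 \end{pmatrix}$
- 5 iterations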
Iterative Solver
Steepest Descent
Input: initial guess $x_0$
$r_0 := b - A x_0$
Iteration: $k = 0, 1, \ldots$
    $\lambda_{opt} := \dfrac{\langle r_k, r_k\rangle}{\langle r_k, A r_k\rangle}$
    $x_{k+1} := x_k + \lambda_{opt}\, r_k$
    $r_{k+1} := r_k - \lambda_{opt}\, A r_k$
2 matrix-vector products, 2 inner products, and 2 saxpys per iteration.
Is it possible to save one matrix-vector product?
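Storing $a_k := A r_k$ and reusing it saves one product:
Input: initial guess $x_0$
$r_0 := b - A x_0$
Iteration: $k = 0, 1, \ldots$
    $a_k := A r_k$
    $\lambda_{opt} := \dfrac{\langle r_k, r_k\rangle}{\langle r_k, a_k\rangle}$
    $x_{k+1} := x_k + \lambda_{opt}\, r_k$
    $r_{k+1} := r_k - \lambda_{opt}\, a_k$
1 matrix-vector product, 2 inner products, and 2 saxpys per iteration.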
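The iteration with a single matrix-vector product can be sketched in a few lines of Python and applied to the 2D numerical example from the slides ($A$, $b$ and $x_0$ as given there). This is a minimal sketch: the helper names `matvec`, `dot` and `steepest_descent` are illustrative, and plain lists are used instead of a linear algebra library.

```python
def matvec(A, x):
    """Dense matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Euclidean inner product <u, v>."""
    return sum(ui * vi for ui, vi in zip(u, v))


def steepest_descent(A, b, x0, iterations):
    """Steepest descent for symmetric positive definite A, one mat-vec per step."""
    x = list(x0)
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]   # r_0 := b - A x_0
    for _ in range(iterations):
        rr = dot(r, r)                                 # inner product 1
        if rr == 0.0:                                  # already converged
            break
        a = matvec(A, r)                               # a_k := A r_k (the only mat-vec)
        lam = rr / dot(r, a)                           # inner product 2
        x = [xi + lam * ri for xi, ri in zip(x, r)]    # saxpy 1
        r = [ri - lam * ai for ri, ai in zip(r, a)]    # saxpy 2
    return x, r


A = [[2.0, 1.0], [1.0, 2.0]]
b = [-1.0, 1.0]
x0 = [8.0, -3.0]
x5, r5 = steepest_descent(A, b, x0, 5)
print("x after 5 iterations:", x5)
print("residual after 5 iterations:", r5)
```

The exact solution of this example is $x = (-1, 1)^T$, so the printed iterate can be compared against it. The residual is updated recursively, which is what makes the second matrix-vector product unnecessary.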
Numbering
[Figure: nodes of two subdomains, labelled with the local numbering (1 to 4 in each subdomain) and the global numbering (1 to 6).]
How can vectors be given?
- Full value at each node, e.g. given
$$u_\ell = (1, 1, 1, 1)^T, \qquad u_r = (1, 1, 1, 1)^T.$$
Using the incidence matrices $C_\ell$ and $C_r$,
$$C_\ell = \begin{pmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \qquad
C_r = \begin{pmatrix}
0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix},$$
note that for $u_\ell$
$$\begin{pmatrix} 1 \\ 0 \\ 1 \\ 1 \\ 1 \\ 0 \end{pmatrix}
= \begin{pmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}.$$
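Hence
$$\begin{aligned}
u &= C_\ell (1, 1, 1, 1)^T + C_r (1, 1, 1, 1)^T \\
  &= (1, 0, 1, 1, 1, 0)^T + (0, 1, 1, 1, 0, 1)^T \\
  &= (1, 1, 2, 2, 1, 1)^T \neq (1, 1, 1, 1, 1, 1)^T,
\end{aligned}$$
that is, the sum $u = C_\ell u_\ell + C_r u_r$ counts the two shared nodes (global nodes 3 and 4) twice.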
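- Value is given after assembling all data, e.g. given
$$u_\ell = (1, \tfrac{1}{2}, 1, \tfrac{1}{2})^T, \qquad u_r = (1, \tfrac{1}{2}, \tfrac{1}{2}, 1)^T$$
results in
$$\begin{aligned}
u &= C_\ell u_\ell + C_r u_r \\
  &= (1, 0, \tfrac{1}{2}, \tfrac{1}{2}, 1, 0)^T + (0, 1, \tfrac{1}{2}, \tfrac{1}{2}, 0, 1)^T \\
  &= (1, 1, 1, 1, 1, 1)^T .
\end{aligned}$$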
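The two ways of giving the vectors can be checked with a few lines of Python. This is a sketch: instead of forming $C_\ell$ and $C_r$ explicitly, it uses the local-to-global index lists read off their columns (0-based here, while the slides count from 1). The names `LEFT`, `RIGHT` and `assemble` are hypothetical.

```python
# Local-to-global node maps, read off the columns of C_l and C_r.
LEFT = [4, 2, 0, 3]    # left local nodes 1..4 -> global nodes 5, 3, 1, 4
RIGHT = [1, 3, 2, 5]   # right local nodes 1..4 -> global nodes 2, 4, 3, 6


def assemble(parts, n=6):
    """Return sum_k C_k u_k for a list of (local_to_global, local_vector) pairs."""
    u = [0.0] * n
    for local_to_global, local in parts:
        for g, value in zip(local_to_global, local):
            u[g] += value          # scatter-add, same effect as multiplying by C_k
    return u


# Full value stored at every local node: shared nodes are counted twice.
full = assemble([(LEFT, [1, 1, 1, 1]), (RIGHT, [1, 1, 1, 1])])
# Values that only become complete after assembling.
split = assemble([(LEFT, [1, 0.5, 1, 0.5]), (RIGHT, [1, 0.5, 0.5, 1])])
print(full)
print(split)
```

The first result reproduces $(1, 1, 2, 2, 1, 1)^T$ and the second $(1, 1, 1, 1, 1, 1)^T$, matching the two cases on the slide.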
Types of Vectors
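Two types of vectors, depending on the storage type:
type I: $u$ is stored on $P_k$ as the restriction $u_k = C_k^T u$. The 'complete' value is accessible on $P_k$.
type II: $r$ is stored on $P_k$ as $r_k$ such that $r = \sum_{k=1}^{p} C_k r_k$. Nodes on the interface hold only a part of the full value.
Here $C_k$ is the incidence matrix of subdomain $k$ with one column per local node, as $C_\ell$ and $C_r$ in the numbering example.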
Numbering
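Let matrices on both subdomains be given, for example:
$$A_\ell = \begin{pmatrix}
2 & 1 & 3 & -2 \\
-3 & 4 & -7 & 3 \\
4 & 3 & 6 & 0 \\
5 & -2 & 1 & 2
\end{pmatrix}, \qquad
A_r = \begin{pmatrix}
0 & 2 & 1 & 0 \\
1 & 3 & -7 & 2 \\
-2 & -9 & 4 & 0 \\
3 & 7 & 1 & 5
\end{pmatrix}.$$
How can the matrix $A$ with respect to the global numbering be constructed from $A_\ell$ and $A_r$?
Use the incidence matrices $C_\ell$ and $C_r$:
$$C_\ell = \begin{pmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \qquad
C_r = \begin{pmatrix}
0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.$$
Now we get $A = C_\ell A_\ell C_\ell^T + C_r A_r C_r^T$.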
A = C`A`CT` + Cr Ar CT
r
=
0BBBBB@
0 0 1 00 0 0 00 1 0 00 0 0 11 0 0 00 0 0 0
1CCCCCA
0BB@
2 1 3 −2−3 4 −7 34 3 6 05 −2 1 2
1CCA
0BB@
0 0 0 0 1 00 0 1 0 0 01 0 0 0 0 00 0 0 1 0 0
1CCA + . . .
=
0BBBBB@
6 0 3 0 4 00 0 0 0 0 0−7 0 4 3 −3 01 0 −2 2 5 03 0 1 −2 2 00 0 0 0 0 0
1CCCCCA
+
0BBBBB@
0 0 0 0 0 00 0 1 2 0 00 −2 4 −9 0 00 1 −7 3 0 20 0 0 0 0 00 3 1 7 0 5
1CCCCCA
=
0BBBBB@
6 0 3 0 4 00 0 1 2 0 0−7 −2 4+4 −9+3 −3 01 1 −7−2 3+2 5 23 0 1 −2 2 00 3 1 7 0 5
1CCCCCA
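The left-hand contribution can be checked numerically. The following NumPy sketch rebuilds $C_\ell$ from the index map implied by the matrix on this slide (local nodes 1, 2, 3, 4 sit at global nodes 5, 3, 1, 4). The variable names are illustrative and not taken from the slides.

```python
import numpy as np

# Local-to-global map read off from C_l (0-based): local node i sits at global node g[i].
local_to_global = [4, 2, 0, 3]
n_global, n_local = 6, 4

# Boolean connectivity matrix C_l (6 x 4): exactly one 1 per column.
C_l = np.zeros((n_global, n_local), dtype=int)
for i, g in enumerate(local_to_global):
    C_l[g, i] = 1

A_l = np.array([[ 2,  1,  3, -2],
                [-3,  4, -7,  3],
                [ 4,  3,  6,  0],
                [ 5, -2,  1,  2]])

# Contribution of the left subdomain in global numbering.
A_left = C_l @ A_l @ C_l.T

# Equivalent scatter without forming C_l: A_left[g[i], g[j]] = A_l[i, j].
A_scatter = np.zeros((n_global, n_global), dtype=int)
A_scatter[np.ix_(local_to_global, local_to_global)] = A_l
```

In practice the scatter form is usually preferred, since the Boolean matrix never has to be stored.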
Types of Matrices
There are two types of matrices:
type I: 'Complete' entries (but not all of them) are accessible on $P_k$.
type II: The matrix is stored in a distributed manner, similar to type II vectors:
$$A = \sum_{k=1}^{p} C_k A_k C_k^T,$$
where $A_k$ belongs to processor $P_k$, resp. to the subdomain $\Omega_k$.
Converting Type
Obviously, addition, subtraction (and similar operations) of vectors can be done without communication if they are of the same type.
- Converting from type I to type II needs communication. The mapping is not unique, e.g.
$$u_i = C_i \left( \sum_{k=1}^{p} C_k C_k^T \right)^{-1} C_k^T u_k$$
- Converting from type II to type I needs communication.
$$r_i = C_i \sum_{k=1}^{p} C_k^T r_k$$
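The type II to type I conversion can be sketched as follows, assuming (as in the formula on this slide) that $C_k$ is the Boolean $n_k \times n$ matrix that picks the local nodes out of a global vector. The two subdomains, the data, and the even split used for the reverse direction are hypothetical choices for illustration only.

```python
import numpy as np

n = 5                                   # global nodes 0..4
nodes = [[0, 1, 2], [2, 3, 4]]          # hypothetical subdomains sharing node 2

def restriction(idx, n):
    """Boolean matrix C_k (n_k x n) that picks the local nodes out of a global vector."""
    C = np.zeros((len(idx), n))
    C[np.arange(len(idx)), idx] = 1.0
    return C

C = [restriction(idx, n) for idx in nodes]

# Type II (distributed): each processor holds only its own contribution r_k.
r_loc = [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])]

# Type II -> type I: r_i = C_i * sum_k C_k^T r_k. The sum is the communication step.
r_glob = sum(Ck.T @ rk for Ck, rk in zip(C, r_loc))
w_loc = [Ck @ r_glob for Ck in C]

# Type I -> type II is not unique. One possible choice (an assumption here):
# divide each accumulated entry by the number of subdomains sharing that node.
mult = sum(Ck.T @ np.ones(Ck.shape[0]) for Ck in C)    # node multiplicities
back = [wk / (Ck @ mult) for Ck, wk in zip(C, w_loc)]
```

Summing the pieces in `back` again reproduces the accumulated vector, although `back` differs from the original `r_loc`, which illustrates the non-uniqueness.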
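Inner Product
The inner product of two vectors $u$, $r$ of different type needs only one reduce-communication.
$$\langle u, r \rangle = u^T \sum_{k=1}^{p} C_k^T r_k = \sum_{k=1}^{p} u^T C_k^T r_k = \sum_{k=1}^{p} \langle C_k u, r_k \rangle = \sum_{k=1}^{p} \langle u_k, r_k \rangle$$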
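The identity can be checked with a small sketch. The two subdomains and the vector entries are hypothetical. The processors are simulated by list entries, and the final `sum` stands in for the single reduce operation.

```python
import numpy as np

n = 5
nodes = [[0, 1, 2], [2, 3, 4]]          # hypothetical subdomains sharing node 2

u = np.array([1.0, -2.0, 3.0, 0.5, 2.0])            # global vector
u_loc = [u[idx] for idx in nodes]                     # type I: u_k = C_k u
r_loc = [np.array([1.0, 2.0, 3.0]),                   # type II: local contributions r_k
         np.array([4.0, 5.0, 6.0])]

# Each processor computes one local scalar; summing them is the single reduction.
partial = [float(uk @ rk) for uk, rk in zip(u_loc, r_loc)]
dot_parallel = sum(partial)

# Reference: assemble r = sum_k C_k^T r_k globally and take the ordinary dot product.
r = np.zeros(n)
for idx, rk in zip(nodes, r_loc):
    np.add.at(r, idx, rk)
dot_reference = float(u @ r)
```

Only one scalar per processor is communicated, independent of the vector length.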
Matrix-Vector Multiplications
- type II matrix × type I vector: the result is a type II vector, no communication!
  Consider $A = \sum_{k=1}^{p} C_k A_k C_k^T$.
  $$Au = \sum_{k=1}^{p} C_k A_k C_k^T u = \sum_{k=1}^{p} C_k \underbrace{A_k u_k}_{r_k} = r$$
- type II matrix × type II vector: type conversion is necessary, needs communication.
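A minimal sketch of the communication-free product, using the convention of this slide ($u_k = C_k^T u$, $r = \sum_k C_k r_k$). The local matrices are hypothetical (pieces of a 1D Laplacian), and the global assembly at the end exists only to verify the result.

```python
import numpy as np

n = 5
nodes = [[0, 1, 2], [2, 3, 4]]           # hypothetical subdomains sharing node 2

# Hypothetical local matrices A_k (type II: A = sum_k C_k A_k C_k^T).
A_loc = [np.array([[ 2.0, -1.0,  0.0],
                   [-1.0,  2.0, -1.0],
                   [ 0.0, -1.0,  1.0]]),
         np.array([[ 1.0, -1.0,  0.0],
                   [-1.0,  2.0, -1.0],
                   [ 0.0, -1.0,  2.0]])]

u = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
u_loc = [u[idx] for idx in nodes]        # type I vector: u_k = C_k^T u

# Purely local products r_k = A_k u_k; no data leaves a processor.
r_loc = [Ak @ uk for Ak, uk in zip(A_loc, u_loc)]

# Verification only: assemble A and r globally and compare with A u.
A = np.zeros((n, n))
r = np.zeros(n)
for idx, Ak, rk in zip(nodes, A_loc, r_loc):
    A[np.ix_(idx, idx)] += Ak
    np.add.at(r, idx, rk)
```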
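Steepest Descent
Parallel Version
Input: Initial guess $x_0$
$r_0 := b - A x_0$
$w_0 := \sum_{\ell=1}^{p} C_\ell^T r_0$
Iteration: $k = 0, 1, \dots$
  $a_k := A w_k$
  $\lambda := \dfrac{\langle w_k, r_k \rangle}{\langle w_k, a_k \rangle}$
  $x_{k+1} := x_k + \lambda\, w_k$
  $r_{k+1} := r_k - \lambda\, a_k$
  $w_{k+1} := \sum_{\ell=1}^{p} C_\ell^T r_{k+1}$
Only two allreduce-communications and one vector accumulation per iteration are necessary!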
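The iteration can be simulated on a single machine to see where the communication occurs. This is only a sketch: the subdomains, local matrices, right-hand side, and stopping tolerance are hypothetical, and the helper names `accumulate` and `dot` are not from the slides. List entries play the role of processors.

```python
import numpy as np

n = 5
nodes = [[0, 1, 2], [2, 3, 4]]           # hypothetical subdomains sharing node 2
A_loc = [np.array([[ 2.0, -1.0,  0.0], [-1.0,  2.0, -1.0], [ 0.0, -1.0,  1.0]]),
         np.array([[ 1.0, -1.0,  0.0], [-1.0,  2.0, -1.0], [ 0.0, -1.0,  2.0]])]
b_loc = [np.array([1.0, 0.0, 0.5]), np.array([0.5, 0.0, 1.0])]   # type II

def accumulate(v_loc):
    """Type II -> type I: sum the contributions at shared nodes (vector accumulation)."""
    v = np.zeros(n)
    for idx, vk in zip(nodes, v_loc):
        np.add.at(v, idx, vk)
    return [v[idx] for idx in nodes]

def dot(u_loc, r_loc):
    """Inner product of a type I and a type II vector: local dots plus one allreduce."""
    return sum(float(uk @ rk) for uk, rk in zip(u_loc, r_loc))

x_loc = [np.zeros(len(idx)) for idx in nodes]                        # type I, x_0 = 0
r_loc = [bk - Ak @ xk for Ak, bk, xk in zip(A_loc, b_loc, x_loc)]    # type II
w_loc = accumulate(r_loc)                                            # type I

for k in range(2000):
    rho = dot(w_loc, r_loc)                   # allreduce 1: <w_k, r_k>
    if rho < 1e-20:                           # assumed stopping tolerance
        break
    a_loc = [Ak @ wk for Ak, wk in zip(A_loc, w_loc)]   # local, no communication
    lam = rho / dot(w_loc, a_loc)             # allreduce 2: <w_k, a_k>
    x_loc = [xk + lam * wk for xk, wk in zip(x_loc, w_loc)]
    r_loc = [rk - lam * ak for rk, ak in zip(r_loc, a_loc)]
    w_loc = accumulate(r_loc)                 # the one vector accumulation
```

The solution `x_loc` stays a type I vector throughout, so both processors hold the same value at the shared node.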
Non-overlapping Subdomains
Different indices
1. I: nodes in the interior of subdomains [$N_I = \sum_{j=1}^{p} N_{I,j}$].
2. E: nodes in the interior of subdomain edges [$N_E = \sum_{j=1}^{n_e} N_{E,j}$] ($n_e$: number of subdomain edges).
3. V: crosspoints, i.e. endpoints of subdomain edges [$N_V$].
4. E and V are often denoted as coupling nodes with index C [$N_C = N_E + N_V$].
Non-overlapping Subdomains
Communication
1. Communication is only necessary for nodes on the coupling boundaries.
2. Global communication for crosspoints.
3. Only communication with the neighbouring subdomain for edge nodes.
4. Not all nodes have to be 'touched' for a vector accumulation $w := \sum_{\ell=1}^{p} C_\ell^T r$.
5. Split into communication between neighbouring subdomains and one global communication for all crosspoints.
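Point 4 can be illustrated with a toy case. The sketch below uses a hypothetical 1D decomposition with a single coupling node, so the split into edge nodes (neighbour exchange) and crosspoints (global exchange) is not modelled. It only shows that interior entries pass through an accumulation unchanged.

```python
import numpy as np

n = 5
nodes = [[0, 1, 2], [2, 3, 4]]      # hypothetical: node 2 is the only coupling node

# Count in how many subdomains each global node appears.
mult = np.zeros(n, dtype=int)
for idx in nodes:
    np.add.at(mult, idx, 1)
coupling = np.flatnonzero(mult > 1)          # nodes that require communication

r_loc = [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])]

# Exchange only the coupling entries; interior entries are never touched.
shared_sum = {int(g): 0.0 for g in coupling}
for idx, rk in zip(nodes, r_loc):
    for i, g in enumerate(idx):
        if g in shared_sum:
            shared_sum[g] += rk[i]

# Accumulated (type I) local vectors: summed value on coupling nodes, own value elsewhere.
w_loc = [np.array([shared_sum.get(g, rk[i]) for i, g in enumerate(idx)])
         for idx, rk in zip(nodes, r_loc)]
```

Here a single number is exchanged, although each local vector has three entries.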
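Numerical Example
Notice the following properties of the algorithm:
$$r_m \perp r_{m+1} = r_m - \lambda_{\mathrm{opt}}(x_m, r_m)\, A r_m = r_m - \frac{\langle r_m, b - A x_m \rangle}{\langle r_m, A r_m \rangle}\, A r_m$$
resp.
$$\langle r_m, r_{m+1} \rangle = \langle r_m, r_m \rangle - \frac{\langle r_m, b - A x_m \rangle}{\langle r_m, A r_m \rangle}\, \langle r_m, A r_m \rangle = 0 \quad (\text{since } b - A x_m = r_m),$$
but not $r_m \perp r_{m+2}$. We lose all our information!
There exists a better algorithm for symmetric and positive definite matrices, as they arise in the finite element method: the CG algorithm.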
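A quick numerical check of these properties, as a sketch with a hypothetical $3 \times 3$ symmetric positive definite matrix that is not taken from the slides:

```python
import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])        # hypothetical SPD test matrix
b = np.array([1.0, 2.0, 3.0])

x = np.zeros(3)
residuals = []
for m in range(3):
    r = b - A @ x                      # r_m = b - A x_m
    residuals.append(r)
    lam = (r @ r) / (r @ (A @ r))      # optimal step length lambda_opt
    x = x + lam * r                    # steepest descent step

r0, r1, r2 = residuals
consecutive = (r0 @ r1, r1 @ r2)       # both vanish up to rounding
skipped = r0 @ r2                      # clearly nonzero for this example
```

Consecutive residuals are orthogonal, but the residual two steps later again has a component along $r_m$, which is the loss of information the slide refers to.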
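Preconditioned Conjugate Gradient Method
Solve $Ax = b$ ($A$, $W$ symmetric positive definite), with $W^{-1}$ 'easy' to compute such that $W^{-1} A \approx I$ (e.g. $W^{-1} = I$, or $W^{-1}$ = $k$ iterations of Jacobi/Gauss-Seidel).
Input: Initial guess $x_0$
$r_0 := b - A x_0$
$p_0 := W^{-1} r_0$
$\sigma_0 := \langle p_0, r_0 \rangle$
Iteration: $k = 0, 1, \dots$ (as long as $k < n$, $r_k \neq 0$)
  $a_k := A p_k$
  $\lambda_{\mathrm{opt}} := \dfrac{\sigma_k}{\langle a_k, p_k \rangle}$
  $x_{k+1} := x_k + \lambda_{\mathrm{opt}}\, p_k$
  $r_{k+1} := r_k - \lambda_{\mathrm{opt}}\, a_k$
  $q_{k+1} := W^{-1} r_{k+1}$
  $\sigma_{k+1} := \langle q_{k+1}, r_{k+1} \rangle$
  $p_{k+1} := q_{k+1} + \dfrac{\sigma_{k+1}}{\sigma_k}\, p_k$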
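The steps above can be sketched in NumPy as follows. The function name `pcg`, the tolerance, and the test system are assumptions for illustration. The preconditioner used in the example is diagonal scaling, which corresponds to one Jacobi iteration started from zero.

```python
import numpy as np

def pcg(A, b, apply_W_inv, x0, tol=1e-12):
    """Preconditioned CG for symmetric positive definite A.

    apply_W_inv(r) must return W^{-1} r for a symmetric positive definite W.
    """
    x = x0.copy()
    r = b - A @ x
    p = apply_W_inv(r)
    sigma = p @ r
    for k in range(len(b)):                 # at most n steps
        if np.sqrt(r @ r) <= tol:           # assumed stopping test for r_k = 0
            break
        a = A @ p
        lam = sigma / (a @ p)
        x = x + lam * p                     # step along the search direction p_k
        r = r - lam * a
        q = apply_W_inv(r)
        sigma_new = q @ r
        p = q + (sigma_new / sigma) * p     # new A-conjugate search direction
        sigma = sigma_new
    return x

# Hypothetical SPD test system with a Jacobi (diagonal) preconditioner.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
d = np.diag(A)
x = pcg(A, b, lambda r: r / d, np.zeros(3))
```

Passing the preconditioner as a function keeps the solver independent of how $W^{-1}$ is realised.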
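Parallel Preconditioned Conjugate Gradient Method
Input: Initial guess $x_0$
$r_0 := b - A x_0$
$w_0 := \sum_{\ell=1}^{p} C_\ell^T r_0$
$p_0 := W^{-1} w_0$
$s_0 := \sum_{\ell=1}^{p} C_\ell^T p_0$
$\sigma_0 := \langle w_0, p_0 \rangle$
Iteration: $k = 0, 1, \ldots$ (as long as $k < n$, $r_k \neq 0$)
$a_k := A s_k$, $\lambda_{\mathrm{opt}} := \dfrac{\sigma_k}{\langle a_k, s_k \rangle}$
$x_{k+1} := x_k + \lambda_{\mathrm{opt}} s_k$
$r_{k+1} := r_k - \lambda_{\mathrm{opt}} a_k$
$w_{k+1} := \sum_{\ell=1}^{p} C_\ell^T r_{k+1}$
$q_{k+1} := W^{-1} w_{k+1}$, $\sigma_{k+1} := \langle q_{k+1}, w_{k+1} \rangle$
$p_{k+1} := q_{k+1} + \dfrac{\sigma_{k+1}}{\sigma_k} p_k$
$s_{k+1} := \sum_{\ell=1}^{p} C_\ell^T p_{k+1}$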
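The iteration above can be sketched on a single process, where the accumulation $\sum_\ell C_\ell^T v$ reduces to the identity, so that w = r and s = p. The following is only a minimal sketch under stated assumptions, not the lecture's implementation: the matrix tridiag(-1, 2, -1), the Jacobi choice for $W^{-1}$, the tolerance-based stopping test, and the names `apply_A`, `apply_Winv`, `dot` and `pcg_serial` are all hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed test matrix: y = A*x with A = tridiag(-1, 2, -1). */
static void apply_A(size_t n, const double *x, double *y)
{
    for (size_t i = 0; i < n; ++i) {
        double v = 2.0 * x[i];
        if (i > 0)     v -= x[i - 1];
        if (i + 1 < n) v -= x[i + 1];
        y[i] = v;
    }
}

/* Assumed preconditioner: Jacobi, q = D^{-1} w with D = diag(A) = 2I. */
static void apply_Winv(size_t n, const double *w, double *q)
{
    for (size_t i = 0; i < n; ++i) q[i] = w[i] / 2.0;
}

static double dot(size_t n, const double *u, const double *v)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) s += u[i] * v[i];
    return s;
}

/* Single-process PCG: x holds the initial guess on entry and the
 * approximate solution on exit. Returns the number of iterations,
 * or -1 if the work arrays cannot be allocated. */
int pcg_serial(size_t n, const double *b, double *x, double tol)
{
    assert(n > 0 && b != NULL && x != NULL);
    double *work = malloc(4 * n * sizeof *work);
    if (work == NULL) return -1;
    double *r = work, *q = work + n, *p = work + 2 * n, *a = work + 3 * n;

    apply_A(n, x, a);
    for (size_t i = 0; i < n; ++i) r[i] = b[i] - a[i];   /* r0 := b - A x0 */
    apply_Winv(n, r, p);                                 /* p0 := W^{-1} w0, w0 = r0 */
    double sigma = dot(n, r, p);                         /* sigma0 := <w0, p0> */

    int k = 0;
    /* Stop after n steps or when ||r|| <= tol (instead of r == 0). */
    while ((size_t)k < n && dot(n, r, r) > tol * tol) {
        apply_A(n, p, a);                                /* a_k := A s_k, s_k = p_k */
        double lambda = sigma / dot(n, a, p);            /* lambda_opt */
        for (size_t i = 0; i < n; ++i) {
            x[i] += lambda * p[i];                       /* x_{k+1} */
            r[i] -= lambda * a[i];                       /* r_{k+1} */
        }
        apply_Winv(n, r, q);                             /* q_{k+1} := W^{-1} w_{k+1} */
        double sigma_new = dot(n, q, r);                 /* sigma_{k+1} := <q_{k+1}, w_{k+1}> */
        for (size_t i = 0; i < n; ++i)
            p[i] = q[i] + (sigma_new / sigma) * p[i];    /* p_{k+1} */
        sigma = sigma_new;
        ++k;
    }
    free(work);
    return k;
}
```

In a distributed version, the two places where w = r and s = p are used would become the two vector accumulations, and the two calls to `dot` inside the loop that produce `lambda` and `sigma_new` would each need a global reduction.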
Parallel Preconditioned Conjugate Gradient Method
$A$ and $W^{-1}$ are given as type II 'matrices'.
- storage needed for 7 vectors (plus $A$ and $W^{-1}$)
- 2 vector accumulations (per iteration)
- 2 allreduce operations
- 1 'local' application of $A$ and $W^{-1}$
- 2 inner products and 3 saxpy operations
How should we choose $W^{-1}$?
Debugging MPI Programs
Practical debugging strategies
- run the parallel program on a single process; this tests most of the functionality, such as I/O
- run the parallel program with two or more processes, such that all functionality can be exercised
- run with the smallest problem size that exercises all functionality; solving a 4 × 4 system is the same as solving a 1024 × 1024 system
- use a 'printf' debugger
  - put fflush(stdout); after every printf
  - for point-to-point communication, print the data being sent and received
  - prefix each message with the process rank and sort by rank; messages received from different processes do not necessarily arrive in chronological order
- make sure that all the data structures have been set up correctly
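The printf advice above can be sketched as a small helper that prefixes every line with the rank and flushes immediately. This is only an illustration: `dbg_format` and `dbg_emit` are hypothetical names, not part of MPI, and the rank is passed in as a plain integer that is assumed to come from `MPI_Comm_rank`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: write "[rank] message" into buf.
 * The rank is zero-padded so that a plain textual sort of the
 * combined output groups the lines by rank.
 * Returns the length the full line needs, or a negative value on error. */
int dbg_format(char *buf, size_t size, int rank, const char *fmt, ...)
{
    assert(buf != NULL && size > 0 && fmt != NULL);
    int used = snprintf(buf, size, "[%04d] ", rank);
    if (used < 0 || (size_t)used >= size) return used;

    va_list ap;
    va_start(ap, fmt);
    int more = vsnprintf(buf + used, size - (size_t)used, fmt, ap);
    va_end(ap);
    return more < 0 ? more : used + more;
}

/* Hypothetical helper: print one line and flush right away, so the
 * output is not lost in a buffer if the process hangs or aborts. */
void dbg_emit(const char *line)
{
    assert(line != NULL);
    printf("%s\n", line);
    fflush(stdout);
}
```

A process about to send 20 integers to rank 1 could call `dbg_format(buf, sizeof buf, rank, "send %d ints to %d", 20, 1)` followed by `dbg_emit(buf)`; the zero padding only keeps the sort order correct for fewer than 10000 ranks.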
Most frequent sources of trouble
Sequential programming
1. interface problems (types, storage of pointers to data)
2. pointers and dynamic memory management
3. logical and algorithmic bugs
Parallel programming
1. communication
2. races
3. deadlocks
Races
Definition: A race produces an unpredictable program state and behavior due to unsynchronized concurrent executions. Most often data races occur, which are caused by unordered concurrent accesses to the same memory location from multiple processes.
Example: 'triangle inequality'
[Figure: Process 1, Process 2 and Process 3 with messages numbered 1, 2, 3; labels in the figure: recv(−1), recv(1), send(3), send(3), send(2)]
Effect: non-deterministic, non-reproducible program runs
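One way such a race becomes visible in numerical code is through floating-point rounding. The sketch below is not an MPI program; it only mimics a receiver that adds contributions in whatever order they arrive, for example when it accepts messages through a wildcard receive such as `MPI_ANY_SOURCE` (reading recv(−1) in the figure as such a wildcard is an assumption). The function name `sum_in_arrival_order` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Add the contributions in the given arrival order:
 * order[i] is the index of the contribution processed i-th. */
double sum_in_arrival_order(const double *contrib, const size_t *order, size_t n)
{
    assert(contrib != NULL && order != NULL);
    double total = 0.0;
    for (size_t i = 0; i < n; ++i)
        total += contrib[order[i]];
    return total;
}
```

With the contributions 1e20, 1.0 and -1e20, the arrival order 0, 1, 2 loses the 1.0 to rounding, while the order 0, 2, 1 keeps it. The same program with the same input can therefore return different sums from run to run if the arrival order is not fixed.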
Communication with MPI
Deadlock I
| Time | Process A | Process B |
|------|-----------|-----------|
| 1 | MPI_Send to B, tag = 0 | local work |
| 2 | MPI_Send to B, tag = 1 | local work |
| 3 | local work | MPI_Recv from A, tag = 1 |
| 4 | local work | MPI_Recv from A, tag = 0 |

- The program will deadlock if the system provides no buffer.
- Process A is not able to send the message with tag = 0.
- Process B is not able to receive the message with tag = 1.
Communication with MPI
Deadlock II
| Time | Process A | Process B |
|------|-----------|-----------|
| 1 | MPI_Send to B | MPI_Send to A |
| 2 | MPI_Recv from B | MPI_Recv from A |

- The program will deadlock if the system provides no buffer.
- Process A and Process B are not able to send messages.
- Order communications in the right way!
Communication with MPI
Example: Exchange of messages
    if (myrank == 0) {
        /* rank 0 sends first, then receives */
        MPI_Send(sendbuf, 20, MPI_INT, 1, tag, communicator);
        MPI_Recv(recvbuf, 20, MPI_INT, 1, tag, communicator, &status);
    } else if (myrank == 1) {
        /* rank 1 receives first, then sends */
        MPI_Recv(recvbuf, 20, MPI_INT, 0, tag, communicator, &status);
        MPI_Send(sendbuf, 20, MPI_INT, 0, tag, communicator);
    }
- This code succeeds even with no buffer space at all.
- Important note: Code that relies on buffering is considered unsafe.
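The two deadlock schedules and the safe exchange can be checked with a toy model of unbuffered communication. This is only a sketch and not MPI: it assumes that a blocking send completes only when the destination's current operation is the matching receive (same source, same tag), and it leaves out local work because that always completes. All names (`struct op`, `struct proc`, `completes_without_buffering`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum op_kind { OP_SEND, OP_RECV };

/* One blocking operation: send to / receive from `peer` with `tag`. */
struct op { enum op_kind kind; int peer; int tag; };

/* A process is a fixed list of operations plus the index of the next one. */
struct proc { const struct op *ops; size_t count; size_t next; };

/* Returns 1 if every process runs to completion without any buffering,
 * 0 if the processes end up waiting for each other (deadlock). */
int completes_without_buffering(struct proc *procs, int nprocs)
{
    assert(procs != NULL && nprocs > 0);
    int progress = 1;
    while (progress) {
        progress = 0;
        for (int i = 0; i < nprocs; ++i) {
            struct proc *s = &procs[i];
            if (s->next == s->count) continue;          /* already finished */
            const struct op *so = &s->ops[s->next];
            if (so->kind != OP_SEND) continue;          /* receives wait for a sender */
            assert(so->peer >= 0 && so->peer < nprocs);
            struct proc *r = &procs[so->peer];
            if (r->next == r->count) continue;          /* nobody left to receive */
            const struct op *ro = &r->ops[r->next];
            if (ro->kind == OP_RECV && ro->peer == i && ro->tag == so->tag) {
                ++s->next;                              /* send and receive complete together */
                ++r->next;
                progress = 1;
            }
        }
    }
    for (int i = 0; i < nprocs; ++i)
        if (procs[i].next != procs[i].count) return 0;
    return 1;
}
```

Fed with the Deadlock I schedule (A sends tag 0 then tag 1, B receives tag 1 then tag 0) or the Deadlock II schedule (both send first), the model reports a deadlock; with the send/receive order of the exchange example it runs to completion.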
Performance Visualization for Parallel Programs
- MPE is a software package for MPI programmers.
- useful tools for MPI programs, mainly performance visualization
- latest version is called MPE2
- current tools are:
  1. profiling libraries to create logfiles
  2. postmortem visualization of logfiles when program is executed
  3. shared-display parallel X graphics library
  4. ...