CS 584
Iterative Methods
Gaussian elimination is considered to be a direct method to solve a system.
An indirect method produces a sequence of values that converges to the solution of the system.
Computation is halted in an indirect algorithm when a specified accuracy is reached.
Why Iterative Methods?
Sometimes we don't need to be exact.
– Input has inaccuracy, etc.
– Only requires a few iterations
If the system is sparse, the matrix can be stored in a different format.
Iterative methods are usually stable.
– Errors are damped
Iterative Methods
Consider the following system.
\[
\begin{bmatrix} 7 & -6 \\ -8 & 9 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 3 \\ -4 \end{bmatrix}
\]
Now write out the equations
– solve for the ith unknown in the ith equation
\[
x_1 = \tfrac{6}{7} x_2 + \tfrac{3}{7}, \qquad x_2 = \tfrac{8}{9} x_1 - \tfrac{4}{9}
\]
Iterative Methods
Come up with an initial guess for each x_i.
Hopefully, the equations will produce better values, which can in turn be used to calculate still better values, and so on, converging to the answer.
Consider
Are systems of equations and finite element methods related?
Iterative Methods
Jacobi iteration
– Use all old values to compute new values
 k      x1         x2
 0   0.00000    0.00000
10   0.14865   -0.19820
20   0.18682   -0.24908
30   0.19662   -0.26215
40   0.19913   -0.26551
50   0.19977   -0.26637
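The table can be reproduced with a few lines of Python. This is a minimal sketch for this 2x2 system only, starting from the guess (0, 0); the function name `jacobi_2x2` is made up for illustration.

```python
def jacobi_2x2(iterations):
    """Jacobi iteration for 7*x1 - 6*x2 = 3, -8*x1 + 9*x2 = -4, starting from (0, 0)."""
    x1, x2 = 0.0, 0.0
    for _ in range(iterations):
        # Both right-hand sides are evaluated with the OLD x1 and x2
        # before either variable is overwritten.
        x1, x2 = (6.0 * x2 + 3.0) / 7.0, (8.0 * x1 - 4.0) / 9.0
    return x1, x2


for k in (0, 10, 20, 30, 40, 50):
    x1, x2 = jacobi_2x2(k)
    print(f"{k:2d} {x1:9.5f} {x2:9.5f}")
```

The values creep toward the exact solution x1 = 0.2, x2 = -4/15.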
Jacobi Iteration
The ith equation has the form
\[
\sum_{j=0}^{n-1} A[i,j]\, x[j] = b[i]
\]
Which can be rewritten as:
\[
x[i] = \frac{1}{A[i,i]} \left( b[i] - \sum_{j \neq i} A[i,j]\, x[j] \right)
\]
Jacobi Iteration
The vector (b - Ax) is zero when and if we get the exact answer.
Define this vector to be the residual r.
Now rewrite the solution equation, with the old value x[i] on the right-hand side:
\[
x_{\text{new}}[i] = \frac{r[i]}{A[i,i]} + x[i]
\]
void Jacobi(float A[][], float b[], float x[], float epsilon)
{
    int k = 0;
    float x1[];
    float r[];
    float r0norm;

    // Randomly select an initial x vector
    r = b - Ax;          // This involves matrix-vector mult etc.
    r0norm = ||r||2;     // This is basically the magnitude
    while (||r||2 > epsilon * r0norm) {
        for (j = 0; j < n; j++)
            x1[j] = r[j] / A[j,j] + x[j];
        x = x1;          // Adopt the new values before recomputing r
        r = b - Ax;
        k++;             // Count iterations
    }
}
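A runnable version of the pseudocode, as a sketch in plain Python with the matrix held as a list of rows. The helper names `residual` and `norm2` and the `max_iter` safeguard are additions for illustration and are not part of the slide.

```python
import math


def residual(A, b, x):
    """Return r = b - A x for a dense matrix stored as a list of rows."""
    return [b_i - sum(a_ij * x_j for a_ij, x_j in zip(row, x))
            for row, b_i in zip(A, b)]


def norm2(v):
    """Euclidean (2-) norm of a vector."""
    return math.sqrt(sum(v_i * v_i for v_i in v))


def jacobi(A, b, x, epsilon, max_iter=10_000):
    """Jacobi iteration; stops when ||r|| <= epsilon * ||r0|| or after max_iter sweeps."""
    x = list(x)
    r = residual(A, b, x)
    r0norm = norm2(r)
    k = 0
    while norm2(r) > epsilon * r0norm and k < max_iter:
        # Every component is updated from the old x through the residual.
        x = [x_j + r_j / A[j][j] for j, (x_j, r_j) in enumerate(zip(x, r))]
        r = residual(A, b, x)
        k += 1
    return x, k


x, k = jacobi([[7.0, -6.0], [-8.0, 9.0]], [3.0, -4.0], [0.0, 0.0], 1e-6)
print(x, k)
```

The three pieces of work per iteration are visible here: the norm, the update loop, and the matrix-vector product inside `residual`.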
Parallelization of Jacobi
3 main computations per iteration
– Inner product (2 norm)
– Loop calculating x[j]s
– Matrix-vector mult. to calculate r
If A[j,j], b[j], & r[j] are on the same proc.
– Loop requires no communication
Inner product and matrix-vector mult. require communication.
Inner Product
Suppose data is distributed row-wise
Inner product is simply dot product
– IP = Sum(x[j] * x[j])
This only requires a global sum collapse
– O(log n)
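The global sum collapse can be imitated serially. This is only a sketch: the "processors" are list entries, each holding one partial sum, and the names `tree_sum` and `distributed_inner_product` are hypothetical. It counts how many pairwise combining steps are needed.

```python
def tree_sum(partials):
    """Combine one partial sum per processor pairwise; return (total, steps)."""
    values = list(partials)
    steps = 0
    while len(values) > 1:
        # In one step, neighbouring pairs are added simultaneously.
        paired = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2 == 1:
            paired.append(values[-1])  # the odd one out waits for the next step
        values = paired
        steps += 1
    return values[0], steps


def distributed_inner_product(x, p):
    """Simulate IP = Sum(x[j] * x[j]) with x split into p contiguous blocks."""
    n = len(x)
    bounds = [n * q // p for q in range(p + 1)]
    # Local work: each processor squares and sums its own block.
    partials = [sum(v * v for v in x[bounds[q]:bounds[q + 1]]) for q in range(p)]
    return tree_sum(partials)


print(distributed_inner_product([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], 4))
```

The number of combining steps grows with the logarithm of the number of partial sums, which is where the logarithmic cost comes from.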
Matrix-Vector Multiplication
Again data is distributed row-wise
Each proc. requires all of the elements in the vector to calculate its part of the resulting answer.
This results in an all-to-all gather
– O(n log n)
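A serial sketch of the row-wise product. The gather is imitated by copying the full vector to every "process"; `all_gather` and `rowwise_matvec` are illustrative names, not calls from a real message-passing library.

```python
def all_gather(blocks):
    """Simulate an all-to-all gather: every process ends up with the full vector."""
    full = [v for block in blocks for v in block]
    return [list(full) for _ in blocks]


def rowwise_matvec(A_blocks, x_blocks):
    """Process q owns the rows A_blocks[q] and the matching slice x_blocks[q]."""
    gathered = all_gather(x_blocks)  # the only communication step
    y_blocks = []
    for rows, x_full in zip(A_blocks, gathered):
        # Purely local work once the whole vector is available.
        y_blocks.append([sum(a * v for a, v in zip(row, x_full)) for row in rows])
    return y_blocks


A_blocks = [[[7.0, -6.0]], [[-8.0, 9.0]]]  # two processes, one row each
x_blocks = [[1.0], [2.0]]
print(rowwise_matvec(A_blocks, x_blocks))
```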
Jacobi Iteration
Resulting cost for float (4 bytes)
\[
\begin{aligned}
T_{comm} &= \#\text{iterations} \cdot (T_{IP} + T_{MVM}) \\
T_{IP} &= \log p \cdot (t_s + t_w \cdot 4) \\
T_{MVM} &= p \log p \cdot \left( t_s + t_w \cdot \tfrac{nrows}{p} \cdot 4 \right)
\end{aligned}
\]
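The cost formulas are easy to evaluate for trial values. In this sketch the logarithm is assumed to be base 2, and t_s and t_w are taken to be the per-message startup time and the per-byte transfer time, which the slide does not define explicitly. The function name and the sample numbers are made up.

```python
import math


def jacobi_comm_time(iterations, p, nrows, t_s, t_w, bytes_per_value=4):
    """Evaluate T_comm = #iterations * (T_IP + T_MVM) from the formulas above."""
    log_p = math.log2(p)  # assumption: log base 2
    t_ip = log_p * (t_s + t_w * bytes_per_value)
    t_mvm = p * log_p * (t_s + t_w * (nrows / p) * bytes_per_value)
    return iterations * (t_ip + t_mvm)


# Hypothetical machine parameters, in arbitrary time units.
print(jacobi_comm_time(iterations=100, p=4, nrows=8, t_s=1.0, t_w=0.5))
```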
Iterative Methods
Gauss-Seidel
– Use the new values as soon as available
 k      x1         x2
 0   0.00000    0.00000
10   0.21977   -0.24909
20   0.20130   -0.26531
30   0.20008   -0.26659
40   0.20000   -0.26666
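The same 2x2 system, now with the new x1 used immediately. A minimal sketch starting from (0, 0); the function name `gauss_seidel_2x2` is made up. The results agree with the table to about four decimal places.

```python
def gauss_seidel_2x2(iterations):
    """Gauss-Seidel for 7*x1 - 6*x2 = 3, -8*x1 + 9*x2 = -4, starting from (0, 0)."""
    x1, x2 = 0.0, 0.0
    for _ in range(iterations):
        x1 = (6.0 * x2 + 3.0) / 7.0  # uses the old x2
        x2 = (8.0 * x1 - 4.0) / 9.0  # uses the x1 just computed
    return x1, x2


for k in (0, 10, 20, 30, 40):
    x1, x2 = gauss_seidel_2x2(k)
    print(f"{k:2d} {x1:9.5f} {x2:9.5f}")
```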
Gauss-Seidel Iteration
The basic Gauss-Seidel iteration is
\[
x_k[i] = \frac{1}{A[i,i]} \left( b[i] - \sum_{j=0}^{i-1} x_k[j]\, A[i,j] - \sum_{j=i+1}^{n-1} x_{k-1}[j]\, A[i,j] \right)
\]
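For a general n-by-n system, updating x in place gives exactly this mix of iteration-k and iteration-(k-1) values. A sketch for a dense matrix stored as a list of rows; `gauss_seidel_sweep` is an illustrative name.

```python
def gauss_seidel_sweep(A, b, x):
    """One in-place Gauss-Seidel sweep over a dense system stored as a list of rows."""
    n = len(b)
    for i in range(n):
        # x[0..i-1] already hold iteration-k values; x[i+1..n-1] are still from k-1.
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (b[i] - s) / A[i][i]
    return x


x = [0.0, 0.0]
for _ in range(40):
    gauss_seidel_sweep([[7.0, -6.0], [-8.0, 9.0]], [3.0, -4.0], x)
print(x)
```

Because x[i] depends on the x[j] just written, the loop over i cannot simply be split across processors when the matrix is dense.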
Gauss-Seidel Iteration
Rule: Always use the newest values of the previously computed variables.
Problem: Sequential?
– Gauss-Seidel is indeed sequential if the matrix is dense.
– Parallelism is a function of the sparsity and ordering of the equation matrix.
Gauss-Seidel Iteration
We can increase possible parallelism by changing the numbering of a system.
Parallelizing Red-Black GSI
Partitioning?
– Block checkerboard.
Communication?
– 2 phases per iteration.
  1- compute red cells using values from black cells
  2- compute black cells using values from red cells
– Communication is required for each phase.
Partitioning
P0 P1
P2 P3
Communication
P0 P1
P2 P3
Procedure Gauss-SeidelRedBlack
    while ( error > limit )
        send black values to neighbors
        recv black values from neighbors
        compute red values
        send red values to neighbors
        recv red values from neighbors
        compute black values
        compute error    /* only do every so often */
    endwhile
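A serial sketch of the two compute phases, with the send/recv steps left out. It assumes, purely for illustration, a 5-point Laplace stencil on a square grid with fixed boundary values; the slides do not name the underlying problem. Cells with (i + j) even are called red, and the error is taken to be the largest change in a sweep.

```python
def red_black_sweep(u):
    """One red-black Gauss-Seidel iteration on the interior of a 2-D grid u.

    Assumes the 5-point Laplace stencil: each interior cell becomes the
    average of its four neighbours. Boundary cells stay fixed.
    Returns the largest change made to any cell.
    """
    rows, cols = len(u), len(u[0])
    change = 0.0
    for colour in (0, 1):  # phase 1: red cells, phase 2: black cells
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                if (i + j) % 2 == colour:
                    # All four neighbours have the opposite colour.
                    new = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
                    change = max(change, abs(new - u[i][j]))
                    u[i][j] = new
    return change


# 6x6 grid: top edge held at 1.0, everything else starts at 0.0.
grid = [[1.0] * 6] + [[0.0] * 6 for _ in range(5)]
sweeps = 1
while red_black_sweep(grid) > 1e-6:
    sweeps += 1
print(sweeps, grid[2][2])
```

Within one phase no cell reads another cell of the same color, so all cells of that color could be computed at the same time.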
Extending Red-Black Coloring
Goal: Produce a graph coloring scheme such that no node has a neighbor of the same color.
Finite element methods produce graphs in which each node has only 4 neighbors.
– Two colors suffice
What about more complex graphs?
More complex graphs
Use graph coloring heuristics.
Number the nodes one color at a time.
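One common heuristic is greedy coloring; the slides do not say which heuristic is meant, so this is only an example. The graph is given as an adjacency dictionary, and the function names and the small test graph are hypothetical.

```python
def greedy_coloring(adjacency):
    """Give each node the smallest color not used by an already-colored neighbor."""
    color = {}
    for node in adjacency:
        taken = {color[nb] for nb in adjacency[node] if nb in color}
        c = 0
        while c in taken:
            c += 1
        color[node] = c
    return color


def number_by_color(color):
    """Renumber the nodes one color at a time: all of color 0 first, then color 1, ..."""
    ordered = sorted(color, key=lambda node: color[node])
    return {node: index for index, node in enumerate(ordered)}


# Hypothetical example: a triangle (0, 1, 2) with a tail node 3 attached to node 2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
colors = greedy_coloring(graph)
print(colors)
print(number_by_color(colors))
```

Nodes of one color share no edges, so their unknowns can be updated in parallel, just like the red or black cells of the grid. Greedy coloring does not promise the fewest possible colors.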
Conclusion
Iterative methods are used when an exact answer is not computable or needed.
Gauss-Seidel converges faster than Jacobi but parallelism is trickier.
Finite element codes are simply systems of equations
– Solve with either Jacobi or Gauss-Seidel