Lecture 8: Fast Linear Solvers (Part 3)
1
Cholesky Factorization
โข Matrix ๐ด is symmetric if ๐ด = ๐ด๐.
โข Matrix ๐ด is positive definite if for all ๐ โ ๐, ๐๐๐จ๐ > 0.
โข A symmetric positive definite matrix ๐ด has Cholesky factorization ๐ด = ๐ฟ๐ฟ๐, where ๐ฟ is a lower triangular matrix with positive diagonal entries.
โข Example.
โ ๐ด = ๐ด๐ with ๐๐๐ > 0 and ๐๐๐ > |๐๐๐|๐โ ๐ is positive definite.
2
๐ด =
๐11 ๐12 โฆ ๐1๐๐12 ๐22 โฆ ๐2๐โฎ
๐1๐
โฎ๐2๐
โฑโฆ
โฎ๐๐๐
=
๐11 0 โฆ 0๐21 ๐22 โฆ 0โฎ๐๐1
โฎ๐๐2
โฑโฆ
โฎ๐๐๐
๐11 ๐21 โฆ ๐๐10 ๐22 โฆ ๐๐2โฎ0
โฎ0
โฑโฆ
โฎ๐๐๐
3
In 2 ร 2 matrix size case ๐11 ๐21๐21 ๐22
=๐11 0๐21 ๐22
๐11 ๐210 ๐22
๐11 = ๐11; ๐21 = ๐21/๐11; ๐22 = ๐22 โ ๐212
4
Submatrix Cholesky Factorization Algorithm
5
for ๐ = 1 to ๐ ๐๐๐ = ๐๐๐ for ๐ = ๐ + 1 to ๐ ๐๐๐ = ๐๐๐/๐๐๐ end for ๐ = ๐ + 1 to ๐ // reduce submatrix for ๐ = ๐ to ๐ ๐๐๐ = ๐๐๐ โ ๐๐๐๐๐๐
end end end
Equivalent to ๐๐๐๐๐๐ due
to symmetry
6
Remark:
1. This is a variation of Gaussian Elimination algorithm.
2. Storage of matrix ๐ด is used to hold matrix ๐ฟ .
3. Only lower triangle of ๐ด is used (See ๐๐๐ = ๐๐๐ โ ๐๐๐๐๐๐).
4. Pivoting is not needed for stability.
5. About ๐3/6 multiplications and about ๐3/6 additions are required.
Data Access Pattern
Read only
Read and write
Column Cholesky Factorization Algorithm
7
for ๐ = 1 to ๐ for ๐ = 1 to ๐ โ 1 for ๐ = ๐ to ๐ ๐๐๐ = ๐๐๐ โ ๐๐๐๐๐๐
end end ๐๐๐ = ๐๐๐
for ๐ = ๐ + 1 to ๐ ๐๐๐ = ๐๐๐/๐๐๐
end end
Data Access Pattern
8
Read only
Read and write
Parallel Algorithm
โข Parallel algorithms are similar to those for LU factorization.
9
References โข X. S. Li and J. W. Demmel, SuperLU_Dist: A scalable
distributed-memory sparse direct solver for unsymmetric linear systems, ACM Trans. Math. Software 29:110-140, 2003
โข P. Hรฉnon, P. Ramet, and J. Roman, PaStiX: A high-performance parallel direct solver for sparse symmetric positive definite systems, Parallel Computing 28:301-321, 2002
QR Factorization and HouseHolder Transformation
Theorem Suppose that matrix ๐ด is an ๐ ร ๐ matrix with linearly independent columns, then ๐ด can be factored as ๐ด = ๐๐
where ๐ is an ๐ ร ๐ matrix with orthonormal columns and ๐ is an invertible ๐ ร ๐ upper triangular matrix.
10
QR Factorization by Gram-Schmidt Process
Consider matrix ๐ด = [๐1 ๐2 โฆ |๐๐]
Then,
๐1 = ๐1, ๐1 =๐1
||๐1||
๐2 = ๐2 โ (๐2 โ ๐1)๐1, ๐2 =๐2
||๐2||
โฆ
๐๐ = ๐๐ โ ๐๐ โ ๐1 ๐1 โโฏโ ๐๐ โ ๐๐โ1 ๐๐โ1, ๐๐ =๐๐
||๐๐||
๐ด = ๐1 ๐2 โฆ ๐๐
= ๐1 ๐2 โฆ ๐๐
๐1 โ ๐1 ๐2 โ ๐1 โฆ ๐๐ โ ๐1
0 ๐2 โ ๐2 โฆ ๐๐ โ ๐2
โฎ โฎ โฑ โฎ0 0 โฆ ๐๐ โ ๐๐
= ๐๐
11
Classical/Modified Gram-Schmidt Process
for j=1 to n
๐๐ = ๐๐
for i = 1 to j-1
๐๐๐ = ๐๐
โ๐๐ (๐ถ๐๐๐ ๐ ๐๐๐๐)
๐๐๐ = ๐๐โ๐๐ (๐๐๐๐๐๐๐๐)
๐๐ = ๐๐ โ ๐๐๐๐๐
endfor
๐๐๐ = ||๐๐||2
๐๐ = ๐๐/๐๐๐
endfor 12
Householder Transformation
Let ๐ฃ โ ๐ ๐ be a nonzero vector, the ๐ ร ๐ matrix
๐ป = ๐ผ โ 2๐๐๐
๐๐๐
is called a Householder transformation (or reflector).
โข Alternatively, let ๐ = ๐/||๐||, ๐ป can be rewritten as
๐ป = ๐ผ โ 2๐๐๐.
Theorem. A Householder transformation ๐ป is symmetric and orthogonal, so ๐ป = ๐ป๐ = ๐ปโ1.
13
1. Let vector ๐ be perpendicular to ๐.
๐ผ โ 2๐๐๐
๐๐๐๐ = ๐ โ 2
๐(๐๐๐)
๐๐๐= ๐
2. Let ๐ =๐
| ๐ |. Any vector ๐ can be written as
๐ = ๐ + ๐๐๐ ๐. ๐ผ โ 2๐๐๐ ๐ = ๐ผ โ 2๐๐๐ ๐ + ๐๐๐ ๐ = ๐ โ ๐๐๐ ๐
14
โข Householder transformation is used to selectively zero out blocks of entries in vectors or columns of matrices.
โข Householder is stable with respect to round-off error.
15
For a given a vector ๐, find a Householder
transformation, ๐ป, such that ๐ป๐ = ๐
10โฎ0
= ๐๐1
โ Previous vector reflection (case 2) implies that vector ๐ is in parallel with ๐ โ ๐ป๐.
โ Let ๐ = ๐ โ ๐ป๐ = ๐ โ ๐๐1, where ๐ = ยฑ| ๐ | (when the arithmetic is exact, sign does not matter).
โ ๐ = ๐/|| ๐||
โ In the presence of round-off error, use
๐ = ๐ + ๐ ๐๐๐ ๐ฅ1 | ๐ |๐1
to avoid catastrophic cancellation. Here ๐ฅ1 is the first entry in the vector ๐.
Thus in practice, ๐ = โ๐ ๐๐๐ ๐ฅ1 | ๐ | 16
Algorithm for Computing the Householder Transformation
17
๐ฅ๐ = max { ๐ฅ1 , ๐ฅ2 , โฆ , ๐ฅ๐ } for ๐ = 1 to ๐ ๐ฃ๐ = ๐ฅ๐/๐ฅ๐ end
๐ผ = ๐ ๐๐๐(๐ฃ1)[๐ฃ12 + ๐ฃ2
2 +โฏ+ ๐ฃ๐2]1/2
๐ฃ1 = ๐ฃ1 + ๐ผ ๐ผ = โ๐ผ๐ฅ๐ ๐ = ๐/| ๐ |
Remark: 1. The computational complexity is ๐(๐). 2. ๐ป๐ is an ๐ ๐ operation compared to ๐(๐2) for a general matrix-vector product. ๐ป๐ = ๐ โ ๐๐(๐๐ป๐).
QR Factorization by HouseHolder Transformation
Key idea:
Apply Householder transformation successively to columns of matrix to zero out sub-diagonal entries of each column.
18
โข Stage 1 โ Consider the first column of matrix ๐ด and determine a
HouseHolder matrix ๐ป1 so that ๐ป1
๐11๐21โฎ
๐๐1
=
๐ผ10โฎ0
โ Overwrite ๐ด by ๐ด1, where
๐ด1 =
๐ผ1 ๐12โ โฏ ๐1๐
โ
0 ๐22โ โฎ ๐2๐
โ
โฎ โฎ โฎ โฎ0 ๐๐2
โ โฏ ๐๐๐โ
= ๐ป1๐ด
Denote vector which defines ๐ป1 vector ๐1. ๐1 = ๐1/| ๐1 |
Remark: Column vectors
๐12โ
๐22โ
โฎ๐๐2โ
โฆ
๐1๐โ
๐2๐โ
โฎ๐๐๐โ
should be computed by
๐ป๐ = ๐ โ ๐๐(๐๐ป๐). 19
โข Stage 2 โ Consider the second column of the updated matrix
๐ด โก ๐ด1 = ๐ป1๐ด and take the part below the diagonal to determine a Householder matrix ๐ป2
โ so that
๐ป2โ
๐22โ
๐32โ
โฎ๐๐2โ
=
๐ผ2
0โฎ0
Remark: size of ๐ป2โ is (๐ โ 1) ร (๐ โ 1).
Denote vector which defines ๐ป2โ vector ๐2. ๐2 = ๐2/| ๐2 |
โ Inflate ๐ป2โ to ๐ป2 where
๐ป2 =1 00 ๐ป2
โ
Overwrite ๐ด by ๐ด2 โก ๐ป2๐ด1
20
โข Stage k
โ Consider the ๐๐กโ column of the updated matrix ๐ด and take the part below the diagonal to determine a Householder matrix ๐ป๐
โ so that
๐ป๐โ
๐๐๐โ
๐๐+1,๐โ
โฎ๐๐,๐โ
=
๐ผ๐
0โฎ0
Remark: size of ๐ป๐โ is (๐ โ ๐ + 1) ร (๐ โ ๐ + 1).
Denote vector which defines ๐ป๐โ vector ๐๐. ๐๐ = ๐๐/| ๐๐ |
โ Inflate ๐ป๐โ to ๐ป๐ where
๐ป๐ =๐ผ๐โ1 0
0 ๐ป๐โ , ๐ผ๐โ1 is (๐ โ 1) ร (๐ โ 1) identity matrix.
Overwrite ๐ด by ๐ด๐ โก ๐ป๐๐ด๐โ1
21
โข After (๐ โ 1) stages, matrix ๐ด is transformed into an upper triangular matrix ๐ .
โข ๐ = ๐ด๐โ1 = ๐ป๐โ1๐ด๐โ2 = โฏ =๐ป๐โ1๐ป๐โ2โฆ๐ป1๐ด
โข Set ๐๐ = ๐ป๐โ1๐ป๐โ2โฆ๐ป1. Then ๐โ1 = ๐๐. Thus ๐ = ๐ป1
๐๐ป2๐ โฆ๐ป๐โ1
๐
โข ๐ด = ๐๐
22
Example. ๐ด =1 11 21 3
.
Find ๐ป1 such that ๐ป1
111
=๐ผ100
.
By ๐ = โ๐ ๐๐๐ ๐ฅ1 | ๐ |,
๐ผ1 = โ 3 = โ1.721.
๐ข1 = 0.8881, ๐ข2 = ๐ข3 = 0.3250.
By ๐ป๐ = ๐ โ ๐๐ ๐๐ป๐ , ๐ป1
123
=โ3.46410.36611.3661
๐ป1๐ด =โ1.721 โ3.4641
0 0.36610 1.3661
23
Find ๐ป2โ and ๐ป2, where
๐ป2โ 0.36611.3661
=๐ผ2
0
So ๐ผ2 = โ1.4143, ๐2 =0.79220.6096
So ๐ = ๐ป2๐ป1๐ด =โ1.7321 โ3.4641
0 โ1.41430 0
๐ = (๐ป2๐ป1)๐= ๐ป1
๐๐ป2๐
=โ0.5774 0.7071 โ0.4082โ0.5774 0 0.8165โ0.5774 โ0.7071 โ0.4082
24
๐ด๐ = ๐ โ ๐๐ ๐ = ๐
โ ๐๐๐๐ ๐ = ๐๐๐ โ ๐ ๐ = ๐๐๐
Remark: Q rarely needs explicit construction
25
Parallel Householder QR
โข Householder QR factorization is similar to Gaussian elimination for LU factorization.
โ Additions + multiplications: ๐4
3๐3 for QR versus
๐2
3๐3 for LU
โข Forming Householder vector ๐๐is similar to computing multipliers in Gaussian elimination.
โข Subsequent updating of remaining unreduced portion of matrix is similar to Gaussian elimination.
โข Parallel Householder QR is similar to parallel LU. Householder vectors need to broadcast among columns of matrix.
26
References
โข M. Cosnard, J. M. Muller, and Y. Robert. Parallel QR decomposition of a rectangular matrix, Numer. Math. 48:239-249, 1986
โข A. Pothen and P. Raghavan. Distributed orthogonal factorization: Givens and Householder algorithms, SIAM J. Sci. Stat. Comput. 10:1113-1134, 1989
27
Top Related