Sparse factorization using low rank submatrices
Transcript of the slide deck by Cleve Ashcraft (mumps.enseeiht.fr/doc/ud_2010/Ashcraft_talk.pdf)
Sparse factorization using low rank submatrices
Cleve Ashcraft, LSTC
2010 MUMPS User Group Meeting, April 15-16, 2010
Toulouse, FRANCE
ftp.lstc.com:outgoing/cleve/MUMPS10 Ashcraft.pdf
LSTC (Livermore Software Technology Corporation)
Livermore, California 94550
• Founded by John Hallquist of LLNL, 1980’s
• Public domain versions of DYNA and NIKE codes
• LS-DYNA: implicit/explicit, nonlinear finite element analysis code
• Multiphysics capabilities
– Fluid/structure interaction
– Thermal analysis
– Acoustics
– Electromagnetics
Multifrontal Algorithm
• very large, sparse LDL^T and LDU factorizations
• tree structure organizes factor storage, solve and factor operations
• medium-to-large sparse linear systems located at each leaf node of the tree
• medium-sized dense linear systems located at each interior node of the tree
• dense matrix-matrix operations at each interior node
• sparse matrix-matrix adds between nodes
Multifrontal tree
[Figure: multifrontal tree with 72 nodes, numbered 0-71]
Multifrontal tree – polar representation
[Figure: multifrontal tree, polar representation, nodes 0-71]
Multifrontal Algorithm
• 40M equations, present frontier at LSTC
• serial, SMP, MPP, hybrid, GPU
• large problems, > 1M-10M dof, require out-of-core storage of factor entries, even on distributed memory systems
• IO cost largely hidden during the factorization
• IO cost dominant during the solves
• eigensolver ⟹ several right hand sides
• many applications, e.g., Newton's method, have a single right hand side
Low Rank Approximations
• hierarchical matrices: Hackbusch, Bebendorf, Leborne, others
• semi-separable matrices, Gu, others
• submatrices are numerically rank deficient
• method of choice for Boundary Elements (BEM)
• now applied to Finite Elements (FEM)
• A ≈ X Y^T, where A is m × n, X is m × r, Y is n × r
• storage = r(m + n) vs mn
• reduction in ops = r / min(m, n)
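The storage and operation counts above can be checked with a small NumPy sketch; the matrix, rank, and noise level here are made up for illustration:

```python
import numpy as np

# Build a test matrix with a sharp spectral drop, as arises for
# well-separated interactions, then form the rank-r approximation
# A ~ X Y^T via a truncated SVD.
rng = np.random.default_rng(0)
m, n, r = 60, 40, 5
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
A += 1e-10 * rng.standard_normal((m, n))   # small full-rank perturbation

U, s, Vt = np.linalg.svd(A, full_matrices=False)
X = U[:, :r] * s[:r]          # m x r
Y = Vt[:r, :].T               # n x r
print("relative error:", np.linalg.norm(A - X @ Y.T) / np.linalg.norm(A))
print("low-rank storage:", r * (m + n), "vs dense storage:", m * n)
```

With r = 5, m = 60, n = 40 the low-rank form stores 500 entries instead of 2400, matching the r(m + n) vs mn count on the slide.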
Multifrontal Algorithm + Low Rank Approximations
• At each leaf node in the multifrontal tree — use standard multifrontal
• At each interior node in the multifrontal tree — low rank matrix-matrix multiplies and sums
• Between nodes — low rank matrix sums
• Dramatic reduction in storage
• Dramatic reduction in operations
• Excellent approximation properties for finite element operators
• Our experience is with potential equations, elasticity with solids and shells
Outline
• graph, tree, matrix perspectives
• experiments – 2-D potential equation
• low rank computations
• blocking strategies
• summary
One subgraph
[Figure: one subgraph, mesh on [−1, 1] × [−1, 1]]
One subtree
[Figure: subdomains Ω_I and Ω_J, separator S, external boundary ∂S]
One submatrix
[ A_{ΩI,ΩI}   ·           A_{ΩI,S}   A_{ΩI,∂S} ]
[ ·           A_{ΩJ,ΩJ}   A_{ΩJ,S}   A_{ΩJ,∂S} ]
[ A_{S,ΩI}    A_{S,ΩJ}    A_{S,S}    A_{S,∂S}  ]
[ A_{∂S,ΩI}   A_{∂S,ΩJ}   A_{∂S,S}   A_{∂S,∂S} ]

(· denotes a zero block: Ω_I and Ω_J do not couple to each other)
Portion of original matrix
[Spy plot: 1200 × 1200 portion, nz = 9052]
A ordered by domains, separator and external boundary
Portion of factor matrix
[Spy plot: 1200 × 1200 portion, nz = 48899]
L ordered by domains, separator and external boundary
Schur complement matrix
[ A_{S,S}    A_{S,∂S}  ]
[ A_{∂S,S}   A_{∂S,∂S} ]  =

[Spy plot: 160 × 160, nz = 20659]
Schur complement, separator and external boundary
Outline
• graph, tree, matrix perspectives
• experiments – 2-D potential equation
• low rank computations
• blocking strategies
• summary
Computational experiments
• Compute L_{S,S}, L_{∂S,S} and A_{∂S,∂S}
• Find domain decompositions of S and ∂S
• Form block matrices, e.g., L_{S,S} = ∑_{K≥J} L_{K,J}
• Find singular value decompositions of each L_{K,J}
• Collect all singular values: σ = ∑_{K≥J} ∑_{i=1}^{min(|K|,|J|)} σ_i^{(K,J)}
• Split matrix L_{S,S} = M_{S,S} + N_{S,S} using singular values σ
• We want ‖N_{S,S}‖_F ≤ 10^{-14} ‖L_{S,S}‖_F
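A minimal sketch of this split: take the SVD of each block, pool all singular values, and discard the smallest ones whose combined energy stays under the Frobenius budget. The uniform 2 × 2 block partition and the tolerance (looser than the slide's 10^{-14}, to keep the sketch robust to roundoff) are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
L = np.tril(rng.standard_normal((100, 100)))   # stand-in for L_{S,S}
tol = 1e-8                                     # slide uses 1e-14

# Hypothetical uniform partition into lower-triangular 50x50 blocks.
offsets = [(0, 0), (50, 0), (50, 50)]
blocks = [L[i:i+50, j:j+50] for i, j in offsets]
svds = [np.linalg.svd(B, full_matrices=False) for B in blocks]

# Pool singular values; drop the smallest while sum of dropped
# sigma^2 stays within (tol * ||L||_F)^2.
all_sigma = np.sort(np.concatenate([s for _, s, _ in svds]))
budget = (tol * np.linalg.norm(L)) ** 2
k = np.searchsorted(np.cumsum(all_sigma ** 2), budget)
cutoff = all_sigma[k - 1] if k > 0 else 0.0

# Rebuild M block by block, keeping sigma > cutoff; N is the remainder.
M = np.zeros_like(L)
for (U, s, Vt), (i, j) in zip(svds, offsets):
    keep = s > cutoff
    M[i:i+50, j:j+50] = (U[:, keep] * s[keep]) @ Vt[keep, :]
N = L - M
print("||N||_F / ||L||_F =", np.linalg.norm(N) / np.linalg.norm(L))
```

Because the dropped singular values live in mutually disjoint blocks, their squared sum bounds ‖N‖_F² directly, which is what makes a single global cutoff valid.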
255 × 255 diagonal block factor L_{S,S}
43% dense, relative accuracy 10^{-14}

[Spy plot of L_{S,S}, 255 × 255]
754 × 255 lower block factor L_{∂S,S}
16% dense, relative accuracy 10^{-14}

[Spy plot of L_{∂S,S}, 754 × 255]
754 × 754 update matrix A_{∂S,∂S}
21% dense, relative accuracy 10^{-14}

[Spy plot of A_{∂S,∂S}, 754 × 754]
255 × 255 factor matrix L_{S,S}: storage vs accuracy

[Plot: log10(1 − ‖M‖_F/‖A‖_F) versus fraction of dense storage, approximating the 255 × 255 separator factor matrix L_{3,3}; curves for 2 × 2 through 8 × 8 block partitions]
754 × 255 factor matrix L_{∂S,S}: storage vs accuracy

[Plot: log10(1 − ‖M‖_F/‖A‖_F) versus fraction of dense storage, approximating the 754 × 255 interface-separator factor matrix L_{4,3}; curves for 3 × 1, 6 × 2, 9 × 3 and 12 × 4 block partitions]
754 × 754 update matrix A_{∂S,∂S}: storage vs accuracy

[Plot: log10(1 − ‖M‖_F/‖A‖_F) versus fraction of dense storage, approximating the 754 × 754 Schur complement matrix; curves for 2 × 2 through 8 × 8 block partitions]
Outline
• graph, tree, matrix perspectives
• experiments – 2-D potential equation
• low rank computations
• blocking strategies
• summary
How to compute low rank submatrices?

• SVD – singular value decomposition
  A = U Σ V^T
  where U and V are orthogonal and Σ is diagonal
• Gold standard, expensive, O(n^3) ops
• QR factorization
  A P = Q R
  where Q is orthogonal, R is upper triangular, and P is a permutation matrix
• Silver standard, moderate cost, O(r n^2) ops
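The gold/silver comparison can be illustrated with SciPy's column-pivoted QR (assuming SciPy is available; the rank-8 test matrix is synthetic): the row norms of R track the singular values of A, and both expose the numerical rank.

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(2)
A = rng.standard_normal((80, 8)) @ rng.standard_normal((8, 60))  # rank 8

# Silver standard: column-pivoted QR, A P = Q R.
Q, R, piv = qr(A, pivoting=True, mode='economic')
row_norms = np.linalg.norm(R, axis=1)

# Gold standard: singular values from the SVD.
sigma = np.linalg.svd(A, compute_uv=False)

# Both sequences drop sharply after index 8, revealing the rank.
print(np.round(row_norms[:10], 3))
print(np.round(sigma[:10], 3))
```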
row norms of R vs singular values of A
754 × 754 update matrix A_{∂S,∂S}
94 × 94 submatrix size; near, mid and far interactions

[Plot: row norms ‖R_{i,:}‖_2 versus singular values σ_i, for near, mid and far interactions; 754 × 754 A using 8 segments]
SVD vs QR with column pivoting — conclusions:

• Column pivoting QR factorization does well.
• Row norms of R track singular values σ.
• The numerical rank of R is greater than needed, but not that much greater.
• For more accuracy/less storage, two-sided orthogonal factorizations
  – A P = U L V^T, U and V orthogonal, L triangular
  – P A Q = U B V^T, U and V orthogonal, B bidiagonal
  track the singular values very closely.
Type of approximation of submatrices

• submatrix L_{I,J} of L_{∂S,S}, ‖L_{I,J}‖_F = 2.92 × 10^{-2}

  factorization | numerical rank | total entries
  none          | —              | 4900
  QR            | 20             | 2681
  ULV           | 18             | 2680
  SVD           | 18             | 2538

• submatrix L_{I,J} of L_{∂S,S}, ‖L_{I,J}‖_F = 2.04 × 10^{-3}

  factorization | numerical rank | total entries
  none          | —              | 4900
  QR            | 14             | 1896
  ULV           | 10             | 1447
  SVD           | 10             | 1410
Operations with low rank submatrices

A = U_A V_A^T,  B = U_B V_B^T,  C = U_C V_C^T

See Bebendorf, "Hierarchical Matrices", Chapter 1

• Multiplication A = B C, rank(A) ≤ min(rank(B), rank(C))

  A = U_A V_A^T = (U_B V_B^T)(U_C V_C^T) = B C
    = U_B (V_B^T U_C) V_C^T
    = U_B ((V_B^T U_C) V_C^T)
    = (U_B (V_B^T U_C)) V_C^T

• Addition A = B + C, rank(A) ≤ rank(B) + rank(C)

  A = U_A V_A^T = (U_B V_B^T) + (U_C V_C^T) = B + C
    = [U_B  U_C] [V_B  V_C]^T
Multiplication A = B C

A = U_A V_A^T,  B = U_B V_B^T,  C = U_C V_C^T

With U_B of size m × h, V_B^T of size h × l, U_C of size l × k, V_C^T of size k × n:

A = B C = ((m × h)(h × l)) ((l × k)(k × n))
        = (m × min(h, k)) (min(h, k) × n)
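The dimension bookkeeping above can be sketched in NumPy; the small core V_B^T U_C (h × k) is the only product that touches both operands, so the dense m × l and l × n matrices are never formed. Sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
m, l, n, h, k = 50, 40, 30, 6, 4
UB, VB = rng.standard_normal((m, h)), rng.standard_normal((l, h))  # B = UB VB^T
UC, VC = rng.standard_normal((l, k)), rng.standard_normal((n, k))  # C = UC VC^T

core = VB.T @ UC          # h x k core, the only cross-factor work
UA = UB @ core            # m x k, so A = UA @ VC.T has rank <= min(h, k)
A_lowrank = UA @ VC.T

A_dense = (UB @ VB.T) @ (UC @ VC.T)
print(np.allclose(A_lowrank, A_dense))   # True
```

Grouping as (U_B (V_B^T U_C)) V_C^T costs O((m + n)hk) instead of the O(mln) dense product.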
Near-near matrix product
[Plot: singular values σ(L_{2,1}) and σ(L_{2,1} L_{2,1}^T) for the near-near interaction]
mid-near matrix product
[Plot: singular values σ(L_{2,1}), σ(L_{1,1}) and σ(L_{2,1} L_{1,1}^T) for the mid-near interaction]
far-near matrix product
[Plot: singular values σ(L_{5,1}), σ(L_{6,1}) and σ(L_{5,1} L_{6,1}^T) for the far-near interaction]
mid-mid matrix product
[Plot: singular values σ(L_{1,1}) and σ(L_{1,1} L_{1,1}^T) for the mid-mid interaction]
far-mid matrix product
[Plot: singular values σ(L_{6,1}) and σ(L_{6,1} L_{1,1}^T) for the far-mid interaction]
far-far matrix product
[Plot: singular values σ(L_{6,1}) and σ(L_{6,1} L_{6,1}^T) for the far-far interaction]
Addition A = B + C

rank(A) ≤ rank(B) + rank(C)

A = U_A V_A^T = (U_B V_B^T) + (U_C V_C^T) = B + C
  = [U_B  U_C] [V_B  V_C]^T
  = (Q_1 R_1)(Q_2 R_2)^T
  = Q_1 (R_1 R_2^T) Q_2^T
  = Q_1 (Q_3 R_3) Q_2^T
  = (Q_1 Q_3)(R_3 Q_2^T)
  = (Q_1 Q_3)(Q_2 R_3^T)^T

• R_1 R_2^T usually has low numerical rank
• examples follow
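A sketch of the recompression chain above, substituting an SVD of the small core R_1 R_2^T for the slide's QR step Q_3 R_3 (same idea: reveal and truncate the numerical rank of the core). Sizes and names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, rb, rc = 60, 50, 5, 4
UB, VB = rng.standard_normal((m, rb)), rng.standard_normal((n, rb))  # B
UC, VC = rng.standard_normal((m, rc)), rng.standard_normal((n, rc))  # C

# Stack the factors and QR both stacks: [UB UC] = Q1 R1, [VB VC] = Q2 R2.
Q1, R1 = np.linalg.qr(np.hstack([UB, UC]))   # m x (rb+rc)
Q2, R2 = np.linalg.qr(np.hstack([VB, VC]))   # n x (rb+rc)
core = R1 @ R2.T                             # small (rb+rc) x (rb+rc)

# The core's SVD reveals the numerical rank of B + C; truncate there.
U3, s3, V3t = np.linalg.svd(core)
keep = s3 > 1e-14 * s3[0]
UA = Q1 @ (U3[:, keep] * s3[keep])
VA = Q2 @ V3t[keep, :].T

A_dense = UB @ VB.T + UC @ VC.T
print(np.allclose(UA @ VA.T, A_dense))   # True
```

All the expensive work is on (rb + rc)-wide tall matrices and a tiny core, never on the dense m × n sum.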
update matrix, diagonal block
A_{3,3} = L_{3,1} L_{3,1}^T + L_{3,2} L_{3,2}^T + L_{3,3} L_{3,3}^T

[Plot: log10(σ) for σ(near), σ(sum), σ(mid) and σ(far); update matrix, self interaction]
update matrix, mid-distance off-diagonal block
A_{5,3} = L_{5,1} L_{3,1}^T + L_{5,2} L_{3,2}^T + L_{5,3} L_{3,3}^T

[Plot: log10(σ) for σ(sum), σ(near), σ(mid) and σ(far); update matrix, mid-range interaction]
update matrix, far-distance off-diagonal block
A_{7,3} = L_{7,1} L_{3,1}^T + L_{7,2} L_{3,2}^T + L_{7,3} L_{3,3}^T

[Plot: log10(σ) for σ(sum), σ(near), σ(mid) and σ(far); update matrix, far-range interaction]
Outline
• graph, tree, matrix perspectives
• experiments – 2-D potential equation
• low rank computations
• blocking strategies
• summary
Blocking Strategies
• Active data structures

  [ L_{J,J}               ]
  [ L_{∂J,J}   A_{∂J,∂J}  ]

• partition J and ∂J independently
• for 2-d problems, J and ∂J are 1-d manifolds
• for 3-d problems, J and ∂J are 2-d manifolds
• we need mesh partitioning of the separator and the boundary of a region J
• each index set of a partition is a segment
Segment partition
[Figure: segment wirebasket domain decomposition]
• L_{J,J} = ∑_{σ,τ} L_{σ,τ},  σ × τ ⊆ J × J    [spy plot, 255 × 255]

• L_{∂J,J} = ∑_{σ,τ} L_{σ,τ},  σ × τ ⊆ ∂J × J    [spy plot, 754 × 255]

• A_{∂J,∂J} = ∑_{σ,τ} A_{σ,τ},  σ × τ ⊆ ∂J × ∂J    [spy plot, 754 × 754]
Strategy I
• The partition of ∂J (local to node J) conforms to the partitions of ancestors K, K ∩ ∂J ≠ ∅
• Advantages:
  – Update assembly is simplified:
    A^{(p(J))}_{σ2,τ2} = A^{(p(J))}_{σ2,τ2} + A^{(J)}_{σ1,τ1},  where σ1 ⊆ σ2, τ1 ⊆ τ2
  – One destination for each A^{(J)}_{σ1,τ1}
• Disadvantages:
  – Partition of ∂J can be fragmented: more segments, smaller size, less efficient storage
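A toy sketch of the simplified assembly, with hypothetical index sets: because the child's segment pair is contained in a single parent segment pair, the child update lands in one destination block with a single scatter-add.

```python
import numpy as np

parent = np.zeros((6, 6))                   # stand-in for parent front A^(p(J))
sigma2, tau2 = {0, 1, 2, 3}, {2, 3, 4, 5}   # one parent segment pair
sigma1, tau1 = [1, 2], [3, 4]               # child segment pair, made up
assert set(sigma1) <= sigma2 and set(tau1) <= tau2   # Strategy I conformity

child_update = np.ones((2, 2))              # stand-in for A^(J)_{sigma1,tau1}
parent[np.ix_(sigma1, tau1)] += child_update   # exactly one scatter-add
print(parent.sum())   # → 4.0
```

Under Strategy II the containment assertion can fail, and the child update must be sliced across several parent blocks, which is the added complexity the next slide describes.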
Strategy II
• The partition of ∂J (local to node J) need not conform to the partitions of ancestors
• Advantages:
  – Partition of ∂J can be optimized better, since ∂J is small and localized
• Disadvantages:
  – Update assembly is more complex:
    A^{(p(J))}_{σ1∩σ2, τ1∩τ2} = A^{(p(J))}_{σ1∩σ2, τ1∩τ2} + A^{(J)}_{σ1∩σ2, τ1∩τ2},  where σ1 ∩ σ2 ≠ ∅, τ1 ∩ τ2 ≠ ∅
  – Several destinations for a submatrix A^{(J)}_{σ1,τ1}
Outline
• graph, tree, matrix perspectives
• experiments – 2-D potential equation
• low rank computations
• blocking strategies
• summary
Multifrontal tree / Multifrontal Factorization

[Figure: multifrontal tree, polar representation, nodes 0-71]

• each leaf node is a d-dimensional sparse FEM matrix
• each interior node is a (d − 1)-dimensional dense BEM matrix
• use low rank storage and computation at each interior node
Call to action
• progress to date in low rank factorizations driven by iterative methods
• work needed from direct methods community
• start from industrial strength multifrontal code
  – serial, SMP, MPP, hybrid, GPU
  – pivoting for stability, out-of-core, singular systems, null spaces
Call to action
• Many challenges
  – partition of space, partition of interface
  – added programming complexity of low rank matrices
  – challenges to pivot for stability
  – challenges/opportunities to implement in parallel
• Payoff will be huge
  – reduction in storage footprint
  – reduction in computational work
  – take direct methods to next level of problem size