THE UNIVERSITY OF CHICAGO
A GENERIC MULTIRESOLUTION PRECONDITIONER FOR SPARSE SYMMETRIC
SYSTEMS
A DISSERTATION SUBMITTED TO
THE FACULTY OF THE DIVISION OF THE PHYSICAL SCIENCES
IN CANDIDACY FOR THE DEGREE OF
MASTER OF SCIENCE
DEPARTMENT OF COMPUTER SCIENCE
BY
PRAMOD KAUSHIK MUDRAKARTA
CHICAGO, ILLINOIS
MARCH 13, 2018
Copyright © 2018 by Pramod Kaushik Mudrakarta
All Rights Reserved
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ACKNOWLEDGMENTS
ABSTRACT
1 INTRODUCTION
2 RELATED WORK
3 WAVELET BASED SPARSE APPROXIMATE INVERSE PRECONDITIONERS
4 MULTIRESOLUTION MATRIX FACTORIZATION
5 NUMERICAL RESULTS
  5.1 Implementation and parameters
  5.2 Model PDE problems
  5.3 Off-the-shelf matrices
    5.3.1 Comparison with wavelet preconditioners
  5.4 Performance when only an approximate solution is desired
  5.5 Wall-clock time comparison
  5.6 Discussion
6 CONCLUSION AND FUTURE WORK
  6.1 Future work
A APPENDIX
  A.1 pMMF parameters
  A.2 Results on off-the-shelf matrices
LIST OF FIGURES
4.1 A graphical representation of the structure of Multiresolution Matrix Factorization.
    Here, P is a permutation matrix which ensures that S_ℓ = {1, . . . , δ_ℓ} for each ℓ. Note
    that P is introduced only for the sake of visualization; an actual MMF would not contain
    such an explicit permutation.
5.1 Relative residual as a function of iteration number.
5.2 Wall-clock times as a function of matrix size (left) and number of nonzeros (right).
    Top row corresponds to IWSPAI [16] and bottom to MMF preconditioner.
A.1 Full set of parameters used in pMMF for the experiments.
LIST OF TABLES
5.1 Iteration counts of GMRES until convergence to a relative residual of 10^{-8}. Here
    n is the number of rows of the finite difference matrix. WSPAI refers to the wavelet
    sparse preconditioner of Chan, Tang and Wan [7] and IWSPAI to the implicit sparse
    preconditioner of Hawkins and Chen [16]. It is clear that MMF preconditioner is
    consistently better. × indicates that the desired tolerance was not reached within 1000
    iterations.
5.2 Performance of MMF preconditioning by matrix group. %wins indicates the percentage of
    times MMFprec resulted in lower GMRES(30) iteration count compared to no
    preconditioning. × indicates a win percentage of zero. Sparsity is defined as f, where
    fn = nnz, n is the size of the matrix and nnz is the number of nonzeros.
5.3 Wall clock running time of preconditioner setup and linear solve times in seconds.
    × indicates that the desired tolerance was not reached within 1000 iterations.
A.1 Iteration counts of GMRES solved to a relative error of 10^{-4}. × indicates that the
    method did not achieve the desired tolerance within 500 iterations.
A.2 Iteration counts of GMRES(30) with tolerance = 10^{-9} on off-the-shelf sparse matrices.
A.3 Comparison of no preconditioning and MMFprec for various levels of GMRES(30) tolerance.
ACKNOWLEDGMENTS
To be added
ABSTRACT
We introduce a new general purpose multiresolution preconditioner for symmetric linear
systems. Most existing multiresolution preconditioners use some standard wavelet basis
that relies on knowledge of the geometry of the underlying domain. In contrast, based
on the recently proposed Multiresolution Matrix Factorization (MMF) algorithm [17], we
construct a preconditioner that discovers a custom wavelet basis adapted to the given linear
system without making any geometric assumptions. A parallel algorithm for MMF is
described that confers computational efficiency. Some advantages of the new approach are fast
preconditioner-vector products, invariance to the ordering of the rows/columns, and the
ability to handle systems of any size. Numerical experiments on finite difference
discretizations of model PDEs and off-the-shelf matrices illustrate the effectiveness of the
MMF preconditioner.
CHAPTER 1
INTRODUCTION
Symmetric linear systems of the form
Ax = b, (1.1)
where A ∈ R^{n×n} and b ∈ R^n, are central to many numerical computations in science and
engineering. Examples include finite difference discretizations of partial differential equations
[23] and optimization algorithms where a linear system is solved in each iteration [35, 36, 28].
Often, solving the linear system is the most time-consuming part of large-scale computations.
When A, the coefficient matrix, is large and sparse, iterative algorithms such as the
generalized minimal residual method (GMRES) [21] or the stabilized bi-conjugate gradient
method (BiCGStab) [30] are usually used to solve eq. (1.1). However, if the condition number
κ_2(A) is high (i.e., A is ill-conditioned), these methods tend to converge slowly. For example,
in the case of MINRES, which is a variant of GMRES for symmetric matrices, for positive
definite A,
\|A x_n - b\|_2 \le \left(1 - \frac{1}{\kappa_2(A)^2}\right)^{n/2} \|A x_0 - b\|_2, (1.2)
where x_n is the n-th iterate and x_0 is the initial guess [29]. Many matrices arising from
problems of interest are ill-conditioned.
Preconditioning is a technique to improve convergence, where, instead of eq. (1.1), we
solve
MAx = Mb, (1.3)
where M ∈ R^{n×n} is a rough approximation to A^{-1} (see footnote 1). While eq. (1.3) is
still a large linear system, it is generally easier to solve than eq. (1.1), because MA is more
favorably conditioned than A. Note that solving eq. (1.3) with an iterative method involves
computing many matrix-vector products with MA, but that does not necessarily mean that MA
needs to be computed explicitly. This is an important point, because even if A is sparse, MA can
be dense, and therefore expensive to compute.
Footnote 1. An alternative way to precondition is from the right, i.e., to solve AMy = b and
set x = My, but, for simplicity, in this paper we restrict ourselves to left preconditioning.
There is no such thing as a “universal” preconditioner. Preconditioners are usually
custom-made for different kinds of coefficient matrices and are evaluated differently based
on what kind of problem they are used to solve (how accurate x needs to be, how easy the
solver is to implement on parallel computers, storage requirements, etc.). Some of the most
effective preconditioners exploit sparsity. The best case scenario is when both A and M
are sparse, since in that case all matrix-vector products involved in solving eq. (1.3) can be
evaluated very fast. Starting in the 1970s, this led to the development of so-called Sparse
Approximate Inverse (SPAI) preconditioners [2, 14, 4, 16], which formulate finding M as a
least squares problem
\min_{M \in S} \|AM - I\|_F, (1.4)
where S is an appropriate class of sparse matrices. Note that since
\|AM - I\|_F^2 = \sum_{i=1}^{n} \|A m_i - e_i\|_2^2,
where m_i is the i-th column of M and e_i is the i-th standard basis vector, eq. (1.4)
reduces to solving n independent least squares problems, which can be done in parallel.
One step beyond generic SPAI preconditioners are methods that use prior knowledge
about the system at hand to transform A to a basis where its inverse can be approximated
in sparse form. For many problems, orthogonal wavelet bases are a natural choice. Recall
that wavelets are similar to Fourier basis functions, but have the advantage of being localized
in space. Transforming (1.1) to a wavelet basis amounts to rewriting it as
\tilde{A}\tilde{x} = \tilde{b}, where
\tilde{A} = W^T A W, \quad \tilde{x} = W^T x, \quad \text{and} \quad \tilde{b} = W^T b.
Here, the wavelets appear as the columns of the orthogonal matrix W. This approach was
first proposed by Chan, Tang and Wan [7].
Importantly, many wavelets admit fast transforms, meaning that W^T factors in the form
W^T = W_L^T W_{L-1}^T \cdots W_1^T, (1.5)
where each of the W_ℓ^T factors is sparse. While the wavelet transform itself is a dense
transformation, in this case, transforming to the wavelet basis inside an iterative solver
can be done by sparse matrix-vector arithmetic exclusively. Each W_ℓ matrix can be seen as
being responsible for extracting information from x at a given scale, hence wavelet transforms
constitute a form of multiresolution analysis.
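To make the factored transform concrete, the following is a minimal sketch using the orthonormal Haar wavelet, which is chosen here purely for illustration (the function names are hypothetical and not part of any library). Each level touches only the currently active prefix of the vector, so the levels together cost O(n) operations instead of the O(n^2) of a dense matrix-vector product.

```python
import math

def haar_level(x, active):
    """Apply one sparse factor W_l^T to the first `active` entries of x."""
    s = 1.0 / math.sqrt(2.0)
    half = active // 2
    out = list(x)                      # entries beyond `active` stay untouched
    for i in range(half):
        a, b = x[2 * i], x[2 * i + 1]
        out[i] = s * (a + b)           # low-pass (average) coefficient
        out[half + i] = s * (a - b)    # high-pass (detail) coefficient
    return out

def haar_transform(x, levels):
    """Compute W^T x = W_L^T ... W_1^T x, one sparse level at a time."""
    active = len(x)
    for _ in range(levels):
        if active < 2 or active % 2 != 0:
            raise ValueError("active length must be even at every level")
        x = haar_level(x, active)
        active //= 2                   # the next level acts on the averages only
    return x

x = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
y = haar_transform(x, levels=3)
```

Because every level is an orthogonal map, the Euclidean norm of `y` equals that of `x`, and after three levels `y[0]` is the sum of the entries of `x` divided by the square root of 8.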
Wavelet sparse preconditioners have proved to be effective primarily in the PDE domain,
where the problem is low dimensional and the structure of the equations (together with
the discretization) strongly suggest the form of the wavelet transform. However, multiscale
data is much more broadly prevalent, e.g., in biological problems and social networks. For
these kinds of data, the underlying generative process is unknown, rendering the classical
wavelet-based preconditioners ineffective.
In this paper, we propose a preconditioner based on a form of multiresolution analysis
for matrices called Multiresolution Matrix Factorization (MMF), which was first introduced
in [17]. Similar to eq. (1.5), MMF has a corresponding fast wavelet transform; in particular,
it is based on an approximate factorization of A of the form
A \approx Q_1^T Q_2^T \cdots Q_L^T H Q_L Q_{L-1} \cdots Q_1, (1.6)
where each of the Q_ℓ matrices is sparse and orthogonal, and H is close to diagonal. However,
in contrast to classical wavelet transforms, here the Q_ℓ matrices are not induced from any
specific analytical form of wavelets, but rather “discovered” by the algorithm itself from the
structure of A, somewhat similarly to algebraic multigrid methods [24]. This feature gives
our preconditioner considerably more flexibility than existing wavelet sparse preconditioners,
and allows it to exploit latent multiresolution structure in a wide range of problem domains.
Notations
In the following, we use [n] to denote the set {1, 2, . . . , n}. Given a matrix A ∈ R^{n×n} and
two (ordered) sets S_1, S_2 ⊆ [n], A_{S_1,S_2} will denote the |S_1| × |S_2| dimensional submatrix
of A cut out by the rows indexed by S_1 and the columns indexed by S_2. \bar{S} will denote the
complement of S in [n], i.e., [n] \ S.
CHAPTER 2
RELATED WORK
Constructing a good preconditioner hinges on two things: 1. being able to design an efficient
algorithm to compute an approximate inverse to A, and 2. making the preconditioner as
close to A−1 as possible. It is rare for both a matrix and its inverse to be sparse. For
example, Duff et al. [11] show that the inverses of irreducible sparse matrices are generally
dense. However, it is often the case that many entries of the inverse are small, making
it possible to construct a good sparse approximate inverse. For example, [10] shows that
when A is banded and symmetric positive definite, the distribution of the magnitudes of the
matrix entries in A−1 decays exponentially away from the diagonal w.r.t. the eigenvalues of
A. Benzi and Tuma [4] note that sparse approximate inverses have limited success because
of the requirement that the actual inverse of the matrix has small entries.
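A small numerical illustration of this decay, offered as a sketch rather than a reproduction of the result in [10]: for the symmetric positive definite tridiagonal matrix with 4 on the diagonal and −1 on the off-diagonals (an example chosen here, not taken from the source), the first column of the inverse can be obtained with one tridiagonal solve, and its entries shrink geometrically with distance from the diagonal.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting; adequate for
    diagonally dominant matrices such as the one used below)."""
    n = len(diag)
    c = [0.0] * n                      # modified upper diagonal
    d = [0.0] * n                      # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):     # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

n = 12
lower = [0.0] + [-1.0] * (n - 1)       # lower[i] multiplies x[i-1] in row i
diag = [4.0] * n
upper = [-1.0] * (n - 1) + [0.0]       # upper[i] multiplies x[i+1] in row i
e1 = [1.0] + [0.0] * (n - 1)

first_col = solve_tridiagonal(lower, diag, upper, e1)   # first column of A^{-1}
ratios = [first_col[i + 1] / first_col[i] for i in range(n - 1)]
```

Every entry of `ratios` stays below 2 − √3 ≈ 0.268, so dropping entries far from the diagonal changes the inverse very little, which is the situation in which a sparse approximate inverse can work well.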
A better way of computing approximate inverses is in factorized form using sparse factors.
The dense nature of the inverse is still preserved in the approximation as the product of the
factors (which is never explicitly computed) can be dense. Factorized approximate inverses
have been proposed based on LU factorization. However, they are not easily parallelizable
and are sensitive to reordering [4].
Multiscale variants of classic preconditioners have already been proposed and have often
been found to be superior [3] to their one-level counterparts. The current frontiers of research
on preconditioning also focus on designing algorithms for multi-core machines. Multilevel
preconditioners assume that the coefficient matrix has a hierarchy in structure. These in-
clude the preconditioners that are based on rank structures, such as H-matrices [15], which
represent a matrix in terms of a hierarchy of blocked submatrices where the off-diagonal
blocks are low rank. This allows for fast inversion and LU factorization routines. Pre-
conditioners based on H-matrix approximations have been explored in [12, 19, 13]. Other
multilevel preconditioners based on low rank have been proposed in [34].
Multigrid preconditioners [6, 22] are reduced tolerance multigrid solvers, which alternate
between fine- and coarse-level representations to reduce the high and low frequency
components of the error, respectively. In contrast, hierarchical basis methods [38, 37] precondition
the original linear system as in eq. (1.3) by expressing A in a hierarchical representation. A
hierarchical basis-multigrid preconditioner has been proposed in [1].
Hierarchical basis preconditioners can be thought of as a special kind of wavelet precon-
ditioners as it is possible to interpret the piecewise linear functions of the hierarchical basis
as wavelets. Connections between wavelets and hierarchical basis methods have also been
explored in [32, 33] to improve the performance of hierarchical basis methods.
CHAPTER 3
WAVELET BASED SPARSE APPROXIMATE INVERSE
PRECONDITIONERS
We begin with a brief introduction to classical orthogonal wavelet transforms. For a detailed
introduction, see [8]. Assuming n = 2^N for simplicity, the L-level wavelet transform of a
signal x ∈ R^n can be written as a matrix-vector product W^T x, where
W = W_1 W_2 \cdots W_L (3.1)
with L ≤ N and
W_k^T = \begin{pmatrix} U_k & 0 \\ V_k & 0 \\ 0 & I_{n - n/2^{k-1}} \end{pmatrix}, \quad k = 1, \ldots, L, (3.2)
where U_k, V_k ∈ R^{(n/2^k) × (n/2^{k-1})} are of the form
U_k = \begin{pmatrix}
h_0 & h_1 & h_2 & \cdots & h_{m-1} & & & \\
 & & h_0 & h_1 & h_2 & \cdots & h_{m-1} & \\
 & & & \ddots & \ddots & \ddots & & \ddots \\
h_2 & \cdots & h_{m-1} & & & & h_0 & h_1
\end{pmatrix},
V_k = \begin{pmatrix}
g_0 & g_1 & g_2 & \cdots & g_{m-1} & & & \\
 & & g_0 & g_1 & g_2 & \cdots & g_{m-1} & \\
 & & & \ddots & \ddots & \ddots & & \ddots \\
g_2 & \cdots & g_{m-1} & & & & g_0 & g_1
\end{pmatrix}.
The scalars h_i, g_i for i = 0, . . . , m − 1 are the low-pass and high-pass filter coefficients
of the wavelet transform, respectively. The above holds true even when n = p 2^s for some s
and p. In that case, the maximum level of the wavelet transform applied is upper bounded by s.
On higher dimensional signals, wavelet transforms are applied dimension-wise. For example,
let x ∈ R^{n^2} be a 2D signal (matrix) which has been vectorized by stacking the columns.
The wavelet transform \tilde{x} is computed by first applying a 1D transform on the columns and
then on the rows. If W ∈ R^{n×n} is the 1D orthogonal wavelet transform matrix, then
\tilde{x} = (I_n \otimes W^T)(W^T \otimes I_n) x = (W \otimes W)^T x,
where ⊗ is the Kronecker product [31] and I_n is the n × n identity matrix. Thus, W ⊗ W can
be called the two dimensional wavelet transform matrix. For vectorized 3D signals (tensors),
the wavelet transform matrix is W ⊗ W ⊗ W.
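The Kronecker identity can be checked numerically on a tiny example. The sketch below is only an illustration: it uses a generic 2 × 2 orthogonal matrix as a stand-in for a wavelet matrix W, and the helper names are hypothetical. It compares transforming the columns and rows of a 2D signal X directly against a single product with (W ⊗ W)^T applied to the column-stacked vector.

```python
def matmul(A, B):
    """Dense matrix product for matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def vec(X):
    """Stack the columns of X into one long vector."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]

W = [[0.6, -0.8],
     [0.8, 0.6]]                   # orthogonal, deliberately not symmetric
X = [[1.0, 2.0],
     [3.0, 4.0]]                   # a tiny 2D "signal"

# Route 1: transform the columns (W^T X), then the rows (... W).
two_step = vec(matmul(matmul(transpose(W), X), W))

# Route 2: one matrix-vector product with (W ⊗ W)^T on the vectorized signal.
K_T = transpose(kron(W, W))
x = vec(X)
one_shot = [sum(K_T[i][j] * x[j] for j in range(len(x))) for i in range(len(K_T))]
```

Both routes give the same vector up to rounding. In practice the first route is the one to use, since it never forms the n^2 × n^2 Kronecker product.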
Chan, Tang and Wan [7] were the first to propose a wavelet sparse approximate inverse
preconditioner. In their approach, the linear system eq. (1.1) is first transformed into a
standard wavelet basis such as the Daubechies [8] basis, and a sparse approximate inverse
preconditioner is computed for the transformed coefficient matrix by solving
\min_{M \in S_{\mathrm{blockdiag}}} \|W^T A W M - I\|_F. (3.3)
The preconditioner is constrained to be block diagonal in order to maintain its sparsity
and simplify computation. They show the superiority of the wavelet preconditioner over
an adaptive sparse approximate inverse preconditioner for elliptic PDEs with smooth
coefficients over regular domains. However, their method performs poorly for elliptic PDEs with
discontinuous coefficients. The block diagonal constraint does not fully capture the structure
of the inverse in the wavelet basis.
Bridson and Tang [5] construct a multiresolution preconditioner similar to Chan, Tang
and Wan [7], but determine the sparsity structure adaptively. Instead of using Daubechies
wavelets, they use second generation wavelets [26], which allows the preconditioner to be
effective for PDEs over irregular domains. However, their algorithm requires the additional
difficult step of finding a suitable ordering of the rows/columns of the coefficient matrix,
which limits the number of levels to which multiresolution structure can be exploited.
Hawkins and Chen [16] compute an implicit wavelet sparse approximate inverse
preconditioner, which removes the computational overhead of transforming the coefficient matrix
to a wavelet basis. Instead of eq. (3.3), they solve
\min_{M \in S_W} \|W^T A M - I\|_F, (3.4)
where S_W is the class of matrices which have the same sparsity structure as W. Since W is
orthogonal, \|W^T A M - I\|_F = \|AM - W\|_F, which is the form used in algorithms 1 and 2. They
empirically show that this sparsity constraint is enough to construct a preconditioner superior
to that of Chan, Tang and Wan [7]. The complete algorithm is described in algorithms 1
and 2.

Algorithm 1 Solve Ax = b using the implicit wavelet SPAI preconditioner [16]
1: Compute preconditioner M = arg min_{M ∈ S_W} ||AM − W||_F
2: Solve W^T A M y = W^T b
3: return x = My

Algorithm 2 Compute preconditioner M = arg min_{M ∈ S_W} ||AM − W||_F
1: for j = 1, . . . , n do
2:   S_j = indices of nonzero entries of w_j
3:   T_j = indices of nonzero rows of A(:, S_j)
4:   Solve z* = arg min_z ||A(T_j, S_j) z − w_j(T_j)||_2 by reduced QR-factorization
5:   Set m_j(S_j) = z*
6: end for
7: return M
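The column-by-column structure of Algorithm 2 can be sketched in a few lines. This is an illustrative, dense, dependency-free version and not the implementation of [16]: the function names are hypothetical, the small least squares problems are solved through the normal equations instead of a reduced QR factorization, and the solution is stored on the support S_j of column j, since z has one entry per column of A(T_j, S_j).

```python
import math

def solve_normal_equations(B, r):
    """Least squares min ||B z - r||_2 via B^T B z = B^T r.
    (Swapped in for the reduced QR of Algorithm 2 to keep the sketch short;
    assumes B has full column rank.)"""
    k = len(B[0])
    G = [[sum(row[a] * row[b] for row in B) for b in range(k)] for a in range(k)]
    g = [sum(row[a] * ri for row, ri in zip(B, r)) for a in range(k)]
    for col in range(k):               # Gaussian elimination, partial pivoting
        piv = max(range(col, k), key=lambda i: abs(G[i][col]))
        G[col], G[piv] = G[piv], G[col]
        g[col], g[piv] = g[piv], g[col]
        for i in range(col + 1, k):
            f = G[i][col] / G[col][col]
            for j in range(col, k):
                G[i][j] -= f * G[col][j]
            g[i] -= f * g[col]
    z = [0.0] * k
    for i in range(k - 1, -1, -1):     # back substitution
        z[i] = (g[i] - sum(G[i][j] * z[j] for j in range(i + 1, k))) / G[i][i]
    return z

def implicit_wavelet_spai(A, W):
    """Sketch of Algorithm 2 for dense list-of-rows A and W."""
    n = len(A)
    M = [[0.0] * n for _ in range(n)]
    for j in range(n):
        S = [i for i in range(n) if W[i][j] != 0.0]                   # pattern of w_j
        T = [i for i in range(n) if any(A[i][c] != 0.0 for c in S)]   # nonzero rows of A(:, S)
        B = [[A[t][c] for c in S] for t in T]
        r = [W[t][j] for t in T]
        z = solve_normal_equations(B, r)
        for row_idx, val in zip(S, z):
            M[row_idx][j] = val        # m_j(S_j) = z*
    return M

h = 1.0 / math.sqrt(2.0)
W = [[h, 0.0, h, 0.0],                 # one-level Haar basis, wavelets as columns
     [h, 0.0, -h, 0.0],
     [0.0, h, 0.0, h],
     [0.0, h, 0.0, -h]]
A = [[2.0, -1.0, 0.0, 0.0],            # block diagonal test matrix (assumed example)
     [-1.0, 2.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 1.0],
     [0.0, 0.0, 1.0, 3.0]]
M = implicit_wavelet_spai(A, W)
```

For this block diagonal A the sparsity pattern of W happens to be rich enough that AM = W holds exactly, so W^T A M is the identity. For a general sparse A the residual is nonzero and M is only an approximate inverse in the wavelet basis.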
Hawkins and Chen [16] apply their preconditioner on Poisson and elliptic PDEs in 1D, 2D
and 3D. We found, by experiment, that it is critical to use a wavelet transform of the same
dimension as the underlying PDE of the linear system for success of their preconditioner. On
linear systems where the underlying data generator is unknown — this happens, for example,
when we are dealing with Laplacians of graphs — their preconditioner is ineffective. Thus,
there is a need for a wavelet sparse approximate inverse preconditioner which can mould itself
to any kind of data, provided that it is reasonable to assume a multiresolution structure.
CHAPTER 4
MULTIRESOLUTION MATRIX FACTORIZATION
The Multiresolution Matrix Factorization (MMF) of a symmetric matrix A ∈ R^{n×n}, as
defined in [17], is a multilevel sparse factorization of the form
A \approx Q_1^T Q_2^T \cdots Q_L^T H Q_L \cdots Q_2 Q_1, (4.1)
where the matrices Q_1, . . . , Q_L and H obey the following conditions:
1. Each Q_ℓ is orthogonal and highly sparse. In the simplest case, each Q_ℓ is a Givens
rotation, i.e., a matrix which differs from the identity in just the four matrix elements
[Q_ℓ]_{i,i} = cos θ, [Q_ℓ]_{i,j} = − sin θ,
[Q_ℓ]_{j,i} = sin θ, [Q_ℓ]_{j,j} = cos θ,
for some pair of indices (i, j) and rotation angle θ. Multiplying a vector with such a
matrix rotates it counter-clockwise by θ in the (i, j) plane. More generally, Q_ℓ is a
so-called k-point rotation, which rotates not just two, but k coordinates.
2. Typically, in MMF factorizations L = O(n), and the size of the active part of the Q_ℓ
matrices decreases according to a set schedule n = δ_0 ≥ δ_1 ≥ . . . ≥ δ_L. More precisely,
there is a nested sequence of sets [n] = S_0 ⊇ S_1 ⊇ . . . ⊇ S_L such that the
[Q_ℓ]_{\bar{S}_{ℓ−1}, \bar{S}_{ℓ−1}} part of each rotation is the n − δ_{ℓ−1} dimensional
identity. S_ℓ is called the active set at level ℓ. In the simplest case, δ_ℓ = n − ℓ.
3. H is an S_L-core-diagonal matrix, which means that it is block diagonal with two blocks:
H_{S_L,S_L}, called the core, which is dense, and H_{\bar{S}_L, \bar{S}_L}, which is diagonal.
In other words, H_{i,j} = 0 unless i, j ∈ S_L or i = j.
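As a concrete illustration of the Givens rotation in condition 1, the following sketch builds the dense matrix and also applies it in place. The helper names are hypothetical and indices are 0-based here, unlike the 1-based indices in the text. The point is that a product with Q_ℓ only ever touches two coordinates, which is what makes each factor cheap to apply.

```python
import math

def givens(n, i, j, theta):
    """n x n identity modified in the four entries (i,i), (i,j), (j,i), (j,j)."""
    Q = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    Q[i][i] = math.cos(theta)
    Q[i][j] = -math.sin(theta)
    Q[j][i] = math.sin(theta)
    Q[j][j] = math.cos(theta)
    return Q

def apply_givens(v, i, j, theta):
    """Compute Q v with O(1) work by updating only coordinates i and j."""
    c, s = math.cos(theta), math.sin(theta)
    out = list(v)
    out[i] = c * v[i] - s * v[j]
    out[j] = s * v[i] + c * v[j]
    return out

v = [1.0, 2.0, 3.0, 4.0]
w = apply_givens(v, 0, 2, math.pi / 2)   # quarter turn in the (0, 2) plane
```

A quarter turn sends the pair (v_0, v_2) = (1, 3) to approximately (−3, 1) and leaves the other coordinates unchanged, matching the dense product `givens(4, 0, 2, math.pi / 2)` times `v`.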
[Figure: P A P^T ≈ (Q_1^T) (Q_2^T) . . . (Q_L^T) (H) (Q_L) . . . (Q_2) (Q_1)]
Figure 4.1: A graphical representation of the structure of Multiresolution Matrix
Factorization. Here, P is a permutation matrix which ensures that S_ℓ = {1, . . . , δ_ℓ} for each ℓ.
Note that P is introduced only for the sake of visualization; an actual MMF would not contain
such an explicit permutation.
The structure implied by the above conditions is illustrated in Figure 4.1. MMF
factorizations are, in general, only approximate, as there is no guarantee that O(n) sparse orthogonal
matrices can bring a symmetric matrix to core-diagonal form. Rather, the goal of MMF
algorithms is to minimize the approximation error, which, in the simplest case, is the Frobenius
norm of the difference between the original matrix and its MMF factorized form.
MMF was originally introduced in the context of multiresolution analysis on discrete
spaces, such as graphs. In particular, the columns of Q^T = Q_1^T \cdots Q_{L-1}^T Q_L^T have a
natural interpretation as wavelets, and the factorization itself is effectively a fast wavelet
transform, mimicking the structure of classical orthogonal multiresolution analyses on the
real line [20]. MMF has also been successfully used for compressing large matrices [27].
In this paper we use MMF in a different way. The key property that we exploit is that
eq. (4.1) automatically gives rise to an approximation \tilde{A}^{-1} to A^{-1},
\tilde{A}^{-1} = Q_1^T \cdots Q_{L-1}^T Q_L^T H^{-1} Q_L Q_{L-1} \cdots Q_1, (4.2)
which is very fast to compute, since inverting H reduces to separately inverting its core
(which is assumed to be small) and inverting its diagonal block (which is trivial). Assuming
that the core is small enough, the overall cost of inversion becomes O(n) as the matrices Q_i
are sparse. When using eq. (4.2) as a preconditioner, of course we never compute eq. (4.2)
explicitly, but rather (similarly to other wavelet sparse approximate inverse preconditioners)
we apply it to vectors in factorized form as
\tilde{A}^{-1} v = Q_1^T\bigl(\cdots\bigl(Q_{L-1}^T\bigl(Q_L^T\bigl(H^{-1}\bigl(Q_L\bigl(Q_{L-1}\cdots(Q_1 v)\cdots\bigr)\bigr)\bigr)\bigr)\bigr)\cdots\bigr). (4.3)
Since each of the factors here is sparse, the entire product can be computed in O(n) time. We
remark that computing the MMF preconditioner does not depend on the spatial dimension
of the underlying problem. In fact, for matrices such as graph Laplacians, the underlying
dimensionality is unknown. MMF is more widely applicable because of its dimension-free
nature.
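The factorized application in eq. (4.3) can be sketched as follows. This is a toy illustration under stated assumptions, not the pMMF library's interface: rotations are stored as hypothetical (i, j, θ) triples with 0-based indices, the core is fixed at size 2 × 2 so that it can be inverted in closed form, and the wavelet part of H is a dictionary of diagonal entries.

```python
import math

def rotate(v, i, j, theta, transpose=False):
    """Apply a Givens rotation Q (or Q^T) in the (i, j) plane to a copy of v."""
    c, s = math.cos(theta), math.sin(theta)
    if transpose:
        s = -s                                   # Q^T is the rotation by -theta
    out = list(v)
    out[i] = c * v[i] - s * v[j]
    out[j] = s * v[i] + c * v[j]
    return out

def apply_mmf(v, rotations, core_idx, core, diag, inverse=False):
    """Compute Q_1^T ... Q_L^T H Q_L ... Q_1 v, or the same with H^{-1}."""
    for (i, j, theta) in rotations:              # Q_L ... Q_1 v
        v = rotate(v, i, j, theta)
    a, b = core_idx                              # 2 x 2 dense core (assumed size)
    (p, q), (r, t) = core
    out = [0.0] * len(v)
    for k, d in diag.items():                    # diagonal (wavelet) part of H
        out[k] = v[k] / d if inverse else d * v[k]
    if inverse:
        det = p * t - q * r
        out[a] = (t * v[a] - q * v[b]) / det
        out[b] = (-r * v[a] + p * v[b]) / det
    else:
        out[a] = p * v[a] + q * v[b]
        out[b] = r * v[a] + t * v[b]
    for (i, j, theta) in reversed(rotations):    # Q_1^T ... Q_L^T
        out = rotate(out, i, j, theta, transpose=True)
    return out

rotations = [(0, 1, 0.3), (2, 3, -0.7), (1, 3, 1.1)]   # Q_1, Q_2, Q_3
core_idx = (1, 3)                                       # indices still active at the end
core = ((2.0, 0.5), (0.5, 1.0))
diag = {0: 5.0, 2: 3.0}                                 # wavelet indices 0 and 2

v = [1.0, -2.0, 0.5, 3.0]
Av = apply_mmf(v, rotations, core_idx, core, diag)                      # forward product
recovered = apply_mmf(Av, rotations, core_idx, core, diag, inverse=True)
```

Applying the inverse form to `Av` returns `v` up to rounding. The work is one O(1) update per rotation plus the cost of the small core, so no n × n matrix is ever formed.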
Computation of the MMF
The MMF of a symmetric matrix A is usually computed by minimizing the Frobenius norm
factorization error
\|A - Q_1^T \cdots Q_L^T H Q_L \cdots Q_1\|_{\mathrm{Frob}} (4.4)
over all admissible choices of active sets S_1, . . . , S_L and rotation matrices Q_1, . . . , Q_L. The
minimization is carried out in a greedy manner, where the rotation matrices Q_1, . . . , Q_L are
determined sequentially, as A is subjected to the sequence of transformations
A \mapsto \underbrace{Q_1 A Q_1^T}_{A_1} \mapsto \underbrace{Q_2 Q_1 A Q_1^T Q_2^T}_{A_2} \mapsto \cdots \mapsto \underbrace{Q_L \cdots Q_2 Q_1 A Q_1^T Q_2^T \cdots Q_L^T}_{H}.
In this process, at each level ℓ, the algorithm
1. Determines which subset of rows/columns {i_1, . . . , i_k} ⊆ S_{ℓ−1} are to be involved in the
next rotation, Q_ℓ.
2. Given {i_1, . . . , i_k}, optimizes the actual entries of Q_ℓ.
3. Selects a subset of the indices in {i_1, . . . , i_k} for removal from the active set (the
corresponding rows/columns of the working matrix A_ℓ then become “wavelets”).
4. Sets the off-diagonal parts of the resulting wavelet rows/columns to zero in H.
The final squared error is the sum of the squares of the zeroed out off-diagonal elements (see
Proposition 1 in [17]). The objective therefore is to craft Q_ℓ such that these off-diagonals are
as small as possible.
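The statement about the error can be checked numerically for a single level. The sketch below uses hypothetical helper names, 0-based indices, and an arbitrary rather than optimized rotation angle. It rotates a small symmetric matrix, declares one index a wavelet, zeroes its off-diagonal entries, and compares the squared Frobenius error of the resulting factorization with the sum of squares of the entries that were dropped.

```python
import math

def conjugate(A, i, j, theta):
    """Return Q A Q^T for the Givens rotation Q in the (i, j) plane."""
    n = len(A)
    c, s = math.cos(theta), math.sin(theta)
    B = [row[:] for row in A]
    for k in range(n):                       # rows i and j of Q A
        B[i][k] = c * A[i][k] - s * A[j][k]
        B[j][k] = s * A[i][k] + c * A[j][k]
    C = [row[:] for row in B]
    for k in range(n):                       # columns i and j of (Q A) Q^T
        C[k][i] = c * B[k][i] - s * B[k][j]
        C[k][j] = s * B[k][i] + c * B[k][j]
    return C

def frob2(A):
    """Squared Frobenius norm."""
    return sum(x * x for row in A for x in row)

A = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 0.2],
     [0.5, 0.2, 2.0]]
i, j, theta = 0, 1, 0.4
A1 = conjugate(A, i, j, theta)               # A_1 = Q_1 A Q_1^T

# Declare index i a wavelet: keep its diagonal entry, zero the rest of row/column i.
H = [row[:] for row in A1]
dropped = 0.0
for k in range(len(A)):
    if k != i:
        dropped += H[i][k] ** 2 + H[k][i] ** 2
        H[i][k] = H[k][i] = 0.0

# Map H back with the inverse rotation and measure the factorization error.
approx = conjugate(H, i, j, -theta)          # Q_1^T H Q_1
error2 = frob2([[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, approx)])
```

`error2` and `dropped` agree up to rounding because conjugation by an orthogonal matrix preserves the Frobenius norm, which is why the greedy procedure can score a candidate rotation by the off-diagonal mass it would discard.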
For preconditioning it is critical to be able to compute the MMF approximation fast. To
this end we employ two further heuristics. First, the row/column selection process is
accelerated by randomization: for each ℓ, the first index i_1 is chosen uniformly at random from
the current active set S_{ℓ−1}, and then i_2, . . . , i_k are chosen so as to ensure that Q_ℓ can
produce δ_{ℓ−1} − δ_ℓ rows/columns with suitably small off-diagonal norm. Second, exploiting the
fundamentally local character of MMF pivoting, the entire algorithm is parallelized using a
generalized blocking strategy first described in [27].
Notation 1. Let B_1 ⊍ B_2 ⊍ . . . ⊍ B_k = [n] be a partition of [n] and A ∈ R^{n×n}. We use
⟦A⟧_{i,j} to denote the [A]_{B_i,B_j} block of A and say that A is (B_1, . . . , B_k)-block-diagonal
if ⟦A⟧_{i,j} = 0 whenever i ≠ j.
The pMMF algorithm proposed in [27] uses a rough clustering algorithm to group the
rows/columns of A into a certain number of blocks, and factors each block independently
and in parallel. However, to avoid overcommitting to a specific clustering, each of these
factorizations is only partial (typically the core size is on the order of 1/2 of the size of the
block). The algorithm proceeds in stages, where each stage consists of (re-)clustering the
remaining active part of the matrix, performing partial MMF on each cluster in parallel,
and then reassembling the active rows/columns from each cluster into a single matrix again
(algorithm 3).
Assuming that there are P stages in total, this process results in a two-level factorization.
At the stage level, we have
A \approx Q_1^T Q_2^T \cdots Q_P^T H Q_P \cdots Q_2 Q_1, (4.5)
where, assuming that the clustering in stage p is B^p_1 ⊍ B^p_2 ⊍ . . . ⊍ B^p_m, each Q_p is a
(B^p_1, . . . , B^p_m) block diagonal orthogonal matrix, which, in turn, factors into a product of
a large number of elementary k-point rotations
Q_p = Q_{l_p} \cdots Q_{l_{p-1}+2} Q_{l_{p-1}+1}. (4.6)
Thanks to the combination of these computational tricks, empirically, for sparse matrices,
pMMF can achieve close to linear scaling behavior with n, both in memory and computation
time [27]. For completeness, the subroutine used to compute the rotations in each cluster is
presented in algorithm 4.

Algorithm 3 pMMF (top level of the pMMF algorithm)
Input: a symmetric matrix A ∈ R^{n×n}
A_0 ← A
for (p = 1 to P) {
    Cluster the active columns of A_{p−1} to B^p_1 ⊍ B^p_2 ⊍ . . . ⊍ B^p_m
    Reblock A_{p−1} according to (B^p_1, . . . , B^p_m)
    for (u = 1 to m) ⟦Q_p⟧_{u,u} ← FindRotationsForCluster([A_{p−1}]_{:,B^p_u})
    for (u = 1 to m) {
        for (v = 1 to m) {
            ⟦A_p⟧_{u,v} ← ⟦Q_p⟧_{u,u} ⟦A_{p−1}⟧_{u,v} ⟦Q_p⟧_{v,v}^T
        }
    }
}
H ← the core of A_P plus its diagonal
Output: (H, Q_1, . . . , Q_P)

Algorithm 4 FindRotationsForCluster(U) (we assume k = 2 and η is the compression ratio)
Input: a matrix U made up of the c columns of A_{p−1} forming cluster u in A_p
Compute the Gram matrix G = U^T U
S ← {1, 2, . . . , c} (the active set)
for (s = 1 to ⌊ηc⌋) {
    Select i ∈ S uniformly at random
    Find j = argmax_{j ∈ S \ {i}} |⟨U_{:,i}, U_{:,j}⟩| / ‖U_{:,j}‖
    Find the optimal Givens rotation q_s of columns (i, j)
    U ← q_s U q_s^T
    G ← q_s G q_s^T
    if ‖U_{i,:}‖_off-diag < ‖U_{j,:}‖_off-diag then S ← S \ {i} else S ← S \ {j}
}
Output: ⟦Q_p⟧_{u,u} = q_{⌊ηc⌋} . . . q_2 q_1
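The pairing step of Algorithm 4 can be illustrated in isolation. The sketch below is a simplified, hypothetical rendering and not the pMMF code: it fixes the first index i instead of drawing it at random so that the result is reproducible, uses 0-based indices, and reads both the inner products and the column norms off the Gram matrix G = U^T U.

```python
import math

def gram(U):
    """G = U^T U for a dense matrix stored as a list of rows."""
    c = len(U[0])
    return [[sum(row[a] * row[b] for row in U) for b in range(c)] for a in range(c)]

def pick_partner(G, i, active):
    """Index j in the active set, other than i, maximizing |<U_i, U_j>| / ||U_j||."""
    candidates = [j for j in active if j != i and G[j][j] > 0.0]
    return max(candidates, key=lambda j: abs(G[i][j]) / math.sqrt(G[j][j]))

U = [[2.0, 1.9, 0.0],
     [1.0, 1.1, 0.1],
     [0.0, 0.0, 3.0]]
G = gram(U)
j = pick_partner(G, 0, active={0, 1, 2})
```

Columns 0 and 1 of `U` are nearly parallel while column 2 is almost orthogonal to both, so the partner selected for index 0 is index 1. Rotating such strongly correlated columns against each other is what allows one of them to end up with small off-diagonal norm.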
CHAPTER 5
NUMERICAL RESULTS
We compare performance of the preconditioners on both model PDE problems and off-the-
shelf sparse matrix datasets.
5.1 Implementation and parameters
We implemented parallelized versions of all preconditioners in both MATLAB and C++. In
MATLAB, parallel code was written using the command parfor and in C++, we used the
pthreads library.
For PDE problems which are relatively small, we use the MATLAB code. We used a
block size of 8 for WSPAI as used in [7, 16]. For IWSPAI, we used 6 wavelet levels. To solve
the minimization problem in Step 4 of Algorithm 2, we used the lsqr method in MATLAB,
as we found it to be faster and accurate in reproducing the results in [16]. The maximum
number of iterations in lsqr was set to 500. We used Daubechies wavelets [8] for both the
wavelet preconditioners.
The pMMF library [18] was used to compute the MMF preconditioner. Default param-
eters supplied by the library were used. These include using second order rotations, i.e.,
Givens rotations, designating half of the active number of columns at each level as wavelets
and compressing the matrix until the core is of size 100×100. The parameter which controls
the extent of pMMF parallelization, namely the maximum size of blocks in blocked matrices,
was set to 2000. The full set of parameters that were used are shown in Figure A.1.
We note that the C++ implementation of the wavelet preconditioners using the same
matrix classes as pMMF was slower than its MATLAB counterpart. This suggests that
specialized matrix libraries are needed to speed up their code in C++.
The PDE experiments were run on 28-core Intel E5-2680v4 2.4GHz computers. We used
GMRES (without restarts) with a maximum number of iterations of 1000.
For off-the-shelf matrices, we use the C++ implementations. The parameters used were
the same, except pMMF on matrices larger than 65536 rows uses a maximum cluster size of
5000. We used GMRES (with restarts after 30 iterations) to a maximum of 500 iterations.
The experiments were run on 32-core AMD Opteron 6386 SE.
5.2 Model PDE problems
The model PDE problems used are
• 1D Laplacian. One dimensional Poisson’s equation
u_xx = (1 + x^2)^{-1} e^x, x ∈ [0, 1],
with a Dirichlet boundary condition discretized with central differences.
• 2D Laplacian. Two dimensional Poisson’s equation
u_xx + u_yy = −100x^2, (x, y) ∈ [0, 1]^2,
with a Dirichlet boundary condition discretized with central differences.
• 3D Laplacian. Three dimensional Poisson’s equation
u_xx + u_yy + u_zz = −100x^2, (x, y, z) ∈ [0, 1]^3.
• 2D Disc. Two dimensional PDE with discontinuous coefficients
(a(x, y) u_x)_x + (b(x, y) u_y)_y = sin(πxy), (x, y) ∈ [0, 1]^2,
with
a(x, y) = b(x, y) = \begin{cases} 10^{-3}, & (x, y) \in [0, 0.5] \times [0.5, 1], \\ 10^{3}, & (x, y) \in [0.5, 1] \times [0, 0.5], \\ 1, & \text{otherwise}, \end{cases}
with a Dirichlet boundary condition discretized with central differences.
A regular mesh was assumed in constructing the finite difference matrices for these PDEs.
We used GMRES with a stopping tolerance of 10^{-8} in relative residual and a cap on the
number of iterations at 1000. The iteration counts are shown in Table 5.1.

Dataset        n      no prec.   WSPAI   IWSPAI   MMF prec.
1D Laplacian   256    256        46      13       10
               512    512        64      13       10
               1024   1001       93      17       13
               2048   1001       131     17       2
2D Laplacian   256    45         33      28       8
               1024   91         41      28       8
               4096   180        59      30       13
3D Laplacian   512    28         26      28       8
               4096   55         41      30       11
2D Disc        256    240        256     37       13
               1024   868        ×       24       13
Table 5.1: Iteration counts of GMRES until convergence to a relative residual of 10^{-8}. Here
n is the number of rows of the finite difference matrix. WSPAI refers to the wavelet sparse
preconditioner of Chan, Tang and Wan [7] and IWSPAI to the implicit sparse preconditioner
of Hawkins and Chen [16]. It is clear that MMF preconditioner is consistently better. ×
indicates that the desired tolerance was not reached within 1000 iterations.
MMF preconditioning is consistently better on model problems in terms of iteration
count. Higher dimensional finite difference Laplacian matrices are generally well condi-
tioned, as the condition number depends more strongly on the mesh size. In fact, the
condition number of d-dimensional finite difference Laplacian matrix grows as n2h , where h
is the step size. Even on higher dimensional Laplacians, where the wavelet preconditioners
fail to provide adequate speedup, MMF preconditioning is effective. On average, MMF
preconditioning seems to converge in about half the number of iterations required by
the best wavelet preconditioner.
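To see the dependence of conditioning on the mesh concretely in the 1D case, the sketch below uses the standard closed-form eigenvalues of the n × n matrix tridiag(−1, 2, −1), namely 4 sin^2(kπ / (2(n + 1))) for k = 1, . . . , n. This formula is a textbook fact brought in here for illustration and is not taken from the experiments above. With step size h = 1/(n + 1), the condition number grows roughly like 4 / (π^2 h^2).

```python
import math

def laplacian_1d_condition_number(n):
    """kappa_2 of the n x n matrix tridiag(-1, 2, -1), from the closed-form
    eigenvalues 4 sin^2(k pi / (2 (n + 1))), k = 1, ..., n."""
    lam_min = 4.0 * math.sin(math.pi / (2 * (n + 1))) ** 2
    lam_max = 4.0 * math.sin(n * math.pi / (2 * (n + 1))) ** 2
    return lam_max / lam_min

for n in (256, 512, 1024, 2048):
    kappa = laplacian_1d_condition_number(n)
    estimate = 4 * (n + 1) ** 2 / math.pi ** 2       # leading-order growth
    print(n, round(kappa), round(estimate))
```

Doubling n roughly quadruples the condition number, which is consistent with the unpreconditioned 1D iteration counts in Table 5.1 growing until they hit the iteration cap.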
We observed that choosing the correct dimensionality of the wavelet transform is neces-
sary for IWSPAI to be effective. Without it, IWSPAI and WSPAI on higher order Laplacians
resulted in nearly no speed-up compared to no preconditioning.
5.3 Off-the-shelf matrices
We collected all symmetric sparse matrices of size larger than 8192 from the University of
Florida Sparse Matrix Collection [9]. The matrices come from a variety of scientific problems:
structural engineering, theoretical/quantum chemistry, heat flow, 3D vision, finite element
approximations, networks, etc. The right hand sides were generated by multiplying the
coefficient matrices with a random vector.
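The right-hand-side construction can be sketched as follows. This is an illustration only: the dictionary-of-rows storage, the helper names `matvec` and `make_rhs`, and the choice of a standard normal random vector are assumptions, since the text does not specify the distribution used.

```python
import random

def matvec(rows, x):
    """Multiply a sparse matrix, stored as {row: {col: value}}, by a dense vector."""
    return [sum(v * x[j] for j, v in rows.get(i, {}).items()) for i in range(len(x))]

def make_rhs(rows, n, seed=0):
    """Draw a random vector x_true and return (A @ x_true, x_true)."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]  # assumed distribution
    return matvec(rows, x_true), x_true

# Toy symmetric matrix: 1D Laplacian stencil (-1, 2, -1) on 4 points.
n = 4
A = {i: {j: (2.0 if i == j else -1.0) for j in (i - 1, i, i + 1) if 0 <= j < n}
     for i in range(n)}
b, x_true = make_rhs(A, n)
```

Building b this way guarantees that the system is consistent and that a reference solution is available for checking the solver.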
Wavelet preconditioners cannot be applied to matrices whose size is not divisible by 2.
Moreover, the number of levels at which wavelet transforms can be applied is limited by the
multiplicity of 2 as a factor of the matrix size. The MMF preconditioner, on the other hand,
can be applied to matrices of arbitrary size.
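The size restriction can be made concrete with a short sketch that computes the multiplicity of 2 as a factor of the matrix size. The function name `max_wavelet_levels` is hypothetical, and reading this multiplicity as the largest usable number of wavelet levels is our interpretation of the statement above.

```python
def max_wavelet_levels(n: int) -> int:
    """Multiplicity of 2 as a factor of n (0 if n is odd)."""
    if n <= 0:
        raise ValueError("matrix size must be positive")
    levels = 0
    while n % 2 == 0:
        n //= 2
        levels += 1
    return levels
```

For example, a matrix of size 8192 admits 13 halvings, a matrix of size 9506 only one, and a matrix of odd size such as 9801 none at all.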
First, we evaluate the effectiveness of MMF preconditioning (MMFprec). The iteration
counts when using GMRES(30) and a tolerance of $10^{-9}$ are shown in Table A.2. Of the
493 matrices, GMRES converges with no preconditioning in 53 cases, while with MMFprec
it converges in 120 cases. MMFprec results in fewer iterations than no preconditioning in 110
cases. 15 of the 31 cases where MMFprec is not effective belong to the “Nemeth” group,
which has arbitrary matrices and may not have multiresolution structure.
MMFprec’s performance by matrix group is shown in Table 5.2. For each group, we
counted the number of matrices on which MMF achieves faster convergence than no
preconditioning. MMFprec is able to accelerate linear systems arising from a variety of scientific
problems, especially ones which are known to have multiresolution structure such as graphs
and circuit simulations. Moreover, circuit simulation problems are known to have
hierarchical structure different from those arising in PDEs [25]. We also note that MMFprec is
effective on a range of sparsity levels and matrix sizes.
5.3.1 Comparison with wavelet preconditioners
We selected several off-the-shelf matrices of sizes between 8192 and 65536. To make the
matrix size compatible with wavelet preconditioners, we discarded a random set of rows/columns
from each matrix such that its size is reduced to $p\,2^s$, where $s = \lfloor \log_2 n \rfloor$
and $p = \lfloor n/2^s \rfloor$.
We limited ourselves to an upper bound of 65536 as the wavelet preconditioner does not
scale well on wall-clock time (see Section 5.5).
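The row/column discarding step might look like the sketch below. The function name `indices_to_keep`, the seed, and passing the level s as an explicit argument are assumptions made for illustration, not the thesis implementation. The same index set is used for rows and columns so that the reduced matrix stays symmetric.

```python
import random

def indices_to_keep(n: int, s: int, seed: int = 0) -> list[int]:
    """Randomly keep p * 2**s of the n row/column indices, with p = n // 2**s."""
    p = n // 2**s
    if p == 0:
        raise ValueError("2**s exceeds the matrix size")
    rng = random.Random(seed)
    # Sample without replacement, then sort to preserve the original ordering.
    return sorted(rng.sample(range(n), p * 2**s))

# Example: with s = 10, a matrix of size 9506 would be reduced to 9 * 1024 = 9216.
keep = indices_to_keep(9506, s=10)
```

Restricting the matrix to `keep` in both dimensions gives a principal submatrix whose size is a multiple of 2**s.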
Another challenge in applying the wavelet preconditioner on off-the-shelf matrices is that
oftentimes the underlying generative process of the matrices is unknown and the dimension-
ality of the wavelet transform to use cannot be determined. For our experiments, we choose
a dimensionality of one.
The iteration counts are tabulated in Table A.1. We used GMRES (no restarts) for this
experiment. IWSPAI and no preconditioning are effective on 13 and 10 out of the 43 cases,
respectively. MMF preconditioning is better on 23 out of the 43 cases.
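A tally of this kind can be sketched as follows. The function name `count_wins` is hypothetical, and crediting every tied method with a win is an assumption of the sketch; the text does not state how ties were handled. `None` stands for the × entries (no convergence).

```python
def count_wins(rows, methods):
    """Count, per method, the rows on which it attains the lowest iteration count."""
    wins = {m: 0 for m in methods}
    for counts in rows:
        finite = [c for c in counts if c is not None]
        if not finite:
            continue  # no method converged on this matrix
        best = min(finite)
        for m, c in zip(methods, counts):
            if c == best:
                wins[m] += 1  # ties credit every tied method (assumption)
    return wins

methods = ("no prec.", "IWSPAI", "MMF prec.")
# Iteration counts for nd3k, nemeth03, net25 and fv2, taken from Table A.1.
rows = [(455, 236, 323), (4, 4, 2), (460, None, None), (20, 20, 27)]
wins = count_wins(rows, methods)
```

On these four rows the sketch credits no preconditioning and IWSPAI with two wins each (one of them a shared tie on fv2) and MMF preconditioning with one.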
IWSPAI is effective on matrices arising in CFD and Statistical/Mathematical problems,
and no preconditioning is sufficient on structural problems. MMF preconditioning outper-
forms the rest on all the graph problems.
Increasing the wavelet transform level increases the accuracy of the wavelet precondition-
ers. In this case, Hawkins and Chen [16] remark that a few iterations of GMRES can be used
in place of reduced QR factorization in Step 4 of algorithm 2 to alleviate the increased setup
time. However, using GMRES defeats the purpose of maintaining higher accuracy with a
higher wavelet transform level. Further, we noted empirically that using GMRES drastically
reduces the efficacy of the implicit wavelet preconditioner - with most of the cases being
non-convergent.
Figure 5.1: Relative residual as a function of iteration number.
Group name #prob. Avg. size Avg. sparsity %wins Description
Rajat 5 19962 4.31 100.0 Circuit simulation
JGD Trefethen 2 19999 27.72 100.0 Assorted combinatorial problems
Oberwolfach 8 50436 25.43 100.0 Model reduction
Um 1 101492 16.23 100.0 Electromagnetics
Boeing 7 25455 1.0 100.0 Structural engineering
QY 1 14454 10.24 100.0 Power network
Botonakis 2 153237 6.97 100.0 Thermal problem
Rothberg 1 53570 21.91 100.0 Structural engineering
TKK 1 13681 52.21 100.0 Structural engineering
Andrews 1 60000 12.67 100.0 Computer vision
DIMACS10 21 135252 4.73 100.0 Graphs
Lourakis 1 10581 72.85 100.0 Computer vision
Mulvey 2 74752 7.99 100.0 Economics
Cunningham 2 38617 25.11 100.0 Acoustics
TSOPF 6 39249 155.61 100.0 Power network
HB 5 26659 45.16 100.0 Assorted
Chen 4 26364 75.89 100.0 Structural engineering
Gupta 2 39423 68.45 100.0 Graphs
Pothen 5 45646 4.9 100.0 Structural engineering
IPSO 1 15435 9.17 100.0 Power network
AG-Monien 7 25197 4.0 85.7 Graphs
GHS indef 15 55174 5.46 80.0 Structural engineering
Andrianov 4 13730 26.35 75.0 Optimization
GHS psdef 13 39094 5.44 69.2 Structural engineering
Bindel 2 10605 13.63 50.0 Thermal problem
Nemeth 25 9506 41.53 4.0 Newton-Schultz iteration
Norris 2 9702 8.88 × Bioengineering
Schenk IBMNA 1 23948 8.46 × Optimization
UTEP 1 16129 15.69 × PDEs
Okunbor 1 8205 15.3 × Acoustics
MaxPlanck 2 81920 4.0 × CFD
Table 5.2: Performance of MMF preconditioning by matrix group. %wins indicates the percentage of times MMFprec resulted in lower GMRES(30) iteration count compared to no preconditioning. × indicates a win percentage of zero. Sparsity is defined as f, where f·n = nnz, n is the size of the matrix, and nnz is the number of nonzeros.
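The sparsity measure used in Table 5.2 is the average number of nonzeros per row. A minimal sketch, with a hypothetical function name and two (nnz, n) pairs taken from the appendix table of off-the-shelf matrices:

```python
def sparsity(nnz: int, n: int) -> float:
    """Average number of nonzeros per row, f = nnz / n."""
    return nnz / n

f_delaunay = round(sparsity(49094, 8192), 1)   # delaunay n13
f_nemeth09 = round(sparsity(395506, 9506), 1)  # nemeth09
```

Rounded to one decimal, these give 6.0 and 41.6, matching the Sparsity column reported for those two matrices.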
5.4 Performance when only an approximate solution is desired
For a model PDE problem, we plot the relative residual as a function of iteration count
in Figure 5.1. We see that the curve corresponding to the MMF preconditioner lies below
the curves for the other preconditioners. This means that an approximate solution can be
determined quickly with the MMF preconditioner.
In Table A.2 we tabulate iteration counts for reaching different values of the GMRES
tolerance on off-the-shelf matrices. Even here, MMF preconditioning is successful on a larger
number of matrices for all values of the tolerance. The starkest difference in performance is
found at a tolerance of $10^{-5}$.
5.5 Wall-clock time comparison
In Table 5.3, we present the wall-clock running times for linear solves with the different
preconditioners on the model PDE problems. In terms of the total time for the linear solve
including preconditioner setup, the MMF preconditioner is consistently better.
Dataset n no prec. (solve) WSPAI (setup) WSPAI (solve) IWSPAI (setup) IWSPAI (solve) MMF prec. (setup) MMF prec. (solve)
1D Laplacian 256 0.3 0.77 0.01 0.8 2e-05 0.01 0.01
1D Laplacian 512 1.35 1.70 0.03 1.73 4.3e-05 0.03 0.02
1D Laplacian 1024 5.18 5.36 0.09 3.79 8.2e-05 0.07 0.02
1D Laplacian 2048 7.80 24.2 0.24 9.9 1.5e-04 0.15 0.02
2D Laplacian 256 0.54 21 0.03 0.26 3.4e-05 0.05 0.04
2D Laplacian 1024 0.08 3.87 0.03 0.32 2.4e-04 0.10 0.02
2D Laplacian 4096 1.65 371 0.46 6.43 4.5e-03 0.44 0.03
3D Laplacian 512 0.01 0.16 0.01 0.16 1.2e-05 0.04 0.01
3D Laplacian 4096 0.17 950 0.31 6.13 3.4e-03 0.61 0.05
2D Disc 256 0.23 0.18 0.30 0.20 3.3e-05 0.01 0.02
2D Disc 1024 2.67 3.77 5.60 0.31 2.9e-04 0.11 0.03
2D Disc 4096 3.96 × × 3.27 3.5e-03 0.41 2.5e-03
Table 5.3: Wall clock running time of preconditioner setup and linear solve times in seconds. × indicates that the desired tolerance was not reached within 1000 iterations.
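The total time per method is the setup time plus the solve time. The sketch below works this out for a single row of Table 5.3 (1D Laplacian, n = 2048). The dictionary layout and names are hypothetical, and no preconditioning is given a setup time of zero.

```python
# (setup, solve) times in seconds for the 1D Laplacian with n = 2048, from Table 5.3.
timings = {
    "no prec.": (0.0, 7.80),
    "WSPAI": (24.2, 0.24),
    "IWSPAI": (9.9, 1.5e-04),
    "MMF prec.": (0.15, 0.02),
}

def total_time(setup: float, solve: float) -> float:
    """Total cost of one linear solve, including preconditioner setup."""
    return setup + solve

totals = {name: total_time(*t) for name, t in timings.items()}
fastest = min(totals, key=totals.get)
```

For this row, the setup cost dominates both wavelet preconditioners, while the MMF total stays well under one second.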
On off-the-shelf matrices, the wall-clock time of the wavelet preconditioners is much
worse. We took a large network matrix SNAP/amazon0302 from the UFlorida repository
and computed the wavelet and MMF preconditioners on submatrices of increasing size.
The results are shown in Figure 5.2. For a matrix with just 80000 nonzeros, the wavelet
preconditioner takes almost 20 minutes to set up, despite the parallelism in the implementation.
Note that we used the most basic parameters while computing the MMF. With proper
tuning, its performance could likely be improved further. The wavelet preconditioners,
in contrast, have only one parameter, namely the level of the wavelet transform,
which leaves little room for tuning.
5.6 Discussion
We observed that MMFprec is, by and large, more effective than wavelet preconditioners.
For a narrow set of problems, i.e., where the matrix size is a multiple of 2 and the
underlying process is known (e.g., low-order PDEs), the wavelet preconditioner is a reasonable
choice of preconditioner when the matrix is not too big. We also empirically noted that
the wavelet preconditioner is more effective on CFD problems.
Figure 5.2: Wall-clock times as a function of matrix size (left) and number of nonzeros (right). Top row corresponds to IWSPAI [16] and bottom to MMF preconditioner.
Our experiments also revealed that the wall-clock time of the wavelet preconditioner scales
linearly in the number of nonzeros, which is the same as MMFprec. Perhaps a highly optimized
implementation of the wavelet preconditioner would vastly expand its applicability. However,
other limitations such as dependence on the underlying dimensionality of the problem and
parity of matrix size still exist.
Then the question arises: when is MMF preconditioning useful? While we observed that
MMF is effective on a range of problems, we also noted that it is especially applicable to graph
problems and circuit simulations, which are known to have hierarchical structure. Further,
when an approximate solution is desired, MMFprec is preferable to no preconditioning.
The “geometry free” nature of the MMF preconditioner makes it more flexible than standard
wavelet preconditioners. In particular, MMF can be applied to matrices of any size, not just
$p\,2^s$. Furthermore, MMF preconditioning is completely invariant to the ordering of the
rows/columns, in contrast to, for example, the multiresolution preconditioner of Bridson
and Tang [5]. The adaptability of MMF makes it suitable for preconditioning a wide variety
of linear systems.
CHAPTER 6
CONCLUSION AND FUTURE WORK
We presented a new multiresolution preconditioner for symmetric linear systems that does
not depend on any geometric assumptions, and hence can be applied to any coefficient
matrix that is assumed to have multiresolution structure, even in the loose sense. Numerical
experiments show the effectiveness of the new preconditioner in a range of problems. In our
experiments we used default parameters, but with fine tuning our results could possibly be
improved further.
6.1 Future work
The following are a few directions this research can take:
• It is not yet clear exactly what kind of matrices the new MMF preconditioner is most
effective on, in part due to the general nature of the pMMF algorithm. It is possible
that specializing MMF to specific types of linear systems would yield even more effective
preconditioners.
• The applicability of MMF preconditioner can be widened greatly by removing the
constraint of symmetry. Developing a non-symmetric MMF algorithm would not only
benefit preconditioning, but also other applications that MMF is currently used for.
• Computing the multiresolution inverse directly: Instead of computing the MMF and
then inverting it, one can try to compute the MMF of the matrix inverse directly. One
way to approach this is to minimize
\[
\left\| I - A\, Q_1^{T} \cdots Q_L^{T}\, H\, Q_L \cdots Q_1 \right\|_{\mathrm{Frob}}
\]
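To make the proposed objective concrete, the sketch below evaluates the Frobenius norm of I − A Q_1^T ... Q_L^T H Q_L ... Q_1 for given factors. It is only an illustration under stated assumptions: the helper names are hypothetical, matrices are small dense lists of rows rather than the sparse rotations and structured core matrix an actual MMF would use, and no minimization is attempted.

```python
import math

def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
            for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse_objective(A, Qs, H):
    """Frobenius norm of I - A Q_1^T ... Q_L^T H Q_L ... Q_1, with Qs = [Q_1, ..., Q_L]."""
    M = H
    for Q in reversed(Qs):  # wrap H with Q_L first, Q_1 last
        M = matmul(matmul(transpose(Q), M), Q)
    AM = matmul(A, M)
    n = len(A)
    return math.sqrt(sum(((1.0 if i == j else 0.0) - AM[i][j]) ** 2
                         for i in range(n) for j in range(n)))

# Toy check with a single rotation Q_1 and H chosen so that Q_1^T H Q_1 = A^{-1}.
c, s = math.cos(0.3), math.sin(0.3)
Q1 = [[c, s], [-s, c]]
A = [[2.0, 0.0], [0.0, 4.0]]
A_inv = [[0.5, 0.0], [0.0, 0.25]]
H = matmul(matmul(Q1, A_inv), transpose(Q1))
value = inverse_objective(A, [Q1], H)
```

In this toy case the factorization reproduces the inverse exactly, so the objective is zero up to rounding; a poor choice of H and rotations gives a correspondingly larger value.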
REFERENCES
[1] Randolph E Bank, Todd F Dupont, and Harry Yserentant. The hierarchical basis
multigrid method. Numerische Mathematik, 52(4):427–458, 1988.
[2] M.W. Benson. Iterative solution of large scale linear systems. Mathematics report.
Thesis (M.Sc.)–Lakehead University, 1973.
[3] Michele Benzi. Preconditioning techniques for large linear systems: a survey. Journal
of Computational Physics, 182(2):418–477, 2002.
[4] Michele Benzi and Miroslav Tuma. A comparative study of sparse approximate inverse
preconditioners. Applied Numerical Mathematics, 30(2-3):305–340, 1999.
[5] Robert Bridson and Wei-Pai Tang. Multiresolution approximate inverse preconditioners.
SIAM Journal on Scientific Computing, 23(2):463–479, 2001.
[6] William L Briggs, Van Emden Henson, and Steve F McCormick. A multigrid tutorial.
SIAM, 2000.
[7] Tony F Chan, Wei Pai Tang, and Wing Lok Wan. Wavelet sparse approximate inverse
preconditioners. BIT Numerical Mathematics, 37(3):644–660, 1997.
[8] Ingrid Daubechies. Ten lectures on wavelets. SIAM, 1992.
[9] Timothy A Davis and Yifan Hu. The University of Florida sparse matrix collection.
ACM Transactions on Mathematical Software (TOMS), 38(1):1, 2011.
[10] Stephen Demko, William F Moss, and Philip W Smith. Decay rates for inverses of band
matrices. Mathematics of Computation, 43(168):491–499, 1984.
[11] Iain S Duff, AM Erisman, CW Gear, and John K Reid. Sparsity structure and Gaussian
elimination. ACM SIGNUM Newsletter, 23(2):2–8, 1988.
[12] Markus Faustmann, Jens Markus Melenk, and Dirk Praetorius. H-matrix approxima-
bility of the inverses of FEM matrices. Numerische Mathematik, 131(4):615–642, 2015.
[13] Lars Grasedyck, Ronald Kriemann, and Sabine Le Borne. Domain decomposition based
LU preconditioning. Numerische Mathematik, 112(4):565–600, 2009.
[14] Marcus J Grote and Thomas Huckle. Parallel preconditioning with sparse approximate
inverses. SIAM Journal on Scientific Computing, 18(3):838–853, 1997.
[15] Wolfgang Hackbusch. A sparse matrix arithmetic based on H-matrices. part I: intro-
duction to H-matrices. Computing, 62(2):89–108, 1999.
[16] Stuart C Hawkins and Ke Chen. An implicit wavelet sparse approximate inverse pre-
conditioner. SIAM Journal on Scientific Computing, 27(2):667–686, 2005.
[17] Risi Kondor, Nedelina Teneva, and Vikas Garg. Multiresolution matrix factorization.
In Proceedings of the 31st International Conference on Machine Learning (ICML-14),
pages 1620–1628, 2014.
[18] Risi Kondor, Nedelina Teneva, and Pramod Kaushik Mudrakarta. Parallel MMF: a
multiresolution approach to matrix computation. CoRR, abs/1507.04396, 2015.
[19] Ronald Kriemann and Sabine Le Borne. H-FAINV: hierarchically factored approximate
inverse preconditioners. Computing and Visualization in Science, 17(3):135–150, 2015.
[20] Stephane G Mallat. A theory for multiresolution signal decomposition: the wavelet
representation. IEEE Transactions on Pattern Analysis and Machine Intelligence,
11(7):674–693, 1989.
[21] Christopher C Paige and Michael A Saunders. Solution of sparse indefinite systems of
linear equations. SIAM Journal on Numerical Analysis, 12(4):617–629, 1975.
[22] Fabio Henrique Pereira, Sergio Luís Lopes Verardi, and Silvio Ikuyo Nabeta. A fast
algebraic multigrid preconditioned conjugate gradient solver. Applied Mathematics and
Computation, 179(1):344–351, 2006.
[23] Alfio Quarteroni and Alberto Valli. Numerical approximation of partial differential
equations, volume 23. Springer Science & Business Media, 2008.
[24] John W Ruge and Klaus Stüben. Algebraic multigrid. Multigrid methods, 3(13):73–130,
1987.
[25] Wilhelmus HA Schilders. Iterative solution of linear systems in circuit simulation. In
Progress in Industrial Mathematics at ECMI 2000, pages 272–277. Springer, 2002.
[26] Wim Sweldens. The lifting scheme: a construction of second generation wavelets. SIAM
Journal on Mathematical Analysis, 29(2):511–546, 1998.
[27] Nedelina Teneva, Pramod Kaushik Mudrakarta, and Risi Kondor. Multiresolution ma-
trix compression. In Artificial Intelligence and Statistics, pages 1441–1449, 2016.
[28] Michael J Todd and Yinyu Ye. A centered projective algorithm for linear programming.
Mathematics of Operations Research, 15(3):508–529, 1990.
[29] Lloyd N Trefethen and David Bau III. Numerical linear algebra, volume 50. SIAM,
1997.
[30] Henk A Van der Vorst. Bi-CGStab: A fast and smoothly converging variant of Bi-
CG for the solution of nonsymmetric linear systems. SIAM Journal on Scientific and
Statistical Computing, 13(2):631–644, 1992.
[31] Charles F Van Loan. The ubiquitous Kronecker product. Journal of Computational
and Applied Mathematics, 123(1):85–100, 2000.
[32] Panayot S Vassilevski and Junping Wang. Stabilizing the hierarchical basis by approx-
imate wavelets, I: theory. Numerical Linear Algebra with Applications, 4(2):103–126,
1997.
[33] Panayot S Vassilevski and Junping Wang. Stabilizing the hierarchical basis by approx-
imate wavelets II: implementation and numerical results. SIAM Journal on Scientific
Computing, 20(2):490–514, 1998.
[34] Yuanzhe Xi, Ruipeng Li, and Yousef Saad. An algebraic multilevel preconditioner with
low-rank corrections for sparse symmetric matrices. SIAM Journal on Matrix Analysis
and Applications, 37(1):235–259, 2016.
[35] Yinyu Ye. Interior-point algorithms for quadratic programming. Recent Developments
in Mathematical Programming, pages 237–261, 1991.
[36] Yinyu Ye. On the finite convergence of interior-point algorithms for linear programming.
Mathematical Programming, 57(1):325–335, 1992.
[37] Harry Yserentant. Hierarchical bases give conjugate gradient type methods a multigrid
speed of convergence. Applied Mathematics and Computation, 19(1-4):347–358, 1986.
[38] Harry Yserentant. On the multi-level splitting of finite element spaces. Numerische
Mathematik, 49(4):379–412, 1986.
APPENDIX A
APPENDIX
A.1 pMMF parameters
-------------- pMMF parameters ---------------
Order of rotations (k): 2
Max number of stages: 30
Rotations per stage (fraction): 0.5
Number to eliminate after each rotation: 1
Target core size: 100
Prenormalize each channel: off
Rotation selection criterion: inner product
Inner products normalized: off
Grams based on diagonal blocks only: off
Clustering method: inner product
Number of clusters: 1
Minimum cluster size: 1
Maximum cluster size: 2000
Maximum clustering depth: 4
Maximum clustering iterations: 8
Bypass option: on
Compute Frobenius error: off
----------------------------------------------
Figure A.1: Full set of parameters used in pMMF for the experiments
A.2 Results on off-the-shelf matrices
Dataset n no prec. IWSPAI MMF prec.
nd3k 8192 455 236 323
nemeth03 9216 4 4 2
net25 9216 460 × ×
fv2 9216 20 20 27
fv3 9216 42 38 52
nemeth12 9216 13 10 3
nemeth11 9216 10 8 3
nemeth09 9216 7 6 3
nemeth14 9216 × × 8
nemeth04 9216 5 4 3
nemeth23 9216 211 × ×
pf2177 9216 174 × ×
bloweybq 9216 × 8 ×
nemeth10 9216 8 7 3
flowmeter0 9216 × × 9
nemeth25 9216 164 × ×
nemeth24 9216 179 × ×
nemeth15 9216 282 × 70
nopoly 10240 119 108 105
bcsstk17 10240 × × 266
bundle1 10240 × × 30
linverse 11264 × 20 ×
t2dah 11264 × × 7
crystm02 13312 1 1 30
Pres Poisson 14336 436 43 114
bcsstm25 14336 × × 2
gyro m 16384 1 1 115
gyro k 16384 × × 220
nd6k 16384 × 270 330
bodyy4 16384 184 147 91
t3dl a 18432 × 141 6
Si5H12 18432 103 71 89
Trefethen 20000b 18432 × × 8
crystm03 24576 1 1 33
spmsrtls 28672 × 150 ×
wathen100 28672 × × 33
wathen120 32768 × × 33
mario001 36864 269 × ×
torsion1 36864 41 29 50
bfly 49152 59 × ×
crankseg 2 57344 × × 246
Ga3As3H12 57344 × × 104
cant 57344 × × 83
# wins 10 13 23
Table A.1: Iteration counts of GMRES solved to a relative
error of $10^{-4}$. × indicates that the method did not achieve
the desired tolerance within 500 iterations.
Name Group N NNZ Sparsity NoPrec MMFPrec
delaunay n13 DIMACS10 8192 49094 6.0 × ×
c-37 Schenk IBMNA 8204 74676 9.1 × ×
ft01 Okunbor 8205 125567 15.3 2 ×
benzene PARSEC 8219 242669 29.5 × ×
hep-th Newman 8361 32253 3.9 × ×
bcsstk33 HB 8738 591904 67.7 × 122
nd3k ND 9000 3279690 364.4 × ×
G66 Gset 9000 36000 4.0 × ×
3elt dual AG-Monien 9000 26556 3.0 × ×
vsp data and seymourl DIMACS10 9167 111732 12.2 × ×
c-39 Schenk IBMNA 9271 116587 12.6 × ×
nemeth09 Nemeth 9506 395506 41.6 11 11
nemeth05 Nemeth 9506 394808 41.5 7 7
nemeth13 Nemeth 9506 474472 49.9 29 ×
nemeth24 Nemeth 9506 1506550 158.5 7 ×
nemeth16 Nemeth 9506 587012 61.8 77 ×
nemeth15 Nemeth 9506 539802 56.8 483 ×
nemeth26 Nemeth 9506 1511760 159.0 5 ×
nemeth12 Nemeth 9506 446818 47.0 20 20
nemeth23 Nemeth 9506 1506810 158.5 11 ×
nemeth07 Nemeth 9506 394812 41.5 9 9
nemeth21 Nemeth 9506 1173746 123.5 26 ×
nemeth08 Nemeth 9506 394816 41.5 10 10
nemeth22 Nemeth 9506 1358832 142.9 17 ×
nemeth01 Nemeth 9506 725054 76.3 × ×
nemeth10 Nemeth 9506 401448 42.2 13 13
nemeth19 Nemeth 9506 818302 86.1 46 ×
nemeth18 Nemeth 9506 695234 73.1 57 ×
nemeth06 Nemeth 9506 394808 41.5 8 8
nemeth25 Nemeth 9506 1511758 159.0 5 ×
nemeth03 Nemeth 9506 394808 41.5 7 6
nemeth11 Nemeth 9506 408264 42.9 16 16
nemeth14 Nemeth 9506 496144 52.2 319 ×
nemeth04 Nemeth 9506 394808 41.5 7 7
nemeth20 Nemeth 9506 971870 102.2 37 ×
nemeth17 Nemeth 9506 629620 66.2 62 ×
nemeth02 Nemeth 9506 394808 41.5 6 6
net25 Andrianov 9520 401200 42.1 437 ×
fv1 Norris 9604 85264 8.9 30 ×
flowmeter0 Oberwolfach 9669 67391 7.0 × ×
pf2177 Andrianov 9728 725144 74.5 × 21
c-41 Schenk IBMNA 9769 101635 10.4 × ×
TSC OPF 300 IPSO 9774 820783 84.0 × ×
whitaker3 AG-Monien 9800 57978 5.9 × 11
fv3 Norris 9801 87025 8.9 × ×
fv2 Norris 9801 87025 8.9 30 ×
ca-HepTh SNAP 9877 51971 5.3 × ×
c-40 Schenk IBMNA 9941 81501 8.2 × ×
G67 Gset 10000 40000 4.0 × ×
bloweybq GHS indef 10001 49999 5.0 × ×
crack AG-Monien 10240 60760 5.9 × 6
hangGlider 3 VDOL 10260 92703 9.0 × ×
sit100 GHS indef 10262 61046 5.9 × ×
shuttle eddy Pothen 10429 103599 9.9 × ×
c-42 Schenk IBMNA 10471 110285 10.5 × ×
vsp p0291 seymourl iias DIMACS10 10498 107736 10.3 × ×
bundle1 Lourakis 10581 770811 72.8 326 57
ed B unscaled Bindel 10605 144579 13.6 12 ×
ed B Bindel 10605 144579 13.6 × 33
PGPgiantcompo Arenas 10680 48632 4.6 × ×
c-44 Schenk IBMNA 10728 85000 7.9 × ×
nopoly Gaertner 10774 70842 6.6 × ×
TSOPF FS b162 c1 TSOPF 10798 608540 56.4 × 14
pkustk02 Chen 10800 810000 75.0 × 15
sc10848 Boeing 10848 1229776 113.4 × ×
rajat06 Rajat 10922 46983 4.3 × 5
wing nodal DIMACS10 10937 150976 13.8 × ×
bcsstk17 HB 10974 428650 39.1 × ×
vsp c-30 data d DIMACS10 11023 124368 11.3 × ×
CurlCurl 0 Bodendiek 11083 113343 10.2 × ×
3plates Cunningham 11107 11105 1.0 33 2
c-43 Schenk IBMNA 11125 123659 11.1 × ×
fe 4elt2 DIMACS10 11143 65636 5.9 × ×
2dah Oberwolfach 11445 176117 15.4 × ×
2dah e Oberwolfach 11445 176117 15.4 × 58
2dah Oberwolfach 11445 176117 15.4 × ×
Oregon-1 SNAP 11492 47127 4.1 × ×
Oregon-2 SNAP 11806 65804 5.6 × ×
bcsstk18 HB 11948 149090 12.5 × ×
linverse GHS indef 11999 95977 8.0 × ×
ca-HepPh SNAP 12008 237010 19.7 × ×
ncvxqp1 GHS indef 12111 73963 6.1 × ×
vibrobox Cote 12328 301700 24.5 × ×
stokes64s GHS indef 12546 140034 11.2 × ×
stokes64 GHS indef 12546 140034 11.2 × ×
skir Pothen 12598 196520 15.6 × 9
uma2 GHS indef 12992 49365 3.8 × 9
c-45 Schenk IBMNA 13206 174452 13.2 × ×
Reuters911 Pajek 13332 296094 22.2 × ×
lowThrust 4 VDOL 13562 160947 11.9 × ×
cbuckle TKK 13681 676515 49.4 × ×
cyl6 TKK 13681 714241 52.2 × 3
pcrystk02 Boeing 13965 968583 69.4 × 92
crystk02 Boeing 13965 968583 69.4 × ×
crystm02 Boeing 13965 322905 23.1 137 47
bcsstk29 HB 13992 619488 44.3 × ×
vsp befref fxm 2 4 air02 DIMACS10 14109 196448 13.9 × ×
case9 QY 14454 147972 10.2 × 25
TSOPF FS b9 c6 TSOPF 14454 147972 10.2 × 24
Pres Poisson ACUSIM 14822 715804 48.3 × ×
rajat07 Rajat 14842 63913 4.3 × 11
c-46 Schenk IBMNA 14913 130397 8.7 × ×
c-47 Schenk IBMNA 15343 211401 13.8 × ×
OPF 3754 IPSO 15435 141478 9.2 × 122
bcsstm25 HB 15439 15439 1.0 × 2
bcsstk25 HB 15439 252241 16.3 × ×
opt1 GHS psdef 15449 1930655 125.0 × 20
hangGlider 4 VDOL 15561 149532 9.6 × ×
barth5 Pothen 15606 107362 6.9 × ×
hangGlider 5 VDOL 16011 155246 9.7 × ×
Dubcova1 UTEP 16129 253009 15.7 320 ×
olafu Simon 16146 1015156 62.9 × ×
lowThrust 5 VDOL 16262 198369 12.2 × ×
net50 Andrianov 16320 945200 57.9 × ×
delaunay n14 DIMACS10 16384 98244 6.0 × ×
fe sphere DIMACS10 16386 98304 6.0 × ×
ncvxqp9 GHS indef 16554 54040 3.3 51 9
pds10 GHS psdef 16558 149658 9.0 × ×
stro-ph Newman 16706 243162 14.6 × ×
cond- Newman 16726 95650 5.7 × ×
ex3sta1 Andrianov 16782 678998 40.5 × 5
gupta3 Gupta 16783 9323427 555.5 × 5
ramage02 GHS psdef 16830 2866352 170.3 × 15
cti DIMACS10 16840 96464 5.7 × ×
pkustk07 Chen 16860 2418804 143.5 × 29
lowThrust 6 VDOL 16928 207349 12.2 × ×
Si10H16 PARSEC 17077 875923 51.3 × ×
copter1 GHS psdef 17222 211064 12.3 × ×
gyro Oberwolfach 17361 1021159 58.8 × ×
gyro k Oberwolfach 17361 1021159 58.8 × ×
gyro Oberwolfach 17361 340431 19.6 × ×
lowThrust 7 VDOL 17378 211561 12.2 × ×
cvxqp3 GHS indef 17500 114962 6.6 × ×
bodyy4 Pothen 17546 121550 6.9 279 213
lowThrust 8 VDOL 17702 216445 12.2 × ×
L-9 AG-Monien 17983 71192 4.0 × ×
nd6k ND 18000 6897316 383.2 × ×
crplat2 DNVS 18010 960946 53.4 × ×
lowThrust 9 VDOL 18044 219589 12.2 × ×
lowThrust 10 VDOL 18260 222005 12.2 × ×
c-48 Schenk IBMNA 18354 166080 9.0 × ×
lowThrust 11 VDOL 18368 223801 12.2 × ×
ndem vtx Pothen 18454 253350 13.7 × 6
lowThrust 12 VDOL 18458 224593 12.2 × ×
lowThrust 13 VDOL 18476 224897 12.2 × ×
bodyy5 Pothen 18589 128853 6.9 × ×
ford1 GHS psdef 18728 101576 5.4 × 5
ca-AstroPh SNAP 18772 396160 21.1 × ×
fxm4 6 Andrianov 18892 497844 26.4 × 153
whitaker3 dual AG-Monien 19190 57162 3.0 × ×
pattern1 Andrianov 19242 9323432 484.5 × ×
rajat08 Rajat 19362 83443 4.3 × 17
bodyy6 Pothen 19366 134208 6.9 × ×
raefsky4 Simon 19779 1316789 66.6 × ×
Si5H12 PARSEC 19896 738598 37.1 × ×
LFAT5000 Oberwolfach 19994 79966 4.0 × 13
LF10000 Oberwolfach 19998 99982 5.0 × 13
Trefethen 20000b JGD Trefethen 19999 554435 27.7 × 11
qpband GHS indef 20000 45000 2.2 9 ×
Trefethen 20000 JGD Trefethen 20000 554466 27.7 × 12
crack dual AG-Monien 20141 60086 3.0 × 2
rail 20209 Oberwolfach 20209 139233 6.9 × ×
3dl e Oberwolfach 20360 20360 1.0 × 2
3dl Oberwolfach 20360 509866 25.0 × ×
3dl Oberwolfach 20360 509866 25.0 × ×
syl201 DNVS 20685 2454957 118.7 × ×
c-49 Schenk IBMNA 21132 157040 7.4 × ×
ube1 TKK 21498 897056 41.7 × ×
ube2 TKK 21498 897056 41.7 × ×
biplane-9 AG-Monien 21701 84076 3.9 × ×
vsp msc10848 300sep 100in 1Kou DIMACS10 21996 2442056 111.0 × ×
pkustk01 Chen 22044 979380 44.4 × ×
rdhei DNVS 22098 1935324 87.6 × ×
pkustk08 Chen 22209 3226671 145.3 × 145
c-50 Schenk IBMNA 22401 180245 8.0 × ×
cs4 DIMACS10 22499 87716 3.9 × ×
pli Li 22695 1350309 59.5 × ×
s-22july06 Newman 22963 96872 4.2 × ×
uma1 GHS indef 22967 87760 3.8 × 28
bcsstk36 Boeing 23052 1143140 49.6 × ×
sc23052 Boeing 23052 1142686 49.6 × ×
bcsstm36 Boeing 23052 331484 14.4 × 169
net75 Andrianov 23120 1489200 64.4 × ×
ca-CondM SNAP 23133 186936 8.1 × ×
c-51 Schenk IBMNA 23196 203048 8.8 × ×
c-52 Schenk IBMNA 23948 202708 8.5 5 ×
stufe-10 AG-Monien 24010 92828 3.9 × 6
de2010 DIMACS10 24115 116056 4.8 × ×
ug3d GHS indef 24300 69984 2.9 267 ×
rajat09 Rajat 24482 105573 4.3 × 23
pcrystk03 Boeing 24696 1751178 70.9 × ×
crystk03 Boeing 24696 1751178 70.9 × ×
crystm03 Boeing 24696 583770 23.6 138 50
dtoc GHS indef 24993 69972 2.8 × ×
hi2010 DIMACS10 25016 124126 5.0 × 460
ri2010 DIMACS10 25181 125750 5.0 × ×
bcsstm37 Boeing 25503 27021 1.1 × 128
bcsstk37 Boeing 25503 1140977 44.7 × ×
s TKK 25710 3749582 145.8 × ×
brainpc2 GHS indef 27607 179395 6.5 × 11
bratu3d GHS indef 27792 173796 6.3 × ×
TSOPF FS b39 c7 TSOPF 28216 730080 25.9 × 13
bcsstk30 HB 28924 2043492 70.7 × 341
ug2d GHS indef 29008 76832 2.6 × ×
TSOPF FS b300 c1 TSOPF 29214 4400122 150.6 × ×
TSOPF FS b300 TSOPF 29214 4400122 150.6 × ×
hread DNVS 29736 4444880 149.5 × ×
OPF 6000 IPSO 29902 274697 9.2 × ×
net100 Andrianov 29920 2033200 68.0 × ×
spmsrtls GHS indef 29995 229947 7.7 × ×
bloweybl GHS indef 30003 110000 3.7 × ×
blowey GHS indef 30004 150009 5.0 × 17
ug2dc GHS indef 30200 80000 2.6 × ×
rajat10 Rajat 30202 130303 4.3 × 31
c-53 Schenk IBMNA 30235 355139 11.7 × ×
bcsstk35 Boeing 30237 1450163 48.0 × ×
bcsstm35 Boeing 30237 35050 1.2 × 21
big dual AG-Monien 30269 89858 3.0 × 10
wathen100 GHS psdef 30401 471601 15.5 × 51
TSOPF FS b162 c3 TSOPF 30798 1801300 58.5 × ×
cond-mat-2003 Newman 31163 240761 7.7 × ×
c-54 Schenk IBMNA 31793 385987 12.1 × ×
gupta1 Gupta 31802 2164210 68.1 × ×
vsp barth5 1Ksep 50in 5Kou DIMACS10 32212 203610 6.3 × ×
helm3d01 GHS indef 32226 428444 13.3 × ×
lpl1 Andrianov 32460 328036 10.1 × ×
vt2010 DIMACS10 32580 155598 4.8 × ×
rgg n 2 15 s0 DIMACS10 32768 320482 9.8 × 4
delaunay n15 DIMACS10 32768 196548 6.0 × 16
se AG-Monien 32768 98300 3.0 × 14
c-55 GHS indef 32780 403450 12.3 × ×
SiO PARSEC 33401 1317655 39.4 × ×
pkustk09 Chen 33960 1583640 46.6 × ×
ship 001 DNVS 34920 3896496 111.6 × ×
ug3dcqp GHS indef 35543 128115 3.6 × ×
bcsstk31 HB 35588 1181416 33.2 × 77
c-56 Schenk IBMNA 35910 380240 10.6 × ×
nd12k ND 36000 14220946 395.0 × ×
pdb1HYS Williams 36417 4344765 119.3 × ×
wathen120 GHS psdef 36441 565761 15.5 × 50
shock-9 AG-Monien 36476 142580 3.9 × ×
pw GHS psdef 36519 326107 8.9 × 5
email-Enron SNAP 36692 367662 10.0 × ×
net125 Andrianov 36720 2577200 70.2 × ×
pkustk05 Chen 37164 2205144 59.3 × ×
finance256 GHS psdef 37376 298496 8.0 × 12
c-58 GHS indef 37595 552551 14.7 × ×
c-57 Schenk IBMNA 37833 403373 10.7 × ×
rio001 GHS indef 38434 204912 5.3 × 8
vsp south31 slptsk DIMACS10 39668 379828 9.6 × ×
jnlbrng1 GHS psdef 40000 199200 5.0 137 ×
obstclae GHS psdef 40000 197608 4.9 60 ×
orsion1 GHS psdef 40000 197608 4.9 60 ×
vsp sctap1-2b and seymourl DIMACS10 40174 281662 7.0 × ×
case39 QY 40216 1042160 25.9 × ×
cond-mat-2005 Newman 40421 352226 8.7 × ×
TSOPF FS b162 c4 TSOPF 40798 2398220 58.8 × 122
insurfo GHS psdef 40806 203622 5.0 91 ×
c-59 GHS indef 41282 480536 11.6 × ×
c-62 Schenk IBMNA 41731 559341 13.4 × ×
c-62ghs GHS indef 41731 559339 13.4 × ×
pkustk06 Chen 43164 2571768 59.6 × ×
net150 Andrianov 43520 3121200 71.7 × ×
c-61 Schenk IBMNA 43618 310016 7.1 × ×
c-60 Schenk IBMNA 43640 298570 6.8 × ×
OPF 10000 IPSO 43887 426898 9.7 × ×
c-63 GHS indef 44234 434704 9.8 × ×
bcsstk32 HB 44609 2014701 45.2 × 3
fe body DIMACS10 45087 327780 7.3 × 5
vsp model1 crew1 cr42 south31 DIMACS10 45101 379952 8.4 × ×
k2010 DIMACS10 45292 217114 4.8 × ×
3dtube Rothberg 45330 3213618 70.9 × ×
bcsstk39 Boeing 46772 2060662 44.1 × ×
bcsstm39 Boeing 46772 46772 1.0 × 2
vanbody GHS psdef 47072 2329056 49.5 × ×
c-65 Schenk IBMNA 48066 360428 7.5 × ×
nh2010 DIMACS10 48837 234550 4.8 × ×
gridgen GHS psdef 48962 512084 10.5 × ×
cc AG-Monien 49152 139264 2.8 × ×
ccc AG-Monien 49152 147456 3.0 × ×
bfly AG-Monien 49152 196608 4.0 152 ×
stokes128 GHS indef 49666 558594 11.2 × ×
c-66b Schenk IBMNA 49989 444851 8.9 × ×
c-66 Schenk IBMNA 49989 444853 8.9 × ×
sparsine GHS indef 50000 1548988 31.0 × ×
cvxbqp1 GHS psdef 50000 349968 7.0 × ×
ncvxbqp1 GHS indef 50000 349968 7.0 × ×
c-64 Schenk IBMNA 51035 707985 13.9 × ×
c-64b Schenk IBMNA 51035 707601 13.9 × ×
dawson5 GHS indef 51537 1010777 19.6 × ×
pct20stif Boeing 52329 2698463 51.6 × ×
ct20stif Boeing 52329 2600295 49.7 × ×
dictionary28 Pajek 52652 191400 3.6 × ×
crankseg 1 GHS psdef 52804 10614210 201.0 × ×
struct3 Rothberg 53570 1173694 21.9 × 58
nasasrb Nasa 54870 2677324 48.8 × ×
srb1 GHS psdef 54924 2962152 53.9 × ×
copter2 GHS psdef 55476 759952 13.7 × 18
pkustk04 Chen 55590 4218660 75.9 × 390
vsp bump2 e18 aa01 model1 crew1 DIMACS10 56438 601602 10.7 × ×
TSOPF FS b300 c2 TSOPF 56814 8767466 154.3 × 244
c-67b Schenk IBMNA 57975 530583 9.2 × ×
c-67 Schenk IBMNA 57975 530229 9.1 × ×
vsp bcsstk30 500sep 10in 1Kou DIMACS10 58348 4033167 69.1 × ×
dixmaanl GHS indef 60000 299998 5.0 × ×
Andrews Andrews 60000 760154 12.7 181 176
60k DIMACS10 60005 178880 3.0 × ×
5esindl GHS indef 60008 255004 4.2 × ×
blockqp1 GHS indef 60012 640033 10.7 10 ×
Ga3As3H12 PARSEC 61349 5970947 97.3 × ×
GaAsH6 PARSEC 61349 3381809 55.1 × ×
wing DIMACS10 62032 243088 3.9 × 5
gupta2 Gupta 62064 4248286 68.5 × 184
can Williams 62451 4007383 64.2 × ×
ncvxqp5 GHS indef 62500 424966 6.8 × 3
brack2 AG-Monien 62631 733118 11.7 × ×
pkustk03 Chen 63336 3130416 49.4 × ×
crankseg 2 GHS psdef 63838 14148858 221.6 × ×
c-68 GHS indef 64810 565996 8.7 × ×
Dubcova2 UTEP 65025 1030225 15.8 × ×
rgg n 2 16 s0 DIMACS10 65536 684258 10.4 × ×
delaunay n16 DIMACS10 65536 393150 6.0 × 16
qa8f Cunningham 66127 1660579 25.1 75 60
qa8fk Cunningham 66127 1660579 25.1 × ×
gas sensor Oberwolfach 66917 1703365 25.5 × 9
H2O PARSEC 67024 2216736 33.1 × ×
c-69 GHS indef 67458 623914 9.2 × ×
ct2010 DIMACS10 67578 336352 5.0 × 28
k1 san GHS indef 67759 559775 8.3 × ×
c-70 GHS indef 68924 658986 9.6 × ×
e2010 DIMACS10 69518 335476 4.8 × ×
cfd1 Rothberg 70656 1825580 25.8 × ×
F2 Koutsovasilis 71505 5294285 74.0 × ×
oilpan GHS psdef 73752 2148558 29.1 × ×
finan512 Mulvey 74752 596992 8.0 54 27
pfinan512 Mulvey 74752 596992 8.0 × 13
ncvxqp3 GHS indef 75000 499964 6.7 × 7
TSOPF FS b39 c19 TSOPF 76216 1977600 25.9 × ×
c-71 GHS indef 76638 859520 11.2 × ×
vsp vibrobox scagr7-2c rlfddd DIMACS10 77328 871172 11.3 × ×
fe tooth DIMACS10 78136 905182 11.6 × ×
3dh e Oberwolfach 79171 4352105 55.0 × ×
3dh Oberwolfach 79171 4352105 55.0 × 9
3dh Oberwolfach 79171 4352105 55.0 × 7
rail 79841 Oberwolfach 79841 553921 6.9 × ×
0nsdsil GHS indef 80016 355034 4.4 × ×
2nnsnsl GHS indef 80016 347222 4.3 × ×
cont-201 GHS indef 80595 438795 5.4 × 14
pkustk10 Chen 80676 4308984 53.4 × ×
pache1 GHS psdef 80800 542184 6.7 × ×
shallow water2 MaxPlanck 81920 327680 4.0 35 ×
shallow water1 MaxPlanck 81920 327680 4.0 19 ×
hermal1 Schmid 82654 574458 7.0 × ×
consph Williams 83334 6010480 72.1 × ×
c-72 GHS indef 84064 707546 8.4 × ×
TSOPF FS b300 c3 TSOPF 84414 13135930 155.6 × 2
nv2010 DIMACS10 84538 416998 4.9 × ×
onera dual Pothen 85567 419201 4.9 × 3
vsp c-60 data cti cs4 DIMACS10 85830 482160 5.6 × ×
wy2010 DIMACS10 86204 427586 5.0 × 28
ncvxqp7 GHS indef 87500 574962 6.6 × 9
pkustk11 Chen 87804 5217912 59.4 × ×
olesnik0 GHS indef 88263 744216 8.4 × 11
net4-1 Andrianov 88343 2441727 27.6 × ×
sd2010 DIMACS10 88360 410722 4.6 × ×
denormal Castrillon 89400 1156224 12.9 × ×
s3dkt3m2 GHS psdef 90449 3686223 40.8 × ×
s3dkq4m2 GHS psdef 90449 4427725 49.0 × ×
s4dkt3m2 TKK 90449 3753461 41.5 × ×
boyd1 GHS indef 93279 1211231 13.0 × ×
ndem dual Pothen 94069 460493 4.9 × 28
pkustk12 Chen 94653 7512317 79.4 × ×
pkustk13 Chen 94893 6616827 69.7 × ×
Si34H36 PARSEC 97569 5156379 52.8 × ×
t1 DNVS 97578 9753570 100.0 × ×
fe rotor DIMACS10 99617 1324862 13.3 × ×
G n pin pou DIMACS10 100000 1002401 10.0 × ×
preferentialAttachmen DIMACS10 100000 999970 10.0 × ×
smallworld DIMACS10 100000 999996 10.0 × ×
ford2 GHS psdef 100196 544688 5.4 × 24
vsp mod2 pgp2 slptsk DIMACS10 101364 778736 7.7 × ×
2cubes sphere Um 101492 1647264 16.2 × 36
hermomech TK Botonakis 102158 711558 7.0 × ×
hermomech TC Botonakis 102158 711558 7.0 76 23
filter3D Oberwolfach 106437 2707179 25.4 × 13
x104 DNVS 108384 8713602 80.4 × ×
598 DIMACS10 110971 1483868 13.4 × ×
Ge87H76 PARSEC 112985 7892195 69.9 × ×
Ge99H100 PARSEC 112985 8451395 74.8 × ×
Ga10As10H30 PARSEC 113081 6115633 54.1 × ×
luxembourg os DIMACS10 114599 239332 2.1 × ×
shipsec8 DNVS 114919 3303553 28.7 × ×
ut2010 DIMACS10 115406 572066 5.0 × 59
TSOPF FS b39 c30 TSOPF 120216 3121160 26.0 × ×
cop20k A Williams 121192 2645680 21.8 × ×
ship 003 DNVS 121728 3777036 31.0 × ×
cfd2 Rothberg 123440 3085406 25.0 × ×
usroads-48 Gleich 126146 323900 2.6 × ×
boneS01 Oberwolfach 127224 5516602 43.4 × ×
usroads Gleich 129164 330870 2.6 × ×
rgg n 2 17 s0 DIMACS10 131072 1457508 11.1 × ×
delaunay n17 DIMACS10 131072 786352 6.0 × 24
2010 DIMACS10 132288 638668 4.8 × ×
Ga19As19H42 PARSEC 133123 8884839 66.7 × ×
nd2010 DIMACS10 133769 625946 4.7 × ×
wv2010 DIMACS10 135218 662922 4.9 × 14
vsp finan512 scagr7-2c rlfddd DIMACS10 139752 1104040 7.9 × ×
shipsec1 DNVS 140874 3568176 25.3 × ×
bmw7st 1 GHS psdef 141347 7318399 51.8 × ×
fe ocean DIMACS10 143437 819186 5.7 × ×
engine TKK 143571 4706073 32.8 × ×
144 DIMACS10 144649 2148786 14.9 × ×
d2010 DIMACS10 145247 700378 4.8 × ×
Dubcova3 UTEP 146689 3636643 24.8 × ×
bmwcra 1 GHS psdef 148770 10641602 71.5 × ×
id2010 DIMACS10 149842 728264 4.9 × ×
G2 circui AMD 150102 726674 4.8 × ×
pkustk14 Chen 151926 14836504 97.7 × ×
gearbox Rothberg 153746 9080404 59.1 × ×
SiO2 PARSEC 155331 11283503 72.6 × ×
wave AG-Monien 156317 2118662 13.6 × ×
2010 DIMACS10 157508 776610 4.9 × 26
ky2010 DIMACS10 161672 787778 4.9 × 20
nm2010 DIMACS10 168609 830970 4.9 × 30
c-73 Schenk IBMNA 169422 1279274 7.6 × ×
nj2010 DIMACS10 169588 829912 4.9 × ×
s2010 DIMACS10 171778 839980 4.9 × ×
shipsec5 DNVS 179860 4598604 25.6 × ×
cont-300 GHS indef 180895 988195 5.5 × 15
sc2010 DIMACS10 181908 893160 4.9 × ×
d pretok GHS indef 182730 1641672 9.0 × ×
Si41Ge41H72 PARSEC 185639 15011265 80.9 × ×
r2010 DIMACS10 186211 904310 4.9 × ×
uron GHS indef 189924 1690876 8.9 × ×
caidaRouterLevel DIMACS10 192244 1218132 6.3 × ×
ne2010 DIMACS10 193352 913854 4.7 × 21
wa2010 DIMACS10 195574 947432 4.8 × 15
or2010 DIMACS10 196621 979512 5.0 × ×
fullb DNVS 199187 11708077 58.8 × ×
co2010 DIMACS10 201062 974574 4.8 × 18
fcondp2 DNVS 201822 11294316 56.0 × ×
hermomech dM Botonakis 204316 1423116 7.0 76 22
la2010 DIMACS10 204447 980634 4.8 × ×
roll DNVS 213453 11985111 56.1 × ×
14b DIMACS10 214765 3358036 15.6 × ×
ia2010 DIMACS10 216007 1021170 4.7 × 20
pwtk Boeing 217918 11524432 52.9 × ×
hood GHS psdef 220542 9895422 44.9 × ×
CO PARSEC 221119 7666057 34.7 × ×
halfb DNVS 224617 12387821 55.2 × ×
HTC 336 4438 IPSO 226340 783496 3.5 × ×
HTC 336 9129 IPSO 226340 762969 3.4 × ×
CurlCurl 1 Bodendiek 226451 2472071 10.9 × ×
coAuthorsCiteseer DIMACS10 227320 1628268 7.2 × ×
bmw3 2 GHS indef 227362 11288630 49.7 × ×
ks2010 DIMACS10 238600 1121798 4.7 × 24
n2010 DIMACS10 240116 1193966 5.0 × 18
Si87H76 PARSEC 240369 10661631 44.4 × ×
z2010 DIMACS10 241666 1196094 4.9 × ×
BenElechi1 BenElechi 245874 13150496 53.5 × ×
l2010 DIMACS10 252266 1230482 4.9 × ×
wi2010 DIMACS10 253096 1209404 4.8 × ×
Lin Lin 256000 1766400 6.9 × ×
n2010 DIMACS10 259777 1227102 4.7 × ×
offshore Um 259789 4242673 16.3 × ×
rgg n 2 18 s0 DIMACS10 262144 3094569 11.8 × ×
delaunay n18 DIMACS10 262144 1572792 6.0 × ×
in2010 DIMACS10 267071 1281716 4.8 × ×
citationCiteseer DIMACS10 268495 2313294 8.6 × ×
ok2010 DIMACS10 269118 1274148 4.7 × 15
va2010 DIMACS10 285762 1402128 4.9 × ×
nc2010 DIMACS10 288987 1416620 4.9 × ×
ga2010 DIMACS10 291086 1418056 4.9 × ×
3Dspectralwave2 Sinclair 292008 12859992 44.0 × ×
coAuthorsDBLP DIMACS10 299067 1955352 6.5 × ×
dblp-2010 LAW 326186 1640936 5.0 × ×
i2010 DIMACS10 329885 1578090 4.8 × ×
o2010 DIMACS10 343565 1656568 4.8 × ×
ny2010 DIMACS10 350169 1709546 4.9 × ×
oh2010 DIMACS10 365344 1768240 4.8 × ×
darcy003 GHS indef 389874 2097566 5.4 × ×
rio002 GHS indef 389874 2097566 5.4 × ×
helm2d03 GHS indef 392257 2741935 7.0 × ×
pa2010 DIMACS10 421545 2058462 4.9 × ×
uto DIMACS10 448695 6629222 14.8 × ×
il2010 DIMACS10 451554 2164464 4.8 × ×
fl2010 DIMACS10 484481 2346294 4.8 × ×
f 2 k101 Schenk AFE 503625 17550675 34.8 × ×
f 3 k101 Schenk AFE 503625 17550675 34.8 × ×
f 1 k101 Schenk AFE 503625 17550675 34.8 × ×
rgg n 2 19 s0 DIMACS10 524288 6539536 12.5 × ×
delaunay n19 DIMACS10 524288 3145646 6.0 × ×
parabolic fe Wissgott 525825 3674625 7.0 × ×
Wins 31 110
Converged 53 120
Table A.2: Iteration counts of GMRES(30) with tolerance $10^{-9}$ on off-the-shelf sparse matrices
Name N $10^{-3}$ $10^{-5}$ $10^{-7}$ $10^{-9}$
delaunay n13 8192 MMFPrec MMFPrec Both Both
c-37 8204 Both Both Both Both
ft01 8205 NoPrec NoPrec NoPrec NoPrec
benzene 8219 NoPrec Both Both Both
hep-th 8361 MMFPrec MMFPrec Both Both
bcsstk33 8738 MMFPrec MMFPrec MMFPrec MMFPrec
nd3k 9000 NoPrec NoPrec Both Both
G66 9000 MMFPrec Both Both Both
3elt dual 9000 MMFPrec MMFPrec MMFPrec Both
vsp data and seymourl 9167 NoPrec Both Both Both
c-39 9271 NoPrec Both Both Both
nemeth09 9506 Both Both Both Both
nemeth05 9506 MMFPrec Both Both Both
nemeth13 9506 MMFPrec Both NoPrec NoPrec
nemeth24 9506 NoPrec NoPrec NoPrec NoPrec
nemeth16 9506 NoPrec NoPrec NoPrec NoPrec
nemeth15 9506 NoPrec NoPrec NoPrec NoPrec
nemeth26 9506 NoPrec NoPrec NoPrec NoPrec
nemeth12 9506 MMFPrec Both Both Both
nemeth23 9506 NoPrec NoPrec NoPrec NoPrec
nemeth07 9506 Both Both Both Both
nemeth21 9506 NoPrec NoPrec NoPrec NoPrec
nemeth08 9506 Both Both Both Both
nemeth22 9506 NoPrec NoPrec NoPrec NoPrec
nemeth01 9506 NoPrec NoPrec NoPrec Both
nemeth10 9506 Both Both MMFPrec Both
nemeth19 9506 NoPrec NoPrec NoPrec NoPrec
nemeth18 9506 NoPrec NoPrec NoPrec NoPrec
nemeth06 9506 Both Both Both Both
nemeth25 9506 NoPrec NoPrec NoPrec NoPrec
nemeth03 9506 Both Both Both MMFPrec
nemeth11 9506 Both MMFPrec MMFPrec Both
nemeth14 9506 NoPrec NoPrec NoPrec NoPrec
nemeth04 9506 Both Both Both Both
nemeth20 9506 NoPrec NoPrec NoPrec NoPrec
nemeth17 9506 NoPrec NoPrec NoPrec NoPrec
nemeth02 9506 Both Both Both Both
net25 9520 NoPrec NoPrec NoPrec NoPrec
fv1 9604 NoPrec NoPrec NoPrec NoPrec
flowmeter0 9669 NoPrec Both Both Both
pf2177 9728 MMFPrec MMFPrec MMFPrec MMFPrec
c-41 9769 Both Both Both Both
TSC OPF 300 9774 MMFPrec MMFPrec MMFPrec Both
whitaker3 9800 MMFPrec MMFPrec MMFPrec MMFPrec
fv3 9801 NoPrec NoPrec NoPrec Both
fv2 9801 NoPrec NoPrec NoPrec NoPrec
ca-HepTh 9877 MMFPrec MMFPrec MMFPrec Both
c-40 9941 Both Both Both Both
G67 10000 MMFPrec MMFPrec Both Both
bloweybq 10001 NoPrec NoPrec Both Both
crack 10240 MMFPrec MMFPrec MMFPrec MMFPrec
hangGlider 3 10260 MMFPrec Both Both Both
sit100 10262 Both Both Both Both
shuttle eddy 10429 MMFPrec MMFPrec Both Both
c-42 10471 Both Both Both Both
vsp p0291 seymourl iias 10498 NoPrec Both Both Both
bundle1 10581 MMFPrec MMFPrec MMFPrec MMFPrec
ed B unscaled 10605 NoPrec NoPrec NoPrec NoPrec
ed B 10605 NoPrec NoPrec MMFPrec MMFPrec
PGPgiantcompo 10680 MMFPrec MMFPrec Both Both
c-44 10728 Both Both Both Both
nopoly 10774 NoPrec NoPrec Both Both
TSOPF FS b162 c1 10798 MMFPrec MMFPrec MMFPrec MMFPrec
pkustk02 10800 MMFPrec MMFPrec MMFPrec MMFPrec
sc10848 10848 NoPrec NoPrec Both Both
rajat06 10922 MMFPrec MMFPrec MMFPrec MMFPrec
wing nodal 10937 NoPrec Both Both Both
bcsstk17 10974 MMFPrec MMFPrec Both Both
vsp c-30 data d 11023 NoPrec Both Both Both
CurlCurl 0 11083 MMFPrec MMFPrec Both Both
3plates 11107 MMFPrec MMFPrec MMFPrec MMFPrec
c-43 11125 Both Both Both Both
fe 4elt2 11143 NoPrec Both Both Both
2dah 11445 MMFPrec MMFPrec Both Both
2dah e 11445 MMFPrec MMFPrec MMFPrec MMFPrec
2dah 11445 MMFPrec MMFPrec Both Both
Oregon-1 11492 NoPrec Both Both Both
Oregon-2 11806 NoPrec Both Both Both
bcsstk18 11948 MMFPrec MMFPrec Both Both
linverse 11999 NoPrec Both Both Both
ca-HepPh 12008 Both Both Both Both
ncvxqp1 12111 MMFPrec MMFPrec MMFPrec Both
vibrobox 12328 NoPrec MMFPrec Both Both
stokes64s 12546 MMFPrec Both Both Both
stokes64 12546 MMFPrec MMFPrec Both Both
skir 12598 MMFPrec MMFPrec MMFPrec MMFPrec
uma2 12992 MMFPrec MMFPrec MMFPrec MMFPrec
c-45 13206 NoPrec Both Both Both
Reuters911 13332 Both Both Both Both
lowThrust 4 13562 MMFPrec Both Both Both
cbuckle 13681 NoPrec MMFPrec Both Both
cyl6 13681 MMFPrec MMFPrec MMFPrec MMFPrec
pcrystk02 13965 MMFPrec MMFPrec MMFPrec MMFPrec
crystk02 13965 MMFPrec MMFPrec Both Both
crystm02 13965 MMFPrec MMFPrec MMFPrec MMFPrec
bcsstk29 13992 MMFPrec MMFPrec MMFPrec Both
vsp befref fxm 2 4 air02 14109 NoPrec Both Both Both
case9 14454 MMFPrec MMFPrec MMFPrec MMFPrec
TSOPF FS b9 c6 14454 MMFPrec MMFPrec MMFPrec MMFPrec
Pres Poisson 14822 MMFPrec Both Both Both
rajat07 14842 MMFPrec MMFPrec MMFPrec MMFPrec
c-46 14913 Both Both Both Both
c-47 15343 Both Both Both Both
OPF 3754 15435 MMFPrec MMFPrec MMFPrec MMFPrec
bcsstm25 15439 MMFPrec MMFPrec MMFPrec MMFPrec
bcsstk25 15439 NoPrec MMFPrec Both Both
opt1 15449 MMFPrec MMFPrec MMFPrec MMFPrec
hangGlider 4 15561 MMFPrec Both Both Both
barth5 15606 NoPrec MMFPrec Both Both
hangGlider 5 16011 MMFPrec MMFPrec MMFPrec Both
Dubcova1 16129 NoPrec NoPrec NoPrec NoPrec
olafu 16146 NoPrec Both Both Both
lowThrust 5 16262 MMFPrec MMFPrec MMFPrec Both
net50 16320 NoPrec NoPrec NoPrec Both
delaunay n14 16384 MMFPrec Both Both Both
fe sphere 16386 MMFPrec Both Both Both
ncvxqp9 16554 MMFPrec MMFPrec NoPrec MMFPrec
pds10 16558 MMFPrec MMFPrec Both Both
stro-ph 16706 Both Both Both Both
cond- 16726 Both Both Both Both
ex3sta1 16782 MMFPrec MMFPrec MMFPrec MMFPrec
gupta3 16783 MMFPrec MMFPrec MMFPrec MMFPrec
ramage02 16830 MMFPrec MMFPrec MMFPrec MMFPrec
cti 16840 NoPrec Both Both Both
pkustk07 16860 MMFPrec MMFPrec MMFPrec MMFPrec
lowThrust 6 16928 MMFPrec Both Both Both
Si10H16 17077 NoPrec Both Both Both
copter1 17222 NoPrec Both Both Both
gyro 17361 MMFPrec Both Both Both
gyro k 17361 MMFPrec Both Both Both
gyro 17361 MMFPrec MMFPrec Both Both
lowThrust 7 17378 Both Both Both Both
cvxqp3 17500 MMFPrec MMFPrec Both Both
bodyy4 17546 MMFPrec MMFPrec MMFPrec MMFPrec
lowThrust 8 17702 Both Both Both Both
L-9 17983 MMFPrec MMFPrec MMFPrec Both
nd6k 18000 NoPrec NoPrec Both Both
crplat2 18010 NoPrec MMFPrec Both Both
lowThrust 9 18044 MMFPrec Both Both Both
lowThrust 10 18260 MMFPrec Both Both Both
c-48 18354 NoPrec Both Both Both
lowThrust 11 18368 MMFPrec Both Both Both
ndem vtx 18454 MMFPrec MMFPrec MMFPrec MMFPrec
lowThrust 12 18458 MMFPrec MMFPrec MMFPrec Both
lowThrust 13 18476 MMFPrec Both Both Both
bodyy5 18589 MMFPrec MMFPrec MMFPrec Both
ford1 18728 MMFPrec MMFPrec MMFPrec MMFPrec
ca-AstroPh 18772 MMFPrec MMFPrec Both Both
fxm4 6 18892 MMFPrec MMFPrec MMFPrec MMFPrec
whitaker3 dual 19190 MMFPrec Both Both Both
pattern1 19242 NoPrec Both Both Both
rajat08 19362 MMFPrec MMFPrec MMFPrec MMFPrec
bodyy6 19366 MMFPrec MMFPrec Both Both
raefsky4 19779 NoPrec NoPrec Both Both
Si5H12 19896 NoPrec Both Both Both
LFAT5000 19994 MMFPrec MMFPrec MMFPrec MMFPrec
LF10000 19998 MMFPrec MMFPrec MMFPrec MMFPrec
Trefethen 20000b 19999 MMFPrec MMFPrec MMFPrec MMFPrec
qpband 20000 NoPrec NoPrec NoPrec NoPrec
Trefethen 20000 20000 MMFPrec MMFPrec MMFPrec MMFPrec
crack dual 20141 MMFPrec MMFPrec MMFPrec MMFPrec
rail 20209 20209 NoPrec Both Both Both
3dl e 20360 MMFPrec MMFPrec MMFPrec MMFPrec
3dl 20360 MMFPrec MMFPrec Both Both
3dl 20360 MMFPrec MMFPrec Both Both
syl201 20685 NoPrec Both Both Both
c-49 21132 Both Both Both Both
ube1 21498 NoPrec Both Both Both
ube2 21498 NoPrec Both Both Both
biplane-9 21701 NoPrec Both Both Both
vsp msc10848 300sep 100in 1Kou 21996 NoPrec Both Both Both
pkustk01 22044 NoPrec Both Both Both
rdhei 22098 NoPrec Both Both Both
pkustk08 22209 MMFPrec MMFPrec MMFPrec MMFPrec
c-50 22401 NoPrec Both Both Both
cs4 22499 MMFPrec MMFPrec MMFPrec Both
pli 22695 MMFPrec Both Both Both
s-22july06 22963 NoPrec Both Both Both
uma1 22967 MMFPrec MMFPrec MMFPrec MMFPrec
bcsstk36 23052 MMFPrec MMFPrec Both Both
sc23052 23052 MMFPrec MMFPrec Both Both
bcsstm36 23052 MMFPrec MMFPrec MMFPrec MMFPrec
net75 23120 MMFPrec NoPrec Both Both
ca-CondM 23133 Both Both Both Both
c-51 23196 Both Both Both Both
c-52 23948 NoPrec NoPrec NoPrec NoPrec
stufe-10 24010 MMFPrec MMFPrec MMFPrec MMFPrec
de2010 24115 MMFPrec MMFPrec Both Both
ug3d 24300 NoPrec NoPrec NoPrec NoPrec
rajat09 24482 MMFPrec MMFPrec MMFPrec MMFPrec
pcrystk03 24696 NoPrec Both Both Both
crystk03 24696 MMFPrec MMFPrec Both Both
crystm03 24696 MMFPrec MMFPrec MMFPrec MMFPrec
dtoc 24993 Both Both Both Both
hi2010 25016 MMFPrec MMFPrec MMFPrec MMFPrec
ri2010 25181 MMFPrec MMFPrec Both Both
bcsstm37 25503 MMFPrec MMFPrec MMFPrec MMFPrec
bcsstk37 25503 MMFPrec MMFPrec Both Both
s 25710 MMFPrec Both Both Both
brainpc2 27607 MMFPrec MMFPrec MMFPrec MMFPrec
bratu3d 27792 MMFPrec Both Both Both
TSOPF FS b39 c7 28216 MMFPrec MMFPrec MMFPrec MMFPrec
bcsstk30 28924 MMFPrec MMFPrec MMFPrec MMFPrec
ug2d 29008 Both Both Both Both
TSOPF FS b300 c1 29214 MMFPrec MMFPrec Both Both
TSOPF FS b300 29214 MMFPrec Both Both Both
hread 29736 NoPrec Both Both Both
OPF 6000 29902 MMFPrec MMFPrec Both Both
net100 29920 MMFPrec NoPrec Both Both
spmsrtls 29995 Both Both Both Both
bloweybl 30003 MMFPrec MMFPrec Both Both
blowey 30004 MMFPrec MMFPrec MMFPrec MMFPrec
ug2dc 30200 Both Both Both Both
rajat10 30202 MMFPrec MMFPrec MMFPrec MMFPrec
c-53 30235 Both Both Both Both
bcsstk35 30237 MMFPrec MMFPrec Both Both
bcsstm35 30237 MMFPrec MMFPrec MMFPrec MMFPrec
big dual 30269 MMFPrec MMFPrec MMFPrec MMFPrec
wathen100 30401 MMFPrec MMFPrec MMFPrec MMFPrec
TSOPF FS b162 c3 30798 NoPrec Both Both Both
cond-mat-2003 31163 Both Both Both Both
c-54 31793 NoPrec NoPrec Both Both
gupta1 31802 MMFPrec Both Both Both
vsp barth5 1Ksep 50in 5Kou 32212 NoPrec Both Both Both
helm3d01 32226 NoPrec Both Both Both
lpl1 32460 MMFPrec Both Both Both
vt2010 32580 Both Both Both Both
rgg n 2 15 s0 32768 MMFPrec MMFPrec MMFPrec MMFPrec
delaunay n15 32768 MMFPrec MMFPrec MMFPrec MMFPrec
se 32768 MMFPrec MMFPrec MMFPrec MMFPrec
c-55 32780 Both Both Both Both
SiO 33401 NoPrec Both Both Both
pkustk09 33960 MMFPrec MMFPrec Both Both
ship 001 34920 MMFPrec MMFPrec Both Both
ug3dcqp 35543 MMFPrec MMFPrec Both Both
bcsstk31 35588 MMFPrec MMFPrec MMFPrec MMFPrec
c-56 35910 NoPrec Both Both Both
nd12k 36000 NoPrec NoPrec Both Both
pdb1HYS 36417 MMFPrec NoPrec Both Both
wathen120 36441 MMFPrec MMFPrec MMFPrec MMFPrec
shock-9 36476 MMFPrec MMFPrec Both Both
pw 36519 MMFPrec MMFPrec MMFPrec MMFPrec
email-Enron 36692 Both Both Both Both
net125 36720 MMFPrec NoPrec Both Both
pkustk05 37164 MMFPrec MMFPrec MMFPrec Both
finance256 37376 MMFPrec MMFPrec MMFPrec MMFPrec
c-58 37595 Both Both Both Both
c-57 37833 Both Both Both Both
rio001 38434 MMFPrec MMFPrec MMFPrec MMFPrec
vsp south31 slptsk 39668 NoPrec Both Both Both
jnlbrng1 40000 NoPrec NoPrec NoPrec NoPrec
obstclae 40000 NoPrec NoPrec NoPrec NoPrec
orsion1 40000 NoPrec NoPrec NoPrec NoPrec
vsp sctap1-2b and seymourl 40174 Both Both Both Both
case39 40216 MMFPrec MMFPrec MMFPrec Both
cond-mat-2005 40421 Both Both Both Both
TSOPF FS b162 c4 40798 MMFPrec MMFPrec MMFPrec MMFPrec
insurfo 40806 NoPrec NoPrec NoPrec NoPrec
c-59 41282 Both Both Both Both
c-62 41731 MMFPrec MMFPrec Both Both
c-62ghs 41731 Both Both Both Both
pkustk06 43164 MMFPrec MMFPrec Both Both
net150 43520 NoPrec NoPrec Both Both
c-61 43618 Both Both Both Both
c-60 43640 Both Both Both Both
OPF 10000 43887 MMFPrec MMFPrec Both Both
c-63 44234 Both Both Both Both
bcsstk32 44609 MMFPrec MMFPrec MMFPrec MMFPrec
fe body 45087 MMFPrec MMFPrec MMFPrec MMFPrec
vsp model1 crew1 cr42 south31 45101 NoPrec Both Both Both
k2010 45292 MMFPrec MMFPrec MMFPrec Both
3dtube 45330 MMFPrec MMFPrec Both Both
bcsstk39 46772 NoPrec MMFPrec Both Both
bcsstm39 46772 MMFPrec MMFPrec MMFPrec MMFPrec
vanbody 47072 NoPrec MMFPrec Both Both
c-65 48066 Both Both Both Both
nh2010 48837 MMFPrec MMFPrec Both Both
gridgen 48962 NoPrec Both Both Both
cc 49152 NoPrec Both Both Both
ccc 49152 NoPrec Both Both Both
bfly 49152 NoPrec NoPrec NoPrec NoPrec
stokes128 49666 MMFPrec MMFPrec MMFPrec Both
c-66b 49989 Both Both Both Both
c-66 49989 Both Both Both Both
sparsine 50000 NoPrec Both Both Both
cvxbqp1 50000 MMFPrec MMFPrec Both Both
ncvxbqp1 50000 NoPrec Both Both Both
c-64 51035 Both Both Both Both
c-64b 51035 Both Both Both Both
dawson5 51537 Both Both Both Both
pct20stif 52329 MMFPrec Both Both Both
ct20stif 52329 MMFPrec MMFPrec Both Both
dictionary28 52652 Both Both Both Both
crankseg 1 52804 NoPrec NoPrec Both Both
struct3 53570 MMFPrec MMFPrec MMFPrec MMFPrec
nasasrb 54870 MMFPrec MMFPrec Both Both
srb1 54924 MMFPrec MMFPrec Both Both
copter2 55476 MMFPrec MMFPrec MMFPrec MMFPrec
pkustk04 55590 MMFPrec MMFPrec MMFPrec MMFPrec
vsp bump2 e18 aa01 model1 crew1 56438 Both Both Both Both
TSOPF FS b300 c2 56814 MMFPrec MMFPrec MMFPrec MMFPrec
c-67b 57975 Both Both Both Both
c-67 57975 MMFPrec MMFPrec MMFPrec Both
vsp bcsstk30 500sep 10in 1Kou 58348 NoPrec Both Both Both
dixmaanl 60000 NoPrec Both Both Both
Andrews 60000 NoPrec NoPrec NoPrec MMFPrec
60k 60005 NoPrec Both Both Both
5esindl 60008 Both Both Both Both
blockqp1 60012 NoPrec NoPrec NoPrec NoPrec
Ga3As3H12 61349 Both Both Both Both
GaAsH6 61349 Both Both Both Both
wing 62032 MMFPrec MMFPrec MMFPrec MMFPrec
gupta2 62064 MMFPrec MMFPrec MMFPrec MMFPrec
can 62451 NoPrec Both Both Both
ncvxqp5 62500 MMFPrec MMFPrec MMFPrec MMFPrec
brack2 62631 MMFPrec MMFPrec MMFPrec Both
pkustk03 63336 MMFPrec Both Both Both
crankseg 2 63838 NoPrec Both Both Both
c-68 64810 NoPrec NoPrec Both Both
Dubcova2 65025 NoPrec NoPrec Both Both
rgg n 2 16 s0 65536 NoPrec Both Both Both
delaunay n16 65536 MMFPrec MMFPrec MMFPrec MMFPrec
qa8f 66127 NoPrec MMFPrec MMFPrec MMFPrec
qa8fk 66127 NoPrec NoPrec NoPrec Both
gas sensor 66917 MMFPrec MMFPrec MMFPrec MMFPrec
H2O 67024 NoPrec Both Both Both
c-69 67458 Both Both Both Both
ct2010 67578 MMFPrec MMFPrec MMFPrec MMFPrec
k1 san 67759 MMFPrec MMFPrec Both Both
c-70 68924 Both Both Both Both
e2010 69518 Both Both Both Both
cfd1 70656 NoPrec Both Both Both
F2 71505 MMFPrec MMFPrec Both Both
oilpan 73752 MMFPrec Both Both Both
finan512 74752 MMFPrec MMFPrec MMFPrec MMFPrec
pfinan512 74752 MMFPrec MMFPrec MMFPrec MMFPrec
ncvxqp3 75000 MMFPrec MMFPrec MMFPrec MMFPrec
TSOPF FS b39 c19 76216 Both Both Both Both
c-71 76638 Both Both Both Both
vsp vibrobox scagr7-2c rlfddd 77328 Both Both Both Both
fe tooth 78136 NoPrec Both Both Both
3dh e 79171 MMFPrec MMFPrec Both Both
3dh 79171 MMFPrec MMFPrec MMFPrec MMFPrec
3dh 79171 MMFPrec MMFPrec MMFPrec MMFPrec
rail 79841 79841 NoPrec NoPrec Both Both
0nsdsil 80016 Both Both Both Both
2nnsnsl 80016 Both Both Both Both
cont-201 80595 MMFPrec MMFPrec MMFPrec MMFPrec
pkustk10 80676 NoPrec Both Both Both
pache1 80800 NoPrec NoPrec Both Both
shallow water2 81920 NoPrec NoPrec NoPrec NoPrec
shallow water1 81920 NoPrec NoPrec NoPrec NoPrec
hermal1 82654 NoPrec Both Both Both
consph 83334 MMFPrec MMFPrec Both Both
c-72 84064 Both Both Both Both
TSOPF FS b300 c3 84414 MMFPrec MMFPrec MMFPrec MMFPrec
nv2010 84538 Both Both Both Both
onera dual 85567 MMFPrec MMFPrec MMFPrec MMFPrec
vsp c-60 data cti cs4 85830 Both Both Both Both
wy2010 86204 MMFPrec MMFPrec MMFPrec MMFPrec
ncvxqp7 87500 MMFPrec MMFPrec MMFPrec MMFPrec
pkustk11 87804 MMFPrec MMFPrec Both Both
olesnik0 88263 MMFPrec MMFPrec MMFPrec MMFPrec
net4-1 88343 MMFPrec MMFPrec Both Both
sd2010 88360 Both Both Both Both
denormal 89400 NoPrec NoPrec Both Both
s3dkt3m2 90449 NoPrec Both Both Both
s3dkq4m2 90449 NoPrec Both Both Both
s4dkt3m2 90449 MMFPrec Both Both Both
boyd1 93279 NoPrec NoPrec NoPrec Both
ndem dual 94069 MMFPrec MMFPrec MMFPrec MMFPrec
pkustk12 94653 MMFPrec MMFPrec MMFPrec Both
pkustk13 94893 MMFPrec Both Both Both
Si34H36 97569 Both Both Both Both
t1 97578 MMFPrec MMFPrec Both Both
fe rotor 99617 NoPrec Both Both Both
G n pin pou 100000 NoPrec Both Both Both
preferentialAttachmen 100000 Both Both Both Both
smallworld 100000 NoPrec Both Both Both
ford2 100196 MMFPrec MMFPrec MMFPrec MMFPrec
vsp mod2 pgp2 slptsk 101364 Both Both Both Both
2cubes sphere 101492 MMFPrec MMFPrec MMFPrec MMFPrec
hermomech TK 102158 NoPrec NoPrec Both Both
hermomech TC 102158 MMFPrec MMFPrec MMFPrec MMFPrec
filter3D 106437 MMFPrec MMFPrec MMFPrec MMFPrec
x104 108384 MMFPrec MMFPrec Both Both
598 110971 NoPrec Both Both Both
Ge87H76 112985 Both Both Both Both
Ge99H100 112985 Both Both Both Both
Ga10As10H30 113081 Both Both Both Both
luxembourg os 114599 NoPrec Both Both Both
shipsec8 114919 MMFPrec MMFPrec Both Both
ut2010 115406 MMFPrec MMFPrec MMFPrec MMFPrec
TSOPF FS b39 c30 120216 Both Both Both Both
cop20k A 121192 Both Both Both Both
ship 003 121728 NoPrec Both Both Both
cfd2 123440 NoPrec Both Both Both
usroads-48 126146 NoPrec Both Both Both
boneS01 127224 MMFPrec Both Both Both
usroads 129164 NoPrec Both Both Both
rgg n 2 17 s0 131072 NoPrec Both Both Both
delaunay n17 131072 MMFPrec MMFPrec MMFPrec MMFPrec
2010 132288 Both Both Both Both
Ga19As19H42 133123 Both Both Both Both
nd2010 133769 Both Both Both Both
wv2010 135218 MMFPrec MMFPrec MMFPrec MMFPrec
vsp finan512 scagr7-2c rlfddd 139752 Both Both Both Both
shipsec1 140874 NoPrec Both Both Both
bmw7st 1 141347 NoPrec NoPrec NoPrec Both
fe ocean 143437 NoPrec Both Both Both
engine 143571 MMFPrec Both Both Both
144 144649 NoPrec Both Both Both
d2010 145247 Both Both Both Both
Dubcova3 146689 Both NoPrec Both Both
bmwcra 1 148770 MMFPrec Both Both Both
id2010 149842 Both Both Both Both
G2 circui 150102 NoPrec NoPrec Both Both
pkustk14 151926 NoPrec Both Both Both
gearbox 153746 MMFPrec Both Both Both
SiO2 155331 NoPrec Both Both Both
wave 156317 MMFPrec MMFPrec Both Both
2010 157508 MMFPrec MMFPrec MMFPrec MMFPrec
ky2010 161672 MMFPrec MMFPrec MMFPrec MMFPrec
nm2010 168609 MMFPrec MMFPrec MMFPrec MMFPrec
c-73 169422 MMFPrec MMFPrec Both Both
nj2010 169588 Both Both Both Both
s2010 171778 MMFPrec MMFPrec MMFPrec Both
shipsec5 179860 MMFPrec MMFPrec Both Both
cont-300 180895 MMFPrec MMFPrec MMFPrec MMFPrec
sc2010 181908 Both Both Both Both
d pretok 182730 MMFPrec MMFPrec Both Both
Si41Ge41H72 185639 Both Both Both Both
r2010 186211 Both Both Both Both
uron 189924 Both Both Both Both
caidaRouterLevel 192244 Both Both Both Both
ne2010 193352 MMFPrec MMFPrec MMFPrec MMFPrec
wa2010 195574 MMFPrec MMFPrec MMFPrec MMFPrec
or2010 196621 Both Both Both Both
fullb 199187 MMFPrec Both Both Both
co2010 201062 MMFPrec MMFPrec MMFPrec MMFPrec
fcondp2 201822 MMFPrec MMFPrec Both Both
hermomech dM 204316 MMFPrec MMFPrec MMFPrec MMFPrec
la2010 204447 Both Both Both Both
roll 213453 NoPrec Both Both Both
14b 214765 NoPrec Both Both Both
ia2010 216007 MMFPrec MMFPrec MMFPrec MMFPrec
pwtk 217918 MMFPrec MMFPrec Both Both
hood 220542 MMFPrec Both Both Both
CO 221119 Both Both Both Both
halfb 224617 MMFPrec MMFPrec Both Both
HTC 336 4438 226340 MMFPrec MMFPrec Both Both
HTC 336 9129 226340 MMFPrec MMFPrec MMFPrec Both
CurlCurl 1 226451 MMFPrec Both Both Both
coAuthorsCiteseer 227320 Both Both Both Both
bmw3 2 227362 NoPrec Both Both Both
ks2010 238600 MMFPrec MMFPrec MMFPrec MMFPrec
n2010 240116 MMFPrec MMFPrec MMFPrec MMFPrec
Si87H76 240369 Both Both Both Both
z2010 241666 Both Both Both Both
BenElechi1 245874 MMFPrec MMFPrec Both Both
l2010 252266 Both Both Both Both
wi2010 253096 Both Both Both Both
Lin 256000 NoPrec Both Both Both
n2010 259777 Both Both Both Both
offshore 259789 MMFPrec MMFPrec Both Both
rgg n 2 18 s0 262144 NoPrec Both Both Both
delaunay n18 262144 NoPrec Both Both Both
in2010 267071 Both Both Both Both
citationCiteseer 268495 Both Both Both Both
ok2010 269118 MMFPrec MMFPrec MMFPrec MMFPrec
va2010 285762 Both Both Both Both
nc2010 288987 Both Both Both Both
ga2010 291086 Both Both Both Both
3Dspectralwave2 292008 Both Both Both Both
coAuthorsDBLP 299067 Both Both Both Both
dblp-2010 326186 Both Both Both Both
i2010 329885 Both Both Both Both
o2010 343565 Both Both Both Both
ny2010 350169 Both Both Both Both
oh2010 365344 Both Both Both Both
darcy003 389874 MMFPrec MMFPrec Both Both
rio002 389874 NoPrec Both Both Both
helm2d03 392257 NoPrec Both Both Both
pa2010 421545 Both Both Both Both
uto 448695 NoPrec Both Both Both
il2010 451554 Both Both Both Both
fl2010 484481 Both Both Both Both
f 2 k101 503625 NoPrec MMFPrec Both Both
f 3 k101 503625 MMFPrec Both Both Both
f 1 k101 503625 MMFPrec MMFPrec Both Both
rgg n 2 19 s0 524288 NoPrec Both Both Both
delaunay n19 524288 NoPrec Both Both Both
parabolic fe 525825 NoPrec Both Both Both
Wins (NoPrec) 148 60 39 31
Wins (MMFPrec) 231 199 129 110
Table A.3: Comparison of no preconditioning and MMF-prec for various levels of GMRES(30) tolerance.
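The labels in Table A.3 can be read as a per-tolerance comparison of iteration counts. The following is a minimal sketch of one way such labels and the "Wins" tallies could be derived, not the procedure actually used to build the table. It assumes that × marks a run that did not converge, that "Both" denotes a tie (equal counts, or neither run converging), and that the first count column of Table A.2 is the unpreconditioned run, which is consistent with the "Wins" rows of both tables. The names `winner` and `tally_wins` are hypothetical, and the sample rows are the Table A.2 entries for four matrices.

```python
from collections import Counter
from typing import Optional


def winner(noprec_iters: Optional[int], mmf_iters: Optional[int]) -> str:
    """Label the method with fewer iterations; None means "did not converge"."""
    if noprec_iters == mmf_iters:
        # Equal counts, or both runs failed: treated as a tie (assumption).
        return "Both"
    if mmf_iters is None:
        return "NoPrec"
    if noprec_iters is None:
        return "MMFPrec"
    return "NoPrec" if noprec_iters < mmf_iters else "MMFPrec"


def tally_wins(rows):
    """Return (NoPrec wins, MMFPrec wins); ties count for neither."""
    labels = Counter(winner(a, b) for _, a, b in rows)
    return labels["NoPrec"], labels["MMFPrec"]


# (name, iterations without preconditioning, iterations with MMF preconditioning)
rows = [
    ("Andrews", 181, 176),
    ("blockqp1", 10, None),
    ("finan512", 54, 27),
    ("srb1", None, None),
]

for name, a, b in rows:
    print(name, winner(a, b))
print("wins (NoPrec, MMFPrec):", tally_wins(rows))
```

Under these assumptions the four sample rows reproduce the labels that Table A.3 lists for them in its $10^{-9}$ column.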