ABSTRACT
Name: Benjamin Sprague Department: Physics
Title: Wavelet-space solution of the Poisson equation: An algorithm for use in particle-in-cell simulations
Major: Physics Degree: Master of Science
Approved by: Date:
Thesis Director
NORTHERN ILLINOIS UNIVERSITY
ABSTRACT
A particle-in-cell (PIC) approach is useful for simulations involving large numbers of
particles interacting via electric charge or gravitation. The interaction potential for such
forces is described by the Poisson equation ∇²U = Cρ, where C is a constant which
establishes the sign and magnitude of the interaction. By depositing the particle charge
distribution into cells and solving for the potential on the grid of cells, the solution of the
Poisson equation is made more tractable for large numbers of particles.
This work presents a wavelet-based algorithm for the solution of the Poisson equation
in PIC simulations which exhibits better scaling properties than the traditional Green’s
function and FFT approach. This algorithm also provides unique possibilities for adaptive
resolution PIC simulations and parallelization of the Poisson solver.
The algorithm described in this work is derived in a self-contained manner from theory
to implementation, with the intent of encouraging the adoption of this wavelet-space
technique by researchers with no prior experience in wavelet techniques. The algorithm
was implemented in a test program to verify correctness, and over 1000 test runs of the
application were executed to examine the numerical accuracy and scaling properties of
the algorithm. These tests demonstrate that the wavelet-space approach is a competitive
algorithm for the solution of the Poisson equation in a particle-in-cell simulation.
NORTHERN ILLINOIS UNIVERSITY
WAVELET-SPACE SOLUTION OF THE POISSON EQUATION: AN
ALGORITHM FOR USE IN PARTICLE-IN-CELL SIMULATIONS
A THESIS SUBMITTED TO THE GRADUATE SCHOOL
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE
MASTER OF SCIENCE
DEPARTMENT OF PHYSICS
BY
BENJAMIN SPRAGUE
© 2008 Benjamin Sprague
DEKALB, ILLINOIS
AUGUST 2008
Certification: In accordance with departmental and Graduate
School policies, this thesis is accepted in
partial fulfillment of degree requirements.
Thesis Director
Date
ACKNOWLEDGEMENTS
Work supported by the Office of Naval Research, Department of Defense, under contract N00014-06-1-0587 with Northern Illinois University and by the Department of Education under contract P116Z010035 with Northern Illinois University.
TABLE OF CONTENTS
Page
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
LIST OF FIGURES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
LIST OF APPENDICES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Chapter
1 INTRODUCTION AND PROBLEM STATEMENT . . . . . . . . . . . . . . . . . . . 1
1.1 Problem Scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Improvement of Wavelet-Based Particle-in-Cell Algorithm . . . . . . . . . . 4
1.3 Goals and Chapter Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 WEIGHTED RESIDUAL METHOD AND SOLVING PDE IN A BASIS . . . . . 8
2.1 Fourier Basis, Green’s Functions, etc. . . . . . . . . . . . . . . . . . . . . . . 10
3 INTERPOLATING WAVELETS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.1 Transform and Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 What Is the Wavelet Basis? . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4 SPECIAL TOPICS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.1 Refinement Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2 Dual and Primal Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . 40
4.3 Orthonormal Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.4 Multidimensional Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.5 Continuous Real Space to Wavelet Space . . . . . . . . . . . . . . . . . . . . 47
4.6 Human-Readable Representation of Vectors . . . . . . . . . . . . . . . . . . 49
5 THE ALGORITHM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.1 Prior Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.2 Representation of Operators, Calculating Operators in Wavelet Space . . . 53
5.3 Non-Standard Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.4 3D Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.5 Implementation of 3D Operators in Non-Standard Form . . . . . . . . . . . 65
5.6 Preconditioning and Temporal Coherence . . . . . . . . . . . . . . . . . . . 69
5.7 Multiresolution Grids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.8 Monopole Approximation for Boundary Conditions . . . . . . . . . . . . . . 78
5.9 Parallelization Opportunities . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6 IMPLEMENTATION AND ALGORITHM TESTING. . . . . . . . . . . . . . . . . . 81
6.1 Features and Code Structure . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.2 Testing: Parameters and Measurements . . . . . . . . . . . . . . . . . . . . 83
6.3 Testing: Demonstration of Solutions . . . . . . . . . . . . . . . . . . . . . . 85
6.4 Testing: Effects of Radial Extent and Monopole BCs . . . . . . . . . . . . . 89
7 CONCLUSIONS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
7.1 Future Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
APPENDICES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
LIST OF TABLES
Table Page
1 Linear prediction filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2 Linear update filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3 Interpolating wavelet prediction filters . . . . . . . . . . . . . . . . . . . . . . . 101
4 Interpolating wavelet base filters . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5 Linear prediction filter on boundary . . . . . . . . . . . . . . . . . . . . . . . . 116
6 Linear prediction filter on boundary, no S−1 . . . . . . . . . . . . . . . . . . . . 117
LIST OF FIGURES
Figure Page
1 Particle-in-cell algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Basic forward wavelet transform . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3 Sampled linear function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4 Single-stage wavelet transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
5 Basic inverse wavelet transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
6 Recursive forward wavelet transform . . . . . . . . . . . . . . . . . . . . . . . . 17
7 Fully transformed linear function with perfect prediction filter . . . . . . . . . . 18
8 Transformed linear function – zero BCs . . . . . . . . . . . . . . . . . . . . . . 20
9 Transformed linear function – periodic BCs . . . . . . . . . . . . . . . . . . . . 21
10 Transformed linear function – interpolation BCs . . . . . . . . . . . . . . . . . 22
11 Lifted wavelet transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
12 Wavelet functions of linear interpolating wavelets . . . . . . . . . . . . . . . . . 26
13 Wavelet functions of linear interpolating wavelets – doubled resolution . . . . . 27
14 Wavelet functions – higher order families . . . . . . . . . . . . . . . . . . . . . 29
15 Shape and frequency spectrum of linear interpolating wavelets . . . . . . . . . 31
16 Shape and frequency spectrum of 8th-order interpolating wavelets . . . . . . . 32
17 Wavelet transform example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
18 Standard form wavelet space operator . . . . . . . . . . . . . . . . . . . . . . . 56
19 Non-standard form wavelet space operator . . . . . . . . . . . . . . . . . . . . . 57
20 Pseudo code for non-standard form operator – top scale . . . . . . . . . . . . . 62
21 Pseudo code for non-standard form operator – child scales . . . . . . . . . . . . 63
22 Results of preconditioning and temporal coherence . . . . . . . . . . . . . . . . 71
23 Multiresolution grid used for boundary conditions . . . . . . . . . . . . . . . . 77
24 Code testing input functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
25 Dipole decay with boundary layers . . . . . . . . . . . . . . . . . . . . . . . . . 86
26 Error vs. radial extent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
27 Number of points vs. radial extent . . . . . . . . . . . . . . . . . . . . . . . . . 93
28 Maple 10 code for generating operator base filters . . . . . . . . . . . . . . . . . 104
LIST OF APPENDICES
Appendix Page
A WAVELET COEFFICIENTS AND OPERATOR COEFFICIENTS. . . . . . . . . 100
B DERIVATION OF WAVELET TRANSFORM IDENTITIES . . . . . . . . . . . . . 106
C INHERENT PRECONDITIONER IN A BIORTHOGONAL BASIS . . . . . . . . 109
D A BRIEF PRIMER ON DIRAC NOTATION . . . . . . . . . . . . . . . . . . . . . . . 112
E INTERPOLATING WAVELETS ON AN INTERVAL . . . . . . . . . . . . . . . . . . 115
CHAPTER 1
INTRODUCTION AND PROBLEM STATEMENT
Improving the detail of the N-body space charge calculation is a crucial step in increasing the fidelity of particle accelerator and galaxy dynamics simulations. For physical problems in which the Coulomb force is dominant, accurately approximating the Poisson equation ∇²U = Cρ is essential to correctly evaluating the forces acting on each particle in the simulation.
Several approaches have been taken to the solution of the Poisson equation in such N-body simulations. They may be classified as either integral or differential equation approaches. The differential forms solve the Poisson equation directly in differential form, while the integral forms recast the Poisson equation as an integral equation and evaluate the summed effect of the charge distribution at each point in the computational domain.
The brute-force N-body approach is an integral solution, useful for small N. Each particle simply sums the forces applied to it by each of the other particles. This approach is simple to implement and can scale well in parallel, but it scales poorly with the number of particles N. This naive approach to N-body simulation scales as O(N²) per time step, making it prohibitive on standard computer hardware for N much larger than 10³.
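The O(N²) cost of the direct method is easy to see in code. The following sketch (variable names, units, and the softening-free inverse-square law are illustrative choices, not taken from any particular simulation code) sums the pairwise forces with an explicit loop over particles:

```python
import numpy as np

def direct_forces(pos, q, C=1.0):
    """Brute-force pairwise forces: O(N^2) work per time step.

    pos: (N, 3) particle positions; q: (N,) charges; C sets the sign
    and magnitude of the interaction, as in the Poisson equation above.
    """
    N = len(pos)
    F = np.zeros_like(pos)
    for i in range(N):
        r = pos[i] - pos                 # separation vectors from each j to i
        d2 = (r * r).sum(axis=1)
        d2[i] = np.inf                   # exclude the self-interaction term
        # inverse-square law: F_i = C * q_i * sum_j q_j * r_ij / |r_ij|^3
        F[i] = C * q[i] * (q[:, None] * r / d2[:, None] ** 1.5).sum(axis=0)
    return F
```

Every particle touches every other particle, so doubling N quadruples the work, which is exactly the scaling bottleneck discussed above.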
An improved integral approach is obtained by evaluating nearby forces with a direct
N-body approach, while approximating long-range forces by a multipole expansion. This
tree approach also has good parallel scaling potential, and it scales significantly better
with particle count. Typical tree codes scale as O(N log N) and are frequently used in
galactic dynamics simulations.
For very large particle numbers, the Green’s function plus FFT approach is often used
[15]. This approach shares characteristics with both integral and differential solutions
of the Poisson equation. The particle density is deposited on a grid of cells, and the
potential is evaluated on this grid using the differential Poisson equation. This technique
is often called a “particle-in-cell” approach. Because a Green’s function is used to solve
the equation, however, the solution proceeds using a convolution in a manner analogous
to an integral solution. This approach always scales as O(N) with respect to the number
of particles, because the individual particles are used only for carrying the dynamics
information between time steps. It scales as O(Ng log Ng) with respect to the number of
grid points, with the advantage that the number of grid points is typically chosen to be
an order of magnitude less than the particle count.
The typical algorithm of a particle-in-cell code is shown in Figure 1. The particles are
initially constructed with some distribution in space and momentum. They are advanced
forward in time and then binned onto a computational grid. This binning process is very
similar to constructing a 3D histogram of the charge density. Once this charge density
function is obtained, the Poisson equation is solved on this grid, the forces are interpolated
back to the particles, and the particles advance forward in time again. This loop is repeated
for each time step of the simulation as the particles travel through the accelerator. This
thesis is concerned only with the step of solving the Poisson equation to find the forces on
the particles caused by their own charge distribution.
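The loop described above can be sketched in outline. This is an illustrative 1D, periodic toy version; the helper names, the nearest-grid-point deposition, and the callable Poisson solver are assumptions made only for the sketch, not code from any accelerator simulation:

```python
import numpy as np

def deposit_charge(x, q, n_cells, length):
    """Nearest-grid-point binning: effectively a histogram of charge density."""
    idx = np.floor(x / length * n_cells).astype(int) % n_cells
    rho = np.zeros(n_cells)
    np.add.at(rho, idx, q)
    return rho / (length / n_cells)

def pic_step(x, v, q, n_cells, length, dt, solve_poisson, force_at):
    """One schematic particle-in-cell time step (1D, periodic domain).

    solve_poisson : callable rho -> potential on the grid
    force_at      : callable (phi, x) -> force interpolated to particles
    """
    x = (x + v * dt) % length              # advance particles in time
    rho = deposit_charge(x, q, n_cells, length)  # bin particles to grid
    phi = solve_poisson(rho)               # the step this thesis addresses
    v = v + force_at(phi, x) * dt          # interpolate force, kick velocities
    return x, v
```

The particles interact only through the grid, which is why the deposition and push steps are O(N) and parallelize well.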
By replacing the Poisson equation solver, an alternative particle-in-cell approach was
developed by Terzic et al. in 2007, and it is a fully differential solution to the N-body
problem [28]. It uses orthogonal wavelet bases defined over the computational grid to
evaluate the differential equation, and an iterative conjugate gradient solver to find the
solution. Because this algorithm uses an iterative solver, and information from the pre-
vious time step, it is adaptive in time and can very quickly find a solution if the charge
density changes slowly with respect to the time step.
Figure 1: Particle-in-cell algorithm.
1.1 Problem Scale
In particle accelerator codes, the number of particles used is typically on the order of 10⁵ or 10⁶. With particle counts in the millions, the brute-force scaling of O(N²) is simply impractical for most hardware. Particle-in-cell codes can manage these high particle counts because the particle deposition schemes are simply O(N). In addition, the particle
management routines of particle-in-cell codes can operate very efficiently in parallel since
the particles only interact through the grid. The grid solution for an FFT-based particle-in-cell code scales as O(Ng log Ng) with approximately ten particles per cell. This means that grid sizes are typically on the order of 32³ or 64³. In order to increase the number of
particles in future simulations, this grid size needs to increase as well to maintain the
optimum number of particles per cell. This requires an improvement in the scaling of
the algorithms with respect to the number of grid points, as well as with respect to the
number of parallel processors employed in the simulation.
1.2 Improvement of Wavelet-Based Particle-in-Cell Algorithm
The initial wavelet-based Poisson solver for particle-in-cell applications constructed by Terzic et al. showed great promise for improving the speed of future accelerator codes and was competitive with the standard FFT plus Green’s function algorithm at currently typical problem sizes [28]. It used a preconditioner in wavelet space to ensure rapid convergence of the iterative solution, and it took advantage of previous time step information to create an adaptive-in-time approach to the solution of Poisson’s equation in a particle-in-cell environment. The current work is a continuation of that effort, focused on overcoming the difficulties of this earlier algorithm in order to enhance the ability of a wavelet-based algorithm to provide adaptive solutions which can produce additional detail for the particle-in-cell simulator.
Specifically, the original wavelet-based solver utilized the standard form of wavelet
operators, which are simple to implement but hide much of the inherent sparsity of the
operator in the wavelet basis. Also, the scales are tightly coupled in this form, and it is
not obvious how to separate them in order to construct a parallel algorithm.
Second, this initial wavelet-based particle-in-cell algorithm had difficulty specifying boundary conditions in wavelet space. A set of Green’s functions had to be evaluated at each time step, which was computationally expensive and difficult to arrange for efficient parallel execution. This caused the boundary condition evaluation to consume a significant portion of the solver’s execution time, even during those stages where little had changed – losing some of the benefits of the algorithm’s otherwise adaptive-in-time nature.
The goal of the current work is to construct an algorithm which retains the strengths of the work of Terzic et al. – adaptivity in time and fast solution of Poisson’s equation – while improving these two key areas to produce an efficient algorithm which can be easily adapted for parallel execution.
1.3 Goals and Chapter Outline
With the major foundational works in modern wavelet theory dating back to the early 1990s [7], it is becoming increasingly less tenable to consider it a “new” mathematical technique merely on the premise of age. Also, many of the major ideas behind solving differential equations in wavelet bases have been well known for at least ten years [4, 5, 6, 11, 13, 14, 17], yet the slow adoption of these techniques for solving physical problems is cause to evaluate the reasons for such a hesitant response. In an effort to encourage
the adoption of these powerful techniques, this thesis is constructed with the following
concepts in mind:
• Wavelet theory has many interrelated properties and techniques for manipulation of
wavelets which must be mastered in order to be comfortable working with wavelet
bases. Wavelets are a generalization of many different techniques, and it is easy
to get lost in the (often irrelevant) details of the basis. With this in mind, this
work will attempt to constrain the discussion to a minimum working set of required
knowledge, keeping the focus on things required for the current application, and will
identify important techniques as well as equations to enable the reader to grasp the
concepts more thoroughly.
• Wavelet methods are computational in nature and are only useful when implemented
in software. This thesis will attempt to make the results as repeatable as possible
by including notes on the special considerations which arise when implementing a
real wavelet algorithm – filter boundary conditions and their effect, preserving a
symmetric Laplacian operator, special cases when generalizing from 1D to 3D, etc.
• Wavelet techniques are often misconstrued as being “new” or “magic.” This work
will attempt to help physicists understand the connections from wavelet theory to
common mathematical concepts, and will derive equations and concepts from basic
principles wherever possible – rather than simply using equations from other sources
without explaining their meaning.
1.3.1 Organization
This work begins with a quick review of the basic mathematics involved in computa-
tionally solving partial differential equations in a basis set, and moves quickly into describ-
ing the basis of choice for this algorithm – interpolating wavelets. In Chapter 4, several
additional mathematical techniques are introduced which are essential to the understand-
ing of the wavelet particle-in-cell Poisson equation algorithm. This chapter is followed by
a chapter describing the algorithm in detail. Chapter 6 describes a test implementation
of the algorithm which was used to verify the mathematics of the earlier chapters as well
as to examine the feasibility of the new boundary condition schemes introduced in this
thesis. The final chapter reviews the contributions of this thesis and examines possible
directions for future development.
CHAPTER 2
WEIGHTED RESIDUAL METHOD AND SOLVING PDE IN A BASIS
Approximating a differential equation with any basis function expansion must first
begin with the initial continuous equation:
∇²(u + u^bc) = ρ. (2.1)
where u is the potential inside the computational domain and u^bc is the potential outside it.1 Eqn. (2.1) can be reinterpreted as the requirement that the residual R vanish everywhere:

R = ∇²(u + u^bc) − ρ = 0. (2.2)

If u, u^bc and ρ can be found which cause R to vanish in all of continuous space, then
the problem is solved. When working with analytic functions for ρ, this can often be done
by using the integral form of the equation. However, solving the problem for general ρ
requires approximating with a basis function expansion. The particular form of the basis
functions is irrelevant at this point, but the goal is to make the residual orthogonal to the
space spanned by the basis functions. Specifically, we want
〈φ_i|R〉 = 〈φ_i, R〉 = ∫ φ_i(x⃗) R(x⃗) d³x⃗ = 0,  ∀i. (2.3)
Note the use of Dirac’s bra-ket notation to represent the inner product, and its equivalence to the integral form. I will continue to use Dirac notation throughout this thesis. So, we want to reduce the residual to be orthogonal to every basis vector (function) in the
1This separation in the potential will be explained in further detail in Section 5.8.
expansion space given by the φi. In the limit of a complete basis, it can be seen that this
restriction will be equivalent to the continuous requirement (Eqn. 2.2).
Because the ultimate goal is to establish a relationship between u and ρ, we must next
examine the residual in detail. Again, due to the inability of finite minds or computers to represent a continuum of values, we need to expand ρ, u and u^bc in some set of functions in order to work with them. For our purposes, it is convenient if they are all expanded in the same set of basis functions:
ρ = ∑_j ρ_j Φ_j,   u = ∑_j u_j Φ_j,   u^bc = ∑_j u^bc_j Φ_j. (2.4)
Inserting these expressions into the requirement on R gives:
〈φ_i|R〉 = 〈φ_i|∇² ∑_j (u_j + u^bc_j) Φ_j〉 − 〈φ_i|∑_j ρ_j Φ_j〉
        = ∑_j (u_j + u^bc_j) 〈φ_i|∇²|Φ_j〉 − ∑_j ρ_j 〈φ_i|Φ_j〉 = 0,  ∀i. (2.5)
We now have an algebraic linear equation to solve, which is a well-researched problem
in computational science. This will be more apparent if we write:
ρ⃗ = (ρ_0, ρ_1, …, ρ_n)ᵀ,  u⃗ = (u_0, u_1, …, u_n)ᵀ,  u⃗^bc = (u^bc_0, u^bc_1, …, u^bc_n)ᵀ,
O := 〈φ_i|Φ_j〉,  L := 〈φ_i|∇²|Φ_j〉. (2.6)
Our discretized differential equation becomes

L(u⃗ + u⃗^bc) = O ρ⃗, (2.7)

and finding u⃗ becomes a problem of simply inverting the matrix L:

u⃗ = L⁻¹(O ρ⃗ − L u⃗^bc). (2.8)
This matrix inversion is handled readily by several common linear algebra algorithms.
Note that the choices of φ_i and Φ_j are arbitrary, and the general form of the resulting linear equation does not depend on their properties. However, in order for L to be invertible, it must be square, which requires that the number of basis functions be the same for both expansions. One common choice for the test functions is to pick φ_i = δ(x⃗ − α i⃗), restricting the residual to be zero at discrete points. This is called a collocation method and is useful for finding a solution which is valid on a grid. It also often results in a less dense representation of the Laplacian operator matrix, and it is the approach taken by Goedecker [11].
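As a minimal illustration of Eqn. (2.8), consider a 1D Laplacian discretized on a uniform grid with zero Dirichlet boundaries. With delta test functions and an interpolating basis, L reduces to the familiar second-difference stencil and the overlap matrix O is the identity; the grid size and the test density below are chosen only for this example:

```python
import numpy as np

# 1D collocation sketch of u = L^{-1}(O rho - L u_bc), with u_bc = 0 and
# O = identity.  Interior grid points x_i = i*h, i = 1..n, h = 1/(n+1).
n = 64
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)

# Second-difference Laplacian with zero Dirichlet boundary values.
L = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2

rho = -np.pi**2 * np.sin(np.pi * x)   # chosen so the exact solution is known
u = np.linalg.solve(L, rho)           # invert the discretized equation
# u approximates sin(pi x) to second order in h
```

A dense `solve` is used here for clarity; in practice L is sparse and an iterative solver (conjugate gradient for the symmetric Galerkin form) would be used instead.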
Another common choice is to choose φi = Φi, and the resulting weighted residual
method is known as the Galerkin method. The largest advantage of this approach is that
the resulting Laplacian operator is a symmetric matrix, which is a requirement for many
linear system solvers. Arias et al. use this approach and are able to use the quickly converging conjugate gradient method [1, 17], while Goedecker’s approach limits him to variations of the slower steepest descent algorithm, although with a much sparser operator [12].
2.1 Fourier Basis, Green’s Functions, etc.
Another basis of academic interest for the weighted residual method is the Fourier basis. Using φ_ω = Φ_ω = e^{iωx} results in a diagonal overlap matrix and a diagonal Laplacian matrix, which can therefore be trivially inverted. The downside of this approach lies in the periodicity that is assumed in the expansion of functions into the Fourier basis. Unless
the basis expansion is carried out to infinite frequencies, the expansion will result in a
function which repeats periodically.
The inherent periodicity of the Fourier basis usually makes it undesirable for use with
the differential form of the Poisson equation due to the difficulty of defining spatially lo-
calized boundary conditions. Instead, the Poisson equation in integral form is used with
a Green’s function, and a pilgrimage through frequency space is used to turn the convo-
lution integral into a simple algebraic expression [15]. This approach has the advantage
of allowing arbitrary boundary conditions to be applied using the Green’s function, yet
only needing a computational grid of the size of the charge distribution of interest. The
disadvantage of this approach is that, in order to avoid periodic wrapping of the Green’s function, the frequency-space calculation must be performed on a doubled grid, and it still requires a fully dense grid inside the charge region.
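For a purely periodic problem, the frequency-space inversion amounts to a pointwise division, as this 1D sketch shows. The doubled-grid bookkeeping needed for isolated boundary conditions is deliberately omitted; this only illustrates why the Fourier basis diagonalizes the Laplacian:

```python
import numpy as np

def poisson_fft_periodic(rho, length):
    """Solve d^2u/dx^2 = rho on a periodic 1D grid via the FFT.

    In Fourier space the Laplacian is diagonal: -k^2 u_hat = rho_hat,
    so the solve is a pointwise division.  The k = 0 (mean) mode is
    undetermined for a periodic domain and is fixed to zero here.
    """
    n = len(rho)
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    rho_hat = np.fft.fft(rho)
    u_hat = np.zeros_like(rho_hat)
    u_hat[1:] = -rho_hat[1:] / k[1:] ** 2
    return np.fft.ifft(u_hat).real
```

For a single Fourier mode the result is spectrally exact, which is the appeal of this basis when periodicity is acceptable.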
CHAPTER 3
INTERPOLATING WAVELETS
One of the main purposes of a wavelet basis is compact storage of general data such as music, photos, and data files [27]. It is in this realm of data compression that
the wavelet scheme is easiest to conceptualize. The goal of a basis-function compression
scheme is to represent the input data with as few basis function coefficients as possible.
This is generally best accomplished when the basis function set shares many similarities
with the data (f(~x)) to be compressed. A more mathematically formal way to state this is
to say that 〈φi|f〉 is large for only a very few i, and zero for all others. One example of this
is an infinitely periodic signal. In a delta-function (time sampling) basis set, this signal
would be significantly dense, even if it were compressed by storing only a single period.
However, in a Fourier series expansion, this signal may require only very few coefficients
to accurately represent it. In the opposite extreme, a highly localized delta function signal
is sparse in a time sampling space, but infinitely dense in Fourier space.
This space / frequency localization dichotomy is of mathematical interest, but phys-
ical signals are nearly always localized in space, and often are significantly localized in
frequency as well. There are many common methods of compactly representing signals
in simple physical systems, often using energy or angular momentum eigenkets such as
Gauss–Hermite polynomials and spherical harmonics. For the general case, however, none
of these analytical basis sets have the full localization in space that is typical of physical
quantities such as charge and mass distributions. This need for a basis set which com-
pactly describes physical quantities in general is a major driving force in the success of
wavelet methods.
3.1 Transform and Inverse
Wavelets are most easily understood through the transform between discrete physical
space and wavelet space. Interpolating wavelets are the main focus of this research, and
their transform will be described in detail. This description will follow the lifting scheme
wavelet description of Sweldens [26, 27], though there are many ways to represent a wavelet
transform.
This transform occurs after the signal has already been converted into a discrete sample of the physical system at some resolution level h⃗, which can have a different value for each dimension considered. The goal of the transformation is to reduce the representation to the fewest possible large non-zero coefficients by providing a way to recover the value of the original signal via some interpolation scheme. In 1D, the method
taken to accomplish this is to first separate the even-indexed coefficients from the odd-
indexed coefficients (see Figure 2). Then the even-indexed coefficients (which will be called
“Smooth” or “S” coefficients) are interpolated in some manner to make a prediction of
what the odd-indexed coefficients (called “Detail” or “D” coefficients) should contain.
This prediction is subtracted from the “D” coefficients, and the first stage of the wavelet
transform is complete.
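A single split-and-predict stage with linear prediction is short to write down. This sketch repeats the edge sample at the right boundary, which is only one of several boundary conventions discussed later in this chapter:

```python
import numpy as np

def forward_stage(s):
    """One stage of a linear interpolating wavelet transform (predict only).

    Split into even (S) and odd (D) samples, predict each odd sample as
    the average of its two even neighbours, and keep only the prediction
    error in D.  The boundary crudely reuses the last even sample.
    """
    S, D = s[0::2].copy(), s[1::2].copy()
    right = np.roll(S, -1)
    right[-1] = S[-1]              # repeat-the-edge boundary convention
    D -= 0.5 * (S + right)         # subtract the linear prediction
    return S, D
```

For a sampled linear function the interior D coefficients vanish exactly; only the boundary, where the prediction has no right neighbour, leaves a non-zero detail.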
Figure 2: Basic forward wavelet transform.
As an example, consider a linear function as the input, sampled as in Figure 3, and a simple wavelet transform utilizing a linear interpolation / prediction scheme.1 The prediction scheme is simply to take any two adjacent S coefficients and predict that the odd D coefficient between them should equal their average. This corresponds to the prediction filter shown in Table 1. The coordinates are such that an offset of 0 refers to the S coefficient to the left of the current D coefficient, and an offset of 1 refers to the next S coefficient, which is to the right of the current D coefficient.
The prediction is then tested by subtracting the prediction from the actual value of
the D coefficient. Of course, in the special case of a linear input function, this prediction
scheme will be perfect, and all of the D coefficients will be zero as in Figure 4. This
transform is completely invertible – all that is required is to perform the prediction again from the S coefficients, add the prediction back onto the D coefficients, and then merge the S and D coefficients back into one single data set. The block diagram for this operation
is shown in Figure 5.
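The inversion is equally short in code. This sketch undoes one linear-prediction stage, assuming the same repeat-the-edge boundary convention as a simple forward stage (an illustrative sketch, not the thesis implementation):

```python
import numpy as np

def inverse_stage(S, D):
    """Invert one predict stage: re-predict from S, add back to D, merge.

    Because the same prediction is recomputed from the untouched S
    coefficients, the subtraction is undone exactly - the transform is
    lossless by construction.
    """
    right = np.concatenate([S[1:], S[-1:]])   # repeat-the-edge boundary
    d = D + 0.5 * (S + right)                 # add the prediction back on
    out = np.empty(len(S) + len(d))
    out[0::2], out[1::2] = S, d               # interleave S and D again
    return out
```

This exact invertibility regardless of the prediction filter is the key property of the lifting construction.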
With a perfect prediction filter, the goal of compressing this input data set has now achieved a 2:1 compression ratio with no losses. Cutting the storage requirement in half is a good start, but with such a simple input function, it is not yet very impressive. Further compression is achieved by recursively splitting the remaining level-0 S0 coefficients into level −1 S−1 and D−1 coefficients and predicting these D−1 coefficients as well, as Figure 6 shows. There are now two separate scales of information, and only the D coefficients are kept for the lower scale2, while the S and D coefficients are kept at the highest scale.
1A simpler wavelet transform is possible, using a 0th-order constant-interpolating scheme, which results in an un-lifted Haar transform.
2The numbering convention for levels is an unfortunate relic of wavelet literature. “Up” in level is “Down” in scale. This work will attempt to preserve the distinction between the two. Numeric values are always levels, while the terms “higher”, “lower”, “up”, “down”, “large”, “small” will always refer to scales. The author would have preferred to simply re-number levels to match scales and derive the self-consistent set of equations, but then the reader would be unable to connect with the vast majority of other wavelet literature. It would be analogous to re-labeling the electron charge as positive in an electronics text so that currents would make more intuitive sense.
Figure 3: Sampled linear function.
Table 1: Linear prediction filter

    Offset   0     1
    Value    0.5   0.5
Figure 4: Single-stage wavelet transform. Transform is of a linear function using a perfect prediction filter.
Figure 5: Basic inverse wavelet transform.
This recursive pyramid algorithm can continue until there is only one S−n and one D−n point remaining, as long as the original data set contained a power-of-two number of points (see Figure 7). The naming convention for scales is such that increasing scale corresponds to decreasing resolution and a decreasing number of S and D coefficients. Each scale has
half as many coefficients, but they are positioned twice as far apart as the coefficients at
the next lower scale. The frequency information contained in a scale is then half of that
which is contained in the next lower scale. In fact, the wavelet transform stage can be
considered as a high frequency / low frequency filter where the S coefficients are given the
low frequency information and the D coefficients are given the high frequency data. The
separation into scales is also a separation in frequency.
Figure 6: Recursive forward wavelet transform.
The wavelet compression scheme will result in only a single non-zero coefficient (S^{-n}) if the prediction scheme is perfect, as in Figure 7. This seems to violate the basic idea that it takes two constants to define a line. However, the perceptive reader may have noticed that up until this point the boundary conditions have been ignored. Because our filter simply took the average of the two points on either side of the D coefficient, we have been
Figure 7: Fully transformed linear function with perfect prediction filter.
assuming that the prediction scheme could correctly guess the S coefficient values for points after the last D coefficient. For higher-order interpolation schemes, this problem is more obvious, as a wider filter is used and more S coefficients are taken into account that are often outside of the region of sampled data. There are many ways to handle boundary conditions with wavelet expansions, and the different types are appropriate for different uses. The simplest boundary condition scheme is to simply assume that everything outside of the region of interest is zero. This scheme is the easiest to implement, but often results in large D coefficients along boundaries due to the implied discontinuity between the measured non-zero data and the assumed zero values outside of the boundary (see Figure 8). Periodic boundary conditions often give the same boundary difficulties for non-periodic data sets, as Figure 9a shows. For periodic or symmetric data sets, however, it is often advantageous in terms of compression to choose periodic or mirrored boundary conditions (Figure 9b). In the “second generation” wavelet scheme [26, 27], one can also define a prediction scheme which takes the finite interval into account. This can give a better general-case solution to the problem of data compression with wavelets (see Figure 10), but it makes the prediction filter dependent on the spatial coordinate rather than being uniform across the whole region of interest. See Appendix E for an example of defining the linear interpolating wavelets on an interval.
3.1.1 Lifting
It is sometimes desirable to have the multiple scales of coefficients conserve some quantities, such as the zeroth, first, second, or nth moments. In order to accomplish this, we need to update the sequence of S coefficients in some way that preserves these moments. This can be accomplished by adding an “update” stage to the transform after the prediction stage has occurred. As Figure 11 shows, this change is easily invertible
Figure 8: Transformed linear function – zero BCs. (a) Linear input function. (b) Periodic linear function.
Figure 9: Transformed linear function – periodic BCs. (a) Linear input function. (b) Periodic linear function.
Figure 10: Transformed linear function – interpolation BCs. (a) Linear input function. (b) Periodic linear function.
using the same scheme as before: mirror the diagram from left to right, and change the signs of the operations. The update stage is also known as a lifting stage, and there are techniques for defining lifting update stages to preserve multiple moments between the scales of the transformation [26, 27]. For the simple case of linear interpolating wavelets, to conserve the zeroth moment (the average), one defines the update filter as in Table 2. In this case, each D coefficient simply adds 1/4 of its value to each of the two S coefficients surrounding it. This again is subject to boundary condition issues, especially if the prediction filter changes near the boundaries. See Sweldens [26, 27] for more details on deriving update filters in the lifting scheme.
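The lifted stage just described can be sketched as follows (a sketch assuming periodic boundaries; the names are illustrative, not from the thesis code). The prediction uses the Table 1 filter, and the update deposits a quarter of each detail into the two neighbouring S coefficients, so the mean of the coarse S coefficients equals the mean of the input.

```python
import numpy as np

def lifted_stage(s):
    """Predict-then-update stage for linear interpolating wavelets.

    Periodic boundaries.  The update adds d/4 to the two S
    coefficients around each detail (the filter of Table 2), which
    preserves the average between scales.
    """
    even, odd = s[0::2].astype(float).copy(), s[1::2]
    d = odd - 0.5 * (even + np.roll(even, -1))   # predict
    even += 0.25 * (np.roll(d, 1) + d)           # update (lift)
    return even, d

rho = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
s1, d1 = lifted_stage(rho)
```

With these two filters the mean of `s1` matches the mean of `rho` exactly, which is the conservation property the lifting stage is designed to provide.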
Figure 11: Lifted wavelet transforms. (a) Forward transform. (b) Inverse transform.
3.2 What Is the Wavelet Basis?
Recalling that the goal is to create a wavelet basis set, one may wonder why only the
coefficients have been mentioned thus far. If this wavelet transform does convert from the
Table 2: Linear update filter
Offset 0 1
Value 0.25 0.25
Note: This filter is designed to preserve the average.
sampled set into another basis, then what do the basis functions look like, and what are their properties? Because the wavelet basis is in fact defined through the transform and inverse transform, an obvious way to examine the basis functions is to simply set a single coefficient equal to unity in the wavelet basis and perform an inverse transformation to see what the function becomes in coordinate space. In Figure 12, several such functions are drawn. The S coefficients produce functions called scaling functions, which will be represented by φ^k_i (where k denotes the level³ and i is the index within that level), and the D coefficients produce functions called wavelet functions, denoted ψ^k_i.
While these functions appear to be only coarse discrete functions, by adding additional lower scales of zero coefficients they can be interpolated to any resolution and are defined at any fractional value that can be represented by a computer (Figure 13). These scaling (φ^k_i) and wavelet (ψ^k_i) functions behave as pseudo-continuous functions and are defined at any rational number that can be obtained as a binary fraction (0.d_1 d_2 d_3 d_4 … = Σ_i d_i 2^{-i}). Therefore, we can treat them as functions of the coordinate space (φ^k_i(x) and ψ^k_i(x)). Though they are not defined on the set of irrational numbers, they are defined on rational numbers which are arbitrarily nearby. Since only rational numbers are representable in a finite computer representation, we may treat these functions φ^k_i(x) and ψ^k_i(x) as continuous functions of x for any computational purpose.
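The “set one coefficient to unity and inverse transform” procedure can be sketched directly (periodic boundaries assumed; the helper names are illustrative, not the thesis implementation). Two refinement passes with all-zero details turn a unit S coefficient into a sampled hat function, the linear interpolating scaling function:

```python
import numpy as np

def inverse_stage(even, d):
    """Undo one un-lifted predict stage (periodic): restore the odd
    samples as detail + prediction, then re-interleave."""
    odd = d + 0.5 * (even + np.roll(even, -1))
    s = np.empty(2 * len(even))
    s[0::2], s[1::2] = even, odd
    return s

def scaling_function(i, coarse_n=4, refinements=2):
    """Set a single coarse S coefficient to 1 and inverse-transform
    with zero details to sample phi at ever finer resolution."""
    s = np.zeros(coarse_n)
    s[i] = 1.0
    for _ in range(refinements):
        s = inverse_stage(s, np.zeros(len(s)))
    return s

phi = scaling_function(1)   # samples of a hat (triangle) function
```

Each extra refinement pass doubles the sampling resolution of the same underlying function, which is exactly the resolution-doubling illustrated in Figure 13.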
Because the shape of an individual wavelet function does not depend on its scale or its index, we can write all of the functions φ^k_i(x) as translations and scalings of some base scaling function φ(x). Similarly, the wavelet functions are all translations and dilations of some function ψ(x) (often referred to as the “mother wavelet”) [11]. Formally, this is
³Note that k follows the numbering for levels rather than scales. See Footnote 2 on page 14 for a discussion of the difference.
Figure 12: Wavelet functions of linear interpolating wavelets. Left column is input wavelet coefficients, and right column is the result of the inverse wavelet transform of those coefficients (with zero for the boundary conditions), showing the wavelet or scaling function associated with that coefficient.
Figure 13: Wavelet functions of linear interpolating wavelets – doubled resolution. This figure is identical to Figure 12, with the addition of another layer of d^3_i coefficients which are set to zero. The result is a double-resolution view of the same wavelet and scaling functions from Figure 12. This resolution doubling can be repeated to view the functions at any desired resolution.
28
expressed as
φki (x) = φ(2kx− i), (3.1)
ψki (x) = ψ(2kx− i). (3.2)
Another unique attribute of these functions is that they are strictly limited in support. Beyond some finite range ±x, they are identically equal to zero. This is a feature which is not achievable with analytic functions constructed as a power series representation. Even the most rapidly decaying analytic function has some small non-zero tail region with infinite extent. The compact support of the wavelet functions is an advantage in two ways. First, it more accurately matches distributions of physical systems – avoiding the introduction of spurious tail regions into the expansion, and allowing an inverse transform with perfect reconstruction. Second, it simplifies many operations in wavelet space, as wavelet functions which are far apart in real space have zero overlap and can have no direct interaction with each other through spatially local operators. Higher-order wavelets, constructed with higher-order interpolation schemes, have larger support and more complicated φ(x) and ψ(x) functions, but all wavelet schemes possess this property of compact support (Figure 14).
So the wavelet basis set has achieved one of the goals demanded of it: localization in space. In addition, the fact that the φ^k_i and ψ^k_i have some finite non-zero support rather than being a δ function implies, through the time-frequency uncertainty principle, that they may be localized in some frequency band as well. Note that because the wavelet function is not periodic, it will require an infinite frequency range to fully represent it, but if the wavelet function is fairly smooth (has a large number of continuous derivatives), the decay of its Fourier coefficients toward infinity will be rapid. For n continuous derivatives of ψ^k_i(x), the Fourier coefficients will decay as 1/ω^{n+1} at large ω [11]. This
Figure 14: Wavelet functions – higher order families. (a) Linear interpolating wavelets. (b) 4th-order interpolating wavelets. (c) 6th-order interpolating wavelets. (d) 8th-order interpolating wavelets. Notice that while the higher order functions do have wider support, their support is still finite.
can be seen in Figures 15 and 16 as the higher order and smoother functions decay much
more rapidly in the higher frequency range.
In the low-frequency regime, the Fourier components can be made to decay as well if certain constraints are met by the wavelet functions [11]. The Fourier transform of the wavelet function is given by

    Ψ(ω) = ∫ ψ(x) e^{-iωx} dx,   (3.3)

and at ω = 0, this becomes simply Ψ(ω = 0) = ∫ ψ(x) dx. So, the Fourier spectrum will decay to zero at the origin if the wavelet function has a vanishing zeroth moment. Furthermore, expanding Ψ(ω) as a power series around the origin gives

    Ψ(ω) = Σ_{l=0}^{∞} (ω^l / l!) [d^l Ψ(ω)/dω^l]|_{ω=0},   (3.4)

where the derivatives [d^l Ψ(ω)/dω^l]|_{ω=0} are given by

    d^l Ψ(ω)/dω^l |_{ω=0} = d^l/dω^l ∫ ψ(x) e^{-iωx} dx |_{ω=0}
                          = ∫ ψ(x) (-ix)^l e^{-iωx} dx |_{ω=0}
                          = (-i)^l ∫ x^l ψ(x) dx.   (3.5)
Therefore the series expansion says that if ψ(x) has m vanishing moments, then Ψ(ω) will vanish as ω^m for ω → 0. As Figures 15 and 16 show, the lifted wavelet functions with
more vanishing moments are more localized in frequency near zero.
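The vanishing moments can be checked numerically. In this minimal sketch (periodic boundaries, the linear predict filter of Table 1 and update filter of Table 2; names are illustrative), inverse transforming a single unit D coefficient yields samples of the lifted wavelet, whose zeroth and first discrete moments both vanish:

```python
import numpy as np

def lifted_inverse_stage(even, d):
    """Inverse of the lifted stage (periodic): undo the update, then
    undo the prediction, then re-interleave even and odd samples."""
    even = even - 0.25 * (np.roll(d, 1) + d)     # un-lift
    odd = d + 0.5 * (even + np.roll(even, -1))   # un-predict
    s = np.empty(2 * len(even))
    s[0::2], s[1::2] = even, odd
    return s

# Samples of one lifted wavelet: unit D coefficient, zero S coefficients.
d = np.zeros(4)
d[1] = 1.0
psi = lifted_inverse_stage(np.zeros(4), d)
x = np.arange(len(psi))
```

Both `psi.sum()` (zeroth moment) and `(x * psi).sum()` (first moment) come out to zero, consistent with the two vanishing moments quoted for the lifted linear wavelets in Figure 15.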
Acquiring vanishing moments in the wavelet functions is accomplished through modifying the update stage of the wavelet transform. Notice that in the inverse transform of
Figure 11b, the update stage can affect both even and odd coefficients, and can cause a
change in the shape of the resulting ψ(x) function, because it depends on D coefficients
Figure 15: Shape and frequency spectrum of linear interpolating wavelets. (a) and (b) are linear interpolating wavelets; (c) and (d) are lifted linear interpolating wavelets with the first two moments vanishing.
Figure 16: Shape and frequency spectrum of 8th-order interpolating wavelets. (a) and (b) are 8th-order interpolating wavelets; (c) and (d) are lifted 8th-order interpolating wavelets with the first eight moments vanishing.
as input. Because the wavelet expansion is a complete basis which is separated into two spaces at each step, namely scaling function space (φ^k_i) and wavelet function space (ψ^k_i), if one of these two spaces conserves a quantity, then the functions in the other space must be orthogonal to this quantity. For example, if the scaling function space is to conserve the average, then the wavelet functions must have a zero average because they are all orthogonal to the zeroth moment. So it is equivalent to say either that the wavelet function has been lifted so that it has a vanishing moment, or that the scaling function space has been lifted so that it conserves this moment. This is another purpose of the lifting stage of the transformation – to further localize the wavelet functions in frequency space by adding more vanishing moments to ψ(x).
CHAPTER 4
SPECIAL TOPICS
This chapter will address several diverse topics and historical notation from wavelet
theory. Much of this information, especially the refinement relations with the alternate
h and g filter notation and the concept of a dual wavelet space, will be critical to the
understanding of later chapters of this work.
4.1 Refinement Relations
Running a single S coefficient through one stage of the inverse transform (Fig. 11b), one can see that any function φ^k_i is simply a linear combination of functions φ^{k+1}_j at the lower scale.¹ In the same way, the wavelet functions ψ^k_i can also be written as a combination of scaling functions φ^{k+1}_j. This result can be expressed directly as a set of refinement relations between the base functions (defined in Eqns. 3.1 and 3.2):

    φ(x) = Σ_j h_j φ(2x - j),   (4.1)
    ψ(x) = Σ_j g_j φ(2x - j),   (4.2)

where the range of j depends on the size of the h and g filters.
¹Again, level k + 1 is a lower scale than k. See Footnote 2 on page 14.
It is often useful to rephrase these equations in terms of relations between functions at adjacent levels. By
using Eqns. 3.1 and 3.2, the following can be obtained:

    φ^k_i(x) = Σ_j h_j φ^{k+1}_{2i+j}(x),   (4.3)
    ψ^k_i(x) = Σ_j g_j φ^{k+1}_{2i+j}(x).   (4.4)
Wavelet families can be defined entirely by these refinement relations, and this notation is often used in the literature [1, 7, 11]. These h and g filters can be derived, as mentioned, from the action of the prediction and update filters in the inverse wavelet transform.² The resulting expressions are (using P for the predict filter and U for the update filter)³:

    h_{2i} = δ_{i,0},   (4.5)
    h_{2i+1} = P_{-i},
    g_{2i} = -U_{-i},
    g_{2i+1} = δ_{i,0} - Σ_j U_{-j-i} P_j,

where again, the index j runs over all of the non-zero values of U_{-j-i} P_j, and U_i and P_i are defined to be zero outside of their established filter range.
²While h and g can be derived from the prediction and update filters, it is also possible to construct valid h and g filters which have no finite P and U filter representation. This is the case with orthogonal wavelets, for example [26].
³These relations were derived by following a delta-function sequence δ_{i,0} through either the “S” or “D” branch of the inverse transform, and observing what information came out as S coefficients on the lower scale. This is how the h and g coefficients of Eqns. 4.1 and 4.2 are defined.
This new notation allows us to compute integrals involving wavelet functions, which will be useful when defining the O and L matrices later. As an example, we can compute
the normalization condition for the scaling functions via

    1 = ∫ φ(x) dx
      = ∫ Σ_j h_j φ(2x - j) dx
      = Σ_j h_j ( (1/2) ∫ φ(X) dX )
      = (1/2) Σ_j h_j,

    2 = Σ_j h_j.   (4.6)
In this fashion, we can often obtain relationships between the filter coefficients in the
discrete set which are equivalent to the integral relationships in the continuous functions,
and we can perform integrals over wavelet functions even though they are defined only by
their refinement relationships.
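The delta-sequence recipe of Footnote 3 can be carried out numerically (a sketch assuming the lifted linear interpolating family with periodic boundaries; the helper name is illustrative): feed a delta through the “S” or “D” branch of the inverse transform and read off the lower-scale S coefficients. The resulting taps satisfy Σ_j h_j = 2 from Eq. (4.6), and the g taps sum to zero (a vanishing zeroth moment).

```python
import numpy as np

def lifted_inverse_stage(even, d):
    """Inverse lifted stage (periodic), with linear predict filter
    P = (1/2, 1/2) and update filter U = (1/4, 1/4)."""
    even = even - 0.25 * (np.roll(d, 1) + d)
    odd = d + 0.5 * (even + np.roll(even, -1))
    s = np.empty(2 * len(even))
    s[0::2], s[1::2] = even, odd
    return s

# "S" branch: a unit S coefficient and zero details yields the h taps.
delta_s = np.zeros(4); delta_s[0] = 1.0
h = lifted_inverse_stage(delta_s, np.zeros(4))

# "D" branch: a unit D coefficient and zero S coefficients yields g.
delta_d = np.zeros(4); delta_d[0] = 1.0
g = lifted_inverse_stage(np.zeros(4), delta_d)
```

The non-zero entries of `h` (1 at the even tap, 1/2 at the two odd taps, one of them wrapped by periodicity) reproduce the pattern of Eqns. (4.5) for P = (1/2, 1/2).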
4.1.1 Dual Space
The wavelet bases which have been introduced up to this point are not orthogonal. In other words, the overlap matrix O := 〈φ_i|Φ_j〉 from Chapter 2 contains possibly significant off-diagonal components. Because there are many cases where orthonormality is useful for simplifying expressions, we introduce another wavelet basis which is orthogonal (dual) to the first (primal) wavelet basis.
This new dual basis is composed of dual scaling (φ̃^k_i) and dual wavelet (ψ̃^k_i) functions derived from a base scaling function and mother wavelet

    φ̃^k_i(x) = 2^k φ̃(2^k x - i),   (4.7)
    ψ̃^k_i(x) = 2^k ψ̃(2^k x - i),   (4.8)

which obey a set of refinement relations similar to that of the primal wavelets:

    φ̃(x) = 2 Σ_j h̃_j φ̃(2x - j),   (4.9)
    ψ̃(x) = 2 Σ_j g̃_j φ̃(2x - j),   (4.10)
    φ̃^k_i(x) = Σ_j h̃_j φ̃^{k+1}_{2i+j}(x),   (4.11)
    ψ̃^k_i(x) = Σ_j g̃_j φ̃^{k+1}_{2i+j}(x).   (4.12)
Notice that there are now four defining filters for the wavelet family: h, g, h̃, and g̃. The constraints between the two bases will leave only two of these independent. Also, the factors of 2 and 2^k in these relations will become apparent shortly. The two bases are required to satisfy the following biorthogonality constraints on a single level k:

    〈φ̃^k_i|φ^k_j〉 = δ_{i,j},
    〈φ̃^k_i|ψ^k_j〉 = 0,
    〈ψ̃^k_i|φ^k_j〉 = 0,
    〈ψ̃^k_i|ψ^k_j〉 = δ_{i,j},

and if these are satisfied, the refinement relations between bases guarantee an even stricter set of constraints between multiple levels⁴:

    〈φ̃^k_i|φ^k_j〉 = δ_{i,j},   (4.13)
    〈φ̃^k_i|ψ^l_j〉 = 0, k ≤ l,   (4.14)
    〈ψ̃^k_i|φ^l_j〉 = 0, k ≥ l,   (4.15)
    〈ψ̃^k_i|ψ^l_j〉 = δ_{i,j} δ_{k,l}.   (4.16)
A few questions remain: First, why are the definitions and refinement relations for the dual functions different from those for the primal functions? Second, what are the relationships between h, g, h̃, and g̃, and how do we find the dual functions given only the primal functions?
The answer to the first question lies in the orthonormality constraint on the scaling functions:

    δ_{i,j} = 〈φ̃^k_i(x)|φ^k_j(x)〉
            = 〈2^k φ̃(2^k x - i)|φ(2^k x - j)〉
            = ∫ 2^k φ̃(2^k x - i) φ(2^k x - j) dx
            = ∫ 2^k φ̃(X - i) φ(X - j) (2^{-k}) dX
            = 〈φ̃^0_i(X)|φ^0_j(X)〉.   (4.17)

Therefore the factor of 2 between scales of the dual functions is just a normalization factor to ensure that 〈φ̃^k_i(x)|φ^k_i(x)〉 = 1 for all levels k. Note that because this factor is only a normalization factor, it could have been shared symmetrically between φ and φ̃ as a √2 factor, as is the case with orthogonal wavelet families [11]. For simplicity, it is usually placed entirely on the dual functions when working with biorthogonal wavelet families, but this choice is arbitrary.
⁴This is because any function φ^l, φ̃^l, ψ^l, or ψ̃^l can be written recursively as a sum of scaling functions φ^{l+1} or φ̃^{l+1} at a lower scale until l = k, and the above expressions for 〈φ̃^k_i|ψ^k_j〉 = 〈ψ̃^k_i|φ^k_j〉 = 0 can be invoked.
A similar method can be used to establish orthonormality relations between the refinement relation filters (h, g, h̃, and g̃). For example, substituting Eqns. (4.11) and (4.3) into Eq. (4.13) gives

    δ_{i,j} = 〈φ̃^k_i|φ^k_j〉
            = 〈Σ_μ h̃_μ φ̃^{k+1}_{2i+μ} | Σ_ν h_ν φ^{k+1}_{2j+ν}〉
            = Σ_{μ,ν} h̃_μ h_ν 〈φ̃^{k+1}_{2i+μ}|φ^{k+1}_{2j+ν}〉
            = Σ_{μ,ν} h̃_μ h_ν δ_{2i+μ,2j+ν}
            = Σ_ν h̃_{ν+2j-2i} h_ν   and, replacing the dummy index ν with l = ν - 2i,
            = Σ_l h̃_{2j+l} h_{2i+l}.

Similar derivations give the other orthonormality conditions in terms of the refinement relation filters:

    Σ_l h̃_{2j+l} h_{2i+l} = δ_{i,j},   (4.18)
    Σ_l h̃_{2j+l} g_{2i+l} = 0,   (4.19)
    Σ_l g̃_{2j+l} h_{2i+l} = 0,   (4.20)
    Σ_l g̃_{2j+l} g_{2i+l} = δ_{i,j}.   (4.21)
A solution of these constraints is given by the relations:

    g_i = (-1)^{1-i} h̃_{1-i},   (4.22)
    h̃_i = (-1)^i g_{1-i}.   (4.23)

Note that Eq. (4.23) has essentially the same form as Eq. (4.22), but it is solved for h̃ rather than g. Now we have a set of relations that link the dual and the primal spaces together, but we need to derive wavelet transforms for the dual wavelets in order to examine the characteristics of the dual space.
4.2 Dual and Primal Wavelet Transform
Armed with the dual basis and the orthogonality relations from the previous section, we can define a single stage of the forward wavelet transform of a function f as

    s^k_i = 〈φ̃^k_i|f〉
          = 〈Σ_j h̃_j φ̃^{k+1}_{2i+j}|f〉,
    s^k_i = Σ_j h̃_j s^{k+1}_{2i+j},   (4.24)

    d^k_i = 〈ψ̃^k_i|f〉,
    d^k_i = Σ_j g̃_j s^{k+1}_{2i+j}.   (4.25)

In a similar fashion, the dual forward transform is found to be

    s̃^k_i = Σ_j h_j s̃^{k+1}_{2i+j},   (4.26)
    d̃^k_i = Σ_j g_j s̃^{k+1}_{2i+j}.   (4.27)
Since these are linear transformations, we can represent them as matrices F and F̃ for the forward and dual forward transforms, respectively.
The next goal is to find the backward transform (B = F^{-1}) and its dual form (B̃ = F̃^{-1}). Toward this goal, we begin with an expression of the relationship between scales in the wavelet expansion:

    Σ_ν s^k_ν |φ^k_ν〉 + Σ_μ d^k_μ |ψ^k_μ〉 = Σ_i s^{k+1}_i |φ^{k+1}_i〉,   (4.28)

where s^k_ν, d^k_μ, and s^{k+1}_i are just the expansion coefficients on levels k and k + 1. The goal
of the forward transform is to find the left-hand side of this equation in terms of the right-hand side, and the goal of the inverse transform is exactly the opposite. To isolate the right-hand side, we operate on the entire equation with 〈φ̃^{k+1}_j| and use the orthogonality conditions to isolate a single s^{k+1}_j:

    〈φ̃^{k+1}_j| Σ_ν s^k_ν |φ^k_ν〉 + 〈φ̃^{k+1}_j| Σ_μ d^k_μ |ψ^k_μ〉 = 〈φ̃^{k+1}_j| Σ_i s^{k+1}_i |φ^{k+1}_i〉,
    Σ_ν s^k_ν 〈φ̃^{k+1}_j|φ^k_ν〉 + Σ_μ d^k_μ 〈φ̃^{k+1}_j|ψ^k_μ〉 = s^{k+1}_j,
    Σ_{ν,α} s^k_ν h_α 〈φ̃^{k+1}_j|φ^{k+1}_{2ν+α}〉 + Σ_{μ,β} d^k_μ g_β 〈φ̃^{k+1}_j|φ^{k+1}_{2μ+β}〉 = s^{k+1}_j,
    Σ_{ν,α} s^k_ν h_α δ_{j,2ν+α} + Σ_{μ,β} d^k_μ g_β δ_{j,2μ+β} = s^{k+1}_j,
    Σ_α s^k_{(j-α)/2} h_α + Σ_β d^k_{(j-β)/2} g_β = s^{k+1}_j,

and, separating the even and odd output indices,

    Σ_α ( s^k_{j-α} h_{2α} + d^k_{j-α} g_{2α} ) = s^{k+1}_{2j},   (4.29)
    Σ_α ( s^k_{j-α} h_{2α+1} + d^k_{j-α} g_{2α+1} ) = s^{k+1}_{2j+1}.   (4.30)
So the backwards transform B consists of Eqns. (4.29) and (4.30) for the even and odd terms of s^{k+1}_n. Analogously, the dual reverse transform B̃ has the same form as Eqns. (4.29) and (4.30), but with h and g replaced by h̃ and g̃.
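The filter forms of the forward stage (Eqns. 4.24-4.25) and backward stage (Eqns. 4.29-4.30) can be sketched as follows, assuming periodic boundaries and the un-lifted linear interpolating filters; the `{offset: value}` filter representation and the function names are illustrative choices, not the thesis code.

```python
import numpy as np

# Filters of the un-lifted linear interpolating family; ht and gt
# stand in for h-tilde and g-tilde (cf. Eqns. (4.5), (4.22), (4.23)).
h  = {-1: 0.5, 0: 1.0, 1: 0.5}
g  = {1: 1.0}
ht = {0: 1.0}
gt = {0: -0.5, 1: 1.0, 2: -0.5}

def forward_stage(s):
    """Eqns. (4.24)-(4.25): s^k_i = sum_j ht_j s^{k+1}_{2i+j} and
    d^k_i = sum_j gt_j s^{k+1}_{2i+j}, with periodic indexing."""
    n = len(s)
    sk = np.array([sum(v * s[(2*i + j) % n] for j, v in ht.items())
                   for i in range(n // 2)])
    dk = np.array([sum(v * s[(2*i + j) % n] for j, v in gt.items())
                   for i in range(n // 2)])
    return sk, dk

def backward_stage(sk, dk):
    """Eqns. (4.29)-(4.30): rebuild the even and odd entries of
    s^{k+1} from s^k and d^k using the primal filters h and g."""
    n = len(sk)
    out = np.zeros(2 * n)
    for j in range(n):
        for parity in (0, 1):                 # even / odd output index
            val = 0.0
            for off, v in h.items():
                if off % 2 == parity:         # the h_{2a} or h_{2a+1} taps
                    val += sk[(j - (off - parity) // 2) % n] * v
            for off, v in g.items():
                if off % 2 == parity:         # the g_{2a} or g_{2a+1} taps
                    val += dk[(j - (off - parity) // 2) % n] * v
            out[2*j + parity] = val
    return out

s = np.arange(8.0)
sk, dk = forward_stage(s)
rec = backward_stage(sk, dk)      # round trip recovers the input
```

Since h̃ is a delta here, the S coefficients are just the even samples, and the round trip `backward_stage(forward_stage(s))` reproduces the input exactly.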
Because the basis function set is not guaranteed to be orthonormal, the transformations F and B are not guaranteed to be unitary. In general,

    F†F ≠ 1,
    B†B ≠ 1,

and so,

    F† ≠ B,
    B† ≠ F,

contrary to the usual results for a unitary transform from an orthogonal coordinate basis to another orthogonal basis set. For biorthogonal wavelet bases, however, the above relations are correctly generalized as

    F̃†F = 1,   F̃† = B,   (4.31)
    B̃†B = 1,   B̃† = F.   (4.32)

These equations are derived in Appendix B.
The full recursive wavelet transformations can be defined in terms of the single-level transforms as

    W = F_1 F_2 … F_N = Π_{n=1}^{N} F_n,   (4.33)
    W^{-1} = B_N B_{N-1} … B_1 = Π_{n=N}^{1} B_n,   (4.34)
    W̃ = F̃_1 F̃_2 … F̃_N = Π_{n=1}^{N} F̃_n,   (4.35)
    W̃^{-1} = B̃_N B̃_{N-1} … B̃_1 = Π_{n=N}^{1} B̃_n.   (4.36)

For these full transforms, Eqns. (4.31) and (4.32) imply that

    W̃† = W^{-1},   (4.37)
    (W̃^{-1})† = W.   (4.38)
These two relationships will become indispensable in later sections of this thesis in ensuring
the symmetry of operators in wavelet space. An important connection to recall is that
if the wavelet families are defined through Eqns. (4.5), these F and B transforms are
identical to those created using the prediction and update filters from Section 3.1.
4.3 Orthonormal Wavelets
Much research has gone into the development of wavelet families which are both orthonormal and smooth [7], and several such families have been discovered. Orthogonal wavelets were also used by Terzic, Pogorelov, and Bohn in their wavelet-based Poisson solver [28]. Because of some special characteristics of the biorthogonal interpolating wavelets that simplify the multiresolution wavelet transform (see Section 5.7), they are the chosen basis for this current research. Orthonormal wavelet families are of limited interest here. However, orthonormal wavelets have a few important special considerations which need to be addressed in order to understand some of the features of the biorthogonal wavelet families which are the focus of this thesis.
In the orthonormal case, the biorthogonality relationships of Eqns. (4.13-4.16) are replaced with the set establishing orthonormality between the non-dual functions themselves:

    〈φ^k_i|φ^k_j〉 = δ_{i,j},   (4.39)
    〈ψ^k_i|φ^l_j〉 = 0, k ≥ l,   (4.40)
    〈ψ^k_i|ψ^l_j〉 = δ_{i,j} δ_{k,l}.   (4.41)

Even though these relationships do not seem to define a fully orthonormal set in the usual fashion, consider the fact that a function expanded in wavelet space has the form:

    |f〉 = Σ_i s^k_i |φ^k_i〉 + Σ_{n=k}^{M} Σ_i d^n_i |ψ^n_i〉.

In this case, all of the included basis functions are orthonormal to each other, as is required for an orthonormal basis function set.
Another observation to make is that the orthogonality relationships of Eqns. (4.39-4.41) can be derived from the biorthogonal relationships of Eqns. (4.13-4.16) by the simple substitutions:

    φ̃ = φ,
    ψ̃ = ψ,

making the dual and primal wavelet spaces identical. The first useful result of this equality is that now W̃ = W, so normal unitary transformation relations apply, as is expected of an orthonormal basis.
The second result, which will be useful in Section 5.6 when considering preconditioning effects, is that the normalization conditions on the wavelet and scaling functions need to change from those of Eqns. (3.1) and (3.2). Consider adding an arbitrary normalization factor A_k:

    φ^k_i(x) = A_k φ(2^k x - i),
    ψ^k_i(x) = A_k ψ(2^k x - i),

and computing the orthonormal wavelet family equivalent of Eqn. (4.17), we obtain:

    δ_{i,j} = 〈φ^k_i(x)|φ^k_j(x)〉
            = 〈A_k φ(2^k x - i)|A_k φ(2^k x - j)〉
            = A_k² 2^{-k} ∫ φ(X - i) φ(X - j) dX
            = A_k² 2^{-k} 〈φ^0_i(X)|φ^0_j(X)〉.

This, and a similar argument for ψ, fixes A_k = √(2^k), so for orthonormal wavelet sets we need to replace Eqns. (3.1) and (3.2) with:

    φ^k_i(x) = √(2^k) φ(2^k x - i),   (4.42)
    ψ^k_i(x) = √(2^k) ψ(2^k x - i).   (4.43)

This difference in the normalization factors between the biorthogonal and orthonormal wavelet bases causes significant effects when considering preconditioning in the wavelet basis and will be considered in more detail in Section 5.6.
4.4 Multidimensional Wavelets
Multidimensional wavelet bases can be constructed either directly or by use of tensor products of 1D wavelet functions. Tensor product forms of wavelet functions are the simplest for constructing transforms and defining operators. They are defined simply by multiplying 1D wavelet functions together in outer products. In 2D, for example, this results in four different types of basis functions at each scale level – one scaling function and three wavelet-style functions:

    φ^k_{s_l,s_m}(x, y) = φ^k_l(x) φ^k_m(y),   (4.44)
    ψ^k_{s_l,d_m}(x, y) = φ^k_l(x) ψ^k_m(y),   (4.45)
    ψ^k_{d_l,s_m}(x, y) = ψ^k_l(x) φ^k_m(y),   (4.46)
    ψ^k_{d_l,d_m}(x, y) = ψ^k_l(x) ψ^k_m(y),   (4.47)

with analogous definitions for the dual functions as well. Using these functions, the 2D wavelet transform can be defined as the tensor product W_2D = W_x ⊗ W_y, whose factors operate independently in the x and y directions.⁵ Note that because these operations act on independent spaces, they commute and W_2D = W_x ⊗ W_y = W_y ⊗ W_x. These transforms take the respective 1D wavelet functions from the scaling function form of Eqn. (4.44) to the wavelet-style forms of Eqns. (4.45-4.47). The dual and inverse wavelet transformation functions are defined analogously.
⁵Recall that for function vectors, |φ〉_x ⊗ |ψ〉_y = φ(x)ψ(y), which is why the tensor product symbol ⊗ is generally suppressed for them.
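The tensor-product transform can be sketched by applying a 1D stage along each axis in turn (un-lifted linear interpolating stage, periodic boundaries; names are illustrative, not thesis code). Because the two 1D transforms act on independent indices, the x-then-y and y-then-x orders give identical results:

```python
import numpy as np

def stage_1d(s):
    """One un-lifted linear interpolating stage (periodic), returning
    the level's coefficients as [S | D]."""
    even, odd = s[0::2], s[1::2]
    d = odd - 0.5 * (even + np.roll(even, -1))
    return np.concatenate([even, d])

def stage_2d(a, order="xy"):
    """Tensor-product 2D stage: apply the 1D stage along each axis."""
    axes = (1, 0) if order == "xy" else (0, 1)
    for ax in axes:
        a = np.apply_along_axis(stage_1d, ax, a)
    return a

# Constant input: the sd, ds, and dd detail blocks all vanish, leaving
# only the ss block of coarse scaling coefficients.
c = stage_2d(np.ones((8, 8)))
```

After one stage the array is partitioned into the four blocks of Eqns. (4.44-4.47): ss in the top-left quadrant and the sd/ds/dd detail blocks in the remaining three.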
47
When working with wavelets in an arbitrary number of dimensions, it is convenient to invent a more abstract notation for describing a tensor product wavelet function:

    ξ^k_{i_x,i_y,i_z,…}(x, y, z, …) = Π_{d=x,y,z,…} { φ^k_{i_d/2}(d),       i_d even,
                                                      ψ^k_{(i_d-1)/2}(d),  i_d odd.   (4.48)
Notice that this definition reproduces Eqns. (4.44-4.47) for the 2D case, but it also establishes an indexing scheme for the wavelets in N dimensions. For example, in the 2D case, it specifies that only the points where i_x and i_y are both even correspond to the scaling-function type (or ss-type) functions of Eqn. (4.44), and the wavelet-wavelet (or dd-type) functions of Eqn. (4.47) are located on points with odd i_x and i_y. The sd-type and ds-type functions correspond to the even-odd and odd-even combinations of i_x and i_y. The 3D case is defined analogously, but there is one sss-type scaling function and seven wavelet-style functions.
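The parity-based indexing of Eq. (4.48) reduces to a tiny classifier (an illustrative sketch, not thesis code):

```python
def wavelet_type(*indices):
    """Classify a tensor-product basis function by index parity, as in
    Eq. (4.48): an even index selects a scaling-type factor ('s'), an
    odd index a wavelet-type factor ('d')."""
    return "".join("s" if i % 2 == 0 else "d" for i in indices)
```

For example, `wavelet_type(0, 1)` gives the sd-type of Eqn. (4.45), and in 3D one sss-type and seven wavelet-style combinations appear, matching the count quoted above.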
A full set of recursion relations and transforms analogous to the 1D cases in Section 4.1
could be defined in this new notation, but for the present work it is simpler to use the
abstract ξ(x, y, z, · · · ) only as a shorthand notation for preliminary steps of derivations,
and expand the expression in the form of Eqns. (4.44-4.47) when performing operations
on the wavelet functions.
4.5 Continuous Real Space to Wavelet Space
Up to this point, it has been implied that the S coefficients of a vector were equivalent to the real-space values of the function at those points. This is only strictly true for the interpolating wavelet families, although it is often used as an approximation for other wavelet families as well. From the biorthogonality constraints, we know that

    s^k_i = 〈φ̃^k_i|f〉.

So, the S coefficients are determined by the form of the dual scaling function φ̃. For the non-lifted interpolating wavelet families, we have U = 0, and so Equations (4.5) and (4.23) give

    g_i = δ_{i,1},   (4.49)
    h̃_i = δ_{i,0},   (4.50)

and using this h̃ filter in Eqn. (4.9) results in the expression:

    φ̃(x) = 2 Σ_j δ_{j,0} φ̃(2x - j),
    φ̃(x) = 2 φ̃(2x),   (4.51)

which is a recursion relationship that can only be satisfied by a Dirac delta function. So for the non-lifted interpolating wavelet functions we have

    φ̃(x) = δ(x),   (4.52)

and the samples from continuous space are identically equal to the S coefficients of the wavelet space.
For wavelet bases in general, the equivalence between S coefficients and delta-function samples of the function no longer holds exactly. However, in the limit of large k and/or a relatively smooth function |f〉, the approximation is fairly accurate. The normalization of φ̃ is usually chosen such that

    1 = ∫ φ̃(x) dx
      = 2 Σ_j h̃_j ∫ φ̃(2x - j) dx
      = Σ_j h̃_j ∫ φ̃(X) dX,
    Σ_j h̃_j = 1.   (4.53)

For biorthogonal wavelet bases, it is common practice to choose h̃ such that Σ_j h̃_j = 1, and similarly, orthonormal bases are normalized such that Σ_j h_j = √2.⁶ For large k then, φ̃ from any typical wavelet basis is a sharply peaked unit-norm function with very short support. A delta-function sampling provides an excellent approximation to φ̃ as long as the sampled function is well behaved and somewhat smooth in the sampling region. As a result, for uniform grids, it is common practice to simply take the delta-function samples as equivalent to the lowest-scale S coefficients. Non-uniform grids, however, require special consideration, as the deviation between δ(x) and φ̃(x) may be large at higher scales where φ̃(x) has wide support.⁷
4.6 Human-Readable Representation of Vectors
The interleaved arrangement of scales in wavelet-space vectors used in earlier figures
(see Figures 7-10) is often used in computer implementations of wavelet codes, but it is dif-
ficult to visualize the structure of the underlying wavelet expansion in this format. When
6Recall that h = h in an orthonormal basis.
7Non-uniform grids will be considered in more detail in Section 5.7.
50
visualizing wavelet-transformed data, it is more common to show the scales separated by
a simple permutation of the components. For example, instead of the interleaved format:
[ s^{-2}_0  d^0_0  d^{-1}_0  d^0_1  d^{-2}_0  d^0_2  d^{-1}_1  d^0_3 ],
the data is arranged in order of decreasing scale:
[ s^{-2}_0  d^{-2}_0  d^{-1}_0  d^{-1}_1  d^0_0  d^0_1  d^0_2  d^0_3 ].
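This reordering is a pure permutation of vector components. A minimal sketch (assuming a length-2ⁿ vector stored in the interleaved layout shown above, with the single S coefficient at index 0 and the D coefficients of each finer scale at odd multiples of successively smaller strides):

```python
def interleaved_to_ordered(v):
    """Permute an interleaved wavelet-space vector into the
    human-readable, scale-by-scale ordering (coarsest scale first)."""
    n = len(v)
    assert n & (n - 1) == 0, "length must be a power of two"
    out = [v[0]]                     # the single coarsest-scale S coefficient
    stride = n
    while stride > 1:
        # D coefficients of the current scale sit at stride/2, 3*stride/2, ...
        out.extend(v[stride // 2 :: stride])
        stride //= 2
    return out

# The 8-component example from the text: indices 0..7 stand in for
# [s^-2_0, d^0_0, d^-1_0, d^0_1, d^-2_0, d^0_2, d^-1_1, d^0_3].
print(interleaved_to_ordered(list(range(8))))  # → [0, 4, 2, 6, 1, 3, 5, 7]
```

The returned order groups the single d^{-2} coefficient, then the two d^{-1} coefficients, then the four d^0 coefficients, matching the human-readable vector above.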
Figure 17 shows a pair of Gaussian functions expanded in wavelet space, with this
human-readable ordering of wavelet coefficients. In this format, it is easy to see that the
larger Gaussian contributes to the larger scales, but only the smaller central section
contributes to the higher frequency scales. Notice also the significant sparseness of the vector
expanded in wavelet space – this is typical of wavelet expansions for physical functions
and is a result of the dual frequency and spatial locality of the basis.
Figure 17: Wavelet transform example. The top plot is a superposition of two Gaussians of different widths in real space. The bottom plot shows the wavelet transform of this function with human-readable ordering of wavelet coefficients. Vertical lines divide the individual scales.
In the remainder of this work, the computer code and mathematics will continue to use
the interleaved format of describing vectors, but I will use the human-readable ordering
for displaying vectors and operator matrices.
CHAPTER 5
THE ALGORITHM
Depending on the family of basis functions, the algorithm for the solution of the Poisson
equation can appear vastly different. In all cases, the goal is to solve Eqn. (2.7) for ~u:
L(~u+ ~ubc) = O~ρ.
In this work, wavelet-basis optimizations were chosen in several areas in order to improve
scaling and performance. These optimizations lie in the representation of the operators L
and O, in the preconditioning of the Laplacian operator, in the sparse representation of
the vectors ~u and ~ρ, and in the application of the boundary conditions ~ubc.
5.1 Prior Work
This thesis is a continuation of an earlier work by Terzic, Pogorelov, and Bohn which
produced a wavelet-based Poisson solver integrated into the accelerator code Impact-T [28].
This earlier solver utilized orthogonal wavelet families, a preconditioned conjugate gradient
algorithm, and a Green’s function solution to the application of boundary conditions. For
grid sizes of around 32³, this solver achieved computation speeds that were comparable
to that of the FFT based Green’s function solver that is native to Impact-T. Many of
the algorithm decisions of the present work were made in response to the lessons learned
through the successes and shortcomings of this earlier solver.
5.2 Representation of Operators, Calculating Operators in Wavelet Space
When trying to solve the Poisson equation in a wavelet basis, the straightforward way
to implement the Laplacian and overlap operators is by simply creating a matrix containing
all of the inner products between every wavelet function included in the expansion. In
1D, this becomes:
L =
[ ⟨φ^0_0|∂_x^2|φ^0_0⟩   ⟨φ^0_0|∂_x^2|ψ^0_0⟩   ⟨φ^0_0|∂_x^2|ψ^1_0⟩   ⟨φ^0_0|∂_x^2|ψ^1_1⟩   ···
  ⟨ψ^0_0|∂_x^2|φ^0_0⟩   ⟨ψ^0_0|∂_x^2|ψ^0_0⟩   ⟨ψ^0_0|∂_x^2|ψ^1_0⟩   ⟨ψ^0_0|∂_x^2|ψ^1_1⟩   ···
  ⟨ψ^1_0|∂_x^2|φ^0_0⟩   ⟨ψ^1_0|∂_x^2|ψ^0_0⟩   ⟨ψ^1_0|∂_x^2|ψ^1_0⟩   ⟨ψ^1_0|∂_x^2|ψ^1_1⟩   ···
  ⟨ψ^1_1|∂_x^2|φ^0_0⟩   ⟨ψ^1_1|∂_x^2|ψ^0_0⟩   ⟨ψ^1_1|∂_x^2|ψ^1_0⟩   ⟨ψ^1_1|∂_x^2|ψ^1_1⟩   ···
  ···                   ···                   ···                   ···                   ··· ] . (5.1)
Because evaluating each of these matrix elements individually can be computationally
prohibitive, a common approach is to first pose the problem in regular space
in terms of the lowest scale S coefficients:

LS =
[ ⟨φ^n_0|∂_x^2|φ^n_0⟩   ⟨φ^n_0|∂_x^2|φ^n_1⟩   ⟨φ^n_0|∂_x^2|φ^n_2⟩   ⟨φ^n_0|∂_x^2|φ^n_3⟩   ···
  ⟨φ^n_1|∂_x^2|φ^n_0⟩   ⟨φ^n_1|∂_x^2|φ^n_1⟩   ⟨φ^n_1|∂_x^2|φ^n_2⟩   ⟨φ^n_1|∂_x^2|φ^n_3⟩   ···
  ⟨φ^n_2|∂_x^2|φ^n_0⟩   ⟨φ^n_2|∂_x^2|φ^n_1⟩   ⟨φ^n_2|∂_x^2|φ^n_2⟩   ⟨φ^n_2|∂_x^2|φ^n_3⟩   ···
  ⟨φ^n_3|∂_x^2|φ^n_0⟩   ⟨φ^n_3|∂_x^2|φ^n_1⟩   ⟨φ^n_3|∂_x^2|φ^n_2⟩   ⟨φ^n_3|∂_x^2|φ^n_3⟩   ···
  ···                   ···                   ···                   ···                   ··· ] , (5.2)
and because of the spatial locality of the scaling functions, this matrix is strongly diago-
nally dominant.
The linear interpolating wavelets, for example, give the familiar three-point finite
difference stencil:

LS =
[ −2   1   0   0  ···
   1  −2   1   0  ···
   0   1  −2   1  ···
   0   0   1  −2  ···
  ···  ···  ···  ···  ··· ] . (5.3)
In fact, finite difference stencils provide reasonably good approximations to LS for almost
any wavelet family.
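As a quick numerical cross-check of this stencil, the sketch below evaluates the matrix elements ⟨φ_i|∂²_x|φ_j⟩ for unit-spaced hat functions via integration by parts. This Galerkin-style route with hat functions is one standard way to arrive at the three-point stencil; it is used here purely as an illustration and does not assume the thesis's own filter tables in Appendix A.

```python
import numpy as np

# Derivative of the unit hat function centered at integer i:
# +1 on (i-1, i), -1 on (i, i+1), zero elsewhere.
def dhat(x, i):
    return (np.where((x > i - 1) & (x < i), 1.0, 0.0)
            - np.where((x > i) & (x < i + 1), 1.0, 0.0))

# <phi_i | d^2/dx^2 | phi_j> = -integral phi_i' phi_j' dx (integration by parts),
# evaluated with a midpoint rule so no sample lands on a breakpoint.
dx = 1e-4
x = np.arange(-3.0, 3.0, dx) + 0.5 * dx
entry = lambda i, j: -np.sum(dhat(x, i) * dhat(x, j)) * dx

assert abs(entry(0, 0) - (-2.0)) < 1e-2   # diagonal of Eqn. (5.3)
assert abs(entry(0, 1) - 1.0) < 1e-2      # nearest-neighbor entry
assert abs(entry(0, 2)) < 1e-2            # the stencil is three-point
```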
Once LS and OS are obtained, we have a linear equation in coordinate space:

LS W⁻¹(~u + ~u_bc) = OS W⁻¹ ~ρ,

where we can identify LS W⁻¹ and OS W⁻¹ as the operators being applied to the wavelet-space
vectors ~u, ~u_bc, and ~ρ. This particular choice of operators is undesirable because
it is not equivalent to L in Eqn. (5.1), nor is it even symmetric. Symmetry of the
operator is required for many iterative matrix inversion techniques including the conjugate
gradient algorithm, so we would like to preserve the symmetry of the operator LS when
transforming it into wavelet space. Using Eqn. (4.37), we can see that

(LS W⁻¹)† = W̃ LS† = W̃ LS.
So, taking the dual forward transform of the above linear system, we obtain a system with
correctly symmetric operators:

W̃ LS W⁻¹(~u + ~u_bc) = W̃ OS W⁻¹ ~ρ. (5.4)

The prescription for obtaining these wavelet-space LW and OW operators is revealed by

LW = W̃ LS W⁻¹ = W̃ LS W̃† = W̃ (W̃ LS)†. (5.5)

In other words, perform the 2D dual wavelet transform, or the 1D dual wavelet transform
along the rows and the columns of the 2D matrix LS. The equivalence of LW with L in Eqn. (5.1)
is readily seen by rewriting the dual forward transform from Eqns. (4.26 - 4.27) as

F : ⟨φ^k_i|f⟩ = ∑_j h_j ⟨φ^{k+1}_{2i+j}|f⟩, (5.6)
    ⟨ψ^k_i|f⟩ = ∑_j g_j ⟨φ^{k+1}_{2i+j}|f⟩. (5.7)

In the fashion of Eqn. (5.5), we can see that applying W̃ down the columns and W̃†
across the rows of LS in Eqn. (5.2) will produce the matrix L in Eqn. (5.1).¹ This is
the technique used by Terzic et al. to generate LW in their wavelet-based Poisson solver.
However, they used exclusively orthogonal wavelet bases, so their overlap operator was
the identity and LW = W LS W⁻¹.² Because of the spatial localization of the wavelet bases
used, the partial derivative operators retained much of their sparsity in the journey to
wavelet space.
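A compact numerical sketch of this row-and-column transformation, using the orthogonal Haar basis (so that W̃ = W and W⁻¹ = Wᵀ, as in the orthogonal case just mentioned) and the stencil of Eqn. (5.3). The Haar construction here is illustrative only; it is not one of the interpolating bases used in this thesis.

```python
import numpy as np

def haar_stage(m):
    """One stage of the orthonormal Haar transform on m points:
    first m//2 rows produce S coefficients, last m//2 rows D coefficients."""
    F = np.zeros((m, m))
    for i in range(m // 2):
        F[i, 2 * i] = F[i, 2 * i + 1] = 1.0 / np.sqrt(2.0)
        F[m // 2 + i, 2 * i] = 1.0 / np.sqrt(2.0)
        F[m // 2 + i, 2 * i + 1] = -1.0 / np.sqrt(2.0)
    return F

def haar_matrix(m):
    """Full multi-level orthonormal Haar transform W (a product of stages)."""
    W = np.eye(m)
    size = m
    while size > 1:
        stage = np.eye(m)
        stage[:size, :size] = haar_stage(size)
        W = stage @ W
        size //= 2
    return W

m = 8
LS = -2.0 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1)   # the stencil of Eqn. (5.3)
W = haar_matrix(m)
LW = W @ LS @ W.T        # 1D transform down the columns and across the rows

assert np.allclose(W @ W.T, np.eye(m))   # W is unitary, so W^-1 = W^T here
assert np.allclose(LW, LW.T)             # the symmetry of LS survives in wavelet space
```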
¹ Note that this needed to be explicitly shown. Comparing Eqns. (2.7) and (5.4) is not quite conclusive enough, because the same comparison would apply also to the non-symmetric operator LS W⁻¹.
² Recall that W̃ = W in an orthogonal wavelet basis.
5.3 Non-Standard Operator
While the form of the operator written as Eqn. (5.1) is shown to have retained much
of its sparsity, it is difficult to take full advantage of due to the non-localized nature of the
matrix density (Figure 18). Also, this standard form of the operator requires storing the
entire matrix (or at least half of it, due to symmetry). Through a reordering of instructions
in the application of the operator, Beylkin describes another form for the operator which
reveals much of the true sparsity of the operation and preserves the simple repetitive nature
of the original LS matrix, enabling faster application and O(1) storage requirements [5].
Figure 18: Standard form wavelet space operator.
The shape of the non-standard form operator is shown in Figure 19. Qualitatively
one can see that the scales are effectively uncoupled in the operator. The coupling enters
into the operation through the redundancy of data in both the input and result vectors.
In order to produce the redundant input vector, the inverse transform (W⁻¹ = W̃†)
is applied to the vector, and all of the intermediate S coefficient data is kept. After
the operator is applied to each scale independently, the resulting redundant data is dual
wavelet transformed ((W⁻¹)† = W̃) and added back into the corresponding scales.
Figure 19: Non-standard form wavelet space operator.
The translation from the standard operator form into this non-standard form can be
seen by examining the contribution of each scale to individual scales higher up in the
wavelet expansion. We will begin with the base scale (level 0). If only the base scale
existed, the operator LW would reduce to simply
LW(0) = W̃ LS(0) W̃† = LS(0),

where LS(0) denotes the LS matrix operator applied to scale n = 0 (replace n with 0 in
Eqn. (5.2)). Similarly, if another single scale existed in addition to the base scale, LW(1)
would become
LW(1) = [∏_{n=1}^{N=1} F_n] LS(1) [∏_{n=N=1}^{1} F_n†] = F_1 LS(1) F_1†,
and its contribution to the base scale will be given by
LW(1) (contribution to base scale) = P†_{0,S} F_1 LS(1) F_1†,
where P†_{0,S} is the projection operator which selects only the rows of the operator LW(1)
which correspond to S coefficients on level 0, the base scale.
Taking the difference of the two expressions above, we can say that the additional
information provided to the base scale by the addition of the new level 1 is
P†_{0,S}(LW(1) − LW(0)) = P†_{0,S} F_1 LS(1) F_1† − P†_{0,S} LS(0), (5.8)

and we can rewrite the contribution of LW(1) to the base scale as

P†_{0,S} LW(1) = P†_{0,S} LS(0) + P†_{0,S} (F_1 LS(1) F_1† − LS(0)).
Adding another scale will produce another contribution to the above sum, comprised
of
LW(2) − LW(1) = F_1 (F_2 LS(2) F_2† − LS(1)) F_1†,

resulting in a base scale contribution of

P†_{0,S} LW(2) = P†_{0,S} LS(0) + P†_{0,S} (F_1 LS(1) F_1† − LS(0))
             + P†_{0,S} F_1 (F_2 LS(2) F_2† − LS(1)) F_1†,
and for a general LW(N) we can write the contribution to the topmost scale as
P†_{0,S} LW(N) = P†_{0,S} LS(0)
             + P†_{0,S} ∑_{n=1}^{N} [∏_{j=1}^{n−1} F_j] (F_n LS(n) F_n† − LS(n−1)) [∏_{j=n−1}^{1} F_j†]. (5.9)
So the base scale operator is split into portions which operate on individual scale levels
and are then coupled via the wavelet transform back up to the base scale. For the result
applied to lower scales, simply switch the projection operator to P†_{k,D}, and note that P†_{k,D}
applied to any operator which produces information at a scale above the D coefficients
of level k will result in zero. For example, P†_{k,D} LS(0) = 0. This projection operator
replacement results in the operator contribution to any level k as

P†_{k,D} LW(N) = P†_{k,D} F_k LS(k) F_k†
             + P†_{k,D} ∑_{n=k+1}^{N} [∏_{j=k}^{n−1} F_j] (F_n LS(n) F_n† − LS(n−1)) [∏_{j=n−1}^{k} F_j†]. (5.10)
Again, the operator consists of applying an operator at the current scale and then adding
in contributions from lower scales which were also computed independently.
Applying only a single-stage forward transform operation to LS(n) from Eqn. (5.2),
we can see that rather than the full wavelet operator as in Eqn. (5.1), we obtain simply:

F_n LS(n) F_n† = [ ⟨φ^{n−1}_i|∂_x^2|φ^{n−1}_j⟩   ⟨φ^n_i|∂_x^2|ψ^n_j⟩
                   ⟨ψ^n_i|∂_x^2|φ^n_j⟩           ⟨ψ^n_i|∂_x^2|ψ^n_j⟩ ]
               = [ LS(n−1)   LSD(n)
                   LDS(n)    LDD(n) ] . (5.11)

So the (F_n LS(n) F_n† − LS(n−1)) term from Eqn. (5.10) is simply the LSD(n), LDS(n), and
LDD(n) parts of the F_n LS(n) F_n† filter because the LS(n−1) term is subtracted out.
With this knowledge, we can rewrite Eqn. (5.10) in a more illuminating form as³

P†_{k,D} LW(N) = [ 0         0
                   LDS(k)    LDD(k) ]
             + P†_{k,D} ∑_{n=k+1}^{N} [∏_{j=k}^{n−1} F_j] [ 0   LSD(n)
                                                            0   0      ] [∏_{j=n−1}^{k} F_j†]. (5.12)
This form shows the application of the LDS(n) and LDD(n) operations to each scale,
followed by a wavelet transformation of the LSD(n) terms, coupling the lower scales with the
higher ones, as is seen in Figure 19. Note that the P†_{k,D} term is responsible for eliminating
the wavelet-space terms LDS(n) and LDD(n) in the wavelet-transformed portion, leaving
only the LSD(n) term.
From the translation invariance of the wavelet basis, it is easily seen that the LSD(n),
LDS(n), and LDD(n) operators are diagonally dominant, sparse, and invariant to
translation.⁴ For example, LDD(n) is invariant to translation because it depends only on the
distance between wavelet functions:
LDD(n) = ⟨ψ^n_i|∂_x^2|ψ^n_j⟩ = ⟨ψ^n_0|∂_x^2|ψ^n_{j−i}⟩ = DD_{j−i}.
Since the wavelet functions have a finite spatial range, the filter DDj−i decays to zero
for large |j − i|, which causes LDD(n) to be diagonally dominant and sparse. So the
non-standard form of the wavelet space operator reduces to the wavelet transform and
inverse plus the application of three simple filters SD, DS, and DD, all of which are
³ Recall that P†_{k,D} LS(k−1) = 0.
⁴ With periodic or zero boundary conditions, of course.
short one-dimensional filters. The matrix for the overlap operator follows the same
prescription, with the substitution of 1 for ∂_x^2, resulting in a different set of SD, DS, and
DD filters. Appendix A describes in detail the process for calculating these filters for
both the Laplacian and overlap matrices, and contains tabulated coefficients for a few
of the interpolating wavelet bases used in this thesis. An important fact to note is that
since all of the operations performed in the non-standard form of the operator are simple
filter applications, the complexity of the algorithm in terms of the number of operations
required to perform the matrix-vector multiplication is of the order of O(N) for large N ,
where N is the total length of the vector.
The non-standard operator could be implemented as three separate operations: an inverse
wavelet transform to a redundant state; application of the SD, DS, and DD filters to each
scale in parallel; and a forward dual wavelet transform, summing the redundant S coefficients
back into the wavelet coefficients. However, to reduce the number of duplicate copies of
the data structure and to simplify a parallel implementation, it may be advantageous to
combine all three steps into a single operation. To assist in implementation, C++ style
pseudo code illustrating the algorithm for the full non-standard operator is contained in
Figures 20 and 21. These algorithms also account for the fact that the grid may be an
adaptive mesh, which will be discussed further in Section 5.7.
5.4 3D Operators
Generalizing the operator strategy into a fully three-dimensional algorithm is fairly
straightforward when working with wavelet bases defined as tensor products. However,
there are a few non-obvious nuances which merit a careful explanation of the derivation.
The “naive” extension of L3D = Lx +Ly +Lz is incorrect for general biorthogonal wavelet
bases, as will be shown shortly.
topScale::NS_operator(topScaleData) {
    // (Note: The top scale starts out as only S coefficients,
    // so no inverse transform is needed.)
    childRegion = area that overlaps with child region;
    dataToChild = topScaleData[childRegion];
    dataFromChild = a storage space of the size of childRegion
                    for holding result from child scale;

    // Pass S-coefficient data to child and get the result
    // from children:
    mySubScale->SubScale::NS_operator( mySubScale->subScaleData,
                                       dataToChild,
                                       dataFromChild );

    // Perform the SS filter on this scale:
    // Note that the SS_filter cannot be done in-place.
    tempData = SS_filter(topScaleData);
    topScaleData = tempData;

    // Add the result from the child scales back into
    // this current data:
    topScaleData[childRegion] += dataFromChild;
}
Figure 20: Pseudo code for non-standard form operator – top scale.
subScale::NS_operator(subScaleData, dataFromParent, dataToParent) {
    // (Note: The redundant S coefficient data from the parent
    // scale is passed in dataFromParent.)
    tempData = DS_and_DD_filters(subScaleData, dataFromParent);
    // (Note: This does *not* depend on dataFromParent.)
    dataToParent = SD_filter(subScaleData);

    if( child exists ) {
        childRegion = area that overlaps with child region;
        childBorder = extra border area for child BCs;
        dataToChild = subScaleData[childRegion + childBorder];
        dataFromChild = a storage space of the size of
                        childRegion + childBorder for holding
                        result from child scale;

        // Note that the boundary conditions on this transform
        // are the values given in dataFromParent.
        dataToChild = wave_transform_oneScale_inverse( subScaleData,
                                                       dataFromParent,
                                                       childRegion );

        // Pass S-coefficient data to the child and get the result
        // returned from children:
        mySubScale->SubScale::NS_operator( mySubScale->subScaleData,
                                           dataToChild,
                                           dataFromChild );

        // Note that the boundary conditions for this transform are
        // zero since it is an SD filter result from the child, and
        // the SD filter is diagonally dominant so that the data
        // *is* zero outside of the childRegion.
        dataFromChild = wave_dual_transform_oneScale( dataFromChild,
                                                      childRegion );

        // Add the correctly transformed data into this current
        // scale *and* the data to be passed to our parent:
        tempData += D_coefficients(dataFromChild);
        dataToParent += S_coefficients(dataFromChild);
    }
}
Figure 21: Pseudo code for non-standard form operator – child scales.
Using the multidimensional notation for wavelet functions described in Section 4.4, a
matrix element of the 3D Laplacian takes the form:
L^{k,l}_{ix,iy,iz;jx,jy,jz} = ⟨ξ^k_{ix,iy,iz}|∂_x^2 + ∂_y^2 + ∂_z^2|ξ^l_{jx,jy,jz}⟩, (5.13)
where ∇2 is expanded as a sum of second partial derivatives since the basis is a set of
Cartesian tensor products. Examining a single term of this expansion, we can see that
⟨ξ^k_{ix,iy,iz}|∂_x^2|ξ^l_{jx,jy,jz}⟩ = ⟨ξ^k_{ix}|∂_x^2|ξ^l_{jx}⟩ ⟨ξ^k_{iy,iz}|1|ξ^l_{jy,jz}⟩
                                     = ⟨ξ^k_{ix}|∂_x^2|ξ^l_{jx}⟩ ⟨ξ^k_{iy}|1|ξ^l_{jy}⟩ ⟨ξ^k_{iz}|1|ξ^l_{jz}⟩,
which is simply the tensor product of the 1D Laplacian and overlap matrix elements. This
observation, combined with Eqn. (5.13) reveals that the correct form of the 3D Laplacian
operator is given by
L3D = Lx ⊗ Oy ⊗ Oz + Ox ⊗ Ly ⊗ Oz + Ox ⊗ Oy ⊗ Lz. (5.14)
Notice that in an orthonormal wavelet basis, we have Ox = Oy = Oz = 1, and so this
reduces to the “naive” operator mentioned earlier. However, in general, all operators have
this form involving the 1D overlap matrices. For the 3D overlap matrix elements:
O^{k,l}_{ix,iy,iz;jx,jy,jz} = ⟨ξ^k_{ix,iy,iz}|1|ξ^l_{jx,jy,jz}⟩, (5.15)
a similar calculation reveals that
O3D = Ox ⊗Oy ⊗Oz. (5.16)
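A small numerical check of Eqn. (5.14), built with explicit tensor (Kronecker) products. For simplicity this sketch takes the overlap matrices to be the identity, as in an orthonormal basis; that choice is an assumption of the example, not a property of the biorthogonal bases used in the thesis.

```python
import numpy as np

m = 4
L1 = -2.0 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1)   # 1D Laplacian stencil
O1 = np.eye(m)                                             # overlap = identity (orthonormal case)

# Eqn. (5.14): L3D = Lx (x) Oy (x) Oz + Ox (x) Ly (x) Oz + Ox (x) Oy (x) Lz
L3 = (np.kron(np.kron(L1, O1), O1)
      + np.kron(np.kron(O1, L1), O1)
      + np.kron(np.kron(O1, O1), L1))

# On a separable vector u (x) v (x) w, each term differentiates exactly one factor.
rng = np.random.default_rng(0)
u, v, w = rng.standard_normal(m), rng.standard_normal(m), rng.standard_normal(m)
uvw = np.kron(np.kron(u, v), w)
expected = (np.kron(np.kron(L1 @ u, v), w)
            + np.kron(np.kron(u, L1 @ v), w)
            + np.kron(np.kron(u, v), L1 @ w))
assert np.allclose(L3 @ uvw, expected)
assert np.allclose(L3, L3.T)   # symmetric, as required for conjugate gradient
```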
5.5 Implementation of 3D Operators in Non-Standard Form
The implementation of these 3D operators in the non-standard form can proceed in
three ways. First, the 1D operators may be applied in succession for operator products,
and the results summed together as needed for summed operations. This approach is
prohibitive because the 3D Laplacian operator would require nine applications of a
non-standard form operation, resulting in nine iterations through the entire data structure –
which is a costly memory-bound operation.
5.5.1 Generalizing the Non-Standard Form in 2D
The other two methods for generalizing the multidimensional non-standard form
involve directly generalizing Eqn. (5.12) for the multidimensional operators defined in the
previous section. Beginning with the simpler Eqn. (5.8) involving only a base level 0 and
an additional scale, we can derive a similar result for a 2D overlap matrix:
ΔO_{W(1)2D} = P†_{0,S2D} (O_{W(1)2D} − O_{W(0)2D})
            = P†_{0,S2D} F_{1,2D} O_{S(1)2D} F†_{1,2D} − P†_{0,S2D} O_{S(0)2D},

and expanding the 2D operators in terms of tensor products of 1D operators, we have:

ΔO_{W(1)2D} = (P†_{0,Sx} ⊗ P†_{0,Sy})(F_{1x} ⊗ F_{1y})(O_{Sx(1)} ⊗ O_{Sy(1)})(F†_{1x} ⊗ F†_{1y})
            − (P†_{0,Sx} ⊗ P†_{0,Sy})(O_{Sx(0)} ⊗ O_{Sy(0)}), (5.17)
Rearranging commuting terms and then expanding the matrix elements in the form of
Eqn. (5.11) gives:

ΔO_{W(1)2D} = (P†_{0,Sx} ⊗ P†_{0,Sy}) (F_{1x} O_{Sx(1)} F†_{1x}) ⊗ (F_{1y} O_{Sy(1)} F†_{1y})
            − (P†_{0,Sx} ⊗ P†_{0,Sy})(O_{Sx(0)} ⊗ O_{Sy(0)})

            = (P†_{0,Sx} ⊗ P†_{0,Sy}) [ O_{Sx(0)}    O_{SDx(1)}
                                         O_{DSx(1)}   O_{DDx(1)} ] ⊗ [ O_{Sy(0)}    O_{SDy(1)}
                                                                       O_{DSy(1)}   O_{DDy(1)} ]
            − (P†_{0,Sx} ⊗ P†_{0,Sy})(O_{Sx(0)} ⊗ O_{Sy(0)}).
Applying the projection operators gives:
ΔO_{W(1)2D} = [O_{Sx(0)}  O_{SDx(1)}] ⊗ [O_{Sy(0)}  O_{SDy(1)}] − [O_{Sx(0)}  0] ⊗ [O_{Sy(0)}  0], (5.18)

and expanding the tensor products results in

ΔO_{W(1)2D} = [O_{Sx(0)} ⊗ O_{Sy(0)}   O_{Sx(0)} ⊗ O_{SDy(1)}   O_{SDx(1)} ⊗ O_{Sy(0)}   O_{SDx(1)} ⊗ O_{SDy(1)}]
            − [O_{Sx(0)} ⊗ O_{Sy(0)}   0   0   0]
            = [0   O_{Sx(0)} ⊗ O_{SDy(1)}   O_{SDx(1)} ⊗ O_{Sy(0)}   O_{SDx(1)} ⊗ O_{SDy(1)}]. (5.19)
So in the 2D and 3D cases of the non-standard operator form, extra cross terms involving
O_{Sx(0)} and O_{Sy(0)} arise which are not cancelled by the O_{Sx(0)} ⊗ O_{Sy(0)} term at the next
higher scale. This means that one cannot simply ignore the O_{Sx(0)} and O_{Sy(0)} filters in
child scales, as was the case in 1D (in Eqn. (5.12), for example).
5.5.2 Method Two
The second method for implementing a non-standard operator in 2D or 3D requires
applying the 1D operators exactly as Eqn. (5.18) prescribes – applying the full 1D filters
in each direction, and then explicitly subtracting out the O_{Sx(0)} ⊗ O_{Sy(0)} term.⁵ This
method may be used to reduce the Laplacian operator to only a single non-standard
operator application, but it performs redundant work by computing the O_{Sx(0)} ⊗ O_{Sy(0)}
term twice and cancelling it.
5.5.3 Method Three
The third method of implementing the non-standard operator in multiple dimensions
begins by recognizing that the final matrix in Eqn. (5.19) is the 2D equivalent of the OS
and OSD operators. By analogy from the 1D case of Eqn. (5.11), we can define:

F_{n2D} O_{S(n)2D} F†_{n2D} = [ O_{S(n−1)2D}   O_{SD(n)2D}
                                O_{DS(n)2D}    O_{DD(n)2D} ] . (5.20)
⁵ If implementing this form of the 2D or 3D non-standard form operator with a multiresolution grid, take note that the O_{Sx(0)} ⊗ O_{Sy(0)} term needs to be evaluated on the child scale rather than the parent scale in order for it to cancel perfectly with the same term resulting from P†_{0,S2D} F_{1,2D} O_{S(1)2D} F†_{1,2D}.
Using the multidimensional projection operators:
P_{n,S2D} = P_{n,Sx} ⊗ P_{n,Sy}, (5.21)

and

P_{n,D2D} = P_{n+1,S2D} − P_{n,S2D}
          = P_{n,Sx} ⊗ P_{n,Dy} + P_{n,Dx} ⊗ P_{n,Sy} + P_{n,Dx} ⊗ P_{n,Dy}, (5.22)

we can express the terms in Eqn. (5.20) as

O_{S(n−1)2D} = P†_{n,S2D} F_{n2D} O_{S(n)2D} F†_{n2D} P_{n,S2D}, (5.23)
O_{SD(n)2D} = P†_{n,S2D} F_{n2D} O_{S(n)2D} F†_{n2D} P_{n,D2D}, (5.24)
O_{DS(n)2D} = P†_{n,D2D} F_{n2D} O_{S(n)2D} F†_{n2D} P_{n,S2D}, (5.25)
O_{DD(n)2D} = P†_{n,D2D} F_{n2D} O_{S(n)2D} F†_{n2D} P_{n,D2D}. (5.26)
The explicit tensor-product form of these operators can be found by simply expanding
O_{S(n)2D}, F_{n2D}, and the projection operators in their tensor-product forms. In terms of
these operators, it is simple to show that the 2D analogy of Eqn. (5.12) becomes

P†_{k,D2D} O_{W(N)2D} = [ 0              0
                          O_{DS(k)2D}    O_{DD(k)2D} ]
                     + P†_{k,D2D} ∑_{n=k+1}^{N} [∏_{j=k}^{n−1} F_{j2D}] [ 0   O_{SD(n)2D}
                                                                          0   0           ] [∏_{j=n−1}^{k} F†_{j2D}]. (5.27)
Using these forms of the 2D or 3D operators, the algorithm for the application of the
non-standard form remains unchanged. The 2D operators in Eqns. (5.23–5.26) can be
implemented either as tensor products of 1D operators applied over only the specified
regions or explicitly expanded as 2D filters and applied as a whole. The latter is preferable
because the projections of wavelet and scaling function regions can be included
into the 2D filter, simplifying the implementation.
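The equivalence between these two implementation routes rests on the standard tensor-product identity (A ⊗ B) vec(X) = vec(A X Bᵀ) for row-major flattening; a quick numerical sketch with generic stand-in matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4
A = rng.standard_normal((m, m))   # stands in for a 1D operator along x
B = rng.standard_normal((m, m))   # stands in for a 1D operator along y
X = rng.standard_normal((m, m))   # a 2D grid of coefficients

# Route 1: the explicitly expanded 2D operator applied to the flattened grid.
via_2d = (np.kron(A, B) @ X.reshape(-1)).reshape(m, m)

# Route 2: 1D operators applied along each direction in turn.
via_1d = A @ X @ B.T

assert np.allclose(via_2d, via_1d)
```

In practice the 2D-filter route additionally bakes the S/D region projections of Eqns. (5.23–5.26) into the expanded filter, which is what simplifies the implementation.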
5.6 Preconditioning and Temporal Coherence
Solving the Poisson equation (2.7) for ~u involves inverting the Laplacian operator
matrix applied to the right-hand side of the equation – i.e.:
~u = L⁻¹(O~ρ − L~u_bc).
This is a common linear algebra equation and can be solved via a number of known direct or
iterative algorithms. An iterative method is chosen for this current work for a few reasons.
First, iterative matrix inversion techniques typically only require knowing the action of
the operator matrix on a vector. Direct inversion methods often require knowledge of the
entire matrix in order to solve for the inverse operator. For this reason, a direct method
will be very inefficient for these wavelet space operators, since the operators are both very
large and very sparse. As described in Section 5.3, applying the Laplacian operator in
non-standard form requires storage of only a few small filters rather than an entire large
multidimensional matrix.
5.6.1 Temporal Coherence
A second reason for electing to use an iterative method stems from the application of
this algorithm to PIC-based particle dynamics simulations. In such a simulation code, the
Poisson equation solver is called at each simulation time-step to produce the space-charge
forces to be applied on the charge distribution. For such a simulation to be reliable, this
time-step size must be short relative to the time scales of the motions of the individual
particles. This means that the charge distribution and the resultant potential change
slowly with respect to time, and there is a temporal coherence between time-steps. An
iterative matrix inversion scheme allows the Poisson solver to take advantage of this
temporal coherence by using the previous time-step's potential as a "smart" initial guess for
the current time step. As illustrated in Figure 22, Terzic et al. showed that this good
initial guess by itself can reduce the number of iterations for convergence to around 10
per time step for much of the simulation [28].
5.6.2 Conjugate Gradient Algorithm and Preconditioning
The conjugate gradient algorithm is the iterative method of choice for this Poisson
equation solver due to its simple implementation and fast convergence. It requires only
a single application of the Laplacian matrix per iteration, is easy to parallelize if the
matrix-vector product can be performed efficiently in parallel, and converges rapidly to
an acceptable solution. The convergence rate of the algorithm is strongly dependent on
the initial guess (as is shown in Figure 22) and on the condition number κ of the Laplacian
matrix [24], where
κ = λ_max / λ_min = (maximum eigenvalue of L) / (minimum eigenvalue of L), (5.28)
and the error per iteration decreases faster than
||e(i)|| / ||e(0)|| ≤ 2 ( (√κ − 1) / (√κ + 1) )^i, (5.29)
Figure 22: Results of preconditioning and temporal coherence. Figure taken from Terzic et al. [28]. This figure compares the number of iterations required for convergence of each Poisson equation solution in a simulation run in Impact-T using the Poisson solver of [28]. The top line (green) is the non-preconditioned conjugate gradient algorithm with an initial guess of zero for the potential. The second line (blue) is the preconditioned conjugate gradient algorithm with an initial guess of zero for the potential. The third line (red) is the non-preconditioned conjugate gradient algorithm with a "smart" initial guess for the potential. The bottom line (black) is the preconditioned conjugate gradient algorithm with a "smart" initial guess for the potential.
where i is the iteration number.
Though the non-standard form of the operator is O(N) in its complexity, the overall
algorithm complexity is also dependent on the number of conjugate gradient iterations
required to obtain a satisfactory solution. Terzic et al. state that for a standard 3-point
finite difference stencil, κ ∝ O(M²), where M is the grid length in each dimension [28].⁶
Because this effectively causes κ to depend on N, the overall algorithm complexity can no
longer be O(N). Rather, in situations where the initial guess is poor, the order becomes
O(N · C(κ(N))), where C(κ) is the number of iterations of the conjugate gradient algorithm
required for acceptable convergence. C(κ) can be estimated by solving Eqn. (5.29) for the
number of iterations i with respect to a chosen convergence requirement of ε = ||e(i)|| / ||e(0)||.
In this manner, Shewchuk gives the following upper bound on the number of iterations
required for convergence [24]:

C(κ) = i ≤ ⌈ (1/2) √κ ln(2/ε) ⌉. (5.30)
The goal, then, is to reduce C(κ) ∝√κ.
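Both scalings quoted here – κ ∝ O(M²) for the plain stencil and the √κ growth of the bound in Eqn. (5.30) – are easy to check numerically. A sketch (the model matrix is the three-point stencil with zero boundary conditions, as an assumption of the example):

```python
import math
import numpy as np

def stencil_condition_number(m):
    """Condition number of the (negated) three-point stencil on m grid points."""
    A = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    ev = np.linalg.eigvalsh(A)
    return ev[-1] / ev[0]

def cg_iteration_bound(kappa, eps):
    """Upper bound on conjugate gradient iterations, Eqn. (5.30)."""
    return math.ceil(0.5 * math.sqrt(kappa) * math.log(2.0 / eps))

# Doubling the grid length roughly quadruples kappa: kappa ~ O(M^2)...
ratio = stencil_condition_number(64) / stencil_condition_number(32)
assert 3.5 < ratio < 4.5

# ...so the iteration bound, which grows like sqrt(kappa), roughly doubles.
assert cg_iteration_bound(400.0, 1e-6) <= 2 * cg_iteration_bound(100.0, 1e-6) + 1
```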
The typical method of reducing the condition number of a matrix involves choosing
a preconditioner matrix M which approximates the Laplacian operator in Eqn. (2.7) and
solving instead the preconditioned form:
M⁻¹ L ~u = M⁻¹(O~ρ − L~u_bc). (5.31)
Factoring M as M = PPT, which is possible for any symmetric and positive-definite M,
we have the symmetric form:
P⁻¹ L P⁻ᵀ Pᵀ ~u = P⁻¹(O~ρ − L~u_bc). (5.32)
⁶ The 3D grid is assumed to be cubic of size M × M × M, so N = M³.
M is selected in the hope that the product M⁻¹L = P⁻¹LP⁻ᵀ will have a lower condition
number than L, so that the conjugate gradient algorithm will converge more rapidly. A
significant reduction in condition number was obtained by [28] with the diagonal
preconditioner matrix:

(M⁻¹)^{k,l}_{i,j} = 2⁻ᵏ δ_{k,l} δ_{i,j}, (5.33)

which simply multiplies each vector element by 2⁻ᵏ, where k is the element's level number.
This preconditioner reduces the condition number function from κ ∝ O(M²) to
κ ∝ O(M), making the convergence rate proportional to C ∝ √M. Applying this
preconditioner to the present non-standard operator formalism results in an overall
algorithm worst case complexity of O(N·√M) or O(N^{7/6}) in the 3D case, and
O(N^{1+1/(2d)}) for d dimensions.
So the worst case algorithm complexity is nearly linear with respect to the number
of basis functions, but as Figure 22 shows, temporal coherence ensures that the typical
case algorithm complexity will be simply O(N) because the initial guess will be accurate
enough in the average case to require only a few conjugate gradient iterations to obtain
satisfactory convergence. Terzic et al. report that over the entire 30,000 time step
simulation run which produced the data for Figure 22, the "smart" guess plus preconditioner
approach required an average of only 2.4 iterations per time step [28]. This adaptivity in
time is one of the significant strengths of an iterative wavelet-based approach applied to
PIC simulations – less work is required to solve the Poisson equation during portions of
the simulation which have very slow changes with respect to time. This is an advantage
over the Green’s function plus FFT approach, which requires that the full solution method
be repeated for each time step, even if very little of the charge density has changed.
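The interplay of preconditioning and temporal coherence can be sketched with a small preconditioned conjugate gradient solver. This is an illustrative stand-in (diagonal Jacobi-style preconditioner, tridiagonal model problem), not the wavelet-space solver described in this chapter:

```python
import numpy as np

def pcg(A, b, minv, x0=None, tol=1e-10, maxiter=500):
    """Preconditioned conjugate gradient for SPD A; minv is the diagonal of M^-1.
    Returns the solution and the number of iterations used."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    z = minv * r
    p = z.copy()
    rz = r @ z
    for i in range(maxiter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, i + 1
        z = minv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter

m = 64
A = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)   # SPD model problem
b = np.random.default_rng(0).standard_normal(m)

x, iters = pcg(A, b, minv=1.0 / np.diag(A))
assert np.allclose(A @ x, b, atol=1e-6)

# "Temporal coherence": restarting from the previous solution converges
# almost immediately, mimicking a slowly changing charge density.
_, warm_iters = pcg(A, b, minv=1.0 / np.diag(A), x0=x)
assert warm_iters < iters
```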
5.6.3 Inherent Preconditioning of Biorthogonal Bases
As mentioned in Section 4.3, the normalization difference between the orthonormal
wavelets used in [28] and the biorthogonal bases used for this thesis has significant
ramifications in applying preconditioners for the conjugate gradient algorithm. Because the
orthonormal wavelet transform is unitary, it preserves the eigenvalues of the Laplacian
operator, and the non-preconditioned wavelet-space operator has the same inherent
condition number κ ∝ O(M²) as the standard finite-difference stencil in real space. After the
preconditioner in Eqn. (5.33) is applied, the condition number is reduced to κ ∝ O(M).
Because the biorthogonal wavelet transform is not a unitary transformation, it does not
preserve the eigenvalues of the Laplacian operator, and it can effectively provide a
similar preconditioner to that employed in [28] – simple scaling factors between scales. Also,
because the normalization factors between scales are arbitrary in the biorthogonal basis
(subject to the biorthonormality conditions of Eqns. 4.13-4.16), any scaling factor could be
chosen – providing an opportunity to construct biorthogonal bases with inherent
preconditioning equivalent to that used in [28]. Lippert, Arias, and Edelman noted the fact that
a biorthogonal basis appeared to be providing just this sort of inherent preconditioning,
but did not provide a theory for why the default preconditioning was helpful rather than
detrimental [17]. From my own examination of the typical normalization choice for
biorthogonal bases, I discovered that the change in normalization between an orthonormal
basis and a biorthogonal one in 1D has exactly the same effect on the linear algebraic
Equation (2.7) as does the preconditioning matrix of Eqn. (5.33).⁷ Therefore, the default
preconditioning inherent to any biorthogonal wavelet basis in 1D is equivalent to the
preconditioning scheme used in [28], with no additional computational work required to apply
⁷ See Appendix C for the derivation of this equivalence.
the “preconditioner”, and the preconditioning has a similar form in 3D with different scale
factors.
5.6.4 Alternate Preconditioning Schemes
The selection of a preconditioning scheme for use in solving the Poisson equation in a
wavelet basis is a very open-ended choice. The preconditioning scheme used in [28] was
used in this current work, but several other schemes are worthy of future consideration
as well. Goedecker and Arias both claim preconditioning schemes which provide a
condition number that is invariant to additional layers in the wavelet basis under certain
conditions – effectively claiming that κ ∝ O(1), resulting in an O(N) overall algorithm
complexity [1, 11, 17]. This is an area which requires further research to determine if these
preconditioning schemes are appropriate for a solver designed for PIC simulation, as well
as to ascertain under which conditions these schemes do in fact yield an O(N) algorithm.
5.7 Multiresolution Grids
In order to implement an efficient discrete-grid solver in any basis, one must be able
to truncate the basis in some way in order to avoid an infinite basis set. The error
resulting from this truncation should be slight, and the resulting approximation needs to
still accurately approximate the full basis set solution. This will occur if the basis functions
utilized in the expansion are fairly well localized in whatever space is being truncated,
and if the expanded function was already sparse in this region. As an example, a function
expanded in an infinite sum of Dirac δ functions can be truncated in any regions where
the function is zero, with no effect on the accuracy of the expansion. Also, most functions
expanded in frequency space will have some frequency localization, and the frequency expansion may be truncated above some maximum frequency value, with little change to the accuracy of the function representation. This truncation is typically accomplished by a reduction in the density of grid points used to discretely sample the function. Truncation
in frequency is particularly useful when considering the Poisson equation because the
Laplacian operator is a sharpening operator – it strengthens high-frequency components
of the potential to yield a charge density with stronger high-frequency contributions. This
means that when inverting the Laplacian operator to solve the Poisson equation, if the
charge density can be represented accurately with some maximum frequency, then the
potential will not contain contributions from frequencies higher than this maximum.⁸
A Fourier expansion, then, can be truncated in frequency space with very little ap-
proximation error because both the operator and the basis set are localized in frequency.
However, the expansion requires that the grid be uniformly dense because the basis func-
tions are not at all localized in space. A wavelet basis set, being localized in both frequency
and space, can be accurately truncated in both spaces. A look at the form of the Laplacian
operator in wavelet space (Figure 19) shows that at each scale level, the Laplacian op-
erator is also localized in space. So the wavelet basis can be truncated in frequency by
reduction of the grid resolution, and this grid resolution does not have to be uniform
over the computational domain, due to the spatial localization of the wavelet basis and
Laplacian operator. These properties can be utilized to construct an adaptive-resolution
grid structure for solving the Poisson equation – the grid resolution can be made high
where there are sharp transitions in the charge density and made coarse in regions where the charge density is zero or smoothly varying.
⁸ This can be proven more exactly by noting that the Fourier transform basis functions are eigenfunctions of the Laplacian operator. The Laplacian operator can only scale the vector in this basis, so if any frequency component is zero in the charge density, it will be zero in the potential and vice versa.
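This eigenfunction argument can be checked numerically. In the sketch below (illustrative, on a periodic 1D grid with NumPy's FFT), a band-limited charge density is inverted mode by mode, and the resulting potential is confirmed to contain no frequencies beyond those present in the density:

```python
import numpy as np

N = 64
x = np.arange(N) / N
# A band-limited "charge density": only modes 3 and 5 are present.
rho = np.sin(2 * np.pi * 3 * x) + 0.5 * np.cos(2 * np.pi * 5 * x)

k = 2 * np.pi * np.fft.fftfreq(N, d=1.0 / N)   # angular wavenumbers
rho_hat = np.fft.fft(rho)
U_hat = np.zeros_like(rho_hat)
nz = k != 0
U_hat[nz] = -rho_hat[nz] / k[nz] ** 2          # invert d2U/dx2 = rho mode by mode

# Modes absent from rho stay absent from U: the Fourier basis functions
# are eigenfunctions of the Laplacian, which only rescales each component.
high = np.abs(k) > 2 * np.pi * 5
print(np.max(np.abs(U_hat[high])))             # zero up to round-off
```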
5.7.1 Multiresolution Grid to Improve Boundary Conditions
A multiresolution grid may be used to create boundary conditions that more closely
resemble the ideal “open” boundary in PIC codes. Figure 23a shows how the base com-
putational grid containing the beam bunch is contained in a small central region of the
beam pipe. Applying homogeneous boundary conditions (U = 0) at the boundaries of
this small computational grid will result in strong image charges which will significantly
distort the resulting potential. An improvement in the accuracy of the potential is gained
by expanding the computational region to some larger volume at least the size of the beam
pipe, as in Figure 23b. Applying the homogeneous boundary conditions on this larger grid
will significantly reduce the unwanted effects of image charges. However, the number of points added by extending a uniformly dense grid in this way is prohibitive. The wavelet multiresolution grid makes the extension practical by reducing the number of extra points required to extend the boundary of the computational domain (see Figure 23c). Because
there is no charge density in the intermediate region, the grid is made as sparse as possible
and the number of grid points needed is significantly less than a dense grid over the same
region.
Figure 23: Multiresolution grid used for boundary conditions. (a) Base grid with beam pipe. (b) Dense grid to boundary. (c) Adaptive grid to boundary.
The variation in the grid density is restricted by the LSD and LDS terms of the
operators, as valid S coefficient data is required in order to calculate the effect on the
current grid. Therefore each layer needs a boundary region of the next coarsest resolution
as wide as the widest filter that projects from the S to D or D to S spaces. Figure 25, in
Chapter 6, illustrates an actual multiresolution grid with varying numbers of additional
coarse boundary regions.
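The savings from the coarsening boundary layers can be estimated with a simple counting model. This is an illustrative assumption, not the solver's exact layout: each layer is taken to double the radial extent while adding at most one fine-grid's worth of points, ignoring the interleaved S data and the filter-width margins described above.

```python
# Rough point-count model: the fine grid has n^3 points over a unit
# extent; each boundary layer doubles the extent at half the previous
# resolution, adding at most another n^3 coarse points.
n = 32

def dense_points(extent):
    # uniform fine-resolution grid extended all the way to the boundary
    return (n * extent) ** 3

def adaptive_points(layers):
    extent = 2 ** layers
    return (layers + 1) * n ** 3, extent

for layers in (1, 2, 3):
    pts, extent = adaptive_points(layers)
    print(extent, dense_points(extent) // pts)
```

Under this model, three boundary layers reach a radial extent of 8 with roughly two orders of magnitude fewer points than the equivalent dense grid.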
5.7.2 Interpolation
The reduction of the wavelet basis applies only to the function as it is already expanded
in wavelet space – i.e., only wavelet-type “D” coefficients may be truncated because they
will be nearly zero in regions where the input charge density is smooth. During the initial
wavelet transformation, the function is still in regular space, and the wavelet transform
may require fine data points outside of the region where they are available. In this case,
an interpolation scheme may be used to evaluate the function at these points and allow
the wavelet transformation to proceed correctly. See [11] for details on how this wavelet
interpolation is accomplished. This interpolation scheme is not needed in the current
work because the charge density is entirely contained in the finest scale, and the charge
distribution is exactly zero at all the boundary regions where this interpolation would be
needed.
5.8 Monopole Approximation for Boundary Conditions
An additional improvement in approximating “open” boundary conditions is achieved
by applying a non-homogeneous Dirichlet boundary condition containing the monopole
term of the given charge density. The required monopole information of the total charge
and mean charge center is easily obtained from the charge distribution and is continually
tracked by many PIC codes. With this information, we can give an approximation of the
potential at the computational boundary by using the first term of a multipole expansion.
Generalizing the approach taken in [28], arbitrary non-homogeneous boundary conditions
may be applied by separating the full Laplacian operator into sections involving the regions both internal and external to the base computational grid:

L ~u = (P†_in + P†_out) L (P_in + P_out) ~u.

Since we are only interested in the result internal to the computational grid, only two terms remain:

P†_in L ~u = (P†_in L P_in + P†_in L P_out) ~u
           = P†_in L ~u_in + P†_in L ~u_bc,          (5.34)

where ~u_in = P_in ~u is the potential inside the computational grid, and ~u_bc = P_out ~u is the potential due to the boundary conditions.
Because ~ubc is a given function, it can be moved to the right-hand side of the Poisson
equation (Eqn. 2.7), yielding:
L ~u = O ~ρ − L ~u_bc,          (5.35)
where the P†_in term on the left of each term is implied rather than explicitly shown. Computing this L ~u_bc term then simply requires applying the Laplacian operator to a grid which is zero inside the computational domain, using any desired non-homogeneous boundary conditions when applying the operator. The resulting effect on the computational domain is then subtracted from the charge density term, resulting in a "fictitious charge distribution" which applies the correct boundary conditions. An important optimization afforded by the multiresolution grid structure is that this L ~u_bc term will only contain non-zero values on the scale which touches the boundary, because the lower scales are already truncated in such a way that values outside of their parent scale have no effect on them when coupled through the Laplacian operator.
The effect of applying a monopole approximation to the boundary conditions is to
completely eliminate the monopole term from the image charge error, leaving only dipole
terms and higher. This means that the image charge falloff occurs as 1/r² rather than 1/r
as is the case for the monopole term. The overall result is that a smaller computational
grid may be used while giving the same error as a larger grid that utilized a completely
homogeneous boundary condition.
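The fictitious-charge construction of Eqn. (5.35) can be sketched in one dimension as follows (with C = 1, so u'' = ρ; the boundary values ua and ub below are arbitrary stand-ins for, e.g., a monopole estimate of the exterior potential):

```python
import numpy as np

# Interior second-difference Laplacian on n points with spacing h.
n, h = 63, 1.0 / 64
main = np.full(n, -2.0)
off = np.ones(n - 1)
A = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2

rho = np.zeros(n)
rho[n // 2] = 1.0 / h                 # unit point charge at the center

ua, ub = 0.3, -0.1                    # prescribed boundary potentials
Lu_bc = np.zeros(n)                   # L applied to the boundary-only grid:
Lu_bc[0] = ua / h**2                  # couples into the first interior row
Lu_bc[-1] = ub / h**2                 # ... and the last interior row

u = np.linalg.solve(A, rho - Lu_bc)   # interior solve with modified rho

# Check: the residual of the full problem, boundary values included,
# vanishes -- the fictitious charge reproduced the boundary conditions.
full = np.concatenate(([ua], u, [ub]))
resid = (full[:-2] - 2 * full[1:-1] + full[2:]) / h**2 - rho
print(np.max(np.abs(resid)))
```

Only the rows adjacent to the boundary pick up a fictitious-charge contribution, mirroring the observation above that the L ~u_bc term is non-zero only on the scale touching the boundary.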
5.9 Parallelization Opportunities
This algorithm for solving the Poisson equation offers significant opportunities for par-
allel optimization. The conjugate gradient algorithm which is used as the iterative solver
consists mainly of dot product and SAXPY (scalar multiply and vector addition) opera-
tions, which are trivial to implement in parallel as they are element-wise array operations
or simple sum-reductions [8]. The remaining difficult operation to parallelize is the appli-
cation of the Laplacian operator to the trial potential. For the Laplacian operator stage,
the wavelet basis allows the data to be split in either space or scales. A separation in scales
offers a very small boundary between communicating layers, as information is only passed
between scales as S coefficients, which consist of only 1/8 of the number of coefficients on any given layer. If an additional separation is needed, the wavelet basis can also separate in space, requiring only a few boundary coefficients to be passed between adjacent
segments. This is a significant reduction in communication compared with an FFT-based
implementation, which does not separate well in space due to the lack of spatial locality
of the basis set.
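A minimal conjugate gradient solver makes the claimed structure explicit: aside from one operator application per iteration, only dot products and SAXPY updates appear. This is a generic textbook sketch with an absolute tolerance, not the thesis code or its relative stopping criterion.

```python
import numpy as np

def cg(apply_A, b, tol=1e-8, max_iter=500):
    """Conjugate gradient built from the primitives named in the text:
    dot products (sum-reductions), SAXPY updates, and one operator
    application per iteration -- each trivially parallelizable."""
    x = np.zeros_like(b)
    r = b - apply_A(x)
    p = r.copy()
    rr = r @ r                         # dot product
    for _ in range(max_iter):
        Ap = apply_A(p)                # operator (Laplacian) application
        alpha = rr / (p @ Ap)          # dot product
        x += alpha * p                 # SAXPY
        r -= alpha * Ap                # SAXPY
        rr_new = r @ r
        if rr_new < tol * tol:
            break
        p = r + (rr_new / rr) * p      # SAXPY
        rr = rr_new
    return x

# usage: matrix-free 1D second-difference operator (SPD with Dirichlet BCs)
n = 50
def lap(u):
    out = 2 * u.copy()
    out[:-1] -= u[1:]
    out[1:] -= u[:-1]
    return out

b = np.random.default_rng(0).standard_normal(n)
x = cg(lap, b)
print(np.max(np.abs(lap(x) - b)))      # small residual
```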
CHAPTER 6
IMPLEMENTATION AND ALGORITHM TESTING
A computer implementation of the Poisson equation solver algorithm was constructed
with two goals in mind – to verify the correctness of the mathematical algorithm and
filter coefficients and to examine the response of the algorithm to changes in input param-
eters. This chapter reviews some of the features of this implementation and examines the
costs and benefits of the unique contributions of this algorithm for use in particle-in-cell
simulation codes.
6.1 Features and Code Structure
From the spectrum of available implementation choices, the solver used in this work was
constructed in object-oriented C++ with the Blitz++ library (http://www.oonumerics.org/blitz/) utilized for manipulation and optimization of dynamic arrays. It uses a
template structure to enable the code to be compiled and used with any number of di-
mensions, from 1D to 4D. This flexibility allows solutions with various dimensionality to
be tested and compared in the same codebase, and demonstrates the generalization of the
algorithm to various dimensions. The code also allows selection of the wavelet family to
be used, and includes implementations of 2nd, 4th, 6th, and 8th order lifted and non-lifted
interpolating wavelets.
6.1.1 Code Structure
The code is organized into two categories of data structures: vectors and filters. Be-
cause all of the matrix-vector operators used for this algorithm are symmetric and diag-
onally dominant, these filters represent the entire matrix by storing the representation
of a single “row” in a matrix. The concept of a “row” of the matrix is generalized to
higher dimensions as a tensor product of the 1D filters calculated in Appendix A. The
various types of coefficients in a wavelet expansion described in Eqn. (4.48) often each
have unique filters which take into account the differing types of neighboring coefficients.
Filters are used to describe both the non-standard operator matrices and the wavelet
transform matrices.
Because the scales are adaptive in nature, vectors are organized in two types as well: A
single base scale plus multiple child scales. The base scale consists of all S-type coefficients
at the coarsest resolution plus a possible single child scale, and the child scales are each
comprised of wavelet-type coefficients at a single resolution and may recursively contain a
single child scale as well. The data is organized in an interleaved fashion as in Eqn. (4.48), which leaves redundant S-type data points in the data structure of child scales. This data is only 1/8 of the size of the child scale for the 3D solver, and it is used in intermediate steps of the wavelet transform and non-standard operator applications. The base scale and child
scale objects contain the code for the non-standard operator algorithms, according to the
algorithm outlined in Figures 20 and 21.
Child scales apply boundary conditions of zero whenever wavelet-type data is required outside of the child scale's region, but they use the parent data as S-type data points in the boundary region around the scale. This creates the requirement that a child scale
must be able to fit within its parent scale and also have enough boundary values at the
parent’s scale level to satisfy the widest of the wavelet transform, overlap, or Laplacian
filters. The boundary conditions applied to the top-most scale can be arbitrary, and can be
set to any desired function. Implementations of homogeneous (zero) boundary conditions
and monopole approximation boundary conditions are both included and can be selected
from a command-line option in order to allow comparison with and without the monopole
boundary condition approximation, which is a new contribution of this current work.
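The filter-as-row idea can be illustrated with the simplest separable operator. The sketch below uses the standard 7-point Laplacian stencil rather than the wavelet filters of Appendix A: 1D filters are combined by tensor product and applied matrix-free as a stencil, so the full operator matrix is never stored.

```python
import numpy as np

d2 = np.array([1.0, -2.0, 1.0])     # 1D second-difference filter
ident = np.array([0.0, 1.0, 0.0])   # 1D identity (delta) filter

def tensor3(a, b, c):
    # a single "row" of the 3D operator as a tensor product of 1D rows
    return np.einsum('i,j,k->ijk', a, b, c)

# 3D Laplacian stencil = d2 (x) I (x) I + I (x) d2 (x) I + I (x) I (x) d2
stencil = (tensor3(d2, ident, ident)
           + tensor3(ident, d2, ident)
           + tensor3(ident, ident, d2))

def apply_stencil(u, s):
    """Matrix-free operator application: correlate u with the stencil,
    with zero (Dirichlet) values outside the grid."""
    m = s.shape[0] // 2
    out = np.zeros_like(u)
    up = np.pad(u, m)                # zero-pad every axis by the half-width
    for i in range(s.shape[0]):
        for j in range(s.shape[1]):
            for k in range(s.shape[2]):
                if s[i, j, k] != 0.0:
                    out += s[i, j, k] * up[i:i + u.shape[0],
                                           j:j + u.shape[1],
                                           k:k + u.shape[2]]
    return out

# check against the discrete Laplacian of a quadratic: lap(x^2+y^2+z^2) = 6
N = 8
ax = np.arange(N, dtype=float)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing='ij')
u = X**2 + Y**2 + Z**2
v = apply_stencil(u, stencil)
print(v[3, 3, 3])                    # 6.0 at interior points (h = 1)
```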
6.2 Testing: Parameters and Measurements
Over 1000 variations of parameters were tested with this solver to verify robustness and
to examine the feasibility of the multi-scale algorithm and monopole approximation for
use in approximating open boundary conditions. The output of the solver was compared
with the known analytic solutions and an estimate of the overall error was produced for
each test run. All tests reported in this chapter were computed in 3D. The actual charge
distribution and potential functions shown in this chapter are always three-dimensional
and symmetric about z = 0, and a 2D slice across z ≈ 0 is shown in figures.
• Three different input charge distributions were utilized – a Gaussian pure monopole,
a Gaussian pure dipole, and a double Gaussian function containing monopole, dipole, and higher-order multipole terms. A typical particle-in-cell beam dynamics simulation will only include particles of a single charge, but the dipole input test function was included as a control group to establish how the dipolar term of a multipole expansion affects the solver. Figure 24 illustrates these three input variations. Each of the Gaussian functions is radially truncated such that ρ = 0 for r > 1/4, and they are arranged so that non-zero charge density only exists within a sphere of r < 1 around the origin.
• 2nd-order and 4th-order interpolating wavelets were used, of both the lifted and
non-lifted varieties.
• Either monopole-approximation or zero (homogeneous) boundary conditions were employed.
• The number of grid points sampling the finest scale was varied in cubed powers of two from 4³ up to 64³. The finest scale was always chosen to be of size 1 in each
dimension, in order to include the non-zero charge distribution of the input function.
This is the typical way that the input function grid is assigned in a particle-in-cell
simulation.
• The number of additional boundary scales was varied from 0 (only the base finest
scale) up to 10 boundary scales. The scales were added using the minimum number
of points required in each additional parent scale, which was dependent on the
wavelet family used. The minimum radial extent of the largest scale was recorded
for comparison between simulations using different wavelet families and fine-grid
densities.
Figure 24: Code testing input functions. (a) Gaussian monopole. (b) Gaussian dipole. (c) Double Gaussians.
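Input functions of this kind can be sketched as follows. The truncation radius of 1/4 is taken from the text; the Gaussian width and the ±0.3 center offsets below are illustrative assumptions, not the thesis's actual parameters.

```python
import numpy as np

def truncated_gaussian(X, Y, Z, center, sigma=0.08, sign=1.0):
    # Gaussian about `center`, set exactly to zero for r > 1/4 from it.
    r2 = (X - center[0])**2 + (Y - center[1])**2 + (Z - center[2])**2
    g = sign * np.exp(-r2 / (2.0 * sigma**2))
    g[r2 > 0.25**2] = 0.0
    return g

n = 33                                    # odd so the origin is a grid point
ax = np.linspace(-1.0, 1.0, n)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing='ij')

monopole = truncated_gaussian(X, Y, Z, (0.0, 0.0, 0.0))
dipole = (truncated_gaussian(X, Y, Z, (0.3, 0.0, 0.0))
          + truncated_gaussian(X, Y, Z, (-0.3, 0.0, 0.0), sign=-1.0))
double = (truncated_gaussian(X, Y, Z, (0.3, 0.0, 0.0))
          + truncated_gaussian(X, Y, Z, (-0.3, 0.0, 0.0)))

print(monopole.max(), abs(dipole.sum()))  # peak 1 at origin; net charge ~0
```

All non-zero density lies within r < 1 of the origin (offset 0.3 plus truncation radius 0.25), matching the arrangement described in the bullet above.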
The conjugate gradient method was used to compute the solution of the potential,
using an initial guess of 0, and the number of iterations was chosen to reduce the residual
to 8 orders of magnitude smaller than the squared norm of the charge distribution input
function. The error of the final result was computed as an RMS error comparing the
computed result with the analytical result on the finest grid only:
RMS Error = sqrt( Σ_{i ∈ Fine Grid} (Result_i − Actual_i)² / (Num Fine Points) ).          (6.1)
Since the boundary grids only exist in order to improve the boundary condition ap-
proximation, they were not included in the error calculation. Note that comparing with
the analytical solution tests both the ability of the solver to invert the Laplacian, as well
as the ability of the basis function set to accurately represent the charge density function,
which is often dependent on the choice of fine grid density.
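Eqn. (6.1) translates directly into code. The sketch below applies it to a made-up 1D stand-in for solver output, restricting the sum to a fine-grid mask as described above:

```python
import numpy as np

def rms_error(result, actual, fine_mask):
    """Eqn. (6.1): RMS deviation from the analytic solution, restricted
    to fine-grid points (boundary scales are excluded)."""
    d = (result - actual)[fine_mask]
    return np.sqrt(np.sum(d**2) / d.size)

# usage with a made-up 1D stand-in for solver output
x = np.linspace(-2.0, 2.0, 81)
actual = np.exp(-x**2)
result = actual + 1e-3 * np.sin(40.0 * x)    # pretend computed potential
fine = np.abs(x) <= 1.0                      # count only the "fine grid"
print(rms_error(result, actual, fine))
```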
6.3 Testing: Demonstration of Solutions
A test series demonstrating the effect of increasing boundary layers is shown in Figure 25. This set of results is from a series using the Gaussian dipole input function, 2nd-order lifted wavelets, and a 32³ core grid. The dipole input function is shown to illustrate
the dipole image charge falloff without requiring the use of monopole boundary conditions.
The number of boundary layers is varied, and each individual result is shown in parts (a)
through (d) of the figure. Each part of the figure shows the entire solution of the potential
on the left, including the boundary grid regions. In the upper right, the deviation of the
solution from the analytical potential is shown on the finest grid, along with the overall
RMS error for the solution. In the lower right, the residual is shown as a function of
(a) Dipole – 0 boundary layers: radial extent > 1.000000, RMS error = 0.001909, convergence in 29 iterations.
(b) Dipole – 2 boundary layers: radial extent > 1.774190, RMS error = 0.000271, convergence in 24 iterations.
Figure 25: Dipole decay with boundary layers. (continued on following page)
(c) Dipole – 4 boundary layers: radial extent > 4.870970, RMS error = 0.000030, convergence in 22 iterations.
(d) Dipole – 5 boundary layers: radial extent > 9.000000, RMS error = 0.000022, convergence in 31 iterations.
Figure 25: Dipole decay with boundary layers (continued).
the conjugate gradient iteration number, with the total number of iterations required for
convergence denoted above this plot.
Figure 25a demonstrates that truncating the computational domain very near to the
charge distribution results in significant dipole image charge effects in the resulting po-
tential. The effect of this dipolar image charge can be seen in the overall slope of the
deviations of this solution – the dipole image charge creates a dipole error in the poten-
tial, as is expected. In Figure 25b, the dipole image charge error is still visibly dominant,
but it has reduced in strength by several orders of magnitude and the sampling error is
beginning to be significant as well. In Figure 25c, four boundary layers have been added,
and the RMS error is greatly reduced from that in Figure 25a. In addition, the error is
no longer dominated by the dipole image charge, but the sampling error is the promi-
nent contributor to the overall error. Further increasing the number of boundary layers
provides only marginal improvement in the RMS error, as Figure 25d demonstrates.
The convergence of the tests in this chapter also merits some consideration. For sim-
plicity, the algorithm was implemented with the entire top scale remaining as S-type
coefficients. This is because the top scale’s dimension is not guaranteed to be an even
number in each direction, so a wavelet transform of this layer is not always possible.
With zero boundary scales, then, the convergence rate is essentially identical to that of
a finite-difference scheme with no preconditioning. Adding boundary scales allows the
wavelet preconditioning to work while also reducing the number of points in the topmost
non-transformed scale, resulting in cases where adding boundary scales actually causes
the solution to converge more rapidly. Further study of this effect is needed in order to
construct an improved preconditioning scheme for cases with few boundary layers. In any
case, the number of points in the overall vector is usually a better indicator of program execution time, since the algorithm complexity is O(N_p · N_iter) and N_p ≫ N_iter. An optimal preconditioning scheme will produce N_iter ≈ const, so O(N_p) is the best expected performance – the fact that the condition number goes down with an increased number of scales is a symptom of the artificially poor condition number for the lower scale numbers rather than an improved condition number for the medium scale numbers. This effect is
not significant for typical uses of the algorithm, however, as a few boundary scales are
always needed in order to correctly evaluate the boundary conditions.
6.4 Testing: Effects of Radial Extent and Monopole BCs
A second observation which can be made from Figure 25 is that the radial extent of
the computational domain is not linearly related to the number of scales. Rather, it is
nearly exponential, as the point spacing of each additional layer is double the spacing of
the previous scale. For this reason, and because the radial extent is an actual physical
characteristic of the Poisson equation under consideration, it is used rather than the
number of boundary layers in the remaining plots in this chapter.
In terms of the radial extent, it is expected that the image charge error will decay as r⁻¹ for monopole terms in the potential and as r⁻² and faster for dipole and higher multipole terms. As was mentioned in Sec. 5.8, the goal of applying monopolar boundary conditions is to reduce the erroneous image charge in the potential to dipole or higher terms, so that the error should decay as r⁻² or faster for any charge distribution. Figure 26 illustrates
the effects of varying the number of boundary layers on the RMS error of the various
combinations of input functions and boundary condition schemes. Because the finest grid
is always chosen to have a radial extent of 1, the radial extent shown in this figure is
normalized such that it is also a relative radial extent.
As expected, the Gaussian monopole and double Gaussian input functions with homogeneous (zero) boundary conditions are subject to image charge errors which decay approximately as r⁻¹. The decay in the error of these functions is constrained by their
Figure 26: Error vs. radial extent. Second-order lifted wavelets, 32³ grid. (RMS error vs. radial extent for the dipole, double-Gaussian, and monopole input functions, each with monopole and with zero boundary conditions.)
monopolar terms, and they require approximately an order of magnitude increase in the
radial extent to produce an order of magnitude decrease in the error.
The RMS error of the Gaussian dipole input function decays as r⁻² – also as predicted.
Before the sampling error saturates the results, the dipole function's error decays by two orders of magnitude over the first order-of-magnitude increase in the relative radial extent. This results in a significantly smaller error than the monopolar functions for even very small numbers of boundary layers.
Applying the monopole approximation boundary conditions also has the expected ben-
efits. For the dipole input function, there is no effect, as there is no monopolar term to
cancel. For both the Gaussian monopole and the double Gaussian input functions, acti-
vating the monopolar boundary conditions significantly reduces the RMS error for small
numbers of boundary layers. For radial extents of around 2–5, the RMS error is reduced
by two orders of magnitude or more by using the monopolar boundary condition approxi-
mation. The Gaussian monopole function exhibits no image charge effect whatsoever, and
the image charge error of the double Gaussian function initially decays at least as fast as r⁻² for the first few boundary layers.
As the number of boundary layers gets larger, the error reduction of all of the in-
put functions saturates to the limit of the sampling error, and the convergence of the
conjugate gradient algorithm is somewhat dependent on the total number of points in
the computational domain, so the RMS error grows slightly with increased radial extent.¹
These two effects counteract the reduction of the multipole terms and produce an expected
minimum error with radial extents of around 2–5 when using the monopolar boundary
condition approximation for a general input function. This typically corresponds to only a few boundary layers, so the extra cost of the boundary condition computations is expected to be small.

¹ The Gaussian dipole input function is not greatly affected by this second effect because the initial guess of "0" for the potential is fairly accurate for the boundary grids, resulting in a smaller overall residual at convergence of the conjugate gradient algorithm.
As mentioned in the previous section, the cost of adding extra layers for boundary
condition computation can be approximated roughly as the number of extra points added
by these boundary layers. Figure 27 shows the relative increase in points caused by
increasing the radial extent of the computational domain to improve the computation
of boundary conditions. For radial extents in the "optimal" region of 2–5, the fractional increase in points is less than 2 for all cases, and is marginal with denser fine grids. For particle-in-cell simulations, grid sizes of 32³ are typical for simulations involving hundreds of thousands of particles, with 64³ used for particle counts in the millions – which is becoming more and more common. So, the boundary conditions
computation will account for a smaller and smaller fraction of the overall computation
time as the resolution of modern PIC simulations increases.
Figure 27: Number of points vs. radial extent (ratio of total points to fine-grid points, for 16³, 32³, and 64³ fine grids). Second-order lifted wavelets.
CHAPTER 7
CONCLUSIONS
This thesis provides an algorithm for calculating solutions of the Poisson equation
for use in particle-in-cell simulators. This algorithm retains the principal benefits of the
wavelet-based solver of Terzic et al.:
• The iterative method used to solve the Poisson equation allows the algorithm to be
adaptive in time by using information from the previous time step to accelerate the
solution of the current time step.
• The preconditioning scheme used in this algorithm is similar to the scheme used in
[28], and it is applied implicitly in the wavelet basis itself. The recognition of the
nature of this implicit preconditioning is unique to this work. This preconditioning
allows the iterative solver to converge more rapidly than a similar finite-difference
scheme would allow.
• The solution is computed in wavelet space, allowing possible wavelet de-noising of the
charge density and potential to reduce sampling noise inherent in the particle-in-cell
algorithm [20, 21, 28].
In addition, this new algorithm extends the prior work by utilizing the non-standard
form of the wavelet-space Laplacian operator, exposing much of the inherent sparsity of
the operator in the wavelet basis. The improved sparsity of the operators results in faster
O(N) operator application speeds. The non-standard operator form also separates the
wavelet scales to enhance the amount of parallelization available.
To overcome the difficulties that the former algorithm suffered when computing bound-
ary conditions, a new multiscale adaptive-in-space algorithm is used to reduce image charge
error in boundary condition computation. This new algorithm is far simpler to implement
in parallel codes. The multiscale algorithm for boundary conditions also enables the code
to put the adaptive-in-time nature of the iterative solver to work on the boundary condi-
tions as well – consuming less computational time during portions of the simulation when
the charge density is slowly changing.
Further reducing the computational cost of the boundary condition calculation, a new
monopole approximation boundary condition scheme is used to reduce the number of
boundary layers needed. Tests were performed to verify the predicted behavior of this
algorithm and they show that this monopole boundary condition scheme reduces the
required number of boundary layers to about 3. It also cuts down the RMS error by two
orders of magnitude on grids with three boundary layers for general functions encountered
in accelerator or galactic dynamics.
Overall, this new algorithm is shown to be an improvement in both execution complexity and flexibility over the previous wavelet-based particle-in-cell solver. It also shows
the potential to improve the range of simulations that are practical for particle-in-cell
simulations by reducing the computational cost required for large grid sizes and allowing
implementations of parallel versions of the algorithm.
7.1 Future Applications
A key area for future development of this algorithm lies in the extension of the adaptive
grid algorithm into the core of the charge density region. Terzic et al. reported that less
than 10% of wavelet coefficients are required to represent a typical particle density in a particle-in-cell simulation with a 32³ fine grid size, and this fraction decreases with larger
grid densities [28]. The adaptive grid structure developed in this present work could be
harnessed to exploit some of this sparsity, improving both the operator application speeds
and the convergence rate of the conjugate gradient algorithm, which currently depends
on the number of grid points. This improved performance would allow simulations with
higher grid densities and particle counts even when executed as a serial process.
The next stages of algorithm improvement are optimization for execution speed and the
development of a parallel code. Parallelization opportunities exist for both large-grained
and small-grained parallel strategies. Large-grained parallel areas involve distributing the
data across computation nodes and performing operations on individual scales or segments
of the data in parallel. The short length of the filters involved in wavelet transformations
and operators also supports small-grained parallel strategies involving applying the filter
in parallel over several small sections in the multiple cores of a single processor. This
small-grained parallelism is a result of the compact support of the wavelet basis, and
will become increasingly important as modern processors continue to exhibit more small-
grained parallelism through multithreading and multiple processor cores.
BIBLIOGRAPHY
[1] T.A. Arias. Multiresolution analysis of electronic structure: semicardinal and wavelet bases. Reviews of Modern Physics, 71:267–311, 1999.
[2] A. Averbuch, G. Beylkin, R. Coifman, P. Fischer, and M. Israeli. Adaptive solution of multidimensional PDEs via tensor product wavelet decomposition. 2003. URL: http://www.cs.tau.ac.il/~amir1/PS/poisson.pdf.
[3] A. Averbuch, E. Braverman, and M. Israeli. Parallel adaptive solution of a Poisson equation with multiwavelets. SIAM Journal on Scientific Computing, 22:1053–1086, 2000.
[4] G. Beylkin, R. Coifman, and V. Rokhlin. Fast wavelet transforms and numerical algorithms I. Communications on Pure and Applied Mathematics, 44:141–183, 1991.
[5] G. Beylkin. On the representation of operators in bases of compactly supported wavelets. SIAM Journal on Numerical Analysis, 29:1716–1740, 1992.
[6] G. Beylkin and James M. Keiser. On the adaptive numerical solution of nonlinear partial differential equations in wavelet bases. Journal of Computational Physics, 132:233–259, 1997.
[7] I. Daubechies. Ten lectures on wavelets. Society for Industrial and Applied Mathematics, Philadelphia, 1992.
[8] Martyn R. Field. Optimizing a parallel conjugate gradient solver. SIAM Journal on Scientific Computing, 19:27–37, 1998.
[9] Masafumi Fujii and Wolfgang J.R. Hoefer. Interpolating wavelet collocation method of time dependent Maxwell's equations: characterization of electrically large optical waveguide discontinuities. Journal of Computational Physics, 186:666–689, 2003.
[10] C.K. Gan, P.D. Haynes, and M.C. Payne. Preconditioned conjugate gradient method for the sparse generalized eigenvalue problem in electronic structure calculations. Computer Physics Communications, 134:33–40, 2001.
[11] Stefan Goedecker. Wavelets and their application for the solution of partial differential equations in physics. Presses Polytechniques et Universitaires Romandes, Lausanne, 1998.
[12] Stefan Goedecker and Claire Chauvin. Combining multigrid and wavelet ideas to construct more efficient multiscale algorithms. Journal of Theoretical and Computational Chemistry, 2:483–495, 2003.
[13] Stefan Goedecker and Oleg Ivanov. Solution of multiscale partial differential equations using wavelets. Computers in Physics, 12:548–555, 1998.
[14] S. Goedecker and O.V. Ivanov. Linear scaling solution of the Coulomb problem using wavelets. Solid State Communications, 105:665–669, 1998.
[15] R. Hockney and J. Eastwood. Computer simulation using particles. McGraw-Hill, New York, 1981.
[16] Randall J. LeVeque. Wave propagation software, computational science, and reproducible research. Proceedings of the International Congress of Mathematicians, 2006.
[17] Ross A. Lippert, T.A. Arias, and Alan Edelman. Multiscale computation with interpolating wavelets. Journal of Computational Physics, 140:278–310, 1998.
[18] Gisela Poplau, Ursula van Rienen, Marieke de Loos, and Bas van der Geer. A multigrid based 3D space-charge routine in the tracking code GPT. TESLA Reports, 2003-03.
[19] S.B. van der Geer, O.J. Luiten, M.J. de Loos, G. Poplau, and U. van Rienen. 3D space-charge model for GPT simulations of high-brightness electron bunches. TESLA Reports, 2003-04.
[20] Alessandro B. Romeo, Cathy Horellou, and Jöran Bergh. N-body simulations with two-orders-of-magnitude higher performance using wavelets. Monthly Notices of the Royal Astronomical Society, 342:337–344, 2003.
[21] Alessandro B. Romeo, Cathy Horellou, and Jöran Bergh. A wavelet add-on code for new-generation N-body simulations and data de-noising (JOFILUREN). Monthly Notices of the Royal Astronomical Society, 354:1208–1222, 2004.
[22] David B. Serafini, Peter McCorquodale, and Phillip Colella. Advanced 3D Poisson solvers and particle-in-cell methods for accelerator modeling. Journal of Physics: Conference Series, 16:481–485, 2005.
[23] R. Shankar. Principles of quantum mechanics. Springer, New York, 1994.
[24] Jonathan R. Shewchuk. An introduction to the conjugate gradient method without the agonizing pain. Carnegie Mellon University, Pittsburgh, 1994.
[25] M.K. Sun and W.Y. Tam. Analysis of 2-D scattering problems using a novel nonuniform gridding FDTD method. Microwave and Optical Technology Letters, 28:430–432, 2001.
[26] Wim Sweldens. The lifting scheme: a construction of second generation wavelets. SIAM Journal on Mathematical Analysis, 29:511–546, 1998.
[27] Wim Sweldens and Peter Schröder. Building your own wavelets at home. Wavelets in Computer Graphics, ACM SIGGRAPH 96 course notes, 1996.
[28] Balsa Terzic, Ilya V. Pogorelov, and Courtlandt L. Bohn. Particle-in-cell beam dynamics simulations with a wavelet-based Poisson solver. Physical Review Special Topics – Accelerators and Beams, 10:034201, 2007.
[29] Oleg V. Vasilyev and Christopher Bowman. Second-generation wavelet collocation method for the solution of partial differential equations. Journal of Computational Physics, 165:660–693, 2000.
APPENDIX A
WAVELET COEFFICIENTS AND OPERATOR COEFFICIENTS
The prediction filters for interpolating wavelet families are all derived from the
corresponding interpolation schemes. Table 3 lists the prediction filters for 2nd-, 4th-, 6th-,
and 8th-order interpolating wavelets. The 2nd-order filter is simply the linear prediction
scheme derived in Chapter 3, and the higher order filters are given in [11]. The update
filters for the non-lifted wavelet families are all zero. Sweldens gives a simple expression
for update filters which give the maximum number of vanishing moments for a given
prediction filter [27]:

    U[i] = \frac{1}{2} P[-i].  (A.1)

Once U and P are chosen, finding h, g, \tilde{h}, and \tilde{g} is simply a matter of applying Eqns. (4.5)
and (4.22–4.23).
Table 3: Interpolating wavelet prediction filters

Offset       1           2           3          4
2nd Order    1/2
4th Order    9/16        -1/16
6th Order    75/128      -25/256     3/256
8th Order    1225/2048   -245/2048   49/2048    -5/2048

Note: Only positive offsets are shown, but the filters are symmetric about 1/2, so
P[1-i] = P[i], where i >= 1.
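As an illustrative cross-check added here (not part of the original implementation), the following Python sketch verifies that the 4th-order prediction filter of Table 3 is exact on cubic polynomials, and derives the corresponding lifted update filter from Eqn. (A.1); the `predict` helper and the stencil layout are assumptions of this sketch:

```python
from fractions import Fraction as F

# 4th-order prediction filter from Table 3; by symmetry P[1-i] = P[i],
# the full stencil predicts the midpoint value f(0) from the four
# surrounding coarse samples f(-3), f(-1), f(1), f(3).
P = {1: F(9, 16), 2: F(-1, 16)}
stencil = {-3: P[2], -1: P[1], 1: P[1], 3: P[2]}

def predict(f):
    # Weighted sum of coarse samples; exact for polynomials up to cubics.
    return sum(w * f(x) for x, w in stencil.items())

# The detail (prediction error) vanishes on cubics:
for poly in (lambda x: 1, lambda x: x, lambda x: x**2, lambda x: x**3):
    assert predict(poly) == poly(0)

# Eqn. (A.1): the update filter with the maximum number of vanishing
# moments is U[i] = P[-i] / 2 (index convention of this sketch).
U = {-x: w / 2 for x, w in stencil.items()}
print(U)  # weights 9/32 at offsets +/-1 and -1/32 at offsets +/-3
```

Exact rational arithmetic via `fractions` is used so that the polynomial-reproduction check is an equality rather than a floating-point comparison.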
With the wavelet coefficients found, determining the 1D operator matrix elements
involves evaluating the L_{S(n)}, L_{SD(n)}, L_{DS(n)}, and L_{DD(n)} terms from Eqn. (5.11). To be
general, we can find the operator A^{(l)} which represents the lth derivative, and this will
have special cases of L and O. The solution begins by defining the base integral:
    a^{(l)}_i = \int \phi(x)\,\partial_x^l\,\phi(x-i)\,dx.  (A.2)
Expanding this base integral in terms of the basic refinement relation in Eqn. (4.1) gives:
    a^{(l)}_i = \int \phi(x)\,\partial_x^l\,\phi(x-i)\,dx,
             = \sum_{\nu,\mu} h_\nu h_\mu \int \phi(2x-\nu)\,\partial_x^l\,\phi(2(x-i)-\mu)\,dx,
             = \sum_{\nu,\mu} h_\nu h_\mu \int \phi(Z)\,\partial_Z^l\,2^l\,\phi(Z+\nu-2i-\mu)\,\frac{1}{2}\,dZ,
             = 2^{l-1} \sum_{\nu,\mu} h_\nu h_\mu\, a^{(l)}_{2i+\mu-\nu}.  (A.3)
Solving Eqn. (A.3) reduces to solving for the eigenvector of

    B^{(l)} \vec{a}^{(l)} = 2^{1-l}\,\vec{a}^{(l)},  (A.4)

where

    B^{(l)}_{i,j} = \sum_{\nu,\mu} h_\nu h_\mu\, \delta_{j,\,2i+\mu-\nu}.  (A.5)
The eigenvector equation leaves the normalization arbitrary, but Goedecker gives the
following normalization based on the normalization of the scaling functions and the action
of derivative operators on polynomial functions [11]:
    \sum_i i^l\, a^{(l)}_i = l!  (A.6)
A Maple 10 program to calculate the base filters for any interpolating wavelet family
is given in Figure 28. Also, calculated filters for 2nd- and 4th-order wavelets are shown in
Table 4 to allow validation of the code prior to using it to compute filters for higher order
wavelet families.
Table 4: Interpolating wavelet base filters

Offset             0             1              2             3              4            5
2nd Order a^(0)    2/3           1/6
2nd Order a^(2)    -2            1
4th Order a^(0)    56264/70245   19253/140490   -2827/70245   6283/2247840   -16/210735   -1/6743520
4th Order a^(2)    -20/9         9/8            0             -1/72

Note: Only positive offsets are shown, but the filters are symmetric about 0, so
a^(l)[-i] = a^(l)[i].
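The eigenproblem of Eqns. (A.4–A.6) can also be set up in a few lines of NumPy. This sketch (an independent reimplementation added here, not the thesis code) recovers the 2nd-order a^(2) row of Table 4; the five-point support range is an assumption of the sketch:

```python
import math
import numpy as np

# Refinement filter h for 2nd-order (linear) interpolating wavelets,
# indexed by offset nu in {-1, 0, 1}.
h = {-1: 0.5, 0: 1.0, 1: 0.5}

l = 2                          # derivative order (1D Laplacian)
offsets = list(range(-2, 3))   # assumed support of the base filter
idx = {o: k for k, o in enumerate(offsets)}

# B^(l)_{i,j} = sum_{nu,mu} h_nu h_mu delta_{j, 2i+mu-nu}   (Eqn. A.5)
B = np.zeros((len(offsets), len(offsets)))
for i in offsets:
    for nu, hnu in h.items():
        for mu, hmu in h.items():
            j = 2 * i + mu - nu
            if j in idx:
                B[idx[i], idx[j]] += hnu * hmu

# Solve B^(l) a = 2^(1-l) a   (Eqn. A.4): take the eigenvector whose
# eigenvalue is closest to 2^(1-l) = 1/2.
vals, vecs = np.linalg.eig(B)
a = np.real(vecs[:, np.argmin(np.abs(vals - 2.0 ** (1 - l)))])

# Normalize with sum_i i^l a_i = l!   (Eqn. A.6).
a /= sum(o ** l * a[idx[o]] for o in offsets) / math.factorial(l)

print(np.round(a, 10))  # central entries reproduce Table 4: 1, -2, 1
```

Changing `h`, `l`, and the support range reproduces the other rows; the same structure is what the Maple program of Figure 28 implements.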
Once the base filters have been computed, the components of all of the individual
filters can be derived in terms of them. A^{(l)}_{S(n)} becomes:

    A^{(l)}_{S(n),i,j} = \langle \phi^n_i | \partial^{(l)}_x | \phi^n_j \rangle,
                       = \int \phi(2^n x - i)\,\partial^{(l)}_x\,\phi(2^n x - j)\,dx,
                       = \int \phi(Z)\,\partial^{(l)}_Z\,(2^n)^l\,\phi(Z + i - j)\,(2^{-n})\,dZ,
    A^{(l)}_{S(n),i,j} = 2^{n(l-1)}\,a^{(l)}_{j-i}.  (A.7)
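To make Eqn. (A.7) concrete, here is a small Python sketch (added for illustration; the `single_scale_block` helper and the zero-truncated boundaries are assumptions of the sketch) that builds the level-n Laplacian block from the 2nd-order base filter of Table 4 and applies it to samples of x²:

```python
import numpy as np

a2 = {-1: 1.0, 0: -2.0, 1: 1.0}   # 2nd-order a^(2) from Table 4 (symmetric)

def single_scale_block(n, size, a=a2, l=2):
    # A^(l)_{S(n),i,j} = 2^{n(l-1)} a^(l)_{j-i}   (Eqn. A.7), truncated
    # to a finite grid with zero boundaries.
    A = np.zeros((size, size))
    for i in range(size):
        for j in range(size):
            if (j - i) in a:
                A[i, j] = 2.0 ** (n * (l - 1)) * a[j - i]
    return A

# Acting on the coefficients of f(x) = x^2 sampled on the level-n grid
# x_j = j / 2^n, the interior rows return the weak-form Laplacian value
# <phi^n_i | f''> ~ 2^{-n} f'' = 2 * 2^{-n}.
n, size = 3, 9
s = (np.arange(size) / 2.0 ** n) ** 2
interior = (single_scale_block(n, size) @ s)[1:-1]
print(interior)  # every interior entry is 2 * 2^{-3} = 0.25
```

The 2^{n(l-1)} prefactor is the whole content of Eqn. (A.7): a single base filter serves every scale, rescaled by a power of two.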
> with(LinearAlgebra);
> derivVal := 2; eigval := 2^(1-derivVal);

# 2nd Order Interpolating Wavelets:
> size := 3; h := Vector(size, [1/2, 1, 1/2]);
# 4th Order Interpolating Wavelets:
> #size := 7; h := Vector(size, [-1/16, 0, 9/16, 1, 9/16, 0, -1/16]);
# 6th Order Interpolating Wavelets:
> #size := 11; h := Vector(size, [3/256, 0, -25/256, 0, 75/128, 1,
>                                 75/128, 0, -25/256, 0, 3/256]);

# Calculate the derivative operator filter:
> asize := 2*size-1; zero := size; A := Matrix(asize, asize);
> for i to asize do
    for j to asize do
      A[i, j] := 0;
      for nu to size do
        for mu to size do
          if (j-zero) = 2*(i-zero) - (nu-zero) + (mu-zero) then
            A[i,j] := A[i,j] + h[nu]*h[mu];
          end if
        end do
      end do
    end do
  end do;
> print(A);

> C := convert(evalm(A-eigval*IdentityMatrix(asize)), Matrix);
> det := Determinant(C);
> if det = 0 then
    NullSpace(C);
    unnormalizedA := %[1]
  end if;
> b := add(unnormalizedA[i]*i^derivVal, i = 1 .. asize)/factorial(derivVal);
> a := evalm(unnormalizedA/b);
# Correctly normalized answer is in "a".

Figure 28: Maple 10 code for generating operator base filters.
By simply replacing indices in Eqn. (A.7), we can derive the following expression which
will be useful for computing the remaining filters:
    \langle \phi^{n+1}_{2i+\nu} | \partial^{(l)}_x | \phi^{n+1}_{2j+\mu} \rangle = 2^{(n+1)(l-1)}\,a^{(l)}_{2(j-i)+\mu-\nu}.  (A.8)
Using this equation and Eqns. (4.3–4.4), we can quickly derive the other filters. A^{(l)}_{SD(n)}
is found to be

    A^{(l)}_{SD(n),i,j} = \langle \phi^n_i | \partial^{(l)}_x | \psi^n_j \rangle,
                        = \sum_{\nu,\mu} h_\nu g_\mu \langle \phi^{n+1}_{2i+\nu} | \partial^{(l)}_x | \phi^{n+1}_{2j+\mu} \rangle,
    A^{(l)}_{SD(n),i,j} = 2^{(n+1)(l-1)} \sum_{\nu,\mu} h_\nu g_\mu\, a^{(l)}_{2(j-i)+\mu-\nu}.  (A.9)
In a similar fashion, A^{(l)}_{DS(n)} and A^{(l)}_{DD(n)} are given as

    A^{(l)}_{DS(n),i,j} = 2^{(n+1)(l-1)} \sum_{\nu,\mu} g_\nu h_\mu\, a^{(l)}_{2(j-i)+\mu-\nu},  (A.10)

    A^{(l)}_{DD(n),i,j} = 2^{(n+1)(l-1)} \sum_{\nu,\mu} g_\nu g_\mu\, a^{(l)}_{2(j-i)+\mu-\nu}.  (A.11)
From these general derivative operator filters, the Laplacian filter can be found by
substituting l = 2 and the overlap operator is given by l = 0.
APPENDIX B
DERIVATION OF WAVELET TRANSFORM IDENTITIES
The goal of this appendix is to derive the identities:

    \tilde{F}^\dagger F = 1, \qquad \tilde{F}^\dagger = B,  (4.31)
    \tilde{B}^\dagger B = 1, \qquad \tilde{B}^\dagger = F.  (4.32)
Beginning with the expression for the dual forward transform given in Eqns. (4.26–
4.27):

    s^k_i = \sum_j h_j\, s^{k+1}_{2i+j},  (4.26)
    d^k_i = \sum_j g_j\, s^{k+1}_{2i+j},  (4.27)

a change of dummy index variables results in the equivalent expressions:

    s^k_i = \sum_q h_{q-2i}\, s^{k+1}_q,  (B.1)
    d^k_i = \sum_q g_{q-2i}\, s^{k+1}_q.  (B.2)
This form of the equation reveals the matrix structure of the dual forward transform to
be

    \tilde{F}^k_{a,b} = \begin{cases} h_{b-2(a/2)}, & a \text{ even}, \\ g_{b-2((a-1)/2)}, & a \text{ odd}. \end{cases}  (B.3)
Separating the even and odd rows of the matrix into submatrices results in

    \tilde{F}^k_{a,b} = \begin{pmatrix} h_{b-a} \\ g_{b-a+1} \end{pmatrix},  (B.4)
and the transpose of \tilde{F} is

    (\tilde{F}^k_{a,b})^\dagger = \begin{pmatrix} h_{a-b} & g_{a-b+1} \end{pmatrix},  (B.5)

with even and odd columns now separated into submatrices for h and g.
Now consider the backward transform B. From Eqn. (4.28), an alternate expression
equivalent to Eqns. (4.29–4.30) can be derived by a different choice of dummy variables:

    s^{k+1}_j = \sum_\nu \left( s^k_\nu\, h_{j-2\nu} + d^k_\nu\, g_{j-2\nu} \right).  (B.6)

Again, because the s and d coefficients are interleaved in the vectors, the equivalent matrix
form of this expression is given by

    B^k_{a,b} = \begin{cases} h_{a-2(b/2)}, & b \text{ even}, \\ g_{a-2((b-1)/2)}, & b \text{ odd}. \end{cases}  (B.7)
Separating the even and odd columns into submatrices for h and g changes this expression
into

    B^k_{a,b} = \begin{pmatrix} h_{a-b} & g_{a-b+1} \end{pmatrix},  (B.8)

which can be seen to be equivalent to Eqn. (B.5), showing that \tilde{F}^\dagger = B. Similar equations
can be derived to show that F^\dagger = \tilde{B} by substituting \tilde{h} and \tilde{g} for h and g in the expressions
above.
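The block bookkeeping above is easy to get wrong, so here is a small numerical sketch (added for illustration, independent of the thesis code) that builds the dual forward matrix from Eqn. (B.3) and the backward matrix from Eqn. (B.7) for an arbitrary filter pair and confirms the transpose identity; the function names and the random stand-in filters are assumptions of the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
# Arbitrary short filters standing in for h and g; any values work,
# because the identity is pure index bookkeeping.
h = dict(zip(range(-2, 3), rng.standard_normal(5)))
g = dict(zip(range(-1, 2), rng.standard_normal(3)))

def dual_forward(N):
    # F~^k_{a,b} = h_{b-2(a/2)} (a even), g_{b-2((a-1)/2)} (a odd)  (B.3)
    F = np.zeros((N, N))
    for a in range(N):
        filt, shift = (h, a) if a % 2 == 0 else (g, a - 1)
        for b in range(N):
            F[a, b] = filt.get(b - shift, 0.0)
    return F

def backward(N):
    # B^k_{a,b} = h_{a-2(b/2)} (b even), g_{a-2((b-1)/2)} (b odd)  (B.7)
    B = np.zeros((N, N))
    for b in range(N):
        filt, shift = (h, b) if b % 2 == 0 else (g, b - 1)
        for a in range(N):
            B[a, b] = filt.get(a - shift, 0.0)
    return B

# The two matrices are exact transposes of each other: F~^dagger = B.
assert np.array_equal(dual_forward(8).T, backward(8))
```

Because the equality is entry-by-entry identical index arithmetic, `array_equal` (exact comparison) holds, not merely `allclose`.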
APPENDIX C
INHERENT PRECONDITIONER IN A BIORTHOGONAL BASIS
Examining the relative scaling caused by the standard rescaling of the basis for the
biorthogonal case involves keeping track of the individual components of the linear algebraic
Eqn. (2.7):

    L\vec{u} = O\vec{\rho},  (2.7)

with respect to the different scaling factors from the orthonormal-type scaling of
Eqns. (4.42–4.43):

    \phi^k_i(x) = \tilde{\phi}^k_i(x) = \sqrt{2^k}\,\phi(2^k x - i),  (4.42)
    \psi^k_i(x) = \tilde{\psi}^k_i(x) = \sqrt{2^k}\,\psi(2^k x - i),  (4.43)

and the traditional biorthogonal scale factors of Eqns. (3.1–3.2, 4.7–4.8):

    \phi^k_i(x) = \phi(2^k x - i),  (3.1)
    \psi^k_i(x) = \psi(2^k x - i),  (3.2)
    \tilde{\phi}^k_i(x) = 2^k\,\tilde{\phi}(2^k x - i),  (4.7)
    \tilde{\psi}^k_i(x) = 2^k\,\tilde{\psi}(2^k x - i).  (4.8)
The scaling function components of any vector |f\rangle are given as s^k_i = \langle\tilde{\phi}^k_i|f\rangle. Therefore,
the change in the components with respect to the difference in scaling is given by

    \frac{s^k_i\,(\text{biortho scaling})}{s^k_i\,(\text{ortho scaling})}
    = \frac{\langle\tilde{\phi}^k_i|f\rangle\,(\text{biortho scaling})}{\langle\tilde{\phi}^k_i|f\rangle\,(\text{ortho scaling})}
    = \frac{2^k}{\sqrt{2^k}} = \sqrt{2^k},  (C.1)
with exactly the same result for the wavelet-style coefficients. Defining the diagonal
matrix:

    P^T_{i,j} = \delta_{i,j}\,\sqrt{2^k}, \quad \text{where } k \text{ is the level number},  (C.2)
the factor between types of wavelet expansions is given by

    \vec{u}\,(\text{biortho scaling}) = P^T\,\vec{u}\,(\text{ortho scaling}).  (C.3)
For the operators, the scaling factors are found by

    \frac{\langle\phi^k_i|\phi^k_j\rangle\,(\text{biortho scaling})}{\langle\phi^k_i|\phi^k_j\rangle\,(\text{ortho scaling})}
    = \frac{1}{(\sqrt{2^k})^2} = \left(\sqrt{2^k}\right)^{-2}.  (C.4)
In terms of Eqn. (C.2), this means that

    O\,(\text{biortho scaling}) = P^{-1}\,O\,(\text{ortho scaling})\,P^{-T}.  (C.5)

The same conversion also applies to the Laplacian matrix L. Using Eqns. (C.3) and (C.5),
we can see that the biorthogonal case of Eqn. (2.7) corresponds to

    P^{-1}\,L\,(\text{ortho scaling})\,P^{-T} P^T\,\vec{u}\,(\text{ortho scaling}) = P^{-1}\,O\,(\text{ortho scaling})\,P^{-T} P^T\,\vec{\rho}\,(\text{ortho scaling}),  (C.6)

which is exactly the same form as the preconditioned linear equation from Eqn. (5.32). In
terms of M = P P^T, we have

    (M^{-1})^{k,l}_{i,j} = 2^{-k}\,\delta_{k,l}\,\delta_{i,j},  (C.7)
equivalent to Eqn. (5.33), which means that the preconditioning caused by the rescaling
of the biorthogonal bases is identical to the preconditioning scheme used in [28].
Note that this applies specifically to the one-dimensional case. For d dimensions, the
tensor product nature of the wavelet bases duplicates the scaling factors in each direction
resulting in the more general expression:
    (M^{-1})^{k,l}_{i,j} = 2^{-d \cdot k}\,\delta_{k,l}\,\delta_{i,j},  (C.8)
which is equivalent to the scheme of [28] only in the one-dimensional case.
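In an implementation, the inherent preconditioner of Eqn. (C.8) is just a diagonal matrix built from the level numbers. A minimal sketch (added for illustration; the level-by-level storage order and the function name are assumptions, not the thesis's data layout):

```python
import numpy as np

def inherent_preconditioner_inv(level_sizes, d=1):
    # (M^{-1})^{k,l}_{i,j} = 2^{-d*k} delta_{k,l} delta_{i,j}   (Eqn. C.8);
    # returns the diagonal entries, assuming the coefficient vector stores
    # level k = 0, 1, 2, ... contiguously.
    return np.concatenate([np.full(size, 2.0 ** (-d * k))
                           for k, size in enumerate(level_sizes)])

# Hypothetical 1D layout with three levels of sizes 2, 2, and 4:
diag = inherent_preconditioner_inv([2, 2, 4], d=1)
print(diag)  # entries 1, 1, 0.5, 0.5, then 0.25 repeated four times
```

Applying the preconditioner is then an elementwise multiply by `diag`, which costs O(N) per conjugate gradient iteration.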
APPENDIX D
A BRIEF PRIMER ON DIRAC NOTATION
Dirac notation allows for a more compact representation of many mathematical ex-
pressions which are common when working with arbitrary basis sets.
• |ψ〉 is called ket ψ. It is a basis-independent representation of a vector.
• 〈ψ| is called bra ψ. It is another basis-independent representation of a vector, the
adjoint to the ket |ψ〉.
A bra times a ket (called a braket) is an inner product, which is defined according to
whatever basis set the vectors are expanded in. In the case of the coordinate-space basis \vec{x}
(used exclusively in this work):

    \langle\psi|\,|\psi\rangle = \langle\psi|\psi\rangle = \int_{-\infty}^{\infty} \psi^*(\vec{x})\,\psi(\vec{x})\,d\vec{x}.
The relation between bras and kets is that \langle\psi| = |\psi\rangle^\dagger, where \dagger indicates the Hermitian
adjoint, which translates into transpose and conjugate. In other words,

    \langle\psi| = |\psi\rangle^\dagger = (|\psi\rangle^*)^T,
    |\psi\rangle = \langle\psi|^\dagger = (\langle\psi|^*)^T.

This ensures that the inner product of a vector with itself always results in a real number.
Bras and kets always act like vectors and their transposes. As an example, for real
vectors \vec{x} and \vec{y}:

    \vec{x}^T L \vec{y} = \vec{x}^\dagger L \vec{y} = \langle x|L|y\rangle,
    \left(\vec{x}^T\vec{y}\right)^T = \vec{y}^T\vec{x} = (\langle x|y\rangle)^\dagger = \langle y|x\rangle^* = \langle y|x\rangle, \quad \text{since } \vec{x} \text{ and } \vec{y} \text{ are real}.
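The correspondence can be checked directly with ordinary arrays; a minimal sketch (the 2-vectors and matrix here are hypothetical, chosen only for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
L = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# <x|L|y> is the ordinary matrix sandwich x^T L y for real vectors.
braket = x @ L @ y
assert braket == x.T @ L @ y

# <x|y> = <y|x> for real vectors, since conjugation does nothing.
assert x @ y == y @ x

print(braket)  # 5.0
```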
For a more thorough treatment of the notation, see Chapter 1 of Shankar [23].
Note that all of the functions and operators in this thesis are real functions with no
imaginary components, so complex conjugation is irrelevant and will be ignored (treated
as identity) throughout. This means that † will be treated effectively the same as T, and a
reader who wishes to generalize the results to complex-valued functions will have to trace
all of the conjugations back through the derivations and add them back in.
APPENDIX E
INTERPOLATING WAVELETS ON AN INTERVAL
For linear interpolating wavelets, the only change needed is on the last D coefficient
prediction, where there is no S coefficient beyond it. For this last point, simply take
the last two S coefficients, call them S_a and S_b. Both of these are to the left of the D
coefficient in question, but they can be used to make a linear extrapolation to predict the
D coefficient as S_b + \frac{1}{2}(S_b - S_a) = \frac{3}{2}S_b - \frac{1}{2}S_a. For the last step, when there is only one S
coefficient available, no linear interpolation is possible, so any prediction is equally valid.
For simplicity, we can make a prediction of zero slope when this happens. The resulting
prediction filters are shown in Tables 5 and 6. In this case, our simple line example
reduces to two points that lie on the line, which is the optimal compression of the
data. See Sweldens for more details on second generation wavelets [26, 27]. This scheme
was not used in this study because it complicates the definitions of operators by changing
the shape of the wavelet and scaling functions along the boundaries.
Table 5: Linear prediction filter on boundary

Offset    -1      0
Value     -1/2    3/2

Note: This is the filter for the last D coefficient, when S_{-1} exists.
Table 6: Linear prediction filter on boundary, no S_{-1}

Offset    0
Value     1

Note: This is the filter for the last D coefficient, when S_{-1} does not exist.
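A sketch of one forward prediction step using the boundary filters above (added for illustration; the even-length sample layout and the `boundary_details` helper are assumptions of this sketch, not the thesis code). Samples of a straight line produce identically zero detail coefficients, which is the optimal compression described above:

```python
def boundary_details(s):
    # One prediction step for linear interpolating wavelets on an interval:
    # S coefficients sit at even indices, D coefficients at odd indices.
    d = []
    for j in range(1, len(s), 2):
        if j + 1 < len(s):
            pred = 0.5 * (s[j - 1] + s[j + 1])       # interior linear filter
        elif j >= 3:
            pred = 1.5 * s[j - 1] - 0.5 * s[j - 3]   # Table 5: (3/2)S_b - (1/2)S_a
        else:
            pred = s[j - 1]                          # Table 6: zero-slope guess
        d.append(s[j] - pred)
    return d

line = [2.0 * j + 1.0 for j in range(8)]  # samples of a straight line
print(boundary_details(line))  # [0.0, 0.0, 0.0, 0.0]
```

The last detail coefficient uses the Table 5 extrapolation, yet it still vanishes on a line, because linear extrapolation from the last two S coefficients is exact for linear data.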