ABSTRACT
Name: Benjamin Sprague Department: Physics
Title: Wavelet-space solution of the Poisson equation: An algorithm for use in particle-in-cell simulations
Major: Physics Degree: Master of Science
Approved by: Date:
Thesis Director
NORTHERN ILLINOIS UNIVERSITY
ABSTRACT
A particle-in-cell (PIC) approach is useful for simulations involving large numbers of
particles interacting via electric charge or gravitation. The interaction potential for such
forces is described by the Poisson equation ∇²U = Cρ, where C is a constant which
establishes the sign and magnitude of the interaction. By depositing the particle charge
distribution into cells and solving for the potential on the grid of cells, the solution of the
Poisson equation is made more tractable for large numbers of particles.
This work presents a wavelet-based algorithm for the solution of the Poisson equation
in PIC simulations which exhibits better scaling properties than the traditional Green’s
function and FFT approach. This algorithm also provides unique possibilities for adaptive
resolution PIC simulations and parallelization of the Poisson solver.
The algorithm described in this work is derived in a self-contained manner from theory
to implementation, with the intent of encouraging the adoption of this wavelet-space
technique by researchers with no prior experience in wavelet techniques. The algorithm
was implemented in a test program to verify correctness, and over 1000 test runs of the
application were executed to examine the numerical accuracy and scaling properties of
the algorithm. These tests demonstrate that the wavelet-space approach is a competitive
algorithm for the solution of the Poisson equation in a particle-in-cell simulation.
NORTHERN ILLINOIS UNIVERSITY
WAVELET-SPACE SOLUTION OF THE POISSON EQUATION: AN
ALGORITHM FOR USE IN PARTICLE-IN-CELL SIMULATIONS
A THESIS SUBMITTED TO THE GRADUATE SCHOOL
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE
MASTER OF SCIENCE
DEPARTMENT OF PHYSICS
BY
BENJAMIN SPRAGUE
© 2008 Benjamin Sprague
DEKALB, ILLINOIS
AUGUST 2008
Certification: In accordance with departmental and Graduate
School policies, this thesis is accepted in
partial fulfillment of degree requirements.
Thesis Director
Date
ACKNOWLEDGEMENTS
Work supported by the Office of Naval Research, Department of Defense, under contract N00014-06-1-0587 with Northern Illinois University and by the Department of Education under contract P116Z010035 with Northern Illinois University.
TABLE OF CONTENTS
Page
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi
LIST OF FIGURES. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii
LIST OF APPENDICES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Chapter
1 INTRODUCTION AND PROBLEM STATEMENT . . . . . . . . . . . . . . . . . . . 1
1.1 Problem Scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2 Improvement of Wavelet-Based Particle-in-Cell Algorithm . . . . . . . . . . 4
1.3 Goals and Chapter Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 WEIGHTED RESIDUAL METHOD AND SOLVING PDE IN A BASIS . . . . . 8
2.1 Fourier Basis, Green’s Functions, etc. . . . . . . . . . . . . . . . . . . . . . . 10
3 INTERPOLATING WAVELETS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.1 Transform and Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 What Is the Wavelet Basis? . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4 SPECIAL TOPICS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.1 Refinement Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.2 Dual and Primal Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . 40
4.3 Orthonormal Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.4 Multidimensional Wavelets . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.5 Continuous Real Space to Wavelet Space . . . . . . . . . . . . . . . . . . . . 47
4.6 Human-Readable Representation of Vectors . . . . . . . . . . . . . . . . . . 49
5 THE ALGORITHM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.1 Prior Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.2 Representation of Operators, Calculating Operators in Wavelet Space . . . 53
5.3 Non-Standard Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
5.4 3D Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.5 Implementation of 3D Operators in Non-Standard Form . . . . . . . . . . . 65
5.6 Preconditioning and Temporal Coherence . . . . . . . . . . . . . . . . . . . 69
5.7 Multiresolution Grids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.8 Monopole Approximation for Boundary Conditions . . . . . . . . . . . . . . 78
5.9 Parallelization Opportunities . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6 IMPLEMENTATION AND ALGORITHM TESTING. . . . . . . . . . . . . . . . . . 81
6.1 Features and Code Structure . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.2 Testing: Parameters and Measurements . . . . . . . . . . . . . . . . . . . . 83
6.3 Testing: Demonstration of Solutions . . . . . . . . . . . . . . . . . . . . . . 85
6.4 Testing: Effects of Radial Extent and Monopole BCs . . . . . . . . . . . . . 89
7 CONCLUSIONS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
7.1 Future Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
APPENDICES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
LIST OF TABLES
Table Page
1 Linear prediction filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2 Linear update filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3 Interpolating wavelet prediction filters . . . . . . . . . . . . . . . . . . . . . . . 101
4 Interpolating wavelet base filters . . . . . . . . . . . . . . . . . . . . . . . . . . 103
5 Linear prediction filter on boundary . . . . . . . . . . . . . . . . . . . . . . . . 116
6 Linear prediction filter on boundary, no S−1 . . . . . . . . . . . . . . . . . . . . 117
LIST OF FIGURES
Figure Page
1 Particle-in-cell algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Basic forward wavelet transform . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3 Sampled linear function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
4 Single-stage wavelet transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
5 Basic inverse wavelet transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
6 Recursive forward wavelet transform . . . . . . . . . . . . . . . . . . . . . . . . 17
7 Fully transformed linear function with perfect prediction filter . . . . . . . . . . 18
8 Transformed linear function – zero BCs . . . . . . . . . . . . . . . . . . . . . . 20
9 Transformed linear function – periodic BCs . . . . . . . . . . . . . . . . . . . . 21
10 Transformed linear function – interpolation BCs . . . . . . . . . . . . . . . . . 22
11 Lifted wavelet transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
12 Wavelet functions of linear interpolating wavelets . . . . . . . . . . . . . . . . . 26
13 Wavelet functions of linear interpolating wavelets – doubled resolution . . . . . 27
14 Wavelet functions – higher order families . . . . . . . . . . . . . . . . . . . . . 29
15 Shape and frequency spectrum of linear interpolating wavelets . . . . . . . . . 31
16 Shape and frequency spectrum of 8th-order interpolating wavelets . . . . . . . 32
17 Wavelet transform example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
18 Standard form wavelet space operator . . . . . . . . . . . . . . . . . . . . . . . 56
19 Non-standard form wavelet space operator . . . . . . . . . . . . . . . . . . . . . 57
20 Pseudo code for non-standard form operator – top scale . . . . . . . . . . . . . 62
21 Pseudo code for non-standard form operator – child scales . . . . . . . . . . . . 63
22 Results of preconditioning and temporal coherence . . . . . . . . . . . . . . . . 71
23 Multiresolution grid used for boundary conditions . . . . . . . . . . . . . . . . 77
24 Code testing input functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
25 Dipole decay with boundary layers . . . . . . . . . . . . . . . . . . . . . . . . . 86
26 Error vs. radial extent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
27 Number of points vs. radial extent . . . . . . . . . . . . . . . . . . . . . . . . . 93
28 Maple 10 code for generating operator base filters . . . . . . . . . . . . . . . . . 104
LIST OF APPENDICES
Appendix Page
A WAVELET COEFFICIENTS AND OPERATOR COEFFICIENTS. . . . . . . . . 100
B DERIVATION OF WAVELET TRANSFORM IDENTITIES . . . . . . . . . . . . . 106
C INHERENT PRECONDITIONER IN A BIORTHOGONAL BASIS . . . . . . . . 109
D A BRIEF PRIMER ON DIRAC NOTATION . . . . . . . . . . . . . . . . . . . . . . . 112
E INTERPOLATING WAVELETS ON AN INTERVAL . . . . . . . . . . . . . . . . . . 115
CHAPTER 1
INTRODUCTION AND PROBLEM STATEMENT
Improving the detail of the N-body space charge calculation is a crucial step in increasing the fidelity of particle accelerator and galaxy dynamics simulations. For physical problems in which the Coulomb force is dominant, accurately approximating the Poisson equation ∇²U = Cρ is essential to correctly evaluating the forces acting on each particle in the simulation.
Several approaches have been taken to the solution of the Poisson equation in such N-body simulations. They may be classified as either integral or differential equation approaches. The differential forms solve the Poisson equation directly in differential form, while the integral forms recast the Poisson equation as an integral equation and evaluate the summed effect of the charge distribution at each point in the computational domain.
The brute-force N-body approach is an integral solution, useful for small N. Each particle simply sums the forces applied to it by each of the other particles. This approach is simple to implement and can scale well in parallel, but it scales poorly with the number of particles N. This naive approach to N-body simulation scales as O(N²) per time step, making it prohibitive on standard computer hardware for N much larger than 10³.
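The O(N²) cost of the direct method is easy to see in code. The following sketch (variable names, units, and the softening-free inverse-square law are illustrative choices, not taken from any particular simulation code) sums the pairwise forces with an explicit loop over particles:

```python
import numpy as np

def direct_forces(pos, q, C=1.0):
    """Brute-force pairwise forces: O(N^2) work per time step.

    pos: (N, 3) particle positions; q: (N,) charges; C sets the sign
    and magnitude of the interaction, as in the Poisson equation above.
    """
    N = len(pos)
    F = np.zeros_like(pos)
    for i in range(N):
        r = pos[i] - pos                 # separation vectors from each j to i
        d2 = (r * r).sum(axis=1)
        d2[i] = np.inf                   # exclude the self-interaction term
        # inverse-square law: F_i = C * q_i * sum_j q_j * r_ij / |r_ij|^3
        F[i] = C * q[i] * (q[:, None] * r / d2[:, None] ** 1.5).sum(axis=0)
    return F
```

Every particle touches every other particle, so doubling N quadruples the work, which is exactly the scaling bottleneck discussed above.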
An improved integral approach is obtained by evaluating nearby forces with a direct
N-body approach, while approximating long-range forces by a multipole expansion. This
tree approach also has good parallel scaling potential, and it scales significantly better
with particle count. Typical tree codes scale as O(N log N) and are frequently used in
galactic dynamics simulations.
For very large particle numbers, the Green’s function plus FFT approach is often used
[15]. This approach shares characteristics with both integral and differential solutions
of the Poisson equation. The particle density is deposited on a grid of cells, and the
potential is evaluated on this grid using the differential Poisson equation. This technique
is often called a “particle-in-cell” approach. Because a Green’s function is used to solve
the equation, however, the solution proceeds using a convolution in a manner analogous
to an integral solution. This approach always scales as O(N) with respect to the number
of particles, because the individual particles are used only for carrying the dynamics
information between time steps. It scales as O(Ng log Ng) with respect to the number of
grid points, with the advantage that the number of grid points is typically chosen to be
an order of magnitude less than the particle count.
The typical algorithm of a particle-in-cell code is shown in Figure 1. The particles are
initially constructed with some distribution in space and momentum. They are advanced
forward in time and then binned onto a computational grid. This binning process is very
similar to constructing a 3D histogram of the charge density. Once this charge density
function is obtained, the Poisson equation is solved on this grid, the forces are interpolated
back to the particles, and the particles advance forward in time again. This loop is repeated
for each time step of the simulation as the particles travel through the accelerator. This
thesis is concerned only with the step of solving the Poisson equation to find the forces on
the particles caused by their own charge distribution.
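The loop described above can be sketched in outline. This is an illustrative 1D, periodic toy version; the helper names, the nearest-grid-point deposition, and the callable Poisson solver are assumptions made only for the sketch, not code from any accelerator simulation:

```python
import numpy as np

def deposit_charge(x, q, n_cells, length):
    """Nearest-grid-point binning: effectively a histogram of charge density."""
    idx = np.floor(x / length * n_cells).astype(int) % n_cells
    rho = np.zeros(n_cells)
    np.add.at(rho, idx, q)
    return rho / (length / n_cells)

def pic_step(x, v, q, n_cells, length, dt, solve_poisson, force_at):
    """One schematic particle-in-cell time step (1D, periodic domain).

    solve_poisson : callable rho -> potential on the grid
    force_at      : callable (phi, x) -> force interpolated to particles
    """
    x = (x + v * dt) % length              # advance particles in time
    rho = deposit_charge(x, q, n_cells, length)  # bin particles to grid
    phi = solve_poisson(rho)               # the step this thesis addresses
    v = v + force_at(phi, x) * dt          # interpolate force, kick velocities
    return x, v
```

The particles interact only through the grid, which is why the deposition and push steps are O(N) and parallelize well.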
By replacing the Poisson equation solver, an alternative particle-in-cell approach was
developed by Terzic et al. in 2007, and it is a fully differential solution to the N-body
problem [28]. It uses orthogonal wavelet bases defined over the computational grid to
evaluate the differential equation, and an iterative conjugate gradient solver to find the
solution. Because this algorithm uses an iterative solver, and information from the pre-
vious time step, it is adaptive in time and can very quickly find a solution if the charge
density changes slowly with respect to the time step.
Figure 1: Particle-in-cell algorithm.
1.1 Problem Scale
In particle accelerator codes, the number of particles used is typically on the order of 10⁵ or 10⁶. With particle counts in the millions, the brute-force scaling of O(N²) is simply impractical for most hardware. Particle-in-cell codes can manage these high particle counts because the particle deposition schemes are simply O(N). In addition, the particle
management routines of particle-in-cell codes can operate very efficiently in parallel since
the particles only interact through the grid. The grid solution for an FFT-based particle-in-cell code scales as O(Ng log Ng) with approximately ten particles per cell. This means that grid sizes are typically on the order of 32³ or 64³. In order to increase the number of
particles in future simulations, this grid size needs to increase as well to maintain the
optimum number of particles per cell. This requires an improvement in the scaling of
the algorithms with respect to the number of grid points, as well as with respect to the
number of parallel processors employed in the simulation.
1.2 Improvement of Wavelet-Based Particle-in-Cell Algorithm
The initial wavelet-based Poisson solver for particle-in-cell applications constructed by Terzic et al. showed great promise for improving the speed of future accelerator codes and was competitive with the standard FFT plus Green’s function algorithm at currently typical problem sizes [28]. It used a preconditioner in wavelet space to ensure rapid convergence of the iterative solution, and it took advantage of previous time step information to create an adaptive-in-time approach to the solution of Poisson’s equation in a particle-in-cell environment. The current work is a continuation of that effort, focused on overcoming the difficulties of this earlier algorithm in order to enhance the ability of a wavelet-based algorithm to provide adaptive solutions which can produce additional detail for the particle-in-cell simulator.
Specifically, the original wavelet-based solver utilized the standard form of wavelet
operators, which are simple to implement but hide much of the inherent sparsity of the
operator in the wavelet basis. Also, the scales are tightly coupled in this form, and it is
not obvious how to separate them in order to construct a parallel algorithm.
Second, this initial wavelet-based particle-in-cell algorithm had difficulty specifying boundary conditions in wavelet space. A set of Green’s functions had to be evaluated at each time step, which was computationally expensive and difficult to arrange for efficient parallel execution. This caused the boundary condition evaluation to consume a significant portion of the solver’s execution time, even during those stages where little had changed – losing some of the benefits of the algorithm’s otherwise adaptive-in-time nature.
The goal of the current work is to construct an algorithm which retains the strengths of the work of Terzic et al. – adaptivity in time and fast solution of Poisson’s equation – while improving these two key areas to produce an efficient algorithm which can be easily adapted for parallel execution.
1.3 Goals and Chapter Outline
With the major foundational works in modern wavelet theory dating back to the early 1990s [7], it is becoming increasingly less tenable to consider it a “new” mathematical technique merely on the premise of age. Also, many of the major ideas behind solving differential equations in wavelet bases have been well known for at least ten years [4, 5, 6, 11, 13, 14, 17], yet the slow adoption of these techniques for solving physical problems is cause to evaluate the reasons for such a hesitant response. In an effort to encourage
the adoption of these powerful techniques, this thesis is constructed with the following
concepts in mind:
• Wavelet theory has many interrelated properties and techniques for manipulation of
wavelets which must be mastered in order to be comfortable working with wavelet
bases. Wavelets are a generalization of many different techniques, and it is easy
to get lost in the (often irrelevant) details of the basis. With this in mind, this
work will attempt to constrain the discussion to a minimum working set of required
knowledge, keeping the focus on things required for the current application, and will
identify important techniques as well as equations to enable the reader to grasp the
concepts more thoroughly.
• Wavelet methods are computational in nature and are only useful when implemented
in software. This thesis will attempt to make the results as repeatable as possible
by including notes on the special considerations which arise when implementing a
real wavelet algorithm – filter boundary conditions and their effect, preserving a
symmetric Laplacian operator, special cases when generalizing from 1D to 3D, etc.
• Wavelet techniques are often misconstrued as being “new” or “magic.” This work
will attempt to help physicists understand the connections from wavelet theory to
common mathematical concepts, and will derive equations and concepts from basic
principles wherever possible – rather than simply using equations from other sources
without explaining their meaning.
1.3.1 Organization
This work begins with a quick review of the basic mathematics involved in computa-
tionally solving partial differential equations in a basis set, and moves quickly into describ-
ing the basis of choice for this algorithm – interpolating wavelets. In Chapter 4, several
additional mathematical techniques are introduced which are essential to the understand-
ing of the wavelet particle-in-cell Poisson equation algorithm. This chapter is followed by
a chapter describing the algorithm in detail. Chapter 6 describes a test implementation
of the algorithm which was used to verify the mathematics of the earlier chapters as well
as to examine the feasibility of the new boundary condition schemes introduced in this
thesis. The final chapter reviews the contributions of this thesis and examines possible
directions for future development.
CHAPTER 2
WEIGHTED RESIDUAL METHOD AND SOLVING PDE IN A BASIS
Approximating a differential equation with any basis function expansion must first
begin with the initial continuous equation:
∇²(u + u^bc) = ρ. (2.1)
where u is the potential inside the computational domain and u^bc is the potential outside it.1 Eqn. (2.1) can be reinterpreted as the requirement that the residual R vanish everywhere:

R = ∇²(u + u^bc) − ρ = 0. (2.2)

If u, u^bc and ρ can be found which cause R to vanish in all of continuous space, then
the problem is solved. When working with analytic functions for ρ, this can often be done
by using the integral form of the equation. However, solving the problem for general ρ
requires approximating with a basis function expansion. The particular form of the basis
functions is irrelevant at this point, but the goal is to make the residual orthogonal to the
space spanned by the basis functions. Specifically, we want
〈φ_i|R〉 = 〈φ_i, R〉 = ∫ φ_i(x⃗) R(x⃗) d³x⃗ = 0,  ∀i. (2.3)
Note the use of Dirac’s bra-ket notation to represent the inner product, and its equivalence to the integral form. I will continue to use Dirac notation throughout this thesis. So, we want to reduce the residual to be orthogonal to every basis vector (function) in the
1This separation in the potential will be explained in further detail in Section 5.8.
expansion space given by the φi. In the limit of a complete basis, it can be seen that this
restriction will be equivalent to the continuous requirement (Eqn. 2.2).
Because the ultimate goal is to establish a relationship between u and ρ, we must next
examine the residual in detail. Again, due to the inability of finite minds or computers to represent a continuum of values, we need to expand ρ, u and u^bc in some set of functions in order to work with them. For our purposes, it is convenient if they are all expanded in the same set of basis functions:
ρ = ∑_j ρ_j Φ_j,   u = ∑_j u_j Φ_j,   u^bc = ∑_j u^bc_j Φ_j. (2.4)
Inserting these expressions into the requirement on R gives:
〈φ_i|R〉 = 〈φ_i|∇² ∑_j (u_j + u^bc_j) Φ_j〉 − 〈φ_i|∑_j ρ_j Φ_j〉
        = ∑_j (u_j + u^bc_j) 〈φ_i|∇²|Φ_j〉 − ∑_j ρ_j 〈φ_i|Φ_j〉 = 0,  ∀i. (2.5)
We now have an algebraic linear equation to solve, which is a well-researched problem
in computational science. This will be more apparent if we write:
ρ⃗ = (ρ_0, ρ_1, …, ρ_n)ᵀ,  u⃗ = (u_0, u_1, …, u_n)ᵀ,  u⃗^bc = (u^bc_0, u^bc_1, …, u^bc_n)ᵀ,
O := 〈φ_i|Φ_j〉,  L := 〈φ_i|∇²|Φ_j〉. (2.6)
Our discretized differential equation becomes

L(u⃗ + u⃗^bc) = O ρ⃗, (2.7)

and finding u⃗ becomes a problem of simply inverting the matrix L:

u⃗ = L⁻¹(O ρ⃗ − L u⃗^bc). (2.8)
This matrix inversion is handled readily by several common linear algebra algorithms.
Note that the choices of φ_i and Φ_j are arbitrary, and the general form of the resulting linear equation does not depend on their properties. However, in order for L to be invertible, it must be square, which requires that the number of basis functions be the same for both expansions. One common choice for the test functions is to pick φ_i = δ(x⃗ − α i⃗), restricting the residual to be zero at discrete points. This is called a collocation method and is useful for finding a solution which is valid on a grid. It also often results in a less dense representation of the Laplacian operator matrix, and it is the approach taken by Goedecker [11].
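As a minimal illustration of Eqn. (2.8), consider a 1D Laplacian discretized on a uniform grid with zero Dirichlet boundaries. With delta test functions and an interpolating basis, L reduces to the familiar second-difference stencil and the overlap matrix O is the identity; the grid size and the test density below are chosen only for this example:

```python
import numpy as np

# 1D collocation sketch of u = L^{-1}(O rho - L u_bc), with u_bc = 0 and
# O = identity.  Interior grid points x_i = i*h, i = 1..n, h = 1/(n+1).
n = 64
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)

# Second-difference Laplacian with zero Dirichlet boundary values.
L = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2

rho = -np.pi**2 * np.sin(np.pi * x)   # chosen so the exact solution is known
u = np.linalg.solve(L, rho)           # invert the discretized equation
# u approximates sin(pi x) to second order in h
```

A dense `solve` is used here for clarity; in practice L is sparse and an iterative solver (conjugate gradient for the symmetric Galerkin form) would be used instead.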
Another common choice is to choose φi = Φi, and the resulting weighted residual
method is known as the Galerkin method. The largest advantage of this approach is that
the resulting Laplacian operator is a symmetric matrix, which is a requirement for many
linear system solvers. Arias et al. use this approach and are able to use the quickly converging conjugate gradient method [1, 17], while Goedecker’s approach limits him to variations of the slower steepest descent algorithm, although with a much sparser operator [12].
2.1 Fourier Basis, Green’s Functions, etc.
Another basis of academic interest for the weighted residual method is the Fourier basis. Using φ_ω = Φ_ω = e^{iωx} results in a diagonal overlap matrix and a diagonal Laplacian matrix, which can therefore be trivially inverted. The downside of this approach lies in the periodicity that is assumed in the expansion of functions into the Fourier basis. Unless
the basis expansion is carried out to infinite frequencies, the expansion will result in a
function which repeats periodically.
The inherent periodicity of the Fourier basis usually makes it undesirable for use with
the differential form of the Poisson equation due to the difficulty of defining spatially lo-
calized boundary conditions. Instead, the Poisson equation in integral form is used with
a Green’s function, and a pilgrimage through frequency space is used to turn the convo-
lution integral into a simple algebraic expression [15]. This approach has the advantage
of allowing arbitrary boundary conditions to be applied using the Green’s function, yet
only needing a computational grid of the size of the charge distribution of interest. The
disadvantage of this approach is that, in order to avoid periodic wrapping of the Green’s function, the frequency-space calculation must be performed on a doubled grid, and it still requires a fully dense grid inside the charge region.
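For a purely periodic problem, the frequency-space inversion amounts to a pointwise division, as this 1D sketch shows. The doubled-grid bookkeeping needed for isolated boundary conditions is deliberately omitted; this only illustrates why the Fourier basis diagonalizes the Laplacian:

```python
import numpy as np

def poisson_fft_periodic(rho, length):
    """Solve d^2u/dx^2 = rho on a periodic 1D grid via the FFT.

    In Fourier space the Laplacian is diagonal: -k^2 u_hat = rho_hat,
    so the solve is a pointwise division.  The k = 0 (mean) mode is
    undetermined for a periodic domain and is fixed to zero here.
    """
    n = len(rho)
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    rho_hat = np.fft.fft(rho)
    u_hat = np.zeros_like(rho_hat)
    u_hat[1:] = -rho_hat[1:] / k[1:] ** 2
    return np.fft.ifft(u_hat).real
```

For a single Fourier mode the result is spectrally exact, which is the appeal of this basis when periodicity is acceptable.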
CHAPTER 3
INTERPOLATING WAVELETS
One of the main purposes of a wavelet basis is compact storage of general data such as music, photos, and data files [27]. It is in this realm of data compression that
the wavelet scheme is easiest to conceptualize. The goal of a basis-function compression
scheme is to represent the input data with as few basis function coefficients as possible.
This is generally best accomplished when the basis function set shares many similarities
with the data (f(~x)) to be compressed. A more mathematically formal way to state this is
to say that 〈φi|f〉 is large for only a very few i, and zero for all others. One example of this
is an infinitely periodic signal. In a delta-function (time sampling) basis set, this signal
would be significantly dense, even if it were compressed by storing only a single period.
However, in a Fourier series expansion, this signal may require only very few coefficients
to accurately represent it. In the opposite extreme, a highly localized delta function signal
is sparse in a time sampling space, but infinitely dense in Fourier space.
This space / frequency localization dichotomy is of mathematical interest, but phys-
ical signals are nearly always localized in space, and often are significantly localized in
frequency as well. There are many common methods of compactly representing signals
in simple physical systems, often using energy or angular momentum eigenkets such as
Gauss–Hermite polynomials and spherical harmonics. For the general case, however, none
of these analytical basis sets have the full localization in space that is typical of physical
quantities such as charge and mass distributions. This need for a basis set which com-
pactly describes physical quantities in general is a major driving force in the success of
wavelet methods.
3.1 Transform and Inverse
Wavelets are most easily understood through the transform between discrete physical
space and wavelet space. Interpolating wavelets are the main focus of this research, and
their transform will be described in detail. This description will follow the lifting scheme
wavelet description of Sweldens [26, 27], though there are many ways to represent a wavelet
transform.
This transform occurs after the signal has already been converted into a discrete sample of the physical system at some resolution level h⃗, which can have a different value for each dimension considered. The goal of the transformation is to reduce the representation to the fewest possible large non-zero coefficients by providing a way to recover the value of the original signal via some interpolation scheme. In 1D, the method
taken to accomplish this is to first separate the even-indexed coefficients from the odd-
indexed coefficients (see Figure 2). Then the even-indexed coefficients (which will be called
“Smooth” or “S” coefficients) are interpolated in some manner to make a prediction of
what the odd-indexed coefficients (called “Detail” or “D” coefficients) should contain.
This prediction is subtracted from the “D” coefficients, and the first stage of the wavelet
transform is complete.
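A single split-and-predict stage with linear prediction is short to write down. This sketch repeats the edge sample at the right boundary, which is only one of several boundary conventions discussed later in this chapter:

```python
import numpy as np

def forward_stage(s):
    """One stage of a linear interpolating wavelet transform (predict only).

    Split into even (S) and odd (D) samples, predict each odd sample as
    the average of its two even neighbours, and keep only the prediction
    error in D.  The boundary crudely reuses the last even sample.
    """
    S, D = s[0::2].copy(), s[1::2].copy()
    right = np.roll(S, -1)
    right[-1] = S[-1]              # repeat-the-edge boundary convention
    D -= 0.5 * (S + right)         # subtract the linear prediction
    return S, D
```

For a sampled linear function the interior D coefficients vanish exactly; only the boundary, where the prediction has no right neighbour, leaves a non-zero detail.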
Figure 2: Basic forward wavelet transform.
As an example, consider a linear function as the input, sampled as in Figure 3, and a simple wavelet transform utilizing a linear interpolation / prediction scheme.1 The prediction scheme is simply to take any two adjacent S coefficients and predict that the odd D coefficient between them should equal their average. This corresponds to the prediction filter shown in Table 1. The coordinates are such that an offset of 0 refers to the S coefficient to the left of the current D coefficient, and an offset of 1 refers to the next S coefficient, which is to the right of the current D coefficient.
The prediction is then tested by subtracting the prediction from the actual value of
the D coefficient. Of course, in the special case of a linear input function, this prediction
scheme will be perfect, and all of the D coefficients will be zero as in Figure 4. This
transform is completely invertible – all that is required is to perform the prediction again from the S coefficients, add the prediction back onto the D coefficients, and then merge the S and D coefficients back into one single data set. The block diagram for this operation
is shown in Figure 5.
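The inversion is equally short in code. This sketch undoes one linear-prediction stage, assuming the same repeat-the-edge boundary convention as a simple forward stage (an illustrative sketch, not the thesis implementation):

```python
import numpy as np

def inverse_stage(S, D):
    """Invert one predict stage: re-predict from S, add back to D, merge.

    Because the same prediction is recomputed from the untouched S
    coefficients, the subtraction is undone exactly - the transform is
    lossless by construction.
    """
    right = np.concatenate([S[1:], S[-1:]])   # repeat-the-edge boundary
    d = D + 0.5 * (S + right)                 # add the prediction back on
    out = np.empty(len(S) + len(d))
    out[0::2], out[1::2] = S, d               # interleave S and D again
    return out
```

This exact invertibility regardless of the prediction filter is the key property of the lifting construction.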
With a perfect prediction filter, the goal of compressing this input data set has now achieved a 2:1 compression ratio with no losses. Cutting the storage requirement in half is a good start, but with such a simple input function, it is not yet very impressive. Further compression is achieved by recursively splitting the remaining level-0 S0 coefficients into level −1 S−1 and D−1 coefficients and predicting these D−1 coefficients as well, as Figure 6 shows. There are now two separate scales of information, and only the D coefficients are kept for the lower scale2, while the S and D coefficients are kept at the highest scale.
1A simpler wavelet transform is possible, using a 0th-order constant-interpolating scheme, which results in an un-lifted Haar transform.
2The numbering convention for levels is an unfortunate relic of wavelet literature. “Up” in level is “Down” in scale. This work will attempt to preserve the distinction between the two. Numeric values are always levels, while the terms “higher”, “lower”, “up”, “down”, “large”, “small” will always refer to scales. The author would have preferred to simply re-number levels to match scales and derive the self-consistent set of equations, but then the reader would be unable to connect with the vast majority of other wavelet literature. It would be analogous to re-labeling the electron charge as positive in an electronics text so that currents would make more intuitive sense.
Figure 3: Sampled linear function.
Table 1: Linear prediction filter

    Offset   0     1
    Value    0.5   0.5
Figure 4: Single-stage wavelet transform. Transform is of a linear function using a perfect prediction filter.
Figure 5: Basic inverse wavelet transform.
This recursive pyramid algorithm can continue until there is only one S−n and one D−n point remaining, as long as the original data set contained a power-of-two number of points (see Figure 7). The naming convention for scales is such that increasing scale corresponds to decreasing resolution and a decreasing number of S and D coefficients. Each scale has
half as many coefficients, but they are positioned twice as far apart as the coefficients at
the next lower scale. The frequency information contained in a scale is then half of that
which is contained in the next lower scale. In fact, the wavelet transform stage can be
considered as a high frequency / low frequency filter where the S coefficients are given the
low frequency information and the D coefficients are given the high frequency data. The
separation into scales is also a separation in frequency.
Figure 6: Recursive forward wavelet transform.
The wavelet compression scheme will result in only a single non-zero coefficient (S^{-n}) if the prediction scheme is perfect, as in Figure 7. This seems to violate the basic idea that it takes two constants to define a line. However, the perceptive reader may have noticed that up until this point the boundary conditions have been ignored. Because our filter simply took the average of the two points on either side of the D coefficient, we have been
Figure 7: Fully transformed linear function with perfect prediction filter.
assuming that the prediction scheme could correctly guess the S coefficient values for points after the last D coefficient. For higher-order interpolation schemes, this problem is more obvious, as a wider filter is used and more S coefficients are taken into account that are often outside of the region of sampled data. There are many ways to handle boundary conditions with wavelet expansions, and the different types are appropriate for different uses. The simplest boundary condition scheme is to simply assume that everything outside of the region of interest is zero. This scheme is the easiest to implement, but often results in large D coefficients along boundaries due to the implied discontinuity between the measured non-zero data and the assumed zero values outside of the boundary (see Figure 8). Periodic boundary conditions often give the same boundary difficulties for non-periodic data sets, as Figure 9a shows. For periodic or symmetric data sets, however, it is often advantageous in terms of compression to choose periodic or mirrored boundary conditions (Figure 9b). In the “second generation” wavelet scheme [26, 27], one can also define a prediction scheme which takes the finite interval into account. This can give a better general-case solution to the problem of data compression with wavelets (see Figure 10), but it makes the prediction filter dependent on the spatial coordinate rather than being uniform across the whole region of interest. See Appendix E for an example of defining the linear interpolating wavelets on an interval.
3.1.1 Lifting
It is sometimes desirable to have the multiple scales of coefficients conserve some quantities, such as the zeroth, first, second, or nth moments. In order to accomplish this, we need to update the sequence of S coefficients in some way that preserves these moments. This can be accomplished by adding an “update” stage to the transform after the prediction stage has occurred. As Figure 11 shows, this change is easily invertible
Figure 8: Transformed linear function – zero BCs. (a) Linear input function. (b) Periodic linear function.
Figure 9: Transformed linear function – periodic BCs. (a) Linear input function. (b) Periodic linear function.
Figure 10: Transformed linear function – interpolation BCs. (a) Linear input function. (b) Periodic linear function.
using the same scheme as before: mirror the diagram from left to right, and change the signs of the operations. The update stage is also known as a lifting stage, and there are techniques for defining lifting update stages to preserve multiple moments between the scales of the transformation [26, 27]. For the simple case of linear interpolating wavelets, to conserve the zeroth moment (the average), one defines the update filter as in Table 2. In this case, each D coefficient simply adds 1/4 of its value to each of the two S coefficients surrounding it. This again is subject to boundary condition issues, especially if the prediction filter changes near the boundaries. See Sweldens [26, 27] for more details on deriving update filters in the lifting scheme.
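The lifted stage just described can be sketched as follows (a sketch assuming periodic boundaries; the names are illustrative, not from the thesis code). The prediction uses the Table 1 filter, and the update deposits a quarter of each detail into the two neighbouring S coefficients, so the mean of the coarse S coefficients equals the mean of the input.

```python
import numpy as np

def lifted_stage(s):
    """Predict-then-update stage for linear interpolating wavelets.

    Periodic boundaries.  The update adds d/4 to the two S
    coefficients around each detail (the filter of Table 2), which
    preserves the average between scales.
    """
    even, odd = s[0::2].astype(float).copy(), s[1::2]
    d = odd - 0.5 * (even + np.roll(even, -1))   # predict
    even += 0.25 * (np.roll(d, 1) + d)           # update (lift)
    return even, d

rho = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
s1, d1 = lifted_stage(rho)
```

With these two filters the mean of `s1` matches the mean of `rho` exactly, which is the conservation property the lifting stage is designed to provide.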
Figure 11: Lifted wavelet transforms. (a) Forward transform. (b) Inverse transform.
3.2 What Is the Wavelet Basis?
Recalling that the goal is to create a wavelet basis set, one may wonder why only the
coefficients have been mentioned thus far. If this wavelet transform does convert from the
Table 2: Linear update filter
Offset 0 1
Value 0.25 0.25
Note: This filter is designed to preserve the average.
sampled set into another basis, then what do the basis functions look like, and what are their properties? Because the wavelet basis is in fact defined through the transform and inverse transform, an obvious way to examine the basis functions is to simply set a single coefficient equal to unity in the wavelet basis and perform an inverse transformation to see what the function becomes in coordinate space. In Figure 12, several such functions are drawn. The S coefficients produce functions called scaling functions, which will be represented by φ^k_i (where k denotes the level³ and i is the index within that level), and the D coefficients produce functions called wavelet functions, denoted ψ^k_i.
While these functions appear to be only coarse discrete functions, by adding additional lower scales of zero coefficients they can be interpolated to any resolution and are defined at any fractional value that can be represented by a computer (Figure 13). These scaling (φ^k_i) and wavelet (ψ^k_i) functions behave as pseudo-continuous functions and are defined at any rational number that can be obtained as a binary fraction (0.d_1 d_2 d_3 d_4 … = Σ_i d_i 2^{-i}). Therefore, we can treat them as functions of the coordinate space (φ^k_i(x) and ψ^k_i(x)). Though they are not defined on the set of irrational numbers, they are defined on rational numbers which are arbitrarily nearby. Since only rational numbers are representable in a finite computer representation, we may treat these functions φ^k_i(x) and ψ^k_i(x) as continuous functions of x for any computational purpose.
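The “set one coefficient to unity and inverse transform” procedure can be sketched directly (periodic boundaries assumed; the helper names are illustrative, not the thesis implementation). Two refinement passes with all-zero details turn a unit S coefficient into a sampled hat function, the linear interpolating scaling function:

```python
import numpy as np

def inverse_stage(even, d):
    """Undo one un-lifted predict stage (periodic): restore the odd
    samples as detail + prediction, then re-interleave."""
    odd = d + 0.5 * (even + np.roll(even, -1))
    s = np.empty(2 * len(even))
    s[0::2], s[1::2] = even, odd
    return s

def scaling_function(i, coarse_n=4, refinements=2):
    """Set a single coarse S coefficient to 1 and inverse-transform
    with zero details to sample phi at ever finer resolution."""
    s = np.zeros(coarse_n)
    s[i] = 1.0
    for _ in range(refinements):
        s = inverse_stage(s, np.zeros(len(s)))
    return s

phi = scaling_function(1)   # samples of a hat (triangle) function
```

Each extra refinement pass doubles the sampling resolution of the same underlying function, which is exactly the resolution-doubling illustrated in Figure 13.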
Because the shape of an individual wavelet function does not depend on its scale or its index, we can write all of the functions φ^k_i(x) as translations and scalings of some base scaling function φ(x). Similarly, the wavelet functions are all translations and dilations of some function ψ(x) (often referred to as the “mother wavelet”) [11]. Formally, this is
³Note that k follows the numbering for levels rather than scales. See Footnote 2 on page 14 for a discussion of the difference.
Figure 12: Wavelet functions of linear interpolating wavelets. Left column is input wavelet coefficients, and right column is the result of the inverse wavelet transform of those coefficients (with zero for the boundary conditions), showing the wavelet or scaling function associated with that coefficient.
Figure 13: Wavelet functions of linear interpolating wavelets – doubled resolution. This figure is identical to Figure 12, with the addition of another layer of d^3_i coefficients which are set to zero. The result is a double-resolution view of the same wavelet and scaling functions from Figure 12. This resolution doubling can be repeated to view the functions at any desired resolution.
28
expressed as
φki (x) = φ(2kx− i), (3.1)
ψki (x) = ψ(2kx− i). (3.2)
Another unique attribute of these functions is that they are strictly limited in support. Beyond some finite range ±x, they are identically equal to zero. This is a feature which is not achievable with analytic functions constructed as a power series representation. Even the most rapidly decaying analytic function has some small non-zero tail region with infinite extent. The compact support of the wavelet functions is an advantage in two ways. First, it more accurately matches distributions of physical systems – avoiding the introduction of spurious tail regions into the expansion, and allowing an inverse transform with perfect reconstruction. Second, it simplifies many operations in wavelet space, as wavelet functions which are far apart in real space have zero overlap and can have no direct interaction with each other through spatially local operators. Higher-order wavelets, constructed with higher-order interpolation schemes, have larger support and more complicated φ(x) and ψ(x) functions, but all wavelet schemes possess this property of compact support (Figure 14).
So the wavelet basis set has achieved one of the goals demanded of it: localization in space. In addition, the fact that the φ^k_i and ψ^k_i have some finite non-zero support rather than being a δ function implies, through the time-frequency uncertainty principle, that they may be localized in some frequency band as well. Note that because the wavelet function is not periodic, it will require an infinite frequency range to fully represent it, but if the wavelet function is fairly smooth (has a large number of continuous derivatives), the decay of its Fourier coefficients toward infinity will be rapid. For n continuous derivatives of ψ^k_i(x), the Fourier coefficients will decay as 1/ω^{n+1} at large ω [11]. This
Figure 14: Wavelet functions – higher order families. (a) Linear interpolating wavelets. (b) 4th-order interpolating wavelets. (c) 6th-order interpolating wavelets. (d) 8th-order interpolating wavelets. Notice that while the higher order functions do have wider support, their support is still finite.
can be seen in Figures 15 and 16 as the higher order and smoother functions decay much
more rapidly in the higher frequency range.
In the low-frequency regime, the Fourier components can be made to decay as well if certain constraints are met by the wavelet functions [11]. The Fourier transform of the wavelet function is given by

    Ψ(ω) = ∫ ψ(x) e^{-iωx} dx,   (3.3)

and at ω = 0, this becomes simply Ψ(ω = 0) = ∫ ψ(x) dx. So, the Fourier spectrum will decay to zero at the origin if the wavelet function has a vanishing zeroth moment. Furthermore, expanding Ψ(ω) as a power series around the origin gives

    Ψ(ω) = Σ_{l=0}^{∞} (ω^l / l!) [d^l Ψ(ω)/dω^l]|_{ω=0},   (3.4)

where the derivatives [d^l Ψ(ω)/dω^l]|_{ω=0} are given by

    d^l Ψ(ω)/dω^l |_{ω=0} = d^l/dω^l ∫ ψ(x) e^{-iωx} dx |_{ω=0}
                          = ∫ ψ(x) (-ix)^l e^{-iωx} dx |_{ω=0}
                          = (-i)^l ∫ x^l ψ(x) dx.   (3.5)
Therefore the series expansion says that if ψ(x) has m vanishing moments, then Ψ(ω) will vanish as ω^m for ω → 0. As Figures 15 and 16 show, the lifted wavelet functions with
more vanishing moments are more localized in frequency near zero.
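The vanishing moments can be checked numerically. In this minimal sketch (periodic boundaries, the linear predict filter of Table 1 and update filter of Table 2; names are illustrative), inverse transforming a single unit D coefficient yields samples of the lifted wavelet, whose zeroth and first discrete moments both vanish:

```python
import numpy as np

def lifted_inverse_stage(even, d):
    """Inverse of the lifted stage (periodic): undo the update, then
    undo the prediction, then re-interleave even and odd samples."""
    even = even - 0.25 * (np.roll(d, 1) + d)     # un-lift
    odd = d + 0.5 * (even + np.roll(even, -1))   # un-predict
    s = np.empty(2 * len(even))
    s[0::2], s[1::2] = even, odd
    return s

# Samples of one lifted wavelet: unit D coefficient, zero S coefficients.
d = np.zeros(4)
d[1] = 1.0
psi = lifted_inverse_stage(np.zeros(4), d)
x = np.arange(len(psi))
```

Both `psi.sum()` (zeroth moment) and `(x * psi).sum()` (first moment) come out to zero, consistent with the two vanishing moments quoted for the lifted linear wavelets in Figure 15.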
Acquiring vanishing moments in the wavelet functions is accomplished through modifying the update stage of the wavelet transform. Notice that in the inverse transform of
Figure 11b, the update stage can affect both even and odd coefficients, and can cause a
change in the shape of the resulting ψ(x) function, because it depends on D coefficients
Figure 15: Shape and frequency spectrum of linear interpolating wavelets. (a) and (b) are linear interpolating wavelets; (c) and (d) are lifted linear interpolating wavelets with the first two moments vanishing.
Figure 16: Shape and frequency spectrum of 8th-order interpolating wavelets. (a) and (b) are 8th-order interpolating wavelets; (c) and (d) are lifted 8th-order interpolating wavelets with the first eight moments vanishing.
as input. Because the wavelet expansion is a complete basis which is separated into two spaces at each step, namely scaling function space (φ^k_i) and wavelet function space (ψ^k_i), if one of these two spaces conserves a quantity, then the functions in the other space must be orthogonal to this quantity. For example, if the scaling function space is to conserve the average, then the wavelet functions must have a zero average because they are all orthogonal to the zeroth moment. So it is equivalent to say either that the wavelet function has been lifted so that it has a vanishing moment, or that the scaling function space has been lifted so that it conserves this moment. This is another purpose of the lifting stage of the transformation – to further localize the wavelet functions in frequency space by adding more vanishing moments to ψ(x).
CHAPTER 4
SPECIAL TOPICS
This chapter will address several diverse topics and historical notation from wavelet
theory. Much of this information, especially the refinement relations with the alternate
h and g filter notation and the concept of a dual wavelet space, will be critical to the
understanding of later chapters of this work.
4.1 Refinement Relations
Running a single S coefficient through one stage of the inverse transform (Fig. 11b), one can see that any function φ^k_i is simply a linear combination of functions φ^{k+1}_j at the lower scale.¹ In the same way, the wavelet functions ψ^k_i can also be written as a combination of scaling functions φ^{k+1}_j. This result can be expressed directly as a set of refinement relations between the base functions (defined in Eqns. 3.1 and 3.2):

    φ(x) = Σ_j h_j φ(2x - j),   (4.1)
    ψ(x) = Σ_j g_j φ(2x - j),   (4.2)

where the range of j depends on the size of the h and g filters.
¹Again, level k + 1 is a lower scale than k. See Footnote 2 on page 14.
It is often useful to rephrase these equations in terms of relations between functions at adjacent levels. By
using Eqns. 3.1 and 3.2, the following can be obtained:

    φ^k_i(x) = Σ_j h_j φ^{k+1}_{2i+j}(x),   (4.3)
    ψ^k_i(x) = Σ_j g_j φ^{k+1}_{2i+j}(x).   (4.4)
Wavelet families can be defined entirely by these refinement relations, and this notation is often used in the literature [1, 7, 11]. These h and g filters can be derived, as mentioned, from the action of the prediction and update filters in the inverse wavelet transform.² The resulting expressions are (using P for the predict filter and U for the update filter)³:

    h_{2i} = δ_{i,0},   (4.5)
    h_{2i+1} = P_{-i},
    g_{2i} = -U_{-i},
    g_{2i+1} = δ_{i,0} - Σ_j U_{-j-i} P_j,

where again, the index j runs over all of the non-zero values of U_{-j-i} P_j, and U_i and P_i are defined to be zero outside of their established filter range.
²While h and g can be derived from the prediction and update filters, it is also possible to construct valid h and g filters which have no finite P and U filter representation. This is the case with orthogonal wavelets, for example [26].
³These relations were derived by following a delta-function sequence δ_{i,0} through either the “S” or “D” branch of the inverse transform, and observing what information came out as S coefficients on the lower scale. This is how the h and g coefficients of Eqns. 4.1 and 4.2 are defined.
This new notation allows us to compute integrals involving wavelet functions, which will be useful when defining the O and L matrices later. As an example, we can compute
the normalization condition for the scaling functions via

    1 = ∫ φ(x) dx
      = ∫ Σ_j h_j φ(2x - j) dx
      = Σ_j h_j ( (1/2) ∫ φ(X) dX )
      = (1/2) Σ_j h_j,

    2 = Σ_j h_j.   (4.6)
In this fashion, we can often obtain relationships between the filter coefficients in the
discrete set which are equivalent to the integral relationships in the continuous functions,
and we can perform integrals over wavelet functions even though they are defined only by
their refinement relationships.
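The delta-sequence recipe of Footnote 3 can be carried out numerically (a sketch assuming the lifted linear interpolating family with periodic boundaries; the helper name is illustrative): feed a delta through the “S” or “D” branch of the inverse transform and read off the lower-scale S coefficients. The resulting taps satisfy Σ_j h_j = 2 from Eq. (4.6), and the g taps sum to zero (a vanishing zeroth moment).

```python
import numpy as np

def lifted_inverse_stage(even, d):
    """Inverse lifted stage (periodic), with linear predict filter
    P = (1/2, 1/2) and update filter U = (1/4, 1/4)."""
    even = even - 0.25 * (np.roll(d, 1) + d)
    odd = d + 0.5 * (even + np.roll(even, -1))
    s = np.empty(2 * len(even))
    s[0::2], s[1::2] = even, odd
    return s

# "S" branch: a unit S coefficient and zero details yields the h taps.
delta_s = np.zeros(4); delta_s[0] = 1.0
h = lifted_inverse_stage(delta_s, np.zeros(4))

# "D" branch: a unit D coefficient and zero S coefficients yields g.
delta_d = np.zeros(4); delta_d[0] = 1.0
g = lifted_inverse_stage(np.zeros(4), delta_d)
```

The non-zero entries of `h` (1 at the even tap, 1/2 at the two odd taps, one of them wrapped by periodicity) reproduce the pattern of Eqns. (4.5) for P = (1/2, 1/2).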
4.1.1 Dual Space
The wavelet bases which have been introduced up to this point are not orthogonal. In other words, the overlap matrix O := 〈φ_i|Φ_j〉 from Chapter 2 contains possibly significant off-diagonal components. Because there are many cases where orthonormality is useful for simplifying expressions, we introduce another wavelet basis which is orthogonal (dual) to the first (primal) wavelet basis.
This new dual basis is composed of dual scaling (φ̃^k_i) and dual wavelet (ψ̃^k_i) functions derived from a base scaling function and mother wavelet

    φ̃^k_i(x) = 2^k φ̃(2^k x - i),   (4.7)
    ψ̃^k_i(x) = 2^k ψ̃(2^k x - i),   (4.8)

which obey a set of refinement relations similar to that of the primal wavelets:

    φ̃(x) = 2 Σ_j h̃_j φ̃(2x - j),   (4.9)
    ψ̃(x) = 2 Σ_j g̃_j φ̃(2x - j),   (4.10)
    φ̃^k_i(x) = Σ_j h̃_j φ̃^{k+1}_{2i+j}(x),   (4.11)
    ψ̃^k_i(x) = Σ_j g̃_j φ̃^{k+1}_{2i+j}(x).   (4.12)
Notice that there are now four defining filters for the wavelet family: h, g, h̃, and g̃. The constraints between the two bases will leave only two of these independent. Also, the factors of 2 and 2^k in these relations will become apparent shortly. The two bases are required to satisfy the following biorthogonality constraints on a single level k:

    〈φ̃^k_i|φ^k_j〉 = δ_{i,j},
    〈φ̃^k_i|ψ^k_j〉 = 0,
    〈ψ̃^k_i|φ^k_j〉 = 0,
    〈ψ̃^k_i|ψ^k_j〉 = δ_{i,j},

and if these are satisfied, the refinement relations between bases guarantee an even stricter set of constraints between multiple levels⁴:

    〈φ̃^k_i|φ^k_j〉 = δ_{i,j},   (4.13)
    〈φ̃^k_i|ψ^l_j〉 = 0, k ≤ l,   (4.14)
    〈ψ̃^k_i|φ^l_j〉 = 0, k ≥ l,   (4.15)
    〈ψ̃^k_i|ψ^l_j〉 = δ_{i,j} δ_{k,l}.   (4.16)
A few questions remain: First, why are the definitions and refinement relations for the dual functions different from those for the primal functions? Second, what are the relationships between h, g, h̃, and g̃, and how do we find the dual functions given only the primal functions?
The answer to the first question lies in the orthonormality constraint on the scaling functions:

    δ_{i,j} = 〈φ̃^k_i(x)|φ^k_j(x)〉
            = 〈2^k φ̃(2^k x - i)|φ(2^k x - j)〉
            = ∫ 2^k φ̃(2^k x - i) φ(2^k x - j) dx
            = ∫ 2^k φ̃(X - i) φ(X - j) (2^{-k}) dX
            = 〈φ̃^0_i(X)|φ^0_j(X)〉.   (4.17)

Therefore the factor of 2 between scales of the dual functions is just a normalization factor to ensure that 〈φ̃^k_i(x)|φ^k_i(x)〉 = 1 for all levels k. Note that because this factor is only a normalization factor, it could have been shared symmetrically between φ and φ̃ as a √2 factor, as is the case with orthogonal wavelet families [11]. For simplicity, it is usually placed entirely on the dual functions when working with biorthogonal wavelet families, but this choice is arbitrary.
⁴This is because any function φ^l, φ̃^l, ψ^l, or ψ̃^l can be written recursively as a sum of scaling functions φ^{l+1} or φ̃^{l+1} at a lower scale until l = k, and the above expressions for 〈φ̃^k_i|ψ^k_j〉 = 〈ψ̃^k_i|φ^k_j〉 = 0 can be invoked.
A similar method can be used to establish orthonormality relations between the refinement relation filters (h, g, h̃, and g̃). For example, substituting Eqns. (4.11) and (4.3) into Eq. (4.13) gives

    δ_{i,j} = 〈φ̃^k_i|φ^k_j〉
            = 〈Σ_μ h̃_μ φ̃^{k+1}_{2i+μ} | Σ_ν h_ν φ^{k+1}_{2j+ν}〉
            = Σ_{μ,ν} h̃_μ h_ν 〈φ̃^{k+1}_{2i+μ}|φ^{k+1}_{2j+ν}〉
            = Σ_{μ,ν} h̃_μ h_ν δ_{2i+μ,2j+ν}
            = Σ_ν h̃_{ν+2j-2i} h_ν   and, replacing the dummy index ν with l = ν - 2i,
            = Σ_l h̃_{2j+l} h_{2i+l}.

Similar derivations give the other orthonormality conditions in terms of the refinement relation filters:

    Σ_l h̃_{2j+l} h_{2i+l} = δ_{i,j},   (4.18)
    Σ_l h̃_{2j+l} g_{2i+l} = 0,   (4.19)
    Σ_l g̃_{2j+l} h_{2i+l} = 0,   (4.20)
    Σ_l g̃_{2j+l} g_{2i+l} = δ_{i,j}.   (4.21)
A solution of these constraints is given by the relations:

    g_i = (-1)^{1-i} h̃_{1-i},   (4.22)
    h̃_i = (-1)^i g_{1-i}.   (4.23)

Note that Eq. (4.23) has essentially the same form as Eq. (4.22), but it is solved for h̃ rather than g. Now we have a set of relations that link the dual and the primal spaces together, but we need to derive wavelet transforms for the dual wavelets in order to examine the characteristics of the dual space.
4.2 Dual and Primal Wavelet Transform
Armed with the dual basis and the orthogonality relations from the previous section, we can define a single stage of the forward wavelet transform of a function f as

    s^k_i = 〈φ̃^k_i|f〉
          = 〈Σ_j h̃_j φ̃^{k+1}_{2i+j}|f〉,
    s^k_i = Σ_j h̃_j s^{k+1}_{2i+j},   (4.24)

    d^k_i = 〈ψ̃^k_i|f〉,
    d^k_i = Σ_j g̃_j s^{k+1}_{2i+j}.   (4.25)

In a similar fashion, the dual forward transform is found to be

    s̃^k_i = Σ_j h_j s̃^{k+1}_{2i+j},   (4.26)
    d̃^k_i = Σ_j g_j s̃^{k+1}_{2i+j}.   (4.27)
Since these are linear transformations, we can represent them as matrices F and F̃ for the forward and dual forward transforms, respectively.
The next goal is to find the backward transform (B = F^{-1}) and its dual form (B̃ = F̃^{-1}). Toward this goal, we begin with an expression of the relationship between scales in the wavelet expansion:

    Σ_ν s^k_ν |φ^k_ν〉 + Σ_μ d^k_μ |ψ^k_μ〉 = Σ_i s^{k+1}_i |φ^{k+1}_i〉,   (4.28)

where s^k_ν, d^k_μ, and s^{k+1}_i are just the expansion coefficients on levels k and k + 1. The goal
of the forward transform is to find the left-hand side of this equation in terms of the right-hand side, and the goal of the inverse transform is exactly the opposite. To isolate the right-hand side, we operate on the entire equation with 〈φ̃^{k+1}_j| and use the orthogonality conditions to isolate a single s^{k+1}_j:

    〈φ̃^{k+1}_j| Σ_ν s^k_ν |φ^k_ν〉 + 〈φ̃^{k+1}_j| Σ_μ d^k_μ |ψ^k_μ〉 = 〈φ̃^{k+1}_j| Σ_i s^{k+1}_i |φ^{k+1}_i〉,
    Σ_ν s^k_ν 〈φ̃^{k+1}_j|φ^k_ν〉 + Σ_μ d^k_μ 〈φ̃^{k+1}_j|ψ^k_μ〉 = s^{k+1}_j,
    Σ_{ν,α} s^k_ν h_α 〈φ̃^{k+1}_j|φ^{k+1}_{2ν+α}〉 + Σ_{μ,β} d^k_μ g_β 〈φ̃^{k+1}_j|φ^{k+1}_{2μ+β}〉 = s^{k+1}_j,
    Σ_{ν,α} s^k_ν h_α δ_{j,2ν+α} + Σ_{μ,β} d^k_μ g_β δ_{j,2μ+β} = s^{k+1}_j,
    Σ_α s^k_{(j-α)/2} h_α + Σ_β d^k_{(j-β)/2} g_β = s^{k+1}_j,

and, separating the even and odd output indices,

    Σ_α ( s^k_{j-α} h_{2α} + d^k_{j-α} g_{2α} ) = s^{k+1}_{2j},   (4.29)
    Σ_α ( s^k_{j-α} h_{2α+1} + d^k_{j-α} g_{2α+1} ) = s^{k+1}_{2j+1}.   (4.30)
So the backwards transform B consists of Eqns. (4.29) and (4.30) for the even and odd terms of s^{k+1}_n. Analogously, the dual reverse transform B̃ has the same form as Eqns. (4.29) and (4.30), but with h and g replaced by h̃ and g̃.
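The filter forms of the forward stage (Eqns. 4.24-4.25) and backward stage (Eqns. 4.29-4.30) can be sketched as follows, assuming periodic boundaries and the un-lifted linear interpolating filters; the `{offset: value}` filter representation and the function names are illustrative choices, not the thesis code.

```python
import numpy as np

# Filters of the un-lifted linear interpolating family; ht and gt
# stand in for h-tilde and g-tilde (cf. Eqns. (4.5), (4.22), (4.23)).
h  = {-1: 0.5, 0: 1.0, 1: 0.5}
g  = {1: 1.0}
ht = {0: 1.0}
gt = {0: -0.5, 1: 1.0, 2: -0.5}

def forward_stage(s):
    """Eqns. (4.24)-(4.25): s^k_i = sum_j ht_j s^{k+1}_{2i+j} and
    d^k_i = sum_j gt_j s^{k+1}_{2i+j}, with periodic indexing."""
    n = len(s)
    sk = np.array([sum(v * s[(2*i + j) % n] for j, v in ht.items())
                   for i in range(n // 2)])
    dk = np.array([sum(v * s[(2*i + j) % n] for j, v in gt.items())
                   for i in range(n // 2)])
    return sk, dk

def backward_stage(sk, dk):
    """Eqns. (4.29)-(4.30): rebuild the even and odd entries of
    s^{k+1} from s^k and d^k using the primal filters h and g."""
    n = len(sk)
    out = np.zeros(2 * n)
    for j in range(n):
        for parity in (0, 1):                 # even / odd output index
            val = 0.0
            for off, v in h.items():
                if off % 2 == parity:         # the h_{2a} or h_{2a+1} taps
                    val += sk[(j - (off - parity) // 2) % n] * v
            for off, v in g.items():
                if off % 2 == parity:         # the g_{2a} or g_{2a+1} taps
                    val += dk[(j - (off - parity) // 2) % n] * v
            out[2*j + parity] = val
    return out

s = np.arange(8.0)
sk, dk = forward_stage(s)
rec = backward_stage(sk, dk)      # round trip recovers the input
```

Since h̃ is a delta here, the S coefficients are just the even samples, and the round trip `backward_stage(forward_stage(s))` reproduces the input exactly.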
Because the basis function set is not guaranteed to be orthonormal, the transformations F and B are not guaranteed to be unitary. In general,

    F†F ≠ 1,
    B†B ≠ 1,

and so,

    F† ≠ B,
    B† ≠ F,

contrary to the usual results for a unitary transform from an orthogonal coordinate basis to another orthogonal basis set. For biorthogonal wavelet bases, however, the above relations are correctly generalized as

    F̃†F = 1,   F̃† = B,   (4.31)
    B̃†B = 1,   B̃† = F.   (4.32)

These equations are derived in Appendix B.
The full recursive wavelet transformations can be defined in terms of the single-level transforms as

    W = F_1 F_2 … F_N = Π_{n=1}^{N} F_n,   (4.33)
    W^{-1} = B_N B_{N-1} … B_1 = Π_{n=N}^{1} B_n,   (4.34)
    W̃ = F̃_1 F̃_2 … F̃_N = Π_{n=1}^{N} F̃_n,   (4.35)
    W̃^{-1} = B̃_N B̃_{N-1} … B̃_1 = Π_{n=N}^{1} B̃_n.   (4.36)

For these full transforms, Eqns. (4.31) and (4.32) imply that

    W̃† = W^{-1},   (4.37)
    (W̃^{-1})† = W.   (4.38)
These two relationships will become indispensable in later sections of this thesis in ensuring
the symmetry of operators in wavelet space. An important connection to recall is that
if the wavelet families are defined through Eqns. (4.5), these F and B transforms are
identical to those created using the prediction and update filters from Section 3.1.
4.3 Orthonormal Wavelets
Much research has gone into the development of wavelet families which are both orthonormal and smooth [7], and several such families have been discovered. Orthogonal wavelets were also used by Terzic, Pogorelov, and Bohn in their wavelet-based Poisson solver [28]. Because of some special characteristics of the biorthogonal interpolating wavelets that simplify the multiresolution wavelet transform (see Section 5.7), they are the chosen basis for this current research. Orthonormal wavelet families are of limited interest here. However, orthonormal wavelets have a few important special considerations which need to be addressed in order to understand some of the features of the biorthogonal wavelet families which are the focus of this thesis.
In the orthonormal case, the biorthogonality relationships of Eqns. (4.13-4.16) are replaced with the set establishing orthonormality between the non-dual functions themselves:

    〈φ^k_i|φ^k_j〉 = δ_{i,j},   (4.39)
    〈ψ^k_i|φ^l_j〉 = 0, k ≥ l,   (4.40)
    〈ψ^k_i|ψ^l_j〉 = δ_{i,j} δ_{k,l}.   (4.41)

Even though these relationships do not seem to define a fully orthonormal set in the usual fashion, consider the fact that a function expanded in wavelet space has the form:

    |f〉 = Σ_i s^k_i |φ^k_i〉 + Σ_{n=k}^{M} Σ_i d^n_i |ψ^n_i〉.

In this case, all of the included basis functions are orthonormal to each other, as is required for an orthonormal basis function set.
Another observation to make is that the orthogonality relationships of Eqns. (4.39-4.41) can be derived from the biorthogonal relationships of Eqns. (4.13-4.16) by the simple substitutions:

    φ̃ = φ,
    ψ̃ = ψ,

making the dual and primal wavelet spaces identical. The first useful result of this equality is that now W̃ = W, so normal unitary transformation relations apply, as is expected of an orthonormal basis.
The second result, which will be useful in Section 5.6 when considering preconditioning effects, is that the normalization conditions on the wavelet and scaling functions need to change from those of Eqns. (3.1) and (3.2). Consider adding an arbitrary normalization factor A_k:

    φ^k_i(x) = A_k φ(2^k x - i),
    ψ^k_i(x) = A_k ψ(2^k x - i),

and computing the orthonormal wavelet family equivalent of Eqn. (4.17), we obtain:

    δ_{i,j} = 〈φ^k_i(x)|φ^k_j(x)〉
            = 〈A_k φ(2^k x - i)|A_k φ(2^k x - j)〉
            = A_k² 2^{-k} ∫ φ(X - i) φ(X - j) dX
            = A_k² 2^{-k} 〈φ^0_i(X)|φ^0_j(X)〉.

This, and a similar argument for ψ, fixes A_k = √(2^k), so for orthonormal wavelet sets we need to replace Eqns. (3.1) and (3.2) with:

    φ^k_i(x) = √(2^k) φ(2^k x - i),   (4.42)
    ψ^k_i(x) = √(2^k) ψ(2^k x - i).   (4.43)

This difference in the normalization factors between the biorthogonal and orthonormal wavelet bases causes significant effects when considering preconditioning in the wavelet basis and will be considered in more detail in Section 5.6.
4.4 Multidimensional Wavelets
Multidimensional wavelet bases can be constructed either directly or by use of tensor products of 1D wavelet functions. Tensor product forms of wavelet functions are the simplest for constructing transforms and defining operators. They are defined simply by multiplying 1D wavelet functions together in outer products. In 2D, for example, this results in four different types of basis functions at each scale level – one scaling function and three wavelet-style functions:

    φ^k_{s_l,s_m}(x, y) = φ^k_l(x) φ^k_m(y),   (4.44)
    ψ^k_{s_l,d_m}(x, y) = φ^k_l(x) ψ^k_m(y),   (4.45)
    ψ^k_{d_l,s_m}(x, y) = ψ^k_l(x) φ^k_m(y),   (4.46)
    ψ^k_{d_l,d_m}(x, y) = ψ^k_l(x) ψ^k_m(y),   (4.47)

with analogous definitions for the dual functions as well. Using these functions, the 2D wavelet transform can be defined as the tensor product W_2D = W_x ⊗ W_y, whose factors operate independently in the x and y directions.⁵ Note that because these operations act on independent spaces, they commute and W_2D = W_x ⊗ W_y = W_y ⊗ W_x. These transforms take the respective 1D wavelet functions from the scaling function form of Eqn. (4.44) to the wavelet-style forms of Eqns. (4.45-4.47). The dual and inverse wavelet transformation functions are defined analogously.
⁵Recall that for function vectors, |φ〉_x ⊗ |ψ〉_y = φ(x)ψ(y), which is why the tensor product symbol ⊗ is generally suppressed for them.
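The tensor-product transform can be sketched by applying a 1D stage along each axis in turn (un-lifted linear interpolating stage, periodic boundaries; names are illustrative, not thesis code). Because the two 1D transforms act on independent indices, the x-then-y and y-then-x orders give identical results:

```python
import numpy as np

def stage_1d(s):
    """One un-lifted linear interpolating stage (periodic), returning
    the level's coefficients as [S | D]."""
    even, odd = s[0::2], s[1::2]
    d = odd - 0.5 * (even + np.roll(even, -1))
    return np.concatenate([even, d])

def stage_2d(a, order="xy"):
    """Tensor-product 2D stage: apply the 1D stage along each axis."""
    axes = (1, 0) if order == "xy" else (0, 1)
    for ax in axes:
        a = np.apply_along_axis(stage_1d, ax, a)
    return a

# Constant input: the sd, ds, and dd detail blocks all vanish, leaving
# only the ss block of coarse scaling coefficients.
c = stage_2d(np.ones((8, 8)))
```

After one stage the array is partitioned into the four blocks of Eqns. (4.44-4.47): ss in the top-left quadrant and the sd/ds/dd detail blocks in the remaining three.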
47
When working with wavelets in an arbitrary number of dimensions, it is convenient to invent a more abstract notation for describing a tensor product wavelet function:

    ξ^k_{i_x,i_y,i_z,…}(x, y, z, …) = Π_{d=x,y,z,…} { φ^k_{i_d/2}(d),       i_d even,
                                                      ψ^k_{(i_d-1)/2}(d),  i_d odd.   (4.48)
Notice that this definition reproduces Eqns. (4.44-4.47) for the 2D case, but it also establishes an indexing scheme for the wavelets in N dimensions. For example, in the 2D case, it specifies that only the points where i_x and i_y are both even correspond to the scaling-function type (or ss-type) functions of Eqn. (4.44), and the wavelet-wavelet (or dd-type) functions of Eqn. (4.47) are located on points with odd i_x and i_y. The sd-type and ds-type functions correspond to the even-odd and odd-even combinations of i_x and i_y. The 3D case is defined analogously, but there is one sss-type scaling function and seven wavelet-style functions.
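The parity-based indexing of Eq. (4.48) reduces to a tiny classifier (an illustrative sketch, not thesis code):

```python
def wavelet_type(*indices):
    """Classify a tensor-product basis function by index parity, as in
    Eq. (4.48): an even index selects a scaling-type factor ('s'), an
    odd index a wavelet-type factor ('d')."""
    return "".join("s" if i % 2 == 0 else "d" for i in indices)
```

For example, `wavelet_type(0, 1)` gives the sd-type of Eqn. (4.45), and in 3D one sss-type and seven wavelet-style combinations appear, matching the count quoted above.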
A full set of recursion relations and transforms analogous to the 1D cases in Section 4.1
could be defined in this new notation, but for the present work it is simpler to use the
abstract ξ(x, y, z, · · · ) only as a shorthand notation for preliminary steps of derivations,
and expand the expression in the form of Eqns. (4.44-4.47) when performing operations
on the wavelet functions.
4.5 Continuous Real Space to Wavelet Space
Up to this point, it has been implied that the S coefficients of a vector were equivalent to the real-space values of the function at those points. This is only strictly true for the interpolating wavelet families, although it is often used as an approximation for other wavelet families as well. From the biorthogonality constraints, we know that

    s^k_i = 〈φ̃^k_i|f〉.

So, the S coefficients are determined by the form of the dual scaling function φ̃. For the non-lifted interpolating wavelet families, we have U = 0, and so Equations (4.5) and (4.23) give

    g_i = δ_{i,1},   (4.49)
    h̃_i = δ_{i,0},   (4.50)

and using this h̃ filter in Eqn. (4.9) results in the expression:

    φ̃(x) = 2 Σ_j δ_{j,0} φ̃(2x - j),
    φ̃(x) = 2 φ̃(2x),   (4.51)

which is a recursion relationship that can only be satisfied by a Dirac delta function. So for the non-lifted interpolating wavelet functions we have

    φ̃(x) = δ(x),   (4.52)

and the samples from continuous space are identically equal to the S coefficients of the wavelet space.
For wavelet bases in general, the equivalence between S coefficients and delta-function samples of the function no longer holds exactly. However, in the limit of large k and/or a relatively smooth function |f〉, the approximation is fairly accurate. The normalization of φ̃ is usually chosen such that

    1 = ∫ φ̃(x) dx
      = 2 Σ_j h̃_j ∫ φ̃(2x - j) dx
      = Σ_j h̃_j ∫ φ̃(X) dX,
    Σ_j h̃_j = 1.   (4.53)

For biorthogonal wavelet bases, it is common practice to choose h̃ such that Σ_j h̃_j = 1, and similarly, orthonormal bases are normalized such that Σ_j h_j = √2.⁶ For large k then, φ̃ from any typical wavelet basis is a sharply peaked unit-norm function with very short support. A delta-function sampling provides an excellent approximation to φ̃ as long as the sampled function is well behaved and somewhat smooth in the sampling region. As a result, for uniform grids, it is common practice to simply take the delta-function samples as equivalent to the lowest-scale S coefficients. Non-uniform grids, however, require special consideration, as the deviation between δ(x) and φ̃(x) may be large at higher scales where φ̃(x) has wide support.⁷
4.6 Human-Readable Representation of Vectors
The interleaved arrangement of scales in wavelet-space vectors used in earlier figures
(see Figures 7-10) is often used in computer implementations of wavelet codes, but it is dif-
ficult to visualize the structure of the underlying wavelet expansion in this format. When
6Recall that h = h in an orthonormal basis.
7Non-uniform grids will be considered in more detail in Section 5.7.
50
visualizing wavelet-transformed data, it is more common to show the scales separated by
a simple permutation of the components. For example, instead of the interleaved format:
[ s^{-2}_0  d^0_0  d^{-1}_0  d^0_1  d^{-2}_0  d^0_2  d^{-1}_1  d^0_3 ],
the data is arranged in order of decreasing scale:
[ s^{-2}_0  d^{-2}_0  d^{-1}_0  d^{-1}_1  d^0_0  d^0_1  d^0_2  d^0_3 ].
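This reordering is a pure permutation of vector components. A minimal sketch (assuming a length-2ⁿ vector stored in the interleaved layout shown above, with the single S coefficient at index 0 and the D coefficients of each finer scale at odd multiples of successively smaller strides):

```python
def interleaved_to_ordered(v):
    """Permute an interleaved wavelet-space vector into the
    human-readable, scale-by-scale ordering (coarsest scale first)."""
    n = len(v)
    assert n & (n - 1) == 0, "length must be a power of two"
    out = [v[0]]                     # the single coarsest-scale S coefficient
    stride = n
    while stride > 1:
        # D coefficients of the current scale sit at stride/2, 3*stride/2, ...
        out.extend(v[stride // 2 :: stride])
        stride //= 2
    return out

# The 8-component example from the text: indices 0..7 stand in for
# [s^-2_0, d^0_0, d^-1_0, d^0_1, d^-2_0, d^0_2, d^-1_1, d^0_3].
print(interleaved_to_ordered(list(range(8))))  # → [0, 4, 2, 6, 1, 3, 5, 7]
```

The returned order groups the single d^{-2} coefficient, then the two d^{-1} coefficients, then the four d^0 coefficients, matching the human-readable vector above.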
Figure 17 shows a pair of Gaussian functions expanded in wavelet space, with this
human-readable ordering of wavelet coefficients. In this format, it is easy to see that the
larger Gaussian contributes to the larger scales, but only the smaller central section
contributes to the higher frequency scales. Notice also the significant sparseness of the vector
expanded in wavelet space – this is typical of wavelet expansions for physical functions
and is a result of the dual frequency and spatial locality of the basis.
Figure 17: Wavelet transform example. The top plot is a superposition of two Gaussians of different widths in real space. The bottom plot shows the wavelet transform of this function with human-readable ordering of wavelet coefficients. Vertical lines divide the individual scales.
In the remainder of this work, the computer code and mathematics will continue to use
the interleaved format of describing vectors, but I will use the human-readable ordering
for displaying vectors and operator matrices.
CHAPTER 5
THE ALGORITHM
Depending on the family of basis functions, the algorithm for the solution of the Poisson
equation can appear vastly different. In all cases, the goal is to solve Eqn. (2.7) for ~u:
L(~u+ ~ubc) = O~ρ.
In this work, wavelet-basis optimizations were chosen in several areas in order to improve
scaling and performance. These optimizations lie in the representation of the operators L
and O, in the preconditioning of the Laplacian operator, in the sparse representation of
the vectors ~u and ~ρ, and in the application of the boundary conditions ~ubc.
5.1 Prior Work
This thesis is a continuation of an earlier work by Terzic, Pogorelov, and Bohn which
produced a wavelet-based Poisson solver integrated into the accelerator code Impact-T [28].
This earlier solver utilized orthogonal wavelet families, a preconditioned conjugate gradient
algorithm, and a Green’s function solution to the application of boundary conditions. For
grid sizes of around 32³, this solver achieved computation speeds that were comparable
to that of the FFT based Green’s function solver that is native to Impact-T. Many of
the algorithm decisions of the present work were made in response to the lessons learned
through the successes and shortcomings of this earlier solver.
5.2 Representation of Operators, Calculating Operators in Wavelet Space
When trying to solve the Poisson equation in a wavelet basis, the straightforward way
to implement the Laplacian and overlap operators is by simply creating a matrix containing
all of the inner products between every wavelet function included in the expansion. In
1D, this becomes:
L =
[ ⟨φ^0_0|∂_x^2|φ^0_0⟩   ⟨φ^0_0|∂_x^2|ψ^0_0⟩   ⟨φ^0_0|∂_x^2|ψ^1_0⟩   ⟨φ^0_0|∂_x^2|ψ^1_1⟩   ···
  ⟨ψ^0_0|∂_x^2|φ^0_0⟩   ⟨ψ^0_0|∂_x^2|ψ^0_0⟩   ⟨ψ^0_0|∂_x^2|ψ^1_0⟩   ⟨ψ^0_0|∂_x^2|ψ^1_1⟩   ···
  ⟨ψ^1_0|∂_x^2|φ^0_0⟩   ⟨ψ^1_0|∂_x^2|ψ^0_0⟩   ⟨ψ^1_0|∂_x^2|ψ^1_0⟩   ⟨ψ^1_0|∂_x^2|ψ^1_1⟩   ···
  ⟨ψ^1_1|∂_x^2|φ^0_0⟩   ⟨ψ^1_1|∂_x^2|ψ^0_0⟩   ⟨ψ^1_1|∂_x^2|ψ^1_0⟩   ⟨ψ^1_1|∂_x^2|ψ^1_1⟩   ···
  ···                   ···                   ···                   ···                   ··· ] . (5.1)
Because evaluating each of these matrix elements individually can be computationally
prohibitive, a common approach is to first pose the problem in regular space
in terms of the lowest scale S coefficients:

LS =
[ ⟨φ^n_0|∂_x^2|φ^n_0⟩   ⟨φ^n_0|∂_x^2|φ^n_1⟩   ⟨φ^n_0|∂_x^2|φ^n_2⟩   ⟨φ^n_0|∂_x^2|φ^n_3⟩   ···
  ⟨φ^n_1|∂_x^2|φ^n_0⟩   ⟨φ^n_1|∂_x^2|φ^n_1⟩   ⟨φ^n_1|∂_x^2|φ^n_2⟩   ⟨φ^n_1|∂_x^2|φ^n_3⟩   ···
  ⟨φ^n_2|∂_x^2|φ^n_0⟩   ⟨φ^n_2|∂_x^2|φ^n_1⟩   ⟨φ^n_2|∂_x^2|φ^n_2⟩   ⟨φ^n_2|∂_x^2|φ^n_3⟩   ···
  ⟨φ^n_3|∂_x^2|φ^n_0⟩   ⟨φ^n_3|∂_x^2|φ^n_1⟩   ⟨φ^n_3|∂_x^2|φ^n_2⟩   ⟨φ^n_3|∂_x^2|φ^n_3⟩   ···
  ···                   ···                   ···                   ···                   ··· ] , (5.2)
and because of the spatial locality of the scaling functions, this matrix is strongly diago-
nally dominant.
The linear interpolating wavelets, for example, give the familiar three-point finite
difference stencil:

LS =
[ −2   1   0   0  ···
   1  −2   1   0  ···
   0   1  −2   1  ···
   0   0   1  −2  ···
  ···  ···  ···  ···  ··· ] . (5.3)
In fact, finite difference stencils provide reasonably good approximations to LS for almost
any wavelet family.
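As a quick numerical cross-check of this stencil, the sketch below evaluates the matrix elements ⟨φ_i|∂²_x|φ_j⟩ for unit-spaced hat functions via integration by parts. This Galerkin-style route with hat functions is one standard way to arrive at the three-point stencil; it is used here purely as an illustration and does not assume the thesis's own filter tables in Appendix A.

```python
import numpy as np

# Derivative of the unit hat function centered at integer i:
# +1 on (i-1, i), -1 on (i, i+1), zero elsewhere.
def dhat(x, i):
    return (np.where((x > i - 1) & (x < i), 1.0, 0.0)
            - np.where((x > i) & (x < i + 1), 1.0, 0.0))

# <phi_i | d^2/dx^2 | phi_j> = -integral phi_i' phi_j' dx (integration by parts),
# evaluated with a midpoint rule so no sample lands on a breakpoint.
dx = 1e-4
x = np.arange(-3.0, 3.0, dx) + 0.5 * dx
entry = lambda i, j: -np.sum(dhat(x, i) * dhat(x, j)) * dx

assert abs(entry(0, 0) - (-2.0)) < 1e-2   # diagonal of Eqn. (5.3)
assert abs(entry(0, 1) - 1.0) < 1e-2      # nearest-neighbor entry
assert abs(entry(0, 2)) < 1e-2            # the stencil is three-point
```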
Once LS and OS are obtained, we have a linear equation in coordinate space:

LS W⁻¹(~u + ~u_bc) = OS W⁻¹ ~ρ,

where we can identify LS W⁻¹ and OS W⁻¹ as the operators being applied to the wavelet-space
vectors ~u, ~u_bc, and ~ρ. This particular choice of operators is undesirable because
it is not equivalent to L in Eqn. (5.1), nor is it even symmetric. Symmetry of the
operator is required for many iterative matrix inversion techniques including the conjugate
gradient algorithm, so we would like to preserve the symmetry of the operator LS when
transforming it into wavelet space. Using Eqn. (4.37), we can see that

(LS W⁻¹)† = W̃ LS† = W̃ LS.
So, taking the dual forward transform of the above linear system, we obtain a system with
correctly symmetric operators:

W̃ LS W⁻¹(~u + ~u_bc) = W̃ OS W⁻¹ ~ρ. (5.4)

The prescription for obtaining these wavelet-space LW and OW operators is revealed by

LW = W̃ LS W⁻¹ = W̃ LS W̃† = W̃ (W̃ LS)†. (5.5)

In other words, perform the 2D dual wavelet transform, or the 1D dual wavelet transform
along the rows and the columns of the 2D matrix LS. The equivalence of LW with L in Eqn. (5.1)
is readily seen by rewriting the dual forward transform from Eqns. (4.26 - 4.27) as

F : ⟨φ^k_i|f⟩ = ∑_j h_j ⟨φ^{k+1}_{2i+j}|f⟩, (5.6)
    ⟨ψ^k_i|f⟩ = ∑_j g_j ⟨φ^{k+1}_{2i+j}|f⟩. (5.7)

In the fashion of Eqn. (5.5), we can see that applying W̃ down the columns and W̃†
across the rows of LS in Eqn. (5.2) will produce the matrix L in Eqn. (5.1).¹ This is
the technique used by Terzic et al. to generate LW in their wavelet-based Poisson solver.
However, they used exclusively orthogonal wavelet bases, so their overlap operator was
the identity and LW = W LS W⁻¹.² Because of the spatial localization of the wavelet bases
used, the partial derivative operators retained much of their sparsity in the journey to
wavelet space.
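A compact numerical sketch of this row-and-column transformation, using the orthogonal Haar basis (so that W̃ = W and W⁻¹ = Wᵀ, as in the orthogonal case just mentioned) and the stencil of Eqn. (5.3). The Haar construction here is illustrative only; it is not one of the interpolating bases used in this thesis.

```python
import numpy as np

def haar_stage(m):
    """One stage of the orthonormal Haar transform on m points:
    first m//2 rows produce S coefficients, last m//2 rows D coefficients."""
    F = np.zeros((m, m))
    for i in range(m // 2):
        F[i, 2 * i] = F[i, 2 * i + 1] = 1.0 / np.sqrt(2.0)
        F[m // 2 + i, 2 * i] = 1.0 / np.sqrt(2.0)
        F[m // 2 + i, 2 * i + 1] = -1.0 / np.sqrt(2.0)
    return F

def haar_matrix(m):
    """Full multi-level orthonormal Haar transform W (a product of stages)."""
    W = np.eye(m)
    size = m
    while size > 1:
        stage = np.eye(m)
        stage[:size, :size] = haar_stage(size)
        W = stage @ W
        size //= 2
    return W

m = 8
LS = -2.0 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1)   # the stencil of Eqn. (5.3)
W = haar_matrix(m)
LW = W @ LS @ W.T        # 1D transform down the columns and across the rows

assert np.allclose(W @ W.T, np.eye(m))   # W is unitary, so W^-1 = W^T here
assert np.allclose(LW, LW.T)             # the symmetry of LS survives in wavelet space
```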
¹ Note that this needed to be explicitly shown. Comparing Eqns. (2.7) and (5.4) is not quite conclusive enough, because the same comparison would apply also to the non-symmetric operator LS W⁻¹.
² Recall that W̃ = W in an orthogonal wavelet basis.
5.3 Non-Standard Operator
While the form of the operator written as Eqn. (5.1) is shown to have retained much
of its sparsity, it is difficult to take full advantage of due to the non-localized nature of the
matrix density (Figure 18). Also, this standard form of the operator requires storing the
entire matrix (or at least half of it, due to symmetry). Through a reordering of instructions
in the application of the operator, Beylkin describes another form for the operator which
reveals much of the true sparsity of the operation and preserves the simple repetitive nature
of the original LS matrix, enabling faster application and O(1) storage requirements [5].
Figure 18: Standard form wavelet space operator.
The shape of the non-standard form operator is shown in Figure 19. Qualitatively
one can see that the scales are effectively uncoupled in the operator. The coupling enters
into the operation through the redundancy of data in both the input and result vectors.
In order to produce the redundant input vector, the inverse transform (W⁻¹ = W̃†)
is applied to the vector, and all of the intermediate S coefficient data is kept. After
the operator is applied to each scale independently, the resulting redundant data is dual
wavelet transformed ((W⁻¹)† = W̃) and added back into the corresponding scales.
Figure 19: Non-standard form wavelet space operator.
The translation from the standard operator form into this non-standard form can be
seen by examining the contribution of each scale to individual scales higher up in the
wavelet expansion. We will begin with the base scale (level 0). If only the base scale
existed, the operator LW would reduce to simply
LW(0) = W̃ LS(0) W̃† = LS(0),

where LS(0) denotes the LS matrix operator applied to scale n = 0 (replace n with 0 in
Eqn. (5.2)). Similarly, if another single scale existed in addition to the base scale, LW(1)
would become
LW(1) = [∏_{n=1}^{N=1} F_n] LS(1) [∏_{n=N=1}^{1} F_n†] = F_1 LS(1) F_1†,
and its contribution to the base scale will be given by
LW(1) (contribution to base scale) = P†_{0,S} F_1 LS(1) F_1†,
where P†_{0,S} is the projection operator which selects only the rows of the operator LW(1)
which correspond to S coefficients on level 0, the base scale.
Taking the difference of the two expressions above, we can say that the additional
information provided to the base scale by the addition of the new level 1 is
P†_{0,S}(LW(1) − LW(0)) = P†_{0,S} F_1 LS(1) F_1† − P†_{0,S} LS(0), (5.8)

and we can rewrite the contribution of LW(1) to the base scale as

P†_{0,S} LW(1) = P†_{0,S} LS(0) + P†_{0,S} (F_1 LS(1) F_1† − LS(0)).
Adding another scale will produce another contribution to the above sum, comprised
of
LW(2) − LW(1) = F_1 (F_2 LS(2) F_2† − LS(1)) F_1†,

resulting in a base scale contribution of

P†_{0,S} LW(2) = P†_{0,S} LS(0) + P†_{0,S} (F_1 LS(1) F_1† − LS(0))
             + P†_{0,S} F_1 (F_2 LS(2) F_2† − LS(1)) F_1†,
and for a general LW(N) we can write the contribution to the topmost scale as
P†_{0,S} LW(N) = P†_{0,S} LS(0)
             + P†_{0,S} ∑_{n=1}^{N} [∏_{j=1}^{n−1} F_j] (F_n LS(n) F_n† − LS(n−1)) [∏_{j=n−1}^{1} F_j†]. (5.9)
So the base scale operator is split into portions which operate on individual scale levels
and are then coupled via the wavelet transform back up to the base scale. For the result
applied to lower scales, simply switch the projection operator to P†_{k,D}, and note that P†_{k,D}
applied to any operator which produces information at a scale above the D coefficients
of level k will result in zero. For example, P†_{k,D} LS(0) = 0. This projection operator
replacement results in the operator contribution to any level k as

P†_{k,D} LW(N) = P†_{k,D} F_k LS(k) F_k†
             + P†_{k,D} ∑_{n=k+1}^{N} [∏_{j=k}^{n−1} F_j] (F_n LS(n) F_n† − LS(n−1)) [∏_{j=n−1}^{k} F_j†]. (5.10)
Again, the operator consists of applying an operator at the current scale and then adding
in contributions from lower scales which were also computed independently.
Applying only a single-stage forward transform operation to LS(n) from Eqn. (5.2),
we can see that rather than the full wavelet operator as in Eqn. (5.1), we obtain simply:

F_n LS(n) F_n† = [ ⟨φ^{n−1}_i|∂_x^2|φ^{n−1}_j⟩   ⟨φ^n_i|∂_x^2|ψ^n_j⟩
                   ⟨ψ^n_i|∂_x^2|φ^n_j⟩           ⟨ψ^n_i|∂_x^2|ψ^n_j⟩ ]
               = [ LS(n−1)   LSD(n)
                   LDS(n)    LDD(n) ] . (5.11)

So the (F_n LS(n) F_n† − LS(n−1)) term from Eqn. (5.10) is simply the LSD(n), LDS(n), and
LDD(n) parts of the F_n LS(n) F_n† filter because the LS(n−1) term is subtracted out.
With this knowledge, we can rewrite Eqn. (5.10) in a more illuminating form as³

P†_{k,D} LW(N) = [ 0         0
                   LDS(k)    LDD(k) ]
             + P†_{k,D} ∑_{n=k+1}^{N} [∏_{j=k}^{n−1} F_j] [ 0   LSD(n)
                                                            0   0      ] [∏_{j=n−1}^{k} F_j†]. (5.12)
This form shows the application of the LDS(n) and LDD(n) operations to each scale,
followed by a wavelet transformation of the LSD(n) terms, coupling the lower scales with the
higher ones, as is seen in Figure 19. Note that the P†_{k,D} term is responsible for eliminating
the wavelet-space terms LDS(n) and LDD(n) in the wavelet-transformed portion, leaving
only the LSD(n) term.
From the translation invariance of the wavelet basis, it is easily seen that the LSD(n),
LDS(n), and LDD(n) operators are diagonally dominant, sparse, and invariant to
translation.⁴ For example, LDD(n) is invariant to translation because it depends only on the
distance between wavelet functions:
LDD(n) = ⟨ψ^n_i|∂_x^2|ψ^n_j⟩ = ⟨ψ^n_0|∂_x^2|ψ^n_{j−i}⟩ = DD_{j−i}.
Since the wavelet functions have a finite spatial range, the filter DDj−i decays to zero
for large |j − i|, which causes LDD(n) to be diagonally dominant and sparse. So the
non-standard form of the wavelet space operator reduces to the wavelet transform and
inverse plus the application of three simple filters SD, DS, and DD, all of which are
³ Recall that P†_{k,D} LS(k−1) = 0.
⁴ With periodic or zero boundary conditions, of course.
short one-dimensional filters. The matrix for the overlap operator follows the same
prescription, with the substitution of 1 for ∂_x^2, resulting in a different set of SD, DS, and
DD filters. Appendix A describes in detail the process for calculating these filters for
both the Laplacian and overlap matrices, and contains tabulated coefficients for a few
of the interpolating wavelet bases used in this thesis. An important fact to note is that
since all of the operations performed in the non-standard form of the operator are simple
filter applications, the complexity of the algorithm in terms of the number of operations
required to perform the matrix-vector multiplication is of the order of O(N) for large N ,
where N is the total length of the vector.
The non-standard operator could be implemented as three separate operations: an inverse
wavelet transform to a redundant state; application of the SD, DS, and DD filters to each
scale in parallel; and a forward dual wavelet transform, summing the redundant S coefficients
back into the wavelet coefficients. However, to reduce the number of duplicate copies of
the data structure and to simplify a parallel implementation, it may be advantageous to
combine all three steps into a single operation. To assist in implementation, C++ style
pseudo code illustrating the algorithm for the full non-standard operator is contained in
Figures 20 and 21. These algorithms also account for the fact that the grid may be an
adaptive mesh, which will be discussed further in Section 5.7.
5.4 3D Operators
Generalizing the operator strategy into a fully three-dimensional algorithm is fairly
straightforward when working with wavelet bases defined as tensor products. However,
there are a few non-obvious nuances which merit a careful explanation of the derivation.
The “naive” extension of L3D = Lx +Ly +Lz is incorrect for general biorthogonal wavelet
bases, as will be shown shortly.
topScale::NS_operator(topScaleData) {
    // (Note: The top scale starts out as only S coefficients,
    // so no inverse transform is needed.)
    childRegion = area that overlaps with child region;
    dataToChild = topScaleData[childRegion];
    dataFromChild = a storage space of the size of childRegion
                    for holding result from child scale;

    // Pass S-coefficient data to child and get the result
    // from children:
    mySubScale->SubScale::NS_operator( mySubScale->subScaleData,
                                       dataToChild,
                                       dataFromChild );

    // Perform the SS filter on this scale:
    // Note that the SS_filter cannot be done in-place.
    tempData = SS_filter(topScaleData);
    topScaleData = tempData;

    // Add the result from the child scales back into
    // this current data:
    topScaleData[childRegion] += dataFromChild;
}
Figure 20: Pseudo code for non-standard form operator – top scale.
subScale::NS_operator(subScaleData, dataFromParent, dataToParent) {
    // (Note: The redundant S coefficient data from the parent
    // scale is passed in dataFromParent.)
    tempData = DS_and_DD_filters(subScaleData, dataFromParent);
    // (Note: This does *not* depend on dataFromParent.)
    dataToParent = SD_filter(subScaleData);

    if( child exists ) {
        childRegion = area that overlaps with child region;
        childBorder = extra border area for child BCs;
        dataToChild = subScaleData[childRegion + childBorder];
        dataFromChild = a storage space of the size of
                        childRegion + childBorder for holding
                        result from child scale;

        // Note that the boundary conditions on this transform
        // are the values given in dataFromParent.
        dataToChild = wave_transform_oneScale_inverse( subScaleData,
                                                       dataFromParent,
                                                       childRegion );

        // Pass S-coefficient data to the child and get the result
        // returned from children:
        mySubScale->SubScale::NS_operator( mySubScale->subScaleData,
                                           dataToChild,
                                           dataFromChild );

        // Note that the boundary conditions for this transform are
        // zero since it is an SD filter result from the child, and
        // the SD filter is diagonally dominant so that the data
        // *is* zero outside of the childRegion.
        dataFromChild = wave_dual_transform_oneScale( dataFromChild,
                                                      childRegion );

        // Add the correctly transformed data into this current
        // scale *and* the data to be passed to our parent:
        tempData += D_coefficients(dataFromChild);
        dataToParent += S_coefficients(dataFromChild);
    }
}
Figure 21: Pseudo code for non-standard form operator – child scales.
Using the multidimensional notation for wavelet functions described in Section 4.4, a
matrix element of the 3D Laplacian takes the form:
L^{k,l}_{ix,iy,iz;jx,jy,jz} = ⟨ξ^k_{ix,iy,iz}|∂_x^2 + ∂_y^2 + ∂_z^2|ξ^l_{jx,jy,jz}⟩, (5.13)
where ∇2 is expanded as a sum of second partial derivatives since the basis is a set of
Cartesian tensor products. Examining a single term of this expansion, we can see that
⟨ξ^k_{ix,iy,iz}|∂_x^2|ξ^l_{jx,jy,jz}⟩ = ⟨ξ^k_{ix}|∂_x^2|ξ^l_{jx}⟩ ⟨ξ^k_{iy,iz}|1|ξ^l_{jy,jz}⟩
                                     = ⟨ξ^k_{ix}|∂_x^2|ξ^l_{jx}⟩ ⟨ξ^k_{iy}|1|ξ^l_{jy}⟩ ⟨ξ^k_{iz}|1|ξ^l_{jz}⟩,
which is simply the tensor product of the 1D Laplacian and overlap matrix elements. This
observation, combined with Eqn. (5.13) reveals that the correct form of the 3D Laplacian
operator is given by
L3D = Lx ⊗ Oy ⊗ Oz + Ox ⊗ Ly ⊗ Oz + Ox ⊗ Oy ⊗ Lz. (5.14)
Notice that in an orthonormal wavelet basis, we have Ox = Oy = Oz = 1, and so this
reduces to the “naive” operator mentioned earlier. However, in general, all operators have
this form involving the 1D overlap matrices. For the 3D overlap matrix elements:
O^{k,l}_{ix,iy,iz;jx,jy,jz} = ⟨ξ^k_{ix,iy,iz}|1|ξ^l_{jx,jy,jz}⟩, (5.15)
a similar calculation reveals that
O3D = Ox ⊗Oy ⊗Oz. (5.16)
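A small numerical check of Eqn. (5.14), built with explicit tensor (Kronecker) products. For simplicity this sketch takes the overlap matrices to be the identity, as in an orthonormal basis; that choice is an assumption of the example, not a property of the biorthogonal bases used in the thesis.

```python
import numpy as np

m = 4
L1 = -2.0 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1)   # 1D Laplacian stencil
O1 = np.eye(m)                                             # overlap = identity (orthonormal case)

# Eqn. (5.14): L3D = Lx (x) Oy (x) Oz + Ox (x) Ly (x) Oz + Ox (x) Oy (x) Lz
L3 = (np.kron(np.kron(L1, O1), O1)
      + np.kron(np.kron(O1, L1), O1)
      + np.kron(np.kron(O1, O1), L1))

# On a separable vector u (x) v (x) w, each term differentiates exactly one factor.
rng = np.random.default_rng(0)
u, v, w = rng.standard_normal(m), rng.standard_normal(m), rng.standard_normal(m)
uvw = np.kron(np.kron(u, v), w)
expected = (np.kron(np.kron(L1 @ u, v), w)
            + np.kron(np.kron(u, L1 @ v), w)
            + np.kron(np.kron(u, v), L1 @ w))
assert np.allclose(L3 @ uvw, expected)
assert np.allclose(L3, L3.T)   # symmetric, as required for conjugate gradient
```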
5.5 Implementation of 3D Operators in Non-Standard Form
The implementation of these 3D operators in the non-standard form can proceed in
three ways. First, the 1D operators may be applied in succession for operator products,
and the results summed together as needed for summed operations. This approach is
prohibitive because the 3D Laplacian operator would require nine applications of a
non-standard form operation, resulting in nine iterations through the entire data structure –
which is a costly memory-bound operation.
5.5.1 Generalizing the Non-Standard Form in 2D
The other two methods for generalizing the multidimensional non-standard form
involve directly generalizing Eqn. (5.12) for the multidimensional operators defined in the
previous section. Beginning with the simpler Eqn. (5.8) involving only a base level 0 and
an additional scale, we can derive a similar result for a 2D overlap matrix:
ΔO_{W(1)2D} = P†_{0,S2D} (O_{W(1)2D} − O_{W(0)2D})
            = P†_{0,S2D} F_{1,2D} O_{S(1)2D} F†_{1,2D} − P†_{0,S2D} O_{S(0)2D},

and expanding the 2D operators in terms of tensor products of 1D operators, we have:

ΔO_{W(1)2D} = (P†_{0,Sx} ⊗ P†_{0,Sy})(F_{1x} ⊗ F_{1y})(O_{Sx(1)} ⊗ O_{Sy(1)})(F†_{1x} ⊗ F†_{1y})
            − (P†_{0,Sx} ⊗ P†_{0,Sy})(O_{Sx(0)} ⊗ O_{Sy(0)}), (5.17)
Rearranging commuting terms and then expanding the matrix elements in the form of
Eqn. (5.11) gives:

ΔO_{W(1)2D} = (P†_{0,Sx} ⊗ P†_{0,Sy}) (F_{1x} O_{Sx(1)} F†_{1x}) ⊗ (F_{1y} O_{Sy(1)} F†_{1y})
            − (P†_{0,Sx} ⊗ P†_{0,Sy})(O_{Sx(0)} ⊗ O_{Sy(0)})

            = (P†_{0,Sx} ⊗ P†_{0,Sy}) [ O_{Sx(0)}    O_{SDx(1)}
                                         O_{DSx(1)}   O_{DDx(1)} ] ⊗ [ O_{Sy(0)}    O_{SDy(1)}
                                                                       O_{DSy(1)}   O_{DDy(1)} ]
            − (P†_{0,Sx} ⊗ P†_{0,Sy})(O_{Sx(0)} ⊗ O_{Sy(0)}).
Applying the projection operators gives:
ΔO_{W(1)2D} = [O_{Sx(0)}  O_{SDx(1)}] ⊗ [O_{Sy(0)}  O_{SDy(1)}] − [O_{Sx(0)}  0] ⊗ [O_{Sy(0)}  0], (5.18)

and expanding the tensor products results in

ΔO_{W(1)2D} = [O_{Sx(0)} ⊗ O_{Sy(0)}   O_{Sx(0)} ⊗ O_{SDy(1)}   O_{SDx(1)} ⊗ O_{Sy(0)}   O_{SDx(1)} ⊗ O_{SDy(1)}]
            − [O_{Sx(0)} ⊗ O_{Sy(0)}   0   0   0]
            = [0   O_{Sx(0)} ⊗ O_{SDy(1)}   O_{SDx(1)} ⊗ O_{Sy(0)}   O_{SDx(1)} ⊗ O_{SDy(1)}]. (5.19)
So in the 2D and 3D cases of the non-standard operator form, extra cross terms involving
O_{Sx(0)} and O_{Sy(0)} arise which are not cancelled by the O_{Sx(0)} ⊗ O_{Sy(0)} term at the next
higher scale. This means that one cannot simply ignore the O_{Sx(0)} and O_{Sy(0)} filters in
child scales, as was the case in 1D (in Eqn. (5.12), for example).
5.5.2 Method Two
The second method for implementing a non-standard operator in 2D or 3D requires
applying the 1D operators exactly as Eqn. (5.18) prescribes – applying the full 1D filters
in each direction, and then explicitly subtracting out the O_{Sx(0)} ⊗ O_{Sy(0)} term.⁵ This
method may be used to reduce the Laplacian operator to only a single non-standard
operator application, but it performs redundant work by computing the O_{Sx(0)} ⊗ O_{Sy(0)}
term twice and cancelling it.
5.5.3 Method Three
The third method of implementing the non-standard operator in multiple dimensions
begins by recognizing that the final matrix in Eqn. (5.19) is the 2D equivalent of the OS
and OSD operators. By analogy from the 1D case of Eqn. (5.11), we can define:

F_{n2D} O_{S(n)2D} F†_{n2D} = [ O_{S(n−1)2D}   O_{SD(n)2D}
                                O_{DS(n)2D}    O_{DD(n)2D} ] . (5.20)
⁵ If implementing this form of the 2D or 3D non-standard form operator with a multiresolution grid, take note that the O_{Sx(0)} ⊗ O_{Sy(0)} term needs to be evaluated on the child scale rather than the parent scale in order for it to cancel perfectly with the same term resulting from P†_{0,S2D} F_{1,2D} O_{S(1)2D} F†_{1,2D}.
Using the multidimensional projection operators:
P_{n,S2D} = P_{n,Sx} ⊗ P_{n,Sy}, (5.21)

and

P_{n,D2D} = P_{n+1,S2D} − P_{n,S2D}
          = P_{n,Sx} ⊗ P_{n,Dy} + P_{n,Dx} ⊗ P_{n,Sy} + P_{n,Dx} ⊗ P_{n,Dy}, (5.22)

we can express the terms in Eqn. (5.20) as

O_{S(n−1)2D} = P†_{n,S2D} F_{n2D} O_{S(n)2D} F†_{n2D} P_{n,S2D}, (5.23)
O_{SD(n)2D} = P†_{n,S2D} F_{n2D} O_{S(n)2D} F†_{n2D} P_{n,D2D}, (5.24)
O_{DS(n)2D} = P†_{n,D2D} F_{n2D} O_{S(n)2D} F†_{n2D} P_{n,S2D}, (5.25)
O_{DD(n)2D} = P†_{n,D2D} F_{n2D} O_{S(n)2D} F†_{n2D} P_{n,D2D}. (5.26)
The explicit tensor-product form of these operators can be found by simply expanding
O_{S(n)2D}, F_{n2D}, and the projection operators in their tensor-product forms. In terms of
these operators, it is simple to show that the 2D analogy of Eqn. (5.12) becomes

P†_{k,D2D} O_{W(N)2D} = [ 0              0
                          O_{DS(k)2D}    O_{DD(k)2D} ]
                     + P†_{k,D2D} ∑_{n=k+1}^{N} [∏_{j=k}^{n−1} F_{j2D}] [ 0   O_{SD(n)2D}
                                                                          0   0           ] [∏_{j=n−1}^{k} F†_{j2D}]. (5.27)
Using these forms of the 2D or 3D operators, the algorithm for the application of the
non-standard form remains unchanged. The 2D operators in Eqns. (5.23–5.26) can be
implemented either as tensor products of 1D operators applied over only the specified
regions or explicitly expanded as 2D filters and applied as a whole. The latter is preferable
because the projections of wavelet and scaling function regions can be included
into the 2D filter, simplifying the implementation.
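The equivalence between these two implementation routes rests on the standard tensor-product identity (A ⊗ B) vec(X) = vec(A X Bᵀ) for row-major flattening; a quick numerical sketch with generic stand-in matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4
A = rng.standard_normal((m, m))   # stands in for a 1D operator along x
B = rng.standard_normal((m, m))   # stands in for a 1D operator along y
X = rng.standard_normal((m, m))   # a 2D grid of coefficients

# Route 1: the explicitly expanded 2D operator applied to the flattened grid.
via_2d = (np.kron(A, B) @ X.reshape(-1)).reshape(m, m)

# Route 2: 1D operators applied along each direction in turn.
via_1d = A @ X @ B.T

assert np.allclose(via_2d, via_1d)
```

In practice the 2D-filter route additionally bakes the S/D region projections of Eqns. (5.23–5.26) into the expanded filter, which is what simplifies the implementation.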
5.6 Preconditioning and Temporal Coherence
Solving the Poisson equation (2.7) for ~u involves inverting the Laplacian operator
matrix applied to the right-hand side of the equation – i.e.:
~u = L⁻¹(O~ρ − L~u_bc).
This is a common linear algebra equation and can be solved via a number of known direct or
iterative algorithms. An iterative method is chosen for this current work for a few reasons.
First, iterative matrix inversion techniques typically only require knowing the action of
the operator matrix on a vector. Direct inversion methods often require knowledge of the
entire matrix in order to solve for the inverse operator. For this reason, a direct method
will be very inefficient for these wavelet space operators, since the operators are both very
large and very sparse. As described in Section 5.3, applying the Laplacian operator in
non-standard form requires storage of only a few small filters rather than an entire large
multidimensional matrix.
5.6.1 Temporal Coherence
A second reason for electing to use an iterative method stems from the application of
this algorithm to PIC-based particle dynamics simulations. In such a simulation code, the
Poisson equation solver is called at each simulation time-step to produce the space-charge
forces to be applied on the charge distribution. For such a simulation to be reliable, this
time-step size must be short relative to the time scales of the motions of the individual
particles. This means that the charge distribution and the resultant potential change
slowly with respect to time, and there is a temporal coherence between time-steps. An
iterative matrix inversion scheme allows the Poisson solver to take advantage of this
temporal coherence by using the previous time-step's potential as a "smart" initial guess for
the current time step. As illustrated in Figure 22, Terzic et al. showed that this good
initial guess by itself can reduce the number of iterations for convergence to around 10
per time step for much of the simulation [28].
5.6.2 Conjugate Gradient Algorithm and Preconditioning
The conjugate gradient algorithm is the iterative method of choice for this Poisson
equation solver due to its simple implementation and fast convergence. It requires only
a single application of the Laplacian matrix per iteration, is easy to parallelize if the
matrix-vector product can be performed efficiently in parallel, and converges rapidly to
an acceptable solution. The convergence rate of the algorithm is strongly dependent on
the initial guess (as is shown in Figure 22) and on the condition number κ of the Laplacian
matrix [24], where
κ = λ_max / λ_min = (maximum eigenvalue of L) / (minimum eigenvalue of L), (5.28)
and the error per iteration decreases faster than
||e(i)|| / ||e(0)|| ≤ 2 ( (√κ − 1) / (√κ + 1) )^i, (5.29)
Figure 22: Results of preconditioning and temporal coherence. Figure taken from Terzic et al. [28]. This figure compares the number of iterations required for convergence of each Poisson equation solution in a simulation run in Impact-T using the Poisson solver of [28]. The top line (green) is the non-preconditioned conjugate gradient algorithm with an initial guess of zero for the potential. The second line (blue) is the preconditioned conjugate gradient algorithm with an initial guess of zero for the potential. The third line (red) is the non-preconditioned conjugate gradient algorithm with a "smart" initial guess for the potential. The bottom line (black) is the preconditioned conjugate gradient algorithm with a "smart" initial guess for the potential.
where i is the iteration number.
Though the non-standard form of the operator is O(N) in its complexity, the overall
algorithm complexity is also dependent on the number of conjugate gradient iterations
required to obtain a satisfactory solution. Terzic et al. state that for a standard 3-point
finite difference stencil, κ ∝ O(M²), where M is the grid length in each dimension [28].⁶
Because this effectively causes κ to depend on N, the overall algorithm complexity can no
longer be O(N). Rather, in situations where the initial guess is poor, the order becomes
O(N · C(κ(N))), where C(κ) is the number of iterations of the conjugate gradient algorithm
required for acceptable convergence. C(κ) can be estimated by solving Eqn. (5.29) for the
number of iterations i with respect to a chosen convergence requirement of ε = ||e(i)|| / ||e(0)||.
In this manner, Shewchuk gives the following upper bound on the number of iterations
required for convergence [24]:

C(κ) = i ≤ ⌈ (1/2) √κ ln(2/ε) ⌉. (5.30)
The goal, then, is to reduce C(κ) ∝√κ.
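Both scalings quoted here – κ ∝ O(M²) for the plain stencil and the √κ growth of the bound in Eqn. (5.30) – are easy to check numerically. A sketch (the model matrix is the three-point stencil with zero boundary conditions, as an assumption of the example):

```python
import math
import numpy as np

def stencil_condition_number(m):
    """Condition number of the (negated) three-point stencil on m grid points."""
    A = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    ev = np.linalg.eigvalsh(A)
    return ev[-1] / ev[0]

def cg_iteration_bound(kappa, eps):
    """Upper bound on conjugate gradient iterations, Eqn. (5.30)."""
    return math.ceil(0.5 * math.sqrt(kappa) * math.log(2.0 / eps))

# Doubling the grid length roughly quadruples kappa: kappa ~ O(M^2)...
ratio = stencil_condition_number(64) / stencil_condition_number(32)
assert 3.5 < ratio < 4.5

# ...so the iteration bound, which grows like sqrt(kappa), roughly doubles.
assert cg_iteration_bound(400.0, 1e-6) <= 2 * cg_iteration_bound(100.0, 1e-6) + 1
```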
The typical method of reducing the condition number of a matrix involves choosing
a preconditioner matrix M which approximates the Laplacian operator in Eqn. (2.7) and
solving instead the preconditioned form:
M⁻¹ L ~u = M⁻¹(O~ρ − L~u_bc). (5.31)
Factoring M as M = PPT, which is possible for any symmetric and positive-definite M,
we have the symmetric form:
P⁻¹ L P⁻ᵀ Pᵀ ~u = P⁻¹(O~ρ − L~u_bc). (5.32)
⁶ The 3D grid is assumed to be cubic of size M × M × M, so N = M³.
M is selected in the hope that the product M⁻¹L = P⁻¹LP⁻ᵀ will have a lower condition
number than L, so that the conjugate gradient algorithm will converge more rapidly. A
significant reduction in condition number was obtained by [28] with the diagonal
preconditioner matrix:

(M⁻¹)^{k,l}_{i,j} = 2⁻ᵏ δ_{k,l} δ_{i,j}, (5.33)

which simply multiplies each vector element by 2⁻ᵏ, where k is the element's level number.
This preconditioner reduces the condition number function from κ ∝ O(M²) to
κ ∝ O(M), making the convergence rate proportional to C ∝ √M. Applying this
preconditioner to the present non-standard operator formalism results in an overall
algorithm worst case complexity of O(N·√M) or O(N^{7/6}) in the 3D case, and
O(N^{1+1/(2d)}) for d dimensions.
So the worst case algorithm complexity is nearly linear with respect to the number
of basis functions, but as Figure 22 shows, temporal coherence ensures that the typical
case algorithm complexity will be simply O(N) because the initial guess will be accurate
enough in the average case to require only a few conjugate gradient iterations to obtain
satisfactory convergence. Terzic et al. report that over the entire 30,000 time step
simulation run which produced the data for Figure 22, the "smart" guess plus preconditioner
approach required an average of only 2.4 iterations per time step [28]. This adaptivity in
time is one of the significant strengths of an iterative wavelet-based approach applied to
PIC simulations – less work is required to solve the Poisson equation during portions of
the simulation which have very slow changes with respect to time. This is an advantage
over the Green’s function plus FFT approach, which requires that the full solution method
be repeated for each time step, even if very little of the charge density has changed.
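The interplay of preconditioning and temporal coherence can be sketched with a small preconditioned conjugate gradient solver. This is an illustrative stand-in (diagonal Jacobi-style preconditioner, tridiagonal model problem), not the wavelet-space solver described in this chapter:

```python
import numpy as np

def pcg(A, b, minv, x0=None, tol=1e-10, maxiter=500):
    """Preconditioned conjugate gradient for SPD A; minv is the diagonal of M^-1.
    Returns the solution and the number of iterations used."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    z = minv * r
    p = z.copy()
    rz = r @ z
    for i in range(maxiter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, i + 1
        z = minv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter

m = 64
A = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)   # SPD model problem
b = np.random.default_rng(0).standard_normal(m)

x, iters = pcg(A, b, minv=1.0 / np.diag(A))
assert np.allclose(A @ x, b, atol=1e-6)

# "Temporal coherence": restarting from the previous solution converges
# almost immediately, mimicking a slowly changing charge density.
_, warm_iters = pcg(A, b, minv=1.0 / np.diag(A), x0=x)
assert warm_iters < iters
```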
5.6.3 Inherent Preconditioning of Biorthogonal Bases
As mentioned in Section 4.3, the normalization difference between the orthonormal
wavelets used in [28] and the biorthogonal bases used for this thesis has significant
ramifications in applying preconditioners for the conjugate gradient algorithm. Because the
orthonormal wavelet transform is unitary, it preserves the eigenvalues of the Laplacian
operator, and the non-preconditioned wavelet-space operator has the same inherent
condition number κ ∝ O(M²) as the standard finite-difference stencil in real space. After the
preconditioner in Eqn. (5.33) is applied, the condition number is reduced to κ ∝ O(M).
Because the biorthogonal wavelet transform is not a unitary transformation, it does not
preserve the eigenvalues of the Laplacian operator, and it can effectively provide a
similar preconditioner to that employed in [28] – simple scaling factors between scales. Also,
because the normalization factors between scales are arbitrary in the biorthogonal basis
(subject to the biorthonormality conditions of Eqns. 4.13-4.16), any scaling factor could be
chosen – providing an opportunity to construct biorthogonal bases with inherent
preconditioning equivalent to that used in [28]. Lippert, Arias, and Edelman noted the fact that
a biorthogonal basis appeared to be providing just this sort of inherent preconditioning,
but did not provide a theory for why the default preconditioning was helpful rather than
detrimental [17]. From my own examination of the typical normalization choice for
biorthogonal bases, I discovered that the change in normalization between an orthonormal
basis and a biorthogonal one in 1D has exactly the same effect on the linear algebraic
Equation (2.7) as does the preconditioning matrix of Eqn. (5.33).⁷ Therefore, the default
preconditioning inherent to any biorthogonal wavelet basis in 1D is equivalent to the
preconditioning scheme used in [28], with no additional computational work required to apply
⁷ See Appendix C for the derivation of this equivalence.
the “preconditioner”, and the preconditioning has a similar form in 3D with different scale
factors.
5.6.4 Alternate Preconditioning Schemes
The selection of a preconditioning scheme for use in solving the Poisson equation in a
wavelet basis is a very open-ended choice. The preconditioning scheme used in [28] was
used in this current work, but several other schemes are worthy of future consideration
as well. Goedecker and Arias both claim preconditioning schemes which provide a
condition number that is invariant to additional layers in the wavelet basis under certain
conditions – effectively claiming that κ ∝ O(1), resulting in an O(N) overall algorithm
complexity [1, 11, 17]. This is an area which requires further research to determine if these
preconditioning schemes are appropriate for a solver designed for PIC simulation, as well
as to ascertain under which conditions these schemes do in fact yield an O(N) algorithm.
5.7 Multiresolution Grids
In order to implement an efficient discrete-grid solver in any basis, one must be able
to truncate the basis in some way in order to avoid an infinite basis set. The error
resulting from this truncation should be slight, and the resulting approximation needs to
still accurately approximate the full basis set solution. This will occur if the basis functions
utilized in the expansion are fairly well localized in whatever space is being truncated,
and if the expanded function was already sparse in this region. As an example, a function
expanded in an infinite sum of Dirac δ functions can be truncated in any regions where
the function is zero, with no effect on the accuracy of the expansion. Also, most functions
expanded in frequency space will have some frequency localization, and the frequency expansion may be truncated above some maximum frequency value, with little change to the accuracy of the function representation. This truncation is typically accomplished by a reduction in the density of grid points used to discretely sample the function. Truncation
in frequency is particularly useful when considering the Poisson equation because the
Laplacian operator is a sharpening operator – it strengthens high-frequency components
of the potential to yield a charge density with stronger high-frequency contributions. This
means that when inverting the Laplacian operator to solve the Poisson equation, if the
charge density can be represented accurately with some maximum frequency, then the
potential will not contain contributions from frequencies higher than this maximum.⁸
A Fourier expansion, then, can be truncated in frequency space with very little ap-
proximation error because both the operator and the basis set are localized in frequency.
However, the expansion requires that the grid be uniformly dense because the basis func-
tions are not at all localized in space. A wavelet basis set, being localized in both frequency
and space, can be accurately truncated in both spaces. A look at the form of the Laplacian
operator in wavelet space (Figure 19) shows that at each scale level, the Laplacian op-
erator is also localized in space. So the wavelet basis can be truncated in frequency by
reduction of the grid resolution, and this grid resolution does not have to be uniform
over the computational domain, due to the spatial localization of the wavelet basis and
Laplacian operator. These properties can be utilized to construct an adaptive-resolution
grid structure for solving the Poisson equation – the grid resolution can be made high
where there are sharp transitions in the charge density and made coarse in regions where the charge density is zero or smoothly varying.
⁸ This can be proven more exactly by noting that the Fourier transform basis functions are eigenfunctions of the Laplacian operator. The Laplacian operator can only scale the vector in this basis, so if any frequency component is zero in the charge density, it will be zero in the potential and vice versa.
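This eigenfunction argument can be checked numerically. In the sketch below (illustrative, on a periodic 1D grid with NumPy's FFT), a band-limited charge density is inverted mode by mode, and the resulting potential is confirmed to contain no frequencies beyond those present in the density:

```python
import numpy as np

N = 64
x = np.arange(N) / N
# A band-limited "charge density": only modes 3 and 5 are present.
rho = np.sin(2 * np.pi * 3 * x) + 0.5 * np.cos(2 * np.pi * 5 * x)

k = 2 * np.pi * np.fft.fftfreq(N, d=1.0 / N)   # angular wavenumbers
rho_hat = np.fft.fft(rho)
U_hat = np.zeros_like(rho_hat)
nz = k != 0
U_hat[nz] = -rho_hat[nz] / k[nz] ** 2          # invert d2U/dx2 = rho mode by mode

# Modes absent from rho stay absent from U: the Fourier basis functions
# are eigenfunctions of the Laplacian, which only rescales each component.
high = np.abs(k) > 2 * np.pi * 5
print(np.max(np.abs(U_hat[high])))             # zero up to round-off
```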
5.7.1 Multiresolution Grid to Improve Boundary Conditions
A multiresolution grid may be used to create boundary conditions that more closely
resemble the ideal “open” boundary in PIC codes. Figure 23a shows how the base com-
putational grid containing the beam bunch is contained in a small central region of the
beam pipe. Applying homogeneous boundary conditions (U = 0) at the boundaries of
this small computational grid will result in strong image charges which will significantly
distort the resulting potential. An improvement in the accuracy of the potential is gained
by expanding the computational region to some larger volume at least the size of the beam
pipe, as in Figure 23b. Applying the homogeneous boundary conditions on this larger grid
will significantly reduce the unwanted effects of image charges. However, the number of points added by extending a uniformly dense grid in this way is prohibitive. The wavelet multiresolution grid makes the extension practical by reducing the number of extra points required to extend the boundary of the computational domain (see Figure 23c). Because
there is no charge density in the intermediate region, the grid is made as sparse as possible
and the number of grid points needed is significantly less than a dense grid over the same
region.
Figure 23: Multiresolution grid used for boundary conditions. (a) Base grid with beam pipe. (b) Dense grid to boundary. (c) Adaptive grid to boundary.
The variation in the grid density is restricted by the LSD and LDS terms of the
operators, as valid S coefficient data is required in order to calculate the effect on the
current grid. Therefore each layer needs a boundary region of the next coarsest resolution
as wide as the widest filter that projects from the S to D or D to S spaces. Figure 25, in
Chapter 6, illustrates an actual multiresolution grid with varying numbers of additional
coarse boundary regions.
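The savings from the coarsening boundary layers can be estimated with a simple counting model. This is an illustrative assumption, not the solver's exact layout: each layer is taken to double the radial extent while adding at most one fine-grid's worth of points, ignoring the interleaved S data and the filter-width margins described above.

```python
# Rough point-count model: the fine grid has n^3 points over a unit
# extent; each boundary layer doubles the extent at half the previous
# resolution, adding at most another n^3 coarse points.
n = 32

def dense_points(extent):
    # uniform fine-resolution grid extended all the way to the boundary
    return (n * extent) ** 3

def adaptive_points(layers):
    extent = 2 ** layers
    return (layers + 1) * n ** 3, extent

for layers in (1, 2, 3):
    pts, extent = adaptive_points(layers)
    print(extent, dense_points(extent) // pts)
```

Under this model, three boundary layers reach a radial extent of 8 with roughly two orders of magnitude fewer points than the equivalent dense grid.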
5.7.2 Interpolation
The reduction of the wavelet basis applies only to the function as it is already expanded
in wavelet space – i.e., only wavelet-type “D” coefficients may be truncated because they
will be nearly zero in regions where the input charge density is smooth. During the initial
wavelet transformation, the function is still in regular space, and the wavelet transform
may require fine data points outside of the region where they are available. In this case,
an interpolation scheme may be used to evaluate the function at these points and allow
the wavelet transformation to proceed correctly. See [11] for details on how this wavelet
interpolation is accomplished. This interpolation scheme is not needed in the current
work because the charge density is entirely contained in the finest scale, and the charge
distribution is exactly zero at all the boundary regions where this interpolation would be
needed.
5.8 Monopole Approximation for Boundary Conditions
An additional improvement in approximating “open” boundary conditions is achieved
by applying a non-homogeneous Dirichlet boundary condition containing the monopole
term of the given charge density. The required monopole information of the total charge
and mean charge center is easily obtained from the charge distribution and is continually
tracked by many PIC codes. With this information, we can give an approximation of the
potential at the computational boundary by using the first term of a multipole expansion.
Generalizing the approach taken in [28], arbitrary non-homogeneous boundary conditions
may be applied by separating the full Laplacian operator into sections involving the regions both internal and external to the base computational grid:

L ~u = (P†_in + P†_out) L (P_in + P_out) ~u.

Since we are only interested in the result internal to the computational grid, only two terms remain:

P†_in L ~u = (P†_in L P_in + P†_in L P_out) ~u
           = P†_in L ~u_in + P†_in L ~u_bc,          (5.34)

where ~u_in = P_in ~u is the potential inside the computational grid, and ~u_bc = P_out ~u is the potential due to the boundary conditions.
Because ~ubc is a given function, it can be moved to the right-hand side of the Poisson
equation (Eqn. 2.7), yielding:
L ~u = O ~ρ − L ~u_bc,          (5.35)
where the P†_in term on the left of each term is implied rather than explicitly shown. Computing this L ~u_bc term then simply requires applying the Laplacian operator to a grid which is zero inside the computational domain, using any desired non-homogeneous boundary conditions when applying the operator. The resulting effect on the computational domain is then subtracted from the charge density term, resulting in a "fictitious charge distribution" which applies the correct boundary conditions. An important optimization afforded by the multiresolution grid structure is that this L ~u_bc term will only contain non-zero values on the scale which touches the boundary, because the lower scales are already truncated in such a way that values outside of their parent scale have no effect on them when coupled through the Laplacian operator.
The effect of applying a monopole approximation to the boundary conditions is to
completely eliminate the monopole term from the image charge error, leaving only dipole
terms and higher. This means that the image charge falloff occurs as 1/r² rather than 1/r
as is the case for the monopole term. The overall result is that a smaller computational
grid may be used while giving the same error as a larger grid that utilized a completely
homogeneous boundary condition.
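The fictitious-charge construction of Eqn. (5.35) can be sketched in one dimension as follows (with C = 1, so u'' = ρ; the boundary values ua and ub below are arbitrary stand-ins for, e.g., a monopole estimate of the exterior potential):

```python
import numpy as np

# Interior second-difference Laplacian on n points with spacing h.
n, h = 63, 1.0 / 64
main = np.full(n, -2.0)
off = np.ones(n - 1)
A = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2

rho = np.zeros(n)
rho[n // 2] = 1.0 / h                 # unit point charge at the center

ua, ub = 0.3, -0.1                    # prescribed boundary potentials
Lu_bc = np.zeros(n)                   # L applied to the boundary-only grid:
Lu_bc[0] = ua / h**2                  # couples into the first interior row
Lu_bc[-1] = ub / h**2                 # ... and the last interior row

u = np.linalg.solve(A, rho - Lu_bc)   # interior solve with modified rho

# Check: the residual of the full problem, boundary values included,
# vanishes -- the fictitious charge reproduced the boundary conditions.
full = np.concatenate(([ua], u, [ub]))
resid = (full[:-2] - 2 * full[1:-1] + full[2:]) / h**2 - rho
print(np.max(np.abs(resid)))
```

Only the rows adjacent to the boundary pick up a fictitious-charge contribution, mirroring the observation above that the L ~u_bc term is non-zero only on the scale touching the boundary.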
5.9 Parallelization Opportunities
This algorithm for solving the Poisson equation offers significant opportunities for par-
allel optimization. The conjugate gradient algorithm which is used as the iterative solver
consists mainly of dot product and SAXPY (scalar multiply and vector addition) opera-
tions, which are trivial to implement in parallel as they are element-wise array operations
or simple sum-reductions [8]. The remaining difficult operation to parallelize is the appli-
cation of the Laplacian operator to the trial potential. For the Laplacian operator stage,
the wavelet basis allows the data to be split in either space or scales. A separation in scales
offers a very small boundary between communicating layers, as information is only passed
between scales as S coefficients, which consist of only 1/8 of the number of coefficients on any given layer. If an additional separation is needed, the wavelet basis can also separate in space, requiring only a few boundary coefficients to be passed between adjacent
segments. This is a significant reduction in communication compared with an FFT-based
implementation, which does not separate well in space due to the lack of spatial locality
of the basis set.
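A minimal conjugate gradient solver makes the claimed structure explicit: aside from one operator application per iteration, only dot products and SAXPY updates appear. This is a generic textbook sketch with an absolute tolerance, not the thesis code or its relative stopping criterion.

```python
import numpy as np

def cg(apply_A, b, tol=1e-8, max_iter=500):
    """Conjugate gradient built from the primitives named in the text:
    dot products (sum-reductions), SAXPY updates, and one operator
    application per iteration -- each trivially parallelizable."""
    x = np.zeros_like(b)
    r = b - apply_A(x)
    p = r.copy()
    rr = r @ r                         # dot product
    for _ in range(max_iter):
        Ap = apply_A(p)                # operator (Laplacian) application
        alpha = rr / (p @ Ap)          # dot product
        x += alpha * p                 # SAXPY
        r -= alpha * Ap                # SAXPY
        rr_new = r @ r
        if rr_new < tol * tol:
            break
        p = r + (rr_new / rr) * p      # SAXPY
        rr = rr_new
    return x

# usage: matrix-free 1D second-difference operator (SPD with Dirichlet BCs)
n = 50
def lap(u):
    out = 2 * u.copy()
    out[:-1] -= u[1:]
    out[1:] -= u[:-1]
    return out

b = np.random.default_rng(0).standard_normal(n)
x = cg(lap, b)
print(np.max(np.abs(lap(x) - b)))      # small residual
```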
CHAPTER 6
IMPLEMENTATION AND ALGORITHM TESTING
A computer implementation of the Poisson equation solver algorithm was constructed
with two goals in mind – to verify the correctness of the mathematical algorithm and
filter coefficients and to examine the response of the algorithm to changes in input param-
eters. This chapter reviews some of the features of this implementation and examines the
costs and benefits of the unique contributions of this algorithm for use in particle-in-cell
simulation codes.
6.1 Features and Code Structure
From the spectrum of available implementation choices, the solver used in this work was
constructed in object-oriented C++ with the Blitz++ library (http://www.oonumerics.org/blitz/) utilized for manipulation and optimization of dynamic arrays. It uses a
template structure to enable the code to be compiled and used with any number of di-
mensions, from 1D to 4D. This flexibility allows solutions with various dimensionality to
be tested and compared in the same codebase, and demonstrates the generalization of the
algorithm to various dimensions. The code also allows selection of the wavelet family to
be used, and includes implementations of 2nd, 4th, 6th, and 8th order lifted and non-lifted
interpolating wavelets.
6.1.1 Code Structure
The code is organized into two categories of data structures: vectors and filters. Be-
cause all of the matrix-vector operators used for this algorithm are symmetric and diag-
onally dominant, these filters represent the entire matrix by storing the representation
of a single “row” in a matrix. The concept of a “row” of the matrix is generalized to
higher dimensions as a tensor product of the 1D filters calculated in Appendix A. The
various types of coefficients in a wavelet expansion described in Eqn. (4.48) often each
have unique filters which take into account the differing types of neighboring coefficients.
Filters are used to describe both the non-standard operator matrices and the wavelet
transform matrices.
Because the scales are adaptive in nature, vectors are organized in two types as well: A
single base scale plus multiple child scales. The base scale consists of all S-type coefficients
at the coarsest resolution plus a possible single child scale, and the child scales are each
comprised of wavelet-type coefficients at a single resolution and may recursively contain a
single child scale as well. The data is organized in an interleaved fashion as in Eqn. (4.48), which leaves redundant S-type data points in the data structure of child scales. This data is only 1/8 of the size of the child scale for the 3D solver, and it is used in intermediate steps of the wavelet transform and non-standard operator applications. The base scale and child
scale objects contain the code for the non-standard operator algorithms, according to the
algorithm outlined in Figures 20 and 21.
Child scales apply boundary conditions of zero whenever wavelet-type data is required outside of the child scale's region, but they use the parent data as S-type data points in the boundary region around the scale. This creates the requirement that a child scale
must be able to fit within its parent scale and also have enough boundary values at the
parent’s scale level to satisfy the widest of the wavelet transform, overlap, or Laplacian
filters. The boundary conditions applied to the top-most scale can be arbitrary, and can be
set to any desired function. Implementations of homogeneous (zero) boundary conditions
and monopole approximation boundary conditions are both included and can be selected
from a command-line option in order to allow comparison with and without the monopole
boundary condition approximation, which is a new contribution of this current work.
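The filter-as-row idea can be illustrated with the simplest separable operator. The sketch below uses the standard 7-point Laplacian stencil rather than the wavelet filters of Appendix A: 1D filters are combined by tensor product and applied matrix-free as a stencil, so the full operator matrix is never stored.

```python
import numpy as np

d2 = np.array([1.0, -2.0, 1.0])     # 1D second-difference filter
ident = np.array([0.0, 1.0, 0.0])   # 1D identity (delta) filter

def tensor3(a, b, c):
    # a single "row" of the 3D operator as a tensor product of 1D rows
    return np.einsum('i,j,k->ijk', a, b, c)

# 3D Laplacian stencil = d2 (x) I (x) I + I (x) d2 (x) I + I (x) I (x) d2
stencil = (tensor3(d2, ident, ident)
           + tensor3(ident, d2, ident)
           + tensor3(ident, ident, d2))

def apply_stencil(u, s):
    """Matrix-free operator application: correlate u with the stencil,
    with zero (Dirichlet) values outside the grid."""
    m = s.shape[0] // 2
    out = np.zeros_like(u)
    up = np.pad(u, m)                # zero-pad every axis by the half-width
    for i in range(s.shape[0]):
        for j in range(s.shape[1]):
            for k in range(s.shape[2]):
                if s[i, j, k] != 0.0:
                    out += s[i, j, k] * up[i:i + u.shape[0],
                                           j:j + u.shape[1],
                                           k:k + u.shape[2]]
    return out

# check against the discrete Laplacian of a quadratic: lap(x^2+y^2+z^2) = 6
N = 8
ax = np.arange(N, dtype=float)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing='ij')
u = X**2 + Y**2 + Z**2
v = apply_stencil(u, stencil)
print(v[3, 3, 3])                    # 6.0 at interior points (h = 1)
```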
6.2 Testing: Parameters and Measurements
Over 1000 variations of parameters were tested with this solver to verify robustness and
to examine the feasibility of the multi-scale algorithm and monopole approximation for
use in approximating open boundary conditions. The output of the solver was compared
with the known analytic solutions and an estimate of the overall error was produced for
each test run. All tests reported in this chapter were computed in 3D. The actual charge
distribution and potential functions shown in this chapter are always three-dimensional
and symmetric about z = 0, and a 2D slice across z ≈ 0 is shown in figures.
• Three different input charge distributions were utilized – a Gaussian pure monopole,
a Gaussian pure dipole, and a double Gaussian function containing monopole, dipole, and higher-order multipole terms. A typical particle-in-cell beam dynamics simulation will only include particles of a single charge, but the dipole input test function was included as a control group to establish how the dipolar term of a multipole expansion affects the solver. Figure 24 illustrates these three input variations. Each of the Gaussian functions is radially truncated such that ρ = 0 for r > 1/4, and they are arranged so that non-zero charge density only exists within a sphere of r < 1 around the origin.
• 2nd-order and 4th-order interpolating wavelets were used, of both the lifted and
non-lifted varieties.
• Either monopole-approximation or zero (homogeneous) boundary conditions were employed.
• The number of grid points sampling the finest scale was varied in cubed powers of two from 4³ up to 64³. The finest scale was always chosen to be of size 1 in each
dimension, in order to include the non-zero charge distribution of the input function.
This is the typical way that the input function grid is assigned in a particle-in-cell
simulation.
• The number of additional boundary scales was varied from 0 (only the base finest
scale) up to 10 boundary scales. The scales were added using the minimum number
of points required in each additional parent scale, which was dependent on the
wavelet family used. The minimum radial extent of the largest scale was recorded
for comparison between simulations using different wavelet families and fine-grid
densities.
Figure 24: Code testing input functions. (a) Gaussian monopole. (b) Gaussian dipole. (c) Double Gaussians.
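Input functions of this kind can be sketched as follows. The truncation radius of 1/4 is taken from the text; the Gaussian width and the ±0.3 center offsets below are illustrative assumptions, not the thesis's actual parameters.

```python
import numpy as np

def truncated_gaussian(X, Y, Z, center, sigma=0.08, sign=1.0):
    # Gaussian about `center`, set exactly to zero for r > 1/4 from it.
    r2 = (X - center[0])**2 + (Y - center[1])**2 + (Z - center[2])**2
    g = sign * np.exp(-r2 / (2.0 * sigma**2))
    g[r2 > 0.25**2] = 0.0
    return g

n = 33                                    # odd so the origin is a grid point
ax = np.linspace(-1.0, 1.0, n)
X, Y, Z = np.meshgrid(ax, ax, ax, indexing='ij')

monopole = truncated_gaussian(X, Y, Z, (0.0, 0.0, 0.0))
dipole = (truncated_gaussian(X, Y, Z, (0.3, 0.0, 0.0))
          + truncated_gaussian(X, Y, Z, (-0.3, 0.0, 0.0), sign=-1.0))
double = (truncated_gaussian(X, Y, Z, (0.3, 0.0, 0.0))
          + truncated_gaussian(X, Y, Z, (-0.3, 0.0, 0.0)))

print(monopole.max(), abs(dipole.sum()))  # peak 1 at origin; net charge ~0
```

All non-zero density lies within r < 1 of the origin (offset 0.3 plus truncation radius 0.25), matching the arrangement described in the bullet above.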
The conjugate gradient method was used to compute the solution of the potential,
using an initial guess of 0, and the number of iterations was chosen to reduce the residual
to 8 orders of magnitude smaller than the squared norm of the charge distribution input
function. The error of the final result was computed as an RMS error comparing the
computed result with the analytical result on the finest grid only:
RMS Error = sqrt( Σ_{i ∈ Fine Grid} (Result_i − Actual_i)² / (Num Fine Points) ).          (6.1)
Since the boundary grids only exist in order to improve the boundary condition ap-
proximation, they were not included in the error calculation. Note that comparing with
the analytical solution tests both the ability of the solver to invert the Laplacian, as well
as the ability of the basis function set to accurately represent the charge density function,
which is often dependent on the choice of fine grid density.
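Eqn. (6.1) translates directly into code. The sketch below applies it to a made-up 1D stand-in for solver output, restricting the sum to a fine-grid mask as described above:

```python
import numpy as np

def rms_error(result, actual, fine_mask):
    """Eqn. (6.1): RMS deviation from the analytic solution, restricted
    to fine-grid points (boundary scales are excluded)."""
    d = (result - actual)[fine_mask]
    return np.sqrt(np.sum(d**2) / d.size)

# usage with a made-up 1D stand-in for solver output
x = np.linspace(-2.0, 2.0, 81)
actual = np.exp(-x**2)
result = actual + 1e-3 * np.sin(40.0 * x)    # pretend computed potential
fine = np.abs(x) <= 1.0                      # count only the "fine grid"
print(rms_error(result, actual, fine))
```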
6.3 Testing: Demonstration of Solutions
A test series demonstrating the effect of increasing boundary layers is shown in Figure 25. This set of results is from a series using the Gaussian dipole input function, 2nd-order lifted wavelets, and a 32³ core grid. The dipole input function is shown to illustrate
the dipole image charge falloff without requiring the use of monopole boundary conditions.
The number of boundary layers is varied, and each individual result is shown in parts (a)
through (d) of the figure. Each part of the figure shows the entire solution of the potential
on the left, including the boundary grid regions. In the upper right, the deviation of the
solution from the analytical potential is shown on the finest grid, along with the overall
RMS error for the solution. In the lower right, the residual is shown as a function of
(a) Dipole – 0 boundary layers: radial extent > 1.000000, RMS error = 0.001909, convergence in 29 iterations.
(b) Dipole – 2 boundary layers: radial extent > 1.774190, RMS error = 0.000271, convergence in 24 iterations.
Figure 25: Dipole decay with boundary layers. (continued on following page)
(c) Dipole – 4 boundary layers: radial extent > 4.870970, RMS error = 0.000030, convergence in 22 iterations.
(d) Dipole – 5 boundary layers: radial extent > 9.000000, RMS error = 0.000022, convergence in 31 iterations.
Figure 25: Dipole decay with boundary layers (continued).
the conjugate gradient iteration number, with the total number of iterations required for
convergence denoted above this plot.
Figure 25a demonstrates that truncating the computational domain very near to the
charge distribution results in significant dipole image charge effects in the resulting po-
tential. The effect of this dipolar image charge can be seen in the overall slope of the
deviations of this solution – the dipole image charge creates a dipole error in the poten-
tial, as is expected. In Figure 25b, the dipole image charge error is still visibly dominant,
but it has reduced in strength by several orders of magnitude and the sampling error is
beginning to be significant as well. In Figure 25c, four boundary layers have been added,
and the RMS error is greatly reduced from that in Figure 25a. In addition, the error is
no longer dominated by the dipole image charge, but the sampling error is the promi-
nent contributor to the overall error. Further increasing the number of boundary layers
provides only marginal improvement in the RMS error, as Figure 25d demonstrates.
The convergence of the tests in this chapter also merits some consideration. For sim-
plicity, the algorithm was implemented with the entire top scale remaining as S-type
coefficients. This is because the top scale’s dimension is not guaranteed to be an even
number in each direction, so a wavelet transform of this layer is not always possible.
With zero boundary scales, then, the convergence rate is essentially identical to that of
a finite-difference scheme with no preconditioning. Adding boundary scales allows the
wavelet preconditioning to work while also reducing the number of points in the topmost
non-transformed scale, resulting in cases where adding boundary scales actually causes
the solution to converge more rapidly. Further study of this effect is needed in order to
construct an improved preconditioning scheme for cases with few boundary layers. In any
case, the number of points in the overall vector is usually a better indicator of program execution time, since the algorithm complexity is O(N_p · N_iter) and N_p ≫ N_iter. An optimal preconditioning scheme will produce N_iter ≈ const, so O(N_p) is the best expected performance – the fact that the condition number goes down with an increased number of scales is a symptom of the artificially poor condition number for the lower scale numbers rather than an improved condition number for the medium scale numbers. This effect is
not significant for typical uses of the algorithm, however, as a few boundary scales are
always needed in order to correctly evaluate the boundary conditions.
6.4 Testing: Effects of Radial Extent and Monopole BCs
A second observation which can be made from Figure 25 is that the radial extent of
the computational domain is not linearly related to the number of scales. Rather, it is
nearly exponential, as the point spacing of each additional layer is double the spacing of
the previous scale. For this reason, and because the radial extent is an actual physical
characteristic of the Poisson equation under consideration, it is used rather than the
number of boundary layers in the remaining plots in this chapter.
In terms of the radial extent, it is expected that the image charge error will decay as r⁻¹ for monopole terms in the potential and as r⁻² and faster for dipole and higher multipole terms. As was mentioned in Sec. 5.8, the goal of applying monopolar boundary conditions is to reduce the erroneous image charge in the potential to dipole or higher terms, so that the error should decay as r⁻² or faster for any charge distribution. Figure 26 illustrates
the effects of varying the number of boundary layers on the RMS error of the various
combinations of input functions and boundary condition schemes. Because the finest grid
is always chosen to have a radial extent of 1, the radial extent shown in this figure is
normalized such that it is also a relative radial extent.
As expected, the Gaussian monopole and double Gaussian input functions with homogeneous (zero) boundary conditions are subject to image charge errors which decay approximately as r⁻¹. The decay in the error of these functions is constrained by their
Figure 26: Error vs. radial extent. Second-order lifted wavelets, 32³ grid. (RMS error vs. radial extent for the dipole, double-Gaussian, and monopole input functions, each with monopole and with zero boundary conditions.)
monopolar terms, and they require approximately an order of magnitude increase in the
radial extent to produce an order of magnitude decrease in the error.
The RMS error of the Gaussian dipole input function decays as r⁻² – also as predicted.
Before the sampling error saturates the results, the dipole function's error decays by two orders of magnitude over the first order-of-magnitude increase in the relative radial extent. This results in a significantly smaller error than the monopolar functions for even very small numbers of boundary layers.
Applying the monopole approximation boundary conditions also has the expected ben-
efits. For the dipole input function, there is no effect, as there is no monopolar term to
cancel. For both the Gaussian monopole and the double Gaussian input functions, acti-
vating the monopolar boundary conditions significantly reduces the RMS error for small
numbers of boundary layers. For radial extents of around 2–5, the RMS error is reduced
by two orders of magnitude or more by using the monopolar boundary condition approxi-
mation. The Gaussian monopole function exhibits no image charge effect whatsoever, and
the image charge error of the double Gaussian function initially decays at least as fast as r⁻² for the first few boundary layers.
As the number of boundary layers gets larger, the error reduction of all of the in-
put functions saturates to the limit of the sampling error, and the convergence of the
conjugate gradient algorithm is somewhat dependent on the total number of points in
the computational domain, so the RMS error grows slightly with increased radial extent.¹
These two effects counteract the reduction of the multipole terms and produce an expected
minimum error with radial extents of around 2–5 when using the monopolar boundary
condition approximation for a general input function. This typically corresponds to only a few boundary layers, so the extra cost of the boundary condition computations is expected to be small.

¹ The Gaussian dipole input function is not greatly affected by this second effect because the initial guess of "0" for the potential is fairly accurate for the boundary grids, resulting in a smaller overall residual at convergence of the conjugate gradient algorithm.
As mentioned in the previous section, the cost of adding extra layers for boundary
condition computation can be approximated roughly as the number of extra points added
by these boundary layers. Figure 27 shows the relative increase in points caused by
increasing the radial extent of the computational domain to improve the computation
of boundary conditions. For radial extents in the "optimal" region of 2–5, the fractional increase in points is less than 2 for all cases, and is marginal with denser fine grids. For particle-in-cell simulations, grid sizes of 32³ are typical for simulations involving hundreds of thousands of particles, with 64³ used for particle counts in the millions – which is becoming more and more common. So, the boundary conditions
computation will account for a smaller and smaller fraction of the overall computation
time as the resolution of modern PIC simulations increases.
Figure 27: Number of points vs. radial extent (ratio of total points to fine-grid points, for 16³, 32³, and 64³ fine grids). Second-order lifted wavelets.
CHAPTER 7
CONCLUSIONS
This thesis provides an algorithm for calculating solutions of the Poisson equation
for use in particle-in-cell simulators. This algorithm retains the principal benefits of the
wavelet-based solver of Terzic et al.:
• The iterative method used to solve the Poisson equation allows the algorithm to be
adaptive in time by using information from the previous time step to accelerate the
solution of the current time step.
• The preconditioning scheme used in this algorithm is similar to the scheme used in
[28], and it is applied implicitly in the wavelet basis itself. The recognition of the
nature of this implicit preconditioning is unique to this work. This preconditioning
allows the iterative solver to converge more rapidly than a similar finite-difference
scheme would allow.
• The solution is computed in wavelet space, allowing possible wavelet de-noising of the
charge density and potential to reduce sampling noise inherent in the particle-in-cell
algorithm [20, 21, 28].
In addition, this new algorithm extends the prior work by utilizing the non-standard
form of the wavelet-space Laplacian operator, exposing much of the inherent sparsity of
the operator in the wavelet basis. The improved sparsity of the operators results in faster
O(N) operator application speeds. The non-standard operator form also separates the
wavelet scales to enhance the amount of parallelization available.
To overcome the difficulties that the former algorithm suffered when computing bound-
ary conditions, a new multiscale adaptive-in-space algorithm is used to reduce image charge
error in boundary condition computation. This new algorithm is far simpler to implement
in parallel codes. The multiscale algorithm for boundary conditions also enables the code
to put the adaptive-in-time nature of the iterative solver to work on the boundary condi-
tions as well – consuming less computational time during portions of the simulation when
the charge density is slowly changing.
Further reducing the computational cost of the boundary condition calculation, a new
monopole approximation boundary condition scheme is used to reduce the number of
boundary layers needed. Tests were performed to verify the predicted behavior of this
algorithm and they show that this monopole boundary condition scheme reduces the
required number of boundary layers to about 3. It also cuts down the RMS error by two
orders of magnitude on grids with three boundary layers for general functions encountered
in accelerator or galactic dynamics.
Overall, this new algorithm is shown to be an improvement in both execution complexity and flexibility over the previous wavelet-based particle-in-cell solver. It also shows
the potential to improve the range of simulations that are practical for particle-in-cell
simulations by reducing the computational cost required for large grid sizes and allowing
implementations of parallel versions of the algorithm.
7.1 Future Applications
A key area for future development of this algorithm lies in the extension of the adaptive
grid algorithm into the core of the charge density region. Terzic et al. reported that less
than 10% of wavelet coefficients are required to represent a typical particle density in a particle-in-cell simulation with a 32³ fine grid size, and this fraction decreases with larger
grid densities [28]. The adaptive grid structure developed in this present work could be
harnessed to exploit some of this sparsity, improving both the operator application speeds
and the convergence rate of the conjugate gradient algorithm, which currently depends
on the number of grid points. This improved performance would allow simulations with
higher grid densities and particle counts even when executed as a serial process.
The next stages of algorithm improvement are optimization for execution speed and the
development of a parallel code. Parallelization opportunities exist for both large-grained
and small-grained parallel strategies. Large-grained parallel areas involve distributing the
data across computation nodes and performing operations on individual scales or segments
of the data in parallel. The short length of the filters involved in wavelet transformations
and operators also supports small-grained parallel strategies involving applying the filter
in parallel over several small sections in the multiple cores of a single processor. This
small-grained parallelism is a result of the compact support of the wavelet basis, and
will become increasingly important as modern processors continue to exhibit more small-
grained parallelism through multithreading and multiple processor cores.
BIBLIOGRAPHY
[1] T.A. Arias. Multiresolution analysis of electronic structure: semicardinal and wavelet bases. Reviews of Modern Physics, 71:267–311, 1999.
[2] A. Averbuch, G. Beylkin, R. Coifman, P. Fischer, and M. Israeli. Adaptive solution of multidimensional PDEs via tensor product wavelet decomposition. 2003. URL: http://www.cs.tau.ac.il/~amir1/PS/poisson.pdf.
[3] A. Averbuch, E. Braverman, and M. Israeli. Parallel adaptive solution of a Poisson equation with multiwavelets. SIAM Journal on Scientific Computing, 22:1053–1086, 2000.
[4] G. Beylkin, R. Coifman, and V. Rokhlin. Fast wavelet transforms and numerical algorithms I. Communications on Pure and Applied Mathematics, 44:141–183, 1991.
[5] G. Beylkin. On the representation of operators in bases of compactly supported wavelets. SIAM Journal on Numerical Analysis, 29:1716–1740, 1992.
[6] G. Beylkin and James M. Keiser. On the adaptive numerical solution of nonlinear partial differential equations in wavelet bases. Journal of Computational Physics, 132:233–259, 1997.
[7] I. Daubechies. Ten lectures on wavelets. Society for Industrial and Applied Mathematics, Philadelphia, 1992.
[8] Martyn R. Field. Optimizing a parallel conjugate gradient solver. SIAM Journal on Scientific Computing, 19:27–37, 1998.
[9] Masafumi Fujii and Wolfgang J.R. Hoefer. Interpolating wavelet collocation method of time dependent Maxwell's equations: characterization of electrically large optical waveguide discontinuities. Journal of Computational Physics, 186:666–689, 2003.
[10] C.K. Gan, P.D. Haynes, and M.C. Payne. Preconditioned conjugate gradient method for the sparse generalized eigenvalue problem in electronic structure calculations. Computer Physics Communications, 134:33–40, 2001.
[11] Stefan Goedecker. Wavelets and their application for the solution of partial differential equations in physics. Presses Polytechniques et Universitaires Romandes, Lausanne, 1998.
[12] Stefan Goedecker and Claire Chauvin. Combining multigrid and wavelet ideas to construct more efficient multiscale algorithms. Journal of Theoretical and Computational Chemistry, 2:483–495, 2003.
[13] Stefan Goedecker and Oleg Ivanov. Solution of multiscale partial differential equations using wavelets. Computers in Physics, 12:548–555, 1998.
[14] S. Goedecker and O.V. Ivanov. Linear scaling solution of the Coulomb problem using wavelets. Solid State Communications, 105:665–669, 1998.
[15] R. Hockney and J. Eastwood. Computer simulation using particles. McGraw-Hill, New York, 1981.
[16] Randall J. LeVeque. Wave propagation software, computational science, and reproducible research. Proceedings of the International Congress of Mathematicians, 2006.
[17] Ross A. Lippert, T.A. Arias, and Alan Edelman. Multiscale computation with interpolating wavelets. Journal of Computational Physics, 140:278–310, 1998.
[18] Gisela Poplau, Ursula van Rienen, Marieke de Loos, and Bas van der Geer. A multigrid based 3D space-charge routine in the tracking code GPT. TESLA Reports, 2003-03.
[19] S.B. van der Geer, O.J. Luiten, M.J. de Loos, G. Poplau, and U. van Rienen. 3D space-charge model for GPT simulations of high-brightness electron bunches. TESLA Reports, 2003-04.
[20] Alessandro B. Romeo, Cathy Horellou, and Jöran Bergh. N-body simulations with two-orders-of-magnitude higher performance using wavelets. Monthly Notices of the Royal Astronomical Society, 342:337–344, 2003.
[21] Alessandro B. Romeo, Cathy Horellou, and Jöran Bergh. A wavelet add-on code for new-generation N-body simulations and data de-noising (JOFILUREN). Monthly Notices of the Royal Astronomical Society, 354:1208–1222, 2004.
[22] David B. Serafini, Peter McCorquodale, and Phillip Colella. Advanced 3D Poisson solvers and particle-in-cell methods for accelerator modeling. Journal of Physics: Conference Series, 16:481–485, 2005.
[23] R. Shankar. Principles of quantum mechanics. Springer, New York, 1994.
[24] Jonathan R. Shewchuk. An introduction to the conjugate gradient method without the agonizing pain. Carnegie Mellon University, Pittsburgh, 1994.
[25] M.K. Sun and W.Y. Tam. Analysis of 2-D scattering problems using a novel nonuniform gridding FDTD method. Microwave and Optical Technology Letters, 28:430–432, 2001.
[26] Wim Sweldens. The lifting scheme: a construction of second generation wavelets. SIAM Journal on Mathematical Analysis, 29:511–546, 1998.
[27] Wim Sweldens and Peter Schröder. Building your own wavelets at home. Wavelets in Computer Graphics, ACM SIGGRAPH 96 course notes, 1996.
[28] Balsa Terzic, Ilya V. Pogorelov, and Courtlandt L. Bohn. Particle-in-cell beam dynamics simulations with a wavelet-based Poisson solver. Physical Review Special Topics – Accelerators and Beams, 10:034201, 2007.
[29] Oleg V. Vasilyev and Christopher Bowman. Second-generation wavelet collocation method for the solution of partial differential equations. Journal of Computational Physics, 165:660–693, 2000.
APPENDIX A
WAVELET COEFFICIENTS AND OPERATOR COEFFICIENTS
The prediction filters for interpolating wavelet families are all derived from the
corresponding interpolation schemes. Table 3 lists the prediction filters for 2nd-, 4th-, 6th-,
and 8th-order interpolating wavelets. The 2nd-order filter is simply the linear prediction
scheme derived in Chapter 3, and the higher order filters are given in [11]. The update
filters for the non-lifted wavelet families are all zero. Sweldens gives a simple expression
for update filters which give the maximum number of vanishing moments for a given
prediction filter [27]:

    U[i] = \frac{1}{2} P[-i].  (A.1)

Once U and P are chosen, finding h, g, \tilde{h}, and \tilde{g} is simply a matter of applying Eqns. (4.5)
and (4.22–4.23).
Table 3: Interpolating wavelet prediction filters

Offset       1           2           3          4
2nd Order    1/2
4th Order    9/16        -1/16
6th Order    75/128      -25/256     3/256
8th Order    1225/2048   -245/2048   49/2048    -5/2048

Note: Only positive offsets are shown, but the filters are symmetric about 1/2, so
P[1-i] = P[i], where i >= 1.
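As an illustrative cross-check added here (not part of the original implementation), the following Python sketch verifies that the 4th-order prediction filter of Table 3 is exact on cubic polynomials, and derives the corresponding lifted update filter from Eqn. (A.1); the `predict` helper and the stencil layout are assumptions of this sketch:

```python
from fractions import Fraction as F

# 4th-order prediction filter from Table 3; by symmetry P[1-i] = P[i],
# the full stencil predicts the midpoint value f(0) from the four
# surrounding coarse samples f(-3), f(-1), f(1), f(3).
P = {1: F(9, 16), 2: F(-1, 16)}
stencil = {-3: P[2], -1: P[1], 1: P[1], 3: P[2]}

def predict(f):
    # Weighted sum of coarse samples; exact for polynomials up to cubics.
    return sum(w * f(x) for x, w in stencil.items())

# The detail (prediction error) vanishes on cubics:
for poly in (lambda x: 1, lambda x: x, lambda x: x**2, lambda x: x**3):
    assert predict(poly) == poly(0)

# Eqn. (A.1): the update filter with the maximum number of vanishing
# moments is U[i] = P[-i] / 2 (index convention of this sketch).
U = {-x: w / 2 for x, w in stencil.items()}
print(U)  # weights 9/32 at offsets +/-1 and -1/32 at offsets +/-3
```

Exact rational arithmetic via `fractions` is used so that the polynomial-reproduction check is an equality rather than a floating-point comparison.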
With the wavelet coefficients found, determining the 1D operator matrix elements
involves evaluating the L_{S(n)}, L_{SD(n)}, L_{DS(n)}, and L_{DD(n)} terms from Eqn. (5.11). To be
general, we can find the operator A^{(l)} which represents the lth derivative, and this will
have special cases of L and O. The solution begins by defining the base integral:
    a^{(l)}_i = \int \phi(x)\,\partial_x^l\,\phi(x-i)\,dx.  (A.2)
Expanding this base integral in terms of the basic refinement relation in Eqn. (4.1) gives:
    a^{(l)}_i = \int \phi(x)\,\partial_x^l\,\phi(x-i)\,dx,
             = \sum_{\nu,\mu} h_\nu h_\mu \int \phi(2x-\nu)\,\partial_x^l\,\phi(2(x-i)-\mu)\,dx,
             = \sum_{\nu,\mu} h_\nu h_\mu \int \phi(Z)\,\partial_Z^l\,2^l\,\phi(Z+\nu-2i-\mu)\,\frac{1}{2}\,dZ,
             = 2^{l-1} \sum_{\nu,\mu} h_\nu h_\mu\, a^{(l)}_{2i+\mu-\nu}.  (A.3)
Solving Eqn. (A.3) reduces to solving for the eigenvector of

    B^{(l)} \vec{a}^{(l)} = 2^{1-l}\,\vec{a}^{(l)},  (A.4)

where

    B^{(l)}_{i,j} = \sum_{\nu,\mu} h_\nu h_\mu\, \delta_{j,\,2i+\mu-\nu}.  (A.5)
The eigenvector equation leaves the normalization arbitrary, but Goedecker gives the
following normalization based on the normalization of the scaling functions and the action
of derivative operators on polynomial functions [11]:
    \sum_i i^l\, a^{(l)}_i = l!  (A.6)
A Maple 10 program to calculate the base filters for any interpolating wavelet family
is given in Figure 28. Also, calculated filters for 2nd- and 4th-order wavelets are shown in
Table 4 to allow validation of the code prior to using it to compute filters for higher order
wavelet families.
Table 4: Interpolating wavelet base filters

Offset             0             1              2             3              4            5
2nd Order a^(0)    2/3           1/6
2nd Order a^(2)    -2            1
4th Order a^(0)    56264/70245   19253/140490   -2827/70245   6283/2247840   -16/210735   -1/6743520
4th Order a^(2)    -20/9         9/8            0             -1/72

Note: Only positive offsets are shown, but the filters are symmetric about 0, so
a^(l)[-i] = a^(l)[i].
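The eigenproblem of Eqns. (A.4–A.6) can also be set up in a few lines of NumPy. This sketch (an independent reimplementation added here, not the thesis code) recovers the 2nd-order a^(2) row of Table 4; the five-point support range is an assumption of the sketch:

```python
import math
import numpy as np

# Refinement filter h for 2nd-order (linear) interpolating wavelets,
# indexed by offset nu in {-1, 0, 1}.
h = {-1: 0.5, 0: 1.0, 1: 0.5}

l = 2                          # derivative order (1D Laplacian)
offsets = list(range(-2, 3))   # assumed support of the base filter
idx = {o: k for k, o in enumerate(offsets)}

# B^(l)_{i,j} = sum_{nu,mu} h_nu h_mu delta_{j, 2i+mu-nu}   (Eqn. A.5)
B = np.zeros((len(offsets), len(offsets)))
for i in offsets:
    for nu, hnu in h.items():
        for mu, hmu in h.items():
            j = 2 * i + mu - nu
            if j in idx:
                B[idx[i], idx[j]] += hnu * hmu

# Solve B^(l) a = 2^(1-l) a   (Eqn. A.4): take the eigenvector whose
# eigenvalue is closest to 2^(1-l) = 1/2.
vals, vecs = np.linalg.eig(B)
a = np.real(vecs[:, np.argmin(np.abs(vals - 2.0 ** (1 - l)))])

# Normalize with sum_i i^l a_i = l!   (Eqn. A.6).
a /= sum(o ** l * a[idx[o]] for o in offsets) / math.factorial(l)

print(np.round(a, 10))  # central entries reproduce Table 4: 1, -2, 1
```

Changing `h`, `l`, and the support range reproduces the other rows; the same structure is what the Maple program of Figure 28 implements.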
Once the base filters have been computed, the components of all of the individual
filters can be derived in terms of them. A^{(l)}_{S(n)} becomes:

    A^{(l)}_{S(n),i,j} = \langle \phi^n_i | \partial^{(l)}_x | \phi^n_j \rangle,
                       = \int \phi(2^n x - i)\,\partial^{(l)}_x\,\phi(2^n x - j)\,dx,
                       = \int \phi(Z)\,\partial^{(l)}_Z\,(2^n)^l\,\phi(Z + i - j)\,(2^{-n})\,dZ,
    A^{(l)}_{S(n),i,j} = 2^{n(l-1)}\,a^{(l)}_{j-i}.  (A.7)
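To make Eqn. (A.7) concrete, here is a small Python sketch (added for illustration; the `single_scale_block` helper and the zero-truncated boundaries are assumptions of the sketch) that builds the level-n Laplacian block from the 2nd-order base filter of Table 4 and applies it to samples of x²:

```python
import numpy as np

a2 = {-1: 1.0, 0: -2.0, 1: 1.0}   # 2nd-order a^(2) from Table 4 (symmetric)

def single_scale_block(n, size, a=a2, l=2):
    # A^(l)_{S(n),i,j} = 2^{n(l-1)} a^(l)_{j-i}   (Eqn. A.7), truncated
    # to a finite grid with zero boundaries.
    A = np.zeros((size, size))
    for i in range(size):
        for j in range(size):
            if (j - i) in a:
                A[i, j] = 2.0 ** (n * (l - 1)) * a[j - i]
    return A

# Acting on the coefficients of f(x) = x^2 sampled on the level-n grid
# x_j = j / 2^n, the interior rows return the weak-form Laplacian value
# <phi^n_i | f''> ~ 2^{-n} f'' = 2 * 2^{-n}.
n, size = 3, 9
s = (np.arange(size) / 2.0 ** n) ** 2
interior = (single_scale_block(n, size) @ s)[1:-1]
print(interior)  # every interior entry is 2 * 2^{-3} = 0.25
```

The 2^{n(l-1)} prefactor is the whole content of Eqn. (A.7): a single base filter serves every scale, rescaled by a power of two.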
> with(LinearAlgebra);
> derivVal := 2; eigval := 2^(1-derivVal);

# 2nd Order Interpolating Wavelets:
> size := 3; h := Vector(size, [1/2, 1, 1/2]);
# 4th Order Interpolating Wavelets:
> #size := 7; h := Vector(size, [-1/16, 0, 9/16, 1, 9/16, 0, -1/16]);
# 6th Order Interpolating Wavelets:
> #size := 11; h := Vector(size, [3/256, 0, -25/256, 0, 75/128, 1,
>                                 75/128, 0, -25/256, 0, 3/256]);

# Calculate the derivative operator filter:
> asize := 2*size-1; zero := size; A := Matrix(asize, asize);
> for i to asize do
    for j to asize do
      A[i, j] := 0;
      for nu to size do
        for mu to size do
          if (j-zero) = 2*(i-zero) - (nu-zero) + (mu-zero) then
            A[i,j] := A[i,j] + h[nu]*h[mu];
          end if
        end do
      end do
    end do
  end do;
> print(A);

> C := convert(evalm(A-eigval*IdentityMatrix(asize)), Matrix);
> det := Determinant(C);
> if det = 0 then
    NullSpace(C);
    unnormalizedA := %[1]
  end if;
> b := add(unnormalizedA[i]*i^derivVal, i = 1 .. asize)/factorial(derivVal);
> a := evalm(unnormalizedA/b);
# Correctly normalized answer is in "a".

Figure 28: Maple 10 code for generating operator base filters.
By simply replacing indices in Eqn. (A.7), we can derive the following expression which
will be useful for computing the remaining filters:
    \langle \phi^{n+1}_{2i+\nu} | \partial^{(l)}_x | \phi^{n+1}_{2j+\mu} \rangle = 2^{(n+1)(l-1)}\,a^{(l)}_{2(j-i)+\mu-\nu}.  (A.8)
Using this equation and Eqns. (4.3–4.4), we can quickly derive the other filters. A^{(l)}_{SD(n)}
is found to be

    A^{(l)}_{SD(n),i,j} = \langle \phi^n_i | \partial^{(l)}_x | \psi^n_j \rangle,
                        = \sum_{\nu,\mu} h_\nu g_\mu \langle \phi^{n+1}_{2i+\nu} | \partial^{(l)}_x | \phi^{n+1}_{2j+\mu} \rangle,
    A^{(l)}_{SD(n),i,j} = 2^{(n+1)(l-1)} \sum_{\nu,\mu} h_\nu g_\mu\, a^{(l)}_{2(j-i)+\mu-\nu}.  (A.9)
In a similar fashion, A^{(l)}_{DS(n)} and A^{(l)}_{DD(n)} are given as

    A^{(l)}_{DS(n),i,j} = 2^{(n+1)(l-1)} \sum_{\nu,\mu} g_\nu h_\mu\, a^{(l)}_{2(j-i)+\mu-\nu},  (A.10)

    A^{(l)}_{DD(n),i,j} = 2^{(n+1)(l-1)} \sum_{\nu,\mu} g_\nu g_\mu\, a^{(l)}_{2(j-i)+\mu-\nu}.  (A.11)
From these general derivative operator filters, the Laplacian filter can be found by
substituting l = 2 and the overlap operator is given by l = 0.
APPENDIX B
DERIVATION OF WAVELET TRANSFORM IDENTITIES
The goal of this appendix is to derive the identities:

    \tilde{F}^\dagger F = 1, \qquad \tilde{F}^\dagger = B,  (4.31)
    \tilde{B}^\dagger B = 1, \qquad \tilde{B}^\dagger = F.  (4.32)
Beginning with the expression for the dual forward transform given in Eqns. (4.26–
4.27):

    s^k_i = \sum_j h_j\, s^{k+1}_{2i+j},  (4.26)
    d^k_i = \sum_j g_j\, s^{k+1}_{2i+j},  (4.27)

a change of dummy index variables results in the equivalent expressions:

    s^k_i = \sum_q h_{q-2i}\, s^{k+1}_q,  (B.1)
    d^k_i = \sum_q g_{q-2i}\, s^{k+1}_q.  (B.2)
This form of the equation reveals the matrix structure of the dual forward transform to
be

    \tilde{F}^k_{a,b} = \begin{cases} h_{b-2(a/2)}, & a \text{ even}, \\ g_{b-2((a-1)/2)}, & a \text{ odd}. \end{cases}  (B.3)
Separating the even and odd rows of the matrix into submatrices results in

    \tilde{F}^k_{a,b} = \begin{pmatrix} h_{b-a} \\ g_{b-a+1} \end{pmatrix},  (B.4)
and the transpose of \tilde{F} is

    (\tilde{F}^k_{a,b})^\dagger = \begin{pmatrix} h_{a-b} & g_{a-b+1} \end{pmatrix},  (B.5)

with even and odd columns now separated into submatrices for h and g.
Now consider the backward transform B. From Eqn. (4.28), an alternate expression
equivalent to Eqns. (4.29–4.30) can be derived by a different choice of dummy variables:

    s^{k+1}_j = \sum_\nu \left( s^k_\nu\, h_{j-2\nu} + d^k_\nu\, g_{j-2\nu} \right).  (B.6)

Again, because the s and d coefficients are interleaved in the vectors, the equivalent matrix
form of this expression is given by

    B^k_{a,b} = \begin{cases} h_{a-2(b/2)}, & b \text{ even}, \\ g_{a-2((b-1)/2)}, & b \text{ odd}. \end{cases}  (B.7)
Separating the even and odd columns into submatrices for h and g changes this expression
into

    B^k_{a,b} = \begin{pmatrix} h_{a-b} & g_{a-b+1} \end{pmatrix},  (B.8)

which can be seen to be equivalent to Eqn. (B.5), showing that \tilde{F}^\dagger = B. Similar equations
can be derived to show that F^\dagger = \tilde{B} by substituting \tilde{h} and \tilde{g} for h and g in the expressions
above.
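The block bookkeeping above is easy to get wrong, so here is a small numerical sketch (added for illustration, independent of the thesis code) that builds the dual forward matrix from Eqn. (B.3) and the backward matrix from Eqn. (B.7) for an arbitrary filter pair and confirms the transpose identity; the function names and the random stand-in filters are assumptions of the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
# Arbitrary short filters standing in for h and g; any values work,
# because the identity is pure index bookkeeping.
h = dict(zip(range(-2, 3), rng.standard_normal(5)))
g = dict(zip(range(-1, 2), rng.standard_normal(3)))

def dual_forward(N):
    # F~^k_{a,b} = h_{b-2(a/2)} (a even), g_{b-2((a-1)/2)} (a odd)  (B.3)
    F = np.zeros((N, N))
    for a in range(N):
        filt, shift = (h, a) if a % 2 == 0 else (g, a - 1)
        for b in range(N):
            F[a, b] = filt.get(b - shift, 0.0)
    return F

def backward(N):
    # B^k_{a,b} = h_{a-2(b/2)} (b even), g_{a-2((b-1)/2)} (b odd)  (B.7)
    B = np.zeros((N, N))
    for b in range(N):
        filt, shift = (h, b) if b % 2 == 0 else (g, b - 1)
        for a in range(N):
            B[a, b] = filt.get(a - shift, 0.0)
    return B

# The two matrices are exact transposes of each other: F~^dagger = B.
assert np.array_equal(dual_forward(8).T, backward(8))
```

Because the equality is entry-by-entry identical index arithmetic, `array_equal` (exact comparison) holds, not merely `allclose`.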
APPENDIX C
INHERENT PRECONDITIONER IN A BIORTHOGONAL BASIS
Examining the relative scaling caused by the standard rescaling of the basis for the
biorthogonal case involves keeping track of the individual components of the linear algebraic
Eqn. (2.7):

    L\vec{u} = O\vec{\rho},  (2.7)

with respect to the different scaling factors from the orthonormal-type scaling of
Eqns. (4.42–4.43):

    \phi^k_i(x) = \tilde{\phi}^k_i(x) = \sqrt{2^k}\,\phi(2^k x - i),  (4.42)
    \psi^k_i(x) = \tilde{\psi}^k_i(x) = \sqrt{2^k}\,\psi(2^k x - i),  (4.43)

and the traditional biorthogonal scale factors of Eqns. (3.1–3.2, 4.7–4.8):

    \phi^k_i(x) = \phi(2^k x - i),  (3.1)
    \psi^k_i(x) = \psi(2^k x - i),  (3.2)
    \tilde{\phi}^k_i(x) = 2^k\,\tilde{\phi}(2^k x - i),  (4.7)
    \tilde{\psi}^k_i(x) = 2^k\,\tilde{\psi}(2^k x - i).  (4.8)
The scaling function components of any vector |f\rangle are given as s^k_i = \langle\tilde{\phi}^k_i|f\rangle. Therefore,
the change in the components with respect to the difference in scaling is given by

    \frac{s^k_i\,(\text{biortho scaling})}{s^k_i\,(\text{ortho scaling})}
    = \frac{\langle\tilde{\phi}^k_i|f\rangle\,(\text{biortho scaling})}{\langle\tilde{\phi}^k_i|f\rangle\,(\text{ortho scaling})}
    = \frac{2^k}{\sqrt{2^k}} = \sqrt{2^k},  (C.1)
with exactly the same result for the wavelet-style coefficients. Defining the diagonal
matrix:

    P^T_{i,j} = \delta_{i,j}\,\sqrt{2^k}, \quad \text{where } k \text{ is the level number},  (C.2)
the factor between types of wavelet expansions is given by

    \vec{u}\,(\text{biortho scaling}) = P^T\,\vec{u}\,(\text{ortho scaling}).  (C.3)
For the operators, the scaling factors are found by

    \frac{\langle\phi^k_i|\phi^k_j\rangle\,(\text{biortho scaling})}{\langle\phi^k_i|\phi^k_j\rangle\,(\text{ortho scaling})}
    = \frac{1}{(\sqrt{2^k})^2} = \left(\sqrt{2^k}\right)^{-2}.  (C.4)
In terms of Eqn. (C.2), this means that

    O\,(\text{biortho scaling}) = P^{-1}\,O\,(\text{ortho scaling})\,P^{-T}.  (C.5)

The same conversion also applies to the Laplacian matrix L. Using Eqns. (C.3) and (C.5),
we can see that the biorthogonal case of Eqn. (2.7) corresponds to

    P^{-1}\,L\,(\text{ortho scaling})\,P^{-T} P^T\,\vec{u}\,(\text{ortho scaling}) = P^{-1}\,O\,(\text{ortho scaling})\,P^{-T} P^T\,\vec{\rho}\,(\text{ortho scaling}),  (C.6)

which is exactly the same form as the preconditioned linear equation from Eqn. (5.32). In
terms of M = P P^T, we have

    (M^{-1})^{k,l}_{i,j} = 2^{-k}\,\delta_{k,l}\,\delta_{i,j},  (C.7)
equivalent to Eqn. (5.33), which means that the preconditioning caused by the rescaling
of the biorthogonal bases is identical to the preconditioning scheme used in [28].
Note that this applies specifically to the one-dimensional case. For d dimensions, the
tensor product nature of the wavelet bases duplicates the scaling factors in each direction
resulting in the more general expression:
    (M^{-1})^{k,l}_{i,j} = 2^{-d \cdot k}\,\delta_{k,l}\,\delta_{i,j},  (C.8)
which is equivalent to the scheme of [28] only in the one-dimensional case.
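In an implementation, the inherent preconditioner of Eqn. (C.8) is just a diagonal matrix built from the level numbers. A minimal sketch (added for illustration; the level-by-level storage order and the function name are assumptions, not the thesis's data layout):

```python
import numpy as np

def inherent_preconditioner_inv(level_sizes, d=1):
    # (M^{-1})^{k,l}_{i,j} = 2^{-d*k} delta_{k,l} delta_{i,j}   (Eqn. C.8);
    # returns the diagonal entries, assuming the coefficient vector stores
    # level k = 0, 1, 2, ... contiguously.
    return np.concatenate([np.full(size, 2.0 ** (-d * k))
                           for k, size in enumerate(level_sizes)])

# Hypothetical 1D layout with three levels of sizes 2, 2, and 4:
diag = inherent_preconditioner_inv([2, 2, 4], d=1)
print(diag)  # entries 1, 1, 0.5, 0.5, then 0.25 repeated four times
```

Applying the preconditioner is then an elementwise multiply by `diag`, which costs O(N) per conjugate gradient iteration.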
APPENDIX D
A BRIEF PRIMER ON DIRAC NOTATION
Dirac notation allows for a more compact representation of many mathematical ex-
pressions which are common when working with arbitrary basis sets.
• |ψ〉 is called ket ψ. It is a basis-independent representation of a vector.
• 〈ψ| is called bra ψ. It is another basis-independent representation of a vector, the
adjoint to the ket |ψ〉.
A bra times a ket (called a braket) is an inner product, which is defined according to
whatever basis set the vectors are expanded in. In the case of the coordinate-space basis \vec{x}
(used exclusively in this work):

    \langle\psi|\,|\psi\rangle = \langle\psi|\psi\rangle = \int_{-\infty}^{\infty} \psi^*(\vec{x})\,\psi(\vec{x})\,d\vec{x}.
The relation between bras and kets is that \langle\psi| = |\psi\rangle^\dagger, where \dagger indicates the Hermitian
adjoint, which translates into transpose and conjugate. In other words,

    \langle\psi| = |\psi\rangle^\dagger = (|\psi\rangle^*)^T,
    |\psi\rangle = \langle\psi|^\dagger = (\langle\psi|^*)^T.

This ensures that the inner product of a vector with itself always results in a real number.
Bras and kets always act like vectors and their transposes. As an example, for real
vectors \vec{x} and \vec{y}:

    \vec{x}^T L \vec{y} = \vec{x}^\dagger L \vec{y} = \langle x|L|y\rangle,
    \left(\vec{x}^T\vec{y}\right)^T = \vec{y}^T\vec{x} = (\langle x|y\rangle)^\dagger = \langle y|x\rangle^* = \langle y|x\rangle, \quad \text{since } \vec{x} \text{ and } \vec{y} \text{ are real}.
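The correspondence can be checked directly with ordinary arrays; a minimal sketch (the 2-vectors and matrix here are hypothetical, chosen only for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])
L = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# <x|L|y> is the ordinary matrix sandwich x^T L y for real vectors.
braket = x @ L @ y
assert braket == x.T @ L @ y

# <x|y> = <y|x> for real vectors, since conjugation does nothing.
assert x @ y == y @ x

print(braket)  # 5.0
```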
For a more thorough treatment of the notation, see Chapter 1 of Shankar [23].
Note that all of the functions and operators in this thesis are real functions with no
imaginary components, so complex conjugation is irrelevant and will be ignored (treated
as identity) throughout. This means that † will be treated effectively the same as T, and a
reader who wishes to generalize the results to complex-valued functions will have to trace
all of the conjugations back through the derivations and add them back in.
APPENDIX E
INTERPOLATING WAVELETS ON AN INTERVAL
For linear interpolating wavelets, the only change needed is on the last D coefficient
prediction, where there is no S coefficient beyond it. For this last point, simply take
the last two S coefficients, call them S_a and S_b. Both of these are to the left of the D
coefficient in question, but they can be used to make a linear extrapolation to predict the
D coefficient as S_b + \frac{1}{2}(S_b - S_a) = \frac{3}{2}S_b - \frac{1}{2}S_a. For the last step, when there is only one S
coefficient available, no linear interpolation is possible, so any prediction is equally valid.
For simplicity, we can make a prediction of zero slope when this happens. The resulting
prediction filters are shown in Tables 5 and 6. In this case, our simple line example
reduces to two points that lie on the line, which is the optimal compression of the
data. See Sweldens for more details on second generation wavelets [26, 27]. This scheme
was not used in this study because it complicates the definitions of operators by changing
the shape of the wavelet and scaling functions along the boundaries.
Table 5: Linear prediction filter on boundary

Offset    -1      0
Value     -1/2    3/2

Note: This is the filter for the last D coefficient, when S_{-1} exists.
Table 6: Linear prediction filter on boundary, no S_{-1}

Offset    0
Value     1

Note: This is the filter for the last D coefficient, when S_{-1} does not exist.
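A sketch of one forward prediction step using the boundary filters above (added for illustration; the even-length sample layout and the `boundary_details` helper are assumptions of this sketch, not the thesis code). Samples of a straight line produce identically zero detail coefficients, which is the optimal compression described above:

```python
def boundary_details(s):
    # One prediction step for linear interpolating wavelets on an interval:
    # S coefficients sit at even indices, D coefficients at odd indices.
    d = []
    for j in range(1, len(s), 2):
        if j + 1 < len(s):
            pred = 0.5 * (s[j - 1] + s[j + 1])       # interior linear filter
        elif j >= 3:
            pred = 1.5 * s[j - 1] - 0.5 * s[j - 3]   # Table 5: (3/2)S_b - (1/2)S_a
        else:
            pred = s[j - 1]                          # Table 6: zero-slope guess
        d.append(s[j] - pred)
    return d

line = [2.0 * j + 1.0 for j in range(8)]  # samples of a straight line
print(boundary_details(line))  # [0.0, 0.0, 0.0, 0.0]
```

The last detail coefficient uses the Table 5 extrapolation, yet it still vanishes on a line, because linear extrapolation from the last two S coefficients is exact for linear data.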