Multilevel Methods for $p$-Adaptive Finite Element Analysis of Electromagnetic Scattering

Copyright (c) 2013 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing [email protected].

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION

Abstract—In p-adaptive finite element analysis, the large,

sparse matrix that arises can be block structured according to the

hierarchical level of the unknowns. A multilevel preconditioner

for the matrix is a V-cycle that starts by applying Gauss-Seidel to

the highest level, then the next level down, and so on. On the

other side of the V, Gauss-Seidel is applied in the reverse order.

At the bottom of the V is the lowest order system, which typically

is solved exactly with a direct solver. However, for a complex

geometry even the lowest order system may be too large for

direct factorization. Here an alternative is proposed: to continue

the V-cycle downwards, first into a set of auxiliary, node-based

spaces, then through a series of progressively smaller matrices

generated by an algebraic multigrid method. The smallest matrix

is solved by factorization.

The method is applied to p-adaptive analysis of a five-

resonator iris filter, a split-ring resonator loaded waveguide, a

“buckyball” metallic frame surrounding a conducting sphere,

and a noncommensurate frequency selective surface. Tetrahedral

elements up to 4th

order are used. The largest matrix has over 12

million rows and 0.6 billion nonzero entries.

Index Terms—Finite element methods, multigrid methods,

microwave propagation, scattering parameters.

I. INTRODUCTION

High order finite elements can be very effective in solving

scattering problems, particularly when different orders can be

used in different regions. P-adaption involves repeated

solution of the problem, starting at low order and

automatically increasing the order in the regions where it is

needed. It is a powerful technique in its own right [1] [2] and

even more effective in combination with h-adaption [3] [4].

Like all finite element methods (FEMs), as the problem size

increases the dominant computational cost of p-adaption

becomes the solution of the large, sparse matrix problem to

which the electromagnetic field problem is reduced. Given the

demands for solving ever larger and more complex problems

with greater accuracy, it is particularly important to consider

efficient techniques for solving the matrix equation. The most

efficient methods tend to be those that take into account as

much as possible the nature of the matrix. A salient feature of

the matrix in p-adaption is that it is constructed in a series of

levels, corresponding to the orders of the finite elements. This

feature was exploited in [5][6][7] for two element orders with

a method called p-type multiplicative Schwarz (pMUS). It

was extended to higher orders in [8][9]. In none of this work

were mixed-order meshes considered; rather, the element

order was assumed to be uniform throughout the mesh.

Manuscript received February 1, 2013. This work was supported by the

Natural Sciences and Engineering Research Council of Canada.

The authors are with the Department of Electrical and Computer

Engineering, McGill University, Montreal, QC H3A 0E9, Canada. (e-mail:

[email protected], [email protected]).

The multilevel approach adopted in pMUS works very well,

but stops at the lowest order (the Whitney element [10]). The

matrix problem that corresponds to all elements being of the

lowest order is solved either exactly by a direct solver or

approximately by application of incomplete Choleski factors

[7]. For large, geometrically complex problems, the number of

tetrahedra is high and the Whitney problem itself can be

computationally challenging. On the other hand, there has

been progress in applying multilevel methods to the Whitney

mesh. Geometric multigrid methods (GMGs), though

powerful, require special, nested grids of tetrahedra [11].

Algebraic multigrid methods (AMGs) are not constrained in

this way. AMGs that apply directly to the Whitney element

have been devised [12][13][14], but a promising alternative,

adopted here, is first to transfer the Whitney field onto

auxiliary spaces, on each of which a scalar AMG can be

applied. This is the Auxiliary Space Preconditioning (ASP)

method [15]. Its application to wave scattering is reported in

[16].

We describe in the next section a multilevel algorithm that

solves the matrix problem for a large mesh consisting of

elements of different orders. It uses a Krylov subspace solver

with a preconditioner that applies a multilevel strategy all the

way from the highest order present, down through the ASP

projection to a series of scalar AMG coarsenings.

Since p-adaption requires the solution of a number of such

matrix problems, consideration was also given to the

possibility of obtaining information from the first solve, when

all the elements are Whitney elements, and using this to

accelerate the solution at subsequent adaptive steps. This is

discussed in Section III. Results for the new algorithms are

presented in Section IV, along with comparison with some

standard iterative methods.

II. PMUS AND AUXILIARY SPACE PRECONDITIONING

The problem considered is finding the phasor electric field

inside a given volume of space, at a given frequency, in the

presence of materials with arbitrary permittivity and

permeability (symmetric, complex tensors in general). On the

boundary of the volume a number of different boundary

conditions may be specified: perfect electric conductor (PEC);

absorbing boundary conditions (ABCs) to represent waves

propagating outwards into free space; and port boundary


A. Aghabarati, J. P. Webb, Member, IEEE



conditions where waveguides or transmission lines connect to

the volume. In this way, a large range of wave scattering

problems may be handled, both free-space and enclosed. The

mathematical details are summarized in [16].

In the present case, the finite elements used are the

tetrahedral, hierarchical, incomplete-order, vector elements

described in [8]. The order 1 elements are the well-known

Whitney elements [10], with 6 unknowns, associated with 6

basis functions, which will be referred to as “first order”. The

element of order 2 contains these 6 basis functions, plus

another 14, which are called “second order”. The elements of

orders 3 and 4 add 25 “third order” and 39 “fourth order” basis

functions, respectively. Since the elements are hierarchical,

different orders can be mixed together in the same mesh, as

will typically happen after the first iteration of p-adaption.

After discretization, the problem reduces to a matrix

equation $Ax = b$, where $A$ is complex and symmetric and $x$ is

a column vector of the unknowns, representing the electric

field. We number the unknowns so that those associated with

first order basis functions come first, then those of second

order, and so on. (Note that, for example, a first order

unknown may belong only to elements of order higher than 1.

A Whitney edge function may be for an edge that is shared

only by elements of order 2, since these elements contain both

first and second order basis functions.) With this numbering,

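the matrix can be partitioned as follows:

$$A = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1p} \\ A_{21} & A_{22} & \cdots & A_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ A_{p1} & A_{p2} & \cdots & A_{pp} \end{bmatrix} \qquad (1)$$

where $p$ is the highest order present in the mesh. We define $A^{(k)}$ to be the square, upper left, submatrix of $A$ whose lower right block is $A_{kk}$. We also define an associated series of trivial, rectangular, prolongation matrices, $P_k$, by the equation

$$A^{(k-1)} = (P_k)^T A^{(k)} P_k \quad \text{for } k = 2, \ldots, p. \qquad (2)$$

$(P_k)^T$ “coarsens” the representation of the field by dropping the entries corresponding to order $k$; $P_k$ “prolongates” the representation by adding back these entries, setting their values to zero.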

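The function $p\mathrm{MUS}(r, k)$ returns an approximation to $(A^{(k)})^{-1} r$, where $r$ is a vector of the appropriate length. It involves primarily the application of forward and backward Gauss-Seidel (GS):

$$\mathrm{GS}_F(M) = (D + L)^{-1}; \qquad \mathrm{GS}_B(M) = (D + U)^{-1} \qquad (3)$$

where $D$, $L$ and $U$ are the diagonal, strict lower triangle and strict upper triangle, respectively, of square matrix $M$. $\mathrm{GS}_F(M)$ and $\mathrm{GS}_B(M)$ are inexpensive approximations to $M^{-1}$ and are applied in a “V-cycle” to the diagonal blocks of (1), starting with the highest block and proceeding down to the lowest, then back up again to the highest. The algorithm $p\mathrm{MUS}(r, k)$ is given recursively as follows:

$x = p\mathrm{MUS}(r, k)$
If $k = 1$:
    Solve order 1: $x \approx (A^{(1)})^{-1} r$
Else:
    Backward GS: $x_k = \mathrm{GS}_B(A_{kk})\, r_k$, with all other blocks of $x$ set to zero
    Residual update: $r \leftarrow r - A^{(k)} x$
    Coarsen: $r_c = (P_k)^T r$
    Solve coarse: $x_c = p\mathrm{MUS}(r_c, k-1)$
    Prolongate: $x \leftarrow x + P_k x_c$
    Residual update: $r \leftarrow r - A^{(k)} P_k x_c$
    Forward GS: $x_k \leftarrow x_k + \mathrm{GS}_F(A_{kk})\, r_k$

The vectors $x_k$ and $r_k$ are blocks of $x$ and $r$, respectively, corresponding to the partitions in (1).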
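The recursive cycle described above can be illustrated with a small dense sketch. It is only an illustration under stated assumptions: the names `pmus`, `gs_backward`, `gs_forward` and the list `sizes` (number of unknowns per order) are hypothetical, the matrices are dense NumPy arrays rather than large sparse ones, and the lowest level is solved directly. Coarsening is done by dropping the highest-order block of the residual and prolongation by padding with zeros, as the text describes.

```python
import numpy as np
from scipy.linalg import solve_triangular


def gs_backward(M, r):
    """Apply (D + U)^{-1} to r, where D + U is the upper triangle of M."""
    return solve_triangular(np.triu(M), r, lower=False)


def gs_forward(M, r):
    """Apply (D + L)^{-1} to r, where D + L is the lower triangle of M."""
    return solve_triangular(np.tril(M), r, lower=True)


def pmus(A, r, sizes):
    """One V-cycle on the leading block of A covering the orders in `sizes`.

    sizes[j] is the (hypothetical) number of unknowns of order j + 1.
    Returns an approximation to the inverse of that leading block times r.
    """
    n = sum(sizes)
    Ak = A[:n, :n]
    if len(sizes) == 1:
        return np.linalg.solve(Ak, r)        # lowest order: direct solve
    m = n - sizes[-1]                         # first index of the highest block
    Akk = Ak[m:, m:]
    x = np.zeros(n, dtype=complex)
    x[m:] = gs_backward(Akk, r[m:])           # backward GS on the highest block
    res = r - Ak @ x                          # residual update
    x[:m] += pmus(A, res[:m], sizes[:-1])     # coarsen, solve coarse, prolongate
    res = r - Ak @ x                          # residual update
    x[m:] += gs_forward(Akk, res[m:])         # forward GS on the highest block
    return x
```

Because the backward sweep on the way down is the plain transpose of the forward sweep on the way up, the operator applied by `pmus` is itself complex symmetric whenever `A` is, which is what a symmetric Krylov method needs from its preconditioner.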

In conventional pMUS, the operation “ ( )” is either a

direct solution of , or multiplication by the inverse of

the incomplete Choleski factors of [7]. Here, instead, we

continue to apply a multilevel approach. The first step is to

transfer the Whitney field represented by to four auxiliary

spaces: the space of piecewise linear, scalar functions on the

same mesh and the three components of the space of nodal

vector functions. We will denote the transfer matrices as

( ) , where takes the values , , and . Details are in

[16], where is called . The algorithm ( ) is,

then:

( ) ( ) Backward GS: (

) ;

Residual update:

Initialize:

For each of 4 auxiliary spaces, :

Transfer: ( )

Solve: ( ) Transfer back:

;

Residual update:

Forward GS: ( ) ;

The function ( ) returns an approximation to

( ) , where

represents in auxiliary space . For

, it is defined as

( )

(4)

For , is obtained not directly from , but from

the version of that results when is replaced by (

) during matrix assembly, where is a real number

[16][17][18]. We call this matrix . Then:

( )

for . (5)

This shift in the frequency improves the quality of the

preconditioner. For all the results below, .

The approximation to ( ) is obtained by a multilevel

process similar to pMUS, only using a series of prolongation

matrices, , obtained by an algebraic coarsening process [19].

Each maps a vector representing the field in

to a shorter



vector representing a coarser version of the same field. From

we can generate a series of “coarse” matrices:

(

)

for (6)

where is the chosen number of AMG coarse levels. The

function ( ) returns an approximation to

( ) by applying GS and successive coarsenings and

prolongations in another V-cycle. It is given recursively as

follows:

( ) ( )

If :

Solve exactly: ( )

Else:

Backward GS: ( ) ;

Residual update:

Coarsen: ( )

Solve coarse: ( ) Prolongate:

;

Residual update:

Forward GS: ( ) ;

At , the matrix problem is small enough to be solved

exactly by a direct method.
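The scalar AMG part of the cycle can be sketched in the same way. The sketch below is an assumption-laden illustration, not the coarsening of [19]: `pairwise_aggregation` is a hypothetical stand-in that simply merges neighbouring unknowns in pairs, and all matrices are dense. What it does show is the structure described in the text: each coarse matrix is formed as a Galerkin triple product with the prolongation matrix, backward GS is applied on the way down, forward GS on the way up, and the smallest matrix is solved directly.

```python
import numpy as np
from scipy.linalg import solve_triangular


def pairwise_aggregation(n):
    """Hypothetical prolongation: unknowns 2i and 2i + 1 share coarse unknown i."""
    P = np.zeros((n, (n + 1) // 2))
    P[np.arange(n), np.arange(n) // 2] = 1.0
    return P


def build_hierarchy(A, levels):
    """Return the matrices [A0, ..., AL] and prolongations [P0, ..., P(L-1)]."""
    mats, prols = [A], []
    for _ in range(levels):
        P = pairwise_aggregation(mats[-1].shape[0])
        prols.append(P)
        mats.append(P.T @ mats[-1] @ P)       # Galerkin coarse matrix
    return mats, prols


def amg_v_cycle(mats, prols, r, level=0):
    """Approximate inv(mats[level]) @ r with one V-cycle."""
    A = mats[level]
    if level == len(prols):
        return np.linalg.solve(A, r)          # smallest matrix: solve exactly
    P = prols[level]
    x = solve_triangular(np.triu(A), r, lower=False)         # backward GS
    res = r - A @ x                                          # residual update
    x = x + P @ amg_v_cycle(mats, prols, P.T @ res, level + 1)
    res = r - A @ x                                          # residual update
    x = x + solve_triangular(np.tril(A), res, lower=True)    # forward GS
    return x
```

A practical implementation would use sparse storage and a coarsening driven by the strength of the matrix couplings; the recursion itself would keep the same shape.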

A call to 𝑝 ( 𝑝), then, initiates the overall V-

cycle shown in Fig. 1, during which GS is applied to the

sequence of matrices shown – backward GS on the downward

portion, and forward GS on the upward portion. The complete

V-cycle approximates ( ) , and constitutes a multilevel

preconditioner.

Since the most computationally intensive part of the

preconditioner is pMUS, it has been found beneficial to

execute the ASP/AMG part more than once in each cycle,

because this reduces the number of Krylov iterations at

relatively small additional cost. Hence the cycle shown in Fig.

2, which is called an extended W-cycle from its shape. Note

that the backward and forward GS has been arranged to

preserve the symmetry of the overall preconditioner. All of the

results below use this W-cycle version of 𝑝 ( 𝑝).

III. DEFLATION

The multilevel/AMG cycle described in the previous section

captures error components successively by construction of a

multilevel hierarchy. This hierarchy is defined by transfer

operators between different polynomial groups of basis

functions at higher levels and continued with virtual grid

transfer operators below the Whitney level. The cycle aims to

combine the smoothing properties of Gauss-Seidel with

transfer to the next lowest (coarser) level of the error

components invariant to relaxation. Errors that remain after

the smoother has been applied and that must be reduced at the

next level are called “algebraically smooth”. Algebraically

smooth components belong to the space of eigenvectors of

that have small eigenvalues and for rapid convergence,

accurate representation of them on the coarser level is needed.

The nature of the algebraically smooth errors and the near

null space of affect the performance of the multilevel cycle.

The frequency-shifting mentioned above reduces the effect of

this space on the preconditioner. Here deflation is also

employed to improve the convergence rate. The goal of

deflation is to remove from the system the eigenspace

corresponding to troublesome, algebraically smooth,

eigenvectors. Deflation permits the incorporation of existing

knowledge about the space of troublesome eigenvectors

available from a previous calculation, or a nearby problem.

In the context of Krylov subspace methods, deflation

techniques appear in the paper of Nicolaides [20] and a

comparable approach is proposed in [21] and [23]. Let

𝑝 ( ) be the Cholesky factorization of the

multilevel preconditioner and assume that the columns of the

matrix are the “deflation vectors” that approximately span

the space of slow converging modes for preconditioned

system , where , and . The deflated Lanczos algorithm [20] starts with initial

Fig. 1. The V-cycle version of 𝑝 ( 𝑝). A solid box around a matrix means that it is used in an application of backward GS; a dashed box means that it is used in an application of forward GS. The circle means an exact solution using this matrix. Dashed arrows imply a series of steps with decreasing (downward) or increasing (upward) matrix superscripts. [Diagram; stage labels: pMUS, ASP, scalar AMG.]

Fig. 2. The W-cycle version of 𝑝 ( 𝑝). [Diagram; stage labels: pMUS, ASP (repeated), scalar AMG.]



vector ‖ ‖ orthogonal to and builds a sequence of

vectors such that [21]:

{ } ; ‖ ‖ (7)

To obtain [ ] one can apply the standard

Lanczos procedure to the auxiliary matrix

(8)

where ( )

. If the matrix is singular, its

pseudo-inverse is considered. To satisfy , the residual

vector is associated with a special initial guess

( ) (9)

for some arbitrary . The resulting Krylov subspace at the

th step is denoted by ( ) and the algorithm seeks

an approximate solution ( ) by requiring

that the residual be orthogonal to ( ).

The change to the standard symmetric Lanczos algorithm is

minimal and can be summarized as follows:

1. Based on given , pre-compute ( ) and the

initial approximation vector using (8).

2. Construct a symmetric Lanczos space based on the

coefficient matrix instead of by modifying all the

matrix-vector products as

(10)

The resulting Lanczos vectors satisfy where

is a tridiagonal matrix.
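The two steps above can be made concrete with a short sketch. It assumes the form of deflation that is common in the deflated conjugate gradient literature, which may differ in detail from (8) to (10): the auxiliary operator is taken as B = A - A W (W^T A W)^{-1} W^T A, the initial guess is shifted so that W^T r0 = 0, and plain (unconjugated) transposes are used because the matrix is complex symmetric. The function names, and the symbols `A`, `W`, `b` and `x_prev`, are hypothetical, and W^T A W is assumed to be nonsingular.

```python
import numpy as np


def deflation_setup(A, W):
    """Return A @ W and the small matrix W^T A W (plain transpose)."""
    AW = A @ W
    return AW, W.T @ AW


def deflated_initial_guess(A, b, W, WtAW, x_prev):
    """Shift x_prev so that the new residual r0 satisfies W^T r0 = 0."""
    r_prev = b - A @ x_prev
    return x_prev + W @ np.linalg.solve(WtAW, W.T @ r_prev)


def deflated_matvec(A, W, AW, WtAW, v):
    """Apply B = A - A W (W^T A W)^{-1} W^T A to the vector v."""
    Av = A @ v
    return Av - AW @ np.linalg.solve(WtAW, W.T @ Av)


# Small complex symmetric example with three hypothetical deflation vectors.
rng = np.random.default_rng(1)
n = 12
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = G + G.T + 2 * n * np.eye(n)
W = rng.standard_normal((n, 3)) + 0j
b = rng.standard_normal(n) + 1j * rng.standard_normal(n)

AW, WtAW = deflation_setup(A, W)
x0 = deflated_initial_guess(A, b, W, WtAW, np.zeros(n, dtype=complex))
r0 = b - A @ x0
```

The only extra work per matrix-vector product is one multiplication by W^T, one small dense solve and one multiplication by A W, all of which are cheap when the number of deflation vectors is small.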

Depending on different ways of factorizing , distinct

deflated Krylov solvers can be developed in a straightforward

way for complex symmetric coefficient systems [24]. Here we

use the complex orthogonal conjugate gradient method

(COCG) [25], which is based on Cholesky factorization. We

remark that since the solution is approximated in the space

generated by and , inclusion of deflation vectors in the

space is beneficial even when contains only poor

approximations of slow-converging eigenvectors.
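For readers unfamiliar with COCG, the following is a minimal, unpreconditioned and undeflated sketch of the iteration. It is an illustration rather than the solver used for the results: the function name `cocg` and its parameters are hypothetical. The one feature that distinguishes it from ordinary conjugate gradients is that every inner product is the unconjugated bilinear form r^T r, which suits complex symmetric matrices. The stopping test uses the infinity norm of the residual relative to that of the right-hand side.

```python
import numpy as np


def cocg(A, b, tol=1e-10, max_iter=500):
    """Solve A x = b for complex symmetric A; return (x, iterations used)."""
    x = np.zeros_like(b, dtype=complex)
    r = b - A @ x
    p = r.copy()
    rho = r @ r                               # unconjugated: r^T r, not r^H r
    b_norm = np.linalg.norm(b, np.inf)
    for k in range(max_iter):
        if np.linalg.norm(r, np.inf) <= tol * b_norm:
            return x, k
        Ap = A @ p
        alpha = rho / (p @ Ap)                # unconjugated p^T A p
        x = x + alpha * p
        r = r - alpha * Ap
        rho_new = r @ r
        p = r + (rho_new / rho) * p           # update of the descent direction
        rho = rho_new
    return x, max_iter
```

Since r^T r can vanish for a nonzero complex vector, the iteration can break down in principle; production codes usually guard against that case.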

A. Determination of Deflation Vectors

For the deflated Krylov solver, the number of desired

eigenvectors must be chosen, along with which eigenvalues

are to be targeted. In particular, small eigenvalues related to the near-

null space and large outstanding eigenvalues are the most

important ones for deflation and for accelerating the

convergence. There are distinct ways to exploit some

knowledge about approximate eigenvectors [21][23]. Here we

assume that important information is obtained during the

solution of the problem at the first adaptive step, when all

elements are at their lowest order (𝑝 = 1), and not from a priori

knowledge about the problem. For this first step, the solution

is obtained by a different Krylov subspace solver, MINRES

[26]. Standard MINRES is for Hermitian matrices; in the

present case, the matrix is complex and symmetric. Standard

MINRES is adapted to this case by replacing every instance of

the Hermitian operator (i.e., complex-conjugate transpose) by

a simple transpose [24]. Determination of approximate

eigenvalues to be applied to subsequent adaptive steps is

achieved by the following harmonic projection method

[21][22].

Given subspace and tridiagonal matrix from the

MINRES iterations, the method computes eigenpairs ( ) by solving the generalized eigenproblem

(11)

where

(12)

(

) (13)

Defining [ ], the problematic modes can then

be expressed in the form . Equation (11) can be

solved at a low cost by any technique for dense generalized

eigenvalue problems. In our experiments, the QZ algorithm is

used.
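A dense generalized eigenproblem of this kind can be solved in a few lines. The sketch below is a generic illustration: the matrix names `G` and `F`, which side of the problem each one sits on, and the rule of keeping the eigenvalues of smallest magnitude are all assumptions, not taken from (11) to (13). `scipy.linalg.eig` accepts a second matrix and, for the generalized problem, relies on a LAPACK driver based on the QZ method.

```python
import numpy as np
from scipy.linalg import eig


def smallest_generalized_eigenpairs(G, F, k):
    """Solve G y = theta F y and keep the k eigenvalues of smallest magnitude."""
    theta, Y = eig(G, F)                      # dense generalized eigenproblem
    order = np.argsort(np.abs(theta))[:k]
    return theta[order], Y[:, order]
```

Because the matrices involved have a dimension equal to the number of Krylov iterations of the first solve, rather than the number of unknowns, a dense solver like this is inexpensive compared with the preconditioner.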

B. Deflated Krylov for 𝑝-Adaption

Consider now the matrix equation at the th adaptive step, , with unknowns. Each matrix is of the form (1), with the same upper left block

. For , the equation is solved by deflated COCG with

deflation vectors obtained from as follows:

[ ]

(14)

In summary, the procedure for solving 𝑝-type hierarchical

systems using a deflated Krylov solver

preconditioned with a multilevel cycle is as follows:

1. First-order solution: Set = and apply iterations of

MINRES, preconditioned with the ASP/AMG algorithm

described in the previous section, to solve . This computes the matrix that has the Lanczos vectors,

and the tridiagonal matrix .

2. Select , the number of eigenvectors to be used.

3. Eigenvector computation: Find and using (12) and

(13). Solve (11) for eigenvalues and let be the matrix

containing the corresponding vectors. Set and

compute (( ) )

.

4. Adaption iterations: for

a. Initialize =0 and then transfer into it values from

, with appropriate indexing. b. Set according to (14).

c. Solve by preconditioned deflated COCG

with and from initial guess

( ) ( ).

In deflated COCG, the regular update of the descent

direction vector 𝑝 𝑝 is changed to 𝑝

𝑝 ( ) where is the

preconditioned residual vector: 𝑝 ( ). We

remark that assuming , the extra memory required for

storing and the non-zeros of is moderate. The



computational cost of deflation is limited to finding just the

first values of and then multiplying by ( ) .

The cost is much less than that of the preconditioning and the

gain in decreasing the number of iterations makes it well

worthwhile.

IV. NUMERICAL RESULTS

In this section, we study the performance of the proposed

approach via several numerical experiments. In each example,

we adaptively increase the order of the elements and solve the

matrix problem at each adaptive step by either MINRES (for

the first step) or COCG (for subsequent steps), preconditioned

with variants of the W-cycle described in Section II, with or

without deflation. Depending on the method used for

approximating the solution to the Whitney system, , the preconditioners tested are:

- pMUS[ASP]: the proposed approach, i.e., ASP and AMG.

- pMUS[AV]: one backward GS step followed by one

forward GS step, both applied to the vector-scalar

potential (A-V) version of the Whitney system, in

compact form [27][16]. This amounts to a SSOR

preconditioner applied to the compact A-V version of the

entire matrix.

- pMUS[iLU]: sparse incomplete factorization of [28].

- pMUS[LU]: complete LU factorization of using

UMFPACK [29].

The Krylov iterations are terminated when the infinity

norm of the residual is reduced by a factor of . In the

examples below, this parameter is set to unless

otherwise stated. All simulations are done using Matlab [30]

and performed on a PC with a -bit, 4-core, Intel GHz

processor and GB of RAM. The geometries are modeled

and meshed with ElecNet [31].

For the first example, we compute the S-parameter of a

five-resonator iris filter [9][32]. The geometry of the two port

waveguide filter is depicted in Fig. 3 and its outer dimensions

are . The computational domain

is discretized with tetrahedra, resulting in a maximum

edge length of ⁄ at GHz. The number of DOFs is

increased from to in steps by the p-

adaption scheme of [2], increasing the polynomial order of

25% of the elements at each step. The Krylov termination

parameter, , for both MINRES and COCG, is set to . For pMUS[ASP], 2 AMG levels are used (i.e., ).

Deflation is also considered. The number of deflation

eigenvectors, , is set to 50% of the number of MINRES

steps, , used to solve the first-order problem.

TABLE I
SOLUTION STATISTICS FOR THE ADAPTIVE S-PARAMETER COMPUTATION OF WAVEGUIDE IRIS FILTER AT 3.63 GHZ

Adaption step                   1        2        3        4        5        6
DOFs                       56,632  127,126  214,814  308,726  391,569  481,752

Preconditioner (+ Deflation)
Iterations
  pMUS[ASP]                    71       97      101      103      108      110
  pMUS[ASP]+Def[50%]           71       42       48       49       50       53
  pMUS[AV]                    197      207      207      203      203      204
  pMUS[iLU]                   594      636      639      604      687      623
  pMUS[LU]                      1       40       50       57       60       61
CPU time (s)
  pMUS[ASP]                  8.66     16.6     27.9     44.8     60.7     81.1
  pMUS[ASP]+Def[50%]         8.66     9.23     14.3     24.0     30.6     40.8
  pMUS[AV]                   14.4     20.6     40.4     72.7     97.4    131.3
  pMUS[iLU]                  84.1    188.0    245.9    338.9    459.7    523.6
  pMUS[LU]                   2.41     7.52     13.4     24.1     33.1     43.6
Memory (MB)
  pMUS[ASP]                    64      114      182      294      374      547
  pMUS[ASP]+Def[50%]           64      134      204      326      428      567
  pMUS[AV]                     57      116      168      287      382      520
  pMUS[iLU]                   205      289      316      363      511      635
  pMUS[LU]                    302      354      417      541      608      733

Fig. 3. Geometry of the iris waveguide filter.

Fig. 4. Convergence behaviour of error for the uniform and adaptive p-refinement. [Plot: abs(S11 - S11 ref) on a logarithmic axis versus number of DOFs; curves: adaptive p-refinement, uniform p-refinement. Memory labels on the data points: 64 MB, 476 MB, 1082 MB, 2784 MB; 134 MB, 204 MB, 326 MB, 428 MB, 567 MB.]

Fig. 5. Magnitude of scattering parameters versus frequency for the iris waveguide filter. [Plot: abs(S11) in dB versus frequency in GHz; curves: 1st Order (Uniform), 2nd Adaption Step, 4th Adaption Step, 6th Adaption Step, 4th Order (Uniform), Mode-Matching (Reiter et al.).]

The results in Table I show that at each adaption step pMUS[ASP] needs fewer iterations than pMUS[AV] and pMUS[iLU], and that incorporating deflation eigenvectors gives a significant further acceleration after the first step. During the 6 adaption steps, the cumulative CPU time related to the matrix solution phase for pMUS[ASP] with (without) deflation is 128 s (240 s), while it is 377 s, 1840 s and 124 s for pMUS[AV], pMUS[iLU] and pMUS[LU], respectively. Note that an inappropriate treatment of the lowest order block (such as iLU) can worsen the convergence rate at higher orders and increasingly add to the overall solution cost. In addition, the solution time for the auxiliary space preconditioning plus deflation approach is very close to that of the pMUS[LU] method, which has the highest memory requirement due to the complete factorization.
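The cumulative CPU times quoted above follow directly from the per-step times in Table I. The short check below simply sums the six entries for each preconditioner; the dictionary name `cpu_seconds` is hypothetical, and the value 459.7 assumes that the entry printed as "459. 7" in the table is a single number.

```python
# Per-step CPU times in seconds, copied from Table I (adaption steps 1 to 6).
cpu_seconds = {
    "pMUS[ASP]": [8.66, 16.6, 27.9, 44.8, 60.7, 81.1],
    "pMUS[ASP]+Def[50%]": [8.66, 9.23, 14.3, 24.0, 30.6, 40.8],
    "pMUS[AV]": [14.4, 20.6, 40.4, 72.7, 97.4, 131.3],
    "pMUS[iLU]": [84.1, 188.0, 245.9, 338.9, 459.7, 523.6],
    "pMUS[LU]": [2.41, 7.52, 13.4, 24.1, 33.1, 43.6],
}

# Cumulative time over all six steps, rounded to the nearest second.
cumulative = {name: round(sum(times)) for name, times in cpu_seconds.items()}

for name, total in cumulative.items():
    print(f"{name:20s} {total:5d} s")
```

The rounded totals are 240 s, 128 s, 377 s, 1840 s and 124 s, in agreement with the figures quoted in the text.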

In Fig. 4 we show how the error $|S_{11} - S_{11}^{\mathrm{ref}}|$ depends on

the number of DOFs with uniform and adaptive increments of

element order. The computations are made at the frequency

of 3.63 GHz, but the behaviour is similar for all frequencies.

The reference solution is obtained by solving the same

problem with a highly refined mesh ( ) and using

4th order elements entirely, which results in 3.89 million

DOFs. Comparing the slopes of the curves, we see that adding

DOFs in an adaptive way gives a more rapid decrease of error.

The magnitude of $S_{11}$ computed over the frequency range

GHz to GHz is shown in Fig. 5 for some of the

adaption steps. The frequency response when uniform 4th

order elements are employed and results from [32] are also

given. As observed, the adaptive solver progressively reduces

the error in $S_{11}$.

The second example is free space scattering from a

conducting sphere surrounded by a polyhedral, 60-node

“buckyball”, acting as a metallic frame to reduce the

backscattered wave from the sphere at MHz [33]. The

problem is shown in Fig. 6. The diameter of the internal solid

sphere is m and the external polyhedral conducting frame

has diameter of m with an edge width and thickness of

m and m, respectively. An absorbing

boundary condition (ABC) is applied over a spherical surface

5 m away from the buckyball. The mesh obtained after

discretization consists of nodes and

elements with minimum and maximum edge lengths of

and , respectively. In this computation, the

number of unknowns increases from 810,595 to 5,710,953 in 4

adaptive steps.

TABLE II
SOLUTION DETAILS FOR THE BUCKYBALL SCATTERING PROBLEM

Adaption step                       1          2          3          4
DOFs                          810,595  2,319,586  3,716,547  5,710,953

Preconditioner (+ Deflation)
Iterations
  pMUS[ASP]+Def[50%]              108        124        130        157
  pMUS[AV]                      2,058      2,173      2,138      2,188
CPU time (hh:mm:ss)
  pMUS[ASP]+Def[50%]         00:03:58   00:08:39   00:12:50   00:25:30
  pMUS[AV]                   00:21:10   01:19:52   02:18:15   04:19:25
Memory (MB)
  pMUS[ASP]+Def[50%]              928      2,635      3,621      5,895
  pMUS[AV]                        686      1,767      2,673      4,908

Fig. 6. Geometry of 6 m metal sphere surrounded with a 10 m spherical polyhedral conducting frame.

Fig. 7. Comparison of effects of deflation on iteration counts for the buckyball problem. [Plot: number of iterations versus adaption step (i). Legend: No Deflation Vector (0%); 11 Deflation Vectors (10%).]

Fig. 8. Comparison of effects of deflation on CPU time for the buckyball problem. [Plot: CPU time in s versus adaption step (i). Legend: No Deflation Vectors (0%).]

Fig. 9. Comparison of effects of deflation on memory usage for the buckyball problem. [Plot: memory in MB versus adaption step (i). Legend: No Deflation Vectors (0%).]

The detailed computational information is listed in Table II.

For pMUS[ASP], 3 AMG levels are used (i.e., ), with

, and DOFs, respectively. In this case,

it was not possible to run pMUS[iLU] and pMUS[LU]

because of the high memory requirements in factorizing

(more than 50GB). At the same time, the run time required by

pMUS[ASP] accelerated with 57 deflation vectors is much

less than with pMUS[AV] at each adaption step.

Next we consider the effect of varying the dimension, , of

the deflation subspace from 0 (no deflation) to

( deflation). Figs. 7-9 compare the number of iterations,

computation time and memory usage for several choices of .

Increasing leads to fewer iterations, but at the cost of more

memory usage. We also see that increasing beyond about

50% of does not bring much further reduction in

computation time. Towards the end of the adaption,

deflation leads to a reduction in cumulative computation

time and a reduction in Krylov iterations.

The next example is a rectangular metallic waveguide

loaded with split ring resonators (SRRs). This configuration

has attracted extensive attention [34][35], since it provides the

realization of negative index metamaterials and novel

methods to miniaturize waveguide-based devices. The

geometry is directly taken from [34], proposed for the

creation of a stopband above the ordinary waveguide cutoff

frequency. The geometry, showing the excitation port A and

absorption port B, is presented in Fig. 10. The rectangular

waveguide is a standard C-band type with dimensions

at the operating frequency of

GHz. The dielectric slabs (shown as green) on which the

four SRR arrays are printed have relative permittivity of

and thickness of mm. The dimensions of the

SRR elements can be found in [34], noting that printed

elements on the middle and left lateral wall slabs are a factor

of smaller than the elements on the right side. The entire

composite domain is discretized with tetrahedral

elements, resulting in edge lengths , and

Fig. 10. (a) The geometry of SRR loaded waveguide filter along with the excitation and absorption ports (Port A, Port B). (b) The mesh discretization. (c) Intensity distribution of electric field strength within the loaded waveguide.

TABLE III
SOLUTION DETAILS FOR THE SRR LOADED WAVEGUIDE PROBLEM

Adaption step                       1          2          3          4
DOFs                        1,050,016  2,353,671  3,967,045  5,994,104

Preconditioner (+ Deflation)
Iterations
  pMUS[ASP]+Def[50%]              947        916      1,551      2,084
  pMUS[AV]                      4,056      6,201      9,033     10,115
CPU time (hh:mm:ss)
  pMUS[ASP]+Def[50%]         00:36:25   01:43:21   03:57:10   08:11:28
  pMUS[AV]                   01:25:46   03:35:04   10:35:38   22:04:29
Memory (MB)
  pMUS[ASP]+Def[50%]              955      8,803     10,416     13,023
  pMUS[AV]                        834      1,724      3,176      5,942

Fig. 11. Cumulative CPU time versus adaption steps for the SRR loaded waveguide filter. [Plot: cumulative CPU time in s (axis scaled by 10^4) versus adaption step (i); curves: pMUS[AV], pMUS[ASP], pMUS[ASP]+Def[50%].]



. Adaption was used to compute the reflection

coefficient as the parameter of interest, and the results are

given in Table III. The convergence of the deflated Krylov solver

preconditioned with pMUS and 3-level AMG is

again superior to that of pMUS[AV]. The importance of

deflating the troublesome eigenvectors from the Krylov solver, and

its effect on the cumulative CPU time, is shown in Fig. 11.

While the required time to accurately solve the problem in

adaption steps is around hours with pMUS[AV],

pMUS[ASP] reduces that to hours and with deflation it is

less than hours. The magnitude of the electric field

intensity at the mid-plane passing through the geometry is

plotted in Fig. 10(c). It is apparent that energy propagation

toward the output port is suppressed by the ring resonator

elements.

For the final experiment, results obtained from the

scattering analysis of a finite noncommensurate frequency

selective surface (FSS) are presented. The device, shown in

Fig. 12, consists of two dissimilar Jerusalem cross screens

(dimensions taken from [36]) printed on opposite sides of a

dielectric slab, where is the free

space wavelength at 3 GHz. The dielectric relative permittivity

is . Such composite and multi-layer structures

with different periodicities are difficult to handle rigorously

using periodic boundary conditions [36][37] and analysis of

the entire system is usually needed.

For the computational domain truncation, the geometry is

enclosed in a box at least away from any point on the

FSS and the ABC is applied. The structure is illuminated

normally ( direction) with an -polarized plane wave. The

mesh consists of tetrahedra. As observed in Fig.

12, the mesh is highly nonuniform, with an average element

size of and sizes ranging from around

the metallic regions to near the truncation boundary.

A summary of computational statistics for 4 steps of p-

adaption, using pMUS[ASP]+Def[50%], is given in Table IV.

In this case pMUS[AV] failed at each adaptive step to reduce

the residual by a factor of after iterations. The

column labeled “Nonzeros” gives the number of nonzero

entries in the matrix. Note that the matrices become denser as

the adaption proceeds, which is characteristic of p-adaption.

Here, as in the previous examples, the number of iterations is

relatively stable, despite the growth in the size of the matrix.

At step 4, a complex valued system with almost million

unknowns and billion nonzeros is solved on a PC in

(a)

(b) (c)

Fig. 12. Geometry of a Jerusalem cross noncommenserate FSS

(a) Side view. (b) Top view showing the 7 7 upper screen.

(c) Bottom view showing the 5 5 lower screen

(a)

(b)

Fig. 14. Visualization of electric field for the noncommenserate FSS

(a) over plane . (b) over plane

Fig. 13. Residual history of the (deflated) preconditioned COCG method for the noncommensurate FSS scattering problem at the 4th adaption step. Plot of relative residual (logarithmic axis, 10^-6 to 10^2) versus iterations (0 to 1,400) for pMUS[AV], pMUS[ASP]+Def[0%], and pMUS[ASP]+Def[50%].

TABLE IV

SOLUTION DETAILS FOR THE NONCOMMENSURATE FSS SCATTERING PROBLEM

i    DOFs          Nonzeros       Iter.    CPU Time (hh:mm:ss)    Memory (MB)
1    2,261,319     36,824,307     533      00:54:28               2,314
2    5,482,526     188,871,828    626      02:07:01               11,241
3    8,909,294     370,550,000    683      03:18:42               14,347
4    12,264,629    626,152,411    594      03:45:56               17,424

TABLE V

MATRIX HIERARCHY DETAILS FOR THE NONCOMMENSURATE FSS SCATTERING PROBLEM AT THE 4TH ADAPTION STEP

Matrix    DOFs          Nonzeros
          12,264,629    626,152,411
          12,051,413    525,271,225
          10,794,234    416,843,750
          2,261,319     36,824,307
          259,089       3,487,890
          90,448        1,236,772
          34,162        480,396
          20,174        287,358



reasonable time.
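The densification and the stability of the iteration count can be checked directly from the entries of Table IV. The short Python sketch below computes the average number of nonzeros per row at each adaptive step. The variable and function names are illustrative only and do not come from the original work.

```python
# Rows of Table IV: (adaptive step, DOFs, nonzeros, Krylov iterations).
TABLE_IV = [
    (1, 2_261_319, 36_824_307, 533),
    (2, 5_482_526, 188_871_828, 626),
    (3, 8_909_294, 370_550_000, 683),
    (4, 12_264_629, 626_152_411, 594),
]


def nonzeros_per_row(dofs, nonzeros):
    """Average number of nonzero entries per matrix row."""
    return nonzeros / dofs


densities = [nonzeros_per_row(dofs, nnz) for _, dofs, nnz, _ in TABLE_IV]
iterations = [iters for _, _, _, iters in TABLE_IV]

# Growth of the problem size versus spread of the iteration count.
dof_growth = TABLE_IV[-1][1] / TABLE_IV[0][1]
iteration_spread = max(iterations) / min(iterations)

for (step, _, _, iters), density in zip(TABLE_IV, densities):
    print(f"step {step}: {density:5.1f} nonzeros per row, {iters} iterations")
print(f"DOF growth: x{dof_growth:.1f}, iteration spread: x{iteration_spread:.2f}")
```

The average row density rises at every step, while the iteration count varies by far less than the number of unknowns.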

Table V gives, for the matrix (4th adaptive step), the

matrices used in its preconditioner. Since there are some

4th order elements present at this step, the largest matrix

(which is equal to itself) is . Four levels of AMG are

used (i.e., ), so the smallest matrix is .
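To illustrate how such a hierarchy is traversed, the following is a minimal sketch of the p-level part of a multiplicative V-cycle: one Gauss-Seidel sweep per level on the way down, an exact solve at the lowest order, and Gauss-Seidel in reverse order on the way up. It is a sketch under stated assumptions, not the authors' implementation. It uses a small, real, dense, diagonally dominant matrix in place of the complex symmetric sparse system, omits the ASP and AMG levels, and the names `gauss_seidel`, `v_cycle`, and `levels` are hypothetical.

```python
import numpy as np


def gauss_seidel(A, b, reverse=False):
    """One Gauss-Seidel sweep on A x = b, starting from x = 0."""
    n = len(b)
    x = np.zeros_like(b)
    order = range(n - 1, -1, -1) if reverse else range(n)
    for i in order:
        # Sum over j != i of A[i, j] * x[j], using the latest values of x.
        off_diag = A[i, :] @ x - A[i, i] * x[i]
        x[i] = (b[i] - off_diag) / A[i, i]
    return x


def v_cycle(A, r, levels):
    """Apply one multiplicative V-cycle to the residual r.

    levels[0] holds the indices of the lowest order unknowns and
    levels[-1] those of the highest order unknowns.
    """
    z = np.zeros_like(r)
    for idx in reversed(levels[1:]):  # downward leg: highest level first
        block = A[np.ix_(idx, idx)]
        z[idx] += gauss_seidel(block, r[idx] - A[idx, :] @ z)
    idx = levels[0]  # bottom of the V: exact solve
    z[idx] += np.linalg.solve(A[np.ix_(idx, idx)], r[idx] - A[idx, :] @ z)
    for idx in levels[1:]:  # upward leg: same levels in reverse order
        block = A[np.ix_(idx, idx)]
        z[idx] += gauss_seidel(block, r[idx] - A[idx, :] @ z, reverse=True)
    return z


# Toy problem (assumption): 12 unknowns split into three hierarchical levels.
rng = np.random.default_rng(0)
n = 12
B = rng.standard_normal((n, n))
A = B @ B.T / n + n * np.eye(n)  # symmetric, strongly diagonally dominant
b = rng.standard_normal(n)
levels = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]

# Here the cycle is used as a stationary iteration; in a Krylov solver it
# would instead be applied once per iteration as the preconditioner.
x = np.zeros(n)
for _ in range(25):
    x += v_cycle(A, b - A @ x, levels)
rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(f"relative residual after 25 V-cycles: {rel_res:.2e}")
```

In the method described here, the exact solve at the bottom of this sketch would be replaced by the continuation into the auxiliary node-based spaces and the AMG levels, with factorization applied only to the smallest matrix.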

The residual history in solving the system is shown in

Fig. 13. Removing the accelerations provided by deflation

results in an increase of and in Krylov iterations

and run time respectively, as well as reduction in

memory usage. Fig. 14 shows the computed electric field over

two cross-sections, both passing through the center of the

structure.
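The trade-off between iterations and memory introduced by deflation can be illustrated with a small sketch. The code below is a textbook deflated conjugate gradient iteration in the spirit of [21], written for a real symmetric positive definite matrix. It is not the deflated COCG solver used here, and the deflation vectors are taken as exact eigenvectors of a synthetic matrix rather than information recovered from the first adaptive step. All names (`deflated_cg`, `W`, the test matrix) are assumptions made for illustration.

```python
import numpy as np


def deflated_cg(A, b, W=None, tol=1e-10, max_iter=500):
    """Solve A x = b (A symmetric positive definite) by conjugate gradients.

    If W is given, the subspace spanned by its columns is deflated.
    Returns the solution and the number of iterations performed.
    """
    x = np.zeros(len(b))
    r = b.copy()
    if W is not None:
        AW = A @ W  # extra storage: one more block of vectors
        WtAW = W.T @ AW
        # Initial guess chosen so that the first residual is orthogonal to W.
        x = W @ np.linalg.solve(WtAW, W.T @ b)
        r = b - A @ x

    def project(v):
        # Make v A-conjugate to the columns of W.
        if W is None:
            return v
        return v - W @ np.linalg.solve(WtAW, AW.T @ v)

    p = project(r)
    rr = r @ r
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.sqrt(rr) <= tol * b_norm:
            return x, k
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rr_new = r @ r
        p = project(r) + (rr_new / rr) * p
        rr = rr_new
    return x, max_iter


# Synthetic test matrix (assumption): three small outlying eigenvalues
# and the remainder clustered in [1, 2].
rng = np.random.default_rng(0)
n = 80
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.concatenate(([1e-3, 2e-3, 5e-3], np.linspace(1.0, 2.0, n - 3)))
A = (Q * eigs) @ Q.T
b = rng.standard_normal(n)
W = Q[:, :3]  # eigenvectors of the three smallest eigenvalues

x_plain, it_plain = deflated_cg(A, b)
x_defl, it_defl = deflated_cg(A, b, W)
print(f"plain CG: {it_plain} iterations, deflated CG: {it_defl} iterations")
```

The deflated solver must keep both `W` and `A @ W` in memory, which mirrors the observation that deflation buys fewer iterations at the cost of additional storage.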

V. CONCLUSION

The pMUS[ASP] method has been shown to be effective for

the analysis of complex geometries by hierarchical finite

elements. Incorporation of ASP and AMG into the

preconditioning V- or W-cycle allows pMUS to be applied

even when the matrix at lowest order is too big for direct

solution. This is especially useful for scatterers with fine

geometric details, requiring a very large mesh in which the

bulk of the elements remain at low order. pMUS[ASP] was

always faster than pMUS[AV] (SSOR preconditioning applied

to the A-V system). In one case it was over 10 times faster, and

in another pMUS[AV] failed to converge.

In p-adaption, the use of a deflated Krylov solver, which

exploits information obtained from the first adaptive step,

provided some additional reduction in run time (20% to 40%),

at the expense of a greater memory requirement (10% to

35%).

REFERENCES

[1] L. S. Andersen and J. L. Volakis, "Adaptive multiresolution antenna

modeling using hierarchical mixed-order tangential vector finite

elements," IEEE Transactions on Antennas and Propagation, vol. 49,

pp. 211-222, Feb 2001.

[2] D. Nair and J. P. Webb, “P-Adaptive Computation of the Scattering

Parameters of 3-D Microwave Devices”, IEEE Transactions On

Magnetics, vol. 40, no. 2, Mar 2004.

[3] M. M. Botha and J. M. Jin, "Adaptive finite element-boundary integral

analysis for electromagnetic fields in 3-D," IEEE Transactions on

Antennas and Propagation, vol. 53, pp. 1710-1720, May 2005.

[4] I. Gomez-Revuelto, L. E. Garcia-Castillo, and M. Salazar-Palma, "Goal-

Oriented Self-Adaptive Hp-Strategies for Finite Element Analysis of

Electromagnetic Scattering and Radiation Problems," Progress in

Electromagnetics Research-Pier, vol. 125, pp. 459-482, 2012.

[5] G. H. Peng, R. Dyczij-Edlinger, and J. F. Lee, "Hierarchical methods for

solving matrix equations from TVFEMs for microwave components,"

IEEE Transactions on Magnetics, vol. 35, pp. 1474-1477, May 1999.

[6] Y. Zhu and A. C. Cangellaris, "Hierarchical multilevel potential

preconditioner for fast finite-element analysis of microwave devices,"

IEEE Transactions on Microwave Theory and Techniques, vol. 50, pp.

1984-1989, Aug 2002.

[7] J. F. Lee and D. K. Sun, "p-Type multiplicative Schwarz (pMUS)

method with vector finite elements for modeling three-dimensional

waveguide discontinuities," IEEE Transactions on Microwave Theory

and Techniques, vol. 52, pp. 864-870, Mar 2004.

[8] P. Ingelstrom, "A new set of H(curl)-conforming hierarchical basis

functions for tetrahedral meshes," IEEE Transactions on Microwave

Theory and Techniques, vol. 54, pp. 106-114, Jan 2006.

[9] P. Ingelstrom, V. Hill, and R. Dyczij-Edlinger, "Comparison of

hierarchical basis functions for efficient multilevel solvers," IET Science

Measurement & Technology, vol. 1, pp. 48-52, Jan 2007.

[10] J. Jin, The finite element method in electromagnetics, 2nd edition, John

Wiley, 2002.

[11] R. Hiptmair, "Multigrid method for Maxwell's equations," SIAM Journal

on Numerical Analysis, vol. 36, pp. 204-225, Dec 1998.

[12] S. Reitzinger and J. Schoberl, "An algebraic multigrid method for finite

element discretizations with edge elements," Numerical Linear Algebra

with Applications, vol. 9, pp. 223-238, Apr-May 2002.

[13] P. B. Bochev, C. J. Garasi, J. J. Hu, A. C. Robinson, and R. S.

Tuminaro, "An improved algebraic multigrid method for solving

Maxwell's equations," SIAM Journal on Scientific Computing, vol. 25,

pp. 623-642, 2003.

[14] R. Perrussel, L. Nicolas, F. Musy, L. Krahenbuhl, M. Schatzman, and C.

Poignard, "Algebraic multilevel methods for edge elements," IEEE

Transactions on Magnetics, vol. 42, pp. 619-622, Apr 2006.

[15] J. Xu, "The auxiliary space method and optimal multigrid

preconditioning techniques for unstructured grids," Computing, vol. 56,

pp. 215-235, 1996.

[16] A. Aghabarati, J. P. Webb, “An algebraic multigrid method for the finite

element analysis of large scattering problems”, IEEE Trans. Antennas

and Propagation, vol. 61, no.2, February 2013.

[17] Y. Erlangga, C. Oosterlee, and C. Vuik, "A novel multigrid based

preconditioner for heterogeneous Helmholtz problems," SIAM Journal

on Scientific Computing, vol. 27, pp. 1471-1492, 2006.

[18] Y. A. Erlangga, "Advances in iterative methods and preconditioners for

the Helmholtz equation," Archives of Computational Methods in

Engineering, vol. 15, pp. 37-66, Mar 2008.

[19] Y. Notay, "An Aggregation-Based Algebraic Multigrid Method",

Electronic Transactions on Numerical Analysis, vol. 37, pp. 123-146,

2010.

[20] R. A. Nicolaides, “Deflation of Conjugate Gradients with Applications

to Boundary Value Problems”, SIAM J. Numer. Anal., 24 (1987), pp.

355–365.

[21] Y. Saad, M. Yeung, J. Erhel and F. Guyomarc'h, “Deflated Version of

the Conjugate Gradient Algorithm”, SIAM Journal on Scientific

Computing. Vol. 21, No. 5, pp. 1909–1926, Apr 2000.

[22] A. Chapman and Y. Saad, "Deflated and augmented Krylov subspace

techniques," Numerical Linear Algebra with Applications, vol. 4, no. 1,

pp. 43-66, 1997.

[23] A. M. Abdel-Rehim, R. B. Morgan, D. A. Nicely and W. Wilcox,

“Deflated and Restarted Symmetric Lanczos Methods for Eigenvalues

and Linear Equations With Multiple Right-Hand Sides”, SIAM Journal

on Scientific Computing, vol 32, no 1, pp 129-149, 2010.

[24] R. Freund, “Conjugate Gradient-Type Methods for Linear Systems with

Complex Symmetric Coefficient Matrices”, SIAM Journal on Scientific

and Statistical Computing, 1992.

[25] H.A. van der Vorst, J. B. M. Melissen, "A Petrov–Galerkin type method

for solving Ax = b, where A is symmetric complex", IEEE Transactions

on Magnetics, vol. 26, no. 2, pp. 706-708, Mar 1990.

[26] C. C. Paige, and M. A. Saunders. "Solution of sparse indefinite systems

of linear equations." SIAM Journal on Numerical Analysis, vol 12, no 4,

pp. 617-629, Sep 1975.

[27] R. Dyczij-Edlinger, G. Peng and J. F. Lee, "A Fast Vector-Potential Method

Using Tangentially Continuous Vector Finite Elements", IEEE

Transactions on Microwave Theory and Techniques, vol. 46, pp. 863-868,

Jun 1998.

[28] Y. Saad, “Iterative Methods for Sparse Linear Systems”, PWS

Publishing Company, 1996, Chapter 10 - Preconditioning Techniques.

[29] T. A. Davis, “Algorithm 832: UMFPACK, an unsymmetric-pattern

multifrontal method”, ACM Transactions on Mathematical Software, vol

30, no. 2, June 2004, pp. 196-199.

[30] Matlab 7.10, The MathWorks, Inc., Natick, MA, 2000.

http://www.mathworks.com.

[31] ElecNet 7.3, 2012, Infolytica Corporation, Montreal, Canada,

http://www.infolytica.com.

[32] J. M. Reiter and F. Arndt; “Rigorous Analysis of Arbitrarily Shaped H-

and E-Plane Discontinuities in Rectangular Waveguides by a Full Wave

Boundary Contour Mode-Matching Method”, IEEE Transactions on

Microwave Theory and Techniques, vol. 43, no. 4, pp. 796-801, Apr

1995.

[33] P. A. Bernhardt, “Radar Backscatter from Conducting Polyhedral

Spheres”, IEEE Antennas and Propagation Magazine, vol. 52, No.5, Oct

2010.

[34] E. R. Iglesias, Ó. Q. Teruel, and M. M. Kehn, “Multiband SRR Loaded

Rectangular Waveguide”, IEEE Transactions on Antennas and

Propagation, vol. 57, no. 5, pp. 1570-1574, May 2009.



[35] F. Y. Meng, Q. Wu, D. Erni and L.W. Li, “Controllable Metamaterial-

Loaded Waveguides Supporting Backward and Forward Waves”, IEEE

Transactions on Antennas and Propagation, vol. 59, no. 9, Sep. 2011.

[36] Y. E. Erdemli, K. Sertel, R. A. Gilbert, D. E. Wright, and J. L. Volakis,

“Frequency-Selective Surfaces to Enhance Performance of Broad-Band

Reconfigurable Arrays”, IEEE Transactions on Antennas and

Propagation, vol. 50, no. 12, Dec 2002.

[37] R. Mittra, “A Look at Some Challenging Problems in Computational

Electromagnetics”, IEEE Antennas and Propagation Magazine,

vol. 46, no. 5, pp. 18-32, Oct 2004.

Ali Aghabarati received the B.Eng. degree in telecommunication engineering

from Iran University of Science and Technology, Tehran, Iran, in 2008, and

the M.Eng. degree in electrical engineering from Amirkabir University of

Technology, Tehran, Iran, in 2010. Currently, he is working toward the Ph.D.

degree at the Computational Electromagnetics Laboratory, McGill University,

Montreal, QC, Canada. His main research interests include numerical

techniques in electromagnetic computation, with focus on finite element

methods.

Jon P. Webb (M’83) received a Ph.D. from Cambridge University, England,

in 1981. Since 1982 he has been a professor in the Department of Electrical and

Computer Engineering at McGill University, Montreal, Canada. His area of

research is computer methods in electromagnetics, especially the application

of the finite element method.