praetorius/download/published/pp01_computing74.pdf

28
Applications of H-Matrix Techniques in Micromagnetics N. Popovic´ and D. Praetorius, Vienna Received April 28, 2004; revised August 5, 2004 Published online: December 8, 2004 Ó Springer-Verlag 2004 Abstract The variational model by Landau and Lifshitz is frequently used in the simulation of stationary micromagnetic phenomena. We consider the limit case of large and soft magnetic bodies, treating the associated Maxwell equation exactly via an integral operator P. In numerical simulations of the resulting minimization problem, difficulties arise due to the imposed side-constraint and the unboundedness of the domain. We introduce a possible discretization by a penalization strategy. Here the computation of P is numerically the most challenging issue, as it leads to densely populated matrices. We show how an efficient treatment of both P and the corresponding bilinear form can be achieved by application of H-matrix techniques. AMS Subject Classifications: 65K10, 65F30, 65F50, 65N30, 65N38. Keywords: Micromagnetics, Landau–Lifshitz model, magnetic potential, finite element method, hierarchical matrices, interpolation. 1. Introduction The simulation of stationary micromagnetic phenomena occurring in static or quasi-static processes is frequently based on a variational model named after Landau and Lifshitz. Therein, one minimizes the energy functional E a ðmÞ :¼ Z X /ðmÞ dx Z X f m dx þ 1 2 Z R d jruj 2 dx þ a Z X jrmj 2 dx ð1Þ over some set of admissible vector-valued magnetizations m : X ! R d on a bounded Lipschitz domain X R d corresponding to the magnet, with mðxÞ :¼ 0 for x 2 R d n X and d ¼ 2; 3. Here / 2 C 1 ðR d ; R þ Þ is the anisotropy density (depending on properties of the material on a crystalline level), f 2 L 2 ðX; R d Þ denotes an applied exterior magnetic field,0 a 1 is the exchange parameter, and u is the magnetic potential related to m by Maxwell’s equation divðru þ mÞ¼ 0 in D 0 ðR d Þ: ð2Þ The model is completed by adding the non-convex constraint jmðxÞj ¼ 1 for a.e. x 2 X: ð3Þ Computing 74, 177–204 (2005) Digital Object Identifier (DOI) 10.1007/s00607-004-0098-7

Transcript of praetorius/download/published/pp01_computing74.pdf

Page 1: praetorius/download/published/pp01_computing74.pdf

Applications of H-Matrix Techniques in Micromagnetics

N. Popovic and D. Praetorius, Vienna

Received April 28, 2004; revised August 5, 2004Published online: December 8, 2004

� Springer-Verlag 2004

Abstract

The variational model by Landau and Lifshitz is frequently used in the simulation of stationarymicromagnetic phenomena. We consider the limit case of large and soft magnetic bodies, treatingthe associated Maxwell equation exactly via an integral operator P. In numerical simulations of theresulting minimization problem, difficulties arise due to the imposed side-constraint and theunboundedness of the domain. We introduce a possible discretization by a penalization strategy. Herethe computation of P is numerically the most challenging issue, as it leads to densely populatedmatrices. We show how an efficient treatment of both P and the corresponding bilinear form can beachieved by application of H-matrix techniques.

AMS Subject Classifications: 65K10, 65F30, 65F50, 65N30, 65N38.

Keywords: Micromagnetics, Landau–Lifshitz model, magnetic potential, finite element method,hierarchical matrices, interpolation.

1. Introduction

The simulation of stationary micromagnetic phenomena occurring in static orquasi-static processes is frequently based on a variational model named afterLandau and Lifshitz. Therein, one minimizes the energy functional

EaðmÞ :¼Z

X/ðmÞ dx�

ZXf �m dxþ 1

2

ZRdjruj2dxþ a

ZXjrmj2dx ð1Þ

over some set of admissible vector-valued magnetizations m : X! Rd on abounded Lipschitz domain X � Rd corresponding to the magnet, with mðxÞ :¼ 0

for x 2 RdnX and d ¼ 2; 3. Here / 2 C1ðRd ; RþÞ is the anisotropy density

(depending on properties of the material on a crystalline level), f 2 L2ðX; RdÞdenotes an applied exterior magnetic field, 0 � a� 1 is the exchange parameter,and u is the magnetic potential related to m by Maxwell’s equation

divð�ruþmÞ ¼ 0 in D0ðRdÞ: ð2Þ

The model is completed by adding the non-convex constraint

jmðxÞj ¼ 1 for a.e. x 2 X: ð3Þ

Computing 74, 177–204 (2005)Digital Object Identifier (DOI) 10.1007/s00607-004-0098-7

Used Distiller 5.0.x Job Options
This report was created automatically with help of the Adobe Acrobat Distiller addition "Distiller Secrets v1.0.5" from IMPRESSED GmbH. You can download this startup file for Distiller versions 4.0.5 and 5.0.x for free from http://www.impressed.de. GENERAL ---------------------------------------- File Options: Compatibility: PDF 1.2 Optimize For Fast Web View: Yes Embed Thumbnails: Yes Auto-Rotate Pages: No Distill From Page: 1 Distill To Page: All Pages Binding: Left Resolution: [ 600 600 ] dpi Paper Size: [ 439.37 666.142 ] Point COMPRESSION ---------------------------------------- Color Images: Downsampling: Yes Downsample Type: Bicubic Downsampling Downsample Resolution: 150 dpi Downsampling For Images Above: 225 dpi Compression: Yes Automatic Selection of Compression Type: Yes JPEG Quality: Medium Bits Per Pixel: As Original Bit Grayscale Images: Downsampling: Yes Downsample Type: Bicubic Downsampling Downsample Resolution: 150 dpi Downsampling For Images Above: 225 dpi Compression: Yes Automatic Selection of Compression Type: Yes JPEG Quality: Medium Bits Per Pixel: As Original Bit Monochrome Images: Downsampling: Yes Downsample Type: Bicubic Downsampling Downsample Resolution: 600 dpi Downsampling For Images Above: 900 dpi Compression: Yes Compression Type: CCITT CCITT Group: 4 Anti-Alias To Gray: No Compress Text and Line Art: Yes FONTS ---------------------------------------- Embed All Fonts: Yes Subset Embedded Fonts: No When Embedding Fails: Warn and Continue Embedding: Always Embed: [ ] Never Embed: [ ] COLOR ---------------------------------------- Color Management Policies: Color Conversion Strategy: Convert All Colors to sRGB Intent: Default Working Spaces: Grayscale ICC Profile: RGB ICC Profile: sRGB IEC61966-2.1 CMYK ICC Profile: U.S. Web Coated (SWOP) v2 Device-Dependent Data: Preserve Overprint Settings: Yes Preserve Under Color Removal and Black Generation: Yes Transfer Functions: Apply Preserve Halftone Information: Yes ADVANCED ---------------------------------------- Options: Use Prologue.ps and Epilogue.ps: No Allow PostScript File To Override Job Options: Yes Preserve Level 2 copypage Semantics: Yes Save Portable Job Ticket Inside PDF File: No Illustrator Overprint Mode: Yes Convert Gradients To Smooth Shades: No ASCII Format: No Document Structuring Conventions (DSC): Process DSC Comments: No OTHERS ---------------------------------------- Distiller Core Version: 5000 Use ZIP Compression: Yes Deactivate Optimization: No Image Memory: 524288 Byte Anti-Alias Color Images: No Anti-Alias Grayscale Images: No Convert Images (< 257 Colors) To Indexed Color Space: Yes sRGB ICC Profile: sRGB IEC61966-2.1 END OF REPORT ---------------------------------------- IMPRESSED GmbH Bahrenfelder Chaussee 49 22761 Hamburg, Germany Tel. +49 40 897189-0 Fax +49 40 897189-71 Email: [email protected] Web: www.impressed.de
Adobe Acrobat Distiller 5.0.x Job Option File
<< /ColorSettingsFile () /AntiAliasMonoImages false /CannotEmbedFontPolicy /Warning /ParseDSCComments false /DoThumbnails true /CompressPages true /CalRGBProfile (sRGB IEC61966-2.1) /MaxSubsetPct 100 /EncodeColorImages true /GrayImageFilter /DCTEncode /Optimize true /ParseDSCCommentsForDocInfo false /EmitDSCWarnings false /CalGrayProfile () /NeverEmbed [ ] /GrayImageDownsampleThreshold 1.5 /UsePrologue false /GrayImageDict << /QFactor 0.9 /Blend 1 /HSamples [ 2 1 1 2 ] /VSamples [ 2 1 1 2 ] >> /AutoFilterColorImages true /sRGBProfile (sRGB IEC61966-2.1) /ColorImageDepth -1 /PreserveOverprintSettings true /AutoRotatePages /None /UCRandBGInfo /Preserve /EmbedAllFonts true /CompatibilityLevel 1.2 /StartPage 1 /AntiAliasColorImages false /CreateJobTicket false /ConvertImagesToIndexed true /ColorImageDownsampleType /Bicubic /ColorImageDownsampleThreshold 1.5 /MonoImageDownsampleType /Bicubic /DetectBlends false /GrayImageDownsampleType /Bicubic /PreserveEPSInfo false /GrayACSImageDict << /VSamples [ 2 1 1 2 ] /QFactor 0.76 /Blend 1 /HSamples [ 2 1 1 2 ] /ColorTransform 1 >> /ColorACSImageDict << /VSamples [ 2 1 1 2 ] /QFactor 0.76 /Blend 1 /HSamples [ 2 1 1 2 ] /ColorTransform 1 >> /PreserveCopyPage true /EncodeMonoImages true /ColorConversionStrategy /sRGB /PreserveOPIComments false /AntiAliasGrayImages false /GrayImageDepth -1 /ColorImageResolution 150 /EndPage -1 /AutoPositionEPSFiles false /MonoImageDepth -1 /TransferFunctionInfo /Apply /EncodeGrayImages true /DownsampleGrayImages true /DownsampleMonoImages true /DownsampleColorImages true /MonoImageDownsampleThreshold 1.5 /MonoImageDict << /K -1 >> /Binding /Left /CalCMYKProfile (U.S. Web Coated (SWOP) v2) /MonoImageResolution 600 /AutoFilterGrayImages true /AlwaysEmbed [ ] /ImageMemory 524288 /SubsetFonts false /DefaultRenderingIntent /Default /OPM 1 /MonoImageFilter /CCITTFaxEncode /GrayImageResolution 150 /ColorImageFilter /DCTEncode /PreserveHalftoneInfo true /ColorImageDict << /QFactor 0.9 /Blend 1 /HSamples [ 2 1 1 2 ] /VSamples [ 2 1 1 2 ] >> /ASCII85EncodePages false /LockDistillerParams false >> setdistillerparams << /PageSize [ 576.0 792.0 ] /HWResolution [ 600 600 ] >> setpagedevice
Page 2: praetorius/download/published/pp01_computing74.pdf

For large and soft magnets, the parameter a in (1) vanishes. In general, the modelthen lacks classical solutions, see [JK90], and hence has to be relaxed eitherby considering measure valued solutions [Ped97] or by convexification [DeS93,Tar95]. In fact, for a certain limit configuration of soft-large bodies, E0ðmÞ, i.e.,(1) with a! 0 can be justified to be the correct model, see [DeS93]. The corre-sponding convexified problem E��0 is given by

E��0 ðmÞ :¼Z

X/��ðmÞdx�

ZXf �m dxþ 1

2

ZRdjruj2dx ð4Þ

subject to (2) and

jmðxÞj � 1 for a.e. x 2 X: ð5Þ

Here /�� is the convexified density defined by

/��ðxÞ ¼ sup uðxÞju : Rd ! R convex and ujS � /� �

for jxj � 1; ð6Þ

where S ¼ fx 2 Rd�� jxj ¼ 1g denotes the unit sphere. Then, the relaxed problem

reads:

Minimize E��0 over A :¼ fm 2 L1ðX; RdÞ�� kmkL1ðX;Rd Þ � 1g: ð7Þ

In contrast to the ill-posed problem E0, the convexification is well-posed[DeS93,Ped97,CP04b]. In fact, this convexified model provides the mathematicalfoundation of the so-called phase theory in micromagnetics, cf. [HS98].

Remark 1: For uniaxial materials such as cobalt, the anisotropy energy is given by

/ðxÞ ¼ 1=2�1� ðx � eÞ2

�, with jxj ¼ 1 and e 2 Rd a given fixed unit vector called the

easy axis. A direct calculation shows /��ðxÞ ¼ 1=2Pd

j¼2ðx � zjÞ2 for jxj � 1 then,where fe; z2; . . . ; zdg is an orthonormal basis of Rd .

The numerical treatment of the minimization problem related to E��0 was initiatedby [CP01] for d ¼ 2, where the authors treat a simplified model obtained byreplacing Rd in (2) by a bounded Lipschitz domain bX containing X, and solve for

a potential u 2 H 10 ðbXÞ. Here, as in [CP04b, Cp04a, LM92, Ma91, Pra04], (2) is

treated exactly via an integral representation, i.e., u ¼Lm, where L is a linearconvolution operator. We then set Pm :¼ rðLmÞ, see Theorem 2.1 below, andreformulate the stray field energy contribution in (4) in terms ofP. The advantageis that in the resulting model, only one discretization for m is required, e.g., bypiecewise constant functions mh.

From a numerical point of view, the computation of Pm for a given magneti-zation is the most challenging issue, since it will lead to densely populatedmatrices. The aim of the present work is to show how an efficient numericaltreatment of both P and the induced bilinear form að�; �Þ can be achieved byapplication of H-matrix techniques.

178 N. Popovic and D. Praetorius

Page 3: praetorius/download/published/pp01_computing74.pdf

Remark 2: The treatment of the convexified model (7) requires the explicitknowledge of the convexified anisotropy density /��, which is, however, in generalunknown even for simple /, cf. [DeS93]. Both the Young measure relaxationproposed in [Ped97] and the corresponding discretization [KP01] avoid the com-putation of /��. Note that as far as the computation of the magnetic potential (2) isconcerned, our ideas apply in that setting, as well. Further analysis on stabilizeddiscrete models, as well as a comparison of the various approaches, can be found inthe articles [CP01] and [KP01], as well as in the survey monograph by Prohl[Pro01].

The remainder of this paper is organized as follows: in Sect. 2, we give a fewpreliminaries and present a possible discretization of (7); Sect. 3 contains someinterpolation results required for the following analysis; Sect. 4 motivates theconcept of hierarchical (H- resp. H2-) matrices; in Sect. 5, we give two differentapproaches for a Galerkin discretization of the potential Eq. (2) via H-matrixtechniques; Sect. 6 finally summarizes the results of our numerical experiments.

2. Preliminaries and Discretization

This section is devoted to the reformulation of (7) in terms of the associatedEuler-Lagrange equations and introduces a possible discretization by a penali-zation strategy.

2.1. Preliminaries

The following Theorem 2.1 gathers some of the properties of the operator Prequired in the following. Proofs can be found in [Pra04], although we expect theresult to be known to the experts.

Theorem 2.1 ([LM92,Ma91, Pra04]):Given anym 2 L1ðX; RdÞ, there exists an (upto an additive constant) unique magnetic potential u ¼Lm 2 H1

locðRdÞ such that

ru 2 L2ðRd ; RdÞ and divð�ruþmÞ ¼ 0 in D0ðRdÞ: ð8Þ

The (extended) operator P : L2ðRd ; RdÞ ! L2ðRd ; RdÞ; m 7!rðLmÞ is an L2

orthogonal projection. The potential Lm can be represented as a convolutionoperator

Lm ¼Xd

j¼1

@G@xj�mj; ð9Þ

where m ¼ ðm1; . . . ;mdÞ is trivially extended [by zero] from X to Rd (so that theconvolution is formally well-defined). Here G : Rdnf0g ! R is the Newtoniankernel

Application of H-Matrix Techniques in Micromagnetics 179

Page 4: praetorius/download/published/pp01_computing74.pdf

GðxÞ :¼

1

c2log jxj d ¼ 2,

1

ð2� dÞcdjxj2�d d > 2

8><>: ð10Þ

for x 6¼ 0, where the constant cd :¼ jSj > 0 denotes the surface measure of the unitsphere (e.g., c2 ¼ 2p; c3 ¼ 4p). (

Since the energy functional E��0 from (4) is convex and (Gateaux) differentiable,the minima are equivalently characterized by the corresponding Euler-Lagrange

equations [DeS93]. Thus, problem ðRP Þ reads: Find ðk;mÞ 2 L2ðXÞ � L2ðX; RdÞsuch that

Pmþ D/��ðmÞ þ km ¼ f a.e. in X; ð11Þk � 0; jmj � 1; kð1� jmjÞ ¼ 0 a.e. in X: ð12Þ

Existence results for ðRP Þ can be found in [DeS93,Ped97]; in particular, in theuniaxial case the solution to ðRP Þ is unique.

2.2. The Discretized Problem

Let T ¼ fT1; . . . ; TNg be a finite family of pairwise disjoint non-empty open sets

Tj which satisfy X ¼SN

j¼1 Tj. The space of all T-piecewise constant functions is

denoted by P0ðTÞ; h 2 P0ðTÞ is the mesh-size function, hjT :¼ hT :¼ diamðT Þ.For f 2 L2ðXÞ, let fT 2 P0ðTÞ be the T-piecewise integral mean given by

fTjT :¼ 1

jT j

ZT

f dx for all T 2T:

The discrete problem ðRPe;hÞ now reads as follows: given a penalization parameter

e 2 P0ðTÞ with e > 0, find mh 2 P0ðTÞd such that

hPmh þ D/��ðmhÞ þ khmh; emhiL2ðXÞ ¼ hf; emhiL2ðXÞ for all emh 2 P0ðTÞd ; ð13Þ

where kh 2 P0ðTÞ is defined by

kh ¼ e�1ðjmhj � 1Þþjmhj

ð14Þ

with ð�Þþ :¼ maxf�; 0g.

Remark 3: For mh 2 P0ðTÞd , the potential Pmh can be computed exactly, as theassociated bilinear form

aðmh; emhÞ :¼ hPmh; emhiL2ðXÞ for all mh; emh 2 P0ðTÞd ð15Þ

180 N. Popovic and D. Praetorius

Page 5: praetorius/download/published/pp01_computing74.pdf

can be evaluated by a closed form formula, cf. Theorem 5.1. The evaluation of Pmh

is, however, computationally demanding, as it typically leads to densely populatedstiffness matrices.

As for the continuous problem, we have existence of discrete solutions anduniqueness in the uniaxial model case, cf. [CP04b]. Moreover, in that case thea priori error analysis for ðRPe;hÞ from [CP04b] suggests the choice e ¼ h for thepenalization parameter, as there holds

kPm�PmhkL2ðRd Þ þ kD/��ðmÞ � D/��ðmhÞkL2ðXÞ

þ kkm� khmhkL2ðXÞ þ kekhmhkL2ðXÞ

� C�km�mTkL2ðXÞ þ kkm� ðkmÞTkL2ðXÞ þ kekmkL2ðXÞ

with a generic constant C � 0. For ðk;mÞ sufficiently smooth (e.g., m 2 H1ðX; RdÞ,km 2 H1ðX; RdÞ), the above right-hand side turns out to be of order Oðeþ hÞ.

The stiffness matrix A induced by the bilinear form að�; �Þ from (15) will in the

following be approximated by an appropriate H2-matrix eA. Given ðRPe;hÞ, onethen obtains an approximate discrete model ðfRPe;hÞ after replacing A by eA anddefining the approximate bilinear form eað�; �Þ accordingly.

3. Multi-dimensional Interpolation of Integral Kernels

The following section contains some results on the multi-dimensional interpola-tion of integral kernels. We restrict ourselves to one particular class of kernelfunctions here, known as asymptotically smooth kernels.

3.1. Asymptotically Smooth Kernels

A kernel function

j : Rd � Rd ! R; ðx; yÞ7!jðx; yÞ ð16Þ

is said to be asymptotically smooth if there exist constants Casm and casm such that

j@ax@

by jðx; yÞj � Casmðcasmjx� yjÞ�jaj�jbj�sðaþ bÞ! ð17Þ

for all multi-indices a; b 2 Nd0 with jaj þ jbj � 1 and some singularity order s 2 R,

where x; y 2 Rd with x 6¼ y.

Example 3.1: For the Newtonian kernel G defined in ð10Þ, jðx; yÞ :¼ Gðx� yÞ isasymptotically smooth for any d � 2, with Casm ¼ c�12 and Casm ¼

�ð2� dÞcd

��1for

d ¼ 2 and d � 3, respectively, and casm ¼ 1, see ½Gra01.

Application of H-Matrix Techniques in Micromagnetics 181

Page 6: praetorius/download/published/pp01_computing74.pdf

Remark 4: Note that the derivatives of an asymptotically smooth kernel j also are

asymptotically smooth: given ~j :¼ @~ax@

~by j, (17) holds with ~s ¼ sþ j~aj þ j~bj;

~casm ¼ casm þ e for e > 0 arbitrary, and some eCasm depending on Casm and e. This is

a consequence of

�ðaþ bÞ þ ð~aþ ~bÞ

�! � Casmcjajþjbjasm ðaþ bÞ!

with an ðj~aj þ j~bjÞ-dependent constant casm. For every choice of casm > 1, there is a

Casm > 0 (depending on casm) such that the above inequality holds. One then setseCasm :¼ CasmCasm and ecasm :¼ casmcasm, respectively.

3.2. Interpolation Operators in One Dimension

For m 2 N0, let the space of m-th order polynomials in one spatial variable bedenoted by Pm, and consider the interpolation operator

Im : C½�1; 1 ! Pm; u 7!Xm

j¼0uðtjÞLj with LjðtÞ ¼

Ymk¼0k 6¼j

t � tk

tj � tkð18Þ

acting on the so-called reference element ½�1; 1. Here�LjðtÞ

�mj¼0 are the Lagrange

polynomials corresponding to the interpolation points ðtjÞmj¼0. Note that Im is aprojection, i.e., linear with I2

m ¼ Im.

For m 2 N0, the Lebesgue constant Km 2 R is defined as the operator norm of Im,

Km :¼ supu2C½�1;1

u 6¼0

kImuk1;½�1;1kuk1;½�1;1

: ð19Þ

Clearly, we have Km � 1. Moreover, we assume that there are constantsk;Ck 2 Rþ such that

Km � Ckðmþ 1Þk: ð20Þ

For Chebyshev interpolation, where tj ¼ cos�ð2jþ 1Þp=ð2ðmþ 1ÞÞ

�, this estimate

holds with k ¼ 1 ¼ Ck, cf. [Riv84].

For an arbitrary compact interval I :¼ ½a; b � R, we define the affine transfor-mation

UI : ½�1; 1 ! I ; t 7! 1

2

�ðaþ bÞ þ tðb� aÞ

�:

The transformed interpolation operator IIm is then given by

182 N. Popovic and D. Praetorius

Page 7: praetorius/download/published/pp01_computing74.pdf

IIm : C½a; b ! Pm; u 7!

�Imðu UIÞ

� U�1I :

Obviously, the projection property as well as (20) now carry over from Im to IIm.

3.3. Tensor Interpolation Operators

For a family of closed intervals Ij :¼ ½aj; bj � R; j 2 f1; . . . ; 2dg, define the

axially parallel box B � R2d by B :¼Q2d

j¼1 Ij. Given a family�IIj

m

�2dj¼0 of inter-

polation operators on ðIjÞ2dj¼0, the m-th order tensor product interpolation operator

IBm on B is then defined as

IBm :¼ II1

m � � � � �II2dm :

In analogy to IIm; I

Bm is a projection from CðBÞ to

Qm :¼ span p1 � � � � � p2d jpj 2 Pm; j 2 f1; . . . ; 2dg� �

: ð21Þ

We require the following result on the interpolation error of IBm adapted from

[BG02, BLM02]:

Theorem 3.2: Let u 2 C1ðBÞ such that there are constants Cu; cu 2 Rþ satisfying

k@nj uk1;B � Cuc

nun! ð22Þ

for all j 2 f1; . . . ; 2dg and n 2 N0. Then, we have

ku�IBmuk1;B � 16edCuK

2dm

�1þ cudiamðBÞ

�ðmþ 1Þ 1þ 2

cudiamðBÞ

� �ðmþ1Þ: ð23Þ

Proof: The proof is as in [BG02, Theorem 3.2], with the d there replaced by 2d.

h

3.4. Local Error Analysis for Asymptotically Smooth Kernels

We now apply interpolation to obtain an approximate degenerate kernelej :¼ IB

mj instead of the given asymptotically smooth integral kernel j. LetBr;Bs � Rd be compact axially parallel boxes with positive Euclidean distancedistðBr;BsÞ > 0:

Lemma 3.3: An asymptotically smooth kernel j : Rd � Rd ! R satisfies (22) onB :¼ Br � Bs, with constants

Cj ¼max

(kjkL1ðBr�BsÞ;

Casm�casmdistðBr;BsÞ

�s

)and cj ¼

1

casmdistðBr;BsÞ: ð24Þ

Application of H-Matrix Techniques in Micromagnetics 183

Page 8: praetorius/download/published/pp01_computing74.pdf

Provided diamðBr � BsÞ � g distðBr;BsÞ with g > 0, there holds in particular

kj�Iðr;sÞm jkL1ðBr�BsÞ � c1c2Cj 1þ 2casmg

� �ðmþ1Þð25Þ

with Iðr;sÞm :¼ IBm, a numerical constant c1 ¼ 16edð1þ g=casmÞ, and a constant

c2 ¼ K2dm ðmþ 1Þ with only polynomial increase in m.

Remark 5: Note that the constant Cj > 0 behaves like distðBr;BsÞ�s for the kernelswe are interested in, such as jðx; yÞ ¼ log jx� yj resp. jðx; yÞ ¼ jx� yj�s.

Proof of Lemma 3.3: Direct computation shows that (22) is valid, and Theorem3.2 yields

kj�IBmjkL1ðBÞ � 16ed

�1þ cjdiamðBÞ

�K2d

m ðmþ 1ÞCj 1þ 2

cjdiamðBÞ

� �ðmþ1Þ:

Combining this with cjdiamðBÞ � g=casm, we obtain (25). h

The proof of Theorem 3.2 is only based on the stability constant Km defined in(19). The advantage is that the resulting estimate can be applied to a fairly widerange of interpolation operators. If one restricts oneself to tensor Chebyshevinterpolation – as we will do in the numerical experiments – one can do better byusing the following error estimate for tensor Chebyshev polynomials adaptedfrom [BH02]:

Lemma 3.4: Provided diamðBr � BsÞ � gdistðBr;BsÞ with g > 0, an asymptotically

smooth kernel j on B :¼ Br � Bs � R2d satisfies

kj�Iðr;sÞm jk1;B � dCasmc�sasmK2d�1

m distðBr;BsÞ�s4�mc�ðmþ1Þasm gmþ1: ð26Þ

Proof: The proof is along the lines of [BH02]. h

Remark 6: Lemma 3.3 ensures (asymptotically) exponential convergence with re-spect to m irrespective of the choice of g > 0. Nevertheless, the constant c1 obtainedin Lemma 3.3 is too pessimistic, in contrast to the reasonably good approximationresults observed for small m, as well, cf. Sect. 6. From Lemma 3.4, we obtainexponential convergence provided at least 4casm > g, with the highly improved

constants c1 ¼ dCasmc�sasm � 16edð1þ g=casmÞ and c2 ¼ K2d�1

m �K2dm ðmþ 1Þ.

In the following, we require additional error estimates for tensor Chebyshevpolynomials, in particular for the norms of the first and second derivatives ofthe respective interpolation errors. For the proofs, we make use of the fol-lowing well-known one-dimensional error estimate for Chebyshev interpolationfrom [BH02],

184 N. Popovic and D. Praetorius

Page 9: praetorius/download/published/pp01_computing74.pdf

ku�IIjmuk1;Ij

� 4�m

2ðmþ 1Þ! jIjjmþ1kuðmþ1Þk1;Ijfor u 2 Cmþ1ðIjÞ; ð27Þ

as well as of a result on the first derivative of u�Imu in one dimension takenfrom the proof of [BS01, Theorem 3.3.1],

kðu�ImuÞ0k1;½�1;1 ��

1

ðr� 1Þ!þ1

r!CðmÞ

kuðrÞk1;½�1;1 for u 2 Cr½�1;1; ð28Þ

with 1 � r � mþ 1 and CðmÞ a constant which may be estimated by

CðmÞ � Kmm2, cf. [BS01]. Affine transformation then yields

kðu�ImuÞ0k1;Ij� 2�ðr�1ÞjIjjr�1

�1

ðr � 1Þ!þ1

r!Kmm2

kuðrÞk1;Ij

ð29Þ

for Ij :¼ ½aj; bj � R.

Furthermore, we need an estimate for the norms of the derivatives of algebraicpolynomials known as Markov’s Theorem [DL93, Theorem 1.4]:

kp0k1;Ij� m22jIjj�1kpk1;Ij

for p 2 Pm: ð30Þ

This estimate cannot be improved; in particular, it is sharp for the Chebyshevpolynomials.

To begin with, we state a result concerning the first derivatives of j�Iðr;sÞm j.

Lemma 3.5: Provided diamðBr � BsÞ � gdistðBr;BsÞ with g > 0, an asymptoticallysmooth j 2 Cmþ2ðBÞ on B :¼ Br � Bs satisfies

k@aðj�Iðr;sÞm jÞk1;B � c1c22�mc�ðmþ1Þasm gmþ1 for 1 � a � 2d; ð31Þ

where c1 is a numerical constant depending on d; B, and j and c2 ¼ðKmm2 þ mþ 1ÞK2d�1

m grows polynomially in m.

Proof: As in [BG02, BH02], we write

k@aðj�Iðr;sÞm jÞk1;B �X2d

j¼1

@a

�b

j�1k¼1I

Ikm � ðId �IIj

mÞ �b2dk¼jþ1Id

j

|fflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflfflffl}¼:Jjj

1;B

; ð32Þ

to obtain estimates for k@aJjjk1;B, we have to consider three cases.

Application of H-Matrix Techniques in Micromagnetics 185

Page 10: praetorius/download/published/pp01_computing74.pdf

– First, for j < a, an explicit computation gives @aJjj ¼ Jjð@ajÞ, whence

k@aJjjk1;B � Kj�1m

bj�1k¼1Id � ðId �IIj

mÞ �b2dk¼jþ1Id

@aj

1;B

� Kj�1m

4�m

2ðmþ 1Þ! jIjjmþ1k@mþ1j @ajk1;B;

here we have used (19) on IIkm; k < j, and applied the estimate in (27) to Ij. By

exploiting the asymptotic smoothness of j to estimate k@mþ1j @ajk1;B �

Casm

�casmdistðBr;BsÞ

��ðmþ2þsÞðmþ 2Þ!; jIjj � diamðBr � BsÞ, and diamðBr � BsÞ� g distðBr;BsÞ, we get

k@aJjjk1;B � Casmc�ðsþ1Þasm distðBr;BsÞ�ðsþ1Þc�ðmþ1Þasm gmþ1 mþ 2

24�mKj�1

m : ð33Þ

– Second, for j ¼ a, it follows from (29) with r ¼ mþ 1 that

k@aJajk1;B � Ka�1m

ba�1k¼1Id � @aðId �IIa

mÞ �b2dk¼aþ1Id

j

1;B

� Ka�1m 2�mjIajm

�1

m!þ m2

ðmþ 1Þ! Km

k@mþ1

a jk1;B:

As before, we now obtain

k@aJajk1;B � Casmc�sasm distðBr;BsÞ�sc�ðmþ1Þasm gmþ1ðKmm2 þ mþ 1Þ2�mjIaj�1Ka�1

m :

ð34Þ

– Third, the case j > a is treated by applying Markov’s Theorem (30) to IIam and

by using the estimate in (27), whence

k@aJjjk1;B � Kj�1m m22jIaj�1

bj�1k¼1Id � ðId �IIj

mÞ �b2dk¼jþ1Id

j

1;B

� Kj�1m m2 4�m

ðmþ 1Þ! jIaj�1jIjjmþ1k@mþ1

j jk1;B

and

k@aJjjk1;B � Casmc�sasmdistðBr;BsÞ�sc�ðmþ1Þasm gmþ1m24�mjIaj�1Kj�1

m : ð35Þ

Collecting the estimates in (33), (34), and (35), we finally have

k@aðj�Iðr;sÞm jÞk1;B� CasmdistðBr;BsÞ�sc�ðmþ1Þasm gmþ1c�s

asm

�c�1asm

mþ 2

24�mdistðBr;BsÞ�1

Xj<a

Kj�1m

þ ðKmm2 þ mþ 1Þ2�mjIaj�1Ka�1m þ m24�mjIaj�1

Xj>a

Kj�1m

� c�ðmþ1Þasm gmþ1ðKmm2 þ mþ 1Þ2�mK2d�1m Casmc�s

asmdistðBr;BsÞ�s

��ða� 1Þc�1asmdistðBr;BsÞ�1 þ

�2d � ða� 1Þ

��min2d

j¼1jIjj��1�

;

which gives the desired result.

186 N. Popovic and D. Praetorius

Page 11: praetorius/download/published/pp01_computing74.pdf

For the second derivatives of j�Iðr;sÞm j, we obtain in a similar fashion:

Lemma 3.6: Under the assumptions of Lemma 3.5, we have

k@a@bðj�Iðr;sÞm jÞk1;B � c1c22�ðm�1Þc�ðmþ1Þasm gmþ1 for 1 � a; b � 2d; ð36Þ

with a constant c1 depending on d; B, and j and c2 ¼ ðKmm2 þ mþ 1Þm2K2d�1m .

Proof: Without loss of generality, we assume 1 � a � b � 2d throughout; similarreasoning as in the proof of Lemma 3.5, with Jjj defined as in (32), then impliesthe following cases:

– j < a � b: with @a@bJjj ¼ Jjð@a@bjÞ, one has as in (33)

k@a@bJjjk1;B � Kj�1m

4�m

2ðmþ 1Þ! jIjjmþ1k@mþ1j @a@bjk1;B

� Kj�1m Casmc�ðsþ2Þasm distðBr;BsÞ�ðsþ2Þc�ðmþ1Þasm gmþ14�m ðmþ 2Þðmþ 3Þ

2;

– j ¼ a < b: from @a@bJaj ¼ @aJað@bjÞ and (29) for r ¼ mþ 1 it follows that

k@a@bJajk1;B � Ka�1m 2�mjIajm

�1

m!þ m2

ðmþ 1Þ! Km

k@mþ1

j @bjk1;B

� Ka�1m Casmc�ðsþ1Þasm distðBr;BsÞ�ðsþ1ÞjIaj�1c�ðmþ1Þasm gmþ12�m

� ðmþ 2Þ�1þ mþ m2Km

�;

cf. (34);

– a < j < b: as @a@bJjj ¼ @aJjð@bjÞ, we obtain with (30)

k@a@bJjjk1;B � Kj�1m m22jIaj�1

4�m

2ðmþ 1Þ! jIjjmþ1k@mþ1j @bjk1;B

� Kj�1m Casmc�ðsþ1Þasm distðBr;BsÞ�ðsþ1ÞjIaj�1c�ðmþ1Þasm gmþ14�mm2ðmþ 2Þ;

see (35);

– j ¼ a ¼ b: to obtain an estimate for @2j Jjj, we proceed as in [BS01, Theorem3.3.1]: given u 2 Cmþ1½�1; 1 and 1 � r � mþ 1, let Rr be the remainder term ofthe Taylor series expansion of degree r � 1 of u about t ¼ 0. Usingu�Imu ¼ Rr �ImRr; kRrk1;½�1;1 � 1=r!kuðrÞk1;½�1;1; and kR00r k1;½�1;1 �1=ðr � 2Þ!kuðrÞk1;½�1;1, we have by the triangle inequality and with (30)

kðu�ImuÞ00k1;½�1;1 � kR00r k1;½�1;1 þ kðImRrÞ00k1;½�1;1

� 1

ðr � 2Þ! kuðrÞk1;½�1;1 þ m2ðm� 1Þ2Km

1

r!kuðrÞk1;½�1;1:

By affine transformation it follows for u 2 Cmþ1ðIaÞ and r ¼ mþ 1 that

Application of H-Matrix Techniques in Micromagnetics 187

Page 12: praetorius/download/published/pp01_computing74.pdf

kðu�ImuÞ00k1;Ia � 2�ðm�1ÞjIajm�1�

1

ðm�1Þ!þ1

ðmþ1Þ!ðm�1Þ2m2Km

kuðmþ1Þk1;Ia ;

whence

k@2aJajk1;B�Ka�1m

ba�1k¼1Id�@2aðId�IIa

mÞ�b2dk¼aþ1Id

j

1;B

�Ka�1m 2�ðm�1ÞjIajm�1

�1

ðm�1Þ!þ1

ðmþ1Þ!ðm�1Þ2m2Km

k@mþ1

a jk1;B

�Ka�1m Casmc�s

asmdistðBr;BsÞ�sjIaj�2c�ðmþ1Þasm gmþ12�ðm�1Þ

��mðmþ1Þþðm�1Þ2m2Km

�;

– a < j ¼ b: with @a@bJbj ¼ @að@bJbjÞ, (29) and (30) give

k@a@bJbjk1;B � Kb�1m m22jIaj�12�mjIbjm

�1

m!þ m2

ðmþ 1Þ! Km

k@mþ1

b jk1;B

� Kb�1m Casmc�s

asmdistðBr;BsÞ�sjIaj�1jIbj�1c�ðmþ1Þasm gmþ12�ðm�1Þ

� m2�1þ mþ m2Km

�;

– a � b < j: applying Markov’s Theorem (30) twice yields

k@a@bJjjk1;B � Kj�1m m22jIaj�1m22jIbj�1

4�m

2ðmþ 1Þ! jIjjmþ1k@mþ1j @ajk1;B

� Kj�1m Casmc�s

asmdistðBr;BsÞ�sjIaj�1jIbj�1c�ðmþ1Þasm gmþ14�ðm�1Þm4

2:

Let us first consider a < b: by collecting the above estimates, we obtain as inLemma 3.5

k@a@bðj�Iðr;sÞm jÞk1;B� Casmc�s

asmdistðBr;BsÞ�sðKmm2 þ mþ 1Þm22�ðm�1ÞK2d�1m c�ðmþ1Þasm gmþ1

��ða� 1Þc�2asmdistðBr;BsÞ�2 þ ðb� aÞc�1asmdistðBr;BsÞ�1

�min2d

j¼1jIjj��1

þ�2d � ðb� 1Þ

��min2d

j¼1jIjj��2�

;

from which the result follows. An analogous computation for a ¼ b shows thatthe estimate then still holds, with the same constants c1 and c2. This concludesthe proof. h

Remark 7: Lemmas 3.5 and 3.6 ensure exponential convergence with respect to mprovided at least 2casm > g. For j the Newtonian kernel G, this implies exponentialconvergence for g 2 ð0; 2Þ, cf. Remark 4.

188 N. Popovic and D. Praetorius

Page 13: praetorius/download/published/pp01_computing74.pdf

4. H2-Matrix Techniques

In this section, we motivate the concept of hierarchical matrices and give thecorresponding definitions.

4.1. Motivation

We consider a bilinear form að�; �Þ on L2ðXÞ given by a double integration with anasymptotically smooth kernel j,

aðu; vÞ :¼Z

X

ZX

uðxÞjðx; yÞvðyÞ dydx for u; v 2 L2ðXÞ; ð37Þ

with X � Rd a bounded domain. In the following we require a partition T of Xand a block partitioning P of T�T. For g > 0 fixed, a block ðr; sÞ 2 P is thencalled admissible provided

diamðBr � BsÞ � gdistðBr;BsÞ; ð38Þ

where Br and Bs denote axially parallel boxes in Rd of minimal size contain-ing [r and [s, respectively; otherwise ðr; sÞ is called inadmissible. Here[r :¼ fx 2 T jT 2 r �Tg (resp. [s :¼ fy 2 T jT 2 s �Tg) denotes the union ofall elements T 2T contained inr (resp. in s).P is thus split into two subsetsPfar andPnear: the subset of all admissible blocks is called far field and denoted by Pfar; theinadmissible blocks are collected in the near field Pnear :¼ PnPfar.

The approximate bilinear form eað�; �Þ is obtained by replacing the kernel functionon admissible blocks by an approximate but degenerate kernel obtained byinterpolation. More precisely,

eaðu; vÞ : ¼X

ðr;sÞ2Pnear

Z[r

Z[s

uðxÞjðx; yÞvðyÞ dydx

þX

ðr;sÞ2Pfar

Z[r

Z[s

uðxÞ�Iðr;sÞm jðx; yÞ

�vðyÞ dydx;

ð39Þ

where Iðr;sÞm denotes the tensor interpolation operator with respect to thebounding box B :¼ Br � Bs.

For each s �T with corresponding bounding box Bs, define a family ðxsjÞ

Msj¼0 of

interpolation points plus the associated tensor Lagrange polynomials ðLsjÞ

Msj¼0,

where Ms 2 N. Given

Iðr;sÞm jðx; yÞ ¼XMr

m1¼0

XMs

m2¼0j�xr

m1; xs

m2

�Lr

m1ðxÞLs

m2ðyÞ for ðx; yÞ 2 [r� [s;

the second term from (39) can then be written as

Application of H-Matrix Techniques in Micromagnetics 189

Page 14: praetorius/download/published/pp01_computing74.pdf

Xðr;sÞ2Pfar

XMr

m1¼0

XMs

m2¼0j�xr

m1; xs

m2

�|fflfflfflfflfflfflffl{zfflfflfflfflfflfflffl}¼:Srs

m1m2

Z[r

uðxÞLrm1ðxÞdx

|fflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflffl}¼:V r

m1ðuÞ

Z[s

vðyÞLsm2ðyÞdy

|fflfflfflfflfflfflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflfflfflfflfflfflffl}¼:V s

m2ðvÞ

: ð40Þ

The advantage of this new representation becomes obvious if we discretize að�; �Þ:consider the basis fu1; . . . ;uNg of the space P0ðTÞ of piecewise constant func-tions on T given by uj :¼ vTj

, where vTjis the characteristic function on Tj 2T

and N ¼ jTj, and define A 2 RN�N by Ajk :¼ aðuj;ukÞ and the approximate

matrix eA 2 RN�N by eAjk :¼ eaðuj;ukÞ for all 1 � j; k � N . On inadmissible blocks

ðr; sÞ 2 Pnear, we then simply have eAjr�s ¼ Ajr�s, whereas for ðr; sÞ 2 Pfar, eA is

given by

eAjr�s ¼ V rSrsV sT Ajr�s; ð41Þ

with V rjm1

:¼ V rm1ðujÞ and V s

km2:¼ V s

m2ðukÞ as defined in (40).

4.2. Block Partitioning

A simple method to find a hierarchical partition P ofT�T is to first construct acluster tree from T by binary space partitioning: one starts with the root clustercontaining all T 2T, splits it into two son clusters and repeats the procedurerecursively until each cluster contains less than a given number of elements Clf.Geometrically speaking, for every T 2T, one chooses a coordinate axis andsplits the set along this axis.

One can then use the admissibility condition (38) in combination with the clustertree structure to construct P: starting with T�T, one splits each pair of clustersas long as it is not admissible. This gives a P satisfying T�T ¼

Sðr;sÞ2P r� s.

Clearly, a pair ðr; sÞ can only appear in P if it is admissible or if either r or s is aleaf. The above procedure can easily be formalized, see, e.g., [BGH03, BH02,HKS00] for formal definitions and algorithms.

4.3. H-Matrices vs. H2-Matrices

Based on the concepts of cluster tree and block partitioning, the matrixapproximation approach outlined in Sect. 4.1 can be generalized by introducing aclass of data-sparse matrices, the so-called H-matrices. Given a block parti-tioning P of T�T and some k 2 N, a matrix A 2 RN�N is called H-matrix ofrank k provided rank

�Ajr�s

�� k for each ðr; sÞ 2 P. Moreover, if a factorization

of the form (41) holds for a family V ¼ ðV sÞs�T and some multiplication matrices

Srs; A is called uniform H-matrix with respect to the cluster basis V , cf. [BGH03,HKS00].

Additional structure can be gained by considering uniform H-matrices for whichthe corresponding cluster bases are nested: if the space Qm defined in (21) is used

190 N. Popovic and D. Praetorius

Page 15: praetorius/download/published/pp01_computing74.pdf

for interpolation on all clusters, polynomials corresponding to father clusters canbe expressed exactly in terms of polynomials corresponding to son clusters. Fors �T and s0 2 sonsðsÞ, we have

LsjðxÞ ¼

XMm¼0

Lsj

�xs0

m

�Ls0

mðxÞ: ð42Þ

Defining the transfer matrix Bs0s 2 R M�M by Bs0smj :¼Ls

j

�xs0

m

�, we obtain

V sij ¼

Z[s

vTjLs

jðxÞdx ¼X

s02 sons ðsÞ

XMm¼0

Lsj

�xs0

m

� Z[s0

vTjLs0

mðxÞdx ¼X

s02 sons ðsÞ

XMm¼0

Bs0smjV

s0im

ð43Þfor all i 2 s0 � s, i.e., V sjs0 ¼ V s0Bs0s. A is called H2-matrix with respect to V if it isa uniform H-matrix with respect to V and if V is nested.

Remark 8: For H2-matrices, one only has to store the multiplication matrices Srs

for all admissible blocks ðr; sÞ 2 Pfar, the cluster matrices V s for all leaves s �T,the transfer matrices Bs0s for all father-son pairs, and Ajr�s on all non-admissibleblocks ðr; sÞ 2 Pnear.

Remark 9: To evaluate (39), fast and efficient algorithms for matrix-vector multi-plication are required. We assume the underlying partitioning P to be sparse in the

sense of [Gra01], with some sparsity constant Csp. Given an H2-matrix eA 2 RN�N

and x; y 2 RN , it can then be shown that the computation of y ¼ eAx needs onlyOðNmdÞ operations to complete. Similarly, both the number of operations required tobuild anH2-matrix approximation and the amount of storage needed to store it are oforder OðNmdÞ, cf. [BGH03, Gie01, HKS00]. Note that Csp enters all the abovecomplexity estimates; for Csp !1, the complexity of the problem will become un-bounded, as well. Estimates on Csp can be found in [GH03].

4.4. Global Error Analysis for the Approximate Bilinear Form

By definition of the admissibility condition (38), one can apply the approximationresults of Sect. 3.4 on each admissible block ðr; sÞ 2 Pfar. Indeed, given someconstant c3, Lemma 3.4 shows that we can choose an approximation order m 2 N

so that

kj�Iðr;sÞm jkL1ðBr�BsÞ � c3

for all ðr; sÞ 2 Pfar.

Theorem 4.1: Under the above assumptions, we have

jaðu; vÞ � eaðu; vÞj � c3kukL1ðXÞkvkL1ðXÞ � c3jXjkukL2ðXÞkvkL2ðXÞ ð44Þ

for all u; v 2 L2ðXÞ.

Application of H-Matrix Techniques in Micromagnetics 191

Page 16: praetorius/download/published/pp01_computing74.pdf

Proof: For almost all ðx; yÞ 2 X� X, we define an integral kernel ejðx; yÞ as fol-lows. Let S :¼

Sf@T jT 2Tg denote the skeleton of the partition. Note that

S � Rd is a set of measure zero. For x; y 2 XnS, there are unique elementsTx; Ty 2T satisfying x 2 Tx and y 2 Ty , respectively. Since P is a partition ofT�T, there is a unique block ðr; sÞ 2 P with ðTx; TyÞ 2 r� s. Consequently, wemay define

ejðx; yÞ :¼jðx; yÞ if (r,s) is not admissible,

Iðr;sÞm jðx; yÞ else.

(

Since að�; �Þ and eað�; �Þ differ only on the far-field blocks, we have

jaðu; vÞ � eaðu; vÞj ¼Z

X

ZX

uðxÞ�jðx; yÞ � ejðx; yÞ�vðyÞdydx

��������;

and a Holder inequality yields

jaðu; vÞ � eaðu; vÞj � kj� ejkL1ðX�XÞkukL1ðXÞkvkL1ðXÞ;

since uðxÞ and vðyÞ decouple on X� X. Using kukL1ðXÞ � jXj1=2kukL2ðXÞ, we obtain

the desired estimate. h

4.5. Global Error Analysis for the Corresponding Matrix

Let Uh � L2ðXÞ be a finite-dimensional space, and define the stiffness matrix

A 2 RN�N and its H2-approximation eA 2 RN�N by Ajk :¼ aðuj; ukÞ and

eAjk :¼ eaðuj; ukÞ, respectively, for a fixed basis fu1; . . . ; uNg of Uh. Then, there holds

Corollary 4.2: The approximation error for the stiffness matrix A is bounded by

kA� eAkF � Nc3�maxN

j¼1kujkL1ðXÞ�2: ð45Þ

In particular, the error decreases exponentially with the approximation order m.

Provided A is a regular matrix, eA also is regular for large approximation orders m.

Proof: The first assertion follows by applying Theorem 4.1 to each entry of A.Regularity is immediate, as the regular matrices GLðNÞ form an open subset of

RN�N . h

5. Galerkin Discretization of the Potential Equation

In this section, we provide two different H2-matrix approaches to obtain a rea-sonable data sparse approximation of the stiffness matrix

192 N. Popovic and D. Praetorius

Page 17: praetorius/download/published/pp01_computing74.pdf

A 2 RdN�dN with Ajk :¼ aðuj;ukÞ ð46Þ

for a fixed basis fu1; . . . ;udNg ofP0ðTÞd , where the bilinear form að�; �Þ is definedas in (15). We recall a result from [Pra04].

Theorem 5.1: For bounded Lipschitz domains x; ex � Rd and vectors m; em 2 Rd , wehave

aðvxm; v ~xemÞ ¼ �Z@x

Z@ ~x

Gðx� yÞ�mðxÞ �m

�;�emðyÞ � em� dsydsx; ð47Þ

where m and em denote the outer normal vectors on @x and @ ex, respectively. Fur-thermore, we have the symmetry properties

aðvxm; v ~xemÞ ¼ aðv ~xem; vxmÞ ¼ aðvxem; v ~xmÞ; ð48Þ

and in the case distðx; exÞ > 0 there holds

aðvxm; v ~xemÞ ¼Z

x

Z~xm � HGðx� yÞem dydx ð49Þ

with the Hessian HG of the Newtonian kernel G. h

Now, a reasonable choice for a basis of P0ðTÞd is

uj :¼ vTje1; ujþN :¼ vTj

e2 etc. for 1 � j � N ; ð50Þ

as is shown in the following. This basis gives rise to the definition of the matrices

Aab 2 RN�Nsym for fixed 1 � a; b � d; Aab

jk :¼ aðvTjea; vTk

ebÞ; ð51Þ

where the symmetry of Aab (i.e., an additional symmetry of A) follows from (48).

Note that – again by Eq. (48) – we have Aab ¼ Aba. Therefore, A is a symmetricd � d block matrix with symmetric blocks Aab of dimension N � N ,

A ¼A11 A12

A12 A22

!for d ¼ 2 and A ¼

A11 A12 A13

A12 A22 A23

A13 A23 A33

0B@

1CA for d ¼ 3;

ð52Þ

respectively. The idea is to approximate each block Aab by an appropriate H2-matrix.

5.1. Approximation of A on Far Field Blocks via (49)

A direct H2-matrix approach stems from the far field representation (49) forwhich the Hessian (i.e., the second derivatives) of the Newtonian kernel enters,

Application of H-Matrix Techniques in Micromagnetics 193

Page 18: praetorius/download/published/pp01_computing74.pdf

@2G@xa@xb

ðxÞ ¼ 1

cd

dabjxj2 � dxaxb

jxjdþ2for x 2 Rdnf0g; ð53Þ

where 1 � a; b � d and dab denotes the Kronecker delta. Obviously, these kernelsare asymptotically smooth with singularity order s ¼ �d, cf. Remark 4. Let P bea block partitioning with respect to the given triangulation T. To abbreviatenotation, let mj denote the outer normal vector on the boundary @Tj of an element

Tj, and let mjh :¼ mhjTj

2 Rd be a discrete magnetization mh 2 P0ðTÞd . In analogyto the previous section and according to (47) and (49), the bilinear form að�; �Þ onP0ðTÞd reads

aðmh; emhÞ ¼ �XN

j;k¼1

Z@Tj

Z@Tk

Gðx� yÞ�mjðxÞ �mj

h

��mkðyÞ � emk

h

�dsy dsx

¼ �X

ðr;sÞ2Pnear

XTj2r

XTk2s

Z@Tj

Z@Tk

Gðx� yÞ�mjðxÞ �mj

h

��mkðyÞ � emk

h

�dsydsx

þX

ðr;sÞ2Pfar

Z[r

Z[smhðxÞ � HGðx� yÞemhðyÞ dydx:

ð54Þ

As in the previous section, we obtain the approximate bilinear form by replacingthe exact kernel HG on far field blocks by tensor interpolation Iðr;sÞm HG which isnow understood coefficient-wise [since we are dealing with a matrix kernelHGðx� yÞ 2 Rd�d

sym ],

eaðmh; emhÞ : ¼ �X

ðr;sÞ2Pnear

XTj2r

XTk2s

Z@Tj

Z@Tk

Gðx� yÞ�mjðxÞ �mj

h

��mkðyÞ � emk

h

�dsydsx

þX

ðr;sÞ2Pfar

Z[r

Z[smhðxÞ �

�Iðr;sÞm HGðx� yÞ

�emhðyÞ dydx:

The error analysis is completely straightforward following the arguments ofSect. 4.4. We denote by jab; 1 � a; b � d, the second derivatives of the Newto-nian kernel, cf. (53). As before, we assume that the approximation degree m 2 N

is large enough so that

kjab �Iðr;sÞm jabkL1ðBr�BsÞ � c3;

with a constant c3 which is independent of ðr; sÞ 2 Pfar.

Theorem 5.2: Under the above assumptions, we have

jaðmh; emhÞ � eaðmh; emhÞj � c3djXjkmhkL2ðXÞkemhkL2ðXÞ ð55Þ

for all mh; emh 2 P0ðTÞd .

194 N. Popovic and D. Praetorius

Page 19: praetorius/download/published/pp01_computing74.pdf

Proof: With jðx; yÞ :¼ HGðx� yÞ and ej as in the proof of Theorem 4.1, we obtain

jaðmh; emhÞ � eaðmh; emhÞj � jXjkj� ejkL1ðX�X; Rd�d ÞkmhkL2ðX; Rd ÞkemhkL2ðX; Rd Þ;

where we consider the usual (Euclidean) operator norm k � k on Rd�d . Recalling

that the Frobenius norm satisfies kAk � kAkF :¼�Pd

j;k¼1 A2jk

�1=2, it follows that

kjðx; yÞ � ejðx; yÞk � kjðx; yÞ � ejðx; yÞkF � dc3:

This concludes the proof. h

Finally, we explicitly state the approximation eAab 2 RN�N to Aab to clarify what

has to be implemented. Recall that eAab 2 RN�N is defined by eAabjk ¼ eaðvTj

ea; vTkebÞ.

The computation of eAab is performed separately on the admissible and theinadmissible blocks of P:

First, let ðr; sÞ 2 Pfar be admissible; given the degenerate kernel

Iðr;sÞm jabðx; yÞ ¼XMm1¼0

XMm2¼0

jab�xr

m1; xs

m2

�Lr

m1ðxÞLs

m2ðyÞ for ðx; yÞ 2 [r� [s;

ð56Þwhere M ¼ md and Lr

m1and Ls

m2are the appropriate tensor Lagrange polyno-

mials, this implies

eAabjk ¼

ZTj

ZTk

ejabðx; yÞdydx ¼XMm1¼0

XMm2¼0

jab�xr

m1; xs

m2

�|fflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflffl}

¼:Sabrsm1m2

ZTj

Lrm1ðxÞdx

|fflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflffl}¼:V r

jm1

ZTk

Lsm2ðyÞdy

|fflfflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflfflffl}¼:V s

km2

for Tj 2 r and Tk 2 s. With the matrices V r 2 Rjrj�M , V s 2 Rjsj�M , and Sabrs

2 RM�M , the submatrix Aabjr�s of Aab can be computed approximately by amatrix product

Aabjr�s V rSabrsV sT ¼: eAabjr�s: ð57Þ

Second, for an inadmissible block ðr; sÞ 2 Pnear, we have eAabjr�s ¼ aðvTjea;

vTkebÞ ¼ Aabjr�s; these entries are computed by (47). Double boundary integrals

as in (47) occur in the context of boundary integral methods with piecewiseconstant ansatz and test functions. For simple geometries of the elements, ana-lytic formulae are known, cf. [Hac02, Mai99, Mai00].

Remark 10: Note that only the multiplication matrices Sabrs and Aab in (57) dodepend on the indices a; b. Therefore, on admissible blocks ðr; sÞ 2 Pfar, the matriceseAabjr�s should be treated simultaneously for all 1 � a � b � d.

Remark 11: For inadmissible blocks ðr; sÞ 2 Pnear, the matrices eAabjr�s ¼ Aabjr�sshould also be assembled simultaneously, as the entries only differ on the components

Application of H-Matrix Techniques in Micromagnetics 195

Page 20: praetorius/download/published/pp01_computing74.pdf

of the normal vectors, cf. (47). Since the block partitioning is symmetric and we are

using constant approximation order, eAab also is symmetric; therefore eAabjk should

only be computed and stored for 1 � j � k � N .

5.2. Approximation of A on Far Field Blocks via (47)

A different H2-matrix approach can be realized by use of (47) and by replacing Gby Iðr;sÞm G on admissible blocks. We start out from (54) and consider the fol-lowing discretization of the bilinear form að�; �Þ, where jðx; yÞ :¼ Gðx� yÞ now:

eaðmh; emhÞ ¼ �X

ðr;sÞ2Pnear

XTj2r

XTk2s

Z@Tj

Z@Tk

�mjðxÞ �mj

h

�jðx; yÞ

�mkðyÞ � emk

h

�dsy dsx

�X

ðr;sÞ2Pfar

XTj2r

XTk2s

Z@Tj

Z@Tk

�mjðxÞ �mj

h

�Iðr;sÞm jðx; yÞ

�mkðyÞ � emk

h

�dsy dsx:

We state the implementational details first. The only difference to Sect. 5.1 is in

the way how eA is computed on admissible blocks ðr; sÞ 2 Pfar. Given thedegenerate kernel

Iðr;sÞm jðx; yÞ ¼XMm1¼0

XMm2¼0

j�xr

m1; xs

m2

�Lr

m1ðxÞLs

m2ðyÞ;

we obtain for Tj 2 r; Tk 2 s using integration by parts

eAabjk ¼ eaðvTj

ea; vTkebÞ ¼ �

Z@Tj

Z@Tk

mjaðxÞIðr;sÞm jðx; yÞmk

bðyÞ dsy dsx

¼ �XMm1¼0

XMm2¼0

j�xr

m1; xs

m2

� Z@Tj

Lrm1ðxÞmj

aðxÞ dsx

Z@Tk

Lsm2ðyÞmk

bðyÞ dsy

¼ �XMm1¼0

XMm2¼0

j�xr

m1; xs

m2

�|fflfflfflfflfflfflffl{zfflfflfflfflfflfflffl}¼Srs

m1m2

ZTj

@Lrm1

@xadx

|fflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflffl}¼:V ra

jm1

ZTk

@Lsm2

@ybdy

|fflfflfflfflfflfflfflffl{zfflfflfflfflfflfflfflffl}¼:V sb

km2

:

Remark 12: The integrands in V rajm1

and V sbkm2

are computable, as formulae for the

first derivatives of the Lagrange polynomials in one dimension can easily be derivedby induction: given Lj 2 Pm, we have

L0jðtÞ ¼

Xm

k¼1k 6¼j

1

tj � tk

Ym‘¼1

‘ 62fj;kg

t � t‘tj � t‘

:

Remark 13: Note that the multiplication matrices Srs do not depend on a; b here.Thus, we now have to assemble and store Srs, the matrices V ra and V sb for all leaves

196 N. Popovic and D. Praetorius

Page 21: praetorius/download/published/pp01_computing74.pdf

r; s of the cluster tree only, and the father-son transformation matrices Bs0s. This is asignificant difference to Sect. 5.1, where the computation of Sabrs, of V s for all leaves

s, and of Bs0s is required.

The results of Sect. 5.1 now carry over immediately; however, we have to assumethat m is large enough so that

k@a@bðj�Iðr;sÞm jÞkL1ðBr�BsÞ � c3

for jðx; yÞ ¼ Gðx� yÞ, with a constant c3 which is independent of ðr; sÞ 2 Pfar and1 � a; b � d. The existence of such a constant is a consequence of Lemma 3.6. Incomplete analogy to Theorem 5.2 we conclude:

Theorem 5.3: Under the above assumptions, we have

jaðmh; emhÞ � eaðmh; emhÞj � c3djXjkmhkL2ðXÞkemhkL2ðXÞ ð58Þ

for all mh; emh 2 P0ðTÞd . h

5.3. Solvability of the Approximate Discrete Model ðfRPe;hÞ

Suppose a triangulation T of X by rectangular, axis-parallel boxes to be given.The Gauss Divergence Theorem then shows that the restriction of P, PjP0ðTÞd , isinjective, cf. [CP04a]. In particular, there exists a constant c4 such that

aðmh;mhÞ ¼ kPmhk2L2ðRd Þ � c4kmhk2L2ðXÞ for all mh 2 P0ðTÞd :

Theorem 5.4: Given the above assumptions, take m large enough so thatCdf :¼ c4 � c3djXj � 0, with c3 from Theorems 5.2 and 5.3, respectively. Then, the

approximate discrete model ðfRPe;hÞ has solutions. In case Cdf > 0, the solution to

ðfRPe;hÞ is unique.

Proof: As

eaðmh;mhÞ � aðmh;mhÞ � jaðmh;mhÞ � eaðmh;mhÞj � Cdfkmhk2L2ðXÞ;

one concludes with Cdf � 0 that the approximate bilinear form eað�; �Þ is positivesemidefinite, and even positive definite if Cdf > 0. Thus, the symmetry of eað�; �Þshows that mh 7!eaðmh;mhÞ is a convex functional. In sum, the approximate energyfunctional

eEðmhÞ :¼Z

X/��ðmhÞdx�

ZXf �mhdxþ 1

2eaðmh;mhÞ þ

1

2

ZX

e�1ðjmhj � 1Þ2þ dx ð59Þ

is continuous and convex. Coercivity of eE follows from the last term in (59), i.e.,from the penalization energy contribution.

Application of H-Matrix Techniques in Micromagnetics 197

Page 22: praetorius/download/published/pp01_computing74.pdf

The Direct Method of the Calculus of Variations now proves the existence of

(global) minimizers for eEð�Þ. As minimizers of eE in P0ðTÞd are zeros of the

corresponding Gateaux derivatives, which read ðfRPe;hÞ, we have thus shown the

solvability of ðfRPe;hÞ.

To prove uniqueness in case Cdf > 0, assume ðkh;mhÞ and ðekh; emhÞ to be two

solutions of ðfRPe;hÞ now. One immediately finds

eaðmh � emh;mh � emhÞ þ hD/��ðmhÞ � D/��ðemhÞ;mh � emhiL2ðXÞ

þ hkhmh � ekhemh;mh � emhiL2ðXÞ ¼ 0: ð60Þ

As/�� is convex, the second term in (60) cannot be negative, whereas the third termis nonnegative by a direct calculation [CP01]. Hence, eaðmh � emh;mh � emhÞ ¼ 0,which implies mh ¼ emh due to the definiteness of eað�; �Þ. h

Remark 14: The a priori and a posteriori error analysis for the approximate model

ðfRPe;hÞ is the topic of ongoing research and will appear in a subsequent work.

6. Numerical Experiments

In the following section, we collect the results of our numerical experiments. Wecompare the performance of the two H2-matrix approximations introduced inSects. 5.1 and 5.2 to that of the standard approach using the full stiffnessmatrices.

6.1. Implementational Details

All numerical experiments were conducted using the HLib software packageprovided by S. Borm and L. Grasedyck of the Max-Planck-Institute forMathematics in the Sciences (Leipzig). We utilized a Compaq/HP AlphaServerES45 under Unix, with four Alpha EV68 CPUs running at 1 GHz each and 32GBytes of RAM.

For our experiments, we varied the interpolation order m in the H2-matrixapproximation between 2 and 6. As for the parameters, we fixed the maximumleaf size Clf ¼ 20 and the admissibility parameter g ¼ 1 throughout. However, toavoid having to vary Clf with m, we decided to store admissible blocksðr; sÞ 2 Pfar as full matrices whenever jrjjsj � MrMs, i.e., whenever storing thefull matrix was less expensive than storing the corresponding multiplicationmatrix Srs.

Moreover, to be able to use the supplied HLib routines with as few modificationsas possible, we did exploit the symmetry of Aab when setting up the matrices, butneglected it in the process of storing them.

198 N. Popovic and D. Praetorius

Page 23: praetorius/download/published/pp01_computing74.pdf

6.2. Full Matrices vs. H2-Matrices

In a first example, we restrict ourselves to a comparison of the properties of A and

its H2-approximation eA. For simplicity, we assume the domain X to be the unitsquare, X :¼ ½0; 1 � ½0; 1 � R2, and consider a uniform triangulation of X con-sisting of rectangular elements. The number N of degrees of freedom is variedbetween 256 and 1048576. Note that for N � 16384, we have no longer set up thefull matrix A, but have approximated it by an H2-matrix with m ¼ 10; all sub-sequent references are made to this approximation.

We compare the results of our two H2-matrix approaches for fixed approxima-

tion orders m. First, we give the relative approximation errors kA� eAk2=kAk2;the values are collected in the following Tables 1 and 2.

In summary, the errors in the second approach (cf. Sect. 5.2) seem to be larger byone order of magnitude. The convergence rates, however, are optimal in bothcases: every increase of m by one reduces the error by an order of magnitude.Note that for N ¼ 256 and m � 4, our choice of Clf implies that there are noadmissible blocks; the error in this case is due to rounding. The error estimatesthemselves are computed by a power iteration, with a maximum of 100 iterativesteps.

Second, we consider the times needed for building the two H2-matrix approxi-mations and compare them to the setup times of the full matrix A, see Tables 3and 4.

The time required for setting up the full matrix by far surpasses the setup times ofthe approximations; the gap increases with the number N of degrees of freedom.For m fixed, the difference between the two approaches is negligible here.

Third, of particular interest is the amount of memory required for storing thematrix approximations, as compared to the storage requirements of the originalmatrices; these are listed in Tables 5 and 6.

Table 1. Relative approximation errors (first H2-matrix approach, cf. Sect. 5.1)

N=m 2 3 4 5 6

256 2:38605�3 2:42547�4 2:22014�16 2:36817�16 2:10329�161024 4:66221�3 4:74404�4 4:42905�5 4:45515�6 3:85638�74096 6:08654�3 6:08069�4 6:14339�5 5:88869�6 5:21420�716384 7:28966�3 7:41998�4 7:76218�5 7:11627�6 7:26753�765536 8:42299�3 8:57423�4 8:83683�5 8:26648�6 8:21227�7

Table 2. Relative approximation errors (second H2-matrix approach, cf. Sect. 5.2)

N=m 2 3 4 5 6

256 1:47377�2 1:80771�3 2:46549�16 2:25802�16 2:15506�161024 3:80124�2 3:47728�3 2:38627�4 2:65527�5 2:05902�64096 5:48085�2 4:98563�3 4:17924�4 3:81013�5 3:65331�616384 6:54358�2 6:24201�3 5:19552�4 5:21367�5 5:78795�665536 7:10886�2 7:69041�3 6:14949�4 6:02877�5 6:88747�6

Application of H-Matrix Techniques in Micromagnetics 199

Page 24: praetorius/download/published/pp01_computing74.pdf

Overall, the figures compare very favorably with the storage required by the fullmatrices; here the second approach is clearly superior to the first one for m fixedand N large. This is probably due to the fact that the multiplication matrices onadmissible blocks only have to be stored once instead of for each block. Note thaton meshes with N small, the memory requirements slightly favor the full matrixapproach. This is due to the organizational effort involved in constructing thecluster tree of the H2-matrix approximation and in allocating memory blockwisefor subblocks instead of for the whole matrix.

Table 5. Memory requirement in KBytes/N (first approach)

N=m 2 3 4 5 6 full

256 5.2 5.6 6.8 7.2 7.6 6.01024 8.8 11.6 18.8 20.5 22.1 24.04096 11.4 16.2 28.9 34.6 42.6 96.016384 12.9 19.1 35.5 44.6 58.2 384.065536 13.7 20.7 39.3 50.6 67.9 1536.0

262144 14.2 21.6 41.3 53.9 73.4 6144.01048576 14.4 22.1 n/a n/a n/a 24576.0

Table 6. Memory requirement in KBytes/N (second approach)

N=m 2 3 4 5 6 full

256 5.2 5.5 7.2 8.1 9.2 6.01024 8.3 9.4 18.6 19.9 22.5 24.04096 10.4 12.2 26.9 29.5 35.4 96.016384 11.6 13.8 31.8 35.6 44.1 384.065536 12.2 14.8 34.6 39.1 49.2 1536.0

262144 12.6 15.2 36.0 40.9 51.9 6144.01048576 12.7 15.5 n/a n/a n/a 24576.0

Table 3. Setup time in seconds (first approach)

N=m 2 3 4 5 6 full

256 0.7 0.7 0.8 0.8 0.8 0.81024 3.8 3.8 9.0 9.0 10.3 13.54096 18.1 18.0 50.9 51.1 60.1 215.616384 78.2 78.5 237.4 237.9 286.1 n/a65536 329.0 330.9 1037.3 1042.2 1263.9 n/a

262144 1353.3 1361.8 4303.4 4308.8 5248.6 n/a1048576 5421.2 5925.1 n/a n/a n/a n/a

Table 4. Setup time in seconds (second approach)

N=m 2 3 4 5 6 full

256 0.7 0.6 0.8 0.9 0.9 0.81024 3.8 3.9 9.1 9.1 10.4 13.54096 18.1 18.0 51.1 51.4 60.7 215.616384 78.6 78.4 238.0 238.5 287.7 n/a65536 330.1 331.5 1036.8 1037.3 1262.8 n/a

262144 1343.5 1349.4 4291.0 4311.0 5259.6 n/a1048576 5450.3 5445.4 n/a n/a n/a n/a

200 N. Popovic and D. Praetorius

Page 25: praetorius/download/published/pp01_computing74.pdf

Altogether, these numerical experiments underline the applicability of H-matrixtechniques to the discretized potential operator P from ðRPe;hÞ. For givenapproximation order m, the first approach leads to lesser approximation errors,but is otherwise also more costly numerically, as is reflected by the much higherstorage requirements.

In a certain sense, however, the two approaches seem almost equivalent: if werequire some fixed accuracy, the approximation order m always has to be higherby one in the second approach, as the errors lag behind by one order of mag-nitude. A comparison of the respective setup times and memory requirementsthen shows the numerical cost to be almost even.

6.3. An Example with Known Exact Solution

In our second example, we consider a model problem for the relaxed Landau-Lifshitz problem ðRP Þ taken from [CP04b]. As above, let the domain X be the unitsquare; assume X to be filled with some uniaxial magnetized material, with theeasy axis given by e ¼ ð�1; 1Þ=

ffiffiffi2p

and the corresponding normal by

z ¼ ð1; 1Þ=ffiffiffi2p

, see Remark 1. Define ðm; kÞ 2 W 1;1ðX; R2Þ � L1ðXÞ as

mðxÞ :¼ x for jxj � 1,x=jxj for jxj > 1

and kðxÞ :¼ 0 for jxj � 1,

1 for jxj > 1.

ð61Þ

Then, ðm; kÞ solves ðRP Þ with given right-hand side

f :¼ Pmþ ðm � zÞzþ km in L2ðX; R2Þ; ð62Þ

cf. (11),(12). In the following, we replace Pm in (62) by PmT, where mT denotesthe piecewise integral mean of m. Note that there are no fully analytic examplesfor ðRP Þ with known solutions, which is why we have to restrict ourselves to thepresent model.

As m is Lipschitz continuous and therefore in W 1;1ðX; R2Þ, the a priori analysisfrom [CP04b] yields kðm�mhÞ � zkL2ðXÞ ¼ Oðeþ hÞ, with e the penalizationparameter from (14). For our experiments, we choose e ¼ h and compute thediscrete solution mh ¼

P2Nj¼1 xjuj with respect to the basis fu1; . . . ;u2Ng from (50)

by a classical Newton-Raphson scheme: the unknown coefficient vector x 2 R2N

is determined as the unique zero of

F ðxÞ :¼�hPmh þ D/��ðmhÞ þ khmh � f; uji

�2Nj¼1 ¼ 0: ð63Þ

(A detailed discussion on the relevance of e can be found in [CP04b].) Note thatthe convergence of the Newton-Raphson method is not guaranteed mathemati-cally, since F is only differentiable almost everywhere. The Jacobian of F can be

written as a finite sum DF ðxÞ ¼ AþPN

j¼1 DjðxÞ with symmetric positive semi-

Application of H-Matrix Techniques in Micromagnetics 201

Page 26: praetorius/download/published/pp01_computing74.pdf

definite matrices DjðxÞ 2 R2N�2N , cf. [CP04a]. For a triangulation by rectangularelements such as ours, the matrix A can be shown to be positive definite byapplying the Gauss Divergence Theorem, see [CP04b]. We therefore employ apreconditioned conjugate gradient method, with the LU decomposition of a

coarsened H-matrix version of eA as preconditioner.

In Table 7, we summarize the number of Newton steps required for finding thecoefficient vector x of mh in (63). One sees that the number of steps is nearlyconstant, i.e., independent of m and growing only slightly with N .

Figure 2 gives the convergence history of the full error km�mhkL2ðXÞ for different

choices of the order m in the H2-matrix approximation. The convergence rate isalmost the optimal 1=2, up to minimal deviations which are presumably due tothe fact that we approximate Pm by PmT in (62). As was to be expected, theinterpolation order m has to increase with the number N of degrees of freedom inorder to maintain optimal convergence; the effect is clearly more pronounced forthe second approach.

Note that we consider the full L2 norm instead of kD/��ðmÞ � D/��ðmhÞkL2ðXÞ,

although only the latter is covered by the available a priori error analysis.Recently it has been shown that one always obtains weak L2 convergence ofmh * m; more precisely, there holds km�mhk ~H�1ðXÞ ¼ Oðeþ hÞ, cf. [CP04c].

Nevertheless, the numerical experiments available from [CP04b] indicate that onemay hope for sharper results.

Table 7. Number of Newton steps: first approach (left), second approach (right)

N=m 2 3 4 5 6

256 6 6 6 6 61024 8 8 8 8 84096 7 8 8 8 816384 8 8 9 9 9

N=m 2 3 4 5 6

256 6 6 6 6 61024 8 8 8 8 84096 8 8 8 8 816384 9 9 9 9 9

Fig. 1. The discrete solution ðmh; khÞ of ðRPe;hÞ as in the model example for N ¼ 1024, with mh on theleft (displayed as vectors mhjT and jmhjT j) and kh on the right. In the white region, we have jmhj � 1

and therefore kh ¼ 0

202 N. Popovic and D. Praetorius

Page 27: praetorius/download/published/pp01_computing74.pdf

Acknowledgement

This work has been supported by the Austrian Science Fund FWF under grant no. P15274, ‘‘Reliableand Efficient Numerical Algorithms in Computational Micromagnetics’’. The authors thank S. Bormand L. Grasedyck of the Max-Planck-Institute for Mathematics in the Sciences (Leipzig) for providingthem with the HLib software package, as well as for several helpful comments.

References

[BG02] Borm, S., Grasedyck, L.: Low-rank approximation of integral operators by interpolation.Computing 72, 325–332 (2004).

[BGH03] Borm, S., Grasedyck, L., Hackbusch, W.: Introduction to hierarchical matrices withapplications. Eng. Anal. B. Elem. 27, 405–422 (2003).

[BH02] Borm, S., Hackbusch, W.: H2-matrix approximation of integral operators by interpola-tion. Appl. Numer. Math. 43, 129–143 (2002).

Fig. 2. Experimental convergence of km�mhkL2ðXÞ over N : first approach (top), second approach

(bottom)

Application of H-Matrix Techniques in Micromagnetics 203

Page 28: praetorius/download/published/pp01_computing74.pdf

[BLM02] Borm, S., Lohndorf, M., Melenk, J. M.: Approximation of integral operators by variable-order interpolation. Accepted for publication in Numer. Math. (2004).

[BS01] Babuska, I., Strouboulis, T.: The finite element method and its reliability. NumericalMathematics and Scientific Computation. Oxford: Oxford University Press 2001.

[CP01] Carstensen, C., Prohl, A.: Numerical analysis of relaxed micromagnetics by penalised finiteelements. Numer. Math. 90, 65–99 (2001).

[CP04a] Carstensen, C., Praetorius, D.: Effective simulation of a macroscopic model for stationarymicromagnetics. Comput. Meth. Appl. Mech. Engng. (2004) (forthcoming).

[CP04b] Carstensen, C., Praetorius, D.: Numerical analysis for a macroscopic model inmicromagnetics. SIAM J. Numer. Anal. (2004) (forthcoming).

[CP04c] Carstensen, C., Praetorius, D.: Stabilization yields strong convergence for a macroscopicmodel in micromagnetics. 2004 (in preparation).

[DeS93] DeSimone, A.: Energy minimizers for large ferromagnetic bodies. Arch. Rat. Mech. Anal.125, 99–143 (1993).

[DL93] DeVore, R. A., Lorentz, G. G.: Constructive approximation. Grundlehren der mathe-matischen Wissenschaften, vol. 303. Berlin Heidelberg: Springer 1993.

[GH03] Grasedyck, L., Hackbusch, W.: Construction and arithmetics of H-matrices. Computing70, 295–334 (2003).

[Gie01] Giebermann, K.: Multilevel approximation of boundary integral operators. Computing 67,183–207 (2001).

[Gra01] Grasedyck, L.: Theorie und Anwendungen hierarchischer Matrizen. PhD thesis, Christian-Albrechts-Universitat, Kiel, 2001.

[Hac02] Hackbusch, W.: Direct integration of the Newton potential over cubes including aprogram description. Computing 68, 193–216 (2002).

[HKS00] Hackbusch, W., Khoromskij, B., Sauter, S. A.: On H2-matrices. In: Lectures on AppliedMathematics, pp. 9–29. Berlin Heidelberg: Springer 2000.

[HS98] Hubert, A., Schafer, R.: Magnetic domains: the analysis of magnetic microstructures.Berlin Heidelberg: Springer 1998.

[JK90] James, R. D., Kinderlehrer, D.: Frustration in ferromagnetic materials. Cont. Mech.Therm. 2, 215–239 (1990).

[KP01] Kruzık, M., Prohl, A.: Young measure approximation in micromagnetics. Numer. Math.90(2), 291–307 (2001).

[LM92] Luskin, M., Ma, L.: Analysis of the finite element approximation of microstructure inmicromagnetics. SIAM J. Numer. Anal. 320–331, 1992.

[Ma91] Ma, L.: Analysis and computation for a variational problem in micromagnetics. PhDthesis, University of Minnesota, Minneapolis, 1991.

[Mai99] Maischak, M.: The analytical computation of the Galerkin elements for the Laplace, Lameand Helmholtz equation in 2D-BEM. Technical Report ifam48, Institut fur AngewandteMathematik, Universitat Hannover, 1999.

[Mai00] Maischak, M.: The analytical computation of the Galerkin elements for the Laplace, Lameand Helmholtz equation in 3D-BEM. Technical Report ifam50, Institut fur AngewandteMathematik, Universitat Hannover, 2000.

[Ped97] Pedregal, P.: Parametrized measures and variational principles. Basel: Birkhauser 1997.[Pra04] Praetorius, D.: Analysis of the operator 4�1 div arising in magnetic models. ZAA 23

(2004), 589–605[Pro01] Prohl, A.: Computational micromagnetism. Advances in Numerical Mathematics

(Teubner, B. G., ed.). Stuttgart 2001.[Riv84] Rivlin, T. J.: The Chebyshev polynomials. New York: Wiley 1984.[Tar95] Tartar, L.: Beyond Young measures. Meccanica 30, 505–526 (1995).

Nikola PopovicVienna University of TechnologyInstitute for Analysis and Scientific ComputingWiedner Hauptstrasse 8–101040 Vienna, Austriae-mail: [email protected]

D. PraetoriusVienna University of TechnologyInstitute for Analysis and Scientific ComputingWiedner Hauptstrasse 8–101040 Vienna, Austriae-mail: [email protected]

204 N. Popovic and D. Praetorius: Application of H-Matrix Techniques in Micromagnetics