
Local convergence of Newton-like methods for degenerate eigenvalues of nonlinear eigenproblems

Fei Xue and Daniel B. Szyld

Report 12-10-04, October 2012

This report is available on the World Wide Web at http://www.math.temple.edu/~szyld


LOCAL CONVERGENCE OF NEWTON-LIKE METHODS FOR DEGENERATE EIGENVALUES OF NONLINEAR EIGENPROBLEMS*

DANIEL B. SZYLD† AND FEI XUE†

Abstract. We study the local convergence rates of several single-vector Newton-like methods for the solution of a semi-simple or defective eigenvalue of nonlinear algebraic eigenvalue problems of the form T(λ)v = 0. This problem has not been fully understood, since the Jacobian associated with the single-vector Newton's method is singular at the desired eigenpair, and the standard convergence theory is not applicable. In addition, Newton's method generally converges only linearly towards singular roots. In this paper, we show that faster convergence can be achieved for degenerate eigenvalues. For semi-simple eigenvalues, we show that the convergence of Newton's method, Rayleigh functional iteration and the Jacobi-Davidson method is quadratic, and the latter two converge cubically for locally symmetric problems. For defective eigenvalues, all these methods converge only linearly in general. We then study two accelerated methods for defective eigenvalues, which exhibit quadratic convergence and require the solution of two linear systems per iteration. The results are illustrated by numerical experiments.

Key words. nonlinear eigenvalue problems, degenerate eigenvalues, Jordan chains, Newton's method, inverse iteration, Rayleigh functionals, Jacobi-Davidson method

AMS subject classifications. 65F15, 65F10, 65F50, 15A18, 15A22.

1. Introduction. In this paper, we study the local convergence rates of several single-vector Newton-like methods for the solution of a degenerate eigenvalue λ and a corresponding eigenvector v of the nonlinear algebraic eigenvalue problem of the form

(1.1)    T(λ)v = 0,

where T(·) : U → C^{n×n} is holomorphic on a domain U ⊂ C, λ ∈ U and v ∈ C^n \ {0}. λ is an eigenvalue of T(·) if and only if T(λ) is singular. Assume that det T(·) ≢ 0 in any neighborhood of λ; that is, eigenvalues of T(·) are isolated. The algebraic multiplicity of an eigenvalue λ of (1.1) is defined as the multiplicity of λ as a root of the characteristic function det T(μ). The eigenvalue λ is called degenerate if its algebraic multiplicity is larger than one. A degenerate eigenvalue λ is called semi-simple if its geometric multiplicity, defined as the dimension of null(T(λ)), equals its algebraic multiplicity; it is called defective if its geometric multiplicity is smaller than its algebraic multiplicity.
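The distinction between the two multiplicities can be checked numerically on toy linear pencils T(μ) = A − μI; the 2×2 matrices and the helper `geo_mult` below are illustrative assumptions, not examples from the paper.

```python
import numpy as np

# Two illustrative 2x2 linear pencils T(mu) = A - mu*I. Both have
# det T(mu) = (1 - mu)^2, so lambda = 1 has algebraic multiplicity 2.
A_semi = np.eye(2)                      # null(T(1)) = C^2: semi-simple
A_def = np.array([[1.0, 1.0],
                  [0.0, 1.0]])          # null(T(1)) = span{e_1}: defective

def geo_mult(A, lam, tol=1e-12):
    """Geometric multiplicity = dim null(A - lam*I), computed via the SVD."""
    s = np.linalg.svd(A - lam * np.eye(A.shape[0]), compute_uv=False)
    return int(np.sum(s < tol))

print(geo_mult(A_semi, 1.0))  # 2: equals the algebraic multiplicity
print(geo_mult(A_def, 1.0))   # 1: smaller than the algebraic multiplicity
```

The SVD-based null-space dimension is one standard way to make "geometric multiplicity" computable in floating point.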

Our study of numerical methods for degenerate eigenvalues of (1.1) is motivated by the recent rapid development of theory and algorithms for nonlinear eigenvalue problems. These problems arise in a variety of applications, such as vibrational analysis of structures, stability analysis of fluid flows, optimal control problems, quantum mechanics, and delay differential equations; see, e.g., [17, 18, 31] and references therein. Polynomial and rational eigenvalue problems, which can be transformed via linearization into standard linear eigenvalue problems of dimensions multiple times larger, have been studied intensively due to their wide applications [14, 15, 28]. For these problems of small or medium size, linearization is usually the most effective approach. On the other hand, for very large or irrational (truly nonlinear) problems where linearization is not effective or applicable, or if there are only a small number of eigenpairs of interest, other methods might be more appropriate. For example, one can use the contour

*This version dated 4 October 2012. This work was supported by the U. S. National Science Foundation under grant DMS-1115520.
†Department of Mathematics, Temple University, 1805 N. Broad Street, Philadelphia, PA 19122, USA (szyld,[email protected]).



integral method [1, 5, 9] to find initial approximations to the eigenvalues in a specified domain, and apply Newton-like methods to refine the eigenpair approximations.

Variants of Newton-like methods have been proposed and studied for the solution of a single eigenpair of the nonlinear eigenvalue problem (1.1); see the early works [21, 22, 24], a few recent developments [4, 11, 25, 27], and a study of inexact variants of these algorithms [29]. In all these references, the analyses are developed only for simple eigenpairs, and the convergence of Newton-like methods is generally quadratic (possibly cubic for problems with local symmetry). However, degenerate eigenpairs also arise in important scenarios such as certain gyroscopic systems, hyperbolic Hermitian polynomial problems, and some delay differential equations, for which the understanding of fast convergent Newton-like methods is not complete. The major difficulty is that the Jacobian (Fréchet derivative) of the augmented system of problem (1.1) at the desired degenerate eigenpair is singular, and the standard convergence theory of Newton's method is not directly applicable. To work around this difficulty, a block version of Newton's method was proposed to compute a simple invariant pair involving the whole eigenspace of the desired degenerate eigenvalues [13], where in each iteration the number of linear systems to be solved equals at least the sum of algebraic multiplicities of all these eigenvalues. Therefore, this method could be prohibitive for degenerate eigenvalues of high algebraic multiplicities.

For many applications, fortunately, we are mostly interested in eigenvalues and a single eigenvector. In this case, single-vector Newton-like methods are probably the most suitable algorithms. The main purpose of this paper is to gain a better understanding of several fast convergent single-vector Newton-like methods for the solution of a degenerate eigenpair of problem (1.1). We assume that an initial approximation to the desired eigenpair is available, and we investigate the local convergence rates of inverse iteration, Rayleigh functional iteration (RFI), and the single-vector Jacobi-Davidson (JD) method. For semi-simple eigenvalues, we show that the local convergence of these algorithms is quadratic or cubic, respectively, if the local symmetry of T(·) is absent or present; in other words, the local convergence rates of the Newton-like methods for semi-simple eigenpairs are identical to those for simple ones. For defective eigenvalues, on the other hand, we show that the convergence of these methods is in general only linear. We then propose two accelerated variants of Newton's method, which exhibit locally quadratic convergence at the cost of solving two linear systems per iteration.

The utility and limitations of the algorithms for defective eigenpairs discussed in this paper need to be emphasized. In particular, it is well known that the computation of defective eigenpairs (eigenvalues and associated eigenvectors alone, excluding the generalized eigenvectors) is an ill-posed problem. A small perturbation of the matrix pencil of order ε generally leads to a scattering of a defective eigenvalue λ by a magnitude of ε^{1/m}, where m is the length of the longest Jordan chain; see, e.g., [19, 20] and references therein. Therefore, the computation of defective eigenpairs alone is relevant only for those with short Jordan chains (small m) and where moderate accuracy is sufficient for the applications. On the other hand, simple invariant pairs involving a defective eigenpair together with all associated generalized eigenvectors are much less sensitive to perturbations, and therefore the computation of these pairs can be performed to high accuracy; see, for example, [3, 30]. To this end, one may again refer to the block version of Newton's method [13], which is more expensive than the single-vector methods studied in this paper.

The rest of the paper is organized as follows. In Section 2 we review some definitions and preliminary results for degenerate eigenvalues. In Section 3, we show that the singularity of the Jacobian of the augmented system at the desired semi-simple eigenpair has no impact on the local quadratic convergence of inverse iteration; in addition, RFI and single-vector JD converge quadratically or cubically towards a semi-simple eigenpair, respectively, for locally nonsymmetric or symmetric problems. We then show in Section 4 that the convergence of inverse iteration, RFI and single-vector JD towards a defective eigenpair is in general only linear, and we propose two accelerated algorithms that converge quadratically. Numerical experiments are provided throughout the paper to illustrate the convergence results. Section 5 is the conclusion of the paper.

2. Preliminaries. In this section, we review some preliminary results on degenerate eigenvalues for the study of local convergence of Newton-like methods. Theories for semi-simple and defective eigenvalues can be presented in a uniform way, yet we review them separately for the purpose of clarity. For the two types of degenerate eigenvalues, as we shall see, there exist important differences in the structure of the resolvent T(μ)^{−1} near the eigenvalue, and the sets of right and left eigenvectors satisfy different normalization conditions. These properties are fundamental to understanding the quadratic (possibly cubic) and linear convergence of Newton-like methods, respectively, for a semi-simple and a defective eigenpair.

2.1. Semi-simple eigenvalues. Let λ be an eigenvalue of (1.1), algT(λ) be the algebraic multiplicity of λ, i.e., the multiplicity of λ as a root of the characteristic function det T(μ), and geoT(λ) = dim(null T(λ)) be the geometric multiplicity. It can be shown that algT(λ) ≥ geoT(λ). Then λ is semi-simple if algT(λ) = geoT(λ) ≥ 2. Intuitively speaking, a semi-simple eigenpair can be considered as a set of multiple simple eigenpairs sharing an identical eigenvalue. This perspective is helpful for our understanding of the quadratic convergence of Newton-like methods in this case.

The major theorem on the structure of the resolvent T(μ)^{−1} near a semi-simple eigenvalue and the normalization condition of the sets of left and right eigenvectors is described as follows; it can be obtained directly from Theorem A.10.2 of [12], where all Jordan chains are of length 1.

Theorem 2.1. Let T be a Fredholm holomorphic operator function in a neighborhood of a semi-simple eigenvalue λ ∈ U. Let algT(λ) = J, and {φ_k} (k = 1, . . . , J) be the corresponding right eigenvectors. Then there exists a unique set of corresponding left eigenvectors {ψ_k} (k = 1, . . . , J) such that in a neighborhood of λ

T(μ)^{−1} = Σ_{k=1}^{J} ⟨·, ψ_k⟩ φ_k / (μ − λ) + Q(μ),

where Q is holomorphic in a neighborhood of λ. The two sets of eigenvectors satisfy the following normalization condition

(2.1)    ⟨T′(λ)φ_k, ψ_j⟩ = δ_{jk}, (j, k = 1, . . . , J).

In addition, the right eigenvectors satisfy

(2.2)    T(λ) Q(λ) T′(λ) φ_k = 0, (k = 1, . . . , J).

Note that λ is a simple eigenvalue if J = 1. Therefore, we assume throughout the paper that J ≥ 2 for the semi-simple case.


Let x ≠ 0 be a right eigenvector approximation which has a significant component in span{φ_1, . . . , φ_J}. We first give a decomposition of x which we use later for the study of the local convergence of Rayleigh functional iteration. Let G ∈ C^{J×J} be a nonsingular matrix, Φ_J = [φ_1 . . . φ_J]G and Ψ_J = [ψ_1 . . . ψ_J]G^{−*} such that

(2.3)    Ψ_J^* T′(λ) Φ_J = I_J.

From (2.1) we know that both T′(λ)Φ_J and Ψ_J^* T′(λ) are of full rank. Let W_{n−J} ∈ C^{n×(n−J)} and U_{n−J} ∈ C^{n×(n−J)}, respectively, have orthonormal columns such that

(2.4)    W_{n−J}^* T′(λ) Φ_J = 0_{(n−J)×J} and Ψ_J^* T′(λ) U_{n−J} = 0_{J×(n−J)}.

Assume that x ∉ range(U_{n−J}), so that ‖Ψ_J^* T′(λ)x‖ ≠ 0. A decomposition of x can be formed as follows:

(2.5)    x = γ(cv + sg),

where

(2.6)    γ = ‖ [Ψ_J  W_{n−J}]^* T′(λ) x ‖,    c = ‖Ψ_J^* T′(λ)x‖ / γ,    s = ‖W_{n−J}^* T′(λ)x‖ / γ,

         v = Φ_J Ψ_J^* T′(λ)x / ‖Ψ_J^* T′(λ)x‖,    and    g = (1/s)(x/γ − cv).

Here, γ is a generalized norm of x, and c and s with c² + s² = 1 are the generalized cosine and sine, respectively, of the angle between x and v ∈ span{φ_1, . . . , φ_J}. It can be shown without difficulty that

(2.7)    Ψ_J^* T′(λ)g = 0 and ‖W_{n−J}^* T′(λ)g‖ = 1,

and thus g ∈ range(U_{n−J}) is an error vector normalized as above. The eigenvector approximation error can be measured by the generalized sine s or tangent t = s/c.

2.2. Defective eigenvalues. An eigenvalue λ of (1.1) is defective if algT(λ) > geoT(λ). This definition is consistent with that for eigenvalues of a matrix.

For a defective eigenvalue λ, the structure of the resolvent T(μ)^{−1} near λ and the normalization condition of the sets of left and right eigenvectors are more complicated than they are in the semi-simple case.

Definition 2.2. Let λ be a defective eigenvalue of the holomorphic operator T(·) and φ_{·,0} a corresponding right eigenvector. Then the nonzero vectors φ_{·,1}, . . . , φ_{·,m−1} are called generalized eigenvectors if

(2.8)    Σ_{j=0}^{n} (1/j!) T^{(j)}(λ) φ_{·,n−j} = 0,    n = 1, . . . , m − 1,

where T^{(j)}(λ) = d^j T(μ)/dμ^j |_{μ=λ}. Then the ordered collection {φ_{·,0}, φ_{·,1}, . . . , φ_{·,m−1}} is called a right Jordan chain corresponding to λ. If (2.8) is satisfied for some m = m* and no more vectors can be introduced such that (2.8) is satisfied for m = m* + 1, then m* is called the length of the Jordan chain and a partial multiplicity of λ.

Similarly, let ψ_{·,0} be a left eigenvector of λ. One can define a left Jordan chain ψ_{·,0}, ψ_{·,1}, . . . , ψ_{·,m−1} by replacing T^{(j)}(λ) φ_{·,n−j} in (2.8) with ψ_{·,n−j}^* T^{(j)}(λ).


The structure of the resolvent T(μ)^{−1} near a defective λ is described as follows (cf. Theorem A.10.2 of [12]).

Theorem 2.3. Let T : U → C^{n×n} be a Fredholm holomorphic function in a neighborhood of a defective eigenvalue λ of T, and let J and m_1, . . . , m_J be the geometric and partial multiplicities of λ. Suppose that {φ_{k,s}}, k = 1, . . . , J, s = 0, . . . , m_k − 1 is a canonical system of right Jordan chains of T corresponding to λ. Then
(i) There is a unique canonical system of left Jordan chains {ψ_{k,s}}, k = 1, . . . , J, s = 0, . . . , m_k − 1 such that in a neighborhood of λ

(2.9)    T(μ)^{−1} = Σ_{k=1}^{J} Σ_{h=0}^{m_k−1} [ Σ_{s=0}^{h} ⟨·, ψ_{k,s}⟩ φ_{k,h−s} ] / (μ − λ)^{m_k−h} + Q(μ),

where Q is holomorphic in a neighborhood of λ.
(ii) The left Jordan chains {ψ_{k,s}} in (i) satisfy the following normalization conditions

(2.10)    Σ_{s=0}^{ℓ} Σ_{σ=s+1}^{m_k+s} (1/σ!) ⟨T^{(σ)}(λ) φ_{k,m_k+s−σ}, ψ_{j,ℓ−s}⟩ = δ_{jk} δ_{ℓ0},

          Σ_{s=0}^{ℓ} Σ_{σ=s+1}^{m_k+s} (1/σ!) ⟨T^{(σ)}(λ) φ_{k,ℓ−s}, ψ_{j,m_k+s−σ}⟩ = δ_{jk} δ_{ℓ0},

where ψ_{j,p} = 0, φ_{k,q} = 0 for p ≥ m_j, q ≥ m_k.

An important observation on the sets of right and left eigenvectors associated with a defective eigenvalue can be derived from Theorem 2.3. Note from Definition 2.2 of Jordan chains that T(λ)φ_{k,1} + T′(λ)φ_{k,0} = 0. Premultiplying by ψ_{j,0}^*, we have

(2.11)    ψ_{j,0}^* T(λ) φ_{k,1} + ψ_{j,0}^* T′(λ) φ_{k,0} = ψ_{j,0}^* T′(λ) φ_{k,0} = 0

for any j, k = 1, . . . , J. We shall see later that this property plays a critical role in the linear convergence of Newton-like methods towards defective eigenvalues.
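Property (2.11) can be checked on a toy defective pencil; the 2×2 Jordan-block example below is an illustrative assumption, not an example from the paper.

```python
import numpy as np

# Toy defective pencil T(mu) = A - mu*I with A a 2x2 Jordan block:
# lambda = 0 has algebraic multiplicity 2 and geometric multiplicity 1.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
Tprime = -np.eye(2)                  # T'(mu) = -I for this linear pencil

phi = np.array([1.0, 0.0])           # right eigenvector: T(0) phi = 0
psi = np.array([0.0, 1.0])           # left eigenvector:  psi^* T(0) = 0

# Property (2.11): psi^* T'(lambda) phi = 0, in sharp contrast with the
# normalization (2.1) available in the semi-simple case.
val = psi @ Tprime @ phi
print(abs(val))  # 0.0
```

This vanishing inner product is exactly what later forces a singular "leading block" in the Newton analysis for defective eigenpairs.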

2.3. Review of Newton-like algorithms. Consider the solution of a single eigenpair (λ, v) of problem (1.1) under a normalization condition:

(2.12)    F(λ, v) = [ T(λ)v ; u^*v − 1 ] = 0,

where u ∈ C^n is a fixed normalization vector such that u is not orthogonal to v; see, e.g., [18, 24]. This nonlinear system of equations involving n + 1 variables is usually referred to as the augmented system of problem (1.1). Application of Newton's method to the augmented system leads to the following iteration scheme:

(2.13)    [ x_{k+1} ]   [ x_k ]   [ T(μ_k)   T′(μ_k)x_k ]^{−1} [ T(μ_k)x_k   ]
          [ μ_{k+1} ] = [ μ_k ] − [ u^*      0          ]      [ u^*x_k − 1  ],

where the new eigenvector iterate x_{k+1} and the new eigenvalue iterate μ_{k+1} are computed simultaneously by solving a linear system involving the Jacobian of order n + 1. In general, it is more convenient to solve a linear system involving a coefficient matrix of order n to compute x_{k+1}, and then update μ_{k+1}


correspondingly. To this end, we exploit the structure of the block inverse of the Jacobian and obtain inverse iteration as follows:

(2.14)    p_k = T(μ_k)^{−1} T′(μ_k) x_k,
          x_{k+1} = p_k / (u^* p_k),
          μ_{k+1} = μ_k − 1/(u^* p_k),

which is mathematically equivalent to Newton's method. Inverse iteration is generally used as a substitute for the original Newton's method (2.13), and its convergence analysis is usually carried out using theories of the latter.
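As a minimal sketch (not one of the paper's experiments), inverse iteration (2.14) takes only a few lines of Python; the contrived test problem T(μ) = [[e^μ − 1, 1], [0, μ − 2]], with simple eigenvalue λ = 0 and eigenvector e_1, is an assumption for illustration.

```python
import numpy as np

# Contrived nonlinear eigenproblem: T(mu) = [[exp(mu)-1, 1], [0, mu-2]]
# has a simple eigenvalue lambda = 0 with eigenvector e_1.
def T(mu):
    return np.array([[np.exp(mu) - 1.0, 1.0],
                     [0.0, mu - 2.0]])

def Tprime(mu):
    return np.array([[np.exp(mu), 0.0],
                     [0.0, 1.0]])

u = np.array([1.0, 0.0])      # fixed normalization vector
x = np.array([1.0, 0.1])      # initial eigenvector approximation, u^T x = 1
mu = 0.3                      # initial eigenvalue approximation

for _ in range(5):
    p = np.linalg.solve(T(mu), Tprime(mu) @ x)  # p_k = T(mu_k)^{-1} T'(mu_k) x_k
    x = p / (u @ p)                             # x_{k+1} = p_k / (u^* p_k)
    mu = mu - 1.0 / (u @ p)                     # mu_{k+1} = mu_k - 1/(u^* p_k)

print(abs(mu))   # essentially zero after a handful of iterations
```

For this simple eigenpair the eigenvalue error drops roughly quadratically (0.3 → 5e-2 → 1e-3 → ...), consistent with the standard theory reviewed above.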

Here, we would like to point out that for a degenerate λ with geoT(λ) = J > 1, T(λ)v = 0 together with a single normalization vector may not uniquely determine the eigenvector v, since there could be infinitely many ways to form v ∈ span{φ_1, . . . , φ_J} such that u^*v = 1 holds. In many cases, fortunately, one is mostly interested in the computation of the eigenvalue alone, and any single eigenvector v ∈ span{φ_1, . . . , φ_J} would satisfy the needs of the applications. We assume this is the case throughout the paper, so that the study of single-vector Newton-like algorithms is most appropriate for this purpose. If the whole eigenspace of a degenerate eigenvalue is needed, one may have to use a block variant of Newton's method based on invariant pairs [13].

Inverse iteration can be modified to incorporate potentially more accurate eigenvalue approximations, so that the convergence can be accelerated for locally symmetric problems, i.e., those where T(λ) is real (skew) symmetric, complex (skew) Hermitian, or complex (skew) symmetric. The most important variant of this type is the Rayleigh functional iteration (RFI), a natural generalization of Rayleigh quotient iteration (RQI) to nonlinear eigenvalue problems. Given a right eigenvector approximation x_k ≈ v, one can choose some auxiliary vector y_k such that y_k^* T(ρ_k) x_k = 0 for some scalar ρ_k, and ρ_k = ρ_F(x_k; T, y_k) is called the value of the nonlinear Rayleigh functional ρ_F(·; T, y); see [26] and references therein. The RFI is described as follows:

(2.15)    choose a vector y_k and compute ρ_k = ρ_F(x_k; T, y_k) s.t. y_k^* T(ρ_k) x_k = 0;
          p_k = T(ρ_k)^{−1} T′(ρ_k) x_k;
          normalize p_k and assign it to x_{k+1}.
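A sketch of RFI (2.15) on the same kind of contrived problem follows; the choices y_k = x_k and a scalar Newton solve for the Rayleigh functional value are illustrative assumptions, not prescribed by the paper.

```python
import numpy as np

# Contrived test problem with simple eigenvalue lambda = 0 (illustrative).
def T(mu):
    return np.array([[np.exp(mu) - 1.0, 1.0],
                     [0.0, mu - 2.0]])

def Tprime(mu):
    return np.array([[np.exp(mu), 0.0],
                     [0.0, 1.0]])

def rayleigh_functional(x, y, rho0, tol=1e-14, maxit=50):
    """Solve the scalar equation y^* T(rho) x = 0 for rho by Newton's method."""
    rho = rho0
    for _ in range(maxit):
        step = (y @ T(rho) @ x) / (y @ Tprime(rho) @ x)
        rho -= step
        if abs(step) < tol:
            break
    return rho

x = np.array([1.0, 0.05])
x = x / np.linalg.norm(x)
rho = 0.2
for _ in range(3):
    y = x                                   # one simple choice of y_k
    rho = rayleigh_functional(x, y, rho)    # rho_k with y_k^* T(rho_k) x_k = 0
    p = np.linalg.solve(T(rho), Tprime(rho) @ x)
    x = p / np.linalg.norm(p)               # normalize p_k -> x_{k+1}

print(abs(rho))   # rapidly approaching the eigenvalue 0
```

Each outer step costs one linear solve plus a cheap scalar root-find, which is the trade the RFI makes for its improved eigenvalue approximation.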

RFI bears a close connection to the single-vector Jacobi-Davidson (JD) method, another widely used type of Newton-like eigenvalue algorithm. Starting with an eigenvector approximation x_k and its corresponding Rayleigh functional value ρ_k, single-vector JD solves a correction equation for a correction vector Δx_k such that the new eigenvector approximation is x_{k+1} = x_k + Δx_k. The correction equation of JD is a projection (restriction) of the linear system arising in RFI onto an (n − 1)-dimensional complementary space of span{x_k}. One can construct different variants of the JD correction equation. We focus our study on one variant, which assumes a general form and has been used to compute a simple eigenpair in [29]:

(2.16)    choose a vector y_k and compute ρ_k = ρ_F(x_k; T, y_k) s.t. y_k^* T(ρ_k) x_k = 0 and y_k^* T′(ρ_k) x_k ≠ 0;
          define Π_k^{(1)} = I − T′(ρ_k) x_k y_k^* / (y_k^* T′(ρ_k) x_k) and Π_k^{(2)} = I − x_k u^* / (u^* x_k),
          and solve the correction equation Π_k^{(1)} T(ρ_k) Π_k^{(2)} Δx_k = −T(ρ_k) x_k for Δx_k ⊥ u;
          x_{k+1} = x_k + Δx_k, and normalize when necessary.


It can be shown that the exact solution of the correction equation is

Δx_k = T(ρ_k)^{−1} T′(ρ_k) x_k / (u^* T(ρ_k)^{−1} T′(ρ_k) x_k) − x_k,

so that x_k + Δx_k = T(ρ_k)^{−1} T′(ρ_k) x_k up to a scaling factor. Therefore, this variant of JD is mathematically equivalent to RFI, provided that y_k^* T′(ρ_k) x_k ≠ 0, so that Π_k^{(1)} is well defined, and u^* T(ρ_k)^{−1} T′(ρ_k) x_k ≠ 0.

In practice, to enhance the robustness and the rate of convergence, JD usually works with a search subspace of variable dimension for eigenvector approximations, and in this case it is referred to as the full JD method. In this paper, we restrict our discussion to the local convergence of single-vector JD, which can be used as a worst-case estimate of the convergence rate of full JD.
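The claimed equivalence can be verified numerically on arbitrary data; the random matrices below are mere stand-ins for T(ρ_k) and T′(ρ_k), and nothing in this sketch comes from the paper itself.

```python
import numpy as np

# Check on random data: with u^* x = 1, the closed-form correction
# dx = T^{-1}T'x/(u^* T^{-1}T'x) - x satisfies the JD correction equation
# P1 T P2 dx = -T x and the constraint dx orthogonal to u.
rng = np.random.default_rng(0)
n = 5
Trho = rng.standard_normal((n, n))       # stand-in for T(rho_k)
Tp = rng.standard_normal((n, n))         # stand-in for T'(rho_k)
u = rng.standard_normal(n)
x = rng.standard_normal(n)
x = x / (u @ x)                          # normalize so that u^* x = 1

w = Trho @ x
y = rng.standard_normal(n)
y = y - (y @ w) / (w @ w) * w            # enforce y^* T(rho) x = 0

P1 = np.eye(n) - np.outer(Tp @ x, y) / (y @ (Tp @ x))
P2 = np.eye(n) - np.outer(x, u) / (u @ x)

p = np.linalg.solve(Trho, Tp @ x)
dx = p / (u @ p) - x                     # exact solution of the correction eq.

print(np.allclose(P1 @ Trho @ P2 @ dx, -Trho @ x))  # True
print(abs(u @ dx) < 1e-10)                          # True: dx is orthogonal to u
```

The identity holds because Π_k^{(1)} annihilates T′(ρ_k)x_k and leaves T(ρ_k)x_k unchanged (by the choice of y_k), so the projected system and the unprojected RFI solve agree up to scaling.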

2.4. Convergence of Newton's method near singular roots. We end this section by reviewing the local convergence of Newton's method for a nonlinear system of algebraic equations F(z) = 0 towards a root z* for which the Jacobian F′(z*) is singular. The typical convergence behavior can be roughly described as follows. The whole space for the variable z can be decomposed as a direct sum of the null space N_1 of F′(z*) and an appropriate complementary space M_1. When Newton's method converges towards z*, the component of the iterate error in N_1 converges linearly, whereas the error component in M_1 converges quadratically; see, e.g., [6, 7, 8]. To prepare for the convergence analysis of inverse iteration towards degenerate eigenvalues, we need to review some results from [8].

Let N_1 = null(F′(z*)) and M_2 = range(F′(z*)), so that codim(M_2) = dim(N_1). We choose complementary subspaces M_1, N_2 such that C^{n+1} = N_1 ⊕ M_1 = N_2 ⊕ M_2. Define P_{N_i} as the projections onto N_i along M_i, and P_{M_i} = I − P_{N_i}. Given these subspaces, the Jacobian F′(z) can be decomposed as

F′(z) = A_F(z) + B_F(z) + C_F(z) + D_F(z),

where

(2.17)    A_F(z) = P_{M_2} F′(z) P_{M_1},    B_F(z) = P_{M_2} F′(z) P_{N_1},
          C_F(z) = P_{N_2} F′(z) P_{M_1},    D_F(z) = P_{N_2} F′(z) P_{N_1}.

Let A_{F*} = P_{M_2} A_F(z*) P_{M_1}, which is a bijection when considered as an operator from M_1 into M_2. The conditions for the nonsingularity of F′(z) are described as follows.

Proposition 2.4. Let z be a candidate Newton iterate and e_z = z − z* be the iterate error. For z ∈ S_δ ≡ {z : ‖e_z‖ ≤ δ} with a sufficiently small δ > 0, F′(z) is nonsingular if and only if the Schur complement

(2.18)    S_F(z) ≡ D_F(z) − C_F(z) A_F^{−1}(z) B_F(z) : N_1 → N_2

is nonsingular. In this case, we have

(2.19)    F′(z)^{−1} = (A_F(z) + B_F(z) + C_F(z) + D_F(z))^{−1}
                     = P_{M_1}(A_F^{−1}(z) + A_F^{−1}(z) B_F(z) S_F^{−1}(z) C_F(z) A_F^{−1}(z)) P_{M_2}
                       − P_{M_1} A_F^{−1}(z) B_F(z) S_F^{−1}(z) P_{N_2}
                       − P_{N_1} S_F^{−1}(z) C_F(z) A_F^{−1}(z) P_{M_2} + P_{N_1} S_F^{−1}(z) P_{N_2}.


The convergence of Newton's method can be explored by studying each term in (2.19). To this end, note that Taylor expansions of the terms in (2.17) at z*, i.e., the restrictions of the Jacobian F′(z) onto M_i and N_i (i = 1, 2), are as follows:

(2.20)    A_F(z) = A_{F*} + Σ_{j=a}^{n} A^{(j)}(z) + O_{n+1}(e_z),
          B_F(z) = Σ_{j=b}^{n} B^{(j)}(z) + O_{n+1}(e_z),
          C_F(z) = Σ_{j=c}^{n} C^{(j)}(z) + O_{n+1}(e_z),
          D_F(z) = Σ_{j=d}^{n} D^{(j)}(z) + O_{n+1}(e_z),

where

(2.21)    A^{(j)}(z) = (1/j!) P_{M_2} F^{(j+1)}(z*)(e_z^j, P_{M_1}·),
          B^{(j)}(z) = (1/j!) P_{M_2} F^{(j+1)}(z*)(e_z^j, P_{N_1}·),
          C^{(j)}(z) = (1/j!) P_{N_2} F^{(j+1)}(z*)(e_z^j, P_{M_1}·), and
          D^{(j)}(z) = (1/j!) P_{N_2} F^{(j+1)}(z*)(e_z^j, P_{N_1}·)

are square matrices. Here, F^{(j+1)}(z*)(·, · · · , ·) stands for the (j + 1)st derivative of F at z* (a multilinear form (tensor) with j + 1 arguments), and e_z^j means that the first j arguments of F^{(j+1)} are all e_z. The values j = a, b, c, d > 0 in (2.20) are the smallest integers for which the vectors A^{(j)}(z)e_z, B^{(j)}(z)e_z, C^{(j)}(z)e_z, and D^{(j)}(z)e_z, respectively, are nonzero. In addition, let j = ā, b̄, c̄, d̄ be the smallest integers for which the matrices A^{(j)}(z), B^{(j)}(z), C^{(j)}(z) and D^{(j)}(z) are nonzero. Obviously, we have ā ≤ a, b̄ ≤ b, c̄ ≤ c and d̄ ≤ d.

With the above notation and definitions, a major convergence result for the standard Newton's method near singular points developed in [8] is summarized as follows.

Theorem 2.5 (Theorem 5.9 in [8]). Define the operator

D̄^{(j)} = (1/j!) P_{N_2} F^{(j+1)}(z*)((P_{N_1}e_z)^j, P_{N_1}·),

and let j = d̄ be the smallest integer for which D̄^{(j)} ≢ 0. Let e_z = z − z* be the error of z, and assume that d̄ = d ≤ c̄ and D̄^{(d)} is nonsingular for all P_{N_1}e_z ≠ 0. Define ℓ = min(a, b̄, c̄, d), and let z_0 be the initial Newton iterate. Then, for sufficiently small δ > 0 and θ > 0, F′(z_0) is nonsingular for all

z_0 ∈ W(δ, θ) ≡ {z : 0 < ‖e_z‖ ≤ δ, ‖P_{M_1}e_z‖ ≤ θ‖P_{N_1}e_z‖},

and all subsequent Newton iterates remain in W(δ, θ); in addition, z_k → z* with

‖P_{M_1}(z_k − z*)‖ ≤ C ‖P_{M_1}(z_{k−1} − z*)‖^{ℓ+1}


for some constant C > 0 independent of k, and

lim_{k→∞} ‖P_{N_1}(z_k − z*)‖ / ‖P_{N_1}(z_{k−1} − z*)‖ = d/(d + 1).

Theorem 2.5 states that under certain assumptions, as Newton's method converges towards z* for which F′(z*) is singular, the error component lying in M_1 converges at least quadratically, whereas the error component lying in N_1 converges only linearly. This observation will be used directly in Sections 3 and 4 to show the quadratic and linear convergence of inverse iteration, respectively, towards a semi-simple and a defective eigenpair.
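The linear rate d/(d + 1) in Theorem 2.5 is visible in the classical scalar model F(z) = z^{d+1}, whose root z* = 0 has a singular Jacobian; this toy example is illustrative only and does not appear in the paper.

```python
# For F(z) = z^{d+1}, Newton's update reduces to
#   z <- z - z^{d+1} / ((d+1) z^d) = z * d/(d+1),
# so every error ratio equals d/(d+1) exactly.
d = 2
z = 1.0
ratios = []
for _ in range(5):
    z_new = z - z**(d + 1) / ((d + 1) * z**d)
    ratios.append(z_new / z)
    z = z_new

print(ratios)   # every ratio is d/(d+1) = 2/3
```

Here the whole space is the "null direction" N_1, so only the linear part of Theorem 2.5 is exercised.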

3. Convergence for semi-simple eigenvalues. In this section, we study the local convergence of Newton-like methods for a semi-simple eigenvalue of the nonlinear algebraic eigenvalue problem (1.1). As we shall see, the Jacobian of the augmented system (2.12) is singular at a semi-simple eigenpair, and the convergence theory of Newton's method near singular roots indicates that the algorithm generally converges only linearly. However, we show that the singularity of the Jacobian does not hamper the quadratic convergence of Newton's method towards semi-simple eigenvalues; in addition, Rayleigh functional iteration converges at least quadratically, and it converges cubically for locally symmetric problems. These convergence results are very similar to those for simple eigenvalues; see, e.g., [25, 29].

3.1. Inverse iteration. Assume that (λ, v) is a semi-simple eigenpair of the holomorphic matrix pencil T(·), and φ_1, . . . , φ_J and ψ_1, . . . , ψ_J are the corresponding right and left eigenvectors. Since dim(null T(λ)) = J, there exists a singular value decomposition of T(λ) of the following form

Y^* T(λ) X = [ 0   0
               0   Σ_{n−J} ],

where X = [X_J X_{n−J}] and Y = [Y_J Y_{n−J}] are unitary matrices, X_J ∈ C^{n×J} and Y_J ∈ C^{n×J} have orthonormal columns that form a basis of span{φ_1, . . . , φ_J} and span{ψ_1, . . . , ψ_J}, respectively, and Σ_{n−J} is a diagonal matrix of positive singular values of T(λ). Therefore there exist nonsingular matrices K_J, M_J ∈ C^{J×J} such that X_J = [φ_1 . . . φ_J]K_J and Y_J = [ψ_1 . . . ψ_J]M_J. Since v ∈ span{φ_1, . . . , φ_J}, we can write v = [φ_1 . . . φ_J]d_v for some nonzero d_v ∈ C^J. It follows from (2.1) that

Y_J^* T′(λ)v = M_J^* [ψ_1 . . . ψ_J]^* T′(λ) [φ_1 . . . φ_J] d_v = M_J^* d_v,

and consequently

(3.1)    [ Y^*     ] [ T(λ)  T′(λ)v ] [ X     ]   [ Y^*T(λ)X   Y^*T′(λ)v ]
         [       1 ] [ u^*   0      ] [     1 ] = [ u^*X       0         ]

                                                   [ 0          0             M_J^* d_v         ]
                                                 = [ 0          Σ_{n−J}       Y_{n−J}^* T′(λ)v  ]
                                                   [ u^* X_J    u^* X_{n−J}   0                 ] .

Let h = [h_a^* h_b^* h_c]^* be a vector in the null space of the above square matrix, where h_a ∈ C^J, h_b ∈ C^{n−J} and h_c ∈ C. Then

         [ 0          0             M_J^* d_v        ] [ h_a ]   [ M_J^* d_v h_c                        ]   [ 0 ]
         [ 0          Σ_{n−J}       Y_{n−J}^* T′(λ)v ] [ h_b ] = [ Σ_{n−J} h_b + Y_{n−J}^* T′(λ)v h_c  ] = [ 0 ]
         [ u^* X_J    u^* X_{n−J}   0                ] [ h_c ]   [ u^* X_J h_a + u^* X_{n−J} h_b       ]   [ 0 ] .


In the first J rows, since M_J is nonsingular and d_v ≠ 0, we have M_J^* d_v ≠ 0, and therefore h_c = 0. The next n − J rows are thus simplified to Σ_{n−J} h_b = 0, and since Σ_{n−J} is diagonal with all nonzero entries, we have h_b = 0; the last row is then equivalent to u^* X_J h_a = 0. Now, since u with u^*v = 1 specifies the scaling of v = [φ_1 . . . φ_J]d_v in the desired eigenspace range(X_J), we have u^* X_J ≠ 0. Without loss of generality, assume that u^* X_J = [η_1 . . . η_J] where η_J ≠ 0. Then we can determine h_a and h.

h = [ h_a ; h_b ; h_c ] ∈ span{ [ e_1 − (η_1/η_J) e_J ; 0_{n−J+1} ], [ e_2 − (η_2/η_J) e_J ; 0_{n−J+1} ], . . . , [ e_{J−1} − (η_{J−1}/η_J) e_J ; 0_{n−J+1} ] },

where e_i denotes the i-th column of I_J and [a ; b] denotes the stacked column vector.

We see from (3.1) that the null space of the Jacobian [ T(λ)  T′(λ)v ; u^*  0 ] can be obtained by premultiplying h by the block diagonal matrix [ X  0 ; 0  1 ]:

(3.2)    N_1 ≡ null( [ T(λ)  T′(λ)v ; u^*  0 ] )
             = span{ [ Xe_1 − (η_1/η_J) Xe_J ; 0 ], [ Xe_2 − (η_2/η_J) Xe_J ; 0 ], . . . , [ Xe_{J−1} − (η_{J−1}/η_J) Xe_J ; 0 ] }
             ⊂ span{ [ φ_1 ; 0 ], . . . , [ φ_J ; 0 ] },

where dim(N_1) = J − 1 (J ≥ 2).

The special structure of the null space N_1 indicates that the convergence of Newton's method towards the semi-simple eigenvalue and the corresponding eigenspace (instead of the eigenvector v) is quadratic. To see this, define the space

M_1 = M_{1α} ⊕ M_{1β} = span{ [ Xe_J ; 0 ] } ⊕ range( [ X_{n−J}  0 ; 0  1 ] ),

so that dim(M_1) = n − J + 2. Therefore

C^{n+1} = span{ [ Xe_1 − (η_1/η_J) Xe_J ; 0 ], . . . , [ Xe_{J−1} − (η_{J−1}/η_J) Xe_J ; 0 ] }
          ⊕ ( span{ [ Xe_J ; 0 ] } ⊕ range( [ X_{n−J}  0 ; 0  1 ] ) )
        = N_1 ⊕ (M_{1α} ⊕ M_{1β}) = N_1 ⊕ M_1.


Let $P_{N_1}$ be the projector onto $N_1$ along $M_1$, $P_{M_1} = I - P_{N_1}$, and let
\[
e_k = \begin{bmatrix} x_k \\ \mu_k \end{bmatrix} - \begin{bmatrix} v \\ \lambda \end{bmatrix}
\]
be the error between the Newton iterate and the particular eigenpair $(\lambda, v)$ in the $k$th iteration. We know from Theorem 2.5 that $\|P_{N_1}(e_k)\|$ and $\|P_{M_1}(e_k)\|$, respectively, converge to zero linearly and quadratically. Therefore Newton's method converges towards the particular eigenpair $(\lambda, v)$ linearly.

Fortunately, the linear convergence of Newton's method towards $(\lambda, v)$ does not affect its quadratic convergence towards $\lambda$ and the desired eigenspace. The key observation is that the error between the Newton iterate and the eigenvalue together with its eigenspace lies in $M_1$, and $\|P_{M_1}(e_k)\|$ converges quadratically. In fact, the eigenspace approximation error in $\operatorname{range}\left(\begin{bmatrix} X_{n-J} \\ 0 \end{bmatrix}\right)$ together with the eigenvalue approximation error in $\operatorname{span}\left\{\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\}$ can be represented by $P_{M_{1\beta}}(e_k)$, the projection of $e_k$ onto $M_{1\beta} = \operatorname{range}\left(\begin{bmatrix} X_{n-J} & \\ & 1 \end{bmatrix}\right)$ along $N_1 \oplus M_{1\alpha} = \operatorname{span}\left\{\begin{bmatrix} \varphi_1 \\ 0 \end{bmatrix}, \dots, \begin{bmatrix} \varphi_J \\ 0 \end{bmatrix}\right\}$. Therefore, $P_{M_{1\beta}}(e_k)$, instead of $P_{N_1}(e_k)$, represents the error between the Newton iterate and the space $E = \operatorname{span}\left\{\begin{bmatrix} \varphi_1 \\ \lambda \end{bmatrix}, \dots, \begin{bmatrix} \varphi_J \\ \lambda \end{bmatrix}\right\}$ spanned by all valid candidate eigenpairs. In addition, (3.2) shows that any vector lying in $N_1$ can be represented as the difference between two candidate eigenpairs, and thus $P_{N_1}(e_k)$ bears no connection to the error between the Newton iterate and $E$ spanned by all candidate eigenpairs. Since $M_{1\beta} \subset M_1$, the quadratic convergence of Newton's method towards the semi-simple $\lambda$ and its eigenspace (represented by $E$) is established from the quadratic convergence of $\|P_{M_1}(e_k)\|$. This result is summarized as follows.

Theorem 3.1. Let $\lambda$ be a semi-simple eigenvalue of the holomorphic operator $T(\cdot): U \to \mathbb{C}^{n\times n}$, and $\varphi_1, \dots, \varphi_J$ be the corresponding right eigenvectors. Let $(\mu_0, x_0)$ be a right eigenpair approximation such that $|\mu_0 - \lambda|$ and $\angle(x_0, \operatorname{span}\{\varphi_1, \dots, \varphi_J\})$ are sufficiently small. Assume that the conditions in Theorem 2.5 hold. Then inverse iteration (2.14) converges towards $\lambda$ together with its eigenspace quadratically.

3.2. Numerical experiments for inverse iteration. In this section, we provide numerical results illustrating the locally quadratic convergence of inverse iteration for semi-simple eigenvalues. The experiments are performed on five benchmark problems: one from the MatrixMarket [16], two from the NLEVP toolbox [2], and two constructed artificially. A description of these problems is given in Table 3.1.

problem        source      type  size  eigenvalue                 multiplicity  local symm.
tols1090       MM          lep   1090  -12.098                    200           real unsymm.
plasma_drift   NLEVP       pep   128   10.004129 - 0.19324032i    2             cplx unsymm.
schrodinger    NLEVP       qep   1998  -0.33111181 + 0.24495497i  2             cplx symm.
ss_art_symm    artificial  nep   256   0                          5             real symm.
ss_art_unsymm  artificial  nep   256   0                          5             real unsymm.

Table 3.1: Description of the test problems for semi-simple eigenvalues

For example, the problem tols1090 taken from the MatrixMarket (MM) is a linear eigenvalue problem (lep), defined by a matrix $A \in \mathbb{R}^{1090\times 1090}$; it has a semi-simple eigenvalue $\lambda = -12.098$ of multiplicity 200, and the matrix $T(\lambda) = \lambda I - A$ is real unsymmetric (local symmetry). Similarly, plasma_drift and schrodinger, taken from


NLEVP, respectively, are a polynomial eigenvalue problem (pep) of degree 3 and a quadratic eigenvalue problem (qep); they both have a semi-simple eigenvalue $\lambda$ of multiplicity 2, and $T(\lambda)$ is complex unsymmetric and complex symmetric, respectively. The two artificial problems are both truly nonlinear eigenvalue problems (nep), where $T(\mu)$ cannot be represented as a polynomial in $\mu$; specifically,

\[
T_{\mathrm{symm}}(\mu) = G_A^*\,D(\mu)\,G_A \quad \text{for ss\_art\_symm, and} \qquad (3.3)
\]
\[
T_{\mathrm{unsymm}}(\mu) = G_A^*\,D(\mu)\,G_B \quad \text{for ss\_art\_unsymm},
\]

where $D(\mu) = \operatorname{diag}([e^{\mu}-1;\ 2\sin\mu;\ -5\ln(1+\mu);\ 8\mu;\ \tan^{-1}(\mu);\ c_0 + c_1\mu + c_2\mu^2])$, $c_i \in \mathbb{R}^{251}$ $(i = 0, 1, 2)$, and $G_A, G_B \in \mathbb{R}^{256\times 256}$, generated by MATLAB's function randn, are random matrices whose entries follow the standard normal distribution. Both artificial problems have a semi-simple eigenvalue $\lambda = 0$ of multiplicity 5.
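As a quick sanity check, the construction (3.3) can be reproduced at a smaller size; the dimensions and random seed below are our own choices, not those of the actual test problem. At $\mu = 0$ the first five diagonal entries of $D(\mu)$ vanish, so $T(0)$ should have nullity 5.

```python
import numpy as np

# Scaled-down sketch of the ss_art_symm construction (3.3): the true
# problem uses n = 256 and c_i in R^251; n = 16, k = 11 and the seed
# are our own choices for a fast check.
rng = np.random.default_rng(0)
n, k = 16, 11
c0, c1, c2 = (rng.standard_normal(k) for _ in range(3))
G_A = rng.standard_normal((n, n))

def D(mu):
    head = np.array([np.expm1(mu), 2 * np.sin(mu),
                     -5 * np.log1p(mu), 8 * mu, np.arctan(mu)])
    tail = c0 + c1 * mu + c2 * mu ** 2
    return np.diag(np.concatenate([head, tail]))

def T_symm(mu):      # G_A^* D(mu) G_A, with G_A real so G_A^* = G_A^T
    return G_A.T @ D(mu) @ G_A

# The five "head" entries vanish at mu = 0, so T(0) has nullity 5,
# i.e., lambda = 0 has multiplicity 5.
nullity = n - np.linalg.matrix_rank(T_symm(0.0))
print(nullity)
```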

To test the local convergence of inverse iteration for these problems, we first construct an initial eigenvector approximation using the MATLAB commands

(3.4)    x_0 = v*cos(err_phi) + g*sin(err_phi);

where v is a normalized eigenvector corresponding to the desired eigenvalue of the given problem, g is a normalized random vector that has been orthogonalized against v, and err_phi is the error angle $\angle(x_0, v)$. An appropriate value for err_phi is obtained by trial and error, such that quadratic convergence of inverse iteration can be clearly observed. The initial eigenvalue approximation $\mu_0$ is defined as the value of the one-sided Rayleigh functional $\rho_F(x_0; T, y)$ with the auxiliary vector $y = T'(\lambda)v$, and therefore $|\mu_0 - \lambda| = O(\angle(x_0, v))$.
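A minimal Python analogue of (3.4) shows that this construction indeed produces a prescribed error angle; the vector v below is a random unit stand-in, not a true eigenvector of any test problem, since only the geometry matters here.

```python
import numpy as np

# Python analogue of the MATLAB construction (3.4); v is a random
# unit stand-in for an eigenvector.
rng = np.random.default_rng(1)
n = 50
v = rng.standard_normal(n)
v /= np.linalg.norm(v)

g = rng.standard_normal(n)
g -= (v @ g) * v                # orthogonalize g against v
g /= np.linalg.norm(g)

err_phi = 1e-3                  # prescribed error angle
x0 = v * np.cos(err_phi) + g * np.sin(err_phi)

angle = np.arccos(np.clip(abs(v @ x0), -1.0, 1.0))
print(angle)   # ~1e-3: the angle between x0 and v equals err_phi
```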

To estimate the convergence rate of inverse iteration, we monitor how quickly the eigenresidual norm $e_k := \|T(\mu_k)x_k\|$ decreases as the algorithm proceeds. The residual norm is a natural measure of the eigenpair approximation error of $(\mu_k, x_k)$, and it can be evaluated at little cost. For all the test problems, it is easy to see that inverse iteration converges at least superlinearly. Nevertheless, our goal in this section is to show that the order of convergence $\ell$ of this algorithm is exactly 2, and the standard criterion $\|e_{k+1}\|/\|e_k\|^{\ell} \leq C$ may not be very descriptive for this purpose. For example, if $e_0 = 10^{-2}$, $e_1 = 10^{-5}$ and $e_2 = 10^{-12}$, it is then difficult to tell whether the convergence is quadratic, superquadratic or cubic, due to the very small number of iterations for which the standard criterion holds.

We now discuss an alternative approach to estimate $\ell$ in a more descriptive manner. First, we generate a sequence of initial approximations $(\mu_0^{(j)}, x_0^{(j)})$ such that $\angle(v, x_0^{(j+1)}) = \frac{1}{2}\angle(v, x_0^{(j)})$, and thus $|\mu_0^{(j+1)} - \lambda| \approx \frac{1}{2}|\mu_0^{(j)} - \lambda|$, since the value of the Rayleigh functional $\mu_0^{(j)} = \rho_F(x_0^{(j)}; T, y)$ satisfies $|\mu_0^{(j)} - \lambda| = O\big(\angle(x_0^{(j)}, v)\big)$. It can be shown that $e_0^{(j+1)} = \|T(\mu_0^{(j+1)})x_0^{(j+1)}\| \approx \frac{1}{2}e_0^{(j)} = \frac{1}{2}\|T(\mu_0^{(j)})x_0^{(j)}\|$. We then apply one step of inverse iteration to generate a sequence of new iterates $(\mu_1^{(j)}, x_1^{(j)})$, for which $e_1^{(j+1)} = \big(\tfrac{1}{2}\big)^{\ell} e_1^{(j)}$ holds, and an estimate of $\ell$ can be obtained. This approach is more descriptive for our purpose, because a relatively large number of initial approximations can be generated, for which the algorithm exhibits the $\ell$th order of convergence for at least one iteration.
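The least-squares estimation of $\ell$ from the pairs $(\log e_0^{(j)}, \log e_1^{(j)})$ can be sketched as follows, here on synthetic residuals that obey an exact quadratic model rather than data from an actual eigensolver:

```python
import numpy as np

# Synthetic residual pairs obeying an exact quadratic model
# e1 = C * e0^ell with ell = 2 (C = 0.3 is arbitrary): halving the
# initial residual should quarter the residual after one iteration.
e0 = 1e-2 * 0.5 ** np.arange(12)    # e0^(j), halved between neighbors
e1 = 0.3 * e0 ** 2                  # residuals after one iteration

# Slope of the least-squares line through (log e0^(j), log e1^(j))
# estimates the order of convergence ell.
slope, _ = np.polyfit(np.log(e0), np.log(e1), 1)
print(slope)
```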

The estimated order of convergence of inverse iteration is summarized in Table 3.2. To explain the results, take for instance the problem tols1090. We generated 20 initial approximations $(\mu_0^{(j)}, x_0^{(j)})$ $(j = 1, 2, \dots, 20)$, where $\angle(x_0^{(1)}, v) = 10^{-2}$ and $\angle(x_0^{(j+1)}, v) = \frac{1}{2}\angle(x_0^{(j)}, v)$. The estimates of the order of convergence $\ell$ are obtained by applying the least-squares line formula to $\big(\log e_0^{(j)}, \log e_1^{(j)}\big)$. As one can see, the estimated values of $\ell$ are very close to 2, indicating that inverse iteration converges quadratically, independent of the local symmetry; see Theorem 3.1.

problem        ∠(x_0^(1), v)  # init. approx.  estimated ℓ
tols1090       10^-2          20               2.022
plasma_drift   2×10^-1        20               2.002
schrodinger    10^-4          13               1.979
ss_art_symm    2×10^-2        16               2.008
ss_art_unsymm  2×10^-2        16               1.983

Table 3.2: Estimated order of convergence for inverse iteration

3.3. Rayleigh functional iteration and single-vector JD. We have seen that semi-simple eigenvalues form a class of degenerate eigenvalues for which Newton's method exhibits the same order of convergence as for simple eigenvalues. In fact, we can take a step further and show that Rayleigh functional iteration (RFI), a variant of Newton's method which approximates eigenvalues in a different manner, converges cubically towards a semi-simple eigenvalue $\lambda$ if the problem is locally symmetric, i.e., if $T(\lambda)$ is (skew) real/complex symmetric or (skew) Hermitian. In this case, note that the right and the left eigenvectors satisfy $\varphi_j = \overline{\psi_j}$ or $\varphi_j = \psi_j$ $(j = 1, \dots, J)$, so that a left eigenvector approximation can be obtained in each iteration without additional computational cost.

We now present the local convergence analysis of RFI. First, note that for any $v = \sum_{j=1}^J \alpha_j \varphi_j$ such that $T(\lambda)v = 0$, we have

\[
\sum_{k=1}^J (\psi_k^* T'(\lambda) v)\,\varphi_k = \sum_{k=1}^J \sum_{j=1}^J \alpha_j (\psi_k^* T'(\lambda)\varphi_j)\,\varphi_k = \sum_{k=1}^J \alpha_k \varphi_k = v. \qquad (3.5)
\]

Note from (2.2) that for such $v$, we have $Q(\lambda)T'(\lambda)v \in \operatorname{span}\{\varphi_1, \dots, \varphi_J\}$, where both $Q(\mu)$ and $T'(\mu)$ are holomorphic in a neighborhood of $\lambda$; that is, there exists some constant $\alpha$ such that

\[
Q(\lambda)T'(\lambda)v = \alpha v \quad \text{with} \quad v \in \operatorname{span}\{\varphi_1, \dots, \varphi_J\}. \qquad (3.6)
\]

In the following derivation, let $\rho = \rho_F(x; T, y)$ be the Rayleigh functional value corresponding to $x$ and an auxiliary vector $y$. To prepare for the analysis, we also need decompositions of the following vectors:

\[
Q(\lambda)T'(\lambda)g = \gamma_1(c_1 v_1 + s_1 g_1), \qquad (3.7)
\]
\[
Q'(\lambda)T'(\rho)x = \gamma_2(c_2 v_2 + s_2 g_2), \quad \text{and} \qquad (3.8)
\]
\[
Q(\rho)T''(\lambda)x = \gamma_3(c_3 v_3 + s_3 g_3), \qquad (3.9)
\]

which can be obtained by substituting the left-hand sides of (3.7)–(3.9), respectively, for $x$ in (2.5). Therefore $v_j \in \operatorname{span}\{\varphi_1, \dots, \varphi_J\}$ and

\[
T'(\lambda)g_j \perp \operatorname{span}\{\psi_1, \dots, \psi_J\} \qquad (3.10)
\]

for $j = 1, 2, 3$; see (2.5)–(2.7). Without loss of generality, assume that the right eigenvector approximation $x$ has a unit generalized norm, i.e., $\gamma = 1$; see (2.5) and


(2.6). Assuming that $\rho$ lies in a neighborhood of $\lambda$ in which $F$ is analytic, the generalized norms $\gamma_j$ of these vectors are bounded above by $O(1)$. It follows that the new unnormalized eigenvector approximation $p$ computed by RFI is

\begin{align*}
p = T^{-1}(\rho)T'(\rho)x
&= \frac{1}{\rho-\lambda}\sum_{k=1}^J (\psi_k^* T'(\rho)x)\,\varphi_k + Q(\rho)T'(\rho)x \\
&= \frac{1}{\rho-\lambda}\sum_{k=1}^J \psi_k^*\Big(T'(\lambda)(cv+sg) + (\rho-\lambda)T''(\lambda)x + \frac{(\rho-\lambda)^2}{2}T'''(\lambda)x\Big)\varphi_k \\
&\quad + Q(\lambda)T'(\lambda)(cv+sg) + (\rho-\lambda)\big(Q'(\lambda)T'(\rho) + Q(\rho)T''(\lambda)\big)x + O\big(|\rho-\lambda|^2\big) \\
&= \frac{cv}{\rho-\lambda} + \sum_{k=1}^J \Big(\psi_k^* T''(\lambda)x + \frac{\rho-\lambda}{2}\psi_k^* T'''(\lambda)x\Big)\varphi_k + c\alpha v + sQ(\lambda)T'(\lambda)g \\
&\quad + (\rho-\lambda)\big(Q'(\lambda)T'(\rho) + Q(\rho)T''(\lambda)\big)x + O\big(|\rho-\lambda|^2\big) \quad \text{(see (2.7), (3.5) and (3.6))} \\
&= \frac{cv}{\rho-\lambda} + \sum_{k=1}^J \eta_k\varphi_k + c\alpha v + s\gamma_1 c_1 v_1 + (\rho-\lambda)(\gamma_2 c_2 v_2 + \gamma_3 c_3 v_3) \\
&\quad + s\gamma_1 s_1 g_1 + (\rho-\lambda)(\gamma_2 s_2 g_2 + \gamma_3 s_3 g_3) + O\big(|\rho-\lambda|^2\big) \quad \text{(see (3.7)–(3.9))} \\
&= v_p + g_p + O\big(|\rho-\lambda|^2\big),
\end{align*}

where

\begin{align*}
v_p &= \frac{cv}{\rho-\lambda} + \sum_{k=1}^J \eta_k\varphi_k + c\alpha v + s\gamma_1 c_1 v_1 + (\rho-\lambda)(\gamma_2 c_2 v_2 + \gamma_3 c_3 v_3), \\
\eta_k &= \psi_k^* T''(\lambda)x + \frac{\rho-\lambda}{2}\psi_k^* T'''(\lambda)x, \quad \text{and} \\
g_p &= s\gamma_1 s_1 g_1 + (\rho-\lambda)(\gamma_2 s_2 g_2 + \gamma_3 s_3 g_3),
\end{align*}

so that $v_p \in \operatorname{span}\{\varphi_1, \dots, \varphi_J\}$ and $T'(\lambda)g_p \perp \operatorname{span}\{\psi_1, \dots, \psi_J\}$.

The convergence rates of RFI can be obtained by studying the error angle $\angle(p, v_p)$. Recall that $\rho = \rho_F(x; T, y)$ is the Rayleigh functional value such that $y^* T(\rho)x = 0$. Suppose that $x$ is a good right eigenvector approximation for which the generalized sine $s$ is sufficiently small; then $|\rho-\lambda| = O(s^2)$ if $T(\cdot)$ is locally symmetric and $|\rho-\lambda| = O(s)$ otherwise; see Theorem 5 in [26]. In both cases, it is easy to see that $(\rho-\lambda)^{-1}cv$ is the unique dominant term in $v_p$, and $\|v_p\| = O(|\rho-\lambda|^{-1})$; in addition, $\|g_p\| = O(s) + O(|\rho-\lambda|) = O(s)$. Following the discussion of the error angle $\angle(x, v)$ (see (2.5) and (2.6)), we see that the generalized tangent of the error angle $\angle(p, v_p)$ is bounded above by a quantity proportional to the ratio between the magnitude of the error component $g_p$ and that of the eigenvector component $v_p$. That is,

\[
\operatorname{gtan}\angle(p, v_p) \leq O\left(\frac{\|g_p\| + O(|\rho-\lambda|^2)}{\|v_p\| - O(|\rho-\lambda|^2)}\right) = O(|\rho-\lambda|\,s)
= \begin{cases} O(s^3) & \text{if } T(\cdot) \text{ is locally symmetric,} \\ O(s^2) & \text{otherwise.} \end{cases}
\]

In other words, the convergence rates of RFI towards semi-simple and simple eigenvalues are identical. This conclusion also holds for the single-vector JD method (2.16) because it is mathematically equivalent to RFI. The result is summarized in the following theorem.


Theorem 3.2. Let $\lambda$ be a semi-simple eigenvalue of the holomorphic operator $T(\cdot): U \to \mathbb{C}^{n\times n}$, and $x_0 = \gamma(c_0 v + s_0 g)$ be a corresponding initial right eigenvector approximation with a sufficiently small error angle $\angle(x_0, v)$. Then RFI (2.15) and single-vector JD (2.16) converge towards $\lambda$ and its eigenspace at least quadratically or at least cubically for locally nonsymmetric or symmetric problems, respectively.
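The mechanism behind the extra order for symmetric problems, namely that a two-sided Rayleigh functional value has error $O(s^2)$ in the eigenvector error angle $s$, can be illustrated on a toy symmetric linear problem of our own (for $T(\mu) = \mu I - A$ with $A$ symmetric, the Rayleigh functional reduces to the Rayleigh quotient):

```python
import numpy as np

# Toy symmetric linear problem T(mu) = mu*I - A (our own example).
# lambda = 2 is a semi-simple eigenvalue of multiplicity J = 2, and the
# two-sided Rayleigh functional reduces to the Rayleigh quotient.
A = np.diag([2.0, 2.0, 3.0, 5.0])
v = np.array([1.0, 0.0, 0.0, 0.0])   # unit eigenvector for lambda = 2
g = np.array([0.0, 0.0, 1.0, 0.0])   # unit vector orthogonal to the eigenspace

def rho(theta):
    x = v * np.cos(theta) + g * np.sin(theta)   # ||x|| = 1
    return x @ A @ x                            # Rayleigh quotient

r1 = abs(rho(1e-3) - 2.0)    # eigenvalue error at angle s
r2 = abs(rho(5e-4) - 2.0)    # eigenvalue error at angle s/2
print(r1 / r2)   # ~4: the error is O(s^2)
```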

3.4. Numerical experiments for RFI and JD. In this section, we test the convergence of RFI and single-vector JD on the five problems with semi-simple eigenvalues introduced in Section 3.2. We follow the same approach used for inverse iteration to estimate the order of convergence for RFI, the only difference being the generation of the initial eigenvalue approximations $\mu_0^{(j)}$. To achieve the maximum order of convergence for RFI, as discussed in Section 3.3, we use the two-sided Rayleigh functional whenever local symmetry exists; that is, for schrodinger and ss_art_symm (see Table 3.1), we choose $y = \overline{x_0^{(j)}}$ and $y = x_0^{(j)}$, respectively, and let $\mu_0^{(j)} = \rho_F(x_0^{(j)}; T, y)$, such that $|\mu_0^{(j)} - \lambda| = O\big(\angle(x_0^{(j)}, v)^2\big)$. As a result, RFI converges at least cubically for these problems.

Table 3.3 presents the estimated order of convergence for RFI. The results also hold for single-vector JD because the two algorithms are mathematically equivalent. We see that RFI converges quadratically for the three problems without local symmetry, and it converges cubically for the two problems with local symmetry (see Table 3.3). All these results are consistent with Theorem 3.2.

problem        ∠(x_0^(1), v)  # init. approx.  estimated ℓ
tols1090       10^-6          16               2.013
plasma_drift   5×10^-4        11               2.012
schrodinger    5×10^-4        10               2.972
ss_art_symm    2×10^-2        12               3.006
ss_art_unsymm  2×10^-2        17               1.997

Table 3.3: Estimated order of convergence for RFI/JD

4. Convergence for defective eigenvalues. We see in Section 3 that Newton-like methods converge towards semi-simple eigenvalues at least quadratically. In this section, we study the computation of defective eigenvalues, which is naturally more challenging. We show that the local convergence of the standard Newton-like methods towards a defective eigenpair is generally only linear. Then we propose an accelerated inverse iteration, and show that it converges quadratically under certain assumptions, where in each iteration two linear systems need to be solved. This convergence rate is also referred to as superlinear convergence of order $\sqrt{2}$. This algorithm is inspired by an accelerated Newton's method for solving a general nonlinear system for a singular root [8]. In addition, we present an accelerated single-vector Jacobi-Davidson method, which also converges quadratically in this case.

4.1. Inverse iteration. The proof of the linear convergence of inverse iteration for defective eigenvalues is similar to that for semi-simple eigenvalues. Let $\lambda$ be a defective eigenvalue of $T(\cdot)$ with geometric multiplicity $\operatorname{geo}_T(\lambda) = J$. This means that there are exactly $J$ right and $J$ left Jordan chains associated with $\lambda$, namely

\[
\{\varphi_{1,0}, \dots, \varphi_{1,m_1-1}\},\ \{\varphi_{2,0}, \dots, \varphi_{2,m_2-1}\},\ \dots,\ \{\varphi_{J,0}, \dots, \varphi_{J,m_J-1}\}, \ \text{and} \qquad (4.1)
\]
\[
\{\psi_{1,0}, \dots, \psi_{1,m_1-1}\},\ \{\psi_{2,0}, \dots, \psi_{2,m_2-1}\},\ \dots,\ \{\psi_{J,0}, \dots, \psi_{J,m_J-1}\},
\]


where $m_i \geq 2$ $(1 \leq i \leq J)$, such that they satisfy the properties described in Theorem 2.3. Consider a singular value decomposition of $T(\lambda)$ in the following form:

\[
Y^* T(\lambda) X = \begin{bmatrix} 0_J & 0 \\ 0 & \Sigma_{n-J} \end{bmatrix},
\]

where $X = [X_J \ X_{n-J}]$ and $Y = [Y_J \ Y_{n-J}]$ are unitary matrices, $X_J \in \mathbb{C}^{n\times J}$ and $Y_J \in \mathbb{C}^{n\times J}$ have orthonormal columns forming a basis of $\operatorname{span}\{\varphi_{1,0}, \dots, \varphi_{J,0}\}$ and $\operatorname{span}\{\psi_{1,0}, \dots, \psi_{J,0}\}$, respectively, and $\Sigma_{n-J}$ is a diagonal matrix of nonzero singular values of $T(\lambda)$. Therefore, there exist nonsingular matrices $K_J, M_J \in \mathbb{C}^{J\times J}$ such that $X_J = [\varphi_{1,0} \ \dots \ \varphi_{J,0}]K_J$ and $Y_J = [\psi_{1,0} \ \dots \ \psi_{J,0}]M_J$. Let $v = [\varphi_{1,0} \ \dots \ \varphi_{J,0}]d_v$ be a candidate right eigenvector, where $d_v \neq 0$. It follows from (2.11) that

\[
Y_J^* T'(\lambda)v = M_J^* [\psi_{1,0} \ \dots \ \psi_{J,0}]^* T'(\lambda)[\varphi_{1,0} \ \dots \ \varphi_{J,0}]d_v = 0.
\]

Thus we have

\[
\begin{bmatrix} Y^* & \\ & 1 \end{bmatrix}
\begin{bmatrix} T(\lambda) & T'(\lambda)v \\ u^* & 0 \end{bmatrix}
\begin{bmatrix} X & \\ & 1 \end{bmatrix}
= \begin{bmatrix} Y^* T(\lambda) X & Y^* T'(\lambda)v \\ u^* X & 0 \end{bmatrix}
= \begin{bmatrix} 0_J & 0 & 0 \\ 0 & \Sigma_{n-J} & Y_{n-J}^* T'(\lambda)v \\ u^* X_J & u^* X_{n-J} & 0 \end{bmatrix}.
\]

Let $h = [h_a^* \ h_b^* \ h_c]^*$ be a vector in the null space of the above square matrix, where $h_a \in \mathbb{C}^J$, $h_b \in \mathbb{C}^{n-J}$ and $h_c \in \mathbb{C}$. Then

\[
\begin{bmatrix} 0_J & 0 & 0 \\ 0 & \Sigma_{n-J} & Y_{n-J}^* T'(\lambda)v \\ u^* X_J & u^* X_{n-J} & 0 \end{bmatrix}
\begin{bmatrix} h_a \\ h_b \\ h_c \end{bmatrix}
= \begin{bmatrix} 0 \\ \Sigma_{n-J}h_b + Y_{n-J}^* T'(\lambda)v\,h_c \\ u^* X_J h_a + u^* X_{n-J} h_b \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]

It follows from the second block row that $h_b = -\Sigma_{n-J}^{-1} Y_{n-J}^* T'(\lambda)v\,h_c$, and thus the last row is equivalent to $u^* X_J h_a - u^* X_{n-J}\Sigma_{n-J}^{-1} Y_{n-J}^* T'(\lambda)v\,h_c = 0$. Since $u$ specifies the scaling of $v = [\varphi_{1,0} \ \dots \ \varphi_{J,0}]d_v \in \operatorname{range}(X_J)$ such that $u^*v = 1$, we have $u^* X_J \neq 0$. Without loss of generality, assume that $u^* X_J = [\gamma_1 \ \dots \ \gamma_J]$ where $\gamma_J \neq 0$. Then, depending on whether $h_c = 0$, we can determine $h_a$, $h_b$ and $h$ as follows:

\[
h = \begin{bmatrix} h_a \\ h_b \\ h_c \end{bmatrix} \in \operatorname{span}\left\{
\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ -\gamma_1/\gamma_J \\ 0_{n-J} \\ 0 \end{bmatrix},
\ \dots,\
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \\ -\gamma_{J-1}/\gamma_J \\ 0_{n-J} \\ 0 \end{bmatrix},
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \eta \\ w \\ 1 \end{bmatrix}
\right\},
\]

where $\eta = \frac{1}{\gamma_J}u^* X_{n-J}\Sigma_{n-J}^{-1} Y_{n-J}^* T'(\lambda)v \in \mathbb{C}$ and $w = -\Sigma_{n-J}^{-1} Y_{n-J}^* T'(\lambda)v \in \mathbb{C}^{n-J}$. It follows that the null space $N_1$ of the Jacobian at $(\lambda, v)$ is of dimension $J$, which is


one dimension larger than that in the semi-simple case; see (3.2). In fact,

\[
N_1 \equiv \operatorname{null}\left(\begin{bmatrix} T(\lambda) & T'(\lambda)v \\ u^* & 0 \end{bmatrix}\right) \qquad (4.2)
\]
\[
= \begin{bmatrix} X & \\ & 1 \end{bmatrix}\operatorname{span}\left\{
\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \\ -\gamma_1/\gamma_J \\ 0_{n-J} \\ 0 \end{bmatrix}, \ \dots,\
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \\ -\gamma_{J-1}/\gamma_J \\ 0_{n-J} \\ 0 \end{bmatrix},
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ \eta \\ w \\ 1 \end{bmatrix}\right\}
\]
\[
= \operatorname{span}\left\{\begin{bmatrix} Xe_1 - \frac{\gamma_1}{\gamma_J}Xe_J \\ 0 \end{bmatrix}, \dots, \begin{bmatrix} Xe_{J-1} - \frac{\gamma_{J-1}}{\gamma_J}Xe_J \\ 0 \end{bmatrix}\right\}
\oplus \operatorname{span}\left\{\begin{bmatrix} \eta Xe_J + X_{n-J}w \\ 1 \end{bmatrix}\right\}.
\]
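The count $\dim(N_1) = J$ can again be checked numerically; the $3\times 3$ matrix below, with a $2\times 2$ Jordan block, is an illustrative assumption of our own. In contrast to the semi-simple case, where the nullity was $J - 1$, the bordered Jacobian now has nullity $J$:

```python
import numpy as np

# Toy defective problem T(mu) = mu*I - A: A has a 2x2 Jordan block for
# lambda = 2 (geo_T = J = 1, alg_T = 2); our own illustration.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
lam, n = 2.0, 3
v = np.array([1.0, 0.0, 0.0])   # eigenvector phi_{1,0}
u = v.copy()                    # u^* v = 1

Jac = np.zeros((n + 1, n + 1))
Jac[:n, :n] = lam * np.eye(n) - A   # T(lambda)
Jac[:n, n] = v                      # T'(lambda) v = v since T' = I
Jac[n, :n] = u

nullity = (n + 1) - np.linalg.matrix_rank(Jac)
print(nullity)   # J = 1 here, versus J - 1 = 0 for a simple eigenvalue
```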

The complementary space $M_1$ of $N_1$ can be defined as follows:

\[
M_1 = \operatorname{span}\left\{\begin{bmatrix} Xe_J \\ 0 \end{bmatrix}\right\} \oplus \operatorname{range}\left(\begin{bmatrix} X_{n-J} \\ 0 \end{bmatrix}\right), \qquad (4.3)
\]

so that $\dim(M_1) = n - J + 1$, and $\mathbb{C}^{n+1} = N_1 \oplus M_1$. From (4.2) and (4.3), it follows by the definitions of $N_1$ and $M_1$ that

\[
M_2 = \operatorname{range}\left(\begin{bmatrix} T(\lambda) & T'(\lambda)v \\ u^* & 0 \end{bmatrix}\right)
= \begin{bmatrix} T(\lambda) & T'(\lambda)v \\ u^* & 0 \end{bmatrix} M_1 \qquad (4.4)
\]
\[
= \operatorname{span}\left\{\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\} \oplus \operatorname{range}\left(\begin{bmatrix} T(\lambda)X_{n-J} \\ u^* X_{n-J} \end{bmatrix}\right) \qquad (u^* Xe_J = \gamma_J \neq 0 \text{ by assumption}),
\]

so that $\dim(M_2) = n - J + 1$. There is considerable freedom to choose the complementary space $N_2$; for example, one can define

\[
N_2 = \operatorname{range}\left(\begin{bmatrix} \big(T(\lambda)X_{n-J}\big)^{\perp} \\ 0 \end{bmatrix}\right), \qquad (4.5)
\]

where $\big(T(\lambda)X_{n-J}\big)^{\perp}$ consists of $J$ column vectors orthogonal to $\operatorname{range}\big(T(\lambda)X_{n-J}\big)$, and therefore $M_2$ and $N_2$ are orthogonal complements of each other.

Similar to the analysis for semi-simple eigenvalues, let $e_k = \begin{bmatrix} x_k \\ \mu_k \end{bmatrix} - \begin{bmatrix} v \\ \lambda \end{bmatrix}$ be the error between the Newton iterate and the particular eigenpair $(\lambda, v)$ in the $k$th step, $P_{N_1}$ be the projector onto $N_1$ along $M_1$, and $P_{M_1} = I - P_{N_1}$. To complete the analysis, we make the following assumption.

Assumption 4.1. For any sequence of Newton iterate errors $\{e_k\}$ whose components lying in $N_1$ converge to zero linearly, their components lying in any one-dimensional subspace of $N_1$ also converge to zero linearly.

The assumption states that the Newton iterate errors lying in any subspace of the kernel of the singular Jacobian exhibit qualitatively the same behavior, and no "special" subspace exists in which the iterate errors converge more quickly than they do in other subspaces. This assumption of isotropism is based on the observations of the behavior of Newton iterate errors in our numerical experiments.

Since the Jacobian is singular at $(\lambda, v)$, Theorem 2.5 shows that $\|P_{N_1}(e_k)\|$ and $\|P_{M_1}(e_k)\|$, respectively, converge to zero linearly and quadratically. From (4.2), since $\operatorname{span}\left\{\begin{bmatrix} \eta Xe_J + X_{n-J}w \\ 1 \end{bmatrix}\right\}$ is a subspace of $N_1$, by Assumption 4.1, the iterate errors $\{e_k\}$ have a component lying in this one-dimensional space which converges linearly. Given the form of this basis vector, we see that its last entry $1$ represents the eigenvalue approximation error, and it also contains the component $X_{n-J}w$, which represents an eigenvector approximation error ($\operatorname{range}(X_J)$ is the desired eigenspace). As a result, inverse iteration converges towards $\lambda$ and its eigenspace linearly.
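The linear convergence can be observed directly by running Newton's method on the bordered system for a minimal defective example (a hypothetical $2\times 2$ Jordan-block pencil of our own, not one of the paper's test problems); the eigenvalue error is asymptotically halved at each step:

```python
import numpy as np

# Newton's method on F(x, mu) = [T(mu) x; u^* x - 1] for the pencil
# T(mu) = mu*I - A with a 2x2 Jordan block at lambda = 2 (toy example).
A = np.array([[2.0, 1.0], [0.0, 2.0]])
u = np.array([1.0, 0.0])
lam = 2.0

x = np.array([1.0, 1e-3])
mu = lam + 1e-2
errs = []
for _ in range(8):
    T = mu * np.eye(2) - A
    F = np.concatenate([T @ x, [u @ x - 1.0]])
    Jac = np.block([[T, x.reshape(2, 1)],
                    [u.reshape(1, 2), np.zeros((1, 1))]])
    step = np.linalg.solve(Jac, -F)
    x, mu = x + step[:2], mu + step[2]
    errs.append(abs(mu - lam))

ratio = errs[-1] / errs[-2]
print(ratio)   # ~0.5: the eigenvalue error is halved per step
```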

4.2. Numerical experiments for inverse iteration. Numerical results are provided in this section to illustrate the linear convergence of inverse iteration for defective eigenvalues. We choose two problems from the NLEVP collection, and we construct four problems artificially, because the benchmark problems in the literature with defective eigenvalues are rather limited.

A description of these problems is given in Table 4.1. For example, the problem df_art_m1m2, which defines a matrix-valued function $T(\mu) \in \mathbb{C}^{256\times 256}$, is an artificially constructed, truly nonlinear problem. It has a defective eigenvalue $\lambda = 0$, and the algebraic and geometric multiplicities of $\lambda$ are $\operatorname{alg}_T(\lambda) = 5$ and $\operatorname{geo}_T(\lambda) = 4$, respectively; the lengths of the shortest and the longest Jordan chains associated with $\lambda$ are $\min\{m_i\} = 1$ and $\max\{m_i\} = 2$, respectively.

problem      source      type  size  eigenvalue  alg_T(λ)  geo_T(λ)  min{m_i}  max{m_i}
time_delay   NLEVP       nep   3     3πi         2         1         2         2
jordan3      artificial  lep   256   2           3         1         3         3
df_art_m1m2  artificial  nep   256   0           5         4         1         2
df_art_m1m3  artificial  nep   256   0           5         3         1         3
df_art_m2m3  artificial  nep   256   0           5         2         2         3
mirror       NLEVP       pep   9     0           9         7         1         2

Table 4.1: Description of the test problems for defective eigenvalues

Description of the test problems for defective eigenvalues

The four artificially constructed problems are as follows. For jordan3,

\[
T(\mu) = G_A^*\,D(\mu)\,G_B, \quad \text{where} \quad
D(\mu) = \begin{bmatrix}
\mu-2 & -1 & & \\
& \mu-2 & -1 & \\
& & \mu-2 & \\
& & & \mu I_{253} - \operatorname{diag}([0;\,1;\,c_1])
\end{bmatrix},
\]

and for the other three problems,

\[
T(\mu) = G_A^* \begin{bmatrix} D_5(\mu) & \\ & \operatorname{diag}([c_0 + \mu c_1 + \mu^2 c_2]) \end{bmatrix} G_B, \quad \text{where}
\]

\[
D_5(\mu) = \begin{bmatrix}
e^{\mu}-1 & 1-\tan^{-1}(2\mu) & & & \\
& 2\sin(\mu) & & & \\
& & -5\ln(1+\mu) & & \\
& & & 8\mu & \\
& & & & \tan^{-1}(3\mu)
\end{bmatrix}
\]

for df_art_m1m2,

\[
D_5(\mu) = \begin{bmatrix}
e^{\mu}-1 & 1-\tan^{-1}(2\mu) & & & \\
& 2\sin(\mu) & \cos(\mu^2) & & \\
& & -5\ln(1+\mu) & & \\
& & & 8\mu & \\
& & & & \tan^{-1}(3\mu)
\end{bmatrix}
\]

for df_art_m1m3, and

\[
D_5(\mu) = \begin{bmatrix}
e^{\mu}-1 & 1-\tan^{-1}(2\mu) & & & \\
& 2\sin(\mu) & \cos(\mu^2) & & \\
& & -5\ln(1+\mu) & & \\
& & & 8\mu & e^{-2\mu} \\
& & & & \tan^{-1}(3\mu)
\end{bmatrix}
\]

for df_art_m2m3, respectively, where $G_A$, $G_B$ and $c_i$ $(i = 0, 1, 2)$ are the random matrices and vectors described in Section 3.2. Among the six test problems, time_delay, jordan3 and df_art_m2m3 have Jordan chains of minimum length $\geq 2$. The linear convergence of inverse iteration for this type of problem is shown in Section 4.1.
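The Jordan structure of $\lambda = 0$ for, e.g., df_art_m2m3 can be checked from $D_5(0)$: the diagonal entries vanish there, while the couplings $1-\tan^{-1}(2\mu)$, $\cos(\mu^2)$ and $e^{-2\mu}$ all equal 1 at $\mu = 0$. The positions used below follow our reading of the construction and are an assumption:

```python
import numpy as np

# D5(0) for df_art_m2m3: the diagonal entries e^0 - 1, 2 sin 0,
# -5 ln 1, 0 and arctan 0 all vanish, while three superdiagonal
# couplings survive with value 1.
N = np.zeros((5, 5))
N[0, 1] = 1.0   # 1 - arctan(2*mu) at mu = 0
N[1, 2] = 1.0   # cos(mu^2)        at mu = 0
N[3, 4] = 1.0   # e^(-2*mu)        at mu = 0

geo = 5 - np.linalg.matrix_rank(N)   # geometric multiplicity of 0
chain = next(k for k in range(1, 6)
             if not np.linalg.matrix_power(N, k).any())
print(geo, chain)   # 2 and 3: chains of lengths 3 and 2, alg = 5
```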

To illustrate the analysis, we generated an initial eigenpair approximation $(\mu_0, x_0)$ in the same manner as we did for semi-simple eigenvalues. We then ran inverse iteration with $(\mu_0, x_0)$, and we found that the algorithm does converge linearly until the eigenvalue and eigenvector approximation errors decrease to a magnitude around $\epsilon^{1/\max\{m_i\}}$ ($\epsilon$ is the machine precision), the highest precision one can achieve by a single-vector algorithm for defective eigenvalues; see [19] for details. An estimate of the order of convergence was obtained by applying the least-squares line formula to the sequence $(\log e_k, \log e_{k+1})$ for $k = 1, 2, \dots$, where $e_k := \|T(\mu_k)x_k\|$. Note that we did not use $e_0$, because the residual norm $\|T(\mu_1)x_1\|$ is usually significantly smaller than $\|T(\mu_0)x_0\|$ (see Lemma 4.5 for an explanation), and thus the inclusion of $e_0$ produces considerable noise in our estimate. Table 4.2 shows that inverse iteration converges linearly for defective eigenvalues $\lambda$ for which the shortest Jordan chain is of length $\geq 2$.

problem      ∠(x_0^(1), v)  # iters  estimated ℓ
time_delay   ×10^-3         20       1.001
jordan3      5×10^-3        14       0.984
df_art_m2m3  10^-2          15       0.969

Table 4.2: Estimated order of convergence of inverse iteration for a defective λ (min{m_i} ≥ 2)

4.3. Rayleigh functional iteration and single-vector JD. The linear convergence of inverse iteration towards defective eigenpairs is much less satisfactory than its quadratic convergence towards semi-simple eigenpairs. Thus it is important to develop algorithms that converge more rapidly. By analogy with the order of convergence exhibited by the Newton-like methods for semi-simple eigenvalues, one might speculate that RFI and single-vector JD exhibit quadratic convergence towards defective eigenpairs for locally symmetric problems. Unfortunately, this is not the case, as we show later in Proposition 4.2, due to the special structure of the resolvent near defective eigenvalues. In fact, RFI generally converges only linearly towards defective eigenpairs, whether the local symmetry of $T(\cdot)$ exists or not.


To see this, we first assume for the sake of simplicity that the defective $\lambda$ has only one right Jordan chain $\{\varphi_{1,0}, \dots, \varphi_{1,m-1}\}$, such that $\operatorname{alg}_T(\lambda) = m$ and $\operatorname{geo}_T(\lambda) = 1$. Let $(\mu, x)$ be an eigenpair approximation, and $p = T^{-1}(\mu)T'(\mu)x$ be the unnormalized new eigenvector approximation computed by any of the Newton-like methods we discussed. Recalling the structure of $T^{-1}(\mu)$ from Theorem 2.3, we have

\begin{align*}
p = T^{-1}(\mu)T'(\mu)x
&= \sum_{h=0}^{m-1} \frac{\sum_{s=0}^{h} \langle T'(\mu)x, \psi_{1,s}\rangle\,\varphi_{1,h-s}}{(\mu-\lambda)^{m-h}} + Q(\mu)T'(\mu)x \qquad (4.6) \\
&= \sum_{i=0}^{m-1} \frac{\langle T'(\mu)x, \psi_{1,0}\rangle}{(\mu-\lambda)^{m-i}}\varphi_{1,i}
+ \sum_{i=0}^{m-2} \frac{\langle T'(\mu)x, \psi_{1,1}\rangle}{(\mu-\lambda)^{m-1-i}}\varphi_{1,i} + \dots \\
&\quad + \sum_{i=0}^{1} \frac{\langle T'(\mu)x, \psi_{1,m-2}\rangle}{(\mu-\lambda)^{2-i}}\varphi_{1,i}
+ \frac{\langle T'(\mu)x, \psi_{1,m-1}\rangle}{\mu-\lambda}\varphi_{1,0} + Q(\mu)T'(\mu)x \\
&= \sum_{j=0}^{m-1}\sum_{i=0}^{m-j-1} \frac{\langle T'(\mu)x, \psi_{1,j}\rangle}{(\mu-\lambda)^{m-j-i}}\varphi_{1,i} + Q(\mu)T'(\mu)x.
\end{align*}
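The structure used in (4.6) can be made concrete for a single $2\times 2$ Jordan block, where the resolvent of $T(\mu) = \mu I - A$ is explicit (a toy example of ours, not from the paper):

```python
import numpy as np

# Resolvent of T(mu) = mu*I - A for a single 2x2 Jordan block at
# lambda = 2 (m = 2):
#   T(mu)^(-1) = [[1/(mu-2), 1/(mu-2)^2], [0, 1/(mu-2)]],
# exhibiting the (mu - lambda)^(-m) leading term that drives (4.6).
A = np.array([[2.0, 1.0], [0.0, 2.0]])
mu = 2.0 + 1e-2
Tinv = np.linalg.inv(mu * np.eye(2) - A)

# Coefficient of the 1/(mu - 2)^2 term, expected to be 1.
second_order = Tinv[0, 1] * (mu - 2.0) ** 2
print(second_order)
```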

To analyze the direction of $p$, assume that the eigenvector $\varphi_{1,0}$ and the generalized eigenvector $\varphi_{1,1}$ are not parallel. Then for every $j < m-1$, consider the following component appearing in the last equation of (4.6):

\[
\sum_{i=0}^{m-j-1} \frac{\langle T'(\mu)x, \psi_{1,j}\rangle}{(\mu-\lambda)^{m-j-i}}\varphi_{1,i}
= \frac{\langle T'(\mu)x, \psi_{1,j}\rangle}{(\mu-\lambda)^{m-j}} \sum_{i=0}^{m-j-1} (\mu-\lambda)^i\varphi_{1,i}. \qquad (4.7)
\]

The above expression shows clearly that the ratio between the magnitude of the error component (in the direction of $\varphi_{1,1}$) and the magnitude of the eigenvector component (in the direction of $\varphi_{1,0}$) in (4.7) is on the order of $O(\mu-\lambda)$. Now for $j = m-1$ in (4.6), assume in addition that $\langle T'(\lambda)\varphi_{1,0}, \psi_{1,m-1}\rangle \neq 0$, so that $\langle T'(\mu)x, \psi_{1,m-1}\rangle \approx \langle T'(\lambda)\varphi_{1,0}, \psi_{1,m-1}\rangle = O(1)$ for $\mu$ sufficiently close to $\lambda$ and $x$ sufficiently close to $\varphi_{1,0}$ in direction. Consider the corresponding component in the last equation of (4.6), namely,

\[
\frac{\langle T'(\mu)x, \psi_{1,m-1}\rangle}{\mu-\lambda}\varphi_{1,0} + Q(\mu)T'(\mu)x. \qquad (4.8)
\]

Similarly, the ratio between the magnitude of the error component and that of the eigenvector component of (4.8) is also on the order of $O(\mu-\lambda)$. Therefore, when normalized, the new eigenvector approximation $p$ in (4.6) has an eigenvector component of magnitude $O(1)$ and an error component of magnitude $O(\mu-\lambda)$.

As a result, the local convergence rates of Newton-like methods depend on the accuracy of the eigenvalue approximation $\mu$, which in turn usually depends on the accuracy of the eigenvector approximation $x$. Let the accuracy of $x$ be represented by the sine of the error angle $\angle(x, \varphi_{1,0})$, and assume that $|\mu-\lambda| = O(\sin^{\ell}\angle(x, \varphi_{1,0}))$. Then the convergence of the Newton-like method is of order $\ell$. In particular, for RFI, where $\mu = \rho_F(x; T, y)$ is the Rayleigh functional value, we have $\ell = 1$ in general for defective eigenvalues, and therefore RFI converges only linearly. This observation is summarized as follows.

Proposition 4.2. Let $\lambda$ be a defective eigenvalue of the holomorphic operator $T(\cdot): U \to \mathbb{C}^{n\times n}$, and $\varphi$ and $\psi$ be a corresponding unit right and a unit left eigenvector, respectively. Let $x = \varphi\cos\alpha + \varphi_{\perp}\sin\alpha$ and $y = \psi\cos\beta + \psi_{\perp}\sin\beta$, where $\alpha, \beta < \frac{\pi}{2}$, and $\varphi_{\perp} \perp \operatorname{null}(T(\lambda))$ and $\psi_{\perp} \perp \operatorname{null}\big(T(\lambda)^*\big)$ are unit vectors, so that $\|x\| = \|y\| = 1$. Assume that $y^* T'(\lambda)x \neq 0$. Let $\rho = \rho_F(x; T, y)$ be the Rayleigh functional value closest to $\lambda$ such that $y^* T(\rho)x = 0$. Then for sufficiently small $\alpha$,

|Ļāˆ’ Ī»| ā‰¤ 2ā€–T (Ī»)ā€–| sinĪ± sinĪ²||yāˆ—T ā€²(Ī»)x|

(4.9)

=2ā€–T (Ī»)ā€–| sinĪ± sinĪ²|āˆ£āˆ£cosĪ² sinĪ±Ļˆāˆ—T ā€²(Ī»)Ļ•āŠ„ + sinĪ² cosĪ±Ļˆāˆ—āŠ„T

ā€²(Ī»)Ļ•+ sinĪ² sinĪ±Ļˆāˆ—āŠ„Tā€²(Ī»)Ļ•āŠ„

āˆ£āˆ£ .Assume in addition that Ļˆāˆ—T ā€²(Ī»)Ļ•āŠ„, Ļˆāˆ—āŠ„T

ā€²(Ī»)Ļ• and Ļˆāˆ—āŠ„Tā€²(Ī»)Ļ•āŠ„ are all bounded away

from zero. Then, for sufficiently small Ī±, |Ļāˆ’ Ī»| ā‰¤ O(sinĪ±).Proof. The first part of the proposition through (4.9) was shown in [29, Section

4.3] using the Newton-Kantorovich Theorem. The goal here is to show that, in con-trast to the scenario for semi-simple eigenvalues, we have |Ļāˆ’ Ī»| ā‰¤ O(sinĪ±) (insteadof O(sinĪ± sinĪ²)), whether the local symmetry of T (Ā·) exists or not.

First, assume that $y$ is not a good left eigenvector approximation, i.e., $\sin\beta = O(1)$. Since $\sin\alpha$ is small, we have $2\|T(\lambda)\|\,|\sin\alpha\sin\beta| = O(\sin\alpha)$, and from (4.9), $|y^* T'(\lambda)x| = |\sin\beta\cos\alpha\,\psi_{\perp}^* T'(\lambda)\varphi + O(\sin\alpha)| = O(1)$. Therefore it follows that

\[
|\rho-\lambda| \leq \frac{2\|T(\lambda)\|\,|\sin\alpha\sin\beta|}{|y^* T'(\lambda)x|} = O(\sin\alpha). \qquad (4.10)
\]

Now assume that $y$ is a good left eigenvector approximation such that $\sin\beta = O(\sin\alpha)$; in particular, suppose the local symmetry of $T(\cdot)$ exists, and thus we choose $y = \bar{x}$ or $y = x$ to generate the two-sided Rayleigh functional value. In this case, $|\sin\alpha| = |\sin\beta|$, and it follows from (4.9) that $2\|T(\lambda)\|\,|\sin\alpha\sin\beta| = O(\sin^2\alpha)$ and $|y^* T'(\lambda)x| = O(\sin\alpha)$. Therefore, (4.10) still holds.
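The proof's conclusion can be observed numerically on a minimal complex-symmetric defective example of our own (not one of the paper's test problems): even with the two-sided choice $y = \bar{x}$, the Rayleigh functional error is only first order in the eigenvector error angle.

```python
import numpy as np

# A = [[1, i], [i, -1]] is complex symmetric with A @ A = 0: lambda = 0
# is defective with a single Jordan chain of length m = 2 (toy example).
A = np.array([[1.0, 1.0j], [1.0j, -1.0]])
phi = np.array([1.0, 1.0j]) / np.sqrt(2)   # right eigenvector, A @ phi = 0
g = np.array([1.0, -1.0j]) / np.sqrt(2)    # unit vector with phi^* g = 0

def rho(alpha):
    # Two-sided Rayleigh functional with y = conj(x):
    # y^* T(rho) x = 0 gives rho = (x^T A x) / (x^T x), no conjugation.
    x = phi * np.cos(alpha) + g * np.sin(alpha)
    return (x @ (A @ x)) / (x @ x)

a = 1e-2
r = abs(rho(a))                 # works out to tan(a) for this example
print(r / abs(rho(a / 2)))      # ~2: the error only halves when alpha halves
```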

Remark. Proposition 4.2 shows that for a defective $\lambda$, the use of a left eigenvector approximation for the Rayleigh functional $\rho_F$ does not generate an eigenvalue approximation of second order accuracy, as is achieved for simple and semi-simple eigenvalues. This lack of high accuracy is attributed to the fact that $\psi^* T'(\lambda)\varphi = 0$. Moreover, the use of a left eigenvector approximation for $\rho_F$ could introduce additional complications. Namely, since $|y^* T'(\lambda)x| = O(\sin\alpha)$ from (4.9), it follows that $|y^* T'(\rho)x| \leq |y^* T'(\lambda)x| + O(\rho-\lambda) = O(\sin\alpha)$. Thus there is a risk that $y^* T'(\rho)x \neq 0$, a critical condition required by the definition of Rayleigh functionals (see [26]), may be violated, at least numerically. As a result, we see from (2.16) that the small magnitude of $y^* T'(\rho)x$ could introduce numerical difficulties in the projector $\Pi_k^{(1)}$ for single-vector JD. We therefore recommend using $y$ far from a left eigenvector approximation to compute the Rayleigh functional value for a defective $\lambda$.

From Proposition 4.2 together with (4.7) and (4.8), it follows immediately that RFI converges linearly towards defective eigenvalues. In addition, assuming that the Rayleigh functional value ρ = ρ_F(x; T, y) satisfies 〈T′(ρ)x, y〉 ≠ 0, the single-vector JD (2.16) also converges linearly in this case, because it is mathematically equivalent to RFI. We summarize this result in the following theorem.

Theorem 4.3. Let λ be a defective eigenvalue of the holomorphic operator T(·): U → C^{n×n} with exactly one right and one left Jordan chain, {φ_{1,0}, …, φ_{1,m−1}} and {ψ_{1,0}, …, ψ_{1,m−1}}, where φ_{1,0} is not parallel to φ_{1,1}, and 〈T′(λ)φ_{1,0}, ψ_{1,m−1}〉 ≠ 0. Let (ρ_0, x_0) be an initial eigenpair approximation with ∠(x_0, φ_{1,0}) sufficiently small, and let ρ_k = ρ_F(x_k; T, y_k) be the Rayleigh functional value closest to λ. Assume that the upper bounds in Proposition 4.2 are qualitatively sharp, i.e., |ρ_k − λ| = O(sin ∠(x_k, φ_{1,0})). Then RFI converges locally towards (λ, φ_{1,0}) linearly. The same conclusion applies to single-vector JD (2.16) if, in addition, ρ_k is such that y_k∗T′(ρ_k)x_k ≠ 0.

4.4. Numerical experiments for RFI/JD. In this section, we illustrate by numerical experiments the linear convergence of RFI and single-vector JD for defective eigenvalues with a single Jordan chain. The experiments are performed on the problems time delay and jordan3. We run RFI with an initial eigenpair approximation (µ_0, x_0), and we analyze (log e_k, log e_{k+1}), where e_k := ‖T(µ_k)x_k‖ (k = 1, 2, …), to estimate the order of convergence. Here, note that the local symmetry does not exist for either test problem, and we choose y = T′(λ)x_k to generate the value of the Rayleigh functional ρ_F(x_k; T, y). Table 4.3 shows the error angle of the initial iterate, the number of iterations taken, and the estimated order of convergence. We see clearly that RFI converges linearly for both problems.

In addition, we tested the convergence of the two-sided Rayleigh functional iteration (TSRFI), which converges cubically for simple eigenvalues; see, e.g., [25]. We found that this algorithm also converges linearly in this setting, which is consistent with Proposition 4.2 and Theorem 4.3. As we have discussed, for a defective eigenvalue λ with a single Jordan chain, we generally have |ρ_F(x; T, y) − λ| = O(∠(x, v)) (instead of O(∠(x, v)²)), no matter whether y is a good left eigenvector approximation; consequently, TSRFI exhibits the same order of convergence (linear) as RFI. The results for TSRFI are also summarized in Table 4.3.

                                RFI                      TSRFI
problem        ∠(x_0, v)    # iters   estimated ℓ    # iters   estimated ℓ
time delay     10^{−3}         18       1.002           15       0.981
jordan3        5×10^{−3}       19       0.961           13       0.967

Table 4.3: Estimated order of convergence of RFI and TSRFI for a defective λ (geo_T(λ) = 1).
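The estimated order ℓ reported in Table 4.3 (and later in Table 4.4) comes from a least-squares fit of the pairs (log e_k, log e_{k+1}). The following Python/NumPy sketch shows one way to compute such an estimate; the function name and the synthetic data are our own illustration, not the authors' code.

```python
import numpy as np

def estimated_order(errors):
    """Least-squares estimate of ell in e_{k+1} ~ C * e_k**ell.

    Taking logs gives log e_{k+1} = ell * log e_k + log C, so the
    slope of a degree-1 fit of (log e_k, log e_{k+1}) estimates ell.
    """
    e = np.asarray(errors, dtype=float)
    slope, _intercept = np.polyfit(np.log(e[:-1]), np.log(e[1:]), 1)
    return slope

# synthetic sanity checks: a linearly convergent sequence (ratio 1/2)
# should give ell close to 1, a quadratically convergent one close to 2
lin = [0.1 * 0.5**k for k in range(8)]
quad = [1e-1, 1e-2, 1e-4, 1e-8, 1e-16]
print(estimated_order(lin), estimated_order(quad))
```

Estimates near 1.0 (as for RFI/TSRFI in Table 4.3) and near 2.0 (as for the accelerated methods in Table 4.4) are what slopes produced this way look like.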

4.5. Accelerated algorithms. In this section, we first study an accelerated inverse iteration for degenerate eigenvalues, inspired by the accelerated Newton's method for general nonlinear systems of equations near singular roots [8]. This algorithm, which requires the solution of two linear systems in each iteration, exhibits quadratic convergence towards defective eigenpairs. We then propose an accelerated single-vector Jacobi-Davidson method based on a minor modification of the accelerated inverse iteration, and we show by experiments that it also converges quadratically.

4.5.1. Accelerated inverse iteration. Let λ be a defective eigenvalue of the holomorphic T(·). To simplify the notation, we use MATLAB-style expressions to denote eigenpair approximations written in the form of column vectors; e.g., [x; µ] is used to represent the column vector obtained by stacking x on top of µ. Let F([v; λ]) = 0 be the augmented system of (1.1), and let [x_k; µ_k] be the starting eigenpair approximation in the kth iteration. Consider the following accelerated Newton's method:

\[
\begin{aligned}
[w_k;\nu_k] &= [x_k;\mu_k] - F'([x_k;\mu_k])^{-1}F([x_k;\mu_k]) &&\text{(half-step iterate)}\\
[x_{k+1};\mu_{k+1}] &= [w_k;\nu_k] - m\,F'([w_k;\nu_k])^{-1}F([w_k;\nu_k]) &&\text{(full-step iterate)}\\
&= m\bigl([w_k;\nu_k] - F'([w_k;\nu_k])^{-1}F([w_k;\nu_k])\bigr) - (m-1)[w_k;\nu_k],
\end{aligned}
\tag{4.11}
\]

where

\[
F([x_k;\mu_k]) = \begin{bmatrix} T(\mu_k)x_k \\ u^*x_k - 1 \end{bmatrix}
= \begin{bmatrix} T(\mu_k)x_k \\ 0 \end{bmatrix},
\qquad
F'([x_k;\mu_k]) = \begin{bmatrix} T(\mu_k) & T'(\mu_k)x_k \\ u^* & 0 \end{bmatrix},
\]

‖F([x_k; µ_k])‖ = ‖T(µ_k)x_k‖ is the residual norm of [x_k; µ_k], and F′([x_k; µ_k]) is the Jacobian of F at [x_k; µ_k]. In other words, the full-step eigenpair approximation [x_{k+1}; µ_{k+1}] is a special linear combination of the half-step iterate [w_k; ν_k] and the next standard Newton iterate [w_k; ν_k] − F′([w_k; ν_k])^{−1}F([w_k; ν_k]). We will see later in this section that the length m of the Jordan chain gives the only values of the linear combination coefficients in (4.11) for which the algorithm converges quadratically.
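The choice of the coefficients m and −(m − 1) can be motivated by a short back-of-the-envelope calculation (a heuristic only, not the proof developed below). Write e_w for the error of the half-step iterate and e_z for the error of the Newton iterate computed from it; along the slow null-space component, a Newton step for a Jordan chain of length m contracts the error by the classical linear factor (m − 1)/m, so

```latex
\[
  e_{z} \approx \frac{m-1}{m}\, e_{w}
  \quad\Longrightarrow\quad
  m\,e_{z} - (m-1)\,e_{w}
  \;\approx\; m\cdot\frac{m-1}{m}\,e_{w} - (m-1)\,e_{w} \;=\; 0 ,
\]
```

i.e., the combination in (4.11) cancels the O(s) error component, leaving only higher-order terms.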

It can be shown from the structure of the block inverse of the Jacobian that (4.11) is equivalent to the following accelerated inverse iteration (4.12):

1. p_k = T^{−1}(µ_k)T′(µ_k)x_k (half-step intermediate vector)
2. w_k = p_k/(u∗p_k) (half-step eigenvector approximation)
3. ν_k = µ_k − 1/(u∗p_k) (half-step eigenvalue approximation)
4. q_k = T^{−1}(ν_k)T′(ν_k)w_k (full-step intermediate vector)
5. x_{k+1} = −(m − 1)w_k + m q_k/(u∗q_k) (full-step eigenvector approximation)
6. µ_{k+1} = ν_k − m/(u∗q_k) (full-step eigenvalue approximation)
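To make the recursion (4.12) concrete, here is a minimal NumPy sketch of one half-step/full-step sweep on a hypothetical 2×2 linear problem T(λ) = J − λI, with J a single Jordan block, so that λ = 2 is defective with m = 2. This toy problem and all names in the code are our own choices, not the paper's test problems. For this particular 2×2 linear example the full step turns out to be exact in exact arithmetic, while the half step (a plain Newton step) reduces the eigenvalue error only by a constant factor, which makes the acceleration visible.

```python
import numpy as np

J = np.array([[2.0, 1.0], [0.0, 2.0]])   # single Jordan block: lam = 2, m = 2
T  = lambda lam: J - lam * np.eye(2)     # T(lam) = J - lam*I
Tp = lambda lam: -np.eye(2)              # T'(lam) = -I
u = np.array([1.0, 0.0])                 # normalization vector, u*x = 1
m = 2                                    # Jordan chain length

def accelerated_step(mu, x):
    """One sweep of the accelerated inverse iteration (4.12)."""
    p = np.linalg.solve(T(mu), Tp(mu) @ x)      # step 1: half-step vector
    w = p / (u @ p)                             # step 2
    nu = mu - 1.0 / (u @ p)                     # step 3
    q = np.linalg.solve(T(nu), Tp(nu) @ w)      # step 4: full-step vector
    x_new = -(m - 1) * w + m * q / (u @ q)      # step 5
    mu_new = nu - m / (u @ q)                   # step 6
    return mu_new, x_new, nu

mu0, x0 = 2.01, np.array([1.0, 0.02])
mu1, x1, nu = accelerated_step(mu0, x0)
# half step: only a constant-factor error reduction; full step: exact here
print(abs(nu - 2), abs(mu1 - 2), abs(x1[1]))
```

Note that the two linear solves per sweep are exactly the "two linear systems per iteration" cost mentioned above.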

In this section, we establish the locally quadratic convergence of the accelerated algorithm (4.11) (or (4.12)) towards the desired defective eigenpair. To this end, we first make an assumption about the eigenpair approximation (µ_k, x_k) as follows.

Assumption 4.4. Let λ be a defective eigenvalue of the holomorphic T(·) with J right and J left Jordan chains as shown in (4.1). Assume that the eigenpair approximation (µ_k, x_k) is such that its eigenvalue approximation error |µ_k − λ| is proportional to its eigenvector approximation error ∠(x_k, Φ), where Φ = span{φ_{1,0}, …, φ_{J,0}}.

This assumption seems reasonable due to the following observation. In practice, it is usually not known a priori whether the desired eigenpair is defective, and thus we use Newton's method, RFI or JD to solve for the eigenpair. For Newton's method, it is shown in Section 4.1 that its local convergence towards a defective eigenpair is linear; in fact, the error e_k = [x_k; µ_k] − [v; λ] has a component in the one-dimensional space span{[ηXe_J + X_{n−J}w; 1]} (a subspace of the null space of the Jacobian) which converges linearly. Since this basis vector has nontrivial components representing both eigenvalue and eigenvector approximation errors, the two errors become proportional to each other after sufficiently many Newton steps. This assumption also holds for RFI where µ_k = ρ is the Rayleigh functional value, since Proposition 4.2 shows that |ρ − λ| ≤ O(sin ∠(x_k, Φ)) for the defective λ, independent of the symmetry of T(λ).

Our main goal is to show that the accelerated method (4.11) converges quadratically towards the defective eigenpair [v; λ]. We take three steps to complete this analysis. In step 1, we make a few assumptions about the half-step iterate [w_k; ν_k], and we study the values of b, c, d in the Taylor expansion of F([w_k; ν_k]) at [v; λ]. This step is critical to establishing the quadratic convergence. In step 2, we show that the Jacobian at any eigenpair approximation sufficiently close to [v; λ] is nonsingular, so that all half-step and full-step iterates are well-defined. In step 3, we write the full-step iterate error as a linear combination of the error of [w_k; ν_k] and that of the next standard Newton iterate [z_k; ξ_k] = [w_k; ν_k] − F′([w_k; ν_k])^{−1}F([w_k; ν_k]); we analyze the projections of the full-step iterate error onto M_1 and N_1, and show that all the projected errors are bounded by O(s_k²), where s_k = sin ∠(x_k, φ_{1,0}) is the sine of the error angle between the current eigenvector approximation and the desired eigenvector.

In the following analysis, to simplify the notation, we omit the subscript k of s_k, c_k, [x_k; µ_k], and [w_k; ν_k] when there is no risk of confusion; we keep the subscript k+1 of [x_{k+1}; µ_{k+1}], though, to clearly identify the full-step iterate. To simplify the analysis, we again assume that λ has only one right Jordan chain {φ_{1,0}, …, φ_{1,m−1}}.

In step 1, we first show that the half-step iterate [w; ν] has the following special property. Since Newton's method converges linearly, the error of [w; ν] is proportional to that of [x; µ], yet the residual norm of [w; ν] is significantly smaller than that of [x; µ]; namely, ‖T(µ)x‖ = O(s) and ‖T(ν)w‖ = O(s^m), where s = sin ∠(x, φ_{1,0}) is the eigenvector approximation error.

Lemma 4.5. Let λ be a defective eigenvalue of the holomorphic operator T(·) with only one right and one left Jordan chain, {φ_{1,0}, …, φ_{1,m−1}} and {ψ_{1,0}, …, ψ_{1,m−1}}. Assume that 〈T″(λ)φ_{1,0}, ψ_{1,0}〉 ≠ 0 or 〈T′(λ)φ_{1,0}, ψ_{1,1}〉 ≠ 0. Let [x; µ] be an eigenpair approximation satisfying Assumption 4.4, let s = sin ∠(x, φ_{1,0}) be the sine of the error angle of x, and let [w; ν] be the half-step iterate in (4.11). Then ‖T(ν)w‖ = O(s^m) for sufficiently small s.

Proof. We first establish two critical relations, namely, |u∗T^{−1}(µ)T′(µ)x| = O(s^{−(m−1)}) and ‖T′(µ)x − T′(µ)w‖ = O(s). Assume without loss of generality that x = c φ_{1,0} + s g, where ‖φ_{1,0}‖ = ‖g‖ = 1 and g ⊥ φ_{1,0}.

We first show that |u∗T^{−1}(µ)T′(µ)x| = O(s^{−(m−1)}). From (4.6), we have that

\[
p = T^{-1}(\mu)T'(\mu)x
= \sum_{j=0}^{m-1}\sum_{i=0}^{m-j-1} \frac{\langle T'(\mu)x,\psi_{1,j}\rangle}{(\mu-\lambda)^{m-j-i}}\,\varphi_{1,i} + G(\mu)T'(\mu)x
\tag{4.13}
\]
\[
= \sum_{i=0}^{m-1} \frac{\langle T'(\mu)x,\psi_{1,0}\rangle}{(\mu-\lambda)^{m-i}}\,\varphi_{1,i}
+ \sum_{j=1}^{m-1}\sum_{i=0}^{m-j-1} \frac{\langle T'(\mu)x,\psi_{1,j}\rangle}{(\mu-\lambda)^{m-j-i}}\,\varphi_{1,i}
+ G(\mu)T'(\mu)x.
\]

Recall from (2.11) that 〈T′(λ)φ_{1,0}, ψ_{1,0}〉 = 0, and |µ − λ| = O(s) by Assumption 4.4. First, assume that 〈T″(λ)φ_{1,0}, ψ_{1,0}〉 = O(1). Then

\[
\langle T'(\mu)x,\psi_{1,0}\rangle
= c\,\langle T'(\mu)\varphi_{1,0},\psi_{1,0}\rangle + s\,\langle T'(\mu)g,\psi_{1,0}\rangle
\tag{4.14}
\]
\[
= c\,\langle T'(\lambda)\varphi_{1,0},\psi_{1,0}\rangle + c(\mu-\lambda)\langle T''(\lambda)\varphi_{1,0},\psi_{1,0}\rangle + s\,\langle T'(\lambda)g,\psi_{1,0}\rangle
+ O\bigl((\mu-\lambda)^2\bigr) + O\bigl(s(\mu-\lambda)\bigr)
\]
\[
= 0 + O(\mu-\lambda) + O(s) = O(s).
\]

The dominant term in \(\sum_{i=0}^{m-1}\langle T'(\mu)x,\psi_{1,0}\rangle(\mu-\lambda)^{-(m-i)}\varphi_{1,i}\), which appears in the last equality of (4.13), is thus

\[
\frac{\langle T'(\mu)x,\psi_{1,0}\rangle}{(\mu-\lambda)^{m}}\,\varphi_{1,0}
= \frac{O(s)}{O(s^m)}\,\varphi_{1,0}
= O(s^{-(m-1)})\,\varphi_{1,0}.
\]

Consider the sum of the terms corresponding to j = 1 in (4.13), namely,

\[
\sum_{i=0}^{m-2} \frac{\langle T'(\mu)x,\psi_{1,1}\rangle}{(\mu-\lambda)^{m-1-i}}\,\varphi_{1,i}.
\tag{4.15}
\]

Now, assume alternatively that 〈T′(λ)φ_{1,0}, ψ_{1,1}〉 ≠ 0. It follows that 〈T′(µ)x, ψ_{1,1}〉 = 〈T′(λ)φ_{1,0}, ψ_{1,1}〉 + O(s) = O(1) for small s. Therefore the dominant term in (4.15) is

\[
\frac{\langle T'(\mu)x,\psi_{1,1}\rangle}{(\mu-\lambda)^{m-1}}\,\varphi_{1,0}
= \frac{O(1)}{O(s^{m-1})}\,\varphi_{1,0}
= O(s^{-(m-1)})\,\varphi_{1,0}.
\]

For every j ≥ 2, the sum of the corresponding terms in (4.13) is bounded by O(s^{−(m−2)}), and they are thus not the dominant ones. In summary, if either 〈T″(λ)φ_{1,0}, ψ_{1,0}〉 ≠ 0 or 〈T′(λ)φ_{1,0}, ψ_{1,1}〉 ≠ 0, the dominant term in (4.13) can be written as O(s^{−(m−1)})φ_{1,0}, and |u∗T^{−1}(µ)T′(µ)x| = O(s^{−(m−1)}) follows.

To complete the proof, it is sufficient to show that ‖T′(µ)x − T′(µ)w‖ = O(s). In fact, since [w; ν] is obtained by applying one step of Newton's method to [x; µ], and both the eigenvalue and the eigenvector approximation errors of Newton's iterates converge linearly in this case (see Section 4.1), we have ‖w − φ_{1,0}‖ = O(s). Therefore

\[
\|T'(\mu)x - T'(\mu)w\| \le \|T'(\mu)\|\,\|x-w\|
\le \|T'(\mu)\|\bigl(\|x-\varphi_{1,0}\| + \|w-\varphi_{1,0}\|\bigr) = O(s).
\]

Finally, note from (4.12) that

\[
T(\nu)w = T\bigl(\mu - (u^*T^{-1}(\mu)T'(\mu)x)^{-1}\bigr)w
\tag{4.16}
\]
\[
= T(\mu)w - \frac{T'(\mu)w}{u^*T^{-1}(\mu)T'(\mu)x} + O\bigl((u^*T^{-1}(\mu)T'(\mu)x)^{-2}\bigr)
\]
\[
= T(\mu)\,\frac{T^{-1}(\mu)T'(\mu)x}{u^*T^{-1}(\mu)T'(\mu)x} - \frac{T'(\mu)w}{u^*T^{-1}(\mu)T'(\mu)x} + O(s^{2(m-1)})
\]
\[
= \frac{T'(\mu)x - T'(\mu)w}{u^*T^{-1}(\mu)T'(\mu)x} + O(s^{2(m-1)}).
\]

It then immediately follows that

\[
\|T(\nu)w\| = \frac{\|T'(\mu)x - T'(\mu)w\|}{|u^*T^{-1}(\mu)T'(\mu)x|} + O(s^{2(m-1)})
= \frac{O(s)}{O(s^{-(m-1)})} + O(s^{2(m-1)}) = O(s^m)
\]

for m ≥ 2. This completes the proof.

Lemma 4.5 gives a critical preliminary result for step 1 of the proof, concerning the values of b, c, and d in the Taylor expansion of F([w; ν]) at [φ_{1,0}; λ]. To complete this step, we need the following assumption about [w; ν].

Assumption 4.6. Let λ be a defective eigenvalue of the holomorphic operator T(·) with exactly one right and one left Jordan chain. Let M_2, defined in (4.4), be the range of the Jacobian at (λ, v) (where v = φ_{1,0}), and let [x; µ] and [w; ν] be the starting and half-step iterates, respectively, of the accelerated method (4.11). Assume that there exists a constant θ_0 > 0, independent of |µ − λ| and ∠(x, φ_{1,0}), such that the angle between F([w; ν]) = [T(ν)w; 0] and M_2 is bounded below by θ_0.

Assumption 4.6 seems reasonable in general, as we discuss below. By Lemma 4.5, T(ν)w has a significant component parallel to T′(µ)(w − x), where w − x ⊥ u because u∗w = u∗x = 1. Thus [T(ν)w; 0] has a large component in {[T′(µ)(span{u})^⊥; 0]}, where (span{u})^⊥ is the orthogonal complement of span{u}. Given the structure of M_2 in (4.4), with a proper choice of u, there is no reason to expect this component to be well approximated by any vector in M_2.

To complete step 1, recall the Taylor expansion of F([w; ν]) at [v; λ] as follows:

\[
F([w;\nu]) \equiv \begin{bmatrix} T(\nu)w \\ u^*w - 1 \end{bmatrix}
= \begin{bmatrix} T(\lambda)v \\ 0 \end{bmatrix}
+ \begin{bmatrix} T(\lambda) & T'(\lambda)v \\ u^* & 0 \end{bmatrix}
\begin{bmatrix} w - v \\ \nu - \lambda \end{bmatrix}
\]
\[
+ \sum_{j=1}^{n} \frac{1}{(j+1)!}
\begin{bmatrix} (\nu-\lambda)^j T^{(j)}(\lambda) &
\bigl\{ j(\nu-\lambda)^{j-1}T^{(j)}(\lambda)(w-v) + (\nu-\lambda)^j T^{(j+1)}(\lambda)v \bigr\} \\
0 & 0 \end{bmatrix}
\begin{bmatrix} w - v \\ \nu - \lambda \end{bmatrix}
\]
\[
+ O(\|e_{w\nu}\|^{n+2})
= \sum_{j=0}^{n} \frac{1}{(j+1)!} F^{(j+1)}([v;\lambda])(e_{w\nu}^{\,j}, e_{w\nu}) + O(\|e_{w\nu}\|^{n+2}),
\]

where e_{wν} = [w; ν] − [v; λ], F^{(j+1)}([v; λ])(·, ⋯, ·): C^{n+1} × ⋯ × C^{n+1} → C^{n+1} stands for the (j+1)st derivative of F at [v; λ] (a multilinear form (tensor) with j+1 arguments), and e_{wν}^{j} means that the first j arguments of F^{(j+1)} are all e_{wν}. In particular, F^{(j+1)}([v; λ])(e_{wν}^{j}, ·) is an (n+1)×(n+1) matrix.

Now we are ready to discuss several cases in which the possible values of b, c, and d in the Taylor expansion of F([w; ν]) at [v; λ] can be determined.

Lemma 4.7. Let [v; λ] be a defective eigenpair of T(·) with one right and one left Jordan chain of length m, [x; µ] a corresponding eigenpair approximation satisfying Assumption 4.4, and [w; ν] the half-step eigenpair approximation in (4.11) computed by one step of Newton's method. The Taylor expansion of F([w; ν]) at [v; λ] is

\[
F([w;\nu]) \equiv \begin{bmatrix} T(\nu)w \\ 0 \end{bmatrix}
= \begin{bmatrix} T(\lambda)v \\ 0 \end{bmatrix} + A_{F*}e_{w\nu}
+ \sum_{j=a}^{n} \frac{1}{j+1} A^{(j)}([w;\nu])\,e_{w\nu}
+ \sum_{j=b}^{n} \frac{1}{j+1} B^{(j)}([w;\nu])\,e_{w\nu}
\]
\[
+ \sum_{j=c}^{n} \frac{1}{j+1} C^{(j)}([w;\nu])\,e_{w\nu}
+ \sum_{j=d}^{n} \frac{1}{j+1} D^{(j)}([w;\nu])\,e_{w\nu}
+ O(\|e_{w\nu}\|^{n+2}), \qquad \bigl(n \ge \max(a,b,c,d)\bigr).
\]

Then b = 1 if m > 2. In addition, under Assumption 4.6, exactly one of the following scenarios must be true:
1. c ≥ d = m − 1, or
2. (only if m ≥ 3) c = m − 2 and d ≥ m − 1, or
3. (only if m ≥ 4) c ≤ m − 3 and d = c + 1.

Proof. We know from Lemma 4.5 that ‖F([w; ν])‖ = O(s^m). By Assumption 4.6, ∠(F([w; ν]), M_2) > θ_0 > 0. From (4.4) and (4.5), since M_2 and N_2 are orthogonal complements, and P_{N_2} is the projection onto N_2 along M_2, we have

\[
\|P_{N_2}F([w;\nu])\| \ge \|F([w;\nu])\|\sin\theta_0 = O(s^m)
\quad\text{and}\quad
\|P_{M_2}F([w;\nu])\| \le \|F([w;\nu])\|\cos\theta_0 = O(s^{\ell}),\ \ \text{where } \ell \ge m.
\]

In addition, by Assumption 4.4, [x; µ] has eigenvalue and eigenvector approximation errors both of order O(s), and the two errors are represented by certain components lying in N_1 and M_1, respectively; see (4.2) and (4.3). It follows that

\[
P_{N_1}e_{x\mu} = O(s) \quad\text{and}\quad P_{M_1}e_{x\mu} = O(s),
\quad\text{where } e_{x\mu} = [x;\mu] - [v;\lambda].
\]

By Theorem 2.5, we have P_{N_1}e_{wν} = O(s) and P_{M_1}e_{wν} = O(s²). To find the value of b for m > 2, note that

\[
P_{M_2}F([w;\nu]) = A_{F*}e_{w\nu}
+ \sum_{j=a}^{n} \frac{1}{j+1} A^{(j)}([w;\nu])\,e_{w\nu}
+ \sum_{j=b}^{n} \frac{1}{j+1} B^{(j)}([w;\nu])\,e_{w\nu}
\]
\[
= A_{F*}(P_{M_1}e_{w\nu})
+ \sum_{j=a}^{n} \frac{1}{j+1} A^{(j)}([w;\nu])(P_{M_1}e_{w\nu})
+ \sum_{j=b}^{n} \frac{1}{j+1} B^{(j)}([w;\nu])(P_{N_1}e_{w\nu})
\]
\[
= O(s^2) + O(s^{a+2}) + O(s^{b+1}) = O(s^{\ell}), \qquad \ell \ge m.
\]

For any m > 2, since O(s^{a+2}) ≤ O(s³), we must have b = 1 to cancel out the O(s²) terms in the last step of the above equality.

Similarly, to see the relation between c and d, we have

\[
P_{N_2}F([w;\nu]) = \sum_{j=c}^{n} \frac{1}{j+1} C^{(j)}([w;\nu])\,e_{w\nu}
+ \sum_{j=d}^{n} \frac{1}{j+1} D^{(j)}([w;\nu])\,e_{w\nu}
+ O(\|e_{w\nu}\|^{n+2})
= O(s^{c+2}) + O(s^{d+1}) + O(s^{n+2}) = O(s^m).
\tag{4.17}
\]

Clearly, (4.17) holds only if exactly one of the following cases is true:
1. If c ≥ m − 1, then we must have d = m − 1 (m ≥ 2) to maintain an O(s^m) term on the left-hand side of (4.17).
2. If c = m − 2, then we must have d ≥ m − 1 (m ≥ 3, b = 1) so that there is no term of order lower than m on the left-hand side of (4.17).
3. If c ≤ m − 3, then we must have d = c + 1 (m ≥ 4, b = 1) so that the two terms of order lower than m on the left-hand side of (4.17) are cancelled out.

The lemma is thus established.

Lemma 4.7 completes step 1, where we derived the possible values of b, c, and d in the Taylor expansion of F([w; ν]) at [v; λ]. In step 2, we show that all the half-step and full-step iterates of the accelerated inverse iteration are well-defined. To this end, it is sufficient to show that the Jacobian of Newton's method at any half-step or full-step eigenpair approximation [w; ν] or [x; µ] is nonsingular. The nonsingularity of the Jacobian is guaranteed by the following lemma.

Lemma 4.8. Let [x; µ] be an eigenpair approximation sufficiently close, but not equal, to [v; λ]. Then

\[
F'([x;\mu]) = \begin{bmatrix} T(\mu) & T'(\mu)x \\ u^* & 0 \end{bmatrix},
\]

the Jacobian of Newton's method at [x; µ], is nonsingular.

Proof. By assumption, the eigenvalues of T(·) are isolated, and therefore T(µ) is nonsingular if µ is sufficiently close, but not equal, to λ. It follows that [T(µ); u∗] has full column rank. Assume that the Jacobian is singular; then there exists y ∈ C^n such that

\[
\begin{bmatrix} T(\mu) \\ u^* \end{bmatrix} y = \begin{bmatrix} T'(\mu)x \\ 0 \end{bmatrix}.
\]

It follows that y = T(µ)^{−1}T′(µ)x and u∗y = 0. However, if x is sufficiently close to v in direction, and |µ − λ| is sufficiently small, then y, the unnormalized new eigenvector approximation, is closer to v in direction. Given the normalization condition u∗v = 1, it is impossible to have u∗y = 0. Therefore the Jacobian must be nonsingular.

In step 3, we first show that the error of the full-step iterate can be written as a linear combination of the half-step iterate error and the error of the next standard Newton iterate [z; ξ] = [w; ν] − F′([w; ν])^{−1}F([w; ν]); this derivation follows that in [8, Section 4]. Then, we study the projections of the full-step iterate error onto N_1 and M_1, and show that the projected errors are bounded by O(s²).

To decompose the error of the full-step iterate, define

\[
[z;\xi] = [w;\nu] - F'([w;\nu])^{-1}F([w;\nu]),
\]

the iterate obtained by applying one step of Newton's method to the half-step iterate. Then we have from (4.11) that [x_{k+1}; µ_{k+1}] = m[z; ξ] − (m − 1)[w; ν]. The error of [x_{k+1}; µ_{k+1}] can be analyzed by studying e_{zξ} = [z; ξ] − [v; λ]. Following the derivation in [8, Section 4], one can show that e_{zξ} satisfies

\[
e_{z\xi} = e_{w\nu} - F'([w;\nu])^{-1}F([w;\nu])
\tag{4.18}
\]
\[
= F'([w;\nu])^{-1}\Bigl\{ \sum_{j=a}^{n} \frac{j}{j+1} A^{(j)}([w;\nu])\,e_{w\nu}
+ \sum_{j=b}^{n} \frac{j}{j+1} B^{(j)}([w;\nu])\,e_{w\nu}
+ \sum_{j=c}^{n} \frac{j}{j+1} C^{(j)}([w;\nu])\,e_{w\nu}
+ \sum_{j=d}^{n} \frac{j}{j+1} D^{(j)}([w;\nu])\,e_{w\nu}
+ O(s^{n+2}) \Bigr\}.
\]

To study the errors of [x_{k+1}; µ_{k+1}] projected onto N_1 and M_1, we assume that the first case discussed in Lemma 4.7 holds, that is, c ≥ d = m − 1; we assume in addition that c̄ ≥ d̄. These assumptions are similar, but not identical, to those in [8, Section 7]. Since P_{M_1}e_{wν} = O(s²) and P_{N_1}e_{wν} = O(s), we have

\[
[x_{k+1};\mu_{k+1}] - [v;\lambda] = m[z;\xi] - (m-1)[w;\nu] - [v;\lambda] = m\,e_{z\xi} - (m-1)\,e_{w\nu}
\tag{4.19}
\]
\[
= m\,F'([w;\nu])^{-1}\Bigl\{ \tfrac{a}{a+1}A^{(a)}([w;\nu])e_{w\nu} + \tfrac{b}{b+1}B^{(b)}([w;\nu])e_{w\nu} + \tfrac{c}{c+1}C^{(c)}([w;\nu])e_{w\nu}
+ \tfrac{d}{d+1}D^{(d)}([w;\nu])e_{w\nu}
+ P_{M_2}O(s^{\min(a+3,\,b+2)}) + P_{N_2}O(s^{\min(c+3,\,d+2)}) \Bigr\}
- (m-1)\,F'([w;\nu])^{-1}F'([w;\nu])\,e_{w\nu}
\]
\[
= F'([w;\nu])^{-1}\Bigl\{ -(m-1)A_{F*}e_{w\nu}
+ \Bigl(\tfrac{ma}{a+1}-(m-1)\Bigr)A^{(a)}([w;\nu])e_{w\nu}
+ \Bigl(\tfrac{mb}{b+1}-(m-1)\Bigr)B^{(b)}([w;\nu])e_{w\nu}
+ P_{M_2}O(s^{\min(a+3,\,b+2)}) + P_{N_2}O(s^{d+2}) \Bigr\}.
\]

Here, in the last step of (4.19), C^{(c)}([w; ν])e_{wν} = C^{(c)}([w; ν])(P_{M_1}e_{wν}) = P_{N_2}O(s^{c+2}) does not appear explicitly, because it is assimilated into P_{N_2}O(s^{d+2}), where d ≤ c. More importantly, as d = m − 1 by our assumption, D^{(d)}([w; ν])e_{wν} vanishes, because its coefficient is m·d/(d+1) − (m − 1) = 0. The cancellation of D^{(d)}([w; ν])e_{wν} is critical in the proof of the quadratic convergence of the accelerated method.

To finish step 3, we show that the errors of [x_{k+1}; µ_{k+1}] projected onto N_1 and M_1 are both bounded by O(s²). To this end, recall from (2.19) the expression of F′([w; ν])^{−1}, and apply it to (4.19). With some algebraic manipulation, we have

\[
[x_{k+1};\mu_{k+1}] - [v;\lambda] \equiv e^{k+1}_{MM} + e^{k+1}_{MN} + e^{k+1}_{NM} + e^{k+1}_{NN},
\]

where

\[
e^{k+1}_{MM} = P_{M_1}\bigl\{A_F^{-1}([w;\nu]) + A_F^{-1}([w;\nu])B_F([w;\nu])S_F^{-1}([w;\nu])C_F([w;\nu])A_F^{-1}([w;\nu])\bigr\}
\times\Bigl\{-(m-1)A_{F*}e_{w\nu} + \tfrac{a+1-m}{a+1}A^{(a)}([w;\nu])e_{w\nu}
+ \tfrac{b+1-m}{b+1}B^{(b)}([w;\nu])e_{w\nu} + P_{M_2}O(s^{\min(a+3,\,b+2)})\Bigr\},
\]
\[
e^{k+1}_{MN} = -P_{M_1}A_F^{-1}([w;\nu])B_F([w;\nu])S_F^{-1}([w;\nu])\,P_{N_2}O(s^{d+2}),
\]
\[
e^{k+1}_{NM} = -P_{N_1}S_F^{-1}([w;\nu])C_F([w;\nu])A_F^{-1}([w;\nu])\,P_{M_2}\Bigl\{-(m-1)A_{F*}e_{w\nu}
+ \tfrac{a+1-m}{a+1}A^{(a)}([w;\nu])e_{w\nu}
+ \tfrac{b+1-m}{b+1}B^{(b)}([w;\nu])e_{w\nu} + P_{M_2}O(s^{\min(a+3,\,b+2)})\Bigr\},
\]
and
\[
e^{k+1}_{NN} = P_{N_1}S_F^{-1}([w;\nu])\,P_{N_2}O(s^{d+2}).
\]

We show that all the projected errors shown above are bounded by O(s²). First, note from (2.20) that the dominant term of A_F([w; ν]) is A_{F∗} = P_{M_2}F′([v; λ])P_{M_1}, which means ‖A_F([w; ν])‖ = O(1) and ‖A_F^{−1}([w; ν])‖ = O(1). In addition, by (2.18), we have ‖S_F([w; ν])‖ = O(s^{min(d, b+c)}). Thus the operator involved in e^{k+1}_{MM} satisfies

\[
\|A_F^{-1}([w;\nu]) + A_F^{-1}([w;\nu])B_F([w;\nu])S_F^{-1}([w;\nu])C_F([w;\nu])A_F^{-1}([w;\nu])\|
\le O(1) + O(s^{\,b+c-\min(d,\,b+c)}) \le O(1) + O(1) = O(1).
\]

It follows that ‖e^{k+1}_{MM}‖ = O(1)·(O(s²) + O(s^{a+2}) + O(s^{b+1})) = O(s²).

For the second error term e^{k+1}_{MN}, since c ≥ d by our assumption, we have

\[
\|A_F^{-1}([w;\nu])B_F([w;\nu])S_F^{-1}([w;\nu])\| \le O(s^{\,b-\min(d,\,b+c)}) = O(s^{\,b-d}),
\]

and it follows that ‖e^{k+1}_{MN}‖ ≤ O(s^{b−d})·O(s^{d+2}) ≤ O(s^{b+2}).

Similarly, for the third error term e^{k+1}_{NM}, we have

\[
\|S_F^{-1}([w;\nu])C_F([w;\nu])A_F^{-1}\| \le O(s^{\,c-\min(d,\,b+c)}) = O(s^{\,c-d}),
\]

and therefore ‖e^{k+1}_{NM}‖ ≤ O(s^{c−d})·(O(s²) + O(s^{a+2}) + O(s^{b+1})) ≤ O(s²).

Finally, the last error term e^{k+1}_{NN} can be bounded directly via ‖S_F^{−1}([w; ν])‖:

\[
\|e^{k+1}_{NN}\| \le O(s^{-\min(d,\,b+c)})\,O(s^{d+2}) = O(s^{-d+d+2}) \le O(s^2).
\]

The above analysis completes step 3, and the result is summarized as follows.

Theorem 4.9. Let λ be a defective eigenvalue of the holomorphic T(·) with one right and one left Jordan chain of length m, and let v = φ_{1,0} be the corresponding eigenvector. Let δ, θ > 0 be appropriately small constants, and let [x_0; µ_0] ∈ W(δ, θ) be an eigenpair approximation of [v; λ] satisfying Assumption 4.4. Under Assumption 4.6, suppose that for any half-step iterate [w_k; ν_k], c̄ ≥ d̄ and c ≥ d = m − 1. Then for sufficiently small sin ∠(x_0, v), the full-step iterates of the accelerated method (4.11) (or (4.12)) are well-defined, and [x_k; µ_k] converges towards [v; λ] quadratically.

Remark. One can see from the above derivation that c ≥ d = m − 1 and c̄ ≥ d̄ are critical assumptions leading to the quadratic convergence of the accelerated methods. In addition, [x_{k+1}; µ_{k+1}] = m([w; ν] − F′([w; ν])^{−1}F([w; ν])) − (m − 1)[w; ν] is the unique linear combination of the two iterates that cancels out the error terms lying in N_1 of order O(s). A violation of these assumptions, or a different linear combination, would lead to a deceleration of the convergence rate back to linear.

Theorem 4.9 is an extension of the major results in [8, Section 7] to the special setting of eigenvalue computation, under less stringent assumptions. Specifically, [8] studied two accelerated Newton's methods for solving a general nonlinear system of equations for a singular root. It was assumed there that a = ā, b = b̄, c = c̄ and d = d̄ for any intermediate and full-step iterates. Under certain additional hypotheses, the first accelerated method requires the computation of three Newton directions per iteration, assuming that c ≥ d and b ≥ min(2, d); the second one solves for two Newton directions per iteration, assuming that b ≥ 2 and c ≥ d ≥ 2. In fact, for the computation of degenerate eigenvalues discussed here, these hypotheses do not hold. Our assumption in Theorem 4.9 about the values of c, d, c̄, and d̄ is made for the half-step iterate [w; ν] alone, and it is less demanding than those in [8].

4.5.2. Accelerated single-vector JD. One could also design other accelerated Newton-like methods from the accelerated inverse iteration (4.12). In this section, we propose an accelerated single-vector JD, derived from a minor modification of (4.12). The only difference between the two algorithms lies in the computation of new eigenvalue approximations. In contrast to Newton's method and RFI, the JD methods compute normalized new eigenvector approximations directly, without forming the unnormalized version; as a result, the way the accelerated Newton's method updates eigenvalue approximations cannot be realized by JD; see Steps 3 and 6 in (4.12). To work around this difficulty, we use Rayleigh functional values as the new eigenvalue approximations, and we obtain the new algorithm as follows:

1. Choose a vector y_k, e.g., y_k = T′(µ_k)x_k; define
   Π_k^{(1)} = I − T′(µ_k)x_k y_k∗ / (y_k∗T′(µ_k)x_k), Π_k^{(2)} = I − x_k u∗ / (u∗x_k),
   and solve the correction equation
   Π_k^{(1)} T(µ_k) Π_k^{(2)} Δx_k = −(T(µ_k) − (y_k∗T(µ_k)x_k / y_k∗T′(µ_k)x_k) T′(µ_k)) x_k for Δx_k ⊥ u;
2. w_k = x_k + Δx_k;
3. Choose a vector z_{k1}, e.g., z_{k1} = T′(µ_k)w_k, and compute the RF value ν_k = ρ_F(w_k; T, z_{k1});
4. Define Π_k^{(1)} = I − T′(ν_k)w_k z_{k1}∗ / (z_{k1}∗T′(ν_k)w_k), Π_k^{(2)} = I − w_k u∗ / (u∗w_k), and solve the correction equation Π_k^{(1)} T(ν_k) Π_k^{(2)} Δw_k = −T(ν_k)w_k for Δw_k ⊥ u;
5. x_{k+1} = −(m − 1)w_k + m(w_k + Δw_k) = w_k + mΔw_k;
6. Choose a vector z_{k2}, e.g., z_{k2} = T′(ν_k)x_{k+1}, and compute the RF value µ_{k+1} = ρ_F(x_{k+1}; T, z_{k2}).

(4.20)

Following the standard derivation of the exact solution of JD correction equations, we can show that at Step 1 of (4.20)

\[
\Delta x_k = \frac{T^{-1}(\mu_k)T'(\mu_k)x_k}{u^*T^{-1}(\mu_k)T'(\mu_k)x_k} - x_k,
\quad\text{and thus}\quad
w_k = \frac{T^{-1}(\mu_k)T'(\mu_k)x_k}{u^*T^{-1}(\mu_k)T'(\mu_k)x_k}
\]

at Step 2; similarly,

\[
\Delta w_k = \frac{T^{-1}(\nu_k)T'(\nu_k)w_k}{u^*T^{-1}(\nu_k)T'(\nu_k)w_k} - w_k
\quad\text{at Step 4, and}\quad
x_{k+1} = m\,\frac{T^{-1}(\nu_k)T'(\nu_k)w_k}{u^*T^{-1}(\nu_k)T'(\nu_k)w_k} - (m-1)w_k
\]

at Step 5. Assuming x_0 is normalized such that u∗x_0 = 1, all subsequent half-step and full-step eigenvector approximations satisfy u∗w_k = u∗x_{k+1} = 1. For the purpose of numerical stability, however, we recommend that these conditions be explicitly enforced at Steps 2 and 5.
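The closed-form expressions above can be checked numerically: for generic matrices standing in for T(µ_k) and T′(µ_k), the vector Δx = T^{−1}T′x/(u∗T^{−1}T′x) − x is orthogonal to u and satisfies the projected correction equation of Step 1 in (4.20). A sketch follows, with random stand-in data and variable names of our own choosing.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
Tm  = rng.standard_normal((n, n))        # stand-in for T(mu_k)
Tpm = rng.standard_normal((n, n))        # stand-in for T'(mu_k)
u = np.zeros(n); u[0] = 1.0              # normalization vector
x = rng.standard_normal(n); x /= u @ x   # normalize so that u*x = 1
y = Tpm @ x                              # the suggested choice y_k = T'(mu_k) x_k

# closed-form JD correction: dx = T^{-1}T'x / (u* T^{-1}T'x) - x
p = np.linalg.solve(Tm, Tpm @ x)
dx = p / (u @ p) - x

# projectors and both sides of the correction equation of Step 1 in (4.20)
P1 = np.eye(n) - np.outer(Tpm @ x, y) / (y @ Tpm @ x)
P2 = np.eye(n) - np.outer(x, u) / (u @ x)
lhs = P1 @ Tm @ P2 @ dx
rhs = -(Tm - (y @ Tm @ x) / (y @ Tpm @ x) * Tpm) @ x

print(np.allclose(lhs, rhs), abs(u @ dx) < 1e-10)
```

The same check applies verbatim to Step 4 with T(ν_k), T′(ν_k), w_k and z_{k1} in place of T(µ_k), T′(µ_k), x_k and y_k.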

We see that the only difference between the accelerated inverse iteration (4.12) and the accelerated single-vector JD (4.20) is the way new eigenvalue approximations are computed. Consequently, algorithms (4.20) and (4.11) are not mathematically equivalent. Nevertheless, given the close similarity between the two methods, it is natural to expect the accelerated JD to exhibit quadratic convergence for defective eigenpairs. This convergence rate is illustrated by numerical experiments.

4.6. Numerical experiments for the accelerated algorithms. We present numerical results for the accelerated inverse iteration and the accelerated JD on the problems with a single Jordan chain, namely, time delay and jordan3. We run the accelerated algorithms with an initial eigenpair approximation, and we find that both converge superlinearly. To illustrate the order of convergence descriptively, we follow the approach used in Section 3.2: a sequence of initial approximations (µ_0^{(j)}, x_0^{(j)}) is generated, then one step of the accelerated algorithms is applied to obtain a sequence of new approximations (µ_1^{(j)}, x_1^{(j)}), and an estimate of the order of convergence ℓ is obtained by analyzing (log e_0^{(j)}, log e_1^{(j)}). The results presented in Table 4.4 show that both algorithms converge quadratically for the test problems.

                              accelerated inverse iter.        accelerated JD
problem       ∠(x_0^{(1)}, v)   # init. approx.   est. ℓ    # init. approx.   est. ℓ
time delay    10^{−3}                18            2.011          11           2.032
jordan3       2.5×10^{−3}             9            2.028           9           2.058

Table 4.4: Estimated order of convergence of the accelerated algorithms for a defective λ (geo_T(λ) = 1).

4.7. Defective eigenvalues with multiple Jordan chains. We studied in Sections 4.3 and 4.5 the convergence of several Newton-like methods for a defective eigenvalue λ with a single Jordan chain (J ≡ geo_T(λ) = 1). To make our discussion more complete, we consider in this section defective eigenvalues with multiple (J ≥ 2) Jordan chains. For eigenvalues of this type with certain simple spectral structures, one can develop a detailed convergence analysis as we did for those with a single Jordan chain. Due to space limitations, however, we do not pursue such a thorough investigation in this paper; instead, we provide some heuristic insight into the convergence rates the Newton-like methods are most likely to exhibit. Our speculation is based directly on the results developed for a defective λ with J = 1, and we will later illustrate our discussion by numerical experiments.

Let Ī» be a defective eigenvalue of the holomorphic T(Ā·) with J = geo_T(Ī») ā‰„ 2, and let {{Ļ•_{1,0}, …, Ļ•_{1,m_1āˆ’1}}, {Ļ•_{2,0}, …, Ļ•_{2,m_2āˆ’1}}, …, {Ļ•_{J,0}, …, Ļ•_{J,m_Jāˆ’1}}} be the corresponding Jordan chains. Assume without loss of generality that m_1 ā‰¤ m_2 ā‰¤ … ā‰¤ m_J. We consider several types of spectral structure as follows.

Case 1: m_1 ā‰„ 2. One may call this type of eigenvalue ā€œpurely defective,ā€ since it can be considered as a combination of several defective eigenpairs sharing the same eigenvalue, each of which has a single Jordan chain of length m_i ā‰„ 2. In this case, we have shown in Section 4.1 that the standard inverse iteration converges linearly. In addition, if each Jordan chain satisfies the assumption of Theorem 4.3, it is natural to expect the local convergence of RFI and single-vector JD to be linear in general.

To achieve quadratic convergence, we may use the two accelerated algorithms (4.12) and (4.20) with m = m_J used for the linear combination that constructs the full-step iterates. The motivation for choosing this value of m is as follows. As the Newton-like methods proceed, the eigenvector approximation x_k tends to converge towards the eigenspace spanned by the eigenvectors associated with the longest Jordan chains. In fact, assume without loss of generality that J = 2 and 2 ā‰¤ m_1 < m_2. Let (Āµ, x) be the current eigenpair approximation. The new eigenvector approximation is

(4.21)

    p = T^{-1}(\mu)\, T'(\mu)\, x
      = \sum_{k=1}^{2} \sum_{j=0}^{m_k-1} \sum_{i=0}^{m_k-j-1}
          \frac{\langle T'(\mu)x,\ \psi_{k,j}\rangle}{(\mu-\lambda)^{m_k-j-i}}\, \varphi_{k,i}
        + G(\mu)\, T'(\mu)\, x
      = \sum_{i=0}^{m_1-1} \frac{\langle T'(\mu)x,\ \psi_{1,0}\rangle}{(\mu-\lambda)^{m_1-i}}\, \varphi_{1,i}
        + \sum_{j=1}^{m_1-1} \sum_{i=0}^{m_1-j-1}
            \frac{\langle T'(\mu)x,\ \psi_{1,j}\rangle}{(\mu-\lambda)^{m_1-j-i}}\, \varphi_{1,i}
        + \sum_{i=0}^{m_2-1} \frac{\langle T'(\mu)x,\ \psi_{2,0}\rangle}{(\mu-\lambda)^{m_2-i}}\, \varphi_{2,i}
        + \sum_{j=1}^{m_2-1} \sum_{i=0}^{m_2-j-1}
            \frac{\langle T'(\mu)x,\ \psi_{2,j}\rangle}{(\mu-\lambda)^{m_2-j-i}}\, \varphi_{2,i}
        + G(\mu)\, T'(\mu)\, x.

Assume that 怈T''(Ī»)Ļ•_{1,0}, ψ_{2,0}怉 ≠ 0 and 怈T''(Ī»)Ļ•_{2,0}, ψ_{2,0}怉 ≠ 0, or that 怈T'(Ī»)Ļ•_{1,0}, ψ_{2,1}怉 ≠ 0 and 怈T'(Ī»)Ļ•_{2,0}, ψ_{2,1}怉 ≠ 0. Following the proof of Lemma 4.5, we can show that the dominant term in (4.21) can be written as O((Āµ āˆ’ Ī»)^{āˆ’(m_2āˆ’1)}) Ļ•_{2,0}. In other words, the new eigenvector approximation contains little component of Ļ•_{1,0}, and the Newton-like methods behave as if only the single Jordan chain {Ļ•_{2,0}, …, Ļ•_{2,m_2āˆ’1}} were involved. Thus we expect the accelerated algorithms with m = m_2 to converge quadratically.
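The dominance of the longest chain can already be observed in the linear case T(Ī») = A āˆ’ Ī»I, where the new direction is a multiple of (A āˆ’ ĀµI)^{āˆ’1}x. The sketch below is an illustrative construction (not one of the paper's test problems): A has two Jordan chains of lengths m_1 = 2 and m_2 = 3 at Ī» = 0, and the component of p along Ļ•_{2,0} dominates the one along Ļ•_{1,0} like (Āµ āˆ’ Ī»)^{āˆ’(m_2āˆ’m_1)}.

```python
import numpy as np

# Two Jordan chains at lambda = 0: lengths m1 = 2 (basis e1, e2) and
# m2 = 3 (basis e3, e4, e5); A is block diagonal with Jordan blocks.
A = np.zeros((5, 5))
A[0, 1] = 1.0                 # J_2(0)
A[2, 3] = A[3, 4] = 1.0       # J_3(0)

x = np.ones(5)                # a fixed eigenvector approximation

def chain_ratio(mu):
    # For T(lam) = A - lam*I the new direction is p ~ (A - mu*I)^{-1} x.
    p = np.linalg.solve(A - mu * np.eye(5), x)
    # |component along phi_{2,0} = e3| / |component along phi_{1,0} = e1|
    return abs(p[2]) / abs(p[0])

r1, r2 = chain_ratio(1e-2), chain_ratio(1e-3)
print(r2 / r1)   # ~ 10: the ratio grows like mu^{-(m2 - m1)} = mu^{-1}
```

Shrinking Āµ āˆ’ Ī» by a factor of 10 magnifies the ratio of the two components by roughly 10, so the iterates are pulled toward the eigenvector of the longer chain, consistent with the heuristic above.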

Case 2: m_1 = 1 and m_J ā‰„ 3. In this case, it is not difficult to follow the idea given in Section 4.1 to show that inverse iteration converges linearly. In addition, for any practical Newton-like method in which the new eigenvector approximation is p = T^{āˆ’1}(Āµ)T'(Āµ)x, one can show that p is dominated by a term of the form O((Āµ āˆ’ Ī»)^{āˆ’(m_Jāˆ’1)}) Ļ•_{J,0}, and that p contains another term of the form O((Āµ āˆ’ Ī»)^{āˆ’1}) Ļ•_{1,0}. Assume again without loss of generality that J = 2. Since m_2 ā‰„ 3, the dominant term in p is much larger in magnitude than the Ļ•_{1,0} term. As a result, the Jordan chain of length 1 has minimal impact on the algorithm, and the Newton-like methods should behave the same way as they do in Case 1.

Case 3: m_1 = 1 and m_J = 2. The linear convergence of inverse iteration can be established as in Section 4.1. For the other Newton-like methods, this case seems more complicated than the previous two, because it is not clear whether these algorithms converge to the eigenspace associated with the short (m_i = 1) or the long (m_i = 2) Jordan chains. In fact, the new eigenvector approximation p = T^{āˆ’1}(Āµ)T'(Āµ)x contains both O((Āµ āˆ’ Ī»)^{āˆ’1}) Ļ•_{1,0} and O((Āµ āˆ’ Ī»)^{āˆ’1}) Ļ•_{J,0}, and p is not necessarily dominated by either of the two components. We do not have a complete understanding of the convergence of Newton-like methods in this case, but we will discuss two numerical examples in the next section.

4.8. Numerical experiments for defective eigenvalues with multiple Jordan chains. In this section, we provide numerical evidence to illustrate our heuristic analysis of the convergence of Newton-like methods for defective eigenvalues with multiple Jordan chains. Four problems, namely, df art m1m2, df art m1m3, df art m2m3, and mirror (see Section 3.2), are used to test the convergence rates. As discussed above, we first run the algorithms with an initial eigenpair (Āµ_0, x_0) and observe how quickly ā€–T(Āµ_k)x_kā€– decreases, to determine whether the convergence is linear or at least superlinear. For the algorithms that appear to converge linearly and superlinearly, respectively, we then follow the standard approach (see Section 4.2 for defective eigenvalues) and the more descriptive approach (see Section 3.2 for semi-simple eigenvalues) to estimate the order of convergence.

                           inverse iteration        RFI/JD              TSRFI
problem       āˆ (x_0, v)   # iters   est. ā„“    # iters   est. ā„“    # iters   est. ā„“
df art m1m3   10^{-2}         14      0.978       14      0.957       10      0.965āˆ—
df art m2m3   10^{-2}         15      0.969       14      0.974       11      1.013āˆ—
df art m1m2   5Ɨ10^{-2}       17      1.000       12      0.984        8      0.973āˆ—
mirror        10^{-2}         15      1.002        3      ā‰ˆ2ā€           2      ā‰ˆ3ā€ 

Table 4.5: Estimated order of convergence of the non-accelerated algorithms for a defective Ī» (geo_T(Ī») ā‰„ 2).

The results for inverse iteration, RFI/JD, and TSRFI are presented in Table 4.5. We see that inverse iteration converges linearly for all problems, and RFI and TSRFI also converge linearly for the three artificially constructed problems. The problem mirror is the only one with defective eigenvalues in our tests for which RFI and TSRFI exhibit the same order of convergence as for simple and semi-simple eigenvalues. We do not have a complete understanding of this observation, but we find that the desired eigenvalue Ī» = 0 of this problem has a special spectral structure. Namely, it has two Jordan chains of length 2 and seven Jordan chains of length 1; for the two longest Jordan chains {Ļ•_{1,0}, Ļ•_{1,1}} and {Ļ•_{2,0}, Ļ•_{2,1}}, it holds that span{Ļ•_{1,0}, Ļ•_{2,0}} = span{Ļ•_{1,1}, Ļ•_{2,1}}; that is, the generalized eigenvectors and the eigenvectors of these Jordan chains span the same space. We speculate that this special spectral structure plays a critical role in the high order of convergence.
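A span condition of this kind can be verified numerically for computed Jordan chains by a rank test: stacking the generalized eigenvectors next to the eigenvectors must not increase the rank. The sketch below uses hypothetical vectors constructed to satisfy the condition (an illustration, not data from the mirror problem):

```python
import numpy as np

# Hypothetical eigenvectors phi_{1,0}, phi_{2,0} and generalized
# eigenvectors phi_{1,1}, phi_{2,1} in R^5; here the generalized
# eigenvectors are built to lie in span{phi_{1,0}, phi_{2,0}}.
rng = np.random.default_rng(0)
Phi0 = rng.standard_normal((5, 2))          # [phi_{1,0}, phi_{2,0}]
Phi1 = Phi0 @ rng.standard_normal((2, 2))   # [phi_{1,1}, phi_{2,1}]

# The chains span the same space iff appending Phi1 to Phi0 leaves the
# rank unchanged (and Phi1 itself has that full rank).
r0 = np.linalg.matrix_rank(Phi0)
r01 = np.linalg.matrix_rank(np.hstack([Phi0, Phi1]))
print(r0, r01)
```

Equal ranks here confirm the two column spaces coincide; for genuinely different spaces the stacked matrix would have rank larger than r0.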

                              accelerated inverse iter.      accelerated JD
problem       āˆ (x_0^{(1)}, v)  # init. approx.   est. ā„“    # init. approx.   est. ā„“
df art m1m3   2.5Ɨ10^{-2}             11           1.998          10           2.010
df art m2m3   5Ɨ10^{-3}                8           2.015           8           2.014

problem       āˆ (x_0, v)         # iters.         est. ā„“     # iters.         est. ā„“
df art m1m2   10^{-1}                  9           1.054           3           ā‰ˆ2
mirror        10^{-2}                  3           ā‰ˆ2              3           ā‰ˆ2

Table 4.6: Estimated order of convergence of the accelerated algorithms for a defective Ī» (geo_T(Ī») ā‰„ 2).

Table 4.6 summarizes the estimated order of convergence of the accelerated algorithms. For both algorithms, the parameter m is set to the length of the longest Jordan chains. We see that the accelerated algorithms generally exhibit quadratic convergence for defective eigenvalues with multiple Jordan chains, the only exception being that the accelerated inverse iteration converges linearly for df art m1m2. This problem belongs to Case 3 discussed in Section 4.7, for which our understanding of the convergence of Newton-like methods is not complete. Nevertheless, Tables 4.4 and 4.6 show that the accelerated JD converges quadratically for all test problems.

Our approach from Section 3.2 does not, however, give a descriptive estimate of the true order of convergence of the accelerated algorithms for df art m1m2 and mirror. Note that mirror also belongs to Case 3 discussed in Section 4.7. To estimate the true order of convergence, we ran the algorithms with (Āµ_0, x_0) and then used the standard criterion ā€–e_{k+1}ā€–/ā€–e_kā€–^ā„“ ā‰¤ C. For these two problems, we speculate that the eigenpair approximations generated by the accelerated algorithms have a certain special structure for which our approach from Section 3.2 fails to detect the actual rate at which ā€–T(Āµ_k)x_kā€– decreases. (A similar example: our approach from Section 3.2, when applied to inverse iteration for computing defective eigenvalues with a single Jordan chain, gives an estimated mth-order convergence, which does not reflect the actual linear convergence; this is due to the special structure of the new iterate generated by inverse iteration; see Lemma 4.5.)

āˆ— In Table 4.5, the residual norms in the first two iterations are not used for the estimate of the order of convergence for TSRFI, because they decrease much less rapidly than in subsequent iterations.
ā€  For the problem mirror, one can see by the standard criterion ā€–e_{k+1}ā€–/ā€–e_kā€–^ā„“ ā‰¤ C that RFI and TSRFI exhibit quadratic and cubic convergence, respectively; our approach from Section 3.2 does not generate a descriptive estimate of the order of convergence for this problem.
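The standard criterion can be sketched as follows: for a synthetic error sequence with e_{k+1} = 0.5 e_k^2 (an assumed model, not the paper's data), the ratios ā€–e_{k+1}ā€–/ā€–e_kā€–^ā„“ remain bounded by a constant exactly when ā„“ does not exceed the true order.

```python
import numpy as np

# Synthetic quadratically convergent error sequence: e_{k+1} = 0.5 * e_k^2.
e = [1e-1]
for _ in range(5):
    e.append(0.5 * e[-1] ** 2)
e = np.array(e)

# For the true order ell = 2 the ratio is the constant 0.5; for ell = 1
# it tends to 0 (superlinear), and for ell = 3 it blows up.
for ell in (1.0, 2.0, 3.0):
    ratios = e[1:] / e[:-1] ** ell
    print(f"ell = {ell}: ratios = {ratios}")
```

In practice one tests successive values of ā„“ against the observed residual norms and takes the largest ā„“ for which the ratios stay bounded.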

5. Conclusion. The local convergence of single-vector Newton-like methods fordegenerate eigenvalues has not been fully understood, since the standard convergencetheory of Newtonā€™s method based on the nonsingularity of the Jacobian is not ap-plicable in this setting. In this paper, we studied the convergence of several of themost widely-used single-vector Newton-like methods for the solution of a degenerateeigenvalue and a corresponding eigenvector of general nonlinear algebraic eigenvalueproblems of the form T (Ī»)v = 0. Our major conclusion is that at least quadraticconvergence can be achieved by these algorithms for both semi-simple and defectiveeigenvalues.

Specifically, we showed that Newton-like methods exhibit the same order of con-vergence for semi-simple eigenvalues as they do for simple eigenvalues. The conver-gence is generally quadratic; in addition, RFI/JD with appropriate use of the Rayleighfunctional can achieve cubic convergence for problems with local symmetry.

The convergence analysis for defective eigenvalues is more complicated. We showed the linear convergence of inverse iteration and RFI/JD, and we proposed two accelerated algorithms that converge quadratically for a defective Ī» with a single Jordan chain. We also gave a heuristic discussion of the convergence for a defective Ī» with multiple Jordan chains. Our analyses are illustrated by numerical experiments.
