LOCAL CONVERGENCE OF NEWTON-LIKE METHODS FOR DEGENERATE EIGENVALUES OF NONLINEAR EIGENPROBLEMS*

DANIEL B. SZYLD† AND FEI XUE†

Report 12-10-04, October 2012

This report is available on the World Wide Web at http://www.math.temple.edu/~szyld
Abstract. We study the local convergence rates of several single-vector Newton-like methods for the solution of a semi-simple or defective eigenvalue of nonlinear algebraic eigenvalue problems of the form T(λ)v = 0. This problem has not been fully understood, since the Jacobian associated with the single-vector Newton's method is singular at the desired eigenpair, and the standard convergence theory is not applicable. In addition, Newton's method generally converges only linearly towards singular roots. In this paper, we show that faster convergence can be achieved for degenerate eigenvalues. For semi-simple eigenvalues, we show that the convergence of Newton's method, Rayleigh functional iteration and the Jacobi-Davidson method is quadratic, and the latter two converge cubically for locally symmetric problems. For defective eigenvalues, all these methods converge only linearly in general. We then study two accelerated methods for defective eigenvalues, which exhibit quadratic convergence and require the solution of two linear systems per iteration. The results are illustrated by numerical experiments.
Key words. nonlinear eigenvalue problems, degenerate eigenvalues, Jordan chains, Newton's method, inverse iteration, Rayleigh functionals, Jacobi-Davidson method
AMS subject classifications. 65F15, 65F10, 65F50, 15A18, 15A22.
1. Introduction. In this paper, we study the local convergence rates of several single-vector Newton-like methods for the solution of a degenerate eigenvalue λ and a corresponding eigenvector v of the nonlinear algebraic eigenvalue problem of the form

(1.1) T(λ)v = 0,
where T(·): U → C^{n×n} is holomorphic on a domain U ⊆ C, λ ∈ U and v ∈ C^n \ {0}. λ is an eigenvalue of T(·) if and only if T(λ) is singular. Assume that det T(·) ≢ 0 in any neighborhood of λ; that is, the eigenvalues of T(·) are isolated. The algebraic multiplicity of an eigenvalue λ of (1.1) is defined as the multiplicity of λ as a root of the characteristic function det T(µ). The eigenvalue λ is called degenerate if its algebraic multiplicity is larger than one. A degenerate eigenvalue λ is called semi-simple if its geometric multiplicity, defined as the dimension of null(T(λ)), equals its algebraic multiplicity; it is called defective if its geometric multiplicity is smaller than its algebraic multiplicity.
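For a minimal illustration of the two cases, consider the following pair of 2 × 2 linear pencils (an example we add here for concreteness, not taken from the original):

```latex
% det T_i(\mu) = \mu^2 in both cases, so \lambda = 0 has algebraic multiplicity 2:
T_1(\lambda) = \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix},
\qquad
T_2(\lambda) = \begin{pmatrix} \lambda & 1 \\ 0 & \lambda \end{pmatrix}.
% \dim \operatorname{null} T_1(0) = 2 = algebraic multiplicity: \lambda = 0 is semi-simple;
% \dim \operatorname{null} T_2(0) = 1 < 2: \lambda = 0 is defective.
```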
Our study of numerical methods for degenerate eigenvalues of (1.1) is motivated by the recent rapid development of theory and algorithms for nonlinear eigenvalue problems. These problems arise in a variety of applications, such as vibrational analysis of structures, stability analysis of fluid flows, optimal control problems, quantum mechanics, and delay differential equations; see, e.g., [17, 18, 31] and references therein. Polynomial and rational eigenvalue problems, which can be transformed via linearization into standard linear eigenvalue problems of dimensions multiple times larger, have been studied intensively due to their wide applications [14, 15, 28]. For these problems of small or medium size, linearization is usually the most effective approach. On the other hand, for very large or irrational (truly nonlinear) problems where linearization is not effective or applicable, or if there are only a small number of eigenpairs of interest, other methods might be more appropriate. For example, one can use the contour
*This version dated 4 October 2012. This work was supported by the U.S. National Science Foundation under grant DMS-1115520.
†Department of Mathematics, Temple University, 1805 N. Broad Street, Philadelphia, PA 19122, USA (szyld,[email protected]).
integral method [1, 5, 9] to find initial approximations to the eigenvalues in a specified domain, and apply Newton-like methods to refine the eigenpair approximations.
Variants of Newton-like methods have been proposed and studied for the solution of a single eigenpair of the nonlinear eigenvalue problem (1.1); see the early works [21, 22, 24], a few recent developments [4, 11, 25, 27], and a study of inexact variants of these algorithms [29]. In all these references, the analyses are developed only for simple eigenpairs, and the convergence of Newton-like methods is generally quadratic (possibly cubic for problems with local symmetry). However, degenerate eigenpairs also arise in important scenarios such as certain gyroscopic systems, hyperbolic Hermitian polynomial problems, and some delay differential equations, for which the understanding of fast convergent Newton-like methods is not complete. The major difficulty is that the Jacobian (Fréchet derivative) of the augmented system of problem (1.1) at the desired degenerate eigenpair is singular, and the standard convergence theory of Newton's method is not directly applicable. To work around this difficulty, a block version of Newton's method was proposed to compute a simple invariant pair involving the whole eigenspace of the desired degenerate eigenvalues [13], where in each iteration the number of linear systems to be solved equals at least the sum of the algebraic multiplicities of all these eigenvalues. Therefore, this method could be prohibitive for degenerate eigenvalues of high algebraic multiplicities.
For many applications, fortunately, we are mostly interested in eigenvalues and a single eigenvector. In this case, single-vector Newton-like methods are probably the most suitable algorithms. The main purpose of this paper is to gain a better understanding of several fast convergent single-vector Newton-like methods for the solution of a degenerate eigenpair of problem (1.1). We assume that an initial approximation to the desired eigenpair is available, and we investigate the local convergence rates of inverse iteration, Rayleigh functional iteration (RFI), and the single-vector Jacobi-Davidson (JD) method. For semi-simple eigenvalues, we show that the local convergence of these algorithms is quadratic or cubic, respectively, if the local symmetry of T(·) is absent or present; in other words, the local convergence rates of the Newton-like methods for semi-simple eigenpairs are identical to those for simple ones. For defective eigenvalues, on the other hand, we show that the convergence of these methods is in general only linear. We then propose two accelerated variants of Newton's method, which exhibit locally quadratic convergence at the cost of solving two linear systems per iteration.
The utility and limitations of the algorithms for defective eigenpairs discussed in this paper need to be emphasized. In particular, it is well known that the computation of defective eigenpairs (eigenvalues and associated eigenvectors alone, excluding the generalized eigenvectors) is an ill-posed problem. A small perturbation of the matrix pencil of order ε generally leads to a scattering of a defective eigenvalue λ by a magnitude of ε^{1/m}, where m is the length of the longest Jordan chain; see, e.g., [19, 20] and references therein. Therefore, the computation of defective eigenpairs alone is relevant only for those with short Jordan chains (m small) and where moderate accuracy is sufficient for the applications. On the other hand, simple invariant pairs involving a defective eigenpair together with all associated generalized eigenvectors are much less sensitive to perturbations, and therefore the computation of these pairs could be performed to high accuracy; see, for example, [3, 30]. To this end, one may again refer to the block version of Newton's method [13], which is more expensive than the single-vector methods studied in this paper.
The rest of the paper is organized as follows. In Section 2 we review some definitions and preliminary results for degenerate eigenvalues. In Section 3, we show that the singularity of the Jacobian of the augmented system at the desired semi-simple eigenpair has no impact on the local quadratic convergence of inverse iteration; in addition, RFI and single-vector JD converge quadratically or cubically towards a semi-simple eigenpair, respectively, for locally nonsymmetric or symmetric problems. We then show in Section 4 that the convergence of inverse iteration, RFI and single-vector JD towards a defective eigenpair is in general only linear, and we propose two accelerated algorithms that converge quadratically. Numerical experiments are provided throughout the paper to illustrate the convergence results. Section 5 concludes the paper.
2. Preliminaries. In this section, we review some preliminary results on degenerate eigenvalues for the study of the local convergence of Newton-like methods. The theories for semi-simple and defective eigenvalues can be presented in a uniform way, yet we review them separately for the purpose of clarity. For the two types of degenerate eigenvalues, as we shall see, there exist important differences in the structure of the resolvent T(µ)⁻¹ near the eigenvalue, and the sets of right and left eigenvectors satisfy different normalization conditions. These properties are fundamental to the understanding of the quadratic (possibly cubic) and linear convergence of Newton-like methods, respectively, for a semi-simple and a defective eigenpair.
2.1. Semi-simple eigenvalues. Let λ be an eigenvalue of (1.1), alg_T(λ) be the algebraic multiplicity of λ, i.e., the multiplicity of λ as a root of the characteristic function det T(µ), and geo_T(λ) = dim(null T(λ)) be the geometric multiplicity. It can be shown that alg_T(λ) ≥ geo_T(λ). Then λ is semi-simple if alg_T(λ) = geo_T(λ) ≥ 2. Intuitively speaking, a semi-simple eigenpair can be considered as a set of multiple simple eigenpairs sharing an identical eigenvalue. This perspective is helpful for our understanding of the quadratic convergence of Newton-like methods in this case.
The major theorem on the structure of the resolvent T(µ)⁻¹ near a semi-simple eigenvalue and the normalization condition of the sets of left and right eigenvectors is described as follows; it can be obtained directly from Theorem A.10.2 of [12], where all Jordan chains are of length 1.
Theorem 2.1. Let T be a Fredholm holomorphic operator function in a neighborhood of a semi-simple eigenvalue λ ∈ U. Let alg_T(λ) = J, and {φ_k} (k = 1, ..., J) be the corresponding right eigenvectors. Then there exists a unique set of corresponding left eigenvectors {ψ_k} (k = 1, ..., J) such that in a neighborhood of λ

T(µ)⁻¹ = Σ_{k=1}^{J} ⟨·, ψ_k⟩ φ_k / (µ − λ) + Q(µ),

where Q is holomorphic in a neighborhood of λ. The two sets of eigenvectors satisfy the following normalization condition

(2.1) ⟨T′(λ)φ_k, ψ_j⟩ = δ_{jk}, (j, k = 1, ..., J).

In addition, the right eigenvectors satisfy

(2.2) T(λ)Q(λ)T′(λ)φ_k = 0, (k = 1, ..., J).
Note that λ is a simple eigenvalue if J = 1. Therefore, we assume throughout the paper that J ≥ 2 in the semi-simple case.
Let x ≠ 0 be a right eigenvector approximation which has a significant component in span{φ_1, ..., φ_J}. We first give a decomposition of x which we use later for the study of the local convergence of Rayleigh functional iteration. Let G ∈ C^{J×J} be a nonsingular matrix, Φ_J = [φ_1 ... φ_J]G and Ψ_J = [ψ_1 ... ψ_J]G^{−*}, such that

(2.3) Ψ_J^* T′(λ) Φ_J = I_J.

From (2.1) we know that both T′(λ)Φ_J and Ψ_J^* T′(λ) are of full rank. Let W_{n−J} ∈ C^{n×(n−J)} and U_{n−J} ∈ C^{n×(n−J)}, respectively, have orthonormal columns such that

(2.4) W_{n−J}^* T′(λ) Φ_J = 0_{(n−J)×J} and Ψ_J^* T′(λ) U_{n−J} = 0_{J×(n−J)}.

Assume that x ∉ range(U_{n−J}), so that ‖Ψ_J^* T′(λ) x‖ ≠ 0. A decomposition of x can be formed as follows:
(2.5) x = γ(cv + sg),

where

(2.6) γ = ‖[Ψ_J^*; W_{n−J}^*] T′(λ) x‖, c = ‖Ψ_J^* T′(λ) x‖ / γ, s = ‖W_{n−J}^* T′(λ) x‖ / γ,

v = Φ_J Ψ_J^* T′(λ) x / ‖Ψ_J^* T′(λ) x‖, and g = (1/s)(x/γ − cv).

Here, γ is a generalized norm of x, and c and s with c² + s² = 1 are the generalized cosine and sine, respectively, of the angle between x and v ∈ span{φ_1, ..., φ_J}. It can be shown without difficulty that

(2.7) Ψ_J^* T′(λ) g = 0 and ‖W_{n−J}^* T′(λ) g‖ = 1,

and thus g ∈ range(U_{n−J}) is an error vector normalized as above. The eigenvector approximation error can be measured by the generalized sine s or the generalized tangent t = s/c.
2.2. Defective eigenvalues. An eigenvalue λ of (1.1) is defective if alg_T(λ) > geo_T(λ). This definition is consistent with that for eigenvalues of a matrix.

For a defective eigenvalue λ, the structure of the resolvent T(µ)⁻¹ near λ and the normalization condition of the sets of left and right eigenvectors are more complicated than they are in the semi-simple case.
Definition 2.2. Let λ be a defective eigenvalue of the holomorphic operator T(·) and φ_{·,0} a corresponding right eigenvector. Then the nonzero vectors φ_{·,1}, ..., φ_{·,m−1} are called generalized eigenvectors if

(2.8) Σ_{j=0}^{n} (1/j!) T^{(j)}(λ) φ_{·,n−j} = 0, n = 1, ..., m − 1,

where T^{(j)}(λ) = d^j T(µ)/dµ^j |_{µ=λ}. The ordered collection {φ_{·,0}, φ_{·,1}, ..., φ_{·,m−1}} is called a right Jordan chain corresponding to λ. If (2.8) is satisfied for some m = m̄ and no more vectors can be introduced such that (2.8) is satisfied for m = m̄ + 1, then m̄ is called the length of the Jordan chain and a partial multiplicity of λ.

Similarly, let ψ_{·,0} be a left eigenvector of λ. One can define a left Jordan chain ψ_{·,0}, ψ_{·,1}, ..., ψ_{·,m−1} by replacing T^{(j)}(λ) φ_{·,n−j} in (2.8) with ψ_{·,n−j}^* T^{(j)}(λ).
The structure of the resolvent T(µ)⁻¹ near a defective λ is described as follows (cf. Theorem A.10.2 of [12]).
Theorem 2.3. Let T: U → C^{n×n} be a Fredholm holomorphic function in a neighborhood of a defective eigenvalue λ of T, and let J and m_1, ..., m_J be the geometric and partial multiplicities of λ. Suppose that {φ_{k,s}}, k = 1, ..., J, s = 0, ..., m_k − 1, is a canonical system of right Jordan chains of T corresponding to λ. Then
(i) There is a unique canonical system of left Jordan chains {ψ_{k,s}}, k = 1, ..., J, s = 0, ..., m_k − 1, such that in a neighborhood of λ

(2.9) T(µ)⁻¹ = Σ_{k=1}^{J} Σ_{h=0}^{m_k−1} [Σ_{s=0}^{h} ⟨·, ψ_{k,s}⟩ φ_{k,h−s}] / (µ − λ)^{m_k−h} + Q(µ),

where Q is holomorphic in a neighborhood of λ.
(ii) The left Jordan chains {ψ_{k,s}} in (i) satisfy the following normalization conditions

(2.10) Σ_{s=0}^{ℓ} Σ_{τ=s+1}^{m_k+s} (1/τ!) ⟨T^{(τ)}(λ) φ_{k,m_k+s−τ}, ψ_{j,ℓ−s}⟩ = δ_{jk} δ_{0ℓ},

Σ_{s=0}^{ℓ} Σ_{τ=s+1}^{m_k+s} (1/τ!) ⟨T^{(τ)}(λ) φ_{k,ℓ−s}, ψ_{j,m_k+s−τ}⟩ = δ_{jk} δ_{0ℓ},

where φ_{j,p} = 0 and ψ_{k,q} = 0 for p ≥ m_j, q ≥ m_k.
An important observation about the sets of right and left eigenvectors associated with a defective eigenvalue can be derived from Theorem 2.3. Note from Definition 2.2 of Jordan chains that T(λ)φ_{k,1} + T′(λ)φ_{k,0} = 0. Premultiplying by ψ_{j,0}^*, we have

(2.11) ψ_{j,0}^* T(λ) φ_{k,1} + ψ_{j,0}^* T′(λ) φ_{k,0} = ψ_{j,0}^* T′(λ) φ_{k,0} = 0

for any j, k = 1, ..., J. We shall see later that this property plays a critical role in the linear convergence of Newton-like methods towards defective eigenvalues.
2.3. Review of Newton-like algorithms. Consider the solution of a single eigenpair (λ, v) of problem (1.1) under a normalization condition:

(2.12) F(λ, v) = [T(λ)v; u^*v − 1] = 0,

where u ∈ C^n is a fixed normalization vector not orthogonal to v; see, e.g., [18, 24]. This nonlinear system of equations involving n + 1 variables is usually referred to as the augmented system of problem (1.1). Application of Newton's method to the augmented system leads to the following iteration scheme:
(2.13) [x_{k+1}; µ_{k+1}] = [x_k; µ_k] − [T(µ_k), T′(µ_k)x_k; u^*, 0]⁻¹ [T(µ_k)x_k; u^*x_k − 1],

where the computation of the new eigenvector iterate x_{k+1} and the new eigenvalue iterate µ_{k+1} is performed simultaneously by solving a linear system involving the Jacobian of order n + 1. In general, it is more convenient to solve a linear system involving a coefficient matrix of order n to compute x_{k+1}, and then to update µ_{k+1}
correspondingly. To this end, we exploit the structure of the block inverse of the Jacobian and obtain inverse iteration as follows:

(2.14) p_k = T⁻¹(µ_k) T′(µ_k) x_k,
x_{k+1} = p_k / (u^* p_k),
µ_{k+1} = µ_k − 1 / (u^* p_k),
which is mathematically equivalent to Newton's method. Inverse iteration is generally used as a substitute for the original Newton's method (2.13), and its convergence analysis is usually carried out using the theory of the latter.
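For concreteness, the scheme (2.14) can be sketched in a few lines of numpy code. The following is our own illustrative sketch; the toy pencil T(µ) = µI − A and all variable names are assumptions chosen for demonstration, not part of the paper's experiments:

```python
import numpy as np

def inverse_iteration(T, Tprime, mu0, x0, u, maxit=20, tol=1e-12):
    """Single-vector inverse iteration (2.14):
    p_k = T(mu_k)^{-1} T'(mu_k) x_k,
    x_{k+1} = p_k / (u^* p_k),  mu_{k+1} = mu_k - 1 / (u^* p_k)."""
    mu = mu0
    x = x0 / (u.conj() @ x0)                # enforce the normalization u^* x = 1
    for _ in range(maxit):
        p = np.linalg.solve(T(mu), Tprime(mu) @ x)
        s = u.conj() @ p                    # the scalar u^* p_k
        x = p / s
        mu = mu - 1.0 / s
        if np.linalg.norm(T(mu) @ x) < tol:  # eigenresidual ||T(mu_k) x_k||
            break
    return mu, x

# Toy linear pencil T(mu) = mu*I - A with a simple eigenvalue at 2.
A = np.diag([2.0, 5.0, 7.0])
T = lambda mu: mu * np.eye(3) - A
Tprime = lambda mu: np.eye(3)
u = np.array([1.0, 1.0, 1.0])
mu, x = inverse_iteration(T, Tprime, 2.3, np.array([1.0, 0.3, 0.2]), u)
```

Started from µ_0 = 2.3, the iterate µ_k converges quadratically to the eigenvalue 2, matching the mathematical equivalence with Newton's method described above.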
Here, we would like to point out that for a degenerate λ with geo_T(λ) = J > 1, the equation T(λ)v = 0 together with a single normalization vector may not uniquely determine the eigenvector v, since there could be infinitely many ways to form v ∈ span{φ_1, ..., φ_J} such that u^*v = 1 holds. In many cases, fortunately, one is mostly interested in the computation of the eigenvalue alone, and any single eigenvector v ∈ span{φ_1, ..., φ_J} would satisfy the needs of the applications. We assume this is the case throughout the paper, so that the study of single-vector Newton-like algorithms is most appropriate for this purpose. If the whole eigenspace of a degenerate eigenvalue is needed, one may have to use a block variant of Newton's method based on invariant pairs [13].

Inverse iteration can be modified to incorporate a potentially more accurate eigenvalue approximation, so that the convergence can be accelerated for locally symmetric problems, i.e., those where T(λ) is real (skew-)symmetric, complex (skew-)Hermitian, or complex (skew-)symmetric. The most important variant of this type is the Rayleigh functional iteration (RFI), a natural generalization of Rayleigh quotient iteration (RQI) to nonlinear eigenvalue problems. Given a right eigenvector approximation x_k ≈ v, one can choose some auxiliary vector y_k such that y_k^* T(ρ_k) x_k = 0 for some scalar ρ_k, and ρ_k = ρ_F(x_k; T, y_k) is called the value of the nonlinear Rayleigh functional ρ_F(·; T, y); see [26] and references therein. RFI is described as follows:
(2.15) choose a vector y_k and compute ρ_k = ρ_F(x_k; T, y_k) such that y_k^* T(ρ_k) x_k = 0;
p_k = T⁻¹(ρ_k) T′(ρ_k) x_k;
normalize p_k and assign it to x_{k+1}.
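A matching sketch of (2.15), again our own illustration: the inner scalar Newton iteration below is one simple way to evaluate ρ_F (an assumption, not prescribed by the paper), and the one-sided choice y_k = x_k is used.

```python
import numpy as np

def rayleigh_functional(T, Tprime, x, y, rho0, tol=1e-14, maxit=50):
    """Solve the scalar equation y^* T(rho) x = 0 for rho by Newton's method."""
    rho = rho0
    for _ in range(maxit):
        step = (y.conj() @ (T(rho) @ x)) / (y.conj() @ (Tprime(rho) @ x))
        rho = rho - step
        if abs(step) < tol:
            break
    return rho

def rfi(T, Tprime, mu0, x0, maxit=20, tol=1e-12):
    """Rayleigh functional iteration (2.15) with the one-sided choice y_k = x_k."""
    rho, x = mu0, x0 / np.linalg.norm(x0)
    for _ in range(maxit):
        rho = rayleigh_functional(T, Tprime, x, x, rho)
        if np.linalg.norm(T(rho) @ x) < tol:
            break
        p = np.linalg.solve(T(rho), Tprime(rho) @ x)
        x = p / np.linalg.norm(p)       # normalize p_k and assign to x_{k+1}
    return rho, x

# Symmetric toy pencil T(mu) = mu*I - A: RFI reduces to classical RQI here.
A = np.diag([1.0, 4.0, 9.0])
T = lambda mu: mu * np.eye(3) - A
Tprime = lambda mu: np.eye(3)
rho, x = rfi(T, Tprime, 4.2, np.array([0.3, 1.0, 0.25]))
```

For this symmetric toy problem the iterate ρ_k converges to the eigenvalue 4, and the fast (cubic, as discussed below for locally symmetric problems) contraction is visible after only a few steps.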
RFI bears a close connection to the single-vector Jacobi-Davidson (JD) method, another widely used class of Newton-like eigenvalue algorithms. Starting with an eigenvector approximation x_k and its corresponding Rayleigh functional value ρ_k, single-vector JD solves a correction equation for a correction vector Δx_k, such that the new eigenvector approximation is x_{k+1} = x_k + Δx_k. The correction equation of JD is a projection (restriction) of the linear system arising in RFI onto an (n − 1)-dimensional complementary space of span{x_k}. One can construct different variants of the JD correction equation. We focus our study on one variant, which assumes a general form and has been used to compute a simple eigenpair in [29]:
(2.16) choose a vector y_k and compute ρ_k = ρ_F(x_k; T, y_k) such that y_k^* T(ρ_k) x_k = 0 and y_k^* T′(ρ_k) x_k ≠ 0;
define Π_k^{(1)} = I − T′(ρ_k) x_k y_k^* / (y_k^* T′(ρ_k) x_k) and Π_k^{(2)} = I − x_k u^* / (u^* x_k), and solve the correction equation Π_k^{(1)} T(ρ_k) Π_k^{(2)} Δx_k = −T(ρ_k) x_k for Δx_k ⊥ u;
x_{k+1} = x_k + Δx_k, and normalize when necessary.
It can be shown that the exact solution of the correction equation is

Δx_k = T⁻¹(ρ_k) T′(ρ_k) x_k / (u^* T⁻¹(ρ_k) T′(ρ_k) x_k) − x_k,

so that x_k + Δx_k = T⁻¹(ρ_k) T′(ρ_k) x_k up to a scaling factor. Therefore, this variant of JD is mathematically equivalent to RFI, provided that y_k^* T′(ρ_k) x_k ≠ 0, so that Π_k^{(1)} is well-defined, and u^* T⁻¹(ρ_k) T′(ρ_k) x_k ≠ 0.

In practice, to enhance the robustness and the rate of convergence, JD usually works with a search subspace of variable dimension for eigenvector approximations, and in this case it is referred to as the full JD method. In this paper, we restrict our discussion to the local convergence of single-vector JD, which can be used as a worst-case estimate of the convergence rate of full JD.
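The stated equivalence is easy to verify numerically. The following sketch is our own construction with random data (the names Trho and Tprho, the choice u = x_k, and the dimension n = 6 are all assumptions for illustration): it checks that the closed-form Δx_k above is orthogonal to u and satisfies the correction equation in (2.16).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

Trho = rng.standard_normal((n, n))   # stands in for T(rho_k); generically nonsingular
Tprho = rng.standard_normal((n, n))  # stands in for T'(rho_k)

x = rng.standard_normal(n)
x = x / np.linalg.norm(x)
u = x.copy()                         # take u = x_k, so u^* x_k = 1

# Choose y with y^* T(rho_k) x = 0, as the Rayleigh functional condition requires.
w = Trho @ x
y = rng.standard_normal(n)
y = y - w * (y @ w) / (w @ w)

# Oblique projectors of the correction equation (2.16).
P1 = np.eye(n) - np.outer(Tprho @ x, y) / (y @ (Tprho @ x))
P2 = np.eye(n) - np.outer(x, u) / (u @ x)

# Closed-form solution dx = T^{-1} T' x / (u^* T^{-1} T' x) - x.
p = np.linalg.solve(Trho, Tprho @ x)
dx = p / (u @ p) - x

constraint = abs(u @ dx)                                      # u^* dx = 0
residual = np.linalg.norm(P1 @ Trho @ (P2 @ dx) + Trho @ x)   # correction eq.
```

Both `constraint` and `residual` vanish to machine precision, confirming that the RFI update and this JD variant produce the same next iterate up to scaling.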
2.4. Convergence of Newton's method near singular roots. We end this section by reviewing the local convergence of Newton's method for a nonlinear system of algebraic equations F(z) = 0 towards a root z* for which the Jacobian F′(z*) is singular. The typical convergence behavior can be roughly described as follows. The whole space for the variable z can be decomposed as a direct sum of the null space N_1 of F′(z*) and an appropriate complementary space M_1. When Newton's method converges towards z*, the component of the iterate error in N_1 converges linearly, whereas the error component in M_1 converges quadratically; see, e.g., [6, 7, 8]. To prepare for the convergence analysis of inverse iteration towards degenerate eigenvalues, we need to review some results from [8].
Let N_1 = null(F′(z*)) and M_2 = range(F′(z*)), so that codim(M_2) = dim(N_1). We choose complementary subspaces M_1, N_2 such that C^{n+1} = N_1 ⊕ M_1 = N_2 ⊕ M_2. Define P_{N_i} as the projection onto N_i along M_i, and P_{M_i} = I − P_{N_i}. Given these subspaces, the Jacobian F′(z) can be decomposed as

F′(z) = A_F(z) + B_F(z) + C_F(z) + D_F(z),

where

(2.17) A_F(z) = P_{M_2} F′(z) P_{M_1}, B_F(z) = P_{M_2} F′(z) P_{N_1},
C_F(z) = P_{N_2} F′(z) P_{M_1}, D_F(z) = P_{N_2} F′(z) P_{N_1}.

Let A_{F*} = P_{M_2} A_F(z*) P_{M_1}, which is a bijection when considered as an operator from M_1 onto M_2. The conditions for the nonsingularity of F′(z) are described as follows.
Proposition 2.4. Let z be a candidate Newton iterate and e_z = z − z* be the iterate error. For z ∈ S_δ ≡ {z : ‖e_z‖ ≤ δ} with a sufficiently small δ > 0, F′(z) is nonsingular if and only if the Schur complement

(2.18) S_F(z): N_1 → N_2 ≡ D_F(z) − C_F(z) A_F⁻¹(z) B_F(z)

is nonsingular. In this case, we have

(2.19) F′(z)⁻¹ = (A_F(z) + B_F(z) + C_F(z) + D_F(z))⁻¹
= P_{M_1} (A_F⁻¹(z) + A_F⁻¹(z) B_F(z) S_F⁻¹(z) C_F(z) A_F⁻¹(z)) P_{M_2}
− P_{M_1} A_F⁻¹(z) B_F(z) S_F⁻¹(z) P_{N_2}
− P_{N_1} S_F⁻¹(z) C_F(z) A_F⁻¹(z) P_{M_2} + P_{N_1} S_F⁻¹(z) P_{N_2}.
The convergence of Newton's method can be explored by studying each term in (2.19). To this end, note that the Taylor expansions of the terms in (2.17) at z*, i.e., of the restrictions of the Jacobian F′(z) onto M_i and N_i (i = 1, 2), are as follows:

(2.20) A_F(z) = A_{F*} + Σ_{j=a}^{n} A^{(j)}(z) + O_{n+1}(e_z),
B_F(z) = Σ_{j=b}^{n} B^{(j)}(z) + O_{n+1}(e_z),
C_F(z) = Σ_{j=c}^{n} C^{(j)}(z) + O_{n+1}(e_z),
D_F(z) = Σ_{j=d}^{n} D^{(j)}(z) + O_{n+1}(e_z),

where

(2.21) A^{(j)}(z) = (1/j!) P_{M_2} F^{(j+1)}(z*)(e_z^j, P_{M_1} ·),
B^{(j)}(z) = (1/j!) P_{M_2} F^{(j+1)}(z*)(e_z^j, P_{N_1} ·),
C^{(j)}(z) = (1/j!) P_{N_2} F^{(j+1)}(z*)(e_z^j, P_{M_1} ·), and
D^{(j)}(z) = (1/j!) P_{N_2} F^{(j+1)}(z*)(e_z^j, P_{N_1} ·)

are square matrices. Here, F^{(j+1)}(z*)(·, ..., ·) stands for the (j+1)st derivative of F at z* (a multilinear form (tensor) with j + 1 arguments), and e_z^j means that the first j arguments of F^{(j+1)} are all e_z. The values j = a, b, c, d > 0 in (2.20) are the smallest integers for which the vectors A^{(j)}(z)e_z, B^{(j)}(z)e_z, C^{(j)}(z)e_z, and D^{(j)}(z)e_z, respectively, are nonzero. In addition, let j = ā, b̄, c̄, d̄ be the smallest integers for which the matrices A^{(j)}(z), B^{(j)}(z), C^{(j)}(z) and D^{(j)}(z) are nonzero. Obviously, we have ā ≤ a, b̄ ≤ b, c̄ ≤ c and d̄ ≤ d.
With the above notation and definitions, a major convergence result for the standard Newton's method near singular points developed in [8] is summarized as follows.

Theorem 2.5 (Theorem 5.9 in [8]). Define the operator

D̄^{(j)} = (1/j!) P_{N_2} F^{(j+1)}(z*)((P_{N_1} e_z)^j, P_{N_1} ·),

and let j = d̄ be the smallest integer for which D̄^{(j)} ≢ 0. Let e_z = z − z* be the error of z, and assume that d̄ = d ≤ c̄ and that D̄^{(d)} is nonsingular for all P_{N_1} e_z ≠ 0. Define ℓ = min(a, b, c, d), and let z_0 be the initial Newton iterate. Then, for sufficiently small δ > 0 and θ > 0, F′(z_0) is nonsingular for all

z_0 ∈ W(δ, θ) ≡ {z : 0 < ‖e_z‖ ≤ δ, ‖P_{M_1} e_z‖ ≤ θ ‖P_{N_1} e_z‖},

and all subsequent Newton iterates remain in W(δ, θ); in addition, z_k → z* with

‖P_{M_1}(z_k − z*)‖ ≤ C ‖P_{M_1}(z_{k−1} − z*)‖^{ℓ+1}

for some constant C > 0 independent of k, and

lim_{k→∞} ‖P_{N_1}(z_k − z*)‖ / ‖P_{N_1}(z_{k−1} − z*)‖ = d / (d + 1).
Theorem 2.5 states that, under certain assumptions, as Newton's method converges towards a z* for which F′(z*) is singular, the error component lying in M_1 converges at least quadratically, whereas the error component lying in N_1 converges only linearly. This observation will be used directly in Sections 3 and 4 to show the quadratic and the linear convergence of inverse iteration towards a semi-simple and a defective eigenpair, respectively.
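The linear rate d/(d + 1) in Theorem 2.5 is already visible in the simplest singular case, the scalar equation f(z) = z² = 0, where the whole space is N_1 and d = 1. The following is an illustrative example we add here, not taken from the original:

```python
# Newton's method on f(z) = z^2: the root z* = 0 is singular since f'(0) = 0.
# Each step gives z <- z - z^2 / (2z) = z/2, so the error ratio is exactly
# d/(d+1) = 1/2 with d = 1, matching the limit in Theorem 2.5.
z = 1.0
errors = []
for _ in range(30):
    z = z - (z * z) / (2.0 * z)
    errors.append(abs(z))
ratios = [errors[k + 1] / errors[k] for k in range(len(errors) - 1)]
```

Every computed ratio equals 1/2, i.e., plain Newton is only linearly convergent here, which is the behavior the accelerated methods of Section 4 are designed to avoid.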
3. Convergence for semi-simple eigenvalues. In this section, we study the local convergence of Newton-like methods for a semi-simple eigenvalue of the nonlinear algebraic eigenvalue problem (1.1). As we shall see, the Jacobian of the augmented system (2.12) is singular at a semi-simple eigenpair, and the convergence theory of Newton's method near singular roots indicates that the algorithm generally converges only linearly. However, we show that the singularity of the Jacobian does not hamper the quadratic convergence of Newton's method towards semi-simple eigenvalues; in addition, Rayleigh functional iteration converges at least quadratically, and it converges cubically for locally symmetric problems. These convergence results are very similar to those for simple eigenvalues; see, e.g., [25, 29].
3.1. Inverse iteration. Assume that (λ, v) is a semi-simple eigenpair of the holomorphic matrix pencil T(·), and that ψ_1, ..., ψ_J and φ_1, ..., φ_J are the corresponding left and right eigenvectors. Since dim(null T(λ)) = J, there exists a singular value decomposition of T(λ) of the following form

Y^* T(λ) X = [0, 0; 0, Σ_{n−J}],

where X = [X_J X_{n−J}] and Y = [Y_J Y_{n−J}] are unitary matrices, X_J ∈ C^{n×J} and Y_J ∈ C^{n×J} have orthonormal columns that form a basis of span{φ_1, ..., φ_J} and span{ψ_1, ..., ψ_J}, respectively, and Σ_{n−J} is a diagonal matrix of the positive singular values of T(λ). Therefore there exist nonsingular matrices K_J, M_J ∈ C^{J×J} such that X_J = [φ_1 ... φ_J]K_J and Y_J = [ψ_1 ... ψ_J]M_J. Since v ∈ span{φ_1, ..., φ_J}, we can write v = [φ_1 ... φ_J]d_v for some nonzero d_v ∈ C^J. It follows from (2.1) that

Y_J^* T′(λ) v = M_J^* [ψ_1 ... ψ_J]^* T′(λ) [φ_1 ... φ_J] d_v = M_J^* d_v,
and consequently

(3.1) [Y^*, 0; 0, 1] [T(λ), T′(λ)v; u^*, 0] [X, 0; 0, 1] = [Y^* T(λ) X, Y^* T′(λ)v; u^* X, 0]
= [0, 0, M_J^* d_v; 0, Σ_{n−J}, Y_{n−J}^* T′(λ)v; u^* X_J, u^* X_{n−J}, 0].

Let h = [h_a^*, h_b^*, h_c]^* be a vector in the null space of the above square matrix, where h_a ∈ C^J, h_b ∈ C^{n−J} and h_c ∈ C. Then

[0, 0, M_J^* d_v; 0, Σ_{n−J}, Y_{n−J}^* T′(λ)v; u^* X_J, u^* X_{n−J}, 0] [h_a; h_b; h_c] = [M_J^* d_v h_c; Σ_{n−J} h_b + Y_{n−J}^* T′(λ)v h_c; u^* X_J h_a + u^* X_{n−J} h_b] = [0; 0; 0].
From the first J rows, since M_J is nonsingular and d_v ≠ 0, we have M_J^* d_v ≠ 0, and therefore h_c = 0. The next n − J rows thus reduce to Σ_{n−J} h_b = 0, and since Σ_{n−J} is diagonal with all nonzero entries, we have h_b = 0, and the last row is equivalent to u^* X_J h_a = 0. Now, since u with u^*v = 1 specifies the scaling of v = [φ_1 ... φ_J]d_v in the desired eigenspace range(X_J), we have u^* X_J ≠ 0. Without loss of generality, assume that u^* X_J = [η_1 ... η_J] where η_J ≠ 0. Then we can determine h_a and h:
h = [h_a; h_b; h_c] ∈ span{ [e_1 − (η_1/η_J) e_J; 0_{n−J+1}], [e_2 − (η_2/η_J) e_J; 0_{n−J+1}], ..., [e_{J−1} − (η_{J−1}/η_J) e_J; 0_{n−J+1}] },

where e_i denotes the i-th column of I_J.
We see from (3.1) that the null space of the Jacobian [T(λ), T′(λ)v; u^*, 0] can be obtained by premultiplying h by [X, 0; 0, 1]:

(3.2) N_1 ≡ null([T(λ), T′(λ)v; u^*, 0])
= [X, 0; 0, 1] span{ [e_1 − (η_1/η_J) e_J; 0_{n−J+1}], ..., [e_{J−1} − (η_{J−1}/η_J) e_J; 0_{n−J+1}] }
= span{ [Xe_1 − (η_1/η_J) Xe_J; 0], [Xe_2 − (η_2/η_J) Xe_J; 0], ..., [Xe_{J−1} − (η_{J−1}/η_J) Xe_J; 0] } ⊂ span{ [φ_1; 0], ..., [φ_J; 0] },

where dim(N_1) = J − 1 (J ≥ 2).
The special structure of the null space N_1 indicates that the convergence of Newton's method towards the semi-simple eigenvalue and the corresponding eigenspace (instead of the eigenvector v) is quadratic. To see this, define the space

M_1 = M_{1α} ⊕ M_{1β} = span{ [Xe_J; 0] } ⊕ range([X_{n−J}, 0; 0, 1]),

so that dim(M_1) = n − J + 2. Therefore

C^{n+1} = span{ [Xe_1 − (η_1/η_J) Xe_J; 0], ..., [Xe_{J−1} − (η_{J−1}/η_J) Xe_J; 0] } ⊕ ( span{ [Xe_J; 0] } ⊕ range([X_{n−J}, 0; 0, 1]) ) = N_1 ⊕ (M_{1α} ⊕ M_{1β}) = N_1 ⊕ M_1.
Let P_{N_1} be the projector onto N_1 along M_1, P_{M_1} = I − P_{N_1}, and let e_k = [x_k; µ_k] − [v; λ] be the error between the Newton iterate and the particular eigenpair (λ, v) in the k-th iteration. We know from Theorem 2.5 that ‖P_{N_1}(e_k)‖ and ‖P_{M_1}(e_k)‖ converge to zero linearly and quadratically, respectively. Therefore Newton's method converges towards the particular eigenpair (λ, v) linearly.
Fortunately, the linear convergence of Newton's method towards (λ, v) does not affect its quadratic convergence towards λ and the desired eigenspace. The key observation is that the error between the Newton iterate and the eigenvalue together with its eigenspace lies in M_1, and ‖P_{M_1}(e_k)‖ converges quadratically. In fact, the eigenspace approximation error in range([X_{n−J}; 0]) together with the eigenvalue approximation error in span{[0; 1]} can be represented by P_{M_{1β}}(e_k), the projection of e_k onto M_{1β} = range([X_{n−J}, 0; 0, 1]) along N_1 ⊕ M_{1α} = span{[φ_1; 0], ..., [φ_J; 0]}. Therefore, P_{M_{1β}}(e_k), instead of P_{N_1}(e_k), represents the error between the Newton iterate and the space E = span{[φ_1; λ], ..., [φ_J; λ]} spanned by all valid candidate eigenpairs. In addition, (3.2) shows that any vector lying in N_1 can be represented as the difference between two candidate eigenpairs, and thus P_{N_1}(e_k) bears no connection to the error between the Newton iterate and E. Since M_{1β} ⊂ M_1, the quadratic convergence of Newton's method towards the semi-simple λ and its eigenspace (represented by E) is established from the quadratic convergence of ‖P_{M_1}(e_k)‖. This result is summarized as follows.

Theorem 3.1. Let λ be a semi-simple eigenvalue of the holomorphic operator T(·): U → C^{n×n}, and φ_1, ..., φ_J be the corresponding right eigenvectors. Let (µ_0, x_0) be a right eigenpair approximation such that |µ_0 − λ| and ∠(x_0, span{φ_1, ..., φ_J}) are sufficiently small. Assume that the conditions in Theorem 2.5 hold. Then inverse iteration (2.14) converges quadratically towards λ together with its eigenspace.
3.2. Numerical experiments for inverse iteration. In this section, we provide numerical results illustrating the locally quadratic convergence of inverse iteration for semi-simple eigenvalues. The experiments are performed on five benchmark problems: one from the MatrixMarket [16], two from the NLEVP toolbox [2], and two constructed artificially. A description of these problems is given in Table 3.1.
problem        | source     | type | size | eigenvalue                | multiplicity | local symm.
tols1090       | MM         | lep  | 1090 | −12.098                   | 200          | real unsymm.
plasma drift   | NLEVP      | pep  | 128  | 10.004129 − 0.19324032i   | 2            | cplx unsymm.
schrodinger    | NLEVP      | qep  | 1998 | −0.33111181 + 0.24495497i | 2            | cplx symm.
ss art symm    | artificial | nep  | 256  | 0                         | 5            | real symm.
ss art unsymm  | artificial | nep  | 256  | 0                         | 5            | real unsymm.

Table 3.1: Description of the test problems for semi-simple eigenvalues.
For example, the problem tols1090, taken from the MatrixMarket (MM), is a linear eigenvalue problem (lep) defined by a matrix A ∈ R^{1090×1090}; it has a semi-simple eigenvalue λ = −12.098 of multiplicity 200, and the matrix T(λ) = λI − A is real unsymmetric (local symmetry). Similarly, plasma drift and schrodinger, taken from NLEVP, are a polynomial eigenvalue problem (pep) of degree 3 and a quadratic eigenvalue problem (qep), respectively; they both have a semi-simple eigenvalue λ of multiplicity 2, and T(λ) is complex unsymmetric and complex symmetric, respectively. The two artificial problems are both truly nonlinear eigenvalue problems (nep), where T(µ) cannot be represented as a polynomial of µ; specifically,
(3.3)  T_symm(µ) = G_A^* D(µ) G_A    for "ss_art_symm", and
       T_unsymm(µ) = G_A^* D(µ) G_B  for "ss_art_unsymm",

where D(µ) = diag([e^µ − 1; 2 sin µ; −5 ln(1 + µ); 8µ; tan⁻¹(µ); c_0 + c_1µ + c_2µ²]), c_i ∈ R^{251} (i = 0, 1, 2), and G_A, G_B ∈ R^{256×256}, generated by MATLAB's function randn, are random matrices whose entries follow the standard normal distribution. Both artificial problems have a semi-simple eigenvalue λ = 0 of multiplicity 5.
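As a plausibility check on (3.3), the construction is easy to reproduce numerically. The sketch below is our illustration, not the paper's code: it uses Python/NumPy instead of MATLAB and a reduced size n = 56 (so c_i ∈ R^{51}) for speed, and verifies that λ = 0 has a five-dimensional eigenspace by counting the negligible singular values of T_symm(0).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 56                      # reduced from the paper's n = 256, for speed
GA = rng.standard_normal((n, n))
c0, c1, c2 = (rng.standard_normal(n - 5) for _ in range(3))

def D(mu):
    # diagonal of D(mu) from (3.3); each of the first five entries
    # vanishes at mu = 0, giving the semi-simple eigenvalue lambda = 0
    head = [np.exp(mu) - 1, 2*np.sin(mu), -5*np.log1p(mu),
            8*mu, np.arctan(mu)]
    return np.diag(np.concatenate([head, c0 + c1*mu + c2*mu**2]))

def T_symm(mu):
    return GA.T @ D(mu) @ GA   # real case: conjugate transpose = transpose

# geometric multiplicity of lambda = 0: number of negligible singular values
svals = np.linalg.svd(T_symm(0.0), compute_uv=False)
null_dim = int(np.sum(svals < 1e-8 * svals[0]))
print(null_dim)   # 5 with probability 1
```

Replacing the right factor G_A by an independent G_B gives the unsymmetric variant T_unsymm.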
To test the local convergence of inverse iteration for these problems, we first construct an initial eigenvector approximation using the MATLAB commands

(3.4)  x_0 = v*cos(err_phi) + g*sin(err_phi);
where v is a normalized eigenvector corresponding to the desired eigenvalue of the given problem, g is a normalized random vector that has been orthogonalized against v, and err_phi is the error angle ∠(x_0, v). An appropriate value of err_phi is obtained by trial and error, such that the quadratic convergence of inverse iteration can be clearly observed. The initial eigenvalue approximation µ_0 is defined as the value of the one-sided Rayleigh functional ρ_F(x_0; T, y) with the auxiliary vector y = T′(λ)v, and therefore |µ_0 − λ| = O(∠(x_0, v)).
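In NumPy the construction (3.4) reads as follows; here v is a stand-in unit vector (in the experiments it is a computed eigenvector), so only the geometry of the construction is illustrated:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
v = rng.standard_normal(n); v /= np.linalg.norm(v)  # stand-in for a unit eigenvector

# unit random vector orthogonalized against v
g = rng.standard_normal(n)
g -= (v @ g) * v
g /= np.linalg.norm(g)

err_phi = 1e-2                                  # prescribed error angle
x0 = v * np.cos(err_phi) + g * np.sin(err_phi)  # this is (3.4)

# the achieved angle between x0 and v equals err_phi by construction
angle = np.arccos(abs(v @ x0) / np.linalg.norm(x0))
print(abs(angle - err_phi) < 1e-12)   # True
```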
To estimate the convergence rate of inverse iteration, we monitor how quickly the eigenresidual norm e_k := ‖T(µ_k)x_k‖ decreases as the algorithm proceeds. The residual norm is a natural measure of the approximation error of the eigenpair (µ_k, x_k), and it can be evaluated at little cost. For all the test problems, it is easy to see that inverse iteration converges at least superlinearly. Nevertheless, our goal in this section is to show that the order of convergence ℓ of this algorithm is exactly 2, and the standard criterion e_{k+1}/e_k^ℓ ≤ C may not be very descriptive for this purpose. For example, if e_0 = 10⁻², e_1 = 10⁻⁵ and e_2 = 10⁻¹², it is difficult to tell whether the convergence is quadratic, superquadratic or cubic, due to the very small number of iterations for which the standard criterion holds.
We now discuss an alternative approach to estimate ℓ in a more descriptive manner. First, we generate a sequence of initial approximations (µ_0^(j), x_0^(j)) such that ∠(v, x_0^(j+1)) = (1/2)∠(v, x_0^(j)), and thus |µ_0^(j+1) − λ| ≈ (1/2)|µ_0^(j) − λ|, since the value of the Rayleigh functional µ_0^(j) = ρ_F(x_0^(j); T, y) satisfies |µ_0^(j) − λ| = O(∠(x_0^(j), v)). It can be shown that e_0^(j+1) = ‖T(µ_0^(j+1))x_0^(j+1)‖ ≈ (1/2)e_0^(j) = (1/2)‖T(µ_0^(j))x_0^(j)‖. We then apply one step of inverse iteration to generate a sequence of new iterates (µ_1^(j), x_1^(j)), for which e_1^(j+1) ≈ (1/2)^ℓ e_1^(j) holds, and an estimate of ℓ can be obtained. This approach is more descriptive for our purpose, because a relatively large number of initial approximations can be generated, for which the algorithm exhibits the ℓth order of convergence for at least one iteration.
The estimated order of convergence of inverse iteration is summarized in Table 3.2. To explain the results, take for instance the problem tols1090. We generated 20 initial approximations (µ_0^(j), x_0^(j)) (j = 1, 2, . . . , 20), where ∠(x_0^(1), v) = 10⁻² and ∠(x_0^(j+1), v) = (1/2)∠(x_0^(j), v). The estimates of the order of convergence ℓ are obtained by applying the least-squares line formula to (log e_0^(j), log e_1^(j)). As one can see, the estimated values of ℓ are very close to 2, indicating that inverse iteration converges quadratically, independent of the local symmetry; see Theorem 3.1.
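The slope-fitting step is a one-liner. In the sketch below, synthetic residuals with e_1^(j) ∝ (e_0^(j))² stand in for the measured ones, so the fitted slope should recover ℓ = 2 exactly:

```python
import numpy as np

# synthetic residuals: initial errors halve from one approximation to the
# next, and one iteration step is exactly quadratic, so the fitted slope
# should come out as ell = 2
e0 = 1e-2 * 0.5 ** np.arange(20)   # e0^(j+1) = e0^(j) / 2
e1 = 0.3 * e0**2                   # e1^(j) = C * (e0^(j))^ell with ell = 2

# least-squares line through (log e0^(j), log e1^(j)); the slope estimates ell
ell, _ = np.polyfit(np.log(e0), np.log(e1), 1)
print(round(ell, 6))   # 2.0
```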
problem        ∠(x_0^(1), v)  # init. approx.  estimated ℓ
tols1090       10⁻²           20               2.022
plasma_drift   2×10⁻¹         20               2.002
schrodinger    10⁻⁴           13               1.979
ss_art_symm    2×10⁻²         16               2.008
ss_art_unsymm  2×10⁻²         16               1.983

Table 3.2: Estimated order of convergence for inverse iteration
3.3. Rayleigh functional iteration and single-vector JD. We see that semi-simple eigenvalues form a class of degenerate eigenvalues for which Newton's method exhibits the same order of convergence as for simple eigenvalues. In fact, we can go a step further and show that Rayleigh functional iteration (RFI), a variant of Newton's method which approximates eigenvalues in a different manner, converges cubically towards a semi-simple eigenvalue λ if the problem is locally symmetric, i.e., if T(λ) is (skew) real/complex symmetric or (skew) Hermitian. In this case, note that the right and the left eigenvectors satisfy ψ_j = φ_j or ψ_j = conj(φ_j) (j = 1, . . . , J), so that a left eigenvector approximation can be obtained in each iteration without additional computational cost.
We now present the local convergence analysis of RFI. First, note that for any v = Σ_{j=1}^J α_j φ_j such that T(λ)v = 0, we have

(3.5)  Σ_{k=1}^J (ψ_k^* T′(λ)v) φ_k = Σ_{k=1}^J Σ_{j=1}^J α_j (ψ_k^* T′(λ)φ_j) φ_k = Σ_{k=1}^J α_k φ_k = v.
Note from (2.2) that for such v, we have Q(λ)T′(λ)v ∈ span{φ_1, . . . , φ_J}, where both Q(µ) and T′(µ) are holomorphic in a neighborhood of λ; that is, there exists some constant α such that

(3.6)  Q(λ)T′(λ)v = αv  with  v ∈ span{φ_1, . . . , φ_J}.
In the following derivation, let ρ = ρ_F(x; T, y) be the Rayleigh functional value corresponding to x and an auxiliary vector y. To prepare for the analysis, we also need decompositions of the following vectors:
(3.7)  Q(λ)T′(λ)g = γ_1(c_1 v_1 + s_1 g_1),
(3.8)  Q′(λ)T′(ρ)x = γ_2(c_2 v_2 + s_2 g_2), and
(3.9)  Q(ρ)T″(λ)x = γ_3(c_3 v_3 + s_3 g_3),

which can be obtained by substituting the left-hand sides of (3.7)–(3.9), respectively, for x in (2.5). Therefore v_j ∈ span{φ_1, . . . , φ_J} and

(3.10)  T′(λ)g_j ⊥ span{φ_1, . . . , φ_J}

for j = 1, 2, 3; see (2.5)–(2.7). Without loss of generality, assume that the right eigenvector approximation x has a unit generalized norm, i.e., γ = 1; see (2.5) and
(2.6). Assuming that ρ lies in a neighborhood of λ in which F is analytic, the generalized norms γ_j of these vectors are bounded above by O(1). It follows that the new unnormalized eigenvector approximation p computed by RFI is
p = T⁻¹(ρ)T′(ρ)x = (1/(ρ−λ)) Σ_{k=1}^J (ψ_k^* T′(ρ)x) φ_k + Q(ρ)T′(ρ)x

  = (1/(ρ−λ)) Σ_{k=1}^J ψ_k^* ( T′(λ)(cv + sg) + (ρ−λ)T″(λ)x + ((ρ−λ)²/2)T‴(λ)x ) φ_k
      + Q(λ)T′(λ)(cv + sg) + (ρ−λ)( Q′(λ)T′(ρ) + Q(ρ)T″(λ) )x + O(|ρ−λ|²)

  = cv/(ρ−λ) + Σ_{k=1}^J ( ψ_k^* T″(λ)x + ((ρ−λ)/2) ψ_k^* T‴(λ)x ) φ_k + cαv + sQ(λ)T′(λ)g
      + (ρ−λ)( Q′(λ)T′(ρ) + Q(ρ)T″(λ) )x + O(|ρ−λ|²)          (see (2.7), (3.5) and (3.6))

  = cv/(ρ−λ) + Σ_{k=1}^J η_k φ_k + cαv + sγ_1c_1v_1 + (ρ−λ)(γ_2c_2v_2 + γ_3c_3v_3)
      + sγ_1s_1g_1 + (ρ−λ)(γ_2s_2g_2 + γ_3s_3g_3) + O(|ρ−λ|²)   (see (3.7)–(3.9))

  = v_p + g_p + O(|ρ−λ|²),

where

v_p = cv/(ρ−λ) + Σ_{k=1}^J η_k φ_k + cαv + sγ_1c_1v_1 + (ρ−λ)(γ_2c_2v_2 + γ_3c_3v_3),
η_k = ψ_k^* T″(λ)x + ((ρ−λ)/2) ψ_k^* T‴(λ)x, and
g_p = sγ_1s_1g_1 + (ρ−λ)(γ_2s_2g_2 + γ_3s_3g_3),

so that v_p ∈ span{φ_1, . . . , φ_J} and T′(λ)g_p ⊥ span{φ_1, . . . , φ_J}.

The convergence rates of RFI can be obtained by studying the error angle ∠(p, v_p).
Recall that ρ = ρ_F(x; T, y) is the Rayleigh functional value such that y^*T(ρ)x = 0. Suppose that x is a good right eigenvector approximation for which the generalized sine s is sufficiently small. Then |ρ − λ| = O(s²) if T(·) is locally symmetric, and |ρ − λ| = O(s) otherwise; see Theorem 5 in [26]. In both cases, it is easy to see that (ρ−λ)⁻¹cv is the unique dominant term in v_p, and ‖v_p‖ = O(|ρ−λ|⁻¹); in addition, ‖g_p‖ = O(s) + O(|ρ−λ|) = O(s). Following the discussion of the error angle ∠(x, v) (see (2.5) and (2.6)), we see that the generalized tangent of the error angle ∠(p, v_p) is bounded above by a quantity proportional to the ratio between the magnitude of the error component g_p and that of the eigenvector component v_p. That is,
gtan(p, v_p) ≤ O( (‖g_p‖ + O(|ρ−λ|²)) / (‖v_p‖ − O(|ρ−λ|²)) ) = O(|ρ−λ| s)
            = { O(s³)  if T(·) is locally symmetric,
                O(s²)  otherwise.
In other words, the convergence rates of RFI towards semi-simple and simpleeigenvalues are identical. This conclusion also holds for the single-vector JD method(2.16) because it is mathematically equivalent to RFI. The result is summarized inthe following theorem.
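Since ρ_F(x; T, y) is defined by the scalar equation y^*T(ρ)x = 0, it can be computed by a scalar Newton iteration on g(ρ) = y^*T(ρ)x, using g′(ρ) = y^*T′(ρ)x. The sketch below is an illustration of this definition, not the paper's implementation; it checks the result against the classical Rayleigh quotient, to which ρ_F reduces for the linear symmetric problem T(µ) = µI − A with y = x.

```python
import numpy as np

def rayleigh_functional(T, Tprime, x, y, rho0, tol=1e-12, maxit=50):
    """Scalar Newton iteration for the root of g(rho) = y^* T(rho) x."""
    rho = rho0
    for _ in range(maxit):
        g = y.conj() @ T(rho) @ x
        dg = y.conj() @ Tprime(rho) @ x     # g'(rho) = y^* T'(rho) x
        rho_new = rho - g / dg
        if abs(rho_new - rho) < tol:
            return rho_new
        rho = rho_new
    return rho

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 8)); A = (A + A.T) / 2   # symmetric test matrix
x = rng.standard_normal(8)

# For T(mu) = mu*I - A and y = x, rho_F is the classical Rayleigh quotient.
rho = rayleigh_functional(lambda m: m*np.eye(8) - A,
                          lambda m: np.eye(8), x, x, rho0=0.0)
print(np.isclose(rho, (x @ A @ x) / (x @ x)))   # True
```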
Theorem 3.2. Let λ be a semi-simple eigenvalue of the holomorphic operator T(·) : U → C^{n×n}, and x_0 = γ(c_0v + s_0g) be a corresponding initial right eigenvector approximation with a sufficiently small error angle ∠(x_0, v). Then RFI (2.15) and single-vector JD (2.16) converge towards λ and its eigenspace at least quadratically or at least cubically for locally nonsymmetric or locally symmetric problems, respectively.
3.4. Numerical experiments for RFI and JD. In this section, we test the convergence of RFI and single-vector JD on the five problems with semi-simple eigenvalues introduced in Section 3.2. We follow the same approach used for inverse iteration to estimate the order of convergence for RFI, the only difference being the generation of the initial eigenvalue approximations µ_0^(j). To achieve the maximum order of convergence for RFI, as discussed in Section 3.3, we use the two-sided Rayleigh functional whenever the local symmetry exists; that is, for schrodinger and ss_art_symm (see Table 3.1), we choose y = conj(x_0^(j)) and y = x_0^(j), respectively, and let µ_0^(j) = ρ_F(x_0^(j); T, y), such that |µ_0^(j) − λ| = O(∠(x_0^(j), v)²). As a result, RFI converges at least cubically for these problems.

Table 3.3 presents the estimated order of convergence for RFI. The results also hold for single-vector JD because the two algorithms are mathematically equivalent. We see that RFI converges quadratically for the three problems without local symmetry, and cubically for the two problems with local symmetry (see the schrodinger and ss_art_symm rows of Table 3.3). All these results are consistent with Theorem 3.2.
problem        ∠(x_0^(1), v)  # init. approx.  estimated ℓ
tols1090       10⁻⁶           16               2.013
plasma_drift   5×10⁻⁴         11               2.012
schrodinger    5×10⁻⁴         10               2.972
ss_art_symm    2×10⁻²         12               3.006
ss_art_unsymm  2×10⁻²         17               1.997

Table 3.3: Estimated order of convergence for RFI/JD
4. Convergence for defective eigenvalues. We see in Section 3 that Newton-like methods converge towards semi-simple eigenvalues at least quadratically. In this section, we study the computation of defective eigenvalues, which is naturally more challenging. We show that the local convergence of the standard Newton-like methods towards a defective eigenpair is generally only linear. We then propose an accelerated inverse iteration, and show that it converges quadratically under certain assumptions; each iteration requires the solution of two linear systems, so this convergence rate is also referred to as superlinear convergence of order √2. This algorithm is inspired by an accelerated Newton's method for solving a general nonlinear system for a singular root [8]. In addition, we present an accelerated single-vector Jacobi-Davidson method, which also converges quadratically in this case.
4.1. Inverse iteration. The proof of the linear convergence of inverse iteration for defective eigenvalues is similar to that for semi-simple eigenvalues. Let λ be a defective eigenvalue of T(·) with geometric multiplicity geo_T(λ) = J. This means that there are exactly J right and J left Jordan chains associated with λ, namely

(4.1)  {φ_{1,0}, . . . , φ_{1,m_1−1}}, {φ_{2,0}, . . . , φ_{2,m_2−1}}, . . . , {φ_{J,0}, . . . , φ_{J,m_J−1}}, and
       {ψ_{1,0}, . . . , ψ_{1,m_1−1}}, {ψ_{2,0}, . . . , ψ_{2,m_2−1}}, . . . , {ψ_{J,0}, . . . , ψ_{J,m_J−1}},
where m_i ≥ 2 (1 ≤ i ≤ J), such that they satisfy the properties described in Theorem 2.3. Consider a singular value decomposition of T(λ) of the following form:

Y^* T(λ) X = [0_J, 0; 0, Σ_{n−J}],

where X = [X_J, X_{n−J}] and Y = [Y_J, Y_{n−J}] are unitary matrices, X_J ∈ C^{n×J} and Y_J ∈ C^{n×J} have orthonormal columns forming a basis of span{φ_{1,0}, . . . , φ_{J,0}} and span{ψ_{1,0}, . . . , ψ_{J,0}}, respectively, and Σ_{n−J} is a diagonal matrix of the nonzero singular values of T(λ). Therefore, there exist nonsingular matrices K_J, M_J ∈ C^{J×J} such that X_J = [φ_{1,0} . . . φ_{J,0}]K_J and Y_J = [ψ_{1,0} . . . ψ_{J,0}]M_J. Let v = [φ_{1,0} . . . φ_{J,0}]d_v be a candidate right eigenvector, where d_v ≠ 0. It follows from (2.11) that

Y_J^* T′(λ)v = M_J^* [ψ_{1,0} . . . ψ_{J,0}]^* T′(λ) [φ_{1,0} . . . φ_{J,0}] d_v = 0.
Thus we have

[Y^*, 0; 0, 1] [T(λ), T′(λ)v; u^*, 0] [X, 0; 0, 1] = [Y^*T(λ)X, Y^*T′(λ)v; u^*X, 0]
  = [0_J, 0, 0; 0, Σ_{n−J}, Y_{n−J}^*T′(λ)v; u^*X_J, u^*X_{n−J}, 0].
Let h = [h_a; h_b; h_c] be a vector in the null space of the above square matrix, where h_a ∈ C^J, h_b ∈ C^{n−J} and h_c ∈ C. Then

[0_J, 0, 0; 0, Σ_{n−J}, Y_{n−J}^*T′(λ)v; u^*X_J, u^*X_{n−J}, 0] [h_a; h_b; h_c]
  = [0; Σ_{n−J}h_b + Y_{n−J}^*T′(λ)v h_c; u^*X_J h_a + u^*X_{n−J} h_b] = [0; 0; 0].

It follows from the second block row that h_b = −Σ_{n−J}⁻¹ Y_{n−J}^* T′(λ)v h_c, and thus the last row is equivalent to u^*X_J h_a − u^*X_{n−J} Σ_{n−J}⁻¹ Y_{n−J}^* T′(λ)v h_c = 0. Since u specifies the scaling of v = [φ_{1,0} . . . φ_{J,0}]d_v ∈ range(X_J) such that u^*v = 1, we have u^*X_J ≠ 0. Without loss of generality, assume that u^*X_J = [γ_1 . . . γ_J], where γ_J ≠ 0. Then, depending on whether h_c = 0, we can determine h_a, h_b and h as follows:
h = [h_a; h_b; h_c] ∈ span{ [e_1 − (γ_1/γ_J)e_J; 0_{n−J}; 0], . . . , [e_{J−1} − (γ_{J−1}/γ_J)e_J; 0_{n−J}; 0], [ηe_J; w; 1] },

where η = (1/γ_J) u^*X_{n−J} Σ_{n−J}⁻¹ Y_{n−J}^* T′(λ)v ∈ C and w = −Σ_{n−J}⁻¹ Y_{n−J}^* T′(λ)v ∈ C^{n−J}. It follows that the null space N_1 of the Jacobian at (λ, v) is of dimension J, which is
one dimension larger than in the semi-simple case; see (3.2). In fact,

(4.2)  N_1 ≡ null([T(λ), T′(λ)v; u^*, 0])
     = [X, 0; 0, 1] span{ [e_1 − (γ_1/γ_J)e_J; 0_{n−J}; 0], . . . , [e_{J−1} − (γ_{J−1}/γ_J)e_J; 0_{n−J}; 0], [ηe_J; w; 1] }
     = span{ [Xe_1 − (γ_1/γ_J)Xe_J; 0], . . . , [Xe_{J−1} − (γ_{J−1}/γ_J)Xe_J; 0] } ⊕ span{ [ηXe_J + X_{n−J}w; 1] }.
The complementary space M_1 of N_1 can be defined as follows:

(4.3)  M_1 = span{ [Xe_J; 0] } ⊕ range([X_{n−J}; 0]),

so that dim(M_1) = n − J + 1 and C^{n+1} = N_1 ⊕ M_1. From (4.2) and (4.3), it follows by the definitions of N_1 and M_1 that
(4.4)  M_2 = range([T(λ), T′(λ)v; u^*, 0]) = [T(λ), T′(λ)v; u^*, 0] M_1
     = span{ [0; 1] } ⊕ range([T(λ)X_{n−J}; u^*X_{n−J}])    (u^*Xe_J = γ_J ≠ 0 by assumption),

so that dim(M_2) = n − J + 1. There is considerable freedom in choosing the complementary space N_2; for example, one can define
(4.5)  N_2 = range([(T(λ)X_{n−J})^⊥; 0]),

where (T(λ)X_{n−J})^⊥ consists of J column vectors orthogonal to range(T(λ)X_{n−J}), and therefore M_2 and N_2 are orthogonal complements of each other.
Similar to the analysis for semi-simple eigenvalues, let e_k = [x_k; µ_k] − [v; λ] be the error between the Newton iterate and the particular eigenpair (λ, v) at the kth step, P_{N_1} be the projector onto N_1 along M_1, and P_{M_1} = I − P_{N_1}. To complete the analysis, we make the following assumption.
Assumption 4.1. For any sequence of Newton iterate errors {e_k} whose components lying in N_1 converge to zero linearly, their components lying in any one-dimensional subspace of N_1 also converge to zero linearly.

The assumption states that the Newton iterate errors lying in any subspace of the kernel of the singular Jacobian exhibit qualitatively the same behavior, and that no "special" subspace exists in which the iterate errors converge more quickly than they
do in other subspaces. This isotropy assumption is based on observations of the behavior of the Newton iterate errors in our numerical experiments.
Since the Jacobian is singular at (λ, v), Theorem 2.5 shows that ‖P_{N_1}(e_k)‖ and ‖P_{M_1}(e_k)‖ converge to zero linearly and quadratically, respectively. From (4.2), since span{[ηXe_J + X_{n−J}w; 1]} is a subspace of N_1, by Assumption 4.1 the iterate errors {e_k} have a component lying in this one-dimensional space which converges linearly. Given the form of this basis vector, we see that its last entry 1 represents the eigenvalue approximation error, and that it also contains the component X_{n−J}w, which represents an eigenvector approximation error (range(X_J) is the desired eigenspace). As a result, inverse iteration converges towards λ and its eigenspace linearly.
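This linear rate is easy to observe on a toy defective problem: T(µ) = µI − A with A a single 2×2 Jordan block, so that λ = 0 has one Jordan chain of length m = 2. The step below — p = T(µ_k)⁻¹T′(µ_k)x_k, x_{k+1} = p/(u^*p), µ_{k+1} = µ_k − 1/(u^*p) — is one step of standard inverse iteration; the normalization vector u = e_1 and the starting pair are our choices for illustration. The eigenvalue error contracts by a factor close to 1/2 per step, i.e., linearly:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])          # single Jordan block: lambda = 0, m = 2
T = lambda mu: mu * np.eye(2) - A   # linear eigenproblem T(mu) v = 0
u = np.array([1.0, 0.0])            # normalization vector (u^* x = 1)

mu, x = 0.1, np.array([1.0, 0.05])  # initial eigenpair approximation
errs = []
for _ in range(8):
    p = np.linalg.solve(T(mu), x)   # T'(mu) = I here, so p = T(mu)^{-1} x
    x = p / (u @ p)                 # new eigenvector approximation
    mu = mu - 1 / (u @ p)           # new eigenvalue approximation
    errs.append(abs(mu))

ratios = [errs[k + 1] / errs[k] for k in range(len(errs) - 1)]
print(np.round(ratios, 3))          # all close to 1/2: linear convergence
```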
4.2. Numerical experiments for inverse iteration. Numerical results are provided in this section to illustrate the linear convergence of inverse iteration for defective eigenvalues. We choose two problems from the NLEVP collection and construct four problems artificially, because the benchmark problems in the literature with defective eigenvalues are rather limited.
A description of these problems is given in Table 4.1. For example, the problem df_art_m1m2, defined by a matrix-valued function T(µ) ∈ C^{256×256}, is an artificially constructed, truly nonlinear problem. It has a defective eigenvalue λ = 0, and the algebraic and geometric multiplicities of λ are alg_T(λ) = 5 and geo_T(λ) = 4, respectively; the lengths of the shortest and longest Jordan chains associated with λ are min{m_i} = 1 and max{m_i} = 2, respectively.
problem      source      type  size  eigenvalue  alg_T(λ)  geo_T(λ)  min{m_i}  max{m_i}
time_delay   NLEVP       nep   3     3πi         2         1         2         2
jordan3      artificial  lep   256   2           3         1         3         3
df_art_m1m2  artificial  nep   256   0           5         4         1         2
df_art_m1m3  artificial  nep   256   0           5         3         1         3
df_art_m2m3  artificial  nep   256   0           5         2         2         3
mirror       NLEVP       pep   9     0           9         7         1         2

Table 4.1: Description of the test problems for defective eigenvalues
The four artificially constructed problems are as follows. For jordan3,

T(µ) = G_A^* D(µ) G_B,  where  D(µ) = diag( B(µ), µI_{253} − diag([0; 1; c_1]) )

and B(µ) is the 3×3 upper bidiagonal block with diagonal entries µ − 2 and superdiagonal entries −1 (a Jordan-type block for the eigenvalue 2). For the other three problems,

T(µ) = G_A^* [D_5(µ), 0; 0, diag(c_0 + µc_1 + µ²c_2)] G_B,

where

D_5(µ) = [ e^µ−1  1−tan⁻¹(2µ)      0         0        0      ]
         [   0      2 sin(µ)        0         0        0      ]
         [   0         0       −5 ln(1+µ)     0        0      ]
         [   0         0            0         8µ       0      ]
         [   0         0            0         0   tan⁻¹(3µ)   ]

for df_art_m1m2,

D_5(µ) = [ e^µ−1  1−tan⁻¹(2µ)      0         0        0      ]
         [   0      2 sin(µ)     cos(µ²)      0        0      ]
         [   0         0       −5 ln(1+µ)     0        0      ]
         [   0         0            0         8µ       0      ]
         [   0         0            0         0   tan⁻¹(3µ)   ]

for df_art_m1m3, and

D_5(µ) = [ e^µ−1  1−tan⁻¹(2µ)      0         0        0      ]
         [   0      2 sin(µ)     cos(µ²)      0        0      ]
         [   0         0       −5 ln(1+µ)     0        0      ]
         [   0         0            0         8µ    e^{−2µ}   ]
         [   0         0            0         0   tan⁻¹(3µ)   ]

for df_art_m2m3, respectively. The diagonal entries of D_5 all have simple zeros at λ = 0, while the off-diagonal entries 1 − tan⁻¹(2µ), cos(µ²) and e^{−2µ} do not vanish there, joining the adjacent diagonal zeros into the Jordan chains whose lengths are listed in Table 4.1; G_A, G_B and c_i (i = 0, 1, 2) are the random matrices and vectors described in Section 3.2. Among the six test problems, time_delay, jordan3 and df_art_m2m3 have Jordan chains of minimum length ≥ 2. The linear convergence of inverse iteration for this type of problem is shown in Section 4.1.
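As a sanity check on these constructions, the sketch below assembles df_art_m1m2 at a reduced size (n = 55, c_i ∈ R^{50}), reading D_5 as upper triangular exactly as displayed above, and verifies geo_T(0) = 4 by counting the negligible singular values of T(0). This is our illustration in Python/NumPy, not the paper's MATLAB code.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 55, 50                       # reduced from n = 256, c_i in R^251
GA, GB = rng.standard_normal((2, n, n))
c0, c1, c2 = rng.standard_normal((3, k))

def D5(mu):
    # diagonal entries all vanish at mu = 0; the (1,2) coupling
    # 1 - arctan(2*mu) is nonzero there, joining rows 1 and 2 into
    # a single Jordan chain of length 2 (hence geo = 4, alg = 5)
    M = np.diag([np.exp(mu) - 1, 2*np.sin(mu), -5*np.log1p(mu),
                 8*mu, np.arctan(3*mu)])
    M[0, 1] = 1 - np.arctan(2*mu)
    return M

def T(mu):
    D = np.zeros((n, n))
    D[:5, :5] = D5(mu)
    D[5:, 5:] = np.diag(c0 + mu*c1 + mu**2 * c2)
    return GA.T @ D @ GB

svals = np.linalg.svd(T(0.0), compute_uv=False)
null_dim = int(np.sum(svals < 1e-8 * svals[0]))
print(null_dim)   # geometric multiplicity geo_T(0) = 4
```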
To illustrate the analysis, we generated an initial eigenpair approximation (µ_0, x_0) in the same manner as we did for semi-simple eigenvalues. We then ran inverse iteration with (µ_0, x_0), and we found that the algorithm does converge linearly until the eigenvalue and eigenvector approximation errors decrease to a magnitude around ε^{1/max{m_i}} (ε is the machine precision) — the highest precision one can achieve by a single-vector algorithm for defective eigenvalues; see [19] for details. An estimate of the order of convergence was obtained by applying the least-squares line formula to the sequence (log e_k, log e_{k+1}) for k = 1, 2, . . ., where e_k := ‖T(µ_k)x_k‖. Note that we did not use e_0, because the residual norm ‖T(µ_1)x_1‖ is usually significantly smaller than ‖T(µ_0)x_0‖ (see Lemma 4.5 for an explanation), and thus the inclusion of e_0 produces considerable noise in our estimate. Table 4.2 shows that inverse iteration converges linearly for defective eigenvalues λ for which the shortest Jordan chain is of length ≥ 2.
problem      ∠(x_0^(1), v)  # iters  estimated ℓ
time_delay   10⁻³           20       1.001
jordan3      5×10⁻³         14       0.984
df_art_m2m3  10⁻²           15       0.969

Table 4.2: Estimated order of convergence of inverse iteration for a defective λ (min{m_i} ≥ 2)
4.3. Rayleigh functional iteration and single-vector JD. The linear convergence of inverse iteration towards defective eigenpairs is much less satisfactory than its quadratic convergence towards semi-simple eigenpairs. Thus it is important to develop algorithms that converge more rapidly. By analogy with the order of convergence exhibited by the Newton-like methods for semi-simple eigenvalues, one might speculate that RFI and single-vector JD exhibit quadratic convergence towards defective eigenpairs for locally symmetric problems. Unfortunately, this is not the case, as we show later in Proposition 4.2, due to the special structure of the resolvent near defective eigenvalues. In fact, RFI generally converges only linearly towards defective eigenpairs, whether or not the local symmetry of T(·) exists.
To see this, we first assume for the sake of simplicity that the defective λ has only one right Jordan chain {φ_{1,0}, . . . , φ_{1,m−1}}, so that alg_T(λ) = m and geo_T(λ) = 1. Let (µ, x) be an eigenpair approximation, and p = T⁻¹(µ)T′(µ)x be the unnormalized new eigenvector approximation computed by any of the Newton-like methods discussed above. Recalling the structure of T⁻¹(µ) from Theorem 2.3, we have
(4.6)  p = T⁻¹(µ)T′(µ)x = Σ_{h=0}^{m−1} ( Σ_{s=0}^{h} ⟨T′(µ)x, ψ_{1,s}⟩ φ_{1,h−s} ) / (µ−λ)^{m−h} + Q(µ)T′(µ)x

  = Σ_{i=0}^{m−1} ⟨T′(µ)x, ψ_{1,0}⟩/(µ−λ)^{m−i} φ_{1,i} + Σ_{i=0}^{m−2} ⟨T′(µ)x, ψ_{1,1}⟩/(µ−λ)^{m−1−i} φ_{1,i} + . . .
      + Σ_{i=0}^{1} ⟨T′(µ)x, ψ_{1,m−2}⟩/(µ−λ)^{2−i} φ_{1,i} + ⟨T′(µ)x, ψ_{1,m−1}⟩/(µ−λ) φ_{1,0} + Q(µ)T′(µ)x

  = Σ_{j=0}^{m−1} Σ_{i=0}^{m−j−1} ⟨T′(µ)x, ψ_{1,j}⟩/(µ−λ)^{m−j−i} φ_{1,i} + Q(µ)T′(µ)x.
To analyze the direction of p, assume that the eigenvector φ_{1,0} and the generalized eigenvector φ_{1,1} are not parallel. Then for every j < m − 1, consider the following component of the last expression in (4.6):

(4.7)  Σ_{i=0}^{m−j−1} ⟨T′(µ)x, ψ_{1,j}⟩/(µ−λ)^{m−j−i} φ_{1,i} = ⟨T′(µ)x, ψ_{1,j}⟩/(µ−λ)^{m−j} Σ_{i=0}^{m−j−1} (µ−λ)^i φ_{1,i}.
The above expression shows clearly that the ratio between the magnitude of the error component (in the direction of φ_{1,1}) and the magnitude of the eigenvector component (in the direction of φ_{1,0}) in (4.7) is of order O(µ − λ). Now for j = m − 1 in (4.6), assume in addition that ⟨T′(λ)φ_{1,0}, ψ_{1,m−1}⟩ ≠ 0, so that ⟨T′(µ)x, ψ_{1,m−1}⟩ ≈ ⟨T′(λ)φ_{1,0}, ψ_{1,m−1}⟩ = O(1) for µ sufficiently close to λ and x sufficiently close to φ_{1,0} in direction. Consider the corresponding component in the last expression of (4.6), namely,

(4.8)  ⟨T′(µ)x, ψ_{1,m−1}⟩/(µ − λ) φ_{1,0} + Q(µ)T′(µ)x.
Similarly, the ratio between the magnitude of the error component and that of the eigenvector component of (4.8) is also of order O(µ − λ). Therefore, when normalized, the new eigenvector approximation p in (4.6) has an eigenvector component of magnitude O(1) and an error component of magnitude O(µ − λ).
As a result, the local convergence rates of Newton-like methods depend on the accuracy of the eigenvalue approximation µ, which in turn usually depends on the accuracy of the eigenvector approximation x. Let the accuracy of x be represented by the sine of the error angle ∠(x, φ_{1,0}), and assume that |µ − λ| = O(sin^ℓ ∠(x, φ_{1,0})). Then the convergence of the Newton-like method is of order ℓ. In particular, for RFI, where µ = ρ_F(x; T, y) is the Rayleigh functional value, we have ℓ = 1 in general for defective eigenvalues, and therefore RFI converges only linearly. This observation is summarized as follows.
Proposition 4.2. Let λ be a defective eigenvalue of the holomorphic operator T(·) : U → C^{n×n}, and φ and ψ be a corresponding unit right and unit left eigenvector, respectively. Let x = φ cos α + φ_⊥ sin α and y = ψ cos β + ψ_⊥ sin β, where α, β < π/2, and φ_⊥ ⊥ null(T(λ)) and ψ_⊥ ⊥ null(T(λ)^*) are unit vectors, so that ‖x‖ = ‖y‖ = 1. Assume that y^*T′(λ)x ≠ 0. Let ρ = ρ_F(x; T, y) be the Rayleigh functional value closest to λ such that y^*T(ρ)x = 0. Then for sufficiently small α,

(4.9)  |ρ − λ| ≤ 2‖T(λ)‖ |sin α sin β| / |y^*T′(λ)x|
             = 2‖T(λ)‖ |sin α sin β| / |cos β sin α ψ^*T′(λ)φ_⊥ + sin β cos α ψ_⊥^*T′(λ)φ + sin β sin α ψ_⊥^*T′(λ)φ_⊥|.

Assume in addition that ψ^*T′(λ)φ_⊥, ψ_⊥^*T′(λ)φ and ψ_⊥^*T′(λ)φ_⊥ are all bounded away from zero. Then, for sufficiently small α, |ρ − λ| ≤ O(sin α).

Proof. The first part of the proposition, through (4.9), was shown in [29, Section 4.3] using the Newton-Kantorovich theorem. The goal here is to show that, in contrast to the scenario for semi-simple eigenvalues, we have |ρ − λ| ≤ O(sin α) (instead of O(sin α sin β)), whether or not the local symmetry of T(·) exists.
First, assume that y is not a good left eigenvector approximation, i.e., sin β = O(1). Since sin α is small, we have 2‖T(λ)‖|sin α sin β| = O(sin α), and from (4.9), |y^*T′(λ)x| = |sin β cos α ψ_⊥^*T′(λ)φ + O(sin α)| = O(1). Therefore it follows that

(4.10)  |ρ − λ| ≤ 2‖T(λ)‖|sin α sin β| / |y^*T′(λ)x| = O(sin α).
Now assume that y is a good left eigenvector approximation, so that sin β = O(sin α); in particular, suppose that the local symmetry of T(·) exists, and thus we choose y = x or y = conj(x) to generate the two-sided Rayleigh functional value. In this case, |sin α| = |sin β|, and it follows from (4.9) that 2‖T(λ)‖|sin α sin β| = O(sin²α) and |y^*T′(λ)x| = O(sin α). Therefore, (4.10) still holds.
Remark. Proposition 4.2 shows that for a defective λ, the use of a left eigenvector approximation for the Rayleigh functional ρ_F does not generate an eigenvalue approximation of second-order accuracy, as is achieved for simple and semi-simple eigenvalues. This lack of accuracy is attributed to the fact that ψ^*T′(λ)φ = 0. Moreover, the use of a left eigenvector approximation for ρ_F could introduce additional complications. Namely, since |y^*T′(λ)x| = O(sin α) from (4.9), it follows that |y^*T′(ρ)x| ≤ |y^*T′(λ)x| + O(ρ − λ) = O(sin α). Thus there is a risk that the condition y^*T′(ρ)x ≠ 0, critical to the definition of Rayleigh functionals (see [26]), may be violated, at least numerically. As a result, we see from (2.16) that the small magnitude of y^*T′(ρ)x could introduce numerical difficulties in the projector Π_k^(1) for single-vector JD. We therefore recommend using y far from a left eigenvector approximation to compute the Rayleigh functional value for a defective λ.
Combining Proposition 4.2 with (4.7) and (4.8), it follows immediately that RFI converges linearly towards defective eigenvalues. In addition, assuming that the Rayleigh functional value ρ = ρ_F(x; T, y) satisfies ⟨T′(ρ)x, y⟩ ≠ 0, the single-vector JD (2.16) also converges linearly in this case, because it is mathematically equivalent to RFI. We summarize this result in the following theorem.
Theorem 4.3. Let λ be a defective eigenvalue of the holomorphic operator T(·) : U → C^{n×n} with exactly one right and one left Jordan chain {φ_{1,0}, . . . , φ_{1,m−1}} and {ψ_{1,0}, . . . , ψ_{1,m−1}}, where φ_{1,0} is not parallel to φ_{1,1}, and ⟨T′(λ)φ_{1,0}, ψ_{1,m−1}⟩ ≠ 0. Let (ρ_0, x_0) be an initial eigenpair approximation with ∠(x_0, φ_{1,0}) sufficiently small, and ρ_k = ρ_F(x_k; T, y_k) be the Rayleigh functional value closest to λ. Assume that the upper bounds in Proposition 4.2 are qualitatively sharp, i.e., |ρ_k − λ| = O(sin ∠(x_k, φ_{1,0})). Then RFI converges locally and linearly towards (λ, φ_{1,0}). The same conclusion applies to single-vector JD (2.16) if, in addition, ρ_k is such that y_k^*T′(ρ_k)x_k ≠ 0.
4.4. Numerical experiments for RFI/JD. In this section, we illustrate by numerical experiments the linear convergence of RFI and single-vector JD for defective eigenvalues with a single Jordan chain. The experiments are performed on the problems time_delay and jordan3. We run RFI with an initial eigenpair approximation (µ_0, x_0), and we analyze (log e_k, log e_{k+1}), where e_k := ‖T(µ_k)x_k‖ (k = 1, 2, . . .), to estimate the order of convergence. Here, note that the local symmetry does not exist for either test problem, and we choose y = T′(λ)x_k to generate the value of the Rayleigh functional ρ_F(x_k; T, y). Table 4.3 shows the error angle of the initial iterate, the number of iterations taken, and the estimated order of convergence. We see clearly that RFI converges linearly for both problems.
In addition, we tested the convergence of the two-sided Rayleigh functional iteration (TSRFI), which converges cubically for simple eigenvalues; see, e.g., [25]. We found that this algorithm also converges linearly in this setting, which is consistent with Proposition 4.2 and Theorem 4.3. As we have discussed, for a defective eigenvalue λ with a single Jordan chain, we generally have |ρ_F(x; T, y) − λ| = O(∠(x, v)) (instead of O(∠(x, v)²)), no matter whether y is a good left eigenvector approximation; consequently, TSRFI exhibits the same (linear) order of convergence as RFI. The results for TSRFI are also summarized in Table 4.3.
                          RFI                   TSRFI
problem     ∠(x_0, v)  # iters  estimated ℓ  # iters  estimated ℓ
time_delay  10⁻³       18       1.002        15       0.981
jordan3     5×10⁻³     19       0.961        13       0.967

Table 4.3: Estimated order of convergence of RFI and TSRFI for a defective λ (geo_T(λ) = 1)
4.5. Accelerated algorithms. In this section, we first study an accelerated inverse iteration for degenerate eigenvalues, inspired by the accelerated Newton's method for general nonlinear systems of equations near singular roots [8]. This algorithm, which requires the solution of two linear systems in each iteration, exhibits quadratic convergence towards defective eigenpairs. We then propose an accelerated single-vector Jacobi-Davidson method based on a minor modification of the accelerated inverse iteration, and we show by experiments that it also converges quadratically.
4.5.1. Accelerated inverse iteration. Let λ be a defective eigenvalue of the holomorphic T(·). To simplify the notation, we use MATLAB-style expressions to denote eigenpair approximations written in the form of column vectors; e.g., [x; µ] is used to represent the vector obtained by appending µ to x. Let F([v; λ]) = 0 be the augmented system of (1.1), and [x_k; µ_k] be the starting eigenpair approximation in the kth iteration. Consider the following accelerated Newton's method:

(4.11)  [w_k; ν_k] = [x_k; µ_k] − F′([x_k; µ_k])⁻¹ F([x_k; µ_k])                (half-step iterate)
        [x_{k+1}; µ_{k+1}] = [w_k; ν_k] − m F′([w_k; ν_k])⁻¹ F([w_k; ν_k])     (full-step iterate)
                           = m( [w_k; ν_k] − F′([w_k; ν_k])⁻¹ F([w_k; ν_k]) ) − (m−1)[w_k; ν_k],
where F([x_k; µ_k]) = [T(µ_k)x_k; u^*x_k − 1] = [T(µ_k)x_k; 0], ‖F([x_k; µ_k])‖ = ‖T(µ_k)x_k‖ is the residual norm of [x_k; µ_k], and F′([x_k; µ_k]) = [T(µ_k), T′(µ_k)x_k; u^*, 0] is the Jacobian of F at [x_k; µ_k]. In other words, the full-step eigenpair approximation [x_{k+1}; µ_{k+1}] is a special linear combination of the half-step iterate [w_k; ν_k] and the next standard Newton iterate [w_k; ν_k] − F′([w_k; ν_k])⁻¹F([w_k; ν_k]). We will see later in this section that the length m of the Jordan chain is the only value of the linear combination coefficients in (4.11) for which the algorithm converges quadratically.
It can be shown from the structure of the block inverse of the Jacobian that (4.11) is equivalent to the following accelerated inverse iteration:

(4.12)  1. p_k = T⁻¹(µ_k)T′(µ_k)x_k              (half-step intermediate vector)
        2. w_k = p_k / (u^*p_k)                   (half-step eigenvector approximation)
        3. ν_k = µ_k − 1/(u^*p_k)                 (half-step eigenvalue approximation)
        4. q_k = T⁻¹(ν_k)T′(ν_k)w_k              (full-step intermediate vector)
        5. x_{k+1} = −(m−1)w_k + m q_k/(u^*q_k)   (full-step eigenvector approximation)
        6. µ_{k+1} = ν_k − m/(u^*q_k)             (full-step eigenvalue approximation)
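The six steps of (4.12) translate directly into code. In the sketch below (our illustration: the linear 2×2 Jordan-block toy T(µ) = µI − A with m = 2 and (λ, v) = (0, e_1), and the normalization u = e_1, are assumptions for the demo), one full accelerated sweep is compared with its own half step, i.e., with one plain inverse-iteration step:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])          # single Jordan block: lambda = 0, m = 2
T = lambda mu: mu * np.eye(2) - A   # T'(mu) = I for this linear problem
u = np.array([1.0, 0.0])            # normalization u^* x = 1
m = 2                               # length of the Jordan chain

def accelerated_step(mu, x):
    # steps 1-3 of (4.12): half step (one plain inverse-iteration step)
    p  = np.linalg.solve(T(mu), x)
    w  = p / (u @ p)
    nu = mu - 1 / (u @ p)
    # steps 4-6 of (4.12): full step with extrapolation coefficient m
    q   = np.linalg.solve(T(nu), w)
    x1  = -(m - 1) * w + m * q / (u @ q)
    mu1 = nu - m / (u @ q)
    return (mu1, x1), (nu, w)       # full-step and half-step iterates

(mu_full, _), (mu_half, _) = accelerated_step(0.1, np.array([1.0, 0.1]))
print(abs(mu_half), abs(mu_full))
```

For this toy the half step only halves the eigenvalue error (0.1 → 0.05), while the full step lands on the eigenpair essentially exactly, because the augmented residual is exactly quadratic in the error here; in general the analysis of this section gives quadratic convergence.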
In this section, we establish the locally quadratic convergence of the accelerated algorithm (4.11) (or (4.12)) towards the desired defective eigenpair. To this end, we first make the following assumption about the eigenpair approximation (µ_k, x_k).
Assumption 4.4. Let λ be a defective eigenvalue of the holomorphic T(·) with J right and J left Jordan chains as shown in (4.1). Assume that the eigenpair approximation (µ_k, x_k) is such that its eigenvalue approximation error |µ_k − λ| is proportional to its eigenvector approximation error ∠(x_k, Φ), where Φ = span{φ_{1,0}, . . . , φ_{J,0}}.
This assumption seems reasonable due to the following observation. In practice, it is usually not known a priori whether the desired eigenpair is defective, and thus we use Newton's method, RFI or JD to solve for the eigenpair. For Newton's method, it is shown in Section 4.1 that its local convergence towards a defective eigenpair is linear; in fact, the error e_k = [x_k; µ_k] − [v; λ] has a component in the one-dimensional space span{[ηXe_J + X_{n−J}w; 1]} (a subspace of the null space of the Jacobian) which converges linearly. Since this basis vector has nontrivial components representing both eigenvalue and eigenvector approximation errors, the two errors become proportional to each other after sufficiently many Newton steps. This assumption also holds for RFI, where µ_k = ρ is the Rayleigh functional value, since Proposition 4.2 shows that |ρ − λ| ≤ O(sin ∠(x_k, Φ)) for the defective λ, independent of the symmetry of T(λ).
Our main goal is to show that the accelerated method (4.11) converges quadrat-ically towards the defective eigenpair [v;Ī»]. We take three steps to complete thisanalysis. In step 1, we make a few assumptions about the half-step iterate [wk; Ī½k],and we study the values of b, c, d in the Taylor expansion of F ([wk; Ī½k]) at [v;Ī»]. Thisstep is critical to establish the quadratic convergence. In step 2, we show that theJacobian at any eigenpair approximation sufficiently close to [v;Ī»] is nonsingular, sothat all half-step and full-step iterates are well-defined. In step 3, we write the full-step iterate error as a linear combination of the error of [wk; Ī½k] and that of the nextstandard Newton iterate [zk; Ī¾k] = [wk; Ī½k] ā F ā²([wk; Ī½k])ā1F ([wk; Ī½k]); we analyze
24 D. B. Szyld and F. Xue
the projections of the full-step iterate error onto M1 and N1, and show that all the projected errors are bounded by O(sk²), where sk = sin∠(xk, φ1,0) is the sine of the error angle between the current eigenvector approximation and the desired eigenvector.
In the following analysis, to simplify the notation, we omit the subscript k of sk, ck, [xk; µk], and [wk; νk] when there is no risk of confusion; we keep the subscript k + 1 of [xk+1; µk+1], though, to clearly identify the full-step iterate. To simplify the analysis, we again assume that λ has only one right Jordan chain {φ1,0, . . . , φ1,m−1}.
In step 1, we first show that the half-step iterate [w; ν] has the following special property. Since Newton's method converges linearly, the error of [w; ν] is proportional to that of [x; µ], yet the residual norm of [w; ν] is significantly smaller than that of [x; µ]; namely, ‖T(µ)x‖ = O(s) and ‖T(ν)w‖ = O(s^m), where s = sin∠(x, φ1,0) is the eigenvector approximation error.
Lemma 4.5. Let λ be a defective eigenvalue of the holomorphic operator T(·) with only one right Jordan chain {φ1,0, . . . , φ1,m−1} and one left Jordan chain {ψ1,0, . . . , ψ1,m−1}. Assume that ⟨T″(λ)φ1,0, ψ1,0⟩ ≠ 0 or ⟨T′(λ)φ1,0, ψ1,1⟩ ≠ 0. Let [x; µ] be an eigenpair approximation satisfying Assumption 4.4, s = sin∠(x, φ1,0) be the sine of the error angle of x, and [w; ν] be the half-step iterate in (4.11). Then ‖T(ν)w‖ = O(s^m) for sufficiently small s.
Proof. We first establish two critical relations, namely, |u∗T−1(µ)T′(µ)x| = O(s^{−(m−1)}) and ‖T′(µ)x − T′(µ)w‖ = O(s). Assume without loss of generality that x = cφ1,0 + sg, where ‖φ1,0‖ = ‖g‖ = 1 and g ⊥ φ1,0.
We first show that |u∗T−1(µ)T′(µ)x| = O(s^{−(m−1)}). From (4.6), we have that

(4.13)  p = T−1(µ)T′(µ)x = ∑_{j=0}^{m−1} ∑_{i=0}^{m−j−1} [⟨T′(µ)x, ψ1,j⟩ / (µ − λ)^{m−j−i}] φ1,i + G(µ)T′(µ)x
        = ∑_{i=0}^{m−1} [⟨T′(µ)x, ψ1,0⟩ / (µ − λ)^{m−i}] φ1,i + ∑_{j=1}^{m−1} ∑_{i=0}^{m−j−1} [⟨T′(µ)x, ψ1,j⟩ / (µ − λ)^{m−j−i}] φ1,i + G(µ)T′(µ)x.
Recall from (2.11) that ⟨T′(λ)φ1,0, ψ1,0⟩ = 0, and |µ − λ| = O(s) by Assumption 4.4. First, assume that ⟨T″(λ)φ1,0, ψ1,0⟩ = O(1). Then
(4.14)  ⟨T′(µ)x, ψ1,0⟩ = c⟨T′(µ)φ1,0, ψ1,0⟩ + s⟨T′(µ)g, ψ1,0⟩
        = c⟨T′(λ)φ1,0, ψ1,0⟩ + c(µ − λ)⟨T″(λ)φ1,0, ψ1,0⟩ + s⟨T′(λ)g, ψ1,0⟩ + O((µ − λ)²) + O(s(µ − λ))
        = 0 + O(µ − λ) + O(s) = O(s).
The dominant term in ∑_{i=0}^{m−1} ⟨T′(µ)x, ψ1,0⟩(µ − λ)^{−(m−i)} φ1,i, which appears in the last equality of (4.13), is thus

[⟨T′(µ)x, ψ1,0⟩ / (µ − λ)^m] φ1,0 = [O(s) / O(s^m)] φ1,0 = O(s^{−(m−1)}) φ1,0.
Consider the sum of the terms corresponding to j = 1 in (4.13), namely,

(4.15)  ∑_{i=0}^{m−2} [⟨T′(µ)x, ψ1,1⟩ / (µ − λ)^{m−1−i}] φ1,i.
Now, assume alternatively that ⟨T′(λ)φ1,0, ψ1,1⟩ ≠ 0. It follows that ⟨T′(µ)x, ψ1,1⟩ = ⟨T′(λ)φ1,0, ψ1,1⟩ + O(s) = O(1) for small s. Therefore the dominant term in (4.15) is

[⟨T′(µ)x, ψ1,1⟩ / (µ − λ)^{m−1}] φ1,0 = [O(1) / O(s^{m−1})] φ1,0 = O(s^{−(m−1)}) φ1,0.
For every j ≥ 2, the sum of the corresponding terms in (4.13) is bounded by O(s^{−(m−2)}), and these terms are thus not the dominant ones. In summary, if either ⟨T″(λ)φ1,0, ψ1,0⟩ ≠ 0 or ⟨T′(λ)φ1,0, ψ1,1⟩ ≠ 0, the dominant term in (4.13) can be written as O(s^{−(m−1)})φ1,0, and |u∗T−1(µ)T′(µ)x| = O(s^{−(m−1)}) follows.
To complete the proof, it is sufficient to show that ‖T′(µ)x − T′(µ)w‖ = O(s). In fact, since [w; ν] is obtained by applying one step of Newton's method to [x; µ], and both the eigenvalue and the eigenvector approximation errors of Newton's iterates converge linearly in this case (see Section 4.1), we have ‖w − φ1,0‖ = O(s). Therefore

‖T′(µ)x − T′(µ)w‖ ≤ ‖T′(µ)‖‖x − w‖ ≤ ‖T′(µ)‖(‖x − φ1,0‖ + ‖w − φ1,0‖) = O(s).
Finally, note from (4.12) that

(4.16)  T(ν)w = T(µ − [u∗T−1(µ)T′(µ)x]^{−1}) w
        = T(µ)w − T′(µ)w / (u∗T−1(µ)T′(µ)x) + O((u∗T−1(µ)T′(µ)x)^{−2})
        = T(µ)[T−1(µ)T′(µ)x / (u∗T−1(µ)T′(µ)x)] − T′(µ)w / (u∗T−1(µ)T′(µ)x) + O(s^{2(m−1)})
        = [T′(µ)x − T′(µ)w] / (u∗T−1(µ)T′(µ)x) + O(s^{2(m−1)}).
It then immediately follows that

‖T(ν)w‖ = ‖T′(µ)x − T′(µ)w‖ / |u∗T−1(µ)T′(µ)x| + O(s^{2(m−1)}) = O(s) / O(s^{−(m−1)}) + O(s^{2(m−1)}) = O(s^m)

for m ≥ 2. This completes the proof.
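The scaling ‖T(ν)w‖ = O(s^m) established in Lemma 4.5 can be checked numerically. The sketch below is our own construction, not one of the paper's test problems: a toy 2×2 quadratic eigenproblem T(λ) = A − λI + λ²B whose eigenvalue λ = 0 is defective with a single Jordan chain of length m = 2 and satisfies the first hypothesis of the lemma, ⟨T″(0)φ1,0, ψ1,0⟩ = 2B[1,0] ≠ 0. Shrinking s by a factor of 10 should shrink the half-step residual by roughly 10^m = 100.

```python
import numpy as np

# Toy 2x2 quadratic eigenproblem T(lam) = A - lam*I + lam^2*B (our own
# illustrative construction, NOT one of the paper's test problems).
# lam = 0 is a defective eigenvalue with a single Jordan chain of length
# m = 2 (phi_{1,0} = e1, phi_{1,1} = e2).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.3, 0.1], [0.2, -0.4]])
T = lambda lam: A - lam * np.eye(2) + lam**2 * B
Tp = lambda lam: -np.eye(2) + 2.0 * lam * B   # T'(lam)
u = np.array([1.0, 0.0])                      # normalization vector, u*v = 1

def half_step_residual(s):
    # eigenpair approximation with |mu - lam| proportional to the
    # eigenvector error, as required by Assumption 4.4
    x, mu = np.array([1.0, s]), s
    p = np.linalg.solve(T(mu), Tp(mu) @ x)    # p = T(mu)^{-1} T'(mu) x
    alpha = u @ p
    w, nu = p / alpha, mu - 1.0 / alpha       # half-step iterate of (4.12)
    return np.linalg.norm(T(nu) @ w)

r1, r2 = half_step_residual(1e-2), half_step_residual(1e-3)
# r2/r1 should be roughly (1e-3/1e-2)^m = 1e-2
```

On this example the observed ratio is consistent with second-order decay of the half-step residual, as the lemma predicts for m = 2.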
Lemma 4.5 gives a critical preliminary result for step 1 of the proof, which concerns the values of b, c, and d in the Taylor expansion of F([w; ν]) at [φ1,0; λ]. To complete this step, we need the following assumption about [w; ν].
Assumption 4.6. Let λ be a defective eigenvalue of the holomorphic operator T(·) with exactly one right and one left Jordan chain. Let M2 defined in (4.4) be the range of the Jacobian at (λ, v) (where v = φ1,0), and let [x; µ] and [w; ν] be the starting and the half-step iterates, respectively, of the accelerated method (4.11). Assume that there exists a constant θ0 > 0, independent of |µ − λ| and ∠(x, φ1,0), such that the angle between F([w; ν]) = [T(ν)w; 0] and M2 is bounded below by θ0.
Assumption 4.6 seems reasonable in general, as we discuss below. By Lemma 4.5, T(ν)w has a significant component parallel to T′(µ)(w − x), where w − x ⊥ u because u∗w = u∗x = 1. Thus [T(ν)w; 0] has a large component in {[T′(µ)(span{u})⊥; 0]}, where (span{u})⊥ is the orthogonal complement of span{u}. Given the structure of M2 in (4.4), with a proper choice of u, there is no reason to expect this component to be well approximated by any vector in M2.
To complete step 1, recall the Taylor expansion of F([w; ν]) at [v; λ] as follows:

F([w; ν]) ≡ [T(ν)w; u∗w − 1] = [T(λ)v; 0] + [T(λ) T′(λ)v; u∗ 0][w − v; ν − λ]
+ ∑_{j=1}^{n} (1/(j+1)!) [(ν − λ)^j T^{(j)}(λ)   {j(ν − λ)^{j−1} T^{(j)}(λ)(w − v) + (ν − λ)^j T^{(j+1)}(λ)v}; 0   0][w − v; ν − λ] + O(‖ewν‖^{n+2})
= ∑_{j=0}^{n} (1/(j+1)!) F^{(j+1)}([v; λ])(ewν^j, ewν) + O(‖ewν‖^{n+2}),

where ewν = [w; ν] − [v; λ], F^{(j+1)}([v; λ])(·, · · · , ·) : C^{n+1} × · · · × C^{n+1} → C^{n+1} stands for the (j+1)st derivative of F at [v; λ] (which is a multilinear form (tensor) with j + 1 arguments), and ewν^j means that the first j arguments of F^{(j+1)} are all ewν. In particular, F^{(j+1)}([v; λ])(ewν^j, ·) is an (n+1) × (n+1) matrix.
Now we are ready to discuss several cases in which the possible values of b, c, and d in the Taylor expansion of F([w; ν]) at [v; λ] can be determined.
Lemma 4.7. Let [v; λ] be a defective eigenpair of T(·) with one right and one left Jordan chain of length m, [x; µ] a corresponding eigenpair approximation satisfying Assumption 4.4, and [w; ν] the half-step eigenpair approximation in (4.11) computed by one step of Newton's method. The Taylor expansion of F([w; ν]) at [v; λ] is

F([w; ν]) ≡ [T(ν)w; 0] = [T(λ)v; 0] + AF∗ ewν + ∑_{j=a}^{n} (1/(j+1)) A^{(j)}([w; ν]) ewν + ∑_{j=b}^{n} (1/(j+1)) B^{(j)}([w; ν]) ewν + ∑_{j=c}^{n} (1/(j+1)) C^{(j)}([w; ν]) ewν + ∑_{j=d}^{n} (1/(j+1)) D^{(j)}([w; ν]) ewν + O(‖ewν‖^{n+2})   (n ≥ max(a, b, c, d)).

Then b = 1 if m > 2. In addition, under Assumption 4.6, exactly one of the following scenarios must be true:
1. c ≥ d = m − 1, or
2. (only if m ≥ 3) c = m − 2 and d ≥ m − 1, or
3. (only if m ≥ 4) c ≤ m − 3 and d = c + 1.
Proof. We know from Lemma 4.5 that ‖F([w; ν])‖ = O(s^m). By Assumption 4.6, ∠(F([w; ν]), M2) > θ0 > 0. From (4.4) and (4.5), since M2 and N2 are orthogonal complements, and PN2 is the projection onto N2 along M2, we have

‖PN2 F([w; ν])‖ ≥ ‖F([w; ν])‖ sin θ0 = O(s^m) and ‖PM2 F([w; ν])‖ ≤ ‖F([w; ν])‖ cos θ0 = O(s^ℓ), where ℓ ≥ m.
In addition, by Assumption 4.4, [x; µ] has eigenvalue and eigenvector approximation errors both on the order of O(s), and the two errors are represented by certain components lying in N1 and M1, respectively; see (4.2) and (4.3). It follows that

PN1 exµ = O(s) and PM1 exµ = O(s), where exµ = [x; µ] − [v; λ].
By Theorem 2.5, we have PN1 ewν = O(s) and PM1 ewν = O(s²). To find the value of b for m > 2, note that
PM2 F([w; ν]) = AF∗ ewν + ∑_{j=a}^{n} (1/(j+1)) A^{(j)}([w; ν]) ewν + ∑_{j=b}^{n} (1/(j+1)) B^{(j)}([w; ν]) ewν
= AF∗ (PM1 ewν) + ∑_{j=a}^{n} (1/(j+1)) A^{(j)}([w; ν])(PM1 ewν) + ∑_{j=b}^{n} (1/(j+1)) B^{(j)}([w; ν])(PN1 ewν)
= O(s²) + O(s^{a+2}) + O(s^{b+1}) = O(s^ℓ),

where ℓ ≥ m. For any m > 2, since O(s^{a+2}) ≤ O(s³), we must have b = 1 to cancel out the O(s²) terms in the last step of the above equality.
Similarly, to see the relation between c and d, we have

(4.17)  PN2 F([w; ν]) = ∑_{j=c}^{n} (1/(j+1)) C^{(j)}([w; ν]) ewν + ∑_{j=d}^{n} (1/(j+1)) D^{(j)}([w; ν]) ewν + O(‖ewν‖^{n+2}) = O(s^{c+2}) + O(s^{d+1}) + O(s^{n+2}) = O(s^m).
Clearly, (4.17) holds only if exactly one of the following cases is true:
1. If c ≥ m − 1, then we must have d = m − 1 (m ≥ 2) to maintain an O(s^m) term on the left-hand side of (4.17).
2. If c = m − 2, then we must have d ≥ m − 1 (m ≥ 3, b = 1) so that there is no term of order lower than m on the left-hand side of (4.17).
3. If c ≤ m − 3, then we must have d = c + 1 (m ≥ 4, b = 1) so that the two terms of order lower than m on the left-hand side of (4.17) cancel out.
The lemma is thus established.
Lemma 4.7 completes step 1, where we derived the possible values of b, c, and d
for the Taylor expansion of F([w; ν]) at [v; λ]. In step 2, we show that all the half-step and full-step iterates of the accelerated inverse iteration are well-defined. To this end, it is sufficient to show that the Jacobian of Newton's method at any half-step or full-step eigenpair approximation [w; ν] or [x; µ] is nonsingular. The nonsingularity of the Jacobian is guaranteed by the following lemma.
Lemma 4.8. Let [x; µ] be an eigenpair approximation sufficiently close, but not equal, to [v; λ]. Then F′([x; µ]) = [T(µ) T′(µ)x; u∗ 0], the Jacobian of Newton's method at [x; µ], is nonsingular.
Proof. By assumption, the eigenvalues of T(·) are isolated, and therefore T(µ) is nonsingular if µ is sufficiently close, but not equal, to λ. It follows that [T(µ); u∗] has full column rank. Assume that the Jacobian [T(µ) T′(µ)x; u∗ 0] is singular; then there exists y ∈ C^n such that [T(µ); u∗] y = [T′(µ)x; 0]. It follows that y = T(µ)^{−1}T′(µ)x and u∗y = 0. However, if x is sufficiently close to v in direction, and |µ − λ| is sufficiently small, then y, the unnormalized new eigenvector approximation, is closer to v in direction. Given the normalization condition u∗v = 1, it is impossible to have u∗y = 0. Therefore the Jacobian must be nonsingular.
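The point of Lemma 4.8 can be observed numerically: close to a defective eigenpair, T(µ) itself is nearly singular, yet the bordered Jacobian stays safely invertible. A minimal sketch on a toy 2×2 quadratic eigenproblem of our own construction (λ = 0 defective, one Jordan chain of length 2; not one of the paper's test problems):

```python
import numpy as np

# Toy 2x2 quadratic eigenproblem (our own construction, not from the paper):
# lam = 0 is defective with one Jordan chain of length 2.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.3, 0.1], [0.2, -0.4]])
T = lambda lam: A - lam * np.eye(2) + lam**2 * B
Tp = lambda lam: -np.eye(2) + 2.0 * lam * B
u = np.array([1.0, 0.0])

mu = 1e-2                          # close, but not equal, to lam = 0
x = np.array([1.0, mu])            # eigenpair approximation near [v; lam]
jac = np.block([[T(mu), (Tp(mu) @ x).reshape(2, 1)],
                [u.reshape(1, 2), np.zeros((1, 1))]])
smin_T = np.linalg.svd(T(mu), compute_uv=False)[-1]
smin_jac = np.linalg.svd(jac, compute_uv=False)[-1]
# T(mu) is nearly singular (smin_T ~ |mu - lam|^m), while the bordered
# Jacobian's smallest singular value is orders of magnitude larger
```

The bordering by T′(µ)x and u∗ is what lifts the near-singularity of T(µ) and makes every Newton step in the accelerated iteration well-defined.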
In step 3, we first show that the error of the full-step iterate can be written as a linear combination of the half-step iterate error and the error of the next standard Newton iterate [z; ξ] = [w; ν] − F′([w; ν])^{−1}F([w; ν]); this derivation follows that in
[8, Section 4]. Then, we study the projections of the full-step iterate error onto N1 and M1, and show that the projected errors are bounded by O(s²).
To decompose the error of the full-step iterate, define

[z; ξ] = [w; ν] − F′([w; ν])^{−1}F([w; ν]),

the iterate obtained by applying one step of Newton's method to the half-step iterate. Then we have from (4.11) that [xk+1; µk+1] = m[z; ξ] − (m − 1)[w; ν]. The error of [xk+1; µk+1] can be analyzed by studying ezξ = [z; ξ] − [v; λ]. Following the derivation in [8, Section 4], one can show that ezξ satisfies
(4.18)  ezξ = ewν − F′([w; ν])^{−1}F([w; ν])
        = F′([w; ν])^{−1} { ∑_{j=a}^{n} (j/(j+1)) A^{(j)}([w; ν]) ewν + ∑_{j=b}^{n} (j/(j+1)) B^{(j)}([w; ν]) ewν + ∑_{j=c}^{n} (j/(j+1)) C^{(j)}([w; ν]) ewν + ∑_{j=d}^{n} (j/(j+1)) D^{(j)}([w; ν]) ewν + O(s^{n+2}) }.
To study the errors of [xk+1; µk+1] projected onto N1 and M1, we assume that the first case discussed in Lemma 4.7 holds, that is, c ≥ d = m − 1; we assume in addition that c̄ ≥ d̄. These assumptions are similar, but not identical, to those in [8, Section 7]. Since PM1 ewν = O(s²) and PN1 ewν = O(s), we have
(4.19)  [xk+1; µk+1] − [v; λ] = m[z; ξ] − (m − 1)[w; ν] − [v; λ] = m ezξ − (m − 1) ewν
= m F′([w; ν])^{−1} { (a/(a+1)) A^{(a)}([w; ν]) ewν + (b/(b+1)) B^{(b)}([w; ν]) ewν + (c/(c+1)) C^{(c)}([w; ν]) ewν + (d/(d+1)) D^{(d)}([w; ν]) ewν + PM2 O(s^{min(a+3,b+2)}) + PN2 O(s^{min(c+3,d+2)}) } − (m − 1) F′([w; ν])^{−1} F′([w; ν]) ewν
= F′([w; ν])^{−1} { −(m − 1) AF∗ ewν + (ma/(a+1) − (m − 1)) A^{(a)}([w; ν]) ewν + (mb/(b+1) − (m − 1)) B^{(b)}([w; ν]) ewν + PM2 O(s^{min(a+3,b+2)}) + PN2 O(s^{d+2}) }.
Here, in the last step of (4.19), C^{(c)}([w; ν]) ewν = C^{(c)}([w; ν])(PM1 ewν) = PN2 O(s^{c+2}) does not appear explicitly, because it is assimilated into PN2 O(s^{d+2}), where d ≤ c. More importantly, as d = m − 1 by our assumption, D^{(d)}([w; ν]) ewν vanishes, because its coefficient is md/(d+1) − (m − 1) = 0. The cancellation of D^{(d)}([w; ν]) ewν is critical in the proof of the quadratic convergence of the accelerated method.
To finish step 3, we show that the errors of [xk+1; µk+1] projected onto N1 and M1 are both bounded by O(s²). To this end, recall from (2.19) the expression of F′([w; ν])^{−1}, and apply it to (4.19). With some algebraic manipulation, we have
[xk+1; µk+1] − [v; λ] ≡ e^{k+1}_{MM} + e^{k+1}_{MN} + e^{k+1}_{NM} + e^{k+1}_{NN},
where

e^{k+1}_{MM} = PM1 {AF^{−1}([w; ν]) + AF^{−1}([w; ν]) BF([w; ν]) SF^{−1}([w; ν]) CF([w; ν]) AF^{−1}([w; ν])} × { −(m − 1) AF∗ ewν + ((a + 1 − m)/(a + 1)) A^{(a)}([w; ν]) ewν + ((b + 1 − m)/(b + 1)) B^{(b)}([w; ν]) ewν + PM2 O(s^{min(a+3,b+2)}) },

e^{k+1}_{MN} = −PM1 AF^{−1}([w; ν]) BF([w; ν]) SF^{−1}([w; ν]) PN2 O(s^{d+2}),

e^{k+1}_{NM} = −PN1 SF^{−1}([w; ν]) CF([w; ν]) AF^{−1}([w; ν]) PM2 { −(m − 1) AF∗ ewν + ((a + 1 − m)/(a + 1)) A^{(a)}([w; ν]) ewν + ((b + 1 − m)/(b + 1)) B^{(b)}([w; ν]) ewν + PM2 O(s^{min(a+3,b+2)}) },

and e^{k+1}_{NN} = PN1 SF^{−1}([w; ν]) PN2 O(s^{d+2}).

We show that all the projected errors shown above are bounded by O(s²). First,
note from (2.20) that the dominant term of AF([w; ν]) is AF∗ = PM2 F′([v; λ]) PM1, which means ‖AF([w; ν])‖ = O(1) and ‖AF^{−1}([w; ν])‖ = O(1). In addition, by (2.18), we have ‖SF([w; ν])‖ = O(s^{min(d,b+c)}). Thus the operator involved in e^{k+1}_{MM} satisfies

‖AF^{−1}([w; ν]) + AF^{−1}([w; ν]) BF([w; ν]) SF^{−1}([w; ν]) CF([w; ν]) AF^{−1}([w; ν])‖ ≤ O(1) + O(s^{b+c−min(d,b+c)}) ≤ O(1) + O(1) = O(1).

It follows that ‖e^{k+1}_{MM}‖ = O(1)(O(s²) + O(s^{a+2}) + O(s^{b+1})) = O(s²).
For the second error term e^{k+1}_{MN}, since c ≥ d by our assumption, we have

‖AF^{−1}([w; ν]) BF([w; ν]) SF^{−1}([w; ν])‖ ≤ O(s^{b−min(d,b+c)}) = O(s^{b−d}),

and it follows that ‖e^{k+1}_{MN}‖ ≤ O(s^{b−d}) O(s^{d+2}) ≤ O(s^{b+2}).
Similarly, for the third error term e^{k+1}_{NM}, we have

‖SF^{−1}([w; ν]) CF([w; ν]) AF^{−1}‖ ≤ O(s^{c−min(d,b+c)}) = O(s^{c−d}),

and therefore ‖e^{k+1}_{NM}‖ ≤ O(s^{c−d})(O(s²) + O(s^{a+2}) + O(s^{b+1})) ≤ O(s²).
Finally, the last error term e^{k+1}_{NN} can be directly bounded using ‖SF^{−1}([w; ν])‖:

‖e^{k+1}_{NN}‖ ≤ O(s^{−min(d,b+c)}) O(s^{d+2}) = O(s^{−d+d+2}) ≤ O(s²).
The above analysis completes step 3, and the result is summarized as follows.
Theorem 4.9. Let λ be a defective eigenvalue of the holomorphic T(·) with one left and one right Jordan chain of length m, and let v = φ1,0 be the corresponding eigenvector. Let δ, θ > 0 be appropriately small constants, and let [x0; µ0] ∈ W(δ, θ) be an eigenpair approximation of [v; λ] satisfying Assumption 4.4. Under Assumption 4.6, suppose that for any half-step iterate [wk; νk], c̄ ≥ d̄ and c ≥ d = m − 1. Then for sufficiently small sin∠(x0, v), the full-step iterates of the accelerated method (4.11) (or (4.12)) are well-defined, and [xk; µk] converges towards [v; λ] quadratically.
Remark. One can see from the above derivation that c ≥ d = m − 1 and c̄ ≥ d̄ are critical assumptions leading to the quadratic convergence of the accelerated methods. In addition, [xk+1; µk+1] = m([w; ν] − F′([w; ν])^{−1}F([w; ν])) − (m − 1)[w; ν] is the unique linear combination of the two iterates that cancels out the error terms lying in N1 on the order of O(s). A violation of these assumptions or a different linear combination would lead to a deceleration of the convergence rate back to linear.
Theorem 4.9 is an extension of the major results in [8, Section 7] to the special setting of eigenvalue computation, under less stringent assumptions. Specifically, [8] studied two accelerated Newton methods for solving a general nonlinear system of equations for a singular root. It was assumed there that a = ā, b = b̄, c = c̄, and d = d̄ for all intermediate and full-step iterates. Under certain additional hypotheses, the first accelerated method requires the computation of three Newton directions per iteration, assuming that c ≥ d and b ≥ min(2, d); the second one solves for two Newton directions per iteration, assuming that b ≥ 2 and c ≥ d ≥ 2. In fact, for the computation of degenerate eigenvalues discussed here, these hypotheses do not hold. Our assumption in Theorem 4.9 about the values of c, d, c̄, and d̄ is made for the half-step iterate [w; ν] alone, and it is less demanding than those in [8].
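The accelerated iteration analyzed above can be sketched compactly: each outer iteration takes two Newton/inverse-iteration steps and extrapolates, xk+1 = m z − (m − 1)w. The toy 2×2 quadratic eigenproblem below is our own construction (λ = 0 defective, one Jordan chain of length m = 2; not one of the paper's test problems), and the iteration is written in the closed form of (4.12):

```python
import numpy as np

# Accelerated inverse iteration (4.12), sketched on a toy 2x2 quadratic
# eigenproblem (our own construction): lam = 0 is defective with a single
# Jordan chain of length m = 2, so the extrapolation uses m = 2.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.3, 0.1], [0.2, -0.4]])
T = lambda lam: A - lam * np.eye(2) + lam**2 * B
Tp = lambda lam: -np.eye(2) + 2.0 * lam * B
u = np.array([1.0, 0.0])
m = 2

def step(x, mu):
    """One Newton/inverse-iteration step in the form used in (4.12)."""
    p = np.linalg.solve(T(mu), Tp(mu) @ x)
    alpha = u @ p
    return p / alpha, mu - 1.0 / alpha

x, mu = np.array([1.0, 0.1]), 0.1
residuals = [np.linalg.norm(T(mu) @ x)]
for _ in range(8):
    if residuals[-1] < 1e-12:          # stop before T(mu) becomes singular
        break
    w, nu = step(x, mu)                # half step
    z, xi = step(w, nu)                # second Newton step
    x = m * z - (m - 1) * w            # extrapolated full step
    mu = m * xi - (m - 1) * nu
    residuals.append(np.linalg.norm(T(mu) @ x))
```

On this example the residual drops far faster than the linear rate (m − 1)/m of plain inverse iteration, consistent with the quadratic convergence of Theorem 4.9.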
4.5.2. Accelerated single-vector JD. One could also design other accelerated Newton-like methods from the accelerated inverse iteration (4.12). In this section, we propose an accelerated single-vector JD, which is derived from a minor modification of (4.12). The only difference between the two algorithms lies in the computation of new eigenvalue approximations. In contrast to Newton's method and RFI, the JD methods compute normalized new eigenvector approximations directly, without forming the unnormalized version; as a result, the way the accelerated Newton's method updates eigenvalue approximations cannot be realized by JD; see Steps 3 and 6 in (4.12). To work around this difficulty, we use the Rayleigh functional value as the new eigenvalue approximation, and we obtain the following new algorithm:
1. Choose a vector yk, e.g., yk = T′(µk)xk; define Π^{(1)}_k = I − T′(µk)xk yk∗ / (yk∗T′(µk)xk) and Π^{(2)}_k = I − xk u∗ / (u∗xk), and solve the correction equation
Π^{(1)}_k T(µk) Π^{(2)}_k Δxk = −(T(µk) − [yk∗T(µk)xk / (yk∗T′(µk)xk)] T′(µk)) xk for Δxk ⊥ u;
2. wk = xk + Δxk;
3. Choose a vector zk1, e.g., zk1 = T′(µk)wk, and compute the RF value νk = ρF(wk; T, zk1);
4. Define Π^{(1)}_k = I − T′(νk)wk zk1∗ / (zk1∗T′(νk)wk) and Π^{(2)}_k = I − wk u∗ / (u∗wk), and solve the correction equation Π^{(1)}_k T(νk) Π^{(2)}_k Δwk = −T(νk)wk for Δwk ⊥ u;
5. xk+1 = −(m − 1)wk + m(wk + Δwk) = wk + mΔwk;
6. Choose a vector zk2, e.g., zk2 = T′(νk)xk+1, and compute the RF value µk+1 = ρF(xk+1; T, zk2).
(4.20)
Following the standard derivation of the exact solution of JD correction equations, we can show that Δxk = T−1(µk)T′(µk)xk / (u∗T−1(µk)T′(µk)xk) − xk at Step 1 of (4.20), and thus wk = T−1(µk)T′(µk)xk / (u∗T−1(µk)T′(µk)xk) at Step 2; similarly, Δwk = T−1(νk)T′(νk)wk / (u∗T−1(νk)T′(νk)wk) − wk at Step 4, and xk+1 = m T−1(νk)T′(νk)wk / (u∗T−1(νk)T′(νk)wk) − (m − 1)wk at Step 5. Assuming x0 is normalized such that u∗x0 = 1, all subsequent half-step and full-step eigenvector approximations satisfy u∗wk = u∗xk+1 = 1. For the purpose of numerical stability, however, we recommend that these conditions be explicitly enforced at Steps 2 and 5.
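Using these closed-form solutions (so no projected correction equations need to be solved explicitly), the accelerated JD (4.20) can be sketched as follows. The toy 2×2 quadratic eigenproblem is our own construction (λ = 0 defective, one Jordan chain of length m = 2), and the Rayleigh functional value ρF(w; T, z) is computed here by a scalar Newton iteration on z∗T(ρ)w = 0 — an illustrative choice, not the paper's actual implementation:

```python
import numpy as np

# Accelerated single-vector JD (4.20) via the closed-form solutions of the
# correction equations, on a toy 2x2 quadratic eigenproblem (our own
# construction; lam = 0 defective, one Jordan chain of length m = 2).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.3, 0.1], [0.2, -0.4]])
T = lambda lam: A - lam * np.eye(2) + lam**2 * B
Tp = lambda lam: -np.eye(2) + 2.0 * lam * B
u = np.array([1.0, 0.0])
m = 2

def rf(wv, rho):
    """Rayleigh functional value: scalar Newton on f(rho) = z* T(rho) wv."""
    z = Tp(rho) @ wv                       # z fixed, as in Steps 3 and 6
    for _ in range(30):
        f, fp = z @ (T(rho) @ wv), z @ (Tp(rho) @ wv)
        if abs(f) < 1e-14:
            break
        rho -= f / fp
    return rho

def normalized_solve(y, lam):
    # exact solution of the projected correction equation, added back to y
    p = np.linalg.solve(T(lam), Tp(lam) @ y)
    return p / (u @ p)

x, mu = np.array([1.0, 0.1]), 0.1
residuals = [np.linalg.norm(T(mu) @ x)]
for _ in range(8):
    if residuals[-1] < 1e-12:
        break
    w = normalized_solve(x, mu)                     # Steps 1-2
    nu = rf(w, mu)                                  # Step 3
    x = m * normalized_solve(w, nu) - (m - 1) * w   # Steps 4-5
    mu = rf(x, nu)                                  # Step 6
    residuals.append(np.linalg.norm(T(mu) @ x))
```

On this example the residual decay matches that of the accelerated inverse iteration, consistent with the quadratic convergence expected of (4.20).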
We see that the only difference between the accelerated inverse iteration (4.12) and the accelerated single-vector JD (4.20) is the way new eigenvalue approximations are computed. Consequently, algorithms (4.20) and (4.11) are not mathematically equivalent. Nevertheless, given the close similarity between the two methods, it is natural to expect the accelerated JD to exhibit quadratic convergence for defective eigenpairs. This convergence rate is illustrated by numerical experiments.
4.6. Numerical experiments for the accelerated algorithms. We present numerical results for the accelerated inverse iteration and the accelerated JD on the problems with a single Jordan chain, namely, time delay and jordan3. We run the accelerated algorithms with an initial eigenpair approximation and find that both converge superlinearly. To illustrate the order of convergence descriptively, we follow the approach used in Section 3.2: a sequence of initial approximations (µ0^{(j)}, x0^{(j)}) is generated, then one step of the accelerated algorithms is applied to obtain a sequence of new approximations (µ1^{(j)}, x1^{(j)}), and an estimate of the order of convergence ℓ is obtained by analyzing (log e0^{(j)}, log e1^{(j)}). The results presented in Table 4.4 show that both algorithms converge quadratically for the test problems.
                             accelerated inverse iter.         accelerated JD
problem       ∠(x0^{(1)}, v)   # init. approx.   est. ℓ     # init. approx.   est. ℓ
time delay    10^{−3}          18                2.011      11                2.032
jordan3       2.5 × 10^{−3}    9                 2.028      9                 2.058

Table 4.4: Estimated order of convergence of the accelerated algorithms for a defective λ (geoT(λ) = 1).
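The one-step order-estimation procedure described above can be sketched as follows. The toy 2×2 quadratic eigenproblem is our own construction (λ = 0 defective, one Jordan chain of length m = 2) and stands in for the paper's test problems: a sequence of initial errors e0^{(j)} is generated, one accelerated step is applied, and the slope of log e1 against log e0 estimates ℓ.

```python
import numpy as np

# One-step order estimation in the spirit of Section 3.2, on a toy 2x2
# quadratic eigenproblem (our own construction; lam = 0 defective, single
# Jordan chain of length m = 2).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.3, 0.1], [0.2, -0.4]])
T = lambda lam: A - lam * np.eye(2) + lam**2 * B
Tp = lambda lam: -np.eye(2) + 2.0 * lam * B
u = np.array([1.0, 0.0])
m = 2

def err(x, mu):
    # combined error: |mu - lam| and the sine of the angle between x and e1
    t = x[1] / x[0]
    return np.hypot(abs(mu), abs(t) / np.hypot(1.0, t))

def accel_step(x, mu):
    def step(y, lam):
        p = np.linalg.solve(T(lam), Tp(lam) @ y)
        a = u @ p
        return p / a, lam - 1.0 / a
    w, nu = step(x, mu)
    z, xi = step(w, nu)
    return m * z - (m - 1) * w, m * xi - (m - 1) * nu

e0, e1 = [], []
for j in range(5):
    s = 0.1 * 0.5**j                      # sequence of initial error sizes
    x1, mu1 = accel_step(np.array([1.0, s]), s)
    e0.append(err(np.array([1.0, s]), s))
    e1.append(err(x1, mu1))
slope = np.polyfit(np.log(e0), np.log(e1), 1)[0]   # estimated order ell
```

For a method of order ℓ, e1 ≈ C e0^ℓ, so the fitted slope is the estimate of ℓ; on this example it is consistent with quadratic convergence.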
4.7. Defective eigenvalues with multiple Jordan chains. We studied in Sections 4.3 and 4.5 the convergence of several Newton-like methods for a defective eigenvalue λ with a single Jordan chain (J ≡ geoT(λ) = 1). To make our discussion more complete, we consider in this section defective eigenvalues with multiple (J ≥ 2) Jordan chains. For this type of eigenvalue with certain simple spectral structures, we could develop a detailed convergence analysis as we did for those with a single Jordan chain. Due to space limitations, however, we do not pursue such a thorough investigation in this paper; instead, we provide some heuristic insight into the convergence rates the Newton-like methods are most likely to exhibit. Our discussion is directly based on the results developed for a defective λ with J = 1, and we illustrate it later by numerical experiments.
Let λ be a defective eigenvalue of the holomorphic T(·) with J = geoT(λ) ≥ 2, and let {{φ1,0, . . . , φ1,m1−1}, {φ2,0, . . . , φ2,m2−1}, . . . , {φJ,0, . . . , φJ,mJ−1}} be the corresponding Jordan chains. Assume without loss of generality that m1 ≤ m2 ≤ . . . ≤ mJ. We consider several types of spectral structure as follows.
Case 1: m1 ≥ 2. One may call this type of eigenvalue "purely defective," since it can be considered as a combination of several defective eigenpairs sharing the same eigenvalue, each of which has a single Jordan chain of length mi ≥ 2. In this case, we have shown in Section 4.1 that the standard inverse iteration converges linearly. In addition, if each Jordan chain satisfies the assumption of Theorem 4.3, it is natural to expect the local convergence of RFI and single-vector JD to be linear in general.
To achieve quadratic convergence, we may use the two accelerated algorithms
(4.12) and (4.20), with m = mJ used for the linear combination that constructs the full-step iterates. The motivation for choosing this value of m is as follows. As the Newton-like methods proceed, the eigenvector approximation xk tends to converge towards the eigenspace spanned by the eigenvectors associated with the longest Jordan chains. In fact, assume without loss of generality that J = 2 and 2 ≤ m1 < m2. Let (µ, x) be the current eigenpair approximation. The new eigenvector approximation is
(4.21)  p = T−1(µ)T′(µ)x = ∑_{k=1}^{2} ∑_{j=0}^{mk−1} ∑_{i=0}^{mk−j−1} [⟨T′(µ)x, ψk,j⟩ / (µ − λ)^{mk−j−i}] φk,i + G(µ)T′(µ)x
= ∑_{i=0}^{m1−1} [⟨T′(µ)x, ψ1,0⟩ / (µ − λ)^{m1−i}] φ1,i + ∑_{j=1}^{m1−1} ∑_{i=0}^{m1−j−1} [⟨T′(µ)x, ψ1,j⟩ / (µ − λ)^{m1−j−i}] φ1,i
+ ∑_{i=0}^{m2−1} [⟨T′(µ)x, ψ2,0⟩ / (µ − λ)^{m2−i}] φ2,i + ∑_{j=1}^{m2−1} ∑_{i=0}^{m2−j−1} [⟨T′(µ)x, ψ2,j⟩ / (µ − λ)^{m2−j−i}] φ2,i + G(µ)T′(µ)x.
Assume that ⟨T″(λ)φ1,0, ψ2,0⟩ ≠ 0 and ⟨T″(λ)φ2,0, ψ2,0⟩ ≠ 0, or that ⟨T′(λ)φ1,0, ψ2,1⟩ ≠ 0 and ⟨T′(λ)φ2,0, ψ2,1⟩ ≠ 0. Following the proof of Lemma 4.5, we can show that the dominant term in (4.21) can be written as O((µ − λ)^{−(m2−1)}) φ2,0. In other words, the new eigenvector approximation contains little component of φ1,0, and the Newton-like methods behave as if only one Jordan chain {φ2,0, . . . , φ2,m2−1} were involved. Thus we expect the accelerated algorithms with m = m2 to converge quadratically.
Case 2: m1 = 1 and mJ ≥ 3. In this case, it is not difficult to follow the idea given in Section 4.1 to show that inverse iteration converges linearly. In addition, for any practical Newton-like method where the new eigenvector approximation is p = T−1(µ)T′(µ)x, one can show that p is dominated by a term of the form O((µ − λ)^{−(mJ−1)}) φJ,0, and that p contains another term of the form O((µ − λ)^{−1}) φ1,0. Assume again without loss of generality that J = 2. Since m2 ≥ 3, the dominant term in p is much larger in magnitude than the φ1,0 term. As a result, the Jordan chain of length 1 has minimal impact on the algorithm, and Newton-like methods should behave the same way as they do for Case 1.
Case 3: m1 = 1 and mJ = 2. The linear convergence of inverse iteration can be established as in Section 4.1. For other Newton-like methods, this case seems more complicated than the previous two, because it is not clear whether these algorithms converge to the eigenspace associated with the short (mi = 1) or the long (mi = 2) Jordan chains. In fact, the new eigenvector approximation p = T−1(µ)T′(µ)x contains both O((µ − λ)^{−1}) φ1,0 and O((µ − λ)^{−1}) φJ,0, and p is not necessarily dominated by either of the two components. We do not have a complete understanding of the convergence of Newton-like methods for this case, but we discuss two numerical examples in the next section.
4.8. Numerical experiments for defective eigenvalues with multiple Jordan chains. In this section, we provide numerical evidence to illustrate our heuristic analysis of the convergence of Newton-like methods for defective eigenvalues with multiple Jordan chains. Four problems, namely, df art m1m2, df art m1m3, df art m2m3, and mirror (see Section 3.2), are used to test the convergence rates. As discussed, we first run the algorithms with an initial eigenpair (µ0, x0) and observe how quickly ‖T(µk)xk‖ decreases, to determine whether the convergence is linear or at least superlinear. For algorithms that appear to converge linearly and superlinearly, respectively, we follow the standard approach (see Section 4.2 for defective eigenvalues) and the more descriptive approach (see Section 3.2 for semi-simple eigenvalues) to estimate the order of convergence.
                          inverse iteration        RFI/JD              TSRFI
problem        ∠(x0, v)     # iters   est. ℓ    # iters   est. ℓ    # iters   est. ℓ
df art m1m3    10^{−2}      14        0.978     14        0.957     10        0.965†
df art m2m3    10^{−2}      15        0.969     14        0.974     11        1.013†
df art m1m2    5 × 10^{−2}  17        1.000     12        0.984     8         0.973†
mirror         10^{−2}      15        1.002     3         ≈ 2‡      2         ≈ 3‡

Table 4.5: Estimated order of convergence of the non-accelerated algorithms for a defective λ (geoT(λ) ≥ 2).
The results for inverse iteration, RFI/JD, and TSRFI are presented in Table 4.5. We see that inverse iteration converges linearly for all problems, and RFI and TSRFI also converge linearly for the three artificially constructed problems. The problem mirror is the only one with defective eigenvalues in our tests for which RFI and TSRFI exhibit the same order of convergence as for simple and semi-simple eigenvalues. We do not have a complete understanding of this observation, but we find that the desired eigenvalue λ = 0 of this problem has a special spectral structure. Namely, it has two Jordan chains of length 2 and seven Jordan chains of length 1; for the two longest Jordan chains {φ1,0, φ1,1} and {φ2,0, φ2,1}, it holds that span{φ1,0, φ2,0} = span{φ1,1, φ2,1}; that is, the generalized eigenvectors and the eigenvectors of these Jordan chains span the same space. We speculate that this special spectral structure plays a critical role in the high order of convergence.
                             accelerated inverse iter.         accelerated JD
problem       ∠(x0^{(1)}, v)   # init. approx.   est. ℓ     # init. approx.   est. ℓ
df art m1m3   2.5 × 10^{−2}    11                1.998      10                2.010
df art m2m3   5 × 10^{−3}      8                 2.015      8                 2.014

problem       ∠(x0, v)         # iters.          est. ℓ     # iters.          est. ℓ
df art m1m2   10^{−1}          9                 1.054      3                 ≈ 2
mirror        10^{−2}          3                 ≈ 2        3                 ≈ 2

Table 4.6: Estimated order of convergence of the accelerated algorithms for a defective λ (geoT(λ) ≥ 2).
Table 4.6 summarizes the estimated order of convergence of the accelerated algorithms. For both algorithms, the parameter m is set to the length of the longest Jordan chains. We see that the accelerated algorithms generally exhibit quadratic convergence for defective eigenvalues with multiple Jordan chains, with the only exception being that the accelerated inverse iteration converges linearly for df art m1m2. This problem belongs to Case 3 discussed in Section 4.7, for which our understanding of the convergence of Newton-like methods is not complete. Nevertheless, Tables 4.4 and 4.6 show that the accelerated JD converges quadratically for all test problems.
In fact, for some reason, our approach used in Section 3.2 does not give a descriptive estimate of the true order of convergence of the accelerated algorithms for df art m1m2 and mirror. Note that mirror also belongs to Case 3 discussed in Section 4.7. To estimate the true order of convergence, we ran the algorithms with
†The residual norms in the first two iterations are not used for the estimate of the order of convergence for TSRFI, because they decrease much less rapidly than in subsequent iterations.
‡For the problem mirror, one can see by the standard criterion ‖ek+1‖/‖ek‖^ℓ ≤ C that RFI and TSRFI, respectively, exhibit quadratic and cubic convergence; our approach used in Section 3.2 somehow does not generate a descriptive estimate of the order of convergence for this problem.
(µ0, x0) and then used the standard criterion ‖ek+1‖/‖ek‖^ℓ ≤ C. For these two problems, we speculate that the eigenpair approximations generated by the accelerated algorithms have a certain special structure for which our approach used in Section 3.2 fails to detect the actual rate at which ‖T(µk)xk‖ decreases§.
5. Conclusion. The local convergence of single-vector Newton-like methods for degenerate eigenvalues has not been fully understood, since the standard convergence theory of Newton's method, based on the nonsingularity of the Jacobian, is not applicable in this setting. In this paper, we studied the convergence of several of the most widely used single-vector Newton-like methods for the solution of a degenerate eigenvalue and a corresponding eigenvector of general nonlinear algebraic eigenvalue problems of the form T(λ)v = 0. Our major conclusion is that at least quadratic convergence can be achieved by these algorithms for both semi-simple and defective eigenvalues.
Specifically, we showed that Newton-like methods exhibit the same order of con-vergence for semi-simple eigenvalues as they do for simple eigenvalues. The conver-gence is generally quadratic; in addition, RFI/JD with appropriate use of the Rayleighfunctional can achieve cubic convergence for problems with local symmetry.
The convergence analysis for defective eigenvalues is more complicated. We showed the linear convergence of inverse iteration and RFI/JD, and we proposed two accelerated algorithms which converge quadratically for a defective λ with a single Jordan chain. We also gave some heuristic discussion of the convergence for a defective λ with multiple Jordan chains. Our analyses are illustrated by numerical experiments.
REFERENCES
[1] J. Asakura, T. Sakurai, H. Tadano, T. Ikegami and K. Kimura, A numerical method for polynomial eigenvalue problems using contour integral, Japan Journal of Industrial and Applied Mathematics, Vol. 27 (2010), pp. 73–90.
[2] T. Betcke, N. J. Higham, V. Mehrmann, C. Schröder and F. Tisseur, NLEVP: a collection of nonlinear eigenvalue problems, MIMS EPrint 2010.98, Manchester Institute for Mathematical Sciences, University of Manchester, 2010.
[3] T. Betcke and D. Kressner, Perturbation, extraction and refinement of invariant pairs for matrix polynomials, Linear Algebra and its Applications, Vol. 435 (2011), pp. 514–536.
[4] T. Betcke and H. Voss, A Jacobi-Davidson type projection method for nonlinear eigenvalue problems, Future Generation Computer Systems, Vol. 20 (2004), pp. 363–372.
[5] W.-J. Beyn, An integral method for solving nonlinear eigenvalue problems, Linear Algebra and its Applications, Vol. 436 (2012), pp. 3839–3863.
[6] D. W. Decker and C. T. Kelley, Newton's method at singular points I, SIAM Journal on Numerical Analysis, Vol. 17 (1980), pp. 66–70.
[7] D. W. Decker and C. T. Kelley, Newton's method at singular points II, SIAM Journal on Numerical Analysis, Vol. 17 (1980), pp. 465–471.
[8] D. W. Decker, H. B. Keller and C. T. Kelley, Convergence rates for Newton's method at singular points, SIAM Journal on Numerical Analysis, Vol. 20 (1983), pp. 296–314.
[9] N. Hale, N. J. Higham and L. N. Trefethen, Computing A^α, log(A), and related matrix functions by contour integrals, SIAM Journal on Numerical Analysis, Vol. 46 (2008), pp. 2505–2523.
[10] I. C. F. Ipsen, Computing an eigenvector with inverse iteration, SIAM Review, Vol. 39 (1997), pp. 254–291.
†A similar example: our approach used in Section 3.2, when applied to inverse iteration for computing defective eigenvalues with a single Jordan chain, gives an estimated mth order convergence, which does not reflect the actual linear convergence; this is due to the special structure of the new iterate generated by inverse iteration; see Lemma 4.5.
[11] E. Jarlebring and W. Michiels, Analyzing the convergence factor of residual inverse iteration, BIT Numerical Mathematics, Vol. 51 (2011), pp. 937–957.
[12] V. Kozlov and V. Maz'ya, Differential equations with operator coefficients, Springer, Berlin-Heidelberg, 1999.
[13] D. Kressner, A block Newton method for nonlinear eigenvalue problems, Numerische Mathematik, Vol. 114 (2009), pp. 355–372.
[14] D. S. Mackey, N. Mackey, C. Mehl and V. Mehrmann, Vector spaces of linearizations for matrix polynomials, SIAM Journal on Matrix Analysis and Applications, Vol. 28 (2006), pp. 971–1004.
[15] D. S. Mackey, N. Mackey, C. Mehl and V. Mehrmann, Structured polynomial eigenvalue problems: good vibrations from good linearizations, SIAM Journal on Matrix Analysis and Applications, Vol. 28 (2006), pp. 1029–1051.
[16] The Matrix Market, http://math.nist.gov/MatrixMarket/, NIST, 2007.
[17] V. Mehrmann and C. Schröder, Nonlinear eigenvalue and frequency response problems in industrial practice, Journal of Mathematics in Industry, Vol. 1 (2011), article No. 7.
[18] V. Mehrmann and H. Voss, Nonlinear eigenvalue problems: a challenge for modern eigenvalue methods, Mitteilungen der Gesellschaft für Angewandte Mathematik und Mechanik, Vol. 27 (2005), pp. 121–151.
[19] J. Moro, J. V. Burke and M. L. Overton, On the Lidskii–Vishik–Lyusternik perturbation theory for eigenvalues of matrices with arbitrary Jordan structure, SIAM Journal on Matrix Analysis and Applications, Vol. 18 (1997), pp. 793–817.
[20] J. Moro and F. M. Dopico, First order eigenvalue perturbation theory and the Newton diagram, in Applied Mathematics and Scientific Computing, Z. Drmac, V. Hari, L. Sopta, Z. Tutek and K. Veselic, eds., Kluwer Academic Publishers, 2003, pp. 143–175.
[21] A. Neumaier, Residual inverse iteration for the nonlinear eigenvalue problem, SIAM Journal on Numerical Analysis, Vol. 22 (1985), pp. 914–923.
[22] M. R. Osborne, Inverse iteration, Newton's method, and non-linear eigenvalue problems, The Contributions of Dr. J. H. Wilkinson to Numerical Analysis, Symposium Proceedings Series, 19, pp. 21–53, The Institute of Mathematics and its Applications, London, 1979.
[23] G. Peters and J. H. Wilkinson, Inverse iteration, ill-conditioned equations and Newton's method, SIAM Review, Vol. 21 (1979), pp. 339–360.
[24] A. Ruhe, Algorithms for the nonlinear eigenvalue problem, SIAM Journal on Numerical Analysis, Vol. 10 (1973), pp. 674–689.
[25] K. Schreiber, Nonlinear eigenvalue problems: Newton-type methods and nonlinear Rayleigh functionals, Ph.D. thesis, Department of Mathematics, TU Berlin, 2008.
[26] H. Schwetlick and K. Schreiber, Nonlinear Rayleigh functionals, Linear Algebra and its Applications, Vol. 436 (2012), pp. 3991–4016.
[27] A. Spence and C. Poulton, Photonic band structure calculations using nonlinear eigenvalue techniques, Journal of Computational Physics, Vol. 204 (2005), pp. 65–81.
[28] Y. Su and Z. Bai, Solving rational eigenvalue problems via linearization, SIAM Journal on Matrix Analysis and Applications, Vol. 32 (2011), pp. 201–216.
[29] D. B. Szyld and F. Xue, Local convergence analysis of several inexact Newton-type algorithms for general nonlinear eigenvalue problems, to appear in Numerische Mathematik.
[30] D. B. Szyld and F. Xue, Several properties of invariant pairs of nonlinear algebraic eigenvalue problems, Temple Research Report 12-02-09, February 2012.
[31] F. Tisseur and K. Meerbergen, The quadratic eigenvalue problem, SIAM Review, Vol. 43 (2001), pp. 234–286.