Applied Mathematics and Computation 216 (2010) 514–522

A generalization of the Moore–Penrose inverse related to matrix subspaces of $\mathbb{C}^{n\times m}$

Antonio Suárez, Luis González

Department of Mathematics, University of Las Palmas de Gran Canaria, 35017 Las Palmas de Gran Canaria, Spain

Keywords: Moore–Penrose inverse; Frobenius norm; S-Moore–Penrose inverse; Approximate inverse preconditioning; Constrained least squares problem


Abstract

A natural generalization of the classical Moore–Penrose inverse is presented. The so-called S-Moore–Penrose inverse of an $m\times n$ complex matrix $A$, denoted by $A^\dagger_S$, is defined for any linear subspace $S$ of the matrix vector space $\mathbb{C}^{n\times m}$. The S-Moore–Penrose inverse $A^\dagger_S$ is characterized using either the singular value decomposition or (for the full rank case) the orthogonal complements with respect to the Frobenius inner product. These results are applied to the preconditioning of linear systems based on Frobenius norm minimization and to the linearly constrained linear least squares problem.


1. Introduction

Let $\mathbb{C}^{m\times n}$ ($\mathbb{R}^{m\times n}$) denote the set of all $m\times n$ complex (real) matrices. Throughout this paper, the notations $A^T$, $A^*$, $A^{-1}$, $r(A)$ and $\operatorname{tr}(A)$ stand for the transpose, conjugate transpose, inverse, rank and trace of the matrix $A$, respectively. The general reciprocal was described by Moore in 1920 [1] and independently rediscovered by Penrose in 1955 [2]; it is nowadays called the Moore–Penrose inverse. The Moore–Penrose inverse of a matrix $A\in\mathbb{C}^{m\times n}$ (sometimes referred to as the pseudoinverse of $A$) is the unique matrix $A^\dagger\in\mathbb{C}^{n\times m}$ satisfying the four Penrose equations

\[ AA^\dagger A = A,\qquad A^\dagger AA^\dagger = A^\dagger,\qquad (AA^\dagger)^* = AA^\dagger,\qquad (A^\dagger A)^* = A^\dagger A. \tag{1.1} \]

For more details on the Moore–Penrose inverse, see [3]. The pseudoinverse has been widely studied during the last decades from both theoretical and computational points of view. Some of the most recent works can be found, e.g., in [4–10] and in the references contained therein.

As is well known, the Moore–Penrose inverse $A^\dagger$ can be alternatively defined as the unique matrix that gives the minimum Frobenius norm among all solutions to either of the matrix minimization problems

\[ \min_{M\in\mathbb{C}^{n\times m}} \|MA - I_n\|_F, \tag{1.2} \]
\[ \min_{M\in\mathbb{C}^{n\times m}} \|AM - I_m\|_F, \tag{1.3} \]

where $I_n$ denotes the identity matrix of order $n$ and $\|\cdot\|_F$ stands for the matrix Frobenius norm.

Another way to characterize the pseudoinverse is via the singular value decomposition (SVD). If $A = U\Sigma V^*$ is an SVD of $A$, then

\[ A^\dagger = V\Sigma^\dagger U^*. \tag{1.4} \]
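As a quick numerical illustration of the SVD characterization (1.4) and the Penrose equations (1.1), the following minimal sketch (our own, not part of the original development; the matrix `A` is an arbitrary example) computes the pseudoinverse from a reduced SVD and checks all four equations:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))      # arbitrary example, full rank a.s.

# Pseudoinverse via (1.4), using the reduced SVD (nonzero singular values)
U, s, Vh = np.linalg.svd(A, full_matrices=False)
A_pinv = Vh.conj().T @ np.diag(1.0 / s) @ U.conj().T

# The four Penrose equations (1.1)
assert np.allclose(A @ A_pinv @ A, A)
assert np.allclose(A_pinv @ A @ A_pinv, A_pinv)
assert np.allclose((A @ A_pinv).conj().T, A @ A_pinv)
assert np.allclose((A_pinv @ A).conj().T, A_pinv @ A)
assert np.allclose(A_pinv, np.linalg.pinv(A))
```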


Also, the pseudoinverse of $A$ can be defined via a limiting process as [11]

\[ A^\dagger = \lim_{\delta\to 0} A^*(AA^* + \delta I_m)^{-1} = \lim_{\delta\to 0} (A^*A + \delta I_n)^{-1}A^* \]

and by the explicit algebraic expression [3]

\[ A^\dagger = C^*(CC^*)^{-1}(B^*B)^{-1}B^* \]

based on the full rank factorization

\[ A = BC,\qquad B\in\mathbb{C}^{m\times r},\quad C\in\mathbb{C}^{r\times n},\quad r(A) = r(B) = r(C) = r. \]

In particular, if $A\in\mathbb{C}^{m\times n}$ has full row rank (full column rank, respectively) then $A^\dagger$ is simply the unique solution to problem (1.2) (to problem (1.3), respectively), given by the respective explicit expressions [3]

\[ A^\dagger = \begin{cases} \text{(i) } A^*(AA^*)^{-1} & \text{iff } r(A) = m,\\ \text{(ii) } (A^*A)^{-1}A^* & \text{iff } r(A) = n, \end{cases} \tag{1.5} \]

so that $A^\dagger$ is a right inverse (left inverse, respectively) of $A$ if $r(A) = m$ ($r(A) = n$, respectively).

A natural generalization of the Moore–Penrose inverse, the so-called left or right S-Moore–Penrose inverse, arises simply by taking in Eq. (1.2) or (1.3), respectively, the minimum over an arbitrary matrix subspace $S$ of $\mathbb{C}^{n\times m}$, instead of over the whole matrix space $\mathbb{C}^{n\times m}$. This is the starting point of this paper, which is organized as follows.

In Section 2, we define the S-Moore–Penrose inverse and provide an explicit expression based on the SVD of the matrix $A$ (which generalizes Eq. (1.4) for $A^\dagger$), as well as an alternative expression for the full rank case, in terms of the orthogonal complements with respect to the Frobenius inner product (which generalizes Eq. (1.5) for $A^\dagger$). Next, in Section 3 we apply these results to the preconditioning of linear systems based on Frobenius norm minimization and to the linear least squares problem subject to linear restrictions. Finally, in Section 4 we present our conclusions.
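Before proceeding, the closed forms in (1.5) are easy to check numerically. The following sketch (our own illustration, on randomly generated matrices that have full rank almost surely) verifies that they coincide with the pseudoinverse and act as one-sided inverses:

```python
import numpy as np

rng = np.random.default_rng(1)

# (1.5)(i): full row rank, A+ = A*(AA*)^{-1} is a right inverse of A
A = rng.standard_normal((3, 5))               # r(A) = 3 = m a.s.
A_pinv = A.T @ np.linalg.inv(A @ A.T)
assert np.allclose(A @ A_pinv, np.eye(3))
assert np.allclose(A_pinv, np.linalg.pinv(A))

# (1.5)(ii): full column rank, A+ = (A*A)^{-1}A* is a left inverse of A
B = rng.standard_normal((5, 3))               # r(B) = 3 = n a.s.
B_pinv = np.linalg.inv(B.T @ B) @ B.T
assert np.allclose(B_pinv @ B, np.eye(3))
assert np.allclose(B_pinv, np.linalg.pinv(B))
```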

2. The S-Moore–Penrose inverse

Eqs. (1.2) and (1.3), regarding the least squares characterization of the Moore–Penrose inverse, can be generalized as follows.

Definition 2.1. Let $A\in\mathbb{C}^{m\times n}$ and let $S$ be a linear subspace of $\mathbb{C}^{n\times m}$. Then

(i) The left Moore–Penrose inverse of $A$ with respect to $S$ or, for short, the left S-Moore–Penrose inverse of $A$, denoted by $A^\dagger_{S,l}$, is the minimum Frobenius norm solution to the matrix minimization problem

\[ \min_{M\in S} \|MA - I_n\|_F. \tag{2.1} \]

(ii) The right Moore–Penrose inverse of $A$ with respect to $S$ or, for short, the right S-Moore–Penrose inverse of $A$, denoted by $A^\dagger_{S,r}$, is the minimum Frobenius norm solution to the matrix minimization problem

\[ \min_{M\in S} \|AM - I_m\|_F. \tag{2.2} \]

Remark 2.1. Note that for $S = \mathbb{C}^{n\times m}$, the left and right S-Moore–Penrose inverses become the Moore–Penrose inverse. That is, the left and right S-Moore–Penrose inverses generalize the definition, via the matrix minimization problems (1.2) and (1.3), respectively, of the classical pseudoinverse. Moreover, when $A\in\mathbb{C}^{n\times n}$ is nonsingular and $A^{-1}\in S$, the left and right S-Moore–Penrose inverses are just the inverse. Briefly:

\[ A^\dagger_{\mathbb{C}^{n\times m},l} = A^\dagger = A^\dagger_{\mathbb{C}^{n\times m},r} \quad\text{and}\quad A^\dagger_{S,l} = A^{-1} = A^\dagger_{S,r} \ \text{ for all } S \text{ s.t. } A^{-1}\in S. \]

Remark 2.2. In particular, if the matrix $A$ has full row rank (full column rank, respectively) then it has right inverses (left inverses, respectively). Thus, due to the uniqueness of the orthogonal projection of the identity matrix $I_n$ ($I_m$, respectively) onto the matrix subspace $SA\subseteq\mathbb{C}^{n\times n}$ ($AS\subseteq\mathbb{C}^{m\times m}$, respectively), we conclude that, for these special cases, Definition 2.1 simply affirms that $A^\dagger_{S,l}$ ($A^\dagger_{S,r}$, respectively) is just the unique solution to problem (2.1) (to problem (2.2), respectively). The following simple example illustrates this fact.

Example 2.1. For $n = m = 2$ and

\[ A = \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix},\qquad S = \operatorname{span}\left\{ \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \right\} = \left\{ \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} \right\}_{a\in\mathbb{C}}, \]

on one hand, we have


\[ \left\| A^\dagger_{S,l}A - I_2 \right\|_F^2 = \min_{M\in S} \|MA - I_2\|_F^2 = \min_{a\in\mathbb{C}} |a-1|^2 + |a|^2 + 1, \]

so that

\[ A^\dagger_{S,l} = \begin{pmatrix} \tfrac12 & 0 \\ 0 & 0 \end{pmatrix},\qquad \left\| A^\dagger_{S,l}A - I_2 \right\|_F = \sqrt{\tfrac{3}{2}} \]

and on the other hand, we have

\[ \left\| AA^\dagger_{S,r} - I_2 \right\|_F^2 = \min_{M\in S} \|AM - I_2\|_F^2 = \min_{a\in\mathbb{C}} |a-1|^2 + |2a|^2 + 1, \]

so that

\[ A^\dagger_{S,r} = \begin{pmatrix} \tfrac15 & 0 \\ 0 & 0 \end{pmatrix},\qquad \left\| AA^\dagger_{S,r} - I_2 \right\|_F = \sqrt{\tfrac{9}{5}}. \]

Remark 2.3. Example 2.1 illustrates three basic differences between the Moore–Penrose and the S-Moore–Penrose inverses. First, in general $A^\dagger_{S,l}\neq A^\dagger_{S,r}$. Second, in general neither $A^\dagger_{S,l}$ nor $A^\dagger_{S,r}$ satisfies any of the Penrose equations given in (1.1). Finally, although the matrix $A$ has full rank, $A^\dagger_{S,l}$ is not a left inverse of $A$, nor is $A^\dagger_{S,r}$ a right inverse.

Throughout this paper we address only the case of the right S-Moore–Penrose inverse $A^\dagger_{S,r}$, but analogous results can be obtained for the left S-Moore–Penrose inverse $A^\dagger_{S,l}$.
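Since the paper gives no algorithmic recipe at this point, here is a minimal computational sketch of Definition 2.1(ii), under the assumption that a basis of $S$ is available: vectorizing turns problem (2.2) into an ordinary least squares problem in the basis coefficients, and orthonormalizing the basis with respect to the Frobenius inner product makes the minimum-norm coefficient vector returned by `lstsq` correspond to the minimum Frobenius norm matrix. The helper name `s_mp_right` is ours; the script reproduces $A^\dagger_{S,r}$ from Example 2.1.

```python
import numpy as np

def s_mp_right(A, basis):
    """Right S-Moore-Penrose inverse of A, for S = span(basis).

    Returns the minimum Frobenius norm solution of problem (2.2),
    min_{M in S} ||A M - I_m||_F, for real matrices and a linearly
    independent list `basis` of n x m matrices spanning S.
    """
    m, n = A.shape
    # Columns of B are vec(M_j); orthonormalizing them (w.r.t. the
    # Frobenius inner product) makes ||M||_F equal the Euclidean norm
    # of the coefficient vector, so the minimum-norm coefficients
    # returned by lstsq yield the minimum Frobenius norm matrix.
    B = np.column_stack([M.reshape(-1) for M in basis])
    Q, _ = np.linalg.qr(B)
    # Columns of AQ are vec(A Q_j) for the orthonormalized basis Q_j.
    AQ = np.column_stack([(A @ q.reshape(n, m)).reshape(-1) for q in Q.T])
    c, *_ = np.linalg.lstsq(AQ, np.eye(m).reshape(-1), rcond=None)
    return (Q @ c).reshape(n, m)

# Example 2.1: A = [[1, 1], [2, 0]], S = span{diag(1, 0)}
A = np.array([[1.0, 1.0], [2.0, 0.0]])
M1 = np.array([[1.0, 0.0], [0.0, 0.0]])
print(s_mp_right(A, [M1]))   # [[0.2, 0.0], [0.0, 0.0]], i.e. a = 1/5
```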

The next theorem provides us with an explicit expression for $A^\dagger_{S,r}$ from an SVD of the matrix $A$.

Theorem 2.1. Let $A\in\mathbb{C}^{m\times n}$ and let $S$ be a linear subspace of $\mathbb{C}^{n\times m}$. Let $A = U\Sigma V^*$ be an SVD of the matrix $A$. Then

\[ A^\dagger_{S,r} = V\,\Sigma^\dagger_{V^*SU,r}\,U^* \quad\text{and}\quad \left\| AA^\dagger_{S,r} - I_m \right\|_F = \left\| \Sigma\,\Sigma^\dagger_{V^*SU,r} - I_m \right\|_F. \tag{2.3} \]

Moreover,

\[ \left\| A^\dagger - A^\dagger_{S,r} \right\|_F = \left\| \Sigma^\dagger - \Sigma^\dagger_{V^*SU,r} \right\|_F. \tag{2.4} \]

Proof. For all $M\in S$, we have $N = V^*MU\in V^*SU$ and $M = VNU^*$. Using the invariance of the Frobenius norm under multiplication by unitary matrices, we get

\[ \|A\widetilde M - I_m\|_F = \min_{M\in S}\|AM - I_m\|_F = \min_{N\in V^*SU}\|U\Sigma V^*VNU^* - I_m\|_F = \min_{N\in V^*SU}\|U(\Sigma N - I_m)U^*\|_F = \min_{N\in V^*SU}\|\Sigma N - I_m\|_F = \|\Sigma\widetilde N - I_m\|_F, \tag{2.5} \]

that is, $\widetilde N\in V^*SU$ is a solution to the matrix minimization problem

\[ \min_{N\in V^*SU}\|\Sigma N - I_m\|_F \tag{2.6} \]

if and only if $\widetilde M = V\widetilde N U^*\in S$ is a solution to problem (2.2).

Now, let $\widetilde N$ be any solution to problem (2.6). Since $\Sigma^\dagger_{V^*SU,r}$ is the minimum Frobenius norm solution to problem (2.6), we have

\[ \left\|\Sigma^\dagger_{V^*SU,r}\right\|_F \le \|\widetilde N\|_F \]

and then, using again the fact that the Frobenius norm is unitarily invariant, we have

\[ \left\|V\Sigma^\dagger_{V^*SU,r}U^*\right\|_F \le \|V\widetilde N U^*\|_F \ \text{ for any solution } \widetilde N \text{ to problem (2.6), i.e.,} \]
\[ \left\|V\Sigma^\dagger_{V^*SU,r}U^*\right\|_F \le \|\widetilde M\|_F \ \text{ for any solution } \widetilde M \text{ to problem (2.2).} \]

This proves that the minimum Frobenius norm solution $A^\dagger_{S,r}$ to problem (2.2) is $V\Sigma^\dagger_{V^*SU,r}U^*$, and the right-hand equality of Eq. (2.3) immediately follows from Eq. (2.5).

Finally, Eq. (2.4) follows immediately using the well-known expression $A^\dagger = V\Sigma^\dagger U^*$ and the left-hand equality of Eq. (2.3), that is,

\[ \left\|A^\dagger - A^\dagger_{S,r}\right\|_F = \left\|V\Sigma^\dagger U^* - V\Sigma^\dagger_{V^*SU,r}U^*\right\|_F = \left\|\Sigma^\dagger - \Sigma^\dagger_{V^*SU,r}\right\|_F. \qquad\square \]

Remark 2.4. Formula (2.3) for the right S-Moore–Penrose inverse generalizes the well-known expression (1.4) for the Moore–Penrose inverse in terms of an SVD of $A$, since (see Remark 2.1)

\[ S = \mathbb{C}^{n\times m} \ \Rightarrow\ V^*SU = \mathbb{C}^{n\times m} \ \Rightarrow\ A^\dagger = A^\dagger_{S,r} = V\Sigma^\dagger_{V^*SU,r}U^* = V\Sigma^\dagger U^*. \]
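A hedged numerical check of the left-hand equality in (2.3), reusing the `s_mp_right` sketch above (the random matrix and basis are arbitrary test data of ours, not from the paper):

```python
import numpy as np
# Reuses s_mp_right from the sketch after Definition 2.1.

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))                           # m = 3, n = 4
basis = [rng.standard_normal((4, 3)) for _ in range(5)]   # spans S

U, s, Vh = np.linalg.svd(A)                               # full SVD
V = Vh.T
Sigma = np.zeros((3, 4))
Sigma[:3, :3] = np.diag(s)

# Left-hand equality of (2.3): A+_{S,r} = V Sigma+_{V*SU,r} U*
lhs = s_mp_right(A, basis)
rhs = V @ s_mp_right(Sigma, [V.T @ M @ U for M in basis]) @ U.T
assert np.allclose(lhs, rhs)
```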

For full column rank matrices, the next theorem provides an expression for $A^\dagger_{S,r}$ involving the orthogonal complement of the subspace $S$ (denoted, as usual, by $S^\perp$) with respect to the Frobenius inner product $\langle\cdot,\cdot\rangle_F$.

Theorem 2.2. Let $A\in\mathbb{C}^{m\times n}$ with $r(A) = n$ and let $S$ be a linear subspace of $\mathbb{C}^{n\times m}$. Then

\[ A^\dagger_{S,r} = (A^*A)^{-1}(A^* + Q),\quad\text{for some } Q\in S^\perp. \tag{2.7} \]

Proof. Let

\[ N = (A^*A)A^\dagger_{S,r}\in\mathbb{C}^{n\times m} \quad\text{and}\quad Q = N - A^*\in\mathbb{C}^{n\times m}. \]

Since $r(A^*A) = n$, we get

\[ A^\dagger_{S,r} = (A^*A)^{-1}N = (A^*A)^{-1}(A^* + Q),\quad\text{for some } Q\in\mathbb{C}^{n\times m} \tag{2.8} \]

and we only need to prove that $Q\in S^\perp$.

Using the fact that $AA^\dagger_{S,r}$ is the orthogonal projection of the identity matrix onto the subspace $AS$, we have

\[ \left\langle AA^\dagger_{S,r} - I_m,\, AM \right\rangle_F = 0 \ \text{ for all } M\in S, \text{ i.e.,} \]
\[ \operatorname{tr}\!\left( \left( AA^\dagger_{S,r} - I_m \right) M^*A^* \right) = 0 \ \text{ for all } M\in S, \text{ i.e.,} \]
\[ \operatorname{tr}\!\left( \left( A^*AA^\dagger_{S,r} - A^* \right) M^* \right) = 0 \ \text{ for all } M\in S, \]

but using Eq. (2.8), we have

\[ A^*AA^\dagger_{S,r} - A^* = (A^*A)(A^*A)^{-1}(A^* + Q) - A^* = Q \]

and then

\[ \operatorname{tr}(QM^*) = 0,\ \text{ i.e., }\ \langle Q, M\rangle_F = 0 \ \text{ for all } M\in S,\ \text{ i.e., }\ Q\in S^\perp. \qquad\square \]

Remark 2.5. Eq. (2.7) generalizes the well-known formula (1.5)(ii) for the Moore–Penrose inverse of full column rank matrices. Indeed (see Remark 2.1)

\[ S = \mathbb{C}^{n\times m} \ \Rightarrow\ S^\perp = \{0\} \ \Rightarrow\ Q = 0 \ \Rightarrow\ A^\dagger_{\mathbb{C}^{n\times m},r} = (A^*A)^{-1}A^* = A^\dagger. \]
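Theorem 2.2 can likewise be checked numerically: with `s_mp_right` from the sketch above, the matrix $Q = (A^*A)A^\dagger_{S,r} - A^*$ should be Frobenius-orthogonal to every basis element of $S$ (again an illustration of ours, with arbitrary test data):

```python
import numpy as np
# Reuses s_mp_right; numerical check of Theorem 2.2.

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))                 # r(A) = n = 3 a.s.
basis = [rng.standard_normal((3, 5)) for _ in range(4)]

M = s_mp_right(A, basis)                        # A+_{S,r}
Q = (A.T @ A) @ M - A.T                         # Q from the proof above
for Mj in basis:                                # <Q, Mj>_F = tr(Q Mj*) = 0
    assert abs(np.trace(Q @ Mj.T)) < 1e-10
```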

An application of Theorem 2.2 is given in the next corollary.

Corollary 2.1. Let $A\in\mathbb{C}^{m\times n}$ with $r(A) = n$. Let $s\in\mathbb{C}^m - \{0\}$ and let $[0/s]$ be the annihilator subspace of $s$ in $\mathbb{C}^{n\times m}$, i.e.,

\[ [0/s] = \{M\in\mathbb{C}^{n\times m} \mid Ms = 0\}. \]

Then

\[ A^\dagger_{[0/s],r} = (A^*A)^{-1}A^*\left( I_m - \|s\|_2^{-2}\, ss^* \right), \tag{2.9} \]

where $\|\cdot\|_2$ stands for the usual vector Euclidean norm.

Proof. Using Eq. (2.7) for $S = [0/s]$, we have

\[ A^\dagger_{[0/s],r} = (A^*A)^{-1}(A^* + Q),\quad\text{for some } Q\in [0/s]^\perp \]

and since the orthogonal complement of the annihilator subspace $[0/s]$ is given by [12]

\[ [0/s]^\perp = \{vs^* \mid v\in\mathbb{C}^n\}, \]

we get

\[ A^\dagger_{[0/s],r} = (A^*A)^{-1}(A^* + vs^*),\quad\text{for some } v\in\mathbb{C}^n. \tag{2.10} \]

Multiplying both sides of Eq. (2.10) by the vector $s$, we have

\[ A^\dagger_{[0/s],r}\, s = (A^*A)^{-1}(A^* + vs^*)s \]

and, since $A^\dagger_{[0/s],r}\in [0/s]$, we get

\[ 0 = (A^* + vs^*)s \ \Rightarrow\ v = -\frac{1}{\|s\|_2^2}A^*s \]

and substituting $v$ by its above expression in Eq. (2.10), we get

\[ A^\dagger_{[0/s],r} = (A^*A)^{-1}\left( A^* - \frac{1}{\|s\|_2^2}A^*ss^* \right) = (A^*A)^{-1}A^*\left( I_m - \|s\|_2^{-2}ss^* \right). \qquad\square \]

Remark 2.6. According to formula (1.5)(ii) for the full column rank case, Eq. (2.9) can be simply rewritten as

\[ A^\dagger_{[0/s],r} = A^\dagger\left( I_m - \frac{ss^*}{s^*s} \right), \]

a curious representation of the right S-Moore–Penrose inverse of $A$ ($S$ being the annihilator subspace $[0/s]$ of $s$) as the product of the Moore–Penrose inverse of $A$ and an element of $S$, namely the elementary annihilator $I_m - \frac{ss^*}{s^*s}$ of the vector $s$.
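The closed form of Corollary 2.1 and Remark 2.6 admits a direct numerical check; the sketch below (our own, with an arbitrary full column rank test matrix) also cross-validates it against the generic `s_mp_right` computation over an explicit basis of $[0/s]$:

```python
import numpy as np
# Reuses s_mp_right; numerical check of (2.9) / Remark 2.6.

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3))                 # r(A) = n = 3 a.s.
s = rng.standard_normal(5)

M = np.linalg.pinv(A) @ (np.eye(5) - np.outer(s, s) / (s @ s))
assert np.allclose(M @ s, 0)                    # M lies in [0/s]

# Cross-check against the direct computation over an explicit basis
# of [0/s] = span{e_i f_j^T}, {f_j} an orthonormal basis of s-perp.
F = np.linalg.svd(s.reshape(1, -1))[2][1:]      # rows span s-perp
basis = [np.outer(np.eye(3)[i], F[j]) for i in range(3) for j in range(4)]
assert np.allclose(M, s_mp_right(A, basis))
```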

Regarding the full column rank case, let us mention that, in addition to formulas (2.3) and (2.7), other explicit expressions for $A^\dagger_{S,r}$, more appropriate for computational purposes, can be given using an arbitrary basis of the subspace $S$. The basic idea simply consists of expressing the orthogonal projection $AA^\dagger_{S,r}$ of $I_m$ onto the subspace $AS$ by its expansion with respect to an orthonormal basis of $AS$ (after using the Gram–Schmidt orthonormalization procedure, if necessary). These expressions have been developed in [13] for the special case of real $n\times n$ nonsingular matrices, and they have been applied to the preconditioning of large linear systems and illustrated with some numerical experiments corresponding to real-world cases.

The next two lemmas state some spectral properties of the matrix product $AA^\dagger_{S,r}$ that will be used in the next section.

Lemma 2.1. Let $A\in\mathbb{C}^{m\times n}$ and let $S$ be a linear subspace of $\mathbb{C}^{n\times m}$. Then

\[ \left\| AA^\dagger_{S,r} - I_m \right\|_F^2 = m - \operatorname{tr}\!\left( AA^\dagger_{S,r} \right), \tag{2.11} \]
\[ \left\| AA^\dagger_{S,r} \right\|_F^2 = \operatorname{tr}\!\left( AA^\dagger_{S,r} \right). \tag{2.12} \]

Lemma 2.2. Let $A\in\mathbb{C}^{m\times n}$ and let $S$ be a linear subspace of $\mathbb{C}^{n\times m}$. Let $\{\lambda_i\}_{i=1}^m$ and $\{\sigma_i\}_{i=1}^m$ be the sets of eigenvalues and singular values, respectively, of the matrix $AA^\dagger_{S,r}$, arranged, as usual, in nonincreasing order of their moduli. Then

\[ \sum_{i=1}^m \sigma_i^2 = \sum_{i=1}^m \lambda_i \le m. \tag{2.13} \]

We omit the proofs of Lemmas 2.1 and 2.2, since these proofs are respectively similar to those of Lemmas 2.1 and 2.3 given in [18] in the context of the spectral analysis of the orthogonal projections of the identity onto arbitrary subspaces of $\mathbb{R}^{n\times n}$.
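For completeness, a small numerical check of the identities (2.11)-(2.13), reusing `s_mp_right` (arbitrary test data; an illustration of ours, not the omitted proofs):

```python
import numpy as np
# Reuses s_mp_right; numerical check of (2.11)-(2.13) for m = 4.

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))
basis = [rng.standard_normal((4, 4)) for _ in range(6)]

G = A @ s_mp_right(A, basis)                    # G = A A+_{S,r}
t = np.trace(G)
assert np.isclose(np.linalg.norm(G - np.eye(4), 'fro')**2, 4 - t)   # (2.11)
assert np.isclose(np.linalg.norm(G, 'fro')**2, t)                   # (2.12)
sigma = np.linalg.svd(G, compute_uv=False)
lam = np.linalg.eigvals(G)
assert np.isclose((sigma**2).sum(), lam.real.sum()) and t <= 4      # (2.13)
```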

3. Applications

3.1. Applications to preconditioning

In numerical linear algebra, the convergence of the iterative Krylov methods [14–16] used for solving the linear system

\[ Ax = b,\qquad A\in\mathbb{R}^{n\times n} \text{ nonsingular},\quad x, b\in\mathbb{R}^{n\times 1} \tag{3.1} \]

can be improved by transforming it into the right preconditioned system

\[ AMy = b,\qquad x = My. \tag{3.2} \]

The approximate inverse preconditioning (3.2) of system (3.1) is performed in order to get a preconditioned matrix $AM$ as close as possible to the identity in some sense, taking into account the computational cost required for this purpose. The closeness of $AM$ to $I_n$ may be measured by a suitable matrix norm such as, for instance, the Frobenius norm. In this way, the optimal (in the sense of the Frobenius norm) right preconditioner of system (3.1), among all matrices $M$ belonging to a given subspace $S$ of $\mathbb{R}^{n\times n}$, is precisely the right S-Moore–Penrose inverse $A^\dagger_{S,r}$, i.e., the unique solution to the matrix minimization problem (Remark 2.2)

\[ \min_{M\in S} \|AM - I_n\|_F,\qquad A\in\mathbb{R}^{n\times n},\ r(A) = n,\ S\subseteq\mathbb{R}^{n\times n}. \tag{3.3} \]

Alternatively, we can transform system (3.1) into the left preconditioned system $MAx = Mb$, which is now related to the left S-Moore–Penrose inverse $A^\dagger_{S,l}$.
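In practice, $S$ is typically a sparsity pattern subspace, and then problem (3.3) decouples into $n$ independent small least squares problems, one per column of $M$; this column decoupling is the standard device behind Frobenius norm approximate inverse preconditioners such as those in [13]. The following dense sketch of the decoupling is our own illustration (the function name and the diagonal pattern example are ours, not the authors' code):

```python
import numpy as np

def frobenius_preconditioner(A, pattern):
    """min ||A M - I_n||_F over all M with M[i, j] = 0 outside `pattern`.

    `pattern[j]` lists the allowed row indices of column j of M, so
    problem (3.3) decouples into n small least squares problems.
    """
    n = A.shape[0]
    M = np.zeros((n, n))
    for j in range(n):
        rows = pattern[j]
        mj, *_ = np.linalg.lstsq(A[:, rows], np.eye(n)[:, j], rcond=None)
        M[rows, j] = mj
    return M

# Example: diagonal pattern (the optimal Jacobi-like scaling)
rng = np.random.default_rng(6)
A = rng.standard_normal((5, 5)) + 5 * np.eye(5)
M = frobenius_preconditioner(A, [[j] for j in range(5)])
print(np.linalg.norm(A @ M - np.eye(5), 'fro'))
```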


In the next theorem, we apply the spectral property (2.13) to the special case of the optimal preconditioner $A^\dagger_{S,r}$, i.e., the solution to problem (3.3). We must assume that the matrix $A^\dagger_{S,r}$ is nonsingular, in order to get a nonsingular coefficient matrix $AA^\dagger_{S,r}$ of the preconditioned system (3.2). Hence, all singular values of the matrix $AA^\dagger_{S,r}$ will be greater than zero, and they will be denoted as

\[ \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n > 0. \]

Theorem 3.1. Let $A\in\mathbb{R}^{n\times n}$ be nonsingular and let $S$ be a vector subspace of $\mathbb{R}^{n\times n}$ such that the solution $A^\dagger_{S,r}$ to problem (3.3) is nonsingular. Then the smallest singular value $\sigma_n$ of the preconditioned matrix $AA^\dagger_{S,r}$ lies in the interval $(0,1]$. Moreover, we have the following lower and upper bounds on the distance $d(I_n, AS)$:

\[ (1 - \sigma_n)^2 \le \left\| AA^\dagger_{S,r} - I_n \right\|_F^2 \le n\left( 1 - \sigma_n^2 \right). \tag{3.4} \]

Proof. Using Lemma 2.2 for $m = n$, we get

\[ n\sigma_n^2 \le \sum_{i=1}^n \sigma_i^2 \le n \ \Rightarrow\ \sigma_n^2 \le 1 \ \Rightarrow\ \sigma_n \le 1. \]

To prove the left-hand inequality of Eq. (3.4), we use the well-known fact that singular values depend continuously on their arguments [17], i.e., for all $A, B\in\mathbb{R}^{n\times n}$ and for all $i = 1, 2, \ldots, n$,

\[ |\sigma_i(A) - \sigma_i(B)| \le \|A - B\|_F \ \Rightarrow\ |\sigma_n - 1| \le \left\| AA^\dagger_{S,r} - I_n \right\|_F. \]

Finally, to prove the right-hand inequality of Eq. (3.4), we use Eqs. (2.11) and (2.12) for $m = n$:

\[ \left\| AA^\dagger_{S,r} - I_n \right\|_F^2 = n - \operatorname{tr}\!\left( AA^\dagger_{S,r} \right) = n - \left\| AA^\dagger_{S,r} \right\|_F^2 = n - \sum_{i=1}^n \sigma_i^2 \le n\left( 1 - \sigma_n^2 \right). \qquad\square \]

Remark 3.1. Eq. (3.4) states that $\left\| AA^\dagger_{S,r} - I_n \right\|_F$ decreases to $0$ as the smallest singular value $\sigma_n$ of the preconditioned matrix $AA^\dagger_{S,r}$ increases to $1$. That is, the closeness of $AA^\dagger_{S,r}$ to the identity is determined by the closeness of $\sigma_n$ to unity. In the extremal case, $\sigma_n = 1$ iff $A^\dagger_{S,r} = A^{-1}$ iff $A^{-1}\in S$. A more complete version of Theorem 3.1, involving not only the smallest singular value, but also the modulus of the smallest eigenvalue of the preconditioned matrix, can be found in [18]. The proof presented here is completely different, more direct and simpler than the one given in [18].
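A quick numerical check of the bounds (3.4), reusing `s_mp_right` (the subspace below is an arbitrary choice of ours; the sketch assumes $A^\dagger_{S,r}$ happens to be nonsingular for the test data, as Theorem 3.1 requires):

```python
import numpy as np
# Reuses s_mp_right; numerical check of the bounds (3.4) for n = 4.

rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4))
basis = [np.eye(4), A.T]          # an arbitrary choice of S

G = A @ s_mp_right(A, basis)      # assumes A+_{S,r} is nonsingular here
sig = np.linalg.svd(G, compute_uv=False)
d2 = np.linalg.norm(G - np.eye(4), 'fro')**2
assert sig[-1] <= 1 + 1e-12
assert (1 - sig[-1])**2 <= d2 + 1e-12
assert d2 <= 4 * (1 - sig[-1]**2) + 1e-12
```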

Remark 3.2. In numerical linear algebra, the theoretical effectiveness analysis of the preconditioners $M$ to be used is based on the spectral properties of the preconditioned matrix $AM$ (e.g., condition number, clustering of eigenvalues and singular values, departure from normality, etc.). For this reason, the spectral properties of the matrix $AA^\dagger_{S,r}$, stated in Lemmas 2.1 and 2.2 and in Theorem 3.1, are especially useful when the right S-Moore–Penrose inverse $A^\dagger_{S,r}$ defined by problem (3.3) is the optimal preconditioner. We refer the reader to [18] for more details about the theoretical effectiveness analysis of these preconditioners.

3.2. Applications to the constrained least squares problem

The S-Moore–Penrose inverse can be applied to the linear least squares problem subject to a set of homogeneous linear constraints. This is in accordance with the well-known fact that the Moore–Penrose inverse can be applied to the unconstrained linear least squares problem [19], the major application of the Moore–Penrose inverse [20]. More precisely, let us recall that, regarding the linear system $Ax = b$, $A\in\mathbb{R}^{m\times n}$, the unique solution (if $A$ has full column rank) or the minimum norm solution (otherwise) of the least squares problem

\[ \min_{x\in\mathbb{R}^n} \|Ax - b\|_2 \]

is given by

\[ \bar x = A^\dagger b. \]

Here we consider as well the linear system $Ax = b$, $A\in\mathbb{R}^{m\times n}$, but now subject to the condition $x\in T$, where $T$ is a given vector subspace of $\mathbb{R}^n$. Denoting by $Bx = 0$ the system of Cartesian equations of the subspace $T$, this problem is usually formulated as

\[ \min_{x\in\mathbb{R}^n} \|Ax - b\|_2 \quad\text{subject to}\quad Bx = 0 \tag{3.5} \]

and we refer the reader to [11] and to the references contained therein for more details about the linearly constrained linear least squares problem.


The next theorem provides us with a new, different approach, in terms of the S-Moore–Penrose inverse, to problem (3.5). Our theorem covers all possible situations for both the full rank and the rank deficient cases, e.g., $Ax = b$ consistent but $Ax = b$, $x\in T$ inconsistent; $Ax = b$ consistent and underdetermined while $Ax = b$, $x\in T$ consistent and uniquely determined; both $Ax = b$ and $Ax = b$, $x\in T$ inconsistent, etc.

Theorem 3.2. Let $A\in\mathbb{R}^{m\times n}$ and $b\in\mathbb{R}^m$. Let $T$ be a linear subspace of $\mathbb{R}^n$ of dimension $p$. Then the minimum norm solution to the constrained least squares problem

\[ \min_{x\in T} \|Ax - b\|_2 \tag{3.6} \]

is given by

\[ \bar x_T = A^\dagger_{S_P,r}\, b, \tag{3.7} \]

where $P$ is the $n\times p$ matrix whose columns are the vectors of an arbitrary basis of $T$ and $S_P = P\,\mathbb{R}^{p\times m}\subseteq\mathbb{R}^{n\times m}$. In particular, if $A$ has full column rank then $\bar x_T$ is the unique least squares solution to problem (3.6).

Proof. Let $\mathcal{B} = \{v_1,\ldots,v_p\}$ be a given orthonormal basis of $T$ and let $P$ be the $n\times p$ matrix whose columns are $v_1,\ldots,v_p$. For a clearer exposition, we subdivide the proof into the following three parts:

(i) Writing any vector $x\in T$ as $x = P\alpha$ ($\alpha\in\mathbb{R}^p$), we get

\[ \|A\tilde x - b\|_2 = \min_{x\in T}\|Ax - b\|_2 = \min_{\alpha\in\mathbb{R}^p}\|(AP)\alpha - b\|_2 = \|(AP)\tilde\alpha - b\|_2, \]

that is, $\tilde\alpha\in\mathbb{R}^p$ is a solution to the unconstrained least squares problem

\[ \min_{\alpha\in\mathbb{R}^p}\|(AP)\alpha - b\|_2 \tag{3.8} \]

if and only if $\tilde x = P\tilde\alpha\in T$ is a solution to the constrained least squares problem (3.6).

Now, let $\tilde\alpha$ be any solution to problem (3.8). Since $\bar\alpha = (AP)^\dagger b$ is the minimum norm solution to problem (3.8), we have

\[ \|(AP)^\dagger b\|_2 \le \|\tilde\alpha\|_2 \]

and then, using the fact that $P$ has orthonormal columns ($P^TP = I_p$), we have

\[ \|P\alpha\|_2^2 = (P\alpha)^T(P\alpha) = \alpha^T\alpha = \|\alpha\|_2^2 \ \text{ for all } \alpha\in\mathbb{R}^p \]

and then

\[ \|P(AP)^\dagger b\|_2 \le \|P\tilde\alpha\|_2 \ \text{ for any solution } \tilde\alpha \text{ to problem (3.8), i.e.,} \]
\[ \|P(AP)^\dagger b\|_2 \le \|\tilde x\|_2 \ \text{ for any solution } \tilde x \text{ to problem (3.6).} \]

This proves that the minimum norm solution $\bar x_T$ to problem (3.6) can be expressed as

\[ \bar x_T = P(AP)^\dagger b \ \text{ for any orthonormal basis } \mathcal{B} \text{ of } T. \tag{3.9} \]

(ii) Writing any matrix $M\in S_P = P\,\mathbb{R}^{p\times m}$ as $M = PN$ ($N\in\mathbb{R}^{p\times m}$), we get

\[ \|A\widetilde M - I_m\|_F = \min_{M\in S_P}\|AM - I_m\|_F = \min_{N\in\mathbb{R}^{p\times m}}\|(AP)N - I_m\|_F = \|(AP)\widetilde N - I_m\|_F, \]

that is, $\widetilde N\in\mathbb{R}^{p\times m}$ is a solution to the unconstrained matrix minimization problem

\[ \min_{N\in\mathbb{R}^{p\times m}}\|(AP)N - I_m\|_F \tag{3.10} \]

if and only if $\widetilde M = P\widetilde N\in S_P$ is a solution to the constrained matrix minimization problem

\[ \min_{M\in S_P = P\mathbb{R}^{p\times m}}\|AM - I_m\|_F. \tag{3.11} \]

Now, let $\widetilde N$ be any solution to problem (3.10). Since $(AP)^\dagger$ is the minimum Frobenius norm solution to problem (3.10), we have

\[ \|(AP)^\dagger\|_F \le \|\widetilde N\|_F \]

and then, using again the fact that $P$ has orthonormal columns ($P^TP = I_p$), we have

\[ \|PN\|_F^2 = \operatorname{tr}\!\left((PN)^T(PN)\right) = \operatorname{tr}(N^TN) = \|N\|_F^2 \ \text{ for all } N\in\mathbb{R}^{p\times m} \]

and then

\[ \|P(AP)^\dagger\|_F \le \|P\widetilde N\|_F \ \text{ for any solution } \widetilde N \text{ to problem (3.10), i.e.,} \]
\[ \|P(AP)^\dagger\|_F \le \|\widetilde M\|_F \ \text{ for any solution } \widetilde M \text{ to problem (3.11).} \]


This proves that the minimum Frobenius norm solution $A^\dagger_{S_P,r}$ to problem (3.11) can be expressed as

\[ A^\dagger_{S_P,r} = P(AP)^\dagger \ \text{ for any orthonormal basis } \mathcal{B} \text{ of } T. \tag{3.12} \]

(iii) Using Eqs. (3.9) and (3.12), we get

\[ \bar x_T = A^\dagger_{S_P,r}\, b \ \text{ for any orthonormal basis } \mathcal{B} \text{ of } T. \tag{3.13} \]

To conclude the derivation of formula (3.7), we must prove that, although the assumption that $\mathcal{B}$ is an orthonormal basis of $T$ is (in general) essential for Eqs. (3.9) and (3.12) to hold, formula (3.13) for $\bar x_T$ is valid not only for orthonormal bases, but in fact for any basis (not necessarily orthonormal) of the subspace $T$. In other words, the vector $A^\dagger_{S_P,r}b$ is independent of the chosen basis. Indeed, let $\mathcal{B}$, $\mathcal{B}'$ be two arbitrary bases of $T$ and let $P$, $P'$ be the $n\times p$ matrices whose columns are the vectors of $\mathcal{B}$ and $\mathcal{B}'$, respectively. Writing the matrix $P'$ as $P' = PC$, $C$ being a $p\times p$ nonsingular matrix, we have

\[ S_{P'} = P'\,\mathbb{R}^{p\times m} = P(C\,\mathbb{R}^{p\times m}) = P\,\mathbb{R}^{p\times m} = S_P \ \Rightarrow\ A^\dagger_{S_P,r}b = A^\dagger_{S_{P'},r}b. \]

Finally, if in particular $A$ has full column rank then $A$ is left invertible and, due to the uniqueness of the orthogonal projection $A\bar x_T$ of the vector $b$ onto the subspace $AT\subseteq\mathbb{R}^m$, we conclude that the minimum norm solution $\bar x_T = A^\dagger_{S_P,r}b$ to problem (3.6) is indeed the unique solution to this problem. $\square$

Remark 3.3. In particular, for the special full rank case $r(A) = n$, the expression (3.9) for the unique least squares solution $\bar x_T$ of problem (3.6) is valid for any basis (not necessarily orthonormal) of $T$. Indeed, let $\mathcal{B}$, $\mathcal{B}'$ be two arbitrary bases of $T$ and let $P$, $P'$ be the $n\times p$ matrices whose columns are the vectors of $\mathcal{B}$ and $\mathcal{B}'$, respectively. Writing the matrix $P'$ as $P' = PC$, $C$ being a $p\times p$ nonsingular matrix, we have

\[ P'(AP')^\dagger = PC\left((AP)C\right)^\dagger = PCC^\dagger(AP)^\dagger = PCC^{-1}(AP)^\dagger = P(AP)^\dagger, \]

where the reverse order law for the product $(AP)C$ holds because $AP$ has full column rank, i.e.,

\[ p = r(A) + r(P) - n \le r(AP) \le \min(r(A), r(P)) = p, \]

and $C$ has full row rank; see, e.g., [3]. In conclusion, the minimum norm solution to problem (3.6) is given by

\[ \bar x_T = P(AP)^\dagger b \quad \begin{cases} \text{for any orthonormal basis of } T & \text{if } r(A) < n,\\ \text{for any basis of } T & \text{if } r(A) = n \end{cases} \]

and also, for both cases ($r(A)\le n$), by $\bar x_T = A^\dagger_{S_P,r}b$ for any basis of $T$.

The following simple example illustrates Theorem 3.2.

Example 3.1. Let $A = (1\ \ 1\ \ 1)\in\mathbb{R}^{1\times 3}$ and $b = (2)\in\mathbb{R}^{1\times 1}$. Consider the subspace $T\equiv x_3 = 0$ of $\mathbb{R}^3$. Then $m = 1$, $n = 3$, $p = 2$ and, taking the (nonorthonormal) basis of $T$: $\mathcal{B} = \{(1,0,0), (1,1,0)\}$, we get

\[ P = \begin{pmatrix} 1 & 1 \\ 0 & 1 \\ 0 & 0 \end{pmatrix} \ \Rightarrow\ S_P = P\,\mathbb{R}^{2\times 1} = \left\{ \begin{pmatrix} a \\ b \\ 0 \end{pmatrix} \right\}_{a,b\in\mathbb{R}} \subseteq \mathbb{R}^{3\times 1} \ \Rightarrow\ A^\dagger_{S_P,r} = \begin{pmatrix} 1/2 \\ 1/2 \\ 0 \end{pmatrix} \]

and then the minimum norm solution to problem (3.6) is given by

\[ \bar x_T = A^\dagger_{S_P,r}\, b = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}. \]
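Example 3.1, together with the basis independence asserted in Theorem 3.2, can be reproduced with the `s_mp_right` sketch from Section 2; the following check is our own illustration:

```python
import numpy as np
# Reuses s_mp_right; reproduces Example 3.1.

A = np.array([[1.0, 1.0, 1.0]])                      # 1 x 3
b = np.array([2.0])
P = np.array([[1.0, 1.0], [0.0, 1.0], [0.0, 0.0]])   # nonorthonormal basis of T

# Basis of S_P = {(a, b, 0)^T}: the 3 x 1 matrices e1 and e2
basis = [np.array([[1.0], [0.0], [0.0]]), np.array([[0.0], [1.0], [0.0]])]
x_T = (s_mp_right(A, basis) @ b.reshape(1, 1)).ravel()
print(x_T)                                           # [1. 1. 0.]

# Same result via (3.9) with an orthonormalized basis Q of T
Q, _ = np.linalg.qr(P)
assert np.allclose(Q @ np.linalg.pinv(A @ Q) @ b, x_T)
```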

4. Concluding remarks

The left and right S-Moore–Penrose inverses, $A^\dagger_{S,l}$ and $A^\dagger_{S,r}$, extend the idea of the standard inverse from nonsingular square matrices not only to rectangular and singular square matrices (as the Moore–Penrose inverse does), but also to invertible matrices themselves, by considering a subspace $S$ of $\mathbb{C}^{n\times m}$ not containing $A^{-1}$. Restricted to the right case, the two expressions given for $A^\dagger_{S,r}$ (the first one based on an SVD of $A$; the second one involving the orthogonal complement of the subspace $S$) respectively generalize the well-known corresponding formulae (1.4) and (1.5)(ii) for the Moore–Penrose inverse. The S-Moore–Penrose inverse $A^\dagger_{S,r}$ can be seen as a useful alternative to $A^\dagger$: it provides a computationally feasible approximate inverse preconditioner for large linear systems (in contrast to the unfeasible computation of $A^\dagger = A^{-1}$), and it also provides the minimum norm solution $A^\dagger_{S_P,r}b$ to the least squares problem subject to homogeneous linear constraints (in contrast to the minimum norm solution $A^\dagger b$ of the unconstrained least squares problem). The behavior of the S-Moore–Penrose inverse with respect to both theoretical results and other applications already known for the Moore–Penrose inverse can be considered for future research.


Acknowledgments

This work was partially supported by Ministerio de Ciencia e Innovación (Spain) and FEDER through Grant contract CGL2008-06003-C03-01/CLI.

References

[1] E.H. Moore, On the reciprocal of the general algebraic matrix, Bull. Am. Math. Soc. 26 (1920) 394–395.
[2] R. Penrose, A generalized inverse for matrices, Proc. Cambridge Philos. Soc. 51 (1955) 406–413.
[3] A. Ben-Israel, T.N.E. Greville, Generalized Inverses: Theory and Applications, second ed., Springer-Verlag, New York, 2003.
[4] J.K. Baksalary, O.M. Baksalary, Particular formulae for the Moore–Penrose inverse of a columnwise partitioned matrix, Linear Algebra Appl. 421 (2007) 16–23.
[5] T. Britz, D.D. Olesky, P. van den Driessche, The Moore–Penrose inverse of matrices with an acyclic bipartite graph, Linear Algebra Appl. 390 (2004) 47–60.
[6] M.A. Rakha, On the Moore–Penrose generalized inverse matrix, Appl. Math. Comput. 158 (2004) 185–200.
[7] Y. Tian, Using rank formulas to characterize equalities for Moore–Penrose inverses of matrix products, Appl. Math. Comput. 147 (2004) 581–600.
[8] Y. Tian, Approximation and continuity of Moore–Penrose inverses of orthogonal row block matrices, Appl. Math. Comput. (2008), doi:10.1016/j.amc.2008.10.027.
[9] F. Toutounian, A. Ataei, A new method for computing Moore–Penrose inverse matrices, J. Comput. Appl. Math. (2008), doi:10.1016/j.cam.2008.10.008.
[10] X. Zhang, J. Cai, Y. Wei, Interval iterative methods for computing Moore–Penrose inverse, Appl. Math. Comput. 183 (2006) 522–532.
[11] G.H. Golub, C.F. Van Loan, Matrix Computations, third ed., Johns Hopkins University Press, Baltimore, 1996.
[12] A. Griewank, A short proof of the Dennis–Schnabel theorem, BIT 22 (1982) 252–256.
[13] G. Montero, L. González, E. Flórez, M.D. García, A. Suárez, Approximate inverse computation using Frobenius inner product, Numer. Linear Algebra Appl. 9 (2002) 239–247.
[14] O. Axelsson, Iterative Solution Methods, Cambridge University Press, Cambridge, 1994.
[15] A. Greenbaum, Iterative Methods for Solving Linear Systems, Frontiers Appl. Math., vol. 17, SIAM, Philadelphia, PA, 1997.
[16] Y. Saad, Iterative Methods for Sparse Linear Systems, PWS Publishing Co., Boston, MA, 1996.
[17] R.A. Horn, C.R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, 1985.
[18] L. González, Orthogonal projections of the identity: spectral analysis and applications to approximate inverse preconditioning, SIAM Rev. 48 (2006) 66–75.
[19] R. Penrose, On best approximate solutions of linear matrix equations, Proc. Cambridge Philos. Soc. 52 (1956) 17–19.
[20] A. Ben-Israel, The Moore of the Moore–Penrose inverse, Electron. J. Linear Algebra 9 (2002) 150–157.