
Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems

Yan Zeng

Version 1.0.4, last revised on 2014-08-13.

Abstract

This is a solution manual for Linear algebra and its applications, 2nd edition, by Peter Lax [8]. This version omits the following problems: exercises 2, 9 of Chapter 8; exercises 3, 4 of Chapter 17; exercises of Chapter 18; exercise 3 of Appendix 3; exercises of Appendix 4, 5, 8 and 11.

If you would like to correct any typos/errors, please send email to [email protected].

Contents

1 Fundamentals 2

2 Duality 5

3 Linear Mappings 8

4 Matrices 12

5 Determinant and Trace 14

6 Spectral Theory 18

7 Euclidean Structure 23

8 Spectral Theory of Self-Adjoint Mappings of a Euclidean Space into Itself 28

9 Calculus of Vector- and Matrix- Valued Functions 32

10 Matrix Inequalities 35

11 Kinematics and Dynamics 38

12 Convexity 40

13 The Duality Theorem 44

14 Normed Linear Spaces 45

15 Linear Mappings Between Normed Linear Spaces 47

16 Positive Matrices 49

17 How to Solve Systems of Linear Equations 50


18 How to Calculate the Eigenvalues of Self-Adjoint Matrices 51

A Appendix 51
A.1 Special Determinants 51
A.2 The Pfaffian 52
A.3 Symplectic Matrices 53
A.4 Tensor Product 55
A.5 Lattices 55
A.6 Fast Matrix Multiplication 55
A.7 Gershgorin’s Theorem 56
A.8 The Multiplicity of Eigenvalues 56
A.9 The Fast Fourier Transform 56
A.10 The Spectral Radius 56
A.11 The Lorentz Group 57
A.12 Compactness of the Unit Ball 57
A.13 A Characterization of Commutators 57
A.14 Liapunov’s Theorem 58
A.15 The Jordan Canonical Form 58
A.16 Numerical Range 58


1 Fundamentals

The book’s own solution gives answers to Ex 1, 3, 7, 10, 13, 14, 16, 19, 20, 21.

I 1. (page 2) Show that the zero of vector addition is unique.

Proof. Suppose 0 and 0′ are two zeros of vector addition. Then by the definition of zero and commutativity, we have 0′ = 0′ + 0 = 0 + 0′ = 0.

I 2. (page 3) Show that the vector with all components zero serves as the zero element of classical vector addition.

Proof. For any x = (x1, · · · , xn) ∈ Kn, we have

x + 0 = (x1, · · · , xn) + (0, · · · , 0) = (x1 + 0, · · · , xn + 0) = (x1, · · · , xn) = x.

So 0 = (0, · · · , 0) is the zero element of classical vector addition.

I 3. (page 3) Show that (i) and (iv) are isomorphic.

Proof. The isomorphism T can be defined as T((a1, · · · , an)) = a1 + a2x + · · · + an x^(n−1).

I 4. (page 3) Show that if S has n elements, (i) and (iii) are isomorphic.

Proof. Suppose S = {s1, · · · , sn}. The isomorphism T can be defined as T(f) = (f(s1), · · · , f(sn)), ∀f ∈ K^S.

I 5. (page 4) Show that when K = R, (iv) is isomorphic with (iii) when S consists of n distinct points of R.

Proof. For any p(x) = a1 + a2x + · · · + an x^(n−1), we define

T(p) = p(x),

where p on the left side of the equation is regarded as a polynomial over R, while p(x) on the right side of the equation is regarded as a function defined on S = {s1, · · · , sn}. To prove T is an isomorphism, it suffices to prove T is one-to-one. This is seen through the observation that

[ 1  s1  s1²  · · ·  s1^(n−1) ] [ a1 ]   [ p(s1) ]
[ 1  s2  s2²  · · ·  s2^(n−1) ] [ a2 ]   [ p(s2) ]
[ · · ·  · · ·  · · ·  · · ·  ] [ ⋮  ] = [   ⋮   ]
[ 1  sn  sn²  · · ·  sn^(n−1) ] [ an ]   [ p(sn) ]

and the Vandermonde matrix on the left is invertible for distinct s1, s2, · · · , sn.
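
As a quick numerical sanity check (an illustration, not part of the original solution), the following Python sketch builds the Vandermonde matrix for a few arbitrarily chosen distinct points and confirms it is invertible, so polynomial coefficients are recovered uniquely from function values:

import numpy as np

# Distinct sample points (arbitrary illustrative choice).
s = np.array([-2.0, 0.0, 0.7, 1.3])
V = np.vander(s, increasing=True)     # rows (1, s_i, s_i^2, s_i^3)
assert abs(np.linalg.det(V)) > 1e-12  # invertible, so T is one-to-one
# Recover the coefficients of p(x) = 1 - 2x + x^3 from its values p(s):
a = np.linalg.solve(V, 1 - 2*s + s**3)
print(np.round(a, 12))                # [ 1. -2.  0.  1.]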

I 6. (page 4) Prove that Y + Z is a linear subspace of X if Y and Z are.

Proof. For any y, y′ ∈ Y, z, z′ ∈ Z and k ∈ K, we have (by commutativity and associativity)

(y + z) + (y′ + z′) = (z + y) + (y′ + z′) = z + (y + (y′ + z′)) = z + ((y + y′) + z′)
= z + (z′ + (y + y′)) = (z + z′) + (y + y′) = (y + y′) + (z + z′) ∈ Y + Z,

and

k(y + z) = ky + kz ∈ Y + Z.

So Y + Z is a linear subspace of X if Y and Z are.


I 7. (page 4) Prove that if Y and Z are linear subspaces of X, so is Y ∩ Z.

Proof. For any x1, x2 ∈ Y ∩ Z, since Y and Z are linear subspaces of X, x1 + x2 ∈ Y and x1 + x2 ∈ Z. Therefore, x1 + x2 ∈ Y ∩ Z. For any k ∈ K and x ∈ Y ∩ Z, since Y and Z are linear subspaces of X, kx ∈ Y and kx ∈ Z. Therefore, kx ∈ Y ∩ Z. Combined, we conclude Y ∩ Z is a linear subspace of X.

I 8. (page 4) Show that the set {0} consisting of the zero element of a linear space X is a subspace of X. It is called the trivial subspace.

Proof. By definition of the zero vector, 0 + 0 = 0 ∈ {0}. For any k ∈ K, k0 = k(0 + 0) = k0 + k0. So k0 = 0 ∈ {0}. Combined, we conclude {0} is a linear subspace of X.

I 9. (page 4) Show that the set of all linear combinations of x1, · · · , xj is a subspace of X, and that it is the smallest subspace of X containing x1, · · · , xj. This is called the subspace spanned by x1, · · · , xj.

Proof. Define Y = {k1x1 + · · · + kjxj : k1, · · · , kj ∈ K}. Then clearly x1 = 1x1 + 0x2 + · · · + 0xj ∈ Y. Similarly, we can show x2, · · · , xj ∈ Y. Since for any k1, · · · , kj, k′1, · · · , k′j ∈ K,

(k1x1 + · · · + kjxj) + (k′1x1 + · · · + k′jxj) = (k1 + k′1)x1 + · · · + (kj + k′j)xj ∈ Y

and for any k1, · · · , kj, k ∈ K,

k(k1x1 + · · · + kjxj) = (kk1)x1 + · · · + (kkj)xj ∈ Y,

we can conclude Y is a linear subspace of X containing x1, · · · , xj. Finally, if Z is any linear subspace of X containing x1, · · · , xj, it is clear that Y ⊂ Z, as Z must be closed under scalar multiplication and vector addition. Combined, we have proven Y is the smallest linear subspace of X containing x1, · · · , xj.

I 10. (page 5) Show that if the vectors x1, · · · , xj are linearly independent, then none of the xi is the zero vector.

Proof. We prove by contradiction. Without loss of generality, assume x1 = 0. Then 1x1 + 0x2 + · · · + 0xj = 0. This shows x1, · · · , xj are linearly dependent, a contradiction. So x1 ≠ 0. We can similarly prove x2, · · · , xj ≠ 0.

I 11. (page 7) Prove that if X is finite dimensional and the direct sum of Y1, · · · , Ym, then

dim X = ∑ dim Yj.

Proof. Suppose Yi has a basis y^i_1, · · · , y^i_{n_i}. Then it suffices to prove that y^1_1, · · · , y^1_{n_1}, · · · , y^m_1, · · · , y^m_{n_m} form a basis of X. By definition of direct sum, these vectors span X, so we only need to show they are linearly independent. In fact, if not, then 0 has two distinct representations: 0 = 0 + · · · + 0 and 0 = ∑_{i=1}^{m} (a^i_1 y^i_1 + · · · + a^i_{n_i} y^i_{n_i}) for some scalars a^i_j, not all zero. This contradicts the definition of direct sum. So we must have linear independence, which implies y^1_1, · · · , y^1_{n_1}, · · · , y^m_1, · · · , y^m_{n_m} form a basis of X. Consequently, dim X = ∑ dim Yi.

I 12. (page 7) Show that every finite-dimensional space X over K is isomorphic to K^n, n = dim X. Show that this isomorphism is not unique when n > 1.

Proof. Fix a basis x1, · · · , xn of X; any element x ∈ X can be uniquely represented as ∑_{i=1}^{n} αi(x) xi for some αi(x) ∈ K, i = 1, · · · , n. We define the isomorphism as x ↦ (α1(x), · · · , αn(x)). Clearly this isomorphism depends on the basis, and by varying the choice of basis we obtain different isomorphisms.

I 13. (page 7) Prove (i)-(iii) above. Show furthermore that if x1 ≡ x2, then kx1 ≡ kx2 for every scalar k.

Proof. For any x1, x2 ∈ X, if x1 ≡ x2, i.e. x1 − x2 ∈ Y, then x2 − x1 = −(x1 − x2) ∈ Y, i.e. x2 ≡ x1. This is symmetry. For any x ∈ X, x − x = 0 ∈ Y. So x ≡ x. This is reflexivity. Finally, if x1 ≡ x2 and x2 ≡ x3, then x1 − x3 = (x1 − x2) + (x2 − x3) ∈ Y, i.e. x1 ≡ x3. This is transitivity. Furthermore, if x1 ≡ x2, then for every scalar k, kx1 − kx2 = k(x1 − x2) ∈ Y, i.e. kx1 ≡ kx2.


I 14. (page 7) Show that two congruence classes are either identical or disjoint.

Proof. For any x1, x2 ∈ X, the classes {x1} and {x2} intersect at some y if and only if x1 − y ∈ Y and x2 − y ∈ Y, in which case

x1 − x2 = (x1 − y) − (x2 − y) ∈ Y,

i.e. {x1} = {x2}. So either {x1} ∩ {x2} = ∅ or {x1} = {x2}.

I 15. (page 8) Show that the above definition of addition and multiplication by scalar is independent of the choice of representatives in the congruence class.

Proof. If {x} = {x′} and {y} = {y′}, then x − x′, y − y′ ∈ Y. So (x + y) − (x′ + y′) = (x − x′) + (y − y′) ∈ Y. This shows {x + y} = {x′ + y′}. Also, for any k ∈ K, kx − kx′ = k(x − x′) ∈ Y. So k{x} = {kx} = {kx′} = k{x′}.

I 16. (page 9) Denote by X the linear space of all polynomials p(t) of degree < n, and denote by Y the set of polynomials that are zero at t1, · · · , tj, j < n.

(i) Show that Y is a subspace of X.
(ii) Determine dim Y.
(iii) Determine dim X/Y.

Proof. By the theory of polynomials, we have

Y = { q(t) ∏_{i=1}^{j} (t − ti) : q(t) is a polynomial of degree < n − j }.

Then it’s easy to see dim Y = n − j and dim X/Y = dim X − dim Y = j.

I 17. (page 10) Prove Corollary 6′.

Proof. By Theorem 6, dimX/Y = dimX − dimY = 0, which implies X/Y = {{0}}. So X = Y .

I 18. (page 11) Show that

dim X1 ⊕ X2 = dim X1 + dim X2.

Proof. Define Y1 = {(x, 0) : x ∈ X1, 0 ∈ X2} and Y2 = {(0, x) : 0 ∈ X1, x ∈ X2}. Then Y1 and Y2 are linear subspaces of X1 ⊕ X2. It is easy to see Y1 is isomorphic to X1, Y2 is isomorphic to X2, and Y1 ∩ Y2 = {(0, 0)}. So by Theorem 7, dim X1 ⊕ X2 = dim Y1 + dim Y2 − dim(Y1 ∩ Y2) = dim X1 + dim X2 − 0 = dim X1 + dim X2.

I 19. (page 11) X is a linear space, Y is a subspace. Show that Y ⊕X/Y is isomorphic to X.

Proof. By Exercise 18 and Theorem 6, dim(Y ⊕ X/Y) = dim Y + dim(X/Y) = dim Y + dim X − dim Y = dim X. Since linear spaces of the same finite dimension are isomorphic (by a one-to-one mapping between their bases), Y ⊕ X/Y is isomorphic to X.

I 20. (page 12) Which of the following sets of vectors x = (x1, · · · , xn) in Rn are a subspace of Rn? Explainyour answer.

(a) All x such that x1 ≥ 0.
(b) All x such that x1 + x2 = 0.
(c) All x such that x1 + x2 + 1 = 0.
(d) All x such that x1 = 0.
(e) All x such that x1 is an integer.

Proof. (a) is not, since {x : x1 ≥ 0} is not closed under scalar multiplication by −1. (b) is. (c) is not, since x1 + x2 + 1 = 0 and x′1 + x′2 + 1 = 0 imply (x1 + x′1) + (x2 + x′2) + 1 = −1. (d) is. (e) is not, since x1 being an integer does not guarantee rx1 is an integer for every r ∈ R.


I 21. (page 12) Let U, V, and W be subspaces of some finite-dimensional vector space X. Is the statement

dim(U + V + W) = dim U + dim V + dim W − dim(U ∩ V) − dim(U ∩ W) − dim(V ∩ W) + dim(U ∩ V ∩ W)

true or false? If true, prove it. If false, provide a counterexample.

Proof. (From the textbook’s solutions, page 279.) The statement is false; here is an example to the contrary:

X = R², the (x, y) plane;
U = {y = 0}, V = {x = 0}, W = {x = y}.
U + V + W = R², U ∩ V = {0}, U ∩ W = {0}, V ∩ W = {0}, U ∩ V ∩ W = {0}.

Here the left side equals 2, while the right side equals 1 + 1 + 1 − 0 − 0 − 0 + 0 = 3.

2 Duality

The book’s own solution gives answers to Ex 4, 5, 6, 7.

I 1. (page 15) Given a nonzero vector x1 in X, show that there is a linear function l such that

l(x1) ≠ 0.

Proof. Let Y = {kx1 : k ∈ K}. Then Y is a 1-dimensional linear subspace of X. By Theorem 2 and Theorem 4,

dim Y^⊥ = dim X − dim Y < dim X = dim X′.

So there must exist some l ∈ X′ \ Y^⊥, and any such l satisfies l(x1) ≠ 0.

Remark 1. When K is R or C, the proof can be constructive. Indeed, assume e1, · · · , en is a basis for X and x1 = ∑_{i=1}^{n} ai ei. In the case K = R, define l by setting l(ei) = ai, i = 1, · · · , n; in the case K = C, define l by setting l(ei) = āi (the conjugate of ai), i = 1, · · · , n. Then in both cases, l(x1) = ∑_{i=1}^{n} |ai|² > 0.

I 2. (page 15) Verify that Y ⊥ is a subspace of X ′.

Proof. For any l1 and l2 ∈ Y^⊥, we have (l1 + l2)(y) = l1(y) + l2(y) = 0 + 0 = 0 for any y ∈ Y. So l1 + l2 ∈ Y^⊥. For any k ∈ K, (kl)(y) = k(l(y)) = k · 0 = 0 for any y ∈ Y. So kl ∈ Y^⊥. Combined, we conclude Y^⊥ is a subspace of X′.

I 3. (page 17) Prove Theorem 6.

Proof. Since S ⊂ Y, Y^⊥ ⊂ S^⊥. For “⊃”, let x1, · · · , xm be a maximal linearly independent subset of S. Then every element of S is a linear combination of x1, · · · , xm, so Y = {∑_{i=1}^{m} αi xi : α1, · · · , αm ∈ K} by Exercise 9 of Chapter 1. By the definition of annihilator, for any l ∈ S^⊥ and y = ∑_{i=1}^{m} αi xi ∈ Y, we have

l(y) = ∑_{i=1}^{m} αi l(xi) = 0.

So l ∈ Y^⊥. By the arbitrariness of l, S^⊥ ⊂ Y^⊥. Combined, we have S^⊥ = Y^⊥.

I 4. (page 18) In Theorem 7 take the interval I to be [−1, 1], and take n to be 3. Choose the three points to be t1 = −a, t2 = 0, and t3 = a.

(i) Determine the weights m1, m2, m3 so that (9) holds for all polynomials of degree < 3.
(ii) Show that for a > √(1/3), all three weights are positive.
(iii) Show that for a = √(3/5), (9) holds for all polynomials of degree < 6.


Proof. Suppose three linearly independent polynomials p1, p2 and p3 are applied to formula (9). Then m1, m2 and m3 must satisfy the linear equations

[ p1(t1)  p1(t2)  p1(t3) ] [ m1 ]   [ ∫_{−1}^{1} p1(t)dt ]
[ p2(t1)  p2(t2)  p2(t3) ] [ m2 ] = [ ∫_{−1}^{1} p2(t)dt ]
[ p3(t1)  p3(t2)  p3(t3) ] [ m3 ]   [ ∫_{−1}^{1} p3(t)dt ]

We take p1(t) = 1, p2(t) = t and p3(t) = t². The above equation becomes

[ 1   1  1  ] [ m1 ]   [ 2   ]
[ −a  0  a  ] [ m2 ] = [ 0   ]
[ a²  0  a² ] [ m3 ]   [ 2/3 ]

So

[ m1 ]   [ 0  −1/(2a)  1/(2a²) ] [ 2   ]   [ 1/(3a²)     ]
[ m2 ] = [ 1   0       −1/a²   ] [ 0   ] = [ 2 − 2/(3a²) ]
[ m3 ]   [ 0   1/(2a)  1/(2a²) ] [ 2/3 ]   [ 1/(3a²)     ]

Then it’s easy to see that for a > √(1/3), all three weights are positive.

To show formula (9) holds for all polynomials of degree < 6 when a = √(3/5), we note that for any odd n ∈ N,

∫_{−1}^{1} x^n dx = 0, and m1 p(−a) + m3 p(a) = 0 since m1 = m3 and p(−x) = −p(x), while m2 p(0) = 0.

So (9) holds for any x^n of odd degree n, in particular for p(x) = x³ and p(x) = x⁵. For p(x) = x⁴, we have

∫_{−1}^{1} x⁴ dx = 2/5, m1 p(t1) + m2 p(t2) + m3 p(t3) = 2 m1 a⁴ = (2/3) a²,

and (2/3)a² = 2/5 exactly when a² = 3/5. So formula (9) holds for p(x) = x⁴ when a = √(3/5). Combined, we conclude that for a = √(3/5), (9) holds for all polynomials of degree < 6.
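
The computed weights can be checked numerically; the sketch below (an illustration, not part of the original solution) reads formula (9) as m1 p(−a) + m2 p(0) + m3 p(a) = ∫_{−1}^{1} p(t) dt and tests it on all monomials of degree < 6:

import math

a = math.sqrt(3/5)
m1 = m3 = 1/(3*a**2)   # = 5/9
m2 = 2 - 2/(3*a**2)    # = 8/9
for n in range(6):     # monomials t^n of degree < 6
    quad = m1*(-a)**n + m2*0**n + m3*a**n
    exact = (1 - (-1)**(n + 1))/(n + 1)  # integral of t^n over [-1, 1]
    assert abs(quad - exact) < 1e-12

These are in fact the nodes and weights of the three-point Gauss-Legendre rule.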

Remark 2. In this exercise problem and Exercise 5 below, “Theorem 6” is corrected to “Theorem 7”.

I 5. (page 18) In Theorem 7 take the interval I to be [−1, 1], and take n = 4. Choose the four points to be −a, −b, b, a.

(i) Determine the weights m1, m2, m3, and m4 so that (9) holds for all polynomials of degree < 4.
(ii) For what values of a and b are the weights positive?

Proof. We take p1(t) = 1, p2(t) = t, p3(t) = t², and p4(t) = t³. Then m1, m2, m3, and m4 solve the following equation:

[ 1    1    1   1  ] [ m1 ]   [ 2   ]
[ −a   −b   b   a  ] [ m2 ] = [ 0   ]
[ a²   b²   b²  a² ] [ m3 ]   [ 2/3 ]
[ −a³  −b³  b³  a³ ] [ m4 ]   [ 0   ]

The inverse of the coefficient matrix is 1/(2(a² − b²)) times

[ −b²   b²/a   1   −1/a ]
[  a²  −a²/b  −1    1/b ]
[  a²   a²/b  −1   −1/b ]
[ −b²  −b²/a   1    1/a ]

so applying it to (2, 0, 2/3, 0)ᵀ gives

m1 = m4 = (1 − 3b²)/(3(a² − b²)), m2 = m3 = (3a² − 1)/(3(a² − b²)).

So the weights are positive if and only if one of the following two mutually exclusive cases holds:
1) b² > 1/3 and a² < 1/3 (in which case a² < b²);
2) b² < 1/3 and a² > 1/3 (in which case a² > b²).
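
The inversion and the simplified weights can be verified symbolically; a minimal sympy sketch (an illustration under the setup above, not part of the original solution):

import sympy as sp

a, b, t = sp.symbols('a b t', positive=True)
m = sp.symbols('m1 m2 m3 m4')
pts = [-a, -b, b, a]
# Impose exactness of (9) on t^k for k = 0, 1, 2, 3.
eqs = [sp.Eq(sum(mi*p**k for mi, p in zip(m, pts)),
             sp.integrate(t**k, (t, -1, 1))) for k in range(4)]
sol = sp.solve(eqs, m)
assert sp.simplify(sol[m[0]] - (1 - 3*b**2)/(3*(a**2 - b**2))) == 0
assert sp.simplify(sol[m[1]] - (3*a**2 - 1)/(3*(a**2 - b**2))) == 0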

I 6. (page 18) Let P2 be the linear space of all polynomials

p(x) = a0 + a1x + a2x²

with real coefficients and degree ≤ 2. Let ξ1, ξ2, ξ3 be three distinct real numbers, and then define

lj(p) = p(ξj) for j = 1, 2, 3.

(a) Show that l1, l2, l3 are linearly independent linear functions on P2.
(b) Show that l1, l2, l3 is a basis for the dual space P2′.
(c) (1) Suppose {e1, · · · , en} is a basis for the vector space V. Show there exist linear functions {l1, · · · , ln} in the dual space V′ defined by

li(ej) = 1 if i = j, 0 if i ≠ j.

Show that {l1, · · · , ln} is a basis of V′, called the dual basis.
(2) Find the polynomials p1(x), p2(x), p3(x) in P2 for which l1, l2, l3 is the dual basis in P2′.

(a)

Proof. (From the textbook’s solutions, page 280) Suppose there is a linear relation

a l1(p) + b l2(p) + c l3(p) = 0.

Set p = p(x) = (x − ξ2)(x − ξ3). Then p(ξ2) = p(ξ3) = 0, p(ξ1) ≠ 0; so we get from the above relation that a = 0. Similarly b = 0, c = 0.

(b)

Proof. Since dim P2 = 3, dim P2′ = 3. Since l1, l2, l3 are linearly independent, they span P2′.

(c1)


Proof. We define l1 by setting

l1(ej) = 1 if j = 1, 0 if j ≠ 1,

and extending l1 to V by linearity, i.e. l1(∑_{j=1}^{n} αj ej) := ∑_{j=1}^{n} αj l1(ej) = α1. l2, · · · , ln can be constructed similarly. If there exist a1, · · · , an such that a1l1 + · · · + anln = 0, we have

0 = a1l1(ej) + · · · + anln(ej) = aj, j = 1, · · · , n.

So l1, · · · , ln are linearly independent. Since dim V′ = dim V = n, {l1, · · · , ln} is a basis of V′.

(c2)

Proof. We define

p1(x) = (x − ξ2)(x − ξ3)/((ξ1 − ξ2)(ξ1 − ξ3)), p2(x) = (x − ξ1)(x − ξ3)/((ξ2 − ξ1)(ξ2 − ξ3)), p3(x) = (x − ξ1)(x − ξ2)/((ξ3 − ξ1)(ξ3 − ξ2)).

Then li(pj) = pj(ξi) = δij, as required.
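
A symbolic check that these are indeed dual to l1, l2, l3, i.e. li(pj) = pj(ξi) = δij (an illustrative sketch, not part of the original solution):

import sympy as sp

x = sp.symbols('x')
xi = sp.symbols('xi1 xi2 xi3')

def p(i):
    # Lagrange basis polynomial: 1 at xi_i, 0 at the other two points.
    num = sp.prod([x - xi[j] for j in range(3) if j != i])
    den = sp.prod([xi[i] - xi[j] for j in range(3) if j != i])
    return num/den

for i in range(3):
    for j in range(3):
        assert sp.simplify(p(j).subs(x, xi[i])) == (1 if i == j else 0)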

I 7. (page 18) Let W be the subspace of R⁴ spanned by (1, 0, −1, 2) and (2, 3, 1, 1). Which linear functions l(x) = c1x1 + c2x2 + c3x3 + c4x4 are in the annihilator of W?

Proof. (From the textbook’s solutions, page 280) l(x) has to be zero for x = (1, 0, −1, 2) and x = (2, 3, 1, 1). These yield two equations for c1, · · · , c4:

c1 − c3 + 2c4 = 0, 2c1 + 3c2 + c3 + c4 = 0.

We express c1 and c2 in terms of c3 and c4. From the first equation, c1 = c3 − 2c4. Setting this into the second equation gives c2 = −c3 + c4.
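
So the annihilator consists of the functionals with coefficient vector (c3 − 2c4, −c3 + c4, c3, c4), c3, c4 arbitrary. A quick numerical double check (illustrative, not part of the original solution):

import numpy as np

w1 = np.array([1, 0, -1, 2])
w2 = np.array([2, 3, 1, 1])
for c3, c4 in [(1.0, 0.0), (0.0, 1.0), (2.5, -1.5)]:  # arbitrary sample values
    c = np.array([c3 - 2*c4, -c3 + c4, c3, c4])
    assert abs(c @ w1) < 1e-12 and abs(c @ w2) < 1e-12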

3 Linear Mappings

The book’s own solution gives answers to Ex 1, 2, 4, 5, 6, 7, 8, 10, 11, 13.

⋆ Comments: To memorize Theorem 5 (R_T^⊥ = N_{T′}), recall that for a given l ∈ U′, (l, Tx) = 0 for any x ∈ X if and only if T′l = 0.

I 1. (page 20) Prove Theorem 1.

(a)

Proof. For any y, y′ ∈ T(X), there exist x, x′ ∈ X such that T(x) = y and T(x′) = y′. So y + y′ = T(x) + T(x′) = T(x + x′) ∈ T(X). For any k ∈ K, ky = kT(x) = T(kx) ∈ T(X). Combined, we conclude T(X) is a linear subspace of U.

(b)

Proof. Suppose V is a linear subspace of U. For any x, x′ ∈ T⁻¹(V), there exist y, y′ ∈ V such that T(x) = y and T(x′) = y′. Since T(x + x′) = T(x) + T(x′) = y + y′ ∈ V, x + x′ ∈ T⁻¹(V). For any k ∈ K, since T(kx) = kT(x) = ky ∈ V, kx ∈ T⁻¹(V). Combined, we conclude T⁻¹(V) is a linear subspace of X.

I 2. (page 24) Let

∑_{j=1}^{n} tij xj = ui, i = 1, · · · , m

be an overdetermined system of linear equations; that is, the number m of equations is greater than the number n of unknowns x1, · · · , xn. Take the case that in spite of the overdeterminacy, this system of equations has a solution, and assume that this solution is unique. Show that it is possible to select a subset of n of these equations which uniquely determine the solution.


Proof. (From the textbook’s solution, page 280) Suppose we drop the ith equation; if the remaining equations do not determine x uniquely, there is an x ≠ 0 that is mapped into a vector whose components, except possibly the ith, are zero. If this were true for all i = 1, · · · , m, the range of the mapping x → u would be m-dimensional; but according to Theorem 2, the dimension of the range is ≤ n < m. Therefore one of the equations may be dropped without losing uniqueness; by induction, m − n of the equations may be omitted.

Alternative solution: Uniqueness of the solution x implies the column vectors of the matrix T = (tij) are linearly independent. Since the column rank of a matrix equals its row rank (see Chapter 3, Theorem 6 and Chapter 4, Theorem 2), it is possible to select a subset of n of these equations which uniquely determines the solution.

Remark 3. The textbook’s solution is a proof that the column rank of a matrix equals its row rank.

I 3. (page 25) Prove Theorem 3.

(i)

Proof. S ◦ T(ax + by) = S(T(ax + by)) = S(aT(x) + bT(y)) = aS(T(x)) + bS(T(y)) = aS ◦ T(x) + bS ◦ T(y). So S ◦ T is also a linear mapping.

(ii)

Proof. (R + S) ◦ T (x) = (R + S)(T (x)) = R(T (x)) + S(T (x)) = (R ◦ T + S ◦ T )(x) and S ◦ (T + P )(x) =S((T + P )(x)) = S(T (x) + P (x)) = S(T (x)) + S(P (x)) = (S ◦ T + S ◦ P )(x).

I 4. (page 25) Show that S and T in Examples 8 and 9 are linear and that ST ≠ TS.

Proof. For Example 8, the linearity of S and T is easy to see. To see the non-commutativity, consider the polynomial p(s) = s. We have TS(s) = T(s²) = 2s ≠ s = S(1) = ST(s). So ST ≠ TS.

For Example 9, for any x = (x1, x2, x3) ∈ X, S(x) = (x1, x3, −x2) and T(x) = (x3, x2, −x1). So it’s easy to see S and T are linear. To see the non-commutativity, note ST(x) = S(x3, x2, −x1) = (x3, −x1, −x2) and TS(x) = T(x1, x3, −x2) = (−x2, x3, −x1). So ST ≠ TS in general.

Remark 4. Note the problem does not specify the direction of the rotation, so it is also possible that S(x) = (x1, −x3, x2) and T(x) = (−x3, x2, x1). There are a total of four choices of (S, T), and each of the corresponding proofs is similar to the one presented above.

I 5. (page 25) Show that if T is invertible, TT−1 is the identity.

Proof. TT−1(x) = T (T−1(x)) = x by definition. So TT−1 = id.

I 6. (page 25) Prove Theorem 4.

(i)

Proof. Suppose T : X → U is invertible. Then for any y, y′ ∈ U, there exist a unique x ∈ X and a unique x′ ∈ X such that T(x) = y and T(x′) = y′. So T(x + x′) = T(x) + T(x′) = y + y′ and, by the injectivity of T, T⁻¹(y + y′) = x + x′ = T⁻¹(y) + T⁻¹(y′). For any k ∈ K, since T(kx) = kT(x) = ky, injectivity of T implies T⁻¹(ky) = kx = kT⁻¹(y). Combined, we conclude T⁻¹ is linear.

(ii)

Proof. Suppose T : X → U and S : U → V. First, by the definition of multiplication, ST is a linear map. Second, if x ∈ X is such that ST(x) = 0 ∈ V, the injectivity of S implies T(x) = 0 ∈ U and the injectivity of T further implies x = 0 ∈ X. So ST is one-to-one. For any z ∈ V, there exists y ∈ U such that S(y) = z. Also, we can find x ∈ X such that T(x) = y. So ST(x) = S(y) = z. This shows ST is onto. Combined, we conclude ST is invertible.

By associativity, we have (ST)(T⁻¹S⁻¹) = ((ST)T⁻¹)S⁻¹ = (S(TT⁻¹))S⁻¹ = SS⁻¹ = id_V. Replacing S with T⁻¹ and T with S⁻¹, we also have (T⁻¹S⁻¹)(ST) = id_X. Therefore, we can conclude (ST)⁻¹ = T⁻¹S⁻¹.


I 7. (page 26) Show that whenever meaningful,

(ST )′ = T ′S′, (T +R)′ = T ′ +R′, and (T−1)′ = (T ′)−1.

(i)

Proof. Suppose T : X → U and S : U → V are linear maps. Then for any given l ∈ V′, ((ST)′l, x) = (l, STx) = (S′l, Tx) = (T′S′l, x), ∀x ∈ X. Therefore, (ST)′l = T′S′l. Letting l run through every element of V′, we conclude (ST)′ = T′S′.

(ii)

Proof. Suppose T and R are both linear maps from X to U. For any given l ∈ U′, we have ((T + R)′l, x) = (l, (T + R)x) = (l, Tx + Rx) = (l, Tx) + (l, Rx) = (T′l, x) + (R′l, x) = ((T′ + R′)l, x), ∀x ∈ X. Therefore (T + R)′l = (T′ + R′)l. Letting l run through every element of U′, we conclude (T + R)′ = T′ + R′.

(iii)

Proof. Suppose T is an isomorphism from X to U; then T⁻¹ is a well-defined linear map. We first show T′ is an isomorphism from U′ to X′. Indeed, if l ∈ U′ is such that T′l = 0, then for any x ∈ X, 0 = (T′l, x) = (l, Tx). As x varies and goes through every element of X, Tx goes through every element of U. By considering the identification of U with U′′, we conclude l = 0. So T′ is one-to-one. For any given m ∈ X′, define l = mT⁻¹; then l ∈ U′. For any x ∈ X, we have (m, x) = (m, T⁻¹(Tx)) = (l, Tx) = (T′l, x). Since x is arbitrary, m = T′l, and T′ is therefore onto. Combined, we conclude T′ is an isomorphism from U′ to X′, and (T′)⁻¹ is hence well-defined.

By part (i), (T⁻¹)′T′ = (TT⁻¹)′ = (id_U)′ = id_{U′} and T′(T⁻¹)′ = (T⁻¹T)′ = (id_X)′ = id_{X′}. This shows (T⁻¹)′ = (T′)⁻¹.

I 8. (page 26) Show that if X ′′ is identified with X and U ′′ with U via (5) in Chapter 2, then

T ′′ = T.

Proof. Suppose ξ : X → X′′ and η : U → U′′ are the isomorphisms defined in Chapter 2, formula (5), which identify X with X′′ and U with U′′, respectively. Then for any x ∈ X and l ∈ U′, we have

(T′′ξx, l) = (ξx, T′l) = (T′l, x) = (l, Tx) = (ηTx, l).

Since l is arbitrary, we must have T′′ξx = ηTx, ∀x ∈ X. Hence, T′′ ◦ ξ = η ◦ T, which is the precise interpretation of T′′ = T.

I 9. (page 28) Show that if A in L(X,X) is a left inverse of B in L(X,X), that is, AB = I, then it is also a right inverse: BA = I.

Proof. If Bx = 0, by applying A to both sides of the equation and using AB = I, we conclude x = 0. So B is injective. By Corollary B of Theorem 2, B is surjective. Therefore the inverse of B, denoted by B⁻¹, exists, and A = A(BB⁻¹) = (AB)B⁻¹ = IB⁻¹ = B⁻¹, which implies BA = I.

Remark 5. For a general algebraic structure, e.g. a ring with unit, it is not always the case that an element with a left inverse also has a right inverse. In the proof above, we used the fact that for a finite dimensional linear vector space, a linear mapping is injective if and only if it is surjective.

I 10. (page 30) Show that if M is invertible, and similar to K, then K also is invertible, and K−1 is similarto M−1.

Proof. Suppose K = M_S = SMS⁻¹. Then K(M⁻¹)_S = SMS⁻¹ · SM⁻¹S⁻¹ = I. By Exercise 9, K is also invertible and K⁻¹ = (M⁻¹)_S.

I 11. (page 30) Prove Theorem 9.


Proof. Suppose A is invertible, we have AB = AB(AA−1) = A(BA)A−1. So AB and BA are similar. Thecase of B being invertible can be proved similarly.

I 12. (page 31) Show that P defined above is a linear map, and that it is a projection.

Proof. For any α, β ∈ K and x = (x1, · · · , xn), y = (y1, · · · , yn), we have

P(αx + βy) = P((αx1 + βy1, · · · , αxn + βyn))
= (0, 0, αx3 + βy3, · · · , αxn + βyn)
= (0, 0, αx3, · · · , αxn) + (0, 0, βy3, · · · , βyn)
= α(0, 0, x3, · · · , xn) + β(0, 0, y3, · · · , yn)
= αP(x) + βP(y).

This shows P is a linear map. Furthermore, we have

P²(x) = P((0, 0, x3, · · · , xn)) = (0, 0, x3, · · · , xn) = P(x).

So P is a projection.

I 13. (page 31) Prove that P defined above is linear, and that it is a projection.

Proof. For any α, β ∈ K and f, g ∈ C[−1, 1], we have

P(αf + βg)(x) = (1/2)[(αf + βg)(x) + (αf + βg)(−x)]
= (α/2)[f(x) + f(−x)] + (β/2)[g(x) + g(−x)]
= αP(f)(x) + βP(g)(x).

This shows P is a linear map. Furthermore, we have

(P²f)(x) = (P(Pf))(x) = P((f(·) + f(−·))/2)(x) = (1/2)[(f(x) + f(−x))/2 + (f(−x) + f(x))/2] = (1/2)(f(x) + f(−x)) = (Pf)(x).

So P is a projection.

I 14. (page 31) Suppose T is a linear map of rank 1 of a finite dimensional vector space into itself.

(a) Show there exists a unique number c such that T² = cT.
(b) Show that if c ≠ 1 then I − T has an inverse. (As usual, I denotes the identity map Ix = x.)

(a)

Proof. Since dim R_T = 1 and T maps R_T into itself, it suffices to prove the following claim: if T is a linear map on a 1-dimensional linear vector space X, there exists a unique number c such that T(x) = cx, ∀x ∈ X. Indeed, once T restricted to R_T is multiplication by c, then for any x, Tx ∈ R_T and T²x = T(Tx) = cTx, i.e. T² = cT; and c is unique since T ≠ 0. To prove the claim, we assume the underlying field K is either R or C, and we assume S : X → K is an isomorphism. Then S ◦ T ◦ S⁻¹ is a linear map on K. Define c = S ◦ T ◦ S⁻¹(1); we have

S ◦ T ◦ S⁻¹(k) = S ◦ T ◦ S⁻¹(k · 1) = k · c, ∀k ∈ K.

So T ◦ S⁻¹(k) = S⁻¹(c · k) = cS⁻¹(k), ∀k ∈ K. This shows T is a scalar multiplication.

(b)

Proof. If c ≠ 1, it’s easy to verify that I + (1/(1−c))T is the inverse of I − T: (I − T)(I + (1/(1−c))T) = I − T + (1/(1−c))(T − T²) = I − T + (1/(1−c))(1 − c)T = I.


I 15. (page 31) Suppose T and S are linear maps of a finite dimensional vector space into itself. Show that the rank of ST is less than or equal to the rank of S. Show that the dimension of the nullspace of ST is less than or equal to the sum of the dimensions of the nullspaces of S and of T.

Proof. Because R_{ST} ⊂ R_S, rank(ST) = dim R_{ST} ≤ dim R_S = rank(S). Moreover, since the column rank of a matrix equals its row rank (see Chapter 3, Theorem 6 and Chapter 4, Theorem 2), we have rank(ST) = rank(T′S′) ≤ rank(T′) = rank(T). Combined, we conclude rank(ST) ≤ min{rank(S), rank(T)}.

Also, we note N_{ST}/N_T is isomorphic to N_S ∩ R_T, with the isomorphism defined by ϕ({x}) = Tx, where {x} := x + N_T. It’s easy to see ϕ is well-defined, linear, and both injective and surjective. So by Theorem 6 of Chapter 1,

dim N_{ST} = dim N_T + dim N_{ST}/N_T = dim N_T + dim(N_S ∩ R_T) ≤ dim N_T + dim N_S.

Remark 6. The result rank(ST ) ≤ min{rank(S), rank(T )} is used in econometrics. Cf. Greene [4, page 985]Appendix A.

4 Matrices

The book’s own solution gives answers to Ex 1, 2, 4.

I 1. (page 35) Let A be an arbitrary m × n matrix, and let D be an m × n diagonal matrix,

Dij = di if i = j, 0 if i ≠ j.

Show that the ith row of DA equals di times the ith row of A, and show that the jth column of AD equals dj times the jth column of A.

Proof. It appears the phrasing of the exercise is problematic: when m ≠ n, AD or DA may not be well-defined. So we will assume m = n below. We can write A in the row form

A = [ r1 ]
    [ r2 ]
    [ ⋮  ]
    [ rn ]

Then DA can be written as

DA = [ d1  0   · · ·  0  ] [ r1 ]   [ d1r1 ]
     [ 0   d2  · · ·  0  ] [ r2 ] = [ d2r2 ]
     [ · · ·             ] [ ⋮  ]   [  ⋮   ]
     [ 0   0   · · ·  dn ] [ rn ]   [ dnrn ]

We can also write A in the column form [c1, c2, · · · , cn]; then AD can be written as

AD = [c1, c2, · · · , cn] diag(d1, d2, · · · , dn) = [d1c1, d2c2, · · · , dncn].

I 2. (page 37) Look up in any text the proof that the row rank of a matrix equals its column rank, and compare it to the proof given in the present text.

Proof. Proofs in most textbooks are lengthy and complicated. For a clear, although still lengthy, proof, see 丘维声 [12, page 112], Theorem 3.5.3.


I 3. (page 38) Show that the product of two matrices in 2 × 2 block form can be evaluated as

[ A11  A12 ] [ B11  B12 ]   [ A11B11 + A12B21  A11B12 + A12B22 ]
[ A21  A22 ] [ B21  B22 ] = [ A21B11 + A22B21  A21B12 + A22B22 ]

Proof. The calculation is a bit messy. We refer the reader to 丘维声 [12, page 190], Theorem 4.6.1.

I 4. (page 38) Construct two 2 × 2 matrices A and B such that AB = 0 but BA ≠ 0.

Proof. Let

A = [ 1  1 ]     B = [  1   2 ]
    [ 0  0 ],        [ −1  −2 ].

Then AB = 0, yet

BA = [  1   1 ] ≠ 0.
     [ −1  −1 ]
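
A direct numerical confirmation (illustrative, not part of the original solution):

import numpy as np

A = np.array([[1, 1], [0, 0]])
B = np.array([[1, 2], [-1, -2]])
print(A @ B)  # [[0 0], [0 0]]
print(B @ A)  # [[ 1  1], [-1 -1]], which is nonzero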

I 5. (page 40) Show that x1, x2, x3, and x4 given by (20)j satisfy all four equations (20).

Proof.

[ 1  2  3  −1 ] [  1 ]   [ 1·1 + 2·2 + 3·(−2) + (−1)·1 ]   [ −2 ]
[ 2  5  4  −3 ] [  2 ]   [ 2·1 + 5·2 + 4·(−2) + (−3)·1 ]   [  1 ]
[ 2  3  4   1 ] [ −2 ] = [ 2·1 + 3·2 + 4·(−2) + 1·1    ] = [  1 ]
[ 1  4  2  −2 ] [  1 ]   [ 1·1 + 4·2 + 2·(−2) + (−2)·1 ]   [  3 ]

I 6. (page 41) Choose values of u1, u2, u3, u4 so that condition (23) is satisfied, and determine all solutionsof equations (22).

Proof. We choose u1 = u2 = u3 = 1 and u4 = 2. Then x3 = −5x4 − u3 − u2 + 3u1 = −5x4 + 1,x2 = 7x4 + u4 − 3u1 = 7x4 − 1, and x1 = u1 − x2 − 2x3 − 3x4 = 1− (7x4 − 1)− 2(−5x4 + 1)− 3x4 = 0.

I 7. (page 41) Verify that l = (1,−2,−1, 1) is a left nullvector of M :

lM = 0.

Proof.

[1, −2, −1, 1] [ 1  1  2  3 ]
               [ 1  2  3  1 ]
               [ 2  1  2  3 ]
               [ 3  4  6  2 ]

= [1·1 − 2·1 − 1·2 + 1·3, 1·1 − 2·2 − 1·1 + 1·4, 1·2 − 2·3 − 1·2 + 1·6, 1·3 − 2·1 − 1·3 + 1·2]
= 0.
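
The same computation in numpy (illustrative, not part of the original solution):

import numpy as np

l = np.array([1, -2, -1, 1])
M = np.array([[1, 1, 2, 3],
              [1, 2, 3, 1],
              [2, 1, 2, 3],
              [3, 4, 6, 2]])
print(l @ M)  # [0 0 0 0]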

I 8. (page 42) Show by Gaussian elimination that the only left nullvectors of M are multiples of l in Exercise7, and then use Theorem 5 of Chapter 3 to show that condition (23) is sufficient for the solvability of thesystem (22).

Proof. Suppose a row vector x = (x1, x2, x3, x4) satisfies xM = 0. Then we can proceed by Gaussian elimination:

x1 + x2 + 2x3 + 3x4 = 0
x1 + 2x2 + x3 + 4x4 = 0
2x1 + 3x2 + 2x3 + 6x4 = 0
3x1 + x2 + 3x3 + 2x4 = 0

⟹ (eliminating x1)

x2 − x3 + x4 = 0
x2 − 2x3 = 0
−2x2 − 3x3 − 7x4 = 0

⟹ (eliminating x2)

−x3 − x4 = 0
−5x3 − 5x4 = 0.


So we have x1 = x4, x2 = −2x4, and x3 = −x4, i.e. x = x4(1, −2, −1, 1), a multiple of l in Exercise 7.

Equation (22) has a solution if and only if u = (u1, u2, u3, u4)ᵀ is in R_M. By Theorem 5 of Chapter 3, this is equivalent to yu = 0, ∀y ∈ N_{M′} (elements of N_{M′} are seen as row vectors). We have proved y is a multiple of l. Hence condition (23), which is just lu = 0, is sufficient for the solvability of the system (22).

5 Determinant and Trace

The book’s own solution gives answers to Ex 1, 2, 3, 4, 5.

⋆ Comments:
1) For a more intuitive proof of Theorem 2 (det(BA) = det A det B), see Munkres [10, page 18], Theorem 2.10.

2) The following proposition is one version of Cramer’s rule and will be used in the proof of Lemma 6, Chapter 6 (formula (21) on page 68).

Proposition 5.1. Let A be an n × n matrix and let B be the matrix of cofactors of A; that is,

Bij = (−1)^{i+j} det Aji,

where Aji is the (ji)th minor of A. Then AB = BA = det A · I_{n×n}.

Proof. Suppose A has the column form A = (a1, · · · , an). By replacing the jth column of A with the ith column, we obtain

M = (a1, · · · , aj−1, ai, aj+1, · · · , an).

On one hand, Property (i) of a determinant gives det M = δij det A; on the other hand, Laplace expansion of a determinant gives

det M = ∑_{k=1}^{n} (−1)^{k+j} a_{ki} det A_{kj} = ∑_{k=1}^{n} a_{ki} B_{jk} = (B_{j1}, · · · , B_{jn}) (a_{1i}, · · · , a_{ni})ᵀ = (BA)_{ji}.

Combined, we can conclude det A · I_{n×n} = BA. A similar argument, replacing the ith row with the jth row and expanding along rows, gives AB = det A · I_{n×n}.
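
The matrix B in Proposition 5.1 is the adjugate (classical adjoint) of A, which sympy exposes as Matrix.adjugate; a spot check on a sample matrix (illustrative, not part of the original text):

import sympy as sp

A = sp.Matrix([[1, 2, 3], [0, 4, 5], [1, 0, 6]])  # arbitrary sample matrix
B = A.adjugate()  # B[j, k] = (-1)**(j+k) * det of the (k, j) minor
d = A.det()
assert A*B == d*sp.eye(3) and B*A == d*sp.eye(3)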

I 1. (page 47) Prove properties (7).

(a)

Proof. By formula (5), |P(x1, · · · , xn)| = ∏_{i<j} |xi − xj|; since a permutation p merely permutes the unordered pairs {i, j}, |P(p(x1, · · · , xn))| = |P(x1, · · · , xn)|. By formula (6), |P(p(x1, · · · , xn))| = |σ(p)| |P(x1, · · · , xn)|. Combined, we conclude |σ(p)| = 1.

(b)

Proof. By definition, we have

P(p1 ◦ p2(x1, · · · , xn)) = P(p1(p2(x1, · · · , xn))) = σ(p1)P(p2(x1, · · · , xn)) = σ(p1)σ(p2)P(x1, · · · , xn).

So σ(p1 ◦ p2) = σ(p1)σ(p2).

I 2. (page 48) Prove (c) and (d) above.


Proof. To see (c) is true, we suppose t interchanges i0 and j0. Without loss of generality, we assume i0 < j0. Swapping x_{i0} and x_{j0} permutes the factors (xi − xj) with {i, j} ≠ {i0, j0} among themselves, with sign changes occurring in pairs, so

P(t(x1, · · · , xn)) = P(x1, · · · , xj0, · · · , xi0, · · · , xn)
= (xj0 − xi0) ∏_{i<j, (i,j)≠(i0,j0)} (xi − xj)
= −∏_{i<j} (xi − xj)
= −P(x1, · · · , xn).

So σ(t) = −1.

To see (d) is true, note formula (9) is equivalent to id = tk ◦ · · · ◦ t1 ◦ p⁻¹. Acting with these operations on (1, · · · , n), we have (1, · · · , n) = tk ◦ · · · ◦ t1(p⁻¹(1), · · · , p⁻¹(n)). Then the problem is reduced to proving that a sequence of transpositions can sort an array of numbers into ascending order. There are many ways to achieve that. For example, we can let t1 be the transposition that interchanges p⁻¹(1) and p⁻¹(i0), where i0 satisfies p⁻¹(i0) = 1; that is, t1 puts 1 in the first position of the sequence. Then we let t2 be the transposition that puts 2 in the second position. We continue this procedure until the whole sequence is sorted. This shows sorting can be accomplished by a sequence of transpositions (see the sketch below).
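
Here is a small Python sketch of that selection procedure (an illustration, not part of the original solution); it records the transpositions used and confirms the parity matches the sign computed by inversion counting:

def transpositions_to_sort(seq):
    """Return index pairs of transpositions that sort seq into ascending order."""
    seq = list(seq)
    ts = []
    for k in range(len(seq)):
        i = seq.index(k + 1)          # current position of the value k+1
        if i != k:
            seq[k], seq[i] = seq[i], seq[k]
            ts.append((k, i))
    return ts

perm = [3, 1, 4, 2]                   # sample permutation of 1..4
ts = transpositions_to_sort(perm)
inversions = sum(perm[i] > perm[j]
                 for i in range(len(perm)) for j in range(i + 1, len(perm)))
assert (-1)**len(ts) == (-1)**inversions  # parity of k is determined by p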

I 3. (page 48) Show that the decomposition (9) is not unique, but that the parity of the number k of factors is unique.

Proof. For any transposition t, we have t ◦ t = id. So if p = tk ◦ · · · ◦ t1, we can get another decomposition p = tk ◦ · · · ◦ t1 ◦ t ◦ t. This shows the decomposition is not unique.

Suppose the permutation p has two different decompositions into transpositions: p = tk ◦ · · · ◦ t1 = t′m ◦ · · · ◦ t′1. By formula (7), part (b), and formula (8), σ(p) = (−1)^k = (−1)^m. So k − m is an even number. This shows the parity of the number of factors is unique.

I 4. (page 49) Show that D defined by (16) has Properties (ii), (iii) and (iv).

Proof. To verify Property (ii), note that for any index j and α, β ∈ K, we have

D(a1, · · · , αa′j + βaj, · · · , an) = ∑_p σ(p) a_{p(1)1} · · · (α a′_{p(j)j} + β a_{p(j)j}) · · · a_{p(n)n}
= α ∑_p σ(p) a_{p(1)1} · · · a′_{p(j)j} · · · a_{p(n)n} + β ∑_p σ(p) a_{p(1)1} · · · a_{p(j)j} · · · a_{p(n)n}
= α D(a1, · · · , a′j, · · · , an) + β D(a1, · · · , aj, · · · , an).

To verify Property (iii), note e_{p(1)1} · · · e_{p(n)n} is non-zero if and only if p(i) = i for every 1 ≤ i ≤ n, in which case the product is 1.

To verify Property (iv), note that for any i ≠ j, if we denote by t the transposition that interchanges i and j, then p ↦ p ◦ t is a one-to-one and onto map from the set of all permutations to itself. Therefore, we have

D(a1, · · · , ai, · · · , aj, · · · , an) = ∑_p σ(p) a_{p(1)1} · · · a_{p(i)i} · · · a_{p(j)j} · · · a_{p(n)n}
= ∑_p (−1) σ(p◦t) a_{(p◦t)(1)1} · · · a_{(p◦t)(j)i} · · · a_{(p◦t)(i)j} · · · a_{(p◦t)(n)n}
= (−1) ∑_q σ(q) a_{q(1)1} · · · a_{q(i)j} · · · a_{q(j)i} · · · a_{q(n)n}
= −D(a1, · · · , aj, · · · , ai, · · · , an).


I 5. (page 49) Show that Property (iv) implies Property (i), unless the field K has characteristic two, that is, 1 + 1 = 0.

Proof. By Property (iv), D(a1, · · · , ai, · · · , ai, · · · , an) = −D(a1, · · · , ai, · · · , ai, · · · , an). Adding D(a1, · · · , ai, · · · , ai, · · · , an) to both sides, we have 2D(a1, · · · , ai, · · · , ai, · · · , an) = 0. If the characteristic of the field K is not two, we can conclude D(a1, · · · , ai, · · · , ai, · · · , an) = 0.

Remark 7. This exercise and Exercise 4 of this chapter together show formula (16) is equivalent to Properties (i)-(iii), provided the characteristic of K is not two. Therefore, for K = R or C, we can either use (16) or Properties (i)-(iii) as the definition of the determinant.

I 6. (page 52) Verify that C(A11) has properties (i)-(iii).

Proof. If two column vectors ai and aj (i ≠ j) of A11 are equal, then the corresponding columns (0, ai)ᵀ and (0, aj)ᵀ are equal, so C(A11) = 0 and Property (i) is satisfied. Since any linear operation on a column vector ai of A11 extends naturally to (0, ai)ᵀ, Property (ii) is also satisfied. Finally, we note when A11 = I_{(n−1)×(n−1)},

[ 1  0   ]
[ 0  A11 ] = I_{n×n}.

So Property (iii) is satisfied.

I 7. (page 52) Deduce Corollary 5 from Lemma 4.

Proof. We first move the jth column to the position of the first column. This can be done by interchanging neighboring columns (j − 1) times. The determinant of the resulting matrix A1 is (−1)^{j−1} det A. Then we move the ith row to the position of the first row. This can be done by interchanging neighboring rows (i − 1) times. The resulting matrix A2 has determinant (−1)^{i−1} det A1 = (−1)^{i+j} det A. On the other hand, A2 has the form

[ 1  ∗   ]
[ 0  Aij ]

By Lemma 4, we have det Aij = det A2 = (−1)^{i+j} det A. So det A = (−1)^{i+j} det Aij.

Remark 8. Rigorously speaking, we only proved that swapping two neighboring columns will give a minus sign to the determinant (Property (iv)), but we haven’t proved this property for neighboring rows. This can be made rigorous by using det A = det Aᵀ (Exercise 8 of this chapter).

I 8. (page 54) Show that for any square matrix,

det Aᵀ = det A, Aᵀ = transpose of A.

[Hint: Use formula (16) and show that for any permutation, σ(p) = σ(p⁻¹).]

Proof. We first show that for any permutation p, σ(p) = σ(p⁻¹). Indeed, by formula (7)(b), we have 1 = σ(id) = σ(p ◦ p⁻¹) = σ(p)σ(p⁻¹); by formula (7)(a), we conclude σ(p) = σ(p⁻¹). Second, we denote by bij the (i, j)th entry of Aᵀ. Then bij = aji. By formula (16) and the fact that p ↦ p⁻¹ is a one-to-one and onto map from the set of all permutations to itself, we have

det Aᵀ = ∑_p σ(p) b_{p(1)1} · · · b_{p(n)n}
= ∑_p σ(p) a_{1p(1)} · · · a_{np(n)}
= ∑_p σ(p⁻¹) a_{p⁻¹(p(1))p(1)} · · · a_{p⁻¹(p(n))p(n)}
= ∑_p σ(p⁻¹) a_{p⁻¹(1)1} · · · a_{p⁻¹(n)n}
= det A.


I 9. (page 54) Given a permutation p of n objects, we define an associated so-called permutation matrix P as follows:

Pij = 1 if j = p(i), 0 otherwise.

Show that the action of P on any vector x performs the permutation p on the components of x. Show that if p, q are two permutations and P, Q are the associated permutation matrices, then the permutation matrix associated with p ◦ q is the product PQ.

Proof. By Exercise 2, it suffices to prove the property for transpositions. Suppose p interchanges i1, i2 and q interchanges j1, j2. Denote by P and Q the corresponding permutation matrices, respectively. Then for any x = (x1, · · · , xn)ᵀ ∈ Rⁿ, we have (δij is the Kronecker sign)

(Px)i = ∑_j Pij xj = ∑_j δ_{p(i)j} xj = x_{i2} if i = i1, x_{i1} if i = i2, and xi otherwise.

This shows the action of P on any column vector x performs the permutation p on the components of x. Similarly, we have

(Qx)i = x_{j2} if i = j1, x_{j1} if i = j2, and xi otherwise.

Since (PQ)(x) = P(Q(x)), the action of the matrix PQ on x performs first the permutation q and then the permutation p on the components of x. Therefore, the permutation matrix associated with p ◦ q is the product of P and Q.

I 10. (page 56) Let A be an m× n matrix, B an n×m matrix. Show that

trAB = trBA

Proof.

tr(AB) = ∑_{i=1}^{m} (AB)ii = ∑_{i=1}^{m} ∑_{j=1}^{n} aij bji = ∑_{j=1}^{m} ∑_{i=1}^{n} aji bij = ∑_{i=1}^{n} ∑_{j=1}^{m} bij aji = ∑_{i=1}^{n} (BA)ii = tr(BA),

where the third equality is obtained by interchanging the names of the indices i, j.

I 11. (page 56) Let A be an n× n matrix, AT its transpose. Show that

tr AAᵀ = ∑_{i,j} a²ij.

The square root of the double sum on the right is called the Euclidean, or Hilbert-Schmidt, norm of the matrix A.

Proof.

tr(AAᵀ) = ∑_i (AAᵀ)ii = ∑_i ∑_j Aij Aᵀji = ∑_i ∑_j aij aij = ∑_{i,j} a²ij.

I 12. (page 56) Show that the determinant of the 2 × 2 matrix

[ a  b ]
[ c  d ]

is D = ad − bc.


Proof. Apply Laplace expansion of a determinant according to its columns (Theorem 6).

I 13. (page 56) Show that the determinant of an upper triangular matrix, one whose elements are zero below the main diagonal, equals the product of its elements along the diagonal.

Proof. Apply Laplace expansion of a determinant according to its columns (Theorem 6) and work by induction.

I 14. (page 57) How many multiplications does it take to evaluate det A by using Gaussian elimination to bring it into upper triangular form?

Proof. Denote by M(n) the number of multiplications needed to evaluate det A of an n × n matrix A by using Gaussian elimination to bring it into upper triangular form. To use the first row to eliminate a21, a31, · · · , an1, we need n(n − 1) multiplications. So M(n) = n(n − 1) + M(n − 1) with M(1) = 0. Hence

M(n) = ∑_{k=1}^{n} k(k − 1) = n(n+1)(2n+1)/6 − n(n+1)/2 = (n−1)n(n+1)/3.
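
A quick check of the closed form against the recursion (illustrative, not part of the original solution):

def M(n):
    # M(n) = n(n - 1) + M(n - 1), M(1) = 0
    return 0 if n == 1 else n*(n - 1) + M(n - 1)

assert all(M(n) == (n - 1)*n*(n + 1)//3 for n in range(1, 30))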

I 15. (page 57) How many multiplications does it take to evaluate detA by formula (16)?

Proof. Denote by M(n) the number of multiplications needed to evaluate the determinant of an n × n matrix by formula (16). Then M(n) = nM(n − 1). So M(n) = n!.

I 16. (page 57) Show that the determinant of a 3 × 3 matrix

A = [ a  b  c ]
    [ d  e  f ]
    [ g  h  i ]

can be calculated as follows. Copy the first two columns of A as a fourth and fifth column:

[ a  b  c  a  b ]
[ d  e  f  d  e ]
[ g  h  i  g  h ]

det A = aei + bfg + cdh − gec − hfa − idb.

Show that the sum of the products of the three entries along the dexter diagonals, minus the sum of the products of the three entries along the sinister diagonals, is equal to the determinant of A.

Proof. We apply Laplace expansion of a determinant according to its columns (Theorem 6):

det A = a det [ e  f ] − d det [ b  c ] + g det [ b  c ]
              [ h  i ]         [ h  i ]         [ e  f ]
= a(ie − fh) − d(ib − ch) + g(bf − ce)
= aei + bfg + cdh − gec − afh − idb.

6 Spectral Theory

The book’s own solution gives answers to Ex 2, 5, 7, 8, 12.

⋆ Comments:
1) λ ∈ C is an eigenvalue of a square matrix A if and only if it is a root of the characteristic polynomial pA(a) = det(aI − A) (Corollary 3 of Chapter 5). The spectral mapping theorem (Theorem 4) extends this result further to polynomials of A.


2) The proof of Lemma 6 in this chapter (formula (21) on page 68) used Proposition 5.1 (see the Comments of Chapter 5).

3) On p.72, the fact that from a certain index on, the Nd’s become equal can be seen from the following line of reasoning. Assume N_{d−1} ≠ N_d while N_d = N_{d+1}. For any x ∈ N_{d+2}, we have (A − aI)x ∈ N_{d+1} = N_d. So x ∈ N_{d+1}, i.e. N_{d+2} = N_{d+1}. Then we work by induction.

4) Theorem 12 can be enhanced to a statement on necessary and sufficient conditions, which leads to the Jordan canonical form (see Appendix A.15 for details).

⋆ Supplementary notes:
1) The minimal polynomial is defined from the algebraic point of view as the generator of the polynomial ideal {p : p(A) = 0}. So the powers of its linear factors are given algebraically. Meanwhile, the index of an eigenvalue is defined from the geometric point of view. Theorem 11 says they are equal.

2) As a corollary of Theorem 11, we claim an n × n matrix A can be diagonalized over the field F if and only if its minimal polynomial can be decomposed into the product of distinct linear factors (polynomials of degree 1 over the field F). Indeed, by the uniqueness of the minimal polynomial, we have

mA is the product of distinct linear factors
⟺ Fⁿ = ⊕_{j=1}^{k} N1(aj)
⟺ Fⁿ has a basis {xi}_{i=1}^{n} consisting of eigenvectors of A
⟺ A can be diagonalized by the matrix U = (x1, · · · , xn), such that U⁻¹AU = diag{λ1, · · · , λn}.

The above sequence of equivalences also gives the steps to diagonalize a matrix A (see the sketch after this list):
i) Compute the characteristic polynomial pA(s) = det(sI − A).
ii) Solve the equation pA(s) = 0 in F to obtain the eigenvalues of A: a1, · · · , ak.
iii) For each aj (j = 1, · · · , k), solve the homogeneous equation (ajI − A)x = 0 to obtain the eigenvectors pertaining to aj: xj1, · · · , xjmj, where mj = dim N1(aj).
iv) If ∑_{j=1}^{k} mj < n, A cannot be diagonalized in F. If ∑_{j=1}^{k} mj = n, A can be diagonalized by the matrix

U = (x11, · · · , x1m1, x21, · · · , x2m2, · · · , xk1, · · · , xkmk),

such that U⁻¹AU = diag{a1, · · · , a1, · · · , ak, · · · , ak}, where each aj appears dim N1(aj) times.
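
Steps i)-iv) can be carried out mechanically; below is a minimal sympy sketch on the sample matrix of Example 1 of this chapter (an illustration of the procedure, not part of the original text):

import sympy as sp

A = sp.Matrix([[3, 2], [1, 4]])
s = sp.symbols('s')
p_A = (s*sp.eye(2) - A).det()                # i) characteristic polynomial
eigenvalues = sp.roots(p_A, s)               # ii) {2: 1, 5: 1}
vecs = []
for a_j in eigenvalues:
    vecs += (a_j*sp.eye(2) - A).nullspace()  # iii) eigenvectors for a_j
U = sp.Matrix.hstack(*vecs)                  # iv) here m_1 + m_2 = 2 = n
assert (U.inv()*A*U).is_diagonal()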

3) Suppose X is a linear subspace of Fⁿ that is invariant under the n × n matrix A. If A can be diagonalized, the matrix corresponding to A|X can also be diagonalized. This is due to the observation that X = ⊕_{j=1}^{k} (N1(aj) ∩ X).

4) We summarize several relationships among index, algebraic multiplicity, geometric multiplicity, and the dimension of the space of generalized eigenvectors pertaining to a given eigenvalue. The first result is an elementary proof of Lemma 10, Chapter 9, page 132.

Proposition 6.1 (Geometric and algebraic multiplicities). Let A be an n × n matrix over a field F and α an eigenvalue of A. If m(α) is the multiplicity of α as a root of the characteristic polynomial pA of A, then dim N1(α) ≤ m(α).

m(α) is called the algebraic multiplicity of α; dim N1(α) is called the geometric multiplicity of α and is the dimension of the linear space spanned by the eigenvectors pertaining to α. So this result says "geometric multiplicity dim N1(α) ≤ algebraic multiplicity m(α)".

Proof. Let v1, · · · , vs be a basis of N1(α) and extend it to a basis of Fⁿ: v1, · · · , vs, u1, · · · , ur. Define U = (v1, · · · , vs, u1, · · · , ur). Then

U⁻¹AU = U⁻¹A(v1, · · · , vs, u1, · · · , ur)
= U⁻¹(αv1, · · · , αvs, Au1, · · · , Aur)
= (αU⁻¹v1, · · · , αU⁻¹vs, U⁻¹Au1, · · · , U⁻¹Aur).


Because U⁻¹U = I, we must have

U⁻¹AU = [ αI_{s×s}  B ]
        [ 0         C ]

and

det(λI − A) = det(λI − U⁻¹AU) = det [ (λ−α)I_{s×s}  −B                    ] = (λ − α)^s det(λI − C)
                                    [ 0             λI_{(n−s)×(n−s)} − C ]

(for the last equality, see, for example, Munkres [10, page 24], Problem 6, or 蓝以中 [6, page 173]). So s ≤ m(α).

We continue to use the notation from Proposition 6.1, and we define d(α) as the index of α. Then we have

Proposition 6.2 (Index and algebraic multiplicity). d(α) ≤ m(α).

Proof. Let q(s) = pA(s)/(s − α)^m. Then Cⁿ = N_{pA} = N_q ⊕ N_m(α). Any v ∈ N_{m+1}(α) can be uniquely written as v = v′ + v″ with v′ ∈ N_q and v″ ∈ N_m(α). Then v′ = v − v″ ∈ N_{m+1}(α) ∩ N_q. Similar to the second part of the proof of Lemma 9, we can show v′ = 0. So v = v″ ∈ N_m(α). This shows N_{m+1}(α) = N_m(α) and hence d(α) ≤ m(α).

Using the notations from Propositions 6.1 and 6.2, we have

Proposition 6.3 (Algebraic multiplicity and the dimension of the space of generalized eigenvectors). m(α) = dim N_{d(α)}(α).

Proof. See Theorem 11 of Chapter 9, page 133.

In summary, we have

dim N1(α), d(α) ≤ m(α) = dim N_{d(α)}(α).

In words, it becomes

geometric multiplicity of α, index of α
≤ algebraic multiplicity of α as a root of the characteristic polynomial
= dimension of the space of generalized eigenvectors pertaining to α.

I 1. (page 63) Calculate f32.

Proof. f32 equals the integer nearest to a1³²/√5, where a1 = (1 + √5)/2; this gives f32 = 2178309.
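
A quick confirmation of both the recursion and the eigenvalue formula (illustrative, not part of the original solution):

import math

a1 = (1 + math.sqrt(5))/2
f = [1, 1]                     # f_1, f_2
while len(f) < 32:
    f.append(f[-1] + f[-2])
assert f[31] == 2178309 == round(a1**32/math.sqrt(5))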

I 2. (page 65) (a) Prove that if A has n distinct eigenvalues aj and all of them are less than one in absolute value, then for all h in Cⁿ,

A^N h → 0, as N → ∞,

that is, all components of A^N h tend to zero.

(b) Prove that if all aj are greater than one in absolute value, then for all h ≠ 0,

A^N h → ∞, as N → ∞,

that is, some components of A^N h tend to infinity.

(a)

Proof. Denote by hj the eigenvector corresponding to the eigenvalue aj. For any h ∈ Cⁿ, there exist θ1, · · · , θn ∈ C such that h = ∑_j θj hj. So A^N h = ∑_j θj aj^N hj. Define b = max{|a1|, · · · , |an|}. Then for any 1 ≤ k ≤ n, |(A^N h)k| = |∑_j θj aj^N (hj)k| ≤ b^N ∑_j |θj| |(hj)k| → 0 as N → ∞, since 0 ≤ b < 1. This shows A^N h → 0 as N → ∞.

(b)


Proof. We use the same notation as in part (a). Since h ≠ 0, there exists some k0 ∈ {1, · · · , n} so that the k0th coordinate of h satisfies h_{k0} = ∑_j θj (hj)_{k0} ≠ 0. Then |(A^N h)_{k0}| = |∑_j θj aj^N (hj)_{k0}|. Define b1 = max_{1≤i≤n}{|ai| : θi ≠ 0, (hi)_{k0} ≠ 0}. Then b1 > 1 and

|(A^N h)_{k0}| = b1^N |∑_{i=1}^{n} θi (ai^N / b1^N)(hi)_{k0}| → ∞ as N → ∞.

I 3. (page 66) Verify for the matrices discussed in Examples 1 and 2,

[ 3  2 ]     [ 0  1 ]
[ 1  4 ] and [ 1  1 ],

that the sum of the eigenvalues equals the trace, and their product is the determinant of the matrix.

Proof. The verification is straightforward: the first matrix has eigenvalues 2 and 5, with 2 + 5 = 7 = 3 + 4 = trace and 2 · 5 = 10 = 3 · 4 − 2 · 1 = det; the second has eigenvalues (1 ± √5)/2, with sum 1 = trace and product −1 = det.

I 4. (page 69) Verify (25) by induction on N .

Proof. Formula (24) gives us Af = af + h, which is formula (25) when N = 1. Suppose (25) holds for any n ≤ N. Then A^{N+1}f = A(A^N f) = A(a^N f + N a^{N−1} h) = a^N Af + N a^{N−1} Ah = a^N(af + h) + N a^{N−1} ah = a^{N+1} f + (N + 1) a^N h. So (25) also holds for N + 1. By induction, (25) holds for every N ∈ N.

I 5. (page 69) Prove that for any polynomial q,

q(A)f = q(a)f + q′(a)h,

where q′ is the derivative of q and f satisfies (22).

Proof. Suppose q(s) = ∑_{i=0}^{n} bi s^i. Then by formula (25),

q(A)f = ∑_{i=0}^{n} bi A^i f = ∑_{i=0}^{n} bi (a^i f + i a^{i−1} h) = (∑_{i=0}^{n} bi a^i) f + (∑_{i=1}^{n} i bi a^{i−1}) h = q(a)f + q′(a)h.

I 6. (page 71) Prove (32) by induction on k.

Proof. By Lemma 9, N_{p1···pk} = N_{p1} ⊕ N_{p2···pk} = N_{p1} ⊕ (N_{p2} ⊕ N_{p3···pk}) = N_{p1} ⊕ N_{p2} ⊕ N_{p3···pk} = · · · = N_{p1} ⊕ N_{p2} ⊕ · · · ⊕ N_{pk}.

I 7. (page 73) Show that A maps Nd into itself.

Proof. For any x ∈ Nd(a), we have (A − aI)^d (Ax) = (A − aI)^{d+1} x + a(A − aI)^d x = 0. So Ax ∈ Nd(a).

I 8. (page 73) Prove Theorem 11.

Proof. A number is an eigenvalue of A if and only if it is a root of the characteristic polynomial pA. So pA(s) can be written as pA(s) = ∏_{i=1}^{k} (s − ai)^{mi} with each mi a positive integer (i = 1, · · · , k). We have shown in the text that pA is a multiple of mA, so we can assume mA(s) = ∏_{i=1}^{k} (s − ai)^{ri} with each ri satisfying 0 ≤ ri ≤ mi (i = 1, · · · , k). We argue ri = di for every 1 ≤ i ≤ k.

Indeed, we have

Cⁿ = N_{pA} = ⊕_{j=1}^{k} N_{mj}(aj) = ⊕_{j=1}^{k} N_{dj}(aj),

where the last equality comes from the observation N_{mj}(aj) ⊆ N_{mj+dj}(aj) = N_{dj}(aj) by the definition of dj. This shows the polynomial ∏_{j=1}^{k} (s − aj)^{dj} ∈ ℘ := {polynomials p : p(A) = 0}. By the definition of the minimal polynomial, rj ≤ dj for j = 1, · · · , k.

Assume for some j, rj < dj. We can then find x ∈ N_{dj}(aj) \ N_{rj}(aj) with x ≠ 0. Define q(s) = ∏_{i=1, i≠j}^{k} (s − ai)^{ri}; then by Corollary 10, x can be uniquely decomposed into x′ + x″ with x′ ∈ N_q and x″ ∈ N_{rj}(aj). We have 0 = (A − ajI)^{dj} x = (A − ajI)^{dj} x′ + 0. So x′ ∈ N_q ∩ N_{dj}(aj) = {0}. This implies x = x″ ∈ N_{rj}(aj). Contradiction. Therefore, ri ≥ di for every 1 ≤ i ≤ k.

Combined, we conclude mA(s) = ∏_{i=1}^{k} (s − ai)^{di}.


Remark 9. Along the way, we have shown that the index d of an eigenvalue is no greater than the algebraicmultiplicity of the eigenvalue in the characteristic polynomial. Also see Proposition 6.2.

I 9. (page 75) Prove Corollary 15.

Proof. The extension is straightforward, as the key feature of the proof, that B maps N(j) into N(j), remains the same regardless of the number of linear maps, as long as they commute pairwise.

I 10. (page 76) Prove Theorem 18.

Proof. For any i ∈ {1, · · · , n}, by Theorem 17, (li, xj) = 0 for any j ≠ i. Since x1, · · · , xn span the whole space and li ≠ 0, we must have (li, xi) ≠ 0, i = 1, · · · , n. This proves (a) of Theorem 18. For (b), we note if x = ∑_{j=1}^{n} kj xj, then (li, x) = ki(li, xi). So ki = (li, x)/(li, xi).

I 11. (page 76) Take the matrix (0 11 1

)from equation (10)′ of Example 2.

(a) Determine the eigenvector of its transpose.(b) Use formulas (44) and (45) to determine the expansion of the vector (0,1)’ in terms of the eigenvectors

of the original matrix. Show that your answer agrees with the expansion obtained in Example 2.(a)

Proof. The matrix is symmetric, so it’s equal to its transpose and the eigenvectors are the same: for eigenvaluea1 = 1+

√5

2 , the eigenvector is h1 =

[1a1

]; for eigenvalue a2 = 1−

√5

2 , the eigenvector is h2 =

[1a2

].

(b)

Proof. We note (h1, h1) = 1 + a21 = 5+√5

2 and (h2, h2) = 1 + a22 = 5−√5

2 . For x =

[01

], we have (h1, x) = a1

and (h2, x) = a2. So using formula (44) and (45), x = c1h1 + c2h2 with

c1 = a1/5 +

√5

2= 1/

√5, c2 = a2/

5−√5

2= −1/

√5.

This agrees with the expansion obtained in Example 2.

I 12. (page 76) In Example 1 we have determined the eigenvalues and corresponding eigenvector of thematrix (

3 21 4

)as a1 = 2, h1 =

(2−1

), and a2 = 5, h2 =

(11

).

Determine eigenvectors l1 and l2 of its transpose and show that

(li, hj) =

{0 for i = j

= 0 for i = j

Proof. The transpose of the matrix has the same eigenvalues a1 = 2, a2 = 5. Solving the equation[3 12 4

] [xy

]= 2

[xy

], we have l1 =

[1 −1

]. Solving the equation

[3 12 4

] [xy

]= 5

[xy

], we have l2 =

[1 2

].

Then it’s easy to calculate (l1, h1) = 3, (l1, h2) = 0, (l2, h1) = 0, and (l2, h2) = 3.

23

Page 24: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

I 13. (page 76) Show that the matrix

A =

0 1 11 0 11 1 0

has -1 as an eigenvalue. What are the other two eigenvalues?

Solution.

det(λI −A) = det

λ −1 −1−1 λ −1−1 −1 λ

= det

0 −1− λ −1 + λ2

0 λ+ 1 −1− λ−1 −1 λ

= −[(λ+ 1)2 − (λ2 − 1)(λ+ 1)] = (λ+ 1)2(λ− 2).

So the eigenvalues of A are −1 and 2, and the eigenvalue 2 has a multiplicity of 2.

7 Euclidean StructureThe book’s own solution gives answers to Ex 1, 2, 3, 5, 6, 7, 8, 14, 17, 19, 20.

⋆ Erratum: In the Note on page 92, the infinite-dimensional version of Theorem 15 is Theorem 5 inChapter 15, not “Theorem 15”.

I 1. (page 80) Prove Theorem 2.

Proof. By letting y = x||x|| , we get ||x|| ≤ max||y||=1(x, y). By Schwartz Inequality, max||y||=1(x, y) ≤ ||x||.

Combined, we must have ||x|| = max||y||=1(x, y).

I 2. (page 87) Prove Theorem 11.

Proof. ∀x, y and suppose their decomposition are x1 + x2, y1 + y2, respectively. Here x1, y1 ∈ Y andx2, y2 ∈ Y ⊥. Then (P ∗

Y y, x) = (y, PY x) = (y1 + y2, x1) = (y1, x1) = (y1, x) = (PY y, x). By the arbitrarinessof x and y, PY = P ∗

Y .

I 3. (page 89) Construct the matrix representing reflection of points in R3 across the plane x3 = 0. Showthat the determinant of this matrix is −1.

Proof. Under the reflection across the plane {(x1, x2, x3) : x3 = 0}, point (x1, x2, x3) will be mapped to

(x1, x2,−x3). So the corresponding matrix is

1 0 00 1 00 0 −1

, whose determinant is −1.

I 4. (page 89) Let R be reflection across any plane in R3.(i) Show that R is an isometry.(ii) Show that R2 = I.(iii) Show that R∗ = R.

Proof. Suppose the plane L is determined by the equation Ax + By + Cz = D. For any point x =(x1, x2, x3)

′ ∈ R3, we first find y = (y1, y2, y3)′ ∈ L such that the line segment xy ⊥ L. Then y must

satisfy the following equations {Ay1 +By2 + Cy3 = D

(y1 − x1, y2 − x2, y3 − x3) = k(A,B,C)

24

Page 25: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

where k is some constant. Solving the equations gives us k = D−(Ax1+Bx2+Cx3)A2+B2+C2 andy1y2

y3

=

x1

x2

x3

+ k

ABC

=

x1

x2

x3

− 1

A2 +B2 + C2

A2 AB ACAB B2 BCCA CB C2

x1

x2

x3

+D

A2 +B2 + C2

ABC

So the symmetric point z = (z1, z2, z3)

′ of x with respect to L is given byz1z2z3

= 2

y1y2y3

x1

x2

x3

=

x1

x2

x3

− 2

A2 +B2 + C2

A2 AB ACAB B2 BCCA CB C2

x1

x2

x3

+2D

A2 +B2 + C2

ABC

=

1

A2 +B2 + C2

−A2 +B2 + C2 −2AB −2AC−2AB A2 −B2 + C2 −2BC−2CA −2CB A2 +B2 − C2

x1

x2

x3

+2D

A2 +B2 + C2

ABC

.

To make the reflection R a linear mapping, it’s necessary and sufficient that D = 0. So the problem’sstatement should be corrected to “let R be reflection across any plane in R3 that contains the origin”. Then

R =1

A2 +B2 + C2

−A2 +B2 + C2 −2AB −2AC−2AB A2 −B2 + C2 −2BC−2CA −2CB A2 +B2 − C2

.

R is symmetric, so R∗ = R and by plain calculation, we have R∗R = R2 = I. By Theorem 12, R is anisometry.

I 5. (page 89) Show that a matrix M is orthogonal iff its rows are pairwise orthogonal unit vectors.

Proof. Suppose M is an n× n orthogonal matrix. Let r1, · · · , rn be its column vectors. Then

I = MMT =

r1· · ·rn

[rT1 , · · · , rTn ] =

r1rT1 r1rT2 · · · r1r

Tn

· · · · · · · · · · · ·rnr

T1 rnr

T2 · · · rnr

Tn

.

So M is orthogonal if and only if rirTj = δij (1 ≤ i, j ≤ n).

I 6. (page 90) Show that |aij | ≤ ||A||.

Proof. Note |aij | = sign(aij) · eTi Aej , where ek is the column vector that has 1 as the k-th entry and 0elsewhere. Then we apply (ii) of Theorem 13.

I 7. (page 94) Show that {An} converges to A iff for all x, Anx converges to Ax.

Proof. The key is the space X being finite dimensional. See the solution in the textbook.

I 8. (page 95) Prove the Schwarz inequality for complex linear spaces with a Euclidean structure.

Proof. For any x, y ∈ X and a ∈ C, 0 ≤ ||x − ay|| = ||x||2 − 2Re(x, ay) + |a|2||y||2. Let a = (x,y)||y||2 (assume

y = 0), then we have

0 ≤ ||x||2 − 2Re{(x, y)

||y||2(x, y)

}+

|(x, y)|2

||y||2,

which gives after simplification |(x, y)| ≤ ||x||||y||.

I 9. (page 95) Prove the complex analogues of Theorem 6, 7, and 8.

Proof. Proofs are the same as the ones for the real Euclidean space.

I 10. (page 95) Prove the complex analogue of Theorem 9.

25

Page 26: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proof. Proof is the same as the one for real Euclidean space.

I 11. (page 96) Show that a unitary map M satisfies the relations

M∗M = I

and, conversely, that every map M that satisfies (45) is unitary.

Proof. If M is a unitary map, then by parallelogram law, M preserves inner product. So ∀x, y, (x,M∗My) =(Mx,My) = (x, y). Since x is arbitrary, M∗My = y, ∀y ∈ X. So M∗M = I. Conversely, if M∗M = I,(x, x) = (x,M∗Mx) = (Mx,Mx). So M is an isometry.

I 12. (page 96) Show that if M is unitary, so is M−1 and M∗.

Proof. (M−1x,M−1x) = (M(M−1x),M(M−1x)) = (x, x). (Mx,Mx) = (x, x) = (M∗Mx,M∗Mx). ByRM = X, (y, y) = (M∗y,M∗y), ∀y ∈ X. So M−1 and M∗ are both unitary.

I 13. (page 96) Show that the unitary maps form a group under multiplication.

Proof. If M , N are two unitary maps, then (MN)∗(MN) = N∗M∗MN = N∗N = I. So the set of unitarymaps is closed under multiplication. Exercise 12 shows that each unitary map has a unitary inverse. So theset of unitary maps is a group under multiplication.

I 14. (page 96) Show that for a unitary map M , | detM | = 1.

Proof. By Exercise 8 of Chapter 5, detM∗ = detMT= detM . So by M∗M = I, we have

1 = detM∗ detM = | detM |2,

i.e. |detM | = 1.

I 15. (page 96) Let X be the space of continuous complex-valued functions on [−1, 1] and define the scalarproduct in X by

(f, g) =

∫ 1

−1

f(s)g(s)ds.

Let m(s) be a continuous function of absolute value 1: |m(s)| = 1, −1 ≤ s ≤ 1.Define M to be multiplication by m:

(Mf)(s) = m(s)f(s).

Show that M is unitary.

Proof. (Mf,Mf) =∫ 1

−1Mf(s)Mf(s)ds =

∫ 1

−1m(s)f(s)m(s)f(s)ds =

∫ 1

−1|m(s)|2|f(s)|2ds = (f, f). This

shows M is unitary.

I 16. (page 98) Prove the following analogue of (51) for matrices with complex entries:

||A|| ≤

∑i,j

|aij |21/2

.

Proof. The proof is very similar to that of real case, so we omit the details. Note we need the complexversion of Schwartz inequality (Exercise 8).

I 17. (page 98) Show that ∑i,j

|aij |2 = trAA∗.

26

Page 27: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proof. We have

(AA∗)ij = [ai1, · · · , ain]

aj1· · ·ajn

=n∑

k=1

aikajk.

So (AA∗)ii =∑n

k=1 |aik|2 and tr(AA∗) =∑

i,j |aij |2.

I 18. (page 99) Show thattrAA∗ = trA∗A.

Proof. This is straightforward from the result of Exercise 17.

I 19. (page 99) Find an upper bound and a lower bound for the norm of the 2× 2 matrix

A =

(1 20 3

)

The quantity(∑

i,j |aij |2)1/2

is called the Hilbert-Schmidt norm of the matrix A.

Solution. Suppose λ1 and λ2 are two eigenvalues of A. Then by Theorem 3 of Chapter 6, λ1 +λ2 = trA = 4and λ1λ2 = detA = 3. Solving the equations gives us λ1 = 1, λ2 = 3. By formula (46), ||A|| ≥ 3. Accordingto formula (51), we have ||A|| ≤

√12 + 22 + 32 =

√14. Combined, we have 3 ≤ ||A|| ≤

√14 ≈ 3.7417.

I 20. (page 99) (i) w is a bilinear function of x and y. Therefore we write w as a product of x and y, denotedas

w = x× y,

and called the cross product.(ii) Show that the cross product is antisymmetric:

y × x = −x× y.

(iii) Show that x× y is orthogonal to both x and y.(iv) Let R be a rotation in R3; show that

(Rx)× (Ry) = R(x× y).

(v) Show that||x× y|| = ±||x||||y|| sin θ,

where θ is the angle between x and y.(vi) Show that 1

00

×

010

=

001

.

(vii) Using Exercise 16 in Chapter 5, show that abc

×

def

=

bf − cecd− afae− bd

.

(i)

27

Page 28: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proof. For any α1, α2 ∈ F, we have

(w(α1x1 + α2x2, y), z) = det(α1x1 + α2x2, y, z) = α1 det(x1, y, z) + α2 det(x2, y, z)

= α1(w(x1, y), z) + α2(w(x2, y), z) = (α1w(x1, y) + α2w(x2, y), z).

Since z is arbitrary, we necessarily have w(α1x1 + α2x2, y) = α1w(x1, y) + α2w(x2, y). Similarly, we canprove w(x, α1y1 + α2y2) = α1w(x, y1) + α2w(x, y2). Combined, we have proved w is a bilinear function of xand y.

(ii)

Proof. We note

(w(x, y), z) = det(x, y, z) = −det(y, x, z) = −(w(y, x), z) = (−w(y, x), z).

By the arbitrariness of z, we conclude w(x, y) = −w(y, x), i.e. y × x = −x× y.

(iii)

Proof. Since (w(x, y), x) = det(x, y, x) = 0 and (w(x, y), y) = det(x, y, y) = 0, x× y is orthogonal to both xand y.

(iv)

Proof. We suppose every vector is in column form and R is the matrix that represents a rotation. Then

(Rx×Ry, z) = det(Rx,Ry, z) = (detR) · det(x, y,R−1z)

and(R(x× y), z) = (R(x× y))T z = (x× y)TRT z = (x× y,RT z) = det(x, y,RT z).

A rotation is isometric, so RT = R−1 and detR = ±1. Combing the above two equations gives us ±(Rx×Ry, z) = (R(x× y), z). Since z is arbitrary, we must have ±Rx×Ry = R(x× y).

(v)

Proof. In the equation det(x, y, z) = (x×y, z), we set z = x×y. Since the geometrical meaning of det(x, y, z)is the signed volume of a parallelogram determined by x, y, z, and since z = x× y is perpendicular to x andy, we have det(x, y, z) = ±||x||||y|| sin θ||z||, where θ is the angle between x and y. Then by (x× y, z) = ||z||2,we conclude ||x× y|| = ||z|| = ±||x||||y|| sin θ.

(vi)

Proof.

1 = det

1 0 00 1 00 0 1

= (

100

×

010

,

001

).So

100

×010

=

ab1

. By part (iii), we necessarily have a = b = 0. Therefore, we can conclude

100

×010

=001

.

(vii)

28

Page 29: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proof. By Exercise 16 of Chapter 5,

det

a d gb e hc f i

= det

a b cd e fg h i

= aei+ bfg + cdh− gec− hfa− idb

= (bf − ec)g + (cd− fa)h+ (ae− db)i

=[bf − ce cd− af ae− bd

] ghi

.

So we have abc

×

def

=

bf − cecd− afae− bd

.

I 21. (page 100) Show that in a Euclidean space every pair of vector satisfies

||u+ v||2 + ||u− v||2 = 2||u||2 + 2||v||2.

Proof.

||u+ v||2 + ||u− v||2 = (u+ v, u+ v) + (u− v, u− v) = (u, u+ v) + (v, u+ v) + (u, u− v)− (v, u− v)

= (u, u) + (u, v) + (v, u) + (v, v) + (u, u)− (u, v)− (v, u) + (v, v) = 2||u||2 + 2||v||2.

8 Spectral Theory of Self-Adjoint Mappings of a Euclidean Spaceinto Itself

The book’s own solution gives answers to Ex 1, 4, 8, 10, 11, 12, 13.

⋆ Erratum: On page 114, formula (37)′ should be an = maxx=0(x,Hx)(x,x) instead of an = minx =0

(x,Hx)(x,x) .

⋆ Comments:1) In Theorem 4, the eigenvectors of H can be complex (the proof did not show they are real), although

the eigenvalues of H are real.

2) The following result will help us understand some details in the proof of Theorem 4′ (page 108, “Itfollows from this easily that we may choose an orthonormal basis consisting of real eigenvectors in eacheigenspace Na.”)

Proposition 8.1. Let X be a conjugate invariant subspace of Cn (i.e. X is invariant under conjugateoperation). Then we can find a basis of X consisting of real vectors.

Proof. We work by induction. First, assume dimX = 1. ∀v ∈ X with v = 0, we must have Rev ∈ X andImv ∈ X. At least one of them is non-zero and can be taken as a basis. Suppose for all conjugate invariantsubspaces with dimension no more than k the claim is true. Let dimX = k + 1. ∀v ∈ X with v = 0. If Revand Imv are (complex) linearly dependent, there must exist c ∈ C and v0 ∈ Rn such that v = cv0, and we letY = span{v0}; if Rev and Imv are (complex) linearly independent, we let Y = span{v, v} = span{Rev, Imv}.In either case, Y is conjugate invariant. Let Y ⊥ = {x ∈ X :

∑ni=1 xiyi = 0, ∀y = (y1, · · · , yn)′ ∈ Y }. Then

clearly, X = Y⊕

Y ⊥ and Y ⊥ is also conjugate invariant. By assumption, we can choose a basis of Y ⊥

consisting exclusively of real vectors. Combined with the real basis of Y , we get a real basis of X.

29

Page 30: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

3) For an elementary proof of Theorem 4′ by mathematical induction, see丘维声 [12, page 297], Theorem5.9.4.

4) Theorem 5 (the spectral resolution representation of self-adjoint operators) can be extended to infinitedimensional space and is phrased as ”any self-adjoint operator can be decomposed as the integral w.r.t.orthogonal projections”. See any textbook on functional analysis for details.

5) For the second proof of Theorem 4, compare Spivak[13, page 122], Exercise 5-17 and Keener[5, page15], Theorem 1.6 (the “maximum principle”).

⋆ Supplementary notesIn view of the spectral theorem (Theorem 7 of Chapter 6, p.70), the diagonalization of a self-adjoint

matrix A is reduced to showing that in the decomposition

Cn = Nd1(α1)⊕

· · ·⊕

Ndk(αk),

we must have di(αi) = 1, i = 1, · · · , k. Indeed, assume for some α, d(α) > 1. Then for any x ∈ N2(α)\N1(α),we have

(αI −A)x = 0, (αI −A)2x = 0.

But the second equation implies

((αI −A)x, (αI −A)x) = ((αI −A)2x, x) = 0.

A contradiction. So we must have d(α) = 1. This is the substance of the proof of Theorem 4, part (b).

I 1. (page 102) Show thatRe(x,Mx) = (x,Msx).

Proof.(x,

M +M∗

2x

)=

1

2[(x,Mx) + (x,M∗x)] =

1

2[(x,Mx) + (Mx, x)] =

1

2[(x,Mx) + (x,Mx)] = Re(x,Mx).

I 2. (page 104) We have described above an algorithm for diagonalizing q; implement it as a computerprogram.

Solution. Skipped for this version.

I 3. (page 105) Prove thatp+ + p0 = max dimS, q ≥ 0 on S

andp− + p0 = max dimS, q ≤ 0 on S.

Proof. We prove p+ + p0 = maxq(S)≥0 dimS. p− + p0 = maxq(S)≤0 dimS can be proved similarly. We shalluse representation (11) for q in terms of the coordinates z1, · · · , zn; suppose we label them so that d1, · · · ,dp are nonnegative where p = p+ + p0, and the rest are negative. Define the subspace S1 to consist ofall vectors for which zp+1 = · · · = zn = 0. Clearly, dimS1 = p and q is nonnegative on S1. This showsp+ + p0 = p ≤ maxq(S)≥0 dimS. If < holds, there must exist a subspace S2 such that q(S2) ≥ 0 anddimS2 > p = p++p0. Define P : S2 → S3 := {z : zp+1 = zp+2 = · · · = zn = 0} by P (z) = (z1, ·, zp, 0, · · · , 0).Since dimS2 > p = dimS3, there exists some z ∈ S2 such that z = 0 and P (z) = 0. This impliesz1 = · · · = zp = 0. So q(z) =

∑pi=1 diz

2i +

∑ni=p+1 diz

2i =

∑ni=p+1 diz

2i < 0, contradiction. Therefore, our

assumption is not true and < cannot hold.

I 4. (page 109) Show that the columns of M are the eigenvectors of H.

30

Page 31: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proof. Write M in the column form M = [c1, · · · , cn] and multiply M to both sides for formula (24)’, we get

HM = [Hc1, · · · ,Hcn] = MD = [c1, · · · , cn]

λ1 0 · · · 00 λ2 · · · 0· · · · · · · · · · · ·0 0 · · · λn

= [λ1c1, · · · , λncn],

where λ1, · · · , λn are eigenvalues of M , including multiplicity. So we have Hci = λici, i = 1, · · · , n. Thisshows the columns of M are eigenvectors of H.

I 5. (page 118) (a) Show that the minimum problem (47) has a nonzero solution f .(b) Show that a solution f of the minimum problem (47) satisfies the equation

Hf = bMf,

where the scalar b is the value of the minimum (47).(c) Show that the constrained minimum problem

min(y,Mf)=0

(y,Hy)

(y,My)

has a nonzero solution g.(d) Show that a solution g of the minimum problem (47)′ satisfies the equation

Hg = cMg,

where the scalar c is the value of the minimum (47)′.

Proof. The essence of the generalization can be summarized as follows: ⟨x, y⟩ = (x,My) is an inner productand M−1H is self-adjoint under this new inner product, hence all the previous results apply.

Indeed, ⟨x, y⟩ is a bilinear function of x and y; it is symmetric since M is self-adjoint; and it is positivesince M is positive. Combined, we can conclude ⟨x, y⟩ is an inner product.

Because M is positive, Mx = 0 has a unique solution 0. So M−1 exists. Define U = M−1H and wecheck U is self-adjoint under the new inner product ⟨·, ·⟩. Indeed, ∀x, y ∈ X,

⟨Ux, y⟩ = (Ux,My) = (M−1Hx,My) = (Hx, y) = (x,Hy) = (x,MM−1Hy) = (x,MUy) = ⟨x,Uy⟩.

Applying the second proof of Theorem 4, with (·, ·) replaced by ⟨·, ·⟩ and H replaced by M−1H, we canverify claims (a)-(d) are true.

I 6. (page 118) Prove Theorem 11.

Proof. This is just Theorem 4 with (·, ·) replaced by ⟨·, ·⟩ and H replaced by M−1H, where ⟨·, ·⟩ is definedin the solution of Exercise 5.

I 7. (page 118) Characterize the numbers bi in Theorem 11 by a minimax principle similar to (40).

Solution. This is just Theorem 11 with (·, ·) replaced by ⟨·, ·⟩ and H replaced by M−1H, where ⟨·, ·⟩ is definedin the solution of Exercise 5.

I 8. (page 119) Prove Theorem 11′.

Proof. Under the new inner product ⟨·, ·⟩ = (·,M ·), U = M−1H is selfadjoint. By Theorem 4, all theeigenvalues of M−1H are real. If H is positive and M−1Hx = λx, then λ⟨x, x⟩ = ⟨x,M−1Hx⟩ = (x,Hx) > 0for x = 0, which implies λ > 0. So under the condition that H is positive, all eigenvalues of M−1H arepositive.

I 9. (page 119) Give an example to show that Theorem 11’ is false if M is not positive.

31

Page 32: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Solution. Skipped for this version.

I 10. (page 119) Prove Theorem 12. (Hint: Use Theorem 8.)

Proof. By Theorem 8, we can assume N has an orthonormal basis v1, · · · , vn consisting of genuine eigen-vectors. We assume the eigenvalue corresponding to vj is nj . Then by letting x = vj , j = 1, · · · , n andby the definition of ||N ||, we can conclude ||N || ≥ max |nj |. Meanwhile, ∀x ∈ X with ||x|| = 1, there exista1, · · · , an ∈ C, so that

∑|aj |2 = 1 and x =

∑ajvj . So

||Nx||||x||

= ||∑

ajnjvj || =√∑

|ajnj |2 ≤ max1≤j≤n

|nj |√∑

|aj |2 = max |nj |.

This implies ||N || ≤ max |nj |. Combined, we can conclude ||N || = max |nj |.

Remark 10. Compare the above result with formula (48) and Theorem 18 of Chapter 7.

I 11. (page 119) We define the cyclic shift mapping S, acting on vectors in Cn, by S(a1, a2, · · · , an) =(an, a1, · · · , an−1).

(a) Prove that S is an isometry in the Euclidean norm.(b) Determine the eigenvalues and eigenvectors of S.(c) Verify that the eigenvectors are orthogonal.

Proof. |S(a1, · · · , an)| = |(an, a1, · · · , an−1)| = |(a1, · · · , an)|. So S is an isometry in the Euclidean norm.To determine the eigenvalues and eigenvectors of S, note under the canonical basis e1, · · · , en, S correspondsto the matrix

A =

0 0 0 · · · 0 11 0 0 · · · 0 00 1 0 · · · 0 0· · · · · · · · · · · · · · · · · ·0 0 0 · · · 1 0

,

whose characteristic polynomial is p(s) = |A−sI| = (−s)n+(−1)n+1. So the eigenvalues of S are the solutionsto the equation sn = 1, i.e. λk = e

2πkn i, k = 1, · · · , n. Solve the equation Sxk = λkxk, we can obtain the gen-

eral solution as xk = (λn−1k , λn−2

k , · · · , λk, 1)′. After normalization, we have xk = 1√

n(λn−1

k , λn−2k , · · · , λk, 1)

′.Therefore, for i = j,

(xi, xj) =1

n

n∑k=1

λk−1i λk−1

j =1

n

n∑k=1

(λiλj)k−1 =

1

n

1− (λiλj)n

1− λiλj= 0.

I 12. (page 120) (i) What is the norm of the matrix

A =

(1 20 3

)in the standard Euclidean structure?

(ii) Compare the value of ||A|| with the upper and lower bounds of ||A|| asked for in Exercise 19 of Chapter7.

(i)

Solution. A∗A =

[1 02 3

] [1 20 3

]=

[1 22 13

], which has eigenvalues 7 ±

√40. By Theorem 13, ||A|| =√

7 +√40 ≈ 3.65.

(ii)

Solution. This is consistent with the estimate obtained in Exercise 19 of Chapter 7: 3 ≤ ||A|| ≤ 3.7417.

32

Page 33: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

I 13. (page 120) What is the norm of the matrix(1 0 −12 3 0

)in the standard Euclidean structures of R2 and R3.

Solution.

1 20 3−1 0

[1 0 −12 3 0

]=

5 6 −16 9 0−1 0 1

, which has eigenvalues 0, 1.6477, and 13.3523 By Theorem

13, the norm of the matrix is approximately√13.3523 ≈ 3.65.

9 Calculus of Vector- and Matrix- Valued FunctionsThe book’s own solution gives answers to Ex 2, 3, 6, 7.

⋆ Erratum: In Exercise 6 (p.129), we should have det eA = etrA instead of det eA = eA.

⋆ Comments: In the proof of Theorem 11, to see why sI − Ad =∏d−1

0 () holds, see Lemma 3 ofAppendix 6, formula (9) (p.369).

I 1. (page 122) Prove the fundamental lemma for vector valued functions. (Hint: Show that for every vectory, (x(t), y) is a constant.)

Proof. Following the hint, note ddt (x(t), y) = (x(t), y)+(x(t), y) = 0. So (x(t), y) is a constant by fundamental

lemma for scalar valued functions. Therefore (x(t)− x(0), y) = 0, ∀y ∈ Kn. This implies x(t) ≡ x(0).

I 2. (page 124) Derive formula (3) using product rule (iii).

Proof. A−1(t)A(t) = I. So 0 = ddt

[A−1(t)A(t)

]= d

dtA−1(t) · A(t) + A−1(t)A(t) and d

dtA−1(t) = d

dtA−1(t) ·

A(t)A−1(t) = −A−1(t)A(t)A−1(t).

I 3. (page 128) Calculate

eA+B = exp(

0 11 0

).

Solution.(0 11 0

)2

=

(1 00 1

). So

(0 11 0

)n

=

I2×2 if n is even(0 1

1 0

)if n is odd.

Therefore, we have

exp{A+B} =∞∑

n=0

1

n!

(0 11 0

)n

=∞∑k=0

I2×2

(2k)!+

∞∑k=0

1

(2k + 1)!

(0 11 0

)=

e+ e−1

2I2×2 +

e− e−1

2

(0 11 0

).

I 4. (page 129) Prove the proposition stated in the Conclusion.

33

Page 34: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proof. For any ε > 0, there exists M > 0, so that ∀m ≥ M , supt ||Em(t)− F (t)|| < ε. So ∀m ≥ M , ∀t, h,∣∣∣∣∣∣∣∣ 1h [Em(t+ h)− Em(t)]− F (t)

∣∣∣∣∣∣∣∣=

∣∣∣∣∣∣∣∣∣∣ 1h∫ t+h

t

[Em(s)− F (s)]ds+1

h

∫ t+h

t

F (s)ds− F (t)

∣∣∣∣∣∣∣∣∣∣

≤∫ t+h

t||Em(s)− F (s)||ds

h+

∣∣∣∣∣∣∣∣∣∣ 1h∫ t+h

t

F (s)ds− F (t)

∣∣∣∣∣∣∣∣∣∣

< ε+

∣∣∣∣∣∣∣∣∣∣ 1h∫ t+h

t

F (s)ds− F (t)

∣∣∣∣∣∣∣∣∣∣ .

Under the assumption that F is continuous, we have

limh→0

∣∣∣∣∣∣∣∣ 1h [E(t+ h)− E(t)]− F (t)

∣∣∣∣∣∣∣∣ = limh→0

limm→∞

∣∣∣∣∣∣∣∣ 1h [Em(t+ h)− Em(t)]− F (t)

∣∣∣∣∣∣∣∣≤ ε+ lim

h→0

∣∣∣∣∣∣∣∣∣∣ 1h∫ t+h

t

F (s)ds− F (t)

∣∣∣∣∣∣∣∣∣∣ = ε.

Since ε is arbitrary, we must have limh→01h [E(t+ h)− E(t)] = F (t).

I 5. (page 129) Carry out the details of the argument that Em(t) converges.

Proof. By formula (12), Em(t) =∑m

k=1

∑k−1i=0

1k!A

i(t)A(t)Ak−i−1(t). So for m and n with m < n,

||Em(t)− En(t)|| ≤n∑

k=m+1

k−1∑i=0

||Ai(t)A(t)Ak−i−1(t)||k!

=

n∑k=m+1

k−1∑i=0

||A(t)||k−1||A(t)||k!

=n∑

k=m+1

||A(t)||k−1

(k − 1)!||A(t)|| = ||A(t)||[en(||A(t)||)− em(||A(t)||)] → 0

as m,n → ∞. This shows (Em(t))∞m=1 is a Cauchy sequence, hence convergent.

I 6. (page 129) Apply formula (10) to Y (t) = eAt and show that

det eA = etrA.

Proof. Apply formula (10) to Y (t) = eAt, we have ddt log detY (t) = tr(e−AteAtA) = trA. Integrating from 0

to t, we get log detY (t)− log detY (0) = ttrA. So detY (t) = ettrA. In particular, det eA = etrA.

I 7. (page 129) Prove that all eigenvalues of eA are of the form ea, a an eigenvalue of A. Hint: Use Theorem4 of Chapter 6, along with Theorem 6 below.

Proof. Without loss of generality, we can assume A is a Jordan matrix. Then eA is an upper triangularmatrix and its entries on the diagonal line have the form ea, where a is an eigenvalue of A. So all eigenvaluesof eA are the exponentials of eigenvalues of A.

I 8. (page 142) (a) Show that the set of all complex, self-adjoint n× n matrices forms N = n2-dimensionallinear space over the reals.

(b) Show that the set of complex, self-adjoint n × n matrices that have one double and n − 2 simpleeigenvalues can be described in terms of N − 3 real parameters.

(a)

34

Page 35: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proof. The total number of “free” entries is n(n+1)2 . The entries on the diagonal line must be real. So the

dimension is n(n+1)2 × 2− n = n2.

(b)

Proof. Similar to the argument in the text, the total number of complex parameters that determine theeigenvectors is (n−1)+ · · ·+2 = n(n−1)

2 −1. This is equivalent to n(n−1)−2 real parameters. The numberof distinct (real) eigenvalues is n− 1. So the dimension = n2 − n− 2 + n− 1 = n2 − 3.

I 9. (page 142) Choose in (41) at random two self-adjoint 10 × 10 matrices M and B. Using availablesoftware (MATLAB, MAPLE, etc.) calculate and graph at suitable intervals the 10 eigenvalues of B + tMas functions of t over some t-segment.

Solution. See the Matlab/Octave program aoc.m below.

function aoc

%AOC illustrates the avoidance-of-crossing phenomenon% of the neighboring eigenvalues of a continuous% symmetric matrix. This is Exercise 9, Chapter 9% of the textbook, Linear Algebra and Its Applications,% 2nd Edition, by Peter Lax.

% Initialize global variablesmatrixSize = 10;lowerBound = 0.01; %lower bound of t's rangeupperBound = 3; %upper bound of t's rangestepSize = 0.1;t = lowerBound:stepSize:upperBound;

% Generate random symmetric matrixtemp1 = rand(matrixSize);temp2 = rand(matrixSize);M = temp1+temp1';B = temp2+temp2';

% Initialize eigenvalue matrix to zeros;% use each column to store eigenvalues for% a given parametereigenval = zeros(matrixSize,numel(t));for i = 1:numel(t)

eigenval(:,i) = eig(B+t(i)*M);end

% Plot eigenvalues according to values of parameterhold off;disp(['There are ', num2str(matrixSize), ' eigenvalue curves.']);disp(' ');for j = 1:matrixSize

disp(['Eigenvalue curve No. ', num2str(j),'. Press ENTER to continue...']);plot(t, eigenval(j,:));xlabel('t');ylabel('eigenvalues');title('Matlab illustration of Avoidance of Crossing');hold on;

35

Page 36: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

pause;endhold off;

10 Matrix InequalitiesThe book’s own solution gives answers to Ex 1, 3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15.

⋆ Erratum: In Exercise 6 (p.152), Imx > 0 should be Imz > 0.

I 1. (page 146) How many square roots are there of a positive mapping?

Solution. Suppose H has k distinct eigenvalues λ1, · · · , λk. Denote by Xj the subspace consisting ofeigenvectors of H pertaining to the eigenvalue λj . Then H can be represented as H =

∑kj=1 λjPj were Pj

is the projection to Xj . Let A be a positive square root of H, we claim A has to be∑k

j=1

√λjPj . Indeed, if

α is an eigenvalue of A and x is an eigenvector of A pertaining to α, then α2 is an eigenvalue of H and x isan eigenvector of H pertaining to α2. So we can assume A has m distinct eigenvalues α1, · · · , αm (m ≤ k)with αi =

√λi (1 ≤ i ≤ m). Denote by Yi the subspace consisting of eigenvectors of A pertaining to αi.

Then Yi ⊂ Xi. Since H =⊕m

i=1 Yi =⊕k

j=1 Xj , we must have m = k and Yi = Xi, otherwise at lease oneof the ≤ in the sequence of inequalities dimH =

∑mi=1 dimYi ≤

∑mi=1 dimXi ≤

∑ki=1 dimXi = dimH is <,

contradiction. So A can be uniquely represented as A =∑k

j=1

√λjPj , the same as

√H defined in formula

(6).

I 2. (page 146) Formulate and prove properties of nonnegative mappings similar to parts (i), (ii), (iii), (iv),and (vi) of Theorem 1.

Proposition 10.1. (i) The identity I is nonnegative. (ii) If M and N are nonnegative, so is their sumM +N , as well as aM for any nonnegative number a. (iii) If H is nonnegative and Q is invertible, we haveQ∗HQ ≥ 0. (iv) H is nonnegative if and only if all its eigenvalues are nonnegative. (vi) Every nonnegativemapping has a nonnegative square root, uniquely determined.

Proof. (i) and (ii) are obvious. For part (iii), we write the quadratic form associated with Q∗HQ as

(x,Q∗HQx) = (Qx,HQx) = (y,Hy) ≥ 0,

where y = Qx. For part (iv), by the selfadjointness of H, there exists an orthogonal basis of eigenvectors.Denote these by hj and the corresponding eigenvalue by aj . Then any vector x can be expressed as a linearcombination of the hj ’s: x =

∑j xjhj . So (x,Hx) =

∑i,j(xihi, xjajhj) =

∑nj=1 aj |xj |2. From the formula

it is clear that (x,Hx) ≥ 0 for any x if and only if aj ≥ 0, ∀j. For part (vi), the proof is similar to that ofpositive mappings and we omit the lengthy proof. Cf. also solution to Exercise 10.1.

I 3. (page 149) Construct two real, positive 2× 2 matrices whose symmetrized product is not positive.

Solution. Let A be a mapping that maps the vector (0, 1)′ to (0, α2)′ with α2 > 0 sufficiently small and

(1, 0)′ to (α1, 0)′ with α1 > 0 sufficiently large. Let B be a mapping that maps the vector (1, 1)′ to (λ1, λ1)

with λ1 > 0 sufficiently small and (−1, 1)′ to (−λ2, λ2)′ with λ2 > 0 sufficiently large. Then both A and B

are positive mappings, and we can find x between (1, 1)′ and (0, 1)′ so that (Ax,Bx) < 0. By the analysis in

the paragraph below formula (14)′, AB + BA is not positive. More precisely, we have A =

(α1 00 α2

)and

B = 12

(λ1 + λ2 λ1 − λ2

λ1 − λ2 λ1 + λ2

).

36

Page 37: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

I 4. (page 151) Show that if 0 < M < N , then (a) M1/4 < N1/4. (b) M1/m < N1/m, m a power of 2. (c)logM ≤ logN .

Proof. By Theorem 5 and induction, it is easy to prove (a) and (b). For (c), we follow the hint. If M hasthe spectral resolution M =

∑ki=1 λiPi, logM is defined as

logM =k∑

i=1

logλiPi =k∑

i=1

limm→∞

m(λ1mi − 1)Pi = lim

m→∞m

(k∑

i=1

λ1mi Pi −

k∑i=1

Pi

)= lim

m→∞m(M

1m − I).

So logM = limm→∞ m(M1m − I) ≤ limm→∞ m(N

1m − I) = logN .

I 5. (page 151) Construct a pair of mappings 0 < M < N such that M2 is not less than N2. (Hint: UseExercise 3).

Solution. (from the textbook’s solution, pp. 291) Choose A and B as in Exercise 3, that is positive matriceswhose symmetrized product is not positive. Set

M = A,N = A+ tB,

t sufficiently small positive number. Clearly, M < N .

N2 = A2 + t(AB +BA) + t2B2;

for t small the term t2B is negligible compared with the linear term. Therefore for t small N2 is not greaterthan M2.

I 6. (page 151) Verify that (19) defines f(z) for a complex argument z as an analytic function, as well asthat Imf(z) > 0 for Imz > 0.

Proof. For f(z) = az + b−∫∞0

dm(t)z+t , we have

f(z +∆z)− f(z) = a∆z +∆z ·∫ ∞

0

dm(t)

(z +∆z + t)(z + t).

So if we can show lim∆z→0

∫∞0

dm(t)(z+∆z+t)(z+t) exists and is finite, f(z) is analytic by definition. Indeed, if

Imz > 0, for ∆z sufficiently small, we have∣∣∣∣ 1

z +∆z + t

∣∣∣∣ ≤ 1

|z + t| − |∆z|≤ 1

Imz − |∆z|≤ 2

Imz.

So by Dominated Convergence Theorem, lim∆z→0

∫∞0

dm(t)(z+∆z+t)(z+t) exists and is equal to

∫∞0

dm(t)(z+t)2 , which

is finite. To see Imf(z) > 0 for Imz > 0, we note

Imf(z) = aImz − Im∫ ∞

0

dm(t)

Rez + t+ iImz= Imz

[a+

∫ ∞

0

dm(t)

(Rez + t)2 + (Imz)2

].

Remark 11. This exercise can be used to verify formula (19) on p.151.

I 7. (page 153) Given m positive numbers r1, · · · , rm, show that the matrix

Gij =1

ri + rj + 1

is positive.

37

Page 38: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proof. Consider the Euclidean space L2(−∞, 1], with the inner product (f, g) :=∫ 1

−∞ f(t)g(t)dt. Choosefj = erj(t−1), j = 1, · · · ,m, then the associated Gram matrix is

Gij = (fi, fj) =

∫ 1

−∞

e(ri+rj)t

eri+rjdt =

1

ri + rj.

Clearly, (fj)mj=1 are linearly independent. So G is positive.

I 8. (page 158) Look up a proof of the calculus result (35).

Proof. We apply the change of variable formula as follows∫ ∞

−∞e−z2

dz =

√∫R2

e−x2−y2dxdy =

√∫ 2π

0

∫ ∞

0

e−r2rdr =

√2π · 1

2=

√π.

I 9. (page 162) Extend Theorem 14 to the case when dimV = dimU −m, where m is greater than 1.

Proof. The extension is straightforward, just replace the paragraph (on page 161) “If S is a subspace of V ,then T = S and dimT = dimS. ... It follows that

dimS − 1 ≤ dimT

as asserted.” with the following one: Let T = S ∩ V and T1 = S ∩ V ⊥, where V ⊥ stands for the complimentof V in U . Then dimT + dimT1 = dimS. Since dimT1 ≤ dimV ⊥ = n− (n−m) = m, dimT ≥ dimS −m.

The rest of the proof is the same as the proof of Theorem 14 and we can conclude that

p±(A)−m ≤ p±(B) ≤ p±(A).

I 10. (page 164) Prove inequality (44)′.

Proof. For any x, (x, (N − M − dI)x) = (x, (N − M)x) − d||x||2 ≤ ||N − M ||||x||2 − d||x||2 = 0. Similarly,(x, (M −N − dI)x) = (x, (M −N)x)− d||x||2 ≤ ||M −N ||||x||2 − d||x||2 ≤ 0.

I 11. (page 166) Show that (51) is largest when ni and mj are arranged in the same order.

Proof. It’s easy to see the problem can be reduced to the case k = 2. To prove this case, we note if m1 ≤ m2

and np1≥ np2

, we have

m2np1 +m1np2 −m2np2 −m1np1 = (m2 −m1)(np1 − np2) ≥ 0.

I 12. (page 168) Prove that if the self-adjoint part of Z is positive, then Z is invertible, and the self-adjointpart of Z−1 is positive.

Proof. Assume Z is not invertible. Then there exists x = 0 such that Zx = 0. In particular, this implies(x, Zx) = (x,Z∗x) = 0. Sum up these two, we get (x, (Z + Z∗)x) = 0. Contradictory to the assumptionthat the selfadjoint part of Z is positive. For any x = 0, there exists y = 0 so that x = Zy. So

(x, (Z−1 + (Z−1)∗)x) = (x, Z−1x) + (x, (Z−1)∗x)

= (Zy, y) + (Z−1x, x)

= (y, Z∗y) + (y, Zy)

= (y, (Z + Z∗)y) > 0.

This shows the selfadjoint part of Z−1 is positive.

38

Page 39: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

I 13. (page 170) Let A be any mapping of a Euclidean space into itself. Show that AA∗ and A∗A have thesame eigenvalues with the same multiplicity.

Proof. Exercise Problem 14 has proved the claim for non-zero eigenvalues. Since the dimensions of the spacesof generalized eigenvectors of AA∗ and A∗A are both equal to the dimension of the underlying Euclideanspace, we conclude by Spectral Theorem that their zero eigenvalues must have the same multiplicity.

I 14. (page 171) Let A be a mapping of a Euclidean space into another Euclidean space. Show that AA∗

and A∗A have the same nonzero eigenvalues with the same multiplicity.

Proof. Suppose a is a non-zero eigenvalue of AA∗ and x is an eigenvector of AA∗ pertaining to a: AA∗x = ax.Applying A∗ to both sides, we get A∗A(A∗x) = aA∗x. Since a = 0 and x = 0, A∗x = 0 by AA∗x = ax.Therefore, a is an eigenvalue of A∗A with A∗x an eigenvector of A∗A pertaining to a. By symmetry, weconclude AA∗ and A∗A have the same set of non-zero eigenvalues.

Fix a non-zero eigenvalue a, and suppose x1, · · · , xm is a basis for the space of generalized eigenvectorsof AA∗ pertaining to a. Since a = 0, we can claim A∗x1, · · · , A∗xm are linearly independent. Indeed,assume not, then there must exist α1, · · · , αm not all equal to 0, such that

∑mi=1 αiA

∗xi = 0. This impliesa(∑m

i=1 αixi) =∑m

i=1 αiAA∗xi = A(

∑mi=1 αiA

∗xi) = 0, which further implies x1, · · · , xm are linearlydependent since a = 0. Contradiction.

This shows the dimension of the space of generalized eigenvectors of AA∗ pertaining to a is no greaterthan that of the space of generalized eigenvectors of A∗A pertaining to a. By symmetry, we conclude thespaces of generalized eigenvectors of AA∗ and A∗A pertaining to the same nonzero eigenvalue have the samedimension. Combined, we can conclude AA∗ and A∗A have the same non-zero eigenvalues with the same(algebraic) multiplicity.

Remark 12. The multiplicity referred to in this problem is understood as algebraic multiplicity, which isequal to the dimension of the space of generalized eigenvectors.

I 15. (page 171) Give an example of a 2× 2 matrix Z whose eigenvalues have positive real part but Z +Z∗

is not positive.

Solution. Let Z =

(1 + bi 30 1 + bi

)where b could be any real number. Then the eigenvalue of Z, 1 + bi,

has positive real part. Meanwhile, Z + Z∗ =

(2 33 2

)has characteristic polynomial p(s) = (2 − s)2 − 9 =

(s− 5)(s+ 1). So Z + Z∗ has eigenvalue 5 and −1, and therefore cannot be positive.

I 16. (page 171) Verify that the commutator (50) of two self-adjoint matrices is anti-self-adjoint.

Proof. Suppose A and B are selfadjoint. Then for any x and y,

(x, (AB −BA)∗y) = ((AB −BA)x, y) = (ABx, y)− (BAx, y) = (x,BAy)− (x,ABy) = (x,−(AB −BA)y).

So (AB −BA)∗ = −(AB −BA).

11 Kinematics and DynamicsThe book’s own solution gives answers to Ex 1, 5, 6, 8, 9.

I 1. (page 174) Show that if M(t) satisfies a differential equation of form (11), where A(t) is antisymmetricfor each t and the initial condition (5), then M(t) is a rotation for every t.

Proof. We note ddt (M(t)M∗(t)) = M(t)M∗(t) + M(t)M∗(t) = A(t) + A∗(t) = 0. So M(t)M∗(t) ≡

M(0)M∗(0) = I. Also, f(t) = detM(t) is continuous function of t and takes values either 1 or -1 bythe isometry property of M(t). Since f(0) = 1, we have f(t) ≡ 1. By Theorem 1, M(t) is a rotation forevery t.

39

Page 40: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

I 2. (page 174) Suppose that A is independent of t; show that the solution of equation (11) satisfying theinitial condition (5) is

M(t) = etA.

Proof. limh→0M(t+h)−M(t)

h = limh→0etA(ehA−I)

h = AetA, i.e. M(t) = AM(t). Clearly M(0) = I.

I 3. (page 174) Show that when A depends on t, equation (11) is not solved by

M(t) = e∫ t0A(s)ds,

unless A(t) and A(s) commute for all s and t.

Proof. The reason we need commutativity is that the following equation is required in the calculation ofderivative:

1

h(M(t+ h)−M(t)) =

1

h

(e∫ t+h0

A(s)ds − e∫ t0A(s)ds

)=

1

h

(e∫ t0A(s)ds+

∫ t+ht

A(s)ds −A∫ t0A(s)ds

)=

1

he∫ t0A(s)ds

(e∫ t+ht

A(s)ds − I),

i.e. e∫ t0A(s)ds+

∫ t+ht

A(s)ds = e∫ t0A(s)dse

∫ t+ht

A(s)ds. So when this commutativity holds,

M(t) = limh→0

1

h(M(t+ h)−M(t)) = M(t)A(t).

I 4. (page 175) Show that if A in (15) is not equal to 0, then all vectors annihilated by A are multiples of(16).

Proof. If f = (x, y, z)T satisfies Af = 0, we must haveay + bz = 0

−ax+ cz = 0

−bx− cy = 0.

By discussing various possibilities (a, b, c = 0 or not), we can check f is a multiple of (−c, b,−a)T .

I 5. (page 175) Show that the two other eigenvalues of A are ±i√a2 + b2 + c2.

Proof.

det(sI −A) = det

λ −a −ba λ −cb c λ

= λ(λ2 + c2)− a(−aλ+ bc) + b(ac+ bλ) = λ3 + λ(c2 + b2 + a2).

Solving it gives us the other two eigenvalues.

I 6. (page 176) Show that the motion M(t) described by (12) rotation around the axis through the vectorf given by formula (16). Show that the angle of rotation is t

√a2 + b2 + c2. (Hint: Use formula (4)′.)

40

Page 41: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proof. Since A =

0 a b−a 0 c−b −c 0

is anti-symmetric, M(t)M∗(t) = etAetA∗= etAe−tA = I. By Exercise 7 of

Chapter 9, all eigenvalues of eAt has the form of eat, where a is an eigenvalue of A. Since the eigenvaluesof A are 0 and ±ik with k =

√a2 + b2 + c2 (Exercise 5), the eigenvalues of eAt are 1 and e±ikt. This

implies det eAt = 1 · eikt · e−ikt = 1. By Theorem 1, M = eAt is a rotation. Let f be given by formula(16). From Af = 0 we deduce that eAtf = f ; thus f is the axis of the rotation eAt. The trace ofeAt is 1 + eikt + e−ikt = 2 cos kt + 1. According to formula (4)′, the angle of rotation θ of eAt satisfies2 cos θ + 1 = treAt. This shows that θ = kt =

√a2 + b2 + c2.

I 7. (page 177) Show that the commutator

[A,B] = AB −BA

of two antisymmetric matrices is antisymmetric.

Proof. (AB −BA)∗ = (AB)∗ − (BA)∗ = B∗A∗ −A∗B∗ = (−B)(−A)− (−A)(−B) = BA−AB = −(AB −BA).

I 8. (page 177) Let A denote the 3 × 3 matrix (15); we denote the associated null vector (16) by fA.Obviously, f depends linearly on A.

(a) Let A and B denote two 3× 3 antisymmetric matrices. Show that

trAB = −2(fA, fB),

where (, ) denotes the standard scalar product for vectors in R3.

Proof. See the solution in the textbook, on page 294.

I 9. (page 177) Show that the cross product can be expressed as

f|A,B| = fA × fB .

Proof. See the solution in the textbook, page 294.

I 10. (page 184) Verify that solutions of the form (36) form a 2n-dimensional linear space.

Proof. It suffices to note that the set of 2n functions, {(cos cjt)hj , (sin cjt)hj}nj=1, are linearly independent,since any two of them are orthogonal when their subscripts are distinct.

12 ConvexityThe book’s own solution gives answers to Ex 2, 6, 7, 8, 10, 16, 19, 20.

⋆ Comments:1) The following results will help us understand some details in the proofs of Theorem 6 and Theorem

10.

Proposition 12.1. Let S be an arbitrary subset of X and x an interior point of S. For any real linearfunction l defined on X, if l ≡ 0, then l(x) is an interior point of Γ = l(S) in the topological sense.

Proof. We can find y ∈ X so that l(y) = 0. Then for t sufficiently small, l(x) + tl(y) = l(x+ ty) ∈ Γ. So Γcontains an interval which contains l(x), i.e. l(x) is an interior point of Γ under the topology of R1.

Corollary 12.1. If K is an open convex set and l is a linear function with l ≡ 0, Γ = l(K) is an openinterval.

Proof. Note Γ is convex and open in R1 in the topological sense.

41

Page 42: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proposition 12.2. Let K be a convex set and K0 the set of all interior points of K. Then K0 is convexand open.

Proof. (Convexity) ∀x, y ∈ K0 and a ∈ [0, 1]. For any z ∈ X, [ax+(1−a)y]+tz = a(x+tz)+(1−a)(y+tz) ∈ Kwhen t is sufficiently small, since x, y are interior points of K and K is convex.

(Openness) Fix x ∈ K0, ∀y1 ∈ X. We need to show for t sufficiently small, x+ty1 ∈ K0. Indeed, ∀y2 ∈ X,we can find a common ε > 0, so that whenever (t1, t2) ∈ [−ε, ε] × [−ε, ε], x + t1y1 ∈ K and x + t2y2 ∈ K.Fix any t∗ ∈ [− ε

2 ,ε2 ], by the convexity of K, x + t∗y1 + t∗∗y2 = 1

2 (x + 2t∗y1) +12 (x + 2t∗∗y2) ∈ K when

t∗∗ ∈ [− ε2 ,

ε2 ]. This shows x + t∗y1 ∈ K0. Since t∗ is arbitrarily chosen from [− ε

2 ,ε2 ], we conclude for t

sufficiently small, x + ty1 ∈ K0. That is, x is an interior point of K0. By the arbitrariness of x, K0 isopen.

2) Regarding Theorem 10 (Carathéodory): i) Among all the three conditions, “convexity” is the essentialone; “closedness” and “boundedness” are to guarantee K has extreme points. (ii) Solution to Exercise 14may help us understand the proof of Theorem 10. When a convex set has no interior points, it’s often usefulto realize that the dimension can be reduced by 1. (iii) To understand “... then all points x on the opensegment bounded by x0 and x1 are interior points of K”, we note if this is not true, then we can find ysuch that for all t > 0 or t < 0, x + ty ∈ K. Without loss of generality, assume x + ty ∈ K, ∀t > 0. For tsufficiently small, x0 + ty ∈ K. so the segment [x1, x0 + ty] ⊂ K. But this necessarily intersects with theray x+ ty, t > 0. A contradiction. (iv) We can summarize the idea of the proof as follows. One dimensionis clear, so by using induction, we have two scenarios. Scenario one, K has no interior points. Then thedimension is reduced by 1 and we are done. Scenario two, K has interior points. Then intuition showsany interior point lies on a segment with one endpoint an extreme point and the other a boundary point; aboundary point resides on a hyperplane, whose dimension is reduced by 1. By induction, we are done.

I 1. (page 188) Verify that these are convex sets.

Proof. The verification is straightforward and we omit it.

I 2. (page 188) Prove these propositions.

Proof. These propositions are immediate consequences of the definition of convexity.

I 3. (page 188) Show that an open half-space (3) is an open convex set.

Proof. Fix a point x ∈ {z : l(z) < c}. For any y ∈ X, f(t) = l(x+ ty) = l(x) + tl(y) is a continuous functionof t, with f(0) = l(x) < c. By continuity, f(t) < c for t sufficiently small. So x + ty ∈ {z : l(z) < c} for tsufficiently small, i.e. x is an interior point. Since x is arbitrarily chosen, we have proved {z : l(z) < c} isopen.

I 4. (page 188) Show that if A is an open convex set and B is convex, then A+B is open and convex.

Proof. The convexity of A + B is Theorem 1(b). To see the openness, ∀x ∈ A, y ∈ B. For any z ∈ X,(x + y) + tz = (x + tz) + y. For t sufficiently small, x + tz ∈ A. So (x + y) + tz ∈ A + B for t sufficientlysmall. This shows A+B is open.

I 5. (page 188) Let X be a Euclidean space, and let K be the open ball of radius a centered at the origin:||x|| < a.

(i) Show that K is a convex set.(ii) Show that the gauge function of K is p(x) = ||x||/a.

Proof. That K is a convex set is trivial to see. It’s also clear that p(0) = 0. For any x ∈ Rn \ {0}, whenε ∈ (0, a), rε = ||x||

a−ε satisfies rε > 0 and x/rε ∈ K. So p(x) ≤ ||x||a−ε . By letting ε → 0, we conclude

p(x) ≤ ||x||/a. If ‘‘ < ” holds, we can find r > 0 such that r < ||x||a and x

r ∈ K. But r < ||x||a implies a < ||x||

r

and hence xr ∈ K. Contradiction. Combined, we conclude p(x) = ||x||

a .

42

Page 43: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

I 6. (page 188) In the (u, v) plane take K to be the quarter-plane u < 1, v < 1. Show that the gaugefunction of K is

p(u, v) =

0 if u ≤ 0, v ≤ 0,v if 0 < v, u ≤ 0,u if 0 < u, v ≤ 0,

max(u, v) if 0 < u, 0 < v.

Proof. See the textbook’s solution.

I 7. (page 190) Let p be a positive homogeneous, subadditive function. Prove that the set K consisting ofall x for which p(x) < 1 is convex and open.

Proof. ∀x, y ∈ K, we have p(x) < 1 and p(y) < 1. For any a ∈ [0, 1], p(ax+(1−a)y) ≤ p(ax)+p((1−a)y) =ap(x) + (1 − a)p(y) < a + (1 − a) = 1. This shows K is convex. To see K is open, fix x ∈ K and chooseany z ∈ X. Then p(x + tz) ≤ p(x) + tp(z). So for t sufficiently small such that tp(z) < 1 − p(x), we havep(x+ tz) ≤ p(x) + tp(z) < p(x) + 1− p(x) = 1, i.e. x+ tz ∈ K. This shows K is open.

I 8. (page 193) Prove that the support function qS of any set is subadditive; that is, it satisfies qS(m+ l) ≤qS(m) + qS(l) for all l, m in X ′.

Proof. ∀ε > 0, there exists x(ε) ∈ S, so that

qS(m+ l) = supx∈S

(l +m)(x) < (l +m)(x(ε)) + ε ≤ supx∈S

l(x) + supx∈S

m(x) + ε = qS(m) + qS(l) + ε.

By the arbitrariness of ε, we conclude qS(m+ l) ≤ qS(m) + qS(l).

I 9. (page 193) Let S and T be arbitrary sets in X. Prove that qS+T (l) = qS(l) + qT (l).

Proof. qS+T (l) = supx∈S,y∈T l(x+ y) = supx∈S,y∈T [l(x) + l(y)] ≤ supx∈S,y∈T [qS(l) + qT (l)] = qS(l) + qT (l).Conversely, ∀ε > 0, there exists x0 ∈ S, y0 ∈ T , s.t. qS(l) < l(x0) +

ε2 , qT (l) < l(y0) +

ε2 . So qS(l) + qT (l) <

l(x0 + y0) + ε ≤ qS+T (l) + ε. By the arbitrariness of ε, qS(l) + qT (l) ≤ qS+T (l). Combined, we getqS+T (l) = qS(l) + qT (l).

I 10. (page 193) Show that qS∪T (l) = max{qS(l), qT (l)}.

Proof. qS∪T (l) = supx∈S∪T l(x) ≥ supx∈S l(x) = qS(l). Similarly, qS∪T (l) ≥ qT (l). Therefore, we haveqS∪T (l) ≥ max{qS(l), qT (l)}. For any ε > 0 sufficiently small, we can find xε ∈ S ∪ T , such that qS∪T (l) ≤l(xε) + ε. But l(xε) ≤ max{qS(l), qT (l)}. So qS∪T (l) ≤ max{qS(l), qT (l)}+ ε. Let ε → 0, we get qS∪T (l) ≤max{qS(l), qT (l)}. Combined, we can conclude qS∪T (l) = max{qS(l), qT (l)}.

I 11. (page 194) Show that a closed half-space as defined by (4) is a closed convex set.

Proof. If for any a ∈ (0, 1), l(ax + (1 − a)y) = al(x) + (1 − a)l(y) ≤ c, by continuity, we have l(x) ≤ c andl(y) ≤ c. This shows {x : l(x) ≤ c} is a closed convex set.

I 12. (page 194) Show that the closed unit ball in Euclidean space, consisting of all points ||x|| ≤ 1, is aclosed convex set.

Proof. Convexity is obvious. For closedness, note f(t) = ||tx+ (1− t)y|| is a continuous function of t. So iff(t) ≤ t for any t ∈ (0, 1), f(0) = ||y|| ≤ 1 and f(1) = ||x|| ≤ 1. So the unit ball B(0, 1) is closed. Combined,we conclude B(0, 1) = {x : ||x|| ≤ 1} is a closed convex set.

I 13. (page 194) Show that the intersection of closed convex sets is a closed convex set.

Proof. Suppose H and K are closed convex sets. Theorem 1(a) says H ∩K is also convex. Moreover, if forany a ∈ (0, 1), ax + (1 − a)y ∈ H ∩K, then the closedness of H and K implies a, b ∈ H and a, b ∈ K, i.e.a, b ∈ H ∩K. So H ∩K is closed.

43

Page 44: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

I 14. (page 194) Complete the proof of Theorems 7 and 8.

Proof. Proof of Theorem 7: Suppose K has an interior point x0. If a linear function l and a realnumber c determine a closed half-space that contains K − x0 but not y − x0, i.e. l(x− x0) ≤ c, ∀x ∈ K andl(y−x0) > c, then l and c+l(x0) determine a closed half-space that contains K but not y, i.e. l(x) ≤ c+l(x0)and l(y) > c+ l(x0). So without loss of generality, we can assume x0 = 0. Note the convexity and closednessare preserved under translation, so this simplification is all right for this problem’s purpose.

Define gauge function pK as in (5). Then we can show pK(x) ≤ 1 if and only if x ∈ K. Indeed, if x ∈ K,then pK(x) ≤ 1 by deifinition. Conversely, if pK(x) ≤ 1, then for any ε > 0, there exists r(ε) < 1+ ε so thatx

r(ε) ∈ K. We choose r(ε) > 1 and note xr(ε) = a(ε) · 0+ (1− a(ε)) · x with a(ε) = 1− 1

r(ε) . As r(ε) can be asclose to 1 as we want when ε ↓ 0, a(ε) can be as close to 0 as we want. Meanwhile, 0 is an interior point ofK, so for r large enough, x

r ∈ K. This shows for a close enough to 1, a · 0 + (1− a) · x ∈ K. Combined, weconclude K contains the open segment {a · 0 + (1− a) · x : 0 < a < 1}. By definition of closedness, x ∈ K.The rest of the proof is completely analogous to that of Theorem 3, with p(x) < 1 replaced by p(x) ≤ 1.

If K has no interior point, we have two possibilities. Case one, y and K are not on the same hyperplane.In this case, there exists a linear function l and a real number c, such that l(x) = c(∀x ∈ K) but l(y) = c.By considering −l if necessary, we can have l(x) = c(∀x ∈ K) and l(y) > c. So the half-space {x : l(x) ≤ c}contains K, but not y. Case two, y and K reside on the same hyperplane. Then the dimension of theambient space for y and K can be reduced by 1. Work by induction and note the space is of finite dimension,we can finish the proof.

Proof of Theorem 8: By definition of (16), if x ∈ K, then l(x) ≤ qK(l). Conversely, suppose y is notin K, then there exists l ∈ X ′ and a real number c such that l(x) ≤ c, ∀x ∈ K and l(y) > c. This impliesl(y) > qK(l). Combined, we conclude x ∈ K if and only if l(x) ≤ qK(l), ∀l ∈ X ′.

Remark: From the above solution and the proof of Theorem 3, we can see a useful routine for provingresults on convex sets: first assume the convex set has an interior point and use the gauge function, whichoften helps to construct the desired linear functionals via Hahn-Banach Theorem. If there exists no interiorpoint, reduce the dimension by 1 and work by induction. Such a use of interior points as the criterion for adichotomy is also present in the proof of Theorem 10 (Carathéodory).

I 15. (page 195) Prove Theorem 9.

Proof. Denote by S the closed convex hull of S, and define Γl = {x : l(x) ≤ qS(l)} where l ∈ X ′. Then itis easy to see each Γl is a closed convex set containing S, so S ⊆ ∩l∈X′Γl. For the other direction, suppose∩l∈X′Γl \ S = ∅ and we choose a point x from ∩l∈X′Γl \ S. By Theorem 8, there exists l0 ∈ X ′ such thatl0(x) > qS(l0) ≥ qS(l0). So x ∈ Γl0 , contradiction. Combined, we conclude S = ∩l∈X′Γl = {x : l(x) ≤qS(l),∀l ∈ X ′}.

I 16. (page 195) Show that if x1, · · · , xm belong to a convex set, then so does any convex combination ofthem.

Proof. Suppose λ1, · · · , λm satisfy λ1, · · · , λm ∈ (0, 1) and∑m

i=1 λi = 1. We need to show∑m

i=1 λixi ∈K, where K is the convex set to which x1, · · · , xm belong. Indeed, since

∑mi=1 λixi = (λ1 + · · · +

λm−1)∑m−1

i=1λi

λ1+···+λm−1xi + λmxm, it suffices to show

∑m−1i=1

λi

λ1+···+λm−1xi ∈ K. Working by induction,

we are done.

I 17. (page 195) Show that an interior point of K cannot be an extreme point.

Proof. Suppose x is an interior point of K. ∀y ∈ X, for t sufficiently small, x + ty ∈ K. In particular, wecan find ε > 0 so that x + εy ∈ K and x − εy ∈ K. Since x can be represented as x = (x+εy)+(x−εy)

2 , weconclude x is not an extreme point.

I 18. (page 197) Verify that every permutation matrix is a doubly stochastic matrix.

44

Page 45: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

Proof. Let S be a permutation matrix as defined in formula (25). Then clearly Sij ≥ 0. Furthermore,∑ni=1 Sij =

∑ni=1 δp(i)j , where j is fixed and is equal to p(i0) for some i0. So

∑ni=1 Sij = 1. Finally,∑n

j=1 Sij =∑n

j=1 δip−1(j), where i is fixed and is equal to p−1(j0) for some j0. So∑n

j=1 Sij = 1. Combined,we conclcude S is a doubly stochastic matrix.

I 19. (page 199) Show that, except for two dimensions, the representation of doubly stochastic matrices asconvex combinations of permutation matrices is not unique.

Proof. The textbook’s solution demonstrates the case of dimension 3. Counterexamples for higher dimensionscan be obtained by building permutation matrices upon the case of dimension 3.

I 20. (page 201) Show that if a convex set in a finite-dimensional Euclidean space is open, or closed, orbounded in the linear sense defined above, then it is open, or closed, or bounded in the topological sense,and conversely.

Proof. Suppose $K$ is a convex subset of an $n$-dimensional linear space $X$. We have the following properties.

(1) If $x$ is an interior point of $K$ in the linear sense, then $x$ is an interior point of $K$ in the topological sense. Consequently, being open in the linear sense is the same as being open in the topological sense. Indeed, let $e_1, \dots, e_n$ be a basis of $X$. There exists $\varepsilon > 0$ so that for any $t_i \in (-\varepsilon, \varepsilon)$, $x + t_ie_i \in K$, $i = 1, \dots, n$. For any $y \in X$ close enough to $x$, the norm of $y - x$ can be so small that, writing $y = x + \sum_{i=1}^n a_ie_i$, we have $|a_i| < \frac{\varepsilon}{n}$. Since for $t_i \in (-\frac{\varepsilon}{n}, \frac{\varepsilon}{n})$ ($i = 1, \dots, n$), $x + \sum_{i=1}^n t_ie_i = \sum_{i=1}^n \frac1n(x + nt_ie_i) \in K$ by the convexity of $K$, we conclude $y \in K$ whenever $y$ is sufficiently close to $x$. This shows $x$ is an interior point of $K$ in the topological sense.

(2) If $K$ is closed in the linear sense, it is closed in the topological sense. Indeed, suppose $(x_k)_{k=1}^\infty \subseteq K$ and $x_k \to x$ in the topological sense; we need to show $x \in K$. We work by induction on the dimension. The case $n = 1$ is trivial, because $x$ is necessarily an endpoint of a segment contained in $K$. Assume the property is true for any $n \le N$. For $n = N + 1$, we have two cases to consider. Case one: $K$ has no interior points. Then, as argued in the proof of Theorem 10, $K$ is contained in a subspace of $X$ of dimension less than $n$. By induction, $K$ is closed in the topological sense, and hence $x \in K$. Case two: $K$ has at least one interior point $x_0$. In this case, all the points on the open segment $(x_0, x)$ must be in $K$. Indeed, assume not; then there exists $x^* \in (x_0, x)$ such that the open segment $(x_0, x^*) \subseteq K$ but $(x^*, x] \cap K = \emptyset$. Since $x_0$ is an interior point of $K$, we can find $n$ linearly independent vectors $e_1, \dots, e_n$ so that $x_0 + e_i \in K$, $i = 1, \dots, n$. For any $x_k$ sufficiently close to $x$, the cone with $x_k$ as the vertex and $x_0 + e_1, \dots, x_0 + e_n$ as the base necessarily intersects $(x^*, x]$. So such an $x^* \in (x_0, x)$ with $(x^*, x] \cap K = \emptyset$ does not exist. Therefore $(x_0, x) \subseteq K$, and by the definition of being closed in the linear sense, we conclude $x \in K$.

(3) If $K$ is bounded in the linear sense, it is bounded in the topological sense. Indeed, assume $K$ is not bounded in the topological sense; then we can find a sequence $(x_k)_{k=1}^\infty$ such that $\|x_k\| \to \infty$. We shall show $K$ is not bounded in the linear sense. If the dimension $n = 1$, this is clearly true. Assume the claim is true for any $n \le N$. For $n = N + 1$, we have two cases to consider. Case one: $K$ has no interior points. Then, as argued in the proof of Theorem 10, $K$ is contained in a subspace of $X$ of dimension less than $n$. By induction, $K$ is not bounded in the linear sense. Case two: $K$ has at least one interior point $x_0$. Denote by $y_k$ the intersection of the segment $[x_0, x_k]$ with the sphere $S(x_0, 1) = \{z : \|z - x_0\| = 1\}$. For $k$ large enough, $y_k$ exists. Since a sphere in a finite-dimensional space is compact, we can assume without loss of generality that $y_k \to y \in S(x_0, 1)$. Then, by an argument similar to that of part (2) (the argument based on the cone), the ray starting at $x_0$ and going through $y$ is contained in $K$. So $K$ is not bounded in the linear sense.

13 The Duality Theorem

The book's own solution gives answers to Ex 3.

⋆ Comments: For a linear equation $Ax = y$ to have a solution, it is necessary and sufficient that $y \in R_A = N_{A'}^\perp$ (the annihilator of the null space of the transpose). This observation of duality helps us determine the existence of solutions. In optimization theory, if the collection of points satisfying a certain constraint is a convex set, we use the hyperplane separation theorem to find and state the necessary and sufficient condition for the existence of solutions.

I 1. (page 205) Show that K defined by (5) is a convex set.

Proof. Let $K = \{y : y = \sum_{j=1}^m p_jy_j,\ p_j \ge 0\}$. If $y = \sum_j p_jy_j$ and $y' = \sum_j p'_jy_j$ belong to $K$, then for $t \in [0,1]$,
$$ty + (1-t)y' = \sum_{j=1}^m\big[tp_j + (1-t)p'_j\big]y_j \in K.$$
So $K$ is a convex set.

I 2. (page 205) Show that if x ≥ z and ξ ≥ 0, then ξx ≥ ξz.

Proof. ξx− ξz = ξ(x− z) ≥ 0.

I 3. (page 208) Show that the sup and inf in Theorem 3 are a maximum and a minimum. [Hint: The sign of equality holds in (21).]

Proof. In the proof of Theorem 3, we already showed that there is an admissible $p^*$ for which $\gamma p^* \ge s$ (formula (21)). Since $S \ge \gamma p^* \ge s \ge S$ by formulas (16) and (20), the sup in Theorem 3 is attained at $p^*$, hence is a maximum. To see that the inf in Theorem 3 is a minimum, note that under the condition that there are admissible $p$ and $\xi$, Theorem 3 can be written as
$$\sup\{\gamma p : y \ge Yp,\ p \ge 0\} = \inf\{\xi y : \gamma \le \xi Y,\ \xi \ge 0\} < \infty.$$
This is equivalent to
$$\inf\{(-\gamma)p : (-y) \le (-Y)p,\ p \ge 0\} = \sup\{\xi(-y) : (-\gamma) \ge \xi(-Y),\ \xi \ge 0\}.$$
By the previous argument for $S = \sup\{\gamma p\}$, we can find $\xi^*$ such that $\xi^* \ge 0$, $(-\gamma) \ge \xi^*(-Y)$, and $\xi^*(-y) = \sup\{\xi(-y) : (-\gamma) \ge \xi(-Y),\ \xi \ge 0\}$; i.e., $\xi^* \ge 0$, $\gamma \le \xi^*Y$, and $\xi^*y = \inf\{\xi y : \gamma \le \xi Y,\ \xi \ge 0\}$. That is, the inf in Theorem 3 is attained at $\xi^*$, hence is a minimum.
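The equality of the two optimal values is easy to check numerically. Below is a small sanity check of the duality theorem using scipy.optimize.linprog; the data $Y$, $y$, $\gamma$ are arbitrary illustrative values, not taken from the book.

```python
# Check that sup{gamma.p : Yp <= y, p >= 0} = inf{xi.y : xi Y >= gamma, xi >= 0}.
import numpy as np
from scipy.optimize import linprog

Y = np.array([[2.0, 1.0],
              [1.0, 3.0]])
y = np.array([4.0, 6.0])
gamma = np.array([1.0, 1.0])

# Primal: linprog minimizes, so minimize -gamma.p subject to Y p <= y, p >= 0.
primal = linprog(-gamma, A_ub=Y, b_ub=y, bounds=[(0, None)] * 2)
# Dual: minimize y.xi subject to Y^T xi >= gamma, i.e. -Y^T xi <= -gamma, xi >= 0.
dual = linprog(y, A_ub=-Y.T, b_ub=-gamma, bounds=[(0, None)] * 2)

print(-primal.fun, dual.fun)  # both print 2.8: sup = max = min = inf
```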

14 Normed Linear Spaces

The book's own solution gives answers to Ex 2, 5, 6.

⋆ Comments: The geometric intuition of Theorem 9 is clear if we identify $X'$ with $X$ and assume $X$ is an inner product space.

I 1. (page 214) (a) Show that the open and closed unit balls are convex.
(b) Show that the open and closed unit balls are symmetric with respect to the origin, that is, if $x$ belongs to the unit ball, so does $-x$.

Proof. Trivial; omitted.

I 2. (page 215) Prove the triangle inequality, that is, for all x, y, z in X,

|x− z| ≤ |x− y|+ |y − z|.

Proof. |x− z| = |(x− y) + (y − z)| ≤ |x− y|+ |y − z|.

I 3. (page 215) Prove that |x|1 defined by (5) has all three properties (1) of a norm.

Proof. Write $a_j, b_j$ for the components of $x, y$.
(i) $|x|_1 \ge 0$, and $|x|_1 = 0$ if and only if each $a_j = 0$, i.e., $x = 0$.
(ii) $|x + y|_1 = \sum_j |a_j + b_j| \le \sum_j |a_j| + \sum_j |b_j| = |x|_1 + |y|_1$.
(iii) $|kx|_1 = \sum_j |ka_j| = |k|\sum_j |a_j| = |k||x|_1$.

46

Page 47: Linear Algebra and Its Applications, 2ed. Solution of …Linear Algebra and Its Applications, 2ed. Solution of Exercise Problems Yan Zeng Version 1.0.4, last revised on 2014-08-13.

I 4. (page 216) Prove or look up a proof of Hölder’s inequality.

Proof. $f(x) = -\ln x$ is a strictly convex function on $(0, \infty)$. So for any $a, b > 0$, we have
$$f(\theta a + (1-\theta)b) \le \theta f(a) + (1-\theta)f(b), \quad \forall\theta \in [0,1],$$
where "=" holds if and only if one of the following three cases occurs: 1) $\theta = 0$; 2) $\theta = 1$; 3) $a = b$.

The inequality $f(\theta a + (1-\theta)b) \le \theta f(a) + (1-\theta)f(b)$ is equivalent to $a^\theta b^{1-\theta} \le \theta a + (1-\theta)b$. Letting $\theta = \frac1p$ (so $1 - \theta = \frac1q$), $a_i = \frac{|x_i|^p}{|x|_p^p}$ and $b_i = \frac{|y_i|^q}{|y|_q^q}$, we get
$$\frac{|x_iy_i|}{|x|_p|y|_q} \le \frac1p\frac{|x_i|^p}{|x|_p^p} + \frac1q\frac{|y_i|^q}{|y|_q^q}.$$
Summing over $i$ gives $\sum_i |x_iy_i| \le |x|_p|y|_q$.

We now consider when $\sum_i |x_iy_i| = |x|_p|y|_q$. Since $p, q > 1$ and $\frac1p + \frac1q = 1$, we have $\theta = \frac1p \in (0,1)$. So among the three cases above, "=" holds in $\sum_i |x_iy_i| \le |x|_p|y|_q$ if and only if for each $i$, $\frac{|x_i|^p}{|x|_p^p} = \frac{|y_i|^q}{|y|_q^q}$, that is, $(|x_1|^p, \dots, |x_n|^p)$ and $(|y_1|^q, \dots, |y_n|^q)$ are proportional.

For $\sum_i x_iy_i = \sum_i |x_iy_i|$ to hold, we need $x_iy_i = |x_iy_i|$ for each $i$, which is the same as $\operatorname{sign}(x_i) = \operatorname{sign}(y_i)$ for each $i$. In summary, we conclude $\sum_i x_iy_i \le |x|_p|y|_q$, and "=" holds if and only if $(|x_1|^p, \dots, |x_n|^p)$ and $(|y_1|^q, \dots, |y_n|^q)$ are proportional to each other and $\operatorname{sign}(x_i) = \operatorname{sign}(y_i)$ ($i = 1, \dots, n$).

I 5. (page 216) Prove that
$$|x|_\infty = \lim_{p\to\infty}|x|_p,$$
where $|x|_\infty$ is defined by (3).

Proof. Given $x \ne 0$, we note
$$|x|_p = |x|_\infty\left(\sum_{i=1}^n\frac{|x_i|^p}{|x|_\infty^p}\right)^{1/p} \quad\text{and}\quad 1 \le \left(\sum_{i=1}^n\frac{|x_i|^p}{|x|_\infty^p}\right)^{1/p} \le n^{1/p}.$$
Letting $p \to \infty$, we see $|x|_p \to |x|_\infty$.
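A quick numeric illustration of this convergence (a sketch, not part of the book's exercises):

```python
# |x|_p decreases toward |x|_inf as p grows.
import numpy as np

x = np.array([3.0, -1.0, 2.0, -3.0])
for p in [1, 2, 4, 16, 64, 256]:
    print(p, np.sum(np.abs(x) ** p) ** (1.0 / p))
print("inf", np.max(np.abs(x)))  # the limit, here 3.0
```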

I 6. (page 219) Prove that every subspace of a finite-dimensional normed linear space is closed.

Proof. Every linear subspace of a finite-dimensional normed linear space is again a finite-dimensional normed linear space. So the problem is reduced to proving that any finite-dimensional normed space is closed. Fix a basis $e_1, \dots, e_n$ and introduce the following norm: if $x = \sum_j a_je_j$, $\|x\| := (\sum_j |a_j|^2)^{1/2}$. Then the original norm $|\cdot|$ is equivalent to $\|\cdot\|$. So $(x_k)_{k=1}^\infty$ is a Cauchy sequence under $|\cdot|$ if and only if $\{(a_{k1}, \dots, a_{kn})\}_{k=1}^\infty$ is a Cauchy sequence in $\mathbb{C}^n$ or $\mathbb{R}^n$, where $x_k = \sum_{j=1}^n a_{kj}e_j$. Since $\mathbb{C}^n$ and $\mathbb{R}^n$ are complete, there exists $(b_1, \dots, b_n)$ in $\mathbb{C}^n$ or $\mathbb{R}^n$ such that $x_k \to x = \sum_j b_je_j$ in $\|\cdot\|$, and hence in $|\cdot|$.

I 7. (page 221) Show that the infimum in Lemma 5 is a minimum.

Proof. If $(z_k)_{k=1}^\infty \subset Y$ is such that $|x - z_k| \to d := \inf_{y\in Y}|x - y|$, then for $k$ sufficiently large, $|z_k| \le |z_k - x| + |x| \le 2d + |x|$. Since $Y = \operatorname{span}\{y_1, \dots, y_n\}$ is a finite-dimensional space, by Theorem 3(ii), $(z_k)_{k=1}^\infty$ has a subsequence which converges to a point $y_0 \in Y$. Then $\inf_{y\in Y}|x - y|$ is attained at $y_0$.

I 8. (page 223) Show that |l|′ defined by (23) satisfies all postulates for a norm listed in (1).

Proof. (i) Positivity: $|\xi|' = 0$ implies $\xi x = 0$ for all $x$ with $|x| = 1$. So for any $y \ne 0$, $\xi y = |y|\,\xi(y/|y|) = 0$, i.e., $\xi \equiv 0$. So $|\xi|' = 0$ implies $\xi = 0$; equivalently, $\xi \ne 0$ implies $|\xi|' > 0$. $|0|' = 0$ is obvious.
(ii) Subadditivity: $|\xi_1 + \xi_2|' = \sup_{|x|=1}(\xi_1 + \xi_2)x \le \sup_{|x|=1}\xi_1x + \sup_{|x|=1}\xi_2x = |\xi_1|' + |\xi_2|'$.
(iii) Homogeneity: $|k\xi|' = \sup_{|x|=1}|k\xi x| = |k|\sup_{|x|=1}|\xi x| = |k||\xi|'$.


I 9. (page 228) (i) Show that for all rational $r$,
$$(rx, y) = r(x, y).$$
(ii) Show that for all real $k$,
$$(kx, y) = k(x, y).$$

(i)

Proof. By formulas (47) and (48), it suffices to prove the equality for positive rational $r$. Suppose $r = \frac{q}{p}$ with $p, q \in \mathbb{Z}_+$. By formula (49) and induction, we have
$$\underbrace{(x, y) + \cdots + (x, y)}_{n} = (nx, y).$$
Therefore $p(rx, y) = (prx, y) = (qx, y) = q(x, y)$, i.e., $(rx, y) = \frac{q}{p}(x, y) = r(x, y)$.

(ii)

Proof. For any given $k$, we can find a sequence of rational numbers $(r_n)_{n=1}^\infty$ such that $r_n \to k$ as $n \to \infty$. Then $k(x, y) = \lim_{n\to\infty}r_n(x, y) = \lim_{n\to\infty}(r_nx, y) = (\lim_{n\to\infty}r_nx, y) = (kx, y)$, where the third "=" uses the fact that $(\cdot, y)$ defines a continuous linear functional on $X$.

15 Linear Mappings Between Normed Linear Spaces

The book's own solution gives answers to Ex 1, 3, 5, 6.

⋆ Erratum: In Exercise 7 (p. 236), it should be "defined by formulas (3) and (5) in Chapter 14" instead of "defined by formulas (3) and (4) in Chapter 14".

I 1. (page 230) Show that every linear map $T : X \to Y$ is continuous, that is, if $\lim x_n = x$, then $\lim Tx_n = Tx$.

Proof. By Lemma 1, $|Tx_n - Tx| = |T(x_n - x)| \le c|x_n - x|$. So if $\lim_{n\to\infty}x_n = x$, then $\lim_{n\to\infty}Tx_n = Tx$.

I 2. (page 235) Show that if for every $x$ in $X$, $|T_nx - Tx|$ tends to zero as $n \to \infty$, then $|T_n - T|$ tends to zero.

Proof. Suppose $|T_n - T|$ does not tend to zero. Then there exist $\varepsilon > 0$ and a sequence $(x_n)_{n=1}^\infty$ such that $|x_n| = 1$ and $|(T_n - T)x_n| \ge \varepsilon$. By Theorem 3(ii), we can without loss of generality assume $(x_n)_{n=1}^\infty$ converges to some point $x^*$. Then
$$|(T_n - T)x_n| - |(T_n - T)x^*| \le |(T_n - T)x_n - (T_n - T)x^*| \le |T_n(x_n - x^*)| + |T(x_n - x^*)| \le |T_n||x_n - x^*| + |T||x_n - x^*|.$$
For $n$ sufficiently large, $|(T_n - T)x_n| - |(T_n - T)x^*|$ is greater than $\varepsilon/2$, while $|T_n||x_n - x^*| + |T||x_n - x^*|$ can be made as small as we want, provided that we can prove $(|T_n|)_{n=1}^\infty$ is bounded. Indeed, this is the principle of uniform boundedness (see, for example, Lax [7], Chapter 10, Theorem 3). Thus we have arrived at a contradiction, which shows our assumption is wrong.

Remark 13. Can we find an elementary proof that avoids the principle of uniform boundedness from functional analysis, especially since we are working with finite-dimensional spaces?


I 3. (page 235) Show that $T_n = \sum_0^n R^k$ converges to $S^{-1}$ in the sense of definition (16).

Proof. First of all, $\sum_{k=0}^\infty R^k$ is well-defined: since $|R| < 1$, $\big(\sum_{k=0}^K R^k\big)_{K=0}^\infty$ is a Cauchy sequence in the space of linear mappings of $X$ into $X$. Then we note
$$S\sum_{k=0}^\infty R^k = \sum_{k=0}^\infty R^k - \sum_{k=1}^\infty R^k = I, \qquad \left(\sum_{k=0}^\infty R^k\right)S = \sum_{k=0}^\infty R^k - \sum_{k=1}^\infty R^k = I.$$
So $S$ is invertible and $S^{-1} = \sum_{k=0}^\infty R^k$.

I 4. (page 235) Deduce Theorem 5 from Theorem 6 by factoring S = T + S − T as T [I − T−1(S − T )].

Proof. Assume all the conditions of Theorem 5. Define $R = -T^{-1}(S - T)$; then $|R| \le |T^{-1}||S - T| < 1$. So by Theorem 6, $I - R = T^{-1}S$ is invertible, hence $S = T(T^{-1}S)$ is invertible.

I 5. (page 235) Show that Theorem 6 remains true if the hypothesis (17) is replaced by the following hypothesis: for some positive integer $m$,
$$|R^m| < 1.$$

Proof. If $|R^m| < 1$ for some $m$, define $U = \sum_{k=0}^\infty R^{km}$; then $U$ is well-defined and satisfies $U = I + R^mU$. The following linear map is then also well-defined: $V = U + RU + \cdots + R^{m-1}U$. With $S = I - R$,
$$SV = U + RU + \cdots + R^{m-1}U - (RU + R^2U + \cdots + R^mU) = U - R^mU = I.$$
This shows $S$ is invertible.

I 6. (page 235) Take $X = Y = \mathbb{R}^n$, and $T : X \to X$ the matrix $(t_{ij})$. Take for the norm $|x|$ the maximum norm $|x|_\infty$ defined by formula (3) of Chapter 14. Show that the norm $|T|$ of the matrix $(t_{ij})$, regarded as a mapping of $X$ into $X$, is
$$|T| = \max_i\sum_j|t_{ij}|.$$

Proof. For any $x \in \mathbb{R}^n$, $|Tx|_\infty = \max_i|\sum_{j=1}^n t_{ij}x_j| \le \max_i(\sum_{j=1}^n|t_{ij}|)\,|x|_\infty$. So $|T| = \sup_{x\ne0}\frac{|Tx|_\infty}{|x|_\infty} \le \max_i\sum_j|t_{ij}|$. For the other direction, suppose $\sum_j|t_{i_0j}| = \max_i\sum_j|t_{ij}|$ and choose
$$x^* = (\operatorname{sign}(t_{i_01}), \dots, \operatorname{sign}(t_{i_0n}))^T.$$
Then $|x^*|_\infty = 1$ and $Tx^* = (\sum_j t_{1j}x^*_j, \dots, \sum_j t_{i_0j}x^*_j, \dots, \sum_j t_{nj}x^*_j)^T$. So
$$|Tx^*|_\infty \ge \sum_j|t_{i_0j}| = \max_i\sum_j|t_{ij}|.$$
This implies $|T| \ge \max_i\sum_j|t_{ij}|$. Combined, we conclude $|T| = \max_i\sum_j|t_{ij}|$.
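A numeric check of this formula (an illustration, not from the book): since the unit ball of $|\cdot|_\infty$ is a cube and $x \mapsto Tx$ is linear, the maximum of $|Tx|_\infty$ is attained at a corner, i.e., a $\pm1$ sign vector.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
T = rng.normal(size=(4, 4))

row_sum_norm = np.max(np.sum(np.abs(T), axis=1))  # max_i sum_j |t_ij|
corners = np.array(list(itertools.product([-1.0, 1.0], repeat=4)))
operator_norm = np.max(np.abs(corners @ T.T))     # max of |Tx|_inf over |x|_inf = 1

print(row_sum_norm, operator_norm)  # equal
```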

I 7. (page 236) Take $X$ to be $\mathbb{R}^n$ normed by the maximum norm $|x|_\infty$, and $Y$ to be $\mathbb{R}^n$ normed by the 1-norm $|x|_1$, defined by formulas (3) and (5) in Chapter 14. Show that the norm of the matrix $(t_{ij})$, regarded as a mapping of $X$ into $Y$, is bounded by
$$|T| \le \sum_{i,j}|t_{ij}|.$$

Proof. For any $x = (x_1, \dots, x_n)'$, we have
$$|Tx|_1 = \sum_{i=1}^n\Big|\sum_{j=1}^n t_{ij}x_j\Big| \le \sum_{i=1}^n\sum_{j=1}^n|t_{ij}x_j| \le \sum_{i,j}|t_{ij}|\,|x|_\infty.$$
So $|T| = \sup_{|x|_\infty=1}\frac{|Tx|_1}{|x|_\infty} \le \sum_{i,j}|t_{ij}|$.


I 8. (page 236) $X$ is any finite-dimensional normed linear space over $\mathbb{C}$, and $T$ is a linear mapping of $X$ into $X$. Denote by $t_j$ the eigenvalues of $T$, and denote by $r(T)$ its spectral radius:
$$r(T) = \max_j|t_j|.$$
(i) Show that $|T| \ge r(T)$.
(ii) Show that $|T^n| \ge r(T)^n$.
(iii) Show, using Theorem 18 of Chapter 7, that
$$\lim_{n\to\infty}|T^n|^{1/n} = r(T).$$

Proof. The proof is very similar to the content on page 97 of Chapter 7, the material leading up to Theorem 18. So we omit the proof.

16 Positive Matrices

The book's own solution gives answers to Ex 1, 2.

⋆ Comments: To see the property of complex numbers mentioned on p. 240, note that if $z_1, z_2 \in \mathbb{C}$, then $|z_1 + z_2| = |z_1| + |z_2|$ if and only if $\arg z_1 = \arg z_2$. For $n \ge 3$, if $|\sum_{i=1}^n z_i| = \sum_{i=1}^n|z_i|$, then
$$\Big|\sum_{i=1}^n z_i\Big| \le \Big|\sum_{i=3}^n z_i\Big| + |z_1 + z_2| \le \Big|\sum_{i=3}^n z_i\Big| + |z_1| + |z_2| \le \sum_{i=1}^n|z_i| = \Big|\sum_{i=1}^n z_i\Big|.$$
So $|z_1 + z_2| = |z_1| + |z_2|$, and hence $\arg z_1 = \arg z_2$. Then we work by induction.

I 1. (page 240) Denote by $t(P)$ the set of nonnegative $\lambda$ such that
$$Px \le \lambda x, \quad x \ge 0$$
for some vector $x \ne 0$. Show that the dominant eigenvalue $\lambda(P)$ satisfies
$$\lambda(P) = \min_{\lambda\in t(P)}\lambda.$$

Proof. Let $x^* = (1, \dots, 1)^T$ and $\lambda^* = \max_{1\le i\le n}\sum_{j=1}^n p_{ij}$; then $Px^* \le \lambda^*x^*$. So $t(P) \ne \emptyset$ and $t^*(P) = \{0 \le \lambda \le \lambda^* : \lambda \in t(P)\}$ is a bounded, nonempty set. We show further that $t^*(P)$ is closed. Suppose $(\lambda_m)_{m=1}^\infty \subset t^*(P)$ converges to a point $\lambda$. Denote by $x^m$ the nonnegative, nonzero vector such that $Px^m \le \lambda_mx^m$. Without loss of generality, assume $\sum_{i=1}^n x^m_i = 1$. Then $(x^m)_{m=1}^\infty$ is bounded and we can assume $x^m \to x$ for some $x \ge 0$ with $\sum_{i=1}^n x_i = 1$. Passing to the limit gives $Px \le \lambda x$. Clearly $0 \le \lambda \le \lambda^*$, so $\lambda \in t^*(P)$. This shows $t^*(P)$ is compact, and hence $t(P)$ has a minimum $\bar\lambda$.

Denote by $\bar x$ the nonzero, nonnegative vector such that $P\bar x \le \bar\lambda\bar x$. We show that in fact $P\bar x = \bar\lambda\bar x$. Assume not; then there exists some $k \in \{1, \dots, n\}$ such that $\sum_{j=1}^n p_{ij}\bar x_j \le \bar\lambda\bar x_i$ for $i \ne k$ and $\sum_{j=1}^n p_{kj}\bar x_j < \bar\lambda\bar x_k$. Consider the vector $x = \bar x - \epsilon e_k$, where $\epsilon > 0$ and $e_k$ has the $k$-th component equal to 1 and all other components zero. Then in the inequality $Px \le \bar\lambda x$, each component of the left-hand side decreases when $\bar x$ is replaced by $x$, while only the $k$-th component of the right-hand side decreases, by the amount $\bar\lambda\epsilon$. So for $\epsilon$ small enough, $Px < \bar\lambda x$, and we can find a $\lambda < \bar\lambda$ such that $Px \le \lambda x$. Note $\bar\lambda > 0$ (otherwise $P\bar x \le 0$, impossible for a positive matrix $P$ and nonzero $\bar x \ge 0$), so we can also take $\lambda > 0$. This contradicts $\bar\lambda = \min_{\lambda\in t(P)}\lambda$. We have shown $\bar\lambda > 0$ is an eigenvalue of $P$ with a nonzero, nonnegative eigenvector. By Theorem 1(iv), $\bar\lambda = \lambda(P)$.
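A numeric illustration (not from the book): for an entrywise positive $P$, the Perron eigenvector attains the minimum in the characterization above, since $P\bar x = \lambda(P)\bar x$ makes $\lambda(P)$ the smallest admissible $\lambda$.

```python
import numpy as np

rng = np.random.default_rng(1)
P = rng.uniform(0.1, 1.0, size=(5, 5))   # entrywise positive matrix

eigvals, eigvecs = np.linalg.eig(P)
k = np.argmax(np.abs(eigvals))
lam = eigvals[k].real                    # dominant eigenvalue lambda(P)
x = np.abs(eigvecs[:, k].real)           # Perron vector, entrywise positive

print(lam, np.max(P @ x / x))  # equal: P x <= lam * x holds with equality
```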

I 2. (page 243) Show that if some power $P^m$ of $P$ is positive, then $P$ has a dominant positive eigenvalue.

Proof. $P^m$ has a dominant positive eigenvalue $\lambda_0$. By the Spectral Mapping Theorem, there is an eigenvalue $\lambda$ of $P$ such that $\lambda^m = \lambda_0$. Suppose $\lambda$ is real; then we can further assume $\lambda > 0$ by replacing $\lambda$ with $-\lambda$ if necessary. Then for any other eigenvalue $\lambda'$ of $P$, the Spectral Mapping Theorem implies $(\lambda')^m$ is an eigenvalue of $P^m$. So $|(\lambda')^m| < \lambda_0 = \lambda^m$, i.e., $|\lambda'| < \lambda$.

To show $\lambda$ can be taken real, denote by $x$ the eigenvector of $P^m$ associated with $\lambda_0$. Then
$$P^mx = \lambda_0x.$$
Let $P$ act on this relation:
$$P^{m+1}x = P^m(Px) = \lambda_0Px.$$
This shows $Px$ is also an eigenvector of $P^m$ with eigenvalue $\lambda_0$. By Theorem 1(iv), $Px = cx$ for some positive number $c$. Repeated application of $P$ shows that $P^mx = c^mx$. Therefore $c^m = \lambda_0$. Let $\lambda = c$.

17 How to Solve Systems of Linear Equations

⋆ Comments: The three-term recursion formula
$$x_{n+1} = (s_nA + p_nI)x_n + q_nx_{n-1} - s_nb$$
was introduced by Rutishauser et al. [1]. See Papadrakakis [11] for a survey of a family of vector iterative methods with three-term recursion formulae, and Golub and van Loan [2] for a gentle introduction to the Chebyshev semi-iterative method (section 10.1.5).

I 1. (page 248) Show that κ(A) is ≥ 1.

Proof. I = AA−1. So 1 = |I| = |AA−1| ≤ |A||A−1| = κ(A).

I 2. (page 254) Suppose $\kappa = 100$, $\|e_0\| = 1$, and $(1/\alpha)F(x_0) = 1$; how large do we have to take $N$ in order to make $\|e_N\| < 10^{-3}$, (a) using the method in Section 1, (b) using the method in Section 2?

Proof. If we use the method of Section 1, we need to solve for $N$ the inequality $\frac{2}{\alpha}(1 - \frac1\kappa)^NF(x_0) < 10^{-3}$. Plugging in the numbers, we need $N \ge 757$. If we use the method of Section 2, we need to solve for $N$ the inequality $2(1 + \frac{2}{\sqrt\kappa})^{-N}\|e_0\| < 10^{-3}$. Plugging in the numbers, we need $N \ge 42$. The numbers of steps needed by the respective methods differ a great deal.
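The two step counts can be reproduced in a few lines (a check of the arithmetic above, not part of the book):

```python
import math

kappa = 100.0
# Method of Section 1: solve 2 * (1 - 1/kappa)**N < 1e-3 for N.
N1 = math.log(2000.0) / -math.log(1.0 - 1.0 / kappa)
# Method of Section 2: solve 2 * (1 + 2/sqrt(kappa))**(-N) < 1e-3 for N.
N2 = math.log(2000.0) / math.log(1.0 + 2.0 / math.sqrt(kappa))

print(math.ceil(N1), math.ceil(N2))  # 757 and 42
```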

I 3. (page 261) Write a computer program to evaluate the quantities sn, pn, and qn.

Solution. We first summarize the algorithm. We need to solve the system of linear equations $Ax = b$, where $b$ is a given vector and $A$ is an invertible matrix. We start with an initial guess $x_0$ of the solution and define $r_0 = Ax_0 - b$. TO BE CONTINUED ...
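Since the solution above stops before the formulas for $s_n$, $p_n$, $q_n$, here is a stand-in sketch: the standard conjugate gradient iteration for symmetric positive definite $A$, the best-known member of this family of methods (an illustrative assumption, not the book's derivation of $s_n$, $p_n$, $q_n$).

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A (standard CG sketch)."""
    x = x0.astype(float).copy()
    r = b - A @ x              # residual
    p = r.copy()               # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)          # step length along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p      # next direction, A-conjugate to the old
        rs = rs_new
    return x
```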

I 4. (page 261) Use the computer program to solve a system of equations of your choice.

Solution. We solve the following problem from the first edition of this book: use the computer program in Exercise 3 to solve the system of equations
$$Ax = f, \qquad A_{ij} = c + \frac{1}{i + j + 1}, \qquad f_i = \frac{1}{i!},$$
$c$ some nonnegative constant. Vary $c$ between 0 and 1, and the order $K$ of the system between 5 and 20. TO BE CONTINUED ...
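A possible driver for this system (an assumption, not the book's program), reusing the `conjugate_gradient` sketch from Exercise 3. Note that $A$ is symmetric positive definite, being a Cauchy-type matrix plus a nonnegative constant shift, but it becomes very ill-conditioned as $K$ grows, which shows up in the residuals.

```python
import math
import numpy as np

def build_system(K, c):
    i = np.arange(1, K + 1)
    A = c + 1.0 / (i[:, None] + i[None, :] + 1.0)       # A_ij = c + 1/(i+j+1)
    f = np.array([1.0 / math.factorial(k) for k in i])  # f_i = 1/i!
    return A, f

for c in (0.0, 0.5, 1.0):
    for K in (5, 10, 20):
        A, f = build_system(K, c)
        x = conjugate_gradient(A, f, np.zeros(K))
        print(c, K, np.linalg.norm(A @ x - f))  # residual degrades as K grows
```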


18 How to Calculate the Eigenvalues of Self-Adjoint Matrices

I 1. (page 266) Show that the off-diagonal entries of $A_k$ tend to zero as $k$ tends to $\infty$.

Proof. Skipped for this version.

I 2. (page 266) Show that the mapping (16) is norm-preserving.

Proof. Skipped for this version.

I 3. (page 270) (i) Show that $BL - LB$ is a tridiagonal matrix.
(ii) Show that if $L$ satisfies the differential equation (21), its entries satisfy
$$\frac{d}{dt}a_k = 2(b_k^2 - b_{k-1}^2), \qquad \frac{d}{dt}b_k = b_k(a_{k+1} - a_k),$$
where $k = 1, \dots, n$ and $b_0 = b_n = 0$.

A Appendix

A.1 Special Determinants

I 1. (page 304) Let
$$p(s) = x_1 + x_2s + \cdots + x_ns^{n-1}$$
be a polynomial of degree less than $n$. Let $a_1, \dots, a_n$ be $n$ distinct numbers, and let $p_1, \dots, p_n$ be $n$ arbitrary complex numbers; we wish to choose the coefficients $x_1, \dots, x_n$ so that
$$p(a_i) = p_i, \quad i = 1, \dots, n.$$
This is a system of $n$ linear equations for the $n$ coefficients $x_i$. Find the matrix of this system of equations, and show that its determinant is $\ne 0$.

Solution. The system of equations $p(a_i) = p_i$ ($i = 1, \dots, n$) can be written as
$$\begin{pmatrix} 1 & a_1 & \cdots & a_1^{n-1} \\ 1 & a_2 & \cdots & a_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & a_n & \cdots & a_n^{n-1} \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} p_1 \\ p_2 \\ \vdots \\ p_n \end{pmatrix}.$$
Since $a_1, \dots, a_n$ are distinct, by the formula for the determinant of the Vandermonde matrix, the determinant of the matrix equals $\prod_{j>i}(a_j - a_i) \ne 0$.
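A numeric check of the Vandermonde determinant formula (an illustration, not part of the book):

```python
import itertools
import numpy as np

a = np.array([0.5, 1.0, 2.0, 3.0])
V = np.vander(a, increasing=True)  # rows (1, a_i, a_i^2, a_i^3)

det_formula = np.prod([a[j] - a[i]
                       for i, j in itertools.combinations(range(len(a)), 2)])
print(np.linalg.det(V), det_formula)  # both 3.75, in particular nonzero
```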

I 2. (page 304) Find an algebraic formula for the determinant of the matrix whose $ij$th element is
$$\frac{1}{1 + a_ia_j};$$
here $a_1, \dots, a_n$ are arbitrary scalars.


Solution. Denote the $n \times n$ matrix by $A$. We claim
$$\det A = (-1)^{n(n-1)/2}\,\frac{\prod_{j>i}(a_j - a_i)^2}{\prod_{i,j}(1 + a_ia_j)}.$$
(Note the sign, which is easy to miss: for $n = 2$ a direct computation gives $\det A = -\frac{(a_2 - a_1)^2}{(1 + a_1^2)(1 + a_2^2)(1 + a_1a_2)^2}$.) Indeed, subtracting column 1 from each of the other columns turns the $ij$th entry ($j \ge 2$) into
$$\frac{1}{1 + a_ia_j} - \frac{1}{1 + a_ia_1} = \frac{a_i(a_1 - a_j)}{(1 + a_ia_1)(1 + a_ia_j)},$$
while column 1 keeps the entries $\frac{1}{1 + a_ia_1}$. Extracting the common factor $\frac{1}{1 + a_ia_1}$ from each row $i$ and the common factor $(a_1 - a_j)$ from each column $j = 2, \dots, n$ gives
$$\det A = \frac{\prod_{j=2}^n(a_1 - a_j)}{\prod_{i=1}^n(1 + a_ia_1)}\det\begin{pmatrix} 1 & \frac{a_1}{1 + a_1a_2} & \cdots & \frac{a_1}{1 + a_1a_n} \\ 1 & \frac{a_2}{1 + a_2a_2} & \cdots & \frac{a_2}{1 + a_2a_n} \\ \vdots & \vdots & & \vdots \\ 1 & \frac{a_n}{1 + a_na_2} & \cdots & \frac{a_n}{1 + a_na_n} \end{pmatrix}.$$
Next, subtracting row 1 from each of the other rows turns the $ij$th entry ($i, j \ge 2$) into
$$\frac{a_i}{1 + a_ia_j} - \frac{a_1}{1 + a_1a_j} = \frac{a_i - a_1}{(1 + a_1a_j)(1 + a_ia_j)},$$
and column 1 into $(1, 0, \dots, 0)^T$. Expanding along the first column, then extracting the common factor $(a_i - a_1)$ from each row $i = 2, \dots, n$ and $\frac{1}{1 + a_1a_j}$ from each column $j = 2, \dots, n$, we get
$$\det A = \frac{\prod_{j=2}^n(a_1 - a_j)\prod_{i=2}^n(a_i - a_1)}{\prod_{i=1}^n(1 + a_ia_1)\prod_{j=2}^n(1 + a_1a_j)}\det A' = \frac{(-1)^{n-1}\prod_{j=2}^n(a_j - a_1)^2}{\prod_{i=1}^n(1 + a_ia_1)\prod_{j=2}^n(1 + a_1a_j)}\det A',$$
where $A'$ is the $(n-1)\times(n-1)$ matrix of the same form built from $a_2, \dots, a_n$. The denominator collects exactly the factors $(1 + a_ia_j)$ in which one of the indices equals 1, and the sign factors accumulate as $(-1)^{n-1}(-1)^{n-2}\cdots(-1)^1 = (-1)^{n(n-1)/2}$. By induction, this proves the claim.
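A numeric check of the closed form, including the sign $(-1)^{n(n-1)/2}$ (an illustration, not part of the book):

```python
import itertools
import numpy as np

a = np.array([0.3, 0.7, 1.4, 2.2, 3.1])
n = len(a)
A = 1.0 / (1.0 + np.outer(a, a))

sign = (-1.0) ** (n * (n - 1) // 2)
formula = sign * np.prod([(a[j] - a[i]) ** 2
                          for i, j in itertools.combinations(range(n), 2)]) \
               / np.prod(1.0 + np.outer(a, a))
print(np.linalg.det(A), formula)  # agree
```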

A.2 The Pfaffian

I 1. (page 306) Verify by a calculation Cayley's theorem for $n = 4$.


Proof. By the Laplace expansion along the first column (and Exercise 16 of Chapter 5), we have
$$\det\begin{pmatrix} 0 & a & b & c \\ -a & 0 & d & e \\ -b & -d & 0 & f \\ -c & -e & -f & 0 \end{pmatrix} = a\det\begin{pmatrix} a & b & c \\ -d & 0 & f \\ -e & -f & 0 \end{pmatrix} - b\det\begin{pmatrix} a & b & c \\ 0 & d & e \\ -e & -f & 0 \end{pmatrix} + c\det\begin{pmatrix} a & b & c \\ 0 & d & e \\ -d & 0 & f \end{pmatrix}$$
$$= a(-bef + cdf + af^2) - b(-be^2 + cde + aef) + c(adf - bde + cd^2)$$
$$= -2abef + 2acdf - 2bcde + a^2f^2 + b^2e^2 + c^2d^2 = (af - be + cd)^2.$$
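The same identity can be confirmed symbolically (a check, not part of the book):

```python
import sympy as sp

a, b, c, d, e, f = sp.symbols('a b c d e f')
A = sp.Matrix([[ 0,  a,  b, c],
               [-a,  0,  d, e],
               [-b, -d,  0, f],
               [-c, -e, -f, 0]])
print(sp.simplify(A.det() - (a*f - b*e + c*d) ** 2))  # 0
```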

A.3 Symplectic Matrices

I 1. (page 308) Prove that any real $2n \times 2n$ anti-self-adjoint matrix $A$, $\det A \ne 0$, can be written in the form
$$A = FJF^T,$$
$J$ defined by (1), $F$ some real matrix, $\det F \ne 0$.

Proof. We work by induction. For $n = 1$, $A$ has the form $\begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix}$. Since $\det A \ne 0$, $a \ne 0$. We note
$$\begin{pmatrix} 1 & 0 \\ 0 & \frac1a \end{pmatrix}\begin{pmatrix} 0 & a \\ -a & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & \frac1a \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
Now assume the claim is true for $1, \dots, n$; we show it also holds for $n + 1$. Write $A$ in the form
$$A = \begin{pmatrix} 0 & a & * \\ -a & 0 & * \\ * & * & A_1 \end{pmatrix},$$
where $A_1$ is a $2n \times 2n$ anti-self-adjoint matrix. Then
$$\begin{pmatrix} 1 & 0 & 0_{1\times2n} \\ 0 & \frac1a & 0_{1\times2n} \\ 0_{2n\times1} & 0_{2n\times1} & I_{2n} \end{pmatrix}\begin{pmatrix} 0 & a & * \\ -a & 0 & * \\ * & * & A_1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0_{1\times2n} \\ 0 & \frac1a & 0_{1\times2n} \\ 0_{2n\times1} & 0_{2n\times1} & I_{2n} \end{pmatrix} = \begin{pmatrix} 0 & 1 & * \\ -1 & 0 & * \\ * & * & A_1 \end{pmatrix}.$$
Recall that multiplying by an elementary matrix from the left performs an elementary row operation, while multiplying by an elementary matrix from the right performs an elementary column operation. Since $A$ is anti-self-adjoint ($A_{ij} = -A_{ji}$), we can find a sequence of elementary matrices $U_1, U_2, \dots, U_k$ such that
$$U_k\cdots U_2U_1\begin{pmatrix} 0 & 1 & * \\ -1 & 0 & * \\ * & * & A_1 \end{pmatrix}U_1^TU_2^T\cdots U_k^T = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & A_1 \end{pmatrix}.$$
By the induction assumption, $A_1 = F_1J_1F_1^T$ for some real matrix $F_1$ with $\det F_1 \ne 0$, where $J_1 = \begin{pmatrix} 0_{n\times n} & I_{n\times n} \\ -I_{n\times n} & 0_{n\times n} \end{pmatrix}$. Therefore, writing $U := U_k\cdots U_2U_1$,
$$\begin{pmatrix} I_{2\times2} & 0 \\ 0 & F_1^{-1} \end{pmatrix}U\begin{pmatrix} 0 & 1 & * \\ -1 & 0 & * \\ * & * & A_1 \end{pmatrix}U^T\begin{pmatrix} I_{2\times2} & 0 \\ 0 & (F_1^{-1})^T \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & J_1 \end{pmatrix}.$$
Define the permutation matrix
$$L = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & I_{n\times n} \\ 1 & 0 & 0 & 0 \\ 0 & 0 & I_{n\times n} & 0 \end{pmatrix}$$
and
$$F = \left[L\begin{pmatrix} I_{2\times2} & 0 \\ 0 & F_1^{-1} \end{pmatrix}U\begin{pmatrix} 1 & 0 & 0_{1\times2n} \\ 0 & \frac1a & 0_{1\times2n} \\ 0_{2n\times1} & 0_{2n\times1} & I_{2n} \end{pmatrix}\right]^{-1}.$$
Then
$$F^{-1}A(F^{-1})^T = L\begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & J_1 \end{pmatrix}L^T = \begin{pmatrix} 0 & I_{(n+1)\times(n+1)} \\ -I_{(n+1)\times(n+1)} & 0 \end{pmatrix},$$
that is, $A = FJF^T$ with $J$ of size $2(n+1)$. By induction, we have proved the claim.

I 2. (page 310) Prove the converse.

Proof. For any given $x$ and $y$, define $f(t) = (S(t)x, JS(t)y)$. Then we have
$$\frac{d}{dt}f(t) = \left(\frac{d}{dt}S(t)x, JS(t)y\right) + \left(S(t)x, J\frac{d}{dt}S(t)y\right)$$
$$= (G(t)S(t)x, JS(t)y) + (S(t)x, JG(t)S(t)y)$$
$$= (JL(t)S(t)x, JS(t)y) + (S(t)x, J^2L(t)S(t)y)$$
$$= (L(t)S(t)x, J^TJS(t)y) - (S(t)x, L(t)S(t)y)$$
$$= (S(t)x, L(t)S(t)y) - (S(t)x, L(t)S(t)y) = 0.$$
So $f(t) = f(0) = (S(0)x, JS(0)y) = (x, Jy)$. Since $x$ and $y$ are arbitrary, we conclude $S(t)$ is a family of symplectic matrices.

I 3. (page 311) Prove that plus or minus 1 cannot be an eigenvalue of odd multiplicity of a symplectic matrix.

Proof. Skipped for this version.

I 4. (page 312) Verify Theorem 6.

Proof. We note
$$\frac{dv}{dt} = \begin{pmatrix} \sum_{i=1}^{2n}\frac{\partial v_1}{\partial u_i}\frac{du_i}{dt} \\ \vdots \\ \sum_{i=1}^{2n}\frac{\partial v_{2n}}{\partial u_i}\frac{du_i}{dt} \end{pmatrix} = \frac{\partial v}{\partial u}\frac{du}{dt} = \frac{\partial v}{\partial u}JH_u.$$
$H(u)$ can be seen as a function of $v$: $K(v) := H(u(v))$. So
$$K_v = \begin{pmatrix} \frac{\partial K}{\partial v_1} \\ \vdots \\ \frac{\partial K}{\partial v_{2n}} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{2n}\frac{\partial H}{\partial u_i}\frac{\partial u_i}{\partial v_1} \\ \vdots \\ \sum_{i=1}^{2n}\frac{\partial H}{\partial u_i}\frac{\partial u_i}{\partial v_{2n}} \end{pmatrix} = \begin{pmatrix} \frac{\partial u_1}{\partial v_1} & \cdots & \frac{\partial u_{2n}}{\partial v_1} \\ \vdots & & \vdots \\ \frac{\partial u_1}{\partial v_{2n}} & \cdots & \frac{\partial u_{2n}}{\partial v_{2n}} \end{pmatrix}\begin{pmatrix} \frac{\partial H}{\partial u_1} \\ \vdots \\ \frac{\partial H}{\partial u_{2n}} \end{pmatrix} = \left(\frac{\partial u}{\partial v}\right)^TH_u.$$
Since $\partial v/\partial u$ is symplectic, by Theorem 2, $\partial u/\partial v$ and $(\partial v/\partial u)^T$ are also symplectic. So using formula (4) gives
$$\frac{dv}{dt} = \frac{\partial v}{\partial u}J\left(\frac{\partial v}{\partial u}\right)^T\left(\frac{\partial u}{\partial v}\right)^TH_u = J\left(\frac{\partial u}{\partial v}\right)^TH_u = JK_v.$$


A.4 Tensor Product

I 1. (page 313) Establish a natural isomorphism between tensor products defined with respect to two pairs of distinct bases.

Solution. Skipped for this version.

I 2. (page 314) Verify that (4) maps U ⊗ V onto L(U ′, V ).

Proof. Skipped for this version.

I 3. (page 316) Show that if $\{u_i\}$ and $\{v_j\}$ are linearly independent, so are $u_i \otimes v_j$. Show that $M_{ij}$ is positive.

Proof. Skipped for this version.

I 4. (page 316) Let $u$ be a twice differentiable function of $x_1, \dots, x_n$ defined in a neighborhood of a point $p$, where $u$ has a local minimum. Let $(A_{ij})$ be a symmetric, nonnegative matrix. Show that
$$\sum_{i,j}A_{ij}\frac{\partial^2u}{\partial x_i\partial x_j}(p) \ge 0.$$

Proof. Skipped for this version.

A.5 Lattices

I 1. (page 318) Show that $a_1$ is a rational number.

Proof. Skipped for this version.

I 2. (page 318) (i) Prove Theorem 2. (ii) Show that unimodular matrices form a group under multiplication.

Proof. Skipped for this version.

I 3. (page 319) Prove Theorem 3.

Proof. Skipped for this version.

I 4. (page 319) Show that $L$ is discrete if and only if there is a positive number $d$ such that the ball of radius $d$ centered at the origin contains no other point of $L$.

Proof. Skipped for this version.

A.6 Fast Matrix Multiplication

There are no exercise problems for this section. For examples of implementations of Strassen's algorithm, we refer to Huss-Lederman et al. [9] and references therein.
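For concreteness, here is a minimal recursive sketch of Strassen's seven-multiplication scheme (an illustration, not one of the tuned implementations discussed in [9]); it assumes square matrices whose size is a power of two.

```python
import numpy as np

def strassen(A, B, leaf=64):
    """Multiply square matrices of power-of-two size by Strassen's recursion."""
    n = A.shape[0]
    if n <= leaf:                       # fall back to ordinary multiplication
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = strassen(A11 + A22, B11 + B22, leaf)
    M2 = strassen(A21 + A22, B11, leaf)
    M3 = strassen(A11, B12 - B22, leaf)
    M4 = strassen(A22, B21 - B11, leaf)
    M5 = strassen(A11 + A12, B22, leaf)
    M6 = strassen(A21 - A11, B11 + B12, leaf)
    M7 = strassen(A12 - A22, B21 + B22, leaf)
    C = np.empty_like(A)
    C[:h, :h] = M1 + M4 - M5 + M7
    C[:h, h:] = M3 + M5
    C[h:, :h] = M2 + M4
    C[h:, h:] = M1 - M2 + M3 + M6
    return C
```

For example, with 256 x 256 inputs, strassen(X, Y) agrees with X @ Y up to rounding while using 7 rather than 8 recursive half-size products.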


A.7 Gershgorin's Theorem

I 1. (page 324) Show that if $C_i$ is disjoint from all the other Gershgorin discs, then $C_i$ contains exactly one eigenvalue of $A$.

Proof. Using the notation of the Gershgorin Circle Theorem, let $B(t) = D + tF$, $t \in [0,1]$. The eigenvalues of $B(t)$ are continuous functions of $t$ (Theorem 6 of Chapter 9). For $t = 0$, the eigenvalues of $B(0)$ are the diagonal entries of $A$. As $t$ goes from 0 to 1, the radii of the Gershgorin circles corresponding to $B(t)$ grow while the centers remain the same. So we can find for each $d_i$ a continuous path $\gamma_i(t)$ such that $\gamma_i(0) = d_i$ and $\gamma_i(t)$ is an eigenvalue of $B(t)$ ($0 \le t \le 1$, $i = 1, \dots, n$). Moreover, by the Gershgorin Circle Theorem, each path $\gamma_i(t)$ ($0 \le t \le 1$) is contained in the disc $C_i = \{x : |x - d_i| \le |f_i|_1\}$. If for some $i_1 \ne i_2$ the endpoint $\gamma_{i_2}(1)$ fell into $C_{i_1}$, then necessarily $C_{i_1} \cap C_{i_2} \ne \emptyset$. This implies that any Gershgorin disc that is disjoint from all the other Gershgorin discs contains one and only one eigenvalue of $A$.

Remark 14. There is a strengthened version of the Gershgorin Circle Theorem that can be found at Wikipedia (http://en.wikipedia.org/wiki/Gershgorin_circle_theorem). The above solution is an adaptation of the proof therein. The claim: if the union of $k$ Gershgorin discs is disjoint from the union of the other $(n-k)$ Gershgorin discs, then the former union contains exactly $k$ and the latter exactly $(n-k)$ eigenvalues of $A$.
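A numeric illustration (not from the book): each eigenvalue lies in at least one Gershgorin disc, and the isolated disc around 10 traps exactly one eigenvalue.

```python
import numpy as np

A = np.array([[10.0, 0.1, 0.2],
              [ 0.1, 2.0, 0.3],
              [ 0.2, 0.3, 3.0]])
centers = np.diag(A)
radii = np.sum(np.abs(A), axis=1) - np.abs(centers)

for lam in sorted(np.linalg.eigvals(A).real):
    print(lam, np.abs(lam - centers) <= radii)  # membership in each disc
```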

A.8 The Multiplicity of Eigenvalues

I (page 327) Show that if $n \equiv 2 \pmod 4$, there are no $n \times n$ real matrices $A$, $B$, $C$, not necessarily self-adjoint, such that all their linear combinations (1) have real and distinct eigenvalues.

Proof. Skipped for this version.

A.9 The Fast Fourier Transform

There are no exercise problems for this section.

A.10 The Spectral Radius

⋆ Comments: In the textbook (p. 340), when the author applied the Cauchy integral theorem to get formula (20), $\int_{|z|=s}R(z)z^j\,dz = 2\pi iA^j$, he used the version of the Cauchy integral theorem for the outside region of a simple closed curve (see, for example, Gong and Gong [3], Chapter 2, Exercise 8).

I 1. (page 337) Prove that the eigenvalues of an upper triangular matrix are its diagonal entries.

Proof. If $T$ is an upper triangular matrix with diagonal entries $a_1, \dots, a_n$, then its characteristic polynomial is $p_T(\lambda) = \det(\lambda I - T) = \prod_{i=1}^n(\lambda - a_i)$. So
$$\lambda_0 \text{ is an eigenvalue of } T \iff \det(\lambda_0I - T) = 0 \iff \prod_{i=1}^n(\lambda_0 - a_i) = 0 \iff \lambda_0 \text{ equals one of } a_1, \dots, a_n.$$

I 2. (page 338) Show that the Euclidean norm of a diagonal matrix is the maximum of the absolute values of its eigenvalues.

Proof. Let $D = \operatorname{diag}\{a_1, \dots, a_n\}$. Then $De_i = a_ie_i$, so $\|De_i\| = |a_i|$. This shows $\|D\| \ge \max_{1\le i\le n}|a_i|$. For any $x \in \mathbb{C}^n$, $Dx = (a_1x_1, \dots, a_nx_n)^T$, so $\|Dx\| = \sqrt{\sum_{i=1}^n|a_ix_i|^2} \le \max_{1\le i\le n}|a_i|\cdot\|x\|$. So $\|D\| \le \max_{1\le i\le n}|a_i|$. Combined, we conclude $\|D\| = \max_{1\le i\le n}|a_i|$.


I 3. (page 339) Prove the analogue of relation (2),
$$\lim_{j\to\infty}|A^j|^{1/j} = r(A),$$
when $A$ is a linear mapping of any finite-dimensional normed linear space $X$ (see Chapters 14 and 15).

Proof. On examining the proof for Euclidean space, we see that the inner product is never really used; all that is exploited is the norm. So the proof for any finite-dimensional normed linear space is entirely identical to that for a finite-dimensional Euclidean space.
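A numeric illustration of the limit (not from the book), here with the operator norm induced by $|\cdot|_2$:

```python
import numpy as np

A = np.array([[0.5, 1.0],
              [0.0, 0.6]])
r = np.max(np.abs(np.linalg.eigvals(A)))   # spectral radius, 0.6

for j in (1, 5, 20, 100):
    print(j, np.linalg.norm(np.linalg.matrix_power(A, j), 2) ** (1.0 / j))
print("r(A) =", r)  # the printed values decrease toward r(A)
```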

I 4. (page 339) Show that the two definitions are equivalent.

Proof. It suffices to note that a sequence $(A_n)_{n=1}^\infty$ converges to $A$ in matrix norm if and only if each entry sequence $((A_n)_{ij})_{n=1}^\infty$ converges to $A_{ij}$ (Exercise 16 of Chapter 7).

I 5. (page 339) Let $A(z)$ be an analytic matrix function in a domain $G$, invertible at every point of $G$. Show that then $A^{-1}(z)$, too, is an analytic matrix function in $G$.

Proof. By formula (16) of Chapter 5, $D(a_1, \dots, a_n) = \sum\sigma(p)a_{p_11}\cdots a_{p_nn}$, the determinant of any analytic matrix (i.e., matrix-valued analytic function) is analytic. By Cramer's rule and $\det A(z) \ne 0$ in $G$, we conclude $A^{-1}(z)$ is also analytic in $G$.

I 6. (page 339) Show that the Cauchy integral theorem holds for matrix-valued functions.

Proof. By Exercise 4, the Cauchy integral theorem for matrix-valued functions reduces to the Cauchy integral theorem for each entry of an analytic matrix.

A.11 The Lorentz Group

Skipped for this version.

A.12 Compactness of the Unit Ball

I 1. (page 354) (i) Show that a set of functions whose first derivatives are uniformly bounded in $G$ is equicontinuous in $G$.
(ii) Use (i) and the Arzela-Ascoli theorem to prove Theorem 3.

(i)

Proof. We use the notation of Theorem 3. For simplicity, we assume $G$ is convex, so that for any $x, y \in G$ the segment $\{z : z = (1-t)x + ty\} \subset G$. Then by the Mean Value Theorem, there exists $c \in (0,1)$ such that
$$|f(x) - f(y)| = |\nabla f((1-c)x + cy)\cdot(y - x)| \le dm|x - y|,$$
where $d$ is the dimension of the Euclidean space in which $G$ resides and $m$ is the uniform bound on the first derivatives. This shows the elements of $D$ are equicontinuous in $G$.

(ii)

Proof. From (i), we know each element of $D$ is uniformly continuous. So they can be extended to $\bar G$, the closure of $G$. Then Theorem 3 is the result of the following version of the Arzela-Ascoli Theorem (see, for example, Yosida [14]): Let $S$ be a compact metric space, and $C(S)$ the Banach space of (real- or) complex-valued continuous functions $x(s)$ normed by $\|x\| = \sup_{s\in S}|x(s)|$. Then a sequence $\{x_n(s)\} \subset C(S)$ is relatively compact in $C(S)$ if the following two conditions are satisfied: (a) $\{x_n(s)\}$ is uniformly bounded; (b) $\{x_n(s)\}$ is equicontinuous.

A.13 A Characterization of Commutators

There are no exercise problems for this section.


A.14 Liapunov's Theorem

I 1. (page 360) Show that the sums (14) tend to a limit as the size of the subintervals $\Delta_j$ tends to zero. (Hint: imitate the proof for the scalar case.)

Proof. This is basically about how to extend the Riemann integral to Banach-space-valued functions. The theory is essentially the same as in the scalar case: just replace the Euclidean norm with an arbitrary norm. So we omit the details.

I 2. (page 360) Show that the two definitions are equivalent.

Proof. It suffices to note $A_n \to A$ in matrix norm if and only if each entry of $A_n$ converges to the corresponding entry of $A$ (see Exercise 7 and formula (51) of Chapter 7).

I 3. (page 360) Show, using Lemma 4, that the integral (12),
$$\lim_{T\to\infty}\int_0^Te^{W^*t}e^{Wt}\,dt,$$
exists.

Proof. For $T' > T$, by Definition 2,
$$\left\|\int_T^{T'}e^{W^*t}e^{Wt}\,dt\right\| \le \int_T^{T'}\left\|e^{W^*t}\right\|\left\|e^{Wt}\right\|dt.$$
By Lemma 4, $\lim_{T,T'\to\infty}\int_T^{T'}\|e^{W^*t}\|\|e^{Wt}\|\,dt = 0$. So by Cauchy's criterion, we conclude the integral (12) exists.

A.15 The Jordan Canonical Form

There are no exercise problems for this section.

A.16 Numerical Range

I 1. (page 367) Show that for $A$ normal, equality holds in (2).

Proof. By Theorem 8 of Chapter 8, we can find an orthonormal basis consisting of eigenvectors of $A$. Let $a_1, \dots, a_n$ be the eigenvalues of $A$ (multiplicity counted), with $v_1, \dots, v_n$ the corresponding eigenvectors. For any $x \in X$ with $\|x\| = 1$, we can find $\theta_1(x), \dots, \theta_n(x) \in \mathbb{C}$ such that $x = \sum_{i=1}^n\theta_i(x)v_i$ and $\sum_i|\theta_i(x)|^2 = 1$. Then
$$|(Ax, x)| = \Big|\sum_{i,j}\theta_i(x)\overline{\theta_j(x)}(Av_i, v_j)\Big| = \Big|\sum_{i,j}\theta_i(x)\overline{\theta_j(x)}(a_iv_i, v_j)\Big| = \Big|\sum_{i=1}^na_i|\theta_i(x)|^2\Big| \le \max_{1\le i\le n}|a_i| = r(A).$$
Combined with (2), we conclude $r(A) = w(A)$.

I 2. (page 367) Show that for $A$ normal,
$$w(A) = \|A\|.$$

Proof. By definition $\|A\| = \sup_{\|x\|=1}\|Ax\|$. Using the notation of the solution to Exercise 1, we have
$$Ax = \sum_i\theta_i(x)Av_i = \sum_i\theta_i(x)a_iv_i.$$
So $\|Ax\| = \sqrt{\sum_{i=1}^n|a_i|^2|\theta_i(x)|^2} \le r(A) = w(A)$, where the last equality comes from Exercise 1. This implies $\|A\| \le w(A)$. By Theorem 13(ii) of Chapter 7, $w(A) \le \|A\|$. Combined, we conclude $w(A) = \|A\|$.

I 3. (page 369) Verify (7) and (8).


Proof. To verify (7), note first that since $r_k^n = 1$, the conjugates $\bar r_1, \dots, \bar r_n$ are a permutation of $r_1, \dots, r_n$, and $1 - r_kz = r_k(\bar r_k - z)$. Hence
$$\prod_k(1 - r_kz) = \prod_kr_k\cdot\prod_k(\bar r_k - z) = \prod_k(r_k - z)\cdot e^{\frac{2\pi i}{n}\sum_{k=1}^nk} = \prod_k(r_k - z)\cdot e^{(n+1)\pi i} = (-1)^{n+1}\prod_k(r_k - z).$$
Also,
$$\prod_k(r_k - z) = (-1)^n(z^n - 1).$$
Combined, we conclude $1 - z^n = \prod_k(1 - r_kz)$.

To verify (8), we use (7) to get
$$\frac1n\sum_j\prod_{k\ne j}(1 - r_kz) = \frac1n\sum_j\frac{1 - z^n}{1 - r_jz}.$$
Now $\sum_j\frac{1}{1 - r_jz}$ is a rational function on the complex plane $\mathbb{C}$, which can be written in the form $\frac{P(z)}{Q(z)}$ with $P(z)$ and $Q(z)$ polynomials without common factors. Since the $n$-th roots of unity are simple poles of $\sum_j\frac{1}{1 - r_jz}$, we conclude $Q(z) = \prod_k(1 - r_kz) = 1 - z^n$, up to a constant factor. Since $\sum_j\frac{1}{1 - r_jz}$ has no zeros in the complex plane, $P(z)$ must be a constant. Combined, we conclude
$$\sum_j\frac{1}{1 - r_jz} = \frac{C}{1 - z^n}$$
for some constant $C$. Letting $z \to 0$, we see $C = n$. This finishes the verification of (8).

I 4. (page 370) Determine the numerical range of $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and of $A^2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$.

Solution. If $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$, then $(Ax, x) = x_1^2 + x_2^2 + x_1x_2$. If $x_1^2 + x_2^2 = 1$, we have $(Ax, x) = 1 + x_1\operatorname{sign}(x_2)\sqrt{1 - x_1^2}$. Calculus shows $f(\xi) = \xi\sqrt{1 - \xi^2}$ ($-1 \le \xi \le 1$) achieves its maximum $\frac12$ at $\xi_0 = \frac{\sqrt2}{2}$. So $w(A) = 1 + \frac12 = \frac32$. Similarly, a plain calculation shows $w(A^2) = 2$.
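The values $w(A) = 3/2$ and $w(A^2) = 2$ can also be estimated numerically by sampling complex unit vectors (a Monte Carlo sketch, not part of the book):

```python
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])

def numerical_radius(M, samples=200000):
    x = rng.normal(size=(samples, 2)) + 1j * rng.normal(size=(samples, 2))
    x /= np.linalg.norm(x, axis=1, keepdims=True)    # unit vectors
    return np.max(np.abs(np.einsum('si,ij,sj->s', np.conj(x), M, x)))

print(numerical_radius(A), numerical_radius(A @ A))  # approx 1.5 and 2.0
```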

References

[1] M. Engeli, T. Ginsburg, H. Rutishauser and E. Stiefel. Refined iterative methods for computation of the solution and the eigenvalues of self-adjoint boundary value problems, Birkhäuser Verlag, Basel/Stuttgart, 1959.

[2] Gene H. Golub and Charles F. van Loan. Matrix computations, 3rd Edition. Johns Hopkins University Press, 1996.

[3] Gong Sheng and Gong Youhong. Concise complex analysis, Revised Edition, World Scientific, 2007.

[4] William H. Greene. Econometric analysis, 7th Edition, Prentice Hall, 2011.

[5] James P. Keener. Principles of applied mathematics: Transformation and approximation, Revised Edition. Westview Press, 2000.

[6] 蓝以中:《高等代数简明教程(上册)》。北京大学出版社,北京,2002。[Lan Yi-Zhong. A concise course of advanced algebra (in Chinese), Volume 1, Peking University Press, Beijing, 2002.]

[7] P. Lax. Functional analysis, Wiley-Interscience, 2002.

[8] P. Lax. Linear algebra and its applications, 2nd Edition, Wiley-Interscience, 2007.

[9] Steven Huss-Lederman, Elaine M. Jacobson, Anna Tsao, Thomas Turnbull, and Jeremy R. Johnson. Implementation of Strassen's algorithm for matrix multiplication. Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, Pittsburgh, Pennsylvania, United States, 1996.

[10] J. Munkres. Analysis on manifolds, Westview Press, 1997.

[11] M. Papadrakakis. A family of methods with three-term recursion formulae. International Journal for Numerical Methods in Engineering, Vol. 18, 1785-1799 (1982).

[12] 丘维声:《高等代数(上册)》,高等教育出版社,北京,1996。[Qiu Wei-Sheng. Advanced algebra (in Chinese), Volume 1, Higher Education Press, Beijing, 1996.]

[13] Michael Spivak. Calculus on manifolds: A modern approach to classical theorems of advanced calculus. Perseus Books Publishing, 1965.

[14] Kosaku Yosida. Functional analysis, 6th Edition. Springer, 1996.
