Jordan Canonical Form Applications [Steven H. Weintraub]

Jordan Canonical Form: Application to Differential Equations

Description

Systems of linear differential equations


  • Jordan Canonical Form: Application to Differential Equations

  • Copyright 2008 by Morgan & Claypool

    All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.

    Jordan Canonical Form: Application to Differential Equations

    Steven H.Weintraub

    www.morganclaypool.com

    ISBN: 9781598298048 paperback
    ISBN: 9781598298055 ebook

    DOI 10.2200/S00146ED1V01Y200808MAS002

    A Publication in the Morgan & Claypool Publishers series
    SYNTHESIS LECTURES ON MATHEMATICS AND STATISTICS

    Lecture #2
    Series Editor: Steven G. Krantz, Washington University, St. Louis

    Series ISSN
    Synthesis Lectures on Mathematics and Statistics
    ISSN pending.

  • Jordan Canonical Form: Application to Differential Equations

    Steven H. Weintraub
    Lehigh University

    SYNTHESIS LECTURES ON MATHEMATICS AND STATISTICS #2

    MORGAN & CLAYPOOL PUBLISHERS

  • ABSTRACT
    Jordan Canonical Form (JCF) is one of the most important, and useful, concepts in linear algebra. In this book we develop JCF and show how to apply it to solving systems of differential equations. We first develop JCF, including the concepts involved in it: eigenvalues, eigenvectors, and chains of generalized eigenvectors. We begin with the diagonalizable case and then proceed to the general case, but we do not present a complete proof. Indeed, our interest here is not in JCF per se, but in one of its important applications. We devote the bulk of our attention in this book to showing how to apply JCF to solve systems of constant-coefficient first-order differential equations, where it is a very effective tool. We cover all situations: homogeneous and inhomogeneous systems; real and complex eigenvalues. We also treat the closely related topic of the matrix exponential. Our discussion is mostly confined to the 2-by-2 and 3-by-3 cases, and we present a wealth of examples that illustrate all the possibilities in these cases (and of course, a wealth of exercises for the reader).

    KEYWORDS
    Jordan Canonical Form, linear algebra, differential equations, eigenvalues, eigenvectors, generalized eigenvectors, matrix exponential

  • Contents

    Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .v

    Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii

    1 Jordan Canonical Form . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
    1.1 The Diagonalizable Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

    1.2 The General Case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    2 Solving Systems of Linear Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
    2.1 Homogeneous Systems with Constant Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

    2.2 Homogeneous Systems with Constant Coefficients: Complex Roots . . . . . . . . . . . . . . 40

    2.3 Inhomogeneous Systems with Constant Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

    2.4 The Matrix Exponential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

    A Background Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
    A.1 Bases, Coordinates, and Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

    A.2 Properties of the Complex Exponential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

    B Answers to Odd-Numbered Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

    Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .85


  • Preface

    Jordan Canonical Form (JCF) is one of the most important, and useful, concepts in linear algebra. In this book, we develop JCF and show how to apply it to solving systems of differential equations.

    In Chapter 1, we develop JCF. We do not prove the existence of JCF in general, but we present the ideas that go into it: eigenvalues and (chains of generalized) eigenvectors. In Section 1.1, we treat the diagonalizable case, and in Section 1.2, we treat the general case. We develop all possibilities for 2-by-2 and 3-by-3 matrices, and illustrate these by examples.

    In Chapter 2, we apply JCF. We show how to use JCF to solve systems Y' = AY + G(x) of constant-coefficient first-order linear differential equations. In Section 2.1, we consider homogeneous systems Y' = AY. In Section 2.2, we consider homogeneous systems when the characteristic polynomial of A has complex roots (in which case an additional step is necessary). In Section 2.3, we consider inhomogeneous systems Y' = AY + G(x) with G(x) nonzero. In Section 2.4, we develop the matrix exponential e^{Ax} and relate it to solutions of these systems. Also in this chapter we provide examples that illustrate all the possibilities in the 2-by-2 and 3-by-3 cases.

    Appendix A has background material. Section A.1 gives background on coordinates for vectors and matrices for linear transformations. Section A.2 derives the basic properties of the complex exponential function. This material is relegated to the Appendix so that readers who are unfamiliar with these notions, or who are willing to take them on faith, can skip it and still understand the material in Chapters 1 and 2.

    Our numbering system for results is fairly standard: Theorem 2.1, for example, is the first Theorem found in Section 2 of Chapter 1.

    As is customary in textbooks, we provide the answers to the odd-numbered exercises here. Instructors may contact me at [email protected] and I will supply the answers to all of the exercises.

    Steven H. Weintraub
    Lehigh University
    Bethlehem, PA USA
    July 2008


  • C H A P T E R 1

    Jordan Canonical Form

    1.1 THE DIAGONALIZABLE CASE

    Although, for simplicity, most of our examples will be over the real numbers (and indeed over the rational numbers), we will consider that all of our vectors and matrices are defined over the complex numbers C. It is only with this assumption that the theory of Jordan Canonical Form (JCF) works completely. See Remark 1.9 for the key reason why.

    Definition 1.1. If v ≠ 0 is a vector such that, for some λ,

    Av = λv,

    then v is an eigenvector of A associated to the eigenvalue λ.

    Example 1.2. Let A be the matrix A = [[5, -7], [2, -4]]. Then, as you can check, if v1 = [7, 2]^T, then Av1 = 3v1, so v1 is an eigenvector of A with associated eigenvalue 3, and if v2 = [1, 1]^T, then Av2 = -2v2, so v2 is an eigenvector of A with associated eigenvalue -2.

    We note that the definition of an eigenvalue/eigenvector can be expressed in an alternate form.

    Here I denotes the identity matrix:

    Av = λv
    Av = λIv
    (A - λI)v = 0.

    For an eigenvalue λ of A, we let E_λ denote the eigenspace of λ,

    E_λ = {v | Av = λv} = {v | (A - λI)v = 0} = Ker(A - λI).

    (The kernel Ker(A - λI) is also known as the nullspace NS(A - λI).) We also note that this alternate formulation helps us find eigenvalues and eigenvectors. For if (A - λI)v = 0 for a nonzero vector v, the matrix A - λI must be singular, and hence its determinant must be 0. This leads us to the following definition.

    Definition 1.3. The characteristic polynomial of a matrix A is the polynomial det(λI - A).

    Remark 1.4. This is the customary definition of the characteristic polynomial. But note that, if A is an n-by-n matrix, then the matrix λI - A is obtained from the matrix A - λI by multiplying each of its n rows by -1, and hence det(λI - A) = (-1)^n det(A - λI). In practice, it is most convenient to work with A - λI in finding eigenvectors (this minimizes arithmetic), and when we come to find chains of generalized eigenvectors in Section 1.2, it is (almost) essential to use A - λI, as using λI - A would introduce lots of spurious minus signs.

    Example 1.5. Returning to the matrix A = [[5, -7], [2, -4]] of Example 1.2, we compute that det(λI - A) = λ^2 - λ - 6 = (λ - 3)(λ + 2), so A has eigenvalues 3 and -2. Computation then shows that the eigenspace E_3 = Ker(A - 3I) has basis {[7, 2]^T}, and that the eigenspace E_{-2} = Ker(A - (-2)I) has basis {[1, 1]^T}.
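    The following is a small numerical check of this kind of computation (my own sketch, not part of the book; it assumes the entries of A as read above):

```python
import numpy as np

# The matrix of Examples 1.2 and 1.5, as reconstructed above.
A = np.array([[5.0, -7.0],
              [2.0, -4.0]])

# NumPy returns the eigenvalues and (normalized) eigenvectors as columns of V.
eigvals, V = np.linalg.eig(A)
print(eigvals)                          # approximately [ 3. -2.]

# Check Av = lambda * v for each computed eigenpair.
for lam, v in zip(eigvals, V.T):
    print(np.allclose(A @ v, lam * v))  # True, True
```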

    We now introduce two important quantities associated to an eigenvalue of a matrix A.

    Definition 1.6. Let a be an eigenvalue of a matrix A. The algebraic multiplicity of the eigenvalue a is alg-mult(a) = the multiplicity of a as a root of the characteristic polynomial det(λI - A). The geometric multiplicity of the eigenvalue a is geom-mult(a) = the dimension of the eigenspace E_a.

    It is common practice to use the word multiplicity (without a qualifier) to mean algebraic multiplicity.

    We have the following relationship between these two multiplicities.

    Lemma 1.7. Let a be an eigenvalue of a matrix A. Then

    1 ≤ geom-mult(a) ≤ alg-mult(a).

    Proof. By the definition of an eigenvalue, there is at least one eigenvector v with eigenvalue a, and so E_a contains the nonzero vector v, and hence dim(E_a) ≥ 1.

    For the proof that geom-mult(a) ≤ alg-mult(a), see Lemma 1.12 in Appendix A.

    Corollary 1.8. Let a be an eigenvalue of A and suppose that a has algebraic multiplicity 1. Then a alsohas geometric multiplicity 1.

    Proof. In this case, applying Lemma 1.7, we have

    1 ≤ geom-mult(a) ≤ alg-mult(a) = 1, so geom-mult(a) = 1.

    Remark 1.9. Let A be an n-by-n matrix. Then its characteristic polynomial det(λI - A) has degree n. Since we are considering A to be defined over the complex numbers, we may apply the Fundamental Theorem of Algebra, which states that an nth degree polynomial has n roots, counting multiplicities. Hence, we see that, for any n-by-n matrix A, the sum of the algebraic multiplicities of the eigenvalues of A is equal to n.

    Lemma 1.10. Let A be an n-by-n matrix. The following are equivalent:
    (1) For each eigenvalue a of A, geom-mult(a) = alg-mult(a).
    (2) The sum of the geometric multiplicities of the eigenvalues of A is equal to n.

    Proof. Let A have eigenvalues a1, a2, . . . , am. For each i between 1 and m, let s_i = geom-mult(a_i) and t_i = alg-mult(a_i). Then, by Lemma 1.7, s_i ≤ t_i for each i, and by Remark 1.9, Σ_{i=1}^m t_i = n. Thus, if s_i = t_i for each i, then Σ_{i=1}^m s_i = n, while if s_i < t_i for some i, then Σ_{i=1}^m s_i < n.

    Proposition 1.11. (1) Let a1, a2, . . . , am be distinct eigenvalues of A (i.e., a_i ≠ a_j for i ≠ j). For each i between 1 and m, let v_i be an associated eigenvector. Then {v1, v2, . . . , vm} is a linearly independent set of vectors.
    (2) More generally, let a1, a2, . . . , am be distinct eigenvalues of A. For each i between 1 and m, let S_i be a linearly independent set of eigenvectors associated to a_i. Then S = S1 ∪ . . . ∪ Sm is a linearly independent set of vectors.

    Proof. (1) Suppose we have a linear combination 0 = c1v1 + c2v2 + . . . + cmvm. We need to show that c_i = 0 for each i. To do this, we begin with an observation: If v is an eigenvector of A associated to the eigenvalue a, and b is any scalar, then (A - bI)v = Av - bv = av - bv = (a - b)v. (Note that this answer is 0 if a = b and nonzero if a ≠ b.)

    We now go to work, multiplying our original relation by (A - a_m I). Of course, (A - a_m I)0 = 0, so:

    0 = (A - a_m I)(c1v1 + c2v2 + . . . + c_{m-2}v_{m-2} + c_{m-1}v_{m-1} + c_m v_m)
      = c1(A - a_m I)v1 + c2(A - a_m I)v2 + . . . + c_{m-2}(A - a_m I)v_{m-2} + c_{m-1}(A - a_m I)v_{m-1} + c_m(A - a_m I)v_m
      = c1(a1 - a_m)v1 + c2(a2 - a_m)v2 + . . . + c_{m-2}(a_{m-2} - a_m)v_{m-2} + c_{m-1}(a_{m-1} - a_m)v_{m-1}.

    We now multiply this relation by (A - a_{m-1} I). Again, (A - a_{m-1} I)0 = 0, so:

    0 = (A - a_{m-1} I)(c1(a1 - a_m)v1 + c2(a2 - a_m)v2 + . . . + c_{m-2}(a_{m-2} - a_m)v_{m-2} + c_{m-1}(a_{m-1} - a_m)v_{m-1})
      = c1(a1 - a_m)(A - a_{m-1} I)v1 + c2(a2 - a_m)(A - a_{m-1} I)v2 + . . . + c_{m-2}(a_{m-2} - a_m)(A - a_{m-1} I)v_{m-2} + c_{m-1}(a_{m-1} - a_m)(A - a_{m-1} I)v_{m-1}
      = c1(a1 - a_m)(a1 - a_{m-1})v1 + c2(a2 - a_m)(a2 - a_{m-1})v2 + . . . + c_{m-2}(a_{m-2} - a_m)(a_{m-2} - a_{m-1})v_{m-2}.

    Proceed in this way, until at the last step we multiply by (A - a2 I). We then obtain:

    0 = c1(a1 - a2) · · · (a1 - a_{m-1})(a1 - a_m)v1.

    But v1 ≠ 0, as by definition an eigenvector is nonzero. Also, the product (a1 - a2) · · · (a1 - a_{m-1})(a1 - a_m) is a product of nonzero numbers and is hence nonzero. Thus, we must have c1 = 0.

    Proceeding in the same way, multiplying our original relation by (A - a_m I), (A - a_{m-1} I), . . . , (A - a3 I), and finally by (A - a1 I), we obtain c2 = 0, and, proceeding in this vein, we obtain c_i = 0 for all i, and so the set {v1, v2, . . . , vm} is linearly independent.

    (2) To avoid complicated notation, we will simply prove this when m = 2 (which illustrates the general case). Thus, let m = 2, let S1 = {v_{1,1}, . . . , v_{1,i_1}} be a linearly independent set of eigenvectors associated to the eigenvalue a1 of A, and let S2 = {v_{2,1}, . . . , v_{2,i_2}} be a linearly independent set of eigenvectors associated to the eigenvalue a2 of A. Then S = {v_{1,1}, . . . , v_{1,i_1}, v_{2,1}, . . . , v_{2,i_2}}. We want to show that S is a linearly independent set. Suppose we have a linear combination 0 = c_{1,1}v_{1,1} + . . . + c_{1,i_1}v_{1,i_1} + c_{2,1}v_{2,1} + . . . + c_{2,i_2}v_{2,i_2}. Then:

    0 = c_{1,1}v_{1,1} + . . . + c_{1,i_1}v_{1,i_1} + c_{2,1}v_{2,1} + . . . + c_{2,i_2}v_{2,i_2}
      = (c_{1,1}v_{1,1} + . . . + c_{1,i_1}v_{1,i_1}) + (c_{2,1}v_{2,1} + . . . + c_{2,i_2}v_{2,i_2})
      = v1 + v2

    where v1 = c_{1,1}v_{1,1} + . . . + c_{1,i_1}v_{1,i_1} and v2 = c_{2,1}v_{2,1} + . . . + c_{2,i_2}v_{2,i_2}. But v1 is a vector in E_{a1}, so Av1 = a1v1; similarly, v2 is a vector in E_{a2}, so Av2 = a2v2. Then, as in the proof of part (1),

    0 = (A - a2 I)0 = (A - a2 I)(v1 + v2) = (A - a2 I)v1 + (A - a2 I)v2 = (a1 - a2)v1 + 0 = (a1 - a2)v1,

    so 0 = v1; similarly, 0 = v2. But 0 = v1 = c_{1,1}v_{1,1} + . . . + c_{1,i_1}v_{1,i_1} implies c_{1,1} = . . . = c_{1,i_1} = 0, as, by hypothesis, {v_{1,1}, . . . , v_{1,i_1}} is a linearly independent set; similarly, 0 = v2 implies c_{2,1} = . . . = c_{2,i_2} = 0. Thus, c_{1,1} = . . . = c_{1,i_1} = c_{2,1} = . . . = c_{2,i_2} = 0 and S is linearly independent, as claimed.

    Definition 1.12. Two square matrices A and B are similar if there is an invertible matrix P with A = PBP^{-1}.

    Definition 1.13. A square matrix A is diagonalizable if A is similar to a diagonal matrix.

    Here is the main result of this section.

    Theorem 1.14. Let A be an n-by-n matrix over the complex numbers. Then A is diagonalizable if and only if, for each eigenvalue a of A, geom-mult(a) = alg-mult(a). In that case, A = PJP^{-1} where J is a diagonal matrix whose entries are the eigenvalues of A, each appearing according to its algebraic multiplicity, and P is a matrix whose columns are eigenvectors forming bases for the associated eigenspaces.

    Proof. We give a proof by direct computation here. For a more conceptual proof, see Theorem 1.10 in Appendix A.

    First let us suppose that for each eigenvalue a of A, geom-mult(a) = alg-mult(a).

    Let A have eigenvalues a1, a2, . . . , an. Here we do not insist that the a_i's are distinct; rather, each eigenvalue appears the same number of times as its algebraic multiplicity. Then J is the diagonal matrix

    J = [j1 j2 . . . jn]

    and we see that j_i, the ith column of J, is the vector

    j_i = [0, . . . , 0, a_i, 0, . . . , 0]^T,

    with a_i in the ith position, and 0 elsewhere. We have

    P = [v1 v2 . . . vn],

    a matrix whose columns are eigenvectors forming bases for the associated eigenspaces. By hypothesis, geom-mult(a) = alg-mult(a) for each eigenvalue a of A, so there are as many columns of P that are eigenvectors for the eigenvalue a as there are diagonal entries of J that are equal to a. Furthermore, by Lemma 1.10, the matrix P indeed has n columns.

    We first show by direct computation that AP = PJ. Now

    AP = A[v1 v2 . . . vn]

    so the ith column of AP is Av_i. But

    Av_i = a_i v_i

    as v_i is an eigenvector of A with associated eigenvalue a_i. On the other hand,

    PJ = [v1 v2 . . . vn] J

    and the ith column of PJ is P j_i,

    P j_i = [v1 v2 . . . vn] j_i.

    Remembering what the vector j_i is, and multiplying, we see that

    P j_i = a_i v_i

    as well. Thus, every column of AP is equal to the corresponding column of PJ, so

    AP = PJ.

    By Proposition 1.11, the columns of the square matrix P are linearly independent, so P is invertible. Multiplying on the right by P^{-1}, we see that

    A = PJP^{-1},

    completing the proof of this half of the Theorem.

    Now let us suppose that A is diagonalizable, A = PJP^{-1}. Then AP = PJ. We use the same notation for P and J as in the first half of the proof. Then, as in the first half of the proof, we compute AP and PJ column-by-column, and we see that the ith column of AP is Av_i and that the ith column of PJ is a_i v_i, for each i. Hence, Av_i = a_i v_i for each i, and so v_i is an eigenvector of A with associated eigenvalue a_i.

    For each eigenvalue a of A, there are as many columns of P that are eigenvectors for a as there are diagonal entries of J that are equal to a, and these vectors form a basis for the eigenspace associated to the eigenvalue a, so we see that for each eigenvalue a of A, geom-mult(a) = alg-mult(a), completing the proof.

    For a general matrix A, the condition in Theorem 1.14 may or may not be satisfied, i.e., some but not all matrices are diagonalizable. But there is one important case when this condition is automatic.

    Corollary 1.15. Let A be an n-by-n matrix over the complex numbers all of whose eigenvalues are distinct (i.e., whose characteristic polynomial has no repeated roots). Then A is diagonalizable.

  • 1.2. THEGENERALCASE 7

    Proof. By hypothesis, for each eigenvalue a of A, alg-mult(a) = 1. But then, by Corollary 1.8, foreach eigenvalue a ofA, geom-mult(a) = alg-mult(a), so the hypothesis ofTheorem 1.14 is satised.

    Example 1.16. Let A be the matrix A = [[5, -7], [2, -4]] of Examples 1.2 and 1.5. Then, referring to Example 1.5, we see

    [[5, -7], [2, -4]] = [[7, 1], [2, 1]] [[3, 0], [0, -2]] [[7, 1], [2, 1]]^{-1}.
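    As a sketch (again my own, not from the book), the diagonalization of Example 1.16 can be verified numerically:

```python
import numpy as np

# Matrices of Example 1.16 as reconstructed above.
A = np.array([[5.0, -7.0], [2.0, -4.0]])
P = np.array([[7.0, 1.0], [2.0, 1.0]])
J = np.diag([3.0, -2.0])

# A = P J P^{-1}
print(np.allclose(P @ J @ np.linalg.inv(P), A))   # True
```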

    As we have indicated, we have developed this theory over the complex numbers, as JCF works best over them. But there is an analog of our results over the real numbers; we just have to require that all the eigenvalues of A are real. Here is the basic result on diagonalizability.

    Theorem 1.17. Let A be an n-by-n real matrix. Then A is diagonalizable if and only if all the eigenvalues of A are real numbers, and, for each eigenvalue a of A, geom-mult(a) = alg-mult(a). In that case, A = PJP^{-1} where J is a diagonal matrix whose entries are the eigenvalues of A, each appearing according to its algebraic multiplicity (and hence is a real matrix), and P is a real matrix whose columns are eigenvectors forming bases for the associated eigenspaces.

    1.2 THE GENERAL CASE

    Let us begin this section by describing what a matrix in JCF looks like. A matrix in JCF is composed of Jordan blocks, so we first see what a single Jordan block looks like.

    Definition 2.1. A k-by-k Jordan block associated to the eigenvalue λ is a k-by-k matrix of the form

    J = [ λ  1                ]
        [    λ  1             ]
        [       .  .          ]
        [          .  .       ]
        [             λ  1    ]
        [                λ    ].


    In other words, a Jordan block is a matrix with all the diagonal entries equal to each other, allthe entries immediately above the diagonal equal to 1, and all the other entries equal to 0.

    Definition 2.2. A matrix J in Jordan Canonical Form (JCF) is a block diagonal matrix

    J = [ J1                ]
        [     J2            ]
        [         J3        ]
        [             .     ]
        [               Jℓ  ]

    with each J_i a Jordan block.

    Remark 2.3. Note that every diagonal matrix is a matrix in JCF, with each Jordan block a 1-by-1 block.

    In order to understand and be able to use JCF, we must introduce a new concept, that of a generalized eigenvector.

    Definition 2.4. If v ≠ 0 is a vector such that, for some λ,

    (A - λI)^k (v) = 0

    for some positive integer k, then v is a generalized eigenvector of A associated to the eigenvalue λ. The smallest k with (A - λI)^k (v) = 0 is the index of the generalized eigenvector v.

    Let us note that if v is a generalized eigenvector of index 1, then

    (A - λI)(v) = 0
    (A)v = (λI)v
    Av = λv

    and so v is an (ordinary) eigenvector. Recall that, for an eigenvalue λ of A, E_λ denotes the eigenspace of λ,

    E_λ = {v | Av = λv} = {v | (A - λI)v = 0}.

    We let Ẽ_λ denote the generalized eigenspace of λ,

    Ẽ_λ = {v | (A - λI)^k (v) = 0 for some k}.

    It is easy to check that Ẽ_λ is a subspace.

    Since every eigenvector is a generalized eigenvector, we see that

    E_λ ⊆ Ẽ_λ.

    The following result (which we shall not prove) is an important fact about generalized eigenspaces.

    Proposition 2.5. Let λ be an eigenvalue of the n-by-n matrix A of algebraic multiplicity m. Then, Ẽ_λ is a subspace of C^n of dimension m.

    Example 2.6. Let A be the matrix A = [[0, 1], [-4, 4]]. Then, as you can check, if u = [1, 2]^T, then (A - 2I)u = 0, so u is an eigenvector of A with associated eigenvalue 2 (and hence a generalized eigenvector of index 1 of A with associated eigenvalue 2). On the other hand, if v = [1, 0]^T, then (A - 2I)^2 v = 0 but (A - 2I)v ≠ 0, so v is a generalized eigenvector of index 2 of A with associated eigenvalue 2.

    In this case, as you can check, the vector u is a basis for the eigenspace E_2, so E_2 = {cu | c ∈ C} is one dimensional.

    On the other hand, u and v are both generalized eigenvectors associated to the eigenvalue 2, and are linearly independent (the equation c1u + c2v = 0 only has the solution c1 = c2 = 0, as you can readily check), so Ẽ_2 has dimension at least 2. Since Ẽ_2 is a subspace of C^2, it must have dimension exactly 2, and Ẽ_2 = C^2 (and {u, v} is indeed a basis for C^2).

    Let us next consider a generalized eigenvector v_k of index k associated to an eigenvalue λ, and set

    v_{k-1} = (A - λI)v_k.

    We claim that v_{k-1} is a generalized eigenvector of index k - 1 associated to the eigenvalue λ. To see this, note that

    (A - λI)^{k-1} v_{k-1} = (A - λI)^{k-1} (A - λI)v_k = (A - λI)^k v_k = 0

    but

    (A - λI)^{k-2} v_{k-1} = (A - λI)^{k-2} (A - λI)v_k = (A - λI)^{k-1} v_k ≠ 0.

    Proceeding in this way, we may set

    v_{k-2} = (A - λI)v_{k-1} = (A - λI)^2 v_k
    v_{k-3} = (A - λI)v_{k-2} = (A - λI)^2 v_{k-1} = (A - λI)^3 v_k
    ...
    v_1 = (A - λI)v_2 = · · · = (A - λI)^{k-1} v_k

    and note that each v_i is a generalized eigenvector of index i associated to the eigenvalue λ. A collection of generalized eigenvectors obtained in this way gets a special name:

    Definition 2.7. If {v1, . . . , vk} is a set of generalized eigenvectors associated to the eigenvalue λ of A, such that v_k is a generalized eigenvector of index k and also

    v_{k-1} = (A - λI)v_k,  v_{k-2} = (A - λI)v_{k-1},  v_{k-3} = (A - λI)v_{k-2},  . . . ,  v_2 = (A - λI)v_3,  v_1 = (A - λI)v_2,

    then {v1, . . . , vk} is called a chain of generalized eigenvectors of length k. The vector v_k is called the top of the chain and the vector v_1 (which is an ordinary eigenvector) is called the bottom of the chain.

    Example 2.8. Let us return to Example 2.6. We saw there that v = [1, 0]^T is a generalized eigenvector of index 2 of A = [[0, 1], [-4, 4]] associated to the eigenvalue 2. Let us set v2 = v = [1, 0]^T. Then v1 = (A - 2I)v2 = [-2, -4]^T is a generalized eigenvector of index 1 (i.e., an ordinary eigenvector), and {v1, v2} is a chain of length 2.
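    A short sketch (mine, not the book's) showing how the chain of Example 2.8 can be produced by repeatedly applying A - 2I:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-4.0, 4.0]])
N = A - 2*np.eye(2)          # A - 2I

v2 = np.array([1.0, 0.0])    # generalized eigenvector of index 2 (top of the chain)
v1 = N @ v2                  # bottom of the chain: an ordinary eigenvector
print(v1)                                                   # [-2. -4.]
print(np.allclose(N @ v1, 0))                               # True: (A - 2I) v1 = 0
print(np.allclose(np.linalg.matrix_power(N, 2) @ v2, 0))    # True: (A - 2I)^2 v2 = 0
```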

    Remark 2.9. It is important to note that a chain of generalized eigenvectors {v1, . . . , vk} is entirely determined by the vector v_k at the top of the chain. For once we have chosen v_k, there are no other choices to be made: the vector v_{k-1} is determined by the equation v_{k-1} = (A - λI)v_k; then the vector v_{k-2} is determined by the equation v_{k-2} = (A - λI)v_{k-1}, etc.

    With this concept in hand, let us return to JCF. As we have seen, a matrix J in JCF has a number of blocks J1, J2, . . . , Jℓ, called Jordan blocks, along the diagonal. Let us begin our analysis with the case when J consists of a single Jordan block. So suppose J is a k-by-k matrix

    J = [ λ  1                ]
        [    λ  1             ]
        [       .  .          ]
        [          .  .       ]
        [             λ  1    ]
        [                λ    ].

    Then,

    J - λI = [ 0  1                ]
             [    0  1             ]
             [       .  .          ]
             [          .  .       ]
             [             0  1    ]
             [                0    ].

    Let e1 = [1, 0, 0, . . . , 0]^T, e2 = [0, 1, 0, . . . , 0]^T, e3 = [0, 0, 1, . . . , 0]^T, . . . , ek = [0, 0, 0, . . . , 1]^T.

    Then direct calculation shows:

    (J - λI)e_k = e_{k-1}
    (J - λI)e_{k-1} = e_{k-2}
    ...
    (J - λI)e_2 = e_1
    (J - λI)e_1 = 0

    and so we see that {e1, . . . , ek} is a chain of generalized eigenvectors. We also note that {e1, . . . , ek} is a basis for C^k, and so

    Ẽ_λ = C^k.

    We first see that the situation is very analogous when we consider any k-by-k matrix with a single chain of generalized eigenvectors of length k.

    Proposition 2.10. Let {v1, . . . , vk} be a chain of generalized eigenvectors of length k associated to the eigenvalue λ of a matrix A. Then {v1, . . . , vk} is linearly independent.

    Proof. Suppose we have a linear combination

    c1v1 + c2v2 + · · · + c_{k-1}v_{k-1} + c_k v_k = 0.

    We must show each c_i = 0.

    By the definition of a chain, v_{k-i} = (A - λI)^i v_k for each i, so we may write this equation as

    c1(A - λI)^{k-1} v_k + c2(A - λI)^{k-2} v_k + · · · + c_{k-1}(A - λI)v_k + c_k v_k = 0.

    Now let us multiply this equation on the left by (A - λI)^{k-1}. Then we obtain the equation

    c1(A - λI)^{2k-2} v_k + c2(A - λI)^{2k-3} v_k + · · · + c_{k-1}(A - λI)^k v_k + c_k(A - λI)^{k-1} v_k = 0.

    Now (A - λI)^{k-1} v_k = v1 ≠ 0. However, (A - λI)^k v_k = 0, and then also (A - λI)^{k+1} v_k = (A - λI)(A - λI)^k v_k = (A - λI)(0) = 0, and then similarly (A - λI)^{k+2} v_k = 0, . . . , (A - λI)^{2k-2} v_k = 0, so every term except the last one is zero and this equation becomes

    c_k v1 = 0.

    Since v1 ≠ 0, this shows c_k = 0, so our linear combination is

    c1v1 + c2v2 + · · · + c_{k-1}v_{k-1} = 0.

    Repeat the same argument, this time multiplying by (A - λI)^{k-2} instead of (A - λI)^{k-1}. Then we obtain the equation

    c_{k-1}v1 = 0,

    and, since v1 ≠ 0, this shows that c_{k-1} = 0 as well. Keep going to get

    c1 = c2 = · · · = c_{k-1} = c_k = 0,

    so {v1, . . . , vk} is linearly independent.

    Theorem 2.11. Let A be a k-by-k matrix and suppose that C^k has a basis {v1, . . . , vk} consisting of a single chain of generalized eigenvectors of length k associated to an eigenvalue a. Then

    A = PJP^{-1}

    where

    J = [ a  1                ]
        [    a  1             ]
        [       a  1          ]
        [          .  .       ]
        [             a  1    ]
        [                a    ]

    is a matrix consisting of a single Jordan block and

    P = [v1 v2 . . . vk]

    is a matrix whose columns are generalized eigenvectors forming a chain.

    Proof. We give a proof by direct computation here. (Note the similarity of this proof to the proof of Theorem 1.14.) For a more conceptual proof, see Theorem 1.11 in Appendix A.

    Let P be the given matrix. We will first show by direct computation that AP = PJ. It will be convenient to write

    J = [j1 j2 . . . jk]

    and we see that j_i, the ith column of J, is the vector

    j_i = [0, . . . , 0, 1, a, 0, . . . , 0]^T

    with 1 in the (i - 1)st position, a in the ith position, and 0 elsewhere. We show that AP = PJ by showing that their corresponding columns are equal. Now

    AP = A[v1 v2 . . . vk]

    so the ith column of AP is Av_i. But

    Av_i = (A - aI + aI)v_i
         = (A - aI)v_i + aIv_i
         = v_{i-1} + a v_i for i > 1,  = a v_i for i = 1.

    On the other hand,

    PJ = [v1 v2 . . . vk] J

    and the ith column of PJ is P j_i,

    P j_i = [v1 v2 . . . vk] j_i.

    Remembering what the vector j_i is, and multiplying, we see that

    P j_i = v_{i-1} + a v_i for i > 1,  = a v_i for i = 1

    as well.


    Thus, every column of AP is equal to the corresponding column of PJ , so

    AP = PJ .

    But Proposition 2.10 shows that the columns of P are linearly independent, so P is invertible. Multiplying on the right by P^{-1}, we see that

    A = PJP^{-1}.

    Example 2.12. Applying Theorem 2.11 to the matrix A = [[0, 1], [-4, 4]] of Examples 2.6 and 2.8, we see that

    [[0, 1], [-4, 4]] = [[-2, 1], [-4, 0]] [[2, 1], [0, 2]] [[-2, 1], [-4, 0]]^{-1}.
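    For readers who want to experiment, a computer algebra system can produce a Jordan form directly. The following sketch (mine, not the book's) uses SymPy; the particular P it returns need not match the one chosen above, but it satisfies A = PJP^{-1}:

```python
from sympy import Matrix

# The matrix of Examples 2.6, 2.8, and 2.12, as reconstructed above.
A = Matrix([[0, 1],
            [-4, 4]])

P, J = A.jordan_form()         # returns P and J with A = P * J * P**(-1)
print(J)                       # Matrix([[2, 1], [0, 2]]): a single 2-by-2 Jordan block
print(P * J * P.inv() == A)    # True
```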

    Here is the key theorem to which we have been heading. This theorem is one of the most important (and useful) theorems in linear algebra.

    Theorem 2.13. Let A be any square matrix defined over the complex numbers. Then A is similar to a matrix in Jordan Canonical Form. More precisely, A = PJP^{-1}, for some matrix J in Jordan Canonical Form. The diagonal entries of J consist of eigenvalues of A, and P is an invertible matrix whose columns are chains of generalized eigenvectors of A.

    Proof. (Rough outline) In general, the JCF of a matrix A does not consist of a single block, but will have a number of blocks, of varying sizes and associated to varying eigenvalues.

    But in this situation we merely have to assemble the various blocks (to get the matrix J) and the various chains of generalized eigenvectors (to get a basis and hence the matrix P). Actually, the word "merely" is a bit misleading, as the argument that we can do so is, in fact, a subtle one, and we shall not give it here.

    In lieu of proving Theorem 2.13, we shall give a number of examples that illustrate the situation. In fact, in order to avoid complicated notation we shall merely illustrate the situation for 2-by-2 and 3-by-3 matrices.

    Theorem 2.14. Let A be a 2-by-2 matrix. Then one of the following situations applies:

    (i) A has two eigenvalues, a and b, each of algebraic multiplicity 1. Let u be an eigenvector associated to the eigenvalue a and let v be an eigenvector associated to the eigenvalue b. Then A = PJP^{-1} with

    J = [[a, 0], [0, b]] and P = [u v].

    (Note, in this case, A is diagonalizable.)

    (ii) A has a single eigenvalue a of algebraic multiplicity 2.

    (a) A has two linearly independent eigenvectors u and v. Then A = PJP^{-1} with

    J = [[a, 0], [0, a]] and P = [u v].

    (Note, in this case, A is diagonalizable. In fact, in this case E_a = C^2 and A itself is the matrix [[a, 0], [0, a]].)

    (b) A has a single chain {v1, v2} of generalized eigenvectors. Then A = PJP^{-1} with

    J = [[a, 1], [0, a]] and P = [v1 v2].

    Theorem 2.15. Let A be a 3-by-3 matrix. Then one of the following situations applies:

    (i) A has three eigenvalues, a, b, and c, each of algebraic multiplicity 1. Let u be an eigenvector associated to the eigenvalue a, v be an eigenvector associated to the eigenvalue b, and w be an eigenvector associated to the eigenvalue c. Then A = PJP^{-1} with

    J = [[a, 0, 0], [0, b, 0], [0, 0, c]] and P = [u v w].

    (Note, in this case, A is diagonalizable.)

    (ii) A has an eigenvalue a of algebraic multiplicity 2 and an eigenvalue b of algebraic multiplicity 1.

    (a) A has two independent eigenvectors, u and v, associated to the eigenvalue a. Let w be an eigenvector associated to the eigenvalue b. Then A = PJP^{-1} with

    J = [[a, 0, 0], [0, a, 0], [0, 0, b]] and P = [u v w].

    (Note, in this case, A is diagonalizable.)

    (b) A has a single chain {u1, u2} of generalized eigenvectors associated to the eigenvalue a. Let v be an eigenvector associated to the eigenvalue b. Then A = PJP^{-1} with

    J = [[a, 1, 0], [0, a, 0], [0, 0, b]] and P = [u1 u2 v].

    (iii) A has a single eigenvalue a of algebraic multiplicity 3.

    (a) A has three linearly independent eigenvectors, u, v, and w. Then A = PJP^{-1} with

    J = [[a, 0, 0], [0, a, 0], [0, 0, a]] and P = [u v w].

    (Note, in this case, A is diagonalizable. In fact, in this case E_a = C^3 and A itself is the matrix [[a, 0, 0], [0, a, 0], [0, 0, a]].)

    (b) A has a chain {u1, u2} of generalized eigenvectors and an eigenvector v with {u1, u2, v} linearly independent. Then A = PJP^{-1} with

    J = [[a, 1, 0], [0, a, 0], [0, 0, a]] and P = [u1 u2 v].

    (c) A has a single chain {u1, u2, u3} of generalized eigenvectors. Then A = PJP^{-1} with

    J = [[a, 1, 0], [0, a, 1], [0, 0, a]] and P = [u1 u2 u3].

    Remark 2.16. Suppose that A has JCF J = aI, a scalar multiple of the identity matrix. Then A = PJP^{-1} = P(aI)P^{-1} = a(PIP^{-1}) = aI = J. This justifies the parenthetical remark in Theorems 2.14 (ii) (a) and 2.15 (iii) (a).

    Remark 2.17. Note that Theorems 2.14 (i), 2.14 (ii) (a), 2.15 (i), 2.15 (ii) (a), and 2.15 (iii) (a) are all special cases of Theorem 1.14, and in fact Theorems 2.14 (i) and 2.15 (i) are both special cases of Corollary 1.15.

    Now we would like to apply Theorems 2.14 and 2.15. In order to do so, we need to have an effective method to determine which of the cases we are in, and we give that here (without proof).

    Definition 2.18. Let λ be an eigenvalue of A. Then for any positive integer i,

    E_λ^i = {v | (A - λI)^i (v) = 0} = Ker((A - λI)^i).

    Note that E_λ^i consists of generalized eigenvectors of index at most i (and the 0 vector), and is a subspace. Note also that

    E_λ = E_λ^1 ⊆ E_λ^2 ⊆ . . . ⊆ Ẽ_λ.

    In general, the JCF of A is determined by the dimensions of all the spaces E_λ^i, but this determination can be a bit complicated. For eigenvalues of multiplicity at most 3, however, the situation is simpler: we need only consider the eigenspaces E_λ. This is a consequence of the following general result.

    Proposition 2.19. Let λ be an eigenvalue of A. Then the number of blocks in the JCF of A corresponding to λ is equal to dim E_λ, i.e., to the geometric multiplicity of λ.

    Proof. (Outline) Suppose there are ℓ such blocks. Since each block corresponds to a chain of generalized eigenvectors, there are ℓ such chains. Now the bottom of each chain is an (ordinary) eigenvector, so we get ℓ eigenvectors in this way. It can be shown that these eigenvectors are always linearly independent and that they always span E_λ, i.e., that they are a basis of E_λ. Thus, E_λ has a basis consisting of ℓ vectors, so dim E_λ = ℓ.

    We can now determine the JCF of 1-by-1, 2-by-2, and 3-by-3 matrices, using the following consequences of this proposition.

    Corollary 2.20. Let λ be an eigenvalue of A of algebraic multiplicity 1. Then dim E_λ^1 = 1, i.e., λ has geometric multiplicity 1, and the submatrix of the JCF of A corresponding to the eigenvalue λ is a single 1-by-1 block.

    Corollary 2.21. Let λ be an eigenvalue of A of algebraic multiplicity 2. Then there are the following possibilities:

    (a) dim E_λ^1 = 2, i.e., λ has geometric multiplicity 2. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of two 1-by-1 blocks.

    (b) dim E_λ^1 = 1, i.e., λ has geometric multiplicity 1. Also, dim E_λ^2 = 2. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of a single 2-by-2 block.

    Corollary 2.22. Let λ be an eigenvalue of A of algebraic multiplicity 3. Then there are the following possibilities:

    (a) dim E_λ^1 = 3, i.e., λ has geometric multiplicity 3. In this case, the submatrix of the JCF of A corresponding to the eigenvalue λ consists of three 1-by-1 blocks.

    (b) dim E_λ^1 = 2, i.e., λ has geometric multiplicity 2. Also, dim E_λ^2 = 3. In this case, the submatrix of the Jordan Canonical Form of A corresponding to the eigenvalue λ consists of a 2-by-2 block and a 1-by-1 block.

    (c) dim E_λ^1 = 1, i.e., λ has geometric multiplicity 1. Also, dim E_λ^2 = 2, and dim E_λ^3 = 3. In this case, the submatrix of the Jordan Canonical Form of A corresponding to the eigenvalue λ consists of a single 3-by-3 block.
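    Corollaries 2.20 through 2.22 reduce the determination of the JCF to computing the dimensions of the kernels of powers of A - λI. The following is a minimal computational sketch of that idea (my own, not from the book; the example matrix is a hypothetical one with the single eigenvalue 2 of algebraic multiplicity 3):

```python
import numpy as np

def kernel_dim(M, tol=1e-8):
    # dim Ker(M) = number of (numerically) zero singular values of M
    return int(np.sum(np.linalg.svd(M, compute_uv=False) < tol))

# Hypothetical example: a single 3-by-3 Jordan block with eigenvalue 2.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])
lam = 2.0
B = A - lam * np.eye(3)
for i in range(1, 4):
    print(i, kernel_dim(np.linalg.matrix_power(B, i)))
# Output: 1 1, 2 2, 3 3 -- geometric multiplicity 1, so a single 3-by-3 block
# (the situation of Corollary 2.22 (c)).
```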

    Now we shall do several examples.

    Example 2.23. A = 2 3 32 2 2

    2 1 1

    .

    A has characteristic polynomial det(λI - A) = (λ + 1)(λ)(λ - 2). Thus, A has eigenvalues -1, 0, and 2, each of multiplicity one, and so we are in the situation of Theorem 2.15 (i). Computation

    shows that the eigenspaceE1 = Ker(A (I )) has basis10

    1

    , the eigenspaceE0 = Ker(A)

    has basis

    01

    1

    , and the eigenspace E2 = Ker(A 2I ) has basis

    11

    1

    . Hence, we see

    that 2 3 32 2 2

    2 1 1

    =

    1 0 10 1 1

    1 1 1

    1 0 00 0 0

    0 0 2

    1 0 10 1 1

    1 1 1

    1

    .

    Example 2.24. A =3 1 12 4 2

    1 1 3

    .


    A has characteristic polynomial det(λI - A) = (λ - 2)^2 (λ - 6). Thus, A has an eigenvalue 2 of multiplicity 2 and an eigenvalue 6 of multiplicity 1. Computation shows that the eigenspace

    E2 = Ker(A 2I ) has basis11

    0

    ,

    10

    1

    , so dim E2 = 2 and we are in the situation of

    Corollary 2.21 (a). Further computation shows that the eigenspace E6 = Ker(A 6I ) has basis12

    1

    . Hence, we see that

    3 1 12 4 2

    1 1 3

    =

    1 1 11 0 2

    0 1 1

    2 0 00 2 0

    0 0 6

    1 1 11 0 2

    0 1 1

    1

    .

    Example 2.25. A = 2 1 12 1 2

    1 0 2

    .

    A has characteristic polynomial det(λI - A) = (λ + 1)^2 (λ - 3). Thus, A has an eigenvalue -1 of multiplicity 2 and an eigenvalue 3 of multiplicity 1. Computation shows that the eigenspace

    E1 = Ker(A (I )) has basis12

    1

    so dim E1 = 1 and we are in the situation of Corol-

    lary 2.21 (b).Then we further compute that E21 = Ker((A (I ))2) has basis12

    0

    ,

    00

    1

    ,

    therefore is two-dimensional, as we expect.More to the point, we may choose any generalized eigen-vector of index 2, i.e., any vector in E21 that is not in E11, as the top of a chain. We choose u2 =00

    1

    , and then we have u1 = (A (I ))u2 =

    12

    1

    , and {u1, u2} form a chain.

    We also compute that, for the eigenvalue 3, the eigenspace E3 has basis

    v =

    56

    1

    .

    Hence, we see that

    2 1 12 1 2

    1 0 2

    =

    1 0 52 0 6

    1 1 1

    1 1 00 1 0

    0 0 3

    1 0 52 0 6

    1 1 1

    1

    .


    Example 2.26. A = 2 1 12 1 2

    1 1 2

    .

    A has characteristic polynomial det(λI - A) = (λ - 1)^3, so A has one eigenvalue 1 of

    multiplicity three. Computation shows that E1 = Ker(A I ) has basis10

    1

    ,

    11

    0

    , so

    dim E_1 = 2 and we are in the situation of Corollary 2.22 (b). Computation then shows that

    dim E21 = 3 (i.e.,(A I )2 = 0 andE21 is all of C3)with basis10

    0

    ,

    01

    0

    ,

    00

    1

    .Wemay choose

    u2 to be any vector inE21 that is not inE11 , and we shall choose u2 =

    10

    0

    .Then u1 = (A I )u2 =

    12

    1

    , and {u1, u2} form a chain. For the third vector, v, we may choose any vector in E1 such that

    {u1, v} is linearly independent.We choose v =10

    1

    . Hence, we see that

    2 1 12 1 2

    1 1 2

    =

    1 1 12 0 0

    1 0 1

    1 1 00 1 0

    0 0 1

    1 1 12 0 0

    1 0 1

    1

    .

    Example 2.27. A = 5 0 11 1 0

    7 1 0

    .

    A has characteristic polynomial det(λI - A) = (λ - 2)^3, so A has one eigenvalue 2 of multi-

    plicity three.Computation shows thatE2 = Ker(A 2I ) has basis11

    3

    , so dim E12 = 1 and

    we are in the situation of Corollary 2.22 (c). Then computation shows that E_2^2 = Ker((A - 2I)^2)

    has basis

    10

    2

    ,

    12

    0

    . (Note that

    11

    3

    = 3/2

    10

    2

    + 1/2

    12

    0

    .) Computation then


    shows that dim E32 = 3 (i.e., (A 2I )3 = 0 and E32 is all of C3) with basis10

    0

    ,

    01

    0

    ,

    00

    1

    .

    We may choose u3 to be any vector in C3 that is not in E22 , and we shall choose u3 =10

    0

    . Then

    u2 = (A 2I )u3 = 31

    7

    and u1 = (A 2I )u2 =

    22

    6

    , and then {u1, u2, u3} form a chain.

    Hence, we see that

    5 0 11 1 0

    7 1 0

    =

    2 3 12 1 0

    6 7 0

    2 1 00 2 1

    0 0 2

    2 3 12 1 0

    6 7 0

    1

    .

    As we have mentioned, we need to work over the complex numbers in order for the theory of JCF to fully apply. But there is an analog over the real numbers, and we conclude this section by stating it.

    Theorem 2.28. Let A be a real square matrix (i.e., a square matrix with all entries real numbers), and suppose that all of the eigenvalues of A are real numbers. Then A is similar to a real matrix in Jordan Canonical Form. More precisely, A = PJP^{-1} with P and J real matrices, for some matrix J in Jordan Canonical Form. The diagonal entries of J consist of eigenvalues of A, and P is an invertible matrix whose columns are chains of generalized eigenvectors of A.

    EXERCISES FOR CHAPTER 1

    For each matrix A, write A = PJP^{-1} with P an invertible matrix and J a matrix in JCF.

    1. A =[

    75 5690 67

], det(λI - A) = (λ - 3)(λ - 5).

    2. A =[50 9920 39

], det(λI - A) = (λ + 6)(λ + 5).

    3. A =[18 949 24

], det(λI - A) = (λ - 3)^2.


    4. A =[

    1 116 9

], det(λI - A) = (λ - 5)^2.

    5. A =[

    2 125 12

], det(λI - A) = (λ - 7)^2.

    6. A =[15 925 15

], det(λI - A) = λ^2.

    7. A =1 0 01 2 3

    1 1 0

, det(λI - A) = (λ + 1)(λ - 1)(λ - 3).

    8. A =3 0 21 3 1

    0 1 1

, det(λI - A) = (λ - 1)(λ - 2)(λ - 4).

    9. A = 5 8 164 1 8

    4 4 11

, det(λI - A) = (λ + 3)^2 (λ - 1).

    10. A = 4 2 31 1 3

    2 4 9

, det(λI - A) = (λ - 3)^2 (λ - 8).

    11. A = 5 2 11 2 1

    1 2 3

, det(λI - A) = (λ - 4)^2 (λ - 2).

    12. A = 8 3 34 0 2

    2 1 3

, det(λI - A) = (λ - 2)^2 (λ - 7).

    13. A =3 1 17 5 1

    6 6 2

, det(λI - A) = (λ + 2)^2 (λ - 4).

    14. A = 3 0 09 5 18

    4 4 12

, det(λI - A) = (λ - 3)^2 (λ - 4).


    15. A =6 9 06 6 2

    9 9 3

, det(λI - A) = λ^2 (λ - 3).

    16. A =18 42 1681 7 40

    2 6 27

, det(λI - A) = (λ - 3)^2 (λ + 4).

    17. A = 1 1 110 6 5

    6 3 2

, det(λI - A) = (λ - 1)^3.

    18. A =0 4 12 6 1

    4 8 0

, det(λI - A) = (λ + 2)^3.

    19. A =4 1 25 1 3

    7 2 3

, det(λI - A) = λ^3.

    20. A =4 2 51 1 1

    2 1 2

, det(λI - A) = (λ + 1)^3.

  • C H A P T E R 2

    Solving Systems of Linear Differential Equations

    2.1 HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS

    We will now see how to use Jordan Canonical Form (JCF) to solve systems Y' = AY. We begin by describing the strategy we will follow throughout this section.

    Consider the matrix system

    Y' = AY.

    Step 1. Write A = PJP^{-1} with J in JCF, so the system becomes

    Y' = (PJP^{-1})Y
    Y' = PJ(P^{-1}Y)
    P^{-1}Y' = J(P^{-1}Y)
    (P^{-1}Y)' = J(P^{-1}Y).

    (Note that, since P^{-1} is a constant matrix, we have that (P^{-1}Y)' = P^{-1}Y'.)

    Step 2. Set Z = P^{-1}Y, so this system becomes

    Z' = JZ

    and solve this system for Z.

    Step 3. Since Z = P^{-1}Y, we have that

    Y = PZ

    is the solution to our original system.

    Examining this strategy, we see that we already know how to carry out Step 1, and also that Step 3 is very easy: it is just matrix multiplication. Thus, the key to success here is being able to carry out Step 2. This is where JCF comes in. As we shall see, it is (relatively) easy to solve Z' = JZ when J is a matrix in JCF.

    You will note that throughout this section, in solving Z' = JZ, we write the solution as Z = M_Z C, where M_Z is a matrix of functions, called the fundamental matrix of the system, and C is a vector of arbitrary constants. The reason for this will become clear later. (See Remarks 1.12 and 1.14.)

    Although it is not logically necessary (we may regard a diagonal matrix as a matrix in JCF in which all the Jordan blocks are 1-by-1 blocks), it is illuminating to handle the case when J is diagonal first. Here the solution is very easy.

    Theorem 1.1. Let J be a k-by-k diagonal matrix,

    J = diag(a1, a2, a3, . . . , a_{k-1}, a_k).

    Then the system Z' = JZ has the solution

    Z = diag(e^{a1 x}, e^{a2 x}, e^{a3 x}, . . . , e^{a_{k-1} x}, e^{a_k x}) C = M_Z C

    where C = [c1, c2, . . . , ck]^T is a vector of arbitrary constants c1, c2, . . . , ck.

    Proof. Multiplying out, we see that the system Z' = JZ is just the system

    [z1', z2', . . . , zk']^T = [a1 z1, a2 z2, . . . , ak zk]^T.

    But this system is "uncoupled", i.e., the equation for z_i' only involves z_i and none of the other functions. Now this equation is very familiar. In general, the differential equation z' = az has solution z = c e^{ax}, and applying that here we find that Z' = JZ has solution

    Z = [c1 e^{a1 x}, c2 e^{a2 x}, . . . , ck e^{ak x}]^T,

    which is exactly the above product M_Z C.

    Example 1.2. Consider the system

    Y' = AY where A = [[5, -7], [2, -4]].

    We saw in Example 1.16 in Chapter 1 that A = PJP^{-1} with

    P = [[7, 1], [2, 1]] and J = [[3, 0], [0, -2]].

    Then Z' = JZ has solution

    Z = [[e^{3x}, 0], [0, e^{-2x}]] [c1, c2]^T = M_Z C = [c1 e^{3x}, c2 e^{-2x}]^T

    and so Y = PZ = P M_Z C, i.e.,

    Y = [[7, 1], [2, 1]] [[e^{3x}, 0], [0, e^{-2x}]] [c1, c2]^T
      = [[7e^{3x}, e^{-2x}], [2e^{3x}, e^{-2x}]] [c1, c2]^T
      = [7c1 e^{3x} + c2 e^{-2x}, 2c1 e^{3x} + c2 e^{-2x}]^T.
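    As a sanity check (my own sketch, not part of the book), the closed-form solution of Example 1.2 can be compared against a numerical integration of Y' = AY:

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[5.0, -7.0],
              [2.0, -4.0]])
c1, c2 = 1.0, 1.0                      # one choice of the arbitrary constants
Y0 = np.array([7*c1 + c2, 2*c1 + c2])  # value of the closed-form solution at x = 0

sol = solve_ivp(lambda x, Y: A @ Y, (0.0, 1.0), Y0, rtol=1e-10, atol=1e-12)
x = sol.t[-1]
closed_form = np.array([7*c1*np.exp(3*x) + c2*np.exp(-2*x),
                        2*c1*np.exp(3*x) + c2*np.exp(-2*x)])
print(np.allclose(sol.y[:, -1], closed_form, rtol=1e-6))   # True
```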

    Example 1.3. Consider the system

    Y' = AY where A = 2 3 32 2 2

    2 1 1

    .


    We saw in Example 2.23 in Chapter 1 that A = PJP1 with

    P =1 0 10 1 1

    1 1 1

    and J =

    1 0 00 0 0

    0 0 2

    .

    Then Z' = JZ has solution

    Z =ex 0 00 1 0

    0 0 e2x

    c1c2c3

    = MZC

    and so Y = PZ = PMZC, i.e.,

    Y =1 0 10 1 1

    1 1 1

    ex 0 00 1 0

    0 0 e2x

    c1c2c3

    =ex 0 e2x0 1 e2xex 1 e2x

    c1c2c3

    =c1ex c3e2xc2 c3e2xc1ex + c2 + c3e2x

    .

    We now see how to use JCF to solve systems Y' = AY where the coefficient matrix A is not diagonalizable.

    The key to understanding systems is to investigate a system Z' = JZ where J is a matrix consisting of a single Jordan block. Here the solution is not as easy as in Theorem 1.1, but it is still not too hard.

    Theorem 1.4. Let J be a k-by-k Jordan block with eigenvalue a,

    J = [ a  1                ]
        [    a  1             ]
        [       a  1          ]
        [          .  .       ]
        [             a  1    ]
        [                a    ].

    Then the system Z' = JZ has the solution

    Z = e^{ax} [ 1  x  x^2/2!  x^3/3!  . . .  x^{k-1}/(k-1)! ]
               [    1  x       x^2/2!  . . .  x^{k-2}/(k-2)! ]
               [       1       x       . . .  x^{k-3}/(k-3)! ]
               [                  .  .                       ]
               [                               x             ]
               [                               1             ]  C = M_Z C

    where C = [c1, c2, . . . , ck]^T is a vector of arbitrary constants c1, c2, . . . , ck.

    Proof. We will prove this in the cases k = 1, 2, and 3, which illustrate the pattern. As you will see, the proof is a simple application of the standard technique for solving first-order linear differential equations.

    The case k = 1: Here we are considering the system

    [z1'] = [a][z1],

    which is nothing other than the differential equation

    z1' = a z1.

    This differential equation has solution

    z1 = c1 e^{ax},

    which we can certainly write as

    [z1] = e^{ax}[1][c1].

    The case k = 2: Here we are considering the system

    [z1', z2']^T = [[a, 1], [0, a]] [z1, z2]^T,

    which is nothing other than the pair of differential equations

    z1' = a z1 + z2
    z2' = a z2.

    We recognize the second equation as having the solution

    z2 = c2 e^{ax}

    and we substitute this into the first equation to get

    z1' = a z1 + c2 e^{ax}.

    To solve this, we rewrite this as

    z1' - a z1 = c2 e^{ax}

    and recognize that this differential equation has integrating factor e^{-ax}. Multiplying by this factor, we find

    e^{-ax}(z1' - a z1) = c2
    (e^{-ax} z1)' = c2
    e^{-ax} z1 = ∫ c2 dx = c1 + c2 x

    so

    z1 = e^{ax}(c1 + c2 x).

    Thus, our solution is

    z1 = e^{ax}(c1 + c2 x)
    z2 = e^{ax} c2,

    which we see we can rewrite as

    [z1, z2]^T = e^{ax} [[1, x], [0, 1]] [c1, c2]^T.

    The case k = 3: Here we are considering the system

    [z1', z2', z3']^T = [[a, 1, 0], [0, a, 1], [0, 0, a]] [z1, z2, z3]^T,

    which is nothing other than the triple of differential equations

    z1' = a z1 + z2
    z2' = a z2 + z3
    z3' = a z3.

    If we just concentrate on the last two equations, we see we are in the k = 2 case. Referring to that case, we see that our solution is

    z2 = e^{ax}(c2 + c3 x)
    z3 = e^{ax} c3.

    Substituting the value of z2 into the equation for z1, we obtain

    z1' = a z1 + e^{ax}(c2 + c3 x).

    To solve this, we rewrite this as

    z1' - a z1 = e^{ax}(c2 + c3 x)

    and recognize that this differential equation has integrating factor e^{-ax}. Multiplying by this factor, we find

    e^{-ax}(z1' - a z1) = c2 + c3 x
    (e^{-ax} z1)' = c2 + c3 x
    e^{-ax} z1 = ∫ (c2 + c3 x) dx = c1 + c2 x + c3(x^2/2)

    so

    z1 = e^{ax}(c1 + c2 x + c3(x^2/2)).

    Thus, our solution is

    z1 = e^{ax}(c1 + c2 x + c3(x^2/2))
    z2 = e^{ax}(c2 + c3 x)
    z3 = e^{ax} c3,

    which we see we can rewrite as

    [z1, z2, z3]^T = e^{ax} [[1, x, x^2/2], [0, 1, x], [0, 0, 1]] [c1, c2, c3]^T.
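    A brief computational sketch of Theorem 1.4 (mine, not from the book): for a single Jordan block J with eigenvalue a, the fundamental matrix e^{ax} times the upper triangular matrix of powers x^j/j! agrees with the matrix exponential e^{Jx}:

```python
import numpy as np
from scipy.linalg import expm
from math import factorial

a, k, x = 2.0, 4, 0.7
J = a * np.eye(k) + np.diag(np.ones(k - 1), 1)   # a on the diagonal, 1 just above it

# Build M_Z(x): entry (i, j) is x^{j-i}/(j-i)! for j >= i, times e^{ax}.
M = np.zeros((k, k))
for i in range(k):
    for j in range(i, k):
        M[i, j] = x**(j - i) / factorial(j - i)
M *= np.exp(a * x)

print(np.allclose(M, expm(J * x)))   # True
```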

    Remark 1.5. Suppose that Z' = JZ where J is a matrix in JCF but one consisting of several blocks, not just one block. We can see that this system decomposes into several systems, one corresponding to each block, and that these systems are uncoupled, so we may solve them each separately, using Theorem 1.4, and then simply assemble these individual solutions together to obtain a solution of the general system.

    We now illustrate this (confining our illustrations to the case that A is not diagonalizable, as we have already illustrated the diagonalizable case).

    Example 1.6. Consider the system

    Y' = AY where A = [[0, 1], [-4, 4]].

    We saw in Example 2.12 in Chapter 1 that A = PJP^{-1} with

    P = [[-2, 1], [-4, 0]] and J = [[2, 1], [0, 2]].

    Then Z' = JZ has solution

    Z = e^{2x} [[1, x], [0, 1]] [c1, c2]^T = [[e^{2x}, x e^{2x}], [0, e^{2x}]] [c1, c2]^T = M_Z C = [c1 e^{2x} + c2 x e^{2x}, c2 e^{2x}]^T

    and so Y = PZ = P M_Z C, i.e.,

    Y = [[-2, 1], [-4, 0]] e^{2x} [[1, x], [0, 1]] [c1, c2]^T
      = [[-2, 1], [-4, 0]] [[e^{2x}, x e^{2x}], [0, e^{2x}]] [c1, c2]^T
      = [[-2e^{2x}, -2x e^{2x} + e^{2x}], [-4e^{2x}, -4x e^{2x}]] [c1, c2]^T
      = [(-2c1 + c2) e^{2x} - 2c2 x e^{2x}, -4c1 e^{2x} - 4c2 x e^{2x}]^T.

    Example 1.7. Consider the system

    Y' = AY where A = 2 1 12 1 2

    1 0 2

    .

    We saw in Example 2.25 in Chapter 1 that A = PJP1 with

    P = 1 0 52 0 6

    1 1 1

    and J =

    1 1 00 1 0

    0 0 3

    .


    Then Z' = JZ has solution

    Z =ex xex 00 ex 0

    0 0 e3x

    c1c2c3

    = MZC

    and so Y = PZ = PMZC, i.e.,

    Y = 1 0 52 0 6

    1 1 1

    ex xex 00 ex 0

    0 0 e3x

    c1c2c3

    = ex xex 5e3x2ex 2xex 6e3x

    ex xex + ex e3x

    c1c2c3

    = c1ex + c2xex 5c3e3x2c1ex 2c2xex 6c3e3x(c1 + c2)ex c2xex + c3e3x

    .

    Example 1.8. Consider the system

    Y' = AY where A = 2 1 12 1 2

    1 1 2

    .

    We saw in Example 2.26 in Chapter 1 that A = PJP1 with

    P = 1 1 12 0 0

    1 0 1

    and J =

    1 1 00 1 0

    0 0 1

    .

    Then Z' = JZ has solution

    Z =ex xex 00 ex 0

    0 0 ex

    c1c2c3

    = MZC

    and so Y = PZ = PMZC, i.e.,

    Y = 1 1 12 0 0

    1 0 1

    ex xex 00 ex 0

    0 0 ex

    c1c2c3


    = ex xex + ex ex2ex 2xex 0

    ex xex ex

    c1c2c3

    = (c1 + c2 + c3)ex + c2xex2c1ex 2c2xex

    (c1 + c3)ex + c2xex

    .

    Example 1.9. Consider the system

    Y' = AY where A = 5 0 11 1 0

    7 1 0

    .

    We saw in Example 2.27 in Chapter 1 that A = PJP1 with

    P = 2 3 12 1 0

    6 7 0

    and J =

    2 1 00 2 1

    0 0 2

    .

    Then Z' = JZ has solution

    Z =e2x xe2x (x2/2)e2x0 e2x xe2x

    0 0 e2x

    c1c2c3

    = MZC

    and so Y = PZ = PMZC, i.e.,

    Y = 2 3 12 1 0

    6 7 0

    e2x xe2x (x2/2)e2x0 e2x xe2x

    0 0 e2x

    c1c2c3

    = 2e2x 2xe2x + 3e2x x2e2x + 3xe2x + e2x2e2x 2xe2x + e2x x2e2x + xe2x

    6e2x 6xe2x 7e2x 3x2e2x 7xe2x

    c1c2c3

    = (2c1 + 3c2 + c3)e2x + (2c2 + 3c3)xe2x + c3x2e2x(2c1 + c2)e2x + (2c2 + c3)xe2x + c3x2e2x(6c1 7c2)e2x + (6c2 7c3)xe2x 3c3x2e2x

    .

    We conclude this section by showing how to solve initial value problems. This is just one more step, given what we have already done.

    Example 1.10. Consider the initial value problem

    Y' = AY where A = [[0, 1], [-4, 4]], and Y(0) = [3, -8]^T.

    In Example 1.6, we saw that this system has the general solution

    Y = [(-2c1 + c2) e^{2x} - 2c2 x e^{2x}, -4c1 e^{2x} - 4c2 x e^{2x}]^T.

    Applying the initial condition (i.e., substituting x = 0 in this matrix) gives

    [3, -8]^T = Y(0) = [-2c1 + c2, -4c1]^T

    with solution

    [c1, c2]^T = [2, 7]^T.

    Substituting these values in the above matrix gives

    Y = [3e^{2x} - 14x e^{2x}, -8e^{2x} - 28x e^{2x}]^T.
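    Again as a sketch of my own (not from the book), the initial value problem of Example 1.10 can be checked numerically against the closed-form answer:

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0],
              [-4.0, 4.0]])
Y0 = np.array([3.0, -8.0])

xs = np.linspace(0.0, 1.0, 5)
sol = solve_ivp(lambda x, Y: A @ Y, (0.0, 1.0), Y0, t_eval=xs, rtol=1e-10, atol=1e-12)

closed = np.vstack([3*np.exp(2*xs) - 14*xs*np.exp(2*xs),
                    -8*np.exp(2*xs) - 28*xs*np.exp(2*xs)])
print(np.allclose(sol.y, closed, rtol=1e-6))   # True
```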

    Example 1.11. Consider the initial value problem

    Y' = AY where A = 2 1 12 1 2

    1 0 2

    , and Y (0) =

    832

    5

    .

    In Example 1.7, we saw that this system has the general solution

    Y =c1ex + c2xex 5c3xe3x2c1ex 2c2xex 6c3e3x

    (c1 + c2)ex c2xex + c3e3x

    .

    Applying the initial condition (i.e., substituting x = 0 in this matrix) gives 832

    5

    = Y (0) =

    c1 5c32c1 6c3

    c1 + c2 + c3


    with solution c1c2c3

    =

    71

    3

    .

    Substituting these values in the above matrix gives

    Y =7ex + xex + 15e3x14ex 2xex + 18e3x

    8ex xex 3e3x

    .

    Remark 1.12. There is a variant on our method of solving systems or initial value problems.

    We have written our solution of Z' = JZ as Z = M_Z C. Let us be more explicit here and write this solution as

    Z(x) = M_Z(x) C.

    This notation reminds us that Z(x) is a vector of functions, M_Z(x) is a matrix of functions, and C is a vector of constants. The key observation is that M_Z(0) = I, the identity matrix. Thus, if we wish to solve the initial value problem

    Z' = JZ, Z(0) = Z_0,

    we find that, in general,

    Z(x) = M_Z(x) C

    and, in particular,

    Z_0 = Z(0) = M_Z(0) C = IC = C,

    so the solution to this initial value problem is

    Z(x) = M_Z(x) Z_0.

    Now suppose we wish to solve the system Y' = AY. Then, if A = PJP^{-1}, we have seen that this system has solution Y = PZ = P M_Z C. Let us manipulate this a bit:

    Y = P M_Z C = P M_Z IC = P M_Z (P^{-1}P) C = (P M_Z P^{-1})(PC).

    Now let us set MY = PMZP1, and also let us set = PC. Note that MY is still a matrix offunctions, and that is still a vector of arbitrary constants (since P is an invertible constant matrixand C is a vector of arbitrary constants). Thus, with this notation, we see that

    Y = AY has solution Y = MY .


    Now suppose we wish to solve the initial value problem

    Y' = AY, Y(0) = Y_0.

    Rewriting the above solution of Y' = AY to explicitly include the independent variable, we see that we have

    Y (x) = MY(x)and, in particular,

    Y0 = Y (0) = MY(0) = PMZ(0)P1 = PIP1 = ,so we see that

    Y' = AY, Y(0) = Y_0 has solution Y(x) = M_Y(x) Y_0.

    for solving a single initial value problem (as it requires us to compute P1 and do some extramatrix multiplication), but it has the advantage of expressing the solution directly in terms of theinitial conditions. This makes it more effective if the same system Y = AY is to be solved for avariety of initial conditions. Also, as we see from Remark 1.14 below, it is of considerable theoreticalimportance.

    Let us now apply this variant method.

    Example 1.13. Consider the initial value problem

    Y' = AY where A = [[0, 1], [-4, 4]], and Y(0) = [a1, a2]^T.

    As we have seen in Example 1.6, A = PJP^{-1} with P = [[-2, 1], [-4, 0]] and J = [[2, 1], [0, 2]]. Then

    M_Z(x) = [[e^{2x}, x e^{2x}], [0, e^{2x}]]

    and

    M_Y(x) = P M_Z(x) P^{-1} = [[-2, 1], [-4, 0]] [[e^{2x}, x e^{2x}], [0, e^{2x}]] [[-2, 1], [-4, 0]]^{-1}
           = [[e^{2x} - 2x e^{2x}, x e^{2x}], [-4x e^{2x}, e^{2x} + 2x e^{2x}]]

    so

    Y(x) = M_Y(x) [a1, a2]^T = [[e^{2x} - 2x e^{2x}, x e^{2x}], [-4x e^{2x}, e^{2x} + 2x e^{2x}]] [a1, a2]^T
         = [a1 e^{2x} + (-2a1 + a2) x e^{2x}, a2 e^{2x} + (-4a1 + 2a2) x e^{2x}]^T.

    In particular, if Y(0) = [3, -8]^T, then Y(x) = [3e^{2x} - 14x e^{2x}, -8e^{2x} - 28x e^{2x}]^T, recovering the result of Example 1.10. But also, if Y(0) = [2, 5]^T, then Y(x) = [2e^{2x} + x e^{2x}, 5e^{2x} + 2x e^{2x}]^T, and if Y(0) = [-4, 15]^T, then Y(x) = [-4e^{2x} + 23x e^{2x}, 15e^{2x} + 46x e^{2x}]^T, etc.

    Remark 1.14. In Section 2.4 we will define the matrix exponential, and, with this definition, M_Z(x) = e^{Jx} and M_Y(x) = P M_Z(x) P^{-1} = e^{Ax}.
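    A small sketch illustrating Remark 1.14 (mine, not the book's): for the matrices of Example 1.13, e^{Ax} computed numerically agrees with P e^{Jx} P^{-1} and with the fundamental matrix M_Y(x) found above:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-4.0, 4.0]])
P = np.array([[-2.0, 1.0], [-4.0, 0.0]])
J = np.array([[2.0, 1.0], [0.0, 2.0]])
x = 0.3

MY = expm(A * x)
print(np.allclose(MY, P @ expm(J * x) @ np.linalg.inv(P)))            # True
print(np.allclose(MY, np.exp(2*x) * np.array([[1 - 2*x, x],
                                               [-4*x, 1 + 2*x]])))    # True
```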

    EXERCISES FOR SECTION 2.1

    For each exercise, see the corresponding exercise in Chapter 1. In each exercise:

    (a) Solve the system Y' = AY.
    (b) Solve the initial value problem Y' = AY, Y(0) = Y_0.

    1. A =[

    75 5690 67

    ]and Y0 =

    [1

    1].

    2. A =[50 9920 39

    ]and Y0 =

    [73

    ].

    3. A =[18 949 24

    ]and Y0 =

    [4198

    ].

    4. A =[

    1 116 9

    ]and Y0 =

    [7

    16

    ].

    5. A =[

    2 125 12

    ]and Y0 =

    [1075

    ].

    6. A =[15 925 15

    ]and Y0 =

    [50

    100

    ].


    7. A =1 0 01 2 3

    1 1 0

    and Y0 =

    610

    10

    .

    8. A =3 0 21 3 1

    0 1 1

    and Y0 =

    03

    3

    .

    9. A = 5 8 164 1 8

    4 4 11

    and Y0 =

    02

    1

    .

    10. A = 4 2 31 1 3

    2 4 9

    and Y0 =

    32

    1

    .

    11. A = 5 2 11 2 1

    1 2 3

    and Y0 =

    32

    9

    .

    12. A = 8 3 34 0 2

    2 1 3

    and Y0 =

    58

    7

    .

    13. A =3 1 17 5 1

    6 6 2

    and Y0 =

    13

    6

    .

    14. A = 3 0 09 5 18

    4 4 12

    and Y0 =

    21

    1

    .


    15. A =6 9 06 6 2

    9 9 3

    and Y0 =

    13

    6

    .

    16. A =18 42 1681 7 40

    2 6 27

    and Y0 =

    72

    1

    .

    17. A = 1 1 110 6 5

    6 3 2

    and Y0 =

    310

    18

    .

    18. A =0 4 12 6 1

    4 8 0

    and Y0 =

    25

    8

    .

    19. A =4 1 25 1 3

    7 2 3

    and Y0 =

    611

    9

    .

    20. A =4 2 51 1 1

    2 1 2

    and Y0 =

    95

    8

    .

2.2 HOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS: COMPLEX ROOTS

In this section, we show how to solve a homogeneous system $Y' = AY$ where the characteristic polynomial of $A$ has complex roots. In principle, this is the same as the situation where the characteristic polynomial of $A$ has real roots, which we dealt with in Section 2.1, but in practice, there is an extra step in the solution.


We will begin by doing an example, which will show us where the difficulty lies, and then we will overcome that difficulty. But first, we need some background.

Definition 2.1. For a complex number $z$, the exponential $e^z$ is defined by
$$e^z = 1 + z + z^2/2! + z^3/3! + \cdots.$$

The complex exponential has the following properties.

Theorem 2.2. (1) (Euler) For any $\theta$,
$$e^{i\theta} = \cos(\theta) + i\sin(\theta).$$
(2) For any $a$,
$$\frac{d}{dz}\left(e^{az}\right) = ae^{az}.$$
(3) For any $z_1$ and $z_2$,
$$e^{z_1 + z_2} = e^{z_1}e^{z_2}.$$
(4) If $z = s + it$, then
$$e^z = e^s(\cos(t) + i\sin(t)).$$
(5) For any $z$,
$$e^{\bar z} = \overline{e^z}.$$

Proof. For the proof, see Theorem 2.2 in Appendix A.

The following lemma will save us some computations.

Lemma 2.3. Let $A$ be a matrix with real entries, and let $v$ be an eigenvector of $A$ with associated eigenvalue $\lambda$. Then $\bar v$ is an eigenvector of $A$ with associated eigenvalue $\bar\lambda$.

Proof. We have that $Av = \lambda v$, by hypothesis. Let us take the complex conjugate of each side of this equation. Then
$$\overline{Av} = \overline{\lambda v}, \quad \bar A\bar v = \bar\lambda\bar v, \quad A\bar v = \bar\lambda\bar v \quad (\text{as } \bar A = A \text{ since all the entries of } A \text{ are real}),$$
as claimed.
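Lemma 2.3 is also easy to see numerically. The sketch below (mine, assuming NumPy; the matrix is the one used in the example that follows) checks that for a real matrix the eigenvalues and eigenvectors occur in conjugate pairs.

```python
# Sketch only: numerical check of Lemma 2.3 for a real matrix with complex eigenvalues.
import numpy as np

A = np.array([[2.0, -17.0],
              [1.0,   4.0]])
eigenvalues, eigenvectors = np.linalg.eig(A)   # eigenvectors are the columns
lam, v = eigenvalues[0], eigenvectors[:, 0]

print(np.allclose(A @ v, lam * v))                           # A v = lambda v
print(np.allclose(A @ v.conj(), np.conj(lam) * v.conj()))    # A v-bar = lambda-bar v-bar
```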


    Now for our example.

Example 2.4. Consider the system
$$Y' = AY \ \text{ where } \ A = \begin{bmatrix} 2 & -17 \\ 1 & 4 \end{bmatrix}.$$
$A$ has characteristic polynomial $\lambda^2 - 6\lambda + 25$ with roots $\lambda_1 = 3 + 4i$ and $\lambda_2 = \bar\lambda_1 = 3 - 4i$, each of multiplicity 1. Thus, $\lambda_1$ and $\lambda_2$ are the eigenvalues of $A$, and we compute that the eigenspace $E_{3+4i} = \operatorname{Ker}(A - (3+4i)I)$ has basis $\left\{ v_1 = \begin{bmatrix} -1+4i \\ 1 \end{bmatrix} \right\}$, and hence, by Lemma 2.3, that the eigenspace $E_{3-4i} = \operatorname{Ker}(A - (3-4i)I)$ has basis $\left\{ v_2 = \bar v_1 = \begin{bmatrix} -1-4i \\ 1 \end{bmatrix} \right\}$. Hence, just as before,
$$A = PJP^{-1} \ \text{ with } \ P = \begin{bmatrix} -1+4i & -1-4i \\ 1 & 1 \end{bmatrix} \ \text{ and } \ J = \begin{bmatrix} 3+4i & 0 \\ 0 & 3-4i \end{bmatrix}.$$

We continue as before, but now we use $F$ to denote a vector of arbitrary constants. (This is just for neatness. Our constants will change, as you will see, and we will use the vector $C$ to denote our final constants, as usual.) Then $Z' = JZ$ has solution
$$Z = \begin{bmatrix} e^{(3+4i)x} & 0 \\ 0 & e^{(3-4i)x} \end{bmatrix}\begin{bmatrix} f_1 \\ f_2 \end{bmatrix} = M_ZF = \begin{bmatrix} f_1e^{(3+4i)x} \\ f_2e^{(3-4i)x} \end{bmatrix}$$
and so $Y = PZ = PM_ZF$, i.e.,
$$Y = \begin{bmatrix} -1+4i & -1-4i \\ 1 & 1 \end{bmatrix}\begin{bmatrix} e^{(3+4i)x} & 0 \\ 0 & e^{(3-4i)x} \end{bmatrix}\begin{bmatrix} f_1 \\ f_2 \end{bmatrix} = f_1e^{(3+4i)x}\begin{bmatrix} -1+4i \\ 1 \end{bmatrix} + f_2e^{(3-4i)x}\begin{bmatrix} -1-4i \\ 1 \end{bmatrix}.$$

Now we want our differential equation to have real solutions, and in order for this to be the case, it turns out that we must have $f_2 = \bar f_1$. Thus, we may write our solution as
$$Y = f_1e^{(3+4i)x}\begin{bmatrix} -1+4i \\ 1 \end{bmatrix} + \bar f_1e^{(3-4i)x}\begin{bmatrix} -1-4i \\ 1 \end{bmatrix} = f_1e^{(3+4i)x}\begin{bmatrix} -1+4i \\ 1 \end{bmatrix} + \overline{f_1e^{(3+4i)x}\begin{bmatrix} -1+4i \\ 1 \end{bmatrix}},$$
where $f_1$ is an arbitrary complex constant.

This solution is correct but unacceptable. We want to solve the system $Y' = AY$, where $A$ has real coefficients, and we have a solution which is indeed a real vector, but this vector is expressed in terms of complex numbers and functions. We need to obtain a solution that is expressed totally in terms of real numbers and functions. In order to do this, we need an extra step.

In order not to interrupt the flow of exposition, we simply state here what we need to do, and we justify this after the conclusion of the example.

We therefore do the following: We simply replace the matrix $PM_Z$ by the matrix whose first column is the real part $\operatorname{Re}\left(e^{\lambda_1x}v_1\right) = \operatorname{Re}\left(e^{(3+4i)x}\begin{bmatrix} -1+4i \\ 1 \end{bmatrix}\right)$, and whose second column is the imaginary part $\operatorname{Im}\left(e^{\lambda_1x}v_1\right) = \operatorname{Im}\left(e^{(3+4i)x}\begin{bmatrix} -1+4i \\ 1 \end{bmatrix}\right)$, and the vector $F$ by the vector $C$ of arbitrary real constants.

We compute
$$e^{(3+4i)x}\begin{bmatrix} -1+4i \\ 1 \end{bmatrix} = e^{3x}(\cos(4x) + i\sin(4x))\begin{bmatrix} -1+4i \\ 1 \end{bmatrix} = e^{3x}\begin{bmatrix} -\cos(4x) - 4\sin(4x) \\ \cos(4x) \end{bmatrix} + ie^{3x}\begin{bmatrix} 4\cos(4x) - \sin(4x) \\ \sin(4x) \end{bmatrix}$$
and so we obtain
$$Y = \begin{bmatrix} e^{3x}(-\cos(4x) - 4\sin(4x)) & e^{3x}(4\cos(4x) - \sin(4x)) \\ e^{3x}\cos(4x) & e^{3x}\sin(4x) \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} (-c_1 + 4c_2)e^{3x}\cos(4x) + (-4c_1 - c_2)e^{3x}\sin(4x) \\ c_1e^{3x}\cos(4x) + c_2e^{3x}\sin(4x) \end{bmatrix}.$$

Now we justify the step we have done.

Lemma 2.5. Consider the system $Y' = AY$, where $A$ is a matrix with real entries. Let this system have general solution of the form
$$Y = PM_ZF = \begin{bmatrix} v_1 & \bar v_1 \end{bmatrix}\begin{bmatrix} e^{\lambda_1x} & 0 \\ 0 & e^{\bar\lambda_1x} \end{bmatrix}\begin{bmatrix} f_1 \\ \bar f_1 \end{bmatrix} = \begin{bmatrix} e^{\lambda_1x}v_1 & e^{\bar\lambda_1x}\bar v_1 \end{bmatrix}\begin{bmatrix} f_1 \\ \bar f_1 \end{bmatrix},$$
where $f_1$ is an arbitrary complex constant. Then this system also has general solution of the form
$$Y = \begin{bmatrix} \operatorname{Re}\left(e^{\lambda_1x}v_1\right) & \operatorname{Im}\left(e^{\lambda_1x}v_1\right) \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix},$$
where $c_1$ and $c_2$ are arbitrary real constants.

Proof. First note that for any complex number $z = x + iy$, $x = \operatorname{Re}(z) = \frac12(z + \bar z)$ and $y = \operatorname{Im}(z) = \frac{1}{2i}(z - \bar z)$, and similarly, for any complex vector.


Now $Y' = AY$ has general solution $Y = PM_ZF = PM_Z(RR^{-1})F = (PM_ZR)(R^{-1}F)$ for any invertible matrix $R$. We now (cleverly) choose
$$R = \begin{bmatrix} 1/2 & 1/(2i) \\ 1/2 & -1/(2i) \end{bmatrix}.$$
With this choice of $R$,
$$PM_ZR = \begin{bmatrix} \operatorname{Re}\left(e^{\lambda_1x}v_1\right) & \operatorname{Im}\left(e^{\lambda_1x}v_1\right) \end{bmatrix}.$$
Then
$$R^{-1} = \begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}.$$
Since $f_1$ is an arbitrary complex constant, we may (cleverly) choose to write it as $f_1 = \frac12(c_1 - ic_2)$ for arbitrary real constants $c_1$ and $c_2$, and with this choice
$$R^{-1}F = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix},$$
yielding a general solution as claimed.
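Here is a small SymPy sketch of the replacement step that Lemma 2.5 justifies (my own illustration, using the matrix and eigenvector of Example 2.4): take the real and imaginary parts of $e^{\lambda_1x}v_1$ as columns and check that the resulting real matrix of functions still satisfies $N' = AN$, so each of its columns is a real solution.

```python
# Sketch only: build the real solution matrix [Re(e^{l1 x} v1) | Im(e^{l1 x} v1)].
import sympy as sp

x = sp.symbols('x', real=True)
A = sp.Matrix([[2, -17], [1, 4]])
lam1 = 3 + 4*sp.I
v1 = sp.Matrix([-1 + 4*sp.I, 1])

col = (sp.exp(lam1*x) * v1).applyfunc(lambda e: e.expand(complex=True))
N = col.applyfunc(sp.re).row_join(col.applyfunc(sp.im))   # columns: real part, imaginary part

print(sp.simplify(N.diff(x) - A*N))   # zero matrix: each column solves Y' = AY
print(sp.simplify(N))                 # the matrix multiplying (c1, c2) in Example 2.4
```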

We now solve $Y' = AY$ where $A$ is a real 3-by-3 matrix with a pair of complex eigenvalues and a third, real eigenvalue. As you will see, we use the idea of Lemma 2.5 to simply replace the relevant columns of $PM_Z$ in order to obtain our final solution.

Example 2.6. Consider the system
$$Y' = AY \ \text{ where } \ A = \begin{bmatrix} 15 & -16 & 8 \\ 10 & -10 & 5 \\ 0 & 1 & 2 \end{bmatrix}.$$
$A$ has characteristic polynomial $(\lambda^2 - 2\lambda + 5)(\lambda - 5)$ with roots $\lambda_1 = 1 + 2i$, $\lambda_2 = \bar\lambda_1 = 1 - 2i$, and $\lambda_3 = 5$, each of multiplicity 1. Thus, $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the eigenvalues of $A$, and we compute that the eigenspace $E_{1+2i} = \operatorname{Ker}(A - (1+2i)I)$ has basis $\left\{ v_1 = \begin{bmatrix} -2+2i \\ -1+2i \\ 1 \end{bmatrix} \right\}$, and hence, by Lemma 2.3, that the eigenspace $E_{1-2i} = \operatorname{Ker}(A - (1-2i)I)$ has basis $\left\{ v_2 = \bar v_1 = \begin{bmatrix} -2-2i \\ -1-2i \\ 1 \end{bmatrix} \right\}$. We further compute that the eigenspace $E_5 = \operatorname{Ker}(A - 5I)$ has basis $\left\{ v_3 = \begin{bmatrix} 4 \\ 3 \\ 1 \end{bmatrix} \right\}$. Hence, just as before,
$$A = PJP^{-1} \ \text{ with } \ P = \begin{bmatrix} -2+2i & -2-2i & 4 \\ -1+2i & -1-2i & 3 \\ 1 & 1 & 1 \end{bmatrix} \ \text{ and } \ J = \begin{bmatrix} 1+2i & 0 & 0 \\ 0 & 1-2i & 0 \\ 0 & 0 & 5 \end{bmatrix}.$$

Then $Z' = JZ$ has solution
$$Z = \begin{bmatrix} e^{(1+2i)x} & 0 & 0 \\ 0 & e^{(1-2i)x} & 0 \\ 0 & 0 & e^{5x} \end{bmatrix}\begin{bmatrix} f_1 \\ \bar f_1 \\ c_3 \end{bmatrix} = M_ZF = \begin{bmatrix} f_1e^{(1+2i)x} \\ \bar f_1e^{(1-2i)x} \\ c_3e^{5x} \end{bmatrix}$$
and so $Y = PZ = PM_ZF$, i.e.,
$$Y = \begin{bmatrix} -2+2i & -2-2i & 4 \\ -1+2i & -1-2i & 3 \\ 1 & 1 & 1 \end{bmatrix}\begin{bmatrix} e^{(1+2i)x} & 0 & 0 \\ 0 & e^{(1-2i)x} & 0 \\ 0 & 0 & e^{5x} \end{bmatrix}\begin{bmatrix} f_1 \\ \bar f_1 \\ c_3 \end{bmatrix}.$$
Now
$$e^{(1+2i)x}\begin{bmatrix} -2+2i \\ -1+2i \\ 1 \end{bmatrix} = e^{x}(\cos(2x) + i\sin(2x))\begin{bmatrix} -2+2i \\ -1+2i \\ 1 \end{bmatrix} = \begin{bmatrix} e^{x}(-2\cos(2x) - 2\sin(2x)) \\ e^{x}(-\cos(2x) - 2\sin(2x)) \\ e^{x}\cos(2x) \end{bmatrix} + i\begin{bmatrix} e^{x}(2\cos(2x) - 2\sin(2x)) \\ e^{x}(2\cos(2x) - \sin(2x)) \\ e^{x}\sin(2x) \end{bmatrix}$$
and of course
$$e^{5x}\begin{bmatrix} 4 \\ 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 4e^{5x} \\ 3e^{5x} \\ e^{5x} \end{bmatrix},$$
so, replacing the relevant columns of $PM_Z$, we find
$$Y = \begin{bmatrix} e^{x}(-2\cos(2x) - 2\sin(2x)) & e^{x}(2\cos(2x) - 2\sin(2x)) & 4e^{5x} \\ e^{x}(-\cos(2x) - 2\sin(2x)) & e^{x}(2\cos(2x) - \sin(2x)) & 3e^{5x} \\ e^{x}\cos(2x) & e^{x}\sin(2x) & e^{5x} \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}$$
$$= \begin{bmatrix} (-2c_1 + 2c_2)e^{x}\cos(2x) + (-2c_1 - 2c_2)e^{x}\sin(2x) + 4c_3e^{5x} \\ (-c_1 + 2c_2)e^{x}\cos(2x) + (-2c_1 - c_2)e^{x}\sin(2x) + 3c_3e^{5x} \\ c_1e^{x}\cos(2x) + c_2e^{x}\sin(2x) + c_3e^{5x} \end{bmatrix}.$$


EXERCISES FOR SECTION 2.2

In Exercises 1–4:

(a) Solve the system $Y' = AY$.

(b) Solve the initial value problem $Y' = AY$, $Y(0) = Y_0$.

In Exercises 5 and 6, solve the system $Y' = AY$.

1. A =

    [3 5

    2 5], det(I A) = 2 8 + 25, and Y0 =

    [8

    13

    ].

    2. A =[

    3 42 7

    ], det(I A) = 2 10 + 29, and Y0 =

    [35

    ].

    3. A =[

    5 131 9

    ], det(I A) = 2 14 + 58, and Y0 =

    [21

    ].

    4. A =[

    7 174 11

    ], det(I A) = 2 18 + 145, and Y0 =

    [52

    ].

    5. A = 37 10 2059 9 24

    33 12 21

    , det(I A) = (2 4 + 29)( 3).

    6. A =4 42 154 25 10

    6 32 13

    , det(I A) = (2 6 + 13)( 2).

2.3 INHOMOGENEOUS SYSTEMS WITH CONSTANT COEFFICIENTS

In this section, we show how to solve an inhomogeneous system $Y' = AY + G(x)$ where $G(x)$ is a vector of functions. (We will often abbreviate $G(x)$ by $G$.) We use a method that is a direct generalization of the method we used for solving a homogeneous system in Section 2.1.

Consider the matrix system
$$Y' = AY + G.$$


Step 1. Write $A = PJP^{-1}$ with $J$ in JCF, so the system becomes
$$Y' = (PJP^{-1})Y + G$$
$$Y' = PJ(P^{-1}Y) + G$$
$$P^{-1}Y' = J(P^{-1}Y) + P^{-1}G$$
$$(P^{-1}Y)' = J(P^{-1}Y) + P^{-1}G.$$
(Note that, since $P^{-1}$ is a constant matrix, we have that $(P^{-1}Y)' = P^{-1}Y'$.)

Step 2. Set $Z = P^{-1}Y$ and $H = P^{-1}G$, so this system becomes
$$Z' = JZ + H$$
and solve this system for $Z$.

Step 3. Since $Z = P^{-1}Y$, we have that
$$Y = PZ$$
is the solution to our original system.

Again, the key to this method is to be able to perform Step 2, and again this is straightforward. Within each Jordan block, we solve from the bottom up. Let us focus our attention on a single $k$-by-$k$ block. The equation for the last function $z_k$ in that block is an inhomogeneous first-order differential equation involving only $z_k$, and we go ahead and solve it. The equation for the next to the last function $z_{k-1}$ in that block is an inhomogeneous first-order differential equation involving only $z_{k-1}$ and $z_k$. We substitute in our solution for $z_k$ to obtain an inhomogeneous first-order differential equation for $z_{k-1}$ involving only $z_{k-1}$, and we go ahead and solve it, etc.
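As a concrete illustration of this bottom-up substitution (my own sketch, not the text's notation; it assumes SymPy, a sample 2-by-2 Jordan block with eigenvalue 2, and sample right-hand sides), one can let a computer algebra system solve each scalar equation in turn:

```python
# Sketch only: solve z2' = a z2 + h2 first, then substitute into z1' = a z1 + z2 + h1.
import sympy as sp

x = sp.symbols('x')
a = 2                                     # sample eigenvalue of the Jordan block
h1, h2 = sp.exp(3*x), sp.exp(5*x)          # sample entries of H = P^{-1} G
z1, z2 = sp.symbols('z1 z2', cls=sp.Function)

sol2 = sp.dsolve(sp.Eq(z2(x).diff(x), a*z2(x) + h2), z2(x))   # last equation in the block
z2_part = sol2.rhs.subs('C1', 0)                              # keep one particular solution
sol1 = sp.dsolve(sp.Eq(z1(x).diff(x), a*z1(x) + z2_part + h1), z1(x))

print(z2_part)
print(sol1.rhs)
```

Dropping the constant of integration before substituting is harmless here, since the constants can be absorbed into the general solution of the homogeneous system, exactly as in the discussion that follows.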

In principle, this is the method we use. In practice, using this method directly means solving each system by hand, and instead we choose to automate this procedure. This leads us to the following method. In order to develop this method we must begin with some preliminaries.

For a fixed matrix $A$, we say that the inhomogeneous system $Y' = AY + G(x)$ has associated homogeneous system $Y' = AY$. By our previous work, we know how to find the general solution of $Y' = AY$. First we shall see that, in order to find the general solution of $Y' = AY + G(x)$, it suffices to find a single solution of that system.

Lemma 3.1. Let $Y_i$ be any solution of $Y' = AY + G(x)$. If $Y_h$ is any solution of the associated homogeneous system $Y' = AY$, then $Y_h + Y_i$ is also a solution of $Y' = AY + G(x)$, and every solution of $Y' = AY + G(x)$ is of this form.

Consequently, the general solution of $Y' = AY + G(x)$ is given by $Y = Y_H + Y_i$, where $Y_H$ denotes the general solution of $Y' = AY$.


Proof. First we check that $Y = Y_h + Y_i$ is a solution of $Y' = AY + G(x)$. We simply compute
$$Y' = (Y_h + Y_i)' = Y_h' + Y_i' = (AY_h) + (AY_i + G) = A(Y_h + Y_i) + G = AY + G$$
as claimed.

Now we check that every solution $Y$ of $Y' = AY + G(x)$ is of this form. So let $Y$ be any solution of this inhomogeneous system. We can certainly write $Y = (Y - Y_i) + Y_i = Y_h + Y_i$ where $Y_h = Y - Y_i$. We need to show that $Y_h$ defined in this way is indeed a solution of $Y' = AY$. Again we compute
$$Y_h' = (Y - Y_i)' = Y' - Y_i' = (AY + G) - (AY_i + G) = A(Y - Y_i) = AY_h$$
as claimed.

(It is common to call $Y_i$ a particular solution of the inhomogeneous system.)

Let us now recall our work from Section 2.1, and keep our previous notation. The homogeneous system $Y' = AY$ has general solution $Y_H = PM_ZC$ where $C$ is a vector of arbitrary constants. Let us set $N_Y = N_Y(x) = PM_Z(x)$ for convenience, so $Y_H = N_YC$. Then $Y_H' = (N_YC)' = N_Y'C$, and then, substituting in the equation $Y' = AY$, we obtain the equation $N_Y'C = AN_YC$. Since this equation must hold for any $C$, we conclude that
$$N_Y' = AN_Y.$$
We use this fact to write down a solution to $Y' = AY + G$. We will verify by direct computation that the function we write down is indeed a solution. This verification is not a difficult one, but nevertheless it is a fair question to ask how we came up with this function. Actually, it can be derived in a very natural way, but the explanation for this involves the matrix exponential and so we defer it until Section 2.4. Nevertheless, once we have this solution (no matter how we came up with it) we are certainly free to use it.

It is convenient to introduce the following nonstandard notation. For a vector $H(x)$, we let $\int_0 H(x)\,dx$ denote an arbitrary but fixed antiderivative of $H(x)$. In other words, in obtaining $\int_0 H(x)\,dx$, we simply ignore the constants of integration. This is legitimate for our purposes, as by Lemma 3.1 we only need to find a single solution to an inhomogeneous system, and it doesn't matter which one we find; any one will do. (Otherwise said, we can absorb the constants of integration into the general solution of the associated homogeneous system.)

Theorem 3.2. The function
$$Y_i = N_Y\int_0 N_Y^{-1}G\,dx$$
is a solution of the system $Y' = AY + G$.


Proof. We simply compute $Y_i'$. We have
$$\begin{aligned}
Y_i' &= \left(N_Y\int_0 N_Y^{-1}G\,dx\right)' \\
&= N_Y'\int_0 N_Y^{-1}G\,dx + N_Y\left(\int_0 N_Y^{-1}G\,dx\right)' &&\text{by the product rule} \\
&= N_Y'\int_0 N_Y^{-1}G\,dx + N_Y\left(N_Y^{-1}G\right) &&\text{by the definition of the antiderivative} \\
&= N_Y'\int_0 N_Y^{-1}G\,dx + G \\
&= (AN_Y)\int_0 N_Y^{-1}G\,dx + G &&\text{as } N_Y' = AN_Y \\
&= A\left(N_Y\int_0 N_Y^{-1}G\,dx\right) + G \\
&= AY_i + G
\end{aligned}$$
as claimed.

We now do a variety of examples: a 2-by-2 diagonalizable system, a 2-by-2 nondiagonalizable system, a 3-by-3 diagonalizable system, and a 2-by-2 system in which the characteristic polynomial has complex roots. In all these examples, when it comes to finding $N_Y^{-1}$, it is convenient to use the fact that $N_Y^{-1} = (PM_Z)^{-1} = M_Z^{-1}P^{-1}$.
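The recipe of Theorem 3.2 is entirely mechanical, so it is easy to script. The sketch below (mine, assuming SymPy, and using the matrices of Example 3.3 that follows, as reconstructed here) computes $Y_i = N_Y\int_0 N_Y^{-1}G\,dx$, with `sympy.integrate` playing the role of the fixed antiderivative, and checks that $Y_i' = AY_i + G$.

```python
# Sketch only: a particular solution Y_i = N_Y * antiderivative(N_Y^{-1} G).
import sympy as sp

x = sp.symbols('x')
A  = sp.Matrix([[5, -7], [2, -4]])
G  = sp.Matrix([30*sp.exp(x), 60*sp.exp(2*x)])
P  = sp.Matrix([[7, 1], [2, 1]])
MZ = sp.diag(sp.exp(3*x), sp.exp(-2*x))

NY = P * MZ
Yi = sp.simplify(NY * (NY.inv() * G).applyfunc(lambda e: sp.integrate(e, x)))

print(Yi)                                   # the particular solution
print(sp.simplify(Yi.diff(x) - A*Yi - G))   # zero vector: Y_i' = A Y_i + G
```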

Example 3.3. Consider the system
$$Y' = AY + G \ \text{ where } \ A = \begin{bmatrix} 5 & -7 \\ 2 & -4 \end{bmatrix} \ \text{ and } \ G = \begin{bmatrix} 30e^{x} \\ 60e^{2x} \end{bmatrix}.$$
We saw in Example 1.2 that
$$P = \begin{bmatrix} 7 & 1 \\ 2 & 1 \end{bmatrix} \ \text{ and } \ M_Z = \begin{bmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{bmatrix},$$
and $N_Y = PM_Z$. Then
$$N_Y^{-1}G = \begin{bmatrix} e^{-3x} & 0 \\ 0 & e^{2x} \end{bmatrix}(1/5)\begin{bmatrix} 1 & -1 \\ -2 & 7 \end{bmatrix}\begin{bmatrix} 30e^{x} \\ 60e^{2x} \end{bmatrix} = \begin{bmatrix} 6e^{-2x} - 12e^{-x} \\ -12e^{3x} + 84e^{4x} \end{bmatrix}.$$
Then
$$\int_0 N_Y^{-1}G = \begin{bmatrix} -3e^{-2x} + 12e^{-x} \\ -4e^{3x} + 21e^{4x} \end{bmatrix}$$
and
$$Y_i = N_Y\int_0 N_Y^{-1}G = \begin{bmatrix} 7 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{bmatrix}\begin{bmatrix} -3e^{-2x} + 12e^{-x} \\ -4e^{3x} + 21e^{4x} \end{bmatrix} = \begin{bmatrix} -25e^{x} + 105e^{2x} \\ -10e^{x} + 45e^{2x} \end{bmatrix}.$$

Example 3.4. Consider the system
$$Y' = AY + G \ \text{ where } \ A = \begin{bmatrix} 0 & 1 \\ -4 & 4 \end{bmatrix} \ \text{ and } \ G = \begin{bmatrix} 60e^{3x} \\ 72e^{5x} \end{bmatrix}.$$
We saw in Example 1.6 that
$$P = \begin{bmatrix} -2 & 1 \\ -4 & 0 \end{bmatrix} \ \text{ and } \ M_Z = e^{2x}\begin{bmatrix} 1 & x \\ 0 & 1 \end{bmatrix},$$
and $N_Y = PM_Z$. Then
$$N_Y^{-1}G = e^{-2x}\begin{bmatrix} 1 & -x \\ 0 & 1 \end{bmatrix}(1/4)\begin{bmatrix} 0 & -1 \\ 4 & -2 \end{bmatrix}\begin{bmatrix} 60e^{3x} \\ 72e^{5x} \end{bmatrix} = \begin{bmatrix} -18e^{3x} - 60xe^{x} + 36xe^{3x} \\ 60e^{x} - 36e^{3x} \end{bmatrix}.$$
Then
$$\int_0 N_Y^{-1}G = \begin{bmatrix} 60e^{x} - 60xe^{x} - 10e^{3x} + 12xe^{3x} \\ 60e^{x} - 12e^{3x} \end{bmatrix}$$
and
$$Y_i = N_Y\int_0 N_Y^{-1}G = \begin{bmatrix} -2 & 1 \\ -4 & 0 \end{bmatrix}e^{2x}\begin{bmatrix} 1 & x \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 60e^{x} - 60xe^{x} - 10e^{3x} + 12xe^{3x} \\ 60e^{x} - 12e^{3x} \end{bmatrix} = \begin{bmatrix} -60e^{3x} + 8e^{5x} \\ -240e^{3x} + 40e^{5x} \end{bmatrix}.$$

Example 3.5. Consider the system
$$Y' = AY + G \ \text{ where } \ A = \begin{bmatrix} 2 & -3 & -3 \\ 2 & -2 & -2 \\ -2 & 1 & 1 \end{bmatrix} \ \text{ and } \ G = \begin{bmatrix} e^{x} \\ 12e^{3x} \\ 20e^{4x} \end{bmatrix}.$$
We saw in Example 1.3 that
$$P = \begin{bmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{bmatrix} \ \text{ and } \ M_Z = \begin{bmatrix} e^{-x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{2x} \end{bmatrix},$$
and $N_Y = PM_Z$. Then
$$N_Y^{-1}G = \begin{bmatrix} e^{x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{-2x} \end{bmatrix}\begin{bmatrix} 0 & 1 & 1 \\ 1 & -2 & -1 \\ -1 & 1 & 1 \end{bmatrix}\begin{bmatrix} e^{x} \\ 12e^{3x} \\ 20e^{4x} \end{bmatrix} = \begin{bmatrix} 12e^{4x} + 20e^{5x} \\ e^{x} - 24e^{3x} - 20e^{4x} \\ -e^{-x} + 12e^{x} + 20e^{2x} \end{bmatrix}.$$
Then
$$\int_0 N_Y^{-1}G = \begin{bmatrix} 3e^{4x} + 4e^{5x} \\ e^{x} - 8e^{3x} - 5e^{4x} \\ e^{-x} + 12e^{x} + 10e^{2x} \end{bmatrix}$$
and
$$Y_i = N_Y\int_0 N_Y^{-1}G = \begin{bmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{bmatrix}\begin{bmatrix} e^{-x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{2x} \end{bmatrix}\begin{bmatrix} 3e^{4x} + 4e^{5x} \\ e^{x} - 8e^{3x} - 5e^{4x} \\ e^{-x} + 12e^{x} + 10e^{2x} \end{bmatrix} = \begin{bmatrix} -e^{x} - 9e^{3x} - 6e^{4x} \\ -2e^{x} - 4e^{3x} - 5e^{4x} \\ 2e^{x} + 7e^{3x} + 9e^{4x} \end{bmatrix}.$$

Example 3.6. Consider the system
$$Y' = AY + G \ \text{ where } \ A = \begin{bmatrix} 2 & -17 \\ 1 & 4 \end{bmatrix} \ \text{ and } \ G = \begin{bmatrix} 200 \\ 160e^{x} \end{bmatrix}.$$
We saw in Example 2.4 that
$$P = \begin{bmatrix} -1+4i & -1-4i \\ 1 & 1 \end{bmatrix} \ \text{ and } \ M_Z = \begin{bmatrix} e^{(3+4i)x} & 0 \\ 0 & e^{(3-4i)x} \end{bmatrix},$$
and $N_Y = PM_Z$. Then
$$N_Y^{-1}G = \begin{bmatrix} e^{-(3+4i)x} & 0 \\ 0 & e^{-(3-4i)x} \end{bmatrix}(1/(8i))\begin{bmatrix} 1 & 1+4i \\ -1 & -1+4i \end{bmatrix}\begin{bmatrix} 200 \\ 160e^{x} \end{bmatrix} = \begin{bmatrix} -25ie^{-(3+4i)x} + 20(4-i)e^{(-2-4i)x} \\ 25ie^{-(3-4i)x} + 20(4+i)e^{(-2+4i)x} \end{bmatrix}.$$
Then
$$\int_0 N_Y^{-1}G = \begin{bmatrix} (4+3i)e^{-(3+4i)x} + (-4+18i)e^{(-2-4i)x} \\ (4-3i)e^{-(3-4i)x} + (-4-18i)e^{(-2+4i)x} \end{bmatrix}$$
and
$$Y_i = N_Y\int_0 N_Y^{-1}G = \begin{bmatrix} -1+4i & -1-4i \\ 1 & 1 \end{bmatrix}\begin{bmatrix} e^{(3+4i)x} & 0 \\ 0 & e^{(3-4i)x} \end{bmatrix}\begin{bmatrix} (4+3i)e^{-(3+4i)x} + (-4+18i)e^{(-2-4i)x} \\ (4-3i)e^{-(3-4i)x} + (-4-18i)e^{(-2+4i)x} \end{bmatrix}$$
$$= \begin{bmatrix} -1+4i & -1-4i \\ 1 & 1 \end{bmatrix}\begin{bmatrix} (4+3i) + (-4+18i)e^{x} \\ (4-3i) + (-4-18i)e^{x} \end{bmatrix} = \begin{bmatrix} -32 - 136e^{x} \\ 8 - 8e^{x} \end{bmatrix}.$$

(Note that in this last example we could do arithmetic with complex numbers directly, i.e., without having to convert complex exponentials into real terms.)

Once we have done this work, it is straightforward to solve initial value problems. We do a single example that illustrates this.

Example 3.7. Consider the initial value problem
$$Y' = AY + G, \quad Y(0) = \begin{bmatrix} 7 \\ 17 \end{bmatrix}, \ \text{ where } \ A = \begin{bmatrix} 5 & -7 \\ 2 & -4 \end{bmatrix} \ \text{ and } \ G = \begin{bmatrix} 30e^{x} \\ 60e^{2x} \end{bmatrix}.$$
We saw in Example 1.2 that the associated homogeneous system has general solution
$$Y_H = \begin{bmatrix} 7c_1e^{3x} + c_2e^{-2x} \\ 2c_1e^{3x} + c_2e^{-2x} \end{bmatrix}$$
and in Example 3.3 that the original system has a particular solution
$$Y_i = \begin{bmatrix} -25e^{x} + 105e^{2x} \\ -10e^{x} + 45e^{2x} \end{bmatrix}.$$
Thus, our original system has general solution
$$Y = Y_H + Y_i = \begin{bmatrix} 7c_1e^{3x} + c_2e^{-2x} - 25e^{x} + 105e^{2x} \\ 2c_1e^{3x} + c_2e^{-2x} - 10e^{x} + 45e^{2x} \end{bmatrix}.$$
We apply the initial condition to obtain the linear system
$$Y(0) = \begin{bmatrix} 7c_1 + c_2 + 80 \\ 2c_1 + c_2 + 35 \end{bmatrix} = \begin{bmatrix} 7 \\ 17 \end{bmatrix}$$
with solution $c_1 = -11$, $c_2 = 4$. Substituting, we find that our initial value problem has solution
$$Y = \begin{bmatrix} -77e^{3x} + 4e^{-2x} - 25e^{x} + 105e^{2x} \\ -22e^{3x} + 4e^{-2x} - 10e^{x} + 45e^{2x} \end{bmatrix}.$$

EXERCISES FOR SECTION 2.3

In each exercise, find a particular solution $Y_i$ of the system $Y' = AY + G(x)$, where $A$ is the matrix of the correspondingly numbered exercise for Section 2.1, and $G(x)$ is as given.

    1. G(x) =[

    2e8x3e4x

    ].

    2. G(x) =[

    2e7x6e8x

    ].

    3. G(x) =[e4x

    4e5x

    ].

    4. G(x) =[e6x

    9e8x].

    5. G(x) =[

    9e10x25e12x

    ].

    6. G(x) =[

    5ex12e2x

    ].

    7. G(x) = 13e2x

    5e4x

    .

    8. G(x) = 83e3x

    3e5x

    .

2.4 THE MATRIX EXPONENTIAL

In this section, we will discuss the matrix exponential and its use in solving systems $Y' = AY$.

Our first task is to ask what it means to take a matrix exponential. To answer this, we are guided by ordinary exponentials. Recall that, for any complex number $z$, the exponential $e^z$ is given by
$$e^z = 1 + z + z^2/2! + z^3/3! + z^4/4! + \cdots.$$


With this in mind, we define the matrix exponential as follows.

Definition 4.1. Let $T$ be a square matrix. Then the matrix exponential $e^T$ is defined by
$$e^T = I + T + \frac{1}{2!}T^2 + \frac{1}{3!}T^3 + \frac{1}{4!}T^4 + \cdots.$$

(For this definition to make sense we need to know that this series always converges, and it does.)

Recall that the differential equation $y' = ay$ has the solution $y = ce^{ax}$. The situation for $Y' = AY$ is very analogous. (Note that we use $\Gamma$ rather than $C$ to denote a vector of constants for reasons that will become clear a little later. Note that $\Gamma$ is on the right in Theorem 4.2 below, a consequence of the fact that matrix multiplication is not commutative.)

Theorem 4.2.

(1) Let $A$ be a square matrix. Then the general solution of
$$Y' = AY$$
is given by
$$Y = e^{Ax}\Gamma$$
where $\Gamma$ is a vector of arbitrary constants.

(2) The initial value problem
$$Y' = AY, \quad Y(0) = Y_0$$
has solution
$$Y = e^{Ax}Y_0.$$

Proof. (Outline) (1) We first compute $e^{Ax}$. In order to do so, note that $(Ax)^2 = (Ax)(Ax) = (AA)(xx) = A^2x^2$ as matrix multiplication commutes with scalar multiplication, and $(Ax)^3 = (Ax)^2(Ax) = (A^2x^2)(Ax) = (A^2A)(x^2x) = A^3x^3$, and similarly, $(Ax)^k = A^kx^k$ for any $k$. Then, substituting in Definition 4.1, we have that
$$Y = e^{Ax}\Gamma = \left(I + Ax + \frac{1}{2!}A^2x^2 + \frac{1}{3!}A^3x^3 + \frac{1}{4!}A^4x^4 + \cdots\right)\Gamma.$$
To find $Y'$, we may differentiate this series term-by-term. (This claim requires proof, but we shall not give it here.) Remembering that $A$ and $\Gamma$ are constant matrices, we see that
$$\begin{aligned}
Y' &= \left(A + \frac{1}{2!}A^2(2x) + \frac{1}{3!}A^3(3x^2) + \frac{1}{4!}A^4(4x^3) + \cdots\right)\Gamma \\
&= \left(A + A^2x + \frac{1}{2!}A^3x^2 + \frac{1}{3!}A^4x^3 + \cdots\right)\Gamma \\
&= A\left(I + Ax + \frac{1}{2!}A^2x^2 + \frac{1}{3!}A^3x^3 + \cdots\right)\Gamma = A\left(e^{Ax}\Gamma\right) = AY
\end{aligned}$$
as claimed.

(2) By (1) we know that $Y' = AY$ has solution $Y = e^{Ax}\Gamma$. We use the initial condition to solve for $\Gamma$. Setting $x = 0$, we have:
$$Y_0 = Y(0) = e^{A\cdot 0}\Gamma = e^{0}\Gamma = I\Gamma = \Gamma$$
(where $e^0$ means the exponential of the zero matrix, and the value of this is the identity matrix $I$, as is apparent from Definition 4.1), so $\Gamma = Y_0$ and $Y = e^{Ax}\Gamma = e^{Ax}Y_0$.
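Computer algebra systems expose the matrix exponential directly, so Theorem 4.2 can be used verbatim. Here is a sketch (mine, assuming SymPy; the matrix is the one from Example 1.2 as reconstructed earlier, and the initial vector is a sample value):

```python
# Sketch only: solve Y' = AY, Y(0) = Y0 as Y(x) = e^{Ax} Y0.
import sympy as sp

x = sp.symbols('x')
A  = sp.Matrix([[5, -7], [2, -4]])
Y0 = sp.Matrix([1, 1])

expAx = sp.simplify((A*x).exp())        # the matrix exponential e^{Ax}
Y = sp.simplify(expAx * Y0)

print(expAx.subs(x, 0))                 # identity matrix, so Y(0) = Y0
print(sp.simplify(Y.diff(x) - A*Y))     # zero vector, so Y' = AY
```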

In the remainder of this section we shall see how to translate the theoretical solution of $Y' = AY$ given by Theorem 4.2 into a practical one. To keep our notation simple, we will stick to 2-by-2 or 3-by-3 cases, but the principle is the same regardless of the size of the matrix.

One case is relatively easy.

Lemma 4.3. If $J$ is a diagonal matrix,
$$J = \begin{bmatrix} d_1 & & & \\ & d_2 & & \\ & & \ddots & \\ & & & d_n \end{bmatrix}$$
then $e^{Jx}$ is the diagonal matrix
$$e^{Jx} = \begin{bmatrix} e^{d_1x} & & & \\ & e^{d_2x} & & \\ & & \ddots & \\ & & & e^{d_nx} \end{bmatrix}.$$


Proof. Suppose, for simplicity, that $J$ is 2-by-2,
$$J = \begin{bmatrix} d_1 & 0 \\ 0 & d_2 \end{bmatrix}.$$
Then you can easily compute that $J^2 = \begin{bmatrix} d_1^2 & 0 \\ 0 & d_2^2 \end{bmatrix}$, $J^3 = \begin{bmatrix} d_1^3 & 0 \\ 0 & d_2^3 \end{bmatrix}$, and similarly, $J^k = \begin{bmatrix} d_1^k & 0 \\ 0 & d_2^k \end{bmatrix}$ for any $k$.

Then, as in the proof of Theorem 4.2,
$$\begin{aligned}
e^{Jx} &= I + Jx + \frac{1}{2!}J^2x^2 + \frac{1}{3!}J^3x^3 + \frac{1}{4!}J^4x^4 + \cdots \\
&= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} d_1 & 0 \\ 0 & d_2 \end{bmatrix}x + \frac{1}{2!}\begin{bmatrix} d_1^2 & 0 \\ 0 & d_2^2 \end{bmatrix}x^2 + \frac{1}{3!}\begin{bmatrix} d_1^3 & 0 \\ 0 & d_2^3 \end{bmatrix}x^3 + \cdots \\
&= \begin{bmatrix} 1 + d_1x + \frac{1}{2!}(d_1x)^2 + \frac{1}{3!}(d_1x)^3 + \cdots & 0 \\ 0 & 1 + d_2x + \frac{1}{2!}(d_2x)^2 + \frac{1}{3!}(d_2x)^3 + \cdots \end{bmatrix}
\end{aligned}$$
which we recognize as
$$= \begin{bmatrix} e^{d_1x} & 0 \\ 0 & e^{d_2x} \end{bmatrix}.$$

Example 4.4. We wish to find the general solution of $Y' = JY$ where
$$J = \begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix}.$$
To do so we directly apply Theorem 4.2 and Lemma 4.3. The solution is given by
$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = Y = e^{Jx}\Gamma = \begin{bmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{bmatrix}\begin{bmatrix} \gamma_1 \\ \gamma_2 \end{bmatrix} = \begin{bmatrix} \gamma_1e^{3x} \\ \gamma_2e^{-2x} \end{bmatrix}.$$

Now suppose we want to find the general solution of $Y' = AY$ where $A = \begin{bmatrix} 5 & -7 \\ 2 & -4 \end{bmatrix}$. We may still apply Theorem 4.2 to conclude that the solution is $Y = e^{Ax}\Gamma$. We again try to calculate $e^{Ax}$. Now we find
$$A = \begin{bmatrix} 5 & -7 \\ 2 & -4 \end{bmatrix}, \quad A^2 = \begin{bmatrix} 11 & -7 \\ 2 & 2 \end{bmatrix}, \quad A^3 = \begin{bmatrix} 41 & -49 \\ 14 & -22 \end{bmatrix}, \ldots$$


so
$$e^{Ax} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} 5 & -7 \\ 2 & -4 \end{bmatrix}x + \frac{1}{2!}\begin{bmatrix} 11 & -7 \\ 2 & 2 \end{bmatrix}x^2 + \frac{1}{3!}\begin{bmatrix} 41 & -49 \\ 14 & -22 \end{bmatrix}x^3 + \cdots,$$
which looks like a hopeless mess. But, in fact, the situation is not so hard!

Lemma 4.5. Let $S$ and $T$ be two matrices and suppose
$$S = PTP^{-1}$$
for some invertible matrix $P$. Then
$$S^k = PT^kP^{-1} \ \text{ for every } k$$
and
$$e^S = Pe^TP^{-1}.$$

Proof. We simply compute
$$S^2 = SS = (PTP^{-1})(PTP^{-1}) = PT(P^{-1}P)TP^{-1} = PTITP^{-1} = PTTP^{-1} = PT^2P^{-1},$$
$$S^3 = S^2S = (PT^2P^{-1})(PTP^{-1}) = PT^2(P^{-1}P)TP^{-1} = PT^2ITP^{-1} = PT^2TP^{-1} = PT^3P^{-1},$$
$$S^4 = S^3S = (PT^3P^{-1})(PTP^{-1}) = PT^3(P^{-1}P)TP^{-1} = PT^3ITP^{-1} = PT^3TP^{-1} = PT^4P^{-1},$$
etc.

Then
$$\begin{aligned}
e^S &= I + S + \frac{1}{2!}S^2 + \frac{1}{3!}S^3 + \frac{1}{4!}S^4 + \cdots \\
&= PIP^{-1} + PTP^{-1} + \frac{1}{2!}PT^2P^{-1} + \frac{1}{3!}PT^3P^{-1} + \frac{1}{4!}PT^4P^{-1} + \cdots \\
&= P\left(I + T + \frac{1}{2!}T^2 + \frac{1}{3!}T^3 + \frac{1}{4!}T^4 + \cdots\right)P^{-1} = Pe^TP^{-1}
\end{aligned}$$
as claimed.
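Lemma 4.5 is exactly how one computes $e^{Ax}$ in practice: find $P$ and $J$, exponentiate $J$, and conjugate back. A sketch (mine, assuming SymPy; `jordan_form` may pick a different $P$ than the text, which does not affect the product):

```python
# Sketch only: e^{Ax} = P e^{Jx} P^{-1}, with P and J found by SymPy.
import sympy as sp

x = sp.symbols('x')
A = sp.Matrix([[5, -7], [2, -4]])

P, J = A.jordan_form()                                      # A = P J P^{-1}
eJx = sp.diag(*[sp.exp(J[i, i]*x) for i in range(J.rows)])  # e^{Jx} for a diagonal J (Lemma 4.3)
expAx = sp.simplify(P * eJx * P.inv())                      # e^{Ax} = P e^{Jx} P^{-1} (Lemma 4.5)
print(expAx)
```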


With this in hand let us return to our problem.

Example 4.6. (Compare Example 1.2.) We wish to find the general solution of $Y' = AY$ where
$$A = \begin{bmatrix} 5 & -7 \\ 2 & -4 \end{bmatrix}.$$
We saw in Example 1.16 in Chapter 1 that $A = PJP^{-1}$ with
$$P = \begin{bmatrix} 7 & 1 \\ 2 & 1 \end{bmatrix} \ \text{ and } \ J = \begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix}.$$
Then
$$e^{Ax} = Pe^{Jx}P^{-1} = \begin{bmatrix} 7 & 1 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{bmatrix}\begin{bmatrix} 7 & 1 \\ 2 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} \frac{7}{5}e^{3x} - \frac{2}{5}e^{-2x} & -\frac{7}{5}e^{3x} + \frac{7}{5}e^{-2x} \\ \frac{2}{5}e^{3x} - \frac{2}{5}e^{-2x} & -\frac{2}{5}e^{3x} + \frac{7}{5}e^{-2x} \end{bmatrix}$$
and
$$Y = e^{Ax}\Gamma = e^{Ax}\begin{bmatrix} \gamma_1 \\ \gamma_2 \end{bmatrix} = \begin{bmatrix} \left(\frac{7}{5}\gamma_1 - \frac{7}{5}\gamma_2\right)e^{3x} + \left(-\frac{2}{5}\gamma_1 + \frac{7}{5}\gamma_2\right)e^{-2x} \\ \left(\frac{2}{5}\gamma_1 - \frac{2}{5}\gamma_2\right)e^{3x} + \left(-\frac{2}{5}\gamma_1 + \frac{7}{5}\gamma_2\right)e^{-2x} \end{bmatrix}.$$

Example 4.7. (Compare Example 1.3.) We wish to find the general solution of $Y' = AY$ where
$$A = \begin{bmatrix} 2 & -3 & -3 \\ 2 & -2 & -2 \\ -2 & 1 & 1 \end{bmatrix}.$$
We saw in Example 2.23 in Chapter 1 that $A = PJP^{-1}$ with
$$P = \begin{bmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{bmatrix} \ \text{ and } \ J = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{bmatrix}.$$
Then
$$e^{Ax} = Pe^{Jx}P^{-1} = \begin{bmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{bmatrix}\begin{bmatrix} e^{-x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{2x} \end{bmatrix}\begin{bmatrix} 1 & 0 & -1 \\ 0 & -1 & -1 \\ 1 & 1 & 1 \end{bmatrix}^{-1}$$
$$= \begin{bmatrix} e^{2x} & e^{-x} - e^{2x} & e^{-x} - e^{2x} \\ -1 + e^{2x} & 2 - e^{2x} & 1 - e^{2x} \\ 1 - e^{2x} & e^{-x} - 2 + e^{2x} & e^{-x} - 1 + e^{2x} \end{bmatrix}$$
and
$$Y = e^{Ax}\Gamma = e^{Ax}\begin{bmatrix} \gamma_1 \\ \gamma_2 \\ \gamma_3 \end{bmatrix} = \begin{bmatrix} (\gamma_2 + \gamma_3)e^{-x} + (\gamma_1 - \gamma_2 - \gamma_3)e^{2x} \\ (-\gamma_1 + 2\gamma_2 + \gamma_3) + (\gamma_1 - \gamma_2 - \gamma_3)e^{2x} \\ (\gamma_2 + \gamma_3)e^{-x} + (\gamma_1 - 2\gamma_2 - \gamma_3) + (-\gamma_1 + \gamma_2 + \gamma_3)e^{2x} \end{bmatrix}.$$

Now suppose we want to solve the initial value problem $Y' = AY$, $Y(0) = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$. Then
$$Y = e^{Ax}Y(0) = \begin{bmatrix} e^{2x} & e^{-x} - e^{2x} & e^{-x} - e^{2x} \\ -1 + e^{2x} & 2 - e^{2x} & 1 - e^{2x} \\ 1 - e^{2x} & e^{-x} - 2 + e^{2x} & e^{-x} - 1 + e^{2x} \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} e^{2x} \\ -1 + e^{2x} \\ 1 - e^{2x} \end{bmatrix}.$$

Remark 4.8. Let us compare the results of our method here with that of our previous method. In the case of Example 4.6, our previous method gives the solution
$$Y = P\begin{bmatrix} e^{3x} & 0 \\ 0 & e^{-2x} \end{bmatrix}C = Pe^{Jx}C \quad\text{where } J = \begin{bmatrix} 3 & 0 \\ 0 & -2 \end{bmatrix},$$
while our method here gives
$$Y = Pe^{Jx}P^{-1}\Gamma.$$
But note that these answers are really the same! For $P^{-1}$ is a constant matrix, so if $\Gamma$ is a vector of arbitrary constants, then so is $P^{-1}\Gamma$, and we simply set $C = P^{-1}\Gamma$.

Similarly, in the case of Example 4.7, our previous method gives the solution
$$Y = P\begin{bmatrix} e^{-x} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & e^{2x} \end{bmatrix}C = Pe^{Jx}C \quad\text{where } J = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 2 \end{bmatrix},$$
while our method here gives
$$Y = Pe^{Jx}P^{-1}\Gamma$$
and again, setting $C = P^{-1}\Gamma$, we see that these answers are the same.

So the point here is not that the matrix exponential enables us to solve new problems, but rather that it gives a new viewpoint about the solutions that we have already obtained.

While these two methods are in principle the same, we may ask which is preferable in practice. In this regard we see that our earlier method is better, as the use of the matrix exponential requires us to find $P^{-1}$, which may be a considerable amount of work. However, this advantage is (partially) negated if we wish to solve initial value problems, as the matrix exponential method immediately gives the unknown constants $\Gamma$, as $\Gamma = Y(0)$, while in the former method we must solve a linear system to obtain the unknown constants $C$.

Now let us consider the nondiagonalizable case. Suppose $Z' = JZ$ where $J$ is a matrix consisting of a single Jordan block. Then by Theorem 4.2 this has the solution $Z = e^{Jx}\Gamma$. On the other hand, in Theorem 1.1 we already saw that this system has solution $Z = M_ZC$. In this case, we simply have $C = \Gamma$, so we must have $e^{Jx} = M_Z$. Let us see that this is true by computing $e^{Jx}$ directly.

Theorem 4.9. Let $J$ be a $k$-by-$k$ Jordan block with eigenvalue $a$,
$$J = \begin{bmatrix} a & 1 & & & \\ & a & 1 & & \\ & & a & \ddots & \\ & & & \ddots & 1 \\ & & & & a \end{bmatrix}.$$
Then
$$e^{Jx} = e^{ax}\begin{bmatrix} 1 & x & x^2/2! & x^3/3! & \cdots & x^{k-1}/(k-1)! \\ & 1 & x & x^2/2! & \cdots & x^{k-2}/(k-2)! \\ & & 1 & x & \cdots & x^{k-3}/(k-3)! \\ & & & \ddots & \ddots & \vdots \\ & & & & 1 & x \\ & & & & & 1 \end{bmatrix}.$$

Proof. First suppose that $J$ is a 2-by-2 Jordan block,
$$J = \begin{bmatrix} a & 1 \\ 0 & a \end{bmatrix}.$$
Then $J^2 = \begin{bmatrix} a^2 & 2a \\ 0 & a^2 \end{bmatrix}$, $J^3 = \begin{bmatrix} a^3 & 3a^2 \\ 0 & a^3 \end{bmatrix}$, $J^4 = \begin{bmatrix} a^4 & 4a^3 \\ 0 & a^4 \end{bmatrix}$,
so
$$e^{Jx} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \begin{bmatrix} a & 1 \\ 0 & a \end{bmatrix}x + \frac{1}{2!}\begin{bmatrix} a^2 & 2a \\ 0 & a^2 \end{bmatrix}x^2 + \frac{1}{3!}\begin{bmatrix} a^3 & 3a^2 \\ 0 & a^3 \end{bmatrix}x^3 + \frac{1}{4!}\begin{bmatrix} a^4 & 4a^3 \\ 0 & a^4 \end{bmatrix}x^4 + \cdots = \begin{bmatrix} m_{11} & m_{12} \\ 0 & m_{22} \end{bmatrix},$$
and we see that
$$m_{11} = m_{22} = 1 + ax + \frac{1}{2!}(ax)^2 + \frac{1}{3!}(ax)^3 + \frac{1}{4!}(ax)^4 + \frac{1}{5!}(ax)^5 + \cdots = e^{ax},$$
and
$$m_{12} = x + ax^2 + \frac{1}{2!}a^2x^3 + \frac{1}{3!}a^3x^4 + \frac{1}{4!}a^4x^5 + \cdots = x\left(1 + ax + \frac{1}{2!}(ax)^2 + \frac{1}{3!}(ax)^3 + \frac{1}{4!}(ax)^4 + \cdots\right) = xe^{ax}$$
and so we conclude that
$$e^{Jx} = \begin{bmatrix} e^{ax} & xe^{ax} \\ 0 & e^{ax} \end{bmatrix} = e^{ax}\begin{bmatrix} 1 & x \\ 0 & 1 \end{bmatrix}.$$
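A quick symbolic check of the 2-by-2 formula just obtained (my own sketch, assuming SymPy): the claimed $e^{Jx}$ satisfies $M' = JM$ and $M(0) = I$, which characterizes the matrix exponential.

```python
# Sketch only: verify e^{Jx} = e^{ax} [[1, x], [0, 1]] for a 2-by-2 Jordan block.
import sympy as sp

a, x = sp.symbols('a x')
J = sp.Matrix([[a, 1], [0, a]])
M = sp.exp(a*x) * sp.Matrix([[1, x], [0, 1]])

print(sp.simplify(M.diff(x) - J*M))   # zero matrix: M' = JM
print(M.subs(x, 0))                   # identity matrix: M(0) = I
```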

Next suppose that $J$ is a 3-by-3 Jordan block,
$$J = \begin{bmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{bmatrix}.$$
Then
$$J^2 = \begin{bmatrix} a^2 & 2a & 1 \\ 0 & a^2 & 2a \\ 0 & 0 & a^2 \end{bmatrix}, \quad J^3 = \begin{bmatrix} a^3 & 3a^2 & 3a \\ 0 & a^3 & 3a^2 \\ 0 & 0 & a^3 \end{bmatrix}, \quad J^4 = \begin{bmatrix} a^4 & 4a^3 & 6a^2 \\ 0 & a^4 & 4a^3 \\ 0 & 0 & a^4 \end{bmatrix}